a corpus-based approach
March 2011
ABBREVIATIONS
ABSTRACT
DECLARATION
ACKNOWLEDGEMENTS
CHAPTER 1. INTRODUCTION
1.1 Preliminaries
1.2 Rationale
1.2.1 Lancashire area
1.2.2 Lancashire identity
1.3 Methodological approaches
1.3.1 Corpora in sociolinguistics
1.3.2 Sound Archive corpus
1.3.3 Dialect literature in sociolinguistics
1.3.4 Litcorp and Lancashire Fairytales
1.3.5 Questionnaires and other methods
1.3.6 Choosing grammatical features
1.4 Theoretical approach
1.5 Overview and aims
CHAPTER 2. RELATIVIZATION
2.1 Overview
2.2 Analysis of relative clauses
2.2.1 Defining relativization constructions
2.2.2 Relativization types
2.2.2.1 Zero relatives
2.2.2.2 Relativization in varieties of English
2.2.3 Factors influencing relativizer choice
2.2.3.1 Syntax: dependencies between RV and antecedent
2.2.3.2 Semantic category of antecedent
2.2.3.3 Restrictiveness (semantic scope) of the RC
2.2.3.4 Other factors
2.2.4 Diachronic change in relativization
2.2.5 Summary and research questions
2.3 Methodology
2.3.1 Rationale for methodology
2.3.2 Corpora: Litcorp and Sound Archive
2.3.3 Questionnaires
2.3.4 Classification and division of respondents
2.4 Results and discussion
2.4.1 Overview of corpus results
2.4.2 Corpus results – restrictiveness of relative clause
2.4.3 Corpus results – semantic category of antecedent
2.4.4 Questionnaire findings - distribution of relativizers
2.5 Concluding remarks
CHAPTER 3. HAVEN’T TO
3.1 Introduction
3.2 Literature review
3.2.1 Modality in Standard English
3.2.2 Modals vs. semi-modals
3.2.3 Recent changes to modal verbs in Standard English
3.2.4 HAVE to and HAVEn’t to in current Standard English
3.2.5 History of the HAVEn’t to construction
3.2.6 The rise of periphrastic do
3.2.7 Related constructions
3.2.8 Modals in varieties of English
3.2.9 Summary
3.2.10 Research hypotheses
3.3 Methodology
3.3.1 Introduction
3.3.2 Corpus searches
3.4 Results and discussion
3.4.1 Semi-modals and the BNC
3.4.2 Corpus comparison of HAVEn’t to
3.4.3 Syntactic analysis of HAVEn’t to in Lancashire data
3.4.4 Semantic analysis of HAVEn’t to in the Lancashire data
3.4.5 Explanations for semantic differences – constructional polysemy
3.5 Analysing the necessity/obligation construction family
3.5.1 Introduction
3.5.2 Corpus results
3.5.3 Diachronic change – testing the Frequency Hypotheses
3.5.4 Considerations and contradictions
3.5.5 Semantic evidence
3.5.6 Syntactic evidence
3.6 Concluding remarks
4.3.5 Classification and division of respondents
4.4 Results and analysis
4.4.1 NSR results
4.4.2 Other constructions with 3sg agreement
4.4.3 Other nonstandard agreement patterns
4.4.4 Was/were variation
4.4.5 Other variation with was/were
4.4.6 Questionnaire results
4.5 Concluding remarks
CHAPTER 5. SALIENCE
5.1 Introduction
5.1.1 What is salience?
5.1.2 Salience, markedness and enregisterment
5.1.3 Accommodation – problems and solutions
5.1.4 Aims
5.2 Rationale
5.2.1 Using dialect literature
5.2.2 Choosing constructions
5.2.3 Summary, research questions and hypotheses
5.3 Methodology
5.3.1 Corpus methods
5.3.2 New corpus data – Lancashire Fairytales
5.3.3 Interpreting corpus result
5.4 Results and discussion
5.4.1 Features found across all corpora
5.4.2 Features found in dialect literature
5.4.3 Features found in the most recent corpora
5.4.4 Other features
5.4.5 Lancashire Fairytales – comparing Lancs and non-Lancs
5.5 Concluding remarks
REFERENCES
APPENDICES
Appendix A: Map of the old County of Lancashire
Appendix B: Texts comprising Litcorp
Appendix C: Questionnaire – sociolinguistic information
Appendix D: Questionnaire – content testing the NSR
Appendix E: Ellipsis test sentences
Appendix F: Questionnaires – content testing zero relatives
Appendix G: Sample text from Lancashire Fairytales
List of tables
CHAPTER 1 INTRODUCTION
Table 1 Overview of data sources and variables
CHAPTER 2 RELATIVIZATION
Table 1 Frequency of relativizer in Sound Archive and Litcorp
Table 2 Frequency of zero relatives in a 45,000 word sample from each corpus
Table 3 Variant spellings of what and that relativizers in Litcorp
Table 4 Analysis of ut results in Litcorp
Table 5 Frequency of restrictive and non-restrictive relative clauses by relativizer
Table 6 Frequency of relative clauses by animacy type
Table 7 Questionnaire results for zero relatives
Table 8 Choice of relativizer by questionnaire participants
CHAPTER 3 HAVEN’T TO
Table 1 NICE Qualities of semi-modals as compared to ‘core’ modals in Standard English (Quirk et al., 1985:140)
Table 2 Negative forms of semi-modals in the BNC, with and without DO, raw frequency results
Table 3 Instances of forms of the HAVEn’t to construction in the Lancashire data
Table 4 Difference in obligation type in HAVEn’t to constructions in Lancashire dialect data (raw frequency results)
Table 5 HAVEn’t to family of constructions (normalized frequency results)
Table 6 Obligation construction family and the NICE properties
List of figures
CHAPTER 1 INTRODUCTION
Figure 1 Map of Lancashire
Figure 2 Overview of data sources used in this thesis
Figure 3 Geographical location of informants in the Sound Archive corpus
Figure 4 Excerpt from Lancashire Pride (Thompson, 1945)
CHAPTER 2 RELATIVIZATION
Figure 1 Most frequent relativizers in the Lancashire corpora
CHAPTER 3 HAVEN’T TO
Figure 1 The auxiliary verb-main verb scale (adapted from Quirk et al., 1995:137)
Figure 2 S-curve model of language change (reproduced from Kroch 1989:22)
Figure 3 Possible diachronic change in the Lancashire corpus data
Figure 4 Distribution of constructions displaying weak and strong obligation
CHAPTER 5 SALIENCE
Figure 1 Nonstandard features examined in Chapter 5
Figure 2 Mapping salient constructions
Figure 3 Semi-phonetic respellings in Tummus and Mearey (Bobbin, 1846)
Figure 4 Dialect literature task – Lancashire Fairytales
Figure 5 Interpreting corpus comparisons
Abbreviations
1 First Person
2 Second Person
3 Third Person
Neg Negative
sg Singular
pl Plural
Ø Zero (missing element)
Adj Adjective
AdvP Adverb phrase
NP Noun phrase
RC Relative clause
RRC Restrictive relative clause
RV Finite relative clause verb
NRRC Non-restrictive relative clause
ZR Zero relative
to-inf to infinitive
Abstract
This thesis investigates a number of key grammatical features found in the previously
under-studied Lancashire dialect. While the primary aims of the study are without
doubt descriptive, a strong theoretical and methodological component to the
investigation is also present. Theoretically, this study is couched within the usage-
based approach to language (see e.g. Croft and Cruse, 2004: 291-327). It employs
innovative uses of new methodologies relating not only to a substantial spoken corpus,
but also to a newly collated corpus compiled from historical dialect literature texts.
Corpus resources are also supported by acceptability judgements and tasks which are
gathered from a large number of respondents using new techniques in order to
maximise the extent and significance of the data presented here.
This thesis details variation that is already well documented in other varieties
of English (e.g. relativization, verbal agreement), but differentiates itself by
highlighting nuances and complexities not previously considered, such as
semantic differences in the HAVEn’t to construction; constructional competition in the
Northern Subject Rule; and approaches to using corpora in measuring sociolinguistic
salience.
Underpinning the thesis is the idea that the interplay between non-standard
data and theoretical linguistics can be bidirectional, where theory can inform the
analysis of dialect data, and such analysis of dialect data can inform the formulation or
further refinement of new or existing linguistic theory (see also Hollmann and
Siewierska, 2011, Hollmann, to appear, and references cited therein).
The methods used here, and the research that these methods support in the
subsequent chapters, emphasize the need for a broad range of resource
types in order to strengthen claims made in sociolinguistic research.
Declaration
I declare that this thesis is my own work and has not been submitted in substantially the
same form for the award of a higher degree elsewhere.
Acknowledgements
This thesis would never have got going, much less been completed, without the financial
support I received from the Arts and Humanities Research Council, for which I am
extremely grateful.
My gratitude goes also to the SCR at Fylde College, Lancaster University who gave
me the opportunity to remain engaged in college life within my role as Assistant
Dean. In particular, my thanks go to Dr. Matt Storey for his ever practical advice, and
especially to my fellow Assistant Dean and PhD student, Dr. Krishna Morker.
I am also grateful to all of my friends, and in particular to Toshi, Chris, Mags, Clairey
and Katie; Chris-M and Rachel, Nicola and Jenny; and to my new friends in
Cambridge, Liz and Laura. It goes without saying that I am incredibly grateful also to
Matt, for his love, support, suggestions, advice and help on a day-to-day basis
throughout my project.
Last but not least, I am indebted to all of my family for all of their continued
encouragement and belief in me. My biggest thank you must go to my Dad and Tracey
who have always supported me in ways too numerous to list. They both are a real
inspiration to me in my education and, more importantly, in my life more generally.
Chapter 1. Introduction
1.1 Preliminaries
This study investigates several key grammatical features of Lancashire dialect. While
the primary aims of the study are descriptive, there is also a strong theoretical and
the usage-based approach to language (see e.g. Bybee, 1985; Langacker, 1987; Croft
uses not only a substantial spoken corpus but also a corpus compiled from historical
dialect literature texts. These corpus resources are also supported by acceptability
judgement tasks in order to maximise the extent and significance of the data presented
here.
exhibiting dialectal variation in the British Isles (outlined in general by e.g. Kortmann
et al., 2004; Trudgill, 1999), and are therefore of prime interest for experts in dialect
Corpus) or in general (e.g. Barbiers et al, 2006, Syntactic Atlas of Dutch Dialects;
Chapter 2 examines the structure and use of relative clauses and explores the
outlined by Quirk et al., 1985:1252) has made inroads into Lancashire dialect. This
chapter also provides an interesting account of the potential diachronic changes that
have occurred in Lancashire dialect with respect to the use of zero relatives; an issue
construction found in Lancashire with modal meanings that can be similar to both
DOn’t HAVE to and mustn’t depending on context of use. The analysis examines how
these modal meanings interact with other semantically related constructions (e.g.
Northern Subject Rule. Few in-depth analyses of the NSR have been conducted in any
one region, and currently no such analysis for Lancashire exists. Most studies do not
address variables such as the possible interplay between the NSR and other similar
are also frequently overlooked in the current literature. This chapter analyses corpus
data from spoken and written sources and is supported by acceptability judgements
from questionnaire results in order to explore both the possible instances of the NSR
and morphological phenomena (such as the NSR) the result of rules or constraints,
and to what extent is this variation more idiosyncratic, unpredictable and region or
community-specific?
methodology proposed here asserts that grammatical features which can be considered
language that is produced by Lancashire dialect speakers (found in the spoken Sound
Archive corpus) and that which is perceived by them to be dialectal (found in the
of salience (and indeed in outlining and working with the concept of salience more
Overall, the thesis explores the idea that the interplay between non-standard
data and theoretical linguistics can be bidirectional: theory can inform the study of
dialect data, and dialect data can inform the formulation or further refinement of
linguistic theory (see also Hollmann and Siewierska, 2011; Hollmann, to appear and
references therein). Many approaches to grammar, e.g. Functional Grammar (see e.g.
Dik and Hengeveld, 1997), Construction Grammar (see e.g. Croft, 2001; Goldberg,
2002) and Cognitive Grammar (see e.g. Langacker 1991; Croft and Cruse, 2004), have
developed theories of language variation and change based on the analysis of data,
although rarely is any of this data drawn from so-called ‘non-standard’ sources. Other
(Kerswill & Williams, 2002) and enregisterment (Agha, 2003) may add to existing
linguistic theory and so are explored with respect to the Lancashire data considered in
this thesis.
1.2 Rationale
data, not least for reasons of locality. Lancaster University currently holds recordings
from the North West Sound Archive (outlined further in §1.3.1) and also provides
from local interest groups and dialect societies which continue to be popular and well
1 See e.g. http://www.thelancashiresociety.org.uk and http://www.edwinwaughdialectsociety.com
included in the Survey of English Dialects (Orton et al., 1962-71), this source does not
techniques (as outlined in e.g. Chambers and Trudgill, 1998). Lancashire results from
the SED are included in a number of cross-regional studies (e.g. Bresnan, Deo and
Sharma, 2007; Pietsch, 2005; Herrmann, 2005), but further data is required in order to
provide a fuller analysis of variation in this region. Recently, studies into the
Lancashire dialect have been conducted, such as those by Hollmann and Siewierska
(2006, 2007, 2011) and Siewierska and Hollmann (2005). Related local varieties have
also recently received some attention, e.g. Bolton (Shorrocks, 1999; Moore, 2004).
phonological variation has been discussed in more detail, e.g. by Vivian (2000),
Barras (2006) and more generally by e.g. Watson (2006); a trend that appears to be
Siewierska (2006:22), grammatical variation has by and large been overlooked due to
(namely, the Generative paradigm) and the unavailability of (sufficient quantities of)
suitable data.
This thesis builds upon previous research both in this region and in varieties of
English more widely in order to consider how dialect data can provide new insights
into cognitive and theoretical linguistics whilst also giving a descriptive account of the
1.2.1 Lancashire area
Lancashire is situated in the Northwest of England, and bordered to the north by the
county of Cumbria; to the east by the counties of North and West Yorkshire; and to
the south by the metropolitan counties of Greater Manchester and Merseyside. The
Before the 1974 local government reform, the County of Lancashire also encompassed
towns now situated in other surrounding counties (see Appendix A for a map of the
old County of Lancashire). The towns of Bury, Bolton, Oldham, Rochdale, Salford,
and Wigan are now part of Greater Manchester but were once at the heart of
Lancashire’s cotton and milling trade (along with other towns such as Burnley and
Chorley which remain in the county of Lancashire today). Other towns in the old
boundary reform, e.g. Knowsley, St Helens and Sefton now form part of Merseyside;
Warrington and Widnes are now part of Cheshire; and the Furness Peninsula,
Westmorland and Cartmel are now part of Cumbria. As a result, both linguistic and
towns and counties. Lancashire’s largest border is with Yorkshire, and parallels
between the language used in these two locations have been noted in a number of
Lancashire vary significantly. While the north of Lancashire is largely rural and in
some parts very sparsely populated (e.g. Carnforth, Silverdale), the south and east are
more densely populated and contain primarily industrial or formerly industrial towns
(e.g. Burnley, Chorley) which perhaps are influenced (both linguistically and
have been made by Shorrocks, 1999), phonological variation has been identified. A
found in south and east Lancashire, e.g. in the towns of Burnley, Blackburn, and
Accrington (Barras, 2006), but is absent in many other places within the County of
variation within county boundaries such as this is far from uncommon (compare e.g.
considerable variation found in the new county of Tyne and Wear, e.g. by Burbano-
interesting variation and possible levelling and diffusion of nonstandard features from
area to area, this variable is not considered within the scope of this study. Spoken
corpus data used in this study is taken from a selection of informants living in various
towns in modern-day Lancashire (see §1.3.2 for further details) and all results from
the corpus data are considered to represent Lancashire. In this thesis Lancashire refers
to the cultural-linguistic area rather than being fixed immovably to any county
boundary.
1.2.2 Lancashire identity
The link between language and identity is well attested in the literature (e.g. Bucholtz
and Hall, 2003; Holmes, 1997; Schiffrin, 1996). The concept of enregisterment
2003: 231). As found by Beal (2006) in Sheffield and Newcastle, the Lancashire
dialect is the subject of a number of humorous books, guides and glossaries such as
Completely Lanky (Dutton, 2006) and Lanky Twang (Freethy and Scollins, 2002).
dialect poetry, stories and songs (some of which are utilized in this study, see §1.3.2
for further details). Many other volumes exist detailing cultural and historical
(Baldock and Wood, 1995) and The spirit of Lancashire (Sparks 2009). Lancashire
merchandise is also found in gift shops and tourist information centres across the
region, often with “I love Lancashire” slogans emblazoned on various mugs and tea
towels (along with the more imaginative car stickers “Lancashire. There Will Be
Blood….pudding” referring both to the 2007 film by Paul Thomas Anderson and the
local delicacy, black pudding). This suggests that Lancashire has a defined set of
cultural and linguistic norms, certainly for the speakers of this variety, and that an
awareness of these norms may impact on language use in this region (see Hollmann
and Siewierska, 2011 for a concrete albeit very tentative suggestion in this direction,
speakers, attitudinal data was collected from all informants who completed the
acceptability questionnaires and/or tasks that were used later in this study (see §1.3.5
information such as age, location and also attitudinal information from the informants
(alongside, of course, the specific test questions in the main part of the questionnaire).
The inclusion of attitudinal questions aimed to test how the informants perceive the
Lancashire dialect and to uncover more about Lancashire identity and how this might
fit with the language use reflected in the questionnaires themselves. No specific
questions were asked about neighbouring regions (so as not to influence any response)
but instead informants were invited to respond to the open question “how do you feel
about your accent/dialect?” More than 100 people who identified themselves as
here with those informants also living in Lancashire who considered themselves not to
speak with any regional dialect, see §1.3.3 for more on this distinction). Around 65
gave positive responses, around 20 gave more negative replies and around 15 gave neither
(1-5).
(1) “Positive. I think it sounds friendly. It's part of my identity and people like it
or think it is funny. People instantly know where I'm from. It never sounds
pretentious.” (Lancs015)
(2) “Gives a sense of individuality from other regions. It’s different to Geordies
or Scousers or Mancs - and much nicer!” (Lancs029)
(3) “Positive, I like the fact its not cockney, or brummie, and you can get away
with murder (not literally of course!!) down south because they think we're
simple country folk, little do they know!” (Lancs004)
(4) “If I meet someone new then they know straight away where I'm from when I
begin to talk, for me this is a positive thing because I am very proud of being
a Lancashire Lass.” (Lancs083)
(5) “I love my Lancashire accent, far better than any other. The only problem I
have with it living down South is getting people to understand what I am
asking for when I order a c-o-a-k-e (coke!)” (Lancs011)
in their responses and many more highlight the separation of Lancashire from other
neighbouring regions, as demonstrated in (2) and (3). This formulation of an ‘us’ and
‘them’ idea in the minds of speakers shows that speakers are aware of both geographic
and linguistic differences between Lancashire and other local varieties. Interestingly,
regions now considered outside the County boundaries, typically from towns now
belonging to the northern part of Greater Manchester such as Bolton and Rochdale.
These results are not extensive enough to make generalizations about the status of
Lancashire dialect with respect to language contact and change, but indicate that
County boundaries.
There were of course other less positive views, mainly relating to speakers’
feelings of being portrayed as ‘common’ or ‘stupid’ or ‘poor’; these are shown in (6-10).
(6) “I don't really like it when people say I have a strong Lancashire accent (it's
usually people from the south of England) because I don't want to sound
'common'.” (Lancs005)
(7) “It sounds very bland, in comparison to scouse. Although the fact i've been
brought up in the chav capital of Lancashire, i managed to speak rather 'posh'
for a blackpudlian anyway.” (Lancs079)
(9) “I feel that in some ways northern accents (including Lancashire) are still
judged to be inferior or to indicate lesser intelligence or class standing, no
matter how many trendy regional people they put on the telly.” (Lancs050)
(10) “when growing up: to speak with a broad Lancashire accent was considered
'common' and restricted your position in the job market. I think it sounds
guttural and boring too.” (Lancs054)
It is evident from the above that the Lancashire dialect is a clear and distinct entity in
the minds of many Lancashire dialect speakers (both those contacted for this thesis and
beyond) and that Lancashire speakers are aware of social implications associated with
The data drawn upon in this thesis is outlined in the diagram in Figure 2 and expanded
[Figure 2: Overview of data sources used in this thesis. Recoverable box labels: Contemporary dialect literature (Lancashire Fairytales); Acceptability tasks and questionnaires.]
Further details on the size, collection dates and informants that contribute to the
various sources used in this thesis (as outlined in Figure 2) are shown in Table 1.
1.3.1 Corpora in sociolinguistics
The analysis of nonstandard regional dialects goes back at least to the nineteenth
century; see e.g. Chambers and Trudgill (1998) or Ihalainen (1994) for an overview.
Prior to the advent of large, rapidly accessible, annotatable and searchable electronic
corpora in the 1960s, dialectology relied on the notes of fieldworkers. Most of the
analyses were restricted to lexical variation, often producing isoglosses and word
maps (such as those found in the original SED results, see Orton et al., 1962-71). The
Currently a number of large spoken English dialect corpora exist, e.g. the
Freiburg English Dialect corpus (FRED) (Kortmann et al., 2002-2005) 2, which draws
on data from a number of regions in the UK; the Newcastle Electronic Corpus of
Tyneside English (NECTE) (Allen et al., 2006) 3; the Scottish Corpus of Texts and
Speech (SCOTS) (Corbett et al., 2004) 4; and the Limerick Corpus of Irish English
(L-CIE) (Farr, Murphy and O’Keefe, 2004) 5. This thesis uses over 800,000 words of
spoken and written corpus data, falling broadly into two parts: spoken data taken
from the North West Sound Archive (see e.g. Hollmann and Siewierska, 2006), and
dialect literature taken from (primarily) stories written by Lancashire speakers. The
2 http://www2.anglistik.uni-freiburg.de/institut/lskortmann/FRED/
3 http://research.ncl.ac.uk/necte/
4 http://www.scottishcorpus.ac.uk/
5 http://www.ul.ie/~lcie/
1.3.2 Sound Archive corpus
The Sound Archive corpus is a 325,000 word corpus, held at Lancaster University and
transcribed from oral history interviews held at the North West Sound Archive. 6 Of
the thirty-two Sound Archive recordings used in this thesis, seventeen were
transcribed entirely by me. The remainder were transcribed by an audio typist and
carefully checked and corrected by me. The Sound Archive recordings themselves (as
opposed to the transcriptions alone) were also used throughout this research in order
to double check any points that were found initially to be unclear. The recordings
were typically not represented. When variants found in the recordings were
was used. This particularly applied to the use of /mI/ for the Standard English my
(11) […] and me feet went from under me and t' axe went in me leg and there I were laid
on t' floor with axe in me leg (Sound Archive).
Local dialect lexis that did not appear in dictionaries was recorded consistently, e.g.
nobbut (meaning no more than, nothing but) and gradely (meaning fine or excellent).
local history project and involve speakers between the ages of approximately 55 and 80,
from both northern parts of Lancashire, e.g. Morecambe, Lancaster, Fleetwood (15
speakers) and more southern parts of Lancashire, e.g. Accrington, Chorley and
6 For further information on the North West Sound Archive, see http://www.lancashire.gov.uk/corporate/web/view.asp?siteid=2856&pageid=4970&e=e
Burnley (17 speakers). As the recordings made for the Sound Archive were intended
as part of an oral history project rather than for linguistic research, unfortunately little
distribution of Sound Archive informants is shown in Figure 3 where each green dot
Interviews range between 3,000 and 17,000 words in length. Speakers in the Sound
Archive corpus typically cover topics such as agriculture, wartime, farming and
fishing. An extract from the Sound Archive corpus is shown in example (12).
(12) No it were ni-- it were nice because they had them big pipes ‘cos we had them
big pipes in t’ greenhouses up smallholdings you know, them big, must have
been coal mustn’t it, anthracite coal yeah. And teachers had er, their room it
were in top of a buil-- at er Burnley Wood School, it were at top of one of er
buildings. And if I were wet through we used to have a change of clothes. See
we hadn’t to sit in them. And I used to have to stay at er er dining room on
erm Oxford Road to er have dinner. And then when it were winter time and it
were right dark we used to get out about three o’clock or four or well before
four o’clock. (Sound Archive)
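A corpus search over transcript text such as (12) can be illustrated with a short Python sketch. This is an illustration only and not the search procedure actually used for this thesis: the search pattern, the function names (kwic, per_10k) and the choice of a per-10,000-word base are all assumptions made for the example. The sketch lists keyword-in-context lines for were following a singular pronoun, and normalises a raw count against corpus size so that texts of different lengths can be compared.

```python
import re

# Hypothetical pattern: 'were' directly after a singular subject pronoun.
# This is an illustrative choice, not the search string used in the thesis.
PATTERN = re.compile(r"\b(I|it|he|she)\s+were\b", re.IGNORECASE)


def kwic(text, pattern=PATTERN, width=30):
    """Return (left context, match, right context) for every hit in text."""
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


def per_10k(raw_count, corpus_size):
    """Normalise a raw frequency to occurrences per 10,000 words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 10_000 / corpus_size


# Two sentences adapted from example (12), used here as a toy input.
sample = ("No it were nice because they had them big pipes. "
          "And if I were wet through we used to have a change of clothes.")

hits = kwic(sample)
for left, match, right in hits:
    print(f"{left:>30} [{match}] {right}")
```

Every hit returned by a pattern of this kind still needs manual checking, since the pattern cannot separate dialectal past tense were from the Standard English subjunctive (if I were).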
While oral history interviews are a good source of relatively unrestricted speech, this
corpus data is strongly biased towards past tense constructions. Attempts have been
made to compensate for this bias by employing additional analytical
Lancashire dialect as used by its speakers towards the end of the last century. If
analysed alone, in isolation from other data, it would allow only synchronic
descriptive observations to be made. However, by combining the data from the Sound
Archive corpus with that drawn from elicitation (as advocated by e.g. Hollmann and
Siewierska, 2006 and described with respect to this thesis in §1.3.5) and dialect
Broadly speaking, dialect literature is here intended to mean stories and narratives
written with the intention of representing dialectal speech of that region, by writers
from that region. This is different from the orthographic representation of dialect
literature’ and ‘literary dialect’ respectively; this study concerns only dialect literature.
The history of writing in dialect is extensive, but despite this, its use in linguistic
research is relatively recent. A number of studies have considered historical dialect
texts (a number of which are from Lancashire), although none of these are in the
context of measuring language change (e.g. Shorrocks 2002; Ruano García 2007). It is
aspect of dialect literature lies in the conscious respelling of words by the writers. If
by the author (as suggested by Sebba, 2009), then these features give an extra layer of
significance to the grammar and lexis chosen by the writer. While of course
respellings naturally lend themselves to a phonological analysis, I argue that they are
also interesting in terms of whether or not the distribution of these respellings may
More recently, as reported by Beal (2000, 2009) and Honeybone and Watson
and books about regional dialects are now relatively common, e.g. for Scots - Wha’s
Like Us? (Say it in Scots) (Robinson, 2008); Geordie - Larn Yersel’ Geordie (Dobson,
1969); Cornish - Oall Rite Me Ansum!: A Salute to the Cornish Dialect (Merton and
Scollins, 2003). Resources such as these can provide insights into language choices
made by these writers, but are not the focus of this study. It is interesting to note that
more recently still, even newer sources of dialect literature have begun to emerge via
the Internet. Spoof encyclopaedia pages and discussion forums are prevalent, with
many contributors not only sharing regional words and phrases, but also writing in
their regional dialect. 7 Alongside this, the social networking site Twitter has recently
provided instances of dialect writing that could merit further study. Shown in (13) is a
Twitter.
(13) Cheryl Kerl: Oh aye pet Ah embrace Europe me man, an Ah’m propah
multilingwill an aall. Ah speak English, Esperanteaur an uv coase Jawdee az
well (18th March 2011)
While there are, as yet, no examples of the Lancashire dialect being represented via
this medium, this nonetheless remains an interesting possibility for future research.
Currently, much of the research into dialect literature has been small scale,
confined to one or two texts per study. There has been no extensive corpus-based
analysis of a collection of dialect literature and it has not been used to measure
language change. It is hoped that this may change, both due to the approaches
advocated in this thesis and to the newly available Salamanca Corpus, a digital archive
The dialect literature found in stories and narratives in this region cannot be
record of the writers’ perception and representation of speakers at the time of writing.
Because of this, analyses of dialect literature can uncover the most salient or important
7
See the spoof Wikipedia page for the Lancashire dialect (i.e. Lanky Twang):
http://uncyclopedia.wikia.com/wiki/Lanky_Twang and forum threads discussing Lancashire dialect
such as this: http://www.redvee.net/forums/showthread.php?11439-lanky-twang
8
See http://salamancacorpus.usal.es/SC/index.html for further information.
The Lancashire dialect literature corpus (Litcorp) used in this study is larger
than the Sound Archive at approximately 500,000 words in length. It is compiled from
six books written in the Lancashire dialect by a variety of authors, sourced from
converting the files to plain text format using conversion software, the data can then
The dialect literature books were written in the period 1855–1945, and are narratives,
monologues and plays. Songs and poems were avoided due to possible interfering
factors such as rhyme. A full list of titles included in the Litcorp can be found in
LANCASHIRE PRIDE
I
SPRING CLEANING
TOMMY GREENHALGH walked moodily into the bar of the
“Hark to Dandler.” His pal Jimmy Dearden was just “taking
the top off” his drink as Tommy arrived. “How do Jimmy,”
said Tommy. Jimmy wiped the froth from his exuberant
moustache and said “How do Tommy. Tha looks a bit
powfagged. Owt up?”
“Ah don’t know as there is,” said Tommy. “Ah’ve
come out o’ th’ road.”
“Out o’ th’ road o’ what?” said Jimmy?
“Out o’ th’ road o’ battle, murder an’ sudden death,”
said Tommy vindictively. “Ah’ve come out o’ th’ road of an
earthquake.”
“Well,” said Jimmy, “Tha’s reached the harbour o’
refuge.”
“For th’ time being,” said Tommy. “Just for th’ time
being.”
“Ha’ one wi’ me?” said Jimmy?
“Tha’s took th’ words out o’ me mouth,” said Tommy.
has been collected. Respondents were asked to write in what they considered to be
Lancashire dialect. This means that this corpus captures the current perception of the
respondents (along of course with any possible phonetic representation they choose to
include). This fusion of elicitation and dialect writing is new and will provide a useful
In order to end up with stories of suitable length, the participants were asked
to reproduce a story that was familiar to them – a fairy tale. In building this new
contain (e.g. grammatical variation, lexical choices, and semi-phonetic spellings). The
mixed age range and were contacted through social networking websites. An example
of the texts produced by the informants is given in (14) (further instances are found in
Appendix F).
(14) […] the prince, he were broken hearted, and he says, “i’m gonna find me
lovely lass, im gonna search all round kingdom!” And off he went down
t’road, holdin onto the clog that she’d left ont ground […] (Lancashire
Fairytales)
also employs, from time to time, the British National Corpus (BNC). The BNC is a
100 million word corpus drawn from both written and spoken language from a wide
range of sources collected during the 1990s. The BNC was designed to represent a
cross-section of British English. 9 The BNC does not form a considerable part of
this study, but is instead used as a reference corpus at various points with which to
9
For further information on the BNC, please see http://www.natcorp.ox.ac.uk
1.3.5 Questionnaires and other methods
Questionnaires are used in this study in order to both include the perceptions of
more modern speakers, and to allow a (tentative) further time depth comparison with
the corpus data. The questionnaires that I have devised tested variables such as the
constructions further, in order to compensate for the dominance of the past tense
constructions in the corpora. The questionnaires also targeted different groups often
to judge sentences on a five point scale, with 1 being the least acceptable to them and
(15) ‘They have a shop of their own and is very well off.’
(least acceptable) 1 2 3 4 5 (most acceptable)
The questionnaire data is compared to the results from both corpora. Since all
three data sources were gathered by different means and cover different time periods,
a combination of these results should substantiate any claims made, but also shows
respondents was also gathered. This information is used in order to divide the
respondents into different categories throughout this work, depending on the various
aims of each chapter. For example, in Chapter 4 both Lancashire and non-Lancashire
Lancashire dialect in answer to the question ‘do you have a particular dialect? If yes,
how would you describe it?’ are classified as ‘Lancashire, dialect speakers’. Other
speakers who identify themselves as living (or having lived) in a Lancashire town or
village for a majority of their life, but suggested that they did not have a Lancashire
Based on the attitudinal data presented in (1-10) it seems that modern Lancashire
features of their language use. It is also clear that grammatical variation exists within
this region, some of which is relatively well known (such as definite article reduction,
including deletion (see Hollmann and Siewierska, 2011 for discussion). Variation and
change are not characteristic of one language area to the exclusion of other areas:
phonological, lexical and discourse variation are also associated with regional
variation and are linked together. Phonological features (as represented through
nonstandard spelling) and lexical choice are not considered at length in this thesis but
This thesis avoids grammatical features that have been the focus of previous
possessive me (Hollmann and Siewierska, 2007). (This is except for Chapter 5 where
grained analyses of the corpus data revealed numerous grammatical and spelling
change with respect to factors such as language communities, prestige, and language
contact (see e.g. Milroy and Milroy, 1992; Labov, 2006; Trudgill, 2008). More
recently however, the usage-based model has received some consideration within the
field of sociolinguistics (see Hollmann and Siewierska, 2011, for a discussion of this).
knowledge and language use is sensitive to frequency of usage. This means that token
Langacker, 1987; Croft and Cruse, 2004: 291-327). It therefore follows, for example,
that language structures that are used more often (and therefore have a high token
frequency) may become more reinforced in the minds of speakers. This entrenchment
language change (i.e. it is possible that entrenched constructions may resist language
change).
varieties (e.g. Tomasello, 2003; Mukherjee, 2005), studies such as these rarely
consider any possible standard vs. nonstandard variation that may be present in their
data. Along with this, the usage-based model has often been disregarded by
varieties found in their data (e.g. Hollmann and Siewierska, 2007; Clarke and
Trousdale, 2009).
possible explanatory factor themselves, while also taking into account the more
prestige, language contact etc.) In doing so, this study contributes further to the
description of the interplay between corpus linguistics, nonstandard data and linguistic
theory.
The present study links with and contributes to previous research in a number of ways,
combining empirical aspects of corpus research with the descriptive approach in order
outlined earlier, along with a large number of acceptability judgement tasks. The
examined features of Lancashire dialect provide a fruitful testing ground for theories
salience as interpreted within the usage-based framework (see e.g. Croft and Cruse,
2004: 291-327).
Overall, the thesis explores the idea that the interplay between non-standard
data and theoretical linguistics can be bidirectional and that a successful description of
corpora of different types in combination with elicitation and acceptability tasks can
give the most constructive results when examining nonstandard varieties such as this.
Chapter 2. Relativization
2.1 Overview
Relative clauses (RCs) have been the subject of much research into varieties of British
English (e.g. D’Arcy and Tagliamonte, 2010; Kearns, 2007; Tagliamonte, Smith and
Lawrence, 2005; Beal and Corrigan, 2002; Herrmann, 2002; Fox and Thompson,
1990; Ihalainen, 1980). Although the Lancashire dialect region has been (briefly)
studied as part of a wider investigation into British English (e.g. by Herrmann, 2005),
and certain claims have been made about the behaviour of particular RCs in this
available. This chapter tests whether or not claims made by other researchers are
supported by the Lancashire data examined in this thesis. More generally, this chapter
aims to set out the relativization patterns that are frequently found in Lancashire. As
part of this, the assertion that regional dialects deviate from Standard English with
respect to relativizer choice in particular contexts is tested (see e.g. Quirk et al, 1985;
grammatical features which will be tested include types of relativizer (namely, the
the antecedent and restrictiveness). These concepts are discussed further in §2.2.
Relative clauses consist of a finite relative clause verb (henceforth RV), and (usually)
pronoun (called the antecedent), to which it is syntactically linked. An example of
such is the sentence the man who wasn’t there, where was is the RV, who is the
relativizer, and the man is the antecedent. Henceforth in examples, RCs are shown in
(1) I used to go in a little toffee shop in Bridge Street and it was my wife [who
owned it] and I used to get my cigarettes there various times of the day
(Sound Archive)
which narrows the antecedent’s frame of reference. It is worth noting that the
relativizer takes on, or repeats, the semantic target of the antecedent, and as such, is in
itself “redundant”, and does not (normally) convey any new information, unless the
RCs. RCs differ from main clauses, in that they may (a) begin with a relativizer; or, in
the absence of a relativizer, (b) bring about a different word order from that of a main
clause; for example, in the sentence someone I know knows him, there are two
adjacent subject nouns and two adjacent finite verbs, neither typically acceptable main
clause word orders. The syntactic relation between the antecedent and the RV may
vary, as do other noun-verb relationships (e.g. subject, object, etc.); this is described in
more detail in §2.2.1.3. Standard English has the following relativizers: which
(implying that the antecedent is not a person), who, whom, whose (implying that the
antecedent is a person), that (the most frequent and not implying anything about the
antecedent), and finally, a null relativizer zero or Ø, permissible with both person and
non-person antecedents. In Standard English, the antecedent usually precedes the RC,
but may follow it, though such types of relativization are not considered in this
investigation.
Relativization is not the only method of conveying additional information
about a noun whilst including a verb phrase: this can also be done with a noun-verb
and a finite auxiliary verb to convert the structure into a relative clause, e.g. making “a
biscuit [that is covered in chocolate]”. This can also be achieved with noun-adjective
patterns, e.g. “a problem [solvable by hard work]” where the relativizer and the
auxiliary BE can be used in the relative clause, i.e. “a problem [that is solvable by
hard work]”. Another option, not always available in SE, is to use a preposition phrase
e.g. “a man [in need of help]”, and its relativization “a man [who is in need of help]”.
Although these semantically related constructions are not considered in the data
examined in this chapter, it is worth noting that this choice of conveying similar
semantic information with differing syntactic patterns may impact upon distributional
frequencies found later in the results, in this thesis as a whole, and of course in the
the speaker or writer. Certain relativizers are typically linked to certain conditions (as
discussed later in e.g. restrictiveness in §2.2.3). Aside from the wh-relatives and that-
relatives which are commonly described in the literature, further types of relativizers
are found, namely zero relatives (Ø) and non-standard relativizers. These are
2.2.2.1 Zero relatives
Zero relatives (ZRs) are RCs that are not introduced by an overt relativizer. In many
al. (1985:865) due to this “deletion” of the relativizer for subject relatives as shown in
e.g. (2).
(Erdmann, 1980; Auwera, 1984), but here I follow Fischer (1992) in using zero
relative. Most accounts of Standard English do not clearly outline the semantic and
syntactic circumstances where ZRs are acceptable (and indeed are frequently used);
there does not appear to be a consensus on the range of environments that ZRs are
able to occur in. Much of the literature describes a number of core examples, namely,
existential there, existential have, it clefts, and clauses in which the main verb
(3) Well everything had a season , I can't remember when it was , we used to play
topping whip, er marbles in the channel, er skipping, lots of skipping where
you dash in and there's a line [Ø 'd be waiting to jump in and out of the rope].
(Sound Archive)
(4) 1: Have your relations worked in the slaughterhouse, it just seems an unusual
pastime for a 10 year old?
2: I'd two brothers [Ø worked there]. (Sound Archive)
(5) It were ‘is father [Ø made toil of his holiday] in the hope of benefitting his
boy (Litcorp)
(6) I have heard of a schoolmaster [Ø taught his pupils to say it the same way].
(Litcorp)
Certain other less widely attested ZRs are also described, e.g. by Doherty (2000: 87), a
number of which are found in the Lancashire data, e.g. NPs headed by free choice any
(7) An ee weret nobu’ trouble, that mon [Ø lived overt’ road from Mearey].
Alwus upter summat. (Litcorp)
(8) More than any place [Ø I 've ever been in in my life], it was full of plovers '
nests, plovers, tewits as we called them (Sound Archive)
It should be noted that there is a distinction between sentences in which the relativizer
has a subject vs. non-subject antecedent. This can be seen in examples (7) and (8)
(such as that shown in 8) are often perfectly acceptable, as suggested by e.g. Olofsson
(1981:94). These relative constructions therefore may not be a notable feature of the
Lancashire dialect in particular, but rather a construction often found in English more
generally.
distinctions are not made. Herrmann (2005:35) discussed have and be existentials and
Anderwald (2004:189) examines only instances of existential there clauses. All of the
ZRs exemplified here are tested with respect to the Lancashire data, both in the
speakers, (see §2.2 for further information) in order to both outline the distributions
describe ZRs as being nonstandard and primarily found in regional and/or informal
varieties of English (although as Tottie (1995) points out, zero relatives are persistent
in written English too). Such relativization has indeed been identified in regions of the
UK (e.g. by Tagliamonte, Smith and Lawrence, 2005; Beal and Corrigan, 2002). This
chapter explores ZRs in order to provide a descriptive account of their behaviour and
acceptability in Lancashire, and also to test if they have undergone any diachronic
change.
Recent research has documented a number of ways in which varieties of English differ
from Standard English (and from each other) in their relativization strategies (e.g.
Ihalainen, 1980; Bailey, 1999; Beal and Corrigan, 2002; Herrmann 2002;
Tagliamonte, Smith and Lawrence, 2005; Kearns, 2007; D’Arcy and Tagliamonte,
2010). Many report that wh- RCs (namely those found with the relativizers who,
whom, which and whose) are most frequently associated with Standard English, while
other relativizers (such as that, Ø, and the less frequent as) are linked more often to
dialectal speech (e.g. Quirk et al. 1985:1252; Herrmann 2002:94). In many recent
regional dialects is described. This is, perhaps, expected due to dialect levelling and
(such as Herrmann, (2005) based on data from the Survey of English Dialects (Orton
English. In studies that consider Lancashire data to some degree (e.g. Shorrocks,
1999), instances of what and of at relativization are pointed to as typical for this
northern English; Beal and Corrigan (2002) found this to be one of the least preferred
These nonstandard relativization patterns are accompanied by nonstandard
from the Lancashire region. Herrmann (2002:33), for example, considers at, ut and t’
be determined (see §2.3.) There is also disagreement among authors about whether at
Romaine (1982a:70) suggests that even if at was different, it has become mentally
merged with that over time. The distribution of these nonstandard relativizer forms
A survey of the literature reveals that, most typically, forms of relativization are
influenced by various factors (both syntactic and semantic). Firstly, with respect to
syntactic function: what is the syntactic relation between the relativizer and the RV? Is
the relativizer the subject of the relative clause verb, the object, etc.? Secondly, with respect to
the semantic category of the antecedent: is it human or not? Finally, with respect to
narrows the meaning of the antecedent, to avoid confusion with other possible
antecedent, and is not intended to narrow the meaning. These factors are now outlined
in subsequent sections.
Relative clauses can be distinguished from each other according to the syntactic
relationship between the relative clause finite verb and the relativizer (or antecedent, if
there is no relativizer); these constructional differences are known as subordination
types. Much of the literature describes three basic types: nominal, adnominal and
sentential. Nominal RCs are those that function as a noun, e.g. can be subject or object
of a verb, and they have no antecedent, as shown in (9); adnominal RCs modify
nouns, and are perhaps the most common example of RCs, shown in (10); finally,
sentential RCs act like a separate sentence, and their antecedent is a preceding
sentence (11).
(9) [...] and er you went home and got [what was left for you] and you had that.
(Sound Archive).
(10) [...] then this letter arrived saying you 'd got a place at the secondary school
[which was in Preston] (Sound Archive)
(11) And slowly but surely, thankfully I gained their trust, [which is very very
important]. (Sound Archive)
Adnominal RCs are the prototypical RC, and will be the main focus of this chapter.
From this point, unless otherwise stated, RC refers only to adnominal RCs.
attributes which describe what the word means; categories may be as general or
specific as is needed for a particular task. A noun referring to a human being can be
described as having the semantic category “person”, and this entails the attributes of
animacy or being alive, as well as more human specific attributes such as having
head noun a person, or not? Antecedents having semantic category person often occur
in English with the relativizer who, and who only occurs in SE with a person
antecedent, so it would not be grammatical Standard English to say “*There’s a dog
[who goes for walks here].” On the other hand, the relativizer which can only occur in
contrasts the head noun from any other nouns with which it may have been confused;
In (12) the friend is being contrasted with the speaker’s other friends who do not live
in Manchester, whereas in (13), the fact that the brother is not feeling well is not used
in order to define which brother is being referred to, but is just adding incidental
information about that brother. Often, NRRCs are preceded by a pause in speech, or a
comma in writing, which is not the case for RRCs; this perhaps reflects the nature of
these RC types: RRCs are adding information needed for disambiguation of the
antecedent, so the information should be said sooner; on the other hand, NRRCs are
adding information not needed for identification, so that information can afford to
RCs are often distributed in patterns, e.g. certain relativizers prefer certain RC
Pullum, 2002:1056). However, it is possible that both that and Ø may be used in non-
restrictive contexts. While this may be less frequent in Standard English, it is reported
(2002:104).
(14) [...] I seen Eric Adams [that lived there], he said it come one Sunday
Dinnertime
(15) […] there was Mr McNaughton and Ben Weir from Kendal [Ø came round
buying horses]
Researchers often link NRRCs with the wh- relativizers, rather than with that
or zero (Huddleston and Pullum 2002:183). With regard to zero relatives, Quirk et al
(1985:1985) suggest that ‘non-restrictive zero cannot occur’ (see §2.2.1 for a further
discussion of zero relatives). Such conclusions are most typically based on Standard
English alone and so are tested on the nonstandard data presented in this chapter.
non-restrictive relativization - it is not always clear in what way or to what extent the
noun phrase is being restricted. Indeed, unless all relevant facts are known about the
antecedent (and its candidates), it is impossible to tell for certain what patterns the
speaker is using. One kind of restrictiveness ambiguity is that where more than one
(an unlimited number) of antecedent candidates are being described; a number of such
(16) Then I had another brother [what went to er America, nineteen twenty
three]. (Sound Archive)
From this example, it is unclear if the speaker is using the restrictive meaning, which
implies the speaker has other brothers who went to America, or if the speaker is using
the non-restrictive meaning, which is not explicit about other brothers, but may be used
to imply that the other brothers did not go to America. In cases like this, a look at the wider
context often reveals the intended meaning (which in this instance is non-restrictive,
i.e. does not imply any other brothers went to America), as shown in (17).
(17) There were, he were a poultry farmer. He were a loomer at first. Then he were
a poultry farmer, you see they were all allotments and poultry farms and pig
farms and er. I had another brother what er were dairyman at er Townley.
Then I had another brother [what went to er America, nineteen twenty
three]. (Sound Archive)
There were 27 results where ambiguity resolution was not possible, even by looking at
the wider text, and all of these were excluded from the results presented in this
chapter.
A number of other factors (aside from those outlined above) have been found to affect
relativizer choice; these include: proximity of the relativizer and antecedent (Quirk
1957); length of the RC (Quirk 1957; Tagliamonte, Smith and Lawrence 2005); clause
complexity (Tagliamonte, Smith and Lawrence 2005); and discourse features (e.g.
information flow, Fox and Thompson 1990). It is also likely that social values ascribed
to particular constructions may impact upon their use within a speech community (as
outlined by Hollmann and Siewierska, forthcoming). These factors fall outside
the remit of this analysis but are possibilities for future research.
2.2.4 Diachronic change in relativization
replaced an earlier system where that was the primary relative marker (e.g. Mustanoja
1960; Romaine 1982; Montgomery 1989). By the late seventeenth century, the three
relative markers which, who, and that were used in much the same way as they are
today (as outlined in e.g. Quirk et al. 1985). While this may be the current situation for
Standard English, many regional varieties appear not to have implemented wh-
relativization to the same degree, or at the same pace, and many display a preference
favoured in spoken rather than written language (e.g. Quirk 1957; Romaine 1982), and
so we would expect this pattern to play out in the Lancashire data.
This chapter examines how syntactic and semantic conditions may influence RC use
relativization in a number of related areas (e.g. Herrmann’s 2002 study which includes
Smith and Lawrence’s (2005) study which includes results from Maryport in
Cumbria). By examining corpus results both from the older Litcorp and the more
The corpus results are supported by questionnaire data, and this aims to identify usage
corpus and questionnaire data (as also demonstrated elsewhere in this thesis) allows a
more descriptive account of how RCs are used in Lancashire, (see Hollmann and
Studies into RCs have outlined significant variation from that typically found
addressed:
(a) How does Lancashire differ from Standard English (and from other dialects of
(b) What types of ZRs are found in Lancashire? How frequent are ZRs and how
(c) What types of non-standard relativizers are found in Lancashire? This includes
(d) Are any of the factors which can influence relativizer choice at work in the
antecedent.
(e) Has there been any change in relativization strategies over time? This includes
(f) Can questionnaire data tell us anything more about the acceptability of ZRs?
2.3 Methodology
As with all other chapters in this thesis, spoken transcribed data from the Sound
Archive corpus is analysed along with written data from Litcorp (please see §1.5 for
more details on these sources). In order to support the Sound Archive and Litcorp
analyses, a questionnaire exploring ZRs such as there’s a man down the street Ø goes
there too is targeted at Lancashire dialect speakers in order to test the acceptability of
different types of zero relativization (as outlined in §2.2.1). More specifically, this
questionnaire has two aims: to examine whether or not zero relativizers are
speakers, and to test a number of ZRs that are less frequently included in discussions
on ZRs yet can be found in the Lancashire data (namely NPs headed by free choice
any and ZRs as a modifier of that-phrase). Results from these questionnaires are
Studies of RCs have been completed in a number of other regions of the UK, e.g. in
Sheffield and Newcastle (Beal and Corrigan, 2002); in Scotland (Romaine, 1980); in
Somerset (Ihalainen and Harris, 1980) and in Dorset (Van den Eynden, 1992).
Currently no such analysis exists for the Lancashire region. Lancashire data has
and also mentioned briefly in other more general studies of Lancashire (e.g.
Shorrocks, 1999). Tagliamonte, Smith and Lawrence (2005) include information from
nearby Maryport in Cumbria and it might be the case that similarities between
approach employed by Tagliamonte, Smith and Lawrence differs slightly from that
proposed here (their analysis includes RRCs only), it will nonetheless be interesting to
Historically, studies of regional grammatical variation were based on elicited
data only, with the aim of compiling distributional maps and isoglosses (as found in,
e.g. the Survey of English Dialects project (Orton, 1969-71)). More recently, corpus-
based approaches have been employed in the study of dialect grammar, with many of
these considering relativization in their analyses. For example, Beal and Corrigan
(2002) draw on the Newcastle Electronic Corpus of Tyneside English (NECTE) and
the Survey of Sheffield Usage (SSU) in their study of these regions and Tagliamonte,
Smith and Lawrence (2005) use the 1 million word ROOTS corpus (Tagliamonte,
2001-2003) in their analysis of a number of UK regions clustered around the Irish Sea.
The use of corpora in these studies allows the retrieval of a large number of instances
obtained.
The nature of RCs (and in particular, ZRs) means that they can often be
difficult to extract from corpus data (for a further discussion of this, see §2.2.2). It is
in instances like this where acceptability judgements and elicitation tasks can help to
corroborate existing corpus results and target forms absent in the data, thus providing
As just mentioned, extracting RCs from corpus data is not an easy task, in part due to
complementizers), but also due to the wide range of semantic interpretations involved
with related concepts such as e.g. restrictiveness. While overt relativizers can be
searched for individually (i.e. a search for the individual form which or whose etc.),
results obtained in this way of course include non-RC uses, as shown with what in
(18) If you take the lads you grew up with that were fishing then, what were they
doing during the war? (Sound Archive)
as RCs, software such as this is most typically trained on Standard English only. A
preliminary test using the Stanford Parser 10 indicated that the nonstandard variation
found within the Lancashire corpus data (and in particular the considerable grammatical
and spelling variation found in Litcorp) led to inaccurate parsing and unreliable
results. Because of this, all overt relativizer forms are searched for in the corpora
individually. From these results each sentence is then manually sorted and either
based on semantic and syntactic analysis. Those sentences containing RCs are then
subject to further analyses (e.g. for syntactic function, semantic category of the
motivation to carry out this manual search. As here I propose that restrictiveness in
Lancashire may not be solely linked to relativizer type (i.e. syntactic factors), this
means that primarily semantic interpretation is needed. This qualitative approach can
Litcorp aim to represent their dialect using semi-phonetic spellings, it is not possible,
for example, to search for ‘who’, and find every instance of what the writer may
10
For more information, see: http://nlp.stanford.edu/software/lex-parser.shtml
intend as who, due to the numerous variant spellings. A small sample of the variant
(19) In a bit two farmers [wot lived at Marton] coom in, and they’d a collie dug
wi’ urn. They’d bin takkin’ cattle to Poulton (Litcorp)
(20) To this mon ([whooa I soon percciv ‘t wur th’ Clark]) th’ Cunstable tow’d it,
an he began o whackering as if id stowd is Geese (Litcorp)
(21) Well, I fairly chinked wi’ lowfin’ at that, for Jim were cleeon shaven, an’
what Bess had tan for a mustache were thoose three hairs on Jim’s wart [ut
had tickled her face!] (Litcorp)
of the Litcorp text. Variant spellings found in Litcorp include whoa, (who); whooa
(who I); whoos (whose), whot, wha, wot, ot (what); tha’ (that) and the slightly less
transparent ut, although not all of these spellings were consistently used to signal
relativizers; this is discussed further in §2.4.1 with particular reference to what, that
and ut.
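The individual search for variant spellings described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the grouping of spellings under each standard form follows the list given in the text, while the function names and the regular-expression approach are my own assumptions. The output is only a list of candidate hits, which would still need to be sorted manually into RC and non-RC uses.

```python
import re

# Variant spellings listed in the text for Litcorp, grouped by the
# standard form each is taken to represent (grouping is an assumption).
VARIANTS = {
    "who": ["who", "whoa"],
    "who I": ["whooa"],
    "whose": ["whose", "whoos"],
    "what": ["what", "whot", "wha", "wot", "ot"],
    "that": ["that", "tha’", "ut"],
}

def build_pattern(spellings):
    """Compile one pattern matching any of the spellings as a whole word."""
    alternatives = sorted((re.escape(s) for s in spellings), key=len, reverse=True)
    # (?<!\w) and (?!\w) behave like word boundaries but also work for
    # spellings that end in an apostrophe, where \b would not.
    return re.compile(r"(?<!\w)(?:" + "|".join(alternatives) + r")(?!\w)", re.IGNORECASE)

def find_candidates(sentences):
    """Return (standard_form, matched_spelling, sentence) triples for manual sorting."""
    hits = []
    for sentence in sentences:
        for standard, spellings in VARIANTS.items():
            for match in build_pattern(spellings).finditer(sentence):
                hits.append((standard, match.group(0), sentence))
    return hits
```

Because the match is on whole words, ut is not retrieved from inside but, and ot is not retrieved from inside wot; ambiguous forms such as ut are returned regardless of function and must be disambiguated by hand.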
ZRs cannot be retrieved from the corpus using the same methodology as their
overt counterparts since there is no search term to input (it is not possible to search for
methods have been used to automatically uncover ZRs from parsed corpus data with
some success (e.g. in the Penn Treebank project; see e.g. Marcus, Santorini and
Marcinkiewicz, 1993), data that is POS-tagged only, as is the case for the Lancashire
POS tags to retrieve possible instances of ZRs from corpus data. Lehman is able to
narrow down the results by formulating a POS tag search for the construction: finite
verb + NP + finite verb (e.g. I have a home help [Ø does my shopping]). Although
this approach may not capture every instance of ZRs in the corpora (e.g. those with
more complex NPs with extensive pre and postmodification) searching the corpora
manually was not a feasible option, given their size. Instead, a smaller in-depth study
words) aims to capture how zero relatives are used in more detail. A sample of a
similar size from Litcorp is also examined. These limitations on corpus-based searches
for ZRs also lend weight to the inclusion of questionnaire data to support the corpus
findings.
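The POS-tag search for the construction finite verb + NP + finite verb can be sketched as below. The tag labels (VFIN, DET, ADJ, NOUN, PRON) are hypothetical simplifications and do not correspond to the tagset used for the Lancashire corpora; the NP is approximated as an unbroken run of determiners, adjectives, nouns and pronouns, which is exactly why more complex NPs with extensive modification would be missed.

```python
# Hypothetical, simplified tag labels; a real tagset would need its own
# mapping onto these categories.
FINITE_VERB = {"VFIN"}
NP_TAGS = {"DET", "ADJ", "NOUN", "PRON"}

def zero_relative_candidates(tagged):
    """Return (start, end) index pairs for finite verb + NP + finite verb.

    `tagged` is a list of (token, tag) pairs for one sentence; `start` is
    the index of the first finite verb and `end` that of the second.
    """
    spans = []
    for i, (_, tag) in enumerate(tagged):
        if tag not in FINITE_VERB:
            continue
        j = i + 1
        # Consume a simple NP: an unbroken run of NP-internal tags.
        while j < len(tagged) and tagged[j][1] in NP_TAGS:
            j += 1
        np_length = j - (i + 1)
        if np_length > 0 and j < len(tagged) and tagged[j][1] in FINITE_VERB:
            spans.append((i, j))
    return spans

# The example from the text, tagged by hand for illustration.
home_help = [
    ("I", "PRON"), ("have", "VFIN"), ("a", "DET"), ("home", "NOUN"),
    ("help", "NOUN"), ("does", "VFIN"), ("my", "DET"), ("shopping", "NOUN"),
]
candidates = zero_relative_candidates(home_help)
```

Each returned span is only a possible ZR; as with the overt relativizers, the candidates would still need manual checking.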
2.3.3 Questionnaire
The questionnaire is used in this chapter primarily to gather data on ZRs. The
questionnaire contains two parts. The first part tests the acceptability of particular RCs
Lancashire speakers. More specifically, the questions on ZRs examine four main
types: existential there as shown in (22), existential have (23), it clefts (24) and main
(22) There’s a young girl I know [Ø has got that one too].
(23) I’ve something [Ø might help you sort out the problem].
(25) I met a lady the other day [Ø could do the same sort of thing].
Along with the examples in (22-25), the less well-known ZRs as outlined in
§2.2.1 are also tested, namely NPs headed by free choice any and ZRs as a modifier of
(27) I didn’t really know her, that girl [Ø lived round by the market].
In the first part of the questionnaire participants were asked to judge sentences, such
as those detailed above, on a five point scale, with 1 being the least acceptable to them
(28) ‘There’s a man down the street goes there every week too’
(least acceptable) 1 2 3 4 5 (most acceptable)
The second part of the test required respondents to link together two clauses with a
relativizer in order to produce one complete sentence. The wording of the question is
shown in (29).
(29) Below are two statements. Combine these statements together into one
sentence. Two examples are given below:
STATEMENT: There’s a girl in the kitchen. She ate the last cake.
RESPONSE: There’s a girl in the kitchen who ate the last cake.
This second part of the test aimed to uncover which relativizers respondents would
choose in relatively free production. This part of the test contained 18 sentences that
2.3.4 Classification and division of respondents
questions about the age, location and background of the respondent. Unlike later
questionnaires (see e.g. Chapter 4, §4.4.6), in this instance only Lancashire dialect
18-22. The remaining 115 informants were reached via social networking websites
questionnaire. Most online participants were of a mixed age range, with the average
age being 36. Online informants were then encouraged to pass on the questionnaire to
their colleagues, family or friends if they thought it was likely that they would also
variation could very likely be a factor that influences results presented in this chapter
(i.e. as described by Milroy, 1980), this area is too broad to be discussed under the
remit of this thesis. However, to test the possibility of using social media to
crowdsource for sociolinguistic research such as that carried out in this chapter, an
additional question was inserted into the online version of the questionnaire. This
question was: where did you find the link to this survey? The results for each response
are shown in brackets - a. directly from a Facebook group (48); b. via Twitter (22); c.
from someone I know (25); d. from the researcher directly (20). This suggests that
using social media is a viable direction for further research into both reaching a wider
11 For further information, see http://www.facebook.com
2.4 Results and discussion
The overall frequency of all RCs in the Sound Archive and Litcorp is shown in
Table 1 listed by relativizer (these results are shown at this stage with no differentiation
between e.g. antecedent type, restrictiveness, etc.). The tables display raw frequency
results as some values are too low to normalise, e.g. to frequencies per 100,000. The
Results from the corpora show that the relativizer that is the most frequent in
both corpora. This supports findings that suggest that in spoken discourse that-
such, the Dialect Literature is considered to reflect spoken style in text). The results
for that found in the Litcorp make up a larger proportion of the total relativization
strategies than in Sound Archive, and therefore also agree with the assertion that older
texts may display a higher frequency of the older that-relativization pattern (e.g.
Mustanoja 1960). This therefore suggests that there has been an element of diachronic
change with respect to that relativization, although conclusive results are not possible
at this stage.
From the results shown in Table 1, the preference for RCs in Lancashire is as
shown below along with arrows depicting any change (relativizers with a share of less
Sound Archive that > which > what > who > Ø
FIGURE 1. MOST FREQUENT RELATIVIZERS IN THE LANCASHIRE CORPORA
Previous dialect studies have suggested that wh-relativization has made inroads
into regional dialects (e.g. Herrmann 2005:28). This does not seem to be the case in
Lancashire where in fact the only change in relativization overall seems to be the
increase of what. This increase in what relativization has been outlined (e.g. by
generally. It could well be the case that this reported trend in general English has
indeed influenced the frequency of this variable in Lancashire, thus explaining the
elsewhere in the literature suggesting that the relativizer what is associated with the
Lancashire region to some degree (e.g. Shorrocks 1999:101). This would mean that
perhaps this distribution does not represent a change, but instead corroborates the
assertion of Shorrocks that what is a feature often found in Lancashire. Overall, the
Alongside this, there are of course differences between the two corpora to consider;
A number of the results displayed a very low frequency. In the Sound Archive
only 6 instances of whom were found. Only 4 out of 32 speakers produced this
relativizer, 2 results being found within the same sentence, as shown in (30).
(30) Incidentally, speaking about Hortner, there was a boy staying there in those
days [whom I met with] and with whom [I spent some considerable time] and
we became great friends. (Sound Archive)
Along with the infrequent whom, instances of as were also very infrequent in both
(31) At th’ Sunday Skoo [as I went to] th’ dobby [as cleond th’ skoo an’ kept us
i’ order wi’ a cane while th’ skoo oppent] were named Skinner, Ham Skinner.
He were a lung, thin chap, an’ hi wife, wot helped him, were very fat, an’
puffed a lot when hoo were warkin. (Litcorp)
ZRs appear with perhaps a lower frequency than expected in the corpus data;
this may be due to a number of factors. In Litcorp it may be the case that writers
dialect writing to be a representation of the dialect in written form, and therefore aims
to have salient features; a zero form is perhaps less noticeable, or salient, than a non-
standard form. A similar trend, which could also be accounted for by this hypothesis,
It was necessary to estimate the extent to which low frequencies of ZRs might
be due to the retrieval method employed here, and the extent to which this is a true
analysis of 5 speakers from Sound Archive and an equal portion of Litcorp data is
employed here, in order to test this assertion. Results from this close analysis of ZRs
found in the two sample texts can then be compared against the results obtained by
corpus retrieval methods in order to extrapolate to a margin of error. Results from this
Litcorp sample    13    2.8 / 10,000    2.0 / 10,000    40
TABLE 2. FREQUENCY OF ZERO RELATIVES IN A 45,000 WORD SAMPLE FROM EACH
CORPUS
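The normalisation of raw frequencies to a rate per 10,000 words, as used in Table 2, can be sketched as follows. The function name is my own, and the 45,000-word figure is the nominal sample size given in the table caption, so a rate computed from it may differ slightly in rounding from the value printed in the table.

```python
def per_n_words(raw_count, corpus_size, base=10_000):
    """Normalise a raw frequency to occurrences per `base` words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * base / corpus_size

# Nominal figures from Table 2: 13 zero relatives in a 45,000-word sample.
litcorp_sample_rate = per_n_words(13, 45_000)
```

The same function gives frequencies per 100,000 words by passing base=100_000, although very low raw counts remain unreliable however they are normalised.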
In the case study, a total of 32 ZRs were found. By working out the maximum number
of ZRs that may have been missed by using corpus methods, we can see that, although
a significant number of ZRs may have been omitted, this does not change the order of
the role of the factors influencing relativizer choice (set out in §2.2.3) in the
Lancashire data. Before this, the relativizer results from Litcorp can be explored
variation. Some of the results for Litcorp shown in Table 1 are reproduced in Table 3,
this time with the distribution across variant spellings with that and what.
                         Litcorp
Relativizer   Total   Spelling   Frequency
what          29      what       4 (13.8%)
                      wot        25 (86.2%)
that          560     that       163 (29.1%)
                      ut         397 (70.9%)
TABLE 3. VARIANT SPELLINGS OF WHAT AND THAT RELATIVIZERS IN LITCORP
There are 29 instances of what used as a relativizer in Litcorp. Interestingly, the
standard spelling of what is only used as a relativizer 4 times, e.g. (32), with each of
(32) There were only six heauses in a row in th’ Grove, an’ everywheer else there
were twenty or moor. An’ th’ folk [what lived in Hosburn Grove] thowt
summat o’ theirsels, th’ women specially. (Litcorp)
The remaining 25 results were found with the spelling wot, where each of the 25
(33) In a bit two farmers [wot lived at Marton] coom in, and they’d a collie dug
wi’ urn. They’d bin takkin’ cattle to Poulton, an wur on th’ road whoam
again.
Here, Litcorp writers use wot to mark the nonstandardness of this relativization
strategy; this respelling was not used to indicate any other function.
ut, and t are rated as phonemic/phonetic variants of that by me’ (2002:70). This
assertion appears to be less categorical in the Litcorp data where instances are found
that are ambiguous, or at least difficult to resolve. This is in part due to ut being used
Within the same example here, ut is used to mean at (as in, he became an
not sweep and relativizing that (as in, “anybody [that had a bit of insight]”). This is
12 The only exception to this was the use of wot as a semi-phonetic spelling for hot, e.g. he fotcht a red-wot
fire-potter eaut o’ th’ heause an’ flourished it like a sword (Litcorp), of which there are 8 instances.
problematic when looking at the data, where relativization with at is also found (albeit
infrequently) as in (35).
(35) It’ll be hard wi’ folk ut areno prepared for it. A blazin’ wot summer, an’ neaw
ice an’ snow, an’ a wynt [at shakes th’ heause]. Han th’ coals come?”
(Litcorp)
(36) “An’ win yo’ give us that pictur’ o’ yo’rs for Walmsley Fowt Bonfire” th’ lad
said. “Jim Thuston says it’s fit for nowt else.” “Does theau meean my
portrait?” “Aye, that [ut Jim Thuston says wur painted for a aleheause sign].”
That wur enoof for me. “Here,” aw said, “if theau artno’ away fro’ this dur in
abeaut five seconds, aw’ll send thee flyin’ o’er that garden, an’ witheaut
wings, too, theau yung jackanapes.” (Litcorp)
relative and as other parts of speech. Because ut occurs significantly more frequently
as that, writers who use ut are more likely to intend that than
any other possible meaning. Only instances that were possible to disambiguate were
                              Raw          Percentage of
                              frequency    all uses of ut
that   relative that          397          (41.5%)
       demonstrative that     46           (4.8%)
       conjunction that       465          (48.6%)
       adverb that            19           (2.0%)
at     at                     6            (0.6%)
it     pronoun it             24           (2.5%)
TOTAL                         957
TABLE 4. ANALYSIS OF UT RESULTS IN LITCORP
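The percentages in Table 4 follow directly from the raw frequencies, as the short sketch below illustrates. The counts are those reported in the table; the function names are my own, and the combined figure for all readings of that is a derived value offered only to illustrate the predominance of that among the uses of ut.

```python
# Raw frequencies of each reading of "ut", as reported in Table 4.
UT_COUNTS = {
    "relative that": 397,
    "demonstrative that": 46,
    "conjunction that": 465,
    "adverb that": 19,
    "at": 6,
    "pronoun it": 24,
}

def shares(counts):
    """Return each category's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

def combined_share(counts, labels):
    """Percentage of the total accounted for by the given categories together."""
    total = sum(counts.values())
    return round(100 * sum(counts[label] for label in labels) / total, 1)

ut_shares = shares(UT_COUNTS)
that_share = combined_share(
    UT_COUNTS,
    ["relative that", "demonstrative that", "conjunction that", "adverb that"],
)
```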
Due to the predominance of ut used to mean both relativizer and non-relativizer that,
that.
The restrictive and non-restrictive results for each relativizer in both corpora are
shown in Table 5. The relative percentage distribution between restrictive and non-
Non-restrictive relatives are infrequent in the Lancashire data. Huddleston and Pullum
(2002:183) suggest that non-restrictive uses of that are rare, but 58 examples of this
(37) I asked our James that worked there, and he said it were never reported or
anything (Sound Archive)
2.4.3 Corpus results: semantic category of antecedent
Perhaps unsurprisingly, the relativizers who, whose and whom are exclusively
found referring to person. The relativizer that prefers antecedents with personality but
is also found with antecedents such as the house that I lived in. In both corpora which
appears with both personality and non-personality antecedents. ZRs also prefer
personality rather than non-personality with what found the most evenly with all
antecedent types. Overall, aside from whom, whose and who, relativization in
particular. Participants were asked to assign scores from 1 to 5 to each test sentence,
with 1 being judged by them as the least acceptable and 5 as the most acceptable. A
full copy of the questionnaire can be found in Appendix B. A five point scale was
used in this test and the overall median results for all respondent groups are shown in
Context                                Score
existential there                      4 (3.8)
existential have                       3 (2.1)
it cleft                               3 (2.6)
main verb introducing                  3 (2.7)
NPs headed by free choice any          3 (3.0)
Modifier of that-phrase                2 (2.9)
TABLE 7. QUESTIONNAIRE RESULTS FOR ZERO RELATIVES
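A summary of five-point acceptability ratings of the kind shown in Table 7 can be sketched as follows. The ratings below are invented for illustration and are not questionnaire data; the sketch also assumes that the bracketed figures in the table are means, which the text does not state. The second function flags respondents who used only the end points of the scale, i.e. the yes/no-type response pattern.

```python
from statistics import mean, median

def summarise(ratings):
    """Return (median, mean rounded to one decimal) for a list of 1-5 ratings."""
    return median(ratings), round(mean(ratings), 1)

def extremes_only(respondent_ratings):
    """True if a respondent used only the end points of the scale (1 or 5)."""
    return all(r in (1, 5) for r in respondent_ratings)

# Hypothetical ratings for one test sentence from seven respondents.
example_ratings = [4, 5, 3, 4, 2, 4, 5]
example_summary = summarise(example_ratings)
```

The median is the more cautious summary for ordinal judgements of this kind, since it is less affected by respondents who avoid, or use only, the extreme scores.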
Existential there sentences such as “there’s a man down the street [Ø goes there
too]” were the most acceptable to Lancashire speakers. All test sentences were judged
analysis of the results suggests that often informants are reluctant to choose 1 or 5.
Conversely, there were also participants who only gave scores of 1 or 5, i.e. a yes/no-
type response. This aside, combined with the substantial corpus data, these results
The second part of the questionnaire required the participants to join two
sentences together with a relativizer of their choice. Both the raw frequency and the
percentage distribution for each sentence type are shown in Table 8. There were three
test sentences of each type, one of which is given in the table, for reference.
                                                                                  Relativizer
Type        Example                                                               that         what         who          which        Ø            whose
human       There’s a girl in the kitchen. She ate the last cake.                 61 (38.6%)   8 (5.1%)     87 (55.1%)   0 (0.0%)     2 (1.3%)     0 (0.0%)
collective  I went to the Council. They took my claim seriously.                  0 (0.0%)     0 (0.0%)     126 (79.7%)  32 (20.3%)   0 (0.0%)     0 (0.0%)
thing       I had a raincoat. It was blue with grey stripes                       136 (86.1%)  15 (9.5%)    0 (0.0%)     7 (4.4%)     0 (0.0%)     0 (0.0%)
animal      I saw a horse. It looked very cold.                                   80 (50.6%)   20 (12.7%)   31 (19.6%)   23 (14.6%)   4 (2.5%)     0 (0.0%)
subject     There is a woman. She went to the bank. She is “a woman...            52 (32.9%)   2 (1.3%)     102 (64.4%)  0 (0.0%)     2 (1.3%)     0 (0.0%)
Object      There is a woman. I saw her husband at the bank. She is “a woman...   0 (0.0%)     1 (0.6%)     0 (0.0%)     0 (0.0%)     0 (0.0%)     157 (99.4%)
TABLE 8. CHOICE OF RELATIVIZER BY QUESTIONNAIRE PARTICIPANTS.
with human and animal subjects, producing sentences like there was a dog went to the
vet. This suggests that (at least certain types of) ZRs are not unacceptable to
lack of corpus findings, suggests that these non-standard relativizers, which perhaps
were once found in Lancashire (as suggested by e.g. Herrmann, 2005), are now rare
for Lancashire speakers. Use of what appeared in the data, although relatively
husband I saw at the bank],” with what’s used instead of the Standard English choice
whose. It could be suggested that results such as this may be
unreliable due to the social values ascribed to the form in this context, i.e. it may have been a
tongue-in-cheek response (issues such as this are considered further in Chapter 5).
This appears to be the case here, where this particular speaker did not use what in any
other sentence in the task. This would suggest that further tests may be needed in
order to determine other factors (e.g. possible social values) associated with the
acceptability of sentences such as a woman what’s husband, as opposed to e.g. a
This analysis of RCs in Lancashire has revealed that relativization strategies in this
region differ from those found in Standard English. Herrmann (2005) suggested that,
constrained, and this seems to be the case in Lancashire. The corpus and questionnaire
results show that significant variation is found with relativization type, syntactic
the relativizer over antecedent. In particular, what relativization is both found in the
grammatical or both) then these respellings are considered significant. Litcorp results
showed that, in particular, the exclusive use of wot to indicate relativization and what
used in all other contexts indicates that this construction is recognisable to these
The concept of salience as put forward here is further outlined both at various points
The overall frequency of relativizers largely fits in with the Lancashire findings
undergone any significant change during the period that both Herrmann’s study and
this present investigation cover (1960s – 1990s). Despite this, there are a number of
key differences. In the second part of the survey, with animate human subject,
speakers in Lancashire most typically used subject wh- relatives e.g. there’s a girl in
the kitchen [who ate the last cake], but prefer that with non-human animate subjects
e.g. the sheep in the field [that jumped over the fence].
A number of dialect speakers (23 out of 158) used the perhaps more
nonstandard what e.g. the sheep in the field [what jumped over the fence]. This was
found, in particular, with restrictive clauses with animate (both human and non-
human) – although no comparison with SE speakers was made here, even to find what
used productively here contrasts with, e.g. Beal and Corrigan’s (2002) findings in
Newcastle and Sheffield. Modern speakers produced what in the sentence linking
exercise more frequently than perhaps would be expected. It is unclear if this is part of
Non-restrictive relatives are infrequent in the data. Quirk et al. (1985:1252)
suggest that that is rare as a non-restrictive relativizer and that zero is impossible. This
is not the case with the results outlined here, where a number of non-restrictive uses
were found.
With regard to the status of the typically “Lancashire” features (namely, as and
at), as outlined by Shorrocks (1999) and Herrmann (2002), this was not corroborated
found these were restricted to Litcorp only and even then were very infrequent.
Combined with the results from the corpora, this suggests that this variable is rare for
Lancashire speakers. The questionnaire results indicated that ZRs are most acceptable
in existential there sentences, but also that Lancashire speakers found all types of zero
relatives to be acceptable. A better approach may be to start from a semantic position,
accounts of grammar start from a syntactic point of view this was a rational position to
take. Contrast e.g. I’ve got a lawn wants cutting with I’ve got a lawn wanting cutting.
An analysis with the emphasis placed more clearly on semantic proposition rather than
the more syntactic starting point taken here may give a clearer picture about the
interplay between related constructions and therefore allow us to draw more precise
conclusions about language variation and change. Building from this assertion, this
Chapter 3. HAVEn’t to
3.1 Introduction
This chapter focuses on the syntax, semantics and frequency of the HAVEn’t to
describe its use and outline the way in which grammaticalization may have played a
when forming the negative (Quirk et al., 1985:138; Ellegård 1953:154). However, data
from the Sound Archive and Litcorp used in this study suggests that this is not
necessarily the case for Lancashire dialect speakers, see e.g. (1) compared to (2) (see
(1) No no you haven’t to change or anything, come just as you are. (Sound
Archive)
(2) You know, like you get first class stamps, you don't have to lick ’em do you,
you just stick em on, that's progress in't it? (Sound Archive)
While core modal verbs such as COULD and MIGHT do not need DO-support,
newer semi-modals such as DARE to and USED to, and indeed HAVE to, generally do,
although as shown above, for the latter this is not always the case amongst Lancashire
dialect speakers. The comparison of this construction in the Sound Archive and
Litcorp allows tentative assumptions to be made about how the Lancashire dialect
may have changed over time. In addition, this study considers other semantically
13 It should be noted that while often HAVEn’t to is written throughout this chapter, the intended
meaning is HAVE Neg to Inf (i.e. all negated forms).
(3) It was very popular at one time, it mustn’t have been popular enough.
(Sound Archive)
(5) Platt, my lad, tha needn’t goo a step further’ this is the lass for thee. (Litcorp)
they are used within both the Litcorp and Sound Archive data. It may also be possible
to suggest reasons why changes in frequency may have occurred, and relate this to
linguistic theory more generally. The Lancashire data for these constructions will be
compared, in places, with data from the BNC, and also with other studies of changes
to modal verbs in English, such as Leech (2003) and Biber et al. (1999). It is also
Biber et al. (1999:485) state that English has nine modal verbs: can, could, may,
might, shall, should, will, would, and must. Semantically, they suggest that modality
distinction down further (e.g. Van der Auwera & Plungian (1998:52) distinguish
permission, and probability), this is not necessary for the remit of this study, and so
Morphosyntactically, in Standard English modal constructions have no non-
finite forms - most have present and past with no person-number marking in the third
person singular present, and most have irregularity in the past form, e.g. can/could,
may/might. Modals are complemented by a bare infinitive and follow the NICE
semi-modals. Semi-modals include constructions such as DARE to, NEED to, OUGHT to,
HAVE to, and USED to. These constructions fulfil a similar semantic function to modals
from core modal verbs in terms of syntax. Semi-modal constructions conform to the
NICE properties to differing degrees. This syntactic difference is used later in this
(Quirk et al., 1985: 137, Krug, 1996:43), non-modal auxiliaries (Warner, 1993:3) and
quasi-modal (Coates, 1983: 52; Perkins, 1983: 65; Leech, 1987: 73). While the choice
of name is debated, the existence of such a category is not, and in this study I follow
Quirk et al., similar to Biber et al., describe modal verbs (and semi-modals) as
describing the semantics, Quirk et al. take a more formal approach, concentrating
much of their analysis on the syntactic properties of modal verbs – the so called NICE
demonstrated in (7-9), with core modal verbs (a) and lexical verbs (b).
Negation - The test for negation suggests that modal constructions are able to form
negative constructions by using the particle not or the contracted form –n’t. This is not
Inversion - Inversion of the subject and operator is typical for modal verbs in a range
can be reduced. As demonstrated by the examples below, this is not possible for
lexical verbs.
(8a) You'll see for yourself. (BNC – A0D 2587) You will see for yourself.
(9b) *If anyone keeps spoiling the dinner, John keeps [spoiling it].
both lexical verbs and many semi-modals require DO-support as shown in the
examples below.
(10) He ran away. He didn’t run away.
This comparison shows that these criteria may be used to determine whether or
not a verb can be considered syntactically modal or not (or to what degree), and
therefore are a useful set of tests that will be utilized later in this study.
Much of the literature suggests that the difference between modals and semi-modals is
predominately one of syntax, with both modals and semi-modals sharing a similar
lack of definiteness’ (Warner, 1993:13). In order to show how these semi-modals are
syntactically different to the core modals, it is useful to look again at the NICE
This table shows that, on the whole, semi-modals tend not to conform to all of
the NICE properties. However, it should also be noted that the class definition
between semi- and core modals is not binary. In fact, from the table it is clear that the
boundaries are instead blurred, and can be said to form a continuum of modality with
some semi-modals being closer to or further away from the core modals. For example,
not all semi-modals are unable to take negation (a focus of this study), e.g. ?DAREn’t,
?OUGHTn’t to, *USEDn’t to. This idea is also put forward by Quirk et al. (1985: 137)
who suggest that there is an auxiliary verb > main verb scale; this is represented in
Figure 1.
(one verb phrase) (a) central modals: can, could, may, might, shall,
should, will
(two verb phrases) (f) main verb + hope to + Inf, begin + -ing participle
nonfinite clause:
FIGURE 1. THE AUXILIARY VERB/MAIN VERB SCALE (ADAPTED FROM QUIRK ET AL.
1985:137)
By looking at the Lancashire data for HAVEn’t to, (shown in Figure 1 only in
the positive form (HAVE to) as a semi-auxiliary), this study aims to find out how close
to, or far from, a modal function HAVEn’t to is for Lancashire speakers (such as that of
central modals represented in Figure 1), and how this compares to its use in Standard
English. This study also goes some way to considering if, and perhaps why, modal
verbs may have moved along this scale by using diachronic data (see §3.4.4).
English, with much of the research (e.g. Leech, 2003; Biber et al., 1999) looking at
changes in British English in contrast with American English. Only Krug (1996) pays
ARCHER, BNC, Frown, Brown, FLOB and LOB. Results for this study suggest that
both HAVEn’t to and HAVEn’t got to are very rare in current Standard English, with all
corpora returning very low frequencies for these constructions. Krug suggests that the
from unproductive auxiliary negation (HAVEn’t to) to DO negation (DOn’t have to).
Krug suggests that the occurrences of HAVEn’t to within the data are found in the
language of older speakers who exhibit the retention of an obsolescent structure, and
that there is a regional tendency for these speakers to come from the north of England.
synonymous with the core modal MUSTn’t, meaning not supposed to / not allowed to,
whereas DOn’t HAVE to contrasts with this, meaning ‘not obliged to’. This contrast,
and the semantics of HAVEn’t to, DOn’t have to and other related constructions is
Leech suggests that changes to modal verbs are related to more general
generalization, and colloquialization (Leech, 2003:236). Corpus data from the Brown
family of corpora indicates that modals marking necessity are more frequent in British
particular those with periphrastic DO, are becoming more frequent in British English.
However, it is unclear strictly how this rise in frequency of modals marking necessity,
Nonetheless, this trend in Standard English is tested against the Lancashire data in
§3.4.4.
English, drawing largely the same conclusions about general change amongst the two
varieties. Unlike Leech, Biber et al. take a closer look at genres, suggesting that
conversation in Standard English, with the other semi-modals being mainly restricted
The most common semantic categorisation of HAVE to suggests that its meaning is
close to that of MUST (see e.g. Quirk et al., 1985:145; Perkins, 1983:65; Coates,
the classification of HAVE to as a semi-modal and MUST as a core modal. Along with
this, semantic differences are also present. One of these is objectivity; HAVE to
speaker. For example, as shown in the invented examples below, (13) involves an
obligation to the speaker, and (14) represents some kind of outside authority or
internal drive.
However, the exact semantic role of HAVEn’t to with respect to other modal
and semi-modal verbs is disputed. Going back to Figure 1, the exact placement of
HAVE to (and also HAVEn’t to) on a scale such as this is contested. Visser (1969:1478)
suggests that to all intents and purposes HAVE to is a modal auxiliary, Krug (1996)
suggests that it is located at around the mid-point between full auxiliary and full verb,
and Coates (1983) suggests that it fulfils none of the defining characteristics of modal
auxiliaries. While this study considers HAVEn’t to rather than HAVE to, it goes some
way to providing semantic, syntactic and diachronic evidence for where on a scale
HAVE TO as it is clear in the data that the HAVEn’t to construction displays two distinct
meanings and that the difference is related to obligation. It is not simply a negated
form of HAVE to (see also §3.4.6). I follow Bybee et al. (1994:186) in distinguishing
between weak and strong constructions can be seen in the examples (15) and (16),
respectively.
(15) No no you haven’t to change or anything, come just as you are. (Sound
Archive)
(16) Even what happened, you hadn’t to talk, you had to lie still and be quiet.
(Sound Archive)
For Lancashire speakers, HAVEn’t to does not always semantically correspond to the
Standard English negative of HAVE to, i.e. DOn’t HAVE to, which displays only
obligation in the weaker sense. This would therefore suggest that, for Lancashire
speakers, HAVE to and HAVEn’t to are distinct (but related) constructions and they are
(1989:112), and Hundt (1997:143) unanimously agree that under negation and in
found in Standard English corpora such as the BNC, these instances are comparatively
rare (see §4.2), here, this construction is widely regarded as belonging to ‘a formal
literary style’ (Carter & McCarthy, 2006:244), or an ‘archaic and largely obsolete
One of the main focuses of this study is how the HAVEn’t to construction in Lancashire
dialect may have changed over time. In relation to this, it is useful to look first at how
There is no full account of the negative HAVE to in the literature, with many
discussions featuring the negative construction only very briefly. For this reason this
section examines scholarship on the HAVE to construction, but pays special attention to
how the negative may be formed. A further section deals with the rise of periphrastic
DO.
study by Fischer et al. (2000:293). Here it is suggested that the HAVE to construction,
change, moving from a more lexical to a grammatical function (Hopper & Traugott,
2003). As a result of high frequency, these constructions move away from their lexical
(Bybee 2006). In this case, the full lexical Old English verb HABBAN indicating
possession changes over time to the auxiliary or semi-modal HAVE to expressing duty
or obligation (the original lexical use continues to co-exist with the now
Fischer et al. suggest that HAVE to changes from a full lexical verb to a semi-
modal because of a more general change in English word order (of OV to VO) from
HAVE + object + Inf (17), to HAVE + Inf + object (18), inviting the ‘bracketing’ of
Lehmann (1995:34) suggests that this gradual word order change over time
English. The exact route of grammaticalization of HAVE to, and the number of distinct
stages that are involved in this change are contested in the literature (see e.g. Visser,
1963-73:1477; Brinton, 1991:12; Fischer et al., 2000:301). However, all sources agree
analysis of the origin of HAVEn’t to as outlined for HAVE to. Denison suggests that the
HAVE to of obligation (the focus of this study) rarely conforms to the NICE properties,
thus demonstrating that in current Standard English, HAVEn’t to does not syntactically
behave like a modal verb. Denison states that HAVEn’t to is present in some northern
dialects. Alongside this is the suggestion that HAVEn’t to is also a newer development
from a lexical verb in Old English to a semi-modal in current Standard English. While
the negative HAVE to was considered only minimally in Denison’s study, this study
focuses on its change in the Lancashire dialect data, with respect to that of a possible
3.2.6 The rise of periphrastic DO
constructions and core modal verbs is the requirement of DO-support in order to fulfil
three of the four NICE properties, those of negation (19), inversion (20) and ellipsis
(21).
Here it is shown that DO-support is usually needed by semi-modals, but not by core
modals. As the NICE properties are used later in this study as a measure of degree of
modal auxiliaryhood (§3.5.6), it is necessary to examine the history of DO, and the rise
In Standard English, along with negation, inversion and ellipsis, DO may also
be used in a number of other ways, e.g. as a verbal noun, or for pragmatic emphasis.
However, these constructions are not found in the syntax of modal constructions
striking features of Present Day English when compared to older stages of English,
and indeed to many other European languages. It is suggested that operator (or
auxiliary) DO changed from the Old English DON to the modern English operator due
to reasons of dialect, register and style (although Denison treats with some caution
the suggestion made by Ellegård (1953) that this change took place initially in poetic
language). Ellegård (1953) suggests that the lexical verb DO in early intransitive
use meant something like act. The typical transitive use was something more like
perform or accomplish or also put or place. Until Middle English, DO could also be
causative, displaying both VOSI and V+I patterns. Fischer and Nänny (2001) state
that the constructional polysemy of DO was already present at this stage, with DO
imperatives – many uses that DO still displays today. Ellegård (1953) suggests that
periphrastic DO came from changes to the causative VOSI word order. Similar to the
changes in the HAVE to construction, changes in the word order have led to the
reanalysis of periphrastic DO. The construction (DO + NP + Inf) is lost over time,
to the decline of finite lexical verbs is discussed in §3.2.9). DO was then later re-
analysed as an auxiliary.
§3.5 of this study looks at a number of other constructions found in the data that fulfil
(22) And I said, “you know Vera you shouldn't go with him, you know what he's
after, you know, you shouldn't go with him.” (Sound Archive)
(23) And then she'd get out of bed and go to the toilet and I said, “Margaret, you
mustn't,” I thought she'd collapse. (Sound Archive)
(24) And then er one fella said “you don't need to come on yer bike now love
we've got a van a van coming”. (Sound Archive)
(25) They're much easier this way round because you haven't got to go through
the minor at all to reach them. (BNC)
appropriate to suggest that they belong to a construction family. That is, they have a
similar meaning, are used in similar circumstances and so are likely to be related to
one another cognitively. I follow work by Goldberg and Jackendoff (2004), which
suggests that a number of constructions can form a closely related group or family. In
this study, the term construction family is therefore used to refer to those constructions
that show a similar semantic and syntactic distribution, but may be different in some
concept of construction families and rarely examine more than one or two linguistic
variables. Kroch (1989) tends to focus on cases involving only two competing
constructions such as the diachronic decrease in lexical verbs and the simultaneous
increase of DO-support. I would suggest that this viewpoint is somewhat idealized, and
the neat replacement of one construction with another is often unlikely. Approaches
such as Krug’s can be plotted graphically showing the increase in one construction
correlating with the simultaneous decrease in another; the so-called S-curve as shown
I would suggest that a distribution such as that shown in Figure 2 is rare. Often it is
not the case that one construction has only one direct correlate, and it is unlikely that
sociolinguistic salience are precisely the same for each opposing construction, thus
producing a distribution similar to that shown in Figure 2. (See Chapter 5 for a further
Instead, often one construction can have many possible matching constructions that
convey a similar meaning, and so any analysis of language change should consider
this concept more broadly. This is the approach adopted here, where corpus results for
modal constructions of obligation more generally are outlined alongside results for
HAVEn’t to
Variation from Standard English found in modal verbs has been the subject of a
number of studies into regional varieties both in the UK (see e.g. Beal, 1993; Miller,
1993; Trousdale, 2003; Brown, 1991) and elsewhere (see e.g. D’Arcy and
Tagliamonte, 2010; Mishoe, 1994; Labov et al., 1972). This indicates that this
this feature shows variation in Lancashire. These studies report on differences in both
meaning and form including simplification (e.g. Trousdale 2003), and double modal
constructions (e.g. Labov et al. 1968, Mishoe 1994). Tagliamonte and Smith (2006)
detail changes to deontic MUST, HAVE to and HAVE got to in the UK and Northern
Ireland. They conclude that MUST is obsolescent and that HAVE to is being used in
contexts traditionally encoded by MUST, with HAVE got to specializing for indefinite
found here and no difference is made between the obligation types with the same one
3.2.9 Summary
syntactically they conform to the NICE properties, as set out in §3.2.2. The HAVEn’t to
Semi-modals differ from core modal verbs most strongly in syntax, by taking
showed that the distinction between modals and semi-modals is not binary, with
different constructions being judged as more or less ‘modal’. This study examines
where HAVEn’t to and related constructions occur on this scale by analysing their
Many studies that model diachronic change examine only two competing
variants, e.g. Kroch (1989). The advantage of this approach is that results can be
plotted showing the increase in one pattern correlating to the simultaneous decrease in
another; the so-called S-curve as seen earlier in Figure 2. It is suggested here that for
many cases, this approach neglects to take into account all of the variants, and as a
result returns highly idealized results. In contrast to this narrow approach, all
constructions related to HAVEn’t to are searched for in the corpus data. Results are
shown in §3.5.2.
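The two-variant idealization described above is commonly modelled with a logistic function. The sketch below is purely illustrative: the function name, the midpoint and the slope are hypothetical values chosen for the example, not estimates from any corpus used in this study. It shows why a two-variant model looks so neat: the shares of the incoming and outgoing variants are forced to sum to one at every point in time.

```python
import math


def logistic_share(t: float, midpoint: float = 0.0, slope: float = 1.0) -> float:
    """Share of the incoming variant at time t under a logistic (S-curve) model.

    midpoint and slope are illustrative assumptions, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(-slope * (t - midpoint)))


# Arbitrary time steps centred on the midpoint of the change.
timeline = [-4, -2, 0, 2, 4]

# In a strictly two-variant model the outgoing variant is simply the complement.
incoming = [logistic_share(t) for t in timeline]
outgoing = [1.0 - p for p in incoming]

for t, new, old in zip(timeline, incoming, outgoing):
    print(f"t={t:+d}  incoming={new:.3f}  outgoing={old:.3f}")
```

Once three or more constructions can fill the same semantic role, this complement relation no longer holds, which is the limitation raised in the text.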
Alongside this, most studies on modals and semi-modals do not take into
account any nonstandard British dialect data (although studies such as those by Beal,
1993; Miller, 1993; and Trousdale, 2003 have examined aspects of modality in
various regional dialects). This widespread neglect of dialect data combined with the
brevity of analyses relating to negative forms of semi-modals and the narrow focus of
diachronic studies of competing variants leaves a number of research questions that I
There are a number of research hypotheses which arise from the issues discussed in
the literature review (and from the preliminary examination of the data). The
undergoes a change in form and function (in this case, for example, the adoption of
some of the NICE properties mentioned earlier). Because HAVEn’t to shows a NICE
property (namely resistance to negation with periphrastic DO), it seems likely that this
This arises from the assumption that the increase in frequency of one word or
construction can bring about the fall of another nearly synonymous word/construction.
Kroch (1989) shows how competing constructions may interact in this way.
frequency, some other construction(s) having the same or similar meanings will show
a fall in frequency when the (relatively recent) Sound Archive and (older) Litcorp data
are compared.
3.3 Methodology
3.3.1 Introduction
This section details the procedures used in order to gather the data from which my
conclusions are drawn. Data is taken, initially, from the Sound Archive corpus. This
corpus is compared to Litcorp, a dialect literature corpus taken from an earlier time
period in order to establish a diachronic perspective within the data (see §1.3 for
results from these corpora, conclusions may be drawn about the nature of the HAVEn’t
to construction for Lancashire dialect speakers. In §3.4.4, again these corpora are used
in order to look at semantically related constructions. The BNC is also used, in places,
as a control, for frequency comparisons between Standard English and the Lancashire
In the initial searches for the HAVE to construction, all forms of the verb were searched
for, e.g. haven’t to, hasn’t to, hadn’t to. Also, both the contracted negative form –n’t
and the full negative not were looked for, along with constructions with DO, e.g. did
not have to, don’t have to etc. Further searches were carried out in order to find
permission. Biber et al. (1999:486) suggest that this group includes MUST, SHOULD
better, HAD better, HAVE to, HAVE got to, NEED to, OUGHT to, and BE supposed to. As
these constructions are compared to the data for HAVEn’t to, the negative forms,
mustn’t, shouldn’t etc., are searched for in the Lancashire corpus data. A number of
other constructions additional to those from Biber et al. (namely BEn’t to + Inf; BEn’t
obliged to + Inf; BEn’t allowed to + Inf) are taken from the literature (Quirk et al.,
1985:139; Huddleston and Pullum, 2002:361) and are searched for alongside the list
from Biber et al. As with all previous searches, all verb forms and contractions are
included e.g. don’t need to includes don’t need to, do not need to, did not need to,
does not need to etc. In all data, patterns that look similar on the surface but actually
exemplify other constructions, such as (26), are discounted from any results.
(26) Erm but it hasn't to the best of my knowledge, it has not resulted in a rash of
of developments and motorway intersections. (BNC - KM7 562)
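The surface searches described above can be approximated with regular expressions. The patterns and function names below are a hypothetical sketch, not the queries actually run on the corpora. They assume plain-text transcripts in standard orthography, so dialect spellings in Litcorp would need further patterns. The second test string is taken from example (26) and shows why every hit still has to be checked by hand.

```python
import re

# Negative HAVE to without DO: haven't to, hasn't to, hadn't to, have not to, etc.
# Both straight and curly apostrophes are accepted.
NEG_HAVE_TO = re.compile(r"\b(?:have|has|had)(?:n['’]t| not) to\b", re.IGNORECASE)

# Negative HAVE to with DO-support: don't have to, does not have to, didn't have to, etc.
DO_NEG_HAVE_TO = re.compile(r"\b(?:do|does|did)(?:n['’]t| not) have to\b", re.IGNORECASE)


def find_negative_have_to(text: str) -> list[str]:
    """Return every surface match of negative HAVE to formed without DO."""
    return [m.group(0) for m in NEG_HAVE_TO.finditer(text)]


def find_do_negation(text: str) -> list[str]:
    """Return every surface match of negative HAVE to formed with DO."""
    return [m.group(0) for m in DO_NEG_HAVE_TO.finditer(text)]


# A genuine instance of the construction.
print(find_negative_have_to("Even what happened, you hadn't to talk"))

# A surface look-alike, as in example (26), that must be discounted manually.
print(find_negative_have_to("Erm but it hasn't to the best of my knowledge"))
```

A pattern of this kind only retrieves candidates. Separating the obligation construction from look-alikes such as (26) remains a manual step.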
Before examining the Lancashire data, and also for purposes of a comparative
analysis, it is useful to examine semi-modal data from Standard English. For this
These constructions were searched for both with and without DO, in order to show
which of these constructions are most frequent in Standard English. As with all data
presented in this study, while only one form is used in the table headings, all possible
verb forms, along with both the contracted and full forms of the negative particle, are
included in the results. These results are presented together for reasons of clarity.
As we have already seen, the literature suggests that modal verbs occur in a
range of syntactic positions. Negation is possible without DO-support for modal verbs,
The data in Table 2 shows the raw frequency results for a selection of modals
found in the whole of the BNC (both written and spoken). This data is included here
with DO                          without DO
DOn’t HAVE to    2106 (99%)      HAVEn’t to    28 (1%)
DOn’t NEED to    839 (98%)       NEEDn’t to    14 (2%)
DOn’t DARE to    8 (100%)        DAREn’t to    0 (0%)
DOn’t USED to    24 (73%)        USEDn’t to    9 (27%)
TABLE 2. NEGATIVE FORMS OF SEMI-MODALS IN THE BNC WITH AND WITHOUT DO (RAW FREQUENCY RESULTS)
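The percentages in Table 2 can be recomputed from the raw counts as each variant's share of the combined total. The sketch below is a minimal illustration using the counts from the table. The function and variable names are hypothetical and are not taken from the study.

```python
def share_with_do(with_do: int, without_do: int) -> float:
    """Percentage of negative tokens that are formed with DO-support."""
    total = with_do + without_do
    if total == 0:
        raise ValueError("no tokens to compare")
    return 100 * with_do / total


# Raw BNC counts from Table 2 as (with DO, without DO).
bnc_counts = {
    "HAVE to": (2106, 28),
    "NEED to": (839, 14),
    "DARE to": (8, 0),
    "USED to": (24, 9),
}

# Rounded to whole percentages, as in the table.
shares = {verb: round(share_with_do(w, wo)) for verb, (w, wo) in bnc_counts.items()}

for verb, pct in shares.items():
    print(f"{verb}: {pct}% with DO, {100 - pct}% without DO")
```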
semi-modals when forming the negative. The data suggests that in Standard English,
while it is possible for HAVE to to occur in the negative without DO, this is extremely
rare, compared with the number of occurrences with DO (28 vs. 2106 cases). The
HAVEn’t to results in the BNC are discussed further in §3.4.2. These data for HAVEn’t
to may now be compared and contrasted to the results from the Lancashire corpora.
Sound Archive is approximately 300,000), their results have been normalized to show
frequencies per 100,000 words; this is shown in Table 3. For reasons of comparison,
                 HAVEn’t to                            DOn’t HAVE to
                 raw frequency   per 100,000 words     raw frequency   per 100,000 words
Litcorp          4               0.80 (80.0%)          1               0.02 (20.0%)
Sound Archive    14              4.67 (42.4%)          19              6.34 (57.6%)
BNC              18              0.02 (0.7%)           2,578           2.2 (99.3%)
TABLE 3. INSTANCES OF FORMS OF THE HAVEN’T TO CONSTRUCTION IN LANCASHIRE DIALECT DATA.
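The normalization behind the per-100,000-word figures can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study. The function name is hypothetical, and the corpus size is the approximate figure of 300,000 words given for the Sound Archive, so the output can differ from the table in the last decimal place.

```python
def per_100k(raw_count: int, corpus_size: int) -> float:
    """Normalize a raw frequency to occurrences per 100,000 words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * 100_000


# Approximate corpus size (an assumption based on the rounded figure in the text).
SOUND_ARCHIVE_WORDS = 300_000

# Raw Sound Archive counts from Table 3.
havent_to = per_100k(14, SOUND_ARCHIVE_WORDS)
dont_have_to = per_100k(19, SOUND_ARCHIVE_WORDS)

print(f"HAVEn't to:    {havent_to:.2f} per 100,000 words")
print(f"DOn't have to: {dont_have_to:.2f} per 100,000 words")
```

Normalizing in this way is what allows corpora of very different sizes to be compared, although it does not remove the differences in genre between them.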
The data show that both HAVEn’t to and DOn’t have to are more frequent in the more
modern Sound Archive than in the older Litcorp. In some respects these frequency
must be taken when comparing such different corpora: Sound Archive is a spoken
corpus, Litcorp is written and the BNC is mixed. This difference in genre, together with
the relatively low frequency of this construction overall, means that only tentative
possibly explain the increase in both HAVEn’t to and DOn’t have to in the data.
However, the Standard English data from the BNC shows that this is not the case for
the HAVEn’t to construction; in fact, the BNC displays a frequency of only 0.02
occurrences per 100,000 words. It could be suggested that, like speakers of Standard
are not the same constructions that undergo change in Standard English. This point is
Another possible reason for the perceived difference in the two corpora could
be salience. Kerswill & Williams (2002) suggest that salient constructions are those
which are recognised by speakers as a feature of a certain dialect, speaker or region.
As Litcorp is not a record or transcription of speakers at that time, but rather of the
corpus of the most salient or important dialectal features as judged by the writer (see
Chapter 5 for a further discussion of this). This means that the low frequency of both
HAVEn’t to and DOn’t have to within the Litcorp data could be because these are not
writers (i.e. HAVEn’t to may not be salient). It may also be that some other
construction is used in Litcorp to indicate obligation, e.g. weren’t to, or aren’t to. This
displays properties closer to that of a core modal verb. A more detailed analysis of the
semantic and syntactic features of these results is discussed in the following sections,
In the data, every occurrence of the HAVEn’t to construction occurs in the same
(28) Int’ neet he went, an’ th’ aggravation uv it were he hadn’t to feight for her.
But I towd her he could feight, for her un win. (Litcorp)
(29) Even what happened, you hadn’t to talk, you had to lie still and be quiet.
(Sound Archive)
As shown in the examples above, hadn’t to occurs with dynamic verbs such as hit,
talk and go. The literature (e.g. Fischer et al., 2000:301) suggested that HAVEn’t to
(and HAVE to) first occurred with the word order hadn’t + obj + Inf, changing later,
after reanalysis, to hadn’t to + Inf. The whole of the Lancashire data provides only
(30) He hadn’t mich to do as we never played fro music. We did at first […]
(Litcorp)
This suggests that this older form is now largely obsolete for Lancashire speakers.
Further syntactic analyses comparing HAVEn’t to with other related constructions are
offered in §3.5.6.
A closer analysis of the data suggests that HAVEn’t to displays two different meanings
for Lancashire speakers, and that the difference is related to obligation. In terms of
Bybee et al.’s distinction between strong or weak obligation (1994:186), example (31)
(31) No no you haven’t to change or anything, come just as you are (Sound
Archive)
Here, the referent of you is not very strongly obligated to do something. The
meaning of HAVEn’t to, when used in this way, is more semantically similar to other
semi-modals such as NEEDn’t, DOn’t need to or DOn’t have to. This suggestion is
(32). Here, the meaning of HAVEn’t to is closer to the core modal must.
(32) Even what happened, you hadn’t to talk, you had to lie still and be quiet.
(Sound Archive)
The difference in meaning between the strong and weak constructions can be
further demonstrated by looking at DO. The Standard English DOn’t have to is similar
in meaning to HAVEn’t to when it displays weak obligation in the Lancashire data. This
can be seen in example (33). Negative HAVE to constructions with DO are not
(33) It’s easy here, I hadn’t to / don’t have to get me car out on a Sunday, it’s
lovely to walk down the path (Sound Archive)
(34) You had to move your arms as well, but you hadn’t to / *didn’t have to
move your head, you’d to keep laid flat. (Sound Archive)
and (36).
(36) You haven’t to go to the shop (I’ve got enough food in the cupboard)
In order to examine distribution of weak and strong meanings, the frequency of these
                 Obligation type
                 Weak        Strong
Litcorp          2 (50%)     2 (50%)
Sound Archive    3 (21%)     11 (79%)
TABLE 4. DIFFERENCE IN OBLIGATION TYPE IN HAVEN’T TO CONSTRUCTIONS IN LANCASHIRE DIALECT (RAW FREQUENCY RESULTS)
This data shows that the two different meanings, as outlined in (37-38), are
possible in Lancashire, with both corpora returning results for both variants. While the
Litcorp data is certainly too sparse to support firm conclusions and the Sound
Archive also does not return a huge number of results, a closer look at the Sound
Archive data shows that the three instances of weak HAVEn’t to are produced by three
different speakers. This shows that this difference cannot be explained simply as part
of a particular speaker’s own language use. Out of these three speakers, two use both
the strong and weak varieties within their interview, suggesting that HAVEn’t to has
two different meanings, at least for these speakers. The examples below show the
(37) Well I don’t know who it were what must have been Mayor what came up or
something you know. ‘Cos you hadn’t to pay or anything you know there
were always plenty of collections you know if you wanted to collect or give
anything. (Sound Archive)
(38) Erm well you hadn’t to have any dirty shoes on. Well they were very poor
and people hadn’t er clogs or anything, you had to have a clog fund to buy
these clogs. (Sound Archive)
This constructional difference was also tested on results for HAVEn’t to in the
BNC in order to see if this difference in meaning is also present in the Standard
English data. As shown in Table 3 earlier, the results for the same search returned
only 18 instances of this construction, of which the majority were found in the
regional varieties). Most of these instances are recorded as ‘north’ but no further
details are given and so the exact location of these speakers cannot be ascertained. The
BNC results suggest that this feature is not frequent in Standard English (compare the
18 results for HAVEn’t to with e.g. the 2,578 results for the semantically similar DOn’t
have to). This aside, differences in obligation types can also be found in the BNC data
(39) She said I can't tell you, I haven't to tell you! (BNC - KB8 5178)
(40) Oh well er I asked Joyce and she said erm, he hasn't to go in, he's not bad
enough (BNC - KB2 2435)
However, we can conclude that this construction is not frequent in Standard English
and so any differences in obligation type shown here cannot be attributed.
Goldberg (1995:65) suggests that constructions, like words, can be polysemous; this
means that a single form has two or more meanings that are semantically related.
Often, one meaning is a historical extension of the other meaning(s). This fits in well
with many suggestions about the development of the HAVEn’t to constructions, and in
particular, with the variation in meaning within the same construction that is found in
the Lancashire data. For example, Krug (1996:56) suggests that the virtual absence of
not negation (for HAVEn’t to) in Present Day English points to a diachronic
It could be suggested that this diachronic change has led directly to the polysemous
While the data has shown that HAVEn’t to and the Standard English DOn’t have
to can be near synonyms (as in 41 and 42), HAVEn’t to can also be semantically similar
(41) You haven’t to sit over there [there’s plenty of room here]
(42) You don’t have to sit over there [there’s plenty of room here]
(43) You mustn’t sit over there [or you’ll get into trouble]
(44) You haven’t to sit over there [or you’ll get into trouble]
semantic category of obligation behave and to examine whether or not the data
suggests that related constructions have undergone a diachronic change, §3.5
3.5.1 Introduction
Lancashire dialect. This analysis now examines how the possible change in HAVEn’t to
may relate to changes in the frequency of other modal and semi-modal constructions
construction family. In this particular case, the function of this family is one of
necessity or obligation. With that in mind, the focus of this study now turns to
constructions that share a similar meaning, but differ in terms of structure. This multi-
constructional approach aims to provide a broader picture of diachronic change for the
see §3.3.2.
Although many search terms were included in this data analysis, for reasons of clarity,
only those search terms that returned results from the corpus data are included in the
tables of results. The data in Table 5 shows the normalized frequency results for the
obligation and permission family of constructions in both the Sound Archive and
Litcorp data.
                   Frequency per 100,000 words
                   Litcorp     Sound Archive
SHOULDn’t          12.8        6.4
MUSTn’t            1.0         2.4
NEEDn’t            1.8         0.0
DOn’t need to      0.2         1.0
HAVEn’t to         0.8         4.7
HAVEn’t got to     0           0.1
DOn’t have to      0.4         6.3
TABLE 5. HAVEN’T TO FAMILY OF CONSTRUCTIONS (NORMALIZED FREQUENCY RESULTS)
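As a reading aid for Table 5, the direction of change for each construction between the older Litcorp and the more recent Sound Archive can be derived mechanically. The sketch below is illustrative only: the names are hypothetical, and with frequencies this low a "rise" or "fall" is a descriptive label, not a test of statistical significance.

```python
# Normalized frequencies from Table 5 as (Litcorp, Sound Archive).
TABLE_5 = {
    "SHOULDn't": (12.8, 6.4),
    "MUSTn't": (1.0, 2.4),
    "NEEDn't": (1.8, 0.0),
    "DOn't need to": (0.2, 1.0),
    "HAVEn't to": (0.8, 4.7),
    "HAVEn't got to": (0.0, 0.1),
    "DOn't have to": (0.4, 6.3),
}


def direction(older: float, newer: float) -> str:
    """Label the change from the older corpus to the newer corpus."""
    if newer > older:
        return "rise"
    if newer < older:
        return "fall"
    return "no change"


trends = {name: direction(*freqs) for name, freqs in TABLE_5.items()}
falls = sorted(name for name, trend in trends.items() if trend == "fall")
rises = sorted(name for name, trend in trends.items() if trend == "rise")

print("fall:", falls)
print("rise:", rises)
```

Listing all seven constructions together, rather than one opposing pair, reflects the construction family approach taken in this section.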
The semantic and syntactic differences represented in this data, along with the
The comparison between the Litcorp and the Sound Archive is most clearly
represented in Figure 3. The graph shows the members of the obligation family as they
(may) vary over time between the older Litcorp and the more recent Sound Archive;
[Figure 3. Relative frequencies of the obligation family of constructions. Vertical axis: frequency per 100,000 words. Horizontal axis: SHOULDn’t, MUSTn’t, NEEDn’t, DOn’t NEED to, HAVEn’t to, HAVEn’t got to, DOn’t have to. Series: Litcorp, Sound Archive, BNC.]
This data shows that most constructions are more frequent in the more modern Sound
form and function as a result of high frequency (Hopper & Traugott 2003:44). These
results are insufficient to cite high frequency as having any involvement in possible
constructions displaying a fall. The data suggests that NEEDn’t displays a decrease in
frequency as compared with a possible increase in DOn’t need to, thus possibly
lending this hypothesis some support. The increase in both HAVEn’t to and DOn’t
have to contrasts well with this change in NEEDn’t vs. DOn’t need to. Quirk et al.
(1985:146) suggest that in both British and American English constructions formed
with periphrastic DO (DOn’t have to, DOn’t need to etc.) have increased over time,
while their counterparts (e.g. HAVEn’t to, NEEDn’t) have decreased. The data for both
forms of negative NEED supports this theory, while the HAVEn’t to data goes against
this trend. Unlike Standard English, HAVEn’t to remains frequent in the Lancashire
data.
One of the most striking results shown by the data is the decrease in frequency of
English (e.g. Quirk et al., 1985:141). It may be the case that this trend also fulfils the
Competing Constructions Hypothesis as set out earlier, and that Lancashire speakers
instead of using SHOULDn’t. This theory is expanded upon in §3.5.5, with a closer look
Litcorp. As can be seen on the graph in Figure 3, many of the Litcorp results (other
than SHOULDn’t) display low frequencies. All constructions other than SHOULDn’t have
frequencies between only 0-2 per 100,000 words. It could therefore be suggested that
obligation, as a concept, simply is not very important in these kinds of stories that
make up Litcorp. This same explanation may account for all increases in the data
(DOn’t have to, SHOULDn’t, HAVEn’t to, MUSTn’t and DOn’t need to) as the starting
values for these constructions in the Litcorp are so low. Because of this, more
syntactic and semantic data is needed in order to support the claims put forward here.
3.5.5 Semantic evidence
In order to examine possible semantic reasons for change over time, the data can be
analysed based on the semantically strong / weak distinction as described earlier when
strong obligation are those in which an implied source of authority exerts more force
over the people or entities involved; on the other hand, constructions displaying weak
obligation are those in which there is less force from the implied authority source, and
hence, although the force is implied, it is not binding on the people or entities; for
example compare we must not go home (strong) with we don’t have to go home
(weak). All results were analysed into the categories weak, strong, or both.
Constructions that were classified as both were those that displayed at least a 1:3 ratio
of both obligation types (e.g. 5 weak uses and 15 strong uses, or vice versa). The
[Figure 4. Obligation constructions arranged between STRONG and WEAK. Strong: MUSTn’t, SHOULDn’t. Both: HAVEn’t to, HAVEn’t got to. Weak: DOn’t have to, NEEDn’t, DOn’t need to.]
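The 1:3 threshold used to assign a construction to the category both can be expressed as a simple rule. The sketch below is one possible reading of that threshold, not the procedure actually used in the study: the function name is hypothetical, and it assumes that the ratio compares the less frequent obligation type with the more frequent one.

```python
def classify_obligation(weak: int, strong: int) -> str:
    """Classify a construction as 'weak', 'strong' or 'both'.

    A construction counts as 'both' when the less frequent obligation type
    reaches at least a 1:3 ratio against the more frequent type.
    """
    if weak < 0 or strong < 0:
        raise ValueError("counts must be non-negative")
    if weak == 0 and strong == 0:
        raise ValueError("no attested uses to classify")
    minority, majority = sorted((weak, strong))
    if minority * 3 >= majority:
        return "both"
    return "weak" if weak > strong else "strong"


# The worked example from the text: 5 weak and 15 strong uses sit exactly on the threshold.
print(classify_obligation(5, 15))

# One token fewer on the minority side falls below the threshold.
print(classify_obligation(4, 15))
```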
to weak and strong obligation and those which have both. Here it is suggested that
both HAVEn’t to and HAVEn’t got to display constructional polysemy, where both
strong and weak meanings are possible. Harris and Campbell (1995:26) suggest that
Further to this, as both HAVEn’t to and HAVEn’t got to appear with both weak
and strong meanings (unlike other constructions that only display weak obligation), I
would suggest that these constructions are more semantically similar to the strong core
meaning may show a fall in frequency when the Sound Archive and Litcorp are
compared, due to their function being ‘taken over’ by other constructions. Decrease in
both SHOULDn’t and NEEDn’t (underlined in Figure 4) may be related to the observed
increase in HAVEn’t to and DOn’t have to in this way. It has been shown earlier that in
the Lancashire data HAVEn’t to occurs with the strong meaning 79% of the time. This
suggests that in Lancashire the increase in HAVEn’t to (in this strong sense) and the
decrease in the semantically similar SHOULDn’t may be linked, thus suggesting that the
Competing Constructions Hypothesis may be true. The increase in MUST also could be
3.5.6 Syntactic evidence
In order to further support the suggestion that the HAVEn’t to construction in current
Lancashire dialect displays syntactic properties closer to those of a core modal verb, it
is useful to look back to the NICE properties, as detailed by Quirk et al. (1985:147).
instances where no examples were present in the data, a group of ten informants who
on their acceptability by using test sentences such as (45). (See Chapter 1 for more
14 The full list of test sentences can be found in Appendix C.
more syntactically similar to the core modals SHOULDn’t and MUSTn’t suggesting that
semi-modals such as those formed with periphrastic DO behave much less like modal
verbs. HAVEn’t to is the only construction in this family that may be able to have a
mentioned previously, (please see §1.3.5 for further details on these informants). This
not, and if they had heard it in use. While this test was too small-scale to give any
significant results, it is perhaps interesting to note that around half of the
informants reported that either they have heard it from a Lancashire dialect speaker, or
Given the results presented earlier in the chapter, it is possible to conclude that the
towards a more modal function, has been proven correct. The results from the
diachronic data suggest that semi-modal HAVEn’t to now behaves more like a core
modal verb for Lancashire speakers than is the case in Standard English. The semantic
polysemy for Lancashire speakers and this differentiation is present in both the
Litcorp and Sound Archive data. It seems that in a majority of cases (in the Sound
Archive data), its meaning is semantically closer to the stronger modal verbs MUSTn’t
or SHOULDn’t rather than to the weaker DOn’t have to. The syntactic analysis also
suggests that HAVEn’t to has become more grammaticalized. The comparison of the
NICE properties clearly shows that HAVEn’t to is syntactically closer to a core modal
The results are not without question; although the semantic and syntactic
arguments clearly show that, synchronically, the HAVEn’t to construction has become
grammaticalized in the Lancashire dialect data, the diachronic data does not
conclusively show the process of grammaticalization taking place from the Litcorp
period to the Sound Archive period. This may be due to the somewhat problematic
nature of the comparison between the written Litcorp and the spoken Sound Archive
data: the former is written ‘consciously’ by the author, while the latter is spoken
‘naturally’. The construction family analysis seems to show polarised results for
Litcorp, with SHOULDn’t returning more than three times as many results as all other
clear. The constructional polysemy of HAVEn’t to, along with that of the similar
HAVEn’t got to, could also be used to support the Construction Competition
constructions that are able to fulfil a similar semantic role. However, as discussed
previously, the Litcorp data is not completely reliable in this respect, and so no firm
conclusions can be drawn. It may be that the authors of Litcorp overuse certain
modals, or, on the other hand, it may be that this is indeed representative, in which
language change such as that of Kroch (1989). The S-curve model of language change
examines two variants and suggests that as one competing variant increases, the other
decreases, thus producing the S-curve as shown in §3.2.7. However, this approach
will, in many cases, be too narrow. This study suggests that often there may be a
number of similar constructions which are able to fulfil a particular semantic role, thus
meaning that speakers are not limited to a choice of only two variants in opposition.
The interaction between these variants is complex, and cannot easily be accounted for
by Kroch’s S-curve model. The results of this study suggest that a wider scope of
While this investigation has yielded interesting results, there are a number of
limitations involved with both the methodology used here, and with the scope of this
sparseness. Although the results for HAVEn’t to returned low frequencies, there were
enough both to show that the construction exists and to carry
out an analysis of aspects of its syntax and its semantics. While enlarging the corpora
may provide further evidence, Biber et al. suggest that the obligation/necessity
modals, such as those examined here, are less common overall than other modal
studies (see e.g. Cowart, 1997; Schütze, 1996) and would be a good option in order to
further this study. A combination of these approaches (as put forward by Hollmann &
Siewierska, 2006) should give a good overall picture of how this construction family
is used, and account for the limitations encountered in this study. This approach is
adopted in other chapters in this thesis (e.g. in Chapter 4 when testing habitual
constructions).
It was suggested earlier that sociolinguistic salience may have been a motivating
factor for the low frequency of the construction in question in Litcorp, which
somewhat undermines the frequency results presented here. This variable is analysed
construction HAVE to (see e.g. Brinton, 1991; Quirk et al., 1985; Biber et al., 1999; Fischer
et al., 2000). This project could be furthered by more analysis of the positive
Standard English. There is also the potential to take a wider view of these
constructions; some of the generalizations suggested here could be used as a basis for
looking at all modal verbs in Lancashire dialect data. Looking at further modals and
Chapter 4. Verbal agreement and the Northern Subject Rule
4.1 Introduction
the NSR any present indicative verb in any person may take the suffix –s (normally of
course only associated with 3sg) except for when found directly adjacent to any non-
3sg personal pronoun, thus giving the distinction they go home, but the children goes
home.
(Pietsch, 2005; Ihalainen, 1994; Murray, 1873) and also in a number of areas beyond
this (Rupp and Britain, 2008; McCafferty, 2003; Godfrey and Tagliamonte, 1999).
However, few in-depth region-specific analyses of the NSR have been conducted,
with some studies (e.g. Börjars and Chapman, 1998; Henry, 1995) including little or
no data in their research. Alongside this, most studies do not address variables such as
the possible interplay between the NSR and other similar constructions, nor examine
These issues are considered in this chapter, where data from spoken and
order to explore both the possible instances and acceptability of the NSR in
phenomena (such as the NSR) the result of relatively clear rules or constraints, and to
It is even possible that agreement variation (such as that proposed to be demonstrated
variation. This possibility is discussed further with respect to the results from
Lancashire in §4.5.
4.1.1 Overview
Standard English resembles many other world languages in that it displays agreement
between the verb and subject (see e.g. Siewierska, 2004), whilst also differing in a
number of other ways; with lexical verbs, agreement is confined to the present
has a more elaborate agreement paradigm (although this is not uncommon cross-
        Present                          Past
        Old High German    English       Old English    English
1sg     bim, bin           am            wæs            was
2sg     bist               are           wǽre           were
3sg     ist                is            wæs            was
1pl     birum              are           wǽron          were
2pl     bir(e)n            are           wǽron          were
3pl     birut, bir(e)t     are           wǽron          were
TABLE 1. PARADIGM OF BE IN OLD HIGH GERMAN, OLD ENGLISH AND ENGLISH
syncretism, where a single form serves two or more morphosyntactic functions (see
e.g. Corbett (2006) for a discussion of this). In the case of regular lexical verbs, there
is only one overt marker of person agreement, namely –s, used to indicate 3sg in the
present indicative – all other forms are zero, as shown in (1-2) respectively.
(1) He / she / it likes chocolate.
This Present Day English agreement system is thought to have arisen due to changes
in word order that initially left person and number marked on the verb by both
pronouns and by verb endings, as shown in Table 2 (modified from Van Gelderen,
2006).
Present indicative
1     ic                  find(e)
2     thou                findes(t)
3     he                  findeþ / findes
Pl    we, ye(e), thei     findeþ/en
TABLE 2. LATE MIDDLE ENGLISH PRESENT INDICATIVE AGREEMENT WITH FIND
This “double marking” of the verb meant that verbal endings eventually became
connection, many Romance languages, where the subject pronoun is normally omitted
Despite this change, 3sg verbal agreement distinctions were kept in English, and today
substantial variation with the 3sg form is present in many regional varieties (see
§4.2.1 for a further discussion of the history of variation associated with the NSR in
particular). More generally, departures from standard verbal agreement in both British
and worldwide varieties of English are common and have been widely studied (e.g.
Cheshire, 1982; Kortmann and Schneider, 2004; Labov et al., 1968; Trudgill, 1999).
subsumed under the Northern Subject Rule. This phenomenon is also referred to as
present-tense rule (Montgomery, 2004) and is also outlined by others (e.g. Rupp,
2005; Hudson, 1999). Despite apparent differences in terminology, all suggest that, as
discussed earlier, in varieties of English any present indicative verb in any person may
take the verbal suffix -s except when directly adjacent to non-3sg personal pronouns,
NSR; much of the literature suggests that the type and position of the subject may also
influence the application of this pattern (e.g. by Pietsch, 2005; Godfrey and
British Isles, and while it has been suggested to be prevalent in the North (e.g. by
Pietsch 2005; Klemola 2000), it is not exclusively located in these areas. The idea of
alongside Northern England (and Scotland), NSR agreement has been identified in
Ulster (McCafferty, 2003), and in the Southwest (Godfrey & Tagliamonte, 1999).
Interestingly, many dialects of English also have differing agreement patterns related
to 3sg variation. Varieties found in East Anglia (e.g. Britain & Rupp, 2005) and in
Buckie Scots (Smith & Tagliamonte, 1998) display agreement patterns in direct
opposition to the NSR, where 3sg forms are more commonly found with adjacent 3sg
pronominal subjects than with an NP subject, e.g. the cat purr, it purrs (taken from
Pietsch (2005:22) suggests that NSR agreement can be found in the Lancashire
part of the Freiburg English Dialect corpus (henceforth, FRED) and the Survey of
English Dialects (henceforth, SED). However, the Lancashire part of the FRED and
SED data contains relatively few speakers (when compared to this study); a more
thorough investigation is required in order to determine the extent to which the NSR
may be present in Lancashire more widely, rather than being limited to the small group
of speakers sampled in the FRED and SED data. The present analysis of a corpus of
19th and 20th century Lancashire dialect literature also allows tentative claims to be
made about possible changes to verbal agreement in this region whilst also providing
As shown in examples (3-5), the traditional definition of the NSR suggests that
present indicative verbs may take -s verbal agreement in all circumstances, except
when directly adjacent to a non-3sg personal pronoun subject. This specific influence
type-of-subject constraint, and by many others in less explicit terms (e.g. Montgomery
1994b; Cheshire and Fox, 2006). Alongside this subject type restriction, many also
suggest that the position of the subject in relation to the verb determines the
application of the NSR, where non-3sg personal pronoun subjects which are separated
from the verb by a clause or phrase may also take 3sg verbal agreement, as shown
It can be argued that pronominal adjacency may override the possible use of
occurs, thus avoiding a conflict with the person and/or number features of the pronoun
and the verb. These subject type (i.e. pronoun vs. non-pronoun subject) and subject
position (i.e. adjacent vs. non-adjacent subject) restrictions are the main constraints
associated with the NSR and are largely agreed upon in the literature. These
constraints have also been suggested to affect the application of the NSR. Both
Godfrey & Tagliamonte (1999:97) and Bailey et al. (1989) suggest a heaviness
constraint, where the phonological size of the subject may affect agreement. This
would imply that longer and more phonologically dense (or heavy) noun phrases are
more likely to occur with NSR agreement. However, Godfrey & Tagliamonte (1999)
in (7), or both (both of these examples are invented to exemplify this). Godfrey &
Tagliamonte also report that no examples of this constraint are found in their results.
(6) The children who are always with the dog likes going outside.
Although we know that heaviness can affect word order (see e.g. Hawkins,
1994), there is no clear reason why heaviness should affect agreement. Instead, I
would suggest that Godfrey and Tagliamonte’s heaviness constraint may be part of (or
indeed encompass) a wider pattern of agreement that is not specifically regional, nor
part of the NSR. This involves the most verb-adjacent part of a long or complex
subject (rather than the whole subject NP) agreeing with the verb (in (6-7) this is the
dog). This can be further demonstrated in (8), taken from the British National Corpus
(BNC).
(8) His writing-room on the first floor contains an unprepossessing table and a
sideboard, on which sit his word-processor and printer. A small chair and
bookcase completes the picture. (BNC AOP12)
In this example, a small chair and bookcase could be considered as 3sg or as 3pl but it
is found with 3sg agreement, perhaps due to the adjacent 3sg bookcase. This
also suggests that it is “not part of NSR agreement proper”, and this is the stance I will
also take. Nonetheless, examples of such agreement in the Lancashire data are
similarities with the NSR (namely, non-3sg subjects can occur with 3sg agreement), it
therefore may compete with it or overlap with its use in some way. The concept of
refinement and significant weakening of the NSR, one of the most inclusive
Pietsch’s inclusion of thou within the definitions of the NSR is perhaps doubtful here.
Thou normally takes –s in most dialects where it occurs, i.e. its agreement is identical
to that of 3sg subjects (see e.g. Shorrocks, 1999:93; Smith and Tagliamonte, 1998), and so it
is unclear exactly why this should be part of the NSR. This aside, instances of thou
are retained in the analysis for descriptive clarity. The generalized
definition presented above encompasses the possible variability of subject type and
position with respect to the verb as outlined earlier – all of which are explored within
Alongside 3sg agreement variation with present indicative lexical verbs, many studies
also include analyses of nonstandard verbal agreement associated with 3sg forms of
Although definitions of the NSR do not typically account for past tense variation (as
of course lexical verbs have no 3sg distinction in the past), the more complex
displaying NSR variation, nonstandard 3sg BE in all tenses is able to occur in non-3sg
contexts due to analogical levelling with the NSR (or put simply, due to the spread
and influence of a dominant pattern, see e.g. Henry, 2005; McCafferty, 2003; Godfrey
& Tagliamonte, 1999). Subject type and subject position constraints (associated with
the NSR) are suggested to apply to verbal agreement patterns with BE. This means that
verb-adjacent personal pronouns occur with standard agreement (in this case with
were), as in (10), while adjacent subjects which are not personal pronouns may allow
nonstandard agreement (in this case with was), as in (11) (taken from Britain and
Rupp, 2005:3).
Another factor that has been suggested as affecting the use of was/were is the
polarity of the clause. Nonstandard were (i.e. in the 1sg and 3sg form) is said to occur
more frequently with negative polarity subjects (Cheshire & Fox, 2006:3; Anderwald,
2001:3; Henry, 1995:22), as shown in (12-13), modified from Tagliamonte (1998:22).
linked to the NSR’ (1995:20), although this hypothesis has not been addressed in other
studies. Any possible links between negation and the NSR will be explored,
Variation with past tense BE is found in Northern England – the region most
typically associated with NSR agreement (see e.g. Hollmann and Siewierska, 2006;
Tagliamonte, 1998). Alongside this, variation with BE is also present in many areas
that are not suggested to display NSR agreement, e.g. in London (Cheshire & Fox,
2006) in the English Fens (Britain, 2002), and in certain worldwide varieties of
English (see e.g. Schilling-Estes, 2000; Wolfram and Sellers, 1999). This would
suggest that while there may be a link between the NSR and nonstandard 3sg variation
with BE in regions where NSR variation is prevalent, was/were variation may also
occur independently of this rule. This means that this ‘independent’ was/were
variation, in particular, may not be restricted by the subject type and subject position
often ascribed to regions of the UK such as Lancashire and Yorkshire (Pietsch, 2005).
Tagliamonte (1998:160) suggests that amongst present day speakers from the city of
York the was/were alternation is found “in the speech of the same individual, in the
using a subsection of the data used in this study. As with Tagliamonte’s findings from
York, the Lancashire data showed both inter- and intraspeaker variation with
was/were. In Lancashire, the past tense BE paradigm showed levelling towards was,
but more interestingly also towards were. In a study of a community of high school
speakers from nearby Bolton, Moore (2003:386) also found that the overwhelming
tendency was towards levelling to were. In both studies were levelling appears to be
frequent in all sentence types; this differs from other regional varieties, where
Cheshire & Fox, 2006:3). These was/were findings for Lancashire (and to some
extent, York) appear to conflict with the earlier hypotheses which suggested that in
variation. In Lancashire the non-3sg pattern can be extended to all contexts – this
opposes the NSR agreement pattern where 3sg patterns are extended to non-3sg
(Pietsch 2005, Ramisch 2009) although in these regions was/were agreement appears
to be less restricted (or in fact even completely unrestricted) by the proposed analogy
with the NSR. This may suggest that other variables, such as constructional frequency,
patterns aside from levelling to were exist within the Lancashire data, e.g. (14-15).
(14) “If he’re poorly he ‘ad betther have a cab, an’ go whoam.” “Poorly?” Bob
said, lookin as if he could like t’ ha’ put th’ waiter i th’ doctor’s honds.
“Dustno know good singin when theau yers it, theau donned-up mopstail?”
(Litcorp)
(15) Well we told the skipper he says "oh he's not blind" he says "he just wants to
go back, he don't want to go to sea for Christmas." (Sound Archive)
4.2 History of the NSR
NSR agreement can be found in Northern English texts as early as the late Middle
English period (Filppula et al., 2002:49); the exact origins and development of this
agreement pattern before this time are largely unknown due to a lack of written data.
Arabic, Tagalog, Hebrew and some other Semitic languages (see e.g. Filppula et al.,
2002:47), although these languages typically display a full agreement paradigm, rather
than the limited agreement paradigm found in English. This means that the presence
Klemola (2002) suggests that the reflexes of the NSR agreement pattern found
in Northern varieties are not an innovation, but instead are a retention of an older
agreement pattern that underwent changes due to factors such as language contact and
language-internal variation. The loss of the more complex agreement system found in
Old English is part of a more general loss of affixation that may be found in all
Germanic languages, although reasons why only part of this pattern remains are
dialects, one northern variety and one southern, may have contributed to the
development of the NSR. Firstly, in the North, vowels in the common Germanic
singular forms –u and –ið underwent a process of weakening, becoming -e and -eð during
Old English. The -ð forms were then replaced with –s in both the plural and the
third singular sometime later, and the vowels in the plural and 3sg endings (–að and
–eð) also lost their contrast. The -e ending in the first singular eventually became zero.
Secondly, the innovation of affixless (or zero) forms at first occurred only in a certain
development was apparently initiated by the southern dialects and only began to reach
the North at some time during late Old English. These zero forms were then
changes meant that by the end of the ME period the present tense paradigm of lexical verbs
contained only two distinct forms, the 3sg -(e)s and –Ø for all other forms (although
previous to this, further distinctions were retained). The –Ø verbal endings now
occurred when adjacent to pronouns (except in 3sg contexts), and –(e)s occurred
variably in all other contexts. This pattern of agreement forms the basis of the NSR.
Corbett (2006) suggests that the reduction and syncretism of the agreement
affixes may be due to the rise of subject pronouns. As subject pronouns became the
routine way of expressing person reference, person marking on the verb became
functionally redundant. When the verb and subject were not adjacent, the verbal –s
agreement was kept (although later lost in Standard English), thus resulting in the
NSR agreement paradigm (see Siewierska, 2004:277-81 for a further discussion of the
Reasons for these changes in English leading up to the development of the
NSR are often attributed to language contact with Celtic (e.g. by Isaac, 2003) or
Scandinavian (e.g. by White, 2002). While no direct link between Scandinavian and
the NSR is described in the literature, the sound change resulting in –s verbal endings
in Northern England which later enabled NSR changes to occur, is often attributed to
influence from Old Norse. This is because Old Norse had syncretised the 2sg and 3sg
agreement marking, with both forms ending in the uvular trill /R/. In a language
contact situation, this variant may have been perceived by English speakers as
something similar to /s/ or /z/. This parallel may have allowed the spread of -s from
the 2sg to the 3sg in Northern English at that time by analogy with Old Norse.
However, this does not explain the spread of verbal -s also to the plural forms, as Old
Norse had three distinct forms in the plural. Old Norse also had no alternation of the
agreement paradigm according to adjacency of subject and verb, and so could not
have influenced the NSR in this respect. This aside, if it is accepted that the
Norse, then Old Norse may be considered as playing a role in the appearance of the
resemble those of the NSR (Venneman, 2000). Specifically, the Brythonic agreement
system has person and number inflections whenever the clause has no overt subject
NP, or only a weak personal pronoun. With plural subject NPs, an unmarked third
person verb form is used as shown in (16-17) below taken from King (1993:137).
(16) Maen nhw ’n dysgu Cymraeg.
be.PRES.3P they PROG learn.INF Welsh
‘They’re learning Welsh.’
Although this agreement pattern is similar to the NSR, it has been suggested that
possible influence of Brythonic on the NSR, observable in Middle English, does not
fit with respect to the timeline of settlement and language contact (see e.g. Klemola,
2000 for more details on this). While language contact with Scandinavian or Celtic
languages may arguably have had some role in NSR developments, dialect contact
between the Northern and Southern varieties of English appears to have been the most
important factor. The combination of language-internal change and dialect contact can
It has been suggested (e.g. by Kroch, 1989; Culicover, 2008) that constructions that
share a similar syntactic form or semantic interpretation may compete and overlap in
the minds of speakers, often over time resulting in one form being reanalysed as
another (as we have seen earlier in this chapter with both Ø and /R/ as markers of
agreement). This competition may apply to the NSR, as of course not every example
superficially similar to the NSR can be found in this data. For example, initially, it
appears as if the NSR may be found in historical present constructions e.g. (18).
(18) The folk there says, “get off ‘ome Thurson.” O’ th’ evils o’ drinkin! So I
went back to our house, th’ missus was fast asleep, good job too. (Litcorp)
Historical present constructions bear some resemblance to the NSR in that they are
able to have 3sg verbal agreement in persons other than 3sg, as shown in (18).
However, the historical present construction does not display variation in agreement
based on subject type and subject position constraints. This means that the historical
present construction allows adjacency of any personal pronoun and the 3sg verbal
agreement form. This construction is also semantically different from the NSR in that
it uses present tense verb forms to narrate events that are in the past (Huddleston &
Pullum, 2002:129-131). This semantic difference between the NSR construction and
the historical present can be resolved by examining other verb forms in the utterance
or sentence. For example, in (18) use of the past tense went suggests that this utterance
refers to events in the past; the same can be said for used to, sold and was in example
(19).
(19) He used to make home-made toffee and er he sold milk and bacon and cheese
and all that, and there was a crowd in there, and I goes charging to the
counter. “A gill of milk Mr Jackson please”, which was about a penny or
something like that. (Sound Archive)
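The disambiguation step described above, checking whether an -s form co-occurs with past tense forms in the same stretch of text, can be sketched as a simple filter over part-of-speech tagged text. This is a hedged illustration rather than the procedure used in this study: it assumes input in a word_TAG format with CLAWS7-style tags (VVZ for -s forms of lexical verbs; VVD, VBDZ, VBDR, VHD and VDD for past tense forms), and the function names are hypothetical. It only flags candidates for manual inspection; it cannot decide what a speaker intended.

```python
# Assumed CLAWS7-style past tense tags: lexical verbs, was, were, had, did.
PAST_TAGS = {"VVD", "VBDZ", "VBDR", "VHD", "VDD"}


def split_tagged(text):
    """Split a 'word_TAG word_TAG ...' string into (word, tag) pairs."""
    pairs = []
    for item in text.split():
        word, _, tag = item.rpartition("_")
        pairs.append((word, tag))
    return pairs


def possible_historical_present(tagged):
    """Flag a stretch of text that mixes an -s verb form with past tense forms."""
    has_s_form = any(tag == "VVZ" for _, tag in tagged)
    has_past = any(tag in PAST_TAGS for _, tag in tagged)
    return has_s_form and has_past


# Fragment modelled on example (19); the tagging is supplied by hand here.
fragment = "there_EX was_VBDZ a_AT1 crowd_NN1 and_CC I_PPIS1 goes_VVZ charging_VVG"
print(possible_historical_present(split_tagged(fragment)))  # True
```

A token flagged in this way is a candidate for exclusion from NSR counts, subject to manual inspection of the wider context.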
Habitual constructions also display similarities to both the NSR and the
(20) Most days, men fro’ town goes down dock 9.30 sharp. They always walked
past window, right shoutin n all. (Litcorp)
Like the historical present, this construction does not conform to the adjacency
constraints characteristic of the NSR, and therefore, like the historical present, it may
Habitual constructions often include temporal adverb phrases such as most days or
not always require these adverb phrases in order to represent habitual semantics. This
makes habitual constructions more problematic than the historical present ones with
respect to disambiguation. For example, (22-23) are taken from Godfrey and
Tagliamonte (1999:108) and are presented there as clear examples of NSR agreement.
While the assertion that (22-23) display NSR agreement can be considered true in the
sense that the non-3sg NPs jackdaws and me legs occur with the 3sg –s ending, it is
difficult to state, categorically, that neither of these examples displays any vestige of
habitual semantics. It is plausible to argue that (22) could mean something similar to
there’s a few jackdaws that often come out the back and (23) my legs often ache a
bit. 15 Equally, (23) could be an instance of the present indicative, meaning something
more similar to my legs ache at the moment (and so be a good case for NSR
indication is given on this either way. Presentation of a wider sentence context may
15
Often is used here as an arbitrary illustration of an adverb phrase. Equally, any adverb phrase (e.g.
often, frequently, every day, on a Tuesday etc) could be intended by the speaker. This neatly
demonstrates the point – you simply cannot second-guess what any speaker may have meant.
also go some way to resolving this issue. This same problem occurs with data from
Pietsch (2005) as shown below in (24-27). All of the following utterances are cited as
all of these examples may express habitual semantics (or at least, may be judged to by
some speakers) then should they be considered as examples of the NSR proper? This
issue has not been adequately addressed in the literature. Shorrocks (1999:112) makes
a distinction between habitual constructions and the NSR suggesting that examples
from Bolton such as I often tells him may be due to habitual semantics and not the
how to deal with such variation within quantitative analyses is put forward.
Pietsch (2005:10) certainly suggests that the habitual construction and the NSR
are two distinct constructions, but also does not adequately deal with the implications
related, with the –s that occurs with intervening adverb phrases between subject and
verb being re-analysed as a marker of habitual semantics and extended into pronoun-
Goldberg, 1995) which suggest that over time, one construction can develop out of
another, where the same form is paired with different but related senses. No other
explanations for the origins of this habitual construction have been put forward, and
there are currently no other studies of –s as a marker of habitual constructions in
dialects of English.
have not been overtly addressed in their analysis (i.e. McCafferty, 2003; Hudson,
1999; Godfrey and Tagliamonte, 1999; Börjars and Chapman, 1998; Henry, 1995). It
may be the case that in regions where –s can indicate habitual aspect when found with
semantics alone, and extended into pronoun-adjacent contexts without the need for
any adverb phrase. Therefore, it is not implausible that an utterance such as burglars
steals ‘em could mean something like burglars always steal ‘em in the minds of
certain speakers. This utterance would then therefore not be a good instance of the
NSR. If habitual constructions are frequent in Lancashire, then the frequency of the
NSR may be affected by crossover and interference with this pattern. This will be
Most NSR studies, such as the examples from Pietsch (2005) discussed in (24-27),
give only sentence fragments, making it difficult to distinguish whether or not the
possibility. Certainly, this is rarely addressed clearly in the discussion. The problems
in (28).
Many would consider this to be a good example of NSR, with the 3pl NP occurring
with the nonstandard –s. However, the addition of hypothetical contextual information
shows without doubt how this could also easily be either habitual (29), or historical
present (30). Only examples that also contain an adjacent non-3sg personal pronoun with
standard agreement (-Ø) really show that the pattern may be a clear instance of the NSR, as
in (31).
(29) Every Friday at 5.30 the children comes to the fair, after school. (Habitual)
(30) So, the children comes to the fair, and says “look at that!” So we went over to
the stall. (Historical present)
(31) The children comes to the fair and they enjoy the rides. (NSR)
However, a wider scope may not always make the speaker’s or writer’s intended
meaning clearer: it is impossible to know whether they are using the NSR
proper or are instead using –s in a more idiosyncratic way. I would suggest that
these similar constructions may influence, compete, overlap and mix with each other
in the minds of speakers in a way that is difficult to distinguish and test, thus perhaps
and region-specific.
4.2.3 Salience
Sociolinguistic salience may also have a bearing on the occurrences of the NSR in the
corpora. Kerswill & Williams (2002) suggest that salient constructions are those
which are overt in the speaker’s mind (see also markers, stereotypes and indicators,
Labov 2001; markedness Greenberg 1966). While previous accounts of the NSR have
not dealt with salience explicitly, the social implication of the usage of nonstandard
As Litcorp is not a record or transcription of Lancashire dialect speakers, but
instead a record of the writers’ perception and representation of these speakers, it may
by these writers. This approach is expanded upon and discussed more explicitly in
original – no other attempts to examine and compare data such as this can be found in
previous studies. Similarly, although the Sound Archive data is a transcription of real
speakers, it may be that a number of nonstandard forms are recognized by (or are
salient to) the speaker as being more dialectal. It is therefore plausible that the
constructions, depending on both their own knowledge of their local dialect and the
way in which they wish themselves to be portrayed (i.e. as more or less dialectal, see
e.g. Hollmann and Siewierska, 2006). It could also be suggested that certain
constructions may be more salient than others and that this in turn would have a
bearing on their frequency in the corpus. For example, it may be the case that the use
of nonstandard verbal agreement forms such as (32) stand out more (or are more
(32) Ah always does what Ah con for her, an Ah will say this, she’s allus thankful
for a bit o’ help. (Litcorp)
(33) And she says “I’ll get you some butties for t’train”. (Sound Archive)
This may mean that the (arguably) more salient nonstandard verbal agreement,
including that of the NSR, may be found less frequently in speakers who wish to
adhere to overt prestige forms, and possibly is more frequent in speakers wishing to
adhere to potential covert prestige forms, i.e. dialect forms. By examining the
frequency of NSR agreement in the Litcorp compared to the Sound Archive, not only
will potential language change be investigated, but the status of NSR agreement as a
NSR agreement is not a salient feature of the dialect, it would be less likely to appear
in the Litcorp data as the writers would not necessarily perceive it as an obvious
feature of dialect speakers from this region. This ‘perception frequency’ can then be
Sound Archive, giving an interesting contrast. These issues will be tested in this study,
order to test constraints such as subject type and position. These questionnaires are
used not only to target those informants who identify themselves as dialect and non-
dialect speakers, but also cast the net more widely and attempt to see if any
differences exist between Lancashire and other regions of the UK. Further details on
Both salience and constructional competition, and indeed, the development of the
NSR construction itself, are underpinned by the role of frequency. Building on the
ideas of Bybee (1985), approaches in Cognitive Linguistics (e.g. Croft & Cruse, 2004;
Croft, 2001; Langacker, 2000; Goldberg, 2006) suggest that the relationship between
suggested that language use influences the structure of representation in the mind, and
that grammatical structures that are used more often (and therefore have a high token
16
Conclusions such as these are tentative; for a full discussion of the merits and problems associated
with comparisons of this nature, see Chapter 5.
frequency) become more reinforced (or entrenched). This may suggest that verbs
which have a higher frequency, as compared to other verbs, may be more entrenched
and therefore more resistant to language change, thus preserving older agreement
patterns. This will be investigated with respect to the corpus data in §4.4.1 where the
Studies into the NSR have produced diverse and often conflicting results. As Clarke
(1997:3) points out, ‘differing methods of analysis, number of tokens used, lack of
comparable corpora, and the range of linguistic data examined all play a part in the
lack of consensus over the development and function of verbal -s.’ This chapter takes
data, provides a robust description of the NSR in Lancashire. More specifically, the
(a) To what extent is the NSR a feature of the Lancashire dialect data examined
in this study?
(b) What, if any, factors may motivate an informant’s decision to use NSR
features, such as the constraints detailed in §4.2, and concepts such as
salience and frequency?
(c) Is there any evidence that was/were variation is influenced by the NSR in
Lancashire?
(d) What effect, if any, does the frequency of the superficially similar
constructions (the historical present and the habitual) have on the distribution
of the NSR in Lancashire?
(e) What changes in the frequency of the NSR have occurred over time?
this study and indeed in this thesis. The Sound Archive data consists of oral history
interviews, and therefore contains relatively few examples of constructions in the
present tense (except for, of course, those in the historical present which, as outlined
earlier, are not considered as part of the NSR). This makes investigations into the NSR
for biases such as these. By using both corpora and sociolinguistic questionnaires, the
4.3 Methodology
As with other studies in this project, spoken transcribed data from the Sound Archive
corpus is analysed along with data from Litcorp. For further information on the
speakers, locations and overall sampling of the corpora please see Chapter 1. The
analyses of Sound Archive and Litcorp data are then compared to a questionnaire
exploring 3sg agreement (and in particular, the NSR) targeted at Lancashire dialect
speakers and also speakers from other regions (see § 3.4.3). A full copy of this can be
found in Appendix D.
Many previous analyses of 3sg agreement variation do not base their claims on a
suitable amount of data. Both Börjars and Chapman (1998) and Hudson (1999)
conduct no empirical tests, but instead base their arguments on intuition alone. Börjars
and Chapman suggest that nonstandard 3sg agreement, specifically with lexical verbs,
is triggered by inverted pronouns. While this may be the case, this proposition remains
Henry’s (1995) study of Belfast English gathers data from elicited grammaticality
judgements in order to suggest quite the opposite, that the application of verbal –s is
prohibited under inversion in this variety of English. However, no information is
included on the number, age or sex of the informants, or about the nature or structure
reported differences between agreement in Belfast English and the NSR are true
interview techniques (see e.g. Labov, 1972 for further information on this) from eight
elderly rural speakers in Devon. No information is given about the topic or length of
these interviews, but they report 628 instances of verbal –s used in a nonstandard way,
and attribute a number of these to the NSR. The use of interview data is a good
approach (and indeed, is one of the methods used in this study), although a larger
number of informants might have allowed Godfrey and Tagliamonte to make stronger
Pietsch (2005) provides some of the most robust results for the NSR by taking
a more quantitative approach. His study examines data from the Northern Ireland
Survey of Hiberno-English Speech, and the Freiburg Corpus of English Dialects. This
provides a good account of the distribution of the NSR in the British Isles. While these
corpora are a good resource and studies such as this can make strong claims about the
distribution of this variable at the time that the corpora were collected, a focus on
more modern data would provide interesting information on the possible development
of verbal –s. While a number of studies into NSR development have included written
historical documents (e.g. Wright 2002; Montgomery et al., 1993; Bailey et al., 1989),
aside from the ‘one excerpt of short prose’ examined by McCafferty (2003:5) this
inclusion of dialect literature in particular, alongside spoken corpus data, is novel. A
4.3.2 Corpora
The corpus methodology falls into two main strands – retrieval and analysis of
nonstandard 3sg agreement in lexical verbs, and retrieval and analysis of nonstandard
agreement with auxiliary verbs (in particular, BE but also HAVE and DO) in both the
As both corpora are part-of-speech tagged using the CLAWS-7 tagset, 17 all
lexical verbs that may exhibit NSR agreement (i.e. 3sg forms) can be retrieved by
searching for the tag _VVZ, which retrieves all -s forms of lexical verbs (e.g. gives,
works etc). BE, HAVE and DO (both lexical and auxiliary) are retrieved by searching for
their individual forms, e.g. am, is, have, and all possible contractions, e.g. ’m,’s,’ve
etc. Subsequent searches were also carried out in order to find adjacent personal
pronouns and verbs that displayed nonstandard agreement patterns. Along with
identifying possible NSR examples, these corpus searches also uncover historical
present constructions (34); habitual constructions 18 (35); other agreement patterns (36)
(34) Anyway I said to her one day, I says “what's the matter?” I said “why won't
you mix with the other girls?” (Sound Archive)
(36) And I said, “well I don't know, I'll have to go and ask her” so he give me
money for t'bus and I went up and asked me Mother if, me Grandad lived with
17 See http://ucrel.lancs.ac.uk/claws7tags.html for more details on the tagset.
18 Habitual constructions are here defined as those occurring with relevant adverb phrases; habitual
constructions without adverb phrases will be captured by the search for _VVZ. See §4.4.2 for further
discussion of this.
(37) So er I thought well when Alex goes I'll bow out and that’s it. (Sound
Archive)
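The retrieval step described in this section can be sketched in code. The following is a minimal illustration only, not the search tool actually used in this study. It assumes that the tagged corpus is available in a 'word_TAG' horizontal format (the storage format of the corpora is not specified here) and uses the CLAWS-7 tags VVZ (-s form of a lexical verb), PPIS1 (I), PPIS2 (we), PPY (you) and PPHS2 (they). The function name find_vvz_candidates and the sample line are hypothetical.

```python
import re

# Assumption: one token per "word_TAG" pair, separated by whitespace.
TOKEN = re.compile(r"(\S+)_([A-Z0-9]+)(?!\S)")

# CLAWS-7 tags for non-3sg personal pronoun subjects: I, we, you, they.
NON_3SG_PRONOUN_TAGS = {"PPIS1", "PPIS2", "PPY", "PPHS2"}


def find_vvz_candidates(tagged_text):
    """Return every _VVZ token together with the token directly before it."""
    tokens = TOKEN.findall(tagged_text)
    hits = []
    for i, (word, tag) in enumerate(tokens):
        if tag != "VVZ":
            continue
        prev_word, prev_tag = tokens[i - 1] if i > 0 else (None, None)
        hits.append({
            "verb": word,
            "preceding_word": prev_word,
            # True when a non-3sg pronoun is directly adjacent to the -s form.
            "non_3sg_pronoun_adjacent": prev_tag in NON_3SG_PRONOUN_TAGS,
        })
    return hits


# Hypothetical tagged line, modelled on the Sound Archive narratives.
sample = "So_RR I_PPIS1 gets_VVZ on_II the_AT train_NN1 and_CC he_PPHS1 says_VVZ"
hits = find_vvz_candidates(sample)
for hit in hits:
    print(hit)
```

Hits flagged in this way are candidates only. Each one still has to be read in its wider context, since historical present and habitual constructions are retrieved by exactly the same search.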
While examples such as (35-37) are excluded from any possible NSR results,
frequency data for these constructions are presented in Table 7. The relevance and
implications of the historical present and habitual constructions are discussed
further in §4.5. Any results that do not clearly fall into these four categories will be
Contractions are retrieved from the corpus data by searching for the individual
contracted form (e.g. ’s or ’ve) as all contractions are split and stored as separate
tokens in both corpora. All results are analysed for any possible ambiguity, e.g.
examples of ’s that may be genitives (38); the contracted form of BE (39); or the
(38) I went into Mr Jackson's shop which was on Victoria Street (Sound Archive)
(39) Anyway, when I come back and he’s waiting at Wyredock Station for me
(Sound Archive)
(40) Yo’ seen, he’s known yo’ so long, an’ he’s warked wi’ yo for mony a
yer.” (Litcorp)
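Because contractions are stored as separate tokens, a first pass over the possible readings of ’s can be sketched as below. This is an illustration under stated assumptions rather than the procedure followed in this study, where ambiguous results were analysed manually. It assumes the same hypothetical 'word_TAG' format and relies on the CLAWS-7 tags GE (genitive marker), VBZ (is) and VHZ (has). The function name classify_s and the sample line are hypothetical.

```python
import re
from collections import Counter

# Assumption: one token per "word_TAG" pair, separated by whitespace.
TOKEN = re.compile(r"(\S+)_([A-Z0-9]+)(?!\S)")

# CLAWS-7 tags that distinguish the readings of the token 's.
READINGS = {"GE": "genitive", "VBZ": "BE (is)", "VHZ": "HAVE (has)"}


def classify_s(tagged_text):
    """Count each reading of 's; anything else is left for manual checking."""
    counts = Counter()
    for word, tag in TOKEN.findall(tagged_text):
        if word in ("'s", "’s"):
            counts[READINGS.get(tag, "check manually")] += 1
    return counts


# Hypothetical tagged line combining the three readings.
sample = ("Mr_NNB Jackson_NP1 's_GE shop_NN1 he_PPHS1 's_VBZ waiting_VVG "
          "he_PPHS1 's_VHZ known_VVN")
counts = classify_s(sample)
print(counts)
```

Automatic tags are not infallible, particularly for dialect spellings, so a count of this kind would only narrow down the set of examples that need to be inspected by hand.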
are also found in both corpora, e.g. were (41-42); and has (43). These results are
discussed in §4.4.2.
(41) “You don’t suppose I’d sell it without the shell”, he said; an he looked as if he
thowt aw’re havin him on. (Litcorp)
(42) So we went into Fat Jack’s i’ th’ corner; an’ he co’ed for two twopenno’ths
wi’ as mich swagger as if he’re gooin’ to get change for a suvverin. (Litcorp)
(43) He said, “my tea doesn't taste right unless I’s had it in me black and when
I’ve had me tea I has a wash.” (Sound Archive)
The data searches also retrieved any nonstandard forms of the verbs BE (44), HAVE
(44) “Here theau art”, hoo says, an pretended t’ offer me th’ paper (Litcorp)
One problem often associated with these nonstandard verb forms is the
omission of the subject, as shown in the above examples (45-46). In most occurrences
of this, as with two of the examples above, the subject can be resolved from the
context by looking in the corpus – in (45) the subject you refers to Bill, and in (46) the
subject you refers to Jamie. Examples where the resolution of the subject from the
context is not possible are excluded from the final results. Similar to these
nonstandard verb forms, archaic 2sg pronouns tha, thee (47), and thou (48), are also
(47) “The smoke, tha'll have to give it up cop, it gives thee cancer, aye” (Sound
Archive)
(48) Where hast thou been? Thou art all in a sweat (Litcorp)
Alongside tha, thee and thou, the archaic pronoun hoo is present in the Litcorp data.
Wales (1996:19) states that hoo is the 3sg feminine subject pronoun from the Old
English heo and occurred mainly in the North West Midlands, while Beal (2004:119)
suggests that this form is also commonly found in Lancashire. Hoo occurs frequently
(49) He thowt, for a bit, hoo were playin’ a trick on him, but th’ choilt did it quite
innercent. (Litcorp)
(50) He geet so bad in a bit, an’ were vomitin’ so much, that Margit were freetend,
so hoo rushed off to Bill Olegg’s for summat to stop th’ gripin’ pain an’
ickness. (Litcorp)
Both verbs and personal pronouns (and of course other sentence elements)
present in Litcorp display variant spellings representing the writers’ desire to convey
the phonology of their accent through the written word e.g. (51-52).
(52) Neaw theau couldno’ tell ‘em fro’ ladies, unless it wur by ther tongues.”
(Litcorp)
Both the nonstandard contractions and the variant spellings of pronouns and also verbs
were found by searching for their specific search term (i.e. results for you include yo
and y’). The variant forms were originally uncovered by close examination of a
4.3.4 Questionnaires
Questionnaires are used in this study in order to both include the perceptions of more
modern speakers and to allow a (tentative) further time depth comparison with the
Sound Archive and Litcorp data. The questionnaire that I have devised aims to test the
possible morphosyntactic limitations (or constraints) to the NSR that are outlined in
§4.2.1, in order to uncover whether or not syntactic position and subject type exert an
effect on the application of the NSR for current Lancashire speakers. The
questionnaire also explores present tense constructions further, in order to compensate
participants were asked to judge sentences on a five point scale, with 1 being the least
acceptable to them and 5 being the most acceptable e.g. (53). Descriptors were not
assigned to the intervening values (i.e. 2, 3 and 4) so that interval variable status (as
opposed to ordinal variable status) can be approximated (see e.g. Cowart, 1997:71 for
further details).
(53) ‘They have a shop of their own and is very well off.’
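If the five point scale is treated as approximating an interval variable, judgements can be summarised as mean ratings per respondent group and test sentence. The sketch below illustrates this under that assumption only. The function name mean_ratings, the group labels and all ratings shown are hypothetical and are not taken from the questionnaire data.

```python
from collections import defaultdict
from statistics import mean


def mean_ratings(responses):
    """Average 1-5 acceptability ratings per (group, sentence) pair."""
    buckets = defaultdict(list)
    for group, sentence_id, rating in responses:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating outside the 1-5 scale: {rating}")
        buckets[(group, sentence_id)].append(rating)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical responses to test sentence (53).
responses = [
    ("Lancashire", 53, 4),
    ("Lancashire", 53, 2),
    ("Other", 53, 1),
]
summary = mean_ratings(responses)
print(summary)
```

Where interval status is not assumed, the median would be the more cautious summary of the same responses.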
The sentences chosen for the questionnaire relate to the constraints as detailed
in §4.1.2, and test nonstandard agreement with variables such as adjacency, type of
4.3.5 Classification and division of respondents
dialect in response to the question ‘do you have a particular dialect? If yes, how
would you describe it?’ are classified as ‘Lancashire, dialect speakers’ in the results in
§4.4.5. In addition to this, there were a number of speakers who identify themselves as
living in a Lancashire town or village (or having lived there for a majority of their
life), but did not claim to have a Lancashire dialect. These speakers are classified
Along with this distinction, the questionnaire respondents were also split into
north/south groups, to explore whether there are any differences between them. Again,
the speakers were categorized according to how they identified themselves in the
only partly dependent upon the geographical origin of the speaker and partly on other
cultural factors (see e.g. Wales 2006: 9-24 for a discussion of this). Here I follow
Trudgill (1999) in defining the north/south divide as being delineated by the so-called
Wash-Severn line.
The questionnaire data will be compared to the results from both corpora.
Since all three data sources were gathered by different means and cover different time
periods, a sensitive combination of these results should give a good picture of how the
4.4 Results and analysis
Many discussions of the NSR (outlined initially in §4.1) demonstrate the reflexes of
this rule by including examples that show the effect of adjacent vs. non-adjacent
personal pronouns within the same sentence or utterance e.g. (54) and (55).
No examples such as these are found in any of the Lancashire corpora. This is not due
to an absence of sentences of this type within these data; nineteen examples such as
(56) Because when you’re passing in a car, on the bus, you just see a church, you
don’t know whether it’s in good condition or bad condition until you come
and say “how long since this was done?” (Sound Archive)
Sentences such as (56) are not frequent enough to support strong claims
other infrequent results, the acceptability of this construction is tested by means of the
The strong definition of the NSR suggests that every present indicative verb
takes the 3sg form, except when it is directly adjacent to a personal pronoun subject.
However, the reality is more complicated; some features of the NSR are also features
others (e.g. 3sg agreement with non-3sg subjects) are shared by other agreement
patterns known to exist. As outlined earlier, a broader version of the NSR offers more
scope for possible variability (as set out below) and is tested here with respect to the
Lancashire data:
a) 3sg subjects (and thou) always take –s (or related 3sg form)
b) Non-3sg subjects may have –s (or related 3sg form)
c) Non-adjacent subject and verb prefer –s (or related 3sg form)
It should be noted that not only results which may conform to the NSR are
relevant, but also the determination of the extent of the variability associated with this
Tables 4-9 deal with present tense variation only; was/were results are
analysed separately in §4.4.5. All results presented here are raw frequency results
only; although the corpora are not strictly comparable in terms of size, many values
are too small to normalise (e.g. to values per 100,000) and still achieve usable results.
As set out in the methodology, adverb phrases relating to time (such as sometimes,
never, every Wednesday) intervening between subject and verb (e.g. I always goes
there) are not included as examples of the NSR and are instead analysed in Table 7 as
habitual constructions. Previously I have outlined the possibility that all instances of
the present indicative may contain an element of habitual semantics e.g. (57)
This is difficult to resolve in any satisfactory way and so, in line with previous studies,
I tentatively include instances such as ‘the men takes the pictures’ as good examples
of the NSR. Unlike other studies, which display only partial or single sentences, all
examples from the Lancashire corpora are given with a wider textual context for
clarity.
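The normalisation referred to above (values per 100,000 words) is a simple rescaling of the raw frequency by corpus size. The sketch below shows the calculation and why it is of limited use with very small counts. The corpus size used here is a hypothetical placeholder, not the actual size of either corpus, and the function name per_100k is likewise an assumption made for illustration.

```python
def per_100k(raw_count, corpus_size_in_words):
    """Rescale a raw frequency to a rate per 100,000 words."""
    if corpus_size_in_words <= 0:
        raise ValueError("corpus size must be positive")
    return raw_count * 100_000 / corpus_size_in_words


# Hypothetical corpus size, for illustration only.
HYPOTHETICAL_CORPUS_SIZE = 400_000

rate = per_100k(8, HYPOTHETICAL_CORPUS_SIZE)
print(rate)
```

With raw counts in single figures, a difference of one or two tokens changes the normalised rate substantially, which is consistent with the decision to report raw frequencies only.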
NSR RESULTS LITCORP SOUND ARCHIVE
NSR agreement is infrequent in the Lancashire corpora; only 254 instances are found
in this data with 97% of these being attributed to the Litcorp. In the Litcorp, NSR
agreement is found most frequently with thou and a majority of these instances
(92.0%) occur with the variant spelling theau as can be seen in (58-59).
(58) “That shows aw’m no’ used to buyin’ owt o’ th sooart.” “If theau wants a bit
o’ gradely stuff thee goo deawn to Muirhead’s i’ Victoria Street,” Siah said.
“Dunno thee buy common stuff!” (Litcorp)
(59) “Oh, aw’ll agree to that,” Jim said. “Then go to wark,” Juddie said, “an’ mind
heaw theau raises th’ tub. Theau’re shakin neaw as ill as if theau’re gooin’ t’
be hanged.” (Litcorp)
This use of the variant spelling may suggest that thou is considered by the writers of
form theau may be closely linked to the similarly salient or dialectal choice of non-
standard 3sg agreement. The possibility that instances of thou with 3sg agreement
may be more frequent than any other agreement pattern with thou overall (and
therefore entrenched in the mind of the writers as outlined in §4.2.4) is tested in Table
USE OF THOU (AND VARIANT FORMS) LITCORP SOUND ARCHIVE
Results from Litcorp lend support to the assertion of Pietsch (2005:6) who suggests
that thou always occurs with NSR agreement. While in the Sound Archive thou shows
no real preference for 3sg agreement, the total number of instances of thou is very low
Aside from those occurrences with thou, the remaining NSR results in the
Litcorp are most frequently found with adjacent non-3sg NP subjects e.g. (60) and
(61).
(60) It’s bad news fur coffin makkers, an’ th’ timber trade generally! Neaw, when
those fashions changes, some trades are allbut owver. When women gan off
wearin’ crinolines, th’ wire trade went deawn, an so did th’ boot trade, an’ th’
stockin’ trade. (Litcorp)
(61) “Tha keeps suppin’ it,” said Jonty. “Abit,” said Jimmy. “Ah don’t like to hurt
its feelings. Not when it’s out on its feet.” “Ah’ll bet yo’re wives is glad to be
shut on yo,” said Jonty. There were that big a fog on when Ah left,” said
Tommy, “as Ah’ll bet hoo doesn’t know Ah’ve gone.” (Litcorp)
instances found in the Litcorp data. This preference for NSR agreement with verb-
that 3sg agreement is the ‘default’ agreement pattern, except for when standard
agreement is provoked by adjacent personal pronouns. If this is the case in Lancashire,
closer look at the frequency of other similar agreement patterns in Table 8 explores
the possibility that constructional overlap or competition may affect the frequency of
Very few examples of the NSR were found in the Sound Archive; two such
(62) Wherever she was going and she'd to stand all sorts of insults, “what are tha
doing down here, you don't belong down here, you get back back up yon
where tha belongs” and they used to pick sods up and throw sods at her.
(Sound Archive)
(63) Now, there was a deterrent there just by the name. Nowadays well , it just
seems anything goes. Now I know drugs has accelerated it because you
know, they want money, but I think that the punishments have gone down and
down and so like anything goes. (Sound Archive)
Sentence (62) is a good example of the NSR, again found with an archaic personal
pronoun form. Example (63) is more problematic; it is possible that drugs could be
considered by the speaker as either singular or plural (i.e. they has/it has accelerated
it), and there is no clear way to resolve this possible ambiguity. There are two further
examples such as this in the Sound Archive, (64) being one of them:
(64) No we used to gut them on the deck and then your deck used to be sectioned
up into certain, you know when you went to sea, once you start fishing you
have boards and then really your decks looks like a criss cross of different
pounds here there and everywhere (Sound Archive)
Again here it is not completely clear in (64) if the speaker is referring to your decks in
the singular or plural. Tentatively both examples are included in the totals in Table 4,
although this further weakens the already insubstantial evidence for the NSR in
there are a number of other NSR ‘near misses’ in the Sound Archive; one such
(65) The form was, the reason was, you may not know this, that if there wasn't at
that time a bishop of Blackburn in residence, there was an inter waiting for a
new bishop and after a certain time, if there isn't a bishop then either [his
assistant or one of the suffrages, in this case people of Burnley], acts on his
behalf and after a certain period of time the gift of that job lapses to either the
Archbishop of York who's the next one up from being a bishop or the crown
and it had lapsed to the crown. (Sound Archive)
This initially appears to be a good example of the NSR, with the subject the people of
Burnley (or they) taking the 3sg –s ending. However, on closer inspection it is clear
that the subject of this sentence is actually the 3sg his assistant or one of the suffrages
(the NP is shown here in square brackets). In this example, agreement has been
maintained despite the distance between subject and verb; something that often does
The low frequency of NSR agreement in both the Sound
Archive and Litcorp data may be due to language change, with present day speakers
moving away from older agreement patterns that are perceived as nonstandard, such
as the NSR. Certainly, the presence of the NSR in the Litcorp and near absence in the
Sound Archive lends weight to this argument for diachronic change, although the low
frequency of the NSR in the Litcorp overall (and of course other differences between
most closely linked to thou. It is therefore possible that a decrease in this personal
pronoun form may have resulted in a decrease in NSR agreement if speakers and/or
Differences in frequency may also be due to differences in the purpose of the
texts in the two corpora - speakers in the Sound Archive may have actively avoided
using NSR agreement (as opposed to writers in the Litcorp who were aiming to
salient constructions are those which are overt in the speakers’ mind (Kerswill &
discussed further in Chapter 5) and it could be argued that, for example, nonstandard
more salient and therefore possibly more actively avoided) when compared to other
towards a more standard variety would be more likely to avoid the possibly more
stereotyped NSR verbal agreement form. However, without further data it is not
possible to know whether Lancashire (or indeed any other) speakers are
accommodating towards or away from what they perceive as a more standard variety.
Smith et al. (2007) found verbal –s to be used frequently in their study of children and
construction. It may be that the variation found in Lancashire is perhaps a result of the
purpose and aims of the data analysed here, rather than as a result of its conscious
perhaps less likely due to very low frequencies of the NSR found in the relatively
It may also be the case that the NSR is not considered as a particularly
‘Lancashire’ feature for the Litcorp writers (other than, arguably, with thou). While,
for example, lexical choice may be perceived as being more obviously regional,
certain aspects of grammar may not be. Moreover, as the NSR has already been
reported as being present in a relatively wide geographical area (see e.g. Pietsch
2005), Litcorp writers may not have included this feature in their writing as it was not
evidence for this, or any other hypotheses put forward here by looking only at the data
presented so far. With this in mind, this analysis now turns to the comparative
frequency results in order to outline the role of other agreement patterns in Lancashire.
Table 6 shows all agreement patterns with adjacent and non-adjacent pronominal
comparative frequency distribution between 3sg agreement forms (e.g. –s) and non-
3sg agreement forms (e.g. –Ø) are given for each variable tested.
LITCORP
                                      LEXICAL VERBS                  AUX VERBS
SUBJECT POSITION  SUBJECT TYPE        -s            -Ø             3sg ag        Non 3sg ag
Adjacent          non-3sg pronoun     203 (33.8%)   397 (66.2%)    61 (12.8%)    414 (87.2%)
Adjacent          3sg pronoun         330 (96.5%)   12 (3.5%)      339 (99.4%)   2 (0.6%)
Non-adjacent      non-3sg pronoun     549 (59.0%)   381 (41.0%)    43 (12.9%)    291 (87.1%)
Non-adjacent      3sg pronoun         382 (93.9%)   25 (6.1%)      256 (87.7%)   36 (12.3%)

SOUND ARCHIVE
                                      LEXICAL VERBS                  AUX VERBS
SUBJECT POSITION  SUBJECT TYPE        -s            -Ø             3sg ag        Non 3sg ag
Adjacent          non-3sg pronoun     154 (3.7%)    4035 (96.3%)   4 (0.5%)      864 (99.5%)
Adjacent          3sg pronoun         245 (84.2%)   46 (15.8%)     808 (100%)    0 (0%)
Non-adjacent      non-3sg pronoun     197 (8.8%)    2036 (91.2%)   16 (2.3%)     681 (97.7%)
Non-adjacent      3sg pronoun         263 (89.5%)   31 (10.5%)     389 (98.7%)   5 (1.3%)

TABLE 6. TESTING PRONOUN ADJACENCY
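The percentages in Table 6 are within-row shares, calculated separately for lexical and auxiliary verbs, so each pair of cells sums to 100%. The sketch below recomputes two of the Litcorp lexical verb rows from their raw counts. The function name row_shares is hypothetical; the counts are those given in the table.

```python
def row_shares(count_3sg, count_non_3sg):
    """Return the percentage share of each agreement form within one row."""
    total = count_3sg + count_non_3sg
    if total == 0:
        return (0.0, 0.0)
    return (round(100 * count_3sg / total, 1),
            round(100 * count_non_3sg / total, 1))


# Litcorp, lexical verbs, adjacent non-3sg pronoun: 203 -s vs. 397 -Ø.
adjacent = row_shares(203, 397)

# Litcorp, lexical verbs, non-adjacent non-3sg pronoun: 549 -s vs. 381 -Ø.
non_adjacent = row_shares(549, 381)

print(adjacent, non_adjacent)
```

Recomputing the shares in this way is also a quick check that the two percentages in a cell pair are consistent with the raw counts beside them.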
In Table 6, any possible NSR agreement with pronouns falls into the category of non-
adjacent non-3sg pronoun with 3sg agreement (shown in boldface.) While 592
examples of this agreement pattern were found in Litcorp and 213 in the Sound
Archive, as we know from Table 4 (which shows NSR results only), a majority of
these examples are not instances of the NSR. Instead, many of these results are
habitual or historical present constructions. Aside from those examples with archaic
personal pronouns, no examples of the NSR with pronominal subjects are found in the
Sound Archive and only 8 examples are present in Litcorp, e.g. (66)
(66) Jolly good feed this, guv’nor. Sweep like a machine if the sweeper kims
round. Hope it’ll kim before the ladies turns out; they sweeps it all up with
their togs they does. Hullo! there he goes! (Litcorp)
As was the case with the previous data, the sparseness of NSR results in this
data makes it impossible to make strong claims about the possible effect of pronoun
adjacency on the NSR in Lancashire, but relatively easy to suggest that this pattern is
LITCORP
                                      LEXICAL VERBS                  AUX VERBS
SUBJECT POSITION  SUBJECT TYPE        -s            -Ø             3sg ag         Non 3sg ag
Adjacent          non-3sg             201 (31.6%)   436 (68.4%)    65 (13.0%)     436 (87.0%)
Adjacent          3sg                 134 (82.2%)   29 (17.8%)     360 (97.3%)    10 (2.7%)
Non-adjacent      non-3sg             0 (0.0%)      121 (100%)     1 (0.5%)       210 (99.5%)
Non-adjacent      3sg                 102 (98.1%)   2 (1.9%)       68 (93.2%)     5 (6.8%)

SOUND ARCHIVE
                                      LEXICAL VERBS                  AUX VERBS
SUBJECT POSITION  SUBJECT TYPE        -s            -Ø             3sg ag         Non 3sg ag
Adjacent          non-3sg             164 (3.6%)    4082 (96.4%)   9 (0.5%)       1745 (99.5%)
Adjacent          3sg                 429 (78.4%)   118 (21.6%)    1212 (99.8%)   3 (0.2%)
Non-adjacent      non-3sg             4 (2.0%)      197 (98.0%)    2 (0.4%)       460 (99.6%)
Non-adjacent      3sg                 429 (78.4%)   118 (21.6%)    1212 (99.8%)   3 (0.2%)

TABLE 7. TESTING NON-PRONOMINAL SUBJECT ADJACENCY
Results with all subject types from both corpora show that standard agreement
with both lexical and auxiliary verbs (i.e. 3sg pronouns with 3sg agreement, non-3sg
pronouns with non-3sg agreement) is more frequent than other agreement patterns.
The data in Table 7 suggests that nonstandard 3sg agreement is more frequent
in the Litcorp than in the Sound Archive data. This is particularly noticeable with
lexical verbs, as 31.6% of all lexical verbs in this corpus occur with this agreement
pattern. However, as we know from Table 4, not all of these instances of non-3sg
subjects with 3sg agreement are examples of the NSR; in fact most were instead categorised as
able to occur with 3sg agreement in non-3sg contexts yet are unaffected by the subject
type and subject position restrictions. In this study, nonstandard 3sg examples are
wider context that they occur in. This methodology differentiates the current study
from other studies; a narrow scope to any investigation may lead to an inaccurate
analysis. More concretely, many of the results which at first appeared to display the
NSR were upon further analysis found not to, e.g. (67).
This example seems to be a good NSR example with 3sg stands occurring with the 1pl
we. However, a look at the wider context of this utterance reveals that this is in fact a
(68) We all stands outside the pub. Suddenly he comes running out shouting that
we’re late and we’ve missed the train. (Sound Archive)
Instances such as (67-68) clearly exemplify the need to look at the wider context and
again raise concerns with respect to the accuracy of previous claims made as to the
Constructional variation with 3sg agreement is explored further in Table 8.
The NSR results presented in Table 3 are included again here for comparison.
The historical present is the most frequent of all of the nonstandard 3sg agreement
patterns; a total of 561 instances of historical present constructions are found in the
(69) So I gets on the train and he says “you look tired” I says “aye I am” he says
“well you get your head down. So where do you want to get off?” I said
“Preston” He said “oh you get your head down and we’ll give you a shake
when we get to Preston. So I goes into a deep sleep and the next thing I felt
the train jerking, looked through t’window, Crewe! (Sound Archive)
As with example (69) above, many instances show tense variation with the use of the
historical present, often using past tense forms alongside present tense 3sg forms.
The frequency of the historical present construction may be explained (in part)
by bias within the corpus; the oral history interviews in the Sound Archive feature
dialogues that concentrate very heavily on narrating past events using the historical
(70) Anyway, one day I said to her one day I says “what’s the matter?” I says
“Why won’t you mix with the other girls?” I said “they want to be friendly,
but” I says “you just won’t co-operate with them at all.” (Sound Archive)
This aside, this frequency remains significant. It may be the case that the
prevalence of this 3sg form used without subject type and subject position restrictions
has affected the frequency of the NSR. The high frequency of the historical present
may mean that Lancashire speakers/writers associate the 3sg –s (and related forms of
irregular verbs) more frequently with the historical present rather than as a marker of
agreement. This issue is complicated further by the presence of the structurally similar
(71) It’s a mystery of nature, said Young Winterburn. Like us bein’ here at
o’. Ah sometimes wonders why we are here. (Litcorp)
As shown above, only habitual examples that have a relevant adverb phrase are
without an adverb phrase and this is problematic when identifying instances of the
NSR. While the distinction between the historical present and the NSR is more
obviously based on the tense as given in the context of the sentence, habituality is
Habitual aspect Non-habitual aspect
Here it is clear that a sentence may only be described as a good example of the NSR
with any certainty if that sentence does not convey habitual aspect and refers only to
the present time. Interpreting aspect from corpus results can be difficult, as shown in
(72) Jolly good feed this, guv’nor. Sweep like a machine if the sweeper kims
round. Hope it’ll kim before the ladies turns out; they sweeps it all up with
their togs they does. Hullo! there he goes! (Litcorp)
Here it is possible that the speaker intends something like the ladies always sweep it
all up but this is partly speculative. It is probable that this constructional competition
and overlap between constructions (shown in Table 9) combined with the high
frequency of the historical present and habitual constructions (which of course have
no subject type or subject position restriction) has resulted in such a low number of
instances of the NSR. Lancashire dialect speakers may not associate 3sg agreement
forms with the present indicative only, but instead use this construction to indicate
habituality or present tense in past tense contexts, thus giving the distribution found in
may also be the case that the frequent occurrence of the historical present construction
found in the Lancashire data does not affect instances of the NSR, and instead reflects
Aside from the constructions outlined in §4.4.1 and §4.4.2, other nonstandard
agreement patterns are found in the corpora. There are 102 examples of adjacent 3sg
(73) Well we told the skipper, he says “oh he's not blind” he says, “he just wants to
go back, he don't want to go to sea for Christmas”. (Sound Archive)
(74) I used to swing that round so your centrifugal force kept the milk in and you’d
twirl it round like that, it don't come out. (Sound Archive)
(75) They let him wed Joe Tinker’s widow, ut says hoos waitin for mi shoon,
becose if he is a bit of a foo’ sometimes, he are too good a mon to throw
away upo’ sich like as her. (Litcorp)
This pattern is more frequent than the NSR, and is also found in other regions of the
UK, e.g. in East Anglia (Britain and Rupp, 2005) and Buckie Scots (Smith and
Tagliamonte, 1998). Most frequently in Lancashire, variation of this type (3sg subjects
with non-3sg agreement) is found with come and to a lesser extent, give, as shown in
e.g. (76-77).
(76) So then we went in we got into Stornoway. The the lifeboat come out to us
and the er some of the fishing boats you know. And the old man says don't
take any any ropes or owt the lifeboat’s coming. (Sound Archive)
(77) And I said, “well I don't know, I'll have to go and ask her” so he give me
money for t'bus and I went up and asked me Mother if, me Grandad lived with
us so me brother was all right, me Grandad would look after him you see.
(Sound Archive)
with both examples here using the historical present. Tagliamonte (2001:44) refers to
variation with come as Past Reference Come. Tagliamonte suggests that come/came
variation is also present in York, and indeed is a well-known non-standard
The data also returned a number of constructions which were excluded from
everybody in (78) which may be interpreted as a plural, and those instances that could
not be definitively resolved from their context as the subject of the sentence is unclear,
e.g. (79).
(78) He 'd just stand there would the butcher, newspaper in his hand, handful of
mince meat, couple of neck end chops, two sausages, a real Jacob 's joint and
he would hold it up in the air, who'll give me two bob for this, well
everybody shout out but we were fortunate again there because my dad was
an old mate of the butchers (Sound Archive)
(79) So they were all, and goes over to there and says “you don’t know me do
you?” (Sound Archive)
exists within the Lancashire corpora. A corpus analysis of 3sg variation in present
tense verbs has shown that NSR agreement is infrequent in Lancashire, but where the
NSR does occur, no adherence to the subject type and subject position is found. This
analysis now examines similar 3sg variation with past tense forms of BE; a
Reading (Cheshire 1982); York (Tagliamonte 1998); the Fens (Britain 2002) and is
Nonstandard was/were variation typically has three different distribution patterns. The
first, and most common, involves levelling to was across person, number and polarity
(see e.g. Chambers, 1995; Tagliamonte and Smith, 1999; Malcom, 1996). The second
involves levelling to were in negative polarity contexts and, to a lesser degree, was in
positive polarity contexts (see e.g. Trudgill, 1999; Cheshire, 1982; Tagliamonte,
1998). The third, and less frequent pattern, involves levelling to were in both positive
and negative polarity clauses (e.g. in nearby Bolton, Shorrocks, 1999; Moore, 2003).
Hollmann and Siewierska (2006:25), using a subsection of the data used in this study,
suggested that the past tense BE paradigm showed levelling towards was but, more
interestingly, also towards were. These results are now tested on both the Litcorp and
These results show that nonstandard use of was/were, e.g. (80), is more frequent in the
(80) When we were kids if there was anything wrong with us, boils or anything
like that, you never went to the doctor's, you were sent round to Grandma
Wheelers, and she were terrifying she were. (Sound Archive)
As with the preliminary results from Hollmann and Siewierska (2006:25), both
corpora show that variation is found in both directions (levelling towards was and
were). A further analysis of sentence polarity now tests whether or not this variable
affects was/were choice in Lancashire, as found in other studies (e.g. by Cheshire and
Fox, 2006:3).
                        negated        non-negated
Litcorp         was     7 (41.2%)      10 (58.8%)
Litcorp         were    869 (59.6%)    590 (40.4%)
Sound Archive   was     188 (43.1%)    248 (56.9%)
Sound Archive   were    302 (50.6%)    295 (49.4%)

TABLE 11. NEGATED VS. NON-NEGATED NONSTANDARD WAS/WERE RESULTS.
Was levelling, and perhaps more interestingly, were levelling, is relatively frequent
(81) Yeah, I carried on in this shop and I weren't making much money. When all
was paid out and everything I'd only about five bob left which wasn't enough.
Well there's, that shop was round the corner and come on t' front here and I
took one here and that was better by 10 pound a week. (Sound Archive)
(82) There were, he were a poultry farmer. He were a loomer at first. Then he
were a poultry farmer you see they were all allotments and poultry farms and
pig farms and er. I had another brother what er were dairyman at er Townley.
(Sound Archive)
This differs from other regional varieties where levelling to were is suggested to occur
mainly in negative polarity contexts. These was/were results for Lancashire conflict
with the earlier hypotheses which suggested that in regions where NSR agreement is
pattern (i.e. was) can be extended to all contexts – this opposes the NSR agreement
pattern where 3sg patterns are extended to non-3sg contexts. These results are
was/were variation appears to be unrestricted by any proposed analogy with the NSR.
This may suggest that other variables, such as constructional frequency, may
affect was/were variation in this region. Recently, the usage-based model has received
discussion of this). The usage-based model (see e.g. Croft and Cruse, 2004: 291-327)
suggests that constructions that are more frequent become more entrenched in the
mind over time. This means that if this nonstandard use of were is used frequently by
the two forms may often appear to be phonologically similar, e.g. (83-84).
(85) Yeah they were right big they was, two big strapping lads and er, and the
sister, she was called Lizzie, Elizabeth but we always called her Lizzie, Miss
Lizzie. (Sound Archive)
(86) No, no, I was going to shoe horses with Joe Littleun, I were going to shoe
horses. (Sound Archive)
were found in the corpora, although these were most frequent in the Sound Archive
data. Clefted sentences were found with all combinations of was/were e.g. (87-88).
(87) Oh aye, Morecambe was a great place for entertainment during the war it
was. (Sound Archive)
(89) That were a sad job were that. (Litcorp)
An analysis of the corpus data has revealed that while variation in verbal
agreement is frequent in Lancashire, instances of the NSR are rare, existing almost
exclusively with the archaic personal pronoun thou. Perhaps unsurprisingly then,
was/were variation also shows no restriction with respect to subject type or position.
As mentioned previously, possible biases due to the nature of the corpus data (for
example, frequent use of the past tense in the Sound Archive, possible stylistic
motivations in Litcorp) may have skewed these results. As the instances of NSR in the
corpus were too infrequent to enable the testing of constraints such as subject type and
subject position, the questionnaire is used in order to explore possible variation such
as this.
The questionnaire reached 269 informants. Of these, 243 completed the questionnaire
in its entirety and are included in the results shown in §4.4.6. 103 informants were
from a mixture of English regions. The other 140 informants were targeted via social
networking websites, and were asked to fill in an online version of the questionnaire.
Online informants were then encouraged to pass on the questionnaire to any of their
colleagues, family or friends that they felt were also likely to respond. Most online
participants were of a mixed age range and from a number of different regions,
although the majority were from Lancashire or the North West, with the average age
being 36.
The tables in this section present the scores of the grouped participants (see
§4.3 for a discussion of groupings and participants). Participants were asked to assign
scores from 1 to 5 to test sentences, with 1 being judged by them as the least
acceptable and 5 as the most acceptable. A full copy of the questionnaire can be found
As discussed in §4.3, the dialect speaker vs. non-dialect speaker distinction between
informant groups was made by the informants themselves in response to the question
differentiation is made in order to test whether or not dialect speakers are more likely
The five-point scale used in this test allows statistical analyses and
status to be approximated on the assumption that intervals between each of these five
values are the same. This allows mean scores to be calculated. The overall median
results for all respondent groups are shown in Table 13. The mean score is shown
The questions posed in the survey can be grouped by the particular subject
type or position restriction being tested, e.g. adjacency, heaviness etc. These results
                                     constraint type
respondent group                     adjacent    non-adjacent   heavy     “normal”
                                     personal    personal       NPs       agreement
                                     pronouns    pronouns
Lancashire (dialect speakers)        3 (2.4)     1 (1.7)        2 (2.3)   4 (4.0)
Lancashire (non-dialect speakers)    2 (1.7)     1 (1.6)        2 (2.0)   4 (4.3)
Other north (dialect speakers)       2 (2.0)     1 (1.5)        2 (2.1)   4 (4.3)
Other north (non-dialect speakers)   2 (1.8)     1 (1.5)        2 (2.0)   4 (4.4)
South (dialect speakers)             2 (1.8)     1 (1.2)        2 (1.9)   4 (4.2)
South (non-dialect speakers)         2 (1.8)     1 (1.1)        2 (1.9)   4 (4.2)
TABLE 13. MEDIAN AND MEAN ACCEPTABILITY SCORE BY ALL RESPONDENTS, GROUPED RESULTS
Both the Mann-Whitney U-test for median values and the t-test were employed in
order to test the significance of this data. All respondent groups were compared to one
another for all constraint types. Many of the results showed no significance, thus
suggesting that speakers from different regions, in many cases, found certain
equally unacceptable, often giving the lowest possible score. This unacceptability ties
in with the low frequency of NSR agreement found in the Lancashire corpora. It may
most speakers.
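For readers unfamiliar with the Mann-Whitney U-test, the statistic itself can be sketched in a few lines of Python. The sketch below computes only the U values for two groups of ordinal scores, using midranks for ties; it does not compute a significance level, and it is not the procedure or software used in this study. The function names and the scores are invented for illustration and are not questionnaire data.

```python
def midranks(values):
    """Rank values from 1 to n, giving tied values the mean of their rank positions."""
    positions = {}
    for index, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(index)
    return {value: sum(idx) / len(idx) for value, idx in positions.items()}


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples of ordinal scores."""
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[score] for score in group_a)
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Invented acceptability scores on the 1-5 scale (illustration only).
group_one = [3, 2, 3, 4, 1, 3]
group_two = [1, 2, 1, 2, 2, 1]
u_one, u_two = mann_whitney_u(group_one, group_two)
print(u_one, u_two)  # 30.0 6.0
```

Because the test works on ranks rather than on the scores themselves, it does not require the assumption that the intervals between the five scale points are equal, which is why it is paired here with the t-test on mean scores.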
The most significant survey results come from the respondent group
Lancashire dialect speakers. Adjacent personal pronouns returned both mean and
median results that are statistically significant at a confidence level of 94%. This
occurring with nonstandard verbal agreement to be more acceptable than any other of
the tested constraints. Again, this further substantiates earlier findings from the corpus
results, which, unlike definitions of the NSR in the literature, also showed that
nonstandard 3sg agreement occurs with adjacent personal pronouns very frequently. A
further breakdown of these results for adjacent personal pronouns with each individual
Here it can be seen that the test sentence containing I talks (I talks to the man for a
while) is the most acceptable of all test sentences in this category, with this being the
case particularly for Lancashire dialect speakers. The acceptability of sentences such
as this may be linked back to constructional overlap; this sentence could be considered
phrase) by the survey respondents. This would suggest that Lancashire dialect
speakers find adjacent non-3sg pronouns with 3sg agreement to be more acceptable
than other groups do, perhaps due to the frequency of competing constructions (see e.g.
Table 8).
analysis of the results suggests that often informants are reluctant to choose 1 or 5,
with sometimes even ‘normal’ sentences not being given the highest score.
Conversely, there were also participants who only gave scores of 1 or 5, i.e. a
yes/no-type response. This aside, combined with the substantial corpus data, these results
This analysis of possible instances of the NSR in Lancashire has revealed that while
3sg agreement in this region is subject to considerable variation, the situation is far too
variation that conform to the NSR are extremely rare in the Lancashire data,
contrastive patterns frequently detailed in the literature, such as we peel ’em and boils
’em (Ihalainen 1994:221) were found, and aside from those instances with the archaic
pronoun thou, only fifty-five clear and outright instances of the NSR were found in the
combined corpora of over 800,000 words. As the NSR is not a frequent construction
Linked to this lack of instances of the NSR, variation found with past tense BE
does not conform to subject type, position or polarity constraints and is found
comparatively rare in most varieties of English. The lack of NSR results does not
suggest, however, that this chapter has presented no findings; over 4,000 instances of
nonstandard verbal agreement are identified in the corpus data. Most frequently,
instances of agreement variation in the Lancashire data involve a direct flouting of the
subject position and subject type restrictions specified by the NSR, e.g. (91) and (92).
(91) Ah likes a good hymn tune (Litcorp)
(92) I used to swing that round so your centrifugal force kept the milk in and you’d
twirl it round like that, it don’t come out. (Sound Archive)
The violation of position and type restrictions (namely, adjacent non-3sg pronouns
found with 3sg agreement) is further substantiated by results from Lancashire dialect
acceptability score (as compared to other respondent groups) for non-3sg pronouns
with 3sg agreement. Possible reasons for this distribution may lie with the frequency
of both the historical present and habitual construction in the corpus data. While these
frequencies may be due in part to corpus biases, they do still provide a good basis to
suggest that these constructions are prevalent in the region and therefore interfere and
overlap.
One of the main difficulties in this analysis lies in delineating the NSR with
the NSR as compared to the historical present is quite clearly a question of relative
time (and is usually easily resolved from the sentence context), this is trickier for
the validity of some previous NSR claims, such as those outlined in §4.3.1. Certainly,
in regions such as Lancashire where 3sg forms can indicate the habitual aspect (with
or without adverb phrases) these 3sg forms may be re-analysed as a marker of habitual
semantics alone, and extended into pronoun-adjacent contexts rather than be a marker
of agreement as suggested by Pietsch (2005) in sentences such as the sheep bleats, and
As outlined in §4.3.1, much of the previous research into the NSR has been
dominated by theories, rather than being informed by empirical data. This study goes
some way towards redressing that balance, although further research, particularly on
testing the boundaries between construction types and their effect on patterns of
agreement is needed.
Chapter 5. Salience
5.1 Introduction
it cognitively or perceptually prominent, both for speakers of the dialect and speakers
implications for theories of language variation and change in regard to, for instance,
(e.g. Bardovi-Harlig, 1987) and language contact (e.g. Trudgill, 1986). A number of
studies have discussed how variables become salient, but how saliency could be
investigated on the basis of corpus data, let alone quantified and evaluated, is quite a
Lancashire dialect literature. The main grammatical features that are considered in this
chapter are set out in Figure 1. The rationale for selecting these features in particular is
discussed in §5.3.1.
A methodology for comparing spoken language to dialect literature is described and
constructions that the speakers believe encapsulates their variety – via its salient
results, but by comparing dialect literature to spoken language the difference between
of this methodology will allow us to arrive at some idea of which of the grammatical
features listed in Figure 1 emerge as salient in terms of their distribution across the
corpora and which may stand out as being primarily produced or primarily perceived.
used in sociolinguistics” and from a survey of the literature, it seems that salience is
indeed used to refer to a number of different concepts in slightly different ways. While
prominent, sometimes salience describes awareness of the listener, i.e. how readily a
particular variant is perceived or heard (e.g. Mufwene, 1991) and on other occasions it
relates to awareness of the speaker (e.g. Hickey, 2000:57). Salience is also sometimes
used to refer to a non-linguistic factor that the context or participants may have
1998:10) and discourse salience (e.g. Prasad and Strube, 2000). Often, salience is seen
to be gradable, i.e. certain variables are considered as being more salient than others.
Thus for instance, as discussed further in §5.2.1, markers are seen as less salient than
indicators and these less salient than stereotypes (Labov, 2001). Trudgill (1986)
“ordinary” salience. A clear way of testing hypotheses such as these using corpus data
is yet to be established.
Kerswill and Williams (2002) and Hollmann and Siewierska (2006) agree that
independent factors underlie salience, and Hollmann and Siewierska suggest that
both structural and external or extralinguistic factors. Although the factors affecting
and/or determining salience are of course relevant to this study, the focus here is not
its journey towards becoming salient in one particular dialect. Instead, this chapter
aims to test a methodology for quantitatively examining salience and to describe how
this can be applied in order to outline the salient features within a particular dataset.
The outcomes of this analysis should then allow more concrete claims to be made
about the behaviour and distribution of particular constructions based on corpus data.
used in the literature to describe the status of linguistic features such as enregisterment
(Labov, 2001). While these concepts are not considered in this chapter at length, it is
useful to outline how they may relate to and interact with salience.
Enregisterment is the identification of a set of linguistic norms as ‘a linguistic
variables that mark out a specific scheme of cultural values by the speaker. It therefore
follows that dialect literature used in this thesis might be considered as the
enregisterment of those Lancashire features by the writers of such material, i.e. those
features that the writers consider to typify this regional variety (see e.g. Beal, 2000;
often related to complexity (e.g. Croft, 2002; Greenberg, 1966) where the marked
this comparison with salience appears to work to some degree. However, salience is
different to markedness in that a salient form has no real tendency towards being the
more complex as compared to its semantic counterpart (compare e.g. the “marked”
definite article deletion as compared to the “unmarked” the). Nor do such cases fall
Anderwald, 2003) where the expected pattern (i.e. complex = marked, simple =
(1988:39) equates the term “unmarked” with ‘regular’, ‘normal’, ‘usual’; and
characterization does not fit with the concept of salience, where certain nonstandard
features (e.g. was/were variation) are far from ‘abnormal’ or ‘exceptional’ in
the data. Along with this, salience is typically used in the literature to relate to
nonstandard forms – markedness can refer to two opposing but acceptable features.
outlines a stigmatization hierarchy (2001), where features are divided into markers,
indicators and stereotypes depending on how closely linked they are to a particular
group in society. Indicators are described as not showing any change in style, but
instead vary with respect to social stratification. Markers show both social and
overtly attached. Stereotypes not only have well-known social meanings, but are
generally stigmatized and often actively avoided. It is hoped that a comparison of the
should enable the identification of these differences. The dialect literature corpora can
spoken corpus and also in dialect literature but are rare in Standard English may be
considered as indicators. Any features that occur across all corpora and perhaps are
also reported in other varieties of English, or perhaps even in Standard English, could
be considered as markers.
As with other chapters in this thesis, the analysis in this chapter is based upon spoken
data from the Sound Archive corpus and written dialect data from Litcorp (please see
Chapter 1 for a further discussion of these sources). When exploring salience in these
two very different corpora a number of factors must be taken into account. Although
the Sound Archive data is a transcription of the speech of Lancashire dialect speakers,
it may be that certain nonstandard forms are recognized by these speakers as being
dialectal. It is therefore plausible that the informants in the Sound Archive may
their own knowledge of their local dialect and the way in which they wish themselves
to be portrayed (i.e. as more or less dialectal, see e.g. Hollmann and Siewierska 2006).
Indeed, often salient features are categorised as such by the speaker’s readiness to
accommodate away from them (see e.g. Kerswill and Williams, 2002; Hollmann and
circularity into this methodology – how can we know if features are salient by looking
at corpus data, if speakers who contribute to that corpus data also know that certain
features are salient too and so actively up/downplay them? Accommodation such as
this is difficult to factor into any analysis; it is difficult to know which speakers may
have accommodated their dialect, when this happens and also in which direction(s).
The inclusion of data from Litcorp does go some way to working around the
above problem. As Litcorp is not a transcription of real Lancashire dialect speech, but
judged by the writers in question. This use of an extensive dialect literature corpus in
order to quantify salience is original – no other attempts to examine and compare data
such as this can be found in previous studies. However, as (on average) around 100
years separate the data in Litcorp and Sound Archive it is possible that variation due
to diachronic change could influence results. To consolidate the results from Litcorp
and to broaden the diachronic span of the dialect literature, new corpus data has been
reproduce a story that was familiar to them – a fairy tale. This new corpus is named
Lancashire Fairytales, a sample of which is shown in (1) (further examples are given
in Appendix G). More details on how this new corpus was collected are provided in
§5.3.2.
(1) An Cinderella were havin a gradely time at Ball wit Hansome Prince. Then,
she looked at time and said “oooh eck, I’ve gorra dash love, or I’ll turn into
some right nasty vegertable!” An off she dashed, right down road.
(Lancashire Fairytales)
5.1.4 Aims
The main aim behind the methodology employed here is to contrast production and
perception in the Lancashire corpus data. Concretely, this involves evaluating the
perceived features of Lancashire grammar as set out in dialect literature against the
if there are features that are perceived as part of the Lancashire dialect but occur rarely
in speech, and if there are features of the dialect that do occur in speech (and are
across the corpora, this does not automatically mean that this feature is a salient
feature of the Lancashire dialect (although naturally this is also possible). Lancashire
sources could indicate a feature of Standard English, rather than a salient feature of
the Lancashire dialect. This can be better demonstrated using the diagram in Figure 2
which shows the possible intersect between standard, dialectal and salient features.
[Figure 2. Intersect between standard, dialectal and salient features; surviving label: “Constructions salient to the dialect”.]
Although from the outset the grammatical variation examined here is intended
to be ‘nonstandard’ (i.e. those features set out in Figure 1) again, that does not entail
‘Lancashire’. Even writers who intend to write in Lancashire dialect use varying
to note that the salience of grammatical features need not be uniquely tied to a specific
region. While some features may typify a particular region alone (regional words are a
good case in point), other features may occur in several areas (e.g. definite article
reduction found in both Lancashire and Yorkshire; variation with BE found in various
locations).
It may also be the case that certain constructions are more salient than others
and this in turn may have implications for their corpus frequency. For example, the
may stand out more than other nonstandard features, such as definite article deletion,
certainly when spoken. This can be appreciated perhaps on the basis of the examples
(2) If you if you was fourteen you was ready for work. (Sound Archive)
(3) I don't know whether I, mind you must have had milk there because we
used to call for Ø milkman one of Ø lads did. (Sound Archive)
As a consequence of the above, one may speculate that instances of more salient
nonstandard verbal agreement as shown in (2) are likely to be found less frequently in
speakers who wish to adhere to overt prestige forms, and more frequently in speakers
wishing to adhere to potential covert prestige forms (i.e. dialect forms). As mentioned
previously, speaker attitudes are not consistent, and although the methods employed
accommodation, it is not feasible to employ such methods here alongside other aims
of the chapter. Consequently, accommodation is not treated as such in this chapter, but
5.2 Rationale
A number of researchers have outlined the factors which may influence and govern
the amount of salience that is associated with particular linguistic constructions, citing
use (Bardovi-Harlig, 1987), and a combination of these along with social factors
accounts of salience are complex and cannot be considered at length here. Instead, my
analysis now turns to how salience can be measured using corpus data of different
types.
5.2.1 Using dialect literature
Perhaps the most obvious feature of dialect literature lies in the semi-phonetic
M. – Why, whot’s bin th’ matter, hanney fawn eawt withur
Measter ?
T. – Whot ! there’s bin moort’ do in a Gonnart much, I’ll
uphowd tey ! – For whot dust think ? bo’ th’ tother Day boh
Yusterday, hus Lads moot’d ha’ o bit on o Hallidey, (becose
Here we can clearly see both grammatical variation but also significant semi-
then these features give an extra layer of significance to the grammar and lexis as
chosen by the writers of dialect literature. While of course respellings naturally lend
themselves to a phonological analysis, I argue that they are also interesting in terms of
whether or not the distribution of these respellings may interact with instances of
explores grammatical variation in Lancashire more widely. So far this thesis has
avoided grammatical features that have been the focus of previous studies in
Lancashire e.g. definite article reduction/deletion (Hollmann & Siewierska, 2006;
2011); ditransitives (Siewierska & Hollmann, 2005); and possessive me (Hollmann &
Siewierska, 2007). Instead, the approach here aims to uncover significant variation in
a larger selection of grammatical features, rather than analyse specific features only.
features are tested. As outlined previously, the Sound Archive and Litcorp were
subject to a fine-grained analysis at the beginning of this project that highlighted both
the constructions that have been addressed in previous chapters along with a number
of those included in this analysis (e.g. what as a subject relative, was/were variation).
nonstandard varieties of English (e.g. by Cheshire, Edwards and Whittle, 1989; Beal,
‘Lancashire’ variables, the extent to which these other nonstandard constructions (such
as lack of plural marking and never as a negator) are frequent in Lancashire can be
tested, along with whether or not they are perceived by Lancashire dialect writers as
phonological, lexical and discourse variation are also associated with regional
variation and as such can also be variably salient. Phonological features (as
represented through nonstandard spelling) and lexical choice are discussed in brief in
§5.4.4.
5.2.3 Summary, research questions and hypotheses
An analysis of the literature has shown that salience is often cited as a reason for
language variation and change. It is also clear that definitions of salience vary,
particularly in terms of emphasis (e.g. variation perceived by the speaker, the listener
or both) and factors that influence salience are complex and interrelated. While these
considerations undoubtedly impact upon any findings presented here, this thesis takes
a more methodological approach. The approach here is not to define what makes
dialect. Currently there is a lack of suitable methodologies to test out and uncover
salience, and to describe which features may be considered as salient, based on a wide
range and large amount of corpus data. More specifically, this chapter addresses the
following themes:
b) According to the data used here, which features of the Lancashire dialect are
salient?
5.3 Methodology
new idea. It is hoped that such a measurement will provide new insights that go some
way towards finding out what speakers actually consider to be salient features of their
dialect. Alongside the distribution of the relevant features in the two corpora, their
distribution in a reference corpus of Standard English (in this case the BNC) will also
be taken into account in order to help adjudicate whether or not the features in
question are salient in Lancashire dialect or simply frequent in Standard English more
generally.
lists and keyword analyses are used to identify and explore grammatical variation.
e.g. Shorrocks, 1999; Hollmann and Siewierska, 2006; 2007; Siewierska and
variation found in the UK more generally (e.g. by Cheshire, Edwards and Whittle,
1989; Beal, 2004; Kortmann and Szmrecsanyi, 2004). As not every possible
interesting result can be presented within the limits of this chapter (or even within this
thesis), only a selection, presented in Figure 1, are included here, with others
summarised in §5.4.
The overarching idea behind the methodology employed involves comparing the
often all instances of each feature occur in total (used both in a standard and
nonstandard way) so that the percentage of nonstandard uses can then be ascertained.
Percentages rather than raw or normalised frequency figures are used because these
are more informative when comparing a larger number of variables. In short, ten
nonstandard instances of a frequent feature are quite different to ten instances of a rare
one. However, in some cases this methodology requires sensitive application. Where
the alternation between standard and nonstandard constructions is fairly fixed and
restricted (e.g. use of was as opposed to were) the methodology proposed here gives
useful results. Kerswill and Williams (2002:100) take the stance that the lack of full
semantic equivalence between variants means that these variants should be omitted
from the analysis. I instead agree with Hollmann and Siewierska’s assertion that this
stance is perhaps too strong (2006:28). Instances that do not have clear or obvious
on their own merits but should not be excluded outright. By way of illustration
was right big. What should this construction be compared to in Standard English? One
without an adverbial but with a stronger adjective such as it was enormous, or it was
gigantic. This may also vary from speaker to speaker. In instances like this, ideally the
nonstandard construction (in this case right + adjective) must be compared against all
construction’ in order to avoid extensive searches for each grammatical feature under
combinations were retrieved from the corpus. This list of constructions (along with
examples) was then presented to the Lancashire dialect speakers test group (see §1.3
for further information) who identified the instances where they judged that right
could be used (e.g. very + adjective, extremely + adjective but not always + adjective).
Results for right + adjective were then compared against only these ‘acceptable’
forms, in order to give as accurate a score as possible. This method was used with all
other features that had multi-construction options for their standard form.
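The percentage measure described above can be sketched as follows. This is a minimal illustration, assuming that the score for a feature is simply the nonstandard count divided by all instances (standard plus nonstandard). The function name nonstandard_share and every count below are hypothetical and are not taken from the corpora.

```python
def nonstandard_share(nonstandard, standard):
    """Percentage of all instances of a feature that are nonstandard."""
    total = nonstandard + standard
    if total == 0:
        raise ValueError("feature does not occur in this corpus")
    return 100 * nonstandard / total


# Ten nonstandard tokens of a frequent feature versus ten of a rare one
# (hypothetical counts).
frequent_feature = nonstandard_share(nonstandard=10, standard=990)
rare_feature = nonstandard_share(nonstandard=10, standard=10)
print(frequent_feature, rare_feature)  # 1.0 50.0

# Multi-construction case: only the patterns that the test group accepted as
# equivalents of "right + adjective" enter the denominator (hypothetical counts).
candidates = {"very": 120, "extremely": 15, "always": 40}
accepted = {"very", "extremely"}
standard_total = sum(n for word, n in candidates.items() if word in accepted)
right_share = nonstandard_share(nonstandard=45, standard=standard_total)
print(right_share)  # 25.0
```

The first pair of figures shows why raw counts would mislead: the same ten tokens amount to 1% of a frequent feature but 50% of a rare one. The second shows how restricting the comparison set to accepted equivalents changes the denominator and hence the score.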
matching constructions may have lower scores than results that have a more restricted
variants. This could have undesirable implications if the frequency of each feature
importance since each individual feature is compared only to instances found across
For example, should all of the instances of dislocation (such as he were nice, were Mr
Jones) and of the discourse marker see (e.g. I’ll put it away for you, see) be compared
markers? It seems obvious that comparing sentences that include features of this type
to all other sentences in the corpus is not appropriate. Both dislocation and discourse
markers are used in particular contexts and for particular purposes and can appear
idiosyncratically both within the same text and from speaker to speaker. In order to
shown in §5.4.4.
Once the degree of nonstandard use for each feature is established using the
methods outlined above, the score for each feature is averaged across the three
corpora. Then, the positive or negative deviation from this average in each corpus can
one corpus as compared to another rather than comparing the raw standard-to-
nonstandard usage scores in each corpus. This gives a better indication of how the
corpora compare. The BNC data is not averaged in the same way, but instead the
word forms (e.g. t’, or were etc). Other variation is found by searching for more
complicated patterns, e.g. possessive me + noun. The results from most searches
meanings. Results that involved omission (such as zero relatives or definite article
deletion) were the most difficult to retrieve and were found by either using more
variation found in the dialect literature also involves some element of nonstandard
spelling e.g. theau (thou), coom (come) and wur (were). These variant spelling forms
were initially identified in the preliminary analysis of these data (as outlined in
Chapter 1) and so were also retrieved by searching for their individual word forms.
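The scoring step set out earlier in this section (average each feature's score across the three corpora, then take each corpus's positive or negative deviation from that average) can be sketched in Python. The sketch assumes that the three corpora are the Sound Archive, Litcorp and Lancashire Fairytales, with the BNC handled separately. The function name and the percentages are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def deviations_from_mean(scores):
    """Return each corpus's deviation from the cross-corpus average score."""
    average = mean(scores.values())
    return {corpus: round(score - average, 1) for corpus, score in scores.items()}


# Hypothetical nonstandard-use percentages for one feature (illustration only).
scores = {"Sound Archive": 30.0, "Litcorp": 60.0, "Lancashire Fairytales": 45.0}
print(deviations_from_mean(scores))
```

With these invented figures the average is 45.0, so the feature sits 15 points below average in the spoken data and 15 points above it in Litcorp. A positive deviation in the dialect literature alongside a negative one in speech is the kind of contrast between perception and production that the method is designed to bring out.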
Words (and of course spelling variants) that appear in one corpus but not in
another (i.e. the lexical choices and dialect words) were retrieved by means of a
dates between these two corpora. Therefore to counterbalance this, a new corpus of
dialect literature was collected. Respondents were asked to write in what they
considered to be Lancashire dialect. This means that this corpus captures the
by the respondents (along with any possible phonetic representation they might
choose to include). As with Litcorp, this corpus is a collection of the most salient
features of the Lancashire dialect as judged by these writers. In order to get the
participants to write a story of useful length, they were asked to reproduce a story that
was familiar to them – a fairy tale. In building this new corpus, Lancashire Fairytales,
the length, style and number of stories a participant could write were unrestricted,
along with the type of variation their story should contain (e.g. grammatical variation,
dialect were produced by the small test group (see Figure 2, Chapter 1) and included
shown in Figure 4.
For this task, please write a fairy story that is familiar to you, in Lancashire Dialect.
Imagine that a speaker with a Lancashire accent and dialect is telling you this story,
and write how you think they would say it.
For example, you might choose to write the story of Little Red Riding Hood,
Goldilocks and the Three Bears or The Three Little Pigs.
Two examples are shown below:
(a) She turned, an’ said to Jack "Where’s money for cow?" Jack looked round,
an’ said, surprised‐like, "Why, I’ve getten these magic beans!" "Magic
beans?" she said, "My foot! They’re nobut rubbish are them!"
(b) An’ Cinderella were cryin’ and cryin’. Then, in corner of room appeared a
right nice lady, an' she says "Cinderella, you will go t’ball".
The task was completed by 53 Lancashire respondents and 42 non-Lancashire
respondents, with most contributors writing between 350-500 words each. As with
previous tasks, respondents were segregated based on their answer to the preliminary
question do you have a Lancashire dialect? Yes/no. Those who answered no were then
asked for their region. There were 12 participants from the North East, 8 from North
West (excluding Lancashire), 7 from South East, 4 from West Midlands, 3 from East
Anglia, 3 from Wales, 2 from the South West, and 3 from various other regions.
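The regional breakdown given above can be checked against the reported total of 42 non-Lancashire respondents. The following is only a minimal sketch: the variable names are illustrative, and the label "Other regions" is a shorthand for the "various other regions" mentioned in the text.

```python
# Non-Lancashire respondents by self-reported region, as listed in the text.
# "Other regions" is an illustrative label for "various other regions".
non_lancashire_by_region = {
    "North East": 12,
    "North West (excluding Lancashire)": 8,
    "South East": 7,
    "West Midlands": 4,
    "East Anglia": 3,
    "Wales": 3,
    "South West": 2,
    "Other regions": 3,
}

lancashire_respondents = 53  # reported in the text
non_lancashire_respondents = sum(non_lancashire_by_region.values())
total_respondents = lancashire_respondents + non_lancashire_respondents

print(non_lancashire_respondents)  # regional counts summed
print(total_respondents)           # both groups combined
```

The regional counts sum to the 42 non-Lancashire respondents reported, giving 95 respondents overall.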
typically aged 18-22. Others were of a mixed age range and were contacted through
social networking websites and encouraged to pass the task on to anyone they thought
might also complete it. Cinderella was the most popular choice of story, with 16
instances in the corpus, closely followed by Three Little Pigs with 14. Lancashire
Fairytales totals 61,317 words which are roughly evenly distributed between
Lancashire Fairytales is a relatively small corpus when compared to the other corpora
used in this thesis, it nonetheless provides an important source for comparison with
the other corpora and also with itself by contrasting the Lancashire and non-
Fairytale corpus is set out in §5.4.5. In order to compare it with the other corpora,
initially only the Lancashire section of the fairytale corpus is used in the analysis.
5.3.3 Interpreting corpus results
The method outlined here describes both how corpus results are analysed in this
chapter and also how they could be analysed in corresponding corpora from other
are compared, nonstandard grammatical variation found in the dialect literature corpus
will also be found in the spoken corpus to some degree. 19 This is because it is logical to
suggest that grammatical variation used by dialect speakers when they talk also forms
part of what they conceptualize as the dialect. Perhaps more interestingly, if
nonstandard grammatical features are found in the dialect literature but not in the
spoken corpus nor in any reference corpus, then these features are either archaic
dialectal features that are no longer currently used, or good examples of salient
features that are rarely found in the spoken corpus, perhaps due to social values
Also interesting are those nonstandard features that are found in the spoken
corpus but not in the reference corpus or the dialect literature. This distribution can be
best expressed in Figure 5. These features may well be nonstandard but do not (as yet)
19 Corpora collected at the same time from the same set of informants would probably give results that
withstand influence from variables such as diachronic change and intraspeaker variation. In the case of
the Lancashire data used here (namely Sound Archive and Litcorp) this was not possible, and in part
motivated the decision to compile a new collection of dialect literature.
FIGURE 5. INTERPRETING CORPUS COMPARISONS
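The interpretive logic described in this section can be summarised as a small decision function. This is a hedged sketch rather than the procedure actually used in the thesis: the function name, its boolean arguments and the wording of the labels are all illustrative, and any distribution not explicitly discussed in the text is deliberately left uninterpreted.

```python
def interpret_distribution(in_dialect_literature: bool,
                           in_spoken_corpus: bool,
                           in_reference_corpus: bool) -> str:
    """Label a nonstandard feature according to the corpora that attest it.

    Only the three distributions discussed in the text receive a label;
    everything else falls through to a neutral default.
    """
    if in_dialect_literature and in_spoken_corpus:
        # Variation that speakers use is expected to appear in their writing too.
        return "produced by speakers and perceived as part of the dialect"
    if in_dialect_literature and not in_spoken_corpus and not in_reference_corpus:
        return "archaic, or salient but rarely produced in speech"
    if in_spoken_corpus and not in_dialect_literature and not in_reference_corpus:
        return "nonstandard in speech but not (yet) perceived as salient"
    return "outside the cases described in the text"


# A feature attested only in the dialect literature:
print(interpret_distribution(True, False, False))
```

Treating attestation as a simple yes/no is a simplification; the analysis in this chapter works with relative frequencies rather than presence or absence alone.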
represent the results from all three corpora together. Indeed, the methodology
employed here does not lend itself to comparing variables in this way. Instead, similar
results are clustered together depending on their distribution across the three corpora.
In some cases, particular words or parts of speech are discussed separately (e.g. was
and were variation are discussed as two separate variables, rather than as part of
variation with BE more generally). Other non-grammatical features (e.g. lexical choice
and semi-phonetic spelling) are discussed in §5.4.4, which considers (in brief) both
dialect words found frequently in the UK (e.g. owt and nowt) and other
Lancashire-specific dialect words, e.g. nobbut (no more than, nothing but) and gradely
5.4.1 Features found across all corpora
Results presented in this section occurred in the Litcorp, Lancashire Fairytales and
Sound Archive corpora in a fairly even distribution. In this section only, results from
the BNC are also included as a reference corpus in order to check that possible salient
features are typical of Lancashire, rather than being part of nonstandard variation
contemporary humorous dialect literature written about this region, definite article
Definite article reduction is also found in the dialect literature corpora, reduced to both
t’ and th’, as shown in (4), and also as a collocate of in or on, often expressed as ont and
(4) Margit had lost a deol o’ wynt by th’ time hoo geet to th’ surgery, but as
luck ud have it, th’ doctor were in. (Litcorp)
(5) And off he went down t’road, holdin onto the clog that she’d left ont ground
[…] (Lancs_0017)
Both of these constructions appear to be rare in the BNC data, although reduced forms
are of course easier to quantify (121 instances of t’ are found compared to the
6,041,234 instances of the). The comparative distribution of the reduced and deleted
forms of the is interesting to note. In both Litcorp and Lancashire Fairytales the
reduced form (t’ or th’) is used more frequently than the zero form. This is particularly
consider this more salient than the zero form. It could also be the case that this
reduced form is more impactful than zero (a point raised in §2.3.1 with respect to zero
relatives) – if the writers are trying to represent their dialect, perhaps it is more
meaningful to include a reduced form that is noticeable on the page rather than a zero
form. Tied in with this, the Sound Archive shows the biggest variation between
definite article reduction and definite article deletion. This perhaps complementary
distribution could indicate that while speakers are aware of the reduced form (as it
also occurred frequently in the dialect literature) this may be a construction that they
accommodate away from (see Hollmann and Siewierska (2011) for a further
Both nonstandard was and nonstandard were are frequent across all corpora
and are found more frequently in the Sound Archive than in the other corpora. The corpus
This distribution perhaps suggests that this variant, while still perceived as
nonstandard, is actually produced more than it is perceived. This would indicate that this
variant is perhaps less strongly associated with this dialect variety than definite article
this thesis (see §4.4.4), levelling to were is found more frequently than levelling to
was, in both positive and negative sentences. An example of this is shown in (6-7)
(6) But er, I went to woodwork, I weren't very happy. I don't like the
smell of new wood actually but er that might be a throwback, I don't
know. (Sound Archive)
London (Cheshire & Fox, 2006) and in the English Fens (Britain, 2002). It may be the
case that although this variable is frequently produced by Lancashire dialect speakers,
it is not perceived by them as a salient part of their dialect, or at least not as salient as
Never as a past tense negator is also found across all corpora, although most
it is found in the Lancashire corpora. Results for never are shown in Table 3, and
(8) There were a peacock outside of Townley Hall. I never remember 'em being
two. No there were only one. (Sound Archive)
(9) When I geet here th’ chap hadn’t come to meet me, an’ he never turned up
aw day. (Litcorp)
This pattern appears infrequently in the BNC. This suggests that, as Lancashire dialect
writers have included it and it is also found in the spoken corpus, it is likely to be a
salient feature of Lancashire. It is important to note here that salient features of
Lancashire are not necessarily features found exclusively in this region and in no
other(s).
A number of other nonstandard features were also found across all corpora,
and many of these also appear in the literature as nonstandard features that are
2004:1154-55). Of course, just because they are found in other varieties does not
mean that they are not salient in Lancashire. These included demonstrative them and
(10) “Them’s o’ reet,” said Young Winterburn, “for little lads” (Litcorp)
(11) An she wur a luverly lass wot lived wi all these dwarves. (Lancashire
Fairytales)
Results found in the Litcorp and Lancashire Fairytales comparatively display a very
features were present in Lancashire Fairytales that were not found in other corpora
too. This is perhaps unsurprising and means that writers in Lancashire Fairytales are
not using any of the grammatical variation tested here that does not feature also in the
spoken language present in the Sound Archive corpus or the written language of
Litcorp. While Lancashire Fairytales did not display nonstandard variation found with
duck and put wood int hole were frequent in this corpus but not found in the Sound
Archive or Litcorp. This suggests that these constructions are salient to Lancashire,
but are perhaps in a different category to salient words that would typically be used in
everyday speech. It may be the case that these more idiomatic constructions are
enregistered as signifiers of this variety in much the same way that e.g. “why aye
                              Litcorp    Fairytale    Sound Archive
adverbial right + adjective   -11.05     +20.00       -8.95
mean score: 11.05
TABLE 4. DISTRIBUTION OF ADVERBIAL RIGHT + ADJECTIVE
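The scores in Table 4 appear to be signed deviations from the mean score across the three corpora. The sketch below tests that reading arithmetically. It rests on two assumptions that are not confirmed in this section: that each score equals the corpus value minus the mean, and that the columns read Litcorp, Fairytale, Sound Archive. The unit of the underlying values is not stated here, so none is assumed.

```python
# Scores and mean as printed in Table 4.
# Assumed column order: Litcorp, Fairytale, Sound Archive.
corpora = ["Litcorp", "Fairytale", "Sound Archive"]
deviations = [-11.05, 20.00, -8.95]
mean_score = 11.05

# Assumption: score = corpus value - mean, so corpus value = mean + score.
implied_values = [round(mean_score + d, 2) for d in deviations]

# If the assumption holds, the implied values must average back to the mean
# and the deviations must cancel out.
recomputed_mean = round(sum(implied_values) / len(implied_values), 2)
deviation_total = round(sum(deviations), 2)

for name, value in zip(corpora, implied_values):
    print(f"{name:<14}{value:>7.2f}")
print("recomputed mean:", recomputed_mean)
```

Under these assumptions the figures are internally consistent: the deviations sum to zero, and the implied values (0.00, 31.05 and 2.10) average to the stated mean of 11.05. This places almost all instances of the construction in the Fairytale data.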
This suggests that for current Lancashire speakers this construction is salient, and
perhaps its infrequent use in Sound Archive may be due to the social values that are
assigned to it. Adverbial right was frequently found with nonstandard were, as
(12) […] and I had a uniform, oh it were right posh, I had a green uniform and it
buttoned all way up the side with er fancy buttons (Sound Archive)
This suggests that adverbial right may influence the use of
nonstandard were or vice versa, or indeed the larger construction ‘NP were right Adj’
As found in previous chapters, nonstandard spellings were present throughout
Litcorp (and to some degree, Lancashire Fairytales) and are discussed in brief in
§5.4.4.
A number of results are found with a stronger distribution within the older
dialect literature (Litcorp) than in the other corpora. Most typically, variation found in
this category involves forms that are now archaic. These are nonstandard
Archaic personal pronouns were not found in the Sound Archive at all, and
were also rare in Lancashire Fairytales. Archaic verb forms such as dost, art and hast
are also found significantly more frequently in Litcorp than in the other corpora.
Results for nonstandard spellings were included (e.g. dost also includes any results for
durst and verbal uses of dust). Contracted ’st and ’rt forms were found in the Litcorp
(13) “Theaw’rt some perculiar mannert Jackonapes I’ll uphowd” sed hoo;
“Ney, ney, I’st naw grope in the Breeches not I.” (Litcorp)
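Counting variant spellings under a single headword, as described above for dost, might be sketched as follows. Only the grouping of durst and dust under dost comes from the text. The function name, the data structures and the sample sentence are hypothetical, and the sample sentence is invented for illustration rather than drawn from any of the corpora. Because dust is only relevant in its verbal uses, the sketch sets those hits aside for manual inspection rather than counting them automatically.

```python
import re
from collections import Counter

# Headword -> spellings counted under it. Only this grouping is from the text.
VARIANTS = {"dost": ["dost", "durst", "dust"]}

# Spellings that are ambiguous (e.g. the noun 'dust') and need a manual check.
NEEDS_MANUAL_CHECK = {"dust"}


def count_headwords(text: str):
    """Return (counts per headword, list of ambiguous hits to check by hand)."""
    counts = Counter()
    to_check = []
    for headword, forms in VARIANTS.items():
        alternatives = "|".join(re.escape(form) for form in forms)
        pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            form = match.group(1).lower()
            if form in NEEDS_MANUAL_CHECK:
                to_check.append((form, match.start()))
            else:
                counts[headword] += 1
    return counts, to_check


# Invented sample sentence, NOT corpus data.
sample = "Dost tha know him? What durst tha want? There were dust on th' shelf."
counts, to_check = count_headwords(sample)
print(counts["dost"], [form for form, _ in to_check])
```

Separating automatic counts from hits flagged for inspection keeps the final figures dependent on a human judgement wherever a spelling is ambiguous.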
Irregular past tense verbs such as knowed, etten and forgetten were found in
Litcorp but very rarely in the other corpora. Here a distinction is made between
nonstandard forms, i.e. getten instead of got, and mere nonstandard spelling, e.g.
could indicate one of two things - these features were more ‘standard’ at the time of
writing and so occur in the Litcorp in much the same way that features of Standard
English occur across all corpora now (indicating a diachronic change), or, these
features were considered to be a salient part of the dialect at that time, but now are
not. A closer look at the dialect literature reveals that these pronoun forms are found
more frequently with semi-phonetic spelling than with standard spelling. In particular,
theau occurred in Litcorp 951 times compared to the 135 instances of thou. This
would suggest that these pronouns are associated with Lancashire based on the earlier
(14) And the princess come fleeing out of dancehall, just as clock were striking.
(Fairytale – Lancs)
(15) There were a lowf fro’ th’ lobby, an’ Ferret Eon said nowt, though some
colour coom in his face, as th’ farmer bid him Good-neet. (Litcorp)
The features presented in are found frequently in Litcorp but also are found in the
Sound Archive data, suggesting that unlike those in Table 5, these features are not
archaic, but perhaps feature in Litcorp due to reasons of style. This is discussed further
in §5.5.
5.4.3 Features found in the most recent corpora
Results presented in Table 7 are found frequently within the newer corpora
(Fairytale and Sound Archive) but are not found in the older Litcorp.
These results shown in Table 7 are those which are used by Lancashire
speakers but are not considered by them to be a salient part of their dialect. A number
of these are features which are perhaps found in varieties of English more widely,
such as adverbial quick and absence of plural marking. Others may typically be used
Either way, most of the results included here are not represented significantly in the
dialect literature, which perhaps indicates that, as yet, they are relatively free from
social values. To use Labov’s terms (see e.g. 2001), these results are perhaps
indicators.
A number of features that did not fit easily into the methodology adopted here were
found in the corpora; many of these were more stylistic or discourse based. A majority
of these features were found in the Sound Archive, such as dislocation (16), this
(16) […] and in addition to that it was very prevalent was this, because Huncoat
was divided in two by imaginary line from where we’re sitting now (Sound
Archive)
(17) Well er as they said they were always thinking about this here ghost but we
never saw any ghost. (Sound Archive)
(18) So I, I said, I want it for Christmas, so I'll put it away for you, see.
Variation such as this may be indicative of a particular ‘spoken Lancashire style’ and
There were also many instances where particular nonstandard words (rather
than grammatical features) were used by the speakers or writers. Among the most
frequent were owt and nowt. The frequencies of each of these are shown in Table 8.
The owt results from Litcorp had to be sorted manually, due to results like (19).
(19) Heawsumever, little Emma were a favourite wi’ Ginger; he awlus breetened
up a lot when hoo went to his shop, an’ she very oft coom owt wi a cake or
some towfy as Ginger had trated her to. (Litcorp)
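Manual sorting of this kind is usually supported by a keyword-in-context (KWIC) display, which shows each hit with its surrounding text so that cases like (19), where owt is a respelling of out, can be excluded by eye. The following is a minimal sketch only: the function name and the context width are illustrative choices, and no claim is made that this is the concordancing tool used for the thesis.

```python
import re


def kwic(text: str, keyword: str, width: int = 30):
    """Return (left context, hit, right context) for each whole-word match."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits


# Example (19) from Litcorp, as quoted above.
example_19 = (
    "Heawsumever, little Emma were a favourite wi’ Ginger; he awlus "
    "breetened up a lot when hoo went to his shop, an’ she very oft "
    "coom owt wi a cake or some towfy as Ginger had trated her to."
)

for left, hit, right in kwic(example_19, "owt"):
    print(f"{left:>30} [{hit}] {right}")
```

The word-boundary match keeps nowt from being returned as a hit for owt. The left context ("coom") then makes it clear that this instance means out and should be discarded.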
A dominance of owt and nowt in the written corpora perhaps means that these forms
have a particular social value ascribed to them, such that speakers avoid them in their
spoken language.
were found across the corpora, although not hugely frequently. A number of examples
(20) I know it's going to be a bit of a job because I were nobbut a lad when I left.
(Sound Archive)
(21) But th’ sun’s gradely hot; it make’s one sleepy, doesn’t it ? (Litcorp)
(22) Once upon u time, thur wur a littl’ chitty named Thumbelina. (Lancashire
Fairytales)
These features are perhaps similar to the more idiomatic constructions found earlier
Honeybone and Watson, forthcoming), the variant spelling found in the dialect
the phonology of the language used. While phonology is not the focus of this chapter
(or indeed this thesis), variant spellings occurred so frequently in the data they most
certainly warrant at least an overview. Some of the most frequent respellings are
shown in Table 9.
phonological feature   Example
[əʊ] [ɔ]               He geet up then, an’ th’ clock struck eight, but when he went to
                       oppen th’ dur for th’ milk. (Litcorp)
[з:] [ə]               her wur a tinker wur Jack, an off ‘e went wit best ceaw deawn
                       t’market. (Lancashire Fairytales)
[t] [ɹ]                They’re bothered abeaut gerrin’ shoon to fit tint, an’ thine’s just th’
                       pattern. (Litcorp)
[a] [ɔ]                Awonder’t what wur up when th’ post-chap coome hommerin at th’
                       dur o Monday morning’ (Litcorp)
[e] [ɔ]                But I mony a time wished I’d never seen it, for it caused me mony a
                       freet, an’ made me so narvous I’st never get o’er it. (Litcorp)
[u:l] [u:]             They said th’ skoo wur full, an’ a lul had had to goo away. (Litcorp)
TABLE 9. A SAMPLE OF THE NONSTANDARD PHONOLOGICAL FEATURES FREQUENT IN
DIALECT LITERATURE
phoneme [əu], which occur most frequently in Litcorp in theau (thou), abeaut (about)
and deaun (down). These three instances alone totalled 1841 results in Litcorp, with
their standard counterparts totalling only 394. Reduction and deletion of word-final
consonants was also very frequently represented in the texts, either by the omission of
(23) Well, tell thi’ mother to soak a piece o’ flannel i’th’ milk, an’ le th’ choilt
suck it. (Litcorp)
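The dominance of respelled forms over their standard counterparts in Litcorp can be expressed as a simple proportion. The sketch below uses only counts reported in this chapter: 1841 respelled against 394 standard tokens for the three [əu] words, and 951 instances of theau against 135 of thou. The function name is illustrative.

```python
def respelling_share(respelled: int, standard: int) -> float:
    """Proportion of tokens that use the semi-phonetic respelling."""
    return respelled / (respelled + standard)


# Counts reported for Litcorp in this chapter.
share_three_words = respelling_share(1841, 394)  # theau, abeaut, deaun combined
share_theau = respelling_share(951, 135)         # theau versus thou

print(f"three [əu] words: {share_three_words:.1%}")
print(f"theau vs thou:    {share_theau:.1%}")
```

On these figures roughly four in five tokens of the three words, and a slightly higher share for theau alone, appear in respelled form.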
While only a few phonological features have been outlined here in order to
demonstrate how phonetic respellings can also be used to indicate the salient
phonological features in this region, the potential for further analysis (perhaps along
While only the Lancashire part of Lancashire Fairytales has been used in the analyses
presented so far, interesting results can be found by contrasting the Lancashire and
with features such as definite article reduction/deletion, levelling to were (and also to
was) and lexical choices such as owt and nowt used frequently in both sections of the
corpus, as we can see in the two extracts from Three Little Pigs shown in the non-
(24) Once upon er time there were three little pigs who lived in a right nice ‘ouse.
T’house was made with straw. (Lancashire Fairytales – non-Lancs)
(25) Once upon a time theyre wur three lickle pigs. These here pigs lived thur
days int luvley ouse made uh straw an ‘ay. (Lancashire Fairytales – Lancs)
Even in these two very short examples that are telling the same narrative we can see
differences between the two texts. On the whole, Lancashire writers often aimed to
represent their phonology via variant spellings, and tended to include a more selective
other hand often had a smaller selection of features that they seemed to consider as
constructions, dialect words) but often included them in a haphazard or arbitrary way.
features with each other due to their relative frequencies of occurrence, the table is
still useful in showing the difference between the two parts of the corpus.
Lancs non-Lancs
definite article reduction 900 976
nonstandard were 255 142
adverbial right + adjective 177 33
definite article deletion 144 80
dialect words 105 9
past reference come 99 27
possessive me + noun 86 18
archaic 2nd person pronouns 59 18
archaic verb form 49 0
dislocation 43 3
absence of plural marking 40 12
subject relative what 32 27
nonstandard was 23 12
nonstandard irregular lexical verb 18 0
TABLE 10. RAW FREQUENCIES OF NONSTANDARD FEATURES IN THE LANCS AND NON-
LANCS PARTS OF LANCASHIRE FAIRYTALES
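One way of reading Table 10 is to ask, for each feature, what share of its occurrences comes from the Lancashire writers. The sketch below computes that share from the raw frequencies in the table. It is illustrative only: the figures are not normalised for the size of each sub-corpus, so the comparison is meaningful only insofar as the two parts are of roughly equal size, and the function and variable names are hypothetical.

```python
# Raw frequencies from Table 10: feature -> (Lancs, non-Lancs).
TABLE_10 = {
    "definite article reduction": (900, 976),
    "nonstandard were": (255, 142),
    "adverbial right + adjective": (177, 33),
    "definite article deletion": (144, 80),
    "dialect words": (105, 9),
    "past reference come": (99, 27),
    "possessive me + noun": (86, 18),
    "archaic 2nd person pronouns": (59, 18),
    "archaic verb form": (49, 0),
    "dislocation": (43, 3),
    "absence of plural marking": (40, 12),
    "subject relative what": (32, 27),
    "nonstandard was": (23, 12),
    "nonstandard irregular lexical verb": (18, 0),
}


def lancs_share(lancs: int, non_lancs: int) -> float:
    """Share of a feature's occurrences produced by the Lancashire writers."""
    return lancs / (lancs + non_lancs)


shares = {feature: lancs_share(*counts) for feature, counts in TABLE_10.items()}

# Features used more often by the non-Lancashire writers (raw counts only).
favoured_by_non_lancs = [f for f, share in shares.items() if share < 0.5]

for feature, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{feature:<36}{share:6.1%}")
```

On the raw counts, definite article reduction is the only feature that the non-Lancashire writers use more often than the Lancashire writers, while archaic verb forms and nonstandard irregular lexical verbs occur only in the Lancashire texts.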
the corpus. A closer look at the non-Lancashire texts reveals that often the reduced
form is used in every single possible instance, even when barely any other
(26) And t’girl was called Little Red Riding Hood. And one day when t’sun was
shinin’ she went in t’forest and was looking for t’house where her Grandma
lived. T’house was only small and it were hidden by t’trees. (Lancashire
Fairytales, non-Lancs_0006)
This suggests that this form is certainly strongly associated with Lancashire dialect,
more selective with their application of this nonstandard form, and have a
One of the most interesting (and surprising) aspects of the Lancashire Fairytale
corpus was found not in the grammatical variation displayed by the writers or the
nonstandard spelling representing the phonology, but in the content of the stories
involve some change or embellishment to the expected narrative despite this not being
stories involved glass clogs instead of slippers. Other stories mentioned the
surrounding area (e.g. two different writers describe Grandma from Little Red Riding
Hood as living in Grizedale forest). Others describe living conditions and scenery in
unexpected detail, often including cobbled streets, mills and local foods (including, on
one occasion, the Wicked Witch offering Snow White some tainted hotpot rather than
an apple). One example showing this local influence is given below in example (27),
(27) “owdo Jack” she says, “Wossupwithi?” Jack ‘ad com in leukin like e’d seen
nobbut strife. “By eck, trouble at mill” says Jack. “I’ve been given t’shove”.
“Tha't backerts thee!” she said. “We’ll hav t’sell ceaw! Get thur self pulled
reaunt an mek sharp down t’market.” (Lancashire Fairytales, Lancs_0016)
The link between language and identity is clear here, with writers showing an
particular reference to times gone by) and the Lancashire dialect. Here Lancashire
Humorous Localised dialect literature; a genre now well established for many dialects
of English (both in the UK and beyond). The contrast between the clearly stereotyped,
(and often archaic) written forms produced by the writers of the Lancashire Fairytale
corpus (such as that shown in (27)), as compared to their own speech, is interesting.
There are no instances in any of the spoken corpora displaying either the range or the
density of the nonstandard variation found in the dialect writing. It is therefore evident
that writers of the Lancashire dialect literature are consciously using a set of linguistic
forms and constructions that enact a socially recognised register (as outlined by Agha,
2003 and Johnstone et al., 2006), namely what they conceptualize as Lancashire.
This chapter has aimed both to uncover the salient grammatical features in Lancashire
as found in the Lancashire corpus data and to propose a suitable methodology to arrive
at this outcome.
Results from the analysis of the corpus data have revealed distinct differences
in the distribution of grammatical features across the corpus sources. This indicates
that not all nonstandard variation produced by Lancashire dialect speakers is indeed
perceived as being salient (and therefore included in the dialect literature). Figure 1
A number of the nonstandard features tested were frequent across all of the
Lancashire corpora. The most prevalent of these were nonstandard was and were,
definite article reduction/deletion, and past reference come. It is therefore suggested
that these features are salient to Lancashire speakers but not so strongly associated
with Lancashire that their frequency in the Sound Archive is diminished through
accommodation.
Other features were apparent in the dialect literature corpora but not in the
Sound Archive. Two possibilities exist for these constructions: either they are archaic
(e.g. those found in Litcorp), or they are perceived as very salient and so are perhaps
avoided by speakers when in conversation. Aside from those that were attested in the
literature as being archaic, this category contained the use of more idiomatic
constructions such as “ey up me duck” and dialect words and phrases such as “gradely
int it!”. As these constructions are perhaps enregistered as very clearly being part of
the Lancashire dialect, it is unlikely that they would be found in natural conversation,
A number of features were present in the Sound Archive but not in the dialect
literature. This suggests that while these features are used by Lancashire dialect
speakers, they are yet to acquire a social value. These variants are an interesting
Perhaps the most surprising results from Lancashire Fairytales emerged from
the rewriting of the narrative of the fairytale in order to include some element of the
Lancashire area, its customs or cuisine. This aspect was not overtly indicated in the
question, but clearly shows the link between the Lancashire dialect and identity for
and perceived variables was useful but not without limitation. One problem is
circularity. How can we know if features are salient by looking at corpus data, if
speakers who contribute to that corpus data also know that certain features are salient
too and so actively up/downplay them? - a point also outlined by Kerswill and
nuances, which could potentially be revealing. The influence of the task may also
have had an impact on the distribution of constructions. For example, it may be the
case that a lower frequency of me + noun was found in Lancashire Fairytales simply
because the writers did not have the opportunity to use possessive constructions when
writing a fairy story. The corpora used for this analysis may also have impacted upon
the outcomes; a comparison of the perception and production (i.e. speech and writing)
of the same group of speakers would control variables such as accommodation, and
studies, along with perceptual dialectology may allow a clearer picture to emerge of
the grammatical constructions that are salient in the Lancashire region. Nonetheless,
the data examined in this chapter and the conclusions put forward about the
features considered here are clearly consistent with previous research, e.g. Kerswill
§5.4.4, this demonstrated how phonetic respellings in the Lancashire dialect literature
could be analysed in order to uncover the salient phonological features in this region.
Analyses such as this would allow a broader picture of variation of all types in
Lancashire to be outlined, and would provide results that would complement the
Chapter 6. Concluding remarks
features found in the previously under-explored Lancashire dialect data, whilst also
variation and change. The approach adopted here is new in that it combines a variety
of data types (see §1.3 for more details on this), and explores the contribution that a
large corpus of dialect literature, along with other methods, can make in uncovering
The contribution of this study lies not only in profiling both existing and
historical features of the Lancashire dialect, but also in the use of multiple methods of
necessity, the use of a considerable spoken corpus in conjunction with both historical
contention that this approach has provided valuable insight into how multiple methods
The combination of new methodologies and data outlined here has shown that
oral history interviews can be a useful avenue for testing linguistic theories (provided
that these are handled with care) and that dialect literature can, to some extent, be used
evidence about the dialect in question. Dialect literature, when treated as a collection
of the most salient features of a variety as judged by that writer, can offer insights into
sociolinguistic salience.
targeted and explored in more detail. This was particularly effective when considering
rare phenomena such as zero relatives or the NSR. The online hosting of these
questionnaires and the sourcing of participants via social networking websites meant
that a large number of participants were reached, thus giving more robust and
sociolinguistic data collection. Along with the traditional dialect literature, a new
aspects of both elicitation and dialect literature, is also unique to this study and
allowed further insights into differences between the perception and production of
wide variety of sources utilized in this thesis created multiple opportunities for
analyzing the data in question from different perspectives and arriving at a much more
Chapter 2 used the Lancashire data to test a number of assertions that are
English and is, on the whole, less constrained. This chapter also introduced new
which is typically difficult to retrieve from corpus data alone. A sentence-linking task
where informants had a free choice of which relativizer to use in linking clauses
showed that, at least to some degree, the relativizer what is productive in this region,
Chapter 3 provided a semantic and syntactic analysis of the previously
Lancashire that displays meanings that can be similar to both DOn’t HAVE to, and
mustn’t depending on the context of use. The results show that the semi-modal
HAVEn’t to has changed over time, and now tends to behave more like a core modal
verb for Lancashire speakers. In a majority of cases in the Sound Archive data, it
displays a meaning that is closer to the stronger modal verbs MUSTn’t or SHOULDn’t
Along with describing this construction, Chapter 3 tested how possible it was
to use different sources of dialect data in the analysis of diachronic change. The
results are open to several interpretations. The semantic and syntactic arguments
grammaticalized in the Lancashire dialect data, but diachronic changes in the data are
more uncertain. This uncertainty may be due to either the relatively low frequency of
this construction overall, or, to the somewhat problematic nature of the comparison
between the written Litcorp (as opposed to a historical spoken source) and the spoken
Leading on from the analysis of HAVEn’t to, Chapter 3 then turned to the wider
construction family, i.e. those constructions that have similar semantic and syntactic
properties (e.g. MUSTn’t, SHOULDn’t, NEEDn’t); an approach that has yet to be widely
adopted in sociolinguistics. The aim here was to explore the concept of constructional
in the development of this construction (and its construction family) more widely in
Lancashire.
The verdict on construction competition is not entirely clear but nonetheless
the point still remains that often a number of similar constructions, as opposed to just
two opposing variants, can fulfil a similar semantic function and that the interaction
between these variants is complex and cannot easily be accounted for by e.g. the S-
competing variants no doubt has implications for studies of language change and
sociolinguistics, suggesting that a wider scope of focus will often be necessary when
the (so-called) Northern Subject Rule. This chapter provided a new account of this
while variation with 3sg agreement is prevalent in Lancashire, the situation is far too
complex to be accounted for by a single rule such as that ascribed by the NSR. My
analysis finds that instances of present tense indicative variation that appear to be
instances of the NSR are extremely rare in the Lancashire data, particularly in the
Crucially, this chapter also analysed the semantics surrounding the NSR. It
attempted to uncover whether or not habitual constructions had been fully appreciated
by other researchers, and the extent to which such constructions (which, importantly,
are very frequent in Lancashire) impact upon the status of the NSR. In Lancashire
corpus data, nonstandard verbal agreement often directly flouted the subject
position and subject type restrictions specified by the NSR. Evidence from the
indicated a higher acceptability score (versus other respondents) for adjacent non-3sg
pronouns with 3sg agreement. With this in mind, the analysis showed that there is
great difficulty in differentiating NSR from other similar constructions, and that this is
not something that should simply be passed over. While the semantic difference in the
(and can usually be resolved by examining the wider context of the text or utterance),
this is trickier for habitual constructions. I argue that it is very possible that in regions,
such as Lancashire, where it can be proven that the usage of -s in the habitual aspect is
frequent (with or without adverb phrases), 3sg forms may have been re-analysed by
rather than be a marker of agreement. This assertion may undermine the validity of
(2006) in finding that levelling to was but more frequently (and interestingly) to were
concept that was previously untested. The methodology involved comparing a large
corpus of produced variables (i.e. speech) to a large corpus of perceived variables (i.e.
here is original. Although examining only the dialect literature corpora would no
doubt have yielded many interesting results, what is perhaps more interesting is the
difference between the nonstandard constructions present in the written data and those
This chapter advocated the use of dialect literature as a key component in
unearthing grammatical (and other) patterns that are salient features of the Lancashire
dialect. This comparison not only allowed salient constructions in the data to be
described, but also revealed constructions that are salient but do not occur in speech
(perhaps in part due to their social value or status) and also constructions that are
nonstandard yet do not currently have ascribed social values. Some grammatical
patterns appeared in both texts, but the distribution displayed different weightings
indicating preferences for either written or spoken data. Results such as this can allow
them as salient or not salient which is highly promising with respect to future corpus-
based research.
The corpus-based method used in Chapter 5 was not without some limitations;
certain more discourse-based features were not able to have their nonstandardness
perhaps even stronger results could be achieved if both written and spoken data were
collected from the same group of informants, and, if possible, aligned for
research into semi-phonetic respellings and nonstandard vocabulary terms were found
which only occur in the “perceived” dialect literature which was only treated in brief
here due to the restrictions of this chapter. Considerations of these, and also of the
further attention
Overall, this thesis has not only described grammatical variation in Lancashire
but has set out to emphasize the importance of corpus-based dialect grammar for
linguistics in general (see also Hollmann and Siewierska, 2011, and Hollmann, to
appear, who focus specifically on the importance of frequency effects and schemas).
An important underlying theme of this study has been the testing of linguistic claims
using large corpus resources. It is clear that the method of achieving significance for a
claim depends on the nature of that claim. A simple claim, such as the existence of a
construction, requires only simple searches and statistics to show that it exists in the
data. Problems arise with evaluating assertions which are more complex, relating to a
number of conditions or features, often overlapping (as demonstrated by, for example,
the NSR). It seems clear that in order to confirm suggested trends and/or rule out
different sources must be used. This will allow the corroboration of data from multiple
The methods used here were not without limitation. As frequently highlighted,
requires large resources, and it may be the case that larger corpora would have been able
to substantiate some of the claims made in this thesis in a more convincing manner.
Naturally, there are obstacles to acquiring such a large amount of data, e.g. simply the
lack of the data in existence, time constraints and costs. In this study this limitation
was counteracted to some degree by elicitation methods (and this approach is strongly
advocated), but larger corpora and perhaps, for example, conversational data
structured around topic or e.g. grammatical tense to some degree might be of use in
Aside from the data, I argue that in order for an approach like this to progress yet
further a number of broader questions arising from this thesis need to be addressed.
One of these concerns the interplay between frequency, salience and perhaps other
social factors (as outlined also in Hollmann and Siewierska, 2011), along with the role
provide numerous opportunities for future research. In particular, attention to intra and
tasks may also lend themselves well to studies of possible social network effects, a
very important aspect of sociolinguistic theory that was beyond the scope of the
present study.
References
Agha, Asif. 2003. The social life of cultural value. Language and Communication
23:231-273.
Auwera, Johan van der. 1984. More on the history of the subject contact clause in
Bailey, Guy, Natalie Maynor and Patricia Cukor-Avila. 1989. Variation and concord
Salmon Ltd.
van Koppen, J. van Craenenbroeck, V. van den Heede (eds.). 2005. Syntactic
Barlow, Michael and Charles Albert Ferguson. 1988. Agreement in natural language.
Barras, Will. 2006. The exhalations whizzing in the air: SQUARE and NURSE in
Essex.
Beal, Joan C. 1993. The grammar of Tyneside and Northumbrian English. In James
Milroy and Lesley Milroy (eds.), Real English: the grammar of English
Beal, Joan C. 2004. The phonology of English dialects in the north of England. In
Poussa (ed.), Relativisation on the North Sea littoral. Munich: Lincom Europa.
Biber, Douglas, Susan Conrad, Edward Finegan, Stig Johansson and Geoffrey Leech.
Börjars, Kersti and Carol Chapman. 1998. Agreement and pro-drop in some dialects
Bresnan, Joan, Ashwini Deo and Devyani Sharma. 2007. Typology in variation: a
Britain, David. 2002. Diffusion, levelling, simplification and reallocation in past tense
Britain, David and Laura Rupp. 2005. Subject-verb agreement in English Dialects:
the East Anglian Subject Rule. Paper presented at the University of Essex.
Unpublished.
Brown, K. 1991. Double modals in Hawick Scots. In Peter Trudgill and J. Chambers,
Bucholtz, Mary and Kira Hall. 2003. Language and identity. In Alessandro Duranti,
Bybee, Joan. 1985. Morphology: a study of the relation between meaning and form.
Bybee, Joan. 2006. From usage to grammar: the mind's response to repetition.
Language 82:4.
Cheshire, Jenny and Sue Fox. 2006. A new look at was/were: the perspective from
Cheshire, Jenny, Viv Edwards and Pamela Whittle. 1989. Urban British dialect
Cheshire, Jenny. 1982. Variation in an English dialect: a sociolinguistic study.
Press.
Clark, Lynn and Graeme Trousdale. 2009. The role of frequency in phonological
Press.
Culicover, Peter W. 2008. The birth and death of constructions: the case of English
D’Arcy, Alexandra and Sali Tagliamonte. 2010. Prestige, accommodation and the
Dik, Simon and Kees Hengeveld. 1997. The theory of Functional Grammar. Part I:
Dobson, Scot. 1969. Larn Yersel' Geordie. Newcastle upon Tyne: Graham.
Doherty, Cathal. 1993. The syntax of subject contact relatives. Paper presented at the
al. (eds.), Proceedings of the Chicago Linguistic Society. 55–65. Chicago:
Ellegård, Alvar. 1953. The auxiliary do: the establishment and regulation of its use in
Filppula, Markku, Juhani Klemola and Heli Pitkänen (eds.). 2002. The Celtic roots of
Fischer, Olga and Max Nänny. 2001. Iconicity. Special issue of the European Journal
Fischer, Olga, Ans van Kemenade, Willem Koopman and Wim van der Wurff. 2000.
Fischer, Olga. 1992. Syntax. In N. Blake (ed.), The Cambridge history of the English
Fox, Barbara and Sandra Thompson. 1990. A discourse explanation of the grammar
Freethy, Ron and Richard Scollins. 2002. Lankie Twang (Local Dialect). Newbury:
Countryside Books.
Godfrey, Elizabeth and Sali Tagliamonte. 1999. Another piece for the verbal -s story:
11:87-121.
Goldberg, Adele and Ray Jackendoff. 2004. The English resultative as a family of
Goldberg, Adele. 1995. Constructions. A Construction Grammar approach to
Henry, Alison. 1995. Belfast English and Standard English. Dialect variation and
Henry, Alison. 2005. Non-standard dialects and linguistic data. Lingua 115:1599-
1617.
Herrmann, Tanja. 2005. Relative clauses in English dialects of the British Isles. In
Hickey, Raymond. 2000. Salience, stigma and standard. In Laura Wright (ed.), The
Hollmann, Willem B. and Anna Siewierska. 2006. Corpora and (the need for) other
Amerikanistik 54:203-216.
Hollmann, Willem B. and Anna Siewierska. 2011. The status of frequency, schemas,
1:195–223.
Hopper, Paul J. and Elizabeth Closs Traugott. 2003. Grammaticalization, 2nd Edition.
Huddleston, Rodney and Geoffrey K. Pullum. 2002. The Cambridge grammar of the
Linguistics 3: 173-207.
Hundt, Marianne. 1997. Has BrE been catching up with AmE over the past thirty
the Seventeenth International Conference on English Language Research on
Mitteilungen.
Ihalainen, Ossi. 1994. The dialects of England since 1776. In Robert Burchfield (ed.),
Kain, Roger and Richard Oliver. 2006. Historic parishes of England and Wales: an
Kearns, Kate. 2007. Epistemic verbs and zero complementizer. English Language and
Linguistics. 11:475-505.
Jones and E. Esch (eds.), Language change. The interplay of internal, external
Klemola, Juhani. 2002. The origins of the Northern Subject Rule: a case of early
Universitätsverlag C. Winter.
Gruyter.
Pennsylvania Press.
Oxford: Blackwell.
Labov, William. 2006. The social stratification of English in New York. Cambridge:
Lambrecht, Knud. 1988. There was a farmer had a dog: syntactic amalgams revisited.
Langacker, Ronald W. 1987. Foundations of cognitive grammar: theoretical
Langacker, Ronald W. 1991. Concept, image, and symbol: the cognitive basis of
Leech, Geoffrey. 1987. Meaning and the English verb, 2nd Edition. London:
Longman.
Leech, Geoffrey. 2003. Modality on the move: the English modal auxiliaries 1961-
1992. In Roberta Facchinetti, Manfred Krug and Frank Palmer (eds.), Modality
McCafferty, Kevin. 2003. The Northern Subject Rule in Ulster: How Scots, how
Merton, Les and Richard Scollins. 2003. Oall Rite Me Ansum!: A Salute to the
Routledge.
Miller, Jim. 1993. The grammar of Scottish English. In James Milroy and Lesley
Milroy (eds.). 1993. Real English, the grammar of English dialects in the
Milroy, L. and J. Milroy. 1992. Social network and social class: toward an integrated
Mishoe, Margaret and Michael Montgomery. 1994. The pragmatics of multiple modal
Montgomery, Michael, Janet Fuller and Sharon DeMarse. 1993. The black men has
wives and sweet harts [and third-person plural -s] Jest like the white men.
York: Rodopi.
Murray, Sir James Augustus Henry. 1873. The Dialect of the southern counties of
Orton, Harold, Wilfrid Halliday, Eugen Dieth, Martyn Wakelin, Michael Barry, Philip
Perkins, Mick R. 1983. Modal expressions in English. London: Frances Pinter and
Norwood.
Pietsch, Lukas. 2005. Some do and some doesn't: verbal concord in the north of the
Prasad, R and M. Strube. 2000. Discourse salience and pronoun resolution in Hindi.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik. 1985. A
University, UK.
Robinson, Chris. 2008. Wha's like us? (Say it in Scots). Edinburgh: Black and White
Publishing.
Roby, John. 1829. Traditions of Lancashire. London: Longman.
Romaine, Suzanne. 1980. The relative clause marker in Scots English: diffusion,
9:221-49.
Ruano García, Javier. 2007. Thou'rt a strange fillee: evidence for 'y-tensing' in 17th
288.
141-174.
Sebba, Mark. 2009. Spelling as a social practice. In Janet Maybin and Joan Swann
Shorrocks, Graham. 1996. The second person singular interrogative in the traditional
Motapanyane (eds), Microparametric syntax and dialect variation.
Shorrocks, Graham. 1999. Grammar of the dialect of the Bolton Area. Part I.
Shorrocks, Graham. 1999. Grammar of the dialect of the Bolton Area. Part II.
Siewierska, Anna and Willem B. Hollmann. 2005. Ditransitive clauses in English with
Smith, Jennifer and Sali Tagliamonte. 1998. We were all thegither… I think we was
17/2:105–126.
Smith, Jennifer, M. Durham and L. Fortune. 2007. 'Mam, my trousers is fa'in doon!':
Sparks, John. 2009. Spirit of Lancashire, 2nd revised edition. Wellington, UK:
Halsgrove/Pixz Books.
Tagliamonte, Sali and Jennifer Smith. 1998. Roots of English in the African American
Tagliamonte, Sali and Jennifer Smith. 1999. Analogical levelling in Samaná English:
the case of was and were. Journal of English Linguistics 27, 1. 8-16.
Tagliamonte, Sali, Jennifer Smith and Helen Lawrence. 2005. No taming the
Tagliamonte, Sali. 1998. Was / were variation across the generations: view from the
Tagliamonte, Sali. 2004. Back to the roots: the legacy of British dialects. Final report
Tagliamonte, Sali and Helen Lawrence. 2000. I used to dance, but I don't dance now:
de Gruyter.
Trudgill, Peter. 2008. Colonial dialect contact in the history of European languages:
37, 241–280
Van der Auwera and Plungian. 1998. Modality's semantic map. Linguistic Typology
2:79-124.
Vennemann, Theo. 2000. English as a Celtic language. Atlantic influence from above
and from below, in Hildegard Tristram (ed.), The Celtic Englishes II,
Visser, F. T. 1963–1973. An historical syntax of the English language. Leiden: E. J.
Brill.
University.
Wales, Katie. 2006. Northern English: a cultural and social history. Cambridge:
Wells, John. 1970. Local accents in England and Wales. Journal of Linguistics 6 (2):
231–252.
White, David L. 2002. Explaining the innovations of Middle English: what, where and
why. In Markku Filppula, Juhani Klemola and Heli Pitkänen (eds.), The Celtic
Joensuu Press.
Wolfram, Walt and Jason Sellers. 1999. Ethnolinguistic marking of past be in Lumbee
Wright, Laura. 2002. Third person plural present tense markers in London prisoners’
224.
Appendix A: Map of the old County of Lancashire
Below, I present a map of the old County of Lancashire before the 1974 boundary
changes (taken from Kain and Oliver, 2001).
Appendix B: Texts comprising Litcorp
Below, I present a list of the texts comprising the Lancashire dialect literature corpus
(Litcorp) used throughout this thesis:
Brierley, Benjamin. 1896. 'Ab o’th’-Yate' Sketches and Other Short Stories, volume 1.
Oldham: W.E. Clegg
Brierley, Benjamin. 1886. 'Ab o’th’-Yate' Sketches and Other Short Stories, volume 2.
Oldham: W.E. Clegg
Collier, John (also known as Tim Bobbin). 1846. Tummus and Meary. Manchester:
John Haywood
Saunders, Langford. 1911. Lancashire humour and pathos. Manchester: Fred Johnson
& Co.
Appendix C: Questionnaire – sociolinguistic information
Below, I present the sociolinguistic questionnaire used in conjunction with
acceptability questionnaires employed in this thesis:
Dialect Survey
Information about you...
..........................................................................................................................................
If you have not always lived in the same town/city/village, please specify where
else you lived and for how many years.
..........................................................................................................................................
2. How old are you? (If you would prefer not to say, please leave blank)
.........................................................................................................................................
3. Would you consider yourself to be a speaker of a particular English
dialect?
..........................................................................................................................................
4. If so, which dialect?
..........................................................................................................................................
5. How do you feel about your dialect, e.g. positive or negative? Is there
anything you particularly like or dislike?
..........................................................................................................................................
..........................................................................................................................................
..........................................................................................................................................
..........................................................................................................................................
........................................................................................................................................
Thank you!
Please proceed to complete the survey...
Appendix D: Questionnaire – content testing the NSR
Below, I present the acceptability questionnaire used to test the NSR. The format of
the questions is shown below. A list of sentences used to populate the survey is also
given:
Please read the sentences and rate how acceptable they are to you.
Please give each sentence a score between 1 and 5, with 1 being the least acceptable
or most unlikely to be used by you, and 5 being the most acceptable or likely to be
used by you.
For example, if you judged sentence A (below) to be very acceptable, you should
give it a score of 5, and so circle the number 5, as shown below.
A. ‘The man with the red hat sometimes goes into the shop.’
(least acceptable) 1 2 3 4 5 (most acceptable)
If you judge sentence B (below) to be very unacceptable, you should give it a score
of 1, and so circle the number 1.
B. ‘With sometimes red the hat into the shop goes the man.’
(least acceptable) 1 2 3 4 5 (most acceptable)
Given below is a list of the sentences that were used to populate the survey:
12. You only very occasionally asks me for help.
13. I is sometimes not sure about what he will say about all of the mistakes I
make.
14. You have lots left to do but you’re making good progress.
15. My friends wife does a cookery class at the community centre.
16. The other day they walks for three miles before they came to a post-box.
17. You and your sister have got no manners and is very nasty to him sometimes.
18. We usually always do something nice at Christmas.
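As an illustration of how ratings gathered on the 1 to 5 scale above might be summarised per sentence, the following is a minimal sketch. It is not the procedure used in this thesis; the function name, the data layout and the ratings in the usage line are assumptions made only for the example.

```python
from statistics import mean, median


def summarise_ratings(responses):
    """Summarise acceptability ratings per test sentence.

    responses: dict mapping a sentence label (e.g. "12") to the list of
    ratings (integers from 1 to 5) given by the informants.
    Returns a dict mapping each label to its count, mean and median rating.
    """
    summary = {}
    for label, ratings in responses.items():
        if not ratings:
            raise ValueError(f"no ratings recorded for sentence {label}")
        for rating in ratings:
            if rating not in (1, 2, 3, 4, 5):
                raise ValueError(f"rating out of range for sentence {label}: {rating}")
        summary[label] = {
            "n": len(ratings),
            "mean": mean(ratings),
            "median": median(ratings),
        }
    return summary


# Hypothetical ratings from four informants for sentence 12 (invented values).
example = summarise_ratings({"12": [1, 2, 2, 5]})
```

Because the scale is ordinal, the median is reported alongside the mean; which of the two is more appropriate depends on the analysis intended.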
Appendix E: Ellipsis test sentences
Below, I present the list of elliptical sentences that was used to test the
acceptability of these constructions to Lancashire speakers in Chapter 3:
Appendix F: Questionnaire – content testing zero relatives
Below, I present the acceptability questionnaire used to test zero relatives. The format
of the questions is shown below. A list of sentences used to populate the survey is
also given:
Please read the sentences and rate how acceptable they are to you.
Please give each sentence a score between 1 and 5, with 1 being the least acceptable
or most unlikely to be used by you, and 5 being the most acceptable or likely to be
used by you.
For example, if you judged sentence A (below) to be very acceptable, you should
give it a score of 5, and so circle the number 5, as shown below.
A. ‘The man with the red hat sometimes goes into the shop.’
(least acceptable) 1 2 3 4 5 (most acceptable)
If you judge sentence B (below) to be very unacceptable, you should give it a score
of 1, and so circle the number 1.
B. ‘With sometimes red the hat into the shop goes the man.’
(least acceptable) 1 2 3 4 5 (most acceptable)
Given below is a list of the sentences that were used to populate the survey:
Appendix G: Sample text from Lancashire Fairytales
Below, I present the three excerpts from Lancashire dialect writing collected to form
Lancashire Fairytales and used in Chapter 5:
(2) (non-Lancs_0023)
Once upon u time, there were a right pretty girl named Sleepin Beauty. Sleepin
Beauty had been put into sleep by Wicked Witch. One day, a Hansome Prince
come along, and gave her a smacker right ont lips! Sleepin beauty woke up
and lived happily ever after with prince int castle.
(3) (Lancs_0002)
Three little pigs. One day, three lickle pigs were flyin nest an meckin them
houses for't livin in. First lickle pig came across man wit' hay an says, "ay up
fettler, can thou gimme some hay for't house I'm meckin?" man says, "aye
lad", an lickle pig mecks house of hay. Second lickle pig saw't man wit sticks
an says, "ay up fettler, can thou gimme some sticks for't meckin me house?" an
man says, "aye lad" an lickle pig mecks house of sticks like. Third lickle pig
sees man wit great big stones an says, "ay up fettler, can thou gimme some
great big stones for't house I'm meckin?" an man says, "aye lad" an lickle pig
mecks house of stones. All of sudden, wolf comes t'village an starts chappin
doors. 'ee says t'first lickle pig, "Lickle pig! Lickle pig! Let me in! Let me in!"
an lickle pig says, "Not for't hair on me chin!" an wolf says, "Then i'll huff an
puff an blow yer house in!" an does it. Then 'ee cooks an ate lickle pig. 'ee says
t'necks lickle pig, "Lickle pig! Lickle pig! Let me in! Let me in!" an lickle pig
says, "Not for't hair on me chin!" an wolf says, "Then i'll huff an puff an blow
yer house in!" an does it gain. Then 'ee cooks an ate necks lickle pig. When 'ee
sees last lickle pig 'ee says, "Lickle pig! Lickle pig! Let me in! Let me in!" an
lickle pig laffs an says, "Not for't hair on me chin!" an wolf says, "Then i'll
huff an puff an blow yer house in!" but can't. So 'ee tries gain but still can't.
Then 'ee climbs up t'roof an jumps down chim-eny an right in't big pot o' hot
watter an lickle pig cooks an ate wolf.