Lancashire dialect grammar:
a corpus-based approach
A thesis submitted to Lancaster University
for the degree of Doctor of Philosophy
in the Faculty of Arts and Social Sciences.
Claire Dembry, BA, MA
Department of Linguistics and English Language

Lancaster University
March 2011
ABBREVIATIONS..................................................................................................... 7
ABSTRACT………………………………………..………………………………... 8
DECLARATION………………………………..…………………………………... 9
ACKNOWLEDGEMENTS........................................................................................ 10
CHAPTER 1. INTRODUCTION
1.1 Preliminaries………………………...……………………………………....... 11
1.2 Rationale…………………………...………………………………………….. 13
1.2.1 Lancashire area…………………….…………………………………. 15
1.2.2 Lancashire identity……………………………………………………. 17
1.3 Methodological approaches………..……………………………………….... 20
1.3.1 Corpora in sociolinguistics...………………………………………….. 22
1.3.2 Sound Archive corpus….……………………………………………... 23
1.3.3 Dialect literature in sociolinguistics...……………………………….. 25
1.3.4 Litcorp and Lancashire Fairytales…………………………………….. 27
1.3.5 Questionnaires and other methods………………………………......... 31
1.3.6 Choosing grammatical features……………………………………….. 32
1.4 Theoretical approach…………………………………………………………. 33
1.5 Overview and aims…………………………………………………………… 34
CHAPTER 2. RELATIVIZATION
2.1 Overview………………………………….…………………………………… 35
2.2 Analysis of relative clauses................................................................................ 35
2.2.1 Defining relativization constructions………………..………………... 35
2.2.2 Relativization types……………..…………………………………….. 37
2.2.2.1 Zero relatives………………………………………………... 38
2.2.2.2 Relativization in varieties of English……………………….. 40
2.2.3 Factors influencing relativizer choice………………………………… 41
2.2.3.1 Syntax: dependencies between RV and antecedent…………. 41
2.2.3.2 Semantic category of antecedent……………………………. 42
2.2.3.3 Restrictiveness (semantic scope) of the RC………………… 43
2.2.3.4 Other factors………………………………………………… 45
2.2.4 Diachronic change in relativization…………………………………... 46
2.2.5 Summary and research questions……………………………………... 46
2.3 Methodology....................................................................................................... 47
2.3.1 Rationale for methodology……………………………………………. 48
2.3.2 Corpora: Litcorp and Sound Archive…………………………………. 49
2.3.3 Questionnaires……………………………………………………….... 52
2.3.4 Classification and division of respondents……………………………. 54
2.4 Results and discussion....................................................................................... 55
2.4.1 Overview of corpus results…………………………………………… 55
2.4.2 Corpus results – restrictiveness of relative clause…….……………… 61
2.4.3 Corpus results – semantic category of antecedent …………………... 62
2.4.4 Questionnaire findings - distribution of relativizers ………………… 62
2.5 Concluding remarks………………………………………………………….. 65
CHAPTER 3. HAVEN’T TO
3.1 Introduction………………………………………………….………………... 68
3.2 Literature review............................................................................................... 69
3.2.1 Modality in Standard English………………………….……………... 69
3.2.2 Modals vs. semi-modals……………………………….…………….... 72
3.2.3 Recent changes to modal verbs in Standard English….……………… 73
3.2.4 HAVE to and HAVEn’t to in current Standard English ...……………..... 75
3.2.5 History of the HAVEn’t to construction…………………………….…. 77
3.2.6 The rise of periphrastic do……….…………………………………… 79
3.2.7 Related constructions………...……………………………………...... 80
3.2.8 Modals in varieties of English……………………………………....... 82
3.2.9 Summary……………………………………………………………… 83
3.2.10 Research hypotheses………………………………………………….. 84
3.3 Methodology…………………………………………………………………... 85
3.3.1 Introduction…………………………………………………………… 85
3.3.2 Corpus searches………………………………………………………. 85
3.4.1 Semi-modals and the BNC………………………………………….... 86
3.4.2 Corpus comparison of HAVEn’t to …………………………………… 87
3.4.3 Syntactic analysis of HAVEn’t to in Lancashire data …………………. 89
3.4.4 Semantic analysis of HAVEn’t to in the Lancashire data……………… 90
3.4.5 Explanations for semantic differences – constructional polysemy…… 93
3.5 Analysing the necessity/obligation construction family................................. 94
3.5.1 Introduction…………………………………………….……………... 94
3.5.2 Corpus results………………………………………………………… 94
3.5.3 Diachronic change – testing the Frequency Hypotheses.…………….. 95
3.5.4 Considerations and contradictions……….…………………………… 97
3.5.5 Semantic evidence……………………….…………………………… 98
3.5.6 Syntactic evidence……………………….…………………………… 100
3.6 Concluding remarks......................................................................................... 101
CHAPTER 4. VERBAL AGREEMENT AND THE NORTHERN SUBJECT RULE
4.1 Introduction………………..………………………………………………….. 105
4.1.1 Overview…………….………………………………………………... 106
4.1.2 A focus on the NSR….……………………………………………….. 109
4.1.3 Variation with BE….………………………………………………... 112
4.2 History of the NSR……………………...……………………………………. 115
4.2.1 Origins of the NSR……………….………………………………….... 115
4.2.2 Constructional competition……….…………………………………... 118
4.2.3 Salience…………………………….…………………………………. 123
4.2.4 Frequency of usage…………………...………………………………. 125
4.2.5 Summary and research questions…….……………………………….. 126
4.3 Methodology....................................................................................................... 127
4.3.1 Rationale for methodology………………….………………………... 127
4.3.2 Corpora…………………………………….…………………………. 129
4.3.3 Standard and nonstandard verb forms…….………………………….. 130
4.3.4 Questionnaires………………………………………………………... 132
4.3.5 Classification and division of respondents.…………………………... 134
4.4 Results and analysis.......................................................................................... 135
4.4.1 NSR results…………………………….……………………………... 135
4.4.2 Other constructions with 3sg agreement …………………………….. 142
4.4.3 Other nonstandard agreement patterns……………………………….. 149
4.4.4 Was/were variation…………………………………………………… 150
4.4.5 Other variation with was/were...……………………………………… 153
4.4.6 Questionnaire results…………………………………………………. 154
4.5 Concluding remarks…………………………………………………..……... 158
CHAPTER 5. SALIENCE
5.1 Introduction........................................................................................................ 161
5.1.1 What is salience?...………………………………………………….... 162
5.1.2 Salience, markedness and enregisterment…………………………….. 163
5.1.3 Accommodation – problems and solutions…………………………… 165
5.1.4 Aims…………………………………………………………………... 167
5.2 Rationale............................................................................................................. 169
5.2.1 Using dialect literature..…………………………............................... 170
5.2.2 Choosing constructions……………………………………………….. 170
5.2.3 Summary, research questions and hypotheses ………………………. 172
5.3 Methodology....................................................................................................... 172
5.3.1 Corpus methods………………………………………………………. 173
5.3.2 New corpus data – Lancashire Fairytales……………………............. 176
5.3.3 Interpreting corpus result……………………………………………... 179
5.4.1 Features found across all corpora….…………………………………. 181
5.4.2 Features found in dialect literature………………………………….. 184
5.4.3 Features found in the most recent corpora...………………………….. 188
5.4.4 Other features…………………………………………………………. 188
5.4.5 Lancashire Fairytales – comparing Lancs and non-Lancs...………….. 191
5.5 Concluding remarks.......................................................................................... 194
CHAPTER 6. CONCLUDING REMARKS………………………………………. 198
REFERENCES……………………………………………………………………... 206
APPENDICES
Appendix A: Map of the old County of Lancashire…………………………… 222
Appendix B: Texts comprising Litcorp………………………………………... 223
Appendix C: Questionnaire – sociolinguistic information..………………….... 224
Appendix D: Questionnaire – content testing the NSR ………………….......... 225
Appendix E: Ellipsis test sentences……………………………..……………... 227
Appendix F: Questionnaires – content testing zero relatives ………...……… 228
Appendix G: Sample text from Lancashire Fairytales………………………… 230
List of tables
CHAPTER 1 INTRODUCTION
Table 1 Overview of data sources and variables……………………………….. 21
CHAPTER 2 RELATIVIZATION
Table 1 Frequency of relativizer in Sound Archive and Litcorp……………….. 55
Table 2 Frequency of zero relatives in a 45,000 word sample from each
corpus…………………………………………………………………... 58
Table 3 Variant spellings of what and that relativizers in Litcorp……………... 58
Table 4 Analysis of ut results in Litcorp……………………………………….. 60
Table 5 Frequency of restrictive and non-restrictive relative clauses by
relativizer………………………………………………………………. 60
Table 6 Frequency of relative clauses by animacy type………………………... 62
Table 7 Questionnaire results for zero relatives………………………………... 63
Table 8 Choice of relativizer by questionnaire participants……………………. 64
CHAPTER 3 HAVEN’T TO
Table 1 NICE Qualities of semi-modals as compared to ‘core’ modals in
Standard English (Quirk et al., 1985:140)…………………………… 72
Table 2 Negative forms of semi-modals in the BNC, with and without DO, raw
frequency results……………………………………………………….. 87
Table 3 Instances of forms of the HAVEn’t to construction in the Lancashire
data……………………………………………………………………... 88
Table 4 Difference in obligation type in HAVEn’t to constructions in
Lancashire dialect data (raw frequency results)……………………..… 91
Table 5 HAVEn’t to family of constructions (normalized frequency results)…… 95
Table 6 Obligation construction family and the NICE properties……………… 100
CHAPTER 4 VERBAL AGREEMENT AND THE NSR

Table 1 Paradigm of BE in Old High German, Old English and Modern
English..……………...…………………………………….................... 106
Table 2 Late Middle English present indicative agreement with FIND…………. 107
Table 3 Diachronic changes in the verbal agreement system in Northern
English…………………………………………………………………. 116
Table 4 Instances of the NSR in the Lancashire corpora……………………….. 137
Table 5 Agreement patterns with thou in the Lancashire corpora……………… 138
Table 6 Testing pronoun adjacency…………………………………………….. 143
Table 7 Testing non-pronominal subject adjacency……………………………. 144
Table 8 Frequency of nonstandard 3sg constructions analysed as either
habitual or historical present…………………………………………… 146
Table 9 Overlap between the NSR, habitual and the historical present
constructions…………………………………………………………… 148
Table 10 Was/were variation in the Sound Archive and Litcorp………………… 151
Table 11 Negated vs. non-negated nonstandard was/were results………………. 152
Table 12 Geographical region and dialect type of informant……………………. 155
Table 13 Median and mean acceptability score by all respondents, grouped
results…………………………………………………………………... 156
Table 14 Testing adjacent personal pronouns…………………………………… 157
CHAPTER 5 SALIENCE
Table 1 Definite article reduction/deletion...…………………………………… 181
Table 2 Was/were variation…….......................................................................... 182
Table 3 Past tense negator never...............……………………………………... 183
Table 4 Distribution of adverbial right + adjective…………………………….. 185
Table 5 Features predominately found in Litcorp..…………………………….. 186
Table 6 Other nonstandard features found predominately in Litcorp…………... 187
Table 7 Features found predominately in recent corpora...…………………….. 188
Table 8 Distribution of owt and nowt………………...………………………… 189
Table 9 A sample of nonstandard phonological features frequent in dialect
literature…………………………………………..…………………… 190
Table 10 Raw frequencies of nonstandard features in the Lancs and Non-Lancs
parts of Lancashire Fairytales…..……………………………………… 192
List of figures
CHAPTER 1 INTRODUCTION
Figure 1 Map of Lancashire ………………………………………………….. 15
Figure 2 Overview of data sources used in this thesis..………………………. 21
Figure 3 Geographical location of informants in the Sound Archive corpus… 24
Figure 4 Excerpt from Lancashire Pride (Thompson, 1945)………………… 29
CHAPTER 2 RELATIVIZATION
Figure 1 Most frequent relativizers in the Lancashire corpora……………….. 56
CHAPTER 3 HAVEN’T TO
Figure 1 The auxiliary verb-main verb scale (adapted from Quirk et al.,
1985:137)……………………………………………………………. 73
Figure 2 S-curve model of language change (reproduced from Kroch
1989:22)............................................................................................... 81
Figure 3 Possible diachronic change in the Lancashire corpus data………….. 96
Figure 4 Distribution of constructions displaying weak and strong obligation. 98
CHAPTER 5 SALIENCE
Figure 1 Nonstandard features examined in Chapter 5…..…………………… 161
Figure 2 Mapping salient constructions………………………………………. 168
Figure 3 Semi-phonetic respellings in Tummus and Mearey (Bobbin, 1846)... 170
Figure 4 Dialect literature task – Lancashire Fairytales……………………… 177
Figure 5 Interpreting corpus comparisons……………………………………. 180
Abbreviations
1 First Person
2 Second Person
3 Third Person
Neg Negative
sg Singular
pl Plural
Ø Zero (missing element)
Adj Adjective
AdvP Adverb phrase
NP Noun phrase
RC Relative clause
RRC Restrictive relative clause
RV Finite relative clause verb
NRRC Non-restrictive relative clause
ZR Zero relative
to-inf to infinitive
Litcorp Lancashire Literature Corpus
Sound Archive North West Sound Archive Corpus
Abstract
This thesis investigates a number of key grammatical features found in the previously
under-studied Lancashire dialect. While the primary aims of the study are without
doubt descriptive, a strong theoretical and methodological component to the
investigation is also present. Theoretically, this study is couched within the usage-
based approach to language (see e.g. Croft and Cruse, 2004: 291-327). It employs
innovative uses of new methodologies relating not only to a substantial spoken corpus,
but also to a newly collated corpus compiled from historical dialect literature texts.
Corpus resources are also supported by acceptability judgements and tasks which are
gathered from a large number of respondents using new techniques in order to
maximise the extent and significance of the data presented here.
This thesis details variation that is already well documented in other varieties
of English (e.g. relativization, verbal agreement), but differentiates itself by
highlighting nuances and complexities not previously considered, such as
semantic differences in the HAVEn’t to construction, constructional competition in the
Northern Subject Rule, and approaches to using corpora in measuring sociolinguistic
salience.
Underpinning the thesis is the idea that the interplay between non-standard
data and theoretical linguistics can be bidirectional, where theory can inform the
analysis of dialect data, and such analysis of dialect data can inform the formulation or
further refinement of new or existing linguistic theory (see also Hollmann and
Siewierska, 2011, Hollmann, to appear, and references cited therein).
The methods used here, and the research they support in the subsequent
chapters, emphasize the need for a broad range of resource
types in order to strengthen claims made in sociolinguistic research.
Declaration
I declare that this thesis is my own work and has not been submitted in substantially the
same form for the award of a higher degree elsewhere.
Acknowledgements
This thesis would never have got going, much less been completed, without the financial
support I received from the Arts and Humanities Research Council, for which I am
extremely grateful.
My utmost gratitude must be expressed to my supervisors, Professor Anna Siewierska
and Dr. Willem Hollmann, for their help and support during my doctoral studies, and
also before this during my time as an undergraduate in Lancaster. It is a great
sadness that Anna did not get to see the finished thesis.
I am grateful to many other people in the Department of Linguistics and English
Language at Lancaster University, in particular to Dr. Andrew Hardie and Dr.
Jonathan Culpeper for giving me so many opportunities to gain really valuable (and
very enjoyable) teaching experience throughout the duration of my PhD.
My gratitude goes also to the SCR at Fylde College, Lancaster University who gave
me the opportunity to remain engaged in college life within my role as Assistant
Dean. In particular, my thanks go to Dr. Matt Storey for his ever practical advice, and
especially to my fellow Assistant Dean and PhD student, Dr. Krishna Morker.
I am grateful to my new employer, Cambridge University Press, and in particular to
Ann Fiddes and Dr. Julia Harrison for their support, encouragement and patience in
the final stages of my work. My appointment in Cambridge has been a great incentive
to finish my research.
I am also grateful to all of my friends, and in particular to Toshi, Chris, Mags, Clairey
and Katie; Chris-M and Rachel, Nicola and Jenny; and to my new friends in
Cambridge, Liz and Laura. It goes without saying that I am incredibly grateful also to
Matt, for his love, support, suggestions, advice and help on a day-to-day basis
throughout my project.
Last but not least, I am indebted to all of my family for all of their continued
encouragement and belief in me. My biggest thank you must go to my Dad and Tracey
who have always supported me in ways too numerous to list. They both are a real
inspiration to me in my education and, more importantly, in my life more generally.
Chapter 1. Introduction
1.1 Preliminaries
This study investigates several key grammatical features of Lancashire dialect. While
the primary aims of the study are descriptive, there is also a strong theoretical and
methodological component to the investigation. Theoretically, the study is couched within
the usage-based approach to language (see e.g. Bybee, 1985; Langacker, 1987; Croft
and Cruse, 2004: 291-327). With respect to methodology, it is innovative in that it
uses not only a substantial spoken corpus but also a corpus compiled from historical
dialect literature texts. These corpus resources are also supported by acceptability
judgement tasks in order to maximise the extent and significance of the data presented
here.
The grammatical features investigated are among those well known as
exhibiting dialectal variation in the British Isles (outlined in general by e.g. Kortmann
et al., 2004; Trudgill, 1999), and are therefore of prime interest for experts in dialect
grammar, be it of English (e.g. Kortmann et al., 2000-2005, Freiburg English Dialect
Corpus) or in general (e.g. Barbiers et al., 2006, Syntactic Atlas of Dutch Dialects;
Vangsnes et al., 2005-2010, Scandinavian Dialect Syntax Project).
Chapter 2 examines the structure and use of relative clauses and explores the
extent to which the wh-relativization strategy typical of Standard English (e.g. as
outlined by Quirk et al., 1985:1252) has made inroads into Lancashire dialect. This
chapter also provides an interesting account of the potential diachronic changes that
have occurred in Lancashire dialect with respect to the use of zero relatives, an issue
notoriously difficult to investigate with standard methodology.
Chapter 3 provides an analysis of the HAVEn’t to construction, a polysemous
construction found in Lancashire with modal meanings that can be similar to both
DOn’t HAVE to and mustn’t depending on context of use. The analysis examines how
these modal meanings interact with other semantically related constructions (e.g.
SHOULDn’t, MUSTn’t, NEEDn’t) and evolve in the process of language change.
Chapter 4 considers verbal agreement in Lancashire, focussing on the so-called
Northern Subject Rule. Few in-depth analyses of the NSR have been conducted in any
one region, and currently no such analysis for Lancashire exists. Most studies do not
address variables such as the possible interplay between the NSR and other similar
constructions (e.g. habitual or historical present). Cognitive-perceptual factors such as
salience or frequency of usage as potential explanations for this agreement variation
are also frequently overlooked in the current literature. This chapter analyses corpus
data from spoken and written sources and is supported by acceptability judgements
from questionnaire results in order to explore both the possible instances of the NSR
and its acceptability in Lancashire. A broader question relating to synchronic theories
of language variation is also investigated; i.e. to what extent is variation in syntactic
and morphological phenomena (such as the NSR) the result of rules or constraints,
and to what extent is this variation more idiosyncratic, unpredictable and region- or
community-specific?
Chapter 5 proposes an entirely new methodology to test
sociolinguistic salience by contrasting corpus data of different types. The
methodology proposed here asserts that grammatical features which can be considered
as salient in Lancashire can be identified by comparing the differences between the
language that is produced by Lancashire dialect speakers (found in the spoken Sound
Archive corpus) and that which is perceived by them to be dialectal (found in the
written dialect literature). The methodological difficulties posed by the investigation
of salience (and indeed in outlining and working with the concept of salience more
generally) are also addressed.
Overall, the thesis explores the idea that the interplay between non-standard
data and theoretical linguistics can be bidirectional: theory can inform the study of
dialect data, and dialect data can inform the formulation or further refinement of
linguistic theory (see also Hollmann and Siewierska, 2011; Hollmann, to appear and
references therein). Many approaches to grammar, e.g. Functional Grammar (see e.g.
Dik and Hengeveld, 1997), Construction Grammar (see e.g. Croft, 2001; Goldberg,
2002) and Cognitive Grammar (see e.g. Langacker 1991; Croft and Cruse, 2004), have
developed theories of language variation and change based on the analysis of data,
although rarely is any of this data drawn from so-called ‘non-standard’ sources. Other
considerations relating to sociolinguistic and regional differences such as salience
(Kerswill & Williams, 2002) and enregisterment (Agha, 2003) may add to existing
linguistic theory and so are explored with respect to the Lancashire data considered in
this thesis.
1.2 Rationale
Lancashire dialect is a good choice for studying grammatical variation in non-standard
data, not least for reasons of locality. Lancaster University currently holds recordings
from the North West Sound Archive (outlined further in §1.3.1) and also provides
easy access to Lancashire dialect speakers in the local area.
Until recently, the Lancashire dialect remained relatively uninvestigated (aside
from the work of local interest groups and dialect societies, which continue to be popular
and well supported in the local area; see e.g. http://www.thelancashiresociety.org.uk and
http://www.edwinwaughdialectsociety.com). Although a number of Lancashire informants are
included in the Survey of English Dialects (Orton et al., 1962-71), this source does not
provide enough instances of a wide enough range of grammatical features to warrant
any region-specific conclusions. This is in part due to data collection
techniques (as outlined in e.g. Chambers and Trudgill, 1998). Lancashire results from
the SED are included in a number of cross-regional studies (e.g. Bresnan, Deo and
Sharma, 2007; Pietsch, 2005; Herrmann, 2005), but further data is required in order to
provide a fuller analysis of variation in this region. Recently, studies of the
Lancashire dialect have been conducted, such as those by Hollmann and Siewierska
(2006, 2007, 2011) and Siewierska and Hollmann (2005). Related local varieties have
also recently received some attention, e.g. Bolton (Shorrocks, 1999; Moore, 2004).
While the analysis of grammatical variation in Lancashire is still relatively rare,
phonological variation has been discussed in more detail, e.g. by Vivian (2000),
Barras (2006) and more generally by e.g. Watson (2006), a trend that appears to be
typical of sociolinguistics as a whole. As outlined in Hollmann and
Siewierska (2006:22), grammatical variation has by and large been overlooked due to
a number of factors, such as the dominance of a non-variationist approach to grammar
(namely, the Generative paradigm) and the unavailability of (sufficient quantities of)
suitable data.
This thesis builds upon previous research both in this region and in varieties of
English more widely in order to consider how dialect data can provide new insights
into cognitive and theoretical linguistics whilst also giving a descriptive account of the
language used by speakers in the Lancashire area.
1.2.1 Lancashire area
Lancashire is situated in the Northwest of England, and is bordered to the north by the
county of Cumbria; to the east by the counties of North and West Yorkshire; and to
the south by the metropolitan counties of Greater Manchester and Merseyside. The
current county boundaries for Lancashire are shown in Figure 1.
FIGURE 1. MAP OF LANCASHIRE
Before the 1974 local government reform, the County of Lancashire also encompassed
towns now situated in other surrounding counties (see Appendix A for a map of the
old County of Lancashire). The towns of Bury, Bolton, Oldham, Rochdale, Salford,
and Wigan are now part of Greater Manchester but were once at the heart of
Lancashire’s cotton and milling trade (along with other towns such as Burnley and
Chorley, which remain in the county of Lancashire today). Other towns in the old
County of Lancashire also became parts of neighbouring counties as a consequence of
boundary reform, e.g. Knowsley, St Helens and Sefton now form part of Merseyside;
Warrington and Widnes are now part of Cheshire; and the Furness Peninsula,
Westmoreland and Cartmel are now part of Cumbria. As a result, both linguistic and
cultural influences in modern-day Lancashire can be expected from these surrounding
towns and counties. Lancashire’s longest border is with Yorkshire, and parallels
between the language used in these two locations have been noted in a number of
studies (e.g. Tagliamonte, 1998; Tagliamonte and Lawrence, 2000).
The landscape, industry and population size of towns in the County of
Lancashire vary significantly. While the north of Lancashire is largely rural and in
some parts very sparsely populated (e.g. Carnforth, Silverdale), the south and east are
more densely populated and contain primarily industrial or formerly industrial towns
(e.g. Burnley, Chorley) which perhaps are influenced (both linguistically and
culturally) by their neighbours in Manchester and Liverpool. Although grammatical
variation has yet to be extensively investigated in Lancashire (though some inroads
have been made by Shorrocks, 1999), phonological variation has been identified. A
phonological difference that has been noted is rhoticity. Although sometimes
considered as a typical feature of ‘Lancashire’ (e.g. by Wells, 1970), rhoticity shows
phonemic variation within the county boundary, as outlined by Beal (2004:130). It is
found in south and east Lancashire, e.g. in the towns of Burnley, Blackburn, and
Accrington (Barras, 2006), but is absent in many other places within the County of
Lancashire, e.g. Lancaster, Preston, Blackpool. Grammatical variation is likely to
display sub-regional differences too. Phonetic and perhaps grammatical
variation within county boundaries such as this is far from uncommon (compare the
considerable variation found in the new county of Tyne and Wear, e.g. by
Burbano-Elizondo, 2008). Although an intra-regional grammatical study may uncover
interesting variation and possible levelling and diffusion of nonstandard features from
area to area, this variable is not considered within the scope of this study. Spoken
corpus data used in this study is taken from a selection of informants living in various
towns in modern-day Lancashire (see §1.3.2 for further details) and all results from
the corpus data are considered to represent Lancashire. In this thesis Lancashire refers
to the cultural-linguistic area rather than being fixed immovably to any county
boundary.
1.2.2 Lancashire identity
The link between language and identity is well attested in the literature (e.g. Bucholtz
and Hall, 2003; Holmes, 1997; Schiffrin, 1996). The concept of enregisterment
describes the definition and identification of a regional variety as ‘a linguistic
repertoire differentiable within a language as a socially recognised register’ (Agha
2003: 231). As Beal (2006) found for Sheffield and Newcastle, the Lancashire
dialect is the subject of a number of humorous books, guides and glossaries such as
Completely Lanky (Dutton, 2006) and Lanky Twang (Freethy and Scollins, 2002).
Dialect writing is also frequently found in the region in collections of ‘traditional’
dialect poetry, stories and songs (some of which are utilized in this study, see §1.3.2
for further details). Many other volumes exist detailing cultural and historical
traditions, e.g. Traditions of Lancashire (Roby 2005); Favourite Lancashire recipes
(Baldock and Wood, 1995) and The spirit of Lancashire (Sparks 2009). Lancashire
merchandise is also found in gift shops and tourist information centres across the
region, often with “I love Lancashire” slogans emblazoned on various mugs and tea
towels (along with the more imaginative car stickers “Lancashire. There Will Be
Blood….pudding” referring both to the 2007 film by Paul Thomas Anderson and the
local delicacy, black pudding). This suggests that Lancashire has a defined set of
cultural and linguistic norms, certainly for the speakers of this variety, and that an
awareness of these norms may impact on language use in this region (see Hollmann
and Siewierska, 2011 for a concrete albeit very tentative suggestion in this direction,
concerning definite article reduction in the region).
In order to explore how the Lancashire dialect is perceived by Lancashire
speakers, attitudinal data was collected from all informants who completed the
acceptability questionnaires and/or tasks that were used later in this study (see §1.5.3
for more details on this approach). These questionnaires elicited sociolinguistic
information such as age, location and also attitudinal information from the informants
(alongside, of course, the specific test questions in the main part of the questionnaire).
The inclusion of attitudinal questions aimed to test how the informants perceive the
Lancashire dialect and to uncover more about Lancashire identity and how this might
fit with the language use reflected in the questionnaires themselves. No specific
questions were asked about neighbouring regions (so as not to influence any response)
but instead informants were invited to respond to the open question “how do you feel
about your accent/dialect?” More than 100 people who identified themselves as
Lancashire dialect speakers left a response to this question (a comparison is drawn
here with those informants also living in Lancashire who considered themselves not to
speak with any regional dialect, see §1.3.3 for more on this distinction). Around 65
gave positive responses, 20 gave more negative replies and around 15 gave neither
a positive nor a negative answer. A selection of the positive responses is presented in
(1-5).
(1) “Positive. I think it sounds friendly. It's part of my identity and people like it
or think it is funny. People instantly know where I'm from. It never sounds
pretentious.” (Lancs015)
(2) “Gives a sense of individuality from other regions. It’s different to Geordies
or Scousers or Mancs - and much nicer!” (Lancs029)
(3) “Positive, I like the fact its not cockney, or brummie, and you can get away
with murder (not literally of course!!) down south because they think we're
simple country folk, little do they know!” (Lancs004)
(4) “If I meet someone new then they know straight away where I'm from when I
begin to talk, for me this is a positive thing because I am very proud of being
a Lancashire Lass.” (Lancs083)
(5) “I love my Lancashire accent, far better than any other. The only problem I
have with it living down South is getting people to understand what I am
asking for when I order a c-o-a-k-e (coke!)” (Lancs011)
Around 15 speakers categorise themselves as a ‘Lancashire lass’ or a ‘Lancashire lad’
in their responses and many more highlight the separation of Lancashire from other
neighbouring regions, as demonstrated in (2) and (3). This formulation of an ‘us’ and
‘them’ idea in the minds of speakers shows that speakers are aware of both geographic
and linguistic differences between Lancashire and other local varieties. Interestingly,
15 informants who identified themselves as Lancashire dialect speakers came from
regions now considered outside the County boundaries, typically from towns now
belonging to the northern part of Greater Manchester such as Bolton and Rochdale.
These results are not extensive enough to make generalizations about the status of
Lancashire dialect with respect to language contact and change, but indicate that
Lancashire dialect speakers themselves appear at least in some part unconstrained by
County boundaries.
There were of course other less positive views, mainly relating to speakers’
feelings of being portrayed as ‘common’ or ‘stupid’ or ‘poor’; these are shown in
(6-10). References to Lancashire speech sounding ‘guttural’ or ‘flat’, as typified by the
comments in (7) and (10), were also found.
(6) “I don't really like it when people say I have a strong Lancashire accent (it's
usually people from the south of England) because I don't want to sound
'common'.” (Lancs005)
(7) “It sounds very bland, in comparison to scouse. Although the fact i've been
brought up in the chav capital of Lancashire, i managed to speak rather 'posh'
for a blackpudlian anyway.” (Lancs079)
(8) “I particularly dislike that I may be perceived by others to be stupid because
of my northern accent, even by other northern people with less broad or no
accent.” (Lancs012)
(9) “I feel that in some ways northern accents (including Lancashire) are still
judged to be inferior or to indicate lesser intelligence or class standing, no
matter how many trendy regional people they put on the telly.” (Lancs050)
(10) “when growing up: to speak with a broad Lancashire accent was considered
'common' and restricted your position in the job market. I think it sounds
guttural and boring too.” (Lancs054)
It is evident from the above that the Lancashire dialect is a clear and distinct entity in
the minds of many Lancashire dialect speakers (both those contacted in this thesis and
beyond) and that Lancashire speakers are aware of social implications associated with
this regional dialect, be they positive or negative.
1.3 Methodological approaches
The data drawn upon in this thesis is outlined in the diagram in Figure 2 and expanded
upon in subsequent sections.
[Figure 2: diagram in which six data sources are arranged around the central node
“Analysis of grammatical variation in Lancashire”: Contemporary dialect literature
(Lancashire Fairytales); Acceptability tasks and questionnaires; Historical dialect
literature writing (Litcorp); Ad-hoc acceptability judgement tasks; Oral history
interviews (Sound Archive); Standard English reference corpus (BNC).]
FIGURE 2. OVERVIEW OF DATA SOURCES USED IN THIS THESIS
Further details on the size, collection dates and informants that contribute to the
various sources used in this thesis (as outlined in Figure 2) are shown in Table 1.
Source                     Collection   Approximate size   Number of    Informant origin (number)
                           date         (if applicable)    informants
Sound Archive              1970-1990    325,000            32           Lancashire (see Figure 3)
Litcorp                    1880-1945    500,000            6            Lancashire
Lancashire Fairytales      2009         60,000             95           Lancashire (52); Non-Lancashire (43)
Acceptability judgements   2009         -                  243          Lancashire (123); Other north (84); South (36)
Ad-hoc test group          2009         -                  10           Lancashire
Reference corpus (BNC)     1990s        100m               unknown      Mixed
TABLE 1. OVERVIEW OF DATA SOURCES AND VARIABLES
1.3.1 Corpora in sociolinguistics
The analysis of nonstandard regional dialects goes back at least to the nineteenth
century; see e.g. Chambers and Trudgill (1998) or Ihalainen (1994) for an overview.
Prior to the advent of large rapidly accessible, annotatable and searchable electronic
corpora in the 1960s, dialectology relied on the notes of fieldworkers. Most of the
analyses were restricted to lexical variation, often producing isoglosses and word
maps (such as those found in the original SED results, see Orton et al., 1962-71). The
advent of corpora in particular made new studies of grammatical variation
possible. As grammatical features typically show much less variation than
phonological features, an extensive amount of data is often needed in order to
find even a few instances of variation in one particular construction.
Currently a number of large spoken English dialect corpora exist, e.g. the
Freiburg English Dialect corpus (FRED) (Kortmann et al., 2002-2005) 2 which draws
on data from a number of regions in the UK; the Newcastle Electronic Corpus of
Tyneside English (NECTE) (Allen et al., 2006) 3; the Scottish Corpus of Texts and
Speech (SCOTS) (Corbett et al., 2004) 4; and the Limerick Corpus of Irish English
(L-CIE) (Farr, Murphy and O’Keefe, 2004) 5. This thesis uses over 800,000 words of
spoken and written corpus data, falling broadly into two parts - spoken data taken
from the North West Sound Archive (see e.g. Hollmann and Siewierska, 2006), and
dialect literature taken from (primarily) stories written by Lancashire speakers. The
first of these sources is now described in more detail.
2 http://www2.anglistik.uni-freiburg.de/institut/lskortmann/FRED/
3 http://research.ncl.ac.uk/necte/
4 http://www.scottishcorpus.ac.uk/
5 http://www.ul.ie/~lcie/
1.3.2 Sound Archive corpus
The Sound Archive corpus is a 325,000 word corpus held at Lancaster University
transcribed from oral history interviews held at the North West Sound Archive. 6 Of
the thirty-two Sound Archive recordings used in this thesis, seventeen were
transcribed entirely by me. The remainder were transcribed by an audio typist and
carefully checked and corrected by me. The Sound Archive recordings themselves (as
opposed to the transcriptions alone) were also used throughout this research in order
to double check any points that were found initially to be unclear. The recordings
were transcribed orthographically in standard British English; phonological variants
were typically not represented. When variants found in the recordings were
morphologically (or morpho-phonemically) determined, a consistent variant spelling
was used. This particularly applied to the use of /mɪ/ for the Standard English my
which was represented as me in the transcriptions, and definite article reductions
which were represented as t’, as shown in example (11).
(11) […] and me feet went from under me and t' axe went in me leg and there I were laid
on t' floor with axe in me leg (Sound Archive).
Local dialect lexis that did not appear in dictionaries was recorded consistently, e.g.
nobbut (meaning ‘no more than’, ‘nothing but’) and gradely (meaning ‘fine’ or ‘excellent’).
The Sound Archive interviews were conducted from 1970-1990 as part of a
local history project and involve speakers between the ages of approximately 55 and 80,
all native to Lancashire. The corpus comprises thirty-two speakers, originating
from both northern parts of Lancashire, e.g. Morecambe, Lancaster, Fleetwood (15
speakers) and more southern parts of Lancashire, e.g. Accrington, Chorley and
Burnley (17 speakers). As the recordings made for the Sound Archive were intended
as part of an oral history project rather than for linguistic research, unfortunately little
sociolinguistic information about the speakers’ background is available. The
distribution of Sound Archive informants is shown in Figure 3, where each green dot
represents one informant.
6 For further information on the North West Sound Archive, see
http://www.lancashire.gov.uk/corporate/web/view.asp?siteid=2856&pageid=4970&e=e
FIGURE 3. GEOGRAPHICAL LOCATION OF INFORMANTS IN THE SOUND ARCHIVE CORPUS
Interviews range between 3,000 and 17,000 words in length. Speakers in the Sound
Archive corpus typically cover topics such as agriculture, wartime, farming and
fishing. An extract from the Sound Archive corpus is shown in example (12).
(12) No it were ni-- it were nice because they had them big pipes ‘cos we had them
big pipes in t’ greenhouses up smallholdings you know, them big, must have
been coal mustn’t it, anthracite coal yeah. And teachers had er, their room it
were in top of a buil-- at er Burnley Wood School, it were at top of one of er
buildings. And if I were wet through we used to have a change of clothes. See
we hadn’t to sit in them. And I used to have to stay at er er dining room on
erm Oxford Road to er have dinner. And then when it were winter time and it
were right dark we used to get out about three o’clock or four or well before
four o’clock. (Sound Archive)
While oral history interviews are a good source of relatively unrestricted speech, this
corpus data is strongly biased towards past tense constructions. Attempts have been
made to compensate for this bias by employing additional analytical
perspectives (to be described in §1.3.3-5). The Sound Archive gives a snapshot of
Lancashire dialect as used by its speakers towards the end of the last century. If
analysed alone, in isolation from other data, it would allow only synchronic
descriptive observations to be made. However, by combining the data from the Sound
Archive corpus with that drawn from elicitation (as advocated by e.g. Hollmann and
Siewierska, 2006 and described with respect to this thesis in §1.3.5) and dialect
literature, a more comprehensive picture of variation in Lancashire emerges.
1.3.3 Dialect literature in sociolinguistics
Broadly speaking, dialect literature is here intended to mean stories and narratives
written with the intention of representing the dialectal speech of a region, by writers
from that region. This is different from the orthographic representation of dialect
speakers in literature more generally (e.g. Charles Dickens’ representation of
Lancashire in Hard Times or Irvine Welsh’s representation of Scottish spoken in
Edinburgh in Trainspotting). These two are termed by Shorrocks (1996) as ‘dialect
literature’ and ‘literary dialect’ respectively; this study concerns only dialect literature.
The history of writing in dialect is extensive, but despite this, its use in linguistic
research is relatively recent. A number of studies have considered historical dialect
texts (and a number of those texts are from Lancashire), although none of these are in the
context of measuring language change (e.g. Shorrocks 2002; Ruano García 2007). It is
somewhat unsurprising also that most dialect literature investigations concentrate on
phonology (e.g. by Beal 2000; Honeybone and Watson, forthcoming). An important
aspect of dialect literature lies in the conscious respelling of words by the writers. If
semi-phonetic respellings can be considered as indications of a meaningful decision
by the author (as suggested by Sebba, 2009), then these features give an extra layer of
significance to the grammar and lexis chosen by the writer. While of course
respellings naturally lend themselves to a phonological analysis, I argue that they are
also interesting in terms of whether or not the distribution of these respellings may
interact with instances of nonstandard grammatical variation with reference e.g. to
salience (this is discussed further in Chapter 5).
More recently, as reported by Beal (2000, 2009) and Honeybone and Watson
(forthcoming), contemporary humorous dialect literature in the form of glossaries
and books about regional dialects has become relatively common, e.g. for Scots - Wha’s
Like Us? (Say it in Scots) (Robinson, 2008); Geordie - Larn Yersel’ Geordie (Dobson,
1969); Cornish - Oall Rite Me Ansum!: A Salute to the Cornish Dialect (Merton and
Scollins, 2003). Resources such as these can provide insights into language choices
made by these writers, but are not the focus of this study. It is interesting to note that
more recently still, even newer sources of dialect literature have begun to emerge via
the Internet. Spoof encyclopaedia pages and discussion forums are prevalent, with
many contributors not only sharing regional words and phrases, but also writing in
their regional dialect. 7 Alongside this, the social networking site Twitter has recently
provided instances of dialect writing that could merit further study. Shown in (13) is a
parody of the popular Newcastle-born singer/songwriter Cheryl Cole taken from
Twitter.
(13) Cheryl Kerl: Oh aye pet Ah embrace Europe me man, an Ah’m propah
multilingwill an aall. Ah speak English, Esperanteaur an uv coase Jawdee az
well (18th March 2011)
While there are, as yet, no examples of the Lancashire dialect being represented via
this medium, this nonetheless remains an interesting possibility for future research.
Currently, much of the research into dialect literature has been small scale,
confined to one or two texts per study. There has been no extensive corpus-based
analysis of a collection of dialect literature and it has not been used to measure
language change. It is hoped that this may change, both due to the approaches
advocated in this thesis and to the newly available Salamanca Corpus, a digital archive
of English dialect text released in February 2011. 8
1.3.4 Litcorp and Lancashire Fairytales
The dialect literature found in stories and narratives from Lancashire cannot be
regarded as a record or transcription of the speech of Lancashire dialect speakers, but rather as a
record of the writers’ perception and representation of speakers at the time of writing.
Because of this, analyses of dialect literature can uncover the most salient or important
dialectal features as judged by these writers.
7 See the spoof Wikipedia page for the Lancashire dialect (i.e. Lanky Twang):
http://uncyclopedia.wikia.com/wiki/Lanky_Twang and forum threads discussing Lancashire dialect
such as this: http://www.redvee.net/forums/showthread.php?11439-lanky-twang
8 See http://salamancacorpus.usal.es/SC/index.html for further information.
The Lancashire dialect literature corpus (Litcorp) used in this study is larger
than the Sound Archive at approximately 500,000 words in length. It is compiled from
six books written in the Lancashire dialect by a variety of authors, sourced from
Lancaster University Library. The books were scanned electronically and the files
converted to plain text format using conversion software, so that the data can
be searched electronically, something not possible using the books themselves.
The dialect literature books were written in the period 1855-1945, and are narratives,
monologues and plays. Songs and poems were avoided due to possible interfering
factors such as rhyme. A full list of titles included in the Litcorp can be found in
Appendix B. As an example of the types of texts found in Litcorp, a short excerpt
from Lancashire Pride (Thompson, 1945) is included in Figure 4.
LANCASHIRE PRIDE
I
SPRING CLEANING

TOMMY GREENHALGH walked moodily into the bar of the
“Hark to Dandler.” His pal Jimmy Dearden was just “taking
the top off” his drink as Tommy arrived. “How do Jimmy,”
said Tommy. Jimmy wiped the froth from his exuberant
moustache and said “How do Tommy. Tha looks a bit
powfagged. Owt up?”
“Ah don’t know as there is,” said Tommy. “Ah’ve
come out o’ th’ road.”
“Out o’ th’ road o’ what?” said Jimmy?
“Out o’ th’ road o’ battle, murder an’ sudden death,”
said Tommy vindictively. “Ah’ve come out o’ th’ road of an
earthquake.”
“Well,” said Jimmy, “Tha’s reached the harbour o’
refuge.”
“For th’ time being,” said Tommy. “Just for th’ time
being.”
“Ha’ one wi’ me?” said Jimmy?
“Tha’s took th’ words out o’ me mouth,” said Tommy.
FIGURE 4. EXCERPT FROM LANCASHIRE PRIDE (THOMPSON, 1945)
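The kind of electronic search that plain-text conversion makes possible can be illustrated with a minimal keyword-in-context (concordance) sketch. This is not the software used to build or search Litcorp; the function name concordance, its parameters and the regular expression are assumptions made purely for illustration, and the sample text is taken from the excerpt in Figure 4.

```python
import re

def concordance(text, pattern, width=30):
    """Return keyword-in-context lines for every regex match in text.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Collapse the line breaks left over from the scanned page layout
    flat = " ".join(text.split())
    hits = []
    for m in re.finditer(pattern, flat, flags=re.IGNORECASE):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        hits.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return hits

# Two lines from the Lancashire Pride excerpt in Figure 4
sample = """“Ah don’t know as there is,” said Tommy. “Ah’ve
come out o’ th’ road.”
“Tha’s took th’ words out o’ me mouth,” said Tommy."""

# Search for the reduced definite article spelled th’
hits = concordance(sample, r"\bth’")
for line in hits:
    print(line)
```

A search of this kind returns each hit with its surrounding context, which is what allows candidate instances of a construction to be inspected and classified by hand.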
In order both to capture additional data from Lancashire respondents and to
provide a counterbalance to the historical Litcorp, a new collection of dialect literature
has been compiled. Respondents were asked to write in what they considered to be
Lancashire dialect. This means that this corpus captures the current perception of the
grammatical repertoire of a Lancashire dialect speaker as considered by the
respondents (along, of course, with any phonetic representation they choose to
include). This fusion of elicitation and dialect writing is new and will provide a useful
contrast with other data sources.
In order to end up with stories of suitable length, the participants were asked
to reproduce a story that was familiar to them – a fairy tale. In building this new
corpus, Lancashire Fairytales, no restrictions were placed on the length, style or
number of stories a participant could write, or on what type of variation it should
contain (e.g. grammatical variation, lexical choices, and semi-phonetic spellings). The
task was completed by 53 Lancashire respondents and 42 non-Lancashire respondents,
with most contributors writing between 350 and 500 words each. Around 40 of the
respondents were undergraduate students at Lancaster University (split between both
Lancashire and non-Lancashire speakers), typically aged 18-21. Others were of a
mixed age range and were contacted through social networking websites. An example
of the texts produced by the informants is given in (14) (further instances are found in
Appendix F).
(14) […] the prince, he were broken hearted, and he says, “i’m gonna find me
lovely lass, im gonna search all round kingdom!” And off he went down
t’road, holdin onto the clog that she’d left ont ground […] (Lancashire
Fairytales)
In addition to the corpus resources described in previous sections, this thesis
also employs, from time to time, the British National Corpus (BNC). The BNC is a
100 million word corpus drawn from both written and spoken language from a wide
range of sources collected during the 1990s. The BNC was designed to represent a
cross-section of British English. 9 The BNC does not form a considerable part of
this study, but is instead used at various points as a Standard English reference
corpus with which to compare the Lancashire data.
9 For further information on the BNC, please see http://www.natcorp.ox.ac.uk
1.3.5 Questionnaires and other methods
Questionnaires are used in this study in order to both include the perceptions of
more modern speakers, and to allow a (tentative) further time depth comparison with
the corpus data. The questionnaires that I devised tested variables such as the
possible morphosyntactic limitations (or constraints) to the NSR and the acceptability of
relative clauses. The questionnaires are also useful in exploring present tense
constructions further, in order to compensate for the dominance of past tense
constructions in the corpora. The questionnaires also targeted different groups, often
contrasting, for instance, Lancashire and non-Lancashire speakers. Copies of the
questionnaires used in this thesis can be found in Appendices B and D.
The questionnaire employs quantitative questions where participants are asked
to judge sentences on a five-point scale, with 1 being the least acceptable to them and
5 being the most acceptable, e.g. (15).
(15) ‘They have a shop of their own and is very well off.’
(least acceptable) 1 2 3 4 5 (most acceptable)
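Judgements on a five-point scale of this kind can be summarised per respondent group before the groups are compared. The following is a minimal sketch only: the ratings are invented for illustration and are not taken from the questionnaire data, and the function name summarise and the group labels are assumptions.

```python
from statistics import mean, median

# Hypothetical ratings (1 = least acceptable, 5 = most acceptable) for a
# single test sentence; invented for illustration, not thesis data.
ratings = {
    "Lancashire": [4, 5, 3, 4, 4],
    "Non-Lancashire": [2, 1, 3, 2, 2],
}

def summarise(scores):
    """Return the mean and median of a list of 1-5 acceptability ratings."""
    if not all(1 <= s <= 5 for s in scores):
        raise ValueError("ratings must lie on the five-point scale")
    return {"mean": mean(scores), "median": median(scores)}

summary = {group: summarise(scores) for group, scores in ratings.items()}
for group, stats in summary.items():
    print(f"{group}: mean {stats['mean']:.1f}, median {stats['median']}")
```

Reporting the median alongside the mean is a cautious choice here, since points on an acceptability scale are ordered but not necessarily evenly spaced.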
The questionnaire data is compared to the results from both corpora. Since all
three data sources were gathered by different means and cover different time periods,
a combination of these results should both substantiate any claims made and show
how the combination of methods is a useful approach in sociolinguistics.
As mentioned previously, sociolinguistic information about the questionnaire
respondents was also gathered. This information is used in order to divide the
respondents into different categories throughout this work, depending on the various
aims of each chapter. For example, in Chapter 4 both Lancashire and non-Lancashire
respondents are compared. Respondents who identified themselves as having a
Lancashire dialect in answer to the question ‘do you have a particular dialect? If yes,
how would you describe it?’ are classified as ‘Lancashire, dialect speakers’. Other
speakers identified themselves as living (or having lived) in a Lancashire town or
village for the majority of their life, but suggested that they did not have a Lancashire
dialect. These speakers are classified as ‘Lancashire, non-dialect speakers’ in the
results in order to determine whether or not there is a tangible difference between
these two groups.
1.3.6 Choosing grammatical features
Based on the attitudinal data presented in (1-10) it seems that modern Lancashire
dialect speakers are aware of phonological or lexical/vocabulary-related
features of their language use. It is also clear that grammatical variation exists within
this region, some of which is relatively well known (such as definite article reduction,
including deletion; see Hollmann and Siewierska, 2011 for discussion). Variation and
change are not characteristic of one area of language to the exclusion of others:
phonological, lexical and discourse variation are also associated with regional
variation and are linked together. Phonological features (as represented through
nonstandard spelling) and lexical choice are not considered at length in this thesis but
are discussed in places throughout this work, mainly in §5.4.7.
This thesis avoids grammatical features that have been the focus of previous
studies in Lancashire e.g. definite article reduction/deletion (Hollmann and
Siewierska, 2006; 2011); ditransitives (Siewierska and Hollmann, 2005); and
possessive me (Hollmann and Siewierska, 2007). (The exception is Chapter 5, where
variation in Lancashire is considered widely, with respect to salience.) Initial
fine-grained analyses of the corpus data revealed numerous grammatical and spelling
variants. Many of these are utilized in later chapters.
1.4 Theoretical approach
Many approaches to dialectology and sociolinguistics examine language variation and
change with respect to factors such as language communities, prestige, and language
contact (see e.g. Milroy and Milroy, 1992; Labov, 2006; Trudgill, 2008). More
recently however, the usage-based model has received some consideration within the
field of sociolinguistics (see Hollmann and Siewierska, 2011, for a discussion of this).
The usage-based model suggests that the relationship between grammatical
knowledge and language use is sensitive to frequency of usage. This means that token
frequency is crucial in the organisation of linguistic knowledge (Bybee, 1985;
Langacker, 1987; Croft and Cruse, 2004: 291-327). It therefore follows, for example,
that language structures that are used more often (and therefore have a high token
frequency) may become more reinforced in the minds of speakers. This entrenchment
of particular constructions may therefore, in turn, impact upon processes such as
language change (i.e. it is possible that entrenched constructions may resist language
change).
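The notion of token frequency can be made concrete with a short sketch that counts how often each word form occurs in a stretch of text. This is an illustration only and not the procedure used in this thesis: the function names and the deliberately simple tokenisation are assumptions, and the sample is a short stretch of example (12).

```python
import re
from collections import Counter

def token_frequencies(text):
    """Count how often each word form (token) occurs in text.

    Tokenisation here is deliberately crude (lower-cased runs of letters
    and apostrophes); it is an assumption made for illustration.
    """
    return Counter(re.findall(r"[a-z’']+", text.lower()))

def relative_frequency(freqs, word):
    """Share of all tokens in the sample accounted for by word."""
    return freqs[word] / sum(freqs.values())

# A short stretch of example (12) from the Sound Archive corpus
sample = "it were nice because they had them big pipes ‘cos we had them big pipes"
freqs = token_frequencies(sample)

for word, count in freqs.most_common(4):
    print(word, count, round(relative_frequency(freqs, word), 3))
```

Relative rather than raw counts are used in the last step because corpora of different sizes can only be compared once frequencies are expressed as a proportion of the total.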
While the usage-based model has been applied to ‘standard’ language
varieties (e.g. Tomasello, 2003; Mukherjee, 2005), studies such as these rarely
consider any possible standard vs. nonstandard variation that may be present in their
data. Along with this, the usage-based model has often been disregarded by
sociolinguists working with nonstandard data. Recently a number of studies have
moved towards integrating the usage-based model with sociolinguistic theory by
suggesting that token frequency plays a role in aspects of variation in non-standard
varieties found in their data (e.g. Hollmann and Siewierska, 2007; Clarke and
Trousdale, 2009).
This thesis builds upon this approach by considering corpus frequencies as a
possible explanatory factor themselves, while also taking into account the more
‘traditional’ elements of sociolinguistic theory (social values, language communities,
prestige, language contact etc.). In doing so, this study contributes further to the
description of the interplay between corpus linguistics, nonstandard data and linguistic
theory.
1.5 Overview and aims
The present study links with and contributes to previous research in a number of ways,
combining empirical aspects of corpus research with the descriptive approach in order
to provide an account of this under-studied regional variety. The presented analysis of
grammatical features of the Lancashire dialect draws on extensive corpus data, as
outlined earlier, along with a large number of acceptability judgement tasks. The
examined features of Lancashire dialect provide a fruitful testing ground for theories
of language change, particularly for stages of grammaticalization, and for the relevance of
salience as interpreted within the usage-based framework (see e.g. Croft and Cruse,
2004: 291-327).
Overall, the thesis explores the idea that the interplay between non-standard
data and theoretical linguistics can be bidirectional and that a successful description of
a nonstandard dialect requires the application of several methodologies. In particular,
corpora of different types in combination with elicitation and acceptability tasks can
give the most constructive results when examining nonstandard varieties such as this.
Chapter 2. Relativization
2.1 Overview
Relative clauses (RCs) have been the subject of much research into varieties of British
English (e.g. D’Arcy and Tagliamonte, 2010; Kearns, 2007; Tagliamonte, Smith and
Lawrence, 2005; Beal and Corrigan, 2002; Herrmann, 2002; Fox and Thompson,
1990; Ihalainen, 1980). Although the Lancashire dialect region has been (briefly)
studied as part of wider investigation into British English (e.g. by Herrmann, 2005),
and certain claims have been made about the behaviour of particular RCs in this
region (e.g. by Shorrocks, 1999), currently no thorough investigation of this region is
available. This chapter tests whether or not claims made by other researchers are
supported by the Lancashire data examined in this thesis. More generally, this chapter
aims to set out the relativization patterns that are frequently found in Lancashire. As
part of this, the assertion that regional dialects deviate from Standard English with
respect to relativizer choice in particular contexts is tested (see e.g. Quirk et al, 1985;
Huddleston and Pullum, 2002:183). In order to consider these factors, data is
examined from Litcorp, Sound Archive, and an acceptability questionnaire. The
grammatical features which will be tested include types of relativizer (namely, the
distribution of ‘standard relatives’, zero relativizers and nonstandard relativizers) and
factors influencing relativizer choice (namely, syntactic function; semantic category of
the antecedent and restrictiveness). These concepts are discussed further in §2.2.
2.2 Analysis of RCs
2.2.1 Defining relativization constructions
Relative clauses consist of a finite relative clause verb (henceforth RV), and (usually)
a pronoun (henceforth called the relativizer); the RC requires a noun phrase or
pronoun (called the antecedent), to which it is syntactically linked. An example
is the phrase ‘the man who wasn’t there’, where ‘was’ is the RV, ‘who’ is the
relativizer, and ‘the man’ is the antecedent. Henceforth in examples, RCs are shown in
square brackets, antecedents in boldface, the RV is italicised and relativizers are
underlined, as shown in example (1).
(1) I used to go in a little toffee shop in Bridge Street and it was my wife [who
owned it] and I used to get my cigarettes there various times of the day
(Sound Archive)
The purpose of the RC is to provide (further) information about the antecedent,
which narrows the antecedent’s frame of reference. It is worth noting that the
relativizer takes on, or repeats, the semantic target of the antecedent, and as such, is in
itself “redundant”, and does not (normally) convey any new information, unless the
antecedent is extremely ambiguous. It is worth delineating more precisely the form of
RCs. RCs differ from main clauses, in that they may (a) begin with a relativizer; or, in
the absence of a relativizer, (b) bring about a different word order from that of a main
clause; for example, in the sentence ‘someone I know knows him’, there are two
adjacent subject nouns and two adjacent finite verbs, neither of which is typically an
acceptable main clause word order. The syntactic relation between the antecedent and
the RV may vary, as do other noun-verb relationships (e.g. subject, object, etc.); this is
described in more detail in §2.2.3.1. Standard English has the following relativizers: which
(implying that the antecedent is not a person), who, whom, whose (implying that the
antecedent is a person), that (the most frequent and not implying anything about the
antecedent), and finally, a null relativizer zero or Ø, permissible with both person and
non-person antecedents. In Standard English, the antecedent usually precedes the RC,
but may follow it, though such types of relativization are not considered in this
investigation.
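The components of an RC and the Standard English relativizer inventory described above can be sketched as a small data structure of the kind that might be used when coding corpus examples. This is an illustrative assumption rather than the annotation scheme used in this thesis: the class name, field names and the helper method are all hypothetical.

```python
from dataclasses import dataclass

# Standard English relativizers as listed above. None means the relativizer
# implies nothing about the antecedent.
STANDARD_RELATIVIZERS = {
    "which": "non-person",
    "who": "person",
    "whom": "person",
    "whose": "person",
    "that": None,
    "Ø": None,
}

@dataclass
class RelativeClause:
    """One coded RC (hypothetical coding scheme, for illustration only)."""
    antecedent: str
    relativizer: str           # "Ø" for a zero relative
    relative_verb: str         # the finite RV
    antecedent_is_person: bool

    def is_zero(self) -> bool:
        return self.relativizer == "Ø"

    def fits_standard_inventory(self) -> bool:
        """True if the relativizer is listed and agrees with the antecedent type."""
        if self.relativizer not in STANDARD_RELATIVIZERS:
            return False
        restriction = STANDARD_RELATIVIZERS[self.relativizer]
        if restriction is None:
            return True
        return (restriction == "person") == self.antecedent_is_person

# Example (1): "it was my wife [who owned it]"
example_1 = RelativeClause("my wife", "who", "owned", True)
print(example_1.fits_standard_inventory(), example_1.is_zero())
```

Coding each RC with the same fields makes it straightforward to count relativizer types and to cross-tabulate them against properties of the antecedent.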
Relativization is not the only method of conveying additional information
about a noun whilst including a verb phrase: this can also be done with a noun-verb
participle, as in “a biscuit [covered in chocolate]”, and noun-gerund patterns, as in
“someone [earning a living]”; in both these cases, it is possible to insert a relativizer
and a finite auxiliary verb to convert the structure into a relative clause, e.g. making “a
biscuit [that is covered in chocolate]”. This can also be achieved with noun-adjective
patterns, e.g. “a problem [solvable by hard work]” where the relativizer and the
auxiliary BE can be used in the relative clause, i.e. “a problem [that is solvable by
hard work]”. Another option, not always available in SE, is to use a preposition phrase
e.g. “a man [in need of help]”, and its relativization “a man [who is in need of help]”.
Although these semantically related constructions are not considered in the data
examined in this chapter, it is worth noting that this choice of conveying similar
semantic information with differing syntactic patterns may impact upon distributional
frequencies found later in the results, in this thesis as a whole, and of course in the
study of grammatical constructions in general.
2.2.2 Relativization types
Relativization strategies can vary according to the particular relativizer employed by
the speaker or writer. Certain relativizers are typically linked to certain conditions
(e.g. restrictiveness, as discussed later in §2.2.3). Aside from the wh-relatives and
that-relatives which are commonly described in the literature, further types of relativizers
are found, namely zero relatives (Ø) and non-standard relativizers. These are
discussed in subsequent sections.
2.2.2.1 Zero relatives
Zero relatives (ZRs) are RCs that are not introduced by an overt relativizer. In many
accounts of Standard English, ZRs are considered ungrammatical, e.g. by Quirk et
al. (1985:865), due to this “deletion” of the relativizer for subject relatives, as shown in
e.g. (2).
(2) No love, he wasn’t someone [Ø I really knew], no. (Sound Archive).
ZRs are referred to in the literature as the non-introduced relative (Mustanoja
1960), Presentational Amalgam Construction (Lambrecht, 1988) and contact clauses
(Erdmann, 1980; Auwera, 1984), but here I follow Fischer (1992) in using zero
relative. Most accounts of Standard English do not clearly outline the semantic and
syntactic circumstances where ZRs are acceptable (and indeed are frequently used);
there does not appear to be a consensus on the range of environments that ZRs are
able to occur in. Much of the literature describes a number of core examples, namely,
existential there, existential have, it clefts, and clauses in which the main verb
introduces an individual into the discourse. These are exemplified in (3-6)
respectively, with examples taken from the Lancashire data.
(3) Well everything had a season , I can't remember when it was , we used to play
topping whip, er marbles in the channel, er skipping, lots of skipping where
you dash in and there's a line [Ø 'd be waiting to jump in and out of the rope].
(Sound Archive)
(4) 1: Have your relations worked in the slaughterhouse, it just seems an unusual
pastime for a 10 year old?
2: I'd two brothers [Ø worked there]. (Sound Archive)
(5) It were ‘is father [Ø made toil of his holiday] in the hope of benefitting his
boy (Litcorp)
(6) I have heard of a schoolmaster [Ø taught his pupils to say it the same way].
(Litcorp)
Certain other less widely attested ZRs are also described, e.g. by Doherty (2000: 87), a
number of which are found in the Lancashire data, e.g. ZRs as a modifier of a that-phrase
and NPs headed by free choice any; these are shown in (7-8) respectively.
(7) An ee weret nobu’ trouble, that mon [Ø lived overt’ road from Mearey].
Alwus upter summat. (Litcorp)
(8) More than any place [Ø I 've ever been in in my life], it was full of plovers '
nests, plovers, tewits as we called them (Sound Archive)
It should be noted that there is a distinction between sentences in which the relativizer
has a subject vs. non-subject antecedent. This can be seen in examples (7) and (8)
respectively. In Standard English, zero relatives found with non-subject antecedents
(such as that shown in 8) are often perfectly acceptable, as suggested by e.g. Olofsson
(1981:94). These relative constructions therefore may not be a notable feature of the
Lancashire dialect in particular, but rather a construction often found in English more
generally.
In most of the research in varieties of English that covers ZRs, these
distinctions are not made. Herrmann (2005:35) discusses have and be existentials and
Anderwald (2004:189) examines only instances of existential there clauses. All of the
ZRs exemplified here are tested with respect to the Lancashire data, both in the
corpora and via an acceptability questionnaire completed by Lancashire dialect
speakers (see §2.2 for further information), in order to outline both the distributions
and the acceptability of these types of ZRs in Lancashire.
Many standard accounts of grammar (e.g. Huddleston and Pullum, 2002:1056)
describe ZRs as being nonstandard and primarily found in regional and/or informal
varieties of English (although as Tottie (1995) points out, zero relatives are persistent
in written English too). Such relativization has indeed been identified in regions of the
UK (e.g. by Tagliamonte, Smith and Lawrence, 2005; Beal and Corrigan, 2002). This
chapter explores ZRs in order to provide a descriptive account of their behaviour and
acceptability in Lancashire, and also to test if they have undergone any diachronic
change.
2.2.2.2 Relativization in varieties of English
Recent research has documented a number of ways in which varieties of English differ
from Standard English (and from each other) in their relativization strategies (e.g.
Ihalainen, 1980; Bailey, 1999; Beal and Corrigan, 2002; Herrmann 2002;
Tagliamonte, Smith and Lawrence, 2005; Kearns, 2007; D’Arcy and Tagliamonte,
2010). Many report that wh- RCs (namely those found with the relativizers who,
whom, which and whose) are most frequently associated with Standard English, while
other relativizers (such as that, ø, and the less frequent as) are linked more often to
dialectal speech (e.g. Quirk et al. 1985:1252; Herrmann 2002:94). In many recent
regional studies, a levelling towards these more standard wh-relative clauses in
regional dialects is described. This is, perhaps, expected due to dialect levelling and
possible weakening of regional varieties. In cross-dialectal studies of relativization
(such as Herrmann (2005), based on data from the Survey of English Dialects (Orton
et al. 1969-72)), non-standard relativizer at was found to be used in varieties of British
English. In studies that consider Lancashire data to some degree (e.g. Shorrocks,
1999), instances of what and of at relativization are pointed to as typical for this
region. Use of what relativization could be considered as relatively unusual in
northern English; Beal and Corrigan (2002) found this to be one of the least preferred
patterns in their study of the Northern towns of Sheffield and Newcastle.
These nonstandard relativization patterns are accompanied by nonstandard
relativizers. A number of the cross-linguistic studies mentioned include data taken
from the Lancashire region. Herrmann (2002:33), for example, considers at, ut and t’
as phonemic/phonetic variants of that. Whether this in fact holds for Lancashire is to
be determined (see §2.3). There is also disagreement among authors about whether at
is a separate particle of Scandinavian origin or whether, alongside ut and t’, it is a form of that.
Romaine (1982a:70) suggests that even if at was different, it has become mentally
merged with that over time. The distribution of these nonstandard relativizer forms
and patterns is explored in this chapter.
2.2.3 Factors influencing relativizer choice
A survey of the literature reveals that, most typically, forms of relativization are
influenced by various factors (both syntactic and semantic). Firstly, with respect to
syntactic function: what is the syntactic relation between the relativizer and the RV? Is
the relativizer the subject of the relative clause verb, the object, etc.? Secondly, with respect to
the semantic category of the antecedent: is it human or not? Finally, with respect to
restrictiveness: is the RC restrictive (RRC), or non-restrictive (NRRC)? An RRC
narrows the meaning of the antecedent, to avoid confusion with other possible
meanings, whereas an NRRC is merely supplying more information about the
antecedent, and is not intended to narrow the meaning. These factors are now outlined
in subsequent sections.
2.2.3.1 Syntax: dependencies between RV and antecedent
Relative clauses can be distinguished from each other according to the syntactic
relationship between the relative clause finite verb and the relativizer (or antecedent, if
there is no relativizer); these constructional differences are known as subordination
types. Much of the literature describes three basic types: nominal, adnominal and
sentential. Nominal RCs are those that function as a noun, e.g. can be subject or object
of a verb, and they have no antecedent, as shown in (9); adnominal RCs modify
nouns, and are perhaps the most common example of RCs, shown in (10); finally,
sentential RCs act like a separate sentence, and their antecedent is a preceding
sentence (11).
(9) [...] and er you went home and got [what was left for you] and you had that.
(Sound Archive).
(10) [...] then this letter arrived saying you 'd got a place at the secondary school
[which was in Preston] (Sound Archive)
(11) And slowly but surely, thankfully I gained their trust, [which is very very
important]. (Sound Archive)
Adnominal RCs are the prototypical RC, and will be the main focus of this chapter.
From this point, unless otherwise stated, RC refers only to adnominal RCs.
2.2.3.2 Semantic category of antecedent
Another important factor in relativizer choice is the semantic category of the
antecedent. Semantic categories are ways of grouping words according to the
attributes which describe what the word means; categories may be as general or
specific as is needed for a particular task. A noun referring to a human being can be
described as having the semantic category “person”, and this entails the attributes of
animacy or being alive, as well as more human specific attributes such as having
intentions and emotions. “Personality” is the state of having semantic category
“person”. Personality, or humanness, is a binary distinction, i.e. is the antecedent’s
head noun a person, or not? Antecedents having semantic category person often occur
in English with the relativizer who, and who only occurs in SE with a person
antecedent, so it would not be grammatical Standard English to say “*There’s a dog
[who goes for walks here].” On the other hand, the relativizer which can only occur in
Standard English with non-person antecedents.
2.2.3.3 Restrictiveness of the RC
Restrictiveness is an important aspect of relative clauses, and is a causal factor in
relativizer choice. As stated above, the restrictive/non-restrictive distinction (also
referred to as defining/non-defining, e.g. Carter and McCarthy 2006:566) concerns the
action of the RC on the frame of reference of the NP. If an RC is restrictive, it
distinguishes the head noun from any other nouns with which it may have been confused;
if an RC is non-restrictive, it gives additional information that is not required for
identification. This is demonstrated below with invented examples, where (12) is
restrictive and (13) is non-restrictive.
(12) My friend [who lives in Manchester] phoned me yesterday.
(13) My brother, [who wasn’t feeling well], didn’t go to work today.
In (12) the friend is being contrasted with the speaker’s other friends who do not live
in Manchester, whereas in (13), the fact that the brother is not feeling well is not used
in order to define which brother is being referred to, but is just adding incidental
information about that brother. Often, NRRCs are preceded by a pause in speech, or a
comma in writing, which is not the case for RRCs; this perhaps reflects the nature of
these RC types: RRCs are adding information needed for disambiguation of the
antecedent, so the information should be said sooner; on the other hand, NRRCs are
adding information not needed for identification, so that information can afford to
wait for a pause.
RCs are often distributed in patterns, e.g. certain relativizers prefer certain RC
types. Conventionally, NRRCs are said to require a wh-relativizer (Huddleston and
Pullum, 2002:1056). However, it is possible that both that and Ø may be used in non-
restrictive contexts. While this may be less frequent in Standard English, it is reported
as being possible in dialectal varieties such as in (14-15) taken from Herrmann
(2002:104).
(14) [...] I seen Eric Adams [that lived there], he said it come one Sunday
Dinnertime
(15) […] there was Mr McNaughton and Ben Weir from Kendal [Ø came round
buying horses]
Researchers often link NRRCs with the wh- relativizers, rather than with that
or zero (Huddleston and Pullum 2002:183). With regard to zero relatives, Quirk et al.
(1985:1985) suggest that ‘non-restrictive zero cannot occur’ (see §2.2.1 for a further
discussion of zero relatives). Such conclusions are most typically based on Standard
English alone and so are tested on the nonstandard data presented in this chapter.
It is important to note that there is a blurred boundary between restrictive and
non-restrictive relativization - it is not always clear in what way or to what extent the
noun phrase is being restricted. Indeed, unless all relevant facts are known about the
antecedent (and its candidates), it is impossible to tell for certain what patterns the
speaker is using. One kind of restrictiveness ambiguity arises where more than one
antecedent candidate (potentially an unlimited number) is being described; a number of such
sentences were found in the corpus data, an example of which is (16).
(16) Then I had another brother [what went to er America, nineteen twenty
three]. (Sound Archive)
From this example, it is unclear if the speaker is using the restrictive meaning, which
implies the speaker has other brothers who went to America, or if the speaker is using
the non-restrictive meaning, which is not explicit about other brothers, but may be used
to imply that the other brothers did not go to America. In cases like this, a look at the wider
context often reveals the intended meaning (which in this instance is non-restrictive,
i.e. does not imply any other brothers went to America), as shown in (17).
(17) There were, he were a poultry farmer. He were a loomer at first. Then he were
a poultry farmer, you see they were all allotments and poultry farms and pig
farms and er. I had another brother what er were dairyman at er Townley.
Then I had another brother [what went to er America, nineteen twenty
three]. (Sound Archive)
There were 27 results where ambiguity resolution was not possible, even by looking at
the wider text, and all of these were excluded from the results presented in this
chapter.
2.2.3.4 Other factors
A number of other factors (aside from those outlined above) have been found to affect
relativizer choice; these include: proximity of the relativizer and antecedent (Quirk
1957); length of the RC (Quirk 1957; Tagliamonte, Smith and Lawrence 2005); clause
complexity (Tagliamonte, Smith and Lawrence 2005); and discourse features (e.g.
information flow, Fox and Thompson 1990). It is also likely that social values ascribed
to particular constructions may impact upon their use within a speech community (as
outlined by Hollmann and Siewierska, forthcoming). These factors are not considered
under the remit of this analysis but are future research possibilities.
2.2.4 Diachronic change in relativization
There seems to be a general consensus that, in English, wh- pronouns gradually
replaced an earlier system where that was the primary relative marker (e.g. Mustanoja
1960; Romaine 1982; Montgomery 1989). By the late seventeenth century, the three
relative markers which, who, and that were used in much the same way as they are
today (as outlined in e.g. Quirk et al. 1985). While this may be the current situation for
Standard English, many regional varieties appear not to have implemented wh-
relativization to the same degree, or at the same pace, and many display a preference
for the older that-relativization strategy. That-relativization is also reported as being
favoured in spoken rather than written language (e.g. Quirk 1957; Romaine 1982), and
so we would therefore expect this pattern to play out in the Lancashire data.
2.2.5 Summary and research questions
This chapter examines how syntactic and semantic conditions may influence RC use
in Lancashire. Where possible, comparisons are drawn with previous research on
relativization in a number of related areas (e.g. Herrmann’s 2002 study which includes
a number of results from Lancashire and nearby Westmorland, and Tagliamonte,
Smith and Lawrence’s (2005) study which includes results from Maryport in
Cumbria). By examining corpus results both from the older Litcorp and the more
recent Sound Archive, it is possible to outline dialect specific trends in relativization.
The corpus results are supported by questionnaire data, and this aims to identify usage
of zero relatives (ZRs), a feature often difficult to uncover from corpus-based
approaches alone, as there is no relativizer form to search for. The combination of
corpus and questionnaire data (as also demonstrated elsewhere in this thesis) allows a
more descriptive account of how RCs are used in Lancashire (see Hollmann and
Siewierska (2006) for a further discussion of using multiple methods).
Studies into RCs have outlined significant variation from that typically found
in Standard English. By examining Lancashire data of different types, relativization
strategies in Lancashire can be uncovered. The following research questions are
addressed:
(a) How does Lancashire differ from Standard English (and from other dialects of
English) with respect to relativization?
(b) What types of ZRs are found in Lancashire? How frequent are ZRs and how
acceptable are they to Lancashire dialect speakers?
(c) What types of non-standard relativizers are found in Lancashire? This includes
instances of semi-phonetic spellings as a possible marker of salience. (See
§2.4.1 and also Chapter 5 for a discussion of salience).
(d) Are any of the factors which can influence relativizer choice at work in the
Lancashire dialect? These factors, as stated above, are syntactic relation
between relativizer/antecedent and RV; semantic category of the antecedent;
and restrictiveness or non-restrictiveness of the relative clause regarding the
antecedent.
(e) Has there been any change in relativization strategies over time? This includes
both standard and non-standard features.
(f) Can questionnaire data tell us anything more about the acceptability of ZRs?
2.3 Methodology
As with all other chapters in this thesis, spoken transcribed data from the Sound
Archive corpus is analysed along with written data from Litcorp (please see §1.5 for
more details on these sources). In order to support the Sound Archive and Litcorp
analyses, a questionnaire exploring ZRs such as there’s a man down the street Ø goes
there too is targeted at Lancashire dialect speakers in order to test the acceptability of
different types of zero relativization (as outlined in §2.2.1). More specifically, this
questionnaire has two aims: to examine whether or not zero relativizers are
distinguished from one another as being more or less acceptable by Lancashire
speakers, and to test a number of ZRs that are less frequently included in discussions
on ZRs yet can be found in the Lancashire data (namely NPs headed by free choice
any and ZRs as a modifier of that-phrase). Results from these questionnaires are
presented in §2.3.5; a full copy of the questionnaire can be found in Appendix D.
2.3.1 Rationale for methodology
Studies of RCs have been completed in a number of other regions of the UK, e.g. in
Sheffield and Newcastle (Beal and Corrigan, 2002); in Scotland (Romaine, 1980); in
Somerset (Ihalainen and Harris, 1980) and in Dorset (Van den Eynden, 1992).
Currently no such analysis exists for the Lancashire region. Lancashire data has
however been included in a number of cross-linguistic studies (e.g. Herrmann, 2005),
and also mentioned briefly in other more general studies of Lancashire (e.g.
Shorrocks, 1999). Tagliamonte, Smith and Lawrence (2005) include information from
nearby Maryport in Cumbria and it might be the case that similarities between
relativization in these two neighbouring regions exist. Although the methodological
approach employed by Tagliamonte, Smith and Lawrence differs slightly from that
proposed here (their analysis includes RRCs only), it will nonetheless be interesting to
consider similarities and differences in these results.
Historically, studies of regional grammatical variation were based on elicited
data only, with the aim of compiling distributional maps and isoglosses (as found in,
e.g. the Survey of English Dialects project (Orton, 1969-71)). More recently, corpus-
based approaches have been employed in the study of dialect grammar, with many of
these considering relativization in their analyses. For example, Beal and Corrigan
(2002) draw on the Newcastle Electronic Corpus of Tyneside English (NECTE) and
the Survey of Sheffield Usage (SSU) in their study of these regions and Tagliamonte,
Smith and Lawrence (2005) use the 1 million word ROOTS corpus (Tagliamonte,
2001-2003) in their analysis of a number of UK regions clustered around the Irish Sea.
The use of corpora in these studies allows the retrieval of a large number of instances
of relative clause use, thus permitting distributional and frequency information to be
obtained.
The nature of RCs (and in particular, ZRs) means that they can often be
difficult to extract from corpus data (for a further discussion of this, see §2.2.2). It is
in instances like this where acceptability judgements and elicitation tasks can help to
corroborate existing corpus results and target forms absent in the data, thus providing
a rationale for their inclusion in this chapter.
2.3.2 Corpora: Litcorp and Sound Archive
As just mentioned, extracting RCs from corpus data is not an easy task, in part due to
relativizers fulfilling other grammatical functions in addition to being relativizers (e.g.
demonstrative pronouns, interrogative pronouns, demonstrative adjectives,
complementizers), but also due to the wide range of semantic interpretations involved
with related concepts such as e.g. restrictiveness. While overt relativizers can be
searched for individually (i.e. a search for the individual form which or whose etc.),
results obtained in this way of course include non-RC uses, as shown with what in
(18), where what is used not as a relativizer but as an interrogative pronoun.
(18) If you take the lads you grew up with that were fishing then, what were they
doing during the war? (Sound Archive)
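The form-by-form search described above, and the reason its output still needs sorting by hand, can be illustrated with a minimal sketch in Python. This is not the retrieval tool used for this thesis: the names RELATIVIZER_FORMS and concordance, and the 30-character context window, are assumptions made purely for illustration. Each hit is returned as a keyword-in-context row so that an analyst can judge whether or not it is an instance of RC use.

```python
import re

# Overt relativizer forms to search for one by one (illustrative list,
# taken from the forms discussed in this chapter).
RELATIVIZER_FORMS = ["that", "who", "whom", "whose", "which", "what", "as", "at"]


def concordance(text, forms=RELATIVIZER_FORMS, width=30):
    """Return (left context, hit, right context) rows for every whole-word
    match of a candidate form. The rows are candidates only: deciding
    whether a hit is a relativizer is left to manual analysis."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b",
        re.IGNORECASE,
    )
    rows = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows


example_18 = ("If you take the lads you grew up with that were fishing then, "
              "what were they doing during the war?")
for left, hit, right in concordance(example_18):
    print(f"{left:>30} [{hit}] {right}")
```

Applied to (18), the sketch returns both that and what as candidates. Only inspection of the context shows that what is interrogative here, which is why a purely form-based search cannot be used without manual classification.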
Although automatic parsing techniques can denote grammatical relations such
as RCs, software such as this is most typically trained on Standard English only. A
preliminary test using the Stanford Parser (see footnote 10) indicated that the nonstandard variation
found within the Lancashire corpus data (and in particular the considerable grammatical
and spelling variation found in Litcorp) led to inaccurate parsing and unreliable
results. Because of this, all overt relativizer forms are searched for in the corpora
individually. From these results each sentence is then manually sorted and either
included or discarded depending on whether or not it constitutes an instance of RC use
based on semantic and syntactic analysis. Those sentences containing RCs are then
subject to further analyses (e.g. for syntactic function, semantic category of the
antecedent, restrictiveness, etc.) in order to arrive at the frequency results presented
later in §2.3. The restrictive/non-restrictive distinction outlined in §2.1.3 provides further
motivation for carrying out this manual search. As I propose here that restrictiveness in
Lancashire may not be solely linked to relativizer type (i.e. syntactic factors),
primarily semantic interpretation is needed. This qualitative approach can
then form the basis of further quantitative study.
10 For more information, see: http://nlp.stanford.edu/software/lex-parser.shtml
Searching for relativizers in Litcorp presents further problems. As writers in
Litcorp aim to represent their dialect using semi-phonetic spellings, it is not possible,
for example, to search for ‘who’ and find every instance of what the writer may
intend as who, due to the numerous variant spellings. A small sample of the variant
forms of relativizers found in Litcorp is shown in (19-21).
(19) In a bit two farmers [wot lived at Marton] coom in, and they’d a collie dug
wi’ urn. They’d bin takkin’ cattle to Poulton (Litcorp)
(20) To this mon ([whooa I soon percciv ‘t wur th’ Clark]) th’ Cunstable tow’d it,
an he began o whackering as if id stowd is Geese (Litcorp)
(21) Well, I fairly chinked wi’ lowfin’ at that, for Jim were cleeon shaven, an’
what Bess had tan for a mustache were thoose three hairs on Jim’s wart [ut
had tickled her face!] (Litcorp)
Variant spellings were uncovered by means of a close examination of a sample
of the Litcorp text. Variant spellings found in Litcorp include whoa (who); whooa
(who I); whoos (whose); whot, wha, wot, ot (what); tha’ (that) and the slightly less
transparent ut, although not all of these spellings were consistently used to signal
relativizers; this is discussed further in §2.4.1 with particular reference to what, that
and ut.
ZRs cannot be retrieved from the corpus using the same methodology as their
overt counterparts since there is no search term to input (it is not possible to search for
an omission without it already having been annotated as such). While computational
methods have been used to automatically uncover ZRs from parsed corpus data with
some success (e.g. in the Penn Treebank project; see e.g. Marcus, Santorini and
Marcinkiewicz, 1993), data that is POS-tagged only, as is the case for the Lancashire
data, is more problematic. Lehman (1997:187-191) describes a methodology for using
POS tags to retrieve possible instances of ZRs from corpus data. Lehman is able to
narrow down the results by formulating a POS tag search for the construction: finite
verb + NP + finite verb (e.g. I have a home help [Ø does my shopping]). Although
this approach may not capture every instance of ZRs in the corpora (e.g. those with
more complex NPs with extensive pre- and postmodification), searching the corpora
manually was not a feasible option, given their size. Instead, a smaller in-depth study
of 5 speakers from the Sound Archive corpus (a subcorpus of approximately 45,000
words) aims to capture how zero relatives are used in more detail. A sample of a
similar size from Litcorp is also examined. These limitations on corpus-based searches
for ZRs also lend weight to the inclusion of questionnaire data to support the corpus
findings.
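The kind of POS-based search described above (finite verb + NP + finite verb) can be sketched as follows. This is a simplified illustration and not Lehman's actual query, nor a search that was run on the Lancashire data. The tag labels VFIN, DET, ADJ, NOUN and PRON are hypothetical stand-ins for whatever tagset a corpus uses, the function name find_zr_candidates is invented, and the NP is restricted to an optional determiner, any adjectives and one or more nouns.

```python
def find_zr_candidates(tagged):
    """Scan a list of (word, tag) pairs for the sequence
    finite verb + simple NP + finite verb.

    Returns (index of first verb, index of second verb) pairs. These are
    candidates for manual checking only, not confirmed zero relatives."""
    hits = []
    n = len(tagged)
    for i, (_, tag) in enumerate(tagged):
        if tag != "VFIN":
            continue
        j = i + 1
        if j < n and tagged[j][1] == "DET":        # optional determiner
            j += 1
        while j < n and tagged[j][1] == "ADJ":     # any premodifying adjectives
            j += 1
        noun_start = j
        while j < n and tagged[j][1] == "NOUN":    # one or more head nouns
            j += 1
        if j > noun_start and j < n and tagged[j][1] == "VFIN":
            hits.append((i, j))
    return hits


sentence = [("I", "PRON"), ("have", "VFIN"), ("a", "DET"), ("home", "NOUN"),
            ("help", "NOUN"), ("does", "VFIN"), ("my", "DET"),
            ("shopping", "NOUN")]
print(find_zr_candidates(sentence))
```

As noted above, a pattern this narrow misses NPs with more extensive pre- and postmodification, so it can only supplement, and not replace, close reading of a sample.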
2.3.3 Questionnaire
The questionnaire is used in this chapter primarily to gather data on ZRs. The
questionnaire contains two parts. The first part tests the acceptability of particular RCs
and examines possible morphosyntactic influences on RC choice for current
Lancashire speakers. More specifically, the questions on ZRs examine four main
types: existential there as shown in (22), existential have (23), it clefts (24) and main
verb introducing (25).
(22) There’s a young girl I know [Ø has got that one too].
(23) I’ve something [Ø might help you sort out the problem].
(24) I think it was Laura [Ø told me you were going home].
(25) I met a lady the other day [Ø could do the same sort of thing].
Along with the examples in (22-25), the less well-known ZRs as outlined in
§2.2.1 are also tested, namely NPs headed by free choice any and ZRs as a modifier of
a that-phrase. These are exemplified in (26-27).
(26) I haven’t got any work [Ø needs doing].
(27) I didn’t really know her, that girl [Ø lived round by the market].
In the first part of the questionnaire participants were asked to judge sentences, such
as those detailed above, on a five point scale, with 1 being the least acceptable to them
and 5 being the most acceptable, as shown in (28).
(28) ‘There’s a man down the street goes there every week too’
The second part of the test required respondents to link together two clauses with a
relativizer in order to produce one complete sentence. The wording of the question is
shown in (29).
(29) Below are two statements. Combine these statements together into one
sentence. Two examples are given below:
STATEMENT: There’s a girl in the kitchen. She ate the last cake.
RESPONSE: There’s a girl in the kitchen who ate the last cake.
STATEMENT: I had a raincoat. It was blue with grey stripes

RESPONSE: I had a raincoat that was blue with grey stripes.
This second part of the test aimed to uncover which relativizers respondents would
choose in relatively free production. This part of the test contained 18 sentences that
tested features such as syntactic function, semantic category of the antecedent, and
restrictiveness of the relative clause. A full copy of the questionnaire is presented in
Appendix B. Results from the questionnaire are outlined in §2.4.2.
2.3.4 Classification and division of respondents
The questionnaire (as described in §2.2.3) was prefaced by a number of sociolinguistic
questions about the age, location and background of the respondent. Unlike later
questionnaires (see e.g. Chapter 4, §4.4.6), in this instance only Lancashire dialect
speakers were targeted.
The questionnaire was completed in its entirety by 158 respondents, 43 of
whom were students in undergraduate classes at Lancaster University, typically aged
18-22. The remaining 115 informants were reached via social networking websites
(primarily Facebook; see footnote 11) and were asked to fill in an online version of the
questionnaire. Online participants were of a mixed age range, with the average
age being 36. Online informants were then encouraged to pass on the questionnaire to
their colleagues, family or friends if they thought it was likely that they would also
complete the questionnaire. While possible social network effects of language
variation could very likely be a factor that influences results presented in this chapter
(as described by Milroy, 1980), this area is too broad to be discussed under the
remit of this thesis. However, to test the possibility of using social media to
crowdsource for sociolinguistic research such as that carried out in this chapter, an
additional question was inserted into the online version of the questionnaire. This
question was: ‘Where did you find the link to this survey?’ The number of respondents choosing each
response is shown in brackets: (a) directly from a Facebook group (48); (b) via Twitter (22); (c)
from someone I know (25); (d) from the researcher directly (20). This suggests that
using social media is a viable direction for further research into both reaching a wider
number of survey participants and investigating possible social network effects.
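The respondent figures reported above can be checked with a short calculation. The sketch below uses only the counts given in this section; the variable names are illustrative, and the percentage shares are derived here rather than reported in the thesis.

```python
# Counts reported above for the online version of the questionnaire.
referrals = {
    "directly from a Facebook group": 48,
    "via Twitter": 22,
    "from someone I know": 25,
    "from the researcher directly": 20,
}
students = 43            # completed the questionnaire in undergraduate classes
all_respondents = 158

online = sum(referrals.values())

# Share of online respondents reached by each route (derived, one decimal).
shares = {route: round(100 * n / online, 1) for route, n in referrals.items()}

print(online, students + online == all_respondents)
for route, pct in shares.items():
    print(f"{route}: {pct}%")
```

The four routes account for all 115 online informants, which together with the 43 students gives the 158 completed questionnaires.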
11 For further information, see http://www.facebook.com
2.4 Results and discussion
2.4.1 Overview of corpus results
The overall frequency of all RCs in the Sound Archive and Litcorp is shown in
Table 1 listed by relativizer (these results are shown at this stage with no differentiation
between e.g. antecedent type, restrictiveness, etc). The tables display raw frequency
results as some values are too low to normalise, e.g. to frequencies per 100,000. The
percentage values in Table 1 show the distribution of RC across each corpus.
         Litcorp           Sound Archive
that     560 (57.2%)       671 (35.4%)
who      129 (13.2%)       339 (17.9%)
which    141 (14.4%)       443 (23.4%)
what      29 (3.0%)        325 (17.2%)
whose      3 (0.3%)          4 (0.2%)
whom      12 (1.2%)          7 (0.4%)
Ø         98 (10.0%)       103 (5.4%)
as         5 (0.5%)          2 (0.1%)
at         2 (0.2%)          0 (0.0%)
TOTAL    979 (100.0%)     1894 (100.0%)
TABLE 1. FREQUENCY OF RELATIVIZER IN SOUND ARCHIVE AND LITCORP
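The percentage values in Table 1 follow directly from the raw frequencies, as the minimal sketch below shows. The counts are those given in Table 1; the names TABLE_1, shares and ranked are illustrative and are not part of the original analysis.

```python
# Raw relativizer frequencies from Table 1.
TABLE_1 = {
    "Litcorp": {"that": 560, "who": 129, "which": 141, "what": 29,
                "whose": 3, "whom": 12, "Ø": 98, "as": 5, "at": 2},
    "Sound Archive": {"that": 671, "who": 339, "which": 443, "what": 325,
                      "whose": 4, "whom": 7, "Ø": 103, "as": 2, "at": 0},
}


def shares(counts):
    """Percentage share of each relativizer within one corpus (one decimal)."""
    total = sum(counts.values())
    return {form: round(100 * n / total, 1) for form, n in counts.items()}


def ranked(counts):
    """Relativizers ordered from most to least frequent in one corpus."""
    return sorted(counts, key=counts.get, reverse=True)


for corpus, counts in TABLE_1.items():
    print(corpus, sum(counts.values()), shares(counts))
    print(corpus, ranked(counts))
```

Working from raw counts in this way keeps the low-frequency forms (whose, whom, as, at) visible, which would be harder to interpret if the figures were normalised.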
Results from the corpora show that the relativizer that is the most frequent in
both corpora. This supports findings that suggest that in spoken discourse that-
relativization is favoured by speakers (although Litcorp is not spoken language as
such, the Dialect Literature is considered to reflect spoken style in text). The results
for that found in the Litcorp make up a larger proportion of the total relativization
strategies than in Sound Archive, and therefore also agree with the assertion that older
texts may display a higher frequency of the older that-relativization pattern (e.g.
Mustanoja 1960). This therefore suggests that there has been an element of diachronic
change with respect to that relativization, although conclusive results are not possible
at this stage.
From the results shown in Table 1, the preference for RCs in Lancashire is as
shown below along with arrows depicting any change (relativizers with a share of less
than 3% are not represented).
Litcorp:        that > which > who > Ø > what
Sound Archive:  that > which > who > what > Ø
FIGURE 1. MOST FREQUENT RELATIVIZERS IN THE LANCASHIRE CORPORA
Previous dialect studies have suggested that wh-relativization has made inroads
into regional dialects (e.g. Herrmann 2005:28). This does not seem to be the case in
Lancashire where in fact the only change in relativization overall seems to be the
increase of what. This increase in what relativization has been outlined (e.g. by
Cheshire, Edwards and Whittle 1993:68) as an overall trend in English more
generally. It could well be the case that this reported trend in general English has
indeed influenced the frequency of this variable in Lancashire, thus explaining the
distribution found in Table 1. An alternative conclusion may be drawn from evidence
elsewhere in the literature suggesting that the relativizer what is associated with the
Lancashire region to some degree (e.g. Shorrocks 1999:101). This would mean that
perhaps this distribution does not represent a change, but instead corroborates the
assertion of Shorrocks that what is a feature often found in Lancashire. Overall, the
results shown in Table 1 suggest that, in very general terms, relativization in
Lancashire appears to have remained relatively stable. It is interesting to note that
relativization is less frequent in Litcorp as compared to the Sound Archive overall.
Alongside this, there are of course differences between the two corpora to consider;
this is explored further in §2.5.
A number of the results displayed a very low frequency. In the Sound Archive
only 6 instances of whom were found. Only 4 out of 32 speakers produced this
relativizer, 2 results being found within the same sentence, as shown in (30).
(30) Incidentally, speaking about Hortner, there was a boy staying there in those
days [whom I met with] and with whom [I spent some considerable time] and
we became great friends. (Sound Archive)
Along with the infrequent whom, instances of as were also very infrequent in both
corpora, shown in (31).
(31) At th’ Sunday Skoo [as I went to] th’ dobby [as cleond th’ skoo an’ kept us
i’ order wi’ a cane while th’ skoo oppent] were named Skinner, Ham Skinner.
He were a lung, thin chap, an’ hi wife, wot helped him, were very fat, an’
puffed a lot when hoo were warkin. (Litcorp)
Although this variant was ascribed to Lancashire by Herrmann, it may be that it is
now archaic, as it does not feature in the Sound Archive.
ZRs appear with perhaps a lower frequency than expected in the corpus data;
this may be due to a number of factors. In Litcorp it may be the case that writers
choose to use an overt nonstandard form, rather than an omission in their
representation of the dialect. This hypothesis is untested, but follows if we consider
dialect writing to be a representation of the dialect in written form that therefore aims
to include salient features; a zero form is perhaps less noticeable, or salient, than a
non-standard form. A similar trend, which could also be accounted for by this hypothesis,
is found in definite article reduction later in this thesis (see §5.4.7).
It was necessary to estimate the extent to which low frequencies of ZRs might
be due to the retrieval method employed here, and the extent to which this is a true
representation of this relativization strategy in Lancashire. A more fine-grained
analysis of 5 speakers from Sound Archive and an equal portion of Litcorp data is
employed here, in order to test this assertion. Results from this close analysis of ZRs
found in the two sample texts can then be compared against the results obtained by
corpus retrieval methods in order to extrapolate to a margin of error. Results from this
analysis are shown in Table 2.
                      Number of Ø       Ratio of Ø:words   Ratio of Ø:words   Maximum estimate of total
                      relatives found   in sample          in whole corpus    number of Ø possibly
                      manually                                                missed by corpus search
5 speaker subcorpus   19                4.2 / 10,000       3.5 / 10,000       21
Litcorp sample        13                2.8 / 10,000       2.0 / 10,000       40
TABLE 2. FREQUENCY OF ZERO RELATIVES IN A 45,000 WORD SAMPLE FROM EACH CORPUS
In the case study, a total of 32 ZRs were found. By working out the maximum number
of ZRs that may have been missed by using corpus methods, we can see that, although
a significant number of ZRs may have been omitted, this does not change the order of
the most frequent relativizers as shown in Figure 1 above.
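The calculation behind the maximum estimates in Table 2 is not spelled out in the text. As a hedged sketch of one possible reading, the snippet below scales the number of ZRs retrieved by corpus search up to the rate found by manual analysis and rounds the shortfall up. The retrieved counts (103 for Sound Archive, 98 for Litcorp) are taken from the Ø row of Table 5; the function name and the rounding-up step are assumptions made for illustration, not the method documented in the thesis.

```python
import math

def max_missed(corpus_hits: int, sample_rate: float, corpus_rate: float) -> int:
    """Upper estimate of zero relatives missed by the corpus search.

    Assumption: the true count is the retrieved count scaled by the ratio of
    the manual (sample) rate to the corpus-search rate, rounded up.
    """
    expected_total = corpus_hits * sample_rate / corpus_rate
    return math.ceil(expected_total - corpus_hits)

# Rates per 10,000 words from Table 2; retrieved counts from the Ø row of Table 5.
print(max_missed(103, 4.2, 3.5))  # Sound Archive
print(max_missed(98, 2.8, 2.0))   # Litcorp
```

Under these assumptions the two calls return 21 and 40, which are the figures reported in the final column of Table 2.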
A closer look at the results shown in Table 1 is needed, in order to determine
the role of the factors influencing relativizer choice (set out in §2.2.3) in the
Lancashire data. Before this, the relativizer results from Litcorp can be explored
further, as the nonstandard spellings in particular uncovered some very interesting
variation. Some of the results for Litcorp shown in Table 1 are reproduced in Table 3,
this time with the distribution across variant spellings with that and what.
Litcorp
Relativizer   Total   Spelling   Frequency
what          29      what       4 (13.8%)
                      wot        25 (86.2%)
that          560     that       163 (29.1%)
                      ut         397 (70.9%)
TABLE 3. VARIANT SPELLINGS OF WHAT AND THAT RELATIVIZERS IN LITCORP
There are 29 instances of what used as a relativizer in Litcorp. Interestingly, the
standard spelling of what is only used as a relativizer 4 times, e.g. (32), with each of
these 4 examples occurring in the same source text.
(32) There were only six heauses in a row in th’ Grove, an’ everywheer else there
were twenty or moor. An’ th’ folk [what lived in Hosburn Grove] thowt
summat o’ theirsels, th’ women specially. (Litcorp)
The remaining 25 results were found with the spelling wot, where each of the 25
results was a relativizer, as shown in (33) below. 12
(33) In a bit two farmers [wot lived at Marton] coom in, and they’d a collie dug
wi’ urn. They’d bin takkin’ cattle to Poulton, an wur on th’ road whoam
again.
Here, Litcorp writers use wot to mark the nonstandardness of this relativization
strategy; this respelling was not used to indicate any other function.
The instances of nonstandard ut were more complicated. Herrmann states ‘at,
ut, and t are rated as phonemic/phonetic variants of that by me’ (2002:70). This
assertion appears to be less categorical in the Litcorp data where instances are found
that are ambiguous, or at least difficult to resolve. This is in part due to ut being used
not exclusively for that as shown in (34).
(34) He geet in to be a soart ov an under sweepereaut ut a wareheause, an’ gan his

mesther sich satisfaction ut he’re soon promoted to th’ top end o’th’ brush, at
an extry shillin’ a week. He wurno’ ut this job lung, for onybody [ut had a bit
o’ inseet into things] could see ut he didno’ sweep like a common mon.
(Litcorp)
Within the same example here, ut is used to mean at (as in “he became an
undersweeper at a warehouse...”), subordinating that (as in “he could see that he did
not sweep”) and relativizing that (as in “anybody [that had a bit of insight]”). This is
problematic when looking at the data, where relativization with at is also found (albeit
infrequently) as in (35).
12 The only exception to this was the use of wot as a semi-phonetic spelling for hot, e.g. he fotcht a
red-wot fire-potter eaut o’ th’ heause an’ flourished it like a sword (Litcorp), of which there are 8 instances.
(35) It’ll be hard wi’ folk ut areno prepared for it. A blazin’ wot summer, an’ neaw
ice an’ snow, an’ a wynt [at shakes th’ heause]. Han th’ coals come?”
(Litcorp)
Examples of ut also occur directly adjacent to that, such as in (36).
(36) “An’ win yo’ give us that pictur’ o’ yo’rs for Walmsley Fowt Bonfire” th’ lad
said. “Jim Thuston says it’s fit for nowt else.” “Does theau meean my
portrait?” “Aye, that [ut Jim Thuston says wur painted for a aleheause sign].”
That wur enoof for me. “Here,” aw said, “if theau artno’ away fro’ this dur in
abeaut five seconds, aw’ll send thee flyin’ o’er that garden, an’ witheaut
wings, too, theau yung jackanapes.” (Litcorp)
It could be suggested that that is indeed the intended relativizer in example
(36). It is also possible that this could be an instance of at or what relativization. In
order to resolve this problem an analysis of all instances of ut is necessary, both as a
relative and as other parts of speech. Because ut occurs significantly more frequently
as that than in any other function, a writer who uses ut is more likely to mean that than
any other possible meaning. Only instances that were possible to disambiguate were
included. These results can be seen in Table 4.
Form    Function             Raw frequency   Percentage of all uses of ut
that    relative that        397             (41.5%)
that    demonstrative that   46              (4.8%)
that    conjunction that     465             (48.6%)
that    adverb that          19              (2.0%)
at      at                   6               (0.6%)
it      pronoun it           24              (2.5%)
TOTAL                        957
TABLE 4. ANALYSIS OF UT RESULTS IN LITCORP
Due to the predominance of ut used to mean both relativizer and non-relativizer that,
any possible ambiguous examples of ut in Litcorp are here counted as belonging to
that.
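The percentages in Table 4, and the predominance of that which motivates this decision, can be recomputed from the raw frequencies. The following is a minimal sketch; the variable and function names are illustrative only, and the counts are copied from Table 4.

```python
# Raw frequencies of ut by function, copied from Table 4.
ut_counts = {
    "relative that": 397,
    "demonstrative that": 46,
    "conjunction that": 465,
    "adverb that": 19,
    "at": 6,
    "pronoun it": 24,
}

total = sum(ut_counts.values())

def share(count: int, total: int) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)

shares = {label: share(n, total) for label, n in ut_counts.items()}

# Combined share of every function glossed as 'that'.
that_total = sum(n for label, n in ut_counts.items() if label.endswith("that"))
that_share = share(that_total, total)

print(total, shares, that_share)
```

On these figures the four that functions together account for 927 of the 957 disambiguated instances of ut (96.9%), which is the basis for treating ambiguous cases as that.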
2.4.2 Corpus results: restrictiveness of relative clause
The restrictive and non-restrictive results for each relativizer in both corpora are
shown in Table 5. The relative percentage distribution between restrictive and non-
restrictive results is also displayed.

        Litcorp                           Sound Archive
        restrictive    non-restrictive    restrictive     non-restrictive
that    515 (92.0%)    45 (8.0%)          658 (98.1%)     13 (1.9%)
who     120 (93.0%)    9 (7.0%)           281 (82.9%)     58 (17.1%)
which   2 (1.4%)       139 (98.6%)        256 (57.8%)     187 (42.2%)
what    13 (44.8%)     16 (55.2%)         101 (31.1%)     224 (68.9%)
whose   2 (66.7%)      1 (33.3%)          2 (50.0%)       2 (50.0%)
whom    2 (16.7%)      10 (83.3%)         7 (100.0%)      0 (0.0%)
Ø       98 (100.0%)    0 (0.0%)           103 (100.0%)    0 (0.0%)
as      2 (40.0%)      3 (60.0%)          2 (100.0%)      0 (0.0%)
at      2 (100.0%)     0 (0.0%)           0 (0.0%)        0 (0.0%)
TOTAL   756 (77.2%)    223 (22.8%)        1410 (74.4%)    484 (25.6%)
TABLE 5. FREQUENCY OF RESTRICTIVE AND NON-RESTRICTIVE RELATIVE CLAUSES BY RELATIVIZER
Non-restrictive relatives are infrequent in the Lancashire data. Huddleston and Pullum
(2002:183) suggest that non-restrictive uses of that are rare, but 58 examples of this
are found in the corpora, as shown below in (37).
(37) I asked our James that worked there, and he said it were never reported or
anything (Sound Archive)
2.4.3 Corpus results: semantic category of antecedent
The results for relativizer type with respect to the personality/non-personality
distinction are shown in Table 6.

        Litcorp                           Sound Archive
        personality     non-personality   personality     non-personality
who     129 (100.0%)    0 (0.0%)          339 (100.0%)    0 (0.0%)
whose   3 (100.0%)      0 (0.0%)          4 (100.0%)      0 (0.0%)
whom    12 (100.0%)     0 (0.0%)          7 (100.0%)      0 (0.0%)
that    489 (87.3%)     71 (12.7%)        474 (70.6%)     197 (29.4%)
which   88 (62.4%)      53 (37.6%)        191 (43.1%)     252 (56.9%)
what    16 (55.2%)      13 (44.8%)        173 (53.2%)     152 (46.8%)
Ø       94 (95.9%)      4 (4.1%)          101 (98.1%)     2 (1.9%)
as      5 (100.0%)      0 (0.0%)          2 (100.0%)      0 (0.0%)
at      2 (100.0%)      0 (0.0%)          0 (0.0%)        0 (0.0%)
TABLE 6. FREQUENCY OF RELATIVE CLAUSE BY ANIMACY TYPE
Perhaps unsurprisingly, the relativizers who, whose and whom are exclusively
found referring to persons. The relativizer that prefers antecedents with personality but
is also found with antecedents such as the house that I lived in. In both corpora which
appears with both personality and non-personality antecedents. ZRs also prefer
personality rather than non-personality antecedents, while what is distributed the most
evenly across antecedent types. Overall, aside from whom, whose and who, relativization in
Lancashire appears to be fairly unrestricted by personality.
2.4.4 Questionnaire findings: distribution of relativizers
The questionnaire completed by Lancashire dialect speakers targeted ZRs in
particular. Participants were asked to assign scores from 1 to 5 to each test sentence,
with 1 being judged by them as the least acceptable and 5 as the most acceptable. A
full copy of the questionnaire can be found in Appendix B. A five point scale was
used in this test and the overall median results for all respondent groups are shown in
Table 7. The mean score is shown alongside this, in brackets.
Context
existential existential main verb NPs headed by Modifier of
it cleft that-phrase
there have introducing free choice any
Score 4 (3.8) 3 (2.1) 3 (2.6) 3 (2.7) 3 (3.0) 2 (2.9)
TABLE 7. QUESTIONNAIRE RESULTS FOR ZERO RELATIVES
Existential there sentences such as “there’s a man down the street [Ø goes there
too]” were the most acceptable to Lancashire speakers. All test sentences were judged
to be acceptable to Lancashire speakers, with only 10 speakers giving scores of 1.
The use of a questionnaire methodology is not without its limitations; an
analysis of the results suggests that often informants are reluctant to choose 1 or 5.
Conversely, there were also participants who only gave scores of 1 or 5, i.e. a yes/no-
type response. This aside, combined with the substantial corpus data, these results
give a good picture of relativization in Lancashire.
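As an illustration of how a median and mean of the kind reported in Table 7 are obtained from five-point acceptability ratings, and of how respondents who only use the end points of the scale might be flagged, a minimal sketch follows. The ratings are invented for the example and are not the questionnaire data; the function names are likewise hypothetical.

```python
from statistics import mean, median

# Hypothetical ratings (1 = least acceptable, 5 = most acceptable) for one
# test sentence; these are NOT the thesis data.
ratings = [5, 4, 4, 3, 4, 2, 5, 4, 3, 4]

def summarise(scores):
    """Return (median, mean rounded to one decimal place)."""
    return median(scores), round(mean(scores), 1)

def uses_only_extremes(scores):
    """True if a respondent only ever chose 1 or 5 (a yes/no-style pattern)."""
    return set(scores) <= {1, 5}

med, avg = summarise(ratings)
print(f"{med:g} ({avg})")  # same "median (mean)" layout as Table 7
```

Reporting the median alongside the mean is a reasonable choice for ordinal ratings of this kind, since the median is less affected by respondents who use only the extremes of the scale.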
The second part of the questionnaire required the participants to join two
sentences together with a relativizer of their choice. Both the raw frequency and the
percentage distribution for each sentence type are shown in Table 8. There were three
test sentences of each type, one of which is given in the table, for reference.
             Relativizer
Type         that          what         who           which        Ø          whose
human        61 (38.6%)    8 (5.1%)     87 (55.1%)    0 (0.0%)     2 (1.3%)   0 (0.0%)
collective   0 (0.0%)      0 (0.0%)     126 (79.7%)   32 (20.3%)   0 (0.0%)   0 (0.0%)
thing        136 (86.1%)   15 (9.5%)    0 (0.0%)      7 (4.4%)     0 (0.0%)   0 (0.0%)
animal       80 (50.6%)    20 (12.7%)   31 (19.6%)    23 (14.6%)   4 (2.5%)   0 (0.0%)
subject      52 (32.9%)    2 (1.3%)     102 (64.4%)   0 (0.0%)     2 (1.3%)   0 (0.0%)
object       0 (0.0%)      1 (0.6%)     0 (0.0%)      0 (0.0%)     0 (0.0%)   157 (99.4%)

Example test sentence for each type:
human        There’s a girl in the kitchen. She ate the last cake.
collective   I went to the Council. They took my claim seriously.
thing        I had a raincoat. It was blue with grey stripes
animal       I saw a horse. It looked very cold.
subject      There is a woman. She went to the bank. She is “a woman...
object       There is a woman. I saw her husband at the bank. She is “a woman...
TABLE 8. CHOICE OF RELATIVIZER BY QUESTIONNAIRE PARTICIPANTS.
The questionnaire returned a number of surprising results. Although
comparatively infrequent, a number of speakers used ZRs productively, in particular
with human and animal subjects, producing sentences like there was a dog went to the
vet. This suggests that (at least certain types of) ZRs are not unacceptable to
Lancashire speakers. No speakers produced at or ut. This absence, in addition to the
lack of corpus findings, suggests that these non-standard relativizers, which perhaps
were once found in Lancashire (as suggested by e.g. Herrmann, 2005), are now rare
for Lancashire speakers. Use of what appeared in the data, although relatively
infrequently. One participant completed the sentence “she is a woman [what’s
husband I saw at the bank],” with what’s used instead of the Standard English choice
whose. It could be suggested that results such as this may be
unreliable due to the social values ascribed to the form in this context, i.e. it may have been a
tongue-in-cheek response (issues such as this are considered further in Chapter 5).
This appears to be the case here, as this particular speaker did not use what in any
other sentence in the task. This would suggest that further tests may be needed in
order to determine other factors (e.g. possible social values) associated with the
acceptability of sentences such as a woman what’s husband, as opposed to e.g. a
woman whose husband.
2.5 Concluding remarks
This analysis of RCs in Lancashire has revealed that relativization strategies in this
region differ from those found in Standard English. Herrmann (2005) suggested that,
very generally, relativization strategies in regional dialects appear to be less
constrained, and this seems to be the case in Lancashire. The corpus and questionnaire
results show that significant variation is found with relativization type, syntactic
function of the relativizer, semantic category of the antecedent, and restrictiveness of
the relativizer over antecedent. In particular, what relativization is both found in the
corpus data and is produced in the sentence linking task by respondents.
Significant variation was found in the use of nonstandard relativizers in
Litcorp. Much of this variation involved semi-phonetic respelling. As outlined in
§1.3.3, if we reason that semi-phonetic spellings indicate a conscious choice by the
speaker to represent the nonstandardness of their dialect (be it phonological or
grammatical or both), then these respellings are considered significant. In particular,
the Litcorp results showed that wot is used almost exclusively to indicate relativization,
with what used in other contexts, which indicates that this construction is recognisable
to these Lancashire dialect writers as a salient feature of relativization in the Lancashire region.
The concept of salience as put forward here is further outlined at various points
in subsequent chapters, but primarily in Chapter 5.
The overall frequency of relativizers largely fits in with the Lancashire findings
of Herrmann (2002), suggesting perhaps that relativization in Lancashire has not
undergone any significant change during the period that both Herrmann’s study and
this present investigation cover (1960s – 1990s). Despite this, there are a number of
key differences. In the second part of the survey, with animate human subject,
speakers in Lancashire most typically used subject wh- relatives e.g. there’s a girl in
the kitchen [who ate the last cake], but prefer that with non-human animate subjects
e.g. the sheep in the field [that jumped over the fence].
A number of dialect speakers (23 out of 158) used the perhaps more
nonstandard what e.g. the sheep in the field [what jumped over the fence]. This was
found, in particular, with restrictive clauses with animate (both human and
non-human) antecedents – although no comparison with SE speakers was made here, even to find what
used productively here contrasts with, e.g. Beal and Corrigan’s (2002) findings in
Newcastle and Sheffield. Modern speakers produced what in the sentence linking
exercise more frequently than perhaps would be expected. It is unclear if this is part of
a relatively recent increase in this relativization reported throughout the UK (e.g. by
Cheshire, Edwards and Whittle 1993:68) or a feature particular to Lancashire.
Non-restrictive relatives are infrequent in the data. Quirk et al. (1985:1252)
suggest that that is rare as a non-restrictive relativizer and that zero is impossible. This
is not the case with the results outlined here, where a number of non-restrictive uses
were found.
With regard to the status of the typically “Lancashire” features (namely, as and
at), as outlined by Shorrocks (1999) and Herrmann (2002), this was not corroborated
by corpus data. No speakers produced as or at and although some instances were
found these were restricted to Litcorp only and even then were very infrequent.
Combined with the results from the corpora, this suggests that this variable is rare for
Lancashire speakers. The questionnaire results indicated that ZRs are most acceptable
in existential there sentences, but also that Lancashire speakers found all types of zero
relatives to be acceptable. A better approach may be to start from a semantic position,
e.g. giving additional information (loosely linked here to restrictiveness). As most
accounts of grammar start from a syntactic point of view, this was a rational position to
take. Contrast e.g. I’ve got a lawn wants cutting with I’ve got a lawn wanting cutting.
An analysis with the emphasis placed more clearly on semantic proposition rather than
the more syntactic starting point taken here may give a clearer picture about the
interplay between related constructions and therefore allow us to draw more precise
conclusions about language variation and change. Building from this assertion, this
approach is adopted with respect to the HAVEn’t to construction in Chapter 3.
Chapter 3. HAVEn’t to
3.1 Introduction
This chapter focuses on the syntax, semantics and frequency of the HAVEn’t to
construction (and later, a group of semantically similar constructions) in order to both
describe its use and outline the way in which grammaticalization may have played a
role in the development of this construction in Lancashire.
It is widely accepted that in Standard English, HAVE to requires DO-support
when forming the negative (Quirk et al., 1985:138; Ellegård 1953:154). However, data
from the Sound Archive and Litcorp used in this study suggests that this is not
necessarily the case for Lancashire dialect speakers, see e.g. (1) compared to (2) (see
Chapter 1 for a detailed discussion of the corpora used here). 13
(1) No no you haven’t to change or anything, come just as you are. (Sound
Archive)
(2) You know, like you get first class stamps, you don't have to lick ’em do you,
you just stick em on, that's progress in't it? (Sound Archive)
While core modal verbs such as COULD and MIGHT do not need DO-support,
newer semi-modals such as DARE to and USED to, and indeed HAVE to, generally do,
although as shown above, for the latter this is not always the case amongst Lancashire
dialect speakers. The comparison of this construction in the Sound Archive and
Litcorp allows tentative assumptions to be made about how the Lancashire dialect
may have changed over time. In addition, this study considers other semantically
related negative modal constructions, i.e. (3-5).
13 It should be noted that while HAVEn’t to is written throughout this chapter, the intended
meaning is HAVE Neg to Inf (i.e. all negated forms).
(3) It was very popular at one time, it mustn’t have been popular enough.
(Sound Archive)
(4) A properly trained salesman shouldn’t tak’ no for a answer. (Litcorp)
(5) Platt, my lad, tha needn’t goo a step further’ this is the lass for thee. (Litcorp)
By analysing these constructions, I will provide a descriptive analysis of how
they are used within both the Litcorp and Sound Archive data. It may also be possible
to suggest reasons why changes in frequency may have occurred, and relate this to
linguistic theory more generally. The Lancashire data for these constructions will be
compared, in places, with data from the BNC, and also with other studies of changes
to modal verbs in English, such as Leech (2003) and Biber et al. (1999). It is also
possible to draw tentative conclusions about the representation of dialect in the
literature by means of a considered analysis of the frequency results.
3.2 Literature review
3.2.1 Modality in Standard English
Biber et al. (1999:485) state that English has nine modal verbs: can, could, may,
might, shall, should, will, would, and must. Semantically, they suggest that modality
can be grouped into three broad categories – permission/possibility/ability;
obligation/necessity, and volition/prediction. While many other studies break this
distinction down further (e.g. Van der Auwera & Plungian (1998:52) distinguish
between uncertainty, dynamic possibility, ability, capacity, need, obligation, necessity,
permission, and probability), this is not necessary for the remit of this study, and so
further analyses will concentrate mainly on the category of obligation/necessity of
which HAVEn’t to is a member.
Morphosyntactically, in Standard English modal constructions have no non-
finite forms - most have present and past with no person-number marking in the third
person singular present, and most have irregularity in the past form, e.g. can/could,
may/might. Modals are complemented by a bare infinitive and follow the NICE
properties as set out by Quirk et al. (1985:140), discussed later in this section.
In addition to the core modals, as described above, English has a number of
semi-modals. Semi-modals include constructions such as DARE to, NEED to, OUGHT to,
HAVE to, and USED to. These constructions fulfil a similar semantic function to modals
(i.e. one of possibility/ability, obligation/necessity, or volition/prediction), but differ
from core modal verbs in terms of syntax. Semi-modal constructions conform to the
NICE properties to differing degrees. This syntactic difference is used later in this
study as a measure of grammaticalization.
It should be noted that semi-modals have been labelled in various different
ways in the literature, including marginal modals (Denison, 1993:315), semi-auxiliary
(Quirk et al., 1985: 137, Krug, 1996:43), non-modal auxiliaries (Warner, 1993:3) and
quasi-modal (Coates, 1983: 52; Perkins, 1983: 65; Leech, 1987: 73). While the choice
of name is debated, the existence of such a category is not, and in this study I follow
Biber et al. (1999:483) by using semi-modal.
Quirk et al., similar to Biber et al., describe modal verbs (and semi-modals) as
representing concepts such as volition, probability and obligation. While also
describing the semantics, Quirk et al. take a more formal approach, concentrating
much of their analysis on the syntactic properties of modal verbs – the so called NICE
properties: negation, inversion, contraction and ellipsis. These properties are
demonstrated in (6-9), with core modal verbs (a) and lexical verbs (b).
Negation - The test for negation suggests that modal constructions are able to form
negative constructions by using the particle not or the contracted form –n’t. This is not
possible for lexical verbs, as can be seen in the following examples.
(6a) It may not be ready in time. (BNC – A08 2589)
(6b) *He kicked not a ball.
Inversion - Inversion of the subject and operator is typical for modal verbs in a range
of contexts, including interrogative sentences as shown here.
(7a) Should he go back? (BNC – CDE 2474) He should go back.
(7b) *Jumps he on the bed? He jumps on the bed.
Contractions/clitics - The tests for clitics/contractions show that modal constructions
can be reduced. As demonstrated by the examples below, this is not possible for
lexical verbs.
(8a) You'll see for yourself. (BNC – A0D 2587) You will see for yourself.
(8b) *We’te it on Saturday. We ate it on Saturday.
Ellipsis - Modals may also appear in elliptical constructions without a complement.
(9a) If anyone can do it, she can [do it].
(9b) *If anyone keeps spoiling the dinner, John keeps [spoiling it].
When forming negative (10), interrogative (11) or elliptical sentences (12),
both lexical verbs and many semi-modals require DO-support as shown in the
examples below.
(10) He ran away. He didn’t run away.
(11) He jumps on the bed. Does he jump on the bed?
(12) If anyone keeps spoiling the dinner, John does.
This comparison shows that these criteria may be used to determine whether
a verb can be considered syntactically modal (or to what degree), and
are therefore a useful set of tests that will be utilized later in this study.
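To make the idea of modalhood as a matter of degree concrete, the NICE tests can be treated as a simple scoring scheme. The sketch below is an illustration only and is not the measure used in this thesis: the judgments are transcribed from Table 1 in §3.2.2 (only the tests illustrated there are listed for each verb), while the numeric weights and the averaging are assumptions introduced for the example.

```python
# Judgments transcribed from Table 1 (§3.2.2): "ok" = acceptable,
# "?" = marginal, "*" = unacceptable. Only tests illustrated there are listed.
JUDGMENTS = {
    "will":    {"negation": "ok", "inversion": "ok", "contraction": "ok", "ellipsis": "ok"},
    "dare":    {"negation": "ok", "contraction": "*"},
    "need to": {"inversion": "*", "ellipsis": "?"},
    "used to": {"negation": "*", "contraction": "*"},
}

# Assumed weights, chosen for illustration only.
WEIGHTS = {"ok": 1.0, "?": 0.5, "*": 0.0}

def nice_score(tests: dict) -> float:
    """Mean weight over the NICE tests available for one verb."""
    return sum(WEIGHTS[j] for j in tests.values()) / len(tests)

# List verbs from most to least modal-like under this toy scheme.
for verb, tests in sorted(JUDGMENTS.items(), key=lambda kv: -nice_score(kv[1])):
    print(f"{verb:8} {nice_score(tests):.2f}")
```

A score of this kind places a core modal such as will at one end and a semi-modal such as used to at the other, with dare and need to in between. Because each verb is scored on a different subset of the tests, the figures are indicative at best.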
3.2.2 Modals vs. semi-modals
Much of the literature suggests that the difference between modals and semi-modals is
predominately one of syntax, with both modals and semi-modals sharing a similar
semantic space, i.e. one of ‘obligation, permission, probability, futurity, uncertainty,
lack of definiteness’ (Warner, 1993:13). In order to show how these semi-modals are
syntactically different to the core modals, it is useful to look again at the NICE
properties as set out earlier in (6–9).
              Semi-modal                                           Core modals
Negation      daren’t, ?oughtn’t to, *usedn’t to                   can’t, shouldn’t, won’t
Inversion     He needs to / *needs to he?                          He will / will he?
Contraction   *I’sed to (used to), *he’ren’t go (he daren’t go)    He’ll, you’d
Ellipsis      ?If anyone needs to do it, he needs to [do it]       If anyone will do it, he will [do it].
TABLE 1 – NICE QUALITIES OF SEMI-MODALS AS COMPARED TO ‘CORE’ MODALS IN STANDARD ENGLISH (QUIRK ET AL., 1985:140)
This table shows that, on the whole, semi-modals tend not to conform to all of
the NICE properties. However, it should also be noted that the class definition
between semi- and core modals is not binary. In fact, from the table it is clear that the
boundaries are instead blurred, and can be said to form a continuum of modality with
some semi-modals being closer to or further away from the core modals. For example,
not all semi-modals are unable to take negation (a focus of this study), e.g. ?DAREn’t,
?OUGHTn’t to, *USEDn’t to. This idea is also put forward by Quirk et al. (1985: 137)
who suggest that there is an auxiliary verb > main verb scale; this is represented in
Figure 1.
(one verb phrase)
(a) central modals: can, could, may, might, shall, should, will
(b) marginal modals: dare, need, ought to, used to
(c) modal idioms: had better, would rather, be to, have got to
(d) semi-auxiliaries: have to, be about to, be able to, be bound to, be going to, be obliged to
(e) catenatives: appear to, happen to, seem to
(f) main verb + nonfinite clause: hope to + Inf, begin + -ing participle
(two verb phrases)
FIGURE 1. THE AUXILIARY VERB/MAIN VERB SCALE (ADAPTED FROM QUIRK ET AL. 1985:137)
By looking at the Lancashire data for HAVEn’t to, (shown in Figure 1 only in
the positive form (HAVE to) as a semi-auxiliary), this study aims to find out how close
or far away HAVEn’t to is to a modal function for Lancashire speakers (such as that of
central modals represented in Figure 1), and how this compares to its use in Standard
English. This study also goes some way to considering if, and perhaps why, modal
verbs may have moved along this scale by using diachronic data (see §3.4.4).
3.2.3 Recent changes to modal verbs in Standard English
A number of studies have analysed recent changes to modal verbs in Standard
English, with much of the research (e.g. Leech, 2003; Biber et al., 1999) looking at
changes in British English in contrast with American English. Only Krug (1996) pays
significant attention to the HAVEn’t to construction as examined in this study. Krug
looks specifically at HAVEn’t to (1996:103) using a mixture of corpora including
ARCHER, BNC, Frown, Brown, FLOB and LOB. Results from Krug’s study suggest that
both HAVEn’t to and HAVEn’t got to are very rare in current Standard English, with all
corpora returning very low frequencies for these constructions. Krug suggests that the
absence of not negation in present day English points to a diachronic development
from unproductive auxiliary negation (HAVEn’t to) to DO negation (DOn’t have to).
Krug suggests that the occurrences of HAVEn’t to within the data are found in the
language of older speakers who exhibit the retention of an obsolescent structure, and
that there is a regional tendency for these speakers to come from the north of England.
Interestingly, Krug also suggests that HAVEn’t to expresses mainly prohibition
synonymous with the core modal MUSTn’t, meaning not supposed to / not allowed to,
whereas DOn’t HAVE to contrasts with this, meaning ‘not obliged to’. This contrast,
and the semantics of HAVEn’t to, DOn’t HAVE to, and other related constructions, is
investigated in the Lancashire data in §3.5.5.
Leech suggests that changes to modal verbs are related to more general
changes in language style, politeness and genre such as informalization,
generalization, and colloquialization (Leech, 2003:236). Corpus data from the Brown
family of corpora indicates that modals marking necessity are more frequent in British
English, as compared to American English. It is also suggested that semi-modals, in
particular those with periphrastic DO, are becoming more frequent in British English.
However, it is unclear exactly how this rise in frequency of modals marking necessity,
and also semi-modals, is linked to informality, generalness or colloquialization.
Nonetheless, this trend in Standard English is tested against the Lancashire data in
§3.4.4.
Biber et al. (1999:498), like Leech, compare British with American
English, drawing largely the same conclusions about general change across the two
varieties. Unlike Leech, Biber et al. take a closer look at genres, suggesting that
have to is the only semi-modal that is common in written discourse as well as
conversation in Standard English, with the other semi-modals being mainly restricted
to spoken and informal contexts.
3.2.4 HAVE to and HAVEn’t to in current Standard English
The most common semantic categorisation of HAVE to suggests that its meaning is
close to that of MUST (see e.g. Quirk et al., 1985:145; Perkins, 1983:65; Coates,
1983:52), although morphosyntactic differences between the two are recognized by
the classification of HAVE to as a semi-modal and MUST as a core modal. Along with
this, semantic differences are also present. One of these is objectivity; HAVE to
suggests obligations to external entities, while MUST refers to obligations to the
speaker. For example, as shown in the invented examples below, (13) involves an
obligation to the speaker, and (14) represents some kind of outside authority or
internal drive.
(13) He must bring me my lunch [or I’ll be angry]
(14) He has to bring me my lunch [it’s his job]
However, the exact semantic role of HAVEn’t to with respect to other modal
and semi-modal verbs is disputed. Going back to Figure 1, the exact placement of
HAVE to (and also HAVEn’t to) on a scale such as this is contested. Visser (1969:1478)
suggests that to all intents and purposes HAVE to is a modal auxiliary, Krug (1996)
suggests that it is located at around the mid-point between full auxiliary and full verb,
and Coates (1983) suggests that it fulfils none of the defining characteristics of modal
auxiliaries. While this study considers HAVEn’t to rather than HAVE to, it goes some
way to providing semantic, syntactic and diachronic evidence for where on a scale
such as that in Figure 1 HAVEn’t to is located for Lancashire dialect speakers.
Importantly, HAVEn’t to may be considered as a separate construction from
HAVE to, as it is clear in the data that the HAVEn’t to construction displays two distinct
meanings and that the difference is related to obligation. It is not simply a negated
form of HAVE to (see also §3.4.6). I follow Bybee et al. (1994:186) in distinguishing
between constructions that express obligation as strong or weak. The distinction
between weak and strong constructions can be seen in the examples (15) and (16),
respectively.
(15) No no you haven’t to change or anything, come just as you are. (Sound
Archive)
(16) Even what happened, you hadn’t to talk, you had to lie still and be quiet.
(Sound Archive)
For Lancashire speakers, HAVEn’t to does not always semantically correspond to the
Standard English negative of HAVE to, i.e. DOn’t HAVE to, which displays only
obligation in the weaker sense. This would therefore suggest that, for Lancashire
speakers, HAVE to and HAVEn’t to are distinct (but related) constructions and they are
therefore treated as such in this study.
While many studies of modal verbs do not include elaborate discussions of
negated semi-modals as such, studies such as Quirk et al. (1985:141), Bauer
(1989:112), and Hundt (1997:143) unanimously agree that under negation and in
questions HAVE to requires DO-support. Although examples of HAVEn’t to can be
found in Standard English corpora such as the BNC, these instances are comparatively
rare (see §4.2); the construction is widely regarded as belonging to ‘a formal
literary style’ (Carter & McCarthy, 2006:244), or an ‘archaic and largely obsolete
form’ (Krug, 1996:45).
3.2.5 History of the HAVEn’t to construction
One of the main focuses of this study is how the HAVEn’t to construction in Lancashire
dialect may have changed over time. In relation to this, it is useful to look first at how
the HAVEn’t to construction has emerged.
There is no full account of the negative HAVE to in the literature, with many
discussions featuring the negative construction only very briefly. For this reason this
section examines scholarship on the HAVE to construction, but pays special attention to
how the negative may be formed. A further section deals with the rise of periphrastic
DO.
One of the best accounts of the development of HAVE to is given as a case
study by Fischer et al. (2000:293). Here it is suggested that the HAVE to construction,
in Standard English, represents a case of ‘regular’ grammaticalization. The theory of
grammaticalization suggests that over time constructions may undergo a functional
change, moving from a more lexical to a grammatical function (Hopper & Traugott,
2003). As a result of high frequency, these constructions move away from their lexical
meaning and become independently ‘entrenched’ in this new grammatical function
(Bybee 2006). In this case, the full lexical Old English verb HABBAN indicating
possession changes over time to the auxiliary or semi-modal HAVE to expressing duty
or obligation (the original lexical use continues to co-exist with the now
grammaticalized HAVE to).
Fischer et al. suggest that HAVE to changes from a full lexical verb to a semi-
modal because of a more general change in English word order (of OV to VO) from
HAVE + object + Inf (17), to HAVE + Inf + object (18), inviting the ‘bracketing’ of
HAVE to Inf as a single construction.
(17) I [have] somebody [to love]. (possession)
(18) I [have to love] somebody. (duty / obligation)
Lehmann (1995:34) suggests that this gradual word order change over time
resulted in the reanalysis of HAVE to as the semi-modal HAVE to by speakers of
English. The exact route of grammaticalization of HAVE to, and the number of distinct
stages that are involved in this change are contested in the literature (see e.g. Visser,
1963-73:1477; Brinton, 1991:12; Fischer et al., 2000:301). However, all sources agree
on the change from lexical to auxiliary / semi-modal.
Denison (1993:317) does explicitly mention HAVEn’t to and provides a similar
analysis of the origin of HAVEn’t to as outlined for HAVE to. Denison suggests that the
HAVE to of obligation (the focus of this study) rarely conforms to the NICE properties,
thus demonstrating that in current Standard English, HAVEn’t to does not syntactically
behave like a modal verb. Denison states that HAVEn’t to is present in some northern
dialects. Alongside this is the suggestion that HAVEn’t to is also a newer development
in standard southern English, although no supporting data is cited.
The grammaticalization of HAVE to has been shown in the literature to move
from a lexical verb in Old English to a semi-modal in current Standard English. While
the negative HAVE to was considered only minimally in Denison’s study, this study
focuses on its change in the Lancashire dialect data, and in particular on a possible
functional change towards the category of core modal verbs.
3.2.6 The rise of periphrastic DO
As suggested previously, one of the main syntactic differences between semi-modal
constructions and core modal verbs is the requirement of DO-support in order to fulfil
three of the four NICE properties, those of negation (19), inversion (20) and ellipsis
(21).
(19) *dare not to vs. do not dare to
(20) He needs to vs. does he need to?
(21) If anyone used to go, he did.
Here it is shown that DO-support is usually needed by semi-modals, but not by core
modals. As the NICE properties are used later in this study as a measure of degree of
modal auxiliaryhood (§3.5.6), it is necessary to examine the history of DO, and the rise
of periphrastic DO in English, with particular reference to modal constructions.
In Standard English, along with negation, inversion and ellipsis, DO may also
be used in a number of other ways, e.g. as a verbal noun, or for pragmatic emphasis.
However, these constructions are not found in the syntax of modal constructions
discussed in this study, and so no further analyses of these are necessary.
According to Denison (1993), the role of DO as an operator is one of the most
striking features of Present Day English when compared to older stages of English,
and indeed to many other European languages. It is suggested that operator (or
auxiliary) DO changed from the Old English DON to the modern English operator due
to reasons of dialect, register and style (although Denison discusses the suggestion
made by Ellegård (1953) that this change took place initially in poetic language with
some caution). Ellegård (1953) suggests that the lexical verb DO in early intransitive
use meant something like act. The typical transitive use was something more like
perform or accomplish or also put or place. Until Middle English, DO could also be
causative, displaying both VOSI and V+I patterns. Fischer and Nänny (2001) state
that the constructional polysemy of DO was already present at this stage, with DO
being used in positive declaratives, negatives, interrogatives, inversion, emphasis, and
imperatives – many uses that DO still displays today. Ellegård (1953) suggests that
periphrastic DO came from changes to the causative VOSI word order. Similar to the
changes in the HAVE to construction, changes in the word order have led to the
reanalysis of periphrastic DO. The construction (DO + NP + Inf) is lost over time,
leaving DO + Inf isolated. (Kroch’s (1989) modelling of this change in DO as compared
to the decline of finite lexical verbs is discussed in §3.2.9.) DO was then later
reanalysed as an auxiliary.
3.2.7 Related constructions
§3.5 of this study looks at a number of other constructions found in the data that fulfil
a similar semantic role to HAVEn’t to e.g. (22-25).
(22) And I said, “you know Vera you shouldn't go with him, you know what he's
after, you know, you shouldn't go with him.” (Sound Archive)
(23) And then she'd get out of bed and go to the toilet and I said, “Margaret, you
mustn't,” I thought she'd collapse. (Sound Archive)
(24) And then er one fella said “you don't need to come on yer bike now love
we've got a van a van coming”. (Sound Archive)
(25) They're much easier this way round because you haven't got to go through
the minor at all to reach them. (BNC)
Due to the similarities between the constructions in examples (22-25), it is
appropriate to suggest that they belong to a construction family. That is, they have a
similar meaning, are used in similar circumstances and so are likely to be related to
one another cognitively. I follow work by Goldberg and Jackendoff (2004), which
suggests that a number of constructions can form a closely related group or family. In
this study, the term construction family is therefore used to refer to those constructions
that show a similar semantic and syntactic distribution, but may be different in some
other way, which can be detailed by means of data analysis.
Many approaches in language change do not take into consideration this
concept of construction families and rarely examine more than one or two linguistic
variables. Kroch (1989) tends to focus on cases involving only two competing
constructions such as the diachronic decrease in lexical verbs and the simultaneous
increase of DO-support. I would suggest that this viewpoint is somewhat idealized, and
the neat replacement of one construction with another is often unlikely. Approaches
such as Kroch’s can be plotted graphically, showing the increase in one construction
correlating with the simultaneous decrease in another; the so-called S-curve, as shown
below for language change relating to DO-support.
FIGURE 2. S-CURVE MODEL OF LANGUAGE CHANGE (REPRODUCED FROM KROCH 1989:22)
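An S-curve of this kind is conventionally modelled as a logistic function of time. The following is a minimal sketch under that assumption, showing how the share of an incoming variant rises while the share of its single competitor falls by the same amount. The slope, midpoint and dates are hypothetical values chosen purely for illustration; they are not taken from Kroch's data.

```python
import math


def logistic_share(t, slope, intercept):
    """Proportion of the incoming variant at time t under a logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * t)))


# Hypothetical parameters (assumptions for illustration only).
SLOPE = 0.03                       # rate of change per year
MIDPOINT = 1550                    # year at which both variants are equally frequent
INTERCEPT = -SLOPE * MIDPOINT      # places the 50% point at MIDPOINT

for year in range(1400, 1701, 50):
    incoming = logistic_share(year, SLOPE, INTERCEPT)
    outgoing = 1.0 - incoming      # a two-variant model: the shares sum to 1
    print(year, round(incoming, 3), round(outgoing, 3))
```

Because the two shares always sum to one, the model has no room for a third or fourth near-synonymous construction, which is the idealization questioned in this section.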
I would suggest that a distribution such as that shown in Figure 2 is rare. Often it is
not the case that one construction has only one direct correlate, and it is unlikely that
factors such as frequency, differences in meaning and both prosodic and
sociolinguistic salience are precisely the same for each opposing construction, thus
producing a distribution similar to that shown in Figure 2. (See Chapter 5 for a further
discussion on the problems associated with matching constructions in this way.)
Instead, often one construction can have many possible matching constructions that
convey a similar meaning, and so any analysis of language change should consider
this concept more broadly. This is the approach adopted here, where corpus results for
modal constructions of obligation more generally are outlined alongside results for
HAVEn’t to.
3.2.8 Modals in varieties of English
Variation from Standard English found in modal verbs has been the subject of a
number of studies into regional varieties both in the UK (see e.g. Beal, 1993; Miller,
1993; Trousdale, 2003; Brown, 1991) and elsewhere (see e.g. D’Arcy and
Tagliamonte, 2010; Mishoe, 1994; Labov et al., 1972). This indicates that this
grammatical feature shows a high level of variability; it is therefore unsurprising that
this feature shows variation in Lancashire. These studies report on differences in both
meaning and form including simplification (e.g. Trousdale 2003), and double modal
constructions (e.g. Labov et al. 1968, Mishoe 1994). Tagliamonte and Smith (2006)
detail changes to deontic MUST, HAVE to and HAVE got to in the UK and Northern
Ireland. They conclude that MUST is obsolescent and that HAVE to is being used in
contexts traditionally encoded by MUST, with HAVE got to specializing for indefinite
reference. No discussion is found relating to variation with HAVEn’t to of the type
found here, and no distinction is made between obligation types within a single
construction, as is found in the Lancashire data.
3.2.9 Summary
The literature suggests that modal verbs have meanings pertaining to
permission/possibility/ability, obligation/necessity, and volition/prediction, and that
syntactically they conform to the NICE properties, as set out in §3.2.2. The HAVEn’t to
construction is described as being one of obligation / necessity (e.g. by Biber et al.
1999:486) and this category of constructions is examined in this study.
Semi-modals differ from core modal verbs most strongly in syntax, by taking
periphrastic DO in sentences expressing negation, inversion and ellipsis. Figure 1
showed that the distinction between modals and semi-modals is not binary, with
different constructions being judged as more or less ‘modal’. This study examines
where HAVEn’t to and related constructions occur on this scale by analysing their
semantic and syntactic properties.
Many studies that model diachronic change examine only two competing
variants, e.g. Kroch (1989). The advantage of this approach is that results can be
plotted showing the increase in one pattern correlating to the simultaneous decrease in
another; the so-called S-curve as seen earlier in Figure 2. It is suggested here that for
many cases, this approach neglects to take into account all of the variants, and as a
result returns highly idealized results. In contrast to this narrow approach, all
constructions related to HAVEn’t to are searched for in the corpus data. Results are
shown in §3.5.2.
Alongside this, most studies on modals and semi-modals do not take into
account any nonstandard British dialect data (although studies such as those by Beal,
1993; Miller, 1993; and Trousdale, 2003 have examined aspects of modality in
various regional dialects). This widespread neglect of dialect data combined with the
brevity of analyses relating to negative forms of semi-modals and the narrow focus of
diachronic studies of competing variants leaves a number of research questions that I
aim to resolve in this chapter.
3.2.10 Research hypotheses
There are a number of research hypotheses which arise from the issues discussed in
the literature review (and from the preliminary examination of the data). The
following are the research hypotheses tested in this study:
(a) Grammaticalization Hypothesis: The HAVEn’t to construction in
current Lancashire dialect, compared to Standard English, displays
properties closer to those of a core modal verb.
This change is contrary to developments in Standard English. As stated previously,
grammaticalization refers to the process in which a word (or multiword construction)
undergoes a change in form and function (in this case, for example, the adoption of
some of the NICE properties mentioned earlier). Because HAVEn’t to shows a NICE
property (namely resistance to negation with periphrastic DO), it seems likely that this
construction has undergone grammaticalization.
(b) Constructional Competition Hypothesis: Constructions showing a
rise in frequency may coincide with other semantically similar
constructions showing a fall.
This arises from the assumption that the increase in frequency of one word or
construction can bring about the fall of another nearly synonymous word/construction.
Kroch (1989) shows how competing constructions may interact in this way.
Concretely, this hypothesis predicts that if HAVEn’t to shows a rise in
frequency, some other construction(s) having the same or similar meanings will show
a fall in frequency when the (relatively recent) Sound Archive and (older) Litcorp data
are compared.
3.3 Methodology
3.3.1 Introduction
This section details the procedures used in order to gather the data from which my
conclusions are drawn. Data is taken, initially, from the Sound Archive corpus. This
corpus is compared to Litcorp, a dialect literature corpus taken from an earlier time
period in order to establish a diachronic perspective within the data (see §1.3 for
further information on these sources). By comparing both syntactic and semantic
results from these corpora, conclusions may be drawn about the nature of the HAVEn’t
to construction for Lancashire dialect speakers. In §3.4.4, again these corpora are used
in order to look at semantically related constructions. The BNC is also used, in places,
as a control, for frequency comparisons between Standard English and the Lancashire
data. Limitations of this corpus-based approach are discussed in §3.6.
3.3.2 Corpus searches
In the initial searches for the HAVE to construction, all forms of the verb were searched
for, e.g. haven’t to, hasn’t to, hadn’t to. Also, both the contracted negative form –n’t
and the full negative not were looked for, along with constructions with DO, e.g. did
not have to, don’t have to etc. Further searches were carried out in order to find
similar constructions in the HAVEn’t to construction family referring to obligation or
permission. Biber et al. (1999:486) suggest that this group includes MUST, SHOULD
better, HAD better, HAVE to, HAVE got to, NEED to, OUGHT to, and BE supposed to. As
these constructions are compared to the data for HAVEn’t to, the negative forms,
mustn’t, shouldn’t etc., are searched for in the Lancashire corpus data. A number of
other constructions additional to those from Biber et al. (namely BEn’t to + Inf; BEn’t
obliged to + Inf; BEn’t allowed to + Inf) are taken from the literature (Quirk et al.,
1985:139; Huddleston and Pullum, 2002:361) and are searched for alongside the list
from Biber et al. As with all previous searches, all verb forms and contractions are
included e.g. don’t need to includes don’t need to, do not need to, did not need to,
does not need to etc. In all data, patterns that look similar on the surface but actually
exemplify other constructions, such as (26), are discounted from any results.
(26) Erm but it hasn't to the best of my knowledge, it has not resulted in a rash of
of developments and motorway intersections. (BNC - KM7 562)
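The search procedure described in this section can be sketched as a pair of regular expressions. This is an illustrative reconstruction only: the actual search tool and query syntax used for the corpora are not specified here, and the pattern names and the helper function are hypothetical. The sketch covers all forms of HAVE and DO, and both the contracted and the full negative.

```python
import re

# Verb forms and negative particles covered by the sketch (assumed lists).
HAVE_FORMS = r"(?:have|has|had)"
DO_FORMS = r"(?:do|does|did)"
NEG = r"(?:n['’]t|\s+not)"   # contracted -n't (straight or curly apostrophe) or full "not"

# HAVEn't to: e.g. "haven't to", "hasn't to", "had not to"
HAVENT_TO = re.compile(rf"\b{HAVE_FORMS}{NEG}\s+to\b", re.IGNORECASE)
# DOn't HAVE to: e.g. "don't have to", "did not have to"
DONT_HAVE_TO = re.compile(rf"\b{DO_FORMS}{NEG}\s+have\s+to\b", re.IGNORECASE)


def find_candidates(text):
    """Return (label, matched string) pairs; every hit still needs manual checking."""
    hits = [("HAVEn't to", m.group(0)) for m in HAVENT_TO.finditer(text)]
    hits += [("DOn't HAVE to", m.group(0)) for m in DONT_HAVE_TO.finditer(text)]
    return hits


sample = "See we hadn't to sit in them. You don't have to go. It has not to be done."
for label, match in find_candidates(sample):
    print(label, "->", match)
```

A surface search of this kind also returns strings such as hasn't to the best of my knowledge in (26), which is why each hit has to be inspected by hand before it is counted.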
3.4 Results and Discussion

3.4.1 Semi-modals in the BNC
Before examining the Lancashire data, and also for purposes of a comparative
analysis, it is useful to examine semi-modal data from Standard English. For this
purpose, an analysis of a selection of semi-modals from the BNC is presented here.
These constructions were searched for both with and without DO, in order to show
which of these constructions are most frequent in Standard English. As with all data
presented in this study, while only one form is used in the table headings, all possible
verb forms, along with both the contracted and full forms of the negative particle, are
included in the results. These results are presented together for reasons of clarity.
As we have already seen, the literature suggests that modal verbs occur in a
range of syntactic positions. Negation is possible without DO-support for modal verbs,
while semi-modals, on the whole, require DO in order to form the negative.
The data in Table 2 shows the raw frequency results for a selection of modals
found in the whole of the BNC (both written and spoken). This data is included here
in order to provide a Standard English context.
with DO                          without DO
DOn’t HAVE to    2106 (99%)      HAVEn’t to    28 (1%)
DOn’t NEED to     839 (98%)      NEEDn’t to    14 (2%)
DOn’t DARE to       8 (100%)     DAREn’t to     0 (0%)
DOn’t USED to      24 (73%)      USEDn’t to     9 (27%)
TABLE 2. NEGATIVE FORMS OF SEMI-MODALS IN THE BNC WITH AND WITHOUT DO (RAW FREQUENCY RESULTS)
The results show that in Standard English DO-support is usually required by
semi-modals when forming the negative. The data suggests that in Standard English,
while it is possible for HAVE to to occur in the negative without DO, this is extremely
rare, compared with the number of occurrences with DO (28 vs. 2106 cases). The
HAVEn’t to results in the BNC are discussed further in §3.4.2. These data for HAVEn’t
to may now be compared and contrasted to the results from the Lancashire corpora.
3.4.2 Corpus comparison of HAVEn’t to
As discussed in the methodology, in order to uncover any possible diachronic change
in the HAVEn’t to construction, it is useful to look closely at the Lancashire corpora in
comparison with each other.
As the corpora are of different sizes (Litcorp is approximately 500,000 words,
Sound Archive is approximately 300,000), their results have been normalized to show
frequencies per 100,000 words; this is shown in Table 3. For reasons of comparison,
normalized data from the BNC is also shown.
                 HAVEn’t to                           DOn’t HAVE to
                 raw          per 100,000 words       raw          per 100,000 words
                 frequency                            frequency
Litcorp          4            0.80 (80.0%)            1            0.02 (20.0%)
Sound Archive    14           4.67 (42.4%)            19           6.34 (57.6%)
BNC              18           0.02 (0.7%)             2,578        2.2 (99.3%)
TABLE 3. INSTANCES OF FORMS OF THE HAVEN’T TO CONSTRUCTION IN LANCASHIRE DIALECT DATA.
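The normalization behind Table 3 can be illustrated with a short sketch. It assumes the approximate corpus sizes given above (500,000 words for Litcorp, 300,000 for Sound Archive) and uses the raw HAVEn’t to counts from the table. The function and variable names are illustrative only and do not come from the original analysis.

```python
def per_100k(raw_count, corpus_size):
    """Normalize a raw frequency to occurrences per 100,000 words."""
    return raw_count / corpus_size * 100_000


def share(count, other_count):
    """Percentage of one variant out of the two variants combined."""
    return count / (count + other_count) * 100


# Approximate corpus sizes as stated in the text (assumed to be exact here).
CORPUS_SIZES = {"Litcorp": 500_000, "Sound Archive": 300_000}
# Raw HAVEn't to counts from Table 3.
HAVENT_TO_RAW = {"Litcorp": 4, "Sound Archive": 14}

for corpus, raw in HAVENT_TO_RAW.items():
    print(corpus, round(per_100k(raw, CORPUS_SIZES[corpus]), 2))

# Share of HAVEn't to against DOn't HAVE to in Sound Archive (14 vs. 19 cases).
print(round(share(14, 19), 1))
```

Normalizing in this way makes corpora of different sizes comparable, although it does not remove the differences in genre between them.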
The data show that both HAVEn’t to and DOn’t have to are more frequent in the more
modern Sound Archive than in the older Litcorp. In some respects these frequency
results go against what would be expected under grammaticalization, where it is the
more frequent constructions that undergo grammaticalization. This demonstrates that care
must be taken when comparing such different corpora: Sound Archive is a spoken
corpus, Litcorp is written and the BNC is mixed. This difference in genre, together with the
relatively low frequency of this construction overall, means that only tentative
conclusions can be drawn at this stage.
An increase in semi-modals in English over time more generally could
possibly explain the increase in both HAVEn’t to and DOn’t have to in the data.
However, the Standard English data from the BNC shows that this is not the case for
the HAVEn’t to construction; in fact, the BNC displays a frequency of only 0.02
occurrences per 100,000 words. It could be suggested that, like speakers of Standard
English, Lancashire speakers have undergone a change in usage of certain modal
constructions, although the specific constructions that undergo change in Lancashire
are not the same constructions that undergo change in Standard English. This point is
further investigated in §3.5 where a number of modal and semi-modal constructions
relating to obligation are compared in the data.
Another possible reason for the perceived difference in the two corpora could
be salience. Kerswill & Williams (2002) suggest that salient constructions are those
which are recognised by speakers as a feature of a certain dialect, speaker or region.
As Litcorp is not a record or transcription of speakers at that time, but rather of the
writer’s perception and representation of these speakers, it could be considered a
corpus of the most salient or important dialectal features as judged by the writer (see
Chapter 5 for a further discussion of this). This means that the low frequency of both
HAVEn’t to and DOn’t have to within the Litcorp data could be because these are not
judged to be the most important or noticeable features of dialect speakers by these
writers (i.e. HAVEn’t to may not be salient). It may also be that some other
construction is used in Litcorp to indicate obligation, e.g. weren’t to, or aren’t to. This
possibility is explored in §3.5.5.
The grammaticalization hypothesis put forward in §3.2.10 suggests that the
HAVEn’t to construction in current Lancashire dialect, compared to Standard English,
displays properties closer to those of a core modal verb. A more detailed analysis of the
semantic and syntactic features of these results is discussed in the following sections,
in order to test out theories put forward here.
3.4.3 Syntactic analysis of HAVEn’t to in Lancashire data
In the data, every occurrence of the HAVEn’t to construction occurs in the same
syntactic pattern. Each is preceded by a personal pronoun and followed by a verb
phrase (27 – 29).
(27) See we hadn’t to sit in them. (Sound Archive)
(28) Int’ neet he went, an’ th’ aggravation uv it were he hadn’t to feight for her.
But I towd her he could feight, for her un win. (Litcorp)
(Sound Archive)
As shown in the examples above, hadn’t to occurs with dynamic verbs such as hit,
talk and go. The literature (e.g. Fischer et al., 2000:301) suggested that HAVEn’t to
(and HAVE to) first occurred with the word order hadn’t + obj + Inf, changing later,
after reanalysis, to hadn’t to + Inf. The whole of the Lancashire data provides only
one example of the older form (30).
(30) He hadn’t mich to do as we never played fro music. We did at first […]
(Litcorp)
This suggests that this older form is now largely obsolete for Lancashire speakers.
Further syntactic analyses comparing HAVEn’t to with other related constructions are
offered in §3.5.6.
3.4.4 Semantic analysis of HAVEn’t to in the Lancashire data
A closer analysis of the data suggests that HAVEn’t to displays two different meanings
for Lancashire speakers, and that the difference is related to obligation. In terms of
Bybee et al.’s distinction between strong or weak obligation (1994:186), example (31)
displays the weak type.
(31) No no you haven’t to change or anything, come just as you are (Sound
Archive)
Here, the referent of you is not very strongly obligated to do something. The
meaning of HAVEn’t to, when used in this way, is more semantically similar to other
semi-modals such as NEEDn’t, DOn’t need to or DOn’t have to. This suggestion is
analysed with respect to the data in §3.5.5.
An example of HAVEn’t to classified as having strong obligation is shown in
(32). Here, the meaning of HAVEn’t to is closer to the core modal must.
(Sound Archive)
The difference in meaning between the strong and weak constructions can be
further demonstrated by looking at DO. The Standard English DOn’t have to is similar
to the meaning of HAVEn’t to displaying weak obligation in the Lancashire data. This
can be seen in example (33). Negative HAVE to constructions with DO are not
compatible with a meaning that displays strong obligation (34).
(33) It’s easy here, I hadn’t to / don’t have to get me car out on a Sunday, it’s
lovely to walk down the path (Sound Archive)
(34) You had to move your arms as well, but you hadn’t to / *didn’t have to
move your head, you’d to keep laid flat. (Sound Archive)
This difference can also be shown by providing further contextualization, as in (35)
and (36).
(35) You haven’t to go to the shop (because it’s dangerous)
(36) You haven’t to go to the shop (I’ve got enough food in the cupboard)
In order to examine distribution of weak and strong meanings, the frequency of these
constructions in the Lancashire corpus data is displayed in Table 4.
                 Obligation type
                 Weak         Strong
Litcorp          2 (50%)      2 (50%)
Sound Archive    3 (21%)      11 (79%)
TABLE 4. DIFFERENCE IN OBLIGATION TYPE IN HAVEN’T TO CONSTRUCTIONS IN LANCASHIRE DIALECT (RAW FREQUENCY RESULTS)
This data shows that the two different meanings, as outlined in (37 – 38), are
possible in Lancashire, with both corpora returning results for both variants. While the
Litcorp data is certainly too small to draw conclusions from and the Sound
Archive also does not return a huge number of results, a closer look at the Sound
Archive data shows that the three instances of weak HAVEn’t to are produced by three
different speakers. This shows that this difference cannot be explained simply as part
of a particular speaker’s own language use. Out of these three speakers, two use both
the strong and weak varieties within their interview, suggesting that HAVEn’t to has
two different meanings, at least for these speakers. The examples below show the
weak (37) and strong (38) examples for speaker E.D.
(37) Well I don’t know who it were what must have been Mayor what came up or
something you know. ‘Cos you hadn’t to pay or anything you know there
were always plenty of collections you know if you wanted to collect or give
anything. (Sound Archive)
(38) Erm well you hadn’t to have any dirty shoes on. Well they were very poor
and people hadn’t er clogs or anything, you had to have a clog fund to buy
these clogs. (Sound Archive)
This constructional difference was also tested on results for HAVEn’t to in the
BNC in order to see if this difference in meaning is also present in the Standard
English data. As shown in Table 3 earlier, the results for the same search returned
only 18 instances of this construction, of which the majority were found in the
demographically sampled spoken section of the corpus (indicating speakers of
regional varieties). Most of these instances are recorded as ‘north’ but no further
details are given and so the exact location of these speakers cannot be ascertained. The
BNC results suggest that this feature is not frequent in Standard English (compare the
18 results for HAVEn’t to with e.g. the 2,578 results for the semantically similar DOn’t
have to). This aside, differences in obligation types can also be found in the BNC data
for HAVEn’t to, as shown in examples (39-40).
(39) She said I can't tell you, I haven't to tell you! (BNC - KB8 5178)
(40) Oh well er I asked Joyce and she said erm, he hasn't to go in, he's not bad
enough (BNC – KB2 2435)
However, we can conclude that this construction is not frequent in Standard English
and so any differences in obligation type shown here cannot be attributed to Standard English usage.
3.4.5 Explanations for semantic differences – constructional polysemy
Goldberg (1995:65) suggests that constructions, like words, can be polysemous; this
means that a single form has two or more meanings that are semantically related.
Often, one meaning is a historical extension of the other meaning(s). This fits in well
with many suggestions about the development of the HAVEn’t to construction, and in
particular, with the variation in meaning within the same construction that is found in
the Lancashire data. For example, Krug (1996:56) suggests that the virtual absence of
not negation (for HAVEn’t to) in Present Day English points to a diachronic
development from an unproductive auxiliary negation (HAVEn’t to), to a DO negation.
It could be suggested that this diachronic change has led directly to the polysemous
meanings that the HAVEn’t to construction displays within the data.
While the data has shown that HAVEn’t to and the Standard English DOn’t have
to can be near synonyms (as in 41 and 42), HAVEn’t to can also be semantically similar
to must (43 and 44), as demonstrated in the invented examples below.
(41) You haven’t to sit over there [there’s plenty of room here]
(42) You don’t have to sit over there [there’s plenty of room here]
(43) You mustn’t sit over there [or you’ll get into trouble]
(44) You haven’t to sit over there [or you’ll get into trouble]
In order to get a fuller picture of how other modal constructions in the
semantic category of obligation behave and to examine whether or not the data
suggests that related constructions have undergone a diachronic change, §3.5
examines related constructions as an extension of the analyses carried out here.
3.5 Analysing the necessity/obligation construction family
3.5.1 Introduction
So far, this study has provided an analysis of the HAVEn’t to construction in
Lancashire dialect. This analysis now examines how the possible change in HAVEn’t to
may relate to changes in the frequency of other modal and semi-modal constructions
fulfilling a similar semantic role.
As previously suggested, semantically similar constructions may form a
construction family. In this particular case, the function of this family is one of
necessity or obligation. With that in mind, the focus of this study now turns to
constructions that share a similar meaning, but differ in terms of structure. This multi-
constructional approach aims to provide a broader picture of diachronic change for the
whole construction family. For methodological considerations relating to this section,
see §3.3.2.
3.5.2 Corpus results
Although many search terms were included in this data analysis, for reasons of clarity,
only those search terms that returned results from the corpus data are included in the
tables of results. The data in Table 5 shows the normalized frequency results for the
obligation and permission family of constructions in both the Sound Archive and
Litcorp data.
                    Frequency per 100,000 words
                    Litcorp     Sound Archive
SHOULDn’t           12.8        6.4
MUSTn’t             1.0         2.4
NEEDn’t             1.8         0.0
DOn’t need to       0.2         1.0
HAVEn’t to          0.8         4.7
HAVEn’t got to      0           0.1
DOn’t have to       0.4         6.3
TABLE 5. HAVEN’T TO FAMILY OF CONSTRUCTIONS (NORMALIZED FREQUENCY RESULTS)
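The normalization behind Table 5 is not spelled out in this section. A common approach, assumed here, is to scale raw hits by corpus size so that corpora of different sizes can be compared. The sketch below is illustrative only: the function name and the hit and token counts are hypothetical and are not taken from the thesis.

```python
def per_100k(raw_hits: int, corpus_tokens: int) -> float:
    """Return the frequency of a construction per 100,000 words."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    if raw_hits < 0:
        raise ValueError("raw_hits cannot be negative")
    # Multiply before dividing to keep the arithmetic as exact as possible.
    return raw_hits * 100_000 / corpus_tokens


# Hypothetical counts, for illustration only (not the thesis's corpus sizes).
example = per_100k(raw_hits=47, corpus_tokens=1_000_000)
print(round(example, 1))
```

Reporting a rate rather than a raw count is what allows a small corpus and a large one to be placed side by side in a single table.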
The semantic and syntactic differences represented in this data, along with the
possible diachronic change, are discussed in the subsequent sections.
3.5.3 Diachronic Change – testing the frequency hypotheses
The comparison between the Litcorp and the Sound Archive is most clearly
represented in Figure 3. The graph shows the members of the obligation family as they
(may) vary over time between the older Litcorp and the more recent Sound Archive;
results from the BNC are included here as a control.
[Figure 3: graph titled “Relative frequencies of the obligation family of constructions”. Vertical axis: frequency per 100,000 words. Horizontal axis: SHOULDn’t, MUSTn’t, NEEDn’t, DOn’t NEED to, HAVEn’t to, HAVEn’t got to, DOn’t have to. Series: Litcorp, Sound Archive, BNC.]
FIGURE 3. POSSIBLE DIACHRONIC CHANGE IN THE LANCASHIRE CORPUS DATA
This data shows that most constructions are more frequent in the more modern Sound
Archive than in the older Litcorp. As mentioned previously, grammaticalization refers
to the process in which a word (or multiword construction) undergoes a change in
form and function as a result of high frequency (Hopper & Traugott 2003:44). These
results are insufficient to establish that high frequency has played any part in possible
grammaticalization. Instead, syntactic and semantic evidence for this suggestion of
grammaticalization is examined further in §3.5.5 and §3.5.6.
The Constructional Competition Hypothesis suggested that constructions
showing a rise in frequency may be accompanied by other semantically similar
constructions displaying a fall. The data suggests that NEEDn’t displays a decrease in
frequency as compared with a possible increase in DOn’t need to, which may lend this
hypothesis some support. The increase in both HAVEn’t to and DOn’t
have to contrasts well with this change in NEEDn’t vs. DOn’t need to. Quirk et al.
(1985:146) suggest that in both British and American English constructions formed
with periphrastic DO (DOn’t have to, DOn’t need to etc.) have increased over time,
while their counterparts (e.g. HAVEn’t to, NEEDn’t) have decreased. The data for both
forms of negative NEED supports this theory, while the HAVEn’t to data goes against
this trend. Unlike Standard English, HAVEn’t to remains frequent in the Lancashire
data.
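The rises and falls discussed in this section can be read directly off Table 5. The sketch below is a descriptive illustration only. It assumes that the first figure in each row of Table 5 is the Litcorp value and the second the Sound Archive value, as the discussion implies, and the names `TABLE_5`, `direction` and `changes` are hypothetical.

```python
# (Litcorp, Sound Archive) frequencies per 100,000 words, copied from Table 5.
# Assumption: the first figure in each row is Litcorp, the second Sound Archive.
TABLE_5 = {
    "SHOULDn't": (12.8, 6.4),
    "MUSTn't": (1.0, 2.4),
    "NEEDn't": (1.8, 0.0),
    "DOn't need to": (0.2, 1.0),
    "HAVEn't to": (0.8, 4.7),
    "HAVEn't got to": (0.0, 0.1),
    "DOn't have to": (0.4, 6.3),
}


def direction(older: float, newer: float) -> str:
    """Label the change between the older and the newer corpus."""
    if newer > older:
        return "rise"
    if newer < older:
        return "fall"
    return "no change"


changes = {name: direction(old, new) for name, (old, new) in TABLE_5.items()}

for name, (old, new) in TABLE_5.items():
    print(f"{name:<15} {old:>5.1f} -> {new:>5.1f}  {changes[name]}")
```

These labels only describe the direction of the difference between the two corpora. With starting values this low, they say nothing about whether a difference is statistically meaningful.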
3.5.4 Considerations and contradictions
One of the most striking results shown by the data is the decrease in frequency of
SHOULDn’t. This change is contrary to data found in studies of change in Standard
English (e.g. Quirk et al., 1985:141). It may be the case that this trend also fulfils the
Competing Constructions Hypothesis as set out earlier, and that Lancashire speakers
choose to use semantically similar constructions, such as MUSTn’t or HAVEn’t to,
instead of using SHOULDn’t. This theory is expanded upon in §3.5.5, with a closer look
at meaning within the data.
However, the decrease in SHOULDn’t may have another explanation relating to
Litcorp. As can be seen on the graph in Figure 3, many of the Litcorp results (other
than SHOULDn’t) display low frequencies. All constructions other than SHOULDn’t have
frequencies between only 0-2 per 100,000 words. It could therefore be suggested that
obligation, as a concept, simply is not very important in these kinds of stories that
make up Litcorp. This same explanation may account for all increases in the data
(DOn’t have to, HAVEn’t to, MUSTn’t and DOn’t need to), as the starting
values for these constructions in the Litcorp are so low. Because of this, more
syntactic and semantic data is needed in order to support the claims put forward here.
3.5.5 Semantic evidence
In order to examine possible semantic reasons for change over time, the data can be
analysed based on the semantically strong / weak distinction as described earlier when
analysing the constructional polysemy of haven’t to (§3.4.7). Constructions displaying
strong obligation are those in which an implied source of authority exerts more force
over the people or entities involved; on the other hand, constructions displaying weak
obligation are those in which there is less force from the implied authority source, and
hence, although the force is implied, it is not binding on the people or entities; for
example compare we must not go home (strong) with we don’t have to go home
(weak). All results were analysed into the categories weak, strong, or both.
Constructions that were classified as both were those that displayed at least a 1:3 ratio
of both obligation types, (e.g. 5 weak uses and 15 strong uses, or vice versa). The
results can be seen in Figure 4.
STRONG          STRONG AND WEAK     WEAK
MUSTn’t         HAVEn’t to          DOn’t have to
SHOULDn’t       HAVEn’t got to      NEEDn’t
                                    DOn’t need to
FIGURE 4. DISTRIBUTION OF CONSTRUCTIONS DISPLAYING WEAK AND STRONG OBLIGATION
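The weak / strong / both classification described above can be sketched as a small function. This is one possible reading of the "at least a 1:3 ratio" criterion, based on the worked example in the text (5 uses of one type against 15 of the other counts as both). The function name and the handling of zero counts are assumptions and are not part of the original analysis.

```python
def classify_obligation(weak: int, strong: int) -> str:
    """Label a construction 'weak', 'strong' or 'both' from its use counts."""
    if weak < 0 or strong < 0:
        raise ValueError("counts cannot be negative")
    if weak == 0 and strong == 0:
        raise ValueError("no attested uses to classify")
    minority, majority = sorted((weak, strong))
    # 'both' when minority : majority is at least 1 : 3.
    # Integer arithmetic avoids floating-point rounding at the boundary.
    if minority * 3 >= majority:
        return "both"
    return "strong" if strong > weak else "weak"


print(classify_obligation(weak=5, strong=15))
print(classify_obligation(weak=4, strong=15))
```

Under this reading, 5 weak and 15 strong uses sit exactly on the threshold and are classed as both, whereas 4 weak and 15 strong uses fall below it and are classed as strong.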
The diagram in Figure 4 shows those constructions having meanings relating
to weak and strong obligation and those which have both. Here it is suggested that
both HAVEn’t to and HAVEn’t got to display constructional polysemy, where both
strong and weak meanings are possible. Harris and Campbell (1995:26) suggest that
grammaticalization is gradual, and that the maintenance of multiple meanings of a
construction by the speaker suggests an on-going change.
Further to this, as both HAVEn’t to and HAVEn’t got to appear with both weak
and strong meanings (unlike other constructions that only display weak obligation), I
would suggest that these constructions are more semantically similar to the strong core
modal verbs MUSTn’t and SHOULDn’t. This reinforces the Grammaticalization
Hypothesis as already indicated by the frequency data. The grammatical difference
between HAVEn’t to and HAVEn’t got to is compared in §3.5.6 in order to uncover
which construction may be more grammaticalized.
The Competing Constructions Hypothesis suggests that if HAVEn’t to shows a
rise in frequency (possible grammaticalization), other constructions having a similar
meaning may show a fall in frequency when the Sound Archive and Litcorp are
compared, due to their function being ‘taken over’ by other constructions. Decrease in
both SHOULDn’t and NEEDn’t (underlined in Figure 4) may be related to the observed
increase in HAVEn’t to and DOn’t have to in this way. It has been shown earlier that in
the Lancashire data HAVEn’t to occurs with the strong meaning 79% of the time. This
suggests that in Lancashire the increase in HAVEn’t to (in this strong sense) and the
decrease in the semantically similar SHOULDn’t may be linked, thus suggesting that the
Competing Constructions Hypothesis may be true. The increase in MUSTn’t could also be
attributed to the decrease in SHOULDn’t.
3.5.6 Syntactic evidence
In order to further support the suggestion that the HAVEn’t to construction in current
Lancashire dialect displays syntactic properties closer to that of a core modal verb, it
is useful to look back to the NICE properties, as detailed by Quirk et al. (1985:147).
As many of the syntactic constructions are relatively rare (e.g. ellipsis), in
instances where no examples were present in the data, a group of ten informants who
identified themselves as Lancashire dialect speakers were asked to make judgements
on their acceptability by using test sentences such as (45). (See Chapter 1 for more
information about informants). 14
(45) Oh no, I’ven’t to go there, it’s too far away.
As much as possible, data from the Lancashire corpora is used to determine
whether or not each of the NICE properties is possible.
Syntactic properties of modals (NICE properties)

Negation (without do)   Inversion   Contractions   Ellipsis
SHOULDn’t    
MUSTn’t    
HAVEn’t to   ? ?
NEEDn’t    
HAVEn’t got to    ?
DOn’t need to    
DOn’t have to    ?
TABLE 6. OBLIGATION CONSTRUCTION FAMILY AND THE NICE PROPERTIES
14 The full list of test sentences can be found in Appendix C.
This data supports the hypothesis that HAVEn’t to is relatively highly
grammaticalized in the Lancashire dialect data. HAVEn’t to (along with NEEDn’t) is
more syntactically similar to the core modals SHOULDn’t and MUSTn’t, suggesting that
it is more grammaticalized than other semi-modals in this construction family. Other
semi-modals such as those formed with periphrastic DO behave much less like modal
verbs. HAVEn’t to is the only construction in this family that may be able to have a
contracted form. As I am not a Lancashire speaker myself, it was necessary to check the
acceptability of this contraction with a small group of Lancashire dialect speakers, as
mentioned previously (please see §1.3.5 for further details on these informants). This
mini-experiment involved the presentation of a number of sentences displaying
contracted HAVEn’t to to the speakers and asking if they found it to be acceptable or
not, and if they had heard it in use. While this test was too small-scale to give any
significant results, it is perhaps interesting to note that around half of the
informants reported that either they have heard it from a Lancashire dialect speaker, or
would use it themselves.
Given the results presented earlier in the chapter, it is possible to conclude that the
Grammaticalization Hypothesis, i.e. that the semi-modal HAVEn’t to has changed
towards a more modal function, is largely supported. The results from the
diachronic data suggest that semi-modal HAVEn’t to now behaves more like a core
modal verb for Lancashire speakers than is the case in Standard English. The semantic
analysis supports this assertion, showing that HAVEn’t to displays constructional
polysemy for Lancashire speakers and this differentiation is present in both the
Litcorp and Sound Archive data. It seems that in a majority of cases (in the Sound
Archive data), its meaning is semantically closer to the stronger modal verbs MUSTn’t
or SHOULDn’t rather than to the weaker DOn’t have to. The syntactic analysis also
suggests that HAVEn’t to has become more grammaticalized. The comparison of the
NICE properties clearly shows that HAVEn’t to is syntactically closer to a core modal
like SHOULDn’t or MUSTn’t, as it conforms more closely to the NICE properties.
The results are not without question; although the semantic and syntactic
arguments clearly show that, synchronically, the HAVEn’t to construction has become
grammaticalized in the Lancashire dialect data, the diachronic data does not
conclusively show the process of grammaticalization taking place from the Litcorp
period to the Sound Archive period. This may be due to the somewhat problematic
nature of the comparison between the written Litcorp and the spoken Sound Archive
data: the former is written ‘consciously’ by the author, while the latter is spoken
‘naturally’. The construction family analysis seems to show polarised results for
Litcorp, with SHOULDn’t returning more than three times as many results as all other
constructions combined; it may be that the authors have no obvious, salient, or
‘dialectal’ way to represent negative obligation, and so instead use SHOULDn’t as
perhaps a neutral or default choice.
While the argument for the grammaticalization of HAVEn’t to is persuasive,
though diachronically not conclusive, the verdict on construction competition is not as
clear. The constructional polysemy of HAVEn’t to, along with that of the similar
HAVEn’t got to, could also be used to support the Construction Competition
Hypothesis, i.e. that constructions showing a rise in frequency may be accompanied
by a fall in other semantically similar constructions. The possible decrease in both
SHOULDn’t and NEEDn’t may be explained by the rise in frequency of other
constructions that are able to fulfil a similar semantic role. However, as discussed
previously, the Litcorp data is not completely reliable in this respect, and so no firm
conclusions can be drawn. It may be that the authors of Litcorp overuse certain
modals, or, on the other hand, it may be that this is indeed representative, in which
case the Construction Competition Hypothesis would be validated.
This analysis of a number of competing variants has implications for studies of
language change such as that of Kroch (1989). The S-curve model of language change
examines two variants and suggests that as one competing variant increases, the other
decreases, thus producing the S-curve as shown in §3.2.7. However, this approach
will, in many cases, be too narrow. This study suggests that often there may be a
number of similar constructions which are able to fulfil a particular semantic role, thus
meaning that speakers are not limited to a choice of only two variants in opposition.
The interaction between these variants is complex, and cannot easily be accounted for
by Kroch’s S-curve model. The results of this study suggest that a wider scope of
focus will often be necessary when looking at diachronic change.
While this investigation has yielded interesting results, there are a number of
limitations involved with both the methodology used here, and with the scope of this
thesis in general. One of the main limitations, as mentioned previously, is data
sparseness. Although the results for HAVEn’t to returned low frequencies, there were
enough examples both to show that the construction exists and to carry
out an analysis of aspects of its syntax and its semantics. While enlarging the corpora
may provide further evidence, Biber et al. suggest that the obligation/necessity
modals, such as those examined here, are less common overall than other modal
categories (Biber et al., 1999:493).
Further elicitation or acceptability judgement tests have proven useful in other
studies (see e.g. Cowart, 1997; Schütze, 1996) and would be a good option in order to
further this study. A combination of these approaches (as put forward by Hollmann &
Siewierska, 2006) should give a good overall picture of how this construction family
is used, and account for the limitations encountered in this study. This approach is
adopted in other chapters in this thesis (e.g. in Chapter 4 when testing habitual
constructions).
It was suggested earlier that sociolinguistic salience may have been a motivating
factor for the low frequency of the construction in question in Litcorp, which
somewhat undermines the frequency results presented here. This variable is analysed
in more detail in Chapter 5.
Previous work on the semi-modal has focussed mainly on the positive construction
HAVE to rather than HAVEn’t to (see e.g. Brinton, 1991; Quirk et al., 1985; Biber et al., 1999; Fischer
et al., 2000). This project could be furthered by more analysis of the positive
construction, and a comparison of the Lancashire results to other findings from
Standard English. There is also the potential to take a wider view of these
constructions; some of the generalizations suggested here could be used as a basis for
looking at all modal verbs in Lancashire dialect data. Looking at further modals and
semi-modals would help towards examining possible competition within the
obligation family of constructions.
Chapter 4. Verbal agreement and the Northern Subject Rule
4.1 Introduction
The Northern Subject Rule (NSR) is a phenomenon of nonstandard subject-verb
agreement that is reported to be commonly found in varieties of English. According to
the NSR any present indicative verb in any person may take the suffix –s (normally of
course only associated with 3sg) except for when found directly adjacent to any non-
3sg personal pronoun, thus giving the distinction they go home, but the children goes
home.
It is suggested that the NSR is prevalent throughout Northern England
(Pietsch, 2005; Ihalainen, 1994; Murray, 1873) and also in a number of areas beyond
this (Rupp and Britain, 2008; McCafferty, 2003; Godfrey and Tagliamonte, 1999).
However, few in-depth region-specific analyses of the NSR have been conducted,
with some studies (e.g. Börjars and Chapman, 1998; Henry, 1995) including little or
no data in their research. Alongside this, most studies do not address variables such as
the possible interplay between the NSR and other similar constructions, nor examine
cognitive-perceptual factors such as salience or frequency of usage, as potential
explanations for this agreement variation.
These issues are considered in this chapter, where data from spoken and
written sources, along with acceptability judgements from questionnaires, is used in
order to explore both the possible instances and acceptability of the NSR in
Lancashire. A broader question relating to synchronic theories of language variation is
also explored; i.e. to what extent is variation in syntactic and morphological
phenomena (such as the NSR) the result of relatively clear rules or constraints, and to
what extent is this variation more idiosyncratic, unpredictable and region or
community-specific? Resolving this question for a particular phenomenon is difficult.
It is even possible that agreement variation (such as that proposed to be demonstrated
by the NSR) is determined both by rules and also by idiosyncratic region-specific
variation. This possibility is discussed further with respect to the results from
Lancashire in §4.5.
4.1.1 Overview
Standard English resembles many other world languages in that it displays agreement
between the verb and subject (see e.g. Siewierska, 2004), whilst also differing in a
number of other ways; with lexical verbs, agreement is confined to the present
indicative only, with no marking on verbs in the subjunctive or imperative mood. BE
has a more elaborate agreement paradigm (although this is not uncommon cross-
linguistically in European languages, e.g. German, Dutch), a remnant of the older
Germanic agreement system, as shown below.
       Present                        Past
       Old High German    English     Old English    English
1sg    bim, bin           am          wæs            was
2sg    bist               are         wǽre           were
3sg    ist                is          wæs            was
1pl    birum              are         wǽron          were
2pl    bir(e)n            are         wǽron          were
3pl    birut, bir(e)t     are         wǽron          were
TABLE 1. PARADIGM OF BE IN OLD HIGH GERMAN, OLD ENGLISH AND ENGLISH
Agreement in Present Day English displays a considerable amount of
syncretism, where a single form serves two or more morphosyntactic functions (see
e.g. Corbett (2006) for a discussion of this). In the case of regular lexical verbs, there
is only one overt marker of person agreement, namely –s, used to indicate 3sg in the
present indicative – all other forms are zero, as shown in (1-2) respectively.
(1) He / she / it likes chocolate.
(2) I / you / we / they like chocolate.
This Present Day English agreement system is thought to have arisen due to changes
in word order that initially left person and number marked on the verb by both
pronouns and by verb endings as shown in Table 2, (modified from Van Gelderen,
2006).
      Present indicative
1     ic find(e)
2     thou findes(t)
3     he findeþ / he findes
Pl    we, ye(e), thei findeþ/en
TABLE 2. LATE MIDDLE ENGLISH PRESENT INDICATIVE AGREEMENT WITH FIND
This “double marking” of the verb meant that verbal endings eventually became
weakened and, to an extent, redundant in many Germanic languages. Compare, in this
connection, many Romance languages, where the subject pronoun is normally omitted
and person/number is therefore frequently only signalled by verbal morphology.
Despite this change, 3sg verbal agreement distinctions were kept in English, and today
substantial variation with the 3sg form is present in many regional varieties (see
§4.2.1 for a further discussion of the history of variation associated with the NSR in
particular). More generally, departures from standard verbal agreement in both British
and worldwide varieties of English are common and have been widely studied (e.g.
Cheshire, 1982; Kortmann and Schneider, 2004; Labov et al., 1968; Trudgill, 1999).
Many occurrences of verbal agreement variation in British English are often
subsumed under the Northern Subject Rule. This phenomenon is also referred to as
nonstandard agreement (Cheshire, 1982), singular concord (Henry, 2005), northern
present-tense rule (Montgomery, 2004) and is also outlined by others (e.g. Rupp,
2005; Hudson, 1999). Despite apparent differences in terminology, all suggest that, as
discussed earlier, in varieties of English any present indicative verb in any person may
take the verbal suffix -s except when directly adjacent to non-3sg personal pronouns,
as shown in (3-5), (modified from Pietsch, 2005:1).
(3) They sing
(4) The birds sings
(5) They always sings
The examples given above by no means encompass a definitive description of the
NSR; much of the literature suggests that the type and position of the subject may also
influence the application of this pattern (e.g. Pietsch, 2005; Godfrey and
Tagliamonte, 1999). This is discussed further in §4.1.2.
Nonstandard use of 3sg verbal agreement is found in many regions of the
British Isles, and while it has been suggested to be prevalent in the North (e.g. by
Pietsch 2005; Klemola 2000), it is not exclusively located in these areas. The idea of
the Northern Subject Rule being exclusively “northern” is somewhat misleading;
alongside Northern England (and Scotland), NSR agreement has been identified in
Ulster (McCafferty, 2003), and in the Southwest (Godfrey & Tagliamonte, 1999).
Interestingly, many dialects of English also have differing agreement patterns related
to 3sg variation. Varieties found in East Anglia (e.g. Britain & Rupp, 2005) and in
Buckie Scots (Smith & Tagliamonte, 1998) display agreement patterns in direct
opposition to the NSR where 3sg forms are more commonly found with adjacent 3sg
pronominal subjects than with an NP subject, e.g. the cat purr, it purrs (taken from
Britain and Rupp, 2005).
Pietsch (2005:22) suggests that NSR agreement can be found in the Lancashire
part of the Freiburg English Dialect corpus (henceforth, FRED) and the Survey of
English Dialects (henceforth, SED). However, the Lancashire part of the FRED and
SED data contains relatively few speakers (when compared to this study); a more
thorough investigation is required in order to determine the extent to which the NSR
may be present in Lancashire more widely rather than being limited to a small group
of speakers as tested by the FRED and SED data. The present analysis of a corpus of
19th and 20th century Lancashire dialect literature also allows tentative claims to be
made about possible changes to verbal agreement in this region whilst also providing
insights into the use of the NSR in historical written texts.
4.1.2 A focus on the NSR
As shown in examples (3-5), the traditional definition of the NSR suggests that
present indicative verbs may take -s verbal agreement in all circumstances, except
when directly adjacent to a non-3sg personal pronoun subject. This specific influence
of the pronoun on the verb is discussed by Pietsch (2005) as subject type, by
McCafferty (2003) as NP/PRO constraint, by Godfrey and Tagliamonte (1999) as
type-of-subject constraint, and by many others in less explicit terms (e.g. Montgomery
1994b; Cheshire and Fox, 2006). Alongside this subject type restriction, many also
suggest that the position of the subject in relation to the verb determines the
application of the NSR, where non-3sg personal pronoun subjects which are separated
from the verb by a clause or phrase may also take 3sg verbal agreement, as shown
earlier, e.g. (3) vs. (5).
It can be argued that pronominal adjacency may override the possible use of
nonstandard verbal agreement forms ensuring that, instead, standard agreement
occurs, thus avoiding a conflict with the person and/or number features of the pronoun
and the verb. These subject type (i.e. pronoun vs. non-pronoun subject) and subject
position (i.e. adjacent vs. non-adjacent subject) restrictions are the main constraints
associated with the NSR and are largely agreed upon in the literature. These
constraints are tested with respect to the Lancashire data in §4.4.
Alongside subject type and subject position restrictions, a number of other
constraints have also been suggested to affect the application of the NSR. Both
Godfrey & Tagliamonte (1999:97) and Bailey et al. (1989) suggest a heaviness
constraint, where the phonological size of the subject may affect agreement. This
would imply that longer and more phonologically dense (or heavy) noun phrases are
more likely to occur with NSR agreement. However, Godfrey & Tagliamonte (1999)
provide no exemplification of what exactly they mean by heavy. It is unclear if
heaviness refers to subjects with pre/postmodification, as in (6), coordinated NPs, as
in (7), or both (both of these examples are invented to exemplify this). Godfrey &
Tagliamonte also report that no examples of this constraint are found in their results.
(6) The children who are always with the dog likes going outside.
(7) The man and the dog likes going outside.
Although we know that heaviness can affect word order (see e.g. Hawkins,
1994) there is no clear reason as to why heaviness should affect agreement. Instead, I
would suggest that Godfrey and Tagliamonte’s heaviness constraint may be part of (or
indeed encompass) a wider pattern of agreement that is not specifically regional, nor
part of the NSR. This involves the most verb-adjacent part of a long or complex
subject (rather than the whole subject NP) agreeing with the verb (in (6-7) this is the
dog). This can be further demonstrated in (8), taken from the British National Corpus
(BNC).
(8) His writing-room on the first floor contains an unprepossessing table and a
sideboard, on which sit his word-processor and printer. A small chair and
bookcase completes the picture. (BNC AOP12)
In this example, a small chair and bookcase could be considered as 3sg or as 3pl but it
is found with 3sg agreement, perhaps due to the adjacent 3sg bookcase. This
phenomenon is described by Quirk et al. (1985:35) as proximity agreement, or number
attraction, and by Pietsch as processing-induced non-agreement (2005:12). Pietsch
also suggests that it is “not part of NSR agreement proper”, and this is the stance I will
also take. Nonetheless, examples of such agreement in the Lancashire data are
presented in §4.4.2. As this number attraction construction does share structural
similarities with the NSR (namely, non-3sg subjects can occur with 3sg agreement), it
therefore may compete with it or overlap with its use in some way. The concept of
constructional competition is discussed further in §4.4.2.
Variations on the ‘traditional’ definition of the NSR have led to further
refinement and significant weakening of the NSR, one of the most inclusive
definitions being from Pietsch (2005:30) as shown below.
- All 3sg subjects (and, where found, thou) always take –s
- All other subjects, except personal pronouns, take –s variably
- Non-adjacency of subject and verb favours –s
Pietsch’s inclusion of thou within the definitions of the NSR is perhaps doubtful here.
Thou normally takes –s in most dialects where it occurs, i.e. its agreement is identical
to 3sg subjects (see e.g. Shorrocks, 1999:93; Smith and Tagliamonte, 1998), and so it
is unclear exactly why this should be part of the NSR. Nonetheless, instances of thou
are retained in the analysis for descriptive clarity. The generalized
definition presented above encompasses the possible variability of subject type and
position with respect to the verb as outlined earlier – all of which are explored within
the Lancashire corpus data in §4.4.
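The subject type and subject position constraints in the definition above can be expressed as a simple decision procedure. The sketch below is a minimal illustration under stated assumptions, not a model used in this analysis. The class `Subject`, the function `s_marking` and the three labels are hypothetical, and non-adjacent personal pronouns are treated as allowing variable –s, following example (5).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    text: str
    third_singular: bool    # 3sg subjects (thou is treated alike in the definition)
    personal_pronoun: bool  # I, you, we, they, ...
    adjacent_to_verb: bool  # nothing intervenes between subject and verb


def s_marking(subject: Subject) -> str:
    """Return 'always', 'variable' or 'none' for -s on a present indicative verb."""
    if subject.third_singular:
        return "always"    # he sings
    if subject.personal_pronoun and subject.adjacent_to_verb:
        return "none"      # they sing
    return "variable"      # the birds sings; they always sings


examples = [
    Subject("they", third_singular=False, personal_pronoun=True, adjacent_to_verb=True),
    Subject("the birds", third_singular=False, personal_pronoun=False, adjacent_to_verb=True),
    Subject("they (always)", third_singular=False, personal_pronoun=True, adjacent_to_verb=False),
]
for subject in examples:
    print(subject.text, s_marking(subject))
```

The three example subjects correspond to (3-5). The label 'variable' only records that –s is licensed, not how often it occurs; the rate in the Lancashire data is an empirical question.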
4.1.3 Variation with BE
Alongside 3sg agreement variation with present indicative lexical verbs, many studies
also include analyses of nonstandard verbal agreement associated with 3sg forms of
BE as shown in example (9).
(9) The eggs is cracked. (Henry, 1995:12)
Although definitions of the NSR do not typically account for past tense variation (as
of course lexical verbs have no 3sg distinction in the past), the more complex
paradigm of BE allows this possibility to be explored. It is suggested that in dialects
displaying NSR variation, nonstandard 3sg BE in all tenses is able to occur in non-3sg
contexts due to analogical levelling with the NSR (or put simply, due to the spread
and influence of a dominant pattern, see e.g. Henry, 2005; McCafferty, 2003; Godfrey
& Tagliamonte, 1999). Subject type and subject position constraints (associated with
the NSR) are suggested to apply to verbal agreement patterns with BE. This means that
verb-adjacent personal pronouns occur with standard agreement (in this case with
were) as in (10), while adjacent subjects which are not personal pronouns may allow
nonstandard agreement, (in this case with was), as in (11), (taken from Britain and
Rupp, 2005:3).
(10) They were purring.
(11) The cats was purring.
Another factor that has been suggested as affecting the use of was/were is the
polarity of the clause. Nonstandard were (i.e. in the 1sg and 3sg form) is said to occur
more frequently with negative polarity subjects (Cheshire & Fox, 2006:3; Anderwald,
2001:3; Henry, 1995: 22) as shown in (12-13) modified from Tagliamonte (1998:22).
(12) You weren't a long way away though.
(13) *You wasn't a long way away though.
This negative polarity variable is also considered by Henry to be ‘parametrically
linked to the NSR’ (1995:20), although this hypothesis has not been addressed in other
studies. Any possible links between negation and the NSR will be explored,
particularly in relation to BE in §4.4.4.
Variation with past tense BE is found in Northern England – the region most
typically associated with NSR agreement (see e.g. Hollmann and Siewierska, 2006;
Tagliamonte, 1998). Alongside this, variation with BE is also present in many areas
that are not suggested to display NSR agreement, e.g. in London (Cheshire & Fox,
2006) in the English Fens (Britain, 2002), and in certain worldwide varieties of
English (see e.g. Schilling-Estes, 2000; Wolfram and Sellers, 1999). This would
suggest that while there may be a link between the NSR and nonstandard 3sg variation
with BE in regions where NSR variation is prevalent, was/were variation may also
occur independently of this rule. This means that this ‘independent’ was/were
variation, in particular, may not be restricted by the subject type and subject position
constraints of the NSR as outlined in (10-11). Greater variability with was/were is
often ascribed to regions of the UK such as Lancashire and Yorkshire (Pietsch, 2005).
Tagliamonte (1998:160) suggests that amongst present day speakers from the city of
York the was/were alternation is found “in the speech of the same individual, in the
same sentence, and in all grammatical persons”. A preliminary examination of past
tense BE variation in Lancashire was undertaken by Hollmann & Siewierska (2006:25)
using a subsection of the data used in this study. As with Tagliamonte’s findings from
York, the Lancashire data showed both inter- and intraspeaker variation with
was/were. In Lancashire, the past tense BE paradigm showed levelling towards was,
but more interestingly also towards were. In the study of a community of high school
speakers from nearby Bolton, Moore (2003:386) also found that the overwhelming
tendency was towards levelling to were. In both studies were levelling appears to be
frequent in all sentence types; this differs from other regional varieties, where
levelling to were is suggested to occur mainly in negative polarity contexts (e.g.
Cheshire & Fox, 2006:3). These was/were findings for Lancashire (and to some
extent, York) appear to conflict with the earlier hypotheses which suggested that in
regions where NSR agreement is present, it is analogically extended to was/were
variation. In Lancashire the non-3sg pattern can be extended to all contexts – this
opposes the NSR agreement pattern where 3sg patterns are extended to non-3sg
contexts. NSR agreement is suggested to be found both in Lancashire and Yorkshire
(Pietsch 2005, Ramisch 2009) although in these regions was/were agreement appears
to be less restricted (or in fact even completely unrestricted) by the proposed analogy
with the NSR. This may suggest that other variables, such as constructional frequency,
may affect was/were variation in these regions.
It is also acknowledged that a number of other non-3sg nonstandard agreement
patterns aside from levelling to were exist within the Lancashire data, e.g. (14-15).
Variation such as this is discussed further in §4.3.
(14) “If he’re poorly he ‘ad betther have a cab, an’ go whoam.” “Poorly?” Bob
said, lookin as if he could like t’ ha’ put th’ waiter i th’ doctor’s honds.
“Dustno know good singin when theau yers it, theau donned-up mopstail?”
(Litcorp)
(15) Well we told the skipper he says "oh he's not blind" he says "he just wants to
go back, he don't want to go to sea for Christmas." (Sound Archive)
4.2 History of the NSR
4.2.1 Origins of the NSR
NSR agreement can be found in Northern English texts as early as the late Middle
English period (Filppula et al., 2002:49); the exact origins and development of this
agreement pattern before this time are largely unknown due to a lack of written data.
Typologically, NSR-type agreement is quite rare. Pronoun adjacency constraints such
as those found with NSR agreement do appear in a handful of languages, namely
Arabic, Tagalog, Hebrew and some other Semitic languages (see e.g. Filppula et al.,
2002:47), although these languages typically display a full agreement paradigm, rather
than the limited agreement paradigm found in English. This means that the presence
of such a selective agreement pattern in a language that typically displays
comparatively little verbal agreement is highly unusual.
Klemola (2002) suggests that the reflexes of the NSR agreement pattern found
in Northern varieties are not an innovation, but instead are a retention of an older
agreement pattern that underwent changes due to factors such as language contact and
language-internal variation. The loss of the more complex agreement system found in
Old English is part of a more general loss of affixation that may be found in all
Germanic languages, although reasons why only part of this pattern remains are
difficult to ascertain. Two independent developments from two different English
dialects, one northern variety and one southern, may have contributed to the
development of the NSR. Firstly, in the North vowels in the common Germanic
singular forms –u, and –ið underwent a process of weakening, becoming -e, -eð during
Old English. The -ð forms were then replaced with –s in both the plural and in the
third singular sometime later, and the vowels in the plural and 3sg endings (–að and
–eð) also lost their contrast. The -e ending in the first singular eventually became zero.
Secondly, the innovation of affixless (or zero) forms at first occurred only in a certain
restricted set of syntactic environments, namely adjacent to pronouns. This
development was apparently initiated by the southern dialects and only began to reach
the North at some time during late Old English. These zero forms were then
reinterpreted as markers of agreement (Barlow and Ferguson, 1998:183). These
changes meant that by the end of the ME period the present tense paradigm of lexical verbs
contained only two distinct forms, the 3sg -(e)s and –Ø for all other forms (although
previous to this, further distinctions were retained). The –Ø verbal endings now
occurred when adjacent to pronouns (except in 3sg contexts), and –(e)s occurred
variably in all other contexts. This pattern of agreement forms the basis of the NSR.
These changes are summarised in Table 3.
          Germanic   Old English   Late Northern   Middle
                                   Old English     English
1sg       -u         -e            -Ø              -Ø
2sg       -is        -is           -s              -Ø
3sg       -ið        -eð           -(e)s           -(e)s
1pl       -að        -að           -s              -Ø
2pl       -að        -að           -s              -Ø
3pl       -að        -að           -s              -Ø
Changes marked between stages: vowels weakened; ð → s; að/eð contrast lost; -e becomes Ø.
TABLE 3. DIACHRONIC CHANGES IN THE VERBAL AGREEMENT SYSTEM IN NORTHERN
ENGLISH.
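The Middle English pattern described above can be restated as a simple decision rule. The following Python sketch is an illustration only: the function name and its arguments are hypothetical, and it treats as categorical a pattern that is described above as variable.

```python
def nsr_ending(person: str, subject_is_pronoun: bool, adjacent: bool) -> str:
    """Return the present tense ending under a categorical reading of the NSR.

    person: one of "1sg", "2sg", "3sg", "1pl", "2pl", "3pl".
    subject_is_pronoun: True if the subject is a personal pronoun.
    adjacent: True if the subject is directly adjacent to the verb.
    """
    if person == "3sg":
        return "-(e)s"  # 3sg takes -(e)s in every context
    if subject_is_pronoun and adjacent:
        return "-Ø"     # zero ending next to a non-3sg personal pronoun
    return "-(e)s"      # elsewhere: full NP subjects or non-adjacent pronouns


print(nsr_ending("3pl", True, True))    # adjacent pronoun subject
print(nsr_ending("3pl", False, True))   # full NP subject
print(nsr_ending("1sg", True, False))   # pronoun subject, not adjacent
```

The three calls correspond to the three contexts distinguished in the text: an adjacent non-3sg pronoun, a full NP subject, and a pronoun separated from its verb.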
Corbett (2006) suggests that the reduction and syncretism of the agreement
affixes may be due to the rise of subject pronouns. As subject pronouns became the
routine way of expressing person reference, person marking on the verb became
functionally redundant. When the verb and subject were not adjacent, the verbal –s
agreement was kept (although later lost in Standard English), thus resulting in the
NSR agreement paradigm (see Siewierska, 2004:277-81 for a further discussion of the
decline in verbal agreement in English).
Reasons for these changes in English leading up to the development of the
NSR are often attributed to language contact with Celtic (e.g. by Isaac, 2003) or
Scandinavian (e.g. by White, 2002). While no direct link between Scandinavian and
the NSR is described in the literature, the sound change resulting in –s verbal endings
in Northern England, which later enabled NSR changes to occur, is often attributed to
influence from Old Norse. This is because Old Norse had syncretised the 2sg and 3sg
agreement marking, with both forms ending in the uvular trill /R/. In a language
contact situation, this variant may have been perceived by English speakers as
something similar to /s/ or /z/. This parallel may have allowed the spread of -s from
the 2sg to the 3sg in Northern English at that time by analogy with Old Norse.
However, this does not explain the spread of verbal -s also to the plural forms, as Old
Norse had three distinct forms in the plural. Old Norse also had no alternation of the
agreement paradigm according to adjacency of subject and verb, and so could not
have influenced the NSR in this respect. This aside, if it is accepted that the
development of verbal –s in varieties of Northern English may be attributed to Old
Norse, then Old Norse may be considered as playing a role in the appearance of the
NSR, albeit not directly.
Celtic is also suggested as an influence on the NSR due to agreement patterns
in Brythonic languages (and in particular in Welsh) displaying restrictions which
resemble those of the NSR (Vennemann, 2000). Specifically, the Brythonic agreement
system has person and number inflections whenever the clause has no overt subject
NP, or only a weak personal pronoun. With plural subject NPs, an unmarked third
person verb form is used as shown in (16-17) below taken from King (1993:137).
(16) Maen nhw ’n dysgu Cymraeg.
be.PRES.3P they PROG learn.INF Welsh
‘They’re learning Welsh.’
(17) Mae Kev a Gina yn dysgu Cymraeg.

be.PRES.3S Kev and Gina PROG learn.INF Welsh
‘Kev and Gina are learning Welsh.’
Although this agreement pattern is similar to the NSR, it has been suggested that
possible influence of Brythonic on the NSR, observable in Middle English, does not
fit with respect to the timeline of settlement and language contact (see e.g. Klemola,
2000 for more details on this). While language contact with Scandinavian or Celtic
languages may arguably have had some role in NSR developments, dialect contact
between the Northern and Southern varieties of English appears to have been the most
important factor. The combination of language-internal change and dialect contact can
be considered as the cause of nonstandard verbal agreement patterns (such as the
proposed NSR) found in modern varieties.
4.2.2 Constructional competition
It has been suggested (e.g. by Kroch, 1989; Culicover, 2008) that constructions that
share a similar syntactic form or semantic interpretation may compete and overlap in
the minds of speakers, often over time resulting in one form being reanalysed as
another (as we have seen earlier in this chapter with both Ø and /R/ as markers of
agreement). This competition may apply to the NSR, as of course not every example
of nonstandard 3sg verbal agreement without an adjacent personal pronoun is
automatically an instance of the NSR. A number of constructions which are
superficially similar to the NSR can be found in this data. For example, initially, it
appears as if the NSR may be found in historical present constructions e.g. (18).
(18) The folk there says, “get off ‘ome Thurson.” O’ th’ evils o’ drinkin! So I
went back to our house, th’ missus was fast asleep, good job too. (Litcorp)
Historical present constructions bear some resemblance to the NSR in that they are
able to have 3sg verbal agreement in persons other than 3sg, as shown in (18).
However, the historical present construction does not display variation in agreement
based on subject type and subject position constraints. This means that the historical
present construction allows adjacency of any personal pronoun and the 3sg verbal
agreement form. This construction is also semantically different from the NSR in that
it uses present tense verb forms to narrate events that are in the past (Huddleston &
Pullum, 2002:129-131). This semantic difference between the NSR construction and
the historical present can be resolved by examining other verb forms in the utterance
or sentence. For example, in (18) use of the past tense went suggests that this utterance
refers to events in the past; the same can be said for used to, sold and was in example
(19).
(19) He used to make home-made toffee and er he sold milk and bacon and cheese
and all that, and there was a crowd in there, and I goes charging to the
counter. “A gill of milk Mr Jackson please”, which was about a penny or
something like that. (Sound Archive)
Habitual constructions also display similarities to both the NSR and the
historical present by allowing 3sg agreement with non-3sg subjects. Habitual
constructions refer to actions that occur repeatedly e.g. (20-21).
(20) Most days, men fro’ town goes down dock 9.30 sharp. They always walked
past window, right shoutin n all. (Litcorp)
Like the historical present, this construction does not conform to the adjacency
constraints characteristic of the NSR, and therefore, like the historical present, it may
be found with verb adjacent pronouns in any person, e.g. (21).
(21) Every Wednesday, Ah comes th’ same time. (Litcorp)
Habitual constructions often include temporal adverb phrases such as most days or
every Wednesday as shown in (20-21) respectively. However, these constructions do
not always require these adverb phrases in order to represent habitual semantics. This
makes habitual constructions more problematic than the historical present ones with
respect to disambiguation. For example, (22-23) are taken from Godfrey and
Tagliamonte (1999:108) and are presented there as clear examples of NSR agreement.
(22) There’s a few jackdaws comes out the back. (1/362)
(23) Me legs aches a bit. I got it in me knee joints now. (7/303)
While the assertion that (22-23) display NSR agreement can be considered true in the
sense that the non-3sg NPs jackdaws and me legs occur with the 3sg –s ending, it is
difficult to state, categorically, that neither of these examples displays any vestige of
habitual semantics. It is plausible to argue that (22) could mean something similar to
there’s a few jackdaws that often come out the back and (23) my legs often ache a
bit. 15 Equally, (23) could be an instance of the present indicative meaning something
more similar to my legs ache at the moment (and so be a good case for NSR
agreement). It is quite possible that Godfrey and Tagliamonte have carefully
considered constructions that display habitual semantics in their analysis but no
indication is given on this either way. Presentation of a wider sentence context may
also go some way to resolving this issue. This same problem occurs with data from
Pietsch (2005) as shown below in (24-27). All of the following utterances are cited as
being good examples of the NSR.
(24) burglars steals ’em
(25) great snows comes
(26) sheep bleats
(27) some goes that way
15
Often is used here as an arbitrary illustration of an adverb phrase. Equally, any adverb phrase (e.g.
often, frequently, every day, on a Tuesday etc) could be intended by the speaker. This neatly
demonstrates the point – you simply cannot second-guess what any speaker may have meant.
The ambiguity as discussed earlier is problematic. If it is possible to argue that
all of these examples may express habitual semantics (or at least, may be judged to by
some speakers) then should they be considered as examples of the NSR proper? This
issue has not been adequately addressed in the literature. Shorrocks (1999:112) makes
a distinction between habitual constructions and the NSR suggesting that examples
from Bolton such as I often tells him may be due to habitual semantics and not the
NSR. However, no further elaboration on this point is made and no suggestion as to
how to deal with such variation within quantitative analyses is put forward.
Pietsch (2005:10) certainly suggests that the habitual construction and the NSR
are two distinct constructions, but also does not adequately deal with the implications
of their similarity. Pietsch indicates that they may possibly be causally/historically
related, with the –s that occurs with intervening adverb phrases between subject and
verb being re-analysed as a marker of habitual semantics and extended into pronoun-
adjacent contexts. This fits in with theories on constructional polysemy (e.g.
Goldberg, 1995) which suggest that over time, one construction can develop out of
another, where the same form is paired with different but related senses. No other
explanations for the origins of this habitual construction have been put forward, and
there are currently no other studies of –s as a marker of habitual constructions in
dialects of English.
The problem of habitual/NSR disambiguation may undermine the validity of
some previous NSR claims made by researchers, particularly if habitual constructions
have not been overtly addressed in their analysis (i.e. McCafferty, 2003; Hudson,
1999; Godfrey and Tagliamonte, 1999; Börjars and Chapman, 1998; Henry, 1995). It
may be the case that in regions where –s can indicate habitual aspect when found with
intervening adverbs, the 3sg marker –s may be re-analysed as a marker of habitual
semantics alone, and extended into pronoun-adjacent contexts without the need for
any adverb phrase. Therefore, it is not implausible that an utterance such as burglars
steals ‘em could mean something like burglars always steal ‘em in the minds of
certain speakers. This utterance would then therefore not be a good instance of the
NSR. If habitual constructions are frequent in Lancashire, then the frequency of the
NSR may be affected by crossover and interference with this pattern. This will be
examined with respect to the Lancashire data in §4.4.2.
When examining possible instances of the NSR, a wider context is often needed.
Most NSR studies give only sentence fragments, as in the examples from Pietsch
(2005) in (24-27), making it difficult to determine whether the
habitual or the historical present constructions have really been considered as a
possibility. Certainly, this is rarely addressed clearly in the discussion. The problems
associated with a narrow-scope approach are exemplified using an invented example
in (28).
(28) the children comes to the fair
Many would consider this to be a good example of NSR, with the 3pl NP occurring
with the nonstandard –s. However, the addition of hypothetical contextual information
shows without doubt how this could also easily be either habitual (29), or historical
present (30). Only examples that also contain an adjacent non-3sg personal pronoun with
standard agreement (-Ø) show clearly that the NSR constraint may be at work, as
in (31).
(29) Every Friday at 5.30 the children comes to the fair, after school. (Habitual)
(30) So, the children comes to the fair, and says “look at that!” So we went over to
the stall. (Historical present)
(31) The children comes to the fair and they enjoy the rides. (NSR)
However, a wider scope may not always make the speaker’s or writer’s intended
meaning clearer: it is impossible to know whether they are using the NSR
proper or are instead using –s in a more idiosyncratic way. I would suggest that
these similar constructions may influence, compete, overlap and mix with each other
in the minds of speakers in a way that is difficult to distinguish and test. This perhaps
points to my earlier hypothesis that NSR-type agreement may be more idiosyncratic
and region-specific.
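The surface cues discussed in this section (temporal adverb phrases for the habitual, past tense forms in the surrounding context for the historical present) can be flagged mechanically, although they cannot settle the speaker's intended meaning. The following Python sketch is an illustration under stated assumptions: it assumes text in word_TAG form with CLAWS7 tags, the cue list is a small invented sample, and the function name is hypothetical. It reports cues only and makes no classification.

```python
# Illustrative cue lists: neither is exhaustive, and both are assumptions.
HABITUAL_CUES = ("always", "often", "most days", "every ")
PAST_TAGS = {"VVD", "VBDZ", "VBDR"}  # CLAWS7: past tense lexical verb, was, were


def flag_cues(tagged_context: str) -> dict:
    """Report surface cues in a word_TAG string; this does not classify the token."""
    pairs = [t.rsplit("_", 1) for t in tagged_context.split() if "_" in t]
    words = " ".join(word.lower() for word, _ in pairs)
    tags = {tag for _, tag in pairs}
    return {
        "s_form_present": "VVZ" in tags,
        "habitual_cue": any(cue in words for cue in HABITUAL_CUES),
        "past_tense_in_context": bool(tags & PAST_TAGS),
    }


# Fragments of (21) and (19), with tags assigned by hand for illustration.
print(flag_cues("Every_AT1 Wednesday_NPD1 Ah_PPIS1 comes_VVZ"))
print(flag_cues("there_EX was_VBDZ a_AT1 crowd_NN1 and_CC I_PPIS1 goes_VVZ"))
```

A token with neither cue is not thereby shown to be an instance of the NSR; as argued above, habitual semantics do not require an overt adverb phrase, so such tokens still need to be read in context.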
4.2.3 Salience
Sociolinguistic salience may also have a bearing on the occurrences of the NSR in the
corpora. Kerswill & Williams (2002) suggest that salient constructions are those
which are overt in the speaker’s mind (see also markers, stereotypes and indicators,
Labov 2001; markedness Greenberg 1966). While previous accounts of the NSR have
not dealt with salience explicitly, the social implication of the usage of nonstandard
forms by the speaker must be considered as a variable.
As Litcorp is not a record or transcription of Lancashire dialect speakers, but
instead a record of the writers’ perception and representation of these speakers, it may
be considered as a corpus of the most salient or important dialectal features as judged
by these writers. This approach is expanded upon and discussed more explicitly in
Chapter 5. This use of a dialect literature corpus in order to quantify salience is
original – no other attempts to examine and compare data such as this can be found in
previous studies. Similarly, although the Sound Archive data is a transcription of real
speakers, it may be that a number of nonstandard forms are recognized by (or are
salient to) the speaker as being more dialectal. It is therefore plausible that the
informants in the Sound Archive may actively down-play or emphasize particular
constructions, depending on both their own knowledge of their local dialect and the
way in which they wish themselves to be portrayed (i.e. as more or less dialectal, see
e.g. Hollmann and Siewierska, 2006). It could also be suggested that certain
constructions may be more salient than others and that this in turn would have a
bearing on their frequency in the corpus. For example, it may be the case that
nonstandard verbal agreement forms such as (32) stand out more (or are more
salient) when compared to other nonstandard features, such as definite article
reduction/deletion (33), certainly when spoken.
(32) Ah always does what Ah con for her, an Ah will say this, she’s allus thankful
for a bit o’ help. (Litcorp)
(33) And she says “I’ll get you some butties for t’train”. (Sound Archive)
This may mean that the (arguably) more salient nonstandard verbal agreement,
including that of the NSR, may be found less frequently in speakers who wish to
adhere to overt prestige forms, and possibly is more frequent in speakers wishing to
adhere to potential covert prestige forms, i.e. dialect forms. By examining the
frequency of NSR agreement in the Litcorp compared to the Sound Archive, not only
will potential language change be investigated, but the status of NSR agreement as a
possible salient feature of the Lancashire dialect may be uncovered.16 Specifically, if
NSR agreement is not a salient feature of the dialect, it would be less likely to appear
in the Litcorp data as the writers would not necessarily perceive it as an obvious
feature of dialect speakers from this region. This ‘perception frequency’ can then be
compared to the actual ‘production frequency’ of NSR agreement by speakers in the
Sound Archive, giving an interesting contrast. These issues will be tested in this study,
and are discussed in more detail in § 4.4.1 and also in Chapter 5.
Alongside these corpus methods, acceptability questionnaires are employed in
order to test constraints such as subject type and position. These questionnaires are
used not only to target those informants who identify themselves as dialect and non-
dialect speakers, but also cast the net more widely and attempt to see if any
differences exist between Lancashire and other regions of the UK. Further details on
this method are outlined in §4.3.4.
4.2.4 Frequency of usage
Both salience and constructional competition, and indeed, the development of the
NSR construction itself, are underpinned by the role of frequency. Building on the
ideas of Bybee (1985), approaches in Cognitive Linguistics (e.g. Croft & Cruse, 2004;
Croft, 2001; Langacker, 2000; Goldberg, 2006) suggest that the relationship between
grammatical knowledge and language use is sensitive to frequency of usage. It is
suggested that language use influences the structure of representation in the mind, and
that grammatical structures that are used more often (and therefore have a high token
frequency) become more reinforced (or entrenched). This may suggest that verbs
which have a higher frequency, as compared to other verbs, may be more entrenched
and therefore more resistant to language change, thus preserving older agreement
patterns. This will be investigated with respect to the corpus data in §4.4.1 where the
frequency of agreement patterns in the corpus data is explored.
16
Conclusions such as these are tentative; for a full discussion of the merits and problems associated
with comparisons of this nature, see Chapter 5.
4.2.5 Summary and research questions
Studies into the NSR have produced diverse and often conflicting results. As Clarke
(1997:3) points out, ‘differing methods of analysis, number of tokens used, lack of
comparable corpora, and the range of linguistic data examined all play a part in the
lack of consensus over the development and function of verbal -s.’ This chapter takes
a quantitative approach, and by examining a large amount of corpus and questionnaire
data, provides a robust description of the NSR in Lancashire. More specifically, the
following questions are addressed:
(a) To what extent is the NSR a feature of the Lancashire dialect data examined
in this study?
(b) What, if any, factors may motivate an informant’s decision to use NSR
features, such as the constraints detailed in §4.2, and concepts such as
salience and frequency?
(c) Is there any evidence that was/were variation is influenced by the NSR in
Lancashire?
(d) What effect, if any, does the frequency of the superficially similar
constructions (the historical present and the habitual) have on the distribution
of the NSR in Lancashire?
(e) What changes in the frequency of the NSR have occurred over time?
Exploring suitable methodologies for dialect grammar research is a parallel focus of
this study and indeed of this thesis. The Sound Archive data consists of oral history
interviews, and therefore contains relatively few examples of constructions in the
present tense (except for, of course, those in the historical present which, as outlined
earlier, are not considered as part of the NSR). This makes investigations into the NSR
a good opportunity to combine results from additional sources in order to compensate
for biases such as these. By using both corpora and sociolinguistic questionnaires, the
usage of and possible limitations to the NSR in Lancashire may be uncovered.
4.3 Methodology
As with other studies in this project, spoken transcribed data from the Sound Archive
corpus is analysed along with data from Litcorp. For further information on the
speakers, locations and overall sampling of the corpora please see Chapter 1. The
analyses of Sound Archive and Litcorp data are then compared to a questionnaire
exploring 3sg agreement (and in particular, the NSR) targeted at Lancashire dialect
speakers and also speakers from other regions (see § 3.4.3). A full copy of this can be
found in Appendix D.
4.3.1 Rationale for methodology
Many previous analyses of 3sg agreement variation do not base their claims on a
suitable amount of data. Both Börjars and Chapman (1998) and Hudson (1999)
conduct no empirical tests, but instead base their arguments on intuition alone. Börjars
and Chapman suggest that nonstandard 3sg agreement, specifically with lexical verbs,
is triggered by inverted pronouns. While this may be the case, this proposition remains
an untested hypothesis only; it is impossible to prove without examining any data.
Henry’s (1995) study of Belfast English gathers data from elicited grammaticality
judgements in order to suggest quite the opposite, that the application of verbal –s is
prohibited under inversion in this variety of English. However, no information is
included on the number, age or sex of the informants, or about the nature or structure
of the grammaticality experiment. In fact, Henry provides no methodological details
whatsoever. Due to this omission, it is impossible to determine whether or not the
reported differences between agreement in Belfast English and the NSR are true
differences, or whether they differ due to incomparable methodological approaches.
Godfrey and Tagliamonte (1999) gather their data using sociolinguistic
interview techniques (see e.g. Labov, 1972 for further information on this) from eight
elderly rural speakers in Devon. No information is given about the topic or length of
these interviews, but they report 628 instances of verbal –s used in a nonstandard way,
and attribute a number of these to the NSR. The use of interview data is a good
approach (and indeed, is one of the methods used in this study), although a larger
number of informants might have allowed Godfrey and Tagliamonte to make stronger
claims about 3sg variation in this region.
Pietsch (2005) provides some of the most robust results for the NSR by taking
a more quantitative approach. His study examines data from the Northern Ireland
Transcribed Corpus of Speech, the Survey of English Dialects, the Tape-Recorded
Survey of Hiberno-English Speech, and the Freiburg Corpus of English Dialects. This
provides a good account of the distribution of the NSR in the British Isles. While these
corpora are a good resource and studies such as this can make strong claims on the
distribution of this variable at the time that the corpora were collected, a focus on
more modern data would provide interesting information on the possible development
of verbal -s. While a number of studies into NSR development have included written
historical documents (e.g. Wright 2002; Montgomery et al, 1993; Bailey et al., 1989),
aside from the ‘one excerpt of short prose’ examined by McCafferty (2003:5) this
inclusion of dialect literature in particular, alongside spoken corpus data is novel. A
combination of the spoken, written and questionnaire data should allow a
comprehensive picture of nonstandard 3sg agreement in Lancashire to be outlined.
4.3.2 Corpora
The corpus methodology falls into two main strands – retrieval and analysis of
nonstandard 3sg agreement in lexical verbs, and retrieval and analysis of nonstandard
agreement with auxiliary verbs (in particular, BE but also HAVE and DO) in both the
Sound Archive and Litcorp data.
As both corpora are part-of-speech tagged using the CLAWS-7 tagset, 17 all
lexical verbs that may exhibit NSR agreement (i.e. 3sg forms) can be retrieved by
searching for the tag _VVZ, which retrieves all -s forms of lexical verbs (e.g. gives,
works etc). BE, HAVE and DO (both lexical and auxiliary) are retrieved by searching for
their individual forms, e.g. am, is, have, and all possible contractions, e.g. ’m,’s,’ve
etc. Subsequent searches were also carried out in order to find adjacent personal
pronouns and verbs that displayed nonstandard agreement patterns. Along with
identifying possible NSR examples, these corpus searches also uncover historical
present constructions (34); habitual constructions 18 (35); other agreement patterns (36)
and of course constructions which show standard agreement (37).
(34) Anyway I said to her one day, I says “what's the matter?” I said “why won't
you mix with the other girls?” (Sound Archive)
(35) Ah always coughs before ah wakes. (Litcorp)
(36) And I said, “well I don't know, I'll have to go and ask her” so he give me
money for t'bus and I went up and asked me Mother if, me Grandad lived with
17
See http://ucrel.lancs.ac.uk/claws7tags.html for more details on the tagset.
18
Habitual constructions are here defined as those occurring with relevant adverb phrases; habitual
constructions without adverb phrases will be captured by the search for _VVZ. See §4.4.2 for a further
discussion of this.
(37) So er I thought well when Alex goes I'll bow out and that’s it. (Sound
Archive)
While examples such as (35-37) are excluded from any possible NSR results,
frequency data for these constructions are presented in Table 7. The relevance
and implications of the historical present and habitual constructions are
discussed further in §4.5. Any results that do not clearly fall into these four categories will be
excluded from all tables but discussed in §4.4.2.
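The retrieval step described in this section, finding -s forms of lexical verbs and then checking for an adjacent personal pronoun, can be sketched as follows. This is a minimal illustration and not the search procedure used in this study: it assumes the corpus text is available in word_TAG form, the pronoun list is an illustrative subset of spellings, and the function name is hypothetical.

```python
# Non-3sg personal pronouns; the dialect spellings are an illustrative subset.
NON_3SG_PRONOUNS = {"i", "ah", "aw", "you", "yo", "thou", "theau", "tha", "we", "they"}


def pronoun_vvz_pairs(tagged_text: str) -> list:
    """Return (pronoun, verb) pairs where a non-3sg pronoun directly precedes a VVZ token."""
    pairs = [t.rsplit("_", 1) for t in tagged_text.split() if "_" in t]
    hits = []
    # Walk over neighbouring tokens so that only strict adjacency counts.
    for (prev_word, _), (word, tag) in zip(pairs, pairs[1:]):
        if tag == "VVZ" and prev_word.lower() in NON_3SG_PRONOUNS:
            hits.append((prev_word, word))
    return hits


# Example (35), with tags assigned by hand for illustration.
sample = "Ah_PPIS1 always_RR coughs_VVZ before_CS ah_PPIS1 wakes_VVZ ._."
print(pronoun_vvz_pairs(sample))
```

For example (35) only ah wakes is returned, because always intervenes between Ah and coughs. Hits of this kind still require manual inspection, since the search cannot tell NSR, habitual and historical present uses apart.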
4.3.3 Standard and nonstandard verb forms
Contractions are retrieved from the corpus data by searching for the individual
contracted form (e.g. ’s or ’ve) as all contractions are split and stored as separate
tokens in both corpora. All results are analysed for any possible ambiguity, e.g.
examples of ’s that may be genitives (38); the contracted form of BE (39); or the
contracted form of has (40). Any genitive results are excluded.
(38) I went into Mr Jackson's shop which was on Victoria Street (Sound Archive)
(39) Anyway, when I come back and he’s waiting at Wyredock Station for me
(Sound Archive)
(40) Yo’ seen, he’s known yo’ so long, an’ he’s warked wi’ yo for mony a
yer.” (Litcorp)
Alongside these standard uses of contractions, contractions used in a nonstandard way
are also found in both corpora, e.g. were (41-42); and has (43). These results are
discussed in §4.4.2.
(41) “You don’t suppose I’d sell it without the shell”, he said; an he looked as if he
thowt aw’re havin him on. (Litcorp)
(42) So we went into Fat Jack’s i’ th’ corner; an’ he co’ed for two twopenno’ths
wi’ as mich swagger as if he’re gooin’ to get change for a suvverin. (Litcorp)
(43) He said, “my tea doesn't taste right unless I’s had it in me black and when
I’ve had me tea I has a wash.” (Sound Archive)
The data searches also retrieved any nonstandard forms of the verbs BE (44), HAVE
(45) and DO (46) that are present in both corpora.
(44) “Here theau art”, hoo says, an pretended t’ offer me th’ paper (Litcorp)
(45) What hast Ø getten i’ thi basket, Bill? (Litcorp)
(46) Neaw, Jamie, what dost Ø think abeaut that? (Litcorp)
One problem often associated with these nonstandard verb forms is the
omission of the subject, as shown in the above examples (45-46). In most occurrences
of this, as with two of the examples above, the subject can be resolved from the
context by looking in the corpus – in (45) the subject you refers to Bill, and in (46) the
subject you refers to Jamie. Examples where the resolution of the subject from the
context is not possible are excluded from the final results. Similar to these
nonstandard verb forms, archaic 2sg pronouns tha, thee (47), and thou (48), are also
present in both corpora to varying degrees.
(47) “The smoke, tha'll have to give it up cop, it gives thee cancer, aye " (Sound
Archive)
(48) Where hast thou been? Thou art all in a sweat (Litcorp)
Alongside tha, thee and thou, the archaic pronoun hoo is present in the Litcorp data.
Wales (1996:19) states that hoo is the 3sg feminine subject pronoun from the Old
English heo and occurred mainly in the North West Midlands, while Beal (2004:119)
suggests that this form is also commonly found in Lancashire. Hoo occurs frequently
in the dialect literature as a feminine 3sg pronoun, e.g. (49-50).
(49) He thowt, for a bit, hoo were playin’ a trick on him, but th’ choilt did it quite
innercent. (Litcorp)
(50) He geet so bad in a bit, an’ were vomitin’ so much, that Margit were freetend,
so hoo rushed off to Bill Olegg’s for summat to stop th’ gripin’ pain an’
ickness. (Litcorp)
Both verbs and personal pronouns (and of course other sentence elements)
present in Litcorp display variant spellings representing the writers’ desire to convey
the phonology of their accent through the written word e.g. (51-52).
(51) Yo’re as welcome here as yo are a-whoam. (Litcorp)
(52) Neaw theau couldno’ tell ‘em fro’ ladies, unless it wur by ther tongues.”
(Litcorp)
Both the nonstandard contractions and the variant spellings of pronouns and also verbs
were found by searching for their specific search term (i.e. results for you include yo
and y’). The variant forms were originally uncovered by close examination of a
sample of the dialect literature.
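The search procedure described above can be sketched as follows. This is a minimal illustration only, not the tool used in the study: the variant table, its layout and the function name `find_forms` are assumptions, and only the spellings yo, y' and theau are taken from the text.

```python
import re

# Illustrative variant table. "yo" and "y'" (for you) and "theau" (for thou)
# are forms mentioned in the text; the table layout and names are assumptions.
VARIANTS = {
    "you": ["you", "yo", "y'"],
    "thou": ["thou", "theau"],
}

def find_forms(text, term):
    """Return every token in text that matches a listed spelling of term."""
    alternatives = []
    # Longer spellings first, so that "you" is tried before "yo".
    for form in sorted(VARIANTS[term], key=len, reverse=True):
        # Forms ending in a letter or digit must not run on into a longer word.
        tail = r"(?!\w)" if form[-1].isalnum() else ""
        alternatives.append(re.escape(form) + tail)
    # The lookbehind stops a match from starting inside another word.
    pattern = r"(?<![\w'])(?:" + "|".join(alternatives) + ")"
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]

hits = find_forms("Yo're as welcome here as yo are a-whoam.", "you")
```

Keeping one table per search term means that a newly discovered spelling only has to be added in one place before the corpus is searched again.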
4.3.4 Questionnaires
Questionnaires are used in this study in order to both include the perceptions of more
modern speakers and to allow a (tentative) further time depth comparison with the
Sound Archive and Litcorp data. The questionnaire that I have devised aims to test the
possible morphosyntactic limitations (or constraints) to the NSR that are outlined in
§4.2.1, in order to uncover whether or not syntactic position and subject type exert an
effect on the application of the NSR for current Lancashire speakers. The
questionnaire also explores present tense constructions further, in order to compensate
for the dominance of the past tense constructions in the corpora.
The questionnaire has been designed with a view to quantitative analysis;
participants were asked to judge sentences on a five point scale, with 1 being the least
acceptable to them and 5 being the most acceptable e.g. (53). Descriptors were not
assigned to the intervening values (i.e. 2, 3 and 4) so that interval variable status (as
opposed to ordinal variable status) can be approximated (see e.g. Cowart, 1997:71 for
further details).
(53) ‘They have a shop of their own and is very well off.’
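Treating the five-point scale as an approximately interval variable means that judgements can be averaged. The sketch below shows this under stated assumptions: the item labels and every rating are invented for illustration and are not results from the questionnaire.

```python
from statistics import mean

# Hypothetical ratings (1 = least acceptable, 5 = most acceptable) for two
# invented item labels; none of these numbers come from the study.
ratings = {
    "adjacent_pronoun": [1, 2, 1, 3, 2],
    "non_adjacent_pronoun": [3, 4, 2, 5, 3],
}

def mean_acceptability(scores):
    """Average the 1-5 judgements, treating the scale as interval data."""
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("ratings must lie on the 1-5 scale")
    return mean(scores)

summary = {item: mean_acceptability(scores) for item, scores in ratings.items()}
```

If the scale were treated as strictly ordinal instead, the median would be the safer summary.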
The sentences chosen for the questionnaire relate to the constraints as detailed
in §4.1.2, and test nonstandard agreement with variables such as adjacency, type of
subject, and position of subject.
4.3.5 Classification and division of respondents
Questionnaire respondents who identified themselves as being speakers of Lancashire
dialect in response to the question ‘do you have a particular dialect? If yes, how
would you describe it?’ are classified as ‘Lancashire, dialect speakers’ in the results in
§4.4.5. In addition to this, there were a number of speakers who identified themselves as
living in a Lancashire town or village (or having lived there for the majority of their
life), but did not claim to have a Lancashire dialect. These speakers are classified
as ‘Lancashire, non-dialect speakers’ in the results in order to explore whether or not
there is a tangible difference between these two groups.
Along with this distinction, the questionnaire respondents were also split into
north/south groups, to explore whether there are any differences between them. Again,
the speakers were categorized according to how they identified themselves in the
questionnaire. As is well known, the north/south divide is a contentious issue, being
only partly dependent upon the geographical origin of the speaker and partly on other
cultural factors (see e.g. Wales 2006: 9-24 for a discussion of this). Here I follow
Trudgill (1999) in defining the north/south divide as being delineated by the so-called
Wash-Severn line.
The questionnaire data will be compared to the results from both corpora.
Since all three data sources were gathered by different means and cover different time
periods, a sensitive combination of these results should give a good picture of how the
NSR functions in Lancashire.
4.4 Results and analysis
4.4.1 NSR results
Many discussions of the NSR (outlined initially in §4.1) demonstrate the reflexes of
this rule by including examples that show the effect of adjacent vs. non-adjacent
personal pronouns within the same sentence or utterance e.g. (54) and (55).
(54) We peel ‘em and boils ‘em. (Ihalainen 1994:221)
(55) They sing and dances. (Henry 1995)
No examples such as these are found in any of the Lancashire corpora. This is not due
to an absence of sentences of this type within these data; nineteen examples such as
those shown in (56) are found in the texts.
(56) Because when you’re passing in a car, on the bus, you just see a church, you
don’t know whether it’s in good condition or bad condition until you come
and say “how long since this was done?” (Sound Archive)
Sentences such as (56) are not frequent enough to support strong claims
about the impossibility of NSR agreement in these contexts in Lancashire. As with
other infrequent results, the acceptability of this construction is tested by means of the
questionnaire (see §4.4.6).
The strong definition of the NSR suggests that every present indicative verb
takes the 3sg form, except when it is directly adjacent to a personal pronoun subject.
However, the reality is more complicated; some features of the NSR are also features
of Standard English (i.e. standard agreement with verb-adjacent pronouns) while
others (e.g. 3sg agreement with non-3sg subjects) are shared by other agreement
patterns known to exist. As outlined earlier, a broader version of the NSR offers more
scope for possible variability (as set out below) and is tested here with respect to the
Lancashire data:
a) 3sg subjects (and thou) always take –s (or related 3sg form)
b) Non-3sg subjects may have –s (or related 3sg form)
c) Non-adjacent subject and verb prefer –s (or related 3sg form)
It should be noted that not only results which may conform to the NSR are
relevant, but also the determination of the extent of the variability associated with this
rule (and with verbal agreement in Lancashire more generally).
Tables 4-9 deal with present tense variation only – was/were results (which are
suggested to be analogous to the NSR, see my earlier discussion in §4.1.3) are
analysed separately in §4.4.5. All results presented here are raw frequency results
only; although the corpora are not strictly comparable in terms of size, many values
are too small to normalise (e.g. to values per 100,000) and still achieve usable results.
As set out in the methodology, adverb phrases relating to time (such as sometimes,
never, every Wednesday) intervening between subject and verb (e.g. I always goes
there) are not included as examples of the NSR and are instead analysed in Table 7 as
habitual constructions. Previously I have outlined the possibility that all instances of
the present indicative may contain an element of habitual semantics, e.g. (57).
(57) He goes to the shops [every day/once a week/frequently/at 5 o’clock].
This is difficult to resolve in any satisfactory way and so, in line with previous studies,
I tentatively include instances such as ‘the men takes the pictures’ as good examples
of the NSR. Unlike other studies that display only part or single sentences, all
examples from the Lancashire corpora are detailed with a wider textual context for
clarity.
NSR RESULTS                                            LITCORP        SOUND ARCHIVE
Non-adjacent non-3sg pronominal subjects               9 (3.6%)       0 (0%)
  (e.g. I _ goes; you _ takes; they _ gives)
Non-adjacent non-3sg NP subjects                       8 (3.2%)       0 (0%)
  (e.g. the children _ goes; all the cats _ sleeps)
Adjacent non-3sg NP subjects                           31 (12.6%)     6 (85.7%)
  (e.g. the dogs eats; five of the men finds)
thou - adjacent                                        192 (77.7%)    1 (14.3%)
thou - non-adjacent                                    7 (2.8%)       0 (0%)
TOTAL                                                  247 (100%)     7 (100%)
TABLE 4. INSTANCES OF THE NSR IN THE LANCASHIRE CORPORA
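The percentages in Table 4 are each category's share of its corpus total. As a rough check, they can be recomputed from the raw counts as sketched below. The counts are those given in Table 4; the short category labels and the function name are shorthand introduced here for illustration.

```python
# Raw NSR counts from Table 4; the keys are shorthand labels introduced here.
table4 = {
    "Litcorp": {
        "nonadj_pronoun": 9, "nonadj_np": 8, "adj_np": 31,
        "thou_adj": 192, "thou_nonadj": 7,
    },
    "Sound Archive": {
        "nonadj_pronoun": 0, "nonadj_np": 0, "adj_np": 6,
        "thou_adj": 1, "thou_nonadj": 0,
    },
}

def column_shares(counts):
    """Express each category as a percentage of its corpus total."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

shares = {corpus: column_shares(counts) for corpus, counts in table4.items()}

# Share of all NSR instances (both corpora together) that come from Litcorp.
grand_total = sum(sum(counts.values()) for counts in table4.values())
litcorp_share = round(100 * sum(table4["Litcorp"].values()) / grand_total, 1)
```

Because the Sound Archive column rests on only seven tokens, a single reclassified example moves its percentages by more than fourteen points, which is one reason the raw counts are reported alongside them.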
NSR agreement is infrequent in the Lancashire corpora; only 254 instances are found
in this data with 97% of these being attributed to the Litcorp. In the Litcorp, NSR
agreement is found most frequently with thou and a majority of these instances
(92.0%) occur with the variant spelling theau as can be seen in (58-59).
(58) “That shows aw’m no’ used to buyin’ owt o’ th sooart.” “If theau wants a bit
o’ gradely stuff thee goo deawn to Muirhead’s i’ Victoria Street,” Siah said.
“Dunno thee buy common stuff!” (Litcorp)
(59) “Oh, aw’ll agree to that,” Jim said. “Then go to wark,” Juddie said, “an’ mind
heaw theau raises th’ tub. Theau’re shakin neaw as ill as if theau’re gooin’ t’
be hanged.” (Litcorp)
This use of the variant spelling may suggest that thou is considered by the writers of
Litcorp as being particularly dialectal (or salient) and so is represented in an
orthographically nonstandard way. It is possible that the frequency of the nonstandard
form theau may be closely linked to the similarly salient or dialectal choice of non-
standard 3sg agreement. The possibility that instances of thou with 3sg agreement
may be more frequent than any other agreement pattern with thou overall (and
therefore entrenched in the mind of the writers as outlined in §4.2.4) is tested in Table
5. (Adjacent and non-adjacent subjects are not distinguished in this instance).
USE OF THOU (AND VARIANT FORMS)                        LITCORP        SOUND ARCHIVE
3sg agreement (e.g. thou likes them)                   183 (95.3%)    1 (16.7%)
2sg agreement (e.g. thou like them)                    7 (3.6%)       4 (33.3%)
archaic 2sg agreement (e.g. thou art a good man)       2 (1.0%)       6 (50.0%)
TOTAL                                                  192 (100%)     12 (100%)
TABLE 5. AGREEMENT PATTERNS WITH THOU IN THE LANCASHIRE CORPORA
Results from Litcorp lend support to the assertion of Pietsch (2005:6) who suggests
that thou always occurs with NSR agreement. While in the Sound Archive thou shows
no real preference for 3sg agreement, the total number of instances of thou is very low
and no conclusions can be based on these data.
Aside from those occurrences with thou, the remaining NSR results in the
Litcorp are most frequently found with adjacent non-3sg NP subjects e.g. (60) and
(61).
(60) It’s bad news fur coffin makkers, an’ th’ timber trade generally! Neaw, when
those fashions changes, some trades are allbut owver. When women gan off
wearin’ crinolines, th’ wire trade went deawn, an so did th’ boot trade, an’ th’
stockin’ trade. (Litcorp)
(61) “Tha keeps suppin’ it,” said Jonty. “Abit,” said Jimmy. “Ah don’t like to hurt
its feelings. Not when it’s out on its feet.” “Ah’ll bet yo’re wives is glad to be
shut on yo,” said Jonty. There were that big a fog on when Ah left,” said
Tommy, “as Ah’ll bet hoo doesn’t know Ah’ve gone.”(Litcorp)
NSR agreement with non-adjacent subjects is rare in Lancashire, with only 17
instances found in the Litcorp data. This preference for NSR agreement with verb-
adjacent non-pronominal subjects is somewhat contrary to the NSR which suggests
that 3sg agreement is the ‘default’ agreement pattern, except for when standard
agreement is provoked by adjacent personal pronouns. If this is the case in Lancashire,
a higher number of non-adjacent subjects with 3sg agreement would be expected. A
closer look at the frequency of other similar agreement patterns in Table 8 explores
the possibility that constructional overlap or competition may affect the frequency of
the NSR in this region.
Very few examples of the NSR were found in the Sound Archive; two such
instances are shown in (62-63).
(62) Wherever she was going and she'd to stand all sorts of insults, “what are tha
doing down here, you don't belong down here, you get back back up yon
where tha belongs” and they used to pick sods up and throw sods at her.
(Sound Archive)
(63) Now, there was a deterrent there just by the name. Nowadays well , it just
seems anything goes. Now I know drugs has accelerated it because you
know, they want money, but I think that the punishments have gone down and
down and so like anything goes. (Sound Archive)
Sentence (62) is a good example of the NSR, again found with an archaic personal
pronoun form. Example (63) is more problematic; it is possible that drugs could be
considered by the speaker as either singular or plural (i.e. they has/it has accelerated
it), and there is no clear way to resolve this possible ambiguity. There are two further
examples such as this in the Sound Archive, (64) being one of them:
(64) No we used to gut them on the deck and then your deck used to be sectioned
up into certain, you know when you went to sea, once you start fishing you
have boards and then really your decks looks like a criss cross of different
pounds here there and everywhere (Sound Archive)
Again here it is not completely clear in (64) if the speaker is referring to your decks in
the singular or plural. Tentatively both examples are included in the totals in Table 4,
although this further weakens the already insubstantial evidence for the NSR in
present day Lancashire speakers. Further to this singular/plural ambiguity problem,
there are a number of other NSR ‘near misses’ in the Sound Archive; one such
example is shown in (65).
(65) The form was, the reason was, you may not know this, that if there wasn't at
that time a bishop of Blackburn in residence, there was an inter waiting for a
new bishop and after a certain time, if there isn't a bishop then either [his
assistant or one of the suffrages, in this case people of Burnley], acts on his
behalf and after a certain period of time the gift of that job lapses to either the
Archbishop of York who's the next one up from being a bishop or the crown
and it had lapsed to the crown. (Sound Archive)
This initially appears to be a good example of the NSR, with the subject the people of
Burnley (or they) taking the 3sg –s ending. However, on closer inspection it is clear
that the subject of this sentence is actually the 3sg his assistant or one of the suffrages
(the NP is shown here in square brackets.) In this example, agreement has been
maintained despite the distance between subject and verb; something that often does
not happen (see my earlier discussion on proximity agreement in §4.1.2).
The low frequency of NSR agreement in both the Sound
Archive and Litcorp data may be due to language change, with present day speakers
moving away from older agreement patterns that are perceived as nonstandard, such
as the NSR. Certainly, the presence of the NSR in the Litcorp and its near absence in the
Sound Archive lend weight to this argument for diachronic change, although the low
frequency of the NSR in the Litcorp overall (and of course other differences between
the corpora) makes this difficult to determine conclusively.
Tables 4 and 5 have shown that NSR agreement in Lancashire appears to be
most closely linked to thou. It is therefore possible that a decrease in this personal
pronoun form may have resulted in a decrease in NSR agreement if speakers and/or
writers perceive thou + verb –s as a semi-independent construction rather than as part
of an overall schema of verbal agreement.
Differences in frequency may also be due to differences in the purpose of the
texts in the two corpora - speakers in the Sound Archive may have actively avoided
using NSR agreement (as opposed to writers in the Litcorp who were aiming to
represent the dialect) due to its sociolinguistic salience. As suggested previously,
salient constructions are those which are overt in the speakers’ mind (Kerswill &
Williams 2002). It is not unreasonable to suggest that salience is gradient (as
discussed further in Chapter 5) and it could be argued that, for example, nonstandard
agreement in lexical verbs is perhaps more obvious or noticeable to speakers (i.e.
more salient and therefore possibly more actively avoided) when compared to other
nonstandard features found in the corpora (for example, definite article
reduction/deletion). This argument implies that speakers wishing to accommodate
towards a more standard variety would be more likely to avoid the possibly more
stereotyped NSR verbal agreement form. However, without further data it is not
possible to know whether Lancashire (or indeed any other) speakers are
accommodating towards or away from what they perceive as a more standard variety.
Smith et al. (2007) found verbal –s to be used frequently in their study of children and
caregivers in Scotland, despite so-called ‘social-constraints’ associated with the
construction. It may be that the variation found in Lancashire is perhaps a result of the
purpose and aims of the data analysed here, rather than as a result of its conscious
avoidance by Lancashire speakers (although I would suggest that this conclusion is
perhaps less likely due to very low frequencies of the NSR found in the relatively
large corpora examined here).
It may also be the case that the NSR is not considered as a particularly
‘Lancashire’ feature for the Litcorp writers (other than, arguably, with thou). While,
for example, lexical choice may be perceived as being more obviously regional,
certain aspects of grammar may not be. Moreover, as the NSR has already been
reported as being present in a relatively wide geographical area (see e.g. Pietsch
2005), Litcorp writers may not have included this feature in their writing as it was not
specific to the Lancashire region that they wished to represent.
4.4.2 Other constructions with 3sg agreement
As outlined in §4.2.2, it is also possible that constructional competition or overlap
plays a role in the distribution of the NSR in Lancashire. It is impossible to provide firm
evidence for this, or for any other hypothesis put forward here, by looking only at the data
presented so far. With this in mind, this analysis now turns to the comparative
frequency results in order to outline the role of other agreement patterns in Lancashire.
Table 6 shows all agreement patterns with adjacent and non-adjacent pronominal
subjects in both corpora. Non-pronominal subjects are shown in Table 7. The
comparative frequency distribution between 3sg agreement forms (e.g. –s) and
non-3sg agreement forms (e.g. –Ø) is given for each variable tested.
                                       LEXICAL VERBS                  AUX VERBS
SUBJECT POSITION   SUBJECT TYPE        -s             -Ø              3sg ag          Non 3sg ag
LITCORP
Adjacent           non-3sg pronoun     203 (33.8%)    397 (66.2%)     61 (12.8%)      414 (87.2%)
Adjacent           3sg pronoun         330 (96.5%)    12 (3.5%)       339 (99.4%)     2 (0.6%)
Non-adjacent       non-3sg pronoun     549 (59.0%)    381 (41.0%)     43 (12.9%)      291 (87.1%)
Non-adjacent       3sg pronoun         382 (93.9%)    25 (6.1%)       256 (87.7%)     36 (12.3%)
SOUND ARCHIVE
Adjacent           non-3sg pronoun     154 (3.7%)     4035 (96.3%)    4 (0.5%)        864 (99.5%)
Adjacent           3sg pronoun         245 (84.2%)    46 (15.8%)      808 (100%)      0 (0%)
Non-adjacent       non-3sg pronoun     197 (8.8%)     2036 (91.2%)    16 (2.3%)       681 (97.7%)
Non-adjacent       3sg pronoun         263 (89.5%)    31 (10.5%)      389 (98.7%)     5 (1.3%)
TABLE 6. TESTING PRONOUN ADJACENCY
In Table 6, any possible NSR agreement with pronouns falls into the category of non-
adjacent non-3sg pronoun with 3sg agreement (shown in boldface.) While 592
examples of this agreement pattern were found in Litcorp and 213 in the Sound
Archive, as we know from Table 4 (which shows NSR results only), a majority of
these examples are not instances of the NSR. Instead, many of these results are
habitual or historical present constructions. Aside from those examples with archaic
personal pronouns, no examples of the NSR with pronominal subjects are found in the
Sound Archive and only 8 examples are present in Litcorp, e.g. (66)
(66) Jolly good feed this, guv’nor. Sweep like a machine if the sweeper kims
round. Hope it’ll kim before the ladies turns out; they sweeps it all up with
their togs they does. Hullo! there he goes! (Litcorp)
As was the case with the previous data, the sparseness of NSR results in this
data makes it impossible to make strong claims about the possible effect of pronoun
adjacency on the NSR in Lancashire, but relatively easy to suggest that this pattern is
not frequent. The distribution of non-pronominal subjects is shown in Table 7.
                                       LEXICAL VERBS                  AUX VERBS
SUBJECT POSITION   SUBJECT TYPE        -s             -Ø              3sg ag          Non 3sg ag
LITCORP
Adjacent           non-3sg             201 (31.6%)    436 (68.4%)     65 (13.0%)      436 (87.0%)
Adjacent           3sg                 134 (82.2%)    29 (17.8%)      360 (97.3%)     10 (2.7%)
Non-adjacent       non-3sg             0 (0.0%)       121 (100%)      1 (0.5%)        210 (99.5%)
Non-adjacent       3sg                 102 (98.1%)    2 (1.9%)        68 (93.2%)      5 (6.8%)
SOUND ARCHIVE
Adjacent           non-3sg             164 (3.6%)     4082 (96.4%)    9 (0.5%)        1745 (99.5%)
Adjacent           3sg                 429 (78.4%)    118 (21.6%)     1212 (99.8%)    3 (0.2%)
Non-adjacent       non-3sg             4 (2.0%)       197 (98.0%)     2 (0.4%)        460 (99.6%)
Non-adjacent       3sg                 429 (78.4%)    118 (21.6%)     1212 (99.8%)    3 (0.2%)
TABLE 7. TESTING NON-PRONOMINAL SUBJECT ADJACENCY
Results with all subject types from both corpora show that standard agreement
with both lexical and auxiliary verbs (i.e. 3sg pronouns with 3sg agreement, non-3sg
pronouns with non-3sg agreement) is more frequent than other agreement patterns.
While this finding is somewhat unsurprising, there is nonetheless significant variation
from Standard English in the data that is worth mentioning.
The data in Table 7 suggests that nonstandard 3sg agreement is more frequent
in the Litcorp than in the Sound Archive data. This is particularly noticeable with
lexical verbs, as 31.6% of all lexical verbs in this corpus occur with this agreement
pattern. However, as we know from Table 4, not all of these instances of non-3sg
subjects with 3sg agreement are examples of the NSR; in fact most were instead categorised as
habitual or historical present constructions. Historical present and habitual
constructions show a superficial similarity to possible NSR constructions; they are
able to occur with 3sg agreement in non-3sg contexts yet are unaffected by the subject
type and subject position restrictions. In this study, nonstandard 3sg examples are
classified as being either habitual or historical present constructions by looking at the
wider context that they occur in. This methodology differentiates the current study
from other studies; a narrow scope to any investigation may lead to an inaccurate
analysis. More concretely, many of the results which at first appeared to display the
NSR were upon further analysis found not to, e.g. (67).
(67) We all stands outside the pub. (Sound Archive)
This example seems to be a good NSR example with 3sg stands occurring with the 1pl
we. However, a look at the wider context of this utterance reveals that this is in fact a
historical present construction, as shown in (68).
(68) We all stands outside the pub. Suddenly he comes running out shouting that
we’re late and we’ve missed the train. (Sound Archive)
Instances such as (67-68) clearly exemplify the need to look at the wider context and
again raise concerns with respect to the accuracy of previous claims made as to the
frequency and distribution of the NSR.
Constructional variation with 3sg agreement is explored further in Table 8.
The NSR results presented in Table 4 are included again here for comparison.
CORPUS           CONSTRUCTION                  RAW FREQ
Litcorp          historical present            397 (58.5%)
                 habitual (with AdvP)          26 (3.8%)
                 other                         9 (1.3%)
                 NSR (with thou)               199 (29.3%)
                 NSR (with other subject)      48 (7.1%)
                 Total NS 3sg agreement        679 (100%)
Sound Archive    historical present            164 (75.9%)
                 habitual (with AdvP)          33 (15.3%)
                 other                         12 (5.6%)
                 NSR                           7 (3.2%)
                 Total NS 3sg agreement        216 (100%)
TABLE 8. FREQUENCY OF NONSTANDARD 3SG CONSTRUCTIONS ANALYSED AS EITHER
HABITUAL OR HISTORICAL PRESENT
The historical present is the most frequent of all of the nonstandard 3sg agreement
patterns; a total of 561 instances of historical present constructions are found in the
corpora e.g. (69).
(69) So I gets on the train and he says “you look tired” I says “aye I am” he says
“well you get your head down. So where do you want to get off?” I said
“Preston” He said “oh you get your head down and we’ll give you a shake
when we get to Preston. So I goes into a deep sleep and the next thing I felt
the train jerking, looked through t’window, Crewe! (Sound Archive)
As with example (69) above, many instances show tense variation with the use of the
historical present, often using past tense forms alongside present tense 3sg forms.
The frequency of the historical present construction may be explained (in part)
by bias within the corpus; the oral history interviews in the Sound Archive feature
dialogues that concentrate very heavily on narrating past events using the historical
present as a stylistic feature, e.g. (70).
(70) Anyway, one day I said to her one day I says “what’s the matter?” I says
“Why won’t you mix with the other girls?” I said “they want to be friendly,
but” I says “you just won’t co-operate with them at all.” (Sound Archive)
This aside, this frequency remains significant. It may be the case that the
prevalence of this 3sg form used without subject type and subject position restrictions
has affected the frequency of the NSR. The high frequency of the historical present
may mean that Lancashire speakers/writers associate the 3sg –s (and related forms of
irregular verbs) more frequently with the historical present rather than as a marker of
agreement. This issue is complicated further by the presence of the structurally similar
habitual construction in the corpus data, e.g. (71).
(71) It’s a mystery of nature, said Young Winterburn. Like us bein’ here at
o’. Ah sometimes wonders why we are here. (Litcorp)
As shown above, only habitual examples that have a relevant adverb phrase are
included in the tabulated results. However, habitual semantics can be expressed
without an adverb phrase and this is problematic when identifying instances of the
NSR. While the distinction between the historical present and the NSR is more
obviously based on the tense as given in the context of the sentence, habituality is
more difficult to discern. This is exemplified in Table 9.
                     Habitual aspect                              Non-habitual aspect
Past reference       The men always sings loudly, (it went on     The men sings loudly (it went on
                     all night.)                                  all night).
                     [Habitual? NSR? Historical present?]         [Historical present? NSR?]

                     The men ø sings loudly, (it went on all
                     night).
                     [NSR? Habitual? Historical present?]

Present reference    The men always sings loudly.                 The men sings loudly.
                     [NSR? Habitual?]                             [NSR]

                     The men ø sings loudly.
                     [NSR? Habitual?]
TABLE 9. OVERLAP BETWEEN THE NSR, HABITUAL AND THE HISTORICAL PRESENT
CONSTRUCTIONS
Here it is clear that a sentence may only be described as a good example of the NSR
with any certainty if that sentence does not convey habitual aspect and refers only to
the present time. Interpreting aspect from corpus results can be difficult, as shown in
example (72) (previously included as example (66)).
(72) Jolly good feed this, guv’nor. Sweep like a machine if the sweeper kims
round. Hope it’ll kim before the ladies turns out; they sweeps it all up with
their togs they does. Hullo! there he goes! (Litcorp)
Here it is possible that the speaker intends something like the ladies always sweep it
all up but this is partly speculative. It is probable that this constructional competition
and overlap between constructions (shown in Table 9) combined with the high
frequency of the historical present and habitual constructions (which of course have
no subject type or subject position restriction) has resulted in such a low number of
instances of the NSR. Lancashire dialect speakers may not associate 3sg agreement
forms with the present indicative only, but instead use this construction to indicate
habituality or present tense in past tense contexts, thus giving the distribution found in
Table 8. However, this assertion is difficult to prove, and as mentioned previously, it
may also be the case that the frequent occurrence of the historical present construction
found in the Lancashire data does not affect instances of the NSR, and instead reflects
only the nature of the oral history discourse.
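The overlap set out in Table 9 can be restated as a small decision sketch. It assumes that time reference and habitual aspect have already been judged by hand from the wider context, as in this study; the function names are hypothetical and the code does not attempt to detect either property automatically.

```python
def candidate_readings(past_reference, habitual):
    """Readings that Table 9 leaves open for a non-3sg subject with 3sg -s."""
    readings = {"NSR"}  # the NSR reading can never be excluded outright
    if habitual:
        readings.add("habitual")
    if past_reference:
        readings.add("historical present")
    return readings

def is_clear_nsr(past_reference, habitual):
    """True only when no competing construction remains possible."""
    return candidate_readings(past_reference, habitual) == {"NSR"}

# Present reference, non-habitual: the only cell of Table 9 labelled [NSR] alone.
clear_case = is_clear_nsr(past_reference=False, habitual=False)
```

The sketch makes the asymmetry explicit: habituality and past reference each add a competing reading, so only the present, non-habitual cell yields an unambiguous NSR token.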
4.4.3 Other nonstandard agreement patterns
Aside from the constructions outlined in §4.4.1 and §4.4.2, other nonstandard
agreement patterns are found in the corpora. There are 102 examples of adjacent 3sg
pronouns with non-3sg agreement, e.g. (73-75).
(73) Well we told the skipper, he says “oh he's not blind” he says, “he just wants to
go back, he don't want to go to sea for Christmas”. (Sound Archive)
(74) I used to swing that round so your centrifugal force kept the milk in and you’d
twirl it round like that, it don't come out. (Sound Archive)
(75) They let him wed Joe Tinker’s widow, ut says hoos waitin for mi shoon,
becose if he is a bit of a foo’ sometimes, he are too good a mon to throw
away upo’ sich like as her. (Litcorp)
This pattern is more frequent than the NSR, and is also found in other regions of the
UK, e.g. in East Anglia (Britain and Rupp, 2005) and Buckie Scots (Smith and
Tagliamonte, 1998). Most frequently in Lancashire, variation of this type (3sg subjects
with non-3sg agreement) is found with come and to a lesser extent, give, as shown in
e.g. (76-77).
(76) So then we went in we got into Stornoway. The the lifeboat come out to us
and the er some of the fishing boats you know. And the old man says don't
take any any ropes or owt the lifeboat’s coming. (Sound Archive)
(77) And I said, “well I don't know, I'll have to go and ask her” so he give me
money for t'bus and I went up and asked me Mother if, me Grandad lived with
us so me brother was all right, me Grandad would look after him you see.
(Sound Archive)
These examples again demonstrate the cross-over from other constructions,
with both examples here using the historical present. Tagliamonte (2001:44) refers to
variation with come as Past Reference Come. Tagliamonte suggests that come/came
variation is also present in York, and indeed is a well-known nonstandard
characteristic of English dialects; it is therefore unsurprising to find it in this data.
The data also returned a number of constructions which were excluded from
subsequent tables. These included examples of indefinite pronominal subjects, such as
everybody in (78) which may be interpreted as a plural, and those instances that could
not be definitively resolved from their context as the subject of the sentence is unclear,
e.g. (79).
(78) He 'd just stand there would the butcher, newspaper in his hand, handful of
mince meat, couple of neck end chops, two sausages, a real Jacob 's joint and
he would hold it up in the air, who'll give me two bob for this, well
everybody shout out but we were fortunate again there because my dad was
an old mate of the butchers (Sound Archive)
(79) So they were all, and goes over to there and says “you don’t know me do
you?” (Sound Archive)
By now it should be evident that considerable variation in verbal agreement
exists within the Lancashire corpora. A corpus analysis of 3sg variation in present
tense verbs has shown that NSR agreement is infrequent in Lancashire, but where the
NSR does occur, no adherence to the subject type and subject position constraints is found. This
analysis now examines similar 3sg variation with past tense forms of BE; a
construction that displays considerable variation in most dialects of English, e.g. in
Reading (Cheshire 1982); York (Tagliamonte 1998); the Fens (Britain 2002) and is
often linked to NSR-type constraints.
4.4.4 Was/were variation
Nonstandard was/were variation typically has three different distribution patterns. The
first, and most common, involves levelling to was across person, number and polarity
(see e.g. Chambers, 1995; Tagliamonte and Smith, 1999; Malcom, 1996). The second
involves levelling to were in negative polarity contexts and, to a lesser degree, was in
positive polarity contexts (see e.g. Trudgill, 1999; Cheshire, 1982; Tagliamonte,
1998). The third, and less frequent pattern, involves levelling to were in both positive
and negative polarity clauses (e.g. in nearby Bolton, Shorrocks, 1999; Moore, 2003).
A preliminary examination of past tense BE variation in Lancashire was undertaken by
Hollmann and Siewierska (2006:25) using a subsection of the data used in this study.
It suggested that the past tense BE paradigm showed levelling towards was, but more
interestingly also towards were. These results are now tested on both the Litcorp and
Sound Archive data; results can be seen in Table 10.
                                  was             were             total
Litcorp          (Nonstandard)    17 (2.5%)       1459 (8.5%)      1476 (8.3%)
                 (Standard)       651 (97.5%)     15610 (91.5%)    16261 (91.7%)
Sound Archive    (Nonstandard)    436 (6.7%)      797 (20.9%)      1233 (12.0%)
                 (Standard)       6025 (93.3%)    3010 (79.1%)     9035 (88.0%)
TABLE 10. WAS/WERE VARIATION IN THE SOUND ARCHIVE AND LITCORP
These results show that nonstandard use of was/were, e.g. (80), is more frequent in the
Sound Archive data than in Litcorp, but is prevalent in both corpora.
(80) When we were kids if there was anything wrong with us, boils or anything
like that, you never went to the doctor's, you were sent round to Grandma
Wheelers, and she were terrifying she were. (Sound Archive)
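The percentages in Table 10 follow directly from the raw counts: each nonstandard count is divided by the total number of tokens of that form in the corpus. A minimal sketch of this arithmetic is given here. The counts are those reported in Table 10, while the dictionary layout and the function name nonstandard_rate are illustrative assumptions rather than part of the original analysis.

```python
# Counts taken from Table 10: (nonstandard, standard) tokens per form and corpus.
TABLE_10 = {
    "Litcorp": {"was": (17, 651), "were": (1459, 15610)},
    "Sound Archive": {"was": (436, 6025), "were": (797, 3010)},
}


def nonstandard_rate(nonstandard, standard):
    """Return the nonstandard share of all tokens, as a percentage."""
    return 100 * nonstandard / (nonstandard + standard)


for corpus, forms in TABLE_10.items():
    for form, (nonstd, std) in forms.items():
        print(f"{corpus:13} {form:4} {nonstandard_rate(nonstd, std):.1f}%")
    # Overall rate for the corpus, pooling was and were.
    total_nonstd = sum(n for n, _ in forms.values())
    total_std = sum(s for _, s in forms.values())
    print(f"{corpus:13} all  {nonstandard_rate(total_nonstd, total_std):.1f}%")
```

Normalising in this way matters because the two corpora contain very different numbers of was and were tokens, so raw counts alone would not be comparable.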
As with the preliminary results from Hollmann and Siewierska (2006:25), both
corpora show that variation is found in both directions (levelling towards was and
were). A further analysis of sentence polarity now tests whether or not this variable
affects was/were choice in Lancashire, as found in other studies (e.g. by Cheshire and
Fox, 2006:3).
                        negated         non-negated
Litcorp        was      7 (41.2%)       10 (58.8%)
Litcorp        were     869 (59.6%)     590 (40.4%)
Sound Archive  was      188 (43.1%)     248 (56.9%)
Sound Archive  were     302 (50.6%)     295 (49.4%)
TABLE 11. NEGATED VS. NON-NEGATED NONSTANDARD WAS/WERE RESULTS.
Was levelling, and perhaps more interestingly, were levelling, is relatively frequent
both with and without negation in Lancashire, e.g. (81-82).
(81) Yeah, I carried on in this shop and I weren't making much money. When all
was paid out and everything I'd only about five bob left which wasn't enough.
Well there's, that shop was round the corner and come on t' front here and I
took one here and that was better by 10 pound a week. (Sound Archive)
(82) There were, he were a poultry farmer. He were a loomer at first. Then he
were a poultry farmer you see they were all allotments and poultry farms and
pig farms and er. I had another brother what er were dairyman at er Townley.
(Sound Archive)
This differs from other regional varieties where levelling to were is suggested to occur
mainly in negative polarity contexts. These was/were results for Lancashire conflict
with the earlier hypotheses which suggested that in regions where NSR agreement is
present, it is analogically extended to was/were variation. In Lancashire the non-3sg
pattern (i.e. were) can be extended to all contexts; this opposes the NSR agreement
pattern where 3sg patterns are extended to non-3sg contexts. These results are
unsurprising considering the absence of NSR agreement in this Lancashire data;
was/were variation appears to be unrestricted by any proposed analogy with the NSR.
This may suggest that other variables, such as e.g. constructional frequency, may
affect was/were variation in this region. Recently, the usage-based model has received
some attention in sociolinguistics (see Hollmann and Siewierska, 2011, for a
discussion of this). The usage-based model (see e.g. Croft and Cruse, 2004: 291-327)
suggests that constructions that are more frequent become more entrenched in the
mind over time. This means that if this nonstandard use of were is used frequently by
speakers then perhaps it is more resistant to language change.
Was/were may also be more susceptible to variation because, in speech,
the two forms often appear to be phonologically similar, e.g. (83-84).
(83) [a: wə bi:ɪn dɹagd daʊn ðə pɪt]
I were being dragged down the pit (Sound Archive)
(84) [a: wəz bi:ɪn baθd ɪn zɪnk baθ]
I was being bathed in zinc bath (Sound Archive)
4.4.5 Other variation with was/were
As in the case of Hollmann and Siewierska’s (2006) results, considerable intraspeaker
variation is found with was/were in the Lancashire data, e.g. (85-86).
(85) Yeah they were right big they was, two big strapping lads and er, and the
sister, she was called Lizzie, Elizabeth but we always called her Lizzie, Miss
Lizzie. (Sound Archive)
(86) No, no, I was going to shoe horses with Joe Littleun, I were going to shoe
horses. (Sound Archive)
Within the was/were analysis a number of constructions displaying dislocation
were found in the corpora, although these were most frequent in the Sound Archive
data. Clefted sentences were found with all combinations of was/were e.g. (87-88).
(87) Oh aye, Morecambe was a great place for entertainment during the war it
was. (Sound Archive)
(88) The price of coal was low, it were. (Litcorp)
Alongside the NP was/were X, it was/were pattern, there was also the NP
was/were X, was/were it pattern, e.g. (89-90).
(89) That were a sad job were that. (Litcorp)
(90) Yeah he were nice were Mr Kay. (Sound Archive).
An analysis of the corpus data has revealed that while variation in verbal
agreement is frequent in Lancashire, instances of the NSR are rare, existing almost
exclusively with the archaic personal pronoun thou. Perhaps unsurprisingly then,
was/were variation also shows no restriction with respect to subject type or position.
As mentioned previously, possible biases due to the nature of the corpus data (for
example, frequent use of the past tense in the Sound Archive, possible stylistic
motivations in Litcorp) may have skewed these results. As the instances of NSR in the
corpus were too infrequent to enable the testing of constraints such as subject type and
subject position, the questionnaire is used in order to explore possible variation such
as this.
4.4.6 Questionnaire results
The questionnaire reached 269 informants. Of these, 243 completed the questionnaire
in its entirety and are included in the results shown in §4.4.6. 103 informants were
students in undergraduate classes at Lancaster University, typically aged 18-22 and
from a mixture of English regions. The other 140 informants were targeted via social
networking websites, and were asked to fill in an online version of the questionnaire.
Online informants were then encouraged to pass on the questionnaire to any of their
colleagues, family or friends that they felt were also likely to respond. Most online
participants were of a mixed age range and from a number of different regions,
although the majority were from Lancashire or the North West, with the average age
being 36.
The tables in this section present the scores of the grouped participants (see
§4.3 for a discussion of groupings and participants). Participants were asked to assign
scores from 1 to 5 to test sentences, with 1 being judged by them as the least
acceptable and 5 as the most acceptable. A full copy of the questionnaire can be found
in Appendix D. The distribution of informants is shown in Table 12.
Informant group                       Number of informants
Lancashire (dialect speakers)         98 (40.3%)
Lancashire (non-dialect speakers)     25 (10.3%)
Other north (non-dialect speakers)    20 (8.2%)
South (non-dialect speakers)          22 (9.1%)
total                                 243 (100%)
TABLE 12. GEOGRAPHICAL REGION AND DIALECT TYPE OF INFORMANT
As discussed in §4.3, the dialect speaker vs. non-dialect speaker distinction between
informant groups was made by the informants themselves in response to the question
‘Do you consider yourself to be a dialect speaker? If yes, which dialect?’ This
differentiation is made in order to test whether or not dialect speakers are more likely
to find NSR agreement acceptable, as compared to non-dialect speakers.
The five point scale used in this test allows statistical analyses and
comparability. Excluding explicit descriptors for 2, 3 and 4 allows interval variable
status to be approximated on the assumption that intervals between each of these five
values are the same. This allows mean scores to be calculated. The overall median
results for all respondent groups are shown in Table 13. The mean score is shown
alongside this, in brackets.
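As an illustration of how a "median (mean)" entry of this kind is derived from raw responses, the following minimal sketch may be helpful. The scores are invented for the purposes of the example and are not taken from the questionnaire data; only the 1 to 5 scale and the reporting layout come from the description above.

```python
from statistics import mean, median

# Hypothetical acceptability scores (1 = least acceptable, 5 = most acceptable)
# given by one respondent group to one test sentence. Invented for illustration.
scores = [3, 1, 2, 4, 3, 2, 5, 3, 1, 3]

group_median = median(scores)
group_mean = mean(scores)

# Report in the "median (mean)" layout used for the questionnaire tables.
summary = f"{group_median:g} ({group_mean:.1f})"
print(summary)  # prints: 3 (2.7)
```

The median is robust to the handful of respondents who use only the extremes of the scale, whereas the mean relies on the equal-interval assumption discussed above; reporting both allows the two to be compared.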
The questions posed in the survey can be grouped by the particular subject
type or position restriction being tested, e.g. adjacency, heaviness etc. These results
with respect to the respondent groups can be seen in Table 13.
                                      constraint type
                                      adjacent    non-adjacent
                                      personal    personal        heavy      “normal”
respondent group                      pronouns    pronouns        NPs        agreement
Lancashire (dialect speakers)         3 (2.4)     1 (1.7)         2 (2.3)    4 (4.0)
Lancashire (non-dialect speakers)     2 (1.7)     1 (1.6)         2 (2.0)    4 (4.3)
Other north (dialect speakers)        2 (2.0)     1 (1.5)         2 (2.1)    4 (4.3)
Other north (non-dialect speakers)    2 (1.8)     1 (1.5)         2 (2.0)    4 (4.4)
South (dialect speakers)              2 (1.8)     1 (1.2)         2 (1.9)    4 (4.2)
South (non-dialect speakers)          2 (1.8)     1 (1.1)         2 (1.9)    4 (4.2)
TABLE 13. MEDIAN AND MEAN ACCEPTABILITY SCORE BY ALL RESPONDENTS, GROUPED
RESULTS
Both the Mann-Whitney U-test for median values and the t-test were employed in
order to test the significance of this data. All respondent groups were compared to one
another for all constraint types. Many of the results showed no significance, thus
suggesting that speakers from different regions, in many cases, found certain
constraint types (e.g. non-adjacent personal pronouns, heavy noun phrases) to be
equally unacceptable, often giving the lowest possible score. This unacceptability ties
in with the low frequency of NSR agreement found in the Lancashire corpora. It may
be that in Lancashire, along with other regions, NSR agreement is unacceptable to
most speakers.
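For readers unfamiliar with the Mann-Whitney U-test, the sketch below shows how the U statistic is obtained for two independent groups of ordinal scores, with tied scores sharing the mean of their ranks. It is an illustration under stated assumptions only: the two score lists are invented, the function names midranks and mann_whitney_u are hypothetical, and no p-value is computed (in practice this step would normally be carried out with a statistics package).

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical scores for one test sentence from two respondent groups.
lancashire_dialect = [3, 2, 4, 3, 1, 3]
south_non_dialect = [1, 2, 1, 1, 2, 3]

u_a, u_b = mann_whitney_u(lancashire_dialect, south_non_dialect)
print(u_a, u_b)  # prints: 28.0 8.0
```

Because acceptability judgements on a five-point scale produce many ties, the treatment of tied ranks is the part of the procedure that most affects the outcome; the smaller of the two U values is the one conventionally compared against critical values.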
The most significant survey results come from the respondent group
Lancashire dialect speakers. Adjacent personal pronouns returned both mean and
median results that are statistically significant at a confidence level of 94%. This
suggests that Lancashire dialect speakers consider adjacent personal pronouns
occurring with nonstandard verbal agreement to be more acceptable than any other of
the tested constraints. Again, this further substantiates earlier findings from the corpus
results, which, unlike definitions of the NSR in the literature, also showed that
nonstandard 3sg agreement occurs with adjacent personal pronouns very frequently. A
further breakdown of these results for adjacent personal pronouns with each individual
test sentence is presented in Table 14.
                                      adjacent personal pronouns
respondent group                      I talks    you needs    we thinks    they walks
Lancashire (dialect speakers)         3 (2.4)    1 (1.7)      2 (2.3)      1 (1.9)
Lancashire (non-dialect speakers)     2 (1.7)    1 (1.6)      2 (2.0)      1 (1.8)
Other north (dialect speakers)        2 (2.0)    1 (1.5)      2 (2.1)      1 (1.9)
Other north (non-dialect speakers)    2 (1.8)    1 (1.5)      2 (2.0)      1 (1.7)
South (dialect speakers)              2 (1.8)    1 (1.2)      2 (1.9)      1 (1.5)
South (non-dialect speakers)          2 (1.8)    1 (1.1)      2 (1.9)      1 (1.5)
TABLE 14. TESTING ADJACENT PERSONAL PRONOUNS
Here it can be seen that the test sentence containing I talks (I talks to the man for a
while) is the most acceptable of all test sentences in this category, with this being the
case particularly for Lancashire dialect speakers. The acceptability of sentences such
as this may be linked back to constructional overlap; this sentence could be considered
as an example of the historical present, or of the habitual (without an overt adverb
phrase) by the survey respondents. This would suggest that Lancashire dialect
speakers find adjacent non-3sg pronouns with 3sg agreement to be more acceptable
than other groups do, perhaps due to the frequency of competing constructions (see e.g.
Table 8).
The use of a questionnaire methodology is not without its limitations; an
analysis of the results suggests that often informants are reluctant to choose 1 or 5,
with sometimes even ‘normal’ sentences not being given the highest score.
Conversely, there were also participants who only gave scores of 1 or 5, i.e. a yes/no-type
response. This aside, combined with the substantial corpus data, these results
give a good picture of verbal agreement in Lancashire.
This analysis of possible instances of the NSR in Lancashire has revealed that while
3sg agreement in this region is subject to considerable variation, the situation is far too
complex to be accounted for by a single rule. Instances of present tense indicative
variation that conform to the NSR are extremely rare in the Lancashire data,
particularly in the more modern Sound Archive corpus. No examples of the
contrastive patterns frequently detailed in the literature, such as we peel ‘em and boils
‘em (Ihalainen 1994:221), were found, and aside from those instances with the archaic
pronoun thou, only fifty-five clear and outright instances of the NSR were found in the
combined corpora of over 800,000 words. As the NSR is not a frequent construction
in the Lancashire data, no evidence-based hypotheses about possible diachronic
change can be put forward.
Linked to this lack of instances of the NSR, variation found with past tense BE
does not conform to subject type, position or polarity constraints and is found
frequently in all person/number/polarity contexts. Agreement of this type is
comparatively rare in most varieties of English. The lack of NSR results does not
suggest, however, that this chapter has presented no findings; over 4,000 instances of
nonstandard verbal agreement are identified in the corpus data. Most frequently,
instances of agreement variation in the Lancashire data involve a direct flouting of the
subject position and subject type restrictions specified by the NSR, e.g. (91) and (92).
(91) Ah likes a good hymn tune (Litcorp)
(92) I used to swing that round so your centrifugal force kept the milk in and you’d
twirl it round like that, it don’t come out. (Sound Archive)
The violation of position and type restrictions (namely, adjacent non-3sg pronouns
found with 3sg agreement) is further substantiated by results from Lancashire dialect
speakers in the questionnaire. Results from these respondents indicated a higher
acceptability score (as compared to other respondent groups) for non-3sg pronouns
with 3sg agreement. Possible reasons for this distribution may lie with the frequency
of both the historical present and habitual construction in the corpus data. While these
frequencies may be due in part to corpora biases, they do still provide a good basis to
suggest that these constructions are prevalent in the region and therefore interfere and
overlap.
One of the main difficulties in this analysis lies in delineating the NSR with
respect to other constructions, as outlined in Table 9. While the semantic difference in
the NSR as compared to the historical present is quite clearly a question of relative
time (and is usually easily resolved from the sentence context), this is trickier for
habitual constructions. The problem of habitual aspect disambiguation may undermine
the validity of some previous NSR claims, such as those outlined in §4.3.1. Certainly,
in regions such as Lancashire where 3sg forms can indicate the habitual aspect (with
or without adverb phrases) these 3sg forms may be re-analysed as a marker of habitual
semantics alone, and extended into pronoun-adjacent contexts rather than be a marker
of agreement as suggested by Pietsch (2005) in sentences such as the sheep bleats, and
burglars steals ‘em.
As outlined in §4.3.1, much of the previous research into the NSR has been
dominated by theories, rather than being informed by empirical data. This study goes
some way towards redressing that balance, although further research, particularly on
testing the boundaries between construction types and their effect on patterns of
agreement, is needed.
Chapter 5. Salience
5.1 Introduction
Sociolinguistic salience is described as the property of a dialectal feature which makes
it cognitively or perceptually prominent, both for speakers of the dialect and speakers
of other dialects (Kerswill & Williams, 2002). This awareness of possible
sociolinguistic values associated with particular words and constructions has
implications for theories of language variation and change in regard to, for instance,
the distribution of sociolinguistic variables (e.g. Kerswill, 1985); language learning
(e.g. Bardovi-Harlig, 1987) and language contact (e.g. Trudgill, 1986). A number of
studies have discussed how variables become salient, but how saliency could be
investigated on the basis of corpus data, let alone quantified and evaluated, is quite a
new topic of research.
This chapter compares the similarities and differences found in a selection of
nonstandard grammatical constructions that are produced in spoken conversation and
Lancashire dialect literature. The main grammatical features that are considered in this
chapter are set out in Figure 1. The rationale for selecting these features in particular is
discussed in §5.3.1.
definite article reduction/deletion    what relativization
was/were variation                     archaic personal pronouns
past tense negator never               archaic verb forms
demonstrative them                     nonstandard irregular lexical verbs
past reference come                    adverbial right + adjective
possessive me + noun                   absence of plural marking
adverbial quick                        present participles sat and stood
FIGURE 1. NONSTANDARD FEATURES EXAMINED IN CHAPTER 5.
A methodology for comparing spoken language to dialect literature is described and
tested in this chapter. Here dialect literature is considered as a collection of
constructions that the speakers believe encapsulates their variety – via its salient
constructions. An analysis of dialect literature alone will no doubt give interesting
results, but by comparing dialect literature to spoken language the difference between
production and perception of grammatical features can be examined. The application
of this methodology will allow us to arrive at some idea of which of the grammatical
features listed in Figure 1 emerge as salient in terms of their distribution across the
corpora and which may stand out as being primarily produced or primarily perceived.
5.1.1 What is salience?
Meyerhoff (2010:294) describes salience as “a maddeningly under-defined term when
used in sociolinguistics” and from a survey of the literature, it seems that salience is
indeed used to refer to a number of different concepts in slightly different ways. While
most definitions typically refer to some element of language that is noticeable or
prominent, sometimes salience describes awareness of the listener, i.e. how readily a
particular variant is perceived or heard (e.g. Mufwene, 1991) and on other occasions it
relates to awareness of the speaker (e.g. Hickey, 2000:57). Salience is also sometimes
used to refer to a non-linguistic factor that the context or participants may have
foregrounded in discourse, e.g. gender salience (e.g. Cheshire and Gardner-Chloros,
1998:10) and discourse salience (e.g. Prasad and Strube, 2000). Often, salience is seen
to be gradable, i.e. certain variables are considered as being more salient than others.
Thus for instance, as discussed further in §5.2.1, markers are seen as less salient than
indicators and these less salient than stereotypes (Labov, 2001). Trudgill (1986)
describes extra strong salience as having more significance than (presumably)
“ordinary” salience. A clear way of testing hypotheses such as these using corpus data
is yet to be established.
Views are divided on the degree to which salience might be conditioned by
language internal factors (e.g. structure, perception and cognition) or by
extralinguistic factors (e.g. sociolinguistic and social-psychological causes). Both
Kerswill and Williams (2002) and Hollmann and Siewierska (2006) agree that
independent factors underlie salience, and Hollmann and Siewierska suggest that
cognitive-perceptual factors are primary. In this chapter salience is used to refer to
both structural and external or extralinguistic factors. Although the factors affecting
and/or determining salience are of course relevant to this study, the focus here is not
specifically on the causes of salience or on the trajectory of a particular feature along
its journey towards becoming salient in one particular dialect. Instead, this chapter
aims to test a methodology for quantitatively examining salience and to describe how
this can be applied in order to outline the salient features within a particular dataset.
The outcomes of this analysis should then allow more concrete claims to be made
about the behaviour and distribution of particular constructions based on corpus data.
5.1.2 Salience, markedness and enregisterment
If salience is considered to be the property of a dialectal feature which makes it
cognitively or perceptually prominent, then it is also related to several other concepts
used in the literature to describe the status of linguistic features such as enregisterment
(Agha, 2003); markedness (Greenberg, 1966) and the stigmatization hierarchy
(Labov, 2001). While these concepts are not considered in this chapter at length, it is
useful to outline how they may relate to and interact with salience.
Enregisterment is the identification of a set of linguistic norms as ‘a linguistic
repertoire differentiable within a language as a socially recognised register’ (Agha
2003: 231). Enregisterment is often discussed in relation to the identification of a set of
variables that mark out a specific scheme of cultural values by the speaker. It therefore
follows that dialect literature used in this thesis might be considered as the
enregisterment of those Lancashire features by the writers of such material, i.e. those
features that the writers consider to typify this regional variety (see e.g. Beal, 2000;
Honeybone and Watson, forthcoming, for analyses based on this approach).
Markedness is frequently concerned with oppositions in phonology, grammar
and semantics – often contrasting marked and unmarked counterparts. Markedness is
often related to complexity (e.g. Croft, 2002; Greenberg, 1966) where the marked
feature is a counterpart to a broader or more dominant unmarked pattern. If we
consider Standard English as “unmarked” and nonstandard varieties as “marked”, then
this comparison with salience appears to work to some degree. However, salience is
different to markedness in that a salient form has no real tendency towards being the
more complex as compared to its semantic counterpart (compare e.g. the “marked”
definite article deletion as compared to the “unmarked” the). Nor do such cases fall
straightforwardly under what is sometimes termed as ‘local markedness’ (e.g. by
Anderwald, 2003) where the expected pattern (i.e. complex = marked, simple =
unmarked) is reversed. Markedness is also frequently cited as being associated with
frequency, although this is rarely mentioned explicitly. For example, Radford
(1988:39) equates the term “unmarked” with ‘regular’, ‘normal’, ‘usual’; and
“marked” with ‘irregular’, ‘abnormal’, ‘exceptional’, or ‘unusual’. This
characterization does not fit with the concept of salience, where certain nonstandard
features (such as e.g. was/were variation) are far from ‘abnormal’ or ‘exceptional’ in
the data. Along with this, salience is typically used in the literature to relate to
nonstandard forms – markedness can refer to two opposing but acceptable features.
Similar to the suggestion of gradable salience mentioned earlier, Labov
outlines a stigmatization hierarchy (2001), where features are divided into markers,
indicators and stereotypes depending on how closely linked they are to a particular
group in society. Indicators are described as not showing any change in style, but
instead vary with respect to social stratification. Markers show both social and
stylistic stratification and are linguistic variables to which social interpretation is
overtly attached. Stereotypes not only have well-known social meanings, but are
generally stigmatized and often actively avoided. It is hoped that a comparison of the
perceptions of the Lancashire dialect with what is actually produced by speakers
should enable the identification of these differences. The dialect literature corpora can
be considered as a collection of the stereotyped features. Features that occur in the
spoken corpus and also in dialect literature but are rare in Standard English may be
considered as indicators. Any features that occur across all corpora and perhaps are
also reported in other varieties of English, or perhaps even in Standard English, could
be considered as markers.
5.1.3 Accommodation – problems and solutions
As with other chapters in this thesis, the analysis in this chapter is based upon spoken
data from the Sound Archive corpus and written dialect data from Litcorp (please see
Chapter 1 for a further discussion of these sources). When exploring salience in these
two very different corpora a number of factors must be taken into account. Although
the Sound Archive data is a transcription of the speech of Lancashire dialect speakers,
it may be that certain nonstandard forms are recognized by these speakers as being
dialectal. It is therefore plausible that the informants in the Sound Archive may
actively down-play or, equally, emphasize particular constructions, depending on both
their own knowledge of their local dialect and the way in which they wish themselves
to be portrayed (i.e. as more or less dialectal, see e.g. Hollmann and Siewierska 2006).
Indeed, often salient features are categorised as such by the speaker’s readiness to
accommodate away from them (see e.g. Kerswill and Williams, 2002; Hollmann and
Siewierska, 2006). Of course, looking at salience in spoken data brings an element of
circularity into this methodology – how can we know if features are salient by looking
at corpus data, if speakers who contribute to that corpus data also know that certain
features are salient too and so actively up/downplay them? Accommodation such as
this is difficult to factor into any analysis; it is difficult to know which speakers may
have accommodated their dialect, when this happens and also in which direction(s).
The inclusion of data from Litcorp does go some way to working around the
above problem. As Litcorp is not a transcription of real Lancashire dialect speech, but
instead a record of the writers’ perception and representation of Lancashire speakers,
it can be considered as a corpus of the most salient or important dialectal features as
judged by the writers in question. This use of an extensive dialect literature corpus in
order to quantify salience is original – no other attempts to examine and compare data
such as this can be found in previous studies. However, as (on average) around 100
years separate the data in Litcorp and Sound Archive it is possible that variation due
to diachronic change could influence results. To consolidate the results from Litcorp
and to broaden the diachronic span of the dialect literature, new corpus data has been
collected. In order to collect data of a suitable length participants were asked to
reproduce a story that was familiar to them – a fairy tale. This new corpus is named
Lancashire Fairytales, a sample of which is shown in (1) (further examples are given
in Appendix G). More details on how this new corpus was collected are provided in
§5.3.2.
(1) An Cinderella were havin a gradely time at Ball wit Hansome Prince. Then,
she looked at time and said “oooh eck, I’ve gorra dash love, or I’ll turn into
some right nasty vegertable!” An off she dashed, right down road.
(Lancashire Fairytales)
5.1.4 Aims
The main aim behind the methodology employed here is to contrast production and
perception in the Lancashire corpus data. Concretely, this involves evaluating the
perceived features of Lancashire grammar as set out in dialect literature against the
features present in the Sound Archive corpus. In particular, it is interesting to uncover
if there are features that are perceived as part of the Lancashire dialect but occur rarely
in speech, and if there are features of the dialect that do occur in speech (and are
nonstandard) yet are not perceived as part of the Lancashire dialect.
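One hypothetical way of making this contrast concrete is sketched below: raw counts are normalised to a rate per 10,000 words in each corpus, and a feature is then labelled according to where it reaches a chosen rate. This is offered only as an illustration of the logic of the comparison and should not be read as the procedure applied later in the chapter. The feature names, counts, corpus sizes and the threshold value are all invented, and the function names per_10k and contrast are assumptions.

```python
def per_10k(count, corpus_words):
    """Normalise a raw count to occurrences per 10,000 words."""
    return count / corpus_words * 10_000


def contrast(speech_rate, literature_rate, threshold=1.0):
    """Label a feature by where it reaches the (arbitrary) rate threshold."""
    produced = speech_rate >= threshold
    perceived = literature_rate >= threshold
    if produced and perceived:
        return "produced and perceived"
    if perceived:
        return "primarily perceived"
    if produced:
        return "primarily produced"
    return "rare in both"


# Invented counts and corpus sizes, for illustration only.
SPEECH_WORDS, LITERATURE_WORDS = 400_000, 400_000
features = {
    "feature A": (12, 480),   # rare in speech, common in dialect literature
    "feature B": (350, 10),   # common in speech, rare in dialect literature
    "feature C": (200, 260),  # common in both
}

for name, (speech_count, literature_count) in features.items():
    label = contrast(per_10k(speech_count, SPEECH_WORDS),
                     per_10k(literature_count, LITERATURE_WORDS))
    print(f"{name}: {label}")
```

Normalising by corpus size is needed because the spoken and written corpora need not be the same length; the cut-off itself is arbitrary and would have to be justified separately in any real application.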
It should be noted that if a particular nonstandard feature is found frequently
across the corpora, this does not automatically mean that this feature is a salient
feature of the Lancashire dialect (although naturally this is also possible). Lancashire
dialect and Standard English of course share a large number of grammatical
constructions, and so instances of grammatical features found across the corpus
sources could indicate a feature of Standard English, rather than a salient feature of
the Lancashire dialect. This can be better demonstrated using the diagram in Figure 2
which shows the possible intersect between standard, dialectal and salient features.
[Diagram with three labelled regions: Constructions salient to the dialect; Constructions used by non-dialect speakers; Constructions used by dialect speakers]
FIGURE 2. MAPPING SALIENT CONSTRUCTIONS
Although from the outset the grammatical variation examined here is intended
to be ‘nonstandard’ (i.e. those features set out in Figure 1), that does not entail
that a nonstandard construction found across all corpora in question is particularly
‘Lancashire’. Even writers who intend to write in Lancashire dialect use varying
degrees of ‘generic’ (or non-salient) nonstandardness. Alongside this, it is important
to note that the salience of grammatical features need not be uniquely tied to a specific
region. While some features may typify a particular region alone (regional words are a
good case in point), other features may occur in several areas (e.g. definite article
reduction found in both Lancashire and Yorkshire; variation with BE found in various
locations).
It may also be the case that certain constructions are more salient than others
and this in turn may have implications for their corpus frequency. For example, the
use of nonstandard verbal agreement forms (such as those discussed in Chapter 4)
may stand out more than other nonstandard features, such as definite article deletion,
certainly when spoken. This can be appreciated perhaps on the basis of the examples
in (2) and (3).
(2) If you if you was fourteen you was ready for work. (Sound Archive)
(3) I don't know whether I, mind you must have had milk there because we
used to call for Ø milkman one of Ø lads did. (Sound Archive)
As a consequence of the above, one may speculate that instances of more salient
nonstandard verbal agreement as shown in (2) are likely to be found less frequently in
speakers who wish to adhere to overt prestige forms, and more frequently in speakers
wishing to adhere to potential covert prestige forms (i.e. dialect forms). As mentioned
previously, speaker attitudes are not consistent, and although the methods employed
by Hollmann and Siewierska (2006) may certainly go some way to measuring
accommodation, it is not feasible to employ such methods here alongside other aims
of the chapter. Consequently, accommodation is not treated as such in this chapter, but
this remains a possibility for further research.
5.2 Rationale
A number of researchers have outlined the factors which may influence and govern
the amount of salience that is associated with particular linguistic constructions, citing
a number of different factors as being influential. These include prosodic salience
(Yaeger-Dror, 1993); isomorphism (Mufwene, 1991; Chapman, 1995); frequency of
use (Bardovi-Harlig, 1987), and a combination of these along with social factors
(Hollmann and Siewierska, 2006; 2011). As demonstrated by these varying positions,
accounts of salience are complex and cannot be considered at length here. Instead, my
analysis now turns to how salience can be measured using corpus data of different
types.
5.2.1 Using dialect literature
Perhaps the most obvious feature of dialect literature lies in the semi-phonetic
representation of the dialect by the writers which is best demonstrated using an
excerpt from one of the Litcorp texts, as shown in Figure 3.
M. – Why, whot’s bin th’ matter, hanney fawn eawt withur
Measter ?
T. – Whot ! there’s bin moort’ do in a Gonnart much, I’ll
uphowd tey ! – For whot dust think ? bo’ th’ tother Day boh
Yusterday, hus Lads moot’d ha’ o bit on o Hallidey, (becose
FIGURE 3. SEMI-PHONETIC RESPELLINGS IN TUMMUS AND MEARY (TIM BOBBIN, 1846)
Here we can clearly see not only grammatical variation but also significant
semi-phonetic respellings. If these semi-phonetic respellings can be considered as
indications of a meaningful decision by the author (as is suggested by Sebba, 2009),
then these features give an extra layer of significance to the grammar and lexis as
chosen by the writers of dialect literature. While of course respellings naturally lend
themselves to a phonological analysis, I argue that they are also interesting in terms of
whether or not the distribution of these respellings may interact with instances of
nonstandard grammatical variation.
5.2.2 Choosing constructions
Unlike previous chapters which focused on distinct areas of grammatical variation
(e.g. verbal agreement in Chapter 4 or relativization in Chapter 2), this chapter
explores grammatical variation in Lancashire more widely. So far this thesis has
avoided grammatical features that have been the focus of previous studies in
Lancashire e.g. definite article reduction/deletion (Hollmann & Siewierska, 2006;
2011); ditransitives (Siewierska & Hollmann, 2005); and possessive me (Hollmann &
Siewierska, 2007). Instead, the approach here aims to uncover significant variation in
a larger selection of grammatical features, rather than analyse specific features only.
Along with features already identified as typical to Lancashire, a further selection of
features are tested. As outlined previously, the Sound Archive and Litcorp were
subject to a fine-grained analysis at the beginning of this project that highlighted both
the constructions that have been addressed in previous chapters along with a number
of those included in this analysis (e.g. what as a subject relative, was/were variation).
Other instances of variation are taken from an overview of existing literature on
nonstandard varieties of English (e.g. by Cheshire, Edwards and Whittle, 1989; Beal,
2004; Kortmann and Szmrecsanyi, 2004). By including variables beyond the typically
‘Lancashire’ ones, the extent to which these other nonstandard constructions (such
as lack of plural marking and never as a negator) are frequent in Lancashire can be
tested, along with whether or not they are perceived by Lancashire dialect writers as
being salient enough to include in their writing.
While grammatical variation is the focus of this study, salience is not a
phenomenon that applies exclusively to one language area independently of others;
phonological, lexical and discourse variation are also associated with regional
variation and as such can also be variably salient. Phonological features (as
represented through nonstandard spelling) and lexical choice are discussed in brief in
§5.4.4.
5.2.3 Summary, research questions and hypotheses
An analysis of the literature has shown that salience is often cited as a reason for
language variation and change. It is also clear that definitions of salience vary,
particularly in terms of emphasis (e.g. variation perceived by the speaker, the listener
or both) and factors that influence salience are complex and interrelated. While these
considerations undoubtedly impact upon any findings presented here, this thesis takes
a more methodological approach. The approach here is not to define what makes
something salient, but to be descriptive about salient features of the Lancashire
dialect. Currently there is a lack of suitable methodologies to test out and uncover
salience, and to describe which features may be considered as salient, based on a wide
range and large amount of corpus data. More specifically, this chapter addresses the
following themes:
a) How do grammatical features that are perceived as part of the Lancashire
dialect match to those features that are produced by Lancashire speakers?
b) According to the data used here, which features of the Lancashire dialect are
salient?
c) What is a suitable method for measuring salience?
d) If semi-phonetic respellings by the writers of dialect literature can be
considered as significant (as suggested by e.g. Sebba, 2009), then salient
constructions will be subject to respellings in the dialect literature.
5.3 Methodology
Measuring salience by comparing the presence and frequency of variables produced
by speakers in free speech to the frequency of these variables in dialect literature is a
new idea. It is hoped that such a measurement will provide new insights that go some
way towards finding out what speakers actually consider to be salient features of their
dialect. Alongside the distribution of the relevant features in the two corpora, their
distribution in a reference corpus of Standard English (in this case the BNC) will also
be taken into account in order to help adjudicate whether or not the features in
question are salient in Lancashire dialect or simply frequent in Standard English more
generally.
5.3.1 Corpus methods
As with previous chapters, corpus methods such as concordance searches, frequency
lists and keyword analyses are used to identify and explore grammatical variation.
Nonstandard features considered in this chapter were initially uncovered using a
number of methods: by looking at previously established variation in Lancashire (by
e.g. Shorrocks, 1999; Hollmann and Siewierska, 2006; 2007; Siewierska and
Hollmann, 2005); by a preliminary examination of samples from all corpora (as
outlined in Chapter 1) and by examining features of nonstandard grammatical
variation found in the UK more generally (e.g. by Cheshire, Edwards and Whittle,
1989; Beal, 2004; Kortmann and Szmrecsanyi, 2004). As not every possible
interesting result can be presented within the limits of this chapter (or even within this
thesis), only a selection, presented in Figure 1, are included here, with others
summarised in §5.4.
The overarching idea behind the methodology employed involves comparing the
frequency of standard and nonstandard uses of a particular grammatical feature used
in each corpus (for example, comparing the frequency of definite article
reduction/deletion to instances of the). To do this, it is necessary to calculate how
often all instances of each feature occur in total (used both in a standard and
nonstandard way) so that the percentage of nonstandard uses can then be ascertained.
Percentages rather than raw or normalised frequency figures are used because these
are more informative when comparing a larger number of variables. In short, ten
nonstandard instances of a frequent feature are quite different to ten instances of a rare
one. However, in some cases this methodology requires sensitive application. Where
the alternation between standard and nonstandard constructions is fairly fixed and
restricted (e.g. use of was as opposed to were) the methodology proposed here gives
useful results. Kerswill and Williams (2002:100) take the stance that the lack of full
semantic equivalence between variants means that these variants should be omitted
from the analysis. I instead agree with Hollmann and Siewierska’s assertion that this
stance is perhaps too strong (2006:28). Instances that do not have clear or obvious
standard/nonstandard matches are more problematic and indeed need to be considered
on their own merits but should not be excluded outright. By way of illustration
consider the nonstandard construction right + adjective found in Lancashire, as in it
was right big. What should this construction be compared to in Standard English? One
possibility is really + adjective, another is very + adjective, a third is a construction
without an adverbial but with a stronger adjective such as it was enormous, or it was
gigantic. This may also vary from speaker to speaker. In instances like this, ideally the
nonstandard construction (in this case right + adjective) must be compared against all
possible semantically similar constructions where it could be used. However, sensible
restrictions need to be placed on what might be considered as a ‘semantically similar
construction’ in order to avoid extensive searches for each grammatical feature under
investigation. In the case mentioned above, only other adverb + adjective
combinations were retrieved from the corpus. This list of constructions (along with
examples) was then presented to the Lancashire dialect speakers test group (see §1.3
for further information) who identified the instances where they judged that right
could be used (e.g. very + adjective, extremely + adjective but not always + adjective).
Results for right + adjective were then compared against only these ‘acceptable’
forms, in order to give as accurate a score as possible. This method was used with all
other features that had multi-construction options for their standard form.
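The percentage calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually run for this thesis: the function name nonstandard_percentage and the right + adjective counts are hypothetical, while the BNC figures are those reported in §5.4.1.

```python
def nonstandard_percentage(nonstandard, standard):
    """Share of nonstandard uses among all uses of a feature, as a percentage."""
    total = nonstandard + standard
    if total == 0:
        raise ValueError("feature does not occur in this corpus")
    return 100.0 * nonstandard / total

# BNC counts reported in §5.4.1: 121 instances of t' against 6,041,234 of 'the'.
bnc_reduction = nonstandard_percentage(121, 6_041_234)

# Hypothetical counts for right + adjective (invented for illustration).
# Only the standard constructions that the test group judged interchangeable
# with 'right' are allowed to enter the total.
standard_counts = {"very": 120, "extremely": 15, "always": 40}
acceptable = {"very", "extremely"}  # 'always' was rejected by the test group
standard_total = sum(n for adv, n in standard_counts.items() if adv in acceptable)
right_score = nonstandard_percentage(30, standard_total)
```

Restricting the denominator to the accepted constructions is what keeps a feature with many possible standard counterparts from being compared against forms it could never replace.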
As a consequence of this method, grammatical features that have multiple
matching constructions may have lower scores than results that have a more restricted
nonstandard/standard match as they are perhaps compared to a wider range of
variants. This could have undesirable implications if the frequency of each feature
were to be compared against each other directly. Here, however, it is of little
importance since each individual feature is compared only to instances found across
the three corpora on a feature-by-feature basis.
The methodology outlined above is also problematic when applied to instances
where the feature is more discourse-based or encompasses whole clauses or sentences.
For example, should all of the instances of dislocation (such as he were nice, were Mr
Jones) and of the discourse marker see (e.g. I’ll put it away for you, see) be compared
against all other non-dislocated sentences or sentences without such discourse
markers? It seems obvious that comparing sentences that include features of this type
to all other sentences in the corpus is not appropriate. Both dislocation and discourse
markers are used in particular contexts and for particular purposes and can appear
idiosyncratically both within the same text and from speaker to speaker. In order to
avoid false comparisons, these constructions are discussed more descriptively, as
shown in §5.4.4.
Once the degree of nonstandard use for each feature is established using the
methods outlined above, the score for each feature is averaged across the three
corpora. Then, the positive or negative deviation from this average in each corpus can
be used to highlight how particular grammatical items may be over or underused in
one corpus as compared to another rather than comparing the raw standard-to-
nonstandard usage scores in each corpus. This gives a better indication of how the
corpora compare. The BNC data is not averaged in the same way, but instead the
frequency of nonstandard to standard forms (as detailed earlier) is used.
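The averaging step can be illustrated as follows. This is a sketch under stated assumptions: the function name deviations_from_mean is hypothetical, and the three percentages are not reported directly in this chapter but are recovered by adding the deviations for right + adjective in Table 4 (§5.4.2) back to the mean score given there.

```python
def deviations_from_mean(scores):
    """Return the mean score and each corpus's signed deviation from it."""
    mean = sum(scores.values()) / len(scores)
    return mean, {corpus: score - mean for corpus, score in scores.items()}

# Percentages of nonstandard use recovered from Table 4
# (mean score plus each reported deviation).
right_adj = {"Litcorp": 0.00, "Fairytale": 31.05, "Sound Archive": 2.10}

mean, devs = deviations_from_mean(right_adj)
for corpus, dev in devs.items():
    print(f"{corpus:<14}{dev:+.2f}")
print(f"mean score: {mean:.2f}")
```

Because each deviation is measured from the same mean, the deviations for a feature sum to zero across the three corpora, so a positive value in one corpus always reflects relative underuse elsewhere.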
Much of the grammatical variation is retrievable by searching for individual
word forms (e.g. t’, or were etc). Other variation is found by searching for more
complicated patterns, e.g. possessive me + noun. The results from most searches
require some element of manual sorting, for example to disambiguate similar
meanings. Results that involved omission (such as zero relatives or definite article
deletion) were the most difficult to retrieve and were found by either using more
complex corpus search strings, by manual searching or by a mixture of the two.
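A word-form search of the kind described above might look like the following sketch. It is not the concordancing software used for this thesis: the kwic function and the REDUCED_ARTICLE pattern are hypothetical names, and the pattern only finds reduced forms written with an apostrophe (t’ or th’). Zero forms leave no token to match, which is why they required more complex or manual searches.

```python
import re

# Reduced definite article written as t’ or th’ (curly or straight apostrophe).
REDUCED_ARTICLE = re.compile(r"\bth?[’']")

def kwic(text, pattern, width=25):
    """Return (left context, hit, right context) for every match of pattern."""
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

# Sample sentence taken from example (4) in §5.4.1.
sample = ("Margit had lost a deol o’ wynt by th’ time hoo geet to th’ surgery, "
          "but as luck ud have it, th’ doctor were in.")
for left, hit, right in kwic(sample, REDUCED_ARTICLE):
    print(f"{left:>25} [{hit}] {right}")
```

The concordance lines still need manual sorting, since a pattern this simple cannot tell a reduced article from any other word that happens to end in an apostrophe after t or th.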
As we have seen in previous chapters, often nonstandard grammatical
variation found in the dialect literature also involves some element of nonstandard
spelling e.g. theau (thou), coom (come) and wur (were). These variant spelling forms
were initially identified in the preliminary analysis of these data (as outlined in
Chapter 1) and so were also retrieved by searching for their individual word forms.
Words (and of course spelling variants) that appear in one corpus but not in
another (i.e. the lexical choices and dialect words) were retrieved by means of a
keyword analysis. These lexical results are outlined in §5.4.7.
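A keyword analysis of this kind can be sketched as below. The keyness statistic is not specified in this chapter, so the log-likelihood measure used here is an assumption, as are the function names and the toy token lists; real input would be the tokenised corpora, which must not be empty.

```python
import math
from collections import Counter

def log_likelihood(a, b, size_a, size_b):
    """Log-likelihood keyness of a word occurring a times in corpus A and b in B."""
    expected_a = size_a * (a + b) / (size_a + size_b)
    expected_b = size_b * (a + b) / (size_a + size_b)
    ll = 0.0
    if a:
        ll += a * math.log(a / expected_a)
    if b:
        ll += b * math.log(b / expected_b)
    return 2 * ll

def keywords(tokens_a, tokens_b, top=10):
    """Words overused in corpus A relative to corpus B, strongest first."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    size_a, size_b = len(tokens_a), len(tokens_b)
    scored = []
    for word, a in freq_a.items():
        b = freq_b[word]
        if a / size_a > b / size_b:  # keep words relatively more frequent in A
            scored.append((word, log_likelihood(a, b, size_a, size_b)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]

# Toy token lists, invented for illustration only.
dialect = "it were gradely an he said nowt".split()
reference = "it was excellent and he said nothing".split()
key = keywords(dialect, reference)
```

Words shared in equal proportion by both lists (it, he, said) drop out, leaving the dialect words and respellings as candidates for closer inspection.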
5.3.2 New corpus data – Lancashire Fairytales
As mentioned earlier, any differences emerging from a comparison of Litcorp and
Sound Archive are potentially attributable to diachronic change due to difference in
dates between these two corpora. Therefore to counterbalance this, a new corpus of
dialect literature was collected. Respondents were asked to write in what they
considered to be Lancashire dialect. This means that this corpus captures the
perception of the grammatical repertoire of a Lancashire dialect speaker as considered
by the respondents (along with any possible phonetic representation they might
choose to include). As with Litcorp, this corpus is a collection of the most salient
features of the Lancashire dialect as judged by these writers. In order to get the
participants to write a story of useful length, they were asked to reproduce a story that
was familiar to them – a fairy tale. In building this new corpus, Lancashire Fairytales,
the length, style and number of stories a participant could write was unrestricted,
along with the type of variation their story should contain (e.g. grammatical variation,
lexical choices, and semi-phonetic spellings). Two short examples in Lancashire
dialect were produced by the small test group (see Figure 2, Chapter 1) and included
in the questionnaire for demonstration purposes. The wording of the questionnaire is
shown in Figure 4.
For this task, please write a fairy story that is familiar to you, in Lancashire Dialect.
Imagine that a speaker with a Lancashire accent and dialect is telling you this story,
and write how you think they would say it.

For example, you might choose to write the story of Little Red Riding Hood,
Goldilocks and the Three Bears or The Three Little Pigs.

Two examples are shown below:

(a) She turned, an’ said to Jack "Where’s money for cow?" Jack looked round,
an’ said, surprised‐like, "Why, I’ve getten these magic beans!" "Magic
beans?" she said, "My foot! They’re nobut rubbish are them!"

(b) An’ Cinderella were cryin’ and cryin’. Then, in corner of room appeared a
right nice lady, an' she says "Cinderella, you will go t’ball".
FIGURE 4. DIALECT LITERATURE TASK – LANCASHIRE FAIRYTALES.
The task was completed by 53 Lancashire respondents and 42 non-Lancashire
respondents, with most contributors writing between 350-500 words each. As with
previous tasks, respondents were segregated based on their answer to the preliminary
question do you have a Lancashire dialect? Yes/no. Those who answered no were then
asked for their region. There were 12 participants from the North East, 8 from North
West (excluding Lancashire), 7 from South East, 4 from West Midlands, 3 from East
Anglia, 3 from Wales, 2 from the South West, and 3 from various other regions.
Around 40 of the total number of respondents were undergraduate students at
Lancaster University (split between both Lancashire and non-Lancashire speakers),
typically aged 18-22. Others were of a mixed age range and were contacted through
social networking websites and encouraged to pass the task on to anyone they thought
might also complete it. Cinderella was the most popular choice of story, with 16
instances in the corpus, closely followed by Three Little Pigs with 14. Lancashire
Fairytales totals 61,317 words which are roughly evenly distributed between
Lancashire and non-Lancashire speakers (32,344 and 28,973 respectively). Although
Lancashire Fairytales is a relatively small corpus when compared to the other corpora
used in this thesis, it nonetheless provides an important source for comparison with
the other corpora and also with itself by contrasting the Lancashire and non-
Lancashire responses. The comparison of non-Lancashire and Lancashire parts of the
Fairytale corpus is set out in §5.4.5. In order to compare it with the other corpora,
initially only the Lancashire section of the fairytale corpus is used in the analysis.
5.3.3 Interpreting corpus results
The method outlined here describes both how corpus results are analysed in this
chapter and also how they could be analysed in corresponding corpora from other
varieties if this methodology were to be employed.
If a diachronically comparable dialect literature corpus and spoken corpus
are compared, nonstandard grammatical variation found in the dialect literature corpus
will also be found in the spoken corpus to some degree. 19 This is because it is logical
to suggest that grammatical variation used by dialect speakers when they talk also forms
part of what they conceptualize as the dialect. Perhaps more interestingly, if
nonstandard grammatical features are found in the dialect literature but not in the
spoken corpus nor in any reference corpus, then these features are either archaic
dialectal features that are no longer used, or good examples of salient
features that are rarely found in the spoken corpus, perhaps due to social values
ascribed to them by the speech community.
Also interesting are those nonstandard features that are found in the spoken
corpus but not in the reference corpus or the dialect literature. This distribution is
best expressed in Figure 5. These features may well be nonstandard but do not (as yet)
have any social values attached.
19 Corpora collected at the same time from the same set of informants would probably give results that
withstand influence from variables such as diachronic change and intraspeaker variation. In the case of
the Lancashire data used here (namely Sound Archive and Litcorp) this was not possible, and in part
motivated the decision to compile a new collection of dialect literature.
FIGURE 5. INTERPRETING CORPUS COMPARISONS
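The interpretive logic set out in this section can be summarised as a small decision function. This is a simplified sketch: the function name interpret and its wording are hypothetical, and it treats attestation as a yes/no value, whereas the analysis in this chapter works with relative frequencies. The handling of the reference corpus follows the role given to the BNC in §5.3.

```python
def interpret(in_dialect_lit, in_spoken, in_reference):
    """Suggest a reading for a nonstandard feature from where it is attested."""
    if in_reference:
        # Assumption: a feature well attested in the reference corpus is
        # treated as general rather than specific to the dialect.
        return "frequent in English more generally"
    if in_dialect_lit and in_spoken:
        return "used by speakers and conceptualized as part of the dialect"
    if in_dialect_lit:
        return "archaic, or salient but rarely used in speech"
    if in_spoken:
        return "nonstandard but (as yet) without social values attached"
    return "not attested"

print(interpret(in_dialect_lit=True, in_spoken=False, in_reference=False))
```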
5.4 Results and discussion
Since a large number of variables are explored in this analysis, it is impossible to
represent the results from all three corpora together. Indeed, the methodology
employed here does not lend itself to comparing variables in this way. Instead, similar
results are clustered together depending on their distribution across the three corpora.
In some cases, particular words or parts of speech are discussed separately (e.g. was
and were variation are discussed as two separate variables, rather than as part of
variation with BE more generally). Other non-grammatical features (e.g. lexical choice
and semi-phonetic spelling) are discussed in §5.4.4, which considers (in brief) both
dialect words found frequently in the UK (e.g. owt and nowt) along with other
Lancashire specific dialect words, e.g. nobbut (no more than, nothing but) and gradely
(excellent) and the implications of semi-phonetic respellings.
5.4.1 Features found across all corpora
Results presented in this section occurred in the Litcorp, Lancashire Fairytales and
Sound Archive corpora in a fairly even distribution. In this section only, results from
the BNC are also included as a reference corpus in order to check that possible salient
features are typical to Lancashire, rather than being part of nonstandard variation
found in Standard English more generally.
Perhaps unsurprisingly, definite article reduction/deletion occurred across all
corpora. This feature is perhaps typically associated with Lancashire; in the
contemporary humorous dialect literature written about this region, definite article
reduction/deletion is referred to as “the first basic rule of speaking Lanky” (Dutton,
2002:6). The corpus results are shown below in Table 1.
                               Litcorp   Fairytale   Sound Archive   Mean score
definite article reduction       +0.08       +2.67           -2.75         4.98
definite article deletion        -1.60       -1.01           +4.01         2.99
TABLE 1. DEFINITE ARTICLE REDUCTION/DELETION
In the dialect literature corpora, the definite article is reduced to both
t’ and th’, as shown in (4), and also fused with a preceding in or on, often written as
ont and int, as shown in (5).
(4) Margit had lost a deol o’ wynt by th’ time hoo geet to th’ surgery, but as
luck ud have it, th’ doctor were in. (Litcorp)
(5) And off he went down t’road, holdin onto the clog that she’d left ont ground
[…] (Lancs_0017)
Both of these constructions appear to be rare in the BNC data, although reduced forms
are of course easier to quantify (121 instances of t’ are found compared to the
6,041,234 instances of the). The comparative distribution of the reduced and deleted
forms of the is interesting to note. In both Litcorp and Lancashire Fairytales the
reduced form (t’ or th’) is used more frequently than the zero form. This is particularly
noticeable in Lancashire Fairytales suggesting perhaps that contemporary writers
consider this more salient than the zero form. It could also be the case that this
reduced form is more impactful than zero (a point raised in §2.3.1 with respect to zero
relatives) – if the writers are trying to represent their dialect, perhaps it is more
meaningful to include a reduced form that is noticeable on the page rather than a zero
form. Tied in with this, the Sound Archive shows the biggest variation between
definite article reduction and definite article deletion. This perhaps complementary
distribution could indicate that while speakers are aware of the reduced form (as it
also occurred frequently in the dialect literature) this may be a construction that they
accommodate away from (see Hollmann and Siewierska (2011) for a further
discussion of factors such as frequency and social identity).
Both nonstandard was and nonstandard were are frequent across all corpora
and are found more frequently in the Sound Archive than other corpora. The corpus
results are shown in Table 2.

                     Litcorp   Fairytale   Sound Archive   Mean score
nonstandard were       -4.84       -2.72           +7.56        13.34
nonstandard was        -1.61       -0.99           +2.59         4.11
TABLE 2. WAS/WERE VARIATION
This distribution perhaps suggests that this variant, while still perceived as
nonstandard, is actually produced more than it is perceived. This would indicate this is
perhaps less strongly associated with this dialect variety than definite article
reduction/deletion. As found by Hollmann and Siewierska (2006), and also earlier in
this thesis (see §4.4.4), levelling to were is found more frequently than levelling to
was, in both positive and negative sentences. Examples are shown in (6-7).
(6) But er, I went to woodwork, I weren't very happy. I don't like the
smell of new wood actually but er that might be a throwback, I don't
know. (Sound Archive)
(7) It were awlus feightin’, an I were never eaut o’ trouble. (Litcorp)
Was/were variation is reported extensively in other dialects of English, e.g. in
London (Cheshire & Fox, 2006) and in the English Fens (Britain, 2002). It may be the
case that although this variable is frequently produced by Lancashire dialect speakers,
it is not perceived by them as a salient part of their dialect, or at least not as salient as
compared to e.g. definite article reduction/deletion.
Never as a past tense negator is also found across all corpora, although most
frequently in Litcorp. Kortmann and Szmrecsanyi outline this nonstandard variation as
the second most frequently found pattern of nonstandard grammatical variation
present in varieties of English worldwide (2004:1154). It is therefore unsurprising that
it is found in the Lancashire corpora. Results for never are shown in Table 3, and
instances of it in the Lancashire data are shown in (8-9).

                            Litcorp   Fairytale   Sound Archive   Mean score
past tense negator never      +9.78       -8.37            -1.4        33.72
TABLE 3. PAST TENSE NEGATOR NEVER
(8) There were a peacock outside of Townley Hall. I never remember 'em being
two. No there were only one. (Sound Archive)
(9) When I geet here th’ chap hadn’t come to meet me, an’ he never turned up
aw day. (Litcorp)
This pattern appears infrequently in the BNC. This suggests that, as Lancashire dialect
writers have included it and it is also found in the spoken corpus, it is likely to be a
salient feature of Lancashire. It is important to note that salient features of
Lancashire here are not limited to those features found exclusively in this region
and no other(s).
A number of other nonstandard features were also found across all corpora,
and many of these also appear in the literature as nonstandard features that are
common across British varieties of English (e.g. Kortmann and Szmrecsanyi,
2004:1154-55). Of course, just because they are found in other varieties does not
mean that they are not salient in Lancashire. These included demonstrative them and
what relativization, as shown in (10) and (11) respectively.
(10) Them’s o’ reet,” said Young Winterburn, “for little lads” (Litcorp)
(11) An she wur a luverly lass wot lived wi all these dwarves. (Lancashire
Fairytales)
5.4.2 Features found in dialect literature
Results for Litcorp and Lancashire Fairytales display very different
distributions. Of the nonstandard features examined in this chapter, none was
present in Lancashire Fairytales without also being found in other corpora.
This is perhaps unsurprising and means that writers in Lancashire Fairytales are
not using any of the grammatical variation tested here that does not also feature in the
spoken language of the Sound Archive corpus or the written language of
Litcorp. While none of the nonstandard forms tested occurred exclusively in
Lancashire Fairytales, idiomatic constructions such as ey up me
duck and put wood int hole were frequent in this corpus but not found in the Sound
Archive or Litcorp. This suggests that these constructions are salient to Lancashire,
but are perhaps in a different category to salient words that would typically be used in
everyday speech. It may be the case that these more idiomatic constructions are
enregistered as signifiers of this variety in much the same way that e.g. “why aye
man” is in the north east of England.
Part of the restricted set of constructions found in Lancashire Fairytales may be
related to the task, and this is discussed in §5.5.
Right + adjective is more common in Lancashire Fairytales than in any
other corpus. There are no occurrences of this
construction in Litcorp, as shown in Table 4.
                               Litcorp   Fairytale   Sound Archive   Mean score
adverbial right + adjective     -11.05      +20.00           -8.95        11.05
TABLE 4. DISTRIBUTION OF ADVERBIAL RIGHT + ADJECTIVE
This suggests that for current Lancashire speakers this construction is salient, and
perhaps its infrequent use in Sound Archive may be due to the social values that are
assigned to it. Adverbial right was frequently found with nonstandard were, as
shown in example (12).
(12) […] and I had a uniform, oh it were right posh, I had a green uniform and it
buttoned all way up the side with er fancy buttons (Sound Archive)
This suggests that it is possible adverbial right may influence the use of
nonstandard were or vice versa, or indeed the larger construction ‘NP were right Adj’
may be a salient construction in itself. Further corpus investigations and perhaps
elicitation tests would be needed to verify this claim.
As found in previous chapters, nonstandard spellings were present throughout
Litcorp (and to some degree, Lancashire Fairytales) and are discussed in brief in
§5.4.4.
A number of results are found with a stronger distribution within the older
dialect literature (Litcorp) than in the other corpora. Most typically, variation found in
this category involves forms that are now archaic. These nonstandard
constructions are represented below in Table 5.

                                          Litcorp   Fairytale   Sound Archive   Mean score
archaic 2nd person pronouns                +46.68      -22.37          -24.17        24.42
archaic verb forms                         +12.95       -6.45           -6.49          6.7
nonstandard past tense irregular verbs     +45.69      -22.06          -23.02        23.02
TABLE 5. ARCHAIC FEATURES FOUND PREDOMINATELY IN LITCORP
Archaic personal pronouns were not found in the Sound Archive at all, and
were also rare in Lancashire Fairytales. Archaic verb forms such as dost, art and hast
are also found significantly more frequently in Litcorp than in the other corpora.
Results for nonstandard spellings were included (e.g. dost also includes any results for
durst and verbal uses of dust). Contracted ’st and ’rt forms were found in the Litcorp
data, as shown in (13).
(13) “Theaw’rt some perculiar mannert Jackonapes I’ll uphowd” sed hoo;
“Ney, ney, I’st naw grope in the Breeches not I.” (Litcorp)
Irregular past tense verbs such as knowed, etten and forgetten were found in
Litcorp but very rarely in the other corpora. Here a distinction is made between
nonstandard forms (e.g. getten instead of got) and merely nonstandard spellings (e.g.
alleawed instead of allowed). The distribution displayed by these archaic features
could indicate one of two things: these features were more ‘standard’ at the time of
writing and so occur in Litcorp in much the same way that features of Standard
English occur across all corpora now (indicating a diachronic change), or, these
features were considered to be a salient part of the dialect at that time, but now are
not. A closer look at the dialect literature reveals that these pronoun forms are found
more frequently with semi-phonetic spelling than with standard spelling. In particular,
theau occurred in Litcorp 951 times compared to the 135 instances of thou. This
would suggest that these pronouns are associated with Lancashire based on the earlier
hypothesis that nonstandard spellings may indicate salient features.
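As a quick worked check on the scale of this difference, the two counts just reported can be turned into a proportion. This sketch considers only the two spellings mentioned here (theau and thou) and ignores any other variants of the pronoun, so it is an illustration rather than a full count.

```python
theau, thou = 951, 135  # Litcorp counts reported above

# Share of these pronoun tokens that carry the semi-phonetic spelling.
respelled_share = 100 * theau / (theau + thou)
print(f"{respelled_share:.1f}% of these tokens are respelled")
```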
A number of other features were more frequently found in Litcorp as
compared to the other corpora; these are shown in Table 6.

                             Litcorp   Fairytale   Sound Archive   Mean score
present participle sat        +13.52        0.00           -1.55         4.00
present participle stood      +25.12       -9.05           -7.03         9.05
past reference come           +52.44      -14.00           -2.00        34.84
TABLE 6. OTHER NONSTANDARD FEATURES FOUND PREDOMINATELY IN LITCORP
(14) And the princess come fleeing out of dancehall, just as clock were striking.
(Fairytale – Lancs)
(15) There were a lowf fro’ th’ lobby, an’ Ferret Eon said nowt, though some
colour coom in his face, as th’ farmer bid him Good-neet. (Litcorp)
The features presented in Table 6 are found frequently in Litcorp but also are found in the
Sound Archive data, suggesting that unlike those in Table 5, these features are not
archaic, but perhaps feature in Litcorp due to reasons of style. This is discussed further
in §5.5.
5.4.3 Features found in the most recent corpora
Results presented in Table 7 are found frequently within the newer corpora
(Fairytale and Sound Archive) but are not found in the older Litcorp.

                               Litcorp   Fairytale   Sound Archive   Mean score
possessive me + noun            -15.00      -12.19          +42.21        15.00
absence of plural marking        -9.24      -10.02          +19.26        11.24
what as a subject relative       -5.85       -1.52           +7.37         7.85
adverbial quick                 -20.13      -17.54          +37.67        23.13
TABLE 7. FEATURES FOUND PREDOMINATELY IN RECENT CORPORA
The results shown in Table 7 are those which are used by Lancashire
speakers but are not considered by them to be a salient part of their dialect. A number
of these are features which are perhaps found in varieties of English more widely,
such as adverbial quick and absence of plural marking. Others may typically be used
by speakers of this region in particular, such as me + noun or subject relative what.
Either way, most of the results included here are not represented significantly in the
dialect literature, which perhaps indicates that, as yet, they are relatively free from
social values. To use Labov’s terms (see e.g. 2001), these results are perhaps
indicators.
5.4.4 Other features
A number of features that did not fit easily into the methodology adopted here were
found in the corpora; many of these were more stylistic or discourse based. A majority
of these features were found in the Sound Archive, such as dislocation (16), this
here/that there (17) and discourse marker see (18).
(16) […] and in addition to that it was very prevalent was this, because Huncoat
was divided in two by imaginary line from where we’re sitting now (Sound
Archive)
(17) Well er as they said they were always thinking about this here ghost but we
never saw any ghost. (Sound Archive)
(18) So I, I said, I want it for Christmas, so I'll put it away for you, see.
Variation such as this may be indicative of a particular ‘spoken Lancashire style’ and
this is an area that would benefit from further research.
There were also many instances where particular nonstandard words (rather
than grammatical features) were used by the speakers or writers. Among the most
frequent were owt and nowt. The frequencies of each of these are shown in Table 8.

        Litcorp   Fairytale   Sound Archive   Mean score
owt      +47.38       -9.93          -37.45        41.95
nowt     +39.59       +5.77          -46.35
TABLE 8. DISTRIBUTION OF OWT AND NOWT
The owt results from Litcorp had to be sorted manually, due to results like (19).
(19) Heawsumever, little Emma were a favourite wi’ Ginger; he awlus breetened
up a lot when hoo went to his shop, an’ she very oft coom owt wi a cake or
some towfy as Ginger had trated her to. (Litcorp)
A dominance of owt and nowt in the written corpora perhaps means that these forms
have a particular social value ascribed to them, one that leads speakers to avoid them
in their spoken language.
Alongside these nonstandard words, other more region-specific dialect words
were found across the corpora, although not hugely frequently. A number of examples
of these are given in (20-22).
(20) I know it's going to be a bit of a job because I were nobbut a lad when I left.
(Sound Archive)
(21) But th’ sun’s gradely hot; it make’s one sleepy, doesn’t it ? (Litcorp)
(22) Once upon u time, thur wur a littl’ chitty named Thumbelina. (Lancashire
Fairytales)
These features are perhaps similar to the more idiomatic constructions found earlier
(e.g. ey up) and so are a strong sign of this regional variety.
As found in previous chapters (and of course discussed in more detail by e.g.
Honeybone and Watson, forthcoming), the variant spelling found in the dialect
literature corpus can be considered as a conscious decision by the writer to represent
the phonology of the language used. While phonology is not the focus of this chapter
(or indeed this thesis), variant spellings occurred so frequently in the data they most
certainly warrant at least an overview. Some of the most frequent respellings are
shown in Table 9.
Phonological feature   Example
[əʊ] → [ɔ]             He geet up then, an’ th’ clock struck eight, but when he went to
                       oppen th’ dur for th’ milk. (Litcorp)
[з:] → [ə]             her wur a tinker wur Jack, an off ‘e went wit best ceaw deawn
                       t’market. (Lancashire Fairytales)
[t] → [ɹ]              They’re bothered abeaut gerrin’ shoon to fit tint, an’ thine’s just th’
                       pattern. (Litcorp)
[a] → [ɔ]              Awonder’t what wur up when th’ post-chap coome hommerin at th’
                       dur o Monday morning’ (Litcorp)
[e] → [ɔ]              But I mony a time wished I’d never seen it, for it caused me mony a
                       freet, an’ made me so narvous I’st never get o’er it. (Litcorp)
[u:l] → [u:]           They said th’ skoo wur full, an’ a lul had had to goo away. (Litcorp)
TABLE 9. A SAMPLE OF THE NONSTANDARD PHONOLOGICAL FEATURES FREQUENT IN
DIALECT LITERATURE
The most frequent respellings appear to be representations of the
phoneme [əu], which occur most frequently in Litcorp in theau (thou), abeaut (about)
and deaun (down). These three instances alone totalled 1841 results in Litcorp, with
their standard counterparts totalling only 394. Reduction and deletion of word-final
consonants were also very frequently represented in the texts, either by the omission of
letters or with apostrophes, as shown in example (23).
(23) Well, tell thi’ mother to soak a piece o’ flannel i’th’ milk, an’ le th’ choilt
suck it. (Litcorp)
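The scale of the respelling reported above for theau, abeaut and deaun can be expressed as a simple proportion. The sketch below is illustrative only: it assumes that the proportion is calculated as nonstandard tokens over all tokens of the variable, and the function name is hypothetical rather than part of the method used in this thesis. The two counts are the combined Litcorp totals given in the text.

```python
def nonstandard_proportion(nonstandard: int, standard: int) -> float:
    """Share of all tokens of a variable that are nonstandard (assumed formula)."""
    total = nonstandard + standard
    if total == 0:
        raise ValueError("no tokens of this variable")
    return nonstandard / total


# Combined Litcorp totals reported in the text for theau, abeaut and deaun
# and for their standard counterparts (thou, about, down).
respelled = 1841
standard = 394

share = nonstandard_proportion(respelled, standard)
print(f"{share:.1%} of tokens are respelled")
```

On these figures, roughly 82% of the tokens of these three words appear in respelled form.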
While only a few phonological features have been outlined here in order to
demonstrate how phonetic respellings can also be used to indicate the salient
phonological features in this region, the potential for further analysis (perhaps along
the lines of that conducted by Honeybone and Watson, forthcoming) is in no doubt.
5.4.5 Lancashire Fairytales - comparing Lancs and non-Lancs
While only the Lancashire part of Lancashire Fairytales has been used in the analyses
presented so far, interesting results can be found by contrasting the Lancashire and
non-Lancashire respondents. Much of this data showed a large element of crossover,
with features such as definite article reduction/deletion, levelling to were (and also to
was) and lexical choices such as owt and nowt used frequently in both sections of the
corpus, as we can see in the two extracts from Three Little Pigs, from the
non-Lancashire and Lancashire sections respectively, in (24-5).
(24) Once upon er time there were three little pigs who lived in a right nice ‘ouse.
T’house was made with straw. (Lancashire Fairytales – non-Lancs)
(25) Once upon a time theyre wur three lickle pigs. These here pigs lived thur
days int luvley ouse made uh straw an ‘ay. (Lancashire Fairytales – Lancs)
Even in these two very short examples that are telling the same narrative we can see
differences between the two texts. On the whole, Lancashire writers often aimed to
represent their phonology via variant spellings, and tended to include a more selective
and sensitive application of nonstandard variables. Non-Lancashire writers on the
other hand often had a smaller selection of features that they seemed to consider as
‘Lancashire’ (namely definite article reduction/deletion, a number of idiomatic
constructions, dialect words) but often included them in a haphazard or arbitrary way.
In order to explore this further, a breakdown of nonstandard grammatical features
found in Lancashire Fairytales is shown below. While it is difficult to compare
features with each other due to their relative frequencies of occurrence, the table is
still useful in showing the difference between the two parts of the corpus.
Feature                              Lancs   non-Lancs
definite article reduction             900         976
nonstandard were                       255         142
adverbial right + adjective            177          33
definite article deletion              144          80
dialect words                          105           9
past reference come                     99          27
possessive me + noun                    86          18
archaic 2nd person pronouns             59          18
archaic verb form                       49           0
dislocation                             43           3
absence of plural marking               40          12
subject relative what                   32          27
nonstandard was                         23          12
nonstandard irregular lexical verb      18           0
TABLE 10. RAW FREQUENCIES OF NONSTANDARD FEATURES IN THE LANCS AND NON-
LANCS PARTS OF LANCASHIRE FAIRYTALES
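One way of reading Table 10 is to ask, for each feature, what share of its tokens comes from the Lancs part of the corpus. The following is a minimal sketch of that calculation and not part of the analysis itself: the names TABLE_10 and lancs_share are hypothetical, and because the raw counts are not normalised for the size of the two parts of the corpus (which is not given here), the shares are only a rough guide.

```python
# Raw frequencies copied from Table 10: feature -> (Lancs, non-Lancs)
TABLE_10 = {
    "definite article reduction": (900, 976),
    "nonstandard were": (255, 142),
    "adverbial right + adjective": (177, 33),
    "definite article deletion": (144, 80),
    "dialect words": (105, 9),
    "past reference come": (99, 27),
    "possessive me + noun": (86, 18),
    "archaic 2nd person pronouns": (59, 18),
    "archaic verb form": (49, 0),
    "dislocation": (43, 3),
    "absence of plural marking": (40, 12),
    "subject relative what": (32, 27),
    "nonstandard was": (23, 12),
    "nonstandard irregular lexical verb": (18, 0),
}


def lancs_share(lancs: int, non_lancs: int) -> float:
    """Proportion of a feature's tokens found in the Lancs part (raw counts)."""
    return lancs / (lancs + non_lancs)


shares = {feature: lancs_share(*counts) for feature, counts in TABLE_10.items()}

# List the features from most to least concentrated in the Lancs part.
for feature, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    lancs, non_lancs = TABLE_10[feature]
    print(f"{feature:<36}{lancs:>6}{non_lancs:>10}{share:>9.1%}")
```

On these raw counts, definite article reduction is the only feature for which the Lancs part contributes fewer than half of the tokens.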
Perhaps most surprising is the distribution of definite article reduction, with
more instances in the non-Lancashire section as compared to the Lancashire part of
the corpus. A closer look at the non-Lancashire texts reveals that often the reduced
form is used in every single possible instance, even when barely any other
nonstandard variation is used, as shown in the example in (26).
(26) And t’girl was called Little Red Riding Hood. And one day when t’sun was
shinin’ she went in t’forest and was looking for t’house where her Grandma
lived. T’house was only small and it were hidden by t’trees. (Lancashire
Fairytales, non-Lancs_0006)
This suggests that this form is certainly strongly associated with Lancashire dialect,
both by Lancashire and non-Lancashire speakers, although Lancashire speakers are
more selective in their application of this nonstandard form, and by contrast show a
comparatively higher number of instances of definite article deletion.
One of the most interesting (and surprising) aspects of the Lancashire Fairytale
corpus was found not in the grammatical variation displayed by the writers or the
nonstandard spelling representing the phonology, but in the content of the stories
themselves. Many of the stories (predominately those in the Lancashire section)
involve some change or embellishment to the expected narrative despite this not being
mentioned in the instructions to participants. For example, 5 of the 16 Cinderella
stories involved glass clogs instead of slippers. Other stories mentioned the
surrounding area (e.g. two different writers describe Grandma from Little Red Riding
Hood as living in Grizedale forest). Others describe living conditions and scenery in
unexpected detail, often including cobbled streets, mills and local foods (including, on
one occasion, the Wicked Witch offering Snow White some tainted hotpot rather than
an apple). One example showing this local influence is given below in example (27),
where Jack has to sell the cow due to ‘trouble at mill’.
(27) “owdo Jack” she says, “Wossupwithi?” Jack ‘ad com in leukin like e’d seen
nobbut strife. “By eck, trouble at mill” says Jack. “I’ve been given t’shove”.
“Tha't backerts thee!” she said. “We’ll hav t’sell ceaw! Get thur self pulled
reaunt an mek sharp down t’market.” (Lancashire Fairytales, Lancs_0016)
The link between language and identity is clear here, with writers showing an
obvious connection between what they consider to be Lancashire themes (with
particular reference to times gone by) and the Lancashire dialect. Here Lancashire
writers seem perhaps to be influenced by the genre described as Contemporary
Humorous Localised dialect literature, a genre now well established for many dialects
of English (both in the UK and beyond). The contrast between the clearly stereotyped
(and often archaic) written forms produced by the writers of the Lancashire Fairytales
corpus (such as that shown in (27)) and their own speech is interesting.
There are no instances in any of the spoken corpora displaying either the range or the
density of the nonstandard variation found in the dialect writing. It is therefore evident
that writers of the Lancashire dialect literature are consciously using a set of linguistic
forms and constructions that enact a socially recognised register (as outlined by Agha,
2003 and Johnstone et al., 2006), namely what they conceptualize as Lancashire.
This chapter has aimed to both uncover the salient grammatical features in Lancashire
as found in the Lancashire corpus data and to propose a suitable methodology to arrive
at this outcome.
Results from the analysis of the corpus data have revealed distinct differences
in the distribution of grammatical features across the corpus sources. This indicates
that not all nonstandard variation produced by Lancashire dialect speakers is indeed
perceived as being salient (and therefore included in the dialect literature). Figure 1
attempted to graphically represent this concept.
A number of the nonstandard features tested were frequent across all of the
Lancashire corpora. The most prevalent of these were nonstandard was and were,
definite article reduction/deletion, and past reference come. It is therefore suggested
that these features are salient to Lancashire speakers, but not so strongly associated
with Lancashire that their frequency in the Sound Archive is diminished by
possible accommodation.
Other features were apparent in the dialect literature corpora but not in the
Sound Archive. Two possibilities exist for these constructions: either they are archaic
(e.g. those found in Litcorp), or they are perceived as very salient and so are perhaps
avoided by speakers when in conversation. Aside from those that were attested in the
literature as being archaic, this category contained the use of more idiomatic
constructions such as “ey up me duck” and dialect words and phrases such as “gradely
int it!”. As these constructions are perhaps enregistered as very clearly being part of
the Lancashire dialect, it is unlikely that they would be found in natural conversation,
unless perhaps in a humorous way.
A number of features were present in the Sound Archive but not in the dialect
literature. This suggests that while these features are used by Lancashire dialect
speakers, they are yet to acquire a social value. These variants are an interesting
category and may point to ‘ones to watch’ if conducting a longitudinal survey of
salience in one particular region.
Perhaps the most surprising results from Lancashire Fairytales emerged from
the rewriting of the narrative of the fairytale in order to include some element of the
Lancashire area, its customs or cuisine. This aspect was not overtly indicated in the
question, but clearly shows the link between the Lancashire dialect and identity for
many of these informants.
The methodology used in order to highlight the differences between produced
and perceived variables was useful but not without limitation. One problem is
circularity. How can we know if features are salient by looking at corpus data, if
speakers who contribute to that corpus data also know that certain features are salient
too and so actively up/downplay them? - a point also outlined by Kerswill and
Williams (2002:104). A closer consideration (and perhaps measurement) of
accommodation could certainly add to the methodology outlined here. Further, by
focusing on a larger number of constructions, we have undoubtedly overlooked
nuances, which could potentially be revealing. The influence of the task may also
have had an impact on the distribution of constructions. For example, it may be the
case that a lower frequency of me + noun was found in Lancashire Fairytales simply
because the writers did not have the opportunity to use the possessive construction when
writing a fairy story. The corpora used for this analysis may also have impacted upon
the outcomes; a comparison of the perception and production (i.e. speech and writing)
of the same group of speakers would control variables such as accommodation, and
possible diachronic change. Additionally, further elicitation tests and attitudinal
studies, along with perceptual dialectology may allow a clearer picture to emerge of
the grammatical constructions that are salient in the Lancashire region. Nonetheless,
the data examined in this chapter and the conclusions put forward about the
production, perception and relative salience attributed to the various grammatical
features considered here are clearly consistent with previous research, e.g. Kerswill
and Williams (2002).
While only a handful of phonological features were explored very briefly in
§5.4.4, this demonstrated how phonetic respellings in the Lancashire dialect literature
could be analysed in order to uncover the salient phonological features in this region.
Analyses such as this would allow a broader picture of variation of all types in
Lancashire to be outlined, and would provide results that would complement the
grammatical variation as set out in this chapter.
Chapter 6. Concluding remarks
This thesis has provided a fine-grained description of a number of grammatical
features found in the previously under-explored Lancashire dialect data, whilst also
examining the implications of nonstandard data for wider theories of language
variation and change. The approach adopted here is new in that it combines a variety
of data types (see §1.3 for more details on this), and explores the contribution that a
large corpus of dialect literature, along with other methods, can make in uncovering
regional grammatical variation.
The contribution of this study lies not only in profiling both existing and
historical features of the Lancashire dialect, but also in the use of multiple methods of
quantitative analysis. While the empirical basis of dialectology is an obvious
necessity, the use of a considerable spoken corpus in conjunction with both historical
and current dialect literature as well as elicited information is unique. It is my
contention that this approach has provided valuable insight into how multiple methods
can improve the scope and validity of any possible conclusions.
The combination of new methodologies and data outlined here has shown that
oral history interviews can be a useful avenue for testing linguistic theories (provided
that these are handled with care) and that dialect literature can, to some extent, be used
to counterbalance a lack of both historical spoken resources and historical written
evidence about the dialect in question. Dialect literature, when treated as a collection
of the most salient features of a variety as judged by that writer, can offer insights into
sociolinguistic salience.
Possible biases in the corpus data provided a rationale for supplementary
methods also being employed in this thesis. Acceptability questionnaires enabled
specific constructions that were found to be infrequent in the corpus data to be
targeted and explored in more detail. This was particularly effective when considering
rare phenomena such as zero relatives or the NSR. The online hosting of these
questionnaires and the sourcing of participants via social networking websites meant
that a large number of participants were reached, thus giving more robust and
representative results – an approach that could have implications for further
sociolinguistic data collection. Along with the traditional dialect literature, a new
corpus of dialect literature was compiled by inviting participants to write a story in
what they considered to be Lancashire dialect. This approach, which combined
aspects of both elicitation and dialect literature, is also unique to this study and
allowed further insights into differences between the perception and production of
nonstandard variables, in particular with reference to sociolinguistic salience. The
wide variety of sources utilized in this thesis created multiple opportunities for
analyzing the data in question from different perspectives and arriving at a much more
comprehensive understanding of what dialectal features actually are.
Chapter 2 used the Lancashire data to test a number of assertions that are
typically found in the literature on relativization in Standard English, such as the
correlation between relativizer type and restrictiveness. Results show that
relativization in Lancashire displays variation different from that described in Standard
English and is, on the whole, less constrained. This chapter also introduced new
information on the distribution of zero relative clauses in Lancashire; a construction
which is typically difficult to retrieve from corpus data alone. A sentence-linking task
where informants had a free choice of which relativizer to use in linking clauses
showed that, at least to some degree, the relativizer what is productive in this region,
contrasting perhaps with other results from northern regions of England.
Chapter 3 provided a semantic and syntactic analysis of the previously
undocumented HAVEn’t to construction, a polysemous construction found in
Lancashire that displays meanings that can be similar to both DOn’t HAVE to and
MUSTn’t, depending on the context of use. The results show that the semi-modal
HAVEn’t to has changed over time, and now tends to behave more like a core modal
verb for Lancashire speakers. In a majority of cases in the Sound Archive data, it
displays a meaning that is closer to the stronger modal verbs MUSTn’t or SHOULDn’t
rather than to the weaker DOn’t HAVE to.
Along with describing this construction, Chapter 3 tested how possible it was
to use different sources of dialect data in the analysis of diachronic change. The
results are open to several interpretations. The semantic and syntactic arguments
clearly show that, synchronically, the HAVEn’t to construction has become
grammaticalized in the Lancashire dialect data, but diachronic changes in the data are
more uncertain. This uncertainty may be due to either the relatively low frequency of
this construction overall, or, to the somewhat problematic nature of the comparison
between the written Litcorp (as opposed to a historical spoken source) and the spoken
Sound Archive data.
Leading on from the analysis of HAVEn’t to, Chapter 3 then turned to the wider
construction family, i.e. those constructions that have similar semantics and syntactic
properties (e.g. MUSTn’t, SHOULDn’t, NEEDn’t); an approach that has yet to be widely
adopted in sociolinguistics. The aim here was to explore the concept of constructional
competition in order to determine whether grammaticalization may have played a role
in the development of this construction (and its construction family) more widely in
Lancashire.
The verdict on constructional competition is not entirely clear, but the point
remains that often a number of similar constructions, as opposed to just
two opposing variants, can fulfil a similar semantic function, and that the interaction
between these variants is complex and cannot easily be accounted for by e.g. the
S-curve model of language change (Kroch, 1989). This analysis of a number of
competing variants no doubt has implications for studies of language change and
sociolinguistics, suggesting that a wider scope of focus will often be necessary when
looking at diachronic change.
Chapter 4 analysed verbal agreement in Lancashire, examining in particular
the (so-called) Northern Subject Rule. This chapter provided a new account of this
phenomenon in Lancashire based on both the analysis of corpus data, and
acceptability judgements from questionnaires.
Contrary to the analyses of previous researchers, this analysis revealed that
while variation with 3sg agreement is prevalent in Lancashire, the situation is far too
complex to be accounted for by a single rule such as that ascribed by the NSR. My
analysis finds that instances of present tense indicative variation that appear to be
instances of the NSR are extremely rare in the Lancashire data, particularly in the
more modern Sound Archive corpus.
Crucially, this chapter also analysed the semantics surrounding the NSR. It
attempted to uncover whether or not habitual constructions had been fully appreciated
by other researchers, and the extent to which such constructions (which, importantly,
are very frequent in Lancashire) impact upon the status of the NSR. In Lancashire
corpus data, nonstandard verbal agreement often directly flouted the subject
position and subject type restrictions specified by the NSR. Evidence from the
acceptability questionnaire corroborated these results; Lancashire respondents
indicated a higher acceptability score (versus other respondents) for adjacent non-3sg
pronouns with 3sg agreement. With this in mind, the analysis showed that there is
great difficulty in differentiating NSR from other similar constructions, and that this is
not something that should simply be passed over. While the semantic difference in the
NSR as compared to the historical present construction is a question of relative time
(and can usually be resolved by examining the wider context of the text or utterance),
this is trickier for habitual constructions. I argue that it is very possible that in regions,
such as Lancashire, where it can be proven that the usage of -s in the habitual aspect is
frequent (with or without adverb phrases), 3sg forms may have been re-analysed by
speakers as a habitual semantics marker and extended into pronoun-adjacent contexts
rather than being a marker of agreement. This assertion may undermine the validity of
some previous claims about NSR set out in the literature.
Chapter 4 also confirmed the position outlined in Hollmann and Siewierska
(2006) in finding that levelling to was, but more frequently (and interestingly) to were,
is possible in all person/number/polarity contexts – a distribution that is comparatively
rare in most varieties of English.
Chapter 5 proposed a contrastive corpus-based approach to salience – a
concept that was previously untested. The methodology involved comparing a large
corpus of produced variables (i.e. speech) to a large corpus of perceived variables (i.e.
dialect writing) in order to test sociolinguistic salience. The methodology proposed
here is original. Although examining only the dialect literature corpora would no
doubt have yielded many interesting results, what is perhaps more interesting is the
difference between the nonstandard constructions present in the written data and those
that are actually spoken in the Sound Archive.
This chapter advocated the use of dialect literature as a key component in
unearthing grammatical (and other) patterns that are salient features of the Lancashire
dialect. This comparison not only allowed salient constructions in the data to be
described, but also revealed constructions that are salient but do not occur in speech
(perhaps in part due to their social value or status) and also constructions that are
nonstandard yet do not currently have ascribed social values. Some grammatical
patterns appeared in both texts, but the distribution displayed different weightings
indicating preferences for either written or spoken data. Results such as this can allow
finer distinctions to be made between constructions, rather than simply classifying
them as salient or not salient which is highly promising with respect to future corpus-
based research.
The corpus-based method used in Chapter 5 was not without some limitations;
certain more discourse-based features were not able to have their nonstandardness
proportion calculated due to the lack of appropriate standard equivalents. Also,
perhaps even stronger results could be achieved if both written and spoken data was
collected from the same group of informants, and, if possible, aligned for
considerations such as tense and topic.
This aside, this method also pointed to extensive possibilities for research into
semi-phonetic respellings and nonstandard vocabulary items which occur only in the
“perceived” dialect literature, and which were treated only in brief here due to the
restrictions of this chapter. Consideration of these, and also of the
grammatical variation, as possible representations or indices of identity also merits
further attention.
Overall, this thesis has not only described grammatical variation in Lancashire
but has set out to emphasize the importance of corpus-based dialect grammar for
linguistics in general (see also Hollmann and Siewierska, 2011, and Hollmann, to
appear, who focus specifically on the importance of frequency effects and schemas).
An important underlying theme of this study has been the testing of linguistic claims
using large corpus resources. It is clear that the method of achieving significance for a
claim depends on the nature of that claim. A simple claim, such as the existence of a
construction, requires only simple searches and statistics to show that it exists in the
data. Problems arise with evaluating assertions which are more complex, relating to a
number of conditions or features, often overlapping (as demonstrated by, for example,
the NSR). It seems clear that in order to confirm suggested trends and/or rule out
competing hypotheses, considerable empirical data of different types and from
different sources must be used. This will allow the corroboration of data from multiple
perspectives, strengthening the likelihood of the hypotheses, and enabling both
synchronic and diachronic study.
The methods used here were not without limitation. As frequently highlighted,
comparing grammatical variation as opposed to, e.g. phonological variation typically
requires large resources, and it may be the case that larger corpora would have been able
to substantiate some of the claims made in this thesis in a more convincing manner.
Naturally, there are obstacles to acquiring such a large amount of data, e.g. simply the
lack of the data in existence, time constraints and costs. In this study this limitation
was counteracted to some degree by elicitation methods (and this approach is strongly
advocated), but larger corpora and perhaps, for example, conversational data
structured around topic or e.g. grammatical tense to some degree might be of use in
supporting the methods outlined here.
Aside from the data, I argue that in order for an approach like this to progress yet
further a number of broader questions arising from this thesis need to be addressed.
One of these concerns the interplay between frequency, salience and perhaps other
social factors (as outlined also in Hollmann and Siewierska, 2011), along with the role
of construction families in language change. Questions raised by this study can
provide numerous opportunities for future research. In particular, attention to intra- and
inter-regional variation could provide further interesting results. Furthermore, new
data collection methods described in relation to gathering participants for elicitation
tasks may also lend themselves well to studies of possible social network effects, a
very important aspect of sociolinguistic theory that was beyond the scope of the
present study.
References
Agha, Asif. 2003. The social life of cultural value. Language and Communication
23:231-273.
Anderwald, Lieselotte. 2001. Was/were-variation in non-standard British English
today. English World-Wide 22:1-21.
Auwera, Johan van der. 1984. More on the history of the subject contact clause in
English. Folia Linguistica Historica 5:171-184.
Bailey, Guy, Natalie Maynor and Patricia Cukor-Avila. 1989. Variation and concord
in Early Modern English. Language Variation and Change 1:285-300.
Baldock, Dorothy and A. Wood. 1995. Favourite Lancashire recipes. Sevenoaks: J.
Salmon Ltd.
Barbiers, S., H. Bennis, G. De Vogelaer, M. Devos, M. van der Ham, I. Haslinger, M.
van Koppen, J. van Craenenbroeck, V. van den Heede (eds.). 2005. Syntactic
atlas of the Dutch dialects (SAND). Amsterdam: Amsterdam University Press.
Bardovi-Harlig, Kathleen. 1987. Markedness and salience in second-language
acquisition. Language Learning 37:385-407.
Barlow, Michael and Charles Albert Ferguson. 1988. Agreement in natural language.
Cambridge: Cambridge University Press.
Barras, Will. 2006. The exhalations whizzing in the air: SQUARE and NURSE in
Lancashire English. Paper presented at LANGUE Conference, University of
Essex.
Bauer, Laurie. 1994. Watching English change: an introduction to the study of
linguistic change in the twentieth century. London: Longman.
Beal, Joan C. 1993. The grammar of Tyneside and Northumbrian English. In James
Milroy and Lesley Milroy (eds.), Real English: the grammar of English
dialects in the British Isles 187-213. London: Longman.
Beal, Joan C. 2000. From Geordie Ridley to Viz: popular literature in Tyneside
English. Language and Literature 9:343-359.
Beal, Joan C. 2004. The phonology of English dialects in the north of England. In
Bernd Kortmann and E. W. Schneider (eds.), A handbook of varieties of
English, Volume I 113-133. Berlin: Mouton de Gruyter.
Beal, Joan C. 2006. Language and region. London/New York: Routledge.
Beal, Joan C. 2009. Enregisterment, commodification, and historical context: Geordie
versus Sheffieldish. American Speech 84(2):138-15.
Beal, Joan C. and Karen Corrigan. 2002. Relativisation in Tyneside English. In P.
Poussa (ed.), Relativisation on the North Sea littoral. Munich: Lincom Europa.
Biber, Douglas, Susan Conrad, Edward Finegan, Stig Johansson and Geoffrey Leech.
1999. Longman grammar of spoken and written English. Harlow: Longman.
Börjars, Kersti and Carol Chapman. 1998. Agreement and pro-drop in some dialects
of English. Linguistics 36:71-98.
Bresnan, Joan, Ashwini Deo and Devyani Sharma. 2007. Typology in variation: a
probabilistic approach to be and n't in the Survey of English Dialects. English
Language and Linguistics 11(2):301-346.
Brinton, Laurel J. 1991. The origin and development of quasi-modal have to in
English. Paper presented at the 10th ICHL, Amsterdam. Unpublished,
University of British Columbia.
Britain, David. 2002. Diffusion, levelling, simplification and reallocation in past tense
BE in the English Fens. Journal of Sociolinguistics 6(1):16-43.
Britain, David and Laura Rupp. 2005. Subject-verb agreement in English Dialects:
the East Anglian Subject Rule. Paper presented at the University of Essex.
Unpublished.
Brown, K. 1991. Double modals in Hawick Scots. In Peter Trudgill and J. Chambers
(eds.), Dialects of English: studies in grammatical variation. 74-103.
London/New York: Longman.
Bucholtz, Mary and Kira Hall. 2003. Language and identity. In Alessandro Duranti
(ed.), A companion to Linguistic Anthropology. 369-394. Oxford: Blackwell.
Burbano-Elizondo, Lourdes. 2008. Language variation and identity in Sunderland.
Unpublished PhD Thesis, Sheffield University.
Bybee, Joan. 1985. Morphology: a study of the relation between meaning and form.
Philadelphia: John Benjamins.
Bybee, Joan. 2006. From usage to grammar: the mind's response to repetition.
Language 82:4.
Carter, Ronald and Mike McCarthy. 2006. Cambridge grammar of English.
Chambers, J. K. 1995. Sociolinguistic theory: linguistic variation and its social
significance. Oxford: Blackwell.
Chambers, J. K. and Peter Trudgill. 1998. Dialectology, 2nd edition. Cambridge:
Cambridge University Press.
Cheshire, Jenny and Penelope Gardner-Chloros. 1998. Code-switching and the
sociolinguistic gender pattern. In S. Ide and B. Hill (eds.), International
Journal of the Sociology of Language, 129. Special edition on Women’s
Languages in Various Parts of the World. 5-34.
Cheshire, Jenny and Sue Fox. 2006. A new look at was/were: the perspective from
London. Paper presented at Sociolinguistics Symposium 16, Limerick.
Cheshire, Jenny, Viv Edwards, and Pamela Whittle. 1989. Urban British dialect
grammar: the question of dialect levelling. English World-Wide 10(2) 185-225.
Cheshire, Jenny. 1982. Variation in an English dialect: a sociolinguistic study.
Cambridge Studies in Linguistics no. 37. Cambridge: Cambridge University
Press.
Clark, Lynn and Graeme Trousdale. 2009. The role of frequency in phonological
change: evidence from TH-fronting in east-central Scotland. English Language
and Linguistics 13(1): 33-55.
Clarke, S. 1997a. English verbal -s revisited: the evidence from Newfoundland.
American Speech 72:227–259.
Coates, Jennifer. 1983. The semantics of modal auxiliaries. London: Croom Helm.
Corbett, Greville. 2006. Agreement. Cambridge: Cambridge University Press.
Cowart, Wayne. 1997. Experimental syntax. London: Sage Publications.
Croft, William and D. Alan Cruse. 2004. Cognitive Linguistics. Cambridge:
Cambridge University Press.
Croft, William. 2001. Radical Construction Grammar. Oxford: Oxford University
Press.
Culicover, Peter W. 2008. The birth and death of constructions: the case of English
do-support. Journal of Germanic Linguistics 20:1-52.
D’Arcy, Alexandra and Sali Tagliamonte. 2010. Prestige, accommodation and the
legacy of relative who. Language in Society. 39:389-410.
Denison, David. 1993. English historical syntax. London/New York: Longman.
Dik, Simon and Kees Hengeveld. 1997. The theory of Functional Grammar. Part I:
the structure of the clause. 2nd Edition. Berlin: Mouton de Gruyter.
Dobson, Scot. 1969. Larn Yersel' Geordie. Newcastle upon Tyne: Graham.
Doherty, Cathal. 1993. The syntax of subject contact relatives. Paper presented at the
twenty-ninth meeting of the Chicago Linguistic Society. In Katharine Beals et
al. (eds.), Proceedings of the Chicago Linguistic Society. 55–65. Chicago:
Chicago Linguistic Society.
Dutton, Dave. 1992. Completely Lanky. Printwise Publications.
Ellegård, Alvar. 1953. The auxiliary do: the establishment and regulation of its use in
English. PhD Thesis. Stockholm: Almqvist and Wiksell.
Erdmann, Peter. 1980. On the history of subject contact-clauses in English. Folia
Linguistica Historica 1:139–170.
Filppula, Markku, Juhani Klemola and Heli Pitkänen (eds.). 2002. The Celtic roots of
English. Joensuu: Joensuu University Press.
Fischer, Olga and Max Nänny. 2001. Iconicity. Special issue of the European Journal
of English Studies, (5.1).
Fischer, Olga, Ans van Kemenade, Willem Koopman and Wim van der Wurff. 2000.
The syntax of Early English. Cambridge: Cambridge University Press.
Fischer, Olga. 1992. Syntax. In N. Blake (ed.), The Cambridge history of the English
language. Volume 2: 1066–1476. Cambridge: Cambridge University Press.
Fox, Barbara and Sandra Thompson. 1990. A discourse explanation of the grammar
of relative clauses in English conversation. Language 66:51-64.
Freethy, Ron and Richard Scollins. 2002. Lankie Twang (Local Dialect). Newbury:
Countryside Books.
Godfrey, Elizabeth and Sali Tagliamonte. 1999. Another piece for the verbal -s story:
evidence from Devon in southwest England. Language Variation and Change.
11:87-121.
Goldberg, Adele and Ray Jackendoff. 2004. The English resultative as a family of
constructions. Language 80:532-568.
Goldberg, Adele. 1995. Constructions. A Construction Grammar approach to
argument structure. Chicago: University of Chicago Press.
Goldberg, Adele. 2002. Construction Grammar. In Encyclopedia of Cognitive Science.
Macmillan Reference Limited Nature Publishing Group.
Goldberg, Adele. 2006. Constructions at work: the nature of generalization in
language. Oxford: Oxford University Press.
Greenberg, Joseph H. (ed.). 1966. Universals of language, 2nd edition. Cambridge,
MA: Massachusetts Institute of Technology Press.
Harris, Alice, and Lyle Campbell. 1995. Historical syntax in cross-linguistic
perspective. Cambridge: Cambridge University Press.
Hawkins, Jack. 1994. A performance theory of order and constituency. Cambridge:
Cambridge University Press.
Henry, Alison. 1995. Belfast English and Standard English. Dialect variation and
parameter setting. Oxford: Oxford University Press.
Henry, Alison. 2005. Non-standard dialects and linguistic data. Lingua 115:1599-
1617.
Herrmann, Tanja. 2005. Relative clauses in English dialects of the British Isles. In
Bernd Kortmann, Tanja Herrmann, Lukas Pietsch, and Susanne Wagner,
(eds.), A comparative grammar of British English dialects. Agreement, gender,
relative clauses. Berlin/New York: Mouton de Gruyter.
Hickey, Raymond. 2000. Salience, stigma and standard. In Laura Wright (ed.), The
development of standard English 1300-1800. Theories, descriptions, conflicts.
Cambridge: Cambridge University Press.
Hollmann, Willem B. and Anna Siewierska. 2006. Corpora and (the need for) other
methods in a study of Lancashire dialect. Zeitschrift für Anglistik und
Amerikanistik 54:203-216.
Hollmann, Willem B. and Anna Siewierska. 2007. A construction grammar account of
possessive constructions in Lancashire dialect: some advantages and
challenges. English Language and Linguistics 11:407-424.
Hollmann, Willem B. and Anna Siewierska. 2011. The status of frequency, schemas,
and identity in Cognitive Sociolinguistics: a case study on definite article
reduction. Cognitive Linguistics 22–1:25–54.
Hollmann, Willem B. Forthcoming. Constructions in cognitive salience. In Thomas
Hoffmann and Graeme Trousdale (eds.), The Oxford handbook of
Construction Grammar. Oxford: Oxford University Press.
Holmes, Janet. 1997. Women, language and identity. Journal of Sociolinguistics
1:195–223.
Honeybone, Patrick and Kevin Watson. (In prep). The sociolinguistics of
orthography: exploring contemporary, humorous, localised dialect literature.
Hopper, Paul J. and Elizabeth Closs Traugott. 2003. Grammaticalization, 2nd Edition.
Cambridge: Cambridge University Press.
Huddleston, Rodney and Geoffrey K. Pullum. 2002. The Cambridge grammar of the
English language. Cambridge: Cambridge University Press.
Hudson, Richard. 1999. Subject-verb agreement in English. English Language and
Linguistics 3: 173-207.
Hundt, Marianne. 1997. Has BrE been catching up with AmE over the past thirty
years? In Magnus Ljung (ed.), Corpus-Based Studies in English. Papers from
the Seventeenth International Conference on English Language Research on
Computerized Corpora (ICAME 17). Stockholm.
Ihalainen, Ossi. 1980. Relative clauses in the dialect of Somerset. Neuphilologische
Mitteilungen.
Ihalainen, Ossi. 1994. The dialects of England since 1776. In Robert Burchfield (ed.),
The Cambridge history of the English language. Volume V: English language
in Britain and overseas. Origins and development. 197-274. Cambridge:
Cambridge University Press.
Isaac, Graham R. 2003. Diagnosing the symptoms of contact: some Celtic-English
case histories. In Hildegard Tristram (ed.), The Celtic Englishes III.
Heidelberg: Universitätsverlag C. Winter.
Johnstone, Barbara, Jennifer Andrus, and Andrew E. Danielson. 2006. Mobility,
indexicality, and the enregisterment of ‘Pittsburghese’. Journal of English
Linguistics 34: 77–101.
Kain, Roger and Richard Oliver. 2006. Historic parishes of England and Wales: an
electronic map of boundaries before 1850 with a gazetteer and metadata.
Colchester: Arts and Humanities Data Service.
Kearns, Kate. 2007. Epistemic verbs and zero complementizer. English Language and
Linguistics. 11:475-505.
Kerswill, Paul and Ann Williams. 2002. 'Salience' as an explanatory factor in
language change: evidence from dialect levelling in urban England. In M. C.
Jones and E. Esch (eds.), Language change. The interplay of internal, external
and extra-linguistic factors. 81-110. Berlin: Mouton de Gruyter.
King, G. 2003. Modern Welsh. Oxford: Routledge.
Klemola, Juhani. 2002. The origins of the Northern Subject Rule: a case of early
contact? In Hildegard Tristram (ed.), The Celtic Englishes II. Heidelberg:
Universitätsverlag C. Winter.
Kortmann, Bernd and Benedikt Szmrecsanyi. 2004. Global synopsis: morphological
and syntactic variation in English. In B. Kortmann, E. Schneider, K. Burridge,
R. Mesthrie and C. Upton (eds.), A handbook of varieties of English. 1142-
1202. Berlin/New York: Mouton de Gruyter.
Kortmann, Bernd, E. Schneider, K. Burridge, R. Mesthrie and C. Upton (eds.). 2004.
A handbook of varieties of English. Berlin/New York: Mouton de Gruyter.
Kroch, Anthony. 1989. Reflexes of grammar in patterns of language change.
Language Variation and Change, 1:199-244.
Krug, Manfred. 1996. Emerging English modals: a corpus-based study of
grammaticalization. Berlin/New York: Mouton de Gruyter.
Labov, William. 1968. The reflection of social processes in linguistic structures. In J.
Fishman (ed.), Readings in the sociology of language. New York: Mouton de
Gruyter.
Labov, William. 1972. Sociolinguistic patterns. Philadelphia: University of
Pennsylvania Press.
Labov, William. 2001. Principles of linguistic change. Volume 2: social factors.
Oxford: Blackwell.
Labov, William. 2006. The social stratification of English in New York. Cambridge:
Cambridge University Press.
Lambrecht, Knud. 1988. There was a farmer had a dog: syntactic amalgams revisited.
In Proceedings of the fourteenth annual meeting of the Berkeley Linguistics
Society. 319-339. UC Berkeley, California.
Langacker, Ronald W. 1987. Foundations of cognitive grammar: theoretical
prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 1991. Concept, image, and symbol: the cognitive basis of
grammar, 2nd edition. Berlin/New York: Mouton de Gruyter.
Leech, Geoffrey. 1987. Meaning and the English verb, 2nd Edition. London:
Longman.
Leech, Geoffrey. 2003. Modality on the move: the English modal auxiliaries 1961-
1992. In Roberta Facchinetti, Manfred Krug and Frank Palmer (eds.), Modality
in contemporary English. 223-240. Berlin/New York: Mouton de Gruyter.
Lehmann, Christian. 1982. Thoughts on grammaticalization. A programmatic Sketch.
Vol. I. Köln: Arbeiten des Kölner Universalien-Projekts, Nr. 48.
Lehmann, Hans Martin. 1997. Automatic retrieval of zero elements in a computerised
corpus. In Magnus Ljung (ed.), Corpus-based studies in English. Papers from
the Seventeenth International Conference on English Language Research on
Computerized Corpora. 179–194. Amsterdam: Rodopi.
Malcolm, Ian. 1996. Observations on variability in the verb phrase in Aboriginal
English. Australian Journal of Linguistics 16:145-165.
Marcus, Mitchell, Beatrice Santorini and Mary Ann Marcinkiewicz. 1993. Building a
large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2).
McCafferty, Kevin. 2003. The Northern Subject Rule in Ulster: How Scots, how
English? Language Variation and Change 15:105-139.
Merton, Les and Richard Scollins. 2003. Oall Rite Me Ansum!: A Salute to the
Cornish Dialect (Local Dialect). Newbury: Countryside Books.
Meyerhoff, Miriam. 2010. The Sociolinguistics reader. London/New York:
Routledge.
Miller, Jim. 1993. The grammar of Scottish English. In James Milroy and Lesley
Milroy (eds.), Real English: the grammar of English dialects in the
British Isles. London/New York: Longman.
Milroy, L. and J. Milroy. 1992. Social network and social class: toward an integrated
sociolinguistic model. Language in Society 21:4.
Mishoe, Margaret and Michael Montgomery. 1994. The pragmatics of multiple modal
variation in North and South Carolina. American Speech 69.1:3-29.
Montgomery, Michael. 1994. The evolution of verb concord in Scots. In Alexander
Fenton and A. MacDonald (eds.), Studies in Scots and Gaelic. 81–95.
Edinburgh: Canongate Academic.
Montgomery, Michael. 2004. Grammar of Appalachian English. In Bernd Kortmann
and Edgar W. Schneider (eds.), Handbook of varieties of English: volume 3,
37-72. Berlin: Mouton de Gruyter.
Montgomery, Michael, Janet Fuller and Sharon DeMarse. 1993. The black men has
wives and sweet harts [and third-person plural -s] Jest like the white men.
Evidence for verbal -s from written documents on 19th-century African
American speech. Language Variation and Change 5:335–357.
Moore, Emma. 2003. A sociolinguistic analysis of a Bolton high school. Unpublished
PhD thesis: University of Manchester.
Mufwene, Salikoko. 1991. Pidgins, creoles, typology, and markedness. In Francis
Byrne and Thom Huebner (eds.), Development and structures of Creole
Languages. Essays in Honor of Derek Bickerton. Amsterdam: John Benjamins.
Mukherjee, Joybrato. 2005. English ditransitive verbs: aspects of theory, description
and a usage-based model. Language and Computers 53. Amsterdam/New
York: Rodopi.
Murray, Sir James Augustus Henry. 1873. The Dialect of the southern counties of
Scotland: its pronunciation, grammar, and historical relations. London:
Published for the Philological Society by Asher and Co.
Mustanoja, Tauno. 1960. A Middle English syntax. Helsinki: Société Néophilologique.
Olofsson, Arne. 1981. Relative junctions in written American English. Gothenburg
Studies in English 50.
Orton, Harold, Wilfrid Halliday, Eugen Dieth, Martyn Wakelin, Michael Barry, Philip
Tilling. 1962-71. Survey of English dialects: basic materials. Introduction and
4 volumes (each in 3 parts). Leeds: E. J. Arnold and Son.
Perkins, Mick R. 1983. Modal expressions in English. London: Frances Pinter and
Norwood.
Pietsch, Lukas. 2005. Some do and some doesn't: verbal concord in the north of the
British Isles. In Kortmann, Bernd, Tanja Herrmann, Lukas Pietsch, and
Susanne Wagner, (eds.), A comparative grammar of British English dialects.
Agreement, gender, relative Clauses. Berlin/New York: Mouton de Gruyter.
Prasad, R and M. Strube. 2000. Discourse salience and pronoun resolution in Hindi.
Penn Working Papers in Linguistics 6.3 189-208. University of Pennsylvania.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik. 1985. A
comprehensive grammar of the English Language. London: Longman.
Ramisch, Heinrich. 2008. The Northern Subject Rule and its 'northernness': a
geolinguistic perspective. Presented at Dialects as a testing ground for
theories of language change session at Methods in Dialectology XIII. Leeds
University, UK.
Robinson, Chris. 2008. Wha's like us? (Say it in Scots). Edinburgh: Black and White
Publishing.
Roby, John. 1829. Traditions of Lancashire. London: Longman.
Romaine, Suzanne (ed.). 1982. Sociolinguistic variation in speech communities.
London: Edward Arnold.
Romaine, Suzanne. 1980. The relative clause marker in Scots English: diffusion,
complexity and style as dimensions of syntactic change. Language in Society
9:221-49.
Ruano García, Javier. 2007. Thou'rt a strange fillee: evidence for 'y-tensing' in 17th
century Lancashire dialect? Sederi 17.
Rupp, Laura. 2005. Constraints on non-standard -s in expletive there sentences: a
generative-variationist perspective. English Language and Linguistics 9: 225-
288.
Rupp, Laura and David Britain. 2008. Concord variation: a generative-
sociolinguistic perspective. Basingstoke: Palgrave.
Schiffrin, Deborah. 1996. Narrative as self portrait: the sociolinguistic construction of
identity. Language in Society 25(2).
Schilling-Estes, Natalie. 2000. Investigating intra-ethnic differentiation: /ay/ in
Lumbee Native American English. Language Variation and Change 12.2:
141-174.
Schütze, Carson. 1996. The empirical base of linguistics: grammaticality judgments
and linguistic methodology. Chicago: University of Chicago Press.
Sebba, Mark. 2009. Spelling as a social practice. In Janet Maybin and Joan Swann
(eds.), Routledge companion to English Language studies. London: Routledge.
Shorrocks, Graham. 1996. The second person singular interrogative in the traditional
vernacular of the Bolton Metropolitan area. In James R Black and Virginia
Motapanyane (eds.), Microparametric syntax and dialect variation.
Amsterdam: John Benjamins.
Shorrocks, Graham. 1999. Grammar of the dialect of the Bolton Area. Part I.
Morphology and syntax. Frankfurt: Peter Lang.
Shorrocks, Graham. 1999. Grammar of the dialect of the Bolton Area. Part II.
Morphology and syntax. Frankfurt: Peter Lang.
Siewierska, Anna and Willem B. Hollmann. 2005. Ditransitive clauses in English with
special reference to Lancashire dialect. In Mike Hannay and Gerard J. Steen
(eds.), Structural-functional studies in English grammar. 83-102.
Amsterdam/Philadelphia: John Benjamins.
Siewierska, Anna. 2004. Person. Cambridge: Cambridge University Press.
Smith, Jennifer and Sali Tagliamonte. 1998. ‘We were all thegither… I think we was
all thegither’: was regularization in Buckie English. World Englishes
17/2:105–126.
Smith, Jennifer, M. Durham and L. Fortune. 2007. 'Mam, my trousers is fa'in doon!':
Community, caregiver, and child in the acquisition of variation in a Scottish
dialect. Language Variation and Change, 19(1):63-99.
Sparks, John. 2009. Spirit of Lancashire, 2nd revised edition. Wellington, UK:
Halsgrove/Pixz Books.
Tagliamonte, Sali and Jennifer Smith. 1998. Roots of English in the African American
diaspora? Englishes. 147-165.
Tagliamonte, Sali and Jennifer Smith. 1999. Analogical levelling in Samaná English:
the case of was and were. Journal of English Linguistics 27, 1. 8-16.
Tagliamonte, Sali, Jennifer Smith and Helen Lawrence. 2005. No taming the
vernacular! Insights from the relatives in northern Britain. Language Variation
and Change. 17.1: 75-112.
Tagliamonte, Sali. 1998. Was / were variation across the generations: view from the
city of York. Language Variation and Change, 10(2): 153-191.
Tagliamonte, Sali. 2004. Back to the roots: the legacy of British dialects. Final report
to the ESRC, grant no: R000239097.
Tagliamonte, Sali and Helen Lawrence. 2000. ‘I used to dance, but I don't dance now’:
the habitual past in English. Journal of English Linguistics, 28(4): 323-353.
Tomasello, M. 2003. Constructing a language: a usage-based theory of language
acquisition. Harvard: Harvard University Press.
Trousdale, Graeme. 2003. Modal verbs in Tyneside English: evidence for
(socio)linguistic theory. In Roberta Facchinetti, Manfred Krug and Frank
Palmer (eds.), Modality in contemporary English. London/New York: Mouton
de Gruyter.
Trudgill, Peter. 1999. The Dialects of England. Oxford: Blackwell.
Trudgill, Peter. 2008. Colonial dialect contact in the history of European languages:
On the irrelevance of identity to new-dialect formation. Language in Society
37, 241–280.
Van der Auwera, Johan and Vladimir Plungian. 1998. Modality's semantic map. Linguistic Typology
2:79-124.
Vennemann, Theo. 2000. English as a Celtic language. Atlantic influence from above
and from below. In Hildegard Tristram (ed.), The Celtic Englishes II,
Anglistische Forschungen 399-406. Heidelberg: Carl Winter.
Visser, F. T. 1963–1973. An historical syntax of the English language. Leiden: E. J.
Brill.
Vivian, Louisa. 2000. /r/ in Accrington. Undergraduate dissertation. Colchester: Essex
University.
Wales, Katie. 2006. Northern English: a cultural and social history. Cambridge:
Cambridge University Press.
Warner, Anthony. 1993. English auxiliaries. Structure and history. Cambridge:
Cambridge University Press.
Watson, Kevin. 2006. Phonological resistance and innovation in the North-West of
England. English Today 22 (2):55-61.
Wells, John. 1970. Local accents in England and Wales. Journal of Linguistics 6 (2):
231–252.
White, David L. 2002. Explaining the innovations of Middle English: what, where and
why. In Markku Filppula, Juhani Klemola and Heli Pitkänen (eds.), The Celtic
roots of English. Studies in Language 37: 153-174. Joensuu: University of
Joensuu Press.
Wolfram, Walt and Jason Sellers. 1999. Ethnolinguistic marking of past be in Lumbee
Vernacular English. Journal of English Linguistics 27:94-114.
Wright, Laura. 2002. Third person plural present tense markers in London prisoners’
depositions, 1562–1623. American Speech 77: 242–263.
Yaeger-Dror, Malcah. 1993. Linguistic analysis of dialect correction and its
interaction with cognitive salience. Language Variation and Change 5:189-
224.
Appendix A: Map of the old County of Lancashire
Below, I present a map of the old County of Lancashire before the 1974 boundary
changes (taken from Kain and Oliver, 2001).
Appendix B: Texts comprising Litcorp
Below, I present a list of the texts comprising the Lancashire dialect literature corpus
(Litcorp) used throughout this thesis:
Baron, William. 1888. Bits o’ broad Lancashire. Manchester: John Haywood.
Billington, William. 1883. Lancashire songs, poems and sketches. Blackburn: Toulmin.
Brierley, Benjamin. 1896. 'Ab o’th’-Yate' Sketches and Other Short Stories, volume 1.
Oldham: W.E. Clegg.
Brierley, Benjamin. 1886. 'Ab o’th’-Yate' Sketches and Other Short Stories, volume 2.
Oldham: W.E. Clegg.
Collier, John (also known as Tim Bobbin). 1846. Tummus and Meary. Manchester:
John Haywood.
Saunders, Langford. 1911. Lancashire humour and pathos. Manchester: Fred Johnson
& Co.
Thompson, T. 1945. Lancashire Pride. London: George Allen.
Appendix C: Questionnaire – sociolinguistic information
Below, I present the sociolinguistic questionnaire used in conjunction with
acceptability questionnaires employed in this thesis:
Dialect Survey
Information about you...
1. Where were you born?
..........................................................................................................................................
If you have not always lived in the same town/city/village, please specify where
else you lived and for how many years.
..........................................................................................................................................
2. How old are you? (If you would prefer not to say, please leave blank)
.........................................................................................................................................
3. Would you consider yourself to be a speaker of a particular English
dialect?
..........................................................................................................................................
4. If so, which dialect?
..........................................................................................................................................
5. How do you feel about your dialect, e.g. positive or negative? Is there
anything you particularly like or dislike?
..........................................................................................................................................
..........................................................................................................................................
..........................................................................................................................................
..........................................................................................................................................
........................................................................................................................................
Thank you!
Please proceed to complete the survey...
Appendix D: Questionnaire – content testing the NSR
Below, I present the acceptability questionnaire used to test the NSR. The format of
the questions is shown below. A list of sentences used to populate the survey is also
given:
Please read the sentences and rate how acceptable they are to you.
Please give each sentence a score between 1 and 5, with 1 being the least acceptable
or most unlikely to be used by you, and 5 being the most acceptable or likely to be
used by you.
For example, if you judged sentence A (below) to be very acceptable, you should
give it a score of 5, and so circle the number 5, as shown below.
A. ‘The man with the red hat sometimes goes into the shop.’
(least acceptable) 1 2 3 4 5 (most acceptable)
If you judge sentence B (below) to be very unacceptable, you should give it a score
of 1, and so circle the number 1.
B. ‘With sometimes red the hat into the shop goes the man.’
Please proceed to start the survey!
Given below is a list of the sentences that were used to populate the survey:
1. All of you are confident and tries very hard.

2. On a Monday I talks to the man from the butchers for a few minutes.
3. She found a new house and is very happy.
4. They have a shop of their own and is very well off.
5. Everyone’s spent all of their money and has got nothing left.
6. We are waiting for the specialist to phone back.
7. You really has to try that new restaurant, its great!
8. These does wonders for my health.
9. Bob and John, when the go out for a walk, finds a man who had fallen over.
10. You does nothing but go on and on about recycling.
11. All of us from Flat Ten thinks the cleaner isn’t doing her job properly.
12. You only very occasionally asks me for help.
13. I is sometimes not sure about what he will say about all of the mistakes I
make.
14. You have lots left to do but you’re making good progress.
15. My friends wife does a cookery class at the community centre.
16. The other day they walks for three miles before they came to a post-box.
17. You and your sister have got no manners and is very nasty to him sometimes.
18. We usually always do something nice at Christmas.
Appendix E: Ellipsis test sentences
Below, I present the list of elliptical sentences that was used to test the
acceptability of these constructions to Lancashire speakers in Chapter 3:
1. I’ven’t got any money for the bus.

2. You’ren’t to go there or you’d be in real trouble.
3. I’sn’t to go to far, its not very nice outside
4. We’dn’t anything else to do.
5. You’ren’t ever going to understand this.
6. I’ven’t got to be anywhere tomorrow.
7. We’dn’t any left.
8. They’dn’t found the answer.
9. The girlsn’t got a clue what to do.
10. The workers’dn’t left yet.
Appendix F: Questionnaire – content testing zero relatives
Below, I present the acceptability questionnaire used to test zero relatives. The format
of the questions is shown below. A list of sentences used to populate the survey is
also given:
Please read the sentences and rate how acceptable they are to you.
Please give each sentence a score between 1 and 5, with 1 being the least acceptable
or most unlikely to be used by you, and 5 being the most acceptable or likely to be
used by you.
For example, if you judged sentence A (below) to be very acceptable, you should
give it a score of 5, and so circle the number 5, as shown below.
A. ‘The man with the red hat sometimes goes into the shop.’
If you judge sentence B (below) to be very unacceptable, you should give it a score
of 1, and so circle the number 1.
B. ‘With sometimes red the hat into the shop goes the man.’
Please proceed to start the survey!
Given below is a list of the sentences that were used to populate the survey:
1. I have something might help you understand it

2. It was Laura told me about that.
3. There’s a man down the street goes there too.
4. She’s the one took the money.
5. I met a man once could do that.
6. It was that one I wanted
7. I can’t quite remember, perhaps it was my son asked me to do that.
8. I know a lady from work has one of those things.
9. I haven’t got any work needs doing
10. I once had a dog could eat two tins of food in one morning.
11. Have you got any plants want watering?
12. I think it might be John picked the wallpaper in here.
13. My nana’s got a squeaky gate wants oiling.
14. The girls are the ones messed it up for us all
15. I don’t think I’ve ever seen anyone has been so angry.
16. It were that place never closed, even in winter
17. Are there any drinks want making?
18. He’s the one gets the blame.
Appendix G: Sample text from Lancashire Fairytales
Below, I present the three excerpts from Lancashire dialect writing collected to form
Lancashire Fairytales and used in Chapter 5:
(1) (Lancs_0017)

[…] An Cinderella were havin a gradely time at Ball wit Hansome Prince.
Then, she looked at time and said “oooh eck, I’ve gorra dash love, or I’ll turn
into some right nasty vegertable!” An off she dashed, right down road. Well,
the prince, he were broken hearted, and he says, “i’m gonna find me lovely
lass, im gonna search all round kingdom!” And off he went down t’road,
holdin onto the clog that she’d left ont ground […]
(2) (non-Lancs_0023)
Once upon u time, there were a right pretty girl named Sleepin Beauty. Sleepin
Beauty had been put into sleep by Wicked Witch. One day, a Hansome Prince
come along, and gave her a smacker right ont lips! Sleepin beauty woke up
and lived happily ever after with prince int castle.
(3) (Lancs_0002)
Three little pigs. One day, three lickle pigs were flyin nest an meckin them
houses for't livin in. First lickle pig came across man wit' hay an says, "ay up
fettler, can thou gimme some hay for't house I'm meckin?" man says, "aye
lad", an lickle pig mecks house of hay. Second lickle pig saw't man wit sticks
an says, "ay up fettler, can thou gimme some sticks for't meckin me house?" an
man says, "aye lad" an lickle pig mecks house of sticks like. Third lickle pig
sees man wit great big stones an says, "ay up fettler, can thou gimme some
great big stones for't house I'm meckin?" an man says, "aye lad" an lickle pig
mecks house of stones. All of sudden, wolf comes t'village an starts chappin
doors. 'ee says t'first lickle pig, "Lickle pig! Lickle pig! Let me in! Let me in!"
an lickle pig says, "Not for't hair on me chin!" an wolf says, "Then i'll huff an
puff an blow yer house in!" an does it. Then 'ee cooks an ate lickle pig. 'ee says
t'necks lickle pig, "Lickle pig! Lickle pig! Let me in! Let me in!" an lickle pig
says, "Not for't hair on me chin!" an wolf says, "Then i'll huff an puff an blow
yer house in!" an does it gain. Then 'ee cooks an ate necks lickle pig. When 'ee
sees last lickle pig 'ee says, "Lickle pig! Lickle pig! Let me in! Let me in!" an
lickle pig laffs an says, "Not for't hair on me chin!" an wolf says, "Then i'll
huff an puff an blow yer house in!" but can't. So 'ee tries gain but still can't.
Then 'ee climbs up t'roof an jumps down chim-eny an right in't big pot o' hot
watter an lickle pig cooks an ate wolf.



culturally) by their neighbours in Manchester and Liverpool. Although grammatical

variation has yet to be extensively investigated in Lancashire (although some inroads

phonological difference that has been noted is rhoticity. Although sometimes

considered as a typical feature of ‘Lancashire’ (e.g. by Wells, 1970), rhoticity shows

phonemic variation within the county boundary, as outlined by Beal (2004:130). It is

Lancashire, e.g. Lancaster, Preston, Blackpool. It is likely that grammatical variation

may also display sub-regional differences. Phonetic and perhaps grammatical

Elizondo, 2008). Although an intra-regional grammatical study may uncover

1.2.2 Lancashire identity

describes the definition and identification of a regional variety as ‘a linguistic

repertoire differentiable within a language as a socially recognised register’ (Agha

Dialect writing is also frequently found in the region in collections of ‘traditional’

traditions, e.g. Traditions of Lancashire (Roby 2005); Favourite Lancashire recipes

concerning definite article reduction in the region).