
Dead Sea Discoveries 25 (2018) 57–82

brill.com/dsd

Computational Stylometric Approach to the Dead Sea Scrolls
Towards a New Research Agenda

Pierre Van Hecke
KU Leuven, Belgium
pierre.vanhecke@kuleuven.be

Abstract

The question of how to classify the different texts of the Dead Sea Scrolls is a central
issue in scholarship. There is little agreement or even little reflection, however, on the
methodology with which these classifications should be made.
This article argues that recent developments in computational stylometry address
these methodological issues and that the approach therefore constitutes a necessary
addition to existing scholarship. The first section briefly introduces the recent develop-
ments in computational stylometry, while the second tests the feasibility of a stylometric
approach for research on the Scrolls. Taking into account the particular challenges of the
corpus, an exploratory methodology is described, and its first results are presented. In the
third and final section, directions for future research in the field are articulated.

Keywords

Dead Sea Scrolls – computational stylometry – text classification – stylistic features

Ever since the discovery of the first Dead Sea Scrolls, the question of how to
classify the different texts has been a central issue in scholarship. The term
Dead Sea Scrolls is used in this article in the narrower sense to refer to the
manuscripts found in the vicinity of Khirbet Qumran. On the one hand, there

*  I wish to thank my colleague Eibert Tigchelaar and my collaborators Mathias Coeckelbergs
and David Van Acker for their valuable comments and suggestions on different parts and
aspects of this article.

© koninklijke brill nv, leiden, 2018 | doi 10.1163/15685179-12341464



is the question of which position the corpus of the Dead Sea Scrolls takes vis-
à-vis other known Hebrew corpora, in particular the Biblical and the Tannaitic
corpus; on the other hand, there is the question of whether the Dead Sea Scrolls
really constitute a unified corpus at all, or whether they rather find their origin
in a number of different redactional and/or scribal milieus.1
In dealing with these questions, linguistic arguments play an important
role. The texts’ linguistic features—lexical, orthographic, morphological and
syntactic—are taken as classifiers that categorise compositions and manu-
scripts, grouping together the ones that share a sufficient number of character-
istics and distinguishing those that lack such correspondences.
In the course of Dead Sea Scrolls research, substantial progress has been
made both in the task of classifying the different manuscripts and composi-
tions and in the meticulous description of the linguistic features of the scrolls
that may help in doing so.2 Yet important questions remain, as the lack of una-
nimity on, for example, the distinction between sectarian and non-sectarian
compositions demonstrates. More importantly, however, there is little agree-
ment and even little reflection on the methodology with which classifications
should be done. Two methodological issues stand out in this regard. First, there
is the question of which features or characteristics should be taken into con-
sideration when classifying texts. Should one, for example, categorise texts on
the basis of the presence or absence of lexical features (that is: typical vocabu-
lary) or should one attribute more importance to orthographic or grammatical
features in order to group texts? Second, once a list of typical features has been
identified, the question arises of how (groups of) features should be weighted
against each other. It is seldom the case that texts contain either all or none of
the features considered characteristic of a particular category. The presence
of certain characteristics should therefore be weighted against the absence
of others. Due to the lack of clear methodological criteria, the evaluation of
which features to select and how to weight them is often done intuitively, re-
sulting in strongly different text classifications.
In this article, I will argue that recent insights from and developments in
computational stylometry address these methodological issues and that the

1  The question was first raised by Baruch Levine in his article on the then recently pub-
lished Temple Scroll: “The Temple Scroll: Aspects of Its Historical Provenance and Literary
Character,” Bulletin of the American Schools of Oriental Research 232 (1978): 5–23.
2  The contributions to the International Symposia on the Hebrew of the Dead Sea Scrolls and
Ben Sira have been crucial in this regard, see STDJ 26, 33, 36, 73, 108, 114, along with other stud-
ies mentioned in this article and studies such as E.Y. Kutscher, The Language and Linguistic
Background of the Isaiah Scroll (1QIsaa) (Leiden: Brill, 1974).


approach therefore constitutes a necessary addition to existing scholarship. In
the first section, I will briefly introduce the recent developments in computa-
tional stylometry. The second section will test the feasibility of a stylometric
approach to the Scrolls. Taking into account the particular challenges of the
corpus, an exploratory methodology will be described, and its first results will
be presented. In the third and final section, directions for future research in the
field will be articulated.

1 Computational Stylometry: Background and Developments

Computational stylometry is a recent approach within corpus linguistics that
aims at determining the specific characteristics of texts and the stylistic dis-
tance between texts with the help of statistical techniques. Computational
tools and statistical methods to describe textual style have already been used
for a number of decades in classical tasks of authorship attribution.3 In these
tasks, a text of unknown provenance is ascribed to one of a limited number of
potential authors on the basis of the stylistic distance between the text under
consideration and a sample of texts from known authors.4 These methods
were of little use for the study of the Dead Sea Scrolls, not only because the
latter corpus contains no texts by known authors with which texts of unknown prov-
enance can be compared, but also because the available techniques only yield-
ed reliable results with longer texts than most Dead Sea documents. However,
in recent years, the methods for authorship attribution have been consider-
ably refined. As a result, more complex authorship situations—involving short
texts, or involving many potential and unknown authors—can now be dealt

3  One often points to the monograph by Mosteller and Wallace on the authorship of the
Federalist Papers as marking the start of this new approach, using advanced statistical meth-
ods to attribute these essays to their respective authors on the basis of the most frequently
used words in the texts, see F. Mosteller and D.L. Wallace, Inference and Disputed Authorship:
The Federalist Papers (Reading: Addison-Wesley, 1964).
4  A recent example that received quite some media coverage was the discovery that the un-
known Robert Galbraith, the alleged author of the debut novel The Cuckoo’s Calling, was
none other than the well-known Harry Potter author J.K. Rowling. By adopting a pseudonym,
she attempted to keep the different genres in her oeuvre separate, but stylometrist Patrick
Juola was able to unmask her identity. See http://www.scientificamerican.com/article/
how-a-computer-program-helped-show-jk-rowling-write-a-cuckoos-calling/ (accessed 6th
September 2016).


with.5 As Koppel and others have argued, these refined analyses can all be
reduced to what they call the fundamental problem of authorship attri-
bution: “given two (possibly short) documents, determine if they are written
by a single author or not.”6 These recent advancements make the approach
much more promising for the Dead Sea Scrolls.
Moreover, stylometry is increasingly turning to attribution problems of his-
torical texts,7 and the developments in this field can obviously be very fruitful
for similar studies on the Dead Sea Scrolls.
Methodologically, stylometric research has focused on a number of issues
in the last decade. First, there is the ongoing discussion about which textual

5  For an overview, see P. Juola, “Authorship Attribution,” Foundations and Trends in Information
Retrieval 1/3 (2006): 233–334. Among other things, the massive availability of online texts
(blogs, emails, webpages) and the challenges of forensic analysis of these data have strongly
precipitated these developments. Other, related tasks are also increasingly dealt with using
stylometric techniques: authorship profiling, e.g. determining the age, gender and level of
education of authors (S. Argamon, M. Koppel, J. Pennebaker and J. Schler, “Automatically
Profiling the Author of an Anonymous Text,” Communications of the ACM 52/2 [2009]: 119–
23); intrinsic plagiarism detection, aiming to detect plagiarism not on the basis of correspon-
dences with external sources but on the basis of internal stylistic inconsistencies (S. Meyer
zu Eissen and B. Stein, “Intrinsic Plagiarism Detection” in Advances in Information Retrieval.
Proceedings of the 28th European Conference on IR Research, ECIR 2006 [ed. M. Lalmas
et al.; London, Berlin, Heidelberg: Springer, 2006]; E. Stamatatos, “Intrinsic Plagiarism
Detection Using Character n-gram Profiles,” in SEPLN 2009 Workshop on Uncovering
Plagiarism, Authorship, and Social Software Misuse, [ed. B. Stein et al; Valencia: Universidad
Politécnica de Valencia and CEUR-WS.org, 2009], 38–46; M. Zechner, M. Muhr, R. Kern and
M. Granitzer, “External and Intrinsic Plagiarism Detection Using Vector Space Models,” in
SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse, [ed.
B. Stein et al; Valencia: Universidad Politécnica de Valencia and CEUR-WS.org, 2009], 47–55;
M. Kestemont, K. Luyckx and W. Daelemans, “Intrinsic Plagiarism Detection Using Character
Trigram Distance Scores,” in CLEF 2011 Labs and Workshop, Notebook Papers, http://ceur-ws
.org/Vol-1177/CLEF2011wn-PAN-KestemontEt2011.pdf [accessed 6th September 2016]; and
the contributions to the 45/1 [2011] issue of Language Resources and Evaluation), and the
study of stylistic inconsistencies due to collaborative authorship (N. Graham, G. Hirst and
B. Marthi, “Segmenting Documents by Stylistic Character,” Journal of Natural Language
Engineering 11/4 (2005): 397–415).
6  M. Koppel, J. Schler, S. Argamon and Y. Winter, “The ‘Fundamental Problem’ of Authorship
Attribution,” English Studies 93 (2012): 284–91, p. 284.
7  M. Kestemont, S. Moens and J. Deploige, “Collaborative Authorship in the Twelfth Century:
A Stylometric Study of Hildegard of Bingen and Guibert of Gembloux,” Digital Scholarship
in the Humanities 30/2 (2015): 199–224; J. Stover, Y. Winter, M. Koppel and M. Kestemont,
“Computational Authorship Verification Method Attributes New Work to Major 2nd Century
African Author,” Journal of the American Society for Information Science and Technology 67/1
(2016): 239–42.


features are best suited to stylometrically differentiate texts.8 Whereas tradi-
tional, non-automated authorship studies generally tend to focus on a rela-
tively small number of specific content words to categorize texts,9 stylometric
studies will typically focus precisely on the most frequent words or sequences
of words (word n-grams), and, moreover, will look at a large number of these
words (typically several hundred) at once. In addition, non-lexical features
have become increasingly important in stylometric research. Pride of place is
now taken by the most frequent character n-grams, i.e. sequences of a fixed
number of characters:10 these n-grams have higher frequencies than words,
and therefore yield better results in quantitative analysis. Other advantages
are that they capture orthographic, lexical and morphosyntactic informa-
tion and that they deal better with ‘noise’ in texts (spelling errors, incomplete
words).11 Syntactic and semantic features are also used for stylometric analy-
ses. Among the syntactic features, the analysis of the most common sequences
of parts of speech (POS n-grams) in texts yields positive results. On the se-
mantic level, Koppel and others have recently argued for the analysis of texts’
preferences in synonym choice (synsets).12
Next to the choice of the appropriate analytical features, much research
has been devoted to the adequate measures to quantify the distance or stylis-
tic difference between two texts. Since stylometric analyses depend on large
amounts of data, sophisticated methods are needed to compare all the select-
ed characteristics of two or more texts statistically. In recent years, Burrows’
Delta measure13 has gained massive support in stylometry and continues to be

8  For lists of such features, see Stamatatos, “Intrinsic Plagiarism Detection”; M. Koppel,
J. Schler and S. Argamon, “Computational Methods in Authorship Attribution,” Journal of
the American Society for Information Science and Technology 60/1 (2009): 9–26.
9  For the Dead Sea Scrolls, see for example the fundamental work on typical sectarian vo-
cabulary by D. Dimant, “The Vocabulary of the Qumran Sectarian Texts,” in Qumran und
die Archäologie, ed. J. Frey, C. Claußen and N. Kessler (Tübingen: Mohr Siebeck, 2011),
347–95.
10  See the seminal article by V. Keselj, N. Cercone and C. Thomas, “N-Gram-Based Author
Profiles for Authorship Attribution,” Computational Linguistics 3 (2003): 255–64.
11  Stamatatos, “Intrinsic Plagiarism Detection,” 553.
12  M. Koppel, N. Akiva, I. Dershowitz and N. Dershowitz, “Unsupervised Decomposition
of a Document into Authorial Components,” Proceedings of the 49th Annual Meeting of
the Association for Computational Linguistics: Human Language Technologies (Portland:
Association for Computational Linguistics, 2011), 1356–64.
13  J. Burrows, “‘Delta’: a Measure of Stylistic Difference and a Guide to Likely Authorship,”
Literary and Linguistic Computing 17/3 (2002): 267–87.


optimized.14 Finally, the algorithms for performing supervised machine-learn-
ing tasks of classification are in constant development (e.g. Support Vector
Machines, Nearest Neighbor Classification), and the discussion about which
mechanisms give the best results is ongoing.
As the challenges of preparing the corpora and of choosing and applying
adequate features, measures and techniques constitute major hurdles for
scholars without a formal background in Information Retrieval, several
groups have developed software that brings these analyses within closer
reach of other scholars in the humanities. Two
of the most accessible packages are Patrick Juola’s Java Graphical Authorship
Attribution Program (JGAAP) and the ‘stylo’ package developed for the statisti-
cal software R by Eder, Rybicki and Kestemont.15 The latter package has the ad-
vantage of working within the well-known R language and environment with
its acclaimed graphical output possibilities, and of featuring a user-friendly
graphical user interface.
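
By way of illustration, getting started requires only a few lines of R. The sketch below assumes that the package has been installed from CRAN and that the texts to be analysed are stored as plain-text files in a subdirectory called “corpus” of the working directory, the package’s default location; the folder name used here is hypothetical.

    # Install the 'stylo' package and launch its graphical user interface;
    # all analysis options can then be set interactively in the GUI.
    install.packages("stylo")
    library(stylo)
    setwd("~/dss-experiment")  # hypothetical folder with a "corpus" subdirectory
    stylo()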

2 Stylometry and the Dead Sea Scrolls

The application of quantitative authorship attribution techniques in Classical
Hebrew studies is limited. In the field of biblical studies, Radday devoted a pre-
liminary study to authorship questions in the book of Isaiah16 and a monograph-
length discussion of the redactional history of the book of Genesis,17 attempts
that did not go unchallenged.18 Very recently, Dershowitz and others
applied a number of the most recent techniques to the redactional history of

14  D. Hoover, “Testing Burrows’ Delta,” Literary and Linguistic Computing 19/4 (2004): 453–75;
P. Smith and W. Aldridge, “Improving Authorship Attribution. Optimizing Burrows’ Delta
Method,” Journal of Quantitative Linguistics 18/1 (2011): 63–88; S. Evert et al., “Towards
a Better Understanding of Burrows’s Delta in Literary Authorship Attribution,” in
Proceedings of NAACL-HLT Fourth Workshop on Computational Linguistics for Literature
(Denver: Association for Computational Linguistics, 2015), 79–88.
15  M. Eder, J. Rybicki and M. Kestemont, “Stylometry with R: A Package for Computational
Text Analysis,” The R Journal 8/1 (2016): 107–21.
16  Y.T. Radday, “Isaiah and the Computer. A Preliminary Report,” Computers and the
Humanities 5 (1970): 65–73.
17  Y.T. Radday, and H. Shore, Genesis: An Authorship Study in Computer-Assisted Statistical
Linguistics (Rome: Biblical Institute Press, 1985).
18  D. Forbes, “A Critique of Statistical Approaches to the Isaiah Authorship Problem,” in
Association Internationale Bible et Informatique, Actes du Troisième Colloque International
(Paris: Champion, 1992), 531–45.


the Pentateuch and came to the conclusion that on stylometric grounds the
Priestly redaction (P) can be distinguished from non-P material, a conclusion
also reached by Radday three decades earlier.19 With regard to the Dead Sea Scrolls,
there are a few exploratory computational and corpus linguistic studies that
are limited in scope and/or approach,20 but stylometric analyses have not yet
been applied to the corpus. In the present article, I propose an exploratory
feasibility test of the application of such a stylometric analysis to the Dead
Sea Scrolls. I first describe the successive methodological steps and decisions
taken in this experimental investigation, and subsequently present some of its
preliminary results.

2.1 Towards a Methodology


2.1.1 Data Preparation
The first step in applying computational methods to a corpus is the creation of
electronic text files that can be processed by the research software needed for
the analyses. This step has largely already been taken: the Scrolls are available
in electronic form, and have even been lexically, morphologically and syntacti-
cally tagged.21 Even though the quality of the data is still constantly improv-
ing, as more accurate readings of the Scrolls become available, the databases
constitute an excellent starting point for quantitative analyses. In order to
make the available databases operational for the present research, some minor
additional data preparation was needed, described in more detail in the note
below.22

19  I. Dershowitz, M. Koppel, N. Akiva and N. Dershowitz, “Computerized Source Criticism of
Biblical Texts,” Journal of Biblical Literature 134/2 (2015): 253–71.
20  Most recently J.M.A. Starr, “Quantitative Analysis of the Aramaic Qumran Texts” (PhD
diss., University of Edinburgh, 2013); J.T. Jacobs, “Corpus Based Statistical Analysis of
the Linguistic Character of the ‘Biblical’ Dead Sea Scrolls” (PhD diss., University of
Manchester, 2015).
21  The currently most accessible databases are the following. Firstly, the Ma’agarim database
of the Historical Dictionary Project of the Academy of the Hebrew Language should be
mentioned, which contains, among many other Hebrew compositions, all the extant texts
from the Dead Sea Scrolls. This invaluable database has been made publicly available, and
can be reached through the website maagarim.hebrew-academy.org.il/.
 The commercial Accordance program contains a lexically and morphologically tagged
database of both the biblical and the non-biblical Dead Sea Scrolls. Syntactical tagging of
the corpus by Robert Holmstedt and Martin Abegg is in progress.
22  For the present research, the text of the non-biblical Dead Sea Scrolls available in the
Accordance® software was converted document by document to plain text files (.txt).
From these files, all paratextual elements (document, fragment, column and line


2.1.2 Selection of Features


The next important methodological decision to take is the selection of textual
features serving as criteria by which to categorize texts. One option is to manu-
ally select a set of features known or presumed to be typical of a particular
group of texts. The advantage of this conscious selection is that it can build
on a long tradition of qualitative (as opposed to quantitative) research into
the linguistic characteristics of the Dead Sea Scrolls.23 Apart from the inevi-
table discussion mentioned above about which features to include or exclude,
the conscious selection of features has two important drawbacks. On the one
hand, there is the danger of circularity: features might be selected because of
their observed frequency in a particular group of documents, after which a text
classification on the basis of these features will obviously demonstrate that
this group of documents needs to be distinguished from others lacking these
features. The second, related drawback is that this procedure might overlook
significant features. Therefore, a better procedure, at least initially, is to catego-
rize texts on the basis of an automatically extracted set of features. This can
typically be achieved by generating a concordance list of all the features occur-
ring in the corpus and subsequently selecting a (sufficiently large) number of
features which are jointly used as classifiers for the documents in the corpus.
Even though this work is done automatically, the researcher, of course, still has

numbers) were removed, as well as all reconstructed text. Letters marked as probable or
possible by the editors have been accepted; this decision may lead to a number of debat-
able attributions, but as will be explained below, occasional misattributions do not fun-
damentally hamper the research results. Textual gaps were likewise removed, since they
would otherwise have been analysed as characteristics of the document. Given the limita-
tions of the text encoding, the gaps are now represented as spaces, which creates an
obvious inconsistency, as spaces in the text can now stand for both a single space in the
original text and a longer gap. Given the quantitative approach followed in this paper, this
‘noise’ does not affect the validity of the research results, although it will be clear that this
inconsistency needs to be addressed in future research.
 Since the statistical program R used for the analyses cannot deal with Hebrew script,
the text was finally converted into ASCII text, following the conventions of e.g. the
Comprehensive Aramaic Lexicon website. Alef and ayin were transliterated as “a” and “o”
respectively, since brackets—used traditionally to represent these letters—have differ-
ent functions within R. All removals and substitutions were executed with the freeware
TextCrawler 3, using regular expressions (regex).
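
A schematic R equivalent of the substitutions described in n. 22 might look as follows; the actual clean-up was done with TextCrawler 3, and the bracket and numbering patterns assumed below are hypothetical stand-ins for the conventions of the real files.

    # Schematic clean-up sketch (assumed markup conventions, see lead-in).
    clean_scroll <- function(raw) {
      txt <- gsub("\\[[^]]*\\]", " ", raw)    # drop reconstructed text, assumed to stand in brackets
      txt <- gsub("[0-9]+:[0-9]+", " ", txt)  # drop column:line numbering (assumed format)
      txt <- gsub("\u05D0", "a", txt)         # transliterate alef as "a"
      txt <- gsub("\u05E2", "o", txt)         # transliterate ayin as "o"
      gsub("\\s+", " ", txt)                  # collapse gaps and whitespace runs into single spaces
    }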
23  See e.g. E. Qimron, The Hebrew of the Dead Sea Scrolls (Atlanta, GA: Scholars Press, 1986);
E.D. Reymond, Qumran Hebrew. An Overview of Orthography, Phonology, and Morphology
(Atlanta, GA: Society of Biblical Literature, 2014), and the previous STDJ volumes of pro-
ceedings of the International Symposia on the Hebrew of the Dead Sea Scrolls and Ben
Sira referred to in note 2 above.


to determine which features to look for in the corpus, and how many instances
of this feature should be taken into consideration. As mentioned above, one
could opt to work with the most frequent words, which will consequently in-
clude a large number of function words such as prepositions, pronouns and
conjunctions usually left out of consideration in stylistic analyses. Although
this procedure yields very good results in many types of stylometric research,
it is still to be determined whether it is suitable for the Dead Sea Scrolls. On
the one hand, Hebrew being a highly inflectional language, lexemes often have
many different word forms, which will be regarded as different words by an
automated concordancer, thus yielding less than optimal results. This problem
could be avoided by searching for lexemes instead, but this procedure involves
some possibly debatable decision-making on the part of the researcher24 and,
moreover, strongly reduces the corpus’s linguistic richness. On the other hand,
the Dead Sea Scrolls are often quite lacunose, resulting in a large number of in-
complete words, which would erroneously be identified as separate
words by a concordancer. Filling all the lacunae is methodologically problem-
atic, since it creates many words not extant in the text.
A better strategy could, therefore, be to take fixed-length character strings as
features with which to classify the documents, as has already been mentioned
above. For the current exploratory research, I have opted for so-called charac-
ter trigrams, i.e. strings of three characters,25 which can easily be created by an

24  The judgments and decisions made by experts are obviously completely legitimate. For
quantitative analysis, the interference of these decisions should be minimized, however,
in order to avoid the risk that the analysis measures the decisions of experts, rather than
features of the text itself. This danger becomes all the greater when different parts of the
corpus have been prepared by different experts, but even in the case of a single expert, it
is not inconceivable that this person has judged cases differently at different stages of the
analysis.
25  A sentence like “This is an example” will be divided into the following trigrams: “Thi,” “his,”
“is ,” “s i,” “ is,” “is ,” “s a,” “ an,” “an ,” “n e,” “ ex,” “exa,” “xam,” “amp,” “mpl,” “ple.” The example
makes clear that spaces are also counted as a character. The usability of character n-grams
in stylometry has been demonstrated in a large number of analyses across different lan-
guages and genres, see Koppel, Schler and Argamon, “Computational Methods,” pp. 12–13;
W. Daelemans, “Explanation in Computational Stylometry,” in Computational Linguistics
and Intelligent Text Processing, ed. A. Gelbukh (Berlin/Heidelberg: Springer, 2013):
451–62, n. 18 referring to J. Grieve, “Quantitative Authorship Attribution: An Evaluation
of Techniques,” Literary and Linguistic Computing 22/3 (2007): 251–70. Daelemans sum-
marizes the reasons for the superiority of character n-grams over other features as fol-
lows: “They provide an excellent tradeoff between sparseness and information content.
Because of their higher frequency compared to other feature types such as tokens, bet-
ter probability estimates are possible for character n-grams, while at the same time they


automatic concordancer, without interference from the researcher. As has been
remarked above, this approach is very robust for a “polluted” corpus, and hence
for the Dead Sea Scrolls corpus, which is polluted in two ways: first,
and hence of the Dead Sea Scrolls corpus, which is polluted in two ways: first,
the corpus is often very lacunose, displaying many incomplete words, and sec-
ond, the deciphering of the manuscripts is sometimes difficult, which can re-
sult in misrepresented words in the published editions. In a trigram-approach,
partially misrepresented words and incomplete words still carry enough infor-
mation to yield useable trigrams and therefore to guarantee reliable analyses.26
Moreover, besides the generally acknowledged advantages of this approach
(character n-grams capture orthographic, lexical, morphological and
even some syntactic information),27 it is particularly well-suited to Hebrew:
dividing the text into trigrams not only captures much of the lexical information
contained in the roots consisting of subsequent consonants, but also much
morphological information, given the language’s highly standardized prefixes
and suffixes. Whether this trigram approach yields substantially better results
than a more standard ‘most frequent word’ approach is still to be determined,
and the current contribution will provide some elements to stimulate further
discussion by using both word and trigram approaches.
When the whole corpus has been divided into words or character trigrams, a
frequency-hierarchy of the most common trigrams or words can be generated,
which will be used in the following steps of the research.
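
By way of illustration, such a frequency hierarchy of character trigrams can be generated in a few lines of base R; the sketch below reuses the English example sentence from n. 25, and the function name is of course arbitrary.

    # Split a text into overlapping character trigrams (spaces count as
    # characters) and rank the trigrams by frequency.
    char_trigrams <- function(txt) {
      n <- nchar(txt)
      if (n < 3) return(character(0))
      substring(txt, 1:(n - 2), 3:n)
    }
    freqs <- sort(table(char_trigrams("this is an example")), decreasing = TRUE)
    head(freqs)  # the top of this hierarchy serves as the feature list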

2.1.3 Measuring the Weight of Features in a Text


With the features used for textual classification chosen, and a hierarchy of the
most frequent features established, one then needs to decide how many fea-
tures (most frequent words or most frequent character trigrams) will be used
for the analysis. Typically, a few hundred features are selected, but this number
can be adapted for optimal analysis.

combine information about punctuation, morphology (character n-grams can represent
morphemes as well as roots), lexicon (function words are often short), and even context
[…].”
26  In a word-based analysis, the words “example,” “exanple” (misrepresented) and “examp”
(broken) would be regarded as three unrelated textual features, whereby the latter two
would be useless for stylometric analysis, since for example the broken word “examp” is
not a stylistic feature of any document, but simply the characteristic of the manuscript’s
state of preservation. In a trigram-approach, the first two cases share two trigrams, name-
ly “exa” and “ple,” while the first and third even share three trigrams “exa,” “xam,” “amp.”
These trigrams are therefore useable data in a quantitative stylometric analysis.
27  See n. 25.


Subsequently, one needs to determine the method with which, first, to
measure the importance of each of these features for each of the documents
analysed, and, then, to measure the ‘distance’ (resemblance or distinction)
between all the documents. What needs to be done in the first step, simply
speaking, is to create a table in which each of the documents is represented
by a column, with each of the selected features (the predetermined number of
the most frequent character trigrams or most frequent words) being represent-
ed by a row. Each cell then contains the information about how important a
specific feature (row) is in a specific document (column), expressed with a nu-
merical value. As briefly mentioned above, there is broad support for calculating
this value as in the Delta measure first developed
by John Burrows, although other measures exist. Without entering into the
technical details, this measure takes the observed frequency of a feature in a
document and subtracts the mean frequency of the same feature in the whole
corpus from this value. By doing so, features that are much more frequent in
a particular document X than the mean frequency of the same feature in the
whole corpus, will, for example, yield a high value for that document X. Since
some features are much more evenly distributed across the corpus than
others, this too should be considered in the calculation. For that reason, the dif-
ference “observed frequency in document X minus mean frequency in the corpus”
should then be divided by the standard deviation of that feature. The result will
be that a feature that is generally speaking very evenly distributed across the
corpus, but occurs much more frequently than its mean in a particular docu-
ment, will receive a high numerical value, higher than a feature whose
distribution across the corpus is very uneven to begin with. This also means
that very specific vocabulary in a document will receive less weight than com-
mon vocabulary that is used exceptionally frequently in a document. The un-
derlying rationale for this choice is that exceptional vocabulary can be due to
the document’s specific topic, rather than due to author or group related sty-
listic characteristics. Since stylometric analysis of the Scrolls aims at grouping
texts on the basis of their linguistic similarity—which might be indicative of
their provenance—and not so much on the basis of their specific content, this
methodological decision speaks for itself.
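
In statistical terms, the value just described is the familiar z-score. A minimal sketch of the calculation, assuming a hypothetical matrix freq of relative frequencies with one row per feature and one column per document:

    # Standardise each feature: (observed frequency - corpus mean) /
    # standard deviation. R recycles the row-wise means and standard
    # deviations down the columns, so the expression applies per feature.
    standardise <- function(freq) {
      mu    <- apply(freq, 1, mean)  # mean frequency of each feature in the corpus
      sigma <- apply(freq, 1, sd)    # standard deviation of each feature
      (freq - mu) / sigma
    }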
The following example table (Figure 1) illustrates the results of these cal-
culations: the columns represent a number of randomly selected DSS manu-
scripts, the rows the most frequent character trigrams in the corpus, with each
cell containing the value for the trigram in the document, calculated as indi-
cated above. The table that formed the basis for the present article is obviously
much more extensive, containing all the selected documents and the—in this
case—500 most frequent trigrams, most including one or two blank spaces.


trigram CD 11Q19 4Q416 4Q417 4Q418 4Q423 1QHa 1QM 1QpHab 1QS 1QSa

‫ים‬ 1.2471 1.4060 0.2506 0.1343 0.4862 0.6191 0.8103 1.5667 1.5973 1.2137 1.4475
‫ול‬ 0.0920 0.6915 0.1253 0.7725 0.8133 0.1547 0.8368 0.8372 0.5839 1.1561 0.9650
‫הו‬ 0.4600 0.9687 1.2844 0.8733 0.9459 1.1609 0.9370 0.6511 0.2919 0.4342 0.4020
‫םו‬ 0.7257 0.5086 0.1879 0.1679 0.4508 0.4643 0.6158 0.6365 0.7042 0.5911 0.8845
‫כה‬ 0.0255 0.4743 2.9135 1.6795 1.4675 1.8575 1.1992 0.5875 0.1889 0.1621 –
‫ות‬ 0.5417 0.5572 0.1566 0.2351 0.1326 0.1547 0.4950 0.9792 0.4981 0.4969 0.5629
‫כול‬ 0.0204 0.6430 0.0313 0.6718 0.6895 0.2321 0.6629 0.6414 0.5496 1.0201 0.8845
‫ל‬ 0.1584 0.1171 0.4699 0.6382 1.1493 1.1609 0.2563 0.2790 0.3607 0.0418 0.0804
‫וא‬ 0.3373 0.4286 0.9398 0.8397 0.5127 0.3869 0.5834 0.4651 0.3263 0.4656 0.3618
‫אל‬ 0.9762 0.2972 0.8771 0.8397 0.3359 0.1547 0.1119 0.7882 0.4294 0.4812 0.5629
‫שר‬ 0.7155 0.6887 0.3759 0.3023 0.2121 0.2321 0.2416 0.1909 1.0305 0.4132 0.1608
‫אל‬ 0.7666 0.3029 0.7205 0.5710 0.2917 0.2321 0.2092 0.6952 0.4809 0.3766 0.4825
‫מה‬ 0.1226 1.1002 0.3759 0.5038 0.3890 0.5417 0.2504 0.6120 0.3435 0.1360 0.2010
‫וו‬ 0.4548 0.4029 0.5012 0.6046 0.4862 0.4643 0.2327 0.3133 0.4294 0.5702 0.3618
‫הל‬ 0.2095 0.5029 0.6892 0.4702 0.6100 0.4643 0.4626 0.3525 0.2232 0.2929 0.4020
‫וא‬ 0.2555 0.6715 0.1566 0.3023 0.4243 0.3869 0.2534 0.1713 0.6526 0.5179 0.4422
‫ול‬ 0.3373 0.4772 0.1566 0.1679 0.2475 0.1547 0.5303 0.2252 0.2748 0.5493 0.6031
‫ני‬ 0.2095 0.3315 0.1253 0.1007 0.1591 0.0773 0.7012 0.3623 0.0687 0.3295 0.4020
‫את‬ 0.6031 0.6544 0.0626 0.1679 0.0442 0.3869 0.0618 0.2203 0.5152 0.2981 0.2814

Figure 1 Frequency table of character trigrams (cropped).


2.1.4 Measuring the Distance between Texts


The final step is to establish the distance between any two documents, and
to group documents according to the respective measured distances between
them. This is achieved by adding up the differences between the values of each
of the selected features, which results in what is called the Delta measure of
distance between two documents.28 For example, taking the table in Figure 1
above, in order to measure the distance between the Damascus Document and
the War Scroll, the value of the trigram “ים” in CD will be subtracted from the
value of the same feature in 1QM (so |1.2471245 – 1.5667841| = 0.3196596). The
more similarly two documents use a particular feature, the lower this value will
be. In the same way, the values of all the other features in CD will be subtracted
from those in 1QM, and the average of all these results (in absolute values) will
indicate the overall distance between the two documents CD and 1QM. This
calculation is subsequently repeated to measure the distances between any
two documents, e.g. CD and 11Q19; CD and 4Q416; 4Q416 and 4Q417; 4Q416 and
1QHa etc.
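
Given the standardised table from the previous step, this calculation is short; a sketch, reusing the hypothetical standardise function and freq matrix introduced above:

    # Burrows' Delta between two documents: the mean of the absolute
    # differences between their standardised feature values (cf. n. 28).
    z <- standardise(freq)
    delta <- function(a, b) mean(abs(z[, a] - z[, b]))
    delta("CD", "1QM")  # e.g. the distance between CD and the War Scroll

    # All pairwise distances at once: the Manhattan distance divided by
    # the number of features equals the same mean absolute difference.
    d <- dist(t(z), method = "manhattan") / nrow(z)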
Obviously, a list with numerical distances between all the pairs of docu-
ments would look like a distance table between cities, which is of little use if
one wishes to easily categorize documents within a corpus. Fortunately, sta-
tistical programs come with a set of visualization techniques which produce
graphs that are much easier to read and interpret than tables, as will be shown
below.
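
A dendrogram, for instance, can be drawn directly from the Delta distance matrix computed above; Ward's criterion in this sketch is chosen purely for illustration, and stylo's own defaults may differ.

    # Hierarchical clustering of the pairwise Delta distances.
    hc <- hclust(d, method = "ward.D2")
    plot(hc, main = "Cluster analysis (Delta, character trigrams)")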

2.2 Testing Stylometric Analysis of the Dead Sea Scrolls


The most important question is, of course, whether this methodology yields
any reliable results in the case of the Dead Sea Scrolls, and whether these re-
sults are promising enough for the research to be continued. In the present

28  In technical notation:

\[ \Delta(A,B) = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{A_i-\mu_i}{\sigma_i}-\frac{B_i-\mu_i}{\sigma_i}\right| \]

with A, B being the texts between which the distance is to be measured; i any given fea-
ture; Ai the frequency of i in A; Bi the frequency of i in B; μi the mean frequency of i in the
whole corpus and σi the standard deviation of the frequencies of i in the corpus. This no-
tation can be algebraically simplified to:

\[ \Delta(A,B) = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|A_i-B_i\right|}{\sigma_i} \]

see S. Argamon, “Interpreting Burrows’s Delta: Geometric and Probabilistic Foundations,”
Literary and Linguistic Computing 23/2 (2008): 131–37.


section, I therefore present a number of preliminary test results, which aim to
demonstrate the feasibility of the approach, but which do not claim to be the
final word on the categorization of the Dead Sea Scrolls. For the latter goal,
more research needs to be done, as will be shown.
For all the aspects of the data analysis and the visualization of the results,
the ‘stylo’ library for the statistical program R was used. With the correct corpus
preparation, this package automatically generates the feature lists asked for by
the user, performs the distance measuring techniques on the data described
above, and visualises the results of the distance measurements. All this can be
done with a user-friendly graphical user interface (GUI), although command
line instruction is also possible.29
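
An illustrative non-interactive call is given below; the parameter names follow the package's documentation at the time of writing and may differ between versions of stylo, so they should be checked against the installed release.

    # Illustrative run: cluster analysis on the 500 most frequent character
    # trigrams, scored with Classic Delta; "PCR" instead of "CA" would
    # request Principal Component Analysis on a correlation matrix.
    library(stylo)
    stylo(gui = FALSE,
          corpus.dir = "corpus",          # one plain-text file per document
          analysis.type = "CA",           # cluster analysis (dendrogram)
          analyzed.features = "c",        # characters rather than words
          ngram.size = 3,                 # character trigrams
          mfw.min = 500, mfw.max = 500,   # the 500 most frequent features
          distance.measure = "dist.delta")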

2.2.1 Differentiating Non-biblical Dead Sea Scrolls from Masoretic Biblical Books
As a first test for the methodology described above, the question was raised
whether it would be possible to distinguish the non-biblical Dead Sea Scrolls
from the books of the Hebrew Bible on stylometric grounds. If the method is
unable to make this undisputed distinction, it is of no use to test it further.
For this goal, the texts from both corpora (with the Biblical books in their
unvocalised Masoretic form30) were grouped together in one research corpus,
and the software was asked to group the texts on the basis of the most fre-
quent words (MFW) and character trigrams (MFC 3-grams). For the current test
circumstances, only Scrolls with more than 300 words or identifiable parts of
words were used. The expectation was that smaller documents would not yield
reliable results, but future research may indicate that this limit should be ad-
justed. Each of the Dead Sea Scrolls and the Biblical books has been labelled
with the prefix “DSS_” and “HB_” respectively, but these labels are only used for
easy visual interpretation of the research results, and in no way tell the soft-
ware which texts to group together. Two well-established statistical methods
were tested: on the one hand, cluster analysis, which represents the results
in a dendrogram, in which documents that show the most stylistic similarity
sit closest to each other on the branches of the diagram. On the other hand,
Principal Component Analysis using a correlation matrix was applied, which

29  For an extensive description of the functionalities of this library, see the article by its
developers: M. Eder, J. Rybicki and M. Kestemont, “Stylometry with R: A Package for
Computational Text Analysis,” The R Journal 8/1 (2016): 107–21.
30  Without this limitation, the analysis would have made the obvious distinction between
vocalised and unvocalised text.


represents the results in a two-dimensional scatter plot with the two most im-
portant features in the corpus represented as the plot’s two axes.
The cluster analyses on the basis of the most frequent words and of both
the 200 and the 300 most frequent character trigrams divide the documents
perfectly into two groups, which completely coincide with the Dead Sea
Scrolls—Biblical books distinction. As an example, see the dendrogram of the
cluster analysis of the 300 MFC 3-grams (Figure 2), separating the DSS from the
Biblical books. It is important to note that it is the distance to the next “node”
in the diagram that indicates how closely related two documents are, with the
DSS and the Biblical books sitting on two separate branches of the diagram.
The fact that two documents stand on two adjacent lines does not necessarily
indicate similarity.31
Interestingly, the analyses on the basis of an even larger group of features
(400 and 500 MFC 3-grams) both have a few “misattributions,” explainable as
they might be: the 400 MFC analysis groups Nahum with the DSS, whereas the
500 MFC analysis groups Obadiah with the DSS and 4Q167 (pHosb) with the
biblical books. An analysis of only 100 MFC is also not completely accurate:
it groups 4Q176, 4Q381 (Non-Canonical Psalms) and 11Q5 among the biblical
books, whereas Qohelet is regarded as closer to the DSS than to the other bib-
lical books.32 Two remarks should be made concerning these results: first of
all, the classification is remarkably accurate, given the fact that it was done in
a completely automated way, using as the only features the automatically se-
lected most frequent words or most frequent character tri-grams, as discussed

31  The fact, for example, that 4Q405 and 4Q403 stand on the lines immediately below the
book of Ezra, does not mean at all that the former compositions are the closest to (Late-)
Biblical Books. The distance between books and compositions is rather a function of their
featuring in the same branch of the tree diagram. Which unrelated branches are charted
next to each other is a matter of coincidence in the visualization.
32  Obviously, these “misattributions” are easily explainable: 4Q176 consists mainly of an
anthology of quotations and comments on biblical consolations texts taken from the
books of Psalms, Zechariah, and mainly Deutero-Isaiah. 4Q381 is written on the basis of
and in the style of biblical Psalms, and moreover adopts a defective spelling. The non-
biblical portions of 11Q5 taken into consideration here also adopt a very biblical style,
as demonstrated by E.D. Reymond, New Idioms Within Old. Poetry and Parallelism in the
Non-Masoretic Poems of 11Q5 (=11QPsa) (Atlanta, GA: Society of Biblical Literature, 2011),
esp. pp. 192f. I thank Eibert Tigchelaar for these comments on the issue. Given the similarity
of these documents with biblical literature, it is all the more remarkable that the stylo-
metric analyses with more features correctly identify them as non-biblical.


Figure 2 Cluster Analysis of DSS and HB.

above.33 Second, the misattributions in the analyses on the basis of 400 and
500 MFC only concern short documents, which are known to be more diffi-
cult to categorize correctly. The results show, however, that in the vast majority
of cases the attributions of short documents were accurate. The method
thus proves itself able to handle short documents rather well. Moreover, the
misattributions are rather incidental and, hence, can be detected by running

33  For this particular task, analyses on the basis of MFC were slightly less reliable than the
ones on the basis of MFW. As will be shown below, for more fine-grained analyses, MFC
analyses are clearly better, however.


multiple analyses with different features. Finally, the misattribution of Qohelet,
linguistically understandable though it may be, seems to indicate that 100 MFC
are too few to yield reliable results, even though the 100 and even the 50 most
frequent words were enough to correctly cluster the DSS and HB books. There
is no assurance, however, that this low number of features will be sufficient for
more fine-grained clustering tasks.
Another way to represent the results of the distance measurements between
the DSS and the HB books is Principal Component Analysis.34 Whichever fea-
tures were used for the analysis, the two groups of documents are clearly dis-
tinguished, with no DSS standing closer to a HB book than to any other DSS
manuscript and vice versa. As an example, the scatter plot of the 500 MFC tri-
gram analysis is shown in Figure 3, but the plots of the other analyses look very
similar.
The two corpora are represented as homogeneous clouds, without any doc-
ument being located in the area occupied by the other corpus; in other words,
every biblical book lies closer to another biblical book than to any document
of the DSS, and vice versa. If the methodology were not able to
differentiate these two corpora, it would prove to be unsuited for any more
fine-grained analysis.35
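
For readers who wish to reproduce the reduction itself, the PCA step can also be run directly in base R; a sketch, assuming the hypothetical feature-by-document matrix freq from section 2.1.3 (features with constant frequency would have to be dropped before scaling):

    # PCA with documents as observations; scale. = TRUE standardises the
    # variables, which is equivalent to using the correlation matrix.
    pca <- prcomp(t(freq), scale. = TRUE)
    plot(pca$x[, 1:2], type = "n", xlab = "PC1", ylab = "PC2")
    text(pca$x[, 1:2], labels = rownames(pca$x), cex = 0.7)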

34  Since we are analyzing 500 variables of each document, many of which are interrelated,
the analysis should ideally be represented in a 500–dimensional plot, with one dimen-
sion for each of the 500 variables. This would of course result in a visualization and con-
ceptualization that is impossible for the human mind to grasp. Therefore, the statistical
technique of Principal Component Analysis or PCA is used to reduce the number of
the data set’s dimensions to two principal components, which can be represented in a two-
dimensional plot. This implies reducing the description of the data to a new set of com-
ponents which are uncorrelated, but which maximally preserve the information given by
the original variances of the variables and their correlations. As a result, the axes on a PCA
graph do not represent single variables of the original dataset, but an optimal transforma-
tion of the variables. In this analysis, the variation in the original data set is maximally
represented in a two-dimensional space. For more background, see I.T. Jolliffe, Principal
Component Analysis (New York: Springer, 1986).
35  It should be noted that this first research is used to demonstrate the feasibility of the
methodology, but that the graph should not be taken as the last word on the respective
distances between individual compositions, although it does give a first approximation of
which books show a degree of similarity. More research needs to be done in order to test
these distances and account for the results. Some indications of this future research are
given in the concluding paragraph of this article.


Figure 3 PCA of DSS and HB.


Note: Please note that this graph is intended to illustrate the overall feasibility of
the method advocated with this paper. Forthcoming studies will assess the different
factors affecting the clustering in more detail, and will provide detailed analyses of
text classifications.

2.2.2 Categorising the Dead Sea Scrolls: A Test Corpus


Since the first results of the application of stylometric analyses to Classical Hebrew
texts are promising, the question can be asked how well the methodology
will perform in clustering documents within the Dead Sea corpus. The situa-
tion is much more complicated in this case. Methodologically, it is common
practice to first test the method on a set of texts of known provenance, and
once the method yields accurate results in correctly clustering the texts ac-
cording to their provenance, to move to texts of unknown provenance. In
the Dead Sea corpus, however, there is no sub-corpus of texts of which the
provenance is indisputably established by which to calibrate our methods. We


therefore have to resort to the creation of an artificial corpus, in which the dif-
ferent “texts” consist of artificially chunked portions of larger compositions,
which are, for the sake of the experiment, considered to have a common prov-
enance. If the method is successful in clustering the different sections of single
texts together, its clustering of texts of unknown provenance arguably
displays some degree of accuracy as well.
For this test setup, the four largest non-biblical scrolls (1QHa, 1QM, 1QS,
11QTa) were chunked into sections of around 500 words. In order not to overrep-
resent the longer Hodayot and Temple scrolls, the number of sections of the
latter scrolls was limited to nine and ten, respectively. The result of this opera-
tion was the creation of a test corpus consisting of nine texts taken from the
Hodayot, nine from the War Scroll, six from the Rule Scroll and ten from the
Temple Scroll. These texts were given serial numbers (1QH1–9; 1QM1–9; 1QS1–6
and 11QT1–10), not referring to the column numbers but simply indicating the
sequential order of the newly created “texts.”
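
The chunking operation itself is straightforward; a sketch, using a hypothetical plain-text file “1QS.txt” prepared as described in section 2.1.1:

    # Split a scroll's word sequence into consecutive sections of roughly
    # 500 words and write each section to its own file.
    chunk_words <- function(words, size = 500) {
      split(words, ceiling(seq_along(words) / size))
    }
    words  <- scan("1QS.txt", what = character(), quiet = TRUE)
    chunks <- chunk_words(words)
    for (i in seq_along(chunks))
      writeLines(paste(chunks[[i]], collapse = " "), sprintf("1QS_%d.txt", i))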
The method performed remarkably well in clustering together the different
sections taken from each scroll. Whether one chooses to cluster the texts on the
basis of the most frequent words, or of the most frequent character trigrams,
the texts taken from each of the scrolls are correctly grouped together. As a first
example, the plot chart of the Principal Component Analysis on the basis of
the 500 most frequent words is presented here (Figure 4).
Except for the document 1QS_6 (which corresponds to 1QS 10:14–11:22), all
the documents are located closer to another document taken from the same
scroll than to any document taken from the other three scrolls. This same
grouping also becomes visible in the cluster analysis graph on the basis of the
500 most frequent character trigrams (Figure 5).
The algorithm correctly groups together all the created documents, except
for 1QS_6, mentioned above, which is considered to be closer to the 1QH-texts;
this might be explained by the hymnic nature of the end of 1QS. The cluster
analysis of Figure 5 also shows that 1QS and 1QM are considered to be most
closely related, while the Temple Scroll is regarded as standing at a larger dis-
tance from the three other scrolls. It might be tempting to take this analysis as
proof for the generally held opinion that the Rule, the War and the Hodayot
texts belong to the sectarian documents, while the Temple Scroll does not.
More research is needed, however, before the conclusion can be reached that
the documents originated in a common (linguistic) background. Most im-
portantly, the results should be cross-checked with other analyses. A similar
cluster analysis on the basis of the 500 most frequent words, for example, still
groups 1QM and 1QS closest together, but places 11QT closer to these two than
1QH. More research needs to be done on the validity of the distinction between


Figure 4 PCA of four largest non-biblical scrolls.


Note: Please note that this graph is intended to illustrate the overall feasibility of
the method advocated with this paper. Forthcoming studies will assess the different
factors affecting the clustering in more detail, and will provide detailed analyses of
text classifications.

sectarian and non-sectarian documents, as will be shown below. Nonetheless,
the fact that the method is able to correctly group relatively short documents
(around 500 words) on the basis of their most frequent features (in the latter
case: character trigrams) is highly promising, and justifies the efforts to further
develop this line of research.

2.2.3 Towards a Categorisation of the Dead Sea Scrolls


A final test for the feasibility of stylometric research on the Dead Sea Scrolls
can now be undertaken. The corpus contains a number of literary composi-
tions, of which multiple manuscripts are preserved. Well-known examples are


Figure 5 Cluster Analysis of four largest non-biblical scrolls.

the Instruction-texts (4Q415, 4Q416, 4Q417, 4Q418, 4Q42336), the Rule of the
Community-texts (1QS, 4Q258, 4Q25937), the Songs of the Sabbath Sacrifice

36  Excluding 1Q26 because of its limited length, far below the 300-word limit introduced
above.
37  Excluding the shorter Rule-texts.


(esp. the longer 4Q400, 4Q403 and 4Q405), the Damascus Document (CD,
4Q266, 4Q267, 4Q270, 4Q271).38 Moreover, these manuscripts are only partially
overlapping, so that they could be regarded, for the sake of methodology, as
different texts with a common provenance. If the methodology is to claim
any validity, it should be able to cluster these manuscripts together.39
Before turning to the analysis results, it should be noted again that, in gener-
al, manuscripts with fewer than 300 identifiable (and not reconstructed) words
or parts of words have been excluded from the analysis.
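
Operationally, this threshold is a simple filter; a sketch, assuming a hypothetical named list texts of word vectors produced by the preparation step:

    # Exclude manuscripts below the 300-word threshold used throughout.
    long_enough <- vapply(texts, function(w) length(w) >= 300, logical(1))
    texts <- texts[long_enough]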
When running a cluster analysis with a number of different parameters, it
immediately becomes clear that the most frequent character trigrams (MFC)
approach yields results that are clearly superior to the most frequent word
analysis (MFW).
In the MFC-analysis with the 500 most frequent character trigrams, the
manuscripts of all the documents mentioned above are in each case flawlessly
grouped together on a single branch of the dendrogram, as can be seen in the
graph below (Figure 6), without interference from other documents that do
not belong on the branch. The only exception would be that 1QSa (Rule of the
Congregation) shows up on the same branch as the Rule of the Community
texts 1QS, 4Q258 and 4Q259, and is even calculated to be closer to 1QS than
4Q259 is. Even though the effect of document size on the reliability of the
method still needs to be determined, it could be argued that 4Q259 is long
enough to allow for general classification as a Rule-text, but too short for more
detailed analyses.
Similar results also emerge in the Principal Component Analysis scatter
plot, presented in Figure 7.
Closer analysis of the graph not only shows that the stylometric method
proposed here is remarkably accurate at clustering the different manuscripts
of a single composition, but also visualizes a number of scholarly discussions
on the classification of the different compositions found around the Dead Sea.
Without drawing any firm conclusions, for which it is much too early, a num-
ber of observations can be made. First, Songs of the Sabbath Sacrifice is located
at a remarkable distance from the other compositions in the PCA plot, which

38  Excluding the shorter D manuscripts.


39  As is mandatory in the stylo R package used for the analysis, each manuscript has been
given a name consisting of two parts, the first being one of a number of dif-
ferent prefixes, which aid the statistical program R in visualising the data: each prefix is
represented in a different colour in the graph, so that the researcher can easily visually
inspect the clustering of the data. In the present printed version, the graphs’ colours have
been desaturated for technical reasons, but full colour graphs are available from the au-
thor upon request.

Dead Sea Discoveries 25 (2018) 57–82


Downloaded from Brill.com12/19/2018 07:55:33PM by shiliger@hotmail.com
via communal account
Computational Stylometric Approach to the Dead Sea Scrolls 79

Figure 6 Cluster Analysis of DSS.

could support what their editor, Carol Newsom, argued primarily on the basis of their content, namely, that these Songs do not represent sectarian writings.40 Second, the Instruction texts occupy a marginal position vis-à-vis the other compositions in the PCA plot, whereas in the cluster analysis
this is not the case. Remarkably, this reflects the current debate on the position of the Instruction composition, as illustrated by Devorah Dimant's refutation of Torleif Elgvin's claim that the composition is non-sectarian.41 Third, and most interestingly, the analysis does not show a sharp stylometric distinction between the writings commonly identified as sectarian and those that do not belong to this group. It is too early to determine what these results
demonstrate, but they do show that more research on the usefulness of this distinction is needed. Finally, it is noteworthy that genre effects seem to play a role in the classification presented above. Pesher(-like) texts such as 1QpHab, 4QpHosb, 4QpPs and 11QMelch cluster together, while hymnic and poetic texts like the Hodayot, 1QSb and the non-biblical portions of 11Q5 are grouped together, which does not necessarily say much about their common origin.

40  C.A. Newsom, "'Sectually Explicit' Literature from Qumran," in The Hebrew Bible and Its Interpreters, ed. W.H. Propp, B. Halpern and D.N. Freedman (Winona Lake, IN: Eisenbrauns, 1990), 167–87.
41  D. Dimant, "The Vocabulary of the Qumran Sectarian Texts," in Qumran und die Archäologie, ed. J. Frey, C. Claußen and N. Kessler (Tübingen: Mohr Siebeck, 2011), 347–95, here 386–89.

3 Future Prospects and Directions

This preliminary, survey-like exploration has demonstrated that it is possible to apply stylometric analysis to the Dead Sea Scrolls corpus with its specific
characteristics. At the same time, it has become clear that more research needs
to be done before the analysis can yield conclusive evidence for the classifica-
tion of the compositions found near the Dead Sea. In this concluding section,
I will briefly sketch a number of methodological issues that need to be ad-
dressed (while others will undoubtedly emerge) and point to a few specific
research questions that should be dealt with in future research.
Methodologically speaking, further testing is needed to determine the minimal size at which documents yield reliable results, and under what conditions, i.e. for what kinds of analyses. In the tests presented in this article, it was shown that documents of 300 words were correctly identified as belonging to a certain composition, while even shorter documents were correctly grouped together with other manuscripts of the same composition. It could be tested, for example, how small documents can become before the success rate in clustering fragments of known compositions falls below a certain threshold.
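
One way to operationalise such a test is stylo's built-in sampling, which splits every text into consecutive samples of a fixed size; the sizes below are illustrative, and the success rate would still have to be quantified (e.g. by the purity of the resulting clusters):

    library(stylo)
    # For decreasing sample sizes, check whether samples still cluster
    # with other samples of their own composition.
    for (n in c(500, 300, 200, 100)) {
      stylo(gui = FALSE, corpus.dir = "corpus",
            analyzed.features = "c", ngram.size = 3,
            mfw.min = 500, mfw.max = 500,
            sampling = "normal.sampling", sample.size = n,
            analysis.type = "CA")
    }
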
A second methodological question concerns the features chosen as the basis of stylometric research. In the present article, most frequent words and character trigrams were chosen, with the latter in certain cases providing the more satisfying results. It should be tested, however, whether other established features (e.g. part-of-speech n-grams, discourse elements) yield satisfactory results, and to what extent analyses based on different features can be combined. Moreover, the analyses based on automatically extracted features, as presented here, should be compared (in their performance and in their results) to classifications using the same algorithms but based on features consciously selected by scholars (e.g. typical vocabulary, particular morphological characteristics or scribal habits).
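
stylo supports such scholar-selected features through its word-list mechanism, so the comparison could reuse the same pipeline. In this sketch the list items are mere placeholders for whatever diagnostic vocabulary a scholar would choose:

    library(stylo)
    # Write a hand-picked feature list to the file stylo expects
    # ("wordlist.txt" in the working directory); the items below are
    # placeholders, not an actual proposal.
    writeLines(c("feature1", "feature2", "feature3"), "wordlist.txt")
    # Rerun the clustering on exactly these features instead of the
    # automatically extracted most frequent ones.
    stylo(gui = FALSE, corpus.dir = "corpus",
          analyzed.features = "w", ngram.size = 1,
          use.existing.wordlist = TRUE,
          analysis.type = "CA")
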
With stylometric methodologies shown to be applicable to the Dead Sea corpus, as this article has argued, a number of specific research questions can be addressed. In dealing with these questions, however, an important caveat is in order.
Authorship studies, from which the techniques described above are borrowed, typically presuppose modern conceptions of authorship, in which texts are written by a single author. This model obviously does not apply to the Dead Sea Scrolls, many of which, we may assume, have a more complex redactional history involving multiple authors and redactors. As the tests above illustrate, this does not necessarily invalidate the approach, but it does raise the question of what the distance between texts measures, if not single authorship. Are texts grouped together because they belong to the same milieu? Or because they are contemporaneous, and thus reflect the same diachronic stage of the language? Or because they share genre characteristics, a possibility already mentioned above? And can stylometric methods determine whether a text had a complex redactional history or was, on the contrary, written by a single author after all?
With this caveat in mind, the following research questions may serve as exemplary avenues for future research. On the one hand, the traditional classification of the Scrolls into a sectarian and a non-sectarian group could be addressed. Not only can the approach add new elements for the classification of such disputed compositions as the Instruction texts, the Songs of the Sabbath Sacrifice or the Temple Scroll; it can also test whether the distinction between sectarian and non-sectarian texts, disputed as it is, can be maintained on stylometric grounds.
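
One conceivable stylometric operationalisation is supervised classification: train on manuscripts whose status is uncontested and let the model assign the disputed ones. The sketch below uses stylo's classify() with its default directory names; the division into training classes is, of course, the researcher's hypothesis, and the file names are hypothetical:

    library(stylo)
    # primary_set/: uncontested texts, with the class encoded in the
    # file-name prefix before the underscore (e.g. sect_1QS.txt,
    # nonsect_4Q365.txt; hypothetical names);
    # secondary_set/: disputed texts such as Instruction or the Songs.
    classify(gui = FALSE,
             training.corpus.dir = "primary_set",
             test.corpus.dir = "secondary_set",
             analyzed.features = "c", ngram.size = 3,
             mfw.min = 500, mfw.max = 500,
             classification.method = "delta")
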
Another future research question is whether stylometric analysis can provide evidence for the literary growth or redactional history of certain compositions. Can, for example, the distinction between Community Hymns and Teacher Hymns in the Hodayot be maintained on stylistic grounds, and what can be said about the compositional nature of the Rule of the Community?
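
For questions of this kind, stylo's rolling classification (rolling.classify(), available in recent versions of the package) may be a natural tool: it slides a window through one continuous text and classifies each slice against reference samples, so that shifts in style become visible along the text's length. A sketch under obvious assumptions (the division into reference classes is precisely the hypothesis under test, and slice sizes are illustrative):

    library(stylo)
    # reference_set/: samples representing the proposed strata (e.g.
    # Community Hymns vs. Teacher Hymns); test_set/: the running text
    # of 1QHa as a single file. Directory names are stylo's defaults.
    rolling.classify(gui = FALSE,
                     training.corpus.dir = "reference_set",
                     test.corpus.dir = "test_set",
                     slice.size = 300, slice.overlap = 150,
                     analyzed.features = "c", ngram.size = 3,
                     classification.method = "svm")
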
Stylometric analysis thus has the potential to exert a lasting influence on Dead Sea Scrolls studies, and more research is urgently needed. This analysis, as a form of distant reading of the texts, will never supplant the meticulous close reading of texts that is so characteristic of the discipline, just as aerial archaeology and other imaging techniques will never make excavation work superfluous. But just as aerial archaeology or geomagnetic scanning can make visible patterns that cannot be seen while standing on the surface and can guide the archaeologist to the most promising areas for excavation, so stylometric analysis will be able to make visible patterns that have hitherto gone unnoticed, and will direct future detailed linguistic and textual studies of the Dead Sea Scrolls.
