You are on page 1of 21

A Statistical Exploration of Text Partition Into Constituents: The Case of

the Priestly Source in the Books of Genesis and Exodus


Gideon Yoffe Axel Bühler Nachum Dershowitz
gideon.yoffe@mail.huji.ac.il axel.buhler@unige.ch nachumd@tauex.tau.ac.il

Israel Finkelstein Eli Piasetzky Thomas Römer


fink2@tauex.tau.ac.il eip@tauphy.tau.ac.il thomas.romer@college-de-france.fr

Barak Sober
barak.sober@mail.huji.ac.il
Abstract prone to drastic (and occasionally abrupt) changes
over time (e.g., Nicholson, 2002).
We present a pipeline for a statistical textual ex-
ploration, offering a stylometry-based expla- When scrutinized, the interrelation between vari-
nation and statistical validation of a hypothe- ous literary features within the text may shed light
arXiv:2305.02170v3 [cs.CL] 10 Jun 2023

sized partition of a text. Given a parameteriza- on its historical context. From it, one may infer the
tion of the text, our pipeline: (1) detects literary number of authors, time(s) and place(s) of compo-
features yielding the optimal overlap between sition, and even the geopolitical, social, and theo-
the hypothesized and unsupervised partitions,
logical setting(s) (e.g., Wellhausen et al., 2009; von
(2) performs a hypothesis-testing analysis to
quantify the statistical significance of the opti-
Rad, 1972; Knohl, 2010; Römer, 2015).
mal overlap, while conserving implicit correla- Such scrutiny thus serves a double purpose: (1)
tions between units of text that are more likely to disambiguate and identify features in the text
to be grouped, and (3) extracts and quantifies that are insightful of the lexical sources of the par-
the importance of features most responsible for tition (e.g., Givón, 1991; Yosef, 2018; van Peursen,
the classification, estimates their statistical sta- 2019); (2) to attempt to trace these features to the
bility and cluster-wise abundance.
context of the text’s composition (e.g., Koppel et al.,
We apply our pipeline to the first two books in 2011; Pat-El, 2021).
the Bible, where one stylistic component stands
out in the eyes of biblical scholars, namely, the Works employing computational methods in text
Priestly component. We identify and explore stylometry – a statistical analysis of differences
statistically significant stylistic differences be- in literary, lexical, grammatical or orthographic
tween the Priestly and non-Priestly compo- style between genres or authors (Holmes, 1998) –
nents. have been introduced several decades ago (Tweedie
et al., 1996; Koppel et al., 2002; Juola et al., 2006;
1 Introduction
Koppel et al., 2009; Stamatatos, 2009), with bibli-
It is agreed among scholars that the extant ver- cal exegesis spurring very early attempts at com-
sion(s) of the Hebrew Bible is a result of various puterized authorship-identification tasks (Radday,
editorial actions (additions, redaction, and more). 1970; Radday and Shore, 1985). Since then, these
As such, it may be viewed as a literary patchwork methods have proved useful also in investigating
quilt – whose patches differ, for example, by genre, ancient (e.g., Kestemont et al., 2016; Verma, 2017;
date, and origin – and the distinction between bib- Kabala, 2020), and biblical texts as well (Koppel
lical texts as well as their relation to other ancient et al., 2011; Dershowitz et al., 2014; Roorda, 2015;
texts from the Near East is at the heart of biblical van Peursen, 2019), albeit to a humbler extent.
scholarship, theology, ancient Israel studies, and Finally, statistical-learning-based research, which
philology. Many debates in this field have been on- makes headway in an impressively diverse span
going for decades, if not centuries, with no verdict of disciplines, is taking its first steps in the con-
(e.g., the division of the biblical text into its “origi- text of ancient scripts (Murai, 2013; Faigenbaum-
nal” constituents; (Gunkel, 2006; von Rad, 2001; Golovin et al., 2016; Dhali et al., 2017; Popović
Wellhausen et al., 2009)). While several paradigms et al., 2021; Faigenbaum-Golovin et al., 2020). In
gained prevalence throughout the evolution of this the biblical context, Dershowitz et al. (2014, 2015)
discipline (Wellhausen et al., 2009; von Rad, 1972; addressed reproducing hypothesized partitions of
Zakovitch, 1980), the jury is still out on many re- various biblical corpora with a computerized ap-
lated hypotheses, such that these paradigms are proach as well, using features such as orthographic
differences and synonyms. In the first work – the 1893; Knohl, 2007; Römer, 2014; Faust, 2019).
Cochran-Mantel-Haenszel (CMH) test was applied Therefore, the distinction between P and nonP texts
as a means of hypothesis testing, with the null hy- is considered a benchmark in biblical exegesis.
pothesis that the synonym features are drawn from
the same distribution. While descriptive statistics 2 Methodology
were successfully applied to various classification
and attribution tasks of ancient texts (see above), 2.1 Data – Digital Biblical Corpora
uncertainty quantification has been insufficiently We use two digital corpora of the Masoretic variant
explored in NLP-related context (Dror et al., 2018), of the Hebrew Bible in (biblical) Hebrew: (1) a ver-
and in particular – in that of text stylometry. sion of the Leningrad codex, made freely available
In this work, we introduce a novel exploratory by STEPBible.1 This dataset comes parsed with
text stylometry pipeline, with which we: (1) find a full morphological and semantic tags for all words,
combination of textual parameters that optimizes prefixes, and suffixes. From this dataset, we utilize
the agreement between the hypothesized and un- the grammatical representation of the text through
supervised partitions, (2) test the statistical signif- phrase-dependent part-of-speech tags (POS). (2)
icance of the overlap, (3) extract features that are A digitally parsed version of the Biblia Hebraica
important to the classification, the proportion of Stuttgartensia (Roorda, 2018) (hereafter BHSA).2
their importance, and their relative importance in In the BHSA database, we consider the lexematic
each cluster. Each stage of this analysis was cross- (i.e., words reduced to lexemes) and grammatical
validated (i.e., applied on many randomly-chosen representation of the text through POS. The differ-
sub-samples of the corpus rather than applied once ence between the two POS-wise representations of
on the entire corpus) and tested for statistical sta- the text is that (1) encodes additional morpholog-
bility. ical information within tags, resulting in several
To perform (2) in a meaningful way for textual hundreds of unique tags, whereas (2) assigns one
analysis, we had to overcome the fact that label- out of 14 more “basic” grammatical tags3 to each
permutation tests do not consider correlations be- word. We refer to the POS-wise representation of
tween units of text, which affect their likelihood of (1) and (2) as “high-res POS” and “low-res POS”,
being clustered together. This results in unrealisti- respectively.
cally optimistic p-values, as, in fact, the hypothe-
sized partition does implicitly consider such corre- 2.2 Manual Annotation of Partition
lations (by, e.g., grouping texts of a similar genre, From biblical scholars, we receive verse-wise label-
subject, etc.). We overcome this by introducing ing, assigning each verse in the books of Genesis
a cyclic label-shift test which preserves the struc- (1533 verses) and Exodus (1213 verses) to one of
ture of the hypothesized partition, thus conserving two categories: P or nonP, made available online4 .
the implicit correlations therein. Furthermore, we Hereafter, we refer to this labeling as “scholarly
identify literary features that are responsible for the labeling”.
clustering, as opposed to intra-cluster feature selec- While the dating of P texts in the Pentateuch
tion techniques (e.g., Hruschka and Covoes, 2005; remains an open, heavily-debated question (e.g.,
Cai et al., 2010; Zhu et al., 2015), which seek to Haran, 1981; Hurvitz, 1988; Giuntoli and Schmid,
detect significant features within each cluster. This 2015), there exists a surprising agreement amongst
is also a novel approach to text stylometry. biblical scholars regarding what verses are affili-
With this pipeline, we examined the hypothet- ated to P (e.g., Knohl, 2007; Römer, 2014; Faust,
ical distinction between texts of Priestly (P) and 2019), amounting to an agreement of 96.5% and
non-Priestly (nonP) origin in the books of Gene- 97.3% between various biblical scholars for the
sis and Exodus. The Priestly source is the most books of Genesis and Exodus, respectively. We
agreed-upon constituent underlying the Pentateuch describe the computation of these estimates in Ap-
(i.e., Torah). The consensus over which texts are pendix A.
associated with P (mainly through semantic analy- 1
https://github.com/STEPBible/STEPBible-Data
sis) stems from the stylistic and theological distinc- 2
https://etcbc.github.io/bhsa
tion from other texts in the Pentateuch, streamlined 3
https://etcbc.github.io/bhsa/features/pdp/
4
across texts associated therewith (e.g., Holzinger, https://github.com/YoffeG/PnonP
2.3 Text Parameterization §2.3). We use tf-idf (term frequency divided by doc-
The underlying assumption in this work is that a ument frequency) to encode each verse, assigning
significant stylistic difference between two texts a relevance score to each feature therein (Aizawa,
of a roughly-similar genre (or, indeed, any number 2003). Works such as Fabien et al. (2020); Mar-
of distinct texts) should manifest in simple observ- cińczuk et al. (2021) demonstrate that in the ab-
ables in NLP, such as the utilization of vocabulary sence of a pre-trained neural language model, tf-idf
(i.e., distribution of words) and grammatical struc- provides an appropriate and often optimal embed-
ture. ding method in tasks of unsupervised classification
We consider three parameters whose different of texts. For a single combination of an n-gram
combinations emphasize different properties-, and size and running-window width (using either lex-
therefore yield different classifications- of the text: emes, low- or high-res POS), the tf-idf embedding
yields a single-feature matrix X ∈ Rn×d , where
• Lexemes, low-, and high-res POS: we consider n is the number of verses and d is the number of
three representations of the text: words in lexematic unique n-grams of rank n.
form and low- and high-res POS (see §2.1). This
It is important to note that this work aims to set
parameter tests the ability to classify the text based
a benchmark for future endeavors using strictly tra-
on vocabulary or grammatical structure.
ditional machinery throughout our analysis. To
• n-gram size: we consider sequences of consec- ensure collaboration with biblical scholars, our
utive lexemes/POS of different lengths (i.e., n- methodology allows for full interpretability of the
grams). Different sizes of n-grams may be remi- exploration process (see §D.4). This has threefold
niscent of different qualities of the text (e.g., Suen, importance: (1) the field of text stylometry, espe-
1979; Cavnar and Trenkle, 1994; Ahmed et al., cially that of ancient Hebrew texts, has hitherto
2004; Stamatatos, 2013). For example, a distinc- been explored statistically and computationally to
tion based on a larger n-gram may indicate a se- a limited extent, such that even when utilizing con-
mantic difference between texts or the use of longer servative text-embeddings, such as in this work,
grammatical modules in the case of POS (e.g., considerable insight can be gained concerning both
parallelisms in the books of Psalms and Proverbs the quality of the analysis and the philological ques-
(Berlin, 1979)). In contrast, a distinction made tion. (2) Obtaining benchmark results using tra-
based on shorter n-grams is indicative of more em- ditional embeddings is a pre-condition for imple-
bedded differences in the use of language (e.g., menting more sophisticated yet convoluted embed-
Wright and Chin, 2014; Litvinova et al., 2015). dings, such as pre-trained language models (e.g.
That said, both these examples indicate that a “false Shmidman et al., 2022) or self-trained/calibrated
positive” distinction can be made where there is a language models (Wald et al., 2021), which we
difference in genre (e.g., Feldman et al., 2009; Tang intend to apply in future works. (3) Finally, the in-
and Cao, 2015). This degeneracy requires careful terdisciplinary nature of this work and our desire to
analysis of the resulting clustering or inclusion of contribute to the field of biblical exegesis (and tra-
genre-specific texts only for the clustering phase. ditional philology in general) requires our results to
• Verse-wise running-window width: biblical be predominantly interpretable, such that they can
verses have an average length of 25 words. Hence, be subjected to complementary analysis by schol-
a single verse may not contain sufficient context ars from the opposite side of the interdisciplinary
that can be robustly classified. This is especially divide (Piotrowski, 2012).
important since our classification is based on sta-
tistical properties of features in the text (see §2.4). 2.5 Clustering
Therefore, we define a running window parameter,
which concatenates consecutive verses into a single We choose the k-means algorithm as our cluster-
super-verse (i.e., a running window of k would turn ing tool of choice (Hastie et al., 2009, Ch. 13.2.1)
the ith verse to the sequence of the i − k : i + k and hardwire the number of clusters to two, ac-
verses) to provide additional context. cording to the hypothesized P/nonP partition. The
justification for our choice of this algorithm is its
2.4 Text Embedding simple loss-optimization procedure, which is vital
We consider individual verses, or sequences of to our feature importance analysis and is discussed
verses, as the atomic constituents of the text (see in detail in §2.8.
We use the balanced accuracy (BA) score optimization procedure. For each shift, we perform
(Sokolova et al., 2006) for our overlap statistic, a the parameter optimization procedure (see §2.6),
standard measure designed to address asymmetries where we now have the shifted scholarly labels
between cluster sizes. instead of the original (“un-shifted”) ones. Thus,
Due to the stochastic nature of k-means (Bottou, we generate a distribution of our statistic under the
2004), every time it is used in this work – it is run null hypothesis, from which we derive a p-value.
50 times – (with different random initialization) – In Fig. 1, we present an intuitive scheme
and the result yielding the smallest k-means loss is where we demonstrate our rationale concerning
chosen. the hypothesis-testing procedure in text stylometry.

2.6 Optimizing for Overlap 2.8 Feature Importance and Interpretability


We perform a 2D grid search over a pre-determined of Classification
range of n-gram sizes and running-window widths Given a k-means labeling produced for the text,
to find the parameters combination yielding the op- which was embedded according to some combi-
timal overlap for low- or high-res POS lexemes. nation of parameters (§2.3), we wish to quantify
We test the statistical stability of each combination the importance of individual n-grams to the clas-
of these parameters (i.e., to ensure that the over- sification, the proportion of their importance, and
lap reached by each combination is statistically associate to which cluster they are characteristic
significant) by cross-validating the 2D grid search of.
over some number of randomly-chosen sub-sets Consider the loss function of the k-means algo-
of verses, from which we derive the average over- rithm:
k
lap value for each combination of parameters and X
argmin |Si | · var(Si ),
the standard deviation thereof. We describe the S i=1
optimization process in detail in Appendix B.
where k = 2 is the number of desired clusters, S is
2.7 Hypothesis Testing and Validating Results the group of all potential sets of verses, split into k
Through hypothesis testing, we establish the statis- clusters, |Si | is the size of the ith cluster (i.e., num-
tical significance of the achieved optimal overlap ber of verses therein) and var(Si ) is the variance
value between the unsupervised and hypothesized of the ith cluster. That is, the k-means aims to min-
partitions. To derive a p-value from some empiri- imize intra-class variance. Equivalently, we could
cal null distribution, we consider the assumption optimize for the maximization of the inter-cluster
that the hypothesized partition, manifested in the variance (i.e., the variance between clusters), given
scholarly labeling, exhibits a specific formulaic par- by X X
tition of chunks of the text. A formulaic partition, argmax ∥xi − xj ∥2 . (1)
S
in turn, suggests that verses within each of the P i∈S1 j∈S2
(nonP) blocks are correlated – a fact that the stan- Let D ∈ R|S1 |·|S2 |×d denote the matrix of inter-
dard label-permutation test is intrinsically agnostic cluster differences whose columns are Dℓ , which
of, as it permutes labels without considering po- for i ∈ {1, . . . , |S1 | − 1}, j ∈ {1, . . . |S2 |} and
tential correlations between verses (Fig. 1 left). ℓ = i · |S2 | + j are defined by
Thus, the null distribution synthesized through a
series of permutations would represent an overly- (D)ℓ = xi − xj .
optimistic scenario that does not correspond to any
conceivable scenario in text stylometry. Then, applying PCA to D would yield the axis
To remedy this, we devise a more prohibitive sta- across which Eq. (2.8) is optimized as the first
tistical test. Instead of permuting the labels to have principal component. This component represents
an arbitrary order, we perform a cyclic shift test of the axis of maximized variance, and each feature’s
the scholarly labeling (Fig. 1 right). This procedure contribution is given by its corresponding loading.
retains the scholarly labels’ hypothesized structure Finally, it can be shown that this principal axis
but shifts them across different verses. We perform could also be computed by subtracting the centroids
as many cyclic shifts as there are labels (i.e., num- of the two clusters. Therefore, to leverage this
ber of verses) in each book by skips of twice the observation and extract the features’ importance,
largest running-window width considered in the we perform the following procedure:
Similarity Axis 1
a: text 1

Similarity Axis 2
b: text 2
c: text 3 a a b b b c c b b
[0,0,1,1,1,0,0,1,1]; 𝑛 0 = 4, 𝑛 1 = 5
Expert Labels (ℒ! )

a a b b b c c b b a a b b b c c b b
[1,0,1,0,1,0,1,0,1] [1,1,0,0,1,1,1,0,0]
Permuted Labels (ℒ" ) Cyclically-shifted Labels (ℒ# )
(ℒ* ∈ 𝔏!"#$ ) (ℒ) ∈ 𝔏%&%'(% ⊂ 𝔏!"#$ )

𝑛 1 +𝑛 0
|𝔏!"#$ | = ≫ |𝔏%&%'(% | = 𝑛 1 + 𝑛(0)
𝑛 1

𝒫 𝒟(ℒ!" , ℒ# ) > 𝒟(ℒ !" , ℒ $ ) ≫ 𝒫 𝒟(ℒ !" , ℒ $ ) > 𝒟(ℒ !" , ℒ # )

Figure 1: Hypothesis-testing rationale: consider a case of three distinct texts, divided into nine smaller units, whose
relative similarity is plotted in the top left diagram. These units of texts are distinguished into two classes by
an expert and are labeled as either 0 or 1 (L1 ; where n(0), n(1) are the sums of units of text labeled as 0 and 1,
respectively). Here, the correlation between adjacent units of text (which is unknown a-priori) is implicitly taken
into account, to some degree, by the expert. Consider now the group of all possible permutations (Lperm ) of the
expert and the group containing all the possible cyclic shifts thereof (Lcyclic ). The latter is a sub-group of – but is
much smaller than – the first (Lcyclic ⊂ Lperm ). For sufficiently long texts, we argue that the probability (P) of
randomly permuting the expert labeling such that it would capture some intrinsic correlations between the units of
text (L3 ) and would yield a substantial overlap (expressed in the figure as the function D, which in our case is the
BA score) with an unsupervised clustering labeling (Lus ) of the text – or at least equivalent to that of the expert
labeling – is considerably lower than that of a cyclically-shifted expert labeling (L2 ), which, up to a shift, assumes
the implicit correlations between units of text manifested in L1 . These, in turn, are more likely to yield equivalent
overlap values to L1 . A cyclic-shift test is, therefore, a more prohibitive hypothesis-testing routine than the naive
permutation test.

• Compute the principal separating axis of the two each important feature w.r.t. both clusters – assign-
clusters by subtracting the centroid of S1 from that ing it a score between −1 : 1. A score of 1 (−1)
of S2 . indicates that the feature is solely abundant in its as-
• The contribution of each n-gram feature to the sociated (opposite) cluster. Thus, a feature’s abun-
cluster separation is given by its respective loading dance nearer zero indicates that its contribution to
in the principal separating axis. the clustering is rather in its combination with other
• Since tf-idf assigns strictly non-negative scores features than a standalone indicator.
to each feature, the signs of the principal separating
axis’ loadings allow us to associate the importance 3 Results
of each n-gram to a specific cluster (see Appendix We apply a cross-validated optimization analysis
C). to the three representations of the book (§2.6) by
• To determine the stability of the importance of performing 2D grid searches for 20 randomly cho-
n-grams across multiple sub-samples of each book, sen sub-sets of 250 verses for each representation.
we perform a cross-validation routine (similar to We perform a cross-validated cyclic hypothesis-
the one performed in §2.6). Explicitly, we perform testing analysis for the optimal overlap value (§2.7)
all the steps listed above for some number of simu- using five randomly-chosen sub-sets of verses, sim-
lations (i.e., randomly sample a sub-set of verses ilarly to the above – and derive a p-value. Finally,
and extract the importance vector) and compute the we perform a cross-validated feature importance
mean and standard deviation of all simulations. analysis for every representation (§2.8), over 100
• Finally, we compute the relative uniqueness of randomly-chosen sub-sets of verses, similar to the
above. and perform a cross-validated optimization anal-
In Table 1, we list the cross-validated results of ysis with low- and high-res POS (see §2.6). We
each representation for both books and the derived then compare the resulting optimal overlap values
p-value. In Fig. 2, we visualize results for all stages of each time. We plot the results of this experiment
of our analysis applied to the lexematic represen- in Fig. 4.
tation of the book of Genesis. In Appendix D, we We find that, as expected, the optimal overlap
plot the results for all stages of our analyses for drops as a function of the size of the removed P-
both books and discuss them in detail. Appendix associated text. Interestingly, the optimal overlap
E presents a detailed biblical-exegetical analysis increases when a smaller block is removed. Ad-
of our results and an expert’s evaluation of our ditionally, unlike in the case of Genesis, the low-
approach. res POS representation doesn’t lead to increased
optimal overlap relative to the high-res POS rep-
3.1 Understanding the Discrepancy in Results resentation. This suggests that: (1) the fluctuation
Between the Books of Genesis and Exodus of the optimal overlap indicates that our pipeline is
In Table 1, we list our optimal overlap values for all sensitive to some semantic field associated with the
textual representations of both books. Notice that two large P blocks rather than to a global stylistic
there exists a statistically-significant discrepancy difference between P/nonP. (2) In cases of more
between the optimal overlap values between both sporadically-distributed texts that are stylistically
books that is as follows: different from the text in which they are embedded
For Exodus, all three representations reach the – one representation of the text is not systematically
same optimal overlap of roughly 88%. In con- preferable to others.
trast, in Genesis – there is a difference between the
achieved optimal overlap values for the lexematic 4 Conclusions
and low-res POS representations on the one hand
We examined the hypothetical distinction between
(73%) and high-res POS representation on the other
texts of priestly (P) and non-priestly (nonP) ori-
(65%).
gin in the books of Genesis and Exodus, which
When examining the verses belonging to each we explored with a novel unsupervised pipeline
cluster when classified with the optimal parameter for text stylometry. We sought a combination of a
combination, it is evident that most of the overlap- running-window width (i.e., the number of consec-
ping P-associated verses in Exodus are grouped in utive verses to consider as a single unit of text) and
two blocks of P-associated of text, spanning 243 n-gram size of lexemes, low- or high-resolution
and 214 verses. These make considerable outliers phrase-dependent parts-of-speech that optimized
in the size distribution of P-associated blocks in the overlap between the unsupervised and hypothe-
both books (Fig. 3), which may be related to the sized partitions. We established the statistical sig-
observed discrepancy. nificance of our results using a cyclic-shift test,
This discrepancy begs two questions: which we show to be more adequate for text sty-
lometry problems than a naive permutation test.
1. Are the linguistic differences between P/nonP Finally, we extracted n-grams that contribute the
– which may be captured by our analysis – most to the classification, their respective propor-
that are attenuated for shorter sequences of P tions, statistical robustness, and correlation to other
texts? features. We achieve optimal, statistically signifi-
2. Does the high overlap in the book of Exo- cant overlap values of 73% and 90% for the books
dus arise due to an implicit sensitivity to the of Genesis and Exodus, respectively.
generic/semantic uniqueness of the two big P- We find the discrepancy in optimal overlap val-
associated blocks rather than a global stylistic ues between the two books to stem from two fac-
difference between P/nonP? tors: (1) A more sporadic distribution of P texts
in Genesis, as opposed to a more formulaic one
To examine this, we perform the following ex- in Exodus. (2) The sensitivity of our pipeline to
periment: each time, we remove a different P- a distinct semantic field manifested in two large P
associated block (1st largest, 2nd largest, 3rd blocks in Exodus, comprising the majority of the
largest, and the 1st + 2nd largest) from the text P-associated text therein.
Book Opt. overlap (lexemes) Opt. overlap (low-res POS) Opt. overlap (high-res POS) p-value
Genesis 72.95±6.45% (rw: 4, n: 1) 65.03±5.64% (rw: 14, n: 1) 73.96±2.91% (rw: 4, n: 1) 0.08 (low-res POS)
Exodus 89.23±2.53% (rw: 8, n: 2) 88.63±1.96% (rw: 9, n: 4) 86.53±2.91% (rw: 6, n: 2) 0.06 (high-res POS)

Table 1: Cross-validated optimization and hypothesis testing results: for each representation, we list the optimal
overlap value, its respective uncertainty, and combination of parameters (rw for running-window width and n for
n-gram size).

Optimal overlap as a function of cyclic shift of labels. p-value: 0.08


16

0
55 60 72
Overlap [%]
Overlap [%]

Cluster 1 (P) Cluster 2 (nonP)

Figure 2: A statistical exploration of the hypothesized partition of the books of Genesis and Exodus into Priestly
and non-Priestly constituents: results for the book of Genesis (lexemes representation). Upper Left Panel –
Optimization: Cross-validated grid-search over verse running-window widths and n-gram sizes to identify the
combination yielding an optimal overlap. The combination, value, and uncertainty of the optimal overlap are plotted
on top of the panel, and all combinations whose overlap value is within 1σ of the optimal overlap value are marked
in red cells. Upper Right Panel – Hypothesis Testing: Cross-validated hypothesis testing, where we simulate the
null hypothesis distribution through a cyclic shift of the hypothesized P/nonP labeling. The p-value is measured with
respect to the optimal overlap value minus its standard deviation. Bottom Panels – feature importance Analysis:
Cross-validated important feature analysis (running-window 4 and n-gram size 1) and their statistical stability,
displaying features bearing 75% of the explained variance. Features in the left and right panels are important to
the P-associated and nonP-associated clusters, respectively. The small numbers above each error bar indicate the
cluster-wise abundance of the feature.
classification, to quantify them, and subject them
6 Genesis to complementary philological analysis (see Ap-
Exodus pendix E) – requires that they be interpretable. This
5
constraint limits the ability to implement state-of-
4 the-art language-model-based embeddings without
Occurence

3
devising the required framework for their interpre-
tation. Consequently, using traditional embeddings
2 – which encode mostly explicit lexical features (e.g.,
1 see §2.4) – limits the complexity of the analyzed
textual phenomena and is therefore agnostic of po-
0
0 50 100 150 200 250 tential signal that is manifested in more complex
P-associated block size
features.
Figure 3: Size distribution histogram of P-associated • In text stylometry questions, especially those re-
blocks in the books of Genesis and Exodus. lated to ancient texts, it is often problematic (and
even impossible) to rely on a benchmark training
set with which supervised statistical learning can
take place. This, in turn, means that supervised
learning in such tasks must be implemented with
w/o 3rd
block extreme caution so as not to introduce a bias into
Entire a supposedly-unbiased analysis. Therefore, imple-
book
w/o 2nd w/o 1st menting supervised learning techniques for such
block block
tasks requires a complementary framework that
w/o 1st
could overcome such potential biases. In light of
+2nd this, our analysis involves predominantly unsuper-
blocks
vised exploration of the text, given different param-
eterizations.
• Our ability to draw insight from exploring the
Figure 4: Optimal overlap of a cross-validated optimiza-
stylistic differences between the hypothesized dis-
tion analysis of the book of Exodus, after removing
different P-associated blocks. Black dots: high-res tinct texts relies heavily on observing significant
POS. Red dots: low-res POS. overlap between the hypothesized and unsuper-
vised partitions. Without it, the ability to discern
the similarity between the results of our pipeline
Through complementary exegetical and statisti- is greatly obscured, as the pipeline remains essen-
cal analyses, we show that our methodology dif- tially agnostic of the hypothesized partition. Such
ferentiates the unique generic style of the Priestly a scenario either deems the parameterization irrele-
source, characterized by lawgiving, cult instruc- vant to the hypothesized partition or disproves the
tions, and streamlining a continuous chronological hypothesized partition. Breaking the degeneracy
sequence of the story through third-person narra- between these two possibilities may entail consid-
tion. This observation corroborates and hones the erable additional analysis.
stance of most biblical scholars.
6 Acknowledgements
5 Limitations
We thank Dr. Rotem Dror for her kind assistance
• The interdisciplinary element – at the heart of this and contribution to the editing process and Ziv
work – mandates that our results be interpretable Ben-Aharon for providing technical guidance in
and relevant to scholars from the opposite side operating the HUJI cluster.
of the methodological divide (i.e., biblical schol-
ars). This, in turn, introduces constraints to our
framework – the foremost is choosing appropriate References
text-embedding techniques. As discussed in §2.4 Bashir Ahmed, Sung-Hyuk Cha, and Charles Tap-
and §2.8, the ability to extract specific lexical fea- pert. 2004. Language identification from text us-
tures (i.e., unique n-grams) that are important to the ing n-gram based cumulative frequency addition. In
Proceedings of Student/Faculty Research Day, vol- Shira Faigenbaum-Golovin, Arie Shaus, Barak Sober,
ume 12. CSIS, Pace University. Eli Turkel, Eli Piasetzky, and Israel Finkelstein. 2020.
Algorithmic handwriting analysis of the Samaria in-
Akiko Aizawa. 2003. An information-theoretic perspec- scriptions illuminates bureaucratic apparatus in bibli-
tive of tf–idf measures. Information Processing & cal Israel. PLOS ONE, 15(1):e0227452.
Management, 39(1):45–65.
Avraham Faust. 2019. The world of P: The mate-
Adele Berlin. 1979. Grammatical aspects of biblical rial realm of priestly writings. Vetus Testamentum,
parallelism. Hebrew Union College Annual, 50:17– 69(2):173–218.
43.
Sergey Feldman, Marius A. Marin, Mari Ostendorf, and
Maya R. Gupta. 2009. Part-of-speech histograms for
Léon Bottou. 2004. Stochastic learning. In Sum-
genre classification of text. In 2009 IEEE Interna-
mer School on Machine Learning, pages 146–168.
tional Conference on Acoustics, Speech and Signal
Springer.
Processing, pages 4781–4784. IEEE.
Deng Cai, Chiyuan Zhang, and Xiaofei He. 2010. Un- Federico Giuntoli and Konrad Schmid. 2015. The Post-
supervised feature selection for multi-cluster data. Priestly Pentateuch: New Perspectives on its Redac-
In Proceedings of the 16th ACM SIGKDD Interna- tional Development and Theological Profiles. Mohr
tional Conference on Knowledge Discovery and Data Siebeck.
Mining, pages 333–342.
Talmy Givón. 1991. The evolution of dependent
William B. Cavnar and John M. Trenkle. 1994. N-gram- clause morpho-syntax in Biblical Hebrew. In Eliza-
based text categorization. In Proceedings of SDAIR- beth Closs Traugott and Berund Heine, editors, Ap-
94, 3rd Annual Symposium on Document Analysis proaches to SGrammaticalization, volume 2, pages
and Information Retrieval, volume 161175. Citeseer. 257–310. John Benjamins.

Idan Dershowitz, Navot Akiva, Moshe Koppel, and Hermann Gunkel. 2006. Creation and Chaos in the
Nachum Dershowitz. 2015. Computerized source Primeval Era and the Eschaton: Religio-Historical
criticism of biblical texts. Journal of Biblical Litera- Study of Genesis 1 and Revelation 12. Eerdmans,
ture, 134(2):253–271. Grand Rapids, MI. Trans. by K. William Whitney,
Jr.; original edition 1895.
Idan Dershowitz, Nachum Dershowitz, Tomer Hasid,
Mehahem Haran. 1981. Behind the scenes of history:
and Amnon Ta-Shma. 2014. Orthography and bibli-
Determining the date of the priestly source. Journal
cal criticism. In Proceedings of Digital Humanities
of Biblical Literature, 100(3):321–333.
(DH 2014, Lausanne, Switzerland), pages 451–453.
T. Hastie, R. Tibshirani, and J. H. Friedman. 2009. The
Maruf A. Dhali, Sheng He, Mladen Popović, Eibert Elements of Statistical Learning: Data Mining, In-
Tigchelaar, and Lambert Schomaker. 2017. A digital ference, and Prediction. Springer Series in Statistics.
palaeographic approach towards writer identification Springer.
in the Dead Sea Scrolls. In Proceedings of the 6th
International Conference on Pattern Recognition Ap- David I. Holmes. 1998. The evolution of stylometry
plications and Methods. SCITEPRESS - Science and in humanities scholarship. Literary and Linguistic
Technology Publications. Computing, 13(3):111–117.

Rotem Dror, Gili Baumer, Segev Shlomov, and Roi Heinrich Holzinger. 1893. Einleitung in den Hexateuch,
Reichart. 2018. The hitchhiker’s guide to testing volume 1. Mohr Siebeck.
statistical significance in natural language processing.
Eduardo R. Hruschka and Thiago F. Covoes. 2005. Fea-
In Proceedings of the 56th Annual Meeting of the
ture selection for cluster analysis: an approach based
Association for Computational Linguistics (Volume
on the simplified silhouette criterion. In International
1: Long Papers), pages 1383–1392.
Conference on Computational Intelligence for Mod-
elling, Control and Automation and International
Maël Fabien, Esaú Villatoro-Tello, Petr Motlicek, and
Conference on Intelligent Agents, Web Technologies
Shantipriya Parida. 2020. Bertaa: Bert fine-tuning
and Internet Commerce (CIMCA-IAWTIC’06), vol-
for authorship attribution. In Proceedings of the 17th
ume 1, pages 32–38. IEEE.
International Conference on Natural Language Pro-
cessing (ICON), pages 127–137. Avi Hurvitz. 1988. Dating the priestly source in light
of the historical study of biblical Hebrew. a century
Shira Faigenbaum-Golovin, Arie Shaus, Barak Sober, after Wellhausen. Zeitschrift für die alttestamentliche
David Levin, Nadav Na’aman, Benjamin Sass, Eli Wissenschaft, 100:88–100.
Turkel, Eli Piasetzky, and Israel Finkelstein. 2016.
Algorithmic handwriting analysis of Judah’s military Louis C. Jonker. 2012. Reading the Pentateuch’s ge-
correspondence sheds light on composition of bibli- nealogies after the exile: The Chronicler’s usage of
cal texts. Proceedings of the National Academy of Genesis 1–11 in negotiating an all-Israelite identity.
Sciences, 113(17):4664–4669. Old Testament Essays, 25(2):316–333.
Patrick Juola, John Sofko, and Patrick Brennan. 2006. A Michael Piotrowski. 2012. NLP and digital humanities.
prototype for authorship attribution studies. Literary In Natural Language Processing for Historical Texts,
and Linguistic Computing, 21(2):169–178. pages 5–10. Springer.

Jakub Kabala. 2020. Computational authorship attri- Mladen Popović, Maruf A. Dhali, and Lambert
bution in medieval Latin corpora: the case of the Schomaker. 2021. Artificial intelligence based writer
Monk of Lido (ca. 1101–08) and Gallus Anonymous identification generates new evidence for the un-
(ca. 1113–17). Language Resources and Evaluation, known scribes of the Dead Sea Scrolls exemplified
54(1):25–56. by the Great Isaiah Scroll (1QIsaa). PLOS ONE,
16:1–28.
Mike Kestemont, Justin Stover, Moshe Koppel, Folgert
Karsdorp, and Walter Daelemans. 2016. Authenti- Yehuda T. Radday. 1970. Isaiah and the computer: A
cating the writings of Julius Caesar. Expert Systems preliminary report. Computers and the Humanities,
with Applications, 63:86–96. 5(2):65–73.

Yehudah T. Radday and Haim Shore. 1985. Genesis:


Israel Knohl. 2007. The Sanctuary of Silence: The
An authorship study in computer-assisted statistical
Priestly Torah and the Holiness School. Eisenbrauns.
linguistics, volume 103 of Analecta Biblica. Biblical
Israel Knohl. 2010. The Divine Symphony: The Bible’s Institution Press.
Many Voices. Jewish Publication Society. Thomas Römer. 2014. From the call of Moses to the
parting of the sea: Reflections on the priestly version
Moshe Koppel, Navot Akiva, Idan Dershowitz, and of the Exodus narrative. In The Book of Exodus,
Nachum Dershowitz. 2011. Unsupervised decom- pages 121–150. Brill.
position of a document into authorial components.
In Proceedings of the 49th Annual Meeting of the Thomas Römer. 2015. The Invention of God. Harvard
Association for Computational Linguistics: Human University Press.
Language Technologies, pages 1356–1364, Portland,
OR. Association for Computational Linguistics. Dirk Roorda. 2015. The Hebrew Bible as Data:
Laboratory-Sharing-Experiences. CLARIN in the
Moshe Koppel, Shlomo Argamon, and Anat Rachel Shi- Low Countries.
moni. 2002. Automatically categorizing written texts
by author gender. Literary and Linguistic Computing, Dirk Roorda. 2018. Coding the Hebrew Bible: Linguis-
17(4):401–412. tics and literature. Research Data Journal for the
Humanities and Social Sciences, 3(1):27–41.
Moshe Koppel, Jonathan Schler, and Shlomo Argamon.
2009. Computational methods in authorship attribu- Avi Shmidman, Joshua Guedalia, Shaltiel Shmidman,
tion. Journal of the American Society for Information Cheyn Shmuel Shmidman, Eli Handel, and Moshe
Science and Technology, 60(1):9–26. Koppel. 2022. Introducing berel: Bert embed-
dings for rabbinic-encoded language. arXiv preprint
T. A. Litvinova, P. V. Seredin, and O. A. Litvinova. 2015. arXiv:2208.01875.
Using part-of-speech sequences frequencies in a text
to predict author personality: a corpus study. Indian Marina Sokolova, Nathalie Japkowicz, and Stan Sz-
Journal of Science and Technology, 8:93. pakowicz. 2006. Beyond accuracy, f-score and roc:
a family of discriminant measures for performance
Michał Marcińczuk, Mateusz Gniewkowski, Tomasz evaluation. In Australasian Joint Conference on Arti-
Walkowiak, and Marcin B˛edkowski. 2021. Text doc- ficial Intelligence, pages 1015–1021. Springer.
ument clustering: Wordnet vs. TF-IDF vs. word em-
Efstathios Stamatatos. 2009. A survey of modern au-
beddings. In Proceedings of the 11th Global Wordnet
thorship attribution methods. Journal of the Ameri-
Conference, pages 207–214.
can Society for information Science and Technology,
60(3):538–556.
Hajime Murai. 2013. Exegetical Science for the In-
terpretation of the Bible: Algorithms and Software Efstathios Stamatatos. 2013. On the robustness of au-
for Quantitative Analysis of Christian Documents. thorship attribution based on character n-gram fea-
In Software Engineering, Artificial Intelligence, Net- tures. Journal of Law and Policy, 21(2):421–439.
working and Parallel/Distributed Computing, pages
67–86. Springer International Publishing. Ching Y. Suen. 1979. N-gram statistics for natural
language understanding and text processing. IEEE
Ernest Nicholson. 2002. The Pentateuch in the Twenti- Transactions on Pattern Analysis and Machine Intel-
eth Century. Oxford University Press. ligence, PAMI-1(2):164–172.

Na’ama Pat-El. 2021. Syntactic Aramaisms as a tool Xiaoyan Tang and Jing Cao. 2015. Automatic genre
for the internal chronology of Biblical Hebrew. In classification via n-grams of part-of-speech tags.
Diachrony in Biblical Hebrew, pages 245–264. Penn Procedia-Social and Behavioral Sciences, 198:474–
State University Press. 478.
Fiona J. Tweedie, Sameer Singh, and David I. Holmes.
1996. Neural network applications in stylometry:
Appendices
The Federalist papers. Computers and the Humani-
ties, 30(1):1–10. A Estimating Scholarly Consensus of
P-associated Texts in Genesis and
Wido van Peursen. 2019. A Computational Approach
to Syntactic Diversity in the Hebrew Bible. Journal Exodus
of Biblical Text Research, 44:237–253.
To quantify the consensus amongst biblical schol-
Mayuri Verma. 2017. Lexical Analysis of Religious ars concerning the distinction between P/nonP texts
Texts using Text Mining and Machine Learning in Genesis and Exodus, we consider two sets of la-
Tools. International Journal of Computer Applica-
belings of P/nonP: the first is that provided by bib-
tions, 168(8):39–45.
lical scholars, and the other is similar labeling used
Gerhard von Rad. 1972. Genesis – A commentary, 3rd in the work of Dershowitz et al. (2015), who also
rev. ed. edition. S.C.M. Press, London. Trans. by apply computational methods in an attempt to de-
John H. Marks.
tect a meaningful dichotomy between P and nonP
Gerhard von Rad. 2001. Old Testament Theology: The texts as well, albeit under a different paradigm. In
theology of Israel’s historical traditions, volume 1. their work, Dershowitz et al. consider P/nonP dis-
Westminster John Knox Press.
tinctions of three independent biblical scholars and
Yoav Wald, Amir Feder, Daniel Greenfeld, and Uri compile a single “consensus” labeling of verses
Shalit. 2021. On calibration and out-of-domain gen- over the affiliation of which all three scholars agree
eralization. Advances in neural information process- (1291 verses in Genesis and 1057 verses in Ex-
ing systems, 34:2215–2227.
odus), except for explicitly incriminating P texts
Julius Wellhausen, J. Sutherland Black, Allan Menzies, (such as genealogical trees that are strongly affili-
and William Robertson Smith. 2009. Prolegomena ated with P (e.g., Jonker, 2012)). We find a 96.5%
to the History of Israel. Cambridge University Press.
and 97.3% agreement between the two labelings
William R. Wright and David N. Chin. 2014. Person- for the books of Genesis and Exodus, respectively.
ality profiling from text: Introducing part-of-speech
n-grams. In International Conference on User Model- B Cross-validated Overlap Optimization
ing, Adaptation, and Personalization, pages 243–253.
Springer. The cross-validated overlap optimization is per-
Ofer Yosef. 2018. Determining orthography on the basis
formed as follows:
of Masoretic notes. In The Masora on Scripture and
Its Methods, chapter 3, pages 34–48. De Gruyter, • For lexemes/POS, we consider a range of n-
Berlin. gram sizes and a range of running-window
widths, through combinations of which we op-
Yair Zakovitch. 1980. A study of precise and partial
derivations in biblical etymology. Journal for the timize the classification overlap. These ranges
Study of the Old Testament, 5(15):31–50. are determined empirically by observing the
overlap values decrease monotonically as the
Guo-Niu Zhu, Jie Hu, Jin Qi, Jin Ma, and Ying-Hong
Peng. 2015. An integrated feature selection and
n-gram sizes/running-window widths become
cluster analysis techniques for case-based reasoning. too large (see Figs. 6, 5). This produces a 2D
Engineering Applications of Artificial Intelligence, matrix, where the (i, j)th entry is the resulting
39:14–22. overlap value achieved by the k-means given
the combination of the ith running-window
width and the jth n-gram size.
• We cross-validate the grid-search process by
performing a series of simulations, where in
each simulation, we generate a random subset
of 250 verses from the given book, containing
at least 50 verses of each class (according to
the scholarly labeling). Each such simulation
produces a 2D overlap matrix, as mentioned
above, for the given subset of verses.
• Finally, we average the 2D overlap matrices
of all simulations, and for each entry, we cal- all simulation-wise loadings and derive the vari-
culate its standard deviation across all sim- ance thereof to receive a cross-validated vector of
ulations, producing a 2D standard deviation feature importance loadings and their respective
matrix. After normalizing the average overlap uncertainties. These are plotted as the error bars in
matrix by the standard deviation matrix, we Figs. 2, and similarly in the figures in Appendix C
choose the optimal combination that produces
the classification with the optimal overlap. D Results
• We perform this analysis on both lexemes, D.1 Optimization Results: Genesis
low- and high-res POS, yielding three av- For each representation, we achieve the following
eraged overlap matrices, from which opti- optimal overlap values (see Fig. 6): 72.95±6.45%
mal parameter combinations and their respec- for lexemes, 65.03±5.64% for high-res POS, and
tive standard deviations (calculated across the 73.96±2.91% for low-res POS. We observe the
cross-validation simulations) can be identified following:
for each feature.
• Optimal overlap values achieved for lexemes
C Feature Importance Analysis and high-res POS are consistent to within a
1σ, whereas the optimal overlap achieved for
We consider the two optimal overlap clusters of low-res POS is higher by ∼ 2σ.
verses as the algorithm assigned them. We calcu-
• We find a less consistent and considerably
late a difference matrix D, where from every verse
larger spread of optimal parameter combina-
in cluster 1, we subtract every verse in cluster 2,
tions for the low- and high-POS-wise embed-
receiving a matrix D ∈ R|S1 |·|S2 |×d , where |S1 |,
dings, as opposed to Exodus. While no clear
|S2 | are the number of verses assigned to cluster
pattern of optimal parameter combinations is
1 and 2, respectively, and d is the dimension of
observed in any feature, small n-gram sizes
the embedding (i.e., number of unique n-grams
are also preferred here. For lexemes, a well-
in the text). Note that tf-idf embedded texts are
defined range of running-window widths of
non-negative, such that the difference matrix D has
unigrams is observed to yield optimal over-
positive values for features with a high tf-idf score
laps.
in cluster 1 and negative values for features with a
high tf-idf score in cluster 2. As mentioned in §2.8, • Unlike in the case of Exodus, parameter com-
the first principal axis of D is equivalent to the dif- binations yielding optimal overlap values of
ference between the two centroids produced by the each feature (marked with red cells in Fig.
k-means. Similarly, this difference vector is a lin- 5) do not exhibit higher consistency across
ear combination of all the features (i.e., n-grams), the cross-validation simulations than combi-
where a numerical value gives the importance of nations that yield smaller overlap (i.e., small
each feature called “loading”, ranging from nega- cross-validation variance, see the right panels
tive to positive values and their importance to the in Fig. 5), except for the low-res POS feature.
given principal axes is given by their absolute value.
Due to the nature of the difference matrix D (see D.2 Optimization Results: Exodus
§2.8) – the sign of the loading indicates in which For each feature, we achieve the following optimal
cluster the feature is important. Thus we can as- overlaps (see Fig. 6): 89.23±2.53% for lexemes,
sign distinguishing features to the specific cluster 88.63±1.96% for high-res POS, and 86.53±2.91%
of which they, or a combination thereof with other for low-res POS. We observe the following:
features, are characteristic.
Finally, we seek to determine the stability of the • For all three representations, optimal overlap
importance of features across multiple sub-samples values are consistent to within 1σ.
of each book. We perform this computation as • For all three representations, parameter com-
follows: given a parameters combination, we per- binations yielding optimal overlap values
form all the steps listed above for 100 simulations, (marked with red cells in Fig. 6) exhibit high
where in each simulation, we randomly sample a consistency across the cross-validation simu-
sub-sample of 250 verses and extract the impor- lations (i.e., slight cross-validation variance,
tance loadings of the features. We then average see the right panels in Fig. 6).
• For lexemes, we find that the range of opti- E Biblical-Exegetical Discussion
mal parameter combinations is concentrated
within 1- and 2-grams and is relatively in- Here we perform an exegetical analysis of our re-
dependent of running-window width (i.e., sults for each book. All data to which this analysis
when some optimal running-window width was applied is available online5 .
is reached–the overlap values do not change
E.1 Genesis
dramatically as it increases).
• For the high-res POS, we find that the range of E.1.1 Semantics
optimal combinations is restricted to 2-grams, The extraction of the features of P for Genesis over-
but is also insensitive to the running-window laps with the work of characterizing the priestly
width. stratum (Holzinger, 1893). Thus, the characteris-
• For low-res POS, we find that larger n-gram tic use of numbers in P (here, in descending or-
sizes, and a wider range thereof, produce the der of importance, the algorithm considered the
optimal overlap values. Additionally, we ob- terms 100, 9, 8, 5, 3, 6, 4, 2, 7 as characteristic
serve a dependence between given n-gram of P) appears mainly in the genealogies, e.g., Gen
sizes and running-window widths to reach a 5 and 11, but also in the use of ordinal numbers
large overlap. to give the months and in the definition of the di-
mensions of the tabernacle. The term “year” (!‫)שׁנה‬
is used in both dates and P genealogies, and the
D.3 Hypothesis Testing Results
term “day” (!M‫ )יו‬demonstrates a similar calendrical
We present our results of the hypothesis testing concern. Furthermore, in the genealogies, we find
through cyclic-shift, described in §2.7. Here, too, the names of the patriarchs considered to belong
we perform a cross-validated test by performing to P (!‫נח‬, !‫מהללאל‬, !N‫קינ‬, !K‫חנו‬, !‫מתושׁלח‬, !‫אנושׁ‬, !M‫אד‬, !‫שׁת‬,
five simulations – each containing a randomly cho- !‫ישׁמעאל‬, !K‫למ‬, !‫רעו‬, !‫פלג‬, !‫)סרוג‬. The term “son” (!N‫)ב‬
sen sub-sample of 250 verses (with a mandatory appears in genealogies but also in typical P expres-
minimum of 50 verses of each class), to which the sions such as “son of X year” (!‫ שׁנה‬N‫ )ב‬to indicate
cyclic-shift analysis is applied. We compute the the age of someone, “sons of Israel” (!‫)בני ישׂראל‬,
optimal overlap for every shift and generate a shift- etc. The root !‫( ילד‬to beget or to give birth) is found
series of optimal overlap values (i.e., the null dis- in the genealogies in Gen 5; 11; Exod 6 but also
tribution). We then average across simulations. We in other P narratives of the patriarchs (Gen 16–17;
then derive the p-value from the synthesized null 21; 25; 35; 36; 46; 48) which focus on affiliation.
distribution. The chosen “real optimal overlap”, The term “generation” (!‫ )דור‬is also recognized as
which we use to derive the p-value, is the average typically P (Gen 6:9; 9:12; 17:7,9,12; etc. ), as
optimal overlap at a shift of 0 (i.e., original label- well as the term “annals” (!‫)תולדות‬, which serves to
ing) minus its standard deviation. For each book, introduce a narrative section or a genealogy and.
we perform this analysis for the features yielding This term structures the narrative and genealogical
the optimal overlap; low-res POS for Genesis and sections in the book of Genesis (Gen 2:4; 5:1; 6:9;
high-res POS for Exodus. We plot our results for 10:1,32; 11:10,27; 25:12–13,19; 36:1,9; etc.). The
both books in Fig. 7. terms “fowl” (!P‫)עו‬, “beast/flesh” (!‫)בשׂר‬, “creeping”
The resulting p-values are 0.08 and 0.06 for the (!‫)רמשׂ‬, “swarming” (!Z‫)שׁר‬, “living being” (!‫)נפשׁ חיה‬,
books of Genesis and Exodus, respectively. “cattle” (!‫)בהמה‬, “kind” (!N‫ )מי‬are found in the typi-
cally P expression “living creatures of every kind:
D.4 Feature Importance Analysis Results
cattle and creeping things and wild animals of the
D.4.1 Feature Importance: Genesis earth of every kind” (Gen 1:24; cf. Gen 1:25-26;
In Figs. 8-10 we plot feature importance analysis 6:7,20; 7:14,23; etc.). These expressions are often
results for the three representations of the book of associated with “multiplication” (!‫)רבה‬, an essential
Genesis. theme for P that also appears in the blessings of P
narratives as in Gen 17; 48; etc. The term “being”
D.4.2 Feature Importance: Exodus
(!‫ )נפשׁ‬is also used in P texts to refer to a person, e.g.,
In Figs. 11-13 we plot feature importance analysis in Gen 12; 17; 36; 46. As for the term “all” (!‫)כל‬, it
results for the three representations of the book of
5
Exodus. https://github.com/YoffeG/PnonP
is used overwhelmingly in both P and D texts. In P gies in Gen 5; 11. The appearance of the term “to
texts (Gen 1:27; 5:2; 6:19), humanity (!M‫)אד‬, in the die” (!‫ )מות‬as a characteristic of P is explained by
image (!M‫ )צל‬of God, is conceived in a dichotomy its presence in the genealogies of Gen 5; 11 but
of “male” (!‫ )זכר‬and “female” (!‫)נקבה‬. The root !‫זכר‬ also in the succession of each of the generations of
in its second sense, that of “remembering”, also the patriarchs. Finally, the terms “water” (!M‫ )מי‬and
plays a role in the P narratives (Gen 8:1; 9:15–16; “heaven” (!M‫ )שׁמי‬play a major role in the creation
19:29; Exod 2:24; 6:5) when God remembers his narrative P (Gen 1:1-2:4) and the flood narrative
covenant and intervenes to help humanity or the (Gen 6-9*). These two terms also appear in Exodus,
Israelites. The covenant (!‫ ;ברית‬cf. also Gen 9; where water is mentioned in the account of the duel
17), the sign of which is the circumcision (!‫ )מול‬of of the magicians (Exod 7-9*), in the passage of the
the foreskin (!‫ )ערלה‬is correctly characterized as sea of reeds which is paralleled in the creation of
P. According to P, God’s covenants are linked to Gen 1 (Exod 14), and as a means of purification
a promise of offspring (!‫ ;זרע‬cf. Gen 17; etc.) and during the building of the tabernacle (Exod 29-30;
valid forever (!M‫ ;עול‬Gen 9; 17; 48:4; Exod 12:14; 40). This latter function of water probably builds
etc.). The term “seed/descendant” (!‫ )זרע‬is also its symbolism in the other narratives. The term
used by P in the creation narratives in Gen 1. The firmament (!‫ )רקיע‬is associated with heaven and
term “between” (!N‫ )בי‬is used several times to indi- appears only in the creation story P of Gen 1 but is
cate the parties concerned by the covenant in Gen of little significance elsewhere.
9 and Gen 17. The term is also found frequently in
Gen 1 in the creation story, where creation is the On the non-P side, terms like “Joseph”,
result of separation “between” (!N‫ )בי‬different ele- “pharaoh”, and “Egypt” are non-P features since
ments – presenting God as the creator is not typical the story of Joseph is non-P. Similarly, the presence
of a national god whose role primarily guarantees of “Jacob” is explained by an account of only a
protection, military success, and fertility. The trans- few verses for this story in P as opposed to sev-
formation of the God of Israel into a creator God eral whole chapters for the non-P account of Ja-
appears only in the exilic or postexilic texts. Thus, cob. The terms “brother” (16P /179nonP ), “father”
the root “to create” (!‫ )ברא‬is rightly associated with (19P /213nonP ), and “mother” (4P /33nonP ) as non-
P (Gen 1:1-2:4; 5:1-2). The use of divine names is P features can be understood by a greater empha-
particular in the priestly narratives. “God” (!M‫)אלהי‬ sis on family in the original patriarchal accounts,
is the term used in the origin stories (Gen 1–11), whereas P emphasizes genealogy. The terms “mas-
“El Shaddai” for the patriarchs (Gen 12–Exod 6), ter” (!N‫)אדו‬, “slave/servant” (!‫)עבד‬, and “boy/servant”
and finally, “YHWH” from Exodus 6,2-3 on. Here, (!‫ )נער‬reflect the hierarchical structures of the house-
the algorithm did understand a particular use by P hold of the wealthy landowners in the narratives of
of the term “God” (!M‫)אלהי‬. One of the differences the patriarchs but are of no interest to the priestly
with Holzinger’s list is the fact that the algorithm editors. Similarly, non-P texts show more inter-
considers the terms Noah (!‫)נח‬, flood (!‫)מבול‬, and est in livestock, with terms such as camel (!‫)גמל‬,
the ark (!‫ )תבה‬as typically P. This is probably be- donkey (!‫)חמור‬, or small livestock (!N‫ )צא‬considered
cause the flood narrative is much more developed non-P features. The dialogues are more present in
in P than in non-P or because the semantic environ- the non-P stories than in the P stories. Thus the
ment is attached to other P expressions. Neverthe- terms that open the direct discourses “speak” (!‫)דבר‬,
less, all three terms appear in non-P texts as well. “say” (!‫)אמר‬, and “tell” (!‫ )נגד‬are considered typical
The term “daughter” (!‫ )בת‬should be considered P non-P terms as well as the set of Hebrew propo-
not because of its frequency, which is admittedly sitions in direct discourse (!‫מה‬, !‫עתה‬, !‫ה‬, !‫כי‬, !M‫ג‬, !‫הנה‬,
somewhat higher in the P narratives of Genesis, !‫אל‬, !‫נא‬, !M‫א‬, !‫)לא‬. The term “man” (!‫ )אישׁ‬can be used
but probably because of its semantic environment. in many ways: man, husband, human; someone.
Thus, the term appears in the expression “sons and Its use and expressions using it are significantly
daughters” (!M‫)בנות ובני‬, which is very frequently more frequent in non-P texts (42P/213nonP). This
used in Gen 5; 11. The preposition “after” (!‫)אחר‬ may be an evolution of the language rather than a
appears in the expression “after you” (!K‫ )אחרי‬in the deliberate or theological change on the part of P.
promise to Abraham in Gen 17 or the expression Finally, for the terms “to enter” (!‫ )בוא‬or “to go”
“after his begetting” (!‫ )הולידו אחרי‬in the genealo- (!K‫ )הל‬to be characteristic of non-P, this may reflect
a stronger interest in place, in travel in the original
texts probably composed to legitimize sanctuaries pleonasms which use a form with this suffix: !‫עמו‬,
or as etiological narratives whereas these aspects !‫אתו‬, etc. (Holzinger, 1893). Concerning verbs, the
are less marked in the P texts. form Qal or Piel, qatal in the 2nd masculine singu-
lar, is prevalent in P texts. This is understandable
E.2 Exodus because of P’s theology, according to which God
E.2.1 Semantics orders using the second person, and then the pro-
For the P-texts of Exodus, the algorithm has ex- tagonists carry out according to YHWH’s order.
tracted the semantic features of the tabernacle con- On the side of the non-P texts, the narrative form,
struction in Exod 25-31; 35-40 but does not give i.e., Qal, wayyiqtol in the 3rd masculine singular,
features of the P-texts that would be found else- is significant, although these forms are also very
where. We find in the features: the different names present in P texts. Another peculiarity is the use
of the holy place, “the holy one”, “the dwelling”, in non-P of “name in the constructed form + place
“the tent of meeting”; the materials used for the con- name”. P seems to have avoided topical construc-
struction, “acacia wood”, “pure gold”, “bronze”, tions because of reduced interest in localizations.
“linen”, “blue, purple, crimson yarns”, etc.; the spa- The remaining terms are persistent elements. Fur-
tialization, “around”, “outside”; the dimensions, ther analysis would be needed to understand the
“length”, “cubit”, “five”; the components, “altar”, relevance of the distinction made by the algorithm.
“curtain”, “ark”, “utensils”, “table”, and YHWH’s
E.3 Summary
orders to Moses, “you shall make”. Thus, the al-
gorithm has a good understanding of the terms As we can see, the pipeline could extract typical
specific to the construction of the Tabernacle but features of priestly texts in Genesis, easily recog-
is not susceptible to a more general understand- nizable for a specialist. In addition, other P fea-
ing of the characteristics of P in Exodus. The tures have also been found that may be specific to
non-P features are more interesting. For exam- a single narrative, correspond to repeated use of an
ple, the use of the word “people” (!M‫ )ע‬appears expression, or be a significant theological theme
primarily in the non-P texts because the priestly such as water. On the other hand, the features of
redactors usually preferred to use the term “as- non-P texts do not indicate a coherent editorial mi-
sembly” (!‫)עדה‬. The word “I” in the long form lieu or style but rather allow us to better distinguish
(!‫ )אנכי‬is considered non-P because the word ’ny, between P texts and non-P texts by particular the-
the short form, appears in P texts. The expres- ological or linguistic features. The data provided
sion “to YHWH” does appear 24 times in non-P by the algorithm allow for the detection of particu-
texts, e.g., “to cry out to YHWH”/“to speak to larities that require an explanation. More detailed
YHWH”/“to turn to YHWH”, whereas P avoids the investigations than those presented above are nec-
expression. This is easily understandable by a de- essary to better understand specific instances of
sire to give YHWH the initiative in all interactions. the results. For the texts of Exodus, the excessive
In P, it is he who demands, commands, and speaks. importance of the chapters devoted to the construc-
There are few dialogues. As for the terms Egypt tion of the Tabernacle (Exod 25-31; 35-40) did not
(!M‫ )מצרי‬and Pharaoh (!‫)פרעה‬, they are indeed quan- allow us to obtain satisfactory results in the charac-
titatively more frequent in the non-Priestly texts of terization of P, which could indicate an originally
Exodus (respectively 36P/139nonP et 26P/89nonP ) independent document. Nevertheless, the charac-
as in Genesis. terization of non-P texts is relevant, as well as the
results on grammar.
E.2.2 Grammar
As we have already seen, non-P texts more often
adopt the protagonists’ point of view by includ-
ing dialogues or their thoughts, whereas P texts
prefer a third-person narration. One of the con-
sequences is the privileged use of 3rd singular or
plural suffixes, unlike non-P texts where 1st sin-
gular or 2nd singular suffixes are more often used.
Moreover, the massive use of the 3rd person in
P texts can also be explained by the presence of
Best feature: (4, 1), 72.95±6.45% Combination-wise [%]
62.0 60.4 56.7 54.8 53.6 3.8 1.7 2.4 1.7 1.5 9.0

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
68.1 64.3 59.3 56.3 54.5
72 9.2 4.5 3.8 4.3 4.3
71.7 62.2 60.0 55.0 54.2 7.9 2.0 3.6 3.9 3.9
72.5 63.8 60.8 56.7 54.4 6.4 6.4 4.9 5.3 4.5
73.0 63.1 60.6 55.3 55.8 6.5 7.8 4.8 4.0 4.1 7.5
70.8 63.2 57.5 55.0 55.0 68 5.6 5.8 6.7 4.6 3.7
Running-window width

Running-window width
68.3 62.8 58.0 56.0 54.5 5.0 4.9 5.0 3.9 4.0
67.9 61.9 57.9 57.8 54.9 4.8 5.7 5.1 3.7 3.8
67.4 62.3 60.7 56.7 55.0 4.6 6.8 5.7 4.0 3.7 6.0
65.3 61.2 60.3 57.4 55.0 64 4.0 5.2 5.0 3.4 3.4
64.7 60.7 58.9 57.1 55.3 4.1 5.8 4.1 4.3 3.4
64.9 60.1 60.6 59.6 56.2 3.8 6.1 7.0 6.0 3.6
65.1 61.8 60.3 59.1 56.4 3.4 4.9 7.1 5.2 3.3 4.5
65.2 60.7 61.4 57.4 55.6 60 3.4 5.8 6.7 5.5 3.8
65.3 60.6 60.4 56.9 56.6 3.4 5.8 5.5 3.6 3.5
65.3 59.7 59.2 59.0 56.6 3.4 5.8 6.1 6.7 4.4
65.2 59.6 60.8 59.9 56.4 3.3 5.9 4.7 5.4 4.1 3.0
65.2 59.5 60.8 59.3 56.2 56 3.3 6.2 4.9 6.1 4.4
65.2 59.0 59.5 58.9 56.2 3.3 6.7 5.1 4.9 4.7
65.2 58.9 59.3 58.1 56.7 3.3 6.8 5.4 5.5 5.2
1.5
1 2 3 4 5 1 2 3 4 5
N-gram order N-gram order
(a) Genesis cross-validated optimization analysis results: lexemes.

Best feature: (14, 1), 65.03±5.64% Combination-wise [%]


64.4 58.0 54.8 53.7 53.5 4.2 4.4 2.1 1.5 1.3
19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

64.4 60.3 59.4 58.2 57.5 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 5.4 2.9 2.3 2.3 2.2 6
64
63.0 60.9 59.5 58.5 57.8 5.5 4.3 2.2 2.2 2.8
62.2 61.4 59.6 58.9 58.1 5.0 4.6 2.5 2.5 2.8
61.9 61.7 59.3 58.6 58.1 4.5 4.7 2.7 3.1 2.6
61.9 62.0 59.3 58.9 57.8 62 4.6 4.3 2.6 3.1 2.9 5
Running-window width

Running-window width

62.2 62.0 59.5 58.4 57.6 4.0 4.4 2.6 3.7 3.4
62.9 62.6 58.5 58.1 57.0 3.9 4.4 2.5 3.6 4.1
63.0 62.8 58.4 58.5 57.5 60 4.7 4.5 3.0 3.8 3.1
62.6 62.7 58.2 57.6 58.0 5.5 4.8 3.0 3.3 5.0 4
63.4 63.1 58.8 56.4 57.4 5.4 4.8 4.8 3.8 3.8
63.7 63.3 58.5 56.7 57.0 5.0 4.8 3.6 3.8 3.6
63.6 63.4 58.0 56.9 56.5 58 5.8 4.7 4.2 3.5 4.1
64.8 62.7 57.2 57.3 56.3 5.6 5.2 4.0 3.4 4.0 3
65.0 62.3 57.3 56.7 56.7 5.6 5.6 3.9 3.4 4.3
64.2 61.6 57.6 56.6 56.1 56 5.9 6.3 4.1 3.6 3.7
65.0 61.1 57.7 56.7 56.3 4.8 5.9 4.0 3.6 4.0
64.2 60.0 59.0 57.2 56.0 5.6 6.0 5.9 3.5 4.0 2
63.4 59.5 58.4 57.2 57.0 5.1 6.4 4.5 4.3 5.2
62.2 60.0 58.5 57.7 58.0 54 5.2 6.3 4.8 4.5 4.8
1 2 3 4 5 1 2 3 4 5
N-gram order N-gram order
(b) Genesis cross-validated optimization analysis results: high-res POS.

Best feature: (4, 1), 73.96±4.32% Combination-wise [%]


62.6 65.7 63.4 60.3 56.3 53.7 53.3 53.1 52.2 4.7 4.8 4.0 4.0 3.3 2.1 2.1 1.7 1.3
19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

68.5 69.4 69.2 65.7 64.1 62.5 58.3 56.9 55.4


72 4.2 6.1 6.9 7.3 8.1 7.1 4.3 3.0 3.0 12.5
70.4 68.8 67.1 66.9 65.8 65.9 60.9 57.0 56.1 4.0 6.4 8.0 8.3 8.4 9.4 5.4 4.0 3.1
72.4 69.1 67.8 68.0 66.5 66.4 63.4 58.6 57.0 4.2 7.0 8.3 7.5 9.5 9.5 7.8 4.7 3.8
74.0 69.7 68.5 67.7 68.2 67.3 63.4 60.5 57.6 4.3 8.2 8.9 8.0 9.5 10.1 7.8 4.8 4.7
73.3 69.9 68.5 68.6 66.6 66.1 65.2 61.9 58.1 68 4.6 8.5 9.7 8.6 9.6 10.5 9.4 8.3 4.2 10.0
Running-window width

Running-window width

72.5 69.7 68.0 70.0 68.7 65.8 62.3 61.9 57.8 5.5 9.2 11.4 10.8 10.5 10.1 8.0 7.1 4.0
72.1 69.8 67.0 69.2 68.8 65.8 63.2 61.3 59.3 5.9 9.4 11.7 11.0 9.9 10.0 7.5 7.8 5.7
70.9 70.2 67.3 69.5 68.7 66.1 61.9 62.1 58.6 6.1 9.5 11.8 11.6 11.2 9.9 8.1 8.3 4.1
71.0 70.1 67.5 69.5 68.9 65.3 61.1 60.2 58.2 64 6.2 9.6 11.6 11.8 10.8 10.0 8.0 7.7 4.6
7.5
72.0 70.1 67.6 69.3 67.8 65.8 61.3 60.8 59.0 5.1 9.6 11.3 12.1 10.3 10.1 7.9 7.2 4.3
72.3 70.3 67.9 69.2 67.6 64.3 63.7 58.9 61.3 5.2 9.5 11.2 12.1 10.4 10.3 9.4 4.6 7.2
73.4 70.2 68.6 69.1 69.4 64.9 62.9 61.1 60.2 4.5 9.3 11.5 12.2 12.1 10.4 10.2 7.7 7.2
60
73.3 70.4 69.1 68.9 68.6 66.2 63.6 61.3 61.5 4.4 9.4 11.3 12.3 11.3 11.3 9.8 7.4 7.0
73.5 70.4 69.2 68.8 68.5 65.8 63.6 61.3 62.0 4.1 9.3 11.0 12.3 11.6 11.7 9.6 7.5 7.9
5.0
73.2 71.1 69.3 68.9 68.3 65.6 63.5 61.2 59.5 3.7 9.2 11.0 12.3 11.6 11.9 10.1 7.5 4.5
73.7 71.6 69.2 69.0 68.9 66.2 63.5 61.3 60.1 56 3.6 9.4 11.1 12.1 11.1 11.8 10.8 7.5 4.8
73.0 72.1 69.4 69.4 69.0 65.9 64.2 61.1 60.9 3.4 9.2 11.3 11.9 11.4 12.9 11.1 8.0 4.4
72.6 71.9 69.2 69.0 67.7 66.2 64.1 62.3 59.6 3.0 8.9 11.4 12.2 12.5 12.8 11.0 7.4 5.2 2.5
72.3 72.5 69.0 68.7 67.2 65.3 63.6 61.6 59.5 3.2 9.0 12.0 12.2 12.6 13.4 11.3 7.9 5.5
1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9
N-gram order N-gram order
(c) Genesis cross-validated optimization analysis results: low-res POS.

Figure 5: Cross-validated optimization analysis results for the book of Genesis, using all three representations:
lexemes (a), high- and low-res POS (b, c), respectively, probing a running-window width range of 1–10 and
n-gram size range of 1–10 over 20 simulations. Left Panels: Averaged combination-wise overlaps. Right Panels:
Combination-wise standard-deviation (σ) of the overlaps generated with the cross-validation process. For each case,
the best-fit feature (running-window width, n-gram size) and its respective σ appear at the top of each left panel.
Overlaps of combinations within the 1σ range of the best fit are marked with red cells.
Best feature: (8, 2), 89.23±2.53% Combination-wise [%]
67.4 55.4 51.5 51.2 51.0 50.9 51.0 50.9 50.8 88 10.2 3.6 0.9 0.8 0.7 0.5 0.5 0.3 0.2

0
83.2 74.6 54.6 52.3 51.9 51.7 51.7 51.5 51.5 3.7 9.4 3.5 0.8 0.7 0.6 0.5 0.4 0.5 12.5
1

1
84.5 82.1 57.0 53.2 52.6 52.3 52.4 52.2 51.9 80 4.5 8.1 5.6 1.7 1.2 0.9 1.0 0.8 0.8
2

2
10.0
Running-window width

Running-window width
86.9 85.0 61.7 54.1 53.4 52.8 52.8 52.5 52.4 3.8 7.7 8.4 2.1 1.5 1.0 0.9 0.9 0.9
3

3
88.5 83.1 64.8 55.2 54.0 53.0 53.2 53.3 53.1 72 3.1 12.0 11.2 2.2 1.6 1.3 1.1 1.1 1.0
4

4
7.5
88.8 89.0 67.0 56.3 54.7 54.0 53.6 53.8 53.1 2.3 2.6 12.8 2.5 1.5 1.6 1.8 2.0 1.3
5

5
88.8 89.1 69.5 56.7 55.3 54.6 54.2 54.3 53.7 64 2.2 2.6 13.7 2.6 2.2 2.1 1.8 2.3 1.8 5.0
6

6
88.9 89.2 71.0 57.0 56.5 55.5 54.7 54.7 54.5 2.2 2.5 14.1 3.4 3.1 2.7 2.4 2.2 2.2
7

7
88.4 89.2 71.9 59.7 56.2 55.6 55.2 54.6 54.3 2.6 2.5 14.3 5.0 2.9 2.2 2.4 2.1 2.0 2.5
56
8

8
88.3 89.0 74.4 59.8 56.1 55.5 55.3 55.0 54.8 2.7 2.4 14.1 4.6 3.9 2.1 2.6 2.1 1.9
9

9
1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9
N-gram order N-gram order
(a) Exodus cross-validated optimization analysis results: lexemes.

Best feature: (6, 2), 88.63±1.96% Combination-wise [%]


88
63.7 60.1 53.0 51.4 51.1 51.0 50.9 51.0 50.9 7.1 3.6 2.7 1.1 0.6 0.4 0.4 0.3 0.2
0

12.5
78.6 78.2 65.5 56.0 52.3 51.9 51.8 51.7 51.7 5.4 8.1 6.8 3.4 0.9 0.7 0.6 0.5 0.4
1

83.6 84.2 71.4 55.7 52.4 52.2 52.0 52.1 51.9 80 3.8 3.0 10.1 4.4 0.9 0.9 0.7 0.5 0.6
2

10.0
Running-window width

Running-window width

85.4 86.4 78.2 61.1 54.6 52.9 52.5 52.3 52.1 2.4 3.6 11.5 8.0 2.5 1.7 1.4 1.3 1.2
3

86.5 87.2 81.7 63.9 54.2 53.5 53.0 52.9 52.9 72 2.7 3.8 10.6 10.6 2.8 1.8 1.8 1.5 1.5
4

7.5
86.4 88.4 83.2 66.9 58.0 54.2 53.5 53.2 53.4 2.6 2.0 10.3 13.6 6.6 1.9 2.0 1.9 1.7
5

86.0 88.6 85.4 68.6 59.7 55.3 54.4 53.8 53.4 64 3.1 2.0 7.6 14.2 6.0 2.7 2.7 2.4 2.0 5.0
6

86.2 88.5 85.6 68.8 58.5 56.6 54.5 53.9 53.6 2.8 1.8 7.6 13.8 5.1 3.3 2.5 2.6 2.5
7

86.0 88.5 85.9 70.0 61.5 57.8 56.0 54.4 54.2 3.1 1.9 6.9 13.1 5.2 4.0 3.2 2.6 2.2 2.5
56
8

86.2 87.8 86.1 70.0 60.3 59.0 56.0 54.6 54.2 3.2 3.0 6.8 13.1 4.9 4.6 2.8 2.2 2.1
9

1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9
N-gram order N-gram order
(b) Exodus cross-validated optimization analysis results: high-res POS.

Best feature: (9, 4), 86.53±2.91% Combination-wise [%]


12.5
62.3 69.3 67.6 62.5 55.2 52.8 51.7 51.3 51.1 3.6 3.6 4.2 5.5 3.2 2.1 1.2 0.7 0.4
84
0

70.2 74.2 78.0 77.1 73.8 68.7 62.0 55.1 52.7 3.6 2.6 2.3 2.6 5.9 6.9 6.3 3.8 2.3
1

10.0
73.4 74.1 77.8 78.1 79.1 76.2 64.0 53.8 52.6 78 3.1 3.9 2.1 2.5 2.9 7.1 9.1 2.3 1.5
2

2
Running-window width

Running-window width

75.3 74.6 78.9 79.1 81.2 79.4 69.4 58.9 52.7 2.7 3.4 2.3 3.2 3.2 7.0 10.8 7.5 1.5
3

72 7.5
77.5 78.8 81.5 82.4 83.7 82.4 77.7 63.4 56.8 4.0 3.3 3.0 3.5 4.6 5.5 10.3 10.8 4.7
4

78.9 81.4 83.4 84.0 83.2 82.2 76.8 65.3 55.7 3.2 3.4 3.2 3.6 4.3 5.7 10.6 11.3 2.9
5

66
79.3 82.4 85.3 84.5 83.4 82.4 77.3 66.5 56.0 3.5 3.6 3.4 3.8 4.6 5.1 9.7 12.6 3.6
5.0
6

78.7 82.8 86.1 85.3 83.4 83.1 74.0 66.5 57.2 60 3.1 3.0 2.8 4.5 4.8 5.2 12.2 11.7 4.3
7

79.3 83.1 85.8 86.2 85.6 83.1 75.4 64.8 58.6 3.3 3.4 2.9 4.0 4.0 6.2 12.0 11.3 5.8 2.5
8

54
79.5 83.4 86.1 86.5 85.9 84.1 76.7 65.9 61.2 3.8 3.0 2.6 2.9 3.6 6.5 12.0 11.0 8.8
9

1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9
N-gram order N-gram order
(c) Exodus cross-validated optimization analysis results: low-res POS.

Figure 6: Cross-validated optimization analysis results for the book of Exodus, similar to Fig. 5.
(a) Cross-validated hypothesis-testing results: book of Genesis (low-res POS)

(b) Cross-validated hypothesis-testing results: book of Exodus (high-res POS)

Figure 7: Cross-validated hypothesis-testing results, performed as described in §3, for the books of Genesis (sub-
figure a) and Exodus (sub-figure b), respectively. Left Panels: Optimal overlap as a function of cyclic label shift.
The derived p-value is listed on top of each panel. Right Panels: Resulting null distributions of the optimal overlap
values. The optimal overlap value of the 0-shift (i.e., scholarly labeling) is listed on each panel.
Cluster 1 (P) Cluster 2 (nonP)

Figure 8: Cross-validated feature importance analysis results for the book of Genesis: lexemes (running-window
width of 4; n-gram size of 1). Note: displaying features carrying 75% of the explained variance.

Cluster 1 (P) Cluster 2 (nonP)

Figure 9: Cross-validated feature importance analysis results for the book of Genesis: high-res POS (running-
window width of 1; n-gram size of 1). Note: displaying features carrying 90% of the explained variance.
Cluster 1 (P) Cluster 2 (nonP)

Figure 10: Cross-validated feature importance analysis results for the book of Genesis: low-res POS (running-
window width of 4; n-gram size of 1). Note: displaying features carrying 100% of the explained variance.

Cluster 2 (P) Cluster 1 (nonP)

Figure 11: Exodus cross-validated feature importance results: lexemes (running-window width of 5; n-gram size of
2).
Cluster 1 (P) Cluster 2 (nonP)

Figure 12: Exodus cross-validated feature importance results: high-res POS (running-window width of 4; n-gram
size of 2).

Cluster 1 (P) Cluster 2 (nonP)

Figure 13: Exodus cross-validated feature importance results: low-res POS (running-window width of 6; n-gram
size of 3).

You might also like