
Journal of New Music Research

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/nnmr20

Harmony and form in Brazilian Choro: A corpus-driven approach to musical style analysis

Fabian C. Moss , Willian Fernandes Souza & Martin Rohrmeier

To cite this article: Fabian C. Moss , Willian Fernandes Souza & Martin Rohrmeier (2020):
Harmony and form in Brazilian Choro: A corpus-driven approach to musical style analysis, Journal
of New Music Research, DOI: 10.1080/09298215.2020.1797109

To link to this article: https://doi.org/10.1080/09298215.2020.1797109

© 2020 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group

Published online: 04 Aug 2020.


Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=nnmr20
Fabian C. Moss a, Willian Fernandes Souza b and Martin Rohrmeier a
a Digital and Cognitive Musicology Lab, Digital Humanities Institute, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
b Cognição Musical e Processos Criativos, Programa de Pós-Graduação em Música, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil

ABSTRACT
This corpus study constitutes the first quantitative style analysis of Choro, a primarily instrumental music genre that emerged in Brazil at the end of the 19th century. We evaluate its description in a recent comprehensive textbook by transcribing the chord symbols and formal structure of the 295 representative pieces in the Choro Songbook. Our approach uncovers central stylistic traits of this musical idiom on empirical grounds. It thus advances data-driven musical style analysis by studying both harmony and form in a musical genre that lies outside the traditional canon.

ARTICLE HISTORY
Received 8 October 2018
Accepted 22 June 2020

KEYWORDS
Choro; musical style analysis; corpus study; harmony; form

CONTACT Fabian C. Moss, fabian.moss@epfl.ch, Digital and Cognitive Musicology Lab, Digital Humanities Institute, École Polytechnique Fédérale de Lausanne, Lausanne, VD, Switzerland

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

1. Introduction

A continuously growing body of corpus studies in the field of computational music analysis aims at investigating centuries-old music-theoretical questions with modern data-driven approaches.1 This leads not only to refinements of the questions asked and advances in the applied methodologies, but also to the creation of symbolic datasets that facilitate style analysis. Existing resources cover a diversity of musical genres, encodings, formats, and methodologies. Many datasets concentrate on melody (Brinkman & Huron, 2018; Eerola et al., 2009; Huron, 1996; Pearce & Wiggins, 2004; Von Hippel & Huron, 2000), or harmony (Albrecht & Shanahan, 2012; Burgoyne et al., 2013; Hedges & Rohrmeier, 2011; Moss et al., 2019; Rohrmeier & Cross, 2008; Temperley & de Clercq, 2013; Tymoczko, 2003; White & Quinn, 2016), but rarely consider aspects of formal structure (for an exception see Sears et al., 2017) in order to describe, infer, or predict idiosyncrasies and prototypical patterns of a certain style, genre, or composer.

1 For discussions of this development see e.g. Neuwirth and Rohrmeier (2016); Temperley and VanHandel (2013).

Although augmenting musical style analysis by applying statistical methods as well as concepts and measures from information theory has long been acknowledged by musicologists (Manzara et al., 1992; Meyer, 1957; Pearce & Wiggins, 2004; Weiß et al., 2018; Youngblood, 1958), the computational analysis of symbolic corpora has faced several difficulties due to the diversity of analytical methods (Basili et al., 2004; Brackett, 2016; Fabbri, 2014; McKay & Fujinaga, 2006) and the lack of standardised encoding and annotation formats (Neuwirth et al., 2018; Oramas et al., 2018).

In the words of Leonard Meyer, the goal of style analysis is ‘to describe the patternings replicated in some group of works, to discover and formulate the rules and strategies that are the basis for such patternings, and to explain in the light of these constraints how the characteristics described are related to one another’ (Meyer, 1989, p. 38). In this spirit, the present study provides an empirically-grounded style analysis of the musical genre of Choro, a primarily instrumental Brazilian music genre. Choro is a musical practice that lies outside of canonical datasets in music information retrieval (Panteli et al., 2018; Savage, forthcoming) on classical music, e.g. Bach, Haydn, Mozart, Beethoven (Jacoby et al., 2015; Moss et al., 2019; Rohrmeier & Cross, 2008; Sears et al., 2017), and popular music, e.g. Jazz, Beatles, Charts (Broze & Shanahan, 2013; Gauvin, 2015; Harte, 2010), and has thus not been extensively studied empirically so far. We take as our starting point the recent and comprehensive theoretical account A estrutura do Choro (Almada, 2006) and evaluate the descriptions therein against transcriptions of a collection of representative Choro pieces from the Choro Songbook (Chediak, 2009, 2011a, 2011b).

Our analyses consider the chord symbols and the formal structure of Choro pieces with computational
methods from musicological corpus research. In doing so, we seek to gain particular insights into the structural features of harmony and form in this genre and their mutual relationship. Moreover, we hope that our contribution broadens the scope of computational music research by focusing on a previously neglected genre.2

2 See Wundervald and Zeviani (2019) for a recent extensive study on genres in Brazilian Popular music.

In the following, we give a brief introduction to Choro as a musical genre and its historical development, and restate its main characteristics based on the qualitative account in Almada (2006). Section 2 (‘Procedure and Methods’) describes the Choro Songbook Corpus, the dataset underlying our study, and how the data was transcribed and transformed. It also contains a detailed account of the methods used for analysis. Section 3 (‘Results and Discussion’) presents our findings and their interpretations with respect to the style of Choro. Finally, Section 4 summarises our findings, contextualises them, and discusses potential avenues for future research.

1.1. History of Choro and its subgenres

Choro is a predominantly instrumental music tradition that emerged in Brazil around 1870. It can be described as a hybrid outcome of several genres with African, American, and European roots (Aranha, 2012; Piedade, 2003). This friction gave rise to a genre that is considered to be one of the first expressions of generic Brazilian music (Mair, 2000). Since then, the production of Choro pieces fluctuated, including, for instance, a revival in the late 1970s (Livingston, 1999). Nowadays, one can witness a genre subsisting in a rich collection of styles (Cazes, 1999; Taborda, 2010; Valente, 2014). Choro can be conceptualised as a category that includes three meanings: (1) it is a social musical event; (2) it is a style, a manner of doing music; and (3) it is a genre that contains specific musical traits or features. The present research project adopts the latter definition of Choro as a genre and also considers its subgenres.3

3 Most pieces in the Choro Songbook do not have a specified subgenre but are just called Choro.

The history of Choro as well as the complex relation to other genres and its subgenres is outlined in Figure 1. The lateral extremes of the figure show its two main sources in the 19th century: vocal genres such as Modinha and Serenata on the left-hand side, and dances on the right-hand side, with various origins such as Schottisch and Polca from the Czech Republic, Mazurka from Poland, Valsa from Austria, Fox-trot from the United States, and Lundu from Angola and the region of Congo (Sandroni, 2001). In the 19th century, the interaction between these salon dances for piano created new subgenres such as Polka-Mazurka or Polca-Lundu.

At the turn from the 19th to the 20th century, Polca-Lundu split into two subgenres (at least in terms of its social usage): Tango Brasileiro and Maxixe. The former represents mostly piano pieces and the latter is a dance also played by other instruments. Today, both are encompassed by Choro practices, especially events like Rodas de Choro.4

4 Rodas de choro can be translated as Choro circles or Choro gigs where Choros are mainly performed with an ensemble of guitars (with six and/or seven strings), a cavaquinho, a mandolin, a flute, a clarinet, and a pandeiro.

Around the 1930s, the term Choro was established and promoted by publishers to create a brand and popularise it. Since then it is also known under other related names, i.e. hybrid subgenres such as Choro-Serenata. Before the 1930s one can observe a high production of Tangos (in Brazil as well as in Argentina). Due to the strong association of Tango with the production in the region of Río de la Plata (Argentinian Tango), the Tangos Brasileiros were appropriated to the category of Choro. Thus, Argentinian Tango can be conceived as a contrasting genre in relationship with Choro (Sandroni, 2001).

The contact of Choro with other Brazilian genres in the 1940s, for instance Samba and Canção, created new subgenres such as Choro Sambado,5 or Choro-Canção. In the second half of the 20th century, several genres abraded on Choro indirectly, like Rock and Bossa Nova (one can observe traces in harmonies or in performance but without creating new subgenres), and directly like Jazz, bringing about Choro-Jazz.

5 For the purpose of our research, Samba-Choro, Choro-Samba, and Choro Sambado are assumed to be the same, but we are aware that the musicological position about these terms is still in development.

1.2. Harmony and form in Choro

A recent comprehensive resource on harmony and form in Choro is Carlos Almada’s A Estrutura do Choro (Almada, 2006), a textbook containing theoretical descriptions as well as exercises for composition and improvisation. In its theoretical part, Almada (2006, pp. 7–26) describes various musical features such as harmony, form, and rhythm on different levels. For instance, Almada (2006, p. 10) observes that the most recurrent keys in Choro are F, C, G, and D for major keys, and Dm, Am, Em, and Gm for minor keys. This ranked list of most common keys is quite imprecise insofar as he only mentions four keys for each mode and does not explain the (estimated) relative frequencies of these keys. Almada does not talk about different metres in Choro but virtually all score examples in his book are in 2/4 metre. Only in four out of 90 examples the 2/4 metre is notated explicitly.

Figure 1. Indirect and direct influences of other genres in the context of Choro (based on de Souza, 2016a, 2016b). Solid lines represent
strong influences between genres, and dashed lines represent rather indirect ones. The colours stand for different time periods. Sub-
genres related to vocal genres are shown in the left part and dance-related sub-genres are shown in the right part.

One can assume that this metre is taken to be the default in Choro.

Another interesting statement concerns the form and local key structure within pieces. The formal arrangement of Choro prototypically features a rondo-like partition into three different parts (i.e. an A-B-A-C-A form). This aspect is a commonplace of the Choro literature (Almada, 2006; Mesquita, 2017; Sève, 2015), cited in historical texts and, at times, subject of discussions among musicians (Cazes, 1999). An alternative form is the two-part rounded binary (A-B-A; Caplin, 1998), but it occurs rather as an exception (Almada, 2006, p. 9). The most common local key patterns for the three-part forms are given in Table 1, two for the major mode and one for the minor mode (Almada, 2006, pp. 9–10). In major pieces, the second part (B) is most often in the dominant (V) or relative key (VIm), and part C is most commonly in the subdominant key (IV), whereas in minor pieces the second part is most often in the relative key (III) and part C in the parallel key (I).

Table 1. Prototypical key relations in Choros in three-part form (Almada, 2006, pp. 9–10).

Parts     A    B     A    C    A
Major 1   I    V     I    IV   I
Major 2   I    VIm   I    IV   I
Minor     Im   III   Im   I    Im

Note: If a Roman numeral is followed by the letter ‘m’ it depicts a minor key, otherwise a major key.

Several authors observe that most parts have sixteen bars with prototypical harmonic patterns (Almada, 2006, 2012; Ferrão & Navia, 2016). Almada’s textbook contains a number of examples for both the major and minor mode which are based only on a small sample considering mainly two composers, namely Joaquim Calado and Pixinguinha (Almada, 2006, acknowledgements). According to Sève, ‘the main element to functional characterisation of each phrase is the harmony rather than the rhythm [of the melody]’ (Sève, 2015, p. 137). According to this view, harmony helps to analyse formal aspects of music, which also dovetails with theories of musical form in classical music. This would also entail that the 16-bar parts are made of smaller regular formal units such as periods or sentences (Caplin, 1998). With respect to harmonic transitions in general, we expect to see patterns that constitute the core in many Western tonal styles, such as perfect cadences, V-I patterns, and the like.

Moreover, one can find very detailed instructions about the employment of certain chords and chord types (Almada, 2006, pp. 7–8): In major keys, chords on scale degrees 1̂, 2̂, 3̂, 4̂, and 6̂ appear ‘most commonly’ as triads, the dominant appears ‘always’ as V7, and VII is ‘rarely’ used. In the minor mode, chords on scale degrees 1̂, 3̂, 4̂, and 6̂ appear ‘most commonly’ as triads, whereas the dominant can appear either as V7 or even as Vm.

In addition to these rather simple triadic harmonies, many chords in Choro have an extended structure, involving sevenths, ninths, elevenths, and so forth. Based on the history of the genre outlined above and previous work by de Souza (2016a, 2016b), we expect that Choro uses more modifications of triads and tetrads (alterations, added, and omitted notes) over time because more and more options become available, for instance due to influences from Jazz and Bossa Nova.6

6 Note that these genres are undergoing change themselves. For instance, Broze and Shanahan (2013) show changes in chord usage for Jazz between 1924 and 1968.

To summarise, we use the corpus of transcribed harmonies from the Choro Songbook to empirically evaluate the stylistic norms proposed by Almada (2006).

2. Procedure and methods

2.1. Dataset

The Choro Songbook Corpus consists of transcriptions of the chord symbols and the formal arrangement of all 295 pieces in the three volumes of the Choro Songbook (Chediak, 2009, 2011a, 2011b) into a machine-readable format. The transcriptions were executed by WFS and proofread by FCM. The scores in the Choro Songbook consist of notated melodies and bass lines as well as chord symbols. Only the chord symbols along with their metrical and formal position were transcribed (see ‘Transcription Procedure’). The chord symbols denote the root of the chord, the chord type such as major, minor, or diminished, potentially added notes, and bass notes. The different parts of the pieces are annotated with capital letters (e.g. A, B, C). Since keys are not explicitly annotated in the songbook and can only be inferred from the key signatures and final cadences, the key of a piece or part can be ambiguous. The keys of the pieces were therefore determined by the authors. Additional metadata, such as composer name and life dates, year of composition, title, and subgenre (e.g. Polca, Valsa), were collected from various resources. Besides the largest category Choro, which makes up 67% of the corpus, the most frequent subgenres are Valsa (∼10%), Choro Sambado (5%), and Polca and Maxixe (each ∼4%). The composition dates for this selection of Choros in the songbook range from 1875 to 2007 with two peaks in the 1950s and late 1970s (see the histogram in Figure 2).

Although the set of pieces in the Choro Songbook Corpus is relatively small,7 the nature of the source allows us to assume representativity to some extent because songbooks are produced in order to capture the essence of a certain genre.8 This is, evidently, subject to interpretations and decisions of the editors, thus the present selection represents just one of many possibilities of interpretation along a tradition of Choro practice.

7 The Choro archive, for instance, at the Instituto Casa do Choro contains scanned scores of more than 10,000 pieces (http://acervo.casadochoro.com.br/Works).
8 Another popular example are the Real Books (Hal Leonard Publishing Corporation, 2004), a comprehensive collection of Jazz standards.

2.2. Transcription procedure

The parts and phrase structure of the pieces are encoded in a recursive notation inspired by de Clercq and Temperley (2011). Parts are defined by key changes and phrases are groupings of bars of variable length that are, for instance, determined by repetition signs, double bars, or a coda. To ensure consistency and to enable algorithmic analysis, we captured the chord symbols with a PERL-style regular expression which is shown in the top panel of Figure 3, along with an example transcription of ‘Canhoto de Paraíba’ (Choro Songbook, volume 2) in the bottom panel that shows the overall harmonic structure as defined by the parts in two different keys as well as the phrases. Phrases in this context thus do not conform strictly to any music-theoretical definition but are rather based on notational segments of the scores in the Choro Songbook. The lengths of the phrases in this example extend from one bar (e.g. P2 and P6) to 15 bars (P3).

The last line of a transcription describes the overall formal structure of a piece, beginning with a start symbol S that specifies the global key and metre. Subsequent upward lines define the phrase structure of the individual parts (e.g. PartA) and indicate key or metre changes, if applicable, in square brackets. In almost all cases, the assignment of the global and local keys was straightforward. In the few more ambiguous cases, the final cadence of a part or piece was used to determine the key. The top lines of the transcription define the chord sequences in the phrases (e.g. P1). Phrase boundaries were determined by repetition signs, da capo, dal segno, and codas. Bars in a phrase are separated by a concluding bar line symbol ‘|’ and can contain any number of chord symbols, separated by empty spaces. The duration of chord symbols is implicitly encoded by the number of chord symbols in a bar in combination with that bar’s metre. For instance, the bar ‘C G7 C |’ in a 3/4 metre would result in a duration of a quarter note for each chord symbol. In a 2/4 metre, these chord symbols would be interpreted as a triplet. If chord symbols are not evenly spaced in a bar, they are placed on a metrical grid defined by the smallest chord duration. Consider for example the tresillo, a common rhythm in Brazilian music. In a 2/4 bar it consists of two dotted-eighth notes followed by a plain eighth note. The chord sequence ‘C G7 C’ in this rhythm would thus be transcribed as ‘C . . G7 . . C . |’. Bars without chord symbols were transcribed as empty bars that are interpreted as repetitions of the preceding harmony. Bars with rests or uninterpretable harmonies on melodies such as chromatic lines were transcribed as NC (non-chord). Added notes and alterations were enclosed in (potentially multiple) parentheses, e.g. D7(b9)(b13). If a chord symbol specifies

Figure 2. Historical distribution of pieces in the Choro Songbook.

Figure 3. PERL-style regular expression for chord symbols in the Choro Songbook Corpus (top) and transcription of ‘Canhoto de Paraíba’
(Choro Songbook, volume 2).
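The bar-level encoding described in Section 2.2 (chord symbols separated by spaces, a dot continuing the preceding chord on the metrical grid, and an empty bar repeating the preceding harmony) can be illustrated with a minimal Python sketch. This is not the authors' parser: the function name `bar_durations`, its `previous` argument, and the handling of a bar that begins with a dot are assumptions made purely for illustration.

```python
from fractions import Fraction


def bar_durations(bar, metre, previous=None):
    """Return (chord, duration in quarter notes) pairs for one transcribed bar.

    `previous` (hypothetical argument) is the harmony sounding before the bar;
    it is used for empty bars and for bars that start with a dot.
    """
    numerator, denominator = (int(n) for n in metre.split("/"))
    bar_length = Fraction(4 * numerator, denominator)  # length in quarter notes
    slots = bar.replace("|", " ").split()
    if not slots:  # empty bar: repeat the preceding harmony
        return [(previous, bar_length)]
    slot_length = bar_length / len(slots)  # every grid slot is equally long
    chords = []
    for slot in slots:
        if slot == "." and chords:
            # a dot prolongs the chord that is currently sounding
            chord, duration = chords[-1]
            chords[-1] = (chord, duration + slot_length)
        elif slot == ".":
            # assumption: a leading dot continues the harmony of the previous bar
            chords.append((previous, slot_length))
        else:
            chords.append((slot, slot_length))
    return chords


print(bar_durations("C G7 C |", "3/4"))
print(bar_durations("C . . G7 . . C . |", "2/4"))
```

Exact fractions are used so that the triplet reading of ‘C G7 C |’ in 2/4 (two thirds of a quarter note per chord) and the tresillo (two dotted eighths and one eighth) are represented without rounding.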

a bass note, it is transcribed after a slash, e.g. Gm/Bb, similar to the notation used by Harte (2010).

2.3. Parsing and transformation

Since the notated scores in the Choro Songbook contain many repetitions and da capo instructions, all pieces were parsed and expanded automatically, so that the final sequence of chord symbols corresponds to a full performance of the piece as notated. An exemplary parse-tree of the piece ‘Canhoto de Paraíba’ is shown in Figure 4. This piece is in A major and in 2/4, as indicated by the top-level symbol S[A, 2/4]. It modulates in its middle part (PartB[F#m]) to the relative key of F sharp minor and subsequently returns to A major. The modulation back to A major does not need to be encoded explicitly, because it is inherited from the parent node (S) in the tree. The modulation structure of the piece gives it a three-part form (A-B-A), one of the most common patterns (see ‘Key relations within pieces’).

These expanded transcriptions were subsequently transformed into Roman numeral symbols in order to allow for pattern discovery and comparison across different pieces. In this step, all parts of the chord symbol remained invariant, except the root and the bass note. The root was translated to a Roman numeral relative to its local key, and the bass note was translated to an Arabic number relative to the root. The chord symbol Dm/A was

Figure 4. Hierarchical representation of ‘Canhoto de Paraíba’ as parsed from the transcription including all repetitions. Some non-
terminal symbols (e.g. P1 and P3) appear several times but are encoded only once (see Figure 3). The global key of the piece is A major
but part B (PartB) is in F sharp minor, the relative key.
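The translation step described in Section 2.3 (root to a Roman numeral relative to the local key, bass note to an Arabic number relative to the root, everything else unchanged) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function `to_roman`, the helper names, and the simplified chord pattern are hypothetical. As a further assumption, every root is measured against the major scale on the local tonic and the bass is given as a plain interval number; how the corpus spells roots in minor local keys is not reproduced here.

```python
import re

LETTERS = "CDEFGAB"
NATURAL_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII"]
# simplified pattern (assumption): root, remainder of the symbol, optional bass
CHORD = re.compile(r"^([A-G][#b]*)([^/]*)(?:/([A-G][#b]*))?$")


def pitch_class(note):
    """Pitch class (0-11) of a note name such as 'F#' or 'Bb'."""
    return (NATURAL_PC[note[0]] + note.count("#") - note.count("b")) % 12


def steps(lower, upper):
    """Number of letter steps (0-6) from `lower` up to `upper`."""
    return (LETTERS.index(upper[0]) - LETTERS.index(lower[0])) % 7


def to_roman(chord, tonic):
    """Translate an absolute chord symbol relative to the tonic of a local key."""
    match = CHORD.match(chord)
    if match is None:  # e.g. 'NC' (non-chord) is passed through unchanged
        return chord
    root, rest, bass = match.groups()
    degree = steps(tonic, root)
    semitones = (pitch_class(root) - pitch_class(tonic)) % 12
    offset = (semitones - MAJOR_SCALE[degree] + 6) % 12 - 6
    accidental = "#" * offset if offset > 0 else "b" * -offset
    roman = accidental + ROMAN[degree] + rest
    if bass is not None:
        roman += "/" + str(steps(root, bass) + 1)
    return roman


print(to_roman("Dm/A", "F"))  # VIm/5
print(to_roman("Dm/A", "C"))  # IIm/5
```

The two calls reproduce the example from the text, where Dm/A becomes VIm/5 in F major and IIm/5 in C major.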

accordingly translated to VIm/5 in the key of F major and to IIm/5 in the key of C major. The corpus contains 682 unique Roman numeral chord symbols after the translation.

All transcriptions were transformed into a data frame and saved in tab-separated value (TSV) format in order to facilitate the analysis. That data frame contains the absolute and relative (to the local key) chord symbols; scale degrees; bar numbers; chord durations; global and local keys, modes, and metres; path; root note; chord type; added and omitted notes; bass note, and the filename of the transcription. The data frame and the transcriptions are freely available at https://doi.org/10.5281/zenodo.3881347 for non-commercial use.

2.4. Operationalizations

To evaluate whether Almada’s (2006) description of Choro holds up to empirical scrutiny, we operationalise the involved concepts in a quantitative way. All statements regarding the frequency of occurrence of keys, sequences of keys, and chords were quantified directly by observing the absolute frequency f(x) of the respective items x, or sequences of items $x = (x_1, \ldots, x_n)$ (also called n-grams; Manning & Schütze, 2003) in the dataset. The relative frequency p(x) of an item x is the absolute frequency f(x) divided by the total number N of item occurrences in the dataset. Relative frequency is used as an estimator for the probability p of the item x, p(x) = f(x)/N, e.g. a chord symbol, occurring in the musical genre Choro. Oftentimes, relative frequencies are not taken with respect to the size of the whole corpus but rather relative to the size of the subsets of all major and minor pieces, respectively. The diachronic usage of chord modifications was likewise quantified by the relative frequency of occurrence per decade.

This approach is useful for individual chords or short sequences thereof. It is problematic if the sequences become so long that they occur only once in the corpus and hence their frequencies are all identical. Since we are particularly interested in the relation of harmony and form, we need to take longer sequences of chords (e.g. in 16- or 32-bar phrases) into account. To circumvent the mentioned obstacle, we apply a unigram model that considers not the relative frequencies of the entire phrases (16-grams, 32-grams) but only the per-bar relative frequency of chords within the phrases. Here, the unigrams correspond to bars and might thus consist of several chords contained in a bar. The likelihood of each phrase is then calculated as the product of the probabilities of the respective bars which, in turn, are estimated by the relative unigram frequencies of chords within a given bar.

A common distinction in Natural Language Processing (NLP) is made between tokens and types (Manning & Schütze, 2003). Tokens are concrete occurrences of items, such as chords or keys, at a specific position in the dataset. Types, on the other hand, are the alphabet of different symbols. The type-token ratio (TTR; Milička, 2012) is thus a measure of lexical diversity. Moreover, the detailed representation scheme of chord symbols in the Choro dataset allows for detailed investigations of the distributions of chord types given a scale degree, for example the distribution over chord types V, V7, V7(9), and others given scale degree V. In particular, the entropy of these distributions is used to compare different scale degrees. Entropy is commonly used as a measure to quantify

uncertainty (Margulis & Beatty, 2008; Pearce, 2018). The entropy H(X) of a discrete random variable X over an alphabet of finite size N is defined as

\[ H(X) = -\sum_{x} p(x) \log_2 p(x). \]

The entropy is maximal if all outcomes are equally likely with probability 1/N. Consequently, the maximal entropy is $H_{\max}(X) = \log_2(N)$. Because distributions generally do not have the same support, it is necessary to use the normalized entropy instead, also known as efficiency,

\[ H_{\mathrm{norm}}(X) = H(X) / H_{\max}(X). \]

Note that the efficiency is bounded between 0 and 1. The complementary quantity $1 - H_{\mathrm{norm}}$ is called redundancy (MacKay, 2003).

To compare chord symbols, we introduce a simple proxy for a metric of chord symbol dissimilarity. The chord symbols in the Choro Songbook Corpus consist of several chord features defined by the parts of the regular expression in Figure 3. The metric takes into account the chord features root (e.g. I, V, bIII), chord type (e.g. m, +, o), chord alterations (e.g. omit3, 7, b13), and bass notes (e.g. 3, 5). The overall chord dissimilarity of two chord symbols $c_1$ and $c_2$ is defined as the weighted sum of individual metrics for each feature,

\[ d(c_1, c_2) = \sum_{f \in F} \lambda_f \, d_f(c_1, c_2), \]

with features F = {root, type, alteration, bass} and respective weights $\lambda_f$. In our approach, we set the weight of each feature to $\lambda_f = 0.25$.

Distance between chord roots is measured by their distance on the line of fifths (Temperley, 2000). Accordingly, roots I and V have a distance of 1 corresponding to either a perfect fifth or a perfect fourth, whereas the roots bIII and #II have a distance of 12, the distance of enharmonic equivalence. The distance between chord types is measured as a Boolean feature. If two chords $c_1$ and $c_2$ are of the same type (major, minor, diminished, or augmented), this feature distance is 0, otherwise it is 1. Analogously, the bass note distance is 0 if the two chords have the same bass note (relative to the root) and 1 otherwise. The chord alterations metric is defined as the cardinality of the symmetric difference between the two sets of extensions. For example, if the chords are V7 and V7(b9)(#11), the symmetric difference of their extensions is |{7} ∪ {7, b9, #11}| − |{7} ∩ {7, b9, #11}| = 3 − 1 = 2.

The overall metric d can also be extended to measure the dissimilarity of bars that each might contain multiple chords. In this case, the dissimilarity of two bars is defined as the arithmetic mean of the dissimilarities of all pairs of chords between the two bars $B_1$ and $B_2$,

\[ d_{\mathrm{bar}}(B_1, B_2) = \frac{1}{|B_1| \cdot |B_2|} \sum_{c_1 \in B_1,\; c_2 \in B_2} d(c_1, c_2), \]

where $|B_1|$ and $|B_2|$ are the cardinalities of bars $B_1$ and $B_2$, respectively. If both bars contain only one chord each, $d_{\mathrm{bar}}$ is equal to d.

3. Results and discussion

3.1. Basic statistics

We begin our analysis of Choro by observing a number of basic statistics concerning the frequencies of pieces, chord symbols, sub-genres, and composers. The final dataset consists of 295 transcribed and expanded pieces9 with a total number of 44,067 chord tokens (682 unique types). The songbook contains pieces by 180 composers in 19 sub-genres. The two most prominent composers are Pixinguinha and Jacob do Bandolim with 32 and 31 compositions, respectively, together contributing more than a fifth of all pieces in the songbook. In some cases, the Choro Songbook accounts for multiple composers, or composer-lyricist duos, for example Pixinguinha and Benedito Lacerda.10

9 Two pieces, ‘Um chorinho em aldeia’ and ‘Chorinho pra você’, appear twice in the songbooks. Only one instance of each was used for the analyses.
10 Although Lacerda attended recordings with Pixinguinha from 1946 until 1950, they did not compose together. Lacerda is mentioned in the Choro Songbook as a composer because he was able to do business with editors in order to promote Pixinguinha’s music. The piece ‘Um a zero’, for instance, was written by Pixinguinha in 1918 but they didn’t know each other yet (da Silva & Filho, 1998, p. 148).

3.2. Global keys and metres of pieces

The distribution of global keys for the major and the minor mode is shown in Figure 5.11 About 63% of the pieces are in major and 37% in minor. This key distribution reveals that keys with fewer accidentals are preferred. It is particularly noteworthy that Dm dominates the minor keys with almost 15.9% occurrence. One interpretation could be that some keys are more idiomatic for certain instruments, for instance for the clarinet. The vast majority of the pieces is in 2/4 metre (86.1%) with the remaining pieces being in 3/4 (10.5%) representing the Valsas,12 or 2/2 (3.4%).

Almada’s observation that simple global keys with few accidentals and the 2/4 metre are most recurrent is largely

11 We largely corroborate the global key frequencies in Sève (2015, p. 118), although he only reports counts for 286 of the 296 pieces of the Choro Songbook.
12 The only exception is ‘Os três chorões’ (Choro Songbook, volume 1) which is a Choro in 3/4 but not a Valsa.
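The chord and bar dissimilarity measures defined in Section 2.4 can be made concrete with a short Python sketch. It follows the definitions given there (line-of-fifths distance for roots, Boolean distances for chord type and bass note, symmetric difference for alterations, equal weights of 0.25), but the `Chord` data structure, the field names, and the function names are hypothetical and chosen only for illustration; they are not taken from the published dataset or code.

```python
from dataclasses import dataclass
from itertools import product

# positions of the unaltered scale degrees on the line of fifths
FIFTHS = {"IV": -1, "I": 0, "V": 1, "II": 2, "VI": 3, "III": 4, "VII": 5}


@dataclass(frozen=True)
class Chord:
    root: str                             # e.g. "I", "V", "bIII"
    chord_type: str = ""                  # "" (major), "m", "+", "o"
    alterations: frozenset = frozenset()  # e.g. {"7", "b9"}
    bass: str = ""                        # "" when no bass note is given


def fifths_position(root):
    """Position of a Roman numeral root on the line of fifths."""
    numeral = root.lstrip("#b")
    accidentals = root[: len(root) - len(numeral)]
    shift = accidentals.count("#") - accidentals.count("b")
    return FIFTHS[numeral] + 7 * shift


def chord_dissimilarity(c1, c2, weight=0.25):
    """Equally weighted sum of the four feature distances."""
    features = (
        abs(fifths_position(c1.root) - fifths_position(c2.root)),
        int(c1.chord_type != c2.chord_type),
        len(c1.alterations ^ c2.alterations),  # symmetric difference
        int(c1.bass != c2.bass),
    )
    return weight * sum(features)


def bar_dissimilarity(bar1, bar2):
    """Mean chord dissimilarity over all chord pairs of two bars."""
    pairs = list(product(bar1, bar2))
    return sum(chord_dissimilarity(a, b) for a, b in pairs) / len(pairs)


v7 = Chord("V", alterations=frozenset({"7"}))
v7_altered = Chord("V", alterations=frozenset({"7", "b9", "#11"}))
print(chord_dissimilarity(v7, v7_altered))  # 0.25 * 2 = 0.5
```

In this sketch the root distance is not rescaled, so it can outweigh the other three features, exactly as in the weighted sum stated in the text; any normalisation would be an additional design decision.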

Figure 5. Percentage distribution of global keys in the Choro Songbook. Major keys are displayed as blue bars and minor keys as orange bars.
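Percentages such as those in a key distribution are relative frequencies in the sense of Section 2.4, and the evenness of such a distribution can be summarised by the normalized entropy defined there. The following Python sketch illustrates both quantities. The function names are hypothetical, the key list is invented toy data rather than corpus counts, and taking N to be the number of observed types is an assumption of the sketch.

```python
from collections import Counter
from math import log2


def relative_frequencies(items):
    """Estimate p(x) = f(x) / N for every distinct item x."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


def normalized_entropy(probabilities):
    """Entropy divided by its maximum log2(N) (efficiency).

    N is taken to be the number of outcomes with non-zero probability
    (an assumption of this sketch).
    """
    probs = [p for p in probabilities if p > 0]
    if len(probs) < 2:  # a single outcome carries no uncertainty
        return 0.0
    entropy = -sum(p * log2(p) for p in probs)
    return entropy / log2(len(probs))


toy_keys = ["Dm", "Dm", "F", "C"]  # invented example, not corpus data
p = relative_frequencies(toy_keys)
print(p)
print(normalized_entropy(p.values()))
```

The redundancy mentioned in the text is then simply one minus the returned efficiency.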

supported by the empirical data, although the key of E minor is much less common than one would assume based on his book. The stereotypical Choro piece is thus in major, has at most one accidental and is in a 2/4 metre. If it is in minor, it is very likely to be in D minor.

3.3. Key relations within pieces

Apart from these rather superficial observations, we now analyse whether Choro features prototypical modulations within the pieces. To evaluate the relation of keys within a piece, all keys are expressed relative to the global key with a Roman numeral, e.g. if a major piece modulates to the dominant key and back to the tonic, the key relations would be I-V-I. Subsequently, all key changes (key bigrams) between parts are extracted and counted for each piece. An overview of all key transitions in this dataset is shown in Figure 6. It maps the keys and key changes onto Schoenberg’s map of key relations (Schoenberg, 1969), independently for major and minor pieces, where the width of the arrows represents the estimated transition probabilities.

Two observations can be made. First, keys that are close to the tonic (I or Im) are more frequent, although more distant keys occur occasionally. This corroborates that closely related keys are preferred. Second, the comparison of the major and minor key patterns shows that pieces in minor keys modulate to more keys, which are also more distant than in major. The most common key transitions are between the tonic and its relative and subdominant keys in major, and between the tonic and its relative and parallel keys in minor. Finally, modulations to the dominant key (V), which are commonplace in Western classical music, are rather rare in both modes. This is particularly noteworthy since Almada’s prime pattern for modulations in the major mode does include the dominant key at a prominent position (see Table 1). We attribute this finding to the influence of other non-classical genres such as Rock and Jazz on Choro. Unfortunately, recent corpus studies on these genres do not report statistics on key transitions within pieces (Broze & Shanahan, 2013; de Clercq & Temperley, 2011), so that more exact conclusions cannot be drawn at this point.

3.4. Formal arrangement within pieces

Given that Choro pieces modulate most often to keys that are close to the tonic by fifths or thirds, and that local key changes also define the parts of a piece, we expect not only to see recurrent key pairs, but also that only a small number of patterns govern the overall structure of a piece. For an example, see Figure 4. The corpus exhibits quite a diverse distribution of part lengths, as is shown in Figure 7. There are three peaks at lengths of 16, 30, and 32 bars. Among all 786 parts in the corpus, the 258 32-bar parts (33% of all parts in the data) and the 59 16-bar parts (7.5% of all parts) stand out in particular. Note that the frequencies of lengths are shown on a logarithmic scale. The 83 irregular parts of length 30 bars (10.6% of all parts) can occur, for instance, when repetition and da capo patterns are involved. All other part lengths are much less frequent.

We expect two-part and three-part forms in particular to be prevalent. Figure 8 lists the most frequent patterns of formal arrangement, namely those that account for more than 1% of the entire dataset, sorted by their rank. Almada’s table of key transitions (Table 1) suggests that pieces follow a small number of prototypes and are either in three-part (A-B-A-C-A) or two-part (A-B-A) form. Indeed, both in major and minor pieces the most common patterns are found in this table. In major, by far the most common pattern is ‘I-VIm-I-IV-I’ (three-part form). It occurs 30.3% of the time and is the second pattern in Almada’s table, although it is not clear whether Almada’s table is to be read as a ranked list. It is followed

Figure 6. Modulations in the Choro Songbook projected to Schoenberg’s (1969) map of key relations. The vertical axis corresponds to
the line of fifths and the horizontal axis to alternating relative and parallel keys, but only the keys that occur in the corpus are shown. The
arrow strengths are proportional to the frequency of occurrence of a key transition in the corpus.
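The transition counts behind a plot of this kind can be sketched as follows. The snippet assumes that every piece is already available as a list of part-wise keys written as Roman numerals relative to the global key; the function names and the toy data are hypothetical and only illustrate how key bigrams and estimated transition probabilities could be derived.

```python
from collections import Counter


def key_bigrams(keys):
    """Consecutive (from_key, to_key) pairs of one piece's relative key sequence."""
    return list(zip(keys, keys[1:]))


def transition_probabilities(pieces):
    """Estimate P(to_key | from_key) from the key sequences of several pieces."""
    counts = Counter(bg for keys in pieces for bg in key_bigrams(keys))
    totals = Counter()
    for (src, _), n in counts.items():
        totals[src] += n
    return {bg: n / totals[bg[0]] for bg, n in counts.items()}


# Hypothetical toy input: three major-mode pieces, keys relative to the tonic.
pieces = [
    ["I", "VIm", "I", "IV", "I"],
    ["I", "IV", "I"],
    ["I"],  # a non-modulating piece contributes no bigrams
]
probs = transition_probabilities(pieces)
for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

Normalising by the source key is one possible reading of "estimated transition probabilities"; normalising by the total number of bigrams would give relative frequencies instead.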

Figure 7. Absolute frequencies of part lengths in the dataset on a logarithmic scale.

by ‘I’ (non-modulating, 12.4%), and ‘I-X-I’ (rounded binary), where ‘X’ can be the relative (VIm, 12.4%), subdominant (IV, 10.8%), or parallel (Im, 6.5%) key and can thus be understood as a shortened version of the three-part form. Almada’s second prototypical form is the most common three-part form but ranked sixth with only 6% among all formal patterns in major. The two most common key patterns in minor are ‘Im-III-Im-I-Im’ (three-part) with modulations to the relative (III) and the parallel (I) keys, and ‘Im-III-Im’ (20%), followed by ‘Im-I-Im’ (19.1%, both two-part), and the non-modulating ‘Im’ (11.9%). All percentages are relative to the number of pieces per mode.

It is furthermore surprising that the second-ranking key pattern in major does not modulate at all. Regarding the two-part pieces (A-B-A), the sum of the third, fourth, and fifth key patterns in major is 28.8% and the sum of the second and third in minor is 38.7%. Contrary to Almada’s assertion that two-part forms are the exception, it becomes clear that the pieces with two different parts play an important role for the genre. Almada probably did not consider this aspect because of his focus on older composers such as Pixinguinha and Callado. The relatively small number of prevalent formal patterns fits well with a musical genre that allows for improvisation and largely takes place in the context of social gatherings where the distinction between performers and audience cannot always be strictly drawn. Accordingly, a small number of standard patterns facilitates the participation of musicians during a performance.

3.5. Harmonic patterns in 16- and 32-bar parts

Apart from knowing the overall key and formal structure of a piece, a successful performance also needs to

Figure 8. Most frequent formal arrangements and key patterns of transitions between sections in major (top) and minor pieces (bottom), respectively. Only patterns with a relative frequency greater than 1% are shown (linear scale).

coordinate the harmonies within a part of a piece, especially when songbooks or leadsheets are not available. Accordingly, we also expect to find strong regularities on the level of parts. Despite the variety in part lengths (see Figure 7), the largest number of parts in Choro is regular and either 16 or 32 bars long. For an example see Figure 4, where both PartA and PartB are 32-bar parts and PartA is a repetition of a 16-bar subpart (the combination of P1 and P2). Almada (2006, pp. 20–23) provides two tables for the six most common 16-bar phrases in major and the four most common 16-bar phrases in minor, which are reproduced in Table 2.

All patterns end with a I chord in bar 16 (in major as well as in minor), whereas bar 8 always includes a V chord. In order to compare the bars systematically, we calculate the redundancy of each bar, as defined above (see ‘Operationalizations’). Redundancy can here be interpreted as a measure of how certain one is for a specific chord symbol to occur in a specific bar in a 16-bar phrase. It is calculated over the harmonic content that a bar can contain, regardless of whether it is one or several chords. The maximal redundancy of 1 is achieved when a bar contains exactly the same chord(s) in all patterns (in Table 2, these are rows A–F in major, and rows A–D in minor). The redundancy of a bar is zero if no two patterns contain the same chord in this bar. The redundancy values for all bars are shown in Figure 9.

One can notice that bars 4, 8 and 16 have very high redundancies in both modes. In minor, bar 12 also has a high redundancy, equal to the value of bar 11, but the values in minor are only based on four different patterns (rows A–D in Table 2) and are thus less nuanced than the ones for major. Nonetheless, bars 11 and 12 overall feature authentic, evaded, or deceptive cadences, rendering bar 12 a momentary point of harmonic stability before the phrases conclude in an authentic motion to the tonic in bar 16. More recently, Ramos (2016) considered 78 16-bar parts from 26 pieces by Pixinguinha, which show a pattern similar to Almada’s example. This observation suggests that (sub-)phrase boundaries are marked harmonically. From another point of view, the fluctuation of values can be interpreted as the degree of harmonic stability, which also shapes a listener’s or accompanist’s predictions through implicit learning (Huron, 2006; Rohrmeier & Rebuschat, 2012; Tillmann, 2005). If particular bars in 16-bar parts consistently exhibit high redundancy, the harmony in these bars is easier to predict for people familiar with this idiom and may very well guide accompaniment or improvisation in a Choro performance.

The prototypical 16-bar patterns in Table 2 contain only Roman numerals and chord forms, and should thus be understood as abstractions from the wide range of chords that are possible in Choro. In fact, the empirical distributions of 16- and 32-bar phrases contain no exact duplicates under transposition invariance at all, which renders ranking the most common phrases impossible. To avoid this pitfall, we apply a per-bar unigram model as described in ‘Operationalizations’. According to this model, the five most likely empirical 16-bar phrases for both major and minor are shown in Table 3 and the five most likely 32-bar phrases are shown in Table A1 in the Appendix.

Note that, for both modes, the most likely 16-bar pattern starts off-tonic with a V7 chord in bar 1. Furthermore, comparing the eighth bar of the empirical phrases with the theoretical ones shows that it does not always contain a V chord as Almada suggests, but can also involve other chords with dominant function, e.g. VII in minor, and in major even tonic chords. The last bar involves mostly tonic chords in both modes, but can also end in a III7 chord which is the dominant of the tonic

Table 2. Theoretical prototypical harmonic progressions (Almada, 2006, pp. 20–23).

Major  1        2        3        4        5        6        7        8
A      I-V◦/II  II       II-V     I        IV       I        V/V      V
B      V        I-V/II   V/V-V    I        V/VI     VI       V/V      V
C      I6-III◦  II       V        I        V/VI     VI       V/III    III-V
D      I        V        VI       V/VI     V/VI     V/II     V/V      V
E      I-V      I        VI-V/VI  VI       V◦/III   I6       V/V      V
F      I-V/VI   VI-V/II  II-V     I        V/III    III      V/III    III-V

       9        10       11       12       13       14       15       16
A      I-V◦/II  II       V/VI     VI       V◦/III   I6-V/II  II-V     I
B      V        I-V/II   V/IV     IV       IV       I        V/V-V    I
C      I6-III◦  II       V        I        V/II     V/V      V        I
D      I        V        VI       V/VI     II-IVm   I6-III◦  II-V     I
E      I-V/VI   V/II     II-V/II  II       IV-V◦/V  I4-V/II  II-V     I
F      I-V/VI   VI-V/II  II-V     V/II     IV-IVm   I-V/II   V/V-V    I

Minor  1        2        3        4        5        6        7        8
A      I-V      I-V/IV   IV-V/IV  IV       I-V      IV       V        V
B      V        I        V/IV     IV       V/III    IV       I        V
C      V        I        V        I        V        I-V/V    Vm       V
D      I        V/V      V        I        I        V/III    III      V

       9        10       11       12       13       14       15       16
A      I-V      I-V/IV   IV-V/IV  IV       II       I        V        I
B      V/III    III      V        I        VI-V     I        V/V-V    I
C      V        I        V        I        V/V-V    I        V        I
D      I        V/V      V        I        IV-V     I        II-V     I

Figure 9. Redundancies of 16-bar phrases based on Almada (2006).

in the relative key (VIm) to which Choros in major often modulate. In minor, the second most likely pattern ends on a VII7 chord, which is the dominant of the relative key in minor (III), also a frequent target for modulation in the minor mode (compare Figure 6). Although the empirical chord progressions in Table 3 are more diverse than the ones given by Almada, one has to bear in mind that there are 685 unique chord types in total and the variety is considerably smaller than it could be based on this vocabulary size. In other words, the regularity is not as clear as in the theoretical predictions but still impressive considering the number of possibilities.

So far, we have seen that redundancy indicates a tight connection between harmony and form because it tends to be higher towards the end of subphrases (e.g. bars 1–8, 9–16), at least for Almada’s theoretical predictions. We have also compared the theoretical 16-bar phrases proposed by Almada with the five most likely phrases according to a unigram model. Does the result hold for the whole dataset? To investigate this question, all 16- and 32-bar parts in the Choro Songbook Corpus are inspected. We include the 32-bar parts here because we assume that most of the 32-bar phrases are indeed repeated 16-bar phrases as in ‘Canhoto de Paraíba’ (Figure 4). The redundancy and efficiency (see ‘Operationalizations’) are calculated relative to the size of the respective parts (N = 16 or 32). Figure 10 shows the redundancy (dark grey) and efficiency (light grey) values for each bar. Sixteen-bar parts (60 of 787 parts in total) have a mean redundancy of 0.4 (black dashed line, top panel). The lower panel displays efficiency and redundancy for all 257 32-bar parts (mean redundancy 0.52; black dashed line). Recall that these quantities are complementary and that their sum is always 1. The bars in Figure 10 are stacked to emphasize this complementary behaviour.
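The complementary relation between efficiency and redundancy can be made concrete with a short sketch. The exact definitions are given in ‘Operationalizations’, which is not reproduced here, so the normalisation below (Shannon entropy divided by the logarithm of the number of observed patterns) is an assumption. It is chosen because it reproduces the two boundary cases stated in the text: redundancy 1 when all patterns agree in a bar, and redundancy 0 when no two patterns share the same content.

```python
import math
from collections import Counter


def efficiency(observations):
    """Normalised Shannon entropy of the harmonic content observed in one bar."""
    n = len(observations)
    if n < 2:
        return 0.0  # a single observation carries no uncertainty
    counts = Counter(observations)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    # Assumption: normalise by the log of the number of observations.
    return entropy / math.log(n)


def redundancy(observations):
    """Complement of the efficiency, so that both always add up to 1."""
    return 1.0 - efficiency(observations)


# Bars 8 and 16 of the six major-mode patterns in Table 2 (rows A to F).
bar_8 = ["V", "V", "III-V", "V", "V", "III-V"]
bar_16 = ["I", "I", "I", "I", "I", "I"]
print(redundancy(bar_8), redundancy(bar_16))
```

Under this assumption bar 16 reaches the maximal redundancy, while bar 8 stays high but below 1 because two of the six patterns precede the V chord with III.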

Table 3. The five most likely empirical 16-bar phrases in major and minor.


Major 1 2 3 4 5 6 7 8
1 V7 I V7 I V7 I V7 I
2 I IIm V7 I IIm V7 I I
3 I IIm7-V7 IV7 III7 II7-V7 I V-II7 V7
4 I IIm7-V7 I/3-bIIIo IIm7-V7 I V-IIIm7 II7/5-II7 V7
5 V7 I III7/3# VIm IIm/3 I V7 I
9 10 11 12 13 14 15 16
1 V7 I V7 I V7 I V7 I
2 I IIm V7 I IIm V7 I I
3 I IIm7-V7 IV7-IV7/3 VI/7 IIm/3b-#IVo I/5 IIm7-V7 I
4 I-VIm7 IIm7-V7 I/3-bIIIo IIm7-V7 VI7 IIm-IVm IIm7-V7 I-III7
5 V7 I III7/3# VIm IIm/3 I V7 I-V7-III7
Minor 1 2 3 4 5 6 7 8
1 V7 Im V7 Im-I7 IVm-IIm7(b5) Im/3 II7/3# V7
2 V7 Im V7 Im-Im7 IV7 VII IV7 VII
3 Im-Im/7 II7(b9)/5-IVm6/3b V7-V/7 Im/3-V7 Im-#VIm7(b5) Vm/3-III7 VI-II7 V7
4 Im-Im/7 IVm/3 V7 Im-V7 Im-Im/7 IVm/3 Im-#VIo-V7 Im-V7
5 IIm7(b5)-V7 Im-Im/7b IVm6/3b-V7 Im-I7 IVm-IIm7(b5) Im-Im/7b II7/5-VI7(b5) V7
9 10 11 12 13 14 15 16
1 V7 Im V7 Im-I7 IVm-IIm7(b5) Im/3-Im/7 IVm6/3-V7 Im
2 V7 Im V7 Im-Im7 IV7 VII IV7 VII-VII7
3 Im-Im/7 II7(b9)/5-IVm6/3b V7 Im-I7(b9) IVm-IIm7(b5) Im/3-VI7 bII-V7 Im
4 Im-Im/7 IVm/3 V7 Im-V7 Im-Im/7 IVm/3 Im-#VIo-V7 Im
5 IIm7(b5)-V7 Im-Im/7b IVm6/3b-V7 Im-I7 IVm-IIm7(b5) Im-Im/7b II7/5-VI7(b5) V7
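A ranking like the one in Table 3 could be produced along the following lines. The per-bar unigram model itself is defined in ‘Operationalizations’; the sketch below is one plausible reading and not the authors' code. It assumes that each bar position has its own frequency table and that a phrase is scored by the sum of the log-probabilities of its bars. The function name and the 4-bar toy phrases are hypothetical.

```python
import math
from collections import Counter


def rank_phrases(phrases):
    """Sort phrases by log-likelihood under independent per-bar distributions."""
    n_bars = len(phrases[0])
    total = len(phrases)
    # One frequency table per bar position, estimated from all phrases.
    per_bar = [Counter(p[b] for p in phrases) for b in range(n_bars)]

    def log_likelihood(phrase):
        return sum(math.log(per_bar[b][phrase[b]] / total) for b in range(n_bars))

    return sorted(phrases, key=log_likelihood, reverse=True)


# Hypothetical 4-bar toy phrases (real parts have 16 or 32 bars).
phrases = [
    ("V7", "I", "V7", "I"),
    ("I", "IIm", "V7", "I"),
    ("V7", "I", "IIm", "I"),
]
ranking = rank_phrases(phrases)
print(ranking[0])
```

Because every phrase is scored against bar-wise frequencies rather than counted as a whole, a ranking exists even when no phrase occurs twice in the data.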

Figure 10. Efficiency (light grey) and redundancy (dark grey) for 16- and 32-bar parts. The dashed line is the mean redundancy.

The dark grey redundancy bars resemble a hypermetrical grid (Lerdahl & Jackendoff, 1983) where phrase boundaries are found at bar numbers that are multiples of 4, such as 4, 8, 16, and 32. In the 32-bar plot, the highest redundancy value is in bar 16. This is the last bar before the repetition, since in many cases the 32-bar sections consist of a repeated 16-bar phrase. Highest redundancy means that uncertainty is minimal. This reflects the fact that bar 16 very often contains varieties of V that lead back to the beginning in bar 1. The somewhat lower redundancy (smaller dark grey bar) in bar 32 is due to the fact that very often modulations to new keys take place (compare to the transcription in Figure 3). While the key patterns are quite stable in Choros (see ‘Key relations within pieces’), the transition from bar 32 to bar 1 of the subsequent phrase is less regular than the transition from bar 16 to bar 17, which very often is the same as bar 1 of the same phrase.

Furthermore, it seems that the pattern of redundancies for bars 1 to 16 in 32-bar parts is very similar to the pattern for bars 17 to 32, indicating a correspondence in redundancy between bars that are 16 bars apart. To confirm that the similar patterns in bars 1–16 and 17–32 are indeed the result of a repetition, i.e. based on harmonic similarity, and not just a mere coincidence of similar redundancy values, we calculate the chord distance (see ‘Operationalizations’) between each pair of bars with a distance of 16 bars, e.g. bar 1 and bar 17, bar 2 and bar 18, up to bar 16 and bar 32. If two bars contain exactly the same chords, their distance is zero.
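The root component of this comparison can be sketched in Python. The values of d_root in Table 4 appear consistent with counting steps on the line of fifths (1 for a fifth, 2 for a major second, and so on), and the code below relies on that reading. The scale-degree positions, the natural-minor reading of numerals in minor, the one-chord-per-bar simplification and all function names are assumptions of this sketch; the full metric d is a weighted sum of several features whose weights are not given in this section.

```python
import re

# Line-of-fifths positions of the scale degrees, tonic = 0 (assumed mapping).
FIFTHS = {
    "major": {"I": 0, "V": 1, "II": 2, "VI": 3, "III": 4, "VII": 5, "IV": -1},
    "minor": {"I": 0, "V": 1, "II": 2, "IV": -1, "VII": -2, "III": -3, "VI": -4},
}


def root_position(chord, mode="major"):
    """Line-of-fifths position of a chord root such as 'V7', 'bIIIo' or '#IVo'."""
    match = re.match(r"([#b]*)(VII|VI|IV|V|III|II|I)", chord)
    if match is None:
        raise ValueError(f"no Roman numeral root in {chord!r}")
    accidentals, degree = match.groups()
    # Each sharp moves the root seven fifths up, each flat seven fifths down.
    shift = 7 * (accidentals.count("#") - accidentals.count("b"))
    return FIFTHS[mode][degree] + shift


def d_root(chord_a, chord_b, mode="major"):
    """Absolute distance between two chord roots on the line of fifths."""
    return abs(root_position(chord_a, mode) - root_position(chord_b, mode))


def paired_root_distances(bars, offset=16, mode="major"):
    """Root distances between every bar and the bar `offset` bars later."""
    return [d_root(bars[i], bars[i + offset], mode) for i in range(len(bars) - offset)]


# Hypothetical 32-bar part with one chord per bar.
part = ["I", "V7"] * 8 + ["I", "IIm"] * 8
print(paired_root_distances(part))
```

For a part whose second half repeats the first half literally, every paired distance is zero; substitutions such as IIm for V7 show up as small non-zero values.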

Table 4. Root interval classes and their relative frequencies in 32-bar phrases.

d_root  IC     Name                                 Frequency
0       P1/P8  Unison/octave                        .648
1       P4/P5  Perfect fourth/perfect fifth         .154
2       M2/m7  Major second/minor seventh           .040
3       m3/M6  Minor third/major sixth              .073
4       M3/m6  Major third/minor sixth              .021
5       m2/M7  Minor second/major seventh           .018
6       A4/d5  Augmented fourth/diminished fifth    .014
7       A1/d8  Augmented unison/diminished octave   .019
8       A5/d4  Augmented fifth/diminished fourth    .003
9       A2/d7  Augmented second/diminished seventh  .004
10      A6/d3  Augmented sixth/diminished third     .005
11      A3/d6  Augmented third/diminished sixth     .001
12      A7/d9  Augmented seventh/diminished second  –

Not accounting for octave differences, which is impossible with Roman numerals, the distance between roots can be expressed in terms of interval classes (IC). Table 4 shows a range of interval classes and corresponding values of d_root and their names, encompassing diatonic (0–6), chromatic (7–11) and enharmonic (12) interval classes.

The last column of Table 4 shows the relative frequencies of these root distances in the Choro Songbook Corpus. Clearly, most of the time chords in 32-bar phrases that are 16 bars apart have the same root (d_root = 0) and the second most common relation between roots is a fifth relation (d_root = 1). Other intervals are much less frequent.

The actual distance d(c1, c2) between two chords c1 and c2 that are 16 bars apart is calculated as the weighted sum of several features (see ‘Operationalizations’). Moreover, while the root distance can vary greatly (Table 4 shows values up to 12), the other feature distances are much more restricted: the chord type metric is either 0 or 1, as is the bass metric, and since no chord in the dataset contains more than three extensions, the chord alterations metric is at most 6. The observed chord distances are shown in Figure 11; error bars represent standard deviations from the mean. To interpret these values, it is important to note that, despite the obvious variability, no pair of bars in 32-bar phrases exceeds an average chord distance of .5, a consequence of the interval class frequencies in Table 4. The average chord distance across all sixteen chord pairs is even lower with a value of .288. This leads us to conclude that the two halves of most 32-bar phrases are indeed close variants of each other and in most cases (64.8%, see Table 4) have the same root.

Furthermore, this corroborates the observation made with respect to Figure 10 that there is a tight relationship between the harmonic content of bars and their hypermetrical position, in other words between harmony and form. High redundancy values for bars at the end of subphrases (bars 8, 16, 24, 32) entail that the harmonic content of these bars is less uncertain than in the other bars and hence easier to predict for musicians and listeners (Huron, 2006; Rohrmeier, 2013). In addition to the regularity in the modulation patterns within Choro pieces (Figure 8), there are strong regularities regarding the harmony in the 16- and 32-bar phrases which make up a large part of all pieces (Figure 7). Taking these results together, one can conclude that a mental hierarchical representation of the form and the hypermetrical structure facilitates the prediction of the harmony in a particular bar in Choro and other genres (including Jazz and Classical music). This might be particularly relevant because improvisation is part of the performance and musicians often do not rely on scores but play by ear, also relying on their implicit knowledge about stylistic regularities. The tree representation in Figure 4 can, in this view, be understood as an approximation of the cognitive representation of the formal arrangement of Choro pieces.

3.6. Short harmonic patterns

We have seen that 16- and 32-bar phrases are highly regular with respect to harmony. Which chords occur most often in Choro? Which harmonic patterns are prevalent? The relative frequency of chords and chord patterns in the dataset can answer questions about central chords and chord patterns (Moss et al., 2019). Recall that the considerable size of the chord vocabulary (685 unique chord types) is partly due to minor variations of the same chord, such as adding chord extensions or specifying a bass note and thus the chord inversion. It is common in music theory to subsume these similar chord instances under a more general category (Burgoyne et al., 2013). Here, we group chords together that have the same root and the same type, e.g. V7 and V7(b9) are grouped into a V category, and IIm7 and IIm7(b5) are categorised into IIm. Based on this categorisation, Figure 12 shows the 20 most frequent chords (unigrams, top row) and short harmonic patterns (bigrams, trigrams, and quadrigrams, rows two to four) for the major (left column) and the minor mode (right column). Note that the horizontal axes are on different scales.

First, one can observe that these short patterns constitute large proportions of the whole dataset. The 20 reduced chord categories account for almost the complete dataset, for 96% in major, and for 94% in minor. Increasing the size of the harmonic patterns to bigrams, trigrams and quadrigrams approximately corresponds to diminishing this proportion to half (53% in major and minor), a quarter (27% in major, 25% in minor), and an

eighth (14% in major, 12% in minor) of the data. While this is a strong decrease, it holds that a relatively small number of patterns accounts for large portions of the data, a finding consistent with other studies on musical datasets (Arthur, 2017; Mauch et al., 2008; Rohrmeier & Cross, 2008; White, 2013; Zanette, 2006). More concretely, by far the most common chords in both major and minor are the ones on scale degrees I and V, although the dominant is somewhat more common than the tonic in the minor mode. The most common bigrams in the two modes are V-I and V-Im, respectively. This is interesting and not trivial. In a linguistic text, the two most common terms could be ‘the’ and ‘a’ but one would hardly find any bigrams ‘the a’ or ‘a the’. Hence, the fact that common unigrams also form parts of the most common n-grams with larger n is indicative of the central role of these harmonies and corresponding strong regularities in the underlying harmonic language.

The top ranks among the bigrams are occupied by progressions of fifths, e.g. V-I, IIm-V, I-V, and II-V in major, and V-Im, V-I, Im-V, II-V, and I-IVm in minor. But note also that repetitions (e.g. I-I and Im-Im) occur frequently. These are due to the reduced chord representation and correspond, for example, to resolutions of chord extensions or changes in the bass note (chord inversion). The most common tri- and quadrigrams clearly show the prevalence of cadential patterns, such as IIm-V-I and VI-IIm-V-I in major, and IVm-V-I and Im-II-V-Im in minor. The other class of frequent harmonic patterns could be subsumed under dominant-tonic alternations, e.g. I-V-I and V-I-V-I in major, and Im-V-Im and V-Im-V-Im in minor. These interpretations support the characterisation that the core of harmonic patterns found in Choro consists predominantly of authentic progressions. In this regard, Choro seems to be more similar to Jazz, which also largely consists of falling-fifths patterns (Broze & Shanahan, 2013; Rohrmeier, 2020), than to Rock, where ascending fifths are more prominent (de Clercq & Temperley, 2011; Temperley & de Clercq, 2013). It is an interesting observation that local harmonic progressions in the present corpus follow different patterns than the larger harmonic key patterns (see ‘Key relations within pieces’). On a small scale, harmonic progressions in Choro are largely determined by dominant-tonic progressions while, on a larger scale, subdominant, relative, and parallel relationships prevail. This implies that harmony in Choro operates differently on the global (key relations of sections) and the local (chord transitions in and between phrases) level.

3.7. Chord types and tokens

In order to study whether chords on certain scale degrees appear predominantly as a certain type (e.g. as triads, as seventh chords, or with alterations or bass notes), we tallied the distribution of all chords on a fixed scale degree and grouped them by mode (major and minor).

Inspecting all scale degrees in major (see the top left panel in Figure 12), we find that scale degrees I, IIm, IV indeed occur most commonly as triads. Scale degree III, on the other hand, occurs most frequently as III7. It is very likely that Almada analysed it as V7/VIm (e.g. E7 as the dominant of Am in C major), so that he did not include it in his list. IIIm is the second most common type of scale degree III. Analogously, scale degree VI appears most commonly as VI7, but could have been interpreted by Almada as V7/IIm. The second most frequent chord type of scale degree VI is VIm. In minor (cf. top right panel in Figure 12), the data agree with Almada in that scale degrees I and IVm are most frequently triadic. Scale degree III surprisingly appears most commonly as III7, which could also be read as V7/VI (e.g. Eb7 as the dominant to Ab in C minor). The major triad on the sixth scale degree ranks only fourth, with VI7, VIm, and VIm7 being more common. The supertonic scale degree II occurs mostly as II7, which can be analysed as V7/V, and second

Figure 11. Chord distances between the harmonic content of bars that are 16 bars apart in 32-bar phrases. Error bars represent standard
deviations from the mean.

Table 5. Number of types, number of tokens, type-token ratio (TTR), and efficiency for all diatonic scale degrees (SD) in major and minor, respectively.

Mode SD Types Tokens TTR Efficiency
Major I 65 7949 0.008 0.479
Major II 46 3957 0.012 0.551
Major III 43 1674 0.026 0.633
Major IV 51 2495 0.020 0.673
Major V 54 5674 0.010 0.398
Major VI 42 2709 0.016 0.549
Major VII 35 725 0.048 0.554
Minor I 68 4950 0.014 0.542
Minor II 47 1834 0.026 0.613
Minor III 50 850 0.059 0.717
Minor IV 52 1909 0.027 0.650
Minor V 57 3373 0.017 0.413
Minor VI 60 1207 0.050 0.624
Minor VII 37 576 0.064 0.681

Figure 12. Top 20 unigrams, bigrams, trigrams, and quadrigrams in major (left column) and minor (right column).

most commonly as IIm7(b5) (more than 1000 occurrences). Therefore, one can hardly agree that II can only be used ‘with difficulty’ (Almada, 2006). Scale degree VII, on the other hand, occurs only 576 times in minor, which might justify such a statement. In general, these calculations confirm Almada’s assumption for these scale degrees but also give a much more detailed picture. We compared the distribution of types for all scale degrees by measuring their normalised entropy, or efficiency (see ‘Operationalizations’). An efficiency value of 0 means that there is only one choice for a scale degree (which is the case for the NC symbol, for instance), whereas a maximal efficiency value of 1 means that all types are equally likely. Table 5 shows the number of types, number of tokens, type-token ratio (TTR), and efficiency values for all diatonic scale degrees I, II, III, IV, V, VI, and VII in major and minor, respectively. Chromatically altered scale degrees (such as VI, II) are omitted for reasons of space. The full table is given in Table A2 in the Appendix.

Figure 13 displays the type-token ratio (blue for major, orange for minor) and efficiency (black dots) for each scale degree. One can see that the diatonic scale degrees (I to VII in both major and minor) occupy the leftmost ranks and thus have both the lowest efficiency and the lowest type-token ratio. In fact, both measures are positively correlated (Pearson correlation of r = .71). Scale degrees IV and III, both in major, are also among the lowest-entropy scale degrees and hence show a complexity comparable to the unaltered scale degrees in terms of their efficiency.

3.8. Diachronic usage of chord alterations

Not all chords in Choro are plain triads or seventh chords; some carry modifications. A chord modification is defined as either changing notes within a chord (5, 5), adding notes to a chord (9, 11), or removing notes from a chord (omit 3). Approximately 10.7% of all chords in the overall corpus have one or more chord modifications, not counting minor or major sevenths. The ranked relative frequencies of chord modifications in the overall corpus

Figure 13. Type-token ratio for all scale degrees in major (blue) and minor (orange); black dots represent the efficiency of the type
distribution for a given scale degree. Type-token ratio and efficiency are positively correlated (r = .71).
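Both quantities shown for each scale degree can be computed from a plain list of chord tokens. The sketch below assumes that the efficiency is the Shannon entropy of the type distribution divided by the logarithm of the number of types, which matches the description that 0 means a single choice and 1 means that all types are equally likely. The function names and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct chord types divided by the number of chord tokens."""
    return len(set(tokens)) / len(tokens)


def type_efficiency(tokens):
    """Entropy of the type distribution, normalised by log(number of types)."""
    counts = Counter(tokens)
    if len(counts) < 2:
        return 0.0  # only one type: no choice, as for the NC symbol
    n = len(tokens)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(len(counts))


# Hypothetical chord tokens found on scale degree V.
tokens = ["V7"] * 6 + ["V"] * 3 + ["V7(b9)"]
print(type_token_ratio(tokens), type_efficiency(tokens))

# The TTR column of Table 5 follows directly from its type and token counts,
# e.g. 65 types and 7949 tokens for scale degree I in major.
print(round(65 / 7949, 3))
```

A scale degree with many tokens and comparatively few types, such as I or V, therefore ends up with a low type-token ratio regardless of how evenly its types are used, which is what the efficiency captures separately.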

Figure 14. Occurrence of chord extensions over time shown as black dots. The bars on the left show the absolute frequency of occurrence
on a logarithmic scale.

is shown in the bar plot on the left of Figure 14 on a logarithmic scale. Chord modifications that occur conjointly have been counted separately, e.g. a V(b9)(b13) chord symbol would increase the counts for both b9 and b13. We left the original spelling of the modifications intact and did not reinterpret enharmonic equivalences, such as #4 and b5, 6 and 13, or 4 and 11. The distributions in the larger right plot show their employment over time as black dots. The continuous distributions represent kernel density estimates for the distribution of the respective chord modification. For visualisation purposes, the area of the estimates has been transformed to be proportional to the modifications’ frequency but on a linear scale. The vertical position of dots within one chord modification has a purely visual function. Darker appearing dots are the result of overlapping points.

It appears that most modifications occur more often between the 1940s and 1990s, indicating an increase of the usage of modifications and more complex chords in more recent compositions, which is in line with our expectations regarding the diachronic change of chord extension usage. However, this could be due to the fact that the Choro Songbook contains more pieces from certain decades and that some pieces are much longer than others (see Figure 2). To remove this potential bias, we calculated the proportion of chords with modifications for each piece and grouped them by decade. The result is shown in Figure 15. The error bars represent the standard error of the mean and are only weakly and not significantly linked to the absolute number of chords in a decade (Pearson r = .14). Observing the change in the usage of chord extensions over time by decade shows a clear historical pattern of almost monotonic increase. Moreover, the variability increases as well, as indicated by the larger error bars in the second half of the 20th century in Figure 15. While chord extensions occur only relatively rarely until ca. 1950 with an almost constant average frequency around 5%, chord

Figure 15. Proportion of chords with modifications (extensions, suspensions) per decade. Error bars show the standard error of the
mean.
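The per-decade values behind such a plot can be sketched as follows. The snippet assumes that each piece is described by its year of composition, its number of chords with modifications and its total number of chords; this tuple layout, the function name and the toy data are hypothetical. Averaging per-piece proportions rather than pooling all chords means that long pieces do not dominate their decade.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def modification_share_by_decade(pieces):
    """Map each decade to (mean per-piece share of modified chords, standard error)."""
    by_decade = defaultdict(list)
    for year, n_modified, n_chords in pieces:
        by_decade[year // 10 * 10].append(n_modified / n_chords)
    result = {}
    for decade, shares in sorted(by_decade.items()):
        # The standard error of the mean needs at least two pieces per decade.
        sem = stdev(shares) / math.sqrt(len(shares)) if len(shares) > 1 else float("nan")
        result[decade] = (mean(shares), sem)
    return result


# Hypothetical pieces: (year, chords with modifications, all chords).
pieces = [(1918, 2, 100), (1915, 4, 50), (1987, 30, 100), (1983, 20, 50)]
for decade, (share, sem) in modification_share_by_decade(pieces).items():
    print(decade, round(share, 3), round(sem, 3))
```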

extensions occur much more frequently in the following decades with a steeper rise between 1950 and 1980. The most recent Choro compositions from the last two decades of the 20th century finally exhibit a leap of chord extension usage to about 30%, strongly supporting our expectation that external influences from genres like Jazz led to a heightened employment of more dissonant chords.

4. Conclusions

This study was conducted in order to provide a quantitative style analysis of Choro, a Brazilian instrumental music genre largely neglected by empirical music research so far. Based on transcriptions of the Choro Songbook (Chediak, 2009, 2011a, 2011b), a set of representative Choro pieces, we operationalised and evaluated the qualitative descriptions of the genre in A estrutura do Choro (Almada, 2006).

Summarizing our results, this corpus study describes harmony and form and their mutual relationship in Choro. We provide the empirical distributions of global keys and metres and found that the majority of Choros follow a relatively small number of formal templates that can be identified as three-part (A-B-A-C-A), two-part (A-B-A), and non-modulating (A) forms. Parts in Choro pieces commonly modulate to closely related keys, such as the subdominant, parallel, and relative, but notably largely avoid modulations to the dominant. This is contrasted by the fact that dominant-tonic progressions constitute overwhelmingly large portions of local harmonic progressions between chords. The chord alphabet itself is very rich and can feature dozens of chord types for a scale degree that go well beyond the basic classes of triads and seventh chords by employing a variety of chord modifications. The usage of these chord modifications increases over time, indicating an intra-genre development towards a larger and more complex chord alphabet. Finally, looking at the chord distributions of bars in larger phrases (16 or 32 bars) shows a tight relationship between hypermetrical position and harmonic variability. The harmonic content of bars at the end of subphrases with respect to a binary hypermeter is also more predictable.

While the results of the present study are limited and only allow conclusions to be drawn about the structural features of harmony and form in this genre, they contribute to a better understanding of Choro on a quantitative basis, and augment the music-theoretical treatment of these issues with exact computational methods. They show that the empirical data are largely in accordance with Almada’s qualitative descriptions, which supports the representative status of the Choro Songbook Corpus for this genre. Researchers as well as Choro composers and performers are invited to incorporate our results into their own work and elaborate on our findings. Notably, the hierarchical representation of the transcriptions of the Choro Songbook Corpus provides an elegant encoding of the formal arrangement of the Choro pieces. Since computational research on musical form still suffers from the lack of reliable corpora, one can expect this dataset to ameliorate this situation. Another particularly promising research avenue is the expansion of the corpus with complementary sources, such as the scores from the Choro Archive, which would also enable a much more detailed description of the historical process of genre formation, and allow for thorough comparisons of the subtle differences between sub-genres. Yet another direction might be to combine the transcriptions with audio data from recordings, either contemporary or
historical, and thus augment the understanding of Choro by incorporating data from musical performance.

It is to date difficult to exactly test whether the empirically determined characteristics of Choro are particular to this genre or reflect more general traits of, say, Latin-American music because appropriate data for comparison is virtually non-existent. The most likely candidate for genre comparison in future research is Jazz (e.g. with Eremenko et al., 2018; Pfleiderer et al., 2017; Shanahan & Broze, 2012), which will certainly shed more light on the relation between harmony and form in Popular music genres, and which could also be used to study more rigorously the question whether the degree of improvisation in a musical genre is somewhat inversely related to the variability in its forms, as we speculated here. The quantitative methodology of a corpus study applied to this body of music does not only illuminate important aspects of this particular genre. It can also be applied to the empirical characterisation of harmony and form in other musical styles and might promote similar research projects, especially in the analysis of lesser researched genres.

Acknowledgements

The authors would like to thank Irmãos Vitale Editores Ltda., the publisher of the Choro Songbook, and its initiator Almir Chediak (in memoriam) and editors Mario Sève, Rogério Souza and Dininho for licencing the rights to publish the Choro Songbook Corpus. We also would like to thank the members of the Digital and Cognitive Musicology Lab (DCML) at École Polytechnique Fédérale de Lausanne (EPFL) as well as the editor and two anonymous reviewers for valuable feedback.

Data availability statement

The data that support the findings of this study are openly available for non-commercial use on the Zenodo repository https://doi.org/10.5281/zenodo.3881347.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

Funding for this collaborative project was provided by several sources: Zukunftskonzept at Technische Universität Dresden (ZUK 64), funded by the Exzellenzinitiative of the Deutsche Forschungsgemeinschaft (DFG); École Polytechnique Fédérale de Lausanne (EPFL); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) of the Brazilian Department of Education; Erasmus Mundus (Euro Brazilian Window+) [EB16DM1897]. The authors also thank Claude Latour for supporting this research through the Latour Chair in Digital Musicology at EPFL.

ORCID

Fabian C. Moss http://orcid.org/0000-0001-9377-2066
Willian Fernandes Souza http://orcid.org/0000-0002-2798-8106

References

Albrecht, J., & Shanahan, D. (2012). The use of large corpora to train a new type of key-finding algorithm: An improved treatment of the minor mode. Music Perception: An Interdisciplinary Journal, 31(1), 59–67. https://doi.org/10.1525/mp.2013.31.1.59
Almada, C. (2006). A estrutura do choro. Da Fonseca Comunicação.
Almada, C. (2012). O choro como modelo arquetípico da Teoria Gerativa da Música Tonal. Revista Brasileira de Música, 25(1), 61–78. https://doi.org/10.47146/rbm.v25i1
Aranha, C. (2012). Chorinho brasileiro: como tudo começou (Bilingual ed.). Editora DBA.
Arthur, C. (2017). Taking harmony into account: The effect of harmony of melodic probability. Music Perception: An Interdisciplinary Journal, 34(4), 405–423. https://doi.org/10.1525/mp.2017.34.4.405
Basili, R., Serafini, A., & Stellato, A. (2004, October 10–14). Classification of musical genre: A machine learning approach. Proceedings of the international conference on Music Information Retrieval (ISMIR), Barcelona, Spain.
Brackett, D. (2016). Categorizing sound: Genre and twentieth-century popular music. University of California Press.
Brinkman, A., & Huron, D. (2018). The leading sixth scale degree: A test of Day-O’Connell’s theory. Journal of New Music Research, 47(2), 166–175. https://doi.org/10.1080/09298215.2017.1407345
Broze, Y., & Shanahan, D. (2013). Diachronic changes in Jazz harmony: A cognitive perspective. Music Perception: An Interdisciplinary Journal, 31(1), 32–45. https://doi.org/10.1525/mp.2013.31.1.32
Burgoyne, J. A., Wild, J., & Fujinaga, I. (2013). Compositional data analysis of harmonic structures in popular music. In J. Yust, J. Wild, & J. A. Burgoyne (Eds.), Mathematics and computation in music (Lecture Notes in Computer Science, Vol. 7937, pp. 52–63). Springer.
Caplin, W. E. (1998). Classical form: A theory of formal functions for the instrumental music of Haydn, Mozart and Beethoven. Oxford University Press.
Cazes, H. L. (1999). Choro: do Quintal ao Municipal. Editora 34.
Chediak, A. (2009). Choro songbook (Vol. 1). Lumiar Editora.
Chediak, A. (2011a). Choro songbook (Vol. 2). Lumiar Editora.
Chediak, A. (2011b). Choro songbook (Vol. 3). Lumiar Editora.
da Silva, M. T. B., & Filho, A. L. O. (1998). Pixinguinha. Filho de Ogum Bexiguento. Gryphus.
de Clercq, T., & Temperley, D. (2011). A corpus analysis of Rock harmony. Popular Music, 30(1), 47–70. https://doi.org/10.1017/S026114301000067X
de Souza, W. F. (2016a). Choro como categoria e suas metáforas. XXVI Congresso da Associação Nacional de Pesquisa e Pós-Graduação em Música. Belo Horizonte.
de Souza, W. F. (2016b). Distinções de gênero e estilo nas práticas de choro. Anais do IV SIMPOM.
Eerola, T., Louhivuori, J., & Lebaka, E. (2009). Expectancy in Sami Yoiks revisited: The role of data-driven and schema-driven knowledge in the formation of melodic expectations.
Musicae Scientiae, 13(2), 231–272. https://doi.org/10.1177/102986490901300203
Eremenko, V., Demirel, E., Bozkurt, B., & Serra, X. (2018). Audio-aligned Jazz harmony dataset for automatic chord transcription and corpus-based research. International Society for Music Information Retrieval Conference, Paris, France.
Fabbri, F. (2014). Music taxonomies: An overview. ‘Musique Savante/Musiques Actuelles: Articulations. JAM 2014: Journées d’analyse musicale 2014 de la Sfam (Société Française d’Analyse Musicale)’.
Ferrão, G., & Navia, G. (2016). Período ou sentença? Hibridismo temático no choro. II Encontro da Associação Brasileira de Teoria e Análise Musical (p. 42).
Gauvin, H. L. (2015). “The times they were a-changin’”: A database-driven approach to the evolution of harmonic syntax in popular music from the 1960s. Empirical Musicology Review, 10(3), 215–238. https://doi.org/10.18061/emr.v10i3
Hal Leonard Publishing Corporation. (2004). The real book. Hal Leonard.
Harte, C. (2010). Towards automatic extraction of harmony information from music signals [PhD thesis]. Queen Mary University of London.
Hedges, T., & Rohrmeier, M. (2011). Exploring Rameau and beyond: A corpus study of root progression theories. In C. Agon, E. Amiot, M. Andreatta, G. Assayag, J. Bresson, & J. Mandereau (Eds.), Mathematics and computation in music (Lecture Notes in Artificial Intelligence, Vol. 6726, pp. 334–337). Springer.
Huron, D. (1996). The melodic arch in Western folksongs. Computing in Musicology, 10, 3–23.
Huron, D. (2006). Sweet anticipation. Music and the psychology of expectation. MIT Press.
Jacoby, N., Tishby, N., & Tymoczko, D. (2015). An information theoretic approach to chord categorization and functional harmony. Journal of New Music Research, 44(3), 219–244. https://doi.org/10.1080/09298215.2015.1036888
Lerdahl, F., & Jackendoff, R. S. (1983). A generative theory of tonal music. MIT Press.
Livingston, T. E. (1999). Choro and Music Revivalism in Rio De Janeiro, 1973–1995 [PhD thesis]. University of Illinois at Urbana-Champaign.
MacKay, D. (2003). Information theory, inference and learning algorithms. Cambridge University Press.
Mair, M. (2000). A history of Chôro in context. Mandolin Quarterly, 5(1), 13–20. https://www.marilynnmair.com/articles/choro/2000/history-of-choro/
Manning, C., & Schütze, H. (2003). Foundations of statistical natural language processing (6th ed.). MIT Press.
Manzara, L. C., Witten, I. H., & James, M. (1992). On the entropy of music: An experiment with Bach chorale melodies. Leonardo Music Journal, 2(1), 81–88. https://doi.org/10.2307/1513213
Margulis, E. H., & Beatty, A. P. (2008). Musical style, psychoaesthetics, and prospects for entropy as an analytic tool. Computer Music Journal, 32(4), 64–78. https://doi.org/10.1162/comj.2008.32.4.64
Mauch, M., Müllensiefen, D., Dixon, S., & Wiggins, G. A. (2008). Can statistical language models be used for the analysis of harmonic progressions. Proceedings of the 10th international conference on Music Perception and Cognition.
McKay, C., & Fujinaga, I. (2006). Musical genre classification: Is it worth pursuing and how can it be improved? Proceedings of the international conference on Music Information Retrieval (ISMIR) (pp. 101–106).
Mesquita, M. (2017). Would Brazilian Choro be a rondo form? Some historical-analytical considerations. Proceedings of the 4th international meeting of Music Theory and Analysis (pp. 213–224), Universidade de São Paulo.
Meyer, L. B. (1957). Meaning in music and information theory. The Journal of Aesthetics and Art Criticism, 15(4), 412–424. https://doi.org/10.2307/427154
Meyer, L. B. (1989). Style and music. Theory, history, and ideology. University of Chicago Press.
Milička, J. (2012, April 26–29). Rank-frequency relation & type-token relation: Two sides of the same coin. Methods and applications of quantitative linguistics selected papers of the 8th international conference on Quantitative Linguistics (QUALICO) in Belgrade, Serbia (pp. 163–171).
Moss, F. C., Neuwirth, M., Harasim, D., & Rohrmeier, M. (2019). Statistical characteristics of tonal harmony: A corpus study on Beethoven’s string quartets. PLoS One, 14(6), e0217242. https://doi.org/10.1371/journal.pone.0217242
Neuwirth, M., Harasim, D., Moss, F. C., & Rohrmeier, M. (2018). The Annotated Beethoven Corpus (ABC): A dataset of harmonic analyses of all Beethoven string quartets. Frontiers in Digital Humanities, 5, 1–5. https://doi.org/10.3389/fdigh.2018.00016
Neuwirth, M., & Rohrmeier, M. (2016). Wie wissenschaftlich muss Musiktheorie sein? Chancen und Herausforderungen musikalischer Korpusforschung. Zeitschrift der Gesellschaft für Musiktheorie, 13(2), 171–193. https://doi.org/10.31751/915
Oramas, S., Espinosa-Anke, L., Gómez, F., & Serra, X. (2018). Natural language processing for music knowledge discovery. Journal of New Music Research. https://doi.org/10.1080/09298215.2018.1488878
Panteli, M., Benetos, E., & Dixon, S. (2018). A review of manual and computational approaches for the study of world music corpora. Journal of New Music Research. https://doi.org/10.1080/09298215.2017.1418896
Pearce, M. T. (2018). Statistical learning and probabilistic prediction in music cognition: mechanisms of stylistic enculturation. Annals of the New York Academy of Sciences, 1423(1), 378–395. https://doi.org/10.1111/nyas.2018.1423.issue-1
Pearce, M. T., & Wiggins, G. A. (2004). Improved methods for statistical modelling of monophonic music. Journal of New Music Research, 33(4), 367–385. https://doi.org/10.1080/0929821052000343840
Pfleiderer, M., Frieler, K., Abeßer, J., Zaddach, W. G., & Burkhart, B. (Eds.). (2017). Inside the Jazzomat: New Perspectives for Jazz Research. Schott.
Piedade, A. T. C. (2003). Brazilian Jazz and friction of musicalities. In E. T. Atkins (Ed.), Jazz planet (pp. 41–58). University Press of Mississippi.
Ramos, P. E. Z. M. (2016). Léxico harmônico em choros de Pixinguinha. Anais do IV SIMPOM 2016, Rio de Janeiro.
Rohrmeier, M. (2013). Musical expectancy – bridging music theory, cognitive and computational approaches. Zeitschrift der Gesellschaft für Musiktheorie, 10(2), 343–371. https://doi.org/10.31751/724
Rohrmeier, M. (2020). The syntax of Jazz harmony: Diatonic tonality, phrase structure, and form. Music Theory & Analysis, 7(1), 1–63. https://doi.org/10.11116/MTA.7.1.1
Rohrmeier, M., & Cross, I. (2008). Statistical properties of tonal harmony in Bach’s chorales. In K. Miyazaki, M. Adachi, Y. Hiraga, Y. Nakajima, & M. Tsuzaki (Eds.), Proceedings of the 10th international conference on Music Perception and Cognition, Hokkaido University, Sapporo, Japan (Vol. 6, pp. 619–627).
Rohrmeier, M., & Rebuschat, P. (2012). Implicit learning and acquisition of music. Topics in Cognitive Science, 4(4), 525–553. https://doi.org/10.1111/tops.2012.4.issue-4
Sandroni, C. (2001). Feitiço decente: Transformações do samba no Rio de Janeiro (1917–1933). Editora UFRJ.
Savage, P. E. (forthcoming). The need for global studies. In D. Shanahan, J. A. Burgoyne, & I. Quinn (Eds.), Oxford handbook of music corpus studies. Oxford University Press.
Schoenberg, A. (1969). Structural functions of harmony. Faber and Faber.
Sears, D. R., Pearce, M. T., Caplin, W. E., & McAdams, S. (2017). Simulating melodic and harmonic expectations for tonal cadences using probabilistic models. Journal of New Music Research, 47(1), 1–24. https://doi.org/10.1080/09298215.2017.1367010
Sève, M. (2015). Fraseado no choro: uma análise de estilo por padrões de recorrência [Master thesis]. Universidade de Rio de Janeiro.
Shanahan, D., & Broze, Y. (2012). A diachronic analysis of harmonic schemata in Jazz. Proceedings of the 12th international conference on Music Perception and Cognition and the 8th Triennial Conference of the European Society for the Cognitive Sciences of Music (pp. 909–917).
Taborda, M. E. (2010). As abordagens estilísticas no choro brasileiro (1902–1950). Historia Actual Online, 8(23), 137–146.
Temperley, D. (2000). The line of fifths. Music Analysis, 19(3), 289–319. https://doi.org/10.1111/musa.2000.19.issue-3
Temperley, D., & de Clercq, T. (2013). Statistical analysis of harmony and melody in Rock music. Journal of New Music Research, 42(3), 187–204. https://doi.org/10.1080/09298215.2013.788039
Temperley, D., & VanHandel, L. (2013). Introduction to the special issues on corpus methods. Music Perception: An Interdisciplinary Journal, 31(1), 1–3. https://doi.org/10.1525/mp.2013.31.1.1
Tillmann, B. (2005). Implicit investigations of tonal knowledge in nonmusician listeners. Annals of the New York Academy of Sciences, 1060, 100–110. https://doi.org/10.1196/annals.1360.007
Tymoczko, D. (2003). Function theories: A statistical approach. Musurgia, 10, 35–64.
Valente, P. V. (2014). Transformações do choro no século XXI: estruturas, performance e improvisação [PhD thesis]. Universidade de São Paulo.
Von Hippel, P., & Huron, D. (2000). Why do skips precede reversals? The effect of tessitura on melodic structure. Music Perception: An Interdisciplinary Journal, 18(1), 59–85. https://doi.org/10.2307/40285901
Weiß, C., Mauch, M., & Dixon, S. (2018). Investigating style evolution of Western classical music: A computational approach. Musicae Scientiae. https://doi.org/10.1177/1029864918757595
White, C. W. (2013). Some statistical properties of tonality, 1650–1900 [PhD thesis]. Yale University.
White, C. W., & Quinn, I. (2016). The Yale-Classical Archives corpus. Empirical Musicology Review, 11(1), 50–58. https://doi.org/10.18061/emr.v11i1
Wundervald, B. D., & Zeviani, W. M. (2019). Machine learning and chord based feature engineering for genre prediction in popular Brazilian music. arXiv e-prints, arXiv:1902.03283.
Youngblood, J. E. (1958). Style as information. Journal of Music Theory, 2(1), 24–35. https://doi.org/10.2307/842928
Zanette, D. H. (2006). Zipf’s law and the creation of musical context. Musicae Scientiae, 10(1), 3–18. https://doi.org/10.1177/102986490601000101
Appendix. Tables

Table A1. Top 5 empirically most likely 32-bar phrases.


Major 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 I IIm-IIm7 V7 I III7 VIm-VIm7 II7 V7 I IIm III7 VIm IVm/3b I/5-VI7 II7-V7 I
2 V7 I V7 I V7 I II7 I V7 I V7 I V7 I V7 I
3 I-VI7 IIm V7 I III7 VIm II7 V7 I-VI7 IIm V7 I bVI7 I-VIm II7-V7 I
4 I IIm-IIm7 V7 I III7 VIm II7 V7 I IIm-IIm7 V7 I I7 IV/3-IVm6/3b I/5-V7 I-V7
5 I-V7 I VI7 IIm III7 VIm II7 V7 I-V7 I I7 IV IV-#IVo I/5-VI7 IIm-IVm6/3b-V7 I
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
1 I IIm-IIm7 V7 I III7 VIm-VIm7 II7 V7 I IIm III7 VIm IVm/3b I/5-VI7 II7-V7 I
2 V7 I V7 I V7 I II7 I V7 I V7 I V7 I V7 I-I7/7b
3 I-VI7 IIm V7 I III7 VIm II7 V7 I-VI7 IIm V7 I bVI7 I-VIm II7-V7 I
4 I IIm-IIm7 V7 I III7 VIm II7 V7 I IIm-IIm7 V7 I I7 IV/3-IVm6/3b I/5-V7 I-V7
5 I-V7 I VI7 IIm III7 VIm II7 V7 I-V7 I I7 IV IV-#IVo I/5-VI7 IIm-IVm6/3b-V7 I
Minor 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 Im IVm V7 Im Im-#VIm7(b5) Vm II7 V7 Im IVm V7 Im IVm6 Im II7-V7 Im
2 Im IVm V7 Im Im-#VIm7(b5) Vm II7 V7 Im IVm V7 Im-I7 IVm Im V7 Im-V7
3 Im-V7 Im I7 IVm IVm-IIm7(b5) Im II7 V7 Im-V7 Im I7 IVm IVm-IIm7(b5) Im II7-V7 Im
4 Im IVm V7 Im-V7 Im-#VIm(b5) Vm II7 V7 Im IVm V7 VIIm6/3b-I7 IVm-IIm(b5) Im II7-V7 Im-V7
5 Im II7 V7 Im-V7 Im-#VIm7(b5) Vm II7 V7 Im II7 V7 I7 IVm-IIm7(b5) Im-Im/7 IVm6/3-V7 Im
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

1 Im IVm V7 Im Im-#VIm7(b5) Vm II7 V7 Im IVm V7 Im IVm6 Im II7-V7 Im
2 Im IVm V7 Im Im-#VIm7(b5) Vm II7 V7 Im IVm V7 Im-I7 IVm Im V7 Im
3 Im-V7 Im I7 IVm IVm-IIm7(b5) Im II7 V7 Im-V7 Im I7 IVm IVm-IIm7(b5) Im II7-V7 Im
4 Im IVm V7 Im-V7 Im-#VIm(b5) Vm II7 V7 Im IVm V7 VIIm6/3b-I7 IVm-IIm(b5) Im II7-V7 Im-VII7
5 Im II7 V7 Im-V7 Im-#VIm7(b5) Vm II7 V7 Im II7 V7 I7 IVm-IIm7(b5) Im-Im/7 IVm6/3-V7 Im-VII7
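For readers who want to process the rows of Table A1 programmatically, the following is a minimal sketch. It assumes that each whitespace-separated cell corresponds to one bar, that a hyphen joins successive chords within the same bar, and that a slash introduces a bass note; these conventions are inferred from the layout of the table and are not confirmed here. The function name `parse_phrase_row` is hypothetical.

```python
def parse_phrase_row(row):
    """Split one row of Table A1 into (rank, bars).

    Each bar is returned as a list of chord symbols. The hyphen-as-separator
    reading is an assumption about the table notation.
    """
    rank, *cells = row.split()
    return int(rank), [cell.split("-") for cell in cells]


# First major-mode row (bars 1 to 16), copied from Table A1.
row = "1 I IIm-IIm7 V7 I III7 VIm-VIm7 II7 V7 I IIm III7 VIm IVm/3b I/5-VI7 II7-V7 I"
rank, bars = parse_phrase_row(row)
```

Bars holding more than one chord can then be found by checking `len(bar) > 1`.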


Table A2. Number of types and tokens, type-token ratio (TTR), and efficiency for all chord types in the dataset, sorted by TTR, as in Figure 13.
Mode SD Types Tokens TTR Efficiency
Major NC 1 141 0.007 –
Major I 65 7949 0.008 0.479
Major V 54 5674 0.010 0.398
Minor NC 1 87 0.011 –
Major II 46 3957 0.012 0.551
Minor I 68 4950 0.014 0.542
Major VI 42 2709 0.016 0.549
Minor V 57 3373 0.017 0.413
Major IV 51 2495 0.020 0.673
Minor II 47 1834 0.026 0.613
Major III 43 1674 0.026 0.633
Minor IV 52 1909 0.027 0.65
Major #IV 16 547 0.029 0.39
Major bIII 20 558 0.036 0.556
Major VII 35 725 0.048 0.554
Minor VI 60 1207 0.050 0.624
Minor III 50 850 0.059 0.717
Major bVII 24 399 0.060 0.658
Minor VII 37 576 0.064 0.681
Major bII 21 304 0.069 0.76
Major bVI 23 297 0.077 0.706
Major #I 9 115 0.078 0.608
Minor #VI 21 254 0.083 0.692
Major #II 7 84 0.083 0.336
Minor #II 2 23 0.087 0.426
Minor bII 28 273 0.103 0.77
Minor #IV 17 156 0.109 0.534
Major #VI 12 102 0.118 0.786
Major bV 11 84 0.131 0.742
Minor #I 12 78 0.154 0.753
Minor bVI 12 77 0.156 0.806
Major bI 5 32 0.156 0.455
Minor bIII 15 94 0.160 0.605
Minor bV 14 82 0.171 0.833
Major #V 7 38 0.184 0.526
Minor bVII 16 84 0.190 0.801
Minor bI 12 55 0.218 0.904
Minor #III 20 86 0.233 0.877
Major #III 7 29 0.241 0.816
Minor #V 7 27 0.259 0.866
Minor #VII 13 49 0.265 0.921
Major #VII 2 7 0.286 0.863
Major bIV 4 9 0.444 0.918
Minor bIV 6 13 0.462 0.933
Minor ##VI 1 1 1.000 –
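The columns of Table A2 can be reproduced from a list of chord tokens along the following lines. The TTR is the number of distinct chord types divided by the number of tokens, which matches the tabulated values (for example, 65 / 7949 ≈ 0.008 for major-mode I). For efficiency, this sketch assumes the common definition of Shannon entropy divided by its maximum, log2 of the number of types. That definition is an assumption, although it is consistent with the dash shown for rows with a single type, where the maximum entropy is zero. The function name is hypothetical.

```python
import math
from collections import Counter


def ttr_and_efficiency(chords):
    """Return (type-token ratio, efficiency) for a list of chord symbols.

    Efficiency is taken here to be normalised Shannon entropy,
    H / log2(number of types); this definition is an assumption.
    """
    counts = Counter(chords)
    n_types = len(counts)
    n_tokens = sum(counts.values())
    ttr = n_types / n_tokens

    if n_types < 2:
        # log2(1) == 0, so the normalisation is undefined (shown as a dash in the table).
        return ttr, None

    entropy = -sum((c / n_tokens) * math.log2(c / n_tokens) for c in counts.values())
    return ttr, entropy / math.log2(n_types)
```

Calling `ttr_and_efficiency(["NC"] * 141)` gives a TTR of about 0.007 and no efficiency value, in line with the first row of the table.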
