Cruz - A Study of Grammatics in Music Composition
… frequent in practice, is the use of only positive samples. The main problem lies in finding a more "abstract" or general grammar G' such that R+ ⊂ L(G'), and such that L(G') - R+ (the extra-language) contains only strings with features "similar" to those of the strings in R+. A grammar G is said to be identified in the limit (Gold 67) if, for sufficiently large sample sets, L(G') = L(G). Grammars and formal languages can also be seen from a probabilistic viewpoint, in which different rules have different probabilities of being used in the derivation of strings. The corresponding extension of GI is called Stochastic Grammatical Inference, and this is how all the algorithms used in our study work.

The Error-Correcting Grammatical Inference (ECGI) Algorithm

ECGI is a GI heuristic that was explicitly designed to capture relevant regularities of concatenation and length exhibited by substructures of unidimensional patterns. It was proposed in (Rulot and Vidal 87) and relies on error-correcting parsing to build a stochastic regular grammar through a single incremental pass over a positive training set. Let R+ be this training sequence of strings. Initially, a trivial grammar is built from the first string of R+. Then, for every new string that cannot be parsed with the current grammar, a standard error-correcting scheme is adopted to determine the string in the language of the current grammar that is error-correcting closest to the input string. This is achieved through a Viterbi-like error-correcting parsing procedure (Forney 73), which also yields the corresponding optimal sequence of both non-error and error rules used in the parse. From these results, the current grammar is updated by adding rules (and non-terminals) that permit the new string, along with other adequate generalizations, to be accepted. Similarly, the parsing results are also used to update the frequency counts from which the probabilities of both non-error and error rules are estimated.

The k-TSI Algorithm

This algorithm belongs to a family of techniques which explicitly attempt to learn which symbols or substrings tend to follow others in the target language. A k-TSSL (k-testable language in the strict sense) is defined by four sets (Σ, I, F, T); its strings are those that begin with substrings in I, end with substrings in F, and do not contain any substring in T. Stochastic k-TSSLs are obtained by associating probabilities to the rules of the grammars, and it has been demonstrated that they are equivalent to N-grams with N = k (Segarra 93). The inference of k-TSSLs was discussed in (García and Vidal 90), where the k-TSI algorithm was proposed. This algorithm consists of building the sets Σ, I, F, T by observing the corresponding events in the training strings. From these sets, a finite-state automaton that recognizes the associated language is straightforwardly built.

The ALERGIA Algorithm

ALERGIA infers stochastic regular grammars in the limit by means of a state-merging technique based on probabilistic criteria (Carrasco and Oncina 94). It builds the prefix tree acceptor (PTA) from the sample and, at every node, estimates the probabilities of the transitions associated with the node. The PTA is the canonical automaton for the training data R+ and recognizes only L(R+). Next, ALERGIA tries to merge pairs of nodes, following a well-defined order. Merging is performed if the tails of the states have the same probabilities within statistical uncertainties. The process ends when no further merging is possible. By merging states in the PTA, the size of the recognized language is increased, and so the training samples are generalized. In order to cope with common statistical fluctuations of experimental data, state equivalence is accepted within a confidence interval calculated from a number between 0 and 1, supplied by the user, called the accuracy (a). This parameter is responsible for the higher or lower generalization of the language recognized by the resulting automaton.

The Synthesis Algorithm

In this work, automata will be inferred for different musical styles (languages). "Composing" melodies in a specified style amounts to synthesizing strings (melodies) from the automaton that represents the desired style. To achieve this, we use an algorithm that performs random string generation from a stochastic automaton.
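The random string generation just described can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the automaton layout (a dict of states mapping to weighted transitions) and the toy two-state automaton are our own assumptions.

```python
import random

def synthesize(automaton, start, rng=None, max_len=100):
    """Probability-weighted random walk over a stochastic automaton.

    Each state maps to a list of (probability, symbol, next_state)
    triples; the special symbol None marks the option of stopping in a
    final state. Emitted symbols are collected as the "melody".
    """
    rng = rng or random.Random()
    state, melody = start, []
    for _ in range(max_len):  # guard against unbounded walks
        probs, choices = zip(*[(p, (sym, nxt))
                               for p, sym, nxt in automaton[state]])
        sym, state = rng.choices(choices, weights=probs)[0]
        if sym is None:  # chose the stop option in a final state
            break
        melody.append(sym)
    return melody

# Toy two-state automaton over an alphabet of coded notes (illustrative).
automaton = {
    "q0": [(0.6, "24c", "q0"), (0.4, "26c", "q1")],
    "q1": [(0.5, "22c", "q0"), (0.5, None, None)],  # q1 is final
}

print(" ".join(synthesize(automaton, "q0")))
```

Because transitions are chosen according to their estimated probabilities, frequent constructions in the training data are correspondingly frequent in the syntheses.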
[Fig. 1 and Fig. 2: tables of the length and "extra-length" coding symbols (e.g. "n" codes a sixty-fourth note).]

The Analysis Algorithm

… the pitch and two (one is optional) for the length. A space character is used between notes to separate them. An example of the implicit notation of a score is shown in Fig. 3.

22c 26c 24c 24c 24n 0c 24c 27c 24c 26c 27c 24c 22c 26c 22c 24c 26c 24n

Fig. 3. A Gregorian style score and the corresponding implicit notation code.
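The implicit notation illustrated in Fig. 3 can be split back into its fields with a few lines of code. This is a sketch under the assumptions stated in the text (a note is a pitch number followed by one length character and an optional extra-length character, with notes separated by spaces); the function name and the regular expression are ours, and the semantics of the length symbols (given by Fig. 1 and Fig. 2) are not modelled.

```python
import re

# One note token: pitch digits, one length character, optional extra-length.
NOTE = re.compile(r"^(\d+)(\D)(\D?)$")

def parse_melody(code):
    """Split an implicit-notation string into (pitch, length, extra) triples."""
    notes = []
    for token in code.split():
        m = NOTE.match(token)
        if not m:
            raise ValueError(f"not a valid note token: {token!r}")
        pitch, length, extra = m.groups()
        notes.append((int(pitch), length, extra or None))
    return notes

print(parse_melody("22c 26c 24n"))
# -> [(22, 'c', None), (26, 'c', None), (24, 'n', None)]
```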
In order to code the pitch, a number was assigned to each note of the adopted scale, beginning with the deepest sound, corresponding to number 1, and ascending progressively halftone by halftone to the highest sound, corresponding to number 43. The note length was coded using one or two characters after the note number. The second character is optional and we call it "extra-length". Thus, a note in implicit notation is coded as follows:

Note number + length + extra-length.

The way of coding the length field can be seen in Fig. 1. The extra-length field contains a symbol that modifies the length given by the previous field, normally increasing it. We only needed the symbols described in Fig. 2 to code the training samples, but there are more.

Experiments

For the present study, we chose 3 occidental musical styles from different epochs and took 100 sample melodies from each one. These samples were 10 to 40 seconds long. The first style was Gregorian, from the Middle Ages. As a second style, we used passages from arias and instrumental solos from the sacred music of J. S. Bach, the most famous composer of the Baroque period. The third style consisted of passages from Scott Joplin's piano ragtimes, which belong to the beginning of the 20th century and are, along with Negro Spirituals, forerunners of Jazz. Since we are interested only in monodies (just one voice), for the second and third styles it was necessary to find and extract the melody from the whole set of notes that form each sample musical passage. For the Gregorian style, this was not necessary because it is a monody.

Automatic Composition Experiments

We inferred three automata (one per style) with each GI algorithm, trying different values of k (with k-TSI) and a (with ALERGIA), with the objective of obtaining the best synthesis. As a preliminary evaluation, the following criteria were adopted:

1.- The synthesis must make musical sense and must not be a random series of sounds.
2.- It must sound similar to the musical style it is meant to imitate (i.e. Gregorian, Bach, Joplin).
3.- It must have the characteristics of a proper melody (beginning, development, end).
4.- It must be different from any of the training melodies.

… considered as melodies. Every synthesis in the Gregorian style accomplished this objective, but more than half of the syntheses in the Bach and Joplin styles did not (these syntheses were qualified as "bad"). The second objective (very good syntheses) was reached in the Gregorian style, but not with Bach and Joplin, as can be seen in Table 1. Examples of ECGI compositions are shown in Fig. 4 and Fig. 5.

                GREGORIAN   BACH   JOPLIN
VERY GOOD          20 %      0 %     0 %
GOOD               25 %      5 %     5 %
NOT VERY GOOD      35 %     20 %    15 %
BAD                20 %     75 %    80 %

Table 1: Qualification (in percentage) of melodies composed by the ECGI algorithm.
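The composition experiments above follow a simple pipeline: infer a stochastic model of each style from coded sample melodies, then synthesize new strings from it. The sketch below illustrates this with a bigram model (the k = 2 case of a stochastic k-TSSL, equivalent to an N-gram with N = k) standing in for the full k-TSI algorithm; the tiny training set and all function names are illustrative, not the paper's data or code.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"

def infer_bigram(samples):
    """Count symbol successors over the training strings,
    including start and end events."""
    counts = defaultdict(lambda: defaultdict(int))
    for melody in samples:
        tokens = [START] + melody.split() + [END]
        for a, b in zip(tokens, tokens[1:]):
            counts[a][b] += 1
    return counts

def compose(counts, rng=None, max_len=50):
    """Random walk from START, choosing successors by relative frequency."""
    rng = rng or random.Random()
    out, current = [], START
    for _ in range(max_len):
        succ = counts[current]
        current = rng.choices(list(succ), weights=list(succ.values()))[0]
        if current == END:
            break
        out.append(current)
    return " ".join(out)

# Illustrative training strings in the implicit notation.
samples = ["22c 26c 24c 24n", "24c 26c 27c 24n", "22c 24c 26c 24n"]
model = infer_bigram(samples)
print(compose(model))
```

Larger k captures longer-range regularities at the cost of needing more training data, which is why the experiments try several values of k.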
A cross-validation was made with each GI algorithm for the style classification of test melodies. Then, we studied the success and failure ratios in this classification. In the cross-validation for each experiment, we took 90 samples for training and the remaining 10 for testing.

The ALERGIA results were similar to those of k-TSI and can be seen in Table 6 and Table 7. They are presented in the same way as the k-TSI results. We tried different values of a covering its entire range ([0,1]), and the results for some of these values are shown in the above-mentioned tables. We stopped at a = 0.008 because the canonical automaton was obtained with this value. The best results were achieved with a = 0.01, as in Automatic Composition. In general, Scott Joplin's style was better recognized with ALERGIA than with k-TSI, and Bach's style was better recognized with k-TSI than with ALERGIA.

            a = 0.9   a = 0.6   a = 0.1   a = 0.01   a = 0.008
GREGORIAN    100 %     100 %     100 %     100 %       99 %
BACH          63 %      65 %      78 %      80 %       66 %
JOPLIN        72 %      77 %      76 %      80 %       70 %

Table 6: Average success rate classifying with ALERGIA.

            a = 0.9   a = 0.6   a = 0.1   a = 0.01   a = 0.008
             22.3 %    19.3 %    15.3 %    13.3 %     21.3 %

Table 7: Average classification error with ALERGIA.

Conclusions and future trends

As a main conclusion from this study, we can say that GI algorithms are appropriate for Automatic Composition and Style Recognition, but some improvements have to be made. In Automatic Composition, we obtained good results in the Gregorian style with both the ECGI and the k-TSI algorithms. These results were not as good for the Bach and Joplin styles, but are still promising. In Automatic Music Style Recognition we achieved good results as well: with k-TSI, the average classification error was 9.9 %, and with ALERGIA it was 13.3 %. This is the first time that these kinds of experiments have been carried out.

In order to improve these results, the following lines of study are proposed: (1) Increasing the number of training samples, because GI algorithms tend to need a large quantity of training samples to get good results. (2) Solving the "tonality" problem: since the training samples have different tonalities, this increases the diversity of the samples and tends to "confuse" the GI algorithms. This can be solved with different methods: transposing every sample to the same tonality, including information about the tonality in the samples, or changing the music coding to a "differential code" which codes the intervals between notes instead of the absolute pitch of each note. (3) Increasing the length of the samples: once these problems are solved, we could employ entire musical pieces and not just small fragments, as was done in this study. Future studies could also be expanded to include polyphony. So far, the GI algorithms compose monodies, that is, melodies without accompaniment. By changing the GI algorithms in such a way that they could accept samples formed by several strings, it might be possible to deal with samples with more than one voice.

References

Carrasco, R. C.; Oncina, J. 1994. Learning Stochastic Regular Grammars by Means of a State Merging Method. In: Carrasco, R. C.; Oncina, J., eds. Grammatical Inference and Applications. Lecture Notes in Artificial Intelligence 862. Springer-Verlag.

Cruz, P. P. 1996. Estudio de diversos algoritmos de Inferencia Gramatical para el Reconocimiento Automático de Estilos Musicales y la Composición de melodías en dichos estilos. Proyecto Fin de Carrera. Facultad de Informática, Universidad Politécnica de Valencia.

Forney, G. D. 1973. The Viterbi Algorithm. Proceedings of the IEEE, 61(3), pp. 268-278.

García, P.; Vidal, E. 1990. Inference of k-Testable Languages in the Strict Sense and Application to Syntactic Pattern Recognition. IEEE Trans. on PAMI, 12(9), pp. 920-925.

Gold, E. M. 1967. Language Identification in the Limit. Information and Control, Vol. 10, pp. 447-474.

Rulot, H.; Vidal, E. 1987. Modelling (Sub)string-Length Based Constraints through a Grammatical Inference Method. In: Pattern Recognition Theory and Applications, NATO ASI Series, Vol. F30, pp. 451-459. Springer-Verlag.

Rumsey, F. 1994. MIDI Systems & Control. Focal Press.

Segarra, E. 1993. Una Aproximación Inductiva a la Comprensión del Discurso Continuo. Facultad de Informática, Universidad Politécnica de Valencia.