
1.3.1 Database Used

For any classification task, a good database is essential. A good collection should have sufficient diversity, and the count of each type of sample should be well balanced. Keeping these criteria in mind, we have created the audio database for our experiment.

The database contains 1390 audio clips of three broad types: speech, music without voice (instrumental), and music with voice (song). Instrumental signals are further categorized by instrument type, and song signals are classified by genre. The distribution of the different types of audio clips in our database is described in Table 1.1. Of the speech signals, 200 samples are of male voices and the rest are of female voices; the samples correspond to a large set of speakers across different age groups. The instrumental signals are recordings of a number of instruments such as piano, organ, flute, saxophone, guitar, violin, sitar, drum, and tabla. Thus, the collection is not only a good mixture of different instrument families (keyboard, woodwind, string, and percussion) but also covers several instruments within each family. The song signals belong to a number of genres and are taken from different singers. Songs and speech also vary in terms of the spoken language. These audio files were collected from different sources: some from CD recordings, some from recordings of live programs, and some downloaded from various sites on the internet. Some of these sound files are also noisy.
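A collection like this is commonly organized on disk with one folder per broad class, so that labels can be derived from the directory structure. The sketch below assumes such a layout (the folder names and structure are illustrative, not the actual organization of our database):

```python
import os

# Hypothetical layout, e.g.:
#   database/speech/clip001.wav
#   database/instrumental/piano/clip050.wav
#   database/song/rock/clip300.wav
def collect_labels(root):
    """Walk the tree and pair each WAV file with its broad class label,
    taken from the first folder level under the root."""
    labeled = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".wav"):
                rel = os.path.relpath(dirpath, root)
                label = rel.split(os.sep)[0]
                labeled.append((os.path.join(dirpath, name), label))
    return labeled
```

With this convention, finer categories (instrument type, genre) remain available from the deeper folder levels without a separate metadata file.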

The audio files in our collection are in WAV format. In this format, uncompressed sample values are stored using linear pulse code modulation (LPCM), which makes it a simple format to work with. Each audio clip in the database is 40-45 seconds long. The sampling frequency of the data is 22050 Hz, and the samples are 16-bit mono.
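Because the clips are uncompressed LPCM WAV files, their properties can be verified directly with the standard-library `wave` module. A minimal sketch that checks a clip against the format described above (the function name and file path are placeholders):

```python
import wave

def check_clip(path):
    """Verify that a WAV clip matches the database format:
    22050 Hz sampling rate, 16-bit samples, mono, 40-45 s long."""
    with wave.open(path, "rb") as w:
        assert w.getframerate() == 22050, "expected 22050 Hz sampling rate"
        assert w.getsampwidth() == 2, "expected 16-bit (2-byte) samples"
        assert w.getnchannels() == 1, "expected mono audio"
        duration = w.getnframes() / w.getframerate()
        assert 40.0 <= duration <= 45.0, "expected a 40-45 second clip"
        return duration
```

Such a check is useful when files are gathered from heterogeneous sources (CDs, live recordings, downloads), since any clip that was not resampled to the common format is caught before feature extraction.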
