SCHOOL OF AUDIO ENGINEERING

Diploma in Audio Engineering

Course Notes


TABLE OF CONTENTS

Table of Contents .......... 2
AE01 – Sound Theory .......... 6
AE02 – Music Theory I .......... 34
AE03 – Analog Tape Machines .......... 59
AE04 – The Decibel .......... 70
AE05 – Basic Electronics .......... 88
AE06 – Audio Lines and Patchbays .......... 101
AE07 – Analog Mixing Consoles .......... 114
AE08 – Signal Processors .......... 135
AE09 – Microphones .......... 160
AE11 – Digital Technology .......... 168
AE12 – Computer Fundamentals .......... 182
AE13 – Music Theory II .......... 197
AE14 – MIDI .......... 229
AE15 – Digital Recording Formats .......... 263
AE16 – Digital Audio Workstations .......... 293
AE17 – Mastering for Audio and Multimedia .......... 302
AE18 – Synchronization and Timecode .......... 318
AE19 – Professional Recording Studios .......... 332
AE20 – Audio Postproduction for Video .......... 335
AE21 – Sound for TV and Film .......... 347
AE22 – Multimedia Overview .......... 366
AE23 – Internet Audio .......... 379
AE24 – Loudspeakers and Amplifiers .......... 405
AE25 – Live Sound Reinforcement .......... 434
AE27 – Acoustical Climate .......... 447
AE28 – 3D Sound .......... 487
AE29 – The Broadcast Concept .......... 504
AE30 – Music Business .......... 517


AE01 – SOUND THEORY
Fundamentals of Sound
1. The Elements Of Communication
2. The Sound Pressure Wave
3. Speed of Sound Wave
   3.1 Particle Velocity
4. Amplitude/Loudness/Volume/Gain
5. Frequency
   5.1 Frequency Spectrum
   5.2 Pitch
   5.3 Wavelength
   5.4 Period
   5.5 Phase
   5.6 Phase Shift
6. Difference Between Musical Sound and Noise
   6.1 Harmonic Content
   6.2 Timbre
   6.3 Octaves
7. Waveform Types
8. Wave Shape
9. Acoustic Envelope

THE HUMAN EAR
1. The Outer Ear
2. The Middle Ear
3. The Inner Ear
4. Neural Processing
5. The Ear and Frequency Perception
   5.1 Critical Bandwidth and Beats
6. Frequency Range and Pressure Sensitivity of the Ear
7. Noise-Induced Hearing Loss
   7.1 A Loss of Hearing Sensitivity
   7.2 A Loss of Hearing Acuity
8. Protecting Your Hearing
9. Perception of Sound Source Direction
   9.1 Interaural Time Difference (ITD)
   9.2 Interaural Intensity Difference (IID)
   9.3 Pinnae and Head Movement Effects
   9.4 The Haas Effect
10. Ear Training
   10.1 Listening to Music
   10.2 Listening with Microphones
   10.3 Listening in Foldback and Mixdown


AE01 – SOUND THEORY
Fundamentals of Sound

1. The Elements Of Communication

Communication is the transfer of information from a source or stimulus through a medium to a reception point. The medium through which the information travels can be air, water, space or solid objects. Information that is carried through all natural media takes the form of waves: repeating patterns that oscillate back and forth, e.g. light, sound, electricity, radio and TV waves.

Stimulus: A medium must be stimulated in order for waves of information to be generated in it. A stimulus produces energy, which radiates outwards from the source in all directions. The sun and an electric light bulb produce light energy. A loudspeaker, a vibrating guitar string, a tuning fork and the voice are sound sources, which produce sound energy waves.

Medium: A medium is something intermediate or in the middle. In an exchange of communication the medium lies between the stimulus and the receptor. The medium transmits the waves generated by the stimulus and delivers these waves to the receptor. In acoustic sound transmission the primary medium is air; in electronic sound transmission the medium is an electric circuit. Sound waves will not travel through space although light will: in space no-one can hear you scream.

Reception/Perception: A receptor must be capable of responding to the waves being transmitted through the medium in order for information to be perceived. The receptor must be physically configured to tune in sympathetically to the types of waves it receives. An ear or a microphone is tuned in to sound waves; an eye or a camera is tuned in to light waves. Our senses respond to the properties or characteristics of waves such as frequency, amplitude and type of waveform.

2. The Sound Pressure Wave

All sound waves are produced by the vibrations of material objects. The object could be the vibrating string of a guitar, which on its own is very weak but is considerably reinforced by the vibrating wooden body of the instrument's soundboard. Any vibrating object can act as a sound source and produce a sound wave, and the greater the surface area the object presents to the air, the more of the medium it can displace.

All sounds are produced by the mechanical vibration of objects, e.g. rods, diaphragms, strings, reeds, vocal cords and forced airflow. The vibrations may have wave shapes that are simple or complex. These wave shapes are determined by the shape, size and stiffness of the source and by the manner in which the vibrations are initiated: by hammering (rods), plucking or bowing (strings), or forced airflow (vibration of an air column, as in an organ or the voice). Vibrations from any of these sources cause a series of pressure fluctuations in the medium surrounding the object, which travel outwards through the air from the source.

It is important to note that the particles of the medium, in this case molecules of air, do not travel from the source to the receiver, but vibrate about their rest positions in a direction parallel to the direction of travel of the sound wave. Sound is therefore a longitudinal wave motion, which propagates outwards away from the source. The transmission of sound energy via molecular collision is termed propagation.

When the air molecules are at their rest positions, i.e. when there is no sound present in the air medium, only normal atmospheric pressure exists. (Sound pressures are measured against the reference pressure of 0.0002 dyne/cm2, i.e. 0.00002 Pascal or 20 µPa, the threshold of hearing.) When a sound source is made to vibrate, it causes the air particles surrounding it to be alternately compressed and rarefied, thereby fluctuating the air pressure between a higher-than-normal state and a lower-than-normal state. This fluctuation is determined by the rate of vibration and the force with which the vibration was initiated at the source.

At the initial forward excursion of the vibrating source, the particle nearest to the sound source is thrown forward to a point where it comes into contact with an adjacent molecule. After coming into contact with this adjacent molecule, the molecule moves back along the path of its original travel, where its momentum causes it to bypass its normal rest position and regress to its extreme rearward position, from where it swings back and finally comes to rest at its normal position.

Recap:
• When there is no sound wave present, all particles are in a state of equilibrium and normal air pressure exists throughout the medium.
• Higher pressure occurs where air particles press together, causing a region of higher than normal pressure called a compression.
• Lower pressure occurs where adjacent particles move apart to create a partial vacuum, causing a region of lower than normal pressure called a rarefaction.
• The molecules vibrate at the same rate as the source vibration. Each air particle vibrates about its rest position at the same frequency as the sound source, i.e. there is sympathetic vibration.
• The molecules are displaced from their mean positions by a distance proportional to the amount of energy in the wave. The higher the energy from the source, the greater the displacement from the mean position.

• A pressure wave radiates away from the source at a constant speed, i.e. the speed of sound in air.
• Sound pressure fluctuates between positive and negative amplitudes around a zero-pressure median point, which is in fact the prevailing atmospheric pressure.
• These areas of compression and rarefaction move away from the vibrating body in the form of a longitudinal wave motion, each molecule transferring its energy to the next, creating an expanding spherical sound field.
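As an illustration of the last two points, the short Python sketch below models a sound pressure wave as a small sinusoidal fluctuation around the prevailing atmospheric pressure. The frequency, deviation amplitude and atmospheric value are arbitrary example figures, not taken from these notes:

    import math

    P_ATM = 101325.0      # prevailing atmospheric pressure in pascals (illustrative)
    amplitude = 0.2       # peak sound pressure deviation in pascals (a fairly loud sound)
    frequency = 100.0     # vibration rate of the source in hertz

    # Instantaneous pressure over one cycle: compressions rise above the
    # atmospheric median, rarefactions dip below it.
    for step in range(9):
        t = step / (8 * frequency)              # sample one 100 Hz cycle at 8 points
        deviation = amplitude * math.sin(2 * math.pi * frequency * t)
        state = "compression" if deviation > 0 else "rarefaction" if deviation < 0 else "equilibrium"
        print(f"t = {t*1000:5.2f} ms  pressure = {P_ATM + deviation:10.3f} Pa  ({state})")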

3. Speed of sound wave

The speed of sound (c or S) is the speed with which the compressions and rarefactions move through the medium, i.e. the velocity of the sound wave. Sound waves need a material medium for transmission; any medium that has an abundance of molecules can transmit sound. The speed of sound varies with the density and elasticity of the medium through which it is travelling. Sound is capable of travelling through liquids and solid bodies, such as water, steel and other substances.

In air, the velocity is affected by:

Density: the more closely packed the molecules of a medium, the faster sound will travel in that medium.

Temperature: the speed of sound increases as temperature rises, according to the formula V = 331 + 0.6t m/s, where t is the temperature in degrees Celsius, i.e. approximately a 0.6 m/s rise for every degree increase in temperature.

Humidity: with an increase of relative humidity, high frequencies suffer due to absorption of the sound.

At normal room temperature and atmospheric pressure the velocity of sound in air is approximately 342 metres per second. Generally sound moves through water about 4 times as fast as it does through air, and through iron it moves approximately 14 times faster.

The speed of sound can be determined from: Speed = Distance travelled / Time taken, or Speed = d/t.

The speed of sound in another medium is referred to by S. For example, the speed of sound on magnetic tape is equal to the tape speed at the time of recording, i.e. S = 15 or 30 inches per second (ips).

Velocity refers to speed in a particular direction. As most sound waves move in all directions unless impeded in some way, the velocity of a sound wave is treated as equivalent to its speed. It is worth remembering that at normal temperature and pressure at sea level, sound travels approximately 1 metre in 3 milliseconds (more precisely, 1 metre in about 2.92 milliseconds).

Example: how long does it take for sound to travel 1 km?

Time taken = Distance / Speed = 1000 / 342 ≈ 2.92 seconds.
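A minimal Python sketch of these two relationships, using the temperature formula and the distance/time example from the notes (the 25 °C test value is just an illustration):

    def speed_of_sound(temp_celsius):
        """Approximate speed of sound in air, in m/s (V = 331 + 0.6t)."""
        return 331.0 + 0.6 * temp_celsius

    def travel_time(distance_m, speed_ms=342.0):
        """Time in seconds for sound to cover a distance at the given speed."""
        return distance_m / speed_ms

    print(speed_of_sound(25))        # about 346 m/s on a warm day
    print(travel_time(1000))         # the worked example: ~2.92 s for 1 km
    print(travel_time(1) * 1000)     # ~2.92 ms per metre, the rule of thumb above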

As mentioned above, any vibrating object can act as a sound source and thus produce a sound wave. The greater the surface area the object presents to the air, the more air it can move. The object could be the vibrating string of a guitar, which on its own is very weak but is considerably reinforced by the vibrating wooden body of the instrument's soundboard. The disturbance of air molecules is not restricted to a single source: two or more sources can emit sound waves, and the medium around the sources is disturbed by each of them (each instrument). Air, by virtue of its elasticity, can support a number of independent sound waves and propagate them simultaneously.

3.1 Particle velocity

Particle velocity refers to the velocity at which a particle in the path of a sound wave is moved (displaced) by the wave as it passes. It should not be confused with the velocity at which the sound wave travels through the medium, which is constant unless the sound wave encounters a different medium, in which case the wave will refract. If the sound wave is sinusoidal (sine wave shaped), the particle velocity will be zero at the peaks of displacement and will reach a maximum as the particle passes through its normal rest position.

4. Amplitude/Loudness/Volume/Gain

When describing the energy of a sound wave the term amplitude is used. Amplitude is the distance above or below the centre line of a waveform (such as a pure sine wave). The greater the displacement of the molecules from their centre positions, the more intense the pressure variation, or physical displacement of the particles, within the medium. In the case of an air medium it represents the change in air pressure as it deviates from the normal state at each instant.

The amplitude of a sound wave in air is measured in pascals or dynes per square centimetre, both units of air pressure. However, for audio purposes air pressure differences are more meaningful when expressed by the logarithmic power ratio called the Bel, or decibel (dB).
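As a sketch of how such a pressure can be expressed in decibels, the snippet below converts a pressure amplitude into dB SPL against the 20 µPa reference mentioned earlier. The 20·log10 pressure-ratio formula and the 0.2 Pa input are our own illustrative additions, not figures stated in this section:

    import math

    P_REF = 20e-6   # reference sound pressure: 20 micropascals (threshold of hearing)

    def spl_db(pressure_pa):
        """Sound pressure level in dB relative to 20 uPa."""
        return 20 * math.log10(pressure_pa / P_REF)

    print(round(spl_db(0.2), 1))      # a 0.2 Pa wave is about 80 dB SPL
    print(round(spl_db(20e-6), 1))    # the reference pressure itself sits at 0 dB SPL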


Waveform amplitudes are measured using various standards:

a. Peak amplitude refers to the positive and negative maximums of the wave.

b. Root Mean Square (RMS) amplitude gives a meaningful average of the peak values and more closely approximates the signal level perceived by our ears. For a sine wave, the RMS amplitude is equal to 0.707 times the peak value.
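A small Python sketch of the peak and RMS measures. It simply samples one cycle of a unit sine wave, so the 0.707 relationship shown only holds for this sine-wave case:

    import math

    N = 1000                                   # samples across one cycle
    samples = [math.sin(2 * math.pi * n / N) for n in range(N)]

    peak = max(abs(s) for s in samples)                      # positive/negative maximum
    rms = math.sqrt(sum(s * s for s in samples) / N)         # root of the mean of the squares

    print(round(peak, 3))          # 1.0
    print(round(rms, 3))           # ~0.707, i.e. 0.707 x peak for a sine wave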

Our perception of loudness is not proportional to the energy of the sound wave: the human ear does not perceive all frequencies at the same intensity. We are most sensitive to tones in the middle frequencies (3 kHz to 4 kHz), with decreasing sensitivity to those having relatively lower or higher frequencies.

Loudness and volume are not the same. Hi-fi systems have both a loudness switch and a volume control. A volume control is used to adjust the overall sound level over the entire frequency range of the audio spectrum (20 Hz to 20 kHz); it is not frequency or tone sensitive, so when you advance the volume control all tones are increased in level. A loudness switch increases the low-frequency and high-frequency ranges of the spectrum while not affecting the mid-range tones.

Fletcher & Munson Curves or equal loudness contours show the response of the
human ear throughout the audio range and reveal that more audio sound power is required at the low end and high end of the sound spectrum to obtain sounds of equal loudness.


5. Frequency

The rapidity with which a cycle of vibration repeats itself is called the frequency. One complete excursion of a wave, plotted over the 360 degrees of a circle, is known as a cycle. A cycle can begin at any point on the waveform, but to be complete (one cycle) it must pass through the zero line and end at a point that has the same value as the starting point. The number of cycles that occur over the period of one second is the frequency, measured in cycles per second (c.p.s.) or simply Hertz (Hz).

Frequency = 1 / t (period). E.g. 1 / 0.01 (seconds per cycle) = 100 Hz.

5.1 Frequency Spectrum

The scope of the audible spectrum of frequencies is from 20 Hz to 20 kHz. This spectrum is defined by the particular characteristics of human hearing and corresponds to the pitch or frequency ranges of all commonly used musical instruments.

5.2 Pitch

Pitch describes the fundamental or basic tone of a sound. It is determined by the frequency of the tone: the frequency is a measure of the number of complete vibrations generated per second, and the greater the number of waves per second, the higher the frequency and the higher the pitch of the sound.

5.3 Wavelength

The wavelength of a wave is the actual physical distance covered by one cycle of the waveform, i.e. the distance between any two corresponding points of a given cycle.

Formula: λ = v / f

where λ (lambda) is the wavelength in the medium, in metres (m); v is the velocity of sound in the medium (m/s); and f is the frequency in Hertz (Hz).

Typical wavelengths encountered in acoustics:

Frequency    Wavelength
20 Hz        17.1 m
1 kHz        34 cm
8 kHz        4.3 cm
20 kHz       1.7 cm

5.4 Period

The period is the amount of time required for one complete cycle of the sound wave, that is, the fraction of time required for a complete progression of the wave. E.g. a 30 Hz sound wave completes 30 cycles each second, or one cycle every 1/30th of a second (0.033 s).

Formula: P (the time taken for one complete oscillation) = 1 / f. E.g. period (t) = 1/30 = 0.033 s to complete one cycle of a 30 Hz wave.

5.5 Phase

The concept of phase is important in describing sound waves. It refers to the relative displacement in time between waves of the same frequency. The studio engineer must always contend with two distinct waves:

1. Direct waves
2. Reflected waves

When the direct sound wave strikes a reflective surface, part of that wave passes through the surface, part of it is absorbed by the surface material, and the rest is reflected as a delayed wave. The direct and reflected waves may be wholly or partially in phase with each other: the result is that they will either reinforce each other or cancel each other at the point of the cycle where they converge.

Since a cycle can begin at any point on a waveform, it is possible to have two waves interacting with each other. If two waveforms which have the same frequency and peak amplitude are offset in time, they will have different signed amplitudes at each instant in time. These two waves are said to be out of phase with respect to each other.

When two waveforms that are completely in phase (0 degrees phase difference) and of the same frequency and peak amplitude are added, the resulting waveform is of the same frequency and phase but has twice the original amplitude. If two waveforms are completely out of phase (180 degrees phase difference), they will cancel each other out when added, resulting in zero amplitude.

5.6 Phase Shift

Phase shift (Ø) is often used to describe the relationship between two waves. If one wave is offset in time, it will be out of phase with the other. When two sound sources produce two waves in close proximity to each other they will interact, or interfere, with each other. The type of interference these two waves produce is dependent upon the phase relationship, or phase shift, between them. If two waveforms which have the same frequency and peak amplitude are offset in time, they will interfere constructively or destructively with each other at certain points of the waveform.

Phase relationship characteristics for two identical waves:

0 degree phase shift: the waves are said to be completely in phase, correlated, or 100% coherent. They interfere constructively, with their amplitudes being added together.

180 degree phase shift: the waves are completely out of phase, or uncorrelated (zero coherence). They interfere destructively, their amplitudes cancelling each other to produce zero signal.

Other degrees of phase shift: the waves are partially in phase.
90/270 degrees: 50% coherent, equal addition and cancellation.
Less than 90 degrees: more constructive interference.
Greater than 90 degrees: more destructive interference.

Time delays between waveforms introduce different degrees of phase shift. In music we deal with complex waveforms, so it is difficult to perceive the actual phase addition or subtraction; both additions and cancellations will occur. The result of out-of-phase conditions is the cancellation of certain frequencies: some frequencies will be boosted and others attenuated, with bass frequencies being affected the most.

An experiment with a home hi-fi can best demonstrate the phenomenon: changing the + and - connections of one of the speakers will result in a phase difference of 180 degrees between it and the second speaker. The result will be:

- Loss of bass
- A change in the mid and high frequency response; this could be a boost or a cut
- Loss of stereo image: instrument placement, depth
- Overall loss of amplitude

Acoustical phase cancellation is the most destructive condition that can arise during recording. When two waveforms that are completely in phase (zero phase difference) and of the same frequency and peak amplitude are added, the resulting waveform is of the same frequency and phase but has twice the original amplitude; if two waves are completely out of phase (180 degrees) they cancel each other when added, resulting in zero amplitude. If a second wave arrives 180 degrees out of phase it will result in zero amplitude, hence no output. Care must be taken during recording to position stereo mics so as to maintain an equal distance from the source.

Phase shift can be calculated by the formula:

Ø = T x Fr x 360

where Ø is the phase shift in degrees, T is the time delay in seconds and Fr is the frequency in Hertz.

E.g. what will be the phase shift if a 100 Hz wave is delayed by 5 milliseconds?
Ø = 0.005 x 100 x 360 = 180 degrees, i.e. total phase shift (complete cancellation).

Exercise: what would be the degree of phase shift if a 100 Hz wave was delayed by 2.5 milliseconds?
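The period, wavelength and phase-shift formulas above can be pulled together in a short Python sketch. This is a minimal illustration using the 342 m/s speed of sound quoted earlier; the delay value is the one from the worked example:

    SPEED_OF_SOUND = 342.0   # m/s in air, as used earlier in these notes

    def period(freq_hz):
        """Time for one complete cycle, in seconds (P = 1/f)."""
        return 1.0 / freq_hz

    def wavelength(freq_hz, speed=SPEED_OF_SOUND):
        """Physical length of one cycle, in metres (lambda = v/f)."""
        return speed / freq_hz

    def phase_shift_deg(delay_s, freq_hz):
        """Phase shift in degrees produced by a time delay (O = T x Fr x 360)."""
        return delay_s * freq_hz * 360.0

    print(round(period(30), 3))              # 0.033 s for a 30 Hz wave
    print(round(wavelength(1000), 3))        # ~0.342 m, i.e. about 34 cm at 1 kHz
    print(phase_shift_deg(0.005, 100))       # 180 degrees: the cancellation case above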

6. Difference between musical sound and noise

Sound carries an implication that it is something we can hear, or that is audible. However, sound also exists above the threshold of our hearing, called ultrasound (20 kHz and above), and below our hearing range, called infrasound (20 Hz and below).

Regular vibrations produce musical tones. A musical note consists of a fundamental wave and a number of overtones called harmonics; these harmonics are known as the harmonic series. The series of tones/frequencies vibrate in tune with one another, and because of their orderliness we find their tones pleasing. Noise consists of irregular vibrations: its air pressure variations are random and we perceive them as unpleasant.

6.1 Harmonic content

Harmonic content is the tonal quality, or timbre, of the sound. Sound waves are made up of a fundamental tone and several different frequencies called harmonics or overtones. Harmonics are whole-numbered multiples of the fundamental; overtones may or may not be harmonically related to the fundamental. If the fundamental frequency is 440 Hz, its second harmonic will be 880 Hz, or twice the fundamental (440 x 2 = 880). The third harmonic will be 1320 Hz (440 x 3 = 1320). No matter how complex a waveform is, it can be shown to be the sum of sine waves whose frequencies are whole-numbered multiples of the fundamental.

Harmonic modes: when strings are struck (as in a piano), plucked (guitar) or bowed (violin), they tend to vibrate in quite a complex manner. In the fundamental mode (1st harmonic) the string vibrates or oscillates as a whole with respect to the two fixed ends. But there is also a tendency for the two halves to oscillate, at about twice the frequency of the fundamental note, producing a second harmonic; this is described as the 2nd mode of vibration. Similarly there is a tendency for the string to vibrate in sections of 1/2, 1/3 and 1/4 of its length (the 2nd, 3rd mode of vibration and so on). The composite waveform comprising the fundamental and its numerous harmonics can look quite irregular, with many sharp peaks and dips, yet the fundamental tone and each harmonic is itself a very regular waveform.

6.2 Timbre

These subsequent vibrations, along with the fundamental, constitute the timbre or tonal characteristic of a particular instrument. The factor that enables us to differentiate the same note being played by several instruments is the harmonic/overtone relationship between the instruments playing that note. A violin playing 440 Hz has a completely different fundamental/harmonic relationship to a viola; this is what allows us to recognise the difference between the two instruments although they may be playing the same note.

6.3 Octaves

The octave is a musical term which refers to the distance between one note and its recurrence higher or lower in a musical scale. The octave range relates directly to the propensity of human hearing to judge relative pitch or frequency on a 2:1 ratio. (In the case of middle C, the frequency generated is 261.63 Hz.) An octave distance is always a doubling of frequency, so the octave above the A at 440 Hz is 880 Hz, which is also the second harmonic. However, the next octave above 880 Hz is 1760 Hz, four times the fundamental, whereas the next harmonic is 1320 Hz. Therefore the octave scale and the harmonic scale are different: the octave scale is said to be logarithmic while the harmonic scale is linear.

7. Waveform Types

Waveforms are the building blocks of sound. Generally, musical waveforms can be divided into two categories:

1. Simple
2. Complex

Simple: the most basic wave is the sine wave, which traces a simple harmonic motion, e.g. a pendulum, a tuning fork or a flute-like tone. These waveforms are called simple because they are continuous and repetitive: one cycle looks exactly like the next and they are perfectly symmetrical around the zero line. Simple waves can be combined to form composite waveforms.

Complex: speech and music depart from the simple sine form. We can break down a complex waveform into a combination of sine waves; this method is called Fourier analysis, named after the 19th-century Frenchman who proposed it. The ear mechanism also distinguishes frequency in complex waves by breaking them down into sine wave components.

Wave synthesis combines simple waves into complex waveforms which approximate real instruments; this is how synthesisers make sound. Synthesisers have waveform generators which create all types of waves, and by combining raw waveforms one can simulate almost any acoustic instrument.

8. Wave shape

Wave shape is created by the amplitude and harmonic components in the wave. To create a square wave, the sine wave fundamental is combined with a series of odd harmonics. A triangle wave likewise contains only odd harmonics, but at much lower levels, which gives it a smoother shape. A sawtooth wave is made of odd and even harmonics added to the fundamental frequency. (A short additive-synthesis sketch follows at the end of this section.)

Noise is a random mixture of sine waves continuously shifting in frequency. There are generally two types of synthetic noise:
i. White noise: equal energy per frequency
ii. Pink noise: equal energy per octave
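The harmonic recipes above can be sketched with a few lines of additive synthesis in Python. This is an illustration only: a handful of partials is summed with the standard Fourier amplitude ratios, whereas an ideal square, triangle or sawtooth wave needs an infinite series.

    import math

    def additive_wave(t, fundamental, harmonics):
        """Sum sine-wave partials given as (harmonic number, relative amplitude) pairs."""
        return sum(amp * math.sin(2 * math.pi * fundamental * n * t) for n, amp in harmonics)

    # Approximate harmonic recipes (first few partials only):
    square   = [(n, 1.0 / n) for n in (1, 3, 5, 7, 9)]                              # odd harmonics, 1/n
    triangle = [(n, ((-1) ** ((n - 1) // 2)) / (n * n)) for n in (1, 3, 5, 7, 9)]   # odd harmonics, 1/n^2, alternating sign
    sawtooth = [(n, 1.0 / n) for n in (1, 2, 3, 4, 5, 6)]                           # odd and even harmonics

    fundamental = 440.0   # the A used above; partials fall at 880 Hz, 1320 Hz, ...
    t = 0.0004            # one arbitrary instant in time, in seconds
    for name, recipe in (("square", square), ("triangle", triangle), ("sawtooth", sawtooth)):
        print(name, round(additive_wave(t, fundamental, recipe), 3))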

9. Acoustic Envelope

An important aspect influencing the waveform of a sound is its envelope. The envelope of a waveform describes the way its intensity varies over the time that the sound is produced and dies away; it therefore describes a relationship between time and amplitude. It can be viewed on a graph by connecting a wave's peak points of the same polarity over a series of cycles.

An acoustic envelope has four basic sections: attack, decay, sustain and release.

- Attack time is the time it takes for the sound to rise to its maximum level.
- Decay time reflects the internal dynamics of the instrument (for example the resonance of a tom drum) and can be longer than the main decay (sustain); the tom can ring on for some time.
- Sustain time is the time during which the sound source is maintained, from maximum levels to mid levels.
- Release time is the time it takes for a sound to fall below the noise floor.

Every instrument produces its own envelope, which works in combination with its timbre to determine the subjective sound of the instrument.

THE HUMAN EAR

The organ of hearing, the ear, operates as a transducer: it translates wave movement through several mediums - air pressure variations into mechanical action, then into fluid variations, and finally into electrical/neural impulses.

1. The Outer Ear

The outer ear consists of the pinna and the ear canal (external meatus). Here sound waves are collected and directed toward the middle ear. Both the pinna and the ear canal increase the loudness of a sound we hear by concentrating or focusing the sound waves, e.g. the old ear trumpet used as a hearing aid. The ear canal is often compared to an organ pipe, in that certain frequencies (around 3 kHz) will resonate within it because of its dimensions (about 3 cm x 0.7 cm), i.e. frequencies whose quarter wavelengths are similar in size to the length of the canal; this corresponds to the critical bandwidth for speech intelligibility. The ear will perceive this frequency band as louder. "Pipe resonance" amplifies the sound pressure falling on the outer ear by around 10 dB by the time it strikes the eardrum, and Wiener and Ross have found that diffraction around the head results in a further amplification effect adding another 10 dB in the same bandwidth, peaking in the 2-4 kHz region.

2. The Middle Ear

The mechanical movements of the tympanic membrane (eardrum) are transmitted through three small bones known as ossicles, comprising the malleus, incus and stapes – more commonly known as the hammer, anvil and stirrup – to the oval window of the cochlea. The oval window forms the boundary between the middle and inner ears. The malleus (hammer) is fixed to the middle fibrous layer of the tympanic membrane in such a way that when the membrane is at rest it is pulled inwards; thus the tympanic membrane, when viewed down the auditory canal from outside, appears concave and conical in shape. The malleus and incus (hammer and anvil) are joined quite firmly, such that at normal intensity levels they act as a single unit, rotating together as the tympanic membrane vibrates to move the stapes, via a ball and socket joint, in a piston-like manner. One end of the stapes (stirrup), the stapes footplate, is attached to the oval window of the cochlea. Thus acoustic vibrations are transmitted via the tympanic membrane and ossicles as mechanical movements to the cochlea of the inner ear.

The function of the middle ear is twofold:

• To transmit the movements of the tympanic membrane to the fluid which fills the cochlea without significant loss of energy.
• To protect the hearing system to some extent from the effects of loud sounds, whether from external sources or from the individual concerned.

In order to achieve efficient transfer of energy from the tympanic membrane to the oval window, the effective pressure acting on the oval window is arranged by mechanical means to be greater than that acting on the tympanic membrane. This is to overcome the higher resistance to movement of the cochlear fluid compared to that of air at the input to the ear. Resistance to movement can be thought of as impedance to movement, and the impedance of fluid to movement is high compared to that of air. The ossicles act as a mechanical impedance converter, or impedance transformer, and this is achieved by two means:

• The lever effect of the malleus (hammer) and incus (anvil).
• The area difference between the tympanic membrane and the stapes footplate.

In humans, the area of the tympanic membrane is approximately 13 times larger than the area of the stapes footplate, and the malleus is approximately 1.3 times the length of the incus. A third aspect of the middle ear which appears relevant to the impedance conversion process is the buckling movement of the tympanic membrane itself as it moves, which provides a force increase by a factor of 2 (a twofold increase in the force applied to the malleus). Thus the pressure at the stapes footplate is about (13 x 1.3 x 2 = 33.8) times larger than the pressure at the tympanic membrane.
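As a rough illustration of that impedance-matching arithmetic, the snippet below multiplies the three factors quoted above and also expresses the result in decibels. The dB conversion is our own addition, using the standard 20·log10 pressure-ratio formula rather than anything stated in the notes:

    import math

    area_ratio = 13.0      # tympanic membrane area vs stapes footplate area
    lever_ratio = 1.3      # malleus length vs incus length
    buckling_factor = 2.0  # force increase from the buckling of the tympanic membrane

    pressure_gain = area_ratio * lever_ratio * buckling_factor
    print(round(pressure_gain, 1))                        # ~33.8 times
    print(round(20 * math.log10(pressure_gain), 1))       # ~30.6 dB of pressure gain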

The second function of the middle ear is to provide some protection for the hearing system from the effects of loud sounds, whether from external sources or from the individual concerned. This occurs as a result of the action of two muscles in the middle ear: the tensor tympani and the stapedius muscle. The names of these muscles derive from where they connect with the ossicular chain: the tensor tympani is attached near the tympanic membrane and the stapedius muscle is attached to the stapes. These muscles contract automatically in response to sounds with levels greater than approximately 75 dB SPL, and they have the effect of increasing the impedance of the middle ear by stiffening the ossicular chain. This reduces the efficiency with which vibrations are transmitted from the tympanic membrane to the inner ear and thus protects the inner ear to some extent from loud sounds, but only for frequencies below about 1 kHz. This stiffening is known as the acoustic reflex, and it provides approximately 12 to 14 dB of attenuation. It takes some 60 ms to 120 ms for the muscles to contract in response to a loud sound, so in the case of a loud impulsive sound, such as the firing of a large gun, it has been suggested that the acoustic reflex is too slow to protect the hearing system.

3. The Inner Ear

The inner ear consists of two fluid-filled structures: the vestibular system, consisting of three semi-circular canals, the utricle and the sacculus, which are concerned with balance and posture; and the cochlea, the organ of hearing. The cochlea is about the size of a pea and encased in solid bone. It is coiled up like a seashell, filled with fluid and divided into an upper and a lower part by a pair of membranes (the basilar membrane and the tectorial membrane). The oval window opens into the upper part of the cochlea, and the pressure-releasing round window into the lower part. The rocking motion of the oval window caused by the ossicles sets up sound waves in the fluid. The waves act on hair-like nerve terminals bunched under the basilar membrane in the organ of Corti. These nerves convey signals, in the form of neuron discharges, to the brain, which the brain interprets.

Amplitude peaks for different frequencies occur along the basilar membrane in different parts of the cochlea. High frequencies cause maximal vibration at the stapes end of the basilar membrane, where it is narrow and thick; low frequencies cause a greater effect at the apical end, where the membrane is thin and wide - with lower frequencies (e.g. 50 Hz) peaking towards the far end and higher frequencies (e.g. 1500 Hz) nearer the beginning.

4. Neural Processing

Nerve signals consist of a number of electrochemical impulses which pass along the nerve fibres at about 10 m/s. So-called "microphonic" potentials have actually been picked up and amplified from the cortex of an anaesthetised cat; these potentials are proportional to the sound pressure falling on the ear over an 80 dB range. While the microphonic signals are analogue, the neuron discharges are caused by the cochlear nerves either firing (on) or not firing (off), producing a type of binary code. Intensity is conveyed by the mean rate of the impulses. Each fibre in the cochlear nerve responds most sensitively to its own characteristic frequency (CF), requiring a minimum sound pressure level at this frequency to stimulate it detectably; the CF is directly related to the part of the basilar membrane from which the stimulus arises. The loudness of a sound is related to the number of nerve fibres excited (around 3,000 maximum) and the repetition rates of such excitation; a single fibre firing would represent the threshold of sensitivity. The ear interprets this information as a ratio of amplitudes, deciphering it to give a picture of the harmonic richness, or timbre, of the sound. In this way the component frequencies (partials) of a signal are separated and their amplitudes measured.

5. The Ear and Frequency Perception

This section considers how well the hearing system can discriminate between the individual frequency components of an input sound. This provides the basis for understanding the resolution of the hearing system, and it underpins discussions relating to the psychoacoustics of how we hear music, speech and other sounds. Each component of an input sound gives rise to a displacement of the basilar membrane at a particular place, and the displacement due to each individual component is spread to some extent on either side of its peak. Whether or not two components that are of similar amplitude and close together in frequency can be discriminated depends on the extent to which the basilar membrane displacements due to the two components are clearly separated or not.

5.1 Critical Bandwidth and Beats

Suppose two pure tones, or sine waves, with amplitudes A1 and A2 and frequencies F1 and F2, are sounded together. If F1 is fixed and F2 is changed slowly from being equal to, or in unison with, F1, either upwards or downwards in frequency, the following is generally heard. When F1 is equal to F2 a single note is heard. As soon as F2 is moved higher (or lower) than F1, a sound with clearly undulating amplitude variations, known as beats, is heard. The frequency of the beats is equal to (F2 - F1), or (F1 - F2) if F1 is greater than F2, and the amplitude varies between (A1 + A2) and (A1 - A2), or between (A1 + A2) and (A2 - A1) if A2 is greater than A1. Note that when the amplitudes are equal (A1 = A2) the amplitude of the beats varies between 2 x A1 and 0.

For the majority of listeners, beats are usually heard when the frequency difference between the tones is less than about 12.5 Hz, and the sensation of beats generally gives way to one of a fused tone which sounds rough when the frequency difference is increased above about 15 Hz. As the frequency difference is increased further, there is a point at which the fused tone gives way to two separate tones, but still with the sensation of roughness; a further increase in frequency difference is needed for the rough sensation to become smooth. The smooth separate sensation persists while the two tones remain within the frequency range of the listener's hearing.

The point where the two tones are heard as separate, as opposed to fused, when the frequency difference is increased can be thought of as the point where two peak displacements on the basilar membrane begin to emerge from a single maximum displacement. However, at this point the underlying motion of the membrane which gives rise to the two peaks causes them to interfere with each other, giving the rough sensation; it is only when the rough sensation becomes smooth that the separation of the places on the membrane is sufficient to fully resolve the two tones. There is no exact frequency difference at which the change from fused to separate, and from beats to rough to smooth, occurs for every listener. However, the approximate frequencies and the order in which the changes occur are common to all listeners, and, in common with all psychoacoustic effects, average values are quoted which are based on measurements made for a large number of listeners.

The frequency difference between the pure tones at the point where a listener's perception changes from rough and separate to smooth and separate is known as the critical bandwidth. A more formal definition is given by Scharf (1970): "the critical bandwidth is that bandwidth at which subjective responses rather abruptly change." The critical bandwidth changes according to frequency. In practice, critical bandwidth is usually measured by an effect known as masking, in which the rather abrupt change is more clearly perceived by listeners. Masking is when one frequency cannot be heard as a result of another frequency that is louder and close to it.
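A small Python sketch of the beat effect described above. The values are illustrative: two equal-amplitude tones 4 Hz apart, well inside the beating region, whose combined envelope swings between 2 x A1 and 0 as the text states:

    import math

    A1 = A2 = 1.0
    F1, F2 = 440.0, 444.0          # 4 Hz apart
    beat_frequency = abs(F2 - F1)  # 4 beats per second
    print(beat_frequency)

    # Sample the combined signal and track the extreme of its slow amplitude envelope.
    peak = 0.0
    for n in range(44100):                         # one second at 44.1 kHz
        t = n / 44100.0
        sample = A1 * math.sin(2 * math.pi * F1 * t) + A2 * math.sin(2 * math.pi * F2 * t)
        peak = max(peak, abs(sample))
    print(round(peak, 2))                          # close to A1 + A2 = 2.0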
at this point the underlying motion of the membrane which gives rise to the two peaks causes them to interfere with each other giving the rough sensation.1 Critical Bandwidth and Beats Suppose two pure tones.

due to long-term exposure. However. Masking is when one frequency can not be heard as a result of another frequency that is louder and close to it. Frequency range and pressur e sensitivity of the ear The frequency range of the human ear. This is the threshold of hearing. Hearing losses can also be induced from prolonged exposure to loud sounds.the human ear is usually quoted as having a frequency range of 20Hz to 20.1 A loss of hearing sensitivity The effect of noise exposure causes the efficiency of the transduction of sound into nerve impulses to reduce. but ultimately it becomes permanent as the hair cells are permanently flattened as a result of the damage. the minimum sound pressure variation which can be detected by the human hearing system around 4Khz is approximately 10 micropascals. the decline being less for low frequencies than for high. Note this is different from the threshold shift due to the acoustic reflex which occurs over a much shorter time period and is a form of built-in hearing protection. The ear’s sensitivity to sounds of different frequencies varies over a vast sound pressure level range. This is the threshold of pain 7.2 A loss of hearing acuity 23 . 7. This reduction in the upper freqency limit of the hearing range is accompanied by a decline in hearing sensitivity at all frequencies with age. This is due to damage to the hair cells in each of the organs of corti. Healthy young children may have a full range hearing range up to 20Khz. ‘ the critical bandwidth is that bandwidth at which subjective responses rather abruptly change. On Average. This loss of sensitivity manifests itself as a shift in the threshold of hearing that they can hear. This shift in the threshold can be temporary.’ The critical bandwidth changes according to frequency. Noise-induced hearing loss The ear is a sensitive and accurate organ of sound transduction and analysis. The maximum average sound pressure level which is heard rather than perceived as being painful is 20Pa. but by the age of 20.000Hz (20Khz) but this is not necessarily the case for every individual. which does not allow them time to recover. IN practice Critical bandwidth is usually measured by an effect known as masking in which the rather abrupt change is more clearly perceived by listeners. A more formal definition is given by Scharf (1970). This is known as presbyacusis or presbycusis and is a function of normal ageing process. This range changes as part of the human ageing process. It continues to reduce gradually to about 8Khz by retirement age.SCHOOL OF AUDIO ENGINEERING AE01– Fundamentals of Sound Student Notes that the separation of the places on the membrane is sufficent to fully resolve the two tones. particularly in terms of the upper limit which tends to reduce. This damage can manifest itself in two major forms : 7. 10--5 Pa. 6. the upper limit may have dropped to 16Khz. for short times of exposures. The frequency difference between the pure tones at the point where a listener’s perception changes from rough and separate to smooth and separate is known as critcal bandwidth. the ear can be damaged by exposure to excessive levels of sound or noise.

Protecting your hearing Hearing loss is insidious and permanent and by the time it is measurable it is too late. We have seen that a crucial part of our ability to hear and analyse sounds is our ability to separate out the sounds into distinct frequency bands. Tinnitus is the name given to a condition in which the cochlea spontaneously generates noise. 24 . However. The ear is most sensitive at the first resonance of the ear canal. The second effect is a reduction in the hearing sensitivity. This fact is recognised by legislation which requires that the noise exposure of workers be less than this limit. and this is the frequency at which most hearing damage first shows up. • Another related effect due to damage to the hair cells is noise-induced tinnitus. 8. This enhancement mechanism is very easily damaged. Although 90 dB(SPL) is taken as a damage threshold if the noise exposure causes ringing in the ears. In noise-induced tinnitus exposure to loud noise triggers this. however. this may be due to other factors as well. This effect is more insidious because the effect is less easy to measure and perceive. Interestingly it may well make musical sounds which were consonant more dissonant because of the presence of more than one frequency harmonic in a critical band. it may be that damage may be occurring even if the sound level is less than 90 dB(SPL). because the enhancement mechanism also increases the amplitude sensitivity of the ear. or a mixture of the two. it appears to be more sensitive to excessive noise than the main transduction system. or about 4 kHz. These bands are very narrow. random noises. Because the damage is caused by excessive noise exposure it is more likely at the frequencies at which the acoustic level at the ear is enhanced. This distinctive pattern is evidence that the hearing loss measured is due to noise exposure rather than some other condition. and as well as being disturbing. Hearing damage in this region is usually referred to as an audiometric notch. especially if the ringing lasts longer than the length of exposure. at the ear. which can be tonal. The first strategy is to avoid exposure to excess noises. that exposure to noises with amplitudes of greater than 90 dB(SPL) will cause permanent hearing damage. our ability to separate out the different components of the sound is impaired. called critical bands. Therefore in order to protect hearing sensitivity and acuity one must be proactive. The effect of the damage though is not just to reduce the threshold but also to increase the bandwidth of our acoustic filters. it manifests itself as a difficulty in interpreting sounds rather than a mere reduction in their perceived level. and this will reduce our ability to understand speech or separate out desired sound from competing noise. There is strong evidence. etc. Note that if the work environment has a noise level of greater than this then hearing protection of a sufficient standard should be used to bring the noise level. How much noise exposure is acceptable? There is some evidence that the normal noise in Western society has some long-term effects because measurements on the hearing of other cultures have shown that there is a much lower threshold of hearing at a given age compared with Westerners. such as the inevitable high-frequency loss due to ageing.SCHOOL OF AUDIO ENGINEERING AE01– Fundamentals of Sound Student Notes This is a more subtle effect but in many ways is more severe than the first effect. This has two main effects: • Firstly. 
there is some evidence that people who suffer from this complaint may be more sensitive to noise induced hearing damage. for example. the level of pollution. Their narrowness is due to an active mechanism of positive feedback in the cochlea which enhances the standing wave effects mentioned earlier.
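The 90 dB(SPL) figure above is a limit on level, but damage also depends on how long the exposure lasts. As a hedged illustration of that idea, the sketch below uses the equal-energy rule of thumb found in many noise regulations (each 3 dB increase halves the allowable exposure time, referenced to 8 hours); the 3 dB exchange rate and 8-hour reference are conventions assumed here, not figures taken from these notes:

    def allowed_hours(level_db, reference_db=90.0, reference_hours=8.0, exchange_rate_db=3.0):
        """Equal-energy rule of thumb: each +3 dB halves the allowable exposure time."""
        return reference_hours / (2 ** ((level_db - reference_db) / exchange_rate_db))

    for level in (90, 93, 96, 99, 102):
        print(level, "dB ->", round(allowed_hours(level), 2), "hours")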

9. Perception of sound source direction

How do we perceive the direction that a sound arrives from? The answer is that we make use of our two ears, but how? Because our two ears are separated by our head, the head has an acoustic effect which is a function of the direction of the sound. There are two effects of this separation of our ears on the sound wave: firstly the sounds arrive at different times, and secondly they have different intensities. These two effects are quite different, so let us consider them in turn.

9.1 Interaural time difference (ITD)

Because the ears are separated by about 18 cm, there will be a time difference between the sound arriving at the ear nearest the source and at the one further away. If the sound is directly in front, or behind, or anywhere on the median plane, the sound will arrive at both ears simultaneously. When the sound is off to the left, the left ear will receive it first, and when it is off to the right, the right ear will hear it first. The time difference between the two ears will depend on the difference in the lengths of the paths the two sounds have to travel.
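A sketch of this geometry in Python. The spherical-head (Woodworth) approximation ITD = (r/c)(theta + sin theta) is an assumption on our part, chosen because it reproduces the roughly 670-680 microsecond maximum quoted below for an 18 cm head; it is not a formula given in these notes:

    import math

    HEAD_RADIUS = 0.09       # half of the ~18 cm ear spacing, in metres
    SPEED_OF_SOUND = 342.0   # m/s, as used earlier in these notes

    def itd_seconds(azimuth_deg):
        """Woodworth spherical-head estimate of the interaural time difference."""
        theta = math.radians(azimuth_deg)          # 0 deg = straight ahead (median plane)
        return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

    for angle in (0, 30, 60, 90):
        print(angle, "deg ->", round(itd_seconds(angle) * 1e6), "microseconds")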

For a head whose spacing between the ears is about 18 cm, the maximum ITD occurs at 90 degrees and is about 6.73 x 10^-4 s (673 µs). The ITD increases as the source moves away from the median plane, and the delay that a sound wave experiences is therefore a function of its direction of arrival.

There is also a frequency limit to the way in which sound direction can be resolved by the ear in this manner. This is because the ear appears to use the phase shift in the wave caused by the interaural time difference to resolve the direction. When the phase shift is greater than 180 degrees there is an unresolvable ambiguity in the direction, because there are two possible angles (one to the left and one to the right) that could cause such a phase shift. This sets a maximum frequency, at a particular angle, for this method of sound localisation: for sounds at 90 degrees that maximum frequency is 743 Hz, and because the time difference is smaller at smaller angles, the ambiguous frequency limit is higher at smaller angles. Note that there is no difference in the delay between front and back positions at the same angle; this means that we must use different mechanisms and strategies to differentiate between front and back sounds.

9.2 Interaural intensity difference (IID)

The other cue used to detect the direction of a sound is the difference in intensity at each ear, which results from the shading effect of the head. The levels at the two ears are equal when the sound source is on the median plane; as the source moves away from the median plane, the level at one ear progressively reduces while it increases at the other. The level reduces in the ear that is further away from the source, by a small but significant amount. There is a minimum frequency below which the effect of intensity is less useful for localisation, corresponding to the point where the head is about one third of a wavelength in size (1/3 λ); this corresponds to a minimum frequency of about 637 Hz.

Thus the interaural intensity difference is a cue for direction at high frequencies, whereas the interaural time difference is a cue for direction at low frequencies. Note that the cross-over between the two techniques starts at about 700 Hz and would be complete at about four times this frequency, at 2.8 kHz. In between these two frequencies the ability of our ears to resolve direction is not as good as at other frequencies.

9.3 Pinnae and head movement effects

The above models of directional hearing do not explain how we resolve front-to-back ambiguities or the elevation of a source. There are in fact two ways in which human beings perform these tasks.

The first is to use the effect of our ears (pinnae) on the sounds we receive to resolve the angle and direction of the sound. Sounds striking the pinnae are reflected into the ear canal by the complex set of ridges on the ear. These pinnae reflections are delayed, by a very small but significant amount, and so form comb-filter interference effects on the sound the ear receives. The delays are very small, so these effects occur at high audio frequencies, typically above 5 kHz. The delay a sound wave experiences is a function of its direction of arrival, in all three dimensions, and we can use these cues to help resolve the ambiguities in direction that are not resolved by the main directional hearing mechanisms.

There is also an effect due to the fact that the headphones also do not model the effect of the head. means of resolving directional ambiguities is to move our heads. The effect is also person specific. concert halls and sound reinforcement systems. Because the sound source tracks our head movement it cannot be outside and hence must be in the head. We also find that if we hear sound recorded through other people’s ears that we have a poorer ability to localise the sound.SCHOOL OF AUDIO ENGINEERING AE01– Fundamentals of Sound Student Notes that are not resolved by the main directional hearing mechanism. 10. 10. we move our head towards the sound and may even attempt to place it in front of us in the normal direction. just educated ears. microphones and mixing. However.music. if they arrive after 30 ms they will be perceived as echoes. A person develops his or her awareness of sound through years of education and practice. When we hear a sound that we wish to attend to. 9. The delays are very small and so these effects occur at high audio frequencies. because the interference patterns are not the same as those for our ears. we perceive the sound as coming from the acoustic source. where all the delays and intensities will be the same. Ear Training The basic requirement of a creative sound engineer is to be able to listen well and analyse what they hear. unless the level of sound from the speakers is very high. We have to constantly work at training our ears by developing good listening habits. Experiments with headphone listening which correctly model the head and keep the source direction constant as the head moves give a much more convincing illusion. The act of moving our head will change the direction of the sound arrival and this change of direction will depend on the sound source position relative to us. In essence it is important to ensure that the first reflections arrive at the audience earlier than 30 ms to avoid them being perceived as echoes. because of this effect. and powerful. as we all have differently shaped ears and learn these cues as we grow up. This movement cue is one of the reasons that we perceive the sound from headphones as being ‘in the head’.1 Listening to Music 27 . by cutting very long hair short for example. In sound reinforcement systems the output of the speakers will often be delayed with respect to their acoustic sound but. we can concentrate our ear training around three basic practices . or resolve its direction. The second. typically above 5kHz. Thus we get confused for a while when we change our acoustic head shape radically. In fact it seems that our preference is for a delay gap of less than 20 ms if the sound of the hall is to be classed as ‘intimate’.4 The Haas effect The effect can be summarised as follows : • • The ear will attend to the direction of the sound that arrives first and will not attend to the reflections providing they arrive within 30 ms of the first sound. There are no golden ears. These results have important implications for studios. As an engineer. The reflections arriving before 30 ms are fused into the perception of the first arrival. Thus a sound from the rear will move in different direction compared to a sound in front of or above the listener.

10. Ear Training

The basic requirement of a creative sound engineer is to be able to listen well and analyse what they hear. There are no golden ears, just educated ears. A person develops his or her awareness of sound through years of education and practice, and we have to constantly work at training our ears by developing good listening habits. As an engineer, we can concentrate our ear training around three basic practices: music, microphones and mixing.

10.1 Listening to Music

Try and dedicate at least half an hour per day to listening to well recorded and mixed acoustic and electric music. The ear can only make good judgements by making comparisons. Listen to different styles of music, and notice how different stereo systems and listening rooms influence the sound of the same piece of music.

Listen to direct-to-two-track mixes and compare them with heavily produced mixes. For heavily produced music, listen for production tricks: panning tricks, production clichés and mix set-ups. Identify the use of different signal processing FX. Notice how instrumentals differ from vocal tracks.

The engineer must learn the true timbral sound of an instrument and its timbral balances, and must be able to identify the timbral nuances and the characteristics of particular instruments. Also attend live music concerts. Listen to an orchestra live: stand in front of each section and hear its overall balance and how it layers with other sections. Learn the structuring of orchestral balance; there can be an ensemble chord created by the string section, the reeds and the brass all working together. Note the basic ensembles used, including complex musical forms.

For small ensemble work, listen to how a rhythm section works together and how bass, drums, percussion, guitar and piano interlock. Learn the structure of various song forms such as verse, chorus, break etc., and how the lead instrument and lead vocals interact with this song structure.

Analyse a musical mix into the various components of the sound stage. Notice the spread of instruments from left to right, front to back, and up and down.

Listen to sound design in a movie or TV show. Notice how the music underscores the action and how the choice of sound effects builds a mood and a soundscape, how tension is built up and how different characters are supported by the sound design. Notice the conventions for scoring for different genres of film and different types of TV.

10.2 Listening with Microphones

Mic placement relative to the instrument can provide totally different timbral colour, e.g. the proximity boost on closely placed cardioid mics. A mic can be positioned to capture just a portion of the frequency spectrum of an instrument to suit a particular "sound" or genre. For example, rock acoustic piano may favour the piano's high end and require close miking near the hammers to accent the percussive attack; a sax may be miked near the top to accent higher notes, or an acoustic guitar across the sound hole for more bass.

The way an engineer mics an instrument is influenced by:

• Type of music
• Type of instrument
• Creative production
• Acoustics of the hall or studio
• Type of mic
• Leakage considerations

Always make A/B comparisons between different mics and different positions.

In the studio, reflections from stands, baffles, walls, floor and ceiling can affect the timbre of instruments. When miking sections, improper miking can also cause timbre changes due to instrument leakage, which can be problematic or can be used to capture an "artistic" modified spectrum. The minimum 3:1 mic spacing rule helps control cross-leakage (a worked example follows the frequency chart below).

Diffuser walls placed around acoustic instruments can provide an openness and a blend of the direct and reflected sound field. Mic placement and the number of diffusers and their placement can greatly enhance the "air" of the instrument.

An engineer should be able to recognise the characteristics of the main frequency bands by ear:

16–160 Hz (extreme lows): felt more than heard. Positive: warmth. Negative: muddiness.
160–250 Hz (bass): carries no stereo information. Positive: fatness. Negative: boominess, boxiness.
250–2,000 Hz (low mid-range): harmonics start to occur. Positive: body. Negative: hornlike quality (500–1,000 Hz), ear fatigue (1–2 kHz).
2,000–4,000 Hz (high mid-range): vocal intelligibility. Positive: definition. Negative: tinny, thin.
4,000–6,000 Hz (presence): loudness and closeness; boosting or cutting helps create closeness or distance. Positive: crispness, closeness. Negative: brash.
6,000–20,000 Hz (highs): Positive: air, spatial information, depth of field, energy. Negative: noise.
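The 3:1 spacing rule mentioned above can be put into rough numbers. The sketch below assumes simple inverse-square (free-field) level fall-off and equally loud sources, so it is an idealisation rather than a prediction for a real room.

# A back-of-envelope look at the 3:1 spacing rule. Free-field inverse-square
# behaviour and equally loud sources are assumed, which is an idealisation of
# a real studio floor.
import math

direct_distance = 0.5   # metres from instrument A to its own mic (assumed)
spacing_factor = 3.0    # the "3" in the 3:1 rule

leak_distance = spacing_factor * direct_distance   # distance from instrument A to the other mic
level_drop_db = 20 * math.log10(leak_distance / direct_distance)
print(f"Leakage of instrument A into the far mic is about {level_drop_db:.1f} dB down")
# about 9.5 dB down, which keeps comb-filter colouration low when the two mics are mixed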

10.3 Listening in Foldback and Mixdown

A balanced cue mix captures the natural blend of the musicians. A singer will back off from the mic if their voice in the headphone mix is too loud, or swallow the mic if the mix is too soft. If reed and brass players can't hear themselves and the rest of the group they tend to overplay. Musicians will aim their instruments at music stands or walls for added reflection to help them overcome a hearing problem, and they will not stay in tune if they cannot hear the backing instruments. Good foldback makes musicians play with each other instead of fighting to be heard.

Control Room Acoustics: The studio control room must be as neutral as possible if we are to judge the accuracy of what we have miked or what we are mixing. For this, the control room must be treated as an entire system that includes:

1. Room acoustics (modal behaviour, absorption, diffusion, isolation)
2. Early reflections: when early, diffused energy fills in the time delay gap and enlarges the perceived size and depth of the listening environment
3. Shell stiffness and mass
4. Mixing areas
5. Loudspeakers
6. Loudspeaker decoupling
7. Loudspeaker placement referenced to the mix position
8. System gain structure
9. Electronics
10. Grounding
11. Mechanical noise (air con, equipment etc.)
12. Equipment and cabinet placement

Dimensional Mixing: The final 10% of a mix picture is spatial placement and layering of instruments or sounds. Dimensional mixing encompasses timbral balancing and layering of spectral content and effects with the basic instrumentation. Always think of sound in dimensional space: left/right, front/back, up/down.

Think of a mix in three levels:

Level A: 0 to 1 metre
Level B: 1 to 6 metres
Level C: 6 metres and further

Instruments which are tracked in the studio are all recorded at roughly the same level (SOL) and are often close miked. If an instrument is to stand further back in the mix it has to change in volume and frequency. Level A instruments will be lead and solo instruments; their dynamics must be kept relatively stable so that their position does not change. Most instruments remain on Level B so you can hear them all the time. Level C can be background instruments, sounds and reverb, which are felt rather than heard, or loud instruments drifting in the background.

Every effort should be made during mixdown to listen to the mix on both near-field and big monitors. Listen in mono. Compare with mixes in the same production style. Work at around 85 dB SPL, but listen at many levels for brief periods of time. Take rests and don't let your ears get fatigued.

AE02 – Music Theory I

1. Basic concepts of Music Notation
1.1 The Stave (Staff)
1.2 Clefs

Rudiments
1. Tones and semitones
1.1 Sharp
1.2 Flat
1.3 Natural

2. Scales
2.1 Writing an F-Major Scale in the treble clef

3. Key signatures
3.1 Key Signature hints

Accidentals

4. Tonic
4.1 Triad

5. Rhythms and Beat Notation
5.1 Pulse or beat
5.2 Dotted Notes
5.3 Rests

6. Time signatures
6.1 Simple Time Signatures
6.2 Compound Time Signature

7. Bar and bar line

8. Additional notes
8.1 Double-Sharps & Double-Flats

9. Modes


AE02 – MUSIC THEORY I

Introduction

Music theory is fundamental to learning about music. It provides a platform from which sounds can be analysed both from the engineering perspective and from the musical perspective. This study in music theory provides you with the right knowledge to analyse and appreciate a wide variety of music styles, though from the western perspective. It is a requirement for the Diploma in Audio Engineering but is also useful outside the diploma. Music theory is a lot of fun when you become more familiar with it.

Music is made up of three basic components:

• Pitch
• Harmony
• Rhythm

These three components are represented on paper by staff lines, notes and rests (notes and rests will be discussed in later lessons).

1. Basic concepts of Music Notation

Music notation is done using certain fixed tools.

1.1 The Stave (Staff)

Music in the "western" culture is written on five lines and four spaces. This collection of lines and spaces is called a staff. The lines and spaces have been numbered for easy reference; the numbering also shows that the lines and spaces are counted from the bottom. In normal music writing, these numbers will not be there. The staff lines represent musical pitches. The actual pitch represented by a staff line or space depends on what clef is at the beginning of the staff and the key signature (more on key signatures later).

1.2 Clefs

There are various types of clefs; the two most common ones are:

35 . loop back down until you have cut the 1st line. Keep moving up until you crossed the 5th line after which. Usually the two dots are added at the side of the clef. Loop back down and stop between the 1st and 2nd line. A treble clef and a bass clef joined together form what is called a ‘grand staff’.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes “treble” clef “bass” clef Practice drawing some on your own: Start drawing the treble clef from between the 1st and 2nd line and move down to form a loop. Start drawing the bass clef on the 4th line and loop up to form the curve at the 5th line.

Rudiments

Note Naming, Staff Lines, Clefs and the Keyboard

Musical note names are taken from the first seven letters of the English alphabet: A, B, C, D, E, F and G. They then repeat themselves in a cycle; you can refer back to the keyboard below to understand the sequence, and you should be able to see the pattern of repetition after observing the way the notes are arranged.

C (an octave lower) – Middle C – C (an octave higher)

The 'middle C' indicated on the keyboard is the 'C' that corresponds directly to the 'C' in the space between the treble and bass clefs. 'Octave' simply means that an octave of middle C is the next C after you have gone through the 7 letters associated with musical notes (C, D, E, F, G, A, B, C, which is 7 notes apart). The same idea can be applied to all notes on the keyboard. Refer back to the staff lines above: the notes indicated on the grand staff are the consecutive white notes on the keyboard.

The naming of the black notes requires that you understand what sharps, flats and semitones are.

1. Tones and semitones

A tone is the distance between notes that are spaced exactly one note name apart, i.e. A to B, C to D, or A back down to G; as long as two notes differ by exactly one note name, they are a tone apart. The special cases are B to C and E to F, which are only a semitone apart. A semitone means that the notes are half a tone higher or lower, i.e. A to A# (A sharp), C to C#, or A back down to A flat. Sharps and flats are a semitone from their natural note.

In the case of Cb. A to Ab or C to Cb.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes 1. That is. The same is true for all notes. the flat of that particular note is the black note on the left. It will help you to clearly visualize the entire process. Please note that any reference to ‘tone’. Like A. (‘Step’ means by tone or by semitone) A major scale proceeds by following a certain pattern of tones and semitones. A etc. Scales A scale is a series of notes that proceed up or down by step. We’ll go through the process of writing a major scale step by step (no pun intended). 2. Then place an F on the staff. 2. we can see from the keyboard that Cb is not a black note. ascending.1 Writing an F-Major Scale in the treble clef STEP 1: Draw a treble clef on a staff.3 Natural It is the ‘original’ note.1 Sharp A sharp [#] is a note that is higher than the original note. B. We are going to write an F-major scale in the treble clef. the ‘F above middle ‘C’. means a ‘whole tone’. A to A# or C to C# or any other that is on the keyboard. Understanding scales depends on your knowledge of tones and semitones. But we’ll get to that in a moment. It can be seen that the flat and the sharp are considered to be shared. This is one of the special cases. using quarter notes. 1. As for the rest.2 Flat A flat [b] is a note that is lower than the original note. and you’ll see that writing scales is actually a fairly simple process I would recommend getting a piece of staff paper and writing out the steps as you see them demonstrated here for you. C …. It is usually the black note that is on the right of that particular note. Cb is also equivalent to B. Make certain that you fully understand the difference between tones and semitones. 1. It can also be noted that E# is also equivalent to F and B# is a C. 37 .G. G# and Ab is the same note. That is.

the ‘G’ and ‘A’. No problem. S=Semitone). You will find that in this scale. ascending for one octave Remember. ‘F’ and ‘G’. Major scales follow a certain pattern of tones and semitones. we can use accidentals (sharps and flats) to make them conform. Therefore. checking each interval between all notes in the scale. and we can go on. Here is the complete correct F major scale: 38 . tone. so that conforms to the second interval requirement. The distance between these two notes is a whole tone. On we go! Our next notes to examine are the 3rd and 4th notes. Here’s what we’ve got so far: We show whole tones with a square bracket and semitones with a slur (curve).SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes STEP 2: Write a note on each line and space. any note above the middle line ‘B’ should point its stem downward. Just keep going.T] (T=tone. STEP 3: You’ve now written a scale. and now it’s a semitone. But our major-scale pattern says that there should only be a semitone between these two notes.T] T T T [S. Now let us look at the 2nd and 3rd notes. We start by looking at the first two notes. the A’ and ‘B’. but not necessarily a major scale. What is the distance between these two notes? It is a whole tone. We will just lower the B to a B-flat. is correct. the first interval in the pattern. the B-flat is the only accidental that we have to use. If they don’t. We now have to examine the intervals between each and every note to see that they conform to this pattern. The ‘B’ itself can go either way. ‘Tone’. This forms a whole tone. any note below the middle line B should point its stem upward.T T [S.

An F-major scale, as you can see, has one flat. It is the only major scale that has one flat. A major scale can be represented in tonic solfa by the common Do Re Mi Fa So La Ti Do: F = Do, G = Re, A = Mi, B-flat = Fa, etc.

For practice, try writing an A-major scale in the bass clef. Just go back to Step 1 and start on an 'A'. Make sure that you write your scale using the process mentioned above: start with one octave of notes, and then make your adjustments if necessary. In the next lesson, you'll learn how to make a proper key signature from the accidentals that are used.

3. Key signatures

We've all seen key signatures: they're the collection of sharps or flats at the beginning of each staff. We also know what they mean. When we see the following key signature:

we know that every B, E and A will be flat, unless cancelled out temporarily by an accidental.

In the previous lesson's test, you were asked to write an A-flat major scale. If you did your job properly, it should have looked like this:

Remember, the square brackets represent whole tones, the rounded ones represent semitones.

All the different major scales use their own set of accidentals. In the case of the A-flat major scale above, you used: A-flat, B-flat, D-flat and E-flat. Now how do we convert those accidentals into a key signature? Take a look at the scale and write down all of the accidentals you used. Now we need to know what order to write them down in a key signature. For that, we have a nifty little rhyme:

Battle Ends And Down Goes Charles Father

The first letter of each word in this sentence tells us the order in which the flats are entered in a key signature: first the 'B', then the 'E', the 'A', and finally the 'D'. It looks like this, in both clefs of the Grand Staff:
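Before moving on to the key-signature hints, the scale-building procedure itself can be sketched in a few lines of code. This is only an illustration of the method described above (the note-spelling trick with letter names and pitch offsets is my own, not something defined in these notes); running it lists each major scale together with the accidentals its key signature would collect.

# A small sketch of the scale-building procedure: walk up the letter names and
# add whatever accidental is needed to fit the major-scale pattern T T S T T T S.
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
LETTERS = "CDEFGAB"
STEPS = [2, 2, 1, 2, 2, 2, 1]          # tone = 2 semitones, semitone = 1

def major_scale(tonic):
    letter, accidental = tonic[0], tonic[1:]          # e.g. "Ab" -> "A", "b"
    pitch = NATURAL[letter] + accidental.count("#") - accidental.count("b")
    scale = [tonic]
    for step in STEPS:
        pitch = (pitch + step) % 12
        letter = LETTERS[(LETTERS.index(letter) + 1) % 7]
        offset = (pitch - NATURAL[letter]) % 12
        offset = offset - 12 if offset > 6 else offset   # map to the nearest accidental
        scale.append(letter + ("#" * offset if offset > 0 else "b" * -offset))
    return scale

for key in ("F", "Ab", "A"):
    notes = major_scale(key)
    accidentals = sorted({n for n in notes if len(n) > 1})
    print(key, "major:", " ".join(notes), "| accidentals:", accidentals or "none")

For F major this prints the single B-flat, and for A-flat major it prints the four flats listed above, which is exactly the set the key signature collects.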

A key signature that uses all seven possible flats will look like this:

The neat thing about the "Battle Ends And Down Goes Charles Father" rhyme is that reversing the order of the rhyme gives us the order of sharps in a key signature: Father - Charles - Goes - Down - And - Ends - Battle. A key signature that uses all seven possible sharps will look like this:

3.1 Key Signature hints

There are some little "tricks" that can help you know which major key belongs to which key signature. Consider this key signature:

You might think this is a rather complicated one to start with, but in fact it's quite easy if you remember this rhyme:

When sharps you see, the last is 'Ti'.

'Ti', of course, is the solfa name for the seventh note of the scale, the 'leading tone'. (You will learn more about these technical names in a later lesson.) The last sharp indicated above is the B#. If that's the seventh note, we know that the next note will be the key-note, and it will be one diatonic semitone higher. Therefore, this key signature belongs to C# major.

Consider this key signature.

Now remember this little rhyme:

When flats there are, the last is 'Fa'.

'Fa' is the solfa name for the fourth note of the scale. The last flat indicated above is the F-flat. If that's the fourth note, we know that the key-note will be four notes lower. Counting down four notes in this key signature, we hit 'C-flat'. Therefore, this key signature belongs to C-flat major.
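If you want to sanity-check the two rhymes for every possible key signature, a simple lookup does the job. The sketch below is only an illustration; the order-of-sharps and order-of-flats lists mirror the "Battle Ends..." rhyme and its reversal from the previous pages.

# A quick check of the two rhymes above, using lookup tables.
ORDER_OF_SHARPS = ["F#", "C#", "G#", "D#", "A#", "E#", "B#"]
ORDER_OF_FLATS  = ["Bb", "Eb", "Ab", "Db", "Gb", "Cb", "Fb"]

SHARP_KEYS = ["G", "D", "A", "E", "B", "F#", "C#"]      # keys with 1..7 sharps
FLAT_KEYS  = ["F", "Bb", "Eb", "Ab", "Db", "Gb", "Cb"]  # keys with 1..7 flats

for n in range(1, 8):
    last_sharp = ORDER_OF_SHARPS[n - 1]
    last_flat = ORDER_OF_FLATS[n - 1]
    # "When sharps you see, the last is Ti": the key-note sits a diatonic
    # semitone above the last sharp, which is what the table reproduces.
    print(f"{n} sharps: last sharp {last_sharp:>2} -> {SHARP_KEYS[n-1]} major;",
          f"{n} flats: last flat {last_flat} -> {FLAT_KEYS[n-1]} major")

The last two rows reproduce the two examples worked above: seven sharps (last sharp B#) gives C# major, and seven flats (last flat F-flat) gives C-flat major.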

2 Accidentals An accidental is a sharp or flat symbol placed in the music that does not normally belong to the given key. When we speak of a note in a scale. and identify’ those chords by the technical name.e. we can refer to it by its number: ‘G is note number 1 of a G-major scale). It would definitely be to your own benefit. This is because tonic and dominant chords form the basic backbone of much of what we call ‘tonal music.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes 3. By this time you should be familiar with sharps and flats and natural notations. Tonic As you know. (i. It would be good to also familiarize yourself with the position of each note in relation with the keyboard. 4. a tonic chord) The notes in the major scale are given technical names according to where their numerical position on the scale.) A technical name not only identifies a note. or by its technical name: ‘G’ is the tonic note in a G-major scale. we can build chords on all of the various notes in a scale. every scale degree has a technical name. but can also give us information as to the function of a note within a scale. Furthermore. First we need to learn a couple of important definitions: 42 . • • • • • • The First is the Tonic The Second is the Supertonic The Third is the Mediant The Fourth is the Subdominant The Fifth is the Dominant The Sixth is the Submediant The Seventh is the Leading tone • We are only going to deal with tonic and dominant chords.

4.1 Triad

A chord is the simultaneous sounding of three or more notes; a chord can be any three or more notes played together, but a triad has a particular structure. A triad is a three-note chord in which one note is identified as the root, another as the 3rd and the other as the 5th. (The numbers 3rd and 5th refer to the intervals above the bottom note.) Any chord in this structure (root-3rd-5th) is called a triad.

If we are in the key of A-major, this would be the tonic note:

If we build a triad on top of this note, it would look like this:

This is a three-note chord in which the bottom note is acting as the root, the middle one is the 3rd, and the top note is the 5th. We say that this is a tonic triad because it is a triad that has been built on the tonic note of the key we're in. It is traditional to indicate the triad by using a Roman numeral, so since we have just built a triad on the first note of the scale, we place the Roman numeral for '1' underneath it:

The procedure we just followed to create a tonic triad is the same for any key. Here are several keys, with tonic triads:

(It is traditional in most schools of theory to indicate major triads with an upper-case 'I', and minor triads with a lower-case 'i'.) These are tonic triads because they are chords built on the tonic note; they are triads because the structure of each chord is 1-3-5 (root-3rd-5th).

Dominant triads are built in a similar fashion to tonic triads. In other words, simply go to the dominant note of the scale and build a 1-3-5 triad. Let's take a good look at the structure of a dominant triad. Note this one, in D-major:

We put the number 'V' underneath it because it is a triad that has been built on the fifth note of the scale, and because the fifth note is the dominant note, it is called a dominant triad. Take a look again at the V-chord above. You will see that the bottom note is the dominant note of the key. The middle note is the leading tone of the key (i.e. C# is the leading tone in D-major). This is important: in a dominant triad there is always that leading tone, the middle note that "wants" to move up to the tonic. That is what gives dominant chords their important place in traditional harmony: they help define the tonic chord in that manner.

Dominant chords must always have the leading tone present, no matter what key you write them in. Furthermore, a leading tone is always a semitone below the tonic. But look at this V-chord in A-minor:

You can see that the leading tone in this triad (the middle note) is a whole tone away from 'A'. So we have to raise the 'G' to become 'G#':

The G# is called an accidental. The simple way to remember this is to remember this rule: "All dominant chords must be major."

IMPORTANT: Dominant triads must always be major, whether you are in a major key or a minor key. If you are in a minor key, you must raise the third (middle note) of the chord to make it major.

Here are some more dominant triads, in various keys:

The V-chords in the minor keys above had their middle notes (the 3rd) raised by using an accidental in order to create a leading tone to the tonic. For example, the second chord has an E# because E# is the leading tone for the tonic (F#).
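The 1-3-5 stacking and the "raise the third in minor" rule can be expressed very compactly. The sketch below is illustrative only: the scales are typed in by hand, and raising a note by simply appending '#' is a simplification that happens to work for these examples.

# A sketch of the 1-3-5 stacking described above, using plain note lists.
D_MAJOR = ["D", "E", "F#", "G", "A", "B", "C#"]
A_MINOR = ["A", "B", "C", "D", "E", "F", "G"]   # natural minor

def triad_on(scale, degree):
    """Stack root, 3rd and 5th starting from the given scale degree (1-based)."""
    i = degree - 1
    return [scale[(i + step) % 7] for step in (0, 2, 4)]

def dominant_triad(scale, minor=False):
    chord = triad_on(scale, 5)
    if minor:
        # In a minor key the middle note must be raised a semitone so that the
        # chord contains a leading tone ("all dominant chords must be major").
        chord[1] = chord[1] + "#"
    return chord

print("D major - tonic I :", triad_on(D_MAJOR, 1), " dominant V:", dominant_triad(D_MAJOR))
print("A minor - tonic i :", triad_on(A_MINOR, 1), " dominant V:", dominant_triad(A_MINOR, minor=True))

For A minor this prints the V-chord E, G#, B, with the G raised exactly as described above.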

. These are called half notes: You can tell with this diagram that it takes two half notes to make a whole note.. We need notes of shorter duration. forever! However. let’s just do one more for now. Rhythms and Beat Notation In music notation. theoretically. Composers need a way of indicating to performers how long to hold each note. 5. You can see the relationships between note lengths very clearly: Here is an equation that should make sense to you. If we divide the line into two equal parts. This diagram is showing that one whole note takes up the entire line. We could keep going. called eighth notes. They look like quarter notes with flags: So eight eighths equals one whole. By making each note look a little different. The next smaller note value is called a quarter note: It takes four quarters to make a whole note. 45 . Let us keep going. and all others in between. Here are notes of even shorter value.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes 5. it has been placed there to help you visualize its length. this can be easily communicated. Here is a whole note. you can tell that it takes two quarter notes to make one half note. of course. a note you’ve probably seen before. There are long notes and short ones. Also. Let’s look at all the diagrams placed together. the beat or pulse and therefore the rhythm of the music is represented by fixed kinds notes and time signatures.1 Pulse or beat Musical notes are not all held for the same duration. sitting on a line: The whole note is not normally found sitting on a line like this. It also equals two halves. It also. a whole note would be too big to fit in it.

an eighth note By adding another flag. you add half of its value to the note. two eighth notes and one quarter note. is three beats long. Here is a dotted half note: It is one half note plus half of a half note (one quarter). it would look like a quarter note .a sixteenth note: 46 . Remember the eighth note? Without the flag.2 Dotted Notes You know that in many time signatures a quarter note equals one beat. When you add a dot to a note. A dotted quarter note looks like this: The dot makes the note half again as long as a quarter note. It is just the same as the following arithmetic equation: 5. adding a flag to a note makes a note half as long. but take your time and figure it out: if you add together the lengths of one half note. you will get one whole note. Here’s another one: This may look a little complicated. therefore. By adding the flag it becomes a note of half that value . it becomes half as long as an eighth note . What is half of one? If you add that to the quarter.one beat.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes It shows that two quarter notes equal one half note in length. you get a note that is 1 and a 1/2 beats long. Similarly. A dotted half note.
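The arithmetic of flags and dots is easy to check numerically. The sketch below assumes the usual convention that a quarter note gets one beat; the Fraction type simply keeps the halves and quarters exact.

# Note lengths as fractions of a whole note, and the effect of a dot. Purely a
# numeric restatement of the relationships described above.
from fractions import Fraction

WHOLE = Fraction(1)
values = {
    "whole":     WHOLE,
    "half":      WHOLE / 2,
    "quarter":   WHOLE / 4,
    "eighth":    WHOLE / 8,
    "sixteenth": WHOLE / 16,
}

def dotted(value):
    return value + value / 2       # a dot adds half of the note's own value

for name, value in values.items():
    beats = value * 4              # beats if a quarter note gets one beat
    print(f"{name:9}: {beats} beat(s); dotted {name}: {dotted(value) * 4} beat(s)")

This reproduces the statements above: a dotted half is 3 beats, and a dotted quarter is 1 and 1/2 beats.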

The half note gets two beats. For example. the stem may point either upward or downward. there is a corresponding rest of the same length. the whole note is a note that gets four beats. it looks like a small black rectangle that hangs from the fourth line. Same thing for sixteenths: Using the beam in place of the flags simply makes it look a little “tidier” and a little easier for a performer to read. It hangs from that line no matter which clef you use. Concerning the direction of stems. How many sixteenth notes does it take to make one half note? Eight. The whole rest also gets four beats: As you can see. It takes four sixteenth notes to equal one quarter note. One whole note? Sixteen. Many times when two or more eighth notes are written side-by-side.3 Rests For every note. the flag is replaced with a beam: These two beams eighths are exactly the same as if the writer had written. and so does the half rest: Here are the “rest” of the rests! The quarter rest (I beat): 47 . it is important to know that sometimes stems can point upward as in the examples above. if the note is above the middle line of the staff: If the note is on the middle line.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes It takes two sixteenth notes to equal one eighth note. But stems can also point downward. 5. It also indicates beat duration.

to indicate how many beats are in each bar. Time signatures Writers of music have a convenient way of putting music into “sections” or “compartments” that make it visually easy to follow. Simply stated. with simple time signatures for one. notice in bar 2 that the eighth notes have been beamed together in groups of two. 6. 48 .SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes The eighth rest (0. This is stated directly. The compartments have been discussed before we call them “measures” or “bars” Take a look at most printed music. and you’ll see this very clear1y. a time signature consists of two numbers.25 beat): 6. The writer is showing us that the quarter note gets the beat.1 Simple Time Signatures Simple time signatures tell us two things immediately: HOW MANY beats are in each bar? What kind of note gets the beat. one being written above the other. Simple! (Guess that’s why they call it a simple time signature?) Also. and The ‘4’ tells us that each beat is one quarter note long.5 beat): The sixteenth rest (0. Study the following: The time signature tells us two things: The ‘2’ tells us that there are 2 beats in every bar. That’s because two eighth notes together are one quarter note in length. Our first task is to discover the differences between simple and compound time. You’ll also notice at the beginning of each piece of music a time signature. Measures are separated from each other by “bar lines”. or indirectly. with compound time signatures for example.

) th bar of music is the dotted quarter.2 Compound Time Signature Unlike simple time signatures.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes Here’s the same excerpt with the beats shown above the music: If we were to count along with the excerpt as it is played. We therefore need to take the eighth notes and “condense” them to discover what the beat is. The top number is not divisible by ‘3’ (Except for time signatures with a ‘3’ on top. compound time signatures do not directly show us the number of beats per bar. shown underneath: This excerpt shows four things that describe all simple time signatures: 1. 3. each beat can be “subdivided” into two parts. it is possible to apply dotted quarter note beats. 2. Here is the same excerpt with the subdivision. 2. which are often simple time signatures!) 4. 2” etc. You can see that In other words. The beat is an un-dotted note. 6. 1. they show us the number of breakdown notes per bar. the beat in a by going through the two bars of the excerpt. we can see that the writer has beamed the first three eighth notes together. Simple time signatures show the number of beats in every bar. Instead. 1. Study the following: In this excerpt. Condensing the three eighths down to one note gives us a dotted quarter. Each beat is subdivided into two components. 2. (1/8 + 1/8 th th + 1/8 = 1 dotted quarter note. Here’s what it looks like: 49 . that’s why they were beamed together. we would say “1. The subdivision or breakdown of a beat is its number of components. In simple time signatures. or breakdown. The writer is showing that the first three eighths form one beat.

Look at bar 1. The breakdown notes are EIGHTH notes. Looks like the quarter note may be the beat unit in this excerpt. we can tell that this is a simple time signature: The beat is an un-dotted note. though simple time beats break down into two parts. So. Armed with that knowledge. (Except for time signatures with a ‘3’ on top. Each beat is subdivided into three components. the time signature is six eights. Compound time signatures show the number of breakdown notes in every bar. Notice that the eighth notes are beamed together in groups of two. we can break down each beat into beat subdivisions. 50 . here are the four things that describe compound time signatures: 1. 4. The top number is evenly divisible by ‘3’. compound time beats break down into three parts: You can see that each bar has SIX breakdown notes. However. 3. The beat is a dotted note. you should be able to say what time signature the following excerpt is in: So let’s study it. Therefore. Each one of those eighth note pairs can “condense down” to form one quarter note. Can we apply a quarter note beat pattern to the whole excerpt? Absolutely! This is what it would look like: Since applying quarter notes as a beat unit seem to work. 2.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes Just like with simple time signatures.
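The four-point checklists for simple and compound time can be collapsed into one small rule of thumb. The sketch below is an illustration only; like the text, it treats signatures with a '3' on top as simple.

# A compact restatement of the simple/compound test described above, using the
# two numbers of the time signature.
def describe(top, bottom):
    if top > 3 and top % 3 == 0:
        # compound: the top number counts breakdown notes, three per (dotted) beat
        beats = top // 3
        return f"{top}/{bottom}: compound time, {beats} dotted beats per bar"
    else:
        # simple: the top number counts beats directly
        return f"{top}/{bottom}: simple time, {top} beats per bar (1/{bottom} note gets the beat)"

for signature in ((2, 4), (3, 4), (4, 4), (6, 8), (9, 8), (12, 8)):
    print(describe(*signature))

Six-eight therefore comes out as two dotted-quarter beats per bar, which is exactly the condensing exercise worked through above.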

we get one note which is a dotted quarter in length. Therefore the time signature is 7. the beat breaks down into three parts: How many breakdown notes in each bar? Nine. What kind of notes are the breakdown notes? Eighth notes. As this is compound time. and that the quarter note is the unit that ‘gets the beat”.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes Each beat will divide into two components (one quarter note subdivides into two eighth notes). If we condense those three notes down. Since we know that it is simple time. The dotted eighth. (Another word for time signature is meter). the actual time signature should be the same as the number of beats per bar. That leaves the quarter note and the eighth note in the middle. Notice in particular. The eighth rest and two eighth notes at the beginning would certainly be explained in terms of a dotted quarter beat. Bar and bar line Music is often divided up into units called measures or bars. sixteenth. Therefore the numbers of the time signature will reflect the number of breakdown notes in each bar. 51 . the last group of notes at the end of the first bar. so this is a compound time. and eighth note have all been beamed together. So how do we assign this excerpt a time signature? The beat is a dotted note. Let us see if we can apply a dotted quarter beat to the entire excerpt. It appears that perhaps the dotted quarter will be the beat unit in this excerpt. some music is written so that every measure has four beats. The number of beats is determined by the time signature. and that too can fit into the dotted quarter beat pattern. Let us try another one: Look at how the eighth notes are beamed. For example. Each measure has a certain number of beats.

you will eventually learn that the time signatures listed above are called simple time signatures. The bottom four means “quarter note”. because of the time signature at the beginning of the piece.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes In such a piece the time signature would be we say “four fours when we read this time signature. you are probably already familiar with ‘counting” in this manner. in any given time signature. It is necessary. and that each beat is a quarter note: Bar1: 3 quarter notes = 3 beats. The bottom number tells us what kind of note gets the beat. For example. There are things you will eventually need to know about all time signatures. and that the number of beats is the top number of the time signature. What if you were to get a piece of music in which the composer put the time signature at the beginning. Take a look at the following piece of music: This is a piece of music that has been written in three fours. if you count up the number of beats in each bar. the top ‘4’ represents the number of beats per bar: four. but ‘forgot” to draw in the bar lines? 52 . Bar3: 1 half note plus 2 eighth notes = 3 beats. you would find that each bar has three beats. But that’s not necessary right now. That’s obvious. In this time signature. The bottom ‘4 tells us what kind of note gets the beat. All you need to know is that in each of these particular time signatures: • • The top number tells us how many beats. If we were to take the example above and write the count of each bar. it would look like this: If you play a musical instrument. are some of the commonly used time signatures. Bar2: 4 eighth notes plus 1 quarter note = 3 beats. How would we be able to know that the piece was in three fours time? Well. But let’s say that the composer forgot to put a time signature at the beginning. to make sure that each bar has the same number of beats. Bar4: 1 dotted half note = 3 beats.

8. (Our way of showing a note that is the fourth sixteenth past the beat.and.a Two –e. You can see that each bar gets 2 beats. If you come across a piece of music in which the eighth note gets the beat. The second eighth gets a “+” to indicate that its in-between beats one and two. the notes consecutively will be raised or lowered.weak . In bar I.weak” pulsing of the music.a”. when there is an accidental. you would have to raise or lower the accidental back to the natural note. Additional notes Do not forget that in a bar of notes.and .strong . then each eighth note gets a number. The next sixteenth is a “+” because it is one eighth past the beat. Sometimes we have to write the counts into a bar that features syncopation. Here is what it should look like once you have drawn the lines in: Bar 1: 2 eighths plus 1 quarter = 2 beats.1 Double-Sharps & Double-Flats 53 . then draw a bar line.) This funny way of showing the counts makes it easy to say the counts. and each sixteenth gets a “+”: 8. the first eighth gets a “1. The counts have been written in.. because that is what the time signature means. So count two beats. if you saw a bar of music in that had eight sixteenth notes.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes The time signature is two fours. In order to get back the original note. For example. Syncopation occurs when the normal rhythmic stresses in a bar are changed. Notice that each beat gets a number. normally in a piece of music written in one tends to be quite aware of a strong . In bar 2 the first sixteenth gets a “1”. It should work out that every bar gets two beats.. then count another two beats and draw another bar line. The next sixteenth gets an “e” (our way of showing a note that is one sixteenth past the beat). The fourth sixteenth gets an “a”. 4 sixteenths plus1 quarter 2 beats. you would say the count like this: “One -e. etc. Bar 2. For example.

we raise that note by one semitone. Modes A mode is a type of scale. composers wrote in what were called modes. is two semitones lower than ‘A’. Modal melodies can be very beautiful. You have already learned to write major and minor. which means that they both produce the same pitch frequency. lowering a letter name by one semitone can be represented by placing a flat in front of the note: By placing a sharp in front of a note. we lower that note by one semitone. If you were to play it on your instrument.’ is two semitones higher than ‘A’. Such study of modes can get quite in-depth. called ‘a-double-flat’. By placing a flat in front of a note. 54 . and of course we have been using them ever since. There are situations that arise in which we need to raise a note that is already sharp. with composers like Debussy. and is a fascinating field of study. we say that ‘A-double-sharp’ and ‘B’ are enharmonic equivalents. and their study is certainly worthwhile. 9. If you were to play it on your instrument. Therefore. When two notes are ENHARMONICALLY EQUIVALENT. thus creating a doublesharp. There was a resurgence of interest in modes toward the end of the 19th century. you would play a ‘G’. called ‘a-double-sharp. You will see that these situations occur most often in the building of certain minor scales. Music based on major and minor scales came into common usage in the early 1600s. you would play a B.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes As you already know. A double-sharp sign looks like the letter x: This note. A double-flat literally looks like two flat signs side-by-side: This note. Before the 1600s. raising a letter-name by one semitone can be represented by placing a sharp in front of the note: Similarly. A-double-flat and ‘0’ are said to be enharmonic equivalents.

We call this scale the Dorian mode.) Subdominant to subdominant gives us the Lydian mode: Dominant to dominant produces the mixolydian mode: 55 . not by the actual pitches used. we get the Phrygian mode: (The tone-semitone pattern is still that of the C-major scale. it is just that the scale now starts and ends on a ‘D’ instead of a ‘C’. What if you were to take this same C-major scale. or final. we shall only delve into the basic construction of modes so that we can identify and write them. We can start a scale on all the different notes of our C-major scale above. We say that the note ‘D’ is the key note. The first and perhaps most important thing is: A mode is distinguished by the pattern of tone and semitones. It would look like this: It still has the pattern of tones and semitones that belong to C-major. A scale that runs from what appears to be the second degree (supertonic) up to the second degree an octave higher is said to be in the Dorian mode. for our purposes here as a rudimentary music theory course.SCHOOL OF AUDIO ENGINEERING A02– Music Theory I Student Notes However. starting on a middle C and proceeding upward for one octave. The tones and semitones have been indicated. Take a look at this C-major scale. if we write a scale from the mediant to the mediant. For example. started on a D and proceeded upward for one octave. of the mode. but instead of starting on a ‘C’. and you can tell by that tone-semitone pattern that this is indeed a C-major scale.

Submediant to submediant produces the Aeolian mode:

And leading tone to leading tone makes the Locrian mode:

Incidentally, when you write a major scale from the tonic note up to the tonic note, you are also forming a mode, called the Ionian mode. So something in C-major could technically be said to be in C-Ionian, though we more often than not simply call it 'C-major'.
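Since each mode is just the same set of notes starting from a different final, all seven can be generated by rotating the C-major scale. A minimal sketch:

# Modes as rotations of the C-major note set, as described above. The note
# list is typed in by hand; only the starting point changes.
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]
MODE_NAMES = ["Ionian", "Dorian", "Phrygian", "Lydian", "Mixolydian", "Aeolian", "Locrian"]

for degree, name in enumerate(MODE_NAMES):
    rotated = C_MAJOR[degree:] + C_MAJOR[:degree] + [C_MAJOR[degree]]
    print(f"{name:10} (final {C_MAJOR[degree]}): {' '.join(rotated)}")

The Dorian line, for example, prints D E F G A B C D, the scale written out earlier in this section.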

AE03 – Analog Tape Machines

1. The Analogue Tape Recorder (ATR)
2. The Tape Transport
3. Capstan Motors
3.1 Hysteresis Motor
3.2 DC Servo Motor
4. Tape Transport Systems
a. Open-Loop System
b. Closed Loop
c. Zero Loop System

TAPE AND HEAD CONFIGURATIONS
1. Configurations
2. Track Width
3. Tape Speed
4. Recording Channels
5. Input Signal Modes
5.1 Input Mode
5.2 Reproduce Mode
5.3 Sync Mode

PROBLEMS WITH MAGNETIC TAPE
1. Bias Current and Dynamic Range of Magnetic Tape
2. Tape Equalisation
2.1 Record Equalisation
2.2 Playback Equalisation
2.3 Equalisation Standards
3. Head Alignment
3.1 Height
3.2 Azimuth
3.3 Zenith
3.4 Wrap
3.5 Rack
4. Electronic Calibration

TAPE SERVICING
1. Print-Through - Tails out tape storage
2. Cleanliness
3. Degaussing

AE03 – ANALOG TAPE MACHINES

1. The Analogue Tape Recorder (ATR)

Calling a tape recorder analog refers to its ability to transform an electrical input signal into corresponding magnetic energy that is stored in the form of magnetic remanence on the tape.

2. The Tape Transport

The process of recording the audio bandwidth on magnetic tape using a professional reel-to-reel ATR depends on the machine's ability to pass tape across a head path at a constant speed, with a uniform tension. Transport technology is based on the relationship of physical tape length to specific periods of time. During playback the time spectrum must be kept stable by duplicating the precise speed at which the tape was recorded, thus preserving the original pitch, rhythm and duration of the recorded programme.

The mechanism which accomplishes these tasks is called the tape transport system. The transport controls the movement of the tape across the heads at a constant speed and with a uniform tension, and responds to the various transport control buttons to carry out basic operations such as Play/Record, Stop, Pause, Fast Forward and Rewind.

The elements of the transport deck of a Studer A-840 1/4" mastering machine are:

A. Supply reel
B. Take-up reel
C. Capstan
D. Capstan idler
E. Tape guides
F. Tension regulators
G. Tape shuttle control
H. Transport controls and tape timer

In the past, one had to take care not to stop a machine that was in fast shuttle, as this could stretch the tape; instead a rocking procedure was used to slow the tape by cutting between FF and REW. Newer reel-to-reel ATRs employ electronics called Total Transport Logic (TTL), which allows the operator to safely push play whilst rewinding or fast forwarding. TTL uses sensors and an automatic rocking procedure to control tape movement.

A professional multitrack recorder runs at one of two speeds: 15 ips or 30 ips.
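Because transport timing is tied to physical tape length, it is easy to relate reel size, speed and programme time. The figures below (a 2,500 ft reel, typical of a 10.5-inch spool) are assumptions made for the sake of the example, not specifications from these notes.

# The length/time relationship mentioned above, in numbers.
def running_time_minutes(reel_length_feet, speed_ips):
    return reel_length_feet * 12 / speed_ips / 60

for speed in (7.5, 15, 30):
    minutes = running_time_minutes(2500, speed)   # 2500 ft assumed for a 10.5" reel
    print(f"2500 ft of tape at {speed:>4} ips runs for about {minutes:.0f} minutes")

Doubling the tape speed halves the available recording time, which is the practical trade-off behind the choice between 15 ips and 30 ips.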

The EDIT button frees the tension from the supply and take-up reels so that the tape may be moved manually across the head block to determine an edit point.

3. Capstan Motors

The capstan is the most critical element in the transport system. It is the shaft of a rotational motor which is kept at a constant rate of speed. There are two common types of capstan motor.

3.1 Hysteresis Motor

Maintains a constant speed by following the supply voltage frequency from the power line, a stable reference of 50 or 60 Hz.

3.2 DC Servo Motor

Uses motion-sensing feedback circuitry. A notched tachometer disk is mounted directly on the capstan motor shaft. A light is shone through the rotating disk and a sensor counts the number of disc notches per second, registered as light flashes. A resolver compares the actual state of rotation with a standard reference to give a highly accurate and stable capstan speed. This design is now the standard in professional ATRs.

4. Tape Transport Systems

Three methods of transporting tape across a head path:

a. Open-Loop System
Here the tape is squeezed between the capstan and the capstan idler to move the tape. A small amount of torque is applied to the supply reel motor, i.e. in the opposite direction to that of the tape travel. This provides the right amount of tension for tape-to-head contact. A small amount of take-up reel torque helps to spool the tape onto the take-up reel after the headblock.

b. Closed Loop
The tape guide path is isolated from the rest of the transport, with unsupported sections of tape kept to a minimum. The tape is actually pulled out of the headblock at a faster rate than it is allowed to enter. This closed loop minimises the distortions associated with open-loop systems.

c. Zero Loop System
Takes full advantage of TTL logic and DC servo feedback circuitry in a system that does not employ a capstan. Instead the tape is shuttled and kept at the right tension by the interplay of the supply and take-up reel motors. Motion and tape-tension sensors on both sides of the headblock are used to continually monitor and adjust speed and tension.

3. crosstalk is less. at fast speeds. 1” . At faster tape speeds recorded bandwidth is increased. 2 Trk. This allows for increased input level before saturation. 4-trk. These electronic modules enable adjustment to input level. This means you can get more signal on tape before distortion with increases the SNR of the recording. hence these frequencies will not be lost in the gap of the repro head (Scanning Loss). Monitoring of signal levels for each channel is done via a dedicated VU meter. 8. Recording Channels The channel circuitry of an ATR is composed of a number of identical channel modules which corresponds to the number of tracks the machine can record and play back. Reproduce and Erase. Sony) are available in a wide number of track and tape-width configurations. 1/2” . output level. 15ips and 30ips. 1/4” . Tape Speed Directly related to signal level and bandwidth. 1. Professional ATRs (Studer.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes TAPE AND HEAD CONFIGURATIONS. Configurations Number of tracks recorded per width of tape. sync level and equalisation. Common audio production tape speeds are 71/2ips. Fostex) save money by placing more tracks on smaller width tapes: 4Trk. Input Signal Modes The output signal of each ATR module may be switched between three modes: Input.and 16-trk 1/2” 2. an increased amount of magnetism can be retained by the tape. 8-trk. Track Width With greater track width. Each channel must perform the same three functions: Record. 4. More domains pass over the heads at h igh speeds therefore making the average magnetisation greater. 1/4” . 16. 5. Otari.and 24-trk 2” Budget or Semi-pro configurations (Tascam. This is because the wavelengths of high frequencies are longer ie cover more tape. Repro and Sync. A wider track makes the recording less susceptible to signal dropouts. 30ips has professional acceptance for the best frequency response and SNR.When the space between tracks (Guardband) is wider. 5.1 Input Mode 61 . resulting in a higher output signal and an improved SNR.

1 Record Equalisation Magnetic tape has a nonlinear frequency response curve ie recorded low and high frequency responses are not flat which of course implies that in the recording of these 62 . (Inertia Point) and it trails off at saturation. Thus the signal to be recorded is moved away from the nonlinear crossover range and into a linear portion of the curve.3 Sync Mode Sync mode is used during overdubbing on a multitrack ATR. Tape Equalisation Two types of equalisation are applied to the record and playback signals to increase the linearity of the frequency response of magnetic tape. ie it is slow on the takeup around zero. The signal is derived from the playback head. 5. Bias Current and Dynamic Range of Magnetic Tape The development of magnetic flux under the influence of magnetisation is not linear.2 Reproduce Mode. Compensatory measures to extend tape linearity are taken in the application of bias current to the record heads. Setting bias level (amplitude) is crucial to optimising the SNR of the recorded material. This nonlinearity of tape’s magnetic response leads to a restriction in the dynamic range of magnetic tape. the high frequency bias signal is ignored by the repro head and only the input signal is reproduced. On playback. The signal from the repro head is always slightly delayed compared to the record head because of the distance between the 2 heads. 5. PROBLEMS WITH MAGNETIC TAPE 1. 2. Bias current is applied by mixing the incoming audio signal with a high frequency signal of 150-250KHz . A track selected in sync mode will be played back by the record head and will thus appear to be synchronised to the input signals. This signal modulates the amplitude of the input signal to a higher average flux level. This delay interferes with accurate monitoring and makes overdubbing impossible. it will be distorted at the crossover area around the zero point where there is only a very small amount of magnetic flux. When a signal is recorded. Here we wish to record new material whilst monitoring the recorded material off tape. Bias level varies with individual record heads and different types of tapes.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes The output signal is derived from the input section of the module. 2.

2.2 Playback equalization This is the 6dB/octave filter inserted in the playback circuitry to compensate for the doubling of level per octave response of magnetic tape. The height of the record and playback heads must be aligned in relation to the tape path and each other for the full reproduction of the recorded signal.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes frequencies the SNR decreases. In the record stage. 15ips) CCIR/DIN International Radio Consultative Commission and Deutsche Industrie Normen (Europe 15ips AES Audio Engineering Society (30ips) 3. Therefore the levels on tape are unnatural and need to be restored on playback. Head Alignment An important factor affecting recording quality is the magnetic tape head’s physical positioning or alignment.1 Height Determines where the signal will be recorded.A. 3. if levels were recorded at a normal level. adjustment usually performed by screws on the headblock. National Association of Broadcasters (US. For optimum recording the head must track the tape exactly. 63 .3 Equalisation standards Tape machines have one of three equalisation standard settings. The audio signal thus has its highs and lows boosted (Pre-emphasis) before it is recorded by the record head. The head has five dimensions of alignment. high and low frequencies would be too low and not achieve adequate magnetisation. Canada. This is achieved by a complementary Post-emphasis equaliser in the playback circuit which readjusts the high and low frequencies back to their proper levels. Singapore at 15ips) IEC International Electrotechnical Commission (Aust. 2. N. Each EQ setting is used in a different part of the world.B.

Azimuth 3. The tape must contact the top and the bottom of the head with equal force otherwise the tape will Skew (ride up and 64 . All headgaps should be 90 degrees perpendicular to the tape so that they are in-phase with each other.2 Azimuth The tilt of the head in the plane parallel to the tape.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes Height Record Playback 3.3 Zenith The tilt of the head towards or away from the tape.

Determines the pressure of the tape against the head. correct wrap Incorrect wrap 3.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes down on the head). Zenith wear 3.4 Wrap The angle at which the tape bends around the head and the location of the gap in that angle. 65 .5 Rack How far foreward the head is. Uneven zenith adjustment leads to an uneven wear path on the head.

Bias signal is increased to a peak and then pulled back 1dB. Electronic Calibration Tape formulations differ from each other to the extent that an ATR must be calibrated to to optimise its performance with a particular tape formulation. Record 10KHz tone at 0VU and adjust record Hi-freq EQ. The playback head is first calibrated for each track using the reference tape and setting repro level and high frequency playback levels. The procedure used to set the controls to standard levels is called Calibration. 66 . Use a 50Hz tone to adjust Low freq playback EQ. The tape contains the following set of recorded materials: • • • Standard Reference level . Next the record head is calibrated.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes Rack 4. Standard levels must be adhered to so that tapes may be played on different ATRs. equalisation. by recording reference tone of 700 Hz from the oscillator onto fresh tape at 0VU. v. ii.15Khz for 30 secs. iv. Calibration is carried out to provide a standard reference for levels and equalisation. again on each track. Procedure for alignment: i. Azimuth adjustment tone . Frequency response tones from high to low. Next the record head bias level is adjusted. ATRs have variable electronic adjustments for record/playback level. iii. A reproduce alignment tape is used which is available in various tape speeds and track width configurations. bias current.700Hz or 1Khz signal recorded at a standard reference flux level of 185 nWb/m.

This gives rise to an audible pre-echo or ghosting effect on playback. Print-through still occurs. 1 1 Assignment 1 – AE001 67 . The Degausser is turned on and slowly moved towards the head. which are in phase when the tape has been wound up onto the supply reel with its beginning or “Head” out. Degaussing procedure requires care or the heads can be harmed. bad print-through can be avoided if the tape is wound for storage onto the takeup reel so that it is “Tail Out”. A degausser operates like an erase head: ie it produces a high level alternating signal which saturates and randomises the residual magnetic flux. Oxide accumulation is most critical on the heads themselves leading to a loss of signal in record and playback. Print-through is a tape storage problem.. Print-through is the transfer of a recorded signal from one layer of magnetic tape to an adjacent layer by means of magnetic induction. Tape heads and metal transport guides are cleaned with denatured (pure) alcohol and a soft cotton swab. but the ghosting effect is reversed as an echo which follows and is therefore masked by the original sound.SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes TAPE SERVICING 1. Tape head cleaning should occur before every recording session. It is gently moved across the head without touching it and then slowly removed. Conversely. 3. Periodic Degaussing or demagnetising of the heads is necessary. 2. When the tape is rolled up print-through is greatest between a layer of tape and the one immediately above it. Cleanliness The ATR heads and transport must be kept clean of oxide particles which are shed due to the friction of tape running over the heads.Tails out tape storage A form of deterioration of tape quality. Degaussing Heads do tend to retain small amounts of residual magnetism which can lead to high frequency deterioration in record and repro. Print-Through . Repeat the same procedure for each head. All tapes sould be removed and the ATR switched off.

SCHOOL OF AUDIO ENGINEERING A03– Analog Tape Machines Student Notes 68 .

AE04 – The Decibel

1. Logarithms
2. What is a Decibel?
3. Sound intensity, power and pressure level
4. Sound intensity level
5. Sound power level
6. Sound pressure level
7. Adding sounds together
7.1 Correlated sound sources
7.2 Uncorrelated sound sources
8. Adding decibels together
9. The inverse square law
9.1 The effect of boundaries

dB in Electronics
1. dBm
2. dBu
3. dBV and dBv
4. dBW
5. Power and Voltage
6. Relative versus Absolute Levels
7. Equal Loudness Contours and Weighting Networks
8. Other concepts
9. Dynamic range
9.1 Dynamic range of common recording formats

AE04 – THE DECIBEL

1. Logarithms

Logarithms allow smaller numbers to represent much larger values. Going back to basic algebra:

X^y = Z

X is the base, y is the power to which X is raised, and Z is the computed number. If the base is 10, then y can be called the logarithm (log for short) of Z. This simply means that y is the number to which 10 must be raised to get Z.

Let's use numbers to make it clearer:

10^3 = 1000

This means the log of 1000 is 3, so 3 can be used to represent 1000 as long as it is known that 3 is a logarithm; smaller numbers can therefore be used to express values that would otherwise require more digits. Note that a log has no unit.

Antilog refers to the reverse process: the antilog is the number you get when you raise 10 to the log value. If 3 is the log of 1000, then 1000 is the antilog of 3. These computations can be made with a scientific calculator; at this point, learning how would make the rest of this section easier.

2. What is a Decibel?

Numerous attempts have been made to explain one of the most common, yet confusing terms in audio: the "dB". "dB" is an abbreviation for "decibel", and it need not be all that difficult to grasp. If you're one of the many people who is "a little fuzzy" about decibels, the following explanations should clear things up for you.

The reason the dB is used is that it is logarithmic; the dB was intended to simplify things, not to complicate them. Also, since our ears' sensitivity is "logarithmic", dB values relate to how we hear better than do absolute numbers or simple ratios.

Mathematical Definition of the dB

The dB always describes a ratio of two quantities, quantities that are most often related to power.
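Before putting this definition to work, it helps to check the log and antilog relationship from section 1 numerically. A minimal sketch, using Python's standard math module:

# A quick numeric check of the log/antilog relationship before it gets used in
# the dB formula.
import math

value = 1000
log_value = math.log10(value)        # 3.0  -> "the log of 1000 is 3"
antilog = 10 ** log_value            # 1000 -> "1000 is the antilog of 3"
print(f"log10({value}) = {log_value:g}, antilog of {log_value:g} = {antilog:g}")

print(f"log10(2) = {math.log10(2):.3f}")   # about 0.301, used in the examples below

With that reference point, the dB examples that follow are easy to verify by hand.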

To demonstrate this. Since a decibel (dB) is 1/10 of a Bel. The decibel is more convenient to use in sound systems. of 2 watts to 1 watt? dB = 10 • log (P1 ÷ P2) = 10 • log (2 ÷ 1) = 10 • log 2 = 10 • . NOTE: If you don’t have a calculator that gives log values. log has no unit. 71 . of 100 watts to 10 watts? dB = 10 • log (P1 ÷ P0) = 10 • log (100 ÷ 10) = 10 • log 10 = 10 • 1 = 10 so the ratio of 100 watts to 10 watts is 10 dB.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes The decibel is actual 1/10 of a Bel (a unit names after Alexander Graham Bell. or a book with log tables. PROBLEM: What is the ratio. then you need to know that the logarithm of 2 is . in dB.301 in order to solve the above equation. in dB. Therefore. let’s plug in some real values in the dB equation.301 = 3. A calculator that can perform log calculations helps a lot. it can be mathematically expressed by the equation: dB = 10 log (P1 ÷ P2) it’s important that you realise that a logarithm describes the ratio of two powers. which is why the “B” of dB is upper case). primarily because the number scaling is more natural.01 =3 so the ratio of 2 watts to 1 watt is 3 dB. PROBLEM: What is the ratio. not the power value themselves.

Therefore we are interested in the amount of energy transferred per unit of time. This scale is based on the ratio of the actual power density to a reference intensity of 1 picowatt per square metre (10-12 Wm-2).000 watts? That’s a 10:1 ratio. it is 3 dB greater (or if it is half the power. Sound intensity of real sound sources can vary over a range. in general we are more interested in the rate of energy transfer. What is the relationship of one milliwatt to 1/10 watt? One milliwatt is 1/1000 watt. whenever one power is ten times another. In audio. However. in terms watts per unit area. and because of the way we perceive the loudness of a sound. and that’s 1/100 of 1/10 watt. For instance. Sound intensity. it is 10 dB greater (or if it is 1/10 the power. Sound intensity level The sound intensity represents the flow of energy through a unit area. it is 10 dB less). which means it is 10 dB below 1/10 watt. Sound is also a threedimensional quantity and so a sound wave will occupy space. instead of the total energy transferred. Decibels are applied throughout the audio field to represent levels. using the decibel. it is 3 dB less). that is the number of joules per second (watts) that propagate. This will become clearer later. 4. the sound intensity level is usually expressed on the logarithmic scale. Because of this it is helpful to characterize the rate of energy transfer with respect to area. This fixed reference allows the decibel value to have an absolute value. power and pressure level The energy of a sound wave is a measure of the amount of sound present. 3. which is a measure of the power density of a sound wave propagating in a particular direction. decibels for specific ratios have a fixed reference. In other words it represents the watts per unit area from a sound source and this means that it can be related to the sound power level by dividing it by the radiating area of the sound source. Because of this. so. again it is 10 dB. which is greater than one million million (1012).SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes The two previous problems point out interesting aspects of using the dB to express power ratios: whenever one power is twice another. One can begin to see the reason for using dB by expressing a few other power values. This gives a quantity known as the sound intensity. that is. how much greater than 100 watts is 1. Thus the sound intensity level (SIL) is defined as : dBSIL = 10 log10 Iactual ) Iref 72 .

and because of the way we perceive sound. Thus the sound pressure level (SPL) is defined as: 73 . which is known as the sound pressure. although it is useful theoretically. Because of this. Note that 1 Pa equals a pressure of 1 Nm-2. pressure is used as a measure of the amplitude of the sound wave. Sound power level The sound power level is a measure of the total power radiated in all directions by a source of sound and it is often given the abbreviation SWL. This scale is based on the ratio of the actual sound pressure to the notional threshold of hearing at 1 kHz of 20 µPa. unlike the sound intensity.) pressure of a sound wave at a particular point.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes -2 Where Iactual = the actual sound power flux level (in Wm ) and Iref = the reference sound power flux level (10-12 W m-2) 5. These two pressures broadly correspond to the threshold of hearing (20 µPa) and the threshold of pain (20 Pa) for a human being. Thus real sounds can vary over a range of amplitudes which is greater than a million. Because human ears are sensitive to pressure. However. or sometimes PWL. the sound power has no particular direction. Other measures could be either the amplitude of the pressure or the associated velocity component of the sound wave. which is the root mean square (rms. It has the advantage of not depending on the acoustic context. respectively. and because it is easier to measure. for example ones which generate unwanted noises. This gives a quantity. the sound pressure level is also usually expressed on the logarithmic scale. The sound power level is also expressed as the logarithm of a ratio in decibels and can be calculated from the ratio of the actual power level to a reference level of 1 picowatt (10-12 W) as follows: dBSWL = 10 log10 wactual w ref where wactual = the actual sound power level (in watts) and wref = the reference sound level (10-12W) The sound power level is useful for comparing the total acoustic power radiated by objects. and can be measured. Sound pressure level The sound intensity is one way of measuring and describing the amplitude of a sound wave at a particular point. 6. Note that. it is not the usual quantity used when describing the amplitude of a sound. The sound pressure for real sound sources can vary from less than 20 microPascals (20 µPa or 20 × 10-6 Pa) to greater than 20 Pascal (20 Pa). at a frequency of 1 kHz.

It can be calculated from V = P Zacoustic (Eq. The operation of squaring the pressure can be converted into multiplication of the logarithm by a factor of two. in the same way that electrical power is proportional to the square of voltage. The first is to make the result a number in which an integer change is approximately equal to the smallest change that can be perceived by the human ear.12) The pressure and velocity component amplitudes of the intensity are linked via the acoustic impedance so the intensity can be calculated in terms of just the sound pressure and acoustic impedance by: Iacoustic = V * p = __P_* P= ___P2_ Zacoustic Zacoustic Therefore the sound intensity level could be calculated using the pressure component amplitude and the acoustic impedance using: ___P2___ Zacoustic = Iref 10log10 ___P2__ Zacoustic Iref SIL = 10 log10 Iacoustic = 10 log10 Iref This shows that the sound intensity is proportional to the square of pressure. The second is to provide some equivalence to intensity measures of sound level as follows.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes dBSPL = 20 log10 pactual pref where pactual = the actual pressure level (in Pa) and pref = the reference pressure level (20 µPa) The multiplier of 20 has a two-fold purpose. which gives: SIL = 20 log10 _____P___ √Z acoustic Iref 74 .# 1. The intensity of an acoustic wave is given by the product of the volume velocity and pressure amplitude as: Iacoustic = VP Where P = the pressure component amplitude And V = the volume velocity component amplitude Acoustic impedance is analogous to the electrical impedance but for acoustic.

provided that nothing alters the number and proportions of the sound pressure waves arriving at the point at which the sound pressure is measured. this equation shows that if the pressure reference level was calculated as: _________ ________ Pref = √Z acoustic Iref = √416 × 10-12 = 20. and then may be reproduced using several loudspeakers. such as might arise from a simple reflection from a nearby surface. These different means of describing and measuring sound amplitudes can be confusing and one must be careful to ascertain which one is being used in a given context.1 Correlated sound sources In this situation the sound comes from several sources which are related. However. There are two different situations. This will be the case for sound waves in the atmosphere well away from any reflecting surfaces. a reference to sound level implies that the SPL is being used because the pressure component can be measured easily and corresponds most closely to what we hear. It will not be true when there are additional pressure waves due to reflections. The actual pressure reference level of 20 µPa is close enough to say that the two measures of sound level are broadly equivalent. such as a recording or a microphone. In fact. whereas the sound intensity level is the power density from a sound source at the measurement point. This can happen in two ways. SIL ≈ SPL. SIL ≈ SPL for a single sound wave a reasonable distance from the source and any boundaries. They can be equivalent because the sound pressure level is calculated at a single point and sound intensity is the power density from a sound source at the measurement point. 7. these may result from other musical instruments or reflections from surfaces in a room. the different sources may be related by a simple reflection. changes in level for both SIL and SPL will be equivalent because if the sound intensity increases then the sound pressure at a point will also increase by the same proportion. Thus a 10 dB change in SIL will result in a 10 dB change in SPL. the sound may be derived from a common electrical source. sound levels together.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes This equation is similar to the previous dB equations except that the reference level is expressed differently. that is there are no extra pressure waves due to reflections. Adding sounds together So far we have only considered the amplitude of single sources of sound. as might arise in any room or if the acoustic impedance changes. That is. the sound pressure level and the sound intensity level are approximately equivalent. In general. Firstly. If there is only a single pressure wave from the sound source at the measurement point. Secondly.4 × 10-6 then the two ratios would be equivalent. Because the speakers are being fed the 75 . in most practical situations where more than one source of sound is present. 7. However. which must be considered when adding. In order for this to happen the extra sources must be derived from a single source. If the delay is short then the delayed sound will be similar to the original and so it will be correlated with the primary sound source. However. the sound pressure level is the sum of the sound pressure waves at the measurement point.

As the sound spreads out from a source it gets weaker. In the first case the different instruments will be generating different waveforms and a t different frequencies. instead it spreads out as it travels away from the radiating source. As the amount of honey has not changed it must therefore have a quarter 76 . In these situations the result can be expressed as a multiplication. due to the delay. they act like several related sources and so are correlated. This is because adding logarithms together is equivalent to the logarithm of the multiplication of the quantities that the logarithms represent. the surface area of the balloon would have increased four fold. in reality sound propagates in three dimensions. If a decibel result of the summation is required then it must be converted back to decibels after the summation has taken place. This means that the sound from a source does not travel on a constant beam. 8. Consider a half blown up spherical balloon. However. it may come from two different instruments. Even when the same instruments play in unison. For example. Because the delayed wave is different it appears to be unrelated to the original source and so is uncorrelated with it. and so can be expressed as a summation of decibel values. these differences will occur. In this context the decibel representation of sound level is very useful. The inverse square law So far we have only considered sound as a disturbance that propagates in one direction. although the additional sound source comes from the primary one and so could be expected to be related to it. In the second case. This is not due to it being absorbed but due to its energy being spread more thinly. but are spatially disparate. the primary source of the sound will have changed in pitch. Adding decibels together Decibels are a logarithmic scale and this means that adding decibels together is not the same as adding the sources’ amplitudes together.2 Uncorrelated sound sources In this situation the sound comes from several sources which are unrelated. as there are many acoustic situations in which the effect on the sound wave is multiplicative. Clearly this is not the same as a simple summation! When decibel quantities are added together it is important to convert them back to their original ratios before adding them together. 7. In other words decibels can be added when the underlying sound level calculation is a multiplication. If the balloon is blown up to double its radius. for example the attenuation of sound through walls or their absorption by a surface. This is because in the intervening time. which is coated with honey to a certain thickness.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes same signal. the delay will mean that the waveform from the additional source will no longer be the same. amplitude and waveshape. There are some areas of sound level calculation where the fact that the addition of decibels represents multiplication is an advantage. 9. or from the same source but with a considerable delay due to reflections.

or smog and humidity. In practice. The amount of excess attenuation is dependent on the level of impurities and humidity and is therefore variable. all real sources have a finite area so the intensity at the source is always finite. but also gets duller. This concentration can be expressed as an extra multiplication factor. because there are no boundaries to restrict wave propagation. And Voltage The dB can be used to express power ratios. Power. In these situations the sound is radiating into a restricted space. Note that the sound intensity level at the source is. Sound radiation in this type of propagating environment is often called the free field radiation. this is a direct consequence of the inverse square law and is a convenient rule of thumb. However. such as a floor. for example impurities and water molecules. but other techniques can also achieve the same effect. irrespective of the directivity. or even all the way round them in the case of rooms. due to the extra attenuation these cause at high frequencies. the surface area of the sound wave still increases in proportion to the square of the distance. These extra sources of absorption have more effect at high frequencies and. Furthermore this reduction in intensity is purely a function of geometry and is not due to any physical absorption process. in theory. However. It is simply 77 . that is there is an inverse square relationship between sound intensity and the distance from the sound source. The sound intensity level reduces by 6 dB every time we double the distance. it is important to remember that the sound intensity of a source reduces in proportion to the square of the distance. this is only possible when the sound source is well away from any surfaces that might reflect the propagating wave. Obviously the presence of boundaries is one means of restriction. However. DB in Electronics 1. 9. as one moves away from a source. The level of sound thus increases as boundaries concentrate the sound power of the source into smaller angles. The sound intensity for a sound wave that spreads out in all directions from a source reduces as the square of the distance. infinite because the area for a point source is zero. The sound intensity from a source behaves in an analogous fashion in that every time the distance from a sound source is doubled the intensity reduces by a factor of four. However. in many cases a sound source is placed on a boundary. For example the horn structure of brass instruments results in the same effect.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes of the thickness that it had before.1 The effect of boundaries But how does a boundary affect this value? Clearly many acoustic contexts involve the presence of boundaries near acoustic sources. In practice there are additional sources of absorption in air. despite the restriction of the radiating space. as a result sound not only gets quieter. The effect of the boundaries is to merely concentrate the sound power of the source into a smaller range of angles.

therefore. 10 times the voltage is a 20 dB increase.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes dBwatts = 10 • log (P1 ÷ P2) For example. twice the voltage is a 6 dB increase. in dB? dBwatts = 10 • log (P1 ÷ P2) = 10 • log (100 ÷ 10) = 10 • log 10 = 10 • 1 = 10 dB The dB can be used for voltage as well. This means V2 = P/R For a constant resistance. For example 102 = 100 The log of 10 = 1 The log of 100 = 2 Therefore the log of 102 can be seen to be = 2 * the log of 10 = 2 voltage ratios. While twice the power is a 3 dB increase. According to the ohm’s law Power (P) = V2 * R. What is the ratio of 100 watts to 10 watts. Similarly while 10 times the power is a 10 dB increase. dBvolts = 20 log (E1 ÷ E0) where E0 and E1 are the two voltage values. in dB? dBvolts = 20 • log (E1 ÷ E0) 78 . To represent voltage in logarithms. the relationship is square the power relationship. Consider what this means. The decibel relationship of power ratios is not the same as that for voltage ratios. as explained below. power is proportional to the square of the voltage. The square of a number can be represented logarithmically by multiplying the log of the number by 2. The following equations should clarify this relationship: What is the ratio of 100 volts to 10 volts.

000 8.000:1 ratio (one hundred thousand watts in this case). Relative Versus Absolute Levels If we use a reference value of 1 watt for P0. the following chart may be helpful: Power Value of P1 (Watts) 1.6 2.0 10.000 80. For finding smaller dB values (i. for power ratios between 1:1 and 10:1).5 3.000 Level in dB (Relative to 1 Watt P0) 0 10 20 23 26 29 30 33 36 39 40 43 46 49 50 The value of using dB to express relative levels should be apparent here.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes = 20 • log (100 ÷ 10) = 20 • log 10 = 20 dB 2.e.000 4.3 8. since a mere 50 dB denotes a 100.0 Level in dB (relative to 1 watt P0) 0 1 2 3 4 5 6 7 8 9 10 79 .000 10.0 6..0 2.25 1.15 4.000 100.0 1.0 5.000 40. then the dB = 10 log (P1 ÷ P0) equation yields the following relationships: Power Value of P1 (Watts) 1 10 100 200 400 800 1.000 2.000 20.

Chinn. It actually tells us that the console is capable of delivering 100 milliwatts (0. but in fewer words. Instead of stating “the maximum output level is 100 milliwatts. Of course. Example C: “The console’s maximum output level is +20 dBm.A. However. In the IRE article. however.” in itself. and is always referenced to 1 milliwatt.” That statement is meaningless because the zero reference for “dB” is not specified. It so happens that this amount of power is dissipated when a voltage of 0.001 watts.” we say it is “+20 dBm. Gannett and R.775 volts. as explained in the next subsection. dBm has no direct relationship to voltage or impedance.” Example C tells us exactly the same thing as Example B. It’s like telling a stranger “I can only do 20. in Section 3. D.” 80 .” then any number of dB above or below that implied or stated zero reference may be used to describe a specific quantity. so more “compact” ways of expressing the same idea have been developed. The dBm was actually set forth as an industry standard in the Proceedings of the Institute of Radio Engineers. 0 dBm = 1 milliwatt.” The typical circuit in which dBm was measured when the term was first devised was a 600 ohm telephone line. and the next 10 dB another tenfold increase (from 10 mW to 100 mW). Volume 28.1 watt) into some load. many people mistakenly believe that 0 dBm means “0. the first 10 dB represents a tenfold power increase (from 1 mW to 10 mW). the above statement is awkward. Example B : “The console’s maximum output level is 20 dB above 1 milliwatt.K. in an article by H.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes The key concept is that “dB. the reference level was 0.1. That is. 0 dBm does. How do we know it can deliver 100 milliwatts ? Of the 20 dB expressed.” Example D: “The mixer’s maximum output level is +20 dBm into 600 ohms.” without providing a clue as to what the 20 describes. For this reason.1 dBm The term dBm expresses an electrical power level.” Example B makes a specific claim.775 Vrms is applied to a 600 ohm line. which is one milliwatt. Example A : “The console’s maximum output level is +20 dB. when a standard reference value is used for “0 dB.2. Moris titled “A New Standard Volume Indicator and Reference Level. always means one milliwatt.M.” but that is only the case in a 600 ohm circuit. has no absolute value. The absolute Decibel in Electrical Signal Levels 2. in January 1940. We’ll give several examples of “specifications” to illustrate this concept.

as explained in Section 3. another dB term was devised…dBu. The dBu was specified as a standard unit in order to avoid confusion with another voltage-related dB unit. the dBV. if this console were connected to a 600 ohm termination. the whole concept of the dB is to simplify the numbers involved. as we said earlier. and might burn out. and connection to a lower impedance load would tend to draw more power from the output. The output in Example D would drive 600 ohms.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes Example D tells us that the output is virtually the same as that expressed in Examples B and C.” rather than any “dB” quantity. Example E states minimum specified load impedance. How can we make these assumptions? One learns to read between the lines. dBu is a more appropriate term for expressing output or input voltage. its output would probably drop in voltage. 2. although the voltage can be calculated if the impedance is known. dBm implies one zero reference. so if a given power level is to be delivered. However. Power output isn’t really a consideration. and only if.3 dBV and dBv 81 . is the most common term. Example E : “The console’s maximum output level is +20 dBu into a 10k ohm or higher impedance load.2. etc. but there is a significant difference. whereas Example E specifies a minimum load impedance of 10.75 volts rms. tape decks. The term “dBm” expresses a power ratio so how does it relate to voltage? Not directly. We’ll go on to explain these and show the relationship between several commonly used “dB” terms. 2. For that reason. This allows us to calculate that the maximum output voltage into that load is 7.775 volts. Example D refers the output level to power (dBm). and.2 dBu Most modern audio equipment (consoles. Conversely.5.000 ohms. then a higher voltage would have to be delivered to equal that same power output.” Example E tells us that the console’s maximum output voltage is 7. increase in distortion. signal processors. and dBu implies another. * That complicates things. the dBm figure is derived with a 600-ohm load. except in the case of power amplifiers driving loudspeakers.75 volts. the dBu value is not dependent on the load: 0 dBu is always 0. but it gives us the additional information that the load is 600 ohms. just as we calculated for Example D. and the load impedance is higher.) is sensitive to voltage levels. even though the output voltage is not given in the specification. in which case “watts. Draining more power from an output circuit that is not capable of delivering the power (which we imply from the dBu/voltage specification and the minimum impedance) will result in reduced output voltage and possible distortion or component failure. The voltage represented by dBu is equivalent to that represented by dBm if. This brings up a major source of confusion with the dB… the dB is often used with different zero references.

To 82 . and can lead to serious errors elsewhere. (1) and (2).2 dB to whatever dBV value you have. the only difference between dBu (or dBv) and dBV is the actual voltage chosen as the reference for “0 dB.” The above two statements.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes The dBu is a relatively recent voltage-reference term. Both indicate the nominal output level is 1. To recap.6 volts rms.” as adopted by the National Association of Broadcasters (NAB) and others. dBv was a voltage-related term with 0 dBv = 0. The “u” in “dBu” thus stands for “unloaded. Example G: “The nominal output level is + 4dBv. This means that the first output specified will deliver a nominal output of 1. “dBv” with the lower case “v” was convenient because the dB values would tend to be the same as though “dBm” were used provided the “dBm” output was specified to drive 600 ohm loads. the capital “V” was then made the 1-volt zero reference standard by the International Electrotechnical Commission (IEC).” “The nominal output level is +4 dBV.23 volts rms. you can convert dBV to dBu (or dBm across 600 ohms) by adding 2. although the latter is the preferable usage today.23 V rms.” The two statements. appear to be identical. Unfortunately. making it easier to compare dBu specs with products specified in dBm.775 volts). whereas 0 dBu and 0 dBv are 0. To avoid confusion. dBV denoted a voltage-reference. For many years. people often did not distinguish clearly between “dBv” (a 0.” term engineers use to describe an output which works into no load (an open circuit) or an insignificant load (such as the typical high impedance inputs of modern audio equipment). to denote the voltage value corresponding to the power indicated in dBm (that is. while the NAB agreed to use a small “u” to denote the voltage value that might be obtained when the customary 600 ohm load is used to measure the dBm (although the load itself must be minimal). (1) and (2). Example F: “The nominal output level is +4 dBv. with 0dBV = 1 volt rms. it became common practice to use a lower case “v. you will notice the former uses a lower case “v” and the latter an upper case “V” after the “dB”.775 volt zero reference –if one assumes a 600 ohm circuit) and “dBV” (a 1 volt zero reference without regard to circuit impedance). During that period. The convenience factor here only makes sense where a voltage sensitive (read “high impedance” input is involved.4 Converting dBV to dBu (or to dBm across 600 ohms) So long as you’re dealing with voltage (not power).775 volts. whereas the second mixer specified will deliver a nominal output level of 1.” “The nominal output level is +4 dBu. 2. are identical. but upon closer scrutiny.” 0 dBV is 1 volt.

316 0.0 + 1.0 . and the voltages they represent.2 .2 . Level in dBu or dBm Level in dBV (0 dBV = 1 V With Reference to Impedance.5 0. Typical line level XLR connector inputs and outputs are intended for use with low or high impedance equipment.10.0 + 4.388 0.775V unterminated.775 V across a 600 ohm load impedance + 8.775 volt reference).” This standard is the one.0 2. Since older low impedance equipment was sensitive to power. which is basically sensitive to voltage rather than power.0 .21 The following Table 3. old 83 .0 Voltage (RMS) (0 dBu = 0.2 + 6.12.8.23 1.2. line level phono jack inputs and outputs are intended for use with high impedance equipment.0 . it’s just the other way around –you subtract 2.0 .78 0.0 .2 .6.3.2 + 4.21 dBV = dBv . dBv =dBV + 2.8 . respectively. or of broadcast. which has been used for many years in the consumer audio equipment business. Typically.5 Relating dBV. so their nominal levels may be specified as “-10 dBV.12.2 dB from the dBu value.3 shows the relationship between common values of dBV and dbu. XLR connector nominal levels were often specified as “+ 4 dBm” or “+8 dBm.0 .0 .245 0.100 2.20.10. you may see phono jack inputs and outputs rated in dBV (1 volt reference) because that is the standard generally applied to such equipment.8 .7.250 0.6. 0 dBm = 0.6 1.2 0. dBu and dBm to Specifications In many products. while the XLR connector output levels and some phone jack output levels are rated in dBm (1 milliwatt reference) or dBu (0. (while dBu values would probably suffice today.8 .SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes convert dBu (dBm) to dbV.0 1.” levels characteristic of sound reinforcement and recording.775 0. Which is Usually High) + 6.8 2.2.9.17.00 0.

and so forth. but here we see some anomalies –perhaps due to physiological limitations in the cochlea (the inner ear) as well as localized resonances. One magazine wished to express larger power numbers without larger dB values… for example. the multi-hundred watt output of large power amplifiers. The “A curve” is down 30 dB at 50 Hz. A 1000 watt amplifier is a 30 dBW amplifier. meaning. 3. so check the specifications carefully. the dB values in Tables 3-1 and 3-2 can be considered “dBW” (decibels.” This is incorrect since “loudness” has a very distinct. which is why the equal loudness contours sweep upward at lower frequencies. and over 45 dB 84 . Therefore. Loudness is related to these items.) Phone jack inputs and outputs are usually specified at the higher levels and lower impedance characteristic of XLRs. Equal Loudness contours and Weighting networks The concepts of sound pressure level. A low impedance line output generally may be connected to higher impedance inputs. The mass of the eardrum and other constituents of the ear limit high frequency response.6 dBW We have explained that the dBm is a measure of electrical power. and the frequency response may be adversely affected. the equipment could be damaged. that magazine established another dB power reference: dBW. Be aware that if a high impedance output is connected to low impedance input. though exceptions exist. In some cases. a 100 watt power amplifier is a 20 dBW amplifier (10 log (100÷1) = 10 log (100) = 10• 2 = 20 dB). 2. a ratio referenced to 1 milliwatt. dBm is handy when dealing with the miniscule power (in the millionths of a watt) output of microphones. you can also see why it has difficulty responding to low frequency (long wavelength) sound. The fact that the ear is not linear guided the makers of sound level meters to use a corrective filter – the inverse of the typical equal loudness contour –when measuring SPL. For this reason. It turns out that this is the frequency range where the outer ear’s canal is resonant. The filter has a so-called “A weighting” characteristic. and not so simple.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes practices linger and the dBm is still used. the dB. In fact. the sensitivity changes with absolute level. referenced to 1 watt of electrical power). and frequency response have been treated in previous sections. If you realize how small the eardrum is. that output may be somewhat overloaded (which can increase the distortion and lower the signal level). The fact that all the contours have slightly different curvatures simply tells us our hearing is not linear… that is. without much change in level. Some people use the term “loudness” interchangeably with “SPL” or “volume. if we are referring to amplifier power. you’ll see that peak hearing sensitivity comes between 3 and 4 kHz. which can be seen in the upward trend of the contours at higher frequencies. If you examine the whole set of equal loudness contours. 0 dBW is 1 watt. and the modest levels in signal processors (in the milliwatts).

Sometimes the quietest portion is obscured by ambient noise. in that case the dynamic range is the the difference in dB between the loudest part of the program and the noise floor. the ear has a “flatter” sensitivity characteristic. Headroom. then rises a few dB at between 1. This average level is called a nominal level.)” means the same as above. In the presence of loud sounds. one would want a flatter response from the SPL meter.A. A weighted. This is the function of the B and C weighting scales.H. Headroom = (Peak level) – (nominal Level) 85 .S. the inappropriate use of the A scale works in favor of those who don’t want to be restricted. Other concepts Dynamic Range. The dynamic range of a system is therefore the difference in dB of the peak and the noise floor Dynamic range = (peak Level) – (Noise Floor).SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes at 20 Hz relative to the 1 kHz reference. Given the sensitivity characteristic of the ear. Remember that 40 dB SPL (at 1 kHz) is equivalent to the sound of a very quiet residence. “dB SPL (A weighted)” ditto… 4. Every sound system has an inherent noise floor which is the residual electronic noise in the system. The differene in dB of the nominal level of a signal and the peak level of a piece of equipment in which the signal exists is called the headroom.Dynamic range is the difference in decibels between the loudest and quietest portion of a program. This is roughly the inverse of the 40 phon equal e loudness curve.5 and 3 kHz. such as rock concerts. Every system also has a peak output level. You may see weighting expressed in many ways: “dB (A)” means the same as “dB SPL. This can be seen by comparing the 100 or 110 phon equal loudness curves (which are the typical loudness at such concerts) to the 40 phon curve. Since this causes them to obtain lower readings than they otherwise would. (Occupational Safety & Health Administration) and most government agencies that get involved with sound level monitoring continue to use the A scale for measuring loud noises. O.every program has an average level. and falls below th 1 kHz sensitivity beyond 6 kHz. In order for the measured sound level to more closely correspond to the perceived sound level.” “dB (A wtd. the “A weighted” curve is most suitable for low level sound measurement. In apparent conflict with this common-sense approach.

S/N = (Nominal level) – (Noise floor) 4.DR = More than 90dB sampling rate 44.DR=65dB Analog Tape.1 Dynamic range of common recording formats Gramophone.4Khz FM Broadcast.SCHOOL OF AUDIO ENGINEERING A04– The Decibel Student Notes Most equipment are specify nominal level which gives optimum performance with adequate headroom.DR= about 45dB 86 .The difference in dB between the nominal level and the noise floor is the signal to noise ratio.DR= Less than 90dB Compact disc.78rpm DR=18dB LP. Signal to noise ratio.

3 8.c. 6. Electromotive force (EMF) Magnetism and Electricity Alternating Current 6.1 Phase 6. 5.1 1. Initial Concepts 1. 3.2 2. Safety devices 7.1 Practical Applications 4. Introduction to transformers Introduction to transistors 87 .1 8.c. 10.4 Voltage in an ac circuit Current in an ac Circuit Resistance in an AC Circuit Resonance (electronics) 9./d.3 Mains Plug 7.1 Fuses 8. What is an Atom Sources and Kind of Electricity (a.) Electricity Circuits 3.AE05 – Basic Electronics 1. The AC Circuit 8.2 8.

1. It has a positvely charged nucleus and a field of negatively charged partcles called electrons orbiting around the nucleus a. This movement and oscillation of electrons will continue until the conductor stops moving. Electrons – The electons are negatively charged and they rotate around the nucleus of the atom. The movement of electrons as a means of transferring energy from one molecule of a substance to another is determined by how much potential difference exists to initiate this movement. Some are higher voltage than others and some are re-chargeable but all produce what is known as Direct Current DC. This kind of electricity is called Alternating Current AC.c. one of such is the use of chemicals. The direction of flow of electrons will reverse once the direction of movement of the conductor is reversed. One of such uses electromagnetism.The nucleus consists of two particles. Initial Concepts 1.the most common source of electricity is the cell which uses conducting plates in chemical solutions to generate electricity.2. Potential difference is analogous to potential energy.1 What is an Atom An Atom is a smallest particle of which all matter is made. Cells.A magnet is a piece of metal that has a field of force around out in a particular direction. 1. it creates a potential difference within the conductor which exerts a force on the electrons.electrons in motion.2. more on AC later.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes AE05 – BASIC ELECTRONICS 1.) It is assumed that electricity is simply . a neutral Neutron and a positively charged Proton b. 1. There are various ways in which electrical potential difference can be created./d.Another source of electricity is the conversion of mechanical energy to electrical energy.1 Chemical sources. Nucleus. This field of force is like lines of force. Electromagnetism. When a conductor is made to move through this lines of force.c. The electrons will then move causing electricty to flow. 88 .2 Sources and Kind of Electricity (a.2 Mechanical sources.There are different kinds of Cells.

Potential Difference.the current is equal throughout the circuit for a series circuit. or increase the water pressure (Potential Difference). Power. The law states that V = I *R Where V is voltage in Volts I is Current in Amps And R is resistance in Ohms. Voltage.Electric power is the amount of work electricity can do.The term potential difference refers to a kind of force that causes electron movement. Current is expressed as flow rate of electrons per second and is commonly measured in Amperes or Amps. a load. 1 in eq. b.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes 2.and resistance describes how difficult it is to move. The unit of PD is Volts. e. (A) Resistance. are related by the ohm’s law. If you want more water to pass through (more current) you either increase the water pressure (Voltage) or you increase the size of the pipe (Reduce resistance). the water the electricity and the tap the Voltage source. 1 d. Current and power. This means. Ohm’s Law. 2 gives ii. wires/cables to connect the elements of the circuit and a switch or some means of controlling the flow of electricity. Eq. This means you can traces the path of the current on the circuit diagram with a pencil without lifting the pencil. Eq. A series Circuit is a circuit where the current path is not broken. Power is the ultimate aim of electrical energy. The size of the pipe is the resistance. Electric power in our model above will be how much water is transferred in a specific length of time. Power = V*I Substituting using eq. IN a circuit. to get the more water to pass in less time (more power) you either increase the pipe size (reduce the resistance). 89 . 2 P = V2 / R or P = I2 R 3. Electricity a. i. Electricity can be likened to water in a pipe. 1/1000 amps = 1 milliamp (mA). All components receive the same amount of current. (V) Current. The unit of current is Amps. The voltage is a measure of the current flowing and the resistance of the component.These four quantities. Voltage – The voltage in a series circuit is different for each component except when the components have the same resistance. c. In other words.A circuit is a closed path through which electrical energy flows. the same current flows through all the circuit components. The Unit of Resistance is the Ohms. It is a measure of the rate of transfer of energy.Current is therefore the electrons in movement. you have a voltage source. Series Circuits. Current. resistance. The unit of power is Watts (W). Circuits a.

Speaker Loading –The resistance of the cable used to hook up a speaker is in series with the resistance of the speaker as well. Parallel Circuits. the resistance of the conductor creates a load on the current that it has to overcome to get to the other side.25 x 1018 iii. a 1000 Watt amp will draw 4A from a 250 V Source when powered fully. Electromotive force (EMF) 90 .the voltage in a parallel circuit is constant and equal across all components Current. Total resistance. the current breaks up to travel several paths through the circuit and components do not all receive the same amount of current. The more the resistance. 4. For example. ii. Power Supply considerations – A power source is often limited by the maximum Voltage it can deliver and the maximum amount of current it can sustain.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes 1 Amp is equivalent to 1 Coulomb of electricity where a coulomb = 6.total resistance in a parallel circuit is a little more tricky 1 / R Total = 1/ R1 + 1 / R2 + 1 / R3 3. iii.in a parallel circuit. Speaker to amp matching – the total load (resistance) seen by an amp can be calculated based on the resistance values of the speakers and whether they are connected in series or parallel.1 Practical Applications Resistance is encountered in Audio in all instances. The current breaks up in relation to how much resistance is on each path. This resistance causes the current to lose some of its ability to do work.the current drawn by each component in a parallel circuit depends on its resistance and the circuit voltage. Total Resistance. i. This can be easily calculated using the power rating on all the equipment and the Voltage specification for the source. When hooking up equipment like amps to a source. Some important instances are: Cable Resistance and Heat loss – When current flows through a conductor. R Total = R1 +R2 + R3 …… b. the less the current on that path. Voltage. This power loss is in the form of heat and can be calculated as Power Loss due to heat generated = I2 * R This is important with speakers. it is important to note how much maximum current will be drawn from that source and if the source can handle it.the total resistance in a series circuit is simply an addition of the individual resistance. this increases the load seen by the amplifier’s output stage and this deteriorates the quality of the output signal.

This property is now understood to be a result of electric currents that are induced in individual atoms and molecules. iv. This effect is a result of a strong interaction between the magnetic moments of the individual atoms or electrons in the magnetic substance that causes them to line up parallel to one another. These currents. such as benzene. although an ordinary piece of iron might not have an overall magnetic moment. thereby aligning the moments of all the individual • • 91 . according to Ampere's law. in each domain. Many materials are diamagnetic. What is a magnet. who studied the forces between wires carrying electric currents. Devices that produce electromotive force by the action of light . a source of electromotive force or potential difference is necessary. Paramagnetic behavior results when the applied magnetic field lines up all the existing magnetic moments of the individual atoms or molecules that make up the material. have a magnetic moment induced in them that opposes the direction of the magnetic field. the English scientist Michael Faraday discovered that moving a magnet near a wire induces an electric current in that wire. v. and by the French physicist Dominique François Jean Arago. Electrostatic machines. the atomic moments are aligned parallel to one another. This discovery. Separate domains have total moments that do not necessarily point in the same direction. while Faraday showed that a magnetic field can be used to create an electric current. The available sources are as follows: i. the theories of electricity and magnetism were investigated simultaneously. and Devices that produce electromotive force by means of physical pressure. Magnetism and Electricity in the late 18th and early 19th centuries. A ferromagnetic substance is one that. the piezoelectric crystal 5. who magnetized a piece of iron by placing it near a current-carrying wire. In 1831. which showed a connection between electricity and magnetism. iii. for example. Voltaic cells. which operate on the principle of inducing electric charges by mechanical means Electromagnetic machines. In ordinary circumstances these ferromagnetic materials are divided into regions called domains. which produce an electromotive force through electrochemical action Devices that produce electromotive force through the action of heat . was followed up by the French scientist André Marie Ampère. that have a structure that enables the easy establishment of electric currents. and ferromagnetic—is based on how the material reacts to a magnetic field. retains a magnetic moment even when the external magnetic field is reduced to zero. • Diamagnetic materials. Thus. the strongest ones are metallic bismuth and organic molecules. In 1819 an important discovery was made by the Danish physicist Hans Christian Oersted. like iron. produce magnetic moments in opposition to the applied field.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes To produce a flow of current in any electrical circuit. One classification of magnetic materials—into diamagnetic. who found that a magnetic needle could be deflected by an electric current flowing through a wire.the magnetic properties of materials are classified in a number of different ways. when placed in a magnetic field. paramagnetic. This results in an overall magnetic moment that adds to the magnetic field. ii. vi. in which current is generated by mechanically moving conductors through a magnetic field or a number of fields . 
magnetization can be induced in it by placing the iron in a magnetic field. the inverse effect to that found by Oersted: Oersted showed that an electric current creates a magnetic field.

when linked to a coil of one turn will induce an e. Used in tape heads of tape decks. Several devices generating electricity operate on this principle. a current is set up or induced in the conductor. or if the strength of a stationary. conducting loop is made to vary. The Symbol is B. Alternating Current Electric Motors and Generators. When a conductor is moved back and forth in a magnetic field. A typical recorded signal on magentic tape will have a flux per meter in the region of 300 nWb/meter. or dynamo. Two related physical principles underlie the operation of generators and motors. It is exactly like the audio we studied already. An alternating current is a sinewave. In audio. group of devices used to convert mechanical energy into electrical energy. Weber is defined as the flux that.m.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes domains. or electrical energy into mechanical energy. first observed by the French physicist André Marie Ampère in 1820. if reduced to zero. and is generally used as a source of electric power. the device that converts mechanical energy or any other type of energy into electrical energy is called an input transducer and one that converts electrical energy into mechanical energy (speaker cone movement) is called an output transducer. These are diamagnetic and paramagnetic materials. Therefore theories and principles that apply to generators and motors applies to input and output transducers. These are used in the manufacture of speakers. they are called magnetically soft materials. If a current is passed through a conductor located in a magnetic field. Flux density is the concentration of flux and the unit is weber/meter2 . by electromagnetic means. the field exerts a mechanical force on it. Alternating current has several valuable characteristics. only this time it represents electricity.f. known as hysteresis. Nano is 10-9 . If they retain the magnetism. The converse of this principle is that of electromagnetic reaction. 6. alternator. nanoweber/meter2 is used. the unit is the Weber (Wb). as compared to direct current. A machine that converts mechanical energy into electrical energy is called a generator. The first is the principle of electromagnetic induction discovered by the British scientist Michael Faraday in 1831. of 1 Volt. If a conductor is moved through a magnetic field. If they lose the magnetism after the field is removed. producing an oscillating form of current called alternating current. both for industrial installations and in the home. 92 . The symbol for magnetic flux is φ and as mentioned earlier. the substance is called a magnetically hard material and these are ferromagnetic substances. and a machine that converts electrical energy into mechanical energy is called a motor. Magnetic materials can also be categorized on the basis of whether they retain the magnetism after the field is removed. the flow of current in the conductor will change direction as often as the physical motion of the conductor changes direction. The energy expended in reorienting the domains from the magnetized back to the demagnetized state manifests itself in a lag in response. instead. When audio is converted to its electrical analogue it exists as AC. The weber is too large a unit in audio for example to specify the flux/meter on tape weber/meter2 is too large.

it has a period and therefore a phase.these devices compare the current going in from the live side with that returning to the neutral. When a fuse melts. For mains supply.1 Phase AC can be in or out of phase. 8. 8. Earth leakage current devices (ECD)0 these devices operate by detecting a flow of current to the earth.3 Mains Plug The mains wiring have color codes. For an AC supply. Some of these measured are discussed below.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes AC current has voltage amplitudes and peak values. Safety devices The high levels of current and voltage that exists in AC distribution systems demands some that some safety measures be put in place. 6. The fact that AC oscillates is one reason why sound can be converted into electricity. It also has rms voltage. 5 and 13A. the blue wire goes to the Neutral (N) and the green/yellow wire goes to the Earth. 6.1 Voltage in an ac circuit 93 . A common way of bringing the three phases back together is a star connection which provides 230V between each arm and 400V 3-phase. ii. Fuses are rated by the amount of current they can handle. 3. The brown wire is live (L). if they are not equal a trip operates and the supply is cut. For domestic rating the most common fuses are 1. frequency dependent characteristics come into play. but differs in that AC is oscillating therefore. Because AC has a frequency. frequency and phase. The AC Circuit The AC circuit is similar to the DC circuit. 7. In a standard 3 phase set up you would have three phases 120 degrees out of phase with each other. the amount of work the supply can do is related not to its peak amplitude but to its rms value. you say it ‘blows’. Residual Current devices (RCDs). The sum at any one point of these three would be zero.1 Fuses A fuse is piece of conductor which has a maximum current carrying capacity exceeding which it will melt down. 7. Rms is the same here as it was for soundwaves. there are two types of safety units i. For standard power supplies (UK) you have 50Hz AC line.

When one plate is charged with electricity from a direct-current or electrostatic source. This property makes them useful when direct current must be prevented from entering some part of an electric circuit.3 Resistance in an AC Circuit Due to the oscillating nature of Alternating Current. the voltage can be either positive or negative. This is reactance. 8. For one it is a major design factor in filter circuits like cross over networks for passive multi-driver loudspeaker designs (more on this later in the course). and composition of the capacitor's dielectric. In its simplest form a capacitor consists of two metal plates separated by a nonconducting layer called the dielectric. but if a sheet of copper is instead interposed between the two bodies. that is. area. 8. Two of these factors resist the flow of current in relation to the frequency of oscillation. The capacitance of a capacitor is measured in farads and is determined by the formula C = q/V. is a device that is capable of storing an electrical charge. Two oppositely charged bodies placed on either side of a piece of glass (a dielectric) will attract each other.2 Current in an ac Circuit Current in an AC circuit can either be positive or negative. the charge will be conducted by the copper. the unit is still amps. positive if the original charge is negative and negative if the charge is positive. 94 . It becomes immediately obvious that reactance plays a big part audio. they can conduct direct current for only an instant but function well as conductors in alternatingcurrent circuits. Rms is computed in the same way as for soundwaves. There are two types of reactive elements which differ according to how changes in frequency affect them. is a substance that is a poor conductor of electricity and that will sustain the force of an electric field passing through it. Capacitance is the ability of a circuit system to store electricity. or electrical condenser. This property is not exhibited by conducting substances. the other plate will have induced in it a charge of the opposite sign.1 Capacitance Capacitor. some other factors related to the frequency of oscillation arise. Dielectric. where q is the charge (in coulombs) on one of the conductors and V is the potential difference (in volts) between the conductors.3. The unit is still Volts.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes In an AC circuit. the amount of electric charge it can hold. Capacitors are limited in the amount of electric charge they can absorb. 8. The electrical size of a capacitor is its capacitance. or insulator. The voltage considered is the rms voltage. The capacitance depends only on the thickness.

and it stores energy that becomes available when the electric field is removed. the resulting changing magnetic field cuts across the conductor itself and induces a voltage in it. mica. As in the case of a magnet. The values of this constant for usable dielectrics vary from slightly more than 1 for air up to 100 or more for certain ceramics containing titanium oxide. there is an associated ‘resistance’ that an inductance imposes on an AC circuit and this ‘resistance’ is called inductive reactance. to store energy. often used as dielectrics. and vacuums are used as dielectrics. the dielectric is under stress. and mineral oils. a certain amount of polarization remains when the polarizing force is removed. with the value for a vacuum taken as unity. ceramics. who discovered the effect. A dielectric composed of a wax disk that has hardened while under electric stress will retain its polarization for years. are used extensively in all branches of electrical engineering. have constants ranging from about 2 to 9. When the dielectric is placed in an electric field. The Resistance in an AC circuit is dependent on the ‘resistance’ introduced by the capacitance in the circuit. depending on the purpose for which the device is intended. The amount of self-induction of a coil. Air. paper.SCHOOL OF AUDIO ENGINEERING A05– Basic Electronics Student Notes In most instances the properties of a dielectric are caused by the polarization of the substance. The field around each length of the wire adds up within the coil to create a high inductive reactance (resistance due to inductance) in the wires. is measured by the electrical unit called the henry. the electrons and protons of its constituent atoms reorient themselves. This self-induced voltage is opposite to the applied voltage and tends to limit or reverse the original current. Dielectrics.3. Glass. This effect is more pronounced if the different sections of an AC conductor are close to together as in the case of a long cable coiled up. oil. it is determined only by the geometry of the coil and the magnetic properties of its core. compared to a vacuum. porcelain. particularly those with high dielectric constants. 8.2 Inductance When the current in a conductor varies. and is expressed in terms of a dielectric constant. This resistance is called reactance and it is given by the equation Xc = 1 / 2πfc The unit is the ohms This equation shows that capacitive reactance is dependent on Frequency. The inductance is independent of current or voltage. This will affect the signal going through. mica. Due to this characteristic of inductance. The ability of a dielectric to withstand electric fields without losing insulating properties is known as its dielectric strength. where they are employed to increase the efficiency of capacitors. The effectiveness of dielectrics is measured by their relative ability. its inductance. 95 . Such dielectrics are known as electrets. named after the American physicist Joseph Henry. and in some cases molecules become similarly polarized. As a result of this polarization. The polarization of a dielectric resembles the polarization that takes place when a piece of iron is magnetized. Capacitors are produced in a wide variety of forms.

8.3.2 Inductance

When the current in a conductor varies, the resulting changing magnetic field cuts across the conductor itself and induces a voltage in it. This self-induced voltage is opposite to the applied voltage and tends to limit or reverse the original current. This effect is more pronounced if the different sections of an AC conductor are close together, as in the case of a long cable coiled up: the field around each length of the wire adds up within the coil to create a high inductive reactance (resistance due to inductance) in the wires. This will affect the signal going through.

The amount of self-induction of a coil, its inductance, is measured by the electrical unit called the henry, named after the American physicist Joseph Henry, who discovered the effect. The inductance is independent of current or voltage; it is determined only by the geometry of the coil and the magnetic properties of its core. Due to this characteristic of inductance, there is an associated 'resistance' that an inductance imposes on an AC circuit, and this 'resistance' is called inductive reactance.

The reactance of this inductance in a circuit is dependent on the frequency of the alternating current. Inductive reactance can be found from

XL = 2πfL

The unit is the ohm.

The combination of resistance with capacitive reactance and inductive reactance in an AC circuit is called impedance; therefore resistance in an AC circuit is called impedance. Having both an inductor and a capacitor in a circuit affects the total impedance of the circuit. Because these two are opposite in nature - if frequency increases, one increases while the other reduces - this creates an effect called resonance.

8.4 Resonance (electronics)

Resonance is a condition in a circuit in which the combined impedances of the capacitance and inductance to alternating currents cancel each other out or reinforce each other to produce a minimum or maximum impedance at a specific frequency. Resonance occurs at a given frequency, called the resonant frequency, for each circuit, depending upon the amounts of inductance and capacitance in the circuit. This is the frequency where the combined effect of the capacitor and inductor reaches a maximum or a minimum.

If an alternating voltage of the resonant frequency is applied to a circuit in which capacitance and inductance are connected in series, the impedance of the circuit drops to a minimum and the circuit will conduct a maximum amount of current. When the capacitance and inductance are connected in parallel, the opposite is true: the impedance is extremely high and little current will pass.

Resonant circuits are used in electric equipment, such as filters, to select or reject currents of specific frequencies. Filters of this type, in which either the capacitance or the inductance of the circuit can be varied, are used to tune radio and television receivers to the frequency of the transmitting station so that the receiver will accept that frequency and reject others.
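The sketch below computes inductive reactance from XL = 2πfL and, as an addition not spelled out in the notes, the standard series-resonance frequency f0 = 1 / (2π√(LC)). The component values are arbitrary examples.

    import math

    # Inductive reactance XL = 2*pi*f*L, in ohms.
    def inductive_reactance(freq_hz, inductance_h):
        return 2 * math.pi * freq_hz * inductance_h

    # Standard series-resonance formula (added here as an assumption; the notes do not quote it).
    def resonant_frequency(inductance_h, cap_farads):
        return 1.0 / (2 * math.pi * math.sqrt(inductance_h * cap_farads))

    L = 10e-3    # 10 mH inductor (arbitrary example value)
    C = 220e-9   # 220 nF capacitor (arbitrary example value)

    print(f"XL at 1 kHz       : {inductive_reactance(1_000, L):.1f} ohms")   # ~62.8 ohms
    print(f"Resonant frequency: {resonant_frequency(L, C):.0f} Hz")          # ~3393 Hz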

9. Introduction to transformers

When an alternating current passes through a coil of wire, the magnetic field about the coil expands and collapses, then expands in a field of opposite polarity and again collapses. If another conductor or coil of wire is placed in the magnetic field of the first coil, but not in direct electric connection with it, the movement of the magnetic field induces an alternating current in the second coil. If the second coil has a larger number of turns than the first, the voltage induced in the second coil will be larger than the voltage in the first, because the field is acting on a greater number of individual conductors. Conversely, if the number of turns in the second coil is smaller, the secondary, or induced, voltage will be smaller than the primary voltage.

In a transformer the coil into which the power is fed is called the primary; the one from which the power is taken is called the secondary. The two coils have different numbers of turns in them. The ratio

number of turns in primary / number of turns in secondary

is called the turns ratio and is usually denoted by n. A transformer in which the secondary voltage is higher than the primary is called a step-up transformer. If the secondary voltage is less than the primary, the device is known as a step-down transformer. The product of current times voltage is constant in each set of coils, because power is equal to the product of voltage and current; so in a step-up transformer the voltage increase in the secondary is accompanied by a corresponding decrease in the current.

The action of a transformer makes possible the economical transmission of electric power over long distances. If 200,000 watts of power is supplied to a power line, it may be equally well supplied by a potential of 200,000 V and a current of 1 amp or by a potential of 2000 V and a current of 100 amp. The power lost in the line through heating is equal to the square of the current times the resistance. Thus, if the resistance of the line is 10 ohms, the loss on the 200,000 V line will be 10 watts, whereas the loss on the 2000 V line will be 100,000 watts, or half the available power.

There are four types of transformers:

• Voltage step-up
• Voltage step-down
• Current step-up
• Current step-down

Because transformers allow the flow of electricity from one point in the circuit to another without any physical contact, they can be used as isolation devices for common mode signals. Common mode signals are signals that are in phase; because they are in phase, they cancel out and the net induced current across the transformer is zero. (This will be further explained under Audio Lines and Patchbays.)

10. Introduction to transistors

Transistor, in electronics, is the common name for a group of electronic devices used as amplifiers or oscillators in communications, control and computer systems. Until the advent of the transistor in 1948, developments in the field of electronics were dependent on the use of thermionic vacuum tubes, magnetic amplifiers, specialized rotating machinery and special capacitors as amplifiers. The transistor is a solid-state device consisting of a tiny piece of semiconducting material, usually germanium or silicon, to which three or more electrical connections are made.
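Returning to the power-transmission example in the transformer section above, the arithmetic is easy to check with a few lines of Python; the figures are the ones quoted in the notes.

    # Line loss I^2 * R for the two ways of delivering 200,000 W over a 10-ohm line.
    def line_loss_watts(current_amps, line_resistance_ohms):
        return current_amps ** 2 * line_resistance_ohms

    R_LINE = 10.0  # ohms, as in the example above
    for volts, amps in ((200_000, 1), (2_000, 100)):
        loss = line_loss_watts(amps, R_LINE)
        print(f"{volts:>7} V, {amps:>3} A -> loss {loss:,.0f} W "
              f"({loss / 200_000:.1%} of the 200,000 W supplied)")
    # 200,000 V at 1 A loses 10 W; 2,000 V at 100 A loses 100,000 W (half the power).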

Figure 1 below is an example of a vacuum tube. Before transistors, these devices were used for amplification purposes. Some audio engineers still swear by them for their supposed warmth and acceptable coloration of the sound, especially at lower frequencies. As the performance of all electronics is affected by heat, transistors and vacuum tubes have an effect on audio signals that varies with the design and with how hot they are. This effect will be discussed later.

Figure 1

Figure 2 below is of a circuit board with resistors and capacitors but also with transistors. The sealed metal containers house the transistors.

Figure 2


AE06 – AUDIO LINES AND PATCHBAYS

Introduction

One of the most important principles relating to electrical devices is that of interconnection between the devices, or between components in the same circuit. In audio, the integrity (accuracy) of sound in the electrical domain is maintained by conducting the electrical signal, without change, from one point of the circuit/signal path to the other. Anything that affects the electrical signal changes the audio characteristics, which distorts the sound. This interconnection is made using cables, jacks and plugs, and their quality and appropriateness can affect the electrical signal going through them. This is why a study of these interconnects is important.

With audio lines, the aim is to maintain the integrity of the signal throughout the signal path and to exclude noise from the signal. Noise refers to the unwanted sounds in the output. This unwanted sound can be induced from an acoustic source and get encoded with the original signal, or it can be induced electrically.

1. Cable and Cable Parameters

A cable is a conductor used to connect the different parts of electrical systems together so that electricity can flow. Metals are the most used materials for conductors (wires). Some metals are better conductors than others, as Table 1 below shows (the lower the number, the better the conductor). But the conductivity of a metal is not all that is important in choosing a metal for a piece of wire. Other important factors considered are:

i. Wire gauge – This refers to the thickness of the wire. The gauge of a wire relates to its resistance and to how much current it can handle before meltdown (fuses). The standardized sizes are specified by the American Wire Gauge (AWG). Table 2 below shows a few different wire gauges.

ii. Solid and stranded wire – A large gauge wire can either be a solid wire or a stranded wire. Small strands of wire can be used to make a larger gauge by simply wrapping them together (see Table 3).

iii. Annealing – Metals come in different sizes and have to be reduced in thickness for application as various wire sizes. This process makes the wire brittle (it breaks easily and is not very flexible), and the process used to restore some flexibility to the wire is called annealing. Not all metals respond to this well. These two processes alter the gauge of the wire and can also affect the integrity of a contact (connection) point.

iv. Corrosion and oxidation – Metals have two problems. Oxidation occurs when the metal combines with oxygen and other chemicals to create rust. Corrosion arises when two dissimilar metals are in direct contact with each other: due to differences in potential, one metal will 'eat away' at the other.

v. Capacitance in cables – Capacitance effects in wires are a result of a situation called 'skin effect'. When alternating current travels down a wire, as the frequency of the current increases the signal tends to travel down the surface of the wire rather than through it, whereas for low frequencies the current travels through the whole wire. The plastic coating on the wire that is used for isolation will begin to act like a dielectric. Capacitance effects in wire occur mostly for high frequency signals. This effect is not very pronounced for frequencies within the audio range but can begin to get very severe for higher frequencies, e.g. in the megahertz range.

vi. Inductance in cables – Inductance is the electrical property of storing energy in the magnetic field that surrounds a current-carrying wire. The effect of inductance is very small unless the wire is carrying high current (as is the case with loudspeaker cables) and much of its length is coiled up.

Table 1 – Conductor metals

Metal      Resistance (mil-ohms/ft @ 20°C)   Flexibility   Annealing    Cost
Silver     9.9                               Poor          Poor         High
Copper     10.4                              Good          Good         Medium
Gold       14.7                              Excellent     Not needed   Very high
Aluminum   17                                Good          Good         Fair
Nickel     47                                Poor          Poor         Medium
Steel      74                                Excellent     Not needed   Very low

Table 2 – Wire gauges

AWG    Diameter (inches)   Compares to
40     0.0031              Smaller than a human hair
30     0.01                Sewing thread
20     0.032               Diameter of a pin
10     0.102               Knitting needle
1      0.39                Pencil
1/0    0.41                Finger
3/0    0.464               Marking pen
4/0    0.68                Towel rack

Table 3 – Stranded wire equivalents

Stranded AWG   No. of strands   Of gauge
20             7                28
20             19               32
14             7                22
14             19               27
14             42               30
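Table 1 lends itself to a small comparison. The sketch below uses only the ratios between the listed resistance figures, so it does not depend on exactly how the mil-ohms/ft column is normalised; the point is how much worse the other metals are than copper for an equal length and thickness of wire.

    # Relative resistance of the Table 1 metals compared with copper,
    # for equal length and thickness of wire (ratios only).
    RESISTANCE = {               # mil-ohms/ft @ 20 C, from Table 1
        "Silver": 9.9, "Copper": 10.4, "Gold": 14.7,
        "Aluminum": 17.0, "Nickel": 47.0, "Steel": 74.0,
    }

    copper = RESISTANCE["Copper"]
    for metal, r in sorted(RESISTANCE.items(), key=lambda kv: kv[1]):
        print(f"{metal:<9} {r / copper:4.2f} x the resistance of copper")
    # Silver 0.95x, Gold 1.41x, Aluminum 1.63x, Nickel 4.52x, Steel 7.12x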

2. Cable Construction

There are several configurations for constructing cables for various applications. The conductors in a cable assembly can take any of these configurations:

i. A pair of parallel conductors – This provides a send and return conductor for a circuit and is often used in AC power wiring or loudspeaker wiring. This is an unbalanced configuration.

ii. A twisted pair of conductors – A twisted pair has the same uses as the parallel pair, with several advantages. Twisting the wires keeps them very close together so that they will always have similar electromagnetic properties relative to ground (this will be explained later). This can be used with a balanced circuit.

iii. A shielded single conductor (coaxial) – The shielded single conductor offers the same features as the parallel configuration but with the advantage of a shield. It is still an unbalanced configuration, though.

iv. A shielded twisted pair of conductors – This provides all the advantages of the twisted pair with the added advantage of the extra shield. This works best with balanced circuits.

The sum total of interconnections in the audio system will affect the final SNR.

3. Electrical Noise in Audio Systems

In the electronic audio signal chain, keeping electrical noise out of the system is one of the greatest and most difficult tasks. Every conductor in a system will act simultaneously as a capacitor, generating a charge from an electric field, and as an inductor, generating current from a magnetic field.

3.1 Electromagnetic Interference (EMI)

The type of noise which can get into audio systems in many different ways is called EMI. It arises as a consequence of conductors such as audio cabling being exposed to electromagnetic radiation from other cables, electric motors and the atmosphere. Electromagnetic radiation is the oscillation of electric and magnetic fields operating together by transferring energy back and forth. This radiation generates both currents and voltages which, as noise, will compete and conflict with an audio signal being transferred down the same conductors. When the electromagnetic radiation reaches audible levels where the audio signal is degraded, we call this Electromagnetic Interference, or electronic noise.

3.2 Sources of EMI

In audio the most common sources of EMI include:

• Line frequency AC power (60 Hz tone)
• Broadband electrical noise on AC power lines caused by power surges from electric motors and dimmers
• Radiated electromagnetic waves (radio frequencies – RF)
• Intercable crosstalk

EMI can manifest itself as hum, buzzes, whistles, chirps, gurgles, or intelligible voice signal interference.

3.3 Means of Transmission of EMI

Identifying how noise is being transmitted to the receiver is a key factor in determining how to control it. There are 4 means of transmission of electrical noise:

3.3.1 Conducted Coupling

Occurs whenever there is a shared conductor with an impedance shared by both the source and the receiver, i.e. the audio system may be coupled via the AC power wiring to other systems which have electric motors, light dimmers etc. Proper grounding practices and the isolation, usually by transformer, of the powering of the audio system can help reduce this source of noise. However, interconnection practices in the audio system are also an important deterrent.

3.3.2 Electric Field Coupling

For example, close wires which run parallel to each other. A noise voltage is produced by the capacitance between source and receiver and is proportional to the area the source and receiver share with each other.

3.3.3 Magnetic Field Coupling

A current is produced by mutual inductance between source and receiver and is directly proportional to the loop area of the receiver circuit (which behaves like the windings in an electromagnet) and the permeability of the medium between source and receiver. For example, long wire runs such as long microphone cables, where send and return are separated, will create a long loop which is likely to induce current noise. Where source and receiver conductors are closer together, less inductance can occur.

3.3.4 Electromagnetic Radiation in the Far Field

Airborne EMI whose potential for interference depends on the field strength of the source and the receiver's susceptibility or immunity to the noise, e.g. instrument cables from electric guitars commonly act like aerials and pick up RF.

The effects of each of these four means of noise transmission are greatly reduced by adherence to standard methods and practices of interconnection when designing, installing and using an audio system.

4. Interconnection Principles

4.1 Balanced Lines and Circuits

All interconnecting audio is either balanced or unbalanced. In a balanced interconnect there is both an in-phase and an out-of-phase signal, i.e. two signals or wires in a push-pull arrangement known as Differential Mode (DM). The signals are of equal level but opposite polarity:

In-phase signal: positive (+), hot, signal, line.
Out-of-phase signal: negative (-), cold, return, common.

DM signals have a different polarity on each conductor, such as an audio signal from a balanced output. CM signals are of the same polarity on each conductor, such as signals picked up from a radiating electromagnetic noise source or ground reference differences between two devices (CM ground noise). The balanced system is universally used in professional audio because of its ability to control noise inputs by passing only differential mode signals and rejecting in-phase Common Mode (CM) signals (Common Mode Rejection).

4.2 Balanced Interconnections

There are four basic types, all with good noise rejection capabilities:

• Active-balanced to active-balanced
• Balanced transformer to balanced transformer
• Active-balanced to transformer-balanced
• Balanced transformer to active-balanced

The method of wiring a balanced interconnection is always the same: shielded, twisted pair wire is used. In this way an electrical relationship is physically represented by the interconnecting wiring. The earth connection via the shield is always terminated on the input side. This procedure avoids the possibility of ground-loop potentials occurring between the two interconnected devices.

4.3 Unbalanced Interconnection

Consists of one signal and a ground reference. The signal is transmitted over one conductor and a ground, with the ground sometimes used as a shield. Unbalanced circuits do not exhibit CMR. Unbalanced circuits are found in domestic and semi-pro audio systems, and some signal processors used in pro audio have unbalanced inputs and/or outputs.

4.4 Unbalanced to Unbalanced Interconnection

Generally interconnected with coaxial cable in which the centre conductor is the signal wire and the outer conductor is a braided shield which is also a ground return. The ground connection will create a ground loop when both pieces of interconnected equipment have a ground reference, which they usually do.

If the two connected devices are each grounded, the shield connection introduces a ground loop. If the shield is grounded at the output end only, no ground loop exists.

5. Circuit Level - High and Low

Audio signals exist at a wide range of nominal or average levels. Mic levels are comparatively low, which makes them highly susceptible to noise inputs; line levels are more robust. Care must be taken to check the Standard Operating Level (SOL) of a device before connecting to it. Interconnecting domestic gear (SOL -20 dBV) with pro gear (SOL +4 dBV) will probably degrade the SNR. (See Table 7-2 for signal levels given in voltages.)

6. Impedances for Mic and Line Level Systems

All interconnects are partially characterised by their impedance (Z). In any system there are 3 impedances to consider:

• Output impedance (source, drive)
• Cable impedance (characteristic impedance)
• Input impedance (drain, load)

Impedances for professional audio were first designed around the 600 ohm power-matched system, where all inputs and outputs were 600 ohms. Current-day professional audio, using modern op amp (operational amplifier) equipment, has developed the protocol of low-Z outputs (60 ohms) and hi-Z inputs (10-200 kohms). As long as this impedance ratio is maintained, any device can be connected to any other with little likelihood of distortion or noise. Where output Z approaches input Z, noise and distortion can occur, e.g. plugging a domestic hi-Z output into a pro hi-Z input.

7. Polarity and Phase

A basic consideration of any electrical interconnection is whether the signal being captured by the mic and transmitted through electronic equipment will be of the same polarity or phase throughout the chain. Two signals are "in-polarity" when their voltages move together in time; they are "out-of-polarity" when the voltages move in opposite directions in time. When a signal is out of polarity it is 180 degrees out of phase at all frequencies. The term polarity is used when talking about balanced lines because frequency and degree of phase shift usually are not relevant here. Phase relationships are more complex and involve degrees of phase shift and a frequency specification, i.e. 90 degrees out of phase at 500 Hz. The issue of whether or not signals are in polarity with each other becomes critical when signals from a multitrack programme are mixed to mono and interference is caused by out-of-polarity signals.
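The mono-compatibility problem just described is easy to demonstrate numerically. In this sketch (the 500 Hz tone and 48 kHz sample rate are arbitrary choices, not from the notes), summing a signal with a polarity-inverted copy of itself cancels completely, while a copy shifted 90 degrees at that one frequency does not.

    import math

    # Out-of-polarity vs out-of-phase: an inverted copy cancels fully when
    # summed to mono; a 90-degree phase shift at 500 Hz only partly combines.
    SAMPLE_RATE = 48_000
    FREQ = 500.0

    def tone(phase_offset_rad, n=96):
        return [math.sin(2 * math.pi * FREQ * t / SAMPLE_RATE + phase_offset_rad)
                for t in range(n)]

    original = tone(0.0)
    inverted = [-s for s in original]   # polarity flip: 180 degrees at ALL frequencies
    shifted = tone(math.pi / 2)         # 90-degree phase shift at 500 Hz only

    print(max(abs(a + b) for a, b in zip(original, inverted)))  # 0.0  -> complete cancellation
    print(max(abs(a + b) for a, b in zip(original, shifted)))   # ~1.41 -> no cancellation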

7.1 Relative Polarity

Describes a comparison between 2 signals, such as a stereo pair, where their polarity relationship to each other is of primary concern. Relative polarity of a signal may be compared at any 2 points in the signal chain.

7.2 Absolute Polarity

The signal at any point in the chain is compared with the originally captured acoustic signal (A). Absolute polarity testing of a system is done using an asymmetrical waveform such as a sawtooth wave, whose inversion will be immediately noticed on an oscilloscope inserted at various points in the chain.

7.3 Polarity of Connectors and Equipment

Confusion can arise when connecting equipment with balanced wiring, which has an in-polarity and an out-of-polarity signal. A balanced connection has three pins: +, - and shield/ground. In the wiring schema, hot must always be connected to hot, cold to cold etc. Any reversal of this procedure will upset the absolute polarity relationship in the system. The convention for balanced wiring is: pin 1 = shield, pin 2 = in-polarity (hot), pin 3 = out-of-polarity (cold).

8. Mixed Format Interconnections

Systems containing both balanced and unbalanced equipment pose a special problem. In an average studio, a large variety of music-oriented processing gear, particularly for guitar and keyboards, is unbalanced, as are many console inserts. Many signal processing devices are balanced in and unbalanced out. A system of interconnection must be devised which will overcome the four main problems of mixed format systems:

• Increase in noise and distortion due to level mismatching
• Increase in noise due to impedance mismatching
• External noise pickup due to lack of proper CMR
• Noise generated by an output being shorted to ground

Interface convertors between -10 dBV and +4 dBV can be used to overcome level and impedance mismatches. A method of interconnection called forward referencing can be used to produce CMR between an unbalanced output and a balanced input.

8.1 Forward Referencing

Will work where mixed interfaces have a balanced input, or where an unbalanced output drives a balanced input. Forward referencing will produce common mode rejection of EMI and ground loop noise at the input end. A shielded, twisted pair is used, with the shield and negative wire connected to the unbalanced output's earth connection. At the balanced input, hot and cold are connected to their respective terminals and the earth/shield is lifted.

9. Shielding

Shielding is used to control noise by preventing transmission of EMI from the source to the receiver, e.g. crosstalk. Shielding in the form of braid or foil is commonly used to wrap the single or multiple conductors in a transmission wire. It is done to minimize conducted coupling and to ground shields. In the case of a balanced pair, the shield is used as the connection to ground.

10. Twisting

Twisting of wires causes electrical fields to induce common mode signals on the wire. Twisting reduces magnetic EMI pickup by effectively reducing the loop area of the cable to zero and is vastly more effective than magnetic shielding. The greater the number of turns per unit length, the higher the frequency at which EMI will be reduced.

11. Separation and Routing

Physical separation of cables has a significant effect upon their interaction with each other. Coupling between parallel wires diminishes by 3 dB per doubling of the distance between them. It is good practice to group wires of the same level, e.g. mic level or line level, and to keep these groups separated. Wires of different levels should cross at right angles.

12. Grounding

Grounding is a fundamental technique used in the control of EMI. The ground is a neutral reference around which the AC signal or voltage oscillates, and it adds stability, safety and a degree of resistance to EMI to the audio system. The ideal ground is a zero-potential body with zero impedance, capable of sinking any and all stray potentials and currents. In a grounding system all devices and all metal surfaces are linked in a star configuration to a common ground sunk in the earth, which is the ultimate ground. Every device should be grounded just once to avoid ground loops.

In an audio system there are two types of ground:

1. Power ground: In a balanced, i.e. grounded, power system of 240 V, the voltage swings from +120 V to -120 V around the 0 V ground. An unbalanced power system may have 240 V at one pole and 0 V at the other - more unstable and unsafe.

2. Technical ground: The audio signal voltage oscillates between positive/negative or hot/cold potentials, with ground also being 0 V neutral. In a balanced line system the potentials should balance each other out to zero - more stable and less noisy due to CMR.
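A quick numerical illustration of the separation rule in section 11 above: if coupling falls by roughly 3 dB for every doubling of the spacing between parallel cables, the cumulative reduction adds up as follows. The centimetre figures are arbitrary examples.

    import math

    # Approximate reduction in coupled noise for the "3 dB per doubling of
    # distance" rule of thumb quoted above.
    def coupling_reduction_db(initial_spacing, new_spacing, db_per_doubling=3.0):
        return db_per_doubling * math.log2(new_spacing / initial_spacing)

    for spacing_cm in (2, 4, 8, 16, 32):   # starting from cables 1 cm apart
        db = coupling_reduction_db(1, spacing_cm)
        print(f"1 cm -> {spacing_cm:>2} cm : about {db:4.1f} dB less coupling")
    # 2 cm ~3 dB, 4 cm ~6 dB, 8 cm ~9 dB, 16 cm ~12 dB, 32 cm ~15 dB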

Wire and Cable for Audio

A wire contains only one conductor, while a cable can contain any number of insulated wires in a protected bundle.

1. Characteristics of Conductors

A conductor is used to conduct electricity from one location to another. Most wire conductors are made from copper. Wire is usually sized to the American Wire Gauge (AWG), where a lower number represents a thicker gauge, e.g. AWG 5 is a thicker gauge than AWG 15. When choosing the right conductor for a particular interconnection purpose, there are a number of considerations, including:

• Current capacity - will the conductor size carry the required amount of current without overheating? Is it safe?
• Resistance/impedance - does the conductor size provide sufficiently low resistance so that signal loss in the wire is acceptable?
• Capacitance and inductance - a conductor operates as a component in an electric circuit, alternately producing an electrical charge and an induced current. These two characteristics must be kept in balance for the best, most transparent conduction.
• Physical strength - is the conductor size durable enough for installation?
• Flexibility - must not be too brittle or it will break when bent.
• Insulation characteristics - the type of insulation used on the individual conductors of a cable affects voltage rating, flexibility, cost and ease of termination. A range of plastics and rubbers are used.
• Termination - is the conductor the right size to be terminated in the required connector?
• Right cable configuration - conductors in a cable assembly can take many configurations.
• Shield characteristics - 3 popular types: braid, spiral or foil. Braid shields, made of a fine wire weave, work best at low frequencies. Foil shields are small, light, cheap and easy to terminate, and are used in multipair/multicore cables. Conductive plastic is a rugged alternative to metal braids.

1.1 Coaxial Cable

Unbalanced, single strand, shielded cable, commonly used in domestic audio with unbalanced RCA connectors. It has seen recent use in semi-pro audio as a digital connector for S/PDIF (Sony/Philips Digital Interface). High-quality audio-video (AV) coaxial is best for digital data over short wire runs.

1.2 Microphone Cable

Shielded, twisted pairs providing an overall balanced line. Low level mic signals require good quality shielding for maximum protection from airborne EMI. Portable mic cables must be rugged, flexible and strong; usually a finely stranded 22 AWG. Low capacitance is desirable.

1.3 Line Level Cable

Physically the same as mic cable but distinguished by the fact that it carries line level signals. Line level cables carry signals greater distances, which means impedance must be kept low (60 ohms) so that the frequency response of low output signals is not degraded. A short wire length will keep cable inductance/capacitance and impedance effects to a minimum.

1.4 Multipair/Multicore Cable

Individual twisted, shielded pairs bundled together in an outer jacket. Each pair of wires must have an isolated shield. Useful where many mic or line level cables are required, but not both in the one bundle.

1.5 Low Impedance Loudspeaker Cables

Must be as transparent as possible to the passage of the audio signal from the amp to the speakers. Much thicker than mic cables - usually 9 to 14 AWG. Always unbalanced and usually unshielded. Different cable shapes and strand weaves are marketed at expensive prices.

1.6 Power Cables

Must be chosen to meet safety, fire and electrical code standards. 14 AWG is strong enough for AC power runs in fixed installations. Thicker gauges (8 to 10 AWG) are used as grounding wires.

2. Cable Connectors - Single Line Mic/Line Level

There are a number of connectors used for single twisted pair balanced connections. XLR and tip/ring/sleeve (TRS) connectors are examples of balanced connectors, as they each have three terminals. TS connectors are unbalanced and have only 2 terminals. The balanced connectors can also be used for single conductor plus shield/ground (unbalanced) connections, as these generally require only two contacts.

AE07 – Analog Mixing Consoles

1. Introduction
2. Mixer connectors
3. The channel
   3.1 The channel path
   3.2 Output from channel
   3.3 Insert
   3.4 Direct out
   3.5 Auxiliary
4. Routing
5. The Master Section
6. Mixer design and applications
   6.1 Split consoles
   6.2 In-line consoles
   6.3 Monitor consoles

A Recording Console
1. The Recording Process
   1.1 Recording
   1.2 Monitoring
   1.3 Overdubbing
   1.4 Mixdown
2. Console Design
   2.1 Signal Flow
   2.2 Pots, Faders, Buttons
   2.3 Buses
3. In-Line Console Sections
   3.1 Channel Input
   3.2 Equalisation (EQ)
   3.3 Dynamics Section
   3.4 Auxiliary Section (Aux)
   3.5 Channel Fader
   3.6 Channel Path Feeds
4. Monitor Section
   4.1 Bus/Tape Switch
   4.2 EQ in Mon
   4.3 Dynamics in Monitor
   4.4 Aux in Monitor
   4.5 Monitor Pot/Small Fader
   4.6 Monitor Mute Button
   4.7 Monitor Solo Button
   4.8 Fader in Monitor/Fader Rev Button
   4.9 Monitor Panpot
   4.10 Monitor Bus - Stereo
5. Subgroups/Group Faders
6. Master Section
   6.1 Aux Send/Return Masters
   6.2 Communications Module
   6.3 Master Monitor Module

Gain Structuring
1. Gain stages
2. Preamps
3. EQ
4. Faders
5. Stereo masters and Monitors
6. Level to tape
7. Balancing a Monitor Mix
   7.1 Setting Levels
   7.2 Setting Pan Positions

AE07 – ANALOG MIXING CONSOLES

1. Introduction

The mixer is a device that takes electrical audio signals from different input transducers (from 2 to as many as 96 and more) and can:

i. Send them to different out-board devices, or to tape, or to amplifiers
ii. Blend them together to create a mix

Every mixer has a number of identical sections to which the input transducers are connected. These sections are called channels. Each of these sections has identical controls for sending the input transducer signal out of the desk and for bringing it back. The out-board devices refer to signal processing equipment, which can be equalizers, dynamic processors, or effect processors like reverb machines, delay machines, chorus, flange, harmonizer, etc.

The other section of the mixer is the master section. The master section is different from the channels in that every control on the master section can deal with information (signal) from more than one channel. That is, the master section is designed to receive and send information from more than one channel out to out-board gear and to receive a return signal if necessary. The master section is also designed to mix and blend the signals from more than one channel.

2. Mixer connectors

Signals from the transducers get into the mixer through standard connectors. The input circuitry of the mixer might be balanced or unbalanced; most professional mixers have balanced (differential) input circuitry. There are two levels of signals expected by most mixers:

i. Microphone level - small voltages that need to be pre-amplified
ii. Line level - larger voltages that do not need to be pre-amplified

The typical jacks found at the input section of most mixers are:

i. Cannon (XLR) - these typically connect to the microphone level input of the desk, which, as mentioned above, connects to the pre-amplifier circuit.
ii. TRS phono - these jacks typically connect to the line level inputs and are TS compatible, though TRS is used because the inputs are balanced.

All the output sections on most mixers send out line level signals (with exceptions). Most are also connected using the TRS balanced phono plug, but on some mixers the main left and right out may use XLR connectors.

3. The channel

The channel is also called the in/out (I/O) module. Each channel has two sets of inputs:

i. Microphone level (XLR)
ii. Line level (TRS)

Once the signal comes in to the mixer it can be mixed with others or sent to out-board devices for processing. In order to process a signal (send it to a processor) there have to be controls for sending the signal out, either as a single signal from that channel only or in combination with other signals from other channels. If the signal is to be sent in isolation (on its own only) then it need not go to the master section; it might be better to send it out from the channel itself. If the signal is being sent along with signals from other channels to the same out-board processor, then each signal must be sent to the same single output, with a single control on the master section first. This single control then acts like a 'bus' for all the signals routed to it.

3.1 The channel path

The channel path describes the specific controls on that channel through which a signal on that channel passes. It consists of three main controls for determining how much level of the signal on a channel gets sent out to either the master section or to a processor:

i. Gain - this is normally a rotary control at the top of the channel. It has two main functions: it controls the pre-amplifier, and it controls the maximum signal that enters the desk from the input.
ii. Fader - this controls the level of signal that goes out of the channel, either to the main L/R out or to some other output on the master section, the group buses or the main L/R bus.
iii. Auxiliary - this also controls the level of signal that goes out of the channel, to the auxiliary master control in the master section. Each channel normally has more than one auxiliary. This is to increase the number of different places the signal on that channel can be sent.

3.2 Output from channel

Each channel on the desk is internally wired to the master section. However, other alternatives are provided to give the engineer access to the signal on each channel in isolation. One of these is the auxiliary mentioned above. There are two ways to send a signal out from the channel itself:

i. Insert point
ii. Direct out

There are different types of outboard devices, falling into two main categories as far as mixer routing is concerned. These categories are:

i. Processors that should be inserted in the channel path, and
ii. Processors that should be used on an auxiliary (supplementary) copy of the main signal.

3.3 Insert

The insert, as mentioned earlier, is a way of sending the signal from just one channel out for processing. It is unique in that the signal that goes out through the insert has to come back again through the insert, or the channel will have no signal anymore. That is, the signal coming in from the XLR or TRS input will be sent out through another connector called the 'insert out', 'insert send' or just 'send'. When the signal goes out like this it no longer exists on the channel; it is now inside the processor connected to the insert send. To get the signal back on the channel path, so that the processed sound can be mixed/blended with signals from other channels or routed somewhere else, you have to connect the output of the processor to the 'insert in', 'insert return' or just 'return'. Therefore, to use the insert connections attached to every channel on the mixer you need to make two connections:

i. Insert send - takes the signal out of your channel and sends it to the processor.
ii. Insert return - brings the signal back in to the same channel for further routing or mixing with other signals.

3.4 Direct out

The direct out is another connection that allows you to take signal out of one channel only. It is similar to the insert in that each channel has its own direct out, but it is different in that the direct out does not affect the signal on the channel. There is no direct in! It is just an optional output for each channel, and it acts like an auxiliary for single channels only.

3.5 Auxiliary

Another way of getting a signal out of a channel, but not in isolation, is the auxiliary path. The auxiliary path allows you to send varying levels of signals from different channels out from a single point in the master section. Auxiliary controls exist on each channel to determine how much of each channel's signal is sent to the master section. The auxiliary also has a master control on the master section, to which all the individual channel aux controls send their signals; this controls the overall level of all the individual channel controls. Therefore the single output connector (TRS) of the auxiliary is not associated with each channel; there is one main output at the master section for all.

Channels on a desk can have as many as 8 or more auxiliaries. Each of these auxiliary controls will be connected to its own master auxiliary control, which means there will be 8 auxiliary master controls in the master section.

It should be noted, however, that if the auxiliaries are used to send a group of signals to a processor, then in order to mix/blend the processed sound with signals from other channels, or just to route it again somewhere else, it has to be brought back in to the desk. This can be done by connecting the output of the processor to another channel on the desk, or by using some dedicated inputs in the master section called auxiliary returns.

Some mixers have effect processors like reverb machines built in, with dedicated auxiliaries assigned to these effect machines. Therefore, to send signal from a channel to these inbuilt machines all you have to do is turn up the specific auxiliary level on that channel and its master control in the master section, and the effect machine will receive and process the signal without you connecting any cables. The output of such an effect machine might be returned to the desk on two separate channels that cannot be used otherwise, or the output connection might be left to the user, or it might be brought back into a dedicated auxiliary return in the master section. Some mixers also have other processors built in, like dynamic processors.

Some mixers have built-in equalizers for each channel. These equalizers are already inserted on the channel path. With a switch you can remove (disconnect) them from the channel path or put them back in (EQ on/off or EQ in/out switches).

4. Routing

Routing simply means sending the signal around. The signal flow describes everywhere the signal passes through, from its source up to where it terminates (tape or amp). The signal path/signal flow is very important for properly wiring up a mixer and for troubleshooting. Routing on a desk can be either internal or external.

External routing - this involves sending signal out of the desk and bringing it back again. This is done by using the inserts, auxiliaries, buses, direct outs or any other outputs available on the desk. If routing is external, cable connections are required.

Internal routing - this involves sending channels to specific buses and sending auxiliaries to the main auxiliary master. If routing is internal the same connections are made, but with switches instead of physically handling cable. To route to the buses, mixers have a set of switches with numbers that represent the buses to which they are connected. So a switch with number 1 will connect that channel to bus 1, and a switch marked L/R will connect that channel to the main master L/R mix bus. Internal routing can get a little more complicated than that.

Pre or post

Finally, there is something called pre- and post-. These determine whether one processor or control (faders and such) has access to the channel signal before another one.

Pre- means the control has access to the signal before the specified other control. For example, an auxiliary marked pre-fader has access to the signal before it gets to the fader, and an auxiliary marked pre-EQ gets its signal before the EQ.

Post- means the control has access to the signal after the specified control. For example, an auxiliary marked post-fader gets its signal after the fader level, and a direct out marked post-fader means the level of the signal going out of the direct out will be affected by fader movements.
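A minimal sketch of the pre/post distinction just described (generic arithmetic, not any particular console's circuitry): a pre-fader send ignores the channel fader, while a post-fader send follows it.

    # Pre-fader vs post-fader sends, using plain linear gain factors.
    def channel_sends(input_level, fader_gain, aux_gain):
        pre_fader_send = input_level * aux_gain                  # taps the signal before the fader
        post_fader_send = input_level * fader_gain * aux_gain    # taps it after the fader
        return pre_fader_send, post_fader_send

    # Pulling the fader down to 50% changes only the post-fader send, which is
    # why cue mixes are usually pre-fader and effect sends usually post-fader.
    print(channel_sends(1.0, fader_gain=1.0, aux_gain=0.8))   # (0.8, 0.8)
    print(channel_sends(1.0, fader_gain=0.5, aux_gain=0.8))   # (0.8, 0.4)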

5. The Master Section

The master section of a mixer, as stated earlier, controls the master outputs of the desk. The master section can be designed with many combinations of output options, but in general the outputs will be:

i. Aux out
ii. Main out - the main master mix bus sends out two signals, left and right, sometimes with a third option for centre.
iii. Bus out or group out - a mix bus is simply a control that sums the signals from different channels and sends them out of the desk. Multiple channels can be sent to the mix bus, and the mix bus then serves as the main master control for them. A mix bus is also called a group. All mix buses have dedicated output connectors (TRS).
iv. Tape out
v. Headphone out
vi. Matrix out
vii. Studio out
viii. Control room out
ix. Solo

Some inputs on the master section are:

i. Auxiliary returns
ii. Tape in or 2-track in
iii. Talkback microphone input

6. Mixer design and applications

Mixers are designed in a variety of ways to suit specific applications. A mixer designed for use at a concert event will be different from one designed for use in a studio or a radio broadcast facility. The general considerations for a mixer are:

• The number of inputs
• How these inputs are laid out on the desk
• The design of the master section

Some common mixer designs are:

6.1 Split consoles

A split console is a console that has three sections, with a set of in/out channels on either side of the master section. Some designs have most of the stereo channels on one side and the mono channels on the other. The common use for this design is to send signals to a multitrack recorder from one side and monitor the off-tape signal on the other.

6.2 In-line consoles

This design incorporates the main in/out module and the monitor module on the same channel strip. This means each channel strip acts as two separate channels, each with its own input jacks. The two inputs share some controls on the path, though, such as EQ and aux sends, and in most designs access to the monitor path is restricted. This design saves space.

6.3 Monitor consoles

This is a dedicated console for processing monitor signals. These consoles generally have a lot of auxiliaries, and the master section incorporates a lot of alternative paths such as matrices. Monitor mixers must have the ability to output multiple mixes of the same channels.

A RECORDING CONSOLE

Recording consoles are designed specifically to serve the needs of a recording engineer. The recording console, or audio production console, provides control of the volume, tone, blending and spatial positioning of signals that are applied to its inputs by mics, electronic instruments and tape recorders. The console provides a means of directing, or routing, these signals to appropriate devices like tape machines, monitor speakers or signal processors. The console allows for the subtle blending and combining of a multiplicity of signal sources.

1. The Recording Process

The console is normally used in every phase of audio production, and its design reflects the many different tasks it is called upon to perform.

1.1 Recording

Audio signals from mics, electronic instruments and other sources are recorded to magnetic tape or a digital format (tracking). The inputs can be recorded separately or all at once. Each instrument or sound is generally recorded on a separate track of the master tape. Signal sources connected to the inputs of the console are assigned to particular console outputs, which are wired to the various input channels of the recorder. The console is used to set and balance the level of each signal as it goes to tape.

1.2 Monitoring

The engineer must be able to hear the different sounds as they are being recorded, and later as they are played back. The console therefore has a monitor section which allows us to hear sounds separately or in combination. The outputs from the monitor section feed the power amplifier and monitor speakers in the control room.

Different monitoring setups apply to different stages of production. During recording one must hear the direct input from the mic, sometimes in combination with inputs being newly recorded. Later, monitoring of sounds off tape is necessary. Finally, all the recorded tracks must be monitored together for their balancing and combining in the mixdown stage.

1.3 Overdubbing

The additional recording, or tracking, of other instruments once the initial recording or bed tracks have been laid. This involves monitoring the already recorded material as the musicians play along. The musicians must be provided with a cue or headphone mix from off tape via the auxiliary sends of the console.

1.4 Mixdown

When all tracking is finished, the recorded material on the various tracks of the multitrack recorder is combined, balanced and enhanced via the console in the mixdown. Console inputs are fed by the playback outputs of the multitrack tape recorder. Equalisation and signal processing may be added via the console to each recorded track as it is mixed. The tape outputs are finally routed to a two-track recorder (e.g. 2-track ATR or DAT) via the console's stereo output, called a stereo bus. This allows the final mix to be recorded.

2. Console Design

The design of a console is based on the applications for which the console is made.

2.1 Signal Flow

Console design reflects different concepts of signal flow, i.e. the particular way in which audio signals are routed from a console's inputs to its outputs. Upon input to the console, a signal is said to follow a particular signal path within the console, which is determined by particular settings made on the console surface. Two simultaneous signal paths are the channel and monitor paths.

The design of the console surface always entails a series of parallel modules, or channel strips. The same functions are duplicated on each module; however, depending on the design, the functions of the modules may differ slightly. Signal flow through each module is also basically the same; however, the physical top-down layout of a channel strip does not necessarily reflect the signal path through the console exactly. Signal flow diagrams are used to describe the layout of the sections of a module in the order that the signal passes through them. Different consoles have different module setups and different signal flow diagrams. The standard controls on a console are of the sort listed below, and the requirement is for an efficient signal flow.

There are two basic console types:

2.1.1 Split Line

Divides the controls on the console surface into 3 separate sections: input, output, monitor. The signal travels from the input section to the output section, and then via the master bus to the tape in. A split from the output section signal is taken for pre-tape monitoring, and off-tape signals may be selected at the monitor section. Split line consoles are the cheapest and simplest type of console design. They are usually limited by their number of outputs and their routing inflexibility, and the signal must travel through quite a few gain stages, which can add noise. Popular in sound reinforcement and small studios.

2.1.2 In-line

The in-line console places all the input, output and monitor controls for a single audio channel within a single channel strip called an I/O module. The in-line console must have at least as many I/O modules as there are channels on the multitrack. A 32-channel console is commonly used for 24-track recording, leaving 8 modules spare for signal processor returns. The signal can return off-tape through the input modules if so desired by selecting line instead of mic.

2.2 Pots, Faders, Buttons

The surface of any console is laid out with different controllers:

2.2.1 Pots

Short for potentiometer. They work on a simple rotary twist motion and may be used for boosting/cutting levels (aux and monitor pots) or making selections (EQ frequency selection, panpots). Gain pots can be side or centre detented, i.e. their neutral or zero point is located at the extreme left or in the centre of the swing.

2.2.2 Faders

Slider controls used for gain settings.

2.2.3 Buttons

Switches that select and deselect console functions. Channel on/off buttons (mutes) often light up when engaged.

2.3 Buses

A bus is an electrical conductor on which one or many signals may be collected and combined, e.g. the monitor bus combines all monitor signals, an output bus combines output signals, etc. After signals are combined on a bus, its output is assigned to a single destination, e.g. an auxiliary bus goes to the input of a signal processor, an output bus goes to the channel input of an ATR.

3. In-Line Console Sections

An I/O module of an in-line console will typically contain the following sections:

3.1 Channel Input

A mic/line switch allows for the selection of one of two inputs:

3.1.1 Mic in

The typical mic operating level (-45 to -55 dBV) is far below that of a tape recorder. A mic preamplifier in the console provides the gain to bring the mic signal up to a typical line level (0 dB); the preamp gain pot provides around 70 dB of boost to a mic signal. The mic-in section will also include:

• A +48 V phantom power switch for remote powering of condenser mics.
• A 10 dB pad for attenuating extreme mic signals.
• A phase reverse switch (ø) for 180-degree reversal of signal phase.
• High and low pass filters for cutting very low rumble such as air-con noise (HPF) or very high noise such as hiss from an amp (LPF) from the input signal.

3.1.2 Line in

Selects a line-level signal to the input section, such as an electronic instrument or a tape return signal. Sometimes there is a separate line level gain pot which will boost or cut an input signal by around 30 dB.

3.2 Equalisation (EQ)

Fed directly from the input section, although it is usually bypassed until selected by an EQ-to-channel select switch. Equalisation provides boost and cut of the signal at particular, selected frequencies. An EQ section will contain from 2 to 4 equalisers which cover the frequency spectrum. Types of equalisers include:

3.2.1 Filters (HPF and LPF)

Sometimes included in the input section. They provide cut only, but these are still equalisers.

3.2.2 Shelf EQ

Bass and treble boost or cut pots.

3.2.3 Sweep EQ

A boost or cut pot plus a centre frequency selector pot.

3.2.4 Parametric EQ

Same as sweep EQ, but with the addition of a pot for selecting bandwidth or Q (0.5 to 9). High Q is narrow bandwidth.
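The Q figures quoted above relate to bandwidth through the usual definition Q = centre frequency / bandwidth (spelled out here as an assumption, since the notes only give the Q range). The 1 kHz centre frequency is purely an example.

    # Bandwidth implied by a given Q, using Q = centre frequency / bandwidth.
    def bandwidth_hz(centre_hz, q):
        return centre_hz / q

    CENTRE = 1_000.0   # 1 kHz centre frequency (arbitrary example)
    for q in (0.5, 1, 3, 9):
        print(f"Q = {q:>3} -> bandwidth about {bandwidth_hz(CENTRE, q):6.0f} Hz")
    # Q = 0.5 -> 2000 Hz (broad), Q = 9 -> ~111 Hz (narrow): high Q is narrow bandwidth.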

3.3 Dynamics Section

Some large, expensive in-line consoles have an on-board dynamics section in each I/O module to perform gating, compression and limiting. Buttons allow the dynamics section to be selected at the channel input (i.e. pre-EQ) or at the channel output (i.e. post-fader).

3.4 Auxiliary Section (Aux)

A module may contain anywhere from 2 to 8 aux sends, numbered accordingly, i.e. Aux 1, Aux 2, etc. The channel signal is tapped, or split, and the aux pot determines the level of split signal sent to its corresponding aux master bus (i.e. aux pot 1 sends to aux master bus 1, etc.). Aux master bus 1 collects all aux 1 signals and sends them to an outboard signal processor such as a reverb unit. Auxes may be mono, or configured in stereo pairs with a corresponding panpot. An aux may be placed in the signal path before or after the channel fader with the use of the PRE/POST switch. Mono auxes are commonly used for reverb/echo sends in mixdown and are usually post-fader, so that the effect may be faded with the dry signal. A stereo auxiliary pair is commonly used for the cue/headphone mix sent to musicians when overdubbing; the cue mix is pre-fader so that fader movements do not affect the headphone mix.

3.5 Channel Fader

The channel fader controls the overall level of the signal before it is sent to either an output bus for tracking or the stereo bus for mixdown. The channel fader is typically long-throw and provides 30 to 40 dB of boost or cut to the signal, although the fader should usually remain around its unity gain position of 0 dB. The fader section also contains a channel cut or MUTE switch, which simply turns the channel on and off, effectively stopping the input signal from going any further.

3.6 Channel Path Feeds

At the output of the channel fader, the channel signal may be routed to one or more of the following destinations and their associated buses:

• Multitrack tape recorder via output bus or direct bus
• Aux master buses
• Monitor system and two-track tape recorder via master stereo bus
• Solo bus

These buses are accessed via the following buttons found on each I/O module:

3.6.1 Channel Assignment Matrix

The channel assignment matrix can distribute the input signal to any or all tracks on the multitrack recorder. A signal input into module 14 may be assigned to track 14 on the multitrack by pressing button 14.

Alternatively, the same signal may be sent to track 15 by pressing button 15. Two adjacent tracks (14 and 15) may be chosen to operate as a stereo pair; a pan switch and channel panpot then allow the panning of the signal left or right within the stereo pair. The output bus combines all the signals assigned to that module's output section, and the output buses are usually wired to the track inputs of the multitrack. The DIRECT button assigns the input signal directly to the output bus associated with that module, i.e. 14 in direct to 14 out.

3.6.2 Aux Pre/Post Switch

Selects the channel aux send to post channel fader, which means signals in the aux bus will be affected by fader movements. Post is often selected for effects.

3.6.3 MIX Button

Assigns the input signal from that module directly to the stereo master bus. Use this button to assign channels for mixdown.

3.6.4 SOLO Button

Also located by the channel fader. When pressed, the solo switch simultaneously mutes all other unsoloed channels and routes the soloed signals to a master solo bus, which in turn cuts into the monitor system. This allows us to check a particular track in the mix without having to mute every other track manually. A signal may be soloed pre or after fader (PFL or AFL), or In Place, where it will retain its panned position in the mix and any effects assigned to it via the auxes.

3.6.5 Panpot

The panoramic potentiometer allows the incoming signal to be routed in varying proportions to a pair of output buses or to the master stereo bus. With one side routed to the left speaker and the other to the right, the panpot allows the signal to be moved, or panned, from left to centre to right.
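One common way to implement the 'varying proportions' of a panpot is a constant-power (sine/cosine) pan law. The sketch below is a generic illustration of that idea only; actual consoles differ in the exact taper they use.

    import math

    # Constant-power panning: split one signal between left and right buses.
    def pan_gains(position):
        """position: 0.0 = hard left, 0.5 = centre, 1.0 = hard right."""
        angle = position * math.pi / 2
        return math.cos(angle), math.sin(angle)   # (left gain, right gain)

    for pos, label in ((0.0, "hard left"), (0.5, "centre"), (1.0, "hard right")):
        left, right = pan_gains(pos)
        print(f"{label:>10}: L = {left:.3f}, R = {right:.3f}")
    # At centre both gains are about 0.707 (-3 dB), keeping total power constant.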

4.SCHOOL OF AUDIO ENGINEERING A07– Analog Mixing Consoles Student Notes Selects the EQ section into the monitor path so that the signal may be equalised without going to tape. In some consoles the stereo master bus and the monitor bus are the same. 4. Any channel fader may be selected as a group fader and and any other fader output may be assigned to its control. 6.8 Fader in Monitor/fader Rev Button Places the channel fader in the monitor path and the monitor pot in the channel path ie reverses their functions. 4.7 Monitor Solo Button Routes the monitor signal to the solo bus . 4.5 Monitor Pot/Small Fader Controls the monitor signal level from that module to the monitor bus.4 Aux in Monitor Selects an aux send into the monitor path so that the signal may have rev/echo added without being recorded.stereo Combines the monitor signals from all the I/O moules and feeds a stereo signal to the master stereo bus for output to 2-track and studio monitor speakers.3 Dynamics in Monitor Selects the dynamics section into the monitor path. 4. Useful when a fader has broken.6 Monitor Mute Button Turns the monitor path on or off for each module.9 Monitor Panpot Pans the monitor signal left.10 Monitor Bus. 4. 4. centre in the stereo monitor mix. Subgroups/Group Faders The inline console provides a group function mode where the level of any group of incoming signals may be adjusted by a single fader which is designated as Group fader. Master Section 125 . 4. Group faders or subgroups are useful during mixdown to control the level of a group of instruments in the mix such as drums or strings with a single fader. 5. right.

SCHOOL OF AUDIO ENGINEERING

A07– Analog Mixing Consoles

Student Notes The master section of the inline console contains master controllers for the various buses, switches which can globally reset the console and monitor and communications functions. The master section is responsibler for the overall coordination of the I/O modules. Most consoles are split into 2 or 3 master modules: Aux Send/Return masters, Communication module, and Master Monitor section. 6.1 Aux Send/Return Masters Each Aux Send bus is controlled by a master level pot which controls the signal level sent out of the console to signal processors of headphone amplifier (Cue Mix). A signal sent out to a precessor will return to the console processed or ‘Wet”. A console may provide a number of stereo Aux Returns each with a master control pot and a panpot. The signal from these returns may be assigned to an output buss for recording to tape or sent directly to the master stereo bus for mixdown or monitoring. 6.2 Communications Module Usually contains: • • • talkback mic for communicating to artists in the studio Signal generator - sine waves and pink noise for calibration and testing purposes. Slating system - allows a voice or tone to be recorded to tape to identify starts of songs etc.

6.3 Master Monitor Module Contains the master controls associated with the control room and studio monitor systems.: • • • • • • • • Monitor Selector Matrix- selects taperecorder or other stereo output to be routed to the monitor section. Mix Selector routes all off tape signals and any “live”output bus signals to the monitor section Studio and Control room Monitor Master level pots and on/off switches. Speaker select A/B chooses nearfield or reference monitor speakers. Mono switch Stereo master Fader a single fader used to adjust the output levels from the mixdown/monitor buses before these levels are sent to spekers or 2-trk machine. Solo Master Level pot - controls level of the output from the Solo bus. Select PFL or In place Solo button

GAIN STRUCTURING From mic to console to recorder, there are a multitude of level controllers spread over the signal path. The active circuit element (usually an op amp) governed by a level control is called a Gain Stage. Gain structuring refers how you choose to set these various levels so that the SNR of the signal remains acceptable and the overall level stays balanced around 0VU.

126

SCHOOL OF AUDIO ENGINEERING

A07– Analog Mixing Consoles

Student Notes 1. Gain stages

A signal entering the console will travel thru the input gain stage, the EQ section, the fader section, possibly a group fader before going to tape. A split of the signal may go through the various aux gain stages as well as the internal gain controls on a signal processor before getting to the stereo bus. Gain structuring is the method of adjusting these levels for an optimum clean signal. Each level control exists within the context of an entire signal path . This path is affected by both upstream and downstream level controls. Also the SNR can only get worse after the signal leaves the mic or instrument. A practical guideline: avoid setting any gain stage at its highest or lowest extreme, with the possible exception of the original source’s output level. An ideal level for each gain stage is 75% of its maximum output. The best approach is to start at the beginning of the signal chain and work through each gain stage. By watching the meters and listening the levels can be adjusted until an optimum signal balance is achieved. 2. Preamps

The preamp level pot is the most important gain stage in the signal chain. Mic preamps apply the most gain to the signal. A very low signal may have to be turned up, but this will inevitabley add noise. Better to find a better mic position where the signal is stronger. A heavy signal which requires attenuation leaves no headroom and is likely to distort at peak points. Again, the solution is in trying to moderate the sound source.In most cases its best to set the preamp trim in a position that allows some flexibility and avoid changing it. 3. EQ

An equaliser is a gain stage which will boost or cut a signal at a particular frequency. If the signal from the input preamp is too hot there will be little headroom left for equalisation. Boosting a signal at certain frequencies before it goes to tape may increase noise or distort the timbral character of the instrument. Once a frequency is cut and recorded it cannot easily be replaced.EQ can often be deslected as a gain stage and this is useful for keeping noise levels at a minimum. In general, equalisation should only be used when necessary. Many engineers prefer not to EQ to tape but to leave their options open for the mixdown 4. Faders

The channel fader is used to make fine adjustments to the signal as it naturally dips or jumps as the sound source level changes. The setting of the fader especially effects headroom. If it is set too high its hard to to boost a signal in a quiet passage. Conversely, setting the fader too low will not allow a proper fadeout before noise sets in. In practice the fader should be kept around the 0dBm mark on its slider path with the signal registering 0VU on the meters. The same considerations apply to group faders. This way the channel fader is used to change the level of a single instrument in the group, and the group fader will determine the level of the group of instruments either to tape in in a mixdown.

127

SCHOOL OF AUDIO ENGINEERING

A07– Analog Mixing Consoles

Student Notes 5. Stereo masters and Monitors

Don’t forget to set the stereo bus mater fader at its unity position. This again preserves headroom and allows for a smooth fadeout of the whole track at mixdown. The level at which one listens while making gainsettings can effect the overall balance of the sound. This is because of the non linearity of our hearing (Equal loudness contours). If we listen at too low a level we can miss details in the program like hiss and minute distortion. Also we will tend to overcompensate for the frequencies our ears are not picking up which could lead to overequalising a sound. Conversely, monitoring at high levels will lead to ear fatigue and cutting too much bass or treble. It is therefore good to work out a relationship between monitor gain and the optimum signal level. One way would be to play a well balanced program at 0VUand measure the SPL at the mix position with an SPL meter. Increase monitor gain until the meter reads 85dB which is the optimum listening level. Mark this point on the monitor gain pot as a reference level for future monitoring. 6. Level to tape

Its good practice to strive for an average level of 0VU when recording to tape. However there are some exceptions to this rule. • • • • Sync tones should be recorded no higher than -5VU so their signal won’t bleed onto other tracks. Drums have very fast attack transients which can distort at 0VU. Levels of around -5 to -3 VU will give more headroom and allow for some equalisation later. Some high pitched percussion instruments like cowbell or xylaphone will also bleed easily and should be tracked at around -5VU. Electric Guitar, especially rhythm guitar, can be recorded at hotter than 0VU levels, but not so that the meter pins.

7.

Balancing A Monitor Mix

Levels are often recorded to tape irrespective of the overall musical balance of the program being recorded. This is because other considerations such as SNR are primary in recording. However one must still be able to judge the quality of a performance and an approximation of the final product. Hence instruments are blended in a musical way (ie mixed) and sent to the monitor speakers. The monitor mix is performed using the level pots and panpots in the monitor section to roughly balance and position the instruments. Often reverb and equalisation will also be necessary to please a client. These effects can be selected into the monitor path and have no effect on the recorded sound. 7.1 Setting Levels A rough balance of the instruments is provided using the monitor level pots. The aim is to highlight the important elements of the program and also any signal which the client may wish to study in detail. The feeling or mood or dramatic presence of the

128

SCHOOL OF AUDIO ENGINEERING

A07– Analog Mixing Consoles

Student Notes program should be drawn out in the monitor mix, however everything should be equally balanced so that all options are still left open. 7.2 Setting Pan Positions Start with the most important elements such as vocal, then position the supporting instruments around them. Panning involves a left- right spread of instruments across the soundstage. There are no definite rules , but their are some conventions which most clients will expect to hear. 7.2.1 Center Position

The mix must create a focal point for the listener. This is accomplished by taking certain key elements such as lead vocal, lead instrument, kick and snare drum and giving them prominence in the mix. One way to this is to pan them to the centre, as this is where the listeners attention is naturally drawn. Centre pan position is also used for low frequency sounds which lack clarity and directionality. the bass guitar is most often at the centre of the mix. 7.2.2 Far Left/Right Position

The most crucial use of extreme L/R panning is in the positioning and opening up of stereo recorded or stereo emulated tracks. This is so that the localisation cues inherent in these tracks can be fully present in the mix. Drum miking usually includes a stereo overhead pair. These should be panned extreme l/R so that the various elements of the kit will appear in position across the sound stage. Stereo miked piano should also be opened up so the keyboard will sound from Hi to low across the stage.FX returns are often in stereo and these must be panned extreme L/R. Single instruments can be on the far sides of the mix, especially if the effect required is one of distance from the centre or commentary on the centre eg backing vocals or lead guitar playing with lead vocal.
2

2

Assignment 2 – AE002

129

SCHOOL OF AUDIO ENGINEERING

A07– Analog Mixing Consoles

Student Notes

130

AE08 – Signal Processors DYNAMIC RANGE PROCESSORS 1. 2. 3. 4. 5. Dynamic Range Control Gain Riding Amplifier Gain transfer characteristics Electronic gain riding The Compressor/Limiter 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 6. Operation Compression ratio/Slope of the Compression Curve Threshold/Rotation point Attack Time (dB/ms) Release Time (dB/sec) Metering The Limiter Using Compressors

EXPANDERS/GATES 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 Operation Gate Expansion Ratio Expansion Threshold Range Attack Hold Release Key Input

131

Equalisation & Filters 1. 2. Equalisers: What They Do Equalisation Characteristics 2.1 2.2 2.3 3. 4. 5. Equalisation Curve Bandwidth and Quality(Q) Active/Passive EQ

Equaliser Electronics Controls Equaliser Types 5.1 5.2 5.3 5.4 5.5 5.6 Shelf EQ Parametric and Sweep Equalisers Notch Filter Presence Filter Comb Filter Graphic Equalisers

6.

Practical Equalisation 6.1 Extreme low bass: 20 to 60Hz. 6.2 Bass: 60 to 300Hz. 6.4 6.5 6.6 6.7 6.8 Lower Midrange: 300 to 2500Hz. Upper Midrange: 2500Hz to 5000Hz. Presence" range: 5000Hz to 7000Hz. High frequency range: 7000Hz to 20000Hz. Vocal Equalisation

7. 8.

Some Uses of Equalisation Eq Chart

132

DIGITAL SIGNAL PROCESSORS 1. 2. 3. Digital Delay Line (DDL) Pulsing Modulation Parameters 3.1 3.2 3.3 4. Sweep range, Modulation amount, depth Modulation Type Modulation Rate

Setting up Reverb devices

133

134

SCHOOL OF AUDIO ENGINEERING

A08– Signal Processors

Student Notes

AE08 – SIGNAL PROCESSORS
DYNAMIC RANGE PROCESSORS This unit Introduces the student to the basics of controlling the dynamic range of program material. 1. Dynamic Range Control

In broadcasting and analog recording, the original signal’s dynamic range often exceeds the capabilities of the medium, and must be reduced or compressed to fit the capabilities of that medium. The dynamic range of music is around 120dB while the dynamic range of analog tape and FM broadcasting are around 60 to 70 dB. Often when we try and reproduce the wide dynamic range programe through the narrow dynamic range medium information can be lost in background noise and distortion. Therefore, the dynamic range of the program material must be compressed until it fits within the S/N limitations imposed by the medium. 2. Gain Riding

This can easily be demonstrated when miking a singer who is expressively changing their dynamics to suit the song eg Mariah Carey singing “Can’t Live if Living is without you” where she goes from a breathy whisper to an impassioned scream. If we watched the VU meter of the vocal we would see it dip below an acceptable level then suddenly peak into the red. The engineer’s reaction is to try and balance the level and keep it at unity gain by moving the fader up and down in response to the dynamic changes. This method, called Gain Riding, is commonly used to keep a vocal track balanced. However it’s not very practical when the sound can shift dramatically without warning and where there are many channels, each with independent gain variables. Under these conditions you either need an octopus with perfect reflexes or some method of automatic gain adjustment. 3. Amplifier Gain transfer characteristics

The same situation can be looked at from the point of view of an amplifier whose gain has been adjusted in response to a signal whose dynamic variations exceed the SNR of the system. The input level on the horizontal axis varies from -15 to +15 dB. The SNR of the system is around 25dB. Three different level adjustments a,b, and c are made to the input signal in an attempt to contain it within the dynamic constraints of the system: a. b. c. 0dB adjustment ie flat response A boost of the fader by 10dB A cut of the fader by 10dB

135

In this instance an 8dB increase in input would produce a 4dB increase in output for a Gain Reduction of 4dB. This effect may be produced by means of a variable gain amplifier in which the gain of an electronic circuit varies automatically as a function of the input signal. 1:1 No gain reduction. Linear gain transfer characteristic. 4:1 +12dB gain beyond threshold will be compressed to +3dB with a gain reduction of 9dB 20:1 +10 dB gain beyond the threshold will be compressed to 0. 5. The output of the compressor is linear below the threshold point. 4.2 Compression ratio/Slope of the Compression Curve The increase of the input signal needed to cause a 1dB increase in the output signal. A compressor is in effect.b and c gain adjustments would be required ie a continuous or variable gain change called COMPRESSION. Through compressor operation.3 Threshold/Rotation point The threshold setting defines the point at which gain reduction begins. This produces an output signal that fits within the limits imposed by the system.When an input signal exceeds a predetermined level called a threshold compressor gain is reduced and the signal is attenuated. Electronic gain riding Compression can solve this problem by continuously varying the amp gain as the signal varies.5dB with a gain reduction of 9. high level signals are dropped below system overload. the output is equal to input. an automatic fader. 2:1 Every 1dB of gain beyond the rotation point is reduced by half eg +10dB gain above unity will be compressed to +5dB with a 5dB gain reduction. 5.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes None of these static gain adjustments is adequate to track the dyanics of the input signal to yield an output where the signal neither distorts nor is lost in the noise floor. This means that at high levels of compression eg 20:1 or infinity to one there is very little gain increase . Two devices which use a variable gain amplifier to achieve electronic gain control are the Compressor/Limiter and the Expander/Gate 5. In fact a combination of a. Higher compression ratios will require more input signal for a stronger output response. This sets the ratio of input to output dynamic range where unity gain equals 1:1 For example a 2:1 compression ratio setting means that the output dynamic range is half that of the input.1 Operation A compressor/limiter is a variable gain amplifier in which the dynamic range of the output signal is less than that of the applied input signal.5dB 136 . The Compressor/Limiter 5.

Instruments with fast attack times like percussion should not necessarily have their compression attack times set at 0ms. This is called Limiting. However these fast peaks add to the feeling of liveness and dynamism and because they are so fast they produce little noticeable distortion to analog tape (They may however distort in the digital medium). Threshold settings can be varied with the threshold pot The lower the threshold setting. If the threshold is lowered to -10dB the gain reduction of the same signal will increase to 10dB. In this way our perception of the dynamic range of a piece is kept intact ie we don’t actually hear the gain reduction occurring. A signal which is 10dB over unity and compressed by 2:1 will have gain reduction of 5dB. Vocal usually requires a moderate attack time. A little bit of attack time will preserve if not increase the sense of percussive attack. clarinet. 5. Large. The release time is defined as the time required for the processed signal to return to its original level. which is isually 63% of the final value of the gain reduction. short duration peaks do not noticeably increase the perceived loudness of a signal. Attack time is defined as the time it takes for the gain to decrease by a certain amount. the quicker the compression slope is implemented.4 Attack Time (dB/ms) Since musical signals vary in loudness they will be above the threshold at one instant and below it in the next. This leads to a gradual compression around threshold yielding a curve referred to as “Soft-Knee”. especially at higher ratios. In overeasy compression which is used in all DBX compressors. Thus attack time needs to be varied so that a balance is preserved between keeping the edge or peak attack of a signal.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes A ratio of infinity will give 100% gain reduction above threshold no matter what the input level is. The speed with which the gain is reduced after it exceeds the threshold is referred to as the Attack Time. Thus the amount of gain reduction is always related to the threshold setting. 137 . This smoother transition is considered to be less audible than an abrupt hard knee compression. 5. In general sounds with a fast attack eg percussion will require faster attack time settings than sounds which slowly swell eg flute. the increasing input signal gradually activates the gain reduction amplifier as the threshold is approached and full compression is not acheived until a short distance after the threshold point has been passed.5 Release Time (dB/sec) Setting the release time controls the closing of the compression envelope when the input signal falls below thr eshold. it can lead to a lifeless sound. Compression can be instantaneously applied at the threshold producing a standard “Hard Knee” curve. Why does attack time need to be varied ? The perception of the ear to the loudness of a signal is proportional to that signal’s average or RMS level. whilst compressing it fast enough to control the average level of the program. Altho this offers maximum protection from high level transisnts.

8 Using Compressors Some common compressor applications are: * * * * * * Fattening a kick or snare Adding sustain to a guitar of synth string sound. Typically. Smoothing out a vocal performance Raising a signal out of a mix Preventing sound system overload. 5.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes If release time is set too short and full gain is restored instaneously. Extremely short attack and release times are used so that the ear cannot hear the gain being reduced and restored. 5. Bar graph LED displays are used on some units to indicate amount of gain reduction. Pumping and Breathing can be heard due to the rapid rise of background noise as the gain is suddenly increased. eg bass guitar strings. tapes or discs. As the calibrations on the pots suggest. 138 . Since such a large increase in the input is required to produce a significant increase in the output. if rapid peaks were fed into the compressor.7 The Limiter When the compression ratio is made large enough (>10:1) the compressor becomes a limiter. the liklihood of overloading the equipment is greatly reduced. Thumps. Gentler release times control the rate at which gain is restored to create a smoother transition between compression and non-compression. 5.6 Metering Compressors usually have built-in metering to allow monitoring of the amount of gain reduction taking place. Peakstop/clipper: Limiting can be used to stop short term peaks from creating distortion or actual damage. Balance out the diffeent volume ranges of an instrument. Also. The meter reads 0VU when the input signal is below the threshold and falls to the left or right to indicate the number of decibels of gain reduction. A limiter is is used to prevent signal peaks from exceeding a certain level in order to prevent overloading amplifiers. the gain would be restored after each one and the overall gain of the program material would not be effectively reduced. Release times are on the whole much longer than attack times. sounds with a long decay envelope will require longer release time settings eg strings.

An expander or gate can be used to stretch the dynamic range of these noisy signals by adjusting the signal’s internal gain to attenuate the noise as the signal level drops. High levels are boosted. Low ratios below 4:1 produce controlled downward expansion with a smooth transition between signal 139 . They can be used as noise reduction devices by adjusting them such that noise is removed below the threshold level. eg A 2:1 ratio produces a 2dB change in output level for every 1dB of input.3 Expansion Ratio The ratio of input to output dynamic range. Low levels are attenuated. The higher the expansion ratio. while the desired signal will pass above the threshold. Certain expanders also perform the opposite function of raising the gain as the signal rises above the threshold. 6. amplified guitar sounds with hums and buzzes. EXPANDERS/GATES Many audio signals are restricted in their dynamic range by their very nature eg recordings with a high level of ambient background noise.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes * Reduce sibilance (de-esser) by inserting in the compression circuit a HPF that causes the circuit to trigger compression when a excess if High frequency signal is present. This interconnection mixes the outputs of the 2 units so that a gain reduction in one channel will produce an equal gain reduction in the other channel preventing the Centre Shifting of information in the stereo image. When a signal level is low ie below the expansion threshold gain is lowered and program loudness is reduced. Compressors are also used to reduce the gain of a stereo mix so that the average level will be booosted and the mix will sound louder. The ratio control changes the effect from an expander to a gate. 6. In this way. A 2 channel compressor in stereo mode or 2 identical compressors in link mode must be used. Expansion is the process of decreasing the gain of a signal as its level falls and/or increasing the gain as the level rises. the higher the dynamic range of the output signal. expanders increase the dynamic range of a program by making loud signals louder and soft signals softer. but become noticeable in its absence. The ratio controls downward expansion from 1:1 or unity to 20:1.2 Gate A simple expander whose parameters are set to sharply attenuate an input signal whenever its level falls below a threshold 6. The output is adjusted continuously by the input signal over its entire dynamic range.1 Operation An expander is a variable amplifier in which the dynamic range of the output signal is greater than the dynamic range of the applied input signal. 6. These noises are masked by the sound itself.

after the key signal falls below the threshold. For gating purposes where high ratios are used. In these circumstances a lower expansion ratio is used so that the lower levels of the program material will not be expanded downward to inaudibility. 6. 6. As the signal rises the expansion decreases.5 Range Determines the maximum amount of attenuation. All levels above the threshold will be passed without gain change. High ratios around 30:1 produce a gating effect where the signal is abruptly cut off.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes and noise reduction.6 Attack Adjusts the time to achieve unity gain after the input signal exceeds the threshold. Eg when the threshold is 0VU an expansion ratio of 2:1 will cause a -10dB input to produce a -20dB output. Instruments with strong sustain like guitar and piano. (0dB or no attenuation to 100dB or inaudible). When the gate is closed.. In this way the program material is expanded downward. the threshold is set at a high level perhaps at the maximum peak level of the input signal. 6. the high ratio will drop the signal below the level of audibility. A fast attack signal eg snare should have a fast attack setting so that its initial dynamism is not stifled. In the presence of a very low level signal the expander gain will be at minimum. 6.7 Hold Adjusts the period of delay before the onset of the release cycle. Using the gain control it is possible to attenuate only slight amounts. Transients with fast releases should have fast release settings so that background noise will be cut immediately. 6.4 Expansion Threshold The point below which expansion begins. ie the time it takes for the system to stop working. thus improving naturalness. A slow attack sinal should have slow attack setting so that noise is kept down as the signal develops. Use where quick rapid peaks succeed each other. This can produce a choppy quality to the sound. When the signal passes the threshold. 140 . the threshold is often set at the minimum level required to open the gate. eg voice material with lots of short pauses. In many applications it is not desirable for the signal to be gated off completely when the signal drops below threshold. the expander has reduced its expansion to unity gain and the transfer characteristic of the signal is linear. slow decay sounds like cymbals require loner release times. For expansion over a wide dynamic range.8 Release Adjusts the time for the gain to fall to the value set by the range control.

An equaliser can help "position" each instrument in a three-dimensional stereo image. The simplest and best advice is that equalisation should only be used when necessary.1 Equalisation Curve 141 . So equalisers are used not only to correct a particular sound. (SAE students: please use "Practical EQ" Demonstration Tape No. It can increase separation between instruments. to match the sound of another instrument. Another group uses equalisation indiscriminately. and the effect they have on sound. For example. (including both outboard and consolemounted models). you can use them during overdubs. The gate will only open when it “hears “ this frequency. 1). We will explore what makes them work. Finally. The trigger may be: • • • Another external signal. unwanted frequencies can be filtered out. or to produce a better blend of the sounds of different microphones. as well as different types of filter. Used for cleaning up a close miked drum signal so the other drums don’t interefere with the track. Expander Applications Controlling Leakage in the Studio Reducing Feedback on stage Mics Keying External sounds to a percussion track Ducking programs for Voice overs and paging Adding new dynamics to existing program material 4. but also as a creative tool. to getting that elusive sound which the producer can hear in his head. In this chapter we will examine different types of equaliser. Equalisers: What They Do At its simplest. • • • • • Equalisation & Filters Introduction To equalise or not to equalise? Many audio engineers subscribe to the idea that the least amount of equalisation is already too much.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes 6. we will explain the controls to be found on equalisers and filters. This control over the spectrum of the sound gives the engineer a good deal of creative freedom in changing timbre and harmonic balance. and show how they are used to arrive at desire effects.9 Key Input Most expanders have a key input control which allows a signal to act as a trigger to set off downward expansion. Key input is set on external (EXT) The same signal with equalisation or filtering. 1. the equaliser allows the engineer to cut or boost any frequency or group of frequencies within the audio spectrum. but never quite describe. Equalisation Characteristics 2. or you can use them to control the balance within a mix without resorting to great level changes. This is called frequency conscious gating . eg a snare can be used as a trigger to open the gate allowing reverb to sound with the snare. Its tasks range all the way from analysing and improving a control room's acoustics. By the same process. If the gate is opened by a high hat or cymbal the key input can be tuned to a bandpass frequency equivalent to that of the close miked drum. 2.

Q is determined as the ratio of bandwidth to the Center Frequency: Quality/Slope Rate = Center Frequency Bandwidth 2. For example. Frequencies on either side of this centre will also be boosted in proportion the the angle of the slope of the EQ curve. The distance between the frequencies at 3dB below peak on both sides of the curve is called the Bandwidth of the curve. 2. we must understand a little about the electronics of the thing. inductors and resistors. Inductors (such as voice coils). Resistors present an equal opposition to the signal at all frequencies. So-called resistance capacitors present a frequency-discriminating resistance call capacitive reactance (Xc).SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes An equaliser always selects a frequency range or bandwidth to control. Passive equalisers base their performance on electronically passive components like capacitors. Passive Eq can only cut or attenuate a selected frequency bandwidth. where: f is the frequency C is the capacitance 1 Xc = --------------2pfC From this it can be seen that a capacitor passes high frequencies better then low. Although one frequency is often specified. Capacitive reactance increases as the frequency decreases and can be expressed in the following formula.2 Bandwidth and Quality(Q) A typical bell-shaped equalisation curve has its centre frequency (fc) at the peak cut or boost point of the bell. present a frequency-related resistance called inductive reactance increases when the frequency increases. Equaliser Electronics Before we look at the equaliser's controls and talk about how to use them. an equaliser selects a +4dB boost at 5KHz. Most Equalisers are active types whose circuits involve altering the feedback loop of an op amp (operational amplifier). Active equalisers can cut or boost at selected frequencies therefore operate as defacto gain controllers in the signal flow.The principals can be briefly stated thus: the various control parameters of equalisation can be manipulated using three basic electronic components: resistors. where L = inductance: 142 . this will be the centre frequency of an equalisation curve with a slope on one or both sides.3 Active/Passive EQ Equalisers can be passive or active. The overall shape or Slope rate of the equalisation curve ie whether it is narrower or wider. is called the Quailty or Q of that curve. 3. A wider bandwidth will produce a wider Q and vice versa. capacitors and inductors. It can be calculated using the following formula.

Resistors dissipate power through their reactance. Of course other components. input/output gain adjustments. 5. With this type of equalisation the eventual equalisation curve flattens out. Controls Depending upon the type of equaliser or filter the following controls are available: Bandwidth: This control affects the number of frequencies which are being increased or decreased around the centre frequency. 143 . high pass and low pass filters. This frequency is generally 3dB below the maximum boost or cut. that is. Additional controls found on most graphic equalisers are. Frequency Selection: With this control we can choose the frequency that we wish to change. The main frequency affected by this change will be called the "centre" or "resonant" frequency. the frequency at which the slope begins to flatten out to a linear response. Equaliser Types 5. The equaliser can be identified by its turnover frequency. Roll Off Frequency: On a filter.1 Shelf EQ The typical bass and treble boost and cut equalisers. (See Diagram 2). after the boost or the cut. Shelving refers to the rise or drop in frequency response from a selected frequency which tapers off to a preset level and continues at this level to the end of the audio spectrum. such as balancing networks. (Measured in dB per octave). A shelf EQ will boost or cut all the frequencies above or below the selected frequency equally by the same amount. the rate at which the frequencies are decreased after the roll off point. 4. and a bypass switch. the frequency at which decrease in signal level starts to take place. amplifiers compensating for insertion losses and the like are also found in today's studio equaliser. but capacitors and inductors do not. Roll Off Slope: On a filter. Also known as "Q" or "Bell".SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes XL = 2 p f L Therefore an inductor has the opposite effect on the frequency response to that of a capacitor.

bandwidth) or it cannot be termed a "parametric" equaliser. The frequencies covered by these equalisers are divided into frequency bands. however they are still full parametric equalisers. in recording studios there are usually enough other equalisers able to do the same task. and decreases as signal frequency continues to increase. In a mixing console we usually only find equalisers with a maximum of four bands. The presence filter is usually found in television audio and broadcast studios. A parameric equaliser must have these three control parameters per band (frequency. They are generally used to eliminate simple hums or other pure-frequency spurious signals which may have crept in among a recorded programme.4 Presence Filter With the presence filter the engineer can modify frequencies within the ear's region of greatest sensitivity (2kHz to 4kHz). Equalisers of this type are limited in the number of frequencies and bandwidths that can be selected. The sweep EQ provides a bell curve boost or cut for centre frequencies not covered by the shelf EQ. which work on the LC circuit principle. 5. 5. One such is the Neumann W491. it reaches a maximum at that frequency. such as the Orban 672A or the 674A (stereo) which is an eight band equaliser. The unit is an outboard device which is switched across the applicable audio channel. Equalisers of this type can be referred to as peaking filters. There are also manufacturers offering equalisers of more than four bands.2 Parametric and Sweep Equalisers The difference between a parametric equaliser and a sweep equaliser is that the sweep equaliser can only control the frequency and the amount of cut/boost.3 Notch Filter A notch filter is a filter used to cut a single frequency. while the parametric is able to control the bandwidth or "Q" as well. Most consoles will provide a parametric midrange equaliser and shelved low and high frequency EQ. Some parametric equalisers do not provide a continuously variable bandwidth or frequency control. or to "lift" an instrument out from the background. cut/boost. or several single frequencies. The attenuation decreases as frequency rises to the boost frequency. simple units may have only a single band. At the selected frequency the sweep EQ will boost or cut with a predetermined Q and bandwidth. The presence filter has a preset "Q" and is used to increase intelligibility of speech. with a maximum boost (Note: boost only) of 6dB. These are outboard devices. 144 .SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes 5. The filter usually has 5 preset frequencies within the above range. The diagram shows the Orban 672A parametric equaliser. There is usually a control for selecting a particular centre frequency. sophisticated ones have up to four. One could also describe in as a "tuned peaking filter". The "Q" is adjusted in such a way as to enhance the resonance of the affected audio signal.

The "active" graphic equaliser uses active components and induces no signal level losses. to allow us to modify the room's acoustic characteristics to reach an optimally flat response. Graphic equalisers have a preset frequency and "Q" factor. say. The principal of operation is that two capacitors are connected in parallel. active and passive. which reflect its functions graphically by dividing the audio spectrum up into a number of separate "bands" which can be modified independently. a studio or a concert hall. see the chapter on Acoustics. In the passive design. Filtered bands alternate with unfiltered bands. they 145 .thus giving the impression that 3000Hz has been boosted. 5.6 Graphic Equalisers The graphic equaliser derives its name from its front panel layout. It is actually an amplifier. and little resistance at all other frequencies. the two resulting signals are sent left and right in the stereo mix. When they are connected in series.5 Comb Filter Imagine a series of filters (or alternating bands of a 1/3 or 1/6 octave graphic EQ). For example. Court Acoustics. covering the entire useful audio spectrum.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes 5. It is our most powerful tool for correcting acoustic problems in a room. The passive equaliser. This loss of level . This is a comb filter.Used to correct an audio signal. (so-called because of the shape of the resulting frequency graph) and they are commonly used in pairs to produce a pseudo-stereo signal from a monaural source. when instructed to boost. which is a job best left to the parametric equaliser. All those old recordings "electronically reprocessed for stereo" have been subjected to this process. however.typically about 35dB . There are many techniques fro this "tuning". the graphic equaliser smooths out the peaks and dips of the waveform rather than radically changing the character of the signal. and of course also has applications in smoothing out audio signals. a certain amount of signal level is lost during processing. That is. however. Klark Technik and Yamaha are leading manufacturers of both types. This results in a "cut" in the level of the chosen frequencies. the graphic equaliser is often used to "tune" a control room. Manufacturers such as White. a "flanger" or "comb filter" effect can be approximated by cutting and boosting alternate bands of a 1/3-octave equaliser to the maximum. will actually cut all other frequencies . The graphic equaliser can also be used for certain audio effects. For details. Because it is so well suited to smoothing out an audio waveform. The two equalisers are set such that the "spikes" of one match the "notches" of the other. There are two main types of graphic equaliser. 3000Hz. they provide great opposition to the signal applied to them at a particular frequency or group of frequencies. designed such that only certain frequencies are amplified (a unity gain amplifier). whilst the cut/boost is adjustable.must be corrected with an additional power stage.

The resonant frequency is that frequency at which. manufactured by White.1 Octave The centre frequencies selected (by the manufacturer) are one musical octave apart. This is especially noticeable when simultaneously boosting and cutting adjacent bands on the equaliser. 20. 630. 500. 300. which gives them the greatest number of equalisation possibilities. 6. and great resistance to all others.2 Half Octave Where the frequencies are 1/2 octave apart. offers very little precise control for studio or acoustic applications and is mostly used by musicians as a special effect to equalise a musical instrument. the greater the phase lag. 3. eg. The greater the boost and cut. 125. 50. 400Hz etc. 400. However. 50Hz. 200.6. All the above units come in stereo or mono versions. This provides a "boost" to the selected frequencies.6. eg. since it usually only possesses 8 selectable frequencies. 63. 50. between 20Hz and 150Hz. This equaliser provides 31 bands between 20Hz and 20kHz and is generally used for acoustic correction. 1. 800Hz. 5. This type of unit is exclusively used fro acoustic room correction.6. and on up to 20kHz.3 One-Third Octave Where the centre frequencies are 1/3-octave apart.5. There is just one commonly-used 1/6thoctave unit. Graphic equalisers fall into the following groups. There is no such thing as the "best" mixing console equaliser: 146 . 5. 31. 100. Mixing console equalisers are usually of the "combination" variety. 25. To minimize the phase lag (reaction time) one should never boost and cut adjacent frequencies or groups of frequencies. This 1/6-octave range is only used in the low frequency range. 160. 100.6.4 One-Sixth Octave Where the frequencies are 1/6th octave apart. The octave graphic equaliser.2kHz. This unit would be known as an 8-band graphic equaliser. eg.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes provide little resistance to the signal at the selected frequencies. 315. 200. it is also founds as an outboard device in most studios. and some older recording consoles had them installed in each channel. for a given setting. 75. 40. 5. The half octave equaliser is used in recording studios to smooth out the musical signal. 200Hz. Since the equaliser achieves its boost or cut of frequencies through reactive components. 100Hz. 80. the parallel circuit provides the greatest opposition and the series circuits the least opposition. according to the number of bands into which the audio spectrum is divided: 5.4kHz etc. as it offers excellent control in the low frequency area where studios have most of their problems. a certain amount of phase shift will be introduced into the signal path.6kHz. 150. 400Hz.

It is important to be aware of phasing problems which can be introduced by the equaliser if the bandwidth adjustment is too narrow. The parametric equaliser. 6. Used correctly. you place microphones next to all the various drums. during recording. boosts frequencies which are in this acoustic well. and not by memorising some frequency setting and applying them in all situations. If the producer wants these low frequencies. 6. Note that the change of equalisation will tend to change the balance of the rhythm section. Much of today's recording involves different studios and engineers for original recording and for mixdown. other than for effects.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes 6. destroys these natural waveforms with bad equalisation. during mixdown. the equaliser can blend sounds together. Too much equalisation can also destroy "natural" blending of instruments' waveforms.2 Bass: 60 to 300Hz. because all listeners' sound systems will be able to reproduce it. They have very limited application in modern music recording. When they are mixed. while excessive cut will result in a very thin sound. These frequencies are more felt than heard.1 Extreme low bass: 20 to 60Hz. Below is a guide to frequency ranges and their characteristics from the point of view of equalisation. and thus the feel of the basic body of the song. Equalisation is always done by ear. if the tape is played on radio. the second engineer. or by the use of a phase meter. A further problem is that. can also cause the most damage. they should only be applied for short periods of time. making your mix sound good in one control room by totally different in a second control room with a different monitor system. Phasing problems can be evaluated by switching the monitor system into mono. This bandwidth deserves the greatest time and attention in setting EQ. However this boost will be recognized by the tape and recorded. Excessive boosting of this range will result in a very "boomy" sound. Control rooms often have a problem known as an acoustic well which "swallows up" a certain frequency range. and you equalise these with parametric equaliser so that each sounds good on its own.4 Lower Midrange: 300 to 2500Hz. which increases their effectiveness. If one engineer. the radio station's limiters will react more severely to a tape with excessive bass. thereby making it sound quieter. This is the frequency range which is most important to the "feel" of music. it is likely that the natural waveforms will have been changed and the kit no longer sounds good as a whole. and can only be heard in other listening environments. It has a tendency to smooth out the uneven response of most control room monitor systems. cannot repair the damage. (Tip: If you divide your equalisation time for each instrument into 100 units you should spend about 50 units 147 . they will limit this tape more. there will be no apparent increase of these frequencies in the monitor system. although the most versatile. with the help of the parametric equaliser. Too much extreme low frequency content will "muddy up" the sound and send excessive level to tape. 6. As an engineer. If the engineer. it also contains most of the fundamental tones. Let's suppose that you have a drum kit which sounds good to the ear in the studio. Practical Equalisation The equaliser is the engineer's tuning instrument.

5 Upper Midrange: 2500Hz to 5000Hz. e. Boosting around 3kHz will add clarity to both without increasing the overall level.6 Presence" range: 5000Hz to 7000Hz. The 500Hz to 1kHz region produces 35% 148 .8 Vocal Equalisation Where the material to be recorded includes vocals. while others tend to become more dominant. and can also cause excessive tape/noise distortion. 25 units for the highs and 25 units for the lows). Speech fundamentals occur between 125 and 250Hz. 6. at around 125Óz. On the other hand. 6. only a small part of that range is responsible for clarity and intelligibility. The frequency range from 63Hz to 500Hz carries 60% of the power of the voice yet it contributes on 5% to the intelligibility. 6. Male fundamentals occur lower. contain little energy yet they are essential to intelligibility. m. because it tends to mask all other frequencies. Adding too much lower midrange will give a "telephone-like" quality. Cutting in this range will make the sound compressed and unclear. Excessive boost will create a thin and distant sound. The upper midrange is important for acoustic instruments and vocal sounds. but boosting these frequencies after processing through effects units will tend to add noise and a "hard" sound. an its clear transmission is therefore essential as far as the voice quality is concerned. vowels and consonants.7 High frequency range: 7000Hz to 20000Hz.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes in the midrange area. Adding too much around 800Hz will give a horn-like sound. Consonants. The vocal range of frequencies (including harmonics) is from 40Hz to 10kHz. Vowels essentially contain the maximum energy and power of the voice. boosting cymbals and vocals at around 12kHz will "clean up" their sound. Be aware that your ears will become fatigued quickly if there is an excess of 1 to 2kHz in the mix. applying a boost of 3 to 6dB around 5000Hz will add additional clarity to the record. This is a technique often used by disc cutting engineers. However. occuring over the range 350Hz to 2kHz. v) may be masked. However. Boosting this range when recording acoustic instruments will give "clarity" and "transparency". o. So we note here a few special considerations for vocal EQ. b. occuring over the range of 1. it creates the impression of extra loudness on the record. Vocal sounds can be divided into three main areas: fundamentals. 6. Boosting this range will make the sound thin and annoying. In addition. Some vocal sounds (eg.5kHz to 4kHz. they are naturally the most important element. sometimes curring a few dB in this range will give an instrument a warmer sound. The fundamental region is important as it allows us to tell who is speaking.

They are useful in mixdown. The spectrum of speech changes noticeably with voice level.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes intelligibility. while the range from 1kHz to 8kHz produces just %5 of the power by 60% of the intelligibility. there is no need to concern yourself with frequencies above 12kHz. and create new sounds. and improve vocal clarity. a cut in the 150Hz to 500Hz region will make it boxy. It is far simpler to change the level of an instrument with the fader than the equaliser. Dips around 500Hz to 1kHz produce hardness. For instance. Dips around 2kHz to 5kHz reduce intelligibility and make speech "wooly" and lifeless.the intelligibility and clarity of speech can be improved. to compensate for monitor characteristics.known as the "presence band" . The most effective way to use equalisation is to choose the frequency and apply maximum boost first. When equalising. where they can increase separation of instruments. It may be necessary sometime to cut a particular group of frequencies in order to give the impression that others have been boosted. to smooth out peals and dips. and at various levels. In the words of the great producer George Martin: "all you need is ears". Using an equaliser to boost all frequencies will only produce an overall increase of level. Remember that these stated equalisation frequencies are only guidelines and will have to be changed according to instrument and musical key (for acoustic instruments) in each case. they can equalise telephone lines and through-lines to compensate for losses in transmission. Of course. the engineer should bear in mind the final use of the material he or she is mixing. Peaks in the 5-10kHz band produce sibilance and a gritty quality. and equalisation can again be used to correct this. they can highlight feature instruments in the final mix. By rolling off the frequencies and accentuating the range from 1kHz to 5kHz . we don't always know the ultimate use of the product. is always advisable. Something similar applies to a dance mix where it is important to concentrate more on frequencies around 100Hz. Some Uses of Equalisation Equalisers can serve a multitude of purposes. 149 . if the produce is to be used on television. then cut back until the desired amount of equalisation has been reached. They are used to reduce noise and hiss. The diagram below shows the changes that occur in speech as the SPL (dBa) changes: 7. hollow or tubelike. 8. and they can simulate various acoustic effects. Finally. th can help emphasis certain psych-acoustic ey properties. not a change of EQ. to compensate for bad microphone positioning. so monitoring on a selection of speakers in the control room. Eq Chart A table showing the frequency ranges of various instruments and their "general equalisation points. Boosting the low frequencies from 100Hz to 250Hz makes speech boomy or chesty. whilst peaks around 1kHz and 3kHz produce a metallic nasal quality. Our ears tend to react more easily to boosted sound than to cut sound.

5kHz Trumpet Frequency Range Overtones Equalisation Bell 5kHz Attack 8kHz (E-D3) 160 to 1175Hz Up to 15kHz Fullness 120Hz to 240Hz 150 .100Hz Guitar (A) Frequency Range Overtones Equalisation (E-D3) 41 to 1175Hz Up to 12kHz Warmth 240Hz Clarity 2kHz to 5kHz Attack 3.5kHz Attack 7-10kHz (G-C4) 196 to 2100Hz Up to 10kHz Warmth around 240Hz Double Bass Frequency Range Overtones Equalisation Body 200Hz String Noise 2.5kHz (E-C1) 41 to 260Hz Up to 8kHz Fullness 80 .SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes Violin Frequency Range Overtones Equalisation String 2.

2kHz Air 6kHz (D2-C5) 587 to 4200Hz Around 10kHz Warmth 500-700Hz Oboe Frequency Range Overtones Equalisation Resonance 1.5kHz (B-F2) 247 to 1400Hz Up to 12kHz Body 300Hz Clarinet Frequency Range (D-G3) 147 to 1570Hz 151 .8kHz Fullness 80Hz Grand Piano Frequency Range Overtones Equalisation Clarity 2.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes Tuba Frequency Range Overtones Equalisation Resonance 500Hz Cut 1.2kHz Attack 4.5 to 4kHz Attack 8kHz (A2-C5) 27 to 4200Hz Over 13kHz Warmth 120Hz Flute (small) Frequency Range Overtones Equalisation Breath 3.2kHz (B2-A1) 29 to 440Hz Up to 1.

2kHz (C-C3) 130 to 1050Hz Around 8 to 10kHz Fullness 200Hz Bass Drum Frequency Range Overtones Not defined (low) Around 4kHz 152 .5kHz Air 5.5kHz (D-C) 73 to 130Hz Up to 4kHz Warmth 90Hz Bass Guitar (E) Frequency Range Overtones Equalisation Warmth 300Hz String 2.4kHz Scratch 4.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes Overtones Equalisation Harmonics 2.2kHz Up to 4kHz Bell 300Hz Tympani Frequency Range Overtones Equalisation Attack 2.5kHz (E-C2) 82Hz to 520Hz Up to 8kHz body 80Hz Viola Frequency Range Overtones Equalisation String 2.0kHz Air 4.

5kHz Air 10kHz Not defined Up to 10kHz Bell 220Hz Toms Frequency Range Overtones Equalisation Not defined Up to 3.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes Equalisation Box Sound 400Hz Cut 3.0kHz Body 120Hz Snare Drum Frequency Range Overtones Equalisation Hollow 400Hz Snare 2.5Hz Not defined Up to 8kHz Body 120 & 240Hz Cymbals Frequency Range Overtones Equalisation Clarity 7.5kHz Fullness 120Hz (Floor Tom) 240Hz (Hanging Toms) Cut 5kHz Guitar (E) Frequency Range Overtones Equalisation 5kHz (E-G3) 82 to 1570Hz Fullness 240Hz 153 .

Through the performance of the digital algorithms. Any time value based on divisions or multiples of this value will work with the song ie 250ms.SCHOOL OF AUDIO ENGINEERING A08– Signal Processors Student Notes Warmth 400Hz String 2. Pulsing When setting delay and reverb times one must be careful to ensure that these timings don’t conflict with the tempo of the piece of music. A ROM or factory memory of a typical digital reverb unit is programmed with a number of common types of reverbs such as hall. 2 secs.5 secs. slap echoes. Song tempos are expressed in beats per minute (bpm).5kHz DIGITAL SIGNAL PROCESSORS Microprocessors (DSP chips) programmed with algorithms which perform fast. If tempo is not known the pulse may be determined with a single delay placed on a the time keeping sound such as snare. Pulsing is the method of determining which timings can be used with a particular tempo. 63ms. these delays are organised into defined sets of random delay patterns resulting in different reverb characteristics ie simulations of different environments. Some units have up to six parallel delays which can be used to produce echo clusters and panning effects. long spacey echoes. 1. Within these presets is more or less flexibility to adjust various reverb and delay parameters to create user defined variations. x 1000 Pulse in milliseconds = 60 bpm A song with a tempo of 120bpm will have a 500 ms pulse for each of its beats. This binary signal is fed into a shift register or Buffer where it is temporarily stored or delayed for a user defined amount of time before being read out of the buffer by a clocking oscillator. Vary the delay time until it falls exactly on the next beat and this value can be used as the pulse 3. Some common delay effects are doubling. infinite repeat which is a simple form of sample looping. and flanging. 1. room. stadium etc. 2. The input signal is regenerated into a series of closely spaced digital delays. 31ms etc or 1 sec. number of repeats and amount of feedback or regeneration. 125ms. Modulation Parameters 154 . plate. intensive calculations on the incoming signal. DDLs may have flexible parameters for configuring different types of delays such as delay time. The DDL converts the signal into a digital PCM (Pulse Code Modulation) form. Digital Delay Line (DDL) The DDL produces a series of discrete repeats of the input signal at regularly spaced and user-defined time intervals. etc.

Modulation Parameters

3.1 Sweep range
A delay with a 2:1 sweep range will sweep over a 2:1 time interval (5ms to 10ms, 100ms to 200ms). A wide sweep range or full depth will give dramatic flanging effects.

3.2 Modulation Type
Different waveform shapes which are used by the LFO:
Sine and Triangle: vary the delay time from maximum to minimum in a cyclical fashion (sine being smoother than triangle); often used with chorus and flange.
Square: switches between two delay times.
Random: changes delay times at random.

3.3 Modulation Rate
Sets the modulation frequency of the LFO. Typical rates are 0.1Hz (1 cycle every 10 secs) to 20Hz. A slower rate produces a slow, gradual detuning. A faster rate with full depth will sound more out of tune than a slower rate with full depth.

Modulation amount, depth
Determines how much the modulation section or LFO varies the delay time. Rate and depth interact with each other to produce the total amount of pitch change and how often it oscillates. With flanging and chorusing, modulation causes the original pitch to go flat to a point of maximum flatness, then return to the original pitch and start the cycle all over. With chorus, as depth increases, so does the detuning of the sound.

Setting up Reverb devices

Setting-up functions differ from device to device. The first step to setting up a device is to have a look at the device manual or documentation. Most reverb units are mono in, stereo out, but generate different reflection patterns for the L and R audio channels. These two independent signals are assimilated by our stereo hearing mechanism, enabling us to perceive spaciousness.

Adjustable Parameters

Input level: [dB] Determines the level of the signal at the input stage of the unit. The actual input level is program dependent, but a good starting point is to set the input level as high as possible before distortion. This will obtain the best signal-to-noise ratio.

Pre-Delay: [or Init. delay, 1ms to 400ms] Delays the signal before it can get to the reverb cluster. It indicates the length of time between the sound source and the onset of the reverb. This gives the effect of separating the dry sound from the reverb, where there is a slight gap before the onset of the reverb. Longer pre-delay times will add a greater sense of clarity to the track.

General Pre-delay guidelines:

Up tempo drums/percussion: 25 to 50 ms
Ballad drums: 40 to 80 ms
Acoustic instruments: 45 to 80 ms
Vox: 75 to 125 ms
Strings: 100 to 200 ms
Brass: 50 to 100 ms

ER level: [0 to 100%] In simulating ambient spaces there is a distinct echo prevalent at the beginning of the sound. These are the first or early reflections, from the nearest reflective surfaces that the direct sound comes into contact with. More ER level on a particular instrument can reinforce the sound of the instrument, which will help project the perceived 'size' of the instrument. Less ER level will make the mix less cluttered.

Decay time: [0 to 12s] Long decay times simulate larger ambient spaces, while shorter decay times put the instrument in smaller environments. Longer decay times on each track will reduce instrument definition and lead to a somewhat cluttered, sluggish mix. Try to use shorter decay times, which will allow each layer [track] in the mix some place in which to be heard. The general rule regarding decay time: as long as needed, but no longer than necessary.

Diffusion: [0 to 10] The complexity of the reflections and how spread out the cluster is. As the diffusion value is increased, the complexity of the reflections increases, producing a larger, more diffused reverb cluster field, which can be described as closely packed, tight reverb [thicker], as one cannot perceive the gaps between the reflections. If set to minimum complexity, a cleaner reverb effect is produced and less of a diffused reverb field builds up.

Density: [0 to 10] This parameter determines the density of the reverb reflections, that is, the average amount of time between reflections. Higher settings produce a more dense reverb. Lower settings produce minimum reverb density and lead to a more spacious sound in which one can perceive the individual echoes between reflections.

A simple algorithmic sketch of how a few of these parameters map onto a basic reverb structure is given below.
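The following is a minimal, illustrative Schroeder-style reverb sketch in Python. It is not the algorithm of any particular unit; it is only meant to show where pre-delay and decay time plug into a basic comb-filter structure. The delay constants and function names are our own assumptions.

```python
def comb(x, delay, feedback):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        y[n] = x[n] + (feedback * y[n - delay] if n >= delay else 0.0)
    return y

def simple_reverb(x, sr=44100, pre_delay_ms=30.0, decay_s=1.8):
    """Pre-delay plus four parallel combs whose feedback is set from the decay time."""
    pre = int(sr * pre_delay_ms / 1000.0)
    x = [0.0] * pre + list(x)                      # pre-delay: gap before the cluster
    comb_delays = [int(sr * t) for t in (0.0297, 0.0371, 0.0411, 0.0437)]
    out = [0.0] * len(x)
    for d in comb_delays:
        # RT60 relation: choose a gain so the loop decays by 60dB in decay_s seconds
        g = 10.0 ** (-3.0 * d / (sr * decay_s))
        for n, v in enumerate(comb(x, d, g)):
            out[n] += v / len(comb_delays)
    return out

# Impulse in, reverb tail out
tail = simple_reverb([1.0] + [0.0] * 44099)
```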

Feedback/Regeneration: [%] This is a variable control that sends the output of the delay unit back into the input. It creates multiple repeats. This parameter should be used moderately, as an excessive amount will result in uncontrollable feedback [or a squealing loop].
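A compact sketch of a delay line with feedback (regeneration) and an LFO-modulated delay time, the basis of the chorus/flange behaviour described earlier. This is illustrative Python only, not any real unit's code; all parameter names are assumptions.

```python
import math

def modulated_delay(x, sr=44100, delay_ms=5.0, depth_ms=3.0,
                    rate_hz=0.5, feedback=0.3, mix=0.5):
    """Delay whose time is swept by a sine LFO; feedback regenerates repeats."""
    buf_len = int(sr * (delay_ms + depth_ms) / 1000.0) + 2
    buf = [0.0] * buf_len
    out = []
    write = 0
    for n, dry in enumerate(x):
        # LFO sweeps the delay between (delay - depth/2) and (delay + depth/2)
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sr)
        d = (delay_ms + 0.5 * depth_ms * lfo) * sr / 1000.0
        read = (write - d) % buf_len
        i = int(read)
        frac = read - i
        wet = buf[i] * (1.0 - frac) + buf[(i + 1) % buf_len] * frac  # linear interpolation
        buf[write] = dry + feedback * wet       # regeneration back into the line
        write = (write + 1) % buf_len
        out.append((1.0 - mix) * dry + mix * wet)
    return out
```

Keeping the feedback value well below 1.0 is the code equivalent of the warning above: at or beyond unity the repeats no longer die away and the loop runs away.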


AE09 – Microphones

1. Introduction to common Input transducers
2. Transducer types
2.1 Carbon Granules
2.2 Piezo Electric
2.3 Capacitor/electrostatic
2.4 Moving coil (Dynamic)
3. Pick-up or directivity patterns
12.1 Pressure operation
12.2 Pressure Gradient operation (figure-of-eight)
12.3 Cardioids
12.4 Hypercardioid
12.5 Highly Directional Microphones

Microphone Techniques
1. Mono Microphone Techniques
1.1 Distant Microphone Placement
1.2 Close Microphone Placement
1.3 Accent Miking
1.4 Ambient Miking
2. Stereo Miking Techniques
2.1 AB or Spaced Pair
2.2 XY, and Mid-side (MS) or Coincident Pair
2.3 Near Coincident or OSS (Optimal Stereo Sound)

AE09 – MICROPHONES

1. Introduction to common Input transducers

Input transducers are devices for producing an electrical voltage which is, as far as possible, a replica of the sound waves striking the sensitive part of the transducer. Examples of input transducers are guitar pickups, contact pick-ups, the stylus of a record player and, of course, a microphone. The microphone is the main focus of this section, but it should be noted that transducer basics apply to all input transducers.

Some important characteristics of microphones for professional use are:

i. Sensitivity: the electrical output should be as high as possible, so that the audio signal produced by the microphone is high relative to the system noise produced in other equipment down the signal path, like mixing consoles, which would be present even if a perfect, noiseless microphone were used.
ii. The frequency response should be as flat as possible over the frequency range 20Hz to 20kHz. In some microphones there may be some advantage in having a frequency response that is not flat; an example of this would be a reduced bass response for public address microphones.
iii. Self-generated electrical noise should be low. This is often quoted as the equivalent acoustic noise (in dBA).
iv. The microphone's response to starting transients is important. Essentially, it is a matter of the diaphragm's ability to respond rapidly to the almost violent changes in sound wave pressure which can sometimes occur at the start of a sound.
v. The polar pattern should vary as little as possible with frequency.
vi. Environmental influences like humidity and temperature should have little or no effect on the microphone's output.
vii. Repair and maintenance should be quick and accessible.

2. Transducer types

The basic microphone is made of:

The diaphragm: a thin, light membrane which moves in the presence of soundwaves so that its movement accurately follows the sound wave pressure vibrations.
The transducer: the diaphragm is hooked up to the transducer. The transducer, as mentioned earlier, produces a voltage that is a precise analogue of the diaphragm movements. The diaphragm and the transducer will have to be looked at together as transducer systems.
The casing: the significance of the casing is that it determines the directional response of the microphone.

Some transducer systems are:

2.1 Carbon Granules

Carbon granules are loosely packed in contact with the diaphragm in such a way that when the diaphragm moves, it moves the granules. This movement causes the granules to become alternately more closely and less closely packed, which affects the resistance of the granules, and therefore a DC current flowing through the granules will be modulated. Used in the past in telephones.

2.2 Piezo Electric

Some crystals, such as barium titanate and quartz, create an e.m.f. when they are deformed. By attaching a diaphragm to such a crystal, they can work as a transducer. This has the disadvantage of high impedance.

2.3 Capacitor/electrostatic

These use a conducting diaphragm (for example, plastic coated with a metal deposit) positioned near to a rigid metal plate so that the two of them form a capacitor. The combination of diaphragm and back plate is known as the capsule. Capacitance changes occur as a result of the diaphragm moving in the presence of sound waves. The plates have to be charged, and there are two ways to charge them. One is to use a DC source of about 50V; the other is to use substances called electrets for the diaphragm.

Using a DC source, the most common way is the 48V standard, also called phantom powering. In this system, the 48V DC supply is connected to the microphone terminals on the console. The DC voltage is carried on the two signal wires of the microphone cable. This means there is no potential difference between the two wires. The 0V side of the DC supply (return) is connected to earth. This connection ensures that microphones connected to the desk that do not use the phantom power will not be adversely affected by the DC voltage, as the net result of the 48V at the transducer is zero.

An electret is a substance that almost permanently retains an electrical charge. When it is used in a microphone, there is no need for the polarizing DC voltage.

2.4 Moving coil (Dynamic)

This is the most widely used type. They depend on the production of an e.m.f. in a small, lightweight coil attached to a diaphragm. Movements of the latter as a result of sound wave pressure cause alternating voltages to be produced. Such voltages are small, in the order of 0.5-1mV for speech, and the impedance is commonly about 30 ohms. The coil is often made of aluminium for lightness. The coil/diaphragm assembly is heavier than the diaphragm of an electrostatic microphone, and this affects the transient response.

3. Pick-up or directivity patterns

An important characteristic of a microphone is its response to sound arriving from different directions. This is also called the polar pattern. The directivity pattern or polar diagram of a microphone design depends on how the design allows soundwaves to strike the diaphragm.

12.1 Pressure operation

With this design, sound waves can only strike the diaphragm from the front. This means the movement of the diaphragm is completely determined by the direct soundwave pressure. It can be assumed that sound coming from any direction will be able to strike the diaphragm. Those coming from 180° will strike the diaphragm by diffraction, assuming their wavelength is large enough. The polar pattern of a pressure-operated microphone is therefore determined by the frequency and by the shape and diameter of the microphone. This means, obviously, that the microphone will show some directional characteristics as frequency increases.

12.2 Pressure Gradient operation (figure-of-eight)

The basic design of this type of microphone is such that soundwaves have equal access to both sides of the diaphragm. The microphone theoretically gives equal outputs for sounds at 0° and 180°, but sounds from 90° result in zero output because they cancel out. A characteristic of pressure gradient microphones is an exaggerated output at low frequencies when the source is close to the microphone. This is called the proximity effect; microphones designed on the pressure gradient principle exhibit it.

12.3 Cardioids

This polar pattern does not respond to sounds coming from behind the microphone (180°), and it is very useful in practice. The cardioid pattern is achieved by using a phase shift principle. An alternative path for soundwaves to the rear of the diaphragm (180°) is created by an aperture in the casing. The sound waves which come in through this aperture are then delayed by inserting an acoustic labyrinth in their path. This delay is such that sound which originates from behind the diaphragm and goes into the aperture is delayed long enough for the same sound, which diffracts round the microphone, to reach the diaphragm. The resultant pressure at the diaphragm for sound originating from behind the microphone will therefore be zero. Cardioid designs are an extension of the pressure gradient operation and also exhibit proximity effects, albeit not as much as the pressure gradient.

12.4 Hypercardioid

A hypercardioid microphone can be thought of as an intermediate between a pressure gradient and a cardioid. The main characteristic of this design is that its frontal lobe is narrower, which makes it more directional, but at the same time a rear lobe develops, which means it picks up sound from behind as well. (A numerical comparison of these pickup patterns is sketched at the end of this section.)

12.5 Highly Directional Microphones

Highly directional microphone pick-up patterns can be achieved in two ways:

i. The use of a parabolic reflector. A parabolic reflector with the microphone at the focus can create very narrow lobes. But of course this will only be the case for frequencies above that whose wavelength equals the diameter of the parabolic dish. In practice, the highly directional pickup pattern will only start from about 2kHz; below this frequency the system will be more or less omni-directional.

ii. The use of interference tubes (gun or rifle microphones). A long slotted tube is placed in front of the diaphragm. Sounds that arrive off the axis of the tube enter the slots and arrive at the diaphragm at different times, depending on which part of the tube they enter. There will therefore be phase cancellations. The most cancellation occurs for frequencies with wavelengths less than the acoustic length of the tube; most tubes are about 50cm long. This microphone design is used mainly for news gathering in radio and TV work.
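The first-order patterns mentioned above can be compared numerically. The sketch below uses the standard textbook polar equations (an illustration, not measurements of any particular microphone):

```python
import math

# First-order polar responses: gain = A + B*cos(theta), with A + B = 1
PATTERNS = {
    "omni (pressure)": (1.00, 0.00),
    "figure-of-eight": (0.00, 1.00),
    "cardioid":        (0.50, 0.50),
    "hypercardioid":   (0.25, 0.75),
}

def gain(pattern, angle_deg):
    a, b = PATTERNS[pattern]
    return a + b * math.cos(math.radians(angle_deg))

for name in PATTERNS:
    print(f"{name:16s}  90 deg: {gain(name, 90):+.2f}   180 deg: {gain(name, 180):+.2f}")
# cardioid -> 0 at 180 deg; figure-of-eight -> 0 at 90 deg; the hypercardioid
# shows a smaller, opposite-phase rear lobe, as described above.
```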

Microphone Techniques

Proper miking technique comes down to choosing the right mic and positioning it properly. All recording situations will have inherent limitations such as availability of mics, size of the recording space and unwanted leakage. These considerations, coupled with the desired effect, i.e. what sort of recorded sound you are after, will determine the basic application of miking techniques. Microphone placement is broken down into 2 broad categories: mono miking, the use of single or multiple single mics, and stereo miking, which uses pairs of mics to capture the soundfield in a way which emulates certain binaural features of the ear. Often mono and stereo sources are mixed together.

1. Mono Microphone Techniques

The basic idea of mono miking is the collection of various mono sources of sound for combination in a mix for a simulated stereo effect. There are 4 mono miking styles of mic placement, directly related to the distance of a mic from its sound source. Mic placement depends on the size of the sound source and the reverberant characteristics of the room.

1.1 Distant Microphone Placement

The positioning of one or more mics at 3 feet or more from the sound source. Such a distance picks up a tonally balanced sound from the instrument or ensemble and also picks up the acoustic environment, i.e. reflected sound. Distant miking is often used on large ensembles such as choirs or orchestras. One must try to strike an overall balance between the ensemble and the overall acoustics. Using this style provides an open, live feeling to the sound.

A problem with distant miking is that reflections from the floor which reach the mic out of phase with the direct sound will cause frequency cancellations. Moving the mic closer to the floor reduces the path length of the reflected sound and raises the frequency of cancellation. A height of 1/8 to 1/16 inches will keep the lowest cancellation above 10kHz. The Pressure Zone Mic (PZM) is an electret-condenser mic designed to work on a boundary such as a wall or a floor. Phase cancellation is eliminated because the mic, located at the point of reflection, adds direct and reflected sound together, resulting in a smoother overall frequency response. The PZM is therefore well suited for distant miking applications. The cancellation arithmetic is sketched below.
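A rough way to see why boundary mounting helps: the first cancellation falls where the reflected path is half a wavelength longer than the direct path. The snippet below is an illustrative approximation only; it simply treats the extra path length as about twice the mic height, which holds only for sound arriving from well above the mic.

```python
SPEED_OF_SOUND = 343.0  # m/s

def first_cancellation_hz(mic_height_m: float) -> float:
    """First comb-filter null when the floor reflection lags the direct sound."""
    extra_path = 2.0 * mic_height_m          # crude geometric assumption
    return SPEED_OF_SOUND / (2.0 * extra_path)

for inches in (12.0, 1.0, 0.125):            # 1 ft, 1 inch, 1/8 inch
    h = inches * 0.0254
    print(f"{inches:>6} in  ->  first null near {first_cancellation_hz(h)/1000:.1f} kHz")
```

With the mic 1/8 inch from the boundary the first null lands well above 10kHz, which is the effect the PZM exploits.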

1.2 Close Microphone Placement

The mic is placed 1" to 3' from the source. This creates a tight, present sound quality which effectively excludes the acoustic environment; only direct, on-axis sound is captured. It is a very common technique in studio and live sound reinforcement applications where lots of unwanted sound (leakage) needs to be excluded. Multitrack recording often requires that individual instruments be as "clean" as possible when tracked to tape.

Miking too close may colour the recorded tone quality of a source. A common technique when close miking is to search for the instrument's "sweet spot" by making small adjustments to mic placement near the surface of the instrument. The sweet spot is where the instrument sounds fullest and richest. Small variations in distance can drastically alter the way an instrument sounds through the mic.

1.3 Accent Miking

A not-too-close miking technique used to highlight an instrument in an ensemble which is being picked up by distant mics. The accent mic will add more volume and presence to the highlighted instrument when mixed together with the main mics.

1.4 Ambient Miking

An ambient mic is placed at such a distance that the reverberant or room sound is more prominent than the direct signal. The ambient mic is used to enhance the total recorded sound in a number of ways:

a. to restore natural reverb to a live recording
b. to pick up audience reaction in a live concert
c. to add the studio room's acoustics back in to a close-miked recording in a studio.

2. Stereo Miking Techniques

The use of two identical microphones to obtain a stereo image in which sound sources can be identified by location, direction and distance. Stereo miking methods rely on principles similar to those utilized by the ear/brain to localise a sound source. These methods may be used in close or distant miking setups to record ensembles, orchestras or individual sound sources, live or in the studio.

2.1 AB or Spaced Pair

The two mics (omni or cardioid) are placed quite far from each other to preserve a L/R spread or soundstage. The AB method works on the arrival time differences between the two mics to obtain the stereo image. This is similar to the ear utilizing interaural arrival time differences to perceive direction.

Assignment 3 – AE003

AE11 – Digital Technology

1. Advantages of Digital Audio over Analog Audio
2. Binary numbers
3. Conversions of Binary numbers to Decimal and vice versa
4. Sampling
5. Aliasing
6. Quantization
6.1 Quantization Error
6.2 Dither
6.3 S/E Ratio
7. Pulse Code Modulation
8. Linear PCM recording section
8.1 Dither Generator
8.2 Input Low Pass Filter
8.3 Sample and Hold
8.4 Analog to Digital conversion
8.5 Record processing
9. Error protection, correction and concealment
9.1 Modulation process
10. Digital Audio Reproduction
10.1 Demodulation Circuits
10.2 Reproduction Processing
11. Digital to Analog Conversion
11.1 Output sample and hold
11.2 Output Low-Pass Filter and Oversampling


AE11 – DIGITAL TECHNOLOGY

Introduction

Audio was previously recorded, stored and reproduced by analog means and mediums. Now, due to advances in digital technology, audio can be stored and reproduced in digital form. Examples of this technology are CD players, DAT players, digital consoles and digital samplers. For an audio engineer today, it is vital to understand the underlying digital theories and concepts behind these technologies.

1. Advantages of Digital Audio over Analog Audio

Digital audio can exist in a non-linear form, in comparison with analog audio. This non-linearity provides more flexibility in audio editing processes. (Analogue editing compared with a DAW such as Pro Tools.)

There is virtually no degradation of the original signal when copying. A perfect copy of the original source can be made in the digital domain, whereas in the analog domain tape noise and hiss are added to the duplicates. (Cassette dubbing compared to CD burning.)

Digital audio can be stored on more reliable media such as CDs, MDs and hard disks. These storage mediums have a longer life expectancy than analogue storage mediums like tape. (CDs last longer than cassette tapes.)

Analogue technologies have more circuitry that adds noise and distortion to the signal than digital technologies. Therefore digital technology has more dynamic range than analogue technology. (A 16-bit digital audio recorder has a dynamic range of 97.8dB, a value around 30dB above the noise figure of most conventional analog tape recorders.)

Analog Audio:
- Obvious generation loss
- Noise added during copying
- No perfect copy can be made
- Can only be stored on a limited range of analog media
- Cannot be manipulated by computer

Digital Audio:
- No generation loss
- No noise added during copying
- Perfect copies can be made
- Can be stored on a large number of digital media
- Can be manipulated by computer

2. Binary numbers

Binary numbers are used to represent data in the digital realm. In this system there are only two states (a "0" state and a "1" state). Binary digits used in digital devices are called bits (short for Binary digITS). All numbers in the decimal system can be represented by the binary system. The decimal system is known as base 10 because there are 10 numbers (0-9) to represent all the figures in the system.

The binary system is known as base 2 because it has only 2 numbers (0 and 1) to represent all the figures in its system. Digital devices need a fixed length of binary numbers called a "word". Word length refers to the number of digits and is fixed in the design of a digital device (e.g. 00000001 is an 8-bit word). For example, the Alesis XT20 uses a 20-bit word length.

The following is a further illustration of the above-mentioned principle.

For systems with a 2-bit word length there can be only four representations:
00 01 10 11

For systems with a 3-bit word length there will be eight representations:
000 001 010 011 100 101 110 111

This can be calculated by the following formula:

Number of representations = 2 to the power of n (where n = word length)

Therefore a 16-bit system would have 2 to the power of 16, which is equal to 65536 representations.
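A quick way to verify these word-length figures; this is illustrative Python only, not part of the original notes:

```python
def representations(word_length: int) -> int:
    """Number of distinct values a word of the given bit length can hold."""
    return 2 ** word_length

for bits in (2, 3, 8, 16, 20, 24):
    print(f"{bits:>2}-bit word: {representations(bits)} representations")
# 16-bit word: 65536 representations, as quoted above
```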

3. Conversions of Binary numbers to Decimal and vice versa

For example, a 5-digit word can represent all combinations of numbers from 0 to 31 (2 to the power of 5), where:

00000 (Bin) = 0 (Dec)
00001 (Bin) = 1 (Dec)
11111 (Bin) = 31 (Dec)

The first bit in a word is known as the Most Significant Bit (MSB) and the last bit is known as the Least Significant Bit (LSB). One BYTE = eight bits and one NIBBLE = 4 bits.

The word length of a digital device becomes the measure of the "resolution" of the system, and with digital audio better resolution means better fidelity of reproduction. (24-bit systems are superior to 16-bit systems.)

4. Sampling

In the digital realm, recordings of analog waves are made through periodic sampling. That means that when a sound wave is recorded, snapshots of the wave are taken at different instances. These snapshots are later examined and given a specific value (a binary number). This process is called discrete time sampling. The sampling rate of a digital system is defined as the number of snapshots or samples that are taken in one second. Therefore a device with a sample rate of 44.1kHz takes 44100 samples per second.
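The conversions above can be checked with a few lines of illustrative Python (not from the notes):

```python
def bin_to_dec(bits: str) -> int:
    """Each bit is weighted by a power of two, MSB first."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value

def dec_to_bin(value: int, word_length: int = 5) -> str:
    return format(value, f"0{word_length}b")

print(bin_to_dec("00000"), bin_to_dec("00001"), bin_to_dec("11111"))  # 0 1 31
print(dec_to_bin(31))                                                 # 11111
```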

171 . 5. For the current standards a 16-bit (65563 steps) representation is acceptable. Aliasing Alias frequencies are produced when the input signal is not filtered effectively to remove all the frequencies above half of the sampling frequency. 16. and F is a frequency higher than half the sampling frequency. Quantization Quantization is the measured amplitude of an analog signal at a discrete sample time. which is represented by the word length used to encode the signal. the number of bits in a signal such as 8. we must sample at a rate which twice the highest through put frequency to achieve loss less sampling.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes According to the Nyquist theorem. which were not present in the original audio signal. S samples per second are needed to completely represent a waveform with a bandwidth of S/2 Hz. The solution to aliasing is to band limit the input signal at half the sampling frequency using a steep slope HPF called an anti-aliasing filter. The sample continues to produce samples at a fixed rate. outputting a stream of false information caused by deviant high frequencies. Therefore. Therefore for a bandwidth of 20Hz-20kHz. one must use a sampling frequency of at least 40 kHz.e.12. No matter how fine we make the resolution we cannot fully capture the full complexity of an analog waveform. These are called alias or fold over frequencies. i. i. it is a necessity to send the recording signal through a low pass filter before the sampling circuit to act in accordance with the Nyquist Theorem. For example if S ids the sampling rate. This is because there are no longer adequate samples to represent the deviant high frequencies.e. 20. 6. This false information takes the form of new descending frequencies. 24 bits. then a new frequency Fa is created where Fa = S-F. The accuracy of quantization is limited by the system's resolution.

6.1 Quantization Error

This is the difference between the actual analog value at the sample time and the value of the chosen quantization interval, i.e. the difference between the actual and measured values. Quantization error is limited to plus or minus half an interval at the sample time. At the system output, this error will be contained in the output signal. It will sound like white noise that is heard together with the program material. However, quantization noise changes with amplitude, becoming distortion at low levels: particularly at low levels, the error becomes measurable distortion that is correlated with the signal. In addition, there will be no noise in silent passages of the program.

6.2 Dither

Although quantization error occurs at a very low level, its presence must be considered in high-fidelity music. To fix quantization errors, a low-level noise called dither is added to the audio signal before the sampling process. Dither randomises the effect of quantization error: it removes the distortion of quantization error and replaces it with low-level white noise.

6.3 S/E Ratio

Signal to error ratio is closely akin, although not identical, to signal to noise ratio. Whereas signal to noise ratio is used to indicate the overall dynamic range of an analogue system, the signal to error ratio of a digital device indicates the degree of accuracy used when encoding a signal's dynamic range with regard to the step-related effects of quantization. The signal to error ratio of a digital device can be formulated as follows:

S/E = 6N + 1.8 (dB)

where N is the word length in bits. For example, a 16-bit system gives S/E = (6 x 16) + 1.8 = 97.8dB, the dynamic range figure quoted earlier.

7. Pulse Code Modulation

A method of encoding digital audio information which uses a carrier wave in the form of a stream of pulses which represents the digital data. The original analog waveform is sampled and its amplitude quantized by the analog to digital (A/D) converter. Binary numbers are sent

8. This is done in less than 20 microseconds. The circuit must determine which quantization increment is closest to the analog waveform's current value. Variations in absolute timing called jitter can create modulation noise. The dither causes the audio signal to constantly move between quantization levels. In a 16-bit linear PCM system each of the 65 536 increments must be evenly spaced throughout the amplitude range so that even the LSBs in the resulting word are meaningful. 8. Gaussian white noise is often used. The noise should resemble noise from analog systems.4 Analog to Digital conversion This is the most critical component of the entire system. which is accomplished by the S/H circuit. The ideal LPF would have a "Brick wall" cut off. Thus the speed and accuracy are the key requirements for an A/D converter.3 Sample and Hold The S/H circuit time samples the analog waveform at a fixed periodic rate and holds the analog value until the A/DC outputs the corresponding digital word.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes to the storage medium as a series of pulses representing amplitude. An S/H circuit is essentially capacitor and a switch. Samples must be taken precisely at the correct time.5 Record processing After conversion several operations must take place prior to storage: 173 . but this is very hard to achieve. 8. time information is stored implicitly as samples taken at a fixed periodic rate. If two channels are to be sampled the PCM data may be multiplexed to form one data stream.2 Input Low Pass Filter The analog signal is low-pass filtered by a very sharp cut-off filter to band limit the signal and its entire harmonic content to frequencies below half of the sampling frequency. and output a binary number specifying that level. On playback the bit stream is decoded to recover back the original amplitude information at proper sample times and the analog waveform is reconstructed by the digital to analog converter (DAC). Thus proving a guard band of 2kHz to ensure the attenuation is sufficient. 8.1 Dither Generator An analog noise signal is added to the analog signal coming from the line amplifier. In professional recorders with a sampling frequency of 48kHz. which is very easy on the ear. the input filters are usually designed for 20Hz-20kHz. Linear PCM recording section 8. In audio digitization. The data is processed for error correction and stored. Maintaining absolute time throughout a digital system is essential. 8.

Raw channel is properly encoded to facilitate storage and later recovery. data are either recovered correctly. is proportional to the amount of redundancy. it is only necessary to reverse the state and it must be right. This is the purpose of error correction. Clearly the more failures which have to be handled. Error protection. a bit is either correct or wrong. 9. The audibility of a bit error depends upon which bit of the sample is involved. this is graceful degradation. and. which can be handled. However. table of contents. Clearly a means is needed to render errors from the medium inaudible. the more redundancy is needed. concealments occur with negligible frequency unless there is an actual fault or problem. magnetic tape is an imperfect medium. Small amounts of noise are rejected. Whatever the medium and whatever the nature of the mechanism responsible. and the fourth one is redundant. the amount of error. copyright information. However the A/DC outputs parallel data. the plane can still fly. correction and concealment As anyone familiar with analog recording will know. An error of this kind is called a burst error. This is done by coding the data by adding redundant bits. Several types of coding are applied to modify or supplement the original data. which are in error. in a well-designed system. or suffer some combination of bit errors and burst errors. even Time Code can be added. Adding redundancy is not confined to digital technology: airliners have several engines and cars have twin braking systems. which in analog recording are audible. In Compact Disc. especially if it is frequent. Address codes are added to identify location of data in the recording. The amount of redundancy is equal to the amount of failure. Other specifications such as sampling frequency. Thus the correction process is trivial and perfect. random errors can be caused by imperfections in the moulding process. Consequently corrected samples are inaudible. In the case of the failure of two engines. the samples are returned to exactly their original value. correction is not possible. in order to allow graceful degradation. a bit has only two states.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes Multiplexing . entire words. If a four-engined airliner is designed to fly normally with one engine failed. The main difficulty is in identifying the bits.e. If it is wrong. if the MSB of one sample was in error in a quiet passage. This is shown in the following diagram. If the LSB of one sample was in error in a loud passage of music. concealment will be used. Conversely. In binary. no one could fail to notice the resulting loud transient. In a digital recording of binary data. In digital audio. The multiplexer converts this parallel data to serial data. Clearly the chances of a two-engine failure on the same flight are remote. Concealment is a process where the value of a missing sample is estimated from those nearby. with no intermediate stage. It suffers from noise and dropouts. the effect would be totally masked and no one could detect it. three of the engines have enough power to reach cruise speed. infrequent noise impulses cause some individual bits to be in error (bit errors). If the amount of error exceeds the amount of redundancy. The estimated sample value is not necessarily exactly the same as the original.Digital audio channel data is processed in a single stream. 
A synchronisation code is a fixed pattern of bits provided to identify the beginning of each word as it occurs in the bit stream. 174 . but inevitably. which can be corrected. Data coding . Concealment is made possible by rearranging or shuffling the sample sequence prior to recording. whereas burst errors are due to contamination or scratching of the CD surface. but it must slow down. Dropouts can cause a larger number of bits in one place to be in error. i. and so under some circumstances concealment can be audible.

Odd-numbered samples are separated from even-numbered samples prior to recording. sample 8 is incorrect. so that an uncorrectable burst error only affects one set. between correct samples. On replay. 175 . The following figure shows that the efficiency of the system can be raised using interleaving. the shuffle is a waste of time. it is only needed if correction is not possible. more data are lost in a given-sized dropout.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes In cases where the error correction is inadequate. adding redundancy equal to the size of a dropout to every code is inefficient. As a result an uncorrectable error causes incorrect samples to occur singly. The waveform is now described half as often. and the error is now split up so that it results in every other sample being lost. The odd and even sets of samples may be recorded in different places. Almost all digital recorders use such an odd/even shuffle for concealment. This interpolated value is substituted for the incorrect value. This is better than not being reproduced at all even if it is not perfect. the samples are recombined into their natural sequence. Clearly if any errors are fully correctable. but can still be reproduced with some loss of accuracy. In high-density recorders. concealment can be used provided that the samples have been ordered appropriately in the recording. In the example shown. Odd and even samples are recorded in different places as shown here. but samples 7 and 9 are unaffected and an approximation to the value of sample 8 can be had by taking the average value of the two.
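The odd/even shuffle and interpolation concealment described above can be illustrated with a few lines of Python (an illustrative sketch, not any recorder's actual scheme):

```python
def shuffle_odd_even(samples):
    """Record even-numbered and odd-numbered samples in separate places."""
    return samples[0::2], samples[1::2]

def conceal(evens, odds, bad_odd_index):
    """Recombine the stream; estimate one lost odd sample from its neighbours."""
    out = []
    for i, e in enumerate(evens):
        out.append(e)
        if i < len(odds):
            if i == bad_odd_index:
                nxt = evens[i + 1] if i + 1 < len(evens) else e
                out.append((e + nxt) / 2.0)   # interpolate between correct samples
            else:
                out.append(odds[i])
    return out

evens, odds = shuffle_odd_even([0, 10, 20, 30, 40, 50, 60, 70])
print(conceal(evens, odds, bad_odd_index=1))  # the lost sample (30) is estimated as 30.0
```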

if a burst error occurs on the medium. as this will not reveal whether the error rate is normal or within a whisker of failure. Not all digital audio machines l audio equipment should have an error rate display. 176 . Binary code is not recorded directly. When the memory is read. confidence replay is about one-tenth of a second behind the input. time compression and timebase correction processes cause delay and this is evident in the time taken before audio emerges after starting a digital machine. On replay.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes Sequential samples from the ADC are assembled into codes. It would take too much storage space. The presence of an error-correction system means that the audio quality is independent of the tape/head quality within limits. This is the final electronic manipulation of the audio data before storage. it is copied to the medium by reading down columns.1 Modulation process. Interleave. When the memory is full. which in turn loads the power supply and interferes with the operation of the DAC. This is done by writing samples from tape into a memory in columns. it will damage sequential samples in a vertical direction in the de-interleave memory. there is no flaw in the theory. In the Binary bit stream there is really no way to directly distinguish between the individual bits. The effect is harder to eliminate in small battery-powered machines where space for screening and decoupling components is hard to find. the samples need to be de-interleaved to return them to their natural sequence. and to compare it with normal figures. Additionally it is inefficient to store binary code directly onto the medium. Confidence replay takes place later than the distance between record and replay heads would indicate. Error protection and correction are provided so that the effect of storage defects is minimised. and when it is full. 9. rather a modulated code in the form of a modulation waveform which is stored and which represents the bit stream. There is no point in trying to assess the health of a machine by listening to it. In DASH-format recorders. a single large error is broken down into a number of small errors whose size is exactly equal to the correcting power of the codes and the correction is performed with maximum efficiency. The only useful procedure is to monitor the frequency with which errors are being corrected. de-interleave. are properly engineered. but it is only a matter of design. A series of ones and zeros would forma static signal upon playback and timing information would be lost. a burst of errors will raise the current taken by the logic. however. Professional digita Some people claim to be able to hear error correction and misguidedly conclude that the above theory is flawed. Synchronous recording requires new techniques to overcome the effect of the delays. Samples read from the memory are now in their original sequence so there is no effect on the recording. A number of sequential codes are assembled along rows in a memory. The data is processed before storage by adding parity bits and check codes. and if the DAC shares a common power supply with the error-correction logic. However. the memory is read in rows. both of which are redundant data created from the original data to help detect and correct errors. Finally interleaving is employed in which data is scattered to various locations on the recording medium. but these are not recorded in their natural sequence.

10. bad head alignment) in the medium. They accomplish the buffering of data to minimise the effects of mechanical variations and transport problems (e. Using redundancy techniques such as parity and checksums. Used only where synchronisation is externally generated such as video tape recordings. Non-return to Zero Inverted code (NRZI) . Blocks of 8 bits are translated into blocks of 14-bits using a look-up table. 10. Modified Frequency modulation (MFM) . whatever its code (EFM.g.1s and 0s are represented directly as high and low levels. Various modulation codes have been designed: Non-return to zero code (NRZ) . that is. Eight to Fourteen modulation (MFM) . This circuit takes single bits and outputs whole simultaneous words. tape stretch. 4. A waveform shaper circuit is used to identify the transitions and reconstruct the modulation code.Used for CD storage. The waveform is very distorted and only transitions between levels have survived corresponding to the original recorded signal. the data is checked for errors. The demulitplexer reconverts the serial data to its parallel form. The modulated data. These timing variations will cause jitter. timing variations. a simple code which amplitude information represents the binary information. 1. Digital to Analog Conversion 177 . 10.1 Demodulation Circuits A preamp is required to boost the low-level signal coming off the tape heads. Thus any flux change in the magnetic medium indicates a 1. 2.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes Typically.Sometimes called miller code. error compensation techniques are used to conceal the error. which in a CD would mean physical presence of a pit-edge.only 1s are denoted with amplitude transitions. 3. In extreme cases the signal will be momentarily switched off. etc) is then demodulated to NRZ code. it is the transition from one level to another rather than the amplitude levels. in the modulation process. MFM. When error is too extensive for recovery. Digital Audio Reproduction Digital audio reproduction processes can be compared to the reverse of the digital audio recording processes. The data is read into a buffer whose output occurs at an accurately controlled rate thus ensuring precise timing data and nullifying any jitter caused by mechanical variations in the medium.2 Reproduction Processing The production processing circuits are concerned with minimising the effects of data storage. 11. The circuits firstly de-interleave the data and assemble it in the correct order. The reproduction processing circuits also perform error correction and demulitplexing. Each one represents a transition in the medium. which represents the information on the medium.

1 Output sample and hold When the DAC switches from one output voltage to another. Oversampling has the effect of further reducing inter modulation and other forms of distortion. which resembles the output of its counterpart in the recording conversion. in effect.commonly ranging between 12 and 128 times the original rate. false voltage variations such as switching glitches can occur which will produce audible distortion. This significant increase in the sample rate is accomplished by interpolating the original sample times. It operates like a gate removing false voltages from the analog stream and like a timing buffer. Its output is a precise "staircase" analog signal. The S/H circuit holds correct voltage during the intervals when the DAC switches from samples Hence false glitches are avoided by the S/H circuitry. 11. makes educated guesses as to where sample levels would fall at the new sample time and generate an equivalent digital word for that level. 11. A DAC accepts input digital word and converts it into an output analog voltage or current. the effective sampling rate of a signal-processing block is multiplied by a certain factor .determining how accurately the digitized signal will be restored to the analog domain.2 Output Low-Pass Filter and Oversampling This "anti-imaging" LPF has a design very similar to the input "anti-aliasing" filter. Whenever Oversampling is employed. re-clocking the precise flow of voltages. The output circuit acquires a voltage from the DAC only when the circuit has reached a stable output condition.SCHOOL OF AUDIO ENGINEERING A11– Digital Technology Student Notes The D/AC is the most critical element in the reproduction system . This technique. 178 . Oversampling techniques are used in conjunction with this filter.


An Introduction 1.3 NuBus 8. 7. 5.2 PCI bus 7. Personal Computer Workstation Minicomputer Mainframe Supercomputer Chips CPU Microprocessor RISC Coprocessors Bus 7.the disk drive Port Serial Port Parallel Port SCSI 10. 3.1 1. 9. 10. Macintosh Computer 1. 6.1 local or system bus 7.3 APPLE COMPUTER 1. RAM (main memory) Storage .4 1.3 1.1 PowerPC 180 .2 1.5 2.AE12 – Computer Fundamentals Hardware 1.2 10. 4.1 10.

5 and up) Desktop Top File Management System Clipboard Application 181 .Software 1. 3. 10. 11. 8. Data Instruction Program Software Operating System (OS) 5.1 6. Memory Management System (7. 9. 7. 5. 2. 4.

182 . and circuits -.wires. transistors. many others make it possible for the basic components to work together efficiently. this is the component that actually executes instructions. Computers can be classified by size and power as follows: 1. Hardware 1.2 Workstation A powerful. input device: Usually a keyboard or mouse. An Introduction All general-purpose computers require the following hardware components: • • • • • memory: Enables a computer to store. 1. the input device is the conduit through which data and instructions enter a computer. single-user computer based on a microprocessor. printer. the instructions and data are called software. Modern computers are electronic and digital. every computer requires a bus that transmits data from one part of the computer to another. The two principal characteristics of a computer are: •It responds to a specific set of instructions in a well-defined manner. but it has a more powerful microprocessor and a higher-quality monitor. single-user computer. In addition to these components.SCHOOL OF AUDIO ENGINEERING A10– Computer Fundamentals Student Notes AE12 – COMPUTER FUNDAMENTALS A computer is a programmable machine. at least temporarily.3 Minicomputer A multi-user computer capable of supporting from 10 to hundreds of users simultaneously. The actual machinery -. Common mass storage devices include disk drives and tape drives.is called hardware. mass storage device: Allows a computer to permanently retain large amounts of data. a personal computer has a keyboard for entering data. output device: A display screen.1 Personal Computer A small. a monitor for displaying information. central processing unit (CPU): The heart of the computer. data and programs. and a storage device for saving data. or other device that lets you see what the computer has accomplished. •It can execute a prerecorded list of instructions (a program). A workstation is like a personal computer. For example. In addition to the microprocessor. 1.

SCHOOL OF AUDIO ENGINEERING A10– Computer Fundamentals Student Notes 1. the CPU is where most calculations take place. there are also single in-line memory modules (SIMMs). CPUs require one or more printed circuit boards. sentences. Books provide a useful analogy for describing the difference between software and hardware. CPU Abbreviation of central processing unit. and the overall meaning are the software. but it has no substance. 3. and symbols.4 Mainframe A powerful multi-user computer capable of supporting many hundreds of users simultaneously. 2. Computers consist of many chips placed on electronic boards called printed circuit boards. • • • • Chips come in a variety of packages. and chips.5 Supercomputer An extremely fast computer that can perform hundreds of millions of instructions per second. PGAs: Pin-grid arrays are square chips in which the pins are arranged in concentric squares. Two typical components of a CPU are: 183 . There are different types of chips. evenly divided in two rows. Software exists as ideas. For example. A typical chip can contain millions of electronic components (transistors). which consist of up to nine chips packaged as a single unit. SIPs: Single in-line packages are chips that have just one row of legs in a straight line like a comb. printers. software is untouchable. A computer without software is like a book full of blank pages you need software to make the computer useful just as you need words to make a book meaningful. In contrast. the CPU is housed in a single chip called a microprocessor. concepts. and pronounced as separate letters. In addition to these types of chips. In terms of computing power. Hardware Refers to objects that you can actually touch. 1. n large machines. On personal computers and small workstations. paragraphs. The pages and the ink are the hardware. the CPU is the most important element of a computer system. whereas memory chips contain blank memory. keyboards. Sometimes referred to simply as the processor or central processor. disk drives. The three most common are: DIPs: Dual in-line packages are the traditional buglike chips that have anywhere from 8 to 40 legs. like disks. while the words. Chips A small piece of semiconducting material (usually silicon) on which an integrated circuit is embedded. boards. CPU chips (also called microprocessors) contain an entire processing unit. The CPU is the brains of the computer. display screens.

eg 8-bit. Microprocessor A silicon chip that contains a CPU. which extracts instructions from memory and decodes and executes them. the tendency among computer manufacturers was to build increasingly complex CPUs that had ever-larger sets of instructions. the higher the value. They argue that this is not worth the trouble because conventional microprocessors are becoming increasingly fast and cheap anyway. CPU speed is also measured in MIPS (millions of instructions per second) and Mflops (millions of floating point operations per second) • In both cases. which makes them cheaper to design and produce. 5. the terms microprocessor and CPU are used interchangeably. There is still considerable controversy among experts about the ultimate value of RISC architectures. Until the mid-1980s. Its proponents argue that RISC machines are both cheaper and faster. the more powerful the CPU. conventional computers have been referred to as CISCs (complex instruction set computers). The number of bits processed in a single instruction or the size of the data word that the microprocessor can hold in one of its registers. the clock speed determines how many instructions per second the processor can execute. 16-bit. RISC architectures put a greater burden on the software. A 16-bit processoe can process 2 pices of 8-bit data in parallel. 4. microprocessors are classified as being either RISC (reduced instruction set computer) or CISC (complex instruction set computer). One advantage of reduced instruction set computers is that they can execute their instructions very fast because the instructions are so simple. Two basic characteristics differentiate microprocessors: • bandwidth: or Internal Architecture. The control unit. a 32-bit microprocessor that runs at 50MHz is more powerful than a 16-bit microprocessor that runs at 25MHz. At that time. For example. Given in megahertz (MHz). and all comparison operations. from clock radios to fuelinjection systems for automobiles. 64-bit. Microprocessors also control the logic of almost all digital devices. Another. is that RISC chips require fewer transistors. a number of computer manufacturers decided to reverse this trend by building CPUs capable of executing only a very limited set of instructions. At the heart of all personal computers and most workstations sits a microprocessor. calling on the ALU when necessary. perhaps more important advantage. In addition to bandwidth and clock speed. Skeptics note that by making the hardware simpler. and are therefore the machines of the future. In the world of personal computers. which performs arithmetic and logical operations such as addition and multiplication. Since the emergence of RISC computers.SCHOOL OF AUDIO ENGINEERING A10– Computer Fundamentals Student Notes • • The arithmetic logic unit (ALU). clock speed: Microprocessors are driven by a crystal clock. 32-bit. however. RISC Pronounced risk. a type of microprocessor that recognizes a relatively limited number of instructions. acronym for reduced instruction set computer. 184 .

7. Many of today's RISC chips support as many instructions as yesterday's CISC chips. The data bus transfers actual data. For example. A fast bus allows data to be transferred faster. The Floating Point Unit (FPU) is a math coprocessor specifically designed to crunch non-integer and exponential values (eg used for rendering 3D graphics and animation). for video data. There's also an expansion bus that enables expansion boards to access the CPU and memory.an address bus and a data bus. 7. When used in reference to personal computers. known as its width. they provide very fast throughput. Control Lines . The local bus is a data bus that connects directly. This is a bus that connects all the internal computer components to the CPU and main memory. Older Macs use a bus called NuBus. On PCs. Several different types of buses are used on Apple Macintosh computers. the argument is becoming moot because CISC and RISC implementations are becoming more and more alike. All buses consist of two parts -. the address bus transfers information about where the data should go ie which memory location is being addressed. the old ISA bus is being replaced by faster buses such as PCI.SCHOOL OF AUDIO ENGINEERING A10– Computer Fundamentals Student Notes To some extent. Coprocessors Processors that handle specialised tasks freeing up the main processor to overseeing the entire operation. Most modern PCs include both a local bus. to the microprocessor.A sort of traffic cop for data. as well as a more general expansion bus for other devices that do not require such fast data throughput. 6. such as video data. specifying the functions associated with data and address lines. You can think of a bus as a highway on which data travels within a computer. Every bus has a clock speed measured in MHz. which makes applications run faster. Although local buses can support only a few devices. is important because it determines how much data can be transmitted at one time. or almost directly. but newer ones use PCI. 185 . whereas a 32-bit bus can transmit 32 bits of data. The local bus is a high-speed pathway that connects directly to the processor. And today's CISC chips use many techniques formerly associated with RISC chips. Bus A collection of wires through which data is transmitted from one part of a computer to another. The size of a bus. a 16-bit bus can transmit 16 bits of data.1 local or system bus Many PCs made today include a local bus for data that requires especially fast transfer speeds. the term bus usually refers to internal bus.

7.2 PCI bus

Acronym for Peripheral Component Interconnect, a local bus standard developed by Intel Corporation. Most modern PCs include a PCI bus in addition to a more general ISA expansion bus, and PCI is also used on newer versions of the Macintosh computer. PCI is a 32-bit bus, but supports a 64-bit extension for new processors such as the Pentium. It can run at clock speeds of 33 or 66 MHz. At 32 bits and 33 MHz it yields a throughput rate of 133 MBps; 64-bit implementations running at 66 MHz provide 524 MBps. Although it was developed by Intel, PCI is not tied to any particular family of microprocessors. Many analysts, however, believe that PCI will eventually supplant ISA entirely.

7.3 NuBus

The expansion bus for versions of the Macintosh computers starting with the Macintosh II and ending with the Performa. Current Macs use the PCI bus.

8. RAM (main memory)

Pronounced ramm; acronym for random access memory, a type of computer memory that can be accessed randomly -- that is, any byte of memory can be accessed without touching the preceding bytes. RAM is the most common type of memory found in computers and other devices, such as printers. In common usage, the term RAM is synonymous with main memory, the memory available to programs. For example, a computer with 8M RAM has approximately 8 million bytes of memory that programs can use. In contrast, ROM (read-only memory) refers to special memory used to store programs that boot the computer and perform diagnostics. Most personal computers have a small amount of ROM (a few thousand bytes). In fact, both types of memory (ROM and RAM) allow random access. To be precise, therefore, RAM should be referred to as read/write RAM and ROM as read-only RAM.

There are two basic types of RAM:

• dynamic RAM (DRAM)
• static RAM (SRAM)

The two types differ in the technology they use to hold data, dynamic RAM being the more common type. Dynamic RAM needs to be refreshed thousands of times per second. Static RAM needs to be refreshed less often, which makes it faster, but it is also more expensive than dynamic RAM. Both types of RAM are volatile, meaning that they lose their contents when the power is turned off.

9. Storage - the disk drive

A disk drive is a machine that reads data from and writes data onto a disk. A disk drive resembles a stereo turntable in that it rotates the disk very fast. It has one or more heads that read and write data.

The disk consists of concentric rings called tracks. The surface of the disk is also cross-sectioned into wedge-shaped sectors. The areas cross-referenced by the intersection of tracks and sectors are called blocks.

There are different types of disk drives for different types of disks. For example, a hard disk drive (HDD) reads and writes hard disks, and a floppy drive (FDD) accesses floppy disks. A magnetic disk drive reads magnetic disks, and an optical drive reads optical disks. Disk drives can be either internal (housed within the computer) or external (housed in a separate box that connects to the computer). A PC will have at least one internal HDD with at least 1 Gigabyte of storage space. Hard disks are faster and can hold a lot more data than floppies.

10. Port

An interface on a computer to which you can connect a device. Personal computers have various types of ports. Internally, there are several ports for connecting disk drives, display screens, and keyboards. Externally, personal computers have ports for connecting modems, printers, mice, and other peripheral devices. Almost all personal computers come with a serial RS-232C or RS-422 port for connecting a modem or mouse, and a parallel port for connecting a printer. SCSI (Small Computer System Interface) ports support higher transmission speeds than conventional ports and enable you to attach up to seven devices to the same port. All Apple Macintosh computers since the Macintosh Plus have a SCSI port.

10.1 Serial Port

A port, or interface, that can be used for serial communication, in which only 1 bit is transmitted at a time. The opposite of serial is parallel, in which several bits are transmitted concurrently. Most serial ports on personal computers conform to the RS-232C or RS-422 standards. A serial port is a general-purpose interface that can be used for almost any type of device, including modems, mice, and printers (although most printers are connected to a parallel port).

10.2 Parallel Port

A parallel interface for connecting an external device such as a printer. Most personal computers have both a parallel port and at least one serial port. On PCs, the parallel port uses a 25-pin connector (type DB-25) and is used almost exclusively to connect printers. It is often called a Centronics interface, after the company that designed the original standard for parallel communication between a computer and printer. (The modern parallel interface is based on a design by Epson.) A newer type of parallel port, which supports the same connectors as the Centronics interface, is the EPP (Enhanced Parallel Port) or ECP (Extended Capabilities Port). Both of these parallel ports support bi-directional communication and transfer rates ten times as fast as the Centronics port.
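As a rough illustration of the serial/parallel distinction above, the toy calculation below compares moving the same block of data one bit at a time against eight bits at a time. The line rate used is an invented example figure, not the specification of any real port.

DATA_BYTES = 10_000
BITS_PER_SECOND = 115_200            # assumed example rate for both cases

serial_time = DATA_BYTES * 8 / BITS_PER_SECOND    # one bit per clock
parallel_time = DATA_BYTES / BITS_PER_SECOND      # eight bits per clock

print(f"serial:   {serial_time:.2f} s")
print(f"parallel: {parallel_time:.2f} s ({serial_time / parallel_time:.0f}x faster)")

At the same line rate, sending eight bits at once is simply eight times faster, which is why printers and other high-throughput devices were traditionally attached to the parallel port.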

10.3 SCSI

Abbreviation of small computer system interface; pronounced scuzzy. SCSI is a parallel interface standard used by Apple Macintosh computers, some PCs, and many UNIX systems for attaching peripheral devices to computers. SCSI interfaces provide for faster data transmission rates (up to 40 megabytes per second) than standard serial and parallel ports. In addition, you can attach many devices to a single SCSI port, so that SCSI is really an I/O bus rather than simply an interface.

All Apple Macintosh computers starting with the Macintosh Plus come with a SCSI port for attaching devices such as disk drives and printers; the Macintosh's SCSI port is parallel, but more flexible. While SCSI is the only standard interface for Macintoshes, PCs support a variety of interfaces in addition to SCSI: these include IDE, enhanced IDE and ESDI for mass storage devices, and Centronics for printers. You can, however, attach SCSI devices to a PC by inserting a SCSI board in one of the expansion slots, and many high-end new PCs come with SCSI built in. Note, however, that the lack of a single SCSI standard means that some devices may not work with some SCSI boards. Although SCSI is an ANSI standard, there are many variations of it, so two SCSI interfaces may be incompatible. SCSI also supports several types of connectors. The following varieties of SCSI are currently implemented:

• SCSI: Uses an 8-bit bus, and supports data rates of 4 MBps.
• Fast SCSI: Uses an 8-bit bus, and supports data rates of 10 MBps.
• Fast Wide SCSI: Uses a 16-bit bus and supports data rates of 20 MBps.
• Ultra SCSI: Uses an 8-bit bus, and supports data rates of 20 MBps.
• Ultra Wide SCSI: Uses a 16-bit bus and supports data rates of 40 MBps. Also called SCSI-3.

APPLE COMPUTER

A personal computer company founded in 1976 by Steven Jobs and Steve Wozniak. Throughout the history of personal computing, Apple has been one of the most innovative influences; in fact, some analysts say that the entire evolution of the PC can be viewed as an effort to catch up with the Apple Macintosh. In addition to inventing new technologies, Apple has also often been the first to bring sophisticated technologies to the personal computer. Apple's innovations include:

• Graphical user interface (GUI). First introduced in 1983 on its Lisa computer. Many components of the Macintosh GUI have become de facto standards and can be found in other operating systems, such as Microsoft Windows.
• Color. The Apple II, introduced in 1977, was the first personal computer to offer color monitors.
• Built-in networking. In 1985, Apple released a new version of the Macintosh with built-in support for networking (LocalTalk).
• Plug & play expansion. In 1987, the Mac II introduced a new expansion bus called NuBus that made it possible to add devices and configure them entirely with software.
• QuickTime. In 1991, Apple introduced QuickTime, a multi-platform standard for video, sound, and other multimedia applications.
• Integrated television. In 1993, Apple released the Macintosh TV, the first personal computer with built-in television and stereo CD.
• RISC. In 1994, Apple introduced the Power Mac, based on the PowerPC RISC microprocessor.

1. Macintosh Computer

A popular model of computer made by Apple Computer. Introduced in 1984, the Macintosh features a graphical user interface (GUI) that utilizes windows, icons, and a mouse to make it relatively easy for novices to use the computer productively. Rather than learning a complex set of commands, you need only point to a selection on a menu and click a mouse button. Moreover, the GUI is embedded into the operating system. This means that all applications that run on a Macintosh computer have a similar user interface: once a user has become familiar with one application, he or she can learn new applications relatively easily. Since the Macintosh interface's arrival on the marketplace and its enthusiastic acceptance by customers, numerous software producers have produced similar interfaces. For example, Microsoft offers a Mac-like GUI for PCs called Windows.

There are many different Macintosh models, with varying degrees of speed and power. All models are available in many different configurations -- different monitors, disk drives, and memory. All older Macintosh computers use a microprocessor from the Motorola 68000 family, but in 1994 Apple switched to the PowerPC microprocessor. PowerMacs can also run programs written for the Motorola processors.

The Macintosh family of computers is not compatible with the IBM family of personal computers: they have different microprocessors and different file formats. This can make it difficult to share data between the two types of computers. Increasingly, however, many software companies are producing Mac versions of their products that can read files produced by a Windows version of the software, and vice versa.

1.1 PowerPC

A RISC-based computer architecture developed jointly by IBM, Apple Computer, and Motorola Corporation. The name is derived from IBM's name for the architecture: Performance Optimization With Enhanced RISC. The first computers based on the PowerPC architecture were the Power Macs, which appeared in 1994. Since then, other manufacturers, including IBM, have built PCs based on the PowerPC.

There are already a number of different operating systems that run on PowerPC-based computers, including the Macintosh operating system (System 7.5 and higher), Windows NT, and OS/2. Although the initial reviews have been good, it remains to be seen whether this new architecture can eventually supplant, or even coexist with, the huge number of Intel-based computers in use and on the market.

Software

All software is divided into two general categories: data and programs. Programs are collections of instructions for manipulating data.

1. Data

Distinct pieces of information, usually formatted in a special way. Data can exist in a variety of forms -- as numbers or text on pieces of paper, as bits and bytes stored in electronic memory, or as facts stored in a person's mind. Strictly speaking, data is the plural of datum, a single piece of information; in practice, however, people use data as both the singular and plural form of the word. The term data is often used to distinguish binary machine-readable information from textual human-readable information. For example, some applications make a distinction between data files (files that contain binary data) and text files (files that contain ASCII data).

2. Instruction

A basic command. The term instruction is often used to describe the most rudimentary programming commands. For example, a computer's instruction set is the list of all the basic commands in the computer's machine language.

3. Program

An organized list of instructions that, when executed, causes the computer to behave in a predetermined manner. Without programs, computers are useless. A program is like a recipe: it contains a list of ingredients (called variables) and a list of directions (called statements) that tell the computer what to do with the variables. The variables can represent numeric data, text, or graphical images.

There are many programming languages -- C, C++, Pascal, BASIC, FORTRAN, COBOL, and LISP are just a few. These are all high-level languages. One can also write programs in low-level languages called assembly languages, although this is more difficult. Low-level languages are closer to the language used by a computer, while high-level languages are closer to human languages. Eventually, every program must be translated into a machine language that the computer can understand. This translation is performed by compilers, interpreters, and assemblers. When you buy software, you normally buy an executable version of a program. This means that the program is already in machine language -- it has already been compiled and assembled and is ready to execute.
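To make the "recipe" picture concrete, here is a very small program, written in Python purely as an illustration for these notes (the text itself does not prescribe a language). The variables are the ingredients and the statements are the directions.

# Ingredients (variables): numeric data and text
tempo_bpm = 120
bars = 16
beats_per_bar = 4
song_title = "Demo Track"

# Directions (statements): what to do with the variables
total_beats = bars * beats_per_bar
duration_seconds = total_beats * 60 / tempo_bpm

print(f"{song_title}: {total_beats} beats, about {duration_seconds:.0f} seconds")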

4. Software

Computer instructions or data. Anything that can be stored electronically is software; the storage devices and display devices are hardware. The terms software and hardware are used as both nouns and adjectives. For example, you can say "the problem lies in the software", meaning that there is a problem with the program or data, not with the computer itself; you can also say "it's a software problem". The distinction between software and hardware is sometimes confusing because they are so integrally linked. Clearly, when you purchase a program, you are buying software -- but to buy the software, you need to buy the disk (hardware) on which the software is recorded.

Software is often divided into two categories:

• systems software: Includes the operating system and all the utilities that enable the computer to function.
• applications software: Includes programs that do real work for users. For example, word processors, spreadsheets, and database management systems fall under the category of applications software.

5. Operating System (OS)

The most important program that runs on a computer. Every general-purpose computer must have an operating system to run other programs. Operating systems perform basic tasks, such as recognizing input from the keyboard, sending output to the display screen, keeping track of files and directories on the disk, and controlling peripheral devices such as disk drives and printers. For large systems, the operating system has even greater responsibilities and powers. It is like a traffic cop -- it makes sure that different programs and users running at the same time do not interfere with each other. The operating system is also responsible for security, ensuring that unauthorized users do not access the system.

Operating systems can be classified as follows:

• Multi-user: Allows two or more users to run programs at the same time. Some operating systems permit hundreds or even thousands of concurrent users.
• Multiprocessing: Supports running a program on more than one CPU.
• Multitasking: Allows more than one program to run concurrently.
• Multithreading: Allows different parts of a single program to run concurrently.
• Real-time: Responds to input instantly. General-purpose operating systems, such as DOS and UNIX, are not real-time.

Operating systems provide a software platform on top of which other programs, called application programs, can run. The application programs must be written to run on top of a particular operating system. Your choice of operating system therefore determines to a great extent the applications you can run. For PCs, the most popular operating systems are DOS, OS/2, and Windows, but others are available, such as Xenix.

As a user, you normally interact with the operating system through a set of commands. For example, the DOS operating system contains commands such as COPY and RENAME for copying files and changing the names of files. The commands are accepted and executed by a part of the operating system called the command processor or command line interpreter. Graphical user interfaces allow you to enter commands by pointing and clicking at objects that appear on the screen.
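The multithreading idea in the classification above can be seen in a few lines of code. The sketch below uses Python's standard threading module to run two parts of one program concurrently; the task names are invented for the example.

import threading
import time

def record_audio():                  # pretend recording task
    for block in range(3):
        print("recording block", block + 1)
        time.sleep(0.1)

def refresh_meters():                # pretend display task
    for frame in range(3):
        print("refreshing meters", frame + 1)
        time.sleep(0.1)

t1 = threading.Thread(target=record_audio)
t2 = threading.Thread(target=refresh_meters)
t1.start()
t2.start()
t1.join()
t2.join()
print("both threads finished")

The two loops interleave their output because both threads run at the same time; multitasking is the same idea applied to whole programs rather than parts of one program.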

5.1 Memory Management

The OS uses a memory map to keep applications and files from conflicting. When it runs an application, the OS looks at the memory map to determine where to place it, allocates enough RAM and copies the program into memory. The OS then handles requests to open, close and save files from within an application, and it creates and maintains a directory or file allocation table.

6. System (7.5 and up)

On Macintoshes, System is short for System file, an essential program that runs whenever you start up a Macintosh. The System provides information to all other applications that run on a Macintosh. The System and Finder programs together make up the Macintosh operating system.

7. Finder

The desktop management and file management system for Apple Macintosh computers. In addition to managing files and disks, the Finder is responsible for managing the Clipboard and Scrapbook and all desktop icons and windows.

8. File Management System

The system that an operating system or program uses to organize and keep track of files. For example, a hierarchical file system is one that uses directories to organize files into a tree structure. Although the operating system provides its own file management system, you can buy separate file management systems. These systems interact smoothly with the operating system but provide more features, such as improved backup procedures and stricter file protection.

9. Desktop

In graphical user interfaces, a desktop is the metaphor used to portray file systems. Such a desktop consists of pictures, called icons, that show cabinets, files, folders, and various types of documents (that is, letters, reports, pictures). You can arrange the icons on the electronic desktop just as you can arrange real objects on a real desktop -- moving them around, putting one on top of another, reshuffling them, and throwing them away.

10. Clipboard

A special file or memory area (buffer) where data is stored temporarily before being copied to another location. Many word processors, for example, use a clipboard for cutting and pasting. When you cut a block of text, the word processor copies the block to the clipboard; when you paste the block, the word processor copies it from the clipboard to its final destination.

In Microsoft Windows and the Apple Macintosh operating system, the Clipboard (with a capital C) can be used to copy data from one application to another. The Macintosh uses two types of clipboards. The one it calls the Clipboard can hold only one item at a time and is flushed when you turn the computer off. The other, called the Scrapbook, can hold several items at once and retains its contents from one working session to another.

11. Application

A program or group of programs designed for end users. Software can be divided into two general classes: systems software and applications software. Systems software consists of low-level programs that interact with the computer at a very basic level; this includes operating systems, compilers, and utilities for managing computer resources. In contrast, applications software (also called end-user programs) includes database programs, word processors, and spreadsheets. Figuratively speaking, applications software sits on top of systems software because it is unable to run without the operating system and system utilities.
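Returning to the Clipboard and Scrapbook behaviour described above, the short sketch below models the difference. The class and method names are invented for this illustration; they are not Apple APIs.

class Clipboard:
    """Holds a single item; each new cut or copy replaces the old one."""
    def __init__(self):
        self._item = None
    def put(self, item):
        self._item = item            # previous contents are lost
    def paste(self):
        return self._item
    def flush(self):                 # e.g. when the machine is switched off
        self._item = None

class Scrapbook:
    """Holds several items and keeps them between working sessions."""
    def __init__(self):
        self._items = []
    def add(self, item):
        self._items.append(item)
    def contents(self):
        return list(self._items)

clip = Clipboard()
clip.put("verse 1")
clip.put("chorus")                   # overwrites "verse 1"
print(clip.paste())                  # -> chorus

book = Scrapbook()
book.add("verse 1")
book.add("chorus")
print(book.contents())               # -> ['verse 1', 'chorus']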


AE13 – Music Theory II

Music Theory – Intermediate Level
1. Major scales
1.1 Major scales with sharps, the concept of key
1.2 Major scales with flats
2. Natural minor scales and the concept of relative keys
2.1 Harmonic minor and melodic minor scales
3. Memorising key signatures
4. Intervals
4.1 Major intervals and perfect intervals
4.2 Minor intervals
4.3 Augmented and diminished intervals
5. Chords built from major scales
5.1 Primary chords in major keys

Musical Instruments and the Orchestra
1. Strings Section
2. Woodwind Section
3. The Brass Section
4. Transposing Instruments
4.1 Transposing instruments in the woodwind section
4.2 Transposing instruments in the brass section
5. The Percussion Section
5.1 Pitched Percussion Instruments
5.2 Unpitched Percussion Instruments

History of Western Art Music

1. Middle Ages (450-1450)
1.1 Social background
1.2 Sacred Music of the Middle Ages
1.3 Secular Music in the Middle Ages
2. Renaissance (1450-1600)
2.1 Characteristics of Renaissance Music
2.2 Sacred Music
2.3 Secular Music
3. Baroque (1600-1750)
3.1 Characteristics of Baroque music
4. Classical (1750-1820)
4.1 Characteristics of the classical style
4.2 Important composers of the classical period
5. Romantic (1820-1900)
5.1 Characteristics of romantic music
5.2 Important composers of the romantic period
6. Twentieth century
6.1 Characteristics of twentieth century music
6.2 Important composers of the twentieth century

AE13 – MUSIC THEORY II

Music Theory – Intermediate Level

1. Major scales

A major scale consists of eight notes covering one octave, and follows the pattern of tones and semitones illustrated below. A major scale can begin on any pitch, but you can hear the pattern by playing the white notes of the piano keyboard from C to C an octave higher.

If you examine the major scale closely, you can see that the pattern of tones and semitones between the first four notes is exactly the same as the pattern of tones and semitones between the last four notes - that is, tone-tone-semitone*. We can therefore build a series of major scales by adding successive four-note sequences.

* A four-note sequence following this pattern is sometimes described as a 'major tetrachord'.
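The tone/semitone pattern described above can be expressed directly in code. The short Python sketch below is an illustration added for these notes, not part of the original text; pitches are written as MIDI-style numbers, with 60 taken to be middle C.

# Major scale pattern: tone, tone, semitone, tone, tone, tone, semitone
# (a tone = 2 semitones, a semitone = 1)
MAJOR_PATTERN = [2, 2, 1, 2, 2, 2, 1]

def major_scale(tonic):
    notes = [tonic]
    for step in MAJOR_PATTERN:
        notes.append(notes[-1] + step)
    return notes

print(major_scale(60))   # C major: [60, 62, 64, 65, 67, 69, 71, 72]
print(major_scale(62))   # D major: the same pattern, starting a tone higher

Notice that the first three steps (2, 2, 1) and the last three steps repeat the same tetrachord shape, with a single tone joining the two halves - which is exactly the point made above.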

1.1 Major scales with sharps, the concept of key

If you look closely at the previous example you will notice some patterns. Each new scale contains one more sharp than the previous scale. If we number the notes of a scale 1 to 8, then the new scale always starts on note 5 of the old scale (e.g. C major is followed by G major, and G is the fifth letter in a musical sequence which begins on C: C D E F G). Each new sharp is also five notes away from the previous sharp. For instance, G major scale has one sharp (F#), while the next major scale (D major) has an additional sharp (C#), which is five away from F#.

If a piece of music is based around the notes of a particular major scale, we say that the music is in that key. For example, if a melody is based around the notes from the D major scale, we say that the melody is in the key of D major, and that the D note is the tonic.

A key signature

Since in the key of D major the F note and the C note will always be sharpened, we can simplify our notation by writing a key signature. The key signature is a grouping of the sharps or flats. It is displayed at the beginning of each staff and indicates which notes are to be sharpened each time they are played. This saves us the trouble of having to write a sharp against the note whenever it occurs within the music.

It is possible that the music might also be in a related minor key - this is discussed later. Below is a list of major scales with sharps, written out with appropriate key signatures.

1.2 Major scales with flats

If you recall our discussion of enharmonic notes, you will remember that the note F# is exactly the same as the note Gb. It is therefore possible to think of an F# major scale as being a Gb major scale instead. If we make this enharmonic change, then when we continue on with our tetrachord-adding process we arrive at a series of keys which require a progressively decreasing number of flats - rather than an ever-increasing number of sharps and even double-sharps!

As is the case with 'sharp' keys, it is possible to use a key signature with 'flat' keys. Below is a list of major scales with flats, written out with appropriate key signatures.

2. Natural minor scales and the concept of relative keys

A natural minor scale consists of eight notes covering one octave, and follows the pattern of tones and semitones illustrated below. A natural minor scale can start from any pitch, but you can hear the pattern by playing the white notes of the piano keyboard from A to A an octave higher.

If we examine any given major scale, we will discover that there is a natural minor scale which shares exactly the same pattern of sharps or flats. For instance, G major scale (with one sharp, F#) contains the same notes as E natural minor scale. The only difference between the scales is that G major scale runs from G to G and E minor scale runs from E to E. Major and minor keys which share the same key signature are said to be relative to each other. Therefore, in the case of the G major and E minor keys mentioned above, E minor may be described as the relative minor of G major - and in turn G major may be described as the relative major of E minor. Note that the relative minor scale starts on the sixth note of the major scale, and the major scale starts on the third note of the relative minor scale. Below is a list of major scales alongside the corresponding relative minor scale.
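In addition to that list, the "sixth note" relationship can be checked with a small sketch in the same style as before. Note names here are written with sharps only, which ignores enharmonic spelling choices - a simplification made for the example.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_PATTERN = [2, 2, 1, 2, 2, 2, 1]

def relative_minor(major_tonic):
    """The relative minor starts on the sixth note of the major scale."""
    index = NOTE_NAMES.index(major_tonic)
    scale = [index]
    for step in MAJOR_PATTERN:
        scale.append((scale[-1] + step) % 12)
    return NOTE_NAMES[scale[5]]      # degree 6 sits at list position 5

print(relative_minor("G"))   # -> E   (E minor is the relative of G major)
print(relative_minor("C"))   # -> A   (A minor is the relative of C major)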

2.1 Harmonic minor and melodic minor scales

There are some other minor scales which you should be aware of. The harmonic minor scale is a natural minor scale with a raised or sharpened 7th note, as illustrated in the next example. Example 5.9 illustrates the sound of the natural minor scale followed by the harmonic minor. In European "art music" history this scale resulted from the desire to provide a stronger sounding cadence (the chord progression at the end of a section of music). A very brief history is as follows. In early music, a melody in a minor mode would often end with the notes 8 7 8 from the minor scale, and this melody would be harmonised as in the example below. Over time, composers began to prefer the stronger harmonic flavour associated with the use of a sharpened 7th note. The result is the so-called harmonic minor scale, which uses the raised 7th.

The harmonic minor scale is also a common element in folk music from various regions of the world, and a melody based entirely on the harmonic minor can often seem to have a "world music" quality. You should note, however, that in popular music (as in early classical music) it is very common to find melodies which use both the ordinary 7th and the sharpened 7th - thereby creating melodic interest and variety.

There are two forms of melodic minor scale. The first is the "classical" melodic minor, which has a different form when ascending as compared to descending. This scale emerged as a consequence of raising the 7th note at cadences. To avoid the somewhat awkward leap (for singers) which is created when a melody moves from the minor 6 note to the raised (major) 7 note, composers began to raise the 6th note as well. When the melody moved down rather than up, the leaping problem was avoided by using the 7th and 6th from the natural minor scale. As a result, the complete "classical" melodic minor is as below (i.e. with a raised 6th and 7th ascending, and the "normal" 6th and 7th descending). Below is a typical illustration. Even though this scale is hardly ever used in popular music, you will often encounter references to it and should be aware of its origins.

3. Memorising key signatures

It should be realised that a knowledge of key signatures is an indispensable aid in working with a range of musical elements such as intervals, scales and chords - from interval recognition to scale and chord construction. You should therefore memorise this information as quickly as possible, even if (particularly if!) you already have some knowledge in this area. Please don't fool yourself about how well you know key signatures: I know from long teaching experience that the lack of this knowledge provides a serious and frustrating impediment to satisfying progress. You must have instant recall of the pattern of sharps/flats in any major or minor key (at least to five sharps and flats at this stage), and be able to write these sharps and flats in the correct position on the staff, rather than resorting to your notes or a calculator to work things out!

Here are a few summarising tips which may help to make the process of memorisation a little easier.

Tip 1: When dealing with keys with sharps, think in fives. If we begin with C major (or A minor), which have no sharps, subsequent 'sharp' keys are five letter names away from the preceding key. For example, C major (no sharps) is followed by G major (one sharp), and G is five letter names away from C in the musical alphabet - C D E F G. Similarly, A minor (no sharps) is followed by E minor (one sharp), and E is five letter names away from A in the musical alphabet - A B C D E. In addition, after the first sharp to be used in a key signature (F#), subsequent sharps are five letters away (i.e. F# is joined by C#, then G#, etc.). You might also find a saying helpful in remembering the order of sharps in the key signature (e.g. Father Christmas Gets Drunk At Every Ball).

Tip 2: When dealing with keys with flats, think in fours. If we begin with C major (or A minor), which have no flats, subsequent 'flat' keys are four letter names away from the preceding key (e.g. C major has no flats, F major has one flat, and F is four letter names away from C in the musical alphabet - C D E F). Similarly, after the first flat to be used in a key signature (Bb), subsequent flats are four letters away (i.e. Bb is joined by Eb, then Ab, etc.).

Tip 3: The relative minor is three letter names below the major (e.g. C major → A minor).

A list of key signatures together with the relevant major and minor key is provided below.
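Tips 1 and 2 are really one counting rule applied with different step sizes, and a few lines of code make the counting explicit. This sketch (added for these notes) only goes as far as five sharps, matching the "at least to five sharps and flats" target above.

LETTERS = ["C", "D", "E", "F", "G", "A", "B"]

def up_letters(start, count):
    """Move `count` letter names up the musical alphabet, counting inclusively."""
    return LETTERS[(LETTERS.index(start) + count - 1) % 7]

# Tip 1: each new sharp key is five letter names up from the last.
key = "C"
for sharps in range(1, 6):
    key = up_letters(key, 5)
    print(f"{key} major has {sharps} sharp(s)")      # G, D, A, E, B

# The sharps themselves also arrive five letters apart: F#, C#, G#, D#, A#
sharp, order = "F", ["F#"]
for _ in range(4):
    sharp = up_letters(sharp, 5)
    order.append(sharp + "#")
print("order of sharps:", " ".join(order))

# Tip 2: flat keys (and the flats) move in fours instead.
print("first flat key:", up_letters("C", 4))          # -> F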

Be aware that (as with tips 1 and 2) there will be times when the letter name alone is not enough (e.g. A major is the relative of F# minor, not F minor), and so there is no substitute for understanding the concepts and then carefully memorising the specific information.

4. Intervals

This chapter examines the way musical intervals are labelled, notated and identified. It also explores the relationship between intervals and the overtone series, and considers the distinctive qualities associated with particular intervals and the sound of various intervals in two-part harmony. There are numerous written exercises provided at the end of the chapter, and you should make sure that you are totally comfortable with the relevant terminology and principles of notation covered here. This material may seem a little 'dry' and technical, but if you build a solid foundation of knowledge in relation to the 'language' of music theory at this stage, you can avoid the frustration of having to continually refer back to this material when you study aspects of melody, chord construction, chord-scale relationships etc.

A musical interval is the pitch distance between two notes. A melodic interval is the distance between two adjacent melody notes, while a harmonic interval is the distance between two notes which are played simultaneously (i.e. as 'harmony').

Intervals are described with reference to two aspects - distance and quality. Interval distance is expressed as a number by counting the total number of letter names encompassed by the two notes (including the notes themselves). For example, the interval between the notes C and F would be numbered as four; in interval terminology this is called a fourth. The interval between C and E is numbered as a third, as is the interval between C and Eb. Similarly, the interval between G and B is described as a third, while the interval between F and D is described as a sixth. The same numbering process also applies when describing downward intervals. For example, the interval from G to C below is described as a fifth, since G and C below encompass five letter names.

The idea of interval quality relates to the particular sound of the interval, which in turn relates to the actual distance between the notes. For example, the distance between C and E (which we call a 'major' interval) is four semitones, whereas the distance between C and Eb (which we call a 'minor' interval) is only three semitones, and most listeners will describe C-E as having a brighter sound than C-Eb. The various interval qualities are described below.

4.1 Major intervals and perfect intervals

We will return to the major scale as an interval reference point. If we consider the tonic note as a fixed bottom note, then a different interval is created as we add each of the notes of the major scale to the tonic. As you might expect, many of these intervals (the second, third, sixth and seventh) are described as major intervals, since they come from the major scale. However, there are several intervals (the unison, fourth, fifth and octave) which have become known as perfect intervals.

The origin of the term perfect interval goes back to the ancient Greeks, who are thought to have discovered that these intervals embody the most 'perfect' mathematical relationships: there is a 1:1 relationship between the tonic and its unison (i.e. the same note), a 1:2 relationship between octave and tonic, a 2:3 relationship between fifth and tonic, and a 3:4 relationship between fourth and tonic. Because of their mathematical perfection, the ancient Greeks favoured perfect intervals, and when early Christian church music (which drew extensively from Greek theory) began to use some simple harmony, perfect intervals were preferred. Even now, the sound of a series of perfect intervals can seem suggestive of an earlier period. Also, if you re-examine the harmonic series, you will notice that the octave, fifth and fourth intervals are the first naturally-occurring intervals within the series.

Harmonic series based on C.

4.2 Minor intervals

When a major interval is made smaller by one semitone it becomes a minor interval. Therefore it is possible to create minor second, minor third, minor sixth and minor seventh intervals by lowering the appropriate note from the major scale. The minor third, minor sixth and minor seventh intervals all occur within the natural minor scale (together with the perfect fourth, fifth and octave, which are part of both the major and natural minor scales), and so you will be able to use your knowledge of major and minor scales to assist in labelling intervals. There is, however, one inconsistency you should note at this stage: in both major and natural minor scales the second note is a major second interval from the tonic.
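The interval sizes discussed in 4.1 and 4.2 can be tabulated in a few lines. The sketch below only summarises what the text has already stated - major and perfect sizes come from the major scale, and a minor interval is one semitone smaller than its major counterpart - and is not an exhaustive interval catalogue.

# Semitone distances measured up from the tonic
MAJOR_AND_PERFECT = {
    "unison": 0, "major 2nd": 2, "major 3rd": 4, "perfect 4th": 5,
    "perfect 5th": 7, "major 6th": 9, "major 7th": 11, "octave": 12,
}

def minor_size(degree):
    """A minor interval is the corresponding major interval less one semitone."""
    return MAJOR_AND_PERFECT[f"major {degree}"] - 1

print(MAJOR_AND_PERFECT["major 3rd"])   # 4 semitones: C up to E
print(minor_size("3rd"))                # 3 semitones: C up to Eb
print(minor_size("7th"))                # 10 semitones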

As well as being used to identify musical intervals, terms such as perfect fourth, major third, minor sixth etc. are also often used to label notes in reference to a given tonic. For example, in relation to a D tonic (or key centre) you will often see a B note described as a major sixth, or an F note labelled as a minor third.

When memorising the principles which govern how we label musical intervals, you should note the meaning of the labels themselves. The rule is 'number first - then quality'. We have already examined the logic of the term 'perfect' intervals, and it is no surprise that 'major' intervals should be larger than 'minor' intervals.

4.3 Augmented and diminished intervals

An augmented interval is one semitone larger than a perfect interval or a major interval. Since to 'augment' something is to make it even larger in some way, it seems logical that augmented intervals should be expanded major or perfect intervals. A diminished interval is one semitone smaller than a perfect interval or a minor interval. Since to 'diminish' something is to make it smaller, it seems logical that diminished intervals should be compressed minor or perfect intervals.

TO AUGMENT OR TO DIMINISH? OR "WHEN IS A MINOR SIXTH AN AUGMENTED FIFTH?" ETC.

*This may seem a bit technical, but bear with me. Although the note E# is the same pitch as an F, if it was written as an F the interval would then be numbered as a (minor) third rather than an augmented second.

When you are dealing with the notation of intervals, things can get a little confusing sometimes. For instance, we noted above how an augmented second will sound exactly the same as a minor third. If you examine the example below you can also see that an augmented fourth is exactly the same as a diminished fifth, and that an augmented fifth is the same as a minor sixth, etc. There are a number of factors (too many to discuss at this point) which may influence the way an interval is notated, but a general 'rule of thumb' is that notation should follow musical logic where possible. Let me give you a simple example by way of illustration. In (a) below, the top note is moving upwards - creating changing harmonic intervals as it does so. Since the sound is that of expanding intervals, the middle interval is best written as an augmented fifth. On the other hand, example (b) sees the major sixth being decreased by a semitone, so it is best to write the second interval as a minor sixth.

5. Chords built from major scales

In the same way that it is possible to create a series of major and minor third intervals from the notes of the major scale, it is also possible to create a series of major and minor triads. This pattern of triads - major, minor, minor, major, major, minor, diminished - will be the same no matter what major key is used.
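That pattern can be checked mechanically: stack two thirds on each degree of a major scale and read off the qualities from the semitone sizes of the thirds. The sketch below is an illustration written for these notes; it prints exactly the pattern quoted above.

MAJOR_PATTERN = [2, 2, 1, 2, 2, 2, 1]

def two_octaves_of_major(tonic=0):
    notes = [tonic]
    for step in MAJOR_PATTERN * 2:        # two octaves so every triad fits
        notes.append(notes[-1] + step)
    return notes

def triad_quality(notes, degree):
    root, third, fifth = notes[degree], notes[degree + 2], notes[degree + 4]
    thirds = (third - root, fifth - third)   # sizes of the two stacked thirds
    return {(4, 3): "major", (3, 4): "minor", (3, 3): "diminished"}[thirds]

scale = two_octaves_of_major()
print([triad_quality(scale, d) for d in range(7)])
# ['major', 'minor', 'minor', 'major', 'major', 'minor', 'diminished']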

You will have noticed the reference to a diminished triad. A diminished triad is like a minor triad with a lowered (diminished) fifth - or you can think of it as two superimposed minor thirds. The diminished triad is rarely used in popular music, however - so for the moment we will ignore it. We will revisit the diminished sound at a later point when discussing seventh chords.

When dealing with harmony it is common practice to label the chords built from a major scale with Roman numerals. This enables us to talk about types of chords and chord progressions etc. without reference to a particular key. A large Roman numeral is used to indicate a major chord, while a small Roman numeral is used to indicate a minor chord. A small Roman numeral together with a small circle indicates a diminished chord. The triads built from the C major scale are illustrated below with both Arabic and Roman names.

5.1 Primary chords in major keys

The three most important chords in a major key are those built on the first, fourth and fifth scale degrees (i.e. I, IV and V), and these are described as primary chords. I will give you a brief description of each of the primary chords and discuss some common chord progressions (a chord progression is the movement from one chord to another, and may consist of a single movement or a series of movements). Of course, it is difficult to convey a sense of a chord in words, but these descriptions (and the recorded examples) should give you some idea of the typical sound and function of these important chords.

The I chord is called the tonic chord. It is considered to be the most important chord in the key, and normally has a sense of repose or finality. Most pieces of music begin and end on the tonic chord. After a piece of music moves away from the tonic, there is a type of "harmonic gravity" which results in a return to this chord at some point. The tonic chord may move to any other chord but has no strong tendency to move in a particular direction.

The V chord is the next most important chord in a given key, and is known as the dominant chord. The dominant chord usually has a strong tendency to move back to the tonic - so when V moves to I it is said to be performing a dominant function. This function is further enhanced by the addition of a seventh note, creating a dominant seventh chord (this chord is discussed in more detail later in this chapter). The V-I chord movement is known in classical music theory as a perfect cadence or final cadence. A cadence is a "final" chord progression, typically found at the end of a melodic phrase etc. The V-I progression is so predictable that when a V chord moves to a chord other than I (usually vi) the cadence is described as a deceptive cadence. Example 6.3 illustrates two I-V-I progressions in different keys (and with different melodies and feels), followed by one deceptive cadence (I-V-vi).

The IV chord is known as the subdominant chord (i.e. "below the dominant") and is used in a variety of chord progressions. One common progression involves an upward step movement to V (i.e. IV-V) - this will often occur as part of a IV-V-I movement. In longer chord progressions the IV chord is often used in association with a melody which seems to move away from "home base". The IV chord can also move down to the I chord, creating a plagal cadence (IV-I) - a sound commonly associated with the "amen" of a Christian church hymn. The reverse progression (i.e. V-IV) is used quite often in popular music. The effect of the V-IV movement is to release the tension associated with the V chord (which wants to move to I). If the IV-V or V-IV is repeated, the effect is to prolong the tension by delaying the ultimate resolution to the I chord. Example 6.4 illustrates two I-IV-V-I progressions.

The example below illustrates a typical longer progression involving primary triads (I've notated a simple version of the melody below). Note how the I-V-I sequence at the beginning establishes the sense of the key, but nothing very interesting happens with the melody. When we reach the IV chord, you will hear a typical leaping melodic hook which has a sense of taking the melody away from home base, before the I-V-I at the end re-establishes the tonic and the melody falls again.

Musical Instruments and the Orchestra

A typical symphony orchestra is made up of 4 main sections, namely:

1. Strings
2. Woodwind
3. Brass
4. Percussion

1. Strings Section

The strings section consists mainly of Violin, Viola, Cello and Double Bass. They are the most important instruments in the orchestra and are playing most of the time. The violins are further divided into the First and Second violin sections, each of which could be playing different notes in harmony or in unison.

String instruments produce sound by the vibration of strings that are either bowed or plucked (as in pizzicato). The designs of the 4 string instruments are quite similar except for their sizes. They all have 4 strings stretched across a fretless fingerboard. The bridge is a small, thin piece of wood supporting all 4 strings on top of the body. The vibrations of the strings are transmitted to the sound box (the body) via the bridge. A musician plays a different pitch by changing the position of his fingers on the fingerboard while bowing. A vibrato effect can be achieved by vibrating the finger touching a string. It is to be noted here that when a string vibrates, it fluctuates in a complex manner that produces its fundamental frequency as well as the harmonics that give the instrument's sound its characteristic.

(Figure: the fundamental, 2nd harmonic, 3rd harmonic and 4th harmonic of a vibrating string, and the resulting combined waveform.)

If we were to note down the frequencies of the fundamental and each of the harmonics, we would realize that they form the Harmonic Series.

2. Woodwind Section

The Woodwind section consists mainly of flute, oboe (and Cor Anglais), clarinet and bassoon. In a more modern orchestra setting, the saxophone is added, and it belongs to the woodwind section. The saxophone has a similar mouthpiece and fingering pattern to the clarinet, and the two instruments are thus interchangeable.

The vibration of a column of air within a tube is the principle of sound production of a woodwind instrument. The length of the tube determines the lowest note the instrument can play. To be able to play notes of different pitches, holes are punched along the length of the tube by careful calculation. Different pitches can then be achieved by covering or uncovering the holes.

(Figure: the bore of a woodwind instrument, showing one cycle of the fundamental pitch; when a hole is opened, the fundamental waveform is shortened and thus produces a higher pitch.)

Key action is the part of the instrument that aids playing. It allows a long tube to be played without having to move or stretch the fingers across the length of the instrument. It also allows fast closing and opening of the holes, which makes the playing of fast passages possible.

Flute is the only instrument in the woodwind section which does not use a reed. When a flute player blows into the mouthpiece, air rushes through the hole of the mouthpiece and initiates vibrations of air within the body of the flute. There is a shorter version of the flute called the piccolo, which is half the size and plays an octave higher.

Apart from flute and piccolo, the rest of the woodwind family is divided into the single reed and double reed categories. A reed is a thin piece of bamboo attached to the mouthpiece of the instrument. The sound of a reed instrument comes from vibration of the reed caused by blowing.

The clarinet is a single reed woodwind instrument (so is the saxophone). It has a mouthpiece with a reed attached. A clarinet player can control the tone of the instrument by forcing his lower lip against the reed.

(Figure: a single reed mouthpiece - screws, mouthpiece, reed and air flow.)

Oboe and bassoon are double reed instruments. Their mouthpieces are made of two pieces of bamboo reed tied together. The effect of the double reed when vibrating produces a more pipe-like sound as compared to that of a single reed instrument.

(Figure: a double reed mouthpiece - two reeds.)

The oboe is a treble instrument of the double reed type. It has a lower-pitched counterpart called the Cor Anglais (or English Horn), which is longer and deeper in tone. The bassoon is the bass instrument in the woodwind family. The total length of its air column is about 2 metres. It has a rather comical sound characteristic and thus is used for humorous sections of the music.

3. The Brass Section

The brass section consists mainly of trumpet, trombone, French horn and tuba. The sound of all brass instruments is produced by the vibration of the player's lips pressing against the mouthpiece. The vibration is transmitted into the column of air within the instrument, which is then amplified at the bell-shaped end.

Different pitches can be achieved by two methods. Firstly, with a fixed length of brass pipe, a brass player can play different pitches by changing the strength of his breath and his lip shape; a skillful player can produce the notes of the Harmonic Series. If we examine the Harmonic Series, we will find that the notes get closer together when approaching the higher end of the series. That explains why Baroque trumpets always play at high pitch, as this was the only way melodies could be played with the technology available at the time.

Secondly, different pitches can also be achieved by changing the length of the tube. Instead of punching holes along the length of the instrument (as in the woodwind family), brass instruments produce different notes by changing the physical length of the tube via various means. The trombone employs a slide mechanism to change the length of its tube; this mechanism also allows the player to produce 'glissando' effects. Trumpet, French Horn and Tuba employ a valve and piston system to redirect air into tubes of different lengths. Combined with the player's alteration of breath and lips, a relatively wide range of notes can be played. Within the brass section, the trumpet and French Horn are transposing instruments.
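The "notes get closer together" observation follows directly from the arithmetic of the harmonic series: each harmonic is a whole-number multiple of the fundamental. The short sketch below prints the first few harmonics of an assumed fundamental of 65.4 Hz (roughly a cello's low C, chosen only as an example).

fundamental_hz = 65.4                 # assumed example fundamental

for n in range(1, 9):
    print(f"harmonic {n}: {fundamental_hz * n:6.1f} Hz")

# Each step up adds the same number of Hz, but musically the steps shrink:
# 2:1 is an octave, 3:2 a fifth, 4:3 a fourth, and by 9:8 only a tone.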

4. Transposing Instruments

Woodwind instrument makers have to find the range of notes over which the instrument sounds best, and within an orchestra there is a need to strike a balance between the length of the instrument and its tone colour. The result is that some instruments play at a different pitch from others. One such instrument in the woodwind section is the clarinet. The clarinet comes in different sizes - shorter ones for higher pitch and longer ones for lower pitch - and different sizes of clarinet are used for different music, and sometimes for different sections of the same piece. Musicians could play clarinets of different lengths all at the same pitch by changing their fingering pattern, but it would be difficult for them to adopt a different playing style whenever they change instrument. A better solution is to write the music score in a different key while keeping the musician's fingering pattern consistent. This result is the so-called transposing instrument: the player reads a score in a different key from his other partners, but a transposing instrument plays with a standardized fingering pattern.

The amount of transposition of an instrument is indicated by a note name. The name of the note is the note produced when the player plays a C on the score. For example, a 'clarinet in B flat' will produce a B flat when the player plays a C (with the standard fingering) on the score. Since B flat is a major second (two semitones) lower than C, this clarinet will always sound a major second lower than the key written on the score.

For example, suppose an orchestra is playing a piece in G Major. The music scores for instruments such as violin, piano, trombone and other non-transposing instruments are written in G Major. However, for a clarinet in B flat to participate in this performance, its score has to be written a major second HIGHER to compensate for its lower pitch. In order for the clarinet in B flat to play along with the rest of the orchestra, the clarinet player would have to read a score that is a major second higher, i.e. in A Major. As the player reads the A Major score, his clarinet will sound a major second lower and thus produce a tune in G Major, which is the desired key of the music being played by the whole orchestra. Thus, a little bit of calculation is needed when writing a score for a transposing instrument.

4.1 Transposing Instruments in the Woodwind Section

A. Clarinet in Bb - notes written a major 2nd above the sounding notes.
B. Clarinet in A - notes written a minor 3rd above.
C. Clarinet in Eb - notes written a minor 3rd below.
D. Bass clarinet in Bb - notes written a major 9th above.
E. Cor anglais (in F) - notes written a perfect 5th above.
F. Alto (or bass) flute in G - notes written a perfect 4th above the sounding notes.

Occasionally used:

G. Saxophone in Bb (soprano) - notes written a major 2nd above.
H. Saxophone in Eb (alto) - notes written a major 6th above.
I. Saxophone in Bb (tenor) - notes written a major 9th above.
J. Saxophone in Eb (baritone) - notes written an octave plus a major 6th above.

4.2 Transposing instruments in the brass section

A. Trumpet in Bb - notes written a major 2nd above the sounding notes.
B. Trumpet in D - notes written a major 2nd below.
C. Cornet in Bb - notes written a major 2nd above.
D. Cornet in Eb - notes written a minor 3rd below.
E. French horn in F - notes written a perfect 5th above.
F. Horn in E - notes written a minor 6th above.
G. Horn in D - notes written a minor 7th above.
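The tables above all reduce to one calculation: the written part sits a fixed interval above (or below) the sounding pitch. The sketch below applies that rule for a few of the instruments listed, using MIDI-style note numbers for the arithmetic; the selection of instruments is just a sample, not the full table.

# Semitones by which the WRITTEN note lies above the sounding note
WRITTEN_ABOVE_SOUNDING = {
    "clarinet in Bb": 2,     # written a major 2nd above
    "clarinet in A": 3,      # written a minor 3rd above
    "cor anglais in F": 7,   # written a perfect 5th above
    "french horn in F": 7,
    "clarinet in Eb": -3,    # written a minor 3rd BELOW
}

def written_note(sounding, instrument):
    return sounding + WRITTEN_ABOVE_SOUNDING[instrument]

# For the orchestra's concert G (MIDI 67), the Bb clarinet part shows A (69):
print(written_note(67, "clarinet in Bb"))    # -> 69
print(written_note(67, "clarinet in Eb"))    # -> 64 (written E)

This mirrors the worked example above, where a piece in concert G Major is written in A Major for the clarinet in B flat.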

5. The Percussion Section

The Percussion Section consists of a wide range of instruments that can be divided into two groups - Pitched and Unpitched Percussion. Pitched percussion instruments are those that have a distinctive pitch, such as the Celesta, Glockenspiel, Vibraphone, Tubular Bells, etc. Unpitched percussion instruments, such as cymbals, bass drums, snare drums (side drums) and gongs, generate more overtones than harmonics, and these overshadow the sense of pitch; they are perceived to have no distinctive pitch.

5.1 Pitched Percussion Instruments

(Figure: a timpani.)

5.2 Unpitched Percussion Instruments

History of Western Art Music

Musical styles change from one era in history to the next. These changes are continuous, and so any boundary line between one style period and the next can be only an approximation. The history of western art music can be divided into the following stylistic periods:

Middle Ages (450-1450)
Renaissance (1450-1600)
Baroque (1600-1750)
Classical (1750-1820)
Romantic (1820-1900)
Twentieth century

We know that music played an important role in the cultures of ancient Israel, Greece, and Rome, but hardly any notated music has survived from these ancient civilizations. The first style period to be considered, from which notated music has come down to us, is the European Middle Ages.

1. Middle Ages (450-1450)

1.1 Social background

A thousand years of European history are spanned by the phrase "Middle Ages", beginning around 450 AD with the disintegration of the Roman Empire. This era witnessed the "dark ages", a time of migrations, upheavals, and wars. The Roman Catholic church had great influence on all segments of society. Most important musicians were priests and worked for the church, and writing music and singing became important occupations in the church. With the preeminence of the church, it is no surprise that only sacred music was notated.

1.2 Sacred Music of the Middle Ages

Gregorian Chant

For over 1000 years, the official music of the Roman Catholic church has been Gregorian Chant, which consists of melody set to sacred Latin texts and sung without accompaniment. It is monophonic in texture. The melodies of Gregorian Chant were meant to enhance parts of religious services. Most medieval music was vocal, though a wide variety of instruments served as accompaniment. After about 1100, the organ became the most prominent instrument in the church.

Gregorian Chant is named after Pope Gregory I (the Great), who reorganized the Catholic liturgy during his reign from 590 to 604. Although medieval legend credits Pope Gregory with the creation of Gregorian Chant, we know it evolved over many centuries. Some of its practices came from the Jewish synagogue of the first century after Christ. At first Gregorian melodies were passed along by oral tradition, but as the number of chants grew to the thousands, they were notated to ensure musical uniformity throughout the western church. Most of the several thousand melodies known today were created between 600 and 1300 AD.

The Development Of Polyphony: Organum

For centuries, western music was basically monophonic, having a single melodic line. But sometime between 700 and 900, the first steps were taken in a revolution that eventually transformed western music: monks in monastery choirs began to add a second melodic line to Gregorian Chant. Medieval music that consists of Gregorian Chant and one or more additional melodic lines is called organum.

School Of Notre Dame: Measured Rhythm

After 1150, Paris became the center of polyphonic music. The University of Paris attracted leading scholars, and the Cathedral of Notre Dame was the supreme monument of gothic architecture. From about 1170 to 1200, the Notre Dame composers developed rhythmic innovations. Two successive choirmasters of Notre Dame, Leonin and Perotin, are the first notable composers known by name; they and their followers are referred to as the school of Notre Dame. The music of Leonin and Perotin used measured rhythm, with definite time values and a clearly defined meter. For the first time in music history, notation indicated precise rhythms as well as pitches.

1.3 Secular Music In The Middle Ages

The first large body of secular songs that survives in decipherable notation was composed during the 12th and 13th centuries by French nobles called troubadours and trouvères. Most of the songs deal with love and were preserved because nobles had clerics write them down.

2. Renaissance (1450-1600)

The 15th and 16th centuries in Europe have come to be known as the Renaissance. It was a period of exploration and adventure, and people then spoke of a "rebirth" or "renaissance" of human creativity. During the Renaissance, the dominant intellectual movement, which was called humanism, focused on human life and its accomplishments.

The invention of printing with movable type (around 1450) accelerated the spread of learning. The Catholic church was far less powerful in the Renaissance than during the Middle Ages, for the unity of Christendom was exploded by the Protestant Reformation led by Martin Luther (1483-1546).

2.1 Characteristics of Renaissance Music

In the Renaissance, as in the Middle Ages, vocal music was more important than instrumental music. The humanistic interest in language influenced vocal music, creating a close relationship between words and music, and the word painting technique was used in music. The texture of Renaissance music is chiefly polyphonic; a typical choral piece has four, five or six voice parts of nearly equal melodic interest. While there is a wide range of emotion in Renaissance music, it is usually expressed in a moderate, balanced way, with no extreme contrasts of dynamics, tone color and rhythm. Rhythm is more gentle, and each melodic line has great rhythmic independence.

2.2 Sacred Music

The 2 main forms of sacred music in the Renaissance are the motet and the mass. They are alike in style, but a mass is a longer composition.

2.3 Secular Music

Secular vocal music became increasingly popular, and music was set to poems in various languages throughout Europe. Renaissance secular music was written for groups of solo voices and for solo voice with the accompaniment of one or more instruments. An important kind of secular vocal music during the Renaissance was the madrigal, a piece for several solo voices set to a short poem, usually about love. A madrigal, like a motet, combines homophonic and polyphonic textures, but it more often uses word painting and unusual harmonies.

3. Baroque (1600-1750)

During the Baroque period, European rulers surrounded themselves with luxury. While most of the population barely managed to survive, the ruling class was enormously rich and powerful.

3.1 Characteristics of Baroque music

Unity of mood – a baroque piece usually expresses one basic mood throughout, which is conveyed by continuity of rhythm. Baroque melody also creates a feeling of continuity: an opening melody will be heard again and again in the course of a baroque piece. Paralleling this continuity is the continuity of dynamic level – the volume seems to stay constant for a stretch of time. Late baroque music is predominantly polyphonic in texture: two or more melodic lines compete for the listener's attention. Chords became increasingly important and more significant in themselves.

3.2 Important composers of the baroque period

Antonio Vivaldi (1678-1741)
Johann Sebastian Bach (1685-1750)
George Frideric Handel (1685-1759)

4. Classical (1750-1820)

During the baroque era, scientific methods and discoveries of geniuses like Galileo and Newton vastly changed people's view of the world. By the middle of the 18th century, faith in the power of reason was so great that it began to undermine the authority of the social and religious establishment. Revolutions in thought and action were paralleled by shifts of styles in the visual arts and music.

4.1 Characteristics of the classical style

Contrast of mood – great variety and contrast of mood received new emphasis in classical music. Mood in classical music may change gradually or suddenly, expressing conflicting surges of elation and depression. Flexibility of rhythm adds variety to classical music. Classical melodies are among the most tuneful and easy to remember; they often sound balanced and symmetrical because they are frequently made up of two phrases of the same length. In contrast to the polyphonic texture of late baroque music, classical music is basically homophonic; however, pieces shift smoothly or suddenly from one texture to another. The classical composers' interest in expressing shades of emotion led to the widespread use of gradual dynamic change – crescendo and decrescendo. During the classical period, the desire for gradual dynamic change led to the replacement of the harpsichord by the piano.

4.2 Important composers of the classical period

Joseph Haydn (1732-1809)

Wolfgang Amadeus Mozart (1756-1791)
Ludwig van Beethoven (1770-1827)

5. Romantic (1820-1900)

The early 19th century brought the flowering of romanticism, a cultural movement that stressed emotion, imagination and individualism. Romantic writers emphasized freedom of expression.

5.1 Characteristics of romantic music

Romantic music puts unprecedented emphasis on self-expression and individuality of style; many romantics created music that sounds unique and reflects their personalities. Program music – instrumental music associated with a story, poem, idea or scene – became popular. A programmatic instrumental piece can represent the emotions, characters, and events of a particular story, or it can evoke the sounds and motions of nature. One example is Tchaikovsky's Romeo and Juliet, an orchestral work inspired by Shakespeare's play. Musical nationalism was expressed when romantic composers deliberately created music with a specific national identity. There was more prominent use of chromatic harmony, which employs chords containing tones not found in the prevailing major and minor scales, and new chords and novel ways of using familiar chords were explored. Romantic composers reveled in rich and sensuous sound, using tone color to obtain variety of mood and atmosphere; there were more varieties in tone color. Romantic music also calls for a wide range of dynamics: extreme dynamic markings like ffff and pppp were used.

5.2 Important composers of the romantic period

Franz Schubert (1797-1828)
Robert Schumann (1810-1856)
Frederic Chopin (1810-1849)
Felix Mendelssohn (1809-1847)
Hector Berlioz (1803-1869)
Franz Liszt (1811-1886)
Peter Ilyich Tchaikovsky (1840-1893)
Antonin Dvorak (1841-1904)

Johannes Brahms (1833-1897)
Giacomo Puccini (1858-1924)
Richard Wagner (1813-1883)

6. Twentieth century

The years 1900 to 1913 brought radical new developments in science and art. During the period preceding World War I, discoveries were made that overturned long-held beliefs. Sigmund Freud explored the unconscious and developed psychoanalysis, and Albert Einstein revolutionized the view of the universe with his special theory of relativity. In the visual arts, abstract paintings no longer tried to represent the visual world. In music, there were entirely new approaches to the organization of pitch and rhythm and a vast expansion in the vocabulary of sound used.

6.1 Characteristics of twentieth century music

Tone color has become a more important element of music than it ever was before. It often has a major role, creating variety, continuity, and mood. Noiselike sounds, and especially percussive sounds, are often used, and instruments are played at the very top or bottom of their ranges. Sometimes even typewriters, sirens and automobile brake drums are brought into the orchestra as noisemakers.

The twentieth century brought fundamental changes in the way chords are treated. In the past, chords were divided into 2 opposing types: consonant and dissonant. A consonant chord was stable: it functioned as a point of rest or arrival. A dissonant chord was unstable: its tension demanded onward motion or resolution to a stable, consonant chord. Twentieth century music relies less on pre-established relationships and expectations. In the past, composers depended on the listener's awareness of the general principles underlying the interrelationship of tones and chords; now listeners are guided by musical cues contained only within an individual composition.

There were also new ways of organizing rhythm. Rapidly changing meters are characteristic of twentieth century music: beats are grouped irregularly, and the accented beat comes at unequal time intervals.

6.2 Important composers of the twentieth century

Claude Debussy (1862-1918)
Igor Stravinsky (1882-1971)
Bela Bartok (1881-1945)
George Gershwin (1898-1937)


AE14 – MIDI

1. The Needs for MIDI
2. MIDI Hardware Connections
   2.1 Standard Midi Connections
   2.2 Setting up MIDI connections with Macintosh computer
   2.3 MIDI interface for PC
3. MIDI Messages
   3.1 MIDI Note Numbers
   3.2 Channel Message
   3.3 Channel Voice Message
   3.4 Note On/Off
   3.5 Velocity information
   3.6 Running status
   3.7 Polyphonic key pressure (aftertouch)
   3.8 Control change
   3.9 Channel modes (under the control change category)
4. MIDI Channels and the Multi-Timbre Sound Module
   4.1 Assigning a patch to a "robomusician"
   4.2 Individual control via each MIDI Channel
   4.3 Changing instrumentation – Program Change
   4.4 Active Sense
5. Daisy Chain/MIDI Channels
6. The Advantages of MIDI
7. General MIDI
   GM Patches
   GM Drum Sounds
8. Saving Midi files
9. Midi Synchronization Methods
   1. MIDI Clock
   2. Song Position Pointer (SPP)
   3. MIDI Timecode (MTC)
      3.1 Quarter frame messages
      3.2 Full-Frame Message

MIDI Sequencer
   1. What is a sequencer?
   2. The Hardware Sequencers
      2.1 Keyboard Workstation
   3. Software Sequencers

Digital Sampler
   1. What is a sampler?
      1.1 The sampling process
      1.2 The Editing Process
      1.3 The Programming Process
      1.4 Channel Mapping Process

AE14 – MIDI

Introduction

MIDI stands for Musical Instrument Digital Interface. MIDI is a digital language in binary form, and it is a standard interfacing language used on all modern digital musical instruments and sequencers. This module focuses on the basic theory of MIDI and on sequencing and sampling techniques.

1. The Needs for MIDI

Musical instruments only existed as acoustic devices until the discovery of electricity. This led to the creation of electronic musical instruments – devices that electrically simulated the sound of real instruments. An example is a keyboard. Modern keyboards can play the sound of almost any musical instrument and even non-musical sounds. Each sound, for example the trumpet sound, is called a patch.

The keyboard is made up of two separate sections:

a. The section with the keys – called the keyboard.
b. The section with the tones – called the tone generator or the tone module. The tones in the tone module are the different sounds of the different instruments that the electronic keyboard can play.

These two sections can exist together in one device, or they can exist separately. When they exist separately you have:

a. Only keys – called a keyboard controller. It does not have any sounds on it, so it cannot be used for anything but remote control.
b. Only sounds – called a tone module or a tone generator. It cannot be used without a controller.

Musicians, after playing with electronic musical instruments, desired to be able to control multiple instruments remotely or automatically from a single device. Remote control is when a musician plays one musical instrument, and that instrument controls one or more other musical instruments. Examples of this include combining the sounds of several instruments playing in perfect unison to "thicken" or layer a musical part, or blending certain patches upon those instruments. Perhaps a musician wishes to blend the sax patches upon 5 different instruments to create a more authentic-sounding sax section in a big band. But, since a musician has only two hands and feet, it is not possible to play 5 instruments at once unless he has some method of remote control. MIDI was developed in response to this need.

Since there were existing ways to record and even edit binary data through CPU technology, the next development was dedicated processing hardware and software to handle MIDI messages. This hardware and software allowed MIDI messages to be recorded, edited like one would edit text on a word processor, and played back at a later time. These were called sequencers. An application for sequencers is a situation where musicians want to have "backing tracks" in live performance without using prerecorded tapes; MIDI sequencers offered a more flexible alternative that allowed real time changes in the arrangement.

2. MIDI Hardware Connections

The visible MIDI connectors on an instrument are female 5-pin DIN jacks. You use MIDI cables (with male DIN connectors) to connect the MIDI jacks of various instruments together, so that those instruments can pass MIDI signals to each other. There are separate jacks for incoming MIDI signals (received from another instrument that is sending MIDI signals) and outgoing MIDI signals (ie. MIDI signals that the instrument creates and sends to another device). You connect the MIDI OUT of one instrument to the MIDI IN of another instrument, and vice versa.

Some instruments have a third MIDI jack labeled "Thru". This is used as if it were an OUT jack. In fact, the THRU jack is exactly like the OUT jack with one important difference: any messages that the instrument itself creates (or modifies) are sent out its MIDI OUT jack but not the MIDI THRU jack. Think of the THRU jack as a stream-lined, unprocessed MIDI OUT jack, and therefore you attach a THRU jack only to another instrument's IN jack.

A hardware requirement for connecting a MIDI device to a computer is a MIDI interface. This is simply a device that acts as a translator between the MIDI equipment and the PC, often coming with MIDI IN, OUT and THRU connections on one end and computer connections such as parallel or host ports on the other.

2.1 Standard Midi Connections

The standard MIDI hardware connection is a 5-pin DIN plug. In the diagram below, take note of the position and numbering of the 5 pins. Out of the 5 pins, only 3 are used for MIDI transmissions: Pin 2 is used for shielding, and Pins 4 (+) and 5 (-) are for the actual transmission. Pins 1 and 3 are not used at present; these extra 2 pins are reserved for future development of the MIDI standard. It is interesting to note that before the establishment of the 5-pin DIN plug as the standard, microphone cables and XLR jacks (with 3 pins) were used for MIDI connections.

An alternative connector type, found mainly on Macintosh computers, is a host connector. Here the MIDI IN/OUT/THRU all use this single host cable. This option exists for connection to PCs as well, so often a sound module will have a selector switch at the host port for Mac or PC.

2.2 Setting up MIDI connections with Macintosh computer

i. Connect the MIDI OUT of the Controller Keyboard to the MIDI IN of the Sound Module.

ii. Connect the Host port of the Sound Module to either the Printer or Modem port of the Macintosh.

iii. Make sure that the selector switch of the Sound Module Host port is switched to "Mac" for communication with the Macintosh.

2.3 MIDI interface for PC

There are a few ways of interfacing a PC with MIDI. Each makes use of a different communication port of the PC computer.

Soundcard joystick port – this is the simplest way of connecting MIDI to a PC with a standard Creative-compatible soundcard. A special joystick-port-to-MIDI cable and a software driver are needed to convert the joystick port into a MIDI In/Out interface.

The serial com-port – with a custom-made com-port-to-Host cable and the appropriate software driver, the com-port can function as both the MIDI In and Out through a standard Macintosh Host cable. This setup, however, requires that the MIDI device has a built-in Host port.

The parallel port (printer port) – the parallel port usually works with an external MIDI interface box which provides multiple MIDI Ins and Outs. More advanced models are also equipped with synchronization functions to allow MIDI sync with tape recorders, video and other sequencers.

Dedicated MIDI interface card – there are also internal interface cards that perform similar functions to the external interface boxes.

Some of the major software sequencers are integrated with digital audio and video features, which turn the computer into an integrated audio/video production workstation. Famous examples are Cubase VST by Steinberg and Digital Performer by MOTU.

3. MIDI Messages

Many electronic instruments not only respond to MIDI messages that they receive (at their MIDI IN jack), they also automatically generate MIDI messages while the musician plays the instrument (and send those messages out their MIDI OUT jacks).

When a musician pushes down (and holds down) the middle C key on a keyboard, not only does this sound a musical note, it also causes a MIDI Note-On message to be sent out of the keyboard's MIDI OUT jack. That message consists of 3 numeric values: 144 60 64. If you were to connect a second instrument's MIDI IN jack to the first instrument's MIDI OUT, then the second instrument would "hear" this MIDI message and sound its middle C too.

The musician now releases that middle C key. Not only does this stop sounding the musical note, it also causes another message – a MIDI Note-Off message – to be sent out of the keyboard's MIDI OUT jack. That message consists of 3 numeric values: 128 60 64. Note that one of the values is different than the Note-On message. In other words, the first instrument sends a MIDI Note Off message for that middle C to the second instrument, and the second instrument then stops sounding its middle C note.
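As a rough illustration (a minimal Python sketch, not part of the original notes), the two 3-byte messages described above can be built like this:

# Building the 3-byte Note On and Note Off messages for middle C (note 60),
# velocity 64, on MIDI channel 1.
NOTE_ON_CH1 = 0x90   # 144 decimal: Note On status byte for channel 1
NOTE_OFF_CH1 = 0x80  # 128 decimal: Note Off status byte for channel 1

def note_message(status, note, velocity):
    # A channel voice Note On/Off message is one status byte plus two data bytes.
    return bytes([status, note, velocity])

print(list(note_message(NOTE_ON_CH1, 60, 64)))   # [144, 60, 64] - key pressed
print(list(note_message(NOTE_OFF_CH1, 60, 64)))  # [128, 60, 64] - key released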

But MIDI is more than just "Note On" and "Note Off" messages. There are lots more. There's a message that tells an instrument to move its pitch wheel and by how much. There's a message that tells the instrument to change its patch (ie. maybe from an organ sound to a guitar sound). There's a message that tells the instrument to press or release its sustain pedal. There's a message that tells the instrument to change its volume and by how much. Etc. These are only a few of the many available messages in the MIDI command set. And just like with Note On and Note Off messages, these other messages are automatically generated when a musician plays the instrument. For example, if the musician moves the pitch wheel, a pitch wheel MIDI message is sent out of the instrument's MIDI OUT jack. (Of course, the pitch wheel message is a different group of numbers than either the Note On or Note Off messages). What with all of the possible MIDI messages, everything that the musician did upon the first instrument would be echoed upon the second instrument. It would be like he had two left and two right hands that worked in perfect sync.

3.1 MIDI Note Numbers

A MIDI controller (be it a piano-like keyboard, a MIDI guitar, a MIDI drum kit, etc) can have up to 128 distinct pitches/notes (ie. a MIDI controller can theoretically have 128 keys, or frets, or drum pads, etc). MIDI note numbers don't mean that much to a musician; musicians name the keys using alphabetical names, with sharps and flats, but this is more difficult for MIDI devices to process, so they instead assign a unique number to each key. The numbers used are 0 to 127. The lowest note upon a MIDI controller is a C (as opposed to an A upon an acoustic piano), and it is assigned note number 0. The C# above it would have a note number of 1, the D note above that would have a note number of 2, and so on. "Middle C" is note number 60. A MIDI note number of 69 is used for A440 tuning (that is the A note above middle C). A MIDI controller therefore has a wider range of notes – ie. more octaves of notes – than even an acoustic piano.

Of course, in practice, most controllers have fewer than 128 keys. It's too expensive to manufacture a MIDI controller with 128 keys on it, and it would perhaps be too unwieldy an instrument to play. Besides, most musicians aren't accustomed to playing a keyboard with more than 88 keys (or a drum kit with 128 pads, etc). So typically, a "full-size" keyboard controller will usually have only the 88 keys that a piano has, the lowest key being the low A like upon a piano. (Consequently, its lowest key is actually MIDI Note Number 21. Keys numbered 0 through 59 all lie below middle C upon a full MIDI controller). Never mind that a true "full-size" keyboard controller would have 128 keys and its lowest key would go down to a C two octaves below the piano's A key. Of course, most keyboard controllers have a "MIDI transpose" function so that, even if you don't have the full 128 keys, you can alter the note range that your (more limited) keyboard covers. For example, instead of that lowest A key being assigned to note number 21, you could transpose it down an octave so that it is assigned a note number of 9.

Some MIDI software or devices don't use MIDI note numbers to identify notes to a musician (even though that's what the MIDI devices themselves expect, and what they pass to each other in MIDI messages). Instead, the software/device may display note names, such as F#3 (ie. the F# in the third octave of a piano keyboard), and also octave numbers (as shown in the diagram above).

The status byte is then followed by one or more data bytes. So. Some software/devices instead consider the third octave of the MIDI note range (ie. In that case. A typical Channel Message [Type of message || Channel # ] [ Data byte 1 ] [Data byte 2]… 234 . So they pretend that the third octave is octave 0. (They do this because they may be designed to better conform to a keyboard controller that has a more limited range. MIDI consists of a hierarchy of different categories of messages catering for the different needs of most MIDI devices. and the highest note name is G8. then middle C's note name is C5. the first 2 octaves (that are physically missing) are referred to as -2 and -1.2 Channel Message Channel Messages are those that carry specific MIDI channels. the channel numbers are in the form of 4 binary digits as part of the status byte. and that concerns the octave numbers for note names. These MIDI channel numbers range from binary 0 to 15 but are commonly referred to as MIDI Channel 1 to 16. If your MIDI software/device considers octave 0 as being the lowest octave of the MIDI note range (which it ideally should). because the first two octaves are physically "missing" on the keyboard). one which perhaps doesn't have the two lowest octaves of keys which a 128 key controller would theoretically have. and the highest possible note name is G10 (note number 127). This discrepancy is purely in the way that the software/device displays the note name to you. The lowest note name is then C0 (note number 0). the lowest note name is C-2. Below is a diagram showing different types of MIDI Messages: MIDI Messages Channel Message System Message Voice Mode Common Realtime Exclusiv 3. middle C's note name is C3. 2 octaves below middle C) as octave 0. Beside Note number. nagging discrepancy that has crept up between various models of MIDI devices and software programs.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes There is one. Note On and Note Off messages. In binary form. The other 4 bits of the status byte specify different types of Channel Messages.

2. The table below shows MIDI note numbers related to the musical scale Musical note C –2 C –1 C0 C1 C2 C3 (middle C) C4 C5 C6 C7 CS G8 MIDI note number 0 12 24 36 48 60 (Yamaha convention) 72 84 96 108 120 127 235 . The most common Channel Voice Messages are: Note On Note Off Program Change Control Change Pitch bend Polyphonic Aftertouch Channel Aftertouch 3. and is perhaps less suitable for other instruments and cultures where the definition of pitches is not so black and white.3 Channel Voice Message Channel Voice Messages are ones that most directly concerned with the music being recorded and played.4 Note On/Off A Note On/Off message consists of 1 status bytes and 2 data bytes as shown below: Status Byte 8 bits [Note On/Off || Chn #] Data Bytes 8 bits Note Number 8 bits Velocity MIDI note numbers relate directly to the western musical chromatic scale. Yamaha established the use of C3 for middle C. means have been developed of adapting control to situations where unconventional tunings are required. Nonetheless. This quantisation of the pitch scale is geared very much towards keyboard instruments. although there is a certain degree of confusion here.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes 3. whereas others have used C4 Some software allows the user to decide which convention will be used for display purposes. and the format of the message allows for 128 note numbers which cover a range of a little over ten octaves adequate for the full range of most musical material. Note numbers normally relate to the musical scale as shown in Table 2.

5 Velocity information Note messages are associated with a velocity byte. for example: [Note On || Chn #] [Data] [Velocity] [Data] [Velocity] [Data] [Velocity] It will be appreciated that for a long string of note data this could reduce the amount of data sent by nearly one third. But as in most music each note on is almost always followed quickly by a note off for the same note number. velocity zero as equivalent to a note off message. The former will correspond to the force exerted on the key as it is depressed: in other words.6 Running status When a large amount of information is transmitted over a single MIDI bus. to reduce the amount of data transmitted as much as possible. It involves the assumption that once a status byte has been asserted by a controller there is no need to reiterate this status for each subsequent message of that status. Note off velocity (or 'release velocity') is not widely used. both note on and note off. This velocity value has 128 possible states. Running status is an accepted method of reducing the amount of data transmitted. as the status would be changing from note on to note off very regularly. 'how hard you hit it' (called 'note on velocity'). allowing a string of what appears to be note on messages. 3. Yet it might be useful for a computer sequencer which can record data from a large number of 236 . The Note On Velocity zero value is reserved for the special purpose of turning a note off. its software should interpret this as a note off message. as it relates to the speed at which a note is released. and can be applied internally to scale the effect of one or more of the envelope generators in a synthesiser. Thus a string of note on messages could be sent with the note on status only sent at the start of the series of note data. It will be advantageous. which is not a parameter that affects the sound of many normal keyboard instruments. thus eliminating most of the advantage gained by running status. in order to keep the delay as short as possible and to avoid overloading the devices on the bus with unnecessary data. delays naturally arise due to the serial nature of transmission wherein data such as the concurrent notes of a chord must be sent one after the other. and this is used to represent the speed at which a key was pressed or released. This is the reason for the adoption of note on. If an instrument sees a note number with a velocity of zero. Running status is not used at all times for a string of same status messages. because it avoids a change of status during running status. It is used to control parameters such as the volume or timbre of the note at the audio output of an instrument.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes 3. Nonetheless it is available for special effects if a manufacturer decides to implement it. Indeed. therefore. so long as the status has not changed in between. for reasons which will become clear under 'Running status' below. and one which all MIDI software should understand. in fact. and will often only be called upon by an instrument's software when the rate of data exceeds a certain point. but which is. this method would clearly break down. an examination of the data from atypical synthesiser indicates that running status is not used during a large amount of ordinary playing.

The controller messages have proliferated enormously since the early days of MIDI. It should be noted.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes MIDI devices. Aftertouch is perhaps a slightly misleading term as it does not make clear what aspect of touch is referred to. These come under the control change message group.7 Polyphonic key pressure (aftertouch) The key pressure messages are sometimes called 'aftertouch' by keyboard manufacturers. The polyphonic key pressure message is not widely used. The control change message takes the general form: [Control Change || Chn #] [Controller number] [Data] thus a number of controllers may be addressed using the same type of status byte by changing the controller number. A technique known as 'controller thinning' may be used by a device to limit the rate at which such messages are transmitted. as it transmits a separate value for every key on the keyboard. as the message will be sent for every note in a chord every time the pressure changes. that some manufacturers have shown it to be possible to implement this feature at a reasonable cost.5. and is beyond the scope of many keyboards. and many people have confused it with note off velocity. This can be expensive to implement. and might transmit it all out of a single port on replay. 3. Alternatively this data may be filtered out altogether if it is not required. there is now 237 . a MIDI device may be capable of transmitting control information which corresponds to the various switches. and thus requires a separate sensor for every key. in order not to allow any one device to lag behind in relation to the others. 3.9). As most people do not maintain a constant pressur e on the bottom of a key whilst playing. Even so. so most manufacturers have resorted to the use of the channel pressure message (see below). and this may be implemented either before transmission or at a later stage using a computer. control wheels and pedals associated with it. The message takes the general format: [Aftertouch || Chn #] [Note number] [Pressure] Implementing polyphonic key pressure messages involves the transmission of a considerable amount of data over MIDI which might well be unnecessary. and it is used to instigate effects based on how much the player leans onto the key after depressing it. This message refers to the amount of pressure placed on a key at the bottom of its travel. as the sequencer would be alternating between the addressing of a number of different devices with different statuses. It is often applied to performance parameters such as vibrato. the benefit would not be particularly great even in this case. though. and should be distinguished from program change messages (see section 2. Although the original MIDI standard did not lay down any hard and fast rules for the assignment of physical control devices to logical controller numbers. and not all devices will implement all of them. many messages might be sent per note.8 Control change As well as note information.

in order to conform to the standard format of the message. and the analogue type. Clearly. not all controllers will require this resolution. 1 for ON). Effectively there is a switch between the output of the keyboard and the control input to the sound generators which allows the instrument to play its own sound generators in normal operation when the switch is closed. Seven bits would only allow 128 possible positions of an analogue controller to be represented. such as used in some 'sustain' pedals (although this is not common in the majority of equipment). but. If a system opts not to use the extra resolution offered by the second byte. the switch type. and it would be possible to use just a single bit for this purpose. If the switch is opened. lever. the link is broken and the output from the keyboard feeds the MIDI OUT while the sound generators are controlled from the MIDI IN. and these are often known as continuous controllers. controller numbers &06 and &38 both represent the data entry slider. for example. the data values can make up a 14 bit number (because the first bit of each data word has to be a zero). the first 32 controllers handle the most significant byte (MSbyte) of the controller data.9 Channel modes (under the control change category) Although grouped with the controllers.4 shows a more detailed breakdown of the use of these. and &40-7F for ON. which allows the quantisation of a control's position to be one part in 214(1638410). under the same status. and in practice this is all that is transmitted on many devices. switch states are normally represented by data values between &00 and &3F for OFF. In other words switches are now considered as 7 bit continuous controllers. On/off switches can be represented easily in binary form (0 for OFF. bank select and effects control have been left for coverage by later chapters on MIDI implementation in synthesisers and effects devices. but it is available if needed. 3. For this reason. The control change messages have become fairly complex so coverage of them is divided into a number of sections.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes common agreement amongst manufacturers that certain controller numbers will be used for certain purposes. although the full list is regularly updated by the MMA. and a sound 238 . and these are grouped as shown in Table 2. This is to allow for greater resolution in the quantisation of position than would be feasible with the seven bits that are offered by a single data byte. There are 128 controller numbers available.3. it should send only the MSbyte for coarse control. and these are controlled by the MMA and JMS C. and it may be possible on some instruments to define positions in between off and on in order to provide further degrees of control. In this mode the instrument acts as two separate devices: a keyboard without any sound. while the second 32 handle the least significant byte (LSbyte). In this way. Local on/off' is used to make or break the link between an instrument's keyboard and its own sound generators. as found in the majority of MIDI-controlled musical instruments. Table 2. In older systems it may be found that only &00 = OFF and &7F = ON. The analogue controller is any continuously variable wheel. The first 64 controller numbers relate to only 32 physical controllers (the continuous controllers). Only the LSbyte would be needed for small movements of a control. and it was considered that this might not be adequate in some cases. 
the channel mode messages differ somewhat in that they set the mode of operation of the instrument receiving on that particular channel. The topics of sound control. It should be noted that there are two distinct kinds of controller: that is. slider or pedal that might have any one of a number of positions. Together.

Devices should power-up in this mode according to the original specification. the instrument will ignore the channel number in the status byte and will attempt to act on any data that may arrive. whatever its channel. then four voices will be assigned to adjacent MIDI channels. starting from the basic channel which is the one on which the instrument has been set to receive in normal operation. if the data byte is set to 4. if the data byte is set to zero. some devices may release the quietest note (that with the lowest velocity value). The legato switch controller allows a similar type of playing in polyphonic modes by allowing new note messages only to change the pitch In poly mode the instrument will sound as many notes as it is able at the same time. Mono mode tends to be used mostly on MIDI guitar synthesizers since each string can then have its own channel. whereby MIDI information on each channel controlled a separate monophonic musical voice. 'Omni on' sets the instrument to receive on all of the MIDI channels. as opposed to 'Poly' (phonic) in which a number of notes may be sounded together. The data byte that accompanies the mono mode message specifies how many voices are to be assigned to adjacent MIDI channels. although on cheaper systems all of these voices may be combined to one audio output. and only accept a new note if is not already sounding.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes generator without a keyboard. Exceptionally. starting with the basic receive channel. The mode also has the advantage that it is possible to play in a truly legato fashion . where it may not always be desired that everything played on the master keyboard results in sound from the instrument itself. and each can control its own set of pitch bend and other parameters. Mono mode sets the instrument such that it will only reproduce one note at a time. or the note furthest through its velocity envelope. provided that the software can handle the reception. It is also common to run a device in poly mode on more than one receive channel. The more intelligent of them may look to see if the same note already exists in the notes currently sounding. In this way. Even more intelligently. This used to be one of the only ways of getting a device to generate more than one type of voice at a time. as set by the instrument's controls. to make way for a later arrival. 'Omni off' ensures that the instrument will only act on data tagged with its own channel number(s). all sixteen voices (if they exist) are assigned each to one of the sixteen MIDI channels. This configuration can be useful when the instrument in use is the master keyboard for a large sequencer system. For example.that is with a smooth take over between the notes of a melody because the arrival of a second note message acts simply to change the pitch if the first one is still being held down. but more recent devices will tend to power up in the mode that they were left. whereas others will refuse to play the new note. In other words. Some may be able to route excess note messages to their MIDI OUT ports so that they can be played by a chained device. rather than re-triggering the start of a note envelope. In older devices the mono mode came into its own as a means of operating an instrument in a 'multitimbral' fashion. a single multitimbral instrument can act as sixteen monophonic instruments. 
Instruments differ as to the action taken when the number of simultaneous notes is exceeded: some will release the first note played in favour of the new note. 239 .
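As a rough illustration (a minimal Python sketch, not part of the original notes), the MSbyte/LSbyte controller pairs described above combine into a single 14-bit value like this:

def combine_14bit(msb, lsb):
    # Each data byte carries 7 useful bits, so two bytes give 14 bits (0-16383).
    return (msb << 7) | lsb

print(combine_14bit(0x7F, 0x7F))  # 16383, the maximum position
print(combine_14bit(64, 0))       # 8192, roughly mid-travel using coarse control only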

4. MIDI Channels and the Multi-Timbre Sound Module

Most MIDI sound modules today are "multi-timbral". This means that the module can listen to all 16 MIDI channels at once, and play any 16 of its "patches" simultaneously. It's as if the module had 16 smaller "sub-modules" inside of it, with each sub-module playing its own patch (ie. instrument) upon its own MIDI channel. (I assume one discrete MIDI bus in this "MIDI setup". Some setups have multiple MIDI Ins/Outs with more than 16 MIDI channels, but here, let's talk about a typical MIDI setup which is limited to 16 channels).

Think of these sub-modules as robotic musicians; I'll call them "robomusicians". You have 16 of them inside one multi-timbral module. Now think of MIDI channels as channels (ie. inputs) upon a mixing console. You have 16 of them in any one MIDI setup. Each robomusician (ie. sub-module) has his own microphone plugged into one channel of that 16-channel mixer, so you have individual control over his volume, panning, reverb and chorus levels, and perhaps other settings.

Each robomusician can play any of the hundreds of instruments (ie. patches) in your module, but of course only one instrument at a time. For example, if the robomusician plays a trumpet patch, he can play chords on it, even though a real trumpet is incapable of sounding more than one pitch at a time. In other words, each robomusician can play chords upon any instrument he plays, even if it's traditionally an instrument that can't play chords. The exception is robomusician #10, who can play only drums (in fact, channel #10 is reserved for only drums, and maybe he's the only robomusician who can play the drums; he's on MIDI channel 10 of the mixer). The other robomusicians are super musicians, but each is still restricted to playing only one instrument at a time.

4.1 Assigning a patch to a "robomusician"

Think of a patch as a "musical instrument". Typically, most modules have hundreds of patches (ie. musical instruments) to choose from: violin, cello, trumpet, acoustic guitar, flute, clarinet, harp, harmonica, etc. For example, you typically have Piano, Flute, Bass Guitar and Saxophone patches in a sound module (even the ones built into a computer sound card, often referred to as a "wavetable synth"). The patches are numbered; for example, a trumpet patch may be the fifty-seventh patch available among all of the choices. Since you have 16 robomusicians, you can pick out any 16 instruments (ie. patches) among those hundreds, to be played simultaneously by your 16 robomusicians.

As an example, maybe your arrangement needs a drum kit, a piano, a bass guitar, and a saxophone. So let's say that you tell robomusician 1 to sit at a piano, robomusician 2 to pick up a bass guitar, and robomusician 3 to pick up a saxophone. Let's say that the drums are played by robomusician #10, and you tell the remaining 12 robomusicians to pick up an accordian, etc, so that each robomusician has a different instrument to play.

How do you tell the robomusician to pick up a certain instrument? Hit that button upon his channel and give him a message telling him the number of the patch/instrument you want him to play.
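As a rough illustration (a minimal Python sketch, not part of the original notes, with hypothetical field names), one way to picture the 16 "robomusicians" is a table keyed by MIDI channel, each holding its own patch and mixer settings:

# One entry per MIDI channel (1-16), each with its own patch and volume.
parts = {channel: {"patch": None, "volume": 100} for channel in range(1, 17)}

parts[1]["patch"] = 1     # a piano patch on channel 1
parts[2]["patch"] = 34    # a bass patch on channel 2 (patch numbers vary by module)
parts[10]["patch"] = 1    # channel 10 is conventionally reserved for the drum kit

print(parts[2])  # {'patch': 34, 'volume': 100}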

4.2 Individual control via each MIDI Channel

After you've told the 16 robomusicians what instruments to pick up, you can now have them play a MIDI arrangement with these 16 instruments, each robomusician playing simultaneously with individual control over his volume, panning, etc, thanks to there being 16 MIDI channels in that one MIDI cable that runs to the multi-timbral module's MIDI In. In fact, there are many different things that a robomusician can do independently of the other 15 robomusicians, because there are many different MIDI controller messages that can be sent on any given MIDI channel.

How do you tell a robomusician what notes to play? You send him MIDI Note messages on his channel. How do you tell a robomusician to change his volume? You send him Volume Controller messages on his MIDI channel. How do you tell a robomusician to bend his pitch? You send him Pitch Wheel messages on his MIDI channel. Remember that only that one robomusician "hears" these messages: each robomusician ignores messages that aren't on his channel, and takes notice of only those messages that are on his channel. The other robomusicians see only those messages on their respective channels. For example, let's say that the sax player is robomusician 3, so you send him note messages on MIDI channel 3.

4.3 Changing instrumentation – Program Change

Although you're limited to 16 robomusicians playing 16 instruments simultaneously, any of your robomusicians can change their instruments during the arrangement. For example, let's say that at one point in your arrangement a 17th instrument needs to be played, maybe a Banjo, and the sax player isn't supposed to be playing anything at this point in the arrangement. At that point you've got to have one of your 16 robomusicians put down his current instrument and pick up a Banjo instead. How do you do that over MIDI? Well, that's what MIDI messages are for. The MIDI Program Change message is the one that instructs a robomusician to pick up a certain instrument; contained in the MIDI Program Change message is the number of the desired patch/instrument.

So, you send (to the multi-timbral module's MIDI In) a MIDI Program Change message upon the MIDI channel for that robomusician. For example, you send a MIDI Program Change to robomusician 3 (ie. on MIDI channel 3 – remember that he's the guy who was playing the sax), telling him to pick up a Banjo. Now when you send him note messages, he'll be playing that banjo. Later on, you can send him another MIDI Program Change to tell him to put down the Banjo and pick up the saxophone again (or some other instrument). For example, to tell robomusician 3 to pick up a sax, you send a MIDI Program Change (with a value that selects the Saxophone patch) on MIDI channel 3.

So is there a name for these 16 "robomusicians" or "sub-modules" inside of your MIDI module? Well, different manufacturers refer to them in different ways, and I'm going to use the Roland preference: Parts. A Roland multi-timbral module has 16 Parts inside of it, with each of the 16 patches set to a different MIDI channel, and each usually has its own settings for such things as Volume, Panning, Reverb and Chorus levels, and its MIDI channel (ie. which MIDI data the Part "plays").
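As a rough illustration (a minimal Python sketch, not part of the original notes), a Program Change message carries only one data byte, the patch number, and is addressed to one channel:

def program_change(channel, patch_number):
    status = 0xC0 | (channel - 1)    # 0xC0-0xCF: program change, channels 1-16
    return bytes([status, patch_number])

print(list(program_change(3, 25)))  # [194, 25]: the Part on channel 3 switches patch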

These Parts are completely independent of each other. Just because one Part is receiving a Pitch Wheel message and bending its pitch doesn't mean that another Part has to do the same. Furthermore, each Part has its own way of reacting to MIDI data such as Channel Pressure (often used to adjust volume or brightness), MOD Wheel controller (often used for a vibrato effect), and Pitch Wheel (used to slide the pitch up and down). For example, one Part can cause its patch to sound brighter when it receives Channel Pressure messages that increase in value, while another Part could make its volume increase when it receives increasing Channel Pressure messages.

You'll note that the sound of all 16 robomusicians typically comes out of a stereo output of your sound module (or computer card). That's because most multi-timbral modules have an internal mixer (which can be adjusted by MIDI controller messages to set volume, panning, reverb level, brightness, etc) that mixes the output of all 16 Parts to a pair of stereo output jacks. The 16 microphones and the 16-channel mixing console I alluded to earlier are built into the MIDI module itself (ie. the stereo outputs of the module are like the stereo outputs of that mixing console).

4.4 Active Sense

Active Sense is a type of MIDI message (ie. one that travels over your MIDI cables just like note data or controller data). It's used to implement a "safety feature": it can provide an automatic "MIDI panic" control that will cause stuck notes and other undesirable effects to be resolved whenever the MIDI connection in between modules is broken or impeded in some way.

Here's how Active Sense works. If you aren't playing the controller at any given moment (and therefore the controller isn't sending out MIDI data), then after a little while of sitting there idle, the controller will automatically send out an Active Sense MIDI message that tells the attached sound modules (connected to the controller's MIDI OUT) "Hey, don't worry that you haven't seen any MIDI note data, controller messages, or Pitch Wheel messages from me in awhile. I'm still connected to you. I'm just sitting here idle because my owner isn't playing me at the moment". When the sound modules hear this "reassurance" from the controller, they expect the controller to repeat this same Active Sense message every so often as long as it's sitting idle.

On the other hand, if you walk up to the controller and start playing it, then it will stop sending these "reassurance" messages because now it will be sending other MIDI data instead. And then later, if you again leave the controller sitting idle, it will resume sending Active Sense messages out of its MIDI OUT jack. By constantly "talking" to the sound modules (sending Active Sense messages whenever it has no other MIDI data to send), the controller is able to let the modules know that they're still connected to the controller. That's how the sound modules know that everything is still OK with their MIDI connections. If not for that steady stream of Active Sense messages, there would be no other MIDI data sent to the sound modules while the controller was left idle, and the sound modules would have no way of knowing whether they were still connected to the controller.

Assume that you have two sound modules daisy-chained to the MIDI OUT of a keyboard controller, and that all three units implement Active Sense.

So how is a "safety feature" derived from this? Let's say that you press and hold down a key on the controller. This sends a MIDI Note On message to the sound modules telling them to start sounding a note. Now, while still holding that note down, you disconnect the MIDI cable at the controller's MIDI OUT (ie. disconnect the daisy-chained modules). Even if you now release the key on the controller, there's no way for the controller to tell the modules to stop sounding the note (because you disabled the MIDI connection), so the modules are left holding that "stuck note". After awhile, they start to get worried: they're not getting any note data or controller messages, and they're waiting for some more MIDI messages from the controller. They know that if the controller isn't sending them MIDI note data or controllers, then it should at least be sending them Active Sense messages, and yet they're not getting even that reassurance. So, after each module waits for a suitably long time without receiving an Active Sense message (ie. 300 milliseconds, which is a long time for MIDI devices, but a fraction of a second to humans), the module automatically concludes that the MIDI connection is broken, and it automatically turns off any sounding notes, all by itself. Hence, the stuck note is turned off.

OK, now let's say that instead you disconnected the MIDI cable between the two sound modules (the first module is still attached to the controller's MIDI OUT, but now the second module is no longer daisy-chained to the first module's MIDI OUT). What happens? Well, the first module will still be getting the Active Sense messages from the controller, but these can't be passed on to the second module (because it's disconnected from the MIDI chain). So, the second module will quickly realize that its MIDI connection has been broken. In short, Active Sense is a "safety" feature: it allows a MIDI module to know when its connection to some other unit's MIDI OUT jack has been broken, so that the module can take automatic safety precautions such as turning off any sounding notes, or resetting itself to a default state, in order to avoid being left in a "stuck" or undesirable state.

So how do you know if your unit implements Active Sense? Well, look at the MIDI Implementation Chart in your users manual. For a controller, you'll look down the Transmitted column. For sound modules, you'll look down the Recognized column until you get to Active Sense (it's listed in the AUX Messages section almost at the bottom of the chart). Make sure that there is a circle instead of an X shown beside Active Sense. Note that many (in fact, most) MIDI units do NOT implement Active Sense, and therefore aren't set up for this "safety" feature. If you connect a controller with Active Sense to a module that doesn't implement Active Sense, then the controller's reassurance messages are ignored, and the module won't be able to realize when a connection has been broken. (Typically, Roland gear DOES implement Active Sense; Roland is one of the few companies that adhere to the MIDI spec to this degree).

5. Daisy Chain/MIDI Channels

You can attach a MIDI cable from the second instrument's MIDI THRU to a third instrument's MIDI IN, and the second instrument will pass on to the third instrument those messages that the first instrument sent. We call this "daisy-chaining" instruments. You could add a fourth, fifth, sixth, etc. In this way, all 3 instruments can play in unison.
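As a rough illustration (a minimal Python sketch, not part of the original notes; get_byte and all_notes_off are hypothetical callbacks standing in for a real MIDI input and a real "all notes off" routine), a receiving module's Active Sense watchdog could look like this:

import time

TIMEOUT = 0.3  # the 300 millisecond figure mentioned above

def watch(get_byte, all_notes_off):
    last_seen = time.monotonic()
    while True:
        byte = get_byte()   # next incoming MIDI byte, or None if nothing arrived
        if byte is not None:
            # Any traffic, including the Active Sense byte (0xFE), counts as reassurance.
            last_seen = time.monotonic()
        elif time.monotonic() - last_seen > TIMEOUT:
            all_notes_off()  # assume the cable was pulled; silence any stuck notes
            return
        time.sleep(0.001)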

244 . are all still separated by messages on different channels. Etc. when using MIDI. he can set them all to respond to different channels and therefore have independent control over each one. So the musician can store the messages generated by many instruments in one file. then you can keep each instrument's output separate. if a musician has several instruments daisy-chained. and you send a MIDI Note On on channel 2. and then do things with that MIDI data on the computer with software. For example. when the musician plays each instrument. The Advantages of MIDI There are two main advantages of MIDI -. it generates MIDI messages only on its one channel. then the instrument will not play the note. it "hears" the messages from all instruments over just one incoming cable. that MIDI message for the middle C note can be sent on channel 1. For example. if you have only 2 digital audio tracks (typically). Most all MIDI instruments allow the musician to select which channel(s) to respond to and which to ignore. 6. those MIDI messages) of all of his instruments. such as moving the pitch wheel. and the process would undoubtably be far from perfect. Or. a musician never loses control over every single individual action that he made upon each instrument. and yet the messages can be easily pulled apart on a per instrument basis because each instrument's MIDI messages are on a different MIDI channel. pressing the sustain pedal. He does this not by digitizing the actual audio coming out of all of his electronic instruments. How is this possible? There are 16 MIDI "channels". For example. from playing a particular note at a particular point. They all exist in that one run of MIDI cables that daisy-chain 2 or more instruments (and perhaps a computer) together. daisy-chained instruments don't always have to play in unison either. If you've got a system that has 16 stereo digital audio tracks. and if you want to edit a certain note of one instrument's part.SCHOOL OF AUDIO ENGINEERING AE14 – Midi Student Notes But. Once you mix the analog signals together. but it's put together in such a way that every single musical action can be easily examined and edited. it would take massive amounts of computation to later filter out separate instruments. They're analog. but rather by "recording" the MIDI OUT (ie. Remember that the MIDI messages for all of those instruments go over one run of cables. The great advantage of MIDI is that the "notes" and other musical actions. the software can playback MIDI messages upon all 16 channels with the same rhythms as the human who originally caused the instrument(s) to generate those messages. Also. So. if you set an instrument to respond to MIDI messages only on channel 1. The data is all there. software can store MIDI messages to the computer's disk drive. then you've got to mix the audio signals together before you digitize them. etc. Also. it's very easy to keep the MIDI data separate for each instrument even though it all goes over one long run of cables. a musician can digitally record his musical performance and store it on the computer (to be played back by the computer). you lose control over each instrument's output. so if you put the computer at the end. Because MIDI is a digital signal. the MIDI Note On message for middle C on channel 1 will be slightly different than the MIDI Note On message for middle C on channel 2. to pushing the sustain pedal at a certain time. that's even less feasible. But. etc.it's an easily edited/manipulated form of data. 
But, daisy-chained instruments don't always have to play in unison either. Each can play its own, individual musical part even though all of the MIDI messages controlling those daisy-chained instruments pass through each instrument. How is this possible? There are 16 MIDI "channels", and they all exist in that one run of MIDI cables that daisy-chains 2 or more instruments (and perhaps a computer) together. That MIDI message for the middle C note can be sent on channel 1, or it can be sent on channel 2, etc; the MIDI Note On message for middle C on channel 1 will be slightly different than the MIDI Note On message for middle C on channel 2. Most all MIDI instruments allow the musician to select which channel(s) to respond to and which to ignore. So, if you set an instrument to respond to MIDI messages only on channel 1, and you send a MIDI Note On on channel 2, then the instrument will not play the note. Also, when the musician plays each instrument, it generates MIDI messages only on its one channel. So, if a musician has several instruments daisy-chained, he can set them all to respond to different channels and therefore have independent control over each one.

6. The Advantages of MIDI

There are two main advantages of MIDI: it's an easily edited/manipulated form of data, and it's a compact form of data (ie. it produces relatively small data files). Because MIDI is a digital signal, it's very easy to interface electronic instruments to computers, and then do things with that MIDI data on the computer with software. For example, software can store MIDI messages to the computer's disk drive. So, a musician can digitally record his musical performance and store it on the computer (to be played back by the computer). He does this not by digitizing the actual audio coming out of all of his electronic instruments, but rather by "recording" the MIDI OUT (ie. those MIDI messages) of all of his instruments. Remember that the MIDI messages for all of those instruments go over one run of cables, so if you put the computer at the end, it "hears" the messages from all instruments over just one incoming cable. The musician can therefore store the messages generated by many instruments in one file, and yet the messages can be easily pulled apart on a per instrument basis, because each instrument's MIDI messages are on a different MIDI channel. The software can then play back MIDI messages upon all 16 channels with the same rhythms as the human who originally caused the instrument(s) to generate those messages.

The great advantage of MIDI is that, when using MIDI, a musician never loses control over every single individual action that he made upon each instrument, from playing a particular note at a particular point, to pushing the sustain pedal at a certain time, to moving the pitch wheel, etc. The "notes" and other musical actions are all still separated by messages on different channels. The data is all there, but it's put together in such a way that every single musical action can be easily examined and edited. So it's very easy to keep the MIDI data separate for each instrument, even though it all goes over one long run of cables, and if you want to edit a certain note of one instrument's part, you can.

Contrast this with digitizing the audio output of all of those electronic instruments. Those instruments' audio outputs don't produce digital signals; they're analog. If you have only 2 digital audio tracks (typically), then you've got to mix the audio signals together before you digitize them. Once you mix the analog signals together, you lose control over each instrument's output, and it would take massive amounts of computation to later filter out separate instruments, and the process would undoubtedly be far from perfect. If you've got a system that has 16 stereo digital audio tracks, then you can keep each instrument's output separate, but if you want to edit a certain note of one instrument's part, that's even less feasible.
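As a rough illustration (a minimal Python sketch, not part of the original notes), pulling one instrument's part out of a combined recording is just a matter of filtering by channel number:

recorded = [
    (0x90, 60, 100),  # Note On, channel 1
    (0x91, 64, 100),  # Note On, channel 2
    (0x80, 60, 64),   # Note Off, channel 1
]

def on_channel(events, channel):
    # The low 4 bits of the status byte identify the channel (0-15 -> 1-16).
    return [e for e in events if (e[0] & 0x0F) + 1 == channel]

print(on_channel(recorded, 2))  # [(145, 64, 100)] - only the channel 2 instrument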

Furthermore, it typically takes much more storage to digitize the audio output of an instrument than it does to record the instrument's MIDI messages. Why? Say that you want to record a whole note. With MIDI there are only two messages involved: a Note On when the musician presses the key, and then nothing until he finally releases it, which produces a Note Off. That is 6 bytes. You could hold the note down for an hour and you would still have only 6 bytes. With digital audio, by contrast, you have to be recording for the entire time that the note is sounding; while the instrument is making sound, the computer is storing literally thousands of bytes of waveform data representing the signal coming out of the instrument's AUDIO OUT. (A rough worked comparison appears at the end of this section.)

So why not always record and play MIDI data instead of WAVE data, if the former offers so many advantages? For electronic instruments that is a great idea. But what if you want to record someone singing? You can strip-search the person, but you are not going to find a MIDI OUT jack on his body. (I anxiously await the day when scientists offer "human MIDI retrofits". I'd love a built-in MIDI OUT jack, interpreting every one of my motions and thoughts into MIDI messages. I'd have it installed at the back of my neck, beneath my hairline; nobody would ever see it, but when I needed to use it I'd just push back my hair and plug in the cable.) Until then, singing has to be recorded as audio, and digitizing it into a WAVE file is the best digital option right now. That is why sequencer programs exist that record and play both MIDI and WAVE data.

7. General MIDI

On MIDI sound modules (ie, devices whose Patches are instrumental sounds), it became desirable to define a standard set of Patches in order to make modules more compatible. A standard was set for 128 Patches which must appear in a specific order, and this standard is called General MIDI (GM). For example, it was decided that Patch number 1 on all GM modules should be the sound of an Acoustic Grand Piano, so no matter what GM module you use, when you change to Patch number 1 you always hear some sort of Acoustic Grand Piano. Patch number 25 on a GM module must be a Nylon String Guitar, and so on. The patches are arranged into 16 "families" of instruments, with each family containing 8 instruments; for example, among the 8 instruments of the Reed family you will find Saxophone, Oboe, and Clarinet. Furthermore, all patches must sound an A440 pitch when receiving MIDI note number 69. The chart GM Patches shows the names of all GM Patches and their respective Program Change numbers.

A GM sound module should be multi-timbral, meaning that it can play MIDI messages upon all 16 channels simultaneously, with a different GM Patch sounding for each channel. If the GM module also has a built-in "drum module" (usually one of its 16 Parts), then each of that Drum Part's MIDI notes triggers a different drum sound; the Drum Part is normally set to receive MIDI messages on channel 10. The assignment of drum sounds to MIDI notes is shown in the chart GM Drum Sounds.
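Returning to the whole-note example above, here is a rough storage comparison (an illustration, not from the notes; the audio format assumed is 16-bit, 44.1 kHz stereo):

# One whole note held for 2 seconds.
# MIDI: a Note On plus a Note Off, 3 bytes each.
midi_bytes = 3 + 3

# Digital audio: 44,100 samples/s * 2 bytes/sample * 2 channels.
seconds = 2
audio_bytes = 44_100 * 2 * 2 * seconds

print(midi_bytes)                 # 6
print(audio_bytes)                # 352,800 bytes (~344 KB)
print(audio_bytes // midi_bytes)  # roughly 58,800 times more data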

The GM standard makes it easy for musicians to put Program Change messages in their MIDI (sequencer) song files, confident that those messages will select the correct instruments on any GM sound module, so the song file plays all of the correct instrumentation automatically. Because the drum sounds are mapped out across the MIDI note range in a fixed way, musicians also need not worry that, for example, a snare drum part will be played back on a cymbal, or that parts will come back in the wrong octave. All of these standards help to ensure that MIDI Files play back properly upon setups of various equipment.

The GM spec also spells out other minimum requirements that a GM module should meet. It should respond to velocity for note messages (velocity is usually hard-wired to control the VCA level, ie, the volume of each note, although some modules allow velocity to affect other parameters), respond to Pitch and Modulation Wheels, and be able to play 24 notes simultaneously, with dynamic voice allocation between the 16 Parts. The module should also respond to Channel Pressure (often used to control VCA level, or VCO level for vibrato depth) and to the following MIDI controller messages: Modulation (1), usually hard-wired to control LFO amount, ie, vibrato; Channel Volume (7); Pan (10); Expression (11); Sustain (64); Reset All Controllers (121); and All Notes Off (123). Additionally, it should respond to these Registered Parameter Numbers: Pitch Wheel Bend Range (0), Fine Tuning (1), and Coarse Tuning (2).

Finally, the GM spec spells out a few global defaults: the pitch wheel bend range should default to +/- 2 semitones, Channel Volume should default to 90, and initial tuning should be standard (A440 reference), with all other controllers and effects off, including a pitch wheel offset of 0.

NOTE: The GM spec does not dictate how a module produces its sound. One module could use cheap FM synthesis to simulate the Acoustic Grand Piano patch, while another could use 24 digital audio waveforms of various piano notes, mapped out across the MIDI note range, to create that one Piano patch. The two patches will not sound exactly alike, but at least they will both be piano patches. So too, GM does not dictate VCA envelopes for the various patches; the Sax patch on one module may have a longer release time than the same patch on another. The GM standard is actually a separate document, not encompassed in the MIDI specification itself. There is also no reason why someone cannot set up the Patches in his sound module to be entirely different sounds than the GM set; most MIDI sound modules offer such programmability, but they need to allow the musician to switch to GM mode when desired so that the many MIDI files that expect a GM module play correctly. A MIDI System Exclusive message can be used to turn a module's General MIDI mode on or off, which is useful for modules that also offer more expansive non-GM playback modes or extra programmable banks of patches beyond the GM set.
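As a concrete illustration of the messages just mentioned, the sketch below builds the universal System Exclusive message commonly used to switch General MIDI mode on, and the Registered Parameter Number sequence for setting the pitch-bend range. The byte values follow the published MIDI/GM conventions; the helper names themselves are only illustrative.

# GM System On: F0 7E <device> 09 01 F7  (09 02 switches GM off).
def gm_system_on(device_id=0x7F):           # 0x7F = "all devices"
    return bytes([0xF0, 0x7E, device_id, 0x09, 0x01, 0xF7])

# Pitch Wheel Bend Range is Registered Parameter Number 0.
# Select the RPN with CC#101/CC#100, then send the value with CC#6.
def set_bend_range(channel, semitones=2):
    ch = 0xB0 | (channel - 1)               # 0xB0 = Control Change
    return bytes([ch, 101, 0,               # RPN MSB = 0
                  ch, 100, 0,               # RPN LSB = 0
                  ch, 6, semitones])        # Data Entry MSB

print(gm_system_on().hex())
print(set_bend_range(1, 2).hex())           # GM default of +/- 2 semitones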

8. GM Patches

This chart shows the names of all 128 GM Instruments and the MIDI Program Change numbers (Prog#) which select them.

PIANO: 1 Acoustic Grand, 2 Bright Acoustic, 3 Electric Grand, 4 Honky-Tonk, 5 Electric Piano 1, 6 Electric Piano 2, 7 Harpsichord, 8 Clavinet
CHROMATIC PERCUSSION: 9 Celesta, 10 Glockenspiel, 11 Music Box, 12 Vibraphone, 13 Marimba, 14 Xylophone, 15 Tubular Bells, 16 Dulcimer
ORGAN: 17 Drawbar Organ, 18 Percussive Organ, 19 Rock Organ, 20 Church Organ, 21 Reed Organ, 22 Accordion, 23 Harmonica, 24 Tango Accordion
GUITAR: 25 Nylon String Guitar, 26 Steel String Guitar, 27 Electric Jazz Guitar, 28 Electric Clean Guitar, 29 Electric Muted Guitar, 30 Overdriven Guitar, 31 Distortion Guitar, 32 Guitar Harmonics

BASS: 33 Acoustic Bass, 34 Electric Bass (finger), 35 Electric Bass (pick), 36 Fretless Bass, 37 Slap Bass 1, 38 Slap Bass 2, 39 Synth Bass 1, 40 Synth Bass 2
SOLO STRINGS: 41 Violin, 42 Viola, 43 Cello, 44 Contrabass, 45 Tremolo Strings, 46 Pizzicato Strings, 47 Orchestral Strings, 48 Timpani
ENSEMBLE: 49 String Ensemble 1, 50 String Ensemble 2, 51 SynthStrings 1, 52 SynthStrings 2, 53 Choir Aahs, 54 Voice Oohs, 55 Synth Voice, 56 Orchestra Hit
BRASS: 57 Trumpet, 58 Trombone, 59 Tuba, 60 Muted Trumpet, 61 French Horn, 62 Brass Section, 63 SynthBrass 1, 64 SynthBrass 2
REED: 65 Soprano Sax, 66 Alto Sax, 67 Tenor Sax, 68 Baritone Sax, 69 Oboe, 70 English Horn, 71 Bassoon, 72 Clarinet
PIPE: 73 Piccolo, 74 Flute, 75 Recorder, 76 Pan Flute, 77 Blown Bottle, 78 Shakuhachi, 79 Whistle, 80 Ocarina

SYNTH LEAD: 81 Lead 1 (square), 82 Lead 2 (sawtooth), 83 Lead 3 (calliope), 84 Lead 4 (chiff), 85 Lead 5 (charang), 86 Lead 6 (voice), 87 Lead 7 (fifths), 88 Lead 8 (bass+lead)
SYNTH PAD: 89 Pad 1 (new age), 90 Pad 2 (warm), 91 Pad 3 (polysynth), 92 Pad 4 (choir), 93 Pad 5 (bowed), 94 Pad 6 (metallic), 95 Pad 7 (halo), 96 Pad 8 (sweep)
SYNTH EFFECTS: 97 FX 1 (rain), 98 FX 2 (soundtrack), 99 FX 3 (crystal), 100 FX 4 (atmosphere), 101 FX 5 (brightness), 102 FX 6 (goblins), 103 FX 7 (echoes), 104 FX 8 (sci-fi)
ETHNIC: 105 Sitar, 106 Banjo, 107 Shamisen, 108 Koto, 109 Kalimba, 110 Bagpipe, 111 Fiddle, 112 Shanai

PERCUSSIVE: 113 Tinkle Bell, 114 Agogo, 115 Steel Drums, 116 Woodblock, 117 Taiko Drum, 118 Melodic Tom, 119 Synth Drum, 120 Reverse Cymbal
SOUND EFFECTS: 121 Guitar Fret Noise, 122 Breath Noise, 123 Seashore, 124 Bird Tweet, 125 Telephone Ring, 126 Helicopter, 127 Applause, 128 Gunshot

Prog# refers to the MIDI Program Change number that causes the Patch to be selected. Note, however, that MIDI modules count the first Patch as 0, not 1, so the value actually sent in the Program Change message is one less than the number shown; for example, the Patch number for Reverse Cymbal is sent as 119 rather than 120. When you enter a Patch number using sequencer software or your module's control panel, the software or module understands that humans normally count from 1 and automatically does this subtraction when it generates the Program Change message. The decimal numbers shown here are what the user normally sees on the module's display (or in a sequencer's Event List). So, sending a Program Change with a displayed value of 120 (actually transmitted as 119) to a Part causes the Reverse Cymbal Patch to be selected for playing that Part's MIDI data.

8.1 GM Drum Sounds

This chart shows which drum sound is assigned to each MIDI note on a GM module that has a drum Part.

35 Acoustic Bass Drum, 36 Bass Drum 1, 37 Side Stick, 38 Acoustic Snare, 39 Hand Clap, 40 Electric Snare,
41 Low Floor Tom, 42 Closed Hi-Hat, 43 High Floor Tom, 44 Pedal Hi-Hat, 45 Low Tom, 46 Open Hi-Hat,
47 Low-Mid Tom, 48 Hi-Mid Tom, 49 Crash Cymbal 1, 50 High Tom, 51 Ride Cymbal 1, 52 Chinese Cymbal,
53 Ride Bell, 54 Tambourine, 55 Splash Cymbal, 56 Cowbell, 57 Crash Cymbal 2, 58 Vibraslap,
59 Ride Cymbal 2, 60 Hi Bongo, 61 Low Bongo, 62 Mute Hi Conga, 63 Open Hi Conga, 64 Low Conga,
65 High Timbale, 66 Low Timbale, 67 High Agogo, 68 Low Agogo, 69 Cabasa, 70 Maracas,
71 Short Whistle, 72 Long Whistle, 73 Short Guiro, 74 Long Guiro, 75 Claves, 76 Hi Wood Block,
77 Low Wood Block, 78 Mute Cuica, 79 Open Cuica, 80 Mute Triangle, 81 Open Triangle
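Returning to the Prog# numbering note above, the off-by-one behaviour can be captured in a small helper (illustrative only):

# GM patch numbers as displayed (1-128) versus the value sent on the wire
# in a Program Change message (0-127).
def program_change(channel, displayed_patch):
    """Build a Program Change selecting a patch shown to the user as 1-128."""
    return bytes([0xC0 | (channel - 1), displayed_patch - 1])

print(program_change(1, 1).hex())     # 'c000' -> Acoustic Grand
print(program_change(1, 120).hex())   # 'c077' -> Reverse Cymbal (sent as 119)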

9. Saving MIDI Files

A MIDI file is a data file. It stores information just as a text (ie, ASCII) file may store the text of a story or newspaper article, but a MIDI file contains musical information. Specifically, a MIDI file stores MIDI data: the commands that musical instruments transmit between each other to control such things as playing notes and adjusting an instrument's sound in various ways. MIDI is binary data, and a MIDI file is therefore a binary file. You can load a MIDI file into a text editor and view it, but it will look like gibberish, since the data is not ASCII text. (If you need to convert a MIDI file to readable text, you can use my MIDI File Disassembler/Assembler utility, available on this web site.)

The MIDI file format was devised to store any kind of MIDI message, including System Exclusive messages, along with a timestamp for each message. A timestamp is simply the time when the message was generated. Using the timestamps, a sequencer can play back all of the MIDI messages in the file at the same relative times as when they were originally generated, preserving the original musical rhythms. A MIDI file can also store other information relating to a musical performance, such as tempo, key signature and time signature, and it even has provisions for storing the names of a sequencer's tracks and other sequencer settings. In other words, a MIDI file is a generic, standardized file format designed to store musical performances, and it is used by many sequencers. MIDI files are not specific to any particular computer platform or product, and other software and hardware devices besides sequencers may use them. For example, a patch editor may store an instrument's patch settings in a MIDI file by storing System Exclusive messages (received from the instrument) within it; in that case the patch editor may not care about the timestamp associated with each SysEx message.

There are three different Types (sometimes called Formats) of MIDI file (a sketch of how the Type is read from the file header follows at the end of this section):

1. Type 0 files contain only one track, and all of the MIDI messages (the entire performance) are placed in that one track, even if the performance contains many musical parts on different MIDI channels.
2. Type 1 files separate each musical part onto its own track. Both Type 0 and Type 1 store one "song", "pattern" or musical performance.
3. Type 2 files, which are extremely rare, store a collection of songs or patterns (for example, numerous drum beats) and are akin to a collection of Type 0 files crammed into one MIDI file. (If you need to convert a MIDI file between the various Types, use my MIDI File Conversion utility, available on this web site.)

MIDI Synchronization Methods

1. MIDI Clock

Early MIDI-era equipment achieved synchronization via a separate "sync" connection carrying a clock signal at one of a number of rates, usually described in pulses per quarter note (ppqn). As different manufacturers had their preferred rates, rate converters were needed in order to synchronize equipment of different makes.
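Returning to the three file Types above, the Type is stored in the 14-byte header chunk of a Standard MIDI File. A small sketch of reading it (the field layout follows the Standard MIDI File specification; the file name is just a placeholder):

import struct

def read_smf_header(path):
    """Return (format_type, num_tracks, division) from a Standard MIDI File."""
    with open(path, "rb") as f:
        tag, length = struct.unpack(">4sI", f.read(8))
        if tag != b"MThd" or length != 6:
            raise ValueError("not a Standard MIDI File")
        fmt, ntrks, division = struct.unpack(">HHH", f.read(6))
        return fmt, ntrks, division   # fmt is 0, 1 or 2 as described above

# Example (placeholder path):
# print(read_smf_header("song.mid"))   # e.g. (1, 16, 480) -> Type 1, 480 ppqn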

Once standard MIDI messages could be used for synchronization, the separate "pulse sync" method became obsolete. A group of system messages called the system real time messages controls the execution of timed sequences in a MIDI system. The real time messages concerned with synchronization, all of which are single bytes, are:

• Timing Clock
• Start
• Continue
• Stop

The timing clock (often referred to as "MIDI beat clock") is a single status byte (&F8) issued by the controlling device six times per MIDI beat. A MIDI beat is equivalent to a musical semiquaver (16th note), so 24 MIDI clocks are transmitted per quarter note (24 ppqn). Any device controlling the sequencing of other instruments should generate clock bytes at the appropriate intervals, and any change of tempo within the system should be reflected in a change in the rate of MIDI clocks. On receipt of &F8, a device which handles timing information should increment its internal clock by the relevant amount. Start, Stop and Continue are used to control the remote machine's replay: Start causes the remote machine to play the song from the beginning, Stop halts playback, and Continue resumes playback from where it last stopped. Because timing is vital in synchronization, system real time messages are given priority over note messages when transmitted together; even so, slight errors can occur because a real time message may still have to wait for the voice messages ahead of it to be processed, and it is advisable to avoid transmitting the two kinds of message together in the same direction where possible. Using this method alone it is difficult or impossible to start a song from a middle position.

2. Song Position Pointer (SPP)

SPP was introduced to overcome that problem and to control autolocation within a stored song. A receiving device increments its internal song pointer after every six MIDI clocks (ie, one MIDI beat) have passed, and SPP messages are used in conjunction with this song pointer. An SPP message is sent followed by Continue; this tells the remote machine to relocate its song pointer and start playback from there. SPP therefore enables slave machines to follow the master even when the master is fast-forwarded or rewound. (A byte-level sketch appears at the end of this section.)

3. MIDI Timecode (MTC)

MIDI timecode has two specific functions: firstly, to provide a means of distributing conventional SMPTE/EBU timecode data around a MIDI system in a format that is compatible with the MIDI protocol; and secondly, to provide a means of transmitting "setup" messages which may be downloaded from a controlling computer to receivers in order to program them with cue points at which certain events are to take place. Unlike MIDI Clock, MTC represents timing in the form of hours : minutes : seconds : frames.
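As a byte-level illustration of the clock, Start/Stop/Continue and Song Position Pointer messages described above (not part of the original notes):

TIMING_CLOCK, START, CONTINUE, STOP = 0xF8, 0xFA, 0xFB, 0xFC

def song_position_pointer(sixteenth_notes):
    """Build an SPP message: status &F2 plus a 14-bit beat count (LSB first)."""
    lsb = sixteenth_notes & 0x7F
    msb = (sixteenth_notes >> 7) & 0x7F
    return bytes([0xF2, lsb, msb])

# Locate bar 9 of a 4/4 song: 8 bars * 16 sixteenths = 128 MIDI beats.
print(song_position_pointer(128).hex())   # 'f20001'
# A master would typically send this, then CONTINUE (0xFB), then clocks (0xF8).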

In an LTC timecode frame, two binary data groups are allocated to each of hours, minutes, seconds and frames, representing the tens and units of each, so there are eight binary groups in total representing the time value of a frame. To transmit this information over MIDI it has to be turned into a format that is compatible with other MIDI data, i.e. a status byte followed by the relevant data bytes. There are two types of MTC synchronizing message:

1. Quarter-frame messages, which update a receiver regularly with running timecode.
2. The full-frame message, which transmits a one-time update of the timecode position for situations such as the high-speed spooling of tape machines; it is transmitted as a universal realtime System Exclusive message.

3.1 Quarter frame messages

One timecode frame is represented by too much information to be sent in one standard MIDI message, so it is broken down into eight separate messages, denoted by the status byte &F1. Each message of the group of eight represents one part of the timecode frame value and takes the general form &[F1][data]. The data byte begins with zero; the remaining seven bits consist of a 3-bit code defining whether the message represents the hours, minutes, seconds or frames, MS nibble or LS nibble, followed by the four bits carrying the binary value of that nibble.

Status byte &F1, followed by one data byte [type | time data]:

0000  Frames LS nibble
0001  Frames MS nibble
0010  Seconds LS nibble
0011  Seconds MS nibble
0100  Minutes LS nibble
0101  Minutes MS nibble
0110  Hours LS nibble
0111  Hours MS nibble
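A sketch of how a receiver might reassemble the eight quarter-frame messages into an hours:minutes:seconds:frames value (illustrative only; the 3-bit type codes are those listed above):

def assemble_quarter_frames(data_bytes):
    """data_bytes: the 8 data bytes that followed eight &F1 status bytes,
    covering type codes 0..7 (frames LS ... hours MS)."""
    nibbles = {}
    for d in data_bytes:
        msg_type = (d >> 4) & 0x07     # 3-bit code: which piece this is
        nibbles[msg_type] = d & 0x0F   # 4-bit value of that nibble
    frames  = nibbles[0] | (nibbles[1] << 4)
    seconds = nibbles[2] | (nibbles[3] << 4)
    minutes = nibbles[4] | (nibbles[5] << 4)
    hours   = nibbles[6] | ((nibbles[7] & 0x01) << 4)   # only the low bit;
    # the other bits of the hours MS nibble carry the frame-rate type.
    return hours, minutes, seconds, frames

# Example: 01:23:45:10 spread across eight quarter-frame data bytes.
example = [0x0A, 0x10, 0x2D, 0x32, 0x47, 0x51, 0x61, 0x70]
print(assemble_quarter_frames(example))   # (1, 23, 45, 10)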

Nibble: half a byte (4 bits); LS: least significant; MS: most significant.

At a frame rate of 30 fps, 30 x 8 = 240 quarter-frame messages would have to be transmitted per second if the receiving device were to be updated every frame, but this was considered too great an overhead in transmitted data. Quarter-frame messages are therefore sent over MIDI at a rate of 120 messages per second, so the receiving device is updated every two frames. The receiver must in fact maintain a two-frame offset between displayed timecode and received timecode, since the frame value takes two frames to transmit completely. Quarter-frame messages may be transmitted in forward or reverse order, to emulate timecode running either forwards or backwards, with the "frames LS nibble" message transmitted on the boundary of the timecode frame that it represents. If MTC is transmitted continuously over MIDI it takes up approximately 7.5% of the available data bandwidth.

3.2 Full-frame message

The full-frame message is categorised as a universal realtime System Exclusive message. Its format is:

&[F0][7F][dev.ID][01][01][hh][mm][ss][ff][F7]

MIDI Sequencer

1. What is a sequencer?

A sequencer can be thought of as a computer dedicated to storing, processing and editing MIDI messages. Its basic functions are to record MIDI messages and store them with references to timing information so that, during playback, the messages can be reproduced with their exact original timing. Because MIDI messages are digital information, they can be altered once stored: editing such as correcting wrong notes, transposing notes, or changing tempo and time signature can be carried out on the recorded data. These capabilities allow a sequencer to perform the basic functions of a conventional tape recorder. Sequencers come in various forms, but all of them resemble a standard computer system equipped with an operating system, microprocessor and memory, integrated with controls designed for sequence-specific functions.

2. The Hardware Sequencers

The most common type of sequencer used during the 70s and 80s was the so-called "hardware sequencer": a portable machine with an LCD (LED on the earliest models) screen and control panel.

Compared with software sequencers, hardware sequencers are restricted by their small LCD screens, which make editing difficult, especially for beginners, and by their limited RAM, which restricts the number of MIDI notes that can be recorded. Their advantage is portability, which keeps them popular even today for on-stage performance. On the side of the box are found Sync ports, MIDI IN and THRU/OUT. A famous example of the hardware sequencer is the MC-50 by Roland. (Figure: Roland MC-50 MKII MIDI Sequencer.)

2.1 Keyboard Workstation

Sequencers are also built into newer synthesizers and samplers, which are then often known as "keyboard workstations". The combination offers ease of use and portability to musicians who need to move around.

3. Software Sequencers

With the popularity of the personal computer, sequencers are also available as software packages running on PC, Apple Macintosh, Atari and Commodore Amiga computers. Software sequencers tap the processing power, storage and control system of the computer and are thus more powerful and less restricted than their hardware-based counterparts. There are other added advantages:

• a bigger monitor screen, which makes editing easier;
• printing of music scores;
• most editing functions are performed with standard computer commands (such as cut and paste) that users already know;
• both software and hardware plug-ins can expand the capabilities of the sequencer.

The majority of software sequencers require an external MIDI interface to receive and distribute MIDI data.

Digital Sampler

1. What is a sampler?

If we understand the concept of a sound module, it is easier to grasp the concept of a digital sampler. As we know, a sound module consists of one or more sound ROM chips that store sound samples in digital form. During playback, individual sound samples are called up (by MIDI Program Change number) and loaded into RAM to be further modified or combined with the synthesized sounds of the device. (Figure: a typical sound module structural layout, showing the ROM chip containing digital sound samples, the ROM chip containing preset synth data, the RAM, the control section, and the mixer and modulator feeding the mixed output.)

A sampler is similar to a sound module in its internal structure and operation except that there is no built-in factory preset sound ROM. A sampler is, in effect, a "dummy" sound module, i.e. a sound module without preset sound chips: the user must load digital sound samples into its RAM before it can be used. There are basically three ways of putting sounds into a sampler's RAM:

1. Record directly into the sampler's line/mic input.
2. Load pre-recorded samples from a sample disc/CD-ROM.
3. Transfer a sound file from another device (such as a computer or another sampler) via a communication link, e.g. SCSI or USB.

Getting sound into a sampler is not the end of the story: the user then goes through a series of processes to ensure that the sounds play back correctly from a keyboard. Before a sound can fully function in a sampler the following steps are involved:

Sampling: recording the "raw" sound into the sampler.
Editing: cutting away redundant portions of the sample and defining loops.
Programming: defining how the sample is to be played back and modified by the sampler's built-in algorithms.
Channel mapping: creating a multi-timbral sound module.

It has to be noted that the "sample" we are concerned with here is usually the sound of just a single note of an instrument. For example, to create a piano sound program on a sampler, the user records individual keys of a real piano, not the performance of a musical piece (although there are exceptional cases).

1.1 The sampling process

During the sampling process, analog sound enters the sampler via the line inputs and, on some samplers, microphone inputs, and is converted into digital samples by the A/D converter. Alternatively, pre-recorded samples stored on digital media can be transferred in directly via a SCSI or USB interface.

At this stage the user needs to consider the transposing characteristics of the sampler during playback. It is possible to create a piano sound program by recording just a single key (e.g. middle C) of a piano, because the sampler can later transpose that key automatically to span the whole keyboard by varying the playback sampling rate. For example, if a middle C at 261 Hz is sampled at 44.1 kHz, then C#, D, D#, E, F and so on can be obtained by playing the sample back at rates higher than 44.1 kHz. This greatly saves memory space in the sampler. However, on a real-life instrument such as a piano the tone colour, dynamics and ADSR change from key to key, so over-transposing a single sample results in an unnatural, unrealistic sound; this is even more obvious with the human voice, where "chipmunk" effects occur when transposing too far upwards. To avoid over-transposition, multiple notes should be recorded at appropriate intervals, and this should be planned before the sampling session takes place. (A short worked example of the playback-rate arithmetic follows at the end of this section.)

1.2 The editing process

There are two main purposes in editing samples. First, unwanted portions of the sample, such as short noises or silence before the actual sound and an unnecessarily long sustain phase, have to be trimmed. Second, looping has to be done and the loop type defined.

1.3 The programming process

Once all the required samples are edited, the next step is to define how they are to be played by the sampler. Two major tasks are performed here. The first is to assign samples to the appropriate keys as planned during the sampling stage; this part of the process is called key-span. The second is to program the sampler's built-in algorithms in order to adjust the ADSR (envelope generator), filtering and other effects applied to the samples. At the end of the programming process a Program (a collection of all the settings mentioned above) is created, which can be stored together with the samples for future recall.
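A rough worked example of the playback-rate arithmetic behind key-spanning (an illustration only; the 44.1 kHz base rate and equal-tempered ratios are the assumptions):

# Transposing a sample by varying its playback rate: each equal-tempered
# semitone multiplies the rate (and the perceived pitch) by 2 ** (1/12).
BASE_RATE = 44_100          # rate at which middle C was sampled

def playback_rate(semitones_up):
    return BASE_RATE * 2 ** (semitones_up / 12)

for name, st in [("C# (+1)", 1), ("D (+2)", 2), ("C one octave up (+12)", 12)]:
    print(f"{name}: {playback_rate(st):,.0f} Hz")
# C# (+1): 46,722 Hz;  D (+2): 49,501 Hz;  +12 doubles the rate to 88,200 Hz,
# which is why transposing too far produces the unnatural "chipmunk" sound.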

1.4 Channel Mapping Process

This process is only necessary if the user intends to turn the sampler into a multi-timbral sound module. During it the user defines the MIDI channel and volume of the various Programs and adjusts the overall stereo image.

Assignments 4 and 5: AE004 and AE005


AE15 – Digital Recording Formats

COMPACT DISC & PLAYER
1. The CD Message
1.1 The Medium
1.2 The Pickup
1.3 Focusing Mechanism
1.4 Tracking
1.5 Pickup Control
2. SPARS Code

DIGITAL AUDIO TAPE
1. DAT Mechanism
2. DAT Recording System
3. What Is Recorded on a DAT
3.1 Main Area
3.2 Sub Area
3.3 ATF Area
3.4 Start ID and Skip ID
3.5 Absolute Time
3.6 Skip IDs
4. Error Correction System
7. Other Features
7.1 Random Programming
7.2 Various Repeat Functions
7.3 Music Scan
7.4 End Search

7.5 Informative Display
7.6 Construction
7.7 Longer Recording Time
7.8 Material of Tape
7.9 Identification Holes

MINI DISC
1. General Features
1.1 Quick Random Access
1.2 Recordable Disc
2. The Disc
2.1 Pre-recorded MD
2.2 Recordable MD
3. The Laser Pickup
4. Playback on a Magneto Optical (MO) Disc
5. Quick Random Access
6. Recording
7. Adaptive Transform Acoustic Coding (ATRAC)
8. Shock Proof Memory
9. Mini Disc Specifications
10. Manufacture and Mastering
10.1 K-1216 MD Format Converter
10.2 K-1217 Address Generator

DVD
1. DVD-Audio
1.1 Storage Capacity

2. Multi Channels
3. Meridian Lossless Packing
4. Various Audio Formats
5. Audio Details of DVD-Video
5.1 Surround Sound Format
5.2 Linear PCM
5.3 Dolby Digital
5.4 MPEG Audio
5.5 Digital Theater Systems
5.6 Sony Dynamic Digital Sound

SUPER AUDIO COMPACT DISC (SACD)
Watermark – Anti-Piracy Feature
Hybrid Disc

AE15 – DIGITAL RECORDING FORMATS

Introduction

The CD, DAT, MD, SACD and DVD formats and their operations are explained in this chapter.

COMPACT DISC & PLAYER

The CD is the audio standard for optical playback systems. In addition, some optical recorders are designed to conform to its standard and format so that the recorded information may be reproduced on a conventional Compact Disc Player (CDP).

1. The CD Message

A CD contains digitally encoded audio information in the form of pits impressed into its surface. The information on the disc is read by the player's optical pickup, decoded, processed and ultimately converted into acoustical energy. The CDP's laser beam is guided across the disc from the inside to the outside, starting at the lead-in area, moving outward through the programme area and ending at the outer edge with the lead-out area. The lead-in and lead-out areas provide information to control the player: the lead-in area contains a Table of Contents (TOC) which tells the CDP such things as the number of musical selections and the starting point and duration of each selection, while the lead-out area informs the player that the end of the disc has been reached.

1.1 The medium

The CD's dimensions are: diameter 12 cm, thickness approximately 1.2 mm, with a 15 mm centre hole for the motor spindle shaft. The disc is designed to allow the optical system easy access to the information as well as to protect the encoded data, which is stored in pit formation. The CD spins at a fixed Constant Linear Velocity (CLV): 1.4 m/s for a programme under 60 minutes and 1.2 m/s for a programme exceeding 60 minutes. The angular velocity (rpm) therefore decreases as the optical pickup moves toward the outer tracks of the disc: at 1.4 m/s it varies between 568 and 228 rpm, and at 1.2 m/s between 486 and 196 rpm.
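The rpm figures quoted above can be checked with a line of arithmetic. This is only an illustration: programme-area radii of roughly 23.5 mm (inner) to 58.5 mm (outer) are assumed here.

import math

# Constant linear velocity: rpm = 60 * v / (2 * pi * r).
def rpm(linear_velocity, radius_m):
    return 60 * linear_velocity / (2 * math.pi * radius_m)

for v in (1.4, 1.2):
    inner, outer = rpm(v, 0.0235), rpm(v, 0.0585)
    print(f"{v} m/s: {inner:.0f} rpm (inner) -> {outer:.0f} rpm (outer)")
# 1.4 m/s: about 569 rpm at the inner radius falling to about 229 rpm outside
# 1.2 m/s: about 488 rpm falling to about 196 rpm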

These pits vary in length from 0.833 to 3.054 µm depending on the encoded data and the linear velocity of the disc. The width and depth of the pits are approximately 0.5 µm and 0.11 µm respectively. The track runs circumferentially from the inside to the outside, and the total number of spiral revolutions on a CD is 20,625. The information contained in the pit structure is coded so that each pit edge represents a binary 1 and the spaces between the edges represent 0s.

1.2 The Pickup

The function of the pickup is to transfer the encoded information from the optical disc to the CDP's decoding circuit (fig. 8). The pickup is required to track the information on the disc, focus a laser beam and read the information as the disc rotates, and it must respond accurately under adverse conditions such as playing a damaged or dirty disc or while experiencing vibration and shock. CD players use either a three-beam pickup or a single-beam pickup.

A laser diode functions as the optical source for the pickup; the AlGaAs laser is commonly used in CDPs, and its wavelength is 780 nm. The laser beam is split by a diffraction grating into multiple beams. Diffraction gratings are plates with slits placed only a wavelength apart (fig. 9a); as the beam passes through the grating it diffracts in different directions, resulting in an intense main beam (the primary beam) with successively less intense beams on either side. Only the primary and secondary beams are used in the optical system of a CDP.

The outer two (secondary) beams are used for tracking, while the primary beam is used for reading the data and for focusing.

The light is conditionally passed through a Polarization Beam Splitter (PBS). The PBS acts as a one-way mirror, allowing only vertically polarized light to pass on toward the disc and reflecting all other light. A collimator lens then converges the previously divergent light into a parallel path, and the light is directed toward a quarter-wave plate (QWP). The QWP is an anisotropic crystal material designed to rotate the plane of polarization of linearly polarized light by 45 degrees (fig. 9b). Finally the laser light is focused onto the disc by an objective lens, which converges the impinging light to a focal point at a distance (d) from the lens called the focal length (fig. 9c).

The objective lens of a CDP is mounted on a two-axis actuator controlled by the focus and tracking servos. The spot size of the primary beam is about 0.8 mm at the disc surface and is reduced to about 1.7 µm at the reflective layer by the converging effect of the objective lens; accurate control of the focusing system therefore causes dust, scratches or fingerprints on the surface of the disc to appear out of focus to the reading laser.

The wavelength of the laser in air is 780 nm. On entering the polycarbonate substrate, which has a refractive index of 1.55, the wavelength is reduced to approximately 500 nm. Seen from underneath, where the laser enters the medium, the pits appear as bumps. Their depth is between 110 and 130 nm, designed to be approximately one quarter of the laser's wavelength in the substrate (the arithmetic is checked in the short sketch at the end of this section): a quarter-wavelength pit depth creates a diffraction structure in which the reflected light undergoes destructive interference, decreasing the intensity of the light returned to the pickup lens. The presence of pits and land areas is thus detected by the photodetectors as changes in light intensity.

As the reflected light passes back through the objective lens it converges, and it is again phase-shifted 45 degrees by the QWP, so the plane of polarization of the returning light is at right angles to its original state: it is now horizontally polarized. Because it is horizontally polarized it is reflected by the PBS toward a cylindrical lens. The cylindrical lens uses its astigmatic property to reveal focusing errors in the optical system (fig. 9d). The light then falls on an array of photodetectors, typically a four-quadrant photodetector, which converts the light signal into a corresponding electrical signal. These electrical signals are sent to the decoder for processing and are eventually converted into the audio signal.
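The quarter-wavelength figure can be checked with a short calculation (illustrative only):

wavelength_air = 780e-9        # 780 nm laser
n_polycarbonate = 1.55

wavelength_in_disc = wavelength_air / n_polycarbonate
quarter = wavelength_in_disc / 4

print(f"{wavelength_in_disc*1e9:.0f} nm in the substrate")   # ~503 nm
print(f"{quarter*1e9:.0f} nm quarter-wave pit depth")         # ~126 nm,
# consistent with the 110-130 nm pit depth quoted above.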

1.3 Focusing Mechanism

A perfectly focused beam places the focal point of the light on the photodetectors, and the shape of the image on the photodetectors is then circular. The photodetectors act as transducers, converting the impinging light into corresponding electrical signals, and these signals therefore carry the information needed for focusing and tracking the laser beam as well as the audio information itself. A four-quadrant photodetector, with its quadrants labelled A, B, C and D (fig. 10), is used both to control the focus servo and to transfer the audio signal to the decoding circuit.

When the optical system is out of focus the focus correction signal becomes non-zero. If the focal point lies in front of the photodetectors, an elliptical image is projected onto them at an angle; if the focal point moves behind the photodetectors, the elliptical image is rotated by 90 degrees. The focus correction signal (A+C) - (B+D) is zero when the laser is focused correctly on the disc. This signal provides feedback to the focus servo circuit, which controls the vertical positioning of the two-axis objective lens, moving it up or down until the laser is focused and the image on the photodetectors becomes circular again.

The audio information itself is collected by summing the signals from the four photodiodes (A+B+C+D).

When a disc is first placed in the player the distance between the objective lens and the disc is large, so the audio signal is below the threshold and the focus servo is inactive. A focus search circuit initially moves the lens closer to the disc, causing the audio signal to increase; the audio signal must exceed a threshold level before the focus servo is activated, and once it does so the focus servo takes over.

1.4 Tracking

The tracking servo is controlled by the signals received at the two outer photodetectors, E and F (fig. 11), onto which the secondary beams are directed. The tracking system detects mistracking to the left or right: the two outer photodetectors generate a tracking error signal (E-F), which is returned to the tracking servo.
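A small sketch of the error signals described above (illustrative values, not a real servo driver):

def servo_signals(a, b, c, d, e, f):
    """Derive the CD player's control signals from the photodetector levels."""
    focus_error    = (a + c) - (b + d)   # zero when the beam is in focus
    audio_signal   = a + b + c + d       # sum feeds the decoder (and the
                                         # focus-activation threshold test)
    tracking_error = e - f               # zero when centred on the track
    return focus_error, audio_signal, tracking_error

# In focus and on track:
print(servo_signals(1.0, 1.0, 1.0, 1.0, 0.5, 0.5))   # (0.0, 4.0, 0.0)
# Slightly out of focus (elliptical spot favouring quadrants A and C):
print(servo_signals(1.2, 0.8, 1.2, 0.8, 0.5, 0.5))   # (0.8, 4.0, 0.0)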


The servo moves the pickup accordingly to correct the tracking error: the tracking error signal controls the horizontal movement of the two-axis objective lens actuator, and the tracking servo continually moves the objective lens in the appropriate direction to reduce the error.

1.5 Pickup Control

Three-beam pickups are mounted on a sled that moves radially across the disc, providing coarse tracking; precise tracking is always provided by the tracking servo and its corresponding control circuit. The sled's tracking signals are derived from the signal used to control the two-axis objective lens actuator, and the sled servo operates only when the level of the primary beam exceeds the threshold. During fast forward or reverse, a microprocessor takes control of the tracking servo to increase locating speed.

2. SPARS Code

The SPARS code is found on CDs and states which recording format (A = analogue, D = digital) was used in the three stages of album production:

Recording:  A  A  D  D
Mixing:     A  D  A  D
Mastering:  D  D  D  D

Reference: Ken C. Pohlmann, Fundamentals of Digital Audio, Focal Press.

DIGITAL AUDIO TAPE

Introduction

DAT was developed by Sony in the early 1980s as a new consumer format to replace the analogue Compact Cassette.

High Fidelity Recording

There is no signal degradation during playback or recording because the audio signal is recorded digitally on a DAT: both recording (analogue-to-digital conversion, or encoding) and playback (digital-to-analogue conversion, or decoding) are performed in the digital domain. There is no tape hiss, noise, distortion or wow and flutter, the inherent limitations of the analogue cassette, and little if any of the dependence on the quality of head, tape and transport that affects the playback of an analogue cassette.

THE DAT ADVANTAGE

Frequency response: 20 Hz to 20 kHz
Sampling frequency: 44.1 or 48 kHz
Dynamic range: > 90 dB
Bit depth: 16, 24 bit
Wow and flutter: unmeasurable
Tape speed: 8.15 mm/sec (1/6th of the compact cassette's 4.76 cm/sec)
Tape width: 3.81 mm
Tape thickness: 13 microns
Tape size: 73 mm x 54 mm x 10.5 mm (half the size of the Compact Cassette)
Longest playback time: 120 min
High-speed search: up to 200 times normal playback speed; 60 min of programme material can be searched in under 20 seconds

Subcode: like the CD, DAT can incorporate a subcode to provide information such as track number and playing time (elapsed and total programme time).
Synchronisation: yes.

1. DAT Mechanism

Due to the extremely large amount of digital audio data involved, highly precise, high-density recording is required. For this reason a DAT employs rotating heads like those of a VTR: the heads rotate at very high speed while the tape runs past them, increasing the relative head-to-tape speed and improving recording performance. The drum of a DAT rotates at 2000 rpm (1800 rpm for a VTR), giving a relative tape speed of 3.13 metres per second, about 66 times that of an analogue cassette at 4.76 cm/s. The tape wraps around only 90 degrees of the 30 mm drum's circumference (see diagram below), remarkably little compared with the 180 degree wrap of a VTR. Because of this, the stress on the tape is minimised, allowing high-speed rewinding, fast search and cueing without having to "unwrap" the tape from the drum. This is a big advantage of DAT.

2. DAT Recording System

The DAT recording system is like that of a VTR: signals are recorded diagonally on the tape using a helical scan system. Diagonal tracks are created on the tape in the area used for recording; these tracks are only about 13.6 microns wide, less than one fifth of the diameter of a human hair. There are two heads, A and B, mounted on a rotating drum which is set at an angle of 6 degrees 23 minutes against the tape.

While rotating, the two heads, A and B, record alternately, laying down 13.591-micron-wide tracks that overlap slightly so that no space is left between them; any unwanted audio data is erased by writing over it. Because the azimuth angles of the two heads are different, the heads do not read information from adjacent tracks during playback even though the tracks slightly overlap, as shown in the drawing: the A head azimuth is angled at 20 degrees (the plus azimuth head), while the B head records at 20 degrees clockwise from the perpendicular to the track (the minus azimuth head). In this way the two heads play and record in an interleaved pattern and crosstalk between tracks is minimised. DAT heads are used for both recording and playback. On both edges of the tape there are auxiliary tracks for fixed-head access in as-yet-undecided future systems; at present they can be regarded simply as blank areas that contribute to tracking stability.

3. What Is Recorded on a DAT

The data written on a DAT tape can be divided into the following three areas:

3.1 Main Area

PCM digital audio data (the main data) is recorded in the main area. A control signal called the Main ID (identification code) is also embedded here. It records, among other things, the sampling frequency, the number of bits, the number of channels in the recorded signal format, the presence or absence of emphasis, and the anti-piracy status known as the Serial Copy Management System (SCMS). Because of the important role the Main ID plays as a control signal during playback, it is repeated 16 times per track to ensure no mistakes occur when the DAT recorder reads the data.

3.2 Sub Area

Start IDs, skip IDs, addresses, absolute time and programme numbers are recorded in this area, which is designed to give the user the convenience of editing programme numbers. A similar subcode area exists on CD, but DAT has four times the recording capacity.

3.3 ATF Area

Automatic Track Finding (ATF) data is recorded here. The ATF signal ensures that the heads stay correctly positioned on the right track during recording and playback. ATF is found on both home VCRs and DAT alike, but in DAT it does away with the control head or tracking control knob found on VCRs. These signals are all recorded along one track while the head is in contact with the tape across its 90 degree wrap. During the playback of a DAT the playback head must be correctly positioned on the recorded track.

Even a slight drift from the track will impair playback, because the head is then unable to read the data correctly. Thus the order of recording along a DAT track is SUB, ATF, MAIN, ATF and SUB again.

3.4 Start ID and Skip ID

Among the signals recorded in the sub area, the Start ID is one of the most important. It is a special mark indicating the beginning of a song, and during playback it is used by the search function to locate the start of that song. In normal mode it is recorded from the start of a song up through the ninth second. The Start ID is recorded automatically, and it can also be set manually; automatic recording occurs at the first rise in signal level after a space of about two seconds or more of silence. Programme Numbers are recorded automatically at the same time as the IDs to separate the songs, and with programme numbers it is also possible to go directly to the song of one's choice using the direct song function. Editing of a recorded music programme is possible by adding or deleting Start IDs and Programme Numbers.

3.5 Absolute Time

Absolute Time (ABS) is another example of data that is recorded automatically. It represents the total time that has elapsed since the beginning of the tape, it serves a useful role in high-speed searching, and it is essential if synchronization is needed subsequently. It cannot normally be added after recording, although this is possible on certain high-grade models.

3.6 Skip IDs

Skip IDs enable the user to mark songs or recorded material that he does not want to hear: during playback, when the DAT deck comes to a Skip ID it skips at high speed to the next Start ID. Skip IDs can be inserted or erased at will during recording and playback. They can be used, for example, to exclude the commercials and talk from a recorded digital radio broadcast, leaving just the music.

4. Error Correction System

On a DAT, even if one of the heads is clogged, almost perfect playback is possible from the other head, because the data is interleaved across the two tracks during recording: odd-numbered and even-numbered data are split between the A and B head tracks and rearranged during playback. This means that even if playback from the B head is impossible because of clogging, the error correction system can restore the original PCM data provided there is sufficient data on the A head track.

Where samples cannot be fully restored, the error correction system provides approximate correction by interpolating an average value, dividing by two the sum of the neighbouring data values on either side of the missing value. Thanks to these powerful error correction and compensation functions, the DAT system suppresses the effects of dirty or damaged tape in all but the worst cases.

7. Other Features

7.1 Random Programming
This function allows the user to select the order in which songs are played back, with fast access time.

7.2 Various Repeat Functions
As with a CD, a number of repeat functions are possible: a single song, a programmed sequence, all the songs on the tape, or the material between any two points can be repeated at will.

7.3 Music Scan
This function enables rapid checking of everything recorded on the tape by automatically playing the first few seconds of every song.

7.4 End Search
This fast-forwards the recorded DAT to the end of the last song, where new programme material can be recorded.

7.5 Informative Display
Subcode information such as the number of songs, time and programming information can be displayed.

7.6 Construction
A lid covers the tape and a slider covers the reel hub holes when the cassette is not loaded, protecting the tape from fingerprints, dust, dirt and damage that could affect the recorded data. When the cassette is loaded, the slider pin is depressed to release the hub brakes and retract the slider, uncovering the reel hub holes; at the same time the front lid opens so that the tape can be drawn out and wrapped across the periphery of the rotary head drum. All recording on a DAT is done on one side only.

7.7 Longer Recording Time
At the slower secondary speed the total tape time can be increased to up to 4 hours. Some DATs use a slightly thinner tape for up to 3 hours of recording at normal speed and up to 6 hours at the secondary speed.

There is no "Side B" on a DAT, unlike the compact analogue cassette.

7.8 Material of Tape
DAT tape uses metal particles of unoxidised iron and cobalt alloy suspended in a binder. This is similar to the metal-particle tape used in analogue cassettes and 8 mm (Hi-8) video cassettes.

7.9 Identification Holes
There are five identification holes on the bottom of the cassette, one of which is an erasure prevention hole. Unlike on ordinary cassette tapes, this can be opened or closed by a sliding tab.


MINI DISC

Introduction

The MD was developed in the early 1990s as a new consumer format to replace the analogue Compact Cassette.

1. General Features

1.1 Quick Random Access
Total durability: the optical pickup never physically touches the surface of the MD, so there are no scratches and no wear and tear, and the disc never gets stretched, broken or tangled. Shock-proof portability: an advanced semiconductor memory provides almost totally shock-resistant operation, with no skips while jogging or driving. Superb compactness: a 64 mm diameter disc in a casing of 68 x 72 x 5 mm.

1.2 Recordable Disc
Unsurpassed digital sound based on CD technology.

2. The Disc

2.1 Pre-recorded MD
Pre-recorded Mini Discs are used for music software. The audio signals are recorded in the form of pits on the disc, which is made of polycarbonate like a CD. The casing has a read-window shutter only on the bottom surface of the cartridge, leaving the top free for the label.

2.2 Recordable MD
The recordable disc is based on Magneto Optical (MO) disc technology. It can be recorded and re-recorded up to a million times without signal loss or degradation and has a lifetime comparable to that of a CD. During recording on an MO disc, a laser shines from the back of the disc while a magnetic field is applied to the front, and the MO disc casing therefore has a read and write window on both sides. In both cases the disc is housed in a casing for further protection, which also helps to protect the pits. (See diagramme.)

3. The Laser Pickup

The pickup system is based on a standard CD pickup with the addition of an MO signal readout analyzer and two photodiodes, giving it the ability to read both types of MD: for MO discs the pickup reads the polarity of the disc, while for pre-recorded discs it reads the amount of reflected light.

On a pre-recorded disc the pits are covered with a thin layer of aluminium, which improves reflectivity just as on a CD. The amount of light reflected depends on whether or not a pit exists: where there is a pit, some of the light is diffracted and less light reaches the photodiodes; where there is no pit, most of the light is reflected back through the beam splitter into the analyzer and on to the photodiodes. The electrical signals from the photodiodes are summed and, depending on the sum, a "1" or a "0" is read.

4. Playback on a Magneto Optical (MO) Disc

A 0.5 mW laser, the same laser used for the playback of pre-recorded discs, is focused onto the magnetic layer. The magnetic signal on the disc affects the polarization of the reflected light, and the direction of polarization is converted into light intensity by the MO signal readout analyzer, so that one of the two photodiodes receives more light than the other. The electrical signals from the two photodiodes are subtracted and, depending on whether the difference is positive or negative, a "1" or a "0" is read.

5. Quick Random Access

The MD has a "pre-groove" which is formed during manufacture. This groove helps the tracking servo and spindle servo during recording and playback, and address information is recorded at intervals of 13.3 ms using small zigzags pressed into the pre-groove. The disc therefore carries all of its addresses (timing) along the groove even when it is blank.

6. Recording

Recording requires the use of a laser and a polarizing magnetic field. The magnetic head is positioned directly opposite the laser source, on the other side of the disc. When the magnetic layer in the disc is heated by the laser to a temperature of about 400 degrees F, it temporarily loses its coercive force (it becomes neutral and loses its magnetism). A magnetic field corresponding to the input signal is generated over the laser spot, and the heated spot takes on the polarity of the applied field: its magnetic orientation is determined by the external magnetic field produced by the write head, so polarities of N and S can be recorded, corresponding to "1" or "0". The rotation of the disc then moves the recorded area away from the laser, allowing its temperature to drop back below the Curie point; as the irradiated (exposed) domain returns to its normal temperature, its new magnetic orientation is fixed.

There is a User Table of Contents (UTOC) area located around the inner edge of the disc, before the programme area, which contains only the running order of the music. The start and end addresses of all the music tracks recorded on the disc are stored here, so the programme can be re-ordered easily just by rewriting the addresses.

7. Adaptive Transform Acoustic Coding (ATRAC)

ATRAC is a digital audio data compression system that reduces the information to about a fifth (20%), enabling 74 minutes of recording time on an MD. Uncompressed 16-bit, 44.1 kHz stereo CD audio amounts to approximately 10 MB/min; an MD stores approximately 130 MB of data, or about 1.756 MB/min of compressed 16-bit, 44.1 kHz stereo audio. ATRAC starts with the 16-bit information and analyses segments of the data for their waveform content. It works on the principles of the threshold of hearing and the masking effect, encoding only those frequency components which are audible to us. (See diagram.)
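The roughly 1:5 ratio can be checked from the figures quoted above (a rough illustration):

cd_rate_mb_per_min = 10.0      # uncompressed 16-bit / 44.1 kHz stereo
md_rate_mb_per_min = 1.756     # ATRAC-compressed stereo on MiniDisc
md_capacity_mb     = 130.0

print(f"compression ratio ~ 1:{cd_rate_mb_per_min / md_rate_mb_per_min:.1f}")
print(f"recording time ~ {md_capacity_mb / md_rate_mb_per_min:.0f} minutes")
# compression ratio ~ 1:5.7, recording time ~ 74 minutes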

8. Shock Proof Memory

The aim of the shock-proof memory is to prevent skipping or muting while the user is moving about. A read-ahead buffer memory stores several seconds of digital information: the pickup can read information off the disc at a rate of 1.4 Mbit per second, but the ATRAC decoder requires a data rate of only 0.3 Mbit per second for playback. Should the pickup be jarred out of position and stop supplying information, the correct information continues to be supplied to the ATRAC decoder from the buffer memory; as long as the pickup returns to the correct position within about 3 to 20 seconds (depending on buffer memory size), the listener never experiences mistracking.
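To put rough numbers on the buffering, here is a small worked example; the 4 Mbit buffer size is an assumption for illustration only, since the notes do not give one.

read_rate   = 1.4e6   # bits/s coming off the disc
decode_rate = 0.3e6   # bits/s consumed by the ATRAC decoder
buffer_bits = 4e6     # assumed 4 Mbit buffer, for illustration only

seconds_of_audio = buffer_bits / decode_rate          # playback covered
time_to_fill     = buffer_bits / (read_rate - decode_rate)

print(f"a full buffer covers ~{seconds_of_audio:.0f} s of playback")    # ~13 s
print(f"and refills in ~{time_to_fill:.1f} s of uninterrupted reading") # ~3.6 s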

Since signals enter the buffer faster than they leave it, the buffer will eventually become full. At that point the pickup stops reading information from the disc, and it resumes as soon as there is room in the memory. Using a concept called sector repositioning, the pickup is able to resume reading from the correct point after being displaced: it simply memorizes the address (recorded at 13.3 ms intervals) where it was displaced and returns the pickup to the correct position. This approach could also be applied to the conventional CD, but would require a much larger memory.

9. Mini Disc Specifications

Frequency range: 5 Hz to 20 kHz
Dynamic range: 105 dB
Sampling resolution: 16-bit
Sampling frequency: 44.1 kHz
Disc speed: 1.2 – 1.4 m/sec (CLV, constant linear velocity)
User Table of Contents (UTOC): programmable

An MD can also combine pre-recorded and recordable areas on one disc. This is useful for language learning and music study, where the student can repeat and record against what is already on the disc.

10. Manufacture and Mastering

10.1 K-1216 MD Format Converter
The original 16-bit digital signal is compressed using the ATRAC encoder. Simultaneously with ATRAC compression, the signal is restored back to 16-bit audio for monitoring the MD sound. A hard disc is built into the K-1216 for saving the compressed signal data. Subcode information is read from the original CD master, converted to MD-format subcode and saved on the hard disc, supplying the necessary audio, error correction and address codes via the Sony CDX-I code pro. A keyboard is supplied with the format converter so that character and text information can be added.

10.2 K-1217 Address Generator
Used in producing the final MO glass master and only required at actual disc-cutting facilities. It interfaces directly with the cutting machine.

DVD

1. Introduction

The DVD format was designed to replace the Laser Disc, offering a higher storage capacity than LD. Even though a DVD is of the same size and shape as a CD, it can hold up to 4.7 GB of data on a single-sided, single-layer disc, in comparison to a CD's storage capacity of only 650 MB. This is achieved by tightening tolerances to squeeze the data pits closer together, so more data can be packed onto the disc. The DVD player requires new heads using shorter-wavelength lasers and more refined focusing mechanisms; other than that, it is identical to a CD player's transport. The RS-PC (Reed-Solomon Product Code) error correction system is approximately 10 times more robust than the current CD system.

Where multilayer technology is used on DVD, two layers of different data are found on one or both sides of the disc. Though one layer lies in front of the other, the separate layer can be read by changing the focal point of the laser. This is very similar to looking through a rain-spattered windowpane and focusing on the landscape: the rain-spattered windowpane then falls out of focus. Like a vinyl record, a double-sided disc needs to be flipped over when side A has finished playing on current DVD players.

1.1 Storage Capacity

CD: 650 – 700 MB
DVD, single-sided, single-layer: 4.7 GB
DVD, double-sided, single-layer: 9.40 GB
DVD, dual-layer (DVD-9), single-sided: 8.54 GB
DVD, double-layer, double-sided: 17.08 GB

2. DVD-Audio

Multi-channel LPCM is mandatory, with up to 6 channels at sample rates of 48/96/192 kHz (also 44.1/88.2/176.4 kHz) and sample sizes of 16/20/24 bits, yielding frequency response of up to 96 kHz and dynamic range of up to 144 dB. Sampling rates and sizes can vary for different channels by using a predefined set of groups, although at 192 and 176.4 kHz only two channels are available. The maximum data rate is 9.6 Mbps. Multichannel PCM will be downmixable by the player.
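A worked check (Python) of why the highest sample rates are limited to two channels, using only the figures quoted above:

MAX_RATE = 9.6e6  # bits/s, maximum DVD-Audio data rate

def lpcm_rate(channels, fs, bits):
    return channels * fs * bits

print(lpcm_rate(2, 192_000, 24) / 1e6)  # 9.216 Mbps -> fits under the ceiling
print(lpcm_rate(6, 192_000, 24) / 1e6)  # 27.648 Mbps -> far too high
print(lpcm_rate(6, 96_000, 24) / 1e6)   # 13.824 Mbps -> needs MLP packing (below)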

DVD-Audio includes specialized downmixing features. Unlike DVD-Video, where the decoder controls mixing from 6 channels down to 2, DVD-Audio includes coefficient tables called SMART (system-managed audio resource technique) to control mixdown and avoid volume buildup from channel aggregation. Up to 16 tables can be defined by each Audio Title Set (album), and each track can be identified with a table. Coefficients range from 0 dB to 60 dB.

DVD-Audio allows up to 16 still graphics per track, with a set of limited transitions. On-screen displays can be used for synchronized lyrics and navigation menus. A special simplified navigation mode can be used on players without a video display.

Some DVD-Video discs contain mostly audio with only video still frames, and some DVD manufacturers such as Pioneer are developing audio-only players using the DVD-Video format. Matsushita announced that its new Panasonic and Technics universal DVD-Audio/DVD-Video players will be available in fall 1999 and will cost $700 to $1,200. Yamaha may also release DVD-Audio players at the same time. It is expected that shortly after DVD-Audio players appear, new universal DVD players will also support all DVD-Audio features. EMI, Universal, BMG and Warner have all announced that they will have about 10 to 15 DVD-Audio titles available at launch.

3. Meridian Lossless Packing

Meridian's MLP (Meridian Lossless Packing) scheme, licensed by Dolby, removes redundancy from the signal to achieve a compression ratio of about 2:1 while allowing the PCM signal to be completely recreated by the MLP decoder (required in all DVD-Audio players). MLP allows playing times of about 74 to 135 minutes of 6-channel 96 kHz/24-bit audio on a single layer (compared to 45 minutes without packing). Two-channel 192 kHz/24-bit playing times are about 120 to 140 minutes (compared to 67 minutes without packing).

4. Various Audio Formats

Other audio formats of DVD-Video (Dolby Digital, MPEG audio and DTS, described below) are optional on DVD-Audio discs, although Dolby Digital is required for audio content that has associated video. A subset of DVD-Video features (no angles, no seamless branching, etc.) is allowed.

5. Audio Details of DVD-Video

The following details are for audio tracks on DVD-Video.

5.1 Surround Sound Format
A DVD-Video disc can have up to 8 audio tracks (streams). Each track can be in one of three formats:
• Dolby Digital (formerly AC-3): 1 to 5.1 channels
• MPEG-2 audio: 1 to 5.1 or 7.1 channels
• PCM: 1 to 8 channels

Two additional optional formats are provided: DTS and SDDS. Both require external decoders and are not supported by all players. The "0.1" refers to a low-frequency effects (LFE) channel that connects to a subwoofer (a dedicated low-frequency driver).
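A minimal sketch (Python) of the kind of coefficient-table downmix that SMART describes earlier in this section; the channel layout and the coefficient values are illustrative assumptions, not values from the DVD-Audio specification.

def db_to_gain(att_db):
    return 10 ** (-att_db / 20)        # table entries treated as attenuations in dB

table = {"L": 0, "R": 0, "C": 3, "LFE": 60, "Ls": 6, "Rs": 6}   # hypothetical entry

def downmix_to_stereo(frame):
    """frame: dict mapping channel name -> one sample value."""
    left = (frame["L"] * db_to_gain(table["L"])
            + frame["C"] * db_to_gain(table["C"])
            + frame["Ls"] * db_to_gain(table["Ls"])
            + frame["LFE"] * db_to_gain(table["LFE"]))
    right = (frame["R"] * db_to_gain(table["R"])
             + frame["C"] * db_to_gain(table["C"])
             + frame["Rs"] * db_to_gain(table["Rs"])
             + frame["LFE"] * db_to_gain(table["LFE"]))
    return left, right

print(downmix_to_stereo({"L": 0.5, "R": 0.4, "C": 0.2, "LFE": 0.1, "Ls": 0.1, "Rs": 0.1}))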

5.2 Linear PCM
Linear PCM is uncompressed (lossless) digital audio, the same format used on CDs and most studio masters. It can be sampled at 48 or 96 kHz with 16, 20 or 24 bits/sample. (Audio CD is limited to 44.1 kHz at 16 bits.) It ranges from 1 to 8 channels depending on the surround sound format. The maximum bitrate is 6.144 Mbps, which limits sample rates and bit sizes with 5 or more channels. It is generally felt that the 96 dB dynamic range of 16 bits, or even the 120 dB range of 20 bits, combined with a 48 kHz sampling rate is adequate for high-fidelity sound reproduction. However, additional bits and higher sampling rates are useful in studio work, noise shaping, advanced digital processing and three-dimensional sound field reproduction. DVD players are required to support all the variations of LPCM, but some of them may subsample 96 kHz down to 48 kHz, and some may not use all 20 or 24 bits. The signal provided on the digital output for external digital-to-analogue converters may be limited to less than 96 kHz and less than 24 bits.

5.3 Dolby Digital
Dolby Digital is multi-channel digital audio, using lossy AC-3 coding technology from original PCM with a sample rate of 48 kHz at up to 24 bits. The bitrate is 64 kbps to 448 kbps, with 384 being the normal rate for 5.1 channels and 192 being the normal rate for stereo (with or without surround encoding). (Most Dolby Digital decoders support up to 640 kbps.) Dolby Digital is the format used for audio tracks on almost all DVDs.

5.4 MPEG Audio
MPEG audio is multi-channel digital audio, using lossy compression from original PCM with a sample rate of 48 kHz at 16 bits. Both MPEG-1 and MPEG-2 formats are supported. The variable bitrate is 32 kbps to 912 kbps, with 384 being the normal average rate; MPEG-1 is limited to 384 kbps. The 7.1 channel format adds left-center and right-center channels, but will probably be rare for home use. MPEG-2 surround channels are in an extension stream matrixed onto the MPEG-1 stereo channels, which makes MPEG-2 audio backwards compatible with MPEG-1 hardware (an MPEG-1 system will only see the two stereo channels). MPEG Layer III (MP3) and MPEG-2 AAC (a.k.a. NBC, a.k.a. unmatrix) are not supported by the DVD-Video standard.

5.5 Digital Theater Systems
DTS (Digital Theater Systems) Digital Surround is an optional multi-channel (5.1) digital audio format, compressed from PCM at 48 kHz. The data rate is from 64 kbps to 1536 kbps (though the DTS Coherent Acoustics format supports up to 4096 kbps as well as a variable data rate for lossless compression). The DVD standard includes an audio stream format reserved for DTS, but many players ignore it. According to DTS, existing DTS decoders will work with DTS DVDs. All DVD players can play DTS audio CDs.

5.6 Sony Dynamic Digital Sound
SDDS (Sony Dynamic Digital Sound) is an optional multi-channel (5.1) digital audio format, using lossy compression from PCM at 48 kHz at up to 20 bits. The data rate can go up to 1,280 kbps.
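A similar check (Python) of how the 6.144 Mbps LPCM ceiling constrains channel/rate/bit-depth combinations; the figures are taken from the text.

MAX_LPCM = 6.144e6  # bits/s, maximum LPCM bitrate on DVD-Video

def fits(channels, fs, bits):
    return channels * fs * bits <= MAX_LPCM

print(fits(2, 96_000, 24))  # True  (4.608 Mbps)
print(fits(6, 48_000, 16))  # True  (4.608 Mbps)
print(fits(6, 48_000, 24))  # False (6.912 Mbps) - too many bits for 6 channels
print(fits(8, 48_000, 16))  # True  (6.144 Mbps, exactly at the limit)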

SDDS is a theatrical film soundtrack format based on the ATRAC compression technology that is used in MiniDisc.

Super Audio Compact Disc (SACD)

Sony and Philips are promoting SACD, a competing DVD-based format using Direct Stream Digital (DSD) encoding. DSD is based on the pulse-density modulation (PDM) technique, which uses single bits (1-bit) to represent the incremental rise or fall of the audio waveform; the 1-bit system encodes music at 2,822,400 samples a second. This supposedly improves quality by removing the brick-wall filters required for PCM encoding. It also makes downsampling more accurate and efficient. DSD provides frequency response from DC to over 100 kHz with a dynamic range of over 120 dB. DSD includes a lossless encoding technique that produces approximately 2:1 data reduction (50%) by predicting each sample and then run-length encoding the error signal. The maximum data rate is 2.8 Mbps.

With the SACD format, music companies can offer three different types of discs:
1. A single-layer disc, which can store a full album of high-resolution music.
2. A dual-layer disc, which provides nearly twice the playing time.
3. A hybrid disc with two layers: the top layer is a CD while the bottom layer is high-density SACD, making it compatible with the 700 million CD players worldwide. The hybrid may be revived when yields are high enough that it no longer costs more to make a hybrid SACD disc than to press both an SACD disc and a CD.

Watermark – Anti-Piracy Feature
SACD includes a physical watermarking feature: pit signal processing (PSP) modulates the width of pits on the disc to store a digital watermark (data is stored in the pit length). The optical pickup must contain additional circuitry to read the PSP watermark, which is then compared to information on the disc to make sure it is legitimate. A downside of this requirement for new watermarking circuitry is that SACD discs are not playable in existing DVD-ROM drives.

SACD includes text and still graphics, but no video. Sony says the format is aimed at audiophiles and is not intended to replace the audio CD format (some 13 billion CDs exist worldwide); neither SACD nor DVD-Audio is designed to replace the CD format. Sony released an SACD player in Japan in May 1999 and expects the player to be available in the U.S. for $4,000 by the end of the year. Initial SACD releases will be mixed in stereo, not multichannel. Future SACDs will have enough room for both a two-channel mix and a multi-channel version of the same music, including text and graphics.
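A worked check (Python) of the DSD figures quoted above; treating the 2.8 Mbps figure as a losslessly packed stereo rate is my assumption, since the text does not say which configuration it refers to.

fs_dsd = 2_822_400               # 1-bit samples per second, per channel

per_channel_bps = fs_dsd * 1     # one DSD channel, bits per second
stereo_raw_bps = per_channel_bps * 2
stereo_packed_bps = stereo_raw_bps / 2    # ~2:1 lossless packing, as quoted above

print(per_channel_bps / 1e6)     # 2.8224 Mbps per raw channel
print(stereo_packed_bps / 1e6)   # ~2.82 Mbps for a packed stereo pair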

289 . Mobile Fidelity Labs. A number of studios have announced that they will release SACD titles by the end of the year: Audioquest (2). Water Lily Acoustics (2).SCHOOL OF AUDIO ENGINEERING AE15 – Digital Recording Formats Student Notes multichannel. Telarc (12). Sony (40). DMP (5).

AE16 Digital Audio Workstations

1. Storage Requirements
2. Usage of Storage Capacity on Disc versus Tape
   2.1 Disk Storage
3. Digital Audio Requirements
4. Audio Editing
5. Multichannel Recording
6. Concepts in Hard Disk Recording
   6.1 The Sound File
   6.2 Sound File Storage – The Directory
   6.3 Buffering
   6.4 The Allocation Unit (AU)
   6.5 Access Time (ms)
   6.6 Transfer Rate (MB/sec)
   6.7 Disk Optimization
7. Multichannel Considerations
   7.1 Winchester Magnetic Disc Drives
8. Editing in Tapeless Systems
   8.1 Non-destructive Crossfade
   8.2 Destructive Crossfade
   8.3 Edit-point Searching
   8.4 The Edit Decision List (EDL)
9. Pro-tools (use manual)



AE16 DIGITAL AUDIO WORKSTATIONS

1. Storage Requirements

In a conventional linear PCM system without data compression, the data rate (bits/sec) from one channel of digital audio will depend on the sampling rate and the resolution. E.g. a system operating at 48 kHz using 16 bits will have a data rate of 48,000 x 16 = 768 kbit/sec, or about 5.5 MB per minute.

2. Usage of Storage Capacity on Disc versus Tape

A 30-minute reel of 24-track tape at 48 kHz/16-bit holds approximately 3,960 MB of audio (24 tracks x 30 min x 5.5 MB per track-minute), but only when all 24 tracks are recorded for the full 30 minutes. Disk storage capacity, by contrast, tends to be purchased in units of so many megabytes at a time, corresponding to a number of single-channel minutes at a standard resolution.

2.1 Disk Storage
All computer mass storage devices are block structured, i.e. they involve dividing the storage space up into blocks of fixed size (512 – 1024 bytes), each of which may be separately addressed. In this way information is accessed quickly by reference to a directory of the disk's contents, containing the locations of the blocks of data relating to particular files. Thus digital audio is ideal for storage on block-structured media such as hard disks.

3. Digital Audio Requirements

Samples, since they are time-discrete, may be processed and stored either contiguously or non-contiguously, provided they are reassembled into their original order (or some other specified order) before analogue conversion.

4. Audio Editing

Audio editing may be accomplished in the digital domain by joining one recording to another in RAM, provided that a buffer is employed at the inputs and the outputs to smooth the transfer of data to and from the disk. A fast access time disk drive (under 20 ms seek time with no thermal calibration) makes it possible to locate sound files at different times and play them back as one continuous stream of audio without any audible glitches.

5. Multichannel Recording

Multichannel recording may be accomplished by dividing the storage capacity between the channels. With a disk drive, the total average storage capacity can be distributed in any way between channels, using the buffer to provide a continuous output. In fact one cannot talk of 'tracks' in random-access systems, since there is simply one central reservoir of storage serving a number of channel outputs when they are required. A system may allow an hour of storage time divided between 4 channel outputs, but the total amount of 'programme time' to which this corresponds depends on the amount of time that each channel output is being fed with data.
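The arithmetic above generalises easily. A small calculator (Python) that reproduces the 48 kHz/16-bit figures, assuming the notes are using binary megabytes:

MB = 1024 * 1024                     # assume binary megabytes, to match the figures above

def channel_rate_kbit(fs, bits):
    return fs * bits / 1000          # kbit/s for one channel

def storage_mb(fs, bits, channels, minutes):
    return fs * bits / 8 * 60 * minutes * channels / MB

print(channel_rate_kbit(48_000, 16))            # 768.0 kbit/s per channel
print(round(storage_mb(48_000, 16, 1, 1), 2))   # ~5.49 MB per channel-minute
print(round(storage_mb(48_000, 16, 24, 30)))    # ~3955 MB for 24 tracks x 30 min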

During system operation, audio data is transferred to and from the disc via RAM using the Direct Memory Access (DMA) controller and bus, bypassing the CPU. Data is also transferred between RAM and the audio interfaces via the buffers, under CPU and user command control. During editing, fading and mixing, data is written from the disc to the DSP unit via RAM, which in turn passes it to the buffered audio outputs.

6. Concepts in Hard Disk Recording

6.1 The Sound File
This is an individual sound recording of any length. The disk is a store in which no one part has any specific time relationship to any other part, i.e. disk recording does not start at one place and finish at another.

6.2 Sound File Storage – The Directory
A directory is used as an index to the store, containing entries specifying what has been stored, the size of each file and its location. Within the directory, or its sub-indexes, the locations of all the pieces of the sound file will be registered. When that particular file is requested, the system will reassemble the pieces by retrieving them in sequence.

6.3 Buffering
Buffering ensures that time-continuous data may be broken up and made continuous again. A buffer is a short-term RAM store which holds only a portion of audio data at any one time. In order to preserve the original order of samples, the buffer must operate in First-In-First-Out (FIFO) mode.

Buffering is used to accomplish the following tasks:
• Writing/Reading – Disk media require that audio is split into blocks (typically 512 – 1024 bytes). This is achieved by filling a RAM buffer with a continuous audio input and then reading it out of RAM in bursts of disk blocks which are written to disk. On replay the RAM buffer is filled with bursts from the disc blocks and read out in a continuous form.
• Timebase Correction – The timing of data entering the buffer may be erratic or have gaps. The timing of data leaving the buffer is made steady by using a reference clock to control the reading process.
• Synchronisation – Buffers are used to synchronise audio data with an external reference such as timecode, by controlling the rate at which data is read out to ensure lock.
• Smooth Editing – Discontinuities at transitions between various sound files are smoothed out and made continuous at their join or crossfade.

6.4 The Allocation Unit (AU)
A minimum AU is defined, representing a package of contiguous blocks. E.g. an AU of 8 KB = 16 x 512 bytes.
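A minimal sketch (Python) of the FIFO behaviour described in 6.3 above; the block and buffer sizes are illustrative assumptions.

from collections import deque

class FifoBuffer:
    """First-In-First-Out audio buffer: bursts of disk blocks go in,
    a steady, clocked stream of samples comes out."""
    def __init__(self, capacity_samples):
        self.q = deque()
        self.capacity = capacity_samples

    def write_block(self, block):            # bursty, from the disk
        if len(self.q) + len(block) <= self.capacity:
            self.q.extend(block)
            return True
        return False                          # buffer full: pause disk reads

    def read_sample(self):                    # steady, reference-clocked output
        return self.q.popleft() if self.q else 0   # underrun -> silence

buf = FifoBuffer(capacity_samples=1024)
buf.write_block([0.1, 0.2, 0.3])              # one small burst from disk
print([buf.read_sample() for _ in range(4)])  # [0.1, 0.2, 0.3, 0]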

6.5 Access Time (ms)
The time taken between the system requesting a file from the disk and the first byte of that file being accessed by the disc controller. Access time can also mean the time taken for the head to jump from file to file. In a disk drive system, access time is governed by the speed at which the read/write heads can move accurately from one place to another, and by the physical size of the disc. The head must move radially across the disk, leading to a delay called seek latency. Then the disc rotates until the desired block moves under the head (rotational latency). Hence, the average access time for a particular hard disk will be the sum of its seek and rotational latencies.

6.6 Transfer Rate (MB/sec)
The rate at which data can be transferred to and from the disk once the relevant location has been found. This is a measure of bus speed and CPU rate. Transfer rate in conjunction with access time limits the number of channels that can be successfully recorded or played back. These two factors also limit the freedom with which long crossfades and other operational features may be implemented.

6.7 Disk Optimization
To get the best response out of a hard disk recording system, the efficiency of data transfer to and from the store must be optimised by keeping the number of accesses to a minimum for any given file.
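A worked example of the access-time arithmetic in 6.5 (Python); the seek time and spindle speed used here are typical illustrative values, not figures from the text.

seek_ms = 9.0                         # assumed average seek latency
rpm = 7200                            # assumed spindle speed

rotational_ms = (60_000 / rpm) / 2    # on average, half a revolution
access_ms = seek_ms + rotational_ms

print(round(rotational_ms, 2))        # ~4.17 ms rotational latency
print(round(access_ms, 2))            # ~13.17 ms average access time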

7. Multichannel Considerations

In a tapeless system, the concept of a track is very loosely defined, and a "channel" refers to how many physical monophonic audio inputs and outputs there are in the system. Multichannel disk recording systems often use more than one disk drive, since there is a limit to the number of channels which can be serviced by a single drive. This leads to a number of considerations:

• Determine how many channels a given disk can handle (a rough sizing sketch follows at the end of this section).
• Next, work out how many disks are required for the total storage capacity.
• Decide which files or channels should be written to which drives, perhaps assigning several channels per disk.

A storage strategy which places files physically close to one another, based on their time-contiguous relationship, will favour faster playback if those files are to be read off the disk in that same order. Hence a system which imitates a multitrack tape machine is a good strategy, particularly where the playback of instruments in a music piece is required. In sound design, where sound FX are randomly accessed according to picture considerations, this strategy may not work so well.

7.1 Winchester Magnetic Disc Drives
Used in PCs. The disks are rigid platters that rotate on a common spindle, contained within a sealed unit to stop disk contamination. The heads 'float' across the surface, lifted by the aerodynamic effect of the air produced between the positioner and the rotating disk. Data is stored in a series of concentric rings (tracks). Each track is divided up into blocks, and each block is separated by a small gap and preceded by an address mark which uniquely identifies the block location. The term cylinder relates to all the tracks which reside physically in line with each other in the vertical plane, through the different disk surfaces. A sector refers to a block projected onto the multiple layers of the cylinder.

Diagram: illustrates the difference between sector, block, track and cylinder in a hard disk's platters.
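To make the first consideration above concrete, a rough sizing sketch (Python); the sustained transfer rate and the seek-overhead factor are assumed figures, not values from the text.

import math

channel_rate = 48_000 * 16 / 8 / 1e6     # MB/s for one 16-bit / 48 kHz channel
sustained_transfer = 8.0                 # assumed sustained drive transfer rate, MB/s
usable_fraction = 0.5                    # assume half the bandwidth is lost to seeks

channels_per_drive = int(sustained_transfer * usable_fraction / channel_rate)
drives_needed = math.ceil(24 / channels_per_drive)

print(channels_per_drive)   # ~41 channels in theory; real systems manage far fewer
print(drives_needed)        # 1 drive for a 24-channel system, on these optimistic figures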

8. Editing in Tapeless Systems

Pre-recorded soundfiles are replayed in a predetermined sequence, in accordance with a replay schedule called an Edit Decision List (EDL). Memory buffering is used to smooth the transition from one file to another, by reading from the disk ahead of the realtime playback. Using non-destructive editing, any number of edited masters can be compiled from one set of source files, simply by altering the replay schedule. In this way edit points may be changed and new takes inserted without ever affecting the integrity of the original material.

8.1 Non-destructive Crossfade
When performing a short fade from file X to file Y, the system ensures that at the time of the crossfade data from both files exists simultaneously in different address areas. Both files are read out via memory. At the start of the crossfade the system reads both X and Y samples into a crossfade processor; time-coincident X and Y samples are blended together and sent to the appropriate channel output. The exact overlap between old and new material will depend on a user-specified crossfade time. Because the system must maintain audio data simultaneously from two audio regions, the demands on memory are high: the longer and more gradual the crossfade, the larger the RAM capacity required.

8.2 Destructive Crossfade
Destructive crossfades can be made as long as the user wants, but involve non-realtime calculation and separate storage of the crossfaded file. There are two variations of the destructive crossfade:
• A real edited master recording is created from the assembled takes, which exists as a separate soundfile. You either require more disc space for this operation or you wipe over your previous files.
• Crossfade segments are created and stored separately from the main soundfiles. This saves on disc space but still allows for long crossfades. It is not a realtime process and the user has to wait for the results.
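A minimal sketch (Python) of the crossfade processor described in 8.1; the linear fade law is an illustrative choice rather than anything specified in the notes.

def crossfade(x_tail, y_head):
    """Blend the end of file X into the start of file Y, sample by sample.
    Both lists must already be time-aligned and of equal length."""
    n = len(x_tail)
    out = []
    for i in range(n):
        g = i / (n - 1) if n > 1 else 1.0   # gain ramps linearly from 0 to 1
        out.append((1 - g) * x_tail[i] + g * y_head[i])
    return out

# 5-sample fade from a steady X (1.0) into a steady Y (0.0):
print(crossfade([1.0] * 5, [0.0] * 5))   # [1.0, 0.75, 0.5, 0.25, 0.0]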

8.3 Edit-point Searching
Often there is a user interface utilising a moving-tape metaphor, allowing the user to cut, splice, copy and paste files into appropriate locations along a virtual tape. Variable-speed replay in both directions is used to simulate 'scrubbing' or reel rocking.

8.4 The Edit Decision List (EDL)
The heart of the real-time editing process is the EDL, which is a list of soundfiles sent to particular audio outputs at particular times. The EDL controls the replay process and is the result of the operator having chosen the soundfiles and the places - often specified as SMPTE addresses - at which they are to be joined. To achieve the final EDL the operator will have auditioned each soundfile and determined the crossfade points.

9. Pro-tools (use manual)

Assignment 6 – AE006-1

AE17 – Mastering for Audio and Multimedia

1. Mastering
2. Assembly editing
3. Sweetening
4. Output
5. Tube vs. solid state electronics
6. Quality Digital & Analog Processing
7. Quality issues with CD-R media
8. Should it be mixed to analog or digital?
9. The verdict
10. A note on monitors
11. Considerations for mastering
   11.1 The CD has to be loud
   11.2 What are the other considerations?
   11.3 Solution
12. Technical tip if loudness is important to a mix
13. Improving the Final Product
   13.1 EQ Technique
   13.2 Creating Space
   13.3 Phasing Problems
14. Alternate Mixes
15. TC Finalizer
16. A non-technical perspective

THE LANGUAGE OF EQUALIZATION
1. Baril language

Various CD Formats
1. Red Book
2. Yellow Book
3. Mixed Mode
4. Blue Book
5. Green Book
6. Orange Book

The tunes are sequenced into the order you specify and correct spacing is made between cuts. and to bring out instruments that (in retrospect) did not seem to come out properly in the mix. this has been considered the heart of the process. they designed signal processors such as compressors. punch. Experience and a feel for the music determine the best path. where copies are made. How? A guideline is to do what serves the music.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes AE17 – MASTERING FOR AUDIO AND MULTIMEDIA 1. And ensure that each mastering sounds good on its own medium. This is sent to a replication plant. tape hiss.50 dB. The solution sometimes goes against logic.96 dB. the equalization used is usually subtle. A wide range of techniques can bring out an album's native voice. or takes. Individual song is “sweetened” via quality analogue tube processors and/or digital equipment and software. This is what albums need in the pop markets: big. Every album has a "voice" in which the message of the artist is delivered. pops. radio-ready sound and a competitive edge. limiters. smoothness. clicks. Most people are familiar with the idea of recording music in a live concert or recording studio. The goal is to increase the emotional intensity. Equipment and techniques were developed to further "sweeten" the sound. Vinyl – 70 dB. Ultimately these takes are assembled into a final master tape. If the performance. clicks and strange noises can often be fixed at the mastering stage. etc. Mastering is designed to fit the entire albums’ dynamic range into the respective media. where clarity. Fine-tuning the fade ins and fade outs. You make tapes that store the individual performances. 302 . Cassette . 50 to 15 Khz.96 dB. sense of air and detail can all be enhanced. depending on their source and severity. They noticed that changing the settings could also have beneficial effects on the music. or the latest technology could be appropriate. Sweetening When engineers first began cutting master discs used to produce vinyl records. Equalizing songs to make them brighter or darker. Depth. or the cuts are crossfaded as you wish. It has three steps: 2. MD . especially in Pop styles. Vintage tube processors may work. arrangement and recording quality are good to start with. Removing noises. Mastering can then profoundly affect its impact and resonance. impact and "punch" are enhanced. etc. Beginnings and ends of cuts are faded to black (silence) or room tone (the natural background noise of the performing space). e.g. A strong performance and good recording technique will set the basic tone for this voice. Spacing the songs. Digital editing today is done on high quality 24 Bit ADC/DAC and DAW. The process of creating the final master is called mastering. 3. and even casual listeners can notice the difference. Adding PQ Subcode. Mastering Mastering includes adjusting the levels so there will be consistency in the levels of various songs for the entire album. depending on the needs of the music. and equalizers (EQ) to prevent overloading the cutter head. then the final master sounds even better than the mix tapes. Assembley editing The tapes from the location recording or mixdown sessions are transferred to a digital editor. Since then. CD . Pops.

It needs air.attuned to the complete presentation rather than the detail of the mixes . Mastering creates a seamless whole out of a collection of individual tracks. impact. The market is demanding. sweet.CD replicators. for quality control reasons. you end up with differences in level and tone. in-store play. fat. focus. mastering means creating the glass master disc that is used to make stampers (which are then used to press the CDs)." The goal is to understand and accept the producer's guidance and then add or remove only what the music requires. Usually a lot can be done to improve the mixes. brilliance. A finished mix is a complex balance that can be made worse as easily as it can be improved. Output The finished music tracks are transferred to the media needed for mass production. since the mixes were recorded at different times of day over a week or more. A new outlook. The mastering facility has ultra-clean processors that are built to handle stereo signals. The glass master should be a perfect mirror image of the CDR master disc produced by the music mastering facility. Almost every CD plant prefers to make it's own glass masters. Tube vs. The best tube designs approach perfection from the other side. a new set of listening skills . Mastering engineers must be fluent in both the artistic and technical areas of music-making. But there are at least three reasons why professionals send tapes out for mastering: i. If you want the disc to compete in the radio markets. depth. present. since a lot of terms are used to describe sound. smooth and live. ambient. since both solutions have advantages in different areas. resonant. This may be obvious . usually CDR (Recordable CD). with a sense of transparency and air. it has to be right sonically.it is one thing to run a guitar through a limiter and equalizer. This master disc can be played on any CD player. warm. sparkle. Often the changes are subtle. iii. The mastering engineer has fresh. The best solid state circuits come close to disappearing.can make a huge difference. experienced ears. People consistently use the term 303 . then a straight transfer might do fine. They tend to err in the direction of a faster. Good communications skills are also critical. Either tubes or transistors can be truly excellent when the circuit is done carefully with the highest quality components available. Also. everyone involved is tired of it. more defined. The classic definition of mastering is the final creative step in producing a CD. Its worth using the best equipment available. solid state electronics No one will ever settle this question.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes 4. It's tough to keep perspective after you've heard a mix 50 times. For example: "It needs to sound big. and the homes of consumers accustomed to excellent music. clarity and definition. slightly less forgiving sound. Note . ii. and another thing to run your whole mix through it. punch. not a bit more. taut. so the client can audition it and give final approval. Why not just transfer the mixes straight to cdr? If a mixing engineer is one of the rare few who can mix and master at the same time. By the time the mixes are done. 5. Occasionally cassette replicators prefer 4mm DAT tapes.

You have added a conversion process to the signal.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes "warmth" when describing tube sound. Sense of air and soundstaging are usually somewhat compromised. using at least 24 bits of data. Quality Digital & Analog Processing 1. Technically. to maintain the integrity of the music. The summary? Consider using an editor to assemble your tracks. Assume for a moment that you have a great-sounding two-channel mix. rather than just pass the signal through with no changes. A harder. We are not talking about the average studio compressor or EQ. but a significant number of people say the resulting master lacks the raw energy that was there in the early mixdowns. Digital design approach the standard set by the analog solid state. or at least neutral. This means for straight assembly editing . Test instruments will tell you one thing. They ask what can be done to fix it. but specialized low distortion. Jitter is an interesting feature of digital audio that can mess up the sound of otherwise good music. If you prefer the best that analog has to 3. many PC's and Macs are equipped with 24-bit processing cards. the better analog processors haven't been replicated digitally. It depends on your style. Still. Some digital processors can sound excellent. and music is inherently an analog process. The best analog gear has a sound that a digital device cannot fully model. this is accurate. Jitter will cause a loss of stereo image focus (blurring) and depth to the reproduced music during digital-to-analogue conversion. just don't preserve enough accuracy in their computations. Lower resolution (16Bit) digital processing devices. Tube circuits offer a wider array of options if you are trying to improve the sound. either on DAT or a higherresolution 24-bit format. Every converter. be it 16-bit or 24-bit has a unique sound. I can re-clock the data stream. that's great. We may have to go back to analog. Sometimes the detailed sound that digital is known for is perfect. 6. 304 . They should sound good. low noise. In the real world. Signal processors affect a lot of things in the music. Assume also that you have that mix stored to your satisfaction. This includes a number of custom tools that I use because they make the music sound right. To be fair. and back again to digital to get the sounds that people want on their albums. through some high-resolution equipment. and leave the sweetening to devices that have been proven over time to be more friendly to the music. wide bandwidth devices designed for mastering. including the ones built into some well-known digital editing workstations. if the mixes are already stored digitally and you have digital sweetening tools that deliver the sound you want. As an added measure when going to analog. These process are not considered Sweetening. Sweetening takes a lot of computational horsepower. but are less forgiving. cross-fading between takes and cleaning up the beginnings and ends of cuts. and your ears may tell you another. the benefits usually outweigh the drawbacks. and theory says that the music will suffer. closed top end feel often results from the limits set by current technology. effectively eliminating digital jitter. You need to edit the mixes into a sequenced tape.sequencing the tracks. 2. Many digital editors will do a good job if they are used correctly. usually called a submaster or work tape. So to recap. 4.

many mastering houses use digital editors that write the final CDR at higher than real-time speeds. The most important thing is to do what serves the music. Radar. and 24-bit sound cards are available for a few hundred dollars. Among 16-bit mixdown formats. even though discs spinning at higher speeds are considered to be more stable. The good news is this has never caused a problem on a master that I've produced for replication at a CD plant. since extra conversions can remove “live” feel. but not everyone is convinced. If you use noise reduction such as Dolby SR. DAT is the most familiar. especially 1/2 inch stereo: 1. As manufacturers of players cut costs. This is the least expensive way to move your tracks into the higher-quality world of 24 bits. where glass masters are cut at faster than real time rates to maximize throughput. This makes it possible to mix directly to computer. the problem may even be getting worse . then the plant will be able to produce discs that sound the same. mastering facilities prefer to work with source material that is higher in resolution than the (16-bit) target media. 7.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes offer. car stereos and boom boxes. CDRs can tax this type of gear to it's limits. The built-in correction on the playback side should compensate for any increased errors. error rates sometimes go up. Quality issues with CD-R media As CDR (CD-Recorderable) replaces Umatic tape (Professional ¾” video cassette) as the common format for shipping CD masters. If the master that I send to you for final approval sounds fine. Should it be mixed to analog or digital? Given a choice. When discs are mastered in real time. One possible explanation for sonic differences of the ‘70s trend towards mastering vinyl at half speed. There are two commonly available options that meet this spec: 24-bit digital files and analog tape. and VS1680 can also write 24-bit files. 8. I take the conservative approach and create all CDRs at low speed. though measured signal to noise ratio favors DAT. then convert only once. This practice is being mirrored at some replication plants. 2x speeds are common and the trend is to 4x and beyond. Protools. CDRs don't play as reliably as manufactured discs. In fact. especially on changers. Dynamic range is roughly equivalent. using media certified for 6 times cutting. and then burn a CD-ROM disc containing 24-bit stereo music files. Dirty laser pickups can also aggravate the problem. When the disc is played back to cut the glass master. (When was the last time you cleaned the pickup on your player?) The bad news is that a CDR might not sound perfect on all CD players. It is worth examining the differences between this standard medium and analog tape. This saves time for the engineer and allows the facility to generate more products. A second factor is more measurable: as you increase cutting speeds. especially on high-speed copies. the media has more relative time to respond to the cutting laser. Most real-world pop and jazz music has a dynamic range considerably less than 85 dB.there's equipment out there that will barely handle a production CD. It is convenient and sonically equal to the CD. Many studios now have computers equipped with CD burners. and strange things result. People have noticed variations in level during playback or crackling sounds in very quiet passages of the music. analog is superior. so noise is not often the issue. This results in a pit geometry that is more precise. 
there is less chance for jitter and other timing-related problems to occur. I have heard a number of top people discuss sound differences between CDRs recorded in real time and at higher speeds. carefully. the noisier 1/4" stereo tape is just 305 . Common multitrack systems such as Paris.

Its still there. Mackie HRS824. 2. by a factor of at least ten. Measured distortion favors digital. Many 15 ips analog machines have useable response from below 20 Hz out to almost 30 kHz. If you have a 1/4" machine available. 9. Also. This can be a big plus for certain styles of pop. 306 . it's not a bad compromise. At 30 ips. The noise spectrum is also different at 30 ips. especially if you have good noise reduction. Its easier to fix the problems when you know that you can trust the monitors. the sound is gone. and enough resolution to make even long reverb tails sound convincing. especially if its running at 30 ips.. rock and rap music. speakers in this size range can often create a very accurate stereo image. Some prefer smaller nearfield monitors placed close to the listener. These bumps. related to the design of the playback head. If you can create and store 24-bit mixes. finer machines go from near 20 Hz to way past 40 kHz. The flip side is that as you record hotter signals onto analog tape. since these can minimize problems associated with the acoustics of average rooms. you get a natural compressor/limiter action as the tape begins to saturate. While its more trouble and expense to work with and the media costs more. Every engineer needs a reliable reference e. This contributes to the quality of "air" that people associate with analog tape. Even when expensive outboard A/D converters are used.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes fine for many styles of music. Many hit records are being made in this format.) This is not true with analog. A note on monitors Many problems that mastering engineers encounter come from mixdowns made on less than perfect monitors. (A point in favor of 24-bit. They are often compensated for naturally when mixing. Many people still prefer analog. then sit back and listen to both. The verdict DAT is generally considered better than 1/4" tape at 15 ips. When you go lower than 16 bits of signal in a digital system. 24-bit storage on inexpensive CDROM discs has a lot of appeal for studios who don't want to buy and maintain an older 1/2" deck. providing the extra octave of high end as well as a smoother roll-off.g. 10. These monitors can't deliver deep bass. Genelec 1031A. response drops like a stone due to the very sharp anti-aliasing filters needed for digital. Above 20 kHz. B&W 805 Nautilus. you'll know why. where you can still track musical information below the noise floor. run it in parallel with the DAT for a couple of mixes. Once you hear it. Compare them side by side if you have the chance. 3. most folks choose the analog. If you compare DAT to 1/2" analog.. DAT has a flatter frequency response within the normal 20 Hz to 20 kHz band. the results are worth it. the picture changes. Most people feel that 1/2" tape preserves more detail in low level signals. I have clients that prefer the analog. are usually 2dB or less in size and do not cause phase response problems. but since this is also true for most speakers that consumers use. the comparison with 1/2" is closer. My opinion is that 1/2" is the format of choice for most styles of Pop and Rock music. in depth as well as width. These filters can also cause strange things to happen to the high end due to their impulse and phase response. with a greater sense of air than DAT. but 24-bit can also be excellent. To many this means cleaner sound overall. analog has the advantage. though if you want the fat in-yourface sound of tape. 
All analog machines have slight irregularities in lower bass response caused by "head bumps". but has a "silkier" quality.

Then the strongest tracks can be re-mastered specifically for airplay and added onto the tail of the CD.3 Solution Its usually possible to settle on a fairly hot compromise level that keeps everyone happy. guitar processors and room mics. Here's another option: Many CDs run 45 to 60 minutes in length. For those who want to push a particular track very hard. Like it or not. then you can compensate based on experience. What gets sacrificed is clarity. there's a lot to be said for speakers that are just plain accurate. whatever it takes. your tune starts to sound like a 'music minus one' track . If your speakers tell the truth. to say the least. equalized and compressed for airplay.2 What are the other considerations? If you master an album for airplay only. radio stations will jam your music through their own limiters prior to being transmitted over the airwaves. 12.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes Some engineers prefer larger wideband monitors e. The entire album can be mastered first for high fidelity listening. but you run the risk that people at home will experience listening fatigue after 15 to 30 minutes. This means that for everyone not in a perfect reception area. and disappear in mono. most consumers don't have systems this good. and they can ignore the bonus cuts if they want to. 11. You can always push the level up a bit. The real world impact occurs when the album goes out over the radio. The sonic quality of these processors is usually not perfect. This is wasted when the music is crammed into the top 10 dB.g. The processors in a good mastering facility will produce high levels and better maintain the sound quality that you have worked so hard to create. Of course. Stereo separation degrades as you get farther from a station. but the live feel and sense of air will suffer. Still. so there is unused time available after the album is cut onto CD. Technical tip if loudness is important to a mix Hit the mono button on your console or run the DAT mixes back through a couple of modules panned to the center.the guitar disappears. assuming the speaker/room match is done right. there is one good reason to go for it at the mastering stage. Looking at this problem from a music industry perspective. all the tracks will sound punchy.1 The CD has to be loud If you want maximum level. correctly placed in a room that is designed for listening. This can be caused by delay lines. Genelec 1038A. These systems will give the most accurate picture of the music. 11. A good home listening system can reproduce over 90dB of dynamic range. record labels are getting locked into a no-win contest to be 1/2 dB louder than the next guy. 307 . Considerations for mastering 11. one possible scenario is to release promotional EP's or singles. In extreme cases. These hotter versions can be labeled as bonus tracks on a jewel box sticker. The option doesn't cost the buyer extra. The record store version can then be mastered for a hi-fi home environment. an instrument will sound HUGE in the stereo mix. Fill the extra space with music rather than silence. B&W 801 Nautilus. even if they like the material itself. regardless of what the rest of the world uses. Music buffs generally don't enjoy hot sound for more than a little while. there is a phasing problem. 11. If the mixes drop noticeably in level when played in mono.

and before the mix goes to DAT tape. The best solution from the technical standpoint is to send the mixdown tapes. (This includes BBE Sonic Maximizer and Aphex Aural Exciters. Even if you want the final sound to be stepped-on or processed with effects. you may be able to solve a bass muddiness problem quickly. If you elect to use a local facility for assembly editing. Check your early mixdowns with good headphones available. You may be able to bring overall levels up. balance and EQ are good. The producers leave a lot of "space" so that the strong basic groove does not get cluttered. radio-ready sound. Since phones eliminate all roomrelated bass problems. and let the mastering engineer assemble and sweeten them into a finished master. then compare your sound. You will preserve more air and detail. It turns out their mixdown monitors were too smooth in the treble. consider the option of including extra tracks. 13. I can almost certainly do that with higher quality than you get with commonly available studio gear. mushy tracks. Excellent phones still can't image properly. then use it on the mix . You just need to be aware of the tradeoffs. use them sparingly. (the T. The final product will have a more polished feel. noise and distortion issues. which often add CD-unfriendly grit to the top. if you want your whole album as loud as possible. get a better balance between all instruments and still maintain a great drum feel. If you send a tape that is heavily limited. This brings out a key point: mastering can only improve good product. it is tough to fix at the mastering stage. It cannot fix bad mixes. they are generally better than speakers that cost 10 times as much. and the overall level. the better the final result. poor arrangements or sloppy playing. or EQ'd too much. or they get muddy fast. If this is the case as you mix. A number of clients remix tracks that were originally BBE'd. ask the engineer to not normalize.C. there will be problems getting both to sound good. If you are still at the mixing stage. Listen to tracks from albums you respect.moderately. They had used the exciter to compensate for speaker defects or poor hearing. In general. The mastered CD got chart action in Billboard. A great mix usually produces a great master & vice versa. try using 3 to 6dB of limiting on either or both instruments. If you have a lot of guitar. EQ or compress the music with their computer based editor. On many tapes I receive.) • • • • • 308 . If you want very pretty sound and very hot levels. the producer decided to replace some weak tracks and remix the entire album. A stitch in time saves nine. and may not be able to create the sound you need. Any mix must have no major technical problems. After we discussed it. If you have a stereo compressor or spectral processor that you love the sound of. I had one client whose mixes were seriously flawed. the less stereo processing you do. I'll work to make it that way.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes To summarize. but for resolving spectral balance. Many hit records are very sparse. The producer just wants a big. then I have less room to maneuver. and the results on an accurate monitor were bad. kick and snare drums contribute most of the peak levels. synth and "hot licks" tracks available to spice up the mix. consider simplifying the mix. "Wall of sound" mixes need to be done carefully. 
Improving the Final Product • The first suggestion crosses over into the producer's jurisdiction but a number of clients have found it helpful. When this effect is overdone. Finalizer and L1 Maximizer come to mind). and you will avoid adding unwanted “grunge” to the sound. so everybody was happy but the release date had to be pushed back six weeks. If the kick drum sounds great and the bass guitar is terrible.

even inexpensive ones. its worth checking out at least once. then burn 24-bit files to CDROM to 309 . Have the backup vocalists step back six inches or switch places for the second pass. A related technique is removing unneeded frequencies. try moving the microphone and the performer within the recording space. it comes across as richness. If you've never tried it. speed it up a few cents. if you have the option. This technique is effective for toning down the inherent resonance in snares. You will free up 'space' needed for other instruments.SCHOOL OF AUDIO ENGINEERING AE17 – Mastering for Audio and Multimedia Student Notes 13. Alternate Mixes When you have the perfect mix on tape. no matter how good the one is. When you go back through the mix tapes to choose the keepers. Make the sound worse. If you scan across the console and notice that too many channels are boosted at 10KHz. and all the changes are fresh in your mind. Alternately use a phase meter.2 Creating Space When you double track any instrument or vocal. as opposed to objectionable ones. Vocals often get a bit buried in the heat of a long mix session. When you provide variation.3 Phasing Problems Detect these easily by listening to the mix in mono. Varispeed: If you can varispeed the multitrack recorder. for sure. The EQ'd result might sound odd when you solo it. This why having a few different reverbs . then rerun the mix with all other settings identical. 14. • 15. Its usually more effective than boosting the characteristics you like. 13. Vary the bandwidth to fine tune. try notching out or “thinning” (one track at a time) sections of the musical spectrum above or below the frequencies where a track makes it's major contribution to the mix. If your mix is sounding denser than you like. 13. Rerun the mix. Find the offending frequency range by boosting 6 to 12 dB and sweeping back and forth .at fairly narrow bandwidth.is better than having just one. but others will notice a subtly higher energy level in the sound. acoustic guitars and other instruments. this might help solve the problem. consider creating one or two alternate mixes before moving on to the next track: • Vocals: Push up the vocal faders one or two dB. but could be the right answer in the context of the whole mix. The resulting mix may drive people with perfect pitch nuts.1 EQ Technique Use a few channels of sweepable or fully parametric EQ to remove problem frequencies from instruments. up to maybe 1% total. An increasing number of studios can store the higher-quality 24-bit mixes directly on computer. Our ears are sensitive to the complex nature of acoustical ambiance. Add in a different room mic. Then switch from boost to a few dB of cut at the problem frequency. etc.better than most DAT machines. TC Finalizer The good news is that the 24-bit converters in this unit may be the best ones you'll have available . you may prefer the version with slightly hotter vocals.

13.2 Creating Space
When you double track any instrument or vocal, try moving the microphone and the performer within the recording space. Have the backup vocalists step back six inches or switch places for the second pass. Our ears are sensitive to the complex nature of acoustical ambiance; when you provide variation, it comes across as richness. This is why having a few different reverbs is better than having just one, no matter how good the one is.
A related technique is removing unneeded frequencies. If your mix is sounding denser than you like, try notching out or "thinning" (one track at a time) sections of the musical spectrum above or below the frequencies where a track makes its major contribution to the mix. You will free up 'space' needed for other instruments. If you scan across the console and notice that too many channels are boosted at 10 kHz, this might help solve the problem.

13.3 Phasing Problems
Detect these easily by listening to the mix in mono. Alternately, use a phase meter.

14. Alternate Mixes
When you have the perfect mix on tape, and all the changes are fresh in your mind, consider creating one or two alternate mixes before moving on to the next track:
• Vocals: Push up the vocal faders one or two dB, then rerun the mix with all other settings identical. Vocals often get a bit buried in the heat of a long mix session. When you go back through the mix tapes to choose the keepers, you may prefer the version with slightly hotter vocals.
• Varispeed: If you can varispeed the multitrack recorder, speed it up a few cents, up to maybe 1% total. The resulting mix may drive people with perfect pitch nuts, but others will notice a subtly higher energy level in the sound.
• Add in a different room mic, if you have the option, etc., and rerun the mix.

15. TC Finalizer
The good news is that the 24-bit converters in this unit may be the best ones you'll have available - better than most DAT machines, even inexpensive ones. If you've never tried it, it's worth checking out at least once. An increasing number of studios can store the higher-quality 24-bit mixes directly on computer, then burn 24-bit files to CD-ROM to send out for mastering or as safeties.
I've gotten too many calls from folks who need significant improvements to their one and only submaster tape which, it turns out, was "Finalized." Usually they were told "We run all our mixes through it because it makes everything sound good and hot." While this approach may work for demos and local radio, there's a reason why mastering facilities have not traded in all their other equipment to buy a Finalizer (or similar units now on the market). If someone insists that it will provide all the enhancement you need, believe them - and then make sure that you run a second parallel recorder fed directly off the console, with no processing (safety copy). This device, as advertised, with its presets, will create a sonic signature that is not reversible. Considering the unit as a processor - primarily as a compressor, with brick-wall multiband limiting etc. - the mixes will be louder, and they may improve in other ways, depending on the experience of the user. You will have an unprocessed safety copy if you are unhappy with its processing.

16. A Non-Technical Perspective
Much of what I discuss in this presentation relates to technique. Ultimately, more than any other aspect of recording, mastering has less to do with technique and more with awareness. Technique has to be a given; once you have the basics - decent mic preamps, EQ and compressors - the rest is icing. There's a huge array of vintage and modern processors with distinctive sounds. If you like exotic microphones, be sure to try them in your space before you commit: some legendary mics are special purpose devices, and may have a sound different than you expect.
To produce the best work, you need to be in a space very similar to that of an artist in a live performance. How and when to make the moves should not require thought - go with the "feel". The issue is how to express the creative force (Force) that is available to all of us when we work to be receptive. Excellence becomes possible when receptivity combines with experience. Most musicians, who know about constant practice and a quality instrument, can relate to this way of working, as well as people familiar with martial arts and similar disciplines. Having all the right factors aligned does not guarantee a great performance, but it increases the chances drastically. Even then, an occasional failure is handed to everyone, and, like in anything else, there will be clients who cannot be satisfied, regardless of skill.
©1999 DRT mastering

THE LANGUAGE OF EQUALIZATION
No treatise on equalization would be complete without offering some sort of a lexicon of EQ'ese. Rarely do engineers, producers and artists communicate using specific technical terms like "try x number of dB boost at frequency y"; the process of equalization seems to be replete with adjectives, and most often it is more like "let's warm up the guitar, and add a little sparkle to the keyboards." While there might be some regionalism in this vocabulary and perhaps some variety in interpretation, it is still quite amazing how universal many of these terms are and how intuitively the human mind seems to grasp even unfamiliar terms. Perhaps they reflect a simple word/frequency association, but it seems there is a reality behind these words automatically grasped by sonically sensitive people. While no attempt has been made to be comprehensive or authoritative, the following list does reflect some sort of a consensual opinion on the location of certain commonly used terms and their associations in the realm of frequency.

Boomy – applied to a sound overabundant in low lows (in the region of 40-60 Hz). These waves move a lot of air and make things sound big.
Fat – generally applied to the octave above boominess (say 60-150 Hz). Makes things sound big, but not as big, boomy or earth-shaking.
Woofy – a somewhat nebulous term for sounds that are sort of "covered", masked by low-end energy (typically in the region of 125-250 Hz).
Puffy – like an octave above woofy (say 250 to 500 Hz). It's still sort of a cloud.
Muddiness – actually a compound problem: woofiness plus puffiness (excess low end and also low mids).
Thinness – the opposite of muddiness (a deficiency of lows and low mid frequencies).
Warm – obviously a positive characteristic, often found between 200 and 400 Hz. Could easily degenerate into woofiness or puffiness if overdone.
Boxy – seems to remind one of the sound in a small box-like room (usually found between 600 Hz and 1 kHz).
Telephony – accentuating the limited bandwidth characteristic commonly associated with telephones (a concentration of frequencies around 1.5-2.5 kHz with a roll-off both above and below).
Presence – anywhere from 3-6 kHz can be used to make a sound more present.
Cutting – here, "cut" means to put an incisive "point" on the sound (2.5-4 kHz does this very effectively).
Sibilance – dangerous "s" sounds and lots of other trashiness can often be found at 7-10 kHz.
Brightness – most generally achieved by shelving EQ of everything above 10 kHz.
Darkness – the opposite of brightness (a general lack of highs at 10 kHz and beyond).
Zizz – a pleasantly biting high-end resonance (think of a "harpsichord"-type brightness found around 10-12 kHz).
Glass – a very translucent but palpable brilliance associated with 12-15 kHz.
Sparkle – a real smooth stratospheric brilliance almost beyond hearing, but it can certainly be sensed; found at 15-20 kHz.
Openness – the quality of having sufficient highs and lows.

In conclusion, while no one can impart automatic discretion in the use of equalization, it should be clear that the art of equalization is one that can be studied, practiced and learned. One is encouraged to modify and add to this brief lexicon according to one's common usage. This vocabulary is offered in the interest of spurring better communication among practitioners of the audio arts.

Various CD Formats
What's confusing is the number of competing CD-ROM formats, or standards, for artists and record labels to choose from, almost all of which are non-compatible. To date there are various CD standards: Audio CD, Windows CD-ROM, DOS CD-ROM, Mac CD-ROM, CD-I, CD-Plus, Enhanced CD, CD-Extra, etc. Each type of CD has a given set of parameters. All CDs are the same physical size, yet the discs vary in two significant ways:
• The way information is stored on the physical disc media
• The nature of the information that can be contained on them
All of the most popular CD technology has been developed by Sony and Philips. The compact disc industry started in 1982 when Sony and Philips created the Compact Disc Digital Audio Standard. These different types of CD standards are named after colours - Red for stereo audio, for example.

1. Red Book
The original disc, the audio CD, conforms to the Red Book standard. The Red Book standard allows for up to seventy-four minutes of stereo music, using Pulse Code Modulation (PCM) to encode two channels of stereo audio into about 640 MB of space. The newer CD-audio disc storage space of 700 MB allows for 80 minutes of programme time. Red Book organizes the audio to be stored into disc tracks with a fixed size and duration. Audio is the only type of data that can be stored with this technology.

2. Yellow Book
Computer CD-ROMs conform to the Yellow Book standard, also published by Sony and Philips. This standard defines the proper layout of the computer data on a disc. Yellow Book takes the basic Red Book standard and defines two new track types - computer data and compressed audio or picture data. The format also adds better error correction (necessary for computer data) and better random access capabilities.

3. Mixed Mode
When a CD has CD-ROM tracks and CD-Audio tracks it's called a "Mixed Mode" disc. On such a disc, data cannot be read while sound is being played, so computer applications must use Mixed Mode in a staged manner. To get around these limitations and to obtain synchronization of multiple tracks, an interleaved style was added in the form of the CD-ROM/XA standard.
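As a quick cross-check of the capacity figures quoted above (my own arithmetic, not from the notes): a CD sector carries 2,352 bytes of Red Book audio but only 2,048 bytes of Yellow Book Mode 1 user data, which is why the same 74-minute disc is usually quoted as roughly 650 MB for computer data while holding nearly 800 MB of raw audio.

# Rough CD capacity check (assumed figures: 75 sectors per second, 74-minute disc)
SECTORS_PER_SECOND = 75
MINUTES = 74
sectors = MINUTES * 60 * SECTORS_PER_SECOND            # 333,000 sectors

audio_bytes = sectors * 2352                            # Red Book audio payload
data_bytes = sectors * 2048                             # Yellow Book Mode 1 payload
pcm_rate = 44100 * 2 * 2                                # 44.1 kHz, stereo, 16-bit = 176,400 B/s

print(audio_bytes / 1e6)                 # ~783 MB of audio
print(data_bytes / 1e6)                  # ~682 MB of data (about 650 MiB)
print(pcm_rate * MINUTES * 60 / 1e6)     # ~783 MB: matches the audio payload exactly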


CD-ROM/XA (CD-ROM Extended Architecture) was proposed by Philips, Sony and Microsoft in 1988 as an extension to Yellow Book. This specification stretches the Yellow Book standard and adds the most interesting idea from Green Book (CD-I): interleaved audio. Before this time, audio was confined to one track at a time. Interleaved audio allows the sound to be divided and intermixed with other content. The CD-ROM/XA drive needs special hardware to separate the channels; when it does, it gives the appearance of simultaneous, synchronized media. Not only does CD-ROM/XA allow for intermixed sound, but it also provides various levels of fidelity and compression. CD-ROM/XA uses a more efficient technique of audio compression than the Red Book, allowing four to eight times more audio on the same disc.
These hybrid discs became known as Mixed Mode CDs. Mixed Mode CDs placed the ROM data on track one and the audio on track two, and that seemed like an acceptable way to arrange the information on a disc at the time. The CD worked in a standard audio CD player (although the user had to manually skip track one) and the technology was designed to work with computer CD-ROM drives. Everything seemed to be developing smoothly until the music industry started screening the technology. They found it unacceptable that playing the data track produces a high-pitched screeching noise, and preferred the audio tracks to be placed first with the data track last.

4. Blue Book

Enhanced CD, CD-Plus and Multi-Session are all names for the Blue Book standard. Enhanced CD is a hybrid disc format which merges the audio-only characteristics of Red Book and the visual data of Yellow Book. With the Blue Book standard, the music is on the first track and the computer data is on the second track. Any user can take that CD and play it on any standard audio CD player, even though it is labelled CD-ROM. The disc will play without any difficulty, and won't produce any squeaks from the computer data. The player knows to handle both audio and computer data because they are held in two separate sessions, which is why Blue Book discs are called "Multi-Session" discs.
In 1993/1994, Sony and Philips were busy propagating the name "CD-Plus," the name they coined to indicate the generic combination of multimedia material with audio CDs. That name quickly faded when Sony/Philips lost a court battle to market their CD-Plus product to music enthusiasts worldwide. It seems that a Canadian CD retailer already owned the Canadian trademark to the CD-Plus name, so Sony/Philips were precluded from selling product under that name. Sony and Philips next adopted the term "CD-Extra" as the replacement name for CD-Plus, but that name never caught on with developers or the general public. For the record, the Recording Industry Association of America (RIAA) has officially dubbed this technology "Enhanced CD." This new name for interactive music CDs has gained wide acceptance and is now the preferred term used to describe discs which contain both audio and ROM data. Jackson Browne's Looking East Enhanced CD features the evolution of a song; Soundgarden's Alive In the Superunknown has an autoplay mode which gives the title the feel of a cool screen saver; Sarah McLachlan's Surfacing includes video clips of her recording sessions.
The record companies were satisfied with this new technology because the audio portion remains absolutely identical to Red Book audio discs. The upside is that you can now play your Enhanced CD on any CD player. The negative aspect is that it can't be played on all CD-ROM drives.



A single-session CD-ROM drive (typically one made before September 1994) probably can't play an Enhanced CD; its single-session driver is not capable of reading a second session. Since Blue Book technology places the music first and the computer data second, a single-session player - which reads only the first information session stored on a CD-ROM - cannot fully play a Multi-Session disc. One would need to load new drivers on the computer in order to access that outer data session with a single-session CD-ROM drive. The modern Enhanced CD avoids the data-track problem by specifying that all computer data be stored in a second session - one which conventional audio CD players cannot access.
In order for Enhanced CD discs to be properly recognized by a computer, there are prerequisites:
The computer must be equipped with a Multi-Session CD-ROM drive. If a computer was purchased after September 1994, it should have a Multi-Session drive, as well as a sound card.
The CD-ROM firmware must support such discs. Depending on the rigidity of the CD-ROM firmware, the drive may or may not recognize Enhanced CD discs properly. The firmware included in some drives is inflexible to the point that it cannot accommodate the different data layout contained on Enhanced CD discs. In some cases, one may have to upgrade the firmware to allow the drive to support Enhanced CD.
Another prerequisite for Enhanced CD discs to be recognized by the computer is that the CD-ROM device drivers must recognize Enhanced CD. This last requirement means that the drive must be capable of recognizing that a given disc conforms to the Enhanced CD standard. When an Enhanced CD disc is inserted into the CD-ROM drive, the current device drivers must be able to differentiate between audio and data.

5. Green Book

The Green Book CD-ROM standard was introduced by Philips in 1986 and is the basis for a special format called Compact Disc Interactive, or CD-I. Although CD-I using the Green Book standard is a general CD-ROM format with the bonus of interleaved audio, all CD-I titles must be played on a CD-I player, because only this piece of equipment can decode CD-I formatted discs, using a special operating system known as OS-9. A CD-I system consists of a stand-alone player connected to a TV set. Consumers have been slow to embrace CD-I technology; the CD-I market is limited by the special CD-I development environment and the relatively small number of CD-I player manufacturers.
A unique part of the Green Book standard is the specification for interleaved data. "Interleaved data" means taking various forms of media, such as pictures, sounds and movies, and programming them into one track on a disc. Interleaving is one way to make sure that all data on the same track is synchronized - a problem that existed with Yellow Book CD-ROM prior to the introduction of CD-ROM/XA.

6. Orange Book

The Orange Book standard defines the writeable CD formats. Part 1 of the standard covers magneto-optical media, which are completely rewritable. Part 2 covers CD-R (Compact Disc Recordable), the write-once medium, and defines the Multi-Session format that allows for incremental updates to the media. Ordinary CD-ROM drives cannot write to CD-R media and can only play the first session on a Multi-Session disc.


CD Access Time - At the moment, there is no standardization of CD-ROM drives. However, there is a specific set of criteria that they must meet, and life would be easier if these criteria were standardized. Most existing CD-ROM drives operate with relatively slow access times and transfer rates compared to computer hard drives. The access time is how long it takes to find and return data: the access time of an average CD-ROM is 300 milliseconds, while the access time for a hard drive is about ten to twenty milliseconds. The transfer rate is the amount of data that is passed in a second. The transfer rate for CD-ROM is about 150 kilobytes per second, rooted in the technology designed to play a steady stream of digital CD music. (The transfer rate needed for uncompressed full-screen, full-motion video is approximately 30 MB per second.) Dual-speed drives help improve performance by providing up to 300 KB per second transfer rates.
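These transfer-rate figures are easy to verify with a little arithmetic (my own check; the 640 x 480, 24-bit, 30 fps video frame size is an assumption, since the notes only quote the resulting rate):

# Single-speed CD-ROM transfer rate (Mode 1 data)
sectors_per_second = 75
mode1_payload = 2048                        # bytes of user data per sector
print(sectors_per_second * mode1_payload)   # 153,600 B/s, i.e. ~150 KB/s (double speed: ~300 KB/s)

# Uncompressed full-screen, full-motion video (assumed 640 x 480, 24-bit colour, 30 fps)
print(640 * 480 * 3 * 30 / 1e6)             # ~27.6 MB/s, roughly the 30 MB/s quoted above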


Assignment 7 – AE006-2


AE18 – Synchronization and Timecode
SYNCHRONIZATION
1. Classification of Synchronization
   1.1 Feedback Control Systems (Closed-loop)
   1.2 Non Feedback Control (Open-loop control)
2. Pulse Synchronisation
   2.1 Click Track
   2.2 Frequency Shift Keying (FSK)
3. Timepiece Synchronisation
   3.1 SMPTE/EBU Time Code
4. Longitudinal, Vertical Interval and Visible Time Code
   4.1 Longitudinal Time Code (LTC)
5. Time Code and Audio Tape
6. Copying SMPTE Time Code
   6.1 Refreshing or Regenerating Code
   6.2 Jam Synchronising
   6.3 Reshaping
7. Time Code Generator
8. Methods of Synchronisation
   8.1 Chase Lock
   8.2 Machine Control
9. Other Synchronization-related Concepts
   9.1 Slave Code Error
   9.2 Slave Code Offset
   9.3 Event Triggers
   9.4 Flying Offsets
   9.5 Slew
   9.6 Advanced Transport Controls
10. SMPTE-to-MIDI Conversion


AE18 – SYNCHRONIZATION AND TIMECODE
SYNCHRONIZATION
Synchronisation causes two or more devices to operate at the same time and speed. In audio, the term means an interface between two or more independent, normally free-running machines - two multitracks, for instance, or an audio tape recorder and a video player. Audio synchronisation has its origin in post-production for film and, later, video (dialogue and Foley work, for example), but synchronising equipment is now very common in music recording studios as well. Sound production will involve the engineer with some form of synchronization. Large-scale industrial productions are huge, often intermixing pre-recorded video, film, slides, multitracked music and effects with live performance. All of these media must operate synchronously, on cue, without failure. The importance of understanding how synchronization systems work cannot be overemphasized.

1. Classification of Synchronization

For two machines to be synchronised there must be a means of determining both their movements, and of adjusting them so that the two operate at the same time. This means that each machine must provide a sync signal that reflects its movement. The sync signals from both machines must be compatible; each must show the machine's movement in the same way, so they can be compared. The sync signals may be either generated directly by the machine transports or recorded on a medium (e.g. tape) handled by the machines and reproduced in playback. Synchronization control systems can be classified as either feedback-controlled or non-feedback-controlled systems.
1.1 Feedback Control Systems (Closed-loop)
One machine is taken to be the master, and is a reference to which the slave is adjusted. In a feedback control synchronisation system (Fig. 20-1), a separate synchroniser compares the slave's sync signal to the master's sync signal and generates an error signal, which drives the slave's motor to follow the master. Feedback control automatically compensates for any variations in the master's movement, be it small speed variations due to changing power or gross variations like bumping the master into fast-forward. It is self-calibrating: as long as the sync signals are compatible and readable, feedback control will bring the two machines into lock. Feedback control is the method used when two or more normally free-running machines are to be synchronised. The general term for closed-loop pulse synchronisation is resolving.
1.2 Non Feedback Control (Open-loop control)
This simpler method is used when the slave machine can be driven directly by the master's time code - for example, syncing a drum machine from a MIDI sequencer's clock and bypassing the drum machine's internal clock. This is analogous to a physical mechanical linkage between two tape machine transports, causing them to run from a single motor.
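As a very rough illustration of the closed-loop resolving described in 1.1 (my own sketch, not a real synchroniser algorithm), the synchroniser can be thought of as repeatedly measuring the position error and nudging the slave's capstan speed in proportion to it:

def resolve(master_pos, slave_pos, gain=0.1):
    """One pass of a (much simplified) proportional feedback loop.
    Positions are in frames; the returned speed is a play-speed multiplier."""
    error = master_pos - slave_pos          # positive: slave is behind the master
    slave_speed = 1.0 + gain * error        # chase: speed up if behind, slow down if ahead
    return slave_speed, error

master, slave = 1000.0, 996.0
for _ in range(40):                         # the error shrinks towards zero: the machines "lock"
    speed, err = resolve(master, slave)
    master += 1.0                           # master advances one frame per tick
    slave += speed                          # slave follows its adjusted capstan speed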


We can further categorise existing synchronisation techniques into two classes - the pulse method and the timepiece method - according to the information contained in each sync signal.

2. Pulse Synchronisation

Pulse synchronisation relies upon a simple stream of electronic pulses to maintain constant speed among machines. Pulse methods are used in open-loop systems. Some of the pulse methods in use are:
2.1 Click Track
A click track is a metronome signal used by live musicians to stay in tempo with a mechanical, MIDI-sequenced, pre-recorded music or visual program. The click corresponds to the intended beat of the music, so its rate can vary. On occasion, amplitude accents or a dual-tone click may be used to denote the first beat of each measure. Some MIDI sequencers can read and synchronize with click tracks.
2.2 Frequency Shift Keying (FSK)
FSK is a method for translating a low-frequency pulse (such as any of the above proprietary clocks) to a higher frequency for recording to audio tape or transmitting over a medium whose low-frequency response is limited. The clock pulse modulates the frequency of a carrier oscillator, producing a two-tone audio-frequency signal. Units designed for FSK sync read the two-tone signal and convert it back into the corresponding low-frequency pulse.

3. Timepiece Synchronisation

Pulse-type synchronization methods share a common drawback: the sync signal in all cases has no marker function. Pulse sync signals can be resolved so that two systems run at the same rate; however, they convey no information beyond speed. To be in sync, the systems must


start at the same time and at the same point in the song. Should the master be stopped in the middle of a song, the slave will stop likewise; but after the master has been rewound to the start of the song and begins playing again, the slave will simply continue from where it last stopped. Timepiece sync methods fix this problem by using a more complex sync signal into which place markers are encoded. These place markers identify individual points in the song. As a consequence, systems using timepiece synchronisation can lock to one another exactly, even though they may not start at precisely the same instant and place in the song. Examples include SMPTE, MIDI clock with Song Position Pointer, Smart FSK and MIDI Time Code (MTC).
3.1 SMPTE/EBU TIME CODE
SMPTE Time Code is a synchronisation standard adopted in the United States in the early 1960s for video editing. Although still used for that purpose, it is also used in audio as a spin-off of the audio post-production market (film and video sound), and it is widely used for audio synchronising. Sometimes called electronic sprockets, SMPTE code allows one or more video or audio transports to be locked together via a synchroniser, and can be used for syncing sequencers and console automation. EBU is short for the European Broadcasting Union, an organisation like SMPTE that uses the same code standard.
SMPTE code is a timekeeping signal. It is an audio signal like any other audio signal and it can be patched in the same manner (the path should be as direct as possible, e.g. without EQ or noise reduction). Time code requires a synchroniser, which will interface with a multi-pin connector on each of the machines that it controls. The LTC itself will be recorded on an audio track (edge tracks), or VITC will be incorporated within the video signal between each video frame.

Time code is a digital pulse stream carrying timekeeping information. The data is encoded using Manchester Bi-Phase Modulation, a technique which defines a binary '0' as a single clock transition and a '1' as a pair of clock transitions (see Figure 20-2). This affords a number of distinct advantages:
1. The signal is immune to polarity reversals: an out-of-phase connection will not affect the transmission of data.
2. Because the data can be detected using an electronic technique called zero-crossing detection, the time code signal's amplitude can vary somewhat without confusing the receiver circuitry.
3. The rate of transmission does not affect the way the synchronizer understands the code, so it can still read the code if the source transport is in fast-forward.
4. The data can be understood when transmitted backwards, i.e. when the source transport is in rewind.
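A tiny sketch of the bi-phase idea (illustrative only, not a full LTC modulator): every bit period begins with a transition and a '1' adds an extra mid-period transition, which is why the code survives polarity reversal and can be read at any speed in either direction.

def biphase_mark(bits, level=0):
    """Return one output level per half-bit period.
    Every bit cell starts with a transition; a '1' gets a second, mid-cell transition."""
    out = []
    for bit in bits:
        level ^= 1                 # transition at the start of every bit cell
        out.append(level)
        if bit:
            level ^= 1             # extra mid-cell transition encodes a '1'
        out.append(level)          # for a '0', the level is simply held
    return out

print(biphase_mark([0, 1, 1, 0]))  # [1, 1, 0, 1, 0, 1, 0, 0]
# Inverting every level in that stream decodes to exactly the same bits (polarity immunity).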


Time code is like a 24-hour clock and runs from 00:00:00:00 through 23:59:59:29. The code address enables each image frame to be identified; e.g. on the display, one hour, two minutes, three seconds and four frames would look like 01:02:03:04.
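As a simple illustration of how such an address maps to a running frame count (my own sketch; straight 30-frame non-drop counting is assumed):

def tc_to_frames(hh, mm, ss, ff, fps=30):
    """Convert a non-drop time code address to an absolute frame number."""
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

def frames_to_tc(count, fps=30):
    """Convert an absolute frame number back to hh:mm:ss:ff."""
    ff = count % fps
    ss = (count // fps) % 60
    mm = (count // (fps * 60)) % 60
    hh = count // (fps * 3600)
    return "%02d:%02d:%02d:%02d" % (hh, mm, ss, ff)

print(tc_to_frames(1, 2, 3, 4))   # 111694 frames at 30 fps
print(frames_to_tc(111694))       # '01:02:03:04'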

SMPTE data bits are organised into 80- or 90-bit words, which are synchronised to the image frame rate, one word per image frame. Data within each word are encoded in BCD (Binary Coded Decimal) format - which is just a way of reading and writing them - and express three things: time code address, user bits and sync bits. Figure 20-3 illustrates an 80-bit SMPTE Time Code word.

3.1.2 Time Code Address

Time Code Address is an eight-digit number - two digits each for hours, minutes, seconds and frames. The frame rate - how many frames there are per second - differs between countries' standards.

3.1.3 User Bits

User Bits are eight alphanumeric digits that carry information such as the date, reel number, etc. User bits do not control anything; they only relay information to make filing and organisation easier.

3.1.4 Sync Words

Sync Words indicate the direction of the time code during playback of the machine. If you go into rewind with time code playing into the head, it is the sync word that informs the synchroniser or tape machine which direction the time code is travelling. 3.1.5 Frame Rates

Four different types of SMPTE code or frame rates are in use throughout the world. Each is distinguished by its relationship to a film or video frame rate. In the US, black-and-white video runs at 30 fps (frames per second), colour video at approximately 29.97 fps, and film at 24 fps.


In Europe, both film and video run at 25 fps.
30-Frame - Time code with a 30-frame division is the "original time code", also known as Non-Drop (N/D). Its frame rate corresponds to United States black-and-white NTSC video. 30 Non-Drop is the same as a clock: its time code address represents real time, with 30 frames rather than 1000 ms marking off each second.
30 Drop-Frame - When colour TV was invented, it couldn't transmit colour at 30 frames and still be in phase with black-and-white signals. The colour frame rate had to be reduced to approximately 29.97 fps, to give the colour scan lines enough time to cross the screen and still produce a clear image. This fixed one problem, but created another: time code at that rate ran slower than real time, making an hour of 30-Frame time code last 1 hour and 3.6 seconds. A new time code, Drop-Frame (D/F), was created to deal with this dilemma. 30 D/F omits frame numbers 00 and 01 at the start of every minute except each tenth minute of the TC. For instance:
01:00:00:00 - 01:00:00:01 ~ 01:00:00:29 - 01:00:01:00 ~ 01:00:59:29 - 01:01:00:02 - 01:01:00:03 ~ 01:09:59:29 - 01:10:00:00 - 01:10:00:01
30 D/F has remained the US network broadcast standard ever since.
25-Frame - In Europe, the mains line frequency is 50 Hz. Time code based on that reference is most easily divided into 25 frames per second, which is the frame rate for European video and film (the PAL/SECAM standard, where PAL is Phase Alternation Line and SECAM is Sequential Colour with Memory). The 25-Frame time code, also called EBU time code, is used in any country where 50 Hz is the line reference. There is no 25 Drop-Frame; in Europe, both colour and black-and-white run at the same frame rate.
24-Frame - Since the film industry uses 24 fps as its standard, 24-Frame Time Code was introduced to correspond with film. This time code is sometimes used by film composers who have all of their music cue sheets marked in frames based on a rate of 24, or for editing film on tape.
Unless one is doing a video for broadcast, most people choose Non-Drop time code because it expresses real time. There is no harm in using Drop-Frame for any purpose, because most equipment will sync to either code. In practical terms, it really doesn't matter which code you use, as long as the usage is consistent. However, intermixing of different frame rates must be avoided.
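The 3.6-second figure and the drop-frame rule above are easy to verify (my own arithmetic, not from the notes); dropping two frame numbers per minute, except every tenth minute, removes almost exactly the amount by which 29.97 fps code falls behind the clock:

real_seconds_per_tc_hour = 30 * 3600 / 29.97    # time to play one hour of 30-frame code at 29.97 fps
print(real_seconds_per_tc_hour - 3600)          # ~3.6 s slow per hour

frames_dropped_per_hour = 2 * (60 - 6)          # 2 numbers skipped each minute, except minutes 00, 10, ..., 50
print(frames_dropped_per_hour / 29.97)          # ~3.6 s worth of frame numbers removed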


4. Longitudinal, Vertical Interval and Visible Time Code
Regardless of the frame rate, there are two basic versions of SMPTE Time Code. Their difference lies in the physical way they are recorded on video tape (Fig. 20-4).
4.1 Longitudinal Time Code (LTC)
Longitudinal Time Code is recorded on a standard audio tape track; LTC is the "original SMPTE Time Code standard". When recorded to video, LTC is placed on one of the linear audio tracks of the video tape, and older video tapes, if they contain time code, will be striped with LTC.

5. Time Code and Audio Tape
Recording SMPTE Time Code onto a tape is also known as Striping Tape. (This term comes from the motion picture industry, and refers to the stripe of oxide that is placed on the edge of the film after it has been developed; the stripe is used to record audio for the film.) Whenever possible, record the TC first, followed by the audio.
On large multitrack tape recorders the edge tracks tend to be unstable, and there is a danger of the edge track becoming damaged if that tape is used often for dubbing or mixing. Recording the same time code on tracks 23 and 24 will protect you from potential dropouts on track 24; a spare time code track can become invaluable.
TC should be recorded at a level not greater than -10 VU on professional machines; on semi-pro machines the level should be -3 VU. It is important to record time code on an edge track of a multitrack recorder and then leave a blank track (or guard track) between the code and the next recorded track. This will reduce the likelihood of time code bleeding into adjacent channels: a TC's square-wave frequencies lie in the mid-frequency band where our hearing is most sensitive. Allow a minimum of running time code before recording any audio on tape; a pre-roll and post-roll of 30 seconds (before and after the start of the audio) will provide sufficient time for the synchroniser to "lock" all the equipment up.

6. Copying SMPTE Time Code
If time code needs to be copied from one recorder to another, like making a safety of a multitrack master, copying TC always involves some form of signal conditioning. There are three methods of copying LTC:
1. Refreshing
2. Jam syncing
3. Reshaping
6.1 Refreshing or Regenerating Code
Since magnetic tape recordings are imperfect in reproducing square-wave signals, due to their bandwidth limitations, some form of signal conditioning is required if a time code track is to be copied from one tape to another. Refreshing or regenerating code is performed by sending the original TC into a time code generator. The generator locks onto the incoming time code and replicates it, creating a fresh time code signal. This "refreshed" TC is recorded onto tape; this is known as Refreshing. Some generators/regenerators can actually fix missing bits in the code.
6.2 Jam Synchronising
Jam Synchronising is the same as regenerating code, with an exception: once the Jam Sync has started, it will continue to generate TC even if the original TC has stopped "playing". This is especially useful when T/C is too short on a tape or there are dropouts. Some generators can jam sync backwards, extending the front of the tape also. In both refreshing and jam sync mode, one can either choose to copy the user bit information or generate new user bits (different dates, etc.).
6.3 Reshaping
Reshaping is a process that transfers the existing code through a signal refresher, but it cannot repair missing bits.

7. Time Code Generator
Most synchronisers have built-in TC generators, as do certain tape machines, including certain portable 2- or 4-track units used for field production and news gathering, such as Nagra tape recorders. A stand-alone TC generator must be used in conjunction with synchronisers or tape machines which do not have built-in TC generators.

Vertical Interval Time Code (VITC) is recorded within the video picture frame, during the vertical-blanking interval. It can be present in a video signal without being visible on screen. VITC is structured similarly to LTC, but includes several "housekeeping bits" which bring its word length to 90 bits. VITC can be read from a still frame (whereas LTC cannot), and it provides half-frame accuracy for edits. Where video and multitrack audio transports must be synchronised, both VITC and LTC may be used together. In audio-only productions only LTC is used; LTC is most common in audio because VITC cannot be recorded on audio tracks. The time code recorded on a video which appears on the monitor screen (e.g. burnt into the picture) is called a window dub or burnt time code.

8. Methods of Synchronisation
8.1 Chase Lock
Chase Lock is the most common form of synchronisation. A synchroniser monitors the slave's and the master's time code signals and will speed up ("chase") or slow the slave capstan until the TC read from the slave machine matches the master time code. In chase lock, the slave capstan control uses the closed-loop method. The capstan will be instructed to speed up or slow down until it reaches lock; at that point, the system is considered "locked".
8.2 Machine Control
Machine Control is another common method of synchronisation. In Machine Control, the synchroniser controls both the slave capstan speed and its transport controls. If the master code is three minutes ahead, for example, the slave transport will be instructed to advance the tape to that area and park. If the master code is already in play, the slave will automatically go to that general area and play.

9. Other Synchronization-related Concepts
9.1 Slave Code Error
Slave Code Error is simply the time difference between the master and slave codes. The synchroniser displays the TC in hours, minutes, seconds and frames. The error can be either positive or negative depending on whether the slave is ahead of or behind the master. This error can either be used to determine how far away the slave is from the master or stored for subsequent usage.
9.2 Slave Code Offset
Slave Code Offset is the slave code error added to or subtracted from the slave time code. This calculation offsets the slave time code so that it matches the master time code numerically; it basically makes the two T/C values the same. If the master time code reads 01:00:00:00 and the slave time code reads 04:00:00:00, the offset would be -03:00:00:00. If the master and slave time codes were reversed, the offset would be 03:00:00:00. Offsets are extremely useful when matching a pre-striped music track to a new master time code.
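A short sketch of the offset arithmetic described in 9.1 and 9.2 (illustrative only; 30 fps non-drop code is assumed):

FPS = 30

def to_frames(tc):
    hh, mm, ss, ff = (int(x) for x in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * FPS + ff

def to_tc(frames):
    sign = "-" if frames < 0 else ""
    f = abs(frames)
    return "%s%02d:%02d:%02d:%02d" % (sign, f // (3600 * FPS), f // (60 * FPS) % 60, f // FPS % 60, f % FPS)

def slave_code_offset(master_tc, slave_tc):
    """Offset that, added to the slave code, makes it match the master numerically."""
    return to_tc(to_frames(master_tc) - to_frames(slave_tc))

print(slave_code_offset("01:00:00:00", "04:00:00:00"))   # -03:00:00:00
print(slave_code_offset("04:00:00:00", "01:00:00:00"))   #  03:00:00:00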

9.3 Event Triggers
Event Triggers are time code values placed in event trigger registers. They are used for triggering equipment that either cannot or does not need to be synchronised. The triggers can be used to start and stop tape machines or to trigger sequences.
9.4 Flying Offsets
Flying offsets are accomplished using a synchroniser capture key, which captures the offsets while master and/or slave are in motion. This is useful for defining a specific beat in the music relative to the master time code. For example, if the master is a video deck and you want a particular beat in the music to occur at a specific point in the video, you would park the master at that point, run the music slave (unlocked), and hit the capture key at the desired beat.
9.5 Slew
Slewing is a way of manually advancing or retarding the slave capstan by increments of a frame and subframe. This is also referred to as offset trim. The transports must be locked before slewing can occur. If there is already an offset prior to slewing, the slew value will simply be added to or subtracted from the prior offset. Once you have determined the proper slew value, it can be entered as a permanent offset.
9.6 Advanced Transport Controls
Similar to transport autolocators, some synchronisers autolocate master and slave transports, with one major difference: the location points are related to video or film. These time code numbers can sometimes be stored in event trigger registers. In addition to autolocation, some synchronisers offer such features as zone limiting and auto punch. Zone limiting sets a predetermined time code number that, when reached by the slave, either stops the transport or takes it out of record. Auto punch brings the slave in and out of record at specified time code locations.

10. SMPTE-to-MIDI Conversion
Used in conjunction with SMPTE Time Code, MIDI provides a powerful extension to the automation capabilities of a synchronised system. The device that makes this possible is the SMPTE-to-MIDI Converter, a unit which reads SMPTE Time Code and translates it to MIDI clock with Song Position Pointer data. The converter serves as the interface between the SMPTE-synchronised system and the MIDI-synchronised system. Stand-alone converters which interface directly with computer sequencers are available from several manufacturers. In addition, some sophisticated synthesisers with built-in sequencers are capable of reading time code and performing the conversion internally. In the studio, SMPTE-to-MIDI conversion is a tremendous aid in combining pre-sequenced MIDI instrumental parts with multitrack overdubbing of acoustic instruments and vocals.
In live sound reinforcement, automated audio and MIDI data stored in MIDI sequences can be played back synchronously with multitrack recorders, including automated MIDI-controlled lighting. This technique is now commonly used. It provides hi-fi audio and video presentation, all at a lower cost.
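As a rough sketch of what a SMPTE-to-MIDI converter has to compute (my own illustration; it assumes a constant tempo and straight 30 fps code, whereas real converters also follow tempo maps), remembering that Song Position Pointer counts MIDI beats, i.e. sixteenth notes:

def smpte_to_song_position(hh, mm, ss, ff, fps=30, bpm=120.0):
    """Translate a SMPTE address into a MIDI Song Position Pointer value.
    SPP counts MIDI beats (sixteenth notes) from the start of the song."""
    seconds = hh * 3600 + mm * 60 + ss + ff / fps
    sixteenths = seconds * (bpm / 60.0) * 4        # 4 sixteenths per quarter-note beat
    return int(sixteenths)

# At 120 bpm, ten seconds into the song is 80 sixteenth notes:
print(smpte_to_song_position(0, 0, 10, 0))         # 80
# A real converter would then emit running MIDI clocks (24 per quarter note) from that point.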


AE19 – Professional Recording Studios
The Mackie Digital 8-Bus Console
1. Recording on the Mackie
2. Mixing down on the Mackie

AE19 – PROFESSIONAL RECORDING STUDIOS
The Mackie Digital 8-Bus Console
This section gives an introduction to professional studio environments. The 24-track studio built around the Mackie Digital 8-Bus console will be used.
1. Recording on the Mackie
2. Mixing down on the Mackie
Assignment 8 – AE008

AE20 – Audio Postproduction for Video
1. Post Production Techniques
   1.1 Consistency
   1.2 Depth – Dimension
   1.3 Worldize
   1.4 Walla/Babble Track
   1.5 Room Tone, Ambience or Presence
   1.6 Wild Sound
   1.7 Multisense Effects
   1.8 Panning (dialogue for TV/film)
   1.9 Cross Fades
   1.10 Split Edit
   1.11 Cut Effects
Psychoacoustics
1. Frequency Masking
   1.1 Temporary Masking
Automated Dialogue Replacement
Sound Editing
1. Supervising Sound Editor / Sound Designer
   1.1 Music Editing Specialist
   1.2 Foley Sound – Effects Editing
2. Editing
   2.1 Six Generations in a Typical Film Work
   2.2 Four Generations in a Typical TV Sitcom
3. Documentary Production
   3.1 Off-Line Editor
   3.2 Edit Decision List
   3.3 Edit Master
4. Print Masters to Exhibition
   4.5 Masters for Video Release
   4.6 Print Masters for Analogue Sound Track


It is "easier" that way for the sound editor to control the ambience level and the sound of the footsteps as two separate elements.revealing an empty hall. Should an extreme long shot or establishing shot of the actor be taken while he is walking in this large empty hall. ambience and dialogue replacement is done and mixed correctly. 336 . It is difficult to record the footsteps echoing in a large hall on DAT and then make it "match" with the rest of the sounds on the DAW. The reason for this is humans are able to psychoacoustically suppress the reverberation while focusing on the sound of footsteps. On the other hand the sound editor may use Effects Library CDs e. Subsequently during post production he will make sure the sound of the footsteps are in sync with the character’s walking. There might be under this list a number of ways of walking on different types of wooden floor. He would have to equalized them to make them fit with the rest of the other audio. Hence the sound editor will capture the footsteps in a foley stage where the foley artiste will walk in the same kind of shoes and on similar surface. This is to make the sound realistic. The reverb of that hall environment with the sound of footsteps will be difficult to control by equalization since the sound of footsteps will sound loose (lacking crispiness and details) in comparison to what we hear in real life because it is recorded in a distance. the sound effects into a sound effect stem and the dialogue into a dialogue stem. the reverb level would be lower now (its high frequency rolled off or it would interfere with his speech). sound effects. the sound editor will add reverb to it after he has spotted the effects to the picture. He will capture the sound dry without reverb. Even though the camera shot remains the same . Hollywood Edge for the sound effects. When there is a "cut to" edit on the video. A character is walking in an empty hall. To make mixing easier for the sound editor in case there is a need for changes all the music will be mixed into a music stem. Post Production Techniques Depending on where the characters are on screen.g. the video will sound "empty. E.” Hence there is a need to turn up the footsteps' level to fill up the “void” in the audio. Once that is correct he will add reverb from digital effects unit to it. When the actor finished talking on the mobile. This must be done to simulate that the audiences are “psychoacoustically” performing "the cocktail party effect" in suppressing the noise in the characters surrounding while they concentrate on listening to his phone conversation. the sound of his footsteps is turned up as though the audiences are doing it "psychoacoustically" again because they have already finished listening to his phone conversation. Since most of these sound effects are dry.g. Here he will choose from many different sound effects on e. His footsteps will echo through the audio track and a large reverb is used.g. As oppose to a recording both the footsteps sound and the reverberation of a large hall. footsteps with leather-sole shoes on wooden floor. When the music.g. The final output of these stems is a Print Master. one element. he walks away off-screen from the camera. the sound editor will attempt to use artificially created reverbs to match their location be it outdoors or indoors. Doing this gives the video perspective also just like what is encountered in real life. Everything will sound natural as thought it happened during the production stage itself. 
The sound editor will choose the most suitable sound effect for the job. The sound editor will add sound effects into the DAW in the same place (timeline of the DAW) as the video where the objects are making a sound or noise. Reverbs may be increased or decreased depending on how the camera captures the character in e.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes 1. On the other hand if the footsteps' sounds are not turned up. a large hall. showing him at a close up shot on screen in the same hall talking on his mobile phone.

played by the extras. A record played on a turntable produces too much direct signal.1 Consistency Consistency during production sound is more important than having great and not-sogreat sounding dailies. You do not want an audience of about one hundred people applauding while the screen reveals about thirty people. DVD for exhibition or release. This is vital should the “room tone” varies between takes within the same scene since it was filmed over a few hours or even days. Ambience Or Presence Record “room tone”. Record the crowd talking separately in production or postproduction stage. which in turn gets picked up by mikes. During the postproduction stage recording ensure that the activity. mimic talking in the background.3 Worldize At times to worldize an archaic recording is to send them through loudspeakers in a reverberant environment.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes Thereafter it can be “laid back” or record the print master onto a film or video. 1. It gives an intimate “in-your-face” style different from the normal reverb-added overdubs. 1. gender and people is about the same. 1. This can be added during post production (frequency masking) to ensure good continuity between cuts in the same scene. Apocalypse Now On the other hand. Inconsistency in production sound is distracting. 1. This is important especially within a scene than from scene to scene. E. Play the same music from a gymnasium hall’s public addressing system and recording it will produce a “period sounding” recording that suits a “period” production. 1.5 Room Tone.in Rick’s Café near the opening scene after Sam had sung “Knock on Wood”.4 Walla/Babble Track In a crowded room scene. which may not give the intended effect even after some processing. This recording is also known as “walla” or “babble” track.g. This will prevent spills of the crowd’s talking from getting into the production recording. This happened in the classic movie Casablanca . a voice-over narration that is made much more reverberant (while an actor is not speaking) than the on-screen action is one method of suggesting to the audience that we are listening to his thoughts.6 Wild sound 337 . atmospheric or presence. Make use of different miking techniques/position to achieve the desired result without the use of EQ (processing) if possible unless there is no other way around it. record the actor’s voice first while the crowd. 1. ambience noise.2 Depth – Dimension A talent speaking into a bright condenser mike with no reverb added will give a “voice inside a head” effect.

However this is rarely done for the voice unlike effects. it does not flow – unnatural. They also compliment one another. When there is a cut to edit in the film and the same character appears center of the screen with the voice now suddenly panned to the center of the stereo panorama. A tittle crawl at the beginning of Star Wars seems to take a longer time scrolling up screen while watched in silent.” Psychoacoustics 1. A movie without audio will appear longer than what it seems to be. The reason being if a character is speaking on the left side of the screen and his voice is panned to the left also. this will distinguished it from the onscreen dialogue.9 Cross Fades Cross fade room tone (presence. Most character’s voices are panned center (mono) no matter where their position on screen might be. However “butt splices” can be used if desired. On the other hand.10 Split Edit When the sound of an object or object is heard first before it/they appear/s on screen is known as split edit. The performance/acting is fine. During editing the change in background noise level is obvious.g. The audience will find it jarring. E. A phone is heard ringing before it is seen on screen. The only exception here is when an off-screen character speaks a line.8 Panning (dialogue for TV/film) With stereo we can pan effects or voice in the stereo panorama to fit the positions of the characters on the screen. 1. Frequency Masking A scene was shot over the duration several days or hours. 1. which will be panned either left or right. This will ease the audience into the next scene as it sounds more natural then a “butt splice”.11 Cut Effects Everything that is seen on screen that makes a sound should be heard. 1. lacking continuity. It is used to show transitions in time. 1. In other word. One solution is to use the production recording (sounds captured during the shooting itself) with the louder 338 . atmosphere) from one scene to another scene. “See a car moving. It is used later in post production to match scene.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes Wild sound is recorded without the camera rolling or without synchronization usually after a shoot. 1.7 Multisense Effects A movie’s sound track is designed to interact with the movie itself. Hear a car moving. with music it seems to march right along. Thus is covered by a cut sound effect.

Then he would match their waveforms to the original dialogue track's waveform and timeline also so that it would appear to be in sync with the "mouthing" of those words by the actor on screen. The newly recorded spoken lines match with their “speaking” or “mouthing” of their lines on the video. the sound editor may attempt to salvage it by "selecting the words individually" via the waveform on editing window and then splitting it up on the digital audio software. 1. The sound designer has over a range responsibility from the overall conception of the sound track. If that proves unsuccessful.1 Temporary Masking A loud sound can mask a soft sound that occurs just after it. the sound editor would have to record the actor's voice again and then perform another lip-syncing. For instance. Humans perceived louder sounds (some 10ms) faster than we do softer one. Pre-masking or backward masking occurs when a louder sound masks a softer one. 2. Supervising Sound Editor / Sound Designer This term Sound Design encompasses the traditional processes of both editing and mixing.dialogue track is used whether it is of good quality or not. Sound Editing 1. like on the video. After the dialogue has been recorded it is synchronized to the video. Sound editors use it to mask imperfection in an edit and also to maintain rhythm. He hears his dialogue (production sound) played from the DAW through headphones. in the studio now where it will be recorded. composing music and its role in film (giving it the mood or sonic style) to making specialized complex sound effects. This will produce smooth – sounding edits and good continuity. The actors or voice talents will now be watching the video on a monitor in a studio. The actors or voice talents repeat their lines once again.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes noise (wild track) for the whole entire scene. Star Wars – Luke Skywalker’s land speeder’s sound is achieved by sticking a mike in the tube of a turned on vacuum cleaner 339 . These factors made it unsuitable for final release unless the production is a low-budget production where the production sound . The purpose for ADR is to have the dialogue track replaced because the production recording of it is inconsistent and has noise in it. Automated Dialogue Replacement Automated Dialogue Replacement (ADR) or Looping. • • Dinosaur sound in Jurassic Park is a mixture of penguin and baby elephant trumpeting. The sound editor will check to ensure that the lipsyncing is perfect. This is known as postmasking. Should the lip-syncing be imperfect. a loud cymbal crash is edited “on the beat” to cover up for any imperfection momentary discontinuity at the actual edit point using pre-masking.

The sound editor may compose a piece of music or choose music from a CD library to enhance the mood of the movie. On the other hand. via contact pickup. Fast tempo together with quick edits of a movie showing someone running away from danger will usually set the audience hearts racing.4 sec). Foley sound is recorded dry and are processed subsequently to suit the production. director and supervising sound editor to “spot” the music to picture. 1. Spotting is performed here at this stage. units are mixed together into premixes and premixes together form a final mix. Foley artistes will perform the action more or less synchronously with the picture they see. Sound design is the art of getting the “right sound” in the right place (screen) at the right time (synchronized). Bird’s wing flapping sound is created using the sound produced by opening and closing an umbrella rapidly. 1. Editing A block diagram (Fig 9. After a composition has been written according to the picture lock. Foley sound effect adds realism to a film by enhancing if not exaggerating the sounds in everyday life.0. 2. Another solution is to use foley sound. a music specialist might be hired to do the same job. of tautly strung cable after it has been struck • • The thud of a fist into a body is achieved by striking an axe into a watermelon. The first step in the process of music composition is the gathering of composer. Or a slower tempo with lots of low dark notes to give the impression of impeding danger. Each row represents a generation. The tempo and style of the music will often give set the pace and mood for a certain part or scene in a movie. At the same time it heightens suspense.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes Storm Trooper’s laser gun’s shots sound is obtained by recording the vibration.1) of the overall mixing process for a feature film. After the Foley sound has been recorded.2 Foley Sound – Effects Editing Should editing effects from an effect library prove tedious.1 Music Editing Specialist Music scored for film is composed to fit the time given by the film. 340 . Foley sound effects (named after Jack Foley) are made in a recording studio called a Foley stage (RT60 . editing of the music might be necessary once more because the picture might have gone through more editing. Where he will spot the music to the movie via time code. if budget is not a constraint. Next the final mix stem are mixed together to produce the print master which is the output of the sound postproduction process. The important question or art is when to bring in the music to enhance the mood or even heightened tension or climax. Most movies have about 50% music in it. This takes place after the picture is “locked” (picture lock). Spotting refers to going through the picture and noting where music should be present and what kind it should be. If the production copy of the video is a rough cut (not ready for final release yet) editing might be necessary until the fine cut (final release) is ready. they are edited on a DAW for hard syncing.

Another advantage is during the theatrical film mixing.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes The advantage of keeping the various mix stem components parts on tracks separate from each other is the convenience from which a variety of different types of output mixes can be obtained called print masters. should the director wants changes. Lay back occurs when the master mix is dubbed to the Edited Master videotape.1 2.1 Six generations in a typical film work • • • • • • Original source recording Cut units. Fig 9. Fig 9. composed of mix stems Print masters Masters TV production has a tighter schedule then film production. A foreign language can be dubbed into a Music & Effects mix (M&E) subsequently. new premixes can be done quickly. also called elements Premixes Final mix. which is measured in working hours instead of weeks for film. Print masters can be used for different purposes.1 Once all the tracks mixed together.2 Four Generations In A Typical TV Sitcom • • • • Dialogue Music Effects Audience reaction (laughter & applause) 341 . they form the Master Mix. 2.

all under computer control. digital multitrack and optical disk. which is often delivered from the off-line system to the on-line one on a floppy disk. Print Masters to Exhibition Print masters are the final output of the recording process – the sound tracks.2 Edit Decision List After the edit is approved an on-line session is scheduled. The on-line system essentially duplicates the actions of the EDL. Another factor in making print masters for both film and TV shows involves foreign-language distribution. The dubbing site will add the new dialogue to M&E and produce its own print master. insert recording at each cut and assembling an Edit Master video tape. 3. The output of an Off-Line Edit is an Edit Decision List (EDL). 1: Print Master Types and their Characteristics 342 . it is common to supply M&E master without dialogue.SCHOOL OF AUDIO ENGINEERING AE20 – Audio Postproduction For Video Student Notes 3. Each release medium has two primary factors that together determine the parameters of the associate print master. Documentary Production 3. Print masters maybe recorded on a mag film. See Table 11.1 Off-Line Editor Videotape copies of the original tapes is to send to an off-line editor (edit bay or offline room). 3. where there is a video edit. The charge per hour is higher also. 4. Each print master must be tailored to the capability of the specific target medium. For dubbing purposes. the audio will be edited as well. shuttling machines back and forth. The dialogue track is supplied as a separate track for translation and synchronization purposes of the foreign-language dubbing site. The typical edit here is called audio-for-video.3 Edit Master The on-line room has higher quality automated editing system equipment. the number of available audio channels and the dynamic range capacity of those channels.

The figure on the right represents a sound track negative carrying four types of tracks: (a) conventional stereo variable area in the standard analogue sound track area next to the picture; (b) DTS time code, located between the picture and the conventional sound track area, for synchronisation of an external disc; (c) Dolby Digital, between the perforations on the sound track side of the film; and (d) Sony Dynamic Digital Sound, located outside the perforations on both sides of the film. Frame lines of a CinemaScope picture are shown for reference, although they are not printed on the sound track negative. A positive release print is the inverse of the negative.

5. Masters for Video Release

Once the theatrical print master types have been made, sound masters are typically prepared for the video market. When a master on mag film is supplied, it may be in one of several formats:

• Mono, used most often for historical, documentary and a few feature films, often supplied as a DM&E (Dialogue, Music and Effects) recorded as a three-track
• Lt Rt, for the two-channel PCM and analogue tracks of laser disc and hi-fi and longitudinal tracks of consumer tape formats, recorded as two tracks on a three-track format
• 5.1 channel, for the multi-channel laser disc and DVD markets, usually identical to the theatrical 5.1 channel master, recorded as six tracks

6. Print Masters for Analogue Sound Track

While primary theatres worldwide are equipped to reproduce digital-format sound tracks, analogue optical sound tracks are always recorded on all theatrical release prints, because all the world's 35 mm projectors can play them. The primary factors here are limitations in the number of channels and in dynamic range. Due to the size limitations of the sound track area on film, which usually accommodates two analogue tracks, a method of encoding four channels' worth of information into two tracks was developed – 4:2:4 matrix technology (trade names include Dolby Stereo). There is some compromise involved, however, and this matrix encoding is one process that occurs in preparing stereo analogue sound track masters. The name given to print masters in this format is Lt Rt.

The "t" refers to "total", meaning a left track and a right track each carrying encoded information which may be decoded into left, centre, right and surround channels.

Assignment 9 – AE008
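To make the 4:2:4 matrix idea concrete, the sketch below shows a heavily simplified Lt/Rt encode of one set of channel samples in Python. It is an illustration only: a real Dolby Stereo-type encoder also applies a 90-degree phase shift and band-limiting to the surround channel and uses noise reduction, none of which is modelled here, and the sample values are made up.

# Heavily simplified 4:2:4 matrix encode (illustration only).
import math

K = 1 / math.sqrt(2)   # about 0.707, the usual centre/surround attenuation

def encode_lt_rt(left, centre, right, surround):
    """Fold four channel samples into the two Lt/Rt tracks."""
    lt = left + K * centre + K * surround
    rt = right + K * centre - K * surround
    return lt, rt

print(encode_lt_rt(0.2, 0.5, -0.1, 0.3))

A cinema processor or home decoder later derives the four playback channels back out of these two tracks.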

AE21 – Sound For TV and Film

1. Production Sound and Miking Techniques
   1.1 Wireless Microphone & Technique
2. Sound Recordist / Mixer
   2.1 Double-System Recording
   2.2 The Third Man
3. Some Characteristics of Sound
   3.1 Absorption of Sound
   3.2 Doppler Shift
4. Recording Different Sound Sources
   4.1 Recording sound of Nature
   4.2 Recording sound of Sporting events
Getting a Shot Started
Basic Shots of People
   General terms
   Basic Shots of People
PROTOCOLS
1. Video Protocol
2. Audio Instructions
3. Shooting a Scene From a Dramatic Production

AE21 – SOUND FOR TV AND FILM

1. Production Sound and Miking Techniques

A boom mike is used for recording dialogue because a shotgun mike (Sennheiser MKH 50, 60, 70, 416 or 816; Neumann KMR 81 or KMR 82; AKG CK 8X) placed overhead captures the character's voice clearly and naturally. A wind screen is used in conjunction with a shotgun mike; this is essential in reducing wind noise when shooting out of doors or when the mike is panned left and right between two or more speaking characters during a shoot. If the mike is placed below the characters instead, the sound tends to be duller due to chest resonance and the lack of high frequencies. The boom must also stay out of shot: when a camera shot of two characters is zoomed out, the mike should be moved away from the characters, while maintaining the same angle, so that it does not appear in the frame. As the mike is moved away, it captures a higher indirect-to-direct ratio of their dialogue, which also helps to add perspective to the production sound.

Usually a scene is shot from several angles in order to offer the postproduction editor a range of choices in developing the scene. In a conventional film-style production, multiple setups are used to achieve this: a master shot, a two-shot and close-ups. All of these are shot separately by having the actors repeat the scene (their lines and actions) over and over, with the camera angle and lighting adjusted each time; the setups may be shot over a few hours.

For instance, a wide shot will establish where two traffic cops are – next to the highway as they sit on their motorbikes. After a master shot of the two cops is taken, this will be followed by a close-up shot of each officer while they talk. There is a temptation for the sound recordist to match the camera perspective for each new shot. This will no doubt result in good-sounding dailies; however, the differences in sound will cause continuity problems during postproduction editing. If the shotgun mike is pointed towards the policeman facing away from the highway, and the same mike is next positioned facing the other policeman (facing towards the highway), the noise from the highway will be greater for the second police officer because the traffic is right behind him. To minimize this problem, use the boom mike placed overhead at an angle away from the highway (maximum off-axis rejection) and move the mike over to each actor when it is their turn to speak. Record the noise of the highway separately (a wild track); this "road tone" can be added during postproduction to ensure continuity.

In an extreme wide shot a boom mike cannot be used, as it would appear in the frame. The next best option is to plant boundary mikes at strategic locations (tables, props) to capture the characters speaking; due to their omnidirectional polar response, however, they will also capture noise on the set.

1.1 Wireless Microphone & Technique

A wireless lavalier (a true-diversity wireless lavalier such as the Sennheiser EW 512) can be used by planting and hiding it in the character's hair or concealing it in their clothes. If it must be planted on the character's clothing, it should be placed near the neck, where it is closer to the mouth. Avoid clipping it near the chest, where it will pick up chest resonance and make the sound bassy.

2. Sound Recordist / Mixer

He is the person in charge of recording the production sound and wild tracks for any film, TV or video production. The sound recordist will use a portable mixer, e.g. a Shure FP32A, that provides phantom power for the shotgun microphone; the mixer's output is connected to a portable DAT recorder. In some cases a portable mixer is not necessary, because the portable DAT recorder (e.g. a Fostex DP4) has phantom power for the shotgun mike and a built-in mic pre-amp too. The sound engineer monitors the recorded signal on the DAT recorder's headphone output. He also notes down the takes that need to be printed for synchronization with dailies subsequently, and is responsible for logging, storage of recordings and printing of the takes after a shoot. He might also double up as the boom operator and on-location recordist when necessary.

2.1 Double-System Recording

Film is used for recording the images, while a DAT recorder and/or DAW is used to record the sound optimally. Each device is thus optimized for its own medium, and together they are called double-system recording.

2.2 The Third Man

He is the sound engineer's assistant. He is responsible for setting up the equipment before a shoot and packing it away afterwards. He will be in charge of creating ambient noise at the appropriate time and of eliminating clothes-rustling noise during any actor's movements.

3. Some Characteristics of Sound

3.1 Absorption of Sound

Air absorbs high frequencies (short wavelengths) more readily than low frequencies, which have longer wavelengths and are physically stronger. Gunfire heard from a distance in the open sounds dull because the air absorbs the high frequencies; the same occurs for brass band music heard from a distance. Enclosed spaces lined with a lot of fibrous materials (carpet, curtains, sofas) also tend to be more absorptive of high frequencies. In addition, high frequencies generally do not diffract around obstacles the way low frequencies do, so an acoustic shadow forms behind large objects, e.g. a wall; consequently, noise or music heard from behind a wall sounds dull, due to the absence of high frequencies but not low ones.

This behaviour is exploited to create perspective. Close-up shots of a subject or an object on the screen should carry its full frequency spectrum, brighter and more detailed, to make it sound realistic. The same subject or object shot from a distance should "lose" some of its mid and high frequencies, according to how far away the object is from the character on the screen, and in some cases gain more reverberation, to make it sound natural. Doing this adds perspective to a film.
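As a rough illustration of the "duller with distance" idea described above, the sketch below runs a short list of samples through a one-pole low-pass filter in Python. The filter coefficient and the sample values are arbitrary example numbers, not a calibrated model of air absorption.

# Rough illustration: dulling a sound the way distance and obstacles do,
# using a one-pole low-pass filter (coefficient is an arbitrary example value).
def low_pass(samples, coeff=0.2):
    """Smaller coeff (0 < coeff <= 1) removes more high-frequency content."""
    out = []
    previous = 0.0
    for sample in samples:
        previous = previous + coeff * (sample - previous)
        out.append(previous)
    return out

clicky = [0.0, 1.0, -1.0, 1.0, -1.0, 0.0]   # made-up "bright" samples
print(low_pass(clicky))                      # the same shape, smoothed and duller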

3.2 Doppler Shift

Doppler shift occurs when a constant note (or noise) is perceived to rise and then fall in pitch and amplitude as its wavelength is compressed and then stretched. For instance, a train blowing its whistle is heard at a higher pitch, growing louder, as it comes towards a listener, and at a lower pitch, growing softer, as it passes and moves away. (A short numerical sketch of this relationship appears at the end of this subsection.)

A sound recordist may need to record the sounds of a car, for instance:

• Engine idling
• Acceleration from a stop
• Braking to a stop
• A pass-by of the car
• A steady, with the mike fixed onto the outside of the car
• An interior steady

All these recordings can be looped and Doppler-shifted if necessary during the postproduction stage, to make them sound realistic when synced to picture.

Arrows flying through the air are treated similarly. An arrow's feathers are ruffled so that the noise it makes can be captured while it is in flight. After it has flown past a few microphones, the recording is looped in a digital sampler to extend the length of the "flight"; it is then pitch-shifted up quickly to simulate the beginning of the flight and pitch-shifted down at the end of it. All this is done, and panned, simultaneously as the arrow zips past the screen from left to right.
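For a sense of the size of the effect, the sketch below applies the standard Doppler relationship for a moving source and a stationary listener in Python; the whistle frequency and train speed are made-up example values.

# Minimal sketch: perceived pitch of a train whistle approaching and receding.
SPEED_OF_SOUND = 340.0  # m/s, approximate value in air

def doppler_pitch(source_hz, source_speed, approaching):
    """Perceived frequency for a moving source and a stationary listener."""
    if approaching:
        return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - source_speed)
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND + source_speed)

whistle = 500.0   # Hz, example whistle frequency
train = 30.0      # m/s, about 108 km/h
print(doppler_pitch(whistle, train, approaching=True))   # ~548 Hz, heard higher
print(doppler_pitch(whistle, train, approaching=False))  # ~459 Hz, heard lower

A pitch-shifter in a sampler or DAW fakes the same sweep when a steady recording is synced to a pass-by on screen.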

4. Recording Different Sound Sources

4.1 Recording sound of Nature

A parabolic reflector is used to focus faint, distant sound into a super-cardioid condenser mike for recording sports and wildlife sound for documentaries. The microphone sits in the parabolic reflector. The bowl may be made of glass fibre or metal (coated on the outside with a material to stop it from "ringing"); if it is made of transparent plastic, the operator can see through it to follow the action. In another design the reflector is segmented and can be folded like a fan for travel and storage.

4.2 Recording sound of Sporting events

For many sporting events, e.g. at sports meetings, several shotgun mikes are strategically placed at the venue in order to capture the sounds of the games, including the spectators' response.

Getting a Shot Started

In narrative filmmaking, with a director in charge, the following sequence is common:

1. The assistant director or director says, "Roll sound."
2. The sound recordist turns on the recorder. If it is a Nagra or HHB DAT recorder, he observes that the speed and power indicator is on, meaning that the internal self-check for speed accuracy has been passed. After observing it to be on speed, the recordist says, "Speed" (or, alternatively, "Rolling").
3. The assistant director or director says, "Roll camera."
4. The camera operator starts the camera.
5. The assistant director or director says, "Slate" or "Marker" or "Sticks."
6. The clapperboard slate operator says the scene and take number, such as "Twenty-seven B take one," then hits the clapperboard closed in view of the camera and gets out of the way.
7. The director says, "Action" to cue the beginning of the scene.

At the end of the take, the director says, "Cut," and both camera and sound stop. The next decision taken is whether or not to print that take. Should there be problems with it, a re-shoot of the scene might be necessary. If the decision is to print the take, the director says, "Print it," and the camera log and sound log have that take number circled, which is universally recognised as marking a take that is to be printed and shown at dailies.

Dailies – Dailies are processed films received from the laboratory that are checked for framing, focusing or exposure errors, and that have sound synchronised with them.

Logging the sound tapes is the responsibility of the production sound mixer. Usually the log will give:

1. Production name/number
2. Shooting date
3. Reel number
4. Producer/studio/director
5. List of scene/take information recorded
6. List of any wild lines recorded
7. List of any sound effects or other wild sound recorded
8. List of any presence recorded
9. Takes that are meant to be copied from the production source tape to be heard at dailies, with matching picture, having their take numbers circled and the designation "Print circled takes"

At the end of a shooting day, the production sound recordist consults with the script person and a camera crew member concerning the three logs from the set (sound, script and camera), rationalising the list of scenes and takes shot that day to be certain that everything needed for dailies will be printed.

Basic Shots of People

Effective pictures of people tend to follow a series of regular, easily recognized arrangements. Experience has shown that these compositions provide the most artistically pleasing effects; frame people in any other way and the shot usually looks awkward and unbalanced. Several terms have evolved for each arrangement, but the shots themselves are universal. They provide convenient quick references that enable a director to indicate the shot he wants in just a few words. Do not think of such 'standard shots' as simply routines, though.

1.1 General terms

The general direction of a shot can be indicated by a broad description such as frontal shot, side view or three-quarters frontal; additional indications could be low shot, high shot or level shot, to show height. Sometimes a general indication such as an 'over-shoulder shot' or 'point-of-view shot' (POV) is sufficient. For some purposes it is enough simply to indicate how many people are to be in shot, using the general guide 'single shot', 'two shot', 'three shot' or even 'group shot'.

CLASSIFICATION

To indicate exactly how much of a person is to appear in shot, the shot classifications opposite are used. Where a director talks about an 'MCU' instead of a 'chest shot' or 'bust shot', follow that terminology; after all, the purpose of these classifications is to convey information, and in most organizations they have become standardized by custom. To prevent confusion, it is best to use the terms found locally.

Remember these arrangements by their framing, such as 'cutting just below the waist', 'just below the knees', etc. In no time at all you will find yourself automatically thinking in these terms, free to concentrate on other aspects of the action. This also enables the on-location recordist to become script/camera-shot literate, and it ensures that the boom operator or on-location recordist will know when to move or reposition mics during changes in camera angle so that the mic will not appear in the shot. Another point to remember is to avoid casting the mic's shadow on the set while shooting, because it will interfere with the audience's suspension of disbelief.

Basic Shots of People

Shots are identified by how much of the subject they include:

ECU  Extreme close-up (detail shot) - isolated detail.
VCU  Very close-up (face shot) - from mid-forehead to above chin.
BCU  Big close-up (tight CU, full head) - full head height nearly fills the screen.
CU   Close-up - just above head to upper chest (cuts below necktie knot).
MCU  Medium close-up (bust shot, chest shot) - cuts body at lower chest (breast pocket, armpit).
MS   Medium shot (mid-shot, close medium shot, CMS, waist shot) - cuts body just below waist.

Knee, ¾ shot (knee shot, three-quarter-length shot) - cuts just below the knees.
MLS  Medium long shot (full-length shot, FLS) - entire body plus a short distance above/below.
LS   Long shot - person occupies three-quarters to one-third of screen height.
ELS  Extra long shot (extreme LS, XLS).

PROTOCOLS

1. Video Protocol

Some people argue that, unlike writing, video and film don't have any standardized grammar (i.e. conventions or structure). This is debatable; nevertheless, the TV audience has become somewhat adept at picking up the meaning of various audio and video transition devices. Although paragraphs, subheadings and chapter divisions aren't as apparent in television as they are on the printed page, shots can, in grammatical terms, be likened to sentences: each shot is a visual statement.

Traditionally, teleplays (television plays) and screenplays (film scripts) start with a fade-in and close with a fade-out. Fade-ins and fade-outs, which apply to both audio and video, can be likened to the beginning and ending of book chapters. They consist of a two- or three-second transition from a full signal to black and silence. These transitions normally signal a major division in a production and are commonly used to signal a change in time or place.

Lap dissolve: fading one video source out while simultaneously fading up on (going to) another source. Midway through a lap dissolve, both signals are present in equal proportions. The slow lap dissolve in video and the cross-fade in audio (where two sources momentarily overlap) often signal a transition; traditionally this also suggests a passing of time. The "meanwhile, back at the ranch" phrase used in early westerns was often punctuated by a lap dissolve from a scene in town to a scene at the ranch. Just in case the transition was somehow missed, there might also be a cross-fade in the music at the same time.

Cuts or takes are instant transitions from one video source to another. Generally, remember that cutting from a static scene to a scene with motion accelerates tension and viewer interest; conversely, cutting from a scene with fast-paced movement to a static scene can bring about a sudden collapse of tension.

Take: a single shot. In single-camera production a specific shot often requires several takes before it meets the approval of the director.

Cover shot: an establishing wide-angle or long shot of a set, used both to establish the relationships between subject matter in a scene and to momentarily cover problems with mismatched action.

In describing shots in scriptwriting, a number of other phrases and abbreviations are commonly used. First, there are those that describe camera shots.

Cover shot, master shot and establishing shot are all designations for a wide shot (WS) or long shot (LS) that gives the audience a basic orientation to the geography of a scene: a wide, all-inclusive shot that establishes the major elements and the relationships between them. It is generally used at the opening of a scene. Often, in single-camera, film-style production, action and dialogue are taped from the master shot perspective before the closer insert shots are done. Cover shots should be used only long enough to orient viewers to the relationship between scene elements; in the relatively low-resolution medium of NTSC television they are visually weak, simply because important details aren't easy to see. Thereafter they can be momentarily used as reminders or updates on scene changes.

In the video column of video scripts the shorthand designation LS is normally used. Occasionally one will see the abbreviations XLS, for extreme long shot, or VLS, for very long shot. Other shot designations used in scripts include the following:

MLS: medium long shot, or FS (full shot). When applied to talent, this is a shot from the top of their heads to their feet.

MS: medium shot, normally a shot from the waist up.

MCU: medium close-up, a shot cropped between the shoulders and the belt line.

CU: close-up; on a person, a head-and-shoulders shot. A relatively straight-on CU is the most desirable for interviews, since changing facial expressions, which are important to understanding a conversation, can easily be seen. CUs are also commonly used for insert shots of objects when important details need to be shown.

XCU: extreme close-up. On people this is generally reserved for dramatic impact; the XCU may show just the eyes or mouth of an individual. With objects an XCU is often necessary to reveal important detail.

A two-shot or three-shot (also 2-S and 3-S) designates a shot of two or three people in one scene.

Subjective shot or point of view (POV): the script indicates that the audience (camera) will see what the character sees. Often it indicates a hand-held camera shot that moves in a walking or running motion while following a character. Subjective camera shots can add drama and frenzy to chase scenes.

A canted shot, or dutch angle shot, is tilted 25 to 45 degrees to one side, causing horizontal lines to run up- or downhill. Camera height can also be indicated: bird's-eye view, high angle, eye level and low angle.

Except for dramatic shock value, a long shot should not be immediately followed by a close-up; the transition is too abrupt. A medium shot or medium long shot should come in between.

Reverse-angle shot: indicates a near 180-degree shift in camera position.

OS shot: over-the-shoulder shot. The picture shows the back of one person's head and possibly one shoulder. (These are also designated as O/S and X/S shots.)

POV: point of view. Same as a subjective shot; dramatic scripts will often note that a shot is to be seen from the point of view of a particular actor.

EXT and INT: exterior (outside) and interior shot.

Camera finds, to indicate the camera moves in on a particular portion of a scene. Camera goes with, to indicate the camera moves with a person or object. Widening to, to signal a zoom or dolly back. Quick cut to, in order to emphasize a fast cut. The terms various angles and series of cuts indicate a variety of shots, generally on some specific subject matter.

Although a writer occasionally feels it necessary to indicate camera shots and angles on a script, this is an area that is better left to the judgement of the director.

2. Audio Instructions

SOT: sound-on-tape. This indicates that the voice, music or background sound will come from a videotape audio track.
SOF: sound-on-film.
VTR: video tape recording.
VO: voice-over. This refers to narration heard over a video source. It can also refer to narration heard at a higher level than a source of music or background sound.
OSV: off-screen voice. The voice indicated on the script is from a person who is not visible.
ANNCR: announcer.
MIC or MIKE: microphone.
KEY: the electronic overlay of titles and credits over background video.

3. Shooting a Scene From a Dramatic Production

As a way of illustrating setups and the whole concept of single-camera, film-style production, a short sequence from "The Professor and the Minx" will be traced (Figure 16.7). Knowing the story by analyzing the script and visualizing the characters through a storyboard is essential to blocking and directing the production.

Below is a brief scenario of the story. There are two principal actors in the scene, Barbi and Professor Timorous. Two years earlier Barbi was involved in psychological experiments headed by Dr. Timorous. Without notice or explanation, she abruptly dropped out of the experiments, and out of school. Over the two-year period, Barbi has experienced disturbing psychic episodes and weeks of sleepless nights, which she associates with the experiments. The consequences of the experiments aren't Barbi's only problems. Barbi has for some time been trying to break off a relationship with Ken, but Ken, who has a history of violence, has demonstrated major problems in coping with her rejection. On this particular night she has had to escape her nightclub job right after work to keep from encountering Ken. Meanwhile, Dr. Timorous has taken a year's sabbatical to analyze and write up his research. To keep from being disturbed, he has moved into a remote seaside chalet.

EXT. RUSTIC SEASIDE CHALET - NIGHT

Barbi pulls up to the chalet in her sports car. She grabs her long-haired Persian cat from the seat beside her and jumps out of the car. Barbi quickly makes her way to the front door and knocks impatiently. As she waits, she nervously looks back at the road to see if she has been followed.

(SFX: Sound of the sea and occasional ship horns are heard softly in the background)

DR. TIMOROUS
Miss Stevens.

BARBI
Barbi. And this is Tiffy.

(CONTINUED)

CONTINUED

INT. SEASIDE CHALET - NIGHT

(SFX: Muffled sound of the sea and occasional ship horns at a softer level are heard)

Dr. Timorous invites her in. Barbi gives one more glance back and quickly moves past him into the chalet. She comes to a halt in the large rustic room. Heavy hewn beams support a high cathedral ceiling. Small windows look over a restless ocean. The furnishings belong to the 18th century. A wooden stairway leads to a large loft now being used as an office. On the left of the downstairs area there is a small kitchen with an antique stove and a wooden kitchen table; a modern refrigerator betrays the atmosphere. On the right side of the central area is a bedroom, partially hidden behind a free-standing divider.

BARBI
(continuing)
Jeez. Is this a museum or something?

(OSV) DR. TIMOROUS
No. A quiet place where I should be able to work.

Barbi, still wearing her abbreviated cocktail dress, carefully puts her cat down, takes off her coat and lays it on a chair. Dr. Timorous watches with obvious disapproval and is becoming more uncomfortable by the minute.

BARBI
Tiffy and I won't be much trouble.

DR. TIMOROUS
(glaring at the cat)

You picked a pretty strange time to invite yourself over.

BARBI
(shrugs)
Didn't have much choice. Plus the mess I'm in is mostly your doing. If I had a lawyer… but that's too long a story to start on now. It's been days since I've had any sleep.

While Barbi starts exploring the chalet, Timorous goes to the kitchen and pours himself a glass of milk.

ANOTHER ANGLE shows Barbi, engulfed by fatigue, looking into the bedroom area. Barbi walks into the adjoining bedroom. In the foreground is a huge four-poster bed. Timorous is now at the doorway watching, glass of milk in hand.

BARBI
(looking around, amazed)
Right out of a 1900 Sears and Roebuck catalogue!

Barbi checks the bed for its firmness.

(CONTINUED)

BARBI
(glances up at him)
I am not going to sleep in this $400 dress. So you better cross your eyes or whatever you have to do.
(makes a decision, kicks off shoes)

DR. TIMOROUS
Wait just a minute! We need to discuss…

BARBI
(interrupting)
First things first. We can talk tomorrow. Right now, this comes first. Would you mind getting the light?

Seeing that she's not going to be stopped, he quickly turns to leave, pausing for a second without looking back to meekly flip off the light. Timorous moves to the kitchen table and sits down. Timorous and the cat stare menacingly at each other. There is silence.

Fig. 16.8 STORYBOARD OF THE SCRIPT

Barbi gets Timorous' telephone number and calls him from work at 2 A.M. She tells him about her reactions to the experiments, and in the course of the conversation she finds out where he is. Desperately needing a "safe port" for a while, she announces that she is on her way over. Before Dr. Timorous can object, she hangs up. Barbi rushes home to change out of her cocktail dress, but just as she gets there, Ken screeches up in front of her condo. With no time to change, she grabs her cat and flees down the back stairs.

Although Barbi thinks Timorous wrote the book on "stuffed shirts," he apparently holds the key to her psychological problems. Barbi's demeanour is "threadbare" because of her turbulent personal problems, her lack of sleep and her ongoing reaction to the experiments. Timorous, for his part, disdains Barbi's libertine lifestyle (the minx); on the phone she threatened a lawsuit if he didn't set things straight.

Although every director would approach these scenes differently, let's look at one method. The first sequence is of Barbi driving up to the chalet and getting out of her car. Many directors would first shoot the entire arrival sequence from a master-shot perspective (Figure 16.8B): the car driving up and parking, Barbi getting her cat, getting out of the car and walking up to the door. Although this wide-shot perspective would not really show any detail, it would provide the editor with footage that could be used between the closer setups.

At least two setups could be added. First is a cut to Barbi driving (Figure 16.8A) before she turns left into the driveway of the chalet (Figure 16.8B). As the car comes to a stop, we could then cut to Figure 16.8C as Barbi turns off the ignition and quickly and nervously looks around before hastily leaving the car. The shot in Figure 16.8C would serve two purposes: it is a needed transition between the long shot and the close-up to follow, and we need to add additional information on Barbi's mental state. To clearly establish the presence of the cat, we'll probably want a close-up of Barbi scooping up the cat from the seat (Figure 16.8D). At this point we could cut to a variation of the master shot as she gets out of the car and walks toward the front door (Figure 16.8E). We could then cut to a medium, over-the-shoulder shot (Figure 16.8F) as she walks to the door, knocks and nervously looks over her shoulder. Then the door opens, and Timorous delivers his line.

We might make the next sequence more interesting by putting effect before cause. As Barbi comes to a halt in the large rustic room, we could cut immediately to a medium close-up of her reaction. As the audience is wondering what caused her response, we can cut to a reverse-angle wide shot of the room (Figure 16.8H). If we wanted to put cause before effect, we could instead cut to a wide shot of the room from Barbi's viewpoint as she goes in and then to a close-up of her reaction. Although this would not underscore Barbi's reaction ("Is this a museum or something?"), it would give the audience a bit more time to study the room. The final decision in this case can be made during editing.

We now need a shot of Timorous at this point, for three reasons. First, we haven't seen Timorous for a while, and we need to know how he's handling things. Second, a question has been asked, and we expect an answer. Third, we need to see him close the door and move into the room. So we can cut to a shot of Timorous as he closes the front door.

Barbi has a line at this point, so we could either cut to a medium close-up of her (similar to Figure 16.8G) or simply have her deliver it over the wide shot of the room (Figure 16.8H); or we could cut to a two-shot (Figure 16.8J) so we can see Barbi as she speaks. Holding the wide shot at this point will do three things: emphasize the nature of the chalet, establish Barbi's explorations, and show that Timorous has been left standing powerlessly as Barbi takes over the situation. In either case we now need a shot to cover Barbi's action as she puts down the cat (we need to see this to explain why she will be without it in subsequent shots) and moves out of frame to explore the chalet. We could hold the shot of Timorous as Barbi delivers her line, "Tiffy and I won't be much trouble"; this will allow us to see the look of disapproval that comes into Timorous' eyes.

In a wide, reverse-angle shot similar to Figure 16.8H, we could then see Barbi starting to explore things. Although the shot is not in the script, we could cut to Barbi emerging at the top of the stairs of the loft and looking at a large desk piled high with reports (Figure 16.8I), to establish that there's only an office upstairs. We then need to see her turn and start down the stairs. Knowing what Barbi is doing, we can now cut to a medium shot of Timorous moving toward the kitchen, opening the refrigerator and pulling out a half-gallon of milk (Figure 16.8K). As soon as we see the milk carton, we know what's going on, so we can rejoin Barbi. A cut to the bottom of the stairs shows Timorous standing there with a glass of milk (Figure 16.8L). To further emphasize the helplessness of Timorous in the situation, there could be silence as Barbi brushes past him and moves toward the bedroom. At this point we have two options: cut to a reaction shot of Timorous (Figure 16.8M), or rejoin Barbi immediately.

By holding a medium shot of Timorous at this point, we can emphasize the look of dismay on his face as Barbi announces (out of the shot) that she and Tiffy are going to stay for a while. In a reverse-angle shot we then see what Timorous sees: Barbi checking out the bed (Figure 16.8N). She then delivers her lines as she kicks off her shoes and starts to take off her dress. At the appropriate point, Timorous moves toward Barbi and delivers his line (Figure 16.8O) as he starts

to back away while strenuously objecting to what's going on. We can avoid a jump cut and condense the sequence of Timorous moving into the kitchen and sitting down with a brief back-lit shot of Barbi slipping under the covers (Figure 16.8P). As he starts to leave, Barbi calls out, "Would you mind getting the light?", which causes him (without looking at Barbi) to reach around and meekly flip off the light. To end the scene, we could then cut to a medium wide shot of Timorous, glass in hand, staring down at the cat in silence (Figure 16.8Q).

There is no single "right" way to do this sequence; each director and editor would interpret these script pages in a slightly different way. By including more shots of Barbi, we could emphasize her whirlwind, minx personality. By staying on Timorous in more shots, the incredulity of the situation (from his perspective) would be emphasized. Whatever the approach, it should not be at odds with the basic story-line and character personalities conveyed in the script. Allowance for creativity and personal interpretation is one of the strengths of dramatic production.


AE22 – Multimedia Overview

1. Text
   1.1 Unformatted Text
   1.2 Formatted Text
2. Types of Computer Graphics
   2.1 Vector Graphics
   2.2 Raster Graphics
3. Graphic File Format
   3.1 Bitmap
   3.2 JPEG (Joint Photographic Experts Group)
   3.3 GIF (Graphics Interchange Format)
   3.4 Animated GIF
   3.5 TIFF (Tag Image File Format)
   3.6 3-D (three dimensions or three-dimensional)
4. About Pixel (a contraction of "picture element")
   4.1 Resolution
5. Audio File Format
   5.1 AIFF (Audio Interchange File Format)
   5.2 WAV
6. Video File Format
   6.1 AVI (audio/visual interface)
   6.2 Quicktime
   6.3 MPEG (Moving Picture Experts Group)
7. Interactivity
8. Multimedia Production Process

AE22 – MULTIMEDIA OVERVIEW

Introduction

Multimedia is more than one concurrent presentation medium (for example, on a CD-ROM or a Web site). Although still images are a different medium than text, multimedia is typically used to mean the combination of text, sound, and/or motion video. Some people might say that the addition of animated images (for example, animated GIFs on the Web) produces multimedia, but it has typically meant one of the following:

• Text and sound
• Text, sound, and still or animated graphic images
• Text, sound, and video images
• Video and sound
• Multiple display areas, images, or presentations presented concurrently
• In live situations, the use of a speaker or actors and "props" together with sound, images, and motion video

Multimedia can arguably be distinguished from traditional motion pictures or movies both by the scale of the production (multimedia is usually smaller and less expensive) and by the possibility of audience interactivity or involvement (in which case it is usually called interactive multimedia). Interactive elements can include voice command, mouse manipulation, text entry, touch screen, video capture of the user, or live participation (in live presentations).

Multimedia tends to imply more sophistication (and relatively more expense) in both production and presentation than simple text-and-images. Multimedia presentations are possible in many contexts, including the Web, CD-ROMs, and live theater. A rule of thumb for the minimum development cost of a packaged multimedia production with video for commercial presentation (as at trade shows) is $1,000 per minute of presentation time. Since any Web site can be viewed as a multimedia presentation, however, any tool that helps develop a site in multimedia form can be classed as multimedia software, and the cost can be less than for standard video productions.

Elements of Multimedia

• Text
• Graphics (Images)
• Audio
• Video
• Animation
• Interaction

1. Text

1.1 Unformatted Text

ASCII (American Standard Code for Information Interchange) – ASCII is the most common form of unformatted text used in computers and on the Internet. In ASCII, each alphabetic, numeric or special character is represented with a 7-bit binary number (a string of seven 0s or 1s); 128 possible characters are defined. ASCII was developed by the American National Standards Institute (ANSI). UNIX and DOS-based operating systems (except for Windows NT) use ASCII for text files.

EBCDIC – IBM's System 390 servers use this proprietary 8-bit code. Conversion programs allow different operating systems to change a file from one code to another.

Unicode – Unicode is an entirely new idea in setting up binary codes for text or script characters. Officially called the Unicode Worldwide Character Standard, it is a system for "the interchange, processing, and display of the written texts of the diverse languages of the modern world." It also supports many classical and historical texts in a number of languages. Currently the Unicode standard contains 34,168 distinct coded characters derived from 24 supported language scripts; these characters cover the principal written languages of the world, and additional work is under way to add the few modern languages not yet included. Windows NT uses this newer code.

1.2 Formatted Text

RTF (Rich Text Format) – RTF is a file format that lets you exchange text files between different word processors in different operating systems. It defines control words and symbols that serve as "common denominator" formatting commands. With such commands, the user can format words with styles such as Bold, Italic, Underline and Superscript, and these text effects are stored in the text file. For example, you can create a file using Microsoft Word 97 in Windows 95, save it as an RTF file (it will have an ".rtf" file name suffix), and send it to someone who uses WordPerfect 6.0 on Windows 3.1, and they will be able to open the file and read it. In some cases the RTF capability may be built into the word processor; in others, a separate reader or writer may be required. When saving a file in the Rich Text Format, the file is processed by an RTF writer, which converts the word processor's markup to the RTF language; when the file is read, the control words and symbols are processed by an RTF reader that converts the RTF language into formatting for the word processor that will display the document. (The Specification, a copy of which is located in the archives at the World Wide Web Consortium (W3C), is used to create an RTF reader or writer.) The RTF Specification uses the ANSI, PC-8, Macintosh and IBM PC character sets. Below is an example of a section of RTF markup embedded in a text file:

{\rtf1\ansi\ansicpg1252\uc1 \deff0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f28\froman\fcharset238\fprq2 Times New Roman CE;}{\f29\froman\fcharset204\fprq2 Times New Roman Cyr;}
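Going back to ASCII and Unicode for a moment: the short sketch below (an illustration only, in Python) prints a few characters' ASCII code points and their 7-bit patterns, and shows that a character outside the 128-character ASCII set needs a Unicode encoding such as UTF-8.

# Illustration: ASCII covers 128 code points (7 bits); other characters need Unicode.
for ch in ["A", "a", "%"]:
    print(ch, ord(ch), format(ord(ch), "07b"))   # code point and its 7-bit pattern

text = "déjà vu"
print(text.encode("utf-8"))     # Unicode text encoded as UTF-8 bytes
# text.encode("ascii") would raise UnicodeEncodeError: é and à are not in ASCII.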

2. Types of Computer Graphics

2.1 Vector Graphics

Vector graphics is the creation of digital images through a sequence of commands or mathematical statements that place lines and shapes in a given two-dimensional or three-dimensional space. (In physics, a vector is a representation of both a quantity and a direction at the same time.) In vector graphics, the file that results from a graphic artist's work is created and saved as a sequence of vector statements: for example, instead of containing a bit in the file for each bit of a line drawing, a vector graphic file describes a series of points to be connected. One result is a much smaller file. A vector file is sometimes called a geometric file. Most images created with tools such as Adobe Illustrator and CorelDraw are in the form of vector image files, and animation images are also usually created as vector files. Shockwave's Flash product, for instance, lets you create 2-D and 3-D animations that are sent to a requestor as a vector file and then rasterized "on the fly" as they arrive.

Vector image files are easier to modify than raster image files (which can, however, sometimes be reconverted to vector files for further refinement). At some point, a vector image is converted into a raster image: the vector image can be converted to a raster image file prior to its display so that it can be ported between systems.

2.2 Raster Graphics

Raster graphics are digital images created or captured (for example, by scanning in a photo) as a set of samples of a given space. A raster is a grid of x- (horizontal) and y- (vertical) coordinates on a display space. (For three-dimensional images, a z-coordinate is added.) A raster image file identifies which of these coordinates to illuminate in monochrome or color values. The raster file is sometimes referred to as a bitmap because it contains information that is directly mapped to the display grid. A raster file is usually larger than a vector image file, and it is usually difficult to modify without loss of information, although there are software tools that can convert a raster file into a vector file for refinement and changes. Examples of raster image file types are BMP, TIFF, GIF and JPEG files.
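The difference between the two approaches can be shown with a toy example. The sketch below (Python, not any real file format) holds the same horizontal line two ways: as a vector description (just two endpoints) and as a raster grid with one cell per pixel.

# Toy illustration: the same line as a vector description and as a raster grid.
vector_line = {"from": (1, 2), "to": (6, 2)}        # two endpoints: a few numbers

WIDTH, HEIGHT = 8, 4
raster = [[0] * WIDTH for _ in range(HEIGHT)]        # one cell per pixel

x0, y = vector_line["from"]
x1, _ = vector_line["to"]
for x in range(x0, x1 + 1):                          # "rasterize" the line
    raster[y][x] = 1

for row in raster:
    print("".join("#" if cell else "." for cell in row))

The vector form scales cleanly and stays small; the raster form grows with the display size and cannot be rescaled without losing definition.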

3. Graphic File Format

3.1 Bitmap

A bitmap defines a display space and the color for each pixel or "bit" in the display space. A GIF and a JPEG are examples of graphic image file types that contain bitmaps. A bitmap does not need to contain a bit of color-coded information for each pixel on every row; it only needs to contain information indicating a new color as the display scans along a row. Thus, an image with much solid color will tend to require a small bitmap. Because a bitmap uses a fixed or raster method of specifying an image, the image cannot be immediately rescaled by a user without losing definition. A vector graphic image, however, is designed to be quickly rescaled.

3.2 JPEG (Joint Photographic Experts Group)

A JPEG (pronounced JAY-peg) is a graphic image created by choosing from a range of compression qualities (actually, from one of a suite of compression algorithms). When you create a JPEG or convert an image from another format to a JPEG, you are asked to specify the quality of image you want. Since the highest quality results in the largest file, you can make a trade-off between image quality and file size (a short sketch illustrating this trade-off follows the GIF description below). Formally, the JPEG file format is ISO standard 10918. The JPEG scheme includes 29 distinct coding processes, although a JPEG implementor may not use them all. Along with the Graphics Interchange Format (GIF), the JPEG is a file type supported by the World Wide Web protocol, usually with the file suffix ".jpg". You can also create a progressive JPEG, which is similar to an interlaced GIF.

3.3 GIF (Graphics Interchange Format)

A GIF (some people say "JIF" and others say "GIF" with a hard G) is one of the two most common file formats for graphic images on the World Wide Web; the other is the JPEG. On the Web and elsewhere on the Internet (for example, bulletin board services), the GIF has become a de facto standard form of image. Technically, a GIF uses the 2D raster data type, is encoded in binary, and uses LZW compression. There are two versions of the format, 87a and 89a. Version 89a (July 1989) allows for the possibility of an animated GIF, which is a short sequence of images within a single GIF file. A GIF89a can also be specified for interlaced presentation. The format is actually owned by CompuServe, and companies that make products that exploit the format (but not ordinary Web users or businesses that include GIFs in their pages) need to license its use. A patent-free replacement for the GIF, the PNG format, has been developed by an Internet committee, and major browsers will soon be supporting it.
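Looking back at the JPEG quality setting described in 3.2, the trade-off is easy to see in practice. The sketch below is an illustration only: it assumes the third-party Pillow imaging library, a hypothetical input file photo.png, and arbitrary example quality values.

# Illustration only: saving the same image at two JPEG quality settings
# and comparing the resulting file sizes (assumes the Pillow library).
import os
from PIL import Image

image = Image.open("photo.png").convert("RGB")   # hypothetical input file

image.save("photo_q90.jpg", "JPEG", quality=90)  # high quality, larger file
image.save("photo_q30.jpg", "JPEG", quality=30)  # visible artifacts, smaller file

for name in ("photo_q90.jpg", "photo_q30.jpg"):
    print(name, os.path.getsize(name), "bytes")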

3.4 Animated GIF

An animated GIF is a graphic image on a Web page that moves: for example, a twirling icon, or a banner with a hand that waves, or letters that magically get larger. In particular, an animated GIF is a file in the Graphics Interchange Format specified as GIF89a that contains, within the single file, a set of images that are presented in a specified order. An animated GIF can loop endlessly (and it appears as though your document never finishes arriving) or it can present one or a few sequences and then stop the animation.

Java, Shockwave and other tools can be used to build applets that achieve the same effects as an animated GIF; however, these require browsers and operating systems capable of handling the applets. Animated GIFs can be handled by most browsers and are easier to build than comparable images with Java or Shockwave. PC users can try out a shareware program, GIF Construction Set for Windows from Alchemy Mindworks; for Mac users, GIFBuilder, a freeware program, can be downloaded from Yves Piguet in Switzerland. A number of people maintain lists of animated GIF examples: The Dutchman's Page includes a "Top 150 Animated GIFs List," and Learn2.com is a site that makes intelligent, selective use of animated GIFs to illustrate "how-to" topics.

3.5 TIFF (Tag Image File Format)

TIFF is a common format for exchanging raster (bitmapped) images between application programs, including those used for scanning images. A TIFF file can be identified as a file with a ".tiff" or ".tif" file name suffix. The TIFF format was developed in 1986 by an industry committee chaired by the Aldus Corporation (now part of Adobe Software); Microsoft and Hewlett-Packard were among the contributors to the format. TIFF files are commonly used in desktop publishing, faxing, 3-D applications and medical imaging applications. TIFF files can be in any of several classes, including gray scale, color palette, or RGB full color, and can include files with JPEG, LZW, or CCITT Group 4 standard run-length compression.

3.6 3-D (three dimensions or three-dimensional)

In computers, 3-D describes an image that provides the perception of depth. When 3-D images are made interactive so that users feel involved with the scene, the experience is called virtual reality. You usually need a special plug-in viewer for your Web browser to view and interact with 3-D images, and virtual reality experiences may also require additional equipment.

3-D image creation can be viewed as a three-phase process of tessellation, geometry, and rendering. In the first phase, models are created of individual objects using linked points that are made into a number of individual polygons (tiles). In the next stage, the polygons are transformed in various ways and lighting effects are applied. In the third stage, the transformed images are rendered into objects with very fine detail.

Popular products for creating 3-D effects include Extreme 3D, LightWave 3D, Ray Dream Studio, 3D Studio MAX, Softimage 3D and Visual Reality. The Virtual Reality Modelling Language (VRML) allows the creator to specify images and the rules for their display and interaction using textual language statements.
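As a toy illustration of the modelling and geometry phases just described (Python, with made-up vertex values), a polygon can be stored as linked 3-D points and then transformed, here by a simple rotation about the z-axis.

# Toy sketch of the "geometry" phase: transforming a polygon's vertices.
import math

triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]  # example vertices

def rotate_z(point, degrees):
    """Rotate one 3-D point about the z-axis by the given angle."""
    x, y, z = point
    a = math.radians(degrees)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

transformed = [rotate_z(p, 45) for p in triangle]
print(transformed)

A real renderer applies many such transforms to thousands of polygons before lighting and rendering them.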

4. About Pixel (a contraction of "picture element")

A pixel (a word invented from "picture element") is the basic unit of programmable color on a computer display or in a computer image. Think of it as a logical, rather than a physical, unit. The physical size of a pixel depends on how you've set the resolution for the display screen. If you've set the display to its maximum resolution, the physical size of a pixel will equal the physical size of the dot pitch (let's just call it the dot size) of the display. If, however, you've set the resolution to something less than the maximum resolution, a pixel will be larger than the physical size of the screen's dot. Pixel has generally replaced an earlier contraction of picture element, pel.

The specific color that a pixel describes is some blend of three components of the color spectrum: red, green and blue. Up to three bytes of data are allocated for specifying a pixel's color, one byte for each color. A true color or 24-bit color system uses all three bytes; however, most color display systems use only eight bits (which provides up to 256 different colors).

A bitmap is a file that indicates a color for each pixel along the horizontal axis or row (called the x coordinate) and a color for each pixel along the vertical axis (called the y coordinate). A GIF file, for example, contains a bitmap of an image (along with other data).

4.1 Resolution

Resolution is the number of pixels (individual points of color) contained on a display monitor, expressed in terms of the number of pixels on the horizontal axis and the number on the vertical axis. The sharpness of the image on a display depends on the resolution and the size of the monitor: the same pixel resolution will be sharper on a smaller monitor and gradually lose sharpness on larger monitors, because the same number of pixels is being spread out over a larger number of inches.

A given computer display system will have a maximum resolution that depends on its physical ability to focus light (in which case the physical dot size, the dot pitch, matches the pixel size) and usually several lesser resolutions. For example, a display system that supports a maximum resolution of 1280 by 1023 pixels may also support 1024 by 768, 800 by 600, and 640 by 480 resolutions. Note that on a given size of monitor, the maximum resolution may offer a sharper image but be spread across a space too small to read well. A given image will have less resolution (fewer dots per inch) on a larger screen, as the same data is spread out over a larger physical area. On the same size of screen, the image will have less resolution if the resolution setting is made smaller: resetting from 800 by 600 pixels per horizontal and vertical line to 640 by 480 means fewer dots per inch on the screen and an image that is less sharp. On the other hand, individual image elements such as text will be larger in size.

Screen image sharpness is sometimes expressed as dots per inch (dpi). (In this usage, the term dot means pixel, not dot as in dot pitch.) Dots per inch is determined by both the physical screen size and the resolution setting. Display resolution is not measured in dots per inch as it usually is with printers; however, the resolution and the physical monitor size together do let you determine the pixels per inch. For example, a 15-inch VGA monitor has a resolution of 640 pixels along a 12-inch horizontal line, or about 53 pixels per inch. A smaller VGA display would have more pixels per inch. Typically, PC monitors have somewhere between 50 and 100 pixels per inch.
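The 53-pixels-per-inch figure above is simple arithmetic; the sketch below just repeats it in Python, using the example numbers from the text.

# Minimal sketch: pixels per inch from resolution and physical screen width.
def pixels_per_inch(horizontal_pixels, screen_width_inches):
    return horizontal_pixels / screen_width_inches

# Example values from the text: 640 pixels across the 12-inch-wide face of a 15-inch VGA monitor.
print(round(pixels_per_inch(640, 12.0)))   # about 53 ppi
print(round(pixels_per_inch(800, 12.0)))   # the same screen at 800x600: about 67 ppi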

5. Audio File Format

5.1 AIFF (Audio Interchange File Format)

This audio file format was developed by Apple Computer for storing high-quality sampled audio and musical instrument information. It is also used by Silicon Graphics and in several professional audio packages.

5.2 WAV

Pronounced "wave," this is the Windows standard for waveform sound files. WAV files predictably have the extension .wav. They are played by a variety of downloadable software on both the PC and the Mac.

Wavetable – In computer technology, a wavetable is a table of stored sound waves that are digitized samples of actual recorded sound. A wavetable is stored in read-only memory (ROM) on a sound card chip, but it can also be supplemented with software. Originally, computer sounds (digital versions of analog waveforms) were generated through frequency modulation (FM); prestoring sound waveforms in a lookup table improved quality and throughput. Wavetables are used as part of music or sound synthesizers that use the Musical Instrument Digital Interface (MIDI). MIDI lets you capture sound and play it back based on the commands in files that are essentially little "scripts" to the "orchestra" (one might think of them as a written description of what the conductor is doing and which instruments are being pointed to with the baton).

Today's more advanced sound cards include wavetables with 32 "voices" or instruments (that are combined during creation and playback), and some sound cards work with software that provides additional voices. Some wavetable chips include a special section for drum sounds to support rhythmic effects. A wavetable sound can be enhanced or modified using reverberation or other effects before it is saved in the table. Wavetable sound cards use digital signal processor (DSP) chips, and many sound cards take advantage of Direct Memory Access (DMA). Many also include an FM synthesizer in order to play back sounds from older applications or files. A full-duplex sound card lets you record and play back at the same time (or, if you're using Internet telephony, talk and hear at the same time).
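As a small aside on WAV files: because the format is so widely supported, most programming languages can read its header directly. The sketch below uses Python's standard wave module; the file name narration.wav is hypothetical.

# Minimal sketch: reading the basic parameters of a WAV file
# with Python's standard-library wave module ("narration.wav" is hypothetical).
import wave

wav_file = wave.open("narration.wav", "rb")
print("channels:   ", wav_file.getnchannels())
print("sample rate:", wav_file.getframerate(), "Hz")
print("sample size:", wav_file.getsampwidth() * 8, "bits")
print("duration:   ", wav_file.getnframes() / wav_file.getframerate(), "s")
wav_file.close()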

6. Video File Format

6.1 AVI (audio/visual interface)

Windows AVI provides the capability to develop animation files that can be included in multimedia presentations and as part of World Wide Web pages. The files (which end with an .avi extension) require a special player, which may already be included with your Web browser or may require downloading.

6.2 Quicktime

Quicktime is a multimedia development, storage, and playback technology from Apple. Quicktime files combine sound, text, animation, and video in a single file. Using a Quicktime player that either comes with a Web browser or can be downloaded from Apple or the browser company, you can view and control brief multimedia sequences. Quicktime files can be recognized by their file name extensions: qt, mov and moov.

6.3 MPEG (Moving Picture Experts Group)

MPEG (pronounced EM-pehg), the Moving Picture Experts Group, evolves standards for digital video and digital audio compression. It meets periodically under the auspices of the International Standards Organization (ISO). The MPEG standards are an evolving series, each designed for a different purpose.

MPEG standards 1-4:

• MPEG-1 was designed for coding progressive video at a transmission rate of about 1.5 million bits per second. It was designed specifically for Video-CD and CD-i media.
• MPEG-2 was designed for coding interlaced images at transmission rates above 4 million bits per second. MPEG-2 is used for digital TV broadcast and DVD. An MPEG-2 player can handle MPEG-1 data as well.
• A proposed MPEG-3 standard, intended for High Definition TV (HDTV), was merged with the MPEG-2 standard when it became apparent that the MPEG-2 standard met the HDTV requirements.
• An MPEG-4 standard is planned for late 1998. MPEG-4, according to IEEE Spectrum, is expected to address speech and video synthesis, fractal geometry, computer visualization, and an artificial intelligence (AI) approach to reconstructing images.

To use MPEG files, you need a personal computer with sufficient processor speed, internal memory, and hard disk space to handle and play the typically large MPEG file (which has a file name suffix of .mpg or .mpeg). You also need an MPEG viewer or client software that plays MPEG files. (Note that .mp3 file suffixes indicate MPEG-1 audio layer-3 files, not MPEG-3 standard files.) You can download shareware or commercial MPEG players from a number of sites on the Web. (We happen to use a shareware player called Net Toob.) You can find out where to download a viewer and a great deal of other information at the MPEG Pointers and Resources home page. For additional information, see About MPEG on the official MPEG site.
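To relate those transmission rates to storage, here is a rough back-of-the-envelope calculation (illustration only) of how many megabytes one minute of video occupies at a constant bit rate.

# Rough arithmetic only: megabytes per minute of video at a constant bit rate.
def megabytes_per_minute(bits_per_second):
    return bits_per_second * 60 / 8 / 1_000_000

print(round(megabytes_per_minute(1_500_000), 1))  # MPEG-1 at ~1.5 Mbit/s -> ~11.2 MB/min
print(round(megabytes_per_minute(4_000_000), 1))  # ~4 Mbit/s -> 30.0 MB/min

This is why a Video-CD-class data rate was chosen for MPEG-1: a roughly 74-minute CD can just hold an hour or so of such video plus audio.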

7. Interactivity

In computers, interactivity is the sensory dialog that occurs between a human being (or possibly another live creature) and a computer program. (Programs that run without immediate user involvement are not interactive; they're usually called batch or background programs.) Games are usually thought of as fostering a great amount of interactivity, but order entry applications and many other business applications are also interactive, in a more constrained way (offering fewer options for user interaction).

On the World Wide Web, you not only interact with the browser (the Web application program) but also with the pages that the browser brings to you. Hypertext, or the word and picture links you can connect to, is the most common form of interactivity when using the Web (which can be thought of as a giant, interconnected application program). In addition to hypertext, the Web (and many non-Web applications in any computer system) offer other possibilities for interactivity. Any kind of user input, including typing commands or clicking the mouse, is a form of input. Displayed images and text, printouts, motion video sequences, and sounds are output forms of interactivity.

The earliest form of interaction with computers was indirect and consisted of submitting commands on punched cards and letting the computer read them and perform the commands. Later computer systems were designed so that average people (not just programmers) could interact immediately with computers, telling them what programs to run and then interacting with those programs, such as word processors (then called "editors"), drawing programs, and other interactive programs. The first interactive human-computer interfaces tended to be input text sequences called "commands" (as in "DOS commands") and terse one-line responses from the system. In the late 1970's, the first graphical user interfaces (GUIs) emerged from the Xerox PARC Lab, found their way into the Apple Macintosh personal computer, and then into Microsoft's Windows operating systems and thus into almost all personal computers available today. GUIs inherently promoted interactivity because they offered the user more interaction options.

8. Multimedia Production Process

A multimedia production involves creating program materials for different media and incorporating them into a single presentation. Just like a film production, a typical multimedia production would start with a script. The script contains the storyline, the flow of the presentation, descriptions of the characters, narration and the various interactivities (such as buttons, text fields, checkboxes and message windows) to be presented in the multimedia. The plan may be in the form of a flowchart; details such as the number of screens and message boxes needed are planned and will serve as a guideline for the whole production.

Upon completion of the script, the graphic artists would then design the visual presentation of the setting, characters and interaction according to the styles and ambience required by the story. This is usually done under the supervision of a multimedia producer or art director. Once the graphic design is completed, the animator will then work on the animation of the various characters according to the script. Almost at the same time, if narration is involved in the action, the animation has to be timed to allow sufficient time for speech before moving on to the next screen. Sometimes the narration may be recorded prior to the creation of the animation; in such cases, the animator may find it easier to work on pre-determined timing. It is common that a rough narration track is used for guiding the animation and a proper version of the narration will then be recorded upon completion of the animation.

The job of the audio engineer in the production is to record narration, soundtrack and sound effects. The recording of narration sometimes involves using a narration studio set up in a small room; in bigger productions, professional recording studios and voice-over talents are used. Most of the time, soundtracks and sound effects could be obtained from sample CD collections, and in bigger productions composers are employed to produce theme songs or music. Regardless of the original format and medium of the recording being made, the recorded sounds have to be converted into a digital audio file format such as .WAV or .AIFF in order to be incorporated into the presentation.

If video is required in the presentation, it has to be shot, edited and converted into a digital video file format. The size of the video screen is to be determined so that it will fit into the design of the presentation as planned.

The final stage of the production is authoring, which serves to combine all the various media into a complete presentation. This is usually done by the programmer, who would determine the interactivity and incorporate the different animations, graphics, sounds and video according to the script. By the end of the authoring process, a fully functional prototype is produced which will undergo different stages of testing. The testing process is to ensure that all the animation, graphics, sounds, video and programming logic work without flaws. The presentation is then converted into a standalone, executable program (sometimes called the 'projector') to be 'burnt' onto a CD-R, which is used as the master copy for mass production. The final tested product is then ready to be released.
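The conversion step, getting recorded audio into a .WAV or .AIFF file that the authoring tool can import, normally happens inside the audio editor. As a minimal illustration of the file format itself, the sketch below writes one second of a 440 Hz test tone as a 16-bit, 44.1 kHz mono WAV file using Python's standard wave and struct modules; the tone and file name are assumptions, not part of any real production.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100          # CD-quality sample rate
FREQ = 440.0                 # test tone (A4), purely illustrative
AMPLITUDE = 0.5 * 32767      # half of 16-bit full scale

with wave.open("test_tone.wav", "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(SAMPLE_RATE)
    for n in range(SAMPLE_RATE):  # one second of audio
        sample = int(AMPLITUDE * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE))
        wav.writeframes(struct.pack("<h", sample))
```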


AE23 – Internet Audio

The Internet

1. What is Internet?
2. The origin of Internet
   2.1 The most common usage of Internet
   2.2 Internet Addresses
   2.3 Transmission Control Protocol (TCP)
   2.4 Domain Names
   2.5 HTML (Hypertext Markup Language)

The MP3 Audio Format

1. How MP3 Works: Inside the Codec
   1.1 MPEG Audio Compression in a Nutshell
2. Psychoacoustics and MP3 Codecs
3. Breaking It Down
   3.1 Notes on "Lossiness"
   3.2 Masking Effects
   3.3 Bitrates
   3.4 Huffman Coding
4. Notes on Decoding
5. The Anatomy of an MP3 File
   5.1 Inside the Header Frame
   5.2 Locking onto the Data Stream
   5.3 ID3 Space
   5.4 Frames per Second
6. Listening to MP3 Streams
   6.1 Types of Streaming
   6.2 MP3-on-demand
   6.3 MP3 broadcast

AE23 – INTERNET AUDIO

The Internet

1. What is Internet?

The Internet is what we call a meta-network, that is, a network of networks that spans the globe. It's impossible to give an exact count of the number of networks or users that comprise the Internet, but it is easily in the thousands and millions respectively. The Internet employs a set of standardized protocols which allow for the sharing of resources among the different kinds of computers that communicate with each other on the network. These standards, sometimes referred to as the Internet Protocol Suite, are the rules that developers adhere to when creating new functions for the Internet.

The Internet is also what we call a distributed system; there are no central archives. Rather, the Internet is made up of thousands of smaller networks. Technically, no one runs the Internet. The Internet thrives and develops as its many users find new ways to create, display and retrieve the information that constitutes the Internet.

Archive - A place or collection containing records, documents, or other materials of historical interest.

Central archives – In most computer networks, all records (archives) are stored on a central server that provides such records to all terminals that are connected to it.

Protocol - A standard procedure for regulating data transmission between computers.

2. The origin of Internet

The Internet started in about 1968 when the U.S. government needed to link universities, military contractors and defense contractors together so that they could cooperate on advanced research projects together. The federal government formed an agency called the Advanced Research Projects Agency (ARPA). One of the key projects they were assigned to look at was that if all of their defense information was stored in just one computer, it would be an easy target for a nuclear attack. One way to survive that threat was to replicate and distribute the information between many computers all over the country using a network.

In 1975, some of the languages that computers would use to talk to each other were agreed upon, and several major universities and defense contractors were linked together on a network called DARPANET, all using the same protocol, which is now known as TCP/IP or Internet Protocol. This allowed the universities to share research information, programs and recent news. This network has long since been dismantled and replaced by several networks that now span the globe. In the mid eighties, many more colleges and universities were connected to the Internet. In the nineties, the Internet was opened up for commercial use as the Cold War ended. Soon more and more ways of communicating information across this new "Information Super Highway" were being developed.

TCP/IP - Transmission Control Protocol/Internet Protocol.

Cold War - the post-1945 struggle between the United States and its allies and the group of nations led by the Union of Soviet Socialist Republics (USSR). Direct military conflict did not occur between the two superpowers, but intense economic and diplomatic struggles erupted. The Cold War eventually ended when the USSR dissolved on 31 December 1991.

2.1 The most common usage of Internet

• WWW – World Wide Web
• Email – Electronic Mail
• FTP – File Transfer Protocol
• IRC – Internet Relay Chat
• Telnet

2.1.1 WWW – World Wide Web

The WWW is a system of computers and files by which users may view and interact with a variety of information stored all over the world, including magazine archives, current world and business news, public and university library resources, and computer programs. The WWW can be accessed by a computer connected to an Internet, through interconnected computer networks, or through the public Internet. The WWW is organized so that users can move easily between documents called Web pages. Users generally navigate the WWW using an application known as a browser, which presents text, images, sound, or other programs. Web pages contain links leading a user directly to other pages, using addresses called Uniform Resource Locators (URLs). Because Web pages are written using standardized Hypertext Markup Language (HTML), they can be interpreted by any browser program such as Netscape Navigator (by Netscape Communications Corporation) and Internet Explorer (by Microsoft).

2.1.2 Email – Electronic Mail

Email, also written "e-mail," is a totally electronic form of personal communication. Just like regular mail, email is sent to a person at an address. For example, when you want to send email to someone, you would send the email to an address that looks like username@location.ext. A username is a pseudonym that identifies a person's account on an Internet service. The @location.ext is the address of the Internet service itself. When read aloud, an email address becomes "username at location dot extension." For example, the President of the United States can be reached at president@whitehouse.gov: President is the username and @whitehouse.gov is the address of the White House on the Internet.

2.1.3 FTP – File Transfer Protocol

FTP is the protocol used for copying files to and from remote computer systems on a network using TCP/IP, such as the Internet. This protocol also allows users to use FTP commands to work with files, such as listing files and directories on the remote system.

FTP is vital for establishing a web site: it is used to transfer HTML, pictures and sound files from the client computer to the host server. Some information or service providers, such as software developers, place their software or documents on an FTP server so that users who have access to the server can freely download them.

2.1.4 IRC – Internet Relay Chat

IRC is a service that enables an Internet user to participate in a conversation online in real time with other users. An IRC channel, maintained by an IRC server, transmits the text typed by each user who has joined the channel to all other users who have joined the channel. Generally, a channel is dedicated to a particular topic, which may be reflected in the channel's name. An IRC client shows the names of currently active channels, enables the user to join a channel, and then displays the other participants' words on individual lines so that the user can respond. IRC was invented in 1988 by Jarkko Oikarinen of Finland.

2.1.5 Telnet

Telnet is a UNIX command in BSD Unix that enables a user to log in to a remote computer on a network using the rlogin protocol. When a user is connected to a remote Telnet server, he can operate the server as if it were local. Telnet is provided by some Internet Service Providers (ISPs) for their users to manage their accounts, such as changing passwords.

UNIX - A multi-user, multitasking operating system originally developed by Ken Thompson and Dennis Ritchie at AT&T Bell Laboratories in 1969 for use on minicomputers.

2.2 Internet Addresses

Internet addresses comprise four numbers, separated by full stops and each one less than 256, e.g. 122.234.34.8. These numerical addresses, sometimes called the dotted-quad, go at the start of your information. The leftmost part tells the router which network you are part of; the right section tells the router which computer should receive that information. Obviously, no two computers should have the same address. Clearly, the router needs programs to allow it to make all these decisions and to decipher addresses. This is what is known as the IP software, and it sits on top of the basic wires, rather like one onion ring sits on top of another.

[Diagram: the IP software layer sits on top of the wires and computers.]
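The dotted-quad notation and the network/host split can be explored directly. Below is a minimal sketch using Python's standard ipaddress module; the address is reused from the example above, and the /24 network size is simply an assumption made for the illustration.

```python
import ipaddress

addr = ipaddress.IPv4Address("122.234.34.8")     # example dotted-quad
net = ipaddress.IPv4Network("122.234.34.0/24")   # assumed network portion

print(int(addr))             # the same address expressed as one 32-bit number
print(addr in net)           # True: the leftmost parts identify this network
print(net.network_address)   # the network part that routers look at first
```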

2.3 Transmission Control Protocol (TCP)

TCP is closely related to the Internet Protocol, which is why you may often hear people refer to TCP/IP. The data sent over IP networks is limited to 1500 characters. However, many people want to send or receive much more than 1500 characters. Suppose you have a ten page document to send to someone: it is too large, and the IP cannot handle it. The TCP, Transmission Control Protocol, however, can. This is handled by the TCP, yet another outer layer of the network.

You will type the document and address it, and the TCP will rip this document into shreds (called packets), each of 1500 characters, counting up to 1500 characters for each one. It will number each one and put on the IP address. Each packet is passed to the IP software, which is capable of handling each packet of 1500 characters or so and passing it on to the most convenient router. My packets of data/information are mixed up with everyone else's packets, just like the letters being placed in a post-bag. When all the packets eventually arrive at the other end, possibly out of order, that network's TCP program will begin to put them together as a single document in the right order, based on the sequence numbers put on by the original TCP software. It is through the IP address and the TCP sequence number that the TCP software at the local network sorts out all the individual packets: all those for Ms. Y are put into one mailbox, those for Mr. X go into another mailbox. Then, when Ms. Y asks for her "mail" from her own mailbox, the packets can be sent to her in the correct sequence.

To users, exactly what the TCP does is effectively invisible (seamless). Suppose I send a message, a three-page document, to Ms. Y. She sees a three-page document on her computer screen. The fact that during transmission it was ripped into little bits (packets) and re-assembled by the receiving TCP program at the other end is something we are not normally aware of.

Suppose one of the packets gets lost during transmission, which is not unusual. Fortunately, there are various techniques used in computing to quickly detect transmission errors. When this happens, the receiving computer discards everything that has been sent and requests another complete transmission from the sending machine. Eventually, one of the re-transmissions will be correct. This is a vast improvement on the manual postal system, in which when a letter gets lost, it frequently stays lost.

All this shredding into packets, and staying in touch until the error detection activity is satisfied, adds an overhead to the overall speed at which data is sent and received. Both ends need to keep in contact until the receiving end is satisfied that all has been received. It is this layer which prevents the system being monopolized by a handful of users. However, for small messages TCP can be slow.

2.3.1 UDP

TCP has a cousin called UDP, User Datagram Protocol. UDP sends a message via the IP as usual but without any frills. Should you need to send less than 1500 characters, little more than about 20 lines of text, your application program can use UDP. If, for example, you want to find someone's telephone number, the request is probably less than 20 characters. It is sent off via the network. If the message gets through to the other end, your computer will eventually receive your requested telephone number and all is well. If it does not arrive within a given time period, your program will request the transmission again, having assumed it has been lost. This process will continue until you see the telephone number displayed on your screen.

2.4 Domain Names

If you want to use names rather than numerical addresses, you can do so. The name will need to be turned into an IP numerical address by your local network software in order to travel over the Internet: your computer will look up the corresponding numeric address, and the number is then used in place of the name you typed.

In the early days, when the number of users on the Internet was manageable, it was the Network Information Center (NIC) which handled names. It set up a registry, rather like a public telephone directory. People applied to NIC with their names and IP addresses and were added to the list. This list, called a hosts' file, was distributed regularly to all users. However, as the number of users grew, it no longer became practical to use the NIC service: there were delays in registering and distributing the name/address file. Some other system was required, and this was the online Domain Name System.

A name has more or less four parts, e.g. ty.sae.edu.sg, each separated by a full stop (but these are not related to the four numbers of the IP address). The leftmost part is an actual computer, in the above example "ty". The other parts are domains: the "ty" belongs to the SAE network (sae), which is part of the "edu" domain, under the overall domain for all educational computers in Singapore, which is "sg". Each domain is given the responsibility for creating names within its own group.

[Diagram: the domain hierarchy for ty.sae.edu.sg, from the top down: sg, edu, sae, ty.]
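Turning a domain name into its dotted-quad address is exactly what the resolver software on every networked computer does. A minimal sketch using Python's standard socket module follows; the host name here is just an illustrative assumption.

```python
import socket

# Hypothetical host name used purely for illustration.
name = "www.example.com"

# Ask the system's resolver (which consults the Domain Name System)
# for the corresponding dotted-quad IP address.
address = socket.gethostbyname(name)
print(f"{name} -> {address}")
```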

2.5 HTML (Hypertext Markup Language)

HTML (Hypertext Markup Language) is the set of "markup" symbols or codes inserted in a file intended for display on a World Wide Web browser. The markup tells the Web browser how to display a Web page's words and images for the user. HTML is defined in practice both by Netscape and Microsoft as they add changes to their Web browsers, and more officially for the industry by the World Wide Web Consortium (W3C). A new version of HTML called HTML 4 has recently been officially recommended by W3C. Significant features in HTML 4 are sometimes described in general as dynamic HTML. However, both Netscape and Microsoft browsers currently implement some features differently and provide non-standard extensions. Web developers using the more advanced features of HTML 4 may have to design pages for both browsers and send out the appropriate version to a user. It's possible to learn HTML from material available on the Web itself, and there are a number of helpful books on HTML. We like Laura Lemay's Teach Yourself Web Publishing with HTML 4 in One Week (Sams.net Publishing) and Ian Graham's HTML Sourcebook (John Wiley).

The MP3 Audio Format

1. How MP3 Works: Inside the Codec

How does the MP3 format accomplish its radical feats of compression and decompression, while still managing to maintain an acceptable level of fidelity to the original source material? The entire MP3 phenomenon is made possible by the confluence of several distinct but interrelated elements: a few simple insights into the nature of human psychoacoustics, a whole lot of number crunching, and conformance to a tightly specified format for encoding and decoding audio into compact bitstreams.

1.1 MPEG Audio Compression in a Nutshell

Uncompressed audio, such as that found on CDs, stores more data than your brain can actually process. For example, if two notes are very similar and very close together, your brain may perceive only one of them. If two sounds are very different but one is much louder than the other, your brain may never perceive the quieter signal. And of course your ears are more sensitive to some frequencies than others. The study of these auditory phenomena is called psychoacoustics, and quite a lot is known about the process, so much so that it can be quite accurately described in tables and charts.

MP3 encoding tools analyze the incoming source signal, break it down into mathematical patterns, and compare these patterns to psychoacoustic models stored in the encoder itself, in mathematical models representing human hearing patterns. The encoder can then discard most of the data that doesn't match the stored models, keeping that which does. The person doing the encoding can specify how many bits should be allotted to storing each second of music, which in effect sets a "tolerance" level: the lower the data storage allotment, the more data will be discarded

In fact. even our own musical instruments create many vibrational frequencies that are imperceptible to our ears. For example. the name of the album from which the track came. However. the recording year. and so on) is sensitive to a broader range of sounds and audio resolutions than is the human ear. a second compression run is also made. much like a filmstrip. genre. you may be surprised to learn that a good recording stores a tremendous amount of audio data that you never even hear. These are simple and well-established empirical observations on the human hearing mechanism. In some encodings. if one frame has leftover storage space and the next frame doesn't have enough. it's generally true that humans perceive midrange frequencies more strongly than high and low frequencies. This is called "ID3" data. Psychoacoustics and MP3 Codecs Everything is vibration. 2. The most sensitive range of hearing for most people hovers between 2kHz to 4kHz. and personal comments may be stored. The process is actually quite a bit more complex than that. which shrinks the remaining data even more via more traditional means (similar to the familiar "zip" compression process). and that sensitivity to higher frequencies diminishes with age and prolonged exposure to loud volumes. most of us can't hear much of anything above 16kHz (although women tend to preserve the ability to hear higher frequencies later into life than do men). because data is lost in the process. one of the most important functions of the mind is to function as a sieve. these frames may interact with one another. Clearly. The basic principle of any perceptual codec is that there's little point in storing information that can't be perceived by humans anyway. sifting the most important information out of the incoming signal. Each frame of data is preceded by a header that contains extra information about the data to come. by the time we're adults. they may team up for optimal results. Somewhere in between these extremes are wavelengths that are perceptible to human beings as light and sound. leaving the conscious self to focus on the stuff that matters.and ultrasonic vibration. Waves vibrating at different frequencies manifest themselves differently. Just beyond the realms of light and sound are sub. and all waves oscillate at different lengths (a wavelength is defined as the distance between the peak of one wave and the peak of the next). Our sense organs are tuned only to very narrow bandwidths of vibration in the overall picture. which involves the mind itself. all the way from the astronomically slow pulsations of the universe itself to the inconceivably fast vibration of matter (and beyond). MP3 files are composed of a series of very short frames." systematically bringing important information to the fore and sublimating or ignoring superfluous or irrelevant data. and zillions of other frequencies imperceptible to humans (such as radio and microwave). However. and will become increasingly useful as your collection grows. which runs roughly from 500Hz to 2kHz. the infrared and ultraviolet light spectra. because recording equipment (microphones. guitar pickups. The universe is made of waves. At the beginning or end of an MP3 file. it's been estimated that we really only process a billionth of the data available to our five senses at any given time. and we'll go into more detail later on. the track title. extra information about the file itself. Some have postulated that the sane mind functions as a sort of "reducing valve. In fact. 
In fact. there's a second piece to this puzzle. As obvious as this may sound. This kind of compression is called lossy. a level probably evolutionarily related to the normal range of the human voice. such as the name of the artist.SCHOOL OF AUDIO ENGINEERING AE23 – Internet Audio Student Notes and the worse the resulting music will sound. one after another. 385 . While hearing capacities vary from one individual to the next.

on the entire spectrum of audible frequencies." In other words. First it throws away what humans can't hear anyway (or at least it makes acceptable compromises). which are stored in the codec as a reference table. the MP3 encoding process can be subdivided into a handful of discrete tasks (not necessarily in this order): • • Break the signal into smaller component pieces called "frames. The encoding bitrate is taken into account. In brief. and chiefly concerns us here. The frequency spread for each frame is compared to mathematical models of human psychoacoustics. Because different portions of the frequency spectrum are most efficiently encoded via slight variants of the same algorithm. requires most of the complexity. For instance. Perceptual codecs are highly complex. this step breaks the signal into sub-bands. if you're encoding at 128 kbps. but we'll get to that later). as determined by the encoder). and how much will be left on the cutting room floor. Breaking It Down MP3 uses two compression techniques to achieve its size reduction ratios over uncompressed audio--one lossy and one lossless. and then it encodes the redundancies to achieve further compression. and all of them work a little differently. which can be processed independently for optimal results (but note that all sub-bands use the algorithm-they just allocate the number of bits differently. you have an upper limit on how much data can be stored in each frame (unless you're encoding with variable bitrates. You can think of frames much as you would the frames in a movie film." each typically lasting a fraction of a second. find out how the bits will need to be distributed to best account for the audio to be encoded. the general principles of perceptual coding remain the same from one codec to the next. However. This step determines how much of the available audio data will be stored. From this model. Analyze the signal to determine its "spectral energy distribution. and the maximum number of bits that can be allocated to each frame is calculated. it's the first part of the process that does most of the grunt work. since they'll • • 386 .SCHOOL OF AUDIO ENGINEERING AE23 – Internet Audio Student Notes 3. it can be determined which frequencies need to be rendered accurately. However.

" which compresses redundant information throughout the sample.1 Notes on "Lossiness" Compression formats. However. which banks on the fact that image files often store more information than necessary to display an image of acceptable quality. you could end up storing hundreds of thousands of pixels of perfect blue. and which ones can be dropped or allocated fewer bits. and then you compress what's left to shrink the storage space required by any redundancies. If you were to scan and store this image on your hard drive. Along the way. In addition. discarding data in the process. The Huffman coding does not work with a psychoacoustic model." When the part of the image depicting the sand is encountered. This second step. images. excellent compression ratios can be achieved for images that don't need to be displayed at high resolutions. some types of data can withstand having information thrown away. or random collections of files. does not discard any data-it just lets you store what's left in a smaller amount of space. the preceding steps are not necessarily run in order. are either lossless or lossy. 3. The collection of frames is assembled into a serial bitstream. no matter how well-encoded. The headers contain instructional "metadata" specific to that frame. they may be represented as the mathematical equivalent of "repeat blue pixel 273. all identical to one another. The secret of a photographic compression method like GIF is that this redundant information is reduced to a single description. By throwing away some of the information. but achieves additional compression via more traditional means. and by encoding redundant information with mathematical algorithms.SCHOOL OF AUDIO ENGINEERING AE23 – Internet Audio Student Notes be perceptible to humans. and it's important to understand that all MP3 files. Rather than store all the bits individually. or you're willing to make a compromise: Smaller files in exchange for missing but unimportant data. losing even a single byte is unacceptable. video. The distinction is simple: Lossless formats are identical to the original(s) after being decompressed. many other factors enter into the equation. you can see the entire MP3 encoding process as a two-pass system: First you run all of the psychoacoustic models. algorithms for the encoding of an individual frame often rely on the results of an encoding for the frames that precede or follow it. and therefore redundant. We'll take a deeper look at much of this process in the sections that follow. have discarded some of the information that was stored in the original. whether they operate on audio. Think for a moment of a photograph depicting a clear blue sky. A good example of a lossy compression format is JPEG. with header information preceding each data frame. the Huffrnan coding. When you unpack a zip archive containing a backup of your system from last month. uncompressed signal. it does illustrate the concept of lossiness. The entire process usually includes some degree of simultaneity. since we wouldn't be able to hear them anyway. on the grounds that either you'll never notice what's missing.zip archiving scheme. and below it a beach. • Thus. 387 . Many lossy compression formats work by scanning for redundant data and reducing it to a mathematical depiction which can be "unpacked" later on. often as the result of options chosen prior to beginning the encoding.000 times. while lossy formats are not. A good example of a lossless compression format is the ubiquitous . 
Why store data that can't be heard? The bitstream is run through the process of "Huffman coding. While the JPEG analogy doesn't depict the MP3 compression process accurately.

let's say you have an audio signal consisting of a perfect sine wave fluctuating at 1.1 Simultaneous (auditory) masking The simultaneous masking effect (sometimes referred to as "auditory masking") may be best described by analogy. 3. As it moves past the sun to the right. Now you introduce a second perfect sine wave. Most humans will not be able to detect the second pitch at all. we'll slowly change the frequency (pitch) of the second tone until it's fluctuating at. If JPEG compression is set low. which occur s unconsciously at every moment for all of us. So. and vibratory audio signal. Tone 2 is barely audible next to Tone 1. described earlier. until at a certain point. frequencies) have to be before they're considered redundant with one another is the key to determining the degree of lossiness. This is why simple images can be stored as small files. it's because its frequency is very close (similar) to that of the first. it becomes visible again. If JPEG compression is set high.SCHOOL OF AUDIO ENGINEERING AE23 – Internet Audio Student Notes the sand is analyzed for redundancy and similar reductions can be achieved. However. 4. the mind. However. we'll leave its volume exactly as it was. involves a process called masking. You see the bird flying in from the left. Tone 2 is quite audible. to put simultaneous masking into more concrete terms. Of course. 388 . At Point A. the reason the second pitch is inaudible is not just because it's quieter. one louder than the other. light blue and medium blue pixels may be treated as being redundant with one another. more accurately. at -10c[b.000Hz." which demonstrates an important aspect of the mind's role in hearing: Any time frequencies are close to one another.2. In more concrete audio terms. On the other hand. -10db. The end result will be a clearer picture and a larger image file. we have the aural equivalent of an optical illusion-a trick of our perceptual capacity that contributes to our brain's ability to filter out the less relevant and give focus to stronger elements. even though its volume remains unchanged.00OHz. As the second pitch becomes more dissimilar from the first. 3. The MP3 codec. all it knows are relative frequencies and volume levels. recall how you can sometimes hear an acoustic guitarist's fingers sliding over the ridged spirals of the guitar strings during quiet passages. while complex images don't compress as well-they contain less redundancy.10OHz-but also much quieter-say. much as mountains on the distant horizon may appear to be evenly textured and similarly colored. it becomes more audible. we h ave difficulty perceiving them as unique. is unconcerned with guitar stings. JPEG compression works in accord with user-defined "tolerance thresholds". most humans will hear two distinct tones. of course.2 Masking Effects Part of the process of mental filtering. then it seems to disappear. and is of much interest to students of psychoacoustics: the study of the interrelation between the ear. this one fluctuating at a pitch just slightly higher-let's make it 1. because the sun's light is so strong in contrast. To illustrate this fact. Two separate masking effects come into play in MP3 encoding: auditory and temporal. the codec will be more fussy about determining which pixels are redundant. In effect. At Point B. determining how similar two adjacent pixels (or. say. Think of a bird flying in front of the sun. you seldom if ever hear this effect during a full-on rock anthem. 
What's going on here is a psychoacoustic phenomenon called "simultaneous masking. because the wall of sound surrounding the guitar all but completely drowns these subtle effects. even while the same mountains might be full of variation and rich flora if one were hiking in them.

if a loud sound and a quiet sound are played simultaneously. an MP3 file will not. An easy way to visualize the effect of bitrate on audio quality is to think of an old. The idea behind temporal masking is that humans also have trouble hearing distinct sounds that are close to one another in time. or in the case of other formats such as AAC or MP3 with MPEG-2 extensions.e. For example. you won't be able to hear the quiet sound. based on time rather than on frequency. or 128. mono. If. This distance. so premasking and postmasking both occur. which is dependent on the relationship between frequencies and their relative volumes... For example. and the bitrate. 3. the "irrelevant" portions of the signal are mapped against two factors: a mathematical model of human psychoacoustics (i. which has a similar net result. significant enough to keep it in the bitstream rather than throwing it away. The codec takes the bitrate into consideration as it writes each frame 389 .. the current de facto standard is to encode MP3 at 128 kbps. as they might do with a JPEG image. this is precisely what the algorithms behind most audio compression formats dothey exploit certain aspects of human psychoacoustic phenomena to allocate storage space intelligently. The key to the success of temporal masking is in determining (quantifying) the length of time between the two tones at which the second tone becomes audible.e. turn-of-the-century film. which is established at the time of encoding. quieter sound. such as the number of bits per second allocated to storing the data and the number of channels being stored. and more to the dominant signal.the higher the bitrate. they can control the number of bits per second to be devoted to data storage. which means less data is distributed over a given time frame. And. stereo. and are accounted for in the algorithm. the greater the audio resolution of the final product. there is sufficient delay between the two sounds. If you were to try and compress an audio signal containing two sine waves. or PCM*) audio storage format will use just as much disk space to store a texturally constant passage in a symphonic work as it will for a dynamically textured one. you would want the ability to devote less disk storage space to the nearly inaudible signal. The bitrate simply refers to the number of bits per second that should be devoted to storing the final product . turns out to be around five milliseconds when working with pure tones. Of course. there's a second sort of masking which also comes into play. mathematical descriptions of the limitations of human auditory perception. multichannel audio. or threshold.SCHOOL OF AUDIO ENGINEERING AE23 – Internet Audio Student Notes Now consider for a moment the fact that an audio signal consisting of two sine waves-even if one is quieter-contains almost twice as much data as a signal containing a single wave. The MP3 codec is based on perceptual principles but also encapsulates many other factors. Thus.e.000 bits per second. of course. the masking requirements). i. however. in a sense. this process also works in reverse-you may not hear a quiet tone if it comes directly before a louder one. MP3 and similar audio compression formats are called "perceptual codecs" because they are.3 Bitrates While MP3 users cannot control the degree of lossiness specifically. you will hear the second.2 Temporal masking In addition to auditory masking.2. i. Whereas a raw (waveform. In the process of coding. 
Old movies appear herky-jerky to us because fewer frames per second are being displayed. though it varies up and down in accordance with different audio passages. 3.

The drawback to CBR is that most music isn't structured with anything approaching a constant rate. If the bitrate is low. the irrelevancy and redundancy criteria will be measured harshly. VBR Files may present timing difficulties for decoders. you have to settle for less quality. In other words. the file size of the end product corresponds directly with the bitrate: If you want small files. the scales are essentially arbitrary. when encoding with VBR. Second. then that's what you're going to get. All notions of bits per second go right out the window. these files may not be playable in older-generation decoders.SCHOOL OF AUDIO ENGINEERING AE23 – Internet Audio Student Notes to the bitstream. VBR Most of the information you'll read assumes that the bitstream is being encoded at a constant bitrate (CBR). and so on. However. Passages with many instruments or voices are succeeded by passages with few. 3. which is now owned by Real Networks. think of them as though you were using a slider to control the overall quality versus file size ratio as you might with a JPEG editor. The response to this situation has been the development of variable bitrate (VBR) encoders and decoders. Therefore. Rather than specifying a bitrate before encoding begins. While MusicMatch jukebox gives you a scale of 1 to 100. where the scale represents a distortion ratio. which had no notion of VBR concepts (although the ISO standard specifies that a player must handle VBR files if it's to be considered ISO-compliant). VBR techniques conveniently take some of the guess Bitrates refer to the total rate for all encoded channels. but is now supported by dozens. or tolerance. simplicity follows complexity. the LAME command-line encoder lets you specify a quality of 0 to 9. VBR technology was first implemented by Xing. of course.3. if you specify a 128 kbps encoding. of third-party products. start to finish. resulting in a lower-quality product. Of course. First. instead. which vary the bitrate in accordance with the dynamics of the signal flowing through each frame. if not hundreds. Confusingly. While VBR files may achieve smaller file sizes than those encoded in CBR at a roughly equivalent fidelity. and the end result will sound better. the user specifies a threshold. If the bitrate is high. and more subtlety will be stripped out. in any case. or run the tests yourself. you can't just assume that higher numbers mean higher quality-see the documentation for your encoder before proceeding. In other words. one selects VBR quality on a variable scale. the codec will be applied with leniency. you can go for higher bitrates. they present a number of drawbacks of their own.1 CBR vs. You may expect your MP3 player to display inaccurate timing readouts-or no timing information at all-when playing back VBR files. this scale is represented differently in different encoders. If you don't mind larger files. a 128 kbps stereo MP3 is equivalent in size and quality to two 390 .

separate 64 kbps mono files. However, a 128 kbps stereo file will enjoy better quality than two separate 64 kbps mono files, since in a stereo file, bits will be allocated according to the complexity of the channels. In a given time, one channel may utilize 60% of the bits while the other uses only 40%. The cumulative size in bits will, however, remain constant.

3.3.2 Bitrates vs. samplerates

Bitrates aren't quite the final arbiter of quality. The resolution of audio signal in general is in large part determined by the number of source samples per second stored in a given format. While bitrates are a measure of the amount of data stored for every second of audio, samplerates measure the frequency with which the signal is stored, and are measured in kiloHertz, or thousands of samples per second. The standard samplerate of CD audio is 44.1kHz, so this is the default samplerate used by most encoders, and found in most downloadable MP3 files. Audio professionals often work with 48kHz audio (and, more recently, 96kHz*). Digital audio storage of lectures and plain speech is sometimes recorded as low as 8kHz. Streamed MP3 audio is often sent out at half, or even a quarter of the CD rate in order to compensate for slow Internet connection speeds. If you need to minimize storage space, or are planning to run your own Internet radio station, and are willing to sacrifice some quality, you'll want to do some experimenting with various samplerates. Before moving away from the topic of perceptual codecs, there's an important point to be made about the category as a whole: They all make baseline assumptions about the limitations of human perception, and about how closely the end result will be listened to. The fact of the matter is that all that stuff being stripped out adds up to something. While no recording format, whether it be vinyl, reel-to-reel, compact disk, or wax cylinder, can capture all of the overtones and subtle nuances of a live performance, nor can any playback equipment on the face of the earth reproduce the quality of a live performance. All compression formats especially perceptual codecs-are capable of robbing the signal of subtleties. While certain frequencies may not be distinctly perceptible, their cumulative effect contributes to the overall "presence" and ambience of recorded music. Once a signal has been encoded, some of the "magic" of the original signal has been stripped away, and cannot be retrieved no matter how hard you listen or how good your playback equipment. As a result, MP3 files are sometimes described as sounding "hollow" in comparison to their uncompressed cousins. Of course, the higher the quality of the encoding, the less magic lost. You have to strike your own compromises. Many feel that the current digital audio standard offers less resolution than the best analog recording, which is why many audiophile purists still swear by vinyl LPs. Digital audio introduced a host of distortions never before encountered with analog, but hasn't had analog's 50+ years of research and development to eradicate them. Compressing and further modifying "CD quality" audio with a lossy perceptual codec like MP3, some might say, adds insult to injury. But then there's reality, and the reality right now is that the vast majority of us do not listen to music with the trained ears of a true audiophile, nor do most of us possess magnificent playback equipment. Most of us use middle-ground sound cards and PC speakers, most of us have limits to the amount of data we can store conveniently, and most of us connect to the Internet with relatively low-bandwidth modems. Reality dictates that we make compromises. Fortunately, the reality of our sound cards and


Student Notes speakers, the quality of which lags far behind the quality of decent home audio systems, also means that most of these compromises won't be perceived most of the time. The bottom line is that the perceptual codec represents a "good enough" opportunity for us to have our cake and eat it too. As things stand now, it all comes down to a matter of file size if we want to store and transfer audio files with anything approaching a level of convenience. In a perfect world, we would all have unlimited storage and unlimited bandwidth. In such a world, the MP3 format may never have come to exist-it would have had no reason to. If necessity is the mother of invention, the invention would never have happened. Compression techniques and the perceptual codec represent a compromise we can live with until storage and bandwidth limitations vanish for good. 3.4 Huffman Coding At the end of the perceptual coding process, a second compression process is run. However, this second round is not a perceptual coding, but rather a more traditional compression of all the bits in the file, taken together as a whole, To use a loose analogy, you might think of this second run, called the "Huffman coding," as being similar to zip or other standard compression mechanisms (in other words, the Huffman run is completely lossless, unlike the perceptual coding techniques). Huffman coding is extremely fast, as it utilizes a look-up table for spotting possible bit substitutions. In other words, it doesn't have to "figure anything out" in order to do its job. The chief benefit of the Huffman compression run is that it compensates for those areas where the perceptual masking is less efficient. For example, a passage of music that contains many sounds happening at once (i.e., a "polyphonous" passage) will benefit greatly from the masking filter. However, a musical phrase consisting only of a single, sustained note will not. However, this passage can be compressed very efficiently with more traditional means, due to its high level of redundancy. On average, an additional 20% of the total file size can be shaved during the Huffman coding. Raw Power – if you've surmised from all of this that encoding and decoding MP3 must require a lot of CPU cycles, you're right. In fact, unless you're into ray tracing or encryption cracking, encoding MP3 is one of the few things an average computer user can do on a regular basis that consumes all of the horsepower you can throw at it. Note, however, that the encoding process is far more intensive than decoding (playing). Since you're likely to be decoding much more frequently than you will be encoding, this is intentional, and is in fact one of the design precepts of the MP3 system (and even more so of next -generation formats such as AAC and VQF). Creating an MP3 file, as previously described, is a hugely complex task, taking many disparate factors into consideration. The task is one of pure, intensive mathematics. While the computer industry is notorious for hawking more processing power to consumers than they really need, this is one area where you will definitely benefit from the fastest CPU (or CPUs) you can get your hands on, if you plan to do a lot of encoding. It's impossible to recommend any particular processor speed, for several reasons: • People have very different encoding needs and thresholds of what constitutes acceptable speed.


• Encoders vary radically from one to the next in terms of their overall efficiency.
• Any mention of specific processor speeds would surely be out-of-date by the time you read this.
• There's more to the equation than just the speed of the CPU. While MHz may be a good measure of CPU speed when comparing processors from the same family, it's not a good performance measurement between systems. The difference in size of the chip's on-board cache may have a big impact on encoding speeds, as do floating-point optimizations, so one must be careful to note such differences when benchmarking.

In any case, it's easy enough to set up batch jobs with most encoders, so you can always let it rip while you go out to lunch, or even overnight. Unless you're really stuck with an old clunker of a machine (a CPU manufactured prior to 1996, for example) and your needs aren't intensive, don't even think about running out to get a new computer just to pump up your encoding speed. You'll be better off making sure you have an adequate complement of RAM, a fast and accurate DAE capable CD-ROM drive, a good sound card, and that you're using the most efficient encoder available for your platform. 4. Notes on Decoding

As noted earlier, the great bulk of the work in the MP3 system as a whole is placed on the encoding process. Since one typically plays files more frequently than one encodes them, this makes sense. Decoders do not need to store or work with a model of human psychoacoustic principles, nor do they require a bit allocation procedure. All the MP3 player has to worry about is examining the bitstrearn of header and data frames for spectral components and the side information stored alongside them, and then reconstructing this information to create an audio signal. The player is nothing but an (often) fancy interface onto your collection of MP3 files and playlists and your sound card, encapsulating the relatively straightforward rules of decoding the MP3 bitstream format. While there are measurable differences in the efficiency-and audible differences in the quality-of various MP3 decoders, the differences are largely negligible on computer hardware manufactured in the last few years. That's not to say that decoders just sit in the background consuming no resources. In fact, on some machines and some operating systems you'll notice a slight (or even pronounced) sluggishness in other operations while your player is running. This is particularly true on operating systems that don't feature a finely grained threading model, such as MacOS and most versions of Windows. Linux and, to an even greater extent, BeOS are largely exempt from MP3 skipping problems, given decent hardware. And of course, if you're listening to MP3 audio streamed over the Internet, you'll get skipping problems if you don't have enough bandwidth to handle the bitrate/ sampling frequency of the stream. Some MP3 decoders chew up more CPU time than others, but the differences between them in terms of efficiency are not as great as the differences between their feature sets, or between the efficiency of various encoders. Choosing an MP3 player becomes a question of cost, extensibility, audio quality, and appearance. DAE stands for Digital Audio Extraction, and refers to a CD-ROM drive's ability to grab audio data as raw bits from audio CDs, so you don't have to rip tracks via the sound card. More details on DAE can be found in Chapter 5. That's still a lot to consider, but at least you don't have to worry much about benchmarking the hundreds of players available on the market (unless you've got a really slow machine).
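Decoding really is the cheap half of the system, and it is easy to script. The sketch below decodes an MP3 to an uncompressed WAV file, the same kind of side-by-side comparison material suggested a little further on, assuming the third-party pydub package and an ffmpeg installation are available; the file names are illustrative only.

```python
from pydub import AudioSegment  # third-party package; uses ffmpeg for MP3 decoding

# Hypothetical file names used for illustration.
track = AudioSegment.from_mp3("favourite_track.mp3")
track.export("favourite_track.wav", format="wav")

print(len(track) / 1000.0, "seconds decoded")  # pydub reports length in milliseconds
```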


Student Notes If you're a stickler for audio quality, you've probably got a decent to excellent sound card already. However, if you've got an older sound card (such as a SoundBlaster 16) and a slower CPU (slower than a Pentium 133), be aware that the "look ahead" buffer in the MP3 player can easily become exhausted, which will result in an audible degradation of sound quality. However, sticking a better sound card (such as a SoundBlaster 64) in the same machine may eliminate these artifacts, since better sound cards perform more of the critical math in their own hardware, rather than burdening the computer's CPU with it. While this situation won't affect many modem geeks, there's an easy way to test your equipment to determine if its lack of speed is affecting audio quality: just pick a favorite MP3 file and decode it to a noncompressed format such as WAV, then listen to the MP3 and the WAV side-by-side. If the WAV version sounds better, you'll know that your machine isn't up to the MP3 playback task, since the uncompressed version requires very little processing power to play. 5. The Anatomy of an MP3 File

Aside from being familiar with the basic options available to the MP3 encoder, the typical user doesn't need to know how MP3 files are structured internally any more than she needs to know how JPEG images or Word documents are structured behind the scenes. For the morbidly curious, however, here's an x-ray view of the MP3 file format. 5.1 Inside the Header Frame As mentioned earlier, MP3 files are segmented into zillions of frames, each containing a fraction of a second's worth of audio data, ready to be reconstructed by the decoder. inserted at the beginning of every data frame is a "header frame," which stores 32 bits of meta-data related to the coming data frame. The MP3 header begins with a "sync" block, consisting of 11 bits. The sync block allows players to search for and "lock onto" the first available occurrence of a valid frame, which is useful in MP3 broadcasting, for moving around quickly from one part of a track to another, and for skipping ID3 or other data that may be living at the start of the file. However, note that it's not enough for a player to simply find the sync block in any binary file and assume that it's a valid MP3 file, since the same pattern of 11 bits could theoretically be found in any random binary file. Thus, it's also necessary for the decoder to check for the validity of other header data as well, or for multiple valid frames in a row. Table 2-1 lists the total 32 bits of header data that are spread over 13 header positions. Following the sync block comes an ID bit, which specifies whether the frame has been encoded in MPEG-1 or MPEG-2. Two layer bits follow, determining whether the frame is Layer 1, 11, 111, or not defined. If the protection bit is not set, a 16-bit checksum will be inserted prior to the beginning of the audio data. The bitrate field, naturally, specifies the bitrate of the current frame (e.g., 128 kbps), which is followed by a specifier for the audio frequency (from 16,000Hz to 44,100Hz, depending on whether MPEG-1 or MPEG-2 is currently in use). The padding bit is used to make sure that each frame satisfies the bitrate requirements exactly. For example, a 128 kbps Layer II bitstream at 44.1kHz may end up with some frames of 417 bytes and some of 418. The 417-byte frames will have the padding bit set to "on" (1) to compensate for the discrepancy.
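To make the header layout concrete, the sketch below unpacks a few of the fields of a 32-bit MPEG-1 Layer III header from raw bytes. It is a simplified illustration only: just a subset of the 13 fields is decoded, and the bitrate and samplerate tables are abbreviated to common values, so it is not a complete MP3 parser.

```python
# Minimal sketch: decode part of a 4-byte MPEG-1 Layer III frame header.
BITRATES_KBPS = {9: 128, 10: 160, 11: 192, 12: 224, 13: 256, 14: 320}  # subset of the table
SAMPLERATES_HZ = {0: 44100, 1: 48000, 2: 32000}

def parse_header(b: bytes) -> dict:
    word = int.from_bytes(b[:4], "big")
    if (word >> 21) & 0x7FF != 0x7FF:        # the 11-bit sync block must be all 1s
        raise ValueError("no sync block at this position")
    return {
        "mpeg1":        (word >> 19) & 0x3 == 0x3,  # ID bits
        "layer3":       (word >> 17) & 0x3 == 0x1,  # layer bits (01 = Layer III)
        "bitrate_kbps": BITRATES_KBPS.get((word >> 12) & 0xF),
        "samplerate":   SAMPLERATES_HZ.get((word >> 10) & 0x3),
        "padding":      (word >> 9) & 0x1,
    }

# Assumed example header: MPEG-1, Layer III, 128 kbps, 44.1 kHz, no padding.
print(parse_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
```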


5.2 Locking onto the Data Stream
One of the original design goals of MP3 was that it would be suitable for broadcasting. As a result, it becomes important that MP3 receivers be able to lock onto the signal at any point in the stream. This is one of the big reasons why a header frame is placed prior to each data frame: a receiver tuning in at any point in the broadcast can search for sync data and start playing almost immediately. Interestingly, this fact theoretically makes it possible to cut MPEG files into smaller pieces and play the pieces individually. However, this unfortunately is not possible with Layer III files (MP3) due to the fact that frames often depend on data contained in other frames (see "Dipping into the reservoir," earlier). Thus, you can't just open any old MP3 file in your favorite audio editor for editing or tweaking.

The mode field refers to the stereo/mono status of the frame, and allows for the setting of stereo, joint stereo, dual channel, and mono encoding options. If joint stereo effects have been enabled, the mode extension field tells the decoder exactly how to handle it, i.e., whether high frequencies have been combined across channels. The copyright bit does not hold copyright information per se (obviously, since it's only one bit long), but rather mimics a similar copyright bit used on CDs and DATs. If this bit is set, it's officially illegal to copy the track (some ripping programs will report this information back to you if the copyright bit is found to be set). If the data is found on its original media, the home bit will be set. The "private" bit can be used by specific applications to trigger custom events.


The emphasis field is used as a flag, in case a corresponding emphasis bit was set in the original recording. The emphasis bit is rarely used anymore, though some recordings do still use it. Finally, the decoder moves on through the checksum (if it exists) and on to the actual audio data frame, and the process begins all over again, with thousands of frames per audio file.

5.3 ID3 Space
Tacked to the beginning or end of an MP3 file, "ID3" tag information may be stored, possibly including artist and title, copyright information, terms of use, proof of ownership, an encapsulated thumbnail image, and comments. There are actually two variants of the ID3 specification: ID3v1 and ID3v2, and while the potential differences between them are great, virtually all modern MP3 players can handle files with tags in


either format (though a few older players will have problems with ID3v2 tags). Not only are ID3v2 tags capable of storing a lot more information than ID3v1 tags, but they appear at the beginning of the bitstream, rather than at the end. The reason for this is simple: when an MP3 file is being broadcast or streamed rather than simply downloaded, the player needs to be able to display all of this information throughout the duration of the track, not at the end when it's too late. It's unfortunate that ID3 tags ever ended up being tacked onto the end of MP3 files to begin with; we'd be much better off if all MP3 files stored their ID3 data at the beginning rather than at the end of the file. As it stands, some MP3 players will simply give up if actual audio data is not encountered within the first few frames. While players developed to the actual ISO MPEG specification will know how to handle either type, the specification itself is unfortunately vague on this point. It simply states that a player should look for a "sync header," without specifying exactly where seeking should start and stop. This laxness in the spec has caused some controversy among developers of ID3-enabled applications, who naturally don't want their applications seeking blindly through 1GB image files, should the user happen to hand one to the application. Fortunately, the ID3v2 spec is more specific on the matter. One of the more interesting portions of the ID3 specification is the numerical categorization of types of audio. The numerical identifiers are stored in the ID3 tag, and typically mapped to the actual names via a picklist or another widget in the MP3 player or ID3 tool.

5.4 Frames per Second
Just as the movie industry has a standard that specifies the number of frames per second in a film in order to guarantee a constant rate of playback on any projector, the MP3 spec employs a similar standard. Regardless of the bitrate of the file, a frame in an MPEG-1 file lasts for 26ms (26/1000 of a second). This works out to around 38fps. If the bitrate is higher, the frame size is simply larger, and vice versa. In addition, the number of samples stored in an MP3 frame is constant, at 1,152 samples per frame. The total size in bytes for any given frame can be calculated with the following formula:

FrameSize = (144 * BitRate / SampleRate) + Padding

where the bitrate is measured in bits per second (remember to add the relevant number of zeros to convert from kbps to bps), SampleRate refers to the sample rate of the original input data, and Padding refers to the extra byte added to the frame when the padding bit is set. For example, if you're encoding a file at 128 kbps, the original sample rate was 44.1 kHz, and no padding bit has been set, the total size of each frame will be 417.96 bytes: 144 * 128000 / 44100 = 417.96 bytes. Keeping in mind that each frame contains the header information described above, it would be easy to think that header data accounts for a lot of redundant information being stored and read back. However, keep in mind that each frame header is only 32 bits long. At around 38fps, that means you get roughly 1,223 bits per second of header data, total. Since a file encoded at 128 kbps contains 128,000 bits every second, the total amount of header data is minuscule in comparison to the amount of audio data in the frame itself.
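As a quick check of the arithmetic above, here is a short Python sketch (illustrative only; the constant 144 is simply 1,152 samples per frame divided by 8 bits per byte):

    def mp3_frame_size(bitrate_bps, sample_rate_hz, padding=0):
        """Size in bytes of one MPEG-1 Layer III frame (1,152 samples per frame)."""
        return 144 * bitrate_bps / sample_rate_hz + padding

    print(round(mp3_frame_size(128_000, 44_100), 2))  # 417.96 -> actual frames alternate 417/418 bytes
    print(round(44_100 / 1152, 2))                    # about 38.28 frames per second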


WinAmp can even be controlled with a standard $25 universal remote control. You'll need to purchase a separate infrared receiver to connect to your computer, and then install a corresponding IR plug-in to process the invisible signals it generates. This is the kind of connectivity that people are talking about when they refer to "convergence": the computer becoming the center of the home infotainment system. See Ampapod (www.v.nu/core/ampapod/) or Irman (www.evation.com/irman/) for more information. Other general plug-ins let you output your current playlist to an HTML file for use with SHOUTcast or icecast, some let you operate WinAmp controls from the Windows TaskBar tray, or even control WinAmp over a local area network (LAN).

6. Listening to MP3 Streams

MP3 files don't necessarily have to be downloaded to your hard drive. In many cases, you'll be able to play them back in your favorite MP3 player directly from the server on which they're located. There are several ways in which MP3 webmasters can dish up files for "streaming." As an end user, you won't need to worry much about the particular streaming technique in use, though it's interesting to know the difference. In some cases, you may have to tweak a few settings in your browser or player to make sure streamed files are handled by your operating system properly.

6.1 Types of Streaming
There are two primary ways in which MP3 files can be streamed to users without being downloaded: MP3-on-demand and MP3 broadcast.

6.2 MP3-on-demand
In this form of streaming, control of the download is in the hands of the MP3 player, rather than the browser. Because this capability is built into most MP3 players, users can choose at any time to listen to an MP3 file directly from a web server, without saving it to their hard drives first. Of course, this assumes that the user has sufficient bandwidth to listen to the file in real time without it skipping or halting, but we'll get to bandwidth issues later. If you have a fast Internet connection, look around in your player's menus for an option labeled something like "Open Location" or "Play URL" and enter the URL of any MP3 file on the web. The easiest way to get this information is to right-click a link to an MP3 file in your browser and choose "Copy Link Location" from the context menu, then paste the URL into the Open Location dialog in your player. In addition, MP3-on-demand can be forced by the webmaster, so that clicking a link normally will cause MP3 files to be pulled down by the player and played directly, rather than saved to the hard drive as with a normal download. To do this, the webmaster creates an "M3U" (MPEG URL) playlist file, which is a plain text document containing the full URL to an MP3 file (or list of files) on a web server. Because the text file is tiny, the browser can download the M3U file to the user's hard drive nearly instantaneously. The web server sending the M3U file should (if it's configured correctly) dish it up with the MIME type audio/x-mpegurl. This MIME type should, in turn, be associated in the user's browser or operating system with a preferred MP3 player capable of handling MP3 streams. Once the M3U file is downloaded, it's launched in the preferred MP3 player, which reads URLs out of the file and takes over control of the actual download.
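As an illustration (the server name and file names below are made up), an M3U playlist is nothing more than a plain text file listing one URL or path per line:

    # playlist.m3u - served with MIME type audio/x-mpegurl
    http://www.example.com/music/track01.mp3
    http://www.example.com/music/track02.mp3

When the browser hands this small file to the MP3 player, the player fetches and plays each listed URL in turn.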


Note the difference here: when you download an MP3 file normally, the browser itself handles the entire download, and users then have to launch the MP3 file in an MP3 player manually. The MP3 file is stored on the user's hard drive for future use. With MP3-on-demand, the MP3 player handles the download, not the browser. The MP3 player plays the file as it's being downloaded, not later on. And unlike a standard download, the MP3 file is not present on the user's system after they've finished listening to the track. The advantages of using the MP3-on-demand technique are:

• The user does not have to wait for the download to complete before beginning to hear music.
• The user has more control over playback than with real streaming (e.g., the user can skip around between songs or fast-forward through songs at will).
• The webmaster has a degree of protection against MP3 files being stored permanently on the user's system.
• The publisher or webmaster does not have to set up any special MP3 serving software; by having the MP3 player "suck" the file down, a plain vanilla web server running on any operating system is capable of serving up MP3-on-demand.

Because MP3-on-demand offers so much flexibility to both the webmaster and the user, it may eventually become more popular than true MP3 streaming if and when all of us have lots of bandwidth. As long as our bandwidth is limited, however, true streaming solutions will continue to outweigh on-demand systems in popularity. An important aspect of the MP3-on-demand technique is that it's "asynchronous," or outside of time. In other words, it doesn't matter what time of day the user accesses the file; he'll hear it from the beginning. This is very different from TV, radio, or MP3 broadcast, where you get whatever is being broadcast at the moment in time when you tune in to the channel. For this reason, the MP3-on-demand technique is also sometimes referred to as "pseudo-streaming". A good example of MP3-on-demand can be found at MP3.com. Access any artist's page and click one of the Hi-Fi or Low-Fi links. Rather than being prompted for a download location, your favorite MP3 player will be launched and the file should start playing immediately.

6.3 MP3 broadcast
In contrast to pseudo-streaming, MP3 broadcasting (or real streaming) is "synchronous," and thus more akin to TV and radio broadcasting. In this case, the user tunes in to a channel or station which is playing an MP3-encoded bitstream much like a radio station, sometimes complete with live announcements and commercials. The person running the MP3 server is running the show in real time, and the listener only hears the portion of the show currently being dished up. When you tune in to an MP3 broadcast, you can't just pick an arbitrary tune from the show, any more than you can with radio. Running an MP3 broadcast station is a fairly complicated matter, and you'll learn all about that in Chapter 8. As a user, however, listening to streamed MP3 is rarely more than a matter of point and click. To find MP3 broadcasts, check out sites such as www.shoutcast.com, www.icecast.org, www.radiospy.com, www.mycaster.com, www.greenwitch.com, or www.live365.com and you'll find dozens (or hundreds) of


ongoing broadcasts. If your system is configured properly, clicking a link to a stream in progress will cause your MP3 player to be launched and (after a short delay) that stream to begin playing. If it doesn't, see the following section, "Configuring Your System to Handle Streaming MP3." Real MP3 streams are usually sent as .pls files (MIME type audio/x-scpls), rather than M3U. The difference between these two playlist types is described earlier in this chapter. The advantages of real-time MP3 streaming are:

• Much greater control for the webcaster (voice, live mixing, etc.)
• The webcaster can send a stream to many people without needing tons of bandwidth on the playback machine (though they still need access to a server with a fast connection)
• Difficult for the user to save the MP3 bitstream to hard drive
• Optimization of the bitstream for various client bandwidths

To receive MP3 streams from MP3 broadcasts or pseudo-streams, your player must be capable of managing downloads and buffering streams over the Internet on its own. The vast majority of popular MP3 players are stream-enabled, even many of the command-line players for Unix/Linux. The biggest concern for most users, of course, is the speed of their Internet connections. If you're on a slow connection and the stream being served up (or pulled down) carries more bits per second than your modem is capable of delivering, you'll experience choppy, halting playback. This can be mitigated somewhat by two solutions. On the client (user) side, a process called buffering can be used. In the buffering process, the MP3 player grabs a good chunk of data before it begins to play, and continues to read ahead in the stream. The music being played is thus delayed by a few seconds. The slower the connection, of course, the larger the buffer required. Theoretically, one could utilize a buffer so large that the entire song was downloaded before a second of music was played. This would guarantee perfect playback over even the slowest connections, but would undermine the advantage of listening to streams. Most users, however, require more modest buffer settings. If you find that your MP3 streams are skipping or pausing as they're played, dig around in your player's options and preferences for something like "Streaming Preferences." In WinAmp, tap Ctrl+P to bring up the preferences screen and navigate to Plugins - Input - Nullsoft MPEG Audio Decoder. Click the Configure button and select the Streaming tab, where you'll find an array of buffering options. Most likely, you'll just want to change the numerical value for kilobytes of prebuffered audio (try increasing it by 25% or 50% for starters). You can also control how much of a track will be grabbed before a single byte is played. On the server side, webmasters can do a number of things to make things easier for their modem-connected users, including downsampling MP3 files to lower frequencies, encoding files at lower bitrates, and sending mono, rather than stereo, streams over the Internet. All of these, of course, degrade sound quality, but are necessary to deliver acceptable streams to modem users. A sophisticated MP3 server will offer high-bandwidth and low-bandwidth options so users can select the best possible quality for their connection speed (such as the Hi-Fi and Low-Fi pseudo-streaming options found at MP3.com).
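A rough Python sketch of the arithmetic behind these choices (the connection speeds, bitrates and buffer sizes below are only examples):

    def stream_fits(stream_kbps, connection_kbps, headroom=0.8):
        """A stream is only comfortable if it needs less than ~80% of the connection."""
        return stream_kbps <= connection_kbps * headroom

    def buffer_seconds(prebuffer_kbytes, stream_kbps):
        """How many seconds of audio a given prebuffer holds at a given bitrate."""
        return prebuffer_kbytes * 8 / stream_kbps

    print(stream_fits(128, 56))              # False: a 128 kbps stream will choke a 56 kbps modem
    print(stream_fits(24, 56))               # True: a downsampled 24 kbps mono stream is fine
    print(round(buffer_seconds(64, 24), 1))  # a 64 KB prebuffer holds about 21.3 seconds at 24 kbps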


AE24 – Loudspeakers and Amplifiers
1. Parts of the Moving Coil Loudspeaker
   1.1 Cone
   1.2 Suspension System
   1.3 Drive System
   1.4 Magnetic Drive System
   1.5 Magnetic Material

Loudspeaker Enclosures
   1.1 Open Baffle
   1.2 Sealed Enclosure - Acoustic Suspension or Infinite Baffle
   1.3 Vented Enclosure or Bass Reflex
   1.4 Acoustic Labyrinth
   1.5 Passive Radiators, Slave or Drone Cones

1. Indirect Radiating Moving Coil Loudspeakers
   1.1 Indirect Radiator or Horn Loaded Drivers
2. Piezoelectric Drivers
3. Speaker Terminology
4. Electrostatic Loudspeakers
5. Crossovers
   5.1 Types of Distortion
6. Filter Designs
   6.1 Slope Rate
   6.2 Phase Response
   6.3 Cut-off Characteristics
7. Passive Electronics
8. Advantages of Active Crossovers
   8.1 Flexibility
   8.2 Control
   8.3 Isolation and Impedance Independence
   8.4 Power Efficiency
9. Time or Phase Alignment

Power Amplifiers
1. Amplifier Maintenance
2. Amplifier Power Rating
3. Frequency Response
4. Power Bandwidth
5. Slew Rate of an Amplifier
6. Total Harmonic Distortion
7. Bridged Mono Operation
8. Considerations in Choosing an Amp
   8.1 Clipping
9. Types of Amplifiers
   9.1 Voltage Controlled Amplifiers (VCA)
   9.2 Operational Amplifier
10. Amplifier Applications
   10.1 Equalizers
   10.2 Summing Amplifier
   10.3 Distribution Amplifiers


AE24 – LOUDSPEAKERS AND AMPLIFIERS
Introduction
Loudspeakers are output transducers. Loudspeakers and microphones are extremely similar in principle; they differ only in design. Transducers have been treated earlier, so here we will just look at some of the transducer types used in loudspeaker construction:

a. Electromagnetic - the most common transducer type used; it works on the moving coil principle.
b. Servo-drive - a variation of the electromagnetic principle using an electric motor; this design is basically for bass frequency drivers.
c. Electrostatic
d. Ribbon
e. Piezo-electric

The transducer is attached to a diaphragm that moves the air to create the sound waves. This entire mechanism is called a driver. Speakers can be designed to be either direct radiating or indirect radiating. In a direct radiating moving coil loudspeaker the driver is in DIRECT contact with the free air around it. At best these are only about 8% efficient.

1. Parts of the Moving Coil Loudspeaker

1.1 CONE
Usually made from paper or cardboard which has been folded or moulded into a cone shape. Paper needs to be reinforced to minimize flexing. Paper cones are usually used for mid and low frequency drivers, but paper tends to colour the high frequencies. Synthetic cones are made from POLYPROPYLENE, BEXTRENE, COBEX and KEVLAR; cones are also made of aluminium. In general the desirable characteristics of a cone are that it must be stiff and light, so that it can move fast and responsively. High frequency domes are made of polypropylene, titanium, silk, aluminium and Kevlar.

1.2 SUSPENSION SYSTEM
This is made up of the SUSPENSION RING and the SPIDER. It is responsible for the guidance and excursion of the cone. It is important that the voice coil be kept centred and only move up and down and not from side to side; the voice coil must be centred to within 0.2mm in the voice coil gap. The spider also keeps dust and other foreign bodies out of the voice coil gap.

1.3 Drive System
The VOICE COIL is made from very fine rectangular COPPER wire. It is coiled around a small cylinder called the COIL FORM, which is usually made from MYLAR but is sometimes made from ALUMINIUM. It is then glued to the back of the cone. The VOICE COIL is attached to the connectors by LITZ wire, which is made from TINSEL woven with cotton. This is used because it is very flexible and will not damage the cone if they happen to touch. The DOME is simply a dust cover for the coil.

1.4 Magnetic Drive System
This consists of a ring of PERMANENT MAGNET glued between a top and back plate made from MILD STEEL. It is designed to provide a focused magnetic field across the VOICE COIL GAP. This is the gap between the MAGNET and the POLE PIECE where the VOICE COIL, on its coil form, is positioned. All the components must be aligned so that there is a SYMMETRICAL FLUX FIELD between the POLE PIECE and the MAGNET.

1.5 Magnetic Material
FERRITE: This is made from POWDERED CERAMIC material that is mixed with water then baked to the required shape. It is commonly used due to its lower cost of production.

ALNICO-V: This is a metallurgical alloy whose main constituent is COBALT. It is not used very often because of the price of COBALT.

SAMARIUM COBALT, NEODYMIUM: These are made from RARE EARTHS and are extremely expensive; they would be found in more expensive loudspeakers.

Heat is a major consideration when designing speakers because they will all create heat when they are in operation. In a driver, the magnet serves as a type of HEAT SINK for the voice coil. In a driver with a NARROW voice coil gap there is a hole in the large pole piece: the moving voice coil and dome create a pumping action, drawing air in and out of the voice coil gap via this ventilation hole. Ferro fluid is added to a high frequency driver's voice coil gap to dissipate the heat in the voice coil.

THERMAL LIMITATIONS: Creating SQUARE WAVES or running speakers for an excessive time can break down the resins and glues in the speaker and destroy it.

Loudspeaker Enclosures
The design of the enclosure is as important as the design of the speaker. Compliance is the ease with which a cone moves in free air.

1.1 Open Baffle
This can be described as a speaker standing in free air, mounted in no cabinet. (Drivers operating without a cabinet are also found in, e.g., HEADPHONES and MICROPHONES.) At best it is only about 5% efficient; the rest of the AC power is lost in heat and resistive losses associated with the cone suspension and air. This is because there are sound waves being produced not only from the front of the speaker but also from the rear of the speaker, thus there is potential for phase cancellations. The sound waves from the front of the speaker are 180 degrees out of phase with the sound waves from the rear of the speaker. The lower frequencies are more prone to phase cancellations than higher frequencies, because a low frequency will DIFFRACT or bend around the rear of the speaker to merge with the frontal waves and cause phase cancellation of the low range frequencies, whereas higher frequencies have a much more directional nature and will tend not to merge with each other.

1.2 Sealed Enclosure - Acoustic Suspension Or Infinite Baffle
Since there is a need to prevent the rear waves from combining with the front waves, a separating wall is used. This system consists of a completely sealed cabinet where the air inside acts as a suspension system or "spring". The trapped air will now control the motion of the driver's cone, resulting in less cone excursion, a lower distortion level and better transient response. That is on condition that the compliance ratio of the driver in the cabinet is α ≥ 3. Once the compliance ratio is α < 3 (the cabinet is bigger but the driver remains the same), the air trapped in the cabinet does not "linearize" the motion of the cone anymore.

The distortion level, cone excursion and transient response are then like those of a driver operating in free air. However, the benefit of mounting a driver in a cabinet is that a lower frequency response will be obtained, up to a certain limit. To minimize the internal resonance, polyfill is used to fill up the entire internal volume of the cabinet.

1.3 Vented Enclosure Or Bass Reflex
With this system the rear waves are used. The vented enclosure has a hole cut in the front of the cabinet. This hole or TUNED PORT is placed in a position so that it reinforces a particular set of frequencies. It is designed so that the frequencies that are to be reinforced appear at the front of the cabinet at a 0 degree phase angle with the same set of frequencies that are produced from the front of the speaker. For most vented enclosures, polyfill is used to line the internal surfaces of the cabinet only, to minimize resonance.

1.4 Acoustic Labyrinth
This works in the same way as the bass reflex, but it is normally used in small speaker enclosures. Because of the size of the smaller cabinets and the length of the wavelength of lower frequencies, a LABYRINTH arrangement is used to channel the lower frequencies to the front of the cabinet, again appearing at a 0 degree phase angle at the front of the cabinet.

1.5 Passive Radiators, Slave Or Drone Cones
This system works by placing another driver in the tuned port of a cabinet. This driver is not wired to anything. It is designed to move in sympathy with the rear waves. They might be placed at the back of a loudspeaker, e.g. the Mackie HR824 monitors, where the passive radiator's diaphragm measures 6" x 12".

1. Indirect Radiating Moving Coil Loudspeakers

1.1 Indirect Radiator Or Horn Loaded Drivers
With an INDIRECT RADIATOR the speaker diaphragm is coupled to the surrounding air by a device that functions as an ACOUSTIC TRANSFORMER. Since drivers have a high mechanical impedance (resistance to motion) and the atmosphere has a low acoustic impedance (resistance), direct radiators are not very efficient at converting electrical signals into acoustic sound waves. If a horn with a small diameter throat is placed in front of the driver, a high pressure area is created that matches more closely to the high mechanical impedance of the driver, and the driver actually performs better. Because of the conical shape of the horn, the high pressure sound wave is gradually transformed into a low pressure sound wave as it travels through the flaring throat of the horn, which more closely matches the atmospheric pressure. This is the principle of the indirect radiator or "horn loaded" system. The resultant sound pressure level is considerably higher than a direct radiator alone can produce; it is now about 45% efficient. This increased efficiency is somewhat offset by the fact that high frequencies, which are already more directional than low frequencies, become even more directional or focused. The use of multi-cellular horns and acoustic lenses can help to provide a wider radiation angle for high frequencies. A PHASING PLUG is incorporated in the design to alter the path length of the waves emanating from different parts of the diaphragm so that their phase is coherent.

1.1.1 Folded Horns
A straight horn designed for low frequency operation would have to be about 6 feet in length to accommodate the wavelengths of low frequencies. So, for better usage of space, the folded system was developed: low frequencies diffract easily and can move through the labyrinth with little problem. Folded horns are about 45% efficient.

Acoustic Lens
These are sometimes placed over the mouth of a horn to direct the sound in a specific direction. They appear as "radiator-looking" fins in front of some high frequency drivers, e.g. JBL loudspeakers. If a horn is designed well it is not needful to use an acoustic lens.

Multi-Cellular Horns
A horn will throw sound further than a direct radiator, but the radiation pattern or dispersion angle will be narrower. In an effort to widen the radiation pattern, MULTICELLULAR HORNS have been developed. These distribute the sound to 2 or more sections, resulting in wider radiation in the horizontal plane and sometimes the vertical plane.

2. Piezoelectric Drivers
When a crystal is mechanically deformed an electrical current is produced, as in a microphone; but if an electrical current is applied to a crystal, it changes its dimensions. The crystals used in PIEZOELECTRIC DRIVERS are called BI-MORPHS because they can change their shape in two ways depending on the polarity of the current applied to them. They are only used as high frequency drivers operating above 5 kHz; a piezoelectric driver naturally rejects frequencies below 5 kHz and has a high tolerance for heat. Its frequency range is 5 to 30 kHz. A piezoelectric driver's sensitivity is high, but its sonic quality is quite poor. They are often used in sirens and in some large P.A. applications, mainly in P.A. setups for paging purposes, due to their low sonic quality.


The Tannoy dual concentric driver, or co-axial driver, is actually constructed from two separate drivers: a mid/bass driver has a hollow pole piece with a high frequency driver mounted in the hole of that pole piece. The advantage, Tannoy claims, is that engineers benefit from this "point source" monitoring: phase coherence of the high and low frequency bands. The engineers will hear both bands from one point or source, unlike conventional loudspeakers where the drivers are placed apart. With separate drivers, both bands emanating from the two drivers only merge after a certain distance in front of the loudspeakers, and engineers might favour either one of the frequency bands if they monitor too closely, unlike with co-axial drivers.

a. Coincident acoustic point source reproduces the full audio signal from one point in space, ensuring tactile imaging and a true acoustic sound field over a wide listening area.
b. The injection moulded Tulip Waveguide forms the HF into a perfectly spherical wavefront, blending the HF with the output from the LF cone, for better imaging and smoother treble.
c. The stiff, lightweight Duralumin tweeter diaphragm is cooled and damped in magnetic fluid for true piston movement up to 30 kHz, ensuring smooth, detailed treble, whilst the diffraction ring ensures minimum HF diffraction.
d. The cone is injection moulded from a mineral-filled engineering plastic and its thickness is graded from the thin edge to the thick apex for high rigidity, yet it is still relatively lightweight.
e. The high rigidity cast chassis eliminates resonance and permits a large open area behind the cone, preventing unwanted sound reflections.
f. The nitrile rubber surround improves the linearity of the cone's movement, for a large dynamic range and excellent off-axis performance.

3. Speaker Terminology

Bandwidth: All of the frequencies falling between the points where the loudspeaker's output rolls off by -3 dB.

Compliance: The ease with which a speaker diaphragm moves.

Damping: A form of opposition (acoustical, mechanical or electronic) that serves to improve a speaker's transient response. A damping factor of above 100 is considered good for an amplifier.

Directivity: A measure of a speaker's output (SPL) at various angles.

Dynamic Range: A measure of the softest to the loudest sounds that a speaker can produce.

Efficiency: A ratio of the power input (watts) to the power output (dB SPL) of a driver.

Frequency Response: A measure of a speaker's relative amplitude at various frequencies within its bandwidth, within +/- 3 dB.

Resonance: Occurs in a speaker when the opposition of the surrounding air is at a minimum. All vibrating devices have a resonant frequency at which they vibrate most readily.

Sensitivity: A measure of a speaker's SPL at one meter when driven with one watt of pink noise (pink noise contains equal energy per octave).

Transient Response: A speaker's ability to follow and reproduce transient waveforms.
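The sensitivity figure above can be turned into an expected listening level with standard decibel arithmetic (the 10 log10 power relationship is not stated in these notes, and the 90 dB figure below is only an example):

    import math

    def spl_at_power(sensitivity_db_1w_1m, power_watts):
        """Approximate on-axis SPL at 1 metre for a given amplifier power."""
        return sensitivity_db_1w_1m + 10 * math.log10(power_watts)

    print(round(spl_at_power(90, 1), 1))    # 90.0 dB SPL with 1 watt
    print(round(spl_at_power(90, 100), 1))  # 110.0 dB SPL with 100 watts (+10 dB per tenfold power)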

4. Electrostatic Loudspeakers
An electrostatic loudspeaker is essentially a capacitor. A power supply polarizes a diaphragm made of mylar to several thousand volts relative to a pair of conductive perforated plates. A specially constructed transformer steps up the audio voltage from the power amplifier and drives the plates, so that the resulting electrostatic attraction and repulsion alternately pushes and pulls the central diaphragm. The push-pull action cancels non-linear distortion in the diaphragm.

5. Crossovers

5.1 Types Of Distortion
A loudspeaker is designed to perform optimally within a particular frequency range and, according to its impedance, to be maximally responsive to a certain power range from the power amp. Radical, or even minor, departures from these power and frequency specifications will lead to speaker distortion or clipping. Amplitude distortion occurs when the speaker experiences a power overload: the waveform is changed on its vertical axis. Intermodulation distortion occurs when two frequencies interfere to produce audible harmonic distortion. Frequency distortion is more subtle and can result from the lack of signal at certain frequencies, or from the inability of the speaker to reproduce those frequencies due to the speaker's lack of compliance in that range.

6. Filter Designs
Filters are also classified according to their design types, the names of which derive from the originators of the design. Three definitive and commonly used filter designs are Butterworth, Bessel and Chebyshev. Yet another type of filter is a variation of the Butterworth called the Linkwitz-Riley. Getting the right filter design for the job is the secret of transparent and efficient crossover networks. One way out is to select characteristics which are appropriate for the job at hand, in this case the simple three-way system we outlined previously.

6.1 Slope Rate
Since low frequencies can damage or destroy a tweeter, the high pass filter needs a steep slope rate to choke off frequencies below the cut-off. For example, the high pass filter needed to roll off bass frequencies from a tweeter will differ in slope rate from a low pass filter used to cut off 80 Hz and above from a subwoofer. A high slope rate will support the narrow dispersion characteristics of a tweeter; this way the tweeter can be precisely aimed at the listener. Narrow dispersion tweeters often have high cut-off points of around 5 kHz. Higher order slopes decrease the degree of overlap between frequency bands but often introduce phase problems into the drivers. Slope rate interacts with two other important aspects of a filter: phase response and cut-off characteristic. Each parameter choice influences the others; this three-way street forms the intersection of choices available for filters in crossover design.

6.2 Phase Response
Filters can often alter a driver's phase response. The driver will then be out of sync, so to speak, with the other drivers. Multispeaker cancellation and reinforcement effects may result, which can muddy the sound stage. Higher slope filters are more likely to alter phase response, especially high pass filters; phase shifts can lead to a drop in amplitude, change the frequency response and generally monkey with the system. First order filters are usually phase coherent, and gentle filters provide longer overlaps and generally avoid phase problems. One installer trick is to use a higher order filter which inverts by 180 degrees and then reverse the cables leading to the tweeter, thereby restoring phase integrity.

6.3 Cut-off Characteristics
A filter's cut-off characteristic refers to the variation in amplitude in the frequencies between the pass-band (frequencies the filter allows to pass) and the stop-band (frequencies the filter rejects). In general, the different filter designs exhibit different cut-off characteristics. First-order Butterworths have a gentle slope and are the least intrusive of any filter. The Butterworth has a flat pass-band and is referred to as a "maximally flat" filter. Bessel filters have a gentle cut-off slope with good phase performance. Chebyshev filters roll off fast with a non-flat pass-band; they are commonly used for low pass filters in sub-woofer applications. Linkwitz-Riley filters are quite expensive and usually find application only in professional and competition systems, but they're very useful.

So what filters are useful for critical audio? First and third-order Butterworths, as well as second and fourth-order Linkwitz-Riley filters, which have a flat response and are phase coherent. High pass filter duties upstream of tweeters require the steep 18 dB/octave slopes of third order Butterworth filters. The fourth-order Linkwitz-Riley filter, with its steep slope and flat frequency response, also makes an excellent tweeter crossover. Generally speaking, lower order slopes can be used with subs and midrange woofers; directionality and dispersion are no longer critical at the lower end of the spectrum, as bass frequencies generally seem to emanate from the centre of the sound stage. Unfortunately, higher order filters are usually more costly because they require more parts.

In summation, crossover filters are a small but crucial part of the audio chain. It is only when they do their job properly that the output from a quality multispeaker array can truly be realised.

Common Crossover Frequencies: 80 Hz, 500 Hz, 800 Hz, 1200 Hz, 5 kHz, 7 kHz
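For a first-order (6 dB/octave) passive filter, the standard textbook component values at a chosen crossover frequency are C = 1/(2πfZ) for a series capacitor (high pass) and L = Z/(2πf) for a series inductor (low pass). These formulas are not given in the notes, but a small Python sketch (assuming an 8 ohm nominal driver impedance) shows how the frequencies above translate into parts:

    import math

    def first_order_crossover(freq_hz, impedance_ohms=8.0):
        """Series capacitor (high pass) and series inductor (low pass) for a 6 dB/octave crossover."""
        c_farads = 1 / (2 * math.pi * freq_hz * impedance_ohms)
        l_henries = impedance_ohms / (2 * math.pi * freq_hz)
        return c_farads * 1e6, l_henries * 1e3   # return microfarads and millihenries

    c_uf, l_mh = first_order_crossover(5000)     # a common tweeter crossover point from the list above
    print(round(c_uf, 2), "uF,", round(l_mh, 3), "mH")   # about 3.98 uF and 0.255 mH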

7. Passive Electronics
The electronics are normally quite simple, as no active components are used. A capacitor tuned to the right capacitance will operate as a high pass filter at a particular crossover frequency, making an effective woofer/tweeter divide. Capacitors are known to ring at their resonant frequencies, so high quality components must be used; Mylar and polypropylene capacitors are expensive but sound the best. Tightly wound air-core inductors make good passive low pass filters for woofer/sub-woofer divides. The main problem with inductors is that the signal must pass through the windings, which unnecessarily prolongs its passage to the speakers. A passive low-pass crossover necessarily inserts at least one, if not several, large inductors in series with the low frequency driver. Many manufacturers sell speaker pairs with a high quality passive crossover as part of the package, and standalone units can also be purchased for installation with the speakers of your choice. Passive crossovers are best utilised to match woofer and tweeter, with an active crossover on the subs.

8. Advantages Of Active Crossovers

8.1 Flexibility
Passive crossovers are usually set within the speaker cabinet itself and are a hassle to change. Active crossover frequencies and slopes can be changed easily and continuously.

8.2 Control
Active crossovers are located close to other control electronics (preamps and mixers) so that system balancing can quickly be performed. Crossover frequency and level are as easily controlled as any other line level system change.

8.3 Isolation and Impedance Independence
Typical loudspeaker components vary widely in impedance across their intended band of use. A passive crossover is designed around components with specific impedances, and it is important that all the components in a passive crossover match the impedance that the amplifier will see from the speakers. If these impedances change (say in the speaker), crossover frequencies will change. A passive filter can also itself alter the impedance set-up between amp and speakers. In an active system, an impedance variation from a loudspeaker component is not felt upstream by the filter; the active system is therefore unaffected by system impedance changes. Thus the power system, and the amplifier and loudspeaker components, can easily be swapped over or added to and quickly compensated for with the crossover.

The series inductors mentioned above also mean a good length of extra wire at around 16 AWG, which can affect the damping factor of the driver for the worse, degrading transient bass response.

8.4 Power Efficiency
In most cases, using an active crossover implies using more than one power amp. This is more expensive, but it is ultimately far more efficient than having one amp power a full range of speakers, and it can actually mean using less powerful amps by distributing the power more efficiently. A single amplifier set-up, of say 50 watts RMS per channel, is typically fed a signal composed of varying amounts of low, mid and high frequency components. In a test set-up we feed the amp a single frequency sine wave and test its maximum output before clipping; the result should be close to the 50 watt spec. However, when separate high and low frequencies are combined, as in normal operating conditions, the total amount of power available will be less. This is due to the increased "crest factor", the relationship between peak and RMS outputs: more frequencies means more dynamic fluctuations, which means an increased overall average output. Low frequencies inevitably draw more power than highs. In a properly set-up multi-speaker system controlled by an active dividing network, each frequency range receives only the power it requires.

Bi-amping: one amplifier drives the high frequency driver while another drives the low frequency driver.
Tri-amping: one amplifier drives the high frequency driver, another drives the mid frequency driver, while another drives the low frequency driver.

9. Time Or Phase Alignment
The various components in the system (speakers, amps, crossovers) must reassemble and deliver the various frequency ranges acoustically to the listening position. If the physical position of these components causes different relative arrival times of the various signal bands, destructive phase effects will occur which can undermine the integrity and delicate balance of the sound stage. The easiest way to fix this time alignment problem is to ensure that voice coils mounted in speaker cabinets are physically aligned in the vertical/horizontal plane so that they deliver their load in the same plane and at the same distance from the listener, e.g. the JBL "buddha belly" nearfield monitors. The Tannoy dual concentric driver represents another alternative for solving time alignment problems. In a big hall where a second stack of speakers is physically located in front of the main stack, and therefore closer to the crowd, its signal is delayed with a delay line so that its speakers fire simultaneously with the arrival of sound from the rear stack. Hence the two sets of speakers reinforce one another and sound reaches the back of the hall as intended.
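The delay-line setting described above follows directly from the speed of sound (roughly 344 m/s at room temperature; the 20 m distance below is only an example):

    def delay_ms(extra_distance_m, speed_of_sound=344.0):
        """Delay needed so a forward speaker stack fires in step with the main stack behind it."""
        return extra_distance_m / speed_of_sound * 1000

    print(round(delay_ms(20), 1))   # a stack 20 m in front of the mains needs roughly 58.1 ms of delay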

Power Amplifiers
A power amplifier is a signal-processing component whose function is to increase the power of an audio signal. In sound systems, the power amplifier is always the final active component in the signal chain, located just before the loudspeakers. It takes the small signal from the mixing console and turns it into a large signal that will drive the speakers to high volume levels. An amplifier that dies can often destroy all the loudspeakers connected to it, so it is very important to look after amplifiers and check them regularly.

For professional sound reinforcement, an amplifier must be powerful, reliable and abuse-proof, with the ability to work to its maximum gig after gig. Previously, amplifiers were big and heavy pieces of equipment housed in racks that needed more than one person to lift; this ruggedness and toughness was usually translated as big and heavy amps. Nowadays, however, amplifier technology is advancing towards smaller, lighter and more powerful designs with smart power supplies and built-in failsafe protection circuitry.

Power amplifiers designed for professional applications are generally very simple in appearance when compared to many hi-fi amplifiers. They will usually feature an ON/OFF switch, level controls and sometimes meters. In P.A. systems, most power amplifiers' level controls are turned up to the maximum, and the input level going into each power amp is controlled by the mixing console's stereo output signals.

Whenever a power amplifier is driven hard it will get hot, even more so when it is dishing out a lot of current into low impedance loads. It must be given plenty of ventilation so that the heat produced is dissipated as quickly as possible. The airflow should not be obstructed by, for example, placing the amp rack up against a wall or curtains or inside a cabinet, or by stacking the amps on top of each other without adequate space between them. Some engineers will even place a fan to blow at the rear of the amplifier racks to dissipate heat more effectively. Amplifiers used in the studio might have heatsinks on the chassis instead of a fan for quiet operation, e.g. the Hafler P9505 and Alesis RA-100. If the amps have thermal protection devices within them, they will shut down the amps until they are cooled again; the electric fans inside them will still continue to work. Crown amplifiers have distortion intentionally added into the amp's output circuit so the

distortion becomes audible, indicating that the amp's thermal level is becoming dangerously high.

Amps are to be powered up last, in order to avoid sending through the P.A. system any clicks and thumps produced when the other equipment is switched on. Amps are switched on one at a time; switching all of them on at once can blow fuses in the venue or even cause the circuit breaker to "trip". Powering them up one at a time gives each amp time to charge up its power supply capacitors before the next one is turned on.

The back panel of an amplifier will feature the amp's audio outputs, usually five-way binding posts and/or Neutrik Speakon connectors for the loudspeakers, as well as controls such as high pass filter roll-off, mode switches and thru-lines.

1. Amplifier Maintenance
Even though amplifiers might have thermal protection devices within them:
1. make sure the fans work
2. clean the fan filter regularly
3. check and tighten all connections regularly
4. change the fuses regularly
5. always earth/ground the amplifier
6. keep a spare amp at hand
7. keep all audio equipment cool, clean and dry

2. Amplifier Power Rating
The power rating of an amplifier states the power that the unit will deliver into a specified load, at a specified distortion level and over a specified frequency range. An amplifier specification referred to as "X watts RMS into X ohms" can also be known as the continuous average power rating. Continuous power is the ability to do work continuously, as opposed to periodically (programme or peak power). Some manufacturers state the Continuous Power rating as Power RMS. This is the power rating specification an engineer should consider when looking into the power rating of an amplifier.

An amplifier's specification usually states its:
Continuous Power Rating: 100 watts into 8 ohms
Programme Power: 200 watts into 8 ohms
Peak Power: 400 watts into 8 ohms
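The three figures in the example spec above differ by a factor of two each time, i.e. 3 dB steps. A quick Python check (standard decibel arithmetic, not part of the original notes):

    import math

    continuous, programme, peak = 100, 200, 400   # watts, from the example spec above
    print(round(10 * math.log10(programme / continuous), 1))  # 3.0 dB
    print(round(10 * math.log10(peak / continuous), 1))       # 6.0 dB from continuous rating to peak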

3. Frequency Response
The frequency response of an amplifier is a measure of how accurately the output of an audio device is able to reproduce the input across the whole frequency spectrum. It is often given as a curve plotted on a graph, or it can be given as a figure. If given as a graph, the flatter the curve to the 0 dB reference line (which represents the input) and the wider the frequency range it covers, the better the frequency response is said to be. Although the audio frequency spectrum is 20 Hz to 20 kHz, the frequency response graphs of many power amplifiers extend beyond these limits, e.g. 5 Hz to 50 kHz, to ensure flat frequency and flat phase response across the audible frequencies. If given as a figure it will typically read:

Frequency Response: 5 Hz - 50 kHz, +/- 0 dB, RL = 8 ohms, Power Output = 1 watt

A power amplifier's frequency response is usually measured at a 1-watt output level. Such a low level will not indicate the true power bandwidth performance of an amplifier; the power bandwidth tells us more about the unit's performance.

4. Power Bandwidth
The power bandwidth of an amplifier is a measure of its ability to produce high output power over a wide frequency range, defined as the frequency range lying between those points at which the amplifier will produce at least half its rated power before clipping. It is specified as a numerical bandwidth (so many kHz) or can be shown as a graph. The curves in power bandwidth graphs, even though they resemble a frequency response curve, do not in fact represent the frequency response of the amplifier; they represent the maximum power output plotted against frequency. The power bandwidth can affect the frequency response of the amplifier: even though an amplifier might show a very wide, flat frequency response at low power levels, if its power bandwidth is limited, the response may collapse at the frequency extremes, i.e. the high frequency range, when it is driven to maximum power.

The power bandwidth of some modern output-transformerless transistor amplifiers is generally considered excellent. These amplifiers exhibit consistent frequency response at both low and high levels, and they reproduce programme material at high power with better fidelity when compared with older transformer-coupled designs.

5. Slew Rate Of An Amplifier
The Slew Rate is a measure of an amplifier's ability to respond to very fast changes in signal voltage. Assume there is an instantaneous step change in the signal voltage at the input of the amplifier. The amplifier will try to replicate the input step change at its output as accurately as possible, except at a higher voltage. Unfortunately, due to the inherent speed limitations of analogue circuitry, the amplifier's output voltage change will occur somewhat more slowly than the input step. This lag in speed appears as a steep ramp in relationship to the instantaneous step change at the input. The slope of the ramping output voltage is called the SLEW RATE of the amplifier. Slew rate is specified in volts per microsecond (1 microsecond is 1/1,000,000th of a second). The slew rate of an amplifier can affect its ability to accurately reproduce musical transients and complex waveforms at high volume levels. This is a very important specification because sharp musical transients usually occur on peaks, where the power demand is at its greatest. Usually, the higher the amplifier power, the higher the slew rate must be.

As a rule:
Low Power Amplifiers (up to 100 watts) should have a slew rate of at least 10 volts/microsecond.
High Power Amplifiers (above 200 watts) should have a slew rate of at least 30 volts/microsecond.
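The volts-per-microsecond figures can be related to the signal with the standard sine-wave relationship (maximum slope = 2πf x Vpeak); this derivation is not in the notes, but it shows why higher powered, higher voltage amplifiers need faster slew rates:

    import math

    def required_slew_rate(freq_hz, v_peak):
        """Minimum slew rate in V/us needed to reproduce a sine of the given frequency and peak voltage."""
        return 2 * math.pi * freq_hz * v_peak / 1e6

    # About 28.3 V peak corresponds roughly to 50 W RMS into 8 ohms; 20 kHz is the top of the audio band
    print(round(required_slew_rate(20_000, 28.3), 2))   # about 3.56 V/us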

6. Total Harmonic Distortion
DISTORTION is any unwanted change that occurs in an audio signal. Harmonic distortion is made up of one or more signal components that are whole-number multiples of the input frequency. For example, if a 100 Hz sine wave is applied to the input of an amplifier and the output of the amplifier contains not only 100 Hz but also its upper harmonics of 200, 300, 400 and 500 Hz, the output can be said to contain 2nd, 3rd, 4th and 5th harmonics. These harmonics are distortion, since they are not part of the original input signal. Harmonic distortion can be caused by clipping or by the design limitations of an audio device. The T.H.D. measurement is often given as a percentage of the relative level of the harmonics at the output of the device compared to the primary input signal. Most audio devices, e.g. power amplifiers, should show distortion specifications of 0.05% or less.

7. Bridged Mono Operation
In the power specification of a power amplifier you will often come across a specification that states Bridged Mono Operation. The specifications might read:

Continuous average sine wave power at less than 0.05% THD, 20 Hz - 20 kHz:
Stereo, 8 ohms, both channels driven: 240 W/ch
Stereo, 4 ohms, both channels driven: 400 W/ch
Mono, 8 ohms: 800 W

The mono specification refers to the amplifier's power capability in bridged operation. When a power amplifier is bridged, the signal is fed to both amplifier channels from the left input, and the signal polarity of the right channel is reversed relative to the other channel. Both halves of the stereo amplifier now process the same signal, and the amplifier becomes a single channel unit working in a push-pull fashion. With both channels used, the RMS voltage across the load for a given input signal level is effectively doubled compared to what it would be if the load were connected across one channel only. The load is connected to the two HOT output terminals on the rear of the unit: the left channel output is the positive (+) connection and the right is the negative (-). Bridging mode is selected by a switch on the rear panel of an amplifier. Only change the amplifier mode after the amplifier has been turned off.


The power output is doubled.

8. Considerations in choosing an amp
When choosing an amplifier for a loudspeaker system, consider a number of factors. The amplifier must be able to drive the collective load of the loudspeakers, whether they are connected in a series, parallel or series-in-parallel connection; the impedance of each individual loudspeaker and the NET load impedance must be calculated. It is important that the power rating of the amplifier not be too low, otherwise you will not be able to utilize the full SPL potential of the loudspeaker system. It is also important that the power rating of the amplifier not be significantly more than the loudspeaker can handle, because it becomes easy to destroy the loudspeaker with excess power, or with excessive excursion causing the driver to "bottom out".

8.1 Clipping
A low power amplifier with inadequate power can actually damage loudspeakers by being driven into clipping. Clipping occurs when an amplifier is asked to produce output beyond its designed limits. DC voltage is produced when a sine wave or complex wave (AC) is cut off at the electrical ceiling and floor of an amplifier. The DC voltage will heat up the voice coil of the loudspeaker, causing the glue holding the voice coil to melt. When an amplifier undergoes severe clipping, the voice coil itself will exceed its thermal limitations and expand; this expansion will cause it to get stuck in the voice coil gap.
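A minimal Python sketch of the net load calculation mentioned in section 8 above (purely resistive nominal impedances assumed):

    def series(*z):
        """Net impedance of loudspeakers wired in series."""
        return sum(z)

    def parallel(*z):
        """Net impedance of loudspeakers wired in parallel."""
        return 1 / sum(1 / x for x in z)

    print(series(8, 8))                           # two 8-ohm speakers in series   -> 16 ohms
    print(parallel(8, 8))                         # two 8-ohm speakers in parallel -> 4.0 ohms
    print(parallel(series(8, 8), series(8, 8)))   # series-in-parallel, four 8-ohm speakers -> 8.0 ohms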


Thus a part of the output is applied out-of-phase to the input reducing the output. the analogue signal is proportionately attenuated. high-bandwidth amplifier with a high input impedance & a low output impedance. Due to these qualities the op amp is used in a variety of audio & video applications by just adding additional components to the basic circuit to fit the required design needs. In order to reduce the gain of an Op Amp to more stable workable levels. 9. 429 .2. This has the additional effect of stabilizing the amplifier and reducing distortion. VCA 's are commonly used for console automation & automated analog signal processors.1 Voltage Controlled Amplifiers (VCA) In a VCA the programme audio level is a function of a DC voltage (generally ranging from 0 to 5 Volts) applied to the control input of the device. A common building block for audio use is the operational amplifier or op amp. The op amp is a stable high gain. It is also found in Voltage Controlled Automated Equalizers. 9. An external voltage is used to change the audio signal level.SCHOOL OF AUDIO ENGINEERING AE24 – Loudspeakers and Amplifiers Student Notes 9. a negative feedback loop is used. Types of Amplifiers 9.2 Operational Amplifier Large-scale integrated circuits have been developed for compactness & versatile applications. As the control voltage is increased (in relation to the position of the fader).1 Negative Feedback Negative Feedback is a technique used that applies a portion of the output signal through a limiting resistor (which determines the gain) back into the negative or phase reversing input terminal.

10. Amplifier Applications

10.1 Equalizers

An EQ is a frequency-dependent amplifier. Equalization is mainly achieved through the use of resistive and capacitive networks located in the negative feedback loop. By changing the circuit design, any number of equalization curves can be achieved.

10.2 Summing Amplifier

This is also known as an active combining amplifier. It is designed to combine any number of discrete inputs while providing a high degree of separation/isolation between them. The summing amplifier is a very important component in audio console design because of the great amount of signal routing, which requires total isolation in order to separate each input from all the other inputs while still maintaining signal-control flexibility.

10.3 Distribution Amplifiers

The distribution of audio signals to many devices or signal paths is often necessary, such as in headphone distribution paths or to various crossovers and/or amplifiers in a live sound reinforcement setup. Where increased power is needed, a distribution amplifier is used. This type of amplifier splits the signal into 4 to 16 outputs, and each output has its own level control. The outputs can be either mono (A or B signal) or stereo (A + B), depending on the user's selection.
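One common way to realise an active combining amplifier is the inverting summing stage, where each input feeds the op amp's virtual-earth point through its own resistor. The sketch below is a generic textbook arrangement, not a description of any particular console, and the resistor values are hypothetical.

    def summing_output(input_voltages, input_resistors, r_feedback):
        # Inverting summing amplifier: Vout = -Rf * (V1/R1 + V2/R2 + ...).
        # The virtual-earth input is what keeps the sources isolated
        # from one another.
        return -r_feedback * sum(v / r for v, r in zip(input_voltages, input_resistors))

    # Three 1 V sources through 10 kilohm resistors, 10 kilohm feedback resistor:
    print(summing_output([1.0, 1.0, 1.0], [10e3, 10e3, 10e3], 10e3))   # -3.0 V

Because each source drives a point held at virtual earth, changing one input has essentially no effect on the others, which is the isolation property the text describes.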


AE25 – Live Sound Reinforcement

1. Basic Concept
2. Signal Flow
3. Microphones
4. Cables/Lines
4.1 Stage/Junction Box
4.2 Multicore/Snake
5. Front of House Console
6. Graphic Equalizer
6.1 Tuning The Monitor
7. Limiter
8. Active Crossover
9. Power Amplifier
10. Loudspeaker
11. Monitors on Stage
12. Front of House PA System
13. Sound Check
13.1 Drums
13.2 Vocals
13.3 Deadspots - Wireless Mic
14. Achieving A Balanced Live Mix
14.1 Learn From Mistakes
14.2 Console Placement
14.3 Loudspeaker Placement


AE25 – LIVE SOUND REINFORCEMENT

1. BASIC CONCEPT

The objective of a live sound setup is to take a band of musicians and selectively amplify their performances (instruments) so that they can be heard by an audience, whether in a small club or through complete amplification of all instruments at an outdoor concert.

2. Signal Flow

Mics/direct injection (D.I.) -> mic lines -> junction/stage box -> multicore/snake -> console/desk -> effects units (FX) -> graphic EQ -> peak limiter -> crossover -> power amplifiers -> loudspeakers

3. Microphones

Choose mics that are rugged, as they have to withstand constant usage and handling and will be subjected to rougher treatment than in a studio. They are usually dynamic microphones, as these can withstand high sound pressure levels before distorting due to overloading. They are usually of cardioid polar pattern to minimize feedback, as it offers good off-axis rejection of spill or noise. Their frequency response depends on the application, whether they are used for vocals or instruments.

Wireless microphones (Sennheiser EW 135, EW 345 and EW 565) are popular because there are no mic cables restricting the movement of the performers. True diversity is commonly used: the transmitter in the microphone handle casing emits two frequencies, and the receiver has two antennas to tune into them. The receiver will always switch to the stronger signal in order to prevent a drop in signal level during the wireless transmission. Place the receiver at the side of the stage and connect its output to the junction box.

4. Cables/Lines

Use good quality, low impedance (Z), balanced cables.

4.1 Stage/Junction Box

It is usually made of aluminium, with mic inputs and outputs for monitors and FOH loudspeakers, and it may be built into a portable case. It may contain parallel split outputs for the foldback console in a large live sound reinforcement setup. If the stage is far away from the console and there is a danger of signal loss, the stage box is linked to the console by the multicore/snake.

4.2 Multicore/Snake

The multicore contains multiple numbers, from 8 to 24, of shielded pairs of cables (balanced, low impedance). It is used to connect the stage box to the FOH console. Some balanced cables in the snake are used to send signals to the console from the stage, while others are used to send signals from the console to the stage, e.g. signals for monitors on stage and stereo buss signals to the FOH amplifiers. Some are used for effects sends while others are for monitor sends.

The snake has been termed the "life line". Care must be taken when using and storing it to prolong its life, as it is invaluable in live sound. If a cable within it is damaged, repairing or replacing it, even if possible, would be expensive, making it uneconomical.

5. Front of House Console

A split monitor console is usually used because of its ability to group channels together, thus giving more control over the live mixing situation.

Gain: Good, clean mic and line gain is important on the desk. Should the built-in pre-amp be noisy, that noise will be sent on to the rest of the signal flow chain after it.

Headroom: +24 dB is good.

Pad: attenuation button, -20 to -30 dB.

EQ section: shelving high, sweepable mid, shelving low. Anything more than this in the EQ section is useful, giving the engineer more parameters to work with. Remember to do subtractive EQ'ing rather than additive unless there is no other way around it.

Mute buttons: these make a live mix clean when enabled on channels that have no signal in them yet; if the channels are left unmuted, their noise or spill will enter the stereo buss. Look for the ability to form at least 2 mute groups, e.g. on at least 10 channels for drums with high SPL.

Solo function (non-destructive): offers Pre Fader Listen (PFL) and After Fader Listen (AFL); essential when doing a discreet sound check over headphones.

Groups/Sub groups: a minimum of 4 sub groups; more is better.

Auxiliary section: more than 4 pre/post auxiliaries is good; more is better.

Metering indicators: stereo buss.

External peripherals: reverb units, digital delays, compressors/limiters, gates, de-essers.

6. Graphic Equalizer

A 1/3-octave, 31-band graphic EQ is often used in a live sound setup. It is used to tune the PA system to the concert venue (in conjunction with pink noise and a real-time spectrum analyzer), because the venue's poor acoustics will colour the reproduced sound. In other words, it can be used to help correct the PA system's frequency response deficiencies; the FOH stereo buss is fed directly into it for this purpose. The graphic EQ is also often used to eliminate feedback on stage by attenuating problem frequencies (peaks), and any nulls (dips) in the frequency response can likewise be boosted. A flat frequency response would be the ideal result in both situations.

6.1 Tuning The Monitor

The mic is secured to a mic stand on stage next to its monitor, and the graphic EQ connected to the monitor is set flat. The monitor's amp is turned to full output level. Push the mic channel fader to 0 dB and turn the aux level in the mic channel to 12 o'clock. Turn the mic channel pre-amp level up slowly. Eventually some frequencies will start to ring as they are about to go into feedback; just before it does, the mic pre-amp is turned down to the level where the ringing stops. On the graphic EQ, boost the suspect band: when the problem frequency band is boosted, the feedback will start. Pull that frequency band level down to about halfway down the slider. Proceed to the neighbouring bands on the EQ and see if they are causing feedback too; if they are, pull down their levels on the graphic EQ. Once those problem frequencies are taken care of, the mic pre-amp can be turned up higher, and the process of boosting and attenuating bands on the graphic EQ is repeated. Thereafter the mic pre-amp can be turned up higher still, until the next set of problem frequencies starts another round of feedback. In other words, there is more gain before feedback. The time to stop tuning the monitor is when the feedback consists of many frequencies, because the monitor's frequency response is then essentially flat, or when a problem frequency keeps feeding back and its band on the graphic EQ cannot be lowered any more because it is already at maximum attenuation.

7. Limiter

A limiter is mainly used for PA system protection in case of a signal overload anywhere in the signal flow chain, e.g. a mic dropping onto the stage. Set the threshold to about 20 dB above 0 dB and set the ratio to 10:1 or higher with the fastest attack time setting. A peak limiter with a fast attack time is preferred over an RMS limiter, which measures average signal level and would let high transients through into the system; it might therefore not be able to offer protection to the system. Nevertheless, using an RMS limiter is still better than having no limiter or protection at all. The limiter will level out the transient before it reaches the power amplifiers and loudspeakers.

8. Active Crossover

Most crossovers used in live sound are active ones. A crossover is a device that takes the audio frequency spectrum of 20 Hz - 20 kHz and divides it into specific frequency bands, for example:

2 bands - Lo *(1.2 kHz), Hi
3 bands - Lo *(200 Hz), Mid *(1.3 kHz), Hi
4 bands - Lo *(250 Hz), Lo-Mid *(1 kHz), Hi-Mid *(6.2 kHz), Hi

*Approximate crossover frequency point/s

Some crossovers have variable crossover frequency points, as well as adjustable roll-off, delay and phase.

9. Power Amplifier

A device that takes the low voltage audio signals at its inputs and amplifies them to a higher voltage at its outputs.

10. Loudspeaker

An electrical/acoustical transducer.

11. Monitors on Stage

After the monitors have been tuned via the "ringing the monitors" method, there is a possibility that the mid and high frequencies have been attenuated, and even after tuning the monitor may sound dull or hollow. Restore these frequencies by pushing up their respective bands on the graphic EQ; note that the monitor will then have less gain before feedback.

In-ear monitoring is becoming popular, e.g. systems by Shure and the Sennheiser EW 300-IEM. It consists of a transmitter belt pack that the performer carries, with the earphones connected to it. These systems come with a limiter for protection against excessive sound pressure levels. In-ear monitoring allows the performer to move about the stage without the need to stay close to the monitors. Furthermore, there is no need to be concerned about feedback from the monitor or about tuning it.

12. Front Of House PA System

Speak into each mic: "Check One, Two." It must sound clear and crisp. Put the mic next to your face and cover it with your hand; the polar pattern of the mic changes from cardioid to omnidirectional, and this is done to check for any feedback. Go to reflective surfaces on the stage, again to check for feedback, and likewise do not lean against a side fill or monitor. Inform the vocalist and backup vocalists to point the mic away from the foldback or monitor wedges in order to prevent feedback. Always have a spare microphone connected and ready for use, and inform the performers/musicians where it is in case of need. Remember to have spare batteries for this equipment, including the wireless microphones.

13. Sound Check

For the sound check, ask the band to play 16 bars. Start with the drummer, then the bass guitarist, keyboardist, lead guitarist, backing vocalists and vocalist, but not all at one time. Take them individually for the sound check, and most importantly one at a time. If they all play at the same time, the engineer will have a much harder time trying to get their gain, EQ, reverb etc. correct; in other words, the sound check will be less effective. Once they start playing, chances are they will not stop, because they love hearing themselves being amplified. Check the style or mood of their numbers so you can act accordingly by adjusting the parameter settings of the various equipment required. Explain the situation to them before the sound check: for the engineer to help them sound good, their help is needed. Everyone works as a team. Diplomacy is important, as it may secure more jobs in the future.

Request them to play their loudest and softest numbers to check the dynamics, and a fast number and a slow number if they have any. Ensure everything works fine and that the instruments are in tune; if they are not tuned right or simply sound bad, processing the signal in order to make them sound good will take too much time and equipment. If there is a problem with a lead, go to the next channel: get everything that is working out of the way first, save problems till last, and only then go back and fix them up. Keep things moving right along so that the musicians do not get bored waiting for you to set something up.

13.1 Drums

Drums are one of the hardest things to get right. Listen to each drum individually and adjust the console EQ until the desired sound is obtained. There is rarely a need to boost low frequency; more often you will cut it back considerably. Watch for boominess at 200 to 300 Hz and cut it back in the lo-mids. Gate the drum signals, because gates tighten up the sound a lot: as each mic only picks up its own drum there is no spill, and in a stereo mix they make the mix more open and spacious. Don't set the channel gain too high. Check it on the PFL/SOLO meter and set the channel gain so that the meter just hits 0 dB on loud parts; the signal level can always be increased later once the band plays something. Kick and snare can run just up to 0 dB, but keep all the others bouncing up to about -6 dB.

13.2 Vocals

Vocals need to be clean, crisp, loud and clear. You need a bit of spare level with vocals, since they can range from a whisper to a shout, so a few lines sung at a high level are essential: "Check 1-2" is not enough. You need to know how the voice will sound when the adrenalin is pumping on stage, so get the singer(s) to sing into the microphone so you have an idea of what they are like. Set the channel gain with the channel fader at about ¾ of the way up, and make sure the mic doesn't feed back when you push the fader all the way up. If there is a high pass filter or low cut switch on the channel, switch it ON; this will roll off the low end, take away annoying stage noise and give a cleaner vocal sound. Spend some time getting the channel EQ sounding right.

13.3 Deadspots - Wireless Mic

If wireless mics are used, someone should speak into one and walk around the stage to see if there are any potential dead spots. Mark these areas with masking tape to indicate that they are out-of-bounds locations, and inform the performers about them.

14. Achieving A Balanced Live Mix

Inform the musicians that their individual levels during the performance should be no louder than during rehearsal, e.g. electric guitar amps. If they fail to follow this, the balance of the mix will be thrown off during the performance; it will be harder for the engineer to correct the balance of the live mix, and it may not sound as good as during rehearsal. Reassure them that the live mix should be left for the engineer to handle.

A good mix is one where all the instruments can be heard. They must not get buried in the mix unless that is the style of the music, e.g. Junk Funk, Rubbish, Garbage and Trash etc. Give the audience the mix they want and paid for, and consider this whenever there is a live sound setup at any venue. For example, at a Phil Collins concert the audience wants to hear his voice and the drums, since Phil is known for his drumming and his powerful gated-reverb drum sound. At an Eric Clapton concert the live mix will emphasize his vocal and especially his guitar playing, and not really the drum sound - unless it is Phil Collins on the drums with Eric Clapton in a duet.

The focus for most songs is the vocals. The main vocal will be slightly louder than the overall music, at about the same level as a guitar or keyboard solo; it should be layered just slightly on top of the music. Chinese pop music is usually mixed with the vocal louder than the music, and solo artists working with different bands will usually also have their vocal level louder than the music. A little boost at 3.6 kHz will put some "bite" into the vocals to help them cut through the mix. Gate the backing vocals if possible to make the mix cleaner; if there are no gates available, "gain ride" the backing vocals. Remember to turn up the level for a guitar or any other instrumental solo in the mix, and be mindful and quick to turn a solo's level up, because solos are fleeting. Gain riding can be done properly for any instrument only if the engineer is familiar with the songs.

14.1 Learn From Mistakes

Mistakes will be made during a live mix. Fortunately, mistakes made during live sound are soon forgotten, because those moments are fleeting unless they are recorded onto tape. Since a live sound level is loud, it may cover up the mistakes made, and they will not be as obvious as, say, a slightly imbalanced mix. The most important thing is that the concert goers hear music from the loudspeakers; the worst thing that can happen is no sound at all, or sound coming from the loudspeakers that is unclear or distorted. Most importantly, learn from the mistakes and try not to repeat them, improving your live mixing technique along the way. It might be a good idea to record the live mix onto cassette or MD for one's own reference and use it for gauging and correcting future live mixes.

14.2 Console Placement

The equipment and its layout will always depend on the venue. The console must be placed away from the crowd if possible, and at least 10 - 12 feet away from the FOH loudspeakers if possible.

The exception is the dance floor: do not place the console there unless the management of the venue insists on it, because it is bound to get knocked into. Place barricades around the console to signify that it is out of bounds. No food or drinks should be allowed at the console. Where the console is placed, it should command a good view of what is happening on stage; constant visual contact with the performers is important, especially when they need to communicate with the engineer visually through pre-planned gestures. Have a gooseneck lamp, flashlight or head-mounted light to illuminate the desk during the live performance, should the equipment's parameters need adjustment in the dark.

Setting up the FOH desk in the middle of a venue is good because the engineer will hear better, and chances are the engineer can then give a better mix. If a desk is set up in a corner, there will be an increase in bass response due to the boundaries reflecting the audio signal back to the engineer, while the audience in the centre of the venue, away from the boundaries, will hear the music as lacking in bass. Do not equalize the music until the bass response sounds correct from the mixing position, and remember to walk around the venue to check the sound accurately.

Run the multicore away from people as much as possible, or alternatively run it along the wall and floor, where it will be trampled on less. This is to prevent accidents and to prevent people from walking, stomping and jumping on it; damage to it will reduce its life and purpose. The multicore has been termed the 'lifeline'. The length of the snake will most likely determine the distance between the FOH console and the stage, and it will also affect the exact positioning of the loudspeakers.

14.3 Loudspeaker Placement

This will depend on the stage size and its height. Ensure there is wide, even coverage of the music for the venue. Angling the loudspeakers with regard to their dispersion pattern in order to provide even coverage of the venue is essential: check their dispersion angle and pattern, and if the coverage is uneven there might be a need to re-angle the loudspeakers. It helps if the loudspeaker array can be "flown" or suspended, as this gives wider coverage of the venue. Ensure the support loudspeakers are delayed (time = distance / velocity x 1000 ms) in relation to the main cluster loudspeakers. Have the loudspeaker stacks placed in front of the microphones to prevent feedback, never the other way around.

Place the loudspeaker stacks well away from walls. This is done to prevent bass boost in the bottom end of the music programme. When a loudspeaker is placed in an open space with no surfaces near it, it radiates low frequencies in all directions evenly (full space). If a loudspeaker is placed against a surface, the low frequency energy is concentrated into half space (1/2), which boosts the low frequencies by +3 dB. Placing a loudspeaker between two surfaces maximizes the bass output by +6 dB because of the concentration of the low frequency energy into quarter space (1/4). Placed near three boundaries, e.g. in a corner, the bass output is maximized by +9 dB because of the concentration of the low frequency energy into eighth space (1/8), affecting the mid-bass region. Low frequencies diffract, radiating around the loudspeaker, and will reflect off the neighbouring surfaces; the reflected signals are delayed, and when they combine with the direct signal in front of the loudspeaker, comb filtering or phase cancellations result. The high frequencies are not affected much by loudspeaker placement near boundaries, due to their directional nature.

Summary:
Placed near one boundary: 1/2 space, a bass boost of +3 dB.
Placed near two boundaries: 1/4 space, a bass boost of +6 dB.
Placed near three boundaries (e.g. a corner): 1/8 space, a bass boost of +9 dB.
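The delay figure for the support loudspeakers mentioned above follows directly from the distance between the main cluster and the support stack divided by the speed of sound. The sketch below is only a rough illustration: the speed-of-sound value of roughly 344 m/s (around 20 degrees Celsius) and the 30 m distance are assumed figures, not values from these notes.

    SPEED_OF_SOUND_M_PER_S = 344.0   # approximate value at about 20 degrees Celsius

    def fill_delay_ms(distance_m):
        # time = distance / velocity, converted to milliseconds
        return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0

    # A support stack 30 m further from the stage than the main cluster:
    print(f"{fill_delay_ms(30.0):.1f} ms")   # roughly 87 ms

In practice the delay is then fine-tuned by ear or with a measurement system at the listening position.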

The frequency response of the venue changes between when it is empty and when it is full: humans are great absorbers of sound. If the hi and mid frequency drivers are not placed above head level, the hi and mid frequencies will be absorbed by the audience. That is why the same piece of music heard during rehearsal in an empty venue will sound different in a venue full of people during the actual performance; the graphic EQ might need some fine-tuning. Place the hi and mid frequency drivers above the audience's head level, which will also give the hi and mid frequencies a longer throw into the venue. Should a musician wander too near the back of the FOH loudspeaker stack when it is on the front edge of the stage, there may be a danger of low frequency feedback, because of the way low frequencies diffract around the cabinet.

Ensure that the loudspeakers are in a safe, sturdy position where they cannot be pushed over easily. Some loudspeaker cabinets weigh over a hundred kilograms; should a loudspeaker be unstable, topple over and injure someone, the audio engineer or whoever is responsible for setting it up may be liable for injury resulting from negligence. If loudspeaker arrays are to be flown, professional riggers are usually sought for the job; should the cabinets fall on someone due to negligence on the part of the engineer, the engineer is liable for the injury or death sustained.

Ensure the power amplifiers are in a well-ventilated area, and have them as close as practical to the loudspeakers: the longer the loudspeaker cables, the greater their resistance, which results in power losses. Use wide gauge loudspeaker cables to minimize resistive and voltage losses. The power amplifiers should be the last equipment switched ON and the first switched OFF, and they should be switched on, and off, one at a time.
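The point about short, wide-gauge loudspeaker cable can be put in numbers: the cable resistance forms a simple series divider with the loudspeaker load, and the fraction of amplifier power wasted in the cable is roughly Rcable / (Rcable + Rload). The figures below are illustrative assumptions only (the per-run resistances are not taken from these notes).

    def cable_power_loss_fraction(cable_resistance_ohms, load_ohms):
        # Fraction of the amplifier's output power dissipated in the cable,
        # treating the cable resistance and the load as a series circuit.
        return cable_resistance_ohms / (cable_resistance_ohms + load_ohms)

    # A long run of thin cable totalling about 1.2 ohm into a 4 ohm load:
    print(f"{cable_power_loss_fraction(1.2, 4.0) * 100:.0f}% lost")   # about 23%
    # The same run in much heavier cable, about 0.2 ohm total:
    print(f"{cable_power_loss_fraction(0.2, 4.0) * 100:.0f}% lost")   # about 5%

This is why heavy-gauge cable matters most into low-impedance (4 ohm or less) loads over long runs.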

AE27 – Acoustical Climate

1. Acoustics
1.1 Reflection
1.2 Absorption
1.3 Refraction
1.4 Diffraction
2. The Acoustic Environment
2.1 The Velocity (Speed) Of Sound
3. Classification of Sound Fields
3.2 Free Fields
3.3 Diffuse (Reverberant) Field
3.4 Semi-Reverberant Field
3.5 Pressure Fields
3.6 Ambient Noise Field
4. Combfilter Effects
5. Definitions
5.1 Critical Distance
5.2 Sound Transmission Class (STC)
5.3 Transmission Loss (TL)
5.4 Resilient Channels
5.5 Standing Waves
5.6 Different Room Modes
6. Reverberation

Soundproofing
1. Acoustic Isolation
2. Wall Construction Techniques
2.1 Dry Wall
2.2 Solid Wall
3. Floor Construction Techniques
3.1 Wooden Floating Floor
3.2 Concrete Floating Floors
3.3 Advantages of the jack-up method over the panel method
4. Ceiling Construction
5. Control Room Window Construction
6. Control Room Door
7. Frequency Absorption
7.1 High Frequency Absorbers Or Passive Absorbers
7.2 Low Frequency Absorbers Or Reactive Absorbers
7.3 Panel Resonators (Membrane Absorbers)
7.4 Helmholtz Resonators
7.5 Diffusers
8. Control Room Design
8.1 The Rettinger Control Room Design
8.2 The Live End-Dead End (LEDE) Control Room Design
8.3 Reflection Free Zone (RFZ) Control Room Design

AE27 – ACOUSTICAL CLIMATE

1. Acoustics

Whenever a sound source (S.S.) is put in an enclosed environment, we should always consider the interactions it will have with the boundaries surrounding it.

1.1 Reflection

A sound wave which is propagating away from its source in an enclosed environment will get reflected off surfaces. (A reflection is the ability of a sound wave to bounce off a hard surface.) Generally, the incident sound wave which arrives at a surface will tend to get thrown back into the room. A sound that reflects off a plain, even surface will be reflected back into the environment at exactly the same angle at which it arrived at the surface, also known as the angle of incidence (figure 1). If the sound happens to fall on a convex surface, the sound wave will be scattered widely (figure 2), but it will tend to be focused if it falls on a concave surface (figure 3).

(Figures 1, 2 and 3: reflection of a sound source from a flat, a convex and a concave surface.)

The amplitude and strength of the reflection depend upon the material of the surface and the frequency of the sound wave hitting it. Generally, the softer and more porous the surface of the boundary, the more acoustic energy will be absorbed rather than reflected back into the room, and the higher the frequency of the sound wave the more easily it will be absorbed.

The surface of the object or boundary which the sound is striking is of equal importance if the sound wave is to be reflected: the object's surface must be the same as, or of comparable magnitude to, the wavelength of the sound wave. Otherwise, if the sound's wavelength is much longer than the object's surface, the reflection caused will be minimal and negligible.

1.2 Absorption

(The absorption of acoustic energy is the reverse of reflection.) When sound strikes a material, a portion of the energy is absorbed and a portion of it is reflected (figure 4). The absorption of sound waves in fibrous or granular materials is primarily a matter of the translation of the sound energy into heat energy. This frictional dissipation is usually provided by porous materials, where the pores are open to each other; when the sound waves enter the surface of such porous, granular materials, the air molecules vibrate at reduced amplitudes, thus transforming the sound energy into heat energy through friction. This form of resistance is known as flow resistance, and it is measured by observing the drop in pressure as air passes through a surface. The amount of absorption is dependent upon the porosity of the surface, and the amount which is absorbed is converted to heat energy. Some common materials that provide this type of friction include felted mineral fibres, rock wool, yellow batts and pink batts, vegetable fibres and foam made from bubbles. High frequencies are easily absorbed because of their short wavelength; high frequencies also reflect more easily than low frequencies, and with every reflection the intensity of the sound decreases.

(Figure 4: an incident sound wave at a boundary, showing absorption, reflection and the transmitted sound wave.)

1.3 Refraction

When sound passes through a medium of one density into a medium of a different density, the direction of propagation will be changed, due to the different density of the medium compared to air and vice versa. (It is the ability of the sound to bend slightly as it passes through a surface.) If the sound wave enters a boundary at any angle other than 90 degrees, the waveform will be bent upon entry (figure 5). The amount of refraction is dependent on the density of the medium and the temperature.

(Figure 5: an incident sound wave bending as it passes from a dense medium into a less dense medium.)

1.4 Diffraction

Diffraction is the ability of sound waves to bend around objects or surfaces. Low frequencies, due to their large wavelength, have the ability to bend around surfaces much more easily than higher frequencies. (Low frequencies are omnidirectional.) If the wavelength of the sound wave is large compared to the size of the object or surface in its path, diffraction will occur (this means that frequencies which have a larger wavelength than the object will tend to ignore it and continue on their line of propagation), while those with a shorter wavelength relative to the surface will reflect back. (Figure 6.)

2. The Acoustic Environment

The outdoor environment can often be classified as a "free field".

A sound field is a free field if it is uniform, free from boundaries and undisturbed by other sound sources. In practice, it is the region where the effects of boundaries are negligible over the region of interest.

So when designing an acoustical environment, consider the following factors:

• The inverse square law level change
• Attenuation of certain frequencies due to humidity and other related factors
• Reflection and diffraction of sound around solid objects
• Refraction and shadow formation, and the effects of wind and temperature
• Reflection and absorption by the ground surface itself

2.1 The Velocity (Speed) Of Sound

For a given frequency, the relationship between wavelength, velocity and frequency in a medium is:

Wavelength (w) = Velocity (v or c) / frequency (f)
Velocity (v or c) = Wavelength x frequency
Frequency (f) = Velocity / Wavelength

To deal with many acoustic interactions, it is very important to be able to calculate and measure the velocity of sound and to understand the effects of the medium and its temperature. The velocity of sound is temperature dependent. The approximate formulas for calculating velocity are:

For CELSIUS: v = 20.6 x square root of (273 + C), where v is the velocity in metres per second and C is the temperature in degrees Celsius.

For FAHRENHEIT: v = 49 x square root of (459.4 + F), where v is the velocity in feet per second and F is the temperature in degrees Fahrenheit.
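The two temperature formulas above can be checked quickly in code. This sketch is just the same arithmetic with the square root written out; the constants 20.6 and 49 are the ones quoted in these notes, and the temperatures used are illustrative.

    import math

    def speed_of_sound_celsius(temp_c):
        # v (metres per second) = 20.6 * sqrt(273 + temperature in Celsius)
        return 20.6 * math.sqrt(273.0 + temp_c)

    def speed_of_sound_fahrenheit(temp_f):
        # v (feet per second) = 49 * sqrt(459.4 + temperature in Fahrenheit)
        return 49.0 * math.sqrt(459.4 + temp_f)

    print(f"{speed_of_sound_celsius(22):.0f} m/s")        # about 354 m/s with this formula
    print(f"{speed_of_sound_fahrenheit(72.5):.0f} ft/s")  # about 1130 ft/s, as in the worked example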

At a room temperature of 72.5 F we can calculate: 49 x square root of (459.4 + 72.5) = 1130 ft/sec.

To convert Fahrenheit to Celsius: (t - 32) / 1.8, where t is in Fahrenheit.
To convert Celsius to Fahrenheit: 1.8 t + 32, where t is in Celsius.

The velocity of sound is independent of the effect of altitude and of changes in atmospheric pressure; it depends only on the type of gas and the temperature it is travelling through. E.g. the speed of sound at the top of a mountain would be the same as at the bottom of the mountain, on condition the temperature were the same at both locations.

3. Classification of Sound Fields

3.2 Free Fields

A sound field is said to be a free field if it is uniform, free of boundaries and undisturbed by other sound sources. In a free field, the sound energy flows only in one direction. Anechoic chambers are classified as free fields.

3.3 Diffuse (Reverberant) Field

In a diffuse or reverberant sound field, the sound pressure is the same everywhere and the flow of energy in all directions is equal. This means that the sound persists for a certain period of time after the sound source has stopped generating sound. This type of sound field requires an enclosed space with essentially no acoustic absorption.

3.4 Semi-Reverberant Field

In a semi-reverberant sound field, the sound energy is both reflected and absorbed. The energy flows in more than one direction, but there are certain areas in the field that have a direction of propagation from the sound source, while much of the energy is from the diffuse field. This type of sound field is usually found in architectural acoustic environments.

3.5 Pressure Fields

In a pressure field, the instantaneous pressure is uniform everywhere and there is no direction of propagation. The pressure field commonly exists only in cavities (also known as couplers) whose maximum dimension is one-sixth of the wavelength of the sound. Pressure fields are used when the National Bureau of Standards calibrates microphones.

3.6 Ambient Noise Field

The ambient noise field is a field which comprises those sound sources which do not contribute to the desired or wanted sound.

4. Combfilter Effects

Combfiltering occurs acoustically in any environment, or electrically in any electrical component. Comb-filtering effects are hard to manage when distant-miking a sound source or when placing a mic close to any kind of boundary where reflections take place. If you ever try to mic up a choir or any smaller singing group without a knowledge of combfilter effects, you might introduce serious distortions into the recording of the performance.

The main cause of comb filtering is time delays. A microphone is a blind instrument: its diaphragm responds to whatever fluctuations in air pressure occur at its surface, within its operational range (20 Hz - 20 kHz). For example, if we have a 100 Hz tone produced from a loudspeaker (A) that in turn drives the diaphragm of a microphone in free space, a 100 Hz output voltage will appear at the microphone's terminals. Now, if a second loudspeaker (B) also produces a 100 Hz tone at an identical pressure but 180 degrees out of phase with the first signal, one acoustically cancels the other and the mic's output falls to zero. If an adjustment is made so that the two 100 Hz acoustical signals are of identical amplitude and in phase, the microphone will deliver twice the output voltage - an increase of 6 dB. Remember this characteristic of the microphone when working with acoustical comb filtering effects.

Remembering a few simple relationships will enable you to estimate the effects of comb filters. If the delay is t seconds, the spacing between the peaks and the nulls is 1/t Hz. The frequency at which the first null occurs (the null of the lowest frequency) is 1/(2t). For example, for a delay of 0.001 second (1 ms), 1/t = 1/0.001 = 1000 Hz. For the same 1 ms delay the first null will occur at 1/(2t) = 1/(2 x 0.001) = 1/0.002 = 500 Hz. The peaks and the nulls will be spaced apart by the same amount of 1000 Hz.

(Figure: comb filter response for a 1 ms delay, with nulls at 500 Hz, 1500 Hz, 2500 Hz and 3500 Hz.)

This is commonly known as signal distortion, but the common phrase is comb filter distortion. The name comes from the fact that the nulls and peaks on the frequency scale resemble the teeth of a comb. Comb filtering graphs are plotted on either linear or log frequency scales.

An important point to observe is that the 1/(2t) null (the lowest null frequency) will take away energy from any signal passing through a system with that delay - not just a simple sine wave but also a complex waveform. A complex waveform (music or speech) passing through a system having a 1 millisecond delay will have important components removed or reduced, which affects both speech and music.

E.g. assume that a person is speaking into a microphone and the direct acoustical signal coming from the person's mouth is going directly into the microphone, while a second acoustical signal reflects off the surface of a table and enters the same microphone with a delay of 0.5 ms (0.0005 sec). The effect will be much more radical, because the lowest null frequency is now at 1 kHz, which is in the centre of the speech frequency range. E.g. if we have a 0.1 ms (0.0001 sec) delay, the lowest null frequency would be 1/(2t) = 1/0.0002 = 5 kHz, and the nulls will be spaced 10,000 Hz apart, since 1/t = 1/0.0001 = 10 kHz.

The 0 dB straight line in comb filtering graphs shows the direct sound without the delayed sound. These graphs always assume that both the direct and delayed signals are of equal amplitude, but in practice the delayed signal is usually lower in amplitude than the direct signal. This is due to:

1. The delayed sound is reduced in amplitude because of the travelling distance and the energy transfer which happens when it strikes and reflects from a surface.
2. The acoustical delayed signal will usually arrive at the microphone at an angle where the response of the polar pattern is down.

So the boost of the peaks will rarely be a perfect 6 dB, and the nulls will usually not be at minus infinity.
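The two relationships used above - first null at 1/(2t) and a peak/null spacing of 1/t - are easy to tabulate for any delay. The sketch below simply evaluates them for the delays discussed in the text (1 ms, 0.5 ms and 0.1 ms).

    def comb_filter_summary(delay_seconds, how_many_nulls=3):
        first_null_hz = 1.0 / (2.0 * delay_seconds)  # lowest cancelled frequency, 1/(2t)
        spacing_hz = 1.0 / delay_seconds             # spacing between successive nulls (and peaks), 1/t
        nulls = [first_null_hz + n * spacing_hz for n in range(how_many_nulls)]
        return first_null_hz, spacing_hz, nulls

    for delay_ms in (1.0, 0.5, 0.1):
        first, spacing, nulls = comb_filter_summary(delay_ms / 1000.0)
        print(f"{delay_ms} ms delay: first null {first:.0f} Hz, spacing {spacing:.0f} Hz, "
              f"nulls near {[round(n) for n in nulls]}")

For the 1 ms case this reproduces the figures in the text: a first null at 500 Hz with further nulls every 1000 Hz.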

5. Definitions

5.1 Critical Distance

The critical distance is the point in a room where the direct sound and the reverberant sound are equal. It is shown as DC.

5.2 Sound Transmission Class (STC)

A rating system designed to compare the sound transmission characteristics of various architectural materials and constructions; it basically describes the soundproofing ability of a construction. It is shown as a value between 0 and 100. The higher the STC, the better the soundproofing, because less sound is transmitted.

5.3 Transmission Loss (TL)

The number of dB by which a sound barrier reduces the transmission of a sound. Transmission loss varies with frequency.

5.4 Resilient Channels

Metal tracks that hold a wall surface in position away from the surface it is attached to. They are designed to isolate the two surfaces so that the outer one will not pick up the vibrations of the inner surface.

5.5 Standing Waves

A standing wave describes a sound that resonates between parallel surfaces. These sound waves are reinforced by their own reflections, which creates a tonal imbalance and colours the natural sound of an instrument. They exist in almost every room, because most rooms are made from parallel walls, floors and ceilings.

Fortunately these frequencies are predictable, because the fundamental resonant frequency (also known as the fundamental standing wave) has a wavelength of twice the distance between the parallel surfaces. The standing wave formula is:

f = V / (2 x d)

where f is the fundamental frequency, V is the velocity of sound in ft/sec, and d is the distance between the surfaces.

E.g. the distance between two parallel walls is 20 feet. What would the fundamental standing wave be?

f = 1130 ft/sec / (2 x 20) = approximately 28 Hz, with further standing waves at 56 Hz, 112 Hz and so on.

Once the fundamental resonant frequency is determined, the rest of the resonant standing waves will be the multiples (or harmonics) of the fundamental. E.g. if F = 50 Hz, the multiples will be F1 = 100 Hz, F2 = 150 Hz, F3 = 200 Hz, etc. The harmonics of the fundamental resonant frequency are not as harmful as the fundamental itself because of their lower amplitudes.

Normally a room will not have only one fundamental standing wave: there will be more than one resonant frequency in a given room if the boundaries are parallel, because a room is made from more than two walls. One way to eliminate standing waves in a room is to construct non-parallel boundaries.

5.6 Different Room Modes

Standing waves occurring between two surfaces are known as the axial mode. Standing waves also occur among four surfaces, called the tangential mode, and among six surfaces, called the oblique mode. Being less severe, the tangential and oblique modes will be broken up once furniture, equipment, absorbers, bass traps or diffusers have been placed into the studio and control room. Absorbing the axial mode is therefore the primary objective.
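The fundamental and its multiples for each pair of parallel surfaces can be worked out directly from f = V / (2 x d). The sketch below uses the 1130 ft/sec value quoted in these notes; the 20 ft dimension is the example from the text, and the 10 ft dimension is an extra illustrative value.

    SPEED_OF_SOUND_FT_PER_S = 1130.0   # value used in these notes

    def axial_modes(dimension_ft, how_many=4):
        # Fundamental standing wave between two parallel surfaces: f = V / (2 * d).
        # The room modes are integer multiples of that fundamental.
        fundamental = SPEED_OF_SOUND_FT_PER_S / (2.0 * dimension_ft)
        return [round(fundamental * n) for n in range(1, how_many + 1)]

    for d in (20.0, 10.0):
        print(f"{d:.0f} ft spacing: axial modes at about {axial_modes(d)} Hz")

Running this for each pair of parallel surfaces in a room quickly shows where modal build-up is most likely.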

6. REVERBERATION

Reverberation is the result of multiple, simultaneous reflections created from all the room surfaces. The intensity and the duration of the reverberation depend on the room's dimensions and the absorption characteristics of its surfaces. Although reverberation can be a desirable effect in studio productions, it still should be controlled within the studio environment.

Reverberation time (RT60) is the time taken for the reverb to die away by dropping 60 dB in level from the original level after the moment a sound has stopped. To calculate the reverb time (RT60) within an acoustical space, use the formula below. If the studio is measured in feet:

RT60 = (V x 0.049) / ((A1 x S1) + (A2 x S2) + ...)

where V is the room's volume in ft3, S is the surface area of each material, and A is the absorption coefficient of that material. If the room is measured in metres (metric system), use a factor of 0.161 instead of 0.049. Reverberation time is always measured using a 1 kHz tone, the same frequency at which the absorption coefficient values are taken.

E.g. calculate the RT60 of a room which has wood, concrete and curtain surfaces. The dimensions are:

Floor to ceiling = 10 ft
Wall to wall = 20 ft
Front wall to back wall = 30 ft
Room volume V = 10 x 20 x 30 = 6000 ft3

Then the surface areas and the absorption coefficients for each material must be calculated:

WOOD: S = (10 x 20) x 2 = 400 ft2, A = 0.09, so A1 x S1 = 0.09 x 400 = 36
CONCRETE: S = (30 x 20) x 2 = 1200 ft2, A = 0.07, so A2 x S2 = 0.07 x 1200 = 84
CURTAINS: S = (30 x 10) x 2 = 600 ft2, A = 0.75, so A3 x S3 = 0.75 x 600 = 450

RT60 = (6000 x 0.049) / (36 + 84 + 450) = 294 / 570 = 0.51 sec
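The same RT60 arithmetic can be wrapped in a small function. The surface areas and coefficients below are the ones from the worked example above (wood, concrete and curtains in a 10 x 20 x 30 ft room); any other room would simply substitute its own list of surfaces.

    def rt60_feet(volume_cubic_ft, surfaces):
        # RT60 = 0.049 * V / sum(absorption coefficient * surface area),
        # with the 0.049 factor used for dimensions in feet.
        total_absorption = sum(coeff * area for coeff, area in surfaces)
        return 0.049 * volume_cubic_ft / total_absorption

    room_volume = 10 * 20 * 30          # 6000 cubic feet
    surfaces = [
        (0.09, 400),   # wood: two 10 x 20 ft walls
        (0.07, 1200),  # concrete: floor and ceiling, 30 x 20 ft each
        (0.75, 600),   # curtains: two 30 x 10 ft walls
    ]
    print(f"RT60 = {rt60_feet(room_volume, surfaces):.2f} seconds")   # roughly 0.5 s, as in the example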

460 . It is the first and most basic element of room acoustics that affects any recording. a uniform (flat) frequency response and must offer sufficient flexibility and control over reflections. Noise quite simply is unwanted sound. On a busy street ambient noise levels can measure as much as 90 dB Sound Pressure Levels (SPL). Acoustic isolation in a recording environment is the attempt to reduce noise to a low acceptable level. Noise can never be completely eliminated. chances are that the recording will be noisy also. If a recording environment is noisy. 1. This is known as soundproofing. The environment must feature good acoustical isolation.SCHOOL OF AUDIO ENGINEERING AE27 – Acoustical Climate Student Notes SOUNDPROOFING Introduction There are certain considerations in the design of a studio which when met will offer effective results. Acoustic Isolation One of the most effective ways of soundproofing a studio/control room is to use the “room-in-aroom” design. PNC 15 is preferred. experience has shown that the ambient noise levels must lie well below 25dBSPL to be acceptable. Noise can be produced by many of sources. a country farmhouse may have an ambient level of around 40dBSPL. Ambient Noise – This is the common term for background noise and is measured in dB. In a recording studio.

There are several misconceptions about ambient noise and how it is controlled. One is that using porous types of material to line the surfaces of a room will reduce the sound transmission; this is false, as all this will achieve is a reduction of the high frequency reflections in the room. Another misconception is that a room that sounds quiet to the ear has a low ambient noise level. This is false due to the ear's remarkable ability to reject constant background noise, such as traffic, wind or mechanical noise. This psychological phenomenon has been incorporated into the hearing system as protection against sensory overload by the millions of sounds that surround us, and as a result we seldom even notice sounds which constantly envelop us. A microphone, however, will pick up these noises and transfer them directly to the recording.

2. Wall Construction Techniques

2.1 Dry Wall

Consider the effect of sound striking a standard wall construction, consisting of a frame made from 4" x 2" studs, with insulation material and gyprock (gypsum board) on either side. When a sound strikes the wall, part of it is reflected back into the room, but some of it will set the wallboard into vibration. This wallboard becomes a mini loudspeaker, transmitting a muffled version of the sound to the other side. The vibration is also transmitted to the frame and subsequently to the wallboard on the opposite side.

Every rigid body has a resonant frequency. E.g. a tuning fork has a resonant frequency of 440 Hz: if you flick it with your finger, it will vibrate at 440 Hz, and when it is set into motion at its resonant frequency by a tone from an external sound source, it will continue to vibrate on its own for some time, even after the external sound source has stopped generating sound. Once it is vibrating, if you put it next to another identical tuning fork (without touching it), soon both forks will be humming at that particular frequency; when one vibrates, the other will do the same. The same is true of the wallboard mentioned earlier. This effect is further enhanced when the two wall surfaces are of the same thickness, which means that they will resonate at the same specific frequency. This wall construction is not very soundproof and will have an STC rating of around 35 at best.

STC 40-45

The normal STC rating of a wall between a recording studio and a control room should be at least around 50 to 60. One way to achieve a high STC rating is to acoustically decouple the wall surfaces.

STC 55-60

Although there are more than a few ways of soundproofing walls, the best method is to use a double frame. This means that you actually build two walls with no contact between them. The frames should be made from 2" x 4" studs, and they can be built either so that the studs are parallel to each other or so that they are staggered with one another. They should be staggered: the staggered stud wall frame is better, as it will prevent resonance modes from occurring within the wall. The outer wall can be mounted on the existing concrete floor, and the inner wall frame can be mounted on neoprene rubber to avoid sound transmission from the floor.

One wall surface should be faced with 2 x ½" gyprock wallboard and the other side with a single layer of 5/8" gyprock wallboard, in order to eliminate the resonant characteristics of the wall surfaces. A minimum of 6 inches of fiberglass insulation is placed inside the wall. Then every gap and joint should be sealed with caulking if the highest transmission loss is to be attained.

Before the gyprock sheets are secured onto the frames, you should think about where you are going to mount the electrical outlets and switches, where the wiring is going to be laid, and where the control room windows, door, etc. are situated. The electrical outlet boxes should not be located directly across from each other; they should be staggered instead. Once everything has been mounted and secured into place, the boxes should be filled with absorbent material; sprayable foam insulation can be used for this purpose.

2.2 Solid Wall

A solid wall made of 4" x 8" concrete blocks has an STC rating of about 50. It is durable, and water, termite and fire resistant, albeit heavy.

STC 50-55
STC 60-65

3. Floor Construction Techniques

Having soundproof walls around the studio/control room is rather useless unless the floor and ceiling are also soundproofed. A floating floor is simply a lifted floor supported by a vibration-absorbing medium, e.g. rubber, spring, foam or a blanket of some type.

3.1 Wooden Floating Floor

These floors utilize a heavy framework supported by specially designed vibration isolators placed at regular intervals beneath the floor. Vibration isolators are large blocks of heavily compressed neoprene rubber. ¾" plywood is placed on top of these vibration isolators, and the framework is then secured on top of this. (Figure: typical wooden floating floor.)

Filling the airspace between the frames of a wooden floating floor with sand will increase the mass of the floor, thereby increasing its soundproofing characteristics. The sand should be KILN-DRIED to avoid the ill effects of residual moisture, which could lead to warping or cracking of the wooden floating floor at a future date. Care must be taken so that the floating floor does not touch the walls around it, thereby ruining the floating effect. Rubber lining should be put between the floor and the walls and then caulked.

3.2 Concrete Floating Floors

A heavy concrete floating floor working in conjunction with an existing concrete floor can eliminate virtually all transmission of noise through the studio floor. The basic method for building and floating a concrete floating floor is to support the floor at regular points with very efficient vibration isolators, thus creating a sound-attenuating air gap between the existing floor and the floating floor.

3.2.1 Panel Method

The wet concrete is poured on top of plywood panels, which are in turn supported by vibration isolators. A thin plastic sheet is spread over the plywood to prevent the concrete from leaking through the gaps between the panels. Reinforcement iron mesh is placed in the middle of the concrete. The thickness of the concrete is around 4 inches, the vibration isolators are placed 1.5 metres apart from each other, and the total thickness of the floor is around ½ foot.

3.2.2 Jack-Up Method

In this method, no plywood is used. The existing floor is covered with a waterproofing layer of plastic, and the edges of the room are lined with asphalt-faced fiberglass.

The special jack-up vibration isolators are spaced at regular intervals on the layer of plastic. Then reinforcement iron mesh and bars are tied to these isolators, and the concrete floor is poured. When the floor has set, it is raised into position by turning the jacking screws in each isolator. The air gap which is created is adjustable from 1 to 4 inches. If the vibration transmission through the floor is very severe, then spring-mounted vibration isolators should be used. A well-constructed floating floor will have an STC rating of more than 55.

3.3 Advantages of the jack-up method over the panel method

The height of the air gap can be adjusted to the threshold of the doors. Since no wood is used, there will be no rotting of the floor and no termite problems.

4. Ceiling Construction

A floating/suspended ceiling is an excellent device for attenuating the transmission of sound to and from above. Normally a floating ceiling is hung from above, with vibration isolators made from neoprene rubber or coiled springs. If the floating ceiling is going to be hung from the existing ceiling, it should be ¼" clear of the surrounding walls, and the remaining air space should then be caulked tightly with non-hardening caulking compound. Sometimes a floating ceiling is supported from below by the floating walls; in this case the weight of the ceiling is transferred directly through the walls, e.g. with smaller voice-over booths. For good soundproofing, the floating ceiling must be massive. To increase the soundproofing of the ceiling, it should also be lined with pink batts; this will dampen the sound that has been transmitted and act as an absorber.

5. Control Room Window Construction

The design and construction of both control room windows and doors has always been one of the most difficult aspects of studio construction. This is because it is impossible to construct the windows and doors from the same material from which the walls are built; windows and doors are usually the weakest link in soundproofing. There are two questions that should be asked before actually deciding which materials to use, which can vary the cost of the construction enormously:

a. How loud will the sound be on the other side of the control room window?
b. What is the wall surrounding the window made from?

When sound hits a single pane of glass, the glass bends in sympathy with the incident sound wave, and the sound travels straight through to the other side, only slightly attenuated. To reduce the sound level transmitted to the other side of the window, the most practical approach is to build a window from two pieces of glass positioned a few centimetres apart, thus creating an air gap which forms an acoustical air-spring that effectively decouples the air movement from one room to the other. Window size is also important: if the window is smaller, the overall performance of the wall will be better. In any case, square windows should be avoided. It is, however, also possible to construct a soundproofed window with dimensions of 6' to 8'.

For example, for a vocal overdub studio the window construction does not need to be highly soundproofed; a double pane construction would be satisfactory. If the studio is going to be used for loud amplified sounds, then the best possible window construction would be required, able to provide at least 50 dB of attenuation. Once it is decided which materials the walls are to be constructed from, you can select the window design.

A single pane of glass in a frame mounted on a single-leaf wall will provide approximately 25 dB of attenuation, which will have little use in studio design. Probably the most commonly used design is the double glazed window, constructed from two sheets of glass with a 200 mm air gap in between; this can provide about 40 dB of attenuation. The other design is the "triple glazed window", which uses three sheets of glass, again spaced 200 mm apart, and should be used when very good soundproofing is required; the three-pane construction will provide up to 55 dB of sound attenuation. The above-mentioned attenuation figures are for best-case situations: the windows should be tightly sealed, and ordinary 6 mm domestic window glass should be avoided - the best results are obtained from 9 mm, 10 mm and 12 mm thick panes. The materials used for the wall construction are of equal importance; it will be no use building a triple glazed window in a dry wall with a rating of STC 35.

(Figure: control room window construction.)

In windows constructed from two or three panes of glass, the panes must not be placed parallel to each other and should be of different thicknesses. Each pane of glass should be mounted on its own frame sitting on rubber isolators; these take the weight of the glass and form a cushion for it to sit on. It is also common practice to angle the glass on the studio side downwards slightly, to stop the view being obstructed by reflections from studio lights; this also helps to minimize standing waves. In the construction, hardwood should be used for the frames, because hardwood is normally KILN-DRIED and does not contain moisture, so it will not warp and crack at a future date. When installing the glass, all the edges around the frame should be sealed with caulking sealant, which will also dampen the vibrations of the glass.

In between the glass panes, on top of the absorbent material lining, silica gel crystals should be placed to absorb any future moisture.

6. Control Room Door

The average transmission loss (TL) of an unsealed hollow-core door is only 17 dB. The average interior door found in most homes is a hollow-core door that has been undercut (shortened slightly) for ease of opening. This type of door is typically constructed of two thin wooden-veneer face panels adhered to a frame. Such lightweight, fragile construction offers little resistance to sound; at low frequencies, e.g. 50 Hz, the actual TL of the hollow-core door may deteriorate to only 4 or 5 dB. Other than permanent openings, doors are the most common culprits for sound leakage in rooms.

Sealing the air space around the door by applying weather-stripping or rubber gasketing will help by 2 or 3 dB. However, compared with the transmission loss of a typical single-layer gypsum wall of about 35 dB, this is not an appreciable improvement. To obtain a distinct improvement, the hollow-core door should be replaced with a massive, dense solid-core door. By applying rubber or felt weather-stripping around the door frame and installing a floor threshold to ensure an airtight fit, the average TL of the

solid-core door can be brought up to about 30 dB. Installing a second sealed solid-core door on the other side of the partition will offer the best improvement, bringing the TL up to about 40 dB. Refrigeration door locks can be used, because they are mounted on the exterior of the door without the need to drill large holes into it; this helps to preserve the soundproofing of the door. They also have a release mechanism on the other side to unlock the door, and in general they are solidly built and airtight.

7. Frequency Absorption

The method for designing a recording studio or control room for a smooth, flat frequency response is to first measure the room's frequency response using a pink noise generator and a spectrum analyzer. This will immediately show us how the room reacts to frequencies over the entire audio spectrum. Sometimes, even when special attention has been given to designing the walls to control the reflections, the frequency response of the room may still exhibit irregularities. To be able to tailor the frequency response to the desired state, absorption of these particular frequencies must be considered.

When sound strikes a smooth, rigid acoustical surface, a portion of it will be reflected back and a portion of it will lose energy at the surface. As we know, the absorption of acoustical energy is exactly the reverse of reflection. The portion of the sound which is absorbed and the portion which is reflected back is usually shown as a ratio. This ratio is known as the absorption coefficient of the material and is a figure between 0 (reflective) and 1 (highly absorptive). E.g. if a piece of wood has an absorption coefficient of 0.15 sabin, this means that the wood absorbs 15% of the incident acoustic energy while reflecting 85% of it back. The amount of absorption at specific frequencies is determined by the material itself. An absorption coefficients table will usually show which frequencies will be absorbed by known construction materials.

7.1 High Frequency Absorbers Or Passive Absorbers

The absorption of high frequencies is accomplished by the use of dense, porous materials (materials with interstices), e.g. cloth, fiberglass, carpeting and foams such as Sonex. These materials have high absorption values at high frequencies, allowing the high frequencies in a room to be carefully controlled.

Absorption coefficients of Sonex foam of various contoured thicknesses (see figure).
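The coefficient can be put to work directly: the standard practice (assumed here, not spelled out in the notes) is to multiply each surface area by its coefficient to get that surface's absorption in sabins, then sum over the room. A minimal sketch, with hypothetical areas and 500 Hz coefficients taken from the table that follows:

coeff_500hz = {                       # absorption coefficients at 500 Hz (from the table below)
    "carpet_on_concrete": 0.14,
    "gypsum_wall": 0.05,
    "plaster_ceiling": 0.06,
}
surfaces_ft2 = {                      # hypothetical surface areas in square feet
    "carpet_on_concrete": 300.0,      # floor
    "gypsum_wall": 640.0,             # four walls
    "plaster_ceiling": 300.0,
}
total_sabins = sum(surfaces_ft2[name] * coeff_500hz[name] for name in surfaces_ft2)
print(round(total_sabins, 1))         # total absorption at 500 Hz, in sabins (92.0 here)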

Absorption Coefficients Table of General Building Materials and Furnishings
Absorption Coefficients (Hz): 125 / 250 / 500 / 1000 / 2000 / 4000

Materials:
Brick, unglazed                                                0.03  0.03  0.03  0.04  0.05  0.07
Brick, unglazed, painted                                       0.01  0.01  0.02  0.02  0.02  0.03
Carpet, heavy, on concrete                                     0.02  0.06  0.14  0.37  0.60  0.65
Same, on 40-oz hairfelt or foam rubber                         0.08  0.24  0.57  0.69  0.71  0.73
Same, with impermeable latex backing on 40-oz hairfelt
  or foam rubber                                               0.08  0.27  0.39  0.34  0.48  0.63
Concrete block, coarse                                         0.36  0.44  0.31  0.29  0.39  0.25
Concrete block, painted                                        0.10  0.05  0.06  0.07  0.09  0.08

Fabrics:
Light velour, 10 oz/yd2, hung straight, in contact with wall   0.03  0.04  0.11  0.17  0.24  0.35
Medium velour, 14 oz/yd2, draped to half area                  0.07  0.31  0.49  0.75  0.70  0.60
Heavy velour, 18 oz/yd2, draped to half area                   0.14  0.35  0.55  0.72  0.70  0.65

Floors:
Concrete or terrazzo                                           0.01  0.01  0.015 0.02  0.02  0.02
Linoleum, asphalt, rubber or cork tile on concrete             0.02  0.03  0.03  0.03  0.03  0.02

Wood                                                           0.15  0.11  0.10  0.07  0.06  0.07
Wood parquet in asphalt on concrete                            0.04  0.04  0.07  0.06  0.06  0.07

Glass:
Large panes of heavy plate glass                               0.18  0.06  0.04  0.03  0.02  0.02
Ordinary window glass                                          0.35  0.25  0.18  0.12  0.07  0.04

Gypsum board, 1/2", nailed to 2 x 4s, 16" oc                   0.29  0.10  0.05  0.04  0.07  0.09
Marble or glazed tile                                          0.01  0.01  0.01  0.01  0.02  0.02

Openings:
Stage, depending upon furnishings                              0.25 to 0.75
Deep balcony, upholstered seats                                0.50 to 1.00
Grills, ventilating                                            0.15 to 0.50

Plaster, gypsum or lime, smooth finish on tile or brick        0.013 0.015 0.02  0.03  0.04  0.05
Plaster, gypsum or lime, rough finish on lath                  0.14  0.10  0.06  0.05  0.04  0.03
Same, with smooth finish                                       0.14  0.10  0.06  0.04  0.04  0.03
Plywood panelling, 3/8" thick                                  0.28  0.22  0.17  0.09  0.10  0.11
Water surface, as in a swimming pool                           0.008 0.008 0.013 0.015 0.020 0.025
Air, sabins/1000 ft3 @ 50% RH                                    -     -     -   0.9   2.3   7.2

Absorption of Seats and Audience (sabins per square foot of floor or seating area):
Audience, seated in upholstered seats                          0.60  0.74  0.88  0.96  0.93  0.85
Unoccupied cloth-covered upholstered seats                     0.49  0.66  0.80  0.88  0.82  0.70
Unoccupied leather-covered upholstered seats                   0.44  0.54  0.60  0.62  0.58  0.50

Wooden pews, occupied, per square foot of floor area           0.57  0.61  0.75  0.86  0.91  0.86
Chairs, metal or wood seats, each, unoccupied                  0.15  0.19  0.22  0.39  0.38  0.30

7.2 Low Frequency Absorbers Or Reactive Absorbers

It can be observed from the absorption coefficient table that materials which offer good high-frequency absorption usually do not offer good low-frequency absorption. This is due to the fact that low frequencies are best dampened by compliance, or the ability to bend and flex with the incident waveform. Dense, rigid materials like plywood have a high degree of reflectivity, in which most of the sound energy bounces back. But if flexible materials are used, a portion of the waveform's power is absorbed through the energy spent flexing the material: the acoustical energy is converted into mechanical energy. This is how low frequencies are absorbed.

Because of the longer wavelengths of low frequencies, passive absorbers would have to be very large to absorb them, since a passive absorber has to be a ¼ wavelength of the frequency involved. For example, to absorb a 100 Hz mode:

Wavelength = velocity / frequency = 1130 ft/s ÷ 100 Hz = 11.3 ft

The quarter wavelength of 100 Hz is therefore 2.825 feet, which is impractical for studio use because it takes up a lot of space. Low frequencies can also be absorbed by bass traps.

7.2.1 Bass Traps

Basically, these are boxes made from plyboard. They are also known as resonators and involve a tuned air cavity. The bigger the box, the lower the frequencies it will absorb.

7.2.2 The Construction Of A Bass Trap

This is the main principle of a bass trap. The bass trap is designed and calculated to absorb a particular resonant frequency. The fundamental resonant frequencies can be calculated from the room's dimensions using the standing wave formula:

f = v / (2 · d)

In the trap construction, the surface density of the plywood and the fundamental frequency must be known. Then the air gap dimension must be calculated from:

x = 28900 / (m · f²)

where x = air space dimension in inches, m = surface density of the plywood in lbs/ft², and f = fundamental frequency in Hz.

Surface density: for ½" plywood, m is 1.375 lbs/ft²; for ¾" plywood, m is 2.063 lbs/ft².

E.g. if the resonant standing wave frequency in a room is 28 Hz and ½" plywood is used, what would the air gap dimension be?
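A minimal sketch of this air-gap formula, assuming the constant 28900 and the imperial units exactly as the notes give them; it can be used to check the worked answer that follows.

def bass_trap_air_gap(surface_density_lbs_ft2, frequency_hz):
    # x = 28900 / (m * f^2), with x in inches
    return 28900.0 / (surface_density_lbs_ft2 * frequency_hz ** 2)

# 28 Hz room mode: 1/2" plywood (1.375 lbs/ft2) vs 3/4" plywood (2.063 lbs/ft2)
print(round(bass_trap_air_gap(1.375, 28), 1))   # ~26.8", i.e. roughly 27"
print(round(bass_trap_air_gap(2.063, 28), 1))   # ~17.9"

# For comparison, the quarter-wavelength depth a passive absorber would need at 100 Hz
print(1130.0 / 100.0 / 4.0)                     # 2.825 ft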

x = 28900 / ((1.375)(28)²) ≈ 27"

Conversely, if ¾" plywood is used, the air gap is reduced to about 17.86". Hence the denser the material is, the smaller the required air space is.

The supporting studs should be spaced a minimum of 24" apart. The attachment of the studs will reinforce the plywood and alter the density value; this will only alter the final measurement of the air gap by around half an inch, in order to compensate for the higher density value after the plywood has been fastened to the studs.

7.3 Panel Resonators (Membrane Absorbers)

These resonators absorb acoustic energy via the vibration of panels, whereby the resonant frequency is determined by the thickness, the weight and the size of the panel. (The mass here means how heavy the material is according to its dimensions and weight.) Panel absorbers will absorb a fairly large range of low frequencies, but at the same time will retain the liveliness of the room. By placing some absorbent material (usually a passive absorber) across the front panel's surface it is also possible to get rid of the high frequencies as well as the low frequencies, thus making it a broadband absorber; this is mainly due to the reflective quality of the front panel at high frequencies. It also helps to broaden the Q of the panel resonator. To find the resonant frequency of the panel resonator, use the formula below:

f = 600 / √(m · x), equivalently x = (600)² / (f² · m)

where x = the depth of the box (air space) in cm, f = the resonant frequency in Hertz, and m = the surface density of the front panel in lbs/ft².

7.4 Helmholtz Resonators

The Helmholtz resonator works on the principle that a cavity of certain dimensions has a certain resonant frequency; if we change the size of the cavity, we also change the resonant frequency. The advantage is that they can be tuned to notch out a narrow band of frequencies and can be constructed for the problematic frequencies only. These resonators work best on the lower mid-range frequencies and are made from wood or concrete. They were used by the ancient Greeks, and later in medieval churches in Denmark and Sweden.
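A minimal numerical sketch of the panel-absorber relation above and of the Helmholtz relation quoted just below (fo = 60·√(A/(d·V))). The constants and mixed units are taken exactly as these notes state them, and the example values are purely illustrative.

import math

def panel_resonant_frequency(m_lbs_ft2, depth_cm):
    # f = 600 / sqrt(m * x), constant and units as quoted in these notes
    return 600.0 / math.sqrt(m_lbs_ft2 * depth_cm)

def helmholtz_resonant_frequency(mouth_area, neck_depth, volume):
    # fo = 60 * sqrt(A / (d * V)), constant and (imperial) units as quoted in these notes
    return 60.0 * math.sqrt(mouth_area / (neck_depth * volume))

print(round(panel_resonant_frequency(2.063, 17.5), 1))           # 3/4" plywood over ~17.5 cm: ~100 Hz
print(round(helmholtz_resonant_frequency(0.02, 0.06, 0.35), 1))  # illustrative cavity dimensions only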

The most basic version of a Helmholtz resonator is a bottle. When sound strikes the opening of the bottle, the enclosed volume of air resists compression and oscillates at the resonant frequency of the bottle. Vibration in the neck is sent back towards the approaching wave; phase cancellations occur and the approaching wave is "absorbed", or cancelled. The resonant frequency is given by:

fo = 60 · √( A / (d · V) )

where A = mouth cross-sectional area (imperial system), d = depth of the neck, and V = volume of the bottle.

A classic 125 ml Coke bottle has a resonant frequency of 168 Hz and a Q factor of 267. Since a plain, empty resonator has a very sharp Q factor, the bandwidth needs to be broadened in order for it to become reasonably effective. In the ancient days the Q factor was broadened by filling the cavity with ash; nowadays fiberglass/rock wool is used instead. The modern resonator is either a slat type or a perforated type, meaning that the openings of the cavities are either slats or holes. Both work equally well, and the resonators are easily integrated into the design of a control room or concert hall, where they are very often almost invisible.

7.5 Diffusers

Treating the walls and surfaces with absorbent materials might not be sufficient to eliminate reflections and echoes from occurring in the environment. Geometric structures known as diffusers are used to break up echoes and reflections. Diffusion is a technique used to create an irregular surface area that will deflect sound in random directions. A diffuser can run vertically from floor to ceiling or horizontally from wall to wall, dissolving echoes and breaking up standing waves.

There are two main diffuser designs commonly used in control room design: the Quadratic Residue Diffuser (Q.R.D.) and the Primitive (pronounced "prime") Root Diffuser (P.R.D.). Both consist of a periodic series of slots or wells of equal width, whose depths are determined by a special number theory sequence. An incident sound wave that strikes the mouth of each well travels to the bottom of the well and returns to the opening, where it reacts with the incoming incident sound wave. The depth of each well determines the phase relationship of the incident and reflected components, which in turn determines the angle at which the resultant sound wave is sent off.

The basic elements of the design process are that the maximum frequency to be diffused is determined by the width of the wells, which is about half the wavelength of the shortest wavelength to be scattered, while the depth of the wells determines the minimum frequency to be diffused. Basically, the thinner the width, the higher the frequency that will be dissipated, and the deeper the well, the lower the frequency that will be dissipated. To preserve the integrity of each well in these diffuser designs, a thin metallic separator or divider is usually used.

The main feature of the Q.R.D. is that one side will always be a mirror image of the other side, whereas the P.R.D. is not symmetrical. The P.R.D. scatters the sound over a very wide angle, but diffusers based on this design are considered inferior to those based on the Q.R.D.: Q.R.D.-based diffusers are much better at suppressing a specular lobe of broadband sound, scatter the sound energy over very wide angles, and are also easier to construct.

Primitive Root Diffuser / Quadratic Residue Diffuser (see figures).
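The notes do not spell out the number theory sequence. The usual choice for a Q.R.D. (an assumption here, not stated above) is the quadratic residue sequence s_n = n² mod N for a prime number of wells N, with each depth scaled to the design wavelength. A minimal sketch:

def qrd_depths(n_wells_prime, design_freq_hz, speed_of_sound=343.0):
    """Well depths in metres for one period of an N-well quadratic residue diffuser."""
    design_wavelength = speed_of_sound / design_freq_hz
    sequence = [(n * n) % n_wells_prime for n in range(n_wells_prime)]
    return [s * design_wavelength / (2 * n_wells_prime) for s in sequence]

# A 7-well diffuser aimed at the 100 Hz standing wave mentioned in these notes
for depth in qrd_depths(7, 100):
    print(round(depth, 3), "m")        # 0, 0.245, 0.98, 0.49, 0.49, 0.98, 0.245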

A diffuser designed to dissipate a 100 Hz standing wave (see figure).

8. Control Room Design

The control room should have the same tonal balance as the studio, achieved by employing bass traps, diffusers, absorbers and non-parallel boundaries. The common wall that separates the engineer in the control room from the musicians inside the studio should be a solid acoustical barrier, in order to prevent sound in the studio from interfering with the sound produced by the room monitors at the engineer's position.

The main monitors should be placed 8 to 12 feet apart, angled at 60° and aiming at the engineer. The engineer's position must be located at the same distance from each loudspeaker as the loudspeaker spacing, which means that the room monitors and the engineer together form an equilateral triangle. Even though some room monitors are elevated, they should be angled downwards, pointing towards the engineer, so that the equilateral triangle is maintained.

Setup for tuning the control room's room monitors (see figure).

A control room is considered to have a good tonal balance when it can produce a flat frequency response, which means that there is unity gain across the whole audio spectrum. To test this, a source of pink noise is injected into the control room via a stereo graphic equalizer, which in turn is fed into a stereo amplifier which drives the main (room) monitors. A microphone placed at the engineer's position registers these readings and is connected to a spectrum analyzer, which graphically displays the acoustic power level of each frequency band in real time. When pink noise is injected into the control room through the room monitors, the spectrum analyzer will show the control room's frequency response, revealing its dips (deficiencies) and peaks (resonances). The graphic EQ is used to "tune" the room monitors to the control room by boosting or attenuating the respective frequency bands accordingly, until a flat frequency response is seen on the spectrum analyzer. Room monitors should have either a 12", 15" or 18" mid/bass or bass driver.

A graphic EQ is a mere corrective tool and not a cure for room acoustical ills; it should not be used as a substitute for poor room acoustics by making radical compensations. Hence it is good practice to achieve a good tonal balance in a control room/studio by acoustically tuning it first, before resorting to any equalization.
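The notes assume a hardware pink noise generator; a pink-noise test signal can also be produced digitally. One common method (assumed here, not described in the notes) is to shape white noise by 1/√f in the frequency domain, giving the characteristic -3 dB per octave slope.

import numpy as np

def pink_noise(n_samples, sample_rate=48000, seed=0):
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    freqs[0] = freqs[1]                      # avoid dividing by zero at DC
    spectrum /= np.sqrt(freqs)               # -3 dB per octave slope
    pink = np.fft.irfft(spectrum, n=n_samples)
    return pink / np.max(np.abs(pink))       # normalise to +/-1

noise = pink_noise(48000 * 5)                # five seconds of pink noise for the measurement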

8.2 The Rettinger Control Room Design

This control room is designed to give minimum delay between the direct sound and the early reflections. This is accomplished by the placement of reflective surfaces of different shapes above and at the sides of the engineer's position; the reflective surfaces are usually of convex shape. Basically, the Rettinger control room design has the walls in front of the engineer's position reflective and the back wall absorptive. The rear wall is made highly absorptive in order to minimize reflections bouncing off it; absorbent material of the type used in anechoic chambers is usually used for this purpose, and the entire rear wall is covered with this highly absorbent, wide-band absorber.

A control room following the Rettinger design concept (see figure).

8.3 The Live End-Dead End (LEDE) Control Room Design

This control room design by Don Davis is the exact opposite of the Rettinger design. The placement of all the absorbent materials is in the front portion of the control room, facing the engineer; the absorptive materials are applied to all surfaces: ceiling, walls and floor. All the surfaces of the rear part of the control room are reflective and diffusive, and all surfaces must be symmetrical from front to rear. The LEDE design is superior to the Rettinger design because the engineer monitors only the direct signal, with minimal if any coloration added, unlike the Rettinger design. Hence whatever has been miked up in the studio will be heard exactly the same way in the control room.

A control room design which follows the LEDE principle (see figure). Note the non-parallel walls to discourage flutter echoes and standing waves, and the diffuser wall at the rear to keep the acoustical energy in the control room.

8.4 Reflection Free Zone (RFZ) Control Room Design

This design follows the same approach as the LEDE control room. The front walls and ceiling facing the engineer are splayed at an angle, so that the reflections emitted from the room monitors are directed around, rather than at, the engineer's position. The engineer's position receives only the direct signal; hence it is called a reflection free zone.

Assignment 10 – AE009

A studio modeled after the Reflection Free Zone control room (see figure).

AE28 – 3D Sound

1. Localization of Sound Sources by the Human Ear
   1.1 Arrival Time Differences
   1.2 Intensity Differences
   1.3 Phase Differences
2. Binaural Recording/Dummy Head (Kunstkopf)
   2.1 In-Head Localization
   2.2 Disadvantages of Binaural Recording
3. Loudspeaker-Compatible 3D (Transaural Audio) Mixing System
4. Surround Sound
   4.1 Setup
   4.2 Differences Between Stereo and Surround Mixes
5. Surround Mixing Techniques
   5.1 Reduced Listening Fatigue
   5.2 Tomlinson Holman Experiment (THX)
6. Surround-Sound Miking Technique
   6.1 Ambisonic – SoundField Microphone MKV

EXTRACTS: Mixing Sound, Part 1: Universal Observations; Part 2: Perspective – where do the sounds go. By Tomlinson Holman

AE28 – 3D SOUND

1. Localization of Sound Sources by the Human Ear

1.1 Arrival time differences
These differences are experienced when the signal from the sound source reaches one ear (the one closer to the sound source) earlier than the other ear, which is further away from the sound source.

1.2 Intensity differences
Like arrival time differences, intensity differences between the ear input signals occur because the ears are separated by approximately 17 cm. Whenever a sound source is not situated along the vertical middle plane of the head, an acoustic shadow will occur. Another factor that determines the intensity of a signal at the ear is how much the signal has to diffract around the head in order to reach the ear further away from the sound source: the head becomes an obstacle for the higher frequencies of a signal that is not coming from exactly in front, behind or above the head. The more a signal has to diffract around the head, the less high-frequency content there will be in that signal. This influences the intensity difference between the two ear input signals.

1.3 Phase differences
Sound travelling towards the head will not only reach the eardrum directly; part of it will be reflected by the ridges and structures of the pinna itself, not discounting reflections from the shoulders as well. When all these reflections combine at the eardrum, a comb-filter effect is produced by the combination of the original signal plus its delayed replicas, hence the term "phase differences". The comb-filter effect is different in each ear.

Each location around us is defined by a unique and typical combination of these three factors. The above-mentioned factors are responsible for us hearing "three-dimensionally", which enables us to localize a sound source with great accuracy. Normal stereo recordings done with any of the well-established stereo miking techniques are only able to cover two of these requirements, i.e. arrival time and intensity differences: some techniques stress arrival time differences (A-B), others intensity differences (X-Y), and the hybrids are the near-coincident miking techniques. The only known, but not common, recording technique that gives three-dimensional results is the Dummy Head recording.
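The arrival time difference can be put into numbers with a simple far-field approximation (not given in the notes): ITD ≈ d·sin(θ)/c, using the ~17 cm ear spacing mentioned above.

import math

def itd_seconds(angle_deg, ear_spacing_m=0.17, speed_of_sound=343.0):
    # simple far-field approximation of the interaural time difference
    return ear_spacing_m * math.sin(math.radians(angle_deg)) / speed_of_sound

print(round(itd_seconds(90) * 1000, 3), "ms")   # ~0.5 ms for a source fully to one side
print(round(itd_seconds(30) * 1000, 3), "ms")   # ~0.25 ms at 30 degrees off-centre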

2. Binaural Recording/Dummy Head (Kunstkopf)

In a dummy head, two condenser microphones are placed inside a head made of either wood, foam or hard plastic, at the rear of the ear canal. The phase differences are produced by the dummy head in a very "ear-specific" way: the left/right signal is only for the left/right ear respectively, like listening to music over a pair of headphones. Binaural recording with headphone playback is the most natural and spatially accurate method known. The recreation of sound source locations and room ambience is startling: sonic images are reproduced in front, behind, to the sides, above and even below (height information) the listener, and listeners experience the illusion that they are in the arena where the musical instruments are actually being played and recorded. This effect may not work for everyone, since the dummy head's ears are modeled after the average human ear; should a person listen to a recording made with a dummy head's ears which are different from his own, the 3D effect will be reduced. To circumvent this problem, small omni-directional electret condenser microphones which fit into the listener's own ears (e.g. Sennheiser MK 2002) can be used to record the live music instead. A recording made this way will give a very real sense of ambience and three-dimensional realism. The recording dummy head needs to be placed fairly close to the ensemble. The result should be nothing less than stunning.

2.1 In-head localization
At times the images are heard inside rather than outside your head. One such reason is disturbance of the conch resonance caused by headphones (the conch is the cavity just outside the ear canal). This can be solved by restoring the conch resonance via equalising the headphone signal, which will cause the signal to be heard outside your head again.

2.2 Disadvantages of binaural recording
It is not loudspeaker or mono compatible. Listening tests revealed that binaural recordings appear up-front on loudspeakers, and they also sound more reverberant than they actually are when listened to over headphones. When a binaural recording is played over a pair of loudspeakers, acoustic crosstalk occurs because the left and right signals merge with each other. A hole-in-the-middle may also appear over the monitors; the centre image can be made to appear more solid by boosting the presence range. Inadequate spaciousness at low frequencies over loudspeakers can be remedied by a low-frequency boost of about 15 dB at 40 Hz and +1 dB at 400 Hz, which will improve the low-frequency separation.

3. Loudspeaker-Compatible 3D (Transaural Audio) Mixing System

The right combination of arrival time differences, intensity differences and phase differences is needed to produce the illusion of a stereo image that is sonically detached from the loudspeakers. This 3D stereo image can be placed anywhere in a 180° horizontal plane at the front and sides of the listener.

Such a system has to be able to eliminate, or at least minimize, the interaural acoustic crosstalk which occurs in stereo listening: when played over loudspeakers, the listener hears the left signal and the right signal in each ear (due to acoustic crosstalk) (fig. 1). Crosstalk cancellation is achieved by using a special EQ setting and a delay unit which takes a split of the input left/right signal and crossfeeds it, phase reversed, to the opposite right/left output channel (fig. 2). The result is that each loudspeaker reproduces its respective signal plus a phase-reversed version of the opposite signal. At the listener's ears, the phase-reversed right signal contained in the left loudspeaker's output cancels the crosstalk right signal, so the listener hears only the intended, correct signal (fig. 3). This 3D effect is experienced not only in the optimal stereo position but from any position in front of the loudspeakers.

Fig 1 (see figure).

In order to perform 3-dimensional mixing in real time, fast and powerful microprocessors are needed. In 1990, Q Sound™ from Calgary, Canada used 15 Motorola 56000 DSP chips to process and produce 3-dimensional sonic images. Mixes done with Q Sound appeared on the market in 1990 on Madonna, Paula Abdul, Janet Jackson and Sting's Soul Cages albums. The fee then for using Q Sound on an album production was approximately US$20,000, which included both the rental of equipment and engineers. Nowadays it can be purchased for less than a few hundred dollars as a software plug-in for Sound Forge, Cakewalk and Pro Tools audio editing and MIDI sequencer software. Q Sound's latest 3D software boasts mono compatibility.

Another 3D mixing system is Roland's RSS™. The RSS 3D effect was described at the time as being very apparent, if not harsh; it was said that RSS was not as subtle as Q Sound in its effects. 3D mixing is used mainly to enhance the transparency and spread of a mix, rather than solely creating wild and sonically detached images leaping out of the loudspeakers.

Fig 2: An electronic crosstalk canceller (block diagram; the left and right inputs pass through head EQ and delay stages, and phase-inverted crossfeeds are summed into the opposite outputs).
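A very simplified sketch of the crossfeed idea in Fig 2: each output carries its own channel plus a delayed, polarity-inverted copy of the opposite channel. Real systems such as Q Sound or RSS also apply head-related EQ, which is omitted here; the delay and gain values are illustrative assumptions only.

import numpy as np

def naive_crosstalk_canceller(left, right, sample_rate=48000,
                              interaural_delay_s=0.0002, crossfeed_gain=0.85):
    delay = int(round(interaural_delay_s * sample_rate))
    def delayed(x):
        # delay by padding with zeros and trimming back to the original length
        return np.concatenate((np.zeros(delay), x))[: len(x)]
    out_left = left - crossfeed_gain * delayed(right)
    out_right = right - crossfeed_gain * delayed(left)
    return out_left, out_right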

Fig 3: The working of an electronic crosstalk canceller.

4. SURROUND SOUND

Surround sound uses 4, 6 or 8 loudspeakers, depending on which format is used to decode the data in order to reproduce the recorded music.

4.1 Setup
Dolby Digital (AC-3), or 5.1, uses 6 loudspeakers: front and rear left and right loudspeakers, a centre loudspeaker and a subwoofer (the 0.1). Digital Theatre System (DTS) is also a 5.1 surround sound format. Sony Dynamic Digital Sound (SDDS) is 7.1, which means it has 7 loudspeakers plus a subwoofer, making 8 loudspeakers in total; the additional two loudspeakers are placed in between the Left-Centre and Right-Centre positions of the Left, Centre and Right loudspeakers.

All the loudspeakers and amplifiers should be identical, with the loudspeakers possessing a frequency response of 60 Hz to 20 kHz. Using powered monitors, e.g. Mackie HR 824 or Genelec 1030A, will save the engineer both time and effort in setting up power amplifiers and then connecting the power amplifiers to the loudspeakers. In surround mixing, the additional audio signals needed to feed the other loudspeakers are derived from the sub/group output busses of the console. It is important to roll off the bottom end of the monitors at around 100 Hz for the Yamaha NS-10M, or 80 Hz for the Mackie HR 824 or Genelec 1030A. This is to ensure that there is a smooth transition at the crossover frequency point between the monitors and the subwoofer.
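A minimal sketch of the monitor/subwoofer split described above: the main channels are high-passed at the crossover frequency and their summed low end feeds the subwoofer. The 4th-order Butterworth filters are an assumption for illustration; the notes do not specify a crossover type or slope.

import numpy as np
from scipy.signal import butter, sosfilt

def bass_manage(channels, sample_rate=48000, crossover_hz=80.0):
    hp = butter(4, crossover_hz, btype="highpass", fs=sample_rate, output="sos")
    lp = butter(4, crossover_hz, btype="lowpass", fs=sample_rate, output="sos")
    mains = [sosfilt(hp, ch) for ch in channels]      # feeds the five main monitors
    sub = sosfilt(lp, np.sum(channels, axis=0))       # feeds the subwoofer
    return mains, sub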

The subwoofer (whose frequency response should be from about 40 to 110 Hz) need not be similar to the surround monitors. Managing the low end in this way also ensures that the centre loudspeaker will not be overloaded.

4.2 Differences Between Stereo And Surround Mixes
Surround mixing is more complicated than conventional stereo mixing. Nevertheless, it gives the engineer more creative decisions, because the mix is reproduced over 6 loudspeakers instead of 2, offering a wider frequency response and dynamic range. As a result there are fewer constraints, unlike stereo mixes, which are reproduced over only two loudspeakers.

In any stereo mix, the audio engineer has to make "space" for the vocal so that it stands out clearly in the mixture of sounds from the drums, synthesizers, pad sounds, backup vocals, piano and electric bass/guitar. If the engineer is not skilful, frequency clashes or masking will result. Full parametric EQ and high- and low-pass filters are used to limit or alter the individual frequency ranges in order to prevent or minimize this from occurring, to make the various instruments sound better, and to "separate" or "blend" the individual sounds from or with one another. Different artificial reverberations are also used, in conjunction with the equalization, to help create the illusion of spatial depth and distance between the various instruments.

5. Surround Mixing Techniques
During surround mixing, the drums or vocal could be placed across the front left, centre and right loudspeakers to increase their apparent size, or in a single loudspeaker, where they may sound small. With the vocal panned to the front-left, centre and front-right loudspeakers of the surround mix, the congas could, for instance, go in the rear-right speaker, the keyboard/electric guitar in the rear-left speaker, and the background vocals in both the rear and front loudspeakers. In this instance the listener is surrounded by music: the listener becomes "included" in the music, an element of it and not merely an observer. It is not natural, but it is nonetheless creative and different. The vocal's or drums' bottom end can be sent to the subwoofer in order to make their presence felt without the muddiness frequently encountered in stereo mixes whenever the bass is turned up too much. As a result, drums sound more realistic this way than ever before when compared with conventional stereo mixes.

In a surround mix of a classical recording, the musical ensemble can be reproduced over the front left, centre and right loudspeakers, while the rear loudspeakers can be used to reproduce the reverberation of the hall and the audience's applause. The audience's responses, like the clapping, whistling and cheering that are also recorded at a live concert, are panned to the rear speakers, giving the listener the feeling of sitting in the hall with them. The reverbs and slap delays on the instruments and vocal can be sent to the rear speakers to make the mix feel more realistic, just like what the listener would encounter in the concert hall at the performance. Longer-than-normal reverbs can be used on the rear loudspeakers in a surround mix without the fear of "muddying" up the mix. The surround mix can be made to sound realistic, or even unrealistic for that matter.

Less compression is needed in surround mixes. Since they have a wider dynamic range than stereo mixes, they will sound more realistic. The balance of the different instruments' levels is also not as critical, unlike stereo mixes, where the various instruments need to share the same stereo signal reproduced over only two loudspeakers; there the engineer needs to alter levels during mixing to keep the stereo mix interesting and, most importantly of all, to ensure that the various instruments do not "step on each other". When 3D mixing is incorporated into surround mixing as well, the different ways of surround mixing a song or sound track become endless. This is an introduction to surround mixing and serves as a guide to it.

5.1 Reduced Listening Fatigue
Incidentally, audio engineers experience less hearing fatigue, if not a complete lack of it, when mixing in surround than in stereo. This is probably due to the nature of human hearing, where we listen to sounds coming from all directions instead of only from the front left and right, as is the case with stereo.

5.2 Tomlinson Holman Experiment (THX)
Tomlinson Holman Experiment (THX) is not a surround sound format; it is a universally recognized standard for an ultimate cinematic experience with regard to audio reproduction. THX equipment is used in conjunction with surround sound formats, and audio equipment that is THX certified meets its specifications.

6. Surround-Sound Miking Technique
In general, listening in surround sound reduces the stereo separation (stage width) because of the centre speaker. Mic techniques for surround sound should be optimized to counteract this effect, and a number of mic techniques have been developed for recording in surround sound.

6.1 Ambisonic – SoundField Microphone MKV
SoundField 5.1 Microphone System: this system consists of a single multiple-capsule microphone (SoundField ST250 or MKV) and a SoundField surround-sound decoder for recording in surround sound. The SoundField MKV is a unique product offering a high degree of accuracy in the generation of phase-coherent, coincident stereo and mono microphone patterns. The user, via the control unit, is able to electronically steer and move the microphone both in real time and at the post-production stage, whilst a fully three-dimensional output signal is available for surround sound use. The decoder translates the mic's B-format signals (X, Y, Z and W) into L, C, R, LR, RR and mono subwoofer outputs.

The four capsules are tetrahedrally arrayed, and all the capsules' outputs are time-aligned to produce a phase-coherent signal. In addition to the usual polar pattern and angle controls, the SoundField system offers three unique additional controls: Azimuth, Elevation and Dominance. The Azimuth control allows electronic rotation of the microphone to any horizontal position through a full 360 degrees. The Elevation allows vertical positioning of the microphone up or down by 45 degrees, and the Dominance gives the effect of moving the microphone closer to or further away from the sound source, like the zoom lens on a camera. Even the polar pattern response can be altered, from omni to unidirectional. This combination of controls allows the SoundField user to replicate any microphone configuration, pointing in any direction.

In addition to the Left/Right stereo outputs, the SoundField control unit also has a four-channel "B-Format" output. These B-format signals carry three-dimensional information from the entire sound environment and can be recorded on four channels of a multitrack recorder. The multitrack recording can be replayed through the SoundField MKV control unit in a post-production environment for re-mixing, utilizing the azimuth, elevation, dominance, polar pattern and angle controls, without the need to re-record. For example, if a drum kit recorded with the SoundField set to an equivalent crossed pair of cardioid microphones at an angle of 90 degrees now needs more ambience when it is mixed in stereo, the stereo image can be widened and the microphone pulled back further away from the kit; alternatively, the microphone can be aimed off-axis from the kit if necessary. If, in a choral recording, the soloist's vocal is too distant in the mix, the microphone can easily be manipulated and repositioned closer to the vocal at the post-production stage, without any risk of a popping effect. (The illustration on the right is the decoder remote unit.)
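A minimal sketch of how a "virtual microphone" can be derived from the four B-format signals. It assumes one common first-order convention (W carrying the omni component scaled by 1/√2, X/Y/Z the directional components); actual SoundField decoders use their own proprietary processing, so treat the scaling and pattern law as illustrative only.

import numpy as np

def virtual_mic(W, X, Y, Z, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    directional = (X * np.cos(az) * np.cos(el)
                   + Y * np.sin(az) * np.cos(el)
                   + Z * np.sin(el))
    return pattern * np.sqrt(2.0) * W + (1.0 - pattern) * directional

# e.g. an equivalent crossed cardioid pair at +/-45 degrees, steered after the recording:
# left  = virtual_mic(W, X, Y, Z, +45)
# right = virtual_mic(W, X, Y, Z, -45)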

EXTRACTS

Mixing Sound, Part 1: Universal Observations
By Tomlinson Holman

Starting a series on mixing, we look at those observations common to many music mixers working for the first time in surround as opposed to 2-channel stereo. These observations have been accumulated over the last few years as first one then another mixer explained their early surround experience in meetings I've attended. A possibly surprising feature was how much commonality there was in surround mixing technique among the various mixers, despite their never having talked to one another.

1. Surround mixing generally uses less program equalization than stereo. The reason for this seems to be that dense mixing of many elements leads to sonic clutter in stereo, resulting in the masking of one part of the program by another. Once the sources are more spread out spatially, they can be better heard. This was found first technically in command and control headquarters by the U.S. Army during World War II: by spatially spreading out a bell, buzzer, klaxon, horn, etc., listeners were better able to distinguish quickly which was alerting them than if they were concentrated in one spot. This "multi-point mono" approach means that spreading separate monaural sources out in space makes them better distinguished.

One of the interesting differences between pure music mixing and mixing for picture is the fact that music mixes are more likely to be heard over and over than audio-for-picture ones. One thing that is attractive to music listening is following the various streams on different occasions: first we might hear the melody and lead singer, the next time the background vocals, then concentrate on the lead instrument part, or even shifting focus among the streams during one listening. Perceptual psychologists call this auditory streaming. It is the ability to "hear into the mix" that matters in this context. With sources more spread out, they are more easily distinguished, so it is not as much trouble to get them to be heard.

There are still reasons to equalize. One of the main ones is microphone placement. Since the choice of microphone placement captures the timbre of a natural source only along one or multiple axes, but not the whole output of the source, one main reason to equalize is to get the source to sound like itself! This may seem strange, but let's take an exaggerated case and you'll see what I mean. Let's say we have to mic Yo Yo Ma playing the cello, but there can be no microphone in front of the instrument because we're photographing the performance and it is undesirable to interrupt the view. We might wind up with a Boundary Layer Microphone on the floor in front of the cellist. The microphone has very flat response on a large enough baffle, which the floor would be, but it is off axis from the position where we might choose to capture the timbre correctly. Since one of the primary ways to distinguish one sound from another is by its timbre, although we've used a flat microphone, we have not captured the instrument properly, and so program equalization needs to be done. Equalization for multichannel thus becomes both easier and also better able to concentrate on rendering the timbre of the source in a manner that is closer to the natural acoustic timbre of the source.

Here it is easy to agree with the proposition that stereo mixing often involves elaborate equalization just to make the various instruments audible. Such equalization may well not be representing the timbre of the source well,
but rather emphasizing it so that it can be heard in context against others in the mix. This is one of the reasons for wanting to listen over and over.

SCHOOL OF AUDIO ENGINEERING AE28 – 3D Sound Student Notes One way to determine the equalization in this case is analytically. among other things. Yet. There is another reason at work for voice equalization. When I take a measurement microphone into a space and measure the detailed frequency response from a source. lavalieres for instance. but not in such an exaggerated way. 496 . and vice versa. I have a theory that goes like this: People have put up their favorite studio microphones in front of a soloist and panned it to the center of a multichannel system. Survival of the fittest has caused the evolution in microphone and EQ choice for 2-channel stereo to have a peak injust the region where the listening conditions induce a dip! If you start with a flatter microphone and EQ. overcome the interaural crosstalk-induced dip due to stereo. stereo. Unfortunately it took a ten-band parametric equalizer to make the match. the peak-to-peak variation is reduced considerably (also reducing the corresponding need for sound system equalization). constructive and destructive interference will occur at the ear due to the time delay. like boom mics vs. The reason that they had this experience is that they have a long-time experience with phantom image stereo and have. through the choice of microphone and equalization. Comparing it to the long-term spectrum of the mic on the floor (and getting the sign of the difference right!) would result in the required equalization to turn the timbre captured at the undesired microphone position into one much more like the reference position — and possibly even indistinguishable from the “right” position. this doesn’t work. Again. that a small amount of compression is a good thing. and we’ve gotten a much more interchangeable sound quality from the less desirable positions. the components of which are affected by the single-point pick-up in turning the strong peak-topeak variation in response into a stronger-than-heard peak-to-peak variation in level. due to interaural crosstalk. and makes the sound more like the source than without it. They have found the sound honky — peaked up in the presence region around 2 to 3 kHz — said. Voices. Thus. especially lead vocals. I say unfortunately because. “Ug. there may still be some need to compress. My theory for this starts with the fact that the microphone is capturing sound at one point in space (mono dialog recording. Panning a source to the center of a stereo soundfield leads to a “darker” timbre. So equalization can be reserved more for getting things to sound good than for getting them to be distinguished in the mix for multichannel compared to stereo. This is the sound from the left loudspeaker that reaches the right ear. I’ve done this for voice using the various microphone techniques normally used. This need is not so pressing in multichannel as it is in stereo. The reasons are much the same as for equalization — compression is used to keep one auditory stream at a more constant level so that it can be distinguished from other sources in the mix. we hear at two points in space. The recorded voice as it goes through its various formants produces varying spectra. I see a lot of peak-to-peak variation in level caused by standing waves and reflections — even up to very high frequencies. when I spatially average the sound. and easier to listen to. 2. 
Since the flight time from the left speaker to the right ear is slightly longer (only about 200 microseconds longer typically) than for the right speaker to the right ear.” and rejected the center channel and have gone back to stereo. as is standard). I find that. in my book. it also showed how up against it a dialog mixer is when faced with a console with a typical 4-band parametric equalizer attempting to match a boom mic and a lavaliere by ear. in mixing dialog in documentaries in particular. you may well find that you like the multichannel center better. although it worked well. Generally this means you need less peaky voice tracks for multichannel vs. are usually panned center in stereo. When we listen naturally. The most salient feature of this effect is a strong dip centered in the 2 to 3 kHz region. for sound centered in middle of a stereo image. We can put up a microphone where we’d like it in rehearsal (hopefully of a solo portion of the program to avoid clutter from other sources) and capture the long-term spectrum at that point in space. adding the same choice of equalization that they have used before for stereo. even over an area of only a head size in diameter. Compression is less needed in multichannel sound than in stereo.

On the one hand. When asked the relevant question most people wanted a “best seat in the house” perspective. John Atkinson.SCHOOL OF AUDIO ENGINEERING AE28 – 3D Sound Student Notes employing some compression actually makes the source sound “more like itself” than without it. but what I hear often seems to be a “radio mix” intended for intelligibility in less than ideal conditions by employing way too much compression (they’re going to do it again at the radio station anyway). younger people seemed to take to the idea of “in-the-band” in greater numbers than older people. and the microphone is not. Part 2: Perspective ã where do the sounds go By Tomlinson Holman By far the most important decision taken by the producer and mixer of virtually any surround music project is which of two primary perspectives to employ: frontally oriented with surround enhancement called the direct/ambient approach. Goldilocks comes to mind: the porridge (equalization and compression) should not be too cool and not too hot. and. of course. On the other hand there is probably little familiarity in the general public about the issue. I believe there is a case to be made for each of these. It is useful to know this and factor it in along with some knowledge of psychoacoustics to enhance mixes for the widest range of listeners. whether that be by riding gain on the classical soprano or putting the source track through a compressor — but not too much. Of the two. I must say that I don’t have much experience with pop music vocals. because our two ears are averaging the soundfield spatially. some of the critics think I'm stuck in my ways as a direct/ambient natural classical music mixer and wonder aloud why I don't exercise the freedom that surround sound offers to put the listener at the 497 . especially for some listeners. Also. Greater compression leads to hearing artifacts. you should not hear background pumping or other problems or you will need to back off on the compression. For music. for a wide spectrum of people. so they may be responding with what is more familiar to them. the experience of being embedded in the music is at least unfamiliar. and this is likely to be the first time they've ever heard anything like the question. the editor of Stereophile. I've been criticized in the past for my surround mixes. but just right. recently wrote that having percussion coming at you from the back is unnatural and disturbing. you may find that the decision based on marketing is a toss up. the decision whether to use a direct/ambient or in-the-band approach might be weighted by audience research (especially if you are a crass marketer). which the CEA sponsored in a blind telephone survey of 1000 households. So. In fact. A woman in a CEA focus group on surround also found it to be uncomfortable having direct sounds coming from behind her. and multichannel doesn’t play on the radio anyway (at least not yet!). when broken down by age. the in-the-band perspective is the more controversial one. so depending on what you are mixing and its audience. and that ultimately it is the intersection of the aesthetic values of the project with surround sound capabilities that should make the decision. So I’ve given a rationale for some compression being right. or a “middle of the band” perspective. The amount of compression I’m talking about for documentary dialog is in the range of 6–8 dB maximum. and perhaps even disturbing.

The reviews? Here's one: 498 . You'd hardly think I could do both! I guess I'm an equal opportunity offender. with head-type spacing and angled at 110 degrees centered over and behind the conductor's head. I have used time delay on the spot mics. Recently I've been criticized for exactly the opposite: playing too much with space making mixes too gimmicky. The classical music mixes I've worked on involved spaced multi-microphone technique using a basic ORTF pair of cardioids. In fact. because this is the central question of perspective. In the 10. which comes about as a function of placement in the hall.SCHOOL OF AUDIO ENGINEERING AE28 – 3D Sound Student Notes conductor's podium and wrap the sound around more. which is what I used) is useful. There's a lot more about the natural. The hall mics were omnis directed to the surround speakers only. but I do mean to try its techniques over the next year or so. because otherwise as you bring the level up you are also adding in a sound that arrives at the spot microphones earlier than at the main ones. and where things were panned. So again time delay (or its postproduction adjustment editorially in a digital audio workstation. they may be too far away from the stage in time. the paper goes beyond anything I've tried. The left and right outriggers were assigned hard left and right. cardioid spot mics. direct/ambient style approach in the long paper described in Relevant Research this month. an outboard set of left and right cardioids more or less in line with the ORTF pair. One of the happy accidents we got along with the music was stage hands setting the stage of a hall while a piano was being tuned. the omnis are sent to the dipoles and the cardioids to the center back in order to make a more fully enveloping soundfield. and omni and cardioid hall mics. and the spot mics where appropriate to match into the soundfield. which leads to exaggeration of the spot mic due to the precedence effect. adding a deeper perspective than even the omnis. Interesting. I'd like to explain how examples of each type came about. Stay tuned. the center ORTF pair just to the left and right of center.2-channel version of these mixes. and the hall cardioids were pointed away from the orchestra. With the right perspective on the surround mics.

Finally I heard a system that is light years ahead of the rest when it comes to re-creating a sound space and placing sounds within it. This was a 10.2-channel system from a company called TMH, designed by Tom Holman. Unlike much of our present-day surround sound, the approach is strictly direct/ambient. That is, for those demo recordings that were simply trying to re-create sounds that were happening on stage in front of you, there was no indication that any sound at all was happening to the sides or to the rear; the ambient cues did not sound like anything specific. It sounded as if you were in the acoustic space of the recorded venue. I didn't have to imagine that I was in the recorded venue, because it sounded as if I was. Truly incredible. The reviewer obviously got the point: you don't hear the surrounds obviously, but shutting them off makes the soundfield collapse into the front.

Yet there are some known psychoacoustics that mitigate against attempting imaging everywhere in the 5.1-channel system, or even in quasi-6.1 (Surround EX), 6.1 (DTS ES), or 7.1 (Logic 7). Phantom imaging works in front and behind, but it does not work to the sides. This is because it is a mighty wide gap between a left loudspeaker at 30 degrees and a left surround at 110 degrees, and the different frequency response in your ear canal for these directions of arrival makes them not match, even if your speakers and room are perfect. This means that panning down the sides in 5.1 is fairly hopeless, with different notes from the same instrument popping back and forth due to the different frequency responses, so side imaging should probably best be left out of the equation. I've reported earlier on side imaging in the 10-channel system: by "filling in the gap" between 30 and 110 degrees with a speaker at 60 degrees, the disjuncture is avoided, and side imaging, even at 90 degrees, can work. Yes, there is an advantage here of the 10.2 system over standard 5.1 in producing greater envelopment, because of the greater number of directions of arrival we've got, but even the 5.1-channel version of this program material works quite well.

I have found experimentally that the total level of the surround sound vs. the direct sound is very critical, with a ±1 dB change making a huge change in frontal "focus" vs. surround "envelopment." It is easy to go wrong, so watch out for your monitor system calibration, as it turns out to be critical here.

Some believe in such a simplified world that there is a stereo soundfield in front of us, one behind us, and one to each side, complete with phantom imaging even if the center channel is not used ("it's the devil's work, don't go there," as one editor put it to me). Now maybe I'm getting looser as time goes by, but it is early in the use of multichannel sound for music, and I think experimentation is still in order, although we hear about it in workshops and seminars in the field. But perhaps there needs to be some organizing principles applied to the logic of experimentation.

During the presentation I attended at CES, jazz great Herbie Hancock gave a short talk about how multichannel sound can be a boon to musicians. It's nice to hear from a leading musician such statements as "stereo is unnatural," "surround is natural," and "the more channels, the better." He cited some of his own stereo work that was too sonically complex to fit comfortably within a two-speaker medium, and he demonstrated a 10.2-channel remix of one of his tracks that stunned all in the audience with its beauty and musicality. As he pointed out, when the content was too big for stereo and demanded a multichannel format, the album flopped in the marketplace because the record company demanded a stereo mix.

Herbie Hancock loaned us the 46 original tracks of his tune "Butterfly." So, knowing these facts, here is what I did. I started by listening to all the tracks, some in groups, and consolidating them into a mix. Then I broke down the tracks by type and made some basic decisions. There were certain tracks that are attention-getting percussion; others are more warm and fuzzy. The tune is in the jazz idiom, where different solo instruments take over from time to time and are highlighted. I started by placing the percussion across the front stereo stage and centering the soloists where you'd expect they might move in and take the spotlight. Actually, maybe this is a little weird, because soloists don't usually move to the center, but having the audience know where each soloist was going to enter in this complex mix kind of helps them in sorting things out, and having the percussion up front reduces that feeling of uncertainty that occurs when transients come from all around you. Herbie's keyboards produce a warm sound and I felt they should be enveloping, so I put them principally into the left and right wide (±60 degrees) and left and right surround channels to embed you inside them; the sounds to the sides were those that did not have a particularly hard edge to them. Certain spot sound effects I left for center back, as they were intermittent, adding a certain zing by being put there without being buried in the mix details.

Following a lot of sound around is tiring on the listeners, but moving some sound can be exciting, so I picked just a couple of things to move, with appropriate motion and without being annoying. I theorized that certain instruments might be better if they were moving. The first is a set of hand chimes, usually played by hanging them in the air at about the head height of the musician; running a glissando across them really seemed to make sense putting them into the left and right high channels (±45 degrees in plan and +45 degrees elevated). Then the listener hears something like the player would hear. Next, the killer flute solo: it occupied the spot of the soloists in center front when it played earlier in the tune, but at the end it's heard alone and exposed. The whole thing comes to an end with a flute solo without accompaniment, and here it made sense, at least to me, to bump it around the multichannel soundfield, as it is alone and completely exposed after having a wall of sound up to then. This was a 10-channel mix.

How did it go over? I played this back for Herbie, and he liked it. He told me that the flute was the "Butterfly," and that it takes off at the end. Duh, of course; I hadn't thought of the title at all. In the end, more than forcing one solution or the other, it seems that both styles of work have a great deal going for them, and the choice between them should relate to the aesthetics of the content.
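For reference, the sketch below simply collects the loudspeaker directions named in the account above into a small data structure. It is illustrative only: the passage does not enumerate a complete 10.2 layout, so only the channels it actually mentions are listed, and the azimuth sign convention (positive to the left) is an assumption for the example.

# Loudspeaker directions mentioned in the passage above (degrees).
# Channels not named in the text are deliberately omitted.
mentioned_channels = {
    "left front":     {"azimuth": +30,  "elevation": 0},
    "right front":    {"azimuth": -30,  "elevation": 0},
    "left wide":      {"azimuth": +60,  "elevation": 0},
    "right wide":     {"azimuth": -60,  "elevation": 0},
    "left surround":  {"azimuth": +110, "elevation": 0},
    "right surround": {"azimuth": -110, "elevation": 0},
    "left high":      {"azimuth": +45,  "elevation": +45},
    "right high":     {"azimuth": -45,  "elevation": +45},
    "center back":    {"azimuth": 180,  "elevation": 0},
}

# The "mighty wide gap" described in the text: 30 to 110 degrees without a
# wide channel, versus the largest remaining gap once a 60-degree speaker
# fills it in.
gap_without_wide = 110 - 30
gap_with_wide = max(60 - 30, 110 - 60)
print(gap_without_wide, gap_with_wide)   # 80 vs. 50 degrees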

AE29 – The Broadcast Concept

1. The Broadcast Chain
2. Broadcast Standard
3. The Audio Processing
   3.1 Encoding and Decoding
   3.2 Automatic Level Control
   3.3 Modulation / Transmission
   3.4 AM Modulation
   3.5 Frequency Modulation (FM)
4. Broadcast Specifications
   4.1 Deviation
   4.2 Modulation
   4.3 Transmission Stage
   4.4 Electromagnetic Wave
5. Specifications of Broadcasting Bands and Frequency
6. Stereophonic Broadcasting
   6.1 Very High Frequency (VHF) Waves 30 MHz – 300 MHz
   6.2 Ultra High Frequency (UHF) Waves 300 MHz – 3 GHz
   6.3 Side Bands
   6.4 Base Band
   6.5 Stereo FM Broadcasting
7. Radio Station
   7.1 Personnel
8. Transmission of Television Signal
   8.1 Horizontal and Vertical Polarization
   8.2 Circular Polarization
   8.3 Stereo Television Sound
9. SATELLITE
   9.1 Direct Broadcast Satellite
   9.2 Cable Network

AE29 – THE BROADCAST CONCEPT

1. The Broadcast Chain

The input to the broadcast chain is from one or more program sources. This could be a program from the in-house line input (a recorded program) or an external program such as a live telecast report. In this broadcast chain the components involved range from hardware such as microphones, mixers, amplifiers, speakers, signal processing units, program storage devices and the magnetic recording and playback heads of tape machines, to software such as analogue disks, digital disks, etc. The signal flow can be summarised as:

Transmission: Mic – Console – Signal Processor – Encoder – Modulation – Transmitter – (Wave)
Reception: (Wave) – Receiver Antenna – Radio – Speaker

2. Broadcast Standard

In order to control and maintain a standard for all the equipment used in broadcasting, international and national bodies set broadcast standards. The most influential international organisation was the Geneva-based CCIR (Comité Consultatif International des Radiocommunications). This organisation has a worldwide reach, as it shares technology, debates technical issues and issues recommendations to most of the radio stations around the world. Standards associated with its work include NTSC (National Television Standards Committee), PAL (Phase Alternation Line) and SECAM (Séquentiel Couleur à Mémoire). Another such body, in the U.S.A., is the FCC (Federal Communications Commission), whose roles include establishing rules and granting and maintaining the licences of radio stations.

3. The Audio Processing

The audio processing functions include special effects such as reverberation, equalisation, peak limiting, automatic level control, etc.

3.1 Encoding and Decoding

This is a very important stage in the signal chain. Due to its position at the last stage of the signal chain, as the final device, it handles all the audio frequencies prior to the modulation of the carrier frequency at the transmission stage. The encoder may have multiple inputs in addition to the main program, such as the stereo composite signal.

3.2 Automatic Level Control

Automatic level controllers are designed to control the audio from disparate sources more accurately and efficiently than a human operator. The automatic level control works as follows: the audio program is passed through an amplifier which is capable of automatic gain control in response to the application of an external DC control signal.

If the program level is high, the output will be attenuated, and vice versa. The output signal from the variable-gain amplifier is sampled and fed back to the control circuit, usually a detector, which produces a DC signal that varies in level in response to changes in the audio program level. Connecting this DC signal back to the gain control of the variable-gain stage completes the loop.

3.3 Modulation / Transmission

Located at the final link in the broadcast chain is the transmitter, which is not an audio device. Its function is to superimpose the audio signal (the program) onto a higher-frequency carrier wave in the radio frequency spectrum. This superimposition is called "modulation". The carrier wave can be modulated in various ways; the methods commonly used in radio broadcasting are AM (Amplitude Modulation) and FM (Frequency Modulation).

3.4 AM Modulation

A.M. is the simplest form of modulation and is used for the long, medium and short wave bands. The audio modulates, or varies, the size of the peaks and valleys of the carrier, the space between succeeding waves of the carrier frequency. A carrier whose shape (envelope) varies with the audio signal is said to be amplitude modulated.

A.M. stereo transmission came considerably later than monophonic A.M. broadcasting. Frequency-multiplexing techniques such as those employed for FM are impractical in A.M. because of its relatively narrow spectrum allocation; for A.M. stereo it is necessary that all the sidebands from the amplitude and angle modulation overlap, and various approaches were tested. Nearly all of the technical approaches utilised a combination of amplitude modulation and angle (frequency) modulation of a single carrier, in which the angle modulation is generated as frequency modulation, phase modulation, quadrature modulation or independent-sideband modulation, resulting in a single combined carrier. C-QUAM (an A.M. stereo system) is widely used in countries such as the U.S., Canada and Australia. This system uses the L + R and the L - R signals, encoded by amplitude modulation (A.M.) of two carrier signals that share the same frequency but are 90° apart in phase. This simple approach will produce distortion in monophonic receivers when only the left or only the right signal is being transmitted.
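Looking back at the automatic level control described in Section 3.2, the sketch below illustrates the feedback idea: sample the output, derive a control value in a detector, and feed it back to the variable-gain stage. It is a minimal Python illustration only; the target level, the attack and release constants and the simple peak detector are assumptions chosen for the example, not values used by any particular broadcast processor.

# Minimal feedback automatic-level-control sketch (illustrative assumptions).
def automatic_level_control(samples, target_peak=0.5, attack=0.2, release=0.001):
    gain = 1.0
    out = []
    for x in samples:
        y = x * gain                      # variable-gain amplifier
        error = abs(y) - target_peak      # detector compares output with target
        if error > 0:
            gain -= attack * error        # output too high: reduce gain quickly
        else:
            gain -= release * error       # output low: restore gain slowly
        gain = max(0.0, min(gain, 10.0))  # keep the control value in a sane range
        out.append(y)
    return out

# A loud burst followed by quiet material: the loud section settles near the
# target level, and the gain recovers slowly during the quiet tail.
loud_then_quiet = [0.9] * 100 + [0.05] * 100
processed = automatic_level_control(loud_then_quiet)
print(round(processed[50], 2), round(processed[-1], 3))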

AM signal: although the envelope of the waveform varies with the audio, the basic carrier frequency remains constant; this is the main characteristic of A.M.

3.5 Frequency Modulation (FM)

In frequency modulation the audio signal (the program signal, occupying the low-frequency range up to roughly 20 kHz) varies the frequency of the carrier wave. The carrier frequency swings above and below the unmodulated (rest) carrier frequency in step with the audio signal, while the amplitude remains constant; this is the opposite of A.M.

FM signal: the recommendations published by the CCIR describe two systems for stereo FM broadcasting, the "Pilot Tone System", used in the U.S. and many other countries, and the "Polar Modulation System", used by the Soviet Union. Both systems allow a peak deviation of the main R.F. carrier of ±75 kHz.
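The sketch below contrasts the two modulation methods described in Sections 3.4 and 3.5. The carrier frequency, audio tone, sample rate and modulation depth are arbitrary example values chosen for illustration; only the ±75 kHz peak deviation comes from the text above.

# AM vs. FM in a few lines (example parameters, not broadcast values).
import numpy as np

fs = 1_000_000                               # sample rate, Hz (example)
t = np.arange(0, 0.005, 1 / fs)              # 5 ms of signal
audio = np.sin(2 * np.pi * 1_000 * t)        # 1 kHz programme tone
fc = 200_000                                 # example carrier frequency, Hz

# Amplitude modulation: the envelope follows the audio, the frequency does not.
am = (1 + 0.5 * audio) * np.cos(2 * np.pi * fc * t)

# Frequency modulation: instantaneous frequency = fc + deviation * audio.
deviation = 75_000                           # ±75 kHz peak deviation
fm = np.cos(2 * np.pi * fc * t + 2 * np.pi * deviation * np.cumsum(audio) / fs)

print(am.shape, fm.shape)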

In the pilot tone system the stereo multiplex signal consists of three components. The first is a pre-emphasised Mid (M) signal, the sum of the left and right audio signals (L + R), as used in monophonic transmission. Two additional components are needed for stereo transmission: a 19 kHz pilot tone, and the 'S' signal, the sidebands of a 38 kHz stereo subcarrier produced by amplitude modulation of that subcarrier with the pre-emphasised difference of the left and right audio signals (L - R). The stereo subcarrier is locked in phase at exactly twice the frequency of the pilot tone and is suppressed in amplitude by at least 40 dB. Peak-level audio signals cause 100% deviation of the R.F. carrier; for stereo transmission the M signal alone will not cause maximum deviation, since the pilot tone and the S sidebands also have to fit within the permitted deviation. (In some receivers an expander processes the sum of the S-related signals to achieve an improved signal-to-noise ratio of up to 14 dB at low program levels.)

4. Broadcast Specifications

4.1 Deviation

The amount by which the frequency of the carrier has changed is called deviation. Maximum frequency deviation is the maximum value permitted for deviation of the carrier from its rest frequency.

4.2 Modulation

The modulating signal is used in the transmission of radio-frequency electromagnetic waves.

4.3 Transmission Stage

Radio is the term used for the transmission of information between two points without using transmission lines. Radio waves are radiated from the transmitting aerials and induce a corresponding radio signal in any receiver aerial in their path. The radio-frequency signal picked up by the aerial is fed to the receiver, where the wanted signal is separated from the unwanted and demodulated. One of the main advantages FM has over AM is its stereo broadcasting capability. FM is also much less prone to interference: in AM, any stray pulse that enters the bandwidth of the A.M. envelope is demodulated by the radio receiver as a click, whereas in FM most interference is eliminated because the receiver only needs to detect the instantaneous carrier frequency.

4.4 Electromagnetic Wave

The electromagnetic wave is the RF (radio frequency) carrier used to transfer the audio signal from the transmitting site in the signal flow of a broadcasting chain.

5. Specifications of Broadcasting Bands and Frequency

Band                                     Frequency range      Typical uses
Low Frequency (LF) / long wave           30 kHz – 300 kHz     Long-distance communications
Medium Frequency (MF) / medium wave      300 kHz – 3 MHz      Local sound broadcast
High Frequency (HF) / short wave         3 MHz – 30 MHz       Long-distance sound broadcast
Very High Frequency (VHF)                30 MHz – 300 MHz     FM broadcast, spacecraft communication, TV broadcasts
Ultra High Frequency (UHF)               300 MHz – 3 GHz      Radar, weather satellites, wireless microphones, high-band TV
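As a quick illustration of the table above, the small lookup below maps a frequency to the band it falls in. The band boundaries are exactly those listed in the table; treating the upper limits as exclusive is an assumption made for the example.

# Band lookup for the table above (frequencies in hertz).
BANDS = [
    ("Low Frequency (LF) / long wave",      30e3,  300e3),
    ("Medium Frequency (MF) / medium wave", 300e3, 3e6),
    ("High Frequency (HF) / short wave",    3e6,   30e6),
    ("Very High Frequency (VHF)",           30e6,  300e6),
    ("Ultra High Frequency (UHF)",          300e6, 3e9),
]

def band_of(frequency_hz):
    for name, low, high in BANDS:
        if low <= frequency_hz < high:
            return name
    return "outside the bands listed above"

print(band_of(98e6))    # a typical FM broadcast frequency: VHF
print(band_of(600e6))   # a typical UHF television channel: UHF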

6. Stereophonic Broadcasting

6.1 Very High Frequency (VHF) Waves, 30 MHz – 300 MHz

Most stereo radio stations broadcast in the VHF band.

6.2 Ultra High Frequency (UHF) Waves, 300 MHz – 3 GHz

Ultra High Frequency (UHF) waves are mainly used for radar communication, weather satellites, high-band television, etc. Since these waves are not reflected by the ionosphere, they cannot travel very far unless a high-power transmitter is used, and transmission is by line of sight: the radio-wave energy travels along the ground or in straight lines from transmitter to repeater to receiving antennas. TV sound quality took a major step forward with the introduction of ultra-high-frequency transmissions. Stereo sound was incorporated into these transmissions, and digital audio found its way into TV transmission through NICAM 728 sound (Near Instantaneous Companded Audio Multiplex); 728 refers to the 728 bits per frame.

6.3 Side Bands

The 'S' signal consists of the sidebands of the 38 kHz stereo subcarrier.

6.4 Base Band

Base band refers to any unmodulated analogue or digital audio signal maintained at its original frequency spectrum. The base band signal from the studio contains the audio information. Base band signals contain high amounts of power at low frequencies and cannot be efficiently transmitted over radio links with normal-sized antennas; by modulating the base band signal up to a higher range of frequencies, efficient transmission of the data becomes possible. A digital base band signal may be of any format, but the most commonly used is PCM (Pulse Code Modulation). Modulating the digital base band signal onto a carrier and transmitting it in the S.H.F. (Super High Frequency) band transfers the audio information from the broadcasting centre.

6.5 Stereo FM Broadcasting

Although FM radio broadcasting is now known and used mainly for stereo broadcasting, Band II (VHF 88 – 108 MHz) was employed initially for mono broadcasts. This meant that a stereo broadcast system had to be adopted that was compatible with the existing mono system. This is achieved by sending Mid (M) and Side (S) signals instead of transmitting the L and R signals directly. The M signal is mono compatible, since it is the sum of the L and R stereophonic signals:

M = (L + R) / 2

The corresponding difference (Side) signal is:

S = (L - R) / 2

The L and R signals are first converted to M and S signals. The M signal is not changed, but the side signal is used to amplitude-modulate a 38 kHz carrier, producing the 'S' sidebands (the 38 kHz stereo subcarrier). During the modulation stage the 38 kHz carrier itself is removed (suppressed) to save power. Some information about this suppressed carrier, namely its frequency and phase, must still be given to the receiver so that the carrier can be recognised and re-created. This is achieved by sending a low-level signal, known as the pilot tone, at 19 kHz, exactly half the frequency of the 38 kHz suppressed carrier. The modulated side signal is then added to the mid signal and the pilot tone to give the multiplex signal, which is fed to the FM transmitter.

A mono receiver will only detect the M signal and convert it into audio. A stereo receiver, however, has a decoder which operates on the entire spectrum: it extracts the 19 kHz pilot tone and doubles it to 38 kHz, from which it derives the S signal. From the M and S signal components it can then produce left and right signals for the loudspeakers.

For monophonic FM broadcasting no subcarrier is used; only the M signal component is transmitted, and the M signal alone may cause the maximum deviation. For stereo broadcasting the multiplex signal is the sum of all the signal components (M, pilot tone and S sidebands), which must be accommodated within the deviation limits; the audio components therefore produce a deviation of the main carrier of not more than ±67.5 kHz.
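As a summary of the multiplex construction just described, here is a small sketch that builds M, S, the 19 kHz pilot tone and the suppressed-carrier 38 kHz sidebands and adds them into a composite signal. The example tones, the pilot level and the sample rate are assumptions for illustration; pre-emphasis and the receiver's filtering are omitted.

# FM stereo multiplex sketch: M, S, pilot tone and 38 kHz DSB sidebands.
import numpy as np

fs = 192_000                                   # high enough for the 53 kHz composite
t = np.arange(0, 0.01, 1 / fs)
left = np.sin(2 * np.pi * 440 * t)             # example left-channel tone
right = np.sin(2 * np.pi * 1000 * t)           # example right-channel tone

m = (left + right) / 2                         # mono-compatible sum signal
s = (left - right) / 2                         # difference signal
pilot = 0.1 * np.sin(2 * np.pi * 19_000 * t)   # 19 kHz pilot tone (level assumed)
subcarrier = np.sin(2 * np.pi * 38_000 * t)    # twice the pilot, locked in phase
s_dsb = s * subcarrier                         # AM with the 38 kHz carrier suppressed

multiplex = m + pilot + s_dsb                  # composite fed to the FM modulator

# Matrixing back at the receiver: L = M + S, R = M - S (filtering ignored here).
print(np.allclose(m + s, left), np.allclose(m - s, right))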

7. Radio Station

7.1 Personnel

Personnel in a radio station comprise the DJs or presenters, the program manager, the broadcast engineers and others. The DJ/presenter spins the music and hosts the program, either in-house in the studio (live or on line) or live from a location. The program manager manages all the air-time programming and the sales of airtime (commercials); that is where the finances of the station come from. The broadcast engineer's job is to ensure that all the studio facilities are in working condition, including all backup equipment; the job also includes monitoring all the broadcast signals.

8. Transmission of Television Signal

8.1 Horizontal and Vertical Polarization

Television signals can be transmitted to homes through cable, through the airwaves from earthbound transmitters, or through space from satellites. The signals transmitted through the atmosphere or from space are carried at frequencies far above the audio-frequency spectrum: the audio signal is modulated onto a radio carrier wave, and for television transmission the video signals are likewise modulated onto a carrier. The sound is transmitted on a slightly different frequency from the vision, allowing sound and vision to be processed separately.

Radio waves consist of both an electric and a magnetic field, at right angles to each other and to the direction of propagation; it is their frequency that distinguishes one radio wave from another. The speed of any radio wave is always the same whatever its frequency: it travels at the speed of light, about 186,000 miles per second (roughly 300,000 kilometres per second). Radio waves are horizontally polarised if the electric field varies in strength from side to side while the magnetic field varies in strength from top to bottom; vertical polarisation is the reverse. Horizontal polarisation is shown by the letter H and vertical polarisation by the letter V. This polarisation can be seen in television aerials, which can have either horizontal or vertical rods, clearly indicating the type of polarisation used.

8.2 Circular Polarization

Radio waves transmitted from Direct Broadcast Satellites (DBS) are not polarised vertically or horizontally but circularly. These can be either Left-Hand Circular (LHC) or Right-Hand Circular (RHC).

8.3 Stereo Television Sound

Stereo sound for television can be transmitted in various ways, in a similar manner to the way video signals are recorded, but the main concern for any system must be to ensure that the stereo sound does not affect the quality of the mono transmission.
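A quick worked example of the propagation speed quoted in Section 8.1: since all radio waves travel at the same speed, the wavelength is simply that speed divided by the frequency. The example frequencies below are typical values for the services discussed in this chapter, not figures taken from the text.

# Wavelength = speed of light / frequency.
c = 299_792_458   # metres per second

for label, f_hz in [("MF broadcast (1 MHz)", 1e6),
                    ("FM broadcast (100 MHz)", 100e6),
                    ("UHF television (600 MHz)", 600e6),
                    ("DBS downlink (12 GHz)", 12e9)]:
    wavelength_m = c / f_hz
    print(f"{label}: {wavelength_m:.3g} m")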

For compatibility with an existing system, the main channel carries the sum signal and a subcarrier usually carries the additional difference signal which, when matrixed with the M signal, produces the stereo sound, just like stereo FM.

NICAM 728: when stereo television sound was incorporated into TV transmission, digital audio found its way in through NICAM 728 sound (Near Instantaneous Companded Audio Multiplex); 728 refers to the 728 bits per frame. The NICAM 728 stereo signal is also mono compatible.

9. SATELLITE

Waves of low frequency are readily absorbed by the earth and by objects in their path and, since losses increase with frequency, ground-wave transmission is unsatisfactory over any long distance. Higher frequencies are therefore used in broadcasting, which take advantage of the ionosphere (layers of ionised air between 100 and 1,000 kilometres above the earth). These layers can be used to bounce radio waves back to earth, just like a "mirror", so a signal can be transmitted over very long distances. Any transmitted wave, from whatever source, has to put up with absorption from the ground and the atmosphere as well as with reflection. The reflections back to earth begin to fail at radio frequencies of about 100 MHz and completely disappear at about 2 GHz; at that point the waves travel straight through the ionosphere. The ionosphere cannot reflect back television signals, so television signals are radiated over short distances through line of sight, via the many transmitting stations and repeaters that make up for the ground losses.

Satellites transmit at higher frequencies than television systems, the lowest of which is in the ultra-high-frequency range. Their signals project straight through the ionosphere with little interference and with minimal transmission losses in the space above it. Satellites receive their signals from ground stations via microwave links called the uplink. Once the signal arrives at the satellite it is converted to the satellite transmitting frequency and sent back down to earth on the downlink. The radio wave retransmitted from the satellite (or geostationary station), covering a large area on the earth, is called its footprint. As a result, one satellite can cover an area that would normally require several hundred transmitters.

9.1 Direct Broadcast Satellite

In the European DBS system the transmitting frequencies are in the range of 11.7 to 12.5 GHz, the carrier having a bandwidth of 27 MHz. This is much higher than any television system.

For example, one satellite can perform the same function as the more than 600 earthbound television stations, transmitters and repeaters needed in order to cover Great Britain.

(Figure: an example of signal processing in a transmission and reception signal path.)

9.2 Cable Network

In countries where satellite dishes are inaccessible to consumers, television stations receive the DBS signals first, before they are broadcast to homes via a cable network.

AE30 – Music Business

Business Organizations
1. Forms of Business
2. Basic Business Structures
   2.1 Sole Traders (or Proprietors)
   2.2 Partnerships

Copyrights
1. What copyright protection covers-what it doesn't
   1.1 Expression vs. Ideas
   1.2 First sale
   1.3 Copyright doesn't protect titles
2. Duration of copyright protection
   2.1 "Published" vs. "unpublished" works
3. International copyright protection
   3.1 Copyright notice
4. Common law copyright
   4.1 Copyright registration
   4.2 Poor man's copyright
   4.3 Copyright Protection Period & Public Domain
   4.4 Exception for "works for hire"
5. Infringement
   5.1 Fair use
   5.2 Compulsory licensing
6. Types of Rights Owned by Songwriters
   6.1 Mechanical Right
   6.2 Performing rights
   6.3 Synchronization Rights
   6.4 Other licenses needed for audiovisual works
   6.5 Print Rights
   6.6 Transcription Rights

Songwriter and Publisher
1. Music publishing agreements
   1.1 Standard Agreements
   1.2 Copublishing Agreements
   1.3 Administration Agreements
   1.4 Points of Negotiation
   1.5 Self-publishing
2. The performing rights organizations
   2.1 Membership in COMPASS
   2.2 Commercially recorded: licenses of COMPASS
   2.3 Reproduction Rights

The Recording Industry
1. Industry structure
   1.1 Major Record
   1.2 Mini-Majors
   1.3 Independents
   1.4 What's a record?
   1.5 Masters
2. Royalty computation
   2.1 Basic Concept
   2.2 Here's how it works
   2.3 Free Goods
   2.4 Promotion Copies
   2.5 Return Privilege
   2.6 Reserves
   2.7 "90% of Net Sales"
   2.8 Advances and Recoupment
   2.9 Other Goodies
   2.10 Risk of Loss
   2.11 Cross-collateralization
   2.12 Cross-Collateralization of Deals

AE30 – MUSIC BUSINESS

Introduction

This module discusses two main issues: Music Copyrights and Business Laws. It aims at introducing students to the music industry and business practices in the local context.

Business Organizations

1. Forms of Business

A person who wants to do business must first choose a business form which is suitable for his needs. The main factors deciding the type of organisation include the type of commercial, trading or industrial venture, capital requirements, borrowing needs, profitability, management control and capacity, disclosure requirements and tax advantages. It is possible to do business through a variety of forms. Other than the principal forms mentioned in the next paragraph, these include co-operative societies, trusts, various types of associations and other bodies. However, while there is no legal bar to their doing business, they are more specifically suited for the purposes for which they exist, rather than for business. Hence this chapter will only deal with the principal forms of business organisation in Singapore. The principal forms of business organisation are the sole proprietorship, the partnership, and the corporation (private and public limited companies).

2. Basic Business Structures

The size of the business may often decide the type of business organisation. This is due mainly to financing limitations and the capacity for risk bearing. By and large, businesses conducted by sole traders, partnerships and small private companies are restricted to small and medium trading ventures or small-scale manufacture. Generally these organisations tend to be family concerns. Large-scale business operations are undertaken by public companies, as they are able to command greater financial resources. On account of their large financial resources, they can undertake massive projects and benefit from economies of scale. They are able to attract investments from the public and have greater access to borrowed funds for the financing of the enterprise.

2.1 Sole Traders (or Proprietors)

A single or sole trader is a natural person engaged in any kind of business activity with a view to profit. He is so called because he alone bears the responsibility of managing the business and he alone takes the profit. He alone bears the risk of the enterprise. In the eyes of the law, he is simply an individual carrying on business on his own behalf: he enters into contracts in his own name, and is personally liable to the creditors for his debts. Sole proprietorship is the usual form for the one-man operation because it is relatively easy to establish.

Generally, he can undertake any kind of business. This could include trading activities like buying and selling, agency, maintenance and service, property development, construction or other activities. Thus, he may buy goods in bulk and sell them by retail in smaller quantities to customers, or he may buy goods and convert them into other goods by employing workers and then sell them to private customers or other businesses. His business can take the form of retail stores, small factories, market gardens, farms, and other such activities.

He must, however, comply with the requirements of the law with regard to any given activity. He is required to register his business under the Business Registration Act by making an application to the Registrar of Businesses prior to the commencement of his business (see next chapter, section 3.2). It may be noted, however, that the application of this Act is not limited to the sole proprietor alone, but applies to all businesses, including those carried on by partnerships and companies. For some kinds of businesses, he may be required to obtain licences from the appropriate authorities. He will also have to comply with the legal regulations which may be applicable, including taxation, labour laws, environmental laws, health and safety laws, third-party liability and industrial laws. There is, however, no law that is directed specifically at the sole proprietor; he is thus subject to the normal business regulation in the same way as all other business organisations.

Generally, he cannot undertake any professional activity which may be undertaken only by persons qualified to do so, e.g. to act as an auditor, solicitor or doctor. Likewise he cannot dispense drugs unless he is a qualified registered pharmacist; a sole trader who carries on a pharmacy business must employ a registered pharmacist if he is not so qualified.

Following are some of the disadvantages of a sole proprietor:

• He bears the entire risk of his business, as he is not answerable to anyone but himself.
• Unlike a company or a partnership, he does not have the capacity to borrow large sums of money for his business. His borrowing facilities are limited by the property he can provide as security to the bank or financial institution for loans or overdrafts, and the expansion of his business depends on the amount of capital that he can raise and his management skill.
• His liability to his creditors is not limited to the amount he has invested in his business, but extends to all his private property.
• His tax liability will increase with greater profits, unlike in the case of a company, where profits are taxed at a flat rate.
• There is no continuity of existence, as the business will also die upon his death. His executors will either have to sell the business as a going concern or sell the assets one by one, unless it is necessary to sell them to discharge his debts. If the assets of the business have been left to a person by will, the executors will have to transfer those assets to that person as part of the winding up of the estate.

As to the cessation of his business, any decision to cease doing business will be his own. However, he will continue to be liable to unpaid suppliers and creditors notwithstanding cessation of the business.

2.2 Partnerships

Creation of a partnership

The law relating to partnerships is contained in the Partnership Act, Cap 391. This is a re-enactment of the [UK] Partnership Act 1890, under the provisions of the Application of English Law Act, Cap 7A (see previous chapter, paragraph 1.5), effective from 12 November 1993. Unless otherwise stated, all references in sections 2.2 to 2.3 of this chapter are to the new Act.

Section 1 of the Act defines a partnership as "the relation which subsists between two or more persons carrying on a business in common with a view to profit". The definition indicates three basic requirements of a partnership:

• it is a relationship which arises out of agreement;
• to carry on a business in common; and
• with a view to profit.

All three conditions must be present; if any one of them is absent, there is no partnership. A partnership may be created by an express agreement, which may be oral or in writing, or implied by conduct. However, it may not always be easy to prove the existence of all three conditions, and sometimes a question may arise as to whether or not a person is a partner of a business. In such cases, the Act provides further guidance in s 2(3): it states that the sharing by a person of the profits of a business is prima facie evidence that he is a partner of the business. However, there is no partnership merely because there is a joint ownership of property (Davis v Davis (1894)), or because

there is a sharing of gross returns (Cox v Coulson (1916)), or on account of the receipt by a person of a share of the profits, or of a payment contingent on or varying with the profits of a business, in the situations (a) to (e) mentioned in s 2(3). These generally involve situations where the share of profit is being received in some capacity other than as a partner, or paid for some reason other than as a share of profit, for example where the share of profit is being received as remuneration for services or as repayment of a loan (see generally Cox v Hickman (1860), Pooley v Driver (1876), Badeley v Consolidated Bank (1888)).

As for profit, so long as there is an intention to make profit, it does not cease to be a partnership even if there is a loss in the end. "Carrying on" of business denotes a degree of continuity (Smith v Anderson (1880)), not an isolated transaction. The partners must carry on the business together, but all of them need not take an equally active part in the business. Some of the partners may take an active part in the business, while others may be dormant or "sleeping" partners; their respective responsibilities would be determined by their agreement. As against a sole proprietorship, a partnership has the distinct advantages of larger capital, pooling of management know-how and sharing of risks.

2.2.1 Number and type of partners

A partnership may consist of two or more persons, but under s 17(3) of the Companies Act, Cap 50, the number should not exceed 20. Where the number exceeds 20, such a partnership would have to be registered as a company. However, this restriction does not apply to a partnership formed for the purpose of carrying on any profession or calling, e.g. accountants, lawyers and doctors (CA, s 17(4)). The partners are called collectively a "firm" and can be sued in the firm name. A firm is an unincorporated association and has no separate legal entity of its own. Between the partners, their respective obligations are determined by their agreement; in the absence of agreement, every partner has a right to take part in the management of the business (s 24(5)). As between the partners and outsiders, their position is different.

2.2.2 Partners as agents

Every partner is an agent of the firm and of his other partners: each partner is the unlimited agent of every other partner in the course of and on account of the partnership business. This means that he can enter into contracts, undertake obligations and dispose of the partnership property in the ordinary course of business, and these acts will generally bind all other partners. The acts of every partner who does any act for carrying on in the usual way business of the kind carried on by the firm bind the firm and his partners (Mercantile Credit Co Ltd v Garrod (1962), Higgins v Beauchamp (1914)), unless:

• the partner so acting has no authority to act for the firm in that matter; and
• the person with whom he is dealing either knows that he has no authority or does not know or believe him to be a partner (s 5).

Thus, even if it has been agreed between the partners that a restriction shall be placed on the power of any of them to bind the firm, any act done in contravention of the agreement is nevertheless binding on the firm so long as the other contracting party did not know of the restriction (s 8).

Liability in contract and in tort

The Act provides for the liability of partners in contract and in tort. In this regard, it should be noted that although a partnership is not a legal entity, a partnership can be sued in the firm name. Unlike in the case of a member of a company limited by shares, the liability of the partners for the debts of the firm is unlimited. If the partnership business fails, the creditors can have recourse to the private assets of the partners, subject to each partner's private creditors having the first claim on his property.

Every partner in a firm is liable jointly with the other partners for all the debts and obligations incurred while he is a partner. The liability in contract is joint (s 9). "Joint" liability means the creditor has only one right of action: either to sue the partners jointly or to sue any individual partner separately. Since he has only one action, he should sue all the partners in one action; if he fails in his suit, he will be unable to have recourse to the other partners. Where a partner settles a claim, he has a right to claim contribution from the other partners. Liability of partners in tort is joint and several (s 12). This means that the wronged party may bring as many actions as there are partners, although for reasons of cost and efficacy he should sue all the partners in one action. The firm is liable to third parties for the wrongful act or omission of any partner acting in the ordinary course of the business of the firm, or with the authority of his co-partners, to the same extent as the partner so acting (s 10).

Liability by "holding out"

A person may in some circumstances be liable as a partner, even if he is not a partner, by "holding out". This happens where a person represents himself as a partner expressly or by conduct, or knowingly allows himself to be so represented by others (Tower Cabinet Co Ltd v Ingram (1949)). In such cases he is liable to anyone who dealt with the firm on the basis that he is a partner. His liability is only to those persons who have, on the faith of such representation, given credit to the firm (s 14). The phrase "given credit to the firm" should not be interpreted narrowly and should encompass most contractual situations. He would, however, not be liable for the torts of the firm, because such liability does not depend on giving credit to the firm.

2.2.3 Change of partners

An incoming partner is not liable to the creditors of the firm for anything done before he became a partner (s 17(1)). Unless such a right is acquired by novation, where the new firm takes over the old firm's liabilities and under the terms of which the incoming partner undertakes such liability, the creditors would not have the right to sue the incoming partner. A partner who retires from a firm continues to be liable for the partnership debts or obligations incurred before his retirement (s 17(2), Court v Berlin (1897)), but he may be

discharged from existing liabilities by an agreement to that effect between himself, the new firm and the creditors. Such an agreement may be either express or inferred as a fact from the course of dealings between the creditors and the newly constituted firm.

In some situations a partner who retires from the firm continues to be liable for debts of the firm incurred after his retirement. He is liable to persons to whom he is an apparent partner of the firm, i.e. he appears to them to be a partner because they do not know that he has left the firm (s 36, Tower Cabinet Co Ltd v Ingram (1949)). These would include:

• persons who had dealt with the firm before his retirement, unless he has given them actual notice that he is no longer a partner; and
• persons who have had no previous dealings with the firm, unless he has given them constructive notice of his retirement by publication in the Gazette and in the local newspapers.

Partnership property

Partnership property must be held and applied by the partners exclusively for the purposes of the partnership and in accordance with the partnership agreement (s 20(1)). The partnership property includes property originally brought into the partnership; property acquired, whether by purchase or otherwise, on account of the firm or for the purposes and in the course of the partnership business (s 20); and property bought with money belonging to the firm, unless a contrary intention appears (s 21). Partnership property is not liable to be taken in execution except on a judgment against the firm (s 23). The only remedy of a creditor of a partner in his private capacity, and not as a member of the firm, against the partnership property is to obtain an order charging that partner's interest in the partnership property and profits with the amount of the debt. A partner's interest in partnership property may thus be charged by an order of the court, in which case the other partner or partners may redeem the interest charged or, in the case of a sale being directed, purchase the same (s 23(3)).

2.2.4 Rights and duties between partners

The interests of partners in the partnership property and their rights and duties in relation to the partnership are determined by agreement between them. The Partnership Act provides that, unless otherwise agreed, the following rules apply (s 24):

• all the partners are entitled to share equally in the capital and profits of the business and must contribute equally towards the losses, whether of capital or otherwise;
• the firm must indemnify every partner for payments made or personal liabilities incurred by him in the ordinary or proper conduct of the business or for the preservation of the property of the firm;
• a partner who pays more than what he has agreed to contribute as capital is entitled to interest;

• a partner is not entitled, before the ascertainment of profits, to interest on the capital subscribed by him;
• every partner may take part in the management of the partnership business;
• no partner shall be entitled to remuneration for acting in the partnership business;
• no person may be introduced as a partner without the consent of all existing partners;
• any difference arising as to ordinary matters connected with the partnership business may be decided by a majority of the partners, but no change may be made in the nature of the partnership business without the consent of all existing partners.

Copyrights

Copyright literally means the right to copy. Copyright owners have the exclusive right to perform or make copies of their songs (or to authorize others to do so). No one may use or make copies of a copyrighted song in any manner without the owner's permission. This permission takes the form of a license, which lays out precise conditions and limitations under which the song may be used, including what fees or royalties must be paid to the copyright owner.

1. What copyright protection covers-what it doesn't

Any original work is automatically copyrighted once it has been "fixed in any tangible medium of expression". A "tangible medium" is any format "now known or later developed" from which the song "can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device." Translation: a song is "fixed" sufficiently for copyright protection the moment the writer jots down the lyrics on paper with musical notations or records a rough demo on a cassette recorder.

1.1 Expression vs. Ideas

Copyright protects the way a song is expressed, not the idea behind the work. For instance, if the song is about lost love, the lyrics and music are copyrightable but not the subject matter. The theme of lost love has been, and will be, applied to thousands of songs.

1.2 First sale

Copyright doesn't protect the physical property which contains the song. When you purchase a CD, you own the disc, but you do not own the songs on the disc, nor the artist's performance of the songs. You may not make copies of the CD or even play it publicly for profit without authorization from the copyright owners. The purchaser is the owner of the physical product, though not of the copyright. The copyright owner is paid on the first sale of the product embodying the copyright (e.g. a CD); after that, the purchaser can resell the product without additional payment to the copyright owner. This means you can sell your CD collection without having to compensate the music publishers or recording artists.

The concept of first sale is the legal basis that video stores use to buy videos that they in turn rent out and ultimately resell as used videos. Many record stores also buy and sell used CDs on the same legal grounds.

1.3 Copyright doesn't protect titles

A lot of thought often goes into choosing a song title. But though a title is very important and sometimes unique, you cannot copyright a song title. (Titles of motion pictures can be protected by registration.) Dozens of copyrighted songs share the same title. However, certain songs have become so famous that the Library of Congress (in the USA) may refuse to register a new song with a title that would be easily confused with an established song. Moreover, the Doctrine of Unfair Competition (applicable to US laws) might apply if you were to try to register a new song called "Rudolph the Red-Nosed Reindeer."

2. Duration of copyright protection

Copyright protection is in effect from the time a song is fixed in physical form (recorded or written down) until 50 years after the songwriter's death. If there are two or more writers, copyright remains in effect until 50 years after the death of the last surviving collaborator. Thus, a song written in the year 2000 by a 20-year-old songwriter who dies in 2060 at age 80 would remain under copyright protection until the year 2110. The term of protection was recently increased to the life of the songwriter plus 70 years in Great Britain, to conform with most other European nations, and the US Congress is currently considering extending the term of copyright to the life of the songwriter plus 70 years. A work created by an employee for hire is the property of the employer, and a work for hire's copyright is in effect for 75 years from the date of publication or 100 years from the date of creation, whichever comes first.

2.1 "Published" vs. "unpublished" works

For a work to be considered published, it must be fixed (recorded or printed) and distributed in tangible form (discs, tapes, sheet music, etc.) to the public by sale, rental, lease, or lending. Live performances are not considered publication. Unpublished works may be fixed (written down or recorded as a demo) but not publicly distributed in manufactured form.

3. International copyright protection

There is no "international copyright" that automatically protects a work throughout the world. Copyright protection in each country depends upon that country's own operative copyright laws. But most countries have laws to protect foreign works under certain conditions, and international treaties have greatly simplified copyright protection so that, for all practical purposes, protection is extended to every significant country.
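To make the protection periods in section 2 concrete, here is a small sketch that turns them into arithmetic. The function names and the simple year calculations are illustrative assumptions, and none of this is legal advice; the 50-year post-death term and the 75/100-year work-for-hire terms are the figures quoted in the text.

# Copyright-term arithmetic, based on the periods described above.
def expiry_year_individual(death_years, years_after_death=50):
    """Protection runs until N years after the death of the last surviving writer."""
    return max(death_years) + years_after_death

def expiry_year_work_for_hire(publication_year, creation_year):
    """75 years from publication or 100 years from creation, whichever comes first."""
    return min(publication_year + 75, creation_year + 100)

# The worked example from the text: a writer who dies in 2060 keeps
# protection until 2110 under a life-plus-50 rule.
print(expiry_year_individual([2060]))          # 2110
print(expiry_year_work_for_hire(2000, 1998))   # 2075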

The two major copyright treaties are the Berne Convention and the Universal Copyright Convention (UCC). Copyright claims by citizens of any country belonging to either treaty are recognized in every other signatory country.

3.1 Copyright notice

The 1976 Copyright Act of the USA required that a notice of copyright be placed on all published copies of a song. This extended to printed copies (sheet music, lyric reprints, film/TV/video credits); however, copyright notices were not required on the label copy or album covers of discs and tapes for the songs contained therein. Notice included the copyright symbol © (the letter C enclosed in a circle), or the word "copyright", followed by the year of publication and the name of the copyright claimant. Example: © 1999 by Smith Music, or Copyright 1999 by Smith Music. Sound recordings are copyrighted by the record manufacturer using a different copyright symbol (a letter P enclosed in a circle), which represents the manufacturer's claim to copyright in the recorded performance of the song, not the song itself.

The copyright notice had to be in an obvious place on the published copy of the work, generally the title page or first page of the music. If the notice was not included, copyright was lost immediately (with the possible exception of works seeking "ad interim" protection), and the owner forever lost claim to copyright. Once copyright is lost, it cannot be restored, except by special legislation. If a work was first published without the required notice before 1978, it would fall into the public domain.

However, since signing the Berne Convention, copyright notice is not required for works created on or after March 1, 1989; failure to place a copyright notice on works created since March 1, 1989 no longer results in loss of copyright. The Berne Convention Act is not retroactive: all works first published and distributed in the United States before March 1, 1989 must still be published with the appropriate copyright notice. In other words, a song published in sheet music form, or lyrics published in a magazine or book before that date, must still carry the appropriate copyright notice. Nevertheless, placing a notice of copyright on published works is still strongly recommended. First, you allow potential users to identify you as the owner, so that you can be contacted for future licensing. Secondly, an infringer cannot claim that he or she "innocently infringed" a work that carries a notice. (In such cases there might otherwise be a reduction in the damages for infringement that the copyright owner would receive.)

4. Common law copyright

The 1976 Copyright Act provides owners protection for songs "fixed" in tangible form, but there is also protection under common law for songs that have not yet been recorded or written down. Common law copyright protection begins the moment a work is created and continues until the work is fixed in tangible form; once fixed, a song's protection is subject to the Copyright Act of 1976. As long as a song is not fixed, common law copyright protection can continue virtually forever. No formal copyright registration is required under common law. No one may use a song protected by common law copyright for commercial purposes without the owner's authorization and/or without payment. The work can be performed live, provided the performance is not recorded so that someone may later have access to the recording.

4.1 Copyright registration

In the U.S., the 1976 Copyright Act lays down the formalities of registering a claim to copyright with the Library of Congress. It isn't necessary to register a claim to copyright with the Library of Congress to validate ownership, but it is highly prudent to register your claim as quickly as possible. Here's why: if someone uses your work without authorization, you may not be able to take legal action until you first register your claim with the Library of Congress. And even then, you may not be able to receive statutory damages and royalties arising from unauthorized usage that occurred before you belatedly filed your registration. Thus, you can lose a substantial sum of money because you neglected to register your claim in a timely manner. If you have registered your claim and can prove someone has willfully infringed your copyright, you may be entitled to statutory damages ranging from $500 to $20,000 (even up to $100,000 in some blatant cases), plus any profits the infringer made from your work. What registration does is make a public record of your claim to copyright from the date you registered the work. Registration itself does not preclude someone contesting your claim. You can register the work at any time during the lifetime of the copyright (from the date the song was fixed until 50 years after the death of the last surviving writer).

Though the mandatory requirement for copyright notice has been abolished, copyright owners must still deposit two complete copies or phonorecords in the Copyright Office. The Copyright Act requires a published work to be deposited with the Copyright Office at the Library of Congress within three months of publication. Deposit is required of any work publicly distributed in the United States, whether or not the work contains a copyright notice, although you won't lose your copyright protection if you fail to do this. Copies deposited must represent the best edition.

4.2 Poor man's copyright

Some songwriters document their claim to copyright by putting a copy of the song in a sealed envelope and sending it to themselves by certified mail. They then leave the

envelope sealed, believing the postmark will validate their copyright claim when the envelope is opened in a court of law. The idea is to be able to prove that a copyright claim was made on a certain date, and that the work in question, therefore, could not have been cobbled together later in order to make a spurious claim against someone else's work. This method of copyright is sometimes accepted by courts attempting to settle rival claims, but it is an iffy way to establish copyright ownership, because you could lose the legal status to claim damages and royalties earned from unauthorized usages prior to the date of formal registration.

4.3 Copyright Protection Period & Public Domain

Copyright protection is not forever. Under the 1909 Copyright Act, U.S. copyright owners could protect songs under copyright for an initial 28-year term, with the opportunity to renew protection for a further 28 years. The 1976 Copyright Act brings U.S. copyright law into line with most other countries (i.e. copyright protection can be claimed from the time of publication until 50 years after the death of the writer). Any work that outlives copyright protection is said to be in the public domain, or PD. Anyone can use a PD song in any manner, without permission or payment. Thus, you can record a snatch of melody from a Mozart aria, set new lyrics to it, and release it without having to pay mechanical royalties. You cannot take a work from the public domain, such as "Yankee Doodle," and copyright it as it stands. But you can devise an arrangement of a public domain song and copyright the arrangement: you can rearrange the Mozart aria musically and then copyright the new arrangement. Then, if anyone wants to record your arrangement of Mozart's work, they have to obtain a license from you and pay you royalties.

4.4 Exception for "works for hire"

Termination rights do not apply to songs created as works for hire. Copyright claims are registered listing the employer (publisher) as "author" and the songwriter/composer as "employee for hire." Since the songwriter waives copyright ownership before the work is created, the songwriter cannot re-claim a work for hire, because the writer never owned the copyright in the first place: on works for hire, the copyright inherent upon creation belongs to the employer. This applies to cases where songwriters are signed to exclusive agreements as employees for hire, as well as when composers are commissioned to write specific material, such as movie themes, for employers. The owner of a work for hire enjoys copyright protection for a period of either 75 years after the date of publication or 100 years after the date of creation, whichever comes first.

5. Infringement

5.1 Fair use

One defense against a charge of infringement is the principle of fair use. This concept is incorporated into copyright law and permits the reproduction of small amounts of copyrighted material for the purposes of criticism, critical review, comment, parody, news reporting, teaching, scholarship, quotation, illustration, or summary. The amount of copying must have no practical effect on the market for, or on the value of, the original work.

5.2 Compulsory licensing

In addition to the principle of "fair use," the Copyright Act makes certain exceptions to the monopoly of rights a copyright owner has with regard to licensing works. In particular, four special provisions are included which permit compulsory licensing. The term compulsory license means that the copyright owner must issue a license to a qualified user under certain conditions and upon payment by the user of statutory fees or royalties.

COMPULSORY LICENSING APPLIES TO:

Noncommercial transmissions by public broadcasters. This applies notably to the Public Broadcasting System (PBS) and National Public Radio (NPR). Because they are partially taxpayer-funded, nonprofit operations, legislation was passed allowing public television and radio to use copyrighted material upon payment of preset fees, as opposed to the more market-oriented rates negotiated between the performing-rights societies and commercial broadcasters.

6. Types of Rights Owned by Songwriters

Songwriter's rights: Mechanical, Synchronization, Performing, Print, Transcription.

6.1 Mechanical Right

The earliest technology for reproducing musical performances included player pianos, music boxes, and hand-cranked machines that played acoustically recorded cylinders and discs. This technology was mechanical, rather than electronic, and gave rise to the term mechanical reproduction. Logically, then, permission to reproduce song copyrights by mechanical means was called a mechanical license.

Though the technology to reproduce sound recordings has long since evolved from mechanical to electrical means, and from analog to digital, the permission to reproduce and sell sound recordings of songs is still called a mechanical license. Today, a reference to mechanical income, or mechanical royalties, usually refers to monies earned from the manufacture and sale of songs in the form of vinyl records, compact discs, audio tapes, and videocassettes. Mechanical income and performance income are the two main sources of revenue for most publishers.

6.2 Performing rights

Apart from live performance, performing rights include the right to broadcast music publicly, such as on radio and TV, on jukeboxes, and as piped-in music at hotels, pubs, clinics, etc. The collection of performing-rights royalties is usually handled through licensing. In Singapore the agent for performing rights is COMPASS (Composers and Authors Society of Singapore), a non-profit organisation that collects performing-rights royalties and distributes them to its members. COMPASS issues licences to users (radio and TV stations, hotel owners, etc.) and collects a "blanket" licence fee. A blanket fee is a fixed amount of royalty to be paid yearly by users; the amount is based on a statistical estimation of the number of songs to be played during the licensing period. In order to help distribute the royalties fairly among the member songwriters, certain statistical formulae are used to estimate the proportion of each song being used, and radio airplay log books are also used as a guide.

6.3 Synchronization Rights

In music publishing, the term synchronization refers to integrating music with visual images to produce an audiovisual work. (Audiovisual, of course, is the term used to describe sound and sight combined.) A synchronization license, therefore, is the authorization given by publishers to audiovisual producers to include music in their product. New technologies have fuelled a rapid expansion of visual media in recent years, so that licensing music for audiovisual works is an increasingly important source of music publishing revenue. Synchronization licenses are issued for movies, television productions, documentaries, travelogues, training films, how-to videos, electronic games, promotional clips, advertising, etc.

Different types of synchronization licenses

There are at least six different types of synchronization licenses:

1. Theatrical
2. Television
3. Videogram
4. Commercial advertising
5. Promotional music video
6. Non-theatrical/non-commercial
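As an illustration of the statistical apportionment described in section 6.2, the sketch below splits a blanket-licence pool in proportion to logged airplay. The numbers and the simple proportional rule are assumptions made for the example; they are not COMPASS's actual distribution formula.

# Hypothetical blanket-fee distribution: share the pool by logged plays.
def distribute_blanket_pool(pool, logged_plays):
    total = sum(logged_plays.values())
    return {song: pool * plays / total for song, plays in logged_plays.items()}

sample_logs = {"Song A": 120, "Song B": 45, "Song C": 15}   # hypothetical airplay counts
payouts = distribute_blanket_pool(18_000.0, sample_logs)    # hypothetical licence pool
for song, amount in payouts.items():
    print(song, round(amount, 2))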

Each of these licenses differs from the others in how the copyright owner is compensated, in the scope of permission granted to the user, and in the other terms involved, including the term, territory, method of distribution, and options available.

6.4 Other licenses needed for audiovisual works

Before an audiovisual work can be offered for exhibition, or reproduced for the home video market, the audiovisual producer must get a synchronization license from the music copyright owner. So, first things first: music must be synchronized to the visual action on the film.

In addition to synchronization rights, licensing music used for audiovisual works usually involves two other types of rights: performance and mechanical. Performance licenses permit the exhibition (or public showing) of music in audiovisual formats. Mechanical licenses authorize the manufacture, distribution, and sale of copies of audiovisual productions in various consumer-friendly formats, like videos, CD-ROMs, laser discs, etc. Some synchronization licenses incorporate performing- and mechanical-right provisions allowing users to perform and make copies. But when synchronization licenses don't include these provisions, the user must obtain separate performance and mechanical licenses if public showings are intended or if copies are going to be made and sold.

6.5 Print Rights

Sheet music was the principal format in which songs were marketed to the public during the late 19th and early 20th centuries. The mass market consisted of amateur musicians who regularly bought new sheet music to play at home. Before the advent of records and radio, the method of popularizing a song was to get it performed by popular entertainers, which built consumer demand for the sheet music.

Though print sales no longer dominate the music publishing business, sheet music remains a potentially lucrative supplement to a publisher's income. Today, however, most publishers don't print their own sheet music at all. They license print rights to specialist companies, such as Columbia Pictures Publications, Warner/Chappell, Cherry Lane Music, Consolidated Music Sales, FJH Music, Hal Leonard Publications, and Plymouth Music.

6.5.1 Printed music formats

There are some 30-50 different formats in which a single song can be published in print. Hits usually appear in several formats simultaneously, thus adding to potential print income. For instance, a song could be printed as:
• Piano copy (single song editions, unbound)
• Marching-band arrangement
• Combo arrangement for dance band
• Educational or method books
• Arrangements for specific instruments (guitar, organ, accordion, etc.)
• Choral arrangement
• Stage band arrangement
• Brass band arrangement
• String orchestra arrangement
• Concept folio editions (e.g. Top Country Hits of 1999)
• Mixed folios (songs by different writers, popularized by different artists)
• Fake books (top lines with chord notations and lyrics)
• Personality folios (focus on a particular artist or songwriter)
• Matching folios (issued in conjunction with a specific album release, reproducing the album cover on the folio cover, and including all the songs contained on the album)

6.6 Transcription Rights

Transcription rights mostly concern lyric writers, or lyricists. These rights permit translation of the lyrics into another language. This is commonly found in songs that are covered by foreign artists.

Songwriter and Publisher

1. Music publishing agreements

The relationship between songwriters and music publishers is established by means of a publishing contract. There are three types of contracts pertaining to songwriters: standard agreements, copublishing agreements, and administration agreements.

1.1 Standard Agreements

These transactions come in two species: single-song agreements and long-term agreements. In a single-song deal, the income is generally split as follows:
• Mechanical Income. Publisher collects all mechanical income and pays 50% to composer.
• Performance Income. Composer is paid directly by the performing rights organization and retains all such "writer's share" of performance income. Publisher receives and retains all of the "publisher's share" of performance income.
• Synchronization Income. Publisher collects income and splits fifty-fifty with composer.
• Print Income. Publisher collects all revenue and pays writer eight to 12 cents per printed edition.
• Foreign Income. Net receipts (that amount received by or credited to publisher from subpublisher) are split fifty-fifty with composer.
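As a worked example of the single-song split listed above, the following Python sketch applies those percentages to invented gross figures. The 10-cent print rate sits inside the quoted eight-to-12-cent range, and real royalty statements are considerably more detailed.

# Minimal sketch of the single-song income split described above.
# All gross amounts are invented for illustration.

def writer_share_single_song(mechanical, synchronization, foreign_net,
                             printed_copies, print_rate_per_copy=0.10):
    share = 0.0
    share += mechanical * 0.50            # publisher pays 50% of mechanical income
    share += synchronization * 0.50       # synchronization income split fifty-fifty
    share += foreign_net * 0.50           # foreign net receipts split fifty-fifty
    share += printed_copies * print_rate_per_copy   # 8 to 12 cents per printed edition
    # Performance income is not added here: the writer's share is paid directly
    # to the composer by the performing rights organization.
    return share

print(writer_share_single_song(mechanical=10_000, synchronization=4_000,
                               foreign_net=2_000, printed_copies=5_000))
# -> 8500.0, plus whatever writer's share the performing rights organization pays directly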

Under these agreements, the publisher owns the copyright in the composition for the term of its copyright, subject to the possibility of its reversion to its composer 35 years after its publication (its first commercial distribution) or 40 years after its assignment (transfer), whichever is earlier. Single-song agreements are entered into with or without advances paid to the songwriter: there is no common industry practice or standard.

Under a long-term agreement, all compositions created by the songwriter during the term are owned by the publisher. Normally such agreements last for one year, with two to four one-year options being held by the publisher. Usually, the writer is paid an advance on signing and with each option pickup, or a weekly salary ($18,000 to $25,000 per year is the average for a fledgling writer). Some long-term songwriter's agreements provide that compositions created pursuant to such agreements are works for hire for the publisher and, hence, incapable of being recaptured by the composer.

Under both types of agreement, the publisher administers the compositions subject to such agreements. That is, the publisher issues all documents and contracts affecting such compositions and collects all income earned by the compositions.

1.2 Copublishing Agreements

Copublishing agreements differ in two material respects from standard agreements described above. Under copublishing agreements, copyrights to the compositions are owned jointly by the publisher and songwriter, and the publisher administers the compositions subject to them, as in standard agreements: it issues mechanical licenses, registers compositions with performing rights organizations, and collects its income throughout the world.

Under most such agreements, the songwriter is ordinarily paid 75% of the mechanical income, print income, and synchronization income derived from the composition and, in addition to the writer's share of performance income, receives 50% of the publisher's share of performance income, or roughly 50% of the gross revenues (except print revenues) of the composition, and often more. Thus, under such agreements, the songwriter not only receives the writer's share of publishing income, but also shares in that portion of what traditionally was the publisher's share of music publishing income.

Copublishing agreements can encompass one song, a number of stated songs, or all compositions written over a period of years, as in a long-term songwriter's agreement. For singer-songwriters that have recording deals, or songwriters that can get their songs "covered" (recorded by others), such transactions are often the most beneficial, and they are done frequently.

1.3 Administration Agreements

The most advantageous arrangement for a songwriter is an administration agreement. Under this type of agreement, the administrative activities of a songwriter's company are conducted by another publisher. The administrator collects all income but remits at least as much as a copublisher's share, other than the writer's share of performance income, to the songwriter's administered company (10% to 15% of the gross revenues is the percentage normally retained by an administrator). Such agreements extend for periods varying from three to five years, at the close of which all administrative rights to the compositions revert to the administered company. However, such agreements are difficult to obtain for songwriters who have no independent means of exploiting compositions that would be subject to such an arrangement.
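To see why the text calls the administration deal the most advantageous, the following Python sketch compares the songwriter's side of an invented $100,000 of gross publishing income under the copublishing and administration percentages quoted above (the writer's share of performance income is paid directly in every case and is left out).

# Minimal comparison using the percentages quoted above; the gross figure is invented.
GROSS = 100_000

# Copublishing: roughly 50% of gross revenues (and often more) reaches the songwriter's side.
copublishing_to_writer = GROSS * 0.50

# Administration: the administrator retains 10% to 15% of gross and remits the rest.
administration_to_writer = (GROSS * (1 - 0.15), GROSS * (1 - 0.10))

print(f"Copublishing:   about ${copublishing_to_writer:,.0f}")
print(f"Administration: ${administration_to_writer[0]:,.0f} to ${administration_to_writer[1]:,.0f}")

On these assumed figures the spread is about $50,000 versus $85,000 to $90,000, which is why administration deals tend to go only to writers who can exploit their own catalogues.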

1.4 Points of Negotiation

Royalties are always negotiable, as are advances. Points in publishing contracts vary in importance among publishing companies, and songwriters differ on the priorities of the numerous issues involved in a songwriting agreement. A songwriter should always attempt to obtain a reversion of any composition subject to a single-song, long-term publishing, or copublishing agreement when any such composition has not been commercially exploited (sold, or even recorded) within a time period specified in any such contract. A composer should allow translations of, or the addition of new lyrics to, any composition only with his or her prior consent, as in some countries a translator or lyricist may register and receive income from a translation which is never performed. Universal copyrights are valuable assets: don't transfer or lessen rights to them without a good reason.

The music publishing industry is not so difficult that its mechanics would confound an attentive student. Songwriters must become expert, or have advisors such as attorneys, personal managers, and business managers, to counsel them on the best methods of navigating the narrows of music publishing.

1.5 Self-publishing

Some composers are capable of creating and administering their catalogues. For composers interested in and capable of properly administering and promoting the products of their artistry, self-publishing can be a viable alternative to the traditional arrangements with publishers. It is difficult, however, to obtain the commercial exploitation of compositions, and in practice few writers are successful going it alone. When a record is released on an independent label, financed by the artist, self-publishing makes sense.

2. The performing rights organizations

In the copyright law, one of the exclusive rights granted to the owner of a composition is the right to perform that composition publicly, subject to certain exceptions. In this context, "perform" is a term of art which means the singing, playing, dancing to, or broadcasting of a song.

Broadcasters traditionally secure licenses, which allow them to broadcast programs containing music, from the performance rights organizations that represent songwriters and publishers. The performing rights organizations make their entire catalogs of compositions available to broadcasters upon the payment of a fee, which gives them the right to perform all of the songs in the organization's catalog without having to contact each publisher directly. After deducting the costs of administration, those fees are divided between the writers and publishers based upon the number of performances logged by the performing rights organization.

It must be noted that there are differences in the working processes of each of these organizations, and a composer seeking affiliation with one of them should consult knowledgeable industry sources to help in making a choice. For a more complete discussion on these organizations, see the chapter titled "Performing Rights Organizations: An Overview."
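The division of broadcast fees described above can be sketched as follows. The fee pool, the administration-cost rate, and the logged performance counts are invented, the even writer/publisher split mirrors the writer's-share/publisher's-share convention used earlier in these notes, and real societies apply more elaborate weighting formulae.

# Minimal sketch: deduct administration costs, allocate the remainder pro-rata to
# logged performances, then split each song's allocation between writer and publisher.

def distribute_performance_fees(license_fees, admin_rate, logged_performances):
    net_pool = license_fees * (1 - admin_rate)
    total_plays = sum(logged_performances.values())
    payouts = {}
    for song, plays in logged_performances.items():
        allocation = net_pool * plays / total_plays
        payouts[song] = {"writer": allocation / 2, "publisher": allocation / 2}
    return payouts

logs = {"Song A": 4_000, "Song B": 1_000}   # hypothetical logged performances
for song, shares in distribute_performance_fees(1_000_000, 0.15, logs).items():
    print(song, shares)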

In the USA there are a few major performing rights organizations. They are: The American Society of Composers, Authors and Publishers (ASCAP), Broadcast Music, Inc. (BMI), and SESAC, Inc. In Singapore, performing rights are taken care of by the Composers and Authors Society of Singapore (COMPASS).

COMPASS – Composers And Authors Society of Singapore

The Composers and Authors Society of Singapore (COMPASS) is an organisation created to protect and promote the copyright interests of composers, authors* (and their heirs), and publishers of musical works and their related lyrics. COMPASS is a non-profit public company which administers the public performance, broadcast, diffusion and reproduction rights in music and musical associated literary works on behalf of its members. COMPASS deals specifically with music copyright and the usage of musical works.

* "Authors," in this case, refers to lyricists, people who put words to music.

The society was formed in 1987 in conjunction with the Copyright Act of Singapore 1987 to answer the call of composers, lyricists and publishers in need of copyright protection. The purpose was to provide them with a means of compensation, in the form of royalties, for use of their creative material by other parties. This was accomplished by setting up a registry of musical works for composers and lyricists whereby their works would be protected by the society. By enforcement, the members would also be protected from any unauthorised usage of their materials and be entitled to due rewards.

Since 1987, COMPASS has grown into a full-fledged organisation providing an array of services for its members. COMPASS currently has over 400 registered local members. In addition to the control and ownership of more than 4,000 pieces of music written and published locally, COMPASS has entered into reciprocal agreements with other affiliated societies worldwide, including ASCAP, BMI, SESAC, and CASH (Composers and Authors Society of Hong Kong), among others, which together represent almost all copyrighted musical works in the world. This means that COMPASS administers the works of more than a million composers, lyricists and publishers worldwide and controls more than 13 million musical works. In addition, the society has established a licensing department to ensure that members are duly compensated.

COMPASS fulfils its social and community role by being an ardent supporter of music and the arts in Singapore. Specifically, COMPASS awards scholarships to deserving musicians, assists local artists in producing their music (CD albums), sponsors concerts, and presents awards to members of the society for distinguished achievements in music and the performing arts.

The members elect a Board of Directors comprising five writer-members and five publisher-members. Elections are carried out bi-annually. The Board of Directors is responsible for the major policies of the society and appoints a General Manager who is responsible for the daily operations and administration of the society.

1.1 Membership in Compass

A writer may qualify for membership if at least one of his or her works satisfies one of the following conditions:
• the work has been recorded and transmitted by a television or radio broadcasting station or cable diffusion service; or
• the work has been made available by inclusion in a catalogue of a recorded music library (such as background or mood music).

1.3 Licenses of COMPASS

Public Performance Rights
If you are a proprietor of a business that provides music to the public, you will need a public performance license from COMPASS. Some examples of premises requiring licenses are discotheques.