
AUDIO RECORDING SYSTEM
Why surround?

• Ever since the 1950s, great steps have been taken to create the ultimate home-entertainment experience.

• Walt Disney attempted to create a surround-sound experience for screenings of Fantasia. The surround process was called “Fantasound.”
Monophonic

• Single source of sound
– Usually a TV speaker or a radio transmission
– This source is referred to as a “channel”
Monophonic
Stereo

• Dual source of sound
– Involves the use of two speakers, or “channels,” paired as Left and Right
– Dubbed “Hi-Fidelity”
– In Dolby terms, this is AC-1, because one track carries both channels
Stereo
(1970s)
Quadraphonic Stereo
(1970s)
Dolby Surround
(1985)

• Two audio tracks are used: Left and Right (track 1), and Mono Surround (track 2)
• These two tracks are carried into the home on stereo program sources such as videotapes and TV broadcasts
• This was dubbed “AC-2” because two tracks were used to carry three channels of sound (http://www.dolby.com)
Dolby Surround
(1985)
Dolby Pro-Logic Surround
(1989)

• Like Dolby Surround, Dolby Pro-Logic Surround used the original two tracks of audio – Left and Right (track 1) and Mono Surround (track 2) – while adding a third track for the Center channel (track 3)
• This new track was used with a filter system that steered all “direct front” sound (such as actors’ voices) so that it appeared to be centered (hence, the “center channel”)
Dolby Pro-Logic Surround
(1989)
Dolby 5.1 Digital Surround
(1995)

• The speaker arrangement uses six channels (only available on DVD and Blu-ray): Left, Right, Center, Surround Left, Surround Right, and the Low-Frequency Effects channel (subwoofer)
• Dolby 5.1 Surround is dubbed so because the .1 LFE (subwoofer) channel is not a constant sound generator; it is triggered to fire only on occasion
• This Dolby technology uses truly separate stereo surround signals, versus the single mono surround channel of the original Dolby Surround and Dolby Pro-Logic. (This format is designated AC-3.)
Dolby 5.1 Digital Surround
(1995)
Dolby EX (THX) Surround
(2002)

• Dolby Digital EX takes the Dolby Digital 5.1-channel setup one step further with an additional center surround channel (reproduced through one or two speakers) for extra dimensional detail and a more enveloping surround effect
• Feature films originally released in Dolby Digital Surround EX (the cinema version) carry the encoded extra surround channel into their subsequent DVD releases, as well as onto 5.1-channel digital satellite and TV broadcasts
• Also marketed for home theater as THX Surround EX (the format was co-developed by Dolby and THX)


Dolby EX (7.1) Surround
(2004)
Dolby Digital Plus

• Dolby® Digital Plus is the next-generation audio technology for high-definition programming and media
• Can deliver 7.1 channels and beyond of enhanced-quality audio
• Allows multiple languages
• Compatible with the millions of home entertainment systems equipped with Dolby Digital
Audio in Multimedia
• In a multimedia production, sound and
music are crucial in helping to establish
moods and create environments.
Audio on PCs
• Many types of sounds are accessible with
a PC. They include:
– Music
– Sound effects
– Spoken narration
– Video soundtracks
– Real-time telephone conversations
– Operating system alerts and prompts
Digital Audio Recording
• Digital recording
devices capture sound
by sampling the sound
waves.
Digital Audio Quality
• The quality and size of digital audio
depends on:
– The sampling rate
– The sample size
– The number of channels
– The time span of the recording
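The four factors above multiply together directly. A small sketch, using the familiar CD-audio settings as the example values:

```python
# Estimate the storage needed for uncompressed (PCM) digital audio.
# Size = sampling rate x sample size x channels x duration.

def pcm_size_bytes(sample_rate_hz, sample_size_bits, channels, seconds):
    """Return the size in bytes of an uncompressed recording."""
    return sample_rate_hz * (sample_size_bits // 8) * channels * seconds

# One minute of CD-quality audio: 44,100 Hz, 16-bit, stereo.
size = pcm_size_bytes(44_100, 16, 2, 60)
print(f"{size / 1_000_000:.1f} MB")  # about 10.6 MB per minute
```

Doubling any one factor (a higher sampling rate, a larger sample size, more channels, or a longer recording) doubles the file size.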
What is streaming audio?
• Streaming audio plays as it reaches your
PC, making it unnecessary to wait until the
entire file is downloaded to the computer.
Audio File Formats
• An audio file’s format determines what
files a PC can open and play, and how much
space the file occupies on a disk. File
formats include:
– MP3
– WAV
– MIDI
MP3 Format
• MP3 (MPEG Audio Layer III) is a standard format for music files
sent over the Internet. MP3s:
– Use one of three MPEG standards for audio
compression
– Can compress an audio file to about one-twelfth
of the space it occupies on a CD with no
significant loss of sound quality
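The "one-twelfth" figure can be sanity-checked from bit rates. The 128 kbit/s MP3 rate below is a common encoder setting used here as an assumption, not a fixed property of the format:

```python
# CD audio streams at 44,100 samples/s x 16 bits x 2 channels;
# compare that with a typical 128 kbit/s MP3 encoding.

cd_bitrate = 44_100 * 16 * 2   # 1,411,200 bits per second
mp3_bitrate = 128_000          # bits per second (assumed typical setting)
print(f"compression ratio: {cd_bitrate / mp3_bitrate:.1f}:1")  # about 11:1
```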
WAV Format
• WAV is a standard for sound files on
Windows and Macintosh PCs. WAVs:
– Do not compress audio as much as MP3s
– Are generally used for sound effects and other
small files
MIDI Format
• MIDI is a method and format for
recording music from synthesizers and other
electronic instruments. MIDIs:
– Are created with a computer that has a sequencer
– Do not contain actual musical notes
– Do not contain sound waves or use sampling
– Are small and load quickly on a Web site
Audio Software for the PC
• Most new PCs come with some software and
hardware for recording and managing audio files.
– Audio editing software allows you to edit audio files
and convert them from one format to another.
– MIDI software includes programs for recording,
storing, replaying, and editing MIDI files.
– Composition software allows you to create sheet
music for many voices or instruments.
Audio Hardware Devices
• Audio hardware devices for the PC may
include:
– Audio or sound cards
– Speakers
– Microphones for voice input
– MIDI input devices
– CD/DVD burners
Speech/Voice Recognition
Definition
• Speech recognition is the process of converting an
acoustic signal, captured by a microphone or a
telephone, to a set of words.
• The recognized words can be an end in
themselves, as in applications such as command
and control, data entry, and document preparation.
• They can also serve as input to further
linguistic processing in order to achieve speech
understanding.
Speech Processing
• Signal processing:
– Convert the audio wave into a sequence of feature vectors
• Speech recognition:
– Decode the sequence of feature vectors into a sequence of
words
• Semantic interpretation:
– Determine the meaning of the recognized words
• Dialog Management:
– Correct errors and help get the task done
• Response generation:
– Choose the words that maximize user understanding
• Speech synthesis (Text to Speech):
– Generate synthetic speech from a ‘marked-up’ word string
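The first stage above can be sketched in a few lines: slice the wave into short overlapping frames and turn each frame into a feature vector. The frame and hop lengths below are illustrative assumptions, and the log-magnitude spectrum stands in for the MFCC-style features real recognizers typically use:

```python
import numpy as np

# Signal processing stage: audio wave -> sequence of feature vectors.
def features(wave, rate=16_000, frame_ms=25, step_ms=10):
    frame = int(rate * frame_ms / 1000)   # samples per frame (400 here)
    step = int(rate * step_ms / 1000)     # hop between frames (160 here)
    window = np.hanning(frame)            # taper frame edges before the FFT
    starts = range(0, len(wave) - frame, step)
    return np.array([np.log1p(np.abs(np.fft.rfft(wave[s:s + frame] * window)))
                     for s in starts])

# One second of a 440 Hz tone becomes a (frames x frequency-bins) matrix.
t = np.arange(16_000) / 16_000
vecs = features(np.sin(2 * np.pi * 440 * t))
print(vecs.shape)
```

The recognizer then decodes this sequence of vectors, not the raw samples, into words.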
Dialog Management
• Goal: determine what to accomplish in response to
user utterances, e.g.:
– Answer user question
– Solicit further information
– Confirm/Clarify user utterance
– Notify invalid query
– Notify invalid query and suggest alternative
• Interface between user/language processing
components and system knowledge base
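A toy rule-based manager illustrating three of the responses listed above (answer, solicit further information, notify an invalid query). The "city" slot and the weather knowledge base are illustrative assumptions:

```python
# Minimal dialog management sketch: pick a response strategy
# based on what the parsed user query contains.

def manage(query, knowledge_base):
    if "city" not in query:
        return "Which city are you asking about?"          # solicit info
    if query["city"] not in knowledge_base:
        return "I don't have data for that city."          # invalid query
    return f"The forecast for {query['city']} is {knowledge_base[query['city']]}."

kb = {"Paris": "sunny", "Oslo": "snow"}
print(manage({}, kb))                  # solicits the missing slot
print(manage({"city": "Paris"}, kb))   # answers the question
```

Real systems add confirmation, clarification, and error correction on top of this skeleton, but the interface role is the same: mediate between the language components and the knowledge base.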
What you can do with Speech
Recognition
• Transcription
– dictation, information retrieval
• Command and control
– data entry, device control, navigation, call
routing
• Information access
– airline schedules, stock quotes, directory
assistance
• Problem solving
Transcription and Dictation
• Transcription is transforming a stream of
human speech into computer-readable form
– Medical reports, court proceedings, notes
– Indexing (e.g., broadcasts)
• Dictation is the interactive composition of
text
– Report, correspondence, etc.
Speech recognition and
understanding
• Sphinx system
– speaker-independent
– continuous speech
– large vocabulary
• ATIS system
– air travel information retrieval
– context management
Speech Recognition and Call
Centres
• Automate services, lower payroll
• Shorten time on hold
• Shorten agent and client call
time
• Reduce fraud
• Improve customer service
Applications related to Speech
Recognition
• Speech Recognition
– Figure out what a person is saying
• Speaker Verification
– Authenticate that a person is who he or she claims to be
– Limited speech patterns
• Speaker Identification
– Assign an identity to the voice of an unknown person
– Arbitrary speech patterns
VIDEO
Basic Video Properties
• Representation of Video Signals
– Visual Representation
• To present the observer with as realistic a
representation of a scene as possible
– Transmission of Video Signals
• Television Systems
– Video Digitization
• Digital Television
Visual Representation
• In order to accurately convey both spatial
and temporal aspects of a scene, the
following properties are considered
– Vertical Details and Viewing Distance
• The geometry of a television image is based on the
ratio of the picture width W to the picture height H
(W/H), called the aspect ratio.
– Conventional aspect ratio is 4:3.
• The angular field of view is determined by the
viewing distance D, expressed as the viewing-distance ratio D/H.
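The viewing-distance ratio converts to an angle with basic trigonometry. The D/H value of about 3.3 used below is a commonly cited figure for conventional TV viewing, taken here as an assumption:

```python
import math

# Vertical field of view from the viewing-distance ratio D/H:
# the screen height H subtends angle = 2 * atan(H / (2 * D)).

def vertical_angle_deg(d_over_h):
    return math.degrees(2 * math.atan(1 / (2 * d_over_h)))

print(f"{vertical_angle_deg(3.3):.1f} degrees")  # roughly 17 degrees
```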
Visual Representation
• Horizontal Detail and Picture Width
– Can be determined from the aspect ratio
• Total detail content of a picture
– Since not all lines (horizontal and vertical) are visible to
the observer, additional information can be transmitted
through them.
• Depth perception
– Depth is a result of composing a picture by each eye
(from different angles)
– In a flat TV picture
• Perspective appearance of the subject matter
• Choice of focal length of the camera lens and changes in depth
focus
Visual Representation
• Luminance
– RGB can be converted to a luminance (brightness
signal) and two color difference signals (chrominance)
for TV signal transmission
• Temporal Aspects of Illumination
– A discrete sequence of still images can be perceived as
a continuous sequence.
• The impression of motion is generated by a rapid succession of
barely differing still pictures (frames).
– Rate must be high enough to ensure smooth transition.
– Rate must be high enough so that the continuity of perception is
not disrupted by the dark intervals between pictures
• The light is cut off, briefly, between these frames.
Visual Representation
• Continuity of Motion
– Continuity is perceived with at least 15 frames per
second.
• To make motion appear smooth in a recorded film (not
synthetically generated), a rate of 30 frames per second is
needed.
– Films recorded with 24 frames per second look strange when
large objects close to the viewer move quickly.
• NTSC (National Television Systems Committee) Standard
– Original: 30 frames/second
– Currently: 29.97 frames/second
• PAL (Phase Alternating Line) Standard
– 25 frames per second
Visual Representation
• Flicker
– If the refresh rate is low, a periodic fluctuation
of the perceived brightness can result.
• Minimum to avoid flicker is 50 Hz.
• Technical measures in movies and TV have allowed
lower refresh rates.
Signal Formats
• RGB
• YUV
• YIQ
• Composite Signals
– Instead of sending each component on its own channel, all components are combined into a single signal
• Computer Video Formats
– Current video digitization hardware differ in
• Resolution of digital images (frames)
• Quantization
• Frame rate
– Motion depends on the display hardware (Figure 5-4 page 86)
Signal Formats
• Computer Video Formats
– Color Graphics Adapter (CGA)
• Resolution: 320 x 200
• 2 bits / pixel
– Enhanced Graphics Adapter (EGA)
• Resolution: 640 x 350
• 4 bits / pixel
– Video Graphics Array (VGA)
• Resolution: 640 x 480
• 8 bits / pixel
– Super Video Graphics Array (SVGA)
• Resolution: 1024 x 768, 1280 x 1024, 1600 x 1280
• 8 bits / pixel
– Video accelerators are needed to avoid reduced performance at
higher resolutions
• Check the storage requirements of the above systems!
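Following the suggestion above, the per-frame storage of each adapter works out directly from resolution and color depth:

```python
# Storage for one full frame of each computer video format:
# width x height x bits-per-pixel / 8 bytes.

adapters = {
    "CGA":  (320, 200, 2),
    "EGA":  (640, 350, 4),
    "VGA":  (640, 480, 8),
    "SVGA": (1024, 768, 8),
}

for name, (w, h, bpp) in adapters.items():
    kb = w * h * bpp / 8 / 1024
    print(f"{name}: {kb:.0f} KB per frame")
```

At 30 frames per second, even the VGA figure implies roughly 9 MB of raw data per second, which is why video accelerators and compression matter at higher resolutions.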
TELEVISION BROADCASTING STANDARDS
Television Systems
• Conventional Systems
– NTSC
• Originated in the US
• Uses color carriers of approx. 4.43 MHz or approx.
3.58 MHz.
• With suppressed color carrier, it uses quadrature
amplitude modulation
• Refresh rate: 30Hz
• # horizontal lines: 525
Television Systems
• Conventional Systems
– SECAM (Sequential Couleur Avec Memoire)
• France and Eastern Europe
• With suppressed color carrier, it uses frequency
modulation
• Refresh rate: 25Hz
• # horizontal lines: 625
Television Systems
• Conventional Systems
– PAL
• Parts of Western Europe
• Uses color carriers of approx. 4.43 MHz
• With suppressed color carrier, it uses quadrature
amplitude modulation
• Refresh rate: 25Hz
• # horizontal lines: 625
Television Systems
• High-Definition TV (HDTV)
– Research began in Japan, 1968
– Third Technological Shift (after black and
white and color TV)
– The goal was to “integrate” the viewer with the
events happening on the screen
Television Systems
• High-Definition TV (HDTV)
– Resolution
• More than 1000 scanning lines
– Approximately double the resolution of conventional TV
• Higher video bandwidth
– About five times that of conventional TV
• Resolutions Recommended
– High 1440 Level: 1440 x 1152
– High Level: 1920 x 1152
– Later, 2k x 2k pixels
– Frame Rate
• No agreement on a fixed frame rate worldwide
• 50 or 60 frames per second
Television Systems
• High-Definition TV (HDTV)
– Aspect Ratio
• 16:9, wider than the 4:3 of conventional TV
– Interlaced and/or progressive scanning formats
• Conventional systems supported interlaced scanning
formats
• HDTV supports progressive scanning formats
– Viewing Conditions
• The screen area should be larger than 8,000 cm² for
“real” scenes
Digital Television
• DVB (Digital Video Broadcasting) started in
Europe in the early 1990s, after considerable progress
in video compression techniques
– More precisely, DTVB (Digital TeleVision
Broadcasting)
– MPEG-2 for source coding of audio/video data
– Satellite Connections, CATV networks and (S)MATV
(Small) Master Antenna TV systems were suitable for
digital TV distribution
• DVB-S (Satellite) and DVB-C (Cable) were adopted by the
European Telecommunications Standards Institute ETSI as
official standards.
– Multichannel Microwave Distribution Systems
(MMDS) are another possibility.
Digital Television
• Advantages
– Increased number of programs can be
transmitted over a TV channel
– Adaptable video/audio quality to each
application
– Exceptionally secure encryption systems for
pay-per-view
– Additional services (video on demand, data
broadcast,...)
– Computers and TV convergence
• Creating an animation
using a program such as
3D Studio Max or
trueSpace is often just a
part of a total video
production process.
• Video editing software
(such as Adobe
Premiere) offers the
opportunity to enhance
animation productions
with sound, still
images, and scene transitions.
• The addition of sound
can add realism and
interest to a video
production.
• Titles and single images
(static or scrolling)
provide additional
information.
• Analog (linear) devices record light and
sound as continuously changing
electrical signals described by a continuous
change of voltage.
• Digital recordings are composed of a series of
specific, discrete values which are recorded
and manipulated as bits of information, which
can be accessed or modified one bit at the
time or in selected groups of bits.
• Digital media is stored in a
format that a computer can read
and process directly.
– Digital cameras, scanners, and
digital audio recorders can be used
to save images and sound in a
format that can be recognized by
computer programs.
– Digital media may come from
images created or sound recorded
directly by computer programs.
• Analog media must be digitized or converted to a
digital format before using a computer
– Analog images may be obtained from such sources as
older video cameras working with VHS or SVHS.
Analog sounds may come from sources such as
audiotapes and recordings.
– Hardware devices such as a video capture card must be
attached to the computer to bring analog materials into
computer video editing programs.
Digital Versus Analog
• Sufficient computer
resources are needed
for digital video
editing.
– Fast processors
needed to process the
video.
– Additional RAM
beyond customary
requirements is
needed.
• Very large hard drives
are needed; a few
minutes of footage
require vast amounts of
storage.
• Video cards should be
capable of working
with 24-bit color depth
displays.
• Large monitors are
better due to the need to
work with numerous
windows and palettes.
• Selecting settings can be a
complex task requiring an
understanding of input
resources and output goals.
• The ability to make good
decisions regarding capture,
edit, and output settings
requires an understanding of
topics such as frame rates,
compression, and audio.
• Numerous books can help, but
experience is still a really good teacher.
• Timebase specifies
time divisions used to
calculate the time
position of each edit,
expressed in frames per
second (fps).
– 24 is used for editing
motion-picture film
– 25 for editing PAL
(European standard)
– 29.97 for editing NTSC
(North American
standard) video
(television)
• Frame rate indicates the
number of frames per second
contained in the source or the
exported video. Whenever
possible, the timebase and
frame rate should agree. The frame
rate does not affect the speed of
the video, only how smoothly it
displays.
• Timecode is a way of
specifying time. Timecode is
displayed in hours, minutes,
seconds, and frames
(00;00;00;00).
• Frame size specifies the dimensions (in pixels) of the
frames. Choose the frame size that matches your source
video. Common frame sizes include:
– 640 x 480–standard for low-end video cards
– 720 x 486–standard-resolution professional video
– 720 x 480–DV standard
– 720 x 576–PAL video standard (Used in Europe.)
• Aspect ratio is the ratio of width to
height of the video display.
– Pixel aspect ratio is the ratio for a pixel
while the frame aspect ratio is the width to
height relationship for an image.
– 4:3 is the standard for conventional
television and analog video.
– 16:9 is the motion picture standard.
– Distortion can occur when a source image
has a different pixel aspect ratio from the
one used by your display monitor. Some
software may correct for the distortion.
• CODECs (compressor/decompressor) specify
the compression system used for reducing the
size of digital files. Digital video and audio
files are very large and must be reduced for
use on anything other than powerful computer
systems. Some common CODECs include those used by
QuickTime and Video for Windows.
• QuickTime (movie-playing
format for both the Mac and
Windows platform) -
Cinepak, DV-NTSC, Motion
JPEG A and B, Video
• Video for Windows (movie-
playing format available only
for the Windows platform) –
Cinepak, Intel Indeo,
Microsoft DV, Microsoft
Video 1
• Color bit depth is the number of colors to be
included. The more colors that you choose to work
with, the larger the file size and in turn, the more
computer resources required.
– 8-bit color (256 colors) might be used for displays on the Web.
– 24-bit color (millions of colors) produces the best image quality.
– 32-bit color (millions of colors) allows the use of an alpha channel.
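The trade-off between color depth and file size is easy to see for a single uncompressed frame. The 720 x 480 DV frame size is used here purely as an example:

```python
# Uncompressed size of one 720 x 480 frame at each color bit depth.

for bits, label in [(8, "8-bit (256 colors)"),
                    (24, "24-bit (millions of colors)"),
                    (32, "32-bit (millions + alpha channel)")]:
    size_kb = 720 * 480 * bits / 8 / 1024
    print(f"{label}: {size_kb:.1f} KB per frame")
```

Tripling the bit depth from 8 to 24 triples the per-frame size, and at 30 frames per second those differences compound quickly.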
• Audio bit depth is the
number of bits used to
describe the audio sample.
– 8-bit mono is similar to FM
radio
– 16-bit is similar to CD audio
• Audio interleave specifies
how often audio
information is inserted
among the video frames.
• Audio compression
reduces file size and is
needed when you plan
to export very large
audio files to CD-
ROMs or the Internet.
• Audio formats include
WAV, MP3, and MIDI
files. MIDI files do not
include vocals. MPEG
files can also include
audio.
• Visual and audio source media are referred to as
clips, which is a film industry metaphor referring to
short segments of a film project.
– Clips may be either computer-generated or live-action
images or sounds that may last from a few frames to
several minutes.
– Bins are used to store and organize clips in a small screen
space. Bin is another film industry metaphor: the bin was
where editors hung strips of film until they were added to the
total production.
• Opening and viewing clips
– Clips must be in a format that the video editing
software can recognize, such as AVI (for video),
WAV (for sound), or JPG (for still images), before
they can be imported.
– Many software programs provide both a “source”
window and a separate “program” window where the
entire production can be monitored.
– Sound clips may be displayed as a waveform where
sounds are shown as spikes in a graph.
• Playback controls are a part of most viewing
windows. Play, Stop, Frame Back, and Frame Forward
are typical commands.
• The Timeline helps cue the user as to the relative
position and duration of a particular clip (or frame)
within the program by graphically showing the clips
as colored bars whose length is an indication of the
duration. As clip positions are moved along the
timeline, their position within the program is
changed.
• Typically the timeline will include rows or
individual tracks for images, audio, and scene
transition clips. The tracks often include a time
ruler for measuring the clips’ duration.
• Some programs allow the duration of a clip to be
changed by altering the length of the bar
representing the clip. Scenes within the program
may be slowed or the speed increased using this
stretch method.
• Cutting and joining clips
– Software tools are typically
available for selecting a clip on the
timeline and then cutting the bar
that represents the clip. Using this
process, segments of “film” may
be separated, deleted, moved, or
joined with other clips.
– Cutting and joining may be used
on audio or video.
• Transitions allow you to make a
gradual or interesting change from one
clip to another by using special effects.
– Transitions might include dissolve, page
peels, slides, and stretches.
– The number and types of transitions
available depend upon the software you
are using.
• Audio mixing is the process of making
adjustments to sound clips.
• Title clips
– An alpha channel allows
you to superimpose the
title
– Title rolls allow text to
move from the bottom of
the screen to beyond the
top, and are used for credits.
• A title crawl moves the text
horizontally across the screen.
News bulletins along the
bottom of the television are an
example of this type of effect.
• Text and graphics may be
created in other programs and
inserted. Video editing
programs are usually limited in
their ability to create and
manipulate text and graphics.
• By using layering techniques, adjusting opacity, and
creating transparency, composite clips can be
created.
– Bluescreen (greenscreen) and track hierarchy allow
background scenes to be overlaid and image editing to
occur.
– Keying makes only certain parts of a clip transparent,
which can then be filled with other images (clips on the
lower tracks of the timeline).
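Keying can be sketched with array operations: foreground pixels matching the key color become transparent, revealing the background track. Pure green as the key color and a tiny 2 x 2 frame are illustrative assumptions:

```python
import numpy as np

# Bluescreen/greenscreen keying sketch: where the foreground matches
# the key color, show the background; elsewhere keep the foreground.

def chroma_key(fg, bg, key=(0, 255, 0)):
    mask = np.all(fg == key, axis=-1, keepdims=True)  # True where keyed out
    return np.where(mask, bg, fg)

fg = np.full((2, 2, 3), (0, 255, 0), dtype=np.uint8)  # all-green frame
fg[0, 0] = (200, 50, 50)                              # one "actor" pixel
bg = np.zeros((2, 2, 3), dtype=np.uint8)              # black background track
out = chroma_key(fg, bg)
print(out[0, 0], out[1, 1])  # actor pixel kept, keyed pixel replaced
```

Real keyers match a tolerance range around the key color and soften the mask edges, but the compositing principle is the same.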
• Output may be to
videotape for display on
a television or to a
digital file for display
through a computer
output device.
• Output may be put into
other presentation
programs such as
PowerPoint.
• Export goals will determine the
output settings that you choose.
– Does the production need to operate on
Windows and/or Mac platforms?
– What software will be used to play your
production?
– What image quality is required?
– How big can the file size be?
– Will the production be displayed on the
Web?
• Common digital outputs
– Audio Video Interleave (avi) – for use on Windows
only computers, good for short digital movies.
– QuickTime – a cross platform Apple format that is
popular for Web video.
– RealVideo – RealNetworks streaming video is an
extremely popular format.
• Video editing projects may be exported to other
multimedia programs (such as Macromedia
Director or Authorware) for additional editing or
integration with other materials such as Flash
programs.
