
Encoding Audio and Video

Digital audio
Sounds created on a computer exist as digital
information encoded as audio files.
Digital audio
Digital sound is broken down into thousands of
samples per second. Each sound sample is
stored as binary data.
Digital audio quality

Factors that affect the quality of digital audio
include:
• sample rate - the number of audio samples
captured every second
• bit depth - the number of bits available for
each sample
• bit rate - the number of bits used per second
of audio
Digital audio
The sample rate is how many samples, or
measurements, of the sound are taken each
second. The more samples that are taken, the
more detail about where the waves rise and fall
is recorded and the higher the quality of the
audio. Also, the shape of the sound wave is
captured more accurately.
Digital audio
Each sample represents the amplitude of the
digital signal at a specific point in time. The
amplitude is stored as either an integer or a
floating point number and encoded as a binary
number.
Digital audio
A common audio sample rate for music is
44,100 samples per second. The unit for the
sample rate is hertz (Hz). 44,100 samples per
second is 44,100 hertz or 44.1 kilohertz (kHz).
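As a rough illustration of sampling, the sketch below measures a sine wave 44,100 times per second and stores each amplitude as an integer. The 441 Hz test tone, the signed 16-bit range and the name `sample_tone` are assumptions chosen for the example, not part of the original slides.

```python
import math

SAMPLE_RATE = 44_100   # samples per second (Hz), the rate quoted above
TONE_HZ = 441          # assumed test tone, so one cycle spans exactly 100 samples
MAX_16BIT = 32_767     # largest signed 16-bit value (assumed storage type)

def sample_tone(duration_s):
    """Return integer samples of a sine tone lasting duration_s seconds."""
    count = round(SAMPLE_RATE * duration_s)
    samples = []
    for n in range(count):
        t = n / SAMPLE_RATE                              # time of the n-th sample
        amplitude = math.sin(2 * math.pi * TONE_HZ * t)  # between -1.0 and 1.0
        samples.append(round(amplitude * MAX_16BIT))     # store as an integer
    return samples

samples = sample_tone(1.0)
print(len(samples))   # number of samples in one second of audio
print(samples[:5])    # the first few stored amplitudes
```

Lowering `SAMPLE_RATE` gives fewer measurements per cycle, so the stored values follow the shape of the wave less closely.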
1.1.3 Sound
• Sound is an oscillation of pressure transmitted
through a solid, liquid, or gas.
1.1.3 Sound
• Differences in volume and pitch come from
differences in the sound wave. Waves that
carry more energy sound louder, and waves
that are closer together (a shorter distance
between waves) sound higher in pitch.
1.1.3 Sound
1 - Base volume and frequency
2 - Double volume and frequency
3 - Same volume, treble the frequency
Analogue and digital
• An analogue sound wave is picked up by a
microphone and sent to an Analogue to Digital
Converter (ADC) in the form of analogue
electrical signals.
• The ADC converts the electrical signals into
digital values which can be stored on a
computer.
Analogue and digital
• Once in a digital format you can edit sounds
with programs such as Audacity.
Analogue and digital
• Analogue to Digital Converter (ADC) -
Converts analogue sound into digital signals
that can be stored on a computer

• Digital to Analogue Converter (DAC) -
Converts digital signals stored on a computer
into analogue sound that can be played
through devices such as speakers
Sound Sampling
• Sampling Rate - The number of samples taken
per second

• Hertz (Hz) - the SI unit of frequency defined as
the number of cycles per second of a periodic
phenomenon
Sampling resolution
• Sampling resolution - the number of bits
assigned to each sample
• A common audio sample rate for music is 44,100
samples per second. The unit for the sample rate
is hertz (Hz). 44,100 samples per second is 44,100
hertz or 44.1 kilohertz (kHz).
• Telephone networks and VoIP services can use a
sample rate as low as 8 kHz. This uses less data to
represent the audio. At 8 kHz, the human voice
can still be heard clearly - but music at this
sample rate would sound low quality.
Bit depth
• Bit depth is the number of bits available for each
sample. The higher the bit depth, the higher the
quality of the audio. Bit depth is usually 16 bits
on a CD and 24 bits on a DVD.
• A bit depth of 16 has a resolution of 65,536
possible values, but a bit depth of 24 has over 16
million possible values.
• 16-bit resolution means each sample can be any
binary value between 0000 0000 0000 0000 and
1111 1111 1111 1111.
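The figures above follow from the fact that n bits can represent 2^n different values. A short sketch (the function name `levels` is only illustrative):

```python
def levels(bit_depth):
    """Number of distinct values a sample can take at the given bit depth."""
    return 2 ** bit_depth

for depth in (8, 16, 24):
    highest = levels(depth) - 1
    # Show the count and the highest value written out in binary.
    print(depth, levels(depth), format(highest, f"0{depth}b"))
```

For a bit depth of 16 this gives 65,536 values, and for 24 it gives 16,777,216.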
Bit rate

• The bit rate of a file tells us how many bits of
data are processed every second. Bit rates are
usually measured in kilobits per second
(kbps).
Calculating bit rate

• The bit rate is calculated using the formula:
• Frequency × bit depth × channels = bit rate
• A typical, uncompressed high-quality audio file
has a sample rate of 44,100 samples per second,
a bit depth of 16 bits per sample and 2 channels
of stereo audio. The bit rate for this file would be:
• 44,100 samples per second × 16 bits per sample
× 2 channels = 1,411,200 bits per second (or
1,411.2 kbps)
• A four-minute (240 second) song at this bit
rate would create a file size of:
• 1,411,200 × 240 = 338,688,000 bits (or 40.37
megabytes, taking 1 megabyte as 1,048,576 bytes)
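The worked example above can be checked with a few lines of code. This is a minimal sketch; the function names are illustrative, and the megabyte figure assumes 1 megabyte = 1,048,576 bytes, which is what reproduces the 40.37 quoted above.

```python
def bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second: frequency x bit depth x channels."""
    return sample_rate_hz * bit_depth * channels

def file_size_bits(rate_bps, seconds):
    """Total bits for audio of the given length at a constant bit rate."""
    return rate_bps * seconds

rate = bit_rate(44_100, 16, 2)          # CD-quality stereo
size_bits = file_size_bits(rate, 240)   # four-minute song
size_bytes = size_bits // 8             # 8 bits per byte
size_mb = size_bytes / (1024 * 1024)    # assumed: 1 MB = 1,048,576 bytes

print(rate)                # bits per second
print(size_bits)           # bits in the whole file
print(round(size_mb, 2))   # megabytes
```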
Sound Editing
• If you are interested in sound editing you can
start editing your own music using a program
called Audacity.
• Using Audacity you can create your own
sound samples with different sample rates
and sample resolutions, listening to the
difference between them and noting the
different file sizes.
Features(Audacity)
• Recording
– Audacity can record live audio through a
microphone or mixer, or digitize recordings from
cassette tapes, records or minidiscs.
– With some sound cards, and on any Windows
Vista, Windows 7 or Windows 8 machine,
Audacity can also capture streaming audio.
Features(Audacity)
• Import and Export
– Import sound files, edit them, and combine them
with other files or new recordings.
– Export your recordings in many different file
formats, including multiple files at once.
Features(Audacity)
• Sound Quality
– Supports 16-bit, 24-bit and 32-bit (floating point)
samples (the latter preserves samples in excess of
full scale).
– Sample rates and formats are converted using
high-quality resampling and dithering.
– Tracks with different sample rates or formats are
converted automatically in real time.
Features(Audacity)
• Editing
– Easy editing with Cut, Copy, Paste and Delete.
– Unlimited sequential Undo (and Redo) to go back
any number of steps.
– Edit and mix large numbers of tracks.
– Multiple clips are allowed per track.
Features(Audacity)
• Editing
– Label tracks with selectable Sync-Lock Tracks
feature for keeping tracks and labels synchronized.
– Draw Tool to alter individual sample points.
– Envelope Tool to fade the volume up or down
smoothly.
– Automatic Crash Recovery in the event of
abnormal program termination.
Features(Audacity)
• Accessibility
– Tracks and selections can be fully manipulated
using the keyboard.
– Large range of keyboard shortcuts.
– Excellent support for JAWS, NVDA and other
screen readers on Windows, and for VoiceOver on
Mac.
Features(Audacity)
• Effects
– Change the pitch without altering the tempo (or
vice-versa).
– Remove static, hiss, hum or other constant
background noises.
– Alter frequencies with Equalization, Bass and
Treble, High/Low Pass and Notch Filter effects.
Features(Audacity)
• Effects
– Adjust volume with Compressor, Amplify,
Normalize, Fade In/Fade Out and Adjustable Fade
effects.
– Remove Vocals from suitable stereo tracks.
– Create voice-overs for podcasts or DJ sets using
Auto Duck effect.
– Run "Chains" of effects on a project or multiple
files in Batch Processing mode.
Features(Audacity)
• Effects
– Other built-in effects include:
• Echo
• Paulstretch (extreme stretch)
• Phaser
• Reverb
• Reverse
• Truncate Silence
Features(Audacity)
• Plug-ins
– Support for LADSPA, Nyquist, VST and Audio Unit
effect plug-ins.
– Effects written in the Nyquist programming
language can be easily modified in a text editor -
or you can even write your own plug-in.
Features(Audacity)
• Analysis
– Spectrogram view modes for visualizing
frequencies.
– "Plot Spectrum" command for detailed frequency
analysis.
– "Sample Data Export" for exporting a file
containing amplitude values for each sample in
the selection.
Features(Audacity)
• Analysis
– Contrast Analysis for analyzing average RMS
volume differences between foreground speech
and background music.
– Support for adding VAMP analysis plug-ins.
Features(Audacity)
• Free and Cross-Platform
– Licensed under the GNU General Public License
(GPL).
– Runs on Windows, Mac OS X and GNU/Linux
Compression

• Compression is a useful tool for reducing file sizes. When
images, sounds or videos are compressed, data is removed
to reduce the file size. This is very helpful when streaming
and downloading files.
• Streamed music and downloadable files, such as MP3s, are
usually between 128 kbps and 320 kbps - much lower than
the 1,411 kbps of an uncompressed file.
• Videos are also compressed when they are streamed over a
network. Streaming HD video requires a high-speed
internet connection. Without it, the user would experience
buffering and regular drops in quality. HD video is usually
around 3 Mbps (3,000 kbps). SD is around 1,500 kbps.
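To get a feel for what these bit rates mean in practice, the sketch below estimates the size of a four-minute (240 second) song at the uncompressed rate and at the two MP3 rates mentioned above. It assumes a constant bit rate and 1 kbps = 1,000 bits per second; the name `size_bytes` is illustrative.

```python
def size_bytes(bit_rate_bps, seconds):
    """Approximate file size in bytes at a constant bit rate."""
    return bit_rate_bps * seconds // 8

SONG_SECONDS = 240
uncompressed = size_bytes(1_411_200, SONG_SECONDS)

for label, bps in (("uncompressed", 1_411_200),
                   ("320 kbps", 320_000),
                   ("128 kbps", 128_000)):
    size = size_bytes(bps, SONG_SECONDS)
    # Report the size and how many times smaller it is than the original.
    print(label, size, round(uncompressed / size, 1))
```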
Compression can be lossy or lossless.
• Lossless compression means that as the file
size is compressed, the audio quality remains
the same - it does not get worse. Also, the file
can be restored back to its original state. FLAC
and ALAC are open source lossless
compression formats. Lossless compression
can reduce file sizes by up to 50% without
losing quality.
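The defining property of lossless compression, that the original can be restored exactly, can be demonstrated with Python's standard `zlib` module. Note that zlib is a general-purpose compressor used here only as a stand-in; it is not FLAC or ALAC, and the sample data is made up for the example.

```python
import zlib

# Made-up, highly repetitive "audio" bytes so the saving is easy to see.
original = bytes([0, 0, 0, 0, 10, 10, 10, 10] * 1000)

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # size before and after compression
print(restored == original)            # the data comes back bit for bit
```

Real audio is far less repetitive than this sample, so the saving would be smaller.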
Compression

• Lossy compression permanently removes part of
the data. For example, a WAV file compressed to an
MP3 would be lossy compression. The bit rate
could be set at 64 kbps, which would reduce the
size and quality of the file. However, it would not
be possible to recreate a 1,411 kbps quality file
from a 64 kbps MP3.
• With lossy compression, the original bit depth is
reduced to remove data and reduce the file size.
The bit depth becomes variable.
Compression

• MP3 and AAC are lossy compressed audio file
formats widely supported on different
platforms. MP3 and AAC are both patented
codecs. Ogg Vorbis is an open source
alternative for lossy compression.
Digital video

• A digital film is created from a series of static
images played at a high speed. Digital films are
usually around 24 frames per second but can
be anything up to around 100 frames per
second or more.
• Films have a frame rate, measured in frames per
second (fps). This is similar to sample rate. HD
film is normally
50 or 60 fps. This can also be measured in
frequency (Hz). TV and computer screens have
a specification in Hz to indicate the frame rate
they support.
• Digital films also have a bit rate that accounts
for the total audio and image data processed
every second.
Video compression

• Videos are compressed in order to:
– reduce the resolution
– reduce the dimensions
– reduce the bit rate
Video compression

• Data lost during the compression process can
cause poor picture quality or even random
coloured blocks that appear and disappear on
the screen. These blocks are called artefacts.
Video compression

• Examples of popular lossy video file formats
include MP4 and MOV. Video file formats use
codecs to carry out compression algorithms
on the video's picture and audio data.
• App developer Ivo Jansch explains why
compression is used to reduce the amount of
binary data contained within a video file.
Codecs and compression algorithms

• Codecs are programs that encode data as
usable files, whether images, audio or video.
Compression codecs are designed to remove
data without losing quality (where possible).
Algorithms work out what data can be
removed and reduce file size.
Run length encoding (RLE)

• One of the simplest examples of compression
is RLE. RLE is a basic form of data compression
that converts consecutive identical values into
a code consisting of the character and the
number marking the length of the run. The
more similar values there are, the more values
can be compressed. The sequence of data is
stored as a single value and count.
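A minimal sketch of run length encoding on a string of characters is shown below. The function names and the sample input are illustrative; real formats that use RLE differ in how they pack the value and count.

```python
from itertools import groupby

def rle_encode(text):
    """Turn each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original string from (character, count) pairs."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("WWWWBBBWWC")
print(encoded)               # [('W', 4), ('B', 3), ('W', 2), ('C', 1)]
print(rle_decode(encoded))   # WWWWBBBWWC
```

The final `C` shows the weakness of RLE: a run of length 1 still costs a value and a count, so data with few repeats can end up larger after encoding.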
1.1.4 Video
• Video is an electronic medium for the
recording, copying and broadcasting of
moving visual images.
1.1.4 Video
Characteristics of video streams:
• Number of frames per second
– Frame rate, the number of still pictures per unit of
time of video, ranges from six or eight frames per
second (frame/s) for old mechanical cameras to
120 or more frames per second for new
professional cameras.
Interlaced vs progressive encoding
• Interlaced Encoding
– Interlacing was invented as a way to reduce flicker
in early mechanical and CRT video displays
without increasing the number of complete
frames per second, which would have required
sacrificing image detail in order to remain within
the limitations of a narrow bandwidth.
Interlaced vs progressive encoding
• Interlaced Encoding
– The horizontal scan lines of each complete frame
are treated as if numbered consecutively and
captured as two fields: an odd field (upper field)
consisting of the odd-numbered lines and an even
field (lower field) consisting of the even-
numbered lines.
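The split into two fields can be sketched with list slicing. Here a frame is modelled as a list of scan lines numbered from 1, so the odd field holds lines 1, 3, 5 and so on. The names `split_fields` and `weave` are illustrative, and the example ignores timing, which matters in real interlaced video.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its odd and even fields."""
    odd_field = frame[0::2]    # lines 1, 3, 5, ... (list indices 0, 2, 4, ...)
    even_field = frame[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field

def weave(odd_field, even_field):
    """Interleave two fields back into a complete frame."""
    frame = []
    for i, line in enumerate(odd_field):
        frame.append(line)
        if i < len(even_field):
            frame.append(even_field[i])
    return frame

frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
odd, even = split_fields(frame)
print(odd)    # ['line1', 'line3', 'line5']
print(even)   # ['line2', 'line4', 'line6']
print(weave(odd, even) == frame)
```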
Interlaced vs progressive encoding
• Interlaced Encoding
– Analog display devices reproduce each frame in the
same way, effectively doubling the frame rate as far as
perceptible overall flicker is concerned.
– When the image capture device acquires the fields
one at a time, rather than dividing up a complete
frame after it is captured, the frame rate for motion is
effectively doubled as well, resulting in smoother,
more lifelike reproduction (although with halved
detail) of rapidly moving parts of the image when
viewed on an interlaced CRT display.
Interlaced vs progressive encoding
• Progressive encoding
– In progressive scan systems, each refresh period
updates all of the scan lines of each frame in
sequence.
– When displaying a natively progressive broadcast
or recorded signal, the result is optimum spatial
resolution of both the stationary and moving parts
of the image.
Video compression method
(digital only)
• Video compression
– Uncompressed video delivers maximum quality,
but with a very high data rate. A variety of
methods are used to compress video streams,
with the most effective ones using a Group Of
Pictures (GOP) to reduce spatial and temporal
redundancy.
Spatial and Temporal redundancy
• Spatial Redundancy
– Spatial redundancy is reduced by registering
differences between parts of a single frame; this
task is known as intraframe compression and is
closely related to image compression.
Spatial and Temporal redundancy
• Temporal redundancy
– Temporal redundancy can be reduced by
registering differences between frames; this task
is known as interframe compression, including
motion compensation and other techniques.
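The idea behind interframe compression can be shown with a toy example: store the first frame in full, then for the next frame store only the pixels that changed. This is a simplified sketch under the assumption that a frame is a flat list of pixel values; real codecs use motion compensation and other techniques, and the function names are illustrative.

```python
def frame_delta(previous, current):
    """Record only the pixels that differ from the previous frame."""
    return {i: new
            for i, (old, new) in enumerate(zip(previous, current))
            if old != new}

def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame

frame1 = [5, 5, 5, 5, 9, 9]
frame2 = [5, 5, 5, 7, 9, 9]   # only one pixel has changed

delta = frame_delta(frame1, frame2)
print(delta)                                  # {3: 7}
print(apply_delta(frame1, delta) == frame2)   # True
```

When little changes between frames, the delta is much smaller than the full frame, which is where the saving comes from.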
Definition of terms
• AIFF - Audio interchange file format - an
uncompressed audio file format developed by
Apple
• Algorithm - A sequence of logical instructions
for carrying out a task. In computing,
algorithms are needed to design computer
programs.
Definition of terms
• Amplitude - The maximum height of a wave
from the middle of the wave to its crest or
trough.
• audio sample - A digital representation of a
sound.
• Bit - The smallest unit of data in computing,
represented by a 0 or a 1 in binary.
Definition of terms
• bit depth - The number of bits available for
each sample.
• bit rate - The number of bits of data
processed every second.
• Buffer - A temporary area of computer
memory used to store data for running
processes.
• Codec - A program that encodes or decodes
digital information. They are used to create or
read audio and video files.
Definition of terms
• Compression - A method of reducing file sizes,
particularly in digital media such as photos,
audio and video.
• Downloading - To copy a file from the internet
onto your computer or device.
• floating point - A data value in computer
programming used to denote decimal
numbers.
Definition of terms
• Frame - A single static image in a video or
animation.
• HD - High definition.
• Hertz - The unit of frequency, symbol 'Hz'. 1 Hz
is 1 wave or cycle per second.
• Integer - A whole number - in computing, a
data type which represents signed (positive or
negative) or unsigned (non-negative) whole
numbers.
Definition of terms
• Kbps - Kilobits per second (Kbps): a
measurement of the speed data is being
transferred at.
• Lossless - A form of compression that encodes
digital files without losing detail. Files can
also be restored to their uncompressed quality.
• Lossy - A form of compression that reduces
digital file sizes by removing data.
Definition of terms
• MBps - Megabytes per second - a
measurement of data transfer speed.
• MP3 - A standard audio file format which uses
lossy compression. Compatible with most
media players. The name is short for MPEG
Audio Layer 3; the format was designed by the
Moving Picture Experts Group (MPEG).
• open source - A model for creating technology
that promotes free access to its design and
makes it free to share.
Definition of terms
• PCM - Pulse-code modulation - a process for
digitizing analogue audio and creating an
uncompressed audio file
• sample rate - How many samples of data are
taken per second. This is normally measured
in hertz, eg an audio file usually uses a sample
rate of 44.1 kHz (44,100 audio samples per
second).
Definition of terms
• SD - Standard definition.
• Streaming - Data that is sent in pieces. Each
piece is viewed as it arrives, eg a streaming
video is watched as it downloads.
• Uncompressed - A file which has not had any
data removed through compression.
Definition of terms
• VoIP - Voice over internet protocol - a protocol
used for communicating voice data over the
internet.
• WAV - An uncompressed audio file format
developed by Microsoft.
