VoIP Lecture5
Lecture (5)
Voice Over IP (VoIP)
10/9/2023
2G: A interface (BSC>>CS core)
• Huawei live trial ongoing
• NSN trials planned in May
• Saving (CAPEX & OPEX) from BSS side due to removal of TC
• Better voice quality after removing transcoding and enabling TrFO calls
• Important for the implementation of the A-Flex feature, which is needed for the completion of the MSC pool project
Topics
• What is VoIP?
• Why is VoIP Attractive?
• How Does it Work?
• Voice Coding
What is VoIP?
• Typical voice codecs used in VoIP include ITU-T standards such as 64 kb/s G.711 PCM, 8
kb/s G.729 and 5.3/6.3 kb/s G.723.1; 3GPP standards such as AMR; open-source codecs
such as iLBC; and proprietary codecs such as Skype’s SILK codec, which has variable bit
rates in the range of 6 to 40 kb/s.
• Some codecs can only operate at a fixed bit rate, whereas many advanced codecs can have
variable bit rates which may be used for adaptive VoIP applications to improve voice quality.
• Voice codecs or speech codecs are based on different speech compression techniques
which aim to remove redundancy from the speech signal to achieve compression and to
reduce transmission and storage costs.
• In practice, speech compression codecs are normally compared with the 64 kb/s PCM
codec which is regarded as the reference for all speech codecs. Speech codecs with the
lowest data rates (e.g., 2.4 or 1.2 kb/s Vocoder) are used mainly in secure communications.
• In general, the higher the speech bit rate, the higher the speech quality and the greater the
bandwidth and storage requirements.
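The trade-off between bit rate and bandwidth/storage can be made concrete with a little arithmetic. The sketch below is only illustrative: it takes the bit rates quoted above, compares each with the 64 kb/s PCM reference, and estimates the raw speech data produced per minute (no packet headers). The dictionary and function names are hypothetical.

```python
# Bit rates (kb/s) quoted in the bullets above; names are illustrative only.
CODEC_KBPS = {
    "G.711 PCM": 64.0,
    "G.729": 8.0,
    "G.723.1 (high)": 6.3,
    "G.723.1 (low)": 5.3,
}

PCM_REFERENCE_KBPS = 64.0  # the reference codec for all comparisons


def compression_ratio(kbps: float) -> float:
    """Ratio of the 64 kb/s PCM reference rate to the codec rate."""
    return PCM_REFERENCE_KBPS / kbps


def kilobytes_per_minute(kbps: float) -> float:
    """Raw coded speech for one minute, in kilobytes (1 kB = 1000 bytes)."""
    return kbps * 60 / 8  # kb/s x 60 s = kilobits; / 8 = kilobytes


for name, kbps in CODEC_KBPS.items():
    print(f"{name:16s} ratio {compression_ratio(kbps):5.1f}  "
          f"{kilobytes_per_minute(kbps):6.1f} kB/min")
```

These figures cover the coded speech only; the per-packet overhead added by the network is a separate cost.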
Origin  Standard       Codec type  Bit rate  Voice frame  Look-ahead  Algor. delay  Ie
                                   (kb/s)    (ms)         (ms)        (ms)
ITU-T   G.711          PCM         64        0.125        0           0.125         0
ITU-T   G.726 / G.727  ADPCM       16        0.125        0           0.125         50
                                   24        0.125        0           0.125         25
                                   32        0.125        0           0.125         7
                                   40        0.125        0           0.125         2
ITU-T   G.728          LD-CELP     12.8      0.625        0           0.625         20
                                   16        0.625        0           0.625         7
ITU-T   G.729(A)       CS-ACELP    8         10           5           15            11
ITU-T   G.723.1        ACELP       5.3       30           7.5         37.5          19
                       MP-MLQ      6.3       30           7.5         37.5          15
ETSI    GSM-FR         RPE-LTP     13        20           0           20            20
ETSI    GSM-HR         VSELP       5.6       20           0           20            23
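The algorithmic delay column follows directly from the frame and look-ahead columns of the table above: it is their sum. A minimal sketch of that relationship, using a few rows of the table (the dictionary and function names are hypothetical, and G.729 is entered with its 10 ms frame):

```python
# (frame_ms, look_ahead_ms) for a few codecs from the table.
# G.729 codes speech in 10 ms frames with a 5 ms look-ahead.
CODEC_TIMING = {
    "G.711": (0.125, 0.0),
    "G.728": (0.625, 0.0),
    "G.729(A)": (10.0, 5.0),
    "G.723.1": (30.0, 7.5),
    "GSM-FR": (20.0, 0.0),
}


def algorithmic_delay_ms(frame_ms: float, look_ahead_ms: float) -> float:
    """The encoder must buffer one full frame plus the look-ahead before coding."""
    return frame_ms + look_ahead_ms


for name, (frame, look_ahead) in CODEC_TIMING.items():
    print(f"{name:9s} {algorithmic_delay_ms(frame, look_ahead):7.3f} ms")
```

This is only the codec's own buffering delay; processing, packetization and network delays come on top of it.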
1- Waveform-Based Coding
• Typical examples are Pulse Code Modulation (PCM) and Adaptive Differential PCM (ADPCM).
• PCM uses non-uniform quantization: fine quantization steps for small speech signals and coarse
steps for large ones (logarithmic compression). Statistics show that small signal values make up
the larger share of speech, and smaller quantization steps give lower quantization error, so this
improves the Signal-to-Noise Ratio (SNR) of PCM coding.
• There are two PCM codecs: PCM μ-law, which is standardized for use in North America and
Japan, and PCM A-law, for use in Europe and the rest of the world. Both are specified in
ITU-T G.711 (first approved in 1972; the edition in force dates from 1988).
• For both PCM A-law and μ-law, each sample is coded using 8 bits (compressed from 16-bit linear
PCM data per sample). This yields the PCM transmission rate of 64 kb/s at an 8 kHz sample rate
(8000 samples/s × 8 bits/sample = 64 kb/s). 64 kb/s PCM is normally used as a reference
point for all other speech compression codecs.
• ADPCM, proposed by Jayant in 1974 at Bell Labs, was developed to compress PCM further by
exploiting the correlation between adjacent speech samples. An ADPCM codec (encoder and
decoder) consists of an adaptive quantiser and an adaptive predictor.
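The logarithmic compression described above can be sketched with the continuous μ-law formula, F(x) = sgn(x) · ln(1 + μ|x|) / ln(1 + μ) with μ = 255. This is an illustration under stated assumptions, not the G.711 algorithm itself: G.711 uses a piecewise-linear (segmented) approximation of this curve, and the function names here are hypothetical.

```python
import math

MU = 255.0  # mu-law parameter


def mu_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y: float) -> float:
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x: float) -> int:
    """Compand, then quantise uniformly to one of 256 levels (8 bits)."""
    y = mu_compress(max(-1.0, min(1.0, x)))
    return min(255, int((y + 1.0) / 2.0 * 256))


def decode_8bit(code: int) -> float:
    """Reconstruct the sample at the centre of its quantisation cell."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return mu_expand(y)


# Small samples get finer effective steps, so their absolute error is smaller.
for x in (0.01, 0.1, 0.9):
    err = abs(decode_8bit(encode_8bit(x)) - x)
    print(f"x={x:4.2f}  abs error={err:.5f}")

# Bit rate: 8000 samples/s x 8 bits/sample
print(8000 * 8, "bit/s")
```

Uniform quantisation in the companded domain is what produces the non-uniform steps in the linear domain.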
2- Parametric-Based Coding
• Parametric compression sends only the parameters relevant to speech production to the
receiver, which reconstructs the speech from a speech production model. A high
compression ratio can therefore be achieved. The most typical example of parametric compression is
Linear Prediction Coding (LPC), proposed by Atal in 1971 at Bell Labs. It was designed to
emulate the human speech production mechanism, and the bit rate can be
as low as 800 bit/s (a compression ratio of 80 compared to 64 kb/s PCM). It
normally operates at bit rates from 1.2 to 4.8 kb/s. LPC-based speech codecs achieve a
high compression ratio, but the voice quality is correspondingly low.
• A typical parametric codec is the LPC vocoder, which is normally used in secure wireless
communications systems when transmission bandwidth is very limited.
3- Hybrid Coding
• Hybrid coding techniques were proposed to combine the features of both waveform-based
and parametric-based coding (hence the name). Hybrid coding keeps the
nature of parametric coding, which includes the vocal tract filter, pitch period analysis and
the voiced/unvoiced decision.
• Instead of using a periodic impulse train to represent the excitation signal for voiced
speech segments, it uses a waveform-like excitation signal for voiced, unvoiced or transition
(containing both voiced and unvoiced) speech segments.
• The best-known technique, Code-Excited Linear Prediction (CELP),
has made hybrid speech codecs a huge success in the range of 4.8 kb/s to 16 kb/s
for mobile/wireless/satellite communications, achieving toll quality (MOS over 4.0) or
communications quality (MOS over 3.5).
• Almost all modern speech codecs (such as G.729, G.723.1, AMR, iLBC and SILK)
are hybrid compression codecs, and the majority of them are based on
CELP techniques.
• RPE/LTP: Regular Pulse Excitation/Long Term Prediction, used in ETSI GSM Full-Rate (FR) at 13
kb/s.
• VSELP: Vector Sum Excited Linear Prediction, used in ETSI GSM Half-Rate (HR) at 5.6 kb/s.
• ACELP: Algebraic CELP, used in ETSI GSM Enhanced Full-Rate (EFR) at 12.2 kb/s and ETSI
AMR from 4.75 to 12.2 kb/s.
• The previous sections mainly discussed Narrowband (NB) speech compression, aimed at the
speech spectrum from 0 to 4 kHz. This 0 to 4 kHz narrowband speech, expanded from the
speech frequency range of 300 Hz to 3400 Hz, is used not only in VoIP systems but also
in traditional digital telephony in the Public Switched Telephone Network (PSTN).
• In VoIP and mobile applications, there has been a trend in recent years to use Wideband (WB)
speech to provide high-fidelity speech transmission quality. For WB speech, the
speech spectrum is expanded to 0–7 kHz, with a sampling rate of 16 kHz.
• Compared to 0–4 kHz narrowband speech, wideband speech contains more high-frequency
components and gives higher speech quality.
• Three wideband speech compression methods are currently used
in the wideband speech codecs standardized by ITU-T or ETSI. They are:
• Example:
WB-AMR is now used for 3G Mobile Networks to enhance Voice Quality.
• From this table, you should be able to see the historic development of
speech compression coding standards (from 64 kb/s, 32 kb/s, 16 kb/s and 8
kb/s to 6.3/5.3 kb/s) for achieving high compression efficiency; the
development of mobile codecs from GSM to AMR for 2G and 3G applications; the
development from single-rate, dual-rate and 8-mode codecs to
variable-rate codecs for achieving high application flexibility; and the trend
from narrowband (NB) codecs to wideband (WB) codecs for achieving high
speech quality (even High Definition voice).
• This development has made speech compression codecs more efficient and
more flexible for many different applications including VoIP.
• In the table, the columns on coded bits per sample/frame and speech frame
for each codec will help you to understand payload size and to calculate
VoIP bandwidth which will be covered in RTP transport protocol.
• The columns on look-ahead time and the codec’s algorithmic delay will help you to
understand codec delay and VoIP end-to-end delay, a key QoS metric,
which will be discussed in detail in the next part.
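The payload-size calculation mentioned above follows from the codec bit rate and the amount of speech carried per packet. The sketch below assumes a 20 ms packetization interval (an assumption for illustration; other intervals are possible), and the function names are hypothetical.

```python
def payload_bytes(bit_rate_kbps: float, packet_ms: float) -> float:
    """Coded speech bytes in one packet that carries packet_ms of speech."""
    return bit_rate_kbps * packet_ms / 8  # kb/s x ms = bits; / 8 = bytes


def packets_per_second(packet_ms: float) -> float:
    """Number of packets sent each second for the given interval."""
    return 1000 / packet_ms


PACKET_MS = 20.0  # assumed packetization interval

for name, kbps in (("G.711", 64.0), ("G.729", 8.0)):
    print(f"{name}: {payload_bytes(kbps, PACKET_MS):.0f} bytes per packet, "
          f"{packets_per_second(PACKET_MS):.0f} packets/s")
```

With a 20 ms interval, a G.729 packet carries two of the codec's 10 ms frames.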
• Packets are sent across the Internet, re-assembled at the destination
and converted back into audio.
• Across the Internet, packets can take many different routes (Packet
Switching), whereas in a traditional telephone call a single dedicated circuit
is required for each call (Circuit Switching).
• The routers that route traffic on the Internet cost a fraction of the
switches used on traditional long-distance phone networks. All this means
cheaper phone calls.
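Because packets may take different routes, they can arrive out of order and must be put back in sequence before playback. The following is a minimal sketch of that idea, assuming each packet carries a sender-assigned sequence number; the `VoicePacket` class and helper names are hypothetical, and packet loss, sequence-number wrap-around and playout timing are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class VoicePacket:
    seq: int        # sequence number assigned by the sender
    payload: bytes  # one chunk of coded speech


def packetize(frames):
    """Wrap each coded speech chunk in a packet with an increasing sequence number."""
    return [VoicePacket(seq, frame) for seq, frame in enumerate(frames)]


def resequence(received):
    """Put packets back into sending order and join their payloads."""
    ordered = sorted(received, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


frames = [b"AA", b"BB", b"CC", b"DD"]
packets = packetize(frames)

# Different routes mean different delays, so arrival order can differ.
arrived = [packets[2], packets[0], packets[3], packets[1]]

print(resequence(arrived))  # b'AABBCCDD'
```

A real receiver would do this incrementally in a jitter buffer rather than sorting a complete list.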
Voice To/From IP
[Figure: sending side: Analog → Digital Voice → Compress → IP-Network; receiving side: IP-Network → Re-sequence → Digital Voice → Analog]
The most common, standardized encoding algorithms and their coding rate and speech quality.
Vocoder Attributes
• Vocoder speech quality is a function of bit rate, complexity, and processing delay.
There usually is a strong interdependence between all these attributes and they
may have to be traded off against each other.
• For example, low-bit-rate vocoders tend to have more delay than higher bit rate
vocoders. Low-bit-rate vocoders also require higher VLSI complexity to
implement. As might be expected, low-bit-rate vocoders often have lower speech
quality than the higher bit rate vocoders.
1. Bit Rate:
• Most vocoders operate at a fixed bit rate regardless of the input signal
characteristics; however, the goal is to make the vocoder variable-rate. For
simultaneous voice and data applications, a compromise is to create a silence
compression algorithm, as shown in the table below.
2. Delay:
The delay in a speech coding system usually consists of two major components:
• Frame delay
• Speech processing delay
3. Vocoder Complexity:
4. Quality:
• The measure used in comparisons is how good the speech sounds under ideal
conditions, namely clean speech, no transmission errors, and only one encoding.
Note, however, that in the real world these ideal conditions are often not met,
because there can be large amounts of background noise such as street noise,
office noise and air conditioning noise. Quality is measured by different methods,
but the result is ultimately mapped to a Mean Opinion Score (MOS) value, which ranges
from 1 to 5 and represents the degree of voice quality.
Packet Encapsulation
For example, the required bandwidth for a G.729 call (8 Kbps codec
bit rate) with RTP, MP and the default 20 bytes of voice payload is:
• Total packet size (bits) = (67 bytes) x 8 bits per byte = 536 bits
• Bandwidth per call = voice packet size (536 bits) x 50 pps = 26.8 Kbps
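The calculation above can be sketched in a few lines. The 40-byte IP/UDP/RTP header is the standard sum of 20 (IPv4) + 8 (UDP) + 12 (RTP) bytes. The 7 bytes of Layer 2 (MP) overhead is an assumption chosen so that the total matches the 67 bytes quoted above. Function and constant names are hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 40  # IPv4 (20) + UDP (8) + RTP (12)
L2_OVERHEAD_BYTES = 7         # assumption: makes the total 67 bytes as quoted


def packets_per_second(codec_kbps: float, payload_bytes: int) -> float:
    """Packet rate needed to carry the codec bit rate with this payload size."""
    return codec_kbps * 1000 / (payload_bytes * 8)


def bandwidth_per_call_kbps(codec_kbps: float, payload_bytes: int,
                            l2_bytes: int = L2_OVERHEAD_BYTES,
                            header_bytes: int = IP_UDP_RTP_HEADER_BYTES) -> float:
    """Total packet size in bits multiplied by the packet rate, in Kbps."""
    packet_bits = (l2_bytes + header_bytes + payload_bytes) * 8
    return packet_bits * packets_per_second(codec_kbps, payload_bytes) / 1000


# G.729: 8 Kbps codec bit rate, 20-byte voice payload
print(packets_per_second(8, 20))       # 50.0
print(bandwidth_per_call_kbps(8, 20))  # 26.8
```

Note that the headers (47 bytes under these assumptions) outweigh the 20-byte payload, which is why the per-call bandwidth is more than three times the codec bit rate.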
• With circuit-switched voice networks, all voice calls use 64 Kbps fixed-bandwidth
links regardless of how much of the conversation is speech and how much is
silence. With VoIP networks, all conversation and silence is packetized. Using
Voice Activity Detection (VAD), packets of silence can be suppressed.
• Over time and as an average on a volume of more than 24 calls, VAD may
provide up to a 35 percent bandwidth savings. The savings are not realized
on every individual voice call, or on any specific point measurement.
• For the purposes of network design and bandwidth engineering, VAD should not
be taken into account, especially on links that carry fewer than 24 voice calls
simultaneously.
• Various features such as music on hold and fax render VAD ineffective. When
the network is engineered for the full voice call bandwidth, all savings provided
by VAD are available to data applications.
• VAD also provides Comfort Noise Generation (CNG). Because you can mistake
silence for a disconnected call, CNG provides locally generated white noise so
the call appears normally connected to both parties.
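The design rule above (engineer for full bandwidth, treat VAD savings as a long-run average that only appears above 24 calls) can be illustrated as follows. This is a sketch under stated assumptions: the 26.8 Kbps per-call figure is taken from the G.729 example earlier, the 35 percent figure is the upper bound quoted above rather than a guaranteed saving, and all names are hypothetical.

```python
VAD_MIN_CALLS = 24     # savings are only expected on volumes above this
VAD_MAX_SAVING = 0.35  # "up to 35 percent", an upper bound


def engineered_bandwidth_kbps(calls: int, per_call_kbps: float) -> float:
    """Bandwidth to provision: full rate for every call, VAD ignored."""
    return calls * per_call_kbps


def best_case_average_with_vad_kbps(calls: int, per_call_kbps: float) -> float:
    """Lowest long-run average load suggested by the text; no saving at 24 calls or fewer."""
    full = engineered_bandwidth_kbps(calls, per_call_kbps)
    if calls <= VAD_MIN_CALLS:
        return full
    return full * (1 - VAD_MAX_SAVING)


PER_CALL_KBPS = 26.8  # G.729 example figure

for calls in (10, 30):
    full = engineered_bandwidth_kbps(calls, PER_CALL_KBPS)
    avg = best_case_average_with_vad_kbps(calls, PER_CALL_KBPS)
    print(f"{calls} calls: provision {full:.1f} Kbps, "
          f"best-case average load {avg:.1f} Kbps")
```

The gap between the two figures is the capacity that becomes available to data applications when VAD is effective.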
• Cost saving:
TDM released a lot of STM-1s, which resulted in huge OPEX savings.
• Network modernization.
• The cRTP feature is now enabled in the core network, which resulted in BW savings of about
25% to 30%.
• The door is still open for further BW and voice quality enhancement through VAD
and other features.