MOS - Definition

Abstract
Introduction
I. MOS IN GENERAL
1. MOS - Definition
2. MOS - Rating Scales and Formulas
3. MOS – Different MOS Metrics
3.1 Listening context
3.2 Talking context
3.3 Conversational context
4. MOS - For speech and audio quality
5. MOS - Methods and Applications
6. MOS - Limitations and Alternatives
II. MOS TERMINOLOGY FOR AUDIO

III. MOS TERMINOLOGY FOR VIDEO
IV. DATASET’S EXPLANATION
V. TEST FACILITIES AND PREDICTION
Conclusion
1. MOS - Definition
The Mean Opinion Score (MOS), often called the MOS score, is used to evaluate the perceived
quality of a call, whether voice, video, or audiovisual. MOS is mostly used in
telecommunications, where researchers and developers use it to judge how good a digital
approximation of the original signal is. In VoIP (Voice over Internet Protocol), it allows people
to test the network and determine whether voice transmission is good, medium, or weak. With
VoIP, it also helps to investigate quality issues and to measure voice degradation and performance.
Example: After a WhatsApp call ends, both parties receive a message asking them to rate the
quality of the call (Good, Medium, or Weak).
2. MOS - Rating Scales and Formulas
MOS ranks the quality of a voice call on a scale from 1 to 5. A score of 1 means that the
quality is BAD, while a score of 5 means that it is EXCELLENT. The exact wording of the scale
can depend on what needs to be measured.
MOS   Quality     Impairment
5     Excellent   Extremely good
4     Good        Acceptable
3     Fair        Slightly annoying
2     Poor        Annoying
1     Bad         Very annoying
Score = 1: A and B are connected, but it is impossible for them to hear each other.
Score = 2: Considerable effort is required from both A and B; it is hard to understand
what is being said.
Score = 3: A and B can understand each other with slight effort.
Score = 4: A and B can talk normally, with good transmission and reception.
Score = 5: No effort is needed from either party; communication is perfect.
Voice quality can be good or bad, so it is important at this point to know which factors
affect the MOS score of a VoIP system.
These include, among others:
Hardware: Suitable hardware is always needed between the network and the VoIP system.
Packet loss: During a call, voice is transmitted as packets. One or more packets can be
lost, which degrades the call and can even cause it to fail.
Bandwidth: In electronics, bandwidth is a range of frequencies, measured in hertz, that
is used for transmitting signals.
Furthermore, MOS can be calculated by:
\[ MOS = \sum_{n=1}^{N} \left( \frac{R_n}{N} \right) \]
Figure 1: MOS formula

R_n = Individual rating for call n
N = Number of calls made
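The formula above is an arithmetic mean of the individual ratings. A minimal Python sketch follows, assuming integer ratings on the 1 to 5 scale from the table; the function names `mean_opinion_score` and `nearest_label`, the sample ratings, and the rounding convention used to pick a label are illustrative assumptions, not part of the original text.

```python
# Labels taken from the rating table in this section.
QUALITY_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of the individual ratings R_n (each in 1..5)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for r in ratings:
        if not 1 <= r <= 5:
            raise ValueError(f"rating out of range 1..5: {r}")
    return sum(ratings) / len(ratings)


def nearest_label(mos):
    """Map a MOS value to the closest label (assumed convention: round half up)."""
    return QUALITY_LABELS[min(5, max(1, int(mos + 0.5)))]


if __name__ == "__main__":
    ratings = [4, 5, 3, 4, 4]          # hypothetical ratings from N = 5 calls
    mos = mean_opinion_score(ratings)  # (4 + 5 + 3 + 4 + 4) / 5
    print(mos, nearest_label(mos))
```

Because the result is an average, it is usually fractional, so a label is only a rough summary of the score.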
3. MOS – Different MOS metrics
There are three (3) main Mean Opinion Score metrics used to evaluate VoIP quality: the
listening context, the talking context, and the conversational context.
3.1 Listening context
The listening context covers one-way listening, such as a voice message that both parties
(A and B) listen to without responding. Participants may hear the message perfectly, or only
unclearly or badly. The listening context can be disturbed by:
Codec
Noise
Information loss
Signal level
For example, when participant A is some distance from the phone (say 5 meters away) or very
close to a fan while recording a voice message, participant B might not hear what A says.
This decreases both the speech quality and the loudness of the voice.
3.2 Talking context
Participants encounter several situations when talking on the phone. A call can be connected,
but does that mean both participants get an answer in return? When they cannot talk, what is
the cause?
We expect the main causes to be distortion of the signal (channel), echo, and noise, which
are the main limiting factors in communication and measurement systems.
3.2.1 Distortion of the signal
In general, a channel can be described in three (3) parts: the input, the channel distortion,
and the output. In phone communication, a signal is sent but can take several different paths
to the receiver, so delayed and attenuated copies of it arrive at the receiver. Channel
distortion can degrade or even interrupt communication, but researchers have improved the
technology with an important component: channel modeling and equalization. Channel modeling
and equalization reduce the effects of interference and noise, which allows better
demodulation.
Figure 2: Illustration of the input channel, the distortion channel and the output channel
3.2.2 The echo
In telecommunication systems, an echo is a sound or series of sounds caused by the reflection
of sound waves from a surface back to the listener. Echo most often appears in international
telephone calls and in hands-free communication. If the echo delay exceeds approximately
25 milliseconds (ms), it can annoy the listener. Two aspects are discussed here: acoustic echo
and length of delay.
Acoustic echo: The word "acoustic" refers to microphones and speakers. A mobile phone
can be connected to external microphones or speakers and then used for a conversation.
Acoustic echo is intensified when a sensitive microphone is used, or when the speaker
and/or microphone volume is set high during a call. These are the main causes of
acoustic echo, illustrated in Figure 3.
Figure 3: Acoustic echo
Length of delay: The length of delay is the time needed to reach the destination. In
telecommunications, it is the time a signal takes to propagate from the sender to the
receiver. End-to-end delay is the time the destination waits for the packets it needs
from the source.
Figure 4: End to end delay
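The two quantities described above can be expressed as a short Python sketch. It assumes that send and receive timestamps are available in seconds from synchronized clocks, and it uses the approximate 25 ms figure quoted in the text as the threshold; the function names and the constant name are illustrative assumptions.

```python
# Approximate annoyance threshold quoted in the text, in milliseconds.
ECHO_ANNOYANCE_THRESHOLD_MS = 25.0


def end_to_end_delay_ms(sent_at_s, received_at_s):
    """Time between sending and receiving, in ms (timestamps given in seconds)."""
    if received_at_s < sent_at_s:
        raise ValueError("receive time precedes send time")
    return (received_at_s - sent_at_s) * 1000.0


def echo_is_annoying(echo_delay_ms, threshold_ms=ECHO_ANNOYANCE_THRESHOLD_MS):
    """True when the echo delay exceeds the (approximate) threshold."""
    return echo_delay_ms > threshold_ms


if __name__ == "__main__":
    delay = end_to_end_delay_ms(10.0, 10.5)  # hypothetical timestamps
    print(delay, echo_is_annoying(delay))
```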
3.2.3 The Noise
Noise is a disruption or random signal in a communication system that affects transmission
from the sender to the receiver. If someone in Vietnam sends information to someone in
Senegal, how can the sender be sure that it arrived without any disturbance? The different
types of noise are described below.
Physical noise: Interference from the environment while people are talking to each
other. It can be caused by loud music during recording, a car horn, a fan, etc.
Physiological noise: A misunderstanding between sender and receiver caused by how the
speech is produced. Examples include stuttering (speaking with difficulty), poor
articulation of words, hesitation, and speaking too slowly or too fast.
Psychological noise: Noise related to the speaker's mental state. While talking, the
sender may be thinking about something else, which makes the message harder for the
receiver to understand. This is a form of man-made noise.
It is also important to know how to determine the noise level in a communication system. This
is done by measuring the SNR (Signal-to-Noise Ratio), defined as:
\[ SNR = 10 \log_{10} \left( \frac{S}{N} \right) \; dB \]
S = Signal power
N = Noise power
dB = decibels
In general, researchers have concluded that good-quality voice transmission requires an
average SNR of about 30 dB. At some telecommunications operators such as Orange, the SNR can
reach 90 dB, which means that the signal is strong and the noise is insignificant.
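As a worked illustration of the SNR formula, the Python sketch below converts a power ratio to decibels. It assumes S and N are powers expressed in the same linear unit; the function name `snr_db` and the sample values are illustrative assumptions.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB: 10 * log10(S / N), with S and N as powers."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    # A signal 1000 times stronger than the noise corresponds to about 30 dB.
    print(snr_db(1000.0, 1.0))
    # A ratio of 10**9 corresponds to about 90 dB.
    print(snr_db(1e9, 1.0))
```

Each additional 10 dB corresponds to a tenfold increase in the power ratio, so 90 dB represents a ratio a million times larger than 30 dB.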
3.3 Conversational context
In real life, we encounter different kinds of conversations between participants, who may
have difficulty both listening and talking; this is the conversational context. If both
participants talk at the same time, it is called double talk, and if both stay quiet (neither
talks to the other), it is called mutual silence. We have experimented with different types of
phone conversations, shown in Figure 5:
Figure 5: Different phases of conversations (Talk + Hear, Talk only, Hear only, Can't talk or can't hear)
Talk only: Mr. Hung can talk but cannot hear anything
Talk + Hear: Mr. Hung can both talk to and hear his friend
Hear only: Mr. Hung cannot speak and can only hear his friend
Can't talk or can't hear: Mr. Hung is disconnected
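The phases listed above, together with double talk and mutual silence, can be written as two small classification functions. This Python sketch assumes each participant's state is available as boolean flags; the function names and the "single talk" label for the remaining case are illustrative assumptions, and the last phase is read here as the participant being able neither to talk nor to hear.

```python
def conversation_phase(can_talk, can_hear):
    """Classify one participant's phase from two capability flags."""
    if can_talk and can_hear:
        return "Talk + Hear"
    if can_talk:
        return "Talk only"
    if can_hear:
        return "Hear only"
    return "Can't talk or can't hear"


def talk_state(a_speaking, b_speaking):
    """Classify a moment of the call from who is currently speaking."""
    if a_speaking and b_speaking:
        return "double talk"
    if not a_speaking and not b_speaking:
        return "mutual silence"
    return "single talk"  # assumed label; not named in the text


if __name__ == "__main__":
    print(conversation_phase(True, False))
    print(talk_state(True, True))
```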
4. MOS - For speech and audio quality
Many factors affect voice quality. Phone operators (Orange, Viettel, Mobiphone, FPT, etc.)
therefore ask themselves:
Where do the issues come from?
How do they happen?
Why do these issues occur?
What is the best solution to overcome them?
Several software applications are on the market to help phone operators measure, analyze,
and improve voice quality.
Among these applications are SEVANA and CYARA.

4.1 SEVANA
SEVANA is a European software vendor providing multiple services, especially voice quality
and monitoring products. Its software is available for download, and a license is reportedly
not expensive. SEVANA offers several products, such as:
AQUA: AQUA (Audio Quality Analyzer) helps phone operators and services perform voice
quality analysis. AQUA can compare audio files recorded by participants and test the
voice quality. MOS can be used in AQUA for calculating scores. Every service needs
this kind of product; its only drawback is the cost of AQUA itself.
PVQA: PVQA (Passive Voice Quality Analysis) is designed for call quality monitoring.
With PVQA, it is possible to get feedback from participants about the quality of a
voice call (good, medium, or bad) and to use a customized MOS score calculator.
4.2 CYARA
CYARA is one of the global leaders in omnichannel customer experience testing and monitoring.
CYARA's platform comprises call routing and agent desktop testing, global in-country dialing
(dialing any number in any country), omnichannel testing, and voice quality testing.
In the past, only about 10% of scenarios could be tested, which is not reliable enough for
the system and the customer lifecycle. CYARA now makes it possible to test over 90% of
scenarios in a fraction of the time. CYARA offers many kinds of tests:
Connectivity
Responsiveness
Quality
No drop-outs
To learn more about CYARA's advantages, see this YouTube video:
https://www.youtube.com/watch?v=Z_NuINGrBlM
5. MOS - Methods and Applications
