Neurocomputing
journal homepage: www.elsevier.com/locate/neucom
Article info

Article history:
Received 26 February 2019
Revised 19 October 2019
Accepted 16 December 2019
Available online xxx
Communicated by Dr F.A Khan

Keywords:
Quality of Service
VoIP
Association Test
Quality of Experience
Machine Learning

Abstract

The quality of experience (QoE) of the end-users is a critical criterion of measurement in VoIP (Voice over Internet Protocol) systems for technical and commercial purposes. We investigate how quality of service (QoS) influences QoE and assess the QoE in VoIP communication. Our contributions are three-fold. First, the impacts of QoS on QoE are comprehensively analyzed by experimental means and an association test method, instead of independently studying each parameter. Second, an algorithm is proposed to integrate the effects of QoS parameters with spatial or temporal characteristics on QoE. Third, we apply machine learning regression algorithms with QoS impairments, noise and echo impairments to nonintrusive voice quality prediction in different network environments. The results from numerous experiments show that fairly accurate prediction can be obtained from these models. Our work will achieve a more accurate evaluation of the QoE in VoIP by using QoS parameters, clarify the influence of IP network environments, noise and echo impairments on the quality and reliability of VoIP traffic, and provide QoS parameter requirements for the VoIP application that runs at the desired QoE level.
© 2019 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.neucom.2019.12.072
0925-2312/© 2019 Elsevier B.V. All rights reserved.
Please cite this article as: Z. Hu, H. Yan and T. Yan et al., Evaluating QoE in VoIP networks with QoS mapping and machine learning
algorithms, Neurocomputing, https://doi.org/10.1016/j.neucom.2019.12.072
JID: NEUCOM
ARTICLE IN PRESS [m5G;January 3, 2020;16:18]
Table 1
MOS and its correspondence to perception of end users [3,4].
Score   Quality     Perception (description of impairment)
5       Excellent   Imperceptible
4       Good        Perceptible, but not annoying
3       Fair        Slightly annoying
2       Poor        Annoying
1       Bad         Very annoying

been distinguished and the respective threshold ranges of distinct parameters given the specific QoE level have not been provided; and the models that map QoS parameters to QoE do not employ a holistic approach to these parameters. Employing machine learning algorithms to evaluate QoE is a common method, but the selection of an appropriate learning method remains an unresolved issue.
In this paper, instead of independently checking each QoS parameter, the authors analyze the impact of these impairments on QoE and determine the threshold ranges of different parameters on QoE levels. Applying the maximal information coefficient (MIC) [11] and the distance correlation algorithm [12], we notice that the jitter parameter has a strong nonlinear correlation with QoE. The temporal and spatial characteristics of QoS are introduced to obtain a more reasonable integration of the parameter effects on QoE, which provides a perceptually accurate and nonintrusive voice quality prediction that avoids the need for time-consuming subjective tests. We identify the best regression learning algorithm and machine learning algorithm for predicting or screening the voice quality.
The remainder of the paper is arranged as follows. The literature review regarding VoIP QoE evaluation is included in Section 2. In Section 3, an experiment platform is designed to emulate the VoIP traffic, and the spatial and temporal characteristics of QoS impairments are described. Section 4 presents our method to estimate the listening voice quality and conversational voice quality. In Section 5, machine learning regression algorithms are developed for predicting the QoE in VoIP applications. Finally, we draw conclusions and present prospective work in Section 6.

2. Related work

In real-time voice communications, the MOS test has been extensively regarded as the QoE rating standard [13]. The test has five score levels: bad-1, poor-2, fair-3, good-4, and excellent-5. MOS values are the result of objective or subjective measurement [14]. However, executing subjective MOS measurements is time-consuming, expensive, and irreproducible, which motivates the use of objective methods. Exemplars of the intrusive objective method are Perceptual Speech Quality Measure (PSQM) [15], Measuring Normalizing Blocks (MNB) [16], Perceptual Analysis Measurement System (PAMS) [17], and Perceptual Evaluation of Speech Quality (PESQ) [18]. Although the intrusive objective method highly correlates with subjective tests, it evaluates speech quality on the basis of the original speech and hence is not a practical monitoring tool for live traffic. The E-Model is the most extensively applied nonintrusive objective assessment method; it was originally designed for conventional network planning [19]. In [19], impairments of several types are fused by the E-Model into the rating factor R, the perceptual impairment scale, which can be derived by

$$R = R_0 - I_s - I_d - I_{e\text{-}eff} + A \tag{1}$$

where $R_0$ is the signal-to-noise ratio with the noise referring to circuit noise and background noise, $I_s$ is the impairment on the voice signal affected by a collection of factors nearly concurrently, $I_d$ is the impairment by the delay factor, $I_{e\text{-}eff}$ (or $I_e$) is the effect of information loss due to the encoding scheme and packet loss, and $A$ is the factor that adapts the quality value [19]. A higher R indicates a better quality of voice. The E-Model is a combination of determined empirical formulae which applies only to restricted network conditions and a limited number of codecs. However, the impairment by the jitter factor is not taken into consideration in this model.
Most studies are primarily concerned with the effect of packet loss [20–26]. The method proposed in [20] evaluated the influence of packet loss on Skype quality by a set of formulae according to subjective MOS measurement and a simplified E-Model. In [21], the Thai language is evaluated by using ACR listening opinion tests, where the G.726 and G.729 codecs and random packet loss were employed. In [22,23], the authors addressed the effect of bursty losses on VoIP; in particular, [23] derived a method of adjusting the conditional loss probability for the Gilbert loss model as the packet interval varies. In [24], the authors primarily scrutinized the effect of packet dispersion (noticeable loss rate) on the quality of VoIP applications. In [25,26], by considering the Weber-Fechner Law (WFL) and the IQX hypothesis (exponential interdependency of QoE and QoS), the authors use logarithmic regression

$$QoE = \log(a \cdot QoS + b) \tag{2}$$

and exponential regression

$$QoE = a e^{-b \cdot QoS} + \gamma \tag{3}$$

to indicate the relation between QoE and packet loss. However, WFL and IQX have only been employed with a single input parameter/metric.
Other literature studied the consequence of packet loss and delay with respect to QoE using ITU's E-Model [27–29]. In [27], the authors noticed that the correlation between packet loss and delay did not satisfy linearity after assessing the voice quality of the PCM and G.728 codecs under varying packet loss and delay. In [28], the function of a few internet backbone links was evaluated. The analysis indicated that VoIP quality substantially relied on not only the link quality of the provider but also the playout buffer scheme. An approach that considered the interactivity of voice communications was developed by Sun and Ifeachor [29], where the E-Model and PESQ are combined to evaluate the voice quality, and they proposed a nonlinear regression model of voice quality:

$$MOS_C = a + bx + cy + dx^2 + ey^2 + fxy + gx^3 + hy^3 + ixy^2 + jx^2y \tag{4}$$

where x is the packet loss rate, and y represents the end-to-end delay. However, no assessments have been performed to figure out how jitter and jitter buffer relate to the VoIP QoE in [29]. In
Table 2
IP network QoS class definitions and network performance objectives [5].
[30], the voice quality of a cloud-based trading communication system was evaluated by the simplified E-Model, showing that delay impairment has less negative impact on voice quality than jitter and packet loss rate do. In [31], in line with native Thai users, a simplified E-Model was proposed for VoIP quality assessment of the G.729 codec; the authors suggested the jitter impairment factor should be further studied.
In [32], the equipment impairment factor of the E-Model has been extended by utilizing an artificial neural network (ANN). In addition, the use of machine learning algorithms to establish the mapping relationship between QoS and QoE is an important direction of current research. For example, genetic programming [33], deep belief networks [34], active learning [35] and fuzzy evidence theory [36] have been deployed in such studies. In [37], for web surfing service, the authors analyze the relationship between QoS and web QoE, and predict user expectation of the network state by using several machine learning algorithms.
Charonyktakis et al. proposed the MLQoE method, which applied multiple machine learning algorithms (ANN, Support Vector Machine, Decision Trees, etc.) and nested cross-validation to evaluate the VoIP QoE [11]. Although the MLQoE dominates other methods, the precision of the data set used in the experiment has a large deviation from the desired accuracy suggested by the ITU-T standard [38,39]. In [40], the authors argue that a clear and comprehensive manual on the available parametric models and the critical QoE performance parameters per service type is currently missing. In [41], the authors analyzed the relationship between QoS and QoE in a geostationary satellite system, but the work focused more on the design and use of the experimental platform than on the QoE evaluation model itself.
As previously mentioned, a few limitations exist, which we intend to overcome in this paper. First, the majority of achievements concentrate on the effect of packet loss on QoE; few studies explore the influence of jitter and jitter buffer variation on QoE. Second, there are hardly any mapping models directly used for monitoring QoE in view of multiple QoS parameters simultaneously.

3. Experiment description

To achieve a more comparative analysis, our first step is to build a VoIP test bed. Although the experiments performed in a real network condition can be convincing, they are uncontrollable, unrepeatable and costly since the involvement of a large number of pieces of equipment and service providers is inevitable. To circumvent these disadvantages, the testbed described in this section is specifically designed to provide the necessary network conditions and simplify the collection of the reference and degraded voice signals for listening/conversation voice quality measurement.

3.1. Experiment platform

Fig. 1 illustrates the framework of the experimental platform. The lower part of the figure illustrates various impairment factors of the VoIP component. The middle part shows the end-to-end VoIP components from sender to receiver: the original signals from the sender are periodically sampled by the encoder, which subsequently creates a constant bit rate stream, and the packetizer then places the stream into IP packets; in the network transmission component, IP packets transport voice data on the network; at the receiver, the playback buffer provides a smooth playout to alleviate delay variations (jitter), and the received packets are carried to the depacketizer and then retrieved by the decoder. The top part assesses the voice quality by PESQ and converts its result to the MOS.
To generate the experiment data sets that correspond to different network parameters, two main software tools (the NIST Net network emulator [42] and the OpenPhone VoIP application [43]) are employed in our tests. The VoIP application runs on the end PCs (Personal Computers, typically equipped with a microphone and loudspeakers); it sends voice samples from the source to the destination based on an input voice recording file and controls the encoding algorithm and the packetization interval. NIST Net is set on the Linux router to emulate different network transmission environments. The voice signal is sent through NIST Net, which enabled us to study the application quality under various network impairments.
As shown in Fig. 1, if we do not consider the echo impairment due to devices (microphone and loudspeaker), the key impairment factors of VoIP can be divided into two types: the first is the digital-to-analog (D/A) reciprocal transformation impairment, which arises when the voice signal is encoded/decoded and packetized/depacketized in a VoIP system; the other is the network impairment, which includes packet loss, delay, jitter, bandwidth, etc.

3.2. Network impairments and their spatial and temporal characteristics

3.2.1. Network impairments
Generally, in a VoIP system, the primary QoE impairments include packet loss, delay, jitter (or delay variation) and bandwidth restrictions. The measurement and analysis of these parameters have always been a focus of the field of network performance research [44–47].
The packet loss is calculated as follows: let $P_S = \{p_{s,1}, p_{s,2}, \ldots, p_{s,n}\}$ be a group of packets being sent from sender to receiver. According to their sending sequence, the packets $p_{s,i}$ are put in ascending order. Let $P_R = \{p_{r,1}, p_{r,2}, \ldots, p_{r,n}\}$ denote the corresponding collection of received packets at the receiver. The packet loss can be calculated as $PL = 1 - |P_R|/|P_S|$. The delay is commonly called end-to-end delay, which encompasses (a) encoding and packetization delay at the sender; (b) depacketization, decoding and buffering delay at the receiver; and (c) propagation time, transmission delay, and queuing delay through a network link or network element. The encoding/decoding delay and packetization/depacketization delay originate from processing related to the coder. Table 3 summarizes the codec-related delay of G.711 and G.729A. According to the recommendation in [48], a
Table 3
Codec-related delay for different codecs [48].
Standard          Codec type   Frame size (ms)   Look-ahead (ms)   Mean one-way delay introduced by coder-related processing (ms)
G.711 (64 kb/s)   PCM          0.125             0                 0.375
G.729A (8 kb/s)   CS-ACELP     10                5                 35

de-jitter buffer adds one half of its peak delay to the end-to-end delay.
Jitter indicates the statistical variance of packet inter-arrival time. Different descriptions of the jitter exist: (a) the standard deviation of the delay; (b) the mean deviation of the packet spacing change according to RFC 1889 [49]; and (c) the inter-packet delay variation according to RFC 3393 [50]. In NIST Net, jitter is simulated by the standard deviation of the delay. Although the bandwidth requirements for single VoIP applications are relatively low (usually below 64 kbps of voice data), they comprise an important factor that cannot be disregarded because the variance in packet loss, jitter, and delay is intrinsically related to the bandwidth. By making an appropriate choice for the codec, the required bandwidth can be controlled. In an IP network, the bandwidth that suffices to carry a voice stream primarily depends on the encoding type, sample period and IP/UDP/RTP/Ethernet headers. Consider the G.711 codec as an example. Each packet is sent in one Ethernet frame; the payload for the G.711 encoding and a 20 ms sample period is 160 octets; and the IP/UDP/RTP/Ethernet headers add a fixed 66 octets to the payload. Thus, a total of 226 bytes is required every 20 ms, the transmission of which requires a one-way bandwidth of 90.4 kbps. Similarly, the bandwidth requirement for G.729A coding is 34.4 kbps.

3.2.2. Spatial and temporal-related impairments of QoS parameters
In VoIP, the speech stream is divided into small frames which are transported by the IP packets. These frames are then brought together again to form a continuous stream at the destination based on their specific spatial and temporal order. In practice, jitter and packet loss change the distribution of the voice packets instead of preserving the constant inter-packet gap and orderly sequence.
Fig. 2.a–c illustrates the phenomenon of packet loss and jitter, which causes changes in the packet spatial distribution. In the figures, $s_i$ and $r_i$ are the sending time and arrival time of packets, respectively, and $p_i$ is the playout time of packets in the jitter buffer. Two cases exist: in the first case, the packets are discarded during network transmission due to bit error, congestion, etc. (as shown in Fig. 2.a); in the second case, the packets are dropped by the jitter buffer when they arrive later than the scheduled playout deadline (as shown in Fig. 2.b). As a consequence of jitter in a stream of packets, packet disorder may occur (as shown in Fig. 2.c). Suppose the packet stream $p_1, p_2, p_3, p_4, p_5, p_6, p_7$ is sent in that order. If $p_2$, $p_5$, and $p_6$ do not arrive in the desired order while the others stay correct, the received stream may become $\{p_1, p_3, p_2, p_4, p_7, p_5, p_6\}$. In this case, $p_2$, $p_5$, $p_6$ or even $p_3$, $p_2$, $p_4$, $p_7$, $p_5$, and $p_6$ are discarded by the jitter buffer algorithm. The spatial characteristics of packets are destroyed, which indicates the transmission of less information and a reduction in user-perceived quality. This phenomenon is termed the spatial-related impairment of QoS parameters. Since modified jitter buffer algorithms can enhance total user-perceived quality by handling subsequent packets and disordered packets differently, they have always been intensively studied [51–54].
Temporal-related impairments are related to the end-to-end delay in a VoIP system (as shown in Fig. 2.d). Longer end-to-end delay increases the potential of voice echo, or response time, and lowers conversational quality.

3.3. Mean opinion score and metrics

The VoIP QoE can be estimated by subjective or objective measurement, but a subjective test under the ITU-T P.800 standard must be conducted with at least 100 interviewees and special lab conditions in regard to the room size, noise level, and microphone position. Despite measuring VoIP QoE with the best accuracy, subjective methods are expensive, lack repeatability and are often impractical. Instead, objective methods, such as the PESQ and E-Model, are the most commonly employed models. The resulting values of the PESQ and E-Model can be converted to MOS under ITU-T Rec. P.862.1 and ITU-T Rec. G.107, respectively.
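The arithmetic behind the network impairment metrics of Section 3.2.1 can be checked with a short sketch. It is only an illustration under stated assumptions, not the measurement code used on the test bed: the function names are hypothetical, the G.729A payload of 20 octets is inferred from its 8 kb/s rate with the same 20 ms period and 66-octet header overhead quoted for G.711, and jitter follows description (a), the standard deviation of the delay.

```python
import statistics


def packet_loss_ratio(sent_ids, received_ids):
    """PL = 1 - |P_R| / |P_S|, counting only received packets that were sent."""
    sent = set(sent_ids)
    if not sent:
        raise ValueError("no packets were sent")
    received = sent & set(received_ids)
    return 1.0 - len(received) / len(sent)


def jitter_std_ms(delays_ms):
    """Jitter as the (population) standard deviation of the one-way delay."""
    return statistics.pstdev(delays_ms)


def codec_payload_bytes(codec_rate_kbps, packet_interval_ms=20.0):
    """Voice payload per packet: kbit/s * ms = bits, divided by 8 for octets."""
    return codec_rate_kbps * packet_interval_ms / 8.0


def one_way_bandwidth_kbps(payload_bytes, header_bytes=66, packet_interval_ms=20.0):
    """Bandwidth needed to send one packet (payload + headers) per interval.

    header_bytes=66 is the fixed IP/UDP/RTP/Ethernet overhead quoted in the text.
    Bits per millisecond is numerically equal to kbit/s.
    """
    return (payload_bytes + header_bytes) * 8.0 / packet_interval_ms


if __name__ == "__main__":
    g711 = one_way_bandwidth_kbps(codec_payload_bytes(64))   # 160-octet payload
    g729a = one_way_bandwidth_kbps(codec_payload_bytes(8))   # assumed 20-octet payload
    print(f"G.711: {g711:.1f} kbps, G.729A: {g729a:.1f} kbps")
    # Seven packets sent, four received.
    print(f"loss ratio: {packet_loss_ratio(range(1, 8), [1, 3, 4, 7]):.3f}")
```

With these assumptions the sketch gives the 90.4 kbps and 34.4 kbps figures stated above, which suggests that both values were derived with the same header overhead and packet interval.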
ITU-T P.862.1 defines the mapping function from the P.862 PESQ score to the MOS as

$$y = 0.999 + \frac{4}{1 + e^{-1.4945x + 4.6607}} \tag{5}$$

where x is the PESQ score and y is the corresponding MOS score. ITU-T Rec. G.107 describes MOS in terms of the R factor,

$$MOS = \begin{cases} 1, & R < 0 \\ 1 + 0.035R + R(R - 60)(100 - R) \cdot 7 \cdot 10^{-6}, & 0 < R < 100 \\ 4.5, & R > 100 \end{cases} \tag{6}$$

Figs. 3 and 4 show the transformational relation curves between the PESQ score and MOS and between the E-Model R factor and MOS, respectively.

3.4. Accuracy analysis of experimental platform

As analyzed in Section 3.1, the accuracy of the experimental platform is determined by the impairment degree of the digital-to-analog (D/A) transformation in the process of voice signal encoding/decoding and by the performance of NIST Net. To evaluate the D/A impairment degree, we perform multiple experiments to gain statistically significant data without network impairment. Consider G.711 coding as an example; its experimental results are illustrated in Fig. 5. As shown in Fig. 5, all MOS scores of G.711 are near 4.4, which coincides with the ITU's standard. This platform guarantees a controllable range of the D/A impairment. Although NIST Net is a well-known instrument for the simulation of network conditions, we inspect its source code and conduct several experiments to ensure the correct simulation under given network conditions. In the experiments, the measured values of packet loss and delay are calculated by averaging ten individual measurement runs. By comparing the preset value (theoretical value) of NIST Net and the value measured by the Wireshark tool [55], we find that the error of packet loss usually does not exceed 0.15% and the error of delay remains below 1 ms (as shown in Figs. 6 and 7). Extra experimental analysis of the accuracy and stability of NIST Net is provided in [26,42]. NIST Net is a stable and reliable platform for simulating network impairments.
Furthermore, we did a series of experiments to analyze the impact of terminal equipment on voice quality. As mentioned in Section 3.1, we use the PCs (typically equipped with a built-in microphone and loudspeaker) as the sender and the receiver of VoIP
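As an illustration of the two score mappings in Eqs. (5) and (6), the following minimal sketch evaluates them directly. The function names are hypothetical. Because Eq. (6) leaves R = 0 and R = 100 unspecified, the sketch assigns them the limiting values 1 and 4.5, which the middle branch also yields; this boundary handling is an assumption of the sketch.

```python
import math


def pesq_to_mos(pesq_score):
    """Eq. (5): map a raw P.862 PESQ score x to a MOS value y."""
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * pesq_score + 4.6607))


def r_to_mos(r):
    """Eq. (6): map the E-Model rating factor R to MOS.

    The boundaries R = 0 and R = 100 are folded into the outer branches
    (an assumption; the cubic is continuous there, giving 1 and 4.5).
    """
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    for x in (1.0, 2.5, 4.0):
        print(f"PESQ {x:.1f} -> MOS {pesq_to_mos(x):.3f}")
    for r in (50.0, 80.0, 93.2):
        print(f"R {r:.1f} -> MOS {r_to_mos(r):.3f}")
```

Evaluating Eq. (6) at R = 93.2, the default value used later for R0, gives a MOS of about 4.41, which is in line with the MOS scores near 4.4 reported for G.711 without network impairment.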
the maximal information coefficient (MIC) [11] and distance correlation algorithm [12] to identify and classify relationships among variables. Compared with other state-of-the-art association test methods (e.g., the Fisher z-test [60] and specialized tests [61]), the MIC and distance correlation algorithm are more suitable for identifying both the linear relationship and the nonlinear relationship among variables [11,12].
The MIC makes no distributional assumptions about the measured data. The experiments of [11] showed that the MIC has generality and equitability, which has no bias towards any
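To make the notion of a nonlinear association test concrete, the sample distance correlation of [12] can be sketched in a few lines. This is a minimal illustration using the commonly stated sample definition (double-centred pairwise distance matrices) for one-dimensional variables; it is not the implementation used in this work, the function names are hypothetical, and the MIC is not covered.

```python
import math


def _centered_distances(values):
    """Pairwise |a - b| distances, double-centred by row, column and grand mean."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]  # equal to column means (symmetric)
    grand_mean = sum(row_means) / n
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _dcov2(a, b):
    """Squared sample distance covariance from two centred distance matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; returns 0 if either variance is 0."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    a, b = _centered_distances(x), _centered_distances(y)
    var_product = _dcov2(a, a) * _dcov2(b, b)
    if var_product <= 0.0:
        return 0.0
    r_squared = _dcov2(a, b) / math.sqrt(var_product)
    return math.sqrt(max(r_squared, 0.0))


if __name__ == "__main__":
    xs = list(range(-5, 6))
    print(distance_correlation(xs, [2 * v + 1 for v in xs]))  # linear relation
    print(distance_correlation(xs, [v * v for v in xs]))      # nonlinear relation
```

On the symmetric parabola in the usage example, the Pearson coefficient is zero while the distance correlation is clearly positive, which is the property that makes such a statistic attractive for QoS parameters whose effect on QoE is not linear.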
Fig. 13. Bandwidth vs MOS.
Fig. 16. Degradation in MOS due to circuit noise.
The correlation coefficient of distance is defined as

$$R^2(X, Y) = \begin{cases} \dfrac{v^2(X, Y)}{\sqrt{v^2(X)\, v^2(Y)}}, & v^2(X)\, v^2(Y) > 0 \\ 0, & v^2(X)\, v^2(Y) = 0 \end{cases} \tag{9}$$

where $v^2(X, Y)$ is the distance covariance, with $v^2(X) = v^2(X, X)$ and $v^2(Y) = v^2(Y, Y)$. Details of the distance correlation algorithm are provided in [12]. Figs. 18 and 19 show the association test results for each network impairment and its corresponding MOS value with respect to the MIC and distance correlation, respectively. The packet loss, delay, jitter, and bandwidth have an almost equally important association with QoE. As shown in Fig. 19, the smaller the jitter buffer, the greater the association of jitter with QoE. Only considering the effect of packet loss is not adequate when evaluating the voice quality.

4.4. Measurement of listening voice quality and conversational voice quality

Classically, the listening quality measurements emphasize the impact of packet loss, while conversation measurements are concerned with the coaction of packet loss and delay [29]. However, the analysis in Section 4.3 shows that the packet loss, jitter and delay are almost equally important. Thus, disregarding jitter is inappropriate. Measurements of a single impairment (packet loss or jitter) distinctly mismatch the real network scenario since packet loss or jitter does not exclusively occur within a given time interval. Some research parametrizes jitter into the E-Model independently [62]. This approach is not well supported by any theory or experiment. Generally, jitter may prompt packet loss, and packet loss in transmission may disturb the jitter distribution. These interactions will yield a complex nonlinear relation between QoE and the parameters (the experiments in Section 4.4.1 provide evidence). We claim that packet loss and jitter co-act to change the spatial distribution of voice packets, which impairs the final listening quality. For the listening quality, packet loss and jitter have to be considered simultaneously, and for the conversational quality they are likewise indispensable.

4.4.2.1. Conversational voice quality versus packet loss, jitter, and delay. We directly derive the $MOS_C$ value from the packet loss, jitter, and delay based on the concept of Fig. 23. The procedure is as follows.

Step 1. Attain the modified $I_e$ from the impairment of packet loss and jitter
The first step is to calculate the corresponding PESQ score for each combination of packet loss and jitter, then convert the PESQ scores into MOS values.

1. Compute the PESQ score against packet loss rate and jitter given the codec;
2. Convert the PESQ scores to MOS values using Eq. (5);
3. Map MOS to R. If 6.5 ≤ R ≤ 100, R can be calculated from the MOS by inverting Eq. (6), according to the following formula [19]:

$$R = \frac{20}{3}\left(8 - \sqrt{226}\,\cos\left(h + \frac{\pi}{3}\right)\right) \tag{10}$$

with

$$h = \frac{1}{3}\,\operatorname{atan2}\left(18566 - 6750\,MOS,\; 15\sqrt{-903522 + 1113960\,MOS - 202500\,MOS^2}\right)$$

and

$$\operatorname{atan2}(x, y) = \begin{cases} \arctan\left(\dfrac{y}{x}\right), & x \ge 0 \\ \pi - \arctan\left(\dfrac{y}{-x}\right), & x < 0 \end{cases}$$

4. Calculate the modified $I_e$ using R. By only considering the impairment from the packet loss, jitter, and codec, we can simply write $I_e$ in terms of R

$$I_e = R_0 - R \tag{11}$$

with the default value $R_0 = 93.2$.

The resulting curves for $I_e$ vs the packet loss and jitter in three different buffer conditions are shown in Figs. 24 and 25.
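Step 1 above can be traced numerically with the following minimal sketch, which chains Eq. (5), Eq. (10) and Eq. (11). It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, MOS values above 4.5 are clamped to 4.5 before inversion (Eq. (10) is only stated for 6.5 ≤ R ≤ 100), and `r_to_mos` is included only to check the inversion against Eq. (6). Note that Python's `math.atan2` takes its arguments in the order (y, x), the reverse of the convention used in Eq. (10).

```python
import math

R0_DEFAULT = 93.2  # default R0 quoted with Eq. (11)


def pesq_to_mos(pesq_score):
    """Eq. (5): PESQ score to MOS."""
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * pesq_score + 4.6607))


def r_to_mos(r):
    """Eq. (6), middle branch only; used here to verify the inversion."""
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def mos_to_r(mos):
    """Eq. (10): invert Eq. (6) for 6.5 <= R <= 100 (MOS roughly 1.0 to 4.5)."""
    if not 1.0 <= mos <= 4.5:
        raise ValueError("Eq. (10) is used here only for 1.0 <= MOS <= 4.5")
    x = 18566.0 - 6750.0 * mos
    y = 15.0 * math.sqrt(-903522.0 + 1113960.0 * mos - 202500.0 * mos * mos)
    h = math.atan2(y, x) / 3.0  # math.atan2(y, x) equals the paper's atan2(x, y)
    return (20.0 / 3.0) * (8.0 - math.sqrt(226.0) * math.cos(h + math.pi / 3.0))


def modified_ie_from_mos(mos, r0=R0_DEFAULT):
    """Eq. (11): Ie = R0 - R."""
    return r0 - mos_to_r(mos)


def modified_ie_from_pesq(pesq_score, r0=R0_DEFAULT):
    """Steps 2 to 4: PESQ -> MOS -> R -> Ie, clamping MOS into [1.0, 4.5]."""
    mos = min(max(pesq_to_mos(pesq_score), 1.0), 4.5)  # clamp is an assumption
    return modified_ie_from_mos(mos, r0)


if __name__ == "__main__":
    for pesq in (2.0, 3.0, 4.0):
        print(f"PESQ {pesq:.1f} -> Ie {modified_ie_from_pesq(pesq):.2f}")
```

Since $I_e$ is defined as the distance from $R_0$, a measured MOS equal to the value that Eq. (6) gives for R = 93.2 yields $I_e = 0$, and lower PESQ scores yield a larger $I_e$.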
Fig. 21. MOSL versus packet loss and jitter for G.711.
Fig. 22. MOSL versus packet loss and jitter for G.729A.
Fig. 23. Measurement of conversational voice quality using a combined PESQ and E-Model.
Fig. 27. MOSc versus packet loss, delay, jitter for G.711.
Table 5
Codec comparison from the point of view of quality thresholds.
Column groups, left to right: Listening voice quality | Conversational voice quality (when delay = 150 ms) | Listening voice quality | Conversational voice quality (when delay = 150 ms)
G.711    buffer = 10 ms   < 2   < 2   |  < 1.5   < 1.5  |  < 11   < 8   |  < 7   < 7
G.711    buffer = 20 ms   < 2   < 8   |  < 1.5   < 6    |  < 11   < 11  |  < 7   < 11
G.711    buffer = 40 ms   < 2   < 12  |  < 1.5   < 10   |  < 11   < 18  |  < 7   < 17
G.729A   buffer = 10 ms   < 4   < 4   |  < 3   < 3
G.729A   buffer = 20 ms   < 4   < 10  |  < 3   < 8
G.729A   buffer = 40 ms   < 4   < 15  |  < 3   < 14
tain a required quality level. For the conversational voice quality, the range of parameters is given for a fixed condition (following the conclusions of [10,64], we set the delay value to 150 ms). For example, when the jitter buffer is 40 ms, G.711 provides a satisfactory listening voice quality when the packet loss rate is less than 2% or the average jitter does not exceed 12 ms; the listening voice quality will attain a fair quality if the packet loss rate ranges between 2% and 11% or the jitter ranges between 12 ms and 18 ms. When the jitter buffer is 40 ms and the delay is 150 ms, the conversational voice quality of G.711 can provide excellent quality because the packet loss rate is less than 11% or the average jitter does not exceed 18 ms, etc. Outside these bounds, the quality will be poor. Note that G.729A cannot achieve “excellent” quality even in a perfect network environment. Therefore, the excellent-to-good quality threshold is not present.
Based on the experiment data of Section 4.2, we infer the noise and echo thresholds that correspond to the specific QoE levels. Table 6 lists several thresholds that the noise and echo parameters must be kept above or below in order to attain the “Good” or “Fair” quality level.
Fig. 28. MOSc versus packet loss, delay, jitter for G.729A.
Fig. 29. Degradation in RO due to electric circuit noise and room noise.
5. Machine learning-based models for VoIP QoE prediction

If only one or two QoS metrics are used to predict the voice quality, the nonlinear regression method is a suitable choice, as demonstrated in the literature [26,29]. However, when the predictors are multi-parameter and include several discrete variables, nonlinear regression models are not applicable. In this paper, we use machine learning algorithms as a means of building mapping models, because machine learning methods reduce the model complexity, enhance the accuracy, and can be easily expanded. In the
Fig. 30. Id vs. delay impairment, talker and listener echo impairment.
Fig. 31. Combined MOS (including both loss, jitter, delay, noise and echo impairment).
Table 6
Codec comparison from the point of view of quality thresholds (noise and echo impairment).
Nc (dBm0p) Ps (dB(A)) Pr (dB(A)) TELR(dB) WEPL(dB) Nc (dBm0p) Ps (dB(A)) Prm (dB(A)) TELR(dB) WEPL(dB)
delay(ms) delay(ms) delay(ms) delay(ms)
G.711 <−53 <55 <60 >44 >54 >22 >33 <−38 <67 <70 >33 >44 >55 >13 >20 >36
G.729A <−50 <58 <63 >41 >51 >20 >29
Fig. 32. System structure for voice quality prediction based on machine learning regression model.
practical point of view, machine learning has a great advantage in handling large-scale information, so that a model built by machine learning can be transferred from simulation research to an actual VoIP system without difficulty. In addition, in recent years, the theory of machine learning has been continuously developed, and new methods are emerging. With the emergence of new theories and new methods of machine learning, the modelling of QoE and QoS has plenty of room for development in accuracy and efficiency.
We employ several classical algorithms (K-nearest neighbors (KNN), regression tree, ANN, bagging, and SVM) in our work. Although other machine learning algorithms are available, the human ear's recognition and MOS values are coarse-grained values and do not require high precision. If the time efficiency and evaluation accuracy are sufficient, we believe that the classical machine learning algorithms are sufficient to satisfy the requirements of voice quality prediction. Fig. 32.a presents an overview of our system, in which the predictor takes as input the network and codec metrics, such as delay, jitter, packet loss, jitter buffer, and coding type, and then outputs the MOS value. The system structure for predicting the MOS value from QoS impairment, noise and echo impairment using the machine learning algorithms is shown in Fig. 32.b. Fig. 32.b is a revision of Scheme I of Fig. 32.a. The predicted $MOS_C$ is obtained by the listed machine learning algorithms, and the performance of our system is measured by the absolute error between the predicted MOS score and the measured MOS score.

5.1. Parameter setting

Via experiments, we obtained 19 data sets. We randomly divide each data set into a training set and a test set at a ratio of 7:3. For each division, we run every method 10 times to study the average performance. We evaluate the methods by the MAE (Mean Absolute Error) and RMSE (Root Mean Square Error). Tables 7 and 8 list the mean, median, and standard deviation in terms of these
Table 7
Performance evaluation of prediction of MOSL.

Dataset                                           KNN             Regression tree ANN             Bagging         SVM
                                                  MAE     RMSE    MAE     RMSE    MAE     RMSE    MAE     RMSE    MAE     RMSE
MOSL of G.711 (jitter buffer 10 ms)               0.0603  0.0171  0.1447  0.0408  0.0969  0.0241  0.1239  0.0362  0.0628  0.0147
MOSL of G.711 (jitter buffer 20 ms)               0.0802  0.0223  0.1952  0.0503  0.1067  0.0259  0.1653  0.0450  0.0651  0.0156
MOSL of G.711 (jitter buffer 40 ms)               0.0968  0.0249  0.1929  0.0458  0.1076  0.0251  0.1567  0.0392  0.0706  0.0163
MOSL of G.711 (jitter buffer 10+20+40 ms)         0.0831  0.0140  0.1565  0.0241  0.0864  0.0124  0.1830  0.0267  0.0625  0.0083
MOSL of G.729A (jitter buffer 10 ms)              0.0523  0.0165  0.1104  0.0324  0.0935  0.0229  0.1245  0.0380  0.0659  0.0144
MOSL of G.729A (jitter buffer 20 ms)              0.0858  0.0234  0.1733  0.0433  0.1124  0.0274  0.1565  0.0435  0.0605  0.0134
MOSL of G.729A (jitter buffer 40 ms)              0.1035  0.0285  0.2032  0.0512  0.1129  0.0288  0.1694  0.0435  0.0756  0.0183
MOSL of G.729A (jitter buffer 10+20+40 ms)        0.0784  0.0137  0.1596  0.0264  0.0856  0.0125  0.1895  0.0283  0.0617  0.0083
MOSL of G.711+G.729A (jitter buffer 10+20+40 ms)  0.2240  0.0195  0.1677  0.0193  0.0836  0.0086  0.1338  0.0150  0.0638  0.0060
three values. In the KNN method, we heuristically determine the optimal number K of nearest neighbors by using cross-validation. The results show that the best distance metric for the variables is the Cityblock distance, with K = 3. With these settings, the KNN method achieves its best performance in terms of the MAE and the median absolute error. In the process of bagging, T samples are randomly drawn from the training data with replacement, a base learner is constructed on each of the T samples, and the base learners are combined by a mean strategy. The base learners are implemented by a regression function without pruning in MATLAB, and T is set to 100. The LIBSVM library version 3.14 was selected for the SVM, whose hyperparameters were chosen as follows: the default radial basis function (RBF) kernel was used; the trade-off parameter C was set to $2^{8}$; and the parameter gamma (g) was selected from $\{2^{0}, 2^{0.5}, 2^{1}, 2^{1.5}, 2^{2}, 2^{2.5}, 2^{3}\}$. We train a neural network model with a single hidden layer. The sizes of the hidden layer and the output layer are 10 and 3, respectively. The activation functions of the hidden layer and the output layer are the tangent function and the linear function, respectively. The performance function is the MSE. We use the elastic BP (Back-Propagation) algorithm to train the model. The regression tree is a decision tree with binary splits for regression, and the split criterion is the MSE.
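As a rough illustration of the evaluation protocol and the KNN settings described above, the following Python sketch repeats a random 7:3 split 10 times and reports the averaged MAE and RMSE of a K-nearest-neighbor regressor with Cityblock distance and K = 3. The toy data generator, its feature names and coefficients, and all function names are hypothetical stand-ins (the study itself used MATLAB and measured datasets); reading "10 times" as ten random divisions is also an assumption, and MAE and RMSE follow their standard textbook definitions here.

```python
import math
import random


def cityblock(a, b):
    """Cityblock (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points nearest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: cityblock(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def evaluate(xs, ys, runs=10, train_ratio=0.7, k=3, seed=0):
    """Average MAE and RMSE over `runs` random train/test divisions."""
    rng = random.Random(seed)
    maes, rmses = [], []
    for _ in range(runs):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        cut = int(train_ratio * len(idx))
        train, test = idx[:cut], idx[cut:]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        truth = [ys[i] for i in test]
        preds = [knn_predict(train_x, train_y, xs[i], k) for i in test]
        maes.append(mae(truth, preds))
        rmses.append(rmse(truth, preds))
    return sum(maes) / runs, sum(rmses) / runs


def make_toy_data(n=200, seed=1):
    """Hypothetical data: normalised loss, jitter, delay mapped to a MOS-like score."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        loss, jitter, delay = rng.random(), rng.random(), rng.random()
        mos = 4.4 - 1.5 * loss - 1.0 * jitter - 0.5 * delay + rng.gauss(0.0, 0.05)
        xs.append((loss, jitter, delay))
        ys.append(mos)
    return xs, ys


if __name__ == "__main__":
    features, scores = make_toy_data()
    mean_mae, mean_rmse = evaluate(features, scores)
    print(f"mean MAE = {mean_mae:.4f}, mean RMSE = {mean_rmse:.4f}")
```

Swapping `knn_predict` for another regressor while keeping `evaluate` unchanged would give a like-for-like comparison across methods under the same splits.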
5.2. Evaluation

We evaluate the performance of these algorithms under different parameter conditions. The results show that the machine learning regression algorithms can predict the VoIP QoE with fair accuracy: most of the MAE values are less than 0.1, and the median absolute error is less than 0.23 in the 19 datasets (Fig. 33.a–i; Fig. 34.a–k). In all datasets of the listening test (Fig. 33.a–i), the SVM performs best with respect to the MAE and the median absolute error (0.06539 and 0.06593) and the RMSE (0.01281), followed by the KNN, ANN,
Table 8
Performance evaluation of prediction of MOSC.

Dataset                                           KNN             Regression tree ANN             Bagging         SVM
                                                  MAE     RMSE    MAE     RMSE    MAE     RMSE    MAE     RMSE    MAE     RMSE
MOSC of G.711 (jitter buffer 10 ms)               0.0666  0.0038  0.0499  0.0022  0.0737  0.0029  0.0860  0.0037  0.0417  0.0014
MOSC of G.711 (jitter buffer 20 ms)               0.0802  0.0040  0.0594  0.0024  0.0748  0.0029  0.0946  0.0038  0.0440  0.0015
MOSC of G.711 (jitter buffer 40 ms)               0.1025  0.0050  0.0649  0.0026  0.0747  0.0028  0.1019  0.0039  0.0463  0.0016
MOSC of G.711 (jitter buffer 10+20+40 ms)         0.0820  0.0024  0.0568  0.0013  0.0747  0.0017  0.0382  0.0010  0.0440  0.0009
MOSC of G.729A (jitter buffer 10 ms)              0.0494  0.0032  0.0346  0.0017  0.0737  0.0029  0.0643  0.0031  0.0446  0.0015
MOSC of G.729A (jitter buffer 20 ms)              0.0672  0.0034  0.0416  0.0018  0.0746  0.0029  0.0757  0.0032  0.0420  0.0014
MOSC of G.729A (jitter buffer 40 ms)              0.0866  0.0044  0.0490  0.0020  0.0770  0.0029  0.0867  0.0034  0.0475  0.0016
MOSC of G.729A (jitter buffer 10+20+40 ms)        0.0678  0.0021  0.0413  0.0011  0.0747  0.0017  0.0317  0.0008  0.0446  0.0009
MOSC of G.711+G.729A (jitter buffer 10+20+40 ms)  0.1836  0.0027  0.0498  0.0009  0.0765  0.0012  0.0588  0.0010  0.0439  0.0006
MOSC of G.711+G.729A (packet loss, delay, jitter, 0.2225  0.0011  0.03534 0.00017 0.09502 0.00047 0.03466 0.00017 -       -
  jitter buffer, noise, echo)
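The ordering and the SVM mean quoted in Section 5.2 for the conversation test can be checked with a few lines of arithmetic on the MAE columns of Table 8. The sketch below assumes that the five MAE/RMSE column pairs are ordered KNN, regression tree, ANN, bagging, SVM (the same order in which the text lists the algorithms) and uses only the nine datasets for which all five algorithms produced a result; the variable names are illustrative.

```python
from statistics import mean

# MAE values copied from Table 8 (first nine rows).
# Assumption: column pairs are ordered KNN, regression tree, ANN, bagging, SVM.
mae_table8 = {
    "KNN": [0.0666, 0.0802, 0.1025, 0.0820, 0.0494, 0.0672, 0.0866, 0.0678, 0.1836],
    "Regression tree": [0.0499, 0.0594, 0.0649, 0.0568, 0.0346, 0.0416, 0.0490, 0.0413, 0.0498],
    "ANN": [0.0737, 0.0748, 0.0747, 0.0747, 0.0737, 0.0746, 0.0770, 0.0747, 0.0765],
    "Bagging": [0.0860, 0.0946, 0.1019, 0.0382, 0.0643, 0.0757, 0.0867, 0.0317, 0.0588],
    "SVM": [0.0417, 0.0440, 0.0463, 0.0440, 0.0446, 0.0420, 0.0475, 0.0446, 0.0439],
}

# Average each algorithm's MAE across datasets, then rank from best to worst.
mean_mae = {name: mean(values) for name, values in mae_table8.items()}
ranking = sorted(mean_mae, key=mean_mae.get)

for name in ranking:
    print(f"{name:16s} {mean_mae[name]:.5f}")
# The SVM row prints 0.04429, and the order is
# SVM, Regression tree, Bagging, ANN, KNN.
```

Under that column assumption, the averages reproduce both the 0.04429 figure and the ranking stated in the text.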
bagging and regression tree (Table 7). In ascending order of the standard deviation of the absolute error, the algorithms rank as SVM, ANN, KNN, regression tree and bagging. In most datasets of the conversation test (Fig. 34.a–i), the SVM also outperforms the other algorithms with respect to the MAE and the median absolute error (0.04429 and 0.0443, respectively), followed by the regression tree, bagging, ANN and KNN (as illustrated in Table 8).

In the dataset of the conversation test that combines QoS impairments, noise and echo impairments (Fig. 34.k), we find that the traditional training algorithms for SVMs, such as chunking and SMO, cannot run properly because of the large training set. In this case, bagging outperforms the other algorithms (as illustrated in the last row of Table 8).

When the training set is larger, our algorithms predict more accurately. For all the datasets, the prediction quality of the SVM, ANN, regression tree and bagging for conversation quality is higher than that for listening quality. We also determined that if the training dataset includes discrete input parameters, the accuracy of the KNN decreases rapidly (as shown in Fig. 33.i and Fig. 34.i).

Although the training phases of machine learning regression algorithms have relatively high computational complexity, the computational complexity of the prediction phase is negligible. For example, on a ThinkPad T460 laptop computer in the MATLAB 2015b programming environment, the time cost of predicting a single sample with KNN, regression tree, ANN, bagging, and SVM is 2.07 ms, 16.66 ms, 62.80 ms, 69.81 ms, and 1.01 ms, respectively. Thus, they can work online.

6. Conclusions and prospects

We analyzed the impacts of QoS parameters on QoE by experiments and an association test method. Both the theoretical analysis and the experiments show a strong nonlinear relationship between the jitter and QoE. Then, in light of the spatial and temporal characteristics of the network impairments, a new methodology was developed to measure the voice quality without the need for time-consuming subjective tests. Furthermore, we applied several classical regression algorithms to predict the voice quality and compared the performances of these algorithms. The work enhances the evaluation of the VoIP QoE by using network performance parameters, explains how IP network environments can impact the quality and reliability of VoIP traffic, and calculates typical network performance requirements for a VoIP application to run at the desired QoE level. The effects of noise impairment and echo impairment on voice quality are also analyzed. This research can also be applied to a VoIP system to ensure that playout buffer control and adaptive codec type selection achieve the best possible end-to-end perceived voice quality. In the future, we will explore the impacts of other parameters (e.g., burst packet loss and packetization interval) and build an automatic VoIP quality monitoring system with network measurement algorithms.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant nos. 61872226, 61702315), the Natural Science Foundation of Shanxi Province, China (Grant nos. 201701D121052, 201901D211169), the Key R&D Program (International Science and Technology Cooperation Project) of Shanxi Province, China (Grant no. 201903D421003) and the 1331 Engineering Project of Shanxi Province, China.

References

[1] J. Baraković Husić, S. Baraković, S. Muminović, Is there any impact of human influence factors on quality of experience? in: Proceedings of 40th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2017, pp. 434–439, doi:10.23919/MIPRO.2017.7973464.
[2] D. Tsolkas, E. Liotou, N. Passas, L. Merakos, A survey on parametric QoE estimation for popular services, J. Netw. Comput. Appl. 77 (2017) 1–17, doi:10.1016/j.jnca.2016.10.016.
[3] ITU-T Recommendation P.10/G.100, Vocabulary and effects of transmission parameters on customer opinion of transmission quality, amendment 1, June 2019. Available: https://www.itu.int/rec/T-REC-P.10-201906-I!Amd1.
[4] ITU Recommendation P.800, Methods for subjective determination of transmission quality, August 1996. Available: https://www.itu.int/rec/T-REC-P.800/en.
[5] ITU Recommendation Y.1541, Network performance objectives for IP-based services, December 2011. Available: https://www.itu.int/rec/T-REC-Y.1541-201112-I/en.
[6] Y.J. Chen, K.S. Wu, Q. Zhang, From QoS to QoE: a tutorial on video quality assessment, IEEE Commun. Surv. Tutor. 17 (2015) 1126–1165, doi:10.1109/COMST.2014.2363139.
[7] M. Alreshoodi, J. Woods, Survey on QoE\QoS correlation models for multimedia services, Int. J. Distrib. Parallel Syst. 4 (2013) 53–72, doi:10.5121/ijdps.2013.430553.
[8] D. Ghadiyaram, J. Pan, A.C. Bovik, A subjective and objective study of stalling events in mobile streaming videos, IEEE Trans. Circuits Syst. Video Technol. 29 (2019) 183–197, doi:10.1109/TCSVT.2017.2768542.
[9] N. Rao, A. Maleki, F. Chen, W. Chen, C. Zhang, N. Kaur, A. Haque, Analysis of the effect of QoS on video conferencing QoE, IWCMC (2019) 1267–1272, doi:10.1109/IWCMC.2019.8766591.
[10] P. Charonyktakis, M. Plakia, I. Tsamardinos, M. Papadopouli, On user-centric modular QoE prediction for VoIP based on machine-learning algorithms, IEEE Trans. Mob. Comput. 15 (2016) 1443–1456, doi:10.1109/TMC.2015.2461216.
[11] D.N. Reshef, Y.A. Reshef, H.K. Finucane, S.R. Grossman, Detecting novel associations in large data sets, Science 334 (2011) 1518–1524, doi:10.1126/science.1205438.
[12] G.J. Szekely, M.L. Rizzo, N.K. Bakirov, Measuring and testing dependence by correlation of distances, Ann. Stat. 35 (2007) 2769–2794, doi:10.1214/009053607000000505.
[13] ITU-T Recommendation P.800.1, Mean opinion score (MOS) terminology, 2016.
[14] F.D. Rango, M. Tropea, P. Fazio, S. Marano, Overview on VoIP: subjective and objective measurement methods, Int. J. Comput. Sci. Netw. Secur. 6 (2006) 140–153.
[15] ITU-T Recommendation P.861, Objective quality measurement of telephone-band (300-3400 Hz) speech codecs, February 1998. Available: https://www.itu.int/rec/T-REC-P.861-199802-W/en.
[16] S. Voran, Objective estimation of perceived speech quality. I. Development of the measuring normalizing block technique, IEEE Trans. Speech Audio Process. 7 (1999) 383–390, doi:10.1109/89.771259.
[17] A.W. Rix, M.P. Hollier, The perceptual analysis measurement system for robust end-to-end speech quality assessment, in: Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, Vol. 3, 2000, pp. 1515–1518, doi:10.1109/ICASSP.2000.861935.
[18] ITU-T Recommendation P.862, Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs, February 2001. Available: https://www.itu.int/rec/T-REC-P.862-200102-I/en.
[19] ITU-T Recommendation G.107, The E-Model, a computational model for use in transmission planning, June 2015. Available: https://www.itu.int/rec/T-REC-G.107-201506-I/en.
[20] P. Wuttidittachotti, T. Daengsi, Subjective MOS model and simplified E-Model enhancement for Skype associated with packet loss effects: a case using conversation-like tests with Thai users, Multimed. Tools Appl. 76 (2017) 16163–16187, doi:10.1007/s11042-016-3901-5.
[21] P. Wuttidittachotti, P. Khaoduang, T. Daengsi, MOS estimation model development using ACR listening-opinion tests with Thai users referring to loss effects: a case of G.726 and G.729, Multimed. Syst. 24 (2018) 285–295, doi:10.1007/s00530-017-0549-6.
[22] S. Jelassi, G. Rubino, A perception-oriented Markov model of loss incidents observed over VoIP networks, Comput. Commun. 128 (2018) 80–90, doi:10.1016/j.comcom.2018.06.009.
[23] W. Jiang, H. Schulzrinne, Perceived quality of packet audio under bursty losses, in: Proceedings of IEEE INFOCOM, 2002.
[24] H. Zlatokrilov, H. Levy, The effect of packet dispersion on voice applications in IP networks, IEEE/ACM Trans. Netw. 14 (2006) 277–288, doi:10.1109/tnet.2006.872543.
[25] P. Reichl, B. Tuffin, R. Schatz, Logarithmic laws in service quality perception: where microeconomics meets psychophysics and quality of experience, Telecommun. Syst. 52 (2013) 587–600, doi:10.1007/s11235-011-9503-7.
[26] T. Hossfeld, D. Hock, P. Tran-Gia, K. Tutschku, M. Fiedler, Testing the IQX hypothesis for exponential interdependency between QoS and QoE of voice codecs iLBC and G.711, in: Proceedings of the ITC Specialist Seminar on Quality of Experience, 2008.
[27] L. Roychoudhuri, E. Al-Shaer, G.B. Brewster, On the impact of loss and delay variation on internet packet audio transmission, Comput. Commun. 29 (2006) 1578–1589, doi:10.1016/j.comcom.2006.04.004.
[28] A.P. Markopoulou, F.A. Tobagi, M.J. Karam, Assessing the quality of voice communications over internet backbones, IEEE/ACM Trans. Netw. 11 (2003) 747–760, doi:10.1109/TNET.2003.818179.
[29] L. Sun, E. Ifeachor, Voice quality prediction models and their applications in VoIP networks, IEEE Trans. Multimed. 8 (2006) 809–820, doi:10.1109/TMM.2006.876279.
[30] D. Aklilu, L. Vicky, F. Ernest, C. Bill, QoE estimation model for a secure real-time voice communication system in the cloud, ACSW (2019) 1–10, doi:10.1145/3290688.3290705.
[31] T. Daengsi, P. Wuttidittachotti, QoE modeling for voice over IP: simplified E-Model enhancement utilizing the subjective MOS prediction model: a case of G.729 and Thai users, J. Netw. Syst. Manag. 29 (2019) 837–859, doi:10.1007/s10922-018-09487-4.
[32] M. Al-Akhras, H. Zedan, R. John, I. Almomani, Non-intrusive speech quality prediction in VoIP networks using a neural network approach, Neurocomputing 72 (2009) 2595–260, doi:10.1016/j.neucom.2008.10.019.
[33] A. Raja, R. Muhammad Atif Azad, C. Flanagan, D. Picovici, C. Ryan, Non-intrusive quality evaluation of VoIP using genetic programming, Inf. Comput. Syst. 275 (2006), doi:10.1109/BIMNICS.2006.361795.
[34] E.T. Affonso, R.D. Nunes, R.L. Rosa, G.F. Pivaro, D.Z. Rodriguez, Speech quality assessment in wireless VoIP communication using deep belief network, IEEE Access 6 (2018) 77022–77032, doi:10.1109/ACCESS.2018.2871072.
[35] H.S. Chang, C.F. Hsu, T. Hossfeld, K.T. Chen, Active learning for crowdsourced QoE modeling, IEEE Trans. Multimed. 20 (2018) 3337–3352, doi:10.1109/TMM.2018.2831639.
[36] T. Mansouri, A. Nabavi, A.Z. Ravasan, H. Ahangarbahan, A practical model for ensemble estimation of QoS and QoE in VoIP services via fuzzy inference systems and fuzzy evidence theory, Telecommun. Syst. 16 (2016) 861–873, doi:10.1007/s11235-015-0041-6.
[37] A. Ben Letaifa, WBQoEMS: web browsing QoE monitoring system based on prediction algorithms, Int. J. Commun. Syst. 32 (2019) 1–16, doi:10.1002/dac.4007.
[38] https://www.VoIPmechanic.com/mos-mean-opinion-score.htm.
[39] K. Salah, On the deployment of VoIP in ethernet networks: methodology and case study, Comput. Commun. 29 (2006) 1039–1054, doi:10.1016/j.comcom.2005.06.004.
[40] A.A. Laghari, H. He, M. Shafiq, A. Khan, Application of quality of experience in networked services: review, trend & perspectives, Syst. Pract. Action Res. 32 (2019) 501–519, doi:10.1007/s11213-018-9471-x.
[41] A. Antoine, L. Emmanuel, K. Nicolas, Making trustable satellite experiments: an application to a VoIP scenario, in: Proceedings of the IEEE Vehicular Technology Conference, 2019, doi:10.1109/VTCSpring.2019.8746404.
[42] M. Carson, D. Santay, NIST NET - A Linux-based network emulation tool, ACM SIGCOMM Comput. Commun. Rev. 33 (2003) 111–126, doi:10.1145/956993.957007.
[43] OpenPhone. https://www.VoIP-info.org/openphone.
[44] B.P. Padhy, Adaptive latency compensator considering packet drop and packet disorder for wide area damping control design, Int. J. Electr. Power Energy Syst. 106 (2019) 477–487, doi:10.1016/j.ijepes.2018.10.015.
[45] Y. Cao, Bifurcations in an internet congestion control system with distributed delay, Appl. Math. Comput. 347 (2019) 54–63, doi:10.1016/j.amc.2018.10.093.
[46] K. Bidaj, J.B. Begueret, J. Deroo, Jitter definition, measurement, generation, analysis, and decomposition, Int. J. Circuit Theory Appl. 46 (2018) 2171–2188, doi:10.1002/cta.2559.
[47] X.Q. Li, K.L. Yeung, Bandwidth-efficient network monitoring algorithms based on segment routing, Comput. Netw. 147 (2018) 236–245, doi:10.1016/j.comnet.2018.10.010.
[48] ITU-T Recommendation G.114, One-way transmission time, May 2003. Available: https://www.itu.int/rec/T-REC-G.114-200305-I/en.
[49] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, RTP: a transport protocol for real-time applications, RFC 1889, IETF, January 1996. Available: https://tools.ietf.org/html/rfc1889.
[50] C. Demichelis, P. Chimento, IP packet delay variation metric for IP performance metrics, RFC 3393, IETF, November 2002. Available: https://tools.ietf.org/html/rfc3393.
[51] P. Imputato, S. Avallone, An analysis of the impact of network device buffers on packet schedulers through experiments and simulations, Simul. Modell. Pract. Theory 80 (2018) 1–18, doi:10.1016/j.simpat.2017.09.008.
[52] K. Hammad, A. Moubayed, A. Shami, S. Primak, Analytical approximation of packet delay jitter in simple queues, IEEE Wirel. Commun. Lett. 5 (2016) 564–567, doi:10.1109/LWC.2016.2601609.
[53] E. Kim, T. Kim, C. Lee, An adaptive buffering scheme for P2P live and time-shifted streaming, Appl. Sci. 7 (2017) 1–14, doi:10.3390/app7020204.
[54] Z.Z. Qiao, R.K. Venkatasubramanian, L.F. Sun, E.C. Ifeachor, A new buffer algorithm for speech quality improvement in VoIP systems, Wirel. Pers. Commun. 45 (2008) 189–207, doi:10.1007/s11277-007-9408-7.
[55] Wireshark, https://www.wireshark.org/
[56] H. Schulzrinne, RTP profile for audio and video conferences with minimal control, RFC 3551, IETF, July 2003. Available: https://tools.ietf.org/html/rfc3551.
[57] http://www.linkedin.com/answers/technology/information-technology/telecommunications/TCH_ITS_TCI/491659-28591248
[58] M. Voznak, A. Kovac, M. Halas, Effective packet loss estimation on VoIP jitter buffer, in: Proceedings of the 2012 International Conference on Networking, vol. 7291, 2012, pp. 157–162, doi:10.1007/978-3-642-30039-4_21.
[59] S. Tao, K. Xu, A. Estepa, Improving VoIP quality through path switching, in: Proceedings of IEEE INFOCOM, 2005, pp. 2268–2278.
[60] P. Spirtes, C.N. Glymour, R. Scheines, Causation, Prediction, and Search, 81 (2000), Cambridge, MA, USA: MIT Press.
[61] M. Ahdesmäki, H. Lähdesmäki, R. Pearson, H. Huttunen, O. Yli-Harja, Robust detection of periodic time series measured from biological systems, BMC Bioinformatics 6 (2005) 1–18, doi:10.1186/1471-2105-6-117.
[62] H.L. Zhang, Z.M. Gu, Z.Q. Tian, QoS evaluation based on extend E-Model in VoIP, in: Proceedings of 13th International Conference on Advanced Communication Technology, 2011.
[63] R.G. Cole, J.H. Rosenbluth, Voice over IP performance monitoring, ACM SIGCOMM Comput. Commun. Rev. 31 (2001) 9–24, doi:10.1145/505666.505669.
[64] ITU-T Recommendation G.113, Transmission impairments due to speech processing, November 2007. Available: https://www.itu.int/rec/T-REC-G.113-200711-I/en.

Zhiguo Hu received the Ph.D. degree in computer science from Tongji University, China in 2012. He is now working in the School of Computer and Information Technology, Shanxi University. His research interests include network measurement, data mining and machine learning.

Hongren Yan received the M.A. degree from Arizona State University in 2016. He is now a Ph.D. student and research assistant in the Institute of Big Data Science and Industry, Shanxi University. He specializes in information geometry, machine learning theory, and complex analysis mining.
Tao Yan received the Ph.D. degree from Chengdu Institute of Computer Applications, Chinese Academy of Science in 2017. He is now a lecturer at Shanxi University. His research interests include image processing and evolutionary computation.

Guoqing Liu received the B.E. degree from Shanxi University in 2016. She is now studying in the School of Computer and Information Technology, Shanxi University. Her research interest is reinforcement learning.