
2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX)

A QUALITY OF EXPERIENCE MODEL FOR ADAPTIVE MEDIA PLAYOUT

Benjamin Rainer and Christian Timmerer

Multimedia Communication (MMC) Research Group, Institute of Information Technology (ITEC)


Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria, {firstname.lastname}@itec.aau.at

ABSTRACT

In the past decade Adaptive Media Playout (AMP) has been intensively studied with respect to the detection of when to increase or decrease the playback rate in order to maintain a certain buffer fill state. In this paper we subjectively assess the QoE of AMP with respect to non-periodically and randomly selected content sections of a video sequence by using crowdsourcing. Furthermore, we introduce metrics that allow us to quantify the distortion for audio and video caused by increasing or decreasing the playback rate. With these preliminaries we study the correlation between the introduced metrics and the subjectively assessed QoE. From this we derive a utility model that allows estimating the QoE with the introduced metrics. We instantiate and validate the model using the data gathered from the conducted study.

Index Terms— Adaptive Media Playout, Inter-Destination Media Synchronization, Crowdsourcing, Subjective Quality Assessment, Quality of Experience

1. INTRODUCTION

Synchronizing the multimedia playback among geographically distributed users is one of the features enabled by Inter-Destination Multimedia Synchronization (IDMS) [1]. A system that claims to provide IDMS requires the four following mechanisms:

• Session management of clients requiring IDMS.

• Signaling of timing and control information among the participating clients within an IDMS session.

• Negotiation on a reference playback timestamp deals with the selection of a playback timestamp within an IDMS session to which clients have to synchronize their playback.

• Carrying out the actual synchronization to overcome the identified asynchronism by modifying the multimedia playback of each client.

In this paper we focus on carrying out the actual synchronization. A naïve strategy for carrying out the synchronization is to pause the multimedia playback or skip multimedia content. Recent subjective quality assessments have shown that the QoE decreases exponentially with an increase in stalling events of the multimedia playback [2]. Therefore, we propose to use Adaptive Media Playout (AMP), which changes the playback rate adaptively. As stated in [3] and [4], AMP was initially introduced for compensating the impact of error-prone communication channels on the smoothness of the multimedia playback and avoiding buffer under-/overruns. That is, the playback rate is increased/decreased in order to maintain a particular playback buffer fill state. In the literature, many AMP algorithms have been introduced which assume an error-prone communication channel, and most of them are based on the buffer fill state [5].

Other studies investigated the decrease of the frame rate for short time periods for low bit-rate videos [6]. In [7] the impact of reducing the frame rate for the entire duration of a video sequence was investigated. Most of the discussed studies investigated AMP for video only, a playback rate lower than the nominal playback rate, and for the full duration of the multimedia content.

The aim of this paper is to study and quantify the impact on the QoE of increasing and decreasing the playback rate for randomly selected content sections of different lengths (in time). Therefore, we present audio-visual metrics that allow us to quantify these playback rate variations (Section 2). We further verify our assumptions by correlating our metrics with the results of a conducted subjective quality assessment using crowdsourcing (Section 3). Finally, we analytically derive a utility model that describes the correlation between the QoE and the introduced metrics, and we provide insights on how the change in the playback rate affects the QoE (Section 4). The results and contributions are summarized and concluded in Section 5, including future work.

2. QUANTIFYING THE DISTORTION OF MULTIMEDIA CONTENT IN THE TEMPORAL DOMAIN

In the past decade many spatial quality metrics have been introduced, especially for the video domain [8]. It has been shown that the spatial quality metrics are able to represent the impact on the QoE to a certain extent. Recent quality metrics aim to cover the temporal domain. For example, in [9] a
spatial-temporal quality metric for video has been introduced and subjectively evaluated. This metric relies on the image data of each video frame and quantifies its spatial-temporal distortion. In our case we aim at quantifying the distortion caused by playback rate variations and, thus, we focus on temporal information only.

Increasing or decreasing the media playback rate – denoted as µ – results in a perceptual distortion in audio and/or video. This distortion depends on the actual multimedia content, especially on the temporal features, for which the playback rate is increased or decreased. Therefore, we propose the following metrics for measuring the distortion caused by modifying the playback rate of audio and video:

• Audio: the spectral energy of an audio frame for the c-th channel is denoted by fac(x).

• Video: the average length of motion vectors between two consecutive frames is denoted by fv(x).

These metrics allow us to quantify the distortion for audio and video when increasing or decreasing the playback rate for a specific content section. This is done by comparing how much of each temporal feature has been experienced by the user during the content section with and without the playback rate change. We differentiate between increasing and decreasing the playback rate when calculating our distortion metrics because our hypothesis is that increasing the playback rate may have a different impact on the QoE than decreasing the playback rate.

In order to determine our metrics we calculate the first and the last frame number of the content section for which the playback rate shall be changed. Determining the first frame of the i-th content section for which the playback rate is changed is done by Fs(tsi, tei) = ⌊tsi · fpsµ0⌋, where tsi and tei denote the start and end of the i-th content section in seconds. fpsµ0 represents the frames per second for the nominal playback rate µ0. For determining the last frame of the i-th content section for which the playback rate has been increased or decreased we use the function F̃e(tsi, tei) as depicted in Equation 1.

F̃e(tsi, tei) = ⌊Fs(tsi, tei) + (tei − tsi) · fps∆µ⌋    (1)

fps∆µ denotes the frames per second for the changed playback rate ∆µ (∆µ = µ0 + δµ, where δµ denotes the change of the playback rate). With δµ equal to zero, F̃e(tsi, tei) yields the last frame of the i-th content section without any playback rate changes, which is denoted by Fe(tsi, tei). In the following, we introduce the metrics for the distortion in audio (dai) and the distortion in video (dvi) for the i-th content section. Equation 2 denotes the distortion metric for video.

dvi = (1 / Σ_{j=1}^{|F|} fv(j)) · (Σ_{j=Fs(tsi,tei)}^{Fe(tsi,tei)} fv(j) − Σ_{j=Fs(tsi,tei)}^{F̃e(tsi,tei)} fv(j))    (2)

dvi may be any value in the interval [−1, 1]. |F| depicts the overall number of frames. For audio we followed the same principle as for video, with the difference that we used the spectral energy of the audio frames for each audio channel obtained by the Fourier Transformation. The Fourier Transformation for a single audio frame and for the i-th content section is denoted by âci (cf. Equation 3), where c depicts the audio channel. Equation 4 defines our distortion metric for audio. Note that we take into account the audio channels.

âci(k) = Σ_{j=Ni}^{Mi} e^(−2πi·jk/M) · fac(j)    (3)

si = Σ_{c=1}^{C} (Σ_{u=Fsi}^{Fei} Σ_{k=0}^{Sf} |âcu(k)| − Σ_{u=Fsi}^{F̃ei} Σ_{k=0}^{Sf} |âcu(k)|)    (4)

C denotes the number of audio channels available. Sf denotes the highest frequency. Finally, dai denotes the distortion in audio for the i-th content section.

dai = si / Σ_{c=1}^{C} sec    (5)

sec denotes the overall spectral energy of channel c. Again, dai may be any value in the interval [−1, 1]. If there are more content sections for which the playback rate is changed we use the average of the introduced metrics, denoted by dv and da, respectively.

3. SUBJECTIVE QUALITY ASSESSMENT USING CROWDSOURCING

For validating our metrics and investigating their correlation with the QoE, we conducted a subjective quality assessment using crowdsourcing, referred to in the following as the study [10]. Therefore, we will first describe the key aspects of the study, followed by the screening/filtering of the participants and the statistical analysis of the results.

3.1. Participants, Stimuli, Methodology, and Assessment Platform

For conducting our user study we selected the crowdsourcing platform Microworkers (http://www.microworkers.com, last access: July 2014). Microworkers allows hosting so-called campaigns to which subjects (called microworkers) can subscribe. These campaigns include a detailed description of the task and ask each participant to hand in a proof in order to verify their participation. The duration of the study is approximately 15 minutes. We have found that the typical amount of money that is paid for a task with a duration of about 15 minutes is approximately $0.20. Therefore, we have set a slightly higher compensation of $0.25 as an extra motivation for each participant [11]. Figure 1 depicts the evaluation methodology used to conduct the study. The introduction explains the task and the test procedure. Furthermore,
the participants are asked to agree to a disclaimer. The pre-questionnaire allows us to gather demographic information and helps us to check whether the participants are from the countries we asked for at Microworkers. The training phase should allow the participants to become familiar with the task and the rating possibilities. In order to introduce AMP and its effects on the media playback to the participants, we selected the video sequence Babylon A. D. taken from [12]. The training sequence is presented three times with three different media playback rates µ ∈ {1, 0.5, 2}, i.e., µ = 1 corresponds to the nominal playback rate µ0, µ = 0.5 is half the nominal playback rate µ0, and µ = 2 denotes twice the nominal playback rate µ0. The training sequence is presented for its whole duration with the mentioned media playback rates.

Fig. 1. Methodology for the Study.

After the training phase the main evaluation starts. For the stimulus we selected a video sequence with audio and a duration of 51 seconds from the beginning of the open source movie Big Buck Bunny (http://www.bigbuckbunny.org, last access: July 2014). We annotated the content sections for which the media playback rate is increased or decreased. The playback rate changes were initially randomly scattered throughout the whole video sequence with a cumulative duration of 8.84 seconds, thus reflecting 17.3% of the sequence's duration. As indicated in Figure 1 we use a single stimulus method as recommended in [13, 14]. We use a continuous rating scale [0, 100] represented by a slider. Furthermore, we randomly inserted a control question after a stimulus presentation. In particular, this control question asks the participants what they have seen in the previous video sequence, with three possible answers provided. In total the sequence is presented nine times to the participants with the following configurations for the playback rate µ ∈ {0.5, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2}. The nominal playback rate µ0 = 1 depicts the hidden reference. Please note that we only modify the playback rate of specific sections and not of the entire video sequence.

At the end, we ask the participants to fill out a short post-questionnaire which provides the possibility to give feedback and to state whether subjects had already participated in a similar study. The study is conducted by adopting an open-source Web-based QoE assessment platform [15]. Please note that the media player used by this platform uses the Waveform Similarity based Overlap-Add technique, which tries to maintain the pitch of audio when increasing or decreasing the playback rate [16]. In addition to its usability, the platform provides several mechanisms to track the behavior of participants in terms of measuring the time of each stimulus presentation and the possibility to ask control questions. The screening of participants according to the mentioned mechanisms is discussed in the following section.

3.2. Screening and Filtering of Participants

Using crowdsourcing for subjective quality assessments is gaining more and more attention due to reduced costs compared to in-lab studies. However, crowdsourced studies are unsupervised, and traditional screening methods – such as outlier detection – are insufficient as participants may try to cheat [17]. Therefore, we use additional data provided by the Web-based QoE assessment platform for screening and filtering participants as follows:

First, we use the above mentioned control question. An incorrect answer leads to an exclusion of the participant from the final statistical analysis because an incorrect answer may indicate that the participant did not take part in the study thoughtfully.

Second, we try to detect participants who skipped, shortened, or paused the stimulus presentation. Therefore, we use the F-test to test whether there exists a significant difference between the variance of the theoretical playback time of each stimulus presentation and the actual playback time of each participant, represented by the statistic f = s²X / s²Y, where s²Y is the sample variance of the nominal playback durations and s²X the sample variance of the playback times of a participant. This ratio follows an F distribution with m − 1 and n − 1 degrees of freedom.

Third, we investigate those participants who very often rated extreme values. We define extreme values as the set of values 0−5, 50, 95−100. This shall allow us to identify participants that only moved the slider a few times from its initial position (50) and those who just pushed the slider into the left or right direction very often. According to the binomial distribution with p = 1/2, the probability of rating an extreme value or not is given by P(X ≤ k) = Σ_{j=0}^{k} C(n, j) · p^j · (1 − p)^(n−j). Therefore, the probability that a certain extreme value is selected more than k times is P(X > k) = 1 − P(X ≤ k). We rejected the ratings of a participant if the probability for selecting a certain extreme value more than k times was below α = 5%, i.e., P(X > k) < α.

Without any filtering applied, 80 persons participated in the study. 11 participants provided a wrong answer to the control question. With the F-test on the variance of the actual playback time of the stimulus presentations, five participants could be identified that paused the playback or had statistically significantly lower playback times (i.e., skipped at least one stimulus presentation). For the ratings, nine participants have been screened out that did not vote correctly (by constantly voting 0−5, 50 or 95−100). Therefore, 55 participants (48 male and 7 female) have been considered for the final statistical analysis of the results.
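The two statistical screening checks above – the variance-ratio F statistic and the binomial extreme-value test – can be sketched in plain Python. This is our own illustrative sketch; the function and variable names are not part of the assessment platform:

```python
from math import comb

def variance_ratio(observed, nominal):
    """f = s_X^2 / s_Y^2: sample variance of a participant's playback
    times divided by the sample variance of the nominal durations."""
    def sample_var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return sample_var(observed) / sample_var(nominal)

def prob_more_than_k(n, k, p=0.5):
    """P(X > k) = 1 - P(X <= k) for a binomial(n, p) variable: the
    probability of picking an extreme value more than k times in n ratings."""
    return 1.0 - sum(comb(n, j) * p ** j * (1 - p) ** (n - j)
                     for j in range(k + 1))

# A participant whose playback times vary far more than the nominal
# durations is suspicious; so is one whose extreme-value count k is so
# high that prob_more_than_k(n, k) falls below alpha = 0.05.
```

The ratio would then be compared against a critical value of the F distribution with m − 1 and n − 1 degrees of freedom, as described above.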

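As a compact recap of Section 2, the frame-index functions and distortion metrics of Equations 1–5 can be sketched as follows. This is a simplified, single-channel illustration under our own naming, not the authors' implementation; the DFT here corresponds to Equation 3 with the frame length as period:

```python
import cmath
import math

def first_frame(t_s, fps_nominal):
    # F_s(t_s, t_e) = floor(t_s * fps_mu0): first frame of the section
    return math.floor(t_s * fps_nominal)

def last_frame(t_s, t_e, fps_nominal, fps_changed):
    # ~F_e(t_s, t_e) = floor(F_s + (t_e - t_s) * fps_dmu)  (Equation 1);
    # with fps_changed == fps_nominal this yields F_e.
    return math.floor(first_frame(t_s, fps_nominal) + (t_e - t_s) * fps_changed)

def video_distortion(mv_lengths, f_s, f_e, f_e_changed):
    # Equation 2: motion experienced without vs. with the rate change,
    # normalised by the total motion over all |F| frames.
    total = sum(mv_lengths)
    without_change = sum(mv_lengths[f_s:f_e + 1])
    with_change = sum(mv_lengths[f_s:f_e_changed + 1])
    return (without_change - with_change) / total

def frame_spectrum(samples):
    # Equation 3 (one channel): DFT magnitudes of a single audio frame.
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * cmath.pi * j * k / n)
                    for j, s in enumerate(samples))) for k in range(n)]

def audio_distortion(frames, f_s, f_e, f_e_changed, total_energy):
    # Equations 4 and 5 (one channel): spectral energy experienced
    # without vs. with the rate change, normalised by the overall energy.
    def energy(lo, hi):
        return sum(sum(frame_spectrum(fr)) for fr in frames[lo:hi + 1])
    return (energy(f_s, f_e) - energy(f_s, f_e_changed)) / total_energy
```

For example, with a section from second 2 to second 4 at 24 fps, first_frame(2.0, 24) gives frame 48, and doubling the rate (48 fps in the section) moves the section's last consumed frame from 96 to 144, so fewer of the originally scheduled frames fall inside the unchanged window.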
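Section 3.3 reports Pearson correlation coefficients between the distortion metrics and the MOS; for reference, a minimal implementation of the coefficient (again our own sketch, not the authors' tooling) is:

```python
import math

def pearson(xs, ys):
    # Pearson correlation: covariance normalised by both standard deviations.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Values near ±1 indicate a strong linear relationship; values near 0, as observed for the signed metrics below, suggest that a linear model is a poor fit.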

Fig. 2. MOS and 95% CI for (dv, da, µ).

3.3. Statistical Analysis of the Results

After screening the participants and their responses, the ratings for each stimulus presentation were subject to statistical significance tests. According to the Central Limit Theorem we assume that the ratings of the participants are normally distributed. Nevertheless, we conducted a Shapiro-Wilk test to assess whether the ratings deviate from a normal distribution; the hypothesis that no normal distribution is present was rejected for each configuration of playback rates of the stimulus presentations. By the use of the metrics introduced in Section 2, the distortion caused by increasing or decreasing the playback rates for the selected content sections is expressed by da for the average distortion in the frequency domain for the audio channels and dv for the average distortion of motion for video.

Figure 2 depicts for each triple (dv, da, µ) the assessed Mean Opinion Score (MOS). It can be observed that playback rates near the nominal playback rate of µ = 1 cause only a slight drop in QoE. A Student's t-test supports this finding by stating no significant difference in MOS between the reference of µ = 1 and the following playback rates: µ = 0.8 (p = 0.93, t = −0.083); µ = 1.2 (p = 0.92, t = 0.096); µ = 1.4 (p = 0.81, t = 0.42); µ = 1.6 (p = 0.22, t = 1.23); µ = 1.8 (p = 0.16, t = 1.41). These results indicate that the users could not notice a significant difference for playback rates µ ∈ [0.8, 1.8].

For the other playback rates it can be observed that the QoE significantly degrades. A Student's t-test revealed that there exists a significant difference in the MOS between the nominal playback rate of µ = 1 and the media playback rates with the following values: µ = 0.5 (p = 0.00, t = 4.5217); µ = 0.6 (p = 0.002, t = 3.2); µ = 2 (p = 0.03, t = 2.19). These results provide the evidence that users perceived a significant difference between the reference and the test conditions.

In the following we investigate how well our metrics correlate with the MOS for the different playback rate configurations. The Pearson correlation coefficient for dv and the QoE ratings is ρ = 0.43. For da the Pearson correlation coefficient is ρ = 0.679. If we take the absolute values |da| and |dv| for calculating the Pearson correlation coefficient, we obtain the following values for the linear correlation between the metrics and the QoE scores. The Pearson correlation coefficient for |da| and the QoE ratings is ρ = −0.5549 and for |dv| and the QoE ratings ρ = −0.9565. For the metric in the audio domain, both da and |da| show a low linear correlation with the obtained MOS. For the metric in the video domain we have obtained contrary results. Taking dv we have a low linear correlation between the metric and the QoE ratings, but if we take |dv| there exists a high negative linear correlation. Nevertheless, if we want to retain the ability of distinguishing whether the playback rate has been decreased or increased, we have to use the signed metrics. The low values between the signed metrics and the QoE scores show us that assuming a linear relationship between the distortion metrics and the assessed QoE may not be appropriate. Therefore, we try to find a model that explains this correlation better than a linear model, which will be discussed in the next sections.

Finally, the results indicate that with an increase in |dv| and |da| the QoE is reduced. Interestingly, the QoE does not decrease linearly. For the playback rate changes in the range of [0.8, 1.8] the QoE remains high compared to the reference. Increasing or decreasing the playback rate further causes a huge drop in the QoE.

4. QOE UTILITY MODEL FOR AMP

In [5] several AMP algorithms were assessed on their impact on the QoE regarding QoS parameters such as the initial playback delay, loss rate, underflow time ratio, and the playback rate by introducing cross traffic. The actual impact on the playback of the content was not taken into account. Furthermore, there is the need for a model which can be easily combined with other QoE metrics in order to assess the QoE of a system that uses AMP. The presented results of the conducted subjective quality assessment using crowdsourcing gave us a first impression on how the QoE degrades with an increase or decrease in the playback rate when selecting content sections with a short time duration. With the knowledge that the Pearson correlation between the QoE scores and the audio/video metrics is low, we try to find a function which allows us to approximate the QoE more precisely than a linear function could do. Therefore, we need a function that allows us to estimate the QoE from the distortions for audio and video and returns the coefficient of degradation, i.e., ζ : R × R → [0, 1]. Therefore, let

ζ(x, y)θ = e^(−1/2 · ((x − θ1)/θ2)²) · e^(−1/2 · ((y − θ3)/θ4)²)

be the two-dimensional function of degradation with the parameter vector θ that describes the relationship between the degradation of the QoE and the quantitative metrics (dv, da). Furthermore, ζ(x, y)θ is logarithmically concave in (x, y). The results of the subjective quality assessment indicated that small distortions in the metrics, especially for audio, already cause an impact on the QoE. Therefore, we formulate our model as follows:

QoE(dv, da) = QoEwo · ζ(dv, da)θ*,    (6)

where θ* represents the optimal parameter vector such that for a cost function f(θ) and each θ it holds that f(θ) ≥ f(θ*). Thus, θ* represents the optimal solution to a minimization problem. QoEwo states the QoE without any playback rate changes. This QoEwo can be derived from QoS parameters (e.g., bit-rate, resolution, delay, jitter, etc.) by the use of existing models such as those in [5, 18]. The assessment of the actual QoEwo is out of the scope of this paper.
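Once θ is known, the model of Equation 6 can be evaluated directly. The sketch below is our own illustration; it plugs in the fitted θ* reported in Section 4 and assumes a hypothetical external estimate for QoEwo:

```python
import math

# Fitted parameter vector theta* = (0.0011, 0.0482, -0.0004, 0.0184)
# as reported for the instantiated model (Equation 8).
THETA_STAR = (0.0011, 0.0482, -0.0004, 0.0184)

def zeta(x, y, theta=THETA_STAR):
    # Two-dimensional degradation function: a product of two Gaussians
    # in the video distortion x = d_v and the audio distortion y = d_a.
    t1, t2, t3, t4 = theta
    return (math.exp(-0.5 * ((x - t1) / t2) ** 2) *
            math.exp(-0.5 * ((y - t3) / t4) ** 2))

def qoe(d_v, d_a, qoe_wo):
    # Equation 6: QoE = QoE_wo * zeta(d_v, d_a)_theta*; qoe_wo is the QoE
    # without playback rate changes, obtained from an external QoS model.
    return qoe_wo * zeta(d_v, d_a)
```

Because θ4 < θ2, the audio Gaussian is narrower than the video one: zeta(0, 0.02) is smaller than zeta(0.02, 0), which mirrors the finding below that the same amount of distortion degrades the QoE more in audio than in video.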
In order to instantiate our proposed utility model (cf. Equation 6) we use the responses received from the conducted subjective quality assessment using crowdsourcing. We fit our model to the obtained data by using multiple instances of the conjugate gradient method [19]. For the cost function we used the Least-Squares Estimator, i.e., the squared l2-norm given by ||ln(ζθ(x, y)) − ln(z)||²₂, depicted in Equation 7. This function is neither convex nor concave in θ, but at least twice differentiable.

f(θ) = Σ_{(x,y,z)∈M} (ln(ζ(x, y)θ) − ln(z))²,    (7)

where M := {(x, y, z) ∈ R³ | z = QoE(x, y) / max_{x,y} QoE(x, y)} is the set of 3-tuples with z representing the QoE assessed for (da, dv). Therefore, we try to find the parameter vector θ* that minimizes our f(θ). For determining the conjugated search directions we use the method proposed by Polak-Ribière [20]. In order to find a near optimal vector θ we use multiple instances of the conjugate gradient algorithm with starting points uniformly distributed in the interval ]0, 1] for all θi. We selected the θ that provided the lowest costs in terms of our cost function f(θ).

By the use of the conjugate gradient method we fitted ζ(x, y)θ to the responses received during our study discussed in Section 3, and we found the following values for the parameter vector θ* = (0.0011, 0.0482, −0.0004, 0.0184)^T. Equation 8 depicts the fully instantiated utility model.

QoE(dv, da)θ* = QoEwo · e^(−1/2 · ((dv − 0.0011)/0.0482)²) · e^(−1/2 · ((da + 0.0004)/0.0184)²)    (8)

Fig. 3. Fitted ζ for the received responses.

Figure 3 depicts QoE(dv, da)θ* by the use of the fitted ζ. An interesting finding is that a distortion in audio impacts the QoE more than the same amount of distortion in video. This can be observed by comparing the second and fourth components of θ or by taking a look at Figure 3.

To test how well our utility model fits the actual data, we conducted an analysis of variance (ANOVA) on how the fitted model reflects the variability of the actual data. The ratio of the sum of squares of the model and the sum of squares of the actual data revealed that the instantiated model reflects 92.48% of the variability of the actual data. Furthermore, we conducted an F-test to test whether θ is the zero vector. The test revealed that the null hypothesis can be rejected with p = 2.49 · 10⁻⁴ and F = 27.84 for α = 5%. We have shown that ζ(x, y) fits our purpose quite well and provides the possibility to estimate the coefficient of degradation for different values of (dv, da).

5. DISCUSSION AND CONCLUSION

As mentioned in Section 3, we use only the first 51 seconds of Big Buck Bunny as multimedia content. This video sequence is presented with different configurations of content sections and playback rates for these content sections. Recency effects may occur and participants may unwittingly provide unreliable ratings. Even shuffling the stimulus presentations in a random fashion does not avoid recency effects in this case, especially when extreme conditions are presented consecutively (e.g., first with a playback rate of µ = 0.5 and then with µ = 1.4). An option is to use different video sequences. But this may have an influence on the actual task because participants may like or dislike them and, therefore, this may have an impact on the provided QoE rating. Therefore, we decided to use only a single video sequence for this subjective quality assessment. Another possibility, which would have increased the duration of the crowdsourced study, is to introduce dummy video sequences which allow the participants to forget the last real stimulus presentation. Nevertheless, further subjective quality assessments have to be conducted in order to support the findings presented in this paper (e.g., [21]).

The results of the subjective quality assessment using crowdsourcing lead us to the hypothesis that the correlation between the distortion in the video/audio domain and the QoE of playback rate variations can be described by a non-
linear model. The introduced model led us to the finding that audio plays an important role when increasing or decreasing the playback rate. This is depicted in Figure 3 and denoted in Equation 8. Comparing our results to the results obtained by other subjective quality assessments which assess the QoE of playback variations for video only [6, 7, 22], we can see that altering the playback rate for the combination of audio and video has a very different impact on the QoE. Please note that for video only, the human perception is more tolerant of playback rate variations. This is not the case for audio and the combination of audio and video, as shown in this paper.

The contribution of this paper is twofold. First, we have shown that there is a significant difference between the impact of increasing and decreasing the playback rate on the QoE. An interesting finding is that increasing the playback rate for specific content sections has a lower impact on the QoE than decreasing the playback rate by the reciprocal of the increase of the playback rate. Second, we have introduced metrics that measure the distortion in audio and video caused by increasing or decreasing the playback rate. With the use of these metrics we derive a utility model.

Future work comprises the use of the obtained utility model in order to determine content sections that minimize the impact of playback variations on the QoE. Considering the use case of carrying out the synchronization in IDMS, the buffer contents may be used to determine appropriate content sections for overcoming the identified asynchronism.

Acknowledgments: This work was supported in part by the EC in the context of the SocialSensor (FP7-ICT-287975) and QUALINET (COST IC 1003) projects and partly performed in the Lakeside Labs research cluster at AAU.

6. REFERENCES

[1] M. Montagud, F. Boronat, H. Stokking, and R. Brandenburg, "Inter-destination multimedia synchronization: schemes, use cases and standardization," Multimedia Systems, vol. 18, pp. 459–482, 2012.

[2] T. Hossfeld, M. Seufert, M. Hirth, T. Zinner, P. Tran-Gia, and R. Schatz, "Quantification of YouTube QoE via crowdsourcing," in ISM, 2011, pp. 494–499.

[3] M. Kalman, E. Steinbach, and B. Girod, "Adaptive media playout for low-delay video streaming over error-prone channels," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 6, pp. 841–851, 2004.

[4] M. Yuang, S. Liang, and Y. Chen, "Dynamic video playout smoothing method for multimedia applications," Multimedia Tools and Applications, vol. 6, no. 1, pp. 47–60, 1998.

[5] M. Li, "QoE-based performance evaluation for adaptive media playout systems," Advances in Multimedia, vol. 2013, p. 7, 2013.

[6] Y.-F. Ou, Y. Zhou, and Y. Wang, "Perceptual quality of video with frame rate variation: A subjective study," in IEEE ICASSP, 2010, pp. 2446–2449.

[7] Q. Huynh-Thu and M. Ghanbari, "Perceived quality of the variation of the video temporal resolution for low bit rate coding," in Picture Coding Symposium, 2007.

[8] W. Lin and C.-C. J. Kuo, "Perceptual visual quality metrics: A survey," Journal of Visual Communication and Image Representation, vol. 22, no. 4, 2011.

[9] Y. Wang, T. Jiang, S. Ma, and W. Gao, "Novel spatio-temporal structural information based video quality metric," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 7, pp. 989–998, 2012.

[10] B. Rainer and C. Timmerer, "Self-Organized Inter-Destination Multimedia Synchronization for Adaptive Media Streaming," in 22nd ACM Multimedia, 2014.

[11] M. Hirth, T. Hossfeld, and P. Tran-Gia, "Anatomy of a Crowdsourcing Platform – Using the Example of Microworkers.com," in 5th IMIS, June 2011, pp. 322–329.

[12] M. Waltl, C. Timmerer, B. Rainer, and H. Hellwagner, "Sensory Effect Dataset and Test Setups," in 4th QoMEX. IEEE, 2012, pp. 115–120.

[13] "Rec. ITU-R BT.500-11," International Telecommunication Union, Tech. Rep.

[14] ITU-T Recommendation P.910, "Subjective video quality assessment methods for multimedia applications," International Telecommunication Union, Geneva, Switzerland, Tech. Rep., Apr. 2008.

[15] B. Rainer, M. Waltl, and C. Timmerer, "A Web based Subjective Evaluation Platform," in 5th QoMEX. IEEE, Jul. 2013, pp. 24–25.

[16] W. Verhelst and M. Roelands, "An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech," in IEEE ICASSP, vol. 2, Apr. 1993, pp. 554–557.

[17] T. Hossfeld, C. Keimel, M. Hirth, B. Gardlo, J. Habigt, K. Diepold, and P. Tran-Gia, "Best Practices for QoE Crowdtesting: QoE Assessment with Crowdsourcing," IEEE Transactions on Multimedia, 2013.

[18] F. Pereira, "A triple user characterization model for video adaptation and quality of experience evaluation," in IEEE 7th Workshop on Multimedia Signal Processing, 2005, pp. 1–4.

[19] C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, ser. Frontiers in Applied Mathematics. SIAM, 1995, no. 16.

[20] E. Polak, Computational Methods in Optimization: A Unified Approach. Academic Press, New York, 1971.

[21] B. Rainer and C. Timmerer, "A subjective evaluation using crowdsourcing of adaptive media playout utilizing audio-visual content features," in IEEE QCMAN 2014, H. Lutfiyya and P. Cholda, Eds. IEEE, May 2014.

[22] Z. Lu, W. Lin, B. C. Seng, S. Kato, E. Ong, and S. Yao, "Perceptual Quality Evaluation on Periodic Frame-Dropping Video," in IEEE ICIP, vol. 3, 2007, pp. 433–436.
