
Measuring Internet Video Quality of Experience

from the Viewer's Perspective

An Industry Whitepaper

Contents

Executive Summary
Introduction to Internet Video
    Progressive Video
    Adaptive Video
Considerations for Measuring Video QoE
    Session Lifetime and Sampling Frequency
    Correlation Across Sessions
    Routing Asymmetry
    Measuring Progressive Video QoE
    Measuring Adaptive Video QoE
    Why TCP Packet Loss Doesn't Matter
Conclusion
    Requirements for a Video QoE Solution
    Additional Resources

Executive Summary

Third-party video applications that run over-the-top (OTT) of a communications service provider's
(CSP's) transport layer are the dominant drivers of bandwidth on the Internet. The popularity of
such video is responsible for two fundamental shifts in consumer behavior: higher peak bandwidth
levels and heightened subscriber sensitivity to video quality.

During periods of peak utilization, resources on the network are more prone to congestion; from the
viewer's perspective, congestion can very visibly manifest as degradation in video streaming quality.
Therefore, measuring the customer's quality of experience (QoE) requires measurement of video QoE.

OTT video is delivered in two primary streaming mechanisms: progressive video, in which sections of a
single file are delivered in bursts; and adaptive video, in which chunks of differing display quality
are delivered based upon the network's transport capabilities.

From a transport layer perspective, HTTP over TCP is the dominant transport mechanism, introducing
two primary dimensions to video quality: display quality (fidelity) and transport quality (stalling).
Perhaps counterintuitively, video QoE has no correlation (i.e., is completely unrelated) to TCP
performance metrics (e.g., jitter and packet loss).

For a representative assessment of QoE from the viewer's perspective, the video QoE solution must use
different methods to measure QoE of progressive video and QoE of adaptive video, and must take into
account both display quality and transport quality. To fulfill these requirements, the solution must
be both protocol- and container-aware.

Version 2.0

Introduction to Internet Video


Third-party video applications that run over-the-top (OTT) of a communications service provider's
(CSP's) transport layer are the dominant drivers of bandwidth on today's consumer Internet. In regions
where it is available, Netflix either has emerged or is emerging as the largest single source of Internet
traffic, while YouTube is the largest globally. 1

This popularity of video in general, and OTT video in particular, is responsible for two fundamental
shifts in consumer behavior:

1. Higher peak bandwidth levels: since OTT video is an on-demand application, it drives traffic
when it is viewed, which typically peaks in the range of 8pm to 11pm; previously, video
content was often acquired in bulk via P2P networks and then consumed later.
2. Heightened subscriber sensitivity to quality: video is a sensory experience with rapidly changing
sights and sounds, so shifts in quality (e.g., stalls, pixelization, compression artifacts, shifts up
or down in resolution, changes in frame-rate) are instantly recognized by the viewer.

From the CSP perspective, OTT video is decreasing network efficiency in the macro sense, in that the
peak-to-trough ratio is increasing and there is a large amount of available but unutilized capacity
throughout the day. During these high peaks, the network is more prone to congestion; from the
viewer's perspective, congestion can very visibly manifest as degradation in video streaming quality.

To understand how to measure the video quality of experience (QoE) from the viewer's perspective,
one must first understand how OTT video is delivered.

There are two primary forms of streaming video on the Internet today: progressive and adaptive. While
there are other video methods available (e.g., RTP multicast over UDP), they are not in common use
today as a delivery mechanism for OTT video.

Progressive Video
In the progressive method, a single large file is burst-paced into the network, with no consideration
for available bandwidth.

In this scenario, the media asset is a single video file stored on a server, and HTTP byte-ranges are
used for seeking through the video. Each byte range may arrive either on the same TCP connection or
on a new connection.

For situations in which videos are available with different display quality choices (each of which
corresponds to a different bitrate), the video client (or user) manually selects a particular resolution,
which corresponds to a different file on the server. The result is that two users watching the same
video at different resolutions are actually watching different files.

Once chosen, this display quality is constant, regardless of the network's ability to transport sufficient
data to maintain stall-free playback.

If one were to look at a cross-section of a progressive video stream, it would look like Figure 1; the
container includes information relevant to the video file and content, and the elementary streams
correspond to video, audio, subtitles, etc.

1
Information about worldwide traffic composition is available from several good sources, including Cisco's Visual Networking
Index (VNI), Akamai's State of the Internet studies, and Sandvine's own Global Internet Phenomena Reports.


More specifically, the following information relevant to determining QoE (and also additional business
intelligence) is available from each layer:

IP: Subscriber, CDN, BGP AS Path
Subscriber: physical location on network, service plan, device type
TCP: nothing of relevance to QoE
HTTP: asset (used to link multiple chunks together), duration, stall information (transport quality)
Container: codec, resolution, bitrate (display quality), CDN
Elementary stream: bytes transferred

[Diagram, layers from outermost to innermost: IP | TCP | HTTP | Container (Resolution, Target bitrate, Codec-Type, ...) | Elementary Stream, Elementary Stream]

Figure 1 - Cross-section of a progressive video stream

Adaptive Video
Quite different from the progressive method, adaptive video modulates the display quality based on
the network's available transport capacity (i.e., bandwidth). It achieves this effect by fetching
chunks of the video in a piecewise manner; the chunk chosen is of the maximum deliverable display
quality. At the beginning, the video client requests the first chunk, and starts playing it. If this chunk
takes too long to deliver, then the next chunk will be requested at a lower display quality; if this initial
chunk is delivered especially quickly, then the next chunk will be requested at a higher display quality.

An illustration of an adaptive video flow in flight is provided in Figure 2. In this example, the viewer
gets two chunks of low definition video, followed by two chunks of standard definition, then two of
high definition, and finally two again of standard, all in response to the dynamically-changing transport
capacity of the network.

Figure 2 - 'In-flight' adaptive video stream
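
The chunk-by-chunk behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the logic of any particular video client: the three-step quality ladder, the function names, the 10-second chunk duration, and the two thresholds (step down when a chunk takes longer to deliver than to play, step up when it takes less than half as long) are all assumptions made for the example.

```python
# Hypothetical quality ladder, lowest to highest (an assumption for illustration).
LADDER = ["low", "standard", "high"]


def next_quality(current, chunk_seconds, download_seconds,
                 slow_ratio=1.0, fast_ratio=0.5):
    """Choose the display quality of the next chunk from the last chunk's delivery time."""
    i = LADDER.index(current)
    ratio = download_seconds / chunk_seconds
    if ratio > slow_ratio:
        # Delivery took longer than playback: step down one level.
        i = max(i - 1, 0)
    elif ratio < fast_ratio:
        # Delivery was much faster than playback: step up one level.
        i = min(i + 1, len(LADDER) - 1)
    return LADDER[i]


def simulate(download_times, chunk_seconds=10.0, start="low"):
    """Return the quality of each chunk, given the delivery time of every chunk but the last."""
    qualities = [start]
    for seconds in download_times:
        qualities.append(next_quality(qualities[-1], chunk_seconds, seconds))
    return qualities


if __name__ == "__main__":
    # Delivery times (seconds) chosen so the result mirrors the sequence in Figure 2.
    print(simulate([8, 3, 8, 3, 8, 14, 8]))
```

With these assumed delivery times the simulated client fetches two low, two standard, two high, and then two standard chunks, which is the pattern the illustration describes.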


The most common adaptive video methods are Microsoft Silverlight, Apple HTTP Live Streaming (HLS),
and Adobe Real-Time Messaging Protocol (RTMP). They each operate in a similar fashion: the video asset is
stored on disk at multiple target bitrates/resolutions. A chunk is generally between 10 MB and 100 MB in
size, depending on display quality.

In the same fashion as progressive video, each chunk may arrive either on the same TCP connection or
on a new connection.

From a cross-section perspective, adaptive video is similar to progressive, with one key difference: an
additional protocol layer that gives parameters for seeking chunks, obtaining meta-data, etc.

More specifically, the following information relevant to determining QoE (and also additional business
intelligence) is available from each layer:

IP: Subscriber, CDN, BGP AS Path
Subscriber: physical location on network, service plan, device type
TCP: nothing of relevance to QoE
HTTP: asset (used to link multiple chunks together), protocol, CDN
Protocol: duration, stall information (transport quality)
Container: codec, resolution, bitrate (display quality)
Elementary stream: bytes transferred

[Diagram, layers from outermost to innermost: IP | TCP | HTTP | protocol (e.g. HLS, RTMP, ...) | Container (Resolution, Target bitrate, Codec-Type, ...) | Elementary Stream, Elementary Stream]

Figure 3 - "Cross-section" of an adaptive video stream


Considerations for Measuring Video QoE


From a transport layer perspective, HTTP over TCP is the dominant transport mechanism, as is
represented in Figure 1 and Figure 3.

As a consequence of this transport mechanism, there are two primary dimensions to video quality:

1. Display quality (fidelity): is the image quality sufficient for the device's screen size?
2. Transport quality (stalling): how long does the video take to start, and does it play smoothly?

For a representative assessment of QoE from the viewer's perspective, a quality measurement must
take into account both of these dimensions.

The different nature of content delivery between progressive and adaptive video streams requires that
the quality of experience measurement be performed differently.

Session Lifetime and Sampling Frequency


Video sessions are typically long-lived 2, and an accurate assessment of the viewer's quality of
experience is only possible if the necessary measurements are taken throughout the full duration.

However, the question of how many measurements to take (or, more accurately, with what frequency
to take the measurements) must be answered. Measurements taken too frequently will incur processing
overhead with diminishing returns; measurements taken too infrequently will fail to accurately capture
the quality of experience and may fall prey to a sampling error 3.

In practice, to avoid error there must be two or more measurements per chunk or piece of video; this can
typically be achieved by sampling at a 15-second or even 30-second interval.
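
As a rough illustration of that rule of thumb, the sketch below derives the longest acceptable sampling interval from the playback length of one chunk or burst. The function names are hypothetical, and the 30-second and 60-second unit lengths are assumptions picked only because they reproduce the 15-second and 30-second intervals quoted above.

```python
import math


def max_sample_interval(unit_seconds, samples_per_unit=2):
    """Longest interval that still yields `samples_per_unit` measurements per chunk or burst."""
    if unit_seconds <= 0 or samples_per_unit < 1:
        raise ValueError("unit_seconds must be positive and samples_per_unit at least 1")
    return unit_seconds / samples_per_unit


def samples_in_session(session_seconds, interval_seconds):
    """Number of measurements taken over a session, counting the one at time zero."""
    return math.floor(session_seconds / interval_seconds) + 1


if __name__ == "__main__":
    # Assumed unit lengths of 30 s and 60 s of playback (not figures from this paper).
    print(max_sample_interval(30.0))       # interval for a 30-second unit
    print(max_sample_interval(60.0))       # interval for a 60-second unit
    print(samples_in_session(180.0, 15.0))  # measurements over a three-minute clip
```

Halving the interval doubles the number of measurements per session, which is the processing overhead the trade-off above refers to.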

Correlation Across Sessions


For both progressive and adaptive video streams, it is of fundamental importance to be able to
correlate the same asset across multiple GET transactions issued into the same, or different, TCP
connections.

TCP Connection #1                       TCP Connection #2

GET /asset HTTP/1.1                     GET /asset HTTP/1.1
Range: bytes=0-100                      Range: bytes=300-400

206 Partial Content                     206 Partial Content
Content-Range: bytes 0-100/100          Content-Range: bytes 300-400/
bytes ...                               bytes ...

GET /asset HTTP/1.1                     GET /asset HTTP/1.1
Range: bytes=100-200                    Range: bytes=9000-9100

206 Partial Content                     206 Partial Content
Content-Range: bytes 100-200/100        Content-Range: bytes 9000-9100/
bytes ...                               bytes ...

Figure 4 - HTTP progressive video 'flows' across multiple connections

2
Even for videos where this statement is not necessarily true in the strictly absolute sense (e.g., for a three-minute YouTube
clip), it is still valid in the relative sense; that is, even a three-minute video is much longer than most Internet sessions and will
likely be delivered in many bursts (progressive) or chunks (adaptive).
3
Intrepid readers can learn more at: http://en.wikipedia.org/wiki/Nyquist_rate#Nyquist_rate_relative_to_sampling


Only through this correlation/state-tracking can measurements be accurately applied to a video as a
whole when the content delivery is split within or across TCP connections.
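
A minimal sketch of this kind of state-tracking is shown below, using the byte ranges from Figure 4. It assumes the asset can be identified by the request path alone and that the measurement point has already parsed each GET into a small record; the `RangeTransaction` type and both function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeTransaction:
    connection_id: int  # which TCP connection carried the GET
    asset: str          # request path, assumed here to identify the video asset
    first_byte: int
    last_byte: int      # inclusive, as in an HTTP Range header


def correlate_by_asset(transactions):
    """Group transactions by asset, regardless of the connection that carried them."""
    by_asset = defaultdict(list)
    for t in transactions:
        by_asset[t.asset].append(t)
    return dict(by_asset)


def bytes_covered(transactions):
    """Count distinct bytes requested, so overlapping ranges are not counted twice."""
    total, covered_to = 0, -1
    for first, last in sorted((t.first_byte, t.last_byte) for t in transactions):
        first = max(first, covered_to + 1)  # skip bytes already counted
        if last >= first:
            total += last - first + 1
            covered_to = last
    return total


if __name__ == "__main__":
    observed = [
        RangeTransaction(1, "/asset", 0, 100),
        RangeTransaction(2, "/asset", 300, 400),
        RangeTransaction(1, "/asset", 100, 200),
        RangeTransaction(2, "/asset", 9000, 9100),
    ]
    video = correlate_by_asset(observed)["/asset"]
    print(len(video), "transactions,", bytes_covered(video), "distinct bytes")
```

Keying on the asset rather than on the TCP connection is the design choice that lets later measurements (duration, stalls, display quality) be attributed to one video instead of to four unrelated flows.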

Routing Asymmetry
By design, all communications networks include routing asymmetry, an environment in which packets
take different routes between the same endpoints. 4

Suffice it to say that for the purposes of measuring video QoE, all routing asymmetry must be resolved;
this is the only way in which the full video stream can be seen, which is the only way that quality of
experience can be determined with any accuracy.

Measuring Progressive Video QoE


In a progressive video stream, a seek action can occur when a user moves a control (such as a slider)
to choose a different time or the client re-buffers on a stall; from a delivery standpoint, this
corresponds to selecting and delivering different byte-ranges. Consequently, the duration of a
progressive video stream can only be determined by keeping track of these seek actions dynamically
for all HTTP transactions (i.e., not just the initial GET on an HTTP flow).

Similarly, tracking these seek actions is the only way in which the transport quality of a progressive
flow can be measured.

To understand the display quality, the measurement system must be able to extract information from
the video container (recall Figure 1).

Figure 5 shows how a buffer stall can be detected by monitoring the actual and required progress of
the stream; when the actual progress drops below that which is required for the requested display
quality, the stream is forced to buffer.

Figure 5 - Measuring a buffer stall in a progressive video stream
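
The comparison of actual and required progress can be sketched as follows. This is a deliberately simplified player model, assumed for illustration only: delivery is observed as cumulative byte counts at fixed one-second intervals, the required rate is the target bitrate read from the container, and the player is taken to play an interval only when at least that much undelivered-but-unplayed media is available.

```python
def stalled_seconds(cumulative_bytes, bitrate_bps, step=1.0):
    """Estimate seconds spent buffering.

    cumulative_bytes: bytes delivered by the end of each `step`-second interval.
    bitrate_bps: target bitrate of the requested display quality, in bits per second.
    """
    played = 0.0   # seconds of media played so far (required progress)
    stalled = 0.0  # seconds spent waiting for data
    for delivered in cumulative_bytes:
        available = delivered * 8 / bitrate_bps  # seconds of media delivered (actual progress)
        if available - played >= step:
            played += step    # enough media is buffered to play this interval
        else:
            stalled += step   # actual progress fell below what playback requires
    return stalled


if __name__ == "__main__":
    # Assumed 8 Mbps stream (1,000,000 bytes per second of video); delivery dips in the third second.
    samples = [1_000_000, 2_000_000, 2_500_000, 3_000_000, 4_000_000, 5_000_000]
    print(stalled_seconds(samples, bitrate_bps=8_000_000))
```

In this example the stream delivers only half a second of media during the third second, so the model reports one second of buffering; a real client's buffer thresholds would differ, which is why the text relies on tracking the client's own behavior.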

4
A comprehensive explanation of routing asymmetry and its implications for network policy control is available in the Sandvine
whitepaper "Applying Network Policy Control to Asymmetric Traffic: Considerations and Solutions".


Measuring Adaptive Video QoE


Recall that in an adaptive stream the video client requests the first chunk, and starts playing it. If a
chunk takes too long to deliver, the client will request a lower-quality chunk in the next time interval.

To properly measure the delivered quality of adaptive video (i.e., the display quality and the transport
quality), the measurement solution must be both protocol-aware (e.g., HLS) and container-aware (to
see the resolution/bitrate/codec), and also must measure stalling in the backchannel from the client.

It is important to note that the size of the transfer is irrelevant, since it changes dynamically; only the
container information is valid.

Why TCP Packet Loss Doesn't Matter


There is a common misconception that TCP metrics can validly serve as either direct or proxy measures
for video quality of experience. However, due to the nature of TCP (i.e., it is designed to retransmit
and buffer), these metrics are not relevant to measuring video QoE. As a consequence, traditional test
and measurement methods are not effective for determining OTT video quality.

Since this misconception is both common and detrimental to CSPs' efforts to measure the quality
experienced by their network users, and is deliberately promoted by many solutions vendors, it is
worth explaining in greater detail.

In the earliest days of the Internet, TCP had no congestion management: it would simply go as fast as it
could with no throttle; if packets got dropped then they were retransmitted. In 1984, RFC 896
predicted the need for a congestion control algorithm, and this was implemented in 1987 by American
computer scientist Van Jacobson. The Van Jacobson algorithm used packet loss to signal that a link was
operating at maximum capacity and to cause the transmitter to slow.

In general, much confusion still exists regarding the significance of packet loss. The word "loss" has
immediate and almost exclusively negative connotations, so many people (even telecommunications
professionals) find it odd that in both normal and congested situations packet loss occurs on the
network. How can this be?

The answer lies in the behavior of TCP. Consider Figure 6, where a client application has a 100 Mbps
connection to its first hop, and a server has a 1 Gbps connection to its first hop. What does TCP do
when a file is fetched? Is 900 Mbps dropped, or 990 Mbps? In fact, the quantity of dropped packets is
negligible. The server starts accelerating, going faster and faster until a packet is dropped, and then it
slows down slightly. In this diagram, the achieved rate is 10 Mbps because the end-to-end connection
adjusts to the slowest link in the chain. Thus we cannot use TCP packet loss as a means of finding the
narrowest link, because packets stop being dropped once TCP has determined the rate of that link.

Figure 6 - Typical single Internet path


If we assume that the server shown in Figure 6 has a video stream which must be delivered at 12 Mbps
to avoid stalling, then there will still be no TCP packet loss to indicate the stalls. The only indication
will come from the higher-layer signaling in the client. If a quality measurement solution cannot detect
that signaling, then it is incapable of measuring this required dimension of video quality.
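
The arithmetic behind this example can be made explicit with a short sketch. It assumes, as the 10 Mbps achieved rate implies, that the path in Figure 6 contains a 10 Mbps link between the 1 Gbps and 100 Mbps first hops, and it ignores any start-up buffering; the function names are hypothetical.

```python
def achieved_rate_mbps(link_rates_mbps):
    """TCP settles at roughly the rate of the slowest link on the path."""
    return min(link_rates_mbps)


def stall_fraction(required_mbps, achieved_mbps):
    """Long-run share of viewing time spent stalled when delivery is slower than playback."""
    if achieved_mbps >= required_mbps:
        return 0.0
    # In T seconds the path delivers achieved/required * T seconds of media;
    # the remainder of the time the player has nothing to show.
    return 1.0 - achieved_mbps / required_mbps


if __name__ == "__main__":
    path = [1000, 10, 100]  # server hop, assumed middle link, client hop (Mbps)
    rate = achieved_rate_mbps(path)
    print(rate, "Mbps achieved;", round(stall_fraction(12, rate) * 100, 1), "% of time stalled")
```

Under these assumptions a 12 Mbps stream over a 10 Mbps path spends about one sixth of the viewing time stalled, even though TCP itself reports almost no loss once it has settled at the bottleneck rate.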

In fact, TCP packet loss occurs in any significant amount only when there are drops that are uncorrelated
with bitrate (e.g., when you are using your home WiFi and someone generates interference by turning on
their microwave or cordless phone).


Conclusion
Video, particularly video from over-the-top content providers, is coming to dominate the global
Internet. This video is primarily delivered in two ways: progressive streaming, in which sections of a
single file are delivered in bursts; and adaptive video, in which chunks of differing display quality are
delivered based upon the network's transport capabilities.

These different delivery mechanisms necessitate different mechanisms to measure video quality of
experience, but in both cases there are two dimensions that must be measured separately and then
considered together: display quality and transport quality.

To measure all the aspects of display quality and transport quality, a number of conditions must be
met, both at a fundamental level that applies to both delivery mechanisms (e.g., absence of routing
asymmetry) and at a level applying to each mechanism individually (e.g., ability to read the protocol
and container).

Despite widespread misunderstandings, metrics associated with TCP flows, including jitter and packet
loss, have no relevance to video quality of experience; in fact, these metrics are completely
uncorrelated with video QoE.

Requirements for a Video QoE Solution


After examining the many factors that must be considered when measuring video quality of experience,
from the perspective of the end viewer, a number of requirements emerge.

Consideration | Requirement | Explanation

QoE Metric | Should measure the QoE from the perspective of the viewer | The viewer's experience is most important
QoE Metric | Must measure both display quality and transport quality, and must integrate both into the QoE metric | There are two dimensions to video quality that in combination determine the QoE: display quality and transport quality
QoE Metric | Must not use any TCP jitter or packet loss | These metrics are irrelevant to video QoE, and entirely uncorrelated
QoE Metric | The QoE metric must be calculated differently for progressive streams and adaptive streams | Progressive streaming and adaptive streaming are fundamentally and profoundly different in behavior

Sessions | Measurements must be made throughout the full duration of a video session | Video sessions are generally long-lived, with variable quality
Sessions | The frequency of measurements must be sufficient to avoid sampling error | Sampling must be optimized to give accurate results without onerous performance overhead; practically, sampling at 15-second or 30-second intervals is appropriate
Sessions | The video QoE solution must be able to correlate delivery of the same asset across multiple HTTP GET transactions issued into the same, or different, TCP connections | What looks like a single video to the end user is actually split within and across TCP connections that must all be considered as a whole

Routing Asymmetry | The video QoE solution must be deployed in such a manner that it sees each video flow in a perfectly symmetrical manner; that is, routing asymmetry must be completely resolved from the perspective of the measuring device | To compile an accurate metric of video QoE, the entire video delivery must be considered

Progressive Video QoE | The video QoE solution must be able to keep track of seek actions (whether initiated by the user or the video client) dynamically for all HTTP transactions | Required to both measure the duration of a progressive video flow and to count buffer events
Progressive Video QoE | The video QoE solution must be able to extract display quality information from the video container | The container holds the display quality information, which is a required dimension of video streaming QoE

Adaptive Video QoE | The video QoE solution must be protocol aware | Required to measure the transport quality
Adaptive Video QoE | The video QoE solution must be able to extract display quality information from the video container | The container holds the display quality information, which is a required dimension of video streaming QoE
Adaptive Video QoE | Video QoE metric must not be based in any part on transferred bytes | Since the display quality (and bitrate) varies dynamically, the number of bytes transferred is irrelevant

Additional Resources
In addition to the resources cited in the footnotes throughout this document, please consider reading
the Sandvine technology showcase Video Quality of Experience Score, available on www.sandvine.com.
