An Industry Whitepaper
Contents

- Considerations for Measuring Video QoE
- Session Lifetime and Sampling Frequency
- Correlation Across Sessions
- Routing Asymmetry
- Measuring Progressive Video QoE
- Measuring Adaptive Video QoE
- Why TCP Packet Loss Doesn't Matter
- Conclusion
- Requirements for a Video QoE Solution
- Additional Resources

During periods of peak utilization, resources on the network are more prone to congestion; from the viewer's perspective, congestion can very visibly manifest as degradation in video streaming quality. Therefore, measuring the customer's quality of experience (QoE) requires measurement of video QoE.

OTT video is delivered via two primary streaming mechanisms: progressive video, in which sections of a single file are delivered in bursts; and adaptive video, in which chunks of differing display quality are delivered based upon the network's transport capabilities.

From a transport-layer perspective, HTTP over TCP is the dominant transport mechanism, introducing two primary dimensions to video quality: display quality (fidelity) and transport quality (stalling). Perhaps counterintuitively, video QoE has no correlation to (i.e., is completely unrelated to) TCP performance metrics (e.g., jitter and packet loss).
The popularity of video in general, and OTT video in particular, is responsible for two fundamental shifts in consumer behavior:
1. Higher peak bandwidth levels: since OTT video is an on-demand application, it drives traffic when it is viewed, which typically peaks in the range of 8pm to 11pm; previously, video content was often acquired in bulk via P2P networks and then consumed later
2. Heightened subscriber sensitivity to quality: video is a sensory experience with rapidly changing sights and sounds, so shifts in quality (e.g., stalls, pixelization, compression artifacts, shifts up or down in resolution, changes in frame rate) are instantly recognized by the viewer
From the CSP perspective, OTT video is decreasing network efficiency in the macro sense, in that the peak-to-trough ratio is increasing and there is a large amount of available but unutilized capacity throughout the day. During these high peaks, the network is more prone to congestion; from the viewer's perspective, congestion can very visibly manifest as degradation in video streaming quality.
To understand how to measure the video quality of experience (QoE) from the viewer's perspective, one must first understand how OTT video is delivered.
There are two primary forms of streaming video on the Internet today: progressive and adaptive. While
there are other video methods available (e.g., RTP multicast over UDP), they are not in common use
today as a delivery mechanism for OTT video.
Progressive Video
In the progressive method, a single large file is burst-paced into the network, with no consideration
for available bandwidth.
In this scenario, the media asset is a single video file stored on a server and HTTP byte-ranges are
used for seeking through the video. Each byte range may either come in the same TCP connection or a
new connection.
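The byte-range mechanism can be illustrated with a short sketch; the offsets and the 2 Mbps average bitrate below are hypothetical values chosen for the example:

```python
# Minimal sketch of how a progressive-video client seeks with HTTP byte
# ranges. Any HTTP server that honors Range requests behaves this way.

def range_header(start, end=None):
    """Build an HTTP Range header requesting bytes [start, end] (inclusive),
    or from `start` to end-of-file when `end` is None."""
    suffix = "" if end is None else str(end)
    return {"Range": "bytes={}-{}".format(start, suffix)}

# Seeking to roughly the two-minute mark of a 2 Mbps video means skipping
# about 2e6 / 8 * 120 = 30,000,000 bytes:
print(range_header(30_000_000))   # {'Range': 'bytes=30000000-'}
print(range_header(0, 1023))      # {'Range': 'bytes=0-1023'}
```

A QoE measurement system observing these headers on the wire can reconstruct the viewer's seek behavior from the requested offsets.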
For situations in which videos are available with different display quality choices (each of which
corresponds to a different bitrate), the video client (or user) manually selects a particular resolution,
which corresponds to a different file on the server. The result is that two users watching the same
video at different resolutions are actually watching different files.
Once chosen, this display quality is constant, regardless of the network's ability to transport sufficient data to maintain stall-free playback.
If one were to look at a cross-section of a progressive video stream, it would look like Figure 1; the container includes information relevant to the video file and content, and the elementary streams correspond to video, audio, subtitles, etc.
1. Information about worldwide traffic composition is available from several good sources, including Cisco's Visual Networking Index (VNI), Akamai's State of the Internet studies, and Sandvine's own Global Internet Phenomena Reports.
Measuring Internet Video Quality of Experience from the Viewer's Perspective
More specifically, the following information relevant to determining QoE (and also additional business
intelligence) is available from each layer:
[Figure 1: Layers of a progressive video stream: IP | TCP | HTTP | Container (Resolution, Target bitrate, Codec-Type, ...) | Elementary Streams]
Adaptive Video
Quite different from the progressive method, adaptive video modulates the display quality based on the network's available transport capacity (i.e., bandwidth). It achieves this effect by fetching chunks of the video in a piecewise manner; the chunk chosen is of the maximum deliverable display quality. At the beginning, the video client requests the first chunk and starts playing it. If this chunk takes too long to deliver, then the next chunk will be requested at a lower display quality; if this initial chunk is delivered especially quickly, then the next chunk will be requested at a higher display quality.
An illustration of an adaptive video flow in flight is provided in Figure 2. In this example, the viewer
gets two chunks of low definition video, followed by two chunks of standard definition, then two of
high definition, and finally two again of standard, all in response to the dynamically-changing transport
capacity of the network.
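This rate-adaptation loop can be sketched as follows. Real players use smoothed throughput estimates and buffer-occupancy heuristics; the three-step bitrate ladder below is an assumed example:

```python
# Toy sketch of adaptive-video rate selection: the next chunk's quality is
# chosen from the measured throughput of the previous chunk.

LADDER = [400_000, 1_500_000, 5_000_000]  # assumed bitrate ladder, bits/sec

def next_quality(current, chunk_bits, download_secs):
    """Return the ladder index to use for the next chunk."""
    throughput = chunk_bits / download_secs          # bits per second
    if throughput < LADDER[current]:                 # chunk arrived too slowly
        return max(current - 1, 0)                   # step down
    if current + 1 < len(LADDER) and throughput > LADDER[current + 1]:
        return current + 1                           # ample headroom: step up
    return current                                   # otherwise hold steady

# A 6 Mbit chunk delivered in 1 s implies 6 Mbps of available capacity:
print(next_quality(1, 6_000_000, 1.0))  # -> 2 (step up to 5 Mbps)
print(next_quality(1, 6_000_000, 8.0))  # -> 0 (0.75 Mbps, step down)
```

The observable consequence on the network is exactly the pattern in Figure 2: consecutive chunks at different display qualities within one session.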
The most common adaptive video methods are Microsoft Smooth Streaming (delivered via Silverlight), Apple HTTP Live Streaming (HLS), and Adobe's Real-Time Messaging Protocol (RTMP). They each operate in a similar fashion: the video asset is stored on disk at multiple target bitrates/resolutions. Each chunk covers a short span of playback time, so its size on the wire depends on the display quality.
In the same fashion as progressive video, each chunk may either come in the same TCP connection or a
new connection.
From a cross-section perspective, adaptive video is similar to progressive, with one key difference: an
additional protocol layer that gives parameters for seeking chunks, obtaining meta-data, etc.
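For example, in HLS this additional layer is an M3U8 playlist that advertises the available bitrates and resolutions to the client; the URIs and values below are hypothetical:

```text
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
high/index.m3u8
```

A protocol-aware measurement system that parses this layer learns the full quality ladder, not just the chunk currently in flight.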
More specifically, the following information relevant to determining QoE (and also additional business
intelligence) is available from each layer:
[Figure: Layers of an adaptive video stream: IP | TCP | HTTP | Container protocol (e.g. HLS, RTMP, ...) | Container (Resolution, Target bitrate, Codec-Type, ...) | Elementary Streams]
As a consequence of this transport mechanism, there are two primary dimensions to video quality:
1. Display quality (fidelity): is the image quality sufficient for the device's screen size?
2. Transport quality (stalling): how long does the video take to start, and does it play smoothly?
For a representative assessment of QoE from the viewers perspective, a quality measurement must
take into account both of these dimensions.
The different nature of content delivery between progressive and adaptive video streams requires that
the quality of experience measurement be performed differently.
Session Lifetime and Sampling Frequency
However, the question of how many measurements to take (or, more accurately, with what frequency to take the measurements) must be answered. Measurements taken too frequently will incur processing overhead with diminishing returns; measurements taken too infrequently will fail to accurately capture the quality of experience and may fall prey to sampling error.3
In practice, to avoid such error there must be two or more measurements per chunk or piece of video; this can typically be achieved by sampling at a 15-second or even 30-second interval.
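This rule amounts to sampling no slower than half the chunk (or burst) duration, in the spirit of the Nyquist rate; a minimal sketch:

```python
# Sketch of the sampling rule above: at least two measurements per chunk
# means the interval between measurements is at most half the chunk duration.

def max_sampling_interval(chunk_duration_secs):
    """Longest safe interval between QoE measurements for a given chunk length."""
    return chunk_duration_secs / 2.0

# 30-second bursts can be sampled every 15 seconds;
# 10-second adaptive chunks need a 5-second interval.
print(max_sampling_interval(30.0))  # -> 15.0
print(max_sampling_interval(10.0))  # -> 5.0
```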
[Figure: a sequence of HTTP 200 OK responses with Content-Range headers (e.g., bytes 0-100, 100-200, 300-400, 9000-9100), showing a progressive client seeking through a single file via byte-range requests]
2. Even for videos where this statement is not necessarily true in the strictly absolute sense (e.g., for a three-minute YouTube clip), it is still valid in the relative sense; that is, even a three-minute video is much longer than most Internet sessions and will likely be delivered in many bursts (progressive) or chunks (adaptive).
3. Intrepid readers can learn more at: http://en.wikipedia.org/wiki/Nyquist_rate#Nyquist_rate_relative_to_sampling
Routing Asymmetry
By design, all communications networks include routing asymmetry: an environment in which packets take different routes between the same endpoints.4
Suffice it to say that, for the purposes of measuring video QoE, all routing asymmetry must be resolved; this is the only way in which the full video stream can be seen, which in turn is the only way that quality of experience can be determined with any accuracy.
Similarly, tracking these seek actions is the only way in which the transport quality of a progressive
flow can be measured.
To understand the display quality, the measurement system must be able to extract information from
the video container (recall Figure 1).
Figure 5 shows how a buffer stall can be detected by monitoring the actual and required progress of
the stream; when the actual progress drops below that which is required for the requested display
quality, the stream is forced to buffer.
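The detection logic of Figure 5 can be sketched as follows; the startup-buffer allowance is an assumed parameter, since real players vary:

```python
# Sketch of buffer-stall detection: compare the bytes actually delivered
# against the bytes required to sustain playback at the selected bitrate.

def is_stalled(bytes_delivered, elapsed_secs, bitrate_bps,
               startup_buffer_secs=2.0):
    """True when delivery has fallen behind what smooth playback requires.

    `startup_buffer_secs` models the client's initial buffer (an assumed
    value; real players differ).
    """
    playback_secs = max(elapsed_secs - startup_buffer_secs, 0.0)
    bytes_required = bitrate_bps / 8 * playback_secs   # bits -> bytes
    return bytes_delivered < bytes_required

# A 4 Mbps stream, 10 s in: smooth playback needs 4e6/8 * 8 = 4,000,000 bytes
print(is_stalled(5_000_000, 10.0, 4_000_000))  # False: ahead of required pace
print(is_stalled(3_000_000, 10.0, 4_000_000))  # True: the buffer has drained
```

The required-progress line comes from the container information (target bitrate), which is why a stall detector must be container-aware.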
4. A comprehensive explanation of routing asymmetry and its implications for network policy control is available in the Sandvine whitepaper Applying Network Policy Control to Asymmetric Traffic: Considerations and Solutions.
Measuring Adaptive Video QoE
To properly measure the delivered quality of adaptive video (i.e., the display quality and the transport quality), the measurement solution must be both protocol-aware (e.g., HLS) and container-aware (to see the resolution/bitrate/codec), and must also measure stalling in the backchannel from the client.
It is important to note that the size of the transfer is irrelevant, since it changes dynamically; only the container information is valid.
Why TCP Packet Loss Doesn't Matter
Since this misconception (that TCP performance metrics such as jitter and packet loss reflect video QoE) is both common and detrimental to CSPs' efforts to measure the quality experienced by their network users, and is deliberately promoted by many solutions vendors, it is worth explaining in greater detail.
In the earliest days of the Internet, TCP had no congestion management: it would simply go as fast as it could, with no throttle; if packets were dropped, they were retransmitted. In 1984, RFC 896 predicted the need for a congestion control algorithm, and one was implemented in 1987 by American computer scientist Van Jacobson. The Van Jacobson algorithm used packet loss to signal that a link was operating at maximum capacity and to cause the transmitter to slow down.
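The effect of loss-signaled control can be illustrated with a toy model. This is not Jacobson's actual algorithm (which adjusts a congestion window rather than a rate directly), but it shows the characteristic sawtooth around the bottleneck:

```python
# Toy model of loss-signaled congestion control (additive-increase,
# multiplicative-decrease): the sender probes upward until the bottleneck
# drops a packet, then backs off, oscillating around the bottleneck rate.

def aimd_rates(bottleneck_bps, steps, increase_bps=1_000_000):
    """Return the sending rate over `steps` rounds against a fixed bottleneck."""
    rate, history = increase_bps, []
    for _ in range(steps):
        history.append(rate)
        if rate > bottleneck_bps:   # exceeding the bottleneck -> packet loss
            rate /= 2               # multiplicative decrease on loss
        else:
            rate += increase_bps    # additive increase while loss-free
    return history

rates = aimd_rates(10_000_000, 40)
# The sender spends most rounds at or below the 10 Mbps bottleneck, so the
# volume of lost packets is negligible even though loss is the only signal.
```

This is why packet loss on a healthy TCP connection is normal, and why its quantity says almost nothing about the application's quality of experience.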
In general, much confusion still exists regarding the significance of packet loss. The word "loss" has immediate and almost exclusively negative connotations, so many people (even telecommunications professionals) find it odd that packet loss occurs on the network in both normal and congested conditions. How can this be?
The answer lies in the behavior of TCP. Consider Figure 6, where a client application has a 100 Mbps connection to its first hop, and a server has a 1 Gbps connection to its first hop. What does TCP do when a file is fetched? Is 900 Mbps dropped, or 990 Mbps? In fact, the quantity of dropped packets is negligible. The server accelerates, going faster and faster until a packet is dropped, and then slows down slightly. In this diagram, the achieved rate is 10 Mbps because the end-to-end connection adjusts to the slowest link in the chain. Thus we cannot use TCP packet loss as a means of finding the narrowest path, because TCP stops dropping packets once it determines the rate of the narrowest link.
If we assume that the server shown in Figure 6 has a video stream which must be delivered at 12 Mbps
to avoid stalling, then there will still be no TCP packet loss to indicate the stalls. The only indication
will come from the higher-layer signaling in the client. If a quality measurement solution cannot detect
that signaling, then it is incapable of measuring this required dimension of video quality.
In fact, TCP packet loss occurs in any significant amount only when there are drops uncorrelated with bitrate (e.g., when you are using your home WiFi and someone generates interference by turning on a microwave or cordless phone).
Conclusion
Video, particularly video from over-the-top content providers, is coming to dominate the global Internet. This video is primarily delivered in two ways: progressive streaming, in which sections of a single file are delivered in bursts; and adaptive video, in which chunks of differing display quality are delivered based upon the network's transport capabilities.
These different delivery mechanisms necessitate different mechanisms to measure video quality of
experience, but in both cases there are two dimensions that must be measured separately and then
considered together: display quality and transport quality.
To measure all the aspects of display quality and transport quality, a number of conditions must be met: some at a fundamental level that applies to both delivery mechanisms (e.g., resolution of routing asymmetry), and some at a level specific to each mechanism (e.g., the ability to read the protocol and container).
Despite widespread misunderstandings, metrics associated with TCP flows, including jitter and packet
loss, have no relevance to video quality of experience; in fact, these metrics are completely
uncorrelated with video QoE.
Additional Resources
In addition to the resources cited in the footnotes throughout this document, please consider reading
the Sandvine technology showcase Video Quality of Experience Score, available on www.sandvine.com.