Mark Thompson
With the advancement of camera and display technology available to consumers, the progression of
digital imagery to a quality approaching the limits of the human visual system (HVS) is closer than
ever before. The technological race to capture and display images with higher resolutions has
already taken place (modern camera resolutions far exceed the resolution of current display
technology) culminating in the development of UHD (ultra-high definition) and 4K displays by most
television manufacturers. Parallel to these developments, over the last ten years the introduction of
high dynamic range (HDR) in television displays reflects the desire to reproduce imagery with a
broader colour gamut and dynamic range of luminance.
The ability to capture HDR images [Mann & Picard, 1994] and video [Kang et al, 2003] has been
possible with even consumer grade cameras for a number of years, and current generation
computer graphics cards have been processing and reproducing HDR imagery since 2014. Although
reference displays capable of HDR imagery have existed for over a decade [Seetzen et al, 2004],
manufacturers have only recently developed HDR ready displays for the average consumer. Prior to
this, standard (or low) dynamic range (SDR/LDR) was the accepted standard [ITU-REC BT 709-6,
2015] for consumer high definition television images (HDTV – not to be confused with HDR). The
extended dynamic range and colour gamut of HDR imagery are not fully realised on SDR displays such as
CRT or plasma monitors, and only specifically designed HDR displays exhibit the advantage of an HDR
signal [Seetzen et al, 2004]. The extended dynamic range of these displays yields brighter whites,
deeper blacks and an increased separation of highlights and shadows.
The following report outlines the specifics of the extended colour and dynamic range of HDR
standard image parameter values [ITU-REC BT 2020-2, 2015] [ITU-REC BT 2100-2, 2018], with
comparison to SDR. The display methods of HDR video signals [Myszkowski et al, 2008: 4] will also be
addressed, with an overview and comparison of the design models and formats.
In order to draw accurate comparisons between SDR and HDR, and between the systems by which
HDR is realised, a number of terms must be defined. Luminance describes a measure of luminous
intensity per unit area, weighted by the sensitivity of human vision to each wavelength of light
[Myszkowski et al, 2008: 2]. Luminance is generally given in candela per square metre (cd/m²), or
in the non-SI unit ‘nits’. For reference, a typical LCD monitor may have a luminance ranging from
0.5 cd/m² (0.5 nits) to 500 cd/m² (500 nits), while a dark night and the sun have luminances of
approximately 3×10⁻⁵ cd/m² and 2×10⁶ cd/m² respectively [Kunkel & Reinhard, 2010].
Naturally when considering HDR, dynamic range is quoted more often than luminance, as HDR does
not always assess absolute luminance levels of a scene [Myszkowski et al, 2008: 10] [Hulusic, 2016:
2]. Displayed dynamic range refers to the span in luminance from the brightest white to the
darkest black (normalised black level), and is expressed as a ratio, or in f-stops, equal to the base-2
logarithm of that ratio [Borer & Cotton, 2015: 1]. Displayed dynamic range is more relevant than
signal dynamic range when considering the reproduction of HDR images, as the black level will be set
using PLUGE based on the ambient light conditions [BBC, 2016].
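The relationship between a contrast ratio and its f-stop equivalent can be sketched in a few lines of Python. The function name and the example luminances below are illustrative choices rather than part of any cited standard.

```python
import math

def dynamic_range_stops(peak_nits: float, black_nits: float) -> float:
    """Return the displayed dynamic range in f-stops (base-2 log of the contrast ratio)."""
    if peak_nits <= 0 or black_nits <= 0:
        raise ValueError("luminances must be positive")
    return math.log2(peak_nits / black_nits)

# Illustrative values only: a 1000-nit peak white over a 0.005-nit black level.
ratio = 1000 / 0.005
stops = dynamic_range_stops(1000, 0.005)
print(f"{ratio:,.0f}:1 is about {stops:.1f} f-stops")
```

Each additional f-stop corresponds to a doubling of the contrast ratio, which is why modest-looking differences in stops represent large differences in luminance range.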
In order to determine the dynamic range in practice, the bit-depth in comparison to the range of
luminance must be considered, as quantisation artefacts such as banding may result [Seetzen et al,
2004]. Where the number of quantisation steps over a contrast range is too small, and the luminance
ratio between adjacent steps rises above the just-noticeable difference (JND) of approximately 2%
(the eye is less sensitive at lower luminances), banding in the image will be visible [Borer & Cotton, 2015: 2].
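As a rough illustration of why bit depth bounds the usable dynamic range, the sketch below assumes an idealised, purely logarithmic coding in which every code step is exactly one JND of 2%. This is a simplification: real transfer functions are not purely logarithmic, and the JND is larger at low luminances, so the figures are indicative only. The function name is hypothetical.

```python
import math

def max_banding_free_ratio(bit_depth: int, jnd: float = 0.02) -> float:
    """Largest contrast ratio coverable when every code step is exactly one JND.

    Assumes an idealised logarithmic coding: each successive code value
    multiplies luminance by (1 + jnd).
    """
    steps = 2 ** bit_depth - 1  # number of intervals between adjacent code values
    return (1.0 + jnd) ** steps

for bits in (8, 10, 12):
    ratio = max_banding_free_ratio(bits)
    print(f"{bits}-bit: about {math.log2(ratio):.1f} f-stops")
```

Under these assumptions an 8-bit signal covers a ratio in the low hundreds to one, the same order of magnitude as the SDR figures quoted in this report.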
The most obvious distinction between HDR and SDR relates to their dynamic ranges, both as
described in the International Telecommunication Union recommendations, and with respect to
practical application. Conventional 8-bit SDR has an approximate range in luminance between 0.5
and 120 nits, with a reference peak luminance quoted at 100 nits for CRT displays [ITU-REC BT 709-6,
2015]. This relates to a dynamic range of approximately 100:1, or 6-7 f-stops. Because the luminance
ratios between quantisation steps exceed JND thresholds at low luminances, the usable display dynamic
range of 8-bit SDR is closer to 5 or 6 stops if artefacts are to be limited [Borer & Cotton, 2015: 2]. 10-bit
professional SDR video supports a dynamic range closer to 10 f-stops [Borer & Cotton, 2015: 2].
The display dynamic range for reference viewing of HDR images is outlined in ITU-REC BT 2100-2,
with a peak luminance of at least 1000 nits, and a black level of less than 0.005 nits. Equivalent to a
dynamic range of 200,000:1 or 17.6 f-stops, the display dynamic range parameters for HDR exceed
the 14 f-stops of dynamic range related to human steady-state vision [Kunkel & Reinhard, 2010: 17].
In comparison to SDR, HDR has an improvement of approximately 8 stops above the black point, and
4 stops below the reference white point. The 10-bit and 12-bit coding of HDR formats produce
luminance ratios between quantisation steps (mostly) below JND thresholds, limiting the potential
for quantisation related artefacts.
Video system bit-coding also affects the precision of the colour gamut in SDR and HDR images. The
restriction of consumer SDR to three 8-bit integer colour channels [Myszkowski et al, 2008: 1] limits
SDR to less than 256 levels of RGB precision per channel, equivalent to a dynamic range of
approximately 100:1 before introducing banding artefacts [BBC R&D, 2016]. The 10 or 12-bit coding
established in ITU-REC BT 2100 provides a minimum precision of 1015 (not 1024 due to timing
Figure 2: JND steps for maximum luminances based on Barten's model [Barten, 1993] [Seetzen et al, 2004]
Standardised HDR capable displays are able to reproduce a wider array of colours with improved
saturation over the colour gamut of SDR displays. The wider colour gamut permits a more realistic
representation of images, particularly at very bright and very dark hues [Schulte & Barsotti, 2016: 5].
The theoretical colour space outlined in BT 709 for HDTV (SDR) is provided below in table 1, and
is also displayed in figure 3 with reference to the CIE 1931 colour space chromaticity diagram. The
CIE 1931 colour space is a reference to the complete colour perception of an average human, and
the ITU-REC BT 709 colour space comprises only a portion (≈33.5%) of the CIE colour space [Schulte
& Barsotti, 2016: 1]. The system colourimetry parameters outlined by ITU-REC BT 2020 for UHDTV
(and HDR) results in a much larger colour space than BT 709 (figure 3), representing approximately
57.3% of the CIE 1931 colour space [Schulte & Barsotti, 2016: 1].
BT 2020 may not encompass the complete range of human colour perception, though the BT 2020
colour space does cover 99.9% of Pointer’s gamut – a replication of the diffuse surface colours
observed by humans in nature (figure 4) [Schulte & Barsotti, 2016: 6].
The process of capturing and displaying HDR imagery can be described through a series of transfer
functions: the initial capture of relative scene luminance by the camera, the opto-electronic transfer
function (OETF) that encodes that luminance as an HDR signal, and the electro-optic transfer function
(EOTF) that decodes the HDR signal for native visualisation of the original signal. The applied OETF,
however, is not linearly related to the EOTF, and a non-linear opto-optic transfer function (OOTF)
compensates to allow consistent end-to-end HDR image reproduction [Borer & Cotton, 2015: 3]. This
can be expressed by the display luminance being equal to the (linear) HDR signal raised to the power
of the system gamma (OOTF).
Figure 5: General process of HDR image capture to native visualisation of the HDR signal [Borer & Cotton, 2015: 3]
Modern displays generally cannot reproduce the original scene luminance and, as a result, without the
reference OOTF (system gamma) would display a low-contrast representation of the HDR content.
Figure 6 indicates the adjustment made to the extended high and low values of luminance:
compressing the dynamic range of the image capture, enhancing the mid-range contrast, and
mapping the signal to the available dynamic range of the display [ITU-R BT.2390-3, 2017: 9]. Where
the dynamic range is compressed for limited bit-depth signals, the visibility of quantisation related
artefacts for the system is reduced [Borer & Cotton, 2015: 3].
For display-independent HDR, OOTF adjustments accommodate a difference in displayed tonal
perception from the original HDR scene content due to the viewing environment. A brighter ambient
luminance can be compensated for by raising the HDR signal black level, and applying a knee to the
system gamma can produce a more representative output from a display with a limited maximum
brightness [ITU-R BT.2390-3, 2017: 12]. Limitations to the colour gamut of the display can be
accounted for by colour gamut mapping [ITU-R BT.2390-3, 2017: 12]. These display alterations take
place on the display side of the system chain.
The system architecture for HDR video is separated based on the position of the rendering intent
(OOTF) relative to the EOTF. The Perceptual Quantiser (PQ) system has rendering intent
superimposed with the OETF, and the Hybrid Log-Gamma (HLG) system superimposes OOTF with
EOTF [ITU-R BT.2390-3, 2017: 8]. Both of these standards are included within ITU-REC BT 2100,
though each represents a different approach to the display of HDR video.
The EOTF of PQ is described by SMPTE ST 2084 (figure 9), with a non-linear function based not on the
power law gamma curves of HDTV (ITU-REC BT 709), but instead designed to correspond to the HVS
model outlined by Barten [1993], with an increased bit-rate efficiency across HDR luminance ranges
of 5×10⁻⁵ nits up to 10,000 nits [Schulte & Barsotti, 2016: 4]. The video signal for PQ is given by the
absolute luminance of the mastering reference display (display referred) quantised to a bit depth of
10 or 12 bits [BBC R&D, 2016]. With 10-bit quantising, quantisation artefacts are produced
at luminances approaching 10,000 nits, though masking occurs due to camera noise levels [ITU-R
BT.2390-3, 2017: 16].
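For orientation, the PQ curve can be sketched in Python as below. The constants are those commonly published for SMPTE ST 2084 and BT 2100 and are not quoted in this report, so they should be checked against the standard before use. The function names are illustrative, and the sketch handles a single normalised component with no quantisation or narrow-range scaling.

```python
# PQ constants as commonly published for SMPTE ST 2084 / BT 2100 (assumed here).
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # absolute luminance represented by a full-scale signal

def pq_eotf(signal: float) -> float:
    """Map a non-linear PQ signal in [0, 1] to absolute display luminance in nits."""
    p = signal ** (1 / M2)
    return PEAK_NITS * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)

def pq_inverse_eotf(nits: float) -> float:
    """Map absolute luminance in nits (0 to 10,000) to a PQ signal in [0, 1]."""
    y = (nits / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2

# Signal level that an SDR-like 100-nit white would occupy on the PQ scale.
print(f"100 nits -> PQ signal {pq_inverse_eotf(100):.3f}")
```

Because the signal encodes absolute luminance, the same code value is intended to produce the same number of nits on any compliant display, which is the display-referred behaviour described above.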
As an absolute video signal, the quantised luminance of the signal is mapped utilising an identical
EOTF for the reference display and the non-reference display. In this way the luminance and
chromaticity can be near-perfectly reproduced provided the non-reference display is compatible
with the video signal dynamic range and colour gamut, and the same display settings, environment
and ambient illumination are experienced [ITU-R BT.2390-3, 2017: 16]. Due to the display-referred
nature of PQ video signals, static metadata is required in order to describe the luminance and
chromaticity of the reference display [Schulte & Barsotti, 2016: 5], allowing display mapping of the
signal to the non-reference display (figure 6) – though this will not accurately reproduce the
reference signal.
HLG system signal levels are proportional to the output of the camera sensor (scene-referred,
relative signal), then normalised with reference to the white level [Borer & Cotton, 2015: 4]. Figure
10 illustrates the non-linear OETF, with the parameters of the function standardised in ITU-REC BT
2100. For the bit-coding efficiency necessary to prevent quantisation artefacts, HLG utilises a
logarithmic OETF where the JND steps in luminance are constant (higher luminances), and a gamma law
OETF curve at low luminances where the JND threshold is less sensitive [ITU-R BT.2390-3, 2017: 23].
HLG supports a dynamic range from 0.01 nits to 2000 nits (17.6 f-stops) from a 10-bit signal with
minimal visible quantisation artefacts [Borer & Cotton, 2015: 8].
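The hybrid shape of the HLG OETF can be sketched as follows. The constants and the 1/12 breakpoint are those commonly published for BT 2100 and are assumptions here, since this report does not list them. The function name is illustrative, and the input is a single scene-light component normalised to the range 0 to 1.

```python
import math

# HLG constants as commonly published for BT 2100 (assumed, not quoted in this report).
A = 0.17883277
B = 1 - 4 * A
C = 0.5 - A * math.log(4 * A)

def hlg_oetf(scene: float) -> float:
    """Map normalised linear scene light in [0, 1] to a non-linear HLG signal in [0, 1]."""
    if scene <= 1 / 12:
        # Gamma-law (square-root) segment for low luminances.
        return math.sqrt(3 * scene)
    # Logarithmic segment for higher luminances.
    return A * math.log(12 * scene - B) + C

for e in (0.0, 1 / 12, 0.5, 1.0):
    print(f"scene {e:.4f} -> signal {hlg_oetf(e):.4f}")
```

The two segments meet at a signal level of 0.5, so the lower half of the code range follows a conventional camera-style gamma curve while the upper half is spent on highlights.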
As HLG signals are scene-referred, with an appropriate EOTF and rendering intent any monitor can
display an image representative of the HDR signal (display independent) without the need for
content related metadata [Borer & Cotton, 2015: 4]. This is achieved by adjusting the system gamma
applied to the normalised scene luminance dependent on nominal display peak luminances.
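A minimal sketch of this adjustment is given below, assuming the reference relation commonly published for BT 2100 (a system gamma of 1.2 at a nominal peak of 1000 nits, adjusted logarithmically with peak luminance). The relation is not quoted in this report and is intended for a limited range of peak luminances. The sketch treats luminance only, ignores black-level lift and ambient compensation, and uses illustrative function names.

```python
import math

def hlg_system_gamma(peak_nits: float) -> float:
    """System gamma as a function of nominal display peak luminance (assumed BT 2100 relation)."""
    return 1.2 + 0.42 * math.log10(peak_nits / 1000.0)

def display_luminance(scene_luminance: float, peak_nits: float) -> float:
    """Apply the system gamma to normalised scene luminance in [0, 1]; result in nits."""
    return peak_nits * scene_luminance ** hlg_system_gamma(peak_nits)

for peak in (500, 1000, 2000):
    gamma = hlg_system_gamma(peak)
    mid = display_luminance(0.18, peak)
    print(f"peak {peak} nits: gamma {gamma:.3f}, 18% scene grey shown at {mid:.1f} nits")
```

Brighter displays receive a higher gamma and dimmer displays a lower one, which is how a single scene-referred signal can be adapted without content metadata.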
Figure 10: HLG OETF curve relative to SDR power gamma OETF curve (BT 2020) [ITU-R BT.2390-3, 2017: 24]
PQ and HLG were each designed with independent motivations and objectives in mind. The
strength of PQ HDR systems is a result of its development for film content, where the display
dynamic range and ambient viewing environment will be more similar to the reference display and
environment. In these conditions the extension of shadow and highlight detail can truly resemble
the artistic intent of the content. This sacrifices direct compatibility with varying displays and
ambient environments, requiring metadata and display mapping. Due to the need for display
mapping and metadata, PQ HDR content has limited compatibility with SDR displays [ITU-R BT.2390-3,
2017: 29]. This is not the case for HLG, developed by the BBC and NHK for suitability with live
broadcast television, where display independence and a matched OETF curve allow direct
compatibility with SDR displays (with BT 2020 colour spaces), albeit with variations in specular
highlight reproduction [ITU-R BT.2390-3, 2017: 29]. Video format conversion is required to display
PQ and HLG HDR signals on SDR displays with BT 709 colour spaces.
There are four delivery formats based on the defined HDR systems: HDR10, Dolby Vision, SL-HDR1
(all of which utilise the PQ system) and HLG10 (which utilises the HLG system). HDR10 is the adopted
delivery format of the Blu-ray Disc Association, the High Definition Multimedia Interface (HDMI)
Forum and the UHD Alliance, and is an open format supported by a variety of companies [Schulte &
Barsotti, 2016: 10]. The HDR10 media profile implements a 10-bit depth, 4:2:0 base video signal
format, with static metadata (dynamic metadata to be introduced for HDR10+ with SMPTE ST 2094)
and a colour space equivalent to the BT 2020 standard [Schulte & Barsotti, 2016: 11]. HDR10 is
currently the most widespread format, supported by many manufacturers and companies.
Dolby Vision, developed and licensed by Dolby Labs, also uses the PQ EOTF but, unlike HDR10,
utilises 12-bit coding and dynamic metadata for scene-by-scene luminance, chromaticity, colour
volume, black level and peak luminance parameters [Schulte & Barsotti, 2016: 12]. Dolby Vision
utilises a dual-layer profile with SDR and HDR10 compatible 10-bit base layers with a 2-bit
enhancement layer, and a single 10-bit layer profile suited to live broadcast applications [Schulte &
Barsotti, 2016: 12]. These features have the potential to provide the greatest image quality of the
different formats, though without 12-bit consumer displays capable of dynamic ranges approaching
28 f-stops, the extent of Dolby Vision is not realised.
SL-HDR1 is a format developed by Technicolor and Philips, and is designed to allow 10-bit HDR video
signals to be distributed via 8-bit SDR networks. This is achieved by decomposition and encoding of
the HDR signal to SDR, followed by the decoding and reconstruction of the HDR video signal for
display on HDR capable devices [ETSI TS 103 433, 2016]. Where an SDR display is utilised, the
dynamic metadata allowing reconstruction of the HDR signal is extraneous [Schulte & Barsotti, 2016:
16].
The HLG10 format is the royalty-free delivery system which utilises the HLG OETF curve, with 10-bit
coding and support of the BT 2020 colour space [Schulte & Barsotti, 2016: 12]. As described
previously, the HLG based delivery system does not require metadata and is designed for live HDR
television broadcast.
Conclusion
With an increased capability to capture and display images with colour gamut and dynamic ranges
which closely reflect the HVS, the preference by end-users for HDR video over SDR [Mukherjee et al,
2016] has resulted in the development of an increasing number of HDR system architectures and
delivery formats. Each of these systems has specific design properties with advantages and
disadvantages in relation to their intended use. The capabilities of HDR content available to
consumers have continued to progress to a point where the remaining limitations are due to the
environment in which the content is displayed and to the human visual system.
The most significant increase of dynamic range from SDR video is a consequence of reduced black
points, with companies such as LG quoting black points at zero with their HDR OLED displays and
infinite contrast ratios. Specular reflections and ambient light prevent a black point at true zero
[Banterle et al, 2011: 6], and an ambient luminance of 5 nits is outlined for HDR in BT 2100. The
colour space outlined for HDR (BT 2020) already approximates Pointer’s gamut, and
available dual-modulation reference displays produce luminance ranges between 0.004 and
20,000 nits [ITU-R BT.2390-3, 2017: 4], exceeding the steady-state dynamic range of the HVS by
more than 8 f-stops. Some further advancement to HDR will undoubtedly take place in the near
future; however the development ceiling based on the limitations of the HVS is approaching.