A Performance Evaluation of MPEG-21 BSDL in The Context of H.264/AVC
Wesley De Neve+, Sam Lerouge+, Peter Lambert+, and Rik Van de Walle*
* Ghent University, Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium
+ Ghent University - IMEC, Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium
ABSTRACT
H.264/AVC is a new specification for digital video coding that targets deployment in a wide range of multimedia applications, such as video conferencing, digital television broadcasting, and internet streaming. This is reflected by the design goals of the standard: the provision of an efficient compression scheme and a network-friendly representation of the compressed data. Those requirements have resulted in a very flexible syntax and architecture that is fundamentally different from previous standards for video compression. In this paper, a detailed discussion is provided on how to apply an extended version of the MPEG-21 Bitstream Syntax Description Language (MPEG-21 BSDL) to the Annex B syntax of the H.264/AVC specification. This XML-based language facilitates the high-level manipulation of an H.264/AVC bitstream in order to take into account the constraints and requirements of a particular usage environment. Our performance measurements and optimizations show that it is possible to make use of MPEG-21 BSDL in the context of the current H.264/AVC standard with a feasible computational complexity when exploiting temporal scalability.
Keywords: AVC, BSDL, Content Adaptation, Content Description, H.264, MPEG, Scalability
1. INTRODUCTION
H.264/AVC is a new specification for digital video coding [1], characterized by a design that targets efficiency, robustness, and usability [2]. Because of its support for a wide range of bit rates [3], H.264/AVC can even be considered a universal standard for digital video coding. This implies that the specification will be used under the hood of many multimedia applications in the very near future. Those video-enabled applications will most probably be deployed on a wide variety of terminals, exchanging information with each other over several types of networks. This is not an attractive situation for content providers, because they are obliged to provide several versions of the same multimedia presentation in order to reach a target audience that is as large as possible. It would be much more efficient if they only had to provide one presentation that could be reused under all circumstances.
A solution for this diversity is the usage of scalable video coding, together with a complementary content adaptation system. In the current H.264/AVC specification, there are no explicit provisions for enabling scalability, although some efforts are emerging [4]. Scalability is currently a hot topic in the video coding and content adaptation community [5], because scalable coding should make it possible to deal with the growing variety of networks and terminals in an efficient way.
To be more specific, consider the scenario of a user who has a large collection of music video clips at his or her disposal. One may assume that all video streams are encoded at a very high quality, for instance by making use of an efficient implementation of the Main Profile as available in the H.264/AVC specification. As such, the media files in question are suited for playback on a digital home entertainment system. But what if the user wants to enjoy the same video clips on a mobile device when traveling to work by train?
Then the need arises for a content adaptation system that should make it possible to
Further author information (send correspondence to Wesley De Neve):
Wesley De Neve: E-mail: wesley.deneve@ugent.be, Telephone: +32 (0)9 264 89 29
Sam Lerouge: E-mail: sam.lerouge@ugent.be, Telephone: +32 (0)9 264 89 17
Peter Lambert: E-mail: peter.lambert@ugent.be, Telephone: +32 (0)9 264 89 29
Rik Van de Walle: E-mail: rik.vandewalle@ugent.be, Telephone: +32 (0)9 264 33 68
Applications of Digital Image Processing XXVII, edited by Andrew G. Tescher, Proceedings of SPIE Vol. 5558 (SPIE, Bellingham, WA, 2004) 0277-786X/04/$15 doi: 10.1117/12.564822
Downloaded from SPIE Digital Library on 08 Jul 2010 to 143.248.227.93. Terms of Use: http://spiedl.org/terms
realize an efficient transfer of the video clips from the full-featured PC to a mobile device, taking into account the constraints of the new usage environment (such as a limited battery life or a reduced screen resolution). Although H.264/AVC is a specification for single-layered video compression, we will show how an extended version of MPEG-21 BSDL can be used in combination with the Annex B syntax in order to enable some high-level manipulations of an H.264/AVC bitstream. In particular, we will discuss performance results obtained when using BSDL to exploit a trivial form of temporal scalability in H.264/AVC. The outline of the paper is as follows: after an in-depth overview of the involved technologies in section 2, the methodology applied for performing the measurements is described in section 3. Section 4 discusses the obtained results and section 5 concludes.
[Figure 1 (graphic and caption not recovered). Recoverable labels: the byte stream format with zero_byte (0x00) and start_code_prefix_one_3bytes (0x000001); the Network Abstraction Layer (NAL) as a generic stream of NAL units (NALUs), carried over RTP/IP or MPEG-2 Systems; the NAL header with nal_ref_idc and nal_unit_type; parameters valid for an entire sequence (profile@level information, resolution, number of reference pictures); the (active) picture parameter set (PPS) with parameters valid for at least one picture (type of entropy coding, number of slice groups (FMO), initial values for the quantisation parameter, parameters for the deblocking filter); and a slice NALU consisting of NAL header, slice header, and slice data, the latter holding the macroblock (MB) layer.]
resources across a wide range of networks and devices [7]. It is embedded in part 7 of MPEG-21, better known as MPEG-21 Digital Item Adaptation (MPEG-21 DIA) [8]. The motivation behind the development of MPEG-21 BSDL is that having a scalable format alone is not sufficient: one also needs a program for the analysis and the actual adaptation of (scalable) bitstreams. Because every coding format has its own structure, one would expect at first sight that a separate program is required for every coding format. However, a more generic solution can be devised. It is possible to create a universal program for the analysis and adaptation of (scalable) bitstreams by relying on a common language for the description of the syntax of a specific coding format. Such a language was developed in the context of MPEG-21 and is known as BSDL. The language is based on some extensions to W3C XML Schema on the one hand (such as bitwise datatypes) and on some restrictions to W3C XML Schema on the other hand. For instance, attributes are prohibited in the resulting XML description of the structure of a bitstream, because XML Schema allows attributes to occur in an arbitrary order, which is naturally not the case for syntax elements. Making use of XML and XML
XML Schema is a recommendation of the World Wide Web Consortium (W3C) that makes it possible to specify rules with respect to the structure of an XML document, the nomenclature of XML elements and attributes, and so on [9-11].
Schema has several advantages: one can reuse many existing tools for XML-related operations, and it allows a more straightforward integration with other XML-based standards in the long term. To make this concrete, Figure 2(a) provides a simplified example that illustrates how MPEG-21 BSDL can be combined with H.264/AVC. An excerpt of the scheme developed in BSDL for the Annex B syntax of the H.264/AVC specification is available in Annex A. On the right side of Figure 2(a), one can see a video stream, i.e. a sequence of slices. On the left side, an XML-based BSD is provided, describing the high-level structure of the H.264/AVC bitstream. This XML description contains several elements. As illustrated by the arrows, most of the elements are linked to a corresponding slice and contain some information about it, such as the type of the slice and the positions of its first and last byte in the compressed stream. In a next step, it is possible to apply changes to this XML description. For instance, one can decide to drop the XML elements that are linked to the B slices. The altered XML description can then be provided to a content adaptation engine that is able to recognize the changes that were made in the XML domain (a more abstract, high-level approach to content manipulation). The content adaptation engine applies those changes in the compressed domain, resulting in a bitstream without B slices. This temporally downsampled bitstream is, for instance, more appropriate for playback on a mobile device. Since the H.264/AVC specification is time unaware, one may also assume that the synchronisation of the remaining H.264/AVC samples is handled by a file format or a network protocol, as illustrated by the Systems layer in Figure 1(a).
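The filtering step described above can be sketched in a few lines of Python. This is only an illustration and not the MPEG-21 reference software: the element names (bitstream, header, I_slice, P_slice, B_slice) follow the simplified example of Figure 2(a), the byte ranges are invented, the helper name drop_b_slices is hypothetical, and Python's xml.etree.ElementTree is swapped in for the XSLT stylesheet used in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified BSD in the style of Figure 2(a); byte ranges are invented.
BSD = """<bitstream>
  <header>0-24</header>
  <I_slice>25-2637</I_slice>
  <B_slice>2638-2746</B_slice>
  <B_slice>2747-2903</B_slice>
  <P_slice>2904-3857</P_slice>
  <B_slice>3858-3972</B_slice>
  <B_slice>3973-4103</B_slice>
</bitstream>"""


def drop_b_slices(bsd_xml: str) -> ET.Element:
    """Return the description with every B_slice element removed."""
    root = ET.fromstring(bsd_xml)
    # findall() returns a list, so removing children while looping is safe.
    for element in root.findall("B_slice"):
        root.remove(element)
    return root


adapted = drop_b_slices(BSD)
print([child.tag for child in adapted])  # ['header', 'I_slice', 'P_slice']
```

The adapted description no longer mentions the B slices; turning it back into a bitstream is the task of the content adaptation engine.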
With respect to the content adaptation engine, this piece of logic is available in the MPEG-21 reference software package. The process discussed above is summarized in a more formal way in Figure 2(b). In the first step, one starts from a bitstream that is typically encoded at a high quality, such that it is useful to derive other versions from it. This parent bitstream is given as input to the BinToBSD tool, which is part of the MPEG-21 reference software, together with a description of the Annex B syntax at a certain granularity (for instance a description up to the level of the NALU header or up to the level of slice_header()). The latter syntax description is written in BSDL. The BinToBSD tool is then capable of generating an XML description of the structure of the H.264/AVC bitstream in question. In a next step, one can apply a set of filters to the XML-based bitstream syntax description (BSD) of the H.264/AVC bitstream. For instance, in a first stage one can apply a filter in order to simplify the XML description or to add metadata such that smarter adaptations are possible [12]. For example, based on MPEG-7 metadata, one can mark the part of the video stream that contains a sports scene. After this preprocessing step, one can apply zero or more filters in order to realize the actual manipulation of the XML description, such as dropping the XML elements describing the B slices or selecting the scenes that contain sports content. Which filter to apply can be made dependent on a negotiation process making use of multi-criteria optimization [13]. This finally results in an appropriate XML description. The filters can be implemented by relying on several technologies, such as Extensible Stylesheet Language (XSL) documents or an XML API. In a final step, the adapted BSD can be provided to the BSDToBin tool.
Together with the original bitstream and the document describing (a part of) the H.264/AVC syntax, this results in an adapted bitstream that is suited for a particular usage environment. Note that the BSD, as generated by the BinToBSD tool, only has to be created once in a production environment. This observation also applies to the preprocessing step. When the bitstream syntax description is available at a sufficiently detailed level, it is also possible to derive several versions of the original H.264/AVC bitstream in order to meet the requirements of different usage environments. It is also important to know that MPEG-21 BSDL often allows data manipulations without requiring the media data to be re-encoded, although some side effects may have to be resolved. The latter will be discussed in a next section. It should also be clear that MPEG-21 BSDL allows multimedia content to be manipulated at a more abstract level once a BSD of a particular bitstream is available, thus making it possible to enter the semantic domain (i.e. no longer having to deal with the raw bits and bytes).
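For the simplified description of Figure 2(a), the final step of the chain can be sketched as follows. The sketch assumes that every element of the adapted description carries an inclusive "first-last" byte range into the original bitstream; the real BSDToBin tool is driven by the BSDL scheme instead, and the function name rebuild_bitstream and the toy data are hypothetical.

```python
import xml.etree.ElementTree as ET


def rebuild_bitstream(original: bytes, adapted_bsd: str) -> bytes:
    """Concatenate the byte ranges that survive in the adapted description."""
    out = bytearray()
    for element in ET.fromstring(adapted_bsd):
        first, last = (int(x) for x in element.text.split("-"))
        out += original[first:last + 1]  # ranges are inclusive
    return bytes(out)


# Toy data: 12 bytes standing in for a header (0-1), an I slice (2-5),
# a B slice (6-8), and a P slice (9-11). The B slice was filtered out.
original = bytes(range(12))
adapted_bsd = """<bitstream>
  <header>0-1</header>
  <I_slice>2-5</I_slice>
  <P_slice>9-11</P_slice>
</bitstream>"""

scaled = rebuild_bitstream(original, adapted_bsd)
print(list(scaled))  # [0, 1, 2, 3, 4, 5, 9, 10, 11]
```

No media data is decoded or re-encoded here: the adaptation is a pure copy of byte ranges, which is why it can be cheap compared to transcoding.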
We assume that every remaining slice can be reconstructed without having to rely on a B slice.
[Figure 2 (graphic and caption not recovered). (a) The original bitstream myPrecious_30hz.264, drawn along the time axis t as a sequence of an I slice, two B slices, a P slice, and two B slices, next to its XML description myPrecious_30hz.xml:

<bitstream xml:base="myPrecious_30hz.264">
  <header>0-24</header>
  <I_slice>25-2637</I_slice>
  <B_slice>2638-2746</B_slice>
  <B_slice>2747-2903</B_slice>
  <P_slice>2903-3857</P_slice>
  <B_slice>3857-3972</B_slice>
  <B_slice>3973-4103</B_slice>
</bitstream>

(b) The adaptation chain: (1) original bitstream [myPrecious_30hz.264]; BinToBSD + h264_avc.bsd; (2) XML description [myPrecious_30hz.xml]; pre-processing; filters, (4) XSLT stylesheets [drop_BSlices.xsl]; BSDToBin + h264_avc.bsd; post-processing; (7) scaled bitstream [myPrecious_10hz.264].]
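[Figure 3 (graphic not recovered). Recoverable labels: the first_mb_in_slice sequences 0 33 66, 33 0 66, and 0 33 66 for the three slices of a picture.]
Figure 3. Generation of a corrupt bitstream due to the usage of the fillByte construction. Note that 1 is the exponential Golomb code word for the decimal value zero, and that 00000100010 is the exponential Golomb code word for 33.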
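The difference in code length between the two values in Figure 3 can be checked with a small sketch of unsigned exponential Golomb coding. The function names are hypothetical, and the 20 bits assumed for the remaining slice header fields are an arbitrary illustration value, not a figure taken from the measurements.

```python
def encode_ue(value: int) -> str:
    """Unsigned exponential Golomb code of a non-negative integer, as a bit string."""
    binary = bin(value + 1)[2:]               # value + 1 in binary
    return "0" * (len(binary) - 1) + binary   # prefixed by len - 1 zero bits


def alignment_ones(bits_before: int) -> int:
    """Number of one bits needed to reach the next byte boundary."""
    return -bits_before % 8


OTHER_HEADER_BITS = 20  # assumption: size of the remaining slice header fields

for first_mb_in_slice in (0, 33):
    code = encode_ue(first_mb_in_slice)
    padding = alignment_ones(len(code) + OTHER_HEADER_BITS)
    print(first_mb_in_slice, code, len(code), padding)
```

Under this assumption, swapping 0 and 33 changes the header length from 21 to 31 bits, so the number of alignment bits changes from 3 to 1. A description that keeps the old padding therefore no longer ends on a byte boundary.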
it is not possible to describe a datatype by making use of the just mentioned language in an efficient way. To be more specific, the implementation attribute makes it possible to call Java classes from the BSD scheme written in BSDL. The implementation construction was used, among others, for parsing the syntax elements that are encoded with the signed or unsigned exponential Golomb entropy coding scheme (one of the cases in which one has to deal with a complex datatype). It is possible to describe those entropy coding schemes in BSDL, but this results in a tremendous overhead: every single bit of an exponential Golomb coded syntax element has to be put in its own XML element. On top of that, it is not straightforward to interpret or decode the resulting XML description of the syntax element by making use of XPath, a technology for querying an XML document, for instance to retrieve the value of a particular XML element [15]. This functionality is required when one wants to apply changes to a BSD. For instance, decoding exponential Golomb coded elements is necessary for realizing temporal scalability, since the slice_type parameter is represented by an exponential Golomb code word. The implementation construction is also used for parsing the slice_group_change_cycle syntax element. This parameter occurs as the last syntax element in the slice_header() syntax structure when Flexible Macroblock Ordering (FMO) types 2, 3, or 4 are used, the latter supporting evolving slice groups. As such, the parameter in question determines the number of macroblocks in slice group 0. The main reason for using the implementation construction is that the number of bits used to represent the slice_group_change_cycle syntax element has to be computed by evaluating the base-two logarithm, a common operation when parsing bitstreams.
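To illustrate the kind of computation involved, the sketch below derives the bit length of slice_group_change_cycle. It assumes that the length is Ceil(Log2(PicSizeInMapUnits / SliceGroupChangeRate + 1)), which is how the slice header semantics of H.264/AVC are understood to define it; this should be checked against the standard. The function name is hypothetical.

```python
def slice_group_change_cycle_bits(pic_size_in_map_units: int,
                                  slice_group_change_rate: int) -> int:
    """Ceil(Log2(PicSizeInMapUnits / SliceGroupChangeRate + 1)), using integers only.

    ceil(log2((p + r) / r)) is the smallest n with r * 2**n >= p + r,
    which avoids floating-point rounding near powers of two.
    """
    p, r = pic_size_in_map_units, slice_group_change_rate
    n = 0
    while (r << n) < p + r:
        n += 1
    return n


# A QCIF picture consists of 11 x 9 = 99 macroblocks.
print(slice_group_change_cycle_bits(99, 1))   # 7, since log2(100) is about 6.64
print(slice_group_change_cycle_bits(99, 33))  # 2, since log2(4) is exactly 2
```

Because the result depends on the picture size, the set of possible inputs is not bounded in advance, which is the situation in which a lookup over precalculated values stops being practical.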
However, a base-two logarithm is not available in the XPath specification (one of the cases in which complex computations are necessary). When the set of input values is limited, this limitation can be circumvented by making use of the union element and precalculated values (thus no longer requiring the evaluation of the logarithmic function). However, this is not the case for the syntax element in question, because the number of input values depends on the resolution. Getting access to the value of this parameter allowed us to drop the background of a video sequence encoded with FMO types 2, 3, and 4. The procedure for the elimination of the background itself was implemented as a cascade of two XSL stylesheets due to the complexity of the XPath expressions. This complexity is a consequence of the pointer-based relationship between a slice_header() and the picture and sequence parameter sets, and of the fact that the latter can occur more than once in an H.264/AVC compliant bitstream. The encoding and decoding were done with a modified version of the reference encoder and decoder. Finally, the implementation approach was also applied to the cabac_alignment_one_bit syntax element, which is part of the slice_data() structure. The reason for this approach can be explained by an example in which the slices in an H.264/AVC bitstream are shuffled per picture. Although a purely academic problem, it is a good illustration of the side effects that may occur when performing editing operations in the compressed
The functionality of this element can be compared to a switch statement in a programming language.
domain. The latter phenomenon can for instance occur when transcoding an H.264/AVC bitstream from the Main Profile to the Baseline Profile. The shuffling of the slices is illustrated by Figure 3(a), in which a sequence of pictures at QCIF resolution is encoded by dividing each picture into three slices of equal size (33 macroblocks). The shuffling consists of switching the first and second slice of every picture by manipulating the value of the first_mb_in_slice parameter in the corresponding slice_header() syntax structures. This manipulation will be detected by the Arbitrary Slice Ordering (ASO) feature of a decoder, resulting in a distorted video sequence. Because the first_mb_in_slice syntax element is represented by an exponential Golomb code, changing zero to 33 and vice versa (33 is the number of the first macroblock in the second slice) changes the byte alignment of the slice_header() structure. As illustrated by Figure 3(b), the fillByte construction does not deal with that change correctly, resulting in a corrupt bitstream at the transition between the slice_header() and the slice_data() syntax structures, since the question marks should all have been replaced by ones. The H.264/AVC specification requires that byte alignment be achieved at the transition between the slice_header() and the slice_data() syntax structures by adding an appropriate number of one bits. For simplicity, all syntax elements between the first_mb_in_slice parameter and the cabac_alignment_one_bit parameter are omitted. Although we could develop a BSD description up to the level of the slice_header() syntax structure, we are currently not able to parse bitstreams in which NALU emulation prevention bytes occur at the level of that syntax structure. The presence of emulation prevention bytes ensures that no sequence of consecutive byte-aligned bytes in the NAL unit contains a start code prefix.
The reason for not being able to deal with emulation prevention bytes is the lack of an appropriate look-ahead mechanism for detecting them in the current version of MPEG-21 BSDL. In theory, it would be possible to locate those bytes by making use of the ifNext construction in MPEG-21 BSDL, because it allows looking ahead. However, such an approach would actually require an ifNext operation that can be executed on every 32 bits aligned position. The latter is not achievable in practice (for instance, due to the usage of VLCs). Another challenge is the appropriate insertion of NALU emulation prevention bytes in manipulated bitstreams. For instance, our implemented procedural objects do not take into account the occurrence of and the need for NALU emulation prevention bytes. Note that this problem does not emerge in the case of MPEG-4 Visual bitstreams, due to a totally different organization of the header information such that the usage of emulation prevention bytes is not necessary.
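The emulation prevention mechanism itself is simple to state in code, even though it is hard to express declaratively. The following simplified sketch assumes the usual H.264/AVC rule that a byte 0x03 is inserted whenever two consecutive zero bytes are followed by a byte in the range 0x00 to 0x03; special cases at the end of a NAL unit are ignored, and both function names are hypothetical.

```python
def add_emulation_prevention(rbsp: bytes) -> bytes:
    """Insert 0x03 after two zero bytes whenever the next byte is 0x00..0x03."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes written so far
    for byte in rbsp:
        if zeros >= 2 and byte <= 3:
            out.append(3)
            zeros = 0
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


def remove_emulation_prevention(payload: bytes) -> bytes:
    """Inverse operation: drop every 0x03 that follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for byte in payload:
        if zeros >= 2 and byte == 3:
            zeros = 0
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


raw = bytes([0x25, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00])
escaped = add_emulation_prevention(raw)
print(escaped.hex(" "))
```

The escaped payload contains no start code prefix, and removing the inserted bytes restores the input. Note that the positions of these bytes depend on the byte values themselves, which is why a parser needs to look ahead to find them.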
3. APPLIED METHODOLOGY
This section discusses the way the compressed bitstreams and their corresponding XML-based syntax descriptions were generated. Some information is also provided about the tools used for profiling the reference software for MPEG-21 BSDL (i.e. the BinToBSD and BSDToBin tools).
A reference picture is a picture with nal_ref_idc not equal to zero.
in a set of 27 bitstreams. All slices of a picture are of the same type (satisfied due to the value of the slice_type syntax element). Emulation prevention bytes did not occur in the syntax structures parsed by the BSDL reference software. The actual performance analysis was done for several schemes written in BSDL: a full scheme describing all syntax elements up to the level of the slice_header() data structure, and a simplified scheme only describing those parameters that are really necessary for exploiting temporal scalability. The latter implies parsing everything up to the level of the slice_type parameter in the slice_header() data structure for the slices containing coded picture data. The SPS and PPS are not analyzed in the case of the simplified scheme. For the generation of the XML descriptions, version 1.1.3 of the BSDL reference software was used. Timing was done by relying on the timers made available in the two BSDL tools, taking into account the overhead related to input and output.
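What parsing "up to the level of the slice_type parameter" involves can be sketched without BSDL. The code below is an illustration under stated assumptions, not the scheme used for the measurements: it splits an Annex B byte stream on the start code prefix 0x000001, reads nal_ref_idc and nal_unit_type from the one-byte NAL header, and decodes first_mb_in_slice and slice_type for coded slices. Emulation prevention bytes are ignored, all function names are hypothetical, and the toy stream is hand-made.

```python
SLICE_TYPE_NAMES = ["P", "B", "I", "SP", "SI"]  # slice_type modulo 5


def split_nal_units(stream: bytes) -> list:
    """Split an Annex B byte stream on the start code prefix 0x000001."""
    starts = []
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)
        i = stream.find(b"\x00\x00\x01", i + 3)
    units = []
    for k, begin in enumerate(starts):
        end = starts[k + 1] - 3 if k + 1 < len(starts) else len(stream)
        unit = stream[begin:end].rstrip(b"\x00")  # drop zero_byte / trailing zeros
        if unit:
            units.append(unit)
    return units


def read_ue(bits: str, pos: int):
    """Decode one unsigned exponential Golomb code; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    pos += zeros + 1
    suffix = int(bits[pos:pos + zeros], 2) if zeros else 0
    return (1 << zeros) - 1 + suffix, pos + zeros


def describe(stream: bytes) -> list:
    """Return (nal_unit_type, nal_ref_idc, slice type or None) per NAL unit."""
    summary = []
    for nal in split_nal_units(stream):
        nal_ref_idc = (nal[0] >> 5) & 0x03
        nal_unit_type = nal[0] & 0x1F
        slice_type = None
        if nal_unit_type in (1, 5):  # coded slice of a non-IDR or IDR picture
            bits = "".join(f"{byte:08b}" for byte in nal[1:])
            _first_mb_in_slice, pos = read_ue(bits, 0)
            code, _ = read_ue(bits, pos)
            slice_type = SLICE_TYPE_NAMES[code % 5]
        summary.append((nal_unit_type, nal_ref_idc, slice_type))
    return summary


# Hand-made toy stream: an IDR slice (I) followed by a non-reference B slice.
STREAM = bytes.fromhex("00000001658880" "00000101a080")
print(describe(STREAM))  # [(5, 3, 'I'), (1, 0, 'B')]
```

In this toy stream the B slice has nal_ref_idc equal to zero, so dropping it does not remove a reference picture.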
4. EXPERIMENTAL RESULTS
This section covers some of the performance results that were obtained during our research. Figure 4(a) indicates that the processing time required by the BinToBSD tool shows an exponential behavior in terms of the number of slices in the case of the simplified BSD scheme (note that the Y-axis has a logarithmic scale and that the points on the X-axis are not equidistant). For instance, 145 seconds are needed to generate a BSD for a bitstream containing 599 pictures with one slice per picture. One can also notice that the processing cost is determined by the number of slices: a stream of 300 pictures with one slice per picture results in the same behaviour of the BinToBSD tool as a stream of 100 pictures with three slices per picture. The exponential behavior of the BinToBSD tool is even more pronounced when the full scheme is used for generating a BSD, especially due to the evaluation of many control statements necessary for guiding the parsing process, which are often implemented as complex XPath expressions. A first attempt to boost the performance consisted of making the simplified BSD scheme deterministic. The two previous schemes, the full one and the simplified one, are generic in the sense that they can be applied to any H.264/AVC compliant bitstream, regardless of the profile implemented or the GOP structure used. When this information is taken into account, together with the fact that the SPS and the PPS are always the first two NALUs, one can create a much simpler BSD scheme, because many complex control statements can be dropped. However, this scheme still resulted in an exponential behavior of the BinToBSD tool, as can be seen in Figure 4(b). Extensive profiling with the HPjmeter tool revealed that the performance problem of the BinToBSD program, when using the simplified deterministic scheme, could still be traced back to the usage of XPath expressions.
To be more specific, the performance problem is related to the usage of XPath expressions in combination with the nOccurs attribute. This BSDL attribute specifies, by means of an XPath expression, how many times a particular syntax element can occur in a bitstream. When this attribute does not occur in the BSD scheme, the BinToBSD tool falls back to a default value of one (a constant XPath expression), since most syntax elements only occur once at a particular position in a bitstream. However, when this attribute does occur in the BSD scheme, the BinToBSD tool duplicates the internal data structure containing the XML description of the structure of the H.264/AVC bitstream, anticipating the possible execution of an XPath expression. Because the nOccurs attribute was used in the declaration of every possible syntax element for clarity purposes (even when the syntax element could only occur once), its presence resulted
This genericity is also one of the major reasons for the complexity of the XPath expressions.
[Figure 4 (plots and caption not recovered). Recoverable information: execution time of the BinToBSD tool on a logarithmic axis (1.0 to 10000.0) as a function of the number of pictures (#Pictures: 21, 49, 73, 99, 199, 299, 399, 499, 599) for 1, 2, and 3 slices per picture, together with charts of the accumulated exclusive method time (0 to 60) per Java package, including org.apache.xml.dtm.]
in a duplication of the XML data structure for every syntax element. This behaviour is reflected by the execution times of the functions that are responsible for the duplication of the XML structure. As can be deduced from the pie chart in Figure 4(e), a lot of processing time is spent in the Document Table Model (DTM) package (org.apache.xml.dtm). DTM is an interface designed specifically to optimize performance and minimize storage when making use of the Apache XPath and XSLT implementations [18]. Note that the exclusive method time is the time spent in a method, not taking into account the time spent in the functions called by that method. With this knowledge, a much more efficient version of the simplified deterministic scheme in BSDL could be created. This finally made it possible to generate a BSD faster than real time, thanks to the absence of the nOccurs attribute and of XPath expressions in the scheme in question. This is also illustrated by the shift of the accumulated exclusive method time to other packages in Figure 4, especially to the ones that are responsible for input and output operations. For example, about 4 seconds are needed to generate a BSD for a bitstream containing 599 pictures with three slices per picture. The same example took about 992 seconds with the simplified deterministic BSDL scheme, about 1096 seconds with the original simplified scheme, and about 68041 seconds with the full scheme. The average speed-up of the BinToBSD tool, using the optimized simplified deterministic scheme, is 90.95% compared to the execution time needed by the original simplified scheme (standard deviation: 13.35%). Note that the BSD produced with the optimized scheme is still equivalent to the one generated by the simplified scheme and by the first deterministic scheme, thus still enabling the exploitation of temporal scalability.
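The figures quoted for the 599-picture example can be put in perspective with a short calculation. The sketch assumes a frame rate of 30 Hz, which is not stated for the test sequences in this section, and it interprets the speed-up as the relative reduction in execution time; the function name is hypothetical.

```python
def relative_reduction(t_before: float, t_after: float) -> float:
    """Relative reduction in execution time, as a percentage."""
    return 100.0 * (t_before - t_after) / t_before


PICTURES = 599
FRAME_RATE_HZ = 30.0  # assumption: the sequences are played back at 30 Hz
playback_seconds = PICTURES / FRAME_RATE_HZ

print(round(playback_seconds, 2))             # 19.97 seconds of video
print(round(relative_reduction(1096, 4), 2))  # original simplified scheme -> optimized
print(round(relative_reduction(992, 4), 2))   # deterministic scheme -> optimized
```

Under this assumption, 4 seconds of processing for about 20 seconds of video is indeed faster than real time. The reduction for this single example is above 99%, which is higher than the reported average of 90.95% over all test bitstreams.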
With respect to the processing time needed by the Xalan XSL engine, the observed execution times are low: filtering an XML document that originally describes 599 pictures (one slice per picture) requires 625 milliseconds. The resulting XML document only contains the descriptions of NALUs carrying an SPS, a PPS, or compressed data related to I and P slices (and no longer to B slices). The same observation applies to the BSDToBin tool, regardless of whether the Document Object Model (DOM) or the Simple API for XML (SAX) is used for the internal representation and processing of the XML description. The fast behavior of the BSDToBin tool can be explained by the fact that it is no longer necessary to evaluate XPath expressions. This leads to the observation of a potentially asymmetrical behavior between the BSD encoder (BinToBSD) and the BSD decoder (BSDToBin) when the encoder has to deal with many XPath expressions, which is very similar to the behavior of MPEG-x and H.26x encoders and decoders. It is also interesting to see that the fast behavior of the optimized simplified deterministic scheme shows that Java procedural objects can be used efficiently for achieving byte alignment and for the decoding and encoding of exponential Golomb coded syntax elements. Some of the quantitative results can be found in Annex B.
Table 2. Resulting output of the BinToBSD tool for the first seven syntax elements of an SPS.
Table 3. Overview of the performance measurements: D denotes DOM, S denotes SAX, det denotes the deterministic simplified scheme, and opt stands for the optimized version of the latter. The temporal downsampling resulted in a 48.86% reduction of the bitstream size on average (standard deviation: 0.89%), the latter being dependent on the rate-distortion model used.
ACKNOWLEDGMENTS
The authors would like to thank the developers of the MPEG-21 BSDL reference software for making available the required extensions. We would also like to thank Davy De Schrijver for the clarifying discussions about the usage of the HPjmeter profiling tool. The research activities that have been described in this paper were funded by Ghent University, the Interdisciplinary Institute for Broadband Technology (IBBT), the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT), the Fund for Scientific Research-Flanders (FWO-Flanders), the Belgian Federal Science Policy Office (BFSPO), and the European Union.
REFERENCES
1. T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Trans. Circuits Syst. Video Technol. 13, pp. 560-576, July 2003.
2. I. E. G. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia, John Wiley & Sons, Ltd, 2003.
3. "Requirements for AVC Codec," MPEG-document ISO/IEC JTC1/SC29/WG11/N4672, Joint Video Team of ISO/IEC JTC1/SC29/WG11 and ITU-T SG16/Q.6, Jeju, Korea, Mar. 2002. Available on http://www.chiariglione.org/mpeg/working_documents.
4. H. Schwarz, D. Marpe, and T. Wiegand, "Subband Extension of H.264/AVC," JVT-document JVT-K023, Munich, Germany, Joint Video Team of ISO/IEC JTC1/SC29/WG11 and ITU-T SG16/Q.6, Mar. 2004.
5. "Requirements and Applications for Scalable Video Coding," MPEG-document ISO/IEC JTC1/SC29/WG11 N6025, Moving Picture Experts Group (MPEG), Gold Coast, Australia, Mar. 2003. Available on http://www.chiariglione.org/mpeg/working_documents.
6. M. Amielh and S. Devillers, "Bitstream Syntax Description Language: Application of XML-Schema to Multimedia Content Adaptation," in WWW2002: The Eleventh International World Wide Web Conference, (Honolulu, Hawaii), May 2002. Available on http://www2002.org/CDROM/alternate/334/.
7. I. Burnett, R. V. de Walle, K. Hill, J. Bormans, and F. Pereira, "MPEG-21: Goals and achievements," IEEE Multimedia 10, pp. 60-70, October-December 2003.
8. "Multimedia Framework Part 7: Digital Item Adaptation," Final Draft International Standard, MPEG-document ISO/IEC JTC1/SC29/WG11/N6167, Moving Picture Experts Group (MPEG), Waikaloa, USA, Dec. 2003.
9. D. C. Fallside, "XML Schema Part 0: Primer," recommendation, World Wide Web Consortium (W3C), http://www.w3c.org/TR/xmlschema-0/, May 2001.
10. H. S. Thompson, D. Beech, M. Maloney, and N. Mendelsohn, "XML Schema Part 1: Structures," recommendation, World Wide Web Consortium (W3C), http://www.w3c.org/TR/xmlschema-1/, May 2001.
11. P. V. Biron and A. Malhotra, "XML Schema Part 2: Datatypes," recommendation, World Wide Web Consortium (W3C), http://www.w3c.org/TR/xmlschema-2/, May 2001.
12. J. Magalhães and F. Pereira, "Using MPEG standards for multimedia customization," IEEE Trans. Circuits Syst. Video Technol. 19, pp. 437-456, May 2004.
13. S. Lerouge, P. Lambert, and R. Van de Walle, "Multi-criteria optimization for scalable bitstreams," in Proceedings of the 8th International Workshop on Visual Content Processing and Representation, pp. 122-130, Springer, (Madrid), September 2003.
14. W. De Neve, F. De Keukelaere, K. De Wolf, and R. Van de Walle, "Applying MPEG-21 BSDL to the JVT H.264/AVC Specification in MPEG-21 Session Mobility Scenarios," in Proceedings of the 5th International Workshop on Image Analysis for Multimedia Interactive Services, p. 4 pp, (Lisboa), April 2004.
15. J. Clark and S. DeRose, "XML Path Language 1.0," recommendation, World Wide Web Consortium (W3C), http://www.w3c.org/TR/xpath, Nov. 1999.
16. JVT H.264/AVC Reference Software. http://bs.hhi.de/suehring/tml/download/.
17. HPjmeter. http://www.hp.com/products1/unix/java/hpjmeter/.
18. The Document Table Model. http://xml.apache.org/xalan-j/dtm.html.