Professional Documents
Culture Documents
Data Representation
A transformation is necessary
From one data format to another
Simple in theory
but usually with impact on performance
Two transformation principles:
1. Pair-wise transformation for every representation
2. Transform to canonical representation
R2 R2
R3 R3
. .
. .
. .
Rn Rn
Universitt Duisburg-Essen Torben Weis 5
Verteilte Systeme
2. Transform to Canonical Representation
R2 R2
R3 RC R3
. .
. .
. .
Rn Rn
Universitt Duisburg-Essen Torben Weis 6
Verteilte Systeme
Outline
Byte order
Text encoding
Data serialization formats
CDR
ASN.1
XML
JSON
Java object serialization
0x 00 00 07 D4
Little-endian
0 1 2 3
If you start here, its little endian D4 07 00 00
Least significant byte first
Universitt Duisburg-Essen Torben Weis 9
Verteilte Systeme
Byte Order (3)
128 characters
Non-printable
Carriage return
(CR)
Newline (LF)
Tab
Printable
Latin alphabet
Some symbols
First bit 0?
process 1 byte and return character
Second bit 0? (when in the middle of a character)
skip 1 byte and goto start
Third bit 0?
process 2 bytes and return character
U-00000000 - U-0000007F 0xxxxxxx
U+0000
U+010000
to
to
U+FFFF
U+01FFFF
Pickling/Unpickling
Synonymous to serialization/deserialization
Universitt Duisburg-Essen Torben Weis 21
Verteilte Systeme
Schema: External Specification
Abstract Syntax
Describes the structure of data independent of any
programming language
Concrete Syntax
Describes the structure of data for the selected
programming language
Transfer Syntax
Structure of data used for transfer
(e.g. <Type, Length, Value>)
Tag class
Primitive or Tag number
(2 bit)
complex type (variable
(1 bit) length)
Example definition:
BookType ::= INTEGER {
paperback(0)
hardcover(1)
}
hardcover BookType ::= hardcover(1)
Cons:
Coding is not time efficient
(e.g. accessing single bits)
Many applications know type information, no need to
transfer tag and length
<book bookid="1234">
<title>XML in 5 minutes</title>
<author>T.Knowsitall</author>
<price>0.50</price>
<availability>1</availability>
<bookoftheyear/>
</book>
Inefficient
Expensive conversion from binary data to text
Text format uses lot of space
XML is simple
XML is human-readable
Proof: MS Word Save Unpack XML
{
"bookid": 1234,
"title": "JSON in 5 minutes",
"author": "T.Knowsitall",
"price": 0.50,
"availability": 1,
"bookoftheyear": true,
}
null
Binary serializer
ObjectOutputStream/ObjectInputStream
ObjectOutputStream objOut = new ObjectOutputStream(outputStream);
objOut.writeObject(nodes[0]);
XML serializer
JAXB Marshaller/Unmarshaller
Object-to-XML mapping customizable
JAXBContext context = JAXBContext.newInstance(Node.class);
Marshaller m = context.createMarshaller();
m.marshal(nodes[0], outputStream);
Example output
Hex Text
@XmlRootElement
public class Object {
@XmlElement
public int foo;
public int bar;
}
Example output