Protocol Buffers

http://code.google.com/p/protobuf/

Overview

What are Protocol Buffers

Structure of a .proto file

How to use a message

How messages are encoded

Important points to remember

More Stuff

What are Protocol Buffers?

Serialization format by Google
Used by Google for almost all internal RPC protocols and file formats
(currently 48,162 different message types defined in the Google code tree across 12,183 .proto files. They're used both in RPC systems and for persistent storage of data in a variety of storage systems.)

Goals:

Simplicity

Compatibility

Performance

Comparison XML - Protobuf

XML                         Protobuf
Readable by humans          Binary format
Self-describing             Garbage without .proto file
Big files                   Small files (3-10 times smaller)
Slow to serialize/parse     Fast (20-100 times faster)
.xsd (complex)              .proto (simple, less ambiguous)
Complex access              Easy access

Comparison XML - Protobuf (cntd)

XML:

<person>
  <name>John Doe</name>
  <email>jdoe@example.com</email>
</person>

(== 69 bytes, 5,000-10,000 ns to parse)

cout << "Name: "
     << person.getElementsByTagName("name")->item(0)->innerText()
     << endl;
cout << "Email: "
     << person.getElementsByTagName("email")->item(0)->innerText()
     << endl;

Protobuf:

Person {
  name: "John Doe"
  email: "jdoe@example.com"
}

(== 28 bytes, 100-200 ns to parse)

cout << "Name: " << person.name() << endl;
cout << "Email: " << person.email() << endl;
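
The 28-byte figure for the Protobuf version can be checked by hand. The sketch below assumes that name and email are string fields numbered 1 and 3 (as in the Person example that follows) and that no other field is set. The helper names varint_size, string_field_size and example_person_size are made up for this illustration and are not part of the protobuf library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Number of bytes needed to store `value` as a base-128 varint.
std::size_t varint_size(std::uint64_t value) {
    std::size_t bytes = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++bytes;
    }
    return bytes;
}

// Encoded size of one string field: key + length prefix + raw bytes.
// Wire type 2 means "length-delimited".
std::size_t string_field_size(std::uint32_t field_number, const std::string& text) {
    const std::uint64_t key = (static_cast<std::uint64_t>(field_number) << 3) | 2;
    return varint_size(key) + varint_size(text.size()) + text.size();
}

// Assumption for this sketch: name = field 1, email = field 3.
std::size_t example_person_size() {
    return string_field_size(1, "John Doe") +
           string_field_size(3, "jdoe@example.com");
}
```

Under these assumptions the name costs 1 + 1 + 8 bytes and the email 1 + 1 + 16 bytes, which adds up to the 28 bytes quoted on the slide.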

Example
message Person {
  required string name = 1;   // name of person
  required int32 id = 2;      // id of person
  optional string email = 3;  // email address

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

From .proto to runtime

Messages defined in .proto file(s)

Compiled into source code with protoc
  C++
  Java
  Python
  More languages via add-ons (C#, PHP, Perl, ObjC, etc.)

Usage in code

Passed via network/files

Message Definition

Messages defined in .proto files

Syntax: message [MessageName] { ... }

Can be nested

Will be converted to e.g. a C++ class

Message Contents

Each message may have

Messages

Enums:

enum <name> {
  valuename = value;
}

Fields

Each field is defined as

<rule> <type> <name> = <id> {[<options>]};

Field rules

Required
  Exactly once (msg.fieldname())

Optional
  None or one
  Query existence (msg.has_fieldname())

Repeated
  None to infinite (ordered array)
  Query count (msg.fieldname_size())
  Use option packed=true for efficient encoding

Required is required

Field rule required is a tough decision
Once a field is required, it must stay required forever unless compatibility between versions is to be broken (not such a good idea)
Some engineers at Google advise to never use required

Field types

.proto type                             Note                                      C++ type
float/double                                                                      float/double
int32/int64                             Variable length, primarily suited for     int32/int64
                                        positive numbers
uint32/sint32 (likewise uint64/sint64)  Variable length, un/signed                (u)int32/(u)int64
(s)fixed32, (s)fixed64                  Fixed length (un/signed), better suited   (u)int32/(u)int64
                                        for values > 2^28 / 2^56
bool                                                                              bool
string                                  UTF-8 or 7-bit ASCII                      std::string
bytes                                   Arbitrary sequence of bytes               std::string
Message or Enum type                                                              Corresponding class

Field id (tag)

Each field has a unique tag (id) (1 .. 2^29 - 1)

(Unique per message definition)

Variable-length encoded: 1 .. 15 == one byte

Identifies the field within the binary format

i.e. field names are NOT used in the encoded data

Assigned for life
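
Why ids 1 to 15 fit into one byte follows from the key layout: the key is a varint whose lowest three bits hold the wire type, so only four bits of the id fit into the first byte. A minimal sketch of that arithmetic (key_size is a hypothetical helper name, not a protobuf API):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size in bytes of the varint-encoded key for a given field id.
// The key is (id << 3) | wire_type; the wire type only occupies the
// lowest three bits and therefore never changes the byte count.
std::size_t key_size(std::uint32_t field_id) {
    assert(field_id >= 1 && field_id <= 536870911u);  // 1 .. 2^29 - 1
    std::uint64_t key = static_cast<std::uint64_t>(field_id) << 3;
    std::size_t bytes = 1;
    while (key >= 0x80) {  // more than 7 payload bits left
        key >>= 7;
        ++bytes;
    }
    return bytes;
}
```

With this helper, ids 1 to 15 need one byte, ids 16 to 2047 need two bytes, and the largest id needs five.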

Options, namespaces and importing

Options:

[default = value] sets a default value (beware: default values are not encoded!)

[packed = false/true] more compact encoding of repeated numeric fields

[deprecated = false/true] marks a field as obsolete

option optimize_for = SPEED/CODE_SIZE/LITE_RUNTIME (file-level option)

Java package and outer class name (option java_package, option java_outer_classname)

Namespaces/packages can be defined via e.g. package com.example.message;

Importing of messages defined in other files via import "filename.proto";

Example (again)
message Person {
  required string name = 1;   // name of person
  required int32 id = 2;      // id of person
  optional string email = 3;  // email address

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

From .proto to code

protoc compiler creates classes in desired language
Example: protoc --cpp_out=. person.proto
will create person.pb.cc and person.pb.h

Generated code

// name
bool has_name() const;
void clear_name();
const string& name() const;
void set_name(const string& value);
void set_name(const char* value);
string* mutable_name();

// id
bool has_id() const;
void clear_id();
int32_t id() const;
void set_id(int32_t value);

// phone
inline int phone_size() const;
inline void clear_phone();
inline const RepeatedPtrField<Person_PhoneNumber>& phone() const;
inline RepeatedPtrField<Person_PhoneNumber>* mutable_phone();
inline const Person_PhoneNumber& phone(int index) const;
inline Person_PhoneNumber* mutable_phone(int index);

Setting values in a message

#include "person.pb.h"

Person person;
person.set_name("Hans Mustermann");
person.set_id(1);  // id is a required field; the value is only an example
person.set_email("hans@muster.mann");
// Alternative way to set the name:
// std::string* name = person.mutable_name();
// *name = "Hans Mustermann";

Person::PhoneNumber* phone;
phone = person.add_phone();
phone->set_number("03012345678");
phone->set_type(Person::WORK);
phone = person.add_phone();
phone->set_number("0170987654321");
phone->set_type(Person::MOBILE);

// check for validity: person.IsInitialized() == true?

Serializing

Serialize data via

std::string person.SerializeAsString()
person.SerializeToString(std::string*)
person.SerializeToFileDescriptor(int)
person.SerializeToOstream(std::ostream*)
person.SerializeToArray(void*, int size)

Example

std::ofstream file(filename, std::ios::out | std::ios::binary);
if (false == file.fail()) {
    person.SerializeToOstream(&file);
}

Parsing

Parse via

person.ParseFromIstream(std::istream*)
person.ParseFromString(const std::string&)
person.ParseFromFileDescriptor(int)
person.ParseFromArray(const void*, int size)

Example:

std::ifstream file(filename, std::ios::in | std::ios::binary);
if (false == file.fail()) {
    person.ParseFromIstream(&file);
}

Retrieving values from a message

#include "person.pb.h"

Person person;
person.ParseFromIstream(&file);  // file: a std::ifstream opened as on the Parsing slide
if (person.IsInitialized()) {
    cout << "Name: " << person.name() << endl;
    if (person.has_email()) {
        cout << "Email: " << person.email() << endl;
    }
    for (int i = 0; i < person.phone_size(); i++) {
        cout << "Phone: " << person.phone(i).number() << endl;
    }
}

Message encoding

Full description at
code.google.com/intl/apis/protocolbuffers/docs/encoding.html

Messages are encoded in binary format as a sequence of key/value pairs

Key = (id << 3) | wire_type

0 = Varint (u/s/int32/64, bool, enum)

1 = 64-bit (fixed64, sfixed64, double)

2 = Length-delimited (string, bytes, messages, packed repeated fields)

5 = 32-bit (fixed32, sfixed32, float)

Little endian
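
The key formula can be written down directly. This is only an illustrative sketch: the WireType enum and the functions make_key, key_field_id and key_wire_type are names invented here, not the identifiers used inside the protobuf library.

```cpp
#include <cassert>
#include <cstdint>

// Wire types listed on the slide.
enum WireType : std::uint32_t {
    kVarint = 0,
    kFixed64 = 1,
    kLengthDelimited = 2,
    kFixed32 = 5
};

// Key = (id << 3) | wire_type
std::uint32_t make_key(std::uint32_t field_id, WireType type) {
    return (field_id << 3) | static_cast<std::uint32_t>(type);
}

// Upper bits of the key: the field id.
std::uint32_t key_field_id(std::uint32_t key) { return key >> 3; }

// Lowest three bits of the key: the wire type.
std::uint32_t key_wire_type(std::uint32_t key) { return key & 0x7; }
```

For example, make_key(1, kVarint) yields 0x08, the first byte of an encoded message whose field 1 is an int32.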

Message encoding - Varints

The lower 7 bits of each byte store data; if the MSB is set, the next byte belongs to this value as well.

Example: 1 -> 0000 0001
300 (1 0010 1100) -> 1010 1100 0000 0010

Example: message Test1 { required int32 a = 1; }
and setting a to 150 (0x96) is encoded as 08 96 01:

08 = 0000 1000, so wire type = 0 (varint) and id = 1

96 01 = 1001 0110 0000 0001 -> 1001 0110 = 150 (drop each MSB, then join the 7-bit groups in reverse order)

Generic/unsigned integer types use varint encoding
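
The varint rule is short enough to sketch in code. The functions encode_varint and decode_varint below are hypothetical helpers written for this illustration; they follow the rule on the slide but are not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encodes `value` as a varint: 7 data bits per byte, least significant
// group first, MSB set on every byte except the last one.
std::vector<std::uint8_t> encode_varint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Decodes one varint starting at `pos` and advances `pos` past it.
// Returns false if the input ends early or the varint exceeds 10 bytes.
bool decode_varint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                   std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return true;  // MSB clear: this was the last byte
        }
    }
    return false;
}
```

Encoding 300 gives the bytes AC 02 and encoding 150 gives 96 01, matching the examples on the slide.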

Message encoding - ZigZag

int32 stores negative values in full length (always 10 bytes)
Signed integer types (e.g. sint32) use ZigZag
Mapping small positive AND negative values to small sizes:

 0 -> 0
-1 -> 1
+1 -> 2
-2 -> 3
+2 -> 4

i.e. n -> (n << 1) ^ (n >> 31)
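
The mapping can be tried out with a few lines of code. The function names zigzag_encode32 and zigzag_decode32 are chosen for this sketch; the code also assumes that right-shifting a negative int32_t is an arithmetic shift, which is what common compilers do (and what C++20 guarantees).

```cpp
#include <cassert>
#include <cstdint>

// n -> (n << 1) ^ (n >> 31)
// The left shift is done on the unsigned value to avoid signed overflow;
// (n >> 31) is all ones for negative n and zero otherwise.
std::uint32_t zigzag_encode32(std::int32_t n) {
    return (static_cast<std::uint32_t>(n) << 1) ^
           static_cast<std::uint32_t>(n >> 31);
}

// Inverse mapping: the lowest bit carries the sign.
std::int32_t zigzag_decode32(std::uint32_t z) {
    return static_cast<std::int32_t>((z >> 1) ^ (0u - (z & 1u)));
}
```

The result of zigzag_encode32 is then written as an ordinary varint, so -1 takes one byte instead of ten.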

Message encoding - The rest

string, bytes: varint-encoded length + raw data
float, double: as is (little endian)
repeated fields:
  packed=false: tag/id occurs multiple times
  packed=true: tag + size + elements

Unused fields are not part of the serialized message
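
To make the difference between the string layout and the two repeated layouts concrete, here is a small hand-rolled encoder. It is a sketch under the rules listed above; Bytes, append_varint, encode_string_field, encode_repeated_unpacked and encode_repeated_packed are names invented for the example, and the field ids used in the note below are arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends `value` as a varint (7 data bits per byte, MSB = "more follows").
void append_varint(Bytes& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// string/bytes field: key (wire type 2), varint length, raw data.
Bytes encode_string_field(std::uint32_t id, const std::string& text) {
    Bytes out;
    append_varint(out, (static_cast<std::uint64_t>(id) << 3) | 2);
    append_varint(out, text.size());
    out.insert(out.end(), text.begin(), text.end());
    return out;
}

// packed=false: the key (wire type 0) is repeated for every element.
Bytes encode_repeated_unpacked(std::uint32_t id,
                               const std::vector<std::uint64_t>& values) {
    Bytes out;
    for (std::uint64_t v : values) {
        append_varint(out, static_cast<std::uint64_t>(id) << 3);
        append_varint(out, v);
    }
    return out;
}

// packed=true: one key (wire type 2), payload size, elements back to back.
Bytes encode_repeated_packed(std::uint32_t id,
                             const std::vector<std::uint64_t>& values) {
    Bytes payload;
    for (std::uint64_t v : values) {
        append_varint(payload, v);
    }
    Bytes out;
    append_varint(out, (static_cast<std::uint64_t>(id) << 3) | 2);
    append_varint(out, payload.size());
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}
```

For the values 3, 270 and 86942 in field 4, the packed form is 22 06 03 8E 02 9E A7 05 (8 bytes), while the unpacked form repeats the key and comes to 20 03 20 8E 02 20 9E A7 05 (9 bytes); the saving grows with the number of elements.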

Important points to remember

Always remember that backward and forward compatibility is goal #1 with protobuf
Be absolutely sure about a field's long-term necessity when using required
Choose id numbers 1-15 for often-used values (more efficiently encoded)
Choose appropriate data types based on expected values: signed/unsigned/generic may result in better encoding

Updating a message

To update a message

Define new fields as repeated or optional and set sensible default values (for backwards compatibility)
Do not change tags/ids and do not recycle tags/ids (when e.g. removing optional fields in an update, make sure that the id will not be used again, preferably by prefixing the name of the obsolete field with e.g. OBSOLETE_)
Some data type changes (e.g. between ints) possible
When changing defaults, remember that default values are not encoded but always used as defined in .proto
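
Why these rules keep old programs working can be illustrated with a hand-written reader. The sketch below is not the protobuf parser; it assumes a hypothetical "old" reader that only knows a varint field with id 1 and skips every other field by looking at its wire type. All function names are invented for the example, and groups (wire types 3 and 4) are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Reads one varint at `pos`; returns false on truncated or over-long input.
bool read_varint(const Bytes& in, std::size_t& pos, std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Skips the value of an unknown field; the wire type tells how long it is.
bool skip_value(const Bytes& in, std::size_t& pos, std::uint32_t wire_type) {
    std::uint64_t tmp = 0;
    const std::size_t left = in.size() - pos;
    switch (wire_type) {
        case 0:  // varint
            return read_varint(in, pos, tmp);
        case 1:  // 64-bit
            if (left < 8) return false;
            pos += 8;
            return true;
        case 2:  // length-delimited
            if (!read_varint(in, pos, tmp) || tmp > in.size() - pos) return false;
            pos += static_cast<std::size_t>(tmp);
            return true;
        case 5:  // 32-bit
            if (left < 4) return false;
            pos += 4;
            return true;
        default:  // not handled in this sketch
            return false;
    }
}

// "Old" reader: extracts varint field 1 and ignores all ids it does not know.
bool read_field1(const Bytes& in, std::uint64_t& field1) {
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::uint64_t key = 0;
        if (!read_varint(in, pos, key)) return false;
        const std::uint32_t wire_type = static_cast<std::uint32_t>(key & 0x7);
        if ((key >> 3) == 1 && wire_type == 0) {
            if (!read_varint(in, pos, field1)) return false;
        } else if (!skip_value(in, pos, wire_type)) {
            return false;
        }
    }
    return true;
}
```

A message written by a newer program that added a string field with id 2 can still be read: the reader finds field 1 and steps over the unknown field. This only works as long as id 1 keeps its meaning, which is why ids must never be changed or recycled.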

More stuff

Extensions

Define ranges of tags/ids that can be defined in another .proto file

message OneMessage {
  extensions 100 to max;
}

// Elsewhere...
extend OneMessage {
  optional Foo foo_ext = 100;
  optional Bar bar_ext = 101;
  optional Baz baz_ext = 102;
}

More stuff (cntd)

Services

Possible to create stubs for RPC services using protobuf, e.g.

service SearchService {
  rpc Search(SearchRequest) returns (SearchResponse);
}

Self-describing messages, Reflection
Custom options

Questions?
