
Under The Hood

What's going on behind the scenes?

Wilfred Springer

Table of Contents

1. Codec
2. CodecFactory
2.1. CompoundCodecFactory
2.2. WholeNumberCodecFactory
2.3. BooleanCodecFactory
2.4. ObjectCodecFactory
2.5. ListCodecFactory
3. Binding
4. CodecDecorator
4.1. LazyLoadingCodecDecorator
4.2. SlicingCodecDecorator

The previous chapter introduced a couple of simple introductory cases, showing some of the tricks
the framework has up its sleeves if you do not feel the desire to customize anything. While you were
reading that chapter, you might have already gone like "hmmm, that's sweet, but unfortunately it
doesn't work in my corner case". If that's what you thought, then this is the chapter you need.

In this chapter, I will explain what is actually going on under the hood, in order to understand how
to extend the framework yourself. The focus in this chapter is on four major abstractions: the
Codec interface itself, the CodecFactory interface, the Binding interface and the CodecDecorator
interface. In addition to discussing the actual interface, this chapter will also discuss some of the
implementations, in order to help you to understand how everything fits together.

1 Codec
Let's first revisit the Codec interface. In the previous chapter, we already talked about the way you
use this interface. In fact, we said that you had better use the Codecs convenience class in all cases. If
you do that, then there is actually very little to know about the Codec interface itself. It just magically
works.

However, if you want to extend the framework yourself, then this is the first interface you really
need to understand, since this is probably the interface you need to implement.

Figure 1. Codec Interface

public interface Codec<T> {

    T decode(BitBuffer buffer, Resolver resolver, Builder builder)
        throws DecodingException;
    int getSize(Resolver resolver);
    Expression<Integer, Resolver> getSize();
    CodecDescriptor getCodecDescriptor();
    Class<?>[] getTypes();
    Class<?> getType();

}

The decode(BitBuffer, Resolver, Builder) method is the operation that decodes data
from the BitBuffer into an instance of T.

The Resolver allows the Codec to resolve references used in Preon annotations.

The Codec interface is the interface implemented by objects that are able to decode data from a
BitBuffer and to encode data into a BitBuffer [1]. In addition to that, the Codec needs to be able to
make some sort of prediction on the number of bits occupied by the encoded data. And last but not
least, the Codec needs to be able to return a CodecDescriptor, used for rendering documents with a
description of the Codec.

When I just said that Codecs are capable of decoding data from a BitBuffer, you could have actually
read 'decoding an object from a BitBuffer'. Codecs are associated with a single type, and are expected
to return only a single instance of that type from the BitBuffer.

In many cases, your objects will hold references to many other objects. Does that mean that a single
Codec needs to be able to recursively traverse the object graph and understand how to decode each
of the individual members of the objects that it encounters?

The answer to that question is both yes and no. "Yes", since whenever you invoke decode on the
Codec, it needs to be able to reproduce all objects that are referenced by the object that is going
to be returned. "No", since the Codec does not have to understand all of that itself. It can simply
delegate to other Codecs, one for every type of attribute it encounters.

If you are decoding an object, you could actually use the Codec interface directly. If you create an
instance using the Codecs class, it just magically knows what to do. We just learned that Codecs
constructed like this will most likely delegate their work to other Codecs, which in turn will most
likely delegate their work to other Codecs, and so on and so forth. This way, the Codec both hides a
chain of responsibility and is able to act as a link inside a chain of responsibility.
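To make the delegation concrete, here is a minimal sketch. It does not use Preon's real types: SimpleCodec, IntCodec, PointCodec and SegmentCodec are hypothetical names, and the sketch reads whole bytes from a java.nio.ByteBuffer instead of bits from a BitBuffer. It only shows how a Codec for a composite type can rely on other Codecs for its members.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified stand-in for Preon's Codec interface. It reads
// whole bytes from a java.nio.ByteBuffer rather than bits from a BitBuffer.
interface SimpleCodec<T> {
    T decode(ByteBuffer buffer);
}

// Leaf codec: decodes one 32-bit big-endian integer.
final class IntCodec implements SimpleCodec<Integer> {
    @Override
    public Integer decode(ByteBuffer buffer) {
        return buffer.getInt();
    }
}

final class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

final class Segment {
    final Point from;
    final Point to;
    Segment(Point from, Point to) { this.from = from; this.to = to; }
}

// Knows the layout of a Point, but delegates the integer decoding.
final class PointCodec implements SimpleCodec<Point> {
    private final SimpleCodec<Integer> coordinate;

    PointCodec(SimpleCodec<Integer> coordinate) { this.coordinate = coordinate; }

    @Override
    public Point decode(ByteBuffer buffer) {
        int x = coordinate.decode(buffer);
        int y = coordinate.decode(buffer);
        return new Point(x, y);
    }
}

// Knows the layout of a Segment, but delegates the Point decoding.
final class SegmentCodec implements SimpleCodec<Segment> {
    private final SimpleCodec<Point> point;

    SegmentCodec(SimpleCodec<Point> point) { this.point = point; }

    @Override
    public Segment decode(ByteBuffer buffer) {
        Point from = point.decode(buffer);
        Point to = point.decode(buffer);
        return new Segment(from, to);
    }
}
```

The caller only sees a single SimpleCodec for Segment. The chain behind it (SegmentCodec, PointCodec, IntCodec) stays hidden, which is the same idea as the chain of responsibility described above.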

2 CodecFactory

The previous section ended with stating that a single Codec most likely delegates to other Codecs,
which in turn delegate to other Codecs, etc. Obviously, each Codec has to be constructed before it
can be used. All of these instances are created as a result of the create() method on Codecs. But how
does the Codecs class know which ones to create?

As you might have already guessed by its name, it is of course the CodecFactory. The
CodecFactory has a single operation, which is expected to either return a Codec for the context passed
in, or return null.

[1] Encoding is currently not supported, but it is likely to be implemented in the future.


Figure 2. CodecFactory Interface

public interface CodecFactory {

    <T> Codec<T> create(AnnotatedElement metadata,
                        Class<T> type,
                        ResolverContext context);

}

Almost every type of Codec you will ever use, will be created by a CodecFactory. If you are searching
the Preon codebase for the different type of Codecs supported by it, then chances are you will
stumble across CodecFactories only. And if you want to extend the framework with your own
Codecs, the thing you actually need to pass to the Codecs class is a CodecFactory, and not a Codec.

The CodecFactory needs to be able to create a Codec from the three parameters passed in: the type of
object expected, a so-called ResolverContext and metadata. If the Codec is used to decode data to be
injected in a field, then the metadata provides access to the annotations defined on that field [2].

The third parameter (ResolverContext) is a Limbo ReferenceContext. This is the object that supports
your CodecFactory in creating references to the context of the field for which it is currently trying to
create a Codec.

Example 1. Expression sample

class Stuff {
    @Bound int nrOfThings;
    @BoundList(size="nrOfThings") Thing[] things;
}

Let's take the class above as an example, and assume that your CodecFactory needs to see if it is
able to construct a Codec for the "things" field. The annotation contains an expression defining the
number of "things" in the array. However, the size annotation attribute is just a String.

In order to be able to turn the expression "nrOfThings" into something usable, we need to turn that
expression into something we can actually evaluate - in this case, a Limbo Expression object. And if
there is a problem with this expression, we obviously want to find out early.

[2] In general, the CodecFactory should not rely on the assumption that the metadata passed in is
based on a field. It should just treat it as a number of hints suggesting how to decode data.


Figure 3. Expression applied

There are basically two things that could be wrong with the expression: either the expression cannot
be interpreted, or the expression contains references to variables not available in the context in
which the expression will be evaluated. The ResolverContext supports detecting the latter case.

This is the way it works: whenever your CodecFactory gets a chance to create a Codec based on
metadata, type information and a ResolverContext passed in, it will have to use the ResolverContext
to create Expression objects. The Expressions class used to create Expression instances accepts a
ReferenceContext as a parameter, and ResolverContext is nothing but a special ReferenceContext.
(One specific to Preon.) Normally, you would pass the ResolverContext directly, but there are cases
in which you could consider replacing or wrapping the ResolverContext [3].

So, the first and most important purpose of the ResolverContext is early validation of your
expression. The second and also important purpose of the ResolverContext is to facilitate
documentation getting generated from your Codec.
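Both purposes can be illustrated with a small sketch. It is not Limbo and not Preon's ResolverContext: FieldReference is a hypothetical class that uses plain Java reflection, and the owning class plays the role of the context. The point is only that a reference is checked when it is created, not when data is decoded, and that the reference decides how it describes itself.

```java
import java.lang.reflect.Field;

// Hypothetical reference to an int field, validated when it is created.
final class FieldReference {
    private final Field field;

    private FieldReference(Field field) { this.field = field; }

    // Fails immediately if 'name' is not declared on the context class,
    // long before any data is decoded.
    static FieldReference create(String name, Class<?> context) {
        try {
            Field field = context.getDeclaredField(name);
            field.setAccessible(true);
            return new FieldReference(field);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(
                "Unknown reference '" + name + "' in " + context.getSimpleName(), e);
        }
    }

    // Evaluates the reference against an already (partially) decoded object.
    int resolve(Object instance) {
        try {
            return field.getInt(instance);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // The reference decides how it is rendered in generated documentation.
    String describe() {
        return "the value of " + field.getName() + " read before";
    }
}

// Sample context class for the sketch.
final class Stuff {
    int nrOfThings;
}
```

With this sketch, FieldReference.create("nrOfThings", Stuff.class) succeeds, while a misspelled name such as "nrThings" is rejected as soon as the reference is built.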

If your CodecFactory constructs a Codec based on an expression, then the documentation generated
by that Codec probably needs to take that into account.

In case of the example of Figure 3, “Expression applied”, you would expect a description similar to
this: "First you read a 32-bit integer. Then you read a list of items with the size corresponding to the
32-bit integer you just read before."

Any type of realistic documentation of this file format needs to include this dependency. You want
the documentation to clearly state that the size of the list depends on the 32 bits you just
read before. That last bit "the 32 bits you just read before" is encapsulated in a Reference. And the
ResolverContext allows the Expressions class to obtain these references without having to analyze
the entire class again.

So, the expression "nrOfElements" will be parsed into a Reference to the nrOfElements attribute
defined before. It's the responsibility of the Reference to render itself in a useful way. The Codec
or CodecFactory does not even try to make sense of it. It just relies on the ResolverContext to
return a proper reference.

[3] There are actually cases in which you might want to replace that ResolverContext with another one,
typically when you want to introduce new variables, or when you are basically 'popping' or 'pushing'
the stack (that is, when your current context changes into the context of the attribute's type).

Expression expr = Expressions.create("nrOfElements", resolverContext);

// Potentially results in:
// "the number of elements defined before"

Now, this may be one of the areas in which the flexibility is getting a little in the way of understanding
what's going on here. However, just to comfort you a little, it's just object orientation. That's all it is.
The Reference itself decides how it needs to be represented. And the type of ResolverContext passed
to the Codec decides how these References are constructed. With that, the sky is the limit.

2.1 CompoundCodecFactory
The CompoundCodecFactory is one of the few implementations of CodecFactory that
doesn't actually create any Codec itself. The main purpose of the CompoundCodecFactory
is to hide the complexity of choosing a particular type of CodecFactory behind an interface. The
CompoundCodecFactory references a list of other CodecFactories. When asked to construct a
Codec, it will simply try each of the factories it references in turn until one of them returns a Codec.
If all of them fail, it will return null, just what you would expect from a CodecFactory.
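The behaviour described here can be sketched in a few lines. This is not Preon's CompoundCodecFactory: SimpleCodec, SimpleCodecFactory, CompoundFactory and the two sample factories are hypothetical, and the factory method takes only a type (no metadata, no ResolverContext) to keep the sketch short.

```java
import java.util.List;

// Hypothetical, simplified codec: decodes a value from the start of a byte array.
interface SimpleCodec<T> {
    T decode(byte[] data);
}

// Hypothetical, simplified factory: only the requested type is passed in.
interface SimpleCodecFactory {
    // Returns a codec for the requested type, or null if this factory cannot help.
    <T> SimpleCodec<T> create(Class<T> type);
}

// Creates no codecs itself; it only asks the factories it references, in order.
final class CompoundFactory implements SimpleCodecFactory {
    private final List<SimpleCodecFactory> delegates;

    CompoundFactory(List<SimpleCodecFactory> delegates) { this.delegates = delegates; }

    @Override
    public <T> SimpleCodec<T> create(Class<T> type) {
        for (SimpleCodecFactory factory : delegates) {
            SimpleCodec<T> codec = factory.create(type);
            if (codec != null) {
                return codec; // first factory that succeeds wins
            }
        }
        return null; // nobody could construct a codec
    }
}

final class ByteFactory implements SimpleCodecFactory {
    @Override
    @SuppressWarnings("unchecked")
    public <T> SimpleCodec<T> create(Class<T> type) {
        if (type != Byte.class) return null;
        SimpleCodec<Byte> codec = data -> data[0];
        return (SimpleCodec<T>) codec; // safe: T is Byte here
    }
}

final class BooleanFactory implements SimpleCodecFactory {
    @Override
    @SuppressWarnings("unchecked")
    public <T> SimpleCodec<T> create(Class<T> type) {
        if (type != Boolean.class) return null;
        // Reads the most significant bit of the first byte.
        SimpleCodec<Boolean> codec = data -> (data[0] & 0x80) != 0;
        return (SimpleCodec<T>) codec; // safe: T is Boolean here
    }
}
```

The order of the list matters in this sketch: a factory that accepts almost anything should be placed last, so that more specific factories get the first chance.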

2.2 WholeNumberCodecFactory
There are many CodecFactories that create Codecs themselves. (There are also a few other
CodecFactories that don't create Codecs themselves, apart from the CompoundCodecFactory, but
that's a subject that can wait for a while.) This section starts with one of the simplest: the
WholeNumberCodecFactory.

The WholeNumberCodecFactory creates Codecs capable of decoding - well - whole numbers.

I didn't want to say integers, since that might lead you to believe that it's capable
of decoding java.lang.Integer and its primitive counterpart only, which is not the case.
WholeNumberCodecFactory supports decoding byte, short, int, long and all object representations of
those types.

The WholeNumberCodecFactory is the first example in which a CodecFactory could use the
metadata passed with annotations to decode the data in the proper way. By default, it will return a
Codec for byte, short, int or long type of fields whenever:

• null is passed in as metadata;

• An @Bound instance is passed in as metadata;
• An @BoundNumber instance is passed in as metadata.

The null case is not important for now. The @Bound annotation supports the default case. The
CodecFactory will take this annotation as a signal that it actually needs to create a Codec, using the
defaults for the type passed in. The @BoundNumber annotation supports the case in which you do
want a Codec to be created, but you don't like the defaults.

Here are some examples in which you would want to use the @BoundNumber annotation, instead of
the @Bound annotation:


• You want to decode 4 bits into a byte.

• You want to force little endian when decoding a 32-bit integer.
• You want to decode a 32-bit unsigned integer as a long.

The WholeNumberCodecFactory is a good example of the pattern that you will find implemented in
almost all other CodecFactories:

• Attempt to return a default Codec whenever null or @Bound is passed in.

• ...unless there is some other piece of metadata passed in, telling the CodecFactory how to
customize the Codec it creates.
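The three @BoundNumber cases listed above come down to plain bit arithmetic. The sketch below shows that arithmetic in isolation. It does not use Preon's BitBuffer or the Codecs created by the WholeNumberCodecFactory: WholeNumbers and its methods are hypothetical helpers that work on a byte array, with bits numbered from the most significant bit of the first byte.

```java
// Hypothetical helpers showing the bit arithmetic behind non-default whole numbers.
final class WholeNumbers {

    // Reads 'count' bits (1 to 63) starting at bit offset 'pos',
    // most significant bit first, and returns them as a non-negative long.
    static long readBits(byte[] data, int pos, int count) {
        long value = 0;
        for (int i = 0; i < count; i++) {
            int bitIndex = pos + i;
            int bit = (data[bitIndex / 8] >> (7 - bitIndex % 8)) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    // Case 1: decode 4 bits into a byte.
    static byte nibble(byte[] data, int pos) {
        return (byte) readBits(data, pos, 4);
    }

    // Case 2: decode a 32-bit integer stored in little-endian byte order.
    static int littleEndianInt(byte[] data, int pos) {
        return Integer.reverseBytes((int) readBits(data, pos, 32));
    }

    // Case 3: decode a 32-bit unsigned integer as a long, so that values
    // above Integer.MAX_VALUE do not turn negative.
    static long unsignedInt(byte[] data, int pos) {
        return readBits(data, pos, 32);
    }
}
```

The third case is the reason a long is needed at all: the same 32 bits that read as -1 in an int read as 4294967295 when they are treated as unsigned.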

2.3 BooleanCodecFactory
The BooleanCodecFactory is by far the simplest example of a CodecFactory (and associated Codec)
that you could ever come up with. If the CodecFactory is challenged with a boolean type (either
the primitive type or the object type) and the presence of an @Bound annotation in the metadata
passed in, then it will create a new Codec. Whenever this Codec is asked to decode a value from the
BitBuffer, it will read a single bit and return true if the bit is 1, and false otherwise.

2.4 ObjectCodecFactory
In theory, it could have been possible to construct Codecs by hand, by using a constructor. However,
that's not actually something you want to do yourself. The Codec created would have to closely
resemble the data structure you need to decode. And since every Codec is capable of decoding
one value, and there is a fair chance that the object you are trying to decode consists of many other
objects, it would even be quite hard to do this yourself.

The good news is: you don't have to do all of that yourself. The ObjectCodecFactory basically does it
all for you.

The ObjectCodecFactory works on basically all existing objects. If all other CodecFactories fail, then
the ObjectCodecFactory might still be able to return a Codec, even though the Codec created might
basically be a no-op. (It won't decode anything.) That's why normally every CompoundCodecFactory
instance is advised to try the ObjectCodecFactory as a last attempt.

The ObjectCodecFactory is probably much simpler than you would expect. Suppose that you want to
create a decoder for instances of type A:

1. Get the list of all fields declared on type A.

2. For each field declared on type A, create a Codec for the type of value accepted by that field,
and wrap both the reference to the field and the Codec for its values in a new Binding instance.

3. Create a new Codec that - on an invocation of decode - will always create an instance of type A,
and populate its data by giving every Binding the opportunity to load its data from the BitBuffer
passed in. (The Binding will simply use the Codec it references to do the actual decoding of the
field's value.)
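The three steps above can be sketched with plain reflection. This is not Preon's ObjectCodecFactory or Binding: SimpleCodec, FieldBinding and ReflectiveCodec are hypothetical names, data is read from a java.nio.ByteBuffer, and only int, short and nested object fields are handled. The sketch also assumes that every class has a no-argument constructor and that there are no self-referencing types.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Preon's Codec, reading from a ByteBuffer.
interface SimpleCodec {
    Object decode(ByteBuffer buffer);
}

// Step 2: pairs a field with the codec that decodes its value.
final class FieldBinding {
    private final Field field;
    private final SimpleCodec codec;

    FieldBinding(Field field, SimpleCodec codec) {
        field.setAccessible(true);
        this.field = field;
        this.codec = codec;
    }

    void load(Object target, ByteBuffer buffer) {
        try {
            field.set(target, codec.decode(buffer));
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class ReflectiveCodec implements SimpleCodec {
    private final Class<?> type;
    private final List<FieldBinding> bindings = new ArrayList<>();

    ReflectiveCodec(Class<?> type) {
        this.type = type;
        // Step 1: list the declared fields. The JDK does not promise an order;
        // this sketch assumes declaration order, which common JVMs return.
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) continue;
            bindings.add(new FieldBinding(field, codecFor(field.getType())));
        }
    }

    // Plays the role of the CodecFactory consulted for each field type.
    private static SimpleCodec codecFor(Class<?> fieldType) {
        if (fieldType == int.class) return buffer -> buffer.getInt();
        if (fieldType == short.class) return buffer -> buffer.getShort();
        return new ReflectiveCodec(fieldType); // nested object: recurse
    }

    // Step 3: create an instance and let every binding load its field.
    @Override
    public Object decode(ByteBuffer buffer) {
        Object instance;
        try {
            Constructor<?> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            instance = constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type, e);
        }
        for (FieldBinding binding : bindings) {
            binding.load(instance, buffer);
        }
        return instance;
    }
}

// Sample types for the sketch.
final class Version { short major; short minor; }
final class Header { int length; Version version; }
```

Note how codecFor falls back to another ReflectiveCodec for anything it does not recognize. That mirrors the "last attempt" role described for the ObjectCodecFactory, including the risk that the resulting Codec decodes nothing useful.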

You may have wondered how the ObjectCodecFactory creates a Codec corresponding to the field's
type in Step 2. The answer is simple: it uses a CodecFactory. It may be a different CodecFactory than
the ObjectCodecFactory itself, but it's highly likely that it references the ObjectCodecFactory itself
somewhere under its covers.

We just said that ObjectCodecFactories use Binding objects under the covers. It's important that you
know about these guys, since the ObjectCodecFactory actually allows you to override the way they
work, by plugging in your own BindingFactory. We are not going to discuss the typical use case yet,
but for now it's important at least to know they exist. (Check out Section 3, “Binding” for a typical
example on how to leverage this plugpoint.)

2.5 ListCodecFactory
With the three CodecFactories listed above, we would already be able to decode nested objects, in
which each of the object's fields references either a numeric value or another object. But that's not
enough. (In fact, we don't even know when it would be enough; that's why Preon is an extensible
framework. We leave it up to you to decide when you consider it to be enough.)

One of the most important missing cases is support for lists. Many binary encoding standards have
some repeating sequences inside. This is where the ListCodecFactory comes in.

The ListCodecFactory allows you to decode lists of objects. At this stage, it's limited to a list of a
certain size, but it is expected that there will be other implementations that have other ways to
demarcate the end of the list.

The ListCodecFactory creates a Codec whenever the @BoundList annotation is passed in. It will use
the attributes of this annotation to figure out how and what to decode. In the simple case, that could
be as little as stating the type and the size of the list.

If you create a Codec simply by passing the type and the size of the list, then the
ListCodecFactory can be expected to return a Codec implementation that will - on calling decode() -
return a List implementation that allows you to visit all elements in the List. Don't expect one of the
default implementations of List though. By default, the Codec created by the ListCodecFactory will
return its own implementation of List, one that decodes objects on the fly, on demand.
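A List that decodes on demand can be sketched on top of java.util.AbstractList. This is not the List implementation Preon returns: LazyIntList is a hypothetical class that assumes fixed-size 32-bit elements preceded by a 32-bit element count, read from a java.nio.ByteBuffer. The decodeCount field exists only to make the laziness visible.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;

// Hypothetical read-only list that decodes an element only when it is requested.
final class LazyIntList extends AbstractList<Integer> {
    private final ByteBuffer buffer;
    private final int offset;  // byte position of the first element
    private final int size;    // number of elements
    int decodeCount = 0;       // only here to observe the laziness

    LazyIntList(ByteBuffer buffer, int offset, int size) {
        this.buffer = buffer;
        this.offset = offset;
        this.size = size;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", size: " + size);
        }
        decodeCount++;
        // Absolute read: does not move the buffer's position.
        return buffer.getInt(offset + index * Integer.BYTES);
    }

    @Override
    public int size() {
        return size;
    }

    // Reads the element count, remembers where the elements start and
    // skips over them without decoding a single one.
    static LazyIntList decode(ByteBuffer buffer) {
        int count = buffer.getInt();
        int start = buffer.position();
        buffer.position(start + count * Integer.BYTES);
        return new LazyIntList(buffer, start, count);
    }
}
```

Skipping is cheap in this sketch because every element has the same, known size. With variable-size elements, a lazy list would need another way to locate element n.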

It is important to emphasize the difference between the Codec that returns the List instance, and the
Codec used to decode elements of the List. These are two completely different Codecs. The Codec
decoding the List is seeded with a Codec capable of reading the elements of the List. Figure 4 shows
how these elements collaborate in a real world scenario.

Figure 4. Decoding a List

3 Binding
A couple of sections back, I talked about Bindings, and how the Codec created by an
ObjectCodecFactory uses these bindings to load and store values from and in a Field. I also said that
you can customize the way these Bindings behave. This section is going to give you an example.

Many binary formats have some conditional built into the specification. Here are some examples:

• If the version of the file is bigger than 400, then read this data structure first, otherwise continue
with the next data structure.


• If the first bit read is 1, then read 7 bits and decode it as a byte; otherwise continue to read 7 + 8
bits and decode as a short.

All of these cases could have been solved by breaking out of the ordinary framework and implementing
the entire encoding/decoding logic yourself. That would not only be really hard, but would also result in
code that is not self-documenting in the way Preon normally is.

Preon takes another approach, and allows you to declaratively specify the conditions in which the
framework should try to load data in bound fields, using the @If annotation. The @If annotation
takes a single argument, which is the condition stated in Limbo.

The reason Preon is capable of dealing with these expressions is that it will create a Binding that
only loads data from the BitBuffer if the condition holds. This is why it is sometimes
convenient to have the ability to create your own custom Binding implementation instead of the
default one.
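A conditional Binding can be pictured as a wrapper around an ordinary one. The sketch below is not Preon's Binding interface and does not parse Limbo: SimpleBinding and ConditionalBinding are hypothetical, the condition is a plain java.util.function.Predicate, and data is read from a java.nio.ByteBuffer. It models the first example above, where extra data is only present for versions bigger than 400.

```java
import java.nio.ByteBuffer;
import java.util.function.Predicate;

// Hypothetical, simplified binding: loads one piece of data into the target.
interface SimpleBinding<T> {
    void load(T target, ByteBuffer buffer);
}

// Wraps another binding and only lets it read when the condition holds
// for the object decoded so far.
final class ConditionalBinding<T> implements SimpleBinding<T> {
    private final Predicate<T> condition;
    private final SimpleBinding<T> delegate;

    ConditionalBinding(Predicate<T> condition, SimpleBinding<T> delegate) {
        this.condition = condition;
        this.delegate = delegate;
    }

    @Override
    public void load(T target, ByteBuffer buffer) {
        if (condition.test(target)) {
            delegate.load(target, buffer);
        }
        // Otherwise nothing is read and the buffer position is untouched.
    }
}

final class FileHeader {
    int version;
    int extra; // only present in the data when version > 400
}

final class FileHeaderDecoder {
    static FileHeader decode(ByteBuffer buffer) {
        SimpleBinding<FileHeader> version = (h, b) -> h.version = b.getInt();
        SimpleBinding<FileHeader> extra = new ConditionalBinding<FileHeader>(
            h -> h.version > 400,
            (h, b) -> h.extra = b.getInt());

        FileHeader header = new FileHeader();
        version.load(header, buffer); // must run first: the condition depends on it
        extra.load(header, buffer);
        return header;
    }
}
```

The order of the bindings matters: the condition can only refer to data that has already been loaded, just like the expressions discussed for the ResolverContext.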

4 CodecDecorator
Up until now, we have only seen examples of CodecFactories creating Codecs. Codecs are however
not always constructed by CodecFactories. If you want the framework to deal with your own
Codec, then there is another way to make it create your own Codec, which is by implementing the
CodecDecorator interface.

As you may have guessed, the CodecDecorator has something in common with the Decorator
Pattern (see GoF). The CodecDecorator allows you to transparently add additional behaviour to a
Codec, by applying some 'decoration'. So, where the CodecFactory accepts the type and metadata
indicating the specific Codec you want to construct, the CodecDecorator also accepts the Codec that
should be wrapped.

4.1 LazyLoadingCodecDecorator
When you ask the LazyLoadingCodecDecorator to decorate an existing Codec, you get a new Codec
that - on receiving a call to decode - will not pass the call on to the Codec it decorates; instead, it will
return a proxy that acts as the object that needs to be created by the decorated Codec. The actual
object will only be loaded from the BitBuffer right after you call an operation on that proxy for the
first time.
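The proxy idea can be sketched with the JDK's own java.lang.reflect.Proxy. This is an assumption made for illustration, not a description of how the LazyLoadingCodecDecorator builds its proxies: JDK proxies only work for interfaces, and LazyProxies, Greeting and the Supplier standing in for the decorated Codec are all hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical helper that defers loading until the first method call.
final class LazyProxies {

    // 'loader' stands in for "ask the decorated Codec to decode now".
    @SuppressWarnings("unchecked")
    static <T> T lazy(Class<T> iface, Supplier<? extends T> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private T target; // loaded on first use; not thread-safe in this sketch

            @Override
            public Object invoke(Object proxy, Method method, Object[] args)
                    throws Throwable {
                if (target == null) {
                    target = loader.get();
                }
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow what the real object threw
                }
            }
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}

// Sample interface for the sketch.
interface Greeting {
    String text();
}
```

Creating the proxy costs nothing in terms of decoding. The loader runs once, on the first call to any method of the proxy, and the result is reused for later calls.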

The LazyLoadingCodecDecorator is triggered by the presence of the LazyLoading annotation on the
type you want to have loaded lazily.

4.2 SlicingCodecDecorator
The SlicingCodecDecorator decorates existing Codecs by returning a new Codec that - on receiving
a decode call - passes the request on to the decorated Codec, but replacing the BitBuffer passed by
taking a slice of the original.

The main reason for having a SlicingCodecDecorator is to support type-length-value (TLV) records. The
main idea here is that Codecs reading the record's data should not be required to know the amount of
data to be expected. As you might have seen before, the Codec constructed doesn't support passing in
the maximum number of bytes to be read, or anything like that. And having to implement that logic for
all TLV record readers seemed to be a waste.

Instead of having every TLV record Codec implement support for detecting the end of the record,
the decision was made to externalize that behaviour and put it into a separate Codec, decorating the
Codec that reads the data from the TLV record.
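The slicing idea can be sketched with java.nio.ByteBuffer, which has a slice() operation of its own. This is not Preon's SlicingCodecDecorator: SimpleCodec, RemainingTextCodec and SlicingCodec are hypothetical, and the one-byte length prefix is an assumption made for the sketch (a real TLV format would also carry a type field and may encode the length differently).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Preon's Codec, reading from a ByteBuffer.
interface SimpleCodec<T> {
    T decode(ByteBuffer buffer);
}

// A record reader that knows nothing about lengths: it reads whatever it is given.
final class RemainingTextCodec implements SimpleCodec<String> {
    @Override
    public String decode(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.US_ASCII);
    }
}

// Decorator: reads the length, then hands the wrapped codec a slice that
// contains exactly that many bytes.
final class SlicingCodec<T> implements SimpleCodec<T> {
    private final SimpleCodec<T> delegate;

    SlicingCodec(SimpleCodec<T> delegate) { this.delegate = delegate; }

    @Override
    public T decode(ByteBuffer buffer) {
        int length = buffer.get() & 0xFF;            // assumed 1-byte unsigned length
        ByteBuffer slice = buffer.slice();           // starts at the current position
        slice.limit(length);                         // the record ends here
        buffer.position(buffer.position() + length); // the next record starts after it
        return delegate.decode(slice);
    }
}
```

Because the outer buffer is advanced by the declared length, the next record starts at the right place even if the wrapped codec reads less than the whole slice.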