
Authenticated encryption


Lorenzo Peraldo, Vittorio Picco

December 20, 2007


1 Introduction
1.1 Authenticated Encryption
1.2 Generic composition
1.3 Single-pass combined modes
1.4 Two-pass combined modes

2 CCM, Counter with Cipher Block Chaining-Message Authentication Code
2.1 Introduction
2.2 Description
2.3 Formatting function
2.4 Length of the MAC
2.5 Efficiency and performances of CCM
2.6 Criticism of CCM
2.7 A possible attack

3 GCM, Galois/Counter Mode
3.1 Introduction
3.2 Description
3.3 IV and Keys
3.3.1 Keys
3.3.2 IV
3.4 Implementations
3.5 Security

4 Conclusions

Bibliography

Chapter 1

Introduction


1.1 Authenticated Encryption

Authenticated Encryption (AE) is a term used to describe encryption systems which simultaneously
protect the confidentiality and the authenticity (integrity) of communications. These goals have long
been studied, but only recently have they attracted a high level of interest from cryptographers,
owing to the complexity of implementing separate systems for privacy and authentication in a single
application. For decades the solution to this problem has been to combine privacy and authentication
in a straightforward manner using the so-called "generic composition", but recently a number of new
constructions have appeared which achieve these two goals simultaneously, often much faster than
generic composition solutions.
We analyze authenticated encryption in the symmetric-key model. A single key K is chosen at random
and then shared between the sender and the receiver. Once the two parties have the key, we must
provide them with an AE algorithm such that the sender can process a selected message M with the AE
algorithm along with the key K (and possibly a nonce N), and then send the resulting output to the
receiver. The output of this processing is a ciphertext C, the nonce N and a short message
authentication tag T. The receiver should then be able to recover M using C, N and his copy of the
key K, and to verify the authenticity of the received message using the same parameters along with
the tag T.
A good AE algorithm must meet many requirements: for example performance, portability,
simplicity, parallelizability, freedom from patents and, of course, security.
This last requirement is perhaps the most important one, as an AE scheme has two goals, privacy
and authenticity, and it will not serve our needs if it is not secure. Privacy means that a passive
attacker who sees the ciphertext C and the nonce N cannot read the content of the message M. To
achieve this we can make C indistinguishable from a random bit string. Authenticity, instead, means
that an active attacker cannot easily generate a valid ciphertext C, a nonce N and a tag T such that
the receiver will believe they were generated by the authorized sender.
In many applications we do not only encrypt and authenticate our message, but we also need to
include some additional information A which must be authenticated too. For example in a network
packet we should encrypt the payload and authenticate both the header and the payload. For this
reason associated data needs to be included as input to the AE schemes. Schemes that allow associated
data are called AEAD (Authenticated Encryption with Associated Data).
One unfortunate aspect of most cryptographic schemes is that we cannot prove unconditionally that
a scheme meets the formal goals required of it. What we can prove depends on the type of
cryptographic object we analyze. If it is a primitive such as a block cipher, no proof of security
is possible, so we can only hope for security after showing that none of the known attacks (for
example differential cryptanalysis) works. For algorithms that are built on top of these primitives
we can prove at best that they are as secure as the underlying primitives.

1.2 Generic composition
The traditional way to obtain both authenticity and confidentiality was, until recently, to find two
well designed protocols, one for encryption and one for authentication, and then use them in sequence.
This is really straightforward and, at least at first glance, completely safe: if both algorithms
are safe, the reasoning goes, their combination in sequence should be safe as well.
In general the approach was to choose a strong mode of operation for block ciphers, like CBC,
and then to use it with authentication protocols that do not use keyed-hash functions. This kind of
approach has proved to be wrong, and the best example to show it is the WEP protocol.
The Wired Equivalent Privacy protocol is a common choice to protect WiFi networks. It provides
authentication with a simple CRC checksum and then uses a stream cipher to encrypt the data. The
mechanism is very simple, and it is also very simple to circumvent.
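Why a CRC under a stream cipher is so easy to circumvent can be illustrated with a minimal sketch.
It is not the real WEP frame format: the helper names `seal`, `open_` and `forge` are hypothetical,
and the keystream is simply random bytes standing in for the stream cipher output. The sketch relies
only on the fact that CRC-32 is an affine function, so an attacker who flips chosen bits of the
hidden message can also compute the matching correction to the encrypted checksum without knowing
the key.

```python
import os
import zlib


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings (the result has the length of the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))


def seal(keystream: bytes, msg: bytes) -> bytes:
    """WEP-like protection: append CRC-32 of the message, then XOR with a keystream."""
    icv = zlib.crc32(msg).to_bytes(4, "little")
    return xor(msg + icv, keystream)


def open_(keystream: bytes, ct: bytes) -> bytes:
    """Decrypt and check the CRC; raise ValueError if the check fails."""
    plain = xor(ct, keystream)
    msg, icv = plain[:-4], plain[-4:]
    if zlib.crc32(msg).to_bytes(4, "little") != icv:
        raise ValueError("integrity check failed")
    return msg


def forge(ct: bytes, delta: bytes) -> bytes:
    """Flip the bits of `delta` in the hidden message WITHOUT the keystream.

    For equal-length strings, crc(m ^ d) == crc(m) ^ crc(d) ^ crc(00..0),
    so the change to the checksum is known to the attacker.
    """
    crc_fix = zlib.crc32(delta) ^ zlib.crc32(bytes(len(delta)))
    return xor(ct, delta + crc_fix.to_bytes(4, "little"))


keystream = os.urandom(64)            # secret, unknown to the attacker
original = b"pay 0100 dollars"
target = b"pay 9100 dollars"
ct = seal(keystream, original)
forged = forge(ct, xor(original, target))
print(open_(keystream, forged))       # the receiver accepts the modified message
```

A keyed MAC does not have this linear structure, which is why the composition methods discussed
next rely on one.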
Another common mistake is to use the same key for both the authentication and the encryption
operations. This is a weaker requirement, though, and a careful implementation can greatly reduce
the related security risks.
There are three available choices when using generic composition:

• MAC then Encrypt (MtE): we compute a MAC of the plaintext, append it to the plaintext and then
encrypt the result;

• Encrypt then MAC (EtM): we first encrypt the plaintext and then authenticate the resulting
ciphertext;

• Encrypt and MAC (E&M): we first encrypt the plaintext and then authenticate the plaintext.

Some studies have been done on which of these three strategies is the best to achieve authenticated
encryption in the safest way, and the result was that in general Encrypt then MAC is the best choice.
This composition is secure whenever the MAC has a property called "strong unforgeability", whereas
with the other two methods there could be confidentiality problems.
The conclusion of these studies was that EtM can be considered safe when it is given a secure
encryption algorithm and a secure MAC, each using an independent key. MtE and E&M can be
considered secure only if attention is paid to the choice of the encryption algorithm and MAC
combination.
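A minimal sketch of Encrypt then MAC with two independent keys follows. The function names are
hypothetical, HMAC-SHA-256 plays the role of the MAC, and the "cipher" is a toy keystream derived
from SHA-256 that merely stands in for a real encryption algorithm. The nonce is assumed to have a
fixed length and to be unique per message.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # length of an HMAC-SHA-256 tag in bytes


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in only): XOR data with SHA-256(key, nonce, counter)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + (i // 32).to_bytes(8, "big")).digest()
        out += bytes(x ^ y for x, y in zip(data[i:i + 32], pad))
    return bytes(out)


def etm_seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt first, then authenticate the nonce and the ciphertext."""
    ct = keystream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return ct + tag


def etm_open(enc_key: bytes, mac_key: bytes, nonce: bytes, sealed: bytes):
    """Verify the tag before decrypting; return None if verification fails."""
    ct, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return keystream_xor(enc_key, nonce, ct)


enc_key, mac_key = os.urandom(32), os.urandom(32)   # independent keys
nonce = os.urandom(12)
sealed = etm_seal(enc_key, mac_key, nonce, b"hello, world")
print(etm_open(enc_key, mac_key, nonce, sealed))
```

Because the tag is checked before any decryption takes place, a modified ciphertext is rejected
without the receiver ever processing attacker-controlled plaintext.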
In addition to their extreme simplicity, the generic composition methods have the interesting
characteristic that, since the two operations are completely independent, it is possible to encrypt
only a subset of the total transmitted data. In this way we can add to a message some additional
data that will not be encrypted and is therefore covered by authentication only.
On the other hand, the obvious drawback of generic composition is the long time needed to
process the message twice, with two different keys.

1.3 Single-Pass combined modes

Until 2000 there was no way to obtain authenticated encryption with only a single pass over the
message. Generic composition was the only way to provide AE. In 2000 IAPM was developed.
While generic composition needs to invoke the block cipher 2 · m times (where m is the
number of blocks into which the plaintext has been split), IAPM needs just m + log(m)
invocations. This is because IAPM computes certain values, called the seeds, before the encryption;
the seed calculation is used to achieve authentication and needs only log(m) invocations of the
encryption algorithm.
After the release of IAPM many researchers started to work on their own single-pass AE solutions,
generally modifying the original structure of IAPM. The researchers understood the power of the
modes they were developing, so they all patented their discoveries. This was also their biggest mistake.
It has never been verified whether these patents overlap each other, even though some of them
probably do, given the hype around single-pass AE in that period. So far nobody has asked a court
to verify the possible overlaps, but the potential users of the proposed methods are afraid that
choosing to implement one of these solutions could expose them to legal action.
In conclusion, single-pass combined modes have never been used as a standard in any application,
and furthermore other teams of researchers have abandoned their work in this field because they
could have been accused of violating some Intellectual Property. The interest of researchers has
now moved in another direction, the two-pass combined modes.

1.4 Two-pass combined modes

The Intellectual Property problems raised by the single-pass combined modes clearly showed that
it was important to find good solutions to the authenticated encryption problem, and that these
solutions should be patent-free.
The two-pass combined modes represent a class of algorithms whose performance is not far from
that of the single-pass ones, but which carry no intellectual property restrictions.
The first to be developed was CCM (which will be explained in detail later); then EAX
tried to solve some of the problems of CCM, then CWC was developed to improve on EAX, and finally
GCM was created. GCM is probably the best algorithm available now and will also be explained
in all its aspects later.
CCM is not much better than plain generic composition, because it in fact uses a standard MAC
generation algorithm (CBC-MAC) followed by standard CTR encryption, but it offers some advantages.
The biggest is the use of a single key both to encrypt and to generate the MAC. We said that this
could be a security problem, but the CCM designers took great care over this point and CCM has
since been proved secure. CCM has also become the mandatory mode for 802.11 wireless networks.
EAX solved some of the problems of CCM, in particular the need to know the message length
in advance and its complicated definition, which uses some unnatural parametrization. EAX uses a
modified CBC-MAC called OMAC, followed by the CTR mode of operation.
CWC was developed because neither CCM nor EAX can be parallelized in hardware: CBC-MAC (like its
variants) is inherently serial and so cannot achieve high throughput. CWC can operate at up to
10 Gbit per second. The features of CWC, though, can be exploited only in a parallel computing
environment; on a normal computer its performance is no better than that of the previous modes.
GCM is the most recently developed algorithm and the one with the highest performance. It was
developed by one of the researchers behind CWC and is therefore entirely devoted to improving the
performance of CWC and to filling its gaps. GCM was created starting from a modification of the
mathematical construct that lies at the basis of CWC. While CWC operates on integers modulo the
prime $2^{127} - 1$, GCM makes use of Galois field arithmetic: this apparently abstract choice
allows implementers to build a much simpler hardware circuit for the hashing function. This allows
GCM to reach a throughput of more than 10 Gbit per second. The only possible competitor for GCM is
OCB (which is a single-pass mode), but many reasons (other than intellectual property) make GCM
the better choice. For example OCB needs two different schemes, one for encryption and one for
decryption, while GCM needs only one.

Chapter 2

CCM, Counter with Cipher Block Chaining-Message Authentication Code

2.1 Introduction
CCM is a mode of operation for a symmetric key block cipher algorithm. It combines the techniques of
the Counter (CTR) mode and the Cipher Block Chaining-Message Authentication Code (CBC-MAC)
algorithm to provide confidentiality and authenticity of the data.
The Counter with CBC-MAC mode is designed to use the Advanced Encryption Standard
(AES) block cipher, or any other block cipher with a block size of 128 bits or more, to provide
authentication and encryption using a single key. Since this is a symmetric key algorithm with a
single secret key, the key must be established beforehand and be known only by the two parties
involved in the transmission of the data. For this purpose CCM requires a well-designed key
management structure.
CCM is intended for use in a packet environment and thus it can’t be used with stream data. The
plaintext input includes a header, which is authenticated but not encrypted, and a payload, which is
both authenticated and encrypted. Each packet must be an integral number of bytes and must be
assigned a unique value, called nonce. The maximum number of packets that can be authenticated
with the same key is determined by the size of the nonce, which is one of the parameters that must
be decided when designing the algorithm.
CCM processing expands the packet size by appending an encrypted authentication tag. Successful
verification of this authentication tag provides assurance that the packet originated from a source with
access to the block cipher key and it also provides assurance that the packet wasn’t altered after the
generation of the authentication tag. Failed verification of the tag is designed to reveal both accidental
and intentional, unauthorized modifications of the packet.
CCM allows pre-computation of the key stream if the nonce value is known, allowing half of the
computational load to be pre-processed. This property can be used to improve the efficiency of an
implementation. The size of the implementation can be minimized as well, as CCM uses only the
forward encryption function of the block cipher and not the inverse function.
CCM mode was designed by Russ Housley, Doug Whiting and Niels Ferguson. At the time CCM
mode was developed, Russ Housley was employed by RSA Laboratories.
A minor variation of CCM, called CCM*, is used in the ZigBee standard. CCM* includes all
of the features of CCM and additionally offers encryption-only and integrity-only capabilities.

2.2 Description
CCM consists of two main processes: generation-encryption and decryption-verification. These two
processes combine the counter mode encryption and the cipher block chaining-message authentication
code to compute a MAC to provide authentication.

Before implementing CCM it is important to have valid key establishment and key management
in place to ensure the effectiveness of the block cipher algorithm used for encryption. The secret
key for this algorithm must be generated randomly and be shared only by the parties to the
information, or the whole algorithm would be useless. Moreover the same key can be used only for a
limited number of invocations of the block cipher algorithm, and this limit should be set to
$2^{61}$.
As we said, CCM combines two cryptographic mechanisms based on this block cipher algorithm.
The first one is the Counter mode, used for confidentiality, which requires the generation of
a sufficiently long sequence of blocks (counter blocks) that will then be used to encrypt the message.
These blocks do not need to be secret, but they must be distinct within a single invocation and
across all other invocations of the block cipher algorithm under the same secret key.
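The reason counter blocks must never repeat under one key can be seen in a small sketch. The
function `ciph` below is only a stand-in for the block cipher (a 128-bit value derived with
HMAC-SHA-256, which is an assumption of this example and not AES), and the counter block layout is
arbitrary. If two messages are encrypted with the same counter blocks, the keystream cancels out
and the XOR of the two ciphertexts equals the XOR of the two plaintexts.

```python
import hashlib
import hmac


def ciph(key: bytes, block: bytes) -> bytes:
    """Stand-in for the forward block cipher function (NOT AES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:16]


def ctr_xor(key: bytes, counter_blocks, data: bytes) -> bytes:
    """Counter mode: encrypt each counter block and XOR the result onto the data."""
    stream = b"".join(ciph(key, c) for c in counter_blocks)
    return bytes(x ^ y for x, y in zip(data, stream))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = b"0123456789abcdef"
counters = [i.to_bytes(16, "big") for i in range(2)]   # reused for both messages
p1 = b"first secret message....."
p2 = b"second secret message...."
c1 = ctr_xor(key, counters, p1)
c2 = ctr_xor(key, counters, p2)

# The attacker sees only c1 and c2, yet learns p1 XOR p2.
print(xor(c1, c2) == xor(p1, p2))
```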
The other mechanism used in CCM is CBC-MAC. This method is basically an adaptation of the
cipher block chaining mode for authentication. Starting from a zero initialization vector, CBC
is applied to the data to be authenticated, and the last block generated, truncated to an
established length, is used as an authentication tag called the MAC.
Note that the same key is used both for the CTR and the CBC-MAC.
For the generic CCM mode there are two parameter choices. The first one is the size of the
authentication tag M, which involves a trade-off between message expansion and the probability that
an attacker can undetectably modify a message. Valid values for M are 4, 6, 8, 10, 12, 14 and 16
bytes. The second choice is the parameter L, the size of the length field, which requires a
trade-off between the maximum message size and the size of the nonce. Therefore the maximum length
of the messages we want to encrypt and authenticate must be defined beforehand.

To authenticate and encrypt the message we need the following input information:

• an encryption key K for the block cipher;

• a nonce N ;

• the message, also called payload P , of length determined by the choice of the parameter L;

• additional authenticated associated data A, used to authenticate plaintext packet headers, or
other information about the message.

CCM produces as output the ciphertext C.

The first step for authentication is to generate the authentication tag M (called T in the steps
below). This is done using CBC-MAC. First a formatting function is applied to three of the inputs,
the payload, the associated data and the nonce, to produce the blocks $B_0, B_1, \ldots, B_n$.
These blocks provide the input for the CBC-MAC function, which generates the MAC we need using the
key K for the cipher block chaining.
Then we need to perform encryption. Once we have the authentication tag M, the Counter mode
is applied to generate the counter blocks $CTR_0, CTR_1, \ldots, CTR_m$. With these blocks we can
encrypt the message by XORing the octets of the payload with the encryptions of the blocks
$CTR_1, \ldots, CTR_m$. The encryption of $CTR_0$ is instead XORed with the authentication tag M to
generate an authentication value.
The final output of the generation-encryption process is the ciphertext C, which consists of the
encrypted payload followed by the encrypted authentication value computed before. This is the detail
of the steps needed for generation-encryption:

1. Apply the formatting function to (N, A, P) to produce the blocks $B_0, B_1, \ldots, B_r$;

2. Set $Y_0 = \mathrm{CIPH}_K(B_0)$;

3. For i = 1 to r, do $Y_i = \mathrm{CIPH}_K(B_i \oplus Y_{i-1})$;

4. Set $T = \mathrm{MSB}_{Tlen}(Y_r)$;

5. Apply the counter generation function to generate the counter blocks $Ctr_0, Ctr_1, \ldots, Ctr_m$;

6. For j = 0 to m, do $S_j = \mathrm{CIPH}_K(Ctr_j)$;

7. Set $S = S_1 \| S_2 \| \cdots \| S_m$;

8. Return $C = (P \oplus \mathrm{MSB}_{Plen}(S)) \| (T \oplus \mathrm{MSB}_{Tlen}(S_0))$.

For the process of decryption and the verification of authenticity and integrity we need the
following inputs:

• the received ciphertext C;

• the associated data A;

• the nonce N ;

• the cipher key K.

As an output this process produces our message in plaintext, or INVALID if the verification fails.
First of all the counter mode decryption is applied to the received ciphertext with the key K to
produce the payload and the associated authentication tag (MAC). Then the nonce, the associated
data and the computed payload are formatted according to the formatting function in order to produce
the blocks for the CBC-MAC mechanism. This is applied to recompute the MAC and compare it with
the received one in order to verify it. If the two do not match, the decryption-verification
function returns the error message INVALID, else it gives the payload as output.
To provide higher security, when INVALID is returned the payload P and the MAC
should not be revealed, and the implementation should ensure that a third party cannot distinguish
which step the error message results from. This is the detail of the steps needed for
decryption-verification:

1. If $Clen \le Tlen$, then return INVALID;

2. Apply the counter generation function to generate the counter blocks $Ctr_0, Ctr_1, \ldots, Ctr_m$;

3. For j = 0 to m, do $S_j = \mathrm{CIPH}_K(Ctr_j)$;

4. Set $S = S_1 \| S_2 \| \cdots \| S_m$;

5. Set $P = \mathrm{MSB}_{Clen-Tlen}(C) \oplus \mathrm{MSB}_{Clen-Tlen}(S)$;

6. Set $T = \mathrm{LSB}_{Tlen}(C) \oplus \mathrm{MSB}_{Tlen}(S_0)$;

7. If N, A, or P is not valid, then return INVALID, else apply the formatting function to (N, A, P)
to produce the blocks $B_0, B_1, \ldots, B_r$;

8. Set $Y_0 = \mathrm{CIPH}_K(B_0)$;

9. For i = 1 to r, do $Y_i = \mathrm{CIPH}_K(B_i \oplus Y_{i-1})$;

10. If $T \ne \mathrm{MSB}_{Tlen}(Y_r)$, then return INVALID, else return P.
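The two procedures can be sketched end to end as follows. This is an illustration under stated
assumptions, not a usable CCM implementation: `ciph` is a stand-in for CIPH_K built from
HMAC-SHA-256 (possible only because CCM uses the forward cipher function alone), so the outputs
will not match AES-CCM; the block formatting follows the layout described in Section 2.3 with the
flag encodings of NIST SP 800-38C; lengths are in bytes; and `None` plays the role of INVALID.

```python
import hashlib
import hmac

BLOCK = 16


def ciph(key: bytes, block: bytes) -> bytes:
    """Stand-in for CIPH_K (NOT AES): a 128-bit PRF from HMAC-SHA-256."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pad16(data: bytes) -> bytes:
    return data + bytes(-len(data) % BLOCK)


def format_blocks(nonce, adata, payload, tlen):
    """Formatting function: returns B_0, B_1, ..., B_r as a list of 16-byte blocks."""
    q = 15 - len(nonce)
    flags = (0x40 if adata else 0) | (((tlen - 2) // 2) << 3) | (q - 1)
    out = bytes([flags]) + nonce + len(payload).to_bytes(q, "big")
    if adata:
        a = len(adata)
        if a < 2**16 - 2**8:
            prefix = a.to_bytes(2, "big")
        elif a < 2**32:
            prefix = b"\xff\xfe" + a.to_bytes(4, "big")
        else:
            prefix = b"\xff\xff" + a.to_bytes(8, "big")
        out += pad16(prefix + adata)
    out += pad16(payload)
    return [out[i:i + BLOCK] for i in range(0, len(out), BLOCK)]


def cbc_mac(key, blocks, tlen):
    """Y_0 = CIPH(B_0), Y_i = CIPH(B_i xor Y_{i-1}), T = MSB_Tlen(Y_r)."""
    y = ciph(key, blocks[0])
    for b in blocks[1:]:
        y = ciph(key, xor(b, y))
    return y[:tlen]


def keystream(key, nonce, m):
    """S_j = CIPH(Ctr_j) for j = 0..m, with Ctr_j = flags || N || j."""
    q = 15 - len(nonce)
    return [ciph(key, bytes([q - 1]) + nonce + j.to_bytes(q, "big"))
            for j in range(m + 1)]


def ccm_encrypt(key, nonce, adata, payload, tlen=8):
    tag = cbc_mac(key, format_blocks(nonce, adata, payload, tlen), tlen)
    s = keystream(key, nonce, -(-len(payload) // BLOCK))
    return xor(payload, b"".join(s[1:])) + xor(tag, s[0])


def ccm_decrypt(key, nonce, adata, ciphertext, tlen=8):
    if len(ciphertext) <= tlen:                      # step 1
        return None
    body, enc_tag = ciphertext[:-tlen], ciphertext[-tlen:]
    s = keystream(key, nonce, -(-len(body) // BLOCK))
    payload = xor(body, b"".join(s[1:]))
    tag = xor(enc_tag, s[0])
    expected = cbc_mac(key, format_blocks(nonce, adata, payload, tlen), tlen)
    # Constant-time comparison; the payload is released only on success.
    return payload if hmac.compare_digest(tag, expected) else None


key, nonce = b"k" * 16, bytes(range(13))
ct = ccm_encrypt(key, nonce, b"header", b"attack at dawn")
print(ccm_decrypt(key, nonce, b"header", ct))
```

Note that decryption has a single failure path after the length check, so a caller cannot tell
which step caused INVALID.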

2.3 Formatting function

The blocks $B_0, B_1, \ldots, B_n$ used by the CBC-MAC mechanism are generated by a formatting
function that acts on the nonce, the payload and the associated data. The number of blocks depends
on this formatting function. The formatting function must satisfy the following properties for any
key used:

• the first block $B_0$ uniquely determines the nonce N;

• the formatted data uniquely determines the payload P and the associated data A;

• the first block $B_0$ is distinct from any counter block used across all the invocations of CCM
under a given key; this means that the formatting function and the counter generation function
should not be constructed independently.

The formatting function also defines which values (bit lengths) of payload, associated data, nonce
and authentication tag are valid. In fact the formatting function imposes some restrictions on these
parameters that must be respected.
The bit lengths of N, A and P must be multiples of 8 bits, and the same holds for the authentication
tag. The first block of the formatted data contains the binary representation of the length of the
payload. The length in octets of this field is called q, and it is a parameter of the formatting
function that we have to define. Therefore q determines the maximum length in octets p of the
payload, since $p < 2^{8q}$. The value of q also determines the length in octets n of the nonce,
because the sum q + n must be constant (n + q = 15, since both fields share the first block with
the flags octet). Thus we have a trade-off between the maximum number of invocations of CCM under a
given key and the maximum length of the payload for these invocations.

Formatted input data

The formatted data, in the form of blocks $B_0, B_1, \ldots, B_n$, must be well defined. The first
block $B_0$ contains a byte dedicated to four flags, the nonce and the binary representation of the
message length, as we said before.
The four flags are the following: the first bit is Reserved and the second is Adata; these are
followed by two strings of 3 bits each, which contain the encoded values of t and q.

Byte number    0        1 ... 15-q    16-q ... 15
Contents       Flags    N             Q

Table 2.1: Formatting of B0

If the Adata field is 0 then there is no associated data; otherwise the associated data is formatted
in this way: the associated data length a is encoded, and the encoding is concatenated with the
associated data A, followed by the minimum number of zero bits needed so that the resulting string
can be partitioned into 16-byte blocks $B_1, \ldots, B_m$, where m depends on the associated data
length a.
Depending on the value of a, it is encoded into 2, 6 or 10 bytes.
The last n − m blocks $B_{m+1}, B_{m+2}, \ldots, B_n$ represent the payload followed by the minimum
number of zero bits such that this string can be partitioned into 16-byte blocks.
Not only the input data must be formatted: the counter blocks used in the CTR mode also
need to be formatted, in the following way:

Byte number    0        1 ... 15-q    16-q ... 15
Contents       Flags    N             [i]_{8q}

Table 2.2: Formatting of CTR_i

Each block $CTR_i$ contains a field with flags, the nonce N and the encoding of the index i as a
string of 8q bits. The first 2 bits of the flags are reserved for future use; these are followed by
3 bits set to 0, to ensure that all the counter blocks are different from $B_0$, and the last 3 bits
contain the encoding of q as in $B_0$.
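The two layouts in Tables 2.1 and 2.2 can be made concrete with a short sketch. The function names
are hypothetical; the encodings of t as (t − 2)/2 and of q as q − 1, and the valid range
2 ≤ q ≤ 8, are taken from NIST SP 800-38C and are assumptions as far as this text is concerned.
Lengths are in bytes.

```python
VALID_TAG_LENGTHS = (4, 6, 8, 10, 12, 14, 16)


def format_b0(nonce: bytes, plen: int, tlen: int, has_adata: bool) -> bytes:
    """Build B_0 = Flags || N || Q (Table 2.1)."""
    q = 15 - len(nonce)
    if not 2 <= q <= 8 or tlen not in VALID_TAG_LENGTHS:
        raise ValueError("invalid nonce or tag length")
    flags = (0x40 if has_adata else 0) | (((tlen - 2) // 2) << 3) | (q - 1)
    return bytes([flags]) + nonce + plen.to_bytes(q, "big")


def format_ctr(nonce: bytes, i: int) -> bytes:
    """Build CTR_i = Flags || N || [i]_{8q} (Table 2.2); bits 3..5 of Flags are zero."""
    q = 15 - len(nonce)
    return bytes([q - 1]) + nonce + i.to_bytes(q, "big")


nonce = bytes(13)                       # 13-byte nonce, hence q = 2
b0 = format_b0(nonce, plen=23, tlen=8, has_adata=True)
ctr = format_ctr(nonce, 23)
print(b0.hex())
print(ctr.hex())
```

Since every valid tag length is at least 4 bytes, the three t bits in the flags of B_0 are never
all zero, while they are always zero in a counter block; the first byte alone therefore keeps B_0
distinct from every counter block, even when nonce and index bytes coincide as in the example.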

2.4 Length of the MAC

The length of the MAC is one of the most important security parameters within CCM.
During the decryption-verification process we determine whether or not the purported ciphertext is a
valid ciphertext, which means that it has been generated by a generation-encryption process with
access to the secret key, the nonce and the associated data. The assurance of authentication of
CCM is based on the scarcity of valid ciphertexts. This means that an attacker without the key or
with no access to the generation-encryption process cannot easily generate a valid ciphertext, and
therefore if a ciphertext passes the decryption-verification process it is very likely to be a valid
and legitimately generated ciphertext.
The first thing we verify in a purported ciphertext is that its length is greater than the length
of the authentication tag (MAC), which we will call Tlen. The decryption-verification process then
compares the MAC decrypted from the ciphertext with the MAC computed for the received payload, the
nonce and the associated data. If the MACs are equal then the result is positive and the process
outputs the payload, else the output will be the error message INVALID. In this case at least one of
the payload and the associated data is not authentic. If the result is positive and we get the
payload then both the payload and the associated data are authentic, but this assurance cannot be
absolute, as an attacker still has a small probability of generating a valid ciphertext. This
probability depends on Tlen, and in particular it is no greater than $2^{-Tlen}$. As an attacker
could present many ciphertexts to increase this probability, or intercept a valid ciphertext and
replay it, the receiver should have proper controlling mechanisms in place.
So we can state that the larger the Tlen we choose, the greater the assurance of authentication,
but we must be aware of the trade-off that the choice of Tlen implies. In fact larger values of Tlen
require more bandwidth for the ciphertext, and this may not always be available on some connections.
To ensure good security and low risk we should always choose a value of Tlen of at least 64 bits.
A smaller value for Tlen could be chosen, for example, for low bandwidth connections where it is
not possible to attempt many trials. We can say that Tlen should satisfy the following inequality:

$$T_{len} > \lg\left(\frac{MaxErrs}{Risk}\right)$$

where Risk is the highest acceptable probability for an inauthentic message to pass the
decryption-verification process, and MaxErrs is the number of times that the output can be the error
message INVALID before the key is retired.
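As a worked example of the bound Tlen > lg(MaxErrs / Risk), the sketch below picks the smallest
CCM tag length, in bits, that satisfies it. The function name and the sample figures are
hypothetical; the list of candidate lengths is the 4 to 16 byte range given in Section 2.2.

```python
import math

# Valid CCM tag lengths in bits (4, 6, ..., 16 bytes).
VALID_TLEN_BITS = (32, 48, 64, 80, 96, 112, 128)


def min_tag_length(max_errs: int, risk: float) -> int:
    """Smallest valid Tlen (bits) with Tlen > lg(max_errs / risk)."""
    bound = math.log2(max_errs / risk)
    for tlen in VALID_TLEN_BITS:
        if tlen > bound:
            return tlen
    raise ValueError("no CCM tag length satisfies the bound")


# Tolerating 2^20 failed verifications with an acceptable risk of 2^-32
# gives lg(2^52) = 52, so the smallest valid tag length is 64 bits.
print(min_tag_length(2**20, 2.0**-32))
```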
To preserve security, implementations need to limit the total amount of data that is encrypted with
a single key; the total number of block cipher encryption operations in the CBC-MAC and encryption
together cannot exceed $2^{61}$. (This allows nearly $2^{64}$ octets to be encrypted and
authenticated using CCM. This is roughly 16 million terabytes, which should be more than enough for
most applications).
In an environment where this limit might be reached, the sender must ensure that the total number
of block cipher encryption operations in the CBC-MAC and encryption together does not exceed
$2^{61}$. Receivers that do not expect to decrypt the same message twice may also check this limit.
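The figure of nearly 2^64 octets can be checked with a short calculation, assuming that every
16-octet payload block costs two block cipher calls (one for CBC-MAC and one for the counter mode)
and ignoring the few extra calls spent on the first block, the tag and the associated data:

```latex
\[
\frac{2^{61}\ \text{calls}}{2\ \text{calls per block}} \times 16\ \text{octets per block}
  = 2^{60} \cdot 2^{4}
  = 2^{64}\ \text{octets}
  = 2^{24} \cdot 2^{40}\ \text{octets}
  \approx 16.8\ \text{million TiB}.
\]
```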

2.5 Efficiency and performances of CCM

Performance depends on the speed of the block cipher implementation. In hardware, for large packets,
the speed achievable for CCM is roughly the same as that achievable with the CBC encryption mode.
Encrypting and authenticating an empty message, without any additional authentication data,
requires two block cipher encryption operations. For each block of additional authenticated data one
additional block cipher operation is required. Each message block requires two block cipher encryption
operations. The worst-case situation is when both the message and the additional authentication data
are a single octet. In this case, CCM requires five block cipher encryption operations.
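These counts can be reproduced with a small sketch. The function name is hypothetical, lengths are
in bytes, and the count assumes that the 2, 6 or 10 byte length prefix is packed in front of the
associated data before it is split into 16-byte blocks, as described in Section 2.3.

```python
def ccm_cipher_calls(alen: int, plen: int) -> int:
    """Number of block cipher encryptions CCM needs for given data lengths."""
    def blocks(n: int) -> int:
        return -(-n // 16)              # ceiling division

    calls = 2                           # B_0 for the CBC-MAC, S_0 to encrypt the tag
    if alen:
        prefix = 2 if alen < 2**16 - 2**8 else 6 if alen < 2**32 else 10
        calls += blocks(prefix + alen)  # one CBC-MAC call per associated data block
    calls += 2 * blocks(plen)           # CBC-MAC plus keystream for each payload block
    return calls


print(ccm_cipher_calls(0, 0))   # empty message, no associated data
print(ccm_cipher_calls(1, 1))   # one octet of each: the worst case per octet
```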
Both CCM encryption and CCM decryption operations require only the block cipher encryption
function. In AES, the encryption and decryption algorithms have some significant differences. Thus,
using only the forward encrypt operation can lead to a significant saving in code size and in
hardware size.
In hardware, CCM can compute the message authentication code and perform encryption in a
single pass. This means that the implementation does not have to wait for the calculation of the MAC
to be completed before starting the encryption, which is a clear advantage for the speed of this
algorithm.

CCM was designed for use in a packet processing environment. The authentication processing
requires the message length to be known in advance, which makes one-pass processing difficult in
some environments. However, in almost all packet environments message or packet lengths are known
in advance, so this is not a problem.

2.6 Criticism of CCM

Several problems regarding different aspects of CCM have been analyzed.
In terms of efficiency the first problem with CCM is that it does not work on-line. This means that
it cannot work on a stream of data, as we have already said, but must have all the input data
available, because it needs to know the length of the message before starting the process. On the
other hand it is true that CCM is often used in environments where packet lengths are well known,
even if in many contexts we cannot know the length of the message we are handling until it is
finished.
Length-prepend annotation also causes another problem for the associated data: CCM disrupts its
word-alignment. This problem may cause significant losses in the performances, as modern machines
perform operations much more efficiently when pointers into memory fall along word-boundaries. This
can’t be done when we prepend the length-annotation to the associated data. This problem becomes
more relevant when the associated data is long, but we usually expect the associated data to be just
a few bytes.
Another problem related to the associated data comes from the fact that CCM can’t pre-process
static associated data. This would be very useful in contexts where the associated data is the same
during a whole communication session so that we could process it once for all in order to reduce the
time needed for encryption and decryption. This cannot be done because the algorithm encodes the
nonce and the message length before the associated data rather than after it.
The parametrization of CCM is another aspect that is often criticized. The main point of this
criticism is that the trade-off between the length of the nonce and the message length, induced by
the choice the user has to make before using CCM, appears to make no sense, as the two parameters
have nothing to do with each other. Furthermore the byte orientation of CCM, which is defined only
on octet strings, could be seen as a limitation of this mode of operation.

2.7 A possible attack

A common slogan in the design of Internet protocols is "be conservative in what you send, and liberal
in what you accept". Imagine a CCM implementation respecting this slogan literally: the sender
always sends messages with 16-byte tags, but the receiver accepts messages with valid tags of any
permitted length. An attacker could choose to create 4-byte tags and obtain a valid ciphertext
after about $2^{32}$ tries. However this attack is of limited value, as it is a blind forgery and
the attacker cannot control which message is accepted.
In another possible scenario a smarter attacker can fully control which message the
recipient will accept. This happens because the transmitted ciphertext has the form $C \| T$, where
$T$ is the authentication tag (MAC), and the received message $M$ is computed as a function of $C$.
The direct forgery attack can be performed as follows once the attacker has intercepted a valid
ciphertext $C \| T$ for a message $M$. The attacker decides which bit positions of $M$ to flip and
generates $2^{32}$ ciphertexts of the form $C' \| T'$, where $C'$ is obtained by XORing $C$ with the
chosen difference (so that it decrypts to the desired $M'$) and $T'$ varies over all 4-byte values. One
of these ciphertexts will be accepted as a valid
encryption of $M'$. Thus an attacker can forge any message with $2^{32}$ trials, given a single ciphertext
that was authenticated with a 128-bit tag.
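The bit-flipping step of this attack relies only on the counter-mode encryption, and can be sketched as follows. This is a minimal illustration, not CCM itself: the keystream generator (SHA-256 in place of the block cipher), the key, the nonce and the messages are all assumptions made for the example, and the search over the $2^{32}$ short tags is not simulated.

```python
import hashlib

def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream. SHA-256 stands in for the block cipher
    (an assumption for illustration only, not what CCM uses)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"k" * 16, b"n" * 12          # hypothetical session values
m = b"PAY 0100 EUR"                        # message known to the attacker
c = xor(m, toy_keystream(key, nonce, len(m)))

# The attacker never sees the key: he only XORs the chosen difference into C.
target = b"PAY 9100 EUR"
delta = xor(m, target)
c_forged = xor(c, delta)

# The recipient decrypts the forged ciphertext to exactly the chosen message.
recovered = xor(c_forged, toy_keystream(key, nonce, len(c_forged)))
print(recovered)
```

Only the tag stands between this modified ciphertext and acceptance, which is why a receiver that accepts 4-byte tags reduces the whole attack to $2^{32}$ guesses.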
One possible countermeasure to this kind of attack would be to fix the tag length parameter at
key-negotiation time so that only sender and receiver know it. In this way the recipient will accept
only one value of the tag length, in order to avoid the direct forgery attack, and won’t accept a new
tag length until the end of the session.

Chapter 3

GCM, Galois/Counter Mode

3.1 Introduction
GCM is a mode of operation for block ciphers that provides authenticated encryption. It makes
use of finite field (Galois field) arithmetic to provide authentication and uses the CTR mode
of operation of the AES cipher to provide encryption.
GCM has been developed to meet the growing need for fast algorithms, capable of keeping up with
faster and faster network speeds. In the era of Gigabit networks, a reliable and fast authenticated
encryption algorithm is desired. Encryption is usually a fast operation that can be realized in many
ways, and many protocols provide efficient encryption techniques; many of them can be implemented
in software and in hardware, making the most of pipelining and parallelization. The real bottleneck is
the authentication part. Although many algorithms provide authentication, almost none can keep
pace with Gigabit links, and in fact no standard exists.
GCM is an authenticated encryption mode of operation that can be realized both in software
and hardware, can be pipelined and work in a multiprocessor environment, and is free of intellectual
property restrictions, thus is a perfect candidate to fill the emerging need.
The Galois Counter Mode is based on the CTR mode, but adds a MAC, computed with operations in a
Galois field. This choice has been made because multiplication is extremely easy to
perform within such a field: it only involves basic operations that can be implemented in hardware.
The function that computes the MAC is called GHASH and it produces a tag; this tag is sent with
the ciphertext and must be verified by the recipient in order to authenticate the message. One of the
most interesting features of GCM is that the function GHASH doesn’t need to be applied to an
encrypted text: it can be used alone, only to provide authentication, and in this case the algorithm is
called GMAC. What is remarkable is that if one changes a few bits of the plaintext and then computes
the MAC again, the computational effort needed is proportional to the number of bits changed.
Another useful characteristic of GCM, which makes it particularly attractive, is that it needs an
Initialization Vector (IV), but this vector does not have a fixed length: it can be arbitrary. Since the
IV is often a nonce, any available nonce can be used, which broadens the field of application of GCM.
GCM has been designed to be used with AES, in particular AES-128, which is a common choice for
many applications. In any case 128 bits is just a suggestion: the Galois Counter Mode can be used
with other key lengths. The key used for encryption and to generate the MAC is the same: this choice
simplifies key distribution.

3.2 Description
GCM has two main functions, called authenticated encryption and authenticated decryption. We’re
going to analyze them separately, even though they are almost identical.

Authenticated encryption
This function needs 4 inputs:

• the plaintext, called P ;

• the secret key, called K;

• the initialization vector, IV ;

• the additional authentication data, shortly AAD, indicated with A in the formulas.

It produces only 2 outputs:

• the ciphertext, called C;

• an authentication tag, called T .

We will now describe how this algorithm works, also providing other information on the input and
output data. We assume AES-128 as the underlying cipher but, as we’ve already
said, the key size of the cipher is not important.
The authenticated encryption function acts on two different levels: one for encryption and one for
authentication. Let’s consider encryption first.
The plaintext length can be up to $2^{39} - 256$ bits, that is about 64 gigabytes of data. The plaintext
is encrypted with the key K, whose length is the one required by the cipher, in our case 128 bits. To start
the encryption the initialization vector IV must be provided; the IV can be of any length but the best
choice is 12 bytes (96 bits), because the algorithm is optimized for this case; applications where
efficiency is a must should choose this length.
The input data are organized in this way. The plaintext is divided into blocks of 128 bits (or
of the given cipher block size). In the end there are n − 1 blocks of 128 bits and a last block, $P_n^*$,
composed of the remaining u bits, which may not be enough to form a full 128-bit block.
Here is the encryption algorithm:

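1. $H = E(K, 0^{128})$;

2. $Y_0 = \begin{cases} IV \,\|\, 0^{31}1 & \text{if } \mathrm{len}(IV) = 96 \\ \mathrm{GHASH}(H, \{\}, IV) & \text{otherwise} \end{cases}$;

3. $Y_i = \mathrm{incr}(Y_{i-1})$ for $i = 1, \ldots, n$;

4. $C_i = P_i \oplus E(K, Y_i)$ for $i = 1, \ldots, n-1$;

5. $C_n^* = P_n^* \oplus \mathrm{MSB}_u(E(K, Y_n))$;

6. $T = \mathrm{MSB}_t(\mathrm{GHASH}(H, A, C) \oplus E(K, Y_0))$.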

Y is the counter of the CTR mode. Its initial value $Y_0$ is the IV padded with 31 zeros and a single
one. If the IV is not 96 bits long, a GHASH operation is performed to reduce (or expand) it to the
standard 128-bit length. This is the reason why a 12-byte IV is suggested for efficiency-critical applications:
in this way the IV is used as it is and no GHASH operation has to be performed. We will return to
the GHASH function later.
The counter is then incremented by one, the new value is encrypted with the key and the result is
XORed with the first block of data; the encryption of $Y_0$ itself is reserved for the tag (step 6). The
increment acts on the rightmost 32 bits of the counter and is done modulo $2^{32}$, which simply means
that when those bits reach the value $2^{32}$ they wrap around to zero. The counter is then incremented
again, encrypted and XORed with the next data block, and so on. For the last data block
the operations are the same, but only the u most significant bits of the encrypted counter are used,
where u is the length of that block.
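The counter handling just described can be sketched in a few lines. This is a minimal sketch under stated assumptions: it only handles 96-bit IVs, the block cipher is passed in as a function E(key, block), and `toy_E` is a SHA-256 based stand-in that is not AES and is used only so that the example runs with the standard library.

```python
import hashlib

def incr(y: bytes) -> bytes:
    """Increment the rightmost 32 bits of a 16-byte counter block modulo 2**32."""
    prefix, ctr = y[:12], int.from_bytes(y[12:], "big")
    return prefix + ((ctr + 1) % 2**32).to_bytes(4, "big")

def initial_counter(iv: bytes) -> bytes:
    """Y0 = IV || 0^31 1 for a 96-bit IV (other lengths need GHASH, not shown)."""
    if len(iv) != 12:
        raise ValueError("this sketch only handles 96-bit IVs")
    return iv + b"\x00\x00\x00\x01"

def ctr_encrypt(E, key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Steps 3-5: C_i = P_i xor E(K, Y_i), with Y_i = incr(Y_{i-1})."""
    y = initial_counter(iv)
    out = bytearray()
    for off in range(0, len(plaintext), 16):
        y = incr(y)                      # Y_0 itself is kept for the tag
        pad = E(key, y)
        block = plaintext[off:off + 16]
        # zip() stops at the shorter input, so a short last block uses
        # only the leading bytes of the pad, i.e. MSB_u(E(K, Y_n)).
        out += bytes(p ^ k for p, k in zip(block, pad))
    return bytes(out)

def toy_E(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher: NOT AES, for illustration only."""
    return hashlib.sha256(key + block).digest()[:16]
```

Since XOR is its own inverse, calling `ctr_encrypt` on the ciphertext with the same key and IV returns the plaintext; CTR mode never needs the decryption direction of the cipher, which is why a one-way stand-in is enough here.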
At the end we have n ciphertext blocks, each of them corresponding to a plaintext block. The
blocks can be assembled and sent, or sent separately, possibly with a sequence number to allow the
recipient to reconstruct the original message. The choice of the CTR mode of operation
makes it possible to treat each ciphertext block separately, so that the decryption operations can be
pipelined in hardware to maximize the throughput.

Now the authentication part. To authenticate the data a MAC is used. This
Message Authentication Code is computed using the GHASH function. We are not going to give
the detailed definition of this function; we will only outline its main features. GHASH is based
on two basic operations that are easy to implement in hardware: XOR and multiplication in a Galois
field. GHASH needs 3 inputs:
• the encryption of the all-zero string, also known as the hash sub-key, called H;
• the authenticated data A, up to $2^{64}$ bits;
• the ciphertext C.
The hash function performs a sequence of XORs and multiplications in $GF(2^{128})$. The output
is a 128-bit string, but this string is not itself the authentication tag T; it is the basis for computing
it. T is obtained by XORing this string with the encrypted initial counter $E(K, Y_0)$ and keeping only
the t most significant bits of the result, which allows the user to choose the security level of the tag.
Since the output of GHASH is a 128-bit string, it is natural to use it to resize an IV that is not 96
bits long. In this case the inputs of the function are not the same as before: the authenticated
data is replaced by an empty string and the ciphertext is replaced by the IV. The IV length has its
own bound in the specification: NIST SP 800-38D allows any IV from 1 to $2^{64} - 1$ bits.
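To make the outline above more concrete, here is a minimal sketch of GHASH using the bit-by-bit multiplication in $GF(2^{128})$. The function names are our own; the bit ordering (leftmost bit of a block is the lowest-degree coefficient), the reduction constant and the final length block follow our reading of NIST SP 800-38D. It is written for clarity, not for speed or side-channel resistance.

```python
R = 0xE1 << 120  # reduction constant: bits 11100001 followed by 120 zeros

def gf_mult(x: int, y: int) -> int:
    """Multiply two 128-bit field elements held as big-endian integers."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:       # i-th bit of x, counted from the left
            z ^= v
        # multiply v by the field generator, reducing when a bit falls off
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def _blocks(data: bytes):
    """Yield 128-bit blocks as integers, zero-padding the last one."""
    for off in range(0, len(data), 16):
        yield int.from_bytes(data[off:off + 16].ljust(16, b"\x00"), "big")

def ghash(h: bytes, a: bytes, c: bytes) -> bytes:
    """GHASH(H, A, C): X_i = (X_{i-1} xor block_i) * H over A, then C,
    then a final block holding the bit lengths of A and C (64 bits each)."""
    h_int = int.from_bytes(h, "big")
    x = 0
    for block in list(_blocks(a)) + list(_blocks(c)):
        x = gf_mult(x ^ block, h_int)
    lengths = ((8 * len(a)) << 64) | (8 * len(c))
    x = gf_mult(x ^ lengths, h_int)
    return x.to_bytes(16, "big")
```

The same function covers the IV case described above: for an IV that is not 96 bits long, the initial counter would be `ghash(h, b"", iv)`.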

Authenticated decryption
This function needs 5 inputs:
• the secret key, called K;
• the initialization vector, IV ;
• the ciphertext, called C;
• the additional authentication data, shortly AAD, indicated with A;
• the authentication tag, called T ;

At the output we obtain only 1 item:

• the plaintext P, or the special symbol FAIL if anything in the authenticated decryption goes wrong.

How does it work? It performs exactly the same operations as the encryption, with the exception
that the hash is computed before the decryption. This is possible because GHASH takes the ciphertext,
not the plaintext, as input, so the tag can be checked before anything is decrypted. Decryption itself
works because the ciphertext is obtained by XORing the encrypted counter value with the plaintext.
XOR is its own inverse, so after receiving the encrypted data it is sufficient to encrypt the local
counter (which must start from the same value as in the encryption algorithm, derived from the IV)
and XOR it with the received ciphertext.
This is the procedure:
1. $H = E(K, 0^{128})$;

2. $Y_0 = \begin{cases} IV \,\|\, 0^{31}1 & \text{if } \mathrm{len}(IV) = 96 \\ \mathrm{GHASH}(H, \{\}, IV) & \text{otherwise} \end{cases}$;

3. $T' = \mathrm{MSB}_t(\mathrm{GHASH}(H, A, C) \oplus E(K, Y_0))$;

4. $Y_i = \mathrm{incr}(Y_{i-1})$ for $i = 1, \ldots, n$;

5. $P_i = C_i \oplus E(K, Y_i)$ for $i = 1, \ldots, n-1$;

6. $P_n^* = C_n^* \oplus \mathrm{MSB}_u(E(K, Y_n))$.

$T'$ is compared to the T received with the ciphertext; if the two values match then the message
is authentic and decryption is performed, otherwise the FAIL symbol is produced and the procedure stops.
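The order of operations in the procedure (recompute the tag, compare, and only then decrypt) can be sketched as a small skeleton. Everything here is hypothetical scaffolding: `full_tag` and `ctr_xor` are parameters standing for the GHASH-based tag computation and the counter-mode XOR, and the two `toy_` functions are stand-ins built on the standard library so that the example runs; they are not GCM. The constant-time comparison is our own addition, not something stated in the text.

```python
import hashlib
import hmac
from typing import Callable, Optional

def authenticated_decrypt(
    full_tag: Callable[[bytes, bytes, bytes], bytes],
    ctr_xor: Callable[[bytes, bytes], bytes],
    iv: bytes, ciphertext: bytes, aad: bytes, tag: bytes,
) -> Optional[bytes]:
    """Return the plaintext, or None (the FAIL symbol) if the tag is wrong."""
    expected = full_tag(iv, aad, ciphertext)[: len(tag)]   # MSB_t of the full tag
    if not hmac.compare_digest(expected, tag):
        return None                      # FAIL: nothing is decrypted
    return ctr_xor(iv, ciphertext)       # XOR is its own inverse

# --- toy stand-ins, for demonstration only (not GHASH, not AES) ---
DEMO_KEY = b"demo key, not secret"

def toy_full_tag(iv: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    return hmac.new(DEMO_KEY, iv + aad + ciphertext, hashlib.sha256).digest()[:16]

def toy_ctr_xor(iv: bytes, data: bytes) -> bytes:
    stream = hashlib.sha256(DEMO_KEY + iv).digest()   # enough for 32 bytes of data
    return bytes(d ^ s for d, s in zip(data, stream))

iv, aad = bytes(12), b"header"
ciphertext = toy_ctr_xor(iv, b"hello")
tag = toy_full_tag(iv, aad, ciphertext)
print(authenticated_decrypt(toy_full_tag, toy_ctr_xor, iv, ciphertext, aad, tag))
```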

GMAC is the name given to the Galois Counter Mode of operation when only authentication
is needed. This could be done for many reasons, the simplest of which is that there may be no
interest in encrypting the data but only in authenticating them; another case is that we only want to
take advantage of the speed of the GMAC algorithm with small plaintexts.
To explain the latter we have to make a few considerations on the context where GCM could
be applied. We have seen that the authenticated encryption function encrypts the plaintext P only,
and not the additional data A, which is passed to the GHASH function to compute a hash but is
never encrypted by the encryption function E. In conclusion we have two kinds of data: the
plaintext (encrypted and authenticated) and the additional data (only authenticated). There are a
lot of applications that could take advantage of this characteristic and one of the most important is
packet routing over a network.
The header of an IP packet, for example, carries a lot of information that routers need in order to
forward the packet in one direction rather than another. If these data are encrypted, routing can’t be
done easily. With GCM, on the contrary, it is possible to encrypt the payload of the packets and leave
the header in clear text. Moreover, if we just want to authenticate the packets but not encrypt them we
can apply the GMAC function only.
It is very interesting to know what Internet traffic is made of. Some studies have been done
on this topic and they all lead to the same, and quite surprising, conclusion. First of all, TCP packets
represent almost 90% of the Internet traffic. The surprising part is that almost 60% of the worldwide
Internet traffic is made of packets no larger than 44 bytes! This is due to the very frequent use
of ACK and SYN packets, which are very small. An analysis of the performance of GCM/GMAC
compared to other similar authenticated encryption techniques shows that it is the best
performing mode of operation in the Internet environment.

3.3 IV and Keys

The initialization vector and the symmetric key are two of the most critical elements of this mode of
operation. If not used properly they can compromise the security offered by GCM. We will analyze
them separately.
There is a very important constraint on the choice of the IV and the key, which is strongly stated in
the official NIST publication. The document refers to the following principle as the "uniqueness"
requirement:

The probability that the authenticated encryption function ever will be invoked with the
same IV and the same key on two (or more) distinct sets of input data shall be no greater
than $2^{-32}$.

This limitation is obviously imposed to achieve a high security level, and must be obeyed in all
GCM implementations.

3.3.1 Keys
First of all, the key. In symmetric ciphers the key is the most important element, as
stated by Kerckhoffs’s principle. Keeping the algorithm secret is of no use if the keys are
not handled in a safe way; moreover, the security of the cipher should rely entirely on the key.
The official Recommendation, though, does not give any indication on how to create and distribute
the keys; it only says that a given key should be "fresh" and that the mechanism for creating
the keys should resist attacks and tampering. In addition, no method is specified for key
distribution, so an implementation of GCM should carefully take these aspects into account and choose a
proper way to create and distribute the keys, otherwise the security of the system would be seriously
compromised.
3.3.2 IV
About the IVs, the Recommendation is stricter. It specifies two frameworks to create the
initialization vectors, one deterministic and the other based on a random bit generator. We’re not going
to go into the details of these two methods; we will only describe them briefly, to highlight the
main differences. We will refer to the two frameworks respectively as the deterministic-based and the
RBG-based (RBG stands for Random Bit Generator).
Neither system fixes the length of the IV, which is arbitrary, and both treat the IV as the
composition of two fields, but the logical meaning of the fields is different in the two cases.
In the deterministic-based construction the first part is called the fixed field and the second one the
invocation field. Each device in the network has a unique fixed field, and every time the authenticated
encryption function is called the invocation field is incremented, to guarantee that no two ciphertexts are
created using the same IV in a reasonable amount of time.
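A minimal sketch of the deterministic-based construction follows. The class name and the field sizes (a 4-byte fixed field and an 8-byte invocation field, giving a 96-bit IV) are assumptions chosen for the example; the text above does not prescribe them.

```python
class DeterministicIV:
    """IV = fixed field || invocation field (sizes are illustrative)."""
    FIXED_LEN = 4        # bytes identifying the device (assumption)
    INVOCATION_LEN = 8   # bytes of per-key invocation counter (assumption)

    def __init__(self, fixed_field: bytes):
        if len(fixed_field) != self.FIXED_LEN:
            raise ValueError("fixed field must be 4 bytes in this sketch")
        self._fixed = fixed_field
        self._count = 0

    def next_iv(self) -> bytes:
        """Return a fresh IV; refuse to wrap around and repeat an old one."""
        if self._count >= 2 ** (8 * self.INVOCATION_LEN):
            raise OverflowError("invocation field exhausted: change the key")
        iv = self._fixed + self._count.to_bytes(self.INVOCATION_LEN, "big")
        self._count += 1
        return iv

# Two devices sharing one key never collide, because their fixed fields differ.
device_a = DeterministicIV(b"\x00\x00\x00\x01")
device_b = DeterministicIV(b"\x00\x00\x00\x02")
```

In a real device the counter would also have to survive a restart (for instance by being stored in non-volatile memory), otherwise the same IV could be issued again after a power down.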
In the RBG-based construction the two fields are called the random field and the free field. The
recommendation is that the random field is at least 96 bits long, while the free field could be 0 bits long.
The random field can be either a true random number generated in a secure way, or the increment of
a previous random number. The free field can take any value we like, but the
suggestion is to leave it empty, so that the IV is a completely random number.
Whichever method is used, we cannot generate an unlimited number of IVs with the same key,
otherwise we won’t meet the requirement expressed in the Recommendation. No more than $2^{32}$ IVs
can be used with the same key, but in general the number of IVs that can be used depends on the length of the
key and on the number of devices implementing GCM. The value $2^{32}$ is valid only if there are only
2 devices using a 128-bit key. In other cases, with shorter keys or a higher
number of devices, fewer initialization vectors can be used before changing the key.
In any case the Recommendation states clearly that the probability of using the algorithm with
the same key and IV must not exceed $2^{-32}$, otherwise a high security level is not guaranteed.
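As a rough sanity check on the order of magnitude of $2^{32}$, consider the RBG-based construction with a single key and 96-bit IVs drawn uniformly at random. This is the standard birthday bound under those assumptions, not a derivation taken from the Recommendation:

```latex
\Pr[\text{some IV repeats among } n \text{ invocations}]
  \;\le\; \binom{n}{2}\, 2^{-96}
  \;<\; \frac{n^{2}}{2^{97}},
\qquad
n = 2^{32} \;\Longrightarrow\; \frac{2^{64}}{2^{97}} = 2^{-33} \;\le\; 2^{-32}.
```

With $n = 2^{33}$ the same estimate gives about $2^{-31}$, which no longer satisfies the requirement, so under these assumptions $2^{32}$ invocations is close to the largest power of two that does.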

3.4 Implementations
We have seen that GCM can be implemented both in hardware and software, so we’re going to take
a brief look at these implementations, in particular at the hardware one.

As for software implementations, we are just going to show that there is a trade-off between two
variables: memory occupation and amount of computation. We consider the
GHASH function only, because it is the only operation that really needs to be taken into account: the other
operations are XORs and increments (both performed in 1 clock cycle) and the encryption (which depends
on the underlying block cipher, AES, and not on the GCM structure).
The GHASH operation that costs the most in terms of time (apart from the encryption) is the
multiplication over the Galois field. This operation, though, has an interesting property: it is linear in
the bits of one of the factors. For example, if we want to compute H · X, where H is the encrypted null
vector and X an arbitrary bit string, the operation is linear in the bits of X. This means that, given a
value of H, it is possible to construct a table with the results of the multiplication for certain values and
use them as a basis to compute the final result.
It can be shown that, considering the string X (128 bits) split into 16 parts, each 8 bits long, the
operation H · X can be performed by storing a table of intermediate results 65,536 bytes large. This table
must be computed every time a new key is going to be used, computing the value H and then all
the table entries. If we want to save memory we can consider X as split into 32 parts, each 4 bits long,
and in this case the needed table is only 8,192 bytes large.
Table 3.1 shows the amount of memory needed and the corresponding throughput, expressed
in cycles per byte.

Method                   Storage requirement            Throughput (cycles/byte)
Simple, 8-bit tables     65,536 bytes/key               13.1
Simple, 4-bit tables     8,192 bytes/key                17.3
Shoup’s, 8-bit tables    1024 bytes + 4096 bytes/key    32.1
Shoup’s, 4-bit tables    64 bytes + 256 bytes/key       69.3
No tables                16 bytes/key                   119

Table 3.1: Throughput of GHASH on a Motorola G4 processor
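The "simple, 8-bit tables" method can be sketched as follows. This is an illustration of the linearity argument only, with names of our own choosing: the bit-by-bit multiplication is repeated here so that the sketch is self-contained, the tables hold Python integers rather than packed 16-byte entries, and the value of H is the encryption of the all-zero block under the all-zero AES-128 key, used purely as an example.

```python
R = 0xE1 << 120  # reduction constant of the GCM field

def gf_mult(x: int, y: int) -> int:
    """Bit-by-bit multiplication in GF(2^128), GCM bit ordering."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def build_tables(h: int):
    """tables[i][b] = (byte value b placed at byte position i of X) * H.
    16 positions x 256 values x 16 bytes per entry = 65,536 bytes per key."""
    return [[gf_mult(b << (8 * (15 - i)), h) for b in range(256)]
            for i in range(16)]

def table_mult(x: int, tables) -> int:
    """X * H as the XOR of 16 table lookups, one per byte of X."""
    z = 0
    for i in range(16):
        byte = (x >> (8 * (15 - i))) & 0xFF
        z ^= tables[i][byte]
    return z

H_INT = 0x66E94BD4EF8A2C3B884CFA59CA342B2E   # example hash sub-key
TABLES = build_tables(H_INT)                  # done once per key
```

Because multiplication by H is linear over XOR, splitting X into bytes and adding up the 16 partial products gives the same result as the full multiplication, while each block now costs 16 lookups instead of 128 conditional shifts. The 4-bit variant works in the same way with 32 positions and 16 values each.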

One of the design goals of GCM was efficiency and the possibility of being implemented in hardware
in a simple way, to allow GCM to deal with authenticated encryption over Gigabit networks. A
hardware implementation is the only possible choice in such an environment, but it is also a very good
choice in other cases where speed is not so relevant.
A possible and straightforward implementation is the one represented in figure 3.1.

Figure 3.1: GCM basic hardware implementation scheme

The function requires 4 inputs: the IV, the additional data AAD, the plaintext and the key, which is
not explicitly represented in the figure. The rhomboids represent the points where data are switched.
The left part of the diagram is the part devoted to authentication, while the right part is the one
performing encryption.
At the very first cycle the IV is incremented and then sent to the encryption block (this block
itself can be realized in hardware in many ways, but here we’re looking at it as a black box, since
the explanation must be architecture-independent). The data are then sent to the left part of the
diagram, to the XOR operation, and here they wait for the other data to be computed.
The left part takes as input the AAD, performs the multiplication with H (remember, H is the
encrypted null vector) and sends the result to the XOR block where the data just computed are
waiting. Note that this operation can be done in parallel with the first one.
Now the first cycle is done and the switches are toggled. The counter is incremented and encrypted
again, but this time the result is sent to the right part and XORed with the first block of plaintext
waiting there. This time the encrypted data block is sent to the multiplication block instead of the AAD,
which will no longer be used. The operations go on until the plaintext is finished, and at the output we
receive the ciphertext and the authentication tag.
There are some very interesting considerations to make on this scheme.
The first is that if we look at figure 3.1 we see that the two parts of the scheme, the two
pipelines, are independent except for the tag creation part. If we can build two distinct encryption
blocks, both performing the AES operations, we can completely split the two pipelines, so that they
run completely in parallel.
The second observation is about the multiplying block. This part must perform the operation in
$GF(2^{128})$, or, generally speaking, in $GF(2^q)$ (remember that to achieve the top speed, the 128 bit cipher
should be chosen). The multiplication in such a field can be realized in hardware in many ways, each
with different requirements in terms of area occupation and time to compute the result. The fact is that
a parallel multiplier can do the operation in just 1 clock cycle, with an area occupation proportional
to $q^2$. In any case the area of this multiplier would be at most about 30% of the area needed by
the AES encryption block, so it doesn’t significantly affect the total occupation. Only in applications where
area occupation is very strongly limited should we consider other implementations of the multiplier,
noting that all other solutions need an area and a time to provide the result proportional to q.

3.5 Security
To analyze GCM security, it’s necessary to introduce a lot of mathematics. This is not the goal of
this text, so we just briefly outline the steps that lead to the result.
Consider this experiment. We have what is called a permutation oracle that acts in this way: it is a
black box that gives as output a bit string; sometimes the bit string is a completely random sequence
and sometimes it is the output of a pseudo-random function (PRF), that is the result of an encryption with a given
key. In the general case the probability of emitting random or pseudo-random streams is 0.5. An
attacker can query the permutation oracle and obtain a result; he then has to determine whether the
output string is a random sequence or the result of an encryption. How much better than guessing he
can do is what is called the distinguishing advantage.
In another experiment there are two oracles: the tag-generation oracle and the tag-verification
oracle. The first receives as input a string from the attacker and generates an authentication tag
for that string; the second receives a message and a tag, and tells whether the tag actually verifies that
message or not. The attacker can use the tag-generation oracle to construct message-tag pairs, try
to understand the way it works and then provide a pair to the tag-verification oracle. Obviously the
pairs built by the first oracle will be accepted by the second one, but we call forgery
advantage the probability that an attacker can build on his own a message/tag pair accepted by the
tag-verification oracle.
The distinguishing advantage is related to encryption security and the forgery advantage is related
to authentication security. The goal of securing an authenticated encryption operation is to reduce
to the minimum possible these two advantages, so that an attacker can’t exploit them to be able to
build ad hoc messages or, worst case, recover the key.
The proof of GCM security is not particularly complicated but requires some mathematics,
and therefore we are not going to explain all the passages. The results are quite intuitive and
tell us that to achieve very high security we need a long tag (128 bits is advised) and we should avoid
overly long IVs and overly long messages. In particular we observe that the distinguishing
advantage contains a term quadratic in the length of the plaintext and a term linear in both the length
of the plaintext and the length of the IV; the forgery advantage contains a term quadratic in the length
of P and a term linear in the length of the pair C, A.
If we use a long IV it will be hashed by the algorithm, and therefore we increase the probability of
collisions by using longer and longer IVs; in addition, if we encrypt very long messages we increase the
probability of collisions in the authentication tag T.
Obviously the security of a mode of operation relies on the security of the underlying block cipher,
so we can’t expect a high security level if we choose the right IVs and don’t encrypt very long messages,
but use a key of only 32 bits.
The official Recommendation gives us some advice to increase the security provided by GCM.
The first point is about the keys and is quite obvious. Keys should be kept strictly secret and
changed whenever needed, distributing them in a safe way.
The second point is about the IV. The repetition of an IV in the authenticated encryption function
causes serious problems because of how the CTR mode of operation is built. Remember that GCM
is based on the CTR technique. If we can induce a bit flip in the ciphertext, a corresponding bit
flip will be produced in the plaintext upon decryption. An attacker with an authenticated decryption
oracle could induce strategic bit flips to see the results in the plaintext. Normally the
MAC makes such an attack on GCM useless, because the modified ciphertext no longer matches the tag.
There is a problem though, when IVs are repeated. In this case the masking value $E(K, Y_0)$ is the
same for both messages, so the difference between the two tags is a function of the hash sub-key H
only, and it could theoretically be possible to recover the sub-key with specific attacks.
An attacker with H at his disposal could modify a ciphertext and then compute a valid authentication
tag. The receiver would notice that the IV used is the same, but could think that the sender was in
good faith, so would try to decrypt the message, and since the authentication would be verified he
could think that the message is authentic. In conclusion, the IV must be changed every time a transmission
is performed, and it must be guaranteed that the same IV is not used twice even after a power down.
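The counter-mode side of this problem can be shown with a toy example. This is a sketch under stated assumptions: SHA-256 stands in for the block cipher (it is not AES), and the key, IV and messages are made up. With a repeated IV the two messages are encrypted with the same keystream, so the XOR of the two ciphertexts equals the XOR of the two plaintexts and the key plays no role any more.

```python
import hashlib

def toy_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy counter-mode keystream; SHA-256 replaces the block cipher
    (an assumption for illustration only)."""
    out = b""
    counter = 1
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, iv = b"k" * 16, bytes(12)           # the SAME IV is used twice
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = xor(p1, toy_keystream(key, iv, len(p1)))
c2 = xor(p2, toy_keystream(key, iv, len(p2)))

# An eavesdropper who knows (or guesses) p1 recovers p2 without the key.
leaked_p2 = xor(xor(c1, c2), p1)
print(leaked_p2)
```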
The third point is about the tags. We know that given a tag of length t the probability of
obtaining a collision (i.e. the same tag with different ciphertexts) is $2^{-t}$. With GCM an attacker
could use techniques that increase this probability. Any of these forgery attacks that succeeds increases
the probability that other attacks of the same kind succeed, and in the end the hash sub-key H
could be compromised, canceling any authentication assurance.
The fourth and last point is related to the protection against replay attacks. To avoid this kind
of attack it is sufficient to follow the general principle of the second point, instructing the recipient
to discard messages with duplicated IVs for a given key, and, furthermore, to use timestamps in the
additional authenticated data.
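The first half of this countermeasure, discarding duplicated IVs for a given key, can be sketched as follows. The class and the `key_id` label are hypothetical names introduced for the example; a real receiver would also bound the memory used (for instance with a sliding window over a counter-based IV) and would check the timestamp carried in the additional data, neither of which is shown here.

```python
class ReplayGuard:
    """Remember the IVs already seen under each key and refuse repeats."""

    def __init__(self):
        self._seen = {}   # key identifier -> set of IVs already accepted

    def accept(self, key_id: str, iv: bytes) -> bool:
        """Return True the first time an IV is seen for this key, else False."""
        seen = self._seen.setdefault(key_id, set())
        if iv in seen:
            return False          # duplicated IV: discard the message
        seen.add(iv)
        return True

    def rekey(self, key_id: str) -> None:
        """Forget the history of a key that is no longer in use."""
        self._seen.pop(key_id, None)

guard = ReplayGuard()
first = guard.accept("session-1", bytes(12))
replayed = guard.accept("session-1", bytes(12))
other_key = guard.accept("session-2", bytes(12))
print(first, replayed, other_key)
```

The check should be made only after the tag has been verified, otherwise an attacker could fill the history with forged IVs.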

Chapter 4


Authenticated Encryption is probably one of the best available examples of what a misuse
of Intellectual Property can do. The problem of AE is quite important, especially in a high speed
network environment, and a reliable, fast and easy to implement AE protocol is needed. We’ve seen,
though, that the best technical solution has actually been blocked by a short-sighted use of the Intellectual
Property concept.
The two described solutions are good ways to solve the problem and are free of Intellectual Property
restrictions, but are not optimal solutions. They both use the two-pass combined mode strategy;
CCM does it in a very simple way, because it was one of the first solutions of this type ever
proposed; GCM is better in many ways, but in any case the approach is only slightly better than generic
composition.
So why choose an authenticated encryption protocol? One of the reasons is that it is often good
to have a single solution to two problems, because it could be cheaper to implement, for
instance. The advantage of saving time is unfortunately quite limited, since the single-pass solutions
are in fact unavailable. It is useful, though, to have an algorithm like GCM that can provide just
authentication, and AE only when needed.
Let’s have a look at the main properties CCM offers.
The main security function offered is of course authenticated encryption. There is no error
propagation during the generation process. Sender and recipient must be synchronized as they both need
to use the same nonce, based for example on a counter. The encryption process can be parallelized
if needed, but this is not true for the authentication process, so the CCM algorithm as a whole can’t be parallelized.
The process needs a single key, shared by sender and receiver and used both for the counter mode and
for the cipher block chaining, a nonce and a counter, which are part of the counter block. In terms of
memory requirements CCM requires memory for the encrypt operation of the underlying block cipher
algorithm, for the plaintext, the ciphertext and a packet counter.
One important feature of CCM is that the encryption key stream can be precomputed, saving time
and increasing speed. Unfortunately the same cannot be said for authentication.
As for GCM, we can briefly summarize its main features.
As in CCM there is no error propagation, because it is based on a mode that operates on independent
blocks. GCM can be efficiently parallelized, which greatly improves performance. It needs one key only:
the structure of the algorithm is designed to eliminate the security issues that could arise from the
use of the same key for the authentication and encryption parts. The IVs can have arbitrary length,
although the suggested one is 96 bits, and should never be reused with the same key. GCM is on-line,
in the sense that the recipient does not need to know the length of the incoming message, but can
process the data blocks as they arrive. The authentication tag has a variable length, from 0 to 128
bits, and the ciphertext has the same length as the plaintext.
Probably the most important feature of GCM is that it can be easily implemented in hardware,
allowing a throughput of more than 10 Gbps.


[1] NIST Special Publication 800-38C.

[2] NIST Special Publication 800-38D.


[4] J. Black. Authenticated encryption.

[5] P. Rogaway, D. Wagner. A Critique of CCM.

[6] J. Jonsson. On The Security of CTR + CBC-MAC.

[7] D. Whiting, R. Housley, N. Ferguson. Counter with CBC-MAC (CCM) IETF Internet Draft

[8] D. A. McGrew, J. Viega. The Security and Performance of the Galois/Counter Mode (GCM) of

[9] D. A. McGrew, J. Viega. The Galois/Counter Mode of Operation (GCM)

[10] D. A. McGrew, J. Viega. Flexible and Efficient Message Authentication in Hardware and Software

[11] K. Claffy, G. Miller, K. Thompson. The nature of the beast: recent traffic measurements from
an Internet backbone.