
Strictly Confidential Technical Documentation Open AMR Initiative

Open AMR Initiative


AMR Codec

Technical Documentation
Version 2.0, Revision A
July 2007

2.0, A, 07/2007 1 of 12

This material and information (“Information”) constitutes a trade secret of VoiceAge Corporation and is strictly confidential. You
agree to keep this Information confidential and to take all necessary measures to maintain its secrecy. Without limiting the foregoing,
VoiceAge Corporation considers its confidential Information, including, but not limited to, any source code and technical information,
to be an unpublished proprietary trade secret. If an authorized publication occurs, the following notice shall be affixed to it: Copyright
© 1996-2007 VoiceAge Corporation. All Rights Reserved.

No part of this material may be reproduced, including, but not limited to, photocopying, electronic or mechanical recording, nor
stored in a retrieval system, or otherwise transmitted, in any form or by any means, without the prior written permission of VoiceAge
Corporation.

VoiceAge Corporation assumes no responsibility for any errors or omissions. This Information is subject to continuous updates and
improvements. All warranties implied or expressed, including but not limited to implied warranties of merchantability, fitness for
purpose, condition of title, and non-infringement, are specifically excluded. In no event shall VoiceAge Corporation and its suppliers
be liable for any special, indirect or consequential damages or any damages whatsoever arising out of or in connection with the use
of this information. The foregoing disclaimer shall apply to the maximum extent permitted by applicable law, even if a particular
remedy fails its essential purpose.

ACELP and VoiceAge are registered trademarks of VoiceAge Corporation in Canada and/or other countries. Any unauthorized use
is strictly prohibited.

© Copyright 2007 VoiceAge Corporation

VoiceAge Corporation
750 Lucerne Road, Suite 250
Montreal, QC H3R 2H6 CANADA
Telephone: (514) 737-4940
Fax: (514) 908-2037
sales@voiceage.com
www.voiceage.com


Contents
Revision history
References
The AMR codec
Package contents
Data input/output format
Discontinuous Transmission (DTX)
About the Encoder/Decoder Sample Programs
    Usage of the encoder
    Usage of the decoder
    Building the sample programs
AMR API functions
    E_IF_init
    E_IF_encode
    E_IF_exit
    D_IF_init
    D_IF_decode
    D_IF_exit


Revision history
July 2007 Updated descriptive text, frame bitmap table, references and document template.

July 2004 Second release.

July 2002 First release of this document.

References
[1] 3GPP 1999 TS 26.071, “AMR speech Codec; General description.”
http://www.3gpp.org/ftp/Specs/html-info/26071.htm

[2] 3GPP TS 26.104: “ANSI-C code for the floating-point Adaptive Multi-Rate (AMR) speech codec.”
http://www.3gpp.org/ftp/Specs/html-info/26104.htm

[3] IETF RFC 3267, “RTP payload format and file storage format for the Adaptive Multi-Rate (AMR)
and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs,” June 2002.
http://www.ietf.org/rfc/rfc3267.txt

[4] 3GPP 1999 TS 26.101, “AMR Narrowband Speech Codec; Frame Structure.”
http://www.3gpp.org/ftp/Specs/html-info/26101.htm


The AMR codec


VoiceAge’s AMR is an adaptive multi-rate narrowband speech codec with eight bit rate modes ranging
from 4.75 kbps to 12.2 kbps and an additional low-bit-rate background noise mode. The codec includes a
voice activity detector, a comfort noise generator and an error concealment mechanism, all of which
improve speech quality over lossy transmission media. For a general description, please see [1].

The implementation provided in this package consists of the AMR floating-point speech encoder and a fast
fixed-point speech decoder. The encoder produces output that is compatible with the AMR-NB IF2 format. The
decoder is bit-exact with 3GPP TS 26.104 [2].

The RTP payload format defined in [3] enables the use of AMR in RTP packet-switched networks in
applications like streaming and provides interoperability with existing codec transport formats on non-IP
networks.


Package contents
The Open AMR Initiative package includes the following files.
AMR-NB.pdf      This document.
AMR-NB.lib      Win32 statically linkable library of AMR-NB floating-point encoder/fixed-point decoder
                for Pentium and compatible processors.
encoder.c       Source code for encoder test program.
decoder.c       Source code for decoder test program.
interf_enc.h    Header files needed to compile encoder and decoder test programs.
interf_dec.h
typedef.h
encoder.exe     Encoder test program executable.
decoder.exe     Decoder test program executable.


Data input/output format


Input to the encoder is 16-bit pulse code modulation (PCM) speech data sampled at 8 kHz. The
decoder outputs the reconstructed speech data in the same format. Each 20-ms input speech frame
consists of 160 16-bit PCM words containing 14-bit left-aligned uniform samples. The encoder outputs
compressed speech data in octet-aligned (by means of bit stuffing) AMR-NB Interface Format 2 (IF2), as
defined in 3GPP TS 26.101 [4].
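
As a rough illustration of the input format described above, the following C sketch derives the frame dimensions from the 8 kHz sampling rate and the 20 ms frame length, and clears the low-order bits of each sample. It assumes that "left-aligned" means the 14 significant bits occupy the upper bits of each 16-bit word. All names in the sketch are illustrative and are not part of the package.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SAMPLE_RATE_HZ  = 8000,                               /* sampling rate stated above */
    FRAME_MS        = 20,                                 /* frame duration stated above */
    FRAME_SAMPLES   = SAMPLE_RATE_HZ * FRAME_MS / 1000,   /* 160 samples per frame */
    FRAME_PCM_BYTES = FRAME_SAMPLES * 2                   /* 320 bytes of 16-bit PCM */
};

/* Assumption: with 14-bit left-aligned samples, the two least significant
   bits of each 16-bit word carry no information, so they are cleared. */
static int16_t pcm_left_align_14(int16_t sample)
{
    return (int16_t)(sample & ~0x3);
}

/* Clears the unused low-order bits of every sample in one 20 ms frame. */
static void pcm_frame_left_align(int16_t frame[FRAME_SAMPLES])
{
    assert(frame != 0);
    for (int i = 0; i < FRAME_SAMPLES; i++)
        frame[i] = pcm_left_align_14(frame[i]);
}
```

A caller would typically read FRAME_PCM_BYTES bytes from the speech file into an int16_t buffer of FRAME_SAMPLES elements before passing it to the encoder.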

Frame structure for AMR-NB IF2


Frame Type (4 bits) | AMR-NB Core Speech Frame (size depends on bit rate mode) | Bit Stuffing (n bits)

An AMR-NB IF2 frame starts with a header that contains a “Frame Type” field. The 4-bit “Frame Type” field
identifies the current frame as an AMR-NB codec mode, a comfort noise frame or an empty frame. The AMR-NB
core frame is the compressed speech data or comfort noise data within a 20-ms frame. The size of this data
depends on the current AMR-NB codec mode. The last field contains stuffing bits, which pad the AMR-NB IF2
frame to the next multiple of eight bits. The following table shows the bit allocation for
AMR-NB IF2 frames.

Table 1. Total bits used for an AMR-NB IF2 frame

Frame Type   Bit rate           Frame type   AMR-NB      Padding   Total bytes per
Index        (kbps)             bits         core bits   bits      AMR-NB IF2 frame
0            4.75               4            95          5         13
1            5.15               4            103         5         14
2            5.90               4            118         6         16
3            6.70               4            134         6         18
4            7.40               4            148         0         19
5            7.95               4            159         5         21
6            10.2               4            204         0         26
7            12.2               4            244         0         31
8            AMR SID*           4            39          5         6
9            GSM-EFR SID        4            43          1         6
10           TDMA-EFR SID       4            38          6         6
11           PDC-EFR SID        4            37          7         6
12-14        (for future use)   -            -           -         -
15           No data            4            0           4         1
*Bit rate of comfort noise (FT index 8) is 1.75 kbps when assuming continuous transmission.
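
The relationship between the columns of Table 1 can be sketched in C as follows: the total size is the 4 frame type bits plus the core bits, rounded up to a whole number of octets, and the padding is whatever remains. The function and constant names are illustrative only and do not come from the package.

```c
#include <assert.h>

enum { IF2_FRAME_TYPE_BITS = 4 };   /* width of the Frame Type field */

/* Total octets needed for an IF2 frame carrying core_bits core bits:
   frame type bits plus core bits, rounded up to the next octet. */
static int if2_frame_bytes(int core_bits)
{
    assert(core_bits >= 0);
    return (IF2_FRAME_TYPE_BITS + core_bits + 7) / 8;
}

/* Stuffing bits appended after the core bits to reach the octet boundary. */
static int if2_padding_bits(int core_bits)
{
    return if2_frame_bytes(core_bits) * 8 - IF2_FRAME_TYPE_BITS - core_bits;
}
```

For example, the 4.75 kbps mode has 95 core bits, so if2_frame_bytes(95) gives 13 octets and if2_padding_bits(95) gives 5 stuffing bits, matching the first row of the table.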


Table 2, based on table A.1a in [4], shows an example of how the AMR 6.7-kbps mode is mapped into
AMR IF2. The four least significant bits (LSB) of the first octet (octet 1) contain the Frame Type (= 3) for
the AMR 6.7-kbps mode (see table 1a in [4]). This field is followed by the 134 AMR core frame
speech bits (d(0)..d(133)), which consist of 58 Class A bits and 76 Class B bits as described in table 2 in
[4]. This gives a total of 138 bits, so 6 stuffing bits are needed to reach the next multiple
of 8, which is 144 bits (18 octets).

Table 2. Example mapping of the AMR 6.7-kbps speech coding mode into AMR IF2
(The bits used for bit stuffing are denoted as UB (for "unused bit").)

AMR 6.7: mapping of bits (MSB on the left, LSB on the right)

Octet   Bit 8    Bit 7    Bit 6    Bit 5    Bit 4    Bit 3    Bit 2    Bit 1
1       d(3)     d(2)     d(1)     d(0)     0        0        1        1
2       d(11)    d(10)    d(9)     d(8)     d(7)     d(6)     d(5)     d(4)
3       …        …        …        …        …        …        …        d(12)
18      UB       UB       UB       UB       UB       UB       d(133)   d(132)

In octet 1, bits 4 to 1 hold the Frame Type (= 3), MSB to LSB. In octet 18, bits 8 to 3 are stuffing bits.
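
The bit layout of Table 2 can be expressed as a small packing routine. This is a minimal sketch and not the codec's own implementation. It assumes that the core bits are supplied one per array element and that stuffing bits are written as 0 (the value of the stuffing bits is not stated here). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Packs a frame type and n_bits core bits (d[i] holds bit d(i) in its LSB)
   into out[] using the Table 2 layout. Returns the number of octets written.
   out must have room for (4 + n_bits + 7) / 8 octets. */
static int if2_pack(uint8_t frame_type, const uint8_t *d, int n_bits, uint8_t *out)
{
    int n_bytes = (4 + n_bits + 7) / 8;
    assert(frame_type <= 15 && n_bits >= 0);
    memset(out, 0, (size_t)n_bytes);           /* assumption: stuffing bits are 0 */
    out[0] = (uint8_t)(frame_type & 0x0F);     /* Frame Type in bits 4..1 of octet 1 */
    for (int i = 0; i < n_bits; i++) {
        int pos = i + 4;                       /* d(0) follows the 4 frame type bits */
        if (d[i] & 1)
            out[pos / 8] |= (uint8_t)(1u << (pos % 8));
    }
    return n_bytes;
}

/* Reads the Frame Type from the low nibble of the first octet. */
static uint8_t if2_frame_type(const uint8_t *frame)
{
    return (uint8_t)(frame[0] & 0x0F);
}

/* Returns core bit d(i) of a packed frame. */
static int if2_get_bit(const uint8_t *frame, int i)
{
    int pos = i + 4;
    return (frame[pos / 8] >> (pos % 8)) & 1;
}
```

With this layout, d(0) lands in bit 5 of octet 1 and d(133) lands in bit 2 of octet 18, as in the table.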


Discontinuous Transmission (DTX)


In a typical telephone conversation, voice transmission alternates frequently between the speaking
parties, leaving long pauses of silence. These pauses can be efficiently represented as background noise
and transmitted at a much lower bit rate than speech. The discontinuous transmission mode is used to
encode frames that contain only background noise.

When AMR operates in DTX mode, a voice activity detector (VAD) on the transmission (TX) side
evaluates whether a frame contains any voice data. In the absence of speech, a silence information
descriptor (SID) frame, which contains characteristics describing the background noise, is transmitted. On
the reception (RX) side, a comfort noise generator (CNG) is used to synthesize background noise based
on the SID frame parameters. On the TX side, the encoder then generates “no data” frames until it detects a
change in the input signal (either a change in the background noise or the return of speech).
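
To make the frame kinds involved in DTX concrete, the sketch below groups the Frame Type indices of Table 1 into speech, SID, reserved and "no data" classes, reading the Frame Type from the low four bits of the first octet of a frame. The enum and function names are hypothetical and only illustrate the grouping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    IF2_CLASS_SPEECH,     /* frame types 0-7: one of the eight speech modes */
    IF2_CLASS_SID,        /* frame types 8-11: comfort noise descriptors */
    IF2_CLASS_RESERVED,   /* frame types 12-14: for future use */
    IF2_CLASS_NO_DATA     /* frame type 15: nothing transmitted */
} if2_class;

/* Classifies a frame from its first octet (Frame Type is the low nibble). */
static if2_class if2_classify(uint8_t first_octet)
{
    unsigned ft = first_octet & 0x0Fu;
    if (ft <= 7)  return IF2_CLASS_SPEECH;
    if (ft <= 11) return IF2_CLASS_SID;
    if (ft <= 14) return IF2_CLASS_RESERVED;
    return IF2_CLASS_NO_DATA;
}

/* Adds one count per frame to counts[], indexed by if2_class.
   The caller is expected to zero counts[] beforehand. */
static void if2_tally(const uint8_t *first_octets, size_t n, unsigned counts[4])
{
    assert(first_octets != NULL && counts != NULL);
    for (size_t i = 0; i < n; i++)
        counts[if2_classify(first_octets[i])]++;
}
```

Tallying the classes over an encoded stream gives a quick view of how much of the signal was treated as background noise when DTX is enabled.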


About the Encoder/Decoder Sample Programs


The sample programs encoder.c and decoder.c demonstrate how to initialize and call the encoding and
decoding processes. Input to the encoder and output from the decoder are 16-bit PCM words
containing 14-bit left-aligned uniform speech samples.

Usage of the encoder


encoder (-dtx) mode speech_file bitstream_file

-dtx                  Enables discontinuous transmission mode.
mode                  Specifies encoding at one of the 8 AMR-NB bit rates.
-modefile filename    Can be used instead of the mode argument to specify the encoding mode for
                      each frame from a mode control file. This text file should contain one mode
                      number (0-7) per line.

This table shows the AMR encoding modes and their bit rates.

Mode              0      1      2      3      4      5      6       7
Bit rate (kbps)   4.75   5.15   5.90   6.70   7.40   7.95   10.20   12.20
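
Because each frame covers 20 ms, the number of core speech bits per frame follows directly from the bit rate of the mode (bit rate multiplied by 0.020 s). The short C sketch below works this out for the eight modes. The array and function names are illustrative only.

```c
#include <assert.h>

/* Bit rates of modes 0-7 in bits per second, from the table above. */
static const int amr_mode_bps[8] = {
    4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200
};

/* Core speech bits in one 20 ms frame for the given mode (0-7). */
static int amr_core_bits(int mode)
{
    assert(mode >= 0 && mode <= 7);
    return amr_mode_bps[mode] * 20 / 1000;   /* bits/s * 20 ms */
}
```

For instance, mode 3 (6.70 kbps) gives 6700 * 20 / 1000 = 134 bits per frame, and mode 7 (12.20 kbps) gives 244 bits.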

Usage of the decoder


decoder bitstream_file synth_file
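
The following sketch shows one way a tool could step through encoded data frame by frame, using the Frame Type in the first octet and the total frame sizes from Table 1. It assumes that the bitstream is a plain concatenation of IF2 frames with no file header, which this document does not state explicitly. The names are hypothetical and are not part of the sample programs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Total octets per IF2 frame, indexed by Frame Type (Table 1).
   0 marks the reserved frame types 12-14. */
static const int if2_bytes_by_type[16] = {
    13, 14, 16, 18, 19, 21, 26, 31, 6, 6, 6, 6, 0, 0, 0, 1
};

/* Counts the complete IF2 frames in buf. Returns -1 if a reserved
   frame type is found or the last frame is truncated. */
static long if2_count_frames(const uint8_t *buf, size_t len)
{
    long frames = 0;
    size_t pos = 0;
    assert(buf != NULL || len == 0);
    while (pos < len) {
        int size = if2_bytes_by_type[buf[pos] & 0x0F];
        if (size == 0 || (size_t)size > len - pos)
            return -1;
        pos += (size_t)size;   /* skip to the first octet of the next frame */
        frames++;
    }
    return frames;
}
```

Since every frame represents 20 ms of speech, multiplying the frame count by 20 gives the duration of the encoded material in milliseconds under the same assumption.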

Building the sample programs


To build the speech encoder or decoder sample program, compile the file encoder.c (or decoder.c), then
link the resulting object file with the static AMR-NB library (AMR-NB.lib).


AMR API functions

E_IF_init

Description Allocates and initializes the encoder state memory.

Syntax #include "interf_enc.h"

void * E_IF_init (dtx);

Arguments dtx dtx = 1 to enable discontinuous transmission

Returned value void * Pointer to the state memory used by the encoder

E_IF_encode

Description Encodes one frame of speech data into a byte-aligned IF2 compatible
packed data stream.

Syntax #include "interf_enc.h"

int E_IF_encode (Word16 mode, Word16 *speech, Uword8 *serial);

Arguments mode Encoding mode at one of 8 AMR-NB bit rates (0-7)
speech Input buffer containing one frame of speech samples
serial Output buffer containing compressed data

Returned value Number of bytes written to output buffer

E_IF_exit

Description Frees the encoder state memory.

Syntax #include "interf_enc.h"

void E_IF_exit ();

Arguments None

Returned value None


D_IF_init

Description Allocates and initializes the decoder state memory.

Syntax #include "interf_dec.h"

void * D_IF_init (void);

Arguments None

Returned value void * Pointer to state memory used by the decoder

D_IF_decode

Description Decodes one compressed speech frame.

Syntax #include "interf_dec.h"

void D_IF_decode (Uword8 *bits, Word16 *synth);

Arguments bits Input buffer containing compressed data from the encoder
synth Output buffer containing one frame of decoded speech samples

Returned value None

D_IF_exit

Description Frees the decoder state memory.

Syntax #include "interf_dec.h"

void D_IF_exit ();

Arguments None

Returned value None
