HEX file format

Hex files have much the same properties as Binary files. All bytes of the file are placed one after
the other. No address information or checksums are added. The only difference with the Binary
file format is that each byte is converted to 2 ASCII characters in the range 0..9 and A..F,
representing the 2 hex digits (nibbles). These characters are grouped on lines. The number of
nibble pairs on a line can usually range from 1 to 255, where 16 or 32 pairs are the most
common line lengths. Each line is terminated by a CR (ASCII value $0D) or CRLF pair (ASCII
values $0D,$0A). This termination may come on any byte boundary, so there must always be
an even number of nibble characters on each line. The end of file is indicated by the EOF
character (ASCII value $1A).

No address information or checksums are added to the file. The receiving program, e.g. the
loader, must know at what address to store the Hex data. It is not possible to skip unused
portions of memory because no address information is present. Unused portions must be filled
with dummy values, unless the unused portion is at the end of the file.

Example
576F77212044696420796F7520726561<cr><lf>
6C6C7920676F207468726F7567682061<cr><lf>
6C6C20746869732074726F75626C6520<cr><lf>
746F2072656164207468697320737472<cr><lf>
7696E673<cr><lf>
<eof>

In this example each full line holds 16 bytes (32 nibbles); only the last line is shorter. Each
line is terminated by a CRLF pair. In the actual file <cr><lf> stands for the ASCII codes $0D
and $0A, and <eof> at the end of the file stands for ASCII code $1A.
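
The encoding described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names and the default of 16 bytes per line are my own assumptions, and the decoder simply stops at the first EOF character.

```python
def encode_plain_hex(data: bytes, bytes_per_line: int = 16) -> bytes:
    """Encode raw bytes as plain hex text: CRLF after each line, $1A at the end."""
    lines = []
    for start in range(0, len(data), bytes_per_line):
        chunk = data[start:start + bytes_per_line]
        # Each byte becomes two ASCII characters in the range 0..9 and A..F.
        lines.append(chunk.hex().upper().encode("ascii") + b"\r\n")
    return b"".join(lines) + b"\x1a"


def decode_plain_hex(text: bytes) -> bytes:
    """Decode plain hex text back to raw bytes, stopping at the EOF character."""
    body = text.split(b"\x1a", 1)[0]
    out = bytearray()
    for line in body.splitlines():  # accepts CR, LF and CRLF line endings
        line = line.strip()
        if not line:
            continue
        if len(line) % 2:
            raise ValueError("odd number of nibble characters on a line")
        out += bytes.fromhex(line.decode("ascii"))
    return bytes(out)
```

Note that the decoder has no idea where the data belongs in memory; as explained above, the receiving program has to know the load address by itself.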

Binary file format

There is not much to tell about the Binary file format. It is not really a format at all, because
no formatting takes place whatsoever. All bytes are placed in the file, one after the other. All
bytes are stored as they are, i.e. all 8 bits of each byte are used. Binary files are NOT ASCII
encoded during transfer.

No address information or checksums are added to the file. The receiving program, e.g. the
loader, must know at what address to store the binary data. It is not possible to skip unused
portions of memory because no address information is present. Unused portions must be filled
with dummy values, unless the unused portion is at the end of the file. There is also no End Of
File information available, so the receiving program needs another way to detect the end of
the file.
Because nothing is added to the data, the Binary file format is the most compact format, if you
don't count packed files like ZIP or RAR.
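
The need to fill unused portions with dummy values can be sketched as follows. This is only an illustration under my own assumptions: the function name, the {address: data} input and the fill value $FF are hypothetical choices, the format itself does not prescribe any of them.

```python
def build_binary_image(segments, base, fill=0xFF):
    """Flatten {address: data} segments into one gap-free byte string.

    'base' is the address of the first byte in the file. Gaps between the
    segments are filled with the dummy value 'fill' (an assumption).
    """
    if any(addr < base for addr in segments):
        raise ValueError("segment starts below the base address")
    end = max(addr + len(data) for addr, data in segments.items())
    image = bytearray([fill]) * (end - base)
    for addr, data in segments.items():
        image[addr - base:addr - base + len(data)] = data
    return bytes(image)
```

For example, two segments at $1000 and $1004 with a 2 byte hole in between produce a file in which the hole is present as 2 dummy bytes.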

The disadvantages of the Binary file format are no problem when the files are stored on disk.
Disk systems have their own way of knowing where the end of the file is and have their own
means of ensuring contents integrity with checksums or CRC.

Motorola Sxx records format

Motorola is another major player on the file format market. Motorola defined 3 format types to
adapt to the ever growing hunger for memory. The basic format is the so-called S19 format,
which has a 16 bit address field and can be used for files of up to 64 KB. The S28 format has a
24 bit address field and can handle files of up to 16 MB. The S37 format is the largest of the 3
formats; it has a 32 bit address field and can handle files of up to 4 GB.
All three formats contain only normal ASCII characters. The Motorola file format does not need
special tricks to deal with larger files. The addressing is always linear.

Records

All data lines are called records and each record contains the following fields:

Sxccaaaaddss

Sx Every line starts with the character S, followed by a digit.

cc The byte-count. A 2 digit value (1 byte), counting all the bytes in the record, excluding
cc itself.

aaaa The address field is a 4 to 8 digit (2 to 4 bytes) number representing the first address
of this record.

dd The actual data of this record.

ss The checksum. A 2 digit (1 byte) checksum. cc+aaaa+sum(dd)+ss=$FF

Record Begin

Every record begins with the ASCII character "S", which is followed by a digit x. This digit
represents the record type. No spaces or tabs are allowed in a record. In fact, apart from the
character "S" in the beginning of the record, no other characters than 0..9 and A..F are allowed.
Interpretation of a record should be case insensitive, so it does not matter if you use a..f or
A..F.

The following record numbers can be used:

'S1' = A normal data record with a 16 bit address field.
'S9' = End Of File record for an S1 file.
'S2' = A normal data record with a 24 bit address field.
'S8' = End Of File record for an S2 file.
'S3' = A normal data record with a 32 bit address field.
'S7' = End Of File record for an S3 file.

In the list above you can see why these Motorola formats are called S19, S28, or S37. The
first digit in the name represents the normal data record identifier. The second digit is the End
Of File record identifier.

Byte Count

The byte count cc counts the number of bytes in the current record excluding cc itself. So the
number of address bytes, the number of data bytes and one byte for the checksum are
counted. Because the address and checksum bytes are always present, the byte count is at
least $03 in S19 format ($04 in S28, $05 in S37) and at most $FF. Usually records have 32 data
bytes. For the S19 format this gives a byte count of $23.
It is not recommended to send too many data bytes in a record for that may increase the
transmission time in case of errors. Also avoid sending only a few data bytes per record
because the address overhead will be too heavy in comparison to the payload.
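
The record layout and the byte count described above can be sketched in Python as follows. This is a minimal illustration only: the function name and the returned tuple are my own choices, and the checksum is returned as found without being verified.

```python
# Number of address bytes for each record type digit.
ADDRESS_BYTES = {"1": 2, "2": 3, "3": 4, "7": 4, "8": 3, "9": 2}


def split_srecord(line: str):
    """Split one S-record into (type digit, address, data, checksum)."""
    line = line.strip()
    if len(line) < 4 or line[0] not in "Ss" or line[1] not in ADDRESS_BYTES:
        raise ValueError("not a supported S-record")
    raw = bytes.fromhex(line[2:])  # accepts a..f as well as A..F
    count = raw[0]
    # cc counts everything that follows it: address, data and checksum.
    if count != len(raw) - 1:
        raise ValueError("byte count does not match record length")
    n = ADDRESS_BYTES[line[1]]
    address = int.from_bytes(raw[1:1 + n], "big")  # MSB first
    data = raw[1 + n:-1]
    checksum = raw[-1]
    return line[1], address, data, checksum
```

The number of data bytes is simply the byte count minus the address bytes minus 1 for the checksum.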

Address field

The first data byte of the record is stored in the address specified by the Address field aaaa.
After storing that data byte the address is incremented by 1 to point to the address for the next
data byte of the record. And so on, until all data bytes are stored.
The length of the address field depends on the record type. For types S1 and S9 the address
field is 4 hex digits long (2 bytes). For S2 and S8 records the address field is 6 hex digits, and
for the S3 and S7 records the address field is 8 hex digits long. The address is sent with MSB
first.
The order of addresses in the records of a file is not important. The file may also contain
address gaps, to skip a portion of unused memory.
Due to the large addressing ranges of the S28 and S37 formats no special tricks are necessary
to manage large files.

Data field

This field contains 0 or more data bytes. The actual number of data bytes is indicated by the
byte count in the beginning of the record. The first data byte is stored in the location indicated
by the address in the address field. After that the address is incremented by 1 and the next
data byte is stored in that new location. This continues until all bytes are stored.

Checksum field

This field is a one byte (2 hex digits) 1's complement checksum of the entire record. To create
the checksum make a 1 byte sum from all fields of the record:

check sum = byte count + all address bytes + all data bytes

Then take the 1's complement of this sum to create the final checksum. The 1's complement is
simply inverting all bits. Checking the checksum at the receivers end is done by adding all bytes
together including the checksum itself, discarding all carries, and the result must be $FF.
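
The checksum calculation and the check at the receiver's end can be sketched as follows. The function names are hypothetical, and the sketch assumes the record has already been reduced to its hex digits after the "Sx" prefix.

```python
def srecord_checksum(fields: bytes) -> int:
    """1's complement of the 8-bit sum of byte count, address and data bytes."""
    return ~sum(fields) & 0xFF


def srecord_is_valid(line: str) -> bool:
    """Add all bytes including the checksum; the low 8 bits must be $FF."""
    raw = bytes.fromhex(line.strip()[2:])  # skip the 'S' and the type digit
    return (sum(raw) & 0xFF) == 0xFF
```

The `& 0xFF` takes care of discarding all carries.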

Examples
S113B000576F77212044696420796F7520726561D8
S113B0106C6C7920676F207468726F756768206143
S113B0206C20746861742074726F75626C6520742E
S10FB0306F207265616420746869733FCE
S9030000FC

In the example above you can see a piece of S19 code with normal 16 bit addressing. The first
3 lines have 16 bytes of data each, which can be seen by the byte count, the first byte of each
line. In S19 format, the byte count is always 3 higher than the actual number of data bytes in
the record. The 4th line has only 12 bytes because the program is at its end there.
After the byte count on each line you can see the address where the 1st data byte is to be
stored. The begin address of this file is $B000. Remember that the address order within a file is
not important.
The actual data bytes follow after the address field.
Finally you see the checksum as the last byte of every record. If you like you can add all bytes
of each line together and the 8-bit result should be $FF every time.
Note that the address of the last line is $0000 and that there are no data bytes in this last
record. The address in the End Of File record can be ignored, but could also be used as a start
vector to be loaded into the Program Counter of the target system.

S21400B000576F77212044696420796F7520726561D7
S21400B0106C6C7920676F207468726F756768206142
S21400B0206C20746861742074726F75626C6520742D
S21000B0306F207265616420746869733FCD
S804000000FB

Here you see the same piece of code as in the first example, but this time we use the S28
format. There are only a few differences. As you can see another identifier is used. The byte
count is 1 higher than in S19 format because we have one more address byte to send. And
because of this change in byte count the checksum is different as well.

S3150000B000576F77212044696420796F7520726561D6
S3150000B0106C6C7920676F207468726F756768206141
S3150000B0206C20746861742074726F75626C6520742C
S3110000B0306F207265616420746869733FCC
S70500000000FA

Finally I show the same piece of code again, but this time in the S37 format. The same
differences as with S28 apply here.
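
The differences between the three formats can be made visible with a small record builder. This is a sketch under my own assumptions (function name and arguments are hypothetical); it only shows that nothing but the identifier, the address length, the byte count and the checksum change.

```python
def make_srecord(kind: int, address: int, data: bytes = b"") -> str:
    """Build one S-record; 'kind' is the record type digit (1, 2, 3, 7, 8 or 9)."""
    addr_len = {1: 2, 2: 3, 3: 4, 7: 4, 8: 3, 9: 2}[kind]
    # Byte count = address bytes + data bytes + 1 checksum byte.
    body = bytes([addr_len + len(data) + 1]) + address.to_bytes(addr_len, "big") + data
    checksum = ~sum(body) & 0xFF  # 1's complement of the 8-bit sum
    return "S%d%s%02X" % (kind, body.hex().upper(), checksum)
```

Calling it with the same address and data for kinds 1, 2 and 3 reproduces the 4th line of each of the three examples above.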

Source: http://www.sbprojects.com/knowledge/fileformats/motorola.htm

Intel HEX format

Intel Hex is one of the oldest file formats available and is adopted by many newcomers on the
market. Therefore this file format is almost always supported by various development systems
and tools.
Originally the Intel Hex format was designed for a 16 bit address range (64 KB). Later the file
format was enhanced to accommodate larger files with a 20 bit address range (1 MB) and even
a 32 bit address range (4 GB).

Records

All data lines are called records and each record contains the following fields:

:ccaaaarrddss

: Every line starts with a colon (Hex value $3A).

cc The byte-count. A 2 digit value (1 byte), counting the actual data bytes in the record.

aaaa The address field. A 4 digit (2 byte) number representing the first address to be used
by this record.

rr Record type. A 2 digit value (1 byte) indicating the record type. This is explained later
in detail.

dd The actual data of this record. There can be 0 to 255 data bytes per record (see cc).

ss Checksum. A 2 digit (1 byte) checksum. cc+aaH+aaL+rr+sum(dd)+ss=0

Record Begin

Every record begins with a colon (ASCII value $3A). Records contain only ASCII characters! No
spaces or tabs are allowed in a record. In fact, apart from the 1st colon, no other characters
than 0..9 and A..F are allowed in a record. Interpretation of a record should be case insensitive,
so it does not matter if you use a..f or A..F.

Byte Count

The byte count cc counts the actual data bytes in the current record. Usually records have 32
data bytes, but any number between 0 and 255 is possible.
It is not recommended to send too many data bytes in a record for that may increase the
transmission time in case of errors. Also avoid sending only a few data bytes per record
because the address overhead will be too heavy in comparison to the payload.
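
The record layout and the byte count described above can be sketched in Python as follows. The function name and the returned tuple are my own choices, and the checksum is returned as found without being verified.

```python
def split_intel_hex(line: str):
    """Split one record into (byte count, address, record type, data, checksum)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with a colon")
    raw = bytes.fromhex(line[1:])  # accepts a..f as well as A..F
    if len(raw) < 5:
        raise ValueError("record is too short")
    count = raw[0]
    # Besides the data there are always 5 more bytes: cc, aaaa (2), rr and ss.
    if len(raw) != count + 5:
        raise ValueError("byte count does not match record length")
    address = int.from_bytes(raw[1:3], "big")  # MSB first
    record_type = raw[3]
    data = raw[4:4 + count]
    checksum = raw[-1]
    return count, address, record_type, data, checksum
```

Unlike the Motorola byte count, cc here counts only the data bytes, so the fixed overhead per record is always 5 bytes plus the colon.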

Address field

This is the address where the first data byte of the record should be stored. After storing that
data byte the address is incremented by 1 to point to the address for the next data byte of the
record. And so on, until all data bytes are stored.
The address is represented by a 4 digit hex number (2 bytes), with the MSB first.
The order of addresses in the records of a file is not important. The file may also contain
address gaps to skip a portion of unused memory.
Normally the address aaaa is used to store the first data byte of the record. However, this
only allows us to send files with a maximum size of 64 KB. Therefore Intel designed two extra
record types with which it is possible to pre-set an Extended Segment Address or Upper Linear
Base Address.
In case of an Extended Segment Address this segment is multiplied by 16 and added to the
address field of the record, like in Intel 16 bit processors, to obtain a 20 bit address. This
enables us to send files with a total length of 1 MB. The Extended Segment address is pre-set
by a special record type.
The formula to calculate the target address in case of Extended Segment mode is:

target address = segment*16+aaaa

In case of Upper Linear Base Address mode the upper 16 bits of the 32 bit address are pre-set
by a special record type. In this case the address space is expanded to 32 bits, which gives us a
total range of 4 GB.
The formula to calculate the target address in case of Upper Linear Base Address mode is:

target address = ulba*65536+aaaa
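
Both formulas are easy to try out. The two function names below are hypothetical; the values $2BC0 and $1234 are just sample numbers.

```python
def segment_target(segment: int, aaaa: int) -> int:
    """Extended Segment mode: 20 bit target address."""
    return segment * 16 + aaaa


def linear_target(ulba: int, aaaa: int) -> int:
    """Upper Linear Base Address mode: 32 bit target address."""
    return ulba * 65536 + aaaa


print(hex(segment_target(0x2BC0, 0x1234)))  # 0x2ce34
print(hex(linear_target(0x2BC0, 0x1234)))   # 0x2bc01234
```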

Record Type

There are 6 record types defined:

'00' = Data Record
'01' = End Of File Record
'02' = Extended Segment Address Record
'03' = Start Segment Address Record
'04' = Extended Linear Address Record
'05' = Start Linear Address Record

Type '00' is the main record type. The real data are sent using this record type. The 1st data
byte of the record is stored in the address specified by the address field of the record (plus the
pre-set Segment or Linear Base Address). After that the address of the address field is
incremented and the next data byte is stored on the next address. The address in the address
field is 16 bits, so a rollover from $FFFF to $0000 can occur. This will not produce a carry into
the Segment or Linear Base Address, so addressing space is wrapped back!

Type '01' is the End Of File record. The receiver of the file will stop waiting for new records after
receiving this record. The byte count of this record must always be $00 and the address field
$0000. Because the contents of this record type are fixed, the checksum field is always $FF.

Type '02' records are used to pre-set the Extended Segment Address. With this segment
address it is possible to send files up to 1 MB in length. The Segment address is multiplied by 16
and then added to all subsequent address fields of type '00' records to obtain the effective
address. By default the Extended Segment address will be $0000, until it is specified by a type
'02' record. The address field of a type '02' record must be $0000. The byte count field will be
$02 (the segment address consists of 2 bytes). The data field of the type '02' record contains
the actual Extended Segment address. Because this value is multiplied by 16, bits 3..0 of the
resulting 20 bit segment base address are always 0!

Type '03' records don't contribute to file transfers. They are used to specify the start address
for Intel processors, like the 8086. So if you would upload a file to an Intel based development
board, the starting address of the code can be specified with this record type. This starting
address will be loaded into the CS and IP registers of the processor. For normal file transfers
the type '03' records can be ignored. The byte count of type '03' record is $04, because 4 data
bytes will be sent. The address field remains $0000. The data field of type '03' records contains
4 bytes: the first 2 bytes represent the value to be loaded into CS, the last 2 bytes are the
value to be loaded into IP. Bytes are sent MSB first.

Type '04' records are used to pre-set the Linear Base Address. This 16 bit Linear Base Address,
specified in the data area, is used to obtain a full 32 bit address range when combined with the
address field of type '00' records. With this LBA it is possible to send files of up to 4 GB in
length. The Linear Base Address is used as the upper 16 bits in the 32 bit linear address space.
The lower 16 bits will come from the address field of type '00' records. By default the Linear
Base Address will be $0000, until specified by a type '04' record. The address field of a type
'04' record must be $0000. The byte count field will be $02 (the LBA consists of 2 bytes). The
data field of the type '04' record contains the actual 2 byte Linear Base Address. MSB is sent
first.

Type '05' records don't contribute to file transfers. They are used to specify the start address
for Intel processors, like the 80386. If you would upload a file to an Intel based development
board, the starting address of the code can be specified with a type 05 record. This starting
address will be loaded in the EIP register of the processor. For normal file transfers the type
'05' records can be ignored. The byte count of type '05' records is $04, because 4 data bytes
will be sent. The address field remains $0000. The data field of type '05' records contains the 4
byte linear 32 bit starting address to be loaded into the EIP register of the processor.

Data or Offset field

This field contains 0 or more data bytes. The actual number of data bytes is indicated by the
byte count in the beginning of the record. The data bytes are interpreted as real "payload" data
in type '00' records. In other record types the data represent pre-set address values.
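
How a receiver could act on the different record types can be sketched as a small loader. This is a minimal illustration under my own assumptions, not a complete loader: the function name is hypothetical, memory is modelled as a Python dict, checksums are not verified here, and the start address of type '03' and '05' records is returned as one raw 4 byte value.

```python
def load_intel_hex(lines):
    """Return ({target address: byte value}, start address or None)."""
    memory = {}
    base = 0      # pre-set Segment or Linear Base Address, default $0000
    start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = bytes.fromhex(line[1:])  # skip the colon
        count, offset, rtype = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
        data = raw[4:4 + count]
        if rtype == 0x00:    # data record
            for i, value in enumerate(data):
                # The 16 bit address wraps from $FFFF to $0000 without a carry.
                memory[base + ((offset + i) & 0xFFFF)] = value
        elif rtype == 0x01:  # End Of File record
            break
        elif rtype == 0x02:  # Extended Segment Address
            base = int.from_bytes(data, "big") * 16
        elif rtype == 0x04:  # Extended Linear Address
            base = int.from_bytes(data, "big") * 65536
        elif rtype in (0x03, 0x05):  # start address, no data transfer
            start = int.from_bytes(data, "big")
    return memory, start
```

A dict is used here so that address gaps simply stay absent instead of being filled with dummy values.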

Checksum Field

This field is a one byte (2 hex digits) 2's complement checksum of the entire record. To create
the checksum make a 1 byte sum from all fields of the record:

byte count + both address bytes + record type + all data bytes.

Then take the 2's complement of this sum to create the final checksum. Checking the checksum
at the receiver's end is simply done by adding all bytes together including the checksum itself,
discarding all carries, and the result must be $00.
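
The checksum calculation and the check at the receiver's end can be sketched as follows. The function names are hypothetical.

```python
def intel_hex_checksum(fields: bytes) -> int:
    """2's complement of the 8-bit sum of byte count, address, type and data."""
    return -sum(fields) & 0xFF


def intel_hex_is_valid(line: str) -> bool:
    """Add all bytes including the checksum; the low 8 bits must be $00."""
    raw = bytes.fromhex(line.strip()[1:])  # skip the colon
    return (sum(raw) & 0xFF) == 0
```

For the fixed End Of File record the fields are 00 0000 01, which indeed gives a checksum of $FF.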

Examples
:10C00000576F77212044696420796F7520726561CC
:10C010006C6C7920676F207468726F756768206137
:10C020006C6C20746869732074726F75626C652023
:10C03000746F207265616420746869732073747210
:04C040007696E673FF
:00000001FF

In the example above you can see a piece of code with normal 16 bit addressing. The first 4
lines have 16 bytes of data each, which can be seen by the byte count, the first byte of each
line. The 5th line has only 4 bytes because the program is at its end there.
After the byte count on each line you can see the address where the 1st data byte of that line is
to be stored. The begin address of the file is $C000. Remember that the address order within a
file is not important.
Then the record type is given. In each data record this identifier is $00. Only in the End Of File
record, the last line, is this identifier $01. Note that the address of the last line is $0000
and that there are no data bytes in this last record.
In the data records the data bytes follow the record identifier.
Finally you see the checksum as the last byte of every record. If you like you can add all bytes
of each line together and the 8-bit result should be $00 every time.
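
Records like the ones above can be generated with a small builder. This is a sketch under my own assumptions (function name and arguments are hypothetical), limited to the 16 bit address field of a single record.

```python
def make_intel_hex(address: int, record_type: int, data: bytes = b"") -> str:
    """Build one Intel HEX record from its fields."""
    fields = (bytes([len(data)])            # cc: number of data bytes
              + address.to_bytes(2, "big")  # aaaa: MSB first
              + bytes([record_type])        # rr
              + data)                       # dd
    checksum = -sum(fields) & 0xFF          # 2's complement of the 8-bit sum
    return ":" + fields.hex().upper() + "%02X" % checksum
```

Feeding it the text of the first line, `make_intel_hex(0xC000, 0x00, b"Wow! Did you rea")`, gives back the first record of the example.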

:100000004578616D706C65207769746820616E2039
:0B0010006164647265737320676170A7
:101000004865726520697320612067617020696E90
:1010100020746865206D656D6F727920616C6C6FEE
:06102000636174696F6E4C
:00000001FF

Here you see an example with an address gap. The first part of the program starts at address
$0000. After the second record the address has suddenly changed to $1000. All data at the
addresses in between remain unchanged, or are undefined. It is even possible to fill in these
"blanks" later, without destroying the code presented in this example. As you can see not all
lines have the same number of data bytes, which is no problem.
BTW: In both examples so far no Extended Segment or Linear Base Address was defined. So
these addresses are assumed to be $0000.
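
To make address gaps visible, the data records of a file can be reduced to the address ranges they cover. The sketch below is my own illustration: the function name is hypothetical, and it assumes that no Extended Segment or Linear Base Address records are present, as in the two examples so far.

```python
def address_ranges(lines):
    """Return sorted (first, last) address ranges covered by type '00' records."""
    spans = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = bytes.fromhex(line[1:])  # skip the colon
        count, address, rtype = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
        if rtype == 0x00 and count:
            spans.append((address, address + count - 1))
    spans.sort()  # the order of the records in the file is not important
    merged = []
    for first, last in spans:
        if merged and first <= merged[-1][1] + 1:
            # Adjacent or overlapping: extend the previous range.
            merged[-1] = (merged[-1][0], max(last, merged[-1][1]))
        else:
            merged.append((first, last))
    return merged
```

For the example above this yields two ranges, $0000..$001A and $1000..$1025; everything in between is the gap.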

:020000022BC011
:1012340054686973207061727420697320696E2028
:0D12440061206C6F77207365676D656E74B7
:020000027F007D
:1080000054686973207061727420697320696E20EE
:108010007468652068696768207365676D656E744C
:00000001FF

In this final example I show you a piece of code with Extended Segment records in it. The first
record is one of them. Here the Extended Segment address is set to $2BC0, which means that
$2BC00 is added to all subsequent address fields to obtain the target address of the data. E.g.
the first data byte of the 2nd record is stored at location $2BC00+$1234=$2CE34.
In the 4th record a new Extended Segment address is specified, which means that from then on
all address fields are added to $7F000 to obtain the target address.
The Extended Segment records in this example can be replaced by Linear Base Address
records by changing the identifier '02' into '04' and adapting the corresponding checksums.
When keeping all other values the same this would result in a target address for the first byte of
the second record of $2BC00000+$1234=$2BC01234.
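
The replacement of the identifier and the adaptation of the checksum can be sketched as follows. The function name is hypothetical and the sketch keeps all other values of the record the same, as described above.

```python
def segment_to_linear_record(line: str) -> str:
    """Turn a type '02' record into a type '04' record with a matching checksum."""
    raw = bytearray.fromhex(line.strip()[1:])  # skip the colon
    if raw[3] != 0x02:
        raise ValueError("not an Extended Segment Address record")
    raw[3] = 0x04                      # change the identifier
    raw[-1] = -sum(raw[:-1]) & 0xFF    # recompute the 2's complement checksum
    return ":" + raw.hex().upper()
```

Because the identifier goes up by 2, the checksum goes down by 2: the first record of the example changes from checksum $11 to $0F.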

Intel HEX File Editor

Normally Intel HEX files are generated by assemblers and compilers. However I can imagine
that you may want to edit existing Intel HEX Files, for instance to alter a few strings in a pre-
compiled program. If you're looking for a free Intel HEX File Editor for Windows you can look at
Tarun's page.
Please note that it is not my program, so if you have any questions please ask Tarun.

Source: http://www.sbprojects.com/knowledge/fileformats/intelhex.htm