
Information Representation

Introduction

The logic circuits of digital computers are binary, that is, they can at any one time be in one of only two
states. Thus, the only information that can be represented is that which is adequately represented with only
two values - say, whether a switch is on or off, or a person is male or female.

The two states of a binary device are usually represented by the binary digits (bits) 0 and 1.

Computers deal with groups of binary digits called words. A word is a string of binary digits which the
computer regards as a single unit of data. An n-bit computer is one in which the length of the most
frequently used word is n and the main data paths are parallel n-bit paths. The value of n depends on the
computer's vintage, the purpose for which it was designed, and its cost. It varies from 4 (in some simple
microprocessors) to 64 (e.g. a Pentium) and beyond.

The representation of Data in a Word

For simplicity, consider a pattern of eight bits, say a word containing eight binary digits. Within 8 bits the
patterns which can be distinguished are:

00000000

00000001

00000010

etc.

11111110

11111111

It can be seen that there are $2^8$ (256) different patterns. Therefore in an 8 bit word, 256 different values can
be distinguished. Generally an n-bit word can be in one of $2^n$ different states. What information is being
represented and what value a particular pattern has depends on the context and how the data is coded in
binary. Before discussing how commonly needed information is represented for a computer and how it is
coded, it is useful to explain some notational conventions for binary words.
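
As a quick check of the count above, the following Python sketch lists every pattern of an n-bit word. The function name all_patterns is purely illustrative and does not come from these notes.

```python
from itertools import product

def all_patterns(n):
    """Return every n-bit pattern as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

patterns = all_patterns(8)
print(len(patterns))   # how many distinct 8-bit patterns exist
print(patterns[:3])    # the first few patterns
print(patterns[-2:])   # the last two patterns
```

Changing the argument to all_patterns shows how quickly the number of patterns grows with the word length.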

Notations and Conventions

The n-bit binary word is often written as:

$a_{n-1} a_{n-2} \dots a_0$, where each $a_i$ is either 0 or 1

Thus considering the 8 bit word:

0 1 1 0 1 0 1 0, $a_7$, $a_4$, $a_2$ and $a_0$ are 0 and $a_6$, $a_5$, $a_3$ and $a_1$ are 1

Alternatively, we can describe the individual bits by their 'bit number'. Numbering starts from the right with
bit 0 and continues leftwards until bit 7 is reached.

In arithmetic contexts bit 0 is called the least significant bit (LSB) and bit 7 is called the most significant bit
(MSB).

Considering an 8 bit word, if the patterns in a binary word are used straightforwardly to represent non-
negative integers:
0 0 0 0 0 0 0 0 is 0

and

1 1 1 1 1 1 1 1 is 255

In general: $X_{10} = a_7 2^7 + a_6 2^6 + a_5 2^5 + a_4 2^4 + a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0$
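
The weighted sum above can be sketched in Python as follows. The function name unsigned_value is an illustrative choice, and the pattern is assumed to be given as a string of '0' and '1' characters with bit 0 on the right.

```python
def unsigned_value(pattern):
    """Value of a bit string read as a non-negative integer."""
    n = len(pattern)
    total = 0
    for position, digit in enumerate(pattern):
        bit_number = n - 1 - position          # the leftmost character is bit n-1
        total += int(digit) * 2 ** bit_number  # add a_i * 2^i
    return total

print(unsigned_value("00000000"))  # 0
print(unsigned_value("11111111"))  # 255
print(unsigned_value("01101010"))  # 106
```

Python's built-in int(pattern, 2) performs the same conversion; the loop is written out only to mirror the formula term by term.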

Octal and Hexadecimal

Words written in binary notation are not convenient for human use; we are most adept with decimal
(denary) notation which is also shorter. The most common shorthand ways of writing binary information are
called OCTAL and HEXADECIMAL. The conversions to and from binary (if necessary) are much simpler
than with denary.

Octal, or Base 8

A given word is split into 3-bit groups starting with bit 0, the least significant bit. For example:

01 101 010 Binary word

is

1 5 2 Octal word

To get the denary equivalent:

$X_{10} = 1 \times 8^2 + 5 \times 8^1 + 2 \times 8^0 = 106$

Hexadecimal

Hexadecimal is more popular because 8 bits can be represented by only two 'Hex' digits. The notation is a
little more tricky to master as alphabetic characters are used to represent the extra digits. The table below
shows the equivalent representations.

Denary   Binary   Octal   Hexadecimal

0        00000    0       0
1        00001    1       1
2        00010    2       2
3        00011    3       3
4        00100    4       4
5        00101    5       5
6        00110    6       6
7        00111    7       7
8        01000    10      8
9        01001    11      9
10       01010    12      A
11       01011    13      B
12       01100    14      C
13       01101    15      D
14       01110    16      E
15       01111    17      F
16       10000    20      10
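
The grouping rule behind both shorthands can be sketched in Python. This is a minimal illustration, assuming the word is held as a string of bits; group_digits is a hypothetical helper name. Groups of 3 bits give octal digits and groups of 4 bits give hexadecimal digits.

```python
def group_digits(pattern, group_size):
    """Split a bit string into groups of group_size bits, starting from
    bit 0 (the right-hand end), and write each group as a single digit."""
    digits = []
    end = len(pattern)
    while end > 0:
        start = max(0, end - group_size)   # the leftmost group may be short
        group = pattern[start:end]
        digits.append("0123456789ABCDEF"[int(group, 2)])
        end = start
    return "".join(reversed(digits))

word = "01101010"
print(group_digits(word, 3))  # octal: 152
print(group_digits(word, 4))  # hexadecimal: 6A
```

No arithmetic on the whole word is needed, which is why these conversions are simpler than conversion to denary.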

Bytes

A byte is the universally accepted name for a group of eight bits. A group of four bits is sometimes called a
'nibble' being smaller than a byte!

Non-negative integers
Conceptually, the simplest interpretation of a bit pattern is as a non-negative integer and the numerical
value can be calculated as shown previously. In practice this coding is not used very often but a particular
use is in addressing memory locations in the computer where (in simple terms) the amount of real memory
a computer can address is determined by the number of bits available on the Address Bus. For example, if
the computer has a 16 bit address bus then it can address 65536 ($2^{16}$) memory locations.

The range of values represented by an n-bit word is: 0 to $2^n - 1$

Integers

A range of consecutive integers which run from negative, through zero, and then positive, can be
represented by a binary string. The most common convention for achieving this is called 2's complement.

Remember that in an n-bit string we can represent at most $2^n$ different values. Thus if half of these values
are negative then the maximum positive integer is reduced. Specifically, an 8 bit word interpreted as 2's
complement integers can represent numbers in the range:

-128 to +127

To understand how the 2's complement system works, imagine a 6 digit 'mileometer' on a car which is
driven in reverse. The sequence of patterns on the meter as it passes through zero is:

000003

000002

000001

000000

999999

999998

999997

It is natural to think of 999999 as -1, 999998 as -2, and so on. Since 999999 cannot represent both -1 and +999999, a rule for
interpreting the patterns must be used. If we were to say that:

000000 to 499999 represented positive integers

and

999999 to 500000 represented negative integers

we would have a decimal system exactly analogous to the 2's complement system for representing integer
values with binary digits.

Using an n-bit pattern the range is $-2^{n-1}$ to $2^{n-1} - 1$, and all patterns which begin with a 1 represent negative
values. This leftmost bit is called the sign bit. With 2's complement notation the normal rules of addition
apply. For example:

     2    000002    00000010
+   -3    999997    11111101
=   -1    999999    11111111
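
A minimal Python sketch of the sign-bit rule follows. The names to_signed and to_pattern are illustrative only; patterns are assumed to be strings of bits.

```python
def to_signed(pattern):
    """Interpret a bit string as a 2's complement integer."""
    n = len(pattern)
    value = int(pattern, 2)      # read the pattern as an unsigned integer
    if pattern[0] == "1":        # sign bit set: the value is negative
        value -= 2 ** n
    return value

def to_pattern(value, n=8):
    """n-bit 2's complement pattern for value (anything beyond n bits is dropped)."""
    return format(value % 2 ** n, "0{}b".format(n))

print(to_signed("00000010"))   # 2
print(to_signed("11111101"))   # -3
print(to_pattern(to_signed("00000010") + to_signed("11111101")))  # 11111111
```

The limits of the range can be checked in the same way: to_signed("10000000") gives -128 and to_signed("01111111") gives 127.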

To find the additive inverse of a number, i.e. the number which must be added to it to yield an answer of
zero, the following rule applies: change all 0's to 1's and all 1's to 0's (i.e. invert or complement the
number) and then add 1. If the jargon is unfamiliar, just think of this technique as changing the sign:
making a positive number negative, or a negative number positive.

Thus, the additive inverse of

00110010 (50 in decimal) is 11001110 (-50 in decimal). Explanation:

11001101 (by complementing)

11001110 (by adding 1)
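
The invert-and-add-1 rule can be sketched in Python as below. The function name additive_inverse is an illustrative choice, and the pattern is assumed to be a string of bits.

```python
def additive_inverse(pattern):
    """Change the sign of a 2's complement pattern: invert every bit, then add 1."""
    n = len(pattern)
    inverted = "".join("1" if bit == "0" else "0" for bit in pattern)
    plus_one = (int(inverted, 2) + 1) % 2 ** n   # drop any carry out of the MSB
    return format(plus_one, "0{}b".format(n))

print(additive_inverse("00110010"))  # 11001110
print(additive_inverse("11001110"))  # 00110010
```

Applying the rule twice returns the original pattern, as expected of a change of sign.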

In general, in adding two numbers we treat them as unsigned integers and disregard any carry from the
most significant bit. e.g.

  01010101     85
+ 11001100    -52
= 00100001     33

Consider the addition:

  01000000     64
+ 01000010     66
= 10000010    130 ?
Two positive numbers are added (the result should be 130 in decimal) but the answer appears to be
negative (bit 7 is a 1). What's wrong? The explanation introduces the concept of an overflow. In an 8 bit
pattern 127 is the largest positive integer that can be represented using 2's complement notation - so 130 is
outside the range; an overflow has occurred. In essence an overflow occurs when a result is too 'big' for the
available number of bits.
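
One common way to detect this situation is sketched below in Python. The test used (both operands have the same sign bit but the result has a different one) is a standard check and is not spelled out in these notes; add_8bit is an illustrative name.

```python
def add_8bit(a, b):
    """Add two 8-bit patterns as unsigned integers, discarding the carry
    out of bit 7. Returns (result pattern, overflow flag)."""
    total = (int(a, 2) + int(b, 2)) % 256
    result = format(total, "08b")
    # Overflow: the operands agree in sign but the result's sign differs.
    overflow = a[0] == b[0] and result[0] != a[0]
    return result, overflow

print(add_8bit("01010101", "11001100"))  # 85 + (-52): no overflow
print(add_8bit("01000000", "01000010"))  # 64 + 66: overflow
```

Adding a positive and a negative number can never overflow, because the result always lies between the two operands.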

To represent a larger range of integers more than 8 bits are needed. If more than one word is used to give
a large enough range it is called multiple precision. Most commonly, two words are used in double
precision arithmetic.
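
As a rough sketch of the idea, the Python below adds two 16-bit values that are each held as a pair of 8-bit words, passing the carry from the low words into the high words. The function name and the high/low word layout are assumptions made for illustration, not a description of any particular machine.

```python
def add_double_precision(a_high, a_low, b_high, b_low):
    """Add two 16-bit values, each stored as two 8-bit words (integers 0..255).
    Returns the (high, low) words of the 16-bit sum."""
    low_sum = a_low + b_low
    carry = low_sum >> 8                 # 1 if the low words overflowed 8 bits
    high_sum = a_high + b_high + carry   # the carry propagates into the high words
    return high_sum & 0xFF, low_sum & 0xFF

# 300 is held as (1, 44) and 500 as (1, 244), since 300 = 1*256 + 44 and 500 = 1*256 + 244.
high, low = add_double_precision(1, 44, 1, 244)
print(high, low, high * 256 + low)
```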

Numbers with fractional parts - Floating Point

In the real world, the most useful arithmetic involves numbers that are not all integral, i.e. have a fractional
part. With suitable rules, these too can be represented using a binary pattern.

Conceptually, the simplest method is to choose a place for the binary point. Bits to the left are the whole
number part and bits to the right are the fractional part. Thus, if the point is taken to be between bit 4 and
bit 3 in an 8 bit pattern: 01011101, it is interpreted as:

$0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 + 1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3} + 1 \times 2^{-4} = 5.8125$

in a completely analogous manner to the digits in a decimal fraction. Fixing the position of the point is, in
effect, multiplying integers by a scaling factor (in this case $2^{-4}$), so positive and negative fractional
numbers can be represented in 2's complement form. The number of significant figures depends on the
value of the number being represented.
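
The scaling-factor view can be sketched in Python: read the pattern as a 2's complement integer, then multiply by 2 raised to minus the number of fractional bits. The name fixed_point_value and its parameters are illustrative.

```python
def fixed_point_value(pattern, fraction_bits):
    """Value of a 2's complement bit string with fraction_bits bits
    to the right of the binary point."""
    n = len(pattern)
    value = int(pattern, 2)
    if pattern[0] == "1":                   # negative in 2's complement
        value -= 2 ** n
    return value * 2.0 ** -fraction_bits    # apply the scaling factor

print(fixed_point_value("01011101", 4))  # 5.8125
print(fixed_point_value("11011101", 4))  # -2.1875
```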

Floating point

Floating point representation has two essential groups of bits, one of which, the mantissa, contains a fixed
point binary fraction and the other, the exponent, contains an integer used to calculate the scale factor.
There are many ways in which these two parts can be coded; for example, as a 2's complement fraction and
an integer, respectively.

The number of bits allocated to each group depends on the computer, its word length and use. The number
of bits in the mantissa determines the number of significant figures, the number in the exponent, the range
of numbers that can be represented.

To indicate the principle, consider an 8 bit group in which the 3 leftmost bits are the exponent and the 5
rightmost, the mantissa, both in 2's complement notation. Thus the pattern:

10111011

has '101' (-3) for the exponent and '11011' (-5/32 = -0.15625) for the fraction which combine to give:

$-0.15625 \times 2^{-3}$

Generally, 2's complement notation is not used, because the advantages it has for adding integers are not
available when scaling factors are involved.

An unrealistic (but easy to follow) example of how 8 bits could be used to represent a floating point number
might be:

1             011                1011
sign          exponent           fraction
(negative)    (-1, see below)    0.5 + 0.125 + 0.0625 = 0.6875

Which is the same as saying: -0.6875/2 = -0.34375.

The three digits used for the exponent can have 8 possible values, 0 to 7, but if we subtract 4 from this
number the exponent runs from -4 to 3, so we can represent numbers both larger and smaller than the binary fraction.
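
The 8-bit layout above (1 sign bit, 3 exponent bits with 4 subtracted, 4 fraction bits) can be decoded with a few lines of Python. This is only a sketch of the toy format described here; decode_toy_float is an illustrative name.

```python
def decode_toy_float(pattern):
    """Decode the toy 8-bit format: sign | 3-bit exponent | 4-bit fraction."""
    sign = -1 if pattern[0] == "1" else 1
    exponent = int(pattern[1:4], 2) - 4    # stored value 0..7 becomes -4..3
    fraction = int(pattern[4:], 2) / 16    # four bits after the binary point
    return sign * fraction * 2.0 ** exponent

print(decode_toy_float("10111011"))  # -0.34375
```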

More sensibly, a 32 bit representation might be used where 8 bits could be used for the exponent, 23 bits
for the binary fraction, and the remaining bit is used for the sign.
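
The 1 + 8 + 23 split described here happens to match the widely used IEEE 754 single precision format, although these notes do not name it. Assuming that format, the Python sketch below uses the standard struct module to pull the three fields out of a 32-bit float. Note that IEEE 754 differs in detail from the toy example: the stored exponent has 127 subtracted from it rather than 4, and a leading 1 is implied in front of the fraction bits. The name float32_fields is illustrative.

```python
import struct

def float32_fields(x):
    """Split x, stored as a 32-bit IEEE 754 float, into its
    (sign, exponent, fraction) bit fields, returned as integers."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, stored with 127 added
    fraction = bits & 0x7FFFFF         # 23 bits
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(-0.34375)
print(sign, exponent - 127, format(fraction, "023b"))
```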
