CS1FC16 INFORMATION REPRESENTATION

Information can come in different formats, such as: text, audio, video. However, the computer can only
process binary strings. It is necessary to be able to convert between internal and external formats.

It is important that both integers and floating points are represented internally as fixed-length binary
strings.

Mathematical background
• Integer division - A / B delivers two results: the quotient q and the remainder r

A = B × q + r, where 0 ≤ r < B

There are 2 operations associated with integer division:

o Div: A div B = q
o Mod: A mod B = r
• Polynomials - functions of powers of x
A polynomial representation is of the form: a_n x^n + a_{n-1} x^{n-1} + … + a_1 x^1 + a_0 x^0 + a_{-1} x^{-1} + … + a_{-m} x^{-m}
where the coefficients a_i are constant.
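The div and mod operations above can be tried out directly. The following is a minimal Python sketch, assuming non-negative A and positive B; the function name div_mod is hypothetical and chosen only for illustration.

```python
def div_mod(a, b):
    """Return (q, r) such that a == b * q + r, assuming a >= 0 and b > 0."""
    q = a // b  # A div B: the quotient
    r = a % b   # A mod B: the remainder
    return q, r


q, r = div_mod(57, 16)
print(q, r)        # 3 9
print(16 * q + r)  # 57, reconstructing A from B, q and r
```

Python's built-in divmod(a, b) returns the same pair in a single call.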

Representing numbers in different bases


Any real number can be represented as a polynomial:

N_R = r_n R^n + r_{n-1} R^{n-1} + … + r_1 R^1 + r_0 R^0 + r_{-1} R^{-1} + … + r_{-m} R^{-m}, with the coefficients r_i in the range [0, R-1]
Decimal numbers have base R = 10 (r in range 0-9)
Binary numbers have base R = 2 (r in range 0-1)
Hexadecimal numbers have base R = 16 (r in range 0-15)

Examples:

12.56_10 = 1×10^1 + 2×10^0 + 5×10^{-1} + 6×10^{-2}

11010_2 = 0×2^0 + 1×2^1 + 0×2^2 + 1×2^3 + 1×2^4

2BC_16 = 12×16^0 + 11×16^1 + 2×16^2
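The three examples above can be checked by evaluating the polynomial digit by digit. This is only a sketch: the function name positional_value and the choice to pass digits as lists of integers are assumptions made for illustration, and Fraction is used so that the fractional part is computed exactly.

```python
from fractions import Fraction


def positional_value(int_digits, frac_digits, base):
    """Evaluate r_n R^n + ... + r_0 R^0 + r_-1 R^-1 + ... + r_-m R^-m exactly."""
    total = Fraction(0)
    # Integer part: the rightmost digit is multiplied by R^0.
    for power, digit in enumerate(reversed(int_digits)):
        total += digit * Fraction(base) ** power
    # Fractional part: the first digit after the point is multiplied by R^-1.
    for k, digit in enumerate(frac_digits, start=1):
        total += digit * Fraction(base) ** -k
    return total


print(positional_value([1, 2], [5, 6], 10))        # 314/25, i.e. 12.56
print(positional_value([1, 1, 0, 1, 0], [], 2))    # 26
print(positional_value([2, 11, 12], [], 16))       # 700
```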

Conversion between number systems


• From binary to decimal: evaluate the polynomial expression:
11010_2 = 0×2^0 + 1×2^1 + 0×2^2 + 1×2^3 + 1×2^4 = 0 + 2 + 0 + 8 + 16 = 26_10
• From hexadecimal to decimal: evaluate the polynomial expression (in the same way as from binary):
2BC_16 = 12×16^0 + 11×16^1 + 2×16^2 = 12 + 176 + 512 = 700_10
• From decimal to binary: divide repeatedly by 2 until the quotient is 0:
57 = 2×28 + 1
28 = 2×14 + 0
14 = 2×7 + 0
7 = 2×3 + 1
3 = 2×1 + 1
1 = 2×0 + 1
The remainders, read from the last to the first, form the binary representation: 57_10 = 111001_2
• From decimal to hexadecimal: divide repeatedly by 16 (analogously to the binary conversion):
57 = 16×3 + 9
3 = 16×0 + 3
The remainders, read from the last to the first, form the hexadecimal representation: 57_10 = 39_16
• From binary to hexadecimal: use the fact that 16 = 2^4
Starting from the least significant bit (the right-hand side of the number), divide the digits
into groups of 4, then convert each group separately into decimal and from decimal to hexadecimal:
0001101001101111 ==> 0001 1010 0110 1111
0001_2 = 1_10 = 1_16
1010_2 = 10_10 = A_16
0110_2 = 6_10 = 6_16
1111_2 = 15_10 = F_16
After converting, reassemble the number: 1A6F_16
• From hexadecimal to binary: convert each digit separately to decimal, then to a 4-digit binary
number:
2EA5_16 = 0010 1110 1010 0101_2
2_16 = 2_10 = 0010_2
E_16 = 14_10 = 1110_2
A_16 = 10_10 = 1010_2
5_16 = 5_10 = 0101_2
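The conversions in this list can be sketched in a few lines of Python. The function names (to_base, binary_to_hex, hex_to_binary) and the DIGITS table are hypothetical helpers introduced for illustration. to_base collects the remainder of each division and reads them in reverse order; the other two work on groups of 4 bits.

```python
DIGITS = "0123456789ABCDEF"


def to_base(value, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, r = divmod(value, base)  # quotient carries on, remainder is a digit
        remainders.append(DIGITS[r])
    # The first remainder is the least significant digit, so reverse the list.
    return "".join(reversed(remainders))


def binary_to_hex(bits):
    """Group bits in fours from the right and map each group to one hex digit."""
    width = -(-len(bits) // 4) * 4      # round the length up to a multiple of 4
    bits = bits.zfill(width)            # pad with leading zeros on the left
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_digits):
    """Expand each hex digit into its 4-bit binary group."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)


print(to_base(57, 2))                       # 111001
print(to_base(57, 16))                      # 39
print(binary_to_hex("0001101001101111"))    # 1A6F
print(hex_to_binary("2EA5"))                # 0010 1110 1010 0101
```

For the opposite direction (binary or hexadecimal to decimal), Python's built-in int("11010", 2) and int("2BC", 16) evaluate the polynomial expression directly.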

Internal representation of characters

The most commonly used system is the ASCII code - each character is allocated a unique 7-bit code. ASCII
includes 128 characters: letters, digits, punctuation signs and the space, as well as non-printable
control characters such as tab and new line.

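Another common system is Unicode, which aims to cover all written languages. It was originally designed
with a fixed-length 16-bit code for each character; current versions define code points up to 10FFFF_16,
which are stored using encodings such as UTF-8, UTF-16 or UTF-32.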
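As a small illustration of the 7-bit ASCII codes described above, the sketch below prints the code of a few characters as bit strings. The helper name ascii_bits is hypothetical; ord is Python's built-in function returning a character's code.

```python
def ascii_bits(ch):
    """Return the 7-bit ASCII code of a single character as a bit string."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


print(ascii_bits("A"))   # 1000001 (code 65)
print(ascii_bits(" "))   # 0100000 (code 32, space)
print(ascii_bits("\n"))  # 0001010 (code 10, new line)
```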

Internal representation of integers


Numbers are originally input as a sequence of ASCII characters. For a machine to process them, they
need to be converted from an ASCII string to an internal binary representation. The representation is
fixed length and depends on machine architecture. 32-bit integers are most common, but 64-bit are also
available.
The fixed length determines how many different integers can be represented.
Example: using 8 bits - the number of different combinations is 2^8 = 256.
Integers can be represented as signed or unsigned.
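The conversion from a string of ASCII digits to a fixed-length binary value can be sketched as follows. This is an illustrative assumption about how such a conversion might look, not a description of any particular machine; the name parse_unsigned and the default width of 8 bits are hypothetical.

```python
def parse_unsigned(text, bits=8):
    """Convert a string of ASCII digits to a fixed-length unsigned bit string."""
    value = 0
    for ch in text:
        digit = ord(ch) - ord("0")     # ASCII code of the character minus code of '0'
        if not 0 <= digit <= 9:
            raise ValueError("not a decimal digit: " + repr(ch))
        value = value * 10 + digit     # shift previous digits one decimal place
    if value >= 2 ** bits:
        raise OverflowError("value does not fit in {} bits".format(bits))
    return format(value, "0{}b".format(bits))


print(parse_unsigned("57"))    # 00111001
print(parse_unsigned("255"))   # 11111111, the largest 8-bit unsigned value
```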

Unsigned – can be converted directly into binary and processed without any special care. Addition of the
sign complicates the issue, as there is no obvious way of representing it in binary.

Signed integers are usually represented using 2's complement. The most significant bit is always the
sign bit – 1 for negative numbers, 0 for positive.
To create a negative number using 2's complement:
Given a positive number N, its negative is created as 2^n - N, where n is the number of bits.
Example: using 8-bit integers
N = 5_10 = 00000101_2
To represent -5: 2^8 - 5 = 256 - 5 = 251_10 = 11111011_2
Another way: take the positive integer, invert all bits, add 1.
Example: N = 5_10 = 00000101_2
Inverting all bits in 00000101 produces 11111010.
After adding 1 the result is 11111011.
Simplification: given a positive integer, to negate it, copy all bits starting from the right, up
to and including the first 1, and invert the rest. For 00000101 the rightmost bit is already 1, so
copy it and invert the remaining seven bits: 11111011.
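The three ways of negating a number described above should all give the same bit pattern. The sketch below implements each of them for illustration; the function names are hypothetical and the default width of 8 bits follows the examples.

```python
def negate_subtract(n, bits=8):
    """Negate by computing 2^bits - n."""
    return format((2 ** bits - n) % 2 ** bits, f"0{bits}b")


def negate_invert_add(n, bits=8):
    """Negate by inverting all bits and adding 1."""
    mask = 2 ** bits - 1
    return format(((n ^ mask) + 1) & mask, f"0{bits}b")


def negate_copy_invert(n, bits=8):
    """Negate by copying bits from the right up to the first 1, inverting the rest."""
    s = format(n, f"0{bits}b")
    i = s.rfind("1")           # position of the rightmost 1
    if i == -1:
        return s               # zero is its own negative
    inverted = "".join("1" if b == "0" else "0" for b in s[:i])
    return inverted + s[i:]


print(negate_subtract(5))      # 11111011
print(negate_invert_add(5))    # 11111011
print(negate_copy_invert(5))   # 11111011
```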
The range of numbers that can be represented using n bits is [-2^{n-1}, 2^{n-1} - 1].
Example: Using 8-bit integers, it is possible to represent numbers between –128 and 127. The difference
between the number of positive and negative numbers accounts for the presence of 0.

Using fixed-length integers creates the problem of overflow.


Example: 8-bit signed integers cover the range [-128, 127]. When adding two integers whose sum is
bigger than 127, the result will be incorrect:
74 + 85 = 159
01001010 + 01010101 = 10011111 - adding two positive numbers gives a negative result (10011111 is -97
in 2's complement), so overflow has occurred. An analogous problem occurs when adding two integers
whose sum is smaller than -128.

Overflow can occur only when adding two positive or two negative values. It can be detected by checking
the sign bit of the answer: if it differs from the sign bit shared by both operands, overflow has occurred.
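The sign-bit check can be sketched as follows for 8-bit values. The function name add_8bit_signed is hypothetical, and the operands are passed as bit strings only to match the notation of the example.

```python
def add_8bit_signed(a_bits, b_bits):
    """Add two 8-bit 2's complement bit strings; return (result bits, overflow?)."""
    a = int(a_bits, 2)
    b = int(b_bits, 2)
    result = (a + b) & 0xFF              # keep only the lowest 8 bits
    sign_a, sign_b, sign_r = a >> 7, b >> 7, result >> 7
    # Overflow: both operands have the same sign, but the result's sign differs.
    overflow = sign_a == sign_b and sign_r != sign_a
    return format(result, "08b"), overflow


print(add_8bit_signed("01001010", "01010101"))  # ('10011111', True): 74 + 85
print(add_8bit_signed("00000101", "11111011"))  # ('00000000', False): 5 + (-5)
```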

Multiplication - multiplying two 8-bit strings produces a product of up to 8+8=16 bits in length, so
there is potential for overflow if the result is longer than the length allocated by the machine.

Carry - another problem of fixed-length integers. Addition of two fixed-length binary strings can
generate a carry bit out of the most significant bit position, making the result one bit longer than the operands.
00101100 + 11100111 = 100010011
Adding two negative integers always generates a carry: 11110110 + 11100111 = 111011101

Most processors have a status register which includes 4 bits:


C = 1 if carry occurred
V = 1 if overflow occurred
N = 1 if the result was negative
Z = 1 if the result was 0
They are called condition code bits.
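To make the four condition code bits concrete, the sketch below computes them for an addition of two fixed-length values. It is an illustration only, not a model of any specific processor; the function name add_with_flags and the dictionary layout are assumptions.

```python
def add_with_flags(a, b, bits=8):
    """Add two unsigned bit patterns and return (result, condition code bits)."""
    mask = (1 << bits) - 1        # e.g. 0xFF for 8 bits
    sign = 1 << (bits - 1)        # position of the sign bit
    total = a + b
    result = total & mask         # the fixed-length result
    flags = {
        "C": int(total > mask),                                # carry out of the top bit
        "V": int(((a ^ result) & (b ^ result) & sign) != 0),   # result sign differs from both operands
        "N": int((result & sign) != 0),                        # sign bit of the result
        "Z": int(result == 0),                                 # result is zero
    }
    return result, flags


print(add_with_flags(0b00101100, 0b11100111))  # carry, no overflow
print(add_with_flags(0b01001010, 0b01010101))  # overflow, negative result
print(add_with_flags(0b00000101, 0b11111011))  # 5 + (-5): carry and zero
```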

Internal representation of real numbers


A real number can be written in exponential form.
Example: 25.62 = 2.562 × 10^1
In general, any real number A can be written as:
A = m × R^e, where m is the mantissa, R the base and e the exponent
The mantissa contains digits operated on directly. The exponent gives the magnitude. In computer
science, this is called floating point.

IEEE floating point format (IEEE 754) - the standard implemented in virtually all modern processors


A real number A is represented as A = m × 2^e, where
m = 0 when A = 0
m = 1.<fraction> when A ≠ 0
Example: 13.02 = 1.6275 × 2^3
There are 2 formats of the representation: single precision floating point and double precision floating
point. Single precision uses 32 bits – 1 for sign, 8 for the exponent, 23 for mantissa. Double precision
uses 64 bits – 1 for sign, 11 for the exponent, 52 for mantissa.
The sign bit is 1 for negative numbers, 0 for positive.
The exponent can be negative, positive or 0. It is represented using excess notation.
Example (for single precision, where the 8-bit exponent field stores e+127): 13.02 = 1.6275 × 2^3 ==> e = 3
3 + 127 = 130 = 10000010_2 - this is the number that goes in the exponent field.
Mantissa is the <fraction> part of 1.<fraction> - the 1 is dropped.
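The three fields of a single precision number can be inspected with Python's standard struct module, which packs a value into the 32-bit format. The helper name float32_fields is hypothetical. Note that Python's own float type is double precision, so the value is rounded to single precision when packed.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent field, fraction field) of x in single precision."""
    # Pack as a big-endian 32-bit float, then reinterpret the 4 bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, stored as e + 127
    fraction = raw & 0x7FFFFF        # 23 bits, the part after "1."
    return sign, exponent, fraction


sign, exponent, fraction = float32_fields(13.02)
print(sign)                        # 0
print(format(exponent, "08b"))     # 10000010, i.e. 130 = 3 + 127
print(format(fraction, "023b"))    # the 23 stored mantissa bits

# Rebuild the value from the fields: (1 + fraction / 2^23) * 2^(exponent - 127)
print((1 + fraction / 2 ** 23) * 2 ** (exponent - 127))
```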
To convert a fraction to binary, keep multiplying the fraction by 2 and record the whole part at each
step. Whenever the product is 1.<fraction>, record 1 and drop it before multiplying again; stop when the
fraction part reaches 0:
0.1875 × 2 = 0.375 - record 0
0.375 × 2 = 0.75 - record 0
0.75 × 2 = 1.5 - record 1, continue with 0.5
0.5 × 2 = 1.0 - record 1, the fraction part is 0, stop
0.1875_10 = 0.0011_2
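The repeated multiplication by 2 can be sketched as a short loop. The function name fraction_to_binary and the max_bits limit are assumptions added for illustration; Fraction is used so that the decimal input is held exactly rather than as an already-rounded binary float.

```python
from fractions import Fraction


def fraction_to_binary(fraction, max_bits=23):
    """Return the binary digits after the point for 0 <= fraction < 1."""
    bits = []
    while fraction != 0 and len(bits) < max_bits:
        fraction *= 2
        if fraction >= 1:
            bits.append("1")   # record the whole part...
            fraction -= 1      # ...and drop it before the next multiplication
        else:
            bits.append("0")
    return "".join(bits)


print(fraction_to_binary(Fraction("0.1875")))  # 0011
```

The max_bits limit matters because many decimal fractions never reach 0 and would otherwise produce digits indefinitely.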

Rounding error
Most conversions will not be exact as the number must fit into the set number of bits (23 for single
precision).
Example: 0.60125_10 = 0.100110011110101110000101000111..._2 - this cannot be represented precisely
The number can be either truncated after 23 bits or rounded using the 24th bit. Rounding introduces less
error than truncation.
To round a 24-bit number into 23 bits:
Remove the least significant bit.
If that bit was 1, add 1 to the remaining 23-bit number.
Rounding introduces a small error (bounded by 2^{-24} for rounding from 24 to 23 bits).
To truncate the number, simply cut it at the required number of bits and discard the rest.
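Both procedures can be sketched on bit strings, using the first 24 bits of the 0.60125 example. The function names truncate_bits and round_bits are hypothetical. The sketch does not handle the case where rounding carries out of the top bit (a string of all 1s), which a real implementation would need to deal with.

```python
def truncate_bits(bits, keep=23):
    """Cut the bit string after `keep` bits and discard the rest."""
    return bits[:keep]


def round_bits(bits, keep=23):
    """Round to `keep` bits using the next bit: add 1 if that bit is 1."""
    kept = int(bits[:keep], 2)
    if bits[keep] == "1":
        kept += 1
    return format(kept, "0{}b".format(keep))


first_24 = "100110011110101110000101"   # first 24 fraction bits of 0.60125
print(truncate_bits(first_24))          # 10011001111010111000010
print(round_bits(first_24))             # 10011001111010111000011
```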

When adding numbers represented as floating points, the order of the additions can be important. If a
very large number is added to a very small one, some or all of the bits of the small one may be lost.
The general rule is to add small values together first, to let them accumulate.
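The effect of the order of additions can be demonstrated in a few lines. Python floats are double precision (52 mantissa bits), so this sketch uses 2^53 as the large value; the same effect appears in single precision at much smaller magnitudes. The variable names are illustrative.

```python
big = 2.0 ** 53        # above this value, doubles can no longer represent every integer
small = [1.0] * 10     # ten small values

# Large value first: each 1.0 is lost when added to the large value.
large_first = big
for s in small:
    large_first += s

# Small values first: they accumulate to 10.0 before meeting the large value.
small_first = sum(small) + big

print(large_first == big)          # True: all ten additions were lost
print(small_first - large_first)   # 10.0
```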
