
DATA REPRESENTATION

· How numbers and other data, such as characters, are represented in memory
· Representing data is of great practical importance. Ideally, a single memory representation, or
type, could represent all data, including numbers, characters, and Boolean values.

Computer memory and transfer rates, however, are not infinite and designers must strike a compromise
between the widest possible range of values and conserving memory and maximizing speed.

Also, the data representations must use base 2 as dictated by the underlying binary hardware.

Goals of Computer Data Representation

• Compactness and range

– Describes number of bits used to represent a numeric value

– The more compact a data representation format is, the less expensive it is to implement in computer hardware

– Users and programmers prefer large numeric range

• Accuracy

– Precision of representation increases with number of data bits used

• Ease of manipulation

– Manipulation is executing processor instructions (addition, subtraction, equality comparison)


– Ease refers to machine efficiency

– Processor efficiency depends on its complexity

• Standardization

– Ensures correct and efficient data transmission

– Flexibility to combine hardware from different vendors with minimal communication problems

MEMORY REPRESENTATION

1. CHARACTER MEMORY REPRESENTATION

Scheme: based on the assignment of a numeric code to each of the characters in the character set.

Standard Coding Schemes:
1. ASCII (American Standard Code for Information Interchange)
2. EBCDIC (Extended Binary Coded Decimal Interchange Code)
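As an illustration of numeric character codes, the short Python sketch below (not part of the original notes) shows the ASCII code assigned to a few characters; `ord` returns a character's code and `chr` maps a code back to its character:

```python
# Each character is stored in memory as its numeric ASCII code.
for ch in "Hi!":
    print(ch, "->", ord(ch))   # e.g. 'H' -> 72

# chr() inverts the mapping: numeric code back to character.
print(chr(72))  # H
```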

UNICODE

· Unicode provides a unique number for every character, no matter what the platform, no matter what
the program, no matter what the language. The Unicode Standard has been adopted by such industry
leaders as Apple, HP, IBM, JustSystems, Microsoft, Oracle, SAP, Sun, Sybase, Unisys, and many others.
Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA
3.0, WML, etc., and is the official way to implement ISO/IEC 10646. It is supported in many operating
systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and
the availability of tools supporting it, are among the most significant recent global software technology
trends. In Unicode, a letter maps to something called a code point.

· Every platonic letter in every alphabet is assigned a magic number by the Unicode consortium which is
written like this: U+0639. This magic number is called a code point. The U+ means "Unicode" and the
numbers are hexadecimal. U+0639 is the Arabic letter Ain. The English letter A would be U+0041. You
can find them all using the charmap utility on Windows 2000/XP or visiting the Unicode web site. Say we
have a string:

Hello

which, in Unicode, corresponds to these five code points:

U+0048 U+0065 U+006C U+006C U+006F
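These code points can be checked with a quick Python sketch (an illustration, not part of the standard): format each character's `ord` value in the `U+XXXX` notation used above:

```python
# Print each character of "Hello" as a U+XXXX Unicode code point.
code_points = [f"U+{ord(ch):04X}" for ch in "Hello"]
print(" ".join(code_points))  # U+0048 U+0065 U+006C U+006C U+006F
```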

2. INT (SHORT INTEGERS)

Three schemes:

#1. Sign magnitude

The leftmost bit is used as the sign (1 is negative, 0 is positive) and the remaining bits are used to
store the magnitude.
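The sign-magnitude rule can be sketched in Python as follows (the function name and the 8-bit width are illustrative choices, not from the notes):

```python
def sign_magnitude(value, bits=8):
    """Sign bit (1 = negative, 0 = positive) followed by the magnitude in base two."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")

print(sign_magnitude(5))   # 00000101
print(sign_magnitude(-5))  # 10000101
```

Note that this scheme yields two representations of zero (00000000 and 10000000), one reason hardware designers generally prefer two's complement.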

#2. 2’s Complement

Nonnegative integers are represented as in sign-magnitude notation. The representation of
a negative number −n is obtained by first finding the base-two representation of n, complementing
it, and then adding one to the result.
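The three steps above (write n in base two, complement it, add one) can be traced with a small Python sketch (an illustrative helper, assuming an 8-bit width):

```python
def twos_complement(value, bits=8):
    """Two's-complement encoding following the three steps in the notes."""
    if value >= 0:
        return format(value, f"0{bits}b")
    n = format(-value, f"0{bits}b")                         # base-two form of n
    flipped = "".join("1" if b == "0" else "0" for b in n)  # complement each bit
    return format(int(flipped, 2) + 1, f"0{bits}b")         # add one

print(twos_complement(5))   # 00000101
print(twos_complement(-5))  # 11111011
```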

#3. Excess or Biased Notation

The representation of an integer as a string of n bits is formed by adding the bias 2^(n-1) to the
integer and representing the result in base two.
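For example, with n = 8 the bias is 2^7 = 128. A hypothetical Python sketch of the rule (names and the 8-bit default are illustrative):

```python
def excess_notation(value, bits=8):
    """Add the bias 2**(bits - 1), then write the result in base two."""
    bias = 2 ** (bits - 1)
    return format(value + bias, f"0{bits}b")

print(excess_notation(0))   # 10000000
print(excess_notation(-5))  # 01111011
```

Biased notation is commonly used for the exponent field of floating-point numbers, since it lets unsigned comparison hardware order the values correctly.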
