Variables
Choose an appropriate data type for each variable in a program, based on knowledge of the
features required of the variable and the internal representation of the available data types.
"For larger projects, strongly-typed languages are dramatically more productive, because
many errors are caught at compile time rather than at run time (or worse, after your project
has shipped)." (J Strout)
A variable holds data that can change in value during the lifetime of the variable. The C
language associates a data type with each variable. Each data type occupies a compiler-
defined number of bytes.
Associating a data type with each variable imposes restrictions on the range of values that the
variable may hold.
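Typed languages, such as C, subdivide the universe of data values into sets of distinct type. A
data type defines how a value is represented in memory and, through that representation, the
range of values that a variable can hold. The fundamental C data types are: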
int
char
float
double
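An int typically occupies one word and can store an integer. On a 32-bit machine, an int
occupies 4 bytes:
int
1 Byte 1 Byte 1 Byte 1 Byte
A char occupies 1 byte and can store a single character or a small integer:
char
1 Byte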
A float typically occupies 4 bytes and can store a single-precision floating-point number:
float
1 Byte 1 Byte 1 Byte 1 Byte
A double typically occupies 8 bytes and can store a double-precision floating-point number:
double
1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte
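Because the sizes above are typical rather than guaranteed, it can help to ask the compiler directly. The following is a minimal sketch: the function name print_type_sizes is chosen only for this illustration, and the values printed depend on the platform.

```c
#include <stdio.h>

/* Print the number of bytes this compiler uses for each fundamental type.
   print_type_sizes is a hypothetical helper name, not part of the notes. */
void print_type_sizes(void)
{
    printf("char   : %zu\n", sizeof(char));    /* always 1 by definition */
    printf("int    : %zu\n", sizeof(int));     /* often 4 */
    printf("float  : %zu\n", sizeof(float));   /* often 4 */
    printf("double : %zu\n", sizeof(double));  /* often 8 */
}
```

Calling print_type_sizes() from main on a typical 32-bit or 64-bit desktop platform is likely to report 1, 4, 4 and 8, but only the size of char is fixed by the language.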
Qualifiers
We can qualify the int data type so that it contains a minimum number of bits. The
qualifiers are:
short
long
long long
A short contains at least 16 bits:
short
1 Byte 1 Byte
A long contains at least 32 bits:
long
1 Byte 1 Byte 1 Byte 1 Byte
A long long contains at least 64 bits:
long long
1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte
We can also qualify the double data type. A long double typically occupies at least 64
bits:
long double
1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte
Standard C does not specify that a long double must occupy a minimum number of bits,
only that it occupies no fewer bits than a double.
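The minimum widths described above can be checked on any given platform. The sketch below assumes a C99 compiler; the macro BITS_OF and the function widths_meet_minimums are hypothetical names introduced for this example, while CHAR_BIT comes from the standard header limits.h.

```c
#include <limits.h>

/* Number of bits occupied by an object of the given type. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Returns 1 if this platform meets the minimum sizes listed in the notes,
   0 otherwise. */
int widths_meet_minimums(void)
{
    return BITS_OF(short) >= 16
        && BITS_OF(long) >= 32
        && BITS_OF(long long) >= 64
        && sizeof(long double) >= sizeof(double);
}
```

A conforming compiler should make this function return 1; the actual sizes may well exceed the minimums.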
Non-Negative Values
To convert a non-negative integer into its binary equivalent, we treat the lowest-order bit as
our first target and start with the value itself. We then repeat the following steps until every
bit has been filled:
divide the value by 2,
put the remainder into the target bit,
make the next higher-order bit the new target, and
carry the quotient forward as the new value.
For example, converting 92 into an 8-bit pattern (read the Value row from right to left):
Bit #        7   6   5   4   3   2   1   0
Value        0   1   2   5  11  23  46  92
Bit Values   0   1   0   1   1   1   0   0
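(The right-to-left bit numbering is for illustrative purposes only. Endianness properly
describes the order of the bytes within a multi-byte value: Intel x86 machines are little-endian
and store the lowest-order byte first, while Motorola 68000 machines are big-endian and store
the highest-order byte first.)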
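The divide-by-2 steps above can be sketched in C as follows. This is one possible illustration, not the only way to do it: the function name to_binary8 and the choice of an 8-bit, highest-order-bit-first string are assumptions made for this example.

```c
/* Fill bits[0..7] with the 8-bit pattern of value, highest-order bit first,
   and terminate the string. Only the lowest 8 bits of value are used. */
void to_binary8(unsigned value, char bits[9])
{
    for (int target = 0; target < 8; target++) {
        /* the remainder becomes bit number `target`, counted from the right */
        bits[7 - target] = (char)('0' + value % 2);
        /* the quotient is carried forward as the new value */
        value /= 2;
    }
    bits[8] = '\0';
}
```

With a buffer declared as char bits[9], the call to_binary8(92, bits) leaves the string 01011100 in bits, matching the table above.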
To convert a binary number into its decimal equivalent, we multiply each bit value by its
corresponding power of 2 and add the products together.
Bit #        7    6    5    4    3    2    1    0
Power of 2   7    6    5    4    3    2    1    0
Bit Values   0    1    0    1    1    1    0    0
Multiplier   128  64   32   16   8    4    2    1
Byte Value   0*128 + 1*64 + 0*32 + 1*16 + 1*8 + 1*4 + 0*2 + 0*1 = 92
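The multiply-and-add conversion above can be sketched as a short C function. The name from_binary and the string input format (characters '0' and '1', highest-order bit first) are assumptions made for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Convert a string of '0' and '1' characters, highest-order bit first,
   into the value it represents. */
unsigned from_binary(const char *bits)
{
    unsigned value = 0;
    unsigned multiplier = 1;              /* 2 to the power 0 */

    /* walk from the lowest-order bit (rightmost character) to the highest */
    for (size_t i = strlen(bits); i > 0; i--) {
        if (bits[i - 1] == '1')
            value += multiplier;          /* bit value 1 times its power of 2 */
        multiplier *= 2;                  /* next higher power of 2 */
    }
    return value;
}
```

For the bit pattern in the table, from_binary("01011100") returns 92.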
In-class practice: try exercise 1 on handout 3.
Negative Values
Computers store negative integers using encoding schemes. The schemes available include:
two's complement
one's complement
sign magnitude
All of these schemes represent non-negative integers identically. The most popular scheme is
two's complement. With two's complement notation, separate subtraction circuits in the ALU
are unnecessary and there is only one representation of 0.
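To make the two's complement idea concrete, the sketch below negates an 8-bit value by flipping every bit and adding 1, which is the usual way of describing the scheme. The function name negate8 and the 8-bit width are assumptions chosen for this illustration; uint8_t comes from the standard header stdint.h.

```c
#include <stdint.h>

/* Two's complement negation of an 8-bit pattern: flip every bit, then add 1.
   The cast keeps only the lowest 8 bits of the result. */
uint8_t negate8(uint8_t value)
{
    return (uint8_t)(~value + 1u);
}
```

Here negate8(0x5C) yields 0xA4, the 8-bit pattern for -92, and negate8(0) yields 0, so zero has a single representation. Subtraction then reduces to addition: adding negate8(92) to 100 and keeping the lowest 8 bits gives 8.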
Unsigned ints
We can use all of the bits available to store the value of a variable if we know that the variable
will always contain only non-negative integer values. In such cases, we add the
qualifier unsigned:
unsigned short
unsigned int
unsigned long
unsigned long long
The range that an unsigned data type can hold depends on the number of bits that the host
machine's compiler allocates to it. With n bits, the range is [0, 2^n - 1].
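As a small worked illustration of how the number of bits determines the unsigned range, the sketch below computes 2^n - 1 for a given width. The function name max_unsigned is hypothetical; CHAR_BIT and ULLONG_MAX come from the standard header limits.h, which also supplies the actual limits for each type (USHRT_MAX, UINT_MAX, ULONG_MAX).

```c
#include <limits.h>

/* Largest value that an unsigned type with the given number of bits can
   hold, that is, 2^bits - 1. */
unsigned long long max_unsigned(unsigned bits)
{
    /* shifting by the full width of the type is undefined, so handle it */
    if (bits >= sizeof(unsigned long long) * CHAR_BIT)
        return ULLONG_MAX;
    return (1ULL << bits) - 1;
}
```

For example, max_unsigned(8) is 255 and max_unsigned(16) is 65535. On platforms without padding bits, max_unsigned(sizeof(unsigned int) * CHAR_BIT) equals UINT_MAX.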
Computers store characters as integer codes, and an encoding sequence maps each symbol to
its code. To accommodate the various cultures throughout the world, we need a very broad
symbol repertoire. Over 60 encoding sequences have already been defined.
The two popular encoding sequences are ASCII and EBCDIC. Both use a single byte. ASCII
originates in paper tape and Morse code. ASCII represents the letter A by the bit pattern
01000001 (base 2), or 0x41, or 65. EBCDIC originates in punched cards. EBCDIC represents
the letter A by the bit pattern 11000001 (base 2), or 0xC1, or 193.
Note the different values for A under ASCII and EBCDIC.
The EBCDIC symbol order differs from the ASCII symbol order. In ASCII, the digits precede
the letters, while in EBCDIC, the letters precede the digits. If we sort symbolic information
that contains both digits and letters, the order that we obtain under ASCII differs from the
order that we obtain under EBCDIC.
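A program can inspect which encoding sequence its platform uses by comparing character constants with their expected codes. The sketch below is illustrative only; the function names uses_ascii_codes and digits_precede_letters are hypothetical.

```c
/* Returns 1 if the platform stores 'A' and '0' at their ASCII codes. */
int uses_ascii_codes(void)
{
    return 'A' == 65 && '0' == 48;
}

/* Returns 1 if the digits sort before the upper-case letters, as in ASCII;
   on an EBCDIC platform this would be expected to return 0. */
int digits_precede_letters(void)
{
    return '9' < 'A';
}
```

On an ASCII platform both functions return 1. Code that relies on either result is not portable to EBCDIC platforms.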
The range of values that a char can store varies from platform to platform, because compilers
do not treat the char data type consistently. Some treat it as signed, while others treat it as
unsigned. For example, phobos treats char as unsigned, while .net treats char as signed.
The ASCII sequence uses the common range [0,127], which fits in either form, so we can
expect the same results for ASCII characters regardless of platform.
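Whether a particular compiler treats char as signed or unsigned can be read from the standard header limits.h. The following is a minimal sketch; the function name char_is_signed is an assumption made for this example.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this platform, 0 if it is unsigned.
   CHAR_MIN is negative for a signed char and 0 for an unsigned char. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Whatever the result, CHAR_MAX is at least 127, so every ASCII code fits in a char.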
Computers store floating-point data using two separate components: an exponent and a
mantissa. The models in use vary across implementations. C leaves the model open to
definition. The most popular model is the IEEE (I-triple-E or Institute of Electrical and
Electronics Engineers) Standard 754 for Binary Floating-Point Arithmetic.
Under IEEE 754, the model for a float occupies 32 bits, has one sign bit, an 8-bit exponent
and a 23-bit mantissa:
float
1 Byte 1 Byte 1 Byte 1 Byte
s exponent mantissa
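We calculate the value using the formula
$$x = \text{sign} \times 2^{\text{exponent}} \times \left\{ 1 + f_1 2^{-1} + f_2 2^{-2} + \dots + f_{23} 2^{-23} \right\}$$
where sign is +1 or -1, f_1 to f_23 are the mantissa bits from highest to lowest order, and
exponent is the stored 8-bit value minus a bias of 127. The formula applies to normalized
values.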
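The formula for a float can be checked by pulling the three fields out of a stored value and recomputing it. The sketch below assumes that float is the 32-bit IEEE 754 format and that the value is normalized (the stored exponent is neither 0 nor 255); the function name decode_float is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Recompute the value of a normalized float from its sign, exponent and
   mantissa fields. Assumes float is a 32-bit IEEE 754 single. */
double decode_float(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);          /* copy the raw bit pattern */

    int sign = (bits >> 31) ? -1 : 1;            /* 1 sign bit */
    int exponent = (int)((bits >> 23) & 0xFF) - 127;  /* 8 bits, bias 127 */
    uint32_t mantissa = bits & 0x7FFFFF;         /* 23 mantissa bits */

    /* 1 + f1 * 2^-1 + f2 * 2^-2 + ... + f23 * 2^-23 */
    double significand = 1.0;
    double weight = 0.5;
    for (int i = 22; i >= 0; i--) {
        if ((mantissa >> i) & 1u)
            significand += weight;
        weight /= 2.0;
    }

    /* 2 raised to the exponent, built by repeated doubling or halving */
    double scale = 1.0;
    for (int e = exponent; e > 0; e--) scale *= 2.0;
    for (int e = exponent; e < 0; e++) scale /= 2.0;

    return sign * scale * significand;
}
```

For instance, 6.5f is stored with sign +1, exponent 2 and mantissa bits 101 followed by zeros, so the function computes 4 * (1 + 0.5 + 0.125) = 6.5.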
Under the IEEE standard, the model for a double occupies 64 bits, has one sign bit, an 11-bit
exponent and a 52-bit mantissa:
double
1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte 1 Byte
s exponent mantissa
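We calculate the value using the formula
$$x = \text{sign} \times 2^{\text{exponent}} \times \left\{ 1 + f_1 2^{-1} + f_2 2^{-2} + \dots + f_{52} 2^{-52} \right\}$$
where sign is +1 or -1, f_1 to f_52 are the mantissa bits from highest to lowest order, and
exponent is the stored 11-bit value minus a bias of 1023. The formula applies to normalized
values.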
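The layout of a double can be inspected by separating its three fields. The sketch below assumes that double is the 64-bit IEEE 754 format; the struct double_fields and the function split_double are hypothetical names introduced for this example.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a 64-bit IEEE 754 double, as stored. */
struct double_fields {
    unsigned sign;       /* 1 bit: 0 for positive, 1 for negative */
    unsigned exponent;   /* 11 bits, still including the bias of 1023 */
    uint64_t mantissa;   /* 52 bits */
};

/* Split a double into its stored sign, exponent and mantissa fields. */
struct double_fields split_double(double value)
{
    uint64_t bits;
    struct double_fields fields;

    memcpy(&bits, &value, sizeof bits);          /* copy the raw bit pattern */
    fields.sign = (unsigned)(bits >> 63);
    fields.exponent = (unsigned)((bits >> 52) & 0x7FF);
    fields.mantissa = bits & ((1ULL << 52) - 1);
    return fields;
}
```

For 1.0 the fields are sign 0, stored exponent 1023 and mantissa 0. For -2.5, which is -1.25 * 2^1, they are sign 1, stored exponent 1024 and a mantissa with only f_2 set.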
DECLARATIONS
We allocate memory for a variable by declaring it: we specify its data type, its identifier, and
optionally an initial value. For example,
char section;
int numberOfClasses;
double cashFare = 2.25;
In allocating memory for variables of identical data type, we may group the identifiers in a
single declaration and separate them with commas. For example,
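One illustrative sketch of grouped declarations follows. The identifiers and values other than section, numberOfClasses and cashFare are hypothetical, as is the function name print_schedule, which exists only to give the declarations a place to live.

```c
#include <stdio.h>

/* Declare several variables of the same type in single declarations,
   print them, and return the total number of sessions. */
int print_schedule(void)
{
    char section = 'A', campus = 'N';            /* two chars */
    int numberOfClasses = 4, numberOfLabs = 2;   /* two ints */
    double cashFare = 2.25, ticketFare = 1.90;   /* two doubles */

    printf("Section %c at campus %c\n", section, campus);
    printf("%d classes and %d labs\n", numberOfClasses, numberOfLabs);
    printf("Fares: %.2f cash, %.2f ticket\n", cashFare, ticketFare);

    return numberOfClasses + numberOfLabs;
}
```

Each declaration names the data type once and lists the identifiers after it, with or without initial values.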
Naming Conventions
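The identifiers that we use for variables must satisfy the following rules:
start with a letter or an underscore,
contain only letters, digits, and underscores,
contain no more than 31 characters, and
not be a C reserved word.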
Some compilers allow more than 31 characters, while others do not. To be safe, we avoid
using more than 31 characters.
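The character and length rules for identifiers can be sketched as a small checking function. This is an illustration under stated assumptions rather than a complete validator: it applies the standard C rules (first character a letter or underscore, remaining characters letters, digits or underscores) together with the 31-character limit, and it does not check for reserved words. The names is_safe_identifier and MAX_IDENTIFIER_LENGTH are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_IDENTIFIER_LENGTH 31

/* Returns 1 if name follows the character and length rules for a portable
   C identifier, 0 otherwise. Reserved words are not checked. */
int is_safe_identifier(const char *name)
{
    size_t length = strlen(name);

    if (length == 0 || length > MAX_IDENTIFIER_LENGTH)
        return 0;

    /* the first character must be a letter or an underscore */
    if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
        return 0;

    /* the remaining characters may also be digits */
    for (size_t i = 1; i < length; i++) {
        if (!(isalnum((unsigned char)name[i]) || name[i] == '_'))
            return 0;
    }
    return 1;
}
```

For example, is_safe_identifier("cashFare") returns 1, while is_safe_identifier("2fast") and is_safe_identifier("cash fare") return 0.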
For upward compatibility with C++, we also avoid using the C++ reserved words, such as
class, new, and delete.
EXERCISES