Professional Documents
Culture Documents
Instructors:
Xiaofei Nan
1
Carnegie Mellon
2
Carnegie Mellon
Encoding Integers
Unsigned Two’s Complement
Sign Bit
10 = 0 1 0 1 0 8+2 = 10
-16 8 4 2 1
6
Carnegie Mellon
7
Carnegie Mellon
Numeric Ranges
⬛Unsigned Values
⬛Two’s Complement Values
▪ UMin = 0 ▪TMin = –2w–1
000…0
100…0
▪ UMax = 2 – w
▪ TMax 2w–1 – 1
1
=
111…1
011…1
▪ Minus 1
111…1
Values for W = 16
8
填空题 4分
submit
Carnegie Mellon
◆ C Programming
⬛ Observations
▪ |TMin | = TMax ▪ #include <limits.h>
+1 ▪ Declares constants, e.g.,
▪Asymmetric
▪ UMax = 2range
* TMax ▪ ULONG_MAX
+1 ▪ LONG_MAX
▪ LONG_MIN
▪ Values platform specific
9
Carnegie Mellon
11
Carnegie Mellon
13
Carnegie Mellon
Boolean Algebra
◆ Developed by George Boole in 19th Century
▪Algebraic representation of logic
▪Encode “True” as 1 and “False” as 0
And Or
■A&B = 1 when both A=1 and B=1 ■A|B = 1 when either A=1 or B=1
14
Carnegie Mellon
15
Carnegie Mellon
▪ 01010101 { 0, 2, 4, 6 }
▪ 76543210
◆ Operations
▪& 01000001 { 0, 6 }
▪ |Intersection
Union 01111101 { 0, 2, 3, 4, 5,
6}
▪ ^ Symmetric difference 00111100
▪ ~ Complement { 2, 3, 4, 5 }
10101010 { 1, 3, 5, 7 } 16
Carnegie Mellon
Bit-Level Operations in C al y
im ar
e x De c in
◆ Operations &, |, ~, ^ Available H
0 0
B
0 000
in C 1 1 0 001
2 2 0 010
▪Apply to any “integral” data type 3 3 0 011
4 4 0 100
▪long, int, short, char, unsigned 5 5 0 101
6 6 0 110
▪ View arguments as bit vectors 7 7 0 111
▪ Arguments applied bit-wise 8 8 1 000
9 9 1 001
◆ Examples (char data type) A 10 1 010
B 11 1 011
▪~0x41 → C 12 1 100
D 13 1 101
E 14 1 110
▪ ~0x00 → F 15 1 111
▪ 17
Carnegie Mellon
Bit-Level Operations in C al y
im ar
e x De c in
◆ Operations &, |, ~, ^ Available H
0 0
B
0 000
in C 1 1 0 001
2 2 0 010
▪Apply to any “integral” data type 3 3 0 011
4 4 0 100
▪long, int, short, char, unsigned 5 5 0 101
6 6 0 110
▪ View arguments as bit vectors 7 7 0 111
▪ Arguments applied bit-wise 8 8 1 000
9 9 1 001
◆ Examples (char data type) A 10 1 010
B 11 1 011
▪~0x41 → 0xBE C 12 1 100
D 13 1 101
▪~0100 00012 → 1011 11102 E 14 1 110
▪ ~0x00 → 0xFF F 15 1 111
▪~0000 00002 → 1111 11112
▪!0x00 →
▪ 0x69 && 0x55 →
0x01
0x01
▪p0x69
▪
▪ || 0x55(avoids
&& *p
!!0x41→ → null pointer
0x01
access)
0x01
19
Carnegie Mellon
20
Carnegie Mellon
22
Carnegie Mellon
23
Carnegie Mellon
w–1 0
ux + + + •• • +++
x - ++ •• • +++
Conversion Visualized
⬛ 2’s Comp. → Unsigned
UMax
▪ Ordering Inversion UMax – 1
▪ Negative → Big Positive
TMax + 1 Unsigned
TMax TMax Range
2’s Complement 0 0
Range –1
–2
TMin
25
Carnegie Mellon
▪ Implicit
tx = conversion
ux; int fun(unsigned u);
occurs
uy via
= assignments uy = fun(tx);
and procedure calls!
ty;
26
Carnegie Mellon
Conversion Surprises
◆ Expression Evaluation
▪If there is a mix of unsigned and signed in single
expression,
signed values implicitly converted to unsigned
▪Including comparison operations <, >, ==, <=, >=
▪Examples for W = 32: TMIN = -2,147,483,648 , TMAX = 2,147,483,647
◆ Constant1 Relation Evaluation
Constant2 == unsigned
0 0U < signed
-1 0 > unsigned
2147483647 -2147483647-1 > signed
-1 0U unsigned
2147483647U -2147483647-1 <
-1 -2 > signed
(unsigned) -1 -2 > unsigned
2147483647 2147483648U < unsigned
2147483647 (int) 2147483648U > signed
27
单选题 1分
A <
B >
C =
submit
单选题 1分
A <
B >
C =
提交
Carnegie Mellon
29
Carnegie Mellon
Sign Extension
⬛ Task:
▪ Given w-bit signed integer x
▪ Convert it to w+k-bit integer with same value
⬛ Rule:
▪ Make k copies of sign bit:
▪ X ′ = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0
k copies of MSB w
X •• •
••
•
X •• • •• •
′ k w
ersp 30
Carnegie Mellon
-16 8 4 2 1 -16 8 4 2 1
10 = 0 1 0 1 0 -10 = 1 1 1 1 0
-32 16 8 4 2 1 -32 16 8 4 2 1
10 = 0 1 0 1 0 -10 = 1 1 0 1 0
0 1
31
Carnegie Mellon
32
Carnegie Mellon
Truncatio
n
⬛ Task:
▪ Given k+w-bit signed or unsigned integer X
▪ Convert it to w-bit integer X’ with same value for “small enough” X
⬛ Rule:
▪ Drop top k bits:
▪ X ′ = xw–1 , xw–2 ,…, x0
k w
X •• • •• •
•• •
X′ •• •
w 33
Carnegie Mellon
2 = 0 0 0 1 0 10 = 0 1 0 1 0
-8 4 2 1 -8 4 2 1
2 = 0 0 1 0 -6 = 1 0 1 0
2 mod 16 = 2 10 mod 16 = 10U mod 16 = 10U = -6
-16 8 4 2 1 -16 8 4 2 1
-6 = 1 1 0 1 0 -10 = 1 0 1 1 0
-8 4 2 1 -8 4 2 1
-6 = 1 0 1 0 6 = 0 1 1 0
-6 mod 16 = 26U mod 16 = 10U = -6 -10 mod 16 = 22U mod 16 = 6U = 6
34
Carnegie Mellon
Summary:
Expanding, Truncating: Basic Rules
◆ Expanding (e.g., short int to int)
▪ Unsigned: zeros added
▪ Signed: sign extension
▪ Both yield expected result
46
Carnegie Mellon
Shift Operations
◆ Left Shift: x << y Argument x 0 11 000 10
Submit
Carnegie Mellon
Unsigned Addition
u •• •
Operands: w bits
•• •
+ v
True Sum: w+1 bits u+v •• •
Discard Carry: w bits UAdd w (u,v) •• •
⬛ Implements
s Modular
= w UAdd Arithmetic
u + v mod 2w
(u , v) =
48
Carnegie Mellon
v
u
49
Carnegie Mellon
▪ If true sum ≥ 2w
▪ At most once UAdd4(u , v)
True Sum
2w+1 Overflow
2w
v
0
Modular Sum
u
50
Carnegie Mellon
▪Will give s == t
51
Carnegie Mellon
Characterizing TAdd
Positive Overflow
⬛ Functionality TAdd(u , v)
▪ True sum requires w+1 bits >0
▪ Drop off MSB v
▪ Treat remaining bits as 2’s <0
comp. integer
< 0 u> 0
Negative Overflow
2w (NegOver)
2w (PosOver)
53
Carnegie Mellon
◆ Values
▪ 4-bit two’s comp. TAdd4(u , v)
▪ Range from -8 to +7
◆ Wraps Around
▪ If sum ≥ 2w–1
▪Becomes negative
▪At most once
▪ If sum < –2w–1
▪Becomes positive
▪At most once v
PosOver
u
54
Carnegie Mellon
Unsigned Multiplication in C
u •• •
Operands: w bits
* v •• •
True Product: 2*w bits u· •• • •• •
v UMult w (u,v) •• •
Discard w bits: w bits
2 21 55
Carnegie Mellon
Signed Multiplication in C
u •• •
Operands: w bits
* v •• •
True Product: 2*w bits u· •• • •• •
v TMult w (u,v) •• •
Discard w bits: w bits
1 1 1 0 100 1
* 1101 0101
110 0001 1 0 1 0010
1 1 0 1 1 101 -35 56
Carnegie Mellon
57
Carnegie Mellon
Division: / 2k 00 ••• 0 0
••• .
u / 2k
00 ••• 0 0 •••
Result: ⎣
u/2k ⎦ •••
58
Carnegie Mellon
59
Carnegie Mellon
◆ Multiplication:
▪ Unsigned/signed: Normal multiplication followed by truncate,
same operation on bit level
▪ Unsigned: multiplication mod 2w
▪ Signed: modified multiplication mod 2w (result in proper
81
Carnegie Mellon
Multiplication
⬛ Goal: Computing Product of w-bit numbers x, y
▪ Either signed or unsigned
60
Carnegie Mellon
Summary
Casting Signed ↔ Unsigned: Basic Rules
◆ Bit pattern is maintained
◆ But reinterpreted
◆ Can have unexpected effects: adding or subtracting 2w
61
Carnegie Mellon
62
Carnegie Mellon
64
Carnegie Mellon
66
Carnegie Mellon
Byte-Addressable Memory
Machine Words
◆ Any given computer has a “Word Size”
▪Nominal size of integer-valued data
▪and of addresses
char 1 1 1
short 2 2 2
int 4 4 4
long 4 8 8
float 4 4 4
double 8 8 8
pointer 4 8 8
69
Carnegie Mellon
Byte Ordering
◆ So, how are the bytes within a multi-byte word ordered in
memory?
◆ Conventions
▪ Big Endian: Sun, PPC Mac, Internet
▪Least significant byte has highest address
▪ Little Endian: x86, ARM processors running Android, iOS, and
Windows
▪Least significant byte has lowest address
71
Carnegie Mellon
72
Carnegie Mellon
Decimal:
Decimal: 15213
15213
Representing Integers Binary:
Binary: 00001111 11001111 00111100 110 1
1101
Hex: 3 B
Hex: 3 B 6 6 D D
int A = 15213;
IA32, x86-64 long int C = 15213;
Increasing addresses
Submit
Carnegie Mellon
void
void show_bytes(pointer
show_bytes(pointer start,
start, size_t
size_t len){
len){
size_t
size_t i;i;
for
for (i
(i == 0;
0; ii << len;
len; i++)
i++) printf(”%p\t0x
printf(”%p\t0x
%.2x\n", start+i, start[i]);
%.2x\n", start+i, start[i]);
printf("\n");
printf("\n");
}}
Printf directives:
Non-portable C code! %p: Print pointer
%x: Print Hexadecimal
73
Carnegie Mellon
show_bytesExecution Example
int
int aa == 15213;
15213; //
// 0x3b6d
0x3b6d
printf("int
printf("int aa == 15213;\n");
15213;\n");
show_bytes((pointer)
show_bytes((pointer) &a,
&a, sizeof(a));
sizeof(a));
int
int aa == 15213;
15213;
0x7fffb7f71dbc
0x7fffb7f71dbc 6d6d
0x7fffb7f71dbd 3b
0x7fffb7f71dbd
0x7fffb7f71dbe
3b
00
0x7fffb7f71dbe 00
0x7fffb7f71dbf
0x7fffb7f71dbf 0000
74
Carnegie Mellon
Representing Pointers
int
int BB == -15213;
-15213;
int
int *P
*P == &B;
&B;
Sun IA3 x86-64
2
EF AC 3C
FF 28 1B
FB F5 FE
2C FF 82
FD
7F
00
00
Representing Strings
char
char S[6]
S[6] == "18213";
"18213";
◆ Strings in C
▪ Represented by array of characters
▪ Each character encoded in ASCII format IA3 Sun
2
▪Standard 7-bit encoding of character set 31 31
▪Character “0” has code 0x30 38 38
– Digit i has code 0x30+i 32 32
▪ String should be NUL-terminated 31 31
33 33
▪Final character = 0
◆ Compatibility 00 00
▪Byte ordering not an issue
76
Carnegie Mellon
77
Carnegie Mellon
Summary
◆ Representing information as bits
◆ Integers
▪ Representation: unsigned and signed
▪Bit-level manipulations
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
◆ Representations in memory, pointers, strings
78
Carnegie Mellon
Integer C Puzzles
x < 0 ⇒ ((x*2) < 0)
ux >= 0
x & 7 == 7 (x<<30) < 0
⇒
ux
x >>y-1
⇒ -x < -
y x * x >= 0
x > 0 && y > 0 x + y > 0
Initialization
⇒ >= 0
x ⇒ -x <= 0
int x = foo(); x <= 0 ⇒ -x >= 0
int y = bar(); (x|-x)>>31 == -1
unsigned ux = x; ux >> 3 == ux/8
x >> 3 == x/8
unsigned uy =
x & (x-1) != 0
y;
79
Carnegie Mellon
Appendix
80
Carnegie Mellon
Incremented by 1
Biasing adds 1 to final result
84