Professional Documents
Culture Documents
CSE351 L10 x86 III - 16au
CSE351 L10 x86 III - 16au
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
x86 Programming III
CSE 351 Autumn 2016
Instructor:
Justin Hsia
Teaching Assistants:
Chris Ma
Hunter Zahn
John Kaltenbach
Kevin Bi
Sachin Mehta
Suraj Bhat
Thomas Neuman
Waylon Huang
Xi Liu http://xkcd.com/648/
Yufang Sun
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Administrivia
Homework 1 due Friday @ 5pm
Still submit electronically via the Dropbox
Lab 2 released, due next Friday
Midterm is 2 weeks from today, in lecture
You will be provided a reference sheet
• Study and use this NOW so you are comfortable with it when the
exam comes around
Find a study group! Look at past exams (last 3 quarters
especially)!
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Peer Instruction Question
Which conditional statement properly fills in the
following blank?
Vote at http://PollEv.com/justinh
if(__________) {…} else {…}
Jumping
j* Instructions
Jumps to target (argument – actually just an address)
Conditional jump relies on special condition code registers
Instruction Condition Description
je target ZF Equal / Zero
jne target ~ZF Not Equal / Not Zero
js target SF Negative
jns target ~SF Nonnegative
jg target ~(SF^OF)&~ZF Greater (Signed)
jge target ~(SF^OF) Greater or Equal (Signed)
jl target (SF^OF) Less (Signed)
jle target (SF^OF)|ZF Less or Equal (Signed)
ja target ~CF&~ZF Above (unsigned)
jb target CF Below (unsigned)
4
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
x86 Control Flow
Condition codes
Conditional and unconditional branches
Loops
Switches
5
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Expressing with Goto Code
long absdiff(long x, long y) long absdiff_j(long x, long y)
{ {
long result; long result;
if (x > y) int ntest = x <= y;
result = x-y; if (ntest) goto Else;
else result = x-y;
result = y-x; goto Done;
return result; Else:
} result = y-x;
Done:
return result;
}
C allows goto as means of transferring control (jump)
Closer to assembly programming style
Generally considered bad coding style
6
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Compiling Loops
C/Java code: Assembly code:
while ( sum != 0 ) { loopTop: testq %rax, %rax
<loop body> je loopDone
} <loop body code>
jmp loopTop
loopDone:
Other loops compiled similarly
Will show variations and complications in coming slides, but
may skip a few examples in the interest of time
Most important to consider:
When should conditionals be evaluated? (while vs. do‐while)
How much jumping is involved?
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Compiling Loops
What are the Goto versions of the following?
Do…while: Test and Body
For loop: Init, Test, Update, and Body
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Compiling Loops
While loop
C/Java code: Assembly code:
while ( sum != 0 ) { loopTop: testq %rax, %rax
<loop body> je loopDone
} <loop body code>
jmp loopTop
loopDone:
Do‐while loop
C/Java code: Assembly code:
do { loopTop:
<loop body> <loop body code>
} while ( sum != 0 ) testq %rax, %rax
jne loopTop
loopDone:
9
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Do‐While Loop Example
C Code Goto Version
long pcount_do(unsigned long x) long pcount_goto(unsigned long x)
{ {
long result = 0; long result = 0;
do { loop:
result += x & 0x1; result += x & 0x1;
x >>= 1; x >>= 1;
} while (x); if(x) goto loop;
return result; return result;
} }
Count number of 1’s in argument x (“popcount”)
Use backward branch to continue looping
Only take branch when “while” condition holds
10
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Do‐While Loop Compilation
Goto Version
long pcount_goto(unsigned long x)
Register Use(s) {
long result = 0;
%rdi 1st argument (x) loop:
%rax ret val (result) result += x & 0x1;
x >>= 1;
if(x) goto loop;
return result;
}
Assembly
movl $0, %eax # result = 0
.L2: # loop:
movq %rdi, %rdx
andl $1, %edx # t = x & 0x1
addq %rdx, %rax # result += t
shrq %rdi # x >>= 1
jne .L2 # if (x) goto loop
rep ret # return (rep weird)
11
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
General Do‐While Loop Translation
C Code Goto Version
do loop:
Body Body
while (Test); if (Test)
goto loop
Body: {
Statement1;
…
Statementn;
}
Test returns integer
= 0 interpreted as false, ≠ 0 interpreted as true
12
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
General While Loop ‐ Translation #1
“Jump‐to‐middle” translation
Used with -Og
Goto Version
goto test;
loop:
While version Body
while (Test) test:
Body if (Test)
goto loop;
done:
13
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
General While Loop ‐ Translation #2
While version
while (Test)
“Do‐while” conversion
Body Used with –O1
Goto Version
Do‐While Version if (!Test)
if (!Test) goto done;
goto done; loop:
do Body
Body if (Test)
while (Test); goto loop;
done: done:
15
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
For Loop Form
General Form Init
for (Init; Test; Update) i = 0
Body Test
i < WSIZE
#define WSIZE 8*sizeof(int)
long pcount_for(unsigned long x)
{
Update
size_t i; i++
long result = 0;
for (i = 0; i < WSIZE; i++) { Body
unsigned bit = {
(x >> i) & 0x1; unsigned bit =
result += bit; (x >> i) & 0x1;
} result += bit;
return result; }
}
17
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
For Loop While Loop
For Version
for (Init; Test; Update)
Body Caveat: C and Java have
break and continue
• Conversion works fine for break
• Jump to same label as loop
While Version exit condition
Init; • But not continue: would skip
doing Update, which it should do
while (Test) { with for‐loops
Body • Introduce new label at
Update
Update;
}
18
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
For Loop ‐ While Conversion
long pcount_for_while(unsigned long x)
Init {
size_t i;
i = 0
long result = 0;
i = 0;
Test while (i < WSIZE) {
i < WSIZE unsigned bit = (x >> i) & 0x1;
result += bit;
Update i++;
}
i++ return result;
}
Body
{
unsigned bit = (x >> i) & 0x1;
result += bit;
}
19
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
For Loop ‐ Do‐While Conversion
C Code Goto Version
long pcount_for long pcount_for_goto_dw
(unsigned long x) (unsigned long x)
{ {
size_t i; size_t i;
long result = 0; long result = 0;
for (i = 0; i < WSIZE; i++) i = 0; Init
{ if (!(i < WSIZE)) !Test
unsigned bit = goto done;
(x >> i) & 0x1; loop:
result += bit; unsigned bit = Body
} (x >> i) & 0x1;
return result; result += bit;
} i++; Update
if (i < WSIZE) Test
goto loop;
Initial test can be done:
return result;
optimized away! }
20
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
x86 Control Flow
Condition codes
Conditional and unconditional branches
Loops
Switches
21
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
long switch_ex
{
(long x, long y, long z) Switch Statement
long w = 1;
switch (x) {
Example
case 1:
w = y*z; Multiple case labels
break;
case 2: Here: 5 & 6
w = y/z;
/* Fall Through */
Fall through cases
case 3: Here: 2
w += z;
break; Missing cases
case 5:
case 6:
Here: 4
w -= z;
break;
default: Implemented with:
w = 2;
} Jump table
return w; Indirect jump instruction
}
22
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Jump Table Structure
Switch Form Jump Table Jump Targets
switch (x) { JTab: Targ0 Targ0: Code
case val_0:
Targ1 Block 0
Block 0
case val_1: Targ2
Block 1 Targ1: Code
•
• • • • Block 1
case val_n-1: •
Block n–1 Targ2:
}
Targn-1 Code
Block 2
Approximate Translation •
•
target = JTab[x];
•
goto target;
Targn-1: Code
Block n–1
23
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Jump Table Structure
C code: Memory
switch (x) {
case 1: <some code>
break;
case 2: <some code>
case 3: <some code>
break; Code
case 5: Blocks
case 6: <some code>
break;
default: <some code>
}
Use the jump table when x <= 6: 0
if (x <= 6) 1
2
target = JTab[x]; Jump
3
goto target; Table 4
else 5
goto default; 6
24
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Register Use(s)
Switch Statement Example %rdi 1st argument (x)
%rsi 2nd argument (y)
long switch_ex(long x, long y, long z)
%rdx 3rd argument (z)
{
long w = 1; %rax Return value
switch (x) {
. . .
}
return w;
} Note compiler chose
to not initialize w
switch_eg: Take a look!
movq %rdx, %rcx https://godbolt.org/g/NAxYV
cmpq $6, %rdi # x:6 w
ja .L8 # default
jmp *.L4(,%rdi,8) # jump table
jump above – unsigned > catches negative default cases
25
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Switch Statement Example
long switch_ex(long x, long y, long z)
{
long w = 1;
switch (x) {
Jump table
. . .
.section .rodata
} .align 8
return w; .L4:
} .quad .L8 # x = 0
.quad .L3 # x = 1
.quad .L5 # x = 2
.quad .L9 # x = 3
.quad .L8 # x = 4
switch_eg:
switch_eg: .quad .L7 # x = 5
movq movq%rdx,
%rdx,
%rcx%rcx .quad .L7 # x = 6
cmpq cmpq$6, %rdi
$6, %rdi ## x:6
x:6
ja ja .L8 .L8 # default
jmp jmp *.L4(,%rdi,8)
*.L4(,%rdi,8)
# jump table
Indirect
jump
26
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Assembly Setup Explanation
Jump table
Table Structure .section .rodata
Each target requires 8 bytes (address) .align 8
.L4:
Base address at .L4 .quad .L8 # x = 0
.quad .L3 # x = 1
.quad .L5 # x = 2
.quad .L9 # x = 3
Direct jump: jmp .L8 .quad .L8 # x = 4
.quad .L7 # x = 5
Jump target is denoted by label .L8 .quad .L7 # x = 6
Jump Table
declaring data, not instructions 8‐byte memory alignment
Jump table switch(x) {
.section .rodata case 1: // .L3
.align 8 w = y*z;
.L4: break;
.quad .L8 # x = 0 case 2: // .L5
.quad .L3 # x = 1 w = y/z;
.quad .L5 # x = 2 /* Fall Through */
.quad .L9 # x = 3 case 3: // .L9
.quad .L8 # x = 4 w += z;
.quad .L7 # x = 5 break;
.quad .L7 # x = 6 case 5:
case 6: // .L7
w -= z;
break;
this data is 64‐bits wide
default: // .L8
w = 2;
}
28
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Register Use(s)
Code Blocks (x == 1) %rdi 1st argument (x)
%rsi 2nd argument (y)
switch(x) { %rdx 3rd argument (z)
case 1: // .L3 %rax Return value
w = y*z;
break;
. . .
.L3:
}
movq %rsi, %rax # y
imulq %rdx, %rax # y*z
ret
29
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Handling Fall‐Through
long w = 1;
. . .
switch (x) { case 2:
. . . w = y/z;
case 2: // .L5 goto merge;
w = y/z;
/* Fall Through */
case 3: // .L9
w += z;
break;
. . .
case 3:
}
w = 1;
More complicated choice than
“just fall‐through” forced by merge:
“migration” of w = 1; w += z;
• Example compilation
trade‐off
30
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Register Use(s)
Code Blocks (x == 2, x == 3) %rdi 1st argument (x)
%rsi 2nd argument (y)
long w = 1; %rdx 3rd argument (z)
. . . %rax Return value
switch (x) {
. . .
case 2: // .L5
.L5: # Case 2
w = y/z;
movq %rsi, %rax # y in rax
/* Fall Through */
cqto # Div prep
case 3: // .L9
idivq %rcx # y/z
w += z;
jmp .L6 # goto merge
break;
.L9: # Case 3
. . .
movl $1, %eax # w = 1
}
.L6: # merge:
addq %rcx, %rax # w += z
ret
31
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Register Use(s)
Code Blocks (rest) %rdi 1st argument (x)
%rsi 2nd argument (y)
switch (x) { %rdx 3rd argument (z)
. . . %rax Return value
case 5: // .L7
case 6: // .L7
w -= z;
.L7: # Case 5,6
break;
movl $1, %eax # w = 1
default: // .L8
subq %rdx, %rax # w -= z
w = 2;
ret
}
.L8: # Default:
movl $2, %eax # 2
ret
32
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Question
Would you implement this with a jump table?
switch (x) {
case 0: <some code>
break;
case 10: <some code>
break;
case 32767: <some code>
break;
default: <some code>
break;
}
Probably not
32,768‐entry jump table too big (256 KiB) for only 4 cases
For comparison, text of this switch statement 193 B
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Bonus content (nonessential). Does contain examples.
Conditional Operator with Jumps
Conditional Move
34
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Bonus Content
Conditional Operator with Jumps (nonessential)
C Code if (Test)
val = Then‐Expr;
val = Test ? Then‐Expr : Else‐Expr;
else
val = Else‐Expr;
Example:
result = x>y ? x-y : y-x;
Ternary operator ?:
Goto Version Test is expression returning
ntest = !Test;
if (ntest) goto Else;
integer
val = Then_Expr; = 0 interpreted as false
goto Done; ≠ 0 interpreted as true
Else:
val = Else_Expr; Create separate code regions
Done: for then & else expressions
. . .
Execute appropriate one
35
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Bonus Content
Conditional Move (nonessential)
more details at
end of slides
Conditional Move Instructions: cmovC src, dst
Move value from src to dst if condition C holds
if(Test) Dest ← Src
GCC tries to use them (but only when known to be safe)
Why is this useful?
Branches are very disruptive to instruction flow through pipelines
Conditional moves do not require control transfer
absdiff:
movq %rdi, %rax # x
long absdiff(long x, long y) subq %rsi, %rax # result=x-y
{ movq %rsi, %rdx
return x>y ? x-y : y-x; subq %rdi, %rdx # else_val=y-x
} cmpq %rsi, %rdi # x:y
cmovle %rdx, %rax # if <=,
ret # result=else_val
36
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Bonus Content
Using Conditional Moves (nonessential)
Conditional Move Instructions
cmovC src, dest
Move value from src to dest if condition
C holds
Instruction supports: C Code
if (Test) Dest Src
Supported in post‐1995 x86 processors val = Test
GCC tries to use them ? Then_Expr
• But, only when known to be safe : Else_Expr;
Why is this useful?
Branches are very disruptive to “Goto” Version
instruction flow through pipelines
Conditional moves do not require control result = Then_Expr;
transfer else_val = Else_Expr;
nt = !Test;
if (nt) result = else_val;
return result;
37
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Bonus Content
Conditional Move Example (nonessential)
absdiff:
movq %rdi, %rax # x
subq %rsi, %rax # result = x-y
movq %rsi, %rdx
subq %rdi, %rdx # else_val = y-x
cmpq %rsi, %rdi # x:y
cmovle %rdx, %rax # if <=, result = else_val
ret
38
L01:
L10:
Intro,
x86Combinational
ProgrammingLogic
III CSE369,
CSE351, Autumn 2016
Bonus Content
Bad Cases for Conditional Move (nonessential)
Expensive Computations
val = Test(x) ? Hard1(x) : Hard2(x);
Both values get computed
Only makes sense when computations are very simple
Risky Computations
val = p ? *p : 0;
Both values get computed
May have undesirable effects
Computations with side effects
val = x > 0 ? x*=7 : x+=3;
Both values get computed
Must be side‐effect free 39