
Intel x86 Instruction Set Architecture

Dr. Nihat Adar


Intel Processor Architectures

Processor            Year       Address Size (bits)   Data Size (bits)
8086                 1978       20                    16
80286                1982       24                    16
80386/486            ’85/’89    32                    32
Pentium              1993       32                    32
Pentium 4            2000       32                    32
Core 2 Duo           2006       36                    64
Core i7 (Haswell)    2013       39                    64
Data Bus & Data Sizes

Moore's Law meant we could build systems with more transistors

More transistors meant greater bit-widths

Just like more physical space allows for wider roads/freeways, more
transistors allowed us to move to 16-, 32-, and 64-bit circuitry inside the
processor

To support smaller variable sizes (char = 1 byte) we still need to
access only 1 byte of memory per access, but to support int and
long ints we want to access 4- or 8-byte chunks of memory per
access

Thus the data bus (highway connecting the processor and
memory) has been getting wider (e.g. 64 bits)

The processor can use 8, 16, 32, or all 64 bits of the bus (lanes of the
highway) in a single access based on the size of data that is needed
X86 Data Sizes

Integer (4 sizes defined)

Byte (B)
8 bits = 1 byte

Word (W)
16 bits = 2 bytes

Double word (L)
32 bits = 4 bytes

Quad word (Q)
64 bits = 8 bytes

Floating Point (3 sizes defined)

Single (S)
32 bits = 4 bytes

Double (D)
64 bits = 8 bytes
(For a 32-bit data bus, a double would be accessed from memory in 2 reads)
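The integer size letters map directly onto operand widths. A minimal sketch in C, assuming the lowercase suffix letters used on instructions such as movb/movw/movl/movq; the function name operand_size_bytes is hypothetical and chosen only for illustration:

```c
#include <stddef.h>

/* Map a size suffix letter to the operand width in bytes.
   Returns 0 for a letter that is not one of the four integer sizes. */
size_t operand_size_bytes(char suffix)
{
    switch (suffix) {
    case 'b': return 1;  /* byte        =  8 bits */
    case 'w': return 2;  /* word        = 16 bits */
    case 'l': return 4;  /* double word = 32 bits */
    case 'q': return 8;  /* quad word   = 64 bits */
    default:  return 0;  /* unknown suffix */
    }
}
```

For example, operand_size_bytes('l') gives 4, the width that movl transfers.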
Big-endian vs Little-endian

Endianness refers to the two alternative methods of ordering the bytes in a
larger unit (word, DWORD, etc.)

Big-Endian

PPC, Sparc

MS byte is put at the starting address

Little-Endian

used by Intel processors / original PCI bus

LS byte is put at the starting address

Some processors (like ARM) and buses can be configured for
either big- or little-endian
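The two byte orders can be sketched in C by writing a 32-bit value into a byte array by hand. This is an illustration only; the helper names store_le32 and store_be32 are hypothetical, and dst[0] stands for the starting address:

```c
#include <stdint.h>

/* Little-endian: the least significant byte goes at the starting address. */
void store_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
    dst[2] = (uint8_t)((v >> 16) & 0xFF);
    dst[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Big-endian: the most significant byte goes at the starting address. */
void store_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)((v >> 24) & 0xFF);
    dst[1] = (uint8_t)((v >> 16) & 0xFF);
    dst[2] = (uint8_t)((v >> 8) & 0xFF);
    dst[3] = (uint8_t)(v & 0xFF);
}
```

Storing 0x12345678 with store_le32 leaves the bytes 78 56 34 12 in memory (the Intel layout), while store_be32 leaves 12 34 56 78.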
Big-endian vs Little-endian
Real Numbers
BCD Data
ASCII Data
General Instruction Format Issues
● Consider the pros and cons of each format when performing the set of operations
➔ F = X + Y - Z
➔ G = A + B
● Simple embedded computers often use single operand format
➔ Smaller data size (8-bit or 16-bit machines) means limited instruction size
● Modern, high performance processors use 2- and 3-operand formats
Intel x86 Register Set


8-bit processors in late 1970s

4 registers for integer data: A, B, C, D

4 registers for address/pointers: SP(stack pointer), BP(base pointer),
SI(source index), DI(dest. index)

16-bit processors extended registers to 16-bits but continued
to support 8-bit access

Use prefix/suffix to indicate size: AL references the lower 8 bits of
register A, AH references the high 8 bits, and AX references the full
16-bit value

32-/64-bit processors (see next slide)
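The AL/AH/AX relationship is plain bit slicing of one 16-bit register. A small C sketch under that reading; the helper names are hypothetical and model register A as a uint16_t value:

```c
#include <stdint.h>

/* AL: the lower 8 bits of the 16-bit register AX. */
uint8_t reg_low_byte(uint16_t ax)
{
    return (uint8_t)(ax & 0xFF);
}

/* AH: the upper 8 bits of the 16-bit register AX. */
uint8_t reg_high_byte(uint16_t ax)
{
    return (uint8_t)(ax >> 8);
}

/* Writing AL replaces only the low byte; AH is left unchanged. */
uint16_t reg_set_low_byte(uint16_t ax, uint8_t al)
{
    return (uint16_t)((ax & 0xFF00u) | al);
}
```

With AX = 0xA69B, reg_low_byte gives 0x9B (AL) and reg_high_byte gives 0xA6 (AH).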
Intel x86 Register Set
Protected mode: Descriptors
Program Invisible Registers
Intel x86 Addressing Modes
Intel x86 Addressing Modes

Register Mode
Specifies the contents of a register as the operand

Immediate Mode
Specifies a constant stored in the instruction as the operand
Immediate can be specified in hex or decimal

Direct Addressing Mode
Specifies a constant memory address where the true operand is located
Address can be specified in decimal or hex

Indirect Addressing Mode
Specifies a register whose value will be used as the effective address in
memory where the true operand is located
Similar to dereferencing a pointer
Parentheses indicate indirect addressing mode
Intel x86 Addressing Modes

Base/Indirect with Displacement Addressing Mode
Form: d(%reg)
Adds a constant displacement to the value in a register and uses the sum
as the effective address of the actual operand in memory


Scaled Index Addressing Mode
Form: (%reg1,%reg2,s) [s = 1, 2, 4, or 8]
Uses the result of %reg1 + %reg2*s as the effective address of the actual
operand in memory
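The effective-address arithmetic for the two forms d(%reg) and (%reg1,%reg2,s) can be sketched in C. This assumes 32-bit registers with wraparound arithmetic; the function names are hypothetical:

```c
#include <stdint.h>

/* d(%reg): effective address = register value + constant displacement.
   The displacement may be negative; unsigned arithmetic wraps modulo 2^32. */
uint32_t ea_base_disp(int32_t d, uint32_t reg)
{
    return reg + (uint32_t)d;
}

/* (%reg1,%reg2,s): effective address = reg1 + reg2 * s, with s in {1, 2, 4, 8}. */
uint32_t ea_scaled_index(uint32_t reg1, uint32_t reg2, uint32_t s)
{
    return reg1 + reg2 * s;
}
```

For instance, with %reg1 = 0x2000 holding an array base and %reg2 = 3 holding an index, a scale of 4 selects the 4-byte element at 0x200C.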
Instruction Limits on Addressing Modes


Primary restriction is both operands cannot be memory locations

mov 2000, (%eax) is not allowed since both source and destination are in memory

To move mem -> mem use two move instructions with a register as the intermediate
storage location


Legal move combinations:
Imm -> Reg
Imm -> Mem
Reg -> Reg
Mem -> Reg
Reg -> Mem
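The list of legal combinations reduces to two rules: an immediate can never be a destination, and source and destination cannot both be memory. A C sketch of that check; the type operand_kind and the function mov_is_legal are hypothetical names introduced here:

```c
#include <stdbool.h>

typedef enum { OP_IMM, OP_REG, OP_MEM } operand_kind;

/* Returns true when a mov from src to dst is one of the legal combinations. */
bool mov_is_legal(operand_kind src, operand_kind dst)
{
    if (dst == OP_IMM) {
        return false;               /* a constant cannot be written to */
    }
    if (src == OP_MEM && dst == OP_MEM) {
        return false;               /* mem -> mem needs a register in between */
    }
    return true;                    /* Imm->Reg, Imm->Mem, Reg->Reg, Mem->Reg, Reg->Mem */
}
```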
Intel x86 Instruction Classes
● Data Transfer (mov instruction)
➔ Moves data between processor & memory (loads and saves variables between processor and memory)
➔ At most one operand may be a memory location (can't move data directly from one memory location to another)
➔ Specifies size via a suffix on the instruction (movb, movw, movl, movq)

● String Operations
➔ Every string instruction follows fixed conventions for the index registers and the operand size: ES:EDI is the destination operand, DS:ESI is the source operand

● ALU Operations
➔ At most one operand may be a memory location
➔ Size and operation specified by instruction (addl, orq, andb, subw)

● Control / Program Flow
➔ Unconditional/Conditional Branch (cmpq, jmp, je, jne, jl, jge)
➔ Subroutine Calls (call, ret)

● Privileged / System Instructions
➔ Instructions that can only be used by the OS or other “supervisor” software (e.g. int to access certain OS capabilities, etc.)



Data Transfers Instructions
Data Movement
•Move from source to destination. Syntax:
MOV destination, source
•Source and destination have the same size
•No more than one memory operand permitted
•CS, EIP, and IP cannot be the destination
•No immediate to segment moves
MOV instruction

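.data
count BYTE 100
wVal WORD 2
.code
mov bl,count
mov ax,wVal
mov count,al

mov al,wVal ; error: size mismatch (8-bit register, 16-bit memory)
mov ax,count ; error: size mismatch (16-bit register, 8-bit memory)
mov eax,count ; error: size mismatch (32-bit register, 8-bit memory)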
Exercise . . .
Explain why each of the following MOV statements is invalid:
.data
bVal BYTE 100
bVal2 BYTE ?
wVal WORD 2
dVal DWORD 5
.code
mov ds,45 ; a.
mov esi,wVal ; b.
mov eip,dVal ; c.
mov 25,bVal ; d.
mov bVal2,bVal ; e.
Memory to memory
.data
var1 WORD ?
var2 WORD ?
.code
mov ax, var1
mov var2, ax
Copy smaller to larger
.data
count WORD 1
.code
mov ecx, 0
mov cx, count ; ECX = 00000001h

.data
signedVal SWORD -16 ; FFF0h
.code
mov ecx, 0 ; wrong for a negative value: use mov ecx, 0FFFFFFFFh
mov cx, signedVal ; ECX = 0000FFF0h (+65520), not -16

MOVZX and MOVSX instructions take care of extension for both signed and
unsigned integers.
Zero extension
When you copy a smaller value into a larger destination,
the MOVZX instruction fills (extends) the upper bits of
the destination with zeros.

         10001111 Source
00000000 10001111 Destination

movzx r32,r/m8
movzx r32,r/m16
movzx r16,r/m8

mov bl,10001111b
movzx ax,bl ; zero-extension

The destination must be a register.
Sign extension
The MOVSX instruction fills the upper bits of the destination
with a copy of the source operand's sign bit.

         10001111 Source
11111111 10001111 Destination

mov bl,10001111b
movsx ax,bl ; sign extension

The destination must be a register.
MOVZX MOVSX
From a smaller location to a larger one

mov bx, 0A69Bh


movzx eax, bx ; EAX=0000A69Bh
movzx edx, bl ; EDX=0000009Bh
movzx cx, bl ; CX=009Bh

mov bx, 0A69Bh


movsx eax, bx ; EAX=FFFFA69Bh
movsx edx, bl ; EDX=FFFFFF9Bh
movsx cx, bl ; CX=FF9Bh
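The register values in the examples above can be reproduced with explicit bit operations. A C sketch, assuming the source is passed as an unsigned value and the extension is done by hand so the mechanism stays visible; the function names are hypothetical, not instruction mnemonics:

```c
#include <stdint.h>

/* Zero extension: the upper bits of the destination are filled with 0. */
uint32_t zero_extend_16_to_32(uint16_t src) { return (uint32_t)src; }
uint32_t zero_extend_8_to_32(uint8_t src)   { return (uint32_t)src; }
uint16_t zero_extend_8_to_16(uint8_t src)   { return (uint16_t)src; }

/* Sign extension: the upper bits are filled with copies of the source's sign bit. */
uint32_t sign_extend_16_to_32(uint16_t src)
{
    return (src & 0x8000u) ? (0xFFFF0000u | src) : (uint32_t)src;
}

uint32_t sign_extend_8_to_32(uint8_t src)
{
    return (src & 0x80u) ? (0xFFFFFF00u | src) : (uint32_t)src;
}

uint16_t sign_extend_8_to_16(uint8_t src)
{
    return (uint16_t)((src & 0x80u) ? (0xFF00u | src) : src);
}
```

With BX = 0xA69B and BL = 0x9B, the sign-extending helpers give 0xFFFFA69B, 0xFFFFFF9B, and 0xFF9B, matching the MOVSX comments; a source with a clear sign bit, such as 0x7F, is extended with zeros in both cases.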
LAHF/SAHF (load/store status flag from/to AH)

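.data
saveflags BYTE ?
.code
lahf ; load the low byte of EFLAGS into AH
mov saveflags, ah
...
mov ah, saveflags
sahf ; store AH back into the low byte of EFLAGS

The S, Z, A, P, and C flags (the low byte of EFLAGS) are copied.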
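The byte that LAHF places in AH has a fixed layout: SF in bit 7, ZF in bit 6, AF in bit 4, PF in bit 2, CF in bit 0, with bit 1 always set and bits 3 and 5 always clear. A C sketch of that packing; lahf_pack is a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Build the AH value that LAHF would produce from the five status flags. */
uint8_t lahf_pack(bool sf, bool zf, bool af, bool pf, bool cf)
{
    unsigned ah = 0x02u;            /* bit 1 is always 1 */
    ah |= (unsigned)sf << 7;        /* sign flag */
    ah |= (unsigned)zf << 6;        /* zero flag */
    ah |= (unsigned)af << 4;        /* auxiliary carry flag */
    ah |= (unsigned)pf << 2;        /* parity flag */
    ah |= (unsigned)cf;             /* carry flag */
    return (uint8_t)ah;
}
```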
XCHG Instruction
XCHG exchanges the values of two operands. At least one
operand must be a register. No immediate operands are
permitted.
.data
var1 WORD 1000h
var2 WORD 2000h
.code
xchg ax,bx ; exchange 16-bit regs
xchg ah,al ; exchange 8-bit regs
xchg var1,bx ; exchange mem, reg
xchg eax,ebx ; exchange 32-bit regs

xchg var1,var2 ; error: 2 memory operands
Exchange two memory locations
.data
var1 WORD 1000h
var2 WORD 2000h
.code
mov ax, var1 ; AX = var1
xchg ax, var2 ; var2 = old var1, AX = old var2
mov var1, ax ; var1 = old var2
Stack Operations
A stack is a region of memory used for temporary storage of information.

Memory space should be allocated for stack by the programmer.

The last value placed on the stack is the first to be taken off. This is called
a LIFO (Last In, First Out) structure. Values placed on the stack are stored
from the highest memory location down to the lowest memory location.

SS is used as a segment register for address calculation together with SP.

Flags: Only affected by the popf instruction.

Addressing Modes (16-bit 8086): src and dst must be words, and src cannot be
an immediate. dst cannot be the IP or CS register.
Stack Operations
push — Push on stack
The push instruction places its operand onto the top of the hardware-supported stack in memory.
Specifically, push first decrements ESP by 4, then places its operand into the 32-bit location
at address [ESP]. ESP (the stack pointer) is decremented by push since the x86 stack grows down, i.e.
the stack grows from high addresses to lower addresses.
Syntax
push <reg32>
push <mem>
push <con32>
Examples
push eax — push eax on the stack
push var — push the 4 bytes at address var onto the stack

pop — Pop from stack


The pop instruction removes the 4-byte data element from the top of the hardware-supported stack and places it in the
specified operand (i.e. register or memory location). It first moves the 4 bytes located at memory location
[ESP] into the specified register or memory location, and then increments ESP by 4.
Syntax
pop <reg32>
pop <mem>
Examples
pop edi — pop the top element of the stack into EDI.
pop [ebx] — pop the top element of the stack into memory at the four bytes starting at location EBX.
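The decrement-then-store and load-then-increment order can be modelled in C with a small byte array standing in for stack memory. This is a sketch under simplifying assumptions: the type stack_model, the 64-byte size, and the function names are hypothetical, ESP is kept as an offset into the array, and no overflow or underflow checks are made:

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

typedef struct {
    uint8_t  mem[STACK_BYTES];  /* memory reserved for the stack */
    uint32_t esp;               /* offset of the current top of stack */
} stack_model;

/* An empty stack has ESP just past the highest address of the region. */
void stack_init(stack_model *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;
}

/* push: first decrement ESP by 4, then store the operand at [ESP]. */
void push32(stack_model *s, uint32_t value)
{
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* pop: first load the 4 bytes at [ESP], then increment ESP by 4. */
uint32_t pop32(stack_model *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}
```

Pushing 1 and then 2 moves ESP down by 8; the first pop returns 2 and the second returns 1, which is the LIFO order.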
Exercise: Fill in the stack
String Operations
String Instructions
Every string instruction follows fixed conventions for the index
registers and the operand size:
ES:EDI - destination operand.
DS:ESI - source operand.
B/W/D - operand size of 1, 2, or 4 bytes. The index registers change by this
amount after each element.
DF (direction flag) - 0 is up (index registers are incremented), 1 is down (decremented).

Each string instruction can take one prefix that repeats it (while the condition
is met):

REP: while (ECX != 0) { ...; ECX--; }
REPE/Z: while (ECX != 0) { ...; ECX--; if (!ZF) break; }
REPNE/NZ: while (ECX != 0) { ...; ECX--; if (ZF) break; }

The string instructions are only for moving and comparing.


String Instructions
MOVSB/W/D
Moves one element from source to destination. With a prefix, it can copy a
block of memory.

LODSB/W/D
Loads one element from source into the accumulator (AL, AX, EAX).

STOSB/W/D
Stores the content of the accumulator (AL, AX, EAX) into destination.
The prefix can be used to fill a block of memory.

SCASB/W/D
Compares the content of the accumulator (AL, AX, EAX) with destination by
computing accumulator - ES:[EDI]; the result is discarded and only the flags
are set. The prefix can be used to search for a required value or for the
first difference.
String Instructions
CMPSB/W/D
Compares source and destination by computing DS:[ESI] - ES:[EDI]; the result
is discarded and only the flags are set. The prefix can be used to find the
first match or the first difference between strings or blocks of memory.

INSB/W/D
Reads one element from the port at the address specified by DX into destination.

OUTSB/W/D
Writes one element from source to the port at the address specified by DX.
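How a repeat prefix combines with a string instruction can be sketched in C for two common pairings, REP MOVSB and REPNE SCASB. The sketch assumes DF = 0 (indexes move upward) and byte-sized elements; the function names and the pointer-plus-count interface are hypothetical stand-ins for ESI, EDI, ECX, and AL:

```c
#include <stddef.h>
#include <stdint.h>

/* REP MOVSB: copy ecx bytes from the source (DS:ESI) to the destination (ES:EDI). */
void rep_movsb(uint8_t *edi, const uint8_t *esi, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = *esi++;   /* MOVSB: move one byte, advance both indexes */
        ecx--;
    }
}

/* REPNE SCASB: compare al with successive destination bytes until a match
   is found or ecx bytes have been scanned. Returns the index of the first
   match, or the original count when no byte matches. */
size_t repne_scasb(const uint8_t *edi, uint8_t al, size_t ecx)
{
    size_t i = 0;
    while (ecx != 0) {
        int zf = (edi[i] == al);  /* SCASB sets ZF when al - [EDI] is zero */
        i++;
        ecx--;
        if (zf) {
            return i - 1;         /* REPNE stops as soon as ZF is set */
        }
    }
    return i;
}
```

Scanning the five bytes of "hello" for 'l' stops at index 2, while scanning for a byte that is absent runs through all five elements.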
Example: String Instructions
Example: String Instructions
