
Lecture (02)

x86 Architecture
By:
Dr. Ahmed ElShafee

1 Dr. Ahmed ElShafee, ACU : Fall 2022, Microprocessors 1


Microcomputers and Microprocessors
There are three main components of a Computer System.
• Central Processing Unit (CPU): Also called simply the microprocessor, it
acts as the brain, coordinating all activities within a computer.
• Memory: Where the program instructions and data are primarily stored.
• Input/output (I/O) Devices: Allow the computer to input information for
processing and then output the results. I/O devices are also known as
computer peripherals.
• The integrated circuit (IC) chip containing the CPU is called
the microprocessor.
• A microcomputer is a relatively small computer whose central
processing unit (CPU) is a microprocessor. A microcomputer is
typically used as a personal computer (PC), which is smaller
than a mainframe computer.
Microcomputers and Microprocessors

The CPU is connected to memory and I/O devices through a strip of wires called a
bus. The bus inside a computer carries information from place to place. There are
three types of busses:
1. Address Bus: The address bus identifies the memory location or I/O device the
processor intends to communicate with. The width of the address bus ranges from 20 bits
(8086) to 36 bits (Pentium II).
2. Data Bus: The data bus is used by the CPU to get data from, or send data to, the memory
or the I/O devices. The width of the data bus is used to classify the microprocessor.
The data bus of Intel microprocessors ranges from 8 bits (8085) to 64 bits
(Pentium).
3. Control Bus: How can we tell if the address on the bus is a memory address or an I/O
device address? This is where the control bus comes in. Each time the processor outputs
an address, it also activates one of the four control bus signals: Memory Read, Memory
Write, I/O Read and I/O Write.

The address bus and the control bus contain output lines only, so they are unidirectional;
the data bus is bidirectional.
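The address bus widths quoted above determine how much memory a processor can address: an n-bit address bus can select 2^n locations. A minimal sketch in Python (the helper name `addressable_bytes` is made up for this illustration, and one byte per address is assumed):

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address bus of this width can select."""
    return 2 ** address_bits


# Widths taken from the slide: 20 bits (8086) and 36 bits (Pentium II).
for name, bits in [("8086", 20), ("Pentium II", 36)]:
    size = addressable_bytes(bits)
    print(f"{name}: {bits}-bit address bus -> {size} bytes ({size // 2**20} MB)")
```

The 20-bit case gives the 1 MB limit that reappears later for real-address mode.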
Microcomputers and Microprocessors
There are two types of memory used in microcomputers:
• RAM (Random Access Memory / read-write memory) is used by the computer
for the temporary storage of the programs that are running. Data is lost when the
computer is turned off, so RAM is known as volatile memory.
• ROM (Read Only Memory): the information in ROM is permanent and is not lost
when the power is turned off. Therefore, it is called nonvolatile memory.
Note that RAM is sometimes referred to as primary storage, whereas magnetic/optical disks
are called secondary storage.
[Figure: Internal organisation of a microcomputer. The CPU is connected to RAM, ROM, printer, disk, monitor and keyboard through the address bus, the data bus and the read/write control bus.]
Microcomputers and Microprocessors
Inside the CPU:
A program stored in memory provides instructions to the CPU to perform a
specific action, which can be as simple as an addition. It is the function of the CPU to fetch
the program instructions from memory and execute them.

• The CPU contains a number of registers to store information inside the CPU
temporarily. Registers inside the CPU can be 8-bit, 16-bit, 32-bit or even 64-bit,
depending on the CPU.
• The CPU also contains an Arithmetic and Logic Unit (ALU). The ALU performs
arithmetic (add, subtract, multiply, divide) and logic (AND, OR, NOT) functions.
• The CPU contains a program counter, also known as the Instruction Pointer, which
points to the address of the next instruction to be executed.
• The Instruction Decoder is a kind of dictionary used to interpret the meaning
of the instruction fetched into the CPU. Appropriate control signals are generated
according to the meaning of the instruction.
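The fetch, decode and execute roles described above can be sketched with a toy machine. This is only an illustration: the three opcodes (`LOAD_A`, `ADD_A`, `HALT`), their encodings and the single 8-bit register are invented here and are not real x86 instructions.

```python
# Hypothetical opcodes for a toy 8-bit machine (not real x86 encodings).
LOAD_A = 0x01   # LOAD_A n : put the constant n into register A
ADD_A = 0x02    # ADD_A n  : add the constant n to register A (8-bit wrap-around)
HALT = 0xFF     # HALT     : stop and return register A


def run(memory):
    """Fetch, decode and execute instructions until HALT is reached."""
    ip = 0   # instruction pointer: address of the next instruction
    a = 0    # register A
    while True:
        opcode = memory[ip]            # fetch
        if opcode == LOAD_A:           # decode, then execute
            a = memory[ip + 1] & 0xFF
            ip += 2
        elif opcode == ADD_A:
            a = (a + memory[ip + 1]) & 0xFF   # the ALU adds, result kept to 8 bits
            ip += 2
        elif opcode == HALT:
            return a
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {ip}")


program = [LOAD_A, 5, ADD_A, 7, HALT]
print(run(program))
```

The `if`/`elif` chain plays the part of the instruction decoder, and `ip` is advanced past each instruction so that it always points at the next one.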
Microcomputers and Microprocessors
Inside the CPU:
[Figure: Internal block diagram of a CPU. Labelled blocks: instruction pointer (address bus), instruction register, instruction decoder, timing and control (where control bus signals are generated), flags, ALU, internal busses, registers A to D, and the data bus.]


The interaction between the CPU, memory and I/O Devices.
Early Intel microprocessors
• Intel 8080 (1974)
– 64K addressable RAM
– 8-bit registers
– CP/M operating system
• Intel 8086/8088 (1978)
– IBM-PC used 8088
– 1 MB addressable RAM
– 16-bit registers
– 16-bit data bus (8-bit for 8088)
– separate floating-point unit (8087)
– 5, 6, 8, 10 MHz
– 29K transistors
– used in low-cost microcontrollers now
The IBM-AT
• Intel 80286 (1982)
– 16 MB addressable RAM
– Protected memory
– several times faster than 8086
– introduced IDE bus architecture
– 80287 floating point unit
– Up to 20MHz
– 134K transistors

Intel IA-32 Family
• Intel386 (1985)
– 4 GB addressable RAM
– 32-bit registers
– paging (virtual memory)
– Up to 33MHz
• Intel486 (1989)
– instruction pipelining
– Integrated FPU
– 8K cache
• Pentium (1993) my first computer (1995)
– Superscalar (two parallel pipelines)

• Virtual memory: a way of fooling the microprocessor into
thinking that it has access to unlimited memory by swapping data between disk
storage and RAM.
• Real mode (faster operation, with a maximum of 1 MB of
memory) vs. protected mode, which protects the operating
system from accidental or deliberate destruction by the user.
• Protected mode is slower but, on the 80286, can use up to 16 MB of
memory.

Intel P6 Family
• Pentium Pro (1995)
– advanced optimization techniques in microcode
– More pipeline stages
– On-board L2 cache
• Pentium II (1997)
– MMX (multimedia) instruction set
– Up to 450MHz
• Pentium III (1999)
– SIMD (streaming extensions) instructions (SSE)
– Up to 1+GHz
• Pentium 4 (2000)
– NetBurst micro-architecture, tuned for multimedia
– 3.8+GHz
• Pentium D (2005, Dual core)
Architecture of 8086 Microprocessor
• 8086: Internal Organization, Pipelining and Registers
Architecture of 8086 Microprocessor
• Pipelining
– In the 8085 microprocessor, the CPU could either fetch or execute at a given time.
The CPU had to fetch an instruction from memory, then execute it, then fetch
the next one and execute it, and so on.
– In its simplest form, pipelining allows the CPU to fetch and execute at the
same time. Note that the fetch and execute times can be different.
Pipelined vs. Nonpipelined Execution
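The difference between the two execution styles can be estimated with a small timing model. The sketch below is a simplification under stated assumptions: each instruction is a made-up `(fetch_time, execute_time)` pair, the fetch unit never stalls (an unlimited queue), and there are no jumps.

```python
def nonpipelined_time(instructions):
    """Fetch and execute strictly one after another (8085 style)."""
    return sum(fetch + execute for fetch, execute in instructions)


def pipelined_time(instructions):
    """Two-stage overlap: the next fetch proceeds while the current instruction executes.

    Assumptions of this sketch: unlimited prefetch queue, no jumps.
    """
    fetch_done = 0   # time at which the latest fetch finished
    exec_done = 0    # time at which the latest execution finished
    for fetch, execute in instructions:
        fetch_done += fetch                          # fetches run back to back
        exec_done = max(exec_done, fetch_done) + execute
    return exec_done


program = [(2, 3), (2, 3), (2, 3)]   # hypothetical timings in clock cycles
print(nonpipelined_time(program), pipelined_time(program))
```

With these sample timings only the first fetch is exposed; every later fetch is hidden behind the execution of the previous instruction.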
Architecture of 8086 Microprocessor
• Pipelining is achieved by splitting the internal structure of the 8088/86 into two
sections:
– the execution unit (EU)
– the bus interface unit (BIU)

• These two sections work simultaneously. The BIU accesses memory and peripherals
while the EU executes the instructions previously fetched.
• This only works if the BIU keeps ahead of the EU. Thus the BIU has a buffer, or queue
(4 bytes in the 8088 and 6 bytes in the 8086).
• If the execution of an instruction takes too long, the queue fills to its maximum
capacity and the busses stay idle. The BIU starts to fetch again whenever there is room
for 2 bytes in the queue.
• When there is a jump instruction, the microprocessor must flush the queue.
When a jump instruction is executed, the BIU starts to fetch from the new
location in memory. In this situation the EU must wait until the BIU has fetched
the new instruction. This is known as the branch penalty.
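The queue behaviour described above (fetch only when 2 bytes are free, flush on a jump) can be mimicked with a toy model. The class and method names below are invented for illustration, and the model ignores timing entirely; it only tracks how many bytes are waiting.

```python
from collections import deque


class PrefetchQueue:
    """Toy model of the BIU instruction queue (6 bytes, as stated for the 8086)."""

    def __init__(self, capacity=6):
        self.capacity = capacity
        self.bytes = deque()

    def can_fetch(self):
        # The BIU fetches only when there is room for 2 more bytes.
        return self.capacity - len(self.bytes) >= 2

    def fetch_word(self, memory, address):
        """BIU side: queue 2 bytes if there is room; return the next fetch address."""
        if not self.can_fetch():
            return address                  # queue full: the busses stay idle
        self.bytes.extend(memory[address:address + 2])
        return address + 2

    def take(self, count):
        """EU side: remove `count` instruction bytes from the front of the queue."""
        if count > len(self.bytes):
            raise IndexError("EU must wait: not enough bytes prefetched")
        return [self.bytes.popleft() for _ in range(count)]

    def flush(self):
        """A jump discards everything that was prefetched."""
        self.bytes.clear()


memory = list(range(32))        # stand-in for code bytes
queue = PrefetchQueue()
address = 0
for _ in range(5):              # the BIU keeps trying to fetch
    address = queue.fetch_word(memory, address)
print(address, len(queue.bytes))
```

After a `flush()` the queue is empty, so the EU has nothing to take until the BIU has fetched from the jump target, which is the branch penalty in this model.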
IA-32 Architecture
IA32 Processors
• Totally Dominate Computer Market
• Evolutionary Design
– Starting in 1978 with 8086
– Added more features as time goes on
– Still support old features, although obsolete
• Complex Instruction Set Computer (CISC)
– Many different instructions with many different
formats
• But, only small subset encountered with Linux programs
– Hard to match performance of Reduced Instruction
Set Computers (RISC)
– But, Intel has done just that!
IA-32 architecture
• Lots of architecture improvements, pipelining,
superscalar, branch prediction, hyperthreading
and multi-core.
• From programmer’s point of view, IA-32 has not
changed substantially except the introduction
of a set of high-performance instructions

Modes of operation
• Protected mode
– native mode (Windows, Linux), full features,
separate memory

• Virtual-8086 mode
– hybrid of Protected mode
– each program has its own 8086 computer

• Real-address mode
– native MS-DOS
• System management mode
– power management, system security, diagnostics
Addressable memory

• Protected mode
– 4 GB
– 32-bit address
• Real-address and Virtual-8086 modes
– 1 MB space
– 20-bit address

General-purpose registers
32-bit General-Purpose Registers: EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI
16-bit Segment Registers: CS, SS, DS, ES, FS, GS
Other 32-bit registers: EFLAGS, EIP

Accessing parts of registers
• Use 8-bit name, 16-bit name, or 32-bit name
• Applies to EAX, EBX, ECX, and EDX
AH (8 bits) + AL (8 bits)
AX: 16 bits (AH and AL together)
EAX: 32 bits (AX is its lower half)
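The relationship between EAX, AX, AH and AL is plain bit masking and shifting. A small sketch in Python, with function names chosen only for this example:

```python
def split_register(eax):
    """Return (AX, AH, AL) for a 32-bit EAX value."""
    eax &= 0xFFFFFFFF          # keep to 32 bits
    ax = eax & 0xFFFF          # lower 16 bits
    ah = (ax >> 8) & 0xFF      # upper byte of AX
    al = ax & 0xFF             # lower byte of AX
    return ax, ah, al


def set_al(eax, value):
    """Writing AL changes only the lowest byte of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


ax, ah, al = split_register(0x12345678)
print(f"AX={ax:04X} AH={ah:02X} AL={al:02X}")
print(f"EAX after writing FF to AL: {set_al(0x12345678, 0xFF):08X}")
```

The same masks apply to EBX, ECX and EDX with their BX/BH/BL, CX/CH/CL and DX/DH/DL names.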

Index and base registers
• Some registers have only a 16-bit name for
their lower half (no 8-bit aliases): SI, DI, BP
and SP for ESI, EDI, EBP and ESP. The 16-bit
registers are usually used only in real-address
mode.

Some specialized register uses (1 of 2)
• General-Purpose
– EAX – accumulator (automatically used by division
and multiplication)
– ECX – loop counter
– ESP – stack pointer (should never be used for
arithmetic or data transfer)
– ESI, EDI – index registers (used for high-speed
memory transfer instructions)
– EBP – extended frame pointer (stack)

Some specialized register uses (2 of 2)
• Segment
– CS – code segment
– DS – data segment
– SS – stack segment
– ES, FS, GS - additional segments
• EIP – instruction pointer
• EFLAGS
– status and control flags
– each flag is a single binary bit (set or clear)
• Some other system registers, such as the IDTR (Interrupt Descriptor
Table Register), GDTR (Global Descriptor Table Register) and
LDTR (Local Descriptor Table Register)
Status flags
• Carry
– unsigned arithmetic out of range
• Overflow
– signed arithmetic out of range
• Sign
– result is negative
• Zero
– result is zero
• Auxiliary Carry
– carry from bit 3 to bit 4
• Parity
– sum of 1 bits is an even number
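The flag definitions above can be checked on a concrete operation. The sketch below computes the six status flags for an 8-bit addition; the function name and the dictionary keys (CF, OF, SF, ZF, AF, PF) are chosen for this example, and other instructions set the flags by their own rules.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) as an 8-bit ADD would set them."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                                # unsigned result out of range
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,   # signed result out of range
        "SF": (result & 0x80) != 0,                        # result is negative
        "ZF": result == 0,                                 # result is zero
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,            # carry from bit 3 to bit 4
        "PF": bin(result).count("1") % 2 == 0,             # even number of 1 bits
    }
    return result, flags


for a, b in [(0xFF, 0x01), (0x7F, 0x01)]:
    result, flags = add8_flags(a, b)
    set_flags = [name for name, is_set in flags.items() if is_set]
    print(f"{a:02X} + {b:02X} = {result:02X}, flags set: {set_flags}")
```

The two sample additions separate Carry from Overflow: FF + 01 is out of range only as an unsigned sum, while 7F + 01 is out of range only as a signed sum.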
Floating-point, MMX, XMM registers
• Eight 80-bit floating-point data registers
– ST(0), ST(1), . . . , ST(7)
– arranged in a stack
– used for all floating-point arithmetic
• Eight 64-bit MMX registers
• Eight 128-bit XMM registers for
single-instruction multiple-data (SIMD) operations
[Figure: the 80-bit data registers ST(0) to ST(7) and the opcode register]

Programmer’s model

IA-32 Memory Management
Real-address mode
• 1 MB RAM maximum addressable (20-bit address)
• Application programs can access any area of
memory
• Single tasking
• Supported by MS-DOS operating system

Segmented memory
Segmented memory addressing: absolute (linear) address
is a combination of a 16-bit segment value added to a 16-bit offset

[Figure: Memory map from 00000 to F0000 in steps of 10000h. One segment (64K) starts at 8000:0000, at linear address 80000, and ends at 8000:FFFF. The address 8000:0250 (seg:ofs) lies at offset 0250 inside that segment.]
Calculating linear addresses
• Given a segment address, multiply it by 16 (append
a hexadecimal zero), and add it to the offset
• Example: convert 08F1:0100 to a linear address

Adjusted segment value: 0 8 F 1 0
Add the offset:           0 1 0 0
Linear address:         0 9 0 1 0

• A typical program has three segments: code,
data and stack. Segment registers CS, DS and SS
are used to store them separately.
Example

What linear address corresponds to the segment/offset
address 028F:0030?

028F0 + 0030 = 02920

Always use hexadecimal notation for addresses.
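The real-address calculation used in the examples above is short enough to write as code. In this sketch the function name is an invention for illustration, and the result is masked to 20 bits on the assumption of an 8086-style 20-bit address bus.

```python
def linear_address(segment, offset):
    """Real-address mode: shift the 16-bit segment left by 4 bits and add the 16-bit offset."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF   # keep 20 address bits


for seg, ofs in [(0x08F1, 0x0100), (0x028F, 0x0030), (0x8000, 0x0250)]:
    print(f"{seg:04X}:{ofs:04X} -> {linear_address(seg, ofs):05X}")
```

Shifting left by 4 bits is the same as multiplying by 16, which is why the segment value gains one trailing hexadecimal zero.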

Protected mode (1 of 2)
• 4 GB addressable RAM (32-bit address)
– (00000000 to FFFFFFFFh)
• Each program assigned a memory partition
which is protected from other programs
• Designed for multitasking
• Supported by Linux & MS-Windows

Protected mode (2 of 2)
• Segment descriptor tables
• Program structure
– code, data, and stack areas
– CS, DS, SS segment descriptors
– global descriptor table (GDT)
• MASM Programs use the Microsoft flat memory
model

Flat segmentation model
• All segments are mapped to the entire 32-bit physical
address space. At least two segments are needed, one for data and one for
code
• Segments are defined in the global descriptor table (GDT)

Multi-segment model
• Each program has a local descriptor table (LDT)
– holds descriptor for each segment used by the program
[Figure: Local Descriptor Table mapped onto RAM]

base      limit  access
00026000  0010
00008000  000A
00003000  0002

(The limit is multiplied by 1000h. The three segments start at 26000, 8000 and 3000 in RAM.)
Translating Addresses
• The IA-32 processor uses a one- or two-step
process to convert a variable's logical address
into a unique memory location.
• The first step combines a segment value with a
variable’s offset to create a linear
address.
• The second optional step, called page
translation, converts a linear address to a
physical address.
Converting Logical to Linear Address
The segment selector points to a segment descriptor, which contains the
base address of a memory segment. The 32-bit offset from the logical
address is added to the segment’s base address, generating
a 32-bit linear address.

[Figure: A logical address consists of a selector and an offset. The selector picks a segment descriptor from the descriptor table, and the descriptor's base is added to the offset to give the linear address. GDTR/LDTR contains the base address of the descriptor table.]
Indexing into a Descriptor Table
Each segment selector indexes into the program's local
descriptor table (LDT). Each table entry is mapped to a
linear address:

Logical addresses (selector, offset):
SS:ESP     0018  0000003A
DS:offset  0010  000001B6
IP         0008  00002CD3

Local Descriptor Table, located through the LDTR register (index, linear address):
18  001A0000
10  0002A000
08  0001A000
00  00003000

[Figure: each table entry points into the linear address space (DRAM); the rest of that space is unused.]
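The table lookup in the figure can be followed step by step in code. The sketch below copies the four table entries and three logical addresses from the figure; the function name is made up, and limit and access-rights checks, which a real processor performs, are left out.

```python
# Local descriptor table from the figure: byte offset into the table -> segment base.
LDT = {
    0x18: 0x001A0000,
    0x10: 0x0002A000,
    0x08: 0x0001A000,
    0x00: 0x00003000,
}


def logical_to_linear(selector, offset, table=LDT):
    """Look up the segment base selected by `selector` and add the 32-bit offset.

    Simplification: no limit or access checks are made.
    """
    entry = selector & 0xFFF8      # drop the low 3 bits (table indicator, privilege level)
    base = table[entry]
    return (base + offset) & 0xFFFFFFFF


for name, selector, offset in [
    ("SS:ESP", 0x0018, 0x0000003A),
    ("DS:offset", 0x0010, 0x000001B6),
    ("IP", 0x0008, 0x00002CD3),
]:
    print(f"{name:10s} {selector:04X}:{offset:08X} -> {logical_to_linear(selector, offset):08X}")
```

Each descriptor occupies 8 bytes, which is why the table indexes in the figure advance in steps of 8 (00, 08, 10, 18).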
Paging (1 of 2)
• Virtual memory uses disk as part of the memory,
so the combined size of all running programs can be larger
than physical memory
• Only part of a program must be kept in
memory, while the remaining parts are kept on
disk.
• The memory used by the program is divided
into small units called pages (4096 bytes each).
• As the program runs, the processor selectively
unloads inactive pages from memory and loads
other pages that are immediately required.
Paging (2 of 2)
• OS maintains page directory and page tables
• Page translation: CPU converts the linear
address into a physical address
• Page fault: occurs when a needed page is not
in memory, and the CPU interrupts the
program
• Virtual memory manager (VMM) – OS utility
that manages the loading and unloading of
pages
• OS copies the page into memory, program
resumes execution
Page Translation
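A linear address is divided into a page directory field, page
table field, and page frame offset. The CPU uses all three to
calculate the physical address.

[Figure: The 32-bit linear address is split into Directory (10 bits), Table (10 bits) and Offset (12 bits). CR3 locates the page directory, the directory entry locates the page table, the page-table entry locates the page frame, and the offset selects the byte within the frame, giving the physical address.]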
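The 10/10/12 split of a linear address can be written out directly. In the sketch below the page directory and page tables are modelled as nested Python dictionaries, which is an assumption made for readability; real entries are 32-bit words in memory with present and permission bits, and a missing entry here simply stands in for a page fault.

```python
def split_linear(address):
    """Split a 32-bit linear address into (directory, table, offset) fields."""
    address &= 0xFFFFFFFF
    directory = address >> 22            # top 10 bits
    table = (address >> 12) & 0x3FF      # middle 10 bits
    offset = address & 0xFFF             # low 12 bits (4096-byte page)
    return directory, table, offset


def translate(address, page_directory):
    """Walk directory -> page table -> frame and add the offset.

    `page_directory` maps directory index -> {table index -> frame base address}.
    """
    directory, table, offset = split_linear(address)
    page_table = page_directory.get(directory)
    if page_table is None or table not in page_table:
        raise LookupError("page fault: page not in memory")
    return page_table[table] + offset


# Hypothetical mapping: directory entry 1, table entry 3 -> frame at 0007A000h.
page_directory = {1: {3: 0x0007A000}}
print(split_linear(0x00403025))
print(f"{translate(0x00403025, page_directory):08X}")
```

Because the offset is 12 bits wide, every page frame holds 4096 bytes, matching the page size given earlier.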
C U Next Week,…
