8/29/2022

Ho Chi Minh City University of Technology


Department of Electrical and Electronics

1. History of CPUs
2. Intel x86 Processors
3. ARM processors
4. Memory
5. Computer Software

1. History of CPUs
• 1950s:
  - Ferranti Mark 1 (1951): from the University of Manchester
    - a single 80-bit accumulator and a 40-bit "multiplicand/quotient register"
  - UNIVAC I (UNIVersal Automatic Computer I): designed principally by J. Presper Eckert and John Mauchly, the inventors of the ENIAC
    - 1,905 operations per second running on a 2.25 MHz clock
  - IBM 704 (introduced in 1954)

Figures: Ferranti Mark 1, c. 1951; an IBM 704 computer at NACA in 1957

1. History of CPUs
• 1960s to 1970s:
  - IBM System/360 (S/360, 1964): the entry-level model ran 34,500 instructions per second, with memory from 8 to 64 KB
  - PDP-11 (1970): developed by Digital Equipment Corporation
    - 16-bit processor; later models address up to 4 MB of physical memory
  - Motorola 68000 (1979):
    - Initial speed grades were 4, 6, and 8 MHz
    - 68k instruction set

Figures: IBM System/360; PDP-11/40; Motorola MC68000

1. History of CPUs
• 1970s:
  - Intel 4004 (1971):
    - a single instruction cycle takes 10.8 microseconds
    - maximum clock rate is 740 kHz
  - Intel 8008 (1972), 8080 (1974), 8086 (1978):
    - 8008: 8-bit CPU with an external 14-bit address bus; clock frequency 0.2-0.8 MHz
    - 8080: 8-bit CPU with a 16-bit address bus; clock frequency 2 MHz
    - 8086: 16-bit CPU with a 20-bit address bus; clock frequency 5-10 MHz
  - 32-bit VAX (1977): based on DEC's earlier PDP-11, supports virtual memory

Figures: Intel 4004; Intel 8088; Intel 8086

A Brief History of Computer


Link YouTube: https://www.youtube.com/watch?v=iK0PT5q7GlE

2. Intel x86 Processors

• Dominate the laptop/desktop/server market
• Evolutionary design
  - Backwards compatible all the way back to the 8086, introduced in 1978
  - Added more features as time went on
• Complex instruction set computer (CISC)
  - Many different instructions with many different formats
  - But only a small subset is encountered in Linux programs
  - Hard to match the performance of Reduced Instruction Set Computers (RISC)
  - But Intel has done just that!
    - In terms of speed; less so for low power

Intel x86 Evolution: Milestones

Name               Date   Transistors   MHz
• 8086             1978   29K           5-10
  - First 16-bit Intel processor. Basis for IBM PC & DOS
  - 1 MB address space
• 386              1985   275K          16-33
  - First 32-bit Intel processor, referred to as IA32
  - 32-bit data path (the lower-cost 386SX has a 16-bit external data bus)
  - Added "flat addressing", capable of running Unix
• 486
  - 32-bit registers, 32-bit data path
  - 486DX includes an on-chip FPU (Floating Point Unit)
• Pentium 4E       2004   125M          2800-3800
  - First 64-bit Intel x86 processor, referred to as x86-64
• Core 2           2006   291M          1060-3500
  - First multi-core Intel processor
• Core i3, i5, i7  2008   731M          1700-3900
  - Two cores / four cores

Intel x86 Processors, cont.

• Machine Evolution (name, year, transistors)
  - 386           1985   0.3M
  - Pentium       1993   3.1M
  - Pentium/MMX   1997   4.5M
  - Pentium Pro   1995   6.5M
  - Pentium III   1999   8.2M
  - Pentium 4     2001   42M
  - Core 2 Duo    2006   291M
  - Core i7       2008   731M
• Added Features
  - Instructions to support multimedia operations
  - Instructions to enable more efficient conditional operations
  - Transition from 32 bits to 64 bits
  - More cores

2015 State of the Art

• Core i7 Broadwell 2015
• Desktop Model
  - 4 cores
  - Integrated graphics
  - 3.3-3.8 GHz
  - 65 W
• Server Model
  - 8 cores
  - Integrated I/O
  - 2-2.6 GHz
  - 45 W

2. Intel x86 Processors

• 8086 processor
  - 40-pin dual in-line package
  - 16-bit wide data bus
  - 16-bit registers
  - 20-bit external address bus provides a 1 MB physical address space
  - The maximum linear address space is limited to 64 KB, because the address registers are only 16 bits wide
  - Max CPU clock: 5-10 MHz

2. CPU - x86 Processor

• CPU, memory, input/output devices
• Instruction set, interfacing C to assembly, macros, stack frame and calling convention
• Interrupts, exceptions

The architecture of the 8086 microprocessor

• 2 major units:
  - BIU - Bus Interface Unit: bus interface, segment registers, fetch queue
  - EU - Execution Unit: control unit, ALU, registers

2. x86 Processors - 8086

• Instructions:
  - One-address or two-address operations
  - Supports assembly and high-level programming languages (C, Pascal)
• Main registers: called data registers or general registers
  - Each holds 16 bits of data
  - Each can also be accessed as two 8-bit registers (high byte and low byte)

High   Low   16-bit
AH     AL    AX (primary accumulator)
BH     BL    BX (base, accumulator)
CH     CL    CX (counter, accumulator)
DH     DL    DX (accumulator, other functions)
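
The relationship between a 16-bit register and its two 8-bit halves can be sketched in C with shifts and masks. This is only an illustration of the AX = AH:AL layout; the function names are made up for this note and are not part of any x86 toolchain.

```c
#include <stdint.h>

/* High byte of a 16-bit register value (the AH part of AX). */
uint8_t reg_high(uint16_t x)
{
    return (uint8_t)(x >> 8);
}

/* Low byte of a 16-bit register value (the AL part of AX). */
uint8_t reg_low(uint16_t x)
{
    return (uint8_t)(x & 0xFFu);
}

/* Rebuild the 16-bit value from its high and low bytes. */
uint16_t reg_join(uint8_t high, uint8_t low)
{
    return (uint16_t)(((uint16_t)high << 8) | low);
}

/* Model a write to the low half only (like "mov al, imm8"):
   the high byte is left unchanged. */
uint16_t reg_set_low(uint16_t x, uint8_t low)
{
    return (uint16_t)((x & 0xFF00u) | low);
}
```

For example, with AX = 0x12AB the high half is 0x12 and the low half is 0xAB; writing 0xFF to the low half gives 0x12FF.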


2. x86 Processors - 8086

• Index and pointer registers: for addressing
    SI   Source Index
    DI   Destination Index
    BP   Base Pointer
    SP   Stack Pointer
• Program counter:
    IP   Instruction Pointer
• Segment registers:
    CS   Code Segment
    DS   Data Segment
    ES   Extra Segment
    SS   Stack Segment

2. x86 Processors - 8086

• Segment registers:
  - a way to allow programs to address more than 64 KB
  - the registers CS, DS, SS, and ES point to the currently used program code segment (CS), the current data segment (DS), the current stack segment (SS), and one extra segment determined by the programmer (ES)

    CS   Code Segment
    DS   Data Segment
    ES   Extra Segment
    SS   Stack Segment

• The 20-bit address is the segment shifted left by 4 bits, plus the offset:

      0110 1000 1000 0111 0000   Segment, 16 bits, shifted 4 bits left
    +      0011 0100 1010 1001   Offset, 16 bits
      ------------------------
      0110 1011 1101 0001 1001   Address, 20 bits
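
The addition above can be written as a small C function. This is a minimal sketch: the name phys_addr_8086 is invented here, and the final mask models the 8086's 20 address lines, so a sum past 1 MB wraps around to low memory.

```c
#include <stdint.h>

/* 8086 real-mode address: (segment << 4) + offset, kept to 20 bits. */
uint32_t phys_addr_8086(uint16_t segment, uint16_t offset)
{
    uint32_t sum = ((uint32_t)segment << 4) + offset;
    return sum & 0xFFFFFu;  /* only 20 address lines: wrap at 1 MB */
}
```

With the values from the slide, segment 0x6887 and offset 0x34A9 give 0x68870 + 0x34A9 = 0x6BD19. Note that different segment:offset pairs can name the same byte, for example 1000:0010 and 1001:0000.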

2. x86 Processors - 8086

• Examples of x86 memory segmentation

2. x86 Processors - x86-32 and x86-64

• x86-32: 80386, 80486
  - Registers extended to 32 bits
    - EAX, EBX, ECX, EDX
    - ESI, EDI, EBP, ESP, EIP, EFLAGS
  - Two new segment registers (FS and GS) were added
    - FS and GS serve as additional data segment registers
• x86-64: AMD64, Core i5, Core i7
  - An R prefix identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP)
  - Adds eight additional 64-bit general registers (R8-R15)

Some History: IA32 Registers

32-bit   16-bit   8-bit (high, low)   Origin (mostly obsolete)
%eax     %ax      %ah, %al            accumulate
%ecx     %cx      %ch, %cl            counter
%edx     %dx      %dh, %dl            data
%ebx     %bx      %bh, %bl            base
%esi     %si                          source index
%edi     %di                          destination index
%esp     %sp                          stack pointer
%ebp     %bp                          base pointer

• %eax through %edi are general purpose
• The 16-bit names are virtual registers (backwards compatibility)

x86-64 Integer Registers

64-bit   32-bit     64-bit   32-bit
%rax     %eax       %r8      %r8d
%rbx     %ebx       %r9      %r9d
%rcx     %ecx       %r10     %r10d
%rdx     %edx       %r11     %r11d
%rsi     %esi       %r12     %r12d
%rdi     %edi       %r13     %r13d
%rsp     %esp       %r14     %r14d
%rbp     %ebp       %r15     %r15d

• Can reference the low-order 4 bytes (also the low-order 1 and 2 bytes)

3. ARM Processors
• ARM (Acorn RISC Machine) started as a new, powerful CPU design to replace the 8-bit 6502 in Acorn Computers (Cambridge, UK, 1985)
• The first models had only a 26-bit program counter, limiting the memory space to 64 MB (not much by today's standards, but a lot at the time)
• 1990 spin-off: ARM renamed Advanced RISC Machines

3. ARM Processors
• ARM now focuses on embedded CPU cores
• IP licensing: almost every silicon manufacturer sells some microcontroller with an ARM core. Some even compete with their own designs.
• Processing power with low current consumption
  - Good MIPS/Watt figure
  - Ideal for portable devices
• Compact memories: 16-bit opcodes (Thumb)
• New cores with added features
  - Harvard architecture (ARM9, ARM11, Cortex)
  - Floating point arithmetic
  - Vector computing
  - Java language

3. ARM Processors
• 32-bit CPU, Harvard architecture
• 3-operand instructions (typical): ADD Rd,Rn,Operand2
• RISC design:
  - Few, simple instructions
  - Load/store architecture (instructions operate on registers, not memory)
  - Large register set
  - Pipelined execution

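Von Neumann vs. Harvard

• Von Neumann (ARM7s and older): a single AHB bus carries both instructions and data between the core and MEMORY & I/O
• Harvard (ARM9s and newer): separate instruction and data paths, each with its own cache (I cache, D cache), joined by a bus interface to one AHB bus and MEMORY & I/O

Memory-mapped I/O:
• No specific instructions for I/O (use Load/Store instructions instead)
• Peripherals' registers appear at some memory addresses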
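
Because peripheral registers are just memory addresses, C code reaches them through ordinary pointers marked volatile, and the compiler emits plain load/store instructions. The helpers below are a sketch with invented names; on real hardware the pointer would come from the chip's datasheet (an address such as 0x40000000 is purely hypothetical), while for experimenting on a PC any ordinary variable can stand in for the register.

```c
#include <stdint.h>

/* Read a 32-bit peripheral register (compiles to a load). */
uint32_t mmio_read(const volatile uint32_t *reg)
{
    return *reg;
}

/* Write a 32-bit peripheral register (compiles to a store). */
void mmio_write(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Set the bits in mask: read, modify, write back. */
void mmio_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;
}

/* Clear the bits in mask: read, modify, write back. */
void mmio_clear_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg &= ~mask;
}
```

The volatile qualifier tells the compiler that every access matters, so it does not cache the value in a CPU register or drop repeated reads and writes.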

ARM7TDMI Pipeline (3 stages, 1 clock cycle each)

FETCH | DECODE | EXECUTE (Reg. Read, Shift, ALU, Reg. Write)

ARM9TDMI Pipeline (5 stages, 1 clock cycle each)

FETCH | DECODE (Reg. Read) | EXECUTE (Shift, ALU) | MEMORY (access) | WRITE (Reg. Write)

• Fetch: read the opcode from memory into the internal Instruction Register
• Decode: activate the appropriate control lines depending on the opcode
• Execute: do the actual processing

Instruction   Time →
1             FETCH   DECODE  EXECUTE
2                     FETCH   DECODE  EXECUTE
3                             FETCH   DECODE  EXECUTE

• Simple instructions (like ADD) complete at a rate of one per cycle

• More complex instructions:

Instruction   Time →
1 ADD         FETCH   DECODE  EXECUTE
2 STR                 FETCH   DECODE  Cal. ADDR  Data Xfer.
3 ADD                         FETCH   stall      DECODE      EXECUTE
4 ADD                                 FETCH      stall       DECODE   EXECUTE
5 ADD                                                        FETCH    DECODE   EXECUTE

STR: 2 effective clock cycles (+1 cycle)
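
The cycle counts in these diagrams can be checked with a short calculation. The sketch below assumes an idealized in-order pipeline with no branches or cache misses: the first instruction needs one cycle per stage, each later instruction completes one cycle after the previous one, and every extra execute cycle (1 for STR, per the slide) delays everything behind it. The function name and parameters are made up for this note.

```c
#include <stddef.h>

/* Total clock cycles to run `count` instructions on an ideal pipeline
   with `stages` stages (stages >= 1).
   extra_cycles[i] is the number of additional execute cycles that
   instruction i needs (0 for ADD, 1 for STR in the slide's example). */
unsigned long pipeline_cycles(unsigned stages,
                              const unsigned *extra_cycles,
                              size_t count)
{
    if (count == 0) {
        return 0;
    }
    /* (stages - 1) cycles to fill the pipeline, then one per instruction. */
    unsigned long cycles = (unsigned long)(stages - 1) + count;
    for (size_t i = 0; i < count; i++) {
        cycles += extra_cycles[i];  /* each stall pushes the rest back */
    }
    return cycles;
}
```

For the 3-stage ARM7TDMI, three ADDs take 2 + 3 = 5 cycles, and the ADD, STR, ADD, ADD, ADD sequence takes 2 + 5 + 1 = 8 cycles, matching the eight time slots in the diagram.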


Data Sizes and Instruction Sets

• The ARM is a 32-bit architecture.
• When used in relation to the ARM:
  - Byte means 8 bits
  - Halfword means 16 bits (two bytes)
  - Word means 32 bits (four bytes)
• Most ARM cores implement two instruction sets
  - 32-bit ARM Instruction Set
  - 16-bit Thumb Instruction Set

Processor Modes
• The ARM has seven operating modes:
  - User: unprivileged mode under which most tasks run
  - FIQ: entered when a high priority (fast) interrupt is raised
  - IRQ: entered when a low priority (normal) interrupt is raised
  - SVC (Supervisor): entered on reset and when a Software Interrupt instruction is executed
  - Abort: used to handle memory access violations
  - Undef: used to handle undefined instructions
  - System: privileged mode using the same registers as User mode
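
As background that is not on the slide: in the classic ARM architecture the current mode is held in the low five bits of the CPSR. The sketch below decodes that field; the enum and function names are invented for this note, while the bit patterns are the standard mode encodings.

```c
#include <stdint.h>

/* CPSR mode field values, bits [4:0] of the CPSR. */
enum arm_mode {
    ARM_MODE_USER   = 0x10,
    ARM_MODE_FIQ    = 0x11,
    ARM_MODE_IRQ    = 0x12,
    ARM_MODE_SVC    = 0x13,
    ARM_MODE_ABORT  = 0x17,
    ARM_MODE_UNDEF  = 0x1B,
    ARM_MODE_SYSTEM = 0x1F
};

/* Extract the mode field from a CPSR value. */
enum arm_mode arm_cpsr_mode(uint32_t cpsr)
{
    return (enum arm_mode)(cpsr & 0x1Fu);
}

/* Every mode except User is privileged. */
int arm_mode_is_privileged(enum arm_mode mode)
{
    return mode != ARM_MODE_USER;
}

/* User and System share one register bank and have no SPSR. */
int arm_mode_has_spsr(enum arm_mode mode)
{
    return mode != ARM_MODE_USER && mode != ARM_MODE_SYSTEM;
}
```

For instance, a CPSR value whose low byte is 0xD3 has mode bits 0x13, which is Supervisor mode.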


The Registers
• ARM has 37 registers, all of which are 32 bits long.
  - 1 dedicated program counter
  - 1 dedicated current program status register
  - 5 dedicated saved program status registers
  - 30 general purpose registers
• The current processor mode governs which of several banks is accessible. Each mode can access
  - a particular set of r0-r12 registers
  - a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
  - the program counter, r15 (pc)
  - the current program status register, cpsr
• Privileged modes (except System) can also access
  - a particular spsr (saved program status register)
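
The totals of 30 and 37 follow from how many registers each mode owns privately. The table below is a sketch of that bookkeeping, assuming the classic banking scheme (FIQ has its own r8-r14; IRQ, SVC, Abort and Undef each have their own r13-r14); the struct and function names are invented for this note.

```c
#include <stddef.h>

/* Registers owned privately by each bank. */
struct arm_bank {
    const char *mode;
    unsigned private_gp;    /* general purpose registers in this bank */
    unsigned private_spsr;  /* 1 if the mode has its own spsr */
};

static const struct arm_bank ARM_BANKS[] = {
    { "User/System", 15, 0 },  /* r0-r14, the base set */
    { "FIQ",          7, 1 },  /* r8-r14 */
    { "IRQ",          2, 1 },  /* r13-r14 */
    { "SVC",          2, 1 },  /* r13-r14 */
    { "Abort",        2, 1 },  /* r13-r14 */
    { "Undef",        2, 1 },  /* r13-r14 */
};

/* Sum of all general purpose registers across the banks. */
unsigned arm_general_purpose_total(void)
{
    unsigned total = 0;
    for (size_t i = 0; i < sizeof ARM_BANKS / sizeof ARM_BANKS[0]; i++) {
        total += ARM_BANKS[i].private_gp;
    }
    return total;
}

/* General purpose registers + spsr copies + pc + cpsr. */
unsigned arm_register_total(void)
{
    unsigned total = arm_general_purpose_total();
    for (size_t i = 0; i < sizeof ARM_BANKS / sizeof ARM_BANKS[0]; i++) {
        total += ARM_BANKS[i].private_spsr;
    }
    return total + 1 /* pc */ + 1 /* cpsr */;
}
```

That is 15 + 7 + 4 × 2 = 30 general purpose registers, plus 5 spsr copies, the pc and the cpsr, for 37 in total.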


The ARM Register Set

• Shared by all modes: r0-r7, r15 (pc), cpsr
• Banked registers (a mode with its own copy hides the User copy while it is active):

Mode          r8-r12   r13 (sp)   r14 (lr)   spsr
User, SYS     base     base       base       none
FIQ           own      own        own        own
IRQ           base     own        own        own
SVC           base     own        own        own
Undef         base     own        own        own
Abort         base     own        own        own

Special Registers
• Special function registers:
  - PC (R15): Program Counter. Any instruction with PC as its destination register is a program branch
  - LR (R14): Link Register. Saves a copy of PC when executing the BL instruction (subroutine call) or when jumping to an exception or interrupt routine
    - It is copied back to PC on the return from those routines
  - SP (R13): Stack Pointer. The ARM architecture has no dedicated hardware stack. Even so, R13 is usually reserved as a pointer for the program-managed stack
  - CPSR: Current Program Status Register. Holds the visible status register
  - SPSR: Saved Program Status Register. Holds a copy of the previous status register while executing exception or interrupt routines
    - It is copied back to CPSR on the return from the exception or interrupt
    - No SPSR is available in User or System modes

4. Memory
• Memory: the purpose of memory is data storage. There are two major types of memory:
  - Primary memory: holds data and instructions during processing
    - e.g. RAM. Relatively limited capacity and volatile
  - Secondary memory: provides permanent long-term storage
    - e.g. hard disk. High capacity and non-volatile

Figures: RAM banks; hard disk; NAND flash chip

4. Memory
• Primary memory consists of a set of locations defined by sequentially numbered addresses. Each location contains a binary number that can be interpreted as data or an instruction.
• The 8086 uses a 20-bit physical address
  - Manages 1 MB of memory
• The 80386 uses a 32-bit physical address
  - Manages 4 GB of memory
• x86-64 uses 64-bit addresses (current processors implement fewer physical address bits)
  - Manages ??? of memory
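
The sizes above all come from one rule: n address bits can select 2^n byte locations. A minimal C sketch follows (the function name is invented for this note); it reports failure for 64 bits because 2^64 does not fit in a 64-bit integer.

```c
#include <stdint.h>

/* Number of bytes addressable with `address_bits` bits: 2^address_bits.
   Returns 1 and stores the result on success, or 0 if the result
   would not fit in a uint64_t (address_bits >= 64). */
int addressable_bytes(unsigned address_bits, uint64_t *bytes_out)
{
    if (address_bits >= 64) {
        return 0;
    }
    *bytes_out = (uint64_t)1 << address_bits;
    return 1;
}
```

With 20 bits this gives 1,048,576 bytes (1 MB), and with 32 bits it gives 4,294,967,296 bytes (4 GB).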


• Memory locations are called words. A word is 8 bits (one byte) in size, or a multiple of 8 bits. Common word sizes are 16, 32 and 64 bits.

Address   Contents
0         1 0 0 1 0 0 0 1
1         1 1 0 1 0 0 1 1
2         0 1 0 0 0 0 0 0
3         1 0 1 0 0 1 1 1
4         1 1 1 0 1 0 1 0
5         1 1 0 0 1 0 1 0

Memory locations, using an 8-bit word

4. Memory

• Memory is commonly measured in multiples of bits and bytes.
  - 1 bit = 1 binary digit (0 or 1)
  - 1 byte = 8 bits
  - 1 KB = 1024 bytes = 2^10 bytes
  - 1 MB = 1024 KB = 2^20 bytes
  - 1 GB = 1024 MB = 2^30 bytes
  - 1 TB = 1024 GB = 2^40 bytes

Big Endian vs. Little Endian

• x86 processors are little-endian
• IBM z/Architecture mainframes are big-endian processors

Register value: FE ED FA CE

Address              Big Endian (Others)   Little Endian (Intel)
0x5 (high memory)    00                    00
0x4                  00                    00
0x3                  CE                    FE
0x2                  FA                    ED
0x1                  ED                    FA
0x0 (low memory)     FE                    CE
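
The two byte orders can be reproduced in C by writing the bytes of a 32-bit value into an array explicitly, which works the same way on any host. This is a sketch with invented helper names, not a standard library API.

```c
#include <stdint.h>

/* Big-endian: most significant byte at the lowest address. */
void store_be32(uint8_t *mem, uint32_t value)
{
    mem[0] = (uint8_t)(value >> 24);
    mem[1] = (uint8_t)(value >> 16);
    mem[2] = (uint8_t)(value >> 8);
    mem[3] = (uint8_t)value;
}

/* Little-endian: least significant byte at the lowest address. */
void store_le32(uint8_t *mem, uint32_t value)
{
    mem[0] = (uint8_t)value;
    mem[1] = (uint8_t)(value >> 8);
    mem[2] = (uint8_t)(value >> 16);
    mem[3] = (uint8_t)(value >> 24);
}

/* Read four bytes as a big-endian 32-bit value. */
uint32_t load_be32(const uint8_t *mem)
{
    return ((uint32_t)mem[0] << 24) | ((uint32_t)mem[1] << 16) |
           ((uint32_t)mem[2] << 8)  |  (uint32_t)mem[3];
}

/* Read four bytes as a little-endian 32-bit value. */
uint32_t load_le32(const uint8_t *mem)
{
    return ((uint32_t)mem[3] << 24) | ((uint32_t)mem[2] << 16) |
           ((uint32_t)mem[1] << 8)  |  (uint32_t)mem[0];
}
```

Storing 0xFEEDFACE with store_be32 puts FE at offset 0 and CE at offset 3; store_le32 puts CE at offset 0 and FE at offset 3, as in the table.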

5. Computer Software
Assembly/Machine Code View

Figure: the CPU (PC, registers, condition codes) sends addresses to memory (code, data, stack); data and instructions travel between the two.

Programmer-Visible State
• PC: Program counter
  - Address of next instruction
  - Called "RIP" (x86-64)
• Register file
  - Heavily used program data
• Condition codes
  - Store status information about the most recent arithmetic or logical operation
  - Used for conditional branching
• Memory
  - Byte addressable array
  - Code and user data
  - Stack to support procedures

5. Computer Software
Compiling Into Assembly

C Code (sum.c):

    long plus(long x, long y);

    void sumstore(long x, long y, long *dest)
    {
        long t = plus(x, y);
        *dest = t;
    }

Generated x86-64 Assembly:

    sumstore:
        pushq   %rbx
        movq    %rdx, %rbx
        call    plus
        movq    %rax, (%rbx)
        popq    %rbx
        ret

Obtain (on shark machine) with command
    gcc -Og -S sum.c
Produces file sum.s
Warning: Will get very different results on non-Shark machines (Andrew Linux, Mac OS-X, …) due to different versions of gcc and different compiler settings.
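
The slide's sum.c only declares plus, so it cannot be linked on its own. The sketch below adds an assumed definition (simple addition; the original body is not shown in the slides) and annotates each C line with the assembly line it corresponds to, using the System V x86-64 calling convention, in which x, y and dest arrive in %rdi, %rsi and %rdx and the return value comes back in %rax.

```c
/* Assumed definition: the slide only declares plus(). */
long plus(long x, long y)
{
    return x + y;
}

/* x in %rdi, y in %rsi, dest in %rdx on entry. */
void sumstore(long x, long y, long *dest)
{
    /* pushq %rbx; movq %rdx, %rbx : save dest in a callee-saved
       register so it survives the call.
       call plus : x and y are already in %rdi and %rsi. */
    long t = plus(x, y);

    /* movq %rax, (%rbx) : store the return value through dest. */
    *dest = t;

    /* popq %rbx; ret : restore %rbx and return. */
}
```

Calling sumstore(2, 3, &result) with a long variable named result leaves 5 in it.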


Quiz
1) Pick the correct choice for the 8086 CPU.
A 16 bit word size, 8 bit data path
B 8 bit word size, 8 bit data path
C 16 bit word size, 16 bit data path
D 4 bit word size, 8 bit data path
E 8 bit word size, 16 bit data path
2) Pick the correct choice for the 80386SX CPU.
A 16 bit word size, 16 bit data path
B 32 bit word size, 16 bit data path
C 8 bit word size, 32 bit data path
D 32 bit word size, 8 bit data path
E 32 bit word size, 32 bit data path
3) Pick the correct choice for the 80486DX CPU.
A 32 bit word size, 16 bit data path
B 64 bit word size, 32 bit data path
C 32 bit word size, 32 bit data path
D 32 bit word size, 16 bit data path
E 32 bit word size, 64 bit data path

Quiz
4) What is the first CPU to include an internal math coprocessor?
A 386DX
B 486SX
C 486DX
D Pentium
5) What are the two main components of the CPU?
A The Control Unit and ALU
B The Registers and Output/Input management
C The ALU and FPU
6) What are the two main desktop CPU manufacturers?
A Intel and AMD
B Via and Power PC
C Marek and Sun UltraSparc
7) What 32-bit value is read as a double-word at address 0x4000 in big-endian mode?

Address   Content
0x4000    2F
0x4001    65
0x4002    7E
0x4003    AC

A 0xAC7E652F
B 0x2F657EAC
C 0xCAE756F2

Quiz
8) Pick the correct choice for the ARM processor.
A 16 bit word size, 16 bit data path
B 32 bit word size, 16 bit data path
C 8 bit word size, 32 bit data path
D 32 bit word size, 8 bit data path
E 32 bit word size, 32 bit data path
9) Pick the wrong choice for ARM architecture.
A Von Neumann architecture
B Harvard architecture
C 3 stage pipeline architecture
D 32-bit ARM Instruction Set
10) Pick the wrong choice for ARM registers.
A ARM has 37 32-bit registers
B There are 13 general purpose registers
C R13 is Stack Pointer
D R14 is the program counter


Exercises
1. Suppose that you discover that RAM addresses 000C0000 to 000C7FFF are
reserved for a PC’s video adapter. How many bytes of memory is this?
2. Suppose that you have an Intel 8086. Find the five-hex-digit address that
corresponds to each of these segment:offset pairs:
(a) 2B8C:8D21 (b) 059A:7A04 (c) 1234:5678
3. In an 8086 program, suppose that the data segment register DS contains the
segment number 23D1 and that an instruction fetches a word at offset 7B86
in the data segment. What is the five-hex-digit address of the word that is
fetched?
4. In an 8086 program, suppose that the code segment register CS contains the
segment number 014C and that the instruction pointer IP contains 15FE.
What is the five-hex-digit address of the next instruction to be fetched?
5. What are the advantages and disadvantages of secondary memory?
