William Stallings
Computer Organization
and Architecture
10th Edition

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Chapter 3
A Top-Level View of Computer Function and Interconnection
Computer Components
• Contemporary computer designs are based on concepts developed by John von Neumann at the Institute for Advanced Study, Princeton
• This design is referred to as the von Neumann architecture and is based on three key concepts:
    • Data and instructions are stored in a single read-write memory
    • The contents of this memory are addressable by location, without regard to the type of data contained there
    • Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the next
• Hardwired program
    • The result of the process of connecting the various components in the desired configuration



Hardware and Software Approaches

[Figure 3.1 Hardware and Software Approaches. (a) Programming in hardware: data enters a fixed sequence of arithmetic and logic functions, which produces results. (b) Programming in software: an instruction interpreter turns instruction codes into control signals for general-purpose arithmetic and logic functions, which take data and produce results.]



Software
• A sequence of codes or instructions
• Part of the hardware interprets each instruction and generates control signals
• Provide a new sequence of codes for each new program instead of rewiring the hardware

Major components:
• CPU
    • Instruction interpreter
    • Module of general-purpose arithmetic and logic functions
• I/O components
    • Input module
        • Contains basic components for accepting data and instructions and converting them into an internal form of signals usable by the system
    • Output module
        • Means of reporting results



• Memory address register (MAR)
    • Specifies the address in memory for the next read or write
• Memory buffer register (MBR)
    • Contains the data to be written into memory or receives the data read from memory
• I/O address register (I/OAR)
    • Specifies a particular I/O device
• I/O buffer register (I/OBR)
    • Used for the exchange of data between an I/O module and the CPU



[Figure 3.2 Computer Components: Top-Level View. The CPU (PC, IR, MAR, MBR, I/O AR, I/O BR, and execution unit) connects over the system bus to main memory (locations 0 to n–1, holding instructions and data) and to an I/O module that contains buffers.]

PC = Program counter
IR = Instruction register
MAR = Memory address register
MBR = Memory buffer register
I/O AR = Input/output address register
I/O BR = Input/output buffer register



[Figure 3.3 Basic Instruction Cycle: START, then the fetch cycle (fetch next instruction), then the execute cycle (execute instruction), then either back to the fetch cycle or HALT.]



Fetch Cycle
• At the beginning of each instruction cycle the processor fetches an instruction from memory
• The program counter (PC) holds the address of the instruction to be fetched next
• The processor increments the PC after each instruction fetch so that it will fetch the next instruction in sequence
• The fetched instruction is loaded into the instruction register (IR)
• The processor interprets the instruction and performs the required action



Action Categories
• Processor-memory: data transferred from processor to memory or from memory to processor
• Processor-I/O: data transferred to or from a peripheral device by transferring between the processor and an I/O module
• Data processing: the processor may perform some arithmetic or logic operation on data
• Control: an instruction may specify that the sequence of execution be altered



(a) Instruction format (16 bits)
    Bits 0–3: Opcode
    Bits 4–15: Address

(b) Integer format (16 bits)
    Bit 0: S (sign)
    Bits 1–15: Magnitude

(c) Internal CPU registers
    Program Counter (PC) = Address of instruction
    Instruction Register (IR) = Instruction being executed
    Accumulator (AC) = Temporary storage

(d) Partial list of opcodes
    0001 = Load AC from Memory
    0010 = Store AC to Memory
    0101 = Add to AC from Memory

Figure 3.4 Characteristics of a Hypothetical Machine
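The instruction format in Figure 3.4 can be illustrated with a short decoding sketch. It assumes that bit 0 in the figure is the most significant bit, which is consistent with the word 1940 in Figure 3.5 carrying opcode 1. The names `decode` and `OPCODES` are illustrative and do not come from the slides.

```python
# Opcode table copied from Figure 3.4(d); only these three opcodes are listed there.
OPCODES = {
    0b0001: "Load AC from Memory",
    0b0010: "Store AC to Memory",
    0b0101: "Add to AC from Memory",
}


def decode(word):
    """Split a 16-bit instruction word into (opcode, address).

    Assumption: the figure's bit 0 is the most significant bit, so the
    opcode is the top 4 bits and the address is the low 12 bits.
    """
    opcode = (word >> 12) & 0xF
    address = word & 0xFFF
    return opcode, address


opcode, address = decode(0x1940)
print(OPCODES[opcode], "at address", format(address, "03X"))
```

With a 4-bit opcode the format allows at most 16 distinct operations, and a 12-bit address can reach 4096 memory words.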



Program in memory (unchanged throughout):
    300: 1940
    301: 5941
    302: 2941

Step   PC    AC     IR     Memory 940   Memory 941   Note
1      300   (not shown)   1940   0003         0002
2      301   0003   1940   0003         0002
3      301   0003   5941   0003         0002
4      302   0005   5941   0003         0002         3 + 2 = 5
5      302   0005   2941   0003         0002
6      303   0005   2941   0003         0005

Figure 3.5 Example of Program Execution (contents of memory and registers in hexadecimal)
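The trace in Figure 3.5 can be reproduced with a small simulation of the hypothetical machine. This is a minimal sketch, not a model of any real processor: the function name `run_machine` is an assumption, memory is a Python dictionary, and values are treated as plain unsigned 16-bit numbers rather than the sign-magnitude format of Figure 3.4(b).

```python
def run_machine(memory, pc, instruction_count):
    """Run fetch/execute cycles and return the final (pc, ac).

    memory maps addresses to 16-bit words and is modified in place.
    """
    ac = 0
    for _ in range(instruction_count):
        # Fetch cycle: load the instruction into IR, then increment the PC.
        ir = memory[pc]
        pc += 1
        # Execute cycle: interpret the opcode and perform the action.
        opcode, address = ir >> 12, ir & 0xFFF
        if opcode == 0x1:        # Load AC from Memory
            ac = memory[address]
        elif opcode == 0x2:      # Store AC to Memory
            memory[address] = ac
        elif opcode == 0x5:      # Add to AC from Memory
            # Assumption: wrap to 16 bits; the slides do not specify overflow.
            ac = (ac + memory[address]) & 0xFFFF
        else:
            raise ValueError(f"opcode {opcode:04b} is not in the partial list")
    return pc, ac


memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}
pc, ac = run_machine(memory, pc=0x300, instruction_count=3)
print(format(pc, "03X"), format(ac, "04X"), format(memory[0x941], "04X"))
```

Three instructions take three fetch cycles and three execute cycles, which correspond to the six steps of the figure.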



[Figure 3.6 Instruction Cycle State Diagram. States: instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch, data operation, operand address calculation, operand store. Operand fetch repeats for multiple operands and operand store repeats for multiple results. After the last state the cycle either returns for string or vector data, or the instruction is complete and the next instruction is fetched.]



Table 3.1 Classes of Interrupts

• Program: Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, attempt to execute an illegal machine instruction, or reference outside a user's allowed memory space.
• Timer: Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
• I/O: Generated by an I/O controller, to signal normal completion of an operation, request service from the processor, or to signal a variety of error conditions.
• Hardware failure: Generated by a failure such as power failure or memory parity error.



[Figure 3.7 Program Flow of Control Without and With Interrupts. Three panels: (a) no interrupts, (b) interrupts with a short I/O wait, (c) interrupts with a long I/O wait. Each panel shows a user program (code segments 1, 2, and 3 separated by WRITE calls) beside an I/O program (segment 4, the I/O command, and segment 5). In panels (b) and (c) segment 5 belongs to an interrupt handler. In panel (b) segments 2 and 3 are split into 2a/2b and 3a/3b at the points where an interrupt occurs during execution of the user program.]



[Figure 3.8 Transfer of Control via Interrupts: an interrupt occurs after instruction i of the user program; control passes to the interrupt handler and then returns to instruction i + 1.]



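[Figure 3.9 Instruction Cycle with Interrupts: START, then the fetch cycle (fetch next instruction), then the execute cycle (execute instruction, or HALT), then the interrupt cycle. If interrupts are enabled, the processor checks for an interrupt and processes it; if interrupts are disabled, it goes straight to the next fetch.]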
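The cycle in Figure 3.9 can be sketched as a loop with an interrupt check after each execute step. Everything in this sketch is a simplification for illustration: the function name `run_with_interrupts`, the representation of a program as a list of labels ending in "HALT", and the `interrupt_during` set (indices of instructions during which a device raises a request) are all assumptions.

```python
def run_with_interrupts(program, interrupt_during, interrupts_enabled=True):
    """Return a trace of the stages visited while running the program.

    program must end with "HALT"; interrupt_during holds the indices of
    instructions whose execution coincides with an interrupt request.
    """
    trace = []
    pending = False
    pc = 0
    while True:
        # Fetch cycle
        instruction = program[pc]
        pc += 1
        if instruction == "HALT":
            trace.append("HALT")
            return trace
        # Execute cycle
        trace.append("execute " + instruction)
        if pc - 1 in interrupt_during:
            pending = True  # a request arrived while this instruction ran
        # Interrupt cycle: only checked when interrupts are enabled
        if interrupts_enabled and pending:
            pending = False
            trace.append("process interrupt")


for stage in run_with_interrupts(["i0", "i1", "i2", "HALT"], {1}):
    print(stage)
```

The handler runs between instructions, never in the middle of one, which is why the user program can resume at the next instruction without knowing it was interrupted.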



[Figure 3.10 Program Timing: Short I/O Wait. Time runs downward. (a) Without interrupts: 1, 4, I/O operation (processor waits), 5, 2, 4, I/O operation (processor waits), 5, 3. (b) With interrupts: 1, 4, 2a (I/O operation concurrent with processor executing), 5, 2b, 4, 3a (I/O operation concurrent with processor executing), 5, 3b.]



[Figure 3.11 Program Timing: Long I/O Wait. Time runs downward. (a) Without interrupts: 1, 4, I/O operation (processor waits), 5, 2, 4, I/O operation (processor waits), 5, 3. (b) With interrupts: 1, 4, 2 (I/O operation concurrent with processor executing; then processor waits), 5, 4, 3 (I/O operation concurrent with processor executing; then processor waits), 5.]



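[Figure 3.12 Instruction Cycle State Diagram, With Interrupts. States: instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch, data operation, operand address calculation, operand store, interrupt check, interrupt. Operand fetch repeats for multiple operands and operand store repeats for multiple results. After operand store the cycle either returns for string or vector data, or the instruction is complete and the interrupt check follows. With no interrupt the next instruction is fetched; otherwise the interrupt state is entered first.]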



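[Figure 3.13 Transfer of Control with Multiple Interrupts. (a) Sequential interrupt processing: the user program is interrupted by interrupt handler X, and interrupt handler Y runs only after X has finished. (b) Nested interrupt processing: interrupt handler Y interrupts interrupt handler X, and control returns to X before it returns to the user program.]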



[Figure 3.14 Example Time Sequence of Multiple Interrupts: starting at t = 0, control moves among the user program, the printer interrupt service routine, the communication interrupt service routine, and the disk interrupt service routine.]



I/O Function
• I/O module can exchange data directly with the processor
• Processor can read data from or write data to an I/O module
    • Processor identifies a specific device that is controlled by a particular I/O module
    • The instruction sequence uses I/O instructions rather than memory referencing instructions
• In some cases it is desirable to allow I/O exchanges to occur directly with memory
    • The processor grants to an I/O module the authority to read from or write to memory so that the I/O memory transfer can occur without tying up the processor
    • The I/O module issues read or write commands to memory, relieving the processor of responsibility for the exchange
    • This operation is known as direct memory access (DMA)



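[Figure 3.15 Computer Modules.
Memory (N words, addresses 0 to N–1): inputs are Read, Write, Address, and Data; output is Data.
I/O module (M ports): inputs are Read, Write, Address, Internal Data, and External Data; outputs are Internal Data, External Data, and Interrupt Signals.
CPU: inputs are Instructions, Data, and Interrupt Signals; outputs are Address, Control Signals, and Data.]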



The interconnection structure must support the following types of transfers:
• Memory to processor: processor reads an instruction or a unit of data from memory
• Processor to memory: processor writes a unit of data to memory
• I/O to processor: processor reads data from an I/O device via an I/O module
• Processor to I/O: processor sends data to the I/O device
• I/O to or from memory: an I/O module is allowed to exchange data directly with memory without going through the processor using direct memory access



Bus Interconnection
• A communication pathway connecting two or more devices
    • Key characteristic is that it is a shared transmission medium
• Signals transmitted by any one device are available for reception by all other devices attached to the bus
    • If two devices transmit during the same time period their signals will overlap and become garbled
• Typically consists of multiple communication lines
    • Each line is capable of transmitting signals representing binary 1 and binary 0
• Computer systems contain a number of different buses that provide pathways between components at various levels of the computer system hierarchy
• System bus
    • A bus that connects major computer components (processor, memory, I/O)
• The most common computer interconnection structures are based on the use of one or more system buses



Data Bus
• Data lines that provide a path for moving data among system modules
• May consist of 32, 64, 128, or more separate lines
• The number of lines is referred to as the width of the data bus
• The number of lines determines how many bits can be transferred at a time
• The width of the data bus is a key factor in determining overall system performance



Address Bus
• Used to designate the source or destination of the data on the data bus
    • If the processor wishes to read a word of data from memory it puts the address of the desired word on the address lines
• Width determines the maximum possible memory capacity of the system
• Also used to address I/O ports
    • The higher order bits are used to select a particular module on the bus and the lower order bits select a memory location or I/O port within the module

Control Bus
• Used to control the access and the use of the data and address lines
• Because the data and address lines are shared by all components there must be a means of controlling their use
• Control signals transmit both command and timing information among system modules
• Timing signals indicate the validity of data and address information
• Command signals specify operations to be performed
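The effect of bus width can be put into numbers with a few helper functions. This is a sketch under stated assumptions: the function names are illustrative, and the 8-bit address with a single module-select bit is a made-up example rather than a layout given in the slides.

```python
import math


def addressable_locations(address_lines):
    """Number of distinct addresses an address bus of this width can express."""
    return 2 ** address_lines


def transfers_needed(item_bits, data_lines):
    """Bus transfers needed to move one item across a data bus of this width."""
    return math.ceil(item_bits / data_lines)


def split_address(address, address_lines, module_bits):
    """Split an address into (module number, location within the module).

    module_bits is a hypothetical parameter: how many high-order bits
    select the module. The remaining low-order bits select the location.
    """
    offset_bits = address_lines - module_bits
    return address >> offset_bits, address & ((1 << offset_bits) - 1)


print(addressable_locations(16))        # 65536 locations for 16 address lines
print(transfers_needed(64, 32))         # 2 transfers for a 64-bit item on 32 lines
print(split_address(0b10000011, 8, 1))  # (1, 3): module 1, location 3
```

Doubling the data bus width halves the number of transfers for a large item, while each extra address line doubles the number of locations that can be addressed.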
[Figure 3.16 Bus Interconnection Scheme: the CPU, memory modules, and I/O modules all attach to a shared bus made up of control lines, address lines, and data lines.]



Point-to-Point Interconnect
• Principal reason for the change from shared buses to point-to-point interconnection was the electrical constraints encountered with increasing the frequency of wide synchronous buses
• At higher and higher data rates it becomes increasingly difficult to perform the synchronization and arbitration functions in a timely fashion
• A conventional shared bus on the same chip magnified the difficulties of increasing bus data rate and reducing bus latency to keep up with the processors
• Compared with a shared bus, a point-to-point interconnect has lower latency, higher data rate, and better scalability



QuickPath Interconnect (QPI)
• Introduced in 2008
• Multiple direct connections
    • Direct pairwise connections to other components, eliminating the need for arbitration found in shared transmission systems
• Layered protocol architecture
    • These processor level interconnects use a layered protocol architecture rather than the simple use of control signals found in shared bus arrangements
• Packetized data transfer
    • Data are sent as a sequence of packets, each of which includes control headers and error control codes



[Figure 3.17 Multicore Configuration Using QPI: four cores (A, B, C, D), each attached to DRAM over a memory bus; QPI links connect the cores to one another and to two I/O hubs, and each I/O hub connects to I/O devices over PCI Express.]



[Figure 3.18 QPI Layers: two peer stacks of Protocol, Routing, Link, and Physical layers. The protocol layers exchange packets, the link layers exchange flits, and the physical layers exchange phits.]



[Figure 3.19 Physical Interface of the Intel QPI Interconnect: the Intel QuickPath Interconnect ports of Component A and Component B are joined in both directions. In each direction, one port's transmission lanes and forwarded clock (Fwd Clk) connect to the other port's reception lanes and received clock (Rcv Clk).]



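[Figure 3.20 QPI Multilane Distribution: the bit stream of flits (#1, #2, ..., #n, #n+1, #n+2, ..., #2n, #2n+1, ...) is spread across QPI lanes 0 to 19. Lane 0 carries bits #1, #n+1, #2n+1; lane 1 carries bits #2, #n+2, #2n+2; lane 19 carries bits #n, #2n, #3n.]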
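The lane assignment in Figure 3.20 amounts to dealing bits out to the lanes in turn. The sketch below assumes the 20 lanes shown in the figure (lane 0 to lane 19) and uses bit numbers as stand-ins for bit values; the function name `distribute_bits` is illustrative.

```python
def distribute_bits(bits, lanes=20):
    """Deal a sequence of bits out to the lanes in turn.

    Returns one list per lane; lane k receives items k, k + lanes,
    k + 2 * lanes, and so on (counting from zero).
    """
    return [list(bits[k::lanes]) for k in range(lanes)]


# Label the bits #1..#80 so the output can be compared with the figure.
stream = list(range(1, 81))
per_lane = distribute_bits(stream)
print(per_lane[0])   # [1, 21, 41, 61]   -> #1, #n+1, #2n+1, ... with n = 20
print(per_lane[19])  # [20, 40, 60, 80]  -> #n, #2n, #3n, ...
```

With 20 lanes, an 80-bit unit occupies four bit times on every lane.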



QPI Link Layer
• Performs two key functions: flow control and error control
    • Operate on the level of the flit (flow control unit)
    • Each flit consists of a 72-bit message payload and an 8-bit error control code called a cyclic redundancy check (CRC)
• Flow control function
    • Needed to ensure that a sending QPI entity does not overwhelm a receiving QPI entity by sending data faster than the receiver can process the data and clear buffers for more incoming data
• Error control function
    • Detects and recovers from bit errors, and so isolates higher layers from experiencing bit errors
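The flit layout (72-bit payload plus 8-bit CRC) can be sketched as 9 payload bytes followed by 1 check byte. The slides do not give the CRC generator that QPI uses, so the polynomial below (x^8 + x^2 + x + 1, written 0x07) is an assumption chosen only to make the example concrete, and the function names are illustrative.

```python
CRC8_POLY = 0x07  # ASSUMPTION: illustrative generator, not taken from the slides


def crc8(data):
    """Bitwise CRC-8 over a bytes object, most significant bit first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ CRC8_POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_flit(payload):
    """Build an 80-bit flit: 72-bit (9-byte) payload followed by its CRC byte."""
    if len(payload) != 9:
        raise ValueError("payload must be exactly 72 bits (9 bytes)")
    return payload + bytes([crc8(payload)])


def flit_is_valid(flit):
    """Receiver-side check: recompute the CRC and compare with the last byte."""
    return len(flit) == 10 and crc8(flit[:9]) == flit[9]


flit = make_flit(bytes(range(9)))
corrupted = bytes([flit[0] ^ 0x01]) + flit[1:]  # flip one payload bit
print(flit_is_valid(flit), flit_is_valid(corrupted))
```

Detection is only half of the error control function described above; recovery would also need a retransmission mechanism, which this sketch leaves out.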



QPI Routing and Protocol Layers

Routing Layer
• Used to determine the course that a packet will traverse across the available system interconnects
• Routing tables are defined by firmware and describe the possible paths that a packet can follow

Protocol Layer
• Packet is defined as the unit of transfer
• One key function performed at this level is a cache coherency protocol, which deals with making sure that main memory values held in multiple caches are consistent
• A typical data packet payload is a block of data being sent to or from a cache



Peripheral Component Interconnect (PCI)
• A popular high bandwidth, processor independent bus that can function as a mezzanine or peripheral bus
• Delivers better system performance for high speed I/O subsystems
• PCI Special Interest Group (SIG)
    • Created to develop further and maintain the compatibility of the PCI specifications
• PCI Express (PCIe)
    • Point-to-point interconnect scheme intended to replace bus-based schemes such as PCI
    • Key requirement is high capacity to support the needs of higher data rate I/O devices, such as Gigabit Ethernet
    • Another requirement deals with the need to support time dependent data streams



[Figure 3.21 Typical Configuration Using PCIe: two cores and memory attach to a chipset. PCIe links run from the chipset to Gigabit Ethernet, to a PCIe–PCI bridge, and to a PCIe switch, which fans out over further PCIe links to a legacy endpoint and PCIe endpoints.]



[Figure 3.22 PCIe Protocol Layers: two peer stacks of Transaction, Data Link, and Physical layers. The transaction layers exchange transaction layer packets (TLP) and the data link layers exchange data link layer packets (DLLP).]



[Figure 3.23 PCIe Multilane Distribution: the byte stream B0 to B7 is spread across four PCIe lanes, each with its own 128b/130b encoder. Lane 0 carries B0 and B4, lane 1 carries B1 and B5, lane 2 carries B2 and B6, and lane 3 carries B3 and B7.]
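The byte distribution in Figure 3.23 and its reversal at the receiver can be sketched as follows. The four-lane width matches the figure; the function names are illustrative, and the efficiency figure simply restates the 128b/130b ratio (128 data bits carried in every 130 transmitted bits).

```python
def stripe_bytes(data, lanes=4):
    """Sender side: byte k of the stream goes to lane k mod lanes."""
    return [data[k::lanes] for k in range(lanes)]


def merge_lanes(per_lane):
    """Receiver side: interleave the lanes back into the original byte stream."""
    lanes = len(per_lane)
    merged = bytearray(sum(len(lane) for lane in per_lane))
    for k, lane in enumerate(per_lane):
        merged[k::lanes] = lane
    return bytes(merged)


stream = bytes(range(8))           # stands in for B0..B7
per_lane = stripe_bytes(stream)
print([list(lane) for lane in per_lane])   # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(merge_lanes(per_lane) == stream)     # True

encoding_efficiency = 128 / 130    # fraction of line bits that carry data
print(round(encoding_efficiency, 4))       # 0.9846
```

Striping lets the link scale by adding lanes, at the cost of the receiver having to realign and merge the per-lane streams.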



[Figure 3.24 PCIe Transmit and Receive Block Diagrams.
(a) Transmitter: 8b input, scrambler, 128b/130b encoding, parallel to serial conversion, transmitter differential driver, D+ and D– outputs.
(b) Receiver: D+ and D– inputs, differential receiver, clock recovery circuit and data recovery circuit, serial to parallel conversion, 128b/130b decoding, descrambler, 8b output.]



PCIe Transaction Layer (TL)
• Receives read and write requests from the software above the TL and creates request packets for transmission to a destination via the link layer
• Most transactions use a split transaction technique
    • A request packet is sent out by a source PCIe device which then waits for a response called a completion packet
• TL messages and some write transactions are posted transactions (meaning that no response is expected)
• TL packet format supports 32-bit memory addressing and extended 64-bit memory addressing



The TL supports four address spaces:
• Memory
    • The memory space includes system main memory and PCIe I/O devices
    • Certain ranges of memory addresses map into I/O devices
• I/O
    • This address space is used for legacy PCI devices, with reserved address ranges used to address legacy I/O devices
• Configuration
    • This address space enables the TL to read/write configuration registers associated with I/O devices
• Message
    • This address space is for control signals related to interrupts, error handling, and power management



Table 3.2 PCIe TLP Transaction Types

• Address space: Memory
    • TLP types: Memory Read Request; Memory Read Lock Request; Memory Write Request
    • Purpose: Transfer data to or from a location in the system memory map.
• Address space: I/O
    • TLP types: I/O Read Request; I/O Write Request
    • Purpose: Transfer data to or from a location in the system memory map for legacy devices.
• Address space: Configuration
    • TLP types: Config Type 0 Read Request; Config Type 0 Write Request; Config Type 1 Read Request; Config Type 1 Write Request
    • Purpose: Transfer data to or from a location in the configuration space of a PCIe device.
• Address space: Message
    • TLP types: Message Request; Message Request with Data
    • Purpose: Provides in-band messaging and event reporting.
• Address space: Memory, I/O, Configuration
    • TLP types: Completion; Completion with Data; Completion Locked; Completion Locked with Data
    • Purpose: Returned for certain requests.
(a) Transaction Layer Packet

Number of octets   Field             Origin
1                  STP framing       Appended by Physical Layer
2                  Sequence number   Appended by Data Link Layer
12 or 16           Header            Created by Transaction Layer
0 to 4096          Data              Created by Transaction Layer
0 or 4             ECRC              Created by Transaction Layer
4                  LCRC              Appended by Data Link Layer
1                  STP framing       Appended by Physical Layer

(b) Data Link Layer Packet

Number of octets   Field             Origin
1                  Start             Appended by PL
4                  DLLP              Created by DLL
2                  CRC               Created by DLL
1                  End               Appended by PL

Figure 3.25 PCIe Protocol Data Unit Format
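The field sizes in Figure 3.25(a) determine how many octets a transaction layer packet occupies on the link. The sketch below only adds up the sizes listed in the figure; the function name `tlp_octets` and its parameters are illustrative.

```python
def tlp_octets(data_octets, header_octets=12, with_ecrc=False):
    """Total octets on the link for one TLP, using the sizes in Figure 3.25(a)."""
    if header_octets not in (12, 16):
        raise ValueError("header is 12 or 16 octets")
    if not 0 <= data_octets <= 4096:
        raise ValueError("data field is 0 to 4096 octets")
    framing_start = 1          # STP framing, physical layer
    sequence_number = 2        # data link layer
    ecrc = 4 if with_ecrc else 0
    lcrc = 4                   # data link layer
    framing_end = 1            # STP framing, physical layer
    return (framing_start + sequence_number + header_octets
            + data_octets + ecrc + lcrc + framing_end)


print(tlp_octets(0))                                        # 20
print(tlp_octets(4096, header_octets=16, with_ecrc=True))   # 4124
print(round(4096 / tlp_octets(4096, 16, True), 4))          # 0.9932
```

The fixed overhead of 20 to 28 octets matters most for small transfers; for a full 4096-octet data field it is under 1 percent of the packet.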



Summary
Chapter 3: A Top-Level View of Computer Function and Interconnection
• Computer components
• Computer function
    • Instruction fetch and execute
    • Interrupts
    • I/O function
• Interconnection structures
• Bus interconnection
• Point-to-point interconnect
    • QPI physical layer
    • QPI link layer
    • QPI routing layer
    • QPI protocol layer
• PCI express
    • PCI physical and logical architecture
    • PCIe physical layer
    • PCIe transaction layer
    • PCIe data link layer
