You are on page 1of 47

+

William Stallings
Computer Organization
and Architecture
10th Edition
© 2016 Pearson Education, Inc., Hoboken,
NJ. All rights reserved.
+ Chapter 3
A Top-Level View of
Computer Function and
Interconnection
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
+
Computer Components
 Contemporary computer designs are based on concepts
developed by John von Neumann at the Institute for
Advanced Studies, Princeton

 Referred to as the von Neumann architecture and is


based on
three key concepts:
 Data and instructions are stored in a single read-write
memory
 The contents of this memory are addressable by location, without
regard to the type of data contained there
 Execution occurs in a sequential fashion (unless explicitly
modified) from one instruction to the next

 Hardwired program
 The result of the process of connecting the various components
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
ni
+ Data
Sequence of
arithmetic
and logic
functions
Results

(a) Programming in hardware

Hardware
and Instruction Instruction

Software
codes
interpreter

Approaches
Control
signals

General-purpose
Data arithmetic Results
and logic
functions

(b) Programming in software

Figure 3.1 Hardware and Software Approaches

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Software
• A sequence of codes or instructions
• Part of the hardware interprets each instruction Software
and
generates control signals
• Provide a new sequence of codes for each
new program instead of rewiring the
hardware
Major components: I/O
• CPU Components
• Instruction interpreter
• Module of general-purpose arithmetic and
• logic functions
Input module
+ • I/O Components
• Contains basic components for accepting
data and instructions and converting them
into an internal form of signals usable by the
system
• Output module
• Means of reporting results
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Memory Memory buffer
address register (MBR) M EM O RY
register (MAR) • Contains the data
• Specifies the to be written into
address in memory or
memory for the receives the data
next read or write read from memory

MA R
I/O address I/O buffer
register (I/OAR) register (I/OBR)
• Specifies a • Used for the
particular I/O exchange of
+ device data between
an I/O module
and the CPU
MBR

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


CPU Main Memory
0
System 1

PC MAR Bus 2
Instruction
Instruction
Instruction
IR MBR

I/O AR
Data
Execution Data
unit I/O BR Data
Data

I/O Module n–2


n–1

PC = Program counter
Buffers IR = Instruction register
MAR = Memory address register
MBR = Memory buffer register
I/O AR = Input/output address register
I/O BR = Input/output buffer register

Figure 3.2 Computer Components: Top-Level View


© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Fetch Cycle Execute Cycle

Fetch Next Execute


START HALT
Instruction
Instruction

Figure 3.3 Basic Instruction Cycle

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
Fetch Cycle
 At the beginning of each instruction cycle the processor
fetches an instruction from memory

 The program counter (PC) holds the address of the


instruction to be fetched next

 The processor increments the PC after each instruction


fetch so that it will fetch the next instruction in
sequence

 The fetched instruction is loaded into the instruction


register (IR)

 The processor interprets the instruction and performs the


required action

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Action Categories
• Data transferred • Data transferred to
from processor to or from a peripheral
memory or from device by
memory to transferring between
processor the processor and
an I/O module

Processor- Processor-
memory I/O

Data
Control
processing

• An instruction may • The processor


specify that the may perform
sequence of some arithmetic
execution be or logic operation
altered on data

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Memory CPU Registers Memory CPU Registers
300 1 9 4 0 3 0 0 PC 300 1 9 4 0 3 0 1 PC
301 5 9 4 1 AC 301 5 9 4 1 0 0 0 3 AC
302 2 9 4 1 1 9 4 0 IR 302 2 9 4 1 1 9 4 0 IR
• •
• •
940 0 0 0 3 940 0 0 0 3
941 0 0 0 2 941 0 0 0 2
Step 1 Step 2
Memory CPU Registers Memory CPU Registers
300 1 9 4 0 3 0 1 PC 300 1 9 4 0 3 0 2 PC
301 5 9 4 1 0 0 0 3 AC 301 5 9 4 1 0 0 0 5 AC
302 2 9 4 1 5 9 4 1 IR 302 2 9 4 1 5 9 4 1 IR
• •
• •
940 0 0 0 3 940 0 0 0 3 3+2=5
941 0 0 0 2 941 0 0 0 2
Step 3 Step 4
Memory CPU Registers Memory CPU Registers
300 1 9 4 0 3 0 2 PC 300 1 9 4 0 3 0 3 PC
301 5 9 4 1 0 0 0 5 AC 301 5 9 4 1 0 0 0 5 AC
302 2 9 4 1 2 9 4 1 IR 302 2 9 4 1 2 9 4 1 IR
• •
• •
940 0 0 0 3 940 0 0 0 3
941 0 0 0 2 941 0 0 0 5
Step 5 Step 6

Figure 3.5 Example of Program Execution


(contents of memory and registers in hexadecimal)
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Instruction Operand Operand
fetch fetch store

Multiple Multiple
results
operands

Instruction Instruction Operand Operand


Data
address operation address address
Operation
calculation decoding calculation calculation

Return for string


Instruction complete, or vector data
fetch next
instruction

Figure 3.6 Instruction Cycle State Diagram


© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Program Generated by some condition that occurs as a result of an
instruction execution, such as arithmetic overflow, division by
zero, attempt to execute an illegal machine instruction, or reference
outside a user's allowed memory space.
Timer Generated by a timer within the processor. This allows the operating
system to perform certain functions on a regular basis.
I/O Generated by an I/O controller, to signal normal completion of an
operation, request service from the processor, or to signal a variety
of error conditions.
Hardware failure Generated by a failure such as power failure or memory parity error.

Table 3.1

Classes of
Interrupts
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
User I/O User I/O User I/O
Program Program Program Program Program Program

1 4 1 4 1 4

I/O I/O I/O


Command Command Command
WRITE WRITE WRITE

5
2a
END
2 2

Interrupt Interrupt
2b
Handler Handler
WRITE WRITE 5 WRITE 5

END END
3a

3 3

3b

WRITE WRITE WRITE

(a) No interrupts (b) Interrupts; short I/O wait (c) Interrupts; long I/O wait

= interrupt occurs during course of execution of user


program

Figure 3.7 Program Flow of Control Without and With Interrupts

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


User Program Interrupt Handler

i
Interrupt
occurs here i+1

Figure 3.8 Transfer of Control via Interrupts

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Fetch Cycle Execute Cycle Interrupt Cycle

Interrupts
Disabled
Check for
Fetch Next Execute
START Instruction Interrupts Process
Interrupt;
Interrupt
Instruction Enabled

HALT

Figure 3.9 Instruction Cycle with Interrupts

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Time
1 1

4 4
I/O operation
I/O operation;
2a concurrent with
processor waits
processor executing

5 5

2b
2
4
I/O operation
4 3a concurrent with
processor executing
I/O operation;
processor waits 5

5 3b

(b) With interrupts


3

(a) Without interrupts

Figure 3.10 Program Timing: Short I/O Wait

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Time
1 1

4 4

I/O operation; 2 I/O operation


processor waits concurrent with
processor executing;
then processor
5 waits

5
2
4

4
3 I/O operation
concurrent with
I/O operation; processor executing;
processor waits then processor
waits

5
5

3 (b) With interrupts

(a) Without interrupts

Figure 3.11 Program Timing: Long I/O Wait

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Instruction Operand Operand
fetch fetch store

Multiple Multiple
operands results

Instruction Instruction Operand Operand


Data Interrupt
address operation address address Interrupt
Operation check
calculation decoding calculation calculation

No
Instruction complete, Return for string interrupt
fetch next or vector data
instruction

Figure 3.12 Instruction Cycle State Diagram, With Interrupts

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Interrupt
User program handler
X

Interrupt
handler
Y

(a) Sequential interrupt processing

Interrupt
User program handler
X

Interrupt
handler
Y

(b) Nested interrupt processing

Figure 3.13 Transfer of Control with Multiple Interrupts

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Printer Communication
User program
interrupt service routine interrupt service routine
t=0

15
0 t=
t =1

t = 25

t= t = 25 Disk
40
interrupt service routine

t=
35

Figure 3.14 Example Time Sequence of Multiple Interrupts

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
I/O
Function
 I/O module can exchange data directly with the processor

 Processor can read data from or write data to an I/O module


 Processor identifies a specific device that is controlled by a
particular I/O module
 I/O instructions rather than memory referencing instructions

 In some cases it is desirable to allow I/O exchanges to occur


directly with memory
 The processor grants to an I/O module the authority to read from
or write to memory so that the I/O memory transfer can occur
without tying up the processor
 The I/O module issues read or write commands to memory
relieving the processor of responsibility for the exchange
 This operation is known as direct memory access (DMA)

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Read Memory
Write
N Words
Address 0 Data

Data N–1

Read I/O Module Internal


Write Data

External
Address M Ports Data

Internal
Data Interrupt
Signals
External
Data

Instructions Address

Control
Data CPU Signals

Interrupt Data
Signals

Figure 3.15 Computer Modules

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


The interconnection structure must support
the
following types of transfers:
Memory Processor I/O to
I/O to Processor
to to or from
processor to I/O
processor memory
memory
An I / O
module is
allowed to
exchange
data
Processor Processor
directly
reads an Processor reads data Processor with
instruction writes a from an sends
memory
or a unit unit of I/O data to without
of data data to device via the I / O going
from memory an I / O device
through the
memory module
processor
using direct
memory
access

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


A communication Signals transmitted by any
pathway connecting two one device are available
or more devices for reception by all other
• Key characteristic is that it is devices attached to the bus
a shared transmission
medium
• If two devices transmit during the
I
n
same time period their signals
will overlap and become garbled

n
e
Typically consists of
Computer systems contain t
B c
multiple communication
lines a number of different
buses that provide
• Each line is capable of
pathways between
transmitting signals representing

u t
binary 1 and binary 0 components at various
levels of the computer
system hierarchy
e
System bus
• A bus that connects major
The most common s oi
ro
computer components (processor,
memory, I/O) computer interconnection
structures are based on the
use of one or more system
buses n
n
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Data Bus
 Data lines that provide a path for moving data among system
modules

 May consist of 32, 64, 128, or more separate lines

 The number of lines is referred to as the width of the data


bus

 The number of lines determines how many bits can be


transferred at a time

 The width of the data bus


is a key factor in
determining overall
system performance

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+ Address Bus Control Bus

 Used to designate the source or


destination of the data on the  Used to control the access and the
data bus use of the data and address lines
 If the processor wishes to
read a word of data from
 Because the data and address lines
memory it puts the address are shared by all components there
of the desired word on the must be a means of controlling
address lines their use

 Width determines the maximum


 Control signals transmit both
possible memory capacity of command and timing
the system information among system
modules
 Also used to address I/O ports  Timing signals indicate the validity
 The higher order bits are of data and address information
used to select a particular
module on the bus and the  Command signals specify operations
lower order bits select a to be performed
memory location or I/O
port within the module
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
CPU Memory Memory I/O I/O

Control lines

Address lines Bus

Data lines

Figure 3.16 Bus Interconnection Scheme

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
Point-to-Point
Interconnect
Principal reason for At higher and higher data
change was the electrical rates it becomes
constraints encountered increasingly difficult to
with increasing the perform the synchronization
frequency of wide and arbitration functions in
synchronous buses a timely fashion

A conventional shared bus


on the same chip
Has lower latency,
magnified the difficulties
higher data rate, and
of increasing bus data rate
better scalability
and reducing bus latency
to keep up with the
processors

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+ Quick Path Interconnect QPI
 Introduced in 2008

 Multiple direct connections


 Direct pairwise connections to other components
eliminating the need for arbitration found in
shared transmission systems

 Layered protocol architecture


 These processor level interconnects use a layered
protocol architecture rather than the simple use
of control signals found in shared bus
arrangements

 Packetized data transfer


 Data are sent as a sequence of packets each of which
includes control headers and error control codes
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
I/O device

I/O device
I/O Hub

DRAM

DRAM
Core Core
A B

DRAM

DRAM
Core Core
C D
I/O device

I/O device
I/O Hub

QPI PCI Express Memory bus

Figure 3.17 Multicore Configuration Using QPI

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Packets
Protocol Protocol

Flits
Routing Routing

Link Phits Link

Physical Physical

Figure 3.18 QPI Layers

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


COMPONENT A
Intel QuickPath Interconnect Port
Fwd Clk

Rcv Clk
Transmission Lanes Reception Lanes

Fwd Clk
Rcv Clk

Reception Lanes Transmission Lanes

Intel QuickPath Interconnect Port


COMPONENT B

Figure 3.19 Physical Interface of the Intel QPI


Interconnect
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
#2n+1 #n+1 #1 QPI
lane 0

bit stream of flits #2n+2 #n+2 #2 QPI


lane 1

#2n+1 #2n #n+2 #n+1 #n #2 #1

#3n #2n #n QPI


lane 19

Figure 3.20 QPI Multilane Distribution

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
QPI Link
Layer ◼ Flow control function
 Performs two key ◼ Needed to ensure that a
functions: flow control and sending QPI entity does not
error control overwhelm a receiving QPI
entity by sending data faster
 Operate on the level of than the receiver can
the flit (flow control process the data and clear
unit) buffers for more incoming
data
 Each flit consists of a 72-
bit message payload
and an 8-bit error ◼ Error control function
control code called a
cyclic redundancy check ◼ Detects and recovers from
(CRC) bit errors, and so isolates
higher layers from
experiencing bit errors

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
QPI Routing and Protocol
Layers
Routing Layer Protocol Layer
 Packet is defined as the unit of
 Used to determine the course transfer
that a packet will traverse
across the available system  One key function performed at
interconnects this level is a cache coherency
protocol which deals with
 Defined by firmware and making sure that main
describe the possible memory values held in
paths that a packet can multiple caches are consistent
follow
 A typical data packet payload
is a block of data being sent
to or from a cache

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
Peripheral
Component
A popular high bandwidth, processor independent bus that can
Interconnect
function as a mezzanine (PCI)

or peripheral bus

 Delivers better system performance for high speed I/O


subsystems
 PCI Special Interest Group (SIG)
 Created to develop further and maintain the compatibility of the PCI
specifications

 PCI Express (PCIe)


 Point-to-point interconnect scheme intended to replace bus-based
schemes such as PCI
 Key requirement is high capacity to support the needs of higher
data rate
I/O devices, such as Gigabit Ethernet
 Another requirement deals with the need to support time dependent data
streams
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Core Core

Gigabit PCIe
Memory
Ethernet
Chipset
PCIe–PCI PCIe
Memory
Bridge

PCIe

PCIe PCIe
Switch

PCIe PCIe

Legacy PCIe PCIe PCIe


endpoint endpoint endpoint endpoint

Figure 3.21 Typical Configuration Using PCIe

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
128b/ PCIe
B4 B0
130b lane 0

byte stream
128b/ PCIe
B5 B1
130b lane 1
B7 B6 B5 B4 B3 B2 B1 B0

128b/ PCIe
B6 B2
130b lane 2

128b/ PCIe
B7 B3
130b lane 3

Figure 3.23 PCIe Multilane Distribution

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


D+ D–
8b

Differential
Scrambler Receiver

8b 1b Clock recovery
circuit

Data recovery
128b/130b Encoding circuit

130b 1b

Parallel to serial Serial to parallel

1b 130b

Transmitter Differential
128b/130b Decoding
Driver

128b

D+ D–
Descrambler
(a) Transmitter
8b

(b)
Receiver

Figure 3.24 PCIe Transmit and Receive


Block
© 2016 Pearson Education, Inc., Diagrams
Hoboken, NJ. All rights reserved.
+ ◼ Receives read and write requests from
the software above the TL and creates
request packets for transmission to a
destination via the link layer

PCIe  Most transactions use a split transaction


technique
Transaction Layer (TL) ◼ A request packet is sent out by a
source PCIe device which then
waits for a response called a
completion packet
 TL messages and some write
transactions are posted
transactions (meaning that no
response is expected)

◼ TL packet format supports 32-bit


memory addressing and
extended 64-bit memory
addressing

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+
The TL supports four
address spaces:
 Memory  I/O
 The memory space includes  This address space is used
system main memory and
for legacy PCI devices,
PCIe I/O devices
with reserved address
 Certain ranges of memory
addresses map into I/O
ranges used to address
devices legacy I/O devices

 Message
 Configuration
 This address space is for
 This address space enables control signals related to
the TL to read/write interrupts, error
configuration registers handling, and power
associated with I/O management
devices

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Table 3.2
PCIe TLP Transaction
Address Space TLP TypeTypes
Memory Read Request
Purpose

Memory Transfer data to or from a location in the


Memory Read Lock Request
system memory map.
Memory Write Request
I/O Read Request Transfer data to or from a location in the
I/O
I/O Write Request system memory map for legacy
devices.
Config Type 0 Read Request
Config Type 0 Write Request Transfer data to or from a location in the
Configuration
Config Type 1 Read Request configuration space of a PCIe device.
Config Type 1 Write Request
Message Request Provides in-band messaging and event
Message
Message Request with Data reporting.
Completion
Memory, I/O, Completion with Data
Returned for certain requests.
Configuratio Completion Locked
n Completion Locked with Data
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Number
of
octets STP framing 1 Start
1

Appended by PL
2 Sequence number
DLLP

Created
4

DLL
by
2 CRC

12 or 16 Header 1 End

Created by Transaction Layer

Appended by Data Link Layer

Appended by Physical Layer


0 to 4096 Data

0 or 4 ECRC

4 LCRC

1 STP framing

(a) Transaction Layer Packet (b) Data Link Layer Packet

Figure 3.25 PCIe Protocol Data Unit Format

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


+ Summary A Top-Level View
of Computer
Function and
Chapter 3 Interconnection
 Point-to-point interconnect
 QPI physical layer
 Computer components
 QPI link layer
 Computer function
 QPI routing layer
 Instruction fetch and
execute  QPI protocol layer
 Interrupts  PCI express
 I/O function  PCI physical and logical
 Interconnection architecture
structures  PCIe physical layer
 Bus interconnection  PCIe transaction layer
 PCIe data link layer

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.

You might also like