
What is a bus?

Slow vehicle that many people ride together
  well, true...
A bunch of wires...

A Bus is:

a shared communication link
a single set of wires used to connect multiple subsystems
[Figure: Processor (Control, Datapath), Memory, Input, Output]
a Bus is also a fundamental tool for composing large, complex systems
  systematic means of abstraction

Datorteknik F1 bild 1 Datorteknik F1 bild 2

Advantages of Buses

[Figure: Processor, Memory and three I/O Devices on one shared bus]
Versatility:
  New devices can be added easily
  Peripherals can be moved between computer systems that use the same bus standard
Low Cost:
  A single set of wires is shared in multiple ways
Manage complexity by partitioning the design

Disadvantage of Buses

[Figure: Processor, Memory and three I/O Devices on one shared bus]
It creates a communication bottleneck
  The bandwidth of that bus can limit the maximum I/O throughput
The maximum bus speed is largely limited by:
  The length of the bus
  The number of devices on the bus
  The need to support a range of devices with:
    Widely varying latencies
    Widely varying data transfer rates

The General Organization of a Bus

[Figure: a bus drawn as Control Lines and Data Lines]
Control lines:
  Signal requests and acknowledgments
  Indicate what type of information is on the data lines
Data lines carry information between the source and the destination:
  Data and Addresses
  Complex commands

Master versus Slave

[Figure: Bus Master and Bus Slave. Master issues command; Data can go either way]
A bus transaction includes two parts:
  Issuing the command (and address): the request
  Transferring the data: the action
Master is the one who starts the bus transaction by:
  issuing the command (and address)
Slave is the one who responds to the address by:
  Sending data to the master if the master asks for data
  Receiving data from the master if the master wants to send data
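The two parts of a transaction (request, then action) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names `MemorySlave` and `Master`, the command strings and the word-addressed memory are invented for illustration and do not come from the slides.

```python
class MemorySlave:
    """Toy slave: a word-addressed memory that responds to commands."""

    def __init__(self, size):
        self.words = [0] * size

    def respond(self, command, address, data=None):
        # Action part: the slave either sends data to the master (read)
        # or receives data from the master (write).
        if command == "read":
            return self.words[address]
        if command == "write":
            self.words[address] = data
            return None
        raise ValueError(f"unknown command: {command}")


class Master:
    """Toy master: the side that starts every transaction."""

    def __init__(self, slave):
        self.slave = slave

    def read(self, address):
        # Request part: issue the command and the address.
        return self.slave.respond("read", address)

    def write(self, address, data):
        # Request part plus data travelling from master to slave.
        self.slave.respond("write", address, data)


memory = MemorySlave(size=16)
cpu = Master(memory)
cpu.write(3, 0xBEEF)
print(hex(cpu.read(3)))  # 0xbeef
```

Note that the slave never calls the master: data can go either way, but only the master starts a transaction.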


Types of Buses

Processor-Memory Bus (design specific)
  Short and high speed
  Only need to match the memory system
    Maximize memory-to-processor bandwidth
  Connects directly to the processor
  Optimized for cache block transfers
Backplane Bus (standard or proprietary)
  Backplane: an interconnection structure within the chassis
  Allow processors, memory, and I/O devices to coexist
  Cost advantage: one bus for all components
I/O Bus (industry standard)
  Usually is lengthy and slower
  Need to match a wide range of I/O devices
  Connects to the processor-memory bus or backplane bus

Example: Bus Architecture

[Figure: Processor/Memory Bus, Backplane Bus, I/O Busses]

Example: Motherboard Abit BH6

[Figure: Abit BH6 motherboard]

A Computer System with One Bus: Backplane Bus

[Figure: Processor, Memory and I/O Devices attached to one Backplane Bus]
A single bus (the backplane bus) is used for:
  Processor to memory communication
  Communication between I/O devices and memory
Advantages: Simple and low cost
Disadvantages: slow and the bus can become a major bottleneck
Example: IBM PC - AT

A Two-Bus System

[Figure: Processor and Memory on the Processor Memory Bus; three Bus Adaptors, each leading to an I/O Bus]
I/O buses tap into the processor-memory bus via bus adaptors:
  Processor-memory bus: mainly for processor-memory traffic
  I/O buses: provide expansion slots for I/O devices
Apple Macintosh-II
  NuBus: Processor, memory, and a few selected I/O devices
  SCSI Bus: the rest of the I/O devices

A Three-Bus System

[Figure: Processor and Memory on the Processor Memory Bus; a Bus Adaptor to the Backplane Bus; Bus Adaptors from the Backplane Bus to the I/O Buses]
A small number of backplane buses tap into the processor-memory bus
  Processor-memory bus is used for processor memory traffic
  I/O buses are connected to the backplane bus
Advantage: loading on the processor bus is greatly reduced

Processor-Memory Bus

This bus connects the CPU to RAM
Designed for maximal bandwidth
Usually wide, 32 bits or more
To further increase bandwidth we use a Cache
Burst access between cache and memory, early restart
[Figure: burst read timeline. Cache Miss STALL, Start Burst read from RAM (PA[3:2]: 1); Release Pipe STALL; Continue Burst read from RAM (PA[3:2]: 2, 3, 4)]

Memory Bus

[Figure: four RAM banks (00xx, 01xx, 10xx, 11xx) on a 4*32-bit bus. Each access transfers 128 bits]
[Figure: four RAM banks (00xx, 01xx, 10xx, 11xx) on a 32-bit bus. Each RAM bank is only accessed every fourth cycle on burst read]
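The bank labels 00xx to 11xx suggest that physical address bits PA[3:2] select the RAM bank. A minimal sketch of that mapping, and of a burst that fetches the missed word first so the pipeline can restart early, follows. The wrap-around order and the helper names `bank_of` and `burst_order` are assumptions for illustration; the slides do not state the exact burst order.

```python
WORD_BYTES = 4   # 32-bit words, as on the slide
NUM_BANKS = 4    # banks labelled 00xx, 01xx, 10xx, 11xx


def bank_of(physical_address):
    """Bank index taken from address bits [3:2]."""
    return (physical_address >> 2) & 0b11


def burst_order(miss_address):
    """Word addresses of a 4-word burst, requested word first.

    Assumed order: start at the missed word and wrap around the
    16-byte block, so each bank is touched exactly once.
    """
    block_base = miss_address & ~0xF      # start of the 16-byte block
    first = bank_of(miss_address)
    return [block_base + ((first + i) % NUM_BANKS) * WORD_BYTES
            for i in range(NUM_BANKS)]


print([hex(a) for a in burst_order(0x1238)])
# ['0x1238', '0x123c', '0x1230', '0x1234']
```

Because consecutive words live in different banks, each bank is accessed only once per four-word burst, which matches the "every fourth cycle" remark on the slide.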

On Chip Cache (1st level)

[Figure: CP0 and the pipeline stages IM, DE, EX, DM, with an Instr Cache and a Data Cache]
Using separate Instruction and Data caches, we can read a Hit simultaneously for both Instruction and Data
In this model 1st level caches use virtual address, and must pass TLB on Cache miss
1st level cache must be fast
Limited area on chip
Usually 8-64 KB

What defines a bus?

Transaction Protocol
Timing and Signaling Specification
Bunch of Wires
Electrical Specification
Physical / Mechanical Characteristics
  the connectors

Synchronous and Asynchronous Bus

Synchronous bus:
  Includes a clock in the control lines
  A fixed protocol for communication that is relative to the clock
  Advantage: involves very little logic and can run very fast
  Disadvantages:
    Every device on the bus must run at the same clock rate
    To avoid clock skew, they cannot be long if they are fast
Asynchronous bus:
  It is not clocked
  It can accommodate a wide range of devices
  It can be lengthened without worrying about clock skew
  It requires a handshaking protocol

Synchronous Bus

A single clock controls the protocol
Pros
  Simple (one FSM)
  Fast
Cons
  Clock skew limits bus length
  All devices work on the same speed (clock)
Suitable for Processor-Memory Bus


Asynchronous Bus

Uses handshaking to implement a transaction protocol
Pros
  Versatile, generic protocols
  Dynamic data rate
Cons
  More complex, two communication FSMs
  Slower (but usually can be made quite fast)

Asynchronous Protocol

[Figure: Master and Slave state machines. Master: ReadReq, Ack, Wait for Data, DataReady, Ack. Slave: ReadReq, Ack, DataReady, Ack]
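The handshake can be traced step by step. The sketch below is an assumption-laden illustration: the signal names ReadReq, Ack and DataReady come from the slide, but the exact ordering of the steps follows the usual textbook four-phase handshake, and the function `async_read` and the dictionary used as memory are invented here.

```python
def async_read(memory, address):
    """Return (data, trace) for one handshaked read, with no clock involved."""
    trace = []
    lines = {"ReadReq": 0, "Ack": 0, "DataReady": 0, "data": None}

    def set_line(who, name, value):
        lines[name] = value
        trace.append(f"{who}: {name}={value}")

    # Master puts the address on the data lines and raises ReadReq.
    lines["data"] = address
    set_line("master", "ReadReq", 1)
    # Slave latches the address and acknowledges.
    latched = lines["data"]
    set_line("slave", "Ack", 1)
    # Master sees Ack and releases ReadReq; slave then drops Ack.
    set_line("master", "ReadReq", 0)
    set_line("slave", "Ack", 0)
    # Slave fetches the data (however long that takes) and raises DataReady.
    lines["data"] = memory[latched]
    set_line("slave", "DataReady", 1)
    # Master reads the data lines and acknowledges.
    value = lines["data"]
    set_line("master", "Ack", 1)
    # Slave drops DataReady, master drops Ack, and the bus is idle again.
    set_line("slave", "DataReady", 0)
    set_line("master", "Ack", 0)
    return value, trace


data, trace = async_read({8: 99}, address=8)
print(data)
print("\n".join(trace))
```

Each side only advances after it has seen the other side's signal change, which is why two communicating FSMs are needed and why devices of very different speeds can share the bus.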

Busses so far

[Figure: Master and Slave connected by Control Lines, Address Lines and Data Lines]
Bus Master: has ability to control the bus, initiates transaction
Bus Slave: module activated by the transaction
Bus Communication Protocol: specification of sequence of events and timing requirements in transferring information.
Asynchronous Bus Transfers: control lines (req, ack) serve to orchestrate sequencing.
Synchronous Bus Transfers: sequence relative to common clock.

Bus Transaction

Arbitration
Request
Action

Arbitration: Obtaining Access to the Bus

[Figure: Bus Master and Bus Slave. Control: Master initiates requests; Data can go either way]
One of the most important issues in bus design:
  How is the bus reserved by a device that wishes to use it?
Chaos is avoided by a master-slave arrangement:
  Only the bus master can control access to the bus:
    It initiates and controls all bus requests
  A slave responds to read and write requests
The simplest system:
  Processor is the only bus master
  All bus requests must be controlled by the processor
  Major drawback: the processor is involved in every transaction

Bus Arbitration

Bus Master (initiator, usually the CPU)
Slave (usually the Memory)
Arbitration signals
  BusRequest
  BusGrant
BusPriority
  Higher priority served first
  Fairness, no request is locked out


Multiple Potential Bus Masters: the Need for Arbitration

Bus arbitration scheme:
  A bus master wanting to use the bus asserts the bus request
  A bus master cannot use the bus until its request is granted
  A bus master must signal to the arbiter when it has finished using the bus
Bus arbitration schemes usually try to balance two factors:
  Bus priority: the highest priority device should be serviced first
  Fairness: Even the lowest priority device should never be completely locked out from the bus

Bus Arbitration Schemes

Daisy chain arbitration
  Grant line runs through all devices, highest priority device first.
  Single device with all request lines.
Centralized
  Many request/grant lines
  Complex controller, may be a bottleneck
Self Selection
  Many request lines
  The one with highest priority self decides to take bus
Collision Detection (Ethernet)
  One request line
  Try to access bus
  If collision, device backs off
  Try again in random + exponential time
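"Random + exponential time" can be read as a random wait drawn from a window that doubles after every collision. The sketch below assumes that reading; the function name `backoff_slots`, the slot unit and the cap of 10 doublings are illustrative choices, not taken from the slides.

```python
import random


def backoff_slots(collisions, rng, max_exponent=10):
    """Random wait, in slots, after the given number of collisions (>= 1)."""
    exponent = min(collisions, max_exponent)
    # The window doubles with each collision: 0..1, 0..3, 0..7, ...
    return rng.randrange(2 ** exponent)


rng = random.Random(0)   # seeded only to make the demo repeatable
for collisions in range(1, 5):
    print(collisions, backoff_slots(collisions, rng))
```

The randomness makes it unlikely that two colliding devices retry in the same slot, and the growing window spreads retries out further when the bus is heavily loaded.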

The Daisy Chain Bus Arbitration Scheme

[Figure: Bus Arbiter with a Grant line chained through Device 1 (Highest Priority), Device 2, ..., Device N (Lowest Priority); Release and Request lines are shared (wired-OR)]
Advantage: simple
Disadvantages:
  Cannot assure fairness:
    A low-priority device may be locked out indefinitely
  The use of the daisy chain grant signal also limits the bus speed

Centralized Parallel Arbitration

[Figure: Device 1, Device 2, ..., Device N, each with its own Grant and Req line to the Bus Arbiter]
Used in essentially all processor-memory busses and in high-speed I/O busses
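The fairness problem of the daisy chain scheme shows up directly in a small model. This is a sketch under simple assumptions: `daisy_chain_grant` is a hypothetical helper, device 0 stands for the device closest to the arbiter, and timing is ignored.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that takes the grant, or None.

    requests[i] is True if device i wants the bus; device 0 sits
    closest to the arbiter and so has the highest priority.
    """
    grant = any(requests)              # wired-OR request line reaches the arbiter
    for device, wants_bus in enumerate(requests):
        if grant and wants_bus:
            return device              # this device keeps the grant
        # otherwise the grant is passed on to the next device in the chain
    return None


print(daisy_chain_grant([False, True, True]))   # 1
print(daisy_chain_grant([True, False, True]))   # 0
```

As long as device 0 keeps requesting, the second call always returns 0 and device 2 never gets the bus, which is the lock-out described above.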

Simplest bus paradigm

All agents operate synchronously
All can source / sink data at same rate
=> simple protocol
  just manage the source and target

Simple Synchronous Protocol

[Timing diagram: BReq, BG, R/W and Address (Cmd+Addr), Data (Data1, Data2)]
Even memory busses are more complex than this
  memory (slave) may take time to respond
  it needs to control data rate


Typical Synchronous Protocol

[Timing diagram: BReq, BG, R/W and Address (Cmd+Addr), Wait, Data (Data1, Data1, Data2)]
Slave indicates when it is prepared for data xfer
Actual transfer goes at bus rate

Increasing the Bus Bandwidth

Separate versus multiplexed address and data lines:
  Address and data can be transmitted in one bus cycle if separate address and data lines are available
  Cost: (a) more bus lines, (b) increased complexity
Data bus width:
  By increasing the width of the data bus, transfers of multiple words require fewer bus cycles
  Example: SPARCstation 20's memory bus is 128 bits wide
  Cost: more bus lines
Block transfers:
  Allow the bus to transfer multiple words in back-to-back bus cycles
  Only one address needs to be sent at the beginning
  The bus is not released until the last word is transferred
  Cost: (a) increased complexity, (b) increased response time (latency) for request
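The effect of the three techniques can be compared with a deliberately simple cycle count. The cost model is an assumption made for this sketch: one bus cycle per address, one per data beat, no arbitration and no wait states. The function `transfer_cycles` and its parameters are hypothetical.

```python
import math


def transfer_cycles(words, bus_width_words=1, block_transfer=False,
                    multiplexed=True):
    """Bus cycles to move `words` words under a simple cost model."""
    beats = math.ceil(words / bus_width_words)    # data cycles needed
    addresses = 1 if block_transfer else beats    # addresses that must be sent
    if multiplexed:
        # Address and data share the same lines, so each address costs a cycle.
        return addresses + beats
    # Separate lines: the address travels in the same cycle as the data
    # (as for a write), so only the data beats count.
    return beats


print(transfer_cycles(8))                                          # 16
print(transfer_cycles(8, block_transfer=True))                     # 9
print(transfer_cycles(8, bus_width_words=4, block_transfer=True))  # 3
print(transfer_cycles(8, multiplexed=False))                       # 8
```

Under these assumptions, moving eight words drops from 16 cycles to 3 when a four-word-wide bus (128 bits with 32-bit words) is combined with block transfers. Real buses add arbitration and memory latency on top.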

Pipelined Bus Protocols

Attempt to initiate next address phase during current data phase
Single master example (proc-to-cache)
[Timing diagram: CLK, Active, w_l, wait, data (XXX, RDATA1, RDATA2, WR DATA3), addr (Addr 1, ADDR 2, ADDR 3)]

Increasing Transaction Rate on Multimaster Bus

Overlapped arbitration
  perform arbitration for next transaction during current transaction
Bus parking
  master can hold onto bus and perform multiple transactions as long as no other master makes request
Overlapped address / data phases (prev. slide)
  requires one of the above techniques
Split-phase transaction bus (Command queueing)
  completely separate address and data phases
  arbitrate separately for each
  address phase yields a tag which is matched with data phase
All of the above in most modern mem busses
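Tag matching on a split-phase bus can be sketched as a small table of outstanding requests. The class `SplitPhaseBus` and its method names are hypothetical, and the sketch ignores arbitration; it only shows how a tag lets a data phase find its address phase even when replies arrive out of order.

```python
class SplitPhaseBus:
    """Toy model of tag matching between address and data phases."""

    def __init__(self):
        self.next_tag = 0
        self.pending = {}    # tag -> address still waiting for its data phase

    def address_phase(self, address):
        # The address phase yields a tag and then frees the bus.
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = address
        return tag

    def data_phase(self, tag, data):
        # The data phase carries the tag, so it can be matched to its request.
        address = self.pending.pop(tag)   # KeyError if the tag is unknown
        return address, data


bus = SplitPhaseBus()
slow = bus.address_phase(0x100)
fast = bus.address_phase(0x200)
print(bus.data_phase(fast, "B"))   # (512, 'B')
print(bus.data_phase(slow, "A"))   # (256, 'A')
```

The second request completes first here, which a bus with coupled address and data phases could not allow.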

Split Transaction Protocol

When we don't need the bus, release it!
  Improves throughput
  Increases latency and complexity
[Figure: Master requests bus: ReadReq, Ack. No one requests bus: Wait for Data. Slave requests bus: DataReady, Ack]

The I/O Bus Problem

Designed to support wide variety of devices
  full set not known at design time
Allow data rate match between arbitrary speed devices
  fast processor, slow I/O
  slow processor, fast I/O


High Speed I/O Bus

Examples
  Raid controllers
  Graphics adapter
Limited number of devices
Data transfer bursts at full rate
DMA transfers important
  small controller spools stream of bytes to or from memory
Either side may need to stall transfer
  buffers fill up

Backplane Bus

[Figure: CP0 and pipeline (IM, DE, EX, DM); Virtual Address through the TLB to Physical Addr; 1st level Cache and 2nd level Cache; Bus Adapter; Primary Memory (RAM) with Control and Data lines; HD Interface, Serial Interface, Graphics Adapter and Sound Card on the backplane bus]

Direct Memory Access (DMA)

[Figure: CP0 and pipeline (IM, DE, EX, DM), TLB, 1st level Cache, 2nd level Cache, RAM and HD Interface, with path (1) to the CPU and path (2) back to the memory bus]
We want to move data from HD interface to RAM
Why go all the way to the CPU (1) and back to the memory bus (2)

Direct Memory Access

DMA Processor
  1) Generates BusRequest, waits for Grant
  2) Put Address & Data on Bus
  3) Increase Address, back to 2 until finished
  4) Release Bus
Generates interrupt only
  When finished
  If an error occurred
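The four steps of the DMA processor map onto a short loop. This is a toy sketch: the `Bus` class, the string results standing in for interrupts, and the use of a Python list as RAM are all assumptions for illustration. A real controller would wait for the grant rather than give up.

```python
class Bus:
    """Toy bus with a single grant; names here are illustrative only."""

    def __init__(self, ram):
        self.ram = ram
        self.owner = None

    def request(self, who):
        # 1) BusRequest / Grant: succeeds only if the bus is free.
        if self.owner is None:
            self.owner = who
            return True
        return False

    def write(self, who, address, data):
        assert self.owner == who, "must hold the grant to drive the bus"
        self.ram[address] = data

    def release(self, who):
        if self.owner == who:
            self.owner = None


def dma_transfer(bus, source_words, start_address):
    """Copy source_words into RAM; return 'finished', 'error' or 'busy'."""
    if not bus.request("dma"):
        return "busy"                        # simplification: no waiting for Grant
    try:
        address = start_address
        for word in source_words:
            bus.write("dma", address, word)  # 2) put address and data on the bus
            address += 1                     # 3) increase address, repeat
    except IndexError:
        return "error"                       # interrupt: an error occurred
    finally:
        bus.release("dma")                   # 4) release the bus
    return "finished"                        # interrupt: transfer finished


ram = [0] * 8
print(dma_transfer(Bus(ram), [1, 2, 3], start_address=2), ram)
```

The CPU is not involved in any of the individual word transfers; it only hears about the outcome at the end.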

DMA and Virtual Memory

If DMA uses Virtual address it needs to pass a TLB
  Go through the CPU's TLB (no good)
  A TLB in the DMA processor (needs updating)
If DMA uses Physical address
  Only transfer within one Page
  We give the DMA a set of Physical addresses (local TLB copy)

DMA and Caching

INCONSISTENCY problem
  We change the RAM contents, but not the cache
  We write to HD but the RAM holds old information
Routing all DMA through CPU
  No good, spoils the idea!
Software handling of DMA
  Cache: Flush selected cache lines to RAM
  Cache: Invalidate selected cache lines
  TLB/OS: Do not allow access to these pages until DMA finished
Hardware Cache Coherence Protocol
  Complex, but very useful for multi processor systems
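Both halves of the inconsistency problem, and the software remedies, can be reproduced with a toy write-back cache. Everything here is an illustrative assumption: the class `WriteBackCache`, one word per cache line, and a dictionary standing in for RAM.

```python
class WriteBackCache:
    """Toy cache keyed by address; dirty lines have not reached RAM yet."""

    def __init__(self, ram):
        self.ram = ram
        self.lines = {}      # address -> (value, dirty)

    def read(self, address):
        if address not in self.lines:
            self.lines[address] = (self.ram[address], False)
        return self.lines[address][0]

    def write(self, address, value):
        self.lines[address] = (value, True)

    def flush(self, address):
        # Write a dirty line back so a DMA read from RAM sees current data.
        if address in self.lines:
            value, dirty = self.lines[address]
            if dirty:
                self.ram[address] = value
                self.lines[address] = (value, False)

    def invalidate(self, address):
        # Drop the line so the next CPU read refetches what DMA wrote to RAM.
        self.lines.pop(address, None)


ram = {0: 1, 4: 7}
cache = WriteBackCache(ram)

cache.read(0)            # CPU caches the old value 1
ram[0] = 2               # DMA changes the RAM contents, but not the cache
print(cache.read(0))     # 1: stale data
cache.invalidate(0)
print(cache.read(0))     # 2: fresh data after invalidation

cache.write(4, 8)        # CPU writes, RAM still holds the old value 7
cache.flush(4)           # flush before a DMA transfer from RAM to HD
print(ram[4])            # 8
```

Invalidation is needed before the CPU reads a buffer that DMA has filled; flushing is needed before DMA reads a buffer that the CPU has written.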


PCI Read/Write Transactions

All signals sampled on rising edge
Centralized Parallel Arbitration
  overlapped with previous transaction
All transfers are (unlimited) bursts
Address phase starts by asserting FRAME#
Next cycle initiator asserts cmd and address
Data transfers happen when
  IRDY# asserted by master/initiator when ready to transfer data
  TRDY# asserted by target when ready to transfer data
  transfer when both asserted on rising edge
FRAME# deasserted when master intends to complete only one more data transfer
Turn-around cycle on any signal driven by more than one agent

PCI Read Transaction

[Figure: PCI read transaction timing diagram]
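The rule "transfer when both asserted on rising edge" can be checked against sampled waveforms. In this sketch the two lists hold one sample per rising clock edge, and because the # suffix marks active-low signals, 0 means asserted. The function `data_phases` and the sample waveforms are made up for illustration.

```python
def data_phases(irdy_n, trdy_n):
    """Clock indices on which a data word is transferred.

    Inputs are per-clock samples of the active-low IRDY# and TRDY#
    signals (0 = asserted, 1 = deasserted).
    """
    return [clock
            for clock, (irdy, trdy) in enumerate(zip(irdy_n, trdy_n))
            if irdy == 0 and trdy == 0]


irdy_n = [1, 0, 0, 0, 1, 0]   # initiator inserts a wait state at clock 4
trdy_n = [1, 1, 0, 1, 0, 0]   # target inserts wait states at clocks 1 and 3
print(data_phases(irdy_n, trdy_n))  # [2, 5]
```

Either side can stall the burst simply by deasserting its ready signal, which is how flow control works without a fixed data rate.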

PCI Write Transaction

[Figure: PCI write transaction timing diagram]

PCI Optimizations

Push bus efficiency toward 100% under common simple usage
  like RISC
Bus Parking
  retain bus grant for previous master until another makes request
  granted master can start next transfer without arbitration
Arbitrary Burst length
  initiator and target can exert flow control with xRDY
  target can disconnect request with STOP (abort or retry)
  master can disconnect by deasserting FRAME
  arbiter can disconnect by deasserting GNT
Delayed (pended, split-phase) transactions
  free the bus after request to slow device

Additional PCI Issues

Interrupts: support for controlling I/O devices
Cache coherency:
  support for I/O and multiprocessors
Locks:
  support timesharing, I/O, and MPs
Configuration Address Space

Summary of Bus Options

Option          High performance                        Low cost
Bus width       Separate address & data lines           Multiplex address & data lines
Data width      Wider is faster (e.g., 32 bits)         Narrower is cheaper (e.g., 8 bits)
Transfer size   Multiple words has less bus overhead    Single-word transfer is simpler
Bus masters     Multiple (requires arbitration)         Single master (no arbitration)
Clocking        Synchronous                             Asynchronous
Protocol        pipelined                               Serial
