
THE UNIVERSITY OF TEXAS AT DALLAS

Erik Jonsson School of Engineering and Computer Science

INPUT/OUTPUT (I/O) SUBSYSTEMS

• Overview of I/O performance measurement and analysis
• Processor interface issues
• Buses
• Types and characteristics of I/O devices
    Hard disk storage
    Network interfaces
• I/O system design

© C. D. Cantrell (05/1999)


MOTIVATION FOR STUDYING I/O

• CPU performance improves by 50% to 100% per year
• I/O systems' performance improvements are limited by physics (in some cases)
    Mechanical delays (disk drives): Latency improvement is of order 5% per year
    Electrical and optical phenomena (dispersion, attenuation, crosstalk): Improvement is 5% to 25% per year
• Amdahl's law implies that, sooner or later, most of the latency will be due to the part that is hardest to improve
    Given: 10% of instructions perform I/O, CPU is 10× faster
    Improvement is only 5× ⇒ lose 50% of improvement
• I/O bottleneck lowers the value of CPU improvements
    As technology evolves, a diminishing fraction of total latency is due to the CPU
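The Amdahl's-law figure on this slide can be checked with a short calculation. The sketch below assumes that "10% of instructions perform I/O" means 10% of the original execution time is I/O and is not sped up; the function and parameter names are illustrative, not from the slides.

```python
def amdahl_speedup(enhanced_fraction: float, enhancement: float) -> float:
    """Overall speedup when `enhanced_fraction` of the original run time
    is made `enhancement` times faster and the rest is unchanged."""
    unenhanced_fraction = 1.0 - enhanced_fraction
    return 1.0 / (unenhanced_fraction + enhanced_fraction / enhancement)


if __name__ == "__main__":
    # Assumption: 90% of the time is CPU work (10x faster), 10% is I/O (unchanged).
    speedup = amdahl_speedup(enhanced_fraction=0.9, enhancement=10.0)
    print(f"overall speedup: {speedup:.2f}x")
```

Under that assumption the result is 1/0.19, a little over 5, which matches the slide's "only 5×".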


I/O PERFORMANCE METRICS

• Bandwidth (bits or bytes per second):
    Peak
    Sustained
    Useful for buses and networks
• Throughput (I/O processes per second)
    Useful for file serving and transaction processing
• Latency = total time for an I/O process from start to finish
    Most important to users
    ◦ Latency too great ⇒ user loses train of thought
    ◦ $\text{Latency} = \text{controller time} + \text{wait time} + \dfrac{\text{no. bytes}}{\text{bandwidth}} + \text{CPU time} - \text{overlap}$

© C. D. Cantrell (02/1999)
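The latency decomposition above can be written as a small function. This is only a sketch: the parameter names and the sample numbers in the usage lines are hypothetical and chosen for illustration.

```python
def io_latency(controller_s: float, wait_s: float, num_bytes: int,
               bandwidth_bytes_per_s: float, cpu_s: float,
               overlap_s: float = 0.0) -> float:
    """Latency = controller time + wait time + bytes/bandwidth + CPU time - overlap."""
    transfer_s = num_bytes / bandwidth_bytes_per_s
    return controller_s + wait_s + transfer_s + cpu_s - overlap_s


if __name__ == "__main__":
    # Hypothetical request: 64 KiB over a 4 MB/s link.
    latency = io_latency(controller_s=0.002, wait_s=0.012, num_bytes=65536,
                         bandwidth_bytes_per_s=4e6, cpu_s=0.001, overlap_s=0.0005)
    print(f"latency: {latency * 1e3:.3f} ms")
```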

PROCESSOR INTERFACE ISSUES

• Interconnections
    Buses
• Processor interface
    Interrupts
    Memory-mapped I/O
• I/O control structures
    Polling
    Interrupts
    DMA
    I/O controllers
    I/O processors
• Capacity, access time, bandwidth, cost

BUSES

• Bus: A communication link shared by multiple subsystems
    Physically: Parallel conductors (traces on die or PC board, cable)
    Advantages:
    ◦ Low cost (compared to point-to-point wiring)
    ◦ Versatility of interconnections
    Disadvantages:
    ◦ Electrical problems ⇒ short length
        Bus skew
        Dispersion
        Crosstalk
    ◦ Shared resource ⇒ contention
    Organization:
    ◦ Control lines to signal & acknowledge requests
    ◦ Data lines to carry addresses, data or commands

PROCESSOR–I/O INTERFACE BUS TYPES

• Backplane bus
    Processor, memory and I/O devices coexist on the same bus
    In olden times, often built into the backplane of a computer
    ◦ An interconnection structure that was part of the chassis
    Processor architecture includes explicit I/O instructions (IN, OUT)
    Standard backplane buses: VMEbus, Multibus, NuBus, PCI, ISA (Industry Standard Architecture) bus
• I/O bus
    Examples: IDE, SCSI

[Figure: three bus organizations]
a. Processor, memory and I/O devices on the same bus (backplane bus)
b. Processor and memory are on a backplane bus; bus adapters provide interfaces for various I/O buses
c. Processor and memory are on a fast synchronous bus (processor-memory bus); a bus adapter interfaces the processor-memory bus to the backplane bus, and further bus adapters connect I/O buses to the backplane bus

I/O SYSTEM USING ONLY A BACKPLANE BUS

[Figure: processor (with cache and interrupt inputs), main memory, and I/O controllers for two disks, graphics output and a network, all attached to a single memory–I/O bus]

I/O SYSTEM USING AN I/O BUS

[Figure: CPU (with cache) and main memory on a CPU-memory bus; a bus adapter connects it to an I/O bus carrying I/O controllers for two disks, graphics output and a network]

MACINTOSH 72xx I/O SYSTEM

[Figure: processor and main memory attached to a PCI interface/memory controller; the PCI backplane bus carries I/O controllers and bus adapters for stereo input/output, serial ports, Apple Desktop Bus, graphics output, Ethernet, and a SCSI bus with CD-ROM, disk and tape]

PENTIUM II I/O SYSTEM

[Figure: CPU with a cache bus to the level 2 cache and a local bus to the PCI bridge; memory bus to main memory; PCI bus (backplane bus) carrying I/O controllers and bus adapters: SCSI, USB (mouse, keyboard), ISA bridge, IDE disk, graphics adaptor (monitor), available PCI slot; ISA bus carrying modem, sound card, printer, available ISA slot. Source: Tanenbaum, Structured Computer Organization]

Enterprise 10000 hardware architecture

Gigaplane-XB
    Includes: XB-Interconnect, 4 address buses, bulk power distribution
    Local Power Converters
• Data is packet-switched using a crossbar
• Addresses are broadcast

3-STATE BUFFER

• A 3-state buffer has 2 inputs and 1 output
    Enable asserted: Output = input (state is either 0 or 1)
    Enable deasserted: High-impedance state (denoted × or Z)
    ◦ Output can be driven by another device
    Equivalent to a mechanical switch

EXCITATION TABLE FOR 3-STATE BUFFER

• A tristate buffer has 3 possible output values:
    Asserted
    Deasserted
    High impedance (floating)

    enable   in   out
      0      0     Z
      0      1     Z
      1      0     0
      1      1     1
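The excitation table can be expressed as a small model, together with the way several 3-state buffers share one bus line. This is a sketch only: representing the high-impedance state by `None` and raising an error on contention are modelling choices made here, not something the slides specify.

```python
from typing import Optional, Sequence, Tuple

Z = None  # high-impedance (floating) output, modelled as None


def tristate(enable: int, data_in: int) -> Optional[int]:
    """Output follows the input when enabled; otherwise it floats (Z)."""
    return data_in if enable else Z


def bus_value(drivers: Sequence[Tuple[int, int]]) -> Optional[int]:
    """Resolve one bus line driven by several (enable, data_in) buffers."""
    driven = [tristate(enable, data_in) for enable, data_in in drivers if enable]
    if len(driven) > 1:
        raise ValueError("bus contention: more than one buffer is enabled")
    return driven[0] if driven else Z


if __name__ == "__main__":
    for enable in (0, 1):
        for data_in in (0, 1):
            print(enable, data_in, tristate(enable, data_in))
    # Only the second device is enabled, so it alone drives the line.
    print(bus_value([(0, 1), (1, 0), (0, 1)]))
```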

USE OF TRISTATES TO ENABLE/DISABLE BUS ACCESS

[Figure]

BUS DESIGN CONSTRAINTS

• Laws of physics limit bus speeds
    Transmission speed ≤ speed of light
    Crosstalk
    ◦ Occurs because:
        A time-varying voltage on a conductor induces a charge $q_2 = C_{12} v_1$ on another, parallel conductor
        A time-varying current in a conductor induces a voltage $v_2 = L_{12}\, di_1/dt$ in another, parallel conductor
    ◦ Limits bus clock frequency
    ◦ Can be reduced by:
        Grounding alternate conductors
        Abandoning the bus concept and using twisted-pair, point-to-point connections (Seymour Cray)
    ◦ EMI & reflections limit number of devices connected to bus
• Real estate on die or PC board limits number of lines

COMPLEX ULTRA-SCSI CHAIN

[Figure: bus 3 meters (10 feet) in overall length, with individual segment lengths given in centimeters; a terminator at each end; the driver at device position 7 and device positions 6 through 0 along the cable; two groups of stubs (five of one length, three of another), each stub loaded with 25 pF]

ACK SIGNALS ON COMPLEX ULTRA-SCSI CHAIN

[Figure: oscilloscope traces of the ACK signal, 2 volts per division and 10 nanoseconds per division: the logic signal driving the SCSI driver, the driver output at device position 7, and the signal at device positions 6, 5, 4 and 0]

ACK SIGNALS ON POINT-TO-POINT ULTRA-SCSI BUS

[Figure: driver and receiver joined by a shielded 34-pair external cable, 25 meters (82 feet) in overall length, terminated at both ends, with only end loads for this test; oscilloscope traces of the ACK signal at the driver input, the driver output, and the receiver input after 25 m; 2 volts per division, 10 nanoseconds per division]

ASYNCHRONOUS vs. SYNCHRONOUS BUSES

• Bus communication protocol: Specification of sequence of events and timing requirements for transferring information on a bus
• Asynchronous bus transfers:
    Certain conductors on the bus are control lines
    Signals on the control lines control the sequence of events
• Synchronous bus transfers:
    Events are sequenced relative to a master clock signal
    Once a certain kind of transfer has been initiated, no further command signaling is necessary to control the transfer

SYNCHRONOUS BUSES

• Bus clock is phase-locked to processor clock
    $\text{Bus clock frequency} = \dfrac{1}{n} \times \text{processor clock frequency}$ (n = 1 to 6)
    Clock signal is carried on a control line
    Communications protocol defined with reference to bus clock signal
    Local bus (e.g., VESA Local Bus):
    ◦ Extends the processor's bus control signals
    ◦ May connect processor to L2 cache
    ◦ May connect processor and memory to high-speed I/O devices
• Advantages:
    Fast & wide
    Simple logic (finite state machine)
• Disadvantages:
    Must be short (bus skew, attenuation, crosstalk)
    All devices must run at same frequency


80286 – PENTIUM I/O

• Separate I/O and memory address spaces
    Since the 8086, I/O or memory access is signaled by M/IO# (memory access if high, I/O if low)
    ◦ For MOVE (memory–CPU copy), M/IO# is high
    ◦ For IN or OUT (I/O), M/IO# is low
    ◦ M/IO# is a processor signal that does not appear on the ISA bus
    ◦ Instead, M/IO# is an input to the bus controller
    I/O address space is 0x0000 to 0xffff


80286 SIGNALS

[Figure: 80286 pin diagram showing CLK, data lines D15–D0, READY, HOLD, INTR, NMI, PEREQ, BUSY, ERROR, address lines A23–A0, BHE, M/IO, COD/INTA, LOCK, RESET, HLDA, PEACK, S1, S0 and CAP, with the upper data bus transceiver, the lower data bus transceiver and the address latch]


ISA BUS

• ISA ≡ Industry Standard Architecture
    Synchronous
    Industry response to IBM's MicroChannel architecture
    Uses both the PC/AT and the IBM PC bus standards
    ◦ Interface cards have 2 sets of connectors
    ◦ PC bus: 8 data lines, 20 address lines
    ◦ ISA bus: 16 data lines, 24 address lines; bus frequency 8.33 MHz
    Maximum possible throughput: 2 bytes × 8.33 MHz = 16.67 MB/s
    Separate I/O and memory address spaces
    ◦ Since the 8085, I/O or memory access is signaled by IO/M# (I/O if high, memory access if low)
        For MOVE (memory–CPU copy), IO/M# is low
        For IN or OUT (I/O), IO/M# is high
    ◦ I/O address space is 0x0000 to 0xffff

ISA BUS CONNECTORS

[Figure: motherboard with CPU and other chips, the PC bus and PC bus connectors, a plug-in board with chips, its contacts and edge connector, and the new connector for PC/AT. Source: Tanenbaum, Structured Computer Organization]

PCI BUS

• PCI ≡ Peripheral Component Interconnect
    Synchronous
    PCI 1.0: Clock frequency 33 MHz, 32-bit-wide data path
    PCI 2.1: Clock frequency 66 MHz, 64-bit-wide data path
    ◦ Maximum theoretical bandwidth: 8 bytes × 66 MHz = 528 MB/s
    Transactions are negative-edge-triggered
    Address and data lines are multiplexed
    Bus arbiter usually built into the chipset
    Every PCI device has a 256-byte configuration address space that is readable by other devices ⇒ Plug 'n Play
• PCI cards
    Options include voltage (5 V vs. 3.3 V), width (32 bits/120 pins vs. 64 bits/184 pins) and frequency (33 vs. 66 MHz)
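The peak-bandwidth figures quoted for ISA and PCI both come from multiplying the data-path width by the bus clock. A minimal sketch follows; the function name is illustrative, and one transfer per clock cycle is assumed.

```python
def peak_bandwidth_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in MB/s, assuming one full-width transfer per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz


if __name__ == "__main__":
    for name, width_bits, clock_mhz in [
        ("ISA", 16, 8.33),
        ("PCI 1.0", 32, 33.0),
        ("PCI 2.1", 64, 66.0),
    ]:
        print(f"{name}: {peak_bandwidth_mb_per_s(width_bits, clock_mhz):.2f} MB/s")
```

These are upper bounds: arbitration, address cycles and wait states reduce the sustained bandwidth below the peak.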

PCI BUS ARBITER

[Figure: a central PCI arbiter connected to each of four PCI devices by a dedicated REQ# line and a dedicated GNT# line. Source: Tanenbaum, Structured Computer Organization]

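PCI BUS TIMING FOR READ AND WRITE CYCLES

[Figure: timing diagram over clock periods T1–T7 of the clock Φ, showing a read bus cycle, an idle cycle and a write bus cycle. AD carries the address, a turnaround and then the data; C/BE# carries the read or write command and then the byte enables; FRAME#, IRDY#, DEVSEL# and TRDY# are shown below. Source: Tanenbaum, Structured Computer Organization]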

PCI BUS SIGNALS

MANDATORY PCI BUS SIGNALS

Signal    Lines  Master  Slave  Description
CLK         1                   Clock (33 MHz or 66 MHz)
AD         32      ×       ×    Multiplexed address and data lines
PAR         1      ×            Address or data parity bit
C/BE        4      ×            Bus command/bit map for bytes enabled
FRAME#      1      ×            Indicates that AD and C/BE are asserted
IRDY#       1      ×            Read: master will accept; write: data present
IDSEL       1      ×            Select configuration space instead of memory
DEVSEL#     1              ×    Slave has decoded its address and is listening
TRDY#       1              ×    Read: data present; write: slave will accept
STOP#       1              ×    Slave wants to stop transaction immediately
PERR#       1                   Data parity error detected by receiver
SERR#       1                   Address parity error or system error detected
REQ#        1                   Bus arbitration: request for bus ownership
GNT#        1                   Bus arbitration: grant of bus ownership
RST#        1                   Reset the system and all devices

OPTIONAL PCI BUS SIGNALS

Signal    Lines  Master  Slave  Description
REQ64#      1      ×            Request to run a 64-bit transaction
ACK64#      1              ×    Permission is granted for a 64-bit transaction
AD         32      ×            Additional 32 bits of address or data
PAR64       1      ×            Parity for the extra 32 address/data bits
C/BE#       4      ×            Additional 4 bits for byte enables
LOCK        1      ×            Lock the bus to allow multiple transactions
SBO#        1                   Hit on a remote cache (for a multiprocessor)
SDONE       1                   Snooping done (for a multiprocessor)
INTx        4                   Request an interrupt
JTAG        5                   IEEE 1149.1 JTAG test signals
M66EN       1                   Wired to power or ground (66 MHz or 33 MHz)

Source: Tanenbaum, Structured Computer Organization

OBTAINING BUS ACCESS

• Goal: Give every device fair access
• Method: Use bus masters
    A master enables bus access for one or more devices (by enabling/disabling tristate buffers)
    Single bus master can be a bottleneck
    Multiple masters require arbitration
    ◦ Every device has a priority (IRQ number, SCSI ID, ...)
    ◦ Extra control lines needed for bus request/access
    Arbitration methods:
    ◦ Centralized & parallel (SCSI)
    ◦ Daisy chain (VMEbus)
    ◦ Distributed arbitration using self-selection (NuBus)
    ◦ Distributed arbitration using collision detection (Ethernet)

A BUS TRANSACTION WITH A SINGLE MASTER

[Figure: processor, memory and disks on a shared bus with bus request lines, in three steps]
a. Device generates bus request
b. Master (processor) responds by generating control signals (for read, etc.)
c. Processor notifies I/O device that its request is being processed; device then puts address for the request on the bus

DAISY CHAIN

[Figure: bus arbiter with request and release lines shared by all devices; the grant line runs from the arbiter through Device 1 (highest priority), Device 2, ..., Device n (lowest priority)]

A daisy chain bus uses a bus grant line that chains through each device from highest to lowest priority. The protocol is:
1. Signal on the request line
2. Wait for a low-to-high transition on the grant line (indicates reassignment)
3. Intercept the grant signal and stop asserting the request line
4. Use the bus
5. Signal that the bus is no longer required by asserting the release line
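The priority behaviour of the daisy chain can be sketched as follows. The model is deliberately simplified: it ignores signal timing and the release line, and the list representation of the devices is an assumption made for illustration.

```python
from typing import Optional, Sequence


def daisy_chain_grant(requests: Sequence[bool]) -> Optional[int]:
    """Return the position of the device that intercepts the grant.

    requests[0] belongs to the device closest to the arbiter (highest
    priority). The grant is passed down the chain until a requesting
    device intercepts it; None means nobody asked for the bus.
    """
    if not any(requests):          # shared request line is not asserted
        return None
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position        # grant is intercepted, not passed on
    return None


if __name__ == "__main__":
    # Devices 1 and 2 both request; device 1 is nearer the arbiter and wins.
    print(daisy_chain_grant([False, True, True]))
```

One consequence visible in the model: as long as a device near the arbiter keeps requesting, devices further down the chain never see the grant.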

ASYNCHRONOUS BUSES

• Not clocked
    Can accommodate many kinds of devices (disk, tape, scanner, ...)
• Data transfer controlled with handshaking protocol on dedicated control lines; represent with a finite state machine for each device
• Example (SCSI-1 bus):
    Bus controller asserts Sel (select device) and transmits device ID
    Selected device responds with Ack
    Controller asserts Cmd (command), then transmits command bytes
    Device responds to each byte with Ack
    Controller deasserts Cmd, asserts I/O, Msg (message), and Req (request a data transfer) signals, then transmits data bytes
    Device responds to each byte with Ack

STEPS OF AN ASYNCHRONOUS OUTPUT OPERATION

[Figure: processor, memory and disks connected by control lines and data lines, in three steps]
a. Initiation of a read operation from memory. Control lines: Read command; Data lines: Address
b. Memory access
c. Memory puts the data on the data lines of the bus and uses the control lines to signal the I/O device that the data is available

STEPS OF AN ASYNCHRONOUS INPUT OPERATION

[Figure: processor, memory and disks connected by control lines and data lines, in two steps]
a. Control lines: Write request to memory; Data lines: Address
b. Memory signals the device that it is ready. Data is transferred

ASYNCHRONOUS BUS HANDSHAKING PROTOCOL

[Figure: timing diagram of ReadReq, Data, Ack and DataRdy between the I/O device and memory, with transitions numbered 1–7]

1. When memory sees ReadReq asserted, it reads the address from the data bus and asserts Ack
2. I/O device sees Ack asserted, releases ReadReq and data lines
3. Memory sees ReadReq deasserted, drops Ack to acknowledge ReadReq
4. Memory puts requested data on the data lines, asserts DataRdy
5. I/O device sees DataRdy, reads data, signals that it has seen the data by asserting Ack
6. Memory sees Ack, drops DataRdy, releases data lines
7. I/O device sees DataRdy deasserted, drops Ack to signal end of transmission

[Figure: finite state machines for the handshaking protocol]

I/O device:
    On a new I/O request: put address on data lines, assert ReadReq
    2: When Ack is asserted: release data lines, deassert ReadReq
    5: When DataRdy is asserted: read memory data from data lines, assert Ack
    7: When DataRdy is deasserted: deassert Ack, then wait for a new I/O request

Memory:
    1: When ReadReq is asserted: record from data lines and assert Ack
    3, 4: When ReadReq is deasserted: drop Ack, put memory data on data lines, assert DataRdy
    6: When Ack is asserted: release data lines and DataRdy, then wait for a new I/O request
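The seven handshake steps can be traced with a small sequential model. This is a sketch under simplifying assumptions: the two finite state machines are interleaved in a single function rather than run concurrently, the bus lines are a plain dictionary, and `memory` is a hypothetical address-to-data mapping.

```python
def async_read(memory: dict, address: int):
    """Step through one asynchronous read; returns (data, final_lines, trace)."""
    lines = {"ReadReq": 0, "Ack": 0, "DataRdy": 0, "Data": None}
    trace = []

    # Start: the I/O device puts the address on the data lines and asserts ReadReq.
    lines["Data"], lines["ReadReq"] = address, 1
    trace.append("0: device drives address, asserts ReadReq")

    # 1. Memory sees ReadReq, reads the address, asserts Ack.
    latched_address = lines["Data"]
    lines["Ack"] = 1
    trace.append("1: memory latches address, asserts Ack")

    # 2. Device sees Ack, releases ReadReq and the data lines.
    lines["ReadReq"], lines["Data"] = 0, None
    trace.append("2: device releases ReadReq and data lines")

    # 3. Memory sees ReadReq deasserted, drops Ack.
    lines["Ack"] = 0
    trace.append("3: memory drops Ack")

    # 4. Memory puts the requested data on the data lines, asserts DataRdy.
    lines["Data"], lines["DataRdy"] = memory[latched_address], 1
    trace.append("4: memory drives data, asserts DataRdy")

    # 5. Device sees DataRdy, reads the data, asserts Ack.
    data = lines["Data"]
    lines["Ack"] = 1
    trace.append("5: device reads data, asserts Ack")

    # 6. Memory sees Ack, drops DataRdy, releases the data lines.
    lines["DataRdy"], lines["Data"] = 0, None
    trace.append("6: memory drops DataRdy, releases data lines")

    # 7. Device sees DataRdy deasserted, drops Ack.
    lines["Ack"] = 0
    trace.append("7: device drops Ack")

    return data, lines, trace


if __name__ == "__main__":
    value, final_lines, steps = async_read({0x40: 0xAB}, 0x40)
    print(hex(value), final_lines)
    print("\n".join(steps))
```

Every line returns to its idle state at the end, which is what allows the next request to start from the same initial condition.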

SCSI-1: AN ASYNCHRONOUS BUS (1)

• SCSI := Small Computer System Interface
    Many "standard" implementations
    Can connect many different kinds of devices:
    ◦ Logic board
    ◦ Hard drive
    ◦ CD-ROM drive
    ◦ Tape drive
    ◦ Scanner
    Controller chip on logic board or plug-in
    Controller is connected by cable to internal or peripheral devices
    Devices are daisy-chained
    Device ID is set by hardware switches

SCSI-1: AN ASYNCHRONOUS BUS (2)

• SCSI-1 bus configuration
    Peripheral SCSI-1 devices are connected by cable
    Each bit of a data byte is transferred on a separate wire (line) of the cable
    Each device must have a unique ID number between 0 and 7
    ◦ The ID is signaled by asserting one of the lines DB(0) – DB(7)
    ◦ In case of contention, the device with the highest ID wins
    ◦ The logic board has ID 7, so it always wins
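The "highest ID wins" rule can be sketched by letting each contending device assert its own data line and then looking for the highest asserted line. The function name and the list-of-IDs input are illustrative assumptions.

```python
from typing import Iterable, Optional


def scsi_arbitrate(contending_ids: Iterable[int]) -> Optional[int]:
    """Return the winning SCSI ID among the contending devices, or None."""
    data_bus = 0
    for device_id in contending_ids:
        if not 0 <= device_id <= 7:
            raise ValueError("SCSI-1 IDs must be between 0 and 7")
        data_bus |= 1 << device_id       # device asserts line DB(device_id)
    if data_bus == 0:
        return None                      # no device is contending
    return data_bus.bit_length() - 1     # highest asserted line wins


if __name__ == "__main__":
    # The logic board (ID 7) contends with devices 2 and 5.
    print(scsi_arbitrate([2, 5, 7]))
```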

SCSI ID BITS

[Figure; source: http://scitexdv.com/SCSI2/]

SCSI-1: AN ASYNCHRONOUS BUS (3)

• SCSI signaling sequence for data transfer
    Controller broadcasts SEL (select) signal on pin 44 and the ID number on one of the data lines
    Device selected responds with ACK (acknowledge) signal on pin 38 (handshake)
    Controller sends REQ (request) signal on pin 48 to order device to perform a task (such as transferring a data byte)
    Command bytes are transferred on the data bus
    A handshake must take place for each data byte transferred

SCSI Bus Signals

Signal    Driven By          Signal Explanation
DB0–DB7   Initiator/Target   8-Bit Bidirectional Data Bus. Used during the information transfer phases to transfer commands, status, data or messages over the bus.
DBP       Initiator/Target   Data-Bus Parity Line. Optional.
ATN       Initiator          Attention. Used to send a message to the target when it controls the bus.
BSY       Initiator/Target   Busy. Indicates that the bus is unavailable for use.
ACK       Initiator          Acknowledge. Used by the initiator for handshaking.
RST       Any Device         Reset. Used to initiate a bus-free phase.
MSG       Target             Driven by the target to indicate that the current transfer is a message.
SEL       Initiator          Select. Used by the initiator to select a target before command execution. Also used by the target to reconnect when the reselection phase is implemented.
C/D       Target             Control/Data.
REQ       Target             Request. Used by the target during information transfer phases.
I/O       Target             Input/Output. Determines the direction of the transfer.

Phase Sequences of the SCSI Bus

[Figure: state diagram with the phases BUS FREE, ARBITRATION (optional), SELECTION, RESELECTION (optional), COMMAND, DATA, STATUS and MESSAGE (optional)]

SCSI Information Transfer Phases

SEL  BSY  MSG  C/D  I/O   Direction     Phase
 0    1    0    0    0    To Target     Data Out
 0    1    0    0    1    From Target   Data In
 0    1    0    1    0    To Target     Command
 0    1    0    1    1    From Target   Status
 0    1    1    0    0    -             Reserved
 0    1    1    0    1    -             Reserved
 0    1    1    1    0    To Target     Message Out
 0    1    1    1    1    From Target   Message In
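The phase table is a three-input decoder on MSG, C/D and I/O (with SEL = 0 and BSY = 1 throughout). A lookup-table sketch, with illustrative names:

```python
from typing import Optional, Tuple

# (MSG, C/D, I/O) -> (phase, direction); direction is None for reserved codes.
PHASES = {
    (0, 0, 0): ("Data Out", "To Target"),
    (0, 0, 1): ("Data In", "From Target"),
    (0, 1, 0): ("Command", "To Target"),
    (0, 1, 1): ("Status", "From Target"),
    (1, 0, 0): ("Reserved", None),
    (1, 0, 1): ("Reserved", None),
    (1, 1, 0): ("Message Out", "To Target"),
    (1, 1, 1): ("Message In", "From Target"),
}


def scsi_phase(msg: int, cd: int, io: int) -> Tuple[str, Optional[str]]:
    """Decode the information transfer phase from the three control signals."""
    return PHASES[(msg, cd, io)]


if __name__ == "__main__":
    print(scsi_phase(msg=0, cd=1, io=0))
    print(scsi_phase(msg=1, cd=1, io=1))
```

In every non-reserved row the I/O signal alone gives the direction: 0 is toward the target, 1 is from the target.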

SCSI BUS TOPOLOGY

[Figure; source: http://homebrew.cs.ubc.ca/415/project-submissions/group9/notes/scsi-2.html]

I/O AS SYNCHRONIZATION OF DATA TRANSFERS

• Fundamental problems of communication between devices, or between the CPU and peripheral devices:
    Detection that a data transfer is necessary
    ◦ Dedicated polling
    ◦ Interrupts
    ◦ Periodic polling
    Synchronization of two devices, or a device and a CPU, with different speeds
    ◦ Wait state insertion
    ◦ DMA
    ◦ Dual-ported memory
    ◦ FIFO buffers
    ◦ Caches

POLLED I/O

[Figure: flowchart. Wait; "Is data ready?" (no: wait again; yes: read data); "done?" (no: return to waiting; yes: exit)]

A polling loop is not an efficient way to use a CPU unless the device is very fast. If the device is fast, then "data ready" checks can be interspersed among useful instructions. In most cases it is more efficient for the I/O device to tell the CPU when data is ready, or when a transfer is complete, than for the CPU to check the device frequently. An I/O device can use interrupts to tell the CPU that a data transfer should be started, or is finished.

DEDICATED vs. PERIODIC POLLING

• Periodic polling means that the CPU periodically interrogates the I/O device (e.g., via an oscillator–counter–decoder combination) to see whether data is ready
• Dedicated polling (spin waiting) means that the I/O device controller sets or clears bits in a status register that is read in a tight loop by the CPU
    When a system call for keyboard input is issued, and dedicated polling is in use, the CPU executes code somewhat like this:

    get_loop: lw   $a0, Device_Status   # read the device status word
              bgez $a0, get_loop        # keep spinning while the word is non-negative
              lb   $2, Device_Data      # copy one data byte into $2
              rfe                       # restore from exception

    This operation transfers only a single byte
    Data may be missed
    A different approach is necessary for block transfers
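The spin-wait loop can be imitated in a simulation. Everything here is hypothetical scaffolding: `FakeDevice` stands in for a real controller, and placing the ready flag in bit 31 is an assumption inferred from the `bgez` test in the MIPS fragment (the loop exits when the status word becomes negative).

```python
READY = 0x80000000  # assumed ready flag: bit 31, the sign bit of the status word


class FakeDevice:
    """Hypothetical device that needs `delay_polls` status reads per byte."""

    def __init__(self, data: bytes, delay_polls: int):
        self._data = list(data)
        self._delay = delay_polls
        self._countdown = delay_polls

    def read_status(self) -> int:
        if self._data and self._countdown == 0:
            return READY
        if self._countdown > 0:
            self._countdown -= 1
        return 0

    def read_data(self) -> int:
        self._countdown = self._delay   # next byte needs time to arrive
        return self._data.pop(0)


def get_byte(device: FakeDevice):
    """Dedicated polling: spin on the status register, then read one byte."""
    wasted_polls = 0
    while not (device.read_status() & READY):
        wasted_polls += 1               # the CPU does no useful work here
    return device.read_data(), wasted_polls


if __name__ == "__main__":
    keyboard = FakeDevice(b"hi", delay_polls=3)
    print(get_byte(keyboard))
    print(get_byte(keyboard))
```

The `wasted_polls` count makes the cost of spin waiting visible. Note that `get_byte` never returns if the device has no more data, which mirrors the behaviour of the original loop.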

INTERRUPT-DRIVEN I/O (1)

• An interrupt is an event that occurs outside the execution cycle and that causes processing of the current thread to stop
    Interrupts can be used to give I/O devices a means to signal the CPU that an event has occurred that requires action by the CPU (data is ready, etc.)
    An interrupt causes an exception, which results in a jump to the appropriate exception handling code (MIPS: address 0x80000080)
    There are (at least) two principal methods for detecting interrupts in hardware:
    ◦ Connect the interrupt request output of an I/O device to one of the inputs of an interrupt controller
        Interrupts may be level-triggered or edge-triggered
    ◦ Connect one interrupt line to an OR of inputs from several devices that are periodically strobed for data ready
        Device that caused the interrupt can be detected by reading a status word formed from inputs from the devices

INTERRUPT-DRIVEN I/O (2)

• On a RISC machine, an interrupt causes a jump to the general exception handling code (with a few special cases such as Reset and UTLB Miss)
    Method of P&H Chapter 5: Execution is suspended immediately
    ◦ This method is required for some exceptions (TLB miss, page fault) unless execution can be undone
    ◦ Restarting is hard in ISAs where memory is accessed at multiple times during execution of an instruction
    Method of choice: The instruction that caused the exception is allowed to finish; subsequent instructions are suspended
    Pending interrupts must be handled before next instruction is fetched
    The exception handler determines the code to execute, based on the Cause register contents
    The operating system determines what state needs to be saved (if any) besides the EPC and Cause registers

INTERRUPT-DRIVEN I/O (3)

• MIPS R2000 interrupt handler:
    Saves $a0 and $v0 in special locations
    ◦ save0 is at address 0x90000250, save1 is at address 0x90000254
    ◦ $a0 and $v0 can't be pushed onto the stack, because the cause of the exception may be a bad stack pointer!
    Copies coprocessor 0 Cause and EPC registers into $k0 and $k1
    Pushes current Kernel/User mode and Interrupt Enable Mode bits onto the stack in the Status register (see next slide)
    The kernel's exception handler uses a jump table (or a sequence of beq's) to determine the right code to execute (see SPIM kernel text)
    The operating system clears the interrupts, if any
    After executing an rfe instruction, the processor may restart execution at the address in the EPC

MIPS R2000 STATUS REGISTER

[Figure: 32-bit register layout. Upper bits hold CU, BE, TS, PE, CM, PZ, SwC and IsC; bits 15–8 hold the interrupt mask; bits 5–0 hold three kernel/user and interrupt enable bit pairs: old (bits 5–4), previous (bits 3–2) and current (bits 1–0)]

Stack for kernel/user and interrupt enable bits lets processor respond to two levels of exceptions before software must save the Status register

MIPS R2000 CAUSE REGISTER

[Figure: bits 15–10 hold the pending interrupts; bits 5–2 hold the exception code (ExcCode)]
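The three-deep stack in the low six bits of the Status register can be sketched with shifts. The bit assignments below follow the figure (old, previous, current pairs in bits 5–0); the convention that a kernel/user bit of 0 means kernel mode and an interrupt enable bit of 0 means disabled is an assumption stated here, not something the slide spells out.

```python
STACK_MASK = 0x3F  # bits 5-0: old, previous and current kernel/user + interrupt-enable pairs


def enter_exception(status: int) -> int:
    """Push: previous -> old, current -> previous, current := kernel, interrupts off."""
    stack = ((status & STACK_MASK) << 2) & STACK_MASK
    return (status & ~STACK_MASK) | stack


def rfe(status: int) -> int:
    """Pop: previous -> current, old -> previous; the old pair is left unchanged."""
    stack = status & STACK_MASK
    stack = (stack & 0x30) | (stack >> 2)
    return (status & ~STACK_MASK) | stack


if __name__ == "__main__":
    status = 0xFF03                       # user mode, interrupts enabled, all mask bits set
    first = enter_exception(status)       # first exception level
    second = enter_exception(first)       # nested exception: stack is now full
    print(hex(first), hex(second), hex(rfe(first)))
```

A third nested exception would shift the original user-mode pair out of the register, which is why software must save the Status register before allowing more than two levels.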

EXCEPTION CODES IN THE MIPS R2000 ISA

ExcCode  Name  Description
   0     Int   External interrupt
   1     MOD   TLB modification exception
   2     TLBL  TLB miss exception (Load or instruction fetch)
   3     TLBS  TLB miss exception (Store)
   4     AdEL  Address error exception (Load or instruction fetch)
   5     AdES  Address error exception (Store)
   6     IBE   Instruction fetch bus error exception
   7     DBE   Data load or store bus error exception
   8     Sys   System call exception
   9     Bp    Breakpoint exception
  10     RI    Reserved or undefined instruction exception
  11     CpU   Coprocessor unusable exception
  12     Ovf   Arithmetic overflow exception
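An exception handler's first job is to pull the exception code out of the Cause register and dispatch on it. The sketch below uses the field positions shown in the Cause register figure (ExcCode in bits 5–2, pending interrupts in bits 15–10); the function name and return format are illustrative.

```python
EXC_NAMES = ["Int", "MOD", "TLBL", "TLBS", "AdEL", "AdES", "IBE",
             "DBE", "Sys", "Bp", "RI", "CpU", "Ovf"]


def decode_cause(cause: int):
    """Return (exception name, list of pending interrupt numbers)."""
    exc_code = (cause >> 2) & 0xF          # ExcCode field, bits 5-2
    pending_bits = (cause >> 10) & 0x3F    # pending interrupts, bits 15-10
    name = EXC_NAMES[exc_code] if exc_code < len(EXC_NAMES) else "reserved"
    pending = [n for n in range(6) if pending_bits & (1 << n)]
    return name, pending


if __name__ == "__main__":
    print(decode_cause(8 << 2))                     # a system call
    print(decode_cause((0b000101 << 10) | 0 << 2))  # external interrupt, two lines pending
```

The returned name can index a jump table of handlers, in the spirit of the dispatch the handler slide describes.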

SINGLE- AND MULTIPLE-LINE INTERRUPT SYSTEMS

[Figure: single-line interrupt system, in which the device requests feed one interrupt flip-flop in the CPU; multiple-line interrupt system, in which interrupt request lines numbered 0–3 feed an interrupt register in the CPU]

VECTORED INTERRUPT SYSTEM

[Figure: interrupt request lines feed an interrupt register; its outputs, gated by an interrupt mask register, enter a priority encoder that produces the interrupt number sent to the CPU and an "input active" signal used as interrupt pending]
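The register, mask and priority encoder in the figure can be sketched as bit operations. Two conventions are assumed here because the figure does not fix them: a mask bit of 1 enables a line, and the highest-numbered active line has the highest priority.

```python
from typing import Optional, Tuple


def priority_encode(interrupt_register: int, mask_register: int,
                    lines: int = 4) -> Tuple[bool, Optional[int]]:
    """Return (interrupt pending, interrupt number sent to the CPU)."""
    active = interrupt_register & mask_register & ((1 << lines) - 1)
    if active == 0:
        return False, None
    return True, active.bit_length() - 1   # highest-numbered unmasked request


if __name__ == "__main__":
    print(priority_encode(0b0110, 0b1111))  # lines 1 and 2 request, none masked
    print(priority_encode(0b0110, 0b0011))  # line 2 is masked off
    print(priority_encode(0b0100, 0b0011))  # the only request is masked
```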

INTERRUPT-DRIVEN I/O (4)

• In the Motorola 68000 series, the CPU checks for pending interrupts after execution of each instruction
    CPU saves status register (SR) and enters supervisor mode
    After determining the interrupt number N, the CPU saves state information and executes M[4N] → PC, causing a branch to the text at the location pointed to by M[4N]
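The vector fetch M[4N] → PC amounts to reading a 32-bit word from a table at byte offset 4N. A minimal sketch, assuming the table starts at address 0 and is stored big-endian (as on the 68000); the vector number and handler address in the usage lines are made up.

```python
import struct


def vector_address(memory: bytes, interrupt_number: int) -> int:
    """Return M[4N]: the 32-bit handler address stored at byte offset 4*N."""
    (handler,) = struct.unpack_from(">I", memory, 4 * interrupt_number)
    return handler


if __name__ == "__main__":
    memory = bytearray(1024)                           # room for 256 vectors
    struct.pack_into(">I", memory, 4 * 25, 0x00002000)  # hypothetical handler for N = 25
    program_counter = vector_address(bytes(memory), 25)
    print(hex(program_counter))
```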

VECTORED INTERRUPTS IN THE IBM PC

[Figure: two 8259A interrupt controllers, each with INT, INTA, RD, WR, A0, CS and D0–D7 connections, together accepting the request lines IRQ0–IRQ15 and signaling the CPU]

MEMORY-MAPPED I/O

• Instead of having multiple address spaces for memory, I/O, etc., have a single address space
    Loading from a memory location that is mapped to an I/O device reads a data byte or word from the device
    Storing to a memory location that is mapped to an I/O device writes a data byte or word to the device
    Used in Motorola 68000 series
• In order to synchronize I/O properly, additional memory locations may be mapped to status words for the I/O devices

SPIM's MEMORY-MAPPED I/O REGISTERS

Register              Address      Fields
Receiver control      0xffff0000   Interrupt enable (1 bit), Ready (1 bit); remaining bits unused
Receiver data         0xffff0004   Received byte (8 bits); remaining bits unused
Transmitter control   0xffff0008   Interrupt enable (1 bit), Ready (1 bit); remaining bits unused
Transmitter data      0xffff000c   Transmitted byte (8 bits); remaining bits unused
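Polled use of the four registers can be illustrated with a toy model in which loads and stores to the mapped addresses reach a console device instead of memory. The `Console` class is hypothetical, and placing Ready in bit 0 of each control register is an assumption about the layout; the addresses are the ones listed above.

```python
RECEIVER_CONTROL = 0xFFFF0000
RECEIVER_DATA = 0xFFFF0004
TRANSMITTER_CONTROL = 0xFFFF0008
TRANSMITTER_DATA = 0xFFFF000C
READY = 0x1  # assumed position of the Ready bit in both control registers


class Console:
    """Hypothetical terminal reached through memory-mapped registers."""

    def __init__(self, keyboard: bytes):
        self._input = list(keyboard)
        self.output = bytearray()

    def load(self, address: int) -> int:
        if address == RECEIVER_CONTROL:
            return READY if self._input else 0
        if address == RECEIVER_DATA:
            return self._input.pop(0)
        if address == TRANSMITTER_CONTROL:
            return READY                  # this toy transmitter is always ready
        raise ValueError(f"unmapped load address {address:#x}")

    def store(self, address: int, value: int) -> None:
        if address != TRANSMITTER_DATA:
            raise ValueError(f"unmapped store address {address:#x}")
        self.output.append(value & 0xFF)


def echo(console: Console) -> None:
    """Copy every received byte to the transmitter, polling the Ready bits."""
    while console.load(RECEIVER_CONTROL) & READY:
        byte = console.load(RECEIVER_DATA)
        while not (console.load(TRANSMITTER_CONTROL) & READY):
            pass                          # wait until the transmitter can accept a byte
        console.store(TRANSMITTER_DATA, byte)


if __name__ == "__main__":
    terminal = Console(b"ok")
    echo(terminal)
    print(bytes(terminal.output))
```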

NETWORK INTERFACE CARD

[Figure: bus interface; communication controller (framing, bus interface); Ethernet interface adapter (signaling); jack. Signals between controller and adapter: TCLK, TE, TXD, CD, RXD, COL]

I/O PROCESSORS

• An I/O processor (IOP) is a processor with (usually) a more restricted instruction set than the CPU
    Purpose: Offload I/O processing from the CPU
    ◦ Used in CDC 6600, IBM S/360–370, ...
    I/O instructions executed by an IOP are called channel command words in the IBM world
    A CPU and its IOPs are really a shared-memory multiprocessor

RELATION OF I/O TO PROCESSOR ARCHITECTURE

• I/O instructions and buses have disappeared
• Interrupt vectors have been replaced by jump tables
• Interrupt stack replaced by shadow registers
    Handler saves registers and re-enables higher-priority interrupts
• Interrupt types reduced in number
    Handler must query interrupt controller
• Caches cause problems for I/O
    Flushing degrades performance heavily
    Solution: "snooping" (borrowed from shared-memory multiprocessors)
• Virtual memory frustrates DMA
• Load-store architecture inconsistent with atomic I/O operations
• Stateful processors hard to context switch