Table of Contents
Lightning Data Transport Technology
Abstract
Front Side Bus
HyperTransport™ (Lightning Data Transport)
Abstract
This paper explores point-to-point link technologies for interconnecting memory, I/O and fast
microprocessors. We begin with the earlier interconnect bus, the Front Side Bus (FSB), and then show the
need for something more to cater to the speeds of modern processors. We then introduce AMD's
HyperTransport™ technology and its role in server and supercomputing applications. While there are
differences in the way these technologies deliver their benefits, each has notable advantages over the
other in specific applications.
Dorbala 2
Front Side Bus
Until recently, personal computer boards had two important components named the North Bridge and
the South Bridge. The North Bridge, also known as the memory controller hub, connects the memory
(RAM) and graphics (AGP/PCIe) to the processor, whereas the South Bridge, also known as the I/O
controller hub, connects the slower I/O devices to the North Bridge (Futcher i). The Southbridge, being
used to communicate with slower devices such as USB, IDE, SATA, Ethernet, the audio controller and
CMOS memory, is connected to the faster Northbridge, which serves as the communication controller for
AGP, PCI Express and the memory bus. The Northbridge is connected to the microprocessor through the
Front-Side Bus (FSB), which was first designed by Intel. Illustration 1 shows this layout. FSB carries the data
traffic between the processor and the chipset; slower devices on the Southbridge make the FSB wait
for some clock cycles before they are ready with data.
[Illustration 1: Typical Chipset Layout using Front Side Bus (Futcher i)]
FSB's fastest transfer
speed is currently 1.6 giga-transfers per second (GT/s) (Futcher i). As the memory (RAM) is also
accessed through the FSB, memory accesses are also limited to the speed of the FSB. To reduce the load on
the FSB, Intel used large L1 and L2 caches and introduced an L3 cache (up to 24 MB for the Itanium 2
processor) (Kanter ii) for some processors, and still could not eliminate the bottleneck caused by the slow
FSB. The answer is the modern point-to-point interconnect technologies developed by AMD and Intel, ironically
with the help of Alpha engineers. The point-to-point interconnect technology has its roots in DEC's
Alpha processors and in UNIX servers from IBM and Sun. We shall compare these two technologies in
this paper.
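As a rough sanity check on the FSB figure above, peak bandwidth is just the transfer rate times the width of the data path; the 64-bit width used here is a typical FSB width and an assumption, since the text does not state it:

```python
# Peak FSB bandwidth = transfer rate x data-path width.
# 1.6 GT/s comes from the text; the 64-bit (8-byte) width is an
# assumed typical FSB width, not stated in the paper.
transfers_per_second = 1.6e9   # 1.6 GT/s
bytes_per_transfer = 8         # 64-bit data path (assumption)

peak_bytes_per_second = transfers_per_second * bytes_per_transfer
print(peak_bytes_per_second / 1e9)  # → 12.8 (GB/s)
```

Note that this single figure is shared by memory and I/O traffic, which is exactly the bottleneck the point-to-point technologies remove.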
HyperTransport™ (Lightning Data Transport)
HyperTransport™ is the combined effort of AMD, Alpha engineers and API Networks to simplify and
integrate high-speed data traffic between high-speed processors, memory and I/O. HyperTransport™
has evolved from specification 1.03, offering 12.8 gigabytes per second of aggregate bandwidth at a
maximum clock speed of 800 MHz; through specification 2.0, offering 22.4 GB/s at a maximum clock
speed of 1.4 GHz; and specification 3.0, offering 41.6 GB/s at a maximum clock speed of 2.6 GHz; to
specification 3.1, offering 51.2 GB/s at a maximum clock speed of 3.2 GHz. These specifications define
a practical, high-performance link ideally suited for applications ranging from consumer and embedded
systems to personal computers, portable devices and servers.
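The aggregate-bandwidth figures quoted for each specification are consistent with a simple model: clock rate × 2 transfers per clock (double data rate) × 4 bytes (a 32-bit link) × 2 unidirectional links. The double-data-rate and 32-bit-width factors are inferences from the numbers, not statements from the text:

```python
# Aggregate bandwidth model for a 32-bit HyperTransport link:
# clock (GHz) x 2 (double data rate) x 4 bytes (32-bit path) x 2 links.
# Clock rates are from the text; DDR and width are inferred assumptions.
def aggregate_gb_per_s(clock_ghz):
    return clock_ghz * 2 * 4 * 2

for spec, clock_ghz in [("1.03", 0.8), ("2.0", 1.4), ("3.0", 2.6), ("3.1", 3.2)]:
    print(f"HT {spec}: {aggregate_gb_per_s(clock_ghz):.1f} GB/s")
# → HT 1.03: 12.8 GB/s through HT 3.1: 51.2 GB/s, matching the figures above
```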
HyperTransport™ was designed to be software-compatible with PCI. This compatibility leads all
HyperTransport™ devices to appear as PCI devices and to conform to the properties of the PCI
standard, easing widespread adoption of HyperTransport™ technologies throughout
the industry. HyperTransport™ technology uses enhanced 1.2-volt low-voltage differential signaling
(LVDS) for the physical electrical link. The LVDS technology reduces system power
consumption, reduces noise interference, simplifies printed circuit board manufacture and thus lowers
the system cost. The technology uses a low-cost point-to-point link backbone structure iii interconnecting the
system's core components (Processor, Memory and I/O elements). As an optimized architecture,
HyperTransport™ provides the lowest possible latency, harmonizes interfaces, reduces software overhead,
enables the intermix of load/store traffic with packet-bus traffic, and supports scalable performance.
HyperTransport™ functions as a fully integrated front-side bus and eliminates the North Bridge - South
Bridge structure in the case of AMD's Opteron and Athlon 64 64-bit x86 processors, Transmeta's Efficeon
x86 processor, Broadcom's BCM 1250 64-bit MIPS processor, and PMC-Sierra's RM9000 64-bit MIPS
processor. It also serves as a high-performance I/O bus that pipes PCI, PCI-X, USB, FireWire, and
audio/video links through the system. The specification defines the link's physical interface
characteristics, with data organization and transfer handled by command/address/data packet protocols.
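As a toy illustration of carrying command, address and data in packets rather than on sideband signals, here is a sketch of packing an 8-byte control header; the field sizes are purely illustrative and are not the actual HyperTransport™ bit layout:

```python
import struct

# Illustrative 8-byte control header: 1 byte command, 1 byte unit ID,
# 6 bytes of address. These field sizes are hypothetical -- the real
# HyperTransport packet format packs its fields at the bit level.
def pack_header(command, unit_id, address):
    return struct.pack(">BB", command, unit_id) + address.to_bytes(6, "big")

hdr = pack_header(command=0x2C, unit_id=3, address=0x1000)
print(len(hdr))  # → 8
```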
It uses dual point-to-point unidirectional LVDS data links, one for input and one for output, which
carry the load/store data and communication packet data in HyperTransport™ packets and stream
channels. A HyperTransport™ host (a HyperTransport™-enabled CPU) and one tunnel make up a
HyperTransport™ link. The tunnel enables the HyperTransport™ link to be passed from one device to
the next, so devices can be deployed as a Single-Link Endpoint (Cave), a Dual-Link Daisy Chain
(Tunnel), Multiple Daisy Chains with a bridge to other I/O protocols such as PCI, PCI-X, PCI Express,
or AGP (Bridge), or a Daisy Chain without a tunnel (Bridge). The host is always considered the top of
the link, and traffic from the host is downstream
while traffic to the host is upstream. Each point-to-point unidirectional link includes a data path which
is 2, 4, 8, 16, or 32 bits wide, a clock line per each 8-bit data path, and a control line. Commands,
addresses and data are carried in packets over the data path eliminating sideband control signals.
System level control lines for RESET, PWROK and optional LDTSTOP and LDTREQ control lines
with power management functions complete the signal lines. LDTSTOP can be used to put the
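Since every line described above is a differential pair (two wires), the wire count of a link follows directly from its width. This sketch applies the rule as described in the text, so treat it as an inference rather than a figure from the specification:

```python
# Wires for one HyperTransport link, per the description above:
# per direction, a CAD path of `width_bits` bits, one clock per 8 CAD
# bits (at least one, assumed for narrow links), and one control line,
# each implemented as a 2-wire LVDS pair.
def wires_per_link(width_bits):
    assert width_bits in (2, 4, 8, 16, 32)
    clock_lines = max(1, width_bits // 8)
    lines_per_direction = width_bits + clock_lines + 1  # CAD + CLK + CTL
    return lines_per_direction * 2 * 2  # two directions, 2 wires per pair

for width in (2, 8, 16, 32):
    print(f"{width}-bit link: {wires_per_link(width)} signal wires")
```

System-level lines (RESET, PWROK, LDTSTOP, LDTREQ) come on top of this count.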
The 1.2 V LVDS signal lines are implemented by means of twin-wire lines, called balanced or differential
lines, carrying electrical signals that are equal in amplitude and timing but of opposite polarity. The
balanced line prevents electrical noise within the system from affecting the signal detection process at
the receiver end: the noise affects both wires in equal measure and cancels out, ensuring a high degree
of architectural noise immunity as well as a maximized transmission range. The disadvantage of two
wires per line is that it requires a second printed circuit board (PCB) trace per data pin. Since the
HyperTransport™ protocol uses packet-based traffic, the total number of signal lines required for a
given bandwidth is greatly reduced. As speeds increase, the link employs a simple signal de-emphasis
scheme that uses one bit of history to de-emphasize the differential amplitude generated by the
transmitter when transmitting a continuous run of 1's or 0's: the signal eye uses reduced amplitude for
sequential bits of the same value. This requires a receiver with higher sensitivity.
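The 1-bit-history de-emphasis scheme can be sketched as follows: the transmitter sends full differential amplitude on a transition and a reduced amplitude while a bit repeats its predecessor. The voltage levels are illustrative only:

```python
# De-emphasis with one bit of history: full swing on a transition,
# reduced swing while the bit value repeats. Polarity carries the bit;
# the amplitude values below are illustrative, not from the spec.
FULL, REDUCED = 0.6, 0.4  # differential amplitude in volts (illustrative)

def de_emphasize(bits):
    amplitudes = []
    prev = None
    for b in bits:
        level = FULL if b != prev else REDUCED
        amplitudes.append(level if b else -level)
        prev = b
    return amplitudes

print(de_emphasize([1, 1, 1, 0, 1]))  # → [0.6, 0.4, 0.4, -0.6, 0.6]
```

This is why the receiver needs extra sensitivity: runs of identical bits arrive at the reduced amplitude.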
The HyperTransport™ data transport mechanism is efficient, with the least overhead of any modern I/O
technology. All data traffic is carried in packets: a data packet consists of an 8-byte write control packet
header followed by a 4- to 64-byte data payload, or a 4-byte or 8-byte read control packet. All
HyperTransport™ information is carried in multiples of four bytes (32 bits). A packet is
distinguished as a control or data packet using the single control line (ASSERT = Control, DE-
ASSERT = Data). This method of differentiating the packet type is a significant feature of the link, as it
can be used to insert control packets in the middle of a long data packet. This Priority Request
Interleaving™ (PRI) feature is unique to HyperTransport™ technology, contributing to the very low latency
characteristics of HyperTransport™ by allowing a new request to be initiated in the middle of a data
packet. Also, commands and data are categorized into one of three types of virtual channels:
non-posted requests, posted requests, and responses. Non-posted requests, such as read requests and
some specific write requests, require a response from the receiver. Posted requests, such as ordinary
write requests, do not require a response from the receiver. Replies to non-posted requests are
responses, such as read responses or target-done responses to non-posted writes. HyperTransport™
uses a minimal set of data and control lines and a straightforward packet format, and provides powerful
high bandwidth for both standard computing and communications applications, distinguishing itself
from competing technologies by focusing on creating a unified chip-to-chip communications channel
that exhibits the lowest possible latency and overhead in supporting packet-based data streams. HyperTransport™
exhibits low latency through the parallel nature of its link structure. A single forwarded clock per set
of 8 data-path bits enables very low-latency point-to-point data transfer, instead of adding extensive
clock encoding/decoding at both ends of the link as done by RapidIO and PCI Express. Also, the low
packet overhead compared to PCI Express (an 8-byte header for HyperTransport™ versus 20/24
bytes for even a small data payload on PCI Express) favors HyperTransport™. PRI likewise enables a
high-priority 8-byte request command to be inserted within a potentially long, lower-priority data
transfer. The DirectPacket™ protocol enables HyperTransport™ links to carry user packet data
efficiently. Computer-oriented data transfers use a
load/store metaphor that requires the communication link to instruct each attached device precisely
where to store or retrieve the data in system memory. Communication technologies instead use a
channel metaphor: source and destination addresses are specified, and data is passed to the channel in
packets containing control and data information, instructing the receiver or transmitter where the data
streams are to be stored. The link is responsible for providing the source/destination, control
information and data payload, and does not have to specify exact memory locations or be concerned at
all with memory storage management. The DirectPacket™ protocol is neutral to the system architecture
that handles packet data. User packets are delivered by the protocol using unused bits in the base
HyperTransport™ packet format, so there is no overhead for supporting user packets. HyperTransport™
defines just the level of protocol required to move a user packet from point A to point B and leaves the
rest of the system architecture to the OEM to implement, without over-burdening it with several layers
of protocol management.
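The control-line framing and Priority Request Interleaving described earlier can be sketched as a small simulation; all identifiers here are illustrative, not part of the HyperTransport™ protocol:

```python
from collections import deque

# A long data payload streams out one 4-byte word at a time with the
# control line deasserted; a pending high-priority control packet is
# inserted between data beats with the control line asserted (PRI).
def transmit(payload_words, control_packets):
    pending = deque(control_packets)
    for word in payload_words:
        if pending:
            yield (True, pending.popleft())   # CTL asserted -> control packet
        yield (False, word)                   # CTL deasserted -> data

beats = list(transmit(["D0", "D1", "D2"], ["ReadReq"]))
print(beats)  # → [(True, 'ReadReq'), (False, 'D0'), (False, 'D1'), (False, 'D2')]
```

The receiver never confuses the interleaved request with payload, because the control line, not the packet contents, marks the packet type.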
i. David Futcher, "Northbridge", "Southbridge" and "Front-side bus".
<http://en.wikipedia.org/wiki/Northbridge_(computing)>,
<http://en.wikipedia.org/wiki/Southbridge_(computing)>,
<http://en.wikipedia.org/wiki/Front-side_bus>
ii. David Kanter, "The Common System Interface: Intel's Future Interconnect", August 28, 2007.
<http://www.realworldtech.com/page.cfm?ArticleID=RWT082807020032&p=1>