Table of Contents
Lightning Data Transport Technology
Abstract
Front Side Bus
HyperTransport™ (Lightning Data Transport)
Abstract
This paper explores point-to-point link technologies for interconnecting memory, I/O and fast
microprocessors. We begin with the earlier interconnect bus, the Front Side Bus (FSB), and then show the
need for something more to cater to the speeds of modern processors. We then introduce AMD's
HyperTransport™ technology and its role in server and supercomputing applications. While there are
differences in the way these technologies deliver their benefits, each has notable advantages over the
other in specific applications.
Dorbala 2
Front Side Bus
Until recently, personal computer boards had two important components named the North Bridge and
the South Bridge. The North Bridge, also known as the memory controller hub, connects the memory
(RAM) and graphics (AGP/PCIe) to the processor, whereas the South Bridge, also known as the I/O
controller hub, connects the slower I/O devices to the North Bridge (Futcher i). The Southbridge, being
used to communicate with slower devices such as USB, IDE, SATA, Ethernet, the audio controller and
CMOS memory, is connected to the faster Northbridge, which serves as the communication controller for
AGP, PCI Express and the memory bus. The Northbridge is connected to the microprocessor through the
Front-Side Bus (FSB), which was first designed by Intel. Illustration 1 shows this layout. FSB carries the data
traffic between the processor and the chipset; slower devices on the Southbridge make the FSB wait
for some clock cycles before they are ready with data.
[Illustration 1: Typical Chipset Layout using Front Side Bus (Futcher i)]
FSB's fastest transfer
speed is currently 1.6 giga-transfers per second (GT/s) (Futcher i). As the memory (RAM) is also
accessed through the FSB, memory accesses are also limited to the speed of the FSB. To reduce the load on
the FSB, Intel used large L1 and L2 caches and introduced an L3 cache (up to 24 MB for the Itanium 2
processor) (Kanter ii) for some processors, and still could not eliminate the bottleneck caused by the slow
FSB. The answer is the modern point-to-point interconnect technologies developed by AMD and Intel, ironically
with the help of Alpha engineers. The point-to-point interconnect technology has its roots in DEC's
Alpha processors and in UNIX servers from IBM and Sun. We shall compare these two technologies in
this paper.
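As a rough sanity check on the FSB figure above, peak bandwidth is just the transfer rate times the width of the data path; the 64-bit width used here is a typical FSB width and an assumption, since the text does not state it:

```python
# Peak FSB bandwidth = transfer rate x data-path width.
# 1.6 GT/s comes from the text; the 64-bit (8-byte) width is an
# assumed typical FSB width, not stated in the paper.
transfers_per_second = 1.6e9   # 1.6 GT/s
bytes_per_transfer = 8         # 64-bit data path (assumption)

peak_bytes_per_second = transfers_per_second * bytes_per_transfer
print(peak_bytes_per_second / 1e9)  # → 12.8 (GB/s)
```

Note that this single figure is shared by memory and I/O traffic, which is exactly the bottleneck the point-to-point technologies remove.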
HyperTransport™ (Lightning Data Transport)
HyperTransport™ is the combined effort of AMD, Alpha engineers and API Networks to simplify and
integrate high-speed data traffic between high-speed processors, memory and I/O. HyperTransport™
has evolved from specification 1.03, offering 12.8 gigabytes per second of aggregate bandwidth at a
maximum clock speed of 800 MHz; through specification 2.0, offering 22.4 GB/s at a maximum clock
speed of 1.4 GHz; and specification 3.0, offering 41.6 GB/s at a maximum clock speed of 2.6 GHz; to
specification 3.1, offering 51.2 GB/s at a maximum clock speed of 3.2 GHz. These specifications define
a practical, high-performance link ideally suited for applications ranging from consumer and embedded
systems to personal computers, portable devices and servers.
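The aggregate-bandwidth figures quoted for each specification are consistent with a simple model: clock rate × 2 transfers per clock (double data rate) × 4 bytes (a 32-bit link) × 2 unidirectional links. The double-data-rate and 32-bit-width factors are inferences from the numbers, not statements from the text:

```python
# Aggregate bandwidth model for a 32-bit HyperTransport link:
# clock (GHz) x 2 (double data rate) x 4 bytes (32-bit path) x 2 links.
# Clock rates are from the text; DDR and width are inferred assumptions.
def aggregate_gb_per_s(clock_ghz):
    return clock_ghz * 2 * 4 * 2

for spec, clock_ghz in [("1.03", 0.8), ("2.0", 1.4), ("3.0", 2.6), ("3.1", 3.2)]:
    print(f"HT {spec}: {aggregate_gb_per_s(clock_ghz):.1f} GB/s")
# → HT 1.03: 12.8 GB/s through HT 3.1: 51.2 GB/s, matching the figures above
```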
HyperTransport™ was designed to be software-compatible with PCI. This compatibility leads all
HyperTransport™ devices to appear as PCI devices and to conform to the properties of the PCI
standard, easing widespread adoption of HyperTransport™ technologies throughout
the industry. HyperTransport™ technology uses enhanced 1.2-volt low-voltage differential signaling
(LVDS) for the physical electrical link. The LVDS technology reduces system power
consumption, reduces noise interference, simplifies printed circuit board manufacture and thus lowers
the system cost. The technology uses a low-cost point-to-point link backbone structure iii interconnecting the
system's core components (Processor, Memory and I/O elements). As an optimized architecture,
HyperTransport™ provides the lowest possible latency, harmonizes interfaces, reduces software overhead,
enables the intermix of load/store traffic with packet-bus traffic, and supports scalable performance.
HyperTransport™ functions as a fully integrated front-side bus and eliminates the North Bridge - South
Bridge structure in the case of AMD's Opteron and Athlon 64 64-bit x86 processors, Transmeta's Efficeon
x86 processor, Broadcom's BCM 1250 64-bit MIPS processor, and PMC-Sierra's RM9000 64-bit MIPS
processor. It also serves as a high-performance I/O bus that pipes PCI, PCI-X, USB, FireWire, and
audio/video links through the system. The specification defines the link's physical interface
characteristics, with data organization and transfer handled by command/address/data packet protocols.
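As a toy illustration of carrying command, address and data in packets rather than on sideband signals, here is a sketch of packing an 8-byte control header; the field sizes are purely illustrative and are not the actual HyperTransport™ bit layout:

```python
import struct

# Illustrative 8-byte control header: 1 byte command, 1 byte unit ID,
# 6 bytes of address. These field sizes are hypothetical -- the real
# HyperTransport packet format packs its fields at the bit level.
def pack_header(command, unit_id, address):
    return struct.pack(">BB", command, unit_id) + address.to_bytes(6, "big")

hdr = pack_header(command=0x2C, unit_id=3, address=0x1000)
print(len(hdr))  # → 8
```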
It uses dual point-to-point unidirectional LVDS data links, one for input and one for output, which
carry the load/store data and communication packet data in HyperTransport™ packets and stream
channels. A HyperTransport™ host (a HyperTransport™-enabled CPU) and one tunnel make up a
HyperTransport™ link. The tunnel enables the HyperTransport™ link to be passed from one device to
the next, so devices can be deployed as a Single-Link Endpoint (Cave), a Dual-Link Daisy Chain
(Tunnel), Multiple Daisy Chains with a bridge to other I/O protocols such as PCI, PCI-X, PCI Express,
or AGP (Bridge), or a Daisy Chain without a tunnel (Bridge). The host is always considered the top of
the link, and traffic from the host is downstream
while traffic to the host is upstream. Each point-to-point unidirectional link includes a data path which
is 2, 4, 8, 16, or 32 bits wide, a clock line per each 8-bit data path, and a control line. Commands,
addresses and data are carried in packets over the data path eliminating sideband control signals.
System level control lines for RESET, PWROK and optional LDTSTOP and LDTREQ control lines
with power management functions complete the signal lines. LDTSTOP can be used to put the
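Since every line described above is a differential pair (two wires), the wire count of a link follows directly from its width. This sketch applies the rule as described in the text, so treat it as an inference rather than a figure from the specification:

```python
# Wires for one HyperTransport link, per the description above:
# per direction, a CAD path of `width_bits` bits, one clock per 8 CAD
# bits (at least one, assumed for narrow links), and one control line,
# each implemented as a 2-wire LVDS pair.
def wires_per_link(width_bits):
    assert width_bits in (2, 4, 8, 16, 32)
    clock_lines = max(1, width_bits // 8)
    lines_per_direction = width_bits + clock_lines + 1  # CAD + CLK + CTL
    return lines_per_direction * 2 * 2  # two directions, 2 wires per pair

for width in (2, 8, 16, 32):
    print(f"{width}-bit link: {wires_per_link(width)} signal wires")
```

System-level lines (RESET, PWROK, LDTSTOP, LDTREQ) come on top of this count.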
The 1.2 V LVDS signal lines are implemented by means of twin-wire lines, called balanced or differential
lines, carrying electrical signals that are equal in amplitude and timing but of opposite polarity. The
balanced line prevents electrical noise within the system from affecting the signal detection process at
the receiver end: the noise affects both wires in equal measure and cancels out, ensuring a high degree
of architectural noise immunity as well as a maximized transmission range. The disadvantage of two
wires per line is that it requires a second printed circuit board (PCB) trace per data pin. Since the
HyperTransport™ protocol uses packet-based traffic, the total number of signal lines required for a
given bandwidth is greatly reduced. As speeds increase, the link employs a simple signal de-emphasis
scheme that uses one bit of history to de-emphasize the differential amplitude generated by the
transmitter when transmitting a continuous run of 1's or 0's: the signal eye uses reduced amplitude for
sequential bits of the same value. This requires a receiver with higher sensitivity.
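The 1-bit-history de-emphasis scheme can be sketched as follows: the transmitter sends full differential amplitude on a transition and a reduced amplitude while a bit repeats its predecessor. The voltage levels are illustrative only:

```python
# De-emphasis with one bit of history: full swing on a transition,
# reduced swing while the bit value repeats. Polarity carries the bit;
# the amplitude values below are illustrative, not from the spec.
FULL, REDUCED = 0.6, 0.4  # differential amplitude in volts (illustrative)

def de_emphasize(bits):
    amplitudes = []
    prev = None
    for b in bits:
        level = FULL if b != prev else REDUCED
        amplitudes.append(level if b else -level)
        prev = b
    return amplitudes

print(de_emphasize([1, 1, 1, 0, 1]))  # → [0.6, 0.4, 0.4, -0.6, 0.6]
```

This is why the receiver needs extra sensitivity: runs of identical bits arrive at the reduced amplitude.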
The HyperTransport™ data transport mechanism is efficient, with the least overhead of any modern I/O
technology. All data traffic is carried in packets: a data packet consists of an 8-byte write control packet
header followed by a 4- to 64-byte data payload, or a 4-byte or 8-byte read control packet. All
HyperTransport™ information is carried in multiples of four bytes (32 bits). A packet is
distinguished as a control or data packet using the single control line (ASSERT = Control, DE-
ASSERT = Data). This method of differentiating the packet type is a significant feature of the link, as it
can be used to insert control packets in the middle of a long data packet. This Priority Request
Interleaving™ (PRI) feature is unique to HyperTransport™ technology, contributing to the very low latency
characteristics of HyperTransport™ by allowing a new request to be initiated in the middle of a data
packet. Also, commands and data are categorized into one of three types of virtual channels:
non-posted requests, posted requests, and responses. Non-posted requests, such as read requests and
some specific write requests, require a response from the receiver. Posted requests, such as ordinary
write requests, do not require a response from the receiver. Replies to non-posted requests are
responses, such as read responses or target-done responses to non-posted writes. HyperTransport™
uses a minimal set of data and control lines and a straightforward packet format, and provides powerful
high bandwidth for both standard computing and communications applications, distinguishing itself
from competing technologies by focusing on creating a unified chip-to-chip communications channel
that exhibits the lowest possible latency and overhead in supporting packet-based data streams. HyperTransport™
exhibits low latency through the parallel nature of its link structure. A single forwarded clock per set
of 8 data-path bits enables very low-latency point-to-point data transfer, instead of adding extensive
clock encoding/decoding at both ends of the link as done by RapidIO and PCI Express. Also, the low
packet overhead compared to PCI Express (an 8-byte header for HyperTransport™ versus 20/24
bytes for even a small data payload on PCI Express) favors HyperTransport™. PRI likewise enables a
high-priority 8-byte request command to be inserted within a potentially long, lower-priority data
transfer. The DirectPacket™ protocol enables HyperTransport™ links to carry user packet data
efficiently. Computer-oriented data transfers use a
load/store metaphor that requires the communication link to instruct each attached device precisely
where to store or retrieve the data in system memory. Communication technologies instead use a
channel metaphor: source and destination addresses are specified, and data is passed to the channel in
packets containing control and data information, instructing the receiver or transmitter where the data
streams are to be stored. The link is responsible for providing the source/destination, control
information and data payload, and does not have to specify exact memory locations or be concerned at
all with memory storage management. The DirectPacket™ protocol is neutral to the system architecture
that handles packet data. User packets are delivered by the protocol using unused bits in the base
HyperTransport™ packet format, so there is no overhead for supporting user packets. HyperTransport™
defines just the level of protocol required to move a user packet from point A to point B and leaves the
rest of the system architecture to the OEM to implement, without over-burdening it with several layers
of protocol management.
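The control-line framing and Priority Request Interleaving described earlier can be sketched as a small simulation; all identifiers here are illustrative, not part of the HyperTransport™ protocol:

```python
from collections import deque

# A long data payload streams out one 4-byte word at a time with the
# control line deasserted; a pending high-priority control packet is
# inserted between data beats with the control line asserted (PRI).
def transmit(payload_words, control_packets):
    pending = deque(control_packets)
    for word in payload_words:
        if pending:
            yield (True, pending.popleft())   # CTL asserted -> control packet
        yield (False, word)                   # CTL deasserted -> data

beats = list(transmit(["D0", "D1", "D2"], ["ReadReq"]))
print(beats)  # → [(True, 'ReadReq'), (False, 'D0'), (False, 'D1'), (False, 'D2')]
```

The receiver never confuses the interleaved request with payload, because the control line, not the packet contents, marks the packet type.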
i. David Futcher, "Northbridge", "Southbridge" and "Front-side bus".
<http://en.wikipedia.org/wiki/Northbridge_(computing)>,
<http://en.wikipedia.org/wiki/Southbridge_(computing)>,
<http://en.wikipedia.org/wiki/Front-side_bus>
ii. David Kanter, "The Common System Interface: Intel's Future Interconnect", August 28, 2007.
<http://www.realworldtech.com/page.cfm?ArticleID=RWT082807020032&p=1>