
Physica C 392–396 (2003) 1478–1484

www.elsevier.com/locate/physc

Single flux quantum circuit technology innovation for backbone router applications

S. Yorozu a,*, Y. Kameda a, Y. Hashimoto a, H. Terai b, A. Fujimaki c, N. Yoshikawa d

a Fundamental Research Laboratories, NEC Corporation, 34 Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan
b Kansai Advanced Research Center, Communications Research Laboratory, Kobe 651-2492, Japan
c Department of Quantum Engineering, Nagoya University, Furo-cho, Chikusaku, Nagoya 464-8603, Japan
d Department of Electrical and Computer Engineering, Yokohama National University, Yokohama 240-8501, Japan
Received 13 November 2002; accepted 20 January 2003

Abstract

The performance of high-end routers will soon reach the limits of conventional technology. Single flux quantum
(SFQ) digital technology is a key technology for achieving a breakthrough. We have already proposed a high-end
router using SFQ technology. In this paper, we report on the latest innovations in SFQ router technology. First, we
describe the development of a design methodology. We have developed a new SFQ logic cell library for cell-based
top-down circuit design. To expand the circuit scale further, a novel pseudo-automatic Josephson transmission line
(JTL) routing technique has also been developed. We then discuss issues of passive interconnection and show the
operation of a passive transmission line interconnected circuit up to 40-Gbps throughput. Second, we describe a
packet switch circuit demonstration. We designed a 2 × 2 crossbar packet switch circuit, which is a key element in
the packet switch. We successfully tested the circuit up to a clock frequency of 35 GHz. Finally, we discuss an SFQ
circuit application in the optical router as an alternative application strategy.
© 2003 Elsevier B.V. All rights reserved.

PACS: 85.25.Cp; 85.25.Hv


Keywords: Superconducting devices; Single flux quantum logic; Cell-based design; RSFQ logic cell library; Packet switch; Passive transmission line

* Corresponding author. Tel.: +81-298-50-2634; fax: +81-298-56-6139.
E-mail address: yorozu@ay.jp.nec.com (S. Yorozu).

0921-4534/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/S0921-4534(03)01052-9

1. Introduction

Telecommunication systems need to handle more traffic to cope with the popularity of the Internet. In order to build broadband core networks that can handle a large amount of traffic, it is necessary to improve the packet forwarding throughput at a node as well as to increase the link capacity. Currently, a semiconductor processor is used in packet forwarding. LSI technology is still progressing according to Moore's Law, but traffic loads are increasing at a faster rate. While link capacities can easily be increased by bundling optical fibers, semiconductor processing speeds will limit the increase in node throughput. One bottleneck that limits node performance is the packet forwarding speed in the router switch card that gathers all the packets to be forwarded. Thus, with line speeds and the number of ports at a node increasing, alternative technology for packet forwarding is urgently required.
Superconducting single flux quantum (SFQ) circuits do not have these limitations. They have demonstrated high-performance processing in the 10–100 GHz range. They require less power and offer a higher operating speed than semiconductor devices, with a typical bit operation below 10 nW/GHz [1]. We have already proposed an SFQ packet switch for future high-end routers [2]. In this paper, we discuss a technology innovation for a future high-end router.

2. SFQ-router hardware technology

A typical high-end router consists of data input/output line cards and a switch card connecting all of them. The line cards process an incoming packet header, and the switch card forwards packets from an incoming port to a destination port. When evaluating router performance, an important criterion is the packet processing time. The router must process packets within the packet-time-length. As the data rate increases, the packet-time-length decreases, so the processing speed must be increased commensurately. There are three performance bottlenecks that must be eliminated. The first is the header address processing of an incoming IP packet and the second is the packet buffering. These two are done in the line card, and a data parallelism technique using CMOS high-speed network processors and high-speed SRAMs solves these problems very effectively. The third problem is the packet forwarding in a switch card. The switch card gathers all packets that should be forwarded. As throughput rises, the number of ports and their speeds increase so much that the data parallelism technique is constrained by the physical packaging density.

To improve performance, we have proposed a novel switch card using SFQ technology. Fig. 1 shows our proposed system. The switch fabric is a self-routing multi-stage network, such as a crosspoint or Banyan network. The target operating frequency is 40 GHz, to match the 40-GHz cell library. Therefore, the fabric can be expanded to a 40-Gbps port speed without parallelizing in the switch circuit itself. The switch fabric also has a switch scheduler, which schedules requests to the switch fabric from every input port to every output port within the packet length. Fig. 2 shows an example of a packet switch module. A 16 × 16 switch chip may consist of around 100,000 Josephson junctions, which is the upper limit for one-chip integration with current fabrication technology. A 32 × 32 switch requires four 16 × 16 switch chips on a multi-chip module made of silicon material with Nb wire. The module has a few output interface chips (SFQ output interface chip: SOIC). The SOIC is a special output driver IC chip, which is driven by multiple-phase AC bias current to suppress noise. The driver can support 10-Gbps digital signal transmission per line. To generate the optimal voltage and speed, its circuit parameters and critical current density are optimized.

Fig. 1. Block diagram of proposed router.

Fig. 2. Switch module diagram.

3. Circuit design technology development

3.1. Top-down hierarchical design environment

The packet switch fabric is a large-scale random-logic circuit. Therefore, in circuit design, the design environment is a very important issue. We have developed a cell-based top-down design flow (Fig. 3). Because the flow is based on Cadence Design Systems' EDA software, designers can design circuits without special knowledge of SFQ devices. In cell-based design, a cell library is a key component. We have developed a new SFQ logic cell library called CONNECT [3]. The CONNECT cells have digital behavior data, analog circuit data, and physical layout data. Thus, a digital-level Verilog simulator can be used to estimate the circuit operation, so we can easily expand the circuit scale without time-consuming dynamic simulation of the entire circuit. The state-of-the-art CONNECT cell library consists of about 100 cells. Each cell is designed to minimize interactions between cells to allow expansion of the circuit scale. We defined a minimum standard cell size of 40 μm and made the cell height and width multiples of that size.

3.2. EDA tool development: pseudo-automatic JTL-routing methodology

In conventional technology, we use Josephson transmission lines (JTLs) for wires in the SFQ circuit. A JTL has a large area and its delay is comparable to or greater than that of other SFQ logic cells. Therefore, as long as JTL wires are used, routing design is a very difficult task. To solve this problem, we have developed a new JTL-routing methodology based on a commercially available router [4].

This methodology is separated into two steps. In the first step, we place logic cells and route the circuit by using a commercial automatic router without using JTL wire. Because of the severe timing constraint, it is difficult for the commercial router to route the target circuit perfectly. Therefore, we introduce a new factor, "wire length tolerance ratio", to quantify the relaxation of the timing constraints. If this factor is equal to 1, then the timing constraints must be met exactly. If not, they are relaxed and the router can complete all of the circuit routing. As a result, the routed circuit does not satisfy the timing constraints in the first step. In the second step, we replace the conventional wires with JTL wire cells, and manually adjust the wire delays using several kinds of JTL cells with different delays and sizes. For JTL, the delay time length does not always correspond to the physical length, unlike normal wire. Our methodology effectively uses this characteristic. Using this methodology, we have successfully routed a circuit with up to 4133 Josephson junctions (1472 cells), which corresponds to an operating speed of up to 20 GHz by logic simulation. Without this pseudo-automatic JTL-routing methodology, it would be impossible to find routes within the severe timing constraints of 20 GHz.
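The second step above, choosing among JTL cells with different delays so that a wire reaches a required delay, can be illustrated with a small sketch. This is not the authors' tool: the text describes the adjustment as manual, and the cell names and picosecond values below are hypothetical placeholders rather than CONNECT library data. The sketch searches for the chain of cells whose total delay is closest to a target, preferring fewer cells when two chains are equally close.

```python
from typing import Dict, List, Tuple

# Hypothetical JTL cell variants: name -> delay in picoseconds.
# These names and values are illustrative assumptions, not library data.
JTL_CELLS: Dict[str, int] = {"jtl_short": 4, "jtl_mid": 5, "jtl_long": 7}


def pad_delay(target_ps: int, cells: Dict[str, int] = JTL_CELLS) -> Tuple[int, List[str]]:
    """Return (achieved_delay, chain) for the cell chain closest to target_ps.

    Dynamic programming over integer delays: best[d] holds the shortest
    chain of cells whose delays sum to exactly d.
    """
    # Allow overshoot by at most one cell so the nearest total is reachable.
    limit = target_ps + max(cells.values())
    best: Dict[int, List[str]] = {0: []}
    for total in range(1, limit + 1):
        for name, delay in cells.items():
            prev = total - delay
            if prev in best:
                candidate = best[prev] + [name]
                if total not in best or len(candidate) < len(best[total]):
                    best[total] = candidate
    # Closest total delay first, then the fewest cells (less area).
    achieved = min(best, key=lambda d: (abs(d - target_ps), len(best[d])))
    return achieved, best[achieved]


if __name__ == "__main__":
    achieved, chain = pad_delay(16)
    print(achieved, chain)
```

With these placeholder values, a 16 ps target is met exactly by a three-cell chain. The sketch ignores cell area, placement, and the wire length tolerance ratio, all of which a real flow would have to take into account.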

Fig. 3. Cell-based top-down design flow chart.

3.3. An emerging technology for flexible design: passive interconnect wiring

At present, JTL wiring is the conventional way to route SFQ circuits. However, it has the following disadvantages. (1) The power consumption is proportional to the interconnection length. (2) The propagation delay is greater than the delays of logic cells. (3) The delay time depends on the Josephson junction parameter spreads. (4) Fluctuation of the propagation delay time (timing jitter) is proportional to the wiring length because of unavoidable factors such as heat-induced noise. (5) Timing adjustment in the design is difficult because the delay time is restricted by the unit JTL delay time. Furthermore, in a typical SFQ circuit, the JTLs occupy about 70–80% of the total circuit area. These problems can be solved by using passive transmission line (PTL) wiring instead. Using a PTL for logic cell interconnection greatly improves the SFQ LSI performance. For example, PTL design can improve the precision of timing design from several picoseconds to several tens of femtoseconds. Therefore, we consider PTL wiring to be the next-generation wiring method for larger-scale circuits.

In implementing PTL interconnected circuits, the largest problem is impedance matching between PTLs and Josephson junctions. We developed a methodology for designing interface circuits between an SFQ logic cell and a PTL [5]. To obtain a large operating margin, the quality factor Q of the PTL and interface is important. Q represents the characteristics of multiple reflection and resonance, which degrade the circuit margins. Our approach is to optimize the interface circuit between logic cells and PTL by minimizing Q. Furthermore, to keep the matching condition, we designed the same interface circuit for every logic cell as far as possible.

As a first step, we consider using PTL in interconnections between circuit blocks, for example a 4 × 4 switch circuit consisting of four PTL-interconnected 2 × 2 switch circuits. To demonstrate inter-block connection, we designed a testing circuit. Fig. 4 shows a block diagram of the circuit under test and a photograph of the fabricated hardware. It consists of two DFFs connected with two PTLs via the designed interface circuits. Both PTLs are 2 mm long. Fig. 5 shows the circuit schematic of a DFF with interface circuits. By using the on-chip testing block of the CONNECT cell library [6], we confirmed correct operations up to 40 GHz. These results show the feasibility of the concept of interconnect technology between circuit blocks. In the next step, we will expand this technology to interconnect between cells (i.e., "true wiring") and integrate PTL wiring technology into an advanced automatic routing technology.

Fig. 4. Interconnection circuit demonstration block diagram and photograph of the fabricated circuit.

Fig. 5. Circuit diagram of D flip-flop and PTL interface.

4. Packet switch implementation

4.1. Multi-stage packet switch circuit

The switch fabric topology has a significant effect on its switching performance. A Banyan type or crosspoint type switch may be appropriate for SFQ implementation. A Banyan network is constructed from 2 × 2 switching elements with a single path between each input and output pair. Fig. 6 shows an 8 × 8 Banyan switch. The circuit complexity of an N × N switch is of the order N log N, so it uses less hardware than the crosspoint type. Furthermore, there are no global control circuits in a Banyan network, so a switch circuit can be expanded easily. The disadvantage of this topology is internal and external packet blocking, which typically causes a fatal 50% packet-loss rate. The crosspoint type does not have such a large packet loss rate but it requires a larger circuit scale. The crosspoint architecture is classified into three types by buffer structure: input buffer, output buffer, and internal buffer. Considering the immaturity of SFQ memory, it is better to implement the input or output buffer (semiconductor-based) type with an SFQ scheduler. Fig. 7 shows a diagram of an input buffer type 8 × 8 crosspoint switch.

Fig. 6. Banyan 8 × 8 packet switch.

Fig. 7. Crosspoint 8 × 8 packet switch.

4.2. 70-Gbps 2 × 2 cross-bar switch circuit operation

A 2 × 2 cross-bar switch circuit is a key component. We have demonstrated a 2 × 2 cross-bar switch circuit as a first step. Fig. 8 shows a block diagram. It exchanges input packets according to whether the state is "bar" or "cross". In the "bar" ("cross") state, in0 is connected to out0 (out1) and in1 to out1 (out0). The circuit has two control signals, set_cross and reset. The signal set_cross sets the state to "cross" and reset sets it to "bar". The switch circuit has 13 logic cells and three-stage pipelines. For the physical layout design, we used the pseudo-automatic layout method described previously. After routing, the circuit has 581 Josephson junctions and 194 cells. Fig. 9 shows a photograph of the circuit, which was fabricated using NEC's standard fabrication process [7]. The circuit area was about 1 × 0.5 mm². We also connected and fabricated the on-chip testing block in the CONNECT cell library for testing the circuit.
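The bar/cross behavior described for the 2 × 2 cross-bar switch can be sketched as a small behavioral model. This is only an illustration of the stated port mapping and the two control signals; it does not model the three-stage pipeline, SFQ pulse timing, or the actual CONNECT cells. The class name, the initial "bar" state, and the throughput helper (which assumes one bit per port per clock cycle) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class Crossbar2x2:
    """Behavioral model of a 2 x 2 cross-bar switch element (illustrative only)."""

    # False = "bar", True = "cross". Starting in "bar" is an assumption.
    cross: bool = False

    def set_cross(self) -> None:
        """Control signal set_cross: put the switch in the "cross" state."""
        self.cross = True

    def reset(self) -> None:
        """Control signal reset: return the switch to the "bar" state."""
        self.cross = False

    def forward(self, in0: Any, in1: Any) -> Tuple[Any, Any]:
        """Return (out0, out1) for the current state.

        bar:   in0 -> out0, in1 -> out1
        cross: in0 -> out1, in1 -> out0
        """
        return (in1, in0) if self.cross else (in0, in1)


def aggregate_throughput_gbps(clock_ghz: float, ports: int = 2) -> float:
    """Aggregate throughput, assuming one bit per port per clock cycle."""
    return clock_ghz * ports


if __name__ == "__main__":
    sw = Crossbar2x2()
    print(sw.forward("pkt_a", "pkt_b"))   # bar state
    sw.set_cross()
    print(sw.forward("pkt_a", "pkt_b"))   # cross state
    print(aggregate_throughput_gbps(35))  # two ports at a 35 GHz clock
```

Under the one-bit-per-port-per-clock assumption, two ports at a 35 GHz clock give the 70 Gbps figure in the section title.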
Tests confirmed correct operation up to 35 GHz; i.e., 70-Gbps throughput [8]. We believe that this circuit has the highest throughput among 2 × 2 switches reported to date.

Fig. 8. Block diagram of 2 × 2 crossbar switch.

Fig. 9. Fabricated 2 × 2 crossbar switch.

5. SFQ application considerations of future high-end router with optical switching technology

Photonic technology in transmission systems is still advancing. Packet transmission speeds will be increased from today's 10 Gbps to 40 or 160 Gbps in the near future. With higher port speeds such as these, especially ones over 40 Gbps, the conventional semiconductor interface technology for a router will be inadequate. A photonic-based packet switch technology may solve this problem [9]. While SFQ circuits can operate at similar speeds to optical technology, the non-electrical interface technology between optical and SFQ technology is immature, so we should consider an alternative strategy for SFQ technology in the optical packet router.

Fig. 10 shows a block diagram of a previously reported example of an optical packet switch [10]. It consists of forwarding, switching, buffering, and buffer management functions. The payload of a packet goes through the photonic switching and buffering parts. A photonic forwarding module analyzes the packet header optically and controls optical switches. After packets have been switched to their designated directions, a scheduler controls optical buffers to avoid collisions among packets. Because optical technology lacks arithmetic characteristics, a scheduler for contention resolution must be operated in the electrical domain even when data in the buffer is in the optical domain. The processing time required is inversely proportional to the number of input ports N. Slow electronic processing limits the number of ports. The interface speed between the scheduler and optical buffer does not need to be especially high. Thus, the SFQ scheduler will overcome such a problem. We intend to focus on the contention resolution circuit in future.

Fig. 10. Optical packet switch architecture. Solid lines show the optical domain and the dashed lines show the electrical domain.

6. Conclusion

In this paper, we report on the latest SFQ router technology innovation. First we described a design methodology innovation. To expand the designable circuit scale, we have developed a new SFQ logic cell library for cell-based circuit design. The cell library has over 100 cells at present. The Verilog simulator is available to simulate circuits, so we can easily expand the circuit scale without time-consuming dynamic simulation. To expand the circuit scale further, a novel pseudo-automatic JTL routing technique has also been developed. As a next-generation wiring scheme, we discussed passive interconnection. A passive transmission line dramatically reduces the switch circuit scale and greatly improves the SFQ LSI performance. We demonstrated the operation of a passive transmission line interconnected circuit up to 40-Gbps throughput.

Second we described a packet switch circuit demonstration. We designed a 2 × 2 crossbar
packet switch logic circuit with a deep pipeline architecture, which is a key element in the packet switch. We successfully tested the 2 × 2 packet data path switch circuit up to a 35-GHz clock frequency.

Finally we discussed an SFQ circuit application in the optical router as an alternative application strategy. In a photonic router, almost all parts can be changed from electrical to optical except for the scheduler, which will be a performance bottleneck. Only SFQ circuit performance can solve such a problem. We believe this is a promising application for SFQ technology.

Acknowledgements

We would like to thank Mr. Kitagawa, Ms. Isaka, and Mr. Kamei for their assistance in fabrication of chips. We also would like to thank Dr. Harai, Dr. Wada, and Dr. Hidaka for their useful discussions.

References

[1] K. Likharev, V. Semenov, IEEE Trans. Appl. Supercond. 1 (1991) 3.
[2] S. Yorozu, Y. Kameda, S. Tahara, IEICE Trans. Electronics E84-C (1) (2001) 15.
[3] S. Yorozu, Y. Kameda, H. Terai, A. Fujimaki, T. Yamada, S. Tahara, Physica C 378–381 (Part 2) (2002) 1471.
[4] Y. Kameda, S. Yorozu, IEEE Trans. Appl. Supercond., in press.
[5] Y. Hashimoto, S. Yorozu, Y. Kameda, V. Semenov, IEEE Trans. Appl. Supercond., in press.
[6] T. Yamada, A. Sekiya, A. Akahori, H. Akaike, A. Fujimaki, H. Hayakawa, Y. Kameda, S. Yorozu, H. Terai, Supercond. Sci. Technol. 14 (12) (2001) 1071.
[7] S. Nagasawa, Y. Hashimoto, H. Numata, S. Tahara, IEEE Trans. Appl. Supercond. 5 (1995) 2447.
[8] Y. Kameda, S. Yorozu, H. Terai, A. Fujimaki, JJAP 42 (2003) 2163.
[9] K. Kitayama, N. Wada, IEEE Photonic Technol. Lett. (1999) 1689.
[10] H. Harai, N. Wada, F. Kubota, W. Chujo, to be presented at IEEE ICC, New York, USA, 2002.
