
ISSCC96 / SESSION 9 / SRAM / PAPER FA 9.7

FA 9.7: A 500MHz 288kb CMOS SRAM Macro for On-Chip Cache
K. Furumochi, H. Shimizu, M. Fujita, T. Akita, T. Izawa¹, M. Katsube¹, K. Aoyama, S. Kawamura¹
Logic LSI Group
¹LSI Process Development Division, Fujitsu Ltd., Kanagawa, Japan

A 288kb (4kw by 72b) embedded SRAM macro operates at 500MHz. A modular design using a double-stage clock generator achieves the word-bit size flexibility required for embedded SRAM. This macro is intended to be used as an on-chip cache for high-speed CPUs.

The SRAM cell structure is shown in Figure 1a. The 9.9µm² 6-Tr cell uses 0.25µm CMOS with a single-level local interconnection (LI) and a self-aligned contact (SAC). The SRAM cell requires many contact holes, but cell size can be reduced by using borderless contacts. The single LI layer with a mask-free contact is used for the cross-coupled wiring of the memory cell. The power supply lines and complement bit lines are formed by the first metal layer and the second metal layer, respectively. The contact between the first-metal layer and the cell-diffusion layer is formed by the self-aligning process. The third-metal layer forms word conductors running along with the poly-word lines to reduce the resistance of the word lines. The processes required are easily compatible with a standard logic process, because only one layer (LI) must be added to the conventional logic process. A micrograph of the SRAM cell after local-interconnect formation is shown in Figure 1b.

The narrow memory-cell pitch makes it difficult to fit the layout of the peripheral circuits to that pitch. A new word-line driver circuit overcomes this problem (Figure 2). The p-channel transistor (P1) that activates the circuit is shared with neighboring words. The n-channel transistor (N2 or N4) that deactivates the circuit is in the normally-on state and is not controlled by the clock (φx). It has a small conductance and does not disturb the operation of the NOR gate. When the clock goes high, this transistor pulls down the output. This circuit has two advantages. First, the parasitic capacitance of the clock is reduced, and faster access is achieved. Second, the number of transistors in the circuit is reduced, and the layout pitch matches the memory-cell layout pitch.

The macro floor plan is shown in Figure 3. This SRAM is synchronous. All self-timed clocks are generated by the two-stage clock generator. The first-stage clock generator is located at the center of the bottom edge and generates clocks for the address input registers and word-line drivers (φA and φx). The first-stage clock generator also generates the master clock (φ1) that controls the second-stage clock generators. The second-stage clock generators are located in each I/O block and generate the slave clocks. The slave clocks control the sense amplifiers, output latches, data input registers and write control circuits in each I/O block. This modular design technique provides increased flexibility of word-bit organization. Because a second-stage clock generator is placed in each I/O block, the slave clock timing does not depend on the number of I/Os. This allows variation of the bit organization without affecting the timing.

Figure 4 shows the sensing circuits, including the second-stage clock generator. The sensing circuits are controlled by self-timed clock pulses to eliminate static current. The latching sense amplifier with level shifter has high sensitivity and reduced power dissipation. Figure 5 shows the operating waveforms measured with an EB tester. The measured access time (the delay time from clock input to data output) is 2ns. The interval between word-line selection and deselection is 1ns.

Recent CPU cache RAMs require wide I/O interfaces in which the data I/O busses should run directly through the memory cell array, rather than taking a longer route around. However, a low-voltage-swing signal must be sensed in the high-speed SRAM, and the coupling noise between the I/O busses and bit lines must be minimal. Figure 6 shows the layout of metal layers that overcomes this problem. Vdd and Vss potential levels for the memory cell are supplied by first-layer metal. Secondary power supply lines are formed using the third and fourth metal layers. This shields the bit lines so that both the fourth and fifth metal layers can carry signals. Interleaving the power and signal lines suppresses noise, allowing use of two signal layers above the memory cell area. Use of multiple power-supply layers also removes ground bounce and improves operating margins.

The SRAM operates at 2.5V. At this low voltage the charge stored in the memory cell is small, which may lead to soft errors. The soft-error rate can be reduced by increasing the memory-cell charge and by decreasing the charge collected by the memory cell. A 6-Tr memory cell increases the memory-cell charge. Reducing the diffusion area in the memory cell reduces the collected charge. The local interconnect and self-aligned contact (LI+SAC) process not only reduces the memory-cell size, but also reduces the soft-error rate.

A micrograph of the chip is shown in Figure 7. The macro is 1.81×3.01mm². Two redundant cell blocks and laser-programmed metal fuses on the chip improve yield with no access-time penalty. Typical features are listed in Table 1.

Acknowledgments:

The authors thank K. Kobayashi, H. Kikuchi, K. Fujita, K. Watanabe, H. Goto and T. Nakajima for their support.

Reference:

[1] Izawa, T., et al., "A Novel Embedded SRAM Technology with 10µm² Full-CMOS Cells for 0.25µm Logic Devices," 1994 IEDM, Digest of Technical Papers, pp. 941-943, 1994.

Organization        4kW×72b synchronous
Process technology  0.25µm CMOS, 1-poly, 1-LI, self-aligned contact
Cell size           2.2×4.5µm² (bulk 6-Tr cell)
Macro size          1.81×3.01mm²
Power supply        2.5V
Cycle time          2ns
Active current      390mA (500MHz)

Table 1: 288kb SRAM macro features.
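The entries in Table 1 can be cross-checked with a little arithmetic. The sketch below is illustrative only: the inputs are the values stated in the paper, while the derived quantities (cell-array share of the macro, active power, peak data rate) are not reported by the authors. It assumes that "4k" means 4096 words, that the redundant cell blocks are excluded from the cell count, and that one 72b word is transferred per cycle. The function name `derived_figures` is hypothetical.

```python
# Cross-check of Table 1. Inputs are values stated in the paper; the derived
# quantities are illustrative arithmetic, not results reported by the authors.

WORDS = 4 * 1024                  # assumption: "4k" words means 4096
BITS_PER_WORD = 72
CELL_W_UM, CELL_H_UM = 2.2, 4.5   # cell size in micrometres
MACRO_W_MM, MACRO_H_MM = 1.81, 3.01
VDD_V = 2.5
ACTIVE_CURRENT_A = 0.390          # at 500MHz
CLOCK_HZ = 500e6


def derived_figures():
    """Return quantities that follow directly from the Table 1 entries."""
    capacity_bits = WORDS * BITS_PER_WORD
    cell_area_um2 = CELL_W_UM * CELL_H_UM
    # Area of the bare cell array, ignoring redundant blocks (assumption).
    array_area_mm2 = capacity_bits * cell_area_um2 / 1e6
    macro_area_mm2 = MACRO_W_MM * MACRO_H_MM
    return {
        "capacity_kb": capacity_bits / 1024,
        "cell_area_um2": cell_area_um2,
        "cycle_time_ns": 1e9 / CLOCK_HZ,
        "array_area_mm2": array_area_mm2,
        "macro_area_mm2": macro_area_mm2,
        "cell_array_fraction": array_area_mm2 / macro_area_mm2,
        "active_power_w": VDD_V * ACTIVE_CURRENT_A,
        # Assumes one full word per clock cycle.
        "peak_data_rate_gbit_s": BITS_PER_WORD * CLOCK_HZ / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the 2.2×4.5µm² cell gives the 9.9µm² quoted in the text, 500MHz corresponds to the 2ns cycle time, the bare cell array occupies roughly half of the macro area, and the active power is a little under 1W.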

156 1996 IEEE International Solid-State Circuits Conference 0-7803-3136-2 / 96 / $5.00 / © IEEE
ISSCC96 / February 9, 1996 / Buena Vista / 11:45 AM

Figure 1: Memory cell schematic. (a) Structure layout. (b) SEM image.

Figure 2: Word line driver circuit.

Figure 3: Macro floor plan.

Figure 4: Sensing circuit including clock generator.

Figure 5: Waveforms by EB tester (marked intervals: 2.0ns and 1ns).

Figure 6: Metal layer utilization for memory cell.

Figure 7: See page 435.

DIGEST OF TECHNICAL PAPERS 157



FA 9.7: A 500MHz 288kb CMOS SRAM Macro for On-Chip Cache


(Continued from page 157)

Figure 7: Chip micrograph.

DIGEST OF TECHNICAL PAPERS 435
