You are on page 1of 59

I/O MODULES:

MODULES UART
Universal asynchronous receiver and transmitter (UART):
Frequently used in conjunction with RS-232 standard, which specifies the
characteristics of two data communication equipment
A UART includes
i l d a transmitter
i
and
d a receiver.
i
Transmitter - essentially a special shift register that loads data in parallel and
then shifts it out bit by bit at a specific rate
Receiver - shifts in data bit by bit and then reassembles the data
Serial line is 1 when it is idle. The transmission starts with a start bit, which is 0,
0
followed by data bits and an optional parity bit, and ends with stop bits, which are 1
Number of data bits can be 6,7, or 8. The optional parity bit is used for error
detection. Number of stop bits can be 1, 1.5, or 2
2

UART
Transmission of a byte.
byte

Transmission with 8 data bits, no parity, and 1 stop bit


The LSB of the data word is transmitted first
No clock information is conveyed through the serial line
starts the transmitter and receiver must agree on a
Before the transmission starts,
set of parameters in advance, which include
Baud rate - number of bits per second
Number of data bits and stop bits, and use of the parity bit
Commonly used baud rates are 2400,4800,9600, and 19,200 bauds

UART: RECEIVING SUBSYSTEM


UART
No clock information is conveyed from the transmitted signal and the receiver
retrieve the
h data
d bits
b only
l by
b using the
h predetermined
d
d parameters
Over-sampling scheme is used to estimate the middle points of transmitted bits
and then retrieve them at these points accordingly
The scheme performs the function of a clock signal, instead of using the rising
edge to indicate validity of signal, it utilizes sampling ticks to estimate the middle
point of each bit
Commonly used sampling rate is 16 times the baud rate, i.e. each serial bit is
sampled 16 times
While the receiver has no information about the exact onset time of the start bit,
the estimation can be off by at most 1/16.
The baud rate can only be a small fraction of the system clock rate, thus this
scheme is not appropriate for a high data rate
4

UART: RECEIVING SUBSYSTEM


UART
Over- sampling
p gp
procedure : Assume N data bits and M stopp bits
1. Wait until the incoming signal becomes 0, the beginning of the start bit, and then
start the sampling tick counter
2. When the counter reaches 7, the incoming signal reaches the middle point of the
start bit. Clear the counter to 0 and restart
3. When the counter reaches 15, the incoming signal progresses for one bit and
reaches the middle of the first data bit. Retrieve its value, shift it into a register,
and restart the counter
4. Repeat step-3 N-1 more times to retrieve the remaining data bits.
5. If the optional parity bit is used, repeat step-3 one time to obtain the parity bit
6. Repeat step-3 M more times to obtain the stop bits
5

UART: RECEIVING SUBSYSTEM


UART
Receiving Sub-system consists of three major components:
UART receiver: obtain the data word via over-sampling
Baud rate generator: to generate the sampling ticks
Interface circuit: provides buffer and status between the UART receiver and the
system that uses the UART

Conceptual block diagram of a UART receiving subsystem

UART: RECEIVING SUBSYSTEM


UART
Baud rate generator:
Generates a sampling signal whose frequency is exactly 16 times the UARTs
designated baud rate
For the 19,200 baud rate,
the sampling rate = 307,200) ticks per second (i.e., 19,200x16)
With 50 MHz clock,
the baud rate generator needs a mod-163 (50x106/307200 ) counter,
i.e. one-clock-cycle tick is asserted once every 163 clock cycles
The parameterized mod-n counter discussed in earlier classes can be used for this
purpose by setting the N generic to 163
7

UART: RECEIVING SUBSYSTEM


UART
UART receiver: ASMD
D-BIT: constant indicates the number of data bits,
SB-TICK : constant indicates number of ticks needed for the stop bits, which is 16,
( 1,, 1.5,, and 2 stop
p bits))
24,, and 32 (for
States: start, data , and stop; represent the processing of the start bit, data bits,
and stop bit
s-tick: enable tick from the baud rate generator (16 ticks in a bit interval)
s and n registers: two counters
s register keeps track of the number of sampling ticks and counts to 7 in the
start state, to 15 in the data state, and to SB-TICK in the stop state.
n register keeps track of the number of data bits received in the data state
b register : reassemble the retrieved bits after shifting
rx-done-tick: asserted for one clock cycle after the receiving process is completed
8

UART: RECEIVING SUBSYSTEM


UART

UART receiver: ASMD

entity uart_rx is
generic(
DBIT: integer:=8;
g
; -- # data bits
SB_TICK: integer:=16 -- # ticks for stop bits
);
port(
clk reset: in std_logic;
clk,
std logic;
rx: in std_logic;
s_tick: in std_logic;
rx_done_tick: out std_logic;
dout: out std_logic_vector(7 downto 0)
);
end uart_rx ;
architecture arch of uart_rx is
type state_type is (idle, start, data, stop);
signal state_reg, state_next: state_type;
signal s_reg, s_next: unsigned(3 downto 0);
signal n_reg, n_next: unsigned(2 downto 0);
signal b_reg, b_next: std_logic_vector(7 downto 0);

-- next-state logic & data path functional units


process (state_reg,s_reg,n_reg,b_reg,s_tick,rx)
begin
state_next <= state_reg;
s_next <= s_reg;
n_next <= n_reg;
b next <=
b_next
< b_reg;
b reg;
rx_done_tick <='0';
case state_reg is
when idle =>
if rx='0'
'0' then
h
state_next <= start;
s_next <= (others=>'0');
end if;;
when start =>
if (s_tick = '1') then
if s_reg=7 then
state_next
t t
t <= data;
d t
s_next <= (others=>'0');
n_next <= (others=>'0');
else
s_next <= s_reg + 1;
end if;
10
end if;

when data =>


if (s_tick = '1') then
if s_reg=15 then
s next <= (others=>'0');
s_next
(others=> 0 );
b_next <= rx & b_reg(7 downto 1) ;
if n_reg=(DBIT-1) then
state_next <= stop ;
else
l
n_next <= n_reg + 1;
end if;
else
s_next <= s_reg + 1;
end if;
end if;
when stop
p =>
if (s_tick = '1') then
if s_reg=(SB_TICK-1) then
state_next <= idle;
rx done tick <=
rx_done_tick
<='1';
1;
else
s_next <= s_reg + 1;
end if;
end
d if;
if
end case;
end process;
dout <= b_reg;

11

UART: RECEIVING SUBSYSTEM


UART
Interface circuit
It provides a mechanism to signal the availability of a new word and to prevent the
received word from being retrieved multiple times.
It provide buffer space between the receiver and the main system
commonly used schemes:
A flag FF
A flag FF and a one-word buffer
A FIFO buffer

A flag FF

12

UART: RECEIVING SUBSYSTEM


UART

Flag FF and a one-word buffer

A FIFO buffer
13

UART TRANSMITTING SUBSYSTEM


UART:
UART transmitting
g Sub-system
y
consists of:
Baud rate generator
Transmitter
g
that shifts out data bits at a specific
p
rate
A shift register
The rate can be controlled by the baud rate generator
Because no oversampling is involved, the frequency of the ticks is 16 times
sslower
owe than
t a the
t e receiver.
ece ve . (uses an
a internal
te a counter)
cou te )
Interface circuit - the main system sets the flag FF or writes the FIFO buffer, and
the transmitter clears the flag FF or reads the FIFO buffer
Algorithm:
1. assertion of the tx-start signal,
2. data word is loaded
3 progresses through the start,
3.
start data,
data and stop states
4. completion by asserting the tx-done-tick signal for one clock cycle
filter out any potential glitch
14

UART TRANSMITTING SUBSYSTEM


UART:

Block diagram of a complete UART


15

PS2 KEYBOARD
PS2:
PS2 port was introduced in IBMs Personal System/2 personnel computers
It support interface for a keyboard and mouse to communicate with the host
The port contains two wires
one wire is for data, which is transmitted in a serial stream.
other wire is for the clock information, which specifies when the data is
valid
Information is transmitted as an 11-bit packet
a start bit,, 8 data bits,, a parity
p y bit,, and a stop
p bit
PS2 port is bidirectional
our discussion is limited to one direction, from the keyboard to the FPGA
board
FPGA prototyping board has a PS2 port and acts as a host
17

PS2 Receiving
PS2:
R
i i Sub-system
S b
t
Physical interface of a PS2 port:
In addition to data and clock lines, the PS2 port includes connections for
power (Vcc)
(V ) andd ground.
d
The power is supplied by the host
Original PS2 port,
port Vcc is 5 V
Current keyboards and mouse work with 3.3 V
FPGA should function p
properly
p y since its I/O p
pins can tolerate 5-V input
p

18

PS2 Receiving
PS2:
R
i i Sub-system
S b
t
Device-to-host communication protocol:
A PS2 device
d i and
d iits h
host communicate
i
via
i packets.
k
ps2d and ps2c:
data and clock signals

Data is transmitted in a serial stream similar to that of a UART


Unlike
U lik a UART
UART, clock
l k iinformation
f
i iis carried
i d iin a separate clock
l k signal,
i l ps2c
2
falling edge of ps2c indicates that the corresponding bit in ps2d line is valid
The clock period of the ps2c signal is between 60 and 100 s (10 to 16
16.77 kHz)
ps2d signal is stable at least 5 s before and after the falling edge of ps2c
19
signal

PS2 Receiving
PS2:
R
i i Sub-system
S b
t
Design
g and code
Design of the PS2 port receiving subsystem is similar to a UART receiver
Instead of using the over-sampling scheme, the falling-edge of the ps2c signal
is used as the reference point to retrieve data
Subsystem includes a falling edge detection circuit for the ps2c signal, and the
receiver, which shifts in and assembles the serial bits
The edge detection circuit discussed earlier can be used to detect the falling
edge
Because of the potential noise and slow transition, a simple filtering circuit is
added to eliminate glitches

20

PS2 Receiving
PS2:
R
i i Sub-system
S b
t

ASMD chart of PS2 port receiver

21

PS2: Receiving
R
i i Sub-system
S b
t
architecture arch of ps2_rx is
type statetype is (idle,
(idle dps,
dps load);
signal state_reg, state_next: statetype;
signal filter_reg, filter_next:
std_logic_vector(7 downto 0);
signal
i l ff_ps2c_reg,f_ps2c_next:
2
f 2
std_logic;
d l i
signal b_reg, b_next: std_logic_vector(10 downto 0);
signal n_reg,n_next: unsigned(3 downto 0);
g fall_edge:
_ g std_logic;
_ g ;
signal
-- filter and falling edge detection
-- fsmd to extract the 8-bit data
-- next-state
n t t t logic
l i
process(state_reg,n_reg,b_reg,fall_edge,rx_en,ps2d)
begin
rx_done_tick <='0';
state_next <= state_reg;
n_next <= n_reg;
b_next <= b_reg;

entity ps2_rx is
port (
clk, reset: in std_logic;
ps2d,, ps2c:
p
p
in std_logic;
g ; -- keyy data,, keyy clock
rx_en: in std_logic;
rx_done_tick: out std_logic;
dout: out std_logic_vector(7 downto 0)
)
);
end ps2_rx;
22

case state_reg is
when idle =>
if fall
fall_edge=
edge='1'
1 and rx
rx_en=
en='1'
1 then
b_next <= ps2d & b_reg(10 downto 1); -- shift in start bit
n_next <= "1001";
state_next <= dps;
PS2: Receiving
end if;
when dps =>
-- 8 data + 1 pairty + 1 stop
if fall_edge='1' then
b next <= ps2d & b
b_next
b_reg(10
reg(10 downto 1);
if n_reg = 0 then
state_next <=load;
else
n_next <= n_reg - 1;
1
end if;
end if;
when load =>
state_next <= idle;
rx_done_tick <='1';
end case;
end process;
-- output
dout <= b_reg(8 downto 1); -- data bits

Sub-system
Sub system

23

PS2 KEYBOARD
PS2:
O
SCAN CODE
O
Overview of scan code
A keyboard consists of a matrix of keys and an embedded microcontroller that
monitors (scans) the activities of the keys and sends scan code accordingly
A key is pressed
pressed, the make code of the key is transmitted
A key is held down continuously (typematic) the make code is transmitted
repeatedly at a specific rate
When a key is released, the break code of the key is transmitted.
The make code of the main p
part of a PS2 keyboard
y
is normallyy 1 byte
y wide and
represented by two hexadecimal numbers. The make codes of special-purpose
keys, which are known as the extended keys, can have 2 to 4 bytes
Th b
The
breakk codes
d off the
h regular
l kkeys consist
i off F0 followed
f ll
d by
b the
h make
k code
d off
the key (break code of A key is F0 1C ).
24

PS2 KEYBOARD SCAN CODE


PS2:

Sequence of codes transmitted- press and release of A : 1C F0 1C


Multiple keys can be pressed at the same time.
first press shift and then A , and release A and then release shift. Transmitted
code sequence
sequence- 12 1C F0 1C F0 12
No special code to distinguish the lower- and uppercase keys. The host keep
25
track of the shift key to determine the case accordingly.

PS2 KEYBOARD SCAN CODE


PS2:
Scan code monitor circuit
Monitors the arrival of the received
packets and displays the scan codes
on a PC
first
st split
sp t the
t e received
ece ved scan
sca code
into two 4-bit parts and treat them
as two hexadecimal digits
then convert the two digits to
ASCII code words and send the
words to a PC via the UART
26

PS2 KEYBOARD SCAN CODE


PS2:

27

EXTERNAL SRAM
Random access memory (RAM) is used for massive storage in a digital
system since a RAM cell is much simpler than an FF cell
A commonly used type of RAM is the asynchronous static RAM (SRAM)
Unlike a register,
g
, in which the data is sampled
p
and stored at an edge
g of a
clock signal, accessing data from an asynchronous SRAM is more
complicated
A read or write operation requires that the data, address, and control signals
be asserted in a specific order, and these signals must be stable for a certain
amount off time
i during
d i the
h operation
i
28

EXTERNAL SRAM

29

EXTERNAL SRAM
It is difficult for a synchronous system to access an SRAM directly.
directly Usually
memory controller is used as an interface
Controller takes commands from the main system synchronously and then
generates properly timed signals to access the SRAM
Controller shields the main system from the detailed timing and makes the
memory access appears like a synchronous operation
Performance of a memory controller is measured by the number of
memory accesses that can be completed in a given period
Designing a simple memory controller with optimal performance involves
many timing issues
30

EXTERNAL SRAM
Device:
256K-by-16 asynchronous
SRAM available with many xilinx board
g
is
The memoryy controller design
targeted to IS61LV25616AL SRAM
device
Timing characteristics of each RAM
device are different hence controller
d i may
design
m have
h
th variations
the
ri ti
as per
p r the
th
device
Same design principle can be used for
similar SRAM devices

FUNCTIONAL BLOCK DIAGRAM


IS61LV25616AL
31

EXTERNAL SRAM

Functional Table

32

EXTERNAL SRAM
For simplicity and clarity,
clarity use
one SRAM device
access the SRAM in 16-bit word format
ce_n , lb_
lb n , and
d ub_
b n signals
i l should
h ld always
l
b
be activated
i
d

Simplified Functional Table


33

EXTERNAL SRAM
Timing characteristics
F key
Few
k timing
i i parameters off an asynchronous
h
SRAM relevant
l
to this
hi ddesign:
i
tRC : read cycle time, the minimal elapsed time between two read operations
tA A : address access time,
time the time required to obtain stable output data after
an address change
tOHA : output hold time, the time that the output data remains valid after the
address changes
tDOE : output enable access time, the time required to obtain valid data after
oe n is activated
oe_n
tHZOE : output enable to high-Z time, the time for the tri-state buffer to enter
the high-impedance state after oe_n is deactivated
tLZOE : output enable to low-Z time, the time for the tri-state buffer to leave
the high-impedance state after oe_n is activated
34

EXTERNAL SRAM

Timing diagram of an
address-controlled read cycle
y

Timing diagram of an oe
oe_n
ncontrolled read cycle
35

EXTERNAL SRAM
The relevant timing parameters for a we_n
we n-controlled
controlled write operation are:
tWC: write cycle time, the minimal elapsed time between two write operations
tSA : address setup time,
time the minimal time that the address must be stable
before we_n is activated
tHA : address hold time,, the minimal time that the address must be stable after
we_n is deactivated
TPWE1: we_n pulse width, the minimal time that wen must be asserted
tSD : data setup time, the minimal time that data must be stable before the
latching edge (the edge in which we_n moves from '0' to '1')
tHD : data hold time, the minimal time that data must be stable after the
latching edge

36

EXTERNAL SRAM

Timing diagram of a write cycle

Timing parameter in ns for IS61LV25616AL


37

EXTERNAL SRAM
Role of an SRAM memoryy controller

38

EXTERNAL SRAM
addr : 18-bit address
data_f2s: 16-bit data to be
written to SRAM
data_s2f_r: 16-bit registered
g
data retrieved from the SRAM
data_s2f_ur: is the 16-bit
unregistered data retrieved from
SRAM
mem: asserted to 1 to initiate
a memory operation
wr: specifies the operationread 1 or write 0
ready: indicates whether the
controller is ready to accept a
new command

39

Block diagram of memory controller

EXTERNAL SRAM
Timing requirement/Control sequence
read cycle: operation sequence- we_n is deactivated during the entire operation
1. Place the address on ad bus and activate oe_n. (these two signals must be stable
f the
for
th entire
ti operation)
ti )
2. Wait for at least tAA (data from SRAM becomes available after this interval)
3 Retrieve the data from dio and deactivate oe_n
3.
oe n
write cycle : The basic operation sequence1. Place address on ad bus and data on dio bus and activate we_n
we n signal (these
signals must be stable for the entire operation)
2. Wait for at least tPWE1
3. Deactivate we_n signal, data is latched to SRAM at the '0'to'1' edge
4. Remove the data from dio bus

40

EXTERNAL SRAM
A SAFE DESIGN
1 The
1.
Th design
d i provides
id llarge timing
i i
margins and does not impose any
stringent timing constraints
2. With the considered design
datapath, the task is to derive the
controller
ASMD chart:
FSM has five states
idle state - starts memory
operation when the mem is
activated.
r_w determines whether it is a read
or write operation

41

EXTERNAL SRAM
For a read operation:
The memory address, addr, is sampled and stored in addr
addr-reg
reg at the transition,
FSM moves to the rd_l state
oe_n is activated in rd_l and rd_2 states
At the end of read cycle, retrieved data is stored in data-s2f -reg at the
transition, and oe_n is deactivated; FSM returns to idle state
Two read portsports
data_s2f_r is a registered output: available after FSM exits rd_2 state and data
remains unchanged until the end of the next read cycle
data_s2f_ur is connected directly to dio bus , data is removed after FSM enters
idle state
42

EXTERNAL SRAM
For a write operation:
The memory address, addr, and data, data_f2s,
are sampled and stored in the addr_reg and
data_f2s_regg at the transition; the FSM moves
to the wr_l state
we_n and tri_n both are activated in wr_1
(enables tri-state
tri state buffer to put data over SRAMs
SRAM s
dio bus)
FSM moves to wr2 state - we_n is deactivated
but tri_n remains asserted (ensures that the data
is properly latched to SRAM when we_n
changes
g 0 to 1))
FSM returns to idle state and tri_n is deactivated
to remove data from dio bus

43

EXTERNAL SRAM
Timing analysis
FSM is controlled by a 50 MHz clock and thus stays in each state for 20 ns
During the read cycle: oe_n is asserted for two states, totaling 40 ns, which
provides a 30-ns margin
p
g over the 10 ns tAA
oe_n can be de-asserted in rd_2 under a more stringent timing constraint design
The data is stored in the data_s2f register when the FSM moves from the rd_2
state to the idle state
Although oe_n is de-asserted at the transition, the data remains valid for a small
interval because of tHZOE delay of SRAM and can be sampled by clock edge
During the write cycle: we_n is asserted in wr_l state, and the 20 ns interval
exceeds the 8 ns tPWE1 requirement
tri_n signal remains asserted in wr_2 state and thus ensures that the data is still
stable during the 0 to1 transition edge of we_n
44

EXTERNAL SRAM
P f
Performance:
Both read and write operations take two clock cycles to complete
D
During
i the
th read
d operation,
ti the
th unregistered
i t d data
d t is
i available
il bl just
j t before
b f
th
the
rising edge of the second clock cycle and the registered data is available
after the risingg edge
g of the second clock cycle
y
Both read and write operations return to the idle state after completion
and thus the back-to-back memory access takes three clock cycles
HDL Implementation:
HDL code can be derived from the block diagram and the ASMD chart

45

EXTERNAL SRAM
entity sram_ctrl
sram ctrl is
port(
clk, reset: in std_logic;
-- to/from main system
mem: in std_logic;
rw: in std_logic;
addr: in std_logic_vector(17 downto 0);
data f2s: in std
data_f2s:
std_logic_vector(15
logic vector(15 downto 0);
ready: out std_logic;
data_s2f_r, data_s2f_ur: out std_logic_vector(15 downto 0);
-- to/from chip
ad:
d out std_logic_vector(17
d l i
(17 downto
d
0);
0)
we_n, oe_n: out std_logic;
dio: inout std_logic_vector(15 downto 0);
ce_n,, ub_n,, lb_n: out std_logic
g
);
end sram_ctrl;

46

EXTERNAL SRAM

-- next-state logic
process(state_reg,mem,rw,dio_a,addr,data_f2s,
data_f2s_reg,data_s2f_reg,addr_reg)
begin
addr_next <= addr_reg;
data f2s next <= data
data_f2s_next
data_f2s_reg;
f2s reg;
data_s2f_next <= data_s2f_reg;
ready <= '0';
case state_reg is
when idle =>
if mem='0' then
state_next <= idle;
else
addr_next <= addr;
if rw='0' then --write
state_next <= wr1;
d
data_f2s_next
f2
<= data_f2s;
d
f2
else -- read
state_next <= rd1;
end if;;
end if;
ready <= '1';

47

EXTERNAL SRAM

when wr1 =>


state_next <= wr2;
when wr2 =>
state_next <= idle;
when rd1 =>
state next <= rd2;
state_next
when rd2=>
data_s2f_next <= dio_a;
state_next <= idle;
end case;
end process;

-- next-state logic
process(state next)
process(state_next)
begin
tri_buf <= '1'; -- signals are active low
we_buf <= '1';
oe_buf <= '1';
case state_next is
when idle =>
when wr1 =>
tri_buf <= '0';
we_buf <= '0';
when wr2 =>
tri_buf
i b f<
<= '0';
'0'
when rd1 =>
oe_buf <= '0';
when rd2=>
oe_buf <= '0';
end case;
end process;

48

EXTERNAL SRAM
Basic testing circuit
To perform a read or write operation
operation, the addition signals required are
sw : 8 bits wide and used as data or address input
led
d:8b
bitss w
wide
de aand
d used
sed to
od
display
sp ay thee retrieved
e eved da
dataa
btn(0) : When it is asserted, the current value of sw is loaded to a data register.
The output of the register is used as the data input for the write operation
btn(1) : When it is asserted, the controller uses the value of sw as a memory
address and performs a write operation
btn (2) : When it is asserted, the controller uses the value of sw as a memory
address and performs a read operation. The readout is routed to the led signal.
During a write operation, first specify the data value and load it to the internal
register and then specify the address and initiate the write operation.
operation
During a read operation, specify the address and initiate the read operation. The
49
retrieved data is displayed in eight discrete LEDs.

EXTERNAL SRAM
MORE AGGRESSIVE DESIGN
The designed memory controller functions properly but do not have optimal
performance
While both the read and write cycles are 10 ns of the SRAM device, the backto-back memory access of this controller takes 60 ns (i.e., three clock cycles)
Timing issues

Timing issues on asynchronous SRAM:


First issue: Deactivation of the we_n signal
Ensure that we_n is deactivated before data is removed from the bus
The data hold time (tHD) is zero for this SRAM and it appears that it is fine to
deactivate we_n and remove data at the same time; this approach is not reliable
because of the variations in propagation delays
50

EXTERNAL SRAM
Second issue: The potential conflict on the data bus,
bus dio.
dio
Fighting- Data bus is a bidirectional bus. The controller
places
l
d
data
on the
h bus
b during
d i a write
i operation,
i
and
d the
h
SRAM places data on the bus during a read operation. The
fighting may occurs if both place data on the bus at the
same time
This condition should be avoided to ensure reliable
operation
p

51

EXTERNAL SRAM
Estimation of p
propagation
p g
delayy
Difficulties in the determination of propagation delays
During synthesis, the final implementation may not resemble the block diagram
depicted by the initial description
Pad delays: A memory operation involves off-chip data access. Additional
propagation delay is introduced when a signal propagates through the FPGAs
I/O pads.
The pad delay is usually much larger than the internal wiring delay and its exact
value depends on a variety of factors,
factors including the type of FPGA device,
device the
location of the output register (in LE or IOB), the I/O standards, external
loading etc.
I i
Intimate
k
knowledge
l d off FPGA device
d i and
d synthesis
h i software,
f
to perform
f
timing
i i
analysis and to estimate the propagation delays of various signals, is required
52

EXTERNAL SRAM
Alternative design
g I
The design is targeted to reduce
the back-to-back operation
overhead.
overhead
Instead of always returning to
the idle state, the memory
controller
t ll check
h k the
th mem
signal at the end of current
memory operation (rd2 or
wr2) and determine what to
do next
Initiates a new memory
operation immediately if there
is a pending request

53

EXTERNAL SRAM
Timing
g analysis
y
Skipping the idle state may introduce fighting on the data bus when different
types of back-to-back memory operations are performed. Consider a write
operation performed immediately after a read operation
During the read operation- the signal flows from SRAM to FPGA; the tristate buffer of the SRAM should be turned on (passing signal) and the
tri-state buffer of the FPGA should be turned off (high impedance)
During the write operation- the signal flows from FPGA to SRAM, and the
roles of the two tri
tri-state
state buffers are reversed
A small delay is required to turn on or off a tri-state buffer
In SRAM, these delays are specified by tHZOE and tLZOE
The design requires the two tristate buffers to reverse directions simultaneously
during back-to-back operations.
54

EXTERNAL SRAM
Example:
p
When moving from rd_2 to wr_1,
a fighting may occur in transition
if SRAMs tri-state
tri state buffer
b ffer is turned
t rned
off too slowly or FPGAs tri-state
buffer is turned on too quickly
Similarly, fighting may occur when
a read operation is performed
immediately
ed ate y aafter
te a w
write
te ope
operation
at o
A detailed timing analysis is required
to examine whether fighting
occurs, and may even need to finetune the timing to fix the problem

55

EXTERNAL SRAM
Alternative design II
In the first design, a memory operation
takes two clock cycles, which amount to
40 ns.
The read and write cycles of the SRAM
are each 10 ns
To
T reduce
d
the
h operation
i time
i
to a single
i l
20-ns clock cycle, eliminate the rd2 and
wr2 states
Proposed design takes one clock cycle to
complete the memory access and requires
two clock cycles to complete the back
back-toto
back operations
56

EXTERNAL SRAM

Timing analysis

Reducingg a state imposes


p
much tighter
g
timingg constraints for both read and write
Read operation:
The address first propagates through FPGAs I/O pads to SRAM's address bus
The retrieved data then propagates back through I/O pads to FPGA's internal
logic. (Pad delay of a Spartan-3 device range from 4 ns to more than 10 ns)
10-ns
0 ns SRAM
SR M address access time (tAA), the cycle must accommodate two pad
delays.
All of this must be completed within a 20-ns clock cycle
Write operation:
one-way- only needs to propagate address, data, and control signals to SRAM
Order of signals being activated and deactivated: we
we_n
n and tri_n
tri n signals are
deactivated simultaneously at the end of the wr_l state
57
A fine-tune the synthesis is needed to achieve this margin

EXTERNAL SRAM
Combine the features
of alternative designs

58

EXTERNAL SRAM:
SRAM FPGA Features
F t
Advanced FPGA features (Xilinx)
The
Th memory controller
ll examples
l ill
illustrate the
h lilimitations
i i
off the
h FSM
FSM-based
b d
controller and synchronous design methodology
Basically, an FSM cannot generate a control sequence that is finer than the
period of its clock signal
The operation of these alternative designs relies on factors that cannot be
specified
p
byy an RT-level HDL description
p
Due to the variations in propagation delays, the synthesized circuits are not
reliable and may or may not work
There are some ad hoc strategies to obtain better control. These features are
usually device and software dependent.
For example, the digital clock manager (DCM) circuit and input/output
block (IOB) of the Spartan-3
Spartan 3 device can help to remedy some of the
previously discussed problems
59

EXTERNAL SRAM:
SRAM FPGA Features
F t

60

61