Professional Documents
Culture Documents
MODULES UART
Universal asynchronous receiver and transmitter (UART):
Frequently used in conjunction with RS-232 standard, which specifies the
characteristics of two data communication equipment
A UART includes
i l d a transmitter
i
and
d a receiver.
i
Transmitter - essentially a special shift register that loads data in parallel and
then shifts it out bit by bit at a specific rate
Receiver - shifts in data bit by bit and then reassembles the data
Serial line is 1 when it is idle. The transmission starts with a start bit, which is 0,
0
followed by data bits and an optional parity bit, and ends with stop bits, which are 1
Number of data bits can be 6,7, or 8. The optional parity bit is used for error
detection. Number of stop bits can be 1, 1.5, or 2
2
UART
Transmission of a byte.
byte
entity uart_rx is
generic(
DBIT: integer:=8;
g
; -- # data bits
SB_TICK: integer:=16 -- # ticks for stop bits
);
port(
clk reset: in std_logic;
clk,
std logic;
rx: in std_logic;
s_tick: in std_logic;
rx_done_tick: out std_logic;
dout: out std_logic_vector(7 downto 0)
);
end uart_rx ;
architecture arch of uart_rx is
type state_type is (idle, start, data, stop);
signal state_reg, state_next: state_type;
signal s_reg, s_next: unsigned(3 downto 0);
signal n_reg, n_next: unsigned(2 downto 0);
signal b_reg, b_next: std_logic_vector(7 downto 0);
11
A flag FF
12
A FIFO buffer
13
PS2 KEYBOARD
PS2:
PS2 port was introduced in IBMs Personal System/2 personnel computers
It support interface for a keyboard and mouse to communicate with the host
The port contains two wires
one wire is for data, which is transmitted in a serial stream.
other wire is for the clock information, which specifies when the data is
valid
Information is transmitted as an 11-bit packet
a start bit,, 8 data bits,, a parity
p y bit,, and a stop
p bit
PS2 port is bidirectional
our discussion is limited to one direction, from the keyboard to the FPGA
board
FPGA prototyping board has a PS2 port and acts as a host
17
PS2 Receiving
PS2:
R
i i Sub-system
S b
t
Physical interface of a PS2 port:
In addition to data and clock lines, the PS2 port includes connections for
power (Vcc)
(V ) andd ground.
d
The power is supplied by the host
Original PS2 port,
port Vcc is 5 V
Current keyboards and mouse work with 3.3 V
FPGA should function p
properly
p y since its I/O p
pins can tolerate 5-V input
p
18
PS2 Receiving
PS2:
R
i i Sub-system
S b
t
Device-to-host communication protocol:
A PS2 device
d i and
d iits h
host communicate
i
via
i packets.
k
ps2d and ps2c:
data and clock signals
PS2 Receiving
PS2:
R
i i Sub-system
S b
t
Design
g and code
Design of the PS2 port receiving subsystem is similar to a UART receiver
Instead of using the over-sampling scheme, the falling-edge of the ps2c signal
is used as the reference point to retrieve data
Subsystem includes a falling edge detection circuit for the ps2c signal, and the
receiver, which shifts in and assembles the serial bits
The edge detection circuit discussed earlier can be used to detect the falling
edge
Because of the potential noise and slow transition, a simple filtering circuit is
added to eliminate glitches
20
PS2 Receiving
PS2:
R
i i Sub-system
S b
t
21
PS2: Receiving
R
i i Sub-system
S b
t
architecture arch of ps2_rx is
type statetype is (idle,
(idle dps,
dps load);
signal state_reg, state_next: statetype;
signal filter_reg, filter_next:
std_logic_vector(7 downto 0);
signal
i l ff_ps2c_reg,f_ps2c_next:
2
f 2
std_logic;
d l i
signal b_reg, b_next: std_logic_vector(10 downto 0);
signal n_reg,n_next: unsigned(3 downto 0);
g fall_edge:
_ g std_logic;
_ g ;
signal
-- filter and falling edge detection
-- fsmd to extract the 8-bit data
-- next-state
n t t t logic
l i
process(state_reg,n_reg,b_reg,fall_edge,rx_en,ps2d)
begin
rx_done_tick <='0';
state_next <= state_reg;
n_next <= n_reg;
b_next <= b_reg;
entity ps2_rx is
port (
clk, reset: in std_logic;
ps2d,, ps2c:
p
p
in std_logic;
g ; -- keyy data,, keyy clock
rx_en: in std_logic;
rx_done_tick: out std_logic;
dout: out std_logic_vector(7 downto 0)
)
);
end ps2_rx;
22
case state_reg is
when idle =>
if fall
fall_edge=
edge='1'
1 and rx
rx_en=
en='1'
1 then
b_next <= ps2d & b_reg(10 downto 1); -- shift in start bit
n_next <= "1001";
state_next <= dps;
PS2: Receiving
end if;
when dps =>
-- 8 data + 1 pairty + 1 stop
if fall_edge='1' then
b next <= ps2d & b
b_next
b_reg(10
reg(10 downto 1);
if n_reg = 0 then
state_next <=load;
else
n_next <= n_reg - 1;
1
end if;
end if;
when load =>
state_next <= idle;
rx_done_tick <='1';
end case;
end process;
-- output
dout <= b_reg(8 downto 1); -- data bits
Sub-system
Sub system
23
PS2 KEYBOARD
PS2:
O
SCAN CODE
O
Overview of scan code
A keyboard consists of a matrix of keys and an embedded microcontroller that
monitors (scans) the activities of the keys and sends scan code accordingly
A key is pressed
pressed, the make code of the key is transmitted
A key is held down continuously (typematic) the make code is transmitted
repeatedly at a specific rate
When a key is released, the break code of the key is transmitted.
The make code of the main p
part of a PS2 keyboard
y
is normallyy 1 byte
y wide and
represented by two hexadecimal numbers. The make codes of special-purpose
keys, which are known as the extended keys, can have 2 to 4 bytes
Th b
The
breakk codes
d off the
h regular
l kkeys consist
i off F0 followed
f ll
d by
b the
h make
k code
d off
the key (break code of A key is F0 1C ).
24
27
EXTERNAL SRAM
Random access memory (RAM) is used for massive storage in a digital
system since a RAM cell is much simpler than an FF cell
A commonly used type of RAM is the asynchronous static RAM (SRAM)
Unlike a register,
g
, in which the data is sampled
p
and stored at an edge
g of a
clock signal, accessing data from an asynchronous SRAM is more
complicated
A read or write operation requires that the data, address, and control signals
be asserted in a specific order, and these signals must be stable for a certain
amount off time
i during
d i the
h operation
i
28
EXTERNAL SRAM
29
EXTERNAL SRAM
It is difficult for a synchronous system to access an SRAM directly.
directly Usually
memory controller is used as an interface
Controller takes commands from the main system synchronously and then
generates properly timed signals to access the SRAM
Controller shields the main system from the detailed timing and makes the
memory access appears like a synchronous operation
Performance of a memory controller is measured by the number of
memory accesses that can be completed in a given period
Designing a simple memory controller with optimal performance involves
many timing issues
30
EXTERNAL SRAM
Device:
256K-by-16 asynchronous
SRAM available with many xilinx board
g
is
The memoryy controller design
targeted to IS61LV25616AL SRAM
device
Timing characteristics of each RAM
device are different hence controller
d i may
design
m have
h
th variations
the
ri ti
as per
p r the
th
device
Same design principle can be used for
similar SRAM devices
EXTERNAL SRAM
Functional Table
32
EXTERNAL SRAM
For simplicity and clarity,
clarity use
one SRAM device
access the SRAM in 16-bit word format
ce_n , lb_
lb n , and
d ub_
b n signals
i l should
h ld always
l
b
be activated
i
d
EXTERNAL SRAM
Timing characteristics
F key
Few
k timing
i i parameters off an asynchronous
h
SRAM relevant
l
to this
hi ddesign:
i
tRC : read cycle time, the minimal elapsed time between two read operations
tA A : address access time,
time the time required to obtain stable output data after
an address change
tOHA : output hold time, the time that the output data remains valid after the
address changes
tDOE : output enable access time, the time required to obtain valid data after
oe n is activated
oe_n
tHZOE : output enable to high-Z time, the time for the tri-state buffer to enter
the high-impedance state after oe_n is deactivated
tLZOE : output enable to low-Z time, the time for the tri-state buffer to leave
the high-impedance state after oe_n is activated
34
EXTERNAL SRAM
Timing diagram of an
address-controlled read cycle
y
Timing diagram of an oe
oe_n
ncontrolled read cycle
35
EXTERNAL SRAM
The relevant timing parameters for a we_n
we n-controlled
controlled write operation are:
tWC: write cycle time, the minimal elapsed time between two write operations
tSA : address setup time,
time the minimal time that the address must be stable
before we_n is activated
tHA : address hold time,, the minimal time that the address must be stable after
we_n is deactivated
TPWE1: we_n pulse width, the minimal time that wen must be asserted
tSD : data setup time, the minimal time that data must be stable before the
latching edge (the edge in which we_n moves from '0' to '1')
tHD : data hold time, the minimal time that data must be stable after the
latching edge
36
EXTERNAL SRAM
EXTERNAL SRAM
Role of an SRAM memoryy controller
38
EXTERNAL SRAM
addr : 18-bit address
data_f2s: 16-bit data to be
written to SRAM
data_s2f_r: 16-bit registered
g
data retrieved from the SRAM
data_s2f_ur: is the 16-bit
unregistered data retrieved from
SRAM
mem: asserted to 1 to initiate
a memory operation
wr: specifies the operationread 1 or write 0
ready: indicates whether the
controller is ready to accept a
new command
39
EXTERNAL SRAM
Timing requirement/Control sequence
read cycle: operation sequence- we_n is deactivated during the entire operation
1. Place the address on ad bus and activate oe_n. (these two signals must be stable
f the
for
th entire
ti operation)
ti )
2. Wait for at least tAA (data from SRAM becomes available after this interval)
3 Retrieve the data from dio and deactivate oe_n
3.
oe n
write cycle : The basic operation sequence1. Place address on ad bus and data on dio bus and activate we_n
we n signal (these
signals must be stable for the entire operation)
2. Wait for at least tPWE1
3. Deactivate we_n signal, data is latched to SRAM at the '0'to'1' edge
4. Remove the data from dio bus
40
EXTERNAL SRAM
A SAFE DESIGN
1 The
1.
Th design
d i provides
id llarge timing
i i
margins and does not impose any
stringent timing constraints
2. With the considered design
datapath, the task is to derive the
controller
ASMD chart:
FSM has five states
idle state - starts memory
operation when the mem is
activated.
r_w determines whether it is a read
or write operation
41
EXTERNAL SRAM
For a read operation:
The memory address, addr, is sampled and stored in addr
addr-reg
reg at the transition,
FSM moves to the rd_l state
oe_n is activated in rd_l and rd_2 states
At the end of read cycle, retrieved data is stored in data-s2f -reg at the
transition, and oe_n is deactivated; FSM returns to idle state
Two read portsports
data_s2f_r is a registered output: available after FSM exits rd_2 state and data
remains unchanged until the end of the next read cycle
data_s2f_ur is connected directly to dio bus , data is removed after FSM enters
idle state
42
EXTERNAL SRAM
For a write operation:
The memory address, addr, and data, data_f2s,
are sampled and stored in the addr_reg and
data_f2s_regg at the transition; the FSM moves
to the wr_l state
we_n and tri_n both are activated in wr_1
(enables tri-state
tri state buffer to put data over SRAMs
SRAM s
dio bus)
FSM moves to wr2 state - we_n is deactivated
but tri_n remains asserted (ensures that the data
is properly latched to SRAM when we_n
changes
g 0 to 1))
FSM returns to idle state and tri_n is deactivated
to remove data from dio bus
43
EXTERNAL SRAM
Timing analysis
FSM is controlled by a 50 MHz clock and thus stays in each state for 20 ns
During the read cycle: oe_n is asserted for two states, totaling 40 ns, which
provides a 30-ns margin
p
g over the 10 ns tAA
oe_n can be de-asserted in rd_2 under a more stringent timing constraint design
The data is stored in the data_s2f register when the FSM moves from the rd_2
state to the idle state
Although oe_n is de-asserted at the transition, the data remains valid for a small
interval because of tHZOE delay of SRAM and can be sampled by clock edge
During the write cycle: we_n is asserted in wr_l state, and the 20 ns interval
exceeds the 8 ns tPWE1 requirement
tri_n signal remains asserted in wr_2 state and thus ensures that the data is still
stable during the 0 to1 transition edge of we_n
44
EXTERNAL SRAM
P f
Performance:
Both read and write operations take two clock cycles to complete
D
During
i the
th read
d operation,
ti the
th unregistered
i t d data
d t is
i available
il bl just
j t before
b f
th
the
rising edge of the second clock cycle and the registered data is available
after the risingg edge
g of the second clock cycle
y
Both read and write operations return to the idle state after completion
and thus the back-to-back memory access takes three clock cycles
HDL Implementation:
HDL code can be derived from the block diagram and the ASMD chart
45
EXTERNAL SRAM
entity sram_ctrl
sram ctrl is
port(
clk, reset: in std_logic;
-- to/from main system
mem: in std_logic;
rw: in std_logic;
addr: in std_logic_vector(17 downto 0);
data f2s: in std
data_f2s:
std_logic_vector(15
logic vector(15 downto 0);
ready: out std_logic;
data_s2f_r, data_s2f_ur: out std_logic_vector(15 downto 0);
-- to/from chip
ad:
d out std_logic_vector(17
d l i
(17 downto
d
0);
0)
we_n, oe_n: out std_logic;
dio: inout std_logic_vector(15 downto 0);
ce_n,, ub_n,, lb_n: out std_logic
g
);
end sram_ctrl;
46
EXTERNAL SRAM
-- next-state logic
process(state_reg,mem,rw,dio_a,addr,data_f2s,
data_f2s_reg,data_s2f_reg,addr_reg)
begin
addr_next <= addr_reg;
data f2s next <= data
data_f2s_next
data_f2s_reg;
f2s reg;
data_s2f_next <= data_s2f_reg;
ready <= '0';
case state_reg is
when idle =>
if mem='0' then
state_next <= idle;
else
addr_next <= addr;
if rw='0' then --write
state_next <= wr1;
d
data_f2s_next
f2
<= data_f2s;
d
f2
else -- read
state_next <= rd1;
end if;;
end if;
ready <= '1';
47
EXTERNAL SRAM
-- next-state logic
process(state next)
process(state_next)
begin
tri_buf <= '1'; -- signals are active low
we_buf <= '1';
oe_buf <= '1';
case state_next is
when idle =>
when wr1 =>
tri_buf <= '0';
we_buf <= '0';
when wr2 =>
tri_buf
i b f<
<= '0';
'0'
when rd1 =>
oe_buf <= '0';
when rd2=>
oe_buf <= '0';
end case;
end process;
48
EXTERNAL SRAM
Basic testing circuit
To perform a read or write operation
operation, the addition signals required are
sw : 8 bits wide and used as data or address input
led
d:8b
bitss w
wide
de aand
d used
sed to
od
display
sp ay thee retrieved
e eved da
dataa
btn(0) : When it is asserted, the current value of sw is loaded to a data register.
The output of the register is used as the data input for the write operation
btn(1) : When it is asserted, the controller uses the value of sw as a memory
address and performs a write operation
btn (2) : When it is asserted, the controller uses the value of sw as a memory
address and performs a read operation. The readout is routed to the led signal.
During a write operation, first specify the data value and load it to the internal
register and then specify the address and initiate the write operation.
operation
During a read operation, specify the address and initiate the read operation. The
49
retrieved data is displayed in eight discrete LEDs.
EXTERNAL SRAM
MORE AGGRESSIVE DESIGN
The designed memory controller functions properly but do not have optimal
performance
While both the read and write cycles are 10 ns of the SRAM device, the backto-back memory access of this controller takes 60 ns (i.e., three clock cycles)
Timing issues
EXTERNAL SRAM
Second issue: The potential conflict on the data bus,
bus dio.
dio
Fighting- Data bus is a bidirectional bus. The controller
places
l
d
data
on the
h bus
b during
d i a write
i operation,
i
and
d the
h
SRAM places data on the bus during a read operation. The
fighting may occurs if both place data on the bus at the
same time
This condition should be avoided to ensure reliable
operation
p
51
EXTERNAL SRAM
Estimation of p
propagation
p g
delayy
Difficulties in the determination of propagation delays
During synthesis, the final implementation may not resemble the block diagram
depicted by the initial description
Pad delays: A memory operation involves off-chip data access. Additional
propagation delay is introduced when a signal propagates through the FPGAs
I/O pads.
The pad delay is usually much larger than the internal wiring delay and its exact
value depends on a variety of factors,
factors including the type of FPGA device,
device the
location of the output register (in LE or IOB), the I/O standards, external
loading etc.
I i
Intimate
k
knowledge
l d off FPGA device
d i and
d synthesis
h i software,
f
to perform
f
timing
i i
analysis and to estimate the propagation delays of various signals, is required
52
EXTERNAL SRAM
Alternative design
g I
The design is targeted to reduce
the back-to-back operation
overhead.
overhead
Instead of always returning to
the idle state, the memory
controller
t ll check
h k the
th mem
signal at the end of current
memory operation (rd2 or
wr2) and determine what to
do next
Initiates a new memory
operation immediately if there
is a pending request
53
EXTERNAL SRAM
Timing
g analysis
y
Skipping the idle state may introduce fighting on the data bus when different
types of back-to-back memory operations are performed. Consider a write
operation performed immediately after a read operation
During the read operation- the signal flows from SRAM to FPGA; the tristate buffer of the SRAM should be turned on (passing signal) and the
tri-state buffer of the FPGA should be turned off (high impedance)
During the write operation- the signal flows from FPGA to SRAM, and the
roles of the two tri
tri-state
state buffers are reversed
A small delay is required to turn on or off a tri-state buffer
In SRAM, these delays are specified by tHZOE and tLZOE
The design requires the two tristate buffers to reverse directions simultaneously
during back-to-back operations.
54
EXTERNAL SRAM
Example:
p
When moving from rd_2 to wr_1,
a fighting may occur in transition
if SRAMs tri-state
tri state buffer
b ffer is turned
t rned
off too slowly or FPGAs tri-state
buffer is turned on too quickly
Similarly, fighting may occur when
a read operation is performed
immediately
ed ate y aafter
te a w
write
te ope
operation
at o
A detailed timing analysis is required
to examine whether fighting
occurs, and may even need to finetune the timing to fix the problem
55
EXTERNAL SRAM
Alternative design II
In the first design, a memory operation
takes two clock cycles, which amount to
40 ns.
The read and write cycles of the SRAM
are each 10 ns
To
T reduce
d
the
h operation
i time
i
to a single
i l
20-ns clock cycle, eliminate the rd2 and
wr2 states
Proposed design takes one clock cycle to
complete the memory access and requires
two clock cycles to complete the back
back-toto
back operations
56
EXTERNAL SRAM
Timing analysis
EXTERNAL SRAM
Combine the features
of alternative designs
58
EXTERNAL SRAM:
SRAM FPGA Features
F t
Advanced FPGA features (Xilinx)
The
Th memory controller
ll examples
l ill
illustrate the
h lilimitations
i i
off the
h FSM
FSM-based
b d
controller and synchronous design methodology
Basically, an FSM cannot generate a control sequence that is finer than the
period of its clock signal
The operation of these alternative designs relies on factors that cannot be
specified
p
byy an RT-level HDL description
p
Due to the variations in propagation delays, the synthesized circuits are not
reliable and may or may not work
There are some ad hoc strategies to obtain better control. These features are
usually device and software dependent.
For example, the digital clock manager (DCM) circuit and input/output
block (IOB) of the Spartan-3
Spartan 3 device can help to remedy some of the
previously discussed problems
59
EXTERNAL SRAM:
SRAM FPGA Features
F t
60
61