
Computer Architecture | GROUP 3

DRAM
Group members:
Dương Quang Thịnh 17119104
Trần Tiến Đạt 17119068
Nguyễn Đăng Khoa 17119085
Vũ Tấn Khoa 17119086
Võ Quang Linh 17119089
Lê Trí Dũng 17119065
Nguyễn Hữu Trọng 17119109
Nguyễn Thành Tín 17119106
Nguyễn Tấn Đạt 17119117
Bùi Thị Diễm 17119062

Contents
CHAPTER 1: OVERVIEW OF DRAMS
1.1 Definition
1.2 DRAM System Organization
1.3 The role of DRAM
1.4 Principles of operation
1.5 DRAM types
1.6 Advantages and disadvantages of DRAM
CHAPTER 2: DRAM’S STRUCTURE
2.1 DIMM, Channel and more
2.2 Command
CHAPTER 3: MEMORY CONTROLLER
3.1 Controller
3.2 Row-buffer-management policies
3.3 Address Mapping
3.4 Command Queue
3.5 Latency
3.6 Refresh
CHAPTER 4: DRAMSim2 Memory Simulator Overview
4.1 DRAMSim2 Inputs
CHAPTER 1: OVERVIEW OF DRAMS
1.1 Definition
Dynamic random-access memory (DRAM) is a type of random-access semiconductor memory that
stores each bit of data in a memory cell consisting of a tiny capacitor and a transistor, both typically
based on metal-oxide-semiconductor (MOS) technology.

DRAM typically takes the form of an integrated circuit chip, which can consist of dozens to billions
of DRAM memory cells. DRAM chips are widely used in digital electronics where low-cost and
high-capacity computer memory is required. One of the largest applications for DRAM is the main
memory (colloquially called the "RAM") in modern computers and graphics cards (where the "main
memory" is called the graphics memory). It is also used in many portable devices and video
game consoles. In contrast, SRAM, which is faster and more expensive than DRAM, is typically used
where speed is of greater concern than cost and size, such as the cache memories in processors.

Due to its need of a system to perform refreshing, DRAM has more complicated circuitry and timing
requirements than SRAM, but it is much more widely used. The advantage of DRAM is the structural
simplicity of its memory cells: only one transistor and a capacitor are required per bit, compared to
four or six transistors in SRAM. This allows DRAM to reach very high densities, making DRAM
much cheaper per bit. The transistors and capacitors used are extremely small; billions can fit on a
single memory chip. Due to the dynamic nature of its memory cells, DRAM consumes relatively
large amounts of power, and a variety of techniques exist to manage its power consumption.

DRAM saw a 47% increase in price per bit in 2017, the largest jump in 30 years since the 45%
jump in 1988, while in recent years the price has otherwise been going down.

Figure 1.1
1.2 DRAM System Organization

Figure 1.2

Figure 1.2 shows a JEDEC-style memory bus organization: a system with a memory controller and
two memory modules sharing a 16-bit data bus and an 8-bit address and command bus.

Figure 1.3
A DRAM cell consists of a capacitor connected by a pass transistor to the bit line (also called the
digit line or column line). The digit line is connected to a multitude of cells arranged in a column,
while the word line (or row line) is connected to a multitude of cells arranged in a row. If the
word line is asserted, the pass transistor T1 in Figure 1.3 is opened and the capacitor C1 is
connected to the bit line.

The DRAM memory cell stores binary information in the form of a charge on the capacitor. The
capacitor's common node is biased at approximately VCC/2, so the cell holds a charge of
Q = ±(VCC/2)·Ccell, where Ccell is the capacitance of the cell capacitor: Q = +(VCC/2)·Ccell if the
cell stores a 1, and Q = -(VCC/2)·Ccell if it stores a 0. Various leakage currents slowly remove this
charge, making a refresh operation necessary.

If we open the pass transistor by asserting the word line, the charge redistributes over the digit
line, causing a voltage change. Writing Vsignal for the observed voltage change on the digit line,
Vcell for the voltage stored on the cell, Ccell for the capacitance of the DRAM cell capacitor, and
Cline for the capacitance of the digit line, the voltage change is given by

Vsignal = Vcell · Ccell / (Ccell + Cline)

For example, if VCC is 3.3 V, then Vcell is 1.65 V. Typical values for the capacitances are Cline = 300 fF
and Ccell = 50 fF, which leads to a signal of roughly 235 mV. When a DRAM cell is accessed, it thus
shares its charge with the digit line.
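
As a quick check of these numbers, the charge-sharing equation can be evaluated directly. Below is a minimal C++ sketch using the example values quoted above (VCC = 3.3 V, Ccell = 50 fF, Cline = 300 fF); these are the illustrative values from the text, not data for a specific device:

    #include <cstdio>

    int main() {
        double Vcc   = 3.3;        // supply voltage in volts
        double Vcell = Vcc / 2.0;  // stored cell voltage relative to the VCC/2 bias point
        double Ccell = 50e-15;     // cell capacitance: 50 fF
        double Cline = 300e-15;    // digit-line capacitance: 300 fF

        // Charge sharing: Vsignal = Vcell * Ccell / (Ccell + Cline)
        double Vsignal = Vcell * Ccell / (Ccell + Cline);
        std::printf("Vsignal = %.1f mV\n", Vsignal * 1e3);  // prints 235.7 mV
        return 0;
    }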
Figure 1.4

1.3 The role of DRAM


Figure 1.5 illustrates DRAM’s place in a typical PC. An individual DRAM device typically connects
indirectly to a CPU (i.e., a microprocessor) through a memory controller.

Figure 1.5

Thanks to increased server capacities and sophisticated caching technologies, DRAM can serve as a
tier in the storage infrastructure. Even mid-range server hardware is typically able to hold more than
1 TB of DRAM, and while that 1 TB might cost 3 to 5 times more than flash storage, its performance
characteristics are very attractive. This storage is also directly accessible from the CPU socket, with
no storage protocol interconnect in between: in other words, the lowest possible latency.

1.4 Principles of operation


DRAM is usually arranged in a rectangular array of charge storage cells, each consisting of one
capacitor and one transistor per data bit. A simple example is a four-by-four cell matrix, but some
DRAM matrices are many thousands of cells in height and width.
The long horizontal lines connecting each row are known as word-lines. Each column of cells is
served by two bit-lines, each connected to every other storage cell in the column. They are
generally known as the "+" and "−" bit-lines.

A sense amplifier is essentially a pair of cross-connected inverters between the bit-lines. The first
inverter is connected with input from the + bit-line and output to the − bit-line. The second inverter's
input is from the − bit-line with output to the + bit-line. This results in positive feedback which
stabilizes after one bit-line is fully at its highest voltage and the other bit-line is at the lowest possible
voltage.

To read a data bit from a DRAM storage cell, the following operations take place:

1. The sense amplifiers are disconnected.

2. The bit-lines are precharged to exactly equal voltages that are in between high and low logic
levels (e.g., 0.5 V if the two levels are 0 and 1 V). The bit-lines are physically symmetrical to keep
their capacitances equal, and therefore at this time their voltages are equal.

3. The precharge circuit is switched off. Because the bit-lines are relatively long, they have
enough capacitance to maintain the precharged voltage for a brief time. This is an example
of dynamic logic.

4. The desired row's word-line is then driven high to connect a cell's storage capacitor to its
bit-line. This causes the transistor to conduct, transferring charge from the storage cell to the
connected bit-line (if the stored value is 1) or from the connected bit-line to the storage cell (if
the stored value is 0). Since the capacitance of the bit-line is typically much higher than the
capacitance of the storage cell, the voltage on the bit-line increases very slightly if the storage
cell's capacitor is discharged and decreases very slightly if the storage cell is charged (e.g., 0.54
and 0.45 V in the two cases). As the other bit-line holds 0.50 V, there is a small voltage difference
between the two twisted bit-lines.

5. The sense amplifiers are now connected to the bit-line pairs. Positive feedback then occurs from
the cross-connected inverters, amplifying the small voltage difference between the odd and even
row bit-lines of a particular column until one bit-line is fully at the lowest voltage and the other is
at the maximum high voltage. Once this has happened, the row is "open" (the desired cell data is
available).

6. All storage cells in the open row are sensed simultaneously, and the sense amplifier outputs are
latched. A column address then selects which latch bit to connect to the external data bus. Reads
of different columns in the same row can be performed without a row-opening delay because, for
the open row, all data has already been sensed and latched.

7. While reading of columns in an open row is occurring, current is flowing back up the bit-lines
from the output of the sense amplifiers, recharging the storage cells. This reinforces (i.e.,
"refreshes") the charge in each storage cell, increasing the voltage in the storage capacitor if it was
charged to begin with, or keeping it discharged if it was empty. Note that due to the length of the
bit-lines there is a fairly long propagation delay for the charge to be transferred back to the cell's
capacitor, so this continues significantly past the end of sense amplification and thus overlaps with
one or more column reads.

8. When reading of all the columns in the current open row is done, the word-line is switched off
to disconnect the storage cell capacitors (the row is "closed") from the bit-lines. The sense
amplifier is switched off, and the bit-lines are precharged again.
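
The step sequence above can also be illustrated numerically. The C++ sketch below follows the bit-line voltage through precharge (step 2), charge sharing (step 4), and sensing (step 5). The capacitances are the assumed values carried over from Section 1.2, so the resulting swing (about ±71 mV) is larger than the ±40-50 mV example in step 4, which corresponds to a longer bit-line:

    #include <cstdio>

    int main() {
        double vdd = 1.0, vhalf = vdd / 2.0;    // logic levels 0 V / 1 V, precharge to 0.5 V
        double Ccell = 50e-15, Cline = 300e-15; // assumed capacitances from Section 1.2

        const int stored[] = {1, 0};
        for (int s : stored) {
            double vcellStart = s ? vdd : 0.0;
            // Step 4: word-line high, charge shares between cell and bit-line.
            double vbit = (vhalf * Cline + vcellStart * Ccell) / (Cline + Ccell);
            // Step 5: the sense amplifier resolves the difference against 0.5 V.
            int sensed = vbit > vhalf ? 1 : 0;
            std::printf("stored=%d  bit-line=%.3f V  sensed=%d\n", s, vbit, sensed);
        }
        return 0;
    }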

1.5 DRAM types


When looking at the memory technology itself, there is a good variety of different types of DRAM.
The main DRAM types are summarised below:

Asynchronous DRAM: Asynchronous DRAM is the basic type of DRAM on which all other types
are based. Asynchronous DRAMs have connections for power, address inputs, and bidirectional data
lines.

Although this type of DRAM is asynchronous, the system is run by a memory controller which is
clocked, and this limits the speed of the system to multiples of the clock rate. Nevertheless, the
operation of the DRAM itself is not synchronous.

There are various types of asynchronous DRAM within the overall family:
RAS-only refresh (ROR): This is the classic asynchronous DRAM refresh type, in which the device
is refreshed by opening each row in turn. The refresh cycles are spread across the overall refresh
interval, and an external counter is required to step through the rows sequentially.
CAS-before-RAS refresh (CBR): To reduce the amount of external circuitry, the counter required
for the refresh was incorporated into the chip itself. This became the standard refresh format for
asynchronous DRAM. (It is also the only form generally used with SDRAM.)

FPM DRAM: FPM DRAM or Fast Page Mode DRAM was designed to be faster than conventional
types of DRAM. As such it was the main type of DRAM used in PCs, although it is now well out of
date as it was only able to support memory bus speeds up to about 66 MHz.

EDO DRAM: Extended Data Out DRAM, EDO DRAM was a form of DRAM that provided a
performance increase over FPM DRAM. Yet this type of DRAM was still only able to operate at
speeds of up to about 66 MHz.

EDO DRAM is sometimes referred to as Hyper Page Mode DRAM because it is a development of
the FPM type of DRAM, to which it bears many similarities. The additional feature of EDO DRAM
is that a new access cycle can be started while the data output from the previous cycle is still
present. This type of DRAM begins its data output on the falling edge of the /CAS line, but does
not disable the output when /CAS rises again. Instead, it holds the output valid until either /RAS is
deasserted or a new /CAS falling edge selects a different column address. In some instances it was
possible to carry out a memory transaction in one clock cycle, or to improve from three clock
cycles to two, depending on the scenario and the memory used.

This provided the opportunity to considerably increase the level of memory performance while also
reducing costs.

BEDO DRAM: Burst EDO DRAM was a type of DRAM that improved on the performance of
straight EDO DRAM. The advantage of the BEDO DRAM type is that it could process four
memory addresses in one burst, saving three clock cycles compared to EDO memory. This was
done by adding an on-chip address counter to generate the next address.

BEDO DRAM also added a pipeline stage that allows the page-access cycle to be divided into two
components: the first component moves the data from the memory array to the output stage, and
the second component drives the data bus from this latch at the appropriate logic level.
Since the data is already in the output buffer, a faster access time is achieved - up to 50%
improvement compared to conventional EDO DRAM.

BEDO DRAM provided a significant improvement over previous types of DRAM, but by the time
it was introduced, SDRAM had been launched and had taken over the market. As a result, BEDO
DRAM saw little use.

SDRAM: Synchronous DRAM is a type of DRAM that is much faster than previous, conventional
forms of RAM and DRAM. It operates in a synchronous mode, synchronising its operation with
the clock of the CPU's memory bus.

RDRAM: Rambus DRAM is a type of DRAM that was developed by Rambus Inc., from which it
takes its name. It was a competitor to SDRAM and DDR SDRAM, and was able to operate at much
faster speeds than previous versions of DRAM.

1.6 Advantages and disadvantages of DRAM


 Advantages
 Very dense
 Low cost per bit
 Simple memory cell structure

 Disadvantages
 Complex manufacturing process
 Data requires refreshing
 More complex external circuitry required (read and refresh periodically)
 Volatile memory
 Relatively slow operational speed


 Comparison Chart

BASIS FOR COMPARISON            | SRAM                                      | DRAM                                                 | FLASH
Speed                           | Faster                                    | Medium                                               | Faster for read, slower for write
Size                            | Small                                     | Large                                                | Small
Cost                            | Expensive                                 | Cheap                                                | Cheap
Used in                         | Cache memory                              | Main memory                                          | Non-volatile memory
Density                         | Less dense                                | Highly dense                                         | Highly dense
Construction                    | Complex and uses transistors and latches  | Simple and uses capacitors and very few transistors  | Composed of logic gates
Single block of memory requires | 6 transistors                             | Only one transistor                                  | Two transistors
Charge leakage property         | Not present                               | Present, hence requires refresh circuitry            | Not present
Power consumption               | Low                                       | High                                                 | Low
CHAPTER 2: DRAM’S STRUCTURE

2.1 DIMM, Channel and more

DIMM: A DIMM (Dual In-line Memory Module) comprises a series of dynamic random-access
memory integrated circuits mounted on a printed circuit board, with chips on both sides.
CHANNEL: A channel connects the memory controller to one or more DIMMs; when the
computer selects a channel, it thereby addresses a particular set of DIMMs.
RANK: A rank is a subset of the DIMM; commonly a DIMM contains two ranks.
CHIP: A rank consists of multiple chips operating in lockstep, and each chip contains multiple
banks.
BANK: A bank is a two-dimensional array of rows and columns, known as a memory array.
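
One way to summarise this hierarchy is as a set of coordinates that together identify a single DRAM location. The C++ sketch below is purely illustrative; the field names follow the terms above, and the field widths are arbitrary assumptions:

    #include <cstdint>

    // One DRAM location, named from the top of the hierarchy down.
    struct DramCoordinates {
        uint8_t  channel;  // selects a channel, and with it a set of DIMMs
        uint8_t  dimm;     // selects a DIMM on that channel
        uint8_t  rank;     // selects a rank on the DIMM
        uint8_t  bank;     // selects a bank inside the rank's chips
        uint32_t row;      // selects a row inside the bank's memory array
        uint32_t column;   // selects a column inside the open row
    };

Note that the chip itself does not appear as a coordinate: all chips in a rank operate in lockstep, each contributing a slice of the data bus.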

2.2 Command
+ Activate command: opens a row (its contents are placed into the row buffer)
+ Read/write command: reads/writes a column in the row buffer
+ Precharge command: closes the row and prepares the bank for the next access
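
These three commands imply a simple per-bank state machine: a bank must be activated before a column can be read or written, and precharged before a different row can be opened. Below is a minimal C++ sketch of that rule (illustrative only, ignoring all timing constraints):

    #include <cstdio>

    // Tracks which row, if any, currently sits in the bank's row buffer.
    struct Bank {
        bool open = false;
        unsigned openRow = 0;

        void activate(unsigned row) { open = true; openRow = row; }  // row -> row buffer
        void precharge()            { open = false; }                // close the row

        // A read/write is only legal against the row currently in the buffer.
        bool canAccess(unsigned row) const { return open && row == openRow; }
    };

    int main() {
        Bank b;
        b.activate(42);
        std::printf("row 42: %s\n", b.canAccess(42) ? "row-buffer hit" : "precharge+activate first");
        std::printf("row 7:  %s\n", b.canAccess(7)  ? "row-buffer hit" : "precharge+activate first");
        b.precharge();
        return 0;
    }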
CHAPTER 3: MEMORY CONTROLLER
3.1 Controller
In PC systems, the memory controller is part of the north-bridge chipset, which handles potentially
multiple microprocessors, the graphics co-processor, and the communication to the south-bridge
chipset (which, in turn, handles all of the system’s I/O functions), as well as the interface to the
DRAM system.

A memory controller can be designed to minimize die size, minimize power consumption,
maximize system performance, or simply reach a reasonably optimal compromise among these
conflicting design goals.
The function of a DRAM memory controller is to manage the flow of data into and out of DRAM
devices connected to that DRAM controller in the memory system. A DRAM-access protocol defines
the interface protocol between a DRAM memory controller and the system of DRAM devices.

3.2 Row-buffer-management policies


3.2.1 Open-Page Row-Buffer-Management Policy
The open-page row-buffer-management policy is designed to favor memory accesses to the same row
of memory by keeping sense amplifiers open and holding a row of data for ready access. Once a row
of data is brought to the array of sense amplifiers in a bank of DRAM cells, different columns of the
same row can be accessed again with the minimal latency of tCAS. In the case where another
memory read access is made to the same row, that memory access can occur with minimal latency
since the row is already active in the sense amplifier and only a column access command is needed to
move the data from the sense amplifiers to the memory controller.
However, in the case where the access is to a different row of the same bank, the memory controller
must first precharge the DRAM array, engage another row activation, and then perform the column
access.
An open-page policy is typically deployed in memory systems of low processor count platforms.
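
Under an open-page policy the access latency therefore falls into three cases. Below is a hedged C++ sketch, with tRP, tRCD, and tCAS as symbolic timing parameters (the cycle counts are placeholders, not real device timings):

    // Illustrative open-page latency model, in controller clock cycles.
    constexpr int tRP  = 14;  // precharge time
    constexpr int tRCD = 14;  // row activation (row-to-column delay)
    constexpr int tCAS = 14;  // column access time

    int openPageLatency(bool rowHit, bool bankPrecharged) {
        if (rowHit)         return tCAS;         // row already in the sense amplifiers
        if (bankPrecharged) return tRCD + tCAS;  // bank idle: activate, then access
        return tRP + tRCD + tCAS;                // bank conflict: precharge first
    }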

3.2.2 Close-Page Row-Buffer-Management Policy


The close-page row-buffer-management policy is designed to favor accesses to random locations in
memory and optimally supports memory request patterns with low degrees of access locality.
As processor count grows, the probability of a row hit decreases and the probability of a bank
conflict increases, reaching a tipping point of sorts where a close-page policy provides better
performance for the computer system.
A close-page policy is typically deployed in memory systems of larger processor count platforms
because, in large systems, the intermixing of memory request sequences from multiple, concurrent,
threaded contexts reduces the locality of the resulting memory-access sequence.

3.2.3 Hybrid (Dynamic) Row-Buffer-Management Policies


To support memory request sequences whose request rate and access locality can change dramatically
depending on the dynamic, run-time behavior of the workload, DRAM memory controllers designed
for general-purpose computing can utilize a combination of access history and timers to dynamically
control the row-buffer-management policy for performance optimization or power consumption
minimization.
One hybrid scheme tracks the ratio of row-buffer hits: if a sequence of bank conflicts occurs in
rapid succession and the ratio of memory read requests that are row-buffer hits falls below a
precomputed threshold, the DRAM controller can switch to a close-page policy for better
performance. Similarly, if a rapid succession of memory requests to a given bank is made to the
same row, the DRAM controller can switch to an open-page policy to improve performance.
One simple mechanism used in modern DRAM controllers to improve performance and reduce
power consumption is the use of a timer to control the sense amplifiers. That is, a timer is set to a
predetermined value when a row is activated. The timer counts down with every clock tick, and when
it reaches zero, a precharge command is issued to precharge the bank. In case of a row buffer hit to an
open bank, the counter is reset to a higher value and the countdown repeats. In this manner, temporal
and spatial locality present in a given memory-access sequence can be utilized without keeping rows
open indefinitely.
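
The timer mechanism described above might be sketched as follows in C++ (the initial and extension values are invented for illustration; real controllers tune them empirically):

    // Timer-controlled row buffer: reset on hits, precharge on expiry.
    struct RowTimer {
        static constexpr int kInitial = 32;  // loaded when a row is activated (assumed value)
        static constexpr int kExtend  = 64;  // reloaded on a row-buffer hit (assumed value)
        int counter = 0;

        void onActivate() { counter = kInitial; }
        void onRowHit()   { counter = kExtend; }

        // Called once per controller clock tick; true means "issue a precharge now".
        bool tick() { return counter > 0 && --counter == 0; }
    };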

3.3 Address Mapping


The address mapping scheme denotes how a given physical address is resolved into indices in a
DRAM memory system: channel ID, rank ID, bank ID, row ID, and column ID. The task of address
mapping is also sometimes referred to as address translation. The goal of an address mapping
scheme is to minimize the probability of bank conflicts in temporally adjacent requests and to
maximize the parallelism available in the memory system.
To obtain the best performance, the choice of the address mapping scheme is often coupled to the
row-buffer-management policy of the memory controller.
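
One common realisation is to carve the physical address into bit fields. Below is a minimal C++ sketch, assuming power-of-two dimensions and an illustrative field ordering (the widths and the order are assumptions, not any real platform's mapping):

    #include <cstdint>

    constexpr int kChannelBits = 1, kColumnBits = 10, kBankBits = 3,
                  kRankBits = 1, kRowBits = 15;

    struct DramAddress { unsigned channel, column, bank, rank, row; };

    // Peel fields off the physical address from least- to most-significant bits.
    DramAddress mapAddress(uint64_t addr) {
        auto take = [&addr](int bits) {
            unsigned field = unsigned(addr & ((1ull << bits) - 1));
            addr >>= bits;
            return field;
        };
        DramAddress d;
        d.channel = take(kChannelBits);  // low bits: spread traffic across channels
        d.column  = take(kColumnBits);   // next: consecutive addresses stay in one row
        d.bank    = take(kBankBits);
        d.rank    = take(kRankBits);
        d.row     = take(kRowBits);      // high bits: the row changes least often
        return d;
    }

With the column bits below the row bits, consecutive cache lines land in the same row, which suits an open-page policy; a close-page controller would typically place bank and rank bits lower so that consecutive requests are spread across banks.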

3.4 Command Queue


To control the flow of data between the DRAM memory controller and DRAM devices, memory
transactions are translated into sequences of DRAM commands in modern DRAM memory
controllers. To facilitate the pipelined execution of these DRAM commands, the DRAM commands
may be placed into a single queue or multiple queues. With the DRAM commands organized in the
request queuing structure, the DRAM memory controller can then prioritize DRAM commands based
on many different factors, including, but not limited to, the priority of the request, the availability of
resources to a given request, the bank address of the request, the age of the request, or the access
history of the agent that made the request.
In the per-bank queuing structure, memory transaction requests, assumed to be of equal priority, are
sorted and directed to different queues on a bank-by-bank basis. Memory transaction requests are
translated into memory addresses and directed into different request queues based on their respective
bank addresses.
 In an open-page memory controller with per-bank request queues organized as described above,
when a given request queue has exhausted all pending requests to the same open row and all
other pending requests in the queue are addressed to different rows, the request queue can
then issue a precharge command and allow the next bank to issue commands into the memory
system.
 In a close-page memory controller, the round-robin bank-rotation scheme maximizes the
temporal distance between requests to any given bank without requiring sophisticated logic
circuits to guard against starvation.

However, there are still some disadvantages, particularly for open-page memory controllers. In
open-page memory controllers, the address mapping scheme maps spatially adjacent cache lines to
the same rows, and multiple requests to an open row may be pending in a given queue.
3.5 Latency

CAS latency, short for Column Address Strobe latency, is the number of clock cycles that pass
between when a read command is issued and when the data is made available. Higher CAS
timings can result in higher latency even at higher clock speeds. The lower the CAS latency, the
faster the RAM, and consequently the more expensive. When deciding between RAM of different
clock speeds, the RAM with the higher clock speed is generally superior; but when choosing
between RAM of identical clock speeds, the RAM with the lower CAS latency is faster.
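Because CAS latency counts clock cycles, the wall-clock latency depends on both the CL value and the clock period. Below is a quick C++ comparison using two illustrative DDR4 configurations (example figures, not recommendations):

    #include <cstdio>

    // True latency in ns = CL cycles * clock period. DDR transfers twice per
    // clock, so the I/O clock is half the quoted transfer rate.
    double casLatencyNs(int cl, int megatransfersPerSec) {
        double clockMHz = megatransfersPerSec / 2.0;
        return cl * 1000.0 / clockMHz;
    }

    int main() {
        std::printf("DDR4-2400 CL15: %.2f ns\n", casLatencyNs(15, 2400));  // 12.50 ns
        std::printf("DDR4-3200 CL22: %.2f ns\n", casLatencyNs(22, 3200));  // 13.75 ns
        return 0;
    }

Note how the module with the higher clock speed can still have the slower column access if its CAS latency is high enough, which is exactly why CL is the deciding factor between modules of identical clock speed.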
3.6 Refresh
The word “DRAM” is an acronym for Dynamic Random-Access Memory. The nature of the non-
persistent charge storage in the DRAM cells means that the electrical charge stored in the storage
capacitors will gradually leak out through the access transistors. Consequently, to maintain data
integrity, data values stored in DRAM cells must be periodically read out and restored to their
respective, full voltage level before the stored electrical charges decay to indistinguishable levels.
The refresh command accomplishes the task of data read-out and restoration in DRAM devices, and
as long as the time interval between refresh commands made to a given row of a DRAM array is
shorter than the worst-case data decay time, DRAM refresh commands can be used to ensure data
integrity.
Most DRAM devices use a refresh row address register to keep track of the address of the last
refreshed row. Typically, the memory controller sends a single refresh command to the DRAM
device, and the DRAM device increments the address in the refresh row address register and goes
through a row cycle for all rows with that row address in all of the banks in the DRAM device.
In most modern DRAM devices, to ensure the integrity of data stored in DRAM devices, each DRAM
row that contains valid data must be refreshed at least once per refresh period, typically every 32 or
64 ms.
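The per-command refresh interval follows directly from these numbers. For example, assuming a 64 ms refresh period and 8192 refresh commands per period (a common DDRx arrangement, used here purely as an assumption):

    #include <cstdio>

    int main() {
        double refreshPeriodMs = 64.0;  // every row refreshed at least once per 64 ms
        int    refreshCommands = 8192;  // refresh commands per period (assumed)

        // Average spacing between refresh commands (often called tREFI).
        double tRefiUs = refreshPeriodMs * 1000.0 / refreshCommands;
        std::printf("one refresh command every %.4f us\n", tRefiUs);  // 7.8125 us
        return 0;
    }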
In one controller organization, the microprocessor request stream is separated into read and write
request queues, and refresh commands are placed into a separate refresh queue. In this manner, as
long as a refresh request is below a preset deferral threshold, all read and write requests have
priority over it. In the case where the system is idle with no other pending read or write requests,
the refresh request can then be sent to the DRAM devices. In the case where the system is filled
with pending read and write requests but a DRAM refresh request has nearly exceeded its
maximum deferral time, that DRAM refresh request will receive the highest scheduling priority to
ensure that the refresh occurs within the required time period, preserving data integrity in the
memory system.
CHAPTER 4: DRAMSim2 Memory Simulator Overview

4.1 DRAMSim2 Inputs


The Device Ini file defines the DIMM structural characteristics, for example DEVICE_WIDTH,
NUM_BANKS, NUM_ROWS and NUM_COLS, and many non-structural characteristics such as
REFRESH_PERIOD, tCK, Vdd, etc.
The System Ini file defines the memory controller characteristics, for example JEDEC_DATA_BUS_BITS,
TRANS_QUEUE_DEPTH, CMD_QUEUE_DEPTH, SCHEDULING_POLICY, QUEUING_STRUCTURE, etc. It
also sets the debugging flags of DRAMSim2.
4.1.1 Memory Trace
The trace, either as a trace file or as an execution-driven process, is the main input source of
DRAMSim2. In the standalone mode the trace is a .trc file with three columns:
Memory Address, Transaction Type (P_MEM_WR, P_MEM_RD, P_FETCH) and Cycle. In the case of a
front-end driver that produces a trace file timed in nanoseconds, it must be preprocessed with a
Python parsing script (included with the DRAMSim2 package) in order to be mapped to cycles.
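
An illustrative .trc excerpt with the three columns described above might look like the following (the addresses and cycle numbers are made up for illustration):

    0x7FFDE0A0 P_MEM_RD 120
    0x7FFDE0C0 P_MEM_WR 135
    0x4C001A80 P_FETCH 150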

4.1.2 System Ini File


The System Ini (system.ini) file contains the Memory System and Memory Controller parameters.
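
For illustration, a fragment of a system.ini restricted to the parameters named above might look like this (the values are examples, not recommendations; consult the system.ini shipped with DRAMSim2 for the full set):

    JEDEC_DATA_BUS_BITS=64
    TRANS_QUEUE_DEPTH=32
    CMD_QUEUE_DEPTH=32
    SCHEDULING_POLICY=rank_then_bank_round_robin
    QUEUING_STRUCTURE=per_rank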

4.1.3 Device Ini File


The Device Ini file (DRAMmodel.ini) contains all the DRAM model structural parameters such as
banks, rows, columns and clock, as well as the other timing parameters and all the power
parameters. The most important of them are illustrated below:
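
Since the parameter list itself did not survive in this copy of the document, here is an illustrative fragment using the names mentioned earlier. The values are assumptions roughly in line with a DDR3-style part (REFRESH_PERIOD and tCK are in nanoseconds, Vdd in volts); the ini files shipped with DRAMSim2 are the authoritative reference:

    NUM_BANKS=8
    NUM_ROWS=16384
    NUM_COLS=1024
    DEVICE_WIDTH=8
    REFRESH_PERIOD=7800
    tCK=1.5
    Vdd=1.5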
