A program with its data normally resides in auxiliary memory. When the program
or a segment of the program is to be executed, it is transferred to main memory to be
executed by the CPU. Thus one may think of auxiliary memory as containing the
totality of information stored in a computer system. It is the task of the operating
system to maintain in main memory a portion of this information that is currently
active. The part of the computer system that supervises the flow of information
between auxiliary memory and main memory is called the memory management
system.
The static RAM consists essentially of internal flip-flops that store the binary
information. The stored information remains valid as long as power is applied to the
unit. The dynamic RAM stores the binary information in the form of electric
charges that are applied to capacitors. The capacitors are provided inside the chip
by MOS transistors. The stored charge on the capacitors tends to discharge with time
and the capacitors must be periodically recharged by refreshing the dynamic memory.
Refreshing is done by cycling through the words every few milliseconds to restore the
decaying charge. The dynamic RAM offers reduced power consumption and larger
storage capacity in a single memory chip. The static RAM is easier to use and has
shorter read and write cycles.
Figure 1. Function Table of RAM.
A ROM chip is organized externally in a similar manner. However, since a ROM can only be read, the data bus can only be in an output mode. The block diagram of a ROM chip is shown in Fig. 4. For the same-size chip, it is possible to have more bits of ROM than of RAM, because the internal binary cells in ROM occupy less space than in RAM. For this reason, the diagram specifies a 512-byte ROM, while the RAM has only 128 bytes. The nine address lines in the ROM chip specify any one of the 512 bytes stored in it. The two chip select inputs must be CS1 = 1 and CS2 = 0 for the unit to operate. Otherwise, the data bus is in a high-impedance state. There is no need for a read or write control because the unit can only be read. Thus, when the chip is enabled by the two select inputs, the byte selected by the address lines appears on the data bus.
Figure 4. ROM chip.
SRAM has the advantage that it offers better performance than DRAM, because DRAM must be refreshed periodically while in use, whereas SRAM does not. However, SRAM is more expensive and less dense than DRAM, so SRAM capacities are orders of magnitude smaller than DRAM capacities.
SRAM basics - There are two key features of SRAM (Static Random Access Memory), and these set it apart from the other types of memory that are available:
• The data is held statically: This means that the data is held in the
semiconductor memory without the need to be refreshed as long as the power is
applied to the memory.
• SRAM memory is a form of random access memory: A random access memory
is one in which the locations in the semiconductor memory can be written to or
read from in any order, regardless of the last memory location that was accessed.
The circuit for an individual SRAM memory cell typically comprises four transistors configured as two cross-coupled inverters. In this format the circuit has two stable states, and these equate to the logical "0" and "1" states. The transistors are MOSFETs, because a MOS circuit consumes considerably less power than bipolar transistor technology, which is the other feasible option but consumes much more power. Using bipolar technology also limits the level of integration, because heat removal becomes a major issue.
In addition to the four transistors in the basic memory cell, an additional two transistors are required to control access to the memory cell during the read and write operations. This makes a total of six transistors, forming what is termed a 6T memory cell. Although more complex in terms of the number of components, it has a number of advantages. The main advantage of the six-transistor SRAM circuit is its reduced static power consumption. In the four-transistor version, a constant current flows through one or other of the load resistors, and this increases the overall power consumption of the chip.
The operation of the SRAM memory cell is relatively straightforward. When the cell is selected, the value to be written is stored in the cross-coupled flip-flop. The cells are arranged in a matrix, with each cell individually addressable. Most SRAM memories select an entire row of cells at a time and read out the contents of all the cells in the row along the column lines. While it is not necessary to have two bit lines, using the signal and its inverse is normal practice, as it improves the noise margins and the data integrity. If the two lines carry the same value, the system recognizes that there is an issue and re-interrogates the cell. The two bit lines are passed to the two input ports of a comparator so that the advantages of the differential data mode can be exploited, and the small voltage swings that are present can be more easily detected.
DRAM stores data by writing a charge to a capacitor by way of an access transistor. It was invented in 1966 by Robert Dennard at IBM and was patented in 1967. DRAM reads the state of charge in a transistor-capacitor circuit: a charged state is read as a 1 bit, and a low-charge state is read as a 0 bit. Several of these transistor-capacitor circuits together form a "word" of memory.
DRAM uses capacitors that lose charge over time due to leakage, even if the supply
voltage is maintained. Since the charge on a capacitor decays when a voltage is
removed, DRAM must be supplied with a voltage to retain memory (and is thus
volatile). Capacitors can lose their charge a bit even when supplied with voltage if they
have devices nearby (like transistors) that draw a little current even if they are in an
“off” state; this is called capacitor leakage. Due to capacitor leakage, DRAM needs
to be refreshed often.
The differences between SRAM and DRAM are summarized below:
1. SRAM stores information as long as the power is supplied. DRAM stores information as long as the power is supplied, or for a few milliseconds after the power is switched off.
2. Transistors are used to store information in SRAM. Capacitors are used to store data in DRAM.
3. SRAM uses no capacitors, hence no refreshing is required. In DRAM, the contents of the capacitors must be refreshed periodically to store information for a longer time.
4. SRAM is faster compared to DRAM. DRAM provides slower access speeds.
5. SRAM does not have a refreshing unit. DRAM has a refreshing unit.
6. SRAMs are expensive. DRAMs are cheaper.
7. SRAMs are low-density devices. DRAMs are high-density devices.
8. In SRAM, bits are stored in voltage form. In DRAM, bits are stored in the form of electric charge.
9. SRAMs are used in cache memories. DRAMs are used in main memories.
10. SRAM consumes more power and generates more heat. DRAM uses less power and generates less heat.
Memory Timings (Cycle timing) - Unlike latency, memory timings (that sequence of
numbers you see on your memory module) are measured in clock cycles. So, each number
in memory timings, like 16-19-19-39, indicates the number of clock ‘ticks’ or cycles it takes
to complete a certain task. To simplify things, imagine your memory space as a giant
spreadsheet with rows and columns, where each cell can hold binary data (0 or 1).
1. CAS Latency (tCL) – The first memory timing is something called Column Access Strobe
(CAS) Latency. Although the term strobe is a bit antiquated today as it’s a vestige from
the Asynchronous DRAM days, the term CAS is still used in the industry. CAS Latency
of RAM indicates the number of cycles it takes to get a response from memory once
the memory controller sends over a column it needs to access. Unlike all other timings
below, tCL is an exact number and not a maximum/minimum.
2. Row Address to Column Address Delay (tRCD) – It denotes the minimum number of
clock cycles it will take to open a row and access the required column.
3. Row Precharge Time (tRP) – The third number in the sequence indicates the minimum number of clock cycles needed to precharge (close) the currently open row before another row can be accessed.
4. Row Active Time (tRAS) – The last number in that memory timing sequence denotes
the minimum number of clock cycles a row needs to remain open to access the data.
This is usually the biggest delay.
Calculating RAM Latency or CAS Latency - CAS latency is the amount of time your memory takes to respond to a request from the memory controller. Because CAS latency is measured in clock cycles, we need to account for memory speed to get a real-world CAS latency in nanoseconds:
CAS latency (ns) = (CL × 2000) / data rate (MT/s)
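To make the formula concrete, here is a minimal Python sketch that converts a module's CL rating and data rate into an absolute latency in nanoseconds; the kits named in the comments are illustrative examples, not values from the text:

```python
def cas_latency_ns(cl: int, data_rate_mts: int) -> float:
    """Convert a CAS latency in clock cycles to nanoseconds.

    The memory clock runs at half the data rate (DDR transfers twice per
    clock), so one clock tick lasts 2000 / data_rate nanoseconds.
    """
    return cl * 2000 / data_rate_mts

# Hypothetical examples: a DDR4-3200 CL16 kit and a DDR5-6000 CL30 kit.
print(cas_latency_ns(16, 3200))  # 10.0 ns
print(cas_latency_ns(30, 6000))  # 10.0 ns
```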
Memory Bandwidth - Computers need memory to store and use data, such as in graphical processing or loading simple documents. While random access memory (RAM) chips may advertise a specific capacity, such as 10 gigabytes (GB), this amount represents only the maximum amount of data the RAM chip can hold. What is often more important is the memory bandwidth, the amount of data that can be transferred to and from memory per second. As the computer gets older, regardless of how many RAM chips are installed, the memory bandwidth tends to become the limiting factor.
If the active portions of the program and data are placed in a fast small memory,
the average memory access time can be reduced, thus reducing the total
execution time of the program. Such a fast small memory is referred to as a
cache memory. It is placed between the CPU and main memory as illustrated in Fig.
2. The cache memory access time is less than the access time of main memory by a
factor of 5 to 10. The cache is the fastest component in the memory hierarchy and
approaches the speed of CPU components.
The basic operation of the cache is as follows. When the CPU needs to access
memory, the cache is examined. If the word is found in the cache, it is read from
the fast memory. If the word addressed by the CPU is not found in the cache, the
main memory is accessed to read the word. A block of words containing the one
just accessed is then transferred from main memory to cache memory. The block
size may vary from one word (the one just accessed) to about 16 words adjacent to
the one just accessed. In this manner, some data are transferred to cache so that future
references to memory find the required words in the fast cache memory.
The basic characteristic of cache memory is its fast access time. Therefore, very little or no time must be wasted when searching for words in the cache. The transformation of data from main memory to cache memory is referred to as a mapping process. Three types of mapping procedures are of practical interest when considering the organization of cache memory: associative mapping, direct mapping, and set-associative mapping.
Figure 11. Example of cache memory.
In the general case, there are 2^k words in cache memory and 2^n words in main memory. The n-bit memory address is divided into two fields: k bits for the index field and n − k bits for the tag field. The direct mapping cache organization uses the n-bit address to access the main memory and the k-bit index to access the cache. The internal organization of the words in the cache memory is as shown in Fig. 14(b). Each word in cache consists of the data word and its associated tag. When a new word is first brought into the cache, the tag bits are stored alongside the data bits. When the CPU generates a memory request, the index field is used as the address to access the cache.
Figure 13. Addressing relationships between main and cache memories.
Figure 14. Direct mapping cache organization.
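To make the index and tag fields concrete, the following minimal Python sketch models a direct-mapped cache lookup. The sizes k = 9 and n = 15 are chosen to be consistent with the nine-bit index and six-bit tag used in the set-associative example below; the code itself is an illustrative sketch, not a description of specific hardware:

```python
K = 9               # index bits: 2^9 = 512 cache words
N = 15              # address bits: 2^15 = 32K words of main memory

cache = [None] * (1 << K)   # each entry holds a (tag, data) pair or None

def read(address: int, main_memory) -> int:
    index = address & ((1 << K) - 1)   # low k bits select the cache word
    tag = address >> K                 # high n - k bits identify the block
    entry = cache[index]
    if entry is not None and entry[0] == tag:
        return entry[1]                # hit: word comes from the fast cache
    data = main_memory[address]        # miss: fetch from main memory...
    cache[index] = (tag, data)         # ...and store it alongside its tag
    return data

memory = list(range(1 << N))   # stand-in main memory: word i holds value i
print(read(12345, memory))     # miss: fetched from main memory and cached
print(read(12345, memory))     # hit: served from the cache
```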
An example of a set-associative cache organization for a set size of two is shown in Fig. 16. Each index address refers to two data words and their associated tags. Each tag requires six bits and each data word has 12 bits, so the word length is 2(6 + 12) = 36 bits. An index address of nine bits can accommodate 512 words. Thus, the size of the cache memory is 512 × 36.
The octal numbers listed in Fig. 16 are with reference to the main memory
contents illustrated in Fig. 14(a). The words stored at addresses 01000 and
02000 of main memory are stored in cache memory at index address 000.
Similarly, the words at addresses 02777 and 00777 are stored in cache at index
address 777. When the CPU generates a memory request, the index value of the
address is used to access the cache. The tag field of the CPU address is then
compared with both tags in the cache to determine if a match occurs. The
comparison logic is done by an associative search of the tags in the set similar
to an associative memory search: thus the name "set-associative." The hit ratio
will improve as the set size increases because more words with the same index
but different tags can reside in cache.
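A set-associative lookup extends the direct-mapped sketch above: each index now selects a set, and all the tags in the set are compared. A minimal two-way version follows (again an illustrative sketch; the FIFO eviction within a set is one simple choice, not the only possible policy):

```python
K = 9                      # index bits, as before
SET_SIZE = 2               # two words per set (set size of two)

cache = [[] for _ in range(1 << K)]   # each set holds up to SET_SIZE (tag, data) pairs

def read(address: int, main_memory) -> int:
    index = address & ((1 << K) - 1)
    tag = address >> K
    for stored_tag, data in cache[index]:
        if stored_tag == tag:          # associative search of the tags in the set
            return data                # hit
    data = main_memory[address]        # miss: fetch the word
    if len(cache[index]) >= SET_SIZE:  # set full: evict one entry (FIFO here)
        cache[index].pop(0)
    cache[index].append((tag, data))
    return data

memory = list(range(1 << 15))          # stand-in main memory
# Octal addresses 01000 and 02000 share index 000, as in the text's example,
# so both can reside in the same set with different tags.
print(read(0o01000, memory), read(0o02000, memory))
```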
The second procedure is called the write-back method. In this method only the
cache location is updated during a write operation. The location is then marked
by a flag so that later when the word is removed from the cache it is copied into
main memory. The reason for the write-back method is that during the time a word
resides in the cache, it may be updated several times; however, as long as the word
remains in the cache, it does not matter whether the copy in main memory is out
of date, since requests for the word are filled from the cache.
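A minimal sketch of the write-back idea follows. The dictionary-based structure and the flag name (commonly called a dirty bit) are illustrative choices, not a description of any particular hardware:

```python
cache = {}   # address -> [data, dirty_flag]

def write(address: int, value: int) -> None:
    cache[address] = [value, True]       # update the cache only; mark the flag

def evict(address: int, main_memory) -> None:
    data, dirty = cache.pop(address)
    if dirty:                            # copy back to main memory only if the
        main_memory[address] = data      # word was modified while in the cache

main_memory = {}
write(0x1A, 99)                 # word updated in cache only
evict(0x1A, main_memory)        # now copied back: main_memory[0x1A] == 99
```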
Cache Initialization
One more aspect of cache organization that must be taken into consideration is the problem of initialization. The cache is initialized when power is applied to the computer or when the main memory is loaded with a complete set of programs from auxiliary memory. The cache is initialized by clearing all the valid bits to 0. The valid bit of a
particular cache word is set to 1 the first time this word is loaded from main memory
and stays set unless the cache has to be initialized again. The introduction of the
valid bit means that a word in cache is not replaced by another word unless the
valid bit is set to 1 and a mismatch of tags occurs. If the valid bit happens to be 0,
the new word automatically replaces the invalid data. Thus the initialization
condition has the effect of forcing misses from the cache until it fills with valid data.
1. Which page in main memory ought to be removed to make room for a new page?
2. When is a new page to be transferred from auxiliary memory to main memory?
3. Where in main memory is the page to be placed?
When a page fault occurs in a virtual memory system, it signifies that the
page referenced by the CPU is not in main memory. A new page is then
transferred from auxiliary memory to main memory. If main memory is full, it
would be necessary to remove a page from a memory block to make room for
the new page. The policy for choosing pages to remove is determined from the
replacement algorithm that is used.
The goal of a replacement policy is to try to remove the page least likely to be referenced in the immediate future. Three of the most common replacement algorithms are first-in, first-out (FIFO), least recently used (LRU), and optimal page replacement.
To illustrate the problems that are possible with a FIFO page-replacement algorithm,
we consider the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1,
7, 0, 1
For our example reference string, our three frames are initially empty. The first three references (7, 0, 1) cause page faults and are brought into these empty frames. The next
reference (2) replaces page 7, because page 7 was brought in first. Since 0 is the next
reference and 0 is already in memory, we have no fault for this reference. The first
reference to 3 results in page 0 being replaced, since it was the first of the three pages
in memory (0, 1, and 2) to be brought in. Because of this replacement, the next
reference, to 0, will fault. Page 1 is then replaced by page 0. This process continues as
shown in Figure 22. Every time a fault occurs, we show which pages are in our three
frames. There are 15 faults altogether.
Consider the curve of page faults versus the number of available frames. We notice that the number of faults for four frames (10) is greater than the number of faults for three
frames (nine)! This most unexpected result is known as Belady's anomaly: For some
page-replacement algorithms, the page fault rate may increase as the number of
allocated frames increases. We would expect that giving more memory to a process
would improve its performance.
With only nine page faults, optimal replacement is much better than a FIFO algorithm,
which had 15 faults. (If we ignore the first three, which all algorithms must suffer, then
optimal replacement is twice as good as FIFO replacement.) In fact, no replacement
algorithm can process this reference string in three frames with fewer than nine faults.
Unfortunately, the optimal page-replacement algorithm is difficult to implement,
because it requires future knowledge of the reference string.
LRU replacement associates with each page the time of that page's last use. When a
page must be replaced, LRU chooses that page that has not been used for the
longest period of time. This strategy is the optimal page-replacement algorithm
looking backward in time, rather than forward.
The result of applying LRU replacement to our example reference string is shown in Figure 24. The LRU algorithm produces 12 faults. Notice that the first five faults are the same as those of optimal replacement. When the reference to page 4 occurs, however, LRU replacement sees that, of the three frames in memory, page 2 was used least recently, and therefore replaces it.
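The fault counts quoted above (15 for FIFO and 12 for LRU with three frames) can be verified mechanically. The following sketch simulates both policies on the example reference string:

```python
REFS = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]

def count_faults(refs, nframes, policy):
    frames, faults = [], 0
    for page in refs:
        if page in frames:
            if policy == "lru":            # refresh recency on a hit
                frames.remove(page)
                frames.append(page)
            continue
        faults += 1                        # page fault
        if len(frames) == nframes:
            frames.pop(0)                  # evict oldest (FIFO) or
        frames.append(page)                # least recently used (LRU)
    return faults

print(count_faults(REFS, 3, "fifo"))  # 15 faults, as in Figure 22
print(count_faults(REFS, 3, "lru"))   # 12 faults, as in Figure 24
```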
The user program supplies logical addresses; these logical addresses must be
mapped to physical addresses before they are used. The concept of a logical-address
space that is bound to a separate physical address space is central to proper memory
management.
In a memory hierarchy system, programs and data are first stored in auxiliary
memory. Portions of a program or data are brought into main memory as they are
needed by the CPU. Virtual memory is a concept used in some large computer systems that permits the user to construct programs as though a large memory space were available, equal to the totality of auxiliary memory. Each address that is
referenced by the CPU goes through an address mapping from the so-called
virtual address to a physical address in main memory.
Virtual memory is used to give programmers the illusion that they have a very
large memory at their disposal, even though the computer actually has a relatively
small main memory. A virtual memory system provides a mechanism for
translating program-generated addresses into correct main memory locations.
This is done dynamically, while programs are being executed in the CPU. The
translation or mapping is handled automatically by the hardware by means of
a mapping table.
Although both a page and a block are split into groups of 1K words, a page refers
to the organization of address space, while a block refers to the organization
of memory space. The programs are also considered to be split into pages.
Portions of programs are moved from auxiliary memory to main memory in units of one page.
The memory-page table consists of eight words, one for each page. The address
in the page table denotes the page number and the content of the word gives
the block number where that page is stored in main memory. The table shows
that pages 1, 2, 5, and 6 are now available in main memory in blocks 3, 0, 1, and
2, respectively. A presence bit in each location indicates whether the page has been transferred from auxiliary memory into main memory. A 0 in the presence bit indicates that this page is not available in main memory. The CPU references a word in memory with a virtual address of 13 bits. The three high-order bits of the virtual address specify a page number and also an address for the memory-page table. The content of the word in the memory-page table at the page number address is read out into the memory table buffer register.
A more efficient way to organize the page table would be to construct it with a
number of words equal to the number of blocks in main memory. In this way
the size of the memory is reduced and each location is fully utilized. This
method can be implemented by means of an associative memory with each
word in memory containing a page number together with its corresponding
block number. The page field in each word is compared with the page number
in the virtual address. If a match occurs, the word is read from memory and
its corresponding block number is extracted.
Paging
Paging is a memory-management scheme that permits the physical-address space of a process to be non-contiguous. Paging avoids the considerable problem of fitting memory chunks of varying sizes onto the backing store.
Basic Method
The hardware support for paging is illustrated in Figure 8. Every address generated by
the CPU is divided into two parts: a page number (p) and a page offset (d). The page
number is used as an index into a page table. The page table contains the base address
of each page in physical memory. This base address is combined with the page offset
to define the physical memory address that is sent to the memory unit. The paging model of memory is shown in Figure 9.
The page size (like the frame size) is defined by the hardware. The selection of a power of 2 as the page size makes the translation of a logical address into a page number and page offset particularly easy. If the size of the logical-address space is 2^m, and the page size is 2^n addressing units (bytes or words), then the high-order m − n bits of a logical address designate the page number, and the n low-order bits designate the page offset. Thus, the logical address is divided into a page number in the high-order m − n bits and a page offset in the low-order n bits.
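Because the split is a simple bit-field extraction, a tiny sketch makes it concrete. The values m = 16 and n = 10 below are hypothetical choices, not taken from the figure:

```python
M = 16                      # logical address bits (illustrative)
N = 10                      # page size = 2^10 = 1024 addressing units

def split(logical_address: int):
    page_number = logical_address >> N            # high-order m - n bits
    offset = logical_address & ((1 << N) - 1)     # low-order n bits
    return page_number, offset

print(split(0x2ABC))   # (10, 700): address 0x2ABC lies in page 10 at offset 700
```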
Figure 9. Paging model of logical and physical memory.
Paging itself is a form of dynamic relocation. Every logical address is bound by the
paging hardware to some physical address. Using paging is similar to using a table of
base (or relocation) registers, one for each frame of memory. When we use a paging
scheme, we have no external fragmentation: Any free frame can be allocated to a
process that needs it. However, we may have some internal fragmentation.
When a process arrives in the system to be executed, its size, expressed in pages, is
examined. Each page of the process needs one frame. Thus, if the process requires n
pages, at least n frames must be available in memory. If n frames are available, they
are allocated to this arriving process. The first page of the process is loaded into one of
the allocated frames, and the frame number is put in the page table for this process.
The next page is loaded into another frame, and its frame number is put into the page
table, and so on.
The use of registers for the page table is satisfactory if the page table is reasonably
small (for example, 256 entries). Most contemporary computers allow the page table
to be very large (for example, 1 million entries). For these machines, the use of fast
registers to implement the page table is not feasible. Rather, the page table is kept in
main memory, and a page-table base register (PTBR) points to the page table.
Changing page tables requires changing only this one register, substantially reducing
context-switch time.
Figure 11. Free frames (a) before allocation and (b) after allocation.
The TLB is used with page tables in the following way. The TLB contains only a few of the page-table entries. When a logical address is generated by the CPU, its page number is presented to the TLB. If the page number is found, its frame number is immediately available and is used to access memory. The whole task may take less than 10 percent longer than it would if an unmapped memory reference were used.
Figure 12. Paging hardware with TLB.
Effective access time = hit ratio × (TLB search time + memory access time) + (1 − hit ratio) × (TLB search time + memory access time for the page table + memory access time for the desired byte)
Numerical example - An 80-percent hit ratio means that we find the desired page
number in the TLB 80 percent of the time. If it takes 20 nanoseconds to search the TLB,
and 100 nanoseconds to access memory, then a mapped memory access takes 120
nanoseconds when the page number is in the TLB. If we fail to find the page number in
the TLB (20 nanoseconds), then we must first access memory for the page table and
frame number (100 nanoseconds), and then access the desired byte in memory (100
nanoseconds), for a total of 220 nanoseconds. To find the effective memory-access time, we weight each case by its probability: effective access time = 0.80 × 120 + 0.20 × 220 = 140 nanoseconds.
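The same computation as a short sketch, so the hit ratio and timings can be varied; the 98 percent case is an added illustration, not a figure from the text:

```python
def effective_access_time(hit_ratio, tlb_ns, mem_ns):
    hit_time = tlb_ns + mem_ns            # TLB hit: one memory access
    miss_time = tlb_ns + 2 * mem_ns       # TLB miss: page table, then the byte
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time

print(effective_access_time(0.80, 20, 100))  # 140.0 ns, as in the example
print(effective_access_time(0.98, 20, 100))  # 122.0 ns with a better hit ratio
```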
1. Single Bit Data Errors - A change in one bit of the whole data sequence is called a "single bit error". Single bit errors are very rare in serial communication systems; this type of error mainly occurs in parallel communication systems, since the bits travel on separate lines at once and any single line may be noisy.
2. Multiple Bit Data Errors - If there is a change in two or more bits of the data sequence sent from transmitter to receiver, it is called a "multiple bit error". This type of error occurs in both serial and parallel data communication networks.
3. Burst Errors - A change in a set of bits in the data sequence is called a "burst error". The length of a burst error is measured from the first changed bit to the last changed bit.
Error-Correcting codes - Along with error-detecting code, we can also pass some data
to figure out the original message from the corrupt message that we received. This
type of code is called an error-correcting code. Error-correcting codes also deploy the
same strategy as error-detecting codes but additionally, such codes also detect the
exact location of the corrupt bit. In error-correcting codes, parity check has a simple
way to detect errors along with a sophisticated mechanism to determine the corrupt
bit location. Once the corrupt bit is located, its value is flipped (from 0 to 1 or from 1 to 0) to recover the original message.
How to Detect and Correct Errors? - To detect and correct the errors, additional
bits are added to the data bits at the time of transmission.
• The additional bits are called parity bits. They allow detection or correction of
the errors.
• The data bits along with the parity bits form a code word.
CRC Code Generation - Based on the desired number of check bits, we append some zeros (0) to the actual data. This new binary data sequence is divided by a new word of length n + 1, where n is the number of check bits to be added. The remainder obtained as a result of this modulo-2 division is added to the dividend bit sequence to form the cyclic code. The generated code word is completely divisible by the divisor that is used in the generation of the code. This code word is then transmitted through the transmitter.
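As an illustrative sketch of the procedure just described, the following performs the modulo-2 division with XOR; the data word and divisor are example values, not ones prescribed by the text:

```python
def crc_remainder(data_bits: str, divisor_bits: str) -> str:
    """Modulo-2 division: append n zeros to the data, divide, return remainder."""
    n = len(divisor_bits) - 1                 # number of check bits
    dividend = [int(b) for b in data_bits] + [0] * n
    divisor = [int(b) for b in divisor_bits]
    for i in range(len(data_bits)):
        if dividend[i]:                       # XOR the divisor in wherever the
            for j in range(len(divisor)):     # leading bit is 1
                dividend[i + j] ^= divisor[j]
    return "".join(map(str, dividend[-n:]))

data, divisor = "1101011011", "10011"         # example data word and divisor
remainder = crc_remainder(data, divisor)
codeword = data + remainder                   # completely divisible by the divisor
print(remainder, codeword)                    # 1110 11010110111110
```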
Check Sum - Checksums are similar to parity bits, except that the number of bits in the sums is larger than for parity, and the result of summing a valid code word is constrained to be zero. That means that if the computed checksum is not zero, an error has been detected.
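A minimal sketch of the idea, using a simple additive checksum over bytes rather than any specific protocol's algorithm: the sender appends a byte that forces the total to zero, so a nonzero sum at the receiver signals an error:

```python
def append_checksum(data: bytes) -> bytes:
    total = sum(data) % 256
    return data + bytes([(-total) % 256])     # append byte that makes the sum 0

def verify(packet: bytes) -> bool:
    return sum(packet) % 256 == 0             # nonzero sum means an error

packet = append_checksum(b"hello")
print(verify(packet))                          # True
print(verify(packet[:-1] + b"\x00"))           # False (corrupted last byte)
```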
Hamming Code - This error-detecting and correcting code technique was developed by R. W. Hamming. This code not only identifies the erroneous bit in the whole data sequence, it also corrects it. The code uses a number of parity bits located at certain positions in the codeword. The number of parity bits depends upon the number of information bits. The Hamming code uses the relation between redundancy bits and data bits, and it can be applied to any number of data bits.
How does the Hamming code actually correct errors? - In the Hamming code, the redundancy bits are placed at certain calculated positions in order to detect and correct errors. The number of bit positions in which two code words differ is called the "Hamming distance". To understand the working of the Hamming code and its error detection and correction mechanism, let's look at the following stages.
Number of parity bits - The number of parity bits to be added to a data string depends upon the number of information bits of the data string which is to be transmitted. The number of parity bits P is calculated from the number of data bits n using the relation given below:
2^P ≥ n + P + 1
For example, if we have a 4-bit data string, i.e. n = 4, then the number of parity bits to be added can be found by trial and error: for P = 2, 2^2 = 4 is less than n + P + 1 = 7, so two parity bits are not enough, while for P = 3, 2^3 = 8 equals n + P + 1 = 8, which satisfies the relation. So we can say that 3 parity bits are required to transfer the 4-bit data with single-bit error correction.
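The trial-and-error search is easy to mechanize, as this short sketch shows:

```python
def parity_bits_needed(n: int) -> int:
    """Smallest P satisfying 2^P >= n + P + 1 (single-bit error correction)."""
    p = 0
    while 2 ** p < n + p + 1:
        p += 1
    return p

print(parity_bits_needed(4))   # 3 parity bits for 4 data bits
print(parity_bits_needed(8))   # 4 parity bits for 8 data bits
```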
Where to Place these Parity Bits? - After calculating the number of parity bits required, we should know the appropriate positions at which to place them in the information string to provide single-bit error correction. In the example considered above, we have 4 data bits and 3 parity bits, so the total codeword to be transmitted is 7 bits long (4 + 3). We generally represent the data sequence from right to left, and the parity bits occupy the positions that are powers of two (positions 1, 2, and 4), as shown in the sketch below.
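Since the original layout figure is not reproduced here, the following sketch illustrates the standard placement, with the parity bits at the power-of-two positions (1, 2, and 4) and the data bits filling positions 3, 5, 6, and 7; the data values are arbitrary examples:

```python
DATA = [1, 0, 1, 1]                 # example data bits d1..d4 (illustrative)

codeword = [None] * 8               # positions 1..7; index 0 unused
parity_positions = [1, 2, 4]        # powers of two hold the parity bits
data_positions = [3, 5, 6, 7]       # remaining positions hold the data bits

for pos, bit in zip(data_positions, DATA):
    codeword[pos] = bit

for p in parity_positions:          # each parity bit covers the positions whose
    covered = [i for i in range(1, 8)          # binary index includes that
               if i & p and i not in parity_positions]   # power of two
    codeword[p] = sum(codeword[i] for i in covered) % 2  # even parity

print(codeword[1:])                 # the 7-bit Hamming(7,4) codeword
```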