
EXPLAIN DIRECT MEMORY ACCESS WITH DIAGRAMS

A DMA (Direct Memory Access) controller is a hardware device that allows I/O devices to access
memory directly, with minimal participation by the processor. Like any other peripheral interface,
the DMA controller uses standard interface circuitry to communicate with the CPU and the I/O devices.

What is a DMA Controller?

Direct Memory Access relies on dedicated hardware, called the DMA controller, to transfer data
between I/O devices and main memory with very little interaction from the processor. The DMA
controller is, in effect, a control unit whose sole job is moving data.

The DMA controller is a control unit that acts as an interface between the data bus and the I/O
devices. As mentioned, it transfers data without step-by-step intervention by the processor,
although the processor still initiates and configures each transfer. The DMA controller also
contains an address unit, which generates addresses and selects the I/O device involved in the
transfer. The block diagram of the DMA controller is shown below.
Types of Direct Memory Access

§ Single-Ended DMA: Single-ended DMA controllers operate by reading and writing from a
single memory address. They are the simplest form of DMA.
§ Dual-Ended DMA: Dual-ended DMA controllers can read and write from two memory
addresses. Dual-ended DMA is more advanced than single-ended DMA.
§ Arbitrated-Ended DMA: Arbitrated-ended DMA works by reading and writing to several
memory addresses. It is more advanced than dual-ended DMA.
§ Interleaved DMA: Interleaved DMA controllers read from one memory address and
write to another memory address.

The provided diagram represents a DMA controller, which is a crucial component in computer
architecture for managing direct memory access, a method for data transfer that bypasses the
CPU. Let's walk through the elements of the diagram.

1. Data Bus Buffer: This buffer interfaces with the system's data bus, allowing the DMA
controller to read data from or write data to the system's memory.
2. Address Bus Buffer: Like the data bus buffer, this buffer interfaces with the address bus,
allowing the DMA controller to specify the memory addresses involved in the transfer.

3. Control Logic: The central part of the DMA controller that manages the flow of information
and controls the DMA cycle. It uses signals such as DMA select (DS), register select (RS), read
(RD), write (WR), bus request (BR), bus grant (BG), and interrupt.

4. Address Register: Holds the starting memory address for the data transfer. This register is
loaded by the CPU with the address where data should be read from or written to.

5. Word Count Register: This register maintains the count of how many words remain to be
transferred. The CPU initializes this register with the total number of words to be moved.

6. Control Register: Contains control flags and mode settings for the DMA transfer, which are
used by the control logic to determine the direction of the transfer (read or write) and other
operational parameters.

7. DMA Request and DMA Acknowledge: These signals are used for communication between
the DMA controller and the peripheral devices. The DMA request is issued by the peripheral
device when it is ready to transfer data, and the DMA acknowledge is returned by the DMA
controller when the transfer can proceed.

8. I/O Device: Represents the peripheral device that is either the source or destination of the
data transfer, depending on whether the operation is a read or write.
The sequence of events is as follows:

§ The CPU sets up the DMA by loading the address register and word count register with
the appropriate values and setting the correct mode in the control register.
§ Once the DMA is configured, the CPU will issue a command to start the DMA transfer
and then continue with other tasks.
§ The DMA controller asserts the bus request signal (BR) and, upon receiving the bus
grant signal (BG), takes control of the bus (signifying that the CPU is no longer using
it) and begins the transfer.
§ It does this by placing the address from the address register onto the address bus via the
address bus buffer and transferring data through the data bus buffer.
§ After each data transfer, the word count register is decremented until all words have been
transferred.
§ Upon completion, the DMA controller sends an interrupt signal to the CPU to signal that
the transfer has been completed and the bus is released for the CPU to use again.

This efficient process allows data to be moved quickly and directly between I/O devices and
memory, improving the system's performance by freeing the CPU from handling these data
transfers.
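
To make this sequence concrete, here is a minimal software model of it in Python. The class,
register names, and values are illustrative (a sketch of the behavior described above, not the
programming interface of any real controller); main memory is modeled as a plain list.

    # Minimal model of a DMA transfer: the CPU programs the address and
    # word count registers, then the controller moves data on its own.
    class DMAController:
        def __init__(self):
            self.address = 0              # address register: next memory location
            self.count = 0                # word count register: words left to move

        def setup(self, address, count):
            # Step 1: the CPU loads the address and word-count registers.
            self.address, self.count = address, count

        def transfer(self, memory, device_words):
            # Steps 3-5: after bus grant, move one word at a time,
            # advancing the address and decrementing the word count.
            for word in device_words:
                memory[self.address] = word
                self.address += 1
                self.count -= 1
            # Step 6: raise an interrupt to tell the CPU the transfer is done.
            print("interrupt: transfer complete")

    memory = [0] * 16
    dma = DMAController()
    dma.setup(address=4, count=3)
    dma.transfer(memory, device_words=[10, 20, 30])
    print(memory[4:7])                    # [10, 20, 30]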

Modes of Data Transfer in DMA:

Burst Mode

§ In this mode, the DMA controller takes complete control of the system buses and transfers
all the data in one continuous burst.
§ The CPU is effectively locked out of the bus until the entire data transfer is complete.
§ This mode is efficient for large data transfers where the overhead of repeatedly requesting
and releasing the bus can be avoided.

Cycle Stealing Mode

§ Cycle stealing mode allows the DMA controller to interleave its transfer cycles with those
of the CPU.
§ The DMA controller will transfer one byte (or word) of data and then release control of
the buses back to the CPU before requesting them again for the next byte (or word).
§ This mode is beneficial when it's necessary to maintain a balance between CPU
operations and I/O transfer rates. It's less disruptive to the CPU's activities but slower
overall for the data transfer.

Transparent Mode

§ Transparent mode is the least intrusive to the CPU's operation. The DMA controller
only takes control of the buses when the CPU does not need them.
§ The DMA controller operates during the CPU's idle states, effectively making the data
transfer "invisible" or "transparent" to the CPU.
§ This requires intelligent timing to ensure that the DMA does not interfere with the CPU's
actions.
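
As a rough picture of how burst mode and cycle stealing occupy the bus, the toy Python timeline
below marks each bus cycle as used by the DMA controller ("D") or by the CPU ("C"). The cycle
counts are invented purely for illustration.

    # Toy bus timelines for an 8-word transfer.
    WORDS = 8

    burst = "D" * WORDS + "C" * WORDS     # DMA holds the bus, then the CPU resumes
    stealing = "DC" * WORDS               # one cycle stolen per word, interleaved

    print("burst:   ", burst)             # DDDDDDDDCCCCCCCC
    print("stealing:", stealing)          # DCDCDCDCDCDCDCDC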

Specific DMA Controllers

8237 DMA Controller

§ This controller typically manages four I/O channels and employs a priority encoder to
decide which channel gets serviced first, based on priority levels.
§ Each channel must be programmed individually for its respective data transfer task.

8257 DMA Controller

§ The 8257 is another four-channel DMA controller which can become fully functional
with a single Intel 8212 I/O device connected to it.
§ Like the 8237, it also prioritizes channels, and each of its channels has two 16-bit
registers for managing the address and the terminal count.
Advantages of DMA Controller

§ Increased Efficiency: Offloading data transfer tasks from the CPU to the DMA allows
the CPU to focus on other processing tasks.
§ Speed: DMA can transfer data quickly, using fewer clock cycles compared to CPU-driven
data transfers.
§ Workload Distribution: The DMA controller can manage data transfers independently,
leading to a better-optimized workload for the system.

Disadvantages of DMA Controller

§ Cost: Implementing DMA adds to the system cost due to the need for an additional
controller and more complex bus arbitration logic.
§ Cache Coherence: DMA can lead to issues with cache memory, as the CPU may not be
immediately aware of the changes made directly to the memory by DMA.
§ Complexity: The use of a DMA controller can increase the complexity of both hardware
and software, requiring more sophisticated programming and debugging efforts.
EXPLAIN OPERATION OF MESI PROTOCOL (CACHE COHERENCY)

The MESI protocol is a cache coherency protocol used in multiprocessor systems to maintain
consistency among caches. The acronym MESI stands for the four states that a cache line can be
in: Modified, Exclusive, Shared, and Invalid.

Here’s how each state operates within the protocol:

1. Modified (M): This state indicates that the cache line is both present in the current cache and
has been changed from the value in memory. This cache line is dirty, meaning that it has not been
written back to main memory. The data is only in this cache and no other processor's cache.

2. Exclusive (E): The cache line is present only in the current cache and has not been modified,
meaning it matches main memory. It is "exclusive" in that no other cache has a copy of this data.
This state allows a processor to modify the line without first notifying the other processors,
because no other cache holds a copy that would need to be invalidated.
3. Shared (S): This state means that the cache line may be stored in other caches of the processors
and matches the main memory. It has not been modified, so any processor with the line in "Shared"
state can read it without any issue.

4. Invalid (I): The cache line is not valid or present in the current cache. Any data previously held
in this cache line has been discarded or updated by another processor, and this cache does not have
the latest data.

Transitioning Between States:

Local Read (Green Solid Arrow): When a processor reads a line, depending on whether other
copies exist in other caches, it will be brought to the cache in either Exclusive or Shared state.

Remote Read (Blue Solid Arrow): A read from another processor can cause the cache line to
transition from Exclusive to Shared if it was the only holder or remain in Shared if it was already
shared.

Local Write (Red Solid Arrow): When the local processor writes to a cache line, the line transitions
to Modified from any state, indicating that this cache now holds the only valid copy and that it
differs from what is in memory. If the line was in the Shared state, the other caches' copies are
invalidated first.

Remote Write (Red Dotted Arrow): A write from another processor will invalidate this cache line
if it was in Shared or Exclusive, forcing it to the Invalid state. If it was Modified, it would have to
write the data to memory and then invalidate itself.

The key purpose of MESI is to prevent the "read-modify-write" problem where two processors
might attempt to modify the same memory location simultaneously. By using these states and
transitions, MESI ensures that any modifications are serialized, maintaining memory coherence
throughout the system.
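
The transitions above can be summarized as a small state machine. The Python sketch below models
MESI from the point of view of a single cache; the event names and the other_copies flag are
invented for illustration, not part of any hardware specification.

    # MESI next-state function for one cache line, as seen by one cache.
    MODIFIED, EXCLUSIVE, SHARED, INVALID = "M", "E", "S", "I"

    def next_state(state, event, other_copies=False):
        if event == "local_read":
            if state == INVALID:
                # Fetch the line: Shared if another cache holds it, else Exclusive.
                return SHARED if other_copies else EXCLUSIVE
            return state                  # read hits leave M, E, S unchanged
        if event == "local_write":
            return MODIFIED               # from S, peer copies are invalidated first
        if event == "remote_read":
            if state in (MODIFIED, EXCLUSIVE):
                return SHARED             # from M, the line is written back first
            return state
        if event == "remote_write":
            return INVALID                # our copy becomes stale
        raise ValueError(event)

    s = next_state(INVALID, "local_read")   # -> "E" (no other copies)
    s = next_state(s, "local_write")        # -> "M"
    s = next_state(s, "remote_read")        # -> "S"
    s = next_state(s, "remote_write")       # -> "I"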
WHAT IS MEANT BY SYNCHRONIZATION IN COMPUTER ARCHITECTURE?

In computer architecture, synchronization refers to the coordination of concurrent operations to
ensure correct execution and to avoid conflicts. Specifically, it is a strategy to ensure that multiple
threads or processes that are accessing shared resources do so in a manner that maintains
consistency and prevents data races and other related problems. Synchronization can be necessary
at multiple levels within a computing system, from low-level hardware up to high-level application
software.

Let's look at a few of the contexts where synchronization is important:

§ Multi-core and Multi-processor Systems: Modern computers often have multiple
processors or multi-core CPUs where each core can execute threads independently.
Synchronization ensures that when these threads access shared memory, they do not
interfere with each other or cause inconsistent states.
§ Operating Systems: The OS manages process scheduling and resource allocation.
Synchronization ensures that processes do not simultaneously modify the same data or
resource, which could lead to corruption or deadlock.
§ Database Systems: Synchronization mechanisms are critical to ensure the integrity of
data. Transactions must be carefully coordinated so that concurrent operations do not
produce inconsistent or incorrect results.
§ Distributed Systems: In systems spread across networks, synchronization ensures that
distributed components can work together coherently, which can involve complex
coordination across different machines and possibly different geographies.
§ Instruction-Level: Inside a CPU, synchronization is important to ensure that instruction
execution is correctly ordered — for instance, using memory barriers to guarantee that
reads and writes occur in the intended sequence.
The common mechanisms for synchronization include:

§ Mutexes (Mutual Exclusion Locks): Used to prevent multiple threads from entering a
critical section of code simultaneously.
§ Semaphores: Counting semaphores can be used to control access to a resource pool, while
binary semaphores are like mutexes.
§ Critical Sections: Sections of code that should be executed by only one thread at a time.
§ Barriers: Synchronization points where threads or processes wait until all have reached
the barrier point before any are allowed to proceed.
§ Latches and Flip-flops: Hardware mechanisms used for synchronization within and
between chips.
§ Atomic Operations: Instructions that execute indivisibly (for example, test-and-set or
compare-and-swap), which can be used for synchronization without the need for locks.
§ Condition Variables: These allow threads to wait for certain conditions to be true before
proceeding.
§ Read-Write Locks: These allow multiple readers or a single writer to access a resource
but not both simultaneously.

In all these cases, the goal is to ensure that parallel or concurrent operations on shared resources
lead to correct behavior and predictable results. Proper synchronization is vital for maintaining
data integrity and system stability in any environment where tasks are performed concurrently.
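
As a concrete illustration of two of the mechanisms listed above, the short Python sketch below
uses the standard threading module: a Lock serves as the mutex guarding a critical section, and
a Barrier provides a synchronization point. The worker function and the counts are invented for
the example.

    import threading

    counter = 0
    lock = threading.Lock()
    barrier = threading.Barrier(4)        # four worker threads

    def worker():
        global counter
        for _ in range(100_000):
            with lock:                    # mutual exclusion: one thread at a time
                counter += 1              # the critical section
        barrier.wait()                    # wait here until every worker arrives

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)                        # always 400000; a data race without the lock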
WHAT IS PAGING? WHAT IS A PAGE TABLE?

Paging is a memory management scheme by which a computer stores and retrieves data from
secondary storage for use in main memory. In this system, the operating system retrieves data from
secondary storage in blocks of a fixed size, called "pages." The main memory is divided into small,
fixed-sized blocks of physical memory called "frames." The size of a frame is the same as that of
a page to make a perfect fit for the pages into frames.

In a paged memory system, a computer can manage main memory more efficiently because a
program's pages can be placed into any free frames; its memory does not need to be physically
contiguous. Working with small, fixed-size blocks simplifies finding free storage for new programs
and files. Paging also supports multitasking, as different programs can be loaded into separate
frames of memory without any overlap.

A Page Table is a part of the virtual memory system in an operating system that keeps track of all
the pages in virtual memory and maps each one to a specific frame in physical memory. It is a data
structure used by the CPU to translate virtual addresses into physical addresses. When a program
references a virtual address, the CPU uses the page table to translate this virtual address to its
corresponding physical address so that the correct data can be accessed in RAM.

Here is how paging works conceptually.

§ Virtual Address Space: The logical address space of a program (the virtual addresses that
the program may use) is divided into blocks of data called "pages."
§ Physical Address Space: The physical address space (the actual addresses on the RAM
chips) is divided into blocks of memory called "frames." Frames are the same size as pages
so that any page can be mapped into any frame.
§ Page Table: This is a table that the operating system maintains which maps page numbers
to frame numbers. When a program needs to access a certain page, the table tells the system
which frame that page resides in.
§ Address Translation: When a program wants to access data, it provides a virtual address.
The virtual address is divided into a "page number" (which indicates which page the data
is on) and a "page offset" (which tells the system where the data is within the page). The
page table is then used to map the page number to a frame number to find the corresponding
frame in physical memory. The page offset is used to find the exact location of the data
within the frame.

Here is a simplified example of how a page table might work.

Suppose a computer system has a 32-bit address space, and the page size is 4 KB (2^12 bytes).
Each page table entry would then cover 4 KB of data. The high-order 20 bits of a virtual address
specify the page number, and the lower 12 bits specify the offset within the page.

Let's say we have the following virtual address: 0x00403020.


1. The page number would be the high-order 20 bits: 0x00403.
2. The page offset would be the lower 12 bits: 0x020.
3. The page table would be consulted to find out which frame page 0x00403 corresponds to. Let's
say the page table says it is in frame number 0x00002.
4. The physical address would then be constructed by combining the frame number with the page
offset, resulting in 0x00002020.
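
The same translation can be written out in a few lines of Python. The dict standing in for the
page table is purely illustrative; real hardware walks a table in memory, usually through a TLB.

    PAGE_BITS = 12                        # 4 KB pages -> 12 offset bits

    page_table = {0x00403: 0x00002}       # page 0x00403 resides in frame 0x00002

    def translate(vaddr):
        page = vaddr >> PAGE_BITS                 # high-order 20 bits: page number
        offset = vaddr & ((1 << PAGE_BITS) - 1)   # low-order 12 bits: page offset
        frame = page_table[page]                  # lookup; a miss would be a page fault
        return (frame << PAGE_BITS) | offset

    assert translate(0x00403020) == 0x00002020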

The CPU uses the page table continuously to translate virtual addresses to physical addresses as
programs run. This happens so quickly and seamlessly that programs are unaware of the
translation.
FIND THE TOTAL NUMBER OF FRAMES IF, IN A SYSTEM, THE SIZE OF MAIN
MEMORY IS 2^30 BYTES, THE PAGE SIZE IS 4 KB, AND THE SIZE OF EACH PAGE
TABLE ENTRY IS 32 BITS.

The calculation for the number of frames is:


Number of frames = Size of physical memory / Page size
= 2^30 / 2^12 (since 4 KB = 2^12 bytes)
= 2^18

CONSIDER A SYSTEM WITH PAGE TABLE ENTRIES OF 8 BYTES EACH. IF THE
SIZE OF THE PAGE TABLE IS 256 BYTES, WHAT IS THE NUMBER OF ENTRIES IN
THE PAGE TABLE?

The number of entries in the page table is calculated by dividing the total size of the page table by
the size of each entry:
Number of entries = Size of page table / Size of each entry
= 256 bytes / 8 bytes per entry
= 32 entries

CONSIDER A MACHINE WITH 32-BIT LOGICAL ADDRESSES, 4 KB PAGE SIZE AND
PAGE TABLE ENTRIES OF 4 BYTES EACH. FIND THE SIZE OF THE PAGE TABLE
IN BYTES. ASSUME THE MEMORY IS BYTE ADDRESSABLE.

The size of the page table given a 32-bit logical address space and 4 KB (2^12 bytes) page size
with each entry being 4 bytes is calculated as:
Number of pages = 2^32 / Page size
= 2^32 / 2^12
= 2^20
Page table size = Number of pages × Size of each page table entry
= 2^20 × 4 bytes
= 2^22 bytes
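
All three of the calculations above can be checked with a few lines of Python:

    MEM, PAGE = 2**30, 4 * 1024

    print(MEM // PAGE == 2**18)               # number of frames = 2^18
    print(256 // 8)                           # 32 page-table entries
    print((2**32 // PAGE) * 4 == 2**22)       # page table size = 2^22 bytes (4 MB)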
