
CACHE MEMORIES

CS 455 Spring 2002


Overview
A cache memory, often simply called a cache, is a fast local memory used as a buffer for a more distant, larger, and slower memory in order to improve the average memory access speed. Although relatively small, a cache relies on the principle of locality: if it retains recently referenced information, it will tend to contain much of the information needed in the near future. There are two principal types of cache memory in a conventional computer architecture. A data cache buffers transfers of data and instructions between the main memory and the processor. A paging cache, also known as the TLB or Translation Lookaside Buffer, is used in a virtual memory architecture to buffer recently referenced entries from page and segment tables, thus avoiding an extra memory reference when these items are needed again. A virtual memory can itself be regarded as a type of cache system in which the main memory is a buffer for the secondary memory. A comparison among these three types of caches is given at the end of this chapter.
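To see the effect on average access speed, consider a worked example with purely illustrative numbers (they are not drawn from any particular machine): if a cache access takes 10 ns, a main memory access takes 100 ns, and 95% of references are found in the cache, the average access time is 0.95 × 10 + 0.05 × 100 = 14.5 ns, much closer to the cache speed than to the memory speed.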

Structure of a Cache Memory


A typical cache memory consists of an array of fast registers, each of which can hold a unit of information. Since a cache holds only a small subset of the possible data items, each item must also be tagged with a field that identifies it. For a data cache, each item is a line of data: a small sequence of words, typically 16 or fewer. These are tagged with a line number. For a paging cache, the items are page or segment descriptors, and the tags are the corresponding page or segment numbers.
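As a concrete illustration, one register of a data cache might be modeled as follows. This is a minimal sketch in C; the field widths and the 16-word line size are illustrative assumptions, not a description of any particular machine.

    #include <stdint.h>
    #include <stdbool.h>

    #define WORDS_PER_LINE 16   /* a typical line: 16 words or fewer */

    /* One register of a data cache: a validity flag, a tag
       identifying the line, and the line of data itself. */
    struct cache_line {
        bool     valid;                 /* does this register hold anything? */
        uint32_t tag;                   /* line number of the stored data    */
        uint32_t data[WORDS_PER_LINE];  /* the cached words                  */
    };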

Accessing an Item
To locate a word of memory for access using a data cache, for example, we must first determine the line number for that word and see whether the line is in the cache. To do this, a search must be made of the tags. To support a fast search, the cache is structured as an associative memory, with circuitry that allows an immediate determination of whether a particular tag is present, and a readout of its contents if so. If this test fails, the item is fetched in the normal way from memory, and its line is added to the cache (see below). Since a true associative search of a large cache may be infeasible, the cache is often organized as a set-associative memory, in which the cache is broken into a number of smaller caches called sets. Each set is an independent cache for a portion of the address space (or of the page/segment numbers). The sets, in turn, with only a few registers each, are organized as fully associative memories.
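A set-associative lookup can be sketched in software as follows. This is a hypothetical C model: the set count, associativity, and address breakdown are illustrative assumptions, and real hardware performs the tag comparisons within a set in parallel rather than in a loop.

    #include <stdint.h>
    #include <stdbool.h>

    #define WORDS_PER_LINE 16
    #define NUM_SETS       64           /* illustrative */
    #define WAYS_PER_SET   4            /* registers per set */

    struct cache_line {
        bool     valid;
        uint32_t tag;
        uint32_t data[WORDS_PER_LINE];
    };

    struct cache_set {
        struct cache_line way[WAYS_PER_SET];
    };

    /* Look up one word.  The line number is the address with the
       word-within-line bits removed; the low bits of the line number
       select a set, and the rest form the tag searched within it. */
    bool cache_lookup(struct cache_set sets[NUM_SETS],
                      uint32_t addr, uint32_t *word_out)
    {
        uint32_t line   = addr / WORDS_PER_LINE;
        uint32_t offset = addr % WORDS_PER_LINE;
        uint32_t set    = line % NUM_SETS;
        uint32_t tag    = line / NUM_SETS;

        for (int w = 0; w < WAYS_PER_SET; w++) {   /* parallel in hardware */
            if (sets[set].way[w].valid && sets[set].way[w].tag == tag) {
                *word_out = sets[set].way[w].data[offset];
                return true;                       /* hit */
            }
        }
        return false;                              /* miss: fetch from memory */
    }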


Updating the Cache


Since a cache memory exploits the principle of locality, an accessed item that was not previously in the cache is normally added to it. To make room, an existing item must be removed. A simple FIFO replacement algorithm can be implemented by maintaining an index that cycles through all cache positions: each new item goes into the next position in order and overwrites the oldest entry. A bit more hardware allows a true LRU algorithm, especially with small sets. In this approach, a square array of bits records the relative order of access to the registers in a set; the bits are updated appropriately when an access occurs.
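One common form of the bit-matrix LRU scheme described above can be sketched as follows. This is an illustrative C model for a single small set; hardware would update all the bits in one cycle.

    #include <stdint.h>

    #define WAYS 4                      /* registers in one set (illustrative) */

    /* One row of bits per register.  On an access to register i, row i
       is set to all ones and then column i is cleared everywhere; the
       register whose row is all zeros is the least recently used. */
    static uint8_t order[WAYS][WAYS];

    void record_access(int i)
    {
        for (int j = 0; j < WAYS; j++) order[i][j] = 1;  /* row i := 1s    */
        for (int k = 0; k < WAYS; k++) order[k][i] = 0;  /* column i := 0s */
    }

    int lru_victim(void)
    {
        for (int i = 0; i < WAYS; i++) {
            int ones = 0;
            for (int j = 0; j < WAYS; j++) ones += order[i][j];
            if (ones == 0) return i;    /* all-zero row: least recently used */
        }
        return 0;                       /* unreachable once all ways touched */
    }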

Updating Data Items


A different problem is how to update a data item once accessed, if a copy is in the cache. Since there are two copies, they must be kept consistent. There are two approaches. The write-through method updates both the cache and the main memory immediately. Consistency is thus always maintained, but the speed benefit of the cache is lost for updates. The write-back method updates only the cache and uses a flag bit to mark the register as modified, or "dirty"; the main memory is updated only when the item is replaced. With this approach many memory references are avoided and the speed is greater, but the hardware is more complex and there is a risk of inconsistent copies.
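The two policies can be contrasted in a small sketch. This is an illustrative C model; the simulated memory array and line structure are hypothetical stand-ins, not real interfaces.

    #include <stdint.h>
    #include <stdbool.h>

    #define WORDS_PER_LINE 16

    struct cache_line {
        bool     valid;
        bool     dirty;                 /* used only by write-back */
        uint32_t tag;
        uint32_t data[WORDS_PER_LINE];
    };

    static uint32_t memory[1 << 16];    /* simulated main memory */

    /* Write-through: update both copies, so they never disagree,
       but pay the memory speed on every write. */
    void write_through(struct cache_line *l, uint32_t addr, uint32_t word)
    {
        l->data[addr % WORDS_PER_LINE] = word;
        memory[addr] = word;
    }

    /* Write-back: update the cache only and mark the register dirty;
       memory is brought up to date when the line is replaced. */
    void write_back(struct cache_line *l, uint32_t addr, uint32_t word)
    {
        l->data[addr % WORDS_PER_LINE] = word;
        l->dirty = true;
    }

    void write_back_on_replace(struct cache_line *l, uint32_t line_base)
    {
        if (l->dirty)                   /* flush the modified line */
            for (int w = 0; w < WORDS_PER_LINE; w++)
                memory[line_base + w] = l->data[w];
        l->dirty = false;
    }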

Multilevel Caches
Memory today is very inexpensive, and becoming increasingly large. Despite the principle of locality, a cache will not function effectively if its size is many orders of magnitude smaller than the memory it is buffering. A natural solution to this problem is to make caches larger as well, perhaps on the order of megabytes instead of kilobytes. Such a cache may be able to hold a sufficient range of information, but is too large to be managed effectively and accessed quickly. We now need a cache for the cache. This trend leads to the use of multilevel caches: a small cache on the processor chip, and a larger cache on a separate chip nearby. These are often called the Level 1 (L1) cache and the Level 2 (L2) cache, respectively.
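With purely illustrative numbers again (they are my own, not from any particular machine): if the L1 cache responds in 5 ns with a 90% hit rate, the L2 cache in 15 ns with a 95% hit rate on the references that miss L1, and main memory in 100 ns, the average access time is 0.90 × 5 + 0.10 × (0.95 × 15 + 0.05 × 100) ≈ 6.4 ns, even though only nine references in ten hit the fast L1.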


Problems with Cache Systems


Several problems can arise with cache systems if they are not managed carefully, especially if a write-back update method is used. These problems arise chiefly in connection with multiuser environments, virtual memory, and input/output.

In a multiuser environment, frequent process switching occurs, and locality does not apply across process changes: typically none of the items already in the cache will be useful to the new process. The new process must therefore build up a set of useful cache entries from scratch. If another switch occurs before this happens, the normal speed benefits of caching are never attained.

In the presence of virtual memory, this problem is more acute. Since each process attaches a different meaning to the same addresses (and page numbers), old data in the cache is no longer even correct for a new process, and the cache must be completely flushed (a sketch of such a flush follows at the end of this section). Since the most frequent switching occurs between user processes and the operating system itself, a partial solution is to provide two separate caches: one selected during privileged mode and used by the OS, the other shared by user processes.

Input and output can also cause difficulties with a write-back system. A data cache is an element of the CPU, but some I/O channels access memory independently via DMA techniques. These transfers do not go through the cache and are unaware of locations that have been updated in the cache only. The only solutions are to avoid write-back or to provide the I/O DMA channels with their own cache that is cross-connected to that of the CPU.
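The complete flush mentioned above can be sketched as follows. This is an illustrative C model reusing the line structure from earlier, with the simplifying assumption that the tag is the full line number; real processors do this with dedicated cache-control hardware, and the sizes here are arbitrary.

    #include <stdint.h>
    #include <stdbool.h>

    #define WORDS_PER_LINE 16
    #define NUM_LINES      256          /* illustrative cache size */

    struct cache_line {
        bool     valid, dirty;
        uint32_t tag;                   /* here: the full line number */
        uint32_t data[WORDS_PER_LINE];
    };

    static struct cache_line cache[NUM_LINES];
    static uint32_t memory[1 << 16];    /* simulated main memory */

    /* Complete flush on a process switch: write every dirty line back
       to memory, then invalidate everything, since the old contents
       are meaningless under the new process's address space. */
    void flush_cache(void)
    {
        for (int i = 0; i < NUM_LINES; i++) {
            if (cache[i].valid && cache[i].dirty) {
                uint32_t base = cache[i].tag * WORDS_PER_LINE;
                for (int w = 0; w < WORDS_PER_LINE; w++)
                    memory[base + w] = cache[i].data[w];
            }
            cache[i].valid = cache[i].dirty = false;
        }
    }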


Comparison of Cache Types


This table summarizes the characteristics of the three types of "cache" memories used in a typical storage system: data cache, paging cache, and the virtual memory itself.

                          VIRTUAL MEMORY           DATA CACHE              PAGING CACHE

    BUFFER                Main memory              Cache                   Cache

    BULK STORAGE          Secondary memory         Main memory             Main memory

    UNITS OF STORAGE      Pages of data            Lines of data           Page and segment
                                                                           descriptors

    METHOD OF             Extract page and         Extract line number     Extract page and
    IDENTIFICATION        segment number from      from address            segment number from
                          virtual address                                  virtual address

    METHOD OF ACCESS      Fetch descriptor from    Associative cache       Associative cache
                          paging cache or index    search                  search
                          into page table; use
                          descriptor to access
                          page

    IF NOT FOUND          Issue page fault,        Fetch from main,        Fetch from main,
                          update and retry         update cache            update cache

    HOW UPDATED           By operating system      By hardware             By hardware

    WHEN UPDATED          On reference or before   On reference            On reference

    WRITE STRATEGY        Write-back               Write-through or        Write-through or
                                                   write-back              write-back

    REPLACEMENT           By operating system;     By hardware;            By hardware;
    STRATEGY              may be complex           FIFO or LRU             FIFO or LRU

    MULTIPLE LEVELS       Not feasible             Often                   Rarely

