
Understand CPU Caching Concepts

Concept of Caching
The need for a cache arises for two main reasons:

The concept of locality of reference.
-> Roughly 5 percent of the data is accessed 95 percent of the time, so it makes sense to cache that 5 percent (see the sketch after this list).

The gap between CPU and main memory speeds.
-> By analogy with the producer-consumer problem, the CPU is the consumer while RAM and hard disks act as producers. Slow producers limit the performance of the consumer.
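The effect of locality is easy to observe: traversing a two-dimensional array row by row touches consecutive addresses and reuses cached lines, while traversing it column by column jumps across memory and defeats the cache. A minimal C sketch follows; the array size and the use of clock() for timing are illustrative choices, not a rigorous benchmark:

    #include <stdio.h>
    #include <time.h>

    #define N 4096
    static int grid[N][N];  /* stored row-major in C */

    int main(void) {
        long sum = 0;
        clock_t t0, t1;

        /* Row-major walk: consecutive addresses, cache-line friendly */
        t0 = clock();
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += grid[i][j];
        t1 = clock();
        printf("row-major:    %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        /* Column-major walk: strides of N*sizeof(int), poor locality */
        t0 = clock();
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += grid[i][j];
        t1 = clock();
        printf("column-major: %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        return (int)(sum & 1);  /* keep sum live so the loops aren't optimized away */
    }

On typical hardware the row-major walk is several times faster, even though both loops perform exactly the same additions.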

CPU Cache and its operation


A CPU cache is a smaller, faster memory that stores copies of data from the most frequently used main memory locations. Locality of reference drives the caching concept: the most frequently used data and instructions are cached for faster access, and a CPU cache may be a data cache or an instruction cache. Unlike RAM, cache is not expandable. The CPU first checks the L1 cache for the data; if it does not find it in L1, it moves on to L2 and finally L3. If the data is not in L3 either, RAM is searched next, followed by the hard drive. Finding the requested data in a cache is a cache hit; failing to find it is a cache miss.
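The lookup cascade described above can be modeled as a chain of checks from the fastest level to the slowest. This is a toy sketch, not how the hardware works internally; the in_l1/in_l2/in_l3/in_ram predicates are made-up stand-ins for real tag comparisons:

    #include <stdbool.h>
    #include <stdio.h>

    /* Dummy residency tests standing in for real tag comparisons. */
    static bool in_l1(unsigned long a)  { return a % 64 == 0; }
    static bool in_l2(unsigned long a)  { return a % 16 == 0; }
    static bool in_l3(unsigned long a)  { return a % 4  == 0; }
    static bool in_ram(unsigned long a) { return a % 2  == 0; }

    typedef struct {
        const char *name;
        bool (*contains)(unsigned long addr);
    } level_t;

    /* Walk the hierarchy from fastest to slowest; the first level
     * that holds the address is a hit, everything before it a miss. */
    static const char *lookup(unsigned long addr) {
        level_t levels[] = {
            {"L1", in_l1}, {"L2", in_l2}, {"L3", in_l3}, {"RAM", in_ram},
        };
        for (int i = 0; i < 4; i++)
            if (levels[i].contains(addr))
                return levels[i].name;
        return "hard disk";  /* missed every level, including RAM */
    }

    int main(void) {
        unsigned long addrs[] = {128, 48, 6, 7};
        for (int i = 0; i < 4; i++)
            printf("addr %lu served from %s\n", addrs[i], lookup(addrs[i]));
        return 0;
    }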

Levels of caching and speed, size comparisons


Level                     Access Time                Typical Size    Technology   Managed By
Level 1 Cache (on-chip)   2-8 ns                     8 KB - 128 KB   SRAM         Hardware
Level 2 Cache (off-chip)  5-12 ns                    0.5 MB - 8 MB   SRAM         Hardware
Main Memory               10-60 ns                   64 MB - 2 GB    DRAM         Operating System
Hard Disk                 3,000,000 - 10,000,000 ns  100 GB - 2 TB   Magnetic     Operating System

Cache organization

When the processor needs to read or write a location in main memory, it first checks whether that memory location is in the cache. This is accomplished by comparing the address of the memory location to all tags in the cache that might contain that address. If the processor finds that the memory location is in the cache, we say that a cache hit has occurred; otherwise, we speak of a cache miss.
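For a direct-mapped cache, "comparing the address to tags" reduces to bit arithmetic: the address is split into an offset within the line, an index selecting the set, and a tag that is compared against the tag stored in that set. The geometry below (64-byte lines, 256 sets) is an assumed example, not the layout of any particular CPU:

    #include <stdint.h>
    #include <stdio.h>

    #define LINE_BYTES 64   /* 6 offset bits (assumed line size)  */
    #define NUM_SETS   256  /* 8 index bits  (assumed cache size) */

    int main(void) {
        uint64_t addr = 0x7ffdc0de1234;  /* an arbitrary example address */

        uint64_t offset = addr % LINE_BYTES;               /* byte within the line */
        uint64_t index  = (addr / LINE_BYTES) % NUM_SETS;  /* which set to look in */
        uint64_t tag    = addr / LINE_BYTES / NUM_SETS;    /* compared against the
                                                              stored tag: equal tag
                                                              means hit, else miss */
        printf("addr %#lx -> tag %#lx, index %lu, offset %lu\n",
               (unsigned long)addr, (unsigned long)tag,
               (unsigned long)index, (unsigned long)offset);
        return 0;
    }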

Cache Performance

Cache performance is determined by several factors:
-> Cache size
-> Cache handling
-> Replacement strategy
-> Automatic prefetching
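These factors combine in the usual average memory access time formula, AMAT = hit time + miss rate x miss penalty, applied level by level. The sketch below uses latencies in the same range as the table above; the hit rates are assumptions chosen only to illustrate the arithmetic:

    #include <stdio.h>

    int main(void) {
        /* Latencies (ns) in the same range as the table above. */
        double l1 = 4.0, l2 = 10.0, mem = 60.0;

        /* Assumed hit rates, chosen only to illustrate the formula. */
        double l1_hit = 0.95, l2_hit = 0.90;

        /* AMAT = hit time + miss rate * miss penalty, per level. */
        double amat_l2 = l2 + (1.0 - l2_hit) * mem;
        double amat    = l1 + (1.0 - l1_hit) * amat_l2;

        printf("effective access time: %.2f ns\n", amat);  /* ~4.8 ns */
        return 0;
    }

Even a modest miss rate keeps the effective access time close to the L1 latency, which is why hit rates matter so much.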

Handling a Cache Miss


In order to make room for a new entry on a cache miss, the cache has to evict one of the existing entries. The heuristic it uses to choose the entry to evict is called the replacement policy. The fundamental problem with any replacement policy is that it must predict which existing cache entry is least likely to be used in the future. One popular replacement policy, LRU, replaces the least recently used entry.
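A minimal LRU policy can be sketched with a timestamp per entry: a hit refreshes the entry's timestamp, and a miss evicts the entry with the oldest one. Real hardware usually approximates LRU with cheaper schemes such as pseudo-LRU bits; the fully associative four-entry cache below is only a sketch of the policy itself:

    #include <stdio.h>

    #define WAYS 4  /* a tiny fully-associative cache for illustration */

    typedef struct { long tag; long last_used; int valid; } entry_t;

    static entry_t cache[WAYS];
    static long clock_tick = 0;

    /* Returns 1 on hit, 0 on miss; a miss evicts the LRU entry. */
    static int cache_access(long tag) {
        int victim = 0;
        for (int i = 0; i < WAYS; i++) {
            if (cache[i].valid && cache[i].tag == tag) {
                cache[i].last_used = ++clock_tick;  /* refresh recency on hit */
                return 1;
            }
            /* Track the least recently used (or first empty) slot. */
            if (!cache[victim].valid) continue;   /* empty slot already found */
            if (!cache[i].valid || cache[i].last_used < cache[victim].last_used)
                victim = i;
        }
        cache[victim].tag = tag;                  /* evict and install */
        cache[victim].valid = 1;
        cache[victim].last_used = ++clock_tick;
        return 0;
    }

    int main(void) {
        long refs[] = {1, 2, 3, 4, 1, 5, 1};  /* 5 evicts tag 2, the LRU entry */
        for (int i = 0; i < 7; i++)
            printf("tag %ld: %s\n", refs[i], cache_access(refs[i]) ? "hit" : "miss");
        return 0;
    }

In the reference string above, tag 5 evicts tag 2 because tags 1, 3 and 4 were all touched more recently, so the final access to tag 1 still hits.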

Mirroring Cache to Main Memory


If data are written to the cache, they must at some point be written to main memory as well. The timing of this write is controlled by what is known as the write policy. In a write-through cache, every write to the cache causes a write to main memory. In a write-back (or copy-back) cache, writes are not immediately mirrored to main memory. Instead, the cache tracks which locations have been written over, marking them dirty, and writes them back to main memory only when those lines are evicted.
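The difference between the two policies can be seen in a few lines of code: write-through updates main memory on every store, while write-back only marks the line dirty and defers the memory update until eviction. The single cached line and tiny memory array below are deliberate simplifications:

    #include <stdio.h>

    typedef struct { long addr, value; int dirty; } line_t;

    static long memory[16];          /* toy backing store     */
    static line_t line = {0, 0, 0};  /* a single cached line  */
    static int write_through = 0;    /* 0 = write-back policy */

    static void cpu_write(long addr, long value) {
        line.addr = addr;
        line.value = value;
        if (write_through)
            memory[addr] = value;    /* mirror to memory immediately    */
        else
            line.dirty = 1;          /* defer: just remember it changed */
    }

    static void evict(void) {
        if (!write_through && line.dirty) {
            memory[line.addr] = line.value;  /* write back on eviction only */
            line.dirty = 0;
        }
    }

    int main(void) {
        cpu_write(3, 42);
        printf("before evict: memory[3] = %ld\n", memory[3]);  /* 0 under write-back */
        evict();
        printf("after evict:  memory[3] = %ld\n", memory[3]);  /* 42 */
        return 0;
    }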

Stale data in cache


The data in main memory being cached may be changed by other entities (e.g. peripherals using direct memory access, or another core in a multi-core processor), in which case the copy in the cache may become out-of-date or stale. Alternatively, when one CPU core in a multi-core processor updates data in its cache, copies of that data in caches associated with other cores become stale. Communication protocols between the cache managers that keep the data consistent are known as cache coherence protocols. Another possibility is to share only non-cacheable data.
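The invalidate-on-write idea behind many coherence protocols can be caricatured as follows: when one core writes a location, every other core's copy is marked invalid, forcing the next read on those cores to refetch fresh data. Real protocols such as MESI track more states and avoid the write-through shortcut used here; this is only a toy sketch:

    #include <stdio.h>

    #define CORES 2

    typedef struct { long value; int valid; } copy_t;

    static long memory_word = 0;  /* one shared location */
    static copy_t cached[CORES];  /* each core's copy    */

    static void core_write(int core, long value) {
        cached[core].value = value;
        cached[core].valid = 1;
        memory_word = value;             /* write-through, for simplicity */
        for (int c = 0; c < CORES; c++)  /* invalidate all other copies   */
            if (c != core)
                cached[c].valid = 0;
    }

    static long core_read(int core) {
        if (!cached[core].valid) {       /* stale or missing: refetch */
            cached[core].value = memory_word;
            cached[core].valid = 1;
        }
        return cached[core].value;
    }

    int main(void) {
        core_write(0, 7);
        printf("core 1 reads %ld\n", core_read(1));  /* 7: refetched after invalidate   */
        core_write(1, 9);
        printf("core 0 reads %ld\n", core_read(0));  /* 9: core 0's copy was invalidated */
        return 0;
    }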

References

Wikipedia: http://en.wikipedia.org/wiki/CPU_cache
Ars Technica: http://arstechnica.com/
Intel Software: http://software.intel.com
What Every Programmer Should Know About Memory - Ulrich Drepper, Red Hat, Inc.

