
Operating System

Memory management

(Basics)

Goals of an OS
● Maximize memory utilization
● Improve protection between applications
● Maximize CPU utilization
● Minimize response time
● Prioritize “important” processes
● Note: Conflicting goals/tradeoffs
● E.g. running many processes increases CPU
utilization but can reduce system response
time.
Process
● Unit of protection
● One or more threads of execution
● Resources required for execution
● Memory (RAM)
● Program code (“text”)
● Data (initialized, uninitialized, stack)
● Buffers held in the kernel on behalf of the process
Example:
protection
● Try the code in LMS
● You see different values for the same
address
● PS: fork creates a child process identical to
the parent calling it
Difference between process
and thread
● Process → unit of protection
○ Anything to do with protecting: what memory you
can access, what files you can access etc.
● Thread → unit of execution
○ Anything to do with execution: stacks, local
variables
● Threads are sometimes called lightweight
processes
○ for historical reasons: originally there was
one thread per process
Basic idea of memory
management
● (ignoring any protection)
● Keeps track of what memory is in use and
what memory is free
● Allocates free memory to process when
needed
● Deallocates unwanted memory from a
process
● Manages the transfer of memory between
RAM and disk.
Memory hierarchy
● Ideally we want memory that is
● Fast
● Large
● Possible to an extent with the
memory hierarchy
● Locality of reference
● OS coordinates how memory
hierarchy is used.
● Focus usually on RAM ⇄ Disk
● RAM to cache → mainly by
hardware as directed by the OS
Locality of reference

● Locality of reference: the same value or a
related storage location is accessed
frequently.
● Two (main) types:
● Temporal locality: if a location is referenced it will
be referenced again in the near future (time)
● Spatial locality: if a location in memory is
referenced then an adjacent location will be
referenced in the near future. (space)
Memory hierarchy – again
Capacity    Level                  Access latency
< 1 KB      Registers              1 ns
1 MB        Cache (SRAM)           2–5 ns
256 MB      Main memory (DRAM)     50 ns
500 GB      Magnetic disk          5 ms
> 1 TB      Magnetic tape          50 sec
(the higher in the hierarchy, the better: faster but smaller)

● When searching for something, start from the top of the
hierarchy and keep moving down
● When a location is accessed, move it and adjacent ones up
the hierarchy and keep them there
● So the next access will be from the faster memory
Memory layout of a process

● Typical layout of a process
● At the very top you have the OS code
● Followed by the stack of the thread
● Followed by the heap (from where we
dynamically allocate memory)
● Followed by the global variables
● Finally you have the code
● Try the mem.c code from LMS
● (actual addresses will vary, but the layout will be
the same)
(figure: OS | stack (local vars) | heap (dynamically allocated) | global variables | code)
mem.c
● j is allocated from the
stack
● i is a global variable
● k is from the heap
Concept of virtual memory
(figure: virtual address space → page table structure → RAM (physical addresses))

● Addresses generated by the code are virtual
addresses
● These addresses are translated using page
tables to get the RAM address
Page table basics
● Depends on the hardware
● There are two types of hardware:
○ Hardware-defined page tables: page table
structure is defined by HW (e.g. ARM, x86)
○ Software-defined page tables: page table
structure is left to the OS designers (e.g. MIPS)
● Types of page tables: two-level page tables,
hash tables, …
● 64-bit architectures have more levels
Two-level page table
Memory management
● Two broad classes of memory management systems
● Those that transfer processes to and from disk
during execution
● Called swapping or paging
● Linux swap: find out what it is doing
● Those that do not
● Simple
● Sometimes the only option → e.g. embedded devices with no disk
Notes on two-level page tables
● Virtual memory and physical memory are broken down into equal-size
chunks called pages and frames respectively.
● Typical page size is 4KB
● You need 12 bits to cover 4KB of memory. Called the offset.
● On a 32-bit machine you are left with 20 bits that need to be translated
● Those 20 bits are called the page number
● The page number is broken into 2 parts (for the two levels), 10 bits each
● The first 10 bits are used to index into the page directory (or 1st-level page
table)
● The page directory gives you a pointer to the page table (or 2nd-level
page table)
● You index the page table using the remaining 10 bits
● The page table gives the top 20 bits of the frame (called the frame number)
● The frame number is concatenated with the offset to get the physical address
Back to the process
(figure: two virtual address spaces, each with its own page table, mapping into RAM)

● The two processes have two different page
tables
● So, even for the same virtual address we get
different physical addresses and therefore
different values
Speed of memory access
● You need 2 memory accesses to translate an
address
● Plus one more to access the actual memory
● Too slow: use Translation Lookaside Buffers
(TLB) to cache translations
(figure: virtual address space → TLB → page table structure → RAM (physical addresses))

TLB
● Acts as a cache of page table lookups (temporal
and spatial locality)
● When you want to translate a page number to a frame
number, first check the TLB
● If you find it in the TLB it is a TLB hit and translation stops
there
● If you cannot find the translation in the TLB (TLB
miss), walk the page table (look it up in the page table)
● If you find it in the page table, refill the TLB
● If you cannot find a translation in the page table, it is called a
page fault
● Page faults are notified to the OS
Exercise
The system has an 80% TLB hit rate and a
two-level page table structure. How many bytes
do you need to access to read 4KB of memory?
Paging
● Also known as demand paging or swapping
● Load content based on need
● Keep the pages unmapped in the page table
○ Get the page fault
○ Find a free frame
○ Load content from disk
○ Set up a mapping in the page table
(figure: pages loaded from disk into RAM frames on demand)
Page sizes
● What should be the best page size?
● Things to remember
○ We map/unmap/share in units of pages
○ If the TLB can cover lots of memory that would
be fast
○ It is difficult to build large TLBs
Changing page size
● Increasing page size
○ Better TLB coverage
○ Increased internal fragmentation
○ Increased page fault latency
● Decreasing page size
○ Reduced TLB coverage
○ Reduced internal fragmentation
○ Reduced page fault latency
Note on fragmentation
● Fragmentation = unused memory
○ Internal fragmentation: you have allocated more
than what is required
○ External fragmentation: you have memory that
cannot be allocated since it is not
contiguous
○ (external fragmentation does not happen if the page
size is fixed)
Current CPUs
● They support different page sizes
● Typically there is a base size (4KB)
● They can map bigger pages which are a multiple
of the base size (1MB, 4MB, …)
● Bigger pages are called “superpages”
● Some also support sub-pages (1KB)
● Still experimental
Recap
● virtual memory
● process as a unit of protection
● two-level page tables
● address translation
● TLB in the address translation
● Paging
● Homework: find what is meant by “thrashing”
