
Operating System

Memory management

(Basics)

Goals of an OS
● Maximize memory utilization
● Improve protection between applications
● Maximize CPU utilization
● Minimize response time
● Prioritize “important” processes
● Note: Conflicting goals/tradeoffs
● E.g. running many processes increases CPU
utilization but can reduce system response
time.
Process
● Unit of protection
● One or more threads of execution
● Resources required for execution
● Memory (RAM)
● Program code (“text”)
● Data (initialized, uninitialized, stack)
● Buffers held in the kernel on behalf of the process
Example:
protection
● Try the code in LMS
● You see different values for the same
address
● PS: fork creates a child process identical to
the parent calling it
Difference between process
and thread
● Process → unit of protection
○ Anything to do with protecting: what memory you
can access, what files you can access etc.
● Thread → unit of execution
○ Anything to do with execution: stacks, local
variables
● Threads are sometimes called lightweight
processes
○ for historical reasons: originally there was
one thread per process
Basic idea of memory
management
● (ignoring any protection)
● Keeps track of what memory is in use and
what memory is free
● Allocates free memory to process when
needed
● Deallocates unwanted memory from a
process
● Manages the transfer of memory between
RAM and disk.
Memory hierarchy
● Ideally we want memory that is
● Fast
● Large
● Possible to an extent with the
memory hierarchy
● Locality of reference
● OS coordinates how memory
hierarchy is used.
● Focus usually on RAM ⇄ Disk
● RAM to cache → mainly by
hardware as directed by the OS
Locality of reference

● Locality of reference: the same value or a
related storage location is accessed
frequently.
● Two (main) types:
● Temporal locality: if a location is referenced it will
be referenced again in the near future (time)
● Spatial locality: if a location in memory is
referenced then an adjacent location will be
referenced in the near future. (space)
Memory hierarchy – again
Capacity    Level                  Access latency
< 1 KB      Registers              1 ns
1 MB        Cache (SRAM)           2–5 ns
256 MB      Main memory (DRAM)     50 ns
500 GB      Magnetic disk          5 ms
> 1 TB      Magnetic tape          50 sec
(the higher in the hierarchy, the better: faster but smaller)

● When searching for something, start from the top of the
hierarchy and keep moving down
● When a location is accessed, move it and adjacent ones up
the hierarchy and keep them there
● So the next access will be from the faster memory
Memory layout of a process

● Typical layout of a process
● At the very top you have the OS code
● Followed by the stack of the thread
● Followed by the heap (from where we
dynamically allocate memory)
● Followed by the global variables
● Finally you have the code
● Try the mem.c code from LMS
● (actual addresses will vary, but the layout will be
the same)
(figure: OS | stack (local vars) | heap (dynamically allocated) | global variables | code)
mem.c
● j is allocated from the
stack
● i is a global variable
● k is from the heap
Concept of virtual memory
(figure: virtual address space → page table structure → RAM (physical addresses))

● Addresses generated by the code are virtual
addresses
● These addresses are translated using page
tables to get the RAM address
Page table basics
● Depends on the hardware
● There are two types of hardware:
○ Hardware-defined page tables: page table
structure is defined by HW (e.g. ARM, x86)
○ Software-defined page tables: page table
structure is left to the OS designers (e.g. MIPS)
● Types of page tables: two-level page tables,
hash tables, …
● 64-bit architectures have more levels
Two-level page table
Memory management
● Two broad classes of memory management systems
● Those that transfer processes to and from disk
during execution
● Called swapping or paging
● Linux swap: find out what it is doing
● Those that do not
● Simple
● Sometimes the only option → e.g. embedded devices with no disk
Notes on two-level page tables
● Virtual memory and physical memory are broken down into equal-size
chunks called pages and frames respectively.
● Typical page size is 4KB
● You need 12 bits to cover 4KB of memory. Called the offset.
● On a 32-bit machine you are left with 20 bits that need to be translated
● Those 20 bits are called the page number
● The page number is broken into 2 parts (for the two levels), 10 bits each
● The first 10 bits are used to index into the page directory (or 1st-level page
table)
● The page directory gives you a pointer to the page table (or 2nd-level
page table)
● You index the page table using the remaining 10 bits
● The page table gives the top 20 bits of the frame (called the frame number)
● The frame number is concatenated with the offset to get the physical address
Back to the process
(figure: two virtual address spaces, each with its own page table, mapping into RAM)

● The two processes have two different page
tables
● So, even for the same virtual address we get
different physical addresses and therefore
different values
Speed of memory access
● You need 2 memory accesses to translate an
address
● Plus one more to access the actual memory
● Too slow: use Translation Lookaside Buffers
(TLB) to cache translations
(figure: virtual address space → TLB → page table structure → RAM (physical addresses))

TLB
● Acts as a cache of page table lookups (temporal
and spatial locality)
● When you want to translate a page number to a frame
number, first check the TLB
● If you find it in the TLB it is a TLB hit and translation stops
there
● If you cannot find the translation in the TLB (TLB
miss), walk the page table (look it up in the page table)
● If you find it in the page table, refill the TLB
● If you cannot find a translation in the page table, it is called a
page fault
● Page faults are notified to the OS
Exercise
The system has an 80% TLB hit rate and a
two-level page table structure. How many bytes
do you need to access to read 4KB of memory?
Paging
● Also known as demand paging or swapping
● Load content based on need
● Keep the pages unmapped in the page table
○ Get the page fault
○ Find a free frame
○ Load content from disk
○ Set up a mapping in the page table
(figure: pages loaded from disk into RAM frames on demand)
Page sizes
● What should be the best page size?
● Things to remember
○ We map/unmap/share in units of pages
○ If the TLB can cover lots of memory that would
be fast
○ It is difficult to build large TLBs
Changing page size
● Increasing page size
○ Better TLB coverage
○ Increased internal fragmentation
○ Increased page fault latency
● Decreasing page size
○ Reduced TLB coverage
○ Reduced internal fragmentation
○ Reduced page fault latency
Note on fragmentation
● Fragmentation = unused memory
○ Internal fragmentation: you have allocated more
than what is required
○ External fragmentation: you have memory that
cannot be allocated since it is not
contiguous
○ (external fragmentation does not happen if the page
size is fixed)
Current CPUs
● They support different page sizes
● Typically there is a base size (4KB)
● They can map bigger pages which are a multiple
of the base size (1MB, 4MB, …)
● Bigger pages are called “superpages”
● Some also support sub-pages (1KB)
● Still experimental
Recap
● virtual memory
● process as a unit of protection
● two-level page tables
● address translation
● TLB in the address translation
● Paging
● Homework: find what is meant by “thrashing”
