1
Topics
■ Background
■ Swapping
■ Contiguous Memory Allocation
■ Segmentation
■ Paging
■ Structure of the Page Table
■ Example: The Intel 32 and 64-bit Architectures
■ Example: ARM Architecture
2
Objectives
3
Background on Main Memory
The following are some important points we need to know about main memory.
Memory consists of a large array of bytes (or words), each with its own address.
Instructions stored on disk are brought into memory and fetched by the CPU for processing based on the program counter.
The CPU can directly access only main memory and its registers.
Registers are the fastest storage devices; they can be accessed in one CPU clock cycle or less.
Since a memory access can take many cycles while a register access takes a clock cycle or less, the CPU may stall. A stall is a temporary slowing down of processing. A solution to this is to use a cache.
The cache sits between main memory and the CPU registers as a solution to the stalling problem.
Memory protection is required to ensure correct operation. It is a way of controlling memory access rights, to prevent any access by a process to memory that has not been allocated to it.
4
Main Memory or Physical Memory
- RAM
- Internal memory
- Physical memory
Main memory can be referred to as RAM (random access memory), internal memory, or physical memory. In our discussion, we may use any of these terms.
Secondary storage, on the other hand, refers to hard disks or flash drives.
The operating system is the first program loaded into memory. It usually occupies the lower addresses of memory for faster access, since it is nearer to the hardware for interrupts.
The table in the slide is a block diagram of main memory. The numbers at the left side are addresses expressed in decimal. Processes are placed contiguously in main memory. Not all processes or programs stored on the hard disk are allocated space in main memory; only those that are to be executed are placed in memory. The problem is that memory is not that large, while the applications we run on our computers or devices demand memory space for their execution.
So if we have 2 GB of main memory, our applications need millions of bytes of memory space, and we run applications concurrently, how can this be managed by the system?
The mechanisms to resolve this issue, and much more, are tackled in this module.
5
Basic Concept of Memory Hardware
The CPU can access only main memory and the registers. Instructions or data from secondary storage devices have to be brought into memory for the CPU’s execution.
CPU registers are accessible within one cycle of the CPU clock, but a memory access takes many cycles of the CPU clock to complete.
In this case, the CPU has to stall, since it needs the data from memory, and this can cause a problem. A solution is to add a cache, a fast memory between the CPU and main memory. A cache is a memory buffer that accommodates the speed differential between the two devices.
In memory there will be many processes. Processes must be protected such that they cannot be accessed by other processes. This protection can be provided by the hardware.
Processes are placed in legal addresses. These addresses refer to addresses assigned to the user process; a process cannot use the operating system’s address space. To protect the user processes, we need two registers, namely the base and limit registers.
The base register stores the smallest legal physical memory address, while the limit register specifies the size of the range.
In this example, the base register is 300040 and the limit register is 120900. This means that the program can access addresses from 300040 up to (but not including) 420940 (that is, 300040 + 120900).
6
Hardware Address Protection
This flowchart serves as the trap for processes that try to access other processes’ space in memory.
To protect a memory space, the CPU hardware compares every address generated in user mode with the registers. A program executing in user mode that tries to access the operating system’s memory or another user’s memory is trapped by the operating system, and the attempt is treated as a fatal error.
In addressing a process, we need the base and limit registers. These registers are loaded by the operating system. Since only the operating system can operate in kernel mode, it uses special privileged instructions to change the values in the registers. User programs are not allowed to modify the contents of these registers.
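The hardware check described above can be sketched as follows; this is a model of the logic, not real hardware, and the base and limit values are taken from the example on the previous slide (base 300040, limit 120900). The exception class is a hypothetical stand-in for the trap.

```python
# Sketch of the hardware base/limit check, assuming the values from the
# previous slide: base = 300040, limit = 120900.

class AddressingError(Exception):
    """Models the trap raised by the hardware on an illegal access."""

def check_access(address, base=300040, limit=120900):
    """Allow the access only if base <= address < base + limit."""
    if base <= address < base + limit:
        return True  # legal access: proceed to memory
    raise AddressingError(f"trap: illegal access to address {address}")

print(check_access(300040))   # lowest legal address
print(check_access(420939))   # highest legal address (300040 + 120900 - 1)
```

An access to 420940 or beyond would raise the trap, mirroring the fatal-error path of the flowchart.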
7
Address Binding
Address binding is the process of mapping from one address space to another address space.
It makes use of two types of addresses: logical addresses and physical addresses.
A logical address is the address generated by the CPU during the process’s execution.
A physical address refers to a location in the memory unit; it is the address where the process was loaded in memory.
The logical address is translated by the MMU (Memory Management Unit), which generates the physical address of the process in main memory.
8
Address Binding
There are three times at which addresses can be bound: compile time, load time, and execution time.
Compile time: Absolute code is generated if the physical address of the process is known at compile time (the physical address is embedded in the executable of the program during compilation). If the starting location changes at a later time, the code must be recompiled.
Load time: Relocatable code must be generated if the memory location is not known at compile time. The loader translates the relocatable addresses to absolute addresses: to generate an absolute address, the base address of the process is added to the logical address by the loader. In this case, if the base address changes, the process must be reloaded.
Execution time: Binding is delayed until run time if the process can be moved during its execution from one memory segment to another.
- Needs hardware support for address maps (e.g., base and limit registers)
9
Multistep Processing of a User Program
[Figure A: the variable count bound from a symbolic address in the source code, to a relocatable address when compiled (14 bytes from the beginning of the module), to an absolute address. Figure B: flowchart of the steps a user program goes through.]
In this slide, we have two figures. Figure B represents the flowchart of the different steps that a process may go through, and figure A represents an example of a variable until it receives an absolute address.
A user program may go through several steps; consequently, there are different ways of addressing. An address in the source code (such as the variable count) is symbolic, and a compiler binds this symbolic address to a relocatable address (for example, count is placed 14 bytes from the beginning of the module). The linkage editor or loader then binds the relocatable address to an absolute address (in this example, the bound address is 74102).
10
Logical Address Space and Physical Address
Space
■ Logical Address Space - the set of all logical addresses generated by a
program
■ Physical Address Space - the set of all physical addresses that
correspond to these logical addresses
■ For compile-time and load-time address binding – same physical
and logical address space
■ For execution-time binding – different physical and logical address spaces
A logical address is the address generated by the CPU, while a physical address is the address seen by the memory unit, i.e., the address that refers to a location in the memory unit.
The compile-time and load-time binding methods generate identical logical and physical addresses, unlike the execution-time binding scheme, which generates different logical and physical addresses.
A logical address is also referred to as a virtual address.
A logical address space is the set of all logical addresses generated by a program, and a physical address space is the set of all physical addresses that correspond to these logical addresses. The two spaces differ in the execution-time binding scheme.
11
Memory Management Unit (MMU)
12
Relocation Register or Base-Register Scheme
13
Dynamic Loading
Without dynamic loading, an entire program is loaded into memory before execution, so the size of a process is limited to the size of physical memory.
Dynamic loading obtains better memory-space utilization: it loads a routine only when it is called. All routines are kept on disk in a relocatable load format.
When a routine is needed for execution, it is loaded into memory.
When a routine calls another routine, the calling routine checks whether the called routine is loaded; if not, it calls the relocatable linking loader to load the called routine into memory and update the program's address table. Control is then passed to the called routine.
Dynamic loading is implemented by user programs. The operating system only provides the library routines to implement dynamic loading.
14
Static and Dynamic Linking
Static linking is when the system libraries and the program code are combined by the loader into the binary program image.
Example: when we run an executable (.exe) file, everything in the binary file is loaded into the virtual address space of the process.
Other programs also need to run functions from the system libraries, and if those libraries are used, they too will be loaded.
In this case, the libraries are embedded in the executable binary file. The program is statically linked to its libraries, and the linked code is loaded when it is executed.
A disadvantage of static linking is that every program must have its own copy of the system library functions, which takes up more memory space.
Dynamic linking, on the other hand, uses shared libraries when compiling programs, and these shared libraries are dynamically linked to the program. The libraries are linked only at run time.
15
Swapping
16
Swapping
In this figure, process P1 is swapped out of memory while process P2 is swapped in.
Let us use round-robin execution as an example of a swapping scheme. When a process finishes its time quantum, it is swapped out, and another process is swapped into memory.
The swapping concept is also applied in priority-based scheduling algorithms. When a higher-priority process enters the ready queue, the memory manager can swap out a lower-priority process to load it. When the higher-priority process finishes executing, the lower-priority process is swapped back into memory. The state of the lower-priority process is stored in its PCB, which is why it can return to its previous state. This variant of swapping for priority-based scheduling is called roll out, roll in.
The system has a ready queue. When the CPU scheduler schedules a process for execution, it calls the dispatcher. Say a process P1 is to be executed next. The dispatcher checks whether P1 is in memory; if it is not, and there is no free memory region, the dispatcher swaps out a process and swaps in P1.
17
Context Switch Time
18
Swapping on Mobile Systems
■ Not typically supported
– Flash memory based
■ Small amount of space
■ Limited number of write cycles
■ Poor throughput between flash memory and CPU on mobile platform
■ Instead use other methods to free memory
– iOS asks apps to voluntarily relinquish allocated memory
■ Read-only data thrown out and reloaded from flash if needed
■ Failure to free can result in termination
– Android terminates apps if low free memory, but first writes application state
to flash for fast restart
19
Contiguous Memory Allocation
20
Memory Mapping and Protection
[Figure: limit register = 500, relocation register = 100040.]
This hardware support with relocation and limit registers is quite similar to the scheme for hardware address protection.
The relocation register contains the value of the smallest physical address, and the limit register contains the range of logical addresses.
Example: if the relocation register is 100040 and the limit register is 500, then each logical address must be less than 500.
The MMU (Memory Management Unit) maps each logical address dynamically by adding the value of the relocation register to it.
The relocation-register scheme provides a way to change the size of the operating system dynamically, since only code that is currently needed has to be kept in memory.
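The limit check plus relocation add can be sketched with the slide's values (relocation register 100040, limit register 500); a minimal model of the MMU's logic, not of any real hardware.

```python
# Sketch of dynamic relocation by the MMU, assuming the slide's values:
# relocation register = 100040, limit register = 500.

def translate(logical, relocation=100040, limit=500):
    """Each logical address must be less than the limit; the MMU then
    adds the relocation register to form the physical address."""
    if logical >= limit:
        raise ValueError(f"trap: logical address {logical} out of range")
    return logical + relocation

print(translate(0))     # maps to physical address 100040
print(translate(346))   # maps to physical address 100386
```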
21
Multiple Partition Allocation
■ Multiple-partition allocation
– Degree of multiprogramming is limited by the number of partitions
– Variable-partition sizes for efficiency (sized to a given process’ needs)
– Hole – block of available memory; holes of various size are scattered throughout memory
– When a process arrives, it is allocated memory from a hole large enough to accommodate it
– Process exiting frees its partition, adjacent free partitions combined
– Operating system maintains information about:
a) allocated partitions b) free partitions (hole)
[Figure: four memory snapshots (a)–(d) showing processes entering and leaving variable-sized partitions.]
22
Static and Dynamic Memory Allocation
23
Dynamic Storage-Allocation Problem
■ First-fit: Allocate the first hole that is big enough
■ Best-fit: Allocate the smallest hole that is big enough; must search the
entire list, unless it is ordered by size
– Produces the smallest leftover hole
■ Worst-fit: Allocate the largest hole; must also search the entire list
– Produces the largest leftover hole
With dynamic allocation, requesting memory space during runtime can cause a problem: there may be no available space at that instant.
The dynamic storage-allocation problem is concerned with how to satisfy a request from the set of free holes. A free hole refers to a block of free memory space.
Each strategy has its advantages and disadvantages. The common strategies are:
--first-fit
--best-fit
--and worst-fit
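The three strategies above can be sketched as functions that pick a hole index from a list of free hole sizes; the hole sizes used below are the ones from the allocation examples on the following slides.

```python
# Sketch of the three placement strategies over a list of free holes (sizes in KB).

def first_fit(holes, size):
    """Allocate the first hole big enough."""
    for i, h in enumerate(holes):
        if h >= size:
            return i
    return None  # no hole fits: the request must wait

def best_fit(holes, size):
    """Allocate the smallest hole that is big enough."""
    candidates = [(h, i) for i, h in enumerate(holes) if h >= size]
    return min(candidates)[1] if candidates else None

def worst_fit(holes, size):
    """Allocate the largest hole."""
    candidates = [(h, i) for i, h in enumerate(holes) if h >= size]
    return max(candidates)[1] if candidates else None

holes = [60, 90, 200, 130]        # free partitions, as in the later examples
print(first_fit(holes, 80))       # index 1: the 90 KB hole
print(best_fit(holes, 80))        # index 1: 90 KB leaves the smallest leftover
print(worst_fit(holes, 80))       # index 2: the 200 KB hole
```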
24
Fragmentation
Fragmentation refers to the small free spaces left after processes are loaded into and freed from memory. It typically develops as files and processes are created, deleted, and modified.
Memory fragmentation can be external or internal.
External fragmentation occurs when enough total free memory exists to satisfy a request, but the free spaces are not contiguous: as processes are loaded and removed, free memory is broken into small, scattered holes.
Internal fragmentation occurs when the allocated memory is slightly larger than the requested memory; the size difference is memory internal to a partition that is not being used.
For example, if a free block is 800 MB and the process needs only 100 MB, 700 MB is left unused inside the allocation.
The solution for external fragmentation is compaction, that is, to shuffle the memory contents so as to place all free memory together in one block. This is possible only if relocation is dynamic and done at execution time.
25
First-Fit Memory Allocation
■ Search for the first available free space – either starting from the beginning
of the set of holes, or from where the previous first-fit search ended
Example:
The list of free and used memory blocks must always be kept up to date. Processes are placed, or allocated, in the free partitions.
First-fit memory allocation allocates the first available block/space that is big enough for a process/job.
Two implementations:
--if the search for available space always starts at the top/beginning:
----Job A fits the 90 KB hole; then, searching again from the start, Job B can be allocated the 200 KB block. Job C cannot be allocated, so it waits for a block to be freed up. Job D takes the 60 KB hole.
The gray boxes indicate the unused spaces during the allocation. Summing them up, there is 50 KB of unused space.
--if the search starts where the previous first-fit search ended:
----Job A fits the 90 KB hole; then, this time, we search starting at the next free space after our last first-fit allocation. Job B is allocated the 200 KB block. Locating an available space for Job C, since the remaining free space is too small for the job, we start the search again from the top. Still, no block can hold Job C, so it waits for a free space. Lastly, we have Job D. The last first-fit allocation was in the 200 KB block; after allocating 120 KB of it, we are left with 80 KB of unused space, so Job D can use this space. This is an example of internal fragmentation.
26
Best-Fit Memory Allocation
■ allocates the smallest free partition that is close to the actual size of
the job or process
Given the following jobs (in order), how would these jobs fit in, or be allocated to, the four memory partitions?
Jobs and sizes: A 80 KB, B 120 KB, C 180 KB, D 50 KB
Partitions (with unused space after allocation): 60 KB (10 KB), 90 KB (10 KB), 200 KB (20 KB), 130 KB (10 KB)
The next algorithm is best-fit allocation. It allocates the smallest free partition that is close to the actual size of the job or process. This algorithm searches the entire list of free partitions and chooses the smallest hole that is big enough for the job or process.
Let us use the same example, searching the entire list of free locations in memory.
First, Job A (80 KB) fits in the 90 KB location.
Job B fits best in the 130 KB block, Job C in the 200 KB block, and Job D in the 60 KB block.
Now let's check the unused spaces: when we sum them up, we only have 50 KB.
27
Worst-Fit Memory Allocation
■ Search the list for the largest hole
Example:
[Figure: Job A → 200 KB block, Job B → 130 KB block (10 KB unused), Job C waits, Job D → 90 KB block.]
The worst-fit algorithm searches for the largest hole for a process.
For example: Job A is allocated the 200 KB block. Job B is allocated the 130 KB block. Job C has to wait, since no free block is large enough for it. Job D is allocated the 90 KB block. For worst-fit, the system always selects the largest hole for the process/job.
The total unused space for worst-fit is 170 KB.
So, what do we do with the unused space?
28
Internal Fragmentation
Suppose that the following jobs (in the order below) are to be allocated in these
five memory partitions of 200 KB, 100 KB, 500 KB, 300 KB, and 450 KB. Use the best-fit
algorithm.
Jobs and sizes (KB): A 65, B 120, C 280, D 40, E 125, F 30, G 130
Resulting allocation (leftover space in parentheses):
- 200 KB partition: B, then D, then F (leftover 80 → 40 → 10 KB)
- 100 KB partition: A (35 KB left)
- 500 KB partition: unused
- 300 KB partition: C (20 KB left)
- 450 KB partition: E, then G (leftover 325 → 195 KB)
29
Segmentation
30
Segmentation: User’s View of a Program
This is a user’s view of a whole program. The program is divided into segments.
31
Logical View of Segmentation
[Figure: segments 1–4 of a program in logical space, placed noncontiguously in physical memory.]
This is the logical view of the different segments in a program. Each segment has a location in memory, and this is how a segment is accessed in memory: through addresses. We see our variables as words, but the memory sees them as locations.
Each segment has a different address.
32
Segmentation
[Figure: a process segment table (segment number, size, base memory address — e.g., segment 1: size 200, base address 100) and the segments placed in main memory at addresses such as 100, 400, 500, and 600.]
Segmentation divides a job into segments of different sizes. Each segment is part of the logical address space of the program.
When a process is executed, the different segments are placed in different available blocks in memory.
The operating system maintains a segment map table for every process. The segment table contains the starting address and the size of each segment; each segment has its own memory address.
To refer to main memory, an address uses the segment number and an offset value.
33
Paging
34
Paging: Address Translation Scheme
An address generated by the CPU is divided into a page number and a page offset.
The page number (p) is used as an index into a page table, which contains the base
address of each page in physical memory.
The page offset (d) is combined with the base address to define the physical
memory address that is sent to the memory unit.
35
Paging Hardware
36
Paging Model of Logical and Physical Memory
The figure contains a logical memory, a page table, and a physical memory. The page size is defined by the hardware; it is a power of 2, typically from 512 bytes to 16 MB per page.
The logical memory contains 4 pages, from page 0 to page 3.
The page table stores, for each page number, the corresponding frame number.
For example, page 2 in logical memory is placed in frame 3 of physical memory; in the page table, the entry at index 2 holds the value 3, which maps page 2 to frame 3.
37
Paging Example
The size of the page table is 2^n (where n = 2), and the size of the logical memory is 2^m (where m = 4). The page size is 4 bytes, and the physical memory is 32 bytes (8 frames).
Ex: In the logical memory,
“a”, or logical address 0, is located in page 0, offset 0.
“b”, or logical address 1, is located in page 0, offset 1.
What is the physical memory address of “b”, or logical address 1?
Physical address = (5 × 4) + 1 = 21
When a process is to be executed, the system has to know its size. Each page of the process needs one frame: a process needing 3 pages needs 3 frames. When a process arrives, each page is loaded into one of the free frames, and the page number and frame number are placed in the page table.
The size of the page table is 2^n (where n = 2), and the size of the logical memory is 2^m (where m = 4). The page size is 4 bytes, and the physical memory is 32 bytes (8 frames).
For example, “a” is located in page 0, offset 0: logical address 0 is page 0, offset 0. In the page table, we see that page 0 is mapped to frame 5 in physical memory, which gives the
physical address = (5 × 4) + 0, where 5 is the frame number, 4 is the page size, and 0 is the offset.
Let's have another example:
Logical address 9 is in page 2, offset 1. In the page table, page 2 is mapped to frame 1, so the physical address = (1 × 4) + 1 = 5.
And for logical address 14: it is in page 3, offset 2. In the page table, page 3 is mapped to frame 2, so the physical address = (2 × 4) + 2 = 10.
In this structure, we can say that paging is a form of dynamic relocation.
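The worked examples above can be sketched in a few lines. The mappings for pages 0, 2, and 3 come from the text; the frame for page 1 (frame 6) is an assumption added only to complete the table.

```python
# Sketch of the slide's paging example: page size 4 bytes, 8 frames of a
# 32-byte physical memory. Page 1 -> frame 6 is an assumed entry.

PAGE_SIZE = 4
page_table = {0: 5, 1: 6, 2: 1, 3: 2}   # page number -> frame number

def translate(logical):
    """physical = (frame number * page size) + offset."""
    page, offset = divmod(logical, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

print(translate(1))    # "b": page 0 -> frame 5, (5 * 4) + 1 = 21
print(translate(9))    # page 2 -> frame 1, (1 * 4) + 1 = 5
print(translate(14))   # page 3 -> frame 2, (2 * 4) + 2 = 10
```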
38
Free Frames
39
Page Table - Implementation
■ PTBR (Page-Table Base Register) –
points to the page table
■ PTLR (Page Table Length Register) –
specifies the size of the page table
■ TLB (Translation Look-aside Buffers) -
memory cache used to lessen the time
to access a user memory location
– Also called Associative Memory
■ Two Memory Accesses for data /
instructions
– Page table
– Data / instruction
The page table is placed in main memory. Two registers describe it:
PTBR, or page-table base register, points to the page table.
PTLR, or page-table length register, indicates the size of the page table.
With this implementation, every data/instruction access requires two memory accesses:
-- one to access the page table
-- one to access the data/instruction
The TLB is a memory cache used to lessen the time to access a user memory location. Each TLB entry can carry an identifier that uniquely ties it to a process, providing address-space protection. The TLB usually has 64 to 1,024 entries.
The TLB comparison is a parallel search. The logical address contains p and d (for its page translation). If the page number p is in the TLB, the system retrieves the frame number immediately (in the diagram, refer to “TLB hit”); otherwise, the frame number is retrieved from the page table in memory (refer to “TLB miss”).
40
Internal Fragmentation on Paging
Example:
■ Page size = 2,048 bytes (2^n, n = 11)
■ Size of process = 58,800 bytes
■ No. of pages = 28 full pages, with 1,456 excess bytes
It was mentioned earlier that when a process is executed, its size has to be determined, because the process will be placed in pages. If the size of the process is 58,800 bytes and the page size is 2,048 bytes, then the process needs 28 full pages plus 1,456 excess bytes. The remaining bytes are placed in a 29th page, which is left with 592 bytes of free space.
If the remainder placed in the last page is just a few bytes (like 2 bytes), then nearly a whole frame is lost to internal fragmentation.
Another disadvantage of paging is that maintaining a page table consumes additional memory.
41
Page Number and Offset
■ Example: The following are logical address references (in decimal), what
are their page numbers and offsets? Assume that the page size = 1KB
a) 2314
b) 9467
c) 900000
d) 55000
To find the page number and offset, divide the logical address by the exact size of 1 KB (1,024 bytes): the quotient is the page number, and the remainder is the offset.
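Applying that division to the four addresses in the exercise:

```python
# Page number and offset for the exercise's addresses, with page size 1 KB.

PAGE_SIZE = 1024
for addr in (2314, 9467, 900000, 55000):
    page, offset = divmod(addr, PAGE_SIZE)
    print(addr, page, offset)
# 2314   -> page 2,   offset 266
# 9467   -> page 9,   offset 251
# 900000 -> page 878, offset 928
# 55000  -> page 53,  offset 728
```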
42
Example: Consider a logical address space of 64 pages of 1024 words each,
mapped onto a physical memory of 32 frames. How many bits are in the
logical address and physical address?
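The exercise above can be worked out directly: the logical address needs bits for 64 pages plus bits for 1,024 words per page, and the physical address needs bits for 32 frames plus the same per-page bits.

```python
# 64 pages of 1,024 words each, mapped onto a physical memory of 32 frames.
import math

pages, frames, words_per_page = 64, 32, 1024

logical_bits = int(math.log2(pages)) + int(math.log2(words_per_page))
physical_bits = int(math.log2(frames)) + int(math.log2(words_per_page))

print(logical_bits)    # 6 + 10 = 16 bits
print(physical_bits)   # 5 + 10 = 15 bits
```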
43
Protection
Protection in a paged environment makes use of protection bits, which are associated with each frame.
The bits are kept in the page table; a page may be marked read-write or read-only. Let's have an example of this.
44
Page: Protection (Valid or Invalid)
Example:
A 14-bit address space (2^14 = 16,384; addresses 0–16,383)
Page size = 2 KB, so there are 8 pages (16,384 / 2,048)
A process uses addresses 0–10,468
The process occupies 6 pages (10,468 / 2,048):
pages 0–4 are full, and page 5 holds the remaining 228 bytes (10,468 mod 2,048)
In paging, memory is protected by adding bits to each entry in the page table. A bit can mark a page read-write or read-only.
Every access by a process is referenced through the page table. With the additional bits, the system identifies whether a process is allowed to write to a frame.
A valid/invalid bit is also added to the page table. A valid bit means that the page is in the logical address space of the process and is considered a legal page. An invalid bit means that the page is not in the process's logical address space. The operating system sets a page to “invalid” if a process is not allowed to access the page.
Example: (refer to the slide example)
A 14-bit address space (2^14 = 16,384; addresses 0–16,383)
Page size = 2 KB
The system has 8 pages.
A sample process uses addresses 0–10,468.
It therefore occupies 6 pages, with the last page containing only 228 bytes:
pages 0–4 hold 2,048 bytes each (the page size), and
page 5 holds 228 bytes (the excess).
If there is an attempt to access page 6 or 7, an error is generated.
It is also possible to add another bit to the page table to indicate that a page is execute-only.
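The valid/invalid check for this example can be sketched as follows; a model of the logic only, with the trap represented by a hypothetical exception.

```python
# Sketch of the valid/invalid bit for the slide's example: the process uses
# pages 0-5 of the 8 pages in the 14-bit address space.

valid = {p: p <= 5 for p in range(8)}   # pages 6 and 7 are invalid

def access(page):
    """Allow the access only if the page's valid bit is set."""
    if not valid[page]:
        raise ValueError(f"trap: invalid access to page {page}")
    return True

print(access(5))   # legal page: access proceeds
# access(6) would trap: the page is outside the process's logical address space
```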
45
Shared Pages
An advantage of paging is the ability to share pages. This is used in time-sharing environments.
Sharing can be implemented using reentrant code.
Reentrant code is a program written so that several users can share the same copy of it in memory.
It does not change during execution, so each user can execute the same code at the same time.
Each process still has its own registers and its own data storage.
46
Shared Pages
Example:
A text editor that
shares it code to 40
users.
Text editor = 150KB
code
Data space = 50KB
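The saving from sharing in this example is simple arithmetic: without sharing, each of the 40 users needs a private copy of both code and data; with sharing, only one copy of the code is kept.

```python
# The slide's editor example: 150 KB of shared code, 50 KB of private data, 40 users.

users, code_kb, data_kb = 40, 150, 50

without_sharing = users * (code_kb + data_kb)   # every user has a private copy
with_sharing = code_kb + users * data_kb        # one shared copy of the code

print(without_sharing)   # 8000 KB total
print(with_sharing)      # 2150 KB total
```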
47
Page Table
■ Hierarchical Paging
■ Hashed Page Table
■ Inverted Page Table
48
Hierarchical Paging
Hierarchical paging is also referred to as multilevel paging. The entries of the first-level table are pointers to the level-2 page tables, the entries of level 2 are pointers to level-3 page tables, and so on.
All page tables are stored in memory.
49
Two Level Paging
Example:
- A logical address of 32 bits (on a machine with a 4 KB page size, 2^12)
- Page number -> 20 bits, split into:
- a 10-bit outer page number (p1)
- a 10-bit inner page number (p2)
- Page offset (d) -> 12 bits
Address layout: | p1 (10 bits) | p2 (10 bits) | offset (12 bits) |
50
Hashed Table Paging
Hashed page tables are an approach for address spaces larger than 32 bits. The virtual page number is hashed into the table. Each entry in the hash table is linked to a chain of elements that hash to the same location; each element holds a virtual page number and the frame it maps to.
51
Hashed Table Paging
52
Inverted Page Tables
An inverted page table is a global page table maintained by the operating system for all processes. The number of entries equals the number of frames in memory.
53
Inverted Page Tables
Where:
P=page number
d = offset
Pid = Process ID
The inverted page table consists of one page-table entry for every frame of main memory.
In the diagram, the logical address contains the process ID, the page number, and the offset.
The page number specifies the page within the logical address space.
The process ID is included in the table because two processes can have the same set of virtual addresses; the PID serves as an address-space identifier used to find the corresponding frame address.
The table can also carry additional control bits, such as a valid/invalid bit.
How the inverted page table works: the CPU generates a virtual address. When a logical address references memory, the memory-management unit searches the table for an entry matching the <pid, page number> pair. If a match is found at entry i, then i is the frame number used to form the real address. If there is no match, the system raises a segmentation fault (an illegal address access).
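The search described above can be sketched as a linear scan of the table, where the index of the matching entry is the frame number; the table contents and the 4 KB page size are illustrative assumptions.

```python
# Sketch of an inverted page table: one entry per frame, each holding
# (pid, page number). The table contents here are illustrative only.

inverted = [(1, 0), (2, 3), (1, 2), (2, 0)]   # entry i describes frame i

def translate(pid, page, offset, page_size=4096):
    """Search for a matching <pid, page>; the entry's index is the frame number."""
    for frame, entry in enumerate(inverted):
        if entry == (pid, page):
            return frame * page_size + offset
    raise ValueError("segmentation fault: no matching entry")

print(translate(1, 2, 100))   # matches entry 2 -> (2 * 4096) + 100 = 8292
```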
54
Examples of Architecture (Memory)
■ Intel 32 Architecture
– Has two Implementations
■ segmentation
■ segmentation with paging
– Size of each segment: up to 4GB
– 2 partitions of segments:
■ Partition 1: up to 8K segments – private to the process (local)
■ Partition 2: up to 8K segments (second group) – shared among
processes (global)
55
Example
The CPU generates the logical address, which is passed to the segmentation unit. The segmentation unit generates a linear address, which is given to the paging unit. The paging unit generates the physical address in memory. Intel 32 page sizes can be 4KB or 4MB.
The 32-bit linear address is divided into a two-level page number and a page offset (10 + 10 + 12 bits).
56
ARM (Advanced RISC Machine) Architecture
ARM means Advanced RISC Machine, and RISC means Reduced Instruction Set Computer. ARM processors are usually small in size and have low power consumption. Using RISC means they have reduced complexity.
57
ARM (Advanced RISC Machine) Architecture
[Figure: ARM 32-bit address translation, with 4-KB or 16-KB pages and 1-MB or 16-MB sections.]
The ARM implements the translation tables. The MMU translates the Virtual
Addresses of the code and data to the Physical Addresses in the actual system. The
translated address is sent to the hardware. The ARM MMU controls the memory
access permissions, memory ordering, and cache policies for all divisions in the main
memory.
58
End of Presentation
59
References:
■ https://developer.arm.com/docs/den0024/a/the-memory-
management-unit
■ Silberschatz, Galvin, and Gagne. Operating System Concepts (9th
Edition)
60