
CS330: Operating Systems

Lecture 9
Virtual Memory: Address Spaces
An executable file must be loaded into addressable memory before the CPU can fetch its instructions.
A typical executable file contains code and statically allocated data, i.e., global and static
variables.
Is loading the program (code and data) sufficient for program execution?
No. We also need memory for the stack and for dynamic allocation.

The address space abstraction is associated with four elements: code, data (static), heap and stack.
Address space information is kept in the PCB.
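// address_space.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int gvar = 100;

int main(void) {
    /* Print the virtual addresses of a function and of a global variable. */
    printf("%p %p\n", (void *)&main, (void *)&gvar);

    /* Keep the process alive so that a second instance can run in parallel. */
    while (1)
        sleep(1);
}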

Compiled using gcc -no-pie address_space.c (-no-pie disables generation of a position-independent
executable, so code and data are placed at fixed virtual addresses on every run)
Output:

0x400604 0x411038

The output remains the same even if the same executable is run in parallel. This implies that both
processes use the same virtual addresses. Since the private data of the two processes cannot occupy
the same physical locations, virtual addresses must be translated to physical addresses, and the
translation in effect must change whenever one process is scheduled out and another starts or resumes.
Stack: supports function call and return, and stores local (stack) variables.
Heap: dynamic memory allocation using APIs like malloc().

Figure 1: Address space abstraction

Address Space

An address space represents the memory space of a process.

The address space layout is the same for all processes (for convenience).
The exact layout can be decided by the OS; the conventional layout is shown in the figure.

Some obvious questions:

If all processes have the same address space, how do they map to actual memory?
→ The OS uses architecture support to perform memory virtualization, i.e., to translate
virtual addresses to physical addresses.

What are the responsibilities of the OS during program load? How is the CPU register state changed?
→ Creating the address space, loading the binary, and updating the register state in the PCB.

What is the OS role in dynamic memory allocation?
→ Maintain the address space and enforce access permissions.

OS during program load (exec). The PCB needs to store the state of certain registers and the
memory state. The memory state includes metadata about the virtual address space.

- A fresh address space is initialized.

- In reality, the parent address space copied at the time of fork() is reset and re-initialized (the
assumption is that the child process was created by a fork from the parent and has then called
exec).

- PC and SP are set to addresses in the code and the stack, respectively.

- Physical memory for code and data is allocated, and the executable code is loaded.

- Translation information is updated when the process is ready to execute.

- The process executes when the register state in the PCB is loaded onto the CPU.

User API for memory management


The user can manipulate memory (allocate, deallocate, expand) using system calls.
The user API can (de)allocate heap memory with different access permissions.
The OS changes the memory state according to the user request.
The user has no direct control over physical memory.

Lecture 10
The process address space abstraction is such that all the processes see the memory in a similar
manner. This address space is virtual and OS enables this virtual view.

Library functions like malloc(), calloc() and free() use system calls like brk and mmap for memory
management. This memory management operates on virtual memory.
Requirement of OS: Not all virtual memory is mapped (translated) to physical memory. The
OS decides which memory to map and which to leave unmapped.
On-demand allocation: physical memory is allocated when a virtual address is accessed, provided it
is a legitimate virtual address. This is achieved using faults/traps.
Can the size of segments change at runtime?
The code segment size and the initialized data segment size are fixed (at executable load).
The end of the uninitialized data segment (also called BSS) can be adjusted at runtime.
Heap allocation can be discontiguous; special system calls like mmap() provide this facility.
The stack grows and shrinks automatically based on run-time requirements, with no explicit
system calls.

Sliding the BSS


int brk(void *address);

If possible, set the end of the uninitialized data segment to address. Can be used by the C library
to allocate/free memory dynamically.

void *sbrk(long size);

Increments the program's data space by size bytes and returns the previous end of the BSS.
sbrk(0) returns the current end of the BSS.
Finding the segments:

etext, edata and end mark the end of the text segment, the initialized data segment and the BSS,
respectively, at the time of program load.
sbrk(0) is used to find the current end of the data segment.
Printing the addresses of functions and variables shows where they are placed.
Linux provides the information in /proc/<pid>/maps.

// mem_end.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

/* Symbols defined by the linker, see man 3 end. */
extern char etext, edata, end;

int arr[16];    /* uninitialized global, placed in the BSS */

int main(void) {
    printf("End of text %p\n", (void *)&etext);
    printf("End of initialized data %p\n", (void *)&edata);
    printf("End of uninitialized data %p\n", (void *)&end);
    return 0;
}

Output:

End of text 0x400628
End of initialized data 0x411030
End of uninitialized data 0x411078

// brk.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char etext, edata, end; /* see man 3 end */

int main(void)
{
    printf("%s: End of text %p\n", __FILE__, (void *)&etext);
    printf("%s: End of initialized data %p\n", __FILE__, (void *)&edata);
    printf("%s: End of uninitialized data (at load time) %p\n", __FILE__, (void *)&end);
    /* sbrk(0) is evaluated only after the earlier printf calls have run. */
    printf("%s: End of uninitialized data (now) = %p\n", __FILE__, sbrk(0));
    return 0;
}

Output:

brk.c: End of text 0x4006a0
brk.c: End of initialized data 0x411038
brk.c: End of uninitialized data (at load time) 0x411040
brk.c: End of uninitialized data (now) = 0x433000

On checking via strace we have something like:

brk(NULL) = 0x412000
brk(0x433000) = 0x433000

These calls appear before the first write call, which means that something called brk before any
output was written. The caller is printf(). This is confirmed when the following changes are made:
void *ptr = sbrk(0);
printf("%s: End of text %p\n", __FILE__, &etext);
printf("%s: End of initialized data %p\n", __FILE__, &edata);
printf("%s: End of uninitialized data (at load time) %p\n", __FILE__, &end);
printf("%s: End of uninitialized data (at main) = %p\n", __FILE__, ptr);

Output:

brk.c: End of text 0x4006a4
brk.c: End of initialized data 0x411038
brk.c: End of uninitialized data (at load time) 0x411040
brk.c: End of uninitialized data (at main) = 0x412000

This is what we expect (assuming a page size of 4 KB): the value at the start of main is the
load-time end, 0x411040, rounded up to the next page boundary. printf needs memory for its output
buffer, so the first printf call extends the data segment.

Lecture 11
We have seen thus far how to inspect the segment layout at program load and at runtime, using
predefined variables, sbrk, the proc file system in Linux, etc.

How to allocate memory chunks with different permissions?


Discontinuous allocation

ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);

mmap is a powerful, multi-purpose system call for dynamic and discontiguous allocation
(explicit OS support).
It allocates address space with different permissions (READ/WRITE/EXECUTE), optionally at a
particular address provided by the user.

To understand the segment division of the address space, consider:


// mem_segments.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char etext, edata, end;

int global_x = 5;    /* initialized data */
char gchar[32];      /* uninitialized data (BSS) */

int main(void)
{
    int x = 5;       /* stack */
    void *p = &x;

    printf("==========Text=========\n");
    printf("Address of main = %p\n", (void *)&main);
    printf("End of text %p\n", (void *)&etext);

    printf("==========Data(initialized)=========\n");
    printf("Address of global_x = %p\n", (void *)&global_x);
    printf("End of initialized data %p\n", (void *)&edata);

    printf("==========Data(uninitialized)=========\n");
    printf("Address of gchar= %p\n", (void *)gchar);
    printf("End of uninitialized data %p\n", (void *)&end);

    printf("==========Stack=========\n");
    printf("Address of x = %p\n", (void *)&x);
}

Output:

==========Text=========
Address of main = 0x4006c4
End of text 0x4007ec
==========Data(initialized)=========
Address of global_x = 0x411040
End of initialized data 0x411044
==========Data(uninitialized)=========
Address of gchar= 0x411050
End of uninitialized data 0x411070
==========Stack=========
Address of x = 0xffffffffefc4

Since a decent amount of stack is already allocated, pointers and addresses can be manipulated
freely so long as they stay within the bounds of the stack. For example:
p -= 0x10000;
*(int *)p = 5;
printf("%d\n", *(int *)p);

This gives no compilation or runtime error, because we were still within the bounds of the stack.

Now consider the following changes:


p = malloc(1024);
printf("Address of p: %p\n", p);

In the strace output, no brk call appears after the write syscalls. This is because the space
obtained earlier, when printf extended the end of the uninitialised data segment, is left for
malloc to use. However, if a request grows beyond that space, a further brk call is needed.

A bit on mmap:
// mmap.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    char *ptr;

    /* Anonymous mappings have no backing file, so fd is passed as -1. */
    ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
    if (ptr == MAP_FAILED) {
        perror("mmap");
        exit(-1);
    }

    printf("ptr = %p\n", (void *)ptr);

    strcpy(ptr, "hello cs330!");    /* requires PROT_WRITE */

    munmap((void *)ptr, 4096);
    return 0;
}

Had the permissions been only PROT_READ, the strcpy would have caused a segmentation fault.
The first argument of mmap is a hint to the OS about where to allocate this memory. The OS may or
may not place the mapping at this location. To force the placement, we can use flags like MAP_FIXED.

What is the structure of PCB memory state?

The memory state can be maintained as a sorted circular list of areas accessible from the PCB,
ensuring that the START and END ranges of different areas do not overlap. Adjacent areas can be
merged or extended if their permissions match.

Address space through fork() and exec()

[Figure: address space through fork() and through exec()]



fork

Child inherits the memory state of the parent.


The memory state data structure is copied into the child PCB.
Further changes made using mmap() or brk() are per-process.
// brk_fork.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

extern char etext, edata, end; /* see man 3 end */

int g_arr[1024];

int main(void)
{
    pid_t pid;

    printf("Current heap start = %p\n", sbrk(0));
    if (sbrk(4096 * 1024) == (void *)-1) {
        printf("sbrk failed\n");
    }
    printf("Heap start after expand = %p\n", sbrk(0));

    pid = fork();
    if (!pid) { /* child: starts with a copy of the parent's memory state */
        printf("Child: Current heap start = %p\n", sbrk(0));
        if (sbrk(4096 * 1024) == (void *)-1)
            printf("sbrk failed\n");
        printf("Child: heap start after expand = %p\n", sbrk(0));
        exit(0);
    }
    wait(NULL);
    /* The child's expansion is not visible in the parent. */
    printf("Heap start after child = %p\n", sbrk(0));
}

Output:
Current heap start = 0x34036000
Heap start after expand = 0x34457000
Child: Current heap start = 0x34457000
Child: heap start after expand = 0x34857000
Heap start after child = 0x34457000

exec

The address space is reinitialized using the new executable. Changes in the newly created address
space depend on the logic of the new process.
// brk_exec.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

extern char etext, edata, end; /* see man 3 end */

int g_arr[1024];

int main(void)
{
    pid_t pid;

    printf("End of text %p\n", (void *)&etext);
    printf("End of initialized data %p\n", (void *)&edata);
    printf("End of uninitialized data %p\n", (void *)&end);
    printf("Current heap start = %p\n", sbrk(0));
    if (sbrk(4096 * 1024) == (void *)-1) {
        printf("sbrk failed\n");
    }
    printf("Heap start after expand = %p\n", sbrk(0));

    pid = fork();
    if (!pid) { /* child: replace the address space with ./brk */
        execl("./brk", "brk", NULL);
        perror("execl"); /* reached only if execl fails */
        exit(1);
    } else {
        wait(NULL);
    }
}

Output:

End of text 0x400878
End of initialized data 0x411060
End of uninitialized data 0x412068
Current heap start = 0x2f7c4000
Heap start after expand = 0x2fbc4000
brk.c: End of text 0x400720
brk.c: End of initialized data 0x411040
brk.c: End of uninitialized data (at load time) 0x411048
brk.c: End of uninitialized data (at main) = 0x12943000
brk.c: End of uninitialized data after expand = 0x12974000
