
Memory Allocation

Alan L. Cox
alc@rice.edu
Objectives

Be able to describe the differences between static and dynamic memory allocation

Be able to use malloc() and free() to manage dynamically allocated memory in your programs

Be able to analyze programs for memory management related bugs

Cox Memory Allocation 2


Big Picture

C gives you access to underlying data representations & layout
 Needed for systems programming
 Potentially dangerous for application programming
 Necessary to understand
Memory is a finite sequence of fixed-size
storage cells
 Most machines view storage cells as bytes
• “byte-addresses”
• Individual bits are not addressable
 May also view storage cells as words



Structure Representation & Size
sizeof(struct …) = sum of sizeof(field) + alignment padding
Processor- and compiler-specific

struct CharCharInt {
    char c1;
    char c2;
    int i;
} foo;

foo.c1 = 'a';
foo.c2 = 'b';
foo.i = 0xDEADBEEF;

c1   c2   padding   i
61   62   ?? ??     EF BE AD DE
x86 uses “little-endian” representation

Cox Structures and Unions 4


Pointer Arithmetic
pointer + number        pointer - number

E.g., pointer + 1 adds 1 something to a pointer

char *p;                int *p;
char a;                 int a;
char b;                 int b;

p = &a;                 p = &a;
p += 1;                 p += 1;

Adds 1*sizeof(char) to  Adds 1*sizeof(int) to
the memory address      the memory address

In each, p now points to b
(Assuming compiler doesn’t reorder variables in memory)

Pointer arithmetic should be used cautiously

Cox Arrays and Pointers 5


Can you tell if there is padding?

Yes! Print the addresses of foo.c2 and foo.i

struct CharCharInt {
    char c1;
    char c2;
    int i;
} foo;

…(code from before)…

printf("&foo.c2 = %p\n", (void *)&foo.c2);
printf("&foo.i = %p\n", (void *)&foo.i);

c1   c2   padding   i
61   62   ?? ??     EF BE AD DE



Can you access the padding?

Yes!

struct CharCharInt {
    char c1;
    char c2;
    int i;
} foo;

…(code from before)…

char *cp = &foo.c2;
cp += 1;
*cp = 0x7F;

c1   c2   padding   i
61   62   7F ??     EF BE AD DE
x86 uses “little-endian” representation



Can you access both bytes?

Yes!

struct CharCharInt {
    char c1;
    char c2;
    int i;
} foo;

…(code from before)…

short *sp = (short *)(&foo.c2 + 1);
*sp = 0x7FFF;

c1   c2   padding   i
61   62   FF 7F     EF BE AD DE
x86 uses “little-endian” representation



A Running Program’s Memory
47 bits of address space

0x7FFFFFFFFFFF
Unused
User Stack                  Created at runtime
Shared Libraries            Shared among processes
Heap                        Created at runtime
Read/Write Data             Loaded from the executable
Read-only Code and Data     Loaded from the executable
Unused
0x000000000000
Allocation

For all data, memory must be allocated


 Allocated = memory space reserved

Two questions:
 When do we know the size to allocate?
 When do we allocate?
Two possible answers for each:
 Compile-time (static)
 Run-time (dynamic)



How much memory to allocate?
Sometimes obvious:

char c;             One byte
int array[10];      10 * sizeof(int) (= 40 on CLEAR)

Sometimes not:

char *c;            Is this going to point to one character or a string?
int *array;         How big will this array be?

 How will these be used???
• Will they point to already allocated memory (what we’ve seen so far)?
• Will new memory need to be allocated (we haven’t seen this yet)?



malloc()

#include <stdlib.h>         (Won’t continually remind you of this)

int *array = malloc(num_items * sizeof(int));

Allocate memory dynamically
 Pass a size (number of bytes to allocate)
• Finds unused memory that is large enough to hold the specified number of bytes and reserves it
 Returns a void * that points to the allocated memory
• No typecast is needed for pointer assignment
 Roughly analogous to new in Java (and C++), except that malloc() neither initializes the memory nor runs a constructor



Using malloc()

int *i;             Statically or dynamically allocates
int *array;         space for 2 pointers

i = malloc(sizeof(int));                    Dynamically allocates
array = malloc(num_items * sizeof(int));    space for data

*i = 3;
array[3] = 5;

i and array are interchangeable
 Arrays ≈ pointers to the initial (0th) array element
 i could point to an array, as well
 May change over the course of the program
Allocated memory is not initialized!
 calloc zeroes allocated memory (otherwise, same as malloc; details to come in lab)



Using malloc()
Always check the return value of library calls like malloc() for errors

int *a = malloc(num_items * sizeof(int));
if (a == NULL) {
    fprintf(stderr, "Out of memory.\n");
    exit(1);            /* Terminate now! And, indicate error. */
}

 For brevity, won’t in class
• Lab examples and provided code for assignments will
 Textbook uses capitalization convention
• Capitalized versions of functions are wrappers that check for errors and exit if they occur (e.g., Malloc)
• May not be appropriate to always exit on a malloc error, though, as you may be able to recover memory



When to Allocate?

Static time
 Typically global variables:

int value;
int main(void)
{
    …
}

 Only one copy ever exists, so can allocate at compile-time

Dynamic time
 Typically local variables:

int f(…)
{
    int value;
    …
}
int main(void)
{
    …
}

 One copy exists for each call; may be unbounded # of calls, so can’t allocate at compile-time



When to Allocate?

Static time
 Some local variables:

int f(…)
{
    static int value;
    …
}
int main(void)
{
    …
}

One copy exists for all calls; allocated at compile-time

Confusingly, local static has nothing to do with global static!



Allocation in Process Memory

0x7FFFFFFFFFFF
Stack                       Static size, dynamic allocation
                            Local variables
Shared Libraries
Heap                        Dynamic size, dynamic allocation
                            Programmer controlled (variable-sized objects)
Read/Write Data             Static size, static allocation
Read-only Code and Data     Global variables (and static local variables)
0x000000000000
Why are there different methods?

Heap allocation is the most general


 Supports any size and number of allocations

Why don’t we use it for everything?


 Performance

Static allocation takes no run time

Stack allocation takes orders of magnitude less run time than heap allocation



Deallocation

Space allocated via variable definition (entering scope) is automatically deallocated when exiting scope

… f(void)
{
    int y;
    int array[10];
    …
}

 Can’t refer to y or array outside of f(), so their space is deallocated upon return



Deallocation
malloc() allocates memory explicitly
 Must also deallocate it explicitly (using free())!
 Not automatically deallocated (garbage collected)
as in Python and Java
 Forgetting to deallocate leads to memory leaks &
running out of memory

int *a = malloc(num_items * sizeof(int));



free(a);

a = malloc(2 * num_items * sizeof(int));

Must not use a freed pointer unless reassigned or reallocated



Deallocation

Space allocated by malloc() is freed when the program terminates
 If data structure is used until program termination,
don’t need to free
 Entire process’ memory is deallocated



Back to create_date

Broken version (d is used without ever pointing at allocated memory):

Date *
create_date3(int month, int day, int year)
{
    Date *d;

    d->month = month;
    d->day = day;
    d->year = year;

    return (d);
}

Corrected version:

Date *
create_date3(int month, int day, int year)
{
    Date *d;

    d = malloc(sizeof(Date));
    if (d != NULL) {
        d->month = month;
        d->day = day;
        d->year = year;
    }

    return (d);
}



Pitfall

void
foo(void)
{
    Date *today;

    today = create_date3(1, 28, 2020);

    /* Use "today", if it is not NULL. */
    ...

    return;
}

Potential problem: memory allocation is performed inside create_date3() (the caller may not know its implementation)

Memory is still allocated for “today”!
Will never be deallocated (calling function doesn’t even know about it)



Possible Solutions

Explicitly deallocate memory: specification of create_date3 must tell you to do this

void
foo(void)
{
    Date *today;

    today = create_date3(…);

    /* Use "today", … */
    ...

    free(today);
    return;
}

Complete the abstraction: “create” has a corresponding “destroy”

void
foo(void)
{
    Date *today;

    today = create_date3(…);

    /* Use "today", … */
    ...

    destroy_date(today);
    return;
}



Common Memory Management Mistakes



What’s Wrong With This Code?

int *f(…)                   int *make_array(…)
{                           {
    int i;                      int array[10];
    …                           …
    return (&i);                return (array);
}                           }

Consider the statement j = *f();

Leads to referencing deallocated memory
 Never return a pointer to a local variable!

Behavior depends on allocation pattern
 Space not reallocated (unlikely) → works
 Space reallocated → very unpredictable



One Solution

int *f(…)                       int *make_array(…)
{                               {
    int *i_ptr =                    int *array =
        malloc(sizeof(int));            malloc(10 * sizeof(int));
    …                               …
    return (i_ptr);                 return (array);
}                               }

Allocate with malloc(), and return the pointer
 Upon return, space for local pointer variable is deallocated
 But the malloc-ed space isn’t deallocated until it is free-d
 Potential memory leak if caller is not careful, as with create_date3…



What’s Wrong With This Code?

/* Return "y = Ax". */
int *matvec(int **A, int *x) {
    int *y = malloc(N * sizeof(int));
    int i, j;

    for (; i<N; i+=1)
        for (; j<N; j+=1)
            y[i] += A[i][j] * x[j];
    return (y);
}

Missing: an initialization loop for y[], i=0 in the outer loop, and j=0 in the inner loop

malloc-ed & declared space is not initialized!
 i, j, y[i] initially contain unknown data, i.e., garbage
 Often has zero value, leading to seemingly correct results



What’s Wrong With This Code?

char **p;
int i;

/* Allocate space for M*N matrix. */
p = malloc(M * sizeof(char));    /* should be sizeof(char *) */

for (i = 0; i < M; i++)
    p[i] = malloc(N * sizeof(char));

Allocates wrong amount of memory
 Leads to writing into either unallocated memory or memory allocated for something else



Explanation

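Heap region in memory (each rectangle represents one byte)
Assume M = N = 2, a memory address is 8 bytes (or 64 bits)

p = malloc(M * sizeof(char));

for (i = 0; i < M; i++)
    p[i] = malloc(N * sizeof(char));

[Figure: p was given only 2 bytes, but storing p[0] alone takes 8 bytes, so the write runs past the end of the allocation]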



Corrected code
Heap region in memory (each rectangle represents one byte)
Assume M = N = 2, a memory address is 8 bytes (or 64 bits)

p = malloc(M * sizeof(char *));

for (i = 0; i < M; i++)
    p[i] = malloc(N * sizeof(char));

[Figure: p now has 16 bytes, enough to hold both p[0] and p[1]]



What’s Wrong With This Code?

char **p;
int i;

/* Allocate space for M*N matrix. */
p = malloc(M * sizeof(char *));

for (i = 0; i <= M; i += 1)
    p[i] = malloc(N * sizeof(char));

Off-by-1 error
 Uses interval 0…M instead of 0…M-1
 Leads to writing unallocated memory

Be careful with loop bounds!



Using const with pointers

const int *iptr


 Pointer to a constant integer
 Cannot write to *iptr

int *const iptr


 Constant pointer to an integer
 Cannot modify the pointer (iptr)
 Can write to *iptr
char *
xyz(char * to, const char * from)
{
    char *save = to;
    for (; (*to = *from); ++from, ++to);
    return(save);
}



What’s Wrong With This Code?

char *s = "1234567";
…
char t[7];
strcpy(t, s);

char *
strcpy(char * to, const char * from)
{
    char *save = to;
    for (; (*to = *from); ++from, ++to);
    return(save);
}

t[] doesn’t have space for string terminator
 Leads to writing into unallocated memory
One way to avoid:

char *s = "1234567";

char *t = malloc((strlen(s) + 1) * sizeof(char));
strcpy(t, s);



What’s Wrong With This Code?

/*
 * Search memory for a value.
 * Assume value is present.
 */
int *search(int *p, int value) {
    while (*p > 0 && *p != value)
        p += sizeof(int);    /* should be: p += 1; */
    return (p);
}

Misused pointer arithmetic
 Search skips some data, can read unallocated memory, and might not ever see value
 Should never add sizeof() to a pointer
 Could consider rewriting this function & its uses to use array notation instead



What’s Wrong With This Code?

x = malloc(N * sizeof(int));

free(x);

y = malloc(M * sizeof(int));
for (i = 0; i < M; i++) {
    y[i] = x[i];
    x[i] += 1;
}

Premature free()
 Reads and writes deallocated memory

Behavior depends on allocation pattern
 Space not reallocated → works
 Space reallocated → very unpredictable



What’s Wrong With This Code?

void foo(void) {
    int *x = malloc(N * sizeof(int));

    return;        /* fix: call free(x) before returning */
}

Memory leak: doesn’t free malloc-ed space
 Data still allocated, but inaccessible, since can’t refer to x

Initially slows future memory performance and may ultimately lead to failure



What’s Wrong With This Code?
struct ACons {                  /* A peek at one way to define lists */
    int first;
    struct ACons *rest;
};
typedef struct ACons *List;

List cons(int first, List rest) {
    List item = malloc(sizeof(struct ACons));
    item->first = first;
    item->rest = rest;
    return (item);
}

void foo(void) {
    List list = cons(1, cons(2, cons(3, NULL)));

    free(list);
    return;
}



Example continued

Memory leak – frees only beginning of data structure


 Remainder of data structure is still allocated, but
inaccessible
 Need to write deallocation (destructor) routines for each
data structure



Putting it all together ...

bools
strings
pointers
structs
malloc() calls
simple I/O
simple string operations



What does action1() do?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct thing {
    char *stuff;
    struct thing *another_thing;
};

void
action1(struct thing **yp, const char *stuff)
{
    struct thing *x = malloc(sizeof(struct thing));

    /* Alternatively, x->stuff = strdup(stuff); */
    x->stuff = malloc(strlen(stuff) + 1);
    strcpy(x->stuff, stuff);
    …
}

action1() inserts a new node storing the specified string into the linked list
What does action2() do?
struct thing {
    char *stuff;
    struct thing *another_thing;
};

void
action2(struct thing **yp)
{
    struct thing *x;

    while ((x = *yp) != NULL) {
        printf("%s ", x->stuff);
        yp = &x->another_thing;
    }
    …
}

action2() prints the strings stored in the linked list nodes sequentially
What does action3() do?
struct thing {
    char *stuff;
    struct thing *another_thing;
};

bool
action3(struct thing **yp, const char *stuff)
{
    struct thing *x;

    while ((x = *yp) != NULL) {
        if (strcmp(x->stuff, stuff) == 0)
            return (true);
        else
            yp = &x->another_thing;
    }
    …
}

action3() finds out whether a string is stored in the linked list
What does action4() do?
struct thing {
    char *stuff;
    struct thing *another_thing;
};

void
action4(struct thing **yp, const char *stuff)
{
    struct thing *x;

    while ((x = *yp) != NULL) {
        if (strcmp(x->stuff, stuff) == 0) {
            *yp = x->another_thing;
            …
        }
        …
    }
}

action4() deletes the first list node that stores the specified string
(style compromised to save space)
Next Time

Lab: Debugging
Assembly



Assembly Language

Alan L. Cox
alc@rice.edu

Some slides adapted from CMU 15.213 slides


Objectives

Be able to read simple x86-64 assembly language programs

Cox Assembly 47
Why Learn Assembly Language?
You’ll probably never write a complete
program in assembly
 With a few exceptions, modern compilers are much
better at writing assembly than you are
But, understanding assembly is key to
understanding the machine-level execution
model
 Behavior of programs in presence of bugs
• High-level language model breaks down
 Tuning program performance
• Understanding sources of program inefficiency
 Implementing system software
• Compiler has machine code as target
• Operating systems must manage process state

Assembly Language

One assembly instruction


 Straightforward translation to a group of machine
language bits that describe one instruction

What do these instructions do?


 Same kinds of things as high-level languages!
• Arithmetic & logic
– Core computation
• Data transfer
– Copying data between memory locations and/or registers
• Control transfer
– Changing which instruction is next

Assembly Language (cont.)

But, assembly language has additional features:
 Distinguishes instructions & data
 Labels = names for program control points
 Pseudo-instructions = special directives to the
assembler
 Macros = user-definable abbreviations for code &
constants

