CS107 J Zelenski

Handout #7 May 23, 2011

Final practice
Final Exam: Friday, June 3rd 8:30-11:30am Memorial Auditorium

This is our registrar-scheduled exam time. There is no alternate exam. You may bring your textbook, notes, and other paper resources, but no electronic devices may be used.

SCPD (local): See forum post with map and parking info for the on-campus exam.
SCPD (remote): Please confirm arrangements for on-site exam by email to head TA Nate Hardison (hardison@stanford.edu) by May 31st.

Material
The final is comprehensive but expect more coverage on post-midterm topics and particular focus on material covered in the labs and assignments. Check your rear-view mirror for the very impressive list of things you've learned in 107:
• C: strings, arrays, pointers, &, *, void*, typecasts, function pointers
• Data representation: bits, bytes, ASCII, two's complement integers, floating point, arrays, pointers, structs
• IA32 assembly: data access and addressing modes, arithmetic and logical ops, implementation of C control structures, call/return, register use
• Address space: layout and purpose of text/data/stack/heap segments, handling of globals/locals/parameters
• Runtime stack: protocol for function call/return, parameter passing, management of ebp and esp registers
• Compilation: tasks handled by preprocessor, compiler, assembler, and linker, static and dynamic linking, relocatable object files and executables, makefiles
• Memory: memory hierarchy, caches, locality, static versus dynamic allocation, heap allocator strategies and tradeoffs
• Performance: compiler optimizations, measuring execution time, profiling

The rest of this handout is the final from CS107 last term so questions are fairly representative in terms of format, difficulty, and content. To conserve paper, I removed answer space, but the real exam will have much more room for your answers and scratch work. We'll distribute solutions later this week. Good luck preparing!

Problem 1: Python and C

Parts (a) and (c) were Python questions. We did not cover Python this term and it will not appear on the exam.

b: Implement the C function Max that takes an array of strings and returns the longest string from the array. The function should use qsort to sort a copy of the array by length and then return the longest. You can assume the array has at least one entry.

    char *Max(char *words[], int nWords)

Problem 2: IA32

Below is the unoptimized assembly code generated for the Binky function. Note this is nonsense code, not intended to do something meaningful.

[Assembly listing for Binky. The pairing of operands with instructions was lost in extraction; the mnemonics under each label and the operand fragments, in extracted order, are kept below.]

    Binky:  push  mov  sub  movl  jmp
    .L3:    addl  mov  lea  mov  mov  mov  call  add
    .L2:    mov  cmp  jne  mov  shl  add  mov  leave  ret

    Operand fragments: %eax 0xf(%eax | %ebp %esp | 0x8(%ebp) 0xc(%ebp) | %eax (%eax) | (%esp) Binky %eax | 4) | %eax -0x4(%ebp) | %eax | %esp $0x4 | -0x4(%ebp) | %ebp $0x18 | %eax %eax | %eax | %eax 0x8(%ebp) | %eax %eax | .L3 0xc(%ebp) | 0x4(%esp) 0x8(%ebp) | -0x4(%ebp) 0xc(%ebp) | .L2 $0x4 | %eax | %eax | %eax $0x2

Fill in the blanks in the C code below for Binky to compile to the assembly above. Your code should refer to variables, not register names.

    int Binky(int *param1, int param2)
    {
        int local = _____________________ ;

        while ( _____________________________ ) {
            __________________________________________ ;
            __________________________________________ ;
        }
        return ___________________________ ;
    }
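The approach Problem 1b describes (copy the array, qsort the copy by length, return the last entry) might be sketched as follows. This is one possible approach, not the official solution. The comparator name CompareLength and the choice to return NULL when malloc fails are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort: orders strings by increasing length.
 * qsort hands the comparator pointers to the array elements,
 * so each argument is really a pointer to a char *. */
static int CompareLength(const void *a, const void *b)
{
    size_t lenA = strlen(*(char *const *)a);
    size_t lenB = strlen(*(char *const *)b);
    return (lenA > lenB) - (lenA < lenB);
}

/* Sorts a copy of the pointer array so the caller's array is left
 * untouched; after sorting, the longest string is in the last slot. */
char *Max(char *words[], int nWords)
{
    char **copy = malloc(nWords * sizeof(char *));
    if (copy == NULL) return NULL;   /* assumption: report failure as NULL */
    memcpy(copy, words, nWords * sizeof(char *));
    qsort(copy, nWords, sizeof(char *), CompareLength);
    char *longest = copy[nWords - 1];
    free(copy);                      /* frees only the pointer array, not the strings */
    return longest;
}
```

Only the array of pointers is copied, so the returned pointer still refers to one of the caller's original strings. If several strings tie for longest, which one is returned is unspecified because qsort is not guaranteed to be stable.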

Problem 3: Runtime stack

a: Implement the Corrupt function that creates stack corruption for testing a crash reporter. The function has two arguments: an integer whichType and a pointer ptr. If whichType is an odd number, the function overwrites the return address with the value of ptr. If whichType is even, it overwrites the saved base pointer with a pointer back to itself. Note the stack data being overwritten is within the stack frame of Corrupt, not main or further back in the stack.

    void Corrupt(int whichType, void *ptr)

b: Assume the Corrupt function has been implemented correctly. Consider its use in a program consisting of just Corrupt and the main function below.

    int main(int argc, const char *argv[])
    {
        printf("Before %d %p \n", argc, argv);
        Corrupt(argc, NULL);
        printf("After %d %p \n", argc, argv);
        return 0;
    }

The above program is executed with no command-line arguments. What does the program output and how does it behave? Be specific.

The above program is executed with one command-line argument. What does it output and how does it behave? Be specific.

Problem 4: Heap allocator

You are implementing a segregated storage allocator. This allocator treats the heap segment as a collection of individual pages where each page contains a group of same-size blocks. A page is 4096 bytes.

    typedef struct {
        size_t sz;                  // size of blocks on this page
        unsigned int status[31];    // bit vector of free/in-use per block
    } page_header;

Each page begins with a header as declared above. The rest of the page is divided into blocks. Every block on a page is the same size as the sz field of the page header. The blocks are laid out end-to-end; there is no block header or padding.

The status array in the page header is used as a bit vector. Each block on the page corresponds to one bit. The bit is 1 if that block is freed, 0 if in use. When the page is created, each block's status bit is initialized to 1. A block's status bit is toggled when being allocated or freed.

In the first example page diagrammed below, blocks are 4 bytes and a total of 992 blocks fit within the page. The page header occupies the first 128 bytes; the remaining 3968 bytes are divided into blocks. The first block is at address 0x7080, the last at 0x7ffc. The first block's status bit is at position 0 of the bit vector, the last block's at position 991.
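The figures quoted for the 4-byte example page follow directly from the layout rules above. A minimal sketch of that arithmetic is shown here; the names PAGE_BYTES, HEADER_BYTES, BlocksPerPage, BlockAddress, and BlockPosition are hypothetical helpers introduced only for illustration and are not part of the exam's code.

```c
#include <stdint.h>

#define PAGE_BYTES   4096u   /* page size stated in the handout */
#define HEADER_BYTES 128u    /* page header size stated in the handout */

/* Number of blocks of size blockSz that fit after the page header. */
static unsigned int BlocksPerPage(unsigned int blockSz)
{
    return (PAGE_BYTES - HEADER_BYTES) / blockSz;
}

/* Address of the block whose status bit is at position pos, on the
 * page that starts at pageStart. Blocks are laid out end-to-end. */
static uintptr_t BlockAddress(uintptr_t pageStart, unsigned int blockSz,
                              unsigned int pos)
{
    return pageStart + HEADER_BYTES + (uintptr_t)pos * blockSz;
}

/* Inverse mapping: bit-vector position of the block at address addr. */
static unsigned int BlockPosition(uintptr_t pageStart, unsigned int blockSz,
                                  uintptr_t addr)
{
    return (unsigned int)((addr - pageStart - HEADER_BYTES) / blockSz);
}
```

For the page at 0x7000 with 4-byte blocks, BlocksPerPage(4) gives 992, position 0 maps to 0x7080, and position 991 maps to 0x7ffc, matching the numbers in the text.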

    0x7000:  sz 4  | status (7004-7076) | 7080 | 7084 | 7088 | 708c | 7090 | 7094 | 7098 | 709c | ... | 7ff0 | 7ff4 | 7ff8 | 7ffc

Another heap page might be divided into 16-byte blocks as shown below:

    0x8000:  sz 16 | status (8004-8076) | 8080 | 8090 | ... | 8ff0

Specific implementation facts:
• This allocator rounds up all requested sizes to a power of 2. The minimum block size is 4. We will ignore allocating blocks larger than 2048.
• This allocator returns pointers aligned to 4-byte boundaries.
• The heap segment consists of a sequence of pages laid out end-to-end. The heap start address is always page-aligned.
• To keep things simple, every page stores the same number of entries in the status bit vector, even though a page with larger blocks requires fewer. Assume the excess status bits are initialized to 0, marking them unavailable.
• This allocator never splits nor coalesces blocks. The realloc operation must move a block to change its size.

Here are the global variables, type definitions, and constants.

    // constant: number of bits in an unsigned int
    #define INTBITS (sizeof(unsigned int)*8)

    // constant: number of bytes per page
    #define PAGESZ 4096

    typedef struct {
        size_t sz;
        unsigned int status[31];
    } page_header;

    static void *heapStart;    // addr of first page in heap segment
    static void *heapEnd;      // addr past end of last page of segment

The allocator uses the gcc built-in function clz ('count leading zeros').

    int clz(unsigned int val);

The clz function counts the leading 0-bits in integer parameter val starting from the most significant bit. If val is zero, it returns the INTBITS constant. This built-in is implemented as a single, fast IA32 instruction (bsr bit scan reverse).

a: The RoundToPower function is given a size sz and returns the smallest power of 2 that is greater than or equal to sz. For example, RoundToPower(7) returns 8. The parameter sz is required to be non-zero and must be able to be rounded without overflow. The implementation below uses clz for efficiency.

    static size_t RoundToPower(size_t sz)
    {
        int count = clz(sz);
        return ((unsigned int)INT_MIN) >> (count - 1);
    }

The function as implemented above does not work correctly in all cases. Identify the input(s) for which it returns an incorrect result. Fix the function so it works correctly for all inputs, retaining comparable efficiency.

b: The status field in the page header is an array of unsigned integers used as a bit vector. The bits are referred to by position. Position 0 refers to the least significant bit of status[0], position 32 is the least significant bit of status[1], and so on. A free position has bit equal to 1. Below are two helper functions that operate on the status bit vector.

Complete the FindFree function below by providing the missing test expression. This function searches for and returns a free position within the bit vector or -1 if no positions are free.

    static int FindFree(unsigned int *array)
    {
        for (int i = 0; i < 31; i++) {    // status array has 31 entries
            for (int pos = 0; pos < INTBITS; pos++) {
                if (_________________________________________)
                    return pos + i*INTBITS;
            }
        }
        return -1;
    }

Complete the Toggle function below that inverts a single bit at a given position in the bit vector. The function can assume that pos is within range for the bit vector.

    static void Toggle(unsigned int *array, int pos)
    {
        int index1 = pos / INTBITS;
        int index2 = pos % INTBITS;

c: Complete the mymalloc function below. The function takes one argument, the requested payload size in bytes, and returns the address of a newly allocated heap block. The requested size is rounded up to nearest power of 2. It uses a first-fit search through the heap pages to find a page with blocks of the correct size containing a free block. If an appropriate block is found, it updates the heap data structures and returns the pointer. If none is found, it returns NULL (do not use larger sizes or extend the heap). You should use the helper functions FindFree and Toggle. You can assume that myinit has been called and all pages in the heap segment have been properly initialized.

    void *mymalloc(size_t sz)
    {
        sz = max(RoundToPower(sz), 4);    // round to power of 2, minimum 4
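The bit-vector conventions from parts a and b can be exercised with a small sketch like the one below. It is one possible approach under stated assumptions, not the official solution. All names ending in Sketch are hypothetical; ClzSketch is a portable stand-in for the handout's clz (it returns INTBITS for zero, as the handout specifies), and RoundToPowerSketch assumes a 32-bit unsigned int and that sz fits in one.

```c
#include <limits.h>
#include <stddef.h>

#define INTBITS (sizeof(unsigned int) * 8)
#define NSTATUS 31   /* entries in the status array, per the handout */

/* Portable stand-in for clz: counts leading 0-bits from the most
 * significant bit, returning INTBITS when val is zero. */
static int ClzSketch(unsigned int val)
{
    int count = 0;
    for (unsigned int mask = 1u << (INTBITS - 1);
         mask != 0 && (val & mask) == 0; mask >>= 1)
        count++;
    return count;
}

/* One way to round up to a power of 2: count leading zeros of sz - 1
 * rather than sz, so exact powers of 2 (and 1) map to themselves.
 * Assumes sz is non-zero and the result fits in an unsigned int. */
static size_t RoundToPowerSketch(size_t sz)
{
    int count = ClzSketch((unsigned int)(sz - 1));
    return (size_t)(((unsigned int)INT_MIN) >> (count - 1));
}

/* Returns the first position whose bit is 1 (free), or -1 if none. */
static int FindFreeSketch(const unsigned int *array)
{
    for (int i = 0; i < NSTATUS; i++) {
        for (int pos = 0; pos < (int)INTBITS; pos++) {
            if (array[i] & (1u << pos))
                return pos + i * (int)INTBITS;
        }
    }
    return -1;
}

/* Inverts the single bit at position pos using exclusive-or. */
static void ToggleSketch(unsigned int *array, int pos)
{
    int index1 = pos / (int)INTBITS;   /* which array entry */
    int index2 = pos % (int)INTBITS;   /* which bit within that entry */
    array[index1] ^= 1u << index2;
}
```

With the handout's original expression, an input that is already a power of 2 comes back doubled: for sz of 8, clz gives 28 and the shift by 27 yields 16. Counting the leading zeros of sz - 1 avoids that case while keeping the single clz call.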

The PageStart function is given a pointer and returns the page-aligned start address of the page containing that pointer. The C code below works correctly, but the direct translation into unoptimized assembly uses an expensive divl operation and requires 5 total instructions for the function body, not counting the function prolog/epilog.

    static void *PageStart(void *ptr)
    {
        return (void *)(((unsigned int)ptr/PAGESZ)*PAGESZ);
    }

    push  %ebp              # prolog
    mov   %esp,%ebp
    mov   0x8(%ebp),%eax    # first body inst, load dividend
    mov   $0x0,%edx         # clear for divl instr
    mov   $0x1000,%ecx      # load divisor
    divl  %ecx              # divide edx:eax/ecx, quotient into eax
    imul  %ecx,%eax         # multiply quotient by divisor
    pop   %ebp              # epilog
    ret

d: Re-implement PageStart to compute an equivalent result using only 2 assembly instructions in the function body. You can hard-code knowledge that page size is the constant 4096. Show both the C code and its generated assembly.

    static void *PageStart(void *ptr)
    {
        return _________________________________________________;
    }

    push  %ebp              # prolog
    mov   %esp,%ebp
    mov   0x8(%ebp),%eax    # first body inst
    ___________________________________________
    pop   %ebp              # epilog
    ret

e: Complete the myfree function which deallocates a heap block and properly updates the heap data structures. You can assume that ptr is the address of an allocated heap block. You should use the helper functions PageStart and Toggle.

    void myfree(void *ptr)

f: A callgrind profile shows mymalloc to be a bottleneck. It spends many cycles examining a given page (almost all that time is spent in FindFree) and examines many pages. Compiling with optimization helps, but you need a further boost in throughput.

Describe a change in code/strategy that could significantly reduce the number of cycles spent in FindFree.
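For part d above, one common way to avoid the division is to mask off the low bits of the address, which works because 4096 is a power of 2. The sketch below is one possible approach, not the official solution. The name PageStartSketch is hypothetical, and uintptr_t is used in place of the handout's unsigned int cast so the code also behaves on 64-bit machines.

```c
#include <stdint.h>

#define PAGESZ 4096

/* Rounds an address down to a multiple of PAGESZ by clearing its low
 * 12 bits. Valid only because PAGESZ is a power of 2. */
static void *PageStartSketch(void *ptr)
{
    return (void *)((uintptr_t)ptr & ~(uintptr_t)(PAGESZ - 1));
}
```

On IA32 a compiler would typically translate this body into a load of the argument followed by a single and with the constant 0xfffff000, although the exact output depends on the compiler and flags.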

Describe a change in code/strategy that could significantly reduce the number of pages being examined.

g: This form of segregated storage has very little internal fragmentation. Compute the utilization for the best-case scenario of a heap consisting of just one full page of in-use 8-byte blocks. How does this compare to the best-case utilization for a non-segregated allocator that tacks an 8-byte header onto every block?

Segregation can increase external fragmentation. Describe a scenario where the segregated storage allocator would have much lower utilization than the non-segregated allocator due to external fragmentation.

Problem 5: Compilation

Failing to #include a necessary header file can cause a variety of consequences. For each scenario described below, provide/describe a code example that results in that outcome.

a: The missing #include causes a compiler warning but doesn't block the build nor create an execution error.
b: The missing #include causes a compiler error.
c: The missing #include causes a linker error.
d: The missing #include causes an execution error (builds but does wrong thing).
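Returning to part g of Problem 4, the best-case utilization figures reduce to simple arithmetic on the page layout. The sketch below is one way to compute them, not the official solution; the names PAGE_BYTES, HEADER_BYTES, SegregatedUtilization, and PerBlockHeaderUtilization are hypothetical, and utilization is taken to mean payload bytes divided by total bytes consumed.

```c
#define PAGE_BYTES   4096u   /* page size stated in the handout */
#define HEADER_BYTES 128u    /* page header size stated in the handout */

/* Payload fraction of one full page of in-use blocks of size blockSz:
 * the only overhead is the page header and any unusable tail bytes. */
static double SegregatedUtilization(unsigned int blockSz)
{
    unsigned int nBlocks = (PAGE_BYTES - HEADER_BYTES) / blockSz;
    return (double)(nBlocks * blockSz) / PAGE_BYTES;
}

/* Payload fraction when every block carries its own header and the
 * payload is exactly the requested size (the best case). */
static double PerBlockHeaderUtilization(unsigned int payloadSz,
                                        unsigned int headerSz)
{
    return (double)payloadSz / (payloadSz + headerSz);
}
```

For 8-byte blocks this gives 3968 / 4096, about 97%, for the segregated page, against 8 / 16, or 50%, for an allocator that adds an 8-byte header to each 8-byte payload.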
