
Understanding Linux: Executable File Formats (ELF) http://havefunwhileulearn.blogspot.com/2010/04/...


Understanding Linux


Saturday, April 17, 2010

Executable File Formats (ELF)


What happens when you compile your C code?
The answer is simple: the compiler generates an
executable. On a Linux/Unix system, the executable
is named “a.out” by default.

What’s there inside an executable file (a.out)?


Have you ever tried dissecting an a.out file? It is not a
plain binary file of machine code. It is much more than
that, and carries a lot of other information that helps
the operating system load it into memory. Executable
files come in various formats, such as COFF and ELF.
Nowadays, most Unix-like operating systems (Linux, BSD,
Solaris, IRIX, etc.) use ELF (Executable and Linkable
Format) for their executables.

Typically an ELF executable includes:

ELF Header
Program Headers
Section Headers
Data referred to by the program or section headers

Dissecting an ELF File


We will take a simple C program, compile it, and see what is in the
generated a.out (ELF) file.

/************************* test.c ************************/
#include <stdio.h>

int global1 = 100;   /* initialized global: goes to .data */
int global2;         /* uninitialized global: goes to .bss */

int main(void)
{
    global2 = 200;
    global1 = 300;
    printf("global1 = %d global2 = %d\n", global1, global2);
    return 0;
}

Compiling this on a Linux system generates a.out in the ELF file format.


You can determine the file format using the file command.

# file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped

The “file” command determines this information by reading the ELF header,
which lies at the start of the file.

ELF Header
The ELF header always lies at the start of the executable file. It holds
overall information about the entire ELF file: the target architecture
(Intel 80386 in this case), the ELF version, and the location and number of
the program and section headers. It also contains the address of the first
executable instruction (called the entry point).

Let’s print the contents of the ELF header for our “a.out” ELF executable.
You can use the “readelf” tool to dissect the ELF executable.

ELF HEADER
-----------
#define EI_NIDENT 16
typedef struct {
    unsigned char e_ident[EI_NIDENT]; // ELF magic
    Elf32_Half    e_type;
    Elf32_Half    e_machine;   // target machine architecture
    Elf32_Word    e_version;
    Elf32_Addr    e_entry;     // entry point address
    Elf32_Off     e_phoff;     // program header table's file offset
    Elf32_Off     e_shoff;     // section header table's file offset
    Elf32_Word    e_flags;
    Elf32_Half    e_ehsize;    // ELF header size in bytes
    Elf32_Half    e_phentsize; // size of one entry in the program header
                               // table in bytes; all entries are of equal size
    Elf32_Half    e_phnum;     // number of entries in the program header table
    Elf32_Half    e_shentsize; // size of one section header in bytes
    Elf32_Half    e_shnum;     // number of section headers in the section header table
    Elf32_Half    e_shstrndx;  // index of the .shstrtab section in the section header table
} Elf32_Ehdr;

# readelf -h a.out
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x80482b0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1980 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         7
  Size of section headers:           40 (bytes)
  Number of section headers:         28
  Section header string table index: 25

The first four bytes (7f 45 4c 46) hold a magic number identifying the file
as an ELF file.


The second (0x45), third (0x4c) and fourth (0x46) bytes are the ASCII
values for ‘E’, ‘L’ and ‘F’. The “file” command reads this magic number to
determine whether this is an ELF file.
Note the entry point address. This is the address of the first instruction,
to which control is transferred after the executable is loaded into memory.
The ELF header also contains the offsets at which the program header table
and the section header table are placed in the a.out file.
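
The magic-number check can be sketched in a few lines of C. This is only a minimal illustration, not how “file” or “readelf” are actually implemented. The names has_elf_magic and read_elf32_header are hypothetical, the code assumes a Linux system with <elf.h>, and it assumes the file's byte order matches the host's (true for a little-endian x86 machine reading the a.out above).

```c
#include <assert.h>
#include <elf.h>      /* Elf32_Ehdr, ELFMAG, SELFMAG, EI_CLASS, ELFCLASS32 */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf starts with 0x7f 'E' 'L' 'F'. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= (size_t)SELFMAG && memcmp(buf, ELFMAG, SELFMAG) == 0;
}

/* Hypothetical helper: copies a 32-bit ELF header out of buf.
 * Returns 0 on success, or -1 if the buffer is too short, lacks the
 * magic number, or is not a 32-bit (ELFCLASS32) file.
 * Assumption: the file's byte order is the same as the host's. */
int read_elf32_header(const unsigned char *buf, size_t len, Elf32_Ehdr *out)
{
    assert(out != NULL);
    if (len < sizeof *out || !has_elf_magic(buf, len))
        return -1;
    if (buf[EI_CLASS] != ELFCLASS32)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

To try it, read the first sizeof(Elf32_Ehdr) bytes of a.out into a buffer with fread and pass them to read_elf32_header; the e_entry, e_phoff and e_shoff fields should then correspond to the entry point and the two “Start of ...” lines that readelf prints.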

ELF Section Headers


The ELF executable contains various sections, and each section has a
corresponding section header that contains the section name, the virtual
address at which the section should be loaded, the type of the section, the
offset from the beginning of the file at which the first byte of the section
resides, the size of the section, etc.

A few important sections are:


.text : This section holds the executable instructions of the program.
.bss : This holds the uninitialized global data. In our example code,
the variable global2 will go to the .bss section. All data in this
section is initialized to 0 when the program is loaded into memory.
This section occupies no space in the ELF file; only a header for the
.bss section is present. There is no need to allocate any space in
a.out (the ELF file), as we know that the initial value of the
variables inside .bss is 0.
.data : Global initialized data goes here.
.strtab : It holds the names of various symbols.
.symtab : It holds a symbol entry for each symbol.
.shstrtab : This section holds section names.

There are various other sections as well, but we will concentrate only on
the above sections.
Let’s print the section headers for the above sections. Again, readelf can
be used to print them.

ELF SECTION HEADER
------------------
typedef struct {
    Elf32_Word sh_name;      // offset into .shstrtab section
    Elf32_Word sh_type;
    Elf32_Word sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off  sh_offset;
    Elf32_Word sh_size;
    Elf32_Word sh_link;
    Elf32_Word sh_info;
    Elf32_Word sh_addralign;
    Elf32_Word sh_entsize;
} Elf32_Shdr;
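
Note that sh_name is not a string but an offset into the .shstrtab section, which is a sequence of NUL-terminated names. A minimal sketch of the lookup follows. The function name section_name is hypothetical, and the code assumes the contents of .shstrtab have already been read into memory.

```c
#include <assert.h>
#include <elf.h>      /* Elf32_Word; assumes a Linux/glibc system */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the section name stored at offset sh_name
 * inside the in-memory copy of .shstrtab, or NULL if the offset is out of
 * range or the name is not NUL-terminated within the table. */
const char *section_name(const char *shstrtab, size_t size, Elf32_Word sh_name)
{
    assert(shstrtab != NULL);
    if (sh_name >= size)
        return NULL;
    /* Make sure the string ends inside the table before handing it out. */
    if (memchr(shstrtab + sh_name, '\0', size - sh_name) == NULL)
        return NULL;
    return shstrtab + sh_name;
}
```

The bounds checks are there because a damaged or hostile file can carry any value in sh_name; a tool that trusts it blindly could read past the end of the table.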

# readelf -S a.out
(only important fields are shown below)

Section Headers:
[Nr] Name       Type      Addr      Off     Size    Flg
[12] .text      PROGBITS  080482b0  0002b0  0001d8  AX
[22] .data      PROGBITS  080495c4  0005c4  000008  WA
[23] .bss       NOBITS    080495cc  0005cc  00000c  WA
[25] .shstrtab  STRTAB    00000000  0006e0  0000db
[26] .symtab    SYMTAB    00000000  000c1c  000460
[27] .strtab    STRTAB    00000000  00107c  00026a

The section flags have the following meanings:

A (ALLOC): Space should be allocated in memory to load this section.
Note that the symbol and string tables are not loaded into memory.
X (EXEC INSTRUCTIONS): The section contains executable machine
instructions. Note that the .text section has this flag set.
W (WRITE): The section has data that can be modified during
program execution.
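
These letters correspond to bits in the sh_flags field. As a small sketch, the function below turns those bits back into the letters used in the table. The name section_flags_to_string is hypothetical, only the three flags discussed here are handled, and the SHF_* constants come from <elf.h> on Linux.

```c
#include <assert.h>
#include <elf.h>      /* SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR, Elf32_Word */
#include <stddef.h>

/* Hypothetical helper: writes the W/A/X letters for sh_flags into out,
 * which must have room for at least 4 bytes (3 letters plus the NUL). */
void section_flags_to_string(Elf32_Word sh_flags, char out[4])
{
    size_t n = 0;
    assert(out != NULL);
    if (sh_flags & SHF_WRITE)
        out[n++] = 'W';
    if (sh_flags & SHF_ALLOC)
        out[n++] = 'A';
    if (sh_flags & SHF_EXECINSTR)
        out[n++] = 'X';
    out[n] = '\0';
}
```

For a section with SHF_ALLOC and SHF_EXECINSTR set this produces the same letters shown for .text, and for SHF_WRITE plus SHF_ALLOC the same letters shown for .data and .bss.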

Note the section type (NOBITS) of the .bss section. NOBITS indicates that
the section does not occupy any space in the executable file.

Also note that the virtual address of the .symtab and .strtab sections is 0,
which means that they are not loaded into memory. They are used mainly when
debugging the program.

The offset specifies where the actual bytes for that section reside in the
ELF file.
For example, the offset for the .text section is 0x2b0, which means that the
machine instructions for this program lie at an offset of 0x2b0 from the
start of the a.out file.
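
The NOBITS rule and the offset field can be combined into a small lookup, sketched below. The names section_file_size and section_data are hypothetical, and the code assumes the whole file has already been read (or mapped) into a buffer called image.

```c
#include <assert.h>
#include <elf.h>      /* Elf32_Shdr, Elf32_Word, SHT_NOBITS */
#include <stddef.h>

/* Hypothetical helper: number of bytes the section occupies in the file.
 * A NOBITS section such as .bss occupies none, whatever sh_size says. */
Elf32_Word section_file_size(const Elf32_Shdr *sh)
{
    assert(sh != NULL);
    return sh->sh_type == SHT_NOBITS ? 0 : sh->sh_size;
}

/* Hypothetical helper: pointer to the section's bytes inside the file
 * image, or NULL if the section has no bytes in the file or would run
 * past the end of the image. */
const unsigned char *section_data(const unsigned char *image, size_t image_len,
                                  const Elf32_Shdr *sh)
{
    Elf32_Word n = section_file_size(sh);
    assert(image != NULL);
    if (n == 0 || sh->sh_offset > image_len || n > image_len - sh->sh_offset)
        return NULL;
    return image + sh->sh_offset;
}
```

For the .text header shown earlier (offset 0x2b0, size 0x1d8) this returns image + 0x2b0, while for .bss it returns NULL because the section type is NOBITS.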
