
Chapter 3: The Tools Of The Trade

The Development Tools


The Assembler
The Linker
The Debugger
The Compiler
The Object Code Disassembler
The Profiler
The Gnu Assembler
Installing The Assembler
Using The Assembler
A Word About Opcode Syntax
The Gnu Linker
The Gnu Compiler
Downloading And Installing GCC
Using gcc
The Gnu Debugger Program
Downloading And Installing gdb
Using gdb
The KDE Debugger
Downloading And Installing kdbg
Using kdbg
The Gnu Objdump Program
Using objdump
An objdump Example
The Gnu Profiler Program
Using The Profiler
A Profile Example
A Complete Assembly Development System
The Basics Of Linux
Downloading And Running MEPIS
Your New Development System

In (some) detail

At a minimum, three tools are needed:

1. assembler
2. linker
3. debugger

Additionally, three more are useful:
4. a compiler for a high level language
5. an object code disassembler
6. a profiling tool for optimization

The Development Tools


The Assembler

A tool that converts assembly language source code into instruction code for the
processor.
(_ instruction code that the processor can run)
(from chapter 1) An assembly language source code program has three components:
- opcode mnemonics
- data sections
- directives
Each assembler uses a different format for each of these components, so
programming with one assembler can be completely different from programming with
another.

(KEY) The biggest difference between assemblers is the directives. Opcode
mnemonics are closely tied to processor instruction codes, but directives are
unique to each assembler. Directives instruct the assembler how to construct the
instruction code program: they define program sections, and in some assemblers
they also provide higher-level constructs such as if-then statements and while
loops.
Some assemblers come with built-in editors; others are simple command line
programs.

This book uses the GNU assembler, gas. Note that the command line invocation is
'as', *NOT* 'gas'.

The Linker
Most high level language compilers - for C and C++, for example - compile and link
in a single step and don't separate the two. (_ but it pays to learn linking as a
separate step)

The process of linking resolves all defined functions and memory address labels
declared in the assembly language program. To do this, any C functions (such as
printf) that the assembly language program uses need to be included in the object
code, or a reference must be made to an external library. For this to work
automatically, the linker needs to know where the common object code libraries are
located on the computer; otherwise the locations must be specified manually with
the compiler command line parameters.

Most assemblers do not automatically link the object code to produce executable
code. A second step is needed to link the object code with other libraries and
produce an executable that can be run on the host operating system. This is the
job of the linker.

When the linker is invoked manually, the developer must know which libraries are
required to completely resolve any functions used. The linker must be told where
to find function libraries and which object code files to link together.

Many assembler packages include their own linker, but not all do.

The Debugger

Like assemblers, debuggers are specific to operating systems and hardware. The
debugger must understand
1. the instruction code set of the hardware platform
2. the registers and memory handling methods of the operating system.

Most debuggers provide four functions

1. running a program in a controlled environment, specifying any runtime
parameters
2. stopping the program at any point in execution
3. examining data elements such as memory locations and registers
4. changing elements in the program while it is running

The Compiler
Most professional developers do most of their development in high level
languages, e.g. C and C++, and then optimize parts in assembly. To do this, the
compiler for the HLL needs to produce instruction codes for the processor to
execute. Most compilers go through an intermediate step: instead of directly
converting HLL source code to processor instruction code, they produce assembly
code, which the assembler then converts into processor instruction code. We can
stop the process in the middle, examine the generated assembly code, modify it,
and then assemble and link the modified code.

The Object Code Disassembler


Takes an object code file and/or a fully executable program and displays the
instruction codes that will be run by the processor. Some disassemblers go a step
further and convert the instruction codes into easily readable assembly language
syntax.

The Profiler

Determines which functions consume which share of resources - processor time and
memory, for example. Once you narrow down which functions consume the most
resources, they can be optimized.

The Gnu Assembler


gas is a command line program controlled by command line parameters.

Parameter             Description

-a                    specify which listings to include in the output
-D                    included for backward compatibility, but ignored
--defsym              define a symbol and value before assembling the source code
-f                    fast assemble; skips comment and whitespace preprocessing
--gstabs              include debugging information for each source code line
--gstabs+             include special gdb debugging information
-I                    specify directories to search for include files
-J                    do not warn about signed overflows
-K                    included for backward compatibility, but ignored
-L                    keep local symbols in the symbol table
--listing-lhs-width   set the maximum width of the output data column
--listing-rhs-width   set the maximum width of the input source lines
-o                    specify the name of the output object file
-R                    fold the data section into the text section
--statistics          display the maximum space and total time used by the assembly
-v                    display the version number of as
-W                    do not display warning messages
--                    use standard input for source files

e.g:
as -o test.o test.s

A Word About Opcode Syntax


The confusing part of gas is the syntax it uses: AT&T syntax.
AT&T syntax originated at Bell Labs, where Unix was created, and was based on the
opcode syntax of the more popular processors used to implement Unix systems at
the time.

(KEY) Intel chose a different syntax. Most documentation for Intel assembly
language programming uses Intel syntax.

the main syntax differences are


1. AT&T syntax uses a $ to mark immediate operands; Intel syntax leaves them
undelimited. e.g. when referencing the decimal value 4, AT&T syntax uses $4, Intel
uses 4.
2. AT&T syntax prefixes register names with %; Intel does not. e.g. the EAX
register is referred to in AT&T syntax as %eax.
3. AT&T syntax uses the opposite order to Intel for source and destination
operands. To move the integer value 4 to the EAX register:
AT&T : movl $4, %eax # note the l (lowercase L) at the end of mov
Intel: mov eax, 4
4. AT&T syntax uses a separate character at the end of (_opcode) mnemonics to
indicate the data size. In Intel syntax the size is declared as a separate operand.
AT&T : movl $test, %eax
Intel: mov eax, dword ptr test -- presumably the dword ptr is the 'separate
operand'.
5. Long calls and jumps use a different syntax:
AT&T : ljmp $section, $offset
Intel: jmp section:offset

Installing The Assembler


Using The Assembler

The Gnu Linker


The Gnu Linker ld is used to link object code files into either executable files
or library files.

It has many parameters, but most of them are usually not required.

ld -o mytest mytest.o

creates the executable file mytest from the object file mytest.o . The executable
file is created with appropriate permissions so that it can be run from the command
line.

The Gnu Compiler


Downloading And Installing GCC
Using gcc

gcc -o ctest ctest.c

ravi@yantra:~/.../code$ ./ctest
Hello, world

gcc -S ctest.c
generates

        .file "ctest.c"
        .text
        .section .rodata
.LC0:
        .string "Hello, world \n "
        .text
        .globl main
        .type main, @function
main:
.LFB0:
        .cfi_startproc
        endbr64
        pushq %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq %rsp, %rbp
        .cfi_def_cfa_register 6
        leaq .LC0(%rip), %rdi
        movl $0, %eax
        call printf@PLT
        movl $0, %eax
        popq %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size main, .-main
        .ident "GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
        .section .note.GNU-stack,"",@progbits
        .section .note.gnu.property,"a"
        .align 8
        .long 1f - 0f
        .long 4f - 1f
        .long 5
0:
        .string "GNU"
1:
        .align 8
        .long 0xc0000002
        .long 3f - 2f
2:
        .long 0x3
3:
        .align 8
4:

The Gnu Debugger Program

used to debug C and C++ programs.

Downloading And Installing gdb


Using gdb

a command line program called gdb.

can run with different options


-b       set the line speed of the serial interface for remote debugging
-batch   run in batch mode
-c       specify the core dump file to analyze
-d       specify a directory to search for source files
-e       specify the file to execute
-f       output filename and line numbers in standard format for debugging
-nx      do not execute commands from .gdbinit files
-q       quiet mode - don't print introduction
-s       specify the filename for symbols
-se      specify the filename for symbols and to execute
-tty     set device for standard input and output
-x       execute gdb commands from the specified files

To use the debugger, the executable must have been compiled or assembled with the
-gstabs option (gcc's -g option also works), which includes the information the
debugger needs to relate instruction codes to locations in the source file. Once
gdb starts, it uses a command line interface to accept debugging commands.

A huge list of debugging commands exists. Some of the more useful ones are

break   set a breakpoint in the source code to stop execution
watch   set a watchpoint to stop execution when a variable's value changes
info    observe system elements such as registers, the stack, and memory
x       examine a memory location
print   display variable values
run     start execution of the program within the debugger
list    list the specified function or lines
step    step to the next source line in the program
cont    continue executing the program from where it stopped
until   run the program until it reaches the specified source code line
        (or greater)

The KDE Debugger


Downloading And Installing kdbg
Using kdbg

The Gnu Objdump Program

Another utility in the binutils package. The objdump program displays not only the
assembly language code but also the raw instruction codes generated.

Using objdump
An objdump Example

ravi@yantra:~/.../code$ gcc -c ctest.c


ravi@yantra:~/.../code$ ls
ctest ctest.c ctest.o ctest.s
ravi@yantra:~/.../code$ objdump -d ctest.o

ctest.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <main>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # f <main+0xf>
   f:   b8 00 00 00 00          mov    $0x0,%eax
  14:   e8 00 00 00 00          callq  19 <main+0x19>
  19:   b8 00 00 00 00          mov    $0x0,%eax
  1e:   5d                      pop    %rbp
  1f:   c3                      retq

Note that the memory addresses referenced in the program are zeroed out. They are
not determined until the linker runs.

The Gnu Profiler Program


Using The Profiler
A Profile Example
A Complete Assembly Development System
The Basics Of Linux
Downloading And Running MEPIS
Your New Development System
