CHP 3 Tools
In (some) detail
1. assembler
2. linker
3. debugger
additionally
4. a compiler for a high level language
5. an object code disassembler
6. a profiling tool for optimization
The Assembler
A tool that converts assembly language code to instruction code for the processor
(i.e., instruction code that the processor can run).
(From chapter 1) There are three components to an assembly language source code
program:
- opcode mnemonics
- data section
- directives
Each assembler uses a different format for each of these components, so
programming with one assembler can be completely different from programming with
another.
(KEY) The biggest difference between assemblers is the directives. Opcode
mnemonics are closely related to processor instruction codes, but directives are
unique to each assembler. Directives instruct the assembler how to construct the
instruction code program: they define program sections, and in some assemblers
they also provide higher-level constructs such as if-then statements and while
loops.
Some assemblers come with built in editors, others are simple command line
programs.
This book uses the GNU assembler, gas. The command-line invocation is 'as', *NOT* 'gas'.
The Linker
Most high-level language toolchains, such as those for C and C++, compile and link
in a single step and don't separate these out. (Note: it pays to learn linking as a
separate step.)
The process of linking resolves all defined functions and memory address labels
declared in the assembly language program. To do this, any C functions like printf
that the assembly language program uses need to be included in the object code, or
a reference must be made to an external library. For this to work automatically, the
linker needs to know where the common object code libraries are located on the
computer, or the locations must be specified manually with the compiler
command-line parameters.
Most assemblers do not automatically link the object code to produce executable
code. There is a second step that links the object code with other libraries and
produces an executable that can be run on the host operating system. This is the
job of the linker.
When the linker is invoked manually, the developer must know which libraries are
required to completely resolve any functions used. The linker must be told where
to find function libraries and which object code to link together.
The Debugger
Like assemblers, debuggers are specific to the operating system and hardware. The
debugger must understand
1. the instruction code set of the hardware platform
2. the registers and memory handling methods of the operating system.
The Compiler
Most professional developers do most of their development in high-level
languages, e.g. C and C++, and then optimize parts in assembly. To do this, the
compiler for the HLL needs to produce instruction codes for the processor to
execute. Most compilers go through an intermediate step. Instead of directly
converting HLL source code to processor instruction code, HLL compilers produce
assembly code. Then the assembler converts the assembly code into processor
instruction code. We can stop the process in the middle, examine the generated
assembly code, modify it, and link the modified code.
The Profiler
A profiler determines which functions consume which share of the resources, for
example processor time and memory. Once you narrow down which functions consume
the most resources, they can be optimized.
e.g., to assemble test.s into the object file test.o:
as -o test.o test.s
(KEY) gas uses AT&T syntax, while Intel chose a different syntax. Most documentation
for Intel assembly language programming uses Intel syntax.
ld -o mytest mytest.o
creates the executable file mytest from the object file mytest.o. The executable
file is created with appropriate permissions so that it can be run from the command
line.
ravi@yantra:~/.../code$ ./ctest
Hello, world
gcc -S ctest.c
generates
.file "ctest.c"
.text
.section .rodata
.LC0:
.string "Hello, world \n "
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
movl $0, %eax
call printf@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
.section .note.GNU-stack,"",@progbits
.section .note.gnu.property,"a"
.align 8
.long 1f - 0f
.long 4f - 1f
.long 5
0:
.string "GNU"
1:
.align 8
.long 0xc0000002
.long 3f - 2f
2:
.long 0x3
3:
.align 8
4:
To use the debugger, the executable must have been compiled or assembled with the
-gstabs option, which includes the necessary information in the executable file for
the debugger to know which part of the source file each instruction code relates
to. Once gdb starts, it uses a command-line interface to accept debugging commands.
A huge list of debugging commands exists. Some of the more useful ones are
objdump is another utility in the binutils package. It displays not only the
assembly code but also the raw instruction codes that were generated.
Using objdump
An objdump Example
0000000000000000 <main>:
0: f3 0f 1e fa endbr64
4: 55 push %rbp
5: 48 89 e5 mov %rsp,%rbp
8: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # f <main+0xf>
f: b8 00 00 00 00 mov $0x0,%eax
14: e8 00 00 00 00 callq 19 <main+0x19>
19: b8 00 00 00 00 mov $0x0,%eax
1e: 5d pop %rbp
1f: c3 retq
Note that the memory addresses referenced in the program are zeroed out.
They are not determined until the linker runs.