
Computer Architecture and Assembly Language

Why assembly?
- Assembly is widely used in industry:
  - Embedded systems.
  - Real-time systems.
  - Low-level, direct access to hardware.
- Assembly is also widely used outside industry:
  - Cracking software protections: patching, patch-loaders and emulators.
  - Hacking into computer systems: buffer under/overflows, worms and Trojans.

Byte structure:
A byte has 8 bits, numbered from 7 down to 0:

   7    6    5    4    3    2    1    0
  MSB                                LSB

MSB (most significant bit) is bit 7; LSB (least significant bit) is bit 0.

Data storage in memory:
The 80x86 processor stores data using little endian order. Little endian means that the low-order byte of the number is stored in memory at the lowest address, and the high-order byte at the highest address.

Example: You want to store 0x1AB3 (hex number) in memory. This number has two bytes: 1A and B3. It would be stored this way:

  address:    0     1
  content:    B3    1A      (2 bytes of memory)
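The byte order described above can be modelled in C. This is only a sketch: the helper names store16_le and load16_le are made up for illustration, and the bytes are written explicitly so the result does not depend on the machine that runs the C code.

```c
#include <stdint.h>

/* Store a 16-bit value at mem[0..1] in little endian order:
   the low-order byte goes to the lower address. */
void store16_le(uint8_t *mem, uint16_t value)
{
    mem[0] = (uint8_t)(value & 0xFF);  /* low-order byte,  e.g. B3 */
    mem[1] = (uint8_t)(value >> 8);    /* high-order byte, e.g. 1A */
}

/* Read the value back from the same two bytes. */
uint16_t load16_le(const uint8_t *mem)
{
    return (uint16_t)(mem[0] | (mem[1] << 8));
}
```

Calling store16_le(mem, 0x1AB3) leaves B3 at offset 0 and 1A at offset 1, matching the memory picture in the example.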

Registers:
The CPU contains a unit called the "register file". This unit contains registers of the following types:
1. 8-bit general registers: AL, BL, CL, DL, AH, BH, CH, DH
2. 16-bit general registers: AX, BX, CX, DX, SP, BP, SI, DI
3. 32-bit general registers: EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI
   (Accumulator, Base, Counter, Data, Stack pointer, Base pointer, Source index, Destination index)
4. Segment registers: ES, FS, GS, SS, CS, DS
5. Instruction pointer: EIP
Note: the registers above are a partial list. There are more registers.

AX, BX, CX, DX - 16-bit general registers: each contains two 8-bit registers.
  Example: AH, AL (for AX)

    | XH (high byte) | XL (low byte) |

EAX - 32-bit general purpose register: its lower 16 bits are AX.

Segment registers: we use a flat memory model - a 32-bit, 4GB address space, without segments. So for this course you can ignore segment registers.

ESP - stack pointer: contains the address of the top of the stack, i.e. the most recently pushed item. A push first decrements ESP and then stores the value there.

EIP - instruction pointer: contains the offset (address) of the next instruction that is going to be executed. Exists only during run time. The software changes it by performing an unconditional jump, a conditional jump, a procedure call, or a return.
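The way AX, AH and AL overlap EAX can be sketched in C with masks and shifts. The function names here are hypothetical; the sketch only models which bits each register name refers to.

```c
#include <stdint.h>

/* AX is the lower 16 bits of EAX. */
uint16_t ax_of(uint32_t eax) { return (uint16_t)(eax & 0xFFFF); }

/* AL is the low byte of AX, AH is the high byte of AX. */
uint8_t al_of(uint32_t eax) { return (uint8_t)(eax & 0xFF); }
uint8_t ah_of(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFF); }

/* Writing AL changes only the lowest byte; the rest of EAX is kept. */
uint32_t set_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}
```

For EAX = 0x12345678 this gives AX = 0x5678, AH = 0x56 and AL = 0x78.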

Basic assembly instructions:
Each NASM standard source line contains a combination of the 4 fields:

  label:   (pseudo) instruction   operands   ; comment

The label and the comment are optional fields. Operands are either required or forbidden by an instruction.

Notes:
1. A backslash (\) is used as the line continuation character: if a line ends with a backslash, the next line is considered to be a part of the backslash-ended line.
2. There are no restrictions on white space within a line.
3. A colon after a label is optional.

Examples:
1. mov ax, 2          ; moves constant 2 to the register ax
2. buffer: resb 64    ; reserves 64 bytes

Instruction arguments:
A typical instruction has 2 operands. The left operand is the target operand, while the right operand is the source operand.

3 kinds of operands exist:
1. Immediate, i.e. a value
2. Register, such as AX, EBP, DL
3. Memory location; a variable or a pointer.

One should notice that the x86 processor does not allow both operands to be memory locations:

  mov [var1], [var2]    ; not allowed

Move instructions:
MOV - move data
  mov r/m8, reg8      (copies content of 8-bit register (source) to 8-bit register or 8-bit memory unit (destination))
  mov reg32, imm32    (copies content of 32-bit immediate (constant) to 32-bit register)

- In all forms of the MOV instruction, the two operands are the same size.

Examples:
  mov EAX, 0x2334AAFF
  mov word [buffer], ax

Note: NASM doesn't remember the types of variables you declare. It will deliberately remember nothing about the symbol var except where it begins, and so you must explicitly code mov word [var], 2.
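Why the size keyword matters can be seen in a small C model of a store to memory. This is a sketch only: mov_mem_imm is a made-up helper, and size stands for the byte/word/dword keyword (1, 2 or 4 bytes).

```c
#include <stdint.h>
#include <stddef.h>

/* Model of "mov <size> [mem], value": writes the low `size` bytes of
   value to memory in little endian order. size must be 1, 2 or 4. */
void mov_mem_imm(uint8_t *mem, size_t size, uint32_t value)
{
    for (size_t i = 0; i < size; i++)
        mem[i] = (uint8_t)(value >> (8 * i));
}
```

With size 2 (word) only two bytes of the buffer change; with size 4 (dword) four bytes change. The assembler cannot guess which one is meant from the address alone.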

Basic arithmetical instructions:
ADD: add integers
  add r/m16, imm16    (adds its two operands together, and leaves the result in its destination (first) operand)
  Examples: add AX, BX

ADC: add with carry
  adc r/m16, imm8     (adds its two operands together, plus the value of the carry flag, and leaves the result in its destination (first) operand)
  Examples: adc AX, BX    (AX gets a value of AX+BX+CF)
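A typical use of ADC is adding numbers wider than one register: ADD handles the low halves and ADC folds the resulting carry into the high halves. The C sketch below models this. The names alu16, adc16 and add32_via_16 are hypothetical, and the register pairs in the comments are only an illustration.

```c
#include <stdint.h>

typedef struct { uint16_t result; int cf; } alu16;

/* Model of ADC on 16-bit operands: dst + src + carry-in, truncated to
   16 bits. ADD is the same operation with carry_in = 0. */
alu16 adc16(uint16_t dst, uint16_t src, int carry_in)
{
    uint32_t wide = (uint32_t)dst + src + (uint32_t)carry_in;
    alu16 out = { (uint16_t)wide, wide > 0xFFFF };  /* CF = carry out of bit 15 */
    return out;
}

/* 32-bit addition built from two 16-bit steps. */
uint32_t add32_via_16(uint32_t a, uint32_t b)
{
    alu16 lo = adc16((uint16_t)a, (uint16_t)b, 0);                      /* like: add ax, bx */
    alu16 hi = adc16((uint16_t)(a >> 16), (uint16_t)(b >> 16), lo.cf);  /* like: adc dx, cx */
    return ((uint32_t)hi.result << 16) | lo.result;
}
```

For example, 0xFFFF + 1 gives a 16-bit result of 0 with CF = 1, and that carry is what the second step adds to the high halves.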

Basic arithmetical instructions (Cont.):
SUB: subtract integers
  sub reg16, r/m16    (subtracts its second operand from its first, and leaves the result in its destination (first) operand)
  Examples: sub AX, BX

SBB: subtract with borrow
  sbb r/m16, imm8     (subtracts its second operand, plus the value of the carry flag, from its first, and leaves the result in its destination (first) operand)
  Examples: sbb AX, BX    (AX gets a value of AX-BX-CF)
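SBB mirrors ADC: after SUB on the low halves, SBB takes the borrow out of the high halves. A C sketch of this follows; sub16_out, sbb16 and sub32_via_16 are names made up for the illustration.

```c
#include <stdint.h>

typedef struct { uint16_t result; int cf; } sub16_out;

/* Model of SBB on 16-bit operands: dst - (src + borrow-in).
   SUB is the same operation with borrow_in = 0. */
sub16_out sbb16(uint16_t dst, uint16_t src, int borrow_in)
{
    uint32_t subtrahend = (uint32_t)src + (uint32_t)borrow_in;
    sub16_out out = { (uint16_t)((uint32_t)dst - subtrahend),
                      (uint32_t)dst < subtrahend };  /* CF = borrow needed */
    return out;
}

/* 32-bit subtraction built from two 16-bit steps. */
uint32_t sub32_via_16(uint32_t a, uint32_t b)
{
    sub16_out lo = sbb16((uint16_t)a, (uint16_t)b, 0);                      /* SUB on low halves  */
    sub16_out hi = sbb16((uint16_t)(a >> 16), (uint16_t)(b >> 16), lo.cf);  /* SBB on high halves */
    return ((uint32_t)hi.result << 16) | lo.result;
}
```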

Basic arithmetical instructions (Cont.):
INC: increment integer
  inc r/m16    (adds 1 to its operand)
  * does not affect the carry flag; affects all the other flags according to the result
  Examples: inc AX

DEC: decrement integer
  dec reg16    (subtracts 1 from its operand)
  * does not affect the carry flag; affects all the other flags according to the result
  Examples: dec byte [buffer]
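The remark about the carry flag can be modelled in C as follows. This sketch tracks only CF and ZF out of all the flags, and the names flags and inc16 are hypothetical.

```c
#include <stdint.h>

typedef struct { int cf; int zf; } flags;

/* Model of "inc" on a 16-bit operand: ZF follows the result,
   CF is deliberately left as it was before the instruction. */
uint16_t inc16(uint16_t value, flags *f)
{
    uint16_t result = (uint16_t)(value + 1);
    f->zf = (result == 0);
    /* f->cf is not modified, even when the value wraps around */
    return result;
}
```

Incrementing 0xFFFF wraps to 0 and sets ZF, but CF keeps its old value, which is the difference from add ax, 1.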

Basic logical instructions:
NEG, NOT: two's and one's complement
  neg r/m16    (replaces the contents of its operand by the two's complement negation - invert all the bits, and then add one)
  not r/m16    (performs one's complement negation - inverts all the bits)

Examples:
  neg AL    (if AL = (11111110), it becomes (00000010))
  not AL    (if AL = (11111110), it becomes (00000001))
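The two complements can be checked with a short C model on 8-bit values. The function names neg8 and not8 are made up for this sketch.

```c
#include <stdint.h>

/* Model of "not al": one's complement, every bit is inverted. */
uint8_t not8(uint8_t al) { return (uint8_t)~al; }

/* Model of "neg al": two's complement, invert all bits and add one. */
uint8_t neg8(uint8_t al) { return (uint8_t)(~al + 1); }
```

For AL = 11111110 (0xFE), not8 gives 00000001 and neg8 gives 00000010.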

Basic logical instructions (Cont.):
OR: bitwise or
  or r/m32, imm32     (each bit of the result is 1 if and only if at least one of the corresponding bits of the two inputs was 1; stores the result in the destination (first) operand)
  Example: or AL, BL     (if AL = (11111100), BL = (00000010) => AL would be (11111110))

AND: bitwise and
  and r/m32, imm32    (each bit of the result is 1 if and only if the corresponding bits of the two inputs were both 1; stores the result in the destination (first) operand)
  Example: and AL, BL    (if AL = (11111100), BL = (11000010) => AL would be (11000000))
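The two examples above correspond directly to C's | and & operators. A minimal sketch, with hypothetical function names:

```c
#include <stdint.h>

/* Model of "or al, bl": a result bit is 1 if either input bit is 1. */
uint8_t or8(uint8_t al, uint8_t bl) { return (uint8_t)(al | bl); }

/* Model of "and al, bl": a result bit is 1 only if both input bits are 1. */
uint8_t and8(uint8_t al, uint8_t bl) { return (uint8_t)(al & bl); }
```

In practice OR is often used to set chosen bits and AND to keep only chosen bits, with the second operand acting as a mask.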

Compare instruction:
CMP: compare integers
  cmp r/m32, imm8    (performs a 'mental' subtraction of its second operand from its first operand, and affects the flags as if the subtraction had taken place, but does not store the result of the subtraction anywhere)
  Example: cmp AL, BL    (if AL = (11111100), BL = (00000010) => ZF would be 0)
                         (if AL = (11111100), BL = (11111100) => ZF would be 1)
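A C sketch of the 'mental' subtraction: the difference is computed only to derive the flags and is then thrown away. The names cmp_flags and cmp8 are hypothetical. The carry flag is included as an addition to the text above, which only mentions ZF; CMP sets CF when the first operand is smaller as an unsigned number.

```c
#include <stdint.h>

typedef struct { int zf; int cf; } cmp_flags;

/* Model of "cmp al, bl": compute AL - BL only to set the flags. */
cmp_flags cmp8(uint8_t al, uint8_t bl)
{
    uint8_t diff = (uint8_t)(al - bl);  /* discarded once the flags are known */
    cmp_flags f;
    f.zf = (diff == 0);  /* ZF = 1 when the operands are equal */
    f.cf = (al < bl);    /* CF = 1 when an unsigned borrow occurs */
    return f;
}
```

A conditional jump placed after cmp then tests these flags, for example jnz jumps when ZF is 0.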

Labels definition (basic):
Each instruction of the code has its offset (address from the beginning of the address space). If we want to refer to a specific instruction in the code, we should mark it with a label:

  my_loop1: add ax, ax
  …

- a label can be written with or without a colon
- the instruction that follows it can be on the same line or on the next line
- the code can't contain two different non-local (as above) labels with the same name

Loop definition:
LOOP, LOOPE, LOOPZ, LOOPNE, LOOPNZ: loop with counter
* for all the possible variants of operands look at the NASM manual, B.4.142

Example:
  mov ax, 1
  mov cx, 3
  my_loop:
    add ax, ax
    loop my_loop, cx

The loop instruction:
1. decrements its counter register (in this case it is the CX register)
2. if the counter does not become zero as a result of this operation, it jumps to the given label

Note: the counter register can be either CX or ECX - if one is not specified explicitly, the BITS setting dictates which is used.

LOOPE (or its synonym LOOPZ) adds the additional condition that it only jumps if the counter is nonzero and the zero flag is set. Similarly, LOOPNE (and LOOPNZ) jumps only if the counter is nonzero and the zero flag is clear.
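The loop example behaves like a C do-while: the body always runs once before the counter is decremented and tested. The sketch below traces it with ordinary variables standing in for AX and CX; the function name loop_doubling is made up.

```c
#include <stdint.h>

/* Model of:   my_loop: add ax, ax
               loop my_loop, cx           */
uint16_t loop_doubling(uint16_t ax, uint16_t cx)
{
    do {
        ax = (uint16_t)(ax + ax);  /* add ax, ax */
    } while (--cx != 0);           /* loop: decrement cx, jump if not zero */
    return ax;
}
```

Starting from ax = 1 and cx = 3, the body runs three times (ax = 2, 4, 8), so AX ends up holding 8. Note that the model, like the instruction, decrements before testing, so starting with cx = 0 does not skip the loop.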

DB, DW, DD: declaring initialized data
DB, DW, DD, DQ (DT, DDQ, and DO) are used to declare initialized data in the output file. They can be invoked in a wide range of ways:

  db 0x55                 ; just the byte 0x55
  db 0x55,0x56,0x57       ; three bytes in succession
  db 'a',0x55             ; character constants are OK
  db 'hello',13,10,'$'    ; so are string constants
  dw 0x1234               ; 0x34 0x12
  dw 'a'                  ; 0x61 0x00 (it's just a number)
  dw 'ab'                 ; 0x61 0x62 (character constant)
  dw 'abc'                ; 0x61 0x62 0x63 0x00 (string)
  dd 0x12345678           ; 0x78 0x56 0x34 0x12 (dword)

Assignment 0
You get a simple program that receives a string from the user. Then, it calls a function (that you'll implement in assembly) that receives one string as an argument and should do the following:
1. Convert lower case to upper case.
2. Convert upper case to lower case.
3. Convert '*' into '#'.
4. Convert '#' into '*'.
5. Calculate the length of the string.
(all other characters will remain as they are)

e.g. "1: heL*Lo WorLd!" => "1: HEl#lO wORlD!"

The character conversion should be done in-place. The function shall return the length of the string.
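Before writing the assembly, it can help to pin down the expected behaviour in C. The following is only a reference sketch of the five steps above, not the required solution: do_str_model is a hypothetical name, and the assignment itself asks for the function to be written in assembly.

```c
/* C reference model of the required behaviour: swaps letter case,
   swaps '*' and '#', works in-place and returns the string length. */
int do_str_model(char *s)
{
    int length = 0;
    for (; s[length] != '\0'; length++) {
        char c = s[length];
        if (c >= 'a' && c <= 'z')
            c = (char)(c - 'a' + 'A');   /* lower case -> upper case */
        else if (c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');   /* upper case -> lower case */
        else if (c == '*')
            c = '#';
        else if (c == '#')
            c = '*';
        s[length] = c;                   /* all other characters unchanged */
    }
    return length;
}
```

Running the model on a few inputs gives expected outputs to compare the assembly version against.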

section .data                       ; data section, read-write
        an:     DD 0                ; this is a temporary var

section .text                       ; our code is always in the .text section
        global do_str               ; makes the function appear in global scope
        extern printf               ; tell linker that printf is defined elsewhere
                                    ; (not used in the program)

do_str:                             ; functions are defined as labels
        push    ebp                 ; save Base Pointer (bp) original value
        mov     ebp, esp            ; use base pointer to access stack contents
        pushad                      ; push all variables onto stack
        mov     ecx, dword [ebp+8]  ; get function argument

;;;;;;;;;;;;;;;; FUNCTION EFFECTIVE CODE STARTS HERE ;;;;;;;;;;;;;;;;

        mov     dword [an], 0       ; initialize answer
label_here:

        ; Your code goes somewhere around here...

        inc     ecx                 ; increment pointer
        cmp     byte [ecx], 0       ; check if byte pointed to is zero
        jnz     label_here          ; keep looping until it is null terminated

;;;;;;;;;;;;;;;; FUNCTION EFFECTIVE CODE ENDS HERE ;;;;;;;;;;;;;;;;

        popad                       ; restore all previously used registers
        mov     eax, [an]           ; return an (returned values are in eax)
        mov     esp, ebp
        pop     dword ebp
        ret

Running NASM
To assemble a file, you issue a command of the form

  > nasm -f <format> <filename> [-o <output>] [-l listing]

Example:

  > nasm -f elf mytry.s -o myelf.o

It would create the myelf.o file that has elf format (executable and linkable format).

We use a main.c file (that is written in C language) to start our program, and sometimes also for input / output from a user. So to compile main.c with our assembly file we should execute the following command:

  > gcc main.c myelf.o -o myexe.out

It would create the executable file myexe.out. In order to run it you should write its name on the command line:

  > myexe.out