
Assembly Language for Intel-Based Computers

Module 7
Procedures
Macros

Slides prepared by Kip Irvine


Modified and supplemented by Clare Nguyen
(c) Pearson Education, 2010. All rights reserved.

Overview: Procedures
(Read chapter 5: 5.4 - 5.6, chapter 8: 8.1, 8.2)

Procedures in assembly are the same concept as functions in C/C++
or methods in Java
Code in a procedure runs as a result of a procedure call
Input and output data of a procedure are passed through the stack or
through register values
All procedure calls and returns are coordinated through the runtime
stack
Concepts covered in this section:
Stack operations
Defining and using procedures

Irvine, Kip R. Assembly Language for Intel-Based Computers, 2010.

What's Next

Stack Operations
Defining and Using Procedures

Runtime Stack

At run time, a stack can be allocated in memory in the stack segment,
separate from the code segment and the data segment
The stack is memory space typically used for storing temporary data and for
parameter passing from one procedure to another
A stack is still memory, so a register is still faster to access than the stack
A stack is a LIFO (Last In, First Out) data structure:
  data are only added to the top
  data are only removed from the top
  so the last item that goes into the stack is the first item that is
  removed from the stack
As data is inserted, the stack grows up from the bottom. When data is
removed, the stack shrinks back toward the bottom.
[Figure: a stack holding items 1 through 10, with item 1 at the bottom and item 10 at the top]

Runtime Stack
The stack is managed by the CPU, using special registers
SS (stack segment): pointer to the stack segment. This address is set
by the OS when memory is allocated for the program
ESP (extended stack pointer): pointer to the top of the stack, which is
the address of the last data value pushed on the stack. Every time data
is inserted, the address decrements by 4 (4 bytes per data value in
protected mode)
EBP (extended base pointer): used in procedure calls to maintain local
data for a procedure. EBP points to the base of the stack frame during
a procedure call.

    Offset      Contents
    00001000    00000006    <- ESP
    00000FFC
    00000FF8
    00000FF4
    00000FF0

Stack Segment

A program is given a data segment and a code segment by the OS when
the executable is loaded into memory, but the stack segment is not
given automatically; it must be requested by the program
With MASM the directive to request a stack segment is .stack
The .stack directive accepts an optional stack size, given in bytes
    .stack 2048     ; 2KB stack size
    .stack 100h     ; 256 bytes stack size
    .stack          ; default stack size of 1KB (1024)

In 32-bit mode, it is most efficient to request a stack size that is a multiple
of 4 bytes (32 bits), so that 32-bit values fit evenly into the stack with no
leftover memory wasted
When a stack segment is requested, the OS sets the SS register to hold
the address of the stack, and sets ESP to point to the top of the stack
For programs written in this class, it is possible to leave out the .stack
directive because the Irvine32 include file already defines the stack for us:
.STACK 4096. However, you can use .stack to choose a smaller or larger
stack size.

PUSH Operation

Inserting data into a stack is called a push
A 32-bit PUSH operation:
  decrements the stack pointer by 4
  copies a value into the location pointed to by the stack pointer

    BEFORE                              AFTER
    Offset      Contents                Offset      Contents
    00001000    00000006   <- ESP       00001000    00000006
    00000FFC                            00000FFC    000000A5   <- ESP
    00000FF8                            00000FF8
    00000FF4                            00000FF4
    00000FF0                            00000FF0

Same stack after pushing two more integers:

    Offset      Contents
    00001000    00000006
    00000FFC    000000A5
    00000FF8    00000001
    00000FF4    00000002   <- ESP
    00000FF0

In memory, the stack grows downward (toward lower memory addresses)
The area below ESP is available for new data (until the stack overflows)
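
The effect of PUSH on ESP can be checked with a short program. This is a minimal sketch, assuming the Irvine32 library used in this class (WriteDec, Crlf and exit come from it). The pushed values are the ones from the diagrams, and the actual ESP addresses will differ from run to run.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  eax,esp        ; EAX = ESP before the pushes
    push 0A5h           ; ESP decreases by 4, then 000000A5h is stored at [ESP]
    push 1              ; ESP decreases by 4 again
    push 2              ; ESP decreases by 4 again
    mov  ebx,esp        ; EBX = ESP after three pushes
    sub  eax,ebx        ; EAX = 12: the stack grew downward by 3 * 4 bytes
    call WriteDec       ; displays 12
    call Crlf
    add  esp,12         ; discard the three values so the stack is balanced
    exit
main ENDP
END main
```

The subtraction is old ESP minus new ESP, so a positive result confirms that the stack pointer moved toward lower addresses.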

POP Operation

Removing data from a stack is called a pop
A 32-bit POP operation:
  copies the value at ESP into a register or variable
  adds 4 to ESP
The stack shrinks upward (toward higher memory addresses)

    BEFORE                              AFTER
    Offset      Contents                Offset      Contents
    00001000    00000006                00001000    00000006
    00000FFC    000000A5                00000FFC    000000A5
    00000FF8    00000001                00000FF8    00000001   <- ESP
    00000FF4    00000002   <- ESP       00000FF4
    00000FF0                            00000FF0

PUSH and POP Instructions

PUSH syntax
Data from the operand will be copied onto the top of the stack.
There are 3 variations
    PUSH r/m16      ; 16-bit register or memory
    PUSH r/m32      ; 32-bit register or memory
    PUSH imm32      ; 32-bit constant

POP syntax
Data will be removed from the top of the stack and copied
into the operand.
There are 2 variations
    POP r/m16       ; 16-bit register or memory
    POP r/m32       ; 32-bit register or memory

In this class we use protected mode, so we always push and pop
32-bit data. An immediate value defaults to 32 bits.
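
The 32-bit forms listed above can be tried in one short program. This is a sketch only, assuming the Irvine32 library; the variable names val1 and val2 are made up for the illustration.

```asm
INCLUDE Irvine32.inc

.data
val1 DWORD 12345678h    ; hypothetical source value
val2 DWORD ?            ; hypothetical destination

.code
main PROC
    push val1           ; PUSH r/m32 with a memory operand
    pop  val2           ; POP r/m32 into memory: val2 now equals val1
    mov  eax,val2
    call WriteHex       ; displays 12345678
    call Crlf

    push 100            ; PUSH imm32
    pop  eax            ; POP r/m32 into a register: EAX = 100
    call WriteDec       ; displays 100
    call Crlf
    exit
main ENDP
END main
```

Every PUSH here is matched by a POP, so ESP has its original value again when the program exits.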

Using the Stack (1 of 3)


Since there are a limited number of registers, one common use of the stack
is to:
  Push on the stack the data that is in a register (save register data)
  Use this same register for a different calculation
  When done with the calculation, pop the saved value from the stack
  back to the register (restore original register data)
When saving and restoring multiple registers, the PUSH and POP
instructions occur in the opposite order, due to the LIFO order of the stack
    push esi                    ; save current data of
    push ecx                    ; registers esi, ecx, ebx
    push ebx

    mov  esi,OFFSET dwordVal    ; store new data in the same
    mov  ecx,LENGTHOF dwordVal  ; registers, in this case
    mov  ebx,TYPE dwordVal      ; to display to screen
    call DumpRegs

    pop  ebx                    ; restore original data
    pop  ecx                    ; in the registers
    pop  esi                    ; Note the opposite order
                                ; from the push instructions

Using the Stack (2 of 3)


When creating a nested loop with the LOOP instruction, push the outer
loop counter ECX before entering the inner loop:
        mov  ecx,100        ; set outer loop count
    L1:                     ; begin the outer loop
        push ecx            ; save outer loop count
        mov  ecx,20         ; set inner loop count
    L2:                     ; begin the inner loop
        ;
        ;
        loop L2             ; repeat the inner loop
        pop  ecx            ; restore outer loop count
        loop L1             ; repeat the outer loop

Using the Stack (3 of 3)


When reversing a string:
  Use a loop to copy each character into EAX, then push EAX on the stack
  Pop the stack, which returns the characters in reverse order, and
  insert each character back into the string

Q: Why must each character be put in EAX before it is pushed? Why
can't we push each character (using its memory location) directly onto
the stack?
A: Because only word (16-bit) or doubleword (32-bit) values can be
pushed on the stack, and a character is a single byte.
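
The two steps above can be sketched as a complete program. This is an illustration under assumptions: it relies on the Irvine32 library for output, and the string aName and the constant nameSize are hypothetical names chosen for the example.

```asm
INCLUDE Irvine32.inc

.data
aName BYTE "stack",0            ; hypothetical string to reverse
nameSize = ($ - aName) - 1      ; length without the null terminator

.code
main PROC
    ; Step 1: push each character, widened to 32 bits in EAX
    mov   ecx,nameSize
    mov   esi,0
L1: movzx eax,aName[esi]        ; copy one byte into EAX, zero-extended
    push  eax                   ; push the doubleword
    inc   esi
    loop  L1

    ; Step 2: pop the characters, which come back in reverse order
    mov   ecx,nameSize
    mov   esi,0
L2: pop   eax
    mov   aName[esi],al         ; store only the low byte back in the string
    inc   esi
    loop  L2

    mov   edx,OFFSET aName
    call  WriteString           ; displays kcats
    call  Crlf
    exit
main ENDP
END main
```

Each character occupies a full doubleword on the stack, which is the cost of the restriction described in the answer above.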

Related Instructions

PUSHFD and POPFD
  Push and pop the EFLAGS register
  Syntax: PUSHFD
          POPFD

PUSHAD pushes all the 32-bit general-purpose registers on the stack
  In the order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
  Useful when you need to save and restore many of these registers.
  It is faster than using many PUSH instructions, one for each
  register
POPAD pops the same registers off the stack in reverse order
PUSHA and POPA do the same as PUSHAD and POPAD, but for the 16-bit registers
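
A short sketch of saving and restoring both the registers and the flags, assuming the Irvine32 library (DumpRegs and exit). The register values are arbitrary, chosen only so that the restore is visible in the output.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  eax,1
    mov  ebx,2
    pushad                  ; save all eight general-purpose registers
    pushfd                  ; save EFLAGS

    mov  eax,0FFFFFFFFh     ; overwrite a register...
    add  eax,1              ; ...and the flags (sets Carry and Zero)
    mov  ebx,eax

    popfd                   ; restore EFLAGS first (reverse order of the pushes)
    popad                   ; restore the registers: EAX = 1, EBX = 2
    call DumpRegs           ; display the restored register values
    exit
main ENDP
END main
```

POPFD comes before POPAD because the flags were pushed last, so they sit on top of the saved registers.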

What's Next

Stack Operations
Defining and Using Procedures

Creating Procedures

Large problems can be divided into smaller tasks that are more
manageable and reusable
The following is the template of a procedure:

    sample PROC     ; start with:
                    ; procedure_name and the directive PROC
        .           ; body of procedure: multiple instructions
        .
        .
        ret         ; ret (return) instruction so execution will
                    ; go back to the calling procedure
                    ; ret is required at the end of each procedure,
                    ; otherwise execution will continue to step to
                    ; the next instruction that happens to be after
                    ; this procedure
    sample ENDP     ; end with:
                    ; procedure_name and the directive ENDP

Documenting Procedures
For this class the required documentation for each procedure is:
  A one-line description of what the procedure does
Optional documentation can include 3 more lines:
  Receives: A list of input parameters; state their usage and requirements
  Returns: A description of values returned by the procedure
  Requires: Optional list of precondition requirements that must be
  satisfied before the procedure is called
If a procedure is called without its preconditions satisfied, it will
probably not produce the expected output.

Example: SumOf Procedure


    ;---------------------------------------------------------
    SumOf PROC
    ; Calculates and returns the sum of three 32-bit integers.
    ; Receives: EAX, EBX, ECX (the three integers)
    ;           May be signed or unsigned.
    ; Returns: EAX = sum, and the status flags
    ; Requires: nothing
    ;---------------------------------------------------------
        add eax,ebx
        add eax,ecx
        ret
    SumOf ENDP

In this procedure, the input data and the return value are all
passed through registers
Before calling SumOf, the caller has to make sure that the 3 integers
are in EAX, EBX, ECX
After the call, the caller can expect EAX to contain the sum
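
A caller for SumOf might look like the following sketch. It assumes the Irvine32 library for output; the three input values are arbitrary.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  eax,10         ; load the three integers into the
    mov  ebx,20         ; registers that SumOf requires
    mov  ecx,30
    call SumOf          ; EAX = 10 + 20 + 30
    call WriteDec       ; displays 60
    call Crlf
    exit
main ENDP

;---------------------------------------------------------
SumOf PROC
; Calculates and returns the sum of three 32-bit integers.
; Receives: EAX, EBX, ECX (the three integers)
; Returns: EAX = sum
;---------------------------------------------------------
    add  eax,ebx
    add  eax,ecx
    ret
SumOf ENDP
END main
```

The original value in EAX is overwritten by the sum, so a caller that still needs it must save it before the call.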

CALL Instruction

Recall that EIP holds the address of the next instruction to be executed
By storing the correct address in EIP, we can cause execution to jump to
a called procedure
Then later, by storing a different address in EIP, we can cause execution
to return (jump back) to the calling procedure
The CALL instruction is used to call a procedure
Syntax:
    CALL procedure_name
A CALL instruction:
1. Pushes the current address in EIP on the stack, to save it for
   the return. The current address is the address of the instruction
   immediately following the CALL instruction; this is where
   execution should return to after running the called procedure
2. Copies the address of the called procedure into EIP. The
   procedure name is the address of the first instruction of the
   called procedure. Therefore, the next instruction to be executed
   will be the first instruction of the called procedure

RET Instruction

The RET instruction causes execution to return from a procedure
The RET instruction needs to be at the end of every procedure so that
execution can return to the calling procedure. If the RET instruction is
missing, execution will continue to step to the next instruction that
happens to appear after the procedure. (This behavior is similar to how
a C/C++ switch statement will continue to the next case if we forget the
break statement)
Syntax: RET number
        where number is optional
When RET runs, the value at the top of the stack is popped off the stack
into EIP. This value should be the return address: the address of the
instruction in the calling procedure that execution returns to.
If there is an optional number after the RET instruction, the number
dictates how many bytes will be popped off the stack after the return
address is popped and stored in EIP
The number is typically given if a procedure passes input arguments
on the stack, and these values need to be popped (cleared) from the
stack when returning to the calling procedure
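
The effect of the optional number can be observed by comparing ESP before and after a call. This is a minimal sketch, assuming the Irvine32 library; ShowSum is a hypothetical procedure that reads its two arguments relative to ESP.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  ebx,esp        ; remember ESP before the call sequence
    push 6              ; two 4-byte arguments = 8 bytes
    push 5
    call ShowSum        ; displays 11
    sub  ebx,esp        ; EBX = 0 if RET 8 removed both arguments
    mov  eax,ebx
    call WriteDec       ; displays 0
    call Crlf
    exit
main ENDP

ShowSum PROC            ; hypothetical example procedure
    mov  eax,[esp+8]    ; first value pushed (6)
    add  eax,[esp+4]    ; second value pushed (5); [esp] is the return address
    call WriteDec
    call Crlf
    ret  8              ; pop the return address, then add 8 to ESP
ShowSum ENDP
END main
```

With a plain RET in place of RET 8, the second number displayed would be 8, because the two arguments would still be on the stack.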

CALL-RET Example
At the start of the CALL instruction,
EIP = 0000 0025 (the address of the next
instruction)
When CALL runs, 0000 0025 is pushed on
the stack, then the address of MySub is
copied into EIP. EIP = 0000 0040.
This causes execution to jump to the
procedure MySub and start executing the
first instruction in MySub.
Subsequent instructions in MySub are
executed; EIP changes with every
instruction.
At the RET instruction, the value at the top
of the stack, 0000 0025, is popped off and
restored in EIP. EIP = 0000 0025.
This causes execution to jump back to the
correct next instruction in the calling
procedure

    main PROC
    00000020    call MySub
    00000025    mov eax,ebx
                . . .
    main ENDP

    MySub PROC
    00000040    mov eax,edx
                .
                .
    00000180    ret
    MySub ENDP

Nested Procedure Calls


main PROC
.
.
call Sub1
exit
main ENDP
Sub1 PROC
.
.
call Sub2
ret
Sub1 ENDP
Sub2 PROC
.
.
call Sub3
ret
Sub2 ENDP
Sub3 PROC
.
.
ret
Sub3 ENDP

Every time CALL runs, the return address is pushed on the
stack
In the example above, after Sub3 is called, the stack
contains all three return addresses:
    (ret to main)
    (ret to Sub1)
    (ret to Sub2)   <- ESP

When RET is encountered in Sub3, the address at the top
of the stack is popped into EIP, so execution returns to Sub2
When RET is encountered in Sub2, the address at the top
of the stack is popped into EIP again, so execution returns
to Sub1
When RET is encountered in Sub1, the last address at the
top of the stack is popped into EIP again, so execution
returns to main
When execution returns to main, the stack should be empty

Input Parameters

One of the main reasons for writing procedures is that they are independent
blocks of code that can be re-used. For example, the procedure WriteString
in the book's library is re-used by all of us in our programs.

Since a useful procedure may be used in many different programs, a
procedure cannot refer to specific variable names (which are specific memory
addresses) that are defined in a data segment

Input data to a procedure can only come from one of two sources:
  registers, if input arguments are passed through registers
  the stack, if input arguments are passed through the stack

Here's an example of why you cannot refer to specific variable names when
writing a procedure:
  You write a procedure that uses the specific variable names input1 and
  input2 in the code. This means input1 and input2 have to be defined in
  the data segment of your program.
  If other people use your procedure in their programs, input1 and input2
  (which are addresses in your data segment) will not be available in their
  programs, and your procedure will no longer be able to run successfully.
  This is where parameter passing comes in. When other people call your
  procedure, they pass input data (such as their own values for input1 and
  input2) to your procedure. This means your procedure must work with
  input parameters, rather than with pre-determined variable names.

Parameter Passing (1 of 2)

Input parameters make procedures flexible and therefore re-usable
because parameter values can change at runtime

There are 2 ways to pass input data to a procedure and receive the return
value from the procedure: through registers and through the stack

Using registers to pass input parameters
  Registers are always available for any instruction that is being
  executed
  Taking advantage of this fact, the calling procedure (or caller) stores
  input data in certain registers
  When the called procedure (or callee) runs, it uses the data that are in
  these specific registers to do its work, and it can store any output result
  in certain registers
  When execution returns to the caller, the caller can fetch the results in
  these registers
  Passing parameters through registers produces straightforward and
  shorter code because accessing registers is simple and fast
  Example: An add procedure requires that the 2 addends be stored in
  EAX and EBX
  When the add procedure runs, it uses the data in EAX and EBX to do
  the addition, and it stores the sum back in EAX
  When execution returns to the caller, the caller fetches the sum that is
  stored in EAX

Parameter Passing (2 of 2)

Using the stack to pass input parameters
  Using registers to pass input data has a drawback because there are
  only a small number of general purpose registers
  If several registers are already being used to store input values, then
  there are very few registers left for the called procedure to use to do its
  work. For example, if input data is passed to a procedure by storing it in
  the EAX register, then the procedure cannot use EAX for a division
  operation until it has saved that input somewhere else.
  Another disadvantage of using registers for parameter passing appears
  when the assembly code needs to interface with HLL code. Under the
  common 32-bit calling conventions, HLL code passes input arguments
  through the stack rather than through registers.
  Using the stack for input parameters solves both disadvantages: the
  registers are available for use, and interfacing with HLL code is
  seamless
  However, using the stack means a few more instructions for the called
  procedure to retrieve data from the stack and to return data back
  through the stack
  Example: An add procedure requires that both addends are pushed
  onto the stack
  When the procedure runs, it copies the 2 input values from the stack to do
  the addition, and it copies the sum back to the stack
  When execution returns, the caller pops the sum from the stack to
  retrieve it
Using Registers for Parameters


Example: ArraySum adds all elements in an array and returns the sum
The caller and ArraySum pass data through registers ESI, ECX, EAX
This is a simple way for one assembly procedure to pass data to another
The downside to using registers to pass data is that ESI, ECX, EAX are no
longer available for ArraySum to use to store its temporary data
This way of passing parameters is used for short, simple procedures
    ;-----------------------------------------------------
    ArraySum PROC
    ; Receives: ESI points to an array of doublewords,
    ;           ECX = number of array elements.
    ; Returns: EAX = sum
    ;-----------------------------------------------------
        mov eax,0           ; set the sum to zero
    L1: add eax,[esi]       ; add each integer to sum
        add esi,4           ; point to next integer in array
        loop L1             ; repeat for array size
        ret
    ArraySum ENDP

Calling a Procedure with Register Parameters

When calling a procedure with register parameters, the caller must:
  Load the input parameters into the registers required by the
  called procedure
  Call the procedure
  Fetch the result (if any) from registers
    mov  esi,OFFSET array       ; load input parameters
    mov  ecx,LENGTHOF array     ;   into required registers
    call ArraySum               ; call the procedure
    mov  sum,eax                ; access result in register
Using the Stack for Parameters (1 of 2)

Using the stack for parameter passing is more flexible than using
registers
  Since the registers are not used to store input parameters, they are
  available for use by the called procedure (the callee). This is
  important when the callee is complex and needs to use many
  registers to do its work.
  There is no need to coordinate between the caller and callee
  regarding which input parameter is in which register. All that needs
  to be communicated is the order in which parameters are passed.
  This is how HLL functions pass data to each other
To pass parameters through the stack:
  The caller pushes all input parameters onto the stack before the call
  The callee accesses these data values in the stack when needed
  The callee puts the return value (if any) on the stack
  The caller pops the return value (if it exists) from the stack

Using the Stack for Procedure Parameters (2 of 2)

One additional consideration when using the stack for parameter passing:
the stack must be completely cleared of all data by the time execution
finishes running the procedure call
If the stack is not cleared after every procedure call, and some data are left
behind, this causes a memory leak in the stack segment. If the program
runs for a long time or has a deep recursive call, it will lead to stack
overflow.
All data belonging to the procedure call (input parameters and possible
return value) must be popped off the stack at the end of a procedure call
  The return value is popped off the stack by the caller to retrieve it, and
  this clears it from the stack
  The input parameters are cleared off the stack in one of two ways:
    By the called procedure: this is the STDCALL convention or the
    Pascal convention. It is used by the Windows API.
    By the caller: this is the C convention or the cdecl (C declaration)
    convention. It is used by the C language. Since the caller clears the
    stack, this convention allows for variable length argument lists, such
    as C's printf, because the caller knows how many arguments are
    passed in the stack at the time of the call
For this class we use the STDCALL convention, and the C convention is
used in Advanced Assembly when we interface with C code
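
The two clean-up styles can be compared side by side. This is a sketch under assumptions: it uses the Irvine32 library for output, AddStd and AddC are hypothetical procedures, and for brevity both return the sum in EAX rather than on the stack.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    ; STDCALL style: the callee removes its arguments with RET 8
    push 6
    push 5
    call AddStd         ; EAX = 11, arguments already removed
    call WriteDec       ; displays 11
    call Crlf

    ; C (cdecl) style: the caller removes the arguments after the call
    push 6
    push 5
    call AddC           ; EAX = 11, arguments still on the stack
    add  esp,8          ; caller clears the two 4-byte arguments
    call WriteDec       ; displays 11
    call Crlf
    exit
main ENDP

AddStd PROC             ; hypothetical, callee clears the stack
    mov  eax,[esp+8]
    add  eax,[esp+4]
    ret  8
AddStd ENDP

AddC PROC               ; hypothetical, caller clears the stack
    mov  eax,[esp+8]
    add  eax,[esp+4]
    ret
AddC ENDP
END main
```

Only the caller knows how many arguments it pushed in a given call, which is why the C style suits procedures with a variable number of arguments.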

Examples of Parameter Passing


Two ways of calling printArray and passing 3 input arguments:

    ; using registers
    mov  esi,OFFSET array
    mov  ecx,LENGTHOF array
    mov  ebx,TYPE array
    call printArray

Advantage:
  printArray can easily access the
  input parameters in esi, ecx, ebx
Disadvantage:
  esi, ecx, ebx are used for input
  data. printArray cannot use them
  to do its work.

    ; using the stack
    push OFFSET array
    push LENGTHOF array
    push TYPE array
    call printArray

Advantage:
  - All general purpose registers are
    available for printArray to use
  - A HLL function call produces the
    same code, so this version of
    printArray can work with HLL code
Disadvantage:
  - It's more work for printArray to
    access the input parameters
  - The parameters must be cleared
    from the stack at the end of the call

The Stack Frame

The return address used by CALL and RET and the parameters passed through
the stack all go into a stack frame on the stack
A stack frame is also known as an activation record
It is a part of the stack that is set aside for each called procedure
For a 3 level nested procedure call (A calls B, B calls C, C calls D), the
stack will have 3 stack frames, one stack frame for each of the procedures
B, C, and D
The stack frame can contain up to 5 logical parts, in the order that data are
pushed into the stack:
1. Input parameters: pushed on the stack first by the caller
2. The caller's return address: pushed on the stack by the CALL
   instruction
3. EBP value: pushed on the stack by the callee in order to set the base
   pointer for its stack frame
4. Local variables: pushed on the stack by the callee when it needs
   temporary data storage
5. Saved register data: pushed on the stack by the callee, before it uses
   any register
The caller's return address is always part of the stack frame, even if
registers are used for parameter passing. The other parts (input
parameters, EBP, saved register data, and local variables) are optional

Stack Frame: Setting Up


With every procedure call, the stack frame for the
called procedure is set up by the following steps, in
order:
1. The caller pushes all input parameters on the
   stack
2. The caller runs the CALL instruction, causing
   the return address to be pushed on the stack
3. The callee pushes the EBP value on the stack,
   and sets EBP to the base of its stack frame
4. If local variables are needed, the callee
   makes room for them on the stack
5. The callee saves data of all registers that it
   will use by pushing them on the stack

After all 5 steps are done, the callee is ready to
begin doing work
The callee will access all input data in the stack frame
to do its work

Stack frame with all 5 parts:

    Input parameters        EBP + n
    Return address
    EBP (saved value)       <- EBP
    Local variables         EBP - n
    Saved registers         <- ESP

n is the offset from EBP, used to access data in the
stack frame

Accessing Data in the Stack Frame (1 of 4)

By the time the callee is ready to do its work, its
stack frame is already set up as shown in the
stack frame diagram on the previous slide
The stack pointer (ESP) is pointing to the last
saved register value, which is the top of the
stack
Given the specific order of the 5 parts of the
stack frame, how do we access the input
parameters to do work with them?
One way to access data on the stack is by using
the pop instruction
However, given the LIFO nature of the stack,
we can only pop data at the top of the stack
This means that we would need to pop all data off
the stack frame in order to get to the input
parameters
But this causes a big problem: if we pop all data
off the stack to get to the input parameters,
where would we save them, especially if there
are more data than there are general purpose
registers?

Accessing Data in the Stack Frame (2 of 4)

The solution is to take advantage of 2 facts that are
consistent with the stack:
1. Each data value on the stack is 4 bytes away
   from the next data value on the stack
2. ESP is always pointing to the top of the stack,
   which means ESP has the address of the last
   data value that has been pushed on the stack
From the address of the stack top (which is stored in
ESP), we can add multiples of 4 to get to any
data value on the stack
For example, using the stack frame diagram:
  Starting from the stack top (where ESP is) and
  going toward the bottom of the stack, let's say
  there are: 2 saved register values, 1 local variable,
  1 EBP value, and 1 return address before we get
  to the location of the input parameters
  This means: 4 * (2 + 1 + 1 + 1) = 20 bytes from
  ESP is where the input parameters are
  To access the last input parameter that was
  pushed on the stack, we use: [esp + 20]
However, there are 2 problems with using ESP to
access data on the stack

Accessing Data in the Stack Frame (3 of 4)


The 2 problems with using ESP to access data on
the stack:
1. As shown in the stack frame diagram, ESP is the top of the
   stack and is far away from the input parameters.
   Using ESP to calculate the location of the input
   parameters means we typically need to add
   many multiples of 4, not just 20 as in the example on
   the last slide. This makes the code difficult to write
   and read. For example, ESP + 52 is not a convenient
   value to calculate, leading to more error-prone code.
2. When the callee does its work, it will occasionally
   push data on the stack for temporary storage (for
   example, storing the loop count in a nested loop).
   But every time a push or pop operation occurs, ESP
   changes value. This means ESP can be a moving
   target, and therefore ESP + 52 is not always
   going to be at the same location on the stack.
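
The moving-target problem can be seen in a few lines. This is a minimal sketch, assuming the Irvine32 library; ShowParam is a hypothetical procedure that reads its single parameter relative to ESP.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    push 25             ; one input parameter
    call ShowParam
    exit
main ENDP

ShowParam PROC          ; hypothetical example procedure
    mov  eax,[esp+4]    ; parameter is 4 bytes above ESP (past the return address)
    call WriteDec       ; displays 25
    call Crlf

    push ecx            ; a temporary push moves ESP down by 4
    mov  eax,[esp+8]    ; the same parameter is now at ESP + 8
    call WriteDec       ; displays 25
    call Crlf
    pop  ecx            ; ESP is back where it was

    ret  4              ; return and remove the parameter
ShowParam ENDP
END main
```

The offset for the same parameter changed from 4 to 8 after one push, so every ESP-relative offset has to be recomputed by hand whenever the code between two accesses changes.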

Accessing Data in the Stack Frame (4 of 4)

To get around the problems of using ESP to
access data on the stack, we use the register
EBP (Extended Base Pointer). EBP has its name
because it is designed to be used as the pointer to the
base of the stack frame. From this base, we can
easily access input parameters as well as local
variables
At the beginning of the called procedure, we save
the current ESP value in EBP. EBP becomes a
stable, non-moving pointer to one location in the
stack frame, and this location is called the base
of the stack frame.
Note that in the stack frame diagram, EBP is in between the
input parameter section and the local variable
section, making it convenient to access both
sections of the stack frame
We can use the address in EBP to easily get to
any input parameter or local variable in the stack
frame:
  Input parameters are at EBP + n addresses
  Local variables are at EBP - n addresses

Stack Frame: Accessing Parameters

Since the callee sets EBP to point to its stack frame as the first step in the
procedure, EBP points to the location immediately after the
Return Address that is on the stack.

The code to set EBP is:
    push ebp            ; save current ebp value
    mov  ebp,esp        ; set ebp to the base of the stack frame

The callee can access data in the stack frame by using constant offsets
from EBP, for example: [ebp + 8] or [ebp - 4]
EBP - n are addresses of local variables
EBP + n are addresses of input parameters and / or return values
Each data value on the stack is 4 bytes, so the value of n is a multiple of 4
EBP does not change value during the execution of the called procedure,
making it a stable base pointer to rely on
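
A short sketch of EBP-relative access, assuming the Irvine32 library. Diff is a hypothetical procedure, and for brevity it displays its result instead of returning it.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    push 30             ; first input parameter
    push 12             ; second input parameter
    call Diff           ; displays 18
    exit
main ENDP

Diff PROC               ; hypothetical example procedure
    push ebp            ; save the caller's EBP
    mov  ebp,esp        ; EBP = base of this stack frame
    mov  eax,[ebp+12]   ; first value pushed (30)
    push ecx            ; a temporary push moves ESP but not EBP
    sub  eax,[ebp+8]    ; second value pushed (12), still at the same offset
    pop  ecx
    call WriteDec       ; displays 30 - 12 = 18
    call Crlf
    pop  ebp            ; restore the caller's EBP
    ret  8              ; return and remove both parameters
Diff ENDP
END main
```

The offsets start at 8 because [ebp] holds the saved EBP value and [ebp+4] holds the return address.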

Stack Frame: Clearing Out (1 of 2)

Following the stdcall convention, the callee is responsible for clearing out the
stack frame at the end of the procedure call
At the end of a procedure, the following steps must take place,
in the following order:
1. Register values are restored by popping saved data off the stack
   and into the registers, in the opposite order that they were saved,
   due to the LIFO nature of the stack
2. Local variables are popped off the stack, if there are any. This can
   be done quickly by adding to the ESP value the total number of
   bytes that have been reserved for local variables
3. The saved EBP value is popped back into EBP, which restores the
   caller's base pointer
4. RET runs, which pops the Return Address into EIP, causing
   execution to jump back to the calling procedure
Syntax for RET:
    RET num
The optional operand num causes num bytes to be added to ESP after the
Return Address is popped into EIP. This effectively pops off the input
parameters that are in the stack.
When execution returns to the caller, the stack frame for the call
should be completely cleared out except for the return value, if there is
one
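
The clearing-out order can be sketched with a procedure that uses every part of the stack frame. This is an illustration under assumptions: it relies on the Irvine32 library, SumSquares is a hypothetical procedure, and it returns its result in EAX so that nothing is left on the stack afterward.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  esi,esp            ; remember ESP before the call sequence
    push 3                  ; first input parameter
    push 4                  ; second input parameter
    call SumSquares         ; EAX = 3*3 + 4*4
    call WriteDec           ; displays 25
    call Crlf
    mov  eax,esi
    sub  eax,esp            ; 0 when the stack frame is completely cleared
    call WriteDec           ; displays 0
    call Crlf
    exit
main ENDP

SumSquares PROC             ; hypothetical example procedure
    push ebp                ; set up the stack frame
    mov  ebp,esp
    sub  esp,4              ; reserve one local variable at [ebp-4]
    push ebx                ; save the register this procedure uses

    mov  eax,[ebp+12]
    imul eax,eax
    mov  [ebp-4],eax        ; local = first parameter squared
    mov  ebx,[ebp+8]
    imul ebx,ebx            ; EBX = second parameter squared
    mov  eax,[ebp-4]
    add  eax,ebx            ; EAX = sum of the squares

    pop  ebx                ; restore the saved register
    add  esp,4              ; remove the local variable
    pop  ebp                ; restore the caller's EBP
    ret  8                  ; pop the return address and the 2 parameters
SumSquares ENDP
END main
```

Swapping any two of the four clean-up instructions leaves ESP pointing at the wrong item when RET runs, so execution would jump to a bad address.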

Stack Frame Example

(during procedure call)

    .data
    sum DWORD ?
    .code
        sub  esp,4          ; save room for return value
        push 6              ; first argument
        push 5              ; second argument
        call AddTwo
        pop  sum            ; save to sum

    AddTwo PROC
        push ebp            ; save current ebp
        mov  ebp,esp        ; set ebp for this proc
        push eax            ; save register
        mov  eax,[ebp+12]
        add  eax,[ebp+8]
        mov  [ebp+16],eax   ; return value on stack
        pop  eax            ; restore register
        pop  ebp
        ret  8              ; pop input params
    AddTwo ENDP

Stack frame:

    ---             ebp+16   (room for return value)
    6               ebp+12
    5               ebp+8
    Return addr     ebp+4
    Old ebp         ebp
    Old eax         ebp-4

Stack Frame Example

(during procedure return)

    .data
    sum DWORD ?
    .code
        sub  esp,4          ; save room for return value
        push 6              ; first argument
        push 5              ; second argument
        call AddTwo
        pop  sum            ; step 5) save the sum in memory

    AddTwo PROC
        push ebp            ; save current ebp
        mov  ebp,esp        ; set ebp for this proc
        push eax            ; save register
        mov  eax,[ebp+12]   ; step 1)
        add  eax,[ebp+8]
        mov  [ebp+16],eax   ; return value on stack
        pop  eax            ; step 2) restore registers
        pop  ebp            ; step 3)
        ret  8              ; step 4) pop return addr
                            ; and input params
    AddTwo ENDP

Stack frame:

    11              ebp+16   (return value)
    6               ebp+12
    5               ebp+8
    Return addr     ebp+4
    Old ebp         ebp
    Old eax         ebp-4

Pass by Value and Return Value

In the example in the previous slide, main passes 2 input arguments by
value, and expects a return value
To pass by value, the caller pushes the values of the input arguments on
the stack, one at a time, in the order that the callee expects
When the caller expects a return value, it must make room on the stack for
this return value, so that the callee has a place to store the return value
Typically, saving room for the return value occurs before pushing the input
arguments on the stack
To make room for the return value, there are 2 common options:

Option 1:
    sub  esp,4          ; save room
    push 6              ; 1st argument
    push 5              ; 2nd argument
    call AddTwo
sub esp,4 means ESP points to the next data slot in the stack.
This leaves an unused slot in the stack for the return value.

Option 2:
    push eax            ; save room
    push 6              ; 1st argument
    push 5              ; 2nd argument
    call AddTwo
push eax in this case is not for saving the eax value, but for moving
ESP to the next slot. When the callee puts the return value in this
slot, it overwrites the eax value in the slot.

Pass by Address / Reference

When there is a large number of data values, such as with an array, it is
not practical to pass all elements of the array by value, since it would be
very inefficient to copy all data values of an array to the stack.
Instead the address of the array (1 data value) is passed on the stack,
and then the called procedure dereferences this address to get the data
Similarly, when a called procedure needs to modify data in multiple
memory locations, it is more efficient to pass the addresses of the locations
to the procedure. This is more efficient than copying multiple original
values onto the stack, and at the end of the procedure, copying multiple
modified values back from the stack
When an address of a memory location is passed to the called
procedure, this is pass by address or pass by reference
Pass by address takes more work than pass by value and should only be
used when the called procedure needs to access or modify multiple data
locations

Pass by Address Example (1 of 2)

The ArrayFill procedure needs to fill an array with data
The caller passes the address of the array and a count of the number of
array elements:
.data
array WORD 100 DUP(?)
count DWORD LENGTHOF array ; number of array elements (100)
.code
push OFFSET array          ; pass array by address
push count                 ; pass count by value
call ArrayFill

ArrayFill can access an array through the address:
ArrayFill PROC
push ebp
mov ebp,esp                ; set ebp
pushad                     ; save all registers
mov esi,[ebp+12]           ; esi = addr of array
mov ecx,[ebp+8]            ; ecx = count

Stack frame:
offset(array)       [EBP + 12]
count               [EBP + 8]
return address      [EBP + 4]
saved EBP           <- EBP

Since ESI points to the beginning of the array, we can use a loop and indirect
addressing to access each array element.
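
One way the loop could look is sketched below, assuming the Irvine32 library. The slide does not say what data goes into the array, so storing 0, 1, 2, ... is an assumption made only for illustration, and the label names fillLoop and finished are hypothetical.

```asm
INCLUDE Irvine32.inc

.data
array WORD 100 DUP(?)

.code
main PROC
    push OFFSET array       ; pass array by address
    push LENGTHOF array     ; pass count (100) by value
    call ArrayFill
    exit
main ENDP

ArrayFill PROC
    push ebp
    mov  ebp,esp            ; set ebp
    pushad                  ; save all registers
    mov  esi,[ebp+12]       ; esi = address of array
    mov  ecx,[ebp+8]        ; ecx = count
    cmp  ecx,0
    je   finished           ; nothing to do for an empty array
    mov  ax,0               ; first value to store (assumed fill pattern)
fillLoop:
    mov  [esi],ax           ; indirect addressing: store into current element
    add  esi,TYPE WORD      ; advance to the next WORD element (2 bytes)
    inc  ax                 ; next value
    loop fillLoop           ; repeat count times
finished:
    popad                   ; restore all registers
    pop  ebp
    ret  8                  ; return and remove the 2 input arguments
ArrayFill ENDP
END main
```

Only one address travels on the stack no matter how large the array is; the procedure reaches every element by advancing ESI.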

Pass by Address Example (2 of 2)


The Swap procedure exchanges the values of two 32-bit integers
Since Swap needs to modify 2 data values, we use pass by address
; Calling Swap:
push OFFSET firstInt       ; pass by address 1st integer
push OFFSET secondInt      ; pass by address 2nd integer
call Swap

Swap PROC
push ebp
mov ebp,esp                ; set EBP
pushad                     ; save all registers before using them
mov esi,[ebp+8]            ; esi = address of 2nd integer
mov edi,[ebp+12]           ; edi = address of 1st integer
mov eax,[esi]              ; eax = dereferenced esi = 2nd integer
xchg eax,[edi]             ; exchange eax (2nd integer) with
                           ; dereferenced edi (1st integer)
mov [esi],eax              ; dereferenced esi (2nd int) = eax (1st int)
popad                      ; restore registers
pop ebp
ret 8                      ; return and clean up the 2 input args
Swap ENDP

Local Variables

When the registers don't give us enough temporary data storage for a
procedure, we can use the stack as extra storage space
Temporary data that are stored on the stack, not in the data segment, are
local variables
The simplest way to create local variables is just to push and pop the
temporary data onto the stack. But this only works if we access the saved
data in LIFO order. For example, if we push data1 and then data2 on the
stack, then using pop, we will get to data2 first, then data1
If we need to get to our local variables in more random order, then we need
to explicitly create space on the stack for local variables. This is done by
subtracting their total size from ESP.
Example: if we need 3 local variables, then their total size is 4 * 3 = 12 bytes.
We use the instruction: sub esp, 12 to make room for 12 bytes on the
stack
This step is done immediately after setting EBP because we will use an
offset from EBP to access the local variables and we don't want the offset to
be a large number (ebp - 56 is not as easy to work with as ebp - 4)

Local Variable Example


The following example creates and initializes two 32-bit local variables
(we'll call them locA and locB):
MySub PROC
push ebp
mov ebp,esp                      ; set ebp
sub esp,8                        ; save room for 2 variables
pushad                           ; save all registers
mov DWORD PTR [ebp-4],123456h    ; locA = 123456h
mov DWORD PTR [ebp-8],0          ; locB = 0
; DWORD PTR is needed because neither operand tells the assembler the size

; code for the rest of the procedure to do work
. . .
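
The slide stops before the end of the procedure. A minimal sketch of a complete version, including the cleanup that mirrors the setup steps, might look like this. It assumes the Irvine32 library, and the two instructions in the middle that use the local variables are hypothetical filler.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    call MySub
    exit
main ENDP

MySub PROC
    push ebp
    mov  ebp,esp                      ; set ebp
    sub  esp,8                        ; save room for 2 variables
    pushad                            ; save all registers
    mov  DWORD PTR [ebp-4],123456h    ; locA = 123456h
    mov  DWORD PTR [ebp-8],0          ; locB = 0

    mov  eax,[ebp-4]                  ; read locA (any order, not LIFO)
    add  [ebp-8],eax                  ; locB = locB + locA

    popad                             ; restore all registers
    mov  esp,ebp                      ; release the local variables
    pop  ebp                          ; restore the caller's ebp
    ret
MySub ENDP
END main
```

The teardown runs in the reverse order of the setup: registers are restored first, then the local variable space is released, then EBP is restored.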

LEA Instruction

The LEA (load effective address) instruction returns the offset (or address)
of operands, just like the OFFSET operator.
However, OFFSET is evaluated by the assembler, so it can only return constant
offsets, such as offsets of memory variables that are in the data segment
LEA computes the address at run time, so it is used to get the offset of a
stack parameter or local variable
MyProc PROC
push ebp
mov ebp,esp                      ; set ebp
sub esp,8                        ; save room for 2 local vars
pushad                           ; save all registers
mov DWORD PTR [ebp-4],123456h    ; locA = 123456h
mov DWORD PTR [ebp-8],0          ; locB = 0
mov edi,OFFSET [ebp-4]           ; invalid operand
mov esi,OFFSET [ebp-8]           ; invalid operand
lea edi,[ebp-4]                  ; ok
lea esi,[ebp-8]                  ; ok
; edi and esi now have addresses of locA and locB,
; in case you need to call the next procedure
; and pass locA and locB by address
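
As a sketch of that last step, the program below passes two local variables by address to a Swap procedure like the one from the pass by address example. It assumes the Irvine32 library; the choice to load each address into eax before pushing it is just one possible way to do it.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    call MyProc
    exit
main ENDP

MyProc PROC
    push ebp
    mov  ebp,esp                      ; set ebp
    sub  esp,8                        ; save room for locA and locB
    pushad                            ; save all registers
    mov  DWORD PTR [ebp-4],123456h    ; locA = 123456h
    mov  DWORD PTR [ebp-8],0          ; locB = 0
    lea  eax,[ebp-4]                  ; eax = run-time address of locA
    push eax                          ; pass locA by address
    lea  eax,[ebp-8]                  ; eax = run-time address of locB
    push eax                          ; pass locB by address
    call Swap                         ; afterwards locA = 0, locB = 123456h
    popad                             ; restore all registers
    mov  esp,ebp                      ; release the local variables
    pop  ebp
    ret
MyProc ENDP

Swap PROC
    push ebp
    mov  ebp,esp
    pushad
    mov  esi,[ebp+8]                  ; address pushed last (locB)
    mov  edi,[ebp+12]                 ; address pushed first (locA)
    mov  eax,[esi]
    xchg eax,[edi]
    mov  [esi],eax
    popad
    pop  ebp
    ret  8                            ; remove the 2 address arguments
Swap ENDP
END main
```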

Saving Register Values (1 of 2)

The registers are available for any procedure to use, and every procedure
uses the same registers because many instructions require the use of at
least one register
In a procedure call, it is important that the callee saves the data that are in
each register that it will use, before using the register
Take the example of procedure A calling procedure B:
At the start, instructions in A run and use all 4 general purpose
registers to keep A's temporary data
When A calls B, execution jumps to procedure B, but all 4 general
purpose registers still hold A's data
When B runs, it will also need to use the general purpose registers
If B does not push the values of the registers on the stack, thus saving
them, then B's instructions will overwrite A's data that are in the
registers
It is also important that at the end of the callee code, the saved register
data are restored back to each register, before the RET instruction

Saving Register Values (2 of 2)

Saving register values should be done before the called procedure uses any
of the registers
For EBP, this occurs as the first step in the called procedure because
EBP needs to be set to the stack frame base immediately after entering
the procedure
For other registers, this step occurs after saving room for local variables,
and before the called procedure starts its actual task
If the callee only needs to use 2-3 registers, it is faster to push each register
value on the stack
If the callee will use most or all of the registers, it is faster to use the
PUSHAD instruction
When restoring saved register values, remember that the pop instructions
should be in reverse order of the push instructions
If PUSHAD is used, then use POPAD to restore register values
Saving register values is always an important step when working with procedures. It is
especially important in a nested procedure call, where there are multiple stack frames.
Since each stack frame is accessed by EBP, if EBP isn't saved and restored properly
from procedure to procedure, we would lose our place in the stack very quickly and not
be able to get to the return address when RET runs. No programmer wants execution to
return to an unknown address and start executing whatever happens to be in memory
at that unknown address
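
A minimal sketch of a callee that saves only the two registers it uses, and restores them in reverse order, is shown below. It assumes the Irvine32 library; the procedure SumToN, the variable total, and the marker values 1111h and 2222h are hypothetical names and data chosen for illustration.

```asm
INCLUDE Irvine32.inc

.data
total DWORD ?               ; hypothetical variable for the result

.code
main PROC
    mov  eax,1111h          ; values the caller expects to keep
    mov  ecx,2222h
    sub  esp,4              ; save room for the return value
    push 10                 ; input argument n
    call SumToN
    pop  total              ; total = 1 + 2 + ... + 10 = 55
    ; eax and ecx still hold 1111h and 2222h here
    exit
main ENDP

SumToN PROC
    push ebp                ; EBP is saved first
    mov  ebp,esp            ; set ebp
    push eax                ; save only the registers this procedure uses
    push ecx
    mov  eax,0              ; running sum
    mov  ecx,[ebp+8]        ; ecx = n
    cmp  ecx,0
    je   saveResult         ; LOOP must not start with ecx = 0
addNext:
    add  eax,ecx            ; sum = sum + ecx
    loop addNext            ; ecx = ecx - 1, repeat until ecx = 0
saveResult:
    mov  [ebp+12],eax       ; put the return value in the reserved slot
    pop  ecx                ; restore in reverse order of the pushes
    pop  eax
    pop  ebp
    ret  4                  ; return and remove the 1 input argument
SumToN ENDP
END main
```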

Your turn . . .

Create a procedure named Difference that subtracts the first argument
from the second one. The following is the procedure call:
sub esp, 4                 ; make room for return value
push 14                    ; first argument
push 30                    ; second argument
call Difference
pop result                 ; save difference into result

Difference PROC
push ebp                   ; save ebp value
mov ebp,esp                ; set ebp
push eax                   ; save eax value
mov eax,[ebp + 8]          ; eax = 2nd argument
sub eax,[ebp + 12]         ; eax = 2nd - 1st argument
mov [ebp+16],eax           ; put return value on stack
pop eax                    ; restore eax
pop ebp                    ; restore ebp
ret 8                      ; return and clear input param
Difference ENDP

MASM Directives and Operator for Procedures

There are several MASM directives and operators that are introduced
in section 8.2 of the book. These make it easier for assembly
programmers to maintain the stack frame
These include the ENTER and LEAVE instructions (set up and tear down the
stack frame) and the LOCAL directive (reserves space on the stack for local
variables)
The operator is: USES (to save and restore registers)
For this class, we will not use these MASM directives and operator.
One goal of the class is to understand how the stack frame works to
support procedure calls. We get a much better understanding of the
stack frame if we code the assembly instructions, rather than
depending on the assembler to generate the equivalent code. In the
advanced assembly class, after everyone has a clear idea of how to
maintain the stack frame, then the MASM directives can be used.
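
For comparison only, here is a sketch of how the MySub local variable example might look if USES and LOCAL were allowed. It assumes the Irvine32 library, and the two instructions that read the variables are hypothetical filler. The comments describe the code that the assembler is expected to generate on our behalf.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    call MySub
    exit
main ENDP

MySub PROC USES eax ebx             ; assembler adds push eax / push ebx
    LOCAL locA:DWORD, locB:DWORD    ; assembler sets ebp and reserves 8 bytes
    mov  locA,123456h               ; locA is assembled as [ebp-4]
    mov  locB,0                     ; locB is assembled as [ebp-8]
    mov  eax,locA
    mov  ebx,locB
    ret                             ; assembler adds the pops and LEAVE before ret
MySub ENDP
END main
```

The listing file shows the generated prologue and epilogue, which is a useful way to check a hand-written stack frame against the assembler's version.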


Summary of Key Concepts

The run time stack is used to support procedure calls, and is defined with
the directive .stack
The stack works in LIFO order and has several operations: push (add
data), pop (remove data), pushfd and popfd (save and restore the flags
register), pushad and popad (save and restore all general purpose registers)
Procedures are defined using the PROC and ENDP directives
Procedures run when the instruction CALL is used, and return to the
calling procedure with the instruction RET. CALL pushes the return
address on the stack, and RET pops the return address so execution can
return to the saved memory location.
Procedures only work with the data that are passed in through registers or
through the run time stack, and they never use a specific variable name
(or specific data address)
Each procedure call has its own stack frame, which is set up with: input
parameters (by the caller), return address (by the caller), EBP (by the
callee), local variables (by the callee), and saved registers (by the callee)
The callee always saves all current register values before using the
registers, in order not to clobber data that belong to the caller
Upon a return, the STDCALL convention dictates that the callee will clean
up input parameters from the stack

Overview: Macros
(Text book chapter 10: 10.2)

Macros are short blocks of code that are defined by the
programmer, and are substituted in when invoked in the code
Concepts covered:
Difference between macros and procedures
Defining macros
Passing parameters to macros


What's Next
Difference between macros and procedures
Defining macros
Passing parameters to macros


Difference Between Macros and Procedures

A macro is a named block of assembly language statements
Once defined, it can be invoked (called) one or more times
Unlike a procedure, each macro call does not result in execution
jumping to another part of code. Instead each macro call is a code
substitution.
During the assembler's preprocessing step, each macro call is removed
and substituted with a copy of the macro code block
The expanded code is passed to the assembly step, where it is
checked for correctness
A program with many macro calls will be expanded into a much larger
program

What's Next

Difference between macros and procedures
Defining macros
Passing parameters to macros


Defining Macros

A macro must be defined before it can be used

Macro definitions appear at the beginning of the program

The directives for the beginning and end of the macro are: MACRO and
ENDM

Input parameters are optional

Each parameter follows the rules for identifiers. It is a string that is
assigned a value when the macro is invoked.

Syntax:
macroname MACRO [parameter-1, parameter-2,...]
statement-list
ENDM

Macro Example
The mNewLine macro is defined and then invoked
mNewLine MACRO             ; define the macro
call Crlf
ENDM

.data
.code
mNewLine                   ; invoke the macro

The assembler will substitute "mNewLine" with "call Crlf"
Note that there is no CALL instruction to invoke a macro
If a macro uses a register, we need to push the register value onto
the stack to save it before using the register, and at the end of the
macro, we need to restore the register value

What's Next

Difference between macros and procedures
Defining macros
Passing parameters to macros


mPutchar Macro
Writes a single character to standard output

Definition:
mPutchar MACRO char
push eax
mov al,char
call WriteChar
pop eax
ENDM

Invocation:
.code
mPutchar 'A'

Expansion (as viewed in the listing file):
1 push eax
1 mov al,'A'
1 call WriteChar
1 pop eax

Note that eax is saved and then restored

Invoking Macros

When we invoke a macro, each argument we pass matches a declared
parameter
Each parameter is replaced by its corresponding argument when the macro is
expanded
When a macro expands, it generates assembly language source code
Arguments are treated as simple text by the preprocessor

[Diagram: a macro invocation statement passes an argument, which replaces a
parameter declared inside the macro; the macro generates text that consists
of assembly code]

mWriteStr Macro
Provides a convenient way to display a string by passing the string
name as an argument

mWriteStr MACRO buffer
push edx
mov edx,OFFSET buffer
call WriteString
pop edx
ENDM

.data
str1 BYTE "Welcome!",0

.code
mWriteStr str1

The expanded code shows how the str1 argument replaced the parameter
named buffer:
1 push edx
1 mov edx,OFFSET str1
1 call WriteString
1 pop edx

Invalid Argument

If an argument is invalid, the error is caught when the expanded code is
assembled
Example:
.code
mPutchar 1234h
1 push eax
1 mov al,1234h             ; error
1 call WriteChar
1 pop eax

If an argument is missing, the error is also caught when the expanded
code is assembled
Example:
.code
mPutchar
1 push eax
1 mov al,                  ; error
1 call WriteChar
1 pop eax

mReadStr Macro
The mReadStr macro provides a convenient wrapper around ReadString
procedure calls.
mReadStr MACRO varName
push ecx
push edx
mov edx,OFFSET varName
mov ecx,(SIZEOF varName) - 1
call ReadString
pop edx
pop ecx
ENDM
.data
firstName BYTE 30 DUP(?)

.code
mReadStr firstName

Labels in Macros
Recall that invoking a macro means the macro code is copied into the
source file. Each time the macro is invoked another copy of the macro is
copied into the source file.
If a macro has a code label and the macro is invoked multiple times, the
label will be duplicated multiple times when the macro code is copied in. This
results in an error because each code label needs to be unique.
To solve this problem, use the LOCAL directive to declare the label before
using it. This causes the assembler to generate a unique name for the label
each time it copies the macro code into the source file
printTen MACRO var
LOCAL loopTop
push ecx
push eax
mov eax, var
mov ecx, 10
loopTop: call WriteDec
loop loopTop
pop eax                    ; restore in reverse order of the pushes
pop ecx
ENDM
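
A small sketch of a macro with a LOCAL label that is invoked twice in the same program is shown below. It assumes the Irvine32 library; the macro name mPrintTimes and its parameters are hypothetical. MASM typically replaces each LOCAL label with a generated name of the form ??nnnn, but the exact names that appear in the listing file depend on how many such labels were generated earlier.

```asm
INCLUDE Irvine32.inc

mPrintTimes MACRO value, count  ; hypothetical macro: print value count times
    LOCAL again                 ; request a unique label for each expansion
    push eax
    push ecx
    mov  eax,value
    mov  ecx,count
again:
    call WriteDec               ; print the unsigned integer in eax
    call Crlf
    loop again
    pop  ecx                    ; restore in reverse order of the pushes
    pop  eax
ENDM

.code
main PROC
    mPrintTimes 7, 2            ; label expands to a generated name such as ??0000
    mPrintTimes 9, 3            ; a different generated name such as ??0001
    exit
main ENDP
END main
```

Without the LOCAL line, the second invocation would define the label again a second time and the assembler would report a duplicate label.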

Summary of Key Concepts

Macros are small blocks of code that are substituted in when invoked
Macros are defined using the MACRO and ENDM directives
Macros can accept arguments, and arguments are considered text
strings by the assembler
Macros are useful when a few lines of code appear many times in a
program. Instead of repeating the same few lines of code
throughout the program, the programmer defines a macro once and
invokes it many times
