By Parth Pandya
INDEX:
1. Introduction
2. Stack and heap in XC16 and its associated registers
3. Stack usage of individual functions
4. Stack usage of interrupt routines
5. Stack usage of entire call tree
6. Using PC-Lint for stack analysis
7. Stack usage of entire code - Run-time analysis
8. Stack usage of individual functions - Run-time analysis
9. EDS memory and stack
10. Stack overflow and underflow
11. Best practices to optimize stack usage
1 Introduction
This document describes various methods for stack usage analysis on 16-bit PIC MCUs and dsPIC DSCs.
An example project accompanies this application note to demonstrate stack usage analysis for a user
application built with the Microchip MPLAB® XC16 compiler.
The device chosen for this note is the PIC24FJ256GA110. However, the same note also applies to
other 16-bit devices.
The example code uses MPLAB X IDE and the MPLAB® XC16 compiler as the development ecosystem.
2 Stack and heap in XC16 and its associated registers
Once memory has been allocated for all the global variables and the heap, the largest contiguous
block of remaining memory is allocated to the stack. The stack grows from lower addresses to higher addresses.
Users can specify a minimum stack size for their application using the ‘--stack’ linker option (--stack=size).
This is the minimum stack that must be available to the application. If there isn’t enough space for the
minimum stack, the linker generates an error.
The default minimum stack size is 16 bytes.
For example:
“--stack=0x100” tells the linker that a minimum stack size of 256 bytes must be available for
the application.
In cases where it is necessary for the programmer to specify the location and size of the stack explicitly, the
stack may be defined in assembly language, using the stack attribute:
When the stack is allocated in this way, the usable stack space will be slightly less than 0x100 bytes, since a
portion of the user-defined section will be reserved for the stack guard band.
Many embedded applications do not use a heap at all. In such cases, all the available Data memory is
allocated to the stack. If the application mandates use of the heap, then the heap size should be specified
explicitly, using the ‘--heap’ option (--heap=size).
For example:
“--heap=0x200” tells the linker that 512 bytes of memory should be allocated to the heap.
The heap size is always specified by the programmer. In contrast, the linker sets the stack size to a
maximum value, utilizing all remaining data memory.
As an alternative to automatic heap allocation, the heap may be allocated directly with a user-defined
section in assembly source code. For example:
[Figure: Data memory map. SFR space starts at 0000h. Data RAM starts at 0800h and holds static data (.bss, .data), followed by heap space (if allocated with --heap) and then stack space. WREG15 (SP) points into the stack, which by default receives the remaining data memory and grows towards higher addresses.]
The MPLAB XC16 compiler may use the Software stack model to store the following:
• Automatic variables
• Arguments passed to functions
• Processor status in interrupt functions (Status and the Interrupt Priority level registers)
• Function return addresses
• Intermediate results
• Saving registers across function calls (context saving across function calls)
The following registers are associated with the stack:
• WREG15 (W15) – This is the Stack Pointer (SP) register. It points to the top of the stack, which is
defined as the first unused location of the stack.
• WREG14 (W14) – This is the Frame Pointer (FP) register. It points to the current function’s frame.
Each function, if required, creates a new frame at the top of the stack from which the automatic and
temporary variables are allocated. The compiler option ‘-fomit-frame-pointer’ can be used to restrict
the use of the Frame Pointer.
• SPLIM – This is the Stack Pointer Limit register. By default, the run-time start-up code initializes
SPLIM with the last address in Data memory. It sets the upper stack boundary. The hardware
compares SPLIM with WREG15 (the Stack Pointer) and triggers a stack overflow exception if
WREG15 is greater than SPLIM.
3 Stack usage of individual functions
Stack usage of individual functions can be calculated by inspecting the disassembled code generated by the
compiler.
Points to consider before calculating the stack usage of an application by inspecting the disassembled code:
• The LNK instruction stores the previous frame pointer on the stack and allocates stack space for the function
• Executing a CALL or RCALL instruction pushes the function return address onto the stack
• The PUSH instruction increments the Stack Pointer
• The POP instruction decrements the Stack Pointer
• The compiler can also use the MOV instruction to push data onto the stack, using the WREG15
register
The stack space requirement of a function can be estimated by reading the function in the ‘.c’ source file.
The function’s auto variables and parameters are stored on the stack, so examining the function body
gives an estimate of the stack space that the function requires.
For example:
uint8_t function1 ( uint8_t param_1, uint8_t param_2 )
{
    uint8_t arr1[10] = { 'f', 'u', 'n', 'c', 't', 'i', 'o', 'n', '1', '\0' };
    uint8_t arr2[15] = {0};
    uint8_t cnt = 0;
To use this method, the user needs to know the size of each type used in the function. This method
provides only an estimate of the stack space requirement and is less accurate than the other methods of
stack usage calculation. It does not consider the stack used for saving the frame pointer, for function
return addresses, or for temporary variables stored on the stack.
Note: Objects smaller than 16 bits are always promoted to 16 bits when pushed. A char pushed onto the
stack is therefore pushed as a full word, so the stack pointer is never misaligned.
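The source-level estimate described above can be sketched in C as follows. This is only an illustration: the helper names are hypothetical, and it assumes that the two byte-sized parameters are copied into the function's frame and that the frame size is rounded up to a whole 16-bit word. It ignores the saved frame pointer, the return address and compiler temporaries, as the text notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a byte count up to the next even value (one 16-bit word = 2 bytes). */
size_t round_up_to_word(size_t bytes)
{
    return (bytes + 1u) & ~(size_t)1u;
}

/*
 * Source-level estimate for function1 (hypothetical helper):
 * auto variables plus the two byte-sized parameters.
 * Assumption: the parameters are copied into the frame.
 */
size_t estimate_function1_frame(void)
{
    size_t bytes = 0;
    bytes += sizeof(uint8_t[10]);   /* arr1 */
    bytes += sizeof(uint8_t[15]);   /* arr2 */
    bytes += sizeof(uint8_t);       /* cnt */
    bytes += 2 * sizeof(uint8_t);   /* param_1 and param_2 */
    return round_up_to_word(bytes);
}
```

For the function shown above, the sum is 10 + 15 + 1 + 2 = 28 bytes, which is already a whole number of words.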
Open the application project in MPLAB X IDE and follow these steps to calculate the stack usage of a
function, by inspecting the disassembled code:
int main(void)
{
    uint32_t result;
    result = function1( 0x11, 0x22 );
}
The calling function ‘main’ passes the parameters ‘param_1’ and ‘param_2’ in working registers
WREG0 and WREG1.
The instruction “rcall _function1” pushes the return address (4 bytes) onto the stack. This is the address
of the instruction immediately after ‘rcall’, in this case “mov.b w0,[w14]”.
mov.b #34,w1 ; Calling function moves parameters into WREG0 and WREG1
mov.b #17,w0
rcall _function1
mov.b w0,[w14]
.LCFI2:
mov.b w0,[w14+26] ; called function moves WREG0 and WREG1 onto the stack
mov.b w1,[w14+27]
These moves do not change the stack usage: when a frame pointer is in use, the space they write to
has already been counted with the 'lnk' instruction. Such moves do not always target the stack, and
they never adjust the stack pointer the way a push would.
8. Disassembled code for the function ‘function1’:
_function1:
.LFB1:
.loc 1 99 0
.set ___PA___,1
lnk #28 ; LNK stores the previous frame pointer and increments SP by the specified size
The disassembled code has a ‘lnk #28’ instruction. The ‘lnk’ instruction links the frame pointer, i.e. it
stores the previous frame pointer (2 bytes) on the stack and allocates stack space for the auto variables,
indicating that the function needs a 28-byte frame.
This analysis is elaborated further in the section “Stack usage of individual functions - Run-time
analysis”.
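Putting the pieces from the disassembly together, the stack consumed by one call can be approximated as the return address pushed by ‘rcall’ (4 bytes), plus the previous frame pointer stored by ‘lnk’ (2 bytes), plus the ‘lnk’ literal. The sketch below is a host-side illustration with hypothetical names. It assumes the function issues no additional PUSH instructions and does not pass arguments on the stack.

```c
#include <assert.h>
#include <stdint.h>

#define RETURN_ADDRESS_BYTES 4u  /* pushed by CALL/RCALL */
#define SAVED_FP_BYTES       2u  /* previous frame pointer stored by LNK */

/*
 * Approximate stack bytes consumed by one call to a function whose
 * prologue is `lnk #lnk_literal` (hypothetical helper; ignores any
 * extra PUSH instructions in the function body).
 */
uint16_t stack_bytes_for_call(uint16_t lnk_literal)
{
    return (uint16_t)(RETURN_ADDRESS_BYTES + SAVED_FP_BYTES + lnk_literal);
}
```

Under these assumptions, `stack_bytes_for_call(28)` gives 34 bytes for ‘function1’, and a function that starts with ‘lnk #0’ accounts for 6 bytes.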
For detailed information on calling conventions, go through sections “Function Call Conventions”
and “Auto Variable Allocation and Access” in the XC16 compiler user’s guide.
It can be useful to generate a single disassembly listing file for the entire project.
The “xc16-objdump” utility, located in <INSTALL-DIR>\bin, can be used for this purpose.
Pass the “elf” file generated by a successful build to ‘xc16-objdump’, with
appropriate options, to generate the disassembly listing file for the project.
The following command generates the disassembly listing for the entire project and writes the
disassembled code to ‘Dissassembly.txt’:
xc16-objdump.exe -d Input.elf >> Dissassembly.txt
4 Stack usage of interrupt routines
In the absence of shadow registers or separate register banks for interrupts, interrupt routines need to
save the current context, including the ‘Status’ register and the Interrupt Priority Level register. This is
not required when calling a function from the mainline code.
To calculate the stack usage of interrupt routines, inspect the disassembly code of the interrupt routine
by referring to the assembly source file.
Consider an example of an interrupt routine ‘T1Interrupt’.
void __attribute__((interrupt(no_auto_psv))) _T1Interrupt(void)
{
    // clear this interrupt condition
    _T1IF = 0;
}
Even though there is no explicit 'call', an interrupt function uses stack implicitly before the function
body executes. The stack usage of T1Interrupt is therefore 6 bytes: 2 for the lnk #0 (which only saves
the frame pointer) and 4 for the interrupt return point (restored by retfie).
5 Stack usage of entire call tree
In the provided example code, ‘function2’ calls ‘function3’. The stack required by ‘function2’ and
‘function3’ is 42 bytes and 420 bytes, respectively.
Thus, the total stack usage of this call tree is 462 bytes.
Similarly, adding the stack requirements of the individual functions in a call tree gives the stack usage of
the entire call tree. The following subsection describes how to generate a call tree and gives a brief
overview of the open-source utility “Doxygen”.
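The addition described above can be sketched as a small host-side helper that walks a call tree and returns the deepest path. The `call_node` type and `worst_case_stack` function are hypothetical names introduced for illustration, and the sketch assumes the call tree contains no recursion. The helper itself is recursive, which is acceptable for an analysis tool run on a PC but is not a pattern to copy into the target firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLEES 4

/* One node of a non-recursive call tree (hypothetical type). */
typedef struct call_node {
    const char *name;
    uint32_t own_stack_bytes;                      /* stack used by this function alone */
    const struct call_node *callees[MAX_CALLEES];  /* unused slots are NULL */
} call_node;

/* Worst-case stack of a call tree: own usage plus the deepest callee path. */
uint32_t worst_case_stack(const call_node *fn)
{
    uint32_t deepest = 0;
    for (size_t i = 0; i < MAX_CALLEES && fn->callees[i] != NULL; i++) {
        uint32_t below = worst_case_stack(fn->callees[i]);
        if (below > deepest) {
            deepest = below;
        }
    }
    return fn->own_stack_bytes + deepest;
}
```

With a 42-byte ‘function2’ node whose only callee is a 420-byte ‘function3’ node, the helper returns 462, matching the figure above. When a function has several callees, only the most expensive branch counts, because sibling calls reuse the same stack space.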
5.1 Using Doxygen to generate a call tree:
The open-source utility Doxygen can be used to generate the call tree of an entire project.
1. MPLAB X has a GUI plugin for Doxygen. Go to Tools > Plugins > Available Plugins and install the
doxygen integrator.
HAVE_DOT = YES
EXTRACT_ALL = YES
EXTRACT_PRIVATE = YES
EXTRACT_STATIC = YES
CALL_GRAPH = YES
CALLER_GRAPH = YES
DOT_PATH = "C:\Program Files (x86)\graphviz-2.44.1-win32\Graphviz\bin\dot.exe"
5. Right-click your project in MPLAB X and select “Create doxygen”. This generates two folders
in your project: ‘html’ and ‘latex’.
6. At this point you can right-click the project and select “Doxygen HTML output view”. This opens a
browser window where you can browse through your functions.
7. You will see graphs like the following, from which you can determine the largest call depth in your code:
6 Using PC-Lint for stack analysis
PC-Lint can be used for stack analysis of applications that use PIC24 and dsPIC devices.
PC-Lint provides the ‘+stack’ option for generating the stack usage report of the application.
‘+stack’ takes additional options to write the stack usage report to a file (‘stack_usage.txt’ in the
example below).
+stack( &file=stack_usage.txt )
Each row contains the name of the function, followed by the automatic variable storage requirements of
each function. It also provides the maximum stack usage of each function, when calling other functions
in the code.
Note 1:
MPLAB X IDE also provides a freely available plugin for PC-Lint. However, the PC-Lint
license needs to be purchased separately from Gimpel Software.
Refer to the PC-Lint installation instructions here: https://microchipdeveloper.com/mplabx:PC-Lint-installation
Note 2:
The Gimpel stack usage report does not contain enough data to account for function call overhead or
compiler-generated temporaries, so its figures should be treated as an estimate only.
7 Stack usage of entire code - Run-time analysis
The stack usage can be calculated by subtracting the amount of unused stack from the total stack.
Stack used = Total Stack – Unused Stack
The following steps explain this method in detail:
Applying the ‘user_init’ attribute to a function causes the default runtime startup code to call this
function before ‘main’ is called. This function is customized to fill the stack space
with a known fixed value.
In the example below the stack space is filled with 0xA5A5 (users may choose a different
value).
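The fill step can be sketched in portable C as shown below. This is a host-runnable illustration, not the code from the example project: the function name and its parameters are hypothetical, with `first_unused` standing in for the current Stack Pointer and `limit` for the end of the stack region. On the target, such a function would carry the ‘user_init’ attribute described above and would take its bounds from the actual stack registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_INIT_VAL 0xA5A5u  /* known fill pattern; any recognisable value works */

/*
 * Fill the unused part of a stack region with a known pattern.
 * first_unused: first free word (plays the role of the stack pointer)
 * limit:        one past the last word of the stack region
 */
void prefill_stack(uint16_t *first_unused, const uint16_t *limit)
{
    assert(first_unused <= limit);
    for (uint16_t *p = first_unused; p < limit; p++) {
        *p = STACK_INIT_VAL;
    }
}
```

Only the words from the current top of the stack upwards are written, so anything already on the stack when the function runs is left intact.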
In this method of prefilling the stack, the runtime startup file is modified to fill up the stack.
1. Firstly, the user needs to identify the correct startup file for the device in use. To identify
the startup file for the device, refer to the device specific linker script file (under
<COMPILER-DIR>\support\<device family>\gld)
a. Primary version with crt0 prefix (crt0_xx.s), with data initialization support.
The linker loads this version when the --data-init option is selected.
b. Alternate version with crt1 prefix (crt1_xx.s), without data initialization support.
The linker loads this version when the --no-data-init option is selected.
2. Access the runtime startup files and other required files under
<COMPILER-DIR>\src\libpic30.zip\pic30
3. Copy the startup file and “null_signature.s” from the above directory to your project
directory.
4. Add these assembly source files to your MPLAB X project under ‘Source Files’, by right-clicking
and selecting “Add Existing Items from folder”.
5. The code needed for prefilling the memory is already provided in the startup files. Users
only need to uncomment the following sections in the startup file.
In the example below the stack space is filled with 0xA5A5 (users may choose a different
value).
mov #__DATA_BASE,w0
mov #__DATA_LENGTH,w1
mov #__DATA_INIT_VAL,w3 ; start of initializing RAM
add w0,w1,w1
1: cp w0,w15
bra geu, 2f ; move to initializing stack
mov w3,[w0++]
cp w0, w1
bra ltu, 1b
bra 1f
2: mov #__STACK_INIT_VAL,w3
setm w15
3: mov w3,[w0++]
cp w0,w14
bra nz,3b
mov #__DATA_INIT_VAL,w3
cp w0,w1
bra ltu,1b
1: mov #__SP_init,w15 ; (RE) initialize WREG15
After the start-up code executes, the stack space is prefilled with the pre-defined
value of 0xA5A5.
The drawback of this method is that it is more complex to follow than the
‘user_init’ method outlined earlier.
7.2 Code execution and Stack utilization
As the code executes and runs through various branches and the call tree, the stack is filled up
with automatic variables, parameters and function return addresses. The stack space which
was pre-filled with a known value (0xA5A5) is now overwritten by these objects, present in
the code.
The stack space which isn’t utilized by the code contains the pre-filled value (0xA5A5).
The users can either add a small code stub to their application, or manually inspect the Data
memory to find the maximum stack space utilization of their application. These techniques
are detailed in the sections below:
7.3.1 Adding a function that returns the maximum stack usage of the application
In this approach a function is added to the application to return the maximum
stack usage of the application. This function should be called just before the return from main.
This function traverses the stack space from the top of the stack region (the highest stack
address) towards the bottom. It checks for the first location whose contents differ from the
pre-filled value and records the address at which that happens. The stack start address is
then subtracted from this address to arrive at the maximum stack utilization of the code.
uint16_t GetMaxStackUtilisation(void)
{
    uint16_t * Address = ( uint16_t * ) StackEndAddress;
    /* Keep reading RAM until a value other than the pre-filled value is found */
    while ( *--Address == PREFILL_VALUE );
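The scan described above can be sketched as a self-contained, host-runnable function. This is an illustrative variant rather than the project's own code: it takes the stack bounds as parameters (hypothetical names) instead of using fixed addresses, and it adds a lower-bound check so the loop stops if the whole region still holds the pre-fill value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PREFILL_VALUE 0xA5A5u

/*
 * Maximum stack utilisation, in bytes, of a stack region
 * [stack_start, stack_end) that grows towards higher addresses and was
 * pre-filled with PREFILL_VALUE before the application ran.
 */
size_t max_stack_utilisation(const uint16_t *stack_start, const uint16_t *stack_end)
{
    const uint16_t *address = stack_end;
    /* Walk down from the end until a word no longer holds the pre-fill value. */
    while (address > stack_start && address[-1] == PREFILL_VALUE) {
        address--;
    }
    return (size_t)(address - stack_start) * sizeof(uint16_t);
}
```

One caveat of any pattern-based scan: if the application happens to store the value 0xA5A5 at the deepest point of the stack, the result is under-reported by that word, so it is reasonable to add some margin to the measured figure.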
7.3.2 Manually inspecting the Data memory to find the maximum stack usage of the
application
This method uses the same approach as the previous section, except that no
function is called to calculate and return the maximum stack utilization of the code.
The user needs to inspect the File Register view of the MPLAB X IDE to deduce the
maximum stack utilization of the code.
By manually inspecting the File register view of the IDE, one can deduce that the maximum
stack utilization of the application is 0x1FA (see screenshot in above section).
8 Stack usage of individual functions - Run-time analysis
To calculate the stack usage of ‘function1’, the Stack Pointer (WREG15) is read just before the call to
the function and is stored in the variable ‘StackPointerFunctionEntry’.
Inside ‘function1’ a code stub is placed just before the return to the calling function.
This code stub reads the stack pointer just before the return of the function.
Since the Stack Pointer address at function entry and function exit is known, the stack usage can be
calculated for that function.
Note: This method provides only an estimate and does not account for all the stack that
might be used in the function, for example any arguments that need to be pushed onto the
stack.
Before function entry, the current value of the Stack Pointer (WREG15) is read into ‘StackPointerFunctionEntry’:
/* Subtract the Stack Pointer value at entry from the value at exit to get the stack used by the function */
StackUsedByFunction = StackPointerFunctionExit - StackPointerFunctionEntry;
Inside the function, just before the ‘return’ statement, the value of the Stack Pointer is read into
‘StackPointerFunctionExit’:
uint8_t function1 ( uint8_t param_1, uint8_t param_2 )
{
    .
    .
    .
    /* Read the stack pointer value at the end of the function (using extended asm syntax) */
    asm ("mov w15, %0" : "=r"(StackPointerFunctionExit));
    return 1;
}
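The subtraction itself can be sketched as a small helper, shown here in host-runnable form. The helper name and the sample addresses used in the note below are hypothetical. Because the stack grows towards higher addresses, the sample taken inside the function is expected to be the larger of the two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stack used between two samples of the stack pointer (hypothetical helper).
 * The stack grows towards higher addresses, so the exit sample must not be
 * smaller than the entry sample.
 */
uint16_t stack_used_by_function(uint16_t sp_at_entry, uint16_t sp_at_exit)
{
    assert(sp_at_exit >= sp_at_entry);
    return (uint16_t)(sp_at_exit - sp_at_entry);
}
```

As an illustration, if the Stack Pointer were 0x0850 just before the call and 0x0872 just before the return, the result would be 34 bytes. That is consistent with the static view of ‘function1’: 4 bytes for the return address, 2 bytes for the saved frame pointer and 28 bytes allocated by ‘lnk #28’.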
9 EDS memory and stack
By default, the compiler uses the “local stack” and does not allocate the stack in Data memory beyond 0x7FFF.
The “--no-local-stack” option can be used when a larger stack is required.
To prevent the linker from allocating the stack in Extended Data Space (EDS), pass the “--local-stack”
option to the linker. Although EDS provides more memory for the stack, EDS access requires
extra instructions and is therefore slower.
Users can select these options under MPLAB X project properties >> xc16-ld >> Use local Stack:
Checked (--local-stack): Stack will not grow beyond 0x7FFF
Unchecked (--no-local-stack): Stack will grow beyond 0x7FFF into EDS space.
10 Stack overflow and underflow
16-bit PIC MCUs and dsPIC devices provide a SPLIM (Stack Pointer Limit) register, which points to
the upper limit of the stack. The stack cannot grow beyond SPLIM: if the Stack Pointer (WREG15) grows
beyond SPLIM, a stack overflow exception occurs. Such exceptions should be handled by
implementing an exception handler in the code.
A stack underflow exception occurs when the Stack Pointer address is below the start address of the Data
memory. This exception is generated to prevent the stack from clobbering the Special Function Register
(SFR) space.
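The two conditions can be modelled in a few lines of C. This is only a host-side model of the comparison that the hardware performs, with hypothetical names. Firmware does not need to run such a check itself, since the device raises the exception automatically.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { STACK_OK, STACK_OVERFLOW, STACK_UNDERFLOW } stack_status;

/*
 * Classify a stack pointer value against the two limits described above
 * (hypothetical model, not a device API).
 */
stack_status check_stack_pointer(uint16_t sp, uint16_t splim, uint16_t data_memory_start)
{
    if (sp > splim) {
        return STACK_OVERFLOW;   /* stack pointer grew beyond SPLIM */
    }
    if (sp < data_memory_start) {
        return STACK_UNDERFLOW;  /* stack pointer fell into the SFR space */
    }
    return STACK_OK;
}
```

For example, with Data memory starting at 0x0800 and an assumed SPLIM of 0x1FFE, a stack pointer of 0x2000 is classified as an overflow and 0x07FE as an underflow.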
11 Best practices to optimize stack usage
The MPLAB XC16 compiler automatically manages the stack for C code. However, it is the user’s
responsibility to manage the stack while writing assembly code.
• Avoid calling functions from interrupts. Calling functions from interrupts may require many registers to
be pushed onto the stack, hence resulting in stack overruns.
• Avoid recursion. Using recursion without knowing the recursion depth may lead to stack overflow.
• Request a minimum stack size for the application using the ‘--stack’ linker option.
If the requested minimum stack space isn’t available, the linker will generate an error.
• Implement a stack exception handler for your application. Refer to the Stack overflow and underflow
section in this document.
• While using large objects, avoid creating multiple copies of the same data on the stack. Instead pass
these objects by reference.
• Copying data through a pointer without any bounds checking may lead to a stack overflow condition.
This may also corrupt the stack of other functions in the call tree. Always check bounds when copying
data through pointers.
• Calling a function requires the return address, and perhaps the frame pointer, to be pushed onto the
stack, so if your function is small, consider making it inline. Use the “inline” keyword to inline individual
functions.
Use the “-finline-functions” option to let the compiler inline all suitable simple functions. The -O3
optimization option enables this option by default.
• Look for functions that contribute to the worst-case stack usage. Reducing the size of such a function or
rearranging the call tree may reduce the stack usage.
• Monitor the depth of each function’s call tree to limit the call depth to a certain level. This can be done
by rearranging the code. Static analysis tools can be useful for monitoring the maximum call depth of
functions in the application.
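Two of the practices above, passing large objects by reference and bounds-checking copies, can be illustrated with a short sketch. The type and function names are hypothetical and the 256-byte size is an arbitrary example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical large object that should not be copied onto the stack. */
typedef struct {
    uint8_t samples[256];
} sample_block;

/* Takes a pointer: only an address is passed, not a 256-byte copy. */
uint32_t sum_samples(const sample_block *block)
{
    uint32_t total = 0;
    for (size_t i = 0; i < sizeof block->samples; i++) {
        total += block->samples[i];
    }
    return total;
}

/* Copy at most dst_size bytes; returns the number of bytes actually copied. */
size_t bounded_copy(uint8_t *dst, size_t dst_size, const uint8_t *src, size_t src_len)
{
    size_t n = (src_len < dst_size) ? src_len : dst_size;
    memcpy(dst, src, n);
    return n;
}
```

Declaring the parameter as `const sample_block *` keeps the caller's object out of the callee's frame, and clamping the copy length to the destination size prevents a stack-resident buffer from being overrun.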