
How does the stack work in assembly language?

Asked 15 years, 1 month ago
Modified 6 months ago
Viewed 169k times
102
I'm currently trying to understand how the stack works, so I've decided to teach myself some assembly
language. I'm using this book:
http://savannah.nongnu.org/projects/pgubook/
I'm using Gas and doing my development on Linux Mint.
I'm a bit confused by something:

As far as I was aware a stack is simply a data structure. So I assumed if I was coding in assembly I'd
have to implement the stack myself. However this doesn't seem to be the case as there are commands
like

pushl
popl
So when coding in assembly for the x86 architecture and using the Gas syntax: is the stack just a data
structure that's already implemented? Or is it actually implemented at the hardware level? Or is it
something else? Also would most assembly languages for other chip sets have the stack already
implemented?
I know this is a bit of a foolish question but I'm actually quite confused by this.
Tags: assembly, stack, callstack, stack-memory

edited Feb 9, 2022 at 6:32 by Peter Cordes
asked Feb 17, 2009 at 13:12 by handles
 4
Most of these answers talk about the stack as it is used by languages; in particular, they talk about passing arguments
on the stack. In many CPUs this is just to make implementing languages easier. If you were hand-coding assembly
you'd generally pass parameters to functions in registers (at least before the CPU was optimized for stack operations
because the languages used it so heavily). The stack is mostly there to keep calls/returns in order. Also, an interrupt (which
needs to save the state of the CPU) would push the existing values of the registers it will use and pop them before
returning.
– Bill K
Oct 28, 2019 at 17:49
17 Answers
102
I think primarily you're getting confused between a program's stack and any old stack.
A Stack
Is an abstract data structure which holds information in a Last In First Out order. You put
arbitrary objects onto the stack and then you take them off again, much like an in/out tray: the top
item is always the one that is taken off, and you always put new items on the top.
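A minimal sketch of that Last In First Out behaviour, using a std::vector as the storage. The helper name push_then_pop_all is hypothetical and only there for illustration:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: pushes every value, then pops them all and
// returns them in the order they came off the stack.
std::vector<int> push_then_pop_all(const std::vector<int>& values) {
    std::vector<int> stack;                // the back of the vector is the "top"
    for (int v : values) {
        stack.push_back(v);                // push: always add on the top
    }
    std::vector<int> popped;
    while (!stack.empty()) {
        popped.push_back(stack.back());    // read the top item
        stack.pop_back();                  // pop: always remove from the top
    }
    return popped;
}
```

Pushing 1, 2, 3 and then popping everything gives the values back in reverse order: 3, 2, 1.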

A Program's Stack
Is a stack: a section of memory that is used during execution. It generally has a static size per
program and is frequently used to store function parameters. You push the parameters onto the stack
when you call a function, and the function either addresses the stack directly or pops the variables
off the stack.

A program's stack isn't generally hardware (though it's kept in memory, so it can be argued as such),
but the Stack Pointer, which points to the current area of the stack, is generally a CPU register. This
makes it a bit more flexible than a pure LIFO stack, as you can change the point at which the stack is
addressed.

You should read and make sure you understand the wikipedia article as it gives a good description of
the Hardware Stack which is what you are dealing with.
There is also this tutorial which explains the stack in terms of the old 16bit registers but could be
helpful and another one specifically about the stack.
From Nils Pipenbrinck:

It's worthy of note that some processors do not implement all of the instructions for accessing and
manipulating the stack (push, pop, stack pointer, etc.), but the x86 does because of their frequency of
use. In these situations, if you wanted a stack you would have to implement it yourself (some MIPS
and some ARM processors are created without dedicated stack instructions).

For example, in MIPS a push instruction would be implemented like:

addi $sp, $sp, -4 # Decrement stack pointer by 4
sw $t0, ($sp) # Save $t0 to stack

and a pop instruction would look like:

lw $t0, ($sp) # Copy from stack to $t0
addi $sp, $sp, 4 # Increment stack pointer by 4
edited Jan 12, 2017 at 19:12 by Alexander Malakhov
answered Feb 17, 2009 at 13:24 by Henry B
 4
Btw - the x86 has these special stack instructions because pushing and popping stuff from the stack happens so
often that it was a good idea to use a short opcode for them (less code-space). Architectures such as MIPS and ARM
don't have these, so you have to implement the stack on your own.
– Nils Pipenbrinck
Feb 17, 2009 at 13:42
 4
Bear in mind that your hot new processor is binary-compatible with the 8086 to some extent, and that was source-
compatible with the 8080, a development of the 8008, the first microprocessor. Some of these decisions go back a
long way.
– David Thornley
Feb 17, 2009 at 14:33
 5
In ARM, there are single instructions for manipulating the stack, they are just not so obvious because they are called
STMDB SP! (for PUSH) and LDMIA SP! (for POP).
– Adam Goode
Oct 21, 2009 at 20:16
 1
My god this answer is in need of +500... I haven't found anything explained this well since forever. Considering to
make new accounts to +1 this as of now...
– Gabriel
Feb 10, 2012 at 15:04
 1
@bplus You can also refer to cs.umd.edu/class/sum2003/cmsc311/Notes/Mips/stack.html
– Suraj Jain
Aug 18, 2016 at 6:47
72
(I've made a gist of all the code in this answer in case you want to play with it)
I had only ever done the most basic things in asm, during my CS101 course back in 2003. And I never
really "got" how asm and the stack work until I realized that it's all basically like programming in C
or C++ ... but without local variables, parameters and functions. Probably doesn't sound easy yet :)
Let me show you (for x86 asm with Intel syntax).
1. What is the stack
Stack is usually a contiguous chunk of memory allocated for every thread before they start. You can
store there whatever you want. In C++ terms (code snippet #1):
const int STACK_CAPACITY = 1000;
thread_local int stack[STACK_CAPACITY];

2. Stack's top and bottom


In principle, you could store values in random cells of stack array (snippet #2.1):
stack[333] = 123;
stack[517] = 456;
stack[555] = stack[333] + stack[517];
But imagine how hard it would be to remember which cells of stack are already in use and which ones
are "free". That's why we store new values on the stack next to each other.
One weird thing about (x86) asm's stack is that you add things there starting with the last index and
move to lower indexes: stack[999], then stack[998] and so on (snippet #2.2):
stack[999] = 123;
stack[998] = 456;
stack[997] = stack[999] + stack[998];
And still (caution, you're gonna be confused now) the "official" name for stack[999] is bottom of the
stack.
The last used cell (stack[997] in the example above) is called top of the stack (see Where the top of
the stack is on x86).

3. Stack pointer (SP)


For the purpose of this discussion let's assume CPU registers are represented as global variables
(see General-Purpose Registers).
int AX, BX, SP, BP, ...;
int main(){...}
There is a special CPU register (SP) that tracks the top of the stack. SP is a pointer (it holds a memory
address like 0xAAAABBCC). But for the purposes of this post I'll use it as an array index (0, 1, 2,
...).

When a thread starts, SP == STACK_CAPACITY, and then the program and OS modify it as needed.
The rule is that you can't write to stack cells beyond the stack's top: any index less than SP is invalid and
unsafe (because of system interrupts), so you first decrement SP and then write a value to the newly
allocated cell.
When you want to push several values in the stack in a row, you can reserve space for all of them
upfront (snippet #3):
SP -= 3;
stack[999] = 12;
stack[998] = 34;
stack[997] = stack[999] + stack[998];
Note. Now you can see why allocation on the stack is so fast - it's just a single register decrement.

4. Local variables
Let's take a look at this simplistic function (snippet #4.1):
int triple(int a) {
int result = a * 3;
return result;
}
and rewrite it without using a local variable (snippet #4.2):
int triple_noLocals(int a) {
SP -= 1; // move pointer to unused cell, where we can store what we need
stack[SP] = a * 3;
return stack[SP];
}
and see how it is being called (snippet #4.3):
// SP == 1000
someVar = triple_noLocals(11);
// now SP == 999, but we don't need the value at stack[999] anymore
// and we will move the stack index back, so we can reuse this cell later
SP += 1; // SP == 1000 again

5. Push / pop
Adding a new element on the top of the stack is such a frequent operation that CPUs have a
special instruction for it, push. We'll implement it like this (snippet 5.1):
void push(int value) {
--SP;
stack[SP] = value;
}
Likewise, taking the top element of the stack (snippet 5.2):
void pop(int& result) {
result = stack[SP];
++SP; // note that `pop` decreases stack's size
}
Common usage pattern for push/pop is temporarily saving some value. Say, we have something
useful in variable myVar and for some reason we need to do calculations which will overwrite it
(snippet 5.3):
int myVar = ...;
push(myVar); // SP == 999
myVar += 10;
... // do something with new value in myVar
pop(myVar); // restore original value, SP == 1000
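For anyone who wants to run snippets #1 to #5.3 together, here is a minimal self-contained sketch. It assumes the simplified model of this post (an array index instead of a real address, and a plain global array instead of thread_local). The helper save_and_restore is a hypothetical name, not something from the gist:

```cpp
#include <cassert>

const int STACK_CAPACITY = 1000;
int stack[STACK_CAPACITY];   // simplified: one global array, not thread_local
int SP = STACK_CAPACITY;     // empty stack: SP is one past the last cell

void push(int value) {
    --SP;                    // allocate a cell first...
    stack[SP] = value;       // ...then write to it
}

void pop(int& result) {
    result = stack[SP];      // read the top cell...
    ++SP;                    // ...then release it
}

// Hypothetical helper: overwrites myVar, then restores it from the stack.
int save_and_restore(int myVar) {
    push(myVar);             // save the original value
    myVar += 10;             // stand-in for work that clobbers myVar
    pop(myVar);              // restore the original value
    return myVar;
}
```

Calling save_and_restore(5) returns 5 again, and SP is back at STACK_CAPACITY afterwards, because every push was matched by a pop.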

6. Function parameters
Now let's pass parameters using stack (snippet #6):
int triple_noL_noParams() { // `a` is at index 999, SP == 999
SP -= 1; // SP == 998, stack[SP + 1] == a
stack[SP] = stack[SP + 1] * 3;
return stack[SP];
}

int main(){
push(11); // SP == 999
assert(triple(11) == triple_noL_noParams());
SP += 2; // cleanup 1 local and 1 parameter
}

7. return statement
Let's return value in AX register (snippet #7):
void triple_noL_noP_noReturn() { // `a` at 998, SP == 998
SP -= 1; // SP == 997

stack[SP] = stack[SP + 1] * 3;
AX = stack[SP];

SP += 1; // finally we can cleanup locals right in the function body, SP == 998


}

int main(){
... // some code
push(AX); // save AX in case there is something useful there, SP == 999
push(11); // SP == 998
triple_noL_noP_noReturn();
assert(triple(11) == AX);
SP += 1; // cleanup param
// locals were cleaned up in the function body, so we don't need to do it here
pop(AX); // restore AX
...
}

8. Stack base pointer (BP) (also known as frame pointer) and stack frame
Lets take more "advanced" function and rewrite it in our asm-like C++ (snippet #8.1):
int myAlgo(int a, int b) {
int t1 = a * 3;
int t2 = b * 3;
return t1 - t2;
}

void myAlgo_noLPR() { // `a` at 997, `b` at 998, old AX at 999, SP == 997


SP -= 2; // SP == 995

stack[SP + 1] = stack[SP + 2] * 3;
stack[SP] = stack[SP + 3] * 3;
AX = stack[SP + 1] - stack[SP];

SP += 2; // cleanup locals, SP == 997


}

int main(){
push(AX); // SP == 999
push(22); // SP == 998
push(11); // SP == 997
myAlgo_noLPR();
assert(myAlgo(11, 22) == AX);
SP += 2;
pop(AX);
}
Now imagine we decided to introduce a new local variable to store the result before returning, as we
do in triple (snippet #4.1). The body of the function will be (snippet #8.2):
SP -= 3; // SP == 994
stack[SP + 2] = stack[SP + 3] * 3;
stack[SP + 1] = stack[SP + 4] * 3;
stack[SP] = stack[SP + 2] - stack[SP + 1];
AX = stack[SP];
SP += 3;
You see, we had to update every single reference to function parameters and local variables. To
avoid that, we need an anchor index, which doesn't change when the stack grows.

We will create the anchor right upon function entry (before we allocate space for locals) by saving
current top (value of SP) into BP register. Snippet #8.3:
void myAlgo_noLPR_withAnchor() { // `a` at 997, `b` at 998, SP == 997
push(BP); // save old BP, SP == 996
BP = SP; // create anchor, stack[BP] == old value of BP, now BP == 996
SP -= 2; // SP == 994

stack[BP - 1] = stack[BP + 1] * 3;
stack[BP - 2] = stack[BP + 2] * 3;
AX = stack[BP - 1] - stack[BP - 2];

SP = BP; // cleanup locals, SP == 996


pop(BP); // SP == 997
}
The slice of the stack which belongs to, and is fully controlled by, the function is called the function's stack
frame. E.g. myAlgo_noLPR_withAnchor's stack frame is stack[996 .. 994] (both indexes inclusive).
The frame starts at the function's BP (after we've updated it inside the function) and lasts until the next stack
frame. So the parameters on the stack are part of the caller's stack frame (see note 8a).
Notes:
8a. Wikipedia says otherwise about parameters, but here I adhere to Intel software developer's
manual, see vol. 1, section 6.2.4.1 Stack-Frame Base Pointer and Figure 6-2 in section 6.3.2 Far
CALL and RET Operation. Function's parameters and stack frame are part of function's activation
record (see The gen on function perilogues).
8b. Positive offsets from BP point to function parameters and negative offsets point to local variables.
That's pretty handy for debugging.
8c. stack[BP] stores the address of the previous stack frame, stack[stack[BP]] stores the pre-previous stack
frame and so on. Following this chain, you can discover the frames of all the functions in the program
which haven't returned yet. This is how debuggers show you the call stack.
8d. The first 3 instructions of myAlgo_noLPR_withAnchor, where we set up the frame (save old BP,
update BP, reserve space for locals), are called the function prologue.
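Note 8c can be tried out in the same asm-like C++ model. The sketch below is only an illustration under assumptions: nested and walk_frames are hypothetical names, no return addresses are simulated, and the initial BP == STACK_CAPACITY is used as a made-up marker for "no previous frame":

```cpp
#include <cassert>
#include <vector>

const int STACK_CAPACITY = 1000;
int stack[STACK_CAPACITY];
int SP = STACK_CAPACITY;
int BP = STACK_CAPACITY;     // assumption: this value means "no previous frame"

void push(int value) { --SP; stack[SP] = value; }
void pop(int& result) { result = stack[SP]; ++SP; }

// Follows the chain of saved BP values, starting from the current BP.
// Returns the BP of every active frame, innermost first.
std::vector<int> walk_frames() {
    std::vector<int> frames;
    for (int bp = BP; bp != STACK_CAPACITY; bp = stack[bp]) {
        frames.push_back(bp);
    }
    return frames;
}

// Hypothetical function with a prologue and epilogue as in snippet #8.3.
// It calls itself until depth reaches 0, then records the frame chain.
std::vector<int> nested(int depth) {
    push(BP);                // prologue: save the caller's BP
    BP = SP;                 // anchor of this frame
    SP -= 1;                 // reserve one local
    stack[BP - 1] = depth;   // the local lives at a negative offset from BP
    std::vector<int> result =
        (depth == 0) ? walk_frames() : nested(depth - 1);
    SP = BP;                 // epilogue: drop locals
    pop(BP);                 // restore the caller's BP
    return result;
}
```

With a fresh stack, nested(2) creates three frames anchored at indexes 999, 997 and 995, and the walk reports them innermost first: 995, 997, 999.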

9. Calling conventions
In snippet 8.1 we've pushed parameters for myAlgo from right to left and returned result in AX. We
could as well pass params left to right and return in BX. Or pass params in BX and CX and return in
AX. Obviously, caller (main()) and called function must agree where and in which order all this stuff
is stored.
Calling convention is a set of rules on how parameters are passed and result is returned.
In the code above we've used cdecl calling convention:

- Parameters are passed on the stack, with the first argument at the lowest address on the stack at
  the time of the call (pushed last <...>). The caller is responsible for popping parameters back off
  the stack after the call.
- The return value is placed in AX.
- EBP and ESP must be preserved by the callee (myAlgo_noLPR_withAnchor function in our case),
  such that the caller (main function) can rely on those registers not having been changed by a call.
- All other registers (EAX, <...>) may be freely modified by the callee; if a caller wishes to
  preserve a value before and after the function call, it must save the value elsewhere (we do this
  with AX).
(Source: example "32-bit cdecl" from Stack Overflow Documentation; copyright 2016 by icktoofay and Peter Cordes ; licensed under CC BY-SA 3.0. An archive of the full
Stack Overflow Documentation content can be found at archive.org, in which this example is indexed by topic ID 3261 and example ID 11196.)
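The first two cdecl rules can be sketched in the asm-like C++ model of this post. This is a simplification under assumptions: no return address is simulated, and subtract_callee and call_subtract are hypothetical names used only here:

```cpp
#include <cassert>

const int STACK_CAPACITY = 1000;
int stack[STACK_CAPACITY];
int SP = STACK_CAPACITY;
int AX = 0;

void push(int value) { --SP; stack[SP] = value; }

// Callee: expects the first argument at stack[SP] (lowest index) and the
// second at stack[SP + 1]. It leaves the result in AX and does not move SP.
void subtract_callee() {
    AX = stack[SP] - stack[SP + 1];
}

// Caller: pushes arguments right to left, calls, then cleans up itself.
int call_subtract(int a, int b) {
    push(b);                 // rightmost argument first
    push(a);                 // first argument last, so it ends up lowest
    subtract_callee();
    SP += 2;                 // caller pops both parameters
    return AX;               // result is returned in AX
}
```

call_subtract(11, 22) gives -11. If the caller pushed left to right instead, the callee would read the arguments swapped, which is why both sides must agree on the convention.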

10. Function calls


Now the most interesting part. Just like data, executable code is also stored in memory (completely
unrelated to memory for stack) and every instruction has an address.
When not commanded otherwise, CPU executes instructions one after another, in the order they are
stored in memory. But we can command CPU to "jump" to another location in memory and execute
instructions from there on. In asm it can be any address, and in more high-level languages like C++
you can only jump to addresses marked by labels (there are workarounds but they are not pretty, to
say the least).
Let's take this function (snippet #10.1):
int myAlgo_withCalls(int a, int b) {
int t1 = triple(a);
int t2 = triple(b);
return t1 - t2;
}
And instead of calling triple the C++ way, do the following:

1. copy triple's code to the beginning of the myAlgo body
2. at myAlgo entry, jump over triple's code with goto
3. when we need to execute triple's code, save on the stack the address of the code line just
after the triple call, so we can return here later and continue execution (PUSH_ADDRESS macro
below)
4. jump to the address of the 1st line (triple function) and execute it to the end (3. and 4. together
are the CALL macro)
5. at the end of triple (after we've cleaned up locals), take the return address from the top of the
stack and jump there (RET macro)
Because there is no easy way to jump to a particular code address in C++, we will use labels to mark
the places of jumps. I won't go into detail on how the macros below work, just believe me they do what I say
they do (snippet #10.2):
// pushes the address of the code at label's location on the stack
// NOTE1: this gonna work only with 32-bit compiler (so that pointer is 32-bit and fits in int)
// NOTE2: __asm block is specific for Visual C++. In GCC use https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
#define PUSH_ADDRESS(labelName) { \
void* tmpPointer; \
__asm{ mov [tmpPointer], offset labelName } \
push(reinterpret_cast<int>(tmpPointer)); \
}

// why we need indirection, read https://stackoverflow.com/a/13301627/264047
#define TOKENPASTE(x, y) x ## y
#define TOKENPASTE2(x, y) TOKENPASTE(x, y)

// generates token (not a string) we will use as label name.
// Example: LABEL_NAME(155) will generate token `lbl_155`
#define LABEL_NAME(num) TOKENPASTE2(lbl_, num)

#define CALL_IMPL(funcLabelName, callId) \
PUSH_ADDRESS(LABEL_NAME(callId)); \
goto funcLabelName; \
LABEL_NAME(callId) :

// saves return address on the stack and jumps to label `funcLabelName`
#define CALL(funcLabelName) CALL_IMPL(funcLabelName, __LINE__)

// takes address at the top of stack and jump there
#define RET() { \
int tmpInt; \
pop(tmpInt); \
void* tmpPointer = reinterpret_cast<void*>(tmpInt); \
__asm{ jmp tmpPointer } \
}

void myAlgo_asm() {
goto my_algo_start;

triple_label:
push(BP);
BP = SP;
SP -= 1;

// stack[BP] == old BP, stack[BP + 1] == return address


stack[BP - 1] = stack[BP + 2] * 3;
AX = stack[BP - 1];

SP = BP;
pop(BP);
RET();

my_algo_start:
push(BP); // SP == 995
BP = SP; // BP == 995; stack[BP] == old BP,
// stack[BP + 1] == dummy return address,
// `a` at [BP + 2], `b` at [BP + 3]
SP -= 2; // SP == 993

push(AX);
push(stack[BP + 2]);
CALL(triple_label);
stack[BP - 1] = AX;
SP += 1; // clean up param
pop(AX);

push(AX);
push(stack[BP + 3]);
CALL(triple_label);
stack[BP - 2] = AX;
SP += 1; // clean up param
pop(AX);
AX = stack[BP - 1] - stack[BP - 2];

SP = BP; // cleanup locals, SP == 995
pop(BP); // SP == 996
}

int main() {
push(AX);
push(22);
push(11);
push(7777); // dummy value, so that offsets inside function are like we've pushed return address
myAlgo_asm();
assert(myAlgo_withCalls(11, 22) == AX);
SP += 1; // pop dummy "return address"
SP += 2;
pop(AX);
}
Notes:
10a. Because the return address is stored on the stack, in principle we can change it. This is how a stack
smashing attack works.
10b. The last 3 instructions at the "end" of triple_label (cleanup locals, restore old BP, return) are
called the function's epilogue.
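Note 10a can be illustrated safely inside the simulated stack array, without touching any real return address. This is only a sketch under assumptions: the buffer size, the 7777 dummy return address and the names copy_unchecked and simulate_call are all made up for the example:

```cpp
#include <cassert>

const int STACK_CAPACITY = 1000;
int stack[STACK_CAPACITY];
int SP = STACK_CAPACITY;
int BP = STACK_CAPACITY;

void push(int value) { --SP; stack[SP] = value; }
void pop(int& result) { result = stack[SP]; ++SP; }

const int BUFFER_CELLS = 4;      // hypothetical size of the callee's local buffer
const int DUMMY_RETURN = 7777;   // stands in for a real return address

// Simulated callee: writes `count` cells into its local buffer WITHOUT a
// bounds check, then returns the "return address" it would jump to.
int copy_unchecked(int count) {
    assert(count <= BUFFER_CELLS + 2);  // keep the demo inside the array
    push(BP);                           // saved BP sits just above the buffer
    BP = SP;
    SP -= BUFFER_CELLS;                 // buffer is stack[BP - 4 .. BP - 1]
    for (int i = 0; i < count; ++i) {
        stack[SP + i] = 100 + i;        // writes move toward higher indexes
    }
    SP = BP;                            // epilogue
    pop(BP);
    int returnAddress;
    pop(returnAddress);                 // what RET would jump to
    return returnAddress;
}

int simulate_call(int count) {
    int callerBP = BP;
    push(DUMMY_RETURN);                 // what CALL would push
    int target = copy_unchecked(count);
    BP = callerBP;                      // the overflow may have clobbered the saved BP
    return target;
}
```

With 4 writes the function "returns" to 7777 as intended. With 6 writes, the fifth value lands on the saved BP and the sixth on the return address, so the function would jump to 105 instead.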

11. Assembly
Now let's look at real asm for myAlgo_withCalls. To do that in Visual Studio:

- set build platform to x86 (not x86_64)
- build type: Debug
- set a breakpoint somewhere inside myAlgo_withCalls
- run, and when execution stops at the breakpoint press Ctrl + Alt + D
One difference from our asm-like C++ is that asm's stack operates on bytes instead of ints. So to
reserve space for one int, SP will be decremented by 4 bytes.
Here we go (snippet #11.1, line numbers in comments are from the gist):
; 114: int myAlgo_withCalls(int a, int b) {
push ebp ; create stack frame
mov ebp,esp
; return address at (ebp + 4), `a` at (ebp + 8), `b` at (ebp + 12)

sub esp,0D8h ; reserve space for locals. Compiler can reserve more bytes than needed. 0D8h is hexadecimal == 216 decimal

push ebx ; cdecl requires to save all these registers


push esi
push edi

; fill all the space for local variables (from (ebp-0D8h) to (ebp)) with value 0CCCCCCCCh repeated 36h times (36h * 4 == 0D8h)
; see https://stackoverflow.com/q/3818856/264047
; I guess that's for ease of debugging, so that stack is filled with recognizable values
; 0CCCCCCCCh in binary is 110011001100...
lea edi,[ebp-0D8h]
mov ecx,36h
mov eax,0CCCCCCCCh
rep stos dword ptr es:[edi]

; 115: int t1 = triple(a);


mov eax,dword ptr [ebp+8] ; push parameter `a` on the stack
push eax

call triple (01A13E8h)


add esp,4 ; clean up param
mov dword ptr [ebp-8],eax ; copy result from eax to `t1`

; 116: int t2 = triple(b);


mov eax,dword ptr [ebp+0Ch] ; push `b` (0Ch == 12)
push eax

call triple (01A13E8h)


add esp,4
mov dword ptr [ebp-14h],eax ; t2 = eax

mov eax,dword ptr [ebp-8] ; calculate and store result in eax


sub eax,dword ptr [ebp-14h]

pop edi ; restore registers


pop esi
pop ebx

add esp,0D8h ; check we didn't mess up esp or ebp. this is only for debug builds
cmp ebp,esp
call __RTC_CheckEsp (01A116Dh)

mov esp,ebp ; destroy frame


pop ebp
ret
And asm for triple (snippet #11.2):
push ebp
mov ebp,esp
sub esp,0CCh
push ebx
push esi
push edi
lea edi,[ebp-0CCh]
mov ecx,33h
mov eax,0CCCCCCCCh
rep stos dword ptr es:[edi]
imul eax,dword ptr [ebp+8],3
mov dword ptr [ebp-8],eax
mov eax,dword ptr [ebp-8]
pop edi
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
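The byte offsets and sizes in the two listings above can be double-checked with a bit of compile-time arithmetic. This is just a sketch; param_offset and fill_bytes are hypothetical helper names, and it assumes 4-byte ints and the frame layout from the comments (saved ebp at [ebp], return address at [ebp + 4]):

```cpp
#include <cassert>

constexpr int INT_SIZE = 4;   // bytes per int and per stack slot in 32-bit code

// Offset from ebp of the index-th int parameter (0-based).
// [ebp] holds the saved ebp, [ebp + 4] the return address,
// so parameters start at [ebp + 8].
constexpr int param_offset(int index) {
    return 2 * INT_SIZE + index * INT_SIZE;
}

// Number of bytes written by `rep stos dword ptr` when ecx == dwordCount.
constexpr int fill_bytes(int dwordCount) {
    return dwordCount * INT_SIZE;
}

static_assert(param_offset(0) == 8, "`a` is at ebp + 8");
static_assert(param_offset(1) == 0x0C, "`b` is at ebp + 0Ch (12)");
static_assert(fill_bytes(0x36) == 0xD8, "myAlgo_withCalls: 36h dwords fill 0D8h bytes");
static_assert(0xD8 == 216, "0D8h is 216 decimal");
static_assert(fill_bytes(0x33) == 0xCC, "triple: 33h dwords fill 0CCh bytes");
```

If any of these relations were wrong the file would not compile, so the numbers in the asm comments are consistent with each other.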
Hope, after reading this post, assembly doesn't look as cryptic as before :)
Here are links from the post's body and some further reading:

- Eli Bendersky, Where the top of the stack is on x86 - top/bottom, push/pop, SP, stack frame,
  calling conventions
- Eli Bendersky, Stack frame layout on x86-64 - args passing on x64, stack frame, red zone
- University of Maryland, Understanding the Stack - a really well-written introduction to stack
  concepts. (It's for MIPS (not x86) and in GAS syntax, but this is insignificant for the topic). See
  other notes on MIPS ISA Programming if interested.
- x86 Asm wikibook, General-Purpose Registers
- x86 Disassembly wikibook, The Stack
- x86 Disassembly wikibook, Functions and Stack Frames
- Intel software developer's manuals - I expected it to be really hardcore, but surprisingly it's
  a pretty easy read (though the amount of information is overwhelming)
- Jonathan de Boyne Pollard, The gen on function perilogues - prologue/epilogue, stack
  frame/activation record, red zone
edited Sep 14, 2023 at 0:00 by Oliver Tušla
answered Feb 9, 2017 at 20:03 by Alexander Malakhov
 It was a long time ago that I asked this, that's a really great in depth answer. Thanks.
– handles
Jul 5, 2017 at 8:00
Why are you using the 16-bit names for registers in the early part of your answer? If you were talking about actual
16-bit code, [SP] isn't a valid 16-bit addressing mode. Probably best to use ESP. Also, if you declare SP as an int,
you should be modifying it by 4 for every element, not 1. (If you declared long *SP, then SP += 2 would increment
by 2 * sizeof(int), and thus remove 2 elements. But with int SP, that should be SP += 8, like add esp, 8 in 32-bit
asm.)
– Peter Cordes
Feb 6, 2018 at 14:07
 Fascinating! I think it's interesting that you try to explain assembly using C. I haven't seen that before. Neat. I might
suggest renaming "No local variables" as "How local variables work", or just "Local variables".
– Dave Dopson
Feb 8, 2018 at 5:59
 @PeterCordes the reason for 16-bit names (SP, BP) is clarity - SP easily translates to "stack pointer". If I use proper
32-bit names I would need to either explain the difference between 16/32/64 bit modes or leave it unexplained. My
intention was that someone who only knows Java or Python can follow the post without scratching head too much.
And I think memory addressing would only distract the reader. Plus, I've put wikibook link on the topic for the
curious and said couple words about ESP at the end of the post.
– Alexander Malakhov
Apr 10, 2018 at 17:11
 1
"To avoid that, we need an anchor index, which doesn't change when the stack grows." Need is the wrong word;
-fomit-frame-pointer has been the default in gcc and clang for years. People looking at real asm need to know that
EBP/RBP usually won't be used as a frame pointer. I'd say "traditionally, humans wanted an anchor that doesn't
change with push/pop, but compilers can keep track of changing offsets." Then you can update the section about
backtraces to say that's the legacy method, not used by default when DWARF .eh_frame metadata or Windows
x86-64 metadata is available.
– Peter Cordes
Apr 10, 2018 at 17:54
8
Regarding whether the stack is implemented in the hardware, this Wikipedia article might help.
Some processor families, such as the x86, have special instructions for manipulating the stack of the
currently executing thread. Other processor families, including PowerPC and MIPS, do not have
explicit stack support, but instead rely on convention and delegate stack management to the operating
system's Application Binary Interface (ABI).
That article and the others it links to might be useful to get a feel for stack usage in processors.
answered Feb 17, 2009 at 13:22 by Leaf Garland
5
I think the main answer you are looking for has already been hinted at.

When an x86 computer boots up, the stack is not set up. The programmer must explicitly set it up at
boot time. However, if you are already in an operating system, this has been taken care of. Below is a
code sample from a simple bootstrap program.

First the data and stack segment registers are set, and then the stack pointer is set to 0x4000 beyond
that.

movw $BOOT_SEGMENT, %ax
movw %ax, %ds
movw %ax, %ss
movw $0x4000, %ax
movw %ax, %sp

After this code the stack may be used. Now I am sure it can be done in a number of different ways,
but I think this should illustrate the idea.
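To see where that stack ends up in memory, here is a small arithmetic sketch. It assumes the bootstrap runs in 16-bit real mode, where a physical address is segment * 16 + offset. BOOT_SEGMENT is not defined in the snippet above, so the value 0x07C0 used in the note below is purely a hypothetical example, and physical_address is a made-up helper name:

```cpp
#include <cassert>
#include <cstdint>

// Real-mode address translation: the 16-bit segment is shifted left by
// 4 bits (multiplied by 16) and the 16-bit offset is added.
constexpr std::uint32_t physical_address(std::uint16_t segment,
                                         std::uint16_t offset) {
    return (static_cast<std::uint32_t>(segment) << 4) + offset;
}

// A 16-bit push first decrements SP by 2, then stores at SS:SP.
constexpr std::uint16_t sp_after_push(std::uint16_t sp) {
    return static_cast<std::uint16_t>(sp - 2);
}
```

With the hypothetical BOOT_SEGMENT of 0x07C0, the initial SS:SP of 0x07C0:0x4000 corresponds to physical address 0xBC00, and the first pushed word would be stored at offset 0x3FFE, i.e. physical address 0xBBFE.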
answered Feb 17, 2009 at 13:57 by Mr. Shickadance
5
The Concept
First think of the whole thing as if you were the person who invented it. Like this:

First think of an array and how it is implemented at the low level --> it is basically just a set of
contiguous memory locations (memory locations that are next to each other). Now that you have that
mental image in your head, think of the fact that you can access ANY of those memory locations and
delete it at your will as you remove or add data in your array. Now think of that same array but
instead of the possibility to delete any location you decide that you will delete only the LAST
location as you remove or add data in your array. Now your new idea to manipulate the data in that
array in that way is called LIFO which means Last In First Out. Your idea is very good because it
makes it easier to keep track of the content of that array without having to use a sorting algorithm
every time you remove something from it. Also, to know at all times what the address of the last
object in the array is, you dedicate one Register in the Cpu to keep track of it. Now, the way that
register keeps track of it is so that every time you remove or add something to your array you also
decrement or increment the value of the address in your register by the amount of objects you
removed or added from the array (by the amount of address space they occupied). You also want to
make sure that that amount by which you decrement or increment that register is fixed to one amount
(like 4 memory locations ie. 4 bytes) per object, again, to make it easier to keep track and also to
make it possible to use that register with some loop constructs because loops use fixed
incrementation per iteration (e.g. to loop through your array with a loop you construct the loop to
increment your register by 4 each iteration, which would not be possible if your array has objects of
different sizes in it). Lastly, you choose to call this new data structure a "Stack", because it reminds
you of a stack of plates in a restaurant where they always remove or add a plate on the top of that
stack.

The Implementation
As you can see, a stack is nothing more than an array of contiguous memory locations plus a rule
about how you manipulate it. Because of that, you don't even need the special
instructions and registers to control the stack. You can implement it yourself with the basic mov, add
and sub instructions, using general purpose registers instead of ESP and EBP, like this:

mov edx, 0FFFFFFFFh
; --> this will be the start address of your stack, furthest away from your code and data. It will also
; serve as the register that keeps track of the last object in the stack, as I explained earlier. You call it
; the "stack pointer", so you choose the register EDX to be what ESP is normally used for.
sub edx, 4
mov eax, dword ptr [someVar]
mov dword ptr [edx], eax
; --> these instructions decrement your stack pointer by 4 memory locations and copy the 4
; bytes starting at the [someVar] memory location to the memory location that EDX now points to
; (x86 mov cannot copy memory to memory directly, so the value goes through EAX), just
; like a PUSH instruction decrements ESP, only here you did it manually and you used EDX. So
; the PUSH instruction is basically just a shorter opcode that actually does this with ESP.
mov eax, dword ptr [edx]
add edx, 4
; --> and here we do the opposite: we first copy the 4 bytes starting at the memory location that EDX
; now points to into the register EAX (arbitrarily chosen here, we could have copied it anywhere we
; wanted). And then we increment our stack pointer EDX by 4 memory locations. This is what the
; POP instruction does.
Now you can see that the instructions PUSH and POP and the registers ESP and EBP were just added
by Intel to make the above concept of the "stack" data structure easier to write and read. There are
still some RISC (Reduced Instruction Set) CPUs that don't have the PUSH and POP instructions and
dedicated registers for stack manipulation, and while writing assembly programs for those CPUs you
have to implement the stack by yourself, just like I showed you.
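The same hand-made stack can be sketched in C++, with an ordinary pointer playing the role of EDX. This is only an illustration: the 64-byte region, the names region, sp, manual_push and manual_pop are all hypothetical, and there is no overflow or underflow checking:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t REGION_BYTES = 64;       // hypothetical stack size
alignas(4) unsigned char region[REGION_BYTES]; // the memory we use as a stack
unsigned char* sp = region + REGION_BYTES;     // plays EDX: starts at the high end

void manual_push(std::uint32_t value) {
    sp -= 4;                          // sub edx, 4
    std::memcpy(sp, &value, 4);       // mov dword ptr [edx], value
}

std::uint32_t manual_pop() {
    std::uint32_t value;
    std::memcpy(&value, sp, 4);       // mov eax, dword ptr [edx]
    sp += 4;                          // add edx, 4
    return value;
}
```

Pushing 1 and then 2 and popping twice gives 2 first and then 1, and sp ends up back where it started, exactly like paired PUSH and POP instructions would leave ESP.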
edited Feb 5, 2016 at 22:33
answered Feb 5, 2016 at 22:18 by Zod
4
The stack is just a way that programs and functions use memory.

The stack always confused me, so I made an illustration:


(svg version here)

- A push "attaches a new stalactite to the ceiling".
- A pop "pops off a stalactite".
Hope it's more helpful than confusing.

Feel free to use the SVG image (CC0 licensed).


edited Jan 10, 2021 at 0:06
answered Nov 10, 2013 at 12:42 by Alexander
3
You confuse an abstract stack and the hardware implemented stack. The latter is already
implemented.
answered Feb 17, 2009 at 13:19 by sharptooth
2
The stack is "implemented" by means of the stack pointer, which (assuming x86 architecture here)
points into the stack segment. Every time something is pushed on the stack (by means of pushl, call,
or a similar stack opcode), the stack pointer is decremented (the stack grows downwards, i.e. towards
smaller addresses) and the value is written to the address the stack pointer now points to. When you pop
something off the stack (popl, ret), the value is read from the address the stack pointer points to and
the stack pointer is then incremented.
In a user-space application, the stack is already set up for you when your application starts. In a
kernel-space environment, you have to set up the stack segment and the stack pointer first...
answered Feb 17, 2009 at 13:18 by DevSolar
2
The call stack is implemented by the x86 instruction set and the operating system.

Instructions like push and pop adjust the stack pointer while the operating system takes care of
allocating memory as the stack grows for each thread.

The fact that the x86 stack "grows down" from higher to lower addresses makes this architecture
more susceptible to the buffer overflow attack.
answered Feb 17, 2009 at 13:43 by Maurice Flanagan
 2
Why does the fact that the x86 stack grows down make it more susceptible to buffer overflows? Couldn't you get the
same overflow with an expand up segment?
– Nathan Fellman
Feb 17, 2009 at 14:08
 @nathan: only if you can get the application to allocate a negative amount of memory on the stack.
– Javier
Feb 17, 2009 at 14:09
 2
Buffer overflow attacks write past the end of a stack based array - char userName[256], this writes memory from
lower to higher which lets you overwrite things like the return address. If the stack grew in the same direction, you
would only be able to overwrite unallocated stack.
– Maurice Flanagan
Feb 17, 2009 at 14:18
 I've been reading that what distinguishes the stack from the heap is that the stack cannot grow in size, are you saying
that the contents of the stack grows, or the stack itself?
– Adam
Nov 5, 2023 at 14:05
1
The stack already exists, so you can rely on it when writing your code. The stack holds the
return addresses of functions, local variables, and the arguments passed between functions. There are
also dedicated stack registers built in, such as BP (Base Pointer) and SP (Stack Pointer), which is
why the built-in commands you mentioned exist. If the stack weren't already implemented, function
calls and returns, and hence normal code flow, couldn't work.
answered Feb 17, 2009 at 13:16
Gal Goldman
8,739, 11 gold badges, 46 silver badges, 45 bronze badges
1
I haven't seen the Gas assembler specifically, but in general the stack is "implemented" by
maintaining a reference to the location in memory where the top of the stack resides. The memory
location is stored in a register, which has different names for different architectures, but can be
thought of as the stack pointer register.

The pop and push commands are implemented for you in most architectures, typically by building on
microinstructions. However, some "educational architectures" require you to implement them yourself.
Functionally, push would be implemented somewhat like this on a stack that grows upwards:

1. load the address in the stack pointer register into a general-purpose register x
2. store data y at location x
3. increment the stack pointer register by the size of y

On a stack that grows downwards, such as x86's, the stack pointer is decremented instead, and on x86
this happens before the store.
Also, some architectures store the last used memory address as the stack pointer. Some store the next
available address.
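The two conventions are easier to compare side by side. The sketch below is a toy model in Python: memory is a plain list of word-sized slots, and both function names are made up for illustration. The first function follows the store-then-increment steps listed above, where the stack pointer holds the next available slot. The second follows the x86 style, where the stack pointer holds the last used slot and moves downwards.

```python
def push_empty_ascending(memory, sp, value):
    """SP holds the next available slot: store first, then move SP up."""
    memory[sp] = value
    return sp + 1


def push_full_descending(memory, sp, value):
    """SP holds the last used slot (x86 style): move SP down first, then store."""
    sp -= 1
    memory[sp] = value
    return sp


ascending = [None] * 4
descending = [None] * 4

sp_up = push_empty_ascending(ascending, 0, "a")     # starts on the first slot
sp_down = push_full_descending(descending, 4, "a")  # starts one past the last slot

print(sp_up, ascending)
print(sp_down, descending)
```

In both cases a pop simply undoes the push in the opposite order, which is why the convention has to be known before reading memory relative to the stack pointer.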
answered Feb 17, 2009 at 13:26
Charlie White
355, 1 gold badge, 6 silver badges, 16 bronze badges
1
I was looking into how the stack works in terms of function calls and found this blog. It is awesome:
it explains the concept of the stack from scratch, including how values are stored on the stack.
Now, to your question. I will explain with Python, but it should give you a good idea of how the stack
works in any language.
Here is the program:

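def hello(x):
    if x == 1:
        # base case: no further calls, so the call stack stops growing here
        return "op"
    else:
        u = 1
        e = 12
        # each recursive call gets its own frame with its own x, u, e and s
        s = hello(x - 1)
        e += 1
        print(s)
        print(x)
        u += 1
        return e

hello(3)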
Source: Cryptroix
Some of the topics the blog covers:

How Function work ?
Calling a Function
Functions In a Stack
What is Return Address
Stack
Stack Frame
Call Stack
Frame Pointer (FP) or Base Pointer (BP)
Stack Pointer (SP)
Allocation stack and deallocation of stack
StackoverFlow
What is Heap?
The blog explains all of this using Python, so take a look if you want.
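As a rough illustration of what happens to the call stack while the program above runs, the sketch below repeats the same recursion but keeps an explicit list of frames. It is only a model: hello_traced, call_stack and snapshots are names invented here, and real Python frames hold far more than one variable.

```python
def hello_traced(x, call_stack, snapshots):
    """Same recursion as hello(), with an explicit list standing in for the call stack."""
    call_stack.append({"function": "hello", "x": x})   # entering: push a frame
    snapshots.append([frame["x"] for frame in call_stack])
    if x == 1:
        result = "op"
    else:
        e = 12
        hello_traced(x - 1, call_stack, snapshots)
        e += 1
        result = e
    call_stack.pop()                                   # returning: pop the frame
    return result


call_stack, snapshots = [], []
result = hello_traced(3, call_stack, snapshots)
for snapshot in snapshots:
    print(snapshot)   # x values of the live frames, outermost first
print(result, call_stack)
```

Each snapshot is one frame deeper than the previous one, and the list is empty again once the outermost call has returned.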
answered Oct 18, 2016 at 9:57
user6932350
 Criptoix site is dead and there is no copy on web.archive.org
– Alexander Malakhov
Jan 18, 2017 at 18:50
 1
@AlexanderMalakhov Cryptroix was not working due to hosting issue. Cryptroix is up now and working.
– user6900888
Apr 7, 2017 at 11:45
0
You are correct that a stack is a data structure. Often, data structures (stacks included) you work with
are abstract and exist as a representation in memory.

The stack you are working with in this case has a more material existence: it maps directly to real
physical registers in the processor. As a data structure, stacks are FILO (first in, last out) structures
that ensure data is removed in the reverse order it was entered. See the StackOverflow logo for a
visual! ;)

You are working with the instruction stack. This is the stack of actual instructions you are feeding
the processor.
answered Feb 17, 2009 at 13:23
Dave Swersky
34.7k, 9 gold badges, 80 silver badges, 120 bronze badges
 wrong. this isn't an 'instruction stack' (is there such a thing?) this is simply a memory accessed via the Stack register.
used for temporary storage, procedure parameters, and (most important) return address for function calls
– Javier
Feb 17, 2009 at 14:07
0
You are correct that a stack is 'just' a data structure. Here, however, it refers to a
hardware-implemented stack used for a special purpose: "The Stack".

Many people have commented on the hardware-implemented stack versus the (software) stack data
structure. I would like to add that there are three major types of stack structure:
1. A call stack. This is the one you are asking about! It stores function parameters, return
addresses, etc. Do read Chapter 4, "All About Functions", in that book (see its 4th page, i.e.
page 53). There is a good explanation.
2. A generic stack, which you might use in your program to do something special...
3. A generic hardware stack.
I am not sure about this, but I remember reading somewhere that there is a general-purpose
hardware-implemented stack available in some architectures. If anyone knows whether this is
correct, please do comment.
The first thing to know is the architecture you are programming for, which the book explains (I just
looked it up). To really understand things, I suggest that you learn about the memory,
addressing, registers and architecture of x86 (I assume that's what you are learning from the book).
edited Feb 17, 2009 at 15:07
answered Feb 17, 2009 at 13:42
batbrat
5,183, 3 gold badges, 33 silver badges, 38 bronze badges
0
Calling functions, which requires saving and restoring local state in LIFO fashion (as opposed to say,
a generalized co-routine approach), turns out to be such an incredibly common need that assembly
languages and CPU architectures basically build this functionality in. The same could probably be
said for notions of threading, memory protection, security levels, etc. In theory you could implement
your own stack, calling conventions, etc., but I assume some opcodes and most existing runtimes rely
on this native concept of "stack".
answered Oct 21, 2011 at 4:59
aaron
1,098, 13 silver badges, 14 bronze badges
0
What is a stack? A stack is a type of data structure -- a means of storing information in a computer.
When a new object is entered in a stack, it is placed on top of all the previously entered objects. In
other words, the stack data structure is just like a stack of cards, papers, credit card mailings, or any
other real-world objects you can think of. When removing an object from a stack, the one on top gets
removed first. This method is referred to as LIFO (last in, first out).
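A minimal sketch of LIFO behaviour, using a Python list as the stack (append puts an item on top, pop takes the top item off). The card names are arbitrary examples.

```python
cards = []              # an empty stack
cards.append("ace")
cards.append("king")
cards.append("queen")   # "queen" is now on top

top = cards.pop()       # removes the most recently added item
print(top)
print(cards)            # the earlier items remain, in their original order
```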

The term "stack" can also be short for a network protocol stack. In networking, connections between
computers are made through a series of smaller connections. These connections, or layers, act like
the stack data structure, in that they are built and disposed of in the same way.
answered Dec 10, 2012 at 9:16
rahul soni
17, 1 bronze badge
0
The stack is a region of memory. It is used for passing inputs to and outputs from functions, and for
remembering each function's return address.
The esp register holds the address of the top of the stack.
The stack and esp are implemented in hardware. You can also implement a stack yourself, but it will
make your program very slow.
Example:

nop            ; esp = 0012ffc4
push 0         ; esp = 0012ffc0, dword [0012ffc0] = 00000000
call proc01    ; esp = 0012ffbc, dword [0012ffbc] = return address (old eip), eip = addr of proc01
pop eax        ; runs after proc01 returns: eax = dword [esp], esp = esp + 4
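The esp values in the example can be checked with a small Python trace. This is a sketch under one stated assumption: proc01 is not shown in the answer, so it is assumed here to end in a plain ret that pops only the return address and leaves the rest of the stack untouched. The function name trace_esp is made up for illustration.

```python
def trace_esp(start=0x0012FFC4):
    """Return (instruction, esp after it) pairs for the example sequence."""
    esp = start
    trace = [("nop", esp)]          # nop does not touch the stack

    esp -= 4                        # push 0: one 4-byte slot is used
    trace.append(("push 0", esp))

    esp -= 4                        # call: the return address is pushed
    trace.append(("call proc01", esp))

    esp += 4                        # ret (assumed): the return address is popped
    trace.append(("ret", esp))

    esp += 4                        # pop eax: the slot holding 0 is released
    trace.append(("pop eax", esp))
    return trace


for instruction, esp in trace_esp():
    print(f"{instruction:<12} esp = {esp:08x}")
```

Under that assumption the sequence is balanced: esp ends where it started.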
edited Jul 24, 2015 at 18:50
answered Jul 21, 2015 at 20:47
Amir