The Stack
In computer architecture, the stack is a hardware manifestation of the stack data structure (a Last In, First Out queue).
In x86, the stack is simply an area of RAM that was chosen to be the stack; there is no special hardware to store stack contents. The esp/rsp register holds the address of the most recently pushed value, which is the lowest address currently in use by the stack. When something is pushed to the stack, esp is decremented by 4 (or rsp by 8 on 64-bit x86), and the value that was pushed is stored at that location in memory. Likewise, when a pop instruction is executed, the value at esp is retrieved (i.e. esp is dereferenced), and esp is then incremented by 4 (or 8).
N.B. The stack "grows" down to lower memory addresses!
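The push and pop behaviour described above can be sketched in C as a toy model. This is only an illustration, not how a CPU is implemented: the names ToyStack, toy_push and toy_pop, the region size, and the address STACK_TOP are all made up for the example, and only 4-byte (32-bit) pushes are modelled.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64u          /* hypothetical region size, in 4-byte words */
#define STACK_TOP   0xffff0040u  /* hypothetical address just past the region */
#define STACK_LOW   (STACK_TOP - 4u * STACK_WORDS)

typedef struct {
    uint32_t esp;                /* address of the most recently pushed value */
    uint32_t mem[STACK_WORDS];   /* ordinary memory backing the stack */
} ToyStack;

/* Translate a simulated address into a pointer into mem. */
static uint32_t *toy_slot(ToyStack *s, uint32_t addr) {
    assert(addr >= STACK_LOW && addr < STACK_TOP && addr % 4u == 0);
    return &s->mem[(addr - STACK_LOW) / 4u];
}

/* An empty stack: esp sits just past the highest usable slot. */
static void toy_init(ToyStack *s) {
    s->esp = STACK_TOP;
}

/* push: decrement esp first, then store at the new esp. */
static void toy_push(ToyStack *s, uint32_t value) {
    s->esp -= 4u;
    *toy_slot(s, s->esp) = value;
}

/* pop: load from esp first, then increment esp. */
static uint32_t toy_pop(ToyStack *s) {
    uint32_t value = *toy_slot(s, s->esp);
    s->esp += 4u;
    return value;
}
```

Pushing two values with toy_push and then calling toy_pop twice returns them in reverse order, and esp ends up back at STACK_TOP. The asserts in toy_slot stop the model from popping an empty stack or running off the end of the region.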
Conventionally, ebp/rbp contains the address at which the current stack frame begins (the slot holding the caller's saved ebp), so local variables are sometimes referenced as an offset relative to ebp rather than an offset to esp. A stack frame is essentially just the space used on the stack by a given function.
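As a rough sketch of ebp-relative addressing, the helpers below compute where things sit relative to ebp in a 32-bit frame. They assume the function starts with push ebp followed by mov ebp, esp, and that every argument and local occupies one 4-byte slot. The function names are hypothetical, and a real compiler is free to place locals at whatever offsets it likes.

```c
#include <assert.h>
#include <stdint.h>

/* [ebp]: the caller's saved ebp. */
static uint32_t saved_ebp_addr(uint32_t ebp) {
    return ebp;
}

/* [ebp+4]: the return address pushed by the call instruction. */
static uint32_t return_addr_addr(uint32_t ebp) {
    return ebp + 4u;
}

/* [ebp+8], [ebp+12], ...: arguments, index 0 being the first one. */
static uint32_t arg_addr(uint32_t ebp, uint32_t index) {
    return ebp + 8u + 4u * index;
}

/* [ebp-4], [ebp-8], ...: 4-byte slots below ebp that may hold locals. */
static uint32_t local_slot_addr(uint32_t ebp, uint32_t index) {
    return ebp - 4u * (index + 1u);
}
```

For example, arg_addr(ebp, 0) corresponds to an operand such as [ebp+0x8], and local_slot_addr(ebp, 2) corresponds to [ebp-0xc].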
Uses
The stack is primarily used for a few things:
- Storing function arguments
- Storing local variables
- Storing processor state between function calls
Example
Let's see what the stack looks like right after say_hi has been called in this 32-bit x86 C program:
#include <stdio.h>

void say_hi(const char * name) {
    printf("Hello %s!\n", name);
}

int main(int argc, char ** argv) {
    char * name;
    if (argc != 2) {
        return 1;
    }
    name = argv[1];
    say_hi(name);
    return 0;
}
And the relevant assembly:
0804840b <say_hi>:
804840b: 55 push ebp
804840c: 89 e5 mov ebp,esp
804840e: 83 ec 08 sub esp,0x8
8048411: 83 ec 08 sub esp,0x8
8048414: ff 75 08 push DWORD PTR [ebp+0x8]
8048417: 68 f0 84 04 08 push 0x80484f0
804841c: e8 bf fe ff ff call 80482e0 <printf@plt>
8048421: 83 c4 10 add esp,0x10
8048424: 90 nop
8048425: c9 leave
8048426: c3 ret
08048427 <main>:
8048427: 8d 4c 24 04 lea ecx,[esp+0x4]
804842b: 83 e4 f0 and esp,0xfffffff0
804842e: ff 71 fc push DWORD PTR [ecx-0x4]
8048431: 55 push ebp
8048432: 89 e5 mov ebp,esp
8048434: 51 push ecx
8048435: 83 ec 14 sub esp,0x14
8048438: 89 c8 mov eax,ecx
804843a: 83 38 02 cmp DWORD PTR [eax],0x2
804843d: 74 07 je 8048446 <main+0x1f>
804843f: b8 01 00 00 00 mov eax,0x1
8048444: eb 1c jmp 8048462 <main+0x3b>
8048446: 8b 40 04 mov eax,DWORD PTR [eax+0x4]
8048449: 8b 40 04 mov eax,DWORD PTR [eax+0x4]
804844c: 89 45 f4 mov DWORD PTR [ebp-0xc],eax
804844f: 83 ec 0c sub esp,0xc
8048452: ff 75 f4 push DWORD PTR [ebp-0xc]
8048455: e8 b1 ff ff ff call 804840b <say_hi>
804845a: 83 c4 10 add esp,0x10
804845d: b8 00 00 00 00 mov eax,0x0
8048462: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4]
8048465: c9 leave
8048466: 8d 61 fc lea esp,[ecx-0x4]
8048469: c3 ret
Skipping over the bulk of main, you'll see that at 0x8048452 main's name local is pushed to the stack because it's the first argument to say_hi. Then, a call instruction is executed. call instructions first push the address of the instruction that follows them (the return address) to the stack, then jump to their destination. So when the processor begins executing say_hi at 0x0804840b, the stack looks like this:
EIP = 0x0804840b (push ebp)
ESP = 0xffff0000
EBP = 0xffff002c
0xffff0004: 0xffffa0a0 // say_hi argument 1
ESP -> 0xffff0000: 0x0804845a // Return address for say_hi
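The effect of call on eip, esp and memory can be sketched as follows. This is a toy model under stated assumptions: the Cpu struct, the helpers at, push32 and do_call, and the simulated address window (BASE, WORDS) are invented for illustration, and the instruction length is passed in by hand rather than decoded.

```c
#include <assert.h>
#include <stdint.h>

#define BASE  0xfffeff00u  /* hypothetical lowest simulated address */
#define WORDS 128u         /* size of the simulated window, in 4-byte words */

typedef struct {
    uint32_t eip, esp;
    uint32_t mem[WORDS];
} Cpu;

/* Translate a simulated address into a pointer into mem. */
static uint32_t *at(Cpu *c, uint32_t addr) {
    assert(addr >= BASE && addr % 4u == 0 && (addr - BASE) / 4u < WORDS);
    return &c->mem[(addr - BASE) / 4u];
}

static void push32(Cpu *c, uint32_t value) {
    c->esp -= 4u;
    *at(c, c->esp) = value;
}

/* call: push the address of the following instruction, then jump. */
static void do_call(Cpu *c, uint32_t insn_len, uint32_t target) {
    push32(c, c->eip + insn_len);  /* return address */
    c->eip = target;
}
```

With eip at 0x08048455 (the call in main, which is 5 bytes long) and esp at 0xffff0004, do_call(&c, 5, 0x0804840b) leaves 0x0804845a at 0xffff0000 and moves eip to say_hi, matching the listing above.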
The first thing say_hi does is save the current ebp so that when it returns, ebp is back where main expects it to be. The stack now looks like this:
EIP = 0x0804840c (mov ebp, esp)
ESP = 0xfffefffc
EBP = 0xffff002c
0xffff0004: 0xffffa0a0 // say_hi argument 1
0xffff0000: 0x0804845a // Return address for say_hi
ESP -> 0xfffefffc: 0xffff002c // Saved EBP
Again, note how esp gets smaller when values are pushed to the stack.
Next, the current esp is copied into ebp, marking the start of the new stack frame.
EIP = 0x0804840e (sub esp, 0x8)
ESP = 0xfffefffc
EBP = 0xfffefffc
0xffff0004: 0xffffa0a0 // say_hi argument 1
0xffff0000: 0x0804845a // Return address for say_hi
ESP, EBP -> 0xfffefffc: 0xffff002c // Saved EBP
Then, the two sub instructions "grow" the stack by 0x10 bytes in total to reserve space for say_hi.
EIP = 0x08048414 (push [ebp + 0x8])
ESP = 0xfffeffec // sub x2
EBP = 0xfffefffc
0xffff0004: 0xffffa0a0 // say_hi argument 1
0xffff0000: 0x0804845a // Return address for say_hi
EBP -> 0xfffefffc: 0xffff002c // Saved EBP
0xfffefff8: UNDEFINED
0xfffefff4: UNDEFINED
0xfffefff0: UNDEFINED
ESP -> 0xfffeffec: UNDEFINED
NOTE: stack space is not implicitly cleared!
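The three prologue steps (push ebp, mov ebp, esp, sub esp, N) and the fact that reserved space keeps its old contents can be sketched with a toy model. The Cpu struct, the helpers at, fill and prologue, the address window, and the 0xaaaaaaaa filler pattern standing in for leftover data are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BASE  0xfffeff00u  /* hypothetical lowest simulated address */
#define WORDS 128u         /* size of the simulated window, in 4-byte words */

typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[WORDS];
} Cpu;

/* Translate a simulated address into a pointer into mem. */
static uint32_t *at(Cpu *c, uint32_t addr) {
    assert(addr >= BASE && addr % 4u == 0 && (addr - BASE) / 4u < WORDS);
    return &c->mem[(addr - BASE) / 4u];
}

/* Fill memory with a recognisable pattern standing in for old data. */
static void fill(Cpu *c, uint32_t pattern) {
    for (uint32_t i = 0; i < WORDS; i++) {
        c->mem[i] = pattern;
    }
}

static void prologue(Cpu *c, uint32_t reserve_bytes) {
    c->esp -= 4u;               /* push ebp: make room ...            */
    *at(c, c->esp) = c->ebp;    /* ... and store the caller's ebp     */
    c->ebp = c->esp;            /* mov ebp, esp                       */
    c->esp -= reserve_bytes;    /* sub esp, N: only esp changes here  */
}
```

Starting from esp = 0xffff0000 and ebp = 0xffff002c, prologue(&c, 0x10) ends with ebp = 0xfffefffc and esp = 0xfffeffec, and the four reserved slots still hold the filler pattern because sub writes nothing to memory.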
Now, the 2 arguments to printf are pushed in reverse order.
EIP = 0x0804841c (call printf@plt)
ESP = 0xfffeffe4
EBP = 0xfffefffc
0xffff0004: 0xffffa0a0 // say_hi argument 1
0xffff0000: 0x0804845a // Return address for say_hi
EBP -> 0xfffefffc: 0xffff002c // Saved EBP
0xfffefff8: UNDEFINED
0xfffefff4: UNDEFINED
0xfffefff0: UNDEFINED
0xfffeffec: UNDEFINED
0xfffeffe8: 0xffffa0a0 // printf argument 2
ESP -> 0xfffeffe4: 0x080484f0 // printf argument 1
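A small sketch of why the arguments are pushed in reverse: pushing the last argument first leaves the first argument at the lowest address, right where the callee expects it. The Cpu struct, the helpers at, push32 and push_args_reversed, and the address window are hypothetical names for this toy model, which assumes every argument is 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE  0xfffeff00u  /* hypothetical lowest simulated address */
#define WORDS 128u         /* size of the simulated window, in 4-byte words */

typedef struct {
    uint32_t esp;
    uint32_t mem[WORDS];
} Cpu;

/* Translate a simulated address into a pointer into mem. */
static uint32_t *at(Cpu *c, uint32_t addr) {
    assert(addr >= BASE && addr % 4u == 0 && (addr - BASE) / 4u < WORDS);
    return &c->mem[(addr - BASE) / 4u];
}

static void push32(Cpu *c, uint32_t value) {
    c->esp -= 4u;
    *at(c, c->esp) = value;
}

/* Push args[count-1] first and args[0] last, so args[0] ends up at [esp]. */
static void push_args_reversed(Cpu *c, const uint32_t *args, size_t count) {
    for (size_t i = count; i > 0; i--) {
        push32(c, args[i - 1]);
    }
}
```

With esp at 0xfffeffec and the argument list { 0x080484f0, 0xffffa0a0 } (format string, then name), the format string lands at [esp] = 0xfffeffe4 and name at [esp+4], as in the listing above.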
Finally, printf is called; the call instruction pushes the address of the instruction that will execute once printf returns.
EIP = 0x080482e0
ESP = 0xfffeffe0
EBP = 0xfffefffc
0xffff0004: 0xffffa0a0 // say_hi argument 1
0xffff0000: 0x0804845a // Return address for say_hi
EBP -> 0xfffefffc: 0xffff002c // Saved EBP
0xfffefff8: UNDEFINED
0xfffefff4: UNDEFINED
0xfffefff0: UNDEFINED
0xfffeffec: UNDEFINED
0xfffeffe8: 0xffffa0a0 // printf argument 2
0xfffeffe4: 0x080484f0 // printf argument 1
ESP -> 0xfffeffe0: 0x08048421 // Return address for printf
Once printf has returned, add esp, 0x10 releases 0x10 bytes of stack (printf's arguments plus padding). The leave instruction then moves ebp into esp and pops the saved EBP.
EIP = 0x08048426 (ret)
ESP = 0xffff0000
EBP = 0xffff002c
0xffff0004: 0xffffa0a0 // say_hi argument 1
ESP -> 0xffff0000: 0x0804845a // Return address for say_hi
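The leave instruction behaves like mov esp, ebp followed by pop ebp, which the toy model below sketches. The Cpu struct, the helpers at, pop32 and do_leave, and the address window are made up for illustration. The starting value esp = 0xfffefff4 used in the note after the code is derived from the listing (0xfffeffe4 after printf returns, plus 0x10 from add esp, 0x10).

```c
#include <assert.h>
#include <stdint.h>

#define BASE  0xfffeff00u  /* hypothetical lowest simulated address */
#define WORDS 128u         /* size of the simulated window, in 4-byte words */

typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[WORDS];
} Cpu;

/* Translate a simulated address into a pointer into mem. */
static uint32_t *at(Cpu *c, uint32_t addr) {
    assert(addr >= BASE && addr % 4u == 0 && (addr - BASE) / 4u < WORDS);
    return &c->mem[(addr - BASE) / 4u];
}

static uint32_t pop32(Cpu *c) {
    uint32_t value = *at(c, c->esp);
    c->esp += 4u;
    return value;
}

/* leave: discard the whole frame, then restore the caller's ebp. */
static void do_leave(Cpu *c) {
    c->esp = c->ebp;     /* mov esp, ebp */
    c->ebp = pop32(c);   /* pop ebp      */
}
```

Starting from esp = 0xfffefff4 and ebp = 0xfffefffc with the saved EBP 0xffff002c stored at 0xfffefffc, do_leave ends with esp = 0xffff0000 and ebp = 0xffff002c. Note that leave does not care where esp was beforehand: everything below ebp is discarded in one step.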
And finally, ret pops the saved instruction pointer into eip, which causes the program to return to main with the same esp, ebp, and stack contents as immediately before the call instruction was executed.
EIP = 0x0804845a (add esp, 0x10)
ESP = 0xffff0004
EBP = 0xffff002c
ESP -> 0xffff0004: 0xffffa0a0 // say_hi argument 1
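As a final sketch, ret is simply a pop into eip, and the add esp, 0x10 that main executes next is how the caller releases the argument it pushed (plus the padding reserved by sub esp, 0xc). The Cpu struct, the helpers at, pop32, do_ret and caller_cleanup, and the address window are hypothetical names for this toy model.

```c
#include <assert.h>
#include <stdint.h>

#define BASE  0xfffeff00u  /* hypothetical lowest simulated address */
#define WORDS 128u         /* size of the simulated window, in 4-byte words */

typedef struct {
    uint32_t eip, esp;
    uint32_t mem[WORDS];
} Cpu;

/* Translate a simulated address into a pointer into mem. */
static uint32_t *at(Cpu *c, uint32_t addr) {
    assert(addr >= BASE && addr % 4u == 0 && (addr - BASE) / 4u < WORDS);
    return &c->mem[(addr - BASE) / 4u];
}

static uint32_t pop32(Cpu *c) {
    uint32_t value = *at(c, c->esp);
    c->esp += 4u;
    return value;
}

/* ret: pop the return address straight into eip. */
static void do_ret(Cpu *c) {
    c->eip = pop32(c);
}

/* add esp, N in the caller: release arguments and padding. */
static void caller_cleanup(Cpu *c, uint32_t bytes) {
    c->esp += bytes;
}
```

With esp = 0xffff0000 pointing at the return address 0x0804845a, do_ret sets eip to 0x0804845a and esp to 0xffff0004. A following caller_cleanup(&c, 0x10) moves esp to 0xffff0014, which is where it was before main began setting up the call.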