Buffer Overflow

A Buffer Overflow is a vulnerability in which data can be written which exceeds the allocated space, allowing an attacker to overwrite other data.

Stack buffer overflow

The simplest and most common buffer overflow is one where the buffer is on the stack. Let's look at an example.

#include <stdio.h>

int main() {
    int secret = 0xdeadbeef;
    char name[100] = {0};
    read(0, name, 0x100);
    if (secret == 0x1337) {
        puts("Wow! Here's a secret.");
    } else {
        puts("I guess you're not cool enough to see my secret");
    }
}

There's a tiny mistake in this program which will allow us to see the secret. name is decimal 100 bytes, however we're reading in hex 100 bytes (=256 decimal bytes)! Let's see how we can use this to our advantage.

If the compiler chose to layout the stack like this:

        0xffff006c: 0xf7f7f7f7  // Saved EIP
        0xffff0068: 0xffff0100  // Saved EBP
        0xffff0064: 0xdeadbeef  // secret
...
        0xffff0004: 0x0
ESP ->  0xffff0000: 0x0         // name

let's look at what happens when we read in 0x100 bytes of 'A's.

The first decimal 100 bytes are saved properly:

        0xffff006c: 0xf7f7f7f7  // Saved EIP
        0xffff0068: 0xffff0100  // Saved EBP
        0xffff0064: 0xdeadbeef  // secret
...
        0xffff0004: 0x41414141
ESP ->  0xffff0000: 0x41414141  // name

However when the 101st byte is read in, we see an issue:

        0xffff006c: 0xf7f7f7f7  // Saved EIP
        0xffff0068: 0xffff0100  // Saved EBP
        0xffff0064: 0xdeadbe41  // secret
...
        0xffff0004: 0x41414141
ESP ->  0xffff0000: 0x41414141  // name

The least significant byte of secret has been overwritten! If we follow the next 3 bytes to be read in, we'll see the entirety of secret is "clobbered" with our 'A's

        0xffff006c: 0xf7f7f7f7  // Saved EIP
        0xffff0068: 0xffff0100  // Saved EBP
        0xffff0064: 0x41414141  // secret
...
        0xffff0004: 0x41414141
ESP ->  0xffff0000: 0x41414141  // name

The remaining 152 bytes would continue clobbering values up the stack.

Passing an impossible check

How can we use this to pass the seemingly impossible check in the original program? Well, if we carefully line up our input so that the bytes that overwrite secret happen to be the bytes that represent 0x1337 in little-endian, we'll see the secret message.

A small Python one-liner will work nicely: python -c "print 'A'*100 + '\x31\x13\x00\x00'"

This will fill the name buffer with 100 'A's, then overwrite secret with the 32-bit little-endian encoding of 0x1337.

Going one step further

As discussed on the stack page, the instruction that the current function should jump to when it is done is also saved on the stack (denoted as "Saved EIP" in the above stack diagrams). If we can overwrite this, we can control where the program jumps after main finishes running, giving us the ability to control what the program does entirely.

Usually, the end objective in binary exploitation is to get a shell (often called "popping a shell") on the remote computer. The shell provides us with an easy way to run anything we want on the target computer.

Say there happens to be a nice function that does this defined somewhere else in the program that we normally can't get to:

void give_shell() {
    system("/bin/sh");
}

Well with our buffer overflow knowledge, now we can! All we have to do is overwrite the saved EIP on the stack to the address where give_shell is. Then, when main returns, it will pop that address off of the stack and jump to it, running give_shell, and giving us our shell.

Assuming give_shell is at 0x08048fd0, we could use something like this: python -c "print 'A'*108 + '\xd0\x8f\x04\x08'"

We send 108 'A's to overwrite the 100 bytes that is allocated for name, the 4 bytes for secret, and the 4 bytes for the saved EBP. Then we simply send the little-endian form of give_shell's address, and we would get a shell!

This idea is extended on in Return Oriented Programming