Decompilers¶
Decompilers do the impossible and reverse compiled code back into psuedocode/code.
IDA offers HexRays, which translates machine code into a higher language pseudocode.
Example Workflow¶
Let's say we are disassembling a program which has the source code:
#include <stdio.h> void printSpacer(int num){ for(int i = 0; i < num; ++i){ printf("-"); } printf("\n"); } int main() { char* string = "Hello, World!"; for(int i = 0; i < 13; ++i){ printf("%c", string[i]); for(int j = i+1; j < 13; j++){ printf("%c", string[j]); } printf("\n"); printSpacer(13 - i); } return 0; }
And creates an output of:
Hello, World! ------------- ello, World! ------------ llo, World! ----------- lo, World! ---------- o, World! --------- , World! -------- World! ------- World! ------ orld! ----- rld! ---- ld! --- d! -- ! -
If we are given a binary compiled from that source and we want to figure out how the source looks, we can use a decompiler to get c pseudocode which we can then use to reconstruct the function. The sample decompilation can look like:
printSpacer: int __fastcall printSpacer(int a1) { int i; // [rsp+8h] [rbp-8h] for ( i = 0; i < a1; ++i ) printf("-"); return printf("\n"); } main: int __cdecl main(int argc, const char **argv, const char **envp) { int v4; // [rsp+18h] [rbp-18h] signed int i; // [rsp+1Ch] [rbp-14h] for ( i = 0; i < 13; ++i ) { v4 = i + 1; printf("%c", (unsigned int)aHelloWorld[i], envp); while ( v4 < 13 ) printf("%c", (unsigned int)aHelloWorld[v4++]); printf("\n"); printSpacer(13 - i); } return 0; }
A good method of getting a good representation of the source is to convert the decompilation into Python since Python is basically psuedocode that runs. Starting with main often allows you to gain a good overview of what the program is doing and will help you translate the other functions.
Main¶
We know we will start with a main function and some variables, if you trace the execution of the variables, you can oftentimes determine the variable type. Because i is being used as an index, we know its an int, and because v4 used as one later on, it too is an index. We can also see that we have a variable aHelloWorld being printed with "%c", we can determine it represents the 'Hello, World!' string. Lets define all these variables in our Python main function:
def main(): string = "Hello, World!" i = 0 v4 = 0 for i in range(0, 13): v4 = i + 1 print(string[i], end='') while v4 < 13: print(string[v4], end='') v4 += 1 print() printSpacer(13-i)
printSpacer Function¶
Now we can see that printSpacer is clearly being fed an int value. Translating it into python shouldn't be too hard.
def printSpacer(number): i = 0 for i in range(0, number): print("-", end='') print()
Results¶
Running main() gives us:
Hello, World! ------------- ello, World! ------------ llo, World! ----------- lo, World! ---------- o, World! --------- , World! -------- World! ------- World! ------ orld! ----- rld! ---- ld! --- d! -- ! -