[Tutorial] Advanced C
Posted: Sat Jan 14, 2012 12:08 pm
Introduction
Please note that this tutorial is based on a 32-bit system. If you're running a 64-bit system, consider adding -m32 to the compiler options to get code similar to what we're discussing.
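If you're not sure whether -m32 actually took effect, a quick check is to look at the pointer size from inside the program. This is only a minimal sketch; the function names (pointer_width_bits, is_32bit_build, require_32bit_build) are made up for illustration and are not part of the tutorial's code.

```c
#include <assert.h>
#include <limits.h>

/* Width of a data pointer in bits for the current build:
   32 when built with -m32, usually 64 on a native 64-bit build. */
int pointer_width_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Returns 1 if this build matches the 32-bit setup the tutorial assumes. */
int is_32bit_build(void)
{
    return pointer_width_bits() == 32;
}

/* Hypothetical guard: aborts (unless NDEBUG is defined) when the build
   is not 32-bit, so mismatched listings are caught early. */
void require_32bit_build(void)
{
    assert(is_32bit_build() && "rebuild with -m32");
}
```

Call require_32bit_build() at the top of your own main() if you want the program to stop right away when it was built as 64-bit.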
Tools I'll be using:
- GNU's GCC
- GNU's ObjDump
- GNU's GDB
On a Debian-based system, this installs everything:
Code:
# apt-get install build-essential gdb
Behind the scenes
Before I start showing some advanced C techniques, I want you to take your time and first learn how a computer executes a program at a low level. Without this knowledge you can hardly understand these techniques fully.
So let's take this simple C program:
Code:
int main()
{
    int a = 4;
    int b = 13;
    int c = a + b;
    return 0;
}
Please note that there's no #include directive.
Save it as example00.c and compile it (with debugging info on, the -g switch):
Code:
$ cc -g -o example00 example00.c
As you surely noticed, this program is useless. But our purpose is to see what it looks like behind the scenes. I'll use objdump to disassemble example00 and grep to keep 20 lines of main()'s code:
Code:
$ objdump -D example00 | grep -A20 main.:
08048394 <main>:
8048394: 55 push %ebp
8048395: 89 e5 mov %esp,%ebp
8048397: 83 ec 10 sub $0x10,%esp
804839a: c7 45 f4 04 00 00 00 movl $0x4,-0xc(%ebp)
80483a1: c7 45 f8 0d 00 00 00 movl $0xd,-0x8(%ebp)
80483a8: 8b 45 f8 mov -0x8(%ebp),%eax
80483ab: 8b 55 f4 mov -0xc(%ebp),%edx
80483ae: 8d 04 02 lea (%edx,%eax,1),%eax
80483b1: 89 45 fc mov %eax,-0x4(%ebp)
80483b4: b8 00 00 00 00 mov $0x0,%eax
80483b9: c9 leave
80483ba: c3 ret
80483bb: 90 nop
80483bc: 90 nop
80483bd: 90 nop
80483be: 90 nop
80483bf: 90 nop
080483c0 <__libc_csu_fini>:
80483c0: 55 push %ebp
Let's explain what the columns mean (after 08048394 <main>:):
- Address of the instruction's first byte
- Machine code for the instruction (what the computer actually sees)
- Disassembly of the instruction (a direct translation of what the computer actually sees)
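To make the link between the second and third columns concrete, here is a small sketch that decodes the bytes c7 45 f4 04 00 00 00 from the listing back into "store 4 at EBP - 12". It only handles this one 7-byte encoding (opcode c7, ModRM byte 45, an 8-bit displacement, then a 32-bit little-endian immediate), so it is nowhere near a real disassembler. The names disp8, imm32_le, struct ebp_store and decode_movl_imm_ebp are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret one byte as a signed 8-bit displacement (two's complement),
   e.g. 0xf4 becomes -12. */
int disp8(uint8_t byte)
{
    return byte < 0x80 ? (int)byte : (int)byte - 0x100;
}

/* Read a 32-bit little-endian immediate from four consecutive bytes. */
uint32_t imm32_le(const uint8_t *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Hypothetical result type: "store value at EBP + offset". */
struct ebp_store {
    int offset;      /* signed displacement from EBP */
    uint32_t value;  /* immediate written to that slot */
};

/* Decode the form "c7 45 <disp8> <imm32>", i.e. movl $imm32,disp8(%ebp).
   Returns 0 on success, -1 if the bytes are not in that form. */
int decode_movl_imm_ebp(const uint8_t *code, size_t len, struct ebp_store *out)
{
    assert(code != NULL && out != NULL);
    if (len < 7 || code[0] != 0xc7 || code[1] != 0x45)
        return -1;
    out->offset = disp8(code[2]);
    out->value = imm32_le(&code[3]);
    return 0;
}
```

Feeding it the seven bytes at 804839a gives offset -12 and value 4, which is exactly what the disassembly column says: movl $0x4,-0xc(%ebp).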
Now a tour of what this is doing:
Code:
$ objdump -D example00 | grep -A12 main.: | cut -f 3
push %ebp
mov %esp,%ebp
sub $0x10,%esp
movl $0x4,-0xc(%ebp)
movl $0xd,-0x8(%ebp)
mov -0x8(%ebp),%eax
mov -0xc(%ebp),%edx
lea (%edx,%eax,1),%eax
mov %eax,-0x4(%ebp)
mov $0x0,%eax
leave
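ret
Code:
push %ebp
Saves the EBP register onto the stack.
Code:
mov %esp,%ebp
EBP = ESP. EBP now marks the base of main()'s stack frame.
Code:
sub $0x10,%esp
ESP = ESP - 0x10, which reserves 16 bytes on the stack for local variables.
Code:
movl $0x4,-0xc(%ebp)
Stores 4 at address EBP - 12, so the variable "a" in our program lives at EBP - 12.
Code:
movl $0xd,-0x8(%ebp)
Stores 13 at address EBP - 8, so the variable "b" in our program lives at EBP - 8.
Code:
mov -0x8(%ebp),%eax
Copies the value at EBP - 8 (b) into EAX.
Code:
mov -0xc(%ebp),%edx
Copies the value at EBP - 12 (a) into EDX.
Code:
lea (%edx,%eax,1),%eax
A tricky way to do EAX = EDX + EAX: LEA computes the address EDX + EAX*1 without touching memory, and the compiler uses that address arithmetic as a plain addition.
Code:
mov %eax,-0x4(%ebp)
Stores EAX at address EBP - 4, so the variable "c" in our program lives at EBP - 4.
Code:
mov $0x0,%eax
EAX = 0. This is "return 0" in our code; EAX holds a function's return value.
Code:
leave
Restores the stack: ESP = EBP, then the saved EBP is popped back.
Code:
ret
Returns from the function (to the system's startup code in this case).
So now let's take a tour executing this program using GDB, just to see how it works and to get our hands dirty with the all-powerful GDB.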
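Before starting the debugger, it can help to replay the instructions from the walk-through in plain C. The sketch below models four registers and a small chunk of stack memory, then performs each step in order. It is a toy model under loud assumptions: STACK_TOP and CALLER_EBP are arbitrary made-up values, ret is not modeled at all, and every name here (struct machine, lea, run_main_model and so on) is hypothetical. It is not how GCC or the CPU is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u
#define STACK_TOP   0x1000u  /* assumed ESP on entry to main() */
#define CALLER_EBP  0x2000u  /* assumed EBP of whoever called main() */

struct machine {
    uint32_t eax, edx, ebp, esp;
    uint8_t stack[STACK_BYTES];  /* addresses STACK_TOP-64 .. STACK_TOP-1 */
};

/* What we read back after the model has run. */
struct frame_result {
    uint32_t a, b, c;        /* values found at EBP-12, EBP-8, EBP-4 */
    uint32_t eax, esp, ebp;  /* registers just before ret */
};

/* Map a simulated address to a location inside the stack array. */
static uint8_t *slot(struct machine *m, uint32_t addr)
{
    assert(addr >= STACK_TOP - STACK_BYTES && addr + 4 <= STACK_TOP);
    return &m->stack[addr - (STACK_TOP - STACK_BYTES)];
}

static void store32(struct machine *m, uint32_t addr, uint32_t value)
{
    memcpy(slot(m, addr), &value, sizeof value);
}

static uint32_t load32(struct machine *m, uint32_t addr)
{
    uint32_t value;
    memcpy(&value, slot(m, addr), sizeof value);
    return value;
}

/* lea (base,index,scale): pure address arithmetic, no memory access. */
uint32_t lea(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}

struct frame_result run_main_model(void)
{
    struct machine m;
    struct frame_result r;

    memset(&m, 0, sizeof m);
    m.esp = STACK_TOP;
    m.ebp = CALLER_EBP;

    m.esp -= 4; store32(&m, m.esp, m.ebp);   /* push %ebp */
    m.ebp = m.esp;                           /* mov %esp,%ebp */
    m.esp -= 0x10;                           /* sub $0x10,%esp (16 bytes) */
    store32(&m, m.ebp - 0xc, 0x4);           /* movl $0x4,-0xc(%ebp) */
    store32(&m, m.ebp - 0x8, 0xd);           /* movl $0xd,-0x8(%ebp) */
    m.eax = load32(&m, m.ebp - 0x8);         /* mov -0x8(%ebp),%eax */
    m.edx = load32(&m, m.ebp - 0xc);         /* mov -0xc(%ebp),%edx */
    m.eax = lea(m.edx, m.eax, 1);            /* lea (%edx,%eax,1),%eax */
    store32(&m, m.ebp - 0x4, m.eax);         /* mov %eax,-0x4(%ebp) */

    /* Snapshot the locals while the frame still exists. */
    r.a = load32(&m, m.ebp - 0xc);
    r.b = load32(&m, m.ebp - 0x8);
    r.c = load32(&m, m.ebp - 0x4);

    m.eax = 0;                               /* mov $0x0,%eax */
    m.esp = m.ebp;                           /* leave: mov %ebp,%esp ... */
    m.ebp = load32(&m, m.esp); m.esp += 4;   /* ... then pop %ebp */

    r.eax = m.eax;
    r.esp = m.esp;
    r.ebp = m.ebp;
    return r;
}
```

Calling run_main_model() from your own main() should show the same picture as the walk-through: the three locals sit at EBP - 12, EBP - 8 and EBP - 4, EAX ends up holding the return value, and leave puts ESP and EBP back where they were on entry.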
Start GDB on the executable (-q just suppresses the startup banner):
Code:
$ gdb -q example00
Reading symbols from /home/m0skit0/Temp/example00...done.
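(gdb)
Now GDB is waiting for input. If you need a quick command reference, any GDB cheat sheet found via Google will do. Let's take a look at the main() assembly again:
Code: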
(gdb) disas main
Dump of assembler code for function main:
0x08048394 <main+0>: push %ebp
0x08048395 <main+1>: mov %esp,%ebp
0x08048397 <main+3>: sub $0x10,%esp
0x0804839a <main+6>: movl $0x4,-0xc(%ebp)
0x080483a1 <main+13>: movl $0xd,-0x8(%ebp)
0x080483a8 <main+20>: mov -0x8(%ebp),%eax
0x080483ab <main+23>: mov -0xc(%ebp),%edx
0x080483ae <main+26>: lea (%edx,%eax,1),%eax
0x080483b1 <main+29>: mov %eax,-0x4(%ebp)
0x080483b4 <main+32>: mov $0x0,%eax
0x080483b9 <main+37>: leave
0x080483ba <main+38>: ret
End of assembler dump.
Let's put a breakpoint on the LEA instruction, so we can see that it actually performs the addition.
Code:
(gdb) b *0x080483ae
Breakpoint 1 at 0x80483ae: file foo.c, line 5.
Now let's kick off the program!
Code:
(gdb) r
Starting program: /home/m0skit0/Temp/example00
Breakpoint 1, 0x080483ae in main () at foo.c:5
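5 int c = a + b;
We stopped before executing LEA, as expected. As you can see, GDB automatically shows the matching source line. We said before that our "c" variable is at EBP - 4. That address should hold 17 once the LEA and the MOV after it have executed. Let's check what's at EBP - 4.
Code: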
(gdb) p/x $ebp - 4
$1 = 0xbffff3b4
(gdb) x/w $1
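0xbffff3b4: 0x00000000
A zero, OK ("c" hasn't been assigned yet, so this is just whatever was in that stack slot). Notice that GDB stores the result of our expression in $1, so we can reuse it instead of typing the expression again. So let's execute our instruction and see what happens.
Code: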
(gdb) s
6 return 0;
(gdb) x/w $1
0xbffff3b4: 0x00000011
(gdb) i r eip
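eip 0x80483b4 0x80483b4 <main+32>
As you can see, two instructions were executed, since EIP now holds 0x80483b4: the one that does the adding (LEA) and the one that moves the result to EBP - 4 (MOV). That's because s steps one source line, not one instruction. And "c" now has the value 17 (0x11), as expected.
We quit GDB:
Code: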
(gdb) q
A debugging session is active.
Inferior 1 [process 2932] will be killed.
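Quit anyway? (y or n) y
Now that I've covered some basics (I recommend you ask or search for anything you didn't fully understand here), I'll start with the real thing in the next part. Until then, happy hacking!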