
[Tutorial] Advanced C

m0skit0
Guru
Posts: 3817
Joined: Mon Sep 27, 2010 6:01 pm

[Tutorial] Advanced C

Post by m0skit0 »


Introduction

Please note that this tutorial is based on a 32-bit system. If you're running a 64-bit system, please consider adding -m32 to the compiler options to get code similar to what we're discussing ;)

Tools I'll be using:
  • GNU's GCC
  • GNU's ObjDump
  • GNU's GDB
All are free software, so they're easy to get. If you're running a Debian-based Linux distro, you can simply do (as root)

Code: Select all

# apt-get install build-essential gdb
to install everything.

Behind the scenes

Before I start showing some advanced C techniques, I want you to take your time and first learn how a computer executes a program at a low level. Without this knowledge it's hard to fully understand these techniques :)

So let's take this simple C program:

Code: Select all

int main()
{
	int a = 4;
	int b = 13;
	int c = a + b;	
	return 0;
}
Please note that there's no #include directive.

Save it as example00.c and compile it (with debugging info on, -g switch):

Code: Select all

$ cc -g -o example00 example00.c
As you surely noticed, this program is useless. But our purpose is to see what it looks like behind the scenes. I'll use objdump to extract example00's code and grep to keep the 20 lines that follow the start of main():

Code: Select all

$ objdump -D example00 | grep -A20 main.:
08048394 <main>:
 8048394:	55                   	push   %ebp
 8048395:	89 e5                	mov    %esp,%ebp
 8048397:	83 ec 10             	sub    $0x10,%esp
 804839a:	c7 45 f4 04 00 00 00 	movl   $0x4,-0xc(%ebp)
 80483a1:	c7 45 f8 0d 00 00 00 	movl   $0xd,-0x8(%ebp)
 80483a8:	8b 45 f8             	mov    -0x8(%ebp),%eax
 80483ab:	8b 55 f4             	mov    -0xc(%ebp),%edx
 80483ae:	8d 04 02             	lea    (%edx,%eax,1),%eax
 80483b1:	89 45 fc             	mov    %eax,-0x4(%ebp)
 80483b4:	b8 00 00 00 00       	mov    $0x0,%eax
 80483b9:	c9                   	leave  
 80483ba:	c3                   	ret    
 80483bb:	90                   	nop
 80483bc:	90                   	nop
 80483bd:	90                   	nop
 80483be:	90                   	nop
 80483bf:	90                   	nop

080483c0 <__libc_csu_fini>:
 80483c0:	55                   	push   %ebp
Let's explain a bit what the columns mean (on the lines after 08048394 <main>:):
  • Address of the instruction's first byte
  • Machine code for the instruction, in hexadecimal (what the computer actually sees)
  • Disassembly of the instruction (a direct translation of that machine code)
Let's extract only the assembly for now:

Code: Select all

$ objdump -D example00 | grep -A12 main.: | cut -f 3

push   %ebp
mov    %esp,%ebp
sub    $0x10,%esp
movl   $0x4,-0xc(%ebp)
movl   $0xd,-0x8(%ebp)
mov    -0x8(%ebp),%eax
mov    -0xc(%ebp),%edx
lea    (%edx,%eax,1),%eax
mov    %eax,-0x4(%ebp)
mov    $0x0,%eax
leave  
ret
Now a tour of what this is doing:

Code: Select all

push   %ebp

Saves the EBP register on the stack

Code: Select all

mov    %esp,%ebp

EBP = ESP. EBP now points to the current top of the stack, which becomes the base of main()'s stack frame

Code: Select all

sub    $0x10,%esp

ESP = ESP - 0x10, reserves 16 bytes (0x10 is hexadecimal) on the stack for the local variables

Code: Select all

movl   $0x4,-0xc(%ebp)

Stores 4 into address EBP - 12, thus the variable "a" in our program is at address EBP - 12

Code: Select all

movl   $0xd,-0x8(%ebp)
Stores 13 into address EBP - 8, thus the variable "b" in our program is at address EBP - 8

Code: Select all

mov    -0x8(%ebp),%eax
Copies the value stored at address EBP - 8 (b) to EAX

Code: Select all

mov    -0xc(%ebp),%edx
Copies the value stored at address EBP - 12 (a) to EDX

Code: Select all

lea    (%edx,%eax,1),%eax
Tricky instruction to do EAX = EDX + EAX * 1, that is, EAX = EAX + EDX. LEA normally computes an address, but here the compiler uses it as a plain addition

Code: Select all

mov    %eax,-0x4(%ebp)
Stores EAX into address EBP - 4, thus the variable "c" in our program is at address EBP - 4

Code: Select all

mov    $0x0,%eax
EAX = 0, this is "return 0" in our code. EAX holds the return value of a function

Code: Select all

leave
Restores the stack: ESP = EBP, then pops the saved EBP. This undoes the first three instructions

Code: Select all

ret
Returns from the function to its caller (in this case the C runtime startup code that called main(), which then hands control back to the system)

So now let's take a tour executing this program using GDB, just to see how it works and to get our hands dirty with the all-powerful GDB.

Code: Select all

$ gdb -q example00
Reading symbols from /home/m0skit0/Temp/example00...done.
(gdb)
Now GDB is waiting for input. For a quick command reference, check this one (or choose your own using Google...). Let's take a look at the main() assembly again:

Code: Select all

(gdb) disas main
Dump of assembler code for function main:
0x08048394 <main+0>:	push   %ebp
0x08048395 <main+1>:	mov    %esp,%ebp
0x08048397 <main+3>:	sub    $0x10,%esp
0x0804839a <main+6>:	movl   $0x4,-0xc(%ebp)
0x080483a1 <main+13>:	movl   $0xd,-0x8(%ebp)
0x080483a8 <main+20>:	mov    -0x8(%ebp),%eax
0x080483ab <main+23>:	mov    -0xc(%ebp),%edx
0x080483ae <main+26>:	lea    (%edx,%eax,1),%eax
0x080483b1 <main+29>:	mov    %eax,-0x4(%ebp)
0x080483b4 <main+32>:	mov    $0x0,%eax
0x080483b9 <main+37>:	leave  
0x080483ba <main+38>:	ret    
End of assembler dump.
Let's put a breakpoint on the LEA instruction, so we can see it actually performs the addition.

Code: Select all

(gdb) b *0x080483ae
Breakpoint 1 at 0x80483ae: file foo.c, line 5.
Now let's start the program!

Code: Select all

(gdb) r
Starting program: /home/m0skit0/Temp/example00 

Breakpoint 1, 0x080483ae in main () at foo.c:5
5		int c = a + b;	
We stopped before executing LEA, as expected. As you can see, GDB points to the matching source line automatically. We said before that our "c" variable is at EBP - 4. This address should hold the value 17 once the LEA and the MOV that follows it have executed. Let's check what's at EBP - 4.

Code: Select all

(gdb) p/x $ebp - 4
$1 = 0xbffff3b4
(gdb) x/w $1
0xbffff3b4:	0x00000000
A zero, OK. "c" has not been assigned yet, so this is simply whatever was already in that stack slot. Notice that GDB stores the result of our expression in $1, so we can reuse it without computing the address again. So let's execute our instruction and see what happens.

Code: Select all

(gdb) s
6		return 0;
(gdb) x/w $1
0xbffff3b4:	0x00000011
(gdb) i r eip
eip            0x80483b4	0x80483b4 <main+32>
As you can see, 2 instructions were executed, since we're now sitting at address 0x80483b4 (the EIP value): the one that does the adding (LEA) and the one that moves the result to EBP - 4 (MOV). That's because s steps a whole source line, not a single instruction. "c" now has the value 17 (0x11), as expected.

We quit GDB

Code: Select all

(gdb) q
A debugging session is active.

	Inferior 1 [process 2932] will be killed.

Quit anyway? (y or n) y
Now that I've covered some basics (I recommend you ask or search for anything you didn't fully understand here), I'll start with the real thing in the next part. Until then, happy hacking! ;)

I wanna lots of mov al,0xb
Image
"just not into this RA stuffz"
Sirius
Posts: 103
Joined: Sat Dec 18, 2010 3:31 pm

Re: [Tutorial] Advanced C

Post by Sirius »

Thank you. It's still good to know that there are people who want to share their knowledge ;)
codestation
Big Beholder
Posts: 1660
Joined: Wed Jan 19, 2011 3:45 pm
Location: /dev/negi

Re: [Tutorial] Advanced C

Post by codestation »

Nice tutorial, just one suggestion: what about adding -m32 to the compiler options (or explaining the 64-bit assembly as well) so everyone who follows this tutorial gets the same disassembly? With a 64-bit compiler/distro the disassembly can be wildly different from the 32-bit one you are showing. It doesn't make a big difference in this tutorial, but the calling conventions on 64-bit are quite different from the 32-bit ones.
Plugin list
Working on: QPSNProxy, QCMA - Open source content manager for the PS Vita
Playing: Error: ENOTIME
Repositories: github, google code
Just feel the code..
m0skit0
Guru
Posts: 3817
Joined: Mon Sep 27, 2010 6:01 pm

Re: [Tutorial] Advanced C

Post by m0skit0 »

Thanks codestation, I will add an explanation that this is targeted at 32-bit systems. If you want to co-author this by adding the 64-bit version, then go ahead ;)
Sirius
Posts: 103
Joined: Sat Dec 18, 2010 3:31 pm

Re: [Tutorial] Advanced C

Post by Sirius »

I finished this part and found it very useful. :D Just one thing: in my disassembled code I have add instead of lea, but everything else is the same ;)

Another thing: the command i r aip just prints the value stored in that register, right? I can't find info about that command on the internet.

Sorry for the bad English
m0skit0
Guru
Posts: 3817
Joined: Mon Sep 27, 2010 6:01 pm

Re: [Tutorial] Advanced C

Post by m0skit0 »

Thank you.
Sirius wrote: in my disassembled code I have add instead of lea, but everything else is the same ;)
Then we most likely don't have the same GCC version ;) Anyway, ADD is the natural instruction here. LEA is more of a trick: it stands for Load Effective Address and is normally used to load a computed address into a register.
Sirius wrote: Another thing: the command i r aip just prints the value stored in that register, right? I can't find info about that command on the internet.
I guess you mean i r eip. AIP does not exist. It's short for info registers eip, and yes, it prints the value of EIP. Alternatively, you can use p/x $eip.
Sirius
Posts: 103
Joined: Sat Dec 18, 2010 3:31 pm

Re: [Tutorial] Advanced C

Post by Sirius »

Yes, sorry, I meant i r eip :oops:
frank
Posts: 211
Joined: Wed Mar 23, 2011 5:15 am
Location: northwest usa

Re: [Tutorial] Advanced C

Post by frank »

Alright, I decided to give this a go.
When I do

Code: Select all

p/x $ebp - 4
x/w $1
I get:

Code: Select all

0xbffff434:        0x00000004
I don't know if that's normal, since you didn't initialize c to anything, but I'll keep going.

Edit: Oh and my LEA instruction was at a different address

Edit 2: and I get the same 0x00000004 at the end for $1
Sig shrink!
PSP 2000 6.60 ME-1.8
G1 - Super D 1.11
Samsung Galaxy Tab 2 GT-P5113 - CyanogenMod 10
PC - Windows 7 - P4 3.20ghz HT - 3 gb 756 mb RAM - ATI Radeon x600
Cardboard box server - Ubuntu Server :D - AMD Athlon XP 3000+ - 512mb RAM
m0skit0
Guru
Posts: 3817
Joined: Mon Sep 27, 2010 6:01 pm

Re: [Tutorial] Advanced C

Post by m0skit0 »

It's normal to get different generated code if you're using a different compiler, OS, etc. (which you didn't specify, btw). Please post your disas main and the address where you put the breakpoint.
frank
Posts: 211
Joined: Wed Mar 23, 2011 5:15 am
Location: northwest usa

Re: [Tutorial] Advanced C

Post by frank »

I am on Ubuntu 10.04 in a VM (if it's the VM scr..ing with stuff, then I'll try again as soon as I fix my hard Ubuntu installation)
I'm using...this:

Code: Select all

andrey@ubuntu:~/ctuts$ cc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc --enable-targets=all --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) 
xD this will be a very...full looking post

Code: Select all

(gdb) disas main
Dump of assembler code for function main:
   0x080483b4 <+0>:	push   %ebp
   0x080483b5 <+1>:	mov    %esp,%ebp
   0x080483b7 <+3>:	sub    $0x10,%esp
   0x080483ba <+6>:	movl   $0x4,-0x4(%ebp)
   0x080483c1 <+13>:	movl   $0xd,-0x8(%ebp)
   0x080483c8 <+20>:	mov    -0x8(%ebp),%eax
   0x080483cb <+23>:	mov    -0x4(%ebp),%edx
   0x080483ce <+26>:	lea    (%edx,%eax,1),%eax
   0x080483d1 <+29>:	mov    %eax,-0xc(%ebp)
   0x080483d4 <+32>:	mov    $0x0,%eax
   0x080483d9 <+37>:	leave  
   0x080483da <+38>:	ret    
End of assembler dump.

Code: Select all

(gdb) b *0x080483ce
Breakpoint 1 at 0x80483ce: file example00.c, line 5.

Code: Select all

(gdb) r
Starting program: /home/andrey/ctuts/example00 

Breakpoint 1, 0x080483ce in main () at example00.c:5
5		int c = a + b;

Code: Select all

(gdb) p/x $ebp - 4
$1 = 0xbffff434
(gdb) x/w $1
0xbffff434:	0x00000004

Code: Select all

(gdb) x/w $1
0xbffff434:	0x00000004
Locked
