
Format String Vulnerability

A format string vulnerability arises from incorrect use of the printf() function in C, typically when input that the programmer does not control ends up being used as the format string.

Syntax of printf() in C:

printf(control string, variable_list)

The first argument, the format string (the control string above), specifies how the variables are to be displayed. printf() also infers how each variable was passed, by value or by reference, from the format specifiers in the format string. The table below lists the common format specifiers used in printf() and how the corresponding argument is interpreted.

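| Parameter | Output | Passed as |
|-----------|--------|-----------|
| %d | Decimal | Value |
| %u | Unsigned decimal | Value |
| %c | Character | Value |
| %s | String | Reference |
| %x | Hexadecimal | Value |
| %p | Pointer; essentially %x prefixed with ‘0x’ | Value |
| %n | Prints nothing; writes the number of characters output before the “%n” into a pointer | Reference |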

From here on, everything assumes that you have a basic knowledge of C and x86 assembly, and a clear idea of how the stack works.

In a 32-bit environment the arguments of the printf() function are pushed onto the stack from right to left: first the variables are pushed, then the pointer to the control string.

As shown in the table, each value pushed onto the stack is printed according to the corresponding format specifier in the format string.

#include <stdio.h>

int main(void)
{
    int a = 80;
    puts("Hello World");
    /* %d is replaced with the value of a */
    printf("The decimal is %d \n", a);
    return 0;
}

Output :

Hello World
The decimal is 80

This is what main() looks like in x86 assembly language. Note that this dump contains only the call to printf(); the puts() call does not appear in it.

Dump of assembler code for function main:
   0x0804840b <+0>:     lea    ecx,[esp+0x4]
   0x0804840f <+4>:     and    esp,0xfffffff0
   0x08048412 <+7>:     push   DWORD PTR [ecx-0x4]
   0x08048415 <+10>:    push   ebp
   0x08048416 <+11>:    mov    ebp,esp
   0x08048418 <+13>:    push   ecx
   0x08048419 <+14>:    sub    esp,0x14
   0x0804841c <+17>:    mov    DWORD PTR [ebp-0xc],0x50
   0x08048423 <+24>:    sub    esp,0x8
   0x08048426 <+27>:    push   DWORD PTR [ebp-0xc]
   0x08048429 <+30>:    push   0x80484d0
   0x0804842e <+35>:    call   0x80482e0 <printf@plt>
   0x08048433 <+40>:    add    esp,0x10
   0x08048436 <+43>:    mov    eax,0x0
   0x0804843b <+48>:    mov    ecx,DWORD PTR [ebp-0x4]
   0x0804843e <+51>:    leave
   0x0804843f <+52>:    lea    esp,[ecx-0x4]
   0x08048442 <+55>:    ret
End of assembler dump.

Can you see the two push instructions just before the call to printf()? They push the two arguments to printf() onto the stack. First comes the value of the variable: here ‘a’ is the variable and ‘80’ is its value, so ‘80’ is pushed from [ebp-0xc]. Then comes the pointer to the format string (“The decimal is %d”), which is the constant 0x80484d0.

When printf() is called, it assumes that its arguments are already on the stack and continues execution without verifying them. It reads the format string through the pointer at the top of its arguments and prints it to the screen, replacing ‘%d’ with the value ‘80’, which sits right next to the pointer to the format string on the stack.