Saturday, February 5, 2011

When fuzzers miss (REVISED), Derbycon talk.

This post has been revised to reflect my talk at Derbycon on when fuzzers miss. I expanded on the explanations, added 2 more examples, code, slides and videos of each demo.

Each demo source and executable is in a zip below the corresponding video.

Link to derbycon talk


Expansion fuzzing is the most common form of fuzzing. You take a buffer and push characters into it, slowly expanding the number of characters you push. In its simplest form, this finds overflows caused by improper bounds checking.

Expansion fuzzing example:
 from socket import *
 import sys

 if len(sys.argv) < 3:
  print("Usage: ip port")
  sys.exit(1)

 ip = sys.argv[1]
 port = int(sys.argv[2])

 s = socket(AF_INET, SOCK_STREAM)
 s.connect((ip, port))

 # grow the payload by one character per iteration
 for i in range(1, 30000):
  s.sendall(b"A" * i)
  data = s.recv(1024)
  # only print replies that contain something
  if data.strip() != b"":
   print(data)

Some fuzzers are a little more intelligent about how they handle the fuzzing process. They take into account integer wraps and the special characters used in the protocol, among many other things. A small sample of such fuzzers:
SPIKE, Sulley, ....
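As a rough idea of what "more intelligent" input can look like, here is a minimal sketch that generates lengths around integer-wrap boundaries and length fields that lie about the payload size. This is not how SPIKE or Sulley work internally. The function names and the 4-byte little-endian length prefix are assumptions made up for the illustration.

```python
import struct

def interesting_lengths(bits=(8, 16, 32)):
    """Lengths on either side of common signed/unsigned integer limits."""
    lengths = set()
    for b in bits:
        for base in (2 ** (b - 1), 2 ** b):  # signed limit, unsigned limit
            lengths.update((base - 1, base, base + 1))
    return sorted(lengths)

def length_field_cases(payload=b"A" * 16):
    """Prefix a payload with 32-bit length fields that misreport its size.

    The wire format (4-byte little-endian length, then data) is hypothetical.
    """
    for claimed in (0, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
        yield struct.pack("<I", claimed) + payload

print(interesting_lengths(bits=(8,)))
```

Feeding these values into a length or size field tends to reach wrap bugs much faster than growing a buffer one character at a time.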

Brute force fuzzing:
Brute force fuzzing is a time/accuracy trade-off. You try to push every variation of every character into a buffer, possibly finding parsing issues or read issues. The example below is raw brute force fuzzing and can take forever. Be smart about input, and do not run this raw against a target unless you have a ton of processing power and time.

Brute force fuzzing example:
 import sys, time
 from binascii import hexlify, unhexlify
 from socket import *

 class bruteforce:
  def __init__(self, ip, port):
   self.ipaddr = ip
   self.port = int(port)

  def fuzz(self, val):
   s = socket(AF_INET, SOCK_STREAM)
   # Timeout for any hanging sockets (5 seconds is an example value)
   s.settimeout(5)
   try:
    s.connect((self.ipaddr, self.port))
    s.sendall(val)
    s.recv(1024)
   except error:
    # When fail print the characters
    # failed to screen in hex format
    print("failed on value: " + hexlify(val).decode())
    return 1
   finally:
    s.close()
   return 0

  def run(self):
   size = 2**100000
   i = 0
   # have to use while because the size
   # is too large to be handled in a for loop
   while i < size:
    # convert int to hex (Python 2 longs end in "L", strip it)
    val = hex(i).replace('0x', "").rstrip("L")
    if len(val)%2:
     # align the hex to make sure
     # it is set to decode
     val = "0" + val
    # decode the hex to raw characters and send them
    data = unhexlify(val)
    self.fuzz(data)
    i += 1
    # setup timing for app so you don't
    # flood it and fail on packets (example delay)
    time.sleep(0.01)
   return 0

 if len(sys.argv) < 3:
  print("usage: ip port")
  sys.exit(1)

 # !!Warning before running this!!
 # if this prints the characters to the
 # screen it will lock up whatever prints the characters!
 # Operating systems do not like beep
 # codes and this will flood it with beepcodes and stall

 fz = bruteforce(sys.argv[1], sys.argv[2])
 fz.run()

Listed commands:
List out the commands that are likely exploit vectors for a specific protocol and run those quickly against a target.
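A minimal sketch of a listed-commands run, assuming a text protocol where each request is a command word, a space, an argument and CRLF. The FTP-style command list, the sizes and the function name are only examples, not taken from the talk.

```python
def build_cases(commands, sizes=(64, 256, 1024, 4096), filler=b"A"):
    """Yield one CRLF-terminated request per (command, size) pair."""
    for cmd in commands:
        for size in sizes:
            yield cmd + b" " + filler * size + b"\r\n"

# Example command list for an FTP-like service (assumption for illustration)
FTP_LIKE = [b"USER", b"PASS", b"CWD", b"MKD", b"RETR"]

cases = list(build_cases(FTP_LIKE))
print(len(cases))  # 20 requests: 5 commands x 4 sizes
```

Each request can then be sent over its own connection, the same way the expansion fuzzing example sends its payloads.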

This is my thought process in exploitation.
- I cannot stress how important this is! Read the application documentation, the protocol RFC, and any files that get accessed by the application.
- Discovery / vuln to exploit: this is the phase most of this talk is based on, so I will not go over it at this point.
- This is where you get to where your shellcode is located, once you have a POC running and the offsets figured out.
- Writing custom shellcode or using existing shellcode.
- An important process that can easily be skipped: review your exploit, make sure there are no bugs, and see if there is any place you can reduce code.

This is a very simple exploit as an example to show where the vuln is located. The bottom left is the source code, the top left is the function for int overflowme(), and on the right is the stack frame for overflowme(). This is what the stack looks like at the point when EIP points to the call to gets.

In this slide we see the progression of the vulnerability. It shows the point where EIP is pointing to RETN and the exploit is ready to kick off. The concept of where the vulnerability is versus when the exploit kicks off is one of the keys behind this talk. Throughout the slides I kept a theme of coloring all the vulnerable places in pink and the places where the exploit kicks off in red.
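The gap between the vulnerable write and the trigger can be modeled in a few lines. This is a toy sketch, not the code from the slide: it assumes a simple 32-bit frame with a 16-byte buffer (a made-up size), followed by the saved EBP and then the return address.

```python
import struct

BUF_SIZE = 16               # hypothetical size of the local buffer
RET_OFFSET = BUF_SIZE + 4   # buffer, then saved EBP, then return address

class Frame:
    """Toy model of the overflowme() stack frame."""

    def __init__(self, ret_addr):
        self.mem = bytearray(BUF_SIZE + 8)
        struct.pack_into("<I", self.mem, RET_OFFSET, ret_addr)

    def gets(self, data):
        """Unbounded copy into the buffer: the vulnerable step (pink)."""
        self.mem[0:len(data)] = data

    def retn(self):
        """Read the return address back: the trigger step (red)."""
        return struct.unpack_from("<I", self.mem, RET_OFFSET)[0]

frame = Frame(ret_addr=0x00401000)
frame.gets(b"A" * RET_OFFSET + struct.pack("<I", 0x41424344))
# Nothing has gone wrong yet; the damage only shows when retn() runs.
print(hex(frame.retn()))
```

Until retn() is called, the corrupted frame sits there quietly, which is exactly the window in which a fuzzer sees nothing wrong.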

Time for the demos!
In each of these demos you will see bubbles that represent a function and arrows that show the order of calls and returns. If an arrow points down it is a call to another function, and if it points up it is a return back to that function. Keep an eye on the text colors as well: pink = vulnerable place, red = where the exploit kicks off and what we want. Also keep in mind that when something is pushed on the stack it will push up the function chain, overwriting the stack frame of the higher function, not the lower functions.

This is a post I made some time back that I merged into this post.

In this example we see that a char variable is set in StartProgram with a buffer size of 25. Inside this function it calls get_login and prompts for a username and password. The vulnerability is in the username, where it attempts to fill a 25 byte buffer with a 50 character input. This is a classic buffer overflow where EIP will be overwritten with whatever is pushed in the buffer. The trick to why a fuzzer will miss this is that the RETN that matters belongs to StartProgram, while execution stays one function lower, in get_login, unless we successfully log in.

By finishing out the application and tearing down each stack frame, we were able to trigger our exploit.
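The difference between a fuzzer that stops after the overflow and one that finishes the session can be shown with a toy stand-in for the target. This is only a model of the behavior described above, not the demo source. The class name, the credentials, and the assumption that a failed login simply prompts again are all made up for the illustration.

```python
class ToyTarget:
    """Toy model: the frame is corrupted early but only matters on return."""

    BUF_SIZE = 25  # buffer size from the example above

    def __init__(self, user=b"admin", password=b"secret"):
        self.user, self.password = user, password
        self.frame_corrupted = False

    def get_login(self, username, password):
        """Vulnerable step: an oversized username corrupts the caller's frame."""
        if len(username) > self.BUF_SIZE:
            self.frame_corrupted = True  # nothing visible happens yet
        return username == self.user and password == self.password

    def start_program(self, attempts):
        """Prompt until a login succeeds, then return (the trigger step)."""
        for username, password in attempts:
            if self.get_login(username, password):
                return "crash" if self.frame_corrupted else "clean exit"
        return "still waiting for login"

naive = [(b"A" * 50, b"x")]
full = naive + [(b"admin", b"secret")]
print(ToyTarget().start_program(naive))  # still waiting for login
print(ToyTarget().start_program(full))   # crash
```

The naive run pushes the same 50 characters but never lets StartProgram return, so it reports nothing.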

Watch the stack when inputting data into a buffer. You should always follow the data you put into a buffer.

Threads can be really troublesome in exploit development. The only thread that really matters is the main thread that the application is running in. Any other thread can be destroyed at any point in time and not affect the application. This key problem can leave us with a vulnerability hanging out there without ever reaching the place where the exploit kicks off.

We will also look at stacks and variables to show that when you overflow a buffer you are not just overwriting the RETN address but also the data from other local vars below it, and how this can cause issues.

In this example we have a single thread that opens off of startApp and then calls authenticate to get the username and password. I have "main thread" shown next to the branching thread just to represent the fact that there are 2 threads running at that point. The vulnerability is in the password field, which allows 200 bytes to be pushed into a 20 byte buffer, and the exploit kicks off when this function returns. The key here is looking at verifyAuth and seeing that when auth fails it just kills that thread and opens a new one with startApp(). To get this exploit to kick off we need to log in successfully so that the stack frames for verifyAuth -> authenticate get torn down.

In the video I will also show examples where authenticate fails because we stomp on the username on the stack. This is another key point: pushing As into a buffer without knowing what that buffer holds can send you down the wrong path, one that never lets the exploit trigger.
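A small model of the stomping problem, and of a payload that avoids it. The layout is an assumption for illustration only: the 20 byte password buffer comes from the example, but the username sitting directly after it, its 20 byte size, the saved EBP and the credentials are all hypothetical. In the real demo these offsets come from watching the stack in a debugger.

```python
import struct

PASS_SIZE = 20                          # from the example: 20 byte password buffer
USER_SIZE = 20                          # hypothetical: username right after it
RET_OFFSET = PASS_SIZE + USER_SIZE + 4  # hypothetical: saved EBP, then RETN

def naive_payload(length=200):
    """What a plain fuzzer sends: nothing but As."""
    return b"A" * length

def careful_payload(password, username, ret_addr):
    """Keep valid credentials in place and only overwrite what we need."""
    pw = password.ljust(PASS_SIZE, b"\x00")
    user = username.ljust(USER_SIZE, b"\x00")
    return pw + user + b"B" * 4 + struct.pack("<I", ret_addr)

def username_after(payload, original=b"admin"):
    """Username the auth check would see after the payload is copied in."""
    frame = bytearray(PASS_SIZE) + bytearray(original.ljust(USER_SIZE, b"\x00"))
    frame[0:len(payload)] = payload  # unbounded copy into the password buffer
    user = bytes(frame[PASS_SIZE:PASS_SIZE + USER_SIZE])
    return user.split(b"\x00", 1)[0]

print(username_after(naive_payload()))
print(username_after(careful_payload(b"secret", b"admin", 0x41424344)))
```

With the naive payload the username becomes a run of As and auth fails, so the thread is killed before the overwritten RETN is ever used.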

Demo2 Part1

Demo2 Part2

This can be tricky to find, but the normal logic of following the data should allow you to see the overflow happen. From that point it is just a matter of looking at where the thread is terminated and trying to find a way to get it to start returning higher up the stack frames.

Also, as shown in the video, the location where the buffers get filled causes the overflow to stomp on a local variable before it. Keeping track of which buffer is for what can be challenging when following a process. Setting a memory or hardware breakpoint on access to the stack frame can sometimes help find this, but most of the time this is just going to be manual work of watching where the buffers are located. One key thing to note here: if this were a pointer on the stack and not a plain variable, you may now have control over execution in the future. So watch those local variables you overwrite and see if they may be used in a future path.

In this example we will start to look into the heap and at something fairly obvious but often used: exit(). The heap can be a great place to show many examples of where fuzzers just fail at finding the exploitable areas. Most of the protections on the heap do not target the allocation but the free calls on the chunks. Just like RETN, this has to do with watching for the place where the exploit kicks off and not watching the vulnerable call. The heap has a lot of places where the exploit can kick off, which makes things more difficult. To name a few: in a use after free we have to look for the use and then an allocation in the future, and in a double free we need to look for that second free and take the correct path to make sure we get to it.

In this example we will look at crashing an application using a standard heap overflow and a free that causes a crash when validating the free list. The application is a simple echo program: all it does is echo out anything you send it. After the first echo it asks if you would like to keep echoing output. If you answer Y, it copies the initial echo to the heap using heapalloc and then continues to echo anything you put in inside a while(1) loop. The trick to reaching free is to get out of the echo loop, so that the program frees the heapalloc chunk and exits. To do that you need to type quit(), which you are not told about. I put the hidden quit in because reversing and finding paths such as this can sometimes help you get back to where you need to be.
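A fuzzing run against a program like this has to walk the whole path, not just send the big input. Below is a minimal sketch that builds the stdin for one such run and launches the target. It assumes the program reads newline-terminated input from stdin and that the oversized first echo is what overflows the heap chunk; the function names are hypothetical, and the command for the target is left to the caller.

```python
import subprocess

def build_input(overflow_len, filler=b"A"):
    """Newline-separated stdin for one run: big echo, 'Y', then quit()."""
    steps = [
        filler * overflow_len,  # first echo; assumed to be the copy that overflows
        b"Y",                   # keep echoing, so the first echo goes to the heap
        b"quit()",              # hidden command: leave the loop, reach free, exit
    ]
    return b"\n".join(steps) + b"\n"

def run_session(cmd, stdin_bytes, timeout=10):
    """Run the target once and return its exit code.

    A non-zero or negative code may point to a crash, depending on the OS.
    """
    proc = subprocess.run(cmd, input=stdin_bytes, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, timeout=timeout)
    return proc.returncode

print(build_input(4))
```

Without the final quit() the process just sits in the echo loop, free is never called, and the corrupted chunk is never validated.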

[video] !!!Video is on my desktop and I have not put time into editing it, it will be here someday!!!

Heap is more difficult in every way than stack overflows, including finding flaws, so practice and thinking about where the exploit kicks off will help you find heap exploits.

These are only a few examples of the limitless possible reasons why a buffer may overflow and not trigger an exception. The main thing to take from this is the thought process.

Here are some things I use to find exploits such as this.
- When pushing data into an application, pay attention to the functions and understand how this would look in C. If you want to find the exploitable area and not just the vuln, follow the data.
- Break/log RETN: watch the call stack, pathing and stack frame. If the program is threaded, make sure to check all stack frames and not just the current thread's stack frame.
- Break/log free/alloc: when watching for a heap exploit, break on what controls the heap structure and dump the heap chunks.
- All references to: find anything that references the data you pushed in to see if it is used someplace later in a strcat, strcpy or anything of this type.
- Diffing: if the program is already patched, don't take the time finding the exploit when they give it to you in a diff.
- All intermodule calls: use the list of all intermodule calls to find where you may want to look, and pull out specific stack frames looking for user controlled data.
- Tracing: this can take a ton of time, but tracing application paths and finding areas you are not hitting, looking for where data gets accessed, can help.
- Work!: this is not all easy, so it is going to take some thinking and some of your own ideas.

If you have any questions or comments please post below, or hit me up on Twitter or IRC.
