Exploiting with unlink() technique by Doug Lea's malloc| Exploit excercises

Let me start by saying that I did not solve this on my own, I just followed a mix of MaXX's Phrack article Vudo and a writeup of this challenge by Joshua Wang in Concept of Proof.

This level is a real case scenario, so at least we have that. It compiles a little bit of everything we've learnt along the way, so in the end, when you read the walkthrough in this challenge, it makes sense if you understood the other challenges.

~~ The situation ~~



Remember a couple of posts ago how I said that I thought one excercise was  a double free technique but it was not? Well, this time it is the double free technique. However, it is not popular becauseof its simplicity, as you will see.

I'm going to proceed to an explanation pretending that the articles have been understood and one knows how an arena works.

What Wang explained to us as it is exposed by MaXX, is that the goal is to craft a fake chunk into one of the buffers. So now, if we have chunk B and its previous chunk is A, we will have to use the buffer to pretend that there is a chunk right before B.


Next, when the free() function is called unto chunk B, it will see that there is a previous chunk. It has to think that this previous fake chunk is a free chunk. Why? That way it will remove the ( remember, it's a fake chunk ) chunk from the doubly linked list.

Before explaining how MaXX achieved that the previous fake chunk is seen as free, you have to understand how the fake chunk was positioned.
It's very easy: in the chunk_free() function, which is the one used if a chunk is not allocated via mmap and it's also the previous function to the unlink() call, there is a part that goes "chunk_at_offset(p, -(long)prevsz)". It's not vital to understand that line, just know that it's meant to take the inverse of a number to get the start of the previous chunk.
In this case, Wang set prev_size to be -8, even though the normal thing would be -32, which is where the A chunk would start.


Now, remember how the least significant bit of the size field determined if the previous chunk was or not free? Well, here is where we create an actual free chunk by setting it to 0. If we convert -4 to hexadecimal, we get "FFFFFFFFFFFFFFFC", That would translate to 0xfffffffc for what insterests us. See that the least significant bit is 0, a.k.a, previous chunk is free.
Also, by setting the size field to -4, we're telling B chunk that the previous fake chunk has its size field to the -4 position.
As I explained before, we also need to make believe that prev_size is -8, which is "FFFFFFFFFFFFFFF8", or 0xfffffff8 for us.

So now we know that the exploit has to have "\xf8\xff\xff\xff" somewhere, and the same with "\xfc\xff\xff\xff".

The next part that Wang exposed confused me a lot since it involved plenty of 'what, what are you doing that for? what the hell? why?' for my part. I had to reread everything more than a couple of times, but I'll do my best to write it down in a way I could understand it if I needed to go over it again in the future, so here I go.

Knowing how the unlink() function looks like, now we're just going to specify the addresses ( just the addresses ) of the following things.
  • P will be the address of the start of the fake chunk
  • BK will the the address to the NOP sled
  • FD will be the address of the puts() function minus 12 bytes

Let's leave it like that and bear with me until the end, it will make sense, I promise.
Now it's important to remember that we're still in the B chunk, we're just redirecting code flow to the fake chunk, so obviously it's important that we specify the addresses where to find the content we will inject in a moment.

However, before continuing, we have a small issue.

As you can see, the fd section starts at 8 bytes into the chunk. That means that we need 8 bytes extra to fill the empty space.
As I said, it's a small issue, and those 8 bytes could be filled with the classic 0xdeadbeef 8 times, but because we want to be original, we will use 0xdeadc0de twice :)
Okay but, what about the -8 and -4 bytes? Why do we need 0xdeadc0de if we have those? Because remember: the -8 and -4 bytes are negative, therefore are part of the chunk A or/and our imaginary chunk. The actual 8 bytes, those which need to have positive values, are still empty.

Recap: now we know we will be needing "\xf8\xff\xff\xff" and "\xfc\xff\xff\xff" plus the three addresses plus 0xdeadc0de twice.
Let's continue.

We are almost there. One thing that we have yet to complete is to find a function that actually starts the whole ordeal, and what best to what we did in the past, and find a function in the GOT Table?
Again, we're taking puts(), but what's with the 12 bytes? Remember that bk is at a 12 bytes offset from the begginning of the chunk, and we want to go to the beginning of the chunk.

Let's add the address of the puts() function - 12 bytes to the list.

readelf --relocs heap3

 

0x0804b12c - 12 = 0x0804b11c

Finally, to find the address where the NOP slide will start, we'll need to take a look to the binary.


Wang set the breakpoint at 0x080488a2, which is right above the second malloc, the second malloc function allocates chunk B, where we want to write the address of the NOP sled.
Afterwards, he chose the eax register as the one that will be holding the NOP slide.
Note one thing, and that is that, as long as we look at the registers while second malloc is still uncalled, the registers will be the same. So if I chose to set the breakpoint at 0x0804889e, eax would still be exactly the same.

 

After that, we need the winner() address.


~~ Writting the exploit ~~

To write the exploit, I'm first going to detail what we have and what we need:

  • \xfc\xff\xff\xff
  • \xf8\xff\xff\xff
  • \xde\xc0\xad\xde
  • \x1c\xb1\x04\x08
  • \x08\xc0\x04\x08
  • NOP sled: \x90
  • \x64\x88\x04\x08
However, before we proceed, I would like to add a couple things to the exploit.
First, the shellcode will not work if we input it like we used to do in the other challenges, because in this case, we need it to return, we have to tell the program what to do with the address.
So it will be more like push <address> and return:
  • \x68\x64\x88\x04\x08\xc3
 And the second thing, why do we need a padding at the end of chunk A? Recall a chunk is 32 bytes long: 6 byte long shellcode, 19 bytes of NOP ( 19 is the highest number we can reach without it giving a segfault ) and we now need 7 bytes of padding to complete chunk A.

./heap3 `python -c 'print "\x90"*19 + "\x68\x64\x88\x04\x08\xc3" + "A"*7 + "\xf8\xff\xff\xff" + "\xfc\xff\xff\xff"'` `python -c 'print "\xde\xc0\xad\xde"*2 + "\x1c\xb1\x04\x08" + "\x08\xc0\x04\x08"'` chunk

Comments

Popular Posts