DLL Injection DebugWare: An Anatomy of Needle

Kapildev Ramlal, 1st June 2009
http://www.mindarray.org

When debugging software, we sometimes face challenging and complex problems that force us to deviate from the traditional path and create new and clever ways to catch the bug. Some cases require intense research on the software component being debugged, including the lower-level APIs that it uses. In Windows debugging this means the Windows API, although it can extend to an application-specific API in some cases (even though application symbols are usually not available).

When considering the architecture of an application that is being debugged, one key point is that the application will at some level use the Windows API. In C/C++ programming, these APIs are typically exposed as functions exported from DLLs.

In Windows, each executable image file follows the PE format: it starts with a set of headers, followed by sections and data directories used for various purposes [3].

One of these structures is the import table, which includes the Import Address Table (IAT). The import table lists the DLL modules that the executable links against and the name of each function it calls from them. At load time the loader fills the IAT with the actual address of each imported function. Knowing how this works, we can overwrite an IAT entry so that it points to a different function, one that can be used for debugging, such as printing out the function parameters and return code. It’s a great concept and one that has been used for years across several versions of Windows. It’s not the only way to implement hooking in Windows, as there are also other mechanisms such as the SetWindowsHookEx API.

So we have a problematic EXE, and we need to figure out what it is doing and whether the function calls within it are working. To be more specific, we want to see whether the Windows function calls are working. For example, suppose we figure out that the EnumPrinterDrivers() function is under suspicion, and we would like to dig deeper into its parameters and return code. Timing is critical for this problem, so any interception by a debugger such as WinDbg or NTSD causes the problem to subside…

We decide to hook the EnumPrinterDrivers() call to print out the debug information… How would we get our code loaded into the problem EXE to be hooked?

The most common method would be to create a DLL which can be loaded by, or injected into, the process being debugged. It’s really not as difficult as one would imagine because a DLL is very similar to an EXE and we just need to supply it with the appropriate entry point. Here is a sample skeleton that can be used to flesh out a real working DLL:

#include <windows.h>

HMODULE g_hMod = NULL;

#ifdef __cplusplus // If used by C++ code,
extern "C" {  // we need to export the C interface
#endif

// Entry point for DLL (main function for DLL)
BOOL APIENTRY DllMain(HMODULE hModule, DWORD  ul_reason_for_call, LPVOID lpReserved)
{
    //Check load reason
    switch(ul_reason_for_call)
    {
       case DLL_PROCESS_ATTACH:
       {
         // Eliminate un-necessary overhead...
         DisableThreadLibraryCalls(hModule);
         //Save the handle to the DLL for later use
         g_hMod = hModule;
         break;
       }

       case DLL_PROCESS_DETACH:
       {
         break;
       }

       default:
         break;
    }
    return TRUE;
}
#ifdef __cplusplus
}
#endif

Very simple. There is one case for when the DLL is attached to the process and another for when it is detached. The first gives us a good opportunity to plant our hook, and the second an opportunity to unhook and clean up our tracks. We can put our hook code in separate files in the DLL project, and then write our own hook functions whose signatures exactly match the functions being hooked. Keep in mind that many Windows API names are really macros, so don’t forget to check whether an API expands to an ANSI or Unicode version ending in “A” or “W” (OutputDebugStringA vs. OutputDebugStringW). Our hook function signature needs to match exactly, including the calling convention. Detailed instructions on creating hook functions can be found in the book Windows via C/C++ by Jeffrey Richter and Christophe Nasarre (ISBN 978-0735624245).

So now that we have a DLL, how do we invade the remote process being debugged and get our DebugWare DLL loaded? Well, we have a few methods to choose from, and each comes with its own caveats.

The easiest method to inject a DLL is by using the registry. Simply edit the “AppInit_DLLs” value under HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows and add the hook DLL to the list. The path to the DLL must not contain spaces, and an accompanying REG_DWORD value named LoadAppInit_DLLs must also be created under the same key and set to 1. As long as the process being debugged is linked against user32.dll (as all GUI-based apps are), this method should work. One major thing to consider is that every process that loads user32.dll, not just the one being debugged, will load the hook DLL.

Another method uses the SetWindowsHookEx API, which is generally used for hooking window messages. Specifying a DLL handle as its third parameter causes the DLL to be loaded indirectly by the processes being hooked. This method is useful for GUI DebugWare applications.

We can also trick the process into thinking that it’s loading one of its own DLLs (the Trojan DLL approach), or we can spawn the process being debugged as a child process and leverage the CreateProcess API to perform injection. These methods are discussed elsewhere (such as in the Windows via C/C++ book), so we won’t go into more detail here.

My favorite method, which we discuss here in detail, is DLL injection using remote threads. Yes, it requires some understanding of several architectural concepts, but it is quite powerful. The beauty of this technique lies in the CreateRemoteThread API: we simply spawn a new thread in the process being debugged and have that new thread load the DLL being injected. In fact, I once came across a problem so unique that it pushed me to create my own DLL injection utility, called Needle.

We’ll go ahead and share some of the core ingredients that make Needle pierce.

Get Debug Privileges

Needle doesn’t require debug privileges to inject into every process, but for some system processes they might be required. Here’s some code that shows how to enable a privilege (for example, by passing SE_DEBUG_NAME):

bool NEEDLE::TweakPrivilege(TCHAR * pszPrivilege, BOOL bEnabled)
{
    TOKEN_PRIVILEGES tp = {0};
    LUID luid;
    //Handle of token to adjust
    HANDLE hToken = 0;

    if (!LookupPrivilegeValue(NULL, pszPrivilege, &luid))
    {
       this->DebugTrace(TEXT("NEEDLE: Failed to lookup privilege value. Error code: "),
          GetLastError());
       return false;
    }

    //Retrieves token from current process
    if(!OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &hToken))
    {
       this->DebugTrace(TEXT("NEEDLE: Failed to retrieve token handle. Error code: "),
          GetLastError());
       return false;
    }

    tp.PrivilegeCount     = 1;
    tp.Privileges[0].Luid = luid;
    //Enable the privilege, or disable it by clearing its attributes
    tp.Privileges[0].Attributes = bEnabled ? SE_PRIVILEGE_ENABLED : 0;

    //Apply the change in both the enable and the disable case
    if ( !AdjustTokenPrivileges(
           hToken,
           FALSE,
           &tp,
           sizeof(TOKEN_PRIVILEGES),
           (PTOKEN_PRIVILEGES) NULL,
           (PDWORD) NULL) )
    {
        this->DebugTrace(TEXT("NEEDLE: Error adjusting token privilege: "), GetLastError());
        CloseHandle(hToken);
        return false;
    }

    //The call can succeed without actually assigning the privilege
    if (GetLastError() == ERROR_NOT_ALL_ASSIGNED)
    {
        this->DebugTrace(TEXT("NEEDLE: The token does not have the requested privilege"), 0);
        CloseHandle(hToken);
        return false;
    }

    CloseHandle(hToken);
    return true;
}

Grab the Process ID of the process being debugged

The Windows Terminal Services API exposes a nice function, WTSEnumerateProcesses(), that allows easy process enumeration. The results are stored in a neat array which can be walked with a simple loop. After the PID is obtained, we can then proceed to get the address of the LoadLibrary function, and then open a handle to the remote process.

DWORD NEEDLE::GetPID(TCHAR * pProcName)
{
    //First enumerate all processes on the system
    //(the returned array must eventually be released with WTSFreeMemory)
    if(!WTSEnumerateProcesses(WTS_CURRENT_SERVER_HANDLE, 0, 1, &this->m_pWtsPInfo,
          &this->m_dwCount))
    {
       this->DebugTrace(TEXT("NEEDLE: Enumerating processes failed! Error code: "),
          GetLastError());
       return 0;
    }

    //Now parse processes and find Application.exe
    for(DWORD c = 0; c < this->m_dwCount; c++)
    {
       //Find Application.exe
       if(!lstrcmpi(this->m_pWtsPInfo[c].pProcessName, pProcName))
       {
           //Return the PID so the caller can open a handle to the process
           return this->m_pWtsPInfo[c].ProcessId;
       }
    }

    this->DebugTrace(TEXT("NEEDLE: Failed to match process name with enumerated processes!"), 0);
    //If we arrived here, then we failed (PID 0 is the idle process, never a target)
    return 0;
}

Get the LoadLibrary method

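We need the address of the LoadLibrary variant that matches the character set of the path string we will write into the target: LoadLibraryW for a wide-character path, LoadLibraryA for an ANSI one. The address looked up in our own process can be used in the target because kernel32.dll is normally mapped at the same base address in every process of the same bitness.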

Example:

    this->m_hKern32 = GetModuleHandle(TEXT("kernel32.dll"));
    if(!this->m_hKern32) return false;

#ifdef UNICODE  // the Windows headers key TCHAR and TEXT() off UNICODE
    this->m_pLoadLib = (PVOID) GetProcAddress(this->m_hKern32, "LoadLibraryW");
#else
    this->m_pLoadLib = (PVOID) GetProcAddress(this->m_hKern32, "LoadLibraryA");
#endif

    if(!this->m_pLoadLib) return false;

Open a handle to the remote process

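Since we’ve obtained the PID, we can open a handle to the remote process using OpenProcess. Instead of using PROCESS_ALL_ACCESS we can specify only the granular rights we need. The documentation for CreateRemoteThread lists PROCESS_CREATE_THREAD, PROCESS_QUERY_INFORMATION, PROCESS_VM_OPERATION, PROCESS_VM_WRITE and PROCESS_VM_READ, and notes that the call may fail without them on some platforms.

this->m_hProcess = OpenProcess(PROCESS_CREATE_THREAD|PROCESS_QUERY_INFORMATION|
    PROCESS_VM_OPERATION|PROCESS_VM_WRITE|PROCESS_VM_READ, FALSE, iPID);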

The serum is getting loaded (Injection time)

After obtaining the remote process’s handle, we allocate memory in the remote process to store the path to the injected DLL, and then write that path into the process’s address space before creating the remote thread.

Example:

    //First we need to allocate memory in the remote process for our string
    //(m_dwModLen must cover the whole path in bytes, including the terminating null)
    this->m_pBaseAddr = VirtualAllocEx(this->m_hProcess,
       0, this->m_dwModLen, MEM_RESERVE|MEM_COMMIT, PAGE_READWRITE);
    if(!this->m_pBaseAddr)
    {
       this->DebugTrace(TEXT("NEEDLE: Failed to allocate memory in remote process! Error code: "),
            GetLastError());
       return false;
    }

    if(!WriteProcessMemory(this->m_hProcess,
        (LPVOID)this->m_pBaseAddr, (LPCVOID)pszModPath, this->m_dwModLen, 0))
    {
       this->DebugTrace(TEXT("NEEDLE: WriteProcessMemory Failed! Error code: "), GetLastError());
       return false;
    }

    // Now let us inject Application.exe
    // with our module
    HANDLE hThread = CreateRemoteThread(this->m_hProcess, 0, 0,
        (LPTHREAD_START_ROUTINE)this->m_pLoadLib, this->m_pBaseAddr, 0, 0);
    if(!hThread)
    {
       this->DebugTrace(TEXT("NEEDLE: Failed to create remote thread! Error code: "),
              GetLastError());
       return false;
    }

    this->DebugTrace(TEXT("NEEDLE: Successfully created remote thread!"), 0);
    //We do not wait for the thread here, so release our handle to it
    CloseHandle(hThread);
    return true;

Bringing it all together

We’ve seen firsthand what makes Needle work. Now when cornered with a complex problem, we can consider writing a hook DLL to get more insights, and using DLL injection to get the hook DLL loaded into the remote process. We have seen how writing a DLL is easy, and implementing hooks within it really depends on our creativity and the situation at hand.

Developing a DLL injection utility like Needle is simple using the points outlined in this article. It comes down to four Windows APIs: OpenProcess, VirtualAllocEx, WriteProcessMemory and CreateRemoteThread.

Some gotchas that we encountered while developing Needle

It is helpful to call the CreateProcess function and create the process being debugged in a suspended state (CREATE_SUSPENDED), giving a perfect opportunity to inject the debug DLL. Be sure to clean up any memory allocated by the APIs used to inject into the remote process.

It is tricky to inject Windows services. We believe this depends on the point in time when the injection is attempted, since the service process is managed by the Service Control Manager.

CreateRemoteThread fails when attempting to run against processes which exist in a separate session. There is a way around this using an undocumented Microsoft API, but we won’t go into the details of that here.

[3] http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx