If you have ever run into trouble while implementing your brand-new reflective loader, raise your hand!
Well, after a thousand crashes, I want to write down some simple suggestions that could literally save you days of debugging.
The reflective loader
Generally speaking, a reflective DLL cannot be debugged comfortably with common tools. A reflective DLL carries its own custom loader, which performs all the steps needed to map the DLL in memory. Basically, it loads itself.
Note: a reflective loader cannot call Windows APIs or other external functions through its imports, because the DLL is not loaded yet and its import table has not been resolved. Therefore, a loader contains only PIC (Position-Independent Code), which is achieved through particular language constructs and compiler tricks. I just want to say: maybe the problem resides in our compiler/linker flags :).
Since there’s no LoadLibrary API call or any load event that can be intercepted, common debugging won’t work. You can’t just set a breakpoint in Visual Studio, press Run and debug. You have to use an assembly-level debugger like IDA or x64dbg, which lets you understand the cause of a crash in enough depth, step many instructions back and play with memory.
You have to learn assembly.
How to debug
We’ll take IDA as an example.
Using an injector - the “standard” way
Basically, you have to do two things to break into a reflective DLL after it has been injected:
- Run the injector under the debugger and stop at CreateThread, or at the equivalent mechanism that will run the injected DLL
- Open a second debugger window, attach to the process in which the DLL is injected and search
  - search what? ⇒ the export that loads the DLL! Now, put a breakpoint at the start of it
From the first window, continue execution. You will eventually hit the DLL export.
Using rundll32
Ok, this way you only have to open the DLL in IDA and set the process parameters accordingly:
- application: rundll32.exe
- params: <path/to/file.dll>,"ExportName"
In this case you do not lose access to the debug symbols, resulting in a better debug experience.
But there’s one thing you have to remember. rundll32 takes a DLL and an export as comma-separated arguments. Because it loads the DLL through the regular Windows loader, it will always run DllMain before your export! And that’s not what you want, since the export is actually used to load the DLL in memory.
You can bypass this by:
- putting a breakpoint on DllMain
- when DllMain is hit, changing EIP (RIP on x64) to the reflective loader export
- continuing from there
Ok, and now?
Basically, a reflective loader does these operations:
- Retrieves pointers to Windows APIs through the DLLs already loaded inside the target process. This is accomplished through the PEB: kernel32, for example, gives you the pointer to VirtualAlloc and other useful functions
- Retrieves the base address from where it has been loaded
  - looking for the MZ signature
- Maps the PE in memory
  - allocate a region of size = SizeOfImage
  - memcpy all the PE structures in it
- Fixes the PE
  - import address table (loading all the declared import DLLs)
  - relocations
  - fix memory permissions of each section
- Other
  - set up exception handlers through RtlAddFunctionTable
  - execute TLS callbacks
- Finally
  - flush the instruction cache
  - the DLL is correctly mapped and fixed, execute DllMain
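To make the relocations step more concrete, here is a minimal sketch in C. It works on a plain byte buffer so it compiles anywhere, without windows.h. RelocBlock, apply_relocations and the REL_BASED_* constants are names made up for this example (they mirror IMAGE_BASE_RELOCATION and the IMAGE_REL_BASED_* values). It assumes a 64-bit image and a little-endian host, and it calls the C library memcpy only for brevity: inside a real loader you would use your own copy routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of IMAGE_BASE_RELOCATION. */
typedef struct {
    uint32_t VirtualAddress; /* RVA of the page this block patches */
    uint32_t SizeOfBlock;    /* header plus all 16-bit entries, in bytes */
} RelocBlock;

#define REL_BASED_ABSOLUTE 0u  /* padding entry, nothing to patch */
#define REL_BASED_DIR64    10u /* add the delta to a 64-bit value */

/* Walks the relocation directory found at reloc_rva and adds delta
   (actual base minus preferred base) to every DIR64 target. */
static void apply_relocations(uint8_t *image, uint32_t reloc_rva,
                              uint32_t reloc_size, int64_t delta)
{
    uint32_t done = 0;

    while (reloc_size - done >= sizeof(RelocBlock)) {
        RelocBlock block;
        memcpy(&block, image + reloc_rva + done, sizeof block);

        /* Stop on a malformed block instead of walking off the directory. */
        if (block.SizeOfBlock < sizeof block ||
            block.SizeOfBlock > reloc_size - done) {
            break;
        }

        uint32_t count =
            (uint32_t)((block.SizeOfBlock - sizeof block) / sizeof(uint16_t));
        const uint8_t *entries = image + reloc_rva + done + sizeof block;

        for (uint32_t i = 0; i < count; i++) {
            uint16_t entry;
            memcpy(&entry, entries + i * sizeof entry, sizeof entry);

            unsigned type = entry >> 12;       /* high 4 bits */
            unsigned offset = entry & 0x0FFFu; /* low 12 bits */

            if (type == REL_BASED_DIR64) {
                uint8_t *target = image + block.VirtualAddress + offset;
                uint64_t value;
                memcpy(&value, target, sizeof value);
                value += (uint64_t)delta;
                memcpy(target, &value, sizeof value);
            }
            /* REL_BASED_ABSOLUTE entries are padding and are skipped. */
        }
        done += block.SizeOfBlock;
    }
}
```

When a loader crashes right after this step, comparing a patched pointer against the expected value (old value + delta) in the debugger is a quick sanity check.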
From these operations, you can extract some useful conditions or APIs on which to set a breakpoint:
- VirtualAlloc (mapping the DLL)
- VirtualProtect (mem permissions)
- hardware breakpoint on access on gs:0x60 (on x64) or fs:0x30 (on x86) (PEB)
- …
Addressing errors
Memory cannot be executed
Imports
This is often due to some external functions invoked directly inside the reflective loader code. For example, memset and memcpy, but also Win32 APIs.
memset and memcpy belong to the C runtime, located inside a DLL such as vcruntime140.dll, ucrtbase.dll or msvcrt.dll (it depends on which CRT you used). In normal cases, when you call memset the assembly instruction will be like:
call qword ptr [__imp_memset]
where __imp_memset is the IAT entry that holds the address of the memset function located inside the C runtime DLL. In normal cases this DLL is already loaded before the call ;) Inside a reflective loader the IAT has not been filled in yet, so the call lands on an invalid address.
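In this case, it is enough to reimplement those functions with custom ones inside your code, like:
// With MSVC you may also need "#pragma function(memset, memcpy)"
// so that the intrinsic versions can be redefined.
void* __cdecl memset(void* pTarget, int value, size_t cbTarget) {
    unsigned char* p = (unsigned char*)pTarget;
    // Fill the buffer one byte at a time.
    while (cbTarget-- > 0) {
        *p++ = (unsigned char)value;
    }
    return pTarget;
}
void* __cdecl memcpy(void* pDestination, const void* pSource, size_t sLength) {
    unsigned char* D = (unsigned char*)pDestination;
    const unsigned char* S = (const unsigned char*)pSource;
    // Copy byte by byte; the regions must not overlap.
    while (sLength--) {
        *D++ = *S++;
    }
    return pDestination;
}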
Also, some language constructs lead to the generation of a memset call without you explicitly writing it! Let’s look at this code:
struct MY_STRUCT var = { 0 };
In assembly, it translates to this:
mov rcx, <addr of var>
mov rdx, 0
mov r8, <size_of_struct>
call memset
You can read the issue here.
So be careful about what you’re writing and what language you’re using!
This happens in a normal program compilation and linking. To disable memset generation, some compilers like gcc provide an option to do that (read here), but you’ll have to implement it on your own anyway. We can also avoid linking against the C runtime, but that disables a lot of other functions and utilities and goes beyond the scope of this article.
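One way to sidestep the implicit memset, sketched below under a few assumptions: zero_bytes, init_context and struct LoaderContext are hypothetical names invented for this example, not part of any real loader. The idea is to drop the "= { 0 }" initializer and clear the structure with an explicit loop. The stores go through a volatile pointer so that the compiler cannot recognise the loop and turn it back into a memset call.

```c
#include <stddef.h>

/* Hypothetical helper: zeroes a buffer one byte at a time. The volatile
   pointer forces the compiler to emit every store itself instead of
   replacing the loop with a call to memset. */
static void zero_bytes(void *target, size_t size)
{
    volatile unsigned char *p = (volatile unsigned char *)target;
    while (size--) {
        *p++ = 0;
    }
}

/* Hypothetical structure standing in for MY_STRUCT. */
struct LoaderContext {
    void *base;
    size_t image_size;
    unsigned char scratch[64];
};

/* Clears the context without relying on an aggregate initializer. */
static void init_context(struct LoaderContext *ctx)
{
    zero_bytes(ctx, sizeof *ctx);
}
```

Typical usage is to declare the structure without an initializer (struct LoaderContext ctx;) and call init_context(&ctx) right away. It is still worth checking the disassembly for a stray call to memset.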
Switch cases
But hey, let’s complicate this! When you have a switch statement inside your code, the compiler may generate a jump table for it. A jump table is a static array of addresses (or offsets), one per case, pointing to the code of the matching case. It is generated for performance reasons, and it is only valid once the image has been properly mapped and relocated. Yeah, you heard it right: static. Avoid switch statements as much as possible when you’re writing reflective loader code.
Note: GCC allows you to avoid generating jump tables using the option -fno-jump-tables. Oh, you’re asking for MSVC? Haha, no option. Don’t you dare ask anything else, you m0r##n.
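As a small sketch of the alternative, here is a dispatch written as plain comparisons instead of a switch. patch_width and the REL_* names are hypothetical, chosen only for this example; the numeric values are the ones used by PE base relocation types. Keep in mind that an optimizer may, in principle, still turn a long chain of comparisons into a table, so the disassembly remains the final judge.

```c
/* Relocation types as used in PE base relocation entries. */
enum { REL_ABSOLUTE = 0, REL_HIGHLOW = 3, REL_DIR64 = 10 };

/* Returns how many bytes a relocation entry of the given type patches.
   Written as individual comparisons so that no jump table is needed. */
static unsigned patch_width(unsigned type)
{
    if (type == REL_DIR64) {
        return 8;
    }
    if (type == REL_HIGHLOW) {
        return 4;
    }
    return 0; /* REL_ABSOLUTE and unknown types: nothing to patch */
}
```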
Memory cannot be read
Ok, this one looks interesting as well, since it doesn’t always happen. In the best case you didn’t set the section permissions properly, or you forgot the code that does it.
In the worst case, there are many situations in which, for some reason (e.g., wrong pointer dereference, wrong struct size, …), the code accesses some part of memory that is, in some way, invalid. I’m thinking of instructions like these:
... some operations (bug: rcx = 0)
mov rax, rcx
mov rdx, [rax] ; bam! ACCESS_VIOLATION
Static strings
Strings will be defined in their own data section (check here). But hey, the section has not been mapped yet, so you’ll probably get an ACCESS_VIOLATION error when accessing a string! Be sure not to have them inside the loader.
If you really want to use them, you can check stack strings.
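A minimal sketch of a stack string, assuming a hypothetical helper is_virtual_alloc that a loader could use while walking an export table. The name is built character by character in a local array, so it lives on the stack rather than in a data section. The array is volatile to discourage the compiler from pooling the characters back into a constant; whether that is enough depends on the compiler and flags, so verify it in the disassembly.

```c
#include <stddef.h>

/* Returns 1 when name is exactly "VirtualAlloc", 0 otherwise.
   The reference string is assembled on the stack, one store per
   character, and compared with a hand-rolled loop (no CRT call). */
static int is_virtual_alloc(const char *name)
{
    volatile char wanted[13];
    size_t i = 0;

    wanted[0] = 'V'; wanted[1] = 'i'; wanted[2]  = 'r'; wanted[3]  = 't';
    wanted[4] = 'u'; wanted[5] = 'a'; wanted[6]  = 'l'; wanted[7]  = 'A';
    wanted[8] = 'l'; wanted[9] = 'l'; wanted[10] = 'o'; wanted[11] = 'c';
    wanted[12] = '\0';

    /* Advance while both strings agree and the reference has not ended. */
    while (wanted[i] != '\0' && name[i] == wanted[i]) {
        i++;
    }
    /* Equal only if both strings end at the same position. */
    return name[i] == wanted[i];
}
```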
Global variables
Nothing to say here, the same logic as strings applies. Be sure that you do not use them inside the loader!
MZ false positives
Let me share my experience and one opinion: always debug from the original injector! We’ll get to why later.
The error for me resided in the code that gets the injected base address. This was more or less like this:
address = <address of the function>
do {
    img_dos_hdr = address
    if (img_dos_hdr->e_magic == 'MZ') { // 0x5A4D
        img_nt_header = img_dos_hdr->e_lfanew
        if (img_nt_header == 'PE') { // 0x4550
            // ok! found it!
            break;
        }
    }
    address--;
} while (true)
Line that crashed:
img_nt_header = img_dos_hdr->e_lfanew
It didn’t crash without stdlib linked.
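Why? A PE can contain MZ false positives inside its memory: two bytes that happen to read 'MZ' but do not start a DOS header. On a false positive, e_lfanew is garbage, so the NT header address derived from it points to invalid memory.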
Let’s check for the MZ signature inside HxD.
First:
Ok this is the signature.
Second:
Mmm.. no, this doesn’t look like the signature.
Third:
Neither does this.
So, we have to filter out the many false positives we can come across while scrolling up memory in search of our base address!
One way to solve this is to look for the famous This program cannot be run in DOS mode string. This (weak) code shows how to do it:
int* msdos_stub = (int*)(address + 0x4E); // addr of 'This program ...'
if (*msdos_stub != 0x73696854) { // 'This' read as a little-endian 32-bit value
    address--;
    continue;
}
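A complementary check, sketched here as an assumption rather than as the code I actually used: validate e_lfanew before following it. The sketch scans a plain byte buffer so it can be compiled and tested anywhere. find_image_base, read_u16, read_u32 and the MAX_E_LFANEW limit are hypothetical names and values picked for this example; the limit in particular is an arbitrary sanity bound, not something taken from the PE format.

```c
#include <stddef.h>
#include <stdint.h>

#define DOS_MAGIC     0x5A4Du     /* "MZ" */
#define NT_SIGNATURE  0x00004550u /* "PE\0\0" */
#define DOS_HDR_SIZE  0x40u       /* size of the DOS header */
#define E_LFANEW_OFF  0x3Cu       /* offset of e_lfanew in the DOS header */
#define MAX_E_LFANEW  0x400u      /* assumed sanity limit for this sketch */

/* Little-endian reads that do not depend on alignment. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scans backwards from offset start and returns the offset of the first
   candidate that passes every check, or -1 if none is found. */
static long find_image_base(const uint8_t *mem, size_t mem_size, size_t start)
{
    for (size_t off = start + 1; off-- > 0; ) {
        if (off + DOS_HDR_SIZE > mem_size) continue;     /* header must fit */
        if (read_u16(mem + off) != DOS_MAGIC) continue;  /* no "MZ" here */

        uint32_t e_lfanew = read_u32(mem + off + E_LFANEW_OFF);

        /* A garbage e_lfanew is the mark of a false positive: reject it
           before it is ever used to compute an address. */
        if (e_lfanew < DOS_HDR_SIZE || e_lfanew > MAX_E_LFANEW) continue;
        if (off + e_lfanew + 4 > mem_size) continue;

        if (read_u32(mem + off + e_lfanew) == NT_SIGNATURE) {
            return (long)off;
        }
    }
    return -1;
}
```

Inside a real process there is no mem_size to lean on, so the range check on e_lfanew does most of the work there; the DOS stub test can be kept as an additional filter.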
Without stdlib linked I didn’t have other runtime code inside my PE that brought many MZ false positives… The error also didn’t show up when loading the DLL using rundll32.
I don’t know why! So, be careful to isolate the problem and write down the context (i.e., memory, target process), because it can change the behavior of your loader! And finally, because of this, always debug the DLL from its injector!
I hope some of these recommendations will be useful to you. Have a nice day!
See you in the next research!