Protection and Reverse Engineering under .Net
Written by tankaiha   


This article discusses several popular protection techniques for the Microsoft .NET Framework, including strong names, name obfuscation, flow obfuscation, metadata encryption, packing, and some anti-analysis tricks. For each protection, I also give some advice on how to reverse it. This is not a beginner's tutorial; it targets readers who already have some experience in .NET programming and reversing.

Introduction

Believe it or not, the Microsoft .NET Framework is becoming more and more popular. Since the first version was released in 2002, four major versions have appeared on Windows: v1.0, v1.1, v2.0 and v3.0. As the framework itself has improved, protection technology has also made great progress.

Later we will discuss some protection techniques for .NET applications, but before we dig into the main part, make sure you understand the basic concepts, such as Assembly, Module, Type, Member, CLR (Common Language Runtime), CTS (Common Type System), and JIT (just-in-time compilation). For those who don't know how easily an unprotected .NET application can be reversed, check the .NET reversing tutorial at http://www.accessroot.com.

Now, let's begin.

StrongName

What is a strong name? You can find lots of great papers on MSDN. In fact, a strong name is not a protection but a verification mechanism. A hash of your PE file is computed and signed, so the system and the application can check whether the file is still the original one. If the file has been modified, the verification fails and the system refuses to run it.
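The hash-check half of this idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a real strong name also signs the hash with the publisher's private key, and the function names below are hypothetical, not part of any .NET API.

```python
import hashlib

def file_digest(data: bytes) -> bytes:
    """Return a digest of the file contents (SHA-1 chosen for illustration)."""
    return hashlib.sha1(data).digest()

def is_unmodified(data: bytes, recorded_digest: bytes) -> bool:
    """Mimic the loader's check: recompute the digest and compare."""
    return file_digest(data) == recorded_digest

original = b"MZ...pretend this is a managed PE image..."
recorded = file_digest(original)                     # stored at build time

patched = original.replace(b"managed", b"cracked")   # a one-word patch
print(is_unmodified(original, recorded))   # untouched file passes the check
print(is_unmodified(patched, recorded))    # patched file fails the check
```

A bare hash like this stops nobody, since a patcher can simply recompute it. That is why the real mechanism signs the hash, and also why attackers prefer to strip or bypass the check altogether.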

When .NET was first born, strong names were widely used to protect files from being patched. But a strong name is easily removed, replaced, or even bypassed. The classic method is to disassemble the PE file into IL source with ildasm, remove the .publickey entry in the .assembly scope, and reassemble it into a PE file with ilasm. Search Google and you will understand why I call this method classic. Nowadays there is a more convenient way: use a tool directly. At http://www.andreabertolotto.net/ you will find Strong Name Remove. Or, the most brute-force option: patch the system DLL so that the strong name check always succeeds. Sometimes the verification is done by the application itself rather than by the system; then you have a little more work to do and must patch the application.

I won't say more about strong names here; more fun stuff is waiting for us. But for newcomers it is helpful to dig into strong names a little deeper. Just search the Internet.

Name Obfuscation

Obfuscation may be the first formal protection technique that came with .NET, because Visual Studio ships with a free edition of Dotfuscator. How does name obfuscation protect our applications? As we all know, .NET stores all names (of assemblies, modules, methods, fields, almost everything) in metadata, and the metadata is stored in the PE file. Using a decompiler (I prefer the word decompile to disassemble, because MSIL is a high-level language compared with assembly), you get far more information than from a traditional Win32 exe/dll. The result is SO readable that it is just like reading the original source code.

A name obfuscator turns the names into unreadable (or unprintable) characters. There are several ways to obfuscate names: rename them to meaningless strings, to unprintable strings, to the “same” string, to very long strings, or simply delete them. Figure 1 shows Reflector (the most popular decompiler for .NET) decompiling an application protected by a commercial obfuscator. How do you feel when reading this code? (Figure 1 also shows flow obfuscation, which is the topic of the next section.)

Fig. 1. Using Reflector to decompile an application protected by a commercial obfuscator

How do we deal with name obfuscation? You can deobfuscate the names into meaningful strings such as Class1, Method1, TextBox1. Here “meaningful” only means more readable: name obfuscation is a one-way transformation, so we cannot recover the original names. Dis# has a deobfuscation function that makes this job easier. But most of the time it is not necessary. The code is all there, only a little harder to read; all we need is patience, time, and some skill (using Reflector's Analyze function, for example).
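The renaming idea can be sketched as follows. This is a minimal Python illustration, not how Dis# or any real tool works: it assumes the symbols have already been extracted from metadata as (kind, name) pairs, and the helper name build_rename_map is hypothetical.

```python
from collections import defaultdict

def build_rename_map(symbols):
    """Map each obfuscated (kind, name) pair to a readable name such as Class1.

    `symbols` is an iterable of (kind, name) tuples. The same obfuscated
    symbol always maps to the same new name, so references stay consistent.
    """
    counters = defaultdict(int)
    mapping = {}
    for kind, name in symbols:
        key = (kind, name)
        if key not in mapping:
            counters[kind] += 1
            mapping[key] = f"{kind}{counters[kind]}"
    return mapping

symbols = [
    ("Class", "x7840e3d83ad1c299"),
    ("Method", "_bc24e513a5229081"),
    ("Method", "\u0002"),                 # unprintable name
    ("Method", "_bc24e513a5229081"),      # repeated reference to the same method
]
renames = build_rename_map(symbols)
for (kind, old), new in renames.items():
    print(f"{kind} {old!r} -> {new}")
```

The new names carry no meaning of their own. They only give the reverser stable, printable handles to hang notes on while reading the code.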

Flow Obfuscation

Obviously, name obfuscation is not enough. To make MSIL even harder to read, people invented flow obfuscation. Flow obfuscation changes the control flow of the code without changing its result. It serves at least two purposes: it makes MSIL harder to read, and it makes decompilation to a high-level language fail. The first purpose is clear: reversers have to spend more time reading garbage code. The second rests on the fact that most decompilers can transform MSIL into an HLL such as C# or C++/CLI, and reversers find an HLL much more readable than low-level MSIL.

Flow obfuscators are implemented in several ways. First, insert junk code, as Figure 1 shows, for example Boolean tests that are always true or always false. Second, split the code into many segments and connect them with unconditional or conditional branches. Third, wrap system calls in a new class and redirect all references to those system methods to the new class's methods. Fourth, exploit decompiler bugs.

A simple example illustrates how weak decompilers are when they try to transform MSIL into an HLL. Type in the following code and assemble it with ilasm. Then open the result in each of four decompilers: Reflector, Dis#, Decompiler .Net 2005, and Xenocode Fox 2007.

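    .assembly extern mscorlib { }
    .assembly extern System { }
    .assembly sample { }

    // Global methods must be static; ilasm only warns if this is omitted.
    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        br.s start_here     // always jumps over the next instruction
        pop                 // junk: never executed, but looks like a stack underflow
    start_here:
        ldstr "hello!"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }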

Reflector reports an exception when decompiling method Main() into an HLL, and the other three do not decompile it at all. The reason is simple: MSIL is a stack-based language, which means stack imbalance is not allowed. These decompilers track the evaluation stack in order to decompile, but they all fail to recognise that the junk instruction “pop” will never run. Of course, this little trick has no effect on viewing the raw MSIL disassembly. But that is exactly what the flow obfuscator wants, to make the code harder to read: reversers now have to say goodbye to the HLL and deal with boring IL instructions.

How do we deal with flow obfuscation? One answer comes to mind immediately: flow deobfuscation. For simple flow obfuscation and small applications with very little code, you can deobfuscate by hand: remove the junk code and reconnect the code segments. For larger ones that is impractical. A good programmer may then write a deobfuscator to do the job automatically. That is one way, but I don't know whether any practical deobfuscators exist yet. (I once wrote a toy deobfuscator that works on some obfuscated code, but it needs a lot more work to be practical.) Some compiler tools and frameworks, such as Mono.Cecil and the Microsoft Phoenix RDK, have optimization functions that can remove some of the junk code.
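One building block of such a deobfuscator, removing code that control flow can never reach, can be sketched in Python. This is a toy under stated assumptions: instructions are modelled as (label, opcode, operand) tuples rather than real MSIL bytes, only a handful of branch opcodes are handled, and the function names are hypothetical.

```python
def reachable_indices(code):
    """Return the set of instruction indices reachable from the first one.

    `code` is a list of (label, opcode, operand) tuples in a toy IL-like form.
    """
    labels = {label: i for i, (label, _, _) in enumerate(code) if label}
    seen, work = set(), [0]
    while work:
        i = work.pop()
        if i in seen or i >= len(code):
            continue
        seen.add(i)
        _, op, operand = code[i]
        if op in ("br", "br.s"):            # unconditional: only the target
            work.append(labels[operand])
        elif op in ("brtrue", "brfalse"):   # conditional: target and fall-through
            work.append(labels[operand])
            work.append(i + 1)
        elif op != "ret":                   # ordinary instruction falls through
            work.append(i + 1)
    return seen

def strip_dead_code(code):
    """Drop instructions that control flow can never reach."""
    live = reachable_indices(code)
    return [ins for i, ins in enumerate(code) if i in live]

sample = [
    (None, "br.s", "start_here"),
    (None, "pop", None),                    # junk: never executed
    ("start_here", "ldstr", "hello!"),
    (None, "call", "System.Console::WriteLine"),
    (None, "ret", None),
]
cleaned = strip_dead_code(sample)
for label, op, operand in cleaned:
    print(label or "", op, operand or "")
```

Applied to the sample method from this section, the unreachable pop disappears and the stack is balanced again. A practical tool would also have to fold always-true conditions and merge the split segments, which is where most of the work lies.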

Metadata Encryption

Sometimes string references are the key to reverse engineering, so it is a basic rule that an application should hide strings that give clues. String references (called user strings in .NET) are stored as metadata in a managed PE file. In this section we only talk about string encryption; other metadata encryption techniques are usually combined with encryption/packing, which is discussed in later sections.

In essence there is only one way to hide user strings: encrypt them, then decrypt them when needed. But different protectors implement this in different ways. Some encrypt each string with a unique key and decrypt it with that key at run time. The following is a code sample.

    IL_666e: ldstr "moacpphccapcmpfdhkmdepdedpkenobfpnifnopfingggongpi"
              + "ehbnlhpncihmjipmajfnhjolojomfkammkdmdlplklogbmllimnkpmhlgnlknnokeoakloj"
              + "fcpojjpejaadkhaneoapifbpjmbejdckikcajbdaiidmhpdgdge"
    IL_6673: ldc.i4 0x7a122093
    IL_6678: call string xb9d8bb5e6df032aa.x7840e3d83ad1c299::_bc24e513a5229081(string,int32)

The string returned by this call is “Your product evaluation period has expired!”. Other protectors combine all user strings into one encrypted byte array and assign each string an offset in the array. When a string is needed, its offset and length are used to locate it.

String encryption is easy to handle. Why? Because the encryption is reversible: the program must carry everything needed to decrypt its own strings. Still, every programmer should use string encryption in their products; it is a basic rule.

OK, let's take a break here. Maybe some of you are already bored: protections under .NET are not too hard and are easy to deal with. Patience, please; the real challenge is coming.
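The second scheme, one encrypted array indexed by offset and length, can be sketched in Python. The single-byte XOR cipher, the key value and all function names here are assumptions chosen for brevity; real protectors use their own algorithms.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the data."""
    return bytes(b ^ key for b in data)

def pack_strings(strings, key):
    """Concatenate all strings into one encrypted blob and record (offset, length)."""
    blob, table = bytearray(), []
    for s in strings:
        raw = s.encode("utf-16-le")        # .NET user strings are UTF-16
        table.append((len(blob), len(raw)))
        blob += raw
    return xor_bytes(bytes(blob), key), table

def get_string(blob, offset, length, key):
    """Runtime helper: decrypt just the requested slice."""
    return xor_bytes(blob[offset:offset + length], key).decode("utf-16-le")

KEY = 0x5A  # hypothetical key
blob, table = pack_strings(
    ["Your product evaluation period has expired!", "Thank you for registering."],
    KEY,
)
for offset, length in table:
    print(get_string(blob, offset, length, KEY))
```

Note that get_string is everything a reverser needs: calling the protector's own decryption routine with the arguments found in the code recovers every string.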

Packing / Encryption

Packing is very popular for traditional Win32 applications. In the .NET era it is also a powerful protection technique. It is hard to draw a clear line between packing and encryption, so I prefer to discuss them together in one section. Let's start with the simplest form of packing: compression.

Compression

Compression means compressing the PE file with a compression algorithm. At run time, the whole file is decompressed in memory and the entry method of the assembly is invoked. This is a kind of whole-assembly protection, because the whole original file appears in memory at run time. It is easy to dump the assembly, by hand or with tools such as .Net Generic Unpacker from http://www.ntcore.com.

The typical code for whole-assembly protection is a call to System.Reflection.Assembly::Load(). This instruction tells us that at this point the whole assembly has been decrypted and is ready to run. The code that follows usually locates the entry method or some special method and then invokes it.

Where the encrypted assembly is stored can be a little tricky. It may be stored as a byte array in the program code, or as a resource in the .rsrc section. Once you have located its position and size and understood the decryption algorithm, you can recover the original assembly directly without running the application. Static decryption or dynamic dump: the choice is yours.
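The weakness of whole-assembly protection shows up even in a toy model. The Python sketch below uses zlib as a stand-in for whatever algorithm a packer uses; pack and stub_load are hypothetical names, and the "assembly" is just a dummy byte string.

```python
import zlib

def pack(assembly_bytes: bytes) -> bytes:
    """Packer side: compress the whole file into a payload for the stub."""
    return zlib.compress(assembly_bytes, 9)

def stub_load(payload: bytes) -> bytes:
    """Stub side: restore the whole image in memory before handing it over.

    In a real packer the result would be passed to Assembly.Load();
    here it is simply returned.
    """
    return zlib.decompress(payload)

original = b"MZ" + b"\x00" * 64 + b"pretend managed PE image" * 20
payload = pack(original)
in_memory = stub_load(payload)

# The complete original image exists in memory at this point,
# which is exactly what a dumper captures.
print(len(original), len(payload), in_memory == original)
```

Whether the reverser reimplements stub_load statically or dumps its result at run time, the output is the same complete file.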

Wrapper

Sometimes when you load a .NET PE file in Reflector, you get a warning that there is no CLI header. That means it is a native exe/dll file. Yet according to its system requirements it still needs the .NET Framework installed. Why? Because the PE file has been packed (and usually encrypted at the same time) by a native wrapper.

There are several kinds of wrapper. One is just a compression/encryption wrapper: when the program is started, it simply releases the whole assembly into memory. Others are more advanced and take over calls into the .NET system DLLs. If you inspect a .NET PE file with traditional tools, you will find only one instruction at the entry point: a jump to CorExeMain or CorDllMain. A packer can take over this jump and perform it only when it is ready (that is, when the assembly has been decrypted).

Now for a more advanced kind of wrapper, whose target is the JIT. We know that to make a .NET program run normally, all we need to do is give the JIT the correct MSIL code and metadata. The JIT does not care what you do before or after just-in-time compilation. The key code for the JIT under .NET Framework 2.0 lies in mscorjit.dll; on my machine it is the following.

    .text:7906E7F4 private: virtual enum CorJitResult __stdcall
                   CILJit::compileMethod(class ICorJitInfo *,
                                         struct CORINFO_METHOD_INFO *,
                                         unsigned int,
                                         unsigned char * *,
                                         unsigned long *) proc near

The second parameter is a pointer to a CORINFO_METHOD_INFO structure, which is defined as follows.

    struct CORINFO_METHOD_INFO
    {
        CORINFO_METHOD_HANDLE ftn;
        CORINFO_MODULE_HANDLE scope;
        BYTE * ILCode;
        unsigned ILCodeSize;
        unsigned short maxStack;
        unsigned short EHcount;
        CorInfoOptions options;
        CORINFO_SIG_INFO args;
        CORINFO_SIG_INFO locals;
    };

The third and fourth fields of the struct (ILCode and ILCodeSize) are the key information the JIT wants. All a packer needs to do is decrypt the MSIL, set these fields to the correct values, and then call compileMethod() to make the JIT happy.

There is one trick here: how does the packer know when the CLR is going to call the JIT? If we check the export table of mscorjit.dll, we find a function: getJit(). What do we get when we call it? Check the code.

    .text:7907EA7A __stdcall getJit() proc near
    .text:7907EA7A mov eax, dword_790AF168
    .text:7907EA7F test eax, eax
    .text:7907EA81 jnz short locret_7907EA97
    .text:7907EA83 mov eax, offset dword_790AF170
    .text:7907EA88 mov dword_790AF170, offset const CILJit::`vftable'
    .text:7907EA92 mov dword_790AF168, eax
    .text:7907EA97
    .text:7907EA97 locret_7907EA97: ; CODE XREF: getJit()+7j
    .text:7907EA97 retn
    .text:7907EA97 __stdcall getJit() endp

Obviously, it returns 0x790AF170, the address of a dword that holds a pointer to CILJit::`vftable'. What is in this table?

    .text:7907EA98 const CILJit::`vftable' dd offset CILJit::compileMethod(ICorJitInfo *,CORINFO_METHOD_INFO *,uint,uchar * *,ulong *)
    .text:7907EA98 ; DATA XREF: getJit()+Eo
    .text:7907EA9C dd offset CILJit::clearCache(void)
    .text:7907EAA0 dd offset CILJit::isCacheCleanupRequired(void)

The first member of CILJit::`vftable' points to compileMethod, which we have just discussed. This vtable never changes for a given .NET Framework version and machine. A packer can either call getJit() or patch the dword in memory directly.
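The patching idea can be modelled in Python without touching a real process. In this simulation a list stands in for the vtable, a plain function stands in for compileMethod, and the XOR "decryption" with a hypothetical key stands in for the packer's real algorithm. It shows only the control flow, not real native hooking.

```python
def real_compile_method(il_code: bytes) -> str:
    """Stand-in for CILJit::compileMethod: 'compiles' whatever IL it receives."""
    return f"native code for {il_code.decode('ascii')}"

# Stand-in for the vtable: slot 0 is compileMethod.
vftable = [real_compile_method]

def decrypt(il_code: bytes, key: int = 0x21) -> bytes:
    """Toy decryption (XOR with a hypothetical one-byte key)."""
    return bytes(b ^ key for b in il_code)

def install_hook(table):
    """Replace slot 0 with a wrapper that decrypts IL, then calls the original."""
    original = table[0]

    def hooked_compile_method(il_code: bytes) -> str:
        return original(decrypt(il_code))

    table[0] = hooked_compile_method
    return original

encrypted_il = decrypt(b"ldstr/call/ret")   # XOR is symmetric, so this encrypts
install_hook(vftable)

# The runtime keeps calling through slot 0 and never sees encrypted IL.
result = vftable[0](encrypted_il)
print(result)
```

The same observation helps the reverser: whoever sits in slot 0 after the packer, or in front of it, sees every method body in decrypted form.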

Hook JIT

We have seen how packers wrap the JIT and dynamically decrypt MSIL. But some packers do not wrap; they hook specific methods of the JIT and the EE (Execution Engine). A popular place to hook is RuntimeMethodHandle::GetMethodBody.

    .text:7A120C88 public: static class MethodBody * __fastcall RuntimeMethodHandle::GetMethodBody(class MethodDesc * *, void *) proc near
    .text:7A120C88 ; DATA XREF: RuntimeMethodHandle::GetMethodBody(MethodDesc * *,void *)+29o
    .text:7A120C88 ; .data:7A384EE4o
    .text:7A120C88
    .text:7A120C88 var_B0 = dword ptr -0B0h

    .text:7A120DDF mov ecx, [ebp-34h]
    .text:7A120DE2 call MethodDesc::GetILHeader(void)
    .text:7A120DE7 mov [ebp-7Ch], eax

Pay attention to the instruction call MethodDesc::GetILHeader: packers can do their job here. They jump to their own code, decrypt the MSIL code and metadata, and then continue with GetMethodBody.

Packers that use the hooking method almost always decrypt per method. That means they never decrypt the whole assembly at once, but only the method that is currently needed. This makes it impossible to dump the whole assembly from memory directly, because the whole assembly is never in memory.

To reverse applications protected by these packers (more precisely, to dump the whole assembly), you need some skill to deceive the packer: make it believe the JIT wants the code, so that it hands us the decrypted code. Why not decrypt statically? Because most packers use more than one encryption algorithm and select among them at random to encrypt different methods. I prefer deception.

How? OK, I'll tell you one of the methods I use to deceive these packers. Using the reflection mechanism of the .NET Framework, enumerate all classes, methods, fields and so on, scan the decrypted metadata in memory, and then rebuild the assembly. Sometimes we don't need the assembly to be identical to the original, or even to run; we just need it to be decompilable. But as far as I know, there is no generic unpacker that works for every type of protection under .NET.
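The enumerate-and-request idea can be illustrated with a toy Python model. The PerMethodPacker class, its XOR key and the method names are all hypothetical. A real attack goes through .NET reflection and the runtime's internals, which this sketch does not attempt to reproduce.

```python
class PerMethodPacker:
    """Toy packer: bodies stay encrypted until a single method is requested."""

    def __init__(self, methods, key=0x37):   # hypothetical XOR key
        self._key = key
        self._encrypted = {
            name: bytes(b ^ key for b in body) for name, body in methods.items()
        }

    def method_names(self):
        """What reflection would let us enumerate."""
        return list(self._encrypted)

    def request_body(self, name):
        """What the packer does when the runtime asks for one method."""
        return bytes(b ^ self._key for b in self._encrypted[name])

def rebuild(packer):
    """Ask for every method in turn and collect the decrypted bodies."""
    return {name: packer.request_body(name) for name in packer.method_names()}

packer = PerMethodPacker({
    "Main": b"ldstr call ret",
    "CheckLicense": b"ldarg.0 call brfalse ret",
})
dump = rebuild(packer)
for name, body in dump.items():
    print(name, body)
```

The dumper never needs to know which algorithm protects which method. It lets the packer do the decryption and only collects the results.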

Some Anti Tricks

This section will be short and just introduces some little tricks that make analysis of your program a bit harder. Most of them are borrowed from Win32 anti-reversing. I will only mention most of them by name, because no explanation is needed.

How do you defeat profilers? You can set the environment variable COR_ENABLE_PROFILING to 0 so that the profiler is disabled every time your application runs. Or patch the flag directly in memory. The following code gives some clue as to how .NET passes information to the profiler.

    .text:79E972AD ; MethodDesc::MakeJitWorker(COR_ILMETHOD_DECODER *,ulong)+267972j ...
    .text:79E972AD test byte ptr ProfilerStatus g_profStatus, 6
    .text:79E972B4 jnz loc_7A0FEB52
    .text:79E972BA
    .text:79E972BA loc_79E972BA:

If the profiler is disabled, managed debugging is disabled too. But reversers can still use OllyDbg for native debugging. Use System.Diagnostics.Debugger::get_IsAttached supplied by .NET, or IsDebuggerPresent from kernel32.dll, to detect that. System.Diagnostics.DebuggerHiddenAttribute may also be used to keep newcomers from freely stepping through key methods (in managed debugging).

Enumerate all window names and check whether some should-not-be-there application is running. Enumerating processes with Process32First and Process32Next is also a good way.

Use time-span checks in key methods to deceive reversers. Don't just exit if the time span is too large; give them a wrong registration code instead.

Use Internet verification. But I don't like software that forces Internet verification, because most of the time I have no Internet access.

Don't offer trial versions on your site.
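Of the tricks above, the time-span check is the easiest to sketch. In the Python illustration below the registration algorithm, the threshold and the function names are all made up; the point is only that a slow run quietly produces a wrong code instead of an exit.

```python
import time

def make_code(name: str) -> str:
    """Hypothetical registration-code algorithm (illustration only)."""
    return f"{sum(name.encode('utf-8')) * 7919 % 1000000:06d}"

def expected_code(name: str, threshold: float = 0.5, clock=time.perf_counter) -> str:
    """Return the real code, or a plausible wrong one if execution was too slow.

    A long gap between the two clock readings suggests single-stepping in a
    debugger. Instead of exiting, the function quietly returns a wrong code.
    """
    start = clock()
    code = make_code(name)
    elapsed = clock() - start
    if elapsed > threshold:
        return make_code(name[::-1] + "x")   # decoy: looks valid, is not
    return code

print(expected_code("alice"))

# Simulate a debugger pause with a fake clock that jumps 10 seconds.
ticks = iter([0.0, 10.0])
print(expected_code("alice", clock=lambda: next(ticks)))
```

A reverser who copies the code seen under the debugger ends up with a code that fails in a normal run, with no obvious failure point to search for.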

Summary

We have discussed some (maybe not all) popular protection techniques under .NET, introduced their implementations, and shown how to reverse them. You can choose the one or ones most suitable for your application. With the arrival of Vista, the war between reverse engineering and protection under .NET has only just begun. Which side will win? Just as in history, one may win a battle, but no one can win the war.

 
