Wednesday, January 17, 2007

Code Obfuscator

One of the key architectural features of Microsoft .NET is that all its languages compile to .NET assemblies containing CPU-independent instructions. These instructions, known as Microsoft Intermediate Language (MSIL), contrast with the CPU-specific instructions that compilers for many other languages generate for the target CPU. .NET assemblies comprise both MSIL instructions and metadata that describe types, members, and references to code in other assemblies. At runtime, the MSIL instructions are converted to CPU-specific instructions by a just-in-time (JIT) compiler.

This architecture has several major benefits for the .NET developer. For instance, it makes interoperability easy for code written in different languages, and it lets an assembly specify exactly which version of another assembly it will work with. But there is one major drawback for some developers as well: the MSIL and metadata in an assembly provide enough information to recover something very close to the original code. The .NET Framework SDK ships with a tool, ILDASM, that can disassemble an assembly into readable MSIL, and other utilities can carry the process even further, translating a .NET assembly back into a high-level language such as C# or Visual Basic .NET.

This is a drawback because it is very difficult to protect the intellectual property in an application if anyone can read its source code. Developers who have spent months or years coming up with complex algorithms, or workarounds for bugs in the .NET Framework or other components, often prefer to keep their methods secret from their competitors.

This is where obfuscators come in. The purpose of an obfuscator is to apply one or more transformations to a .NET assembly that leave its behavior unchanged but make any source code recovered from it difficult or impossible to understand. As a simple example, every obfuscator offers some level of member renaming. Source code in which all of the objects are named things like A, B, and C instead of Customer, ClientData, and ServiceUpdate is substantially more difficult to understand.
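To make the renaming idea concrete, here is a minimal sketch in Python rather than .NET. It is only an illustration of the transformation, not how any real .NET obfuscator works: those operate on MSIL and metadata, not on source text. The Renamer class and obfuscate function are hypothetical helpers written for this post, and the sketch assumes Python 3.9 or later for ast.unparse.

```python
import ast


class Renamer(ast.NodeTransformer):
    """Toy renamer: maps each descriptive name to a short, meaningless one."""

    def __init__(self, names):
        # Customer -> A, ClientData -> B, ServiceUpdate -> C, ...
        self.mapping = {name: chr(ord("A") + i) for i, name in enumerate(names)}

    def visit_ClassDef(self, node):
        # Rename the class definition itself, then walk into its body.
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        # Rename every reference to a mapped name.
        node.id = self.mapping.get(node.id, node.id)
        return node


def obfuscate(source, names):
    """Return the source with the given names replaced (hypothetical helper)."""
    tree = Renamer(names).visit(ast.parse(source))
    return ast.unparse(tree)


SOURCE = """
class Customer:
    pass

class ClientData:
    pass

record = ClientData()
owner = Customer()
"""

OBFUSCATED = obfuscate(SOURCE, ["Customer", "ClientData", "ServiceUpdate"])
print(OBFUSCATED)
```

The renamed program still runs and behaves the same way, but a reader no longer gets any hint from the names about what each class is for.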

How to obfuscate?
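Simplest way: at the source level, in languages that have a macro preprocessor (such as C), use macros to create hard-to-read code by masking the standard language syntax and grammar in the main body of code. Compiled .NET assemblies, by contrast, are obfuscated by tools that transform the MSIL and metadata.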

Pitfalls:
Although obfuscation is a valuable technology in many circumstances, you do need to be aware of some potential pitfalls:
1. Obfuscation can break code that depends on reflection, serialization, or remoting.
2. Obfuscation can make diagnosing and debugging problems in your code more difficult.
3. Obfuscation adds another step and another potential error source to your build process.
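The first pitfall is the one that tends to surprise people, so here is a minimal sketch of it, again in Python as a stand-in for .NET reflection. Everything in it is hypothetical: create_by_name plays the role of a reflection-style lookup by type name, and the class A stands in for what Customer might become after a renaming pass.

```python
class Customer:
    def __init__(self, name):
        self.name = name


class A:
    """Stand-in for Customer after a hypothetical renaming pass."""

    def __init__(self, name):
        self.name = name


def create_by_name(type_name, namespace, *args):
    """Reflection-style lookup: find a class by its string name and instantiate it."""
    cls = namespace.get(type_name)
    if cls is None:
        raise LookupError(f"type {type_name!r} not found")
    return cls(*args)


# The types visible in the original build and in the obfuscated build.
original_build = {"Customer": Customer}
obfuscated_build = {"A": A}

# Works: the string "Customer" matches the real type name.
created = create_by_name("Customer", original_build, "Ada")

# Fails: the string literal was not renamed along with the type.
try:
    create_by_name("Customer", obfuscated_build, "Ada")
    lookup_failed = False
except LookupError:
    lookup_failed = True
```

The string "Customer" lives in data (or a config file, or a serialized stream), where a renamer cannot safely touch it, which is why obfuscators usually let you exclude such types from renaming.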

Obfuscation in Visual Studio .NET:
If your obfuscation needs are minimal and you're a Visual Studio .NET user, you may not need to purchase a product at all. That's because Visual Studio .NET includes a copy of the Community Edition of Dotfuscator for .NET, an obfuscator from PreEmptive Solutions. This obfuscator is targeted at students and freeware authors, and supports basic entity renaming and removal of unused metadata, but no advanced obfuscation features.

Recreational obfuscation:
Code is sometimes obfuscated deliberately for recreational purposes. There are programming contests that reward the most creatively obfuscated code: the International Obfuscated C Code Contest, the Obfuscated Perl Contest, the International Obfuscated Ruby Code Contest, and the Obfuscated PostScript Contest.
There are many varieties of interesting obfuscation, ranging from simple keyword substitution and the use or non-use of whitespace to create artistic effects, to clever self-generating or heavily compressed programs, to programs that are valid and operate similarly in multiple programming languages.

Points to note:
No obfuscator known today provides any guarantee about how difficult the code will be to reverse engineer. See: http://www.math.ias.edu/~boaz/Papers/obf_informal.html
Obfuscation can also be misused. Hackers can use it to foil anti-virus programs that rely on a virus having a static signature, and spammers can use it to hide their intentions. See: http://www.itbusiness.ca/it/client/en/Home/News.asp?id=41807&bSearch=True
