This article aims to explain how Code Virtualizer works. During the last month, I
spent all my free time analysing the Code Virtualizer Demo 1.0.1.0 unpacked by
softworm. Fortunately, I finished my analysis and I can say that this is the best
software I have ever seen. Not best in terms of protection, but in terms of
organization. This was the most pleasing software I have analysed.
There are three important things to note. First, the description and explanation
of the code disassembled by OllyDbg follow the code execution order. Second, most
of what I am going to say applies only to version 1.0.1.0 of Code Virtualizer
(for comments on newer versions, see "Hopes for the Future and Acknowledgments").
Third, I will not treat the 64-bit case.
This article is divided into two parts. First, I am going to talk about how the
Virtual Machine is generated and why Oreans says that each Virtual Machine has
its own characteristics. Second, I use those concepts to explain how the Virtual
Opcodes are generated, how they are executed and why they emulate the original
code of an application.
Enjoy this article; I hope you learn something from reading it.
Contents
1 Introduction
1.1 About Code Virtualizer
1.2 About this article
2 The Virtual Machine - Light VM
2.1 The Virtual Machine itself
2.2 Generating the Virtual Machine
3 The Virtual Opcodes
3.1 Disassembling and "Assembling" again
3.2 Generating and Writing the Virtual Opcodes
3.3 Completing the analysis: why does this really work?
4 Hopes for the Future and Acknowledgments
4.1 Why write this article?
4.2 The general attack approach
4.3 Acknowledgments
DISCLAIMER
ALERT: THIS ARTICLE MUST BE USED ONLY FOR SCIENTIFIC/STUDY
PURPOSES. THE AUTHOR OF THIS ARTICLE IS NOT RESPONSIBLE FOR ANY USE
OF THE KNOWLEDGE DESCRIBED HERE FOR ILLEGAL PURPOSES. YOU ARE ONLY
ALLOWED TO READ THIS ARTICLE IF YOU AGREE WITH THIS DISCLAIMER.
1 Introduction
1.1 About Code Virtualizer
Code Virtualizer is a powerful code obfuscation system that helps
developers protect their sensitive code areas against Reverse
Engineering. Code Virtualizer has been designed to enact high
security for your sensitive code while requiring minimal system
resources.
Code Virtualizer will convert your original code into Virtual
Opcodes that will be only understood by an internal Virtual Machine.
Those Virtual Opcodes and the Virtual Machine itself are different
for every protected application, avoiding a general attack over Code
Virtualizer.
Code Virtualizer can protect your code in any x32 and x64 native
PE files, like executable files (EXEs), system services, DLLs, OCXs,
ActiveX controls, screen savers and device drivers[1].
1.2 About this article
First of all, I need to apologize. You will probably see a lot of
mistakes because of my English, but I hope you will understand me.
This article aims to explain how Code Virtualizer works. During the
last month, I spent all my free time analysing the Code Virtualizer
Demo 1.0.1.0 unpacked by softworm[2].
Fortunately, I finished my analysis and I can say that this is the
best software I have ever seen. Not best in terms of protection, but
in terms of organization. This was the most pleasing software I have
analysed.
There are three important things to note. First, the description and
explanation of the code disassembled by OllyDbg[3] follow the code
execution order. Second, most of what I am going to say applies only
to version 1.0.1.0 of Code Virtualizer (for comments on newer
versions, see "Hopes for the Future and Acknowledgments"). Third, I
will not treat the 64-bit case.
This article is divided into two parts. First, I am going to talk
about how the Virtual Machine is generated and why Oreans[4] says
that each Virtual Machine has its own characteristics. Second, I use
those concepts to explain how the Virtual Opcodes are generated, how
they are executed and why they emulate the original code of an
application.
Enjoy this article; I hope you learn something from reading it.
2 The Virtual Machine - Light VM
2.1 The Virtual Machine itself
You have probably noticed that I called this Virtual Machine
"Light VM". Actually, it was not me but the Oreans developers who
named it, probably in reference to the Themida Virtual Machine.
Basically, each Virtual Machine has 150 handlers and a main
handler. By handler, I mean a kind of function that deals with
the Virtual Opcodes. In general, they are small (one to six lines of
assembly code) and it is really important to understand each one.
Next I will show the first structure, which I have called
Handler_Information, and an example of it (figure 1):
- WORD id // a number that represents the handler
- DWORD start // the address of the start of this handler in the Code Virtualizer file
- DWORD end // the address of the end of this handler in the Code Virtualizer file
- DWORD address // the address of the start of the handler in the protected file
- WORD order // random number (from 0Eh to A4h) that indicates the place of the handler in the protected file
Figure 1: Handler_Information structure example
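As a rough illustration, the structure can be read from raw bytes with a few lines of Python. This is only a sketch: it assumes the fields are little-endian and packed without padding in the order listed above, which the analysis does not confirm. The id, start and end values come from the figure 1 example; the address and order values are made up to complete the record.

```python
import struct
from collections import namedtuple

# Assumed layout: little-endian, no padding, fields in the listed order.
# WORD id, DWORD start, DWORD end, DWORD address, WORD order -> 16 bytes.
HANDLER_INFO = struct.Struct("<HIIIH")

HandlerInformation = namedtuple(
    "HandlerInformation", "id start end address order"
)


def parse_handler_information(blob, offset=0):
    """Unpack one Handler_Information record from raw bytes."""
    return HandlerInformation(*HANDLER_INFO.unpack_from(blob, offset))


# id, start and end are taken from the figure 1 example.
# address (00407400h) and order (002Dh) are hypothetical filler values.
record = HANDLER_INFO.pack(0x0000, 0x006035F0, 0x006035F8, 0x00407400, 0x002D)
info = parse_handler_information(record)

# Size of the handler body, assuming `end` points one past the last byte.
handler_size = info.end - info.start
```

Under these assumptions one record is 16 bytes long, so a table of 150 records would occupy 2400 bytes.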
This structure is the principal one used to generate the VM. I will
not show you each of the 150 handlers. It is tedious, but if you want
to study Code Virtualizer more deeply, you must read and understand
them one by one. I will show you just the handler from the figure
above (figure 1; id = 0000h; start = 006035F0h; end = 006035F8h) and
the main handler (figure 3):
There is a particularity in the main handler: the DWORD 11111111h
appears three times. Each occurrence is replaced by a value that
depends on the protected application. The first DWORD is the address
of the seventh line of the main handler in the protected file. The
second one is the "image base" of the Virtual Machine. The last DWORD
is the total number of handlers in that VM.
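The replacement of the three 11111111h DWORDs can be pictured with the short Python sketch below. It is not Code Virtualizer's own routine: the template bytes, the two addresses and the handler count are hypothetical values chosen only to show the idea of filling placeholders in order.

```python
import struct

PLACEHOLDER = struct.pack("<I", 0x11111111)


def patch_placeholders(template, values):
    """Replace successive 11111111h DWORDs in `template` with `values`."""
    out = bytearray(template)
    pos = 0
    for value in values:
        pos = out.find(PLACEHOLDER, pos)
        if pos < 0:
            raise ValueError("fewer placeholders than values")
        out[pos:pos + 4] = struct.pack("<I", value)
        pos += 4  # continue searching after the DWORD just written
    return bytes(out)


# Hypothetical template: the 90h bytes are filler, not the real main handler.
template = (
    b"\x90" + PLACEHOLDER + b"\x90\x90" + PLACEHOLDER + b"\x90" + PLACEHOLDER
)

# Hypothetical values, in the order described above:
# address inside the main handler, VM image base, number of handlers.
patched = patch_placeholders(template, [0x004072E4, 0x00407000, 150])
```

The function raises an error if the template contains fewer placeholders than values, which makes a truncated template easy to spot.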
2.2 Generating the Virtual Machine
Here I will give the arguments that I use later to show that the
feature described by the phrase "Those Virtual Opcodes and the Virtual
Machine itself are different for every protected application, avoiding
a general attack over Code Virtualizer." is not a very important one.
The first step done by Code Virtualizer is to write the main
handler. Next, the other 150 handlers are written following the
Handler_Information.order sequence from 0Eh to A4h. As
Handler_Information.order is randomly generated, the result is a
different sequence of handlers for every protected application (if
you want an example, see [5]).
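The effect of a random Handler_Information.order can be sketched as follows. This is an illustration only: it assumes each handler gets a unique slot starting at 0Eh (so 150 handlers fill 0Eh up to, but not including, A4h), it uses Python's random module rather than whatever generator Code Virtualizer uses, and the handler ids 0 to 149 are hypothetical because the real ids are not contiguous.

```python
import random

NUM_HANDLERS = 150
FIRST_SLOT = 0x0E  # lower bound quoted for Handler_Information.order


def assign_orders(handler_ids, seed):
    """Give each handler a unique random slot and return the write order."""
    rng = random.Random(seed)
    slots = list(range(FIRST_SLOT, FIRST_SLOT + len(handler_ids)))
    rng.shuffle(slots)
    order_of = dict(zip(handler_ids, slots))
    # Handlers are written following the order sequence, lowest slot first.
    layout = sorted(handler_ids, key=order_of.__getitem__)
    return order_of, layout


# Hypothetical ids; real ids such as 0154h or 0161h are not contiguous.
ids = list(range(NUM_HANDLERS))
orders_a, layout_a = assign_orders(ids, seed=1)
orders_b, layout_b = assign_orders(ids, seed=2)
```

Two different seeds stand in for two protected applications: the same set of handlers ends up in a different physical order each time.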
Now I am going to explain how the handler 0000h (see figure 2)
is written. The same process occurs for every handler.
The next step is shown by the code below:
Figure 4: LODS special case
This piece of code looks for LODS instructions. This does not
apply to the handlers 0154h and 0156h. But why these checks?
Well, the LODS instruction in a handler represents the reading of 1,
2 or 4 Virtual Opcodes. To increase security, the Oreans developers
insert random code after the LODS instruction. To do that, they use
another structure that I have called Special_Handler. Here it is:
- WORD Handler_Information.id // see Handler_Information structure
- BYTE instruction3 // number that says what kind of instruction will be written as the third instruction
- BYTE instruction2 // number that says what kind of instruction will be written as the second instruction
- BYTE instruction1 // number that says what kind of instruction will be written as the first instruction
- BYTE instruction4 // number that says what kind of instruction will be written as the fourth instruction
- DWORD Random1 // random number that will be part of the instruction 2
- DWORD Random2 // random number that will be part of the instruction 3
Table 1: Table of possible random instructions. Each of these
instructions can be written in DWORD, WORD or BYTE format using
the respective registers ax, bx, al, bl.

value | instruction1 | instruction2 | instruction3 | instruction4
0 | sub eax,ebx | sub eax,Random1 | sub eax,Random2 | sub ebx,eax
1 | add eax,ebx | add eax,Random1 | add eax,Random2 | add ebx,eax
2 | xor eax,ebx | xor eax,Random1 | xor eax,Random2 | xor ebx,eax
Figure 5: Example of Special_Handler structure
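To make Table 1 concrete, here is a small Python sketch of what the four inserted instructions do to a value that was just read by LODS. It is a model under assumptions, not disassembled code: the key is taken to live in EBX (as described in section 3.3), the BYTE and WORD forms are assumed to touch only the low part of each register, and the Special_Handler values in the example are invented.

```python
# Table 1 values: 0 = sub, 1 = add, 2 = xor.
OPS = {
    0: lambda a, b, mask: (a - b) & mask,
    1: lambda a, b, mask: (a + b) & mask,
    2: lambda a, b, mask: (a ^ b) & mask,
}


def decode_after_lods(raw, key, special, width=4):
    """Apply the four post-LODS instructions.

    `special` is (instruction1, instruction2, instruction3, instruction4,
    Random1, Random2). Returns (decoded value, updated key).
    """
    mask = (1 << (8 * width)) - 1
    i1, i2, i3, i4, random1, random2 = special
    eax = raw & mask
    ebx = key & mask
    eax = OPS[i1](eax, ebx, mask)      # instruction1: op eax, ebx
    eax = OPS[i2](eax, random1, mask)  # instruction2: op eax, Random1
    eax = OPS[i3](eax, random2, mask)  # instruction3: op eax, Random2
    ebx = OPS[i4](ebx, eax, mask)      # instruction4: op ebx, eax
    # Assumption: narrow forms only change the low bytes of the key.
    new_key = (key & ~mask & 0xFFFFFFFF) | ebx
    return eax, new_key


# Invented Special_Handler: xor, add, sub, add with two made-up randoms.
SPECIAL = (2, 1, 0, 1, 0x12345678, 0x0BADF00D)
value, new_key = decode_after_lods(0x62, 0x00408000, SPECIAL, width=1)
```

The important point is the last instruction: the key itself is updated from the decoded value, so every Virtual Opcode depends on all the ones read before it.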
So before those operations, the handler 0000h (figure 2)
will be like this:
Figure 6: Handler 0000h before addition of 4 instructions
The next step is another security feature. Some kinds of
instructions are mutated by the Oreansf1.F4 function exported by the
Oreansf1.dll module. This means that the code of each handler is
obfuscated. Moreover, this mutation engine is tied to the option
Virtual Machine Obfuscation: this option only changes the complexity
of the mutated opcodes. This is really strange, because for a general
attack on Code Virtualizer it makes no difference whether the
complexity of the VM is low or highest (for more comments, see "Hopes
for the Future and Acknowledgments").
Figure 7: Handler 0000h with mutated opcodes
Before that, a JUMP to the main handler is written so the next
handler will be called.
The next security feature is quite fun to see: all the 150
handlers are mixed randomly! For example, a piece of the handler
0161h is followed by a piece of the handler 0001h, then a piece of
the handler 0069h, and so on.
So in the end there will be completely obfuscated, unique code that
is difficult to analyse. Really? I do not think so :).
3 The Virtual Opcodes
3.1 Disassembling and "Assembling" again
I know that things are still obscure. You probably still have no
idea how those handlers work, but I promise that it will become
clear in section 3.3.
The figure below shows a macro that is not virtualized. The code that
will be virtualized starts at 0040106Eh and ends at 0040107Dh.
Figure 8: Macro not virtualized
Next, a PUSH 0040108Dh and a RET are added to the original code
so the program can continue its execution normally.
After that, the exported function Oreansf1.F1 disassembles the
original code, as you can see below. It was really a surprise to me
when I saw that; I expected Code Virtualizer to work on the bytes of
the original code, not on strings. It uses Delphi functions to handle
the strings, and I think this is not the fastest way, but it is
certainly the easier one.
Figure 9: Code disassembled
Now the function OreansX2dllR.F1, exported by OreansX2dllR.dll, does
the principal and most complex work: it assembles the assembly code
into a Code Virtualizer syntax and generates the most important
structure, which I have called OreansX2.
OreansX2 structure:
- DWORD instruction // type of instruction following the Code Virtualizer syntax
- DWORD sufix // suffix for the instruction
- DWORD data1 // data for the instruction
- DWORD data2 // data for the instruction
- WORD unknown // unknown use
Table 2: Table of possible instructions for OreansX2 structure

OreansX2.instruction | instruction
00 | LOAD
01 | STORE
02 | MOVE
03 | IFJMP
04 | EXTRN
05 | UNDEF
06 | IMULC
07 | ADC
08 | ADD
09 | AND
0A | CMP
0B | OR
0C | SUB
0D | TEST
0E | XOR
0F | MOVZX
10 | MOVZX_W
11 | LEA
12 | INC
13 | RCL
14 | RCR
15 | ROL
16 | ROR
17 | SAL
18 | SAR
19 | SHL
1A | SHR
1B | DEC
1C | NOP
1D | MOVSX
1E | MOVSX_W
1F | CLC
20 | CLD
21 | CLI
22 | CMC
23 | STC
24 | STD
25 | STI
26 | HLT
27 | BT
28 | BTC
29 | BTR
2A | BTS
2B | SBB
2C | MUL
2D | IMUL
2E | DIV
2F | IDIV
30 | BSWAP
31 | NEG
32 | NOT
33 | RET
Table 3: Table of possible suffixes

OreansX2.sufix | suffix
00 | (empty)
01 | ADDR
02 | %sADDR, %d
03 | %sADDR, %.8x%h
04 | BYTE PTR %s[ADDR]
05 | WORD PTR %s[ADDR]
06 | DWORD PTR %s[ADDR]
07 | QWORD PTR %s[ADDR]
08 | %sBYTE PTR [%.8x%h]
09 | %sWORD PTR [%.8x%h]
0A | %sDWORD PTR [%.8x%h]
0B | %sQWORD PTR [%.8x%h]
0C | ADDR, BYTE PTR %s[%.8x%h]
0D | ADDR, WORD PTR %s[%.8x%h]
0E | ADDR, DWORD PTR %s[%.8x%h]
0F | ADDR, QWORD PTR %s[%.8x%h]
10 | %s%d
11 | %s%.8x%h
12 | reserved
13 | reserved
14 | reserved
15 | reserved
16 | reserved
17 | reserved
18 | BYTE
19 | WORD
1A | DWORD
1B | QWORD
1C | reserved
1D | reserved
1E | FLAGS
1F | %s[ADDR]
20 | %sBYTE %d
21 | %sWORD %d
22 | %sDWORD %d
23 | %sQWORD %d
As you can see, the syntax is quite logical. It uses XOR, ADD, etc.
for well-known instructions and obvious names like MOVE, STORE, LOAD
for "special" instructions; the suffixes use a single variable ADDR
and well-known formats like DWORD PTR [ADDR].
I still do not completely understand how those instructions are
generated from the disassembled original code, but I think that this
is not a problem if you do some tests to see the pattern. Next I show
you one assembly instruction followed by the equivalent block of Code
Virtualizer instructions with their respective OreansX2 structures
(see the file [5] for more examples).
Figure 10: Example of Code Virtualizer syntax
You may have noticed that the first parameter of the first OreansX2
structure above is 80000002h. The 02 means MOVE, as you can see in
Table 2, and the 80 means that this instruction has a relative
address. That is, the address F0000028h is relative to the image base
of the Virtual Machine.
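Splitting such a DWORD can be sketched in Python as below. I assume here that the relative-address marker is the top bit of the DWORD and that the instruction number sits in the low byte; the article only shows the single value 80000002h, so other encodings are possible. The mnemonic dictionary is a short excerpt of Table 2, and the function name is mine.

```python
RELATIVE_FLAG = 0x80000000  # assumed meaning of the leading "80"

# Short excerpt of Table 2 (OreansX2.instruction -> mnemonic).
MNEMONICS = {
    0x00: "LOAD",
    0x01: "STORE",
    0x02: "MOVE",
    0x03: "IFJMP",
    0x08: "ADD",
    0x0C: "SUB",
    0x0E: "XOR",
    0x19: "SHL",
    0x33: "RET",
}


def split_instruction(dword):
    """Return (mnemonic, is_relative) for an OreansX2.instruction DWORD."""
    is_relative = bool(dword & RELATIVE_FLAG)
    number = dword & 0xFF
    mnemonic = MNEMONICS.get(number, "UNKNOWN_%02X" % number)
    return mnemonic, is_relative


first = split_instruction(0x80000002)   # the value discussed above
plain = split_instruction(0x0000000E)   # a XOR without the flag
```

Numbers missing from the excerpt come back as UNKNOWN_xx, so the sketch can be extended row by row from Table 2.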
3.2 Generating and Writing the Virtual Opcodes
Given a vector of OreansX2 structures, a sequence of operations is
performed to produce the next structure, which I have called
Pre_Handler. The size of this structure is 28h bytes.
- DWORD counter // counter that is incremented by 0Eh for each Pre_Handler structure
- DWORD real_opcode_mark // the address of the original opcode in allocated memory. This applies only to the first Code Virtualizer instruction of the block of instructions that represents the original opcode
- DWORD unknown1 // unknown use
- DWORD counter_0E // this is Pre_Handler.counter plus 0Eh (unknown use)
- BOOL is_special // TRUE if the original opcode is any kind of call, jump, conditional jump and others. In this case, a special structure will be generated for those instructions
- BYTE instruction // same as OreansX2.instruction
- DWORD sufix // same as OreansX2.sufix
- DWORD data1 // same as OreansX2.data1
- DWORD data2 // same as OreansX2.data2
- WORD unknown2 // same as OreansX2.unknown
- 7 bytes unknown
- BOOL is_relative_address // TRUE if the instruction has a relative address
Figure 11: Example of Pre_Handler structure
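The stated size of 28h bytes can be cross-checked against the field list with Python's struct module. The check below works out only under two assumptions of mine: the structure is packed without padding, and each BOOL occupies a single byte. The record values in the example are invented.

```python
import struct

# "<" selects little-endian with no padding; "?" is a one-byte boolean.
PRE_HANDLER = struct.Struct(
    "<"
    "I"   # counter
    "I"   # real_opcode_mark
    "I"   # unknown1
    "I"   # counter_0E
    "?"   # is_special (assumed to be 1 byte)
    "B"   # instruction
    "I"   # sufix
    "I"   # data1
    "I"   # data2
    "H"   # unknown2
    "7s"  # 7 unknown bytes
    "?"   # is_relative_address (assumed to be 1 byte)
)

# Invented record: a MOVE (02) with the relative-address flag set.
raw = PRE_HANDLER.pack(
    0x0E, 0x00A10000, 0, 0x1C, False, 0x02, 0x0E, 0xF0000028, 0,
    0, b"\x00" * 7, True,
)
fields = PRE_HANDLER.unpack(raw)
counter, counter_0e = fields[0], fields[3]
```

With 4-byte BOOLs or natural alignment the total would not be 28h, which is why I lean towards the packed, one-byte reading.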
Now the principal structure, the one directly related to the
generation of the Virtual Opcodes, can be studied. I have called this
structure Handler.
- WORD handler // this is the principal parameter: it is the one that determines which handler must be called. It is equivalent to Handler_Information.id
- DWORD Pre_Handler_addr // address in memory of the corresponding Pre_Handler structure that generated this Handler structure
- DWORD memory_opcode // memory address where the Virtual Opcode represented by this structure will be written
- BYTE type_of_handler // 0 if the handler does not read Virtual Opcodes through a LODS instruction. 1, 2, 4, 8 if the handler reads 1, 2, 4, 8 Virtual Opcodes
- BYTE unknown2 // unknown use
- DWORD data1 // data for the Code Virtualizer instruction (for example, for LOAD 18h, data1 will be 18h)
- DWORD data2 // data for the case of a 64-bit Code Virtualizer instruction
- DWORD file_opcode // address in the protected file where the Virtual Opcode represented by this structure will be written
Figure 12: Example of Handler structure
Each Handler structure can generate 1, 2 or 4 Virtual Opcodes, so
it is essential to understand how the vector of Handler structures is
generated.
This is not so complicated, but if I put each case here, this
article would be too big. So I will just comment on how this works;
if you want more details, see [6].
Basically, each vector of Handler structures starts with the
handler 015Bh and ends with the handlers 0161h and 015Ch. The
handlers 015Bh and 015Ch do not actually exist. They are there just
to tell Code Virtualizer that special code must be inserted to handle
the moment when the execution of Virtual Opcodes starts and the
moment when it finishes. This special code will be shown shortly.
Between those handlers, the Pre_Handler structure is treated like
this: if Pre_Handler.is_special is TRUE, the handler 0161h is
added to the corresponding Handler structure. After that, a different
sequence of Handler structures is generated for each of the cases:
MOVE, LOAD, STORE, SHL, ADD, SUB, IFJMP, RET, UNDEF and the default
case (for the other Code Virtualizer instructions). You can see more
details about those sequences in [6].
Having understood how the vector of Handler structures is
generated, you can finally understand the brilliant part of Code
Virtualizer: how the Virtual Opcodes are built.
The first thing to discuss is what happens when Code Virtualizer
finds the handlers 015Bh and 015Ch. There is pre-built virtualized
code (meaning that the Code Virtualizer instructions and the other
structures are not there) that is responsible for initializing and
uninitializing the Virtual Machine, for example saving or restoring
the registers and flags around the execution of the protected
application's Virtual Opcodes.
Now I am going to talk about the generation of Virtual Opcodes
given the Handler structure. The first thing that Code Virtualizer
does is quite surprising. Using a random number generator, it decides
whether to execute a specific CALL. This CALL is responsible for
generating "fake" Virtual Opcodes. That is, those Virtual Opcodes are
going to be executed but they will not change anything in the program
(like a sequence of NOPs), so they are useful to obfuscate the real
Virtual Opcodes. Besides, there are five different sequences of
"fake" Virtual Opcodes, which makes the analysis of the program even
more difficult. What is more, the option Virtual Opcode Obfuscation
(low, normal, high, highest) is related only to these "fake" Virtual
Opcodes. Depending on that option, the chance that the random number
generator allows the recursive execution of the specific CALL more
than once is increased or decreased. So, for example, in the middle
of the emulation of an instruction, there can be a lot of "fake"
Virtual Opcodes. They can increase the size of the Virtual Opcodes by
a factor of 3!
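The random, recursive insertion of "fake" Virtual Opcodes can be modelled with the Python sketch below. Everything numeric in it is hypothetical: I did not measure the real probabilities for each obfuscation level, the five fake sequences are just labels, and the recursion limit exists only to keep the toy bounded.

```python
import random

# Hypothetical chances per option; only the ordering is suggested above.
RECURSE_CHANCE = {"low": 0.10, "normal": 0.25, "high": 0.40, "highest": 0.50}

# Placeholders for the five different "fake" sequences.
FAKE_SEQUENCES = ["FAKE_A", "FAKE_B", "FAKE_C", "FAKE_D", "FAKE_E"]


def emit_fakes(rng, chance, out, depth=0, max_depth=8):
    """Maybe emit one fake sequence, then maybe recurse to emit more."""
    if depth >= max_depth or rng.random() >= chance:
        return
    out.append(rng.choice(FAKE_SEQUENCES))
    emit_fakes(rng, chance, out, depth + 1, max_depth)


def obfuscate(real_opcodes, level, seed):
    """Interleave fake sequences before each real Virtual Opcode."""
    rng = random.Random(seed)
    out = []
    for opcode in real_opcodes:
        emit_fakes(rng, RECURSE_CHANCE[level], out)
        out.append(opcode)
    return out


def strip_fakes(stream):
    """What an analysis tool wants: the real opcodes, in order."""
    return [op for op in stream if op not in FAKE_SEQUENCES]


real = ["LOAD", "ADD", "STORE"]
noisy = obfuscate(real, "highest", seed=7)
```

Since the fake sequences change nothing in the program, removing them recovers the original stream, which is the first analysis step proposed in section 4.2.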
Apart from the "fake" Virtual Opcodes, you might expect the
Virtual Opcodes to be identical if you protect an application
twice and compare the results. What makes them different is a
global variable in Code Virtualizer that I have called key.
So if the handler 0010h must be called, given the
Handler_Information.order and the Special_Handler structure (see
sections 2.1 and 2.2 for the explanation of these structures), the
inverse of the operations described in Table 1 (that is ADD, SUB,
XOR) will be executed to reach the correct Virtual Opcode. Things are
a little confusing, I think. So let's clear them up.
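The idea of running the Table 1 operations backwards can be sketched in Python as a round trip. This is my own model, not recovered code: it handles only the DWORD form, it uses one Special_Handler for the whole stream (in reality each handler has its own), and the key and random values are invented.

```python
MASK = 0xFFFFFFFF
SUB, ADD, XOR = 0, 1, 2           # Table 1 values
INVERSE = {SUB: ADD, ADD: SUB, XOR: XOR}


def apply(op, a, b):
    """Apply one Table 1 operation on 32-bit values."""
    if op == SUB:
        return (a - b) & MASK
    if op == ADD:
        return (a + b) & MASK
    return (a ^ b) & MASK


def decode(raw, key, special):
    """What the VM does: instructions 1 to 4 in order."""
    i1, i2, i3, i4, r1, r2 = special
    eax = apply(i1, raw, key)
    eax = apply(i2, eax, r1)
    eax = apply(i3, eax, r2)
    return eax, apply(i4, key, eax)


def encode(value, key, special):
    """What the protector must do: undo instructions 3, 2, 1."""
    i1, i2, i3, i4, r1, r2 = special
    x = apply(INVERSE[i3], value, r2)
    x = apply(INVERSE[i2], x, r1)
    raw = apply(INVERSE[i1], x, key)
    return raw, apply(i4, key, value)  # key evolves as it will in the VM


def encode_stream(values, key, special):
    raws = []
    for value in values:
        raw, key = encode(value, key, special)
        raws.append(raw)
    return raws


def decode_stream(raws, key, special):
    values = []
    for raw in raws:
        value, key = decode(raw, key, special)
        values.append(value)
    return values


SPECIAL = (XOR, ADD, SUB, ADD, 0x12345678, 0x0BADF00D)  # invented
wanted = [0x2D, 0x10, 0x00]                             # invented targets
stream_a = encode_stream(wanted, 0x00408000, SPECIAL)
stream_b = encode_stream(wanted, 0x00409000, SPECIAL)
```

The same target values give different Virtual Opcodes for different starting keys, yet both streams decode back to the same list when the matching key is used.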
3.3 Completing the analysis: why does this really work?
The aim of this section is to explain step by step the
initialization of the Virtual Machine and the execution of the
Virtual Opcodes. To do that, I will use a file that I prepared and
that has neither fragmented handlers nor mutated code[7].
When the protected application reaches a macro, execution is
redirected to a PUSH/JMP sequence in a section created by Code
Virtualizer.
Figure 13: PUSH/JMP example
The value pushed is the address of the first Virtual Opcode and
the jump is to the main handler.
Figure 14: Virtual Opcodes
The code starting at 004072D8h is always executed before the
execution of every handler. It is responsible for calling the handler
specified by the Virtual Opcode. The key is initialized with the
address of the first Virtual Opcode and it is stored in the EBX
register. The ESI register holds the address of the Virtual Opcode
currently being read and the EDI register holds the image base of the
Virtual Machine. The stack is used to store values and the EAX
register is used for operations like XOR, ADD, etc.
So when the code reaches the address 004072D8h, the registers look
like this:
Figure 16: Registers in the Main Handler
Now the byte 62h is read and, after some operations with the key
(those random operations explained in section 2.2; see figure 15),
when the code reaches the address 004072E4h, the registers look like
this:
Figure 17: Jumping to handler 2Dh
As you can see, the key was changed and the ESI register was
updated. Now the code jmp dword ptr ds:[edi+eax*4] becomes obvious:
EDI holds the image base of the Virtual Machine, and the EAX value
obtained from the Virtual Opcode plus some operations is used as an
index into a table of pointers to handlers:
Figure 18: Piece of table of pointers to handlers
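The table lookup performed by jmp dword ptr ds:[edi+eax*4] can be imitated in Python as follows. The image base, the table contents and the handler address are all invented for the example; only the index 2Dh comes from figure 17.

```python
import struct

MASK = 0xFFFFFFFF


def handler_address(vm_image, image_base, index):
    """Read the DWORD at image_base + index*4, like jmp [edi+eax*4]."""
    address = (image_base + index * 4) & MASK
    offset = address - image_base
    (target,) = struct.unpack_from("<I", vm_image, offset)
    return target


# Hypothetical VM image: A4h DWORD slots, all zero except slot 2Dh.
IMAGE_BASE = 0x00407000          # invented
slots = [0] * 0xA4
slots[0x2D] = 0x00407ABC         # invented handler address
vm_image = struct.pack("<%dI" % len(slots), *slots)

target = handler_address(vm_image, IMAGE_BASE, 0x2D)
```

In other words, once the index has been decoded with the key, dispatch is a plain array access, which is why recovering the table is enough to know which handler each Virtual Opcode selects.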
By now, you know how every handler is called, and it is possible to
explain why the Virtual Opcodes are unique for every protected
application: because of the key. The key is changed many times
and it is address dependent. As the Virtual Opcodes depend on the key
(see section 3.2 for the explanation) and the size of the Virtual
Machine is not constant, the Virtual Opcodes are unique.
The first two instructions of the Main Handler (PUSHAD and PUSHFD)
push onto the stack the registers EAX, ECX, EDX, EBX, ESP, EBP, ESI,
EDI and the Flags. Afterwards, the pre-built Virtual Opcodes that I
talked about are responsible for popping those registers into the
first 38 bytes of the Virtual Machine. Now an instruction like
XOR ECX, ECX will change the value at the address 00407014h. At the
end of the execution of the virtualized code, the registers are
restored to their correct positions, allowing the application to
continue its execution.
Figure 19: Virtual Machine registers
Now it is your turn. I will not comment on every executed line. I
gave you the basis and I hope that things are clearer now. So
trace the example program [7] and understand how the other handlers
are executed.
4 Hopes for the Future and Acknowledgments
4.1 Why write this article?
As I said in the disclaimer, the main purpose of this article is
to pass on to you the knowledge that I have gained. Besides that, I
need to say that I intended to write a tool instead of this article.
I even started it, but as I am not a programmer, I saw that with the
amount of free time that I will have I would not be able to finish
it.
So what I hope is that someone becomes interested in writing this
tool (just e-mail me). I can help and even provide the source code of
what I have written so far. But be aware that this is not easy work.
An important thing to say is that this is a very condensed article.
I mean that there are a lot of details that I omitted (no time, and I
am too tired now to describe them) and other details that I did not
notice. If you have any questions, if you see something wrong in my
article, or if you want to improve it, just e-mail me.
My main hope is to see a similar article about the Themida
Virtual Machine. Let me say that this should not be too difficult
after this article, mainly because Themida uses the same DLLs as Code
Virtualizer and because the Oreans developers themselves told us that
Code Virtualizer is a slightly simpler version of the Themida Virtual
Machine (remember the Light VM).
And a word for Oreans: this is really a great tool to protect
sensitive code areas but, as you said, it is not 100% safe (nothing
is 100% safe). I do not think there is a similar product on the
market as good as this one. Keep up the good work improving this
software!
4.2 The general attack approach
Here I will describe my ideas about a tool to deal with Code
Virtualizer and how to handle new versions. The tool is divided into
three parts:
- Preprocessing
- Analysis
  - Find the "fake" Virtual Opcodes and eliminate them
  - Retrieve the Code Virtualizer instructions
  - Analyse them and retrieve the original code
- Postprocessing
The two most difficult things are to find and identify each
handler in the Virtual Machine and to retrieve the original code from
the block of Code Virtualizer instructions.
For the first task, you have two options: study the mutation
engine and reverse engineer it (very difficult); or, since the
mutation engine does not mutate all the opcodes, use the fact that it
is almost always possible to find each handler by its non-mutated
instructions.
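Finding handlers by their non-mutated instructions amounts to pattern matching with wildcards, which can be sketched in Python like this. The two signatures and the sample code bytes are hypothetical: the byte values are genuine x86 encodings (AC is LODSB, AD is LODSD, 0F B6 C0 is MOVZX EAX, AL, 50 is PUSH EAX), but I am not claiming that any real handler looks like this.

```python
def matches(code, offset, pattern):
    """True if `pattern` matches at `offset`; None is a wildcard byte."""
    if offset + len(pattern) > len(code):
        return False
    return all(
        p is None or code[offset + i] == p for i, p in enumerate(pattern)
    )


def find_handlers(code, signatures):
    """Map each signature name to the list of offsets where it matches."""
    hits = {name: [] for name in signatures}
    for name, pattern in signatures.items():
        for offset in range(len(code)):
            if matches(code, offset, pattern):
                hits[name].append(offset)
    return hits


# Hypothetical signatures: fixed bytes are "not mutated", None is mutated.
SIGNATURES = {
    # lodsb ; 2 mutated bytes ; movzx eax, al
    "reads_byte_opcode": [0xAC, None, None, 0x0F, 0xB6, 0xC0],
    # lodsd ; 2 mutated bytes ; push eax
    "reads_dword_opcode": [0xAD, None, None, 0x50],
}

# Invented code fragment containing one instance of each signature.
code = bytes(
    [0x90, 0xAD, 0x33, 0xC3, 0x50, 0xAC, 0x2A, 0xC3, 0x0F, 0xB6, 0xC0]
)
hits = find_handlers(code, SIGNATURES)
```

A real tool would need one signature per handler and would have to cope with the handlers being fragmented, but the matching primitive stays this simple.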
For the second task, you also have two options: study how the
Code Virtualizer instructions are generated from the disassembled
original code and reverse engineer that (difficult); or do some
tests with different kinds of instructions and see the pattern. By
the way, a hint: one very recognizable handler is always used for
every original instruction, the STORE FLAGS. This makes it easier to
find the number of original instructions.
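The STORE FLAGS hint can be turned into a tiny Python helper that counts and groups recovered instructions. It assumes that STORE FLAGS closes each block, which I have not verified for every case, and the instruction stream in the example is invented.

```python
def split_by_store_flags(instructions, marker="STORE FLAGS"):
    """Group a recovered instruction stream into blocks ending at marker."""
    blocks, current = [], []
    for instruction in instructions:
        current.append(instruction)
        if instruction == marker:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)  # trailing instructions without a marker
    return blocks


# Invented stream of Code Virtualizer instructions.
recovered = [
    "LOAD ADDR", "LOAD DWORD", "XOR", "STORE FLAGS",
    "LOAD ADDR", "ADD", "STORE FLAGS",
]
blocks = split_by_store_flags(recovered)
original_count = recovered.count("STORE FLAGS")
```

Each block is then a candidate for one original x86 instruction, which narrows down the pattern-matching work described above.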
This tool must support different versions of Code Virtualizer. As
its structure does not change, you only need to adapt a few things,
for example new handlers, modified handlers, and so on.
A fun example: commands like ADD, XOR, SHL, etc. have in general
three handlers; one for the byte operation, one for the word and one
for the dword. But when I first saw the three handlers for the SHL
instruction I saw something very strange:
Figure 20: Code Virtualizer bug
Only in version 1.2.0.0 did we see: "[!] Fixed
Virtualization of "SHL reg16, imm""[8].
Interesting, isn't it?
4.3 Acknowledgments
I must say a big thanks to the people who helped me, directly and
indirectly, to write this article. Here they are:
- Melvill, Portuogral, forgetoz and Spec0p (CRKTeam): people really important to me. They introduced me to Reverse Engineering and helped me a lot. This article is especially dedicated to them.
- softworm: well... what can I say? Without his really good work, this article would not exist.
- Ricardo Narvaja and CrackSLatinoS: really good tutorials
- The Reverse Engineering communities (the ones where I am active): CrkPortugal, ARTeam, Unpack.cn, Tuts4you, EXETOOLS
References
[1] Code Virtualizer Help File - Code Virtualizer Help.chm
[2] http://www.unpack.cn/viewthread.php?tid=5802&fpage=1&highlight=code%2Bvirtualizer
[3] OllyDbg v1.10 by Oleh Yuschuk - http://www.ollydbg.de/
[4] http://www.oreans.com/
[5] ..\Annex\Example of Code Virtualizer instructions.rtf - this file is included in the file Inside Code Virtializer.rar
[6] ..\Annex\Analysis of Code Virtualizer instructions - this folder is included in the file Inside Code Virtializer.rar
[7] ..\Annex\handler.exe - this file is included in the file Inside Code Virtializer.rar
[8] http://www.oreans.com/CodeVirtualizerWhatsNew.php