When writing a virus, size is a primary concern. A bloated virus
carrying unnecessary baggage will run slower than its optimised
counterpart and eat up more disk space.
Never optimise any code before it works fully, since altering code
after optimisation often messes up the optimisation and, in turn,
messes up the code. After it works, the focus can shift to
optimisation. Always keep a backup of the last working copy of the
virus, as optimisation often leads to improperly working code. With
this in mind, a few techniques of optimisation will be introduced.
There are two types of optimisation: structural and local.
Structural optimisation occurs when shifting the position of code or
rethinking and reordering the functions of the virus shorten its
length. A simple example follows:
check_install:
mov ax,1234h
int 21h
cmp bx,1234h
ret
install_virus:
call check_install
jz exit_install
If this is the only instance that the procedure check_install is called, the following optimisation may be made:
install_virus:
mov ax,1234h
int 21h
cmp bx,1234h
jz exit_install
The first fragment wastes a total of 4 bytes - 3 for the call and 1
for the ret. Four bytes may not seem to be worth the effort, but after
many such optimisations, the code size may be brought down
significantly. The reverse of this optimisation, using procedures in
lieu of repetitive code fragments, may work in other instances.
Properly designed and well-thought out code will allow for such an
optimisation. Another structural optimisation:
- get attributes
- open file read/only
- read file
- close file
- exit if already infected
- clear attributes
- open file read/write
- get file time/date
- write new header
- move file pointer to end of file
- concatenate virus
- restore file time/date
- close file
- restore attributes
- exit
Change the above to:
- get attributes
- clear attributes
- open file read/write
- read file
- if infected, exit to close file
- get file time/date
- move file pointer to end of file
- concatenate virus
- move file pointer to beginning
- write new header
- restore file time/date
- close file
- restore attributes
- exit
By using the second, an open file and a close file are eliminated
while adding only one move file pointer request. This can save a
healthy number of bytes.
Local, or peephole, optimisation is often easier to do than
structural optimisation. It consists of changing individual statements
or short groups of statements to save bytes.
The easiest type of peephole optimisation is a simple replacement of
one line with a functional equivalent that takes fewer bytes. The 8086
instruction set abounds with such possibilities. A few examples follow.
Perhaps the most widespread optimisation, replace:
mov ax,0 ; this instruction is 3 bytes long
mov bp,0 ; mov reg, 0 with any reg = nonsegment register takes 3 bytes
with
xor ax,ax ; this takes but 2 bytes
xor bp,bp ; mov reg, 0 always takes 2 bytes
or even
sub ax,ax ; also takes 2 bytes
sub bp,bp
One of the easiest optimisations, yet often overlooked by novices, is the merging of lines. As an example, replace:
mov bh,5h ; two bytes
mov bl,32h ; two bytes
; total: four bytes
with
mov bx,532h ; three bytes, save one byte
A very useful optimisation moving the file handle from ax to bx follows. Replace:
mov bx,ax ; 2 bytes
with
xchg ax,bx ; 1 byte
Another easy optimisation which can most easily applied to file pointer moving operations:
Replace
mov ax,4202h ; save one byte from "mov ah,42h / mov al,2"
xor dx,dx ; saves one byte from "mov dx,0"
xor cx,cx ; same here
int 21h
with
mov ax,4202h
cwd ; equivalent to "xor dx,dx" when ax < 8000h
xor cx,cx
int 21h
Sometimes it may be desirable to use si as the delta offset
variable, as an instruction involving [si] takes one less byte to
encode than its equivalent using [bp]. This does NOT work with
combinations such as [si+1]. Examples:
mov ax,[bp] ; 3 bytes
mov word ptr cs:[bp],1234h ; 6 bytes
add ax,[bp+1] ; 3 bytes - no byte savings will occur
mov ax,[si] ; 2 bytes
mov word ptr cs:[si],1234h ; 5 bytes
add ax,[si+1] ; 3 bytes - this is not smaller
A somewhat strange and rather specialised optimisation:
inc al ; 2 bytes
inc bl ; 2 bytes
versus
inc ax ; 1 byte
inc bx ; 1 byte
A structural optimisation can also involve getting rid of redundant
code. As a virus related example, consider the infection routine. In
few instances is an error-trapping routine after each interrupt call
necessary. A single "jc error" is needed, say after the first
disk-writing interrupt, and if that succeeds, the rest should also work
fine. Another possibility is to use a critical error handler instead of
error checking.
How about this example of optimised code:
mov ax, 4300h ; get file attributes
mov dx, offset filename
int 21h
push dx ; save filename
push cx ; and attributes on stack
inc ax ; ax = 4301h = set file attributes
push ax ; save 4301h on stack
xor cx,cx ; clear attributes
int 21h
...rest of infection...
pop ax ; ax = 4301h
pop cx ; cx = original attributes of file
pop dx ; dx-> original filename
int 21h
Optimisation is almost always code-specific. Through a combination
of restructuring and line replacement, a good programmer can
drastically reduce the size of a virus. By gaining a good feel of the
80x86 instruction set, many more optimisations may be found. Above all,
good program design will aid in creating small viruses.
|