The LCC Intrinsics Utility
Written by Jacob Navia   


Lcc-win32 is a free C compiler system. It features an IDE, a resource compiler, a linker, a librarian, a windowed debugger, and other goodies.

Here I would like to describe a special feature of lcc-win32 that programmers who use assembly will appreciate.

Lcc-win32 understands special macro definitions called intrinsics. The front end of the compiler sees these constructs as normal function calls, but the back end expands them inline.

You can add your own intrinsic macros to the system, allowing you to use the power and speed of assembly language within the context of a more powerful and safer high-level language.

I will present two examples here to give you an idea of what this can look like. You will need the source code of lcc-win32, which can be obtained from the home page: {http://ps.qss.cz/lcc} or {ftp://ftp.cs.virginia.edu/pub/lcc-win32}

Inlining the strlen function

Let's assume the strlen function of the C library is just too slow for you. Instead of generating:

     pushl   Arg
     call    _strlen
     addl    $4,%esp

you would like to generate the following code inline:

; Inlined strlen. The input argument is in ECX and points to the
; character string

     orl     $-1,%eax
loop:
     inc     %eax
     cmpb    $0,(%ecx,%eax)
     jnz     loop
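
For readers less used to AT&T syntax, the loop can be mirrored in portable C. This is only an illustrative sketch: the function name strlen_sketch is hypothetical, and the variables eax and ecx are ordinary C variables named after the registers they stand for.

```c
/* Hypothetical C rendering of the three-instruction loop.
   Not part of lcc-win32; for illustration only. */
static int strlen_sketch(const char *ecx)
{
    int eax = -1;               /* orl  $-1,%eax : start at minus one */
    do {
        eax++;                  /* inc  %eax                          */
    } while (ecx[eax] != 0);    /* cmpb $0,(%ecx,%eax) ; jnz loop     */
    return eax;                 /* the length is left in EAX          */
}
```

Starting the counter at minus one lets the increment sit at the top of the loop, so the body needs only one compare and one conditional jump.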

This function, then, should be inlined by the compiler. The C interface would be:

_strlen(str);

The prototype must be:

extern int _stdcall _strlen(char *);

The compiler recognizes intrinsic macros because they have an underscore as the first character of their names, they are declared _stdcall, and they appear in the intrinsics table. Functions that begin with an underscore are few, so this avoids looking up the intrinsics table for each function call, which would slow down compilation.

Then take the file intrin.c in the lcc-win32 sources and modify the intrinsics table. Its declaration is in the middle of the file, and looks like this:
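
The recognition rule can be pictured as two cheap filters followed by a table search. The sketch below is an assumption about the shape of that logic, not the code in intrin.c: the type IntrinsicSketch, the table sketchTable and the function find_intrinsic are all hypothetical names, and the entries are reduced to a name and an argument count.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified table entry. The real table in intrin.c
   also carries the generator and argument functions. */
typedef struct {
    const char *name;
    int         nargs;
} IntrinsicSketch;

static const IntrinsicSketch sketchTable[] = {
    { "_strlen", 1 },
    { "_emms",   0 },
    { NULL,      0 }    /* sentinel entry ends the table */
};

/* Returns the matching entry, or NULL for an ordinary call. */
static const IntrinsicSketch *find_intrinsic(const char *name, int isStdcall)
{
    const IntrinsicSketch *e;

    /* Cheap filters first: most calls fail here and never reach the table. */
    if (name[0] != '_' || !isStdcall)
        return NULL;
    for (e = sketchTable; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}
```

With this arrangement an ordinary call such as printf costs a single character comparison, which is the point of the underscore convention.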

static INTRINSICS intrinsicTable[] = {
     {"fsincos",       2, 0, fsincos,    NULL      },
     {"bswap",         1, 0, bswap,      bswapArgs },

     ... many declarations omitted ...

     {"reduceLtb",     3, 0, redCmpLtb,  paddArgs  },
     {"mmxDotProduct", 3, 0, mmxDotProd, paddArgs  },
     {"_emms",         0, 0, emms,       NULL      },
     {NULL,            0, 0, 0,          0         }
};

Before the last line, add the following line:

{"_strlen",1, 0, strlenGen, strlenArgs },

This tells the system that you want an intrinsic called _strlen that takes one argument, whose code will be generated by the function strlenGen(), and whose arguments will be assigned to their respective registers by the function strlenArgs(). The arguments function assigns the registers in which you want the arguments to the inline macro, and the generating function emits the code for the body of the macro. Basically, the compiler sees these macros as special calls: instead of generating a push instruction, it calls your arguments function, which should set the right fields in each node passed to it so that the code generator later emits a move to the specified registers.

Note that all intrinsics should start with an underscore to avoid conflicting with user space names.

When the compiler detects a call to this function, your arguments function is called first, once for each argument pushed at each call site. Here, then, is the function strlenArgs():

static Symbol strlenArgs(Node p)
{
     Symbol r = NULL;

     // The global ArgumentsIndex is zero before each call. The compiler
     // takes care of that.
     switch (ArgumentsIndex) {
     case 0: // First argument pushed, from right to left!
          if (p->x.nestedCall == 0) {
               r = SetRegister(p, intreg[ECX]);
          }
          break;
     }
     // We have seen another argument
     ArgumentsIndex++;
     // Assign the register to this expression.
     if (p->x.nestedCall == 0 && r)
          p->syms[2] = r;
     // There should never be more than one argument
     if (ArgumentsIndex == 1)
          ArgumentsIndex = 0;
     return r;
}

You see that in several places we have the test:

if (p->x.nestedCall == 0)

This means that we should check if we have a nested call sequence within the arguments, i.e. the following C expression:

strlen( SomeFunction() );

True, in the case of strlen this doesn't change anything important: the result of the function will be in EAX anyway. But suppose you defined a macro that takes two arguments, say, some special form of addition, sadd(a,b). In this case we would assign the second argument (from left to right) to ECX, and the first to EAX. Consider then the case of:

sadd( SomeFunction(),5);

If we just assigned 5 to ECX, then the call to SomeFunction() would destroy the contents of ECX!

This means that when the compiler detects a call within argument passing, all arguments WILL BE on the stack, and our code-generating function should take care of popping them into the right registers before proceeding.

In the case of strlen this can hardly happen, but it's important to see how this would work in the general case.

Note too that the argument function should increment the global argument counter for each argument, and reset it to zero when it's done. Again, this is not necessary for strlen, but it is mandatory for macros that take more arguments.
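
The counter protocol for a two-argument macro such as sadd(a,b) can be sketched without any compiler internals. Everything here is hypothetical: the enum RegSketch, the variable argumentsIndexSketch and the function saddArgsSketch only model the bookkeeping described above, with the nested-call flag passed in as a plain int.

```c
/* Registers a hypothetical two-argument intrinsic sadd(a,b) might use. */
typedef enum { REG_NONE, REG_EAX, REG_ECX } RegSketch;

/* Stand-in for the compiler's global argument counter. */
static int argumentsIndexSketch = 0;

/* Called once per argument, in push order (right to left).
   Returns the register chosen for that argument, or REG_NONE when the
   call is nested and the argument must stay on the stack. */
static RegSketch saddArgsSketch(int nestedCall)
{
    RegSketch r = REG_NONE;

    if (!nestedCall) {
        switch (argumentsIndexSketch) {
        case 0: r = REG_ECX; break;   /* b, the rightmost argument */
        case 1: r = REG_EAX; break;   /* a                         */
        }
    }
    argumentsIndexSketch++;           /* we have seen another argument */
    if (argumentsIndexSketch == 2)    /* last one: reset for next call */
        argumentsIndexSketch = 0;
    return r;
}
```

Note that the counter advances and resets even in the nested case, so the next call site always starts from zero.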

The SetRegister function takes care of the details of assigning a register. Here is its short body:

Symbol SetRegister(Node p, Symbol r)
{
     Symbol w;

     w = p->kids[0]->syms[2];
     if (w->x.regnode == NULL || w->x.regnode->vbl == NULL)
          p->kids[0]->syms[2] = r;
     return r;
}

This function tests that in the given node, the left child isn't already assigned to a register. It will assign the register only if this is not the case. Otherwise, the compiler will generate the move.

We come now to the heart of the matter: generating the code for the strlen utility.

static void strlenGen(Node p)
{
     static int labelCount;

     // OK, the first thing to do is to see if we should pop our arguments.
     // If that is the case, pop them into the right registers.
     if (p->x.nestedCall) {
          print("\tpopl\t%%ecx\n");
     }

     /* Here we generate the code for the strlen routine. Note that the
        % sign is used by the assembler of lcc-win32 to mark a register
        keyword, but our print() function uses it too to mark (as printf
        does) the beginning of an argument. We must double them to get
        around this collision. */

     /* 1. Set the counter to minus one. */
     print("\torl\t$-1,%%eax\n");

     /* 2. Generate the label for this instance. All labels must be
        unique, and the easiest way to ensure that we always generate a
        new label is to number them consecutively using a counter. To
        avoid colliding with other labels, we use a unique prefix too. */
     print("_$strlen%d:\n", labelCount);

     /* 3. Generate the body of the loop that searches for the
        character zero. */
     print("\tinc\t%%eax\n");

     /* 4. Note the dollar before the immediate constant. */
     print("\tcmpb\t$0,(%%ecx,%%eax)\n");

     /* 5. Generate the jump, incrementing our label counter afterwards. */
     print("\tjnz\t_$strlen%d\n", labelCount++);

     /* Now we are done; the result is in EAX, as it should be. Note
        that no pops are needed, since the ones we did at the beginning
        (if any) just compensate for the pushes the compiler generated.
        Note too that we shouldn't emit a return instruction, since this
        is a macro that shouldn't cause the current function to return! */
}
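
The two points made in the comments, doubling the % sign and numbering the labels, can be tried outside the compiler. This is a sketch under assumptions: the standard snprintf stands in for the compiler's print(), the function name emit_strlen_sketch is hypothetical, and the label prefix follows the _$strlen0 form that appears in the compiler output shown later.

```c
#include <stdio.h>
#include <string.h>

/* Writes one expansion of the loop into buf and returns the label
   number it used. snprintf is a stand-in for the compiler's print(). */
static int emit_strlen_sketch(char *buf, size_t size)
{
    static int labelCount;      /* persists across expansions */
    int n = labelCount++;       /* this expansion's label number */

    snprintf(buf, size,
             "\torl\t$-1,%%eax\n"           /* %% yields a single %   */
             "_$strlen%d:\n"
             "\tinc\t%%eax\n"
             "\tcmpb\t$0,(%%ecx,%%eax)\n"
             "\tjnz\t_$strlen%d\n",
             n, n);
    return n;
}
```

Because the counter is a static local, two expansions in the same compilation never share a label, and the jump always targets the label of its own expansion.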

We compile the compiler and obtain a new compiler that recognizes the macro we have just created. Compiling the compiler with itself is, of course, a good test for your new function. This should be done at least three times to be sure that your function is working OK.

Register assignments

In general, you can use ECX, EDX, and EAX as you wish. The contents of EBX, ESI, EBP and EDI should always be saved. If you destroy them, unpredictable results will surely occur.

Let's write a test function for our new compiler:

#include <stdio.h>
#ifdef MACRO
int _stdcall _strlen(char *);
#define strlen _strlen
#else
int strlen(char *);
#endif
int main(int argc, char *argv[])
{
     if (argc > 1)
          printf("Length of \"%s\" is %d\n", argv[1],
                 strlen(argv[1]));
     return 0;
}

In the C source, we use the conditional MACRO to select whether we use our macro or just generate a call to the normal strlen procedure for comparison purposes. We compile this with our new compiler, and add the -S option to see what it generates.

lcc -S -DMACRO tstrlen.c

The assembly (that the compiler writes in tstrlen.asm) is then:

_main:
     pushl   %ebp
     movl    %esp,%ebp
     pushl   %edi
     .line   9
     .line   10
     cmpl    $1,8(%ebp)
     jle     _$2
     .line   11
     movl    12(%ebp),%edi
; Our argument gets assigned to ECX, as our strlenArgs function
; defined
     movl    4(%edi),%ecx
; Here is the beginning of our macro body
     orl     $-1,%eax
; This is our generated label
_$strlen0:
     inc     %eax
     cmpb    $0,(%ecx,%eax)
     jnz     _$strlen0
; Our macro ends here, leaving its result in EAX
     pushl   %eax
     movl    12(%ebp),%edi
     pushl   4(%edi)
     pushl   $_$4
     call    _printf
     addl    $12,%esp
_$2:
     .line   12
     xor     %eax,%eax
     .line   13
     popl    %edi
     popl    %ebp
     ret

We see that there is absolutely no call overhead. The arguments are assigned to the right registers in our function strlenArgs, and the body is expanded in-line by strlenGen.

Next, we link our executable:

D:\lcc\src74\test>lcclnk tstrlen.obj

And we run a test:

D:\lcc\src74\test>tstrlen abcde
Length of "abcde" is 5
D:\lcc\src74\test>

Here is the strlenGen() function again for clarity.

static void strlenGen(Node p)
{
     static int labelCount;

     if (p->x.nestedCall) {
          print("\tpopl\t%%ecx\n");
     }
     print("\torl\t$-1,%%eax\n");
     print("_$strlen%d:\n", labelCount);
     print("\tinc\t%%eax\n");
     print("\tcmpb\t$0,(%%ecx,%%eax)\n");
     print("\tjnz\t_$strlen%d\n", labelCount++);
}

Another example: inlining the strchr function

To demonstrate a function with two arguments, we inline the strchr function. This function should return a pointer to the first occurrence of the given character in a string, or NULL if the character doesn't appear in the string. The implementation could look like this:

_strchr:
     movb    (%eax),%dl      // read a character
     cmpb    %cl,%dl         // compare it to searched for char
     je      _strchrexit     // exit if found with pointer to char as result
     incl    %eax            // move pointer to next char
     orb     %dl,%dl         // test for end of string
     jne     _strchr         // if not zero continue loop
     xorl    %eax,%eax       // Not found. Zero result
_strchrexit:

We just scan the characters looking for either zero (end of the string) or the given char. The pointer to the string will be in EAX, and the character to be searched for will be in ECX. We use EDX as a scratch register.
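
The same scan can be written in portable C to check the control flow. As before this is only a sketch: strchr_sketch is a hypothetical name, and eax, ecx and dl are ordinary variables named after the registers in the listing.

```c
#include <stddef.h>

/* Hypothetical C rendering of the strchr loop. Not part of lcc-win32. */
static char *strchr_sketch(char *eax, int ecx)
{
    char dl;

    for (;;) {
        dl = *eax;              /* movb (%eax),%dl                   */
        if (dl == (char)ecx)    /* cmpb %cl,%dl ; je exit            */
            return eax;         /* found: pointer to the character   */
        eax++;                  /* incl %eax                         */
        if (dl == 0)            /* orb  %dl,%dl ; jne loop           */
            return NULL;        /* xorl %eax,%eax : not found        */
    }
}
```

Comparing against the searched character before testing for the terminator means that searching for the zero character itself returns a pointer to the end of the string, as the library strchr does.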

The next step is to write the strchrArgs function, which assigns the arguments. Here it is:

static Symbol strchrArgs(Node p)
{
     Symbol r = NULL;

     switch (ArgumentsIndex) {
     case 0: // First argument (from right to left): char to be searched.
             // We put it in ECX
          if (p->x.nestedCall == 0) {
               r = SetRegister(p, intreg[ECX]);
          }
          break;
     case 1: // Second argument: pointer to the string. We put it in EAX
          if (p->x.nestedCall == 0) {
               r = SetRegister(p, intreg[EAX]);
          }
          break;
     }
     ArgumentsIndex++;
     if (p->x.nestedCall == 0)
          p->syms[2] = r;
     if (ArgumentsIndex == 2)
          ArgumentsIndex = 0;
     return r;
}

The final step is to write the generating function. Here it is; note that we need two labels:

static void strchrGen(Node p)
{
     static int labelCount;

     if (p->x.nestedCall) {
          // Both arguments are on the stack. They were pushed from
          // right to left, so the string pointer is on top.
          print("\tpopl\t%%eax\n");
          print("\tpopl\t%%ecx\n");
     }
     print("_$strchr%d:\n", labelCount);
     print("\tmovb\t(%%eax),%%dl\n");
     print("\tcmpb\t%%cl,%%dl\n");
     print("\tje\t_$strchr%d\n", labelCount+1);
     print("\tinc\t%%eax\n");
     print("\torb\t%%dl,%%dl\n");
     print("\tjne\t_$strchr%d\n", labelCount);
     print("\txorl\t%%eax,%%eax\n");
     print("_$strchr%d:\n", labelCount+1);
     labelCount += 2;
}

This facility is not very common in compiler systems. It allows you to use assembly language in the routines where it is really needed, leaving to the compiler the tedious work of generating the assembly for the 90% of the code where speed is not so important after all.

Another benefit is that you can't make simple mistakes when passing arguments to your assembler macros, since the compiler understands them as function calls and the front end does all the prototype checking. If you attempt to use the strchr macro like this:

strchr('\n', string);

the compiler will issue an error.

The lcc-win32 system can be downloaded free of charge from

{http://ps.qss.cz/lcc}