C string functions: introduction, _strlen
Written by Xbios2   



I. INTRODUCTION

Beware: this is going to be long...

String handling in assembly is a difficult subject in any case. There are few string-oriented x86 opcodes, and most of them are slow. There is no standard library providing even basic functions. There is no string-specific syntax in assembly, like C's printf("hello world") or, even worse, BASIC's a$=b$+"hello". In short, if easy string-related programming is your goal, maybe you should consider Perl or another text-manipulation language.

Yet, string functions are really needed, since almost any program in assembly uses text for I/O. (An alternative to this would be using animated paper-clips to communicate with the user :)).

Furthermore, coding those functions in assembly allows for smaller and faster functions. In fact, many of the string functions in C libraries were written in assembly (e.g. strlen, strcat, strcpy). Those can be divided into two categories:

- 'Traditional' functions, using the x86 string instructions
- 'Modern' functions, which run faster by being Pentium-optimized

Borland C++ 4.02 and KERNEL32.DLL only have traditional functions. Borland's C++ Builder v1.0 (once given free as a demo) includes both types. MSVCRT.DLL (version 5) contains 'modern' versions.
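
For reference, the behaviour that both categories have to reproduce for strlen can be sketched in portable C. This is only a sketch of what the function computes, not the code of any of the libraries mentioned above; the name ref_strlen is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reference behaviour only: count the bytes before the terminating NUL.
   ref_strlen is a hypothetical name, not a library function. */
size_t ref_strlen(const char *s)
{
    assert(s != NULL);          /* a NULL pointer is not a valid C string */

    const char *p = s;
    while (*p != '\0')          /* one compare and one increment per character */
        ++p;

    return (size_t)(p - s);     /* the terminator itself is not counted */
}
```

Calling ref_strlen("hello world") returns 11, and an empty string gives 0. An assembly version, traditional or modern, has to return the same values; only the number of cycles it takes to get there differs.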

The three main aspects considered in these articles (and generally when comparing different versions of the same function) are speed, size and common sense.

'Common sense' indicates how easy it is to understand the way a function operates by reading the source code, how 'elegant' the code is. In a library module distributed as a binary (in a 'static' reuse of code), common sense is not important. It becomes important when the source code is distributed too, because it allows 'dynamic' reuse. 'Elegant' code can be easily optimized for specific needs or expanded to become a more general function.

'Size' is, obviously, the size of the resulting code. Besides creating smaller files, small size has two interesting 'side-effects'. It (usually) creates more elegant code and faster code (it decreases k, but it usually increases l; for an explanation of k and l see 'speed'). For very small functions like strlen it has the added advantage of allowing the code to be inlined without wasting too much space, thus decreasing k even more.

'Speed' indicates the number of cycles needed to execute the function. For simple string functions the number of cycles needed can be expressed as

c=k+l*n

where c is the total number of cycles, k is the number of cycles needed to 'prepare' the function, l is the number of cycles needed to process each character and n is the number of characters in the string. It is obvious that small values of c mean faster execution. In order to compare two versions of a function that run at speeds of

c1=k1+l1*n and c2=k2+l2*n

the ratio of c1/c2 is calculated:

r = c1/c2 = (k1+l1*n)/(k2+l2*n)
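
The cost model and the ratio can be tried out with a few lines of C. This is a minimal sketch: the function names cycles and speed_ratio are invented here, and any k and l values fed to them are made-up numbers, not measurements of real code.

```c
#include <assert.h>

/* Cost model from the text: c = k + l*n
   k = setup cycles, l = cycles per character, n = string length. */
double cycles(double k, double l, double n)
{
    return k + l * n;
}

/* Ratio r = c1/c2 for two versions of the same function.
   Hypothetical helper; the names are not taken from the article. */
double speed_ratio(double k1, double l1, double k2, double l2, double n)
{
    double c2 = cycles(k2, l2, n);
    assert(c2 > 0.0);           /* a function cannot take zero cycles */
    return cycles(k1, l1, n) / c2;
}
```

With the purely illustrative values k1=2, l1=4, k2=20, l2=1, a 2-character string gives c1=10 and c2=22, so r is below 1 and version 1 needs fewer cycles. At n=6 both versions need 26 cycles (r=1), and at n=16 the result is c1=66 against c2=36, so r is above 1 and version 2 needs fewer cycles.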

If r=1, both versions run at the same speed. If r>1, version 2 is faster. If r<1, version 1 is faster. [...]l2, c1