Article # 213, added by Geoworks, historical record
Mixing C and assembly - an introduction.
You may sometimes wish to combine GOC and Esp code in a single application. There are two main situations in which you may want to do this.

First, you may be writing an application in GOC, but find that the application is spending a lot of time in a few critical routines; in this case, you may be able to improve efficiency by rewriting those few routines in assembly. If you don't want to rewrite the rest of the application, you will have to write those routines so they can be called from C.

Second, you may be writing a library that has a lot of time-consuming routines. In this case, you may find it worthwhile to rewrite the entire library in assembly, while preserving a GOC interface. That way, all the applications that use the library will be able to take advantage of the efficiency of assembly code. (For example, most GEOS system libraries are written in assembly.) In particular, if you design a new object class that is used by many different applications, it may be worthwhile to write a library which defines that class of object. You can write all the method code in assembly, while providing a GOC interface.

A.1 Calling Conventions

Glue can automatically link Esp routines into a GOC application or library. The programmer, however, has to write the routine to present an appropriate interface. To do this, the programmer will have to understand GOC calling conventions.

GOC uses the calling conventions of whatever C compiler it is using. All C compilers support standard-C calling conventions. Furthermore, all C compilers that work with the GEOS SDK (including Borland and High-C) allow a specific routine to use Pascal calling conventions. You can use either the C or the Pascal calling conventions. However, most GEOS GOC routines use the Pascal calling conventions, and we encourage programmers to do the same.
The Pascal calling conventions are generally more efficient than the C conventions. The only time you will need to use the C calling conventions is when your routine takes a variable number of arguments (as, e.g., printf() does).

There are three basic differences between the C and Pascal calling conventions. The differences are listed in brief here, and discussed at length below:

+ In the C calling conventions, arguments are pushed right-to-left (the last argument in the routine declaration is the first argument pushed). In the Pascal conventions, arguments are pushed left-to-right.
+ In the C calling conventions, the caller pops arguments off the stack after the called routine returns. In the Pascal calling conventions, the called routine pops the arguments off the stack before returning.
+ In the C calling conventions, the routine name may be in mixed case. In the Pascal calling conventions, the routine name is folded to ALL-CAPS (many compilers do this automatically).

A.1.1 Passing Arguments

Arguments are passed differently in the C and Pascal calling conventions. As noted above, if a routine uses the C pass-and-return conventions, it expects the arguments to be pushed from right to left; if the routine uses Pascal calling conventions, it expects the arguments to be pushed from left to right.

For example, suppose you write two similar routines, one of which uses the C calling conventions and one of which uses the Pascal conventions:

    extern void _cdecl CRoutine( int cArg1, int cArg2, int cArg3 );
    extern void _pascal PascalRoutine( int pArg1, int pArg2, int pArg3 );

When you call the C-style routine, the C compiler generates code to push the arguments on the stack, pushing first cArg3, then cArg2, then cArg1. It then leaves space for a call to the external routine (Glue takes care of filling this in). When the routine returns, the caller pops the three argument words off the stack.
Thus, the GOC code

    CRoutine( 11, 22, 33 );

is compiled to something like

    push 33             ; argument cArg3
    push 22             ; argument cArg2
    push 11             ; argument cArg1
    call CRoutine       ; The linker fills this in
                        ; Here the routine is called, and returns,
                        ; destroying its own stack frame but leaving the
                        ; passed arguments on the stack
    add sp, 6           ; Pop 3 words off the stack

When you call the Pascal-style routine, the C compiler generates code to push the arguments on the stack, pushing first pArg1, then pArg2, then pArg3. It then leaves space for a call to the external routine (Glue takes care of filling this in). The called routine takes care of popping the arguments off the stack; when execution returns to the caller, they have already been popped. Thus, the GOC code

    PascalRoutine( 11, 22, 33 );

is compiled to something like

    push 11             ; argument pArg1
    push 22             ; argument pArg2
    push 33             ; argument pArg3
    call PASCALROUTINE  ; The linker fills this in
                        ; Here the routine is called, and returns,
                        ; destroying its own stack frame and popping the
                        ; passed arguments off the stack

A.1.2 Naming Conventions

There is one other major difference between C and Pascal routine conventions. In C, the names of routines are case-significant. You can (e.g.) have three different routines named "MyProc", "MYPROC", and "myproc". In Pascal, on the other hand, case is not significant. No matter what case you use when you name a routine or variable, the compiler folds the case to ALL-CAPS.

Accordingly, when you define or declare a routine to use the Pascal calling conventions, many C compilers will likewise fold the routine's name to all-capital letters. (For example, the Borland compiler does this.) This means that in the Esp file where the routine is defined, you must give it a name in ALL-CAPS. In the C file, you may not need to do this; with many compilers, you can declare the routine to have a mixed-case name, and the case will automatically be folded to ALL-CAPS. For details, check your C compiler's documentation.
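The case folding described above can be modeled in a few lines of C. This is only a sketch of the effect, not the code any compiler actually runs; the helper name fold_pascal_name and its buffer-size argument are hypothetical and are not part of the GEOS SDK.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the GEOS SDK): models the case folding
 * that a Pascal-convention declaration triggers in compilers that fold
 * routine names. Writes the folded name to 'out' and returns 0, or
 * returns -1 if 'out' is too small to hold the name and its terminator. */
int fold_pascal_name(const char *name, char *out, size_t outSize)
{
    size_t len;
    size_t i;

    assert(name != NULL);
    assert(out != NULL);

    len = strlen(name);
    if (len + 1 > outSize) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        /* Cast through unsigned char so toupper() never sees a negative value. */
        out[i] = (char) toupper((unsigned char) name[i]);
    }
    out[len] = '\0';
    return 0;
}
```

Folding "MyProc", "MYPROC", and "myproc" with this helper yields the same string each time, which is why only one of the three could exist as a Pascal-convention routine, and why the Esp definition must use the ALL-CAPS spelling.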
As an example, suppose you write a routine in assembly language, called MyEspRoutine, but you wish to have a C resource call that routine. (Let's assume the routine passes and returns nothing, for simplicity's sake.) Further suppose your compiler is the Borland C++ compiler. In this case, you would name the routine thus in the Esp code:

    MYESPROUTINE proc far

and declare the routine thus in your GOC file:

    extern void _pascal MyEspRoutine( void );

A.2 Adding Esp Code to a GOC Geode

Most people will find it easiest to write applications in GOC. For most purposes, GOC is efficient enough; after all, whenever an application is running a system routine, or sending a message to a system-defined object, it is most likely executing assembly code.

However, some applications may have very computation-heavy, time-consuming routines. This can be exacerbated if the application is intended to run on a slower platform, or if the time-consuming routines cannot (for some reason) be compiled efficiently. In this case, you can sometimes improve efficiency significantly by rewriting just those routines in assembly language.

All the routines in any single code resource must be written in the same language (GOC or Esp). You will therefore have to segregate your Esp routines into one or more resources. Since you lose efficiency if resources are too small, it may be best to simply put all the Esp routines into a single resource. Simply write an assembly file with the routines; then declare all the routines in a C header file as "extern". mkmf will automatically generate appropriate instructions to compile and link the two resources.

For example, suppose you are writing a geode in GOC. This geode has a single routine, AddUpThree(), that is passed three integers, adds them up, and returns the sum. Let's suppose you wish to rewrite this routine in assembly.
At first, this routine might be written like this:

Code Display A-1 The AddUpThree routine, in GOC

    /***********************************************************************
     *              AddUpThree
     ***********************************************************************
     * SYNOPSIS:    Adds three integers, and returns the sum.
     ***********************************************************************/
    int AddUpThree( int firstArg, int secondArg, int thirdArg )
    {
        /* This would be so much faster in assembly... */
        return( firstArg + secondArg + thirdArg );
    }

    /* Then, in the body of some other routine, we'd call it like this: */
    /* ... */
    mySum = AddUpThree( 10, 20, 30 );
    /* ... */

If we want to rewrite the routine in assembly, we must do two things. First, we must write the actual routine in a .asm file:

Code Display A-2 The AddUpThree routine, in Esp

    ; First, we declare the routine name for export:
    EspCode segment
        global ADDUPTHREE:far
    EspCode ends

    ; We instruct Esp to use the "GEOS Convention", i.e. to use the Pascal
    ; pass-and-return convention and the medium memory model:
    SetGeosConvention

    ; And we write the actual routine:
    COMMENT @%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
                    ADDUPTHREE
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    SYNOPSIS:       Add three integers; return the sum.
    CALLED BY:      C resource; uses Pascal conventions
    PASS:           On stack:
                        firstArg (integer)
                        secondArg (integer)
                        thirdArg (integer)
    RETURN:         Sum of arguments (in ax)
    DESTROYED:      Check what registers your compiler lets you destroy
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%@
    ADDUPTHREE proc far firstArg:word, secondArg:word, thirdArg:word
        .enter
        mov ax, firstArg
        add ax, secondArg
        add ax, thirdArg
        .leave              ; destroys stack frame
        ret                 ; pops arguments off stack
    ADDUPTHREE endp

Second, you have to put an appropriate declaration in the C file (or a C header file). Remember, different compilers handle the Pascal naming conventions differently.
Let's assume you're using Borland C++, which automatically folds case for Pascal-convention routine names:

Code Display A-3 Declaring the Assembly Routine in a C File

    /* We assume this application will be compiled by the Borland C++ compiler,
     * which automatically folds case of Pascal-convention routines. This means
     * that the linker will look for a routine named ADDUPTHREE. */
    extern int _pascal AddUpThree( int firstArg, int secondArg, int thirdArg );

    /* We call the routine the same way we did before: */
    /* ... */
    mySum = AddUpThree( 10, 20, 30 );
    /* ... */

A.3 Writing an Esp Library

You may wish to write a library in Esp whose routines can be called by either GOC or Esp code. This is very much like writing a library in GOC. Simply write all the exported routines to use Pascal pass-and-return conventions. Remember to write both a GOC header file and an Esp header file; that way, an application can include whichever of these is appropriate. Make sure that both of these header files are maintained in tandem, and accurately reflect the state of the library.

If you are writing an object library, you will need to write an Espire header file (.uih), as well as the GOC and Esp header files. All three header files must be maintained in tandem. Espire is discussed in "The UI Compiler," Chapter 4.

The library is composed of three basic parts. First, there are the header files. You will need to write at least one header file, in the language in which the library is written. Ordinarily, you would write two: one in GOC, and one in Esp. That way, the library's routines can be called from either language. If you are writing an object library, you will also have to write an Espire header file.

Second, you might need to write an "entry point" routine, to handle bookkeeping for the library. However, this is not always necessary; some libraries do not have an entry point routine. Writing an entry point routine in Esp is much like writing an entry point routine in GOC.
Finally, you will have to write the code resources themselves. This is much like writing a code resource for an application. The exported routines should follow C or (preferably) Pascal calling conventions, so they can be called by GOC code. The internal routines, if any, need not follow these conventions.

A.3.1 Esp Library Entry Points

Writing an entry point in Esp is very much like writing an entry point in GOC. The routine has to perform the same basic functions, and may be called under the same circumstances.

Not all libraries will need an entry point routine. If you do not define an entry point in the library's .gp file, the kernel will not try to call an entry point for the library.

The entry point is passed two arguments:

    di      A member of the LibraryCallType enumerated type.
    cx      If di contains LCT_NEW_CLIENT or LCT_CLIENT_EXIT, cx contains
            the handle of the client geode; otherwise, this register's
            value is not defined.

The library should set CF if there is an error, and clear CF otherwise.

A.3.2 Important Macros

DefLib, StartLibrary, EndLibrary

Every library will need some header files. Because a header file may be included many times in a single object file, you should use certain macros to prevent the data in the header from being compiled more than once. Every assembly header file should begin with the line

    StartLibrary

and end with

    EndLibrary

(the sample header file in section A.3.3 writes these as "StartLibrary asmlib" and "EndLibrary asmlib"). The StartLibrary macro checks to see if the library's header file has already been included in the current compilation. If it has, the header file will not be included again.

A code library's code files should include the library's header file. However, when including the header file, they should not use UseLib the way a regular application does. Instead, they should use DefLib thus:

    DefLib < >

This macro includes the library's header file, but informs the Esp assembler that the file is being included in the code of the library itself. (The sample library in section A.3.3 does this with "DefLib asmlib.def".)
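The entry point contract from section A.3.1 (a call type, a client handle that is only meaningful for two of the call types, and an error flag on return) can be sketched as a small, self-contained C model. Everything named Model... or model_... here is a hypothetical stand-in: the enumeration contains only the two members the text names plus an invented catch-all, the client table is invented bookkeeping, and the int return value merely models the carry flag. A real entry point would use the SDK's own types.

```c
#include <assert.h>

/* Stand-in for the SDK's LibraryCallType. Only the two members named in
 * the text are modeled; MODEL_LCT_OTHER is a purely hypothetical
 * catch-all for every other call type. */
typedef enum {
    MODEL_LCT_NEW_CLIENT,
    MODEL_LCT_CLIENT_EXIT,
    MODEL_LCT_OTHER
} ModelLibraryCallType;

typedef unsigned short ModelGeodeHandle;    /* hypothetical stand-in */

#define MODEL_MAX_CLIENTS 4

static ModelGeodeHandle modelClients[MODEL_MAX_CLIENTS];
static int modelClientCount = 0;

/* Models the entry point: returns nonzero on error (the real routine
 * would set CF) and zero on success (the real routine would clear CF). */
int model_library_entry(ModelLibraryCallType callType, ModelGeodeHandle client)
{
    int i;

    switch (callType) {
    case MODEL_LCT_NEW_CLIENT:
        if (modelClientCount == MODEL_MAX_CLIENTS) {
            return 1;                       /* table full: report an error */
        }
        modelClients[modelClientCount++] = client;
        return 0;

    case MODEL_LCT_CLIENT_EXIT:
        for (i = 0; i < modelClientCount; i++) {
            if (modelClients[i] == client) {
                /* Fill the gap with the last entry and shrink the table. */
                modelClients[i] = modelClients[--modelClientCount];
                return 0;
            }
        }
        return 1;                           /* unknown client: error */

    default:
        /* For other call types the client handle is undefined,
         * so it is deliberately ignored here. */
        return 0;
    }
}

int model_client_count(void)
{
    return modelClientCount;
}
```

The point of the sketch is the shape of the dispatch: the handle is read only in the two cases where the text says it is defined, and every path reports success or failure explicitly.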
A.3.3 An Example of a Code Library

Suppose you want to write a code library in Esp, but you want the library to be callable from either C or assembly applications. For the sake of argument, suppose that library contains only a single routine, AsmLibDivByTwo(), which is passed an integer, shifts it to the right one bit (using an arithmetic shift, to preserve the sign of the integer), and returns the result. (Once you've written a library with one routine, it's easy to add more.)

First we'll write the header files for the library. These header files can contain definitions of structures, records, or enumerated types; in this example, though, they define only the exported routine.

Code Display A-4 A Library's Assembly Header File

    COMMENT @%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    FILE:           asmlib.def
    DESCRIPTION:    This is the assembly header file for the AsmLib library.
                    Make sure that it and CInclude/asmlib.h are maintained
                    in tandem!
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%@
    StartLibrary asmlib

    include geos.def

    ; Structures, macros, etc. could be defined here...

    ; Now list exported routines
    global ASMLIBDIVBYTWO:far
    ;   PASS:           (on stack)
    ;                   intArg = signed, 16-bit integer
    ;   RETURN:         ax = intArg / 2
    ;   DESTROYED:      none
    ;   SIDE EFFECTS:   Argument popped off the stack
    ;                   (as per Pascal calling convention)

    EndLibrary asmlib

This file is very straightforward. Note that the name of the routine is given in ALL-CAPS; this is because the routine is defined as using the Pascal calling conventions.

The C header file is slightly more complicated:

Code Display A-5 A Library's C Header File

    /***********************************************************************
     * FILE:        asmlib.h
     ***********************************************************************/
    #ifndef _ASMLIB_H_
    #define _ASMLIB_H_

    #include

    /* Structures, macros, etc. could be defined here... */

    /* Now list exported routines */
    extern int _pascal AsmLibDivByTwo(int numerator);

    #ifdef __HIGHC__
    /* High C needs to be told to convert the routine
     * name to ALL-CAPS */
    pragma Alias (AsmLibDivByTwo, "ASMLIBDIVBYTWO");
    #endif /* __HIGHC__ */

    #endif /* _ASMLIB_H_ */

The High-C compiler does not automatically convert the name of a Pascal-style routine to ALL-CAPS, so we use a pragma to instruct it to translate AsmLibDivByTwo to ASMLIBDIVBYTWO when compiling.

Now let's write the routine itself. Again, let's use the Pascal conventions. The routine is passed a single argument, on the stack; it returns an integer in ax.

Code Display A-6 A Library's Source Code File

    COMMENT @%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    FILE:           asmlib.asm
    ROUTINES:
            Name                    Description
            ----                    -----------
            ASMLIBDIVBYTWO          Divides an integer by two.
    DESCRIPTION:    This contains the code for the AsmLib library. Only one
                    routine is defined, a simple divide-by-two routine.
                    If the library had an entry-point routine, it might
                    be put in this file.
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%@
    include geos.def
    include library.def
    include resource.def

    SetGeosConvention   ; This macro instructs the assembler that the file will
                        ; be using the Pascal calling convention. This means
                        ; that passed arguments will be popped off the stack
                        ; automatically.

    DefLib asmlib.def

    Main segment resource   ; This resource has all code for this library

    COMMENT @%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
                    ASMLIBDIVBYTWO
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    SYNOPSIS:       This routine divides an integer by two. It uses Pascal
                    calling conventions.
    CALLED BY:      GLOBAL
    PASS:           (on stack)
                    intArg = signed, 16-bit integer
    RETURN:         ax = intArg / 2
    DESTROYED:      none
    SIDE EFFECTS:   Argument popped off the stack
                    (as per Pascal calling convention)
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%@
    ASMLIBDIVBYTWO proc far intArg:sword
        .enter              ; set up stack frame
        mov ax, intArg
        sar ax, 1
        .leave              ; Destroy stack frame
        ret                 ; Since we're using the Pascal calling convention,
                            ; this "return" will pop the arguments off the stack.
    ASMLIBDIVBYTWO endp

    Main ends

Again, the code is fairly straightforward. This routine does not assume that any values are passed in any registers. In practice, each C compiler passes certain values in the registers to called routines, and also allows called routines to destroy certain registers. You should check the documentation for any C compilers which may link in the library before you rely on this. Note also that an arithmetic right shift rounds toward negative infinity, so for a negative odd argument the result differs from a division that truncates toward zero: an argument of -3 returns -2, not -1.

Finally, the library needs a .gp file for the linker:

Code Display A-7 A Library's .gp File

    ##############################################################################
    #
    # FILE: asmlib.gp
    #
    # This is a sample library for the GEOS SDK. It demonstrates how
    # to write a simple Esp code library, which exports a single
    # routine.
    #
    ##############################################################################
    name asmlib.lib

    library geos

    #
    # Specify geode type
    #
    type library, single

    #
    # Information for the geode's extended attributes
    #
    longname "Sample Assembly Code Library"
    tokenchars "SASL"
    tokenid 0

    #
    # Define the library "entry point" here, if the library has one. You would
    # do it like this:
    # entry AsmLibEntryRoutine
    #
    # (This library doesn't have an entry point.)
    #

    # Define resources other than standard discardable code
    # (this library has none)
    #

    #
    # Exported routines
    #
    export ASMLIBDIVBYTWO

Again, this is straightforward. The only difference between this and an application's .gp file is that the geode's type is "library", not "appl".
Also, if the library has an entry point routine, that routine would be specified in the .gp file with the "entry" keyword, as shown above.

Figure 1-1 Calling a routine with C calling conventions.
(1) Just before the routine is called, the arguments are pushed on the stack; the last argument declared is pushed first.
(2) When the routine is called, it sets up its own stack frame below the passed arguments.
(3) When the routine returns, it pops its own stack frame off the stack, but leaves the passed arguments intact. The caller is responsible for popping the arguments off the stack.

Figure 1-2 Calling a routine with Pascal calling conventions.
(1) Just before the routine is called, the arguments are pushed on the stack; the first argument declared is pushed first.
(2) When the routine is called, it sets up its own stack frame below the passed arguments.
(3) When the routine returns, it pops its own stack frame and the passed arguments off the stack.
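The two figures can also be followed as a toy simulation in C. The sketch below is only a model under simplifying assumptions: the stack is an array of words that grows downward, the return address and the callee's own stack frame are left out, and every name (ModelStack, call_cdecl, call_pascal, and so on) is hypothetical. It is meant to show who pushes in which order and who pops, not what a compiler really emits.

```c
#include <assert.h>

#define STACK_WORDS 16

/* Toy model of a downward-growing stack of words. The return address and
 * the callee's own stack frame are deliberately left out. */
typedef struct {
    int words[STACK_WORDS];
    int sp;                 /* index of the word on top of the stack */
} ModelStack;

void model_init(ModelStack *s)
{
    s->sp = STACK_WORDS;    /* empty stack: sp points just past the end */
}

static void model_push(ModelStack *s, int value)
{
    assert(s->sp > 0);
    s->words[--s->sp] = value;
}

/* Callee, C convention: the first declared argument was pushed last, so it
 * sits on top. The callee returns without touching the arguments. */
static int callee_cdecl(const ModelStack *s)
{
    int arg1 = s->words[s->sp];
    int arg2 = s->words[s->sp + 1];
    int arg3 = s->words[s->sp + 2];
    return arg1 * 100 + arg2 * 10 + arg3;
}

/* Callee, Pascal convention: the first declared argument was pushed first,
 * so it sits deepest. The callee pops all three arguments as it returns. */
static int callee_pascal(ModelStack *s)
{
    int arg1 = s->words[s->sp + 2];
    int arg2 = s->words[s->sp + 1];
    int arg3 = s->words[s->sp];
    s->sp += 3;             /* models the callee popping the arguments */
    return arg1 * 100 + arg2 * 10 + arg3;
}

/* Caller, C convention: push right-to-left, call, then clean up. */
int call_cdecl(ModelStack *s, int a1, int a2, int a3)
{
    int result;
    model_push(s, a3);
    model_push(s, a2);
    model_push(s, a1);
    result = callee_cdecl(s);
    s->sp += 3;             /* models the caller's "add sp, 6" */
    return result;
}

/* Caller, Pascal convention: push left-to-right and call; no cleanup. */
int call_pascal(ModelStack *s, int a1, int a2, int a3)
{
    model_push(s, a1);
    model_push(s, a2);
    model_push(s, a3);
    return callee_pascal(s);
}
```

Both callers see the same arguments and leave the model stack empty afterwards. The difference lies in where each argument sits while the callee runs and in which side adjusts sp, which is exactly what an Esp routine has to match when it is called from GOC.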