(Note: This is a work in progress.)
The audience is assumed to be SWIG developers (who should also read the
SWIG Engineering Manual before starting
to code).
1.1 Directory Guide
Doc | HTML documentation. If you find a documentation bug, please let us know.
Examples | This subdirectory tree contains examples of using SWIG with different scripting languages, including makefiles. Typically, there are the "simple" and "class" examples, with some languages offering additional examples. See the README or index.html file in each directory for more info. [FIXME: Ref SWIG user manual.]
Lib | These are the .i (interface) files that form the SWIG installed library. Language-specific files are in subdirectories (for example, guile/typemaps.i). Each language also has a .swg file implementing runtime type support for that language. The SWIG library is not versioned.
Misc | Currently this subdirectory contains only the file fileheader. See the Engineering Manual for more info.
Source | The C and C++ source code for the swig executable is in this subdirectory tree.
  DOH | C library providing memory allocation, file access and generic containers.
  Include | Configuration .h files.
  CParse | Parser (lex / yacc) files and support.
  Modules | Language-specific callbacks that do the actual code generation (each language has a .cxx and a .h file).
  Preprocessor | SWIG-specialized C/C++ preprocessor.
  Swig | This directory contains the ANSI C core of the system, with generic functions related to types, file handling, scanning, and so forth.
DOH takes a different approach to tackling the complexity problem. First, rather than going overboard with dozens of types and class definitions, DOH only defines a handful of simple yet very useful objects that are easy to remember. Second, DOH uses dynamic typing, one of the features that make scripting languages so useful and that make it possible to accomplish things with much less code. Finally, DOH utilizes a few coding tricks that allow it to perform a limited form of function overloading for certain C datatypes (more on that a little later).
The key point to using DOH is that instead of thinking about code in
terms of highly specialized C data structures, just about everything
ends up being represented in terms of just a few datatypes. For
example, structures are replaced by DOH hash tables whereas arrays are
replaced by DOH lists. At first, this is probably a little strange to
most C/C++ programmers, but in the long run it makes the system
extremely flexible and highly extensible. Also, in terms of coding,
many of the newly DOH-based subsystems are less than half the size (in
lines of code) of the earlier C++ implementation.
2.2 Basic Types
The following built-in types are currently provided by DOH:
Objects can be copied using the Copy() function. For example:

    DOH *a, *b, *c, *d;
    a = NewString("Hello World");
    b = NewList();
    c = Copy(a);         /* Copy the string a */
    d = Copy(b);         /* Copy the list b */

Copies of lists and hash tables are shallow. That is, their contents are only copied by reference.

Objects can be deleted using the Delete() function. For example:

    DOH *a = NewString("Hello World");
    ...
    Delete(a);           /* Destroy a */

All objects are reference counted and given a reference count of 1 when initially created. The Delete() function only destroys an object when the reference count reaches zero. When an object is placed in a list or hash table, its reference count is automatically increased. For example:

    DOH *a, *b;
    a = NewString("Hello World");
    b = NewList();
    Append(b,a);         /* Increases refcnt of a to 2 */
    Delete(a);           /* Decreases refcnt of a to 1 */
    Delete(b);           /* Destroys b, and destroys a */

Should it ever be necessary to manually increase the reference count of an object, the DohIncref() function can be used:

    DOH *a = NewString("Hello");
    DohIncref(a);
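The reference counting rules described above can be illustrated without DOH itself. The following is a minimal, self-contained sketch, not DOH's implementation: the names Obj, List, obj_new(), obj_incref(), obj_delete(), list_new(), list_append(), list_delete() and live_objects are all hypothetical and exist only for this illustration. It assumes the same rules as the text: a new object starts at a count of 1, appending to a container increases the count, and an object is destroyed only when its count reaches zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for DOH objects; DOH's real layout differs. */
static int live_objects = 0;            /* number of objects not yet destroyed */

typedef struct Obj {
    int refcount;
} Obj;

#define LIST_MAX 8

typedef struct List {
    int refcount;
    int len;
    Obj *items[LIST_MAX];
} List;

/* New objects start with a reference count of 1. */
static Obj *obj_new(void) {
    Obj *o = (Obj *) malloc(sizeof(Obj));
    assert(o != NULL);
    o->refcount = 1;
    live_objects++;
    return o;
}

/* Manually add a reference (the role DohIncref() plays in the text). */
static void obj_incref(Obj *o) {
    o->refcount++;
}

/* Drop one reference; destroy only when the count reaches zero.
   Returns 1 if the object was destroyed, 0 otherwise. */
static int obj_delete(Obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount > 0)
        return 0;
    free(o);
    live_objects--;
    return 1;
}

static List *list_new(void) {
    List *l = (List *) malloc(sizeof(List));
    assert(l != NULL);
    l->refcount = 1;
    l->len = 0;
    return l;
}

/* Placing an object in a container increases its reference count. */
static void list_append(List *l, Obj *o) {
    assert(l->len < LIST_MAX);
    obj_incref(o);
    l->items[l->len++] = o;
}

/* Destroying the container releases its reference to every item. */
static void list_delete(List *l) {
    int i;
    if (--l->refcount > 0)
        return;
    for (i = 0; i < l->len; i++)
        obj_delete(l->items[i]);
    free(l);
}
```

Used in the same order as the Append()/Delete() example, obj_delete() on the object leaves it alive while the list still holds it, and list_delete() then destroys both.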
Example usage of lists:

    /* Create and populate */
    List *list = NewList();
    Append(list, NewString("listval1"));
    Append(list, NewString("listval2"));
    Append(list, NewString("listval3"));
    Append(list, NewString("listval4"));
    Append(list, NewString("listval5"));

    /* Size */
    Printf(stdout, "list len: %d\n", Len(list));

    /* Delete */
    Delitem(list, 3);

    /* Replace */
    Setitem(list, 0, NewString("newlistval1"));

    /* Get */
    String *item = Getitem(list,1);
    if (item)
      Printf(stdout, "get: %s\n", item);
    else
      Printf(stdout, "get: [non-existent]\n");

    /* Iterate through the container */
    int len = Len(list);
    for (int i=0; i<len; i++) {
      String *item = Getitem(list,i);
      Printf(stdout, "list item: %s\n", item);
    }

Resulting output:

    list len: 5
    get: listval2
    list item: newlistval1
    list item: listval2
    list item: listval3
    list item: listval5

Example usage of hash tables:

    /* Create and populate */
    Hash *hash = NewHash();
    Setattr(hash, "h1", NewString("hashval1"));
    Setattr(hash, "h2", NewString("hashval2"));
    Setattr(hash, "h3", NewString("hashval3"));
    Setattr(hash, "h4", NewString("hashval4"));
    Setattr(hash, "h5", NewString("hashval5"));

    /* Size */
    Printf(stdout, "hash len: %d\n", Len(hash));

    /* Delete */
    Delattr(hash, "h4");

    /* Get */
    String *item = Getattr(hash, "h2");
    if (item)
      Printf(stdout, "get: %s\n", item);
    else
      Printf(stdout, "get: [non-existent]\n");

    /* Iterate through the container */
    Iterator it;
    for (it = First(hash); it.key; it = Next(it))
      Printf(stdout, "hash item: %s [%s]\n", (it.item), (it.key));

Resulting output:

    hash len: 5
    get: hashval2
    hash item: hashval5 [h5]
    hash item: hashval1 [h1]
    hash item: hashval2 [h2]
    hash item: hashval3 [h3]
The representation and manipulation of types is currently in the process of being reorganized and (hopefully) simplified. The following list describes the current set of functions that are used to manipulate datatypes.
    Original Datatype     lstr()
    ------------------    --------
    const char *a         char *a
    double a[20]          double *a
    double a[20][30]      double *a
    double &a             double *a

The intent of the lstr() function is to produce local variables inside wrapper functions, all of which must be reassignable types since they are the targets of conversions from a scripting representation.
    Original Datatype     rcaststr()
    ------------------    ---------
    char *a
    const char *a         (const char *) name
    double a[20]          (double *) name
    double a[20][30]      (double (*)[30]) name
    double &a             (double &) *name
    Original Datatype     lcaststr()
    ------------------    ---------
    char *a
    const char *a         (char *) name
    double a[20]          (double *) name
    double a[20][30]      (double *) name
    double &a             (double *) &name
Consider the following function:

    int foo(int a, double b[20][30], const char *c, double &d);

Here's how a wrapper function would be generated using the type generation functions above:

    wrapper_foo() {
       lstr("int","result")
       lstr("int","arg0")
       lstr("double [20][30]", "arg1")
       lstr("const char *", "arg2")
       lstr("double &", "arg3")
       ...
       get arguments
       ...
       result = (lcaststr("int"))  foo(rcaststr("int","arg0"),
                                   rcaststr("double [20][30]","arg1"),
                                   rcaststr("const char *", "arg2"),
                                   rcaststr("double &", "arg3"))
       ...
    }

Here's how it would look with the corresponding output filled in:

    wrapper_foo() {
       int      result;
       int      arg0;
       double  *arg1;
       char    *arg2;
       double  *arg3;
       ...
       get arguments
       ...
       result = (int) foo(arg0,
                          (double (*)[30]) arg1,
                          (const char *) arg2,
                          (double &) *arg3);
       ...
    }
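To make the local-variable and cast conventions concrete, here is a small hand-written sketch in plain C. It is not output produced by SWIG. The body of foo() and the parameters in0, in1 and in2 (which stand in for values already converted from a scripting language) are assumptions made for this illustration, and the reference parameter from the example above is left out because C has no references.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical function being wrapped; the body is invented so the
   example has something to compute. */
static int foo(int a, double b[20][30], const char *c) {
    assert(b != NULL && c != NULL);
    return a + (int) b[1][2] + (int) strlen(c);
}

/* Hand-written stand-in for a generated wrapper. */
static int wrapper_foo(int in0, double *in1, char *in2) {
    /* Locals use the reassignable types from the lstr() table. */
    int     result;
    int     arg0;
    double *arg1;     /* double [20][30]  ->  double *  */
    char   *arg2;     /* const char *     ->  char *    */

    /* "get arguments": simulated here by plain assignment */
    arg0 = in0;
    arg1 = in1;
    arg2 = in2;

    /* Casts back to the original types follow the rcaststr() table. */
    result = (int) foo(arg0,
                       (double (*)[30]) arg1,
                       (const char *) arg2);
    return result;
}
```

The point of the two-step scheme is that every local can be assigned to, while the call site still sees the exact parameter types declared by the wrapped function.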
When SWIG generates wrappers, it tries to provide a mostly seamless integration with the original code. However, there are a number of problematic features of C/C++ programs that complicate this interface.
Functions that pass structures by value are converted into wrappers that take pointers. For example:

    double dot_product(Vector a, Vector b);

gets turned into a wrapper like this:

    double wrap_dot_product(Vector *a, Vector *b) {
        return dot_product(*a,*b);
    }

Functions that return by value require a memory allocation to store the result. For example:

    Vector cross_product(Vector *a, Vector *b);

becomes

    Vector *wrap_cross_product(Vector *a, Vector *b) {
       Vector *result = (Vector *) malloc(sizeof(Vector));
       *result = cross_product(a,b);
       return result;
    }

Note: If C++ is being wrapped, the default copy constructor is used instead of malloc() to create a copy of the return result.
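The two wrappers above can be tried out in isolation. The sketch below is self-contained C. The Vector layout and the bodies of dot_product() and cross_product() are assumptions added so that the code compiles; only the two wrap_ functions follow the pattern shown in the text.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout; the text does not define Vector. */
typedef struct Vector {
    double x, y, z;
} Vector;

/* Functions being wrapped (bodies are illustrative). */
static double dot_product(Vector a, Vector b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Vector cross_product(Vector *a, Vector *b) {
    Vector r;
    r.x = a->y * b->z - a->z * b->y;
    r.y = a->z * b->x - a->x * b->z;
    r.z = a->x * b->y - a->y * b->x;
    return r;
}

/* Pass-by-value arguments become pointers in the wrapper. */
static double wrap_dot_product(Vector *a, Vector *b) {
    return dot_product(*a, *b);
}

/* A by-value result is copied into freshly allocated storage.
   The caller owns the returned pointer and must free() it. */
static Vector *wrap_cross_product(Vector *a, Vector *b) {
    Vector *result = (Vector *) malloc(sizeof(Vector));
    assert(result != NULL);
    *result = cross_product(a, b);
    return result;
}
```

Note the ownership consequence: every call to wrap_cross_product() allocates, so whoever receives the pointer is responsible for releasing it.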
Given the following class:

    class Foo {
    public:
        double bar(double);
    };

The "bar" method is wrapped by a function like this:

    double Foo_bar(Foo *self, double arg0) {
       return self->bar(arg0);
    }
A structure with a data member, such as:

    struct Foo {
        int x;
    };

gets wrapped as follows:

    int Foo_x_get(Foo *self) {
        return self->x;
    }
    int Foo_x_set(Foo *self, int value) {
        return (self->x = value);
    }
A constructor is wrapped as follows:

    Foo *new_Foo() {
        return new Foo;
    }

Note: For C, new objects are created using the calloc() function.
A destructor is wrapped as follows:

    void delete_Foo(Foo *self) {
        delete self;
    }

Note: For C, objects are destroyed using free().
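The notes above on calloc() and free() suggest what the C flavour of these wrappers looks like. The following is a sketch under that assumption, not code emitted by SWIG. It reuses the struct Foo with a single int member from the accessor example and adds a typedef so that the wrapper signatures read the same as in the text.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Foo {
    int x;
} Foo;

/* C constructor wrapper: calloc() returns zero-initialised storage. */
static Foo *new_Foo(void) {
    return (Foo *) calloc(1, sizeof(Foo));
}

/* C destructor wrapper: release the storage with free(). */
static void delete_Foo(Foo *self) {
    free(self);
}

/* Member accessors, as in the structure example. */
static int Foo_x_get(Foo *self) {
    assert(self != NULL);
    return self->x;
}

static int Foo_x_set(Foo *self, int value) {
    assert(self != NULL);
    return (self->x = value);
}
```

Because calloc() zero-fills, a freshly constructed Foo reports x as 0 until Foo_x_set() is called.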
Suppose the following function is being wrapped:

    double dot_product(Vector a, Vector b);

Here's how you might write a really simple wrapper function:

    ParmList *l = ... parameter list of the function ...
    DataType *t = ... return type of the function ...
    char     *name = ... name of the function ...
    Wrapper *w = NewWrapper();
    Printf(w->def,"void wrap_%s() {\n", name);

    /* Declare all of the local variables */
    Swig_cargs(w, l);

    /* Convert all of the arguments */
    ...

    /* Make the function call and declare the result variable */
    Swig_cresult(w,t,"result",Swig_cfunction(name,l));

    /* Convert the result into whatever */
    ...

    Printf(w->code,"}\n");
    Wrapper_print(w,out);

The output of this would appear as follows:

    void wrap_dot_product() {
        Vector *arg0;
        Vector *arg1;
        double  result;
        ...
        result = dot_product(*arg0, *arg1);
        ...
    }

Notice that the Swig_cargs(), Swig_cresult(), and Swig_cfunction() functions have taken care of the type conversions for the Vector type automatically.
The C++ standard reserves certain names for the implementation:

    17.4.3.1.2 Global names [lib.global.names]

    1 Certain sets of names and function signatures are always reserved to the implementation:

      * Each name that contains a double underscore (__) or begins with an underscore followed by an upper case letter (2.11) is reserved to the implementation for any use.
      * Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.165)

    165) Such names are also reserved in namespace ::std (17.4.3.1).

When generating code it is important not to generate symbols that might clash with the code being wrapped. It is tempting to flout the standard or just use a symbol which starts with a single underscore followed by a lowercase letter in order to avoid name clashes. However, even these legal symbols can clash with symbols being wrapped. The following guidelines should be used when generating code in order to meet the standard and make it highly unlikely that symbol clashes will occur:
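The reserved-name rules quoted from the C++ standard are mechanical enough to check in code. The function below is a minimal sketch of such a check. The name classify_name() and the NAME_ constants are hypothetical, SWIG is not known to contain such a helper, and the check covers only the two bullet points quoted above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical result codes for this illustration. */
enum {
    NAME_OK = 0,               /* not reserved by the quoted rules        */
    NAME_RESERVED_GLOBAL = 1,  /* reserved only in the global namespace   */
    NAME_RESERVED_ANY = 2      /* reserved to the implementation, any use */
};

/* Classify an identifier according to the two quoted rules. */
static int classify_name(const char *name) {
    assert(name != NULL);

    /* Rule 1a: contains a double underscore anywhere. */
    if (strstr(name, "__") != NULL)
        return NAME_RESERVED_ANY;

    /* Rule 1b: begins with an underscore followed by an upper case letter. */
    if (name[0] == '_' && isupper((unsigned char) name[1]))
        return NAME_RESERVED_ANY;

    /* Rule 2: begins with an underscore (reserved in the global namespace). */
    if (name[0] == '_')
        return NAME_RESERVED_GLOBAL;

    return NAME_OK;
}
```

A name such as `_foo` falls only under the second rule, which is why the text calls it legal yet still risky: it can collide with a symbol in the code being wrapped.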
For C++ code that doesn't attempt to mangle a symbol being wrapped (for example SWIG convenience functions):
For code compiled as C or C++ that doesn't attempt to mangle a symbol being wrapped (for example SWIG convenience functions):
The DOH types used in the SWIG source code are all typedefined to void. Consequently, it is impossible for debuggers to automatically extract any information about DOH objects. The easiest approach to debugging and viewing the contents of DOH objects is to make a call into one of the family of SWIG print functions from the debugger. The "Debugging Functions" section in SWIG Parse Tree Handling lists them. It is sometimes easier to debug by placing a few calls to these functions in the code of interest and recompiling, especially if your debugger cannot easily make calls into functions within a debugged binary.
The SWIG distribution comes with some additional support for the gdb debugger in the Tools/swig.gdb file. Follow the instructions in this file for 'installing'. This support file provides an easy way to call into some of the family of SWIG print functions via additional user-defined gdb commands. Usage of the swigprint and locswigprint user-defined commands is demonstrated below.
More often than not, a parse tree node needs to be examined. The session below displays the node n in one of the Java language module wrapper functions. The swigprint command is used to show the symbol name (symname - a DOH String type) and the node (n - a DOH Hash type).
    Breakpoint 1, JAVA::functionWrapper (this=0x97ea5f0, n=0xb7d2afc8) at Modules/java.cxx:799
    799       String *symname = Getattr(n, "sym:name");
    (gdb) next
    800       SwigType *t = Getattr(n, "type");
    (gdb) swigprint symname
    Shape_x_set
    (gdb) swigprint n
    Hash(0xb7d2afc8) {
      'membervariableHandler:view' : variableHandler,
      'feature:except' : 0,
      'name' : x,
      'ismember' : 1,
      'sym:symtab' : Hash(0xb7d2aca8) {......},
      'nodeType' : cdecl,
      'nextSibling' : Hash(0xb7d2af98) {.............},
      'kind' : variable,
      'variableHandler:feature:immutable' : <Object 'VoidObj' at 0xb7cfa008>,
      'sym:name' : Shape_x_set,
      'view' : membervariableHandler,
      'membervariableHandler:sym:name' : x,
      'membervariableHandler:type' : double,
      'membervariableHandler:parms' : <Object 'VoidObj' at 0xb7cfa008>,
      'parentNode' : Hash(0xb7d2abc8) {..............................},
      'feature:java:enum' : typesafe,
      'access' : public,
      'parms' : Hash(0xb7cb9408) {......},
      'wrap:action' : if (arg1) (arg1)->x = arg2;,
      'type' : void,
      'memberset' : 1,
      'sym:overname' : __SWIG_0,
      'membervariableHandler:name' : x,
    }
Note that all the attributes in the Hash are shown, including the 'sym:name' attribute which was assigned to the symname variable.
Hash types can be shown either expanded or collapsed. When a Hash is shown expanded, all the attributes are displayed along with their values; when collapsed, a '.' replaces each attribute. Therefore a count of the dots provides the number of attributes within an unexpanded Hash. The session below shows the 'parms' Hash being displayed with the default Hash expansion of 1, then with 2 provided as the second argument to swigprint to expand to two Hash levels in order to view the contents of the collapsed 'nextSibling' Hash.
    (gdb) swigprint 0xb7cb9408
    Hash(0xb7cb9408) {
      'name' : self,
      'type' : p.Shape,
      'self' : 1,
      'nextSibling' : Hash(0xb7cb9498) {...},
      'hidden' : 1,
      'nodeType' : parm,
    }
    (gdb) swigprint 0xb7cb9408 2
    Hash(0xb7cb9408) {
      'name' : self,
      'type' : p.Shape,
      'self' : 1,
      'nextSibling' : Hash(0xb7cb9498) {
        'name' : x,
        'type' : double,
        'nodeType' : parm,
      },
      'hidden' : 1,
      'nodeType' : parm,
    }
The same Hash can also be displayed with file and line location information via the locswigprint command.
    (gdb) locswigprint 0xb7cb9408
    example.h:11: [Hash(0xb7cb9408) {
    Hash(0xb7cb9408) {
      'name' : self,
      'type' : p.Shape,
      'self' : 1,
      'nextSibling' : Hash(0xb7cb9498) {...},
      'hidden' : 1,
      'nodeType' : parm,
    }]
Tip: Commands in gdb can be shortened with whatever makes them unique and can be command completed with the tab key. Thus swigprint can usually be shortened to sw and locswigprint to loc. The help for each command can also be obtained within the debugging session, for example, 'help swigprint'.
The sub-section below gives pointers for debugging DOH objects using casts and provides an insight into why it can be hard to debug SWIG without the family of print functions.
7.1 Debugging DOH Types The Hard Way
The DOH types used in SWIG are all typedefined to void, hence the lack of type information when inspecting them within a debugger.
Most debuggers will, however, be able to display useful variable information when an object is cast to the appropriate type.
Getting at the underlying C string within DOH types is cumbersome, but possible with appropriate casts.
The casts below can be used in a debugger window, but be sure to compile with compiler optimisations turned off before attempting the casts, otherwise they are unlikely to work.
Even displaying the underlying string in a String * doesn't work straight off in all debuggers due to the multiple definitions of String as a struct and a void.
Below is a list of common SWIG types. With each is the cast that can be used in the debugger to extract the underlying type information and the underlying char * string.