diff options
Diffstat (limited to 'swigweb/papers/Py96/python96.html')
-rwxr-xr-x | swigweb/papers/Py96/python96.html | 572 |
1 files changed, 0 insertions, 572 deletions
diff --git a/swigweb/papers/Py96/python96.html b/swigweb/papers/Py96/python96.html deleted file mode 100755 index 8bd9e032d..000000000 --- a/swigweb/papers/Py96/python96.html +++ /dev/null @@ -1,572 +0,0 @@ -<html> -<title> -Using SWIG to Control, Prototype, and Debug C Programs with Python -</title> -<body BGCOLOR="ffffff"> -<h1> Using SWIG to Control, Prototype, and Debug C Programs with Python </h1> -<b> -(Submitted to the 4th International Python Conference) <br> <br> -David M. Beazley <br> -Department of Computer Science <br> -University of Utah <br> -Salt Lake City, Utah 84112 <br> -<a href="http://www.cs.utah.edu/~beazley"> beazley@cs.utah.edu </a> <br> -</b> -<h2> Abstract </h2> -<em> -I discuss early results in using SWIG to interact with C and C++ programs -using Python. Examples of using SWIG are provided along with a -discussion of the benefits and limitations of this approach. This -paper describes work in progress with the hope of getting feedback and -ideas from the Python community. -</em> <br> <br> - -Note : SWIG is freely available at <a href="http://www.cs.utah.edu/~beazley/SWIG"> http://www.cs.utah.edu/~beazley/SWIG </a> - -<h2> Introduction </h2> - -SWIG (Simplified Wrapper and Interface Generator) is a code -development -tool designed to make it easy for scientists and engineers to add scripting -language interfaces to programs and libraries written in C and C++. -The first version of SWIG was developed in July 1995 -for use with -very large scale scientific applications running on the Connection -Machine 5 and Cray T3D at Los Alamos National Laboratory. -Since that time, the system has been extended to support several -interface languages including Python, Tcl, Perl4, Perl5, and Guile [1]. - -In this paper, I will describe how SWIG can be used to interact with -C and C++ code from Python. I'll also discuss some -applications and limitations of this approach. It is important to -emphasize that SWIG was primarily designed for scientists and engineers -who would like to have a nice interface, but who would rather -work on more interesting problems than figuring out how to build -a user interface or using a complicated interface generation tool. - -<h2> Introducing SWIG </h2> - -The idea behind SWIG is really quite simple---most interface -languages such as Python provide some mechanism for making -calls to functions written in C or C++. However, this almost -always requires the user to write special "wrapper" functions -that provide the glue between the scripting language and the -C functions. As an example, if you wanted to add the -<tt> getenv() </tt> function to Python, you would need to write -the following wrapper code [4] : - -<blockquote> -<tt> -<pre> -static PyObject *wrap_getenv(PyObject *self, PyObject *args) { - char * result; - char * arg0; - if(!PyArg_ParseTuple(args, "s",&arg0)) - return NULL; - result = getenv(arg0); - return Py_BuildValue("s", result); -} - -</pre> </tt> -</blockquote> -While writing a single wrapper function isn't too difficult, -it quickly becomes tedious and error prone as the number -of functions increases. -SWIG automates this process by generating -wrapper code from a list ANSI C function and variable declarations. -As a result, adding scripting languages to C applications can become -an almost trivial exercise. <br> <br> - -The core of the SWIG consists of a YACC parser and a collection -of general purpose utility functions. The output of the parser -is sent to two different modules--a language module for writing -wrapper functions and a documentation module for producing a simple -reference guide. Each of the target languages and documentation methods -is implemented as C++ class that can be plugged in to the system. -Currently, Python, Tcl, Perl4, Perl5, and Guile are supported as -target languages while documentation can be produced in ASCII, HTML, -and LaTeX. Different target languages can be implemented as a new -C++ class that is linked with a general purpose SWIG library file. - -<h2> Previous work </h2> - -Most Python users are probably aware that automatic wrapper generation is -not a new idea. In particular, packages such as ILU are capable -of providing bindings between C, C++, and Python using specifications -given in IDL (Interface Definition Language) [3]. SWIG is not really meant to compete with this -approach, but is designed to be a no-nonsense, easy to use tool -that scientists and engineers can use to build interesting interfaces -without having to worry about grungy programming details or learning -a complicated interface specification language. -With that said, SWIG is certainly not designed to turn Python into -a C or C++ clone (I'm not sure that turning Python into interpreted -C/C++ compiler would be a good idea anyways). However, SWIG can allow -Python to interface with a surprisingly wide variety of C functions. - -</body> -</html> - -<h2> A SWIG example </h2> - -While it is not my intent to provide a tutorial, I hope to -illustrate how SWIG works by building a module from part of the -UNIX socket library. -While there is probably no need to do this in practice (since Python -already has a socket module), it illustrates most of SWIG's capabilities -on a moderately complex example and provides a basis for -comparison between an existing module and one created by SWIG. <br> <br> - -As input, SWIG takes an input file referred to as an "interface file." -The file starts with a preamble containing the name of the -module and a code block where special header files, additional C -code, and other things can be added. After that, ANSI C function -and variable declarations are listed in any order. Since SWIG -uses C syntax, it's usually fairly easy to build an interface -from scratch or simply by copying an existing header file. -The following SWIG interface file will be used for our socket example : - -<blockquote> -<tt> -<pre> -// socket.i -// SWIG Interface file to play with some sockets -%init sock // Name of our module -%{ -#include <sys/types.h> -#include <sys/socket.h> -#include <netinet/in.h> -#include <arpa/inet.h> -#include <netdb.h> - -/* Set some values in the sockaddr_in structure */ -struct sockaddr *new_sockaddr_in(short family, unsigned long hostid, int port) { - struct sockaddr_in *addr; - addr = (struct sockaddr_in *) malloc(sizeof(struct sockaddr_in)); - bzero((char *) addr, sizeof(struct sockaddr_in)); - addr->sin_family = family; - addr->sin_addr.s_addr = hostid; - addr->sin_port = htons(port); - return (struct sockaddr *) addr; -} -/* Get host address, but return as a string instead of hostent */ -char *my_gethostbyname(char *hostname) { - struct hostent *h; - h = gethostbyname(hostname); - if (h) return h->h_name; - else return ""; -} -%} - -// Add these constants -enum {AF_UNIX, AF_INET, SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, - IPPROTO_UDP, IPPROTO_TCP, INADDR_ANY}; - -#define SIZEOF_SOCKADDR sizeof(struct sockaddr) - -// Wrap these functions -int socket(int family, int type, int protocol); -int bind(int sockfd, struct sockaddr *myaddr, int addrlen); -int connect(int sockfd, struct sockaddr *servaddr, int addrlen); -int listen(int sockfd, int backlog); -int accept(int sockfd, struct sockaddr *peer, %val int *addrlen); -int close(int fd); -struct sockaddr *new_sockaddr_in(short family, unsigned long, int port); -%name gethostbyname { char *my_gethostbyname(char *); } -unsigned long inet_addr(const char *ip); - -%include unixio.i -</pre> -</tt> -</blockquote> - -There are several things to notice about this file - -<ul> -<li> The module name is specified with the <tt> %init </tt> directive. -<li> Header files and supporting C code is placed in <tt> %{,%} </tt>. -<li> Special support functions may be necessary. For example, <tt> -new_sockaddr_in() </tt> creates a new <tt> struct sockaddr_in </tt> type -and sets some of its values. -<li> Constants are created with <tt> enum </tt> or <tt> #define. </tt> -<li> Functions can take most C basic datatypes as well as pointers -to complex types. -<li> Functions can be renamed with the <tt> %name </tt> directive. -<li> The <tt> %val </tt> directive forces a function argument to be called by -value. - -<li> The <tt> %include </tt> directive can be used to include other -SWIG interface files. In this case, the file <tt> unixio.i </tt> -might look like the following : - -<blockquote> -<tt> -<pre> -// File : unixio.i -// Some file I/O and memory functions -%{ -%} -int read(int fd, void *buffer, int n); -int write(int fd, void *buffer, int n); -typedef unsigned int size_t; -void *malloc(size_t nbytes); -</pre> -</tt> -</blockquote> -<li> Since interface files can include other interface files, it can be very -easy to build up libraries and collections of useful functions. -<li> <tt> typedef </tt> can be used to provide mapping between datatypes. -<li> C and C++ comments are allowed and like C, SWIG ignores whitespace. -</ul> - -The key thing to notice about SWIG interface files is that they support -real C functions, constants, and datatypes including C pointers. As -a general rule these files are also independent of the target scripting -language--the primary reason why it's easy to support different -languages. - -<h2> Building a Python module </h2> - -Building a Python module with SWIG is a simple process that proceeds as -follows (assuming you're running Solaris 2.x) : - -<blockquote> -<tt> -<pre> -unix > wrap -python socket.i -unix > gcc -c socket_wrap.c -I/usr/local/include/Py -unix > ld -G socket_wrap.o -lsocket -lnsl -o sockmodule.so -</pre> -</tt> -</blockquote> - -The <tt> wrap </tt> command takes the interface file and produces a C file -called <tt> socket_wrap.c </tt>. This file is then compiled and built into -a shared object file. - -<h2> A sample Python script </h2> - -Our new socket module can be used normally within a Python script. -For example, we could write a simple server process to echo all -data received back to a client : - -<blockquote> -<tt> -<pre> -# Echo server program using SWIG module -from sock import * -PORT = 5000 -sockfd = socket(AF_INET, SOCK_STREAM, 0) -if sockfd < 0: - raise IOError, 'server : unable to open stream socket' -addr = new_sockaddr_in(AF_INET,INADDR_ANY,PORT) -if (bind(sockfd, addr, SIZEOF_SOCKADDR)) < 0 : - raise IOError, 'server : unable to bind local address' -listen(sockfd,5) -client_addr = new_sockaddr_in(AF_INET, 0, 0) -newsockfd = accept(sockfd, client_addr, SIZEOF_SOCKADDR) -buffer = malloc(1024) -while 1: - nbytes = read(newsockfd, buffer, 1024) - if nbytes > 0 : - write(newsockfd,buffer,nbytes) - else : break -close(newsockfd) -close(sockfd) -</pre> -</tt> -</blockquote> - -In our Python module, we are manipulating a socket, in almost -exactly the same way as would be done in a C program. We'll take -a look at a client written using the Python socket library shortly. - -<h2> Pointers and run-time type checking </h2> - -SWIG imports C pointers as ASCII strings containing both the -address and pointer type. Thus, a typical SWIG pointer might look -like the following : <br> <br> - -<center> -<tt> _fd2a0_struct_sockaddr_p </tt> <br> <br> -</center> - -Some may view the idea of importing C pointers into Python as unsafe or even -truly insane. However, handling pointers seems to be needed in order -for SWIG to handle most interesting types of C code. To provide some safety, -the wrapper functions generated by SWIG perform pointer-type checking -(since importing C pointers into -Python has in effect, bypassed type-checking in the C compiler). -When incompatible pointer types are used, this is what happens : - -<blockquote> -<tt> -<pre> -unix > python -Python 1.3 (Apr 12 1996) [GCC 2.5.8] -Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam ->>> from sock import * ->>> sockfd = socket(AF_INET, SOCK_STREAM,0); ->>> buffer = malloc(8192); ->>> bind(sockfd,buffer,SIZEOF_SOCKADDR); -Traceback (innermost last): - File "<stdin>", line 1, in ? -TypeError: Type error in argument 2 of bind. Expected _struct_sockaddr_p. ->>> -</pre> -</tt> -</blockquote> - - -SWIG's run-time type checking provides a certain degree of safety from -using invalid parameters and making simple mistakes. Of course, any system -supporting pointers can be abused, but SWIG's implementation has proven -to be quite reliable under normal use. - -<h2> Comparison with a real Python module </h2> - -As a point of comparison, we can now write a client script using the -Python socket module (this example courtesy of the Python library -reference manual) [5]. - -<blockquote> -<tt> -<pre> -# Echo client program -from socket import * -HOST = 'tjaze.lanl.gov' -PORT = 5000 -s = socket(AF_INET, SOCK_STREAM) -s.connect(HOST,PORT) -s.send('Hello world') -data = s.recv(1024) -s.close() -print 'Received', `data` -</pre> -</tt> -</blockquote> - -As one might expect, the two scripts look fairly similar. However -there are certain obvious differences. First, the SWIG generated -module is clearly a quick and dirty approach. We must explicitly -check for errors and create a few C data structures. Furthermore, -while Python functions such as <tt> socket() </tt> can take optional -arguments, SWIG generated functions can not (since the underlying -C function can't). Of course, the apparent ugliness -of the SWIG module compared to the Python equivalent is not the -point here. -The real idea to emphasize is that SWIG -can take a set of real C functions with moderate complexity and -produce a fully functional Python module in a very short period of -time. In this case, about 10 minutes worth of effort. -(I'm -guessing that the Python socket module required more time than that). -With a bit of work, the SWIG generated module could probably be -used to build a more user-friendly version that hid many of the details. - -<h2> Controlling C programs </h2> - -Of course, the real goal of SWIG is not to produce quirky -replacements for modules in the Python library. Instead, it -better suited as a mechanism for -controlling a variety of C programs. <br> <br> - -In the SPaSM molecular dynamics code used at Los Alamos National -Laboratory, SWIG -is used to build an interface out of about 200 C functions [2]. -This happens at compile time so most users don't -notice that it has occurred. However, as a result, it is now -extremely easy for the physicists using the code to -extend it with new functions. -Whenever new functionality is added, the new -functions are put into a SWIG interface file and the functions -become available when the code is recompiled (a process -which usually only involves the new C functions and SWIG generated -wrapper file). Of course, when running under Python, it is possible -to build extensions and dynamically load them as needed. <br> <br> - -Another point, not to be overlooked, is the fact that SWIG makes it -very easy for a user to combine bits and pieces of completely different -software packages without waiting for someone to write a special -purpose module. For example, SWIG can be used to import the C API -of Matlab 4.2 into Python. This combined with a simulation module -can allow a scientist to perform both simulation and data analysis in -interactively or from a script. One could even build a Tkinter interface to control -the entire system. The ease of using SWIG makes it possible for -scientists to build applications out of components that -might not have been considered earlier. - -<h2> Prototyping and Debugging </h2> - -When debugging new modules or libraries, it would be nice to be able -to test out the functions without having to repeatedly recompile the -module with the rest of a large application. With SWIG, it is often -possible to prototype and debug modules independently. SWIG can be -used to build Python wrappers around the functions and scripts -can be used for testing. In many cases, it is possible to create -the C data structures and other information that would have been generated -by the larger application all from within a Python script. Since -SWIG requires no changes to the C code, integrating the new C code into -the large package is usually pretty easy. <br> <br> - -A related benefit of the SWIG approach is that it is now possible -to easily experiment with large software libraries and packages. -Peter-Pike Sloan at the -University of Utah, used SWIG to wrap the entire contents of the Open-GL -library (including more than 560 constants and 340 C functions). -This ``interpreted'' Open-GL could then be used interactively to -determine various image parameters and to draw simple figures. -Is this case, it proved to be a remarkably effective way of -figuring out the effects of different parameters and functions without -having to recompile after every modification. - -<h2> Living dangerously with datatypes </h2> - -One of the reasons SWIG is easy to use for prototyping and control -applications is its extremely forgiving treatment of datatypes. -<tt> typedef </tt> can be used to map datatypes, but whenever -SWIG encounters an unknown datatype, it simply assumes that it's -some sort of complex datatype. As a result, it's possible to -wrap some fairly complex C code with almost no effort. For example, -one could create a module allowing Python to create a Tcl interpreter -and execute Tcl commands as follows : - -<blockquote> -<tt> -<pre> -// tcl.i -// Wrap Tcl's C interface -%init tcl -%{ -#include <tcl.h> -%} - -Tcl_Interp *Tcl_CreateInterp(void); -void Tcl_DeleteInterp(Tcl_Interp *interp); -int Tcl_Eval(Tcl_Interp *interp, char *script); -int Tcl_EvalFile(Tcl_Interp *interp, char *file); -</pre> -</tt> -</blockquote> - -SWIG has no idea what <tt> Tcl_Interp </tt> is, but it doesn't -really care as long as it's used as a pointer. When used in a -Python script, this module would work as follows : -<blockquote> -<tt> -<pre> -unix > python -Python 1.3 (Mar 26 1996) [GCC 2.7.0] -Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam ->>> import tcl ->>> interp = tcl.Tcl_CreateInterp(); ->>> tcl.Tcl_Eval(interp,"for {set i 0} {$i < 5} {incr i 1} {puts $i}"); -0 -1 -2 -3 -4 -0 ->>> -</pre> -</tt> -</blockquote> - -<h2> Limitations </h2> - -This approach represents a balance of flexibility and ease of use. -I have always felt that the tool shouldn't be more complicated than -the original problem. Therefore, SWIG certainly won't do everything. -Some of SWIG's limitations include it's limited understanding of -complex datatypes. Often the user must write special functions -to extract information from structures even though passing complex -datatypes around by reference works remarkably well in practice. -SWIG's support for C++ is also quite limited. At the moment SWIG -can be used to pass pointers to C++ objects around, but actually -transforming a C++ object into some sort of Python object is far -beyond the capabilities of the current implementation. Finally, SWIG -lacks an exception model which may be of concern to some users. - -<h2> Limitations in the Python implementation </h2> - -SWIG's support for Python is in many ways, limited by the fact that -SWIG was originally developed for use with other languages. For example, -representing C pointers as a character string can be seen as a carry-over -from an earlier Tcl implementation. With a little work, I believe -that one could create a Python "pointer" datatype while still retaining -type-checking and other features. <br> <br> - -Another limitation in the current Python implementation is the apparent lack -of variable linking support. Many applications, particularly scientific ones, -have global variables that a user may want to access. SWIG currently -creates Python functions for accessing these variables, but other -scripting languages provide a more natural mechanism. -For example, in Perl, it is -possible to attach "set" and "get" functions to a Perl object. These -are evaluated whenever that variable is accessed in a script. -For all practical purposes this special variable looks like any other Perl -variable, but is really linked to an C global variable (of course, -maybe there is a reason why Perl calls these "magic" variables). <br> <br> - -If an equivalent variable linking mechanism is available in Python, I -couldn't find it [4]. -If not, it might be possible to create a new kind of Python datatype -that supports variable linking, but which interacts seamlessly -with existing Python integer, floating point, and character string types. - -<h2> Summary and future work </h2> - -This paper has described work in progress with the SWIG system and -its support for Python. I believe that this approach shows great -promise for future work--especially within the scientific and -engineering community. Most of the people introduced to SWIG and -Python have -found both systems extremely easy and straightforward to use. -I am currently -quite interested in improving SWIG's interface to Python and exploring -its use with numerical Python in order to build interesting -and flexible scientific applications [6]. - -<h2> Acknowledgments </h2> - -This project would not have been possible without the support of a number -of people. Peter Lomdahl, Shujia Zhou, Brad Holian, Tim Germann, and Niels -Jensen at Los Alamos National Laboratory were the first users and were -instrumental in testing out the first designs. Patrick Tullmann, -John Schmidt,Peter-Pike Sloan, and Kurtis Bleeker at the University of Utah have also -been instrumental in testing out some of the newer versions and -providing feedback. -John Buckman has suggested many interesting -improvements and is currently working on porting SWIG to non-Unix -platforms including NT, OS/2 and Macintosh. -Finally -I'd like to thank Chris Johnson and the members of the Scientific -Computing and Imaging group at the University of Utah for their continued -support. Some of this work was performed under the auspices of the -Department of Energy. - -<h2> References </h2> - -[1] D.M. Beazley. <em> SWIG : An Easy to Use Tool for Integrating -Scripting Languages with C and C++ </em>. Proceedings of the 4th -USENIX Tcl/Tk Workshop (1996) (to appear). <br> <br> - -[2] D.M. Beazley and P.S. Lomdahl <em> Message-Passing Multi-Cell -Molecular Dynamics on the Connection Machine 5 </em>, Parallel -Computing 20 (1994), p. 173-195. <br> <br> - -[3] Bill Janssen and Mike Spreitzer, <em> ILU : Inter-Language -Unification via Object Modules</em>, OOPSLA 94 Workshop on -Multi-Language Object Models. -<br> <br> - -[4] Guido van Rossum, <em> Extending and Embedding the Python Interpreter, -</em> October 1995. <br> <br> - -[5] Guido van Rossum, <em> Python Library Reference </em>, October -1995. <br> <br> - -[6] Paul F. Dubois, Konrad Hinsen, and James Hugunin, <em> Numerical -Python, </em> Computers in Physics, (1996) (to appear) - -</body> -</html> |