author | Armin Rigo <arigo@tunes.org> | 2021-12-03 12:50:17 +0100 |
---|---|---|
committer | Armin Rigo <arigo@tunes.org> | 2021-12-03 12:50:17 +0100 |
commit | fbc4f4383312ac4110422d6e2eeed7200ab4e271 (patch) | |
tree | 883c733e0c59303bb4adef5650537dd3e2b32382 | |
parent | 803414398821985c5567c99dc46f08672208e29b (diff) | |
download | cffi-fbc4f4383312ac4110422d6e2eeed7200ab4e271.tar.gz |
Write explicitly that byte strings passed to `char *` arguments are not meant to live as long as the byte string is alive, but only for the duration of the call
-rw-r--r-- | doc/source/ref.rst | 25 | ||||
-rw-r--r-- | doc/source/using.rst | 9 |
2 files changed, 29 insertions, 5 deletions
diff --git a/doc/source/ref.rst b/doc/source/ref.rst
index 05c0f7c..946e48c 100644
--- a/doc/source/ref.rst
+++ b/doc/source/ref.rst
@@ -846,20 +846,35 @@ allowed.
    argument is identical to a ``item[]`` argument (and ``ffi.cdef()``
    doesn't record the difference). So when you call such a function,
    you can pass an argument that is accepted by either C type, like
-   for example passing a Python string to a ``char *`` argument
+   for example passing a Python byte string to a ``char *`` argument
    (because it works for ``char[]`` arguments) or a list of integers
    to a ``int *`` argument (it works for ``int[]`` arguments). Note
    that even if you want to pass a single ``item``, you need to
    specify it in a list of length 1; for example, a ``struct point_s
    *`` argument might be passed as ``[[x, y]]`` or ``[{'x': 5, 'y':
-   10}]``.
+   10}]``. In all these cases (including passing a byte string to
+   a ``char *`` argument), the required C data structure is created
+   just before the call is done, and freed afterwards.
 
    As an optimization, CFFI assumes that a
-   function with a ``char *`` argument to which you pass a Python
+   function with a ``char *`` argument to which you pass a Python byte
    string will not actually modify the array of characters passed in,
-   and so passes directly a pointer inside the Python string object.
+   and so it attempts to pass directly a pointer inside the Python
+   byte string object. This still doesn't mean that the ``char *``
+   argument can be stored by the C function and inspected later.
+   The ``char *`` is only valid for the duration of the call, even if
+   the Python object is kept alive for longer.
    (On PyPy, this optimization is only available since PyPy 5.4
-   with CFFI 1.8.)
+   with CFFI 1.8. It may fail in rare cases and fall back to making
+   a copy anyway, but only for short strings so it shouldn't be
+   noticeable.)
+
+   If you need to pass a ``char *`` that must be valid for longer than
+   just the call, you need to build it explicitly, either with ``p =
+   ffi.new("char[]", mystring)`` (which makes a copy) or by not using a
+   byte string in the first place but something else like a buffer object,
+   or a bytearray and ``ffi.from_buffer()``; or just use
+   ``ffi.new("char[]", length)`` directly if possible.
 
 `[2]` C function calls are done with the GIL released.
 
diff --git a/doc/source/using.rst b/doc/source/using.rst
index 38c96ba..6432f4a 100644
--- a/doc/source/using.rst
+++ b/doc/source/using.rst
@@ -383,6 +383,15 @@ argument and may mutate it!):
 
     assert lib.strlen("hello") == 5
 
+(Note that there is no guarantee that the ``char *`` passed to the
+function remains valid after the call is done. Similarly, if you write
+``lib.f(x); lib.f(x)`` where ``x`` is some byte string, the two calls to
+``f()`` could sometimes receive different ``char *`` arguments. This is
+important notably for PyPy which uses many optimizations tweaking the data
+underlying a byte string object. CFFI will not make and free a copy of
+the whole string at *every* call---it usually won't---but you *cannot*
+write code that relies on it: there are cases were that would break.)
+
 You can also pass unicode strings as ``wchar_t *`` or ``char16_t *`` or
 ``char32_t *`` arguments. Note that
 the C language makes no difference between argument declarations that
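
The three alternatives named in the new ``ref.rst`` paragraph can be sketched as follows. This is only an illustration, not part of the commit: the variable names (``message``, ``owned``, ``backing``, ``view``, ``scratch``) are made up here, and no C function is called. Each resulting cdata object could be passed wherever a ``char *`` is expected, and the memory it points to should stay valid for as long as the cdata object itself is kept alive.

```python
from cffi import FFI  # third-party package (pip install cffi)

ffi = FFI()

# Option 1: copy the byte string into memory owned by a cdata object.
# The copy lives as long as `owned` does, independently of `message`.
message = b"hello"
owned = ffi.new("char[]", message)  # room for the bytes plus a trailing NUL

# Option 2: start from a mutable buffer instead of a byte string and
# expose it without copying. `view` keeps a reference to `backing`.
backing = bytearray(b"world")
view = ffi.from_buffer(backing)

# Option 3: allocate a zero-initialized array of a chosen length and
# fill it afterwards (here with ffi.memmove).
scratch = ffi.new("char[]", 16)
ffi.memmove(scratch, b"abc", 3)
```

In all three cases the Python-side object (``owned``, ``view``, ``scratch``) has to be kept referenced for as long as the C side may still use the pointer; dropping the last reference releases the memory in options 1 and 3.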