field | value | date |
---|---|---|
author | William S Fulton <wsf@fultondesigns.co.uk> | 2015-12-19 03:52:33 +0000 |
committer | William S Fulton <wsf@fultondesigns.co.uk> | 2015-12-19 03:55:26 +0000 |
commit | 01611702ec04fa70445fd2c7d37b9b312d3f7561 | |
tree | 94118fe116503d25354f9603e873c4aae743c8bc | |
parent | 291186cfaf39497a42f6ed6395ddaeb2b466ed04 | |
Python 2 Unicode strings can be used as inputs to char * or std::string types
Requires SWIG_PYTHON_2_UNICODE to be defined when compiling generated code.
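
The conversion this commit enables can be pictured in pure Python. The sketch below is an illustration, not SWIG's implementation: it assumes, per the documentation added in this commit, that a Python 2 Unicode string reaches C/C++ as UTF-8 encoded bytes. The helper names `to_c_bytes` and `from_c_bytes` are hypothetical and exist only for this example.

```python
# Illustrative only: emulates the described conversion in pure Python.
# It does not call SWIG or any generated wrapper code.


def to_c_bytes(text):
    """Hypothetical helper: the UTF-8 bytes a char * argument would point at."""
    return text.encode("utf-8")


def from_c_bytes(raw):
    """Hypothetical helper: decode bytes handed back from C, assuming UTF-8."""
    return raw.decode("utf-8")


# The same inputs the test suite in this commit passes to charstring().
samples = [u"hi", u"hell\xb05", u"hell\u00f66"]

for s in samples:
    raw = to_c_bytes(s)
    # Non-ASCII characters occupy more than one byte in UTF-8.
    print(repr(s), "->", repr(raw), "(%d bytes)" % len(raw))
    # Decoding the bytes as UTF-8 recovers the original text.
    assert from_c_bytes(raw) == s
```

Under that assumption, a pure-ASCII string such as `u"hi"` has the same byte content as the plain string `"hi"`, while a non-ASCII input comes back as a longer byte string. That may be why the test suite asserts equality only for the ASCII inputs and merely calls `charstring` for the non-ASCII ones.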
-rw-r--r-- | CHANGES.current | 4 |
-rw-r--r-- | Doc/Manual/Contents.html | 1 |
-rw-r--r-- | Doc/Manual/Python.html | 66 |
-rw-r--r-- | Examples/test-suite/python/unicode_strings_runme.py | 9 |
-rw-r--r-- | Examples/test-suite/unicode_strings.i | 8 |
5 files changed, 88 insertions, 0 deletions
diff --git a/CHANGES.current b/CHANGES.current
index a0e6dfa2b..050ff54cc 100644
--- a/CHANGES.current
+++ b/CHANGES.current
@@ -5,6 +5,10 @@ See the RELEASENOTES file for a summary of changes in each release.
 Version 3.0.8 (in progress)
 ===========================
 
+2015-12-19: wsfulton
+            [Python] Python 2 Unicode UTF-8 strings can be used as inputs to char * or
+            std::string types if the generated C/C++ code has SWIG_PYTHON_2_UNICODE defined.
+
 2015-12-17: wsfulton
             Issues #286, #128 Remove ccache-swig.1 man page - please use the CCache.html docs instead.
 
diff --git a/Doc/Manual/Contents.html b/Doc/Manual/Contents.html
index 21ba6eaad..6d2cdaa76 100644
--- a/Doc/Manual/Contents.html
+++ b/Doc/Manual/Contents.html
@@ -1598,6 +1598,7 @@
 <li><a href="Python.html#Python_nn75">Buffer interface</a>
 <li><a href="Python.html#Python_nn76">Abstract base classes</a>
 <li><a href="Python.html#Python_nn77">Byte string output conversion</a>
+<li><a href="Python.html#Python_2_unicode">Python 2 Unicode</a>
 </ul>
 </ul>
 </div>
diff --git a/Doc/Manual/Python.html b/Doc/Manual/Python.html
index 962ee6843..c5219b693 100644
--- a/Doc/Manual/Python.html
+++ b/Doc/Manual/Python.html
@@ -122,6 +122,7 @@
 <li><a href="#Python_nn75">Buffer interface</a>
 <li><a href="#Python_nn76">Abstract base classes</a>
 <li><a href="#Python_nn77">Byte string output conversion</a>
+<li><a href="#Python_2_unicode">Python 2 Unicode</a>
 </ul>
 </ul>
 </div>
@@ -6163,6 +6164,71 @@
 For more details about the <tt>surrogateescape</tt> error handler, please see
 <a href="https://www.python.org/dev/peps/pep-0383/">PEP 383</a>.
 </p>
 
+<H3><a name="Python_2_unicode"></a>36.12.5 Python 2 Unicode</H3>
+
+
+<p>
+A Python 3 string is a Unicode string so by default a Python 3 string that contains Unicode
+characters passed to C/C++ will be accepted and converted to a C/C++ string
+(<tt>char *</tt> or <tt>std::string</tt> types).
+A Python 2 string is not a unicode string by default and should a Unicode string be
+passed to C/C++ it will fail to convert to a C/C++ string
+(<tt>char *</tt> or <tt>std::string</tt> types).
+The Python 2 behavior can be made more like Python 3 by defining
+<tt>SWIG_PYTHON_2_UNICODE</tt> when compiling the generated C/C++ code.
+By default when the following is wrapped:
+</p>
+
+<div class="code"><pre>
+%module unicode_strings
+char *charstring(char *s) {
+  return s;
+}
+</pre></div>
+
+<p>
+An error will occur when using Unicode strings in Python 2:
+</p>
+
+<div class="targetlang"><pre>
+>>> from unicode_strings import *
+>>> charstring("hi")
+'hi'
+>>> charstring(u"hi")
+Traceback (most recent call last):
+  File "<stdin>", line 1, in ?
+TypeError: in method 'charstring', argument 1 of type 'char *'
+</pre></div>
+
+<p>
+When the <tt>SWIG_PYTHON_2_UNICODE</tt> macro is added to the generated code:
+</p>
+
+<div class="code"><pre>
+%module unicode_strings
+%begin %{
+#define SWIG_PYTHON_2_UNICODE
+%}
+
+char *charstring(char *s) {
+  return s;
+}
+</pre></div>
+
+<p>
+Unicode strings will be successfully accepted and converted from UTF-8,
+but note that they are returned as a normal Python 2 string:
+</p>
+
+<div class="targetlang"><pre>
+>>> from unicode_strings import *
+>>> charstring("hi")
+'hi'
+>>> charstring(u"hi")
+'hi'
+>>>
+</pre></div>
+
 </body>
 </html>
diff --git a/Examples/test-suite/python/unicode_strings_runme.py b/Examples/test-suite/python/unicode_strings_runme.py
index e1fc7adec..3ce98bcdb 100644
--- a/Examples/test-suite/python/unicode_strings_runme.py
+++ b/Examples/test-suite/python/unicode_strings_runme.py
@@ -12,3 +12,12 @@ if sys.version_info[0:2] >= (3, 1):
         raise ValueError('Test comparison mismatch')
     if unicode_strings.non_utf8_std_string() != test_string:
         raise ValueError('Test comparison mismatch')
+
+# Testing SWIG_PYTHON_2_UNICODE flag which allows unicode strings to be passed to C
+if sys.version_info[0:2] < (3, 0):
+    assert unicode_strings.charstring("hello1") == "hello1"
+    assert unicode_strings.charstring(str(u"hello2")) == "hello2"
+    assert unicode_strings.charstring(u"hello3") == "hello3"
+    assert unicode_strings.charstring(unicode("hello4")) == "hello4"
+    unicode_strings.charstring(u"hell\xb05")
+    unicode_strings.charstring(u"hell\u00f66")
diff --git a/Examples/test-suite/unicode_strings.i b/Examples/test-suite/unicode_strings.i
index 56063c8a4..9be3748e6 100644
--- a/Examples/test-suite/unicode_strings.i
+++ b/Examples/test-suite/unicode_strings.i
@@ -2,6 +2,10 @@
 
 %include <std_string.i>
 
+%begin %{
+#define SWIG_PYTHON_2_UNICODE
+%}
+
 %inline %{
 
 const char* non_utf8_c_str(void) {
@@ -12,4 +16,8 @@ std::string non_utf8_std_string(void) {
   return std::string("h\xe9llo w\xc3\xb6rld");
 }
 
+char *charstring(char *s) {
+  return s;
+}
+
 %}