author Karl Williamson <public@khwilliamson.com> 2013-04-27 22:14:02 -0600
committer Karl Williamson <public@khwilliamson.com> 2013-08-29 09:56:08 -0600
commit de69f3af3da72e3d0ea0adae070cdb4990d7c9bf
tree 81263c78565876d6828c48f51580bb2a69083a98 /utf8.h
parent bcb1a2d416c144271685cdf48af52ebbc3f267f8
utf8.c: Remove wrapper functions.

Now that the Unicode data is stored in native character set order, it is rare to need to work with the Unicode order. Traditionally, the real work was done in functions that worked with the Unicode order, and wrapper functions (or macros) were used to translate to/from native. There are two groups of functions: one that translates from code point to UTF-8, and the other group goes the opposite direction.

This commit changes the base function that translates from UTF-8 to code point to output native instead of Unicode. Those extremely rare instances where Unicode output is needed instead will have to hand-wrap calls to this function with a translation macro, as now described in the API pod. Prior to this, it was the other way: the native was wrapped, and the rare, strict Unicode wasn't. This eliminates a layer of function call overhead for a common case.

The base function that translates from code point to UTF-8 retains its Unicode input, as that is more natural to process. However, it is de-emphasized in the pod, with the functionality description moved to the pod for a native input wrapper function. And those wrappers are now macros in all cases; previously there was function call overhead sometimes. (Equivalent exported functions are retained, however, for XS code that uses the Perl_foo() form.)

I had hoped to rebase this commit, squashing it with an earlier commit in this series, eliminating the use of a temporary function name change, but the work involved turns out to be large, with no real payoff.
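
As a rough illustration of the macro-wrapper pattern described above, here is a standalone toy sketch. Every `demo_`/`DEMO_` name is hypothetical and not part of the Perl API; the encoder is a simplified stand-in for a base function that takes Unicode input, and the translation macro is the identity, which is an assumption mirroring the ASCII-platform `NATIVE_TO_UNI(ch) (ch)` definition in utf8.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical translation macro: identity, as on an ASCII platform.
 * A non-ASCII platform would map native code points to Unicode here. */
#define DEMO_NATIVE_TO_UNI(ch) (ch)

/* Hypothetical base encoder: takes a Unicode code point and writes its
 * UTF-8 bytes to d, returning a pointer just past the last byte written.
 * Simplified: no flags, no surrogate or non-character checks. */
static unsigned char *
demo_uvoffuni_to_utf8(unsigned char *d, unsigned long uv)
{
    assert(uv <= 0x10FFFF);
    if (uv < 0x80) {
        *d++ = (unsigned char)uv;
    } else if (uv < 0x800) {
        *d++ = (unsigned char)(0xC0 | (uv >> 6));
        *d++ = (unsigned char)(0x80 | (uv & 0x3F));
    } else if (uv < 0x10000) {
        *d++ = (unsigned char)(0xE0 | (uv >> 12));
        *d++ = (unsigned char)(0x80 | ((uv >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (uv & 0x3F));
    } else {
        *d++ = (unsigned char)(0xF0 | (uv >> 18));
        *d++ = (unsigned char)(0x80 | ((uv >> 12) & 0x3F));
        *d++ = (unsigned char)(0x80 | ((uv >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (uv & 0x3F));
    }
    return d;
}

/* Native-input wrapper written as a macro, so the translation step
 * costs no extra function call. */
#define demo_uvchr_to_utf8(d, uv) \
    demo_uvoffuni_to_utf8((d), DEMO_NATIVE_TO_UNI(uv))

/* Small helper: number of UTF-8 bytes produced for a native code point. */
static size_t
demo_encoded_len(unsigned long native_cp)
{
    unsigned char buf[4];
    return (size_t)(demo_uvchr_to_utf8(buf, native_cp) - buf);
}
```

A caller would use `demo_uvchr_to_utf8(buf, cp)` with a native code point and never see the Unicode-order base function directly. Where the translation is the identity, the macro expands to a plain call to the base function.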
Diffstat (limited to 'utf8.h')
-rw-r--r-- utf8.h 11
1 file changed, 7 insertions, 4 deletions
diff --git a/utf8.h b/utf8.h
index 45353eaac7..e54c98536f 100644
--- a/utf8.h
+++ b/utf8.h
@@ -39,6 +39,13 @@
#define _CORE_SWASH_INIT_RETURN_IF_UNDEF 0x2
#define _CORE_SWASH_INIT_ACCEPT_INVLIST 0x4
+#define uvchr_to_utf8(a,b) uvchr_to_utf8_flags(a,b,0)
+#define uvchr_to_utf8_flags(d,uv,flags) \
+ uvoffuni_to_utf8_flags(d,NATIVE_TO_UNI(uv),flags)
+#define utf8_to_uvchr_buf(s, e, lenp) \
+ utf8n_to_uvchr(s, (e) - (s), lenp, \
+ ckWARN_d(WARN_UTF8) ? 0 : UTF8_ALLOW_ANY)
+
#define to_uni_fold(c, p, lenp) _to_uni_fold_flags(c, p, lenp, FOLD_FLAGS_FULL)
#define to_utf8_fold(c, p, lenp) _to_utf8_fold_flags(c, p, lenp, \
FOLD_FLAGS_FULL, NULL)
@@ -122,10 +129,6 @@ END_EXTERN_C
#define UNI_TO_NATIVE(ch) (ch)
#define NATIVE_TO_UNI(ch) (ch)
-/* As there are no translations, avoid the function wrapper */
-#define utf8n_to_uvchr utf8n_to_uvoffuni
-#define uvchr_to_utf8(a,b) uvoffuni_to_utf8_flags(a,b,0)
-
/*
The following table is from Unicode 3.2.