path: root/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl
author Tom Lane <tgl@sss.pgh.pa.us> 2015-05-14 22:27:07 -0400
committer Tom Lane <tgl@sss.pgh.pa.us> 2015-05-14 22:27:12 -0400
commit 7730f48ede0d222e7f750541d3d5f0f74d75d99b (patch)
tree 472b56a394d55b08d31fcbaa1015d2475c788795 /src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl
parent 83e176ec18d2a91dbea1d0d1bd94c38dc47cd77c (diff)
Teach UtfToLocal/LocalToUtf to support algorithmic encoding conversions.

Until now, these functions have only supported encoding conversions using lookup tables, which is fine as long as there are not too many code points to convert. However, GB18030 expects all 1.1 million Unicode code points to be convertible, which would require a ridiculously-sized lookup table. Fortunately, a large fraction of those conversions can be expressed through arithmetic, i.e., the conversions are one-to-one in certain defined ranges. To support that, provide a callback function that is used after consulting the lookup tables. (This patch doesn't actually change anything about the GB18030 conversion behavior; it just provides infrastructure for fixing it.)

Since this requires changing the APIs of UtfToLocal/LocalToUtf anyway, take the opportunity to rearrange their argument lists into what seems to me a saner order. And beautify the call sites by using lengthof() instead of error-prone sizeof() arithmetic.

In passing, also mark all the lookup tables used by these calls "const". This moves an impressive amount of stuff into the text segment, at least on my machine, and is safer anyhow.
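
The "lookup table first, callback second" scheme described in the commit message can be sketched roughly as follows. This is an illustration only, not the code from the patch: the names demo_utf_to_local, demo_conv_func, demo_convert, demo_range_conv and demo_lengthof are all hypothetical, the table values are made up rather than real EUC_CN mappings, and the arithmetic range is arbitrary. It also shows a const table and an element-count macro in the spirit of lengthof().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical map entry; not PostgreSQL's actual struct layout. */
typedef struct
{
    uint32_t utf;   /* source code (lookup key) */
    uint32_t code;  /* converted code in the local encoding */
} demo_utf_to_local;

/* Hypothetical callback type: returns 0 when it cannot convert the code. */
typedef uint32_t (*demo_conv_func) (uint32_t code);

/* Element count of an array, instead of hand-written sizeof() arithmetic. */
#define demo_lengthof(array) (sizeof(array) / sizeof((array)[0]))

/* Made-up values, sorted by key; "const" lets the table live in read-only data. */
static const demo_utf_to_local demo_map[] = {
    {0x0100, 0x8001},
    {0x0101, 0x8002},
    {0x0102, 0x8003},
};

/* bsearch() comparator: key is a uint32_t, elem is a table entry. */
static int
demo_compare(const void *key, const void *elem)
{
    uint32_t    k = *(const uint32_t *) key;
    const demo_utf_to_local *e = elem;

    if (k < e->utf)
        return -1;
    if (k > e->utf)
        return 1;
    return 0;
}

/* Made-up one-to-one range: 0x10000..0x10FFFF maps by a fixed offset. */
static uint32_t
demo_range_conv(uint32_t code)
{
    if (code >= 0x10000 && code <= 0x10FFFF)
        return code - 0x10000 + 0x90000000u;
    return 0;
}

/* Consult the table first; fall back to the callback; 0 means "no conversion". */
static uint32_t
demo_convert(uint32_t code,
             const demo_utf_to_local *map, size_t mapsize,
             demo_conv_func conv_func)
{
    if (map != NULL && mapsize > 0)
    {
        const demo_utf_to_local *hit =
            bsearch(&code, map, mapsize, sizeof(map[0]), demo_compare);

        if (hit != NULL)
            return hit->code;
    }
    if (conv_func != NULL)
        return conv_func(code);
    return 0;
}
```

A caller would pass the table together with its element count, for example demo_convert(code, demo_map, demo_lengthof(demo_map), demo_range_conv). Passing NULL for the callback reproduces the table-only behavior, which is presumably how conversions without an arithmetic component would use such an interface.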
Diffstat (limited to 'src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl')
-rwxr-xr-x src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl 4
1 files changed, 2 insertions, 2 deletions
diff --git a/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl b/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl
index cb9a8cb003..bfc99123bf 100755
--- a/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl
+++ b/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl
@@ -55,7 +55,7 @@ close(FILE);
$file = "utf8_to_euc_cn.map";
open(FILE, "> $file") || die("cannot open $file");
-print FILE "static pg_utf_to_local ULmapEUC_CN[ $count ] = {\n";
+print FILE "static const pg_utf_to_local ULmapEUC_CN[ $count ] = {\n";
for $index (sort { $a <=> $b } keys(%array))
{
@@ -109,7 +109,7 @@ close(FILE);
$file = "euc_cn_to_utf8.map";
open(FILE, "> $file") || die("cannot open $file");
-print FILE "static pg_local_to_utf LUmapEUC_CN[ $count ] = {\n";
+print FILE "static const pg_local_to_utf LUmapEUC_CN[ $count ] = {\n";
for $index (sort { $a <=> $b } keys(%array))
{
$utf = $array{$index};