author | Peter Eisentraut <peter@eisentraut.org> | 2019-03-22 12:09:32 +0100 |
---|---|---|
committer | Peter Eisentraut <peter@eisentraut.org> | 2019-03-22 12:12:43 +0100 |
commit | 5e1963fb764e9cc092e0f7b58b28985c311431d9 (patch) | |
tree | 544492f24e3d48d00bd2a19c11663f84f1e18ce4 /src/test/regress/expected/subselect.out | |
parent | 2ab6d28d233af17987ea323e3235b2bda89b4f2e (diff) | |
Collations with nondeterministic comparison

This adds a flag "deterministic" to collations. If that is false,
such a collation disables various optimizations that assume that
strings are equal only if they are byte-wise equal. That then allows
use cases such as case-insensitive or accent-insensitive comparisons
or handling of strings with different Unicode normal forms.
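To make "equal but not byte-wise equal" concrete, here is a minimal Python sketch. It is not PostgreSQL or ICU code: `toy_nondeterministic_equal` is a hypothetical helper that approximates a case-insensitive, normalization-insensitive comparison with NFC normalization plus case folding, which is only a rough stand-in for what an ICU collation would do.

```python
import unicodedata


def bytewise_equal(a: str, b: str) -> bool:
    """Deterministic view: strings are equal only if their UTF-8 bytes match."""
    return a.encode("utf-8") == b.encode("utf-8")


def toy_nondeterministic_equal(a: str, b: str) -> bool:
    """Hypothetical stand-in for a nondeterministic collation.

    Assumption: NFC normalization plus case folding approximates a
    case-insensitive, normalization-insensitive comparison.
    """
    def key(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()

    return key(a) == key(b)


composed = "\u00e9"      # e with acute accent as one precomposed code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

# The two spellings differ in bytes but compare equal under the toy collation.
print(bytewise_equal(composed, decomposed))              # False
print(toy_nondeterministic_equal(composed, decomposed))  # True
print(toy_nondeterministic_equal("Foo", "FOO"))          # True
```

Any shortcut of the form "same bytes, therefore equal; different bytes, therefore unequal" is only valid for the first function, which is why such shortcuts have to be disabled when the flag is false.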
This functionality is only supported with the ICU provider. At least
glibc doesn't appear to have any locales that work in a
nondeterministic way, so it's not worth supporting this for the libc
provider.

The term "deterministic comparison" in this context is from Unicode
Technical Standard #10
(https://unicode.org/reports/tr10/#Deterministic_Comparison).

This patch makes changes in three areas:
- CREATE COLLATION DDL changes and system catalog changes to support
this new flag.
- Many executor nodes and auxiliary code are extended to track
collations. Previously, this code would just throw away collation
information, because the eventually-called user-defined functions
didn't use it since they only cared about equality, which didn't
need collation information.
- String data type functions that do equality comparisons and hashing
are changed to take the (non-)deterministic flag into account. For
comparison, this just means skipping various shortcuts and tie
breakers that use byte-wise comparison. For hashing, we first need
to convert the input string to a canonical "sort key" using the ICU
analogue of strxfrm().
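The hashing point in the last item can be sketched in Python. This is an illustration under stated assumptions, not the patch's implementation: `toy_sort_key` is a hypothetical substitute for an ICU sort key (NFC normalization plus case folding), and `zlib.crc32` stands in for the real hash function. The property being shown is that strings which compare equal must hash identically, so the hash has to be computed over a canonical key rather than over the raw bytes.

```python
import unicodedata
import zlib


def toy_sort_key(s: str) -> bytes:
    """Hypothetical sort key: strings that compare equal map to the same bytes."""
    return unicodedata.normalize("NFC", s).casefold().encode("utf-8")


def hash_raw(s: str) -> int:
    """Hash the raw bytes; only safe when equality is byte-wise."""
    return zlib.crc32(s.encode("utf-8"))


def hash_key(s: str) -> int:
    """Hash the canonical key, so equal strings land in the same bucket."""
    return zlib.crc32(toy_sort_key(s))


def group_by_hash(values, hash_fn):
    """Group values into buckets by hash, as a hash aggregate or hash join would."""
    buckets = {}
    for value in values:
        buckets.setdefault(hash_fn(value), []).append(value)
    return list(buckets.values())


words = ["abc", "ABC", "xyz"]
print(group_by_hash(words, hash_raw))  # [['abc'], ['ABC'], ['xyz']]
print(group_by_hash(words, hash_key))  # [['abc', 'ABC'], ['xyz']]
```

With raw-byte hashing, "abc" and "ABC" fall into different buckets and would never be compared with each other, even though the toy collation treats them as equal.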
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
Diffstat (limited to 'src/test/regress/expected/subselect.out')
-rw-r--r-- | src/test/regress/expected/subselect.out | 19 |
1 files changed, 19 insertions, 0 deletions
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index fe5fc64480..4a54104182 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -746,6 +746,25 @@ select * from outer_7597 where (f1, f2) not in (select * from inner_7597);
 (2 rows)
 
 --
+-- Similar test case using text that verifies that collation
+-- information is passed through by execTuplesEqual() in nodeSubplan.c
+-- (otherwise it would error in texteq())
+--
+create temp table outer_text (f1 text, f2 text);
+insert into outer_text values ('a', 'a');
+insert into outer_text values ('b', 'a');
+insert into outer_text values ('a', null);
+insert into outer_text values ('b', null);
+create temp table inner_text (c1 text, c2 text);
+insert into inner_text values ('a', null);
+select * from outer_text where (f1, f2) not in (select * from inner_text);
+ f1 | f2 
+----+----
+ b  | a
+ b  | 
+(2 rows)
+
+--
 -- Test case for premature memory release during hashing of subplan output
 --
 select '1'::text in (select '1'::name union all select '1'::name);
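The expected output of the new test follows from SQL's three-valued logic for row-wise NOT IN. The Python sketch below models that logic, with `None` standing in for NULL. The helper names (`eq3`, `and3`, `row_eq3`, `not_in`) are hypothetical and exist only for this illustration; it does not model collations or the executor code path the test exercises.

```python
from typing import Iterable, Optional, Sequence, Tuple

Row = Tuple[Optional[str], ...]


def eq3(a: Optional[str], b: Optional[str]) -> Optional[bool]:
    """SQL equality: the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def and3(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """SQL AND: False dominates, then unknown, otherwise True."""
    result: Optional[bool] = True
    for value in values:
        if value is False:
            return False
        if value is None:
            result = None
    return result


def row_eq3(left: Row, right: Row) -> Optional[bool]:
    """Row comparison (f1, f2) = (c1, c2) as an AND over the column equalities."""
    return and3(eq3(a, b) for a, b in zip(left, right))


def not_in(row: Row, subquery_rows: Sequence[Row]) -> Optional[bool]:
    """row NOT IN (subquery): True only if every comparison is definitely False."""
    saw_unknown = False
    for other in subquery_rows:
        match = row_eq3(row, other)
        if match is True:
            return False
        if match is None:
            saw_unknown = True
    return None if saw_unknown else True


outer_text = [("a", "a"), ("b", "a"), ("a", None), ("b", None)]
inner_text = [("a", None)]

# A WHERE clause keeps a row only when the condition is definitely True.
kept = [row for row in outer_text if not_in(row, inner_text) is True]
print(kept)  # [('b', 'a'), ('b', None)]
```

Rows whose `f1` is 'a' are filtered out because comparing their second column with NULL yields unknown rather than false. Rows whose `f1` is 'b' are kept because the first column already makes the row comparison false.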