intarray
intarray
The intarray> module provides a number of useful functions
and operators for manipulating null-free arrays of integers.
There is also support for indexed searches using some of the operators.
All of these operations will throw an error if a supplied array contains any
NULL elements.
Many of these operations are only sensible for one-dimensional arrays.
Although they will accept input arrays of more dimensions, the data is
treated as though it were a linear array in storage order.
intarray> Functions and Operators
The functions provided by the intarray module
are shown in , the operators
in .
intarray> Functions
Function
Return Type
Description
Example
Result
icount(int[])icount
int
number of elements in array
icount('{1,2,3}'::int[])
3
sort(int[], text dir)sort
int[]
sort array — dir> must be asc> or desc>
sort('{1,2,3}'::int[], 'desc')
{3,2,1}
sort(int[])
int[]
sort in ascending order
sort(array[11,77,44])
{11,44,77}
sort_asc(int[])sort_asc
int[]
sort in ascending order
sort_desc(int[])sort_desc
int[]
sort in descending order
uniq(int[])uniq
int[]
remove adjacent duplicates
uniq(sort('{1,2,3,2,1}'::int[]))
{1,2,3}
idx(int[], int item)idx
int
index of first element matching item> (0 if none)
idx(array[11,22,33,22,11], 22)
2
subarray(int[], int start, int len)subarray
int[]
portion of array starting at position start>, len> elements
subarray('{1,2,3,2,1}'::int[], 2, 3)
{2,3,2}
subarray(int[], int start)
int[]
portion of array starting at position start>
subarray('{1,2,3,2,1}'::int[], 2)
{2,3,2,1}
intset(int)intset
int[]
make single-element array
intset(42)
{42}
intarray> Operators
Operator
Returns
Description
int[] && int[]
boolean
overlap — true> if arrays have at least one common element
int[] @> int[]
boolean
contains — true> if left array contains right array
int[] <@ int[]
boolean
contained — true> if left array is contained in right array
# int[]
int
number of elements in array
int[] # int
int
index (same as idx> function)
int[] + int
int[]
push element onto array (add it to end of array)
int[] + int[]
int[]
array concatenation (right array added to the end of left one)
int[] - int
int[]
remove entries matching right argument from array
int[] - int[]
int[]
remove elements of right array from left
int[] | int
int[]
union of arguments
int[] | int[]
int[]
union of arrays
int[] & int[]
int[]
intersection of arrays
int[] @@ query_int
boolean
true> if array satisfies query (see below)
query_int ~~ int[]
boolean
true> if array satisfies query (commutator of @@>)
(Before PostgreSQL 8.2, the containment operators @>> and
<@> were respectively called @> and ~>.
These names are still available, but are deprecated and will eventually be
retired. Notice that the old names are reversed from the convention
formerly followed by the core geometric data types!)
The operators &&>, @>> and
<@> are equivalent to PostgreSQL>'s built-in
operators of the same names, except that they work only on integer arrays
that do not contain nulls, while the built-in operators work for any array
type. This restriction makes them faster than the built-in operators
in many cases.
The @@> and ~~> operators test whether an array
satisfies a query>, which is expressed as a value of a
specialized data type query_int>. A query>
consists of integer values that are checked against the elements of
the array, possibly combined using the operators &>
(AND), |> (OR), and !> (NOT). Parentheses
can be used as needed. For example,
the query 1&(2|3)> matches arrays that contain 1
and also contain either 2 or 3.
Index Support
intarray> provides index support for the
&&>, @>>, <@>,
and @@> operators, as well as regular array equality.
Two GiST index operator classes are provided:
gist__int_ops> (used by default) is suitable for
small- to medium-size data sets, while
gist__intbig_ops> uses a larger signature and is more
suitable for indexing large data sets (i.e., columns containing
a large number of distinct array values).
The implementation uses an RD-tree data structure with
built-in lossy compression.
There is also a non-default GIN operator class
gin__int_ops> supporting the same operators.
The choice between GiST and GIN indexing depends on the relative
performance characteristics of GiST and GIN, which are discussed elsewhere.
As a rule of thumb, a GIN index is faster to search than a GiST index, but
slower to build or update; so GIN is better suited for static data and GiST
for often-updated data.
Example
-- a message can be in one or more sections>
CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);
-- create specialized index
CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops);
-- select messages in section 1 OR 2 - OVERLAP operator
SELECT message.mid FROM message WHERE message.sections && '{1,2}';
-- select messages in sections 1 AND 2 - CONTAINS operator
SELECT message.mid FROM message WHERE message.sections @> '{1,2}';
-- the same, using QUERY operator
SELECT message.mid FROM message WHERE message.sections @@ '1&2'::query_int;
Benchmark
The source directory contrib/intarray/bench> contains a
benchmark test suite. To run:
cd .../bench
createdb TEST
psql TEST < ../_int.sql
./create_test.pl | psql TEST
./bench.pl
The bench.pl> script has numerous options, which
are displayed when it is run without any arguments.
Authors
All work was done by Teodor Sigaev (teodor@sigaev.ru) and
Oleg Bartunov (oleg@sai.msu.su). See
for
additional information. Andrey Oktyabrski did a great work on adding new
functions and operations.