summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorPeter Eisentraut <peter_e@gmx.net>2001-01-22 23:34:33 +0000
committerPeter Eisentraut <peter_e@gmx.net>2001-01-22 23:34:33 +0000
commit21a3857f1f4a9a8c82355f753dfacff752fdebf3 (patch)
tree608a448675e1a74e16e770f3aa99e26d120b63f5
parente9c936ff38da738d1fd68525eee9c7c1f0c558dc (diff)
downloadpostgresql-21a3857f1f4a9a8c82355f753dfacff752fdebf3.tar.gz
Rip out table expression section from SQL syntax chapter and develop it
into new chapter on query (SELECT) syntax. In the end this should become a narrative and example-filled counterpart to the SELECT reference page.
-rw-r--r--doc/src/sgml/filelist.sgml4
-rw-r--r--doc/src/sgml/queries.sgml819
-rw-r--r--doc/src/sgml/syntax.sgml538
-rw-r--r--doc/src/sgml/user.sgml3
4 files changed, 832 insertions, 532 deletions
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 799f517144..5d784d7dcc 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -1,4 +1,4 @@
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.4 2001/01/06 11:58:56 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.5 2001/01/22 23:34:32 petere Exp $ -->
<!entity about SYSTEM "about.sgml">
<!entity history SYSTEM "history.sgml">
@@ -31,7 +31,7 @@
<!entity plsql SYSTEM "plsql.sgml">
<!entity pltcl SYSTEM "pltcl.sgml">
<!entity psql SYSTEM "psql.sgml">
-<!entity query-ug SYSTEM "query-ug.sgml">
+<!entity queries SYSTEM "queries.sgml">
<!entity storage SYSTEM "storage.sgml">
<!entity syntax SYSTEM "syntax.sgml">
<!entity typeconv SYSTEM "typeconv.sgml">
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
new file mode 100644
index 0000000000..e9ede07641
--- /dev/null
+++ b/doc/src/sgml/queries.sgml
@@ -0,0 +1,819 @@
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.1 2001/01/22 23:34:33 petere Exp $ -->
+
+<chapter id="queries">
+ <title>Queries</title>
+
+ <para>
+ A <firstterm>query</firstterm> is the process of or the command to
+ retrieve data from a database. In SQL the <command>SELECT</command>
+ command is used to specify queries. The general syntax of the
+ <command>SELECT</command> command is
+<synopsis>
+SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
+</synopsis>
+ The following sections describe the details of the select list, the
+ table expression, and the sort specification. The simplest kind of
+ query has the form
+<programlisting>
+SELECT * FROM table1;
+</programlisting>
+ Assuming that there is a table called table1, this command would
+ retrieve all rows and all columns from table1. (The method of
+ retrieval depends on the client application. For example, the
+ <application>psql</application> program will display an ASCII-art
+ table on the screen, client libraries will offer functions to
+ retrieve individual rows and columns.) The select list
+ specification <literal>*</literal> means all columns that the table
+ expression happens to provide. A select list can also select a
+ subset of the available columns or even make calculations on the
+ columns before retrieving them; see <xref
+ linkend="queries-select-lists">. For example, if table1 has columns
+ named a, b, and c (and perhaps others) you can make the following
+ query:
+<programlisting>
+SELECT a, b + c FROM table1;
+</programlisting>
+ (assuming that b and c are of a numeric data type).
+ </para>
+
+ <para>
+ <literal>FROM table1</literal> is a particularly simple kind of
+ table expression. In general, table expressions can be complex
+ constructs of base tables, joins, and subqueries. But you can also
+ omit the table expression entirely and use the SELECT command as a
+ calculator:
+<programlisting>
+SELECT 3 * 4;
+</programlisting>
+ This is more useful if the expressions in the select list return
+ varying results. For example, you could call a function this way.
+<programlisting>
+SELECT random();
+</programlisting>
+ </para>
+
+ <sect1 id="queries-table-expressions">
+ <title>Table Expressions</title>
+
+ <para>
+ A <firstterm>table expression</firstterm> specifies a table. The
+ table expression contains a FROM clause that is optionally followed
+ by WHERE, GROUP BY, and HAVING clauses. Trivial table expressions
+ simply refer to a table on disk, a so-called base table, but more
+ complex expressions can be used to modify or combine base tables in
+ various ways.
+ </para>
+
+ <para>
+ The WHERE, GROUP BY, and HAVING clauses in the table expression
+ specify a pipeline of successive transformations performed on the
+ table derived in the FROM clause. The final transformed table that
+ is derived provides the input rows used to derive output rows as
+ specified by the select list of derived column value expressions.
+ </para>
+
+ <sect2 id="queries-from">
+ <title>FROM clause</title>
+
+ <para>
+ The FROM clause derives a table from one or more other tables
+ given in a comma-separated table reference list.
+<synopsis>
+FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
+</synopsis>
+
+ A table reference may be a table name or a derived table such as a
+ subquery, a table join, or complex combinations of these. If more
+ than one table reference is listed in the FROM clause they are
+ CROSS JOINed (see below) to form the derived table that may then
+ be subject to transformations by the WHERE, GROUP BY, and HAVING
+ clauses and is finally the result of the overall table expression.
+ </para>
+
+ <para>
+ If a table reference is a simple table name and it is the
+ supertable in a table inheritance hierarchy, rows of the table
+ include rows from all of its subtable successors unless the
+ keyword ONLY precedes the table name.
+ </para>
+
+ <sect3 id="queries-join">
+ <title>Joined Tables</title>
+
+ <para>
+ A joined table is a table derived from two other (real or
+ derived) tables according to the rules of the particular join
+ type. INNER, OUTER, NATURAL, and CROSS JOIN are supported.
+ </para>
+
+ <variablelist>
+ <title>Join Types</title>
+
+ <varlistentry>
+ <term>CROSS JOIN</term>
+
+ <listitem>
+<synopsis>
+<replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
+</synopsis>
+
+ <para>
+ For each combination of rows from
+ <replaceable>T1</replaceable> and
+ <replaceable>T2</replaceable> the derived table will contain a
+ row consisting of all columns in <replaceable>T1</replaceable>
+ followed by all columns in <replaceable>T2</replaceable>. If
+ the tables have have N and M rows respectively, the joined
+ table will have N * M rows. A cross join is essentially an
+ <literal>INNER JOIN ON TRUE</literal>.
+ </para>
+
+ <tip>
+ <para>
+ <literal>FROM <replaceable>T1</replaceable> CROSS JOIN
+ <replaceable>T2</replaceable></literal> is equivalent to
+ <literal>FROM <replaceable>T1</replaceable>,
+ <replaceable>T2</replaceable></literal>.
+ </para>
+ </tip>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Qualified JOINs</term>
+ <listitem>
+
+<synopsis>
+<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean expression</replaceable>
+<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
+</synopsis>
+
+ <para>
+ The words <token>INNER</token> and <token>OUTER</token> are
+ optional for all JOINs. <token>INNER</token> is the default;
+ <token>LEFT</token>, <token>RIGHT</token>, and
+ <token>FULL</token> are for OUTER JOINs only.
+ </para>
+
+ <para>
+ The <firstterm>join condition</firstterm> is specified in the
+ ON or USING clause. (The meaning of the join condition
+ depends on the particular join type; see below.) The ON
+ clause takes a Boolean value expression of the same kind as is
+ used in a WHERE clause. The USING clause takes a
+ comma-separated list of column names, which the joined tables
+ must have in common, and joins the tables on the equality of
+ those columns as a set, resulting in a joined table having one
+ column for each common column listed and all of the other
+ columns from both tables. Thus, <literal>USING (a, b,
+ c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
+ t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
+ if ON is used there will be two columns a, b, and c in the
+ result, whereas with USING there will be only one of each.
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term>INNER JOIN</term>
+
+ <listitem>
+ <para>
+ For each row R1 of T1, the joined table has a row for each
+ row in T2 that satisfies the join condition with R1.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>LEFT OUTER JOIN</term>
+
+ <listitem>
+ <para>
+ First, an INNER JOIN is performed. Then, for a row in T1
+ that does not satisfy the join condition with any row in
+ T2, a joined row is returned with NULL values in columns of
+ T2. Thus, the joined table unconditionally has a row for each
+ row in T1.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>RIGHT OUTER JOIN</term>
+
+ <listitem>
+ <para>
+ This is like a left join, only that the result table will
+ unconditionally have a row for each row in T2.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>FULL OUTER JOIN</term>
+
+ <listitem>
+ <para>
+ First, an INNER JOIN is performed. Then, for each row in
+ T1 that does not satisfy the join condition with any row in
+ T2, a joined row is returned with null values in columns of
+ T2. Also, for each row of T2 that does not satisfy the
+ join condition with any row in T1, a joined row with null
+ values in the columns of T1 is returned.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>NATURAL JOIN</term>
+
+ <listitem>
+<synopsis>
+<replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> JOIN <replaceable>T2</replaceable>
+</synopsis>
+ <para>
+ A natural join creates a joined table where every pair of matching
+ column names between the two tables are merged into one column. The
+ join specification is effectively a USING clause containing all the
+ common column names and is otherwise like a Qualified JOIN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Joins of all types can be chained together or nested where either
+ or both of <replaceable>T1</replaceable> and
+ <replaceable>T2</replaceable> may be JOINed tables. Parenthesis
+ can be used around JOIN clauses to control the join order which
+ are otherwise left to right.
+ </para>
+ </sect3>
+
+ <sect3 id="queries-subqueries">
+ <title>Subqueries</title>
+
+ <para>
+ Subqueries specifying a derived table must be enclosed in
+ parenthesis and <emphasis>must</emphasis> be named using an AS
+ clause. (See <xref linkend="queries-table-aliases">.)
+ </para>
+
+<programlisting>
+FROM (SELECT * FROM table1) AS alias_name
+</programlisting>
+
+ <para>
+ This example is equivalent to <literal>FROM table1 AS
+ alias_name</literal>. Many subqueries can be written as table
+ joins instead.
+ </para>
+ </sect3>
+
+ <sect3 id="queries-table-aliases">
+ <title>Table and Column Aliases</title>
+
+ <para>
+ A temporary name can be given to tables and complex table
+ references to be used for references to the derived table in
+ further processing. This is called a <firstterm>table
+ alias</firstterm>.
+<synopsis>
+FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
+</synopsis>
+ Here, <replaceable>alias</replaceable> can be any regular
+ identifier. The alias becomes the new name of the table
+ reference for the current query -- it is no longer possible to
+ refer to the table by the original name (if the table reference
+ was an ordinary base table). Thus
+<programlisting>
+SELECT * FROM my_table AS m WHERE my_table.a > 5;
+</programlisting>
+ is not valid SQL syntax. What will happen instead, as a
+ <productname>Postgres</productname> extension, is that an implicit
+ table reference is added to the FROM clause, so the query is
+ processed as if it was written as
+<programlisting>
+SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5;
+</programlisting>
+ Table aliases are mainly for notational convenience, but it is
+ necessary to use them when joining a table to itself, e.g.,
+<programlisting>
+SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
+</programlisting>
+ Additionally, an alias is required if the table reference is a
+ subquery.
+ </para>
+
+ <para>
+ Parenthesis are used to resolve ambiguities. The following
+ statement will assign the alias <literal>b</literal> to the
+ result of the join, unlike the previous example:
+<programlisting>
+SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
+</programlisting>
+ </para>
+
+ <para>
+<synopsis>
+FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
+</synopsis>
+ This form is equivalent the previously treated one; the
+ <token>AS</token> key word is noise.
+ </para>
+
+ <para>
+<synopsis>
+FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
+</synopsis>
+ In addition to renaming the table as described above, the columns
+ of the table are also given temporary names. If less column
+ aliases are specified than the actual table has columns, the last
+ columns are not renamed. This syntax is especially useful for
+ self-joins or subqueries.
+ </para>
+ </sect3>
+
+ <sect3 id="queries-table-expression-examples">
+ <title>Examples</title>
+
+ <para>
+<programlisting>
+FROM T1 INNER JOIN T2 USING (C)
+FROM T1 LEFT OUTER JOIN T2 USING (C)
+FROM (T1 RIGHT OUTER JOIN T2 ON (T1C1=T2C1)) AS DT1
+FROM (T1 FULL OUTER JOIN T2 USING (C)) AS DT1 (DT1C1, DT1C2)
+
+FROM T1 NATURAL INNER JOIN T2
+FROM T1 NATURAL LEFT OUTER JOIN T2
+FROM T1 NATURAL RIGHT OUTER JOIN T2
+FROM T1 NATURAL FULL OUTER JOIN T2
+
+FROM (SELECT * FROM T1) DT1 CROSS JOIN T2, T3
+FROM (SELECT * FROM T1) DT1, T2, T3
+</programlisting>
+
+ Above are some examples of joined tables and complex derived
+ tables. Notice how the AS clause renames or names a derived
+ table and how the optional comma-separated list of column names
+ that follows gives names or renames the columns. The last two
+ FROM clauses produce the same derived table from T1, T2, and T3.
+ The AS keyword was omitted in naming the subquery as DT1. The
+ keywords OUTER and INNER are noise that can be omitted also.
+ </para>
+ </sect3>
+
+ </sect2>
+
+ <sect2 id="queries-where">
+ <title>WHERE clause</title>
+
+ <para>
+ The syntax of the WHERE clause is
+<synopsis>
+WHERE <replaceable>search condition</replaceable>
+</synopsis>
+ where <replaceable>search condition</replaceable> is any value
+ expression as defined in <xref linkend="sql-expressions"> that
+ returns a value of type <type>boolean</type>.
+ </para>
+
+ <para>
+ After the processing of the FROM clause is done, each row of the
+ derived table is checked against the search condition. If the
+ result of the condition is true, the row is kept in the output
+ table, otherwise (that is, if the result is false or NULL) it is
+ discarded. The search condition typically references at least some
+ column in the table generated in the FROM clause; this is not
+ required, but otherwise the WHERE clause will be fairly useless.
+ </para>
+
+ <note>
+ <para>
+ Before the implementation of the JOIN syntax, it was necessary to
+ put the join condition of an inner join in the WHERE clause. For
+ example, these table expressions are equivalent:
+<programlisting>
+FROM a, b WHERE a.id = b.id AND b.val &gt; 5
+</programlisting>
+ and
+<programlisting>
+FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val &gt; 5
+</programlisting>
+ or perhaps even
+<programlisting>
+FROM a NATURAL JOIN b WHERE b.val &gt; 5
+</programlisting>
+ Which one of these you use is mainly a matter of style. The JOIN
+ syntax in the FROM clause is probably not as portable to other
+ products. For outer joins there is no choice in any case: they
+ must be done in the FROM clause.
+ </para>
+ </note>
+
+<programlisting>
+FROM FDT WHERE
+ C1 > 5
+
+FROM FDT WHERE
+ C1 IN (1, 2, 3)
+FROM FDT WHERE
+ C1 IN (SELECT C1 FROM T2)
+FROM FDT WHERE
+ C1 IN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10)
+
+FROM FDT WHERE
+ C1 BETWEEN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) AND 100
+
+FROM FDT WHERE
+ EXISTS (SELECT C1 FROM T2 WHERE C2 > FDT.C1)
+</programlisting>
+
+ <para>
+ In the examples above, FDT is the table derived in the FROM
+ clause. Rows that do not meet the search condition of the where
+ clause are eliminated from FDT. Notice the use of scalar
+ subqueries as value expressions (C2 assumed UNIQUE). Just like
+ any other query, the subqueries can employ complex table
+ expressions. Notice how FDT is referenced in the subqueries.
+ Qualifying C1 as FDT.C1 is only necessary if C1 is the name of a
+ column in the derived input table of the subquery. Qualifying the
+ column name adds clarity even when it is not needed. The column
+ naming scope of an outer query extends into its inner queries.
+ </para>
+ </sect2>
+
+
+ <sect2 id="queries-group">
+ <title>GROUP BY and HAVING clauses</title>
+
+ <para>
+ After passing the WHERE filter, the derived input table may be
+ subject to grouping, using the GROUP BY clause, and elimination of
+ group rows using the HAVING clause.
+ </para>
+
+<synopsis>
+SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...
+</synopsis>
+
+ <para>
+ The GROUP BY clause is used to group together rows in a table that
+ share the same values in all the columns listed. The order in
+ which the columns are listed does not matter (as opposed to an
+ ORDER BY clause). The purpose is to reduce each group of rows
+ sharing common values into one group row that is representative of
+ all rows in the group. This is done to eliminate redundancy in
+ the output and/or obtain aggregates that apply to these groups.
+ </para>
+
+ <para>
+ Once a table is grouped, columns that are not included in the
+ grouping cannot be referenced, except in aggregate expressions,
+ since a specific value in those columns is ambiguous - which row
+ in the group should it come from? The grouped-by columns can be
+ referenced in select list column expressions since they have a
+ known constant value per group. Aggregate functions on the
+ ungrouped columns provide values that span the rows of a group,
+ not of the whole table. For instance, a
+ <function>sum(sales)</function> on a grouped table by product code
+ gives the total sales for each product, not the total sales on all
+ products. The aggregates of the ungrouped columns are
+ representative of the group, whereas their individual values may
+ not be.
+ </para>
+
+ <para>
+ Example:
+<programlisting>
+SELECT pid, p.name, (sum(s.units) * p.price) AS sales
+ FROM products p LEFT JOIN sales s USING ( pid )
+ GROUP BY pid, p.name, p.price;
+</programlisting>
+ In this example, the columns pid, p.name, and p.price must be in
+ the GROUP BY clause since they are referenced in the query select
+ list. The column s.units does not have to be in the GROUP BY list
+ since it is only used in an aggregate expression
+ (<function>sum()</function>), which represents the group of sales
+ of a product. For each product, a summary row is returned about
+ all sales of the product.
+ </para>
+
+ <para>
+ In strict SQL, GROUP BY can only group by columns of the source
+ table but Postgres extends this to also allow GROUP BY to group by
+ select columns in the query select list. Grouping by value
+ expressions instead of simple column names is also allowed.
+ </para>
+
+ <para>
+<synopsis>
+SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
+</synopsis>
+ If a table has been grouped using a GROUP BY clause, but then only
+ certain groups are of interest, the HAVING clause can be used,
+ much like a WHERE clause, to eliminate groups from a grouped
+ table. For some queries, Postgres allows a HAVING clause to be
+ used without a GROUP BY and then it acts just like another WHERE
+ clause, but the point in using HAVING that way is not clear. Since
+ HAVING operates on groups, only grouped columns can be listed in
+ the HAVING clause. If selection based on some ungrouped column is
+ desired, it should be expressed in the WHERE clause.
+ </para>
+
+ <para>
+ Example:
+<programlisting>
+SELECT pid AS "Products",
+ p.name AS "Over 5000",
+ (sum(s.units) * (p.price - p.cost)) AS "Past Month Profit"
+ FROM products p LEFT JOIN sales s USING ( pid )
+ WHERE p.date > CURRENT_DATE - INTERVAL '4 weeks'
+ GROUP BY pid, p.name, p.price, p.cost
+ HAVING p.price > 5000;
+</programlisting>
+ In the example above, the WHERE clause is selecting rows by a
+ column that is not grouped, while the HAVING clause
+ is selecting groups with a price greater than 5000.
+ </para>
+ </sect2>
+ </sect1>
+
+
+ <sect1 id="queries-select-lists">
+ <title>Select Lists</title>
+
+ <para>
+ The table expression in the <command>SELECT</command> command
+ constructs an intermediate virtual table by possibly combining
+ tables, views, eliminating rows, grouping, etc. This table is
+ finally passed on to processing by the select list. The select
+ list determines which <emphasis>columns</emphasis> of the
+ intermediate table are retained. The simplest kind of select list
+ is <literal>*</literal> which retains all columns that the table
+ expression produces. Otherwise, a select list is a comma-separated
+ list of value expressions (as defined in <xref
+ linkend="sql-expressions">). For instance, it could be a list of
+ column names:
+<programlisting>
+SELECT a, b, c FROM ...
+</programlisting>
+ The columns names a, b, and c are either the actual names of the
+ columns of table referenced in the FROM clause, or the aliases
+ given to them as explained in <xref linkend="queries-table-aliases">.
+ The name space available in the select list is the same as in the
+ WHERE clause (unless grouping is used, in which case it is the same
+ as in the HAVING clause). If more than one table has a column of
+ the same name, the table name must also be given, as in
+<programlisting>
+SELECT tbl1.a, tbl2.b, tbl1.c FROM ...
+</programlisting>
+ (see also <xref linkend="queries-where">).
+ </para>
+
+ <para>
+ If an arbitrary value expression is used in the select list, it
+ conceptually adds a new virtual column to the returned table. The
+ value expression is effectively evaluated once for each retrieved
+ row with real values substituted for any column references. But
+ the expressions in the select list do not have to reference any
+ columns in the table expression of the FROM clause; they can be
+ constant arithmetic expressions as well, for instance.
+ </para>
+
+ <sect2 id="queries-column-labels">
+ <title>Column Labels</title>
+
+ <para>
+ The entries in the select list can be assigned names for further
+ processing. The <quote>further processing</quote> in this case is
+ an optional sort specification and the client application (e.g.,
+ column headers for display). For example:
+<programlisting>
+SELECT a AS value, b + c AS sum FROM ...
+</programlisting>
+ The AS key word can in fact be omitted.
+ </para>
+
+ <para>
+ If no name is chosen, the system assigns a default. For simple
+ column references, this is the name of the column. For function
+ calls, this is the name of the function. For complex expressions,
+ the system will generate a generic name.
+ </para>
+
+ <note>
+ <para>
+ The naming of output columns here is different from that done in
+ the FROM clause (see <xref linkend="queries-table-aliases">). This
+ pipeline will in fact allow you to rename the same column twice,
+ but the name chosen in the select list is the one that will be
+ passed on.
+ </para>
+ </note>
+ </sect2>
+
+ <sect2 id="queries-distinct">
+ <title>DISTINCT</title>
+
+ <para>
+ After the select list has been processed, the result table may
+ optionally be subject to the elimination of duplicates. The
+ <token>DISTINCT</token> key word is written directly after the
+ <token>SELECT</token> to enable this:
+<synopsis>
+SELECT DISTINCT <replaceable>select_list</replaceable> ...
+</synopsis>
+ (Instead of <token>DISTINCT</token> the word <token>ALL</token>
+ can be used to select the default behavior of retaining all rows.)
+ </para>
+
+ <para>
+ Obviously, two rows are considered distinct if they differ in at
+ least one column value. NULLs are considered equal in this
+ consideration.
+ </para>
+
+ <para>
+ Alternatively, an arbitrary expression can determine what rows are
+ to be considered distinct:
+<synopsis>
+SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ...</optional>) <replaceable>select_list</replaceable> ...
+</synopsis>
+ Here <replaceable>expression</replaceable> is an arbitrary value
+ expression that is evaluated for all rows. A set of rows for
+ which all the expressions is equal are considered duplicates and
+ only the first row is kept in the output. Note that the
+ <quote>first row</quote> of a set is unpredictable unless the
+ query is sorted.
+ </para>
+
+ <para>
+ The DISTINCT ON clause is not part of the SQL standard and is
+ sometimes considered bad style because of the indeterminate nature
+ of its results. With judicious use of GROUP BY and subselects in
+ FROM the construct can be avoided, but it is very often the much
+ more convenient alternative.
+ </para>
+ </sect2>
+ </sect1>
+
+ <sect1 id="queries-union">
+ <title>Combining Queries</title>
+
+ <para>
+ The results of two queries can be combined using the set operations
+ union, intersection, and difference. The syntax is
+<synopsis>
+<replaceable>query1</replaceable> UNION <optional>ALL</optional> <replaceable>query2</replaceable>
+<replaceable>query1</replaceable> INTERSECT <optional>ALL</optional> <replaceable>query2</replaceable>
+<replaceable>query1</replaceable> EXCEPT <optional>ALL</optional> <replaceable>query2</replaceable>
+</synopsis>
+ <replaceable>query1</replaceable> and
+ <replaceable>query2</replaceable> are queries that can use any of
+ the features discussed up to this point. Set operations can also
+ be nested and chained, for example
+<synopsis>
+<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable>
+</synopsis>
+ which really says
+<synopsis>
+(<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable>
+</synopsis>
+ </para>
+
+ <para>
+ <command>UNION</command> effectively appends the result of
+ <replaceable>query2</replaceable> to the result of
+ <replaceable>query1</replaceable> (although there is no guarantee
+ that this is the order in which the rows are actually returned) and
+ eliminates all duplicate rows, in the sense of DISTINCT, unless ALL
+ is specified.
+ </para>
+
+ <para>
+ <command>INTERSECT</command> returns all rows that are both in the
+ result of <replaceable>query1</replaceable> and in the result of
+ <replaceable>query2</replaceable>. Duplicate rows are eliminated
+ unless ALL is specified.
+ </para>
+
+ <para>
+ <command>EXCEPT</command> returns all rows that are in the result
+ of <replaceable>query1</replaceable> but not in the result of
+ <replaceable>query2</replaceable>. Again, duplicates are
+ eliminated unless ALL is specified.
+ </para>
+
+ <para>
+ In order to calculate the union, intersection, or difference of two
+ queries, the two queries must be <quote>union compatible</quote>,
+ which means that they both return the same number of columns, and
+ that the corresponding columns have compatible data types, as
+ described in <xref linkend="typeconv-union-case">.
+ </para>
+ </sect1>
+
+
+ <sect1 id="queries-order">
+ <title>Sorting Rows</title>
+
+ <para>
+ After a query has produced an output table (after the select list
+ has been processed) it can optionally be sorted. If sorting is not
+ chosen, the rows will be returned in random order. The actual
+ order in that case will depend on the scan and join plan types and
+ the order on disk, but it must not be relied on. A particular
+ ordering can only be guaranteed if the sort step is explicitly
+ chosen.
+ </para>
+
+ <para>
+ The ORDER BY clause specifies the sort order:
+<synopsis>
+SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> ORDER BY <replaceable>column1</replaceable> <optional>ASC | DESC</optional> <optional>, <replaceable>column2</replaceable> <optional>ASC | DESC</optional> ...</optional>
+</synopsis>
+ <replaceable>column1</replaceable>, etc., refer to select list
+ columns: It can either be the name of a column (either the
+ explicit column label or default name, as explained in <xref
+ linkend="queries-column-labels">) or the number of a column. Some
+ examples:
+<programlisting>
+SELECT a, b FROM table1 ORDER BY a;
+SELECT a + b AS sum, c FROM table1 ORDER BY sum;
+SELECT a, sum(b) FROM table1 GROUP BY a ORDER BY 1;
+</programlisting>
+ </para>
+
+ <para>
+ As an extension to the SQL standard, Postgres also allows ordering
+ by arbitrary expressions:
+<programlisting>
+SELECT a, b FROM table1 ORDER BY a + b;
+</programlisting>
+ References to column names in the FROM clause that are renamed in
+ the select list are also allowed:
+<programlisting>
+SELECT a AS b FROM table1 ORDER BY a;
+</programlisting>
+ But this does not work in queries involving UNION, INTERSECT, or
+ EXCEPT, and is not portable.
+ </para>
+
+ <para>
+ Each column specification may be followed by an optional ASC or
+ DESC to set the sort direction. ASC is default. Ascending order
+ puts smaller values first, where <quote>smaller</quote> is defined
+ in terms of the <literal>&lt;</literal> operator. Similarly,
+ descending order is determined with the <literal>&gt;</literal>
+ operator.
+ </para>
+
+ <para>
+ If more than one sort column is specified the later entries are
+ used to sort the rows that are equal under the order imposed by the
+ earlier sort specifications.
+ </para>
+ </sect1>
+
+ <sect1 id="queries-limit">
+ <title>LIMIT and OFFSET</title>
+
+<synopsis>
+SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional>ORDER BY <replaceable>sort_spec</replaceable></optional> <optional>LIMIT { <replaceable>number</replaceable> | ALL }</optional> <optional>OFFSET <replaceable>number</replaceable></optional>
+</synopsis>
+
+ <para>
+ LIMIT allows you to retrieve just a portion of the rows that are
+ generated by the rest of the query. If a limit count is given, no
+ more than that many rows will be returned. If an offset is given,
+ that many rows will be skipped before starting to return rows.
+ </para>
+
+ <para>
+ When using LIMIT, it is a good idea to use an ORDER BY clause that
+ constrains the result rows into a unique order. Otherwise you will
+ get an unpredictable subset of the query's rows---you may be asking
+ for the tenth through twentieth rows, but tenth through twentieth
+ in what ordering? The ordering is unknown, unless you specified
+ ORDER BY.
+ </para>
+
+ <para>
+ The query optimizer takes LIMIT into account when generating a
+ query plan, so you are very likely to get different plans (yielding
+ different row orders) depending on what you give for LIMIT and
+ OFFSET. Thus, using different LIMIT/OFFSET values to select
+ different subsets of a query result <emphasis>will give
+ inconsistent results</emphasis> unless you enforce a predictable
+ result ordering with ORDER BY. This is not a bug; it is an
+ inherent consequence of the fact that SQL does not promise to
+ deliver the results of a query in any particular order unless ORDER
+ BY is used to constrain the order.
+ </para>
+ </sect1>
+
+</chapter>
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index f994cbe0a0..843f2d08f5 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1,5 +1,5 @@
<!--
-$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.36 2001/01/21 22:02:01 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.37 2001/01/22 23:34:33 petere Exp $
-->
<chapter id="sql-syntax">
@@ -743,7 +743,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
</para>
<para>
- In addition to this list, there are a number of contructs that can
+ In addition to this list, there are a number of constructs that can
be classified as an expression but do not follow any general syntax
rules. These generally have the semantics of a function or
operator and are explained in the appropriate location in <xref
@@ -763,15 +763,15 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
<para>
A column can be referenced in the form:
<synopsis>
-<replaceable>corelation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
+<replaceable>correlation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
</synopsis>
- <replaceable>corelation</replaceable> is either the name of a
+ <replaceable>correlation</replaceable> is either the name of a
table, an alias for a table defined by means of a FROM clause, or
the keyword <literal>NEW</literal> or <literal>OLD</literal>.
(NEW and OLD can only appear in the action portion of a rule,
- while other corelation names can be used in any SQL statement.)
- The corelation name can be omitted if the column name is unique
+ while other correlation names can be used in any SQL statement.)
+ The correlation name can be omitted if the column name is unique
across all the tables being used in the current query. If
<replaceable>column</replaceable> is of an array type, then the
optional <replaceable>subscript</replaceable> selects a specific
@@ -895,8 +895,8 @@ sqrt(2)
The precedence and associativity of the operators is hard-wired
into the parser. Most operators have the same precedence and are
left-associative. This may lead to non-intuitive behavior; for
- example the boolean operators "&lt;" and "&gt;" have a different
- precedence than the boolean operators "&lt;=" and "&gt;=". Also,
+ example the Boolean operators "&lt;" and "&gt;" have a different
+ precedence than the Boolean operators "&lt;=" and "&gt;=". Also,
you will sometimes need to add parentheses when using combinations
of binary and unary operators. For instance
<programlisting>
@@ -917,7 +917,7 @@ SELECT (5 &amp;) ~ 6;
<tgroup cols="2">
<thead>
<row>
- <entry>OperatorElement</entry>
+ <entry>Operator/Element</entry>
<entry>Associativity</entry>
<entry>Description</entry>
</row>
@@ -1057,526 +1057,6 @@ SELECT (5 &amp;) ~ 6;
</para>
</sect1>
-
- <sect1 id="sql-table-expressions">
- <title>Table Expressions</title>
-
- <para>
- A <firstterm>table expression</firstterm> specifies a table. The
- table expression contains a FROM clause that is optionally followed
- by WHERE, GROUP BY, and HAVING clauses. Trivial table expressions
- simply refer to a table on disk, a so-called base table, but more
- complex expressions can be used to modify or combine base tables in
- various ways.
- </para>
-
- <para>
- The general syntax of the <command>SELECT</command> command is
-<synopsis>
-SELECT <replaceable>select_list</replaceable> <replaceable>table_expression</replaceable>
-</synopsis>
-
- The <replaceable>select_list</replaceable> is a comma separated
- list of <replaceable>value expressions</replaceable> as defined in
- <xref linkend="sql-expressions"> that specify the derived columns
- of the query output table. Column names in the derived table that
- is the result of the <replaceable>table_expression</replaceable>
- can be used in the <replaceable>value expression</replaceable>s of
- the <replaceable>select_list</replaceable>.
- </para>
-
- <para>
- The WHERE, GROUP BY, and HAVING clauses in the table expression
- specify a pipeline of successive transformations performed on the
- table derived in the FROM clause. The final transformed table that
- is derived provides the input rows used to derive output rows as
- specified by the select list of derived column value expressions.
- </para>
-
- <sect2>
- <title>FROM clause</title>
-
- <para>
- The FROM clause derives a table from one or more other tables
- given in a comma-separated table reference list.
-<synopsis>
-FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
-</synopsis>
-
- A table reference may be a table name or a derived table such as a
- subquery, a table join, or complex combinations of these. If more
- than one table reference is listed in the FROM clause they are
- CROSS JOINed (see below) to form the derived table that may then
- be subject to transformations by the WHERE, GROUP BY, and HAVING
- clauses and is finally the result of the overall table expression.
- </para>
-
- <para>
- If a table reference is a simple table name and it is the
- supertable in a table inheritance hierarchy, rows of the table
- include rows from all of its subtable successors unless the
- keyword ONLY precedes the table name.
- </para>
-
- <sect3>
- <title>Joined Tables</title>
-
- <para>
- A joined table is a table derived from two other (real or
- derived) tables according to the rules of the particular join
- type. INNER, OUTER, NATURAL, and CROSS JOIN are supported.
- </para>
-
- <variablelist>
- <title>Join Types</title>
-
- <varlistentry>
- <term>CROSS JOIN</term>
-
- <listitem>
-<synopsis>
-<replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
-</synopsis>
-
- <para>
- For each combination of rows from
- <replaceable>T1</replaceable> and
- <replaceable>T2</replaceable> the derived table will contain a
- row consisting of all columns in <replaceable>T1</replaceable>
- followed by all columns in <replaceable>T2</replaceable>. If
- the tables have have N and M rows respectively, the joined
- table will have N * M rows. A cross join is essentially an
- <literal>INNER JOIN ON TRUE</literal>.
- </para>
-
- <tip>
- <para>
- <literal>FROM <replaceable>T1</replaceable> CROSS JOIN
- <replaceable>T2</replaceable></literal> is equivalent to
- <literal>FROM <replaceable>T1</replaceable>,
- <replaceable>T2</replaceable></literal>.
- </para>
- </tip>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Qualified JOINs</term>
- <listitem>
-
-<synopsis>
-<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean expression</replaceable>
-<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
-</synopsis>
-
- <para>
- The words <token>INNER</token> and <token>OUTER</token> are
- optional for all JOINs. <token>INNER</token> is the default;
- <token>LEFT</token>, <token>RIGHT</token>, and
- <token>FULL</token> are for OUTER JOINs only.
- </para>
-
- <para>
- The <firstterm>join condition</firstterm> is specified in the
- ON or USING clause. (The meaning of the join condition
- depends on the particular join type; see below.) The ON
- clause takes a boolean value expression of the same kind as is
- used in a WHERE clause. The USING clause takes a
- comma-separated list of column names, which the joined tables
- must have in common, and joins the tables on the equality of
- those columns as a set, resulting in a joined table having one
- column for each common column listed and all of the other
- columns from both tables. Thus, <literal>USING (a, b,
- c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
- t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
- if ON is used there will be two columns a, b, and c in the
- result, whereas with USING there will be only one of each.
- </para>
-
- <variablelist>
- <varlistentry>
- <term>INNER JOIN</term>
-
- <listitem>
- <para>
- For each row R1 of T1, the joined table has a row for each
- row in T2 that satisfies the join condition with R1.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>LEFT OUTER JOIN</term>
-
- <listitem>
- <para>
- First, an INNER JOIN is performed. Then, for a row in T1
- that does not satisfy the join condition with any row in
- T2, a joined row is returned with NULL values in columns of
- T2. Thus, the joined table unconditionally has a row for each
- row in T1.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>RIGHT OUTER JOIN</term>
-
- <listitem>
- <para>
- This is like a left join, only that the result table will
- unconditionally have a row for each row in T2.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>FULL OUTER JOIN</term>
-
- <listitem>
- <para>
- First, an INNER JOIN is performed. Then, for each row in
- T1 that does not satisfy the join condition with any row in
- T2, a joined row is returned with null values in columns of
- T2. Also, for each row of T2 that does not satisfy the
- join condition with any row in T1, a joined row with null
- values in the columns of T1 is returned.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>NATURAL JOIN</term>
-
- <listitem>
-<synopsis>
-<replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> JOIN <replaceable>T2</replaceable>
-</synopsis>
- <para>
- A natural join creates a joined table where every pair of matching
- column names between the two tables are merged into one column. The
- join specification is effectively a USING clause containing all the
- common column names and is otherwise like a Qualified JOIN.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>
- Joins of all types can be chained together or nested where either
- or both of <replaceable>T1</replaceable> and
- <replaceable>T2</replaceable> may be JOINed tables. Parenthesis
- can be used around JOIN clauses to control the join order which
- are otherwise left to right.
- </para>
- </sect3>
-
- <sect3 id="sql-subqueries">
- <title>Subqueries</title>
-
- <para>
- Subqueries specifying a derived table must be enclosed in
- parenthesis and <emphasis>must</emphasis> be named using an AS
- clause. (See <xref linkend="sql-table-aliases">.)
- </para>
-
-<programlisting>
-FROM (SELECT * FROM table1) AS alias_name
-</programlisting>
-
- <para>
- This example is equivalent to <literal>FROM table1 AS
- alias_name</literal>. Many subquieries can be written as table
- joins instead.
- </para>
- </sect3>
-
- <sect3 id="sql-table-aliases">
- <title>Table and Column Aliases</title>
-
- <para>
- A temporary name can be given to tables and complex table
- references to be used for references to the derived table in
- further processing. This is called a <firstterm>table
- alias</firstterm>.
-<synopsis>
-FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
-</synopsis>
- Here, <replaceable>alias</replaceable> can be any regular
- identifier. The alias becomes the new name of the table
- reference for the current query -- it is no longer possible to
- refer to the table by the original name (if the table reference
- was an ordinary base table). Thus
-<programlisting>
-SELECT * FROM my_table AS m WHERE my_table.a > 5;
-</programlisting>
- is not valid SQL syntax. What will happen instead, as a
- <productname>Postgres</productname> extension, is that an implict
- table reference is added to the FROM clause, so the query is
- processed as if it was written as
-<programlisting>
-SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5;
-</programlisting>
- Table aliases are mainly for notational convenience, but it is
- necessary to use them when joining a table to itself, e.g.,
-<programlisting>
-SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
-</programlisting>
- Additionally, an alias is required if the table reference is a
- subquery.
- </para>
-
- <para>
- Parenthesis are used to resolve ambiguities. The following
- statement will assign the alias <literal>b</literal> to the
- result of the join, unlike the previous example:
-<programlisting>
-SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
-</programlisting>
- </para>
-
- <para>
-<synopsis>
-FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
-</synopsis>
- This form is equivalent the previously treated one; the
- <token>AS</token> key word is noise.
- </para>
-
- <para>
-<synopsis>
-FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
-</synopsis>
- In addition to renaming the table as described above, the columns
- of the table are also given temporary names. If less column
- aliases are specified than the actual table has columns, the last
- columns are not renamed. This syntax is especially useful for
- self-joins or subqueries.
- </para>
- </sect3>
-
- <sect3>
- <title>Examples</title>
-
- <para>
-<programlisting>
-FROM T1 INNER JOIN T2 USING (C)
-FROM T1 LEFT OUTER JOIN T2 USING (C)
-FROM (T1 RIGHT OUTER JOIN T2 ON (T1C1=T2C1)) AS DT1
-FROM (T1 FULL OUTER JOIN T2 USING (C)) AS DT1 (DT1C1, DT1C2)
-
-FROM T1 NATURAL INNER JOIN T2
-FROM T1 NATURAL LEFT OUTER JOIN T2
-FROM T1 NATURAL RIGHT OUTER JOIN T2
-FROM T1 NATURAL FULL OUTER JOIN T2
-
-FROM (SELECT * FROM T1) DT1 CROSS JOIN T2, T3
-FROM (SELECT * FROM T1) DT1, T2, T3
-</programlisting>
-
- Above are some examples of joined tables and complex derived
- tables. Notice how the AS clause renames or names a derived
- table and how the optional comma-separated list of column names
- that follows gives names or renames the columns. The last two
- FROM clauses produce the same derived table from T1, T2, and T3.
- The AS keyword was omitted in naming the subquery as DT1. The
- keywords OUTER and INNER are noise that can be omitted also.
- </para>
- </sect3>
-
- </sect2>
-
- <sect2>
- <title>WHERE clause</title>
-
- <para>
- The syntax of the WHERE clause is
-<synopsis>
-WHERE <replaceable>search condition</replaceable>
-</synopsis>
- where <replaceable>search condition</replaceable> is any value
- expression as defined in <xref linkend="sql-expressions"> that
- returns a value of type <type>boolean</type>.
- </para>
-
- <para>
- After the processing of the FROM clause is done, each row of the
- derived table is checked against the search condition. If the
- result of the condition is true, the row is kept in the output
- table, otherwise (that is, if the result is false or NULL) it is
- discared. The search condition typically references at least some
- column in the table generated in the FROM clause; this is not
- required, but otherwise the WHERE clause will be fairly useless.
- </para>
-
- <note>
- <para>
- Before the implementation of the JOIN syntax, it was necessary to
- put the join condition of an inner join in the WHERE clause. For
- example, these table expressions are equivalent:
-<programlisting>
-FROM a, b WHERE a.id = b.id AND b.val &gt; 5
-</programlisting>
- and
-<programlisting>
-FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val &gt; 5
-</programlisting>
- or perhaps even
-<programlisting>
-FROM a NATURAL JOIN b WHERE b.val &gt; 5
-</programlisting>
- Which one of these you use is mainly a matter of style. The JOIN
- syntax in the FROM clause is probably not as portable to other
- products. For outer joins there is no choice in any case: they
- must be done in the FROM clause.
- </para>
- </note>
-
-<programlisting>
-FROM FDT WHERE
- C1 > 5
-
-FROM FDT WHERE
- C1 IN (1, 2, 3)
-FROM FDT WHERE
- C1 IN (SELECT C1 FROM T2)
-FROM FDT WHERE
- C1 IN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10)
-
-FROM FDT WHERE
- C1 BETWEEN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) AND 100
-
-FROM FDT WHERE
- EXISTS (SELECT C1 FROM T2 WHERE C2 > FDT.C1)
-</programlisting>
-
- <para>
- In the examples above, FDT is the table derived in the FROM
- clause. Rows that do not meet the search condition of the where
- clause are eliminated from FDT. Notice the use of scalar
- subqueries as value expressions (C2 assumed UNIQUE). Just like
- any other query, the subqueries can employ complex table
- expressions. Notice how FDT is referenced in the subqueries.
- Qualifying C1 as FDT.C1 is only necessary if C1 is the name of a
- column in the derived input table of the subquery. Qualifying the
- column name adds clarity even when it is not needed. The column
- naming scope of an outer query extends into its inner queries.
- </para>
- </sect2>
-
-<!-- This is confusing as heck. Make it simpler. -->
-
-<![IGNORE[
-
- <sect2>
- <title>GROUP BY and HAVING clauses</title>
-
- <para>
- After passing the WHERE filter, the derived input table may be
- subject to grouping, using the GROUP BY clause, and elimination of
- group rows using the HAVING clause. (The HAVING clause can also
- be used without GROUP BY, but then it is equivalent to the WHERE
- clause.)
- </para>
-
- <para>
- In standard SQL, the GROUP BY clause takes a list of column names,
- that specify a subrow, from the derived input table produced by
- the previous WHERE or FROM clause and partitions the table into
- groups with duplicate subrows such that within a column of the
- subrow, no column value is distinct from other column values. The
- resulting derived input table is a special type of table, called a
- grouped table, which still contains all columns but only
- references to columns of the grouped subrow, and group aggregates,
- derived from any of the columns, may appear in derived column
- value expressions in the query select list. When deriving an
- output table from a query using a grouped input table, each output
- row is derived from a corresponding group/partition of the grouped
- table. Aggregates computed in a derived output column are
- aggregates on the current partition/group of the grouped input
- table being processed. Only one output table row results per
- group/partition of the grouped input table.
- </para>
-
- <para>
- Postgres has extended the GROUP BY clause to allow some
- non-standard, but useful behavior. Derived output columns, given
- names using an AS clause in the query select list, may appear in
- the GROUP BY clause in combination with, or instead of, the input
- table column names. Tables may also be grouped by arbitrary
- expressions. If output table column names appear in the GROUP BY
- list, then the input table is augmented with additional columns of
- the output table columns listed in the GROUP BY clause. The value
- for each row in the additional columns is computed from the value
- expression that defines the output column in the query select
- list. The augmented input table is grouped by the column names
- listed in the GROUP BY clause. The resulting grouped augmented
- input table is then treated according standard SQL GROUP BY
- semantics. Only the columns of the unaugmented input table in the
- grouped subrow (if any), and group aggregates, derived from any of
- the columns of the unaugmented input table, may be referenced in
- the value expressions of the derived output columns of the
- query. Output columns derived with an aggregate expression cannot
- be named in the GROUP BY clause.
- </para>
-
- <para>
- A HAVING clause may optionally follow a GROUP BY clause. The
- HAVING clause selects or eliminates, depending on which
- perspective is taken, groups from the grouped table derived in the
- GROUP BY clause that precedes it. The search condition is the
- same type of expression allowed in a WHERE clause and may
- reference any of the input table column names in the grouped
- subrow, but may not reference any others or any named output
- columns. When the search condition results in TRUE the group is
- retained, otherwise the group is eliminated.
- </para>
- </sect2>
-
- <sect2>
- <title>ORDER BY and LIMIT clauses</title>
-
- <para>
- ORDER BY and LIMIT clauses are not clauses of a table expression.
- They are optional clauses that may follow a query expression and
- are discussed here because they are commonly used with the
- clauses above.
- </para>
-
- <para>
- ORDER BY takes a comma-separated list of columns and performs a
- cascaded ordering of the table by the columns listed, in the
- order listed. The keyword DESC or ASC may follow any column name
- or expression in the list to specify descending or ascending
- ordering, respectively. Ascending order is the default. The
- ORDER BY clause conforms to the SQL standard but is extended in
- Postgres. Postgres allows ORDER BY to reference both output
- table columns, as named in the select list using the AS clause,
- and input table columns, as given by the table derived in the
- FROM clause and other previous clauses. Postgres also extends
- ORDER BY to allow ordering by arbitrary expressions. If used in a
- query with a GROUP BY clause, the ORDER BY clause can only
- reference output table column names and grouped input table
- columns.
- </para>
-
- <para>
- LIMIT is not a standard SQL clause. LIMIT is a Postgres
- extension that limits the number of rows that will be returned
- from a query. The rows returned by a query using the LIMIT
- clause are random if no ORDER BY clause is specified. A LIMIT
- clause may optionally be followed by an OFFSET clause which
- specifies a number of rows to be skipped in the output table
- before returning the number of rows specified in the LIMIT
- clause.
- </para>
- </sect2>
-]]>
- </sect1>
-
</chapter>
<!-- Keep this comment at the end of the file
diff --git a/doc/src/sgml/user.sgml b/doc/src/sgml/user.sgml
index 78d422013d..c78e3d3a7d 100644
--- a/doc/src/sgml/user.sgml
+++ b/doc/src/sgml/user.sgml
@@ -1,5 +1,5 @@
<!--
-$Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.24 2001/01/13 23:58:55 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.25 2001/01/22 23:34:33 petere Exp $
-->
<book id="user">
@@ -44,6 +44,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.24 2001/01/13 23:58:55
&intro;
&syntax;
+ &queries;
&datatype;
&func;
&typeconv;