summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorVarun Gupta <varun.gupta@mariadb.com>2019-10-09 15:07:10 +0530
committerVarun Gupta <varun.gupta@mariadb.com>2020-02-09 20:41:05 +0530
commitd1e9d2133dc5959811c65f7c82f82cd6197e03b5 (patch)
tree4662022ce4cc3eada4bc9f44ca829ef553b500cc
parent1e60b6143842f788923c6952bec086d49871c4b9 (diff)
downloadmariadb-git-d1e9d2133dc5959811c65f7c82f82cd6197e03b5.tar.gz
Improved the description
-rw-r--r--sql/sql_sort_nest.cc113
1 files changed, 57 insertions, 56 deletions
diff --git a/sql/sql_sort_nest.cc b/sql/sql_sort_nest.cc
index 7cce22e2644..18d2a596587 100644
--- a/sql/sql_sort_nest.cc
+++ b/sql/sql_sort_nest.cc
@@ -82,94 +82,95 @@ Let us divide the implementation details in 3 stages:
OPTIMIZATION STAGE
-- We invoke the join planner to get an estimate of the cardinality of the
- join. This is needed for pushing the LIMIT in different partial plans
- which can resolve the ORDER BY clause.
+- The join planner is invoked to get an estimate of the cardinality for the
+ join. This is needed to estimate the number of records that are needed to be
+ read from the result of sorting.
-- Join planner is invoked again to find the best join order for the tables
- inside the join. The join planner enumerated various join orders.
- For each partial plan we try to find out if it can resolve the ORDER BY
- clause or not.
- To resolve the ORDER BY clause, equalities from the WHERE clause are also
- considered.
+- The cost of every potentially usable execution plan such that its first
+ joined tables forms a bush the result of which is sorted in accordance with
+ the ORDER BY clause. The evaluations takes into account that the LIMIT
+ operation can be pushed right after the sort operation.
-- After a partial plan that can resolve ORDER BY clause is found, we push
- the LIMIT to the partial plan.
+ The recursive procedure that enumerates such execution plans considers
+ inserting a sort operation for any partial join prefix that can resolve the
+ ORDER BY clause
+
+ So for each such partial join prefix the procedure considers two options:
+ 1) to insert the sort operation immediately
+ 2) to add it later after expanding this partial join.
+
+ For a partial prefix that cannot resolve the required ordering the procedure
+ just extends the partial join.
- Access methods that ensure pre-existing ordering are also taken into account
inside the join planner. There can be indexes on the first non-const table
- that can resolve the ORDER BY clause. So we push the LIMIT to the first
- non-const table also.
-
-- For each partial plan that can resolve the ORDER BY clause,
- we consider 2 cases
- 1) Push the LIMIT at the current partial plan
- 2) Push the LIMIT later
+ that can resolve the ORDER BY clause. So the LIMIT is also pushed to the
+ first non-const table also in this case.
- This helps us to enumerate all plans where on can push LIMIT at different
+ This helps us to enumerate all plans where on can push LIMIT to different
partial plans. Finally the plan with the lowest cost is picked by the join
- planner
-
+ planner.
COMPILATION STAGE
+A nest is a subset of join tables.
+A materialized nest is a nest whose tables are joined together and result
+is put inside a temporary table.
+Sort nest is a materialized nest which can be sorted.
+
Preparation of Sort Nest
-Let's say we have the best join order as:
+Let's say the best join order is:
t1, t2, t3, t4 .............tk,tk+1.........................tn
|<---------prefix------------>|<-------suffix--------------->|
-
The array of join_tab structures would look like
t1, t2, t3, t4 .............tk, <sort nest>, tk+1.........................tn
-After the best execution plan is picked by the join planner which requires
-a nest for a prefix of tables that can resolve the ORDER BY clause, we want
-to prepare the temporary table that would hold the result of materialization
-of the tables in the prefix.
-t1, t2, t3, t4..............tk ======> inner tables of the nest
+Consider the execution plan finally chosen by the planner.
+This is a linear plan whose first node is a temporary table that is created for
+the sort nest.
-To create the temporary table we need a list of Items which we want to store
-inside the temporary table of the nest. Currently this list contains all
-fields of the inner tables of the nest that have their bitmap read_set set.
-With this list of Items we create the temporary table for the nest.
-Also we create a list of Items for all the fields of the temporary table.
-This list is needed for substitution of items that will be evaluated in the
-POST ORDER BY context.
+Join used for the sort nest is also executed by a linear plan.
+
+ materialize
+ t1, t2, t3, t4..............tk ============> <sort nest>
+ |<---------prefix----------->|
+ Here the sort nest is the first node as stated above:
-After the nest for the prefix is prepared, we extract a sub-condition which is
-dependent on the inner tables of the nest from the WHERE clause. This
-condition is then attached to the inner tables of the nest. This condition
-would be evaluated before the ORDER BY clause is applied to the temporary
-table of the nest.
+ <sort nest> [sort], tk+1.........................tn
+ |<-------suffix-------------->|
+
+To create the temporary table of the nest a list of Items that are going to be
+stored inside the temporary table is needed. Currently this list contains
+fields of the inner tables of the nest that have their bitmap read_set set.
-We need to make substitution for items belonging to the inner tables of the
-nest which will be evaluated in the POST ORDER BY context. These items need
-to be substituted with the corresponding items of the temporary table
-of the nest.
+After the temporary table for the sort nest is created the conditions that can
+be pushed there are extracted from the WHERE clause. Thus the join with the
+sort nest can use only remainder of the extraction. This new condition has to
+be re-bound to refer to the columns of the temporary table whenever references
+to inner tables of the nest were used.
+Similarly for ON clause, SELECT list, ORDER BY clause and REF items this
+rebinding needs to be done.
EXECUTION STAGE
-Let's say we have the best join order as:
+Let's say the best join order is:
t1, t2, t3, t4 .............tk,tk+1.........................tn
|<---------prefix------------>|<-------suffix--------------->|
- The prefix are the inner table of the sort nest while the suffix are the
+ The prefix are the inner tables of the sort nest while the suffix are the
tables outside the sort nest.
- As soon as the join execution starts, we compute the partial join for the
- tables in the prefix and store the result inside the temporary table
- for the sort nest.
- Then we sort the temporary table in accordance with the ORDER BY clause.
- After the sort is performed we read the records from the temporary
- table of the sort nest one by one and continue the join with the
- tables in the suffix.
+ On the execution stage, the join executor computes the partial join for the
+ tables in the prefix and stores the result inside the temporary table of the
+ sort nest.
The join execution for this optimization can be split in 3 parts
@@ -180,13 +181,13 @@ Let's say we have the best join order as:
b) Sort the <sort nest> in accordance with the ORDER BY clause
- c) Read records from the Filesort buffer one by one and continue join
- execution with the tables in the suffix
+ c) Read records from the the result of sorting one by one and join
+ with the tables in the suffix with NESTED LOOP JOIN
<sort nest>, tk+1.........................tn
|<----------suffix----------->|
- The execution stops as soon as we get LIMIT records in the output.
+ The execution stops as soon as we get LIMIT records in the output.
*/