From 19c721631ef21bcf3ce3ea3a036da5e234b0f49c Mon Sep 17 00:00:00 2001
From: Sergei Petrunia
Date: Mon, 6 Jun 2022 22:21:22 +0300
Subject: MDEV-28749: restore_prev_nj_state() doesn't update cur_sj_inner_tables correctly

(Try 2) (Cherry-pick back into 10.3)

The code that updates semi-join optimization state for a join order
prefix had several bugs. The visible effect was bad optimization for
FirstMatch or LooseScan strategies: they either weren't considered when
they should have been, or were considered when they shouldn't have been.

In order to hit the bug, the optimizer needs to consider several
different join prefixes in a certain order. Queries with "obvious" query
plans which prune all join orders except one are not affected.

Internally, the bugs in updates of semi-join state were:

1. restore_prev_sj_state() made the assumption "we assume
   remaining_tables doesnt contain @tab", which didn't hold.
2. Another bug in this function: it removed bits from
   join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to
   update the semi-join optimization state. (It does update nested outer
   join state, see this call:
     check_interleaving_with_nj(best_table)
   but there's no matching call to update the semi-join state.)
   (This wasn't visible because most of the state is in the POSITION
   structure, which is updated. But there is state in JOIN, too.)

The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
  state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
- Introduces update_sj_state(), which ideally should have been called
  "advance_sj_state", but I didn't reuse the name to avoid confusion.
---
 sql/sql_select.cc | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

(limited to 'sql/sql_select.cc')

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index ee834f5d806..87144bd2d7d 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -7721,6 +7721,10 @@ choose_plan(JOIN *join, table_map join_tables)
   {
     choose_initial_table_order(join);
   }
+  /*
+    Note: constant tables are already in the join prefix. We don't
+    put them into the cur_sj_inner_tables, though.
+  */
   join->cur_sj_inner_tables= 0;
 
   if (straight_join)
@@ -8023,8 +8027,8 @@ optimize_straight_join(JOIN *join, table_map join_tables)
     read_time= COST_ADD(read_time,
                         COST_ADD(join->positions[idx].read_time,
                                  record_count / (double) TIME_FOR_COMPARE));
-    advance_sj_state(join, join_tables, idx, &record_count, &read_time,
-                     &loose_scan_pos);
+    optimize_semi_joins(join, join_tables, idx, &record_count, &read_time,
+                        &loose_scan_pos);
 
     join_tables&= ~(s->table->map);
     double pushdown_cond_selectivity= 1.0;
@@ -8201,6 +8205,12 @@ greedy_search(JOIN *join,
     /* This has been already checked by best_extension_by_limited_search */
     DBUG_ASSERT(!is_interleave_error);
 
+    /*
+      Also, update the semi-join optimization state. Information about the
+      picked semi-join operation is in best_pos->...picker, but we need to
+      update the global state in the JOIN object, too.
+    */
+    update_sj_state(join, best_table, idx, remaining_tables);
 
     /* find the position of 'best_table' in 'join->best_ref' */
     best_idx= idx;
@@ -8983,8 +8993,8 @@ best_extension_by_limited_search(JOIN *join,
                                        current_record_count /
                                        (double) TIME_FOR_COMPARE));
 
-      advance_sj_state(join, remaining_tables, idx, &current_record_count,
-                       &current_read_time, &loose_scan_pos);
+      optimize_semi_joins(join, remaining_tables, idx, &current_record_count,
+                          &current_read_time, &loose_scan_pos);
 
       /* Expand only partial plans with lower cost than the best QEP so far */
       if (current_read_time >= join->best_read)
-- 
cgit v1.2.1
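
Note: item 2 of the commit message (the restore step removed bits from
cur_sj_inner_tables but never added them) can be illustrated with a small
standalone model. This is only a sketch under simplifying assumptions, not
the server code: ToyTable, ToyJoin, push_table(), pop_table(),
pop_table_remove_only() and recompute_sj_inner_tables() are hypothetical
names, and the model assumes the bitmap should hold the inner tables of
every semi-join nest that has been entered by the prefix but is not yet
fully inside it. The recompute function plays the role that a from-scratch
verifier such as JOIN::dbug_verify_sj_inner_tables() is described as having.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-ins only; none of these are the server's real types.
using toy_table_map = std::uint64_t;  // one bit per table

struct ToyTable {
  toy_table_map map;              // this table's own bit
  toy_table_map sj_inner_tables;  // all bits of its semi-join nest, 0 if none
};

struct ToyJoin {
  std::vector<ToyTable> prefix;           // current join order prefix
  toy_table_map cur_sj_inner_tables = 0;  // incrementally maintained state
};

toy_table_map tables_in_prefix(const ToyJoin &j) {
  toy_table_map m = 0;
  for (const ToyTable &t : j.prefix) m |= t.map;
  return m;
}

// Recompute from scratch: nests that are entered but not yet complete.
toy_table_map recompute_sj_inner_tables(const ToyJoin &j) {
  toy_table_map in_prefix = 0, entered = 0;
  for (const ToyTable &t : j.prefix) {
    in_prefix |= t.map;
    if (!t.sj_inner_tables) continue;
    entered |= t.sj_inner_tables;
    if (!(t.sj_inner_tables & ~in_prefix))  // whole nest is in the prefix
      entered &= ~t.sj_inner_tables;
  }
  return entered;
}

// Extend the prefix by one table and advance the incremental state.
void push_table(ToyJoin &j, const ToyTable &t) {
  j.prefix.push_back(t);
  if (!t.sj_inner_tables) return;
  j.cur_sj_inner_tables |= t.sj_inner_tables;
  if (!(t.sj_inner_tables & ~tables_in_prefix(j)))
    j.cur_sj_inner_tables &= ~t.sj_inner_tables;
}

// Restore step that only ever removes bits (the shape of bug 2).
void pop_table_remove_only(ToyJoin &j) {
  ToyTable t = j.prefix.back();
  j.prefix.pop_back();
  if (t.sj_inner_tables && !(t.sj_inner_tables & tables_in_prefix(j)))
    j.cur_sj_inner_tables &= ~t.sj_inner_tables;
}

// Restore step that can also put bits back.
void pop_table(ToyJoin &j) {
  ToyTable t = j.prefix.back();
  j.prefix.pop_back();
  if (!t.sj_inner_tables) return;
  if (t.sj_inner_tables & tables_in_prefix(j))
    j.cur_sj_inner_tables |= t.sj_inner_tables;   // nest is partial again
  else
    j.cur_sj_inner_tables &= ~t.sj_inner_tables;  // nest left entirely
}
```

With a two-table nest, pushing both tables clears the nest's bits (the nest
is complete). Popping the second table must set them again, which a
remove-only restore cannot do, so the incremental state and the recomputed
state diverge.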