summaryrefslogtreecommitdiff
path: root/sql/sql_statistics.h
Commit message (Collapse)AuthorAgeFilesLines
* Merge 10.3 into 10.4Marko Mäkelä2020-06-031-1/+0
|\
| * Merge 10.2 into 10.3Marko Mäkelä2020-06-021-1/+0
| |\
| | * Merge 10.1 into 10.2Marko Mäkelä2020-06-011-1/+0
| | |\
| | | * Thread safe histograms loadingSergey Vojtovich2020-05-291-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously multiple threads were allowed to load histograms concurrently. There were no known problems caused by this. But given amount of data races in this code, it'd happen sooner or later. To avoid scalability bottleneck, histograms loading is protected by per-TABLE_SHARE atomic variable. Whenever histograms were loaded by preceding statement (hot-path), a scalable load-acquire check is performed. Whenever histograms have to be loaded anew, mutual exclusion for loaders is established by atomic variable. If histograms are being loaded concurrently, statement waits until load is completed. - Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only meaningful within TABLE_SHARE (not used for collected stats). - TABLE_STATISTICS_CB::histograms_can_be_read and TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri state atomic variable. - Simplified away alloc_histograms_for_table_share(). Note: there's still likely a data race if a thread attempts accessing histograms data after it failed to load it (because of concurrent load). It was there previously and goes out of the scope of this effort. One way of fixing it could be reviving TABLE::histograms_are_read and adding appropriate checks whenever it is needed. Part of MDEV-19061 - table_share used for reading statistical tables is not protected
* | | | Merge 10.3 into 10.4Marko Mäkelä2019-10-101-2/+1
|\ \ \ \ | |/ / /
| * | | Merge 10.2 into 10.3Marko Mäkelä2019-10-091-2/+1
| |\ \ \ | | |/ /
| | * | Merge 10.1 into 10.2Marko Mäkelä2019-10-091-2/+1
| | |\ \ | | | |/
| | | * MDEV-19536 - Server crash or ASAN heap-use-after-free in is_temporary_table /Sergey Vojtovich2019-10-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | read_statistics_for_tables_if_needed Regression after 279a907, read_statistics_for_tables_if_needed() was called after open_normal_and_derived_tables() failure. Fixed by moving read_statistics_for_tables() call to a branch of get_schema_stat_record() where result of open_normal_and_derived_tables() is checked. Removed THD::force_read_stats, added read_statistics_for_tables() instead. Simplified away statistics_for_command_is_needed().
| | | * Cleanup EITSSergey Vojtovich2019-10-021-2/+0
| | | | | | | | | | | | | | | | | | | | Moved EITS allocation inside read_statistics_for_tables_if_needed(). Removed redundant is_safe argument.
* | | | Merge branch '10.3' into 10.4Oleksandr Byelkin2019-05-191-1/+1
|\ \ \ \ | |/ / /
| * | | Merge 10.2 into 10.3Marko Mäkelä2019-05-141-1/+1
| |\ \ \ | | |/ /
| | * | Merge 10.1 into 10.2Marko Mäkelä2019-05-131-1/+1
| | |\ \ | | | |/
| | | * Merge branch '5.5' into 10.1Vicențiu Ciorbaru2019-05-111-1/+1
| | | |
* | | | Merge 10.3 into 10.4Marko Mäkelä2018-12-121-0/+1
|\ \ \ \ | |/ / /
| * | | Merge 10.2 into 10.3Marko Mäkelä2018-12-121-0/+1
| |\ \ \ | | |/ /
| | * | Merge 10.1 into 10.2Marko Mäkelä2018-12-121-0/+1
| | |\ \ | | | |/
| | | * Merge 10.0 into 10.1Marko Mäkelä2018-12-121-0/+1
| | | |\
| | | | * MDEV-17032: Estimates are higher for partitions of a table with ↵Varun Gupta2018-12-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | @@use_stat_tables= PREFERABLY The problem here is EITS statistics does not calculate statistics for the partitions of the table. So a temporary solution would be to not read EITS statistics for partitioned tables. Also disabling reading of EITS for columns that participate in the partition list of a table.
* | | | | MDEV-17255: New optimizer defaults and ANALYZE TABLEVarun Gupta2018-12-091-0/+27
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added to new values to the server variable use_stat_tables. The values are COMPLEMENTARY_FOR_QUERIES and PREFERABLY_FOR_QUERIES. Both these values don't allow to collect EITS for queries like analyze table t1; To collect EITS we would need to use the syntax with persistent like analyze table t1 persistent for columns (col1,col2...) index (idx1, idx2...) / ALL Changing the default value from NEVER to PREFERABLY_FOR_QUERIES.
* | | | Merge branch '10.2' into 10.3Sergei Golubchik2018-09-281-5/+24
|\ \ \ \ | |/ / /
| * | | Merge branch '10.1' into 10.2Oleksandr Byelkin2018-09-141-5/+24
| |\ \ \ | | |/ /
| | * | Merge branch '11.0' into 10.1Oleksandr Byelkin2018-09-061-5/+24
| | |\ \ | | | |/
| | | * MDEV-15306: Wrong/Unexpected result with the value ↵Varun Gupta2018-08-291-5/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimizer_use_condition_selectivity set to 4 Currently for selectivity calculation we perform range analysis for a column even when we don't have any statistics(EITS). This makes less sense but is used to catch contradiction for WHERE condition. So the solution is to not perform range analysis for selectivity calculation for columns that do not have statistics.
| | | * MDEV-16757 Memory leak after adding manually min/max statistical dataIgor Babaev2018-07-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for blob column ANALYZE TABLE <table> does not collect statistical data on min/max values for BLOB columns of <table>. However these values can be added into mysql.column_stats manually by executing proper statements. Unfortunately this led to a memory leak because the memory allocated for these values was never freed. This patch provides the server with a function to free memory allocated for min/max statistical values of BLOB types.
| | | * MDEV-9744: session optimizer_use_condition_selectivity=5 causing SQL Error ↵Varun Gupta2018-04-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (1918): Encountered illegal value '' when converting to DECIMAL The issue was that EITS data was allocated but then not read for some reason (one being to avoid a deadlock), then the optimizer was using these bzero'ed buffers as EITS statistics. This should not be allowed, we should use statistcs for a table only when we have successfully loaded/read the stats from the statistical tables.
* | | | Merge 10.2 into 10.3Marko Mäkelä2018-08-281-1/+1
|\ \ \ \ | |/ / /
| * | | MDEV-16934 Query with very large IN clause lists runs slowlyIgor Babaev2018-08-171-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces support for the system variable eq_range_index_dive_limit that existed in MySQL starting from 5.6. The variable sets a limit for index dives into equality ranges. Index dives are performed by optimizer to estimate the number of rows in range scans. Index dives usually provide good estimate but they are pretty expensive. To estimate the number of rows in equality ranges statistical data on indexes can be employed. Its usage gives not so good estimates but it's cheap. So if the number of equality dives required by an index scan exceeds the set limit no dives for equality ranges are performed by the optimizer for this index. As the new system variable is introduced in a stable version the default value for it is set to a special value meaning there is no limit for the number of index dives performed by the optimizer. The patch partially uses the MySQL code for WL 5957 'Statistics-based Range optimization for many ranges'.
* | | Merge 10.2 into 10.3Marko Mäkelä2018-08-031-1/+2
|\ \ \ | |/ /
| * | MDEV-16757 Memory leak after adding manually min/max statistical dataIgor Babaev2018-07-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for blob column ANALYZE TABLE <table> does not collect statistical data on min/max values for BLOB columns of <table>. However these values can be added into mysql.column_stats manually by executing proper statements. Unfortunately this led to a memory leak because the memory allocated for these values was never freed. This patch provides the server with a function to free memory allocated for min/max statistical values of BLOB types. Temporarily changed the test case until MDEV-16711 is fixed as without this fix the test case for MDEV-16757 did not fail only for 10.0.
* | | Changed database, tablename and alias to be LEX_CSTRINGMonty2018-01-301-15/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was done in, among other things: - thd->db and thd->db_length - TABLE_LIST tablename, db, alias and schema_name - Audit plugin database name - lex->db - All db and table names in Alter_table_ctx - st_select_lex db Other things: - Changed a lot of functions to take const LEX_CSTRING* as argument for db, table_name and alias. See init_one_table() as an example. - Changed some function arguments from LEX_CSTRING to const LEX_CSTRING - Changed some lists from LEX_STRING to LEX_CSTRING - threads_mysql.result changed because process list_db wasn't always correctly updated - New append_identifier() function that takes LEX_CSTRING* as arguments - Added new element tmp_buff to Alter_table_ctx to separate temp name handling from temporary space - Ensure we store the length after my_casedn_str() of table/db names - Removed not used version of rename_table_in_stat_tables() - Changed Natural_join_column::table_name and db_name() to never return NULL (used for print) - thd->get_db() now returns db as a printable string (thd->db.str or "")
* | | Changing field::field_name and Item::name to LEX_CSTRINGMonty2017-04-231-7/+7
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Benefits of this patch: - Removed a lot of calls to strlen(), especially for field_string - Strings generated by parser are now const strings, less chance of accidently changing a string - Removed a lot of calls with LEX_STRING as parameter (changed to pointer) - More uniform code - Item::name_length was not kept up to date. Now fixed - Several bugs found and fixed (Access to null pointers, access of freed memory, wrong arguments to printf like functions) - Removed a lot of casts from (const char*) to (char*) Changes: - This caused some ABI changes - lex_string_set now uses LEX_CSTRING - Some fucntions are now taking const char* instead of char* - Create_field::change and after changed to LEX_CSTRING - handler::connect_string, comment and engine_name() changed to LEX_CSTRING - Checked printf() related calls to find bugs. Found and fixed several errors in old code. - A lot of changes from LEX_STRING to LEX_CSTRING, especially related to parsing and events. - Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING* - Some changes for char* to const char* - Added printf argument checking for my_snprintf() - Introduced null_clex_str, star_clex_string, temp_lex_str to simplify code - Added item_empty_name and item_used_name to be able to distingush between items that was given an empty name and items that was not given a name This is used in sql_yacc.yy to know when to give an item a name. - select table_name."*' is not anymore same as table_name.* - removed not used function Item::rename() - Added comparision of item->name_length before some calls to my_strcasecmp() to speed up comparison - Moved Item_sp_variable::make_field() from item.h to item.cc - Some minimal code changes to avoid copying to const char * - Fixed wrong error message in wsrep_mysql_parse() - Fixed wrong code in find_field_in_natural_join() where real_item() was set when it shouldn't - ER_ERROR_ON_RENAME was used with extra arguments. - Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already give the error. TODO: - Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c - Change code to not modify LEX_CSTRING for database name (as part of lower_case_table_names)
* | MDEV-10435 crash with bad stat tables.Alexey Botchkov2016-12-091-3/+6
| | | | | | | | | | | | | | | | Functions from sql/statistics.cc don't seem to expect stat tables to fail or to have inadequate structure. Table open errors suppressed and some validity checks added. Invalid tables reported to the server log.
* | MDEV-10957: Assertion failure when dropping a myisam table with ↵Nirbhay Choubey2016-11-021-0/+1
|/ | | | | | | | | | | | wsrep_replicate_myisam enabled Internal updates to system statistical tables could wrongly trigger an additional total-order replication if wsrep_repli -cate_myisam is enabled. Fixed by adding a check to skip total-order replication for stat tables. Test: galera.galera_var_replicate_myisam_on
* Fixed bug mdev-11096.Igor Babaev2016-10-261-0/+5
| | | | | | | | 1. When min/max value is provided the null flag for it must be set to 0 in the bitmap Culumn_statistics::column_stat_nulls. 2. When the calculation of the selectivity of the range condition over a column requires min and max values for the column then we have to check that these values are provided.
* MDEV-6442: Assertion `join->best_read < double(...)' failed with ↵Sergey Petrunya2014-10-061-2/+10
| | | | | | | | | | | | | optimizer_use_condition_selectivity >=3 - Fix the crash by making get_column_range_cardinality() to handle the special case where Column_stats objects is an all-zeros object (the question of what is the point of having Field::read_stats point to such object remains a mystery) - Added a few comments. Learning the code still.
* MDEV-6473 - main.statistics fails on PPC64Sergey Vojtovich2014-07-231-3/+3
| | | | | | | | | | mysql.column_stats wasn't stored/restored properly on big-endian with histogram_type=DOUBLE_PREC_HB. Store histogram values using int2store()/uint2korr(). Note that this patch invalidates previously calculated histogram values on big-endian.
* Code cleanup:Sergey Petrunya2014-03-271-111/+1
| | | | | - Move [some] engine-agnostic tests from t/selectivity.test to t/selectivity_no_engine.test - Move Histogram::point_selectivity to sql_statistics.cc
* MDEV-5926, MDEV-4362 post-fixes:Sergey Petrunya2014-03-271-22/+44
| | | | | | - Histogram::find_bucket() should not walk off the end of the value range. - Address review feedback in Histogram::point_selectivity(): different handling for zero-width buckets, and explanations.
* MDEV-4362: {division by zero when lookup constant is outside the value table}Sergey Petrunya2014-03-261-10/+21
| | | | | | - Fix Histogram::point_selectivity() to work in the case where the passed value_pos=0 (or 1) and the first (or the last) bucket in the histogram has zero value-range (i.e one value).
* MDEV-5926: EITS: Histogram estimates for column=least_possible_value are wrongSergey Petrunya2014-03-261-11/+81
| | | | | | | | [Attempt #2] - Use a new selectivity calculation formula in Histogram::point_selectivity. The formula is different from the old one because it was developed from scratch. it doesn't have any possible division-by-zero problems.
* MDEV-5314 - Compiling fails on OSX using clangSergey Vojtovich2014-02-191-1/+1
| | | | | | | | | | | | | This is port of fix for MySQL BUG#17647863. revno: 5572 revision-id: jon.hauglid@oracle.com-20131030232243-b0pw98oy72uka2sj committer: Jon Olav Hauglid <jon.hauglid@oracle.com> timestamp: Thu 2013-10-31 00:22:43 +0100 message: Bug#17647863: MYSQL DOES NOT COMPILE ON OSX 10.9 GM Rename test() macro to MY_TEST() to avoid conflict with libc++.
* 10.0-monty mergeSergei Golubchik2013-07-211-0/+11
|\ | | | | | | | | | | | | | | includes: * remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING" * introduce LOCK_share, now LOCK_ha_data is strictly for engines * rea_create_table() always creates .par file (even in "frm-only" mode) * fix a 5.6 bug, temp file leak on dummy ALTER TABLE
| * MDEV-4758 10.0-monty tree: ALTER TABLE CHANGE COLUMN doesn't drop EITS statsSergei Golubchik2013-07-091-0/+11
| | | | | | | | | | add missing rename_column_in_stat_tables(), delete_statistics_for_column(), delete_statistics_for_index(), rename_table_in_stat_tables() calls.
* | Added comments.Igor Babaev2013-04-151-2/+2
| | | | | | | | | | Renamed the virtual method middle_point_pos for the class Field to pos_in_interval.
* | Fixed compiler complains.Igor Babaev2013-04-131-0/+3
| |
* | Fixed bug mdev-4363.Igor Babaev2013-04-061-1/+2
| | | | | | | | | | | | When calculating the selectivity of a range in the function get_column_range_cardinality a check whether NULL values are included into into the range must be done.
* | Fixed bug mdev-4370.Igor Babaev2013-04-051-0/+2
| | | | | | | | | | | | Don't try to a histogram if it is not read into the cache for statistical data. It may happen so if optimizer_use_condition_selectivity is set to 3. This setting orders the optimizer not use histograms to calculate selectivity.
* | Fixed bug mdev-4350.Igor Babaev2013-04-031-6/+8
| | | | | | | | | | | | Wrong formulas used by the function Histogram::point_selectivity() could result in a negative value of selectivity returned by the function.
* | Added the type of histogram for mwl #253.Igor Babaev2013-03-301-25/+81
| | | | | | | | Introduced double precision height-balanced histograms.
* | Added histogams for table columns.Igor Babaev2013-03-251-13/+111
| |