Commit message    Author    Age    Files    Lines
* Merge MDEV-26519: JSON_HB histograms into 10.8 [preview-10.8-MDEV-26519-json-histograms]    Sergei Petrunia    2022-01-19    51    -439/+11232
|\
| * Code cleanup    Sergei Petrunia    2022-01-19    7    -28/+14
| |
| * Switch the default histogram_type to still be DOUBLE_PREC_HB    Sergei Petrunia    2022-01-19    4    -3/+4
| |   MTR still uses JSON_HB as the default.
| * JSON_HB histogram: represent values of BIT() columns in hex always    Sergei Petrunia    2022-01-19    3    -16/+81
| |
| * MDEV-26901: Estimation for filtered rows less precise ... #4    Sergei Petrunia    2022-01-19    6    -6/+42
| |   In Histogram_json_hb::point_selectivity(), do return selectivity of 0.0
| |   when the histogram says so.
| |   The logic of "Do not return 0.0 estimate as it causes a multiply-by-zero
| |   meltdown in cost and cardinality calculations" is moved into
| |   records_in_column_ranges(), where it is done *once* per column pair
| |   (as opposed to doing it once per range, which can cause the error to
| |   add up to a large number when there are many ranges).
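The effect described in this message can be illustrated with a minimal sketch. It is written in Python rather than the server's C++, and both function names and the floor value 0.001 are hypothetical; the sketch only contrasts applying a non-zero floor once per range with applying it once to the combined estimate.

```python
def estimate_floor_per_range(range_selectivities, floor):
    """Raise every per-range estimate to the floor, then add them up."""
    return sum(max(sel, floor) for sel in range_selectivities)


def estimate_floor_once(range_selectivities, floor):
    """Add up the raw per-range estimates, then apply the floor a single time."""
    return max(sum(range_selectivities), floor)


# 100 ranges that the histogram says match nothing at all.
selectivities = [0.0] * 100
print(estimate_floor_per_range(selectivities, 0.001))  # error grows with the range count
print(estimate_floor_once(selectivities, 0.001))       # error stays at the floor
```

With many ranges, the per-range variant inflates the estimate by roughly the floor times the number of ranges, while the single floor keeps the estimate non-zero without accumulating the error.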
| * MDEV-27229: Estimation for filtered rows less precise ... #5    Sergei Petrunia    2022-01-19    7    -16/+33
| |   Followup: remove this line from get_column_range_cardinality()
| |       set_if_bigger(res, col_stats->get_avg_frequency());
| |   and make sure it is only used with the binary histograms.
| |   For JSON histograms, it makes the estimates unnecessarily imprecise.
| * MDEV-27243: Estimation for filtered rows less precise ... #7    Sergei Petrunia    2022-01-19    2    -0/+27
| |   Added a testcase
| * MDEV-27229: Estimation for filtered rows less precise ... #5    Sergei Petrunia    2022-01-19    4    -55/+148
| |   Fix special handling for values that are right next to buckets with ndv=1.
| * Update test results    Sergei Petrunia    2022-01-19    2    -18/+18
| |
| * MDEV-27230: Estimation for filtered rows less precise ...    Sergei Petrunia    2022-01-19    3    -0/+28
| |   Fix the code in Histogram_json_hb::range_selectivity that handles
| |   special cases: a non-inclusive endpoint hitting a bucket boundary...
| * MDEV-27203: Valgrind / MSAN errors in Histogram_json_hb::parse_bucket    Sergei Petrunia    2022-01-19    2    -1/+8
| |   In read_bucket_endpoint(), handle all possible parser states.
| * MDEV-26764: JSON_HB Histograms: handle BINARY and unassigned characters    Sergei Petrunia    2022-01-19    4    -22/+130
| |   Encode such characters in hex.
| * More test coverage    Sergei Petrunia    2022-01-19    3    -2/+44
| |
| * MDEV-26519: Improved histograms    Sergei Petrunia    2022-01-19    8    -51/+576
| |   Save extra information in the histogram:
| |     "target_histogram_size": nnn,
| |     "collected_at": "(date and time)",
| |     "collected_by": "(server version)",
| * MDEV-26519: Improved histograms: Better error reporting, test coverage    Sergei Petrunia    2022-01-19    5    -0/+107
| |   Also report JSON histogram load errors into error log, like it is
| |   already done with other histogram/statistics load errors.
| |   Add test coverage to see what happens if one upgrades but does NOT run
| |   mysql_upgrade.
| * Rename histogram_hb_v2 -> histogram_hb    Sergei Petrunia    2022-01-19    4    -69/+69
| |
| * MDEV-26519: Improved histograms: Make JSON parser efficient    Sergei Petrunia    2022-01-19    7    -181/+351
| |   The previous JSON parser used an API that made parsing inefficient:
| |   the same JSON content was parsed again and again.
| |   Switch to a lower-level parsing API that allows efficient parsing.
| * MDEV-27062: Make histogram_type=JSON_HB the new default    Sergei Petrunia    2022-01-19    18    -119/+119
| |
| * MDEV-26886: Estimation for filtered rows less precise with JSON histogram    Sergei Petrunia    2022-01-19    4    -29/+84
| |   - Make Histogram_json_hb::range_selectivity handle singleton buckets
| |     specially when computing selectivity of the max. endpoint bound.
| |     (for min. endpoint, we already do that).
| |   - Also, fixed comments for Histogram_json_hb::find_bucket
| * MDEV-26911: Unexpected ER_DUP_KEY, ASAN errors, double free detected in ...    Sergei Petrunia    2022-01-19    3    -2/+39
| |   When loading the histogram, use table->field[N], not table->s->field[N].
| |   When we used the latter we would corrupt the field's default value. One
| |   of the consequences of that would be that AUTO_INCREMENT fields would
| |   stop working correctly.
| * MDEV-26892: JSON histograms become invalid with a specific (corrupt) value ..    Sergei Petrunia    2022-01-19    3    -3/+29
| |   Handle the case where the last value in the table cannot be
| |   represented in utf8mb4.
| * MDEV-26849: JSON Histograms: point selectivity estimates are off    Sergei Petrunia    2022-01-19    8    -10/+142
| |   .. for non-existent values.
| |   Handle this special case.
| * MDEV-26750: Estimation for filtered rows is far off with JSON_HB histogram    Sergei Petrunia    2022-01-19    3    -9/+61
| |   Fix a bug in position_in_interval(). Do not overwrite one interval
| |   endpoint with another.
| * MDEV-26801: Valgrind/MSAN errors in Column_statistics_collected::finish ...    Sergei Petrunia    2022-01-19    3    -9/+13
| |   The problem was introduced in the fix for MDEV-26724. That patch has
| |   made it possible for histogram collection to fail. In particular, it
| |   fails for non-assigned characters.
| |   When histogram construction fails, we also abort the computation of
| |   COUNT(DISTINCT). When we try to use the value, we get valgrind failures.
| |   Switched the code to abort the statistics collection in this case.
| * MDEV-26709: JSON histogram may contain more buckets than histogram_size allows    Sergei Petrunia    2022-01-19    3    -2889/+2463
| |   When computing bucket_capacity= records/histogram->get_width(),
| |   round the value UP, not down.
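The rounding fix can be illustrated with a small sketch. It is written in Python rather than the server's C++, and `bucket_capacity` and `buckets_needed` are hypothetical stand-alone functions, not the server's code; they only show why rounding the capacity down lets the bucket count exceed the requested width.

```python
def bucket_capacity(records: int, width: int, round_up: bool = True) -> int:
    """Rows per bucket for `records` rows and a requested `width` buckets."""
    if round_up:
        return -(-records // width)  # integer ceiling division
    return records // width          # floor division (the old behaviour)


def buckets_needed(records: int, capacity: int) -> int:
    """Buckets produced when each bucket holds at most `capacity` rows."""
    return -(-records // capacity)


records, width = 10, 4
print(buckets_needed(records, bucket_capacity(records, width, round_up=False)))  # 5, exceeds width
print(buckets_needed(records, bucket_capacity(records, width)))                  # 4, fits width
```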
| * MDEV-26724 Endless loop in json_escape_to_string upon ... empty string    Sergei Petrunia    2022-01-19    6    -16/+65
| |   Part#3:
| |   - make json_escape() return different errors on conversion error
| |     and on out-of-space condition.
| |   - Make histogram code handle conversion errors.
| * Update test results    Sergei Petrunia    2022-01-19    2    -2/+2
| |
| * MDEV-26737: Outdated VARIABLE_COMMENT for HISTOGRAM_TYPE in I_S.SYSTEM_VARIABLES    Sergei Petrunia    2022-01-19    4    -2/+15
| |   Fix the description
| * MDEV-26710: Histogram field in mysql.column_stats is too short    Sergei Petrunia    2022-01-19    10    -13/+13
| |   Change it to LONGBLOB.
| |   Also, update_statistics_for_table() should not "swallow" an error
| |   from open_stat_tables.
| * MDEV-26724 Endless loop in json_escape_to_string upon ... empty string    Sergei Petrunia    2022-01-19    3    -7/+50
| |   .. part#2: correctly pass the charset to JSON [un]escape functions
| * MDEV-26595: ASAN use-after-poison my_strnxfrm_simple_internal / Histogram_json_hb::range_selectivity    Sergei Petrunia    2022-01-19    2    -4/+34
| |   Add testcase
| * MDEV-26589: Assertion failure upon DECODE_HISTOGRAM with NULLs    Sergei Petrunia    2022-01-19    3    -0/+40
| |   Item_func_decode_histogram::val_str should correctly set null_value
| |   when "decoding" JSON histogram.
| * MDEV-26724 Endless loop in json_escape_to_string upon ... empty string    Sergei Petrunia    2022-01-19    3    -2/+28
| |   Correctly handle empty string when [un]escaping JSON
| * MDEV-26711: Values in JSON histograms are not properly quoted    Sergei Petrunia    2022-01-19    6    -80/+608
| |   Escape values when serializing to JSON.
| |   Un-escape when reading back.
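The escape/un-escape round trip can be sketched as follows. This uses Python's standard `json` module purely as an illustration, not the server's own JSON escape functions; the helper names and the `"start"` key are assumptions made for the example.

```python
import json


def serialize_bucket_start(value: str) -> str:
    """Write a bucket endpoint into a JSON document, escaping as needed."""
    return json.dumps({"start": value})


def read_bucket_start(document: str) -> str:
    """Read the endpoint back; the JSON parser un-escapes it."""
    return json.loads(document)["start"]


# A value containing characters that must be escaped inside a JSON string.
value = 'a "quoted" \\ value\n'
document = serialize_bucket_start(value)
print(document)
print(read_bucket_start(document) == value)
```

Without the escaping step, the embedded quote would terminate the JSON string early and the stored histogram would no longer parse.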
| * Use JSON_NAME, not the "histogram_hb_v2" constant    Sergei Petrunia    2022-01-19    2    -3/+3
| |
| * More "straightforward" memory management    Sergei Petrunia    2022-01-19    2    -3/+5
| |   Do not put Histogram objects on MEM_ROOT at all
| * Fix off-by-one error in Histogram_json_hb::find_bucket    Sergei Petrunia    2022-01-19    4    -13/+70
| |
| * MDEV-26590: Stack smashing/buffer overflow in Histogram_json_hb::parse    Sergei Petrunia    2022-01-19    3    -8/+36
| |   Provide buffer of sufficient size.
| * Address review input    Sergei Petrunia    2022-01-19    4    -4/+78
| |
| * Fix the previous cset: next() should have element_count as parameter    Sergei Petrunia    2022-01-19    1    -1/+1
| |
| * Fix compile warnings/error on Windows    Sergei Petrunia    2022-01-19    1    -3/+3
| |
| * Fixes in opt_histogram_json.cc in the last commits    Sergei Petrunia    2022-01-19    3    -27/+167
| |   Also add more test coverage
| * Valgrind fixes, poor .result fixes, code cleanups    Sergei Petrunia    2022-01-19    2    -7/+5
| |   - Use String::c_ptr_safe() instead of String::c_ptr
| |   - Do proper datatype conversions in Histogram_json_hb::parse
| |   - Remove Histogram_json_hb::Bucket::end_value. Introduce
| |     get_end_value() instead.
| * Fix compile error on windows    Sergei Petrunia    2022-01-19    1    -1/+1
| |
| * MDEV-26519: JSON Histograms: improve histogram collection    Sergei Petrunia    2022-01-19    3    -1319/+5723
| |   Basic ideas:
| |   1. Store "popular" values in their own buckets.
| |   2. Also store ndv (Number of Distinct Values) in each bucket.
| |   Because of #1, the buckets are now variable-size, so store the size
| |   in each bucket.
| |   Adjust selectivity estimation functions accordingly.
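The two ideas in this message can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the server's collection algorithm: the function name, the "popular means at least one bucket's worth of rows" threshold, and the `start`/`size`/`ndv` keys are all chosen for the example.

```python
from itertools import groupby


def build_histogram(sorted_values, n_buckets):
    """Build variable-size buckets; popular values get a bucket of their own."""
    total = len(sorted_values)
    capacity = -(-total // n_buckets)  # target rows per bucket, rounded up
    buckets = []                       # finished buckets: start, rows, ndv
    open_bucket = None                 # bucket currently being filled
    for value, group in groupby(sorted_values):
        count = sum(1 for _ in group)
        popular = count >= capacity    # assumed definition of "popular"
        # Close the open bucket if this value cannot or should not join it.
        if open_bucket is not None and (
            popular or open_bucket["rows"] + count > capacity
        ):
            buckets.append(open_bucket)
            open_bucket = None
        if popular:
            buckets.append({"start": value, "rows": count, "ndv": 1})
        elif open_bucket is None:
            open_bucket = {"start": value, "rows": count, "ndv": 1}
        else:
            open_bucket["rows"] += count
            open_bucket["ndv"] += 1
    if open_bucket is not None:
        buckets.append(open_bucket)
    # Store each bucket's size as a fraction of all rows, plus its ndv.
    return [
        {"start": b["start"], "size": b["rows"] / total, "ndv": b["ndv"]}
        for b in buckets
    ]


values = [1, 2, 3] + [5] * 6 + [7, 8, 9]
for bucket in build_histogram(values, n_buckets=4):
    print(bucket)
```

Because the popular value 5 occupies a bucket by itself, the buckets no longer hold equal shares of the rows, which is why each bucket has to record its own size.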
| * Fix compilation on windows    Sergei Petrunia    2022-01-19    1    -2/+2
| |
| * Correctly decode string field values for pos_in_interval_for_string call    Sergei Petrunia    2022-01-19    3    -20/+42
| |
| * Make tests pass    Sergei Petrunia    2022-01-19    3    -19/+8
| |   - Fix bad tests in statistics_json test: make them meaningful and make
| |     them work on windows
| |   - Fix analyze_debug.test: correctly handle errors during ANALYZE
| * Fix compilation on windows part #3    Sergei Petrunia    2022-01-19    1    -1/+1
| |
| * Fix embedded to work    Sergei Petrunia    2022-01-19    1    -0/+1
| |