author | Sergei Petrunia <sergey@mariadb.com> | 2023-04-19 15:15:27 +0300 |
---|---|---|
committer | Sergei Petrunia <sergey@mariadb.com> | 2023-04-28 22:39:25 +0300 |
commit | 85cc83188059d0cd280aa9f9e290dc8f025a4c3c (patch) | |
tree | bc12c628e9c32522e34619dc26052865d999f9c5 /sql/sql_sequence.cc | |
parent | bc970573b38e87a3087c8d7b2252c42e87b7cebb (diff) | |
download | mariadb-git-85cc83188059d0cd280aa9f9e290dc8f025a4c3c.tar.gz |
MDEV-31067: selectivity_from_histogram >1.0 for a DOUBLE_PREC_HB histogram
bb-10.4-mdev31067-variant2
Variant #2.
When Histogram::point_selectivity() sees that the point value of interest
falls into one bucket, it tries to guess whether the bucket has many
different (unpopular) values or a few popular values. (The number of
rows in a bucket is fixed, as it's a height-balanced histogram.)
The basis for this guess is the "width" of the value range the bucket
covers. Buckets covering wider value ranges are assumed to contain
values with proportionally lower frequencies.
This is just [brave] guesswork. For a very narrow bucket, it may
produce an estimate that's larger than the total number of rows in the
bucket or even in the whole table.
Remove the guesswork and replace it with basic logic: return
either the per-table average selectivity of col=const, or selectivity
of one bucket, whichever is lower.
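A minimal sketch of the replacement logic, for illustration only. The
function name and parameters are hypothetical and are not the actual
Histogram::point_selectivity() signature. It assumes the caller already
has the per-table average selectivity of col=const, and that each bucket
of the height-balanced histogram holds 1/n_buckets of the rows.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not the real MariaDB code.
// avg_sel:   per-table average selectivity of col=const (assumed precomputed).
// n_buckets: number of buckets in the height-balanced histogram.
double point_selectivity_sketch(double avg_sel, unsigned n_buckets)
{
  assert(n_buckets > 0);
  // Every bucket of a height-balanced histogram covers the same row fraction.
  double bucket_sel = 1.0 / n_buckets;
  // Return whichever is lower, so the estimate never exceeds one bucket.
  return std::min(avg_sel, bucket_sel);
}
```

With this shape the result cannot exceed the selectivity of a single
bucket, so it cannot exceed 1.0 either.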
Diffstat (limited to 'sql/sql_sequence.cc')
0 files changed, 0 insertions, 0 deletions