| author | Vicențiu Ciorbaru <vicentiu@mariadb.org> | 2019-02-15 01:23:00 +0200 |
|---|---|---|
| committer | Vicențiu Ciorbaru <vicentiu@mariadb.org> | 2019-02-19 12:01:21 +0200 |
| commit | f0773b7842fcfd2032b630b4cfc7404a29d12a8f (patch) | |
| tree | 3b00628835a73575036e3488e2613d39bc8544e0 /sql/sql_class.h | |
| parent | 47f15ea73c49e90b16a4a4adf5414f51bdbf97a4 (diff) | |
Introduce analyze_sample_percentage variable
The variable controls what percentage of a table's rows ANALYZE TABLE samples.
If ANALYZE TABLE with histogram collection is too slow, one can reduce the
time taken by setting analyze_sample_percentage to a lower percentage of the
total number of rows.
Setting it to 0 will use a formula to compute how many rows to sample:
The number of rows collected has a lower bound of 50000 and
increases logarithmically with a coefficient of 4096. The coefficient is
chosen so that we expect an error of less than 3% in our estimations
according to the paper:
"Random Sampling for Histogram Construction: How much is enough?"
– Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya, ACM SIGMOD, 1998.
The drawback of sampling is that the avg_frequency value is computed
imprecisely and will yield a smaller number than the real one.
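
The sampling rule described above can be sketched as follows. This is a minimal illustration, not the code added by this commit: the helper name `rows_to_sample` is hypothetical, and the commit message does not say what the logarithm is taken of, so the sketch assumes the natural logarithm of the table's row count. It also assumes the percentage lies in the range 0 to 100.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of the patch): estimates how many rows
// ANALYZE TABLE would read for a given analyze_sample_percentage value.
std::uint64_t rows_to_sample(std::uint64_t table_rows, double sample_percentage)
{
  assert(sample_percentage >= 0.0 && sample_percentage <= 100.0);

  if (sample_percentage > 0.0)
  {
    // Explicit setting: read the requested share of the table.
    double share= static_cast<double>(table_rows) * sample_percentage / 100.0;
    return static_cast<std::uint64_t>(std::ceil(share));
  }

  // sample_percentage == 0: let the formula pick the sample size.
  if (table_rows == 0)
    return 0;

  const double min_rows= 50000.0;    // lower bound from the commit message
  const double coefficient= 4096.0;  // coefficient from the commit message

  // ASSUMPTION: logarithmic growth is modelled as coefficient * ln(table_rows).
  double target= std::max(min_rows,
                          coefficient * std::log(static_cast<double>(table_rows)));

  // A sample can never contain more rows than the table holds.
  return std::min(table_rows, static_cast<std::uint64_t>(std::ceil(target)));
}
```

Under these assumptions a table with 100000 rows is sampled at the 50000-row lower bound, while a table with fewer than 50000 rows is read in full.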
Diffstat (limited to 'sql/sql_class.h')
| -rw-r--r-- | sql/sql_class.h | 1 |
1 file changed, 1 insertion, 0 deletions
diff --git a/sql/sql_class.h b/sql/sql_class.h
index 56b8aca19ab..3b0099ccae8 100644
--- a/sql/sql_class.h
+++ b/sql/sql_class.h
@@ -622,6 +622,7 @@ typedef struct system_variables
   ulong optimizer_selectivity_sampling_limit;
   ulong optimizer_use_condition_selectivity;
   ulong use_stat_tables;
+  double sample_percentage;
   ulong histogram_size;
   ulong histogram_type;
   ulong preload_buff_size;
