diff options
author | Marko Mäkelä <marko.makela@mariadb.com> | 2016-12-14 19:56:39 +0200 |
---|---|---|
committer | Marko Mäkelä <marko.makela@mariadb.com> | 2016-12-16 09:19:19 +0200 |
commit | 8777458a6eb73ac1d7d864ebac390ea7039e21c1 (patch) | |
tree | 4c8df83897aa22a8ce334e52898481d783a73b21 /storage/innobase/row/row0ins.cc | |
parent | 8938031bc7eb78d406553465341338038cfb2e1a (diff) | |
download | mariadb-git-8777458a6eb73ac1d7d864ebac390ea7039e21c1.tar.gz |
MDEV-6076 Persistent AUTO_INCREMENT for InnoDB
This should be functionally equivalent to WL#6204 in MySQL 8.0.0, with
the notable difference that the file format changes are limited to
repurposing a previously unused data field in B-tree pages.
For persistent InnoDB tables, write the last used AUTO_INCREMENT
value to the root page of the clustered index, in the previously
unused (0) PAGE_MAX_TRX_ID field, now aliased as PAGE_ROOT_AUTO_INC.
Unlike some other previously unused InnoDB data fields, this one was
actually always zero-initialized, at least since MySQL 3.23.49.
The writes to PAGE_ROOT_AUTO_INC are protected by SX or X latch on the
root page. The SX latch will allow concurrent read access to the root
page. (The field PAGE_ROOT_AUTO_INC will only be read on the
first-time call to ha_innobase::open() from the SQL layer. The
PAGE_ROOT_AUTO_INC can only be updated when executing SQL, so
read/write races are not possible.)
During INSERT, the PAGE_ROOT_AUTO_INC is updated by the low-level
function btr_cur_search_to_nth_level(), adding no extra page
access. [Adaptive hash index lookup will be disabled during INSERT.]
If some rare UPDATE modifies an AUTO_INCREMENT column, the
PAGE_ROOT_AUTO_INC will be adjusted in a separate mini-transaction in
ha_innobase::update_row().
When a page is reorganized, we have to preserve the PAGE_ROOT_AUTO_INC
field.
During ALTER TABLE, the initial AUTO_INCREMENT value will be copied
from the table. ALGORITHM=COPY and online log apply in LOCK=NONE will
update PAGE_ROOT_AUTO_INC in real time.
innodb_col_no(): Determine the dict_table_t::cols[] element index
corresponding to a Field of a non-virtual column.
(The MySQL 5.7 implementation of virtual columns breaks the 1:1
relationship between Field::field_index and dict_table_t::cols[].
Virtual columns are omitted from dict_table_t::cols[]. Therefore,
we must translate the field_index of AUTO_INCREMENT columns into
an index of dict_table_t::cols[].)
Upgrade from old data files:
By default, the AUTO_INCREMENT sequence in old data files would appear
to be reset, because PAGE_MAX_TRX_ID or PAGE_ROOT_AUTO_INC would contain
the value 0 in each clustered index page. In new data files,
PAGE_ROOT_AUTO_INC can only be 0 if the table is empty or does not contain
any AUTO_INCREMENT column.
For backward compatibility, we use the old method of
SELECT MAX(auto_increment_column) for initializing the sequence.
btr_read_autoinc(): Read the AUTO_INCREMENT sequence from a new-format
data file.
btr_read_autoinc_with_fallback(): A variant of btr_read_autoinc()
that will resort to reading MAX(auto_increment_column) for data files
that did not use AUTO_INCREMENT yet. It was manually tested that during
the execution of innodb.autoinc_persist the compatibility logic is
not activated (for new files, PAGE_ROOT_AUTO_INC is never 0 in nonempty
clustered index root pages).
initialize_auto_increment(): Replaces
ha_innobase::innobase_initialize_autoinc(). This initializes
the AUTO_INCREMENT metadata. Only called from ha_innobase::open().
ha_innobase::info_low(): Do not try to lazily initialize
dict_table_t::autoinc. It must already have been initialized by
ha_innobase::open() or ha_innobase::create().
Note: The adjustments to class ha_innopart were not tested, because
the source code (native InnoDB partitioning) is not being compiled.
Diffstat (limited to 'storage/innobase/row/row0ins.cc')
-rw-r--r-- | storage/innobase/row/row0ins.cc | 39 |
1 files changed, 29 insertions, 10 deletions
diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc index 271d70d4da9..1f884017dd3 100644 --- a/storage/innobase/row/row0ins.cc +++ b/storage/innobase/row/row0ins.cc @@ -1,6 +1,7 @@ /***************************************************************************** Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved. +Copyright (c) 2016, MariaDB Corporation. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software @@ -2473,6 +2474,7 @@ row_ins_clust_index_entry_low( dberr_t err = DB_SUCCESS; big_rec_t* big_rec = NULL; mtr_t mtr; + ib_uint64_t auto_inc = 0; mem_heap_t* offsets_heap = NULL; ulint offsets_[REC_OFFS_NORMAL_SIZE]; ulint* offsets = offsets_; @@ -2487,7 +2489,6 @@ row_ins_clust_index_entry_low( ut_ad(!thr_get_trx(thr)->in_rollback); mtr_start(&mtr); - mtr.set_named_space(index->space); if (dict_table_is_temporary(index->table)) { /* Disable REDO logging as the lifetime of temp-tables is @@ -2496,23 +2497,41 @@ row_ins_clust_index_entry_low( Disable locking as temp-tables are local to a connection. */ ut_ad(flags & BTR_NO_LOCKING_FLAG); + ut_ad(!dict_index_is_online_ddl(index)); + ut_ad(!index->table->persistent_autoinc); mtr.set_log_mode(MTR_LOG_NO_REDO); - } + } else { + mtr.set_named_space(index->space); - if (mode == BTR_MODIFY_LEAF && dict_index_is_online_ddl(index)) { - mode = BTR_MODIFY_LEAF | BTR_ALREADY_S_LATCHED; - mtr_s_lock(dict_index_get_lock(index), &mtr); + if (mode == BTR_MODIFY_LEAF + && dict_index_is_online_ddl(index)) { + mode = BTR_MODIFY_LEAF | BTR_ALREADY_S_LATCHED; + mtr_s_lock(dict_index_get_lock(index), &mtr); + } + + if (unsigned ai = index->table->persistent_autoinc) { + /* Prepare to persist the AUTO_INCREMENT value + from the index entry to PAGE_ROOT_AUTO_INC. */ + const dfield_t* dfield = dtuple_get_nth_field( + entry, ai - 1); + auto_inc = dfield_is_null(dfield) + ? 0 + : row_parse_int(static_cast<const byte*>( + dfield->data), + dfield->len, + dfield->type.mtype, + dfield->type.prtype + & DATA_UNSIGNED); + } } /* Note that we use PAGE_CUR_LE as the search mode, because then the function will return in both low_match and up_match of the cursor sensible values */ - btr_pcur_open(index, entry, PAGE_CUR_LE, mode, &pcur, &mtr); + btr_pcur_open_low(index, 0, entry, PAGE_CUR_LE, mode, &pcur, + __FILE__, __LINE__, auto_inc, &mtr); cursor = btr_pcur_get_btr_cur(&pcur); - - if (cursor) { - cursor->thr = thr; - } + cursor->thr = thr; #ifdef UNIV_DEBUG { |