path: root/storage/innobase/row/row0ins.cc
author: Marko Mäkelä <marko.makela@mariadb.com> 2016-12-14 19:56:39 +0200
committer: Marko Mäkelä <marko.makela@mariadb.com> 2016-12-16 09:19:19 +0200
commit: 8777458a6eb73ac1d7d864ebac390ea7039e21c1
tree: 4c8df83897aa22a8ce334e52898481d783a73b21 /storage/innobase/row/row0ins.cc
parent: 8938031bc7eb78d406553465341338038cfb2e1a
MDEV-6076 Persistent AUTO_INCREMENT for InnoDB
This should be functionally equivalent to WL#6204 in MySQL 8.0.0, with the notable difference that the file format changes are limited to repurposing a previously unused data field in B-tree pages.

For persistent InnoDB tables, write the last used AUTO_INCREMENT value to the root page of the clustered index, in the previously unused (0) PAGE_MAX_TRX_ID field, now aliased as PAGE_ROOT_AUTO_INC. Unlike some other previously unused InnoDB data fields, this one was actually always zero-initialized, at least since MySQL 3.23.49.

The writes to PAGE_ROOT_AUTO_INC are protected by SX or X latch on the root page. The SX latch will allow concurrent read access to the root page. (The field PAGE_ROOT_AUTO_INC will only be read on the first-time call to ha_innobase::open() from the SQL layer. The PAGE_ROOT_AUTO_INC can only be updated when executing SQL, so read/write races are not possible.)

During INSERT, the PAGE_ROOT_AUTO_INC is updated by the low-level function btr_cur_search_to_nth_level(), adding no extra page access. [Adaptive hash index lookup will be disabled during INSERT.]

If some rare UPDATE modifies an AUTO_INCREMENT column, the PAGE_ROOT_AUTO_INC will be adjusted in a separate mini-transaction in ha_innobase::update_row().

When a page is reorganized, we have to preserve the PAGE_ROOT_AUTO_INC field.

During ALTER TABLE, the initial AUTO_INCREMENT value will be copied from the table. ALGORITHM=COPY and online log apply in LOCK=NONE will update PAGE_ROOT_AUTO_INC in real time.

innodb_col_no(): Determine the dict_table_t::cols[] element index corresponding to a Field of a non-virtual column. (The MySQL 5.7 implementation of virtual columns breaks the 1:1 relationship between Field::field_index and dict_table_t::cols[]. Virtual columns are omitted from dict_table_t::cols[]. Therefore, we must translate the field_index of AUTO_INCREMENT columns into an index of dict_table_t::cols[].)
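The field_index translation that innodb_col_no() is described as performing can be pictured with a small stand-alone sketch. This is not the code from this commit: SketchField and sketch_col_no() are hypothetical names, and the sketch assumes only what the message states, namely that virtual columns are omitted from dict_table_t::cols[], so the cols[] position of a non-virtual column equals the number of non-virtual fields that precede it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SQL-layer Field; not the real server class.
struct SketchField {
    bool is_virtual;  // true for a virtual column that is not stored in cols[]
};

// Translate a SQL-layer field position into a position among the
// non-virtual columns only, by counting the non-virtual fields before it.
std::size_t sketch_col_no(const std::vector<SketchField>& fields,
                          std::size_t field_index)
{
    // The translation is only defined for an existing, non-virtual field.
    assert(field_index < fields.size());
    assert(!fields[field_index].is_virtual);

    std::size_t col_no = 0;
    for (std::size_t i = 0; i < field_index; i++) {
        if (!fields[i].is_virtual) {
            col_no++;
        }
    }
    return col_no;
}
```

With fields laid out as stored, virtual, stored, a column at field position 2 maps to position 1, which is why the raw field_index of an AUTO_INCREMENT column cannot be used directly once virtual columns are present.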
Upgrade from old data files: By default, the AUTO_INCREMENT sequence in old data files would appear to be reset, because PAGE_MAX_TRX_ID or PAGE_ROOT_AUTO_INC would contain the value 0 in each clustered index page. In new data files, PAGE_ROOT_AUTO_INC can only be 0 if the table is empty or does not contain any AUTO_INCREMENT column. For backward compatibility, we use the old method of SELECT MAX(auto_increment_column) for initializing the sequence.

btr_read_autoinc(): Read the AUTO_INCREMENT sequence from a new-format data file.

btr_read_autoinc_with_fallback(): A variant of btr_read_autoinc() that will resort to reading MAX(auto_increment_column) for data files that did not use AUTO_INCREMENT yet. It was manually tested that during the execution of innodb.autoinc_persist the compatibility logic is not activated (for new files, PAGE_ROOT_AUTO_INC is never 0 in nonempty clustered index root pages).

initialize_auto_increment(): Replaces ha_innobase::innobase_initialize_autoinc(). This initializes the AUTO_INCREMENT metadata. Only called from ha_innobase::open().

ha_innobase::info_low(): Do not try to lazily initialize dict_table_t::autoinc. It must already have been initialized by ha_innobase::open() or ha_innobase::create().

Note: The adjustments to class ha_innopart were not tested, because the source code (native InnoDB partitioning) is not being compiled.
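The upgrade fallback described for btr_read_autoinc_with_fallback() can be sketched as a simple decision: trust the persisted value when it is nonzero, otherwise compute the maximum the old way. The following is a simplified model, not the function from this commit. The name sketch_read_autoinc_with_fallback and both parameters are hypothetical; the real function works on InnoDB table and page structures rather than on a plain integer and a callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Simplified model of the fallback decision.
// persisted: value found in PAGE_ROOT_AUTO_INC (0 in old-format files,
//            in empty tables, or in tables without AUTO_INCREMENT).
// read_max:  hypothetical callback standing in for the old
//            SELECT MAX(auto_increment_column) method.
std::uint64_t sketch_read_autoinc_with_fallback(
    std::uint64_t persisted,
    const std::function<std::uint64_t()>& read_max)
{
    if (persisted != 0) {
        // New-format file with a recorded value: no index scan needed.
        return persisted;
    }
    // Zero is ambiguous, so fall back to the pre-existing method.
    return read_max();
}
```

In this model the callback runs only when the persisted value is 0, which mirrors the statement that the compatibility logic is not activated for nonempty tables in new data files.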
Diffstat (limited to 'storage/innobase/row/row0ins.cc')
-rw-r--r-- storage/innobase/row/row0ins.cc 39
1 files changed, 29 insertions, 10 deletions
diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
index 271d70d4da9..1f884017dd3 100644
--- a/storage/innobase/row/row0ins.cc
+++ b/storage/innobase/row/row0ins.cc
@@ -1,6 +1,7 @@
/*****************************************************************************
Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved.
+Copyright (c) 2016, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -2473,6 +2474,7 @@ row_ins_clust_index_entry_low(
dberr_t err = DB_SUCCESS;
big_rec_t* big_rec = NULL;
mtr_t mtr;
+ ib_uint64_t auto_inc = 0;
mem_heap_t* offsets_heap = NULL;
ulint offsets_[REC_OFFS_NORMAL_SIZE];
ulint* offsets = offsets_;
@@ -2487,7 +2489,6 @@ row_ins_clust_index_entry_low(
ut_ad(!thr_get_trx(thr)->in_rollback);
mtr_start(&mtr);
- mtr.set_named_space(index->space);
if (dict_table_is_temporary(index->table)) {
/* Disable REDO logging as the lifetime of temp-tables is
@@ -2496,23 +2497,41 @@ row_ins_clust_index_entry_low(
Disable locking as temp-tables are local to a connection. */
ut_ad(flags & BTR_NO_LOCKING_FLAG);
+ ut_ad(!dict_index_is_online_ddl(index));
+ ut_ad(!index->table->persistent_autoinc);
mtr.set_log_mode(MTR_LOG_NO_REDO);
- }
+ } else {
+ mtr.set_named_space(index->space);
- if (mode == BTR_MODIFY_LEAF && dict_index_is_online_ddl(index)) {
- mode = BTR_MODIFY_LEAF | BTR_ALREADY_S_LATCHED;
- mtr_s_lock(dict_index_get_lock(index), &mtr);
+ if (mode == BTR_MODIFY_LEAF
+ && dict_index_is_online_ddl(index)) {
+ mode = BTR_MODIFY_LEAF | BTR_ALREADY_S_LATCHED;
+ mtr_s_lock(dict_index_get_lock(index), &mtr);
+ }
+
+ if (unsigned ai = index->table->persistent_autoinc) {
+ /* Prepare to persist the AUTO_INCREMENT value
+ from the index entry to PAGE_ROOT_AUTO_INC. */
+ const dfield_t* dfield = dtuple_get_nth_field(
+ entry, ai - 1);
+ auto_inc = dfield_is_null(dfield)
+ ? 0
+ : row_parse_int(static_cast<const byte*>(
+ dfield->data),
+ dfield->len,
+ dfield->type.mtype,
+ dfield->type.prtype
+ & DATA_UNSIGNED);
+ }
}
/* Note that we use PAGE_CUR_LE as the search mode, because then
the function will return in both low_match and up_match of the
cursor sensible values */
- btr_pcur_open(index, entry, PAGE_CUR_LE, mode, &pcur, &mtr);
+ btr_pcur_open_low(index, 0, entry, PAGE_CUR_LE, mode, &pcur,
+ __FILE__, __LINE__, auto_inc, &mtr);
cursor = btr_pcur_get_btr_cur(&pcur);
-
- if (cursor) {
- cursor->thr = thr;
- }
+ cursor->thr = thr;
#ifdef UNIV_DEBUG
{