author | Teemu Ollakka <teemu.ollakka@galeracluster.com> | 2019-03-15 07:09:13 +0200 |
---|---|---|
committer | Jan Lindström <jan.lindstrom@mariadb.com> | 2019-03-15 07:09:13 +0200 |
commit | 1ef50a34ec53d0e3e43776f414dd99f343d5d6ba (patch) | |
tree | 59e612b17a4956d002ce1045460aafc1706d868f /sql/log.cc | |
parent | b234f81037d79b64f46a8de1602a8a7e3a45aada (diff) | |
download | mariadb-git-1ef50a34ec53d0e3e43776f414dd99f343d5d6ba.tar.gz |
10.4 wsrep group commit fixes (#1224)
* MDEV-16509 Improve wsrep commit performance with binlog disabled
Release commit order critical section early after trx_commit_low() if
binlog is not transaction coordinator. In order to avoid two phase commit,
binlog_hton is not registered for THD during IO_CACHE population.
Implemented a test which verifies that the transactions release
commit order early.
This optimization will change behavior during recovery as the commit
is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
and wsrep-recover to match the behavior.
* MDEV-18730 Ordering for wsrep binlog group commit
Previously, out-of-order execution was allowed for wsrep commits.
Established proper ordering by populating wait_for_commit
for every wsrep THD and making the group commit leader wait for
prior commits before proceeding to trx_group_commit_leader().
* MDEV-18730 Added a test case to verify correct commit ordering
* MDEV-16509, MDEV-18730 Review fixes
Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
should be registered. Whitespace/syntax fixes and cleanups.
* MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test
If the commit to InnoDB is done in one phase, the native InnoDB behavior
is that the transaction is committed in memory before it is persisted to
disk. This means that innodb_disallow_writes=ON may not prevent a
transaction from becoming visible to other readers before the commit is completely
over. On the other hand, if the commit is two phase (as it is with binlog),
the transaction will be blocked in prepare phase.
Fixed the test to use binlog, which enforces two phase commit, which
in turn makes the commit block before the changes become visible to
other connections. This guarantees that the test produces the expected
result.
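The ordering described under MDEV-18730 can be illustrated with a minimal, standalone sketch. It is not MariaDB code: CommitOrder and run_ordered_commits() are hypothetical names, and the member functions only borrow the names wait_for_prior_commit and wakeup_subsequent_commits from the patch, with simplified signatures. The assumption is that each committer holds a sequence number and may proceed only after all earlier ones have finished.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-THD wait_for_commit bookkeeping.
class CommitOrder {
public:
  // Block until every commit with a smaller sequence number has finished.
  void wait_for_prior_commit(std::size_t seqno) {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [&] { return next_ == seqno; });
  }

  // Mark this commit as finished and wake up the commits queued behind it.
  void wakeup_subsequent_commits(std::size_t seqno) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(next_ == seqno);  // only the commit whose turn it is may finish
      next_ = seqno + 1;
    }
    cond_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::size_t next_ = 0;  // sequence number allowed to commit next
};

// Start n committers in reverse order and return the order in which
// they actually committed.
std::vector<std::size_t> run_ordered_commits(std::size_t n) {
  CommitOrder order;
  std::vector<std::size_t> committed;
  std::vector<std::thread> threads;
  for (std::size_t i = n; i-- > 0;) {
    threads.emplace_back([&order, &committed, i] {
      order.wait_for_prior_commit(i);
      // Only one thread can be between wait and wakeup at a time,
      // so the vector needs no extra locking here.
      committed.push_back(i);
      order.wakeup_subsequent_commits(i);
    });
  }
  for (auto &t : threads) t.join();
  return committed;
}
```

Although the threads are started in reverse, run_ordered_commits(8) returns the sequence numbers in ascending order, because each committer waits for its predecessor before recording its commit.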
Diffstat (limited to 'sql/log.cc')
-rw-r--r-- | sql/log.cc | 48 |
1 files changed, 43 insertions, 5 deletions
diff --git a/sql/log.cc b/sql/log.cc
index c7aa72f9dd0..2b1d2867eef 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -2202,7 +2202,19 @@ void MYSQL_BIN_LOG::set_write_error(THD *thd, bool is_transactional)
   {
     my_error(ER_ERROR_ON_WRITE, MYF(0), name, errno);
   }
-
+#ifdef WITH_WSREP
+  /* If wsrep transaction is active and binlog emulation is on,
+     binlog write error may leave transaction without any registered
+     htons. This makes wsrep rollback hooks to be skipped and the
+     transaction will remain alive in wsrep world after rollback.
+     Register binlog hton here to ensure that rollback happens in full. */
+  if (WSREP_EMULATE_BINLOG(thd))
+  {
+    if (is_transactional)
+      trans_register_ha(thd, TRUE, binlog_hton);
+    trans_register_ha(thd, FALSE, binlog_hton);
+  }
+#endif /* WITH_WSREP */
   DBUG_VOID_RETURN;
 }
 
@@ -5676,7 +5688,18 @@ THD::binlog_start_trans_and_stmt()
     this->binlog_set_stmt_begin();
     bool mstmt_mode= in_multi_stmt_transaction_mode();
 #ifdef WITH_WSREP
-    /* Write Gtid
+    /*
+      With wsrep binlog emulation we can skip the rest because the
+      binlog cache will not be written into binlog. Note however that
+      because of this the hton callbacks will not get called to clean
+      up the cache, so this must be done explicitly when the transaction
+      terminates.
+    */
+    if (WSREP_EMULATE_BINLOG_NNULL(this))
+    {
+      DBUG_VOID_RETURN;
+    }
+    /* Write Gtid
          Get domain id only when gtid mode is set
          If this event is replicate through a master then ,
          we will forward the same gtid another nodes
@@ -7686,9 +7709,24 @@ MYSQL_BIN_LOG::write_transaction_to_binlog_events(group_commit_entry *entry)
 {
   int is_leader= queue_for_group_commit(entry);
 #ifdef WITH_WSREP
-  if (wsrep_run_commit_hook(entry->thd, true) && is_leader >= 0 &&
-      wsrep_ordered_commit(entry->thd, entry->all, wsrep_apply_error()))
-    return true;
+  if (wsrep_is_active(entry->thd) &&
+      wsrep_run_commit_hook(entry->thd, entry->all))
+  {
+    /*
+      Release commit order and if leader, wait for prior commit to
+      complete. This establishes total order for group leaders.
+    */
+    if (wsrep_ordered_commit(entry->thd, entry->all, wsrep_apply_error()))
+    {
+      entry->thd->wakeup_subsequent_commits(1);
+      return 1;
+    }
+    if (is_leader)
+    {
+      if (entry->thd->wait_for_prior_commit())
+        return 1;
+    }
+  }
 #endif /* WITH_WSREP */
   /*
     The first in the queue handles group commit for all; the others just wait
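The reasoning in the set_write_error() hunk, that a transaction with no registered htons skips the wsrep rollback hooks, can be sketched with a toy model. Everything below is hypothetical and greatly simplified: Participant, Transaction, register_participant() and rollback_reaches_hook() are illustrative names rather than server APIs, and the only assumption carried over from the patch comment is that rollback visits registered participants and nothing else.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a handlerton taking part in a transaction.
struct Participant {
  std::string name;
  std::function<void()> rollback;
};

// Toy transaction: rollback only reaches participants that registered.
class Transaction {
public:
  void register_participant(Participant p) {
    participants_.push_back(std::move(p));
  }

  // Run the rollback callback of every registered participant and
  // return how many callbacks were invoked.
  int rollback() {
    int called = 0;
    for (auto &p : participants_) {
      p.rollback();
      ++called;
    }
    participants_.clear();
    return called;
  }

private:
  std::vector<Participant> participants_;
};

// Simulate a write error followed by rollback. When register_on_error is
// true, the error path registers a "binlog" participant whose rollback
// callback stands in for the wsrep rollback hook.
bool rollback_reaches_hook(bool register_on_error) {
  Transaction trx;
  bool hook_ran = false;
  if (register_on_error) {
    trx.register_participant({"binlog", [&hook_ran] { hook_ran = true; }});
  }
  trx.rollback();
  return hook_ran;
}
```

In this model rollback_reaches_hook(false) leaves the hook untouched, while rollback_reaches_hook(true) runs it, which mirrors why the patch registers binlog_hton on the error path.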