author | Andrei Elkin <andrei.elkin@mariadb.com> | 2019-03-31 01:47:28 +0400 |
---|---|---|
committer | Andrei Elkin <andrei.elkin@mariadb.com> | 2020-03-14 22:45:48 +0200 |
commit | c8ae357341964382bc099392c6bedc370fa734f5 (patch) | |
tree | ed9d22f66e118e743c9e6738a9bb1bf7169051b0 /sql/rpl_parallel.cc | |
parent | 5754ea2eca0ffa191b4be46fdebf2d49c438f151 (diff) | |
download | mariadb-git-c8ae357341964382bc099392c6bedc370fa734f5.tar.gz |
MDEV-742 XA PREPAREd transaction survive disconnect/server restart
Lifted the long-standing limitation that an XA transaction is rolled
back when its connection closes, even if the XA is already prepared.
A prepared XA transaction now survives connection close and server
restart.
The patch consists of
- binary logging extension to write the prepared part of an XA
transaction, identified by its XID, in a new XA_prepare_log_event.
The conclusion part, with the Commit or Rollback decision, is logged
separately as a Query_log_event.
That is, in the binlog the XA consists of two separate groups of
events.
As a result the whole XA may interleave in the binlog with
other XA:s or regular transactions, but with no harm to
replication and data consistency.
Gtid_log_event receives two more flags to identify which of the
two XA phases of the transaction it represents. With either flag
set, XID info is also added to the event.
When binlog is ON on the server, XID::formatID is
constrained to 4 bytes.
- engines are made aware of the server policy to keep user-prepared
XA:s, so they (Innodb, rocksdb) don't roll them back
anymore in their disconnect methods.
- slave applier is refined to cope with two phase logged XA:s
including parallel modes of execution.
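
The two-group layout described above can be sketched as follows. This is
only an illustration: Event, EventType, group_ending() and count_groups()
are hypothetical names, not the server's Log_event classes. The return
values 1 and 2 follow is_group_ending() in the patch; returning 0 for
"group continues" is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified event model (not the server's Log_event classes).
enum class EventType { Gtid, Query, Xid, XaPrepare };

struct Event {
  EventType type;
  std::string query;  // only meaningful for EventType::Query
};

// Returns true when `prefix` is a prefix of `s`.
static bool starts_with(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// 1 = group ends with a commit-like event, 2 = group ends with ROLLBACK,
// 0 = group continues (assumed default).
int group_ending(const Event &ev) {
  if (ev.type == EventType::Xid || ev.type == EventType::XaPrepare)
    return 1;
  if (ev.type == EventType::Query) {
    if (ev.query == "COMMIT" || starts_with(ev.query, "XA COMMIT") ||
        starts_with(ev.query, "XA ROLLBACK"))
      return 1;
    if (ev.query == "ROLLBACK")
      return 2;
  }
  return 0;
}

// Counts the event groups in a stream: every group-ending event closes one.
int count_groups(const std::vector<Event> &stream) {
  int groups = 0;
  for (const Event &ev : stream)
    if (group_ending(ev) != 0)
      ++groups;
  return groups;
}
```

In this model a stream of GTID, a data change, XA-prepare, GTID,
"XA COMMIT 'x1'" counts as two groups, matching the prepare group and the
completion group of one XA.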
This patch does not address crash-safe logging of the new events which
is being addressed by MDEV-21469.
CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc.
are addressed according to the following policies.
1. The read-only at reconnect marks XID to fail for future
completion with ER_XA_RBROLLBACK.
2. A binlog-* filtered XA that changes engine data is regarded as
loggable even when nothing got cached for binlog. An empty
XA-prepare group is recorded. The subsequent Commit-or-Rollback
succeeds in the Engine(s) and is recorded into the binlog as well.
3. The same applies to the non-transactional engine XA.
4. @@skip_log_bin=OFF does not record anything at XA-prepare
(obviously), but the completion event is recorded into binlog to
admit inconsistency with slave.
The following actions are taken by the patch.
At XA-prepare:
when empty binlog cache - don't do anything to binlog if RO,
otherwise write empty XA_prepare (assert(binlog-filter case)).
At Disconnect:
when Prepared && RO (=> no binlogging was done)
set Xid_cache_element::error := ER_XA_RBROLLBACK
*keep* XID in the cache, and rollback the transaction.
At XA-"complete":
Discover the error, if any; in that case don't binlog the "complete"
and return the error to the user.
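
A minimal sketch of the disconnect/complete policy above, under stated
assumptions: XidCache, XidEntry and XaError are hypothetical stand-ins
(XaError::RbRollback plays the role of ER_XA_RBROLLBACK), and dropping
the entry once the XA completes is an assumption of this model, not
something the message states.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical error codes; RbRollback stands in for ER_XA_RBROLLBACK.
enum class XaError { None, RbRollback };

// Hypothetical per-XID state, loosely modelled on Xid_cache_element.
struct XidEntry {
  bool prepared = false;
  bool read_only = false;  // the XA changed no data, so nothing was binlogged
  XaError error = XaError::None;
};

class XidCache {
 public:
  // XA PREPARE: remember the XID and whether the transaction was read-only.
  void prepare(const std::string &xid, bool read_only) {
    XidEntry entry;
    entry.prepared = true;
    entry.read_only = read_only;
    entries_[xid] = entry;
  }

  // Connection close: a prepared read-only XA is marked as failed but its
  // XID stays in the cache; a prepared read-write XA is left untouched.
  void disconnect(const std::string &xid) {
    auto it = entries_.find(xid);
    if (it == entries_.end())
      return;
    if (it->second.prepared && it->second.read_only)
      it->second.error = XaError::RbRollback;
  }

  // XA COMMIT / XA ROLLBACK: report the stored error, if any. `binlogged`
  // is set to true only when the completion would be written to the binlog.
  XaError complete(const std::string &xid, bool *binlogged) {
    auto it = entries_.find(xid);
    assert(it != entries_.end());
    XaError err = it->second.error;
    *binlogged = (err == XaError::None);
    entries_.erase(it);  // assumption: the entry is dropped on completion
    return err;
  }

 private:
  std::unordered_map<std::string, XidEntry> entries_;
};
```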
Kudos
-----
Alexey Botchkov took to drive this work initially.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for making this work done!
Diffstat (limited to 'sql/rpl_parallel.cc')
-rw-r--r-- | sql/rpl_parallel.cc | 38 |
1 files changed, 25 insertions, 13 deletions
diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc
index f875ab2b4a4..6d89651b067 100644
--- a/sql/rpl_parallel.cc
+++ b/sql/rpl_parallel.cc
@@ -672,12 +672,14 @@ convert_kill_to_deadlock_error(rpl_group_info *rgi)
 static int
 is_group_ending(Log_event *ev, Log_event_type event_type)
 {
-  if (event_type == XID_EVENT)
+  if (event_type == XID_EVENT || event_type == XA_PREPARE_LOG_EVENT)
     return 1;
   if (event_type == QUERY_EVENT)  // COMMIT/ROLLBACK are never compressed
   {
     Query_log_event *qev = (Query_log_event *)ev;
-    if (qev->is_commit())
+    if (qev->is_commit() ||
+        !strncmp(qev->query, STRING_WITH_LEN("XA COMMIT")) ||
+        !strncmp(qev->query, STRING_WITH_LEN("XA ROLLBACK")))
       return 1;
     if (qev->is_rollback())
       return 2;
@@ -2088,23 +2090,34 @@ rpl_parallel_thread_pool::release_thread(rpl_parallel_thread *rpt)
   and the LOCK_rpl_thread must be released with THD::EXIT_COND() instead
   of mysql_mutex_unlock.
 
-  If the flag `reuse' is set, the last worker thread will be returned again,
+  When `gtid_ev' is not NULL the last worker thread will be returned again,
   if it is still available. Otherwise a new worker thread is allocated.
+
+  A worker for XA transaction is determined through xid hashing which
+  ensure for a XA-complete to be scheduled to the same-xid XA-prepare worker.
 */
 rpl_parallel_thread *
 rpl_parallel_entry::choose_thread(rpl_group_info *rgi, bool *did_enter_cond,
-                                  PSI_stage_info *old_stage, bool reuse)
+                                  PSI_stage_info *old_stage,
+                                  Gtid_log_event *gtid_ev)
 {
   uint32 idx;
   Relay_log_info *rli= rgi->rli;
   rpl_parallel_thread *thr;
 
   idx= rpl_thread_idx;
-  if (!reuse)
+  if (gtid_ev)
   {
-    ++idx;
-    if (idx >= rpl_thread_max)
-      idx= 0;
+    if (gtid_ev->flags2 &
+        (Gtid_log_event::FL_COMPLETED_XA | Gtid_log_event::FL_PREPARED_XA))
+      idx= my_hash_sort(&my_charset_bin, gtid_ev->xid.key(),
+                        gtid_ev->xid.key_length()) % rpl_thread_max;
+    else
+    {
+      ++idx;
+      if (idx >= rpl_thread_max)
+        idx= 0;
+    }
     rpl_thread_idx= idx;
   }
   thr= rpl_threads[idx];
@@ -2662,7 +2675,7 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
     else
     {
       DBUG_ASSERT(rli->gtid_skip_flag == GTID_SKIP_TRANSACTION);
-      if (typ == XID_EVENT ||
+      if (typ == XID_EVENT || typ == XA_PREPARE_LOG_EVENT ||
           (typ == QUERY_EVENT &&  // COMMIT/ROLLBACK are never compressed
            (((Query_log_event *)ev)->is_commit() ||
             ((Query_log_event *)ev)->is_rollback())))
@@ -2673,10 +2686,11 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
     }
   }
 
+  Gtid_log_event *gtid_ev= NULL;
   if (typ == GTID_EVENT)
   {
     rpl_gtid gtid;
-    Gtid_log_event *gtid_ev= static_cast<Gtid_log_event *>(ev);
+    gtid_ev= static_cast<Gtid_log_event *>(ev);
     uint32 domain_id= (rli->mi->using_gtid == Master_info::USE_GTID_NO ||
                        rli->mi->parallel_mode <= SLAVE_PARALLEL_MINIMAL ?
                        0 : gtid_ev->domain_id);
@@ -2715,8 +2729,7 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
       instead re-use a thread that we queued for previously.
     */
     cur_thread=
-      e->choose_thread(serial_rgi, &did_enter_cond, &old_stage,
-                       typ != GTID_EVENT);
+      e->choose_thread(serial_rgi, &did_enter_cond, &old_stage, gtid_ev);
     if (!cur_thread)
     {
       /* This means we were killed. The error is already signalled. */
@@ -2734,7 +2747,6 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
 
   if (typ == GTID_EVENT)
   {
-    Gtid_log_event *gtid_ev= static_cast<Gtid_log_event *>(ev);
     bool new_gco;
     enum_slave_parallel_mode mode= rli->mi->parallel_mode;
     uchar gtid_flags= gtid_ev->flags2;
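
The worker selection in choose_thread() can be illustrated in isolation
with the sketch below. It is not the server code: WorkerChooser and
GtidInfo are hypothetical names, the flag bit values are made up, and
std::hash stands in for my_hash_sort() purely for illustration. The point
it shows is that both XA phases of one XID map to the same worker index,
while other transactions advance round-robin and non-GTID events reuse
the last index.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical bit values; the real flags live in Gtid_log_event.
constexpr unsigned FL_PREPARED_XA = 1u << 0;
constexpr unsigned FL_COMPLETED_XA = 1u << 1;

// Hypothetical stand-in for the parts of Gtid_log_event used here.
struct GtidInfo {
  unsigned flags2;
  std::string xid_key;  // serialized XID, only used for XA events
};

class WorkerChooser {
 public:
  explicit WorkerChooser(uint32_t worker_count) : max_(worker_count) {
    assert(worker_count > 0);
  }

  // Pass nullptr for a non-GTID event: the previously chosen index is reused.
  uint32_t choose(const GtidInfo *gtid) {
    uint32_t idx = idx_;
    if (gtid) {
      if (gtid->flags2 & (FL_PREPARED_XA | FL_COMPLETED_XA)) {
        // Same XID gives the same hash, so XA-complete lands on the
        // worker that handled the matching XA-prepare.
        idx = static_cast<uint32_t>(
            std::hash<std::string>{}(gtid->xid_key) % max_);
      } else {
        // Ordinary transaction: plain round-robin.
        ++idx;
        if (idx >= max_)
          idx = 0;
      }
      idx_ = idx;
    }
    return idx;
  }

 private:
  uint32_t max_;
  uint32_t idx_ = 0;
};
```

One consequence visible in the sketch: a hashed choice also updates the
remembered index, so the next ordinary transaction continues round-robin
from the XA worker's position.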