author seppo <seppo.jaakola@iki.fi> 2019-11-25 11:19:33 +0200
committer Jan Lindström <jan.lindstrom@mariadb.com> 2019-11-25 11:19:33 +0200
commit 4111a53079da9850c630ce30eec7f8a38744eacd
tree fa5a3e1fac904c0d090acb7069073b897a2752cf /sql/service_wsrep.cc
parent f95288211ce1023e0d268229fbe5febbf0b2edd3
MDEV-21096 async slave crash with gtid_log_pos table access (#1413)
The original crash happened when the async replication IO thread was updating the mysql.gtid_slave_pos table. Operations on this table should remain node local, but the protection that prevents wsrep replication for this table (the THD::wsrep_ignore_table flag) was missing for InnoDB write_row() and update_row().

The issue was somewhat difficult to reproduce, because mtr seems to create the affected table mysql.gtid_slave_pos with the Aria engine, and Aria engine operations are not replicated anyhow. In a release installation, though, the mysql.gtid_slave_pos table appears to use the InnoDB engine.

It was possible to trigger a somewhat related problem by running the test galera.galera_as_slave_gtid with the configuration gtid_pos_auto_engines=InnoDB. However, this test mode causes an earlier crash: the replication background thread creates an additional table, mysql.gtid_slave_pos_InnoDB, and this table creation triggered wsrep TOI replication, which also failed on an assertion. In fact, async replication IO and background threads should not replicate anything to the cluster.

This pull request contains a new test, galera.galera_as_slave_gtid_auto_engine, which basically just runs galera.galera_as_slave_gtid with gtid_pos_auto_engines=InnoDB. The test galera.galera_as_slave_gtid is also modified for better code reuse.

The actual fix for MDEV-21096 is in storage/innobase/handler/ha_innodb.cc, where the THD::wsrep_ignore_table flag is now honored before wsrep key population. There is an additional fix in sql/service_wsrep.cc, where async replication IO and background threads are marked as non-local. This fences these threads out of wsrep replication altogether.

Note that this change actually makes the use of THD::wsrep_ignore_table redundant. We may want to refactor THD::wsrep_ignore_table out in the future if no other use case for it is in sight.
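
The guard described for storage/innobase/handler/ha_innodb.cc can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `FakeThd`, `append_keys`, and `write_row_model` are hypothetical stand-ins invented for illustration, not MariaDB types or functions, and the real handler change is not shown on this page.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for THD; only the flag discussed above is modeled.
struct FakeThd {
  bool wsrep_ignore_table = false;
  std::vector<std::string> wsrep_keys;  // keys collected for the write set
};

// Hypothetical stand-in for wsrep key population.
void append_keys(FakeThd &thd, const std::string &table) {
  thd.wsrep_keys.push_back(table);
}

// Model of a row write: the ignore flag is checked before any key is added.
// Returns true if keys were populated for replication.
bool write_row_model(FakeThd &thd, const std::string &table) {
  if (thd.wsrep_ignore_table) {
    return false;  // node-local table: skip key population entirely
  }
  append_keys(thd, table);
  return true;
}
```

In this model, a thread that has set the flag leaves its key list empty, so a write to a node-local table contributes nothing to a write set.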
Diffstat (limited to 'sql/service_wsrep.cc')
-rw-r--r-- sql/service_wsrep.cc 12
1 files changed, 11 insertions, 1 deletions
diff --git a/sql/service_wsrep.cc b/sql/service_wsrep.cc
index 35bc1b83029..5526c343d69 100644
--- a/sql/service_wsrep.cc
+++ b/sql/service_wsrep.cc
@@ -112,7 +112,17 @@ extern "C" my_bool wsrep_get_debug()
extern "C" my_bool wsrep_thd_is_local(const THD *thd)
{
-  return thd->wsrep_cs().mode() == wsrep::client_state::m_local;
+  /*
+    async replication IO and background threads have nothing to replicate in the cluster,
+    marking them as non-local here to prevent write set population and replication
+
+    async replication SQL thread, applies client transactions from mariadb master
+    and will be replicated into cluster
+  */
+  return (
+          thd->system_thread != SYSTEM_THREAD_SLAVE_BACKGROUND &&
+          thd->system_thread != SYSTEM_THREAD_SLAVE_IO &&
+          thd->wsrep_cs().mode() == wsrep::client_state::m_local);
}
extern "C" my_bool wsrep_thd_is_applying(const THD *thd)
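
The effect of the changed predicate can be checked with a simplified, self-contained model. This is a sketch, not the server code: `ThreadKind`, `ClientMode`, `ThdModel`, and `is_local_model` are hypothetical names, and the enumerators only mirror the thread types named in the diff and commit message.

```cpp
// Hypothetical, simplified stand-ins; values and types are illustrative only.
enum ThreadKind {
  NON_SYSTEM_THREAD_MODEL,  // ordinary client connection
  SLAVE_IO_MODEL,           // async replication IO thread
  SLAVE_SQL_MODEL,          // async replication SQL (applier) thread
  SLAVE_BACKGROUND_MODEL    // async replication background thread
};

enum ClientMode { MODE_LOCAL, MODE_HIGH_PRIORITY };

struct ThdModel {
  ThreadKind system_thread;
  ClientMode mode;
};

// Mirrors the shape of the new condition: IO and background threads are
// never reported as local, whatever their client-state mode is.
bool is_local_model(const ThdModel &thd) {
  return thd.system_thread != SLAVE_BACKGROUND_MODEL &&
         thd.system_thread != SLAVE_IO_MODEL &&
         thd.mode == MODE_LOCAL;
}
```

Under this model, the SQL thread in local mode still counts as local, which matches the comment in the diff that its transactions are replicated into the cluster.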