summaryrefslogtreecommitdiff
path: root/sql/rpl_gtid.cc
Commit message (Collapse)AuthorAgeFilesLines
* MDEV-8001 - mysql_reset_thd_for_next_command() takes 0.04% in OLTP ROSergey Vojtovich2015-05-131-1/+1
| | | | | | | | Removed mysql_reset_thd_for_next_command(). Call THD::reset_for_next_command() directly instead. mysql_reset_thd_for_next_command() overhead dropped 0.04% -> out of radar. THD::reset_for_next_command() overhead didn't increase.
* Merge MDEV-7888 and MDEV-7929 into 10.1.Kristian Nielsen2015-04-081-0/+29
|\
| * MDEV-7888, MDEV-7929: Parallel replication hangs sometimes on ANALYZE TABLE ↵Kristian Nielsen2015-04-081-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | or DDL The hangs occur when the group_commit_orderer object is freed before the last mark_start_commit() call on it - this loses the wakeup to other waiting worker threads, causing them to hang until killed manually. The object was freed because wakeup_subsequent_commits() was called two early in two places. For MDEV-7888, during ANALYZE TABLE, and for MDEV-7929 during record_gtid() after processing a DDL event. The group_commit_orderer object can be freed when its last transaction has called wait_for_prior_commit(). Fix by implementing a suspend/resume mechanism for wakeup_subsequent_commits() that can be used in places where a transaction is committed without this being the commit of the actual replication event group. Also add a protection mechanism (that asserts in debug builds) which can prevent the too-early free and hang if other similar bugs should remain in other parts of the code.
* | MDEV-6981: feature request MASTER_GTID_WAIT status variablesKristian Nielsen2015-03-161-3/+5
| | | | | | | | | | | | | | | | Review fixes: - Coding style - Fix bad .result file - Fix test to be tolerant of different timing. - Fix test to give better info in case of unexpected timing.
* | Merge branch 'mdev-6981-master_gtid_wait-status-variables' of ↵Kristian Nielsen2015-03-161-0/+13
|\ \ | | | | | | | | | | | | | | | | | | https://github.com/openquery/mariadb-server into danblack Conflicts: sql/mysqld.cc
| * | Add Master_gtid_wait_{count,time,timeouts} statusDaniel Black2015-03-121-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MASTER_GTID_WAIT function needs some status to evaluate its use. master_gtid_wait_count indicates how many times the function is called. master_gtid_wait_time indicates how much time in microseconds occurred waiting (or timing out) master_gtid_timeouts indicates how many time times this function timed out rather than all successful gtids events being available.
* | | after-merge fixesKristian Nielsen2015-03-041-1/+1
| | |
* | | Merge MDEV-6589 and MDEV-6403 into 10.1.Kristian Nielsen2015-03-041-0/+46
|\ \ \ | |/ / | | | | | | | | | | | | | | | Conflicts: sql/log.cc sql/rpl_rli.cc sql/sql_repl.cc
| * | MDEV-6403: Temporary tables lost at STOP SLAVE in GTID mode if master has ↵Kristian Nielsen2015-03-041-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | not rotated binlog since restart The binlog contains specially marked format description events to mark when a master restart happened (which could have caused temporary tables to be silently dropped). Such events also cause slave to close temporary tables. However, there was a bug that if after this, slave re-connects to the master in GTID mode, the master can send an old format description event again. If temporary tables are closed when such event is seen for the second time, it might drop temporary tables created after that event, and cause replication failure. With this patch, the restart flag of the format description event is cleared by the master when it is sent to the slave in a subsequent connection, to avoid the errorneous temp table close.
| * | MDEV-6589: Incorrect relay log start position when restarting SQL thread ↵Kristian Nielsen2015-03-041-0/+21
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | after error in parallel replication The problem occurs in parallel replication in GTID mode, when we are using multiple replication domains. In this case, if the SQL thread stops, the slave GTID position may refer to a different point in the relay log for each domain. The bug was that when the SQL thread was stopped and restarted (but the IO thread was kept running), the SQL thread would resume applying the relay log from the point of the most advanced replication domain, silently skipping all earlier events within other domains. This caused replication corruption. This patch solves the problem by storing, when the SQL thread stops with multiple parallel replication domains active, the current GTID position. Additionally, the current position in the relay logs is moved back to a point known to be earlier than the current position of any replication domain. Then when the SQL thread restarts from the earlier position, GTIDs encountered are compared against the stored GTID position. Any GTID that was already applied before the stop is skipped to avoid duplicate apply. This patch should have no effect if multi-domain GTID parallel replication is not used. Similarly, if both SQL and IO thread are stopped and restarted, the patch has no effect, as in this case the existing relay logs are removed and re-fetched from the master at the current global @@gtid_slave_pos.
* | MDEV-4987: Sort by domain_id when list of GTIDs are outputNirbhay Choubey2015-02-271-14/+98
|/ | | | | Added logic to sort gtid list based on domain_id before populating them in string. Added a test case.
* MDEV-5120 Test suite test maria-no-logging failsMichael Widenius2014-09-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reason for the failure was a bug in an include file on debian that causes 'struct stat' to have different sized depending on the environment. This patch fixes so that we always include my_global.h or my_config.h before we include any other files. Other things: - Removed #include <my_global.h> in some include files; Better to always do this at the top level to have as few "always-include-this-file-first' files as possible. - Removed usage of some include files that where already included by my_global.h or by other files. client/mysql_plugin.c: Use my_global.h first client/mysqlslap.c: Remove duplicated include files extra/comp_err.c: Remove duplicated include files include/m_string.h: Remove duplicated include files include/maria.h: Remove duplicated include files libmysqld/emb_qcache.cc: Use my_global.h first plugin/semisync/semisync.h: Use my_pthread.h first sql/datadict.cc: Use my_global.h first sql/debug_sync.cc: Use my_global.h first sql/derror.cc: Use my_global.h first sql/des_key_file.cc: Use my_global.h first sql/discover.cc: Use my_global.h first sql/event_data_objects.cc: Use my_global.h first sql/event_db_repository.cc: Use my_global.h first sql/event_parse_data.cc: Use my_global.h first sql/event_queue.cc: Use my_global.h first sql/event_scheduler.cc: Use my_global.h first sql/events.cc: Use my_global.h first sql/field.cc: Use my_global.h first Remove duplicated include files sql/field_conv.cc: Use my_global.h first sql/filesort.cc: Use my_global.h first Remove duplicated include files sql/gstream.cc: Use my_global.h first sql/ha_ndbcluster.cc: Use my_global.h first sql/ha_ndbcluster_binlog.cc: Use my_global.h first sql/ha_ndbcluster_cond.cc: Use my_global.h first sql/ha_partition.cc: Use my_global.h first sql/handler.cc: Use my_global.h first sql/hash_filo.cc: Use my_global.h first sql/hostname.cc: Use my_global.h first sql/init.cc: Use my_global.h first sql/item.cc: Use my_global.h first sql/item_buff.cc: Use my_global.h first sql/item_cmpfunc.cc: Use my_global.h first sql/item_create.cc: Use my_global.h first sql/item_geofunc.cc: Use my_global.h first sql/item_inetfunc.cc: Use my_global.h first sql/item_row.cc: Use my_global.h first sql/item_strfunc.cc: Use my_global.h first sql/item_subselect.cc: Use my_global.h first sql/item_sum.cc: Use my_global.h first sql/item_timefunc.cc: Use my_global.h first sql/item_xmlfunc.cc: Use my_global.h first sql/key.cc: Use my_global.h first sql/lock.cc: Use my_global.h first sql/log.cc: Use my_global.h first sql/log_event.cc: Use my_global.h first sql/log_event_old.cc: Use my_global.h first sql/mf_iocache.cc: Use my_global.h first sql/mysql_install_db.cc: Remove duplicated include files sql/mysqld.cc: Remove duplicated include files sql/net_serv.cc: Remove duplicated include files sql/opt_range.cc: Use my_global.h first sql/opt_subselect.cc: Use my_global.h first sql/opt_sum.cc: Use my_global.h first sql/parse_file.cc: Use my_global.h first sql/partition_info.cc: Use my_global.h first sql/procedure.cc: Use my_global.h first sql/protocol.cc: Use my_global.h first sql/records.cc: Use my_global.h first sql/records.h: Don't include my_global.h Better to do this at the upper level sql/repl_failsafe.cc: Use my_global.h first sql/rpl_filter.cc: Use my_global.h first sql/rpl_gtid.cc: Use my_global.h first sql/rpl_handler.cc: Use my_global.h first sql/rpl_injector.cc: Use my_global.h first sql/rpl_record.cc: Use my_global.h first sql/rpl_record_old.cc: Use my_global.h first sql/rpl_reporting.cc: Use my_global.h first sql/rpl_rli.cc: Use my_global.h first sql/rpl_tblmap.cc: Use my_global.h first sql/rpl_utility.cc: Use my_global.h first sql/set_var.cc: Added comment sql/slave.cc: Use my_global.h first sql/sp.cc: Use my_global.h first sql/sp_cache.cc: Use my_global.h first sql/sp_head.cc: Use my_global.h first sql/sp_pcontext.cc: Use my_global.h first sql/sp_rcontext.cc: Use my_global.h first sql/spatial.cc: Use my_global.h first sql/sql_acl.cc: Use my_global.h first sql/sql_admin.cc: Use my_global.h first sql/sql_analyse.cc: Use my_global.h first sql/sql_audit.cc: Use my_global.h first sql/sql_base.cc: Use my_global.h first sql/sql_binlog.cc: Use my_global.h first sql/sql_bootstrap.cc: Use my_global.h first Use my_global.h first sql/sql_cache.cc: Use my_global.h first sql/sql_class.cc: Use my_global.h first sql/sql_client.cc: Use my_global.h first sql/sql_connect.cc: Use my_global.h first sql/sql_crypt.cc: Use my_global.h first sql/sql_cursor.cc: Use my_global.h first sql/sql_db.cc: Use my_global.h first sql/sql_delete.cc: Use my_global.h first sql/sql_derived.cc: Use my_global.h first sql/sql_do.cc: Use my_global.h first sql/sql_error.cc: Use my_global.h first sql/sql_explain.cc: Use my_global.h first sql/sql_expression_cache.cc: Use my_global.h first sql/sql_handler.cc: Use my_global.h first sql/sql_help.cc: Use my_global.h first sql/sql_insert.cc: Use my_global.h first sql/sql_lex.cc: Use my_global.h first sql/sql_load.cc: Use my_global.h first sql/sql_locale.cc: Use my_global.h first sql/sql_manager.cc: Use my_global.h first sql/sql_parse.cc: Use my_global.h first sql/sql_partition.cc: Use my_global.h first sql/sql_plugin.cc: Added comment sql/sql_prepare.cc: Use my_global.h first sql/sql_priv.h: Added error if we use this before including my_global.h This check is here becasue so many files includes sql_priv.h first. sql/sql_profile.cc: Use my_global.h first sql/sql_reload.cc: Use my_global.h first sql/sql_rename.cc: Use my_global.h first sql/sql_repl.cc: Use my_global.h first sql/sql_select.cc: Use my_global.h first sql/sql_servers.cc: Use my_global.h first sql/sql_show.cc: Added comment sql/sql_signal.cc: Use my_global.h first sql/sql_statistics.cc: Use my_global.h first sql/sql_table.cc: Use my_global.h first sql/sql_tablespace.cc: Use my_global.h first sql/sql_test.cc: Use my_global.h first sql/sql_time.cc: Use my_global.h first sql/sql_trigger.cc: Use my_global.h first sql/sql_udf.cc: Use my_global.h first sql/sql_union.cc: Use my_global.h first sql/sql_update.cc: Use my_global.h first sql/sql_view.cc: Use my_global.h first sql/sys_vars.cc: Added comment sql/table.cc: Use my_global.h first sql/thr_malloc.cc: Use my_global.h first sql/transaction.cc: Use my_global.h first sql/uniques.cc: Use my_global.h first sql/unireg.cc: Use my_global.h first sql/unireg.h: Removed inclusion of my_global.h storage/archive/ha_archive.cc: Added comment storage/blackhole/ha_blackhole.cc: Use my_global.h first storage/csv/ha_tina.cc: Use my_global.h first storage/csv/transparent_file.cc: Use my_global.h first storage/federated/ha_federated.cc: Use my_global.h first storage/federatedx/federatedx_io.cc: Use my_global.h first storage/federatedx/federatedx_io_mysql.cc: Use my_global.h first storage/federatedx/federatedx_io_null.cc: Use my_global.h first storage/federatedx/federatedx_txn.cc: Use my_global.h first storage/heap/ha_heap.cc: Use my_global.h first storage/innobase/handler/handler0alter.cc: Use my_global.h first storage/maria/ha_maria.cc: Use my_global.h first storage/maria/unittest/ma_maria_log_cleanup.c: Remove duplicated include files storage/maria/unittest/test_file.c: Added comment storage/myisam/ha_myisam.cc: Move sql_plugin.h first as this includes my_global.h storage/myisammrg/ha_myisammrg.cc: Use my_global.h first storage/oqgraph/oqgraph_thunk.cc: Use my_config.h and my_global.h first One could not include my_global.h before oqgraph_thunk.h (don't know why) storage/spider/ha_spider.cc: Use my_global.h first storage/spider/hs_client/config.cpp: Use my_global.h first storage/spider/hs_client/escape.cpp: Use my_global.h first storage/spider/hs_client/fatal.cpp: Use my_global.h first storage/spider/hs_client/hstcpcli.cpp: Use my_global.h first storage/spider/hs_client/socket.cpp: Use my_global.h first storage/spider/hs_client/string_util.cpp: Use my_global.h first storage/spider/spd_conn.cc: Use my_global.h first storage/spider/spd_copy_tables.cc: Use my_global.h first storage/spider/spd_db_conn.cc: Use my_global.h first storage/spider/spd_db_handlersocket.cc: Use my_global.h first storage/spider/spd_db_mysql.cc: Use my_global.h first storage/spider/spd_db_oracle.cc: Use my_global.h first storage/spider/spd_direct_sql.cc: Use my_global.h first storage/spider/spd_i_s.cc: Use my_global.h first storage/spider/spd_malloc.cc: Use my_global.h first storage/spider/spd_param.cc: Use my_global.h first storage/spider/spd_ping_table.cc: Use my_global.h first storage/spider/spd_sys_table.cc: Use my_global.h first storage/spider/spd_table.cc: Use my_global.h first storage/spider/spd_trx.cc: Use my_global.h first storage/xtradb/handler/handler0alter.cc: Use my_global.h first storage/xtradb/handler/i_s.cc: Use my_global.h first
* MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallelunknown2014-06-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | replication causing replication to fail. Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel replication worker threads. Replace it with a better, more selective solution. The issue is with certain edge cases of InnoDB gap locks, for example between INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to block the INSERT, if the DELETE runs first, while the record lock set by INSERT does not block the DELETE, if the INSERT runs first. This can cause a conflict between the two in parallel replication on the slave even though they ran without conflicts on the master. With this patch, InnoDB will ask the server layer about the two involved transactions before blocking on a gap lock. If the server layer tells InnoDB that the transactions are already fixed wrt. commit order, as they are in parallel replication, InnoDB will ignore the gap lock and allow the two transactions to proceed in parallel, avoiding the conflict. Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now asks the server layer for any preferences about which transaction to roll back. In case of parallel replication with two transactions T1 and T2 fixed to commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the deadlock victim, not T1. This helps in some cases to avoid excessive deadlock rollback, as T2 will in any case need to wait for T1 to complete before it can itself commit. Also some misc. fixes found during development and testing: - Remove thd_rpl_is_parallel(), it is not used or needed. - Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication worker thread is killed to resolve a deadlock with fixed commit ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY can be ignored if the query otherwise completed successfully, and this could cause the deadlock kill to be lost, so that the deadlock was not correctly resolved. - Fix random test failure due to missing wait_for_binlog_checkpoint.inc. - Make sure that deadlock or other temporary errors during parallel replication are not printed to the the error log; there were some places around the replication code with extra error logging. These conditions can occur occasionally and are handled automatically without breaking replication, so they should not pollute the error log. - Fix handling of rgi->gtid_sub_id. We need to be able to access this also at the end of a transaction, to be able to detect and resolve deadlocks due to commit ordering. But this value was also used as a flag to mark whether record_gtid() had been called, by being set to zero, losing the value. Now, introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains valid for the entire duration of the transaction. - Fix one place where the code to handle ignored errors called reset_killed() unconditionally, even if no error was caught that should be ignored. This could cause loss of a deadlock kill signal, breaking deadlock detection and resolution. - Fix a couple of missing mysql_reset_thd_for_next_command(). This could cause a prior error condition to remain for the next event executed, causing assertions about errors already being set and possibly giving incorrect error handling for following event executions. - Fix code that cleared thd->rgi_slave in the parallel replication worker threads after each event execution; this caused the deadlock detection and handling code to not be able to correctly process the associated transactions as belonging to replication worker threads. - Remove useless error code in slave_background_kill_request(). - Fix bug where wfc->wakeup_error was not cleared at wait_for_commit::unregister_wait_for_prior_commit(). This could cause the error condition to wrongly propagate to a later wait_for_prior_commit(), causing spurious ER_PRIOR_COMMIT_FAILED errors. - Do not put the binlog background thread into the processlist. It causes too many result differences in mtr, but also it probably is not useful for users to pollute the process list with a system thread that does not really perform any user-visible tasks...
* MDEV-6386: Assertion `thd->transaction.stmt.is_empty() || thd->in_sub_stmt ↵Kristian Nielsen2014-06-271-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)' fails with parallel replication The direct cause of the assertion was missing error handling in record_gtid(). If ha_commit_trans() fails for the statement commit, there was missing code to catch the error and do ha_rollback_trans() in this case; this caused close_thread_tables() to assert. Normally, this error case is not hit, but in this case it was triggered due to another bug: When a transaction T1 fails during parallel replication, the code would signal following transactions that they could start to run without properly marking the error condition. This caused subsequent transactions to incorrectly start replicating, only to get an error later during their own commit step. This was particularly serious if the subsequent transactions were DDL or MyISAM updates, which cannot be rolled back and would leave replication in an inconsistent state. Fixed by 1) in case of error, only signal following transactions to continue once the error has been properly marked and those transactions will know not to start; and 2) implement proper error handling in record_gtid() in the case that statement commit fails.
* MDEV-6314 - Compile/run MariaDB with ASanSergey Vojtovich2014-06-101-2/+1
| | | | Fixed some compilation errors/warnings with ASan.
* Fixed compiler warningsMichael Widenius2014-06-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | mysys/psi_noop.c: Fixed wrong prototype sql/rpl_gtid.cc: Added #ifndef to hide not used variable storage/connect/connect.cc: Added volatile to avoid compiler warning in gcc 4.8.1 storage/connect/filamvct.cpp: Added volatile to avoid compiler warning in gcc 4.8.1 storage/maria/ma_checkpoint.c: Removed cast to avoid compiler warning storage/myisam/mi_delete_table.c: Added attribute to avoid compiler warning storage/tokudb/ha_tokudb.cc: Use LINT_INIT_STRUCT to avoid compiler warnings storage/tokudb/hatoku_hton.cc: Use LINT_INIT_STRUCT to avoid compiler warnings storage/tokudb/tokudb_card.h: Use LINT_INIT_STRUCT to avoid compiler warnings storage/tokudb/tokudb_status.h: Use LINT_INIT_STRUCT to avoid compiler warnings
* MDEV-5804: If same GTID is received on multiple master connections in ↵unknown2014-03-121-18/+89
| | | | | | | | | multi-source replication, the event is double-executed causing corruption or replication failure Some fixes, mainly to make it work in non-parallel replication mode also (--slave-parallel-threads=0). Patch should be fairly complete now.
* MDEV-5804: If same GTID is received on multiple master connections in ↵unknown2014-03-091-5/+109
| | | | | | | | | | | | | | | | | | | | | | | | | multi-source replication, the event is double-executed causing corruption or replication failure Before, the arrival of same GTID twice in multi-source replication would cause double-apply or in gtid strict mode an error. Keep the behaviour, but add an option --gtid-ignore-duplicates which allows to correctly handle duplicates, ignoring all but the first. This relies on the user ensuring correct configuration so that sequence numbers are strictly increasing within each replication domain; then duplicates can be detected simply by comparing the sequence numbers against what is already applied. Only one master connection (but possibly multiple parallel worker threads within that connection) is allowed to apply events within one replication domain at a time; any other connection that receives a GTID in the same domain either discards it (if it is already applied) or waits for the other connection to not have any events to apply. Intermediate patch, as proof-of-concept for testing. The main limitation is that currently it is only implemented for parallel replication, @@slave_parallel_threads > 0.
* Merge MariaDB 10.0-base to 10.0.unknown2014-02-101-19/+487
|\
| * MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.unknown2014-02-081-156/+183
| | | | | | | | | | | | | | | | Rewrite the gtid_waiting::wait_for_gtid() function. The code was rubbish (and buggy). Now the logic is much clearer. Also fix a missing slave sync that could cause test failure.
| * MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.unknown2014-02-081-5/+5
| | | | | | | | Couple of small fixes following buildbot testing.
| * MDEV-4984: Implement MASTER_GTID_WAIT() and @@LAST_GTID.unknown2014-02-071-19/+461
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MASTER_GTID_WAIT() is similar to MASTER_POS_WAIT(), but works with a GTID position rather than an old-style filename/offset. @@LAST_GTID gives the GTID assigned to the last transaction written into the binlog. Together, the two can be used by applications to obtain the GTID of an update on the master, and then do a MASTER_GTID_WAIT() for that position on any read slave where it is important to get results that are caught up with the master at least to the point of the update. The implementation of MASTER_GTID_WAIT() is implemented in a way that tries to minimise the performance impact on the SQL threads, even in the presense of many waiters on single GTID positions (as from @@LAST_GTID).
* | Replication changes for CREATE OR REPLACE TABLEMichael Widenius2014-02-051-1/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - CREATE TABLE is by default executed on the slave as CREATE OR REPLACE - DROP TABLE is by default executed on the slave as DROP TABLE IF NOT EXISTS This means that a slave will by default continue even if we try to create a table that existed on the slave (the table will be deleted and re-created) or if we try to drop a table that didn't exist on the slave. This should be safe as instead of having the slave stop because of an inconsistency between master and slave, it will fix the inconsistency. Those that would prefer to get a stopped slave instead for the above cases can set slave_ddl_exec_mode to STRICT. - Ensure that a CREATE OR REPLACE TABLE which dropped a table is replicated - DROP TABLE that generated an error on master is handled as an identical DROP TABLE on the slave (IF NOT EXISTS is not added in this case) - Added slave_ddl_exec_mode variable to decide how DDL's are replicated New logic for handling BEGIN GTID ... COMMIT from the binary log: - When we find a BEGIN GTID, we start a transaction and set OPTION_GTID_BEGIN - When we find COMMIT, we reset OPTION_GTID_BEGIN and execute the normal COMMIT code. - While OPTION_GTID_BEGIN is set: - We don't generate implict commits before or after statements - All tables are regarded as transactional tables in the binary log (to ensure things are executed exactly as on the master) - We reset OPTION_GTID_BEGIN also on rollback This will help ensuring that we don't get any sporadic commits (and thus new GTID's) on the slave and will help keep the GTID's between master and slave in sync. mysql-test/extra/rpl_tests/rpl_log.test: Added testing of mode slave_ddl_exec_mode=STRICT mysql-test/r/mysqld--help.result: New help messages mysql-test/suite/rpl/r/create_or_replace_mix.result: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/r/create_or_replace_row.result: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/r/create_or_replace_statement.result: Testing replication of create or replace mysql-test/suite/rpl/r/rpl_gtid_startpos.result: Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave. mysql-test/suite/rpl/r/rpl_row_log.result: Updated result mysql-test/suite/rpl/r/rpl_row_log_innodb.result: Updated result mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result: Updated result mysql-test/suite/rpl/r/rpl_stm_log.result: Updated result mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result: Updated result mysql-test/suite/rpl/r/rpl_temp_table_mix_row.result: Updated result mysql-test/suite/rpl/t/create_or_replace.inc: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/create_or_replace_mix.cnf: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/create_or_replace_mix.test: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/create_or_replace_row.cnf: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/create_or_replace_row.test: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/create_or_replace_statement.cnf: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/create_or_replace_statement.test: Testing of CREATE OR REPLACE TABLE with replication mysql-test/suite/rpl/t/rpl_gtid_startpos.test: Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave. mysql-test/suite/rpl/t/rpl_stm_log.test: Removed some lines mysql-test/suite/sys_vars/r/slave_ddl_exec_mode_basic.result: Testing of slave_ddl_exec_mode mysql-test/suite/sys_vars/t/slave_ddl_exec_mode_basic.test: Testing of slave_ddl_exec_mode sql/handler.cc: Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set. This is to ensure that statments are not commited too early if non transactional tables are used. sql/log.cc: Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set. Also treat 'direct' log events as transactional (to get them logged as they where on the master) sql/log_event.cc: Ensure that the new error from DROP TABLE when trying to drop a view is treated same as the old one. Store error code that slave expects in THD. Set OPTION_GTID_BEGIN if we find a BEGIN. Reset OPTION_GTID_BEGIN if we find a COMMIT. sql/mysqld.cc: Added slave_ddl_exec_mode_options sql/mysqld.h: Added slave_ddl_exec_mode_options sql/rpl_gtid.cc: Reset OPTION_GTID_BEGIN if we record a gtid (safety) sql/sql_class.cc: Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set. sql/sql_class.h: Added to THD: log_current_statement and slave_expected_error sql/sql_insert.cc: Ensure that CREATE OR REPLACE is logged if table was deleted. Don't do implicit commit for CREATE if we are under OPTION_GTID_BEGIN sql/sql_parse.cc: Change CREATE TABLE -> CREATE OR REPLACE TABLE for slaves Change DROP TABLE -> DROP TABLE IF EXISTS for slaves CREATE TABLE doesn't force implicit commit in case of OPTION_GTID_BEGIN Don't do commits before or after any statement if OPTION_GTID_BEGIN was set. sql/sql_priv.h: Added OPTION_GTID_BEGIN sql/sql_show.cc: Enhanced store_create_info() to also be able to handle CREATE OR REPLACE sql/sql_show.h: Updated prototype sql/sql_table.cc: Ensure that CREATE OR REPLACE is logged if table was deleted. sql/sys_vars.cc: Added slave_ddl_exec_mode sql/transaction.cc: Added warning if we got a GTID under OPTION_GTID_BEGIN
* MDEV-4816: Incorrect disabling of binlog for mysql.gtid_slave_pos updateunknown2013-11-271-6/+6
| | | | | | The update of mysql.gtid_slave_pos during event replication used the wrong way to disable binary logging for the table updates. Fix to instead temporarily clear the OPT_BIN_LOG flag.
* MDEV-5306: Missing locking around rpl_global_gtid_binlog_stateunknown2013-11-181-40/+139
| | | | | | | | | | There were some places where insufficient locking between parallel threads could cause invalid memory accesses and possibly other grief. This patch adds the missing locking, and moves the locking into the struct rpl_binlog_state methods to make it easier to see that proper locking is in place everywhere.
* 5.5. mergeSergei Golubchik2013-11-131-1/+1
|
* Merge MDEV-4506: Parallel replication into 10.0-base.unknown2013-11-011-16/+20
|\
| * MDEV-5189: Incorrect parallel apply in parallel replicationunknown2013-10-251-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two problems were fixed: 1. When not in GTID mode (master_use_gtid=no), then we must not apply events in different domains in parallel (in non-GTID mode we are not capable of restarting at different points in different domains). 2. When transactions B and C group commit together, but after and separate from A, we can apply B and C in parallel, but both B and C must not start until A has committed. Fix sub_id to be globally increasing (not just per-domain increasing) so that this wait (which is based on sub_id) can be done correctly.
| * Fixes for parallel slave:Michael Widenius2013-10-141-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Made slaves temporary table multi-thread slave safe by adding mutex around save_temporary_table usage. - rli->save_temporary_tables is the active list of all used temporary tables - This is copied to THD->temporary_tables when temporary tables are opened and updated when temporary tables are closed - Added THD->lock_temporary_tables() and THD->unlock_temporary_tables() to simplify this. - Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code. - Added is_part_of_group() to mark functions that are part of the next function. This replaces setting IN_STMT when events are executed. - Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code. - If slave_skip_counter is set run things in single threaded mode. This simplifies code for skipping events. - Updating state of relay log (IN_STMT and IN_TRANSACTION) is moved to one single function: update_state_of_relay_log() We can't use OPTION_BEGIN to check for the state anymore as the sql_driver and sql execution threads may be different. Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts is_in_group() is now independent of state of executed transaction. - Reset thd->transaction.all.modified_non_trans_table() if we did set it for single table row events. This was mainly for keeping the flag as documented. - Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it. - Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock - Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond - Changed some functions to take rpl_group_info instead of Relay_log_info to make them multi-slave safe and to simplify usage - do_shall_skip() - continue_group() - sql_slave_killed() - next_event() - Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure. - set_thd_in_use_temporary_tables() removed as in_use is set on usage - Added information to thd_proc_info() which thread is waiting for slave mutex to exit. - In open_table() reuse code from find_temporary_table() Other things: - More DBUG statements - Fixed the rpl_incident.test can be run with --debug - More comments - Disabled not used function rpl_connect_master() mysql-test/suite/perfschema/r/all_instances.result: Moved sleep_lock and sleep_cond to rpl_group_info mysql-test/suite/rpl/r/rpl_incident.result: Updated result mysql-test/suite/rpl/t/rpl_incident-master.opt: Not needed anymore mysql-test/suite/rpl/t/rpl_incident.test: Fixed that test can be run with --debug sql/handler.cc: More DBUG_PRINT sql/log.cc: More comments sql/log_event.cc: Added DBUG statements do_shall_skip(), continue_group() now takes rpl_group_info param Use is_begin(), is_commit() and is_rollback() functions instead of inspecting query string We don't have set slaves temporary tables 'in_use' as this is now done when tables are opened. Removed IN_STMT flag setting. This is now done in update_state_of_relay_log() Use IN_TRANSACTION flag to test state of relay log. In rows_event_stmt_cleanup() reset thd->transaction.all.modified_non_trans_table if we had set this before. sql/log_event.h: do_shall_skip(), continue_group() now takes rpl_group_info param Added is_part_of_group() to mark events that are part of the next event. This replaces setting IN_STMT when events are executed. Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code. sql/log_event_old.cc: Removed IN_STMT flag setting. This is now done in update_state_of_relay_log() do_shall_skip(), continue_group() now takes rpl_group_info param sql/log_event_old.h: Added is_part_of_group() to mark events that are part of the next event. do_shall_skip(), continue_group() now takes rpl_group_info param sql/mysqld.cc: Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it. Relay_log_info::sleep_lock -> Rpl_group_info::sleep_lock Relay_log_info::sleep_cond -> Rpl_group_info::sleep_cond sql/mysqld.h: Updated types and names sql/rpl_gtid.cc: More DBUG sql/rpl_parallel.cc: Updated TODO section Set thd for event that is execution Use new is_begin(), is_commit() and is_rollback() functions. More comments sql/rpl_rli.cc: sql_thd -> sql_driver_thd Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts. Reset table->in_use for temporary tables as the table may have been used by another THD. Use IN_TRANSACTION instead of OPTION_BEGIN to check state of relay log. Removed IN_STMT flag setting. This is now done in update_state_of_relay_log() sql/rpl_rli.h: Changed relay log state flags to bit masks instead of bit positions (most other code we have uses bit masks) Added IN_TRANSACTION to mark if we are in a BEGIN ... COMMIT section. save_temporary_tables is now thread safe Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code is_in_group() is now independent of state of executed transaction. sql/slave.cc: Simplifed arguments to io_salve_killed(), sql_slave_killed() and check_io_slave_killed(); No reason to supply THD as this is part of the given structure. set_thd_in_use_temporary_tables() removed as in_use is set on usage in sql_base.cc sql_thd -> sql_driver_thd More DBUG Added update_state_of_relay_log() which will calculate the IN_STMT and IN_TRANSACTION state of the relay log after the current element is executed. If slave_skip_counter is set run things in single threaded mode. Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure. Added information to thd_proc_info() which thread is waiting for slave mutex to exit. Disabled not used function rpl_connect_master() Updated argument to next_event() sql/sql_base.cc: Added mutex around usage of slave's temporary tables. The active list is always kept up to date in sql->rgi_slave->save_temporary_tables. Clear thd->temporary_tables after query (safety) More DBUG When using temporary table, set table->in_use to current thd as the THD may be different for slave threads. Some code is ifdef:ed with REMOVE_AFTER_MERGE_WITH_10 as the given code in 10.0 is not yet in this tree. In open_table() reuse code from find_temporary_table() sql/sql_binlog.cc: rli->sql_thd -> rli->sql_driver_thd Remove duplicate setting of rgi->rli sql/sql_class.cc: Added helper functions rgi_lock_temporary_tables() and rgi_unlock_temporary_tables() Would have been nicer to have these inline, but there was no easy way to do that sql/sql_class.h: Added functions to protect slaves temporary tables sql/sql_parse.cc: Added DBUG_PRINT sql/transaction.cc: Added comment
| * MDEV-4506, parallel replication.unknown2013-09-131-1/+1
| | | | | | | | Some after-review fixes.
| * MDEV-4506: Parallel replication. Intermediate commit.unknown2013-07-031-3/+2
| | | | | | | | | | Pass down rpl_group_info * to remove one instance of non-threadsafe use of rli->group_info.
| * MDEV-4506: Parallel replication: Intermediate commit.unknown2013-06-281-4/+5
| | | | | | | | | | First step of splitting out part of Relay_log_info, so that different event groups being applied in parallel can each use their own copy.
* | MDEV-26: Global transaction ID.unknown2013-08-231-0/+74
| | | | | | | | | | | | Implement @@gtid_binlog_state. This is the internal state of the binlog (most recent GTID logged for every domain_id and server_id). This allows to save the state before RESET MASTER and restore it afterwards.
* | MDEV-4488: When master is on the list of ignore_server_ids, GTID position on ↵unknown2013-08-221-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | slave is not updated The ignored events are not written to the relay log, but instead a fake Rotate event is generated to handle update of position. Extend this for Gtid so we similarly generate a fake Gtid_list event to update the GTID position. Also fix an unrelated test issue that got triggered by the added test cases.
* | MDEV-4820: Empty master does not give error for slave GTID position that ↵unknown2013-08-161-18/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | does not exist in the binlog The main bug here was the following situation: Suppose we set up a completely new master2 as an extra multi-master to an existing slave that already has a different master1 for domain_id=0. When the slave tries to connect to master2, master2 will not have anything that slave requests in domain_id=0, but that is fine as master2 is supposedly meant to serve eg. domain_id=1. (This is MDEV-4485). But suppose that master2 then actually starts sending events from domain_id=0. In this case, the fix for MDEV-4485 was incomplete, and the code would fail to give the error that the position requested by the slave in domain_id=0 was missing from the binlogs of master2. This could lead to lost events or completely wrong replication. The patch for this bug fixes this issue. In addition, it cleans up the code a bit, getting rid of the fake_gtid_hash in the code. And the error message when slave and master have diverged due to alternate future is clarified, as requested in the bug description.
* | MDEV-4688: empty @@gtid_slave_pos during slave commit.unknown2013-06-211-7/+33
| | | | | | | | | | | | In record_gtid(), too many rows were deleted from the slave position hash - we need to always keep on to the most recent committed row, so we have a valid slave position at all times.
* | MDEV-4686: Temporary variable for sub_id is 32-bit, should be 64-bitunknown2013-06-201-3/+3
|/ | | | | Fix wrong type for sub_id, which would cause overflow grief on long-running server. Also some related renames for consistency.
* MDEV-4486: Allow to start old-style replication even if ↵unknown2013-06-071-0/+12
| | | | | | | | | | | | | | | mysql.rpl_slave_state is unavailable If the mysql.gtid_slave_pos table is not available, we cannot load nor update the current GTID position persistently. This can happen eg. after an upgrade, before mysql_upgrade_db is run, or if the table is InnoDB and the server is restarted without the InnoDB storage engine enabled. Before, replication always failed to start if the table was unavailable. With this patch, we try to continue with old-style replication, after suitable complaints in the error log. In strict mode, or if slave is configured to use GTID, slave still refuses to start.
* MDEV-26: Global transaction ID.unknown2013-06-051-25/+66
| | | | | | | | | | | | | | | | | | | | | | Fix problems related to reconnect. When we need to reconnect (ie. explict stop/start of just the IO thread by user, or automatic reconnect due to loosing network connection with the master), it is a bit complex to correctly resume at the right point without causing duplicate or missing events in the relay log. The previous code had multiple problems in this regard. With this patch, the problem is solved as follows. The IO thread keeps track (in memory) of which GTID was last queued to the relay log. If it needs to reconnect, it resumes at that GTID position. It also counts number of events received within the last, possibly partial, event group, and skips the same number of events after a reconnect, so that events already enqueued before the reconnect are not duplicated. (There is no need to keep any persistent state; whenever we restart slave threads after both of them being stopped (such as after server restart), we erase the relay logs and start over from the last GTID applied by SQL thread. But while the SQL thread is running, this patch is needed to get correct relay log).
* MDEV-4485: Incorrect error handling in record_gtid().unknown2013-05-291-7/+52
| | | | | Fix the error handling when access to the table mysql.gtid_slave_pos fails for whatever reason. Add some test cases.
* MDEV-4478: Implement GTID "strict mode"unknown2013-05-281-51/+187
| | | | | | | | | | | When @@GLOBAL.gtid_strict_mode=1, then certain operations result in error that would otherwise result in out-of-order binlog files between servers. GTID sequence numbers are now allocated independently per domain; this results in less/no holes in GTID sequences, increasing the likelyhood that diverging binlogs will be caught by the slave when GTID strict mode is enabled.
* MDEV-26: Global transaction ID.unknown2013-05-221-11/+27
| | | | | | | | | | | | | | | | Change of user interface to be more logical and more in line with expectations to work similar to old-style replication. User can now explicitly choose in CHANGE MASTER whether binlog position is taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos= slave_pos) when slave connects to master. @@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and @@gtid_current_pos (a combination of the two, most recent GTID within each domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match. This fixes MDEV-4474.
* MDEV-26: Global transaction ID.unknown2013-05-151-1/+24
| | | | | | | Implement START SLAVE UNTIL master_gtid_pos = "<GTID position>". Add test cases, including a test showing how to use this to promote a new master among a set of slaves.
* Fixed errors and compiler warnings found by buildbotMichael Widenius2013-05-051-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Solaris fixes: - Fixed that wait_timeout_func and wait_timeout tests works on solaris - We have to compile without NO_ALARM on Solaris as Solaris doesn't support timeouts on sockets with setsockopt(.. SO_RCVTIMEO). - Fixed that compile-solaris-amd64-debug works (before that we got a wrong ELF class: ELFCLASS64 on linkage) - Added missing sync_with_master Other bug fixes: - Free memory for rpl_global_gtid_binlog_state before exit() to avoid 'accessing uninitalized mutex' error. BUILD/FINISH.sh: Fixed issues on Solaris with ksh BUILD/compile-solaris-amd64-debug: Added missing -m64 flag configure.cmake: We have to compile without NO_ALARM on Solaris as Solaris doesn't support timeouts on sockets with setsockopt(.. SO_RCVTIMEO) mysql-test/suite/rpl/t/rpl_gtid_mdev4473.test: - Added missing sync_with_master (fix by knielsen) sql-common/client.c: Added () to get rid of compiler warning sql/item_strfunc.cc: Fixed compiler warning sql/log.cc: Free memory for static variable rpl_global_gtid_binlog_state before exit() - If we are compiling with safemalloc, we would try to call sf_free() for some members after sf_terminate() was called, which would result of trying to access the uninitalized mutex 'sf_mutex' sql/multi_range_read.cc: Fixed compiler warnings of converting double to ulong. sql/opt_range.cc: Fixed compiler warnings of converting double to ulong or uint - Better to have all variables that can be number of rows as 'ha_rows' sql/rpl_gtid.cc: Added rpl_binlog_state::free() to be able to free memory for static objects before exit() sql/rpl_gtid.h: Added rpl_binlog_state::free() to be able to free memory for static objects before exit() sql/set_var.cc: Fixed compiler warning sql/sql_join_cache.cc: Fixed compiler warnings of converting double to uint sql/sql_show.cc: Added cast to get rid of compiler warning sql/sql_statistics.cc: Remove code that didn't do anything. (store_record() with record[0] is a no-op) storage/xtradb/os/os0file.c: Added __attribute__ ((unused)) support-files/compiler_warnings.supp: Ignore warnings from atomic_add_64_nv (was not able to fix this with a cast as the macro is a bit different between systems) vio/viosocket.c: Added more DBUG_PRINT
* MDEV-26: Global transaction ID.unknown2013-03-281-0/+1
| | | | | | | | Add tests crashing the slave in the middle of replication and checking that replication picks-up again on restart in a crash-safe way. Fix silly code that causes crash by inserting uninitialised data into a hash.
* MDEV-26: Global transaction ID.unknown2013-03-271-8/+8
| | | | | | | | | | | | | | | | Implement test case rpl_gtid_stop_start.test to test normal stop and restart of master and slave mysqld servers. Fix a couple bugs found with the test: - When InnoDB is disabled (no XA), the binlog state was not read when master mysqld starts. - Remove old code that puts a bogus D-S-0 into the initial binlog state, it is not correct in current design. - Fix memory leak in gtid_find_binlog_file().
* MDEV-26: Global transaction ID.unknown2013-03-261-2/+3
| | | | | | | Fix missing error check for applying Gtid_log_event. Fix a couple compiler warnings.
* MDEV-26: Global transaction ID.unknown2013-03-221-0/+3
| | | | Fix 3 small issues reported by Pavel Ivanov.
* MDEV-26: Global transaction ID.unknown2013-03-211-1/+8
| | | | | Fix error handling when record_gtid() fails to update the mysql.rpl_slave_state table.
* MDEV-26: Global transaction ID.unknown2013-03-181-2/+69
| | | | | | | | | | | | | Fix things so that a master can switch with MASTER_GTID_POS=AUTO to a slave that was previously running with log_slave_updates=0, by looking into the slave replication state on the master when the slave requests something not present in the binlog. Be a bit more strict about what position the slave can ask for, to avoid some easy-to-hit misconfiguration errors. Start over with seq_no counter when RESET MASTER.