path: root/sql/semisync_master.cc
Commit message [Author, Age, Files, Lines]
* Merge branch '10.4' into 10.5 [Sergei Golubchik, 2022-10-02, 1 file, -1/+5]
|\
| * MDEV-29613 fixup: clang -Wunused-but-set-variable [Marko Mäkelä, 2022-09-26, 1 file, -1/+5]
| |
* | Merge branch '10.4' into 10.5 [Sergei Golubchik, 2022-05-09, 1 file, -7/+58]
|\ \
| |/
| * MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state [Brandon Nesterenko, 2022-04-22, 1 file, -7/+58]
| |
| |     Problem:
| |     ========
| |     If a primary is shut down during an active semi-sync connection,
| |     during the period when the primary is awaiting an ACK, the primary
| |     hard kills the active communication thread and does not ensure the
| |     transaction was received by a replica. This can lead to an
| |     inconsistent replication state.
| |
| |     Solution:
| |     ========
| |     During shutdown, the primary should wait for an ACK or timeout
| |     before hard killing a thread which is awaiting a communication. We
| |     extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore
| |     any threads waiting for a semi-sync ACK in phase 1. Then, before
| |     stopping the ack receiver thread, the shutdown is delayed until all
| |     waiting semi-sync connections receive an ACK or time out. The
| |     connections are then killed in phase 2.
| |
| |     Notes:
| |     1) There remains an unresolved corner case that affects this patch.
| |        MDEV-28141: Slave crashes with Packets out of order when
| |        connecting to a shutting down master. Specifically, if a slave
| |        is connecting to a master which is actively shutting down, the
| |        slave can crash with a "Packets out of order" assertion error.
| |        To get around this issue in the MTR tests, the primary will wait
| |        a small amount of time before phase 1 killing threads to let the
| |        replicas safely stop (if applicable).
| |     2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver
| |        Thread Can Error on COM_QUIT
| |
| |     Reviewed By
| |     ============
| |     Andrei Elkin <andrei.elkin@mariadb.com>
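The shutdown behaviour described for MDEV-11853 (delay the kill until every waiting connection has received an ACK or timed out) can be illustrated with a small standalone sketch. This is not the server's code: `AckWaitTracker`, `begin_wait`, `end_wait` and `await_all` are hypothetical names, and the sketch assumes a single counter of waiting connections guarded by a mutex and condition variable.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical model (not a MariaDB class): counts connections that are
// currently blocked waiting for a semi-sync ACK.
class AckWaitTracker {
public:
  // A connection starts waiting for an ACK.
  void begin_wait() {
    std::lock_guard<std::mutex> guard(m_lock);
    ++m_waiting;
  }

  // The connection got its ACK, or its own wait timed out.
  void end_wait() {
    {
      std::lock_guard<std::mutex> guard(m_lock);
      assert(m_waiting > 0);
      --m_waiting;
    }
    m_cond.notify_all();
  }

  // Shutdown path: block until no connection is waiting or until
  // max_delay has elapsed. Returns true if all waiters finished in time.
  bool await_all(std::chrono::milliseconds max_delay) {
    std::unique_lock<std::mutex> guard(m_lock);
    return m_cond.wait_for(guard, max_delay,
                           [this] { return m_waiting == 0; });
  }

private:
  std::mutex m_lock;
  std::condition_variable m_cond;
  unsigned m_waiting = 0;
};
```

In this model a shutdown routine would call `await_all()` between its first and second kill phases, and proceed to the hard kill regardless of the return value once the delay has expired.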
* | Merge 10.4 into 10.5 [Marko Mäkelä, 2022-03-29, 1 file, -0/+1]
|\ \
| |/
| * MDEV-25580: rpl.rpl_semi_sync_slave_compressed_protocol crashes because of wrong packet [Brandon Nesterenko, 2022-03-24, 1 file, -0/+1]
| |
| |     Problem:
| |     ========
| |     When both semi-sync and slave compression are enabled, the numbering
| |     on packet headers can become out of sync between the primary and
| |     replica servers. More specifically, after the master flushes its
| |     write, it should increment the counters that track packets. The bug
| |     is such that the master only updates the normal packet counter and
| |     leaves the compressed packet counter alone.
| |
| |     Solution:
| |     ========
| |     After the master flushes, additionally increment the compressed
| |     packet counter.
| |
| |     Reviewed By:
| |     ============
| |     Andrei Elkin: <andrei.elkin@mariadb.com>
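The counter mismatch described for MDEV-25580 can be modelled in a few lines. The sketch below is an illustration only: `PacketCounters` and the helper functions are hypothetical, the field names merely echo the "normal" and "compressed" counters from the commit message, and it assumes the peer advances both counters at each step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one side's packet bookkeeping; this is not
// the server's network structure.
struct PacketCounters {
  std::uint8_t pkt_nr = 0;           // sequence number of plain packets
  std::uint8_t compress_pkt_nr = 0;  // sequence number of compressed packets
};

// Behaviour described as the bug: only the plain counter advances.
inline void after_flush_buggy(PacketCounters &c) { ++c.pkt_nr; }

// Behaviour described as the fix: both counters advance together.
inline void after_flush_fixed(PacketCounters &c) {
  ++c.pkt_nr;
  ++c.compress_pkt_nr;
}

// True when both sides agree on both sequence numbers.
inline bool in_sync(const PacketCounters &a, const PacketCounters &b) {
  return a.pkt_nr == b.pkt_nr && a.compress_pkt_nr == b.compress_pkt_nr;
}
```

With compression enabled, a side that runs `after_flush_buggy` falls one compressed sequence number behind a peer that advanced both, which is the kind of disagreement the commit message reports as a wrong packet.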
* | Change THD->transaction to a pointer to enable multiple transactions [Monty, 2020-05-23, 1 file, -1/+1]
| |
| |     All changes (except one) are of type
| |     thd->transaction. -> thd->transaction->
| |
| |     thd->transaction points by default to 'thd->default_transaction'.
| |     This allows us to 'easily' have multiple active transactions for a
| |     THD object, like when reading data from the mysql.proc table
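The pointer indirection described in the THD->transaction commit can be sketched with reduced stand-in types. `Session`, `Transaction` and `ScopedTransaction` below are hypothetical; only the idea that the pointer defaults to an embedded `default_transaction` and can be redirected temporarily comes from the commit message.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for per-transaction state.
struct Transaction {
  int changed_tables = 0;
};

// Hypothetical stand-in for a session object such as THD.
struct Session {
  Transaction default_transaction;
  Transaction *transaction = &default_transaction;  // default target

  Session() = default;
  Session(const Session &) = delete;             // a copy would leave the
  Session &operator=(const Session &) = delete;  // pointer aimed at the source
};

// RAII helper: point the session at another transaction for one scope,
// then restore whatever was active before.
class ScopedTransaction {
public:
  ScopedTransaction(Session &session, Transaction &temporary)
      : m_session(session), m_saved(session.transaction) {
    m_session.transaction = &temporary;
  }
  ~ScopedTransaction() { m_session.transaction = m_saved; }

  ScopedTransaction(const ScopedTransaction &) = delete;
  ScopedTransaction &operator=(const ScopedTransaction &) = delete;

private:
  Session &m_session;
  Transaction *m_saved;
};
```

Code that previously wrote `session.transaction.member` writes `session.transaction->member` after such a change, which matches the mechanical rewrite the commit message describes.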
* | perfschema memory related instrumentation changes [Sergei Golubchik, 2020-03-10, 1 file, -2/+2]
|/
* Remove \n from DBUG_PRINT statements [Michael Widenius, 2019-10-21, 1 file, -1/+1]
|
* Revert THD::THD(skip_global_sys_var_lock) argument (bb-10.3-svoj-MDEV-14984) [Sergey Vojtovich, 2019-05-03, 1 file, -5/+6]
|
|     Originally introduced by e972125f1 to avoid harmless wait for
|     LOCK_global_system_variables in a newly created thread, whose
|     creation was initiated by system variable update.
|
|     At the same time it opens a dangerous hole, when the system variable
|     update thread has already released LOCK_global_system_variables and
|     the ack_receiver thread hasn't yet completed new THD construction.
|     In this case the THD constructor goes completely unprotected.
|
|     Since ack_receiver.stop() waits for the thread to go down, we have to
|     temporarily release LOCK_global_system_variables so that it doesn't
|     deadlock with ack_receiver.run(). Unfortunately it breaks atomicity
|     of rpl_semi_sync_master_enabled updates and makes them not
|     serialized.
|
|     LOCK_rpl_semi_sync_master_enabled was introduced to work around the
|     above.
|
|     TODO: move ack_receiver start/stop into repl_semisync_master
|     enable_master/disable_master under LOCK_binlog protection?
|
|     Part of MDEV-14984 - regression in connect performance
* MDEV-18096 The server would crash when it has configs rpl_semi_sync_master_enabled = OFF rpl_semi_sync_master_wait_no_slave = OFF [Andrei Elkin, 2019-04-19, 1 file, -14/+13]
|
|     The patch fixes a fired assert in the semisync master module. The
|     assert caught an attempt to switch semisync off (per
|     rpl_semi_sync_master_wait_no_slave = OFF) when it was not even
|     initialized (per rpl_semi_sync_master_enabled = OFF). The
|     switching-off execution branch is relocated under one that executes
|     enable_master() first.
|
|     A minor cleanup is done to remove the int return from two functions
|     that did not return anything but an error which could not happen in
|     the functions.
* Add likely/unlikely to speed up execution [Monty, 2018-05-07, 1 file, -6/+6]
|
|     Added to:
|     - if (error)
|     - Lex
|     - sql_yacc.yy and sql_yacc_ora.yy
|     - In header files to alloc() calls
|
|     - Added thd argument to thd_net_is_killed()
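For readers unfamiliar with branch hints of the likely/unlikely kind, a minimal sketch follows. The macro names `SKETCH_LIKELY` and `SKETCH_UNLIKELY` are hypothetical and chosen to avoid clashing with any real header; the sketch assumes a GCC- or Clang-compatible compiler for `__builtin_expect` and falls back to the plain condition elsewhere.

```cpp
#include <cassert>

// Hypothetical macro names; the hint only affects code layout and
// branch prediction, never the result of the condition.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_LIKELY(x) __builtin_expect(!!(x), 1)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_LIKELY(x) (!!(x))
#define SKETCH_UNLIKELY(x) (!!(x))
#endif

// Returns 0 on success and -1 on error; the error branch is marked as
// the rare one, in the spirit of hinting `if (error)` checks.
inline int checked_divide(int numerator, int denominator, int *result) {
  if (SKETCH_UNLIKELY(denominator == 0))
    return -1;
  *result = numerator / denominator;
  return 0;
}
```

The hint is an optimisation aid only: a wrong hint costs performance but cannot change behaviour.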
* MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data) [Vladislav Vaintroub, 2018-02-06, 1 file, -2/+1]
|
|     Handle string length as size_t, consistently (almost always:))
|     Change function prototypes to accept size_t, where in the past
|     ulong or uint were used. Change local/member variables to size_t
|     when appropriate.
|
|     This fix excludes rocksdb, spider, sphinx and connect for now.
* Changed database, tablename and alias to be LEX_CSTRING [Monty, 2018-01-30, 1 file, -5/+6]
|
|     This was done in, among other things:
|     - thd->db and thd->db_length
|     - TABLE_LIST tablename, db, alias and schema_name
|     - Audit plugin database name
|     - lex->db
|     - All db and table names in Alter_table_ctx
|     - st_select_lex db
|
|     Other things:
|     - Changed a lot of functions to take const LEX_CSTRING* as argument
|       for db, table_name and alias. See init_one_table() as an example.
|     - Changed some function arguments from LEX_CSTRING to
|       const LEX_CSTRING
|     - Changed some lists from LEX_STRING to LEX_CSTRING
|     - threads_mysql.result changed because process list_db wasn't always
|       correctly updated
|     - New append_identifier() function that takes LEX_CSTRING* as
|       arguments
|     - Added new element tmp_buff to Alter_table_ctx to separate temp name
|       handling from temporary space
|     - Ensure we store the length after my_casedn_str() of table/db names
|     - Removed not used version of rename_table_in_stat_tables()
|     - Changed Natural_join_column::table_name and db_name() to never
|       return NULL (used for print)
|     - thd->get_db() now returns db as a printable string
|       (thd->db.str or "")
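The last item of the LEX_CSTRING commit (an accessor that returns either the stored string or "") can be sketched with a length-counted constant string. `ConstStr`, `SessionDb` and `make_const_str` are hypothetical names for illustration; only the pairing of a `const char *` with a `size_t` length and the `str or ""` fallback are taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical length-counted constant string, in the spirit of the
// structure named in the commit message.
struct ConstStr {
  const char *str;
  std::size_t length;
};

// Build a ConstStr from a NUL-terminated string; a null pointer yields
// an empty value.
inline ConstStr make_const_str(const char *s) {
  return ConstStr{s, s ? std::strlen(s) : 0};
}

// Hypothetical session fragment holding the current database name.
struct SessionDb {
  ConstStr db{nullptr, 0};

  // Always returns a printable string, never a null pointer.
  const char *get_db() const { return db.str ? db.str : ""; }
};
```

Carrying the length alongside the pointer avoids repeated `strlen()` calls, and the accessor lets callers print the name without a null check.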
* MDEV-13073. This patch is a followup of the previous one to convert the trailing underscore identifiers to the MariaDB standard. (bb-10.3-semisync) [Andrei Elkin, 2017-12-18, 1 file, -141/+141]
|
|     For identifiers representing class private members the underscore is
|     replaced with a `m_` prefix. Otherwise `_` is just removed.
* MDEV-13073. This part converts the Ali patch's identifiers to the MariaDB standard. [Andrei Elkin, 2017-12-18, 1 file, -135/+142]
|
|     Also some renaming is done as well as white space removal.
* MDEV-13073. This patch replaces semisync's native function_enter,exit and its custom trace facility with standard DBUG_ based equivalents. [Andrei Elkin, 2017-12-18, 1 file, -101/+88]
* MDEV-13073. This part patch weeds out RUN_HOOK from the server as semisync is defined statically. [Andrei Elkin, 2017-12-18, 1 file, -4/+4]
|
|     Consequently the observer interfaces are removed as well.
* MDEV-13073 This part merges the Ali semisync related changes and specifically the ack receiving functionality. [Andrei Elkin, 2017-12-18, 1 file, -328/+256]
|
|     Semisync is turned to be static instead of plugin so its functions
|     are invoked at the same points as RUN_HOOKS. The RUN_HOOKS and the
|     observer interface remain to be removed by a later patch.
|
|     Todo:
|     React on killed status by repl_semisync_master.wait_after_sync().
|     Currently Repl_semi_sync_master::commit_trx does not check the
|     killed status.
|
|     There were a few bugfixes found that are present in mysql and it's
|     unclear whether/how they are covered. Those include:
|     Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
|     Bug#17932935 CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL HAS
|       BAD PERFORMANCE
|     Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH
|       NUMBER OF THREADS
* Moved semisync from a plugin to normal server [Monty, 2017-12-18, 1 file, -0/+1430]

      Part of MDEV-13073 AliSQL Optimize performance of semisync

      Did the following renames to match other similar variables:
      key_ss_mutex_LOCK_binlog_ -> key_LOCK_binlog
      key_ss_cond_COND_binlog_send_ -> key_COND_binlog_send
      COND_binlog_send_ -> COND_binlog_send
      LOCK_binlog_ -> LOCK_binlog

      debian/mariadb-server-10.2.install does not install semisync libs.