summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* MDEV-27262 Unexpected index intersection with full index scan for an indexIgor Babaev2021-12-235-6/+243
| | | | | | | | | | | | | | | | | | | | | | | | | | | If when extracting a range condition for an index from the WHERE condition Range Optimizer sees that the range condition covers the whole index then such condition should be discarded because it cannot be used in any range scan. In some cases Range Optimizer really does it, but there remained some conditions for which it was not done. As a result the optimizer could produce index merge plans with the full index scan for one of the indexes participating in the index merge. This could be observed in one of the test cases from index_merge1.inc where a plan with index_merge_sort_union was produced and in the test case reported for this bug where a plan with index_merge_sort_intersect was produced. In both cases one of two index scans participating in index merge ran over the whole index. The patch slightly changes the original above mentioned test case from index_merge1.inc to be able to produce an intended plan employing index_merge_sort_union. The original query was left to show that index merge is not used for it anymore. It should be noted that for the plan with index_merge_sort_intersect could be chosen for execution only due to a defect in the InnoDB code that returns wrong estimates for the cardinality of big ranges. This bug led to serious problems in 10.4+ where the optimization using Rowid filters is employed (see mdev-26446). Approved by Sergey Petrunia <sergey@mariadb.com>
* MDEV-24097: galera[_3nodes] suite tests in MTR sporadically failsbb-10.2-MDEV-24097-galeraJulius Goryavsky2021-12-2314-285/+292
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first part of the fixes for MDEV-24097. This commit contains the fixes for instability when testing Galera and when restarting nodes quickly: 1) Protection against a "stuck" old SST process during the execution of the new SST (after restarting the node) is now implemented for mariabackup / xtrabackup, which should help to avoid almost all conflicts due to the use of the same ports - both during testing with mtr, so and when restarting nodes quickly in a production environment. 2) Added more protection to scripts against unexpected return of the rc != 0 (in the commands for deleting temporary files, etc). 3) Added protection against unexpected crashes during binlog transfer (in SST scripts for rsync). 4) Spaces and some special characters in binlog filenames shouldn't be a problem now (at the script level). 5) Daemon process termination tracking has been made more robust against crashes due to unexpected termination of the previous SST process while new scripts are running. 6) Reading ssl encryption parameters has been moved from specific SST scripts to a common wsrep_sst_common.sh script, which allows unified error handling, unified diagnostics and simplifies script revisions in the future. 7) Improved diagnostics of errors related to the use of openssl. 8) Corrections have been made for xtrabackup-v2 (both in tests and in the script code) that restore the work of xtrabackup with updated versions of innodb. 9) Fixed some tests for galera_3nodes, although the complete solution for the problem of starting three nodes at the same time on fast machines will be done in a separate commit. No additional tests are required as this commit fixes problems with existing tests.
* MDEV-23175: my_timer_milliseconds clock_gettime for multiple platfomrsbb-10.2-MDEV-23175-backportDaniel Black2021-12-221-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Small postfix to MDEV-23175 to ensure faster option on FreeBSD and compatibility to Solaris that isn't high resolution. ftime is left as a backup in case an implementation doesn't contain any of these clocks. FreeBSD $ ./unittest/mysys/my_rdtsc-t 1..11 # ----- Routine --------------- # myt.cycles.routine : 5 # myt.nanoseconds.routine : 11 # myt.microseconds.routine : 13 # myt.milliseconds.routine : 11 # myt.ticks.routine : 17 # ----- Frequency ------------- # myt.cycles.frequency : 3610295566 # myt.nanoseconds.frequency : 1000000000 # myt.microseconds.frequency : 1000000 # myt.milliseconds.frequency : 899 # myt.ticks.frequency : 136 # ----- Resolution ------------ # myt.cycles.resolution : 1 # myt.nanoseconds.resolution : 1 # myt.microseconds.resolution : 1 # myt.milliseconds.resolution : 7 # myt.ticks.resolution : 1 # ----- Overhead -------------- # myt.cycles.overhead : 26 # myt.nanoseconds.overhead : 19140 # myt.microseconds.overhead : 19036 # myt.milliseconds.overhead : 578 # myt.ticks.overhead : 21544 ok 1 - my_timer_init() did not crash ok 2 - The cycle timer is strictly increasing ok 3 - The cycle timer is implemented ok 4 - The nanosecond timer is increasing ok 5 - The nanosecond timer is implemented ok 6 - The microsecond timer is increasing ok 7 - The microsecond timer is implemented ok 8 - The millisecond timer is increasing ok 9 - The millisecond timer is implemented ok 10 - The tick timer is increasing ok 11 - The tick timer is implemented
* MDEV-27181 fixup: compatibility with Windows + small correctionsbb-10.2-MDEV-27181-fixJulius Goryavsky2021-12-176-95/+229
| | | | | | | 1) Removed symlinks that are not very well supported in tar under Windows. 2) Added comment + changed code formatting in viosslfactories.c 3) Fixed a small bug in the yassl code. 4) Fixed a typo in the script code.
* MDEV-21866: Assertion `!result' failed in convert_const_to_int upon 2nd ↵bb-10.2-MDEV-21866Dmitry Shulga2021-12-163-1/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | execution of PS Consider the following use case: MariaDB [test]> CREATE TABLE t1 (field1 BIGINT DEFAULT -1); MariaDB [test]> CREATE VIEW v1 AS SELECT DISTINCT field1 FROM t1; Repeated execution of the following query as a Prepared Statement MariaDB [test]> PREPARE stmt FROM 'SELECT * FROM v1 WHERE field1 <=> NULL'; MariaDB [test]> EXECUTE stmt; results in a crash for a server built with DEBUG. MariaDB [test]> EXECUTE stmt; ERROR 2013 (HY000): Lost connection to MySQL server during query Assertion failed: (!result), function convert_const_to_int, file item_cmpfunc.cc, line 476. Abort trap: 6 (core dumped) The crash inside the function convert_const_to_int() happens by the reason that the value -1 is stored in an instance of the class Field_longlong on restoring its original value in the statement result= field->store(orig_field_val, TRUE); that leads to assigning the value 1 to the variable 'result' with subsequent crash in the DBUG_ASSERT statement following it DBUG_ASSERT(!result); The main matter here is why this assertion failure happens on the second execution of the prepared statement and doens't on the first one. On first handling of the statement 'EXECUTE stmt;' a temporary table is created for serving the query involving the view 'v1'. The table is created by the function create_tmp_table() in the following calls trace: (trace #1) JOIN::prepare (at sql_select.cc:725) st_select_lex::handle_derived LEX::handle_list_of_derived TABLE_LIST::handle_derived mysql_handle_single_derived mysql_derived_prepare select_union::create_result_table create_tmp_table Note, that the data member TABLE::status of a TABLE instance returned by the function create_tmp_table() has the value 0. Later the function setup_table_map() is called on the TABLE instance just created for the sake of the temporary table (calls trace #2 is below): JOIN::prepare (at sql_select.cc:737) setup_tables_and_check_access setup_tables setup_table_map where the data member TABLE::status is set to the value STATUS_NO_RECORD. After that when execution of the method JOIN::prepare reaches calling of the function setup_without_group() the following calls trace is invoked JOIN::prepare setup_without_group setup_conds Item_func::fix_fields Item_func_equal::fix_length_and_dec Item_bool_rowready_func2::fix_length_and_dec Item_func::setup_args_and_comparator Item_func::convert_const_compared_to_int_field convert_const_to_int There is the following code snippet in the function convert_const_to_int() at the line item_cmpfunc.cc:448 bool save_field_value= (field_item->const_item() || !(field->table->status & STATUS_NO_RECORD)); Since field->table->status has bits STATUS_NO_RECORD set the variable save_field_value is false and therefore neither the method Field_longlong::val_int() nor the method Field_longlong::store is called on the Field instance that has the numeric value -1. That is the reason why first execution of the Prepared Statement for the query 'SELECT * FROM v1 WHERE field1 <=> NULL' is successful. On second running of the statement 'EXECUTE stmt' a new temporary tables is also created by running the calls trace #1 but the trace #2 is not executed by the reason that data member SELECT_LEX::first_cond_optimization has been set to false on first execution of the prepared statemet (in the method JOIN::optimize_inner()). As a consequence, the data member TABLE::status for a temporary table just created doesn't have the flags STATUS_NO_RECORD set and therefore on re-execution of the prepared statement the methods Field_longlong::val_int() and Field_longlong::store() are called for the field having the value -1 and the DBUG_ASSERT(!result) is fired. To fix the issue the data member TABLE::status has to be assigned the value STATUS_NO_RECORD in every place where the macros empty_record() is called to emptify a record for just instantiated TABLE object created on behalf the new temporary table.
* MDEV-27270: Wrong query plan with Range Checked for Each Record and ORDER BY ↵bb-10.2-mdev2720Sergei Petrunia2021-12-153-0/+53
| | | | | | | | | | ... LIMIT Followup to fix for MDEV-25858: When test_if_skip_sort_order() decides to use an index to satisfy ORDER BY ... LIMIT clause, it should disable "Range Checked for Each Record" optimization. Do this in all cases.
* Disable following tests from galera_3nodes suitebb-10.2-janJan Lindström2021-12-151-0/+8
| | | | | | | | | | | | | * galera_pc_bootstrap * galera_ipv6_mariabackup * galera_ipv6_mariabackup_section * galera_ipv6_rsync * galera_ipv6_rsync_section * galera_ssl_reload * galera_toi_vote * galera_wsrep_schema_init because MTR sporadaically fails: Failed to start mysqld or mysql_shutdown failed
* MDEV-27268 Failed InnoDB initialization leaves garbage files behindMarko Mäkelä2021-12-153-8/+11
| | | | | | | | | create_log_files(): Check log_set_capacity() before modifying or creating any log files. innobase_start_or_create_for_mysql(): If create_log_files() fails and we were initializing a new database, delete the system tablespace files before exiting.
* MDEV-27181: Galera SST scripts should use ssl_capath for CA directorybb-10.2-MDEV-27181-galeraJulius Goryavsky2021-12-1421-411/+1033
| | | | | | | | | | | | | | | | | 1. Galera SST scripts should use ssl_capath (not ssl_ca) for CA directory. The current implementation tries to automatically detect the path using the trailing slash in the ssl_ca variable value, but this approach is not compatible with the server configuration. Now, by analogy with the server, SST scripts also use a separate ssl_capath variable. In addition, a similar tcapath variable has been added for the old-style configuration (in the "sst" section). 2. Openssl utility detection made more reliable. 3. Removed extra spaces in automatically generated command lines - to simplify debugging of the SST scripts. 4. In general, the code for detecting the presence or absence of auxiliary utilities has been improved - it is made more reliable in some configurations (and for shells other than bash).
* MDEV-27235: Crash on SET GLOBAL innodb_encrypt_tablesMarko Mäkelä2021-12-133-0/+7
| | | | | fil_crypt_set_encrypt_tables(): If no encryption threads have been initialized, do nothing.
* don't use buffered_option_error_reporter without perfschemaSergei Golubchik2021-12-101-3/+5
| | | | | | | | it's not printed, not cleaned up without perfschema, so isn't supposed to be written into either this fixes "Memory not freed" warnings when early command line options produce warnings in non-perfschema builds
* enable partition_open_files_limit testbb-10.2-aliceforkfun2021-12-094-8/+5
|
* MDEV-19129: Fixed configure for Xcode, CMake generateSergei Krivonos2021-12-071-0/+1
| | | | | | | | | | | | | | | CMake Error in wsrep-lib/CMakeLists.txt: The custom command generating /Users/name/build/mariadb-server/sql/lex_token.h is attached to multiple targets: GenServerSource sql but none of these is a common dependency of the other(s). This is not allowed by the Xcode "new build system".
* Don't beep in mysql_upgrade_service.exeVladislav Vaintroub2021-12-071-2/+2
| | | | | This beep looks especially strange, as mysqladmin output is redirected to the log file
* MDEV-27191 MariaDB client - "system" command does not work on WindowsVladislav Vaintroub2021-12-071-8/+3
| | | | | - define USE_POPEN, like it is done elsewhere. - use Notepad as default editor on Windows for the "edit" command.
* Appveyor - cache chocolatey packagesVladislav Vaintroub2021-12-071-0/+4
|
* fix srpm builds after fe065f8d90b0Sergei Golubchik2021-12-061-1/+2
|
* fix ./mtr --manual warning after f5441ef4dac9Sergei Golubchik2021-12-061-1/+1
|
* MDEV-27088: lf unit tests - cycles insufficientMartin Beck2021-11-301-1/+1
| | | | | Per bug report, cycles was woefully insufficient to detect any implementation error.
* MDEV-27088: Server crash on ARM (WMM architecture) due to missing barriers ↵Martin Beck2021-11-302-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in lf-hash MariaDB server crashes on ARM (weak memory model architecture) while concurrently executing l_find to load node->key and add_to_purgatory to store node->key = NULL. l_find then uses key (which is NULL), to pass it to a comparison function. The specific problem is the out-of-order execution that happens on a weak memory model architecture. Two essential reorderings are possible, which need to be prevented. a) As l_find has no barriers in place between the optimistic read of the key field lf_hash.cc#L117 and the verification of link lf_hash.cc#L124, the processor can reorder the load to happen after the while-loop. In that case, a concurrent thread executing add_to_purgatory on the same node can be scheduled to store NULL at the key field lf_alloc-pin.c#L253 before key is loaded in l_find. b) A node is marked as deleted by a CAS in l_delete lf_hash.cc#L247 and taken off the list with an upfollowing CAS lf_hash.cc#L252. Only if both CAS succeed, the key field is written to by add_to_purgatory. However, due to a missing barrier, the relaxed store of key lf_alloc-pin.c#L253 can be moved ahead of the two CAS operations, which makes the value of the local purgatory list stored by add_to_purgatory visible to all threads operating on the list. As the node is not marked as deleted yet, the same error occurs in l_find. This change three accesses to be atomic. * optimistic read of key in l_find lf_hash.cc#L117 * read of link for verification lf_hash.cc#L124 * write of key in add_to_purgatory lf_alloc-pin.c#L253 Reviewers: Sergei Vojtovich, Sergei Golubchik Fixes: MDEV-23510 / d30c1331a18d875e553f3fcf544997e4f33fb943
* MDEV-26553 NOT IN subquery construct crashing 10.1 and upIgor Babaev2021-11-264-1/+78
| | | | | | | | | | | | | This bug was introduced by commit be00e279c6061134a33a8099fd69d4304735d02e The commit was applied for the task MDEV-6480 that allowed to remove top level disjuncts from WHERE conditions if the range optimizer evaluated them as always equal to FALSE/NULL. If such disjuncts are removed the WHERE condition may become an AND formula and if this formula contains multiple equalities the field JOIN::item_equal must be updated to refer to these equalities. The above mentioned commit forgot to do this and it could cause crashes for some queries. Approved by Oleksandr Byelkin <sanja@mariadb.com>
* MDEV-26972 MTR worker aborts after server restart failureSergei Golubchik2021-11-261-2/+4
| | | | | restore the old behavior where without a debugger mtr does not wait for mysqld to start. It was broken in feacc0aaf2
* MDEV-26755 innodb.undo_truncate: ilink::assert_linked(): Assertion `prev != ↵Sergei Golubchik2021-11-261-2/+0
| | | | | | | | | | | | | | | | | 0 && next != 0' failed close_connections() in mysqld.cc sends a signal to all threads. But InnoDB is too busy purging, doesn't react immediately. close_connections() waits 20 seconds, which isn't enough in this particular case, and then unlinks all threads from the list and forcibly closes their vio connection. InnoDB background threads have no vio connection to close, but they're unlinked all the same. So when later they finally notice the shutdown request and try to unlink themselves, they fail to assert that they're still linked. Fix: don't assert_linked, as another thread can unlink this THD anytime
* add a test casest-10.2-markoSergei Golubchik2021-11-262-0/+8
| | | | MDEV-20330 Combination of "," (comma), cross join and left join fails to parse
* MDEV-26558 Fix a deadlock due to cyclic dependencebb-10.2-robertryancaicse2021-11-241-1/+1
| | | Fix a potential deadlock bug between locks ctrl_mutex and entry->mutex
* mysql_install_db: remove MySQL referencesDaniel Black2021-11-241-4/+2
| | | | | MySQL documentation isn't going to help our users and we shouldn't refer to it.
* MDEV-27066: Fixed scientific notation parsing bugMarc Olivier Bergeron2021-11-243-2/+78
| | | | | | | | | | | | | | | | | The bug occurs where the float token containing a dot with an 'e' notation was dropped from the request completely. This causes a manner of invalid SQL statements like: select id 1.e, char 10.e(id 2.e), concat 3.e('a'12356.e,'b'1.e,'c'1.1234e)1.e, 12 1.e*2 1.e, 12 1.e/2 1.e, 12 1.e|2 1.e, 12 1.e^2 1.e, 12 1.e%2 1.e, 12 1.e&2 from test; To be parsed correctly as if it was: select id, char(id), concat('a','b','c'), 12*2, 12/2, 12|2, 12^2, 12%2, 12&2 from test.test; This correct parsing occurs when e is followed by any of: ( ) . , | & % * ^ /
* MDEV-22522 RPM packages have meaningless summary/descriptionbb-10.2-MDEV-22522Alexey Bychko2021-11-2310-47/+60
| | | | | this patch moves cpack summury and description for optional packages to the appropriate CMakeLists.txt files
* MDEV-26915: SST scripts do not take log_bin_index setting into accountbb-10.2-MDEV-26915-galeraJulius Goryavsky2021-11-237-10/+105
| | | | | | | | | Currently, SST scripts assume that the filename specified in the --log-bin-index argument either does not contain an extension or uses the standard ".index" extension. Similar assumptions are used for the log_bin_index parameter read from the configuration file. This commit adds support for arbitrary extensions for the index file paths.
* MDEV-26064: mariabackup SST fails when starting with --innodb-force-recoveryJulius Goryavsky2021-11-238-2/+463
| | | | | | | | | | | | If the server is started with the --innodb-force-recovery argument on the command line, then during SST this argument can be passed to mariabackup only at the --prepare stage, and accordingly it must be removed from the --mysqld-args list (and it is not should be passed to mariabackup otherwise). This commit fixes a flaw in the SST scripts and add a test that checks the ability to run the joiner node in a configuration that uses --innodb-force-recovery=1.
* MDEV-26470 "No database" selected when using CTE in a subquery of DELETE ↵Igor Babaev2021-11-203-0/+49
| | | | | | | | | | | | | statement This bug led to reporting bogus messages "No database selected" for DELETE statements if they used subqueries in their WHERE conditions and these subqueries contained references to CTEs. The bug happened because the grammar rule for DELETE statement did not call the function LEX::check_cte_dependencies_and_resolve_references() and as a result of it references to CTEs were not identified as such. Approved by Oleksandr Byelkin <sanja@mariadb.com>
* MDEV-27086 "No database selected" when using UNION of CTEs to define tableIgor Babaev2021-11-203-4/+53
| | | | | | | | | | | | This bug concerned only CREATE TABLE statements of the form CREATE TABLE <table name> AS <with clause> <union>. For such a statement not all references to CTE used in <union> were resolved. As a result a bogus message was reported for the first unresolved reference. This happened because for such statements the function resolving references to CTEs LEX::check_cte_dependencies_and_resolve_references() was called prematurely in the parser. Approved by Oleksandr Byelkin <sanja@mariadb.com>
* MDEV-27098 Subquery using the ALL keyword on TIME columns produces a wrong ↵bb-10.2-barAlexander Barkov2021-11-204-2/+46
| | | | result
* MDEV-27072 Subquery using the ALL keyword on date columns produces a wrong ↵Alexander Barkov2021-11-203-1/+22
| | | | result
* MDEV-27075 mysql_upgrade_service.exe - using uninitialized memory ↵Vladislav Vaintroub2021-11-181-19/+0
| | | | | | | | | 'defaults_file' Remove section that was trying to rename default-character-set to character-set-server This seems to be an old workaround for some upgrade warning, which did not work for some time already, because the ini filename was not initialized.
* MDEV-26747 improve corruption check for encrypted tables on ALTER IMPORTbb-10.2-MDEV-26747-corruption-checkEugene Kosov2021-11-177-45/+90
| | | | | | fil_space_decrypt(): change signature to return status via dberr_t only. Also replace impossible condition with an assertion and prove it via test cases.
* MDEV-26825 Bogus error for query with two usage of CTE referring another CTEIgor Babaev2021-11-163-23/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bug affected queries with two or more references to a CTE referring another CTE if the definition of the latter contained an invocation of a stored function that used a base table. The bug could lead to a bogus error message or to an assertion failure. For any non-first reference to CTE cte1 With_element::clone_parsed_spec() is called that parses the specification of cte1 to construct the unit structure for this usage of cte1. If cte1 refers to another CTE cte2 outside of the specification of cte1 then With_element::clone_parsed_spec() has to be called for cte2 as well. This call is made by the function LEX::resolve_references_to_cte() within the invocation of the function With_element::clone_parsed_spec() for cte1. When the specification of a CTE is parsed all table references encountered in it must be added to the global list of table references for the query. As the specification for the non-first usage of a CTE is parsed at a recursive call of the parser the function With_element::clone_parsed_spec() invoked at this recursive call should takes care of appending the list of table references encountered in the specification of this CTE cte1 to the list of table references created for the query. And it should do it after the call of LEX::resolve_references_to_cte() that resolves references to CTEs defined outside of the specification of cte1 because this call may invoke the parser again for specifications of other CTEs and the table references from their specifications must ultimately appear in the global list of table references of the query. The code of With_element::clone_parsed_spec() misplaced the call of LEX::resolve_references_to_cte(). As a result LEX::query_tables_last used for the query that was supposed to point to the field 'next_global' of the last element in the global list of table references actually pointed to 'next_global' of the previous element. The above inconsistency certainly caused serious problems when table references used in the stored functions invoked in cloned specifications of CTEs were added to the global list of table references.
* MDEV-27056 Windows upgrade_wizard - CloseHandle() on invalid (already ↵Vladislav Vaintroub2021-11-161-1/+1
| | | | closed) pipe handle
* Windows build - fix signtool search path to take modern SDKs into accountVladislav Vaintroub2021-11-161-8/+12
|
* MDEV-27030 vcol.vcol_keys_myisam fails on Windows x64, with Visual Studio 2022Vladislav Vaintroub2021-11-111-0/+8
| | | | | | | Upon investigation, decided this to be a compiler bug (happens with new compiler, on code that did not change for the last 15 years) Fixed by de-optimizing single function remove_key(), using MSVC pragma
* MDEV-26991: CURRENT_TEST: main.mysql_binary_zero_insert 'grep' is not ↵bb-10.2-MDEV-26991Brandon Nesterenko2021-11-112-5/+5
| | | | | | | | recognized as an internal or external command, operable program or batch file. Removed grep from mysqldump command stream and instead, extend the search_file pattern to search for rows containing binary zeros instead of any occurance of '00' in the input
* Remove a warning for clang 11 or earlierMarko Mäkelä2021-11-092-2/+2
| | | | This fixes up commit d22c8cae00f7a7517c9b8228efbb543037c23c97
* Remove restarts from encrypt_and_grep testst-10.2-mergeMarko Mäkelä2021-11-092-10/+12
|
* Merge mariadb-10.2.41 into 10.2Marko Mäkelä2021-11-0923-277/+477
|\
| * MDEV-26833 Missed statement rollback in case transaction drops or create ↵mariadb-10.2.41Andrei Elkin2021-11-059-51/+1211
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | temporary table When transaction creates or drops temporary tables and afterward its statement faces an error even the transactional table statement's cached ROW format events get involved into binlog and are visible after the transaction's commit. Fixed with proper analysis of whether the errored-out statement needs to be rolled back in binlog. For instance a fact of already cached CREATE or DROP for temporary tables by previous statements alone does not cause to retain the being errored-out statement events in the cache. Conversely, if the statement creates or drops a temporary table itself it can't be rolled back - this rule remains.
| * MDEV-23328 Server hang due to Galera lock conflict resolutionJan Lindström2021-11-023-8/+10
| | | | | | | | | | Use better error message when KILL fails even in case TOI fails.
| * MDEV-23328 Server hang due to Galera lock conflict resolutionJan Lindström2021-11-013-9/+15
| | | | | | | | | | * Fix error handling NULL-pointer reference * Add mtr-suppression on galera_ssl_upgrade
| * MDEV-23328 Server hang due to Galera lock conflict resolutionsjaakola2021-10-2915-203/+417
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potetial mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
| * MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL)Jan Lindström2021-10-292-112/+74
| | | | | | | | | | | | Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit 29bbcac0ee841faaa68eeb09c86ff825eabbe6b6.
| * fix depricated pthread_yield() for tokudbOleksandr Byelkin2021-10-281-1/+3
| |