Fix for bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH

TABLES <list> WITH READ LOCK are incompatible". The problem was that FLUSH TABLES <list> WITH READ LOCK which was issued when other connection has acquired global read lock using FLUSH TABLES WITH READ LOCK was blocked and has to wait until global read lock is released. This issue stemmed from the fact that FLUSH TABLES <list> WITH READ LOCK implementation has acquired X metadata locks on tables to be flushed. Since these locks required acquiring of global IX lock this statement was incompatible with global read lock. This patch addresses problem by using SNW metadata type of lock for tables to be flushed by FLUSH TABLES <list> WITH READ LOCK. It is OK to acquire them without global IX lock as long as we won't try to upgrade those locks. Since SNW locks allow concurrent statements using same table FLUSH TABLE <list> WITH READ LOCK now has to wait until old versions of tables to be flushed go away after acquiring metadata locks. Since such waiting can lead to deadlock MDL deadlock detector was extended to take into account waits for flush and resolve such deadlocks. As a bonus code in open_tables() which was responsible for waiting old versions of tables to go away was refactored. Now when we encounter old version of table in open_table() we don't back-off and wait for all old version to go away, but instead wait for this particular table to be flushed. Such approach supported by deadlock detection should reduce number of scenarios in which FLUSH TABLES aborts concurrent multi-statement transactions. Note that active FLUSH TABLES <list> WITH READ LOCK still blocks concurrent FLUSH TABLES WITH READ LOCK statement as the former keeps tables open and thus prevents the latter statement from doing flush. mysql-test/include/handler.inc: Adjusted test case after changing status which is set when FLUSH TABLES waits for tables to be flushed from "Flushing tables" to "Waiting for table". mysql-test/r/flush.result: Added test which checks that "flush tables <list> with read lock" is compatible with active "flush tables with read lock" but not vice-versa. This test also covers bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES <list> WITH READ LOCK are incompatible". mysql-test/r/mdl_sync.result: Added scenarios in which wait for table to be flushed causes deadlocks to the coverage of MDL deadlock detector. mysql-test/suite/perfschema/r/dml_setup_instruments.result: Adjusted test results after removal of COND_refresh condition variable. mysql-test/suite/perfschema/r/server_init.result: Adjusted test and its results after removal of COND_refresh condition variable. mysql-test/suite/perfschema/t/server_init.test: Adjusted test and its results after removal of COND_refresh condition variable. mysql-test/t/flush.test: Added test which checks that "flush tables <list> with read lock" is compatible with active "flush tables with read lock" but not vice-versa. This test also covers bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES <list> WITH READ LOCK are incompatible". mysql-test/t/kill.test: Adjusted test case after changing status which is set when FLUSH TABLES waits for tables to be flushed from "Flushing tables" to "Waiting for table". mysql-test/t/lock_multi.test: Adjusted test case after changing status which is set when FLUSH TABLES waits for tables to be flushed from "Flushing tables" to "Waiting for table". mysql-test/t/mdl_sync.test: Added scenarios in which wait for table to be flushed causes deadlocks to the coverage of MDL deadlock detector. sql/ha_ndbcluster.cc: Adjusted code after adding one more parameter for close_cached_tables() call - timeout for waiting for table to be flushed. sql/ha_ndbcluster_binlog.cc: Adjusted code after adding one more parameter for close_cached_tables() call - timeout for waiting for table to be flushed. sql/lock.cc: Removed COND_refresh condition variable. See comment for sql_base.cc for details. sql/mdl.cc: Now MDL deadlock detector takes into account information about waits for table flushes when searching for deadlock. To implement this change: - Declaration of enum_deadlock_weight and Deadlock_detection_visitor were moved to mdl.h header to make them available to the code in table.cc which implements deadlock detector traversal through edges of waiters graph representing waiting for flush. - Since now MDL_context may wait not only for metadata lock but also for table to be flushed an abstract Wait_for_edge class was introduced. Its descendants MDL_ticket and Flush_ticket incapsulate specifics of inspecting waiters graph when following through edge representing wait of particular type. We no longer require global IX metadata lock when acquiring SNW or SNRW locks. Such locks are needed only when metadata locks of these types are upgraded to X locks. This allows to use SNW locks in FLUSH TABLES <list> WITH READ LOCK implementation and keep the latter compatible with global read lock. sql/mdl.h: Now MDL deadlock detector takes into account information about waits for table flushes when searching for deadlock. To implement this change: - Declaration of enum_deadlock_weight and Deadlock_detection_visitor were moved to mdl.h header to make them available to the code in table.cc which implements deadlock detector traversal through edges of waiters graph representing waiting for flush. - Since now MDL_context may wait not only for metadata lock but also for table to be flushed an abstract Wait_for_edge class was introduced. Its descendants MDL_ticket and Flush_ticket incapsulate specifics of inspecting waiters graph when following through edge representing wait of particular type. - Deadlock_detection_visitor now has m_table_shares_visited member which allows to support recursive locking for LOCK_open. This is required when deadlock detector inspects waiters graph which contains several edges representing waits for flushes or needs to come through the such edge more than once. sql/mysqld.cc: Removed COND_refresh condition variable. See comment for sql_base.cc for details. sql/mysqld.h: Removed COND_refresh condition variable. See comment for sql_base.cc for details. sql/sql_base.cc: Changed approach to how threads are waiting for table to be flushed. Now thread that wants to wait for old table to go away subscribes for notification by adding Flush_ticket to table's share and waits using MDL_context::m_wait object. Once table gets flushed (i.e. all tables are closed and table share is ready to be destroyed) all such waiters are notified individually. Thanks to this change MDL deadlock detector can take such waits into account. To implement this/as result of this change: - tdc_wait_for_old_versions() was replaced with tdc_wait_for_old_version() which waits for individual old share to go away and which is called by open_table() after finding out that share is outdated. We don't need to perform back-off before such waiting thanks to the fact that deadlock detector now sees such waits. - As result Open_table_ctx::m_mdl_requests became unnecessary and was removed. We no longer allocate copies of MDL_request objects on MEM_ROOT when MYSQL_OPEN_FORCE_SHARED/SHARED_HIGH_PRIO flags are in effect. - close_cached_tables() and tdc_wait_for_old_version() share code which implements waiting for share to be flushed - the both use TABLE_SHARE::wait_until_flush() method. Thanks to this close_cached_tables() supports timeouts and has extra parameter for this. - Open_table_context::OT_MDL_CONFLICT enum element was renamed to OT_CONFLICT as it is now also used in cases when back-off is required to resolve deadlock caused by waiting for flush and not metadata lock. - In cases when we discover that current connection tries to open tables from different generation we now simply back-off and restart process of opening tables. To support this Open_table_context::OT_REOPEN_TABLES enum element was added. - COND_refresh condition variable became unnecessary and was removed. - mysql_notify_thread_having_shared_lock() no longer wakes up connections waiting for flush as all such connections can be waken up by deadlock detector if necessary. sql/sql_base.h: - close_cached_tables() now has one more parameter - timeout for waiting for table to be flushed. - Open_table_context::OT_MDL_CONFLICT enum element was renamed to OT_CONFLICT as it is now also used in cases when back-off is required to resolve deadlock caused by waiting for flush and not metadata lock. Added new OT_REOPEN_TABLES enum element to be used in cases when we need to restart open tables process even in the middle of transaction. - Open_table_ctx::m_mdl_requests became unnecessary and was removed. sql/sql_class.h: Added assert ensuring that we won't use LOCK_open mutex with THD::enter_cond(). Otherwise deadlocks can arise in MDL deadlock detector. sql/sql_parse.cc: Changed FLUSH TABLES <list> WITH READ LOCK to take SNW metadata locks instead of X locks on tables to be flushed. Since we no longer require global IX lock to be taken when SNW locks are taken this makes this statement compatible with FLUSH TABLES WITH READ LOCK statement. Since SNW locks allow other connections to have table opened FLUSH TABLES <list> WITH READ LOCK now has to wait during open_tables() for old version to go away. Such waits can lead to deadlocks which will be detected by MDL deadlock detector which now takes waits for table to be flushed into account. Also adjusted code after adding one more parameter for close_cached_tables() call - timeout for waiting for table to be flushed. sql/sql_yacc.yy: FLUSH TABLES <list> WITH READ LOCK now needs only SNW metadata locks on tables. sql/sys_vars.cc: Adjusted code after adding one more parameter for close_cached_tables() call - timeout for waiting for table to be flushed. sql/table.cc: Implemented new approach to how threads are waiting for table to be flushed. Now thread that wants to wait for old table to go away subscribes for notification by adding Flush_ticket to table's share and waits using MDL_context::m_wait object. Once table gets flushed (i.e. all tables are closed and table share is ready to be destroyed) all such waiters are notified individually. This change allows to make such waits visible inside of MDL deadlock detector. To do it: - Added list of waiters/Flush_tickets to TABLE_SHARE class. - Changed free_table_share() to postpone freeing of share memory until last waiter goes away and to wake up subscribed waiters. - Added TABLE_SHARE::wait_until_flushed() method which implements subscription to the list of waiters for table to be flushed and waiting for this event. Implemented interface which allows to expose waits for flushes to MDL deadlock detector: - Introduced Flush_ticket class a descendant of Wait_for_edge class. - Added TABLE_SHARE::find_deadlock() method which allows deadlock detector to find out what contexts are still using old version of table in question (i.e. to find out what contexts are waited for by owner of Flush_ticket). sql/table.h: In order to support new strategy of waiting for table flush (see comment for table.cc for details) added list of waiters/Flush_tickets to TABLE_SHARE class. Implemented interface which allows to expose waits for flushes to MDL deadlock detector: - Introduced Flush_ticket class a descendant of Wait_for_edge class. - Added TABLE_SHARE::find_deadlock() method which allows deadlock detector to find out what contexts are still using old version of table in question (i.e. to find out what contexts are waited for by owner of Flush_ticket).
author: Dmitry Lenev <dlenev@mysql.com> 2010-07-27 17:34:58 +0400
committer: Dmitry Lenev <dlenev@mysql.com> 2010-07-27 17:34:58 +0400
commit: 00496b7acd1f2ac8b099ba7e6a4c7bbf09178384 (patch)
tree: bef62913efddc244d466f7ff730bcc4205357491 /sql/mdl.h
parent: 36290c092392c460132f3d7256aeeb8f94debe8f (diff)
download: mariadb-git-00496b7acd1f2ac8b099ba7e6a4c7bbf09178384.tar.gz
1 files changed, 95 insertions, 4 deletions
diff --git a/sql/mdl.h b/sql/mdl.h
index c8acd69c0f1..d7fbb14a140 100644
--- a/sql/mdl.h
+++ b/sql/mdl.h
@@ -34,7 +34,6 @@ class THD;
 class MDL_context;
 class MDL_lock;
 class MDL_ticket;
-class Deadlock_detection_visitor;
 
 /**
   Type of metadata lock request.
@@ -360,6 +359,96 @@ public:
 
 typedef void (*mdl_cached_object_release_hook)(void *);
 
+
+enum enum_deadlock_weight
+{
+  MDL_DEADLOCK_WEIGHT_DML= 0,
+  MDL_DEADLOCK_WEIGHT_DDL= 100
+};
+
+
+/**
+  A context of the recursive traversal through all contexts
+  in all sessions in search for deadlock.
+*/
+
+class Deadlock_detection_visitor
+{
+public:
+  Deadlock_detection_visitor(MDL_context *start_node_arg)
+    : m_start_node(start_node_arg),
+      m_victim(NULL),
+      m_current_search_depth(0),
+      m_table_shares_visited(0)
+  {}
+  bool enter_node(MDL_context * /* unused */);
+  void leave_node(MDL_context * /* unused */);
+
+  bool inspect_edge(MDL_context *dest);
+
+  MDL_context *get_victim() const { return m_victim; }
+
+  /**
+    Change the deadlock victim to a new one if it has lower deadlock
+    weight.
+  */
+  MDL_context *opt_change_victim_to(MDL_context *new_victim);
+private:
+  /**
+    The context which has initiated the search. There
+    can be multiple searches happening in parallel at the same time.
+  */
+  MDL_context *m_start_node;
+  /** If a deadlock is found, the context that identifies the victim. */
+  MDL_context *m_victim;
+  /** Set to the 0 at start. Increased whenever
+    we descend into another MDL context (aka traverse to the next
+    wait-for graph node). When MAX_SEARCH_DEPTH is reached, we
+    assume that a deadlock is found, even if we have not found a
+    loop.
+  */
+  uint m_current_search_depth;
+  /**
+    Maximum depth for deadlock searches. After this depth is
+    achieved we will unconditionally declare that there is a
+    deadlock.
+
+    @note This depth should be small enough to avoid stack
+          being exhausted by recursive search algorithm.
+
+    TODO: Find out what is the optimal value for this parameter.
+          Current value is safe, but probably sub-optimal,
+          as there is an anecdotal evidence that real-life
+          deadlocks are even shorter typically.
+  */
+  static const uint MAX_SEARCH_DEPTH= 32;
+
+public:
+  /**
+    Number of TABLE_SHARE objects visited by deadlock detector so far.
+    Used by TABLE_SHARE::find_deadlock() method to implement recursive
+    locking for LOCK_open mutex.
+  */
+  uint m_table_shares_visited;
+};
+
+
+/**
+  Abstract class representing edge in waiters graph to be
+  traversed by deadlock detection algorithm.
+*/
+
+class Wait_for_edge
+{
+public:
+  virtual ~Wait_for_edge() {};
+
+  virtual bool find_deadlock(Deadlock_detection_visitor *dvisitor) = 0;
+
+  virtual uint get_deadlock_weight() const = 0;
+};
+
+
 /**
   A granted metadata lock.
 
@@ -380,7 +469,7 @@ typedef void (*mdl_cached_object_release_hook)(void *);
           threads/contexts.
 */
 
-class MDL_ticket
+class MDL_ticket : public Wait_for_edge
 {
 public:
   /**
@@ -414,6 +503,7 @@ public:
   bool is_incompatible_when_granted(enum_mdl_type type) const;
   bool is_incompatible_when_waiting(enum_mdl_type type) const;
 
+  bool find_deadlock(Deadlock_detection_visitor *dvisitor);
   /* A helper used to determine which lock request should be aborted. */
   uint get_deadlock_weight() const;
 private:
@@ -680,7 +770,7 @@ private:
     by inspecting waiting queues, but we'd very much like it to be
     readily available to the wait-for graph iterator.
    */
-  MDL_ticket *m_waiting_for;
+  Wait_for_edge *m_waiting_for;
 private:
   MDL_ticket *find_ticket(MDL_request *mdl_req,
                           bool *is_transactional);
@@ -688,10 +778,11 @@ private:
   bool try_acquire_lock_impl(MDL_request *mdl_request,
                              MDL_ticket **out_ticket);
 
+public:
   void find_deadlock();
 
   /** Inform the deadlock detector there is an edge in the wait-for graph. */
-  void will_wait_for(MDL_ticket *pending_ticket)
+  void will_wait_for(Wait_for_edge *pending_ticket)
   {
     mysql_prlock_wrlock(&m_LOCK_waiting_for);
     m_waiting_for= pending_ticket;
author	Dmitry Lenev <dlenev@mysql.com>	2010-07-27 17:34:58 +0400
committer	Dmitry Lenev <dlenev@mysql.com>	2010-07-27 17:34:58 +0400
commit	00496b7acd1f2ac8b099ba7e6a4c7bbf09178384 (patch)
tree	bef62913efddc244d466f7ff730bcc4205357491 /sql/mdl.h
parent	36290c092392c460132f3d7256aeeb8f94debe8f (diff)
download	mariadb-git-00496b7acd1f2ac8b099ba7e6a4c7bbf09178384.tar.gz