From a5efb91deaf8a9c3bf53cd82ebd574be0cdabc5d Mon Sep 17 00:00:00 2001 From: Davi Arnaut Date: Wed, 6 Oct 2010 11:34:28 -0300 Subject: Bug#49938: Failing assertion: inode or deadlock in fsp/fsp0fsp.c Bug#54678: InnoDB, TRUNCATE, ALTER, I_S SELECT, crash or deadlock - Incompatible change: truncate no longer resorts to a row by row delete if the storage engine does not support the truncate method. Consequently, the count of affected rows does not, in any case, reflect the actual number of rows. - Incompatible change: it is no longer possible to truncate a table that participates as a parent in a foreign key constraint, unless it is a self-referencing constraint (both parent and child are in the same table). To work around this incompatible change and still be able to truncate such tables, disable foreign checks with SET foreign_key_checks=0 before truncate. Alternatively, if foreign key checks are necessary, please use a DELETE statement without a WHERE condition. Problem description: The problem was that for storage engines that do not support truncate table via a external drop and recreate, such as InnoDB which implements truncate via a internal drop and recreate, the delete_all_rows method could be invoked with a shared metadata lock, causing problems if the engine needed exclusive access to some internal metadata. This problem originated with the fact that there is no truncate specific handler method, which ended up leading to a abuse of the delete_all_rows method that is primarily used for delete operations without a condition. Solution: The solution is to introduce a truncate handler method that is invoked when the engine does not support truncation via a table drop and recreate. This method is invoked under a exclusive metadata lock, so that there is only a single instance of the table when the method is invoked. Also, the method is not invoked and a error is thrown if the table is a parent in a non-self-referencing foreign key relationship. This was necessary to avoid inconsistency as some integrity checks are bypassed. This is inline with the fact that truncate is primarily a DDL operation that was designed to quickly remove all data from a table. mysql-test/suite/innodb/t/innodb-truncate.test: Add test cases for truncate and foreign key checks. Also test that InnoDB resets auto-increment on truncate. mysql-test/suite/innodb/t/innodb.test: FK is not necessary, test is related to auto-increment. Update error number, truncate is no longer invoked if table is parent in a FK relationship. mysql-test/suite/innodb/t/innodb_mysql.test: Update error number, truncate is no longer invoked if table is parent in a FK relationship. Use delete instead of truncate, test is used to check the interaction of FKs, triggers and delete. mysql-test/suite/parts/inc/partition_check.inc: Fix typo. mysql-test/suite/sys_vars/t/foreign_key_checks_func.test: Update error number, truncate is no longer invoked if table is parent in a FK relationship. mysql-test/t/mdl_sync.test: Modify test case to reflect and ensure that truncate takes a exclusive metadata lock. mysql-test/t/trigger-trans.test: Update error number, truncate is no longer invoked if table is parent in a FK relationship. sql/ha_partition.cc: Reorganize the various truncate methods. delete_all_rows is now passed directly to the underlying engines, so as truncate. The code responsible for truncating individual partitions is moved to ha_partition::truncate_partition, which is invoked when a ALTER TABLE t1 TRUNCATE PARTITION p statement is executed. Since the partition truncate no longer can be invoked via delete, the bitmap operations are not necessary anymore. The explicit reset of the auto-increment value is also removed as the underlying engines are now responsible for reseting the value. sql/handler.cc: Wire up the handler truncate method. sql/handler.h: Introduce and document the truncate handler method. It assumes certain use cases of delete_all_rows. Add method to retrieve the list of foreign keys referencing a table. Method is used to avoid truncating tables that are parent in a foreign key relationship. sql/share/errmsg-utf8.txt: Add error message for truncate and FK. sql/sql_lex.h: Introduce a flag so that the partition engine can detect when a partition is being truncated. Used to give a special error. sql/sql_parse.cc: Function mysql_truncate_table no longer exists. sql/sql_partition_admin.cc: Implement the TRUNCATE PARTITION statement. sql/sql_truncate.cc: Change the truncate table implementation to use the new truncate handler method and to not rely on row-by-row delete anymore. The truncate handler method is always invoked with a exclusive metadata lock. Also, it is no longer possible to truncate a table that is parent in some non-self-referencing foreign key. storage/archive/ha_archive.cc: Rename method as the description indicates that in the future this could be a truncate operation. storage/blackhole/ha_blackhole.cc: Implement truncate as no operation for the blackhole engine in order to remain compatible with older releases. storage/federated/ha_federated.cc: Introduce truncate method that invokes delete_all_rows. This is required to support partition truncate as this form of truncate does not implement the drop and recreate protocol. storage/heap/ha_heap.cc: Introduce truncate method that invokes delete_all_rows. This is required to support partition truncate as this form of truncate does not implement the drop and recreate protocol. storage/ibmdb2i/ha_ibmdb2i.cc: Introduce truncate method that invokes delete_all_rows. This is required to support partition truncate as this form of truncate does not implement the drop and recreate protocol. storage/innobase/handler/ha_innodb.cc: Rename delete_all_rows to truncate. InnoDB now does truncate under a exclusive metadata lock. Introduce and reorganize methods used to retrieve the list of foreign keys referenced by a or referencing a table. storage/myisammrg/ha_myisammrg.cc: Introduce truncate method that invokes delete_all_rows. This is required in order to remain compatible with earlier releases where truncate would resort to a row-by-row delete. --- sql/ha_partition.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'sql/ha_partition.h') diff --git a/sql/ha_partition.h b/sql/ha_partition.h index 353e0d17159..f1abc0cefe2 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -346,6 +346,7 @@ public: virtual int update_row(const uchar * old_data, uchar * new_data); virtual int delete_row(const uchar * buf); virtual int delete_all_rows(void); + virtual int truncate(); virtual void start_bulk_insert(ha_rows rows); virtual int end_bulk_insert(); private: @@ -354,6 +355,15 @@ private: long estimate_read_buffer_size(long original_size); public: + /* + Method for truncating a specific partition. + (i.e. ALTER TABLE t1 TRUNCATE PARTITION p). + + @remark This method is a partitioning-specific hook + and thus not a member of the general SE API. + */ + int truncate_partition(Alter_info *); + virtual bool is_fatal_error(int error, uint flags) { if (!handler::is_fatal_error(error, flags) || -- cgit v1.2.1 From 2737a722785589eca70f4e25eaaa0d8f594462df Mon Sep 17 00:00:00 2001 From: Mattias Jonsson Date: Wed, 1 Dec 2010 22:47:40 +0100 Subject: Bug#58147: ALTER TABLE w/ TRUNCATE PARTITION fails but the statement is written to binlog TRUNCATE PARTITION was written to the binlog even if it failed before calling any partition's truncate function. Solved by adding an argument to truncate_partition, to flag if it should be written to the binlog or not. It should be written to the binlog when a call to any partitions truncate function is done. mysql-test/r/partition_binlog.result: New result file mysql-test/t/partition_binlog.test: New test file, including DROP PARTITION binlog test sql/ha_partition.cc: Added argument to avoid binlogging failed truncate_partition that have not yet changed any data. sql/ha_partition.h: Added argument to avoid excessive binlogging sql/sql_partition_admin.cc: Avoid to binlog TRUNCATE PARTITION if it fails before any partition has tried to truncate. --- sql/ha_partition.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'sql/ha_partition.h') diff --git a/sql/ha_partition.h b/sql/ha_partition.h index f1abc0cefe2..cb91dcba6a3 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -362,7 +362,7 @@ public: @remark This method is a partitioning-specific hook and thus not a member of the general SE API. */ - int truncate_partition(Alter_info *); + int truncate_partition(Alter_info *, bool *to_binlog); virtual bool is_fatal_error(int error, uint flags) { -- cgit v1.2.1 From e0887df8e1127c0f1410b9d4ad61647cb5f93be2 Mon Sep 17 00:00:00 2001 From: Mattias Jonsson Date: Fri, 25 Mar 2011 12:36:02 +0100 Subject: Bug#11766249 bug#59316: PARTITIONING AND INDEX_MERGE MEMORY LEAK When executing row-ordered-retrieval index merge, the handler was cloned, but it used the wrong memory root, so instead of allocating memory on the thread/query's mem_root, it used the table's mem_root, resulting in non released memory in the table object, and was not freed until the table was closed. Solution was to ensure that memory used during cloning of a handler was allocated from the correct memory root. This was implemented by fixing handler::clone() to also take a name argument, so it can be used with partitioning. And in ha_partition only allocate the ha_partition's ref, and call the original ha_partition partitions clone() and set at cloned partitions. Fix of .bzrignore on Windows with VS 2010 --- sql/ha_partition.h | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) (limited to 'sql/ha_partition.h') diff --git a/sql/ha_partition.h b/sql/ha_partition.h index 76b91e160ca..a38d56af8ff 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -133,6 +133,13 @@ private: bool m_is_sub_partitioned; // Is subpartitioned bool m_ordered_scan_ongoing; + /* + If set, this object was created with ha_partition::clone and doesn't + "own" the m_part_info structure. + */ + ha_partition *m_is_clone_of; + MEM_ROOT *m_clone_mem_root; + /* We keep track if all underlying handlers are MyISAM since MyISAM has a great number of extra flags not needed by other handlers. @@ -169,11 +176,6 @@ private: PARTITION_SHARE *share; /* Shared lock info */ #endif - /* - TRUE <=> this object was created with ha_partition::clone and doesn't - "own" the m_part_info structure. - */ - bool is_clone; bool auto_increment_lock; /**< lock reading/updating auto_inc */ /** Flag to keep the auto_increment lock through out the statement. @@ -186,7 +188,7 @@ private: /** used for prediction of start_bulk_insert rows */ enum_monotonicity_info m_part_func_monotonicity_info; public: - handler *clone(MEM_ROOT *mem_root); + handler *clone(const char *name, MEM_ROOT *mem_root); virtual void set_part_info(partition_info *part_info) { m_part_info= part_info; @@ -205,6 +207,10 @@ public: */ ha_partition(handlerton *hton, TABLE_SHARE * table); ha_partition(handlerton *hton, partition_info * part_info); + ha_partition(handlerton *hton, TABLE_SHARE *share, + partition_info *part_info_arg, + ha_partition *clone_arg, + MEM_ROOT *clone_mem_root_arg); ~ha_partition(); /* A partition handler has no characteristics in itself. It only inherits @@ -275,7 +281,7 @@ private: And one method to read it in. */ bool create_handler_file(const char *name); - bool get_from_handler_file(const char *name, MEM_ROOT *mem_root); + bool get_from_handler_file(const char *name, MEM_ROOT *mem_root, bool clone); bool new_handlers_from_part_info(MEM_ROOT *mem_root); bool create_handlers(MEM_ROOT *mem_root); void clear_handler_file(); -- cgit v1.2.1 From bd92ea43116b8ce606de5e6fc825e1a8b87a7740 Mon Sep 17 00:00:00 2001 From: Mattias Jonsson Date: Wed, 20 Apr 2011 17:52:33 +0200 Subject: Bug#11766249 bug#59316: PARTITIONING AND INDEX_MERGE MEMORY LEAK Update for previous patch according to reviewers comments. Updated the constructors for ha_partitions to use the common init_handler_variables functions Added use of defines for size and offset to get better readability for the code that reads and writes the .par file. Also refactored the get_from_handler_file function. --- sql/ha_partition.h | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) (limited to 'sql/ha_partition.h') diff --git a/sql/ha_partition.h b/sql/ha_partition.h index a38d56af8ff..cd90c4cc1d5 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -55,6 +55,16 @@ typedef struct st_ha_data_partition HA_DUPLICATE_POS | \ HA_CAN_SQL_HANDLER | \ HA_CAN_INSERT_DELAYED) + +/* First 4 bytes in the .par file is the number of 32-bit words in the file */ +#define PAR_WORD_SIZE 4 +/* offset to the .par file checksum */ +#define PAR_CHECKSUM_OFFSET 4 +/* offset to the total number of partitions */ +#define PAR_NUM_PARTS_OFFSET 8 +/* offset to the engines array */ +#define PAR_ENGINES_OFFSET 12 + class ha_partition :public handler { private: @@ -71,7 +81,7 @@ private: /* Data for the partition handler */ int m_mode; // Open mode uint m_open_test_lock; // Open test_if_locked - char *m_file_buffer; // Buffer with names + char *m_file_buffer; // Content of the .par file char *m_name_buffer_ptr; // Pointer to first partition name plugin_ref *m_engine_array; // Array of types of the handlers handler **m_file; // Array of references to handler inst. @@ -281,7 +291,10 @@ private: And one method to read it in. */ bool create_handler_file(const char *name); - bool get_from_handler_file(const char *name, MEM_ROOT *mem_root, bool clone); + bool setup_engine_array(MEM_ROOT *mem_root); + bool read_par_file(const char *name); + bool get_from_handler_file(const char *name, MEM_ROOT *mem_root, + bool is_clone); bool new_handlers_from_part_info(MEM_ROOT *mem_root); bool create_handlers(MEM_ROOT *mem_root); void clear_handler_file(); -- cgit v1.2.1 From 9b076952ec83b455b3730990c78fb78c2d689674 Mon Sep 17 00:00:00 2001 From: Jon Olav Hauglid Date: Wed, 1 Jun 2011 10:06:55 +0200 Subject: Bug#11853126 RE-ENABLE CONCURRENT READS WHILE CREATING SECONDARY INDEX IN INNODB MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The patches for Bug#11751388 and Bug#11784056 enabled concurrent reads while creating secondary indexes in InnoDB. However, they introduced a regression. This regression occured if ALTER TABLE failed after the index had been added, for example during the lock upgrade needed to update .FRM. If this happened, InnoDB and the server got out of sync with regards to which indexes actually existed. Therefore the patch for Bug#11815600 again disabled concurrent reads. This patch re-enables concurrent reads. The original regression is fixed by splitting the ADD INDEX operation into two parts. First the new index is created but not made active. This is done while concurrent reads are allowed. The second part of the operation makes the index active (or reverts the change). This is done after lock upgrade, which prevents the original regression. In order to implement this change, the patch changes the storage API for in-place index creation. handler::add_index() is split into two functions, handler_add_index() and handler::final_add_index(). The former for creating indexes without making them visible and the latter for commiting (i.e. making visible) new indexes or reverting the changes. Large parts of this patch were written by Marko Mäkelä. Test case added to innodb_mysql_lock.test. --- sql/ha_partition.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'sql/ha_partition.h') diff --git a/sql/ha_partition.h b/sql/ha_partition.h index b0b11da823f..091efcfa727 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -1055,7 +1055,9 @@ public: They are used for on-line/fast alter table add/drop index: ------------------------------------------------------------------------- */ - virtual int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys); + virtual int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys, + handler_add_index **add); + virtual int final_add_index(handler_add_index *add, bool commit); virtual int prepare_drop_index(TABLE *table_arg, uint *key_num, uint num_of_keys); virtual int final_drop_index(TABLE *table_arg); -- cgit v1.2.1