author: Marko Mäkelä <marko.makela@mariadb.com> 2020-02-13 19:12:17 +0200
committer: Marko Mäkelä <marko.makela@mariadb.com> 2020-02-13 19:12:17 +0200
commit: 7ae21b18a6b73bbc3bf1ff448faf60c29ac1d386
tree: cd6d9521f59a8e1570897856bf94ab8c84e5bd83 /storage/innobase/include/mtr0log.h
parent: 98690052017d2091a463d335eba2c6fc8cac7cb9
MDEV-12353: Change the redo log encoding

log_t::FORMAT_10_5: physical redo log format tag

log_phys_t: Buffered records in the physical format.
The log record bytes will follow the last data field,
making use of alignment padding that would otherwise be wasted.
If there are multiple records for the same page, also those
may be appended to an existing log_phys_t object if the memory
is available.

In the physical format, the first byte of a record identifies the
record and its length (up to 15 bytes). For longer records, the
immediately following bytes will encode the remaining length
in a variable-length encoding. Usually, a variable-length-encoded
page identifier will follow, followed by optional payload, whose
length is included in the initially encoded total record length.

When a mini-transaction is updating multiple fields in a page,
it can avoid repeating the tablespace identifier and page number
by setting the same_page flag (most significant bit) in the first
byte of the log record. The byte offset of the record will be
relative to where the previous record for that page ended.

Until MDEV-14425 introduces a separate file-level log for
redo log checkpoints and file operations, we will write the
file-level records in the page-level redo log file.
The record FILE_CHECKPOINT (which replaces MLOG_CHECKPOINT)
will be removed in MDEV-14425, and one sequential scan of the
page recovery log will suffice.

Compared to MLOG_FILE_CREATE2, FILE_CREATE will not include any flags.
If the information is needed, it can be parsed from WRITE records that
modify FSP_SPACE_FLAGS.

MLOG_ZIP_WRITE_STRING: Remove. The record was only introduced
temporarily as part of this work, before being replaced with WRITE
(along with MLOG_WRITE_STRING, MLOG_1BYTE, MLOG_nBYTES).

mtr_buf_t::empty(): Check if the buffer is empty.

mtr_t::m_n_log_recs: Remove. It suffices to check if m_log is empty.

mtr_t::m_last, mtr_t::m_last_offset: End of the latest m_log record,
for the same_page encoding.

page_recv_t::last_offset: Reflects mtr_t::m_last_offset.

Valid values for last_offset during recovery should be 0 or above 8.
(The first 8 bytes of a page are the checksum and the page number,
and neither is ever updated directly by log records.)
Internally, the special value 1 indicates that the same_page form
will not be allowed for the subsequent record.

mtr_t::page_create(): Take the block descriptor as parameter,
so that it can be compared to mtr_t::m_last. The INIT_INDEX_PAGE
record will always be followed by a subtype byte, because same_page
records must be longer than 1 byte.

trx_undo_page_init(): Combine the writes into a single WRITE record.

trx_undo_header_create(): Write 4 bytes using a special MEMSET
record that includes 1 byte of length and 2 bytes of payload.

flst_write_addr(): Define as a static function. Combine the writes.

flst_zero_both(): Replaces two flst_zero_addr() calls.

flst_init(): Do not inline the function.

fsp_free_seg_inode(): Zerofill the whole inode.

fsp_apply_init_file_page(): Initialize FIL_PAGE_PREV, FIL_PAGE_NEXT
to FIL_NULL when using the physical format.

btr_create(): Assert !page_has_siblings() because
fsp_apply_init_file_page() must have been invoked.

fil_ibd_create(): Do not write FILE_MODIFY after FILE_CREATE.

fil_names_dirty_and_write(): Remove the parameter mtr.
Write the records using a separate mini-transaction object,
because any FILE_ records must be at the start of a
mini-transaction log.

recv_recover_page(): Add a fil_space_t* parameter.
After applying log to a ROW_FORMAT=COMPRESSED page,
invoke buf_zip_decompress() to restore the uncompressed page.

buf_page_io_complete(): Remove the temporary hack to discard the
uncompressed page of a ROW_FORMAT=COMPRESSED page.

page_zip_write_header(): Remove. Use mtr_t::write() or
mtr_t::memset() instead, and update the compressed page frame
separately.

trx_undo_header_add_space_for_xid(): Remove.

trx_undo_seg_create(): Perform the changes that were previously
made by trx_undo_header_add_space_for_xid().

btr_reset_instant(): New function: Reset the table to MariaDB 10.2
or 10.3 format when rolling back an instant ALTER TABLE operation.

page_rec_find_owner_rec(): Merge with the only callers.

page_cur_insert_rec_low(): Combine writes by using a local buffer.
MEMMOVE data from the preceding record whenever feasible
(copying at least 3 bytes).

page_cur_insert_rec_zip(): Combine writes to page header fields.

PageBulk::insertPage(): Issue MEMMOVE records to copy a matching
part from the preceding record.

PageBulk::finishPage(): Combine the writes to the page header
and to the sparse page directory slots.

mtr_t::write(): Only log the least significant (last) bytes
of multi-byte fields that actually differ.

For updating FSP_SIZE, we must always write all 4 bytes to the
redo log, so that the fil_space_set_recv_size() logic in
recv_sys_t::parse() will work.

mtr_t::memcpy(), mtr_t::zmemcpy(): Take a pointer argument
instead of a numeric offset to the page frame. Only log the
last bytes of multi-byte fields that actually differ.

In fil_space_crypt_t::write_page0(), we must also log any
unchanged bytes, so that recovery will recognize the record
and invoke fil_crypt_parse().

Future work:
MDEV-21724 Optimize page_cur_insert_rec_low() redo logging
MDEV-21725 Optimize btr_page_reorganize_low() redo logging
MDEV-21727 Optimize redo logging for ROW_FORMAT=COMPRESSED
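
The variable-length integer encoding mentioned in the message (used for the extended record length, the page identifier and the byte offsets) can be illustrated with a small standalone sketch. It is not the code from this commit: the names sketch_encode() and sketch_decode() are hypothetical, and only the MIN_nBYTE thresholds and the tag-bit layout are taken from the mtr0log.h diff. Unlike mlog_encode_varint(), the sketch does not assert that the value is below the error sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Thresholds as declared in the diff: the smallest value that needs
// 2, 3, 4 or 5 encoded bytes. Each longer form starts counting where
// the shorter one ends, so every value has exactly one encoding.
constexpr uint32_t MIN_2BYTE = 1U << 7;
constexpr uint32_t MIN_3BYTE = MIN_2BYTE + (1U << 14);
constexpr uint32_t MIN_4BYTE = MIN_3BYTE + (1U << 21);
constexpr uint32_t MIN_5BYTE = MIN_4BYTE + (1U << 28);
constexpr uint32_t DECODE_ERROR = ~0U;  // mirrors MLOG_DECODE_ERROR

// Hypothetical helper: encode v into out[] and return the number of
// bytes written (1 to 5). The caller must provide at least 5 bytes.
inline size_t sketch_encode(uint8_t *out, uint32_t v)
{
  size_t n;
  uint8_t tag;
  if (v < MIN_2BYTE)      { n = 1; tag = 0x00; }
  else if (v < MIN_3BYTE) { n = 2; tag = 0x80; v -= MIN_2BYTE; }
  else if (v < MIN_4BYTE) { n = 3; tag = 0xc0; v -= MIN_3BYTE; }
  else if (v < MIN_5BYTE) { n = 4; tag = 0xe0; v -= MIN_4BYTE; }
  else                    { n = 5; tag = 0xf0; v -= MIN_5BYTE; }
  // Store the trailing bytes in big-endian order, last byte first.
  for (size_t i = n; i-- > 1; )
  {
    out[i] = static_cast<uint8_t>(v);
    v >>= 8;
  }
  // The first byte carries the length tag; for the 1- to 4-byte forms
  // it also carries the remaining most significant bits of the value.
  out[0] = n == 5 ? tag : static_cast<uint8_t>(tag | v);
  return n;
}

// Hypothetical helper: decode one integer, or return DECODE_ERROR
// if the first byte is not a valid length tag or the value overflows.
inline uint32_t sketch_decode(const uint8_t *in)
{
  const uint32_t first = in[0];
  if (first < 0x80)
    return first;
  if (first < 0xc0)
    return MIN_2BYTE + ((first & 0x3f) << 8 | in[1]);
  if (first < 0xe0)
    return MIN_3BYTE + ((first & 0x1f) << 16 | uint32_t{in[1]} << 8 | in[2]);
  if (first < 0xf0)
    return MIN_4BYTE + ((first & 0x0f) << 24 | uint32_t{in[1]} << 16 |
                        uint32_t{in[2]} << 8 | in[3]);
  if (first == 0xf0)
  {
    const uint32_t v = uint32_t{in[1]} << 24 | uint32_t{in[2]} << 16 |
                       uint32_t{in[3]} << 8 | in[4];
    if (v <= ~MIN_5BYTE)
      return MIN_5BYTE + v;
  }
  return DECODE_ERROR;
}
```

Under these assumptions, values below 128 take a single byte, and a typical page offset (below 16,512) takes at most two bytes, which is why the same_page form with a relative offset keeps short records within the 15-byte limit of the first-byte length field.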
Diffstat (limited to 'storage/innobase/include/mtr0log.h')
-rw-r--r-- storage/innobase/include/mtr0log.h | 498
1 files changed, 459 insertions, 39 deletions
diff --git a/storage/innobase/include/mtr0log.h b/storage/innobase/include/mtr0log.h
index 06afdbb54bc..71faf119cf0 100644
--- a/storage/innobase/include/mtr0log.h
+++ b/storage/innobase/include/mtr0log.h
@@ -33,82 +33,478 @@ Created 12/7/1995 Heikki Tuuri
// Forward declaration
struct dict_index_t;
+/** The minimum 2-byte integer (0b10xxxxxx xxxxxxxx) */
+constexpr uint32_t MIN_2BYTE= 1 << 7;
+/** The minimum 3-byte integer (0b110xxxxx xxxxxxxx xxxxxxxx) */
+constexpr uint32_t MIN_3BYTE= MIN_2BYTE + (1 << 14);
+/** The minimum 4-byte integer (0b1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx) */
+constexpr uint32_t MIN_4BYTE= MIN_3BYTE + (1 << 21);
+/** Minimum 5-byte integer (0b11110000 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx) */
+constexpr uint32_t MIN_5BYTE= MIN_4BYTE + (1 << 28);
+
+/** Error from mlog_decode_varint() */
+constexpr uint32_t MLOG_DECODE_ERROR= ~0U;
+
+/** Decode the length of a variable-length encoded integer.
+@param first first byte of the encoded integer
+@return the length, in bytes */
+inline uint8_t mlog_decode_varint_length(byte first)
+{
+ uint8_t len= 1;
+ for (; first & 0x80; len++, first<<= 1);
+ return len;
+}
+
+/** Decode an integer in a redo log record.
+@param log redo log record buffer
+@return the decoded integer
+@retval MLOG_DECODE_ERROR on error */
+inline uint32_t mlog_decode_varint(const byte* log)
+{
+ uint32_t i= *log;
+ if (i < MIN_2BYTE)
+ return i;
+ if (i < 0xc0)
+ return MIN_2BYTE + ((i & ~0x80) << 8 | log[1]);
+ if (i < 0xe0)
+ return MIN_3BYTE + ((i & ~0xc0) << 16 | uint32_t{log[1]} << 8 | log[2]);
+ if (i < 0xf0)
+ return MIN_4BYTE + ((i & ~0xe0) << 24 | uint32_t{log[1]} << 16 |
+ uint32_t{log[2]} << 8 | log[3]);
+ if (i == 0xf0)
+ {
+ i= uint32_t{log[1]} << 24 | uint32_t{log[2]} << 16 |
+ uint32_t{log[3]} << 8 | log[4];
+ if (i <= ~MIN_5BYTE)
+ return MIN_5BYTE + i;
+ }
+ return MLOG_DECODE_ERROR;
+}
+
+/** Encode an integer in a redo log record.
+@param log redo log record buffer
+@param i the integer to encode
+@return end of the encoded integer */
+inline byte *mlog_encode_varint(byte *log, size_t i)
+{
+ if (i < MIN_2BYTE)
+ {
+ }
+ else if (i < MIN_3BYTE)
+ {
+ i-= MIN_2BYTE;
+ static_assert(MIN_3BYTE - MIN_2BYTE == 1 << 14, "compatibility");
+ *log++= 0x80 | static_cast<byte>(i >> 8);
+ }
+ else if (i < MIN_4BYTE)
+ {
+ i-= MIN_3BYTE;
+ static_assert(MIN_4BYTE - MIN_3BYTE == 1 << 21, "compatibility");
+ *log++= 0xc0 | static_cast<byte>(i >> 16);
+ goto last2;
+ }
+ else if (i < MIN_5BYTE)
+ {
+ i-= MIN_4BYTE;
+ static_assert(MIN_5BYTE - MIN_4BYTE == 1 << 28, "compatibility");
+ *log++= 0xe0 | static_cast<byte>(i >> 24);
+ goto last3;
+ }
+ else
+ {
+ ut_ad(i < MLOG_DECODE_ERROR);
+ i-= MIN_5BYTE;
+ *log++= 0xf0;
+ *log++= static_cast<byte>(i >> 24);
+last3:
+ *log++= static_cast<byte>(i >> 16);
+last2:
+ *log++= static_cast<byte>(i >> 8);
+ }
+ *log++= static_cast<byte>(i);
+ return log;
+}
+
+/** Determine the length of a log record.
+@param log start of log record
+@param end end of the log record buffer
+@return the length of the record, in bytes
+@retval 0 if the log extends past the end
+@retval MLOG_DECODE_ERROR if the record is corrupted */
+inline uint32_t mlog_decode_len(const byte *log, const byte *end)
+{
+ ut_ad(log < end);
+ uint32_t i= *log;
+ if (!i)
+ return 0; /* end of mini-transaction */
+ if (~i & 15)
+ return (i & 15) + 1; /* 1..16 bytes */
+ if (UNIV_UNLIKELY(++log == end))
+ return 0; /* end of buffer */
+ i= *log;
+ if (UNIV_LIKELY(i < MIN_2BYTE)) /* 1 additional length byte: 16..143 bytes */
+ return 16 + i;
+ if (i < 0xc0) /* 2 additional length bytes: 144..16,527 bytes */
+ {
+ if (UNIV_UNLIKELY(log + 1 == end))
+ return 0; /* end of buffer */
+ return 16 + MIN_2BYTE + ((i & ~0xc0) << 8 | log[1]);
+ }
+ if (i < 0xe0) /* 3 additional length bytes: 16528..1065103 bytes */
+ {
+ if (UNIV_UNLIKELY(log + 2 == end))
+ return 0; /* end of buffer */
+ return 16 + MIN_3BYTE + ((i & ~0xe0) << 16 |
+ static_cast<uint32_t>(log[1]) << 8 | log[2]);
+ }
+ /* 1,065,103 bytes per log record ought to be enough for everyone */
+ return MLOG_DECODE_ERROR;
+}
+
/** Write 1, 2, 4, or 8 bytes to a file page.
@param[in] block file page
@param[in,out] ptr pointer in file page
@param[in] val value to write
@tparam l number of bytes to write
@tparam w write request type
-@tparam V type of val */
+@tparam V type of val
+@return whether any log was written */
template<unsigned l,mtr_t::write_type w,typename V>
-inline void mtr_t::write(const buf_block_t &block, byte *ptr, V val)
+inline bool mtr_t::write(const buf_block_t &block, void *ptr, V val)
{
ut_ad(ut_align_down(ptr, srv_page_size) == block.frame);
- ut_ad(m_log_mode == MTR_LOG_NONE || m_log_mode == MTR_LOG_NO_REDO ||
- !block.page.zip.data ||
- /* written by fil_crypt_rotate_page() or innodb_make_page_dirty()? */
- (w == FORCED && l == 1 && ptr == &block.frame[FIL_PAGE_SPACE_ID]) ||
- mach_read_from_2(block.frame + FIL_PAGE_TYPE) <= FIL_PAGE_TYPE_ZBLOB2);
static_assert(l == 1 || l == 2 || l == 4 || l == 8, "wrong length");
+ byte buf[l];
switch (l) {
case 1:
- if (w == OPT && mach_read_from_1(ptr) == val) return;
- ut_ad(w != NORMAL || mach_read_from_1(ptr) != val);
ut_ad(val == static_cast<byte>(val));
- *ptr= static_cast<byte>(val);
+ buf[0]= static_cast<byte>(val);
break;
case 2:
ut_ad(val == static_cast<uint16_t>(val));
- if (w == OPT && mach_read_from_2(ptr) == val) return;
- ut_ad(w != NORMAL || mach_read_from_2(ptr) != val);
- mach_write_to_2(ptr, static_cast<uint16_t>(val));
+ mach_write_to_2(buf, static_cast<uint16_t>(val));
break;
case 4:
ut_ad(val == static_cast<uint32_t>(val));
- if (w == OPT && mach_read_from_4(ptr) == val) return;
- ut_ad(w != NORMAL || mach_read_from_4(ptr) != val);
- mach_write_to_4(ptr, static_cast<uint32_t>(val));
+ mach_write_to_4(buf, static_cast<uint32_t>(val));
break;
case 8:
- if (w == OPT && mach_read_from_8(ptr) == val) return;
- ut_ad(w != NORMAL || mach_read_from_8(ptr) != val);
- mach_write_to_8(ptr, val);
+ mach_write_to_8(buf, val);
break;
}
+ byte *p= static_cast<byte*>(ptr);
+ const byte *const end= p + l;
+ if (w != FORCED && m_log_mode == MTR_LOG_ALL)
+ {
+ const byte *b= buf;
+ while (*p++ == *b++)
+ {
+ if (p == end)
+ {
+ ut_ad(w == OPT);
+ return false;
+ }
+ }
+ p--;
+ }
+ ::memcpy(ptr, buf, l);
+ memcpy_low(block.page, static_cast<uint16_t>
+ (ut_align_offset(p, srv_page_size)), p, end - p);
+ return true;
+}
+
+/** Log an initialization of a string of bytes.
+@param[in] b buffer page
+@param[in] ofs byte offset from b->frame
+@param[in] len length of the data to write
+@param[in] val the data byte to write */
+inline void mtr_t::memset(const buf_block_t &b, ulint ofs, ulint len, byte val)
+{
+ ut_ad(len);
set_modified();
if (m_log_mode != MTR_LOG_ALL)
return;
- byte *log_ptr= m_log.open(11 + 2 + (l == 8 ? 9 : 5));
- if (l == 8)
- log_write(block, ptr, static_cast<mlog_id_t>(l), log_ptr, uint64_t{val});
- else
- log_write(block, ptr, static_cast<mlog_id_t>(l), log_ptr,
- static_cast<uint32_t>(val));
+
+ static_assert(MIN_4BYTE > UNIV_PAGE_SIZE_MAX, "consistency");
+ size_t lenlen= (len < MIN_2BYTE ? 1 + 1 : len < MIN_3BYTE ? 2 + 1 : 3 + 1);
+ byte *l= log_write<MEMSET>(b.page.id, &b.page, lenlen, true, ofs);
+ l= mlog_encode_varint(l, len);
+ *l++= val;
+ m_log.close(l);
+ m_last_offset= static_cast<uint16_t>(ofs + len);
}
-/** Write a byte string to a page.
+/** Initialize a string of bytes.
@param[in,out] b buffer page
+@param[in] ofs byte offset from block->frame
+@param[in] len length of the data to write
+@param[in] val the data byte to write */
+inline void mtr_t::memset(const buf_block_t *b, ulint ofs, ulint len, byte val)
+{
+ ut_ad(ofs <= ulint(srv_page_size));
+ ut_ad(ofs + len <= ulint(srv_page_size));
+ ::memset(ofs + b->frame, val, len);
+ memset(*b, ofs, len, val);
+}
+
+/** Log an initialization of a repeating string of bytes.
+@param[in] b buffer page
+@param[in] ofs byte offset from b->frame
+@param[in] len length of the data to write, in bytes
+@param[in] str the string to write
+@param[in] size size of str, in bytes */
+inline void mtr_t::memset(const buf_block_t &b, ulint ofs, size_t len,
+ const void *str, size_t size)
+{
+ ut_ad(size);
+ ut_ad(len > size); /* use mtr_t::memcpy() for shorter writes */
+ set_modified();
+ if (m_log_mode != MTR_LOG_ALL)
+ return;
+
+ static_assert(MIN_4BYTE > UNIV_PAGE_SIZE_MAX, "consistency");
+ size_t lenlen= (len < MIN_2BYTE ? 1 : len < MIN_3BYTE ? 2 : 3);
+ byte *l= log_write<MEMSET>(b.page.id, &b.page, lenlen + size, true, ofs);
+ l= mlog_encode_varint(l, len);
+ ::memcpy(l, str, size);
+ l+= size;
+ m_log.close(l);
+ m_last_offset= static_cast<uint16_t>(ofs + len);
+}
+
+/** Initialize a repeating string of bytes.
+@param[in,out] b buffer page
+@param[in] ofs byte offset from b->frame
+@param[in] len length of the data to write, in bytes
+@param[in] str the string to write
+@param[in] size size of str, in bytes */
+inline void mtr_t::memset(const buf_block_t *b, ulint ofs, size_t len,
+ const void *str, size_t size)
+{
+ ut_ad(ofs <= ulint(srv_page_size));
+ ut_ad(ofs + len <= ulint(srv_page_size));
+ ut_ad(len > size); /* use mtr_t::memcpy() for shorter writes */
+ size_t s= 0;
+ while (s < len)
+ {
+ ::memcpy(ofs + s + b->frame, str, size);
+ s+= len;
+ }
+ ::memcpy(ofs + s + b->frame, str, len - s);
+ memset(*b, ofs, len, str, size);
+}
+
+/** Log a write of a byte string to a page.
+@param[in] b buffer page
@param[in] offset byte offset from b->frame
@param[in] str the data to write
@param[in] len length of the data to write */
-inline
-void mtr_t::memcpy(buf_block_t *b, ulint offset, const void *str, ulint len)
+inline void mtr_t::memcpy(const buf_block_t &b, ulint offset, ulint len)
+{
+ ut_ad(len);
+ ut_ad(offset <= ulint(srv_page_size));
+ ut_ad(offset + len <= ulint(srv_page_size));
+ memcpy_low(b.page, uint16_t(offset), &b.frame[offset], len);
+}
+
+/** Log a write of a byte string to a page.
+@param id page identifier
+@param offset byte offset within page
+@param data data to be written
+@param len length of the data, in bytes */
+inline void mtr_t::memcpy_low(const buf_page_t &bpage, uint16_t offset,
+ const void *data, size_t len)
+{
+ ut_ad(len);
+ set_modified();
+ if (m_log_mode != MTR_LOG_ALL)
+ return;
+ if (len < mtr_buf_t::MAX_DATA_SIZE - (1 + 3 + 3 + 5 + 5))
+ {
+ byte *end= log_write<WRITE>(bpage.id, &bpage, len, true, offset);
+ ::memcpy(end, data, len);
+ m_log.close(end + len);
+ }
+ else
+ {
+ m_log.close(log_write<WRITE>(bpage.id, &bpage, len, false, offset));
+ m_log.push(static_cast<const byte*>(data), static_cast<uint32_t>(len));
+ }
+ m_last_offset= static_cast<uint16_t>(offset + len);
+}
+
+/** Log that a string of bytes was copied from the same page.
+@param[in] b buffer page
+@param[in] d destination offset within the page
+@param[in] s source offset within the page
+@param[in] len length of the data to copy */
+inline void mtr_t::memmove(const buf_block_t &b, ulint d, ulint s, ulint len)
{
- ::memcpy(b->frame + offset, str, len);
- memcpy(*b, offset, len);
+ ut_ad(d >= 8);
+ ut_ad(s >= 8);
+ ut_ad(len);
+ ut_ad(s <= ulint(srv_page_size));
+ ut_ad(s + len <= ulint(srv_page_size));
+ ut_ad(s != d);
+ ut_ad(d <= ulint(srv_page_size));
+ ut_ad(d + len <= ulint(srv_page_size));
+
+ set_modified();
+ if (m_log_mode != MTR_LOG_ALL)
+ return;
+ static_assert(MIN_4BYTE > UNIV_PAGE_SIZE_MAX, "consistency");
+ size_t lenlen= (len < MIN_2BYTE ? 1 : len < MIN_3BYTE ? 2 : 3);
+ /* The source offset is encoded relative to the destination offset,
+ with the sign in the least significant bit. */
+ if (s > d)
+ s= (s - d) << 1;
+ else
+ s= (d - s) << 1 | 1;
+ /* The source offset 0 is not possible. */
+ s-= 1 << 1;
+ size_t slen= (s < MIN_2BYTE ? 1 : s < MIN_3BYTE ? 2 : 3);
+ byte *l= log_write<MEMMOVE>(b.page.id, &b.page, lenlen + slen, true, d);
+ l= mlog_encode_varint(l, len);
+ l= mlog_encode_varint(l, s);
+ m_log.close(l);
+ m_last_offset= static_cast<uint16_t>(d + len);
+}
+
+/**
+Write a log record.
+@tparam type redo log record type
+@param id persistent page identifier
+@param bpage buffer pool page, or nullptr
+@param len number of additional bytes to write
+@param alloc whether to allocate the additional bytes
+@param offset byte offset, or 0 if the record type does not allow one
+@return end of mini-transaction log, minus len */
+template<byte type>
+inline byte *mtr_t::log_write(const page_id_t id, const buf_page_t *bpage,
+ size_t len, bool alloc, size_t offset)
+{
+ static_assert(!(type & 15) && type != RESERVED && type != OPTION &&
+ type <= FILE_CHECKPOINT, "invalid type");
+ ut_ad(type >= FILE_CREATE || is_named_space(id.space()));
+ ut_ad(!bpage || bpage->id == id);
+ constexpr bool have_len= type != INIT_PAGE && type != FREE_PAGE;
+ constexpr bool have_offset= type == WRITE || type == MEMSET ||
+ type == MEMMOVE;
+ static_assert(!have_offset || have_len, "consistency");
+ ut_ad(have_len || len == 0);
+ ut_ad(have_len || !alloc);
+ ut_ad(have_offset || offset == 0);
+ ut_ad(offset + len <= srv_page_size);
+ static_assert(MIN_4BYTE >= UNIV_PAGE_SIZE_MAX, "consistency");
+
+ size_t max_len;
+ if (!have_len)
+ max_len= 1 + 5 + 5;
+ else if (!have_offset)
+ max_len= m_last == bpage
+ ? 1 + 3
+ : 1 + 3 + 5 + 5;
+ else if (m_last == bpage && m_last_offset <= offset)
+ {
+ /* Encode the offset relative from m_last_offset. */
+ offset-= m_last_offset;
+ max_len= 1 + 3 + 3;
+ }
+ else
+ max_len= 1 + 3 + 5 + 5 + 3;
+ byte *const log_ptr= m_log.open(alloc ? max_len + len : max_len);
+ byte *end= log_ptr + 1;
+ const byte same_page= max_len < 1 + 5 + 5 ? 0x80 : 0;
+ if (!same_page)
+ {
+ end= mlog_encode_varint(end, id.space());
+ end= mlog_encode_varint(end, id.page_no());
+ m_last= bpage;
+ }
+ if (have_offset)
+ {
+ byte* oend= mlog_encode_varint(end, offset);
+ if (oend + len > &log_ptr[16])
+ {
+ len+= oend - log_ptr - 15;
+ if (len >= MIN_3BYTE)
+ len+= 2;
+ else if (len >= MIN_2BYTE)
+ len++;
+
+ *log_ptr= type | same_page;
+ end= mlog_encode_varint(log_ptr + 1, len);
+ if (!same_page)
+ {
+ end= mlog_encode_varint(end, id.space());
+ end= mlog_encode_varint(end, id.page_no());
+ }
+ end= mlog_encode_varint(end, offset);
+ return end;
+ }
+ else
+ end= oend;
+ }
+ else if (len >= 3 && end + len > &log_ptr[16])
+ {
+ len+= end - log_ptr - 16;
+ if (len >= MIN_3BYTE)
+ len+= 2;
+ else if (len >= MIN_2BYTE)
+ len++;
+
+ end= log_ptr;
+ *end++= type | same_page;
+ mlog_encode_varint(end, len);
+
+ if (!same_page)
+ {
+ end= mlog_encode_varint(end, id.space());
+ end= mlog_encode_varint(end, id.page_no());
+ }
+ return end;
+ }
+
+ ut_ad(end + len >= &log_ptr[1] + !same_page);
+ ut_ad(end + len <= &log_ptr[16]);
+ ut_ad(end <= &log_ptr[max_len]);
+ *log_ptr= type | same_page | static_cast<byte>(end + len - log_ptr - 1);
+ ut_ad(*log_ptr & 15);
+ return end;
}
/** Write a byte string to a page.
-@param[in,out] b ROW_FORMAT=COMPRESSED index page
-@param[in] ofs byte offset from b->zip.data
+@param[in] b buffer page
+@param[in] dest destination within b.frame
@param[in] str the data to write
-@param[in] len length of the data to write */
-inline
-void mtr_t::zmemcpy(buf_page_t *b, ulint offset, const void *str, ulint len)
+@param[in] len length of the data to write
+@tparam w write request type */
+template<mtr_t::write_type w>
+inline void mtr_t::memcpy(const buf_block_t &b, void *dest, const void *str,
+ ulint len)
{
- ::memcpy(b->zip.data + offset, str, len);
- zmemcpy(*b, offset, len);
+ ut_ad(ut_align_down(dest, srv_page_size) == b.frame);
+ char *d= static_cast<char*>(dest);
+ const char *s= static_cast<const char*>(str);
+ if (w != FORCED && m_log_mode == MTR_LOG_ALL)
+ {
+ ut_ad(len);
+ const char *const end= d + len;
+ while (*d++ == *s++)
+ {
+ if (d == end)
+ {
+ ut_ad(w == OPT);
+ return;
+ }
+ }
+ s--;
+ d--;
+ len= static_cast<ulint>(end - d);
+ }
+ ::memcpy(d, s, len);
+ memcpy(b, ut_align_offset(d, srv_page_size), len);
}
/** Initialize an entire page.
@@ -121,13 +517,37 @@ inline void mtr_t::init(buf_block_t *b)
return;
}
- m_log.close(log_write_low(MLOG_INIT_FILE_PAGE2, b->page.id, m_log.open(11)));
+ m_log.close(log_write<INIT_PAGE>(b->page.id, &b->page));
+ m_last_offset= FIL_PAGE_TYPE;
b->page.init_on_flush= true;
}
+/** Free a page.
+@param id page identifier */
+inline void mtr_t::free(const page_id_t id)
+{
+ if (m_log_mode == MTR_LOG_ALL)
+ m_log.close(log_write<FREE_PAGE>(id, nullptr));
+}
+
+/** Partly initialize a B-tree page.
+@param block B-tree page
+@param comp false=ROW_FORMAT=REDUNDANT, true=COMPACT or DYNAMIC */
+inline void mtr_t::page_create(const buf_block_t &block, bool comp)
+{
+ set_modified();
+ if (m_log_mode != MTR_LOG_ALL)
+ return;
+ byte *l= log_write<INIT_INDEX_PAGE>(block.page.id, &block.page, 1, true);
+ *l++= comp;
+ m_log.close(l);
+ m_last_offset= FIL_PAGE_TYPE;
+}
+
/********************************************************//**
-Parses an initial log record written by mtr_t::log_write_low().
+Parses an initial log record written by mlog_write_initial_log_record_low().
@return parsed record end, NULL if not a complete record */
+ATTRIBUTE_COLD /* only used when crash-upgrading */
const byte*
mlog_parse_initial_log_record(
/*==========================*/