-rw-r--r-- framing_format.txt 18
-rw-r--r-- snappy.h 17
2 files changed, 18 insertions, 17 deletions
diff --git a/framing_format.txt b/framing_format.txt
index 08fda03..32b1e59 100644
--- a/framing_format.txt
+++ b/framing_format.txt
@@ -1,5 +1,5 @@
Snappy framing format description
-Last revised: 2011-12-15
+Last revised: 2013-01-05
This format decribes a framing format for Snappy, allowing compressing to
files or streams that can then more easily be decompressed without having
@@ -15,9 +15,9 @@ decompressor; it is not part of the Snappy core specification.
The file consists solely of chunks, lying back-to-back with no padding
in between. Each chunk consists first a single byte of chunk identifier,
-then a two-byte little-endian length of the chunk in bytes (from 0 to 65535,
-inclusive), and then the data if any. The three bytes of chunk header is not
-counted in the data length.
+then a three-byte little-endian length of the chunk in bytes (from 0 to
+16777215, inclusive), and then the data if any. The four bytes of chunk
+header is not counted in the data length.
The different chunk types are listed below. The first chunk must always
be the stream identifier chunk (see section 4.1, below). The stream
@@ -71,7 +71,7 @@ The stream identifier is always the first element in the stream.
It is exactly six bytes long and contains "sNaPpY" in ASCII. This means that
a valid Snappy framed stream always starts with the bytes
- 0xff 0x06 0x00 0x73 0x4e 0x61 0x50 0x70 0x59
+ 0xff 0x06 0x00 0x00 0x73 0x4e 0x61 0x50 0x70 0x59
The stream identifier chunk can come multiple times in the stream besides
the first; if such a chunk shows up, it should simply be ignored, assuming
@@ -86,9 +86,9 @@ see the compressed format specification. The compressed data is preceded by
the CRC-32C (see section 3) of the _uncompressed_ data.
Note that the data portion of the chunk, i.e., the compressed contents,
-can be at most 65531 bytes (2^16 - 1, minus the checksum).
+can be at most 16777211 bytes (2^24 - 1, minus the checksum).
However, we place an additional restriction that the uncompressed data
-in a chunk must be no longer than 32768 bytes. This allows consumers to
+in a chunk must be no longer than 65536 bytes. This allows consumers to
easily use small fixed-size buffers.
@@ -102,8 +102,8 @@ As in the compressed chunks, the data is preceded by its own masked
CRC-32C (see section 3).
An uncompressed data chunk, like compressed data chunks, should contain
-no more than 32768 data bytes, so the maximum legal chunk length with the
-checksum is 32772.
+no more than 65536 data bytes, so the maximum legal chunk length with the
+checksum is 65540.
4.4. Reserved unskippable chunks (chunk types 0x02-0x7f)
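Note on the framing_format.txt change: the chunk header grows from three bytes to four (one identifier byte plus a three-byte little-endian length). A reader for the revised header could look roughly like the sketch below. ChunkHeader, kChunkHeaderSize and ParseChunkHeader are hypothetical names chosen for illustration; they are not part of snappy.h, and this is not taken from any existing implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type for illustration; not part of snappy.h.
struct ChunkHeader {
  uint8_t type;     // chunk identifier byte
  uint32_t length;  // data length in bytes (0 to 16777215), header excluded
};

// One identifier byte plus a three-byte little-endian length.
static const size_t kChunkHeaderSize = 4;

// Reads the four-byte chunk header at the start of `data`.
// Returns false if fewer than four bytes are available.
inline bool ParseChunkHeader(const unsigned char* data, size_t size,
                             ChunkHeader* out) {
  assert(out != nullptr);
  if (size < kChunkHeaderSize) return false;
  out->type = data[0];
  // Least significant length byte comes first.
  out->length = static_cast<uint32_t>(data[1]) |
                (static_cast<uint32_t>(data[2]) << 8) |
                (static_cast<uint32_t>(data[3]) << 16);
  return true;
}
```

Applied to the new stream identifier bytes (0xff 0x06 0x00 0x00 ...), this yields chunk type 0xff and a data length of 6, matching the six bytes of "sNaPpY" that follow.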
diff --git a/snappy.h b/snappy.h
index d15ffbf..03ef6ce 100644
--- a/snappy.h
+++ b/snappy.h
@@ -142,15 +142,16 @@ namespace snappy {
bool IsValidCompressedBuffer(const char* compressed,
size_t compressed_length);
- // *** DO NOT CHANGE THE VALUE OF kBlockSize ***
+ // The size of a compression block. Note that many parts of the compression
+ // code assumes that kBlockSize <= 65536; in particular, the hash table
+ // can only store 16-bit offsets, and EmitCopy() also assumes the offset
+ // is 65535 bytes or less. Note also that if you change this, it will
+ // affect the framing format (see framing_format.txt).
//
- // New Compression code chops up the input into blocks of at most
- // the following size. This ensures that back-references in the
- // output never cross kBlockSize block boundaries. This can be
- // helpful in implementing blocked decompression. However the
- // decompression code should not rely on this guarantee since older
- // compression code may not obey it.
- static const int kBlockLog = 15;
+ // Note that there might be older data around that is compressed with larger
+ // block sizes, so the decompression code should not rely on the
+ // non-existence of long backreferences.
+ static const int kBlockLog = 16;
static const size_t kBlockSize = 1 << kBlockLog;
static const int kMaxHashTableBits = 14;
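Note on the kBlockLog change: the new comment ties kBlockSize to the limits in framing_format.txt. The sketch below works through that arithmetic under stated assumptions. The namespace framing_sketch and the names kChecksumSize, kMaxChunkLength, kMaxUncompressedChunkLength, kMaxCompressedPayload and IsLegalUncompressedChunkLength are hypothetical and do not exist in Snappy; only kBlockLog and kBlockSize mirror the header.

```cpp
#include <cassert>
#include <cstddef>

namespace framing_sketch {  // hypothetical namespace, not part of Snappy

// Mirrors the values in snappy.h after this change.
static const int kBlockLog = 16;
static const size_t kBlockSize = static_cast<size_t>(1) << kBlockLog;

// Assumed names for limits stated in framing_format.txt.
static const size_t kChecksumSize = 4;  // masked CRC-32C before the data
static const size_t kMaxChunkLength =
    (static_cast<size_t>(1) << 24) - 1;  // largest three-byte length

// Uncompressed chunk: at most kBlockSize data bytes plus the checksum.
static const size_t kMaxUncompressedChunkLength = kBlockSize + kChecksumSize;
// Compressed chunk: whatever the length field allows, minus the checksum.
static const size_t kMaxCompressedPayload = kMaxChunkLength - kChecksumSize;

// The header comment says the hash table can only store 16-bit offsets.
static_assert(kBlockSize <= 65536, "kBlockSize must not exceed 65536");

// True if `chunk_length` (checksum included) can hold the checksum and
// stays within the limit for an uncompressed data chunk.
inline bool IsLegalUncompressedChunkLength(size_t chunk_length) {
  return chunk_length >= kChecksumSize &&
         chunk_length <= kMaxUncompressedChunkLength;
}

}  // namespace framing_sketch
```

With kBlockLog = 16 these constants evaluate to 65540 and 16777211, the two figures that replace 32772 and 65531 in the framing description.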