| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
PiperOrigin-RevId: 372007801
|
|
|
|
|
|
| |
While we're here, take care of a couple of lint warnings by converting CHECK(a != b) to CHECK_NE(a, b).
PiperOrigin-RevId: 369132446
|
|
|
|
| |
PiperOrigin-RevId: 362386747
|
|
|
|
|
|
|
| |
This CL also removes support for using the gflags library to modify the
flags.
PiperOrigin-RevId: 361583626
|
|
|
|
| |
PiperOrigin-RevId: 361582956
|
|
|
|
| |
PiperOrigin-RevId: 357807059
|
|
|
|
| |
PiperOrigin-RevId: 347861379
|
|
|
|
| |
PiperOrigin-RevId: 347861229
|
|
|
|
|
|
|
|
| |
This lets us remove main() from snappy_bench.cc and snappy_unittest.cc,
which simplifies integrating these tests and benchmarks with other
suites.
PiperOrigin-RevId: 347857427
|
|\
| |
| |
| | |
PiperOrigin-RevId: 347736844
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LibFuzzer does not ship with the Mac OSX Command Line Tools.
```
ld: file not found: /Applications/Xcode-12.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/lib/darwin/libclang_rt.fuzzer_osx.a
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
|
|/
|
|
| |
PiperOrigin-RevId: 347736380
|
| |
|
|\
| |
| |
| | |
PiperOrigin-RevId: 347660305
|
| | |
|
| | |
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gcc was unable to inline a function call, which caused a build
failure due to `-Wall -Werror`.
The build error was:
```
../snappy.cc:292:76: error: ignoring attributes on template argument ‘__m128i’ [-Werror=ignored-attributes]
292 | static inline std::pair<__m128i /* pattern */, __m128i /* reshuffle_mask */>
| ^
../snappy.cc:292:76: error: ignoring attributes on template argument ‘__m128i’ [-Werror=ignored-attributes]
cc1plus: all warnings being treated as errors
```
|
|
|
|
|
|
| |
CheckSuccess was removed in e1e91ee464373e0bba4aadfbd3d88a6d84dc5b95.
PiperOrigin-RevId: 347625874
|
| |
|
|
|
|
| |
PiperOrigin-RevId: 347541488
|
|
|
|
| |
PiperOrigin-RevId: 347541028
|
| |
|
|
|
|
|
|
| |
This will not change the compilation output.
PiperOrigin-RevId: 347525836
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Snappy includes a testing framework, which implements a subset of the
Google Test API, and can be used when Google Test is not available.
Snappy also includes a micro-benchmark framework, which implements an
old version of the Google Benchmark API.
This CL replaces the custom test and micro-benchmark frameworks with
google/googletest and google/benchmark. The code is vendored in
third_party/ via git submodules. The setup is similar to google/crc32c
and google/leveldb.
This CL also updates the benchmarking code to the modern Google
Benchmark API.
Benchmark results are expected to be more precise, as the old framework
ran each benchmark with a fixed number of iterations, whereas Google
Benchmark keeps iterating until the noise is low.
PiperOrigin-RevId: 347456142
|
|
|
|
| |
PiperOrigin-RevId: 347402877
|
|
|
|
| |
PiperOrigin-RevId: 347397797
|
|
|
|
| |
PiperOrigin-RevId: 347341130
|
|
|
|
|
|
| |
This feature requires C++17. Fortunately, inline is useful for header declarations, which may be included in multiple compilation units. The declarations modified by this CL occur in a single compilation unit.
PiperOrigin-RevId: 347338760
|
|
|
|
|
|
|
|
|
| |
necessary data. We now store len - offset in a signed int16, this happens to remove masking offset in the calculations and the calculations that need to be done precisely give the flags that we need for testing correctness.
2) Replace offset extraction with a lookup mask. This is less uops and is needed because we need to special case type 3 to always return 0 as to properly trigger the fallback.
3) Unroll the loop twice, this removes some loop-condition checks AND it improves the generated assembly. The loop variables tend to end up in a different register requiring mov's having two consecutive copies allows the elision of the mov's.
PiperOrigin-RevId: 346663328
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When SSSE3 is available:
- Use PSHUFB (_mm_shuffle_epi8) to handle pattern size 1 to 15 (previously it handled size 1 to 7).
- This enables us to do 16 byte copies instead of 8 bytes copies because we know that the pattern size >= 16.
- Use shuffle-reshuffle strategy to generate the next pattern after loading the initial pattern. This enables us to write 4 conditionals (similar to when pattern size >= 16) which would allow FDO to layout the code with respect to actual probabilities of each length.
- The PSHUFB masks are now generated programmatically at compile-time.
When SSSE3 is unavailable:
- No change.
In both cases:
- assert(op < op_limit) in IncrementalCopy so that we can check 'op_limit <= buf_limit - 15' instead of 'op_limit <= buf_limit - 16'. All existing call sites of IncrementalCopy guarantee this.
'bin' case is notably >20% faster because it has many repeated character patterns (i.e. pattern_size = 1).
PiperOrigin-RevId: 346454471
|
|
|
|
| |
PiperOrigin-RevId: 345360683
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When SSSE3 is available:
- Use PSHUFB (_mm_shuffle_epi8) to handle pattern size 1 to 15 (previously it handled size 1 to 7).
- This enables us to do 16 byte copies instead of 8 bytes copies because we know that the pattern size >= 16.
- Use shuffle-reshuffle strategy to generate the next pattern after loading the initial pattern. This enables us to write 4 conditionals (similar to when pattern size >= 16) which would allow FDO to layout the code with respect to actual probabilities of each length.
- The PSHUFB masks are now generated programmatically at compile-time.
When SSSE3 is unavailable:
- No change.
In both cases:
- assert(op < op_limit) in IncrementalCopy so that we can check 'op_limit <= buf_limit - 15' instead of 'op_limit <= buf_limit - 16'. All existing call sites of IncrementalCopy guarantee this.
'bin' case is notably >20% faster because it has many repeated character patterns (i.e. pattern_size = 1).
PiperOrigin-RevId: 345340892
|
|
|
|
| |
PiperOrigin-RevId: 343272548
|
|
|
|
|
|
| |
compared to LZ4. LZ4 is considered the fastest solution by many on internet. We now see that Snappy is actually becoming very competitive with compression a little faster and decompression slower but certainly not terribly slower.
PiperOrigin-RevId: 343140860
|
|
|
|
|
|
|
|
| |
I also made the compression happen only once per benchmark. This way we get a cleaner measurement of #branch-misses using "perf stat". Compression suffers naturally from a large number of branch misses which was polluting the measurements.
This showed that with the new decompression the branch misses is actually much lower then initially reported, only .2% and very stable, ie. doesn't really fluctuate with how you execute the benchmarks.
PiperOrigin-RevId: 342628576
|
|
|
|
| |
PiperOrigin-RevId: 342447553
|
|
|
|
| |
PiperOrigin-RevId: 342423961
|
|
|
|
| |
PiperOrigin-RevId: 342283314
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When SSSE3 is available:
- Use PSHUFB (_mm_shuffle_epi8) to handle pattern size 1 to 15 (previously it handled size 1 to 7).
- This enables us to do 16 byte copies instead of 8 bytes copies because we know that the pattern size >= 16.
- Use shuffle-reshuffle strategy to generate the next pattern after loading the initial pattern. This enables us to write 4 conditionals (similar to when pattern size >= 16) which would allow FDO to layout the code with respect to actual probabilities of each length.
- The PSHUFB masks are now generated programmatically at compile-time.
When SSSE3 is unavailable:
- No change.
In both cases:
- assert(op < op_limit) in IncrementalCopy so that we can check 'op_limit <= buf_limit - 15' instead of 'op_limit <= buf_limit - 16'. All existing call sites of IncrementalCopy guarantee this.
PiperOrigin-RevId: 342267037
|
|
|
|
|
|
| |
increasing the amount of independent branches executed per benchmark iteration.
PiperOrigin-RevId: 342242843
|
|
|
|
|
|
|
|
|
|
|
| |
((a*b)>>18) & mask has higher throughput than (a*b)>>shift, and produces the
same results when the hash table size is 2**14. In other cases, the hash
function is still good, but it's not as necessary for that to be the case as
the input is small anyway. This speeds up in encoding, especially in cases
where hashing is a significant part of the encoding critical path (small or
uncompressible files).
PiperOrigin-RevId: 341498741
|
|\
| |
| |
| | |
PiperOrigin-RevId: 340505526
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When building Snappy with compiler option `-Wsuggest-override` set
via the CMAKE_CXX_FLAGS, compilation produces warnings in
`snappy-sinksource.h`:
```
In file included from ./snappy-fork/snappy-sinksource.cc:32:
./snappy-fork/snappy-sinksource.h:150:18: error: ‘virtual size_t snappy::ByteArraySource::Available() const’ can be marked override [-Werror=suggest-override]
150 | virtual size_t Available() const;
| ^~~~~~~~~
./snappy-fork/snappy-sinksource.h:151:23: error: ‘virtual const char* snappy::ByteArraySource::Peek(size_t*)’ can be marked override [-Werror=suggest-override]
151 | virtual const char* Peek(size_t* len);
| ^~~~
./snappy-fork/snappy-sinksource.h:152:16: error: ‘virtual void snappy::ByteArraySource::Skip(size_t)’ can be marked override [-Werror=suggest-override]
152 | virtual void Skip(size_t n);
| ^~~~
./snappy-fork/snappy-sinksource.h:163:16: error: ‘virtual void snappy::UncheckedByteArraySink::Append(const char*, size_t)’ can be marked override [-Werror=suggest-override]
163 | virtual void Append(const char* data, size_t n);
| ^~~~~~
./snappy-fork/snappy-sinksource.h:164:17: error: ‘virtual char* snappy::UncheckedByteArraySink::GetAppendBuffer(size_t, char*)’ can be marked override [-Werror=suggest-override]
164 | virtual char* GetAppendBuffer(size_t len, char* scratch);
| ^~~~~~~~~~~~~~~
./snappy-fork/snappy-sinksource.h:165:17: error: ‘virtual char* snappy::UncheckedByteArraySink::GetAppendBufferVariable(size_t, size_t, char*, size_t, size_t*)’ can be marked override [-Werror=suggest-override]
165 | virtual char* GetAppendBufferVariable(
| ^~~~~~~~~~~~~~~~~~~~~~~
./snappy-fork/snappy-sinksource.h:168:16: error: ‘virtual void snappy::UncheckedByteArraySink::AppendAndTakeOwnership(char*, size_t, void (*)(void*, const char*, size_t), void*)’ can be marked override [-Werror=suggest-override]
168 | virtual void AppendAndTakeOwnership(
| ^~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
```
This PR adds the missing override specifiers to the sink
implementations, so compilation works fine again.
Tested it with g++-9.3 and g++-10.2.
Compatibility note:
Override specifiers were introduced with C++11, which Snappy seems
to effectively require, at least according to its CMakeLists.txt file
and due to the usage of some C++11-only STL types in its tests.
|
| |
| |
| |
| |
| |
| |
| | |
See https://reviews.llvm.org/D67122 for some discussion of why this can matter.
I don't think this should have any noticeable effect on performance.
PiperOrigin-RevId: 340255083
|
|/
|
|
| |
PiperOrigin-RevId: 339897712
|
|
|
|
| |
PiperOrigin-RevId: 321389098
|
|
|
|
| |
PiperOrigin-RevId: 320686580
|
|
|
|
| |
PiperOrigin-RevId: 312741918
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ARCH_K8 and ARCH_ARM now work correctly on MSVC.
* ARCH_PPC now uses the same macro as tcmalloc.
Microsoft documentation:
https://docs.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=vs-2019
PowerPC documentation:
http://openpowerfoundation.org/wp-content/uploads/resources/leabi/content/dbdoclet.50655243_75216.html
PiperOrigin-RevId: 310160787
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bits::FindLSBSetNonZero64() is now available unconditionally, and it
should be easier to reason about the code included in each build
configuration.
This reduces the amount of conditional compiling going on, which makes
it easier to reason about what stubs are a used in each build
configuration.
The guards were added to work around the fact that MSVC has a
_BitScanForward64() intrinsic, but the intrinsic is only implemented on
64-bit targets, and causes a compilation error on 32-bit targets errors.
By contrast, Clang and GCC support __builtin_ctzll() everywhere, and
implement it with varying efficiency.
This CL reworks the conditional compilation directives so that
Bits::FindLSBSetNonZero64() uses the _BitScanForward64() intrinsic on
MSVC when available, and the portable implementation otherwise.
PiperOrigin-RevId: 310007748
|