<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/libgit2.git, branch ethomson/large_loose_blobs</title>
<subtitle>github.com: libgit2/libgit2.git</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/'/>
<entry>
<title>tests: add GITTEST_SLOW env var check</title>
<updated>2017-12-20T16:21:05+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-12-20T16:13:31+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=456e52189c95315028d668f9e508798d490765e2'/>
<id>456e52189c95315028d668f9e508798d490765e2</id>
<content type='text'>
Writing very large files may be slow, particularly on inefficient
filesystems and when running instrumented code to detect invalid memory
accesses (e.g. within valgrind or similar tools).

Introduce `GITTEST_SLOW` so that tests that are slow can be skipped by
the CI system.
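
The guard can be sketched as follows (a hypothetical Python helper for
illustration; libgit2's actual check lives in its C test harness):

```python
import os

def slow_tests_enabled():
    # Slow tests run only when GITTEST_SLOW is set to a non-empty
    # value in the environment, e.g. by the CI system.
    return bool(os.environ.get("GITTEST_SLOW"))
```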
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Writing very large files may be slow, particularly on inefficient
filesystems and when running instrumented code to detect invalid memory
accesses (e.g. within valgrind or similar tools).

Introduce `GITTEST_SLOW` so that tests that are slow can be skipped by
the CI system.
</pre>
</div>
</content>
</entry>
<entry>
<title>hash: commoncrypto hash should support large files</title>
<updated>2017-12-20T16:08:04+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-12-11T16:46:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=bdb542143909fc278c8ba89b0c64cdf72fcaf7d2'/>
<id>bdb542143909fc278c8ba89b0c64cdf72fcaf7d2</id>
<content type='text'>
Teach the CommonCrypto hash mechanisms to support large files.  The hash
primitives take a `CC_LONG` (aka `uint32_t`) at a time, so loop, giving
the hash function at most a 32-bit unsigned integer's worth of bytes per
call, until we have hashed the entire file.
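
The chunking approach can be illustrated with a small Python sketch
(hashlib stands in for the CommonCrypto primitives; the constant mirrors
the `CC_LONG` limit):

```python
import hashlib

CC_LONG_MAX = 0xFFFFFFFF  # CC_LONG is a uint32_t

def sha1_large(data):
    # Feed the hash at most a uint32_t's worth of bytes per update,
    # looping until the entire input has been consumed.
    h = hashlib.sha1()
    for ofs in range(0, len(data), CC_LONG_MAX):
        h.update(data[ofs:ofs + CC_LONG_MAX])
    return h.hexdigest()
```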
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Teach the CommonCrypto hash mechanisms to support large files.  The hash
primitives take a `CC_LONG` (aka `uint32_t`) at a time, so loop, giving
the hash function at most a 32-bit unsigned integer's worth of bytes per
call, until we have hashed the entire file.
</pre>
</div>
</content>
</entry>
<entry>
<title>hash: win32 hash mechanism should support large files</title>
<updated>2017-12-20T16:08:04+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-12-10T17:26:43+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=a89560d5693a2f43cc852cb5806df837dc79b790'/>
<id>a89560d5693a2f43cc852cb5806df837dc79b790</id>
<content type='text'>
Teach the win32 hash mechanisms to support large files.  The hash
primitives take at most `ULONG_MAX` bytes at a time.  Loop, giving the
hash function the maximum supported number of bytes, until we have
hashed the entire file.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Teach the win32 hash mechanisms to support large files.  The hash
primitives take at most `ULONG_MAX` bytes at a time.  Loop, giving the
hash function the maximum supported number of bytes, until we have
hashed the entire file.
</pre>
</div>
</content>
</entry>
<entry>
<title>odb_loose: reject objects that cannot fit in memory</title>
<updated>2017-12-20T16:08:04+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-12-10T17:25:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=3e6533ba12c1c567f91efe621bdd155ff801877c'/>
<id>3e6533ba12c1c567f91efe621bdd155ff801877c</id>
<content type='text'>
Check the size of objects being read from the loose odb backend and
reject those that would not fit in memory with an error message that
reflects the actual problem, instead of failing later with an
unintuitive error message about truncation or invalid hashes.
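
The up-front check amounts to the following (a hypothetical Python
sketch; `SIZE_MAX` stands in for the platform allocation limit):

```python
import sys

SIZE_MAX = sys.maxsize  # stand-in for the platform allocation limit

def check_loose_object_size(declared_size):
    # Reject objects whose declared size cannot fit in memory up
    # front, with an error that names the real problem.
    if declared_size > SIZE_MAX:
        raise MemoryError("loose object is too large to fit in memory")
    return declared_size
```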
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Check the size of objects being read from the loose odb backend and
reject those that would not fit in memory with an error message that
reflects the actual problem, instead of failing later with an
unintuitive error message about truncation or invalid hashes.
</pre>
</div>
</content>
</entry>
<entry>
<title>zstream: use UINT_MAX sized chunks</title>
<updated>2017-12-20T16:08:03+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-12-10T17:23:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=8642feba7429ac2941a879a0870a84a83a3664cd'/>
<id>8642feba7429ac2941a879a0870a84a83a3664cd</id>
<content type='text'>
Instead of passing data to zlib in INT_MAX sized chunks, we can give it
as many as UINT_MAX bytes at a time.  zlib doesn't care how big a buffer
we give it; this simply results in fewer calls into zlib.
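
In Python, the same chunking strategy looks roughly like this (zlib's
`compressobj` plays the role of the deflate path; the helper name is
illustrative):

```python
import zlib

UINT_MAX = 0xFFFFFFFF

def deflate_chunked(data):
    # Hand zlib up to UINT_MAX bytes per call; a larger input simply
    # takes more iterations, not a different code path.
    z = zlib.compressobj()
    out = []
    for ofs in range(0, len(data), UINT_MAX):
        out.append(z.compress(data[ofs:ofs + UINT_MAX]))
    out.append(z.flush())
    return b"".join(out)
```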
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of passing data to zlib in INT_MAX sized chunks, we can give it
as many as UINT_MAX bytes at a time.  zlib doesn't care how big a buffer
we give it; this simply results in fewer calls into zlib.
</pre>
</div>
</content>
</entry>
<entry>
<title>odb: support large loose objects</title>
<updated>2017-12-20T16:08:03+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-11-30T15:55:59+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=ddefea750adcde06867b49d251760844540919fe'/>
<id>ddefea750adcde06867b49d251760844540919fe</id>
<content type='text'>
zlib will only inflate/deflate an `int`'s worth of data at a time.
We need to loop through large files in order to ensure that we inflate
the entire file, not just an `int`'s worth of data.  Thankfully, we
already have this loop in our `git_zstream` layer.  Handle large objects
using the `git_zstream`.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
zlib will only inflate/deflate an `int`'s worth of data at a time.
We need to loop through large files in order to ensure that we inflate
the entire file, not just an `int`'s worth of data.  Thankfully, we
already have this loop in our `git_zstream` layer.  Handle large objects
using the `git_zstream`.
</pre>
</div>
</content>
</entry>
<entry>
<title>object: introduce git_object_stringn2type</title>
<updated>2017-12-20T16:08:03+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-11-30T15:52:47+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=d1e446550a966a1dbc5d765aa79fe9bc47a1c1a3'/>
<id>d1e446550a966a1dbc5d765aa79fe9bc47a1c1a3</id>
<content type='text'>
Introduce an internal API to get the object type based on a
length-specified (not null terminated) string representation.  This can
be used to compare the (space terminated) object type name in a loose
object.

Reimplement `git_object_string2type` based on this API.
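
A rough Python model of the length-delimited lookup (the type names are
the standard git object types; the helper name is illustrative):

```python
GIT_OBJECT_TYPES = (b"commit", b"tree", b"blob", b"tag")

def stringn2type(buf, length):
    # Look up an object type from a length-delimited (not null
    # terminated) name, as found before the space in a loose object
    # header.  Returns the type name, or None if unrecognized.
    name = bytes(buf[:length])
    if name in GIT_OBJECT_TYPES:
        return name.decode("ascii")
    return None
```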
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce an internal API to get the object type based on a
length-specified (not null terminated) string representation.  This can
be used to compare the (space terminated) object type name in a loose
object.

Reimplement `git_object_string2type` based on this API.
</pre>
</div>
</content>
</entry>
<entry>
<title>odb: test loose reading/writing large objects</title>
<updated>2017-12-20T16:08:02+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-11-30T15:49:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=dacc32910e36e79ba108bef507e3aec9b0626e3c'/>
<id>dacc32910e36e79ba108bef507e3aec9b0626e3c</id>
<content type='text'>
Introduce a test for very large objects in the ODB.  Write a large
object (5 GB) and ensure that the write succeeds and provides us the
expected object ID.  Introduce a test that writes that file and
ensures that we can subsequently read it.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce a test for very large objects in the ODB.  Write a large
object (5 GB) and ensure that the write succeeds and provides us the
expected object ID.  Introduce a test that writes that file and
ensures that we can subsequently read it.
</pre>
</div>
</content>
</entry>
<entry>
<title>util: introduce `git__prefixncmp` and consolidate implementations</title>
<updated>2017-12-20T16:08:01+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-11-30T15:40:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=86219f40689c85ec4418575223f4376beffa45af'/>
<id>86219f40689c85ec4418575223f4376beffa45af</id>
<content type='text'>
Introduce `git__prefixncmp` that will search up to the first `n`
characters of a string to see if it is prefixed by another string.
This is useful for examining whether a non-null-terminated character
array is prefixed by a particular substring.

Consolidate the various implementations of `git__prefixcmp` around a
single core implementation and add some test cases to validate its
behavior.
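
The semantics can be sketched in Python (a hypothetical model of the
comparison, operating on bytes so no terminator is assumed):

```python
def prefixncmp(s, n, prefix):
    # Compare at most the first n bytes of s against prefix,
    # strcmp-style: 0 means s (within its first n bytes) starts
    # with prefix; nonzero reports the first difference.
    window = s[:n]
    for i, p in enumerate(prefix):
        if i == len(window):
            return -p  # s ran out before the prefix ended
        diff = window[i] - p
        if diff:
            return diff
    return 0
```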
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce `git__prefixncmp` that will search up to the first `n`
characters of a string to see if it is prefixed by another string.
This is useful for examining whether a non-null-terminated character
array is prefixed by a particular substring.

Consolidate the various implementations of `git__prefixcmp` around a
single core implementation and add some test cases to validate its
behavior.
</pre>
</div>
</content>
</entry>
<entry>
<title>zstream: treat `Z_BUF_ERROR` as non-fatal</title>
<updated>2017-12-20T16:08:01+00:00</updated>
<author>
<name>Edward Thomson</name>
<email>ethomson@edwardthomson.com</email>
</author>
<published>2017-12-12T12:24:11+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/libgit2.git/commit/?id=b7d36ef4a644c69c37e64c7c813546a68264b924'/>
<id>b7d36ef4a644c69c37e64c7c813546a68264b924</id>
<content type='text'>
zlib will return `Z_BUF_ERROR` whenever there is more input to inflate
or deflate than there is output to store the result.  This is normal for
us as we iterate through the input, particularly with very large input
buffers.
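
The classification can be modeled as follows (status codes per zlib.h;
the helper name is illustrative):

```python
Z_OK, Z_STREAM_END = 0, 1
Z_BUF_ERROR = -5  # zlib: no progress was possible with these buffers

def zstream_error_is_fatal(status):
    # Z_BUF_ERROR just means the output buffer filled up (or the
    # input ran dry); the caller should loop with fresh buffers
    # rather than abort.
    return status not in (Z_OK, Z_STREAM_END, Z_BUF_ERROR)
```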
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
zlib will return `Z_BUF_ERROR` whenever there is more input to inflate
or deflate than there is output to store the result.  This is normal for
us as we iterate through the input, particularly with very large input
buffers.
</pre>
</div>
</content>
</entry>
</feed>
