<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/git.git/Documentation/technical, branch jk/repack-pack-keep-objects</title>
<subtitle>github.com: git/git.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/'/>
<entry>
<title>pack-bitmap: implement optional name_hash cache</title>
<updated>2013-12-30T20:19:23+00:00</updated>
<author>
<name>Vicent Marti</name>
<email>tanoku@gmail.com</email>
</author>
<published>2013-12-21T14:00:45+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=ae4f07fbccaab6dc93be52c0f34e137dd9fcbcf4'/>
<id>ae4f07fbccaab6dc93be52c0f34e137dd9fcbcf4</id>
<content type='text'>
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti &lt;tanoku@gmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti &lt;tanoku@gmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>documentation: add documentation for the bitmap format</title>
<updated>2013-12-30T20:19:22+00:00</updated>
<author>
<name>Vicent Marti</name>
<email>tanoku@gmail.com</email>
</author>
<published>2013-11-14T12:44:02+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=0d4455a3ab070e0477ab80ae641ef19146b7a736'/>
<id>0d4455a3ab070e0477ab80ae641ef19146b7a736</id>
<content type='text'>
This is the technical documentation for the JGit-compatible Bitmap v1
on-disk format.

Signed-off-by: Vicent Marti &lt;tanoku@gmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the technical documentation for the JGit-compatible Bitmap v1
on-disk format.

Signed-off-by: Vicent Marti &lt;tanoku@gmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'mg/more-textconv'</title>
<updated>2013-10-23T20:21:31+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-10-23T20:21:30+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=4197361e39a304df30cc492122dbdfe90ae8af0e'/>
<id>4197361e39a304df30cc492122dbdfe90ae8af0e</id>
<content type='text'>
Make "git grep" and "git show" pay attention to --textconv when
dealing with blob objects.

* mg/more-textconv:
  grep: honor --textconv for the case rev:path
  grep: allow to use textconv filters
  t7008: demonstrate behavior of grep with textconv
  cat-file: do not die on --textconv without textconv filters
  show: honor --textconv for blobs
  diff_opt: track whether flags have been set explicitly
  t4030: demonstrate behavior of show with textconv
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make "git grep" and "git show" pay attention to --textconv when
dealing with blob objects.

* mg/more-textconv:
  grep: honor --textconv for the case rev:path
  grep: allow to use textconv filters
  t7008: demonstrate behavior of grep with textconv
  cat-file: do not die on --textconv without textconv filters
  show: honor --textconv for blobs
  diff_opt: track whether flags have been set explicitly
  t4030: demonstrate behavior of show with textconv
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ss/doclinks'</title>
<updated>2013-09-17T18:42:54+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-09-17T18:42:54+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=287c0feeab28455e86ee376bc452efa61a30b528'/>
<id>287c0feeab28455e86ee376bc452efa61a30b528</id>
<content type='text'>
When we converted many documents that were traditionally text-only
to be formatted to AsciiDoc, we did not update links that point at
them to refer to the formatted HTML files.

* ss/doclinks:
  Documentation: make AsciiDoc links always point to HTML files
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we converted many documents that were traditionally text-only
to be formatted to AsciiDoc, we did not update links that point at
them to refer to the formatted HTML files.

* ss/doclinks:
  Documentation: make AsciiDoc links always point to HTML files
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'mn/doc-pack-heu-remove-dead-pastebin'</title>
<updated>2013-09-12T21:41:47+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-09-12T21:41:47+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=8ee9a1830017b6f4b9dfc7c7be7e6fa1618633ea'/>
<id>8ee9a1830017b6f4b9dfc7c7be7e6fa1618633ea</id>
<content type='text'>
* mn/doc-pack-heu-remove-dead-pastebin:
  remove dead pastebin link from pack-heuristics document
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* mn/doc-pack-heu-remove-dead-pastebin:
  remove dead pastebin link from pack-heuristics document
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'jl/submodule-mv'</title>
<updated>2013-09-09T21:36:15+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-09-09T21:36:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=b02f5aeda6f66ac3be9b2e35f9b237d4f1f80d73'/>
<id>b02f5aeda6f66ac3be9b2e35f9b237d4f1f80d73</id>
<content type='text'>
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Documentation: make AsciiDoc links always point to HTML files</title>
<updated>2013-09-06T21:49:06+00:00</updated>
<author>
<name>Sebastian Schuberth</name>
<email>sschuberth@gmail.com</email>
</author>
<published>2013-09-06T20:03:22+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=d5ff3b4be5b2e3eee9c4b298f00ed7a4c1c556a4'/>
<id>d5ff3b4be5b2e3eee9c4b298f00ed7a4c1c556a4</id>
<content type='text'>
AsciiDoc's "link" is supposed to create hyperlinks for HTML output, so
prefer a "link" to point to an HTML file instead of a text file if an HTML
version of the file is being generated. For RelNotes, keep pointing to
text files as no equivalent HTML files are generated.

If appropriate, also update the link description to not contain the linked
file's extension.

Signed-off-by: Sebastian Schuberth &lt;sschuberth@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AsciiDoc's "link" is supposed to create hyperlinks for HTML output, so
prefer a "link" to point to an HTML file instead of a text file if an HTML
version of the file is being generated. For RelNotes, keep pointing to
text files as no equivalent HTML files are generated.

If appropriate, also update the link description to not contain the linked
file's extension.

Signed-off-by: Sebastian Schuberth &lt;sschuberth@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>remove dead pastebin link from pack-heuristics document</title>
<updated>2013-08-23T19:09:31+00:00</updated>
<author>
<name>Michal Nazarewicz</name>
<email>mina86@mina86.com</email>
</author>
<published>2013-08-23T14:25:02+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=4b363749552960f2692614a131adfac37b271178'/>
<id>4b363749552960f2692614a131adfac37b271178</id>
<content type='text'>
Signed-off-by: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Acked-by: Jon Loeliger &lt;jdl@freescale.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Acked-by: Jon Loeliger &lt;jdl@freescale.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Document the HTTP transport protocols</title>
<updated>2013-08-21T18:37:53+00:00</updated>
<author>
<name>Shawn O. Pearce</name>
<email>spearce@spearce.org</email>
</author>
<published>2013-08-21T13:45:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=4c6fffe2ae3642fa4576c704e2eb443de1d0f8a1'/>
<id>4c6fffe2ae3642fa4576c704e2eb443de1d0f8a1</id>
<content type='text'>
Signed-off-by: Shawn O. Pearce &lt;spearce@spearce.org&gt;
Revised-by: Tay Ray Chuan &lt;rctay89@gmail.com&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Shawn O. Pearce &lt;spearce@spearce.org&gt;
Revised-by: Tay Ray Chuan &lt;rctay89@gmail.com&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ob/typofixes'</title>
<updated>2013-08-01T19:01:01+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-08-01T18:58:32+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=d50cb7569cfb6e04ba48900821618d28012f334e'/>
<id>d50cb7569cfb6e04ba48900821618d28012f334e</id>
<content type='text'>
* ob/typofixes:
  many small typofixes
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* ob/typofixes:
  many small typofixes
</pre>
</div>
</content>
</entry>
</feed>
