<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/git.git/fetch-pack.h, branch sb/string-list</title>
<subtitle>github.com: git/git.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/'/>
<entry>
<title>Merge branch 'nd/shallow-clone'</title>
<updated>2014-01-17T20:21:20+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2014-01-17T20:21:14+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=92251b1b5b5e53ac9de890105e2a2bd9d15e2ecb'/>
<id>92251b1b5b5e53ac9de890105e2a2bd9d15e2ecb</id>
<content type='text'>
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).

* nd/shallow-clone: (31 commits)
  t5537: fix incorrect expectation in test case 10
  shallow: remove unused code
  send-pack.c: mark a file-local function static
  git-clone.txt: remove shallow clone limitations
  prune: clean .git/shallow after pruning objects
  clone: use git protocol for cloning shallow repo locally
  send-pack: support pushing from a shallow clone via http
  receive-pack: support pushing to a shallow clone via http
  smart-http: support shallow fetch/clone
  remote-curl: pass ref SHA-1 to fetch-pack as well
  send-pack: support pushing to a shallow clone
  receive-pack: allow pushes that update .git/shallow
  connected.c: add new variant that runs with --shallow-file
  add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
  receive/send-pack: support pushing from a shallow clone
  receive-pack: reorder some code in unpack()
  fetch: add --update-shallow to accept refs that update .git/shallow
  upload-pack: make sure deepening preserves shallow roots
  fetch: support fetching from a shallow repository
  clone: support remote shallow repository
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).

* nd/shallow-clone: (31 commits)
  t5537: fix incorrect expectation in test case 10
  shallow: remove unused code
  send-pack.c: mark a file-local function static
  git-clone.txt: remove shallow clone limitations
  prune: clean .git/shallow after pruning objects
  clone: use git protocol for cloning shallow repo locally
  send-pack: support pushing from a shallow clone via http
  receive-pack: support pushing to a shallow clone via http
  smart-http: support shallow fetch/clone
  remote-curl: pass ref SHA-1 to fetch-pack as well
  send-pack: support pushing to a shallow clone
  receive-pack: allow pushes that update .git/shallow
  connected.c: add new variant that runs with --shallow-file
  add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
  receive/send-pack: support pushing from a shallow clone
  receive-pack: reorder some code in unpack()
  fetch: add --update-shallow to accept refs that update .git/shallow
  upload-pack: make sure deepening preserves shallow roots
  fetch: support fetching from a shallow repository
  clone: support remote shallow repository
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>fetch: add --update-shallow to accept refs that update .git/shallow</title>
<updated>2013-12-11T00:14:17+00:00</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2013-12-05T13:02:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=48d25cae22667dfc2c31ad620172c0f0a3ac1490'/>
<id>48d25cae22667dfc2c31ad620172c0f0a3ac1490</id>
<content type='text'>
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>clone: support remote shallow repository</title>
<updated>2013-12-11T00:14:17+00:00</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2013-12-05T13:02:39+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=beea4152d94cf7c77eeb6b226805b315d22b3a2f'/>
<id>beea4152d94cf7c77eeb6b226805b315d22b3a2f</id>
<content type='text'>
Cloning from a shallow repository does not follow the "8 steps for new
.git/shallow" because if it does we need to get through step 6 for all
refs. That means commit walking down to the bottom.

Instead the rule to create .git/shallow is simpler and, more
importantly, cheap: if a shallow commit is found in the pack, it's
probably used (i.e. reachable from some refs), so we add it. Others
are dropped.

One may notice this method seems flawed by the word "probably". A
shallow commit may not be reachable from any refs at all if it's
attached to an object island (a group of objects that are not
reachable by any refs).

If that object island is not complete, a new fetch request may send
more objects to connect it to some ref. At that time, because we
incorrectly installed the shallow commit in this island, the user will
not see anything after that commit (fsck is still ok). This is not
desired.

Given that object islands are rare (C Git never sends such islands for
security reasons) and do not really harm the repository integrity, a
tradeoff is made to surprise the user occasionally but work faster
everyday.

A new option --strict could be added later that follows exactly the 8
steps. "git prune" can also learn to remove dangling objects _and_ the
shallow commits that are attached to them from .git/shallow.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cloning from a shallow repository does not follow the "8 steps for new
.git/shallow" because if it does we need to get through step 6 for all
refs. That means commit walking down to the bottom.

Instead the rule to create .git/shallow is simpler and, more
importantly, cheap: if a shallow commit is found in the pack, it's
probably used (i.e. reachable from some refs), so we add it. Others
are dropped.

One may notice this method seems flawed by the word "probably". A
shallow commit may not be reachable from any refs at all if it's
attached to an object island (a group of objects that are not
reachable by any refs).

If that object island is not complete, a new fetch request may send
more objects to connect it to some ref. At that time, because we
incorrectly installed the shallow commit in this island, the user will
not see anything after that commit (fsck is still ok). This is not
desired.

Given that object islands are rare (C Git never sends such islands for
security reasons) and do not really harm the repository integrity, a
tradeoff is made to surprise the user occasionally but work faster
everyday.

A new option --strict could be added later that follows exactly the 8
steps. "git prune" can also learn to remove dangling objects _and_ the
shallow commits that are attached to them from .git/shallow.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fetch-pack.h: one statement per bitfield declaration</title>
<updated>2013-12-11T00:14:17+00:00</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2013-12-05T13:02:38+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=f6486f07d25ab4f2f93483690acabb817b0729b8'/>
<id>f6486f07d25ab4f2f93483690acabb817b0729b8</id>
<content type='text'>
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>git fetch-pack: add --diag-url</title>
<updated>2013-12-09T22:54:47+00:00</updated>
<author>
<name>Torsten Bögershausen</name>
<email>tboegi@web.de</email>
</author>
<published>2013-11-28T19:49:17+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=5610b7c0c6957cf0b236b6fac087c1f4dc209376'/>
<id>5610b7c0c6957cf0b236b6fac087c1f4dc209376</id>
<content type='text'>
The main purpose is to trace the URL parser called by git_connect() in
connect.c

The main features of the parser can be listed as this:

- parse out host and path for URLs with a scheme (git:// file:// ssh://)
- parse host names embedded by [] correctly
- extract the port number, if present
- separate URLs like "file" (which are local)
  from URLs like "host:repo" which should use ssh

Add the new parameter "--diag-url" to "git fetch-pack", which prints
the value for protocol, host and path to stderr and exits.

Signed-off-by: Torsten Bögershausen &lt;tboegi@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The main purpose is to trace the URL parser called by git_connect() in
connect.c

The main features of the parser can be listed as this:

- parse out host and path for URLs with a scheme (git:// file:// ssh://)
- parse host names embedded by [] correctly
- extract the port number, if present
- separate URLs like "file" (which are local)
  from URLs like "host:repo" which should use ssh

Add the new parameter "--diag-url" to "git fetch-pack", which prints
the value for protocol, host and path to stderr and exits.

Signed-off-by: Torsten Bögershausen &lt;tboegi@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cache.h: move remote/connect API out of it</title>
<updated>2013-07-08T21:34:24+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-07-08T20:56:53+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=47a59185369b8905ad3a4012688cba92fd2ac1ff'/>
<id>47a59185369b8905ad3a4012688cba92fd2ac1ff</id>
<content type='text'>
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me.  This structure is not
about the local ref used by sha1-name API to name local objects.

It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values.  It belongs to "remote.h" together
with "struct refspec".

While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me.  This structure is not
about the local ref used by sha1-name API to name local objects.

It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values.  It belongs to "remote.h" together
with "struct refspec".

While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>clone: open a shortcut for connectivity check</title>
<updated>2013-05-28T15:07:20+00:00</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2013-05-26T01:16:17+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=c6807a40dcd29f7e5ad1e2f4fc44f1729c9afa11'/>
<id>c6807a40dcd29f7e5ad1e2f4fc44f1729c9afa11</id>
<content type='text'>
In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.

In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":

 - all refs point to an object in the pack
 - there are no dangling pointers in any object in the pack
 - no objects in the pack point to objects outside the pack

The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.

"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.

Cloning linux-2.6 over file://

        before         after
real    3m25.693s      2m53.050s
user    5m2.037s       4m42.396s
sys     0m13.750s      0m16.574s

A more realistic test with ssh:// over wireless

        before         after
real    11m26.629s     10m4.213s
user    5m43.196s      5m19.444s
sys     0m35.812s      0m37.630s

This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.

This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.

In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":

 - all refs point to an object in the pack
 - there are no dangling pointers in any object in the pack
 - no objects in the pack point to objects outside the pack

The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.

"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.

Cloning linux-2.6 over file://

        before         after
real    3m25.693s      2m53.050s
user    5m2.037s       4m42.396s
sys     0m13.750s      0m16.574s

A more realistic test with ssh:// over wireless

        before         after
real    11m26.629s     10m4.213s
user    5m43.196s      5m19.444s
sys     0m35.812s      0m37.630s

This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.

This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fetch: use struct ref to represent refs to be fetched</title>
<updated>2013-02-07T21:53:59+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-01-29T22:02:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=f2db854d24f32de7b03dde5a7d7134f5e31100b9'/>
<id>f2db854d24f32de7b03dde5a7d7134f5e31100b9</id>
<content type='text'>
Even though "git fetch" has full infrastructure to parse refspecs to
be fetched and match them against the list of refs to come up with
the final list of refs to be fetched, the list of refs that are
requested to be fetched were internally converted to a plain list of
strings at the transport layer and then passed to the underlying
fetch-pack driver.

Stop this conversion and instead pass around an array of refs.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Even though "git fetch" has full infrastructure to parse refspecs to
be fetched and match them against the list of refs to come up with
the final list of refs to be fetched, the list of refs that are
requested to be fetched were internally converted to a plain list of
strings at the transport layer and then passed to the underlying
fetch-pack driver.

Stop this conversion and instead pass around an array of refs.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filter_refs(): delete matched refs from sought list</title>
<updated>2012-09-12T18:46:31+00:00</updated>
<author>
<name>Michael Haggerty</name>
<email>mhagger@alum.mit.edu</email>
</author>
<published>2012-09-09T06:19:43+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=4ba159996f6c1b0d6dd0a2a8bd9d6f5b342a4aa5'/>
<id>4ba159996f6c1b0d6dd0a2a8bd9d6f5b342a4aa5</id>
<content type='text'>
Remove any references that are available from the remote from the
sought list (rather than overwriting their names with NUL characters,
as previously).  Mark matching entries by writing a non-NULL pointer
to string_list_item::util during the iteration, then use
filter_string_list() later to filter out the entries that have been
marked.

Document this aspect of fetch_pack() in a comment in the header file.
(More documentation is obviously still needed.)

Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove any references that are available from the remote from the
sought list (rather than overwriting their names with NUL characters,
as previously).  Mark matching entries by writing a non-NULL pointer
to string_list_item::util during the iteration, then use
filter_string_list() later to filter out the entries that have been
marked.

Document this aspect of fetch_pack() in a comment in the header file.
(More documentation is obviously still needed.)

Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Change fetch_pack() and friends to take string_list arguments</title>
<updated>2012-09-12T18:46:31+00:00</updated>
<author>
<name>Michael Haggerty</name>
<email>mhagger@alum.mit.edu</email>
</author>
<published>2012-09-09T06:19:40+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=8bee93dd24731a1d2ef7c82d484a893cf68b6572'/>
<id>8bee93dd24731a1d2ef7c82d484a893cf68b6572</id>
<content type='text'>
Instead of juggling &lt;nr_heads,heads&gt; (sometimes called
&lt;nr_match,match&gt;), pass around the list of references to be sought in
a single string_list variable called "sought".  Future commits will
make more use of string_list functionality.

Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of juggling &lt;nr_heads,heads&gt; (sometimes called
&lt;nr_match,match&gt;), pass around the list of references to be sought in
a single string_list variable called "sought".  Future commits will
make more use of string_list functionality.

Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
