| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This should allow the user to say "create a shallow clone of this branch
after version <some-tag>".
Short refs are accepted and expanded at the server side with expand_ref()
because we cannot expand (unknown) refs from the client side.
Like deepen-since, deepen-not cannot be used with deepen. But deepen-not
can be mixed with deepen-since. The result is exactly how you do the
command "git rev-list --since=... --not ref".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is basically dwim_ref() without @{} support. To be used on the
server side where we want to expand abbreviated to full ref names and
nothing else. The first user is "git clone/fetch --shallow-exclude".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This should allow the user to say "create a shallow clone containing the
work from last year" (once the client side is fixed up, of course).
In theory deepen-since and deepen (aka --depth) can be used together to
draw the shallow boundary (whether it's intersection or union is up to
discussion, but if rev-list is used, it's likely intersection). However,
because deepen goes with a custom commit walker, we can't mix the two
yet.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead of a custom commit walker like get_shallow_commits(), this new
function uses rev-list to mark NOT_SHALLOW to all reachable commits,
except borders. The definition of reachable is to be defined by the
protocol later. This makes it more flexible to define shallow boundary.
The way we find border is paint all reachable commits NOT_SHALLOW. Any
of them that "touches" commits without NOT_SHALLOW flag are considered
shallow (e.g. zero parents via grafting mechanism). Shallow commits and
their true parents are all marked SHALLOW. Then NOT_SHALLOW is removed
from shallow commits at the end.
There is an interesting observation. With a generic walker, we can
produce all kinds of shallow cutting. In the following graph, every
commit but "x" is reachable. "b" is a parent of "a".
x -- a -- o
/ /
x -- c -- b -- o
After this function is run, "a" and "c" are both considered shallow
commits. After grafting occurs at the client side, what we see is
a -- o
/
c -- b -- o
Notice that because of grafting, "a" has zero parents, so "b" is no
longer a parent of "a".
This is unfortunate and may be solved in two ways. The first is change
the way shallow grafting works and keep "a -- b" connection if "b"
exists and always ends at shallow commits (iow, no loose ends). This is
hard to detect, or at least not cheap to do.
The second way is mark one "x" as shallow commit instead of "a" and
produce this graph at client side:
x -- a -- o
/ /
c -- b -- o
More commits, but simpler grafting rules.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The shallow repo could be deepened or shortened when then user gives
--depth. But in future that won't be the only way to deepen/shorten a
repo. Stop relying on args->depth in this mode. Future deepening
methods can simply set this flag on instead of updating all these if
expressions.
The new name "deepen" was chosen after the command to define shallow
boundary in pack protocol. New commands also follow this tradition.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This reduces the number of "if (verbose)" which makes it a bit easier
to read imo. It also makes it easier to redirect all these printouts,
to a file for example.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On error check_non_tip() will die and not closing file descriptors is no
big deal. The next patch will split the majority of this function out
for reuse in other cases, where die() may not be the only outcome. Same
story for popping SIGPIPE out of the signal chain. So let's make sure we
clean things up properly first.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Also add some more comments in this code because it takes too long to
understand what it does (to me, who should be familiar enough to
understand this code well!)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After the last patch, "result" and "backup" are the same. "result" used
to move, but the movement is now contained in send_shallow(). Delete
this redundant variable.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This is a prep step for further refactoring. Besides reindentation and
s/shallows\./shallows->/g, no other changes are expected.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For now we can handle two types, string and boolean, in
set_helper_option(). Later on we'll add string_list support, which does
not fit well. The new function strbuf_set_helper_option() can be reused
for a separate function that handles string-list.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The command-line completion script (in contrib/) learned to
complete "git cmd ^mas<HT>" to complete the negative end of
reference to "git cmd ^master".
* cp/completion-negative-refs:
completion: support excluding refs
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Allow completion of refs with a ^ prefix. This allows completion of
commands like 'git log HEAD ^origin/master'.
Signed-off-by: Chris Packham <judge.packham@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The ./configure script generated from configure.ac was taught how
to detect support of SSL by libcurl better.
* dp/autoconf-curl-ssl:
./configure.ac: detect SSL in libcurl using curl-config
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The API of libcurl does not mention Curl_ssl_init() and when curl is
built with -flto, the Curl_ssl_init symbol is not exported.
https://curl.haxx.se/libcurl/using/ suggests calling
curl-config --feature | grep SSL
to see, if the installed curl has SSL support. Another approach
would be calling curl_version_info and checking the returned struct.
This patch removes the check for the Curl_ssl_init exported symbol
from libcurl and uses curl-config to detect SSL support in libcurl.
Signed-off-by: Дилян Палаузов <git-dpa@aegee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When we started cURL to talk to imap server when a new enough
version of cURL library is available, we forgot to explicitly add
imap(s):// before the destination. To some folks, that didn't work
and the library tried to make HTTP(s) requests instead.
* ak/curl-imap-send-explicit-scheme:
imap-send: Tell cURL to use imap:// or imaps://
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Right now the imap:// or imaps:// part of imap.host is not being
passed on to cURL. Perhaps it was able to guess correctly under some
circumstances, but I was not able to find one; it was just trying to
make HTTP requests for me. It’s better to be explicit in any case.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
"git pack-objects" in a repository with many packfiles used to
spend a lot of time looking for/at objects in them; the accesses to
the packfiles are now optimized by checking the most-recently-used
packfile first.
* jk/pack-objects-optim-mru:
pack-objects: use mru list when iterating over packs
pack-objects: break delta cycles before delta-search phase
sha1_file: make packed_object_info public
provide an initializer for "struct object_info"
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
In the original implementation of want_object_in_pack(), we
always looked for the object in every pack, so the order did
not matter for performance.
As of the last few patches, however, we can now often break
out of the loop early after finding the first instance, and
avoid looking in the other packs at all. In this case, pack
order can make a big difference, because we'd like to find
the objects by looking at as few packs as possible.
This patch switches us to the same packed_git_mru list that
is now used by normal object lookups.
Here are timings for p5303 on linux.git:
Test HEAD^ HEAD
------------------------------------------------------------------------
5303.3: rev-list (1) 31.31(31.07+0.23) 31.28(31.00+0.27) -0.1%
5303.4: repack (1) 40.35(38.84+2.60) 40.53(39.31+2.32) +0.4%
5303.6: rev-list (50) 31.37(31.15+0.21) 31.41(31.16+0.24) +0.1%
5303.7: repack (50) 58.25(68.54+2.03) 47.28(57.66+1.89) -18.8%
5303.9: rev-list (1000) 31.91(31.57+0.33) 31.93(31.64+0.28) +0.1%
5303.10: repack (1000) 304.80(376.00+3.92) 87.21(159.54+2.84) -71.4%
The rev-list numbers are unchanged, which makes sense (they
are not exercising this code at all). The 50- and 1000-pack
repack cases show considerable improvement.
The single-pack repack case doesn't, of course; there's
nothing to improve. In fact, it gives us a baseline for how
fast we could possibly go. You can see that though rev-list
can approach the single-pack case even with 1000 packs,
repack doesn't. The reason is simple: the loop we are
optimizing is only part of what the repack is doing. After
the "counting" phase, we do delta compression, which is much
more expensive when there are multiple packs, because we
have fewer deltas we can reuse (you can also see that these
numbers come from a multicore machine; the CPU times are
much higher than the wall-clock times due to the delta
phase).
So the good news is that in cases with many packs, we used
to be dominated by the "counting" phase, and now we are
dominated by the delta compression (which is faster, and
which we have already parallelized).
Here are similar numbers for git.git:
Test HEAD^ HEAD
---------------------------------------------------------------------
5303.3: rev-list (1) 1.55(1.51+0.02) 1.54(1.53+0.00) -0.6%
5303.4: repack (1) 1.82(1.80+0.08) 1.82(1.78+0.09) +0.0%
5303.6: rev-list (50) 1.58(1.57+0.00) 1.58(1.56+0.01) +0.0%
5303.7: repack (50) 2.50(3.12+0.07) 2.31(2.95+0.06) -7.6%
5303.9: rev-list (1000) 2.22(2.20+0.02) 2.23(2.19+0.03) +0.5%
5303.10: repack (1000) 10.47(16.78+0.22) 7.50(13.76+0.22) -28.4%
Not as impressive in terms of percentage, but still
measurable wins. If you look at the wall-clock time
improvements in the 1000-pack case, you can see that linux
improved by roughly 10x as many seconds as git. That's
because it has roughly 10x as many objects, and we'd expect
this improvement to scale linearly with the number of
objects (since the number of packs is kept constant). It's
just that the "counting" phase is a smaller percentage of
the total time spent for a git.git repack, and hence the
percentage win is smaller.
The implementation itself is a straightforward use of the
MRU code. We only bother marking a pack as used when we know
that we are able to break early out of the loop, for two
reasons:
1. If we can't break out early, it does no good; we have
to visit each pack anyway, so we might as well avoid
even the minor overhead of managing the cache order.
2. The mru_mark() function reorders the list, which would
screw up our traversal. So it is only safe to mark when
we are about to break out of the loop. We could record
the found pack and mark it after the loop finishes, of
course, but that's more complicated and it doesn't buy
us anything due to (1).
Note that this reordering does have a potential impact on
the final pack, as we store only a single "found" pack for
each object, even if it is present in multiple packs. In
principle, any copy is acceptable, as they all refer to the
same content. But in practice, they may differ in whether
they are stored as deltas, against which base, etc. This may
have an impact on delta reuse, and even the delta search
(since we skip pairs that were already in the same pack).
It's not clear whether this change of order would hurt or
even help average cases, though. The most likely reason to
have duplicate objects is from the completion of thin packs
(e.g., you have some objects in a "base" pack, then receive
several pushes; the packs you receive may be thin on the
wire, with deltas that refer to bases outside the pack, but
we complete them with duplicate base objects when indexing
them).
In such a case the current code would always find the thin
duplicates (because we currently walk the packs in reverse
chronological order). Whereas with this patch, some of those
duplicates would be found in the base pack instead.
In my tests repacking a real-world case of linux.git with
3600 thin-pack pushes (on top of a large "base" pack), the
resulting pack was about 0.04% larger with this patch. On
the other hand, because we were more likely to hit the base
pack, there were more opportunities for delta reuse, and we
had 50,000 fewer objects to examine in the delta search.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
We do not allow cycles in the delta graph of a pack (i.e., A
is a delta of B which is a delta of A) for the obvious
reason that you cannot actually access any of the objects in
such a case.
There's a last-ditch attempt to notice cycles during the
write phase, during which we issue a warning to the user and
write one of the objects out in full. However, this is
"last-ditch" for two reasons:
1. By this time, it's too late to find another delta for
the object, so the resulting pack is larger than it
otherwise could be.
2. The warning is there because this is something that
_shouldn't_ ever happen. If it does, then either:
a. a pack we are reusing deltas from had its own
cycle
b. we are reusing deltas from multiple packs, and
we found a cycle among them (i.e., A is a delta of
B in one pack, but B is a delta of A in another,
and we choose to use both deltas).
c. there is a bug in the delta-search code
So this code serves as a final check that none of these
things has happened, warns the user, and prevents us
from writing a bogus pack.
Right now, (2b) should never happen because of the static
ordering of packs in want_object_in_pack(). If two objects
have a delta relationship, then they must be in the same
pack, and therefore we will find them from that same pack.
However, a future patch would like to change that static
ordering, which will make (2b) a common occurrence. In
preparation, we should be able to handle those kinds of
cycles better. This patch does by introducing a
cycle-breaking step during the get_object_details() phase,
when we are deciding which deltas can be reused. That gives
us the chance to feed the objects into the delta search as
if the cycle did not exist.
We'll leave the detection and warning in the write_object()
phase in place, as it still serves as a check for case (2c).
This does mean we will stop warning for (2a). That case is
caused by bogus input packs, and we ideally would warn the
user about it. However, since those cycles show up after
picking reusable deltas, they look the same as (2b) to us;
our new code will break the cycles early and the last-ditch
check will never see them.
We could do analysis on any cycles that we find to
distinguish the two cases (i.e., it is a bogus pack if and
only if every delta in the cycle is in the same pack), but
we don't need to. If there is a cycle inside a pack, we'll
run into problems not only reusing the delta, but accessing
the object data at all. So when we try to dig up the actual
size of the object, we'll hit that same cycle and kick in
our usual complain-and-try-another-source code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Some code may have a pack/offset pair for an object, but
would like to look up more information. Using
sha1_object_info() is too heavy-weight; it starts from the
sha1 and has to find the pack again (so not only does it waste
time, it might not even find the same instance).
In some cases, this problem is solved by helpers like
get_size_from_delta(), which is used by pack-objects to take
a shortcut for objects whose packed representation has
already been found. But there's no similar function for
getting the object type, for instance. Rather than introduce
one, let's just make the whole packed_object_info() available.
It is smart enough to spend effort only on the items the
caller wants.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
An all-zero initializer is fine for this struct, but because
the first element is a pointer, call sites need to know to
use "NULL" instead of "0". Otherwise some static checkers
like "sparse" will complain; see d099b71 (Fix some sparse
warnings, 2013-07-18) for example. So let's provide an
initializer to make this easier to get right.
But let's also comment that memset() to zero is explicitly
OK[1]. One of the callers embeds object_info in another
struct which is initialized via memset (expand_data in
builtin/cat-file.c). Since our subset of C doesn't allow
assignment from a compound literal, handling this in any
other way is awkward, so we'd like to keep the ability to
initialize by memset(). By documenting this property, it
should make anybody who wants to change the initializer
think twice before doing so.
There's one other caller of interest. In parse_sha1_header(),
we did not initialize the struct fully in the first place.
This turned out not to be a bug because the sub-function it
calls does not look at any other fields except the ones we
did initialize. But that assumption might not hold in the
future, so it's a dangerous construct. This patch switches
it to initializing the whole struct, which protects us
against unexpected reads of the other fields.
[1] Obviously using memset() to initialize a pointer
violates the C standard, but we long ago decided that it
was an acceptable tradeoff in the real world.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of
the time third parameter is redundant. A new QSORT() macro lets us
omit it.
* rs/qsort:
show-branch: use QSORT
use QSORT, part 2
coccicheck: use --all-includes by default
remove unnecessary check before QSORT
use QSORT
add QSORT
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Shorten the code by using QSORT instead of calling qsort(3) directly,
as the former determines the element size automatically and checks if
there are at least two elements to sort already.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Convert two more qsort(3) calls to QSORT to reduce code size and for
better safety and consistency.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and
set it to --all-includes by default. This option lets it consider
header files which would otherwise be ignored. That's important for
some rules that rely on type information. It doubles the duration of
coccicheck, however.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Add a semantic patch for removing checks similar to the one that QSORT
already does internally and apply it to the code base.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT. The resulting code is
shorter and supports empty arrays with NULL pointers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.
Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks. Using the macro avoids such surprises.
Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The code that parses the format parameter of for-each-ref command
has seen a micro-optimization.
* sg/ref-filter-parse-optim:
ref-filter: strip format option after a field name only once while parsing
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
When parse_ref_filter_atom() iterates over a list of valid atoms to
check that a field name is one of them, it has to strip the optional
colon-separated format option suffix that might follow the field name.
However, it does so inside the loop, i.e. it performs the exact same
stripping over and over again.
Move stripping the format option suffix out of that loop, so it's only
performed once for each parsed field name.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Code clean-up with help from coccinelle tool continues.
* rs/cocci:
coccicheck: make transformation for strbuf_addf(sb, "...") more precise
use strbuf_add_unique_abbrev() for adding short hashes, part 2
use strbuf_addstr() instead of strbuf_addf() with "%s", part 2
gitignore: ignore output files of coccicheck make target
|