summaryrefslogtreecommitdiff
path: root/builtin
Commit message (Collapse)AuthorAgeFilesLines
* pack-objects: do not reuse packfiles without --delta-base-offsetJeff King2014-04-041-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we are sending a packfile to a remote, we currently try to reuse a whole chunk of packfile without bothering to look at the individual objects. This can make things like initial clones much lighter on the server, as we can just dump the packfile bytes. However, it's possible that the other side cannot read our packfile verbatim. For example, we may have objects stored as OFS_DELTA, but the client is an antique version of git that only understands REF_DELTA. We negotiate this capability over the fetch protocol. A normal pack-objects run will convert OFS_DELTA into REF_DELTA on the fly, but the "reuse pack" code path never even looks at the objects. This patch disables packfile reuse if the other side is missing any capabilities that we might have used in the on-disk pack. Right now the only one is OFS_DELTA, but we may need to expand in the future (e.g., if packv4 introduces new object types). We could be more thorough and only disable reuse in this case when we actually have an OFS_DELTA to send, but: 1. We almost always will have one, since we prefer OFS_DELTA to REF_DELTA when possible. So this case would almost never come up. 2. Looking through the objects defeats the purpose of the optimization, which is to do as little work as possible to get the bytes to the remote. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: turn off bitmaps when skipping objectsJeff King2014-03-171-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pack bitmap format requires that we have a single bit for each object in the pack, and that each object's bitmap represents its complete set of reachable objects. Therefore we have no way to represent the bitmap of an object which references objects outside the pack. We notice this problem while generating the bitmaps, as we try to find the offset of a particular object and realize that we do not have it. In this case we die, and neither the bitmap nor the pack is generated. This is correct, but perhaps a little unfriendly. If you have bitmaps turned on in the config, many repacks will fail which would otherwise succeed. E.g., incremental repacks, repacks with "-l" when you have alternates, ".keep" files. Instead, this patch notices early that we are omitting some objects from the pack and turns off bitmaps (with a warning). Note that this is not strictly correct, as it's possible that the object being omitted is not reachable from any other object in the pack. In practice, this is almost never the case, and there are two advantages to doing it this way: 1. The code is much simpler, as we do not have to cleanly abort the bitmap-generation process midway through. 2. We do not waste time partially generating bitmaps only to find out that some object deep in the history is not being packed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-bitmap: implement optional name_hash cacheVicent Marti2013-12-301-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we use pack bitmaps rather than walking the object graph, we end up with the list of objects to include in the packfile, but we do not know the path at which any tree or blob objects would be found. In a recently packed repository, this is fine. A fetch would use the paths only as a heuristic in the delta compression phase, and a fully packed repository should not need to do much delta compression. As time passes, though, we may acquire more objects on top of our large bitmapped pack. If clients fetch frequently, then they never even look at the bitmapped history, and all works as usual. However, a client who has not fetched since the last bitmap repack will have "have" tips in the bitmapped history, but "want" newer objects. The bitmaps themselves degrade gracefully in this circumstance. We manually walk the more recent bits of history, and then use bitmaps when we hit them. But we would also like to perform delta compression between the newer objects and the bitmapped objects (both to delta against what we know the user already has, but also between "new" and "old" objects that the user is fetching). The lack of pathnames makes our delta heuristics much less effective. This patch adds an optional cache of the 32-bit name_hash values to the end of the bitmap file. If present, a reader can use it to match bitmapped and non-bitmapped names during delta compression. Here are perf results for p5310: Test origin/master HEAD^ HEAD ------------------------------------------------------------------------------------------------- 5310.2: repack to disk 36.81(37.82+1.43) 47.70(48.74+1.41) +29.6% 47.75(48.70+1.51) +29.7% 5310.3: simulated clone 30.78(29.70+2.14) 1.08(0.97+0.10) -96.5% 1.07(0.94+0.12) -96.5% 5310.4: simulated fetch 3.16(6.10+0.08) 3.54(10.65+0.06) +12.0% 1.70(3.07+0.06) -46.2% 5310.6: partial bitmap 36.76(43.19+1.81) 6.71(11.25+0.76) -81.7% 4.08(6.26+0.46) -88.9% You can see that the time spent on an incremental fetch goes down, as our delta heuristics are able to do their work. And we save time on the partial bitmap clone for the same reason. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* repack: consider bitmaps when performing repacksVicent Marti2013-12-301-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | Since `pack-objects` will write a `.bitmap` file next to the `.pack` and `.idx` files, this commit teaches `git-repack` to consider the new bitmap indexes (if they exist) when performing repack operations. This implies moving old bitmap indexes out of the way if we are repacking a repository that already has them, and moving the newly generated bitmap indexes into the `objects/pack` directory, next to their corresponding packfiles. Since `git repack` is now capable of handling these `.bitmap` files, a normal `git gc` run on a repository that has `pack.writebitmaps` set to true in its config file will generate bitmap indexes as part of the garbage collection process. Alternatively, `git repack` can be called with the `-b` switch to explicitly generate bitmap indexes if you are experimenting and don't want them on all the time. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* repack: handle optional files created by pack-objectsJeff King2013-12-301-2/+7
| | | | | | | | | | | | | | | | | | We ask pack-objects to pack to a set of temporary files, and then rename them into place. Some files that pack-objects creates may be optional (like a .bitmap file), in which case we would not want to call rename(). We already call stat() and make the chmod optional if the file cannot be accessed. We could simply skip the rename step in this case, but that would be a minor regression in noticing problems with non-optional files (like the .pack and .idx files). Instead, we can now annotate extensions as optional, and skip them if they don't exist (and otherwise rely on rename() to barf). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* repack: turn exts array into array-of-structJeff King2013-12-301-6/+11
| | | | | | | | This is slightly more verbose, but will let us annotate the extensions with further options in future commits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* repack: stop using magic number for ARRAY_SIZE(exts)Jeff King2013-12-301-4/+4
| | | | | | | | | We have a static array of extensions, but hardcode the size of the array in our loops. Let's pull out this magic number, which will make it easier to change. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: implement bitmap writingVicent Marti2013-12-301-0/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit extends more the functionality of `pack-objects` by allowing it to write out a `.bitmap` index next to any written packs, together with the `.idx` index that currently gets written. If bitmap writing is enabled for a given repository (either by calling `pack-objects` with the `--write-bitmap-index` flag or by having `pack.writebitmaps` set to `true` in the config) and pack-objects is writing a packfile that would normally be indexed (i.e. not piping to stdout), we will attempt to write the corresponding bitmap index for the packfile. Bitmap index writing happens after the packfile and its index has been successfully written to disk (`finish_tmp_packfile`). The process is performed in several steps: 1. `bitmap_writer_set_checksum`: this call stores the partial checksum for the packfile being written; the checksum will be written in the resulting bitmap index to verify its integrity 2. `bitmap_writer_build_type_index`: this call uses the array of `struct object_entry` that has just been sorted when writing out the actual packfile index to disk to generate 4 type-index bitmaps (one for each object type). These bitmaps have their nth bit set if the given object is of the bitmap's type. E.g. the nth bit of the Commits bitmap will be 1 if the nth object in the packfile index is a commit. This is a very cheap operation because the bitmap writing code has access to the metadata stored in the `struct object_entry` array, and hence the real type for each object in the packfile. 3. `bitmap_writer_reuse_bitmaps`: if there exists an existing bitmap index for one of the packfiles we're trying to repack, this call will efficiently rebuild the existing bitmaps so they can be reused on the new index. All the existing bitmaps will be stored in a `reuse` hash table, and the commit selection phase will prioritize these when selecting, as they can be written directly to the new index without having to perform a revision walk to fill the bitmap. This can greatly speed up the repack of a repository that already has bitmaps. 4. `bitmap_writer_select_commits`: if bitmap writing is enabled for a given `pack-objects` run, the sequence of commits generated during the Counting Objects phase will be stored in an array. We then use that array to build up the list of selected commits. Writing a bitmap in the index for each object in the repository would be cost-prohibitive, so we use a simple heuristic to pick the commits that will be indexed with bitmaps. The current heuristics are a simplified version of JGit's original implementation. We select a higher density of commits depending on their age: the 100 most recent commits are always selected, after that we pick 1 commit of each 100, and the gap increases as the commits grow older. On top of that, we make sure that every single branch that has not been merged (all the tips that would be required from a clone) gets their own bitmap, and when selecting commits between a gap, we tend to prioritize the commit with the most parents. Do note that there is no right/wrong way to perform commit selection; different selection algorithms will result in different commits being selected, but there's no such thing as "missing a commit". The bitmap walker algorithm implemented in `prepare_bitmap_walk` is able to adapt to missing bitmaps by performing manual walks that complete the bitmap: the ideal selection algorithm, however, would select the commits that are more likely to be used as roots for a walk in the future (e.g. the tips of each branch, and so on) to ensure a bitmap for them is always available. 5. `bitmap_writer_build`: this is the computationally expensive part of bitmap generation. Based on the list of commits that were selected in the previous step, we perform several incremental walks to generate the bitmap for each commit. The walks begin from the oldest commit, and are built up incrementally for each branch. E.g. consider this dag where A, B, C, D, E, F are the selected commits, and a, b, c, e are a chunk of simplified history that will not receive bitmaps. A---a---B--b--C--c--D \ E--e--F We start by building the bitmap for A, using A as the root for a revision walk and marking all the objects that are reachable until the walk is over. Once this bitmap is stored, we reuse the bitmap walker to perform the walk for B, assuming that once we reach A again, the walk will be terminated because A has already been SEEN on the previous walk. This process is repeated for C, and D, but when we try to generate the bitmaps for E, we can reuse neither the current walk nor the bitmap we have generated so far. What we do now is resetting both the walk and clearing the bitmap, and performing the walk from scratch using E as the origin. This new walk, however, does not need to be completed. Once we hit B, we can lookup the bitmap we have already stored for that commit and OR it with the existing bitmap we've composed so far, allowing us to limit the walk early. After all the bitmaps have been generated, another iteration through the list of commits is performed to find the best XOR offsets for compression before writing them to disk. Because of the incremental nature of these bitmaps, XORing one of them with its predecesor results in a minimal "bitmap delta" most of the time. We can write this delta to the on-disk bitmap index, and then re-compose the original bitmaps by XORing them again when loaded. This is a phase very similar to pack-object's `find_delta` (using bitmaps instead of objects, of course), except the heuristics have been greatly simplified: we only check the 10 bitmaps before any given one to find best compressing one. This gives good results in practice, because there is locality in the ordering of the objects (and therefore bitmaps) in the packfile. 6. `bitmap_writer_finish`: the last step in the process is serializing to disk all the bitmap data that has been generated in the two previous steps. The bitmap is written to a tmp file and then moved atomically to its final destination, using the same process as `pack-write.c:write_idx_file`. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rev-list: add bitmap mode to speed up object listsVicent Marti2013-12-301-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only thing required are the SHA1s of the objects during the list (i.e., not the path names at which trees and blobs were found). Calling `git rev-list --objects --use-bitmap-index [committish]` will perform an object iteration based on a bitmap result instead of actually walking the object graph. These are some example timings for `torvalds/linux` (warm cache, best-of-five): $ time git rev-list --objects master > /dev/null real 0m34.191s user 0m33.904s sys 0m0.268s $ time git rev-list --objects --use-bitmap-index master > /dev/null real 0m1.041s user 0m0.976s sys 0m0.064s Likewise, using `git rev-list --count --use-bitmap-index` will speed up the counting operation by building the resulting bitmap and performing a fast popcount (number of bits set on the bitmap) on the result. Here are some sample timings of different ways to count commits in `torvalds/linux`: $ time git rev-list master | wc -l 399882 real 0m6.524s user 0m6.060s sys 0m3.284s $ time git rev-list --count master 399882 real 0m4.318s user 0m4.236s sys 0m0.076s $ time git rev-list --use-bitmap-index --count master 399882 real 0m0.217s user 0m0.176s sys 0m0.040s This also respects negative refs, so you can use it to count a slice of history: $ time git rev-list --count v3.0..master 144843 real 0m1.971s user 0m1.932s sys 0m0.036s $ time git rev-list --use-bitmap-index --count v3.0..master real 0m0.280s user 0m0.220s sys 0m0.056s Though note that the closer the endpoints, the less it helps. In the traversal case, we have fewer commits to cross, so we take less time. But the bitmap time is dominated by generating the pack revindex, which is constant with respect to the refs given. Note that you cannot yet get a fast --left-right count of a symmetric difference (e.g., "--count --left-right master...topic"). The slow part of that walk actually happens during the merge-base determination when we parse "master...topic". Even though a count does not actually need to know the real merge base (it only needs to take the symmetric difference of the bitmaps), the revision code would require some refactoring to handle this case. Additionally, a `--test-bitmap` flag has been added that will perform the same rev-list manually (i.e. using a normal revwalk) and using bitmaps, and verify that the results are the same. This can be used to exercise the bitmap code, and also to verify that the contents of the .bitmap file are sane. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: use bitmaps when packing objectsVicent Marti2013-12-301-0/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this patch, we use the bitmap API to perform the `Counting Objects` phase in pack-objects, rather than a traditional walk through the object graph. For a reasonably-packed large repo, the time to fetch and clone is often dominated by the full-object revision walk during the Counting Objects phase. Using bitmaps can reduce the CPU time required on the server (and therefore start sending the actual pack data with less delay). For bitmaps to be used, the following must be true: 1. We must be packing to stdout (as a normal `pack-objects` from `upload-pack` would do). 2. There must be a .bitmap index containing at least one of the "have" objects that the client is asking for. 3. Bitmaps must be enabled (they are enabled by default, but can be disabled by setting `pack.usebitmaps` to false, or by using `--no-use-bitmap-index` on the command-line). If any of these is not true, we fall back to doing a normal walk of the object graph. Here are some sample timings from a full pack of `torvalds/linux` (i.e. something very similar to what would be generated for a clone of the repository) that show the speedup produced by various methods: [existing graph traversal] $ time git pack-objects --all --stdout --no-use-bitmap-index \ </dev/null >/dev/null Counting objects: 3237103, done. Compressing objects: 100% (508752/508752), done. Total 3237103 (delta 2699584), reused 3237103 (delta 2699584) real 0m44.111s user 0m42.396s sys 0m3.544s [bitmaps only, without partial pack reuse; note that pack reuse is automatic, so timing this required a patch to disable it] $ time git pack-objects --all --stdout </dev/null >/dev/null Counting objects: 3237103, done. Compressing objects: 100% (508752/508752), done. Total 3237103 (delta 2699584), reused 3237103 (delta 2699584) real 0m5.413s user 0m5.604s sys 0m1.804s [bitmaps with pack reuse (what you get with this patch)] $ time git pack-objects --all --stdout </dev/null >/dev/null Reusing existing pack: 3237103, done. Total 3237103 (delta 0), reused 0 (delta 0) real 0m1.636s user 0m1.460s sys 0m0.172s Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: split add_object_entryJeff King2013-12-301-20/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | This function actually does three things: 1. Check whether we've already added the object to our packing list. 2. Check whether the object meets our criteria for adding. 3. Actually add the object to our packing list. It's a little hard to see these three phases, because they happen linearly in the rather long function. Instead, this patch breaks them up into three separate helper functions. The result is a little easier to follow, though it unfortunately suffers from some optimization interdependencies between the stages (e.g., during step 3 we use the packing list index from step 1 and the packfile information from step 2). More importantly, though, the various parts can be composed differently, as they will be in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: factor out name_hashVicent Marti2013-10-241-22/+2
| | | | | | | | | | | | As the pack-objects system grows beyond the single pack-objects.c file, more parts (like the soon-to-exist bitmap code) will need to compute hashes for matching deltas. Factor out name_hash to make it available to other files. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pack-objects: refactor the packing listVicent Marti2013-10-241-135/+40
| | | | | | | | | | | | | | | | | | | | | | The hash table that stores the packing list for a given `pack-objects` run was tightly coupled to the pack-objects code. In this commit, we refactor the hash table and the underlying storage array into a `packing_data` struct. The functionality for accessing and adding entries to the packing list is hence accessible from other parts of Git besides the `pack-objects` builtin. This refactoring is a requirement for further patches in this series that will require accessing the commit packing list from outside of `pack-objects`. The hash table implementation has been minimally altered: we now use table sizes which are always a power of two, to ensure a uniform index distribution in the array. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'hu/cherry-pick-previous-branch'Junio C Hamano2013-10-231-2/+2
|\ | | | | | | | | | | | | | | | | "git cherry-pick" without further options would segfault. Could use a follow-up to handle '-' after argv[1] better. * hu/cherry-pick-previous-branch: cherry-pick: handle "-" after parsing options
| * cherry-pick: handle "-" after parsing optionshu/cherry-pick-previous-branchJeff King2013-10-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we only try converting argv[1] from "-" into "@{-1}". This means we do not notice "-" when used together with an option. Worse, when "git cherry-pick" is run with no options, we segfault. Fix this by doing the substitution after we have checked that there is something in argv to cherry-pick and know any remaining options are meant for the revision-listing machinery. This still does not handle "-" after the first non-cherry-pick option. For example, git cherry-pick foo~2 - bar~5 and git cherry-pick --no-merges - will still dump usage. Reported-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* | Merge branch 'mg/more-textconv'Junio C Hamano2013-10-233-18/+39
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make "git grep" and "git show" pay attention to --textconv when dealing with blob objects. * mg/more-textconv: grep: honor --textconv for the case rev:path grep: allow to use textconv filters t7008: demonstrate behavior of grep with textconv cat-file: do not die on --textconv without textconv filters show: honor --textconv for blobs diff_opt: track whether flags have been set explicitly t4030: demonstrate behavior of show with textconv
| * | grep: honor --textconv for the case rev:pathmg/more-textconvMichael J Gruber2013-05-101-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | Make "grep" honor the "--textconv" option also for the object case, i.e. when used with an argument "rev:path". Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | grep: allow to use textconv filtersJeff King2013-05-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently and not so recently, we made sure that log/grep type operations use textconv filters when a userfacing diff would do the same: ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28) b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28) 0508fe5 (combine-diff: respect textconv attributes, 2011-05-23) "git grep" currently does not use textconv filters at all, that is neither for displaying the match and context nor for the actual grepping, even when requested by --textconv. Introduce an option "--textconv" which makes git grep use any configured textconv filters for grepping and output purposes. It is off by default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | cat-file: do not die on --textconv without textconv filtersMichael J Gruber2013-05-101-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a command is supposed to use textconv filters (by default or with "--textconv") and none are configured then the blob is output without conversion; the only exception to this rule is "cat-file --textconv". Make it behave like the rest of textconv aware commands. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | show: honor --textconv for blobsMichael J Gruber2013-05-101-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, "diff" and "cat-file" for blobs honor "--textconv" options (with the former defaulting to "--textconv" and the latter to "--no-textconv") whereas "show" does not honor this option, even though it takes diff options. Make "show" on blobs honor "--textconv" when it is asked. The default is not to apply textconv, which is in line with what "cat-file" does. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | diff_opt: track whether flags have been set explicitlyJunio C Hamano2013-05-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The diff_opt infrastructure sets flags based on defaults and command line options. It is impossible to tell whether a flag has been set as a default or on explicit request. Update the structure so that this detection is possible: * Add an extra "opt->touched_flags" that keeps track of all the fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR. * You may continue setting the default values to the flags, like commands in the "log" family do in cmd_log_init_defaults(), but after you finished setting the defaults, you clear the touched_flags field; * And then you let the usual callchain call diff_opt_parse(), allowing the opt->flags be set or unset, while keeping track of which bits the user touched; * There is an optional callback "opt->set_default" that is called at the very beginning to let you inspect touched_flags and update opt->flags appropriately, before the remainder of the diffcore machinery is set up, taking the opt->flags value into account. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jc/pack-objects'Junio C Hamano2013-10-231-11/+12
|\ \ \ | | | | | | | | | | | | | | | | * jc/pack-objects: pack-objects: shrink struct object_entry
| * | | pack-objects: shrink struct object_entryjc/pack-objectsJunio C Hamano2013-02-041-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turn some boolean fields into bitfields and use uint32_t for name hash. This shrinks the size of the structure from 128 bytes to 120 bytes. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'sb/repack-in-c'Junio C Hamano2013-10-181-0/+388
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite "git repack" in C. * sb/repack-in-c: repack: improve warnings about failure of renaming and removing files repack: retain the return value of pack-objects repack: rewrite the shell script in C
| * | | | repack: improve warnings about failure of renaming and removing filesStefan Beller2013-09-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | repack: retain the return value of pack-objectsStefan Beller2013-09-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the review process of the previous commit (repack: rewrite the shell script in C), Johannes Sixt proposed to retain any exit codes from the sub-process, which makes it probably more obvious in case of failure. As the commit before should behave as close to the original shell script, the proposed change is put in this extra commit. The infrastructure however was already setup in the previous commit. (Having a local 'ret' variable) Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | repack: rewrite the shell script in CStefan Beller2013-09-171-0/+387
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The motivation of this patch is to get closer to a goal of being able to have a core subset of git functionality built in to git. That would mean * people on Windows could get a copy of at least the core parts of Git without having to install a Unix-style shell * people using git in on servers with chrooted environments do not need to worry about standard tools lacking for shell scripts. This patch is meant to be mostly a literal translation of the git-repack script; the intent is that later patches would start using more library facilities, but this patch is meant to be as close to a no-op as possible so it doesn't do that kind of thing. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'jk/clone-progress-to-stderr'Junio C Hamano2013-10-181-23/+21
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some progress and diagnostic messages from "git clone" were incorrectly sent to the standard output stream, not to the standard error stream. * jk/clone-progress-to-stderr: clone: always set transport options clone: treat "checking connectivity" like other progress clone: send diagnostic messages to stderr
| * | | | | clone: always set transport optionsjk/clone-progress-to-stderrJeff King2013-09-181-16/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A clone will always create a transport struct, whether we are cloning locally or using an actual protocol. In the local case, we only use the transport to get the list of refs, and then transfer the objects out-of-band. However, there are many options that we do not bother setting up in the local case. For the most part, these are noops, because they only affect the object-fetching stage (e.g., the --depth option). However, some options do have a visible impact. For example, giving the path to upload-pack via "-u" does not currently work for a local clone, even though we need upload-pack to get the ref list. We can just drop the conditional entirely and set these options for both local and non-local clones. Rather than keep track of which options impact the object versus the ref fetching stage, we can simply let the noops be noops (and the cost of setting the options in the first place is not high). The one exception is that we also check that the transport provides both a "get_refs_list" and a "fetch" method. We will now be checking the former for both cases (which is good, since a transport that cannot fetch refs would not work for a local clone), and we tweak the conditional to check for a "fetch" only when we are non-local. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | clone: treat "checking connectivity" like other progressJeff King2013-09-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When stderr does not point to a tty, we typically suppress "we are now in this phase" progress reporting (e.g., we ask the server not to send us "counting objects" and the like). The new "checking connectivity" message is in the same vein, and should be suppressed. Since clone relies on the transport code to make the decision, we can simply sneak a peek at the "progress" field of the transport struct. That properly takes into account both the verbosity and progress options we were given, as well as the result of isatty(). Note that we do not set up that progress flag for a local clone, as we do not fetch using the transport at all. That's acceptable here, though, because we also do not perform a connectivity check in that case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | clone: send diagnostic messages to stderrJeff King2013-09-181-5/+5
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Putting messages like "Cloning into.." and "done" on stdout is un-Unix and uselessly clutters the stdout channel. Send them to stderr. We have to tweak two tests to accommodate this: 1. t5601 checks for doubled output due to forking, and doesn't actually care where the output goes; adjust it to check stderr. 2. t5702 is trying to test whether progress output was sent to stderr, but naively does so by checking whether stderr produced any output. Instead, have it look for "%", a token found in progress output but not elsewhere (and which lets us avoid hard-coding the progress text in the test). This should not regress any scripts that try to parse the current output, as the output is already internationalized and therefore unstable. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'jk/trailing-slash-in-pathspec'Junio C Hamano2013-10-172-18/+10
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code refactoring. * jk/trailing-slash-in-pathspec: reset: handle submodule with trailing slash rm: re-use parse_pathspec's trailing-slash removal
| * | | | | reset: handle submodule with trailing slashjk/trailing-slash-in-pathspecJohn Keeping2013-09-131-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using tab-completion, a directory path will often end with a trailing slash which currently confuses "git reset" when dealing with submodules. Now that we have parse_pathspec we can easily handle this by simply adding the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag. To do this, we need to move the read_cache() call before the parse_pathspec() call. All of the existing paths through cmd_reset() that do not die early already call read_cache() at some point, so there is no performance impact to doing this in the common case. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | rm: re-use parse_pathspec's trailing-slash removalJohn Keeping2013-09-131-16/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of re-implementing the "remove trailing slashes" loop in builtin/rm.c just pass PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP to parse_pathspec. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'maint'Junio C Hamano2013-10-151-0/+4
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * maint: git-prune-packed.txt: fix reference to GIT_OBJECT_DIRECTORY clone --branch: refuse to clone if upstream repo is empty
| * | | | | | clone --branch: refuse to clone if upstream repo is emptyRalf Thielow2013-10-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 920b691 (clone: refuse to clone if --branch points to bogus ref) we refuse to clone with option "-b" if the specified branch does not exist in the (non-empty) upstream. If the upstream repository is empty, the branch doesn't exist, either. So refuse the clone too. Reported-by: Robert Mitwicki <robert.mitwicki@opensoftware.pl> Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* | | | | | | Merge branch 'po/remote-set-head-usage'Jonathan Nieder2013-10-141-2/+2
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * po/remote-set-head-usage: remote set-head -h: add long options to synopsis remote doc: document long forms of set-head options
| * | | | | | | remote set-head -h: add long options to synopsispo/remote-set-head-usagePhilip Oakley2013-09-271-2/+2
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Document --auto and --delete alongside their short forms -a and -d in the first line of 'git remote set-head -h' output. Signed-off-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* | | | | | | Merge branch 'cc/replace-with-the-same-type'Jonathan Nieder2013-09-241-3/+13
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cc/replace-with-the-same-type: Doc: 'replace' merge and non-merge commits t6050-replace: use some long option names replace: allow long option names Documentation/replace: add Creating Replacement Objects section t6050-replace: add test to clean up all the replace refs t6050-replace: test that objects are of the same type Documentation/replace: state that objects must be of the same type replace: forbid replacing an object with one of a different type
| * | | | | | | replace: allow long option namesChristian Couder2013-09-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is now standard practice in Git to have both short and long option names. So let's give a long option name to the git replace options too. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | replace: forbid replacing an object with one of a different typeChristian Couder2013-09-061-0/+10
| | |_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users replacing an object with one of a different type were not prevented to do so, even if it was obvious, and stated in the doc, that bad things would result from doing that. To avoid mistakes, it is better to just forbid that though. If -f option, which means '--force', is used, we can allow an object to be replaced with one of a different type, as the user should know what (s)he is doing. If one object is replaced with one of a different type, the only way to keep the history valid is to also replace all the other objects that point to the replaced object. That's because: * Annotated tags contain the type of the tagged object. * The tree/parent lines in commits must be a tree and commits, resp. * The object types referred to by trees are specified in the 'mode' field: 100644 and 100755 blob 160000 commit 040000 tree (these are the only valid modes) * Blobs don't point at anything. The doc will be updated in a later patch. Acked-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'jk/shortlog-tolerate-broken-commit'Jonathan Nieder2013-09-241-2/+4
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jk/shortlog-tolerate-broken-commit: shortlog: ignore commits with missing authors
| * | | | | | | shortlog: ignore commits with missing authorsjk/shortlog-tolerate-broken-commitJeff King2013-09-181-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of git's traversals are robust against minor breakages in commit data. For example, "git log" will still output an entry for a commit that has a broken encoding or missing author, and will not abort the whole operation. Shortlog, on the other hand, will die as soon as it sees a commit without an author, meaning that a repository with a broken commit cannot get any shortlog output at all. Let's downgrade this fatal error to a warning, and continue the operation. We simply ignore the commit and do not count it in the total (since we do not have any author under which to file it). Alternatively, we could output some kind of "<empty>" record to collect these bogus commits. It is probably not worth it, though; we have already warned to stderr, so the user is aware that such bogosities exist, and any placeholder we came up with would either be syntactically invalid, or would potentially conflict with real data. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | | clone: add a period after "done" to end the sentenceSebastian Schuberth2013-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a period in other places after "done" (see e.g. clone_local), so we should have one here, too. Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* | | | | | | | Merge branch 'dw/check-ignore-sans-index'Junio C Hamano2013-09-201-2/+4
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git check-ignore" follows the same rule as "git add" and "git status" in that the ignore/exclude mechanism does not take effect on paths that are already tracked. With "--no-index" option, it can be used to diagnose which paths that should have been ignored have been mistakenly added to the index. * dw/check-ignore-sans-index: check-ignore: Add option to ignore index contents
| * | | | | | | | check-ignore: Add option to ignore index contentsdw/check-ignore-sans-indexDave Williams2013-09-121-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | check-ignore currently shows how .gitignore rules would treat untracked paths. Tracked paths do not generate useful output. This prevents debugging of why a path became tracked unexpectedly unless that path is first removed from the index with `git rm --cached <path>`. The option --no-index tells the command to bypass the check for the path being in the index and hence allows tracked paths to be checked too. Whilst this behaviour deviates from the characteristics of `git add` and `git status` its use case is unlikely to cause any user confusion. Test scripts are augmented to check this option against the standard ignores to ensure correct behaviour. Signed-off-by: Dave Williams <dave@opensourcesolutions.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | | | Merge branch 'mm/commit-template-squelch-advice-messages'Junio C Hamano2013-09-201-8/+17
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From the commit log template, remove irrelevant "advice" messages that are shared with "git status" output. * mm/commit-template-squelch-advice-messages: commit: disable status hints when writing to COMMIT_EDITMSG wt-status: turn advice_status_hints into a field of wt_status commit: factor status configuration is a helper function
| * | | | | | | | | commit: disable status hints when writing to COMMIT_EDITMSGmm/commit-template-squelch-advice-messagesMatthieu Moy2013-09-121-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This turns the template COMMIT_EDITMSG from e.g # [...] # Changes to be committed: # (use "git reset HEAD <file>..." to unstage) # # modified: builtin/commit.c # # Untracked files: # (use "git add <file>..." to include in what will be committed) # # t/foo # to # [...] # Changes to be committed: # modified: builtin/commit.c # # Untracked files: # t/foo # Most status hints were written to be accurate when running "git status" before running a commit. Many of them are not applicable when the commit has already been started, and should not be shown in COMMIT_EDITMSG. The most obvious are hints advising to run "git commit", "git rebase/am/cherry-pick --continue", which do not make sense when the command has already been run. Other messages become slightly inaccurate (e.g. hint to use "git add" to add untracked files), as the suggested commands are not immediately applicable during the editing of COMMIT_EDITMSG, but would be applicable if the commit is aborted. These messages are both potentially helpful and slightly misleading. This patch chose to remove them too, to avoid introducing too much complexity in the status code. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | | | wt-status: turn advice_status_hints into a field of wt_statusMatthieu Moy2013-09-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No behavior change in this patch, but this makes the display of status hints more flexible as they can be enabled or disabled for individual calls to commit.c:run_status(). Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | | | commit: factor status configuration is a helper functionMatthieu Moy2013-09-121-8/+10
| | |_|/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cmd_commit and cmd_status use very similar code to initialize the wt_status structure. Factor this code into a function to ensure future changes will keep both versions consistent. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>