summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* fetch-pack: avoid quadratic behavior in rev_list_pushjk/fetch-pack-many-refsJeff King2013-07-021-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we call find_common to start finding common ancestors with the remote side of a fetch, the first thing we do is insert the tip of each ref into our rev_list linked list. We keep the list sorted the whole time with commit_list_insert_by_date, which means our insertion ends up doing O(n^2) timestamp comparisons. We could teach rev_list_push to use an unsorted list, and then sort it once after we have added each ref. However, in get_rev, we process the list by popping commits off the front and adding parents back in timestamp-sorted order. So that procedure would still operate on the large list. Instead, we can replace the linked list with a heap-based priority queue, which can do O(log n) insertion, making the whole insertion procedure O(n log n). As a result of switching to the prio_queue struct, we fix two minor bugs: 1. When we "pop" a commit in get_rev, and when we clear the rev_list in find_common, we do not take care to free the "struct commit_list", and just leak its memory. With the prio_queue implementation, the memory management is handled for us. 2. In get_rev, we look at the head commit of the list, possibly push its parents onto the list, and then "pop" the front of the list off, assuming it is the same element that we just peeked at. This is typically going to be the case, but would not be in the face of clock skew: the parents are inserted by date, and could potentially be inserted at the head of the list if they have a timestamp newer than their descendent. In this case, we would accidentally pop the parent, and never process it at all. The new implementation pulls the commit off of the queue as we examine it, and so does not suffer from this problem. With this patch, a fetch of a single commit into a repository with 50,000 refs went from: real 0m7.984s user 0m7.852s sys 0m0.120s to: real 0m2.017s user 0m1.884s sys 0m0.124s Before this patch, a larger case with 370K refs still had not completed after tens of minutes; with this patch, it completes in about 12 seconds. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* commit.c: make compare_commits_by_commit_date globalJeff King2013-07-022-1/+3
| | | | | | | | | | This helper function was introduced as a prio_queue comparator to help topological sorting. However, other users of prio_queue who want to replace commit_list_insert_by_date will want to use it, too. So let's make it public. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fetch-pack: avoid quadratic list insertion in mark_completeJeff King2013-07-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | We insert the commit pointed to by each ref one-by-one into the "complete" commit_list using insert_by_date. Because each insertion is O(n), we end up with O(n^2) behavior. This typically doesn't matter, because the number of refs is reasonably small. And even if there are a lot of refs, they often point to a smaller set of objects (in which case the optimization in commit ea5f220 keeps our "n" small). However, in pathological repositories (hundreds of thousands of refs, each pointing to a unique commit), this quadratic behavior can make a difference. Since we do not care about the list order until we have finished building it, we can simply keep it unsorted during the insertion phase, then sort it afterwards. On a repository like the one described above, this dropped the time to do a no-op fetch from 2.0s to 1.7s. On normal repositories, it probably does not matter at all, but it does not hurt to protect ourselves from pathological cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jc/topo-author-date-sort'Junio C Hamano2013-07-0116-163/+550
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git log" learned the "--author-date-order" option, with which the output is topologically sorted and commits in parallel histories are shown intermixed together based on the author timestamp. * jc/topo-author-date-sort: t6003: add --author-date-order test topology tests: teach a helper to set author dates as well t6003: add --date-order test topology tests: teach a helper to take abbreviated timestamps t/lib-t6000: style fixes log: --author-date-order sort-in-topological-order: use prio-queue prio-queue: priority queue of pointers to structs toposort: rename "lifo" field
| * t6003: add --author-date-order testJunio C Hamano2013-06-211-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Tweak the --topo/date-order test vector a bit and mark the author dates of two commits (a2 and a3) earlier than their own committer dates, making them much older than other commits that are made on parallel branches to simulate the case where a long running topic was rebased recently. They will show up as recent in the --date-order output due to their timestamps, but they appear a lot later in the --author-date-order output, even though their committer timestamp says otherwise. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * topology tests: teach a helper to set author dates as wellJunio C Hamano2013-06-212-26/+34
| | | | | | | | | | | | | | | | | | | | Introduce on_dates helper that is similar to on_committer_date but also sets the author date, not just the committer date. At this step, just set the same timestamp to the author date as the committer date, as no test looks at author date yet. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * t6003: add --date-order testJunio C Hamano2013-06-211-0/+22
| | | | | | | | | | | | | | | | | | The "--date-order" output is a slight twist of "--topo-order" in that commits from parallel histories are shown in their committer date order without an attempt to clump commits from a single line of history together like --topo-order does. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * topology tests: teach a helper to take abbreviated timestampsJunio C Hamano2013-06-213-77/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The on_committer_date helper in t/lib-t6000 is used in t6002 and t6003 with timestamps on a single day within a single minute (i.e. 1971-08-16 00:00) and the tests repeat this over and over. The actual value of the timestamp, however, does not matter very much; only their relative ordering does. Introduce another helper to expand only the suffix of the timestamp to a full timestamp to make the lines shorter, and use it in this helper. Also, because all the commits in the test are made with specific GIT_COMMITTER_DATE, stop unsetting it at the end of the helper. We'll be specifying the author timestamp to these test commits in a later patch, which will be helped with this change. Also remove a test that was commented-out from t6003; it used to test a commit with the same parent listed twice, which was allowed by mistake but was fixed long time ago. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * t/lib-t6000: style fixesJunio C Hamano2013-06-211-47/+40
| | | | | | | | | | | | | | Mostly fixes to initial indentation with 8-SP (they should be HT) and wrapping long lines. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * log: --author-date-orderJunio C Hamano2013-06-114-1/+83
| | | | | | | | | | | | | | | | | | | | | | Sometimes people would want to view the commits in parallel histories in the order of author dates, not committer dates. Teach "topo-order" sort machinery to do so, using a commit-info slab to record the author dates of each commit, and prio-queue to sort them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * sort-in-topological-order: use prio-queueJunio C Hamano2013-06-113-31/+60
| | | | | | | | | | | | | | | | | | Use the prio-queue data structure to implement a priority queue of commits sorted by committer date, when handling --date-order. The structure can also be used as a simple LIFO stack, which is a good match for --topo-order processing. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * prio-queue: priority queue of pointers to structsJunio C Hamano2013-06-116-0/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally we used a singly linked list of commits to hold a set of in-flight commits while traversing history. The most typical use of the list is to add commits that are newly discovered to it, keep the list sorted by commit timestamp, pick up the newest one from the list, and keep digging. The cost of keeping the singly linked list sorted is nontrivial, and this typical use pattern better matches a priority queue. Introduce a prio-queue structure, that can be used either as a LIFO stack, or a priority queue. This will be used in the next patch to hold in-flight commits during sort-in-topological-order. Tests and the idea to make it usable for any "void *" pointers to "things" are by Jeff King. Bugs are mine. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * toposort: rename "lifo" fieldJunio C Hamano2013-06-116-20/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The primary invariant of sort_in_topological_order() is that a parent commit is not emitted until all children of it are. When traversing a forked history like this with "git log C E": A----B----C \ D----E we ensure that A is emitted after all of B, C, D, and E are done, B has to wait until C is done, and D has to wait until E is done. In some applications, however, we would further want to control how these child commits B, C, D and E on two parallel ancestry chains are shown. Most of the time, we would want to see C and B emitted together, and then E and D, and finally A (i.e. the --topo-order output). The "lifo" parameter of the sort_in_topological_order() function is used to control this behaviour. We start the traversal by knowing two commits, C and E. While keeping in mind that we also need to inspect E later, we pick C first to inspect, and we notice and record that B needs to be inspected. By structuring the "work to be done" set as a LIFO stack, we ensure that B is inspected next, before other in-flight commits we had known that we will need to inspect, e.g. E. When showing in --date-order, we would want to see commits ordered by timestamps, i.e. show C, E, B and D in this order before showing A, possibly mixing commits from two parallel histories together. When "lifo" parameter is set to false, the function keeps the "work to be done" set sorted in the date order to realize this semantics. After inspecting C, we add B to the "work to be done" set, but the next commit we inspect from the set is E which is newer than B. The name "lifo", however, is too strongly tied to the way how the function implements its behaviour, and does not describe what the behaviour _means_. Replace this field with an enum rev_sort_order, with two possible values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and update the existing code. The mechanical replacement rule is: "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE" "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER" Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jk/commit-info-slab'Junio C Hamano2013-07-013-10/+127
|\ \ | |/ | | | | | | | | | | | | | | | | Allow adding custom information to commit objects in order to represent unbound number of flag bits etc. * jk/commit-info-slab: commit-slab: introduce a macro to define a slab for new type commit-slab: avoid large realloc commit: allow associating auxiliary info on-demand
| * commit-slab: introduce a macro to define a slab for new typeJunio C Hamano2013-06-072-58/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a header file to define a macro that can define the struct type, initializer, accessor and cleanup functions to manage a commit slab. Update the "indegree" topological sort facility using it. To associate 32 flag bits with each commit, you can write: define_commit_slab(flag32, uint32); to declare "struct flag32" type, define an instance of it with struct flag32 flags; and initialize it by calling init_flag32(&flags); After that, a call to flag32_at() function uint32 *fp = flag32_at(&flags, commit); will return a pointer pointing at a uint32 for that commit. Once you are done with these flags, clean them up with clear_flag32(&flags); Callers that cannot hard-code how wide the data to be associated with the commit be at compile time can use the "_with_stride" variant to initialize the slab. Suppose you want to give one bit per existing ref, and paint commits down to find which refs are descendants of each commit. Saying typedef uint32 bits320[5]; define_commit_slab(flagbits, bits320); at compile time will still limit your code with hard-coded limit, because you may find that you have more than 320 refs at runtime. The code can declare a commit slab "struct flagbits" like this instead: define_commit_slab(flagbits, unsigned char); struct flagbits flags; and initialize it by: nrefs = ... count number of refs ... init_flagbits_with_stride(&flags, (nrefs + 7) / 8); so that unsigned char *fp = flagbits_at(&flags, commit); will return a pointer pointing at an array of 40 "unsigned char"s associated with the commit, once you figure out nrefs is 320 at runtime. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * commit-slab: avoid large reallocJunio C Hamano2013-04-131-20/+42
| | | | | | | | | | | | | | | | | | | | Instead of using a single "slab" and keep reallocating it as we find that we need to deal with commits with larger values of commit->index, make a "slab" an array of many "slab_piece"s. Each access may need two levels of indirections, but we only need to reallocate the first level array of pointers when we have to grow the table this way. Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * commit: allow associating auxiliary info on-demandJeff King2013-04-132-10/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "indegree" field in the commit object is only used while sorting a list of commits in topological order, and wasting memory otherwise. We would prefer to shrink the size of individual commit objects, which we may have to hold thousands of in-core. We could eject "indegree" field out from the commit object and represent it as a dynamic table based on the decoration infrastructure, but the decoration is meant for sparse annotation and is not a good match. Instead, let's try a different approach. - Assign an integer (commit->index) to each commit we keep in-core (reuse the space of "indegree" field for it); - When running the topological sort, allocate an array of integers in bulk (called "slab"), use the commit->index as an index into this array, and store the "indegree" information there. This does _not_ reduce the memory footprint of a commit object, but the commit->index can be used as the index to dynamically associate commits with other kinds of information as needed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'maint'Junio C Hamano2013-06-303-3/+14
|\ \ | | | | | | | | | | | | | | | | | | * maint: Start preparing for 1.8.3.3 check-ignore doc: fix broken link to ls-files page test: spell 'ls-files --delete' option correctly in test descriptions
| * | Start preparing for 1.8.3.3Junio C Hamano2013-06-302-1/+12
| | | | | | | | | | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Merge branch 'fc/macos-x-clipped-write' into maintJunio C Hamano2013-06-304-0/+27
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mac OS X does not like to write(2) more than INT_MAX number of bytes; work it around by chopping write(2) into smaller pieces. * fc/macos-x-clipped-write: compate/clipped-write.c: large write(2) fails on Mac OS X/XNU
| * \ \ Merge branch 'da/darwin' into maintJunio C Hamano2013-06-302-0/+25
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Newer MacOS X encourages the programs to compile and link with their CommonCrypto, not with OpenSSL. * da/darwin: imap-send: eliminate HMAC deprecation warnings on Mac OS X cache.h: eliminate SHA-1 deprecation warnings on Mac OS X Makefile: add support for Apple CommonCrypto facility Makefile: fix default regex settings on Darwin
| * | | | check-ignore doc: fix broken link to ls-files pageRamkumar Ramachandra2013-06-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | test: spell 'ls-files --delete' option correctly in test descriptionsSZEDER Gábor2013-06-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The option is spelled '--deleted'. Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Update draft release notes to 1.8.4Junio C Hamano2013-06-301-0/+23
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'mh/ref-races'Junio C Hamano2013-06-306-134/+466
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git pack-refs" that races with new ref creation or deletion have been susceptible to lossage of refs under right conditions, which has been tightened up. * mh/ref-races: for_each_ref: load all loose refs before packed refs get_packed_ref_cache: reload packed-refs file when it changes add a stat_validity struct Extract a struct stat_data from cache_entry packed_ref_cache: increment refcount when locked do_for_each_entry(): increment the packed refs cache refcount refs: manage lifetime of packed refs cache via reference counting refs: implement simple transactions for the packed-refs file refs: wrap the packed refs cache in a level of indirection pack_refs(): split creation of packed refs and entry writing repack_without_ref(): split list curation and entry writing
| * | | | | for_each_ref: load all loose refs before packed refsJeff King2013-06-201-4/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we are iterating through the refs using for_each_ref (or any of its sister functions), we can get into a race condition with a simultaneous "pack-refs --prune" that looks like this: 0. We have a large number of loose refs, and a few packed refs. refs/heads/z/foo is loose, with no matching entry in the packed-refs file. 1. Process A starts iterating through the refs. It loads the packed-refs file from disk, then starts lazily traversing through the loose ref directories. 2. Process B, running "pack-refs --prune", writes out the new packed-refs file. It then deletes the newly packed refs, including refs/heads/z/foo. 3. Meanwhile, process A has finally gotten to refs/heads/z (it traverses alphabetically). It descends, but finds nothing there. It checks its cached view of the packed-refs file, but it does not mention anything in "refs/heads/z/" at all (it predates the new file written by B in step 2). The traversal completes successfully without mentioning refs/heads/z/foo at all (the name, of course, isn't important; but the more refs you have and the farther down the alphabetical list a ref is, the more likely it is to hit the race). If refs/heads/z/foo did exist in the packed refs file at state 0, we would see an entry for it, but it would show whatever sha1 the ref had the last time it was packed (which could be an arbitrarily long time ago). This can be especially dangerous when process A is "git prune", as it means our set of reachable tips will be incomplete, and we may erroneously prune objects reachable from that tip (the same thing can happen if "repack -ad" is used, as it simply drops unreachable objects that are packed). This patch solves it by loading all of the loose refs for our traversal into our in-memory cache, and then refreshing the packed-refs cache. Because a pack-refs writer will always put the new packed-refs file into place before starting the prune, we know that any loose refs we fail to see will either truly be missing, or will have already been put in the packed-refs file by the time we refresh. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | get_packed_ref_cache: reload packed-refs file when it changesJeff King2013-06-201-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we read the packed-refs file into memory, we cache it to save work on future ref lookups. However, our cache may be out of date with respect to what is on disk if another process is simultaneously packing the refs. Normally it is acceptable for us to be a little out of date, since there is no guarantee whether we read the file before or after the simultaneous update. However, there is an important special case: our packed-refs file must be up to date with respect to any loose refs we read. Otherwise, we risk the following race condition: 0. There exists a loose ref refs/heads/master. 1. Process A starts and looks up the ref "master". It first checks $GIT_DIR/master, which does not exist. It then loads (and caches) the packed-refs file to see if "master" exists in it, which it does not. 2. Meanwhile, process B runs "pack-refs --all --prune". It creates a new packed-refs file which contains refs/heads/master, and removes the loose copy at $GIT_DIR/refs/heads/master. 3. Process A continues its lookup, and eventually tries $GIT_DIR/refs/heads/master. It sees that the loose ref is missing, and falls back to the packed-refs file. But it examines its cached version, which does not have refs/heads/master. After trying a few other prefixes, it reports master as a non-existent ref. There are many variants (e.g., step 1 may involve process A looking up another ref entirely, so even a fully qualified refname can fail). One of the most interesting ones is if "refs/heads/master" is already packed. In that case process A will not see it as missing, but rather will report whatever value happened to be in the packed-refs file before process B repacked (which might be an arbitrarily old value). We can fix this by making sure we reload the packed-refs file from disk after looking at any loose refs. That's unacceptably slow, so we can check its stat()-validity as a proxy, and read it only when it appears to have changed. Reading the packed-refs file after performing any loose-ref system calls is sufficient because we know the ordering of the pack-refs process: it always makes sure the newly written packed-refs file is installed into place before pruning any loose refs. As long as those operations by B appear in their executed order to process A, by the time A sees the missing loose ref, the new packed-refs file must be in place. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | add a stat_validity structMichael Haggerty2013-06-202-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It can sometimes be useful to know whether a path in the filesystem has been updated without going to the work of opening and re-reading its content. We trust the stat() information on disk already to handle index updates, and we can use the same trick here. This patch introduces a "stat_validity" struct which encapsulates the concept of checking the stat-freshness of a file. It is implemented on top of "struct stat_data" to reuse the logic about which stat entries to trust for a particular platform, but hides the complexity behind two simple functions: check and update. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | Extract a struct stat_data from cache_entryMichael Haggerty2013-06-203-80/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add public functions fill_stat_data() and match_stat_data() to work with it. This infrastructure will later be used to check the validity of other types of file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | packed_ref_cache: increment refcount when lockedMichael Haggerty2013-06-201-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increment the packed_ref_cache reference count while it is locked to prevent its being freed. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | do_for_each_entry(): increment the packed refs cache refcountMichael Haggerty2013-06-201-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function calls a user-supplied callback function which could do something that causes the packed refs cache to be invalidated. So acquire a reference count on the data structure to prevent our copy from being freed while we are iterating over it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | refs: manage lifetime of packed refs cache via reference countingMichael Haggerty2013-06-201-3/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In struct packed_ref_cache, keep a count of the number of users of the data structure. Only free the packed ref cache when the reference count goes to zero rather than when the packed ref cache is cleared. This mechanism will be used to prevent the cache data structure from being freed while it is being iterated over. So far, only the reference in struct ref_cache::packed is counted; other users will be adjusted in separate commits. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | refs: implement simple transactions for the packed-refs fileMichael Haggerty2013-06-203-21/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle simple transactions for the packed-refs file at the packed_ref_cache level via new functions lock_packed_refs(), commit_packed_refs(), and rollback_packed_refs(). Only allow the packed ref cache to be modified (via add_packed_ref()) while the packed refs file is locked. Change clone to add the new references within a transaction. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | refs: wrap the packed refs cache in a level of indirectionMichael Haggerty2013-06-201-6/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we know, we can solve any problem in this manner. In this case, the problem is to avoid freeing a packed refs cache while somebody is using it. So add a level of indirection as a prelude to reference-counting the packed refs cache. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | pack_refs(): split creation of packed refs and entry writingMichael Haggerty2013-06-201-14/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split pack_refs() into multiple passes: * Iterate over loose refs. For each one that can be turned into a packed ref, create a corresponding entry in the packed refs cache. * Write the packed refs to the packed-refs file. This change isolates the mutation of the packed-refs file to a single place. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | repack_without_ref(): split list curation and entry writingMichael Haggerty2013-06-201-12/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The repack_without_ref() function first removes the deleted ref from the internal packed-refs list, then writes the packed-refs list to disk, omitting any broken or stale entries. This patch splits that second step into multiple passes: * collect the list of refnames that should be deleted from packed_refs * delete those refnames from the cache * write the remainder to the packed-refs file The purpose of this change is to make the "write the remainder" part reusable. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'ap/diff-ignore-blank-lines'Junio C Hamano2013-06-3010-8/+439
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git diff" learned a mode that ignores hunks whose change consists only of additions and removals of blank lines, which is the same as "diff -B" (ignore blank lines) of GNU diff. * ap/diff-ignore-blank-lines: diff: add --ignore-blank-lines option
| * | | | | | diff: add --ignore-blank-lines optionAntoine Pelisse2013-06-1910-8/+439
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The goal of the patch is to introduce the GNU diff -B/--ignore-blank-lines as closely as possible. The short option is not available because it's already used for "break-rewrites". When this option is used, git-diff will not create hunks that simply add or remove empty lines, but will still show empty lines addition/suppression if they are close enough to "valuable" changes. There are two differences between this option and GNU diff -B option: - GNU diff doesn't have "--inter-hunk-context", so this must be handled - The following sequence looks like a bug (context is displayed twice): $ seq 5 >file1 $ cat <<EOF >file2 change 1 2 3 4 5 change EOF $ diff -u -B file1 file2 --- file1 2013-06-08 22:13:04.471517834 +0200 +++ file2 2013-06-08 22:13:23.275517855 +0200 @@ -1,5 +1,7 @@ +change 1 2 + 3 4 5 @@ -3,3 +5,4 @@ 3 4 5 +change So here is a more thorough description of the option: - real changes are interesting - blank lines that are close enough (less than context size) to interesting changes are considered interesting (recursive definition) - "context" lines are used around each hunk of interesting changes - If two hunks are separated by less than "inter-hunk-context", they will be merged into one. The implementation does the "interesting changes selection" in a single pass. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'mh/loose-refs-race-with-pack-ref'Junio C Hamano2013-06-301-34/+72
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We read loose and packed rerferences in two steps, but after deciding to read a loose ref but before actually opening it to read it, another process racing with us can unlink it, which would cause us to barf. Update the codepath to retry when such a race is detected. * mh/loose-refs-race-with-pack-ref: resolve_ref_unsafe(): close race condition reading loose refs resolve_ref_unsafe(): handle the case of an SHA-1 within loop resolve_ref_unsafe(): extract function handle_missing_loose_ref()
| * | | | | | | resolve_ref_unsafe(): close race condition reading loose refsMichael Haggerty2013-06-191-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We read loose references in two steps. The code is roughly: lstat() if error ENOENT: loose ref is missing; look for corresponding packed ref else if S_ISLNK: readlink() if error: report failure else if S_ISDIR: report failure else open() if error: report failure read() The problem is that the first filesystem call, to lstat(), is not atomic with the second filesystem call, to readlink() or open(). Therefore it is possible for another process to change the file between our two calls, for example: * If the other process deletes the file, our second call will fail with ENOENT, which we *should* interpret as "loose ref is missing; look for corresponding packed ref". This can arise if the other process is pack-refs; it might have just written a new packed-refs file containing the old contents of the reference then deleted the loose ref. * If the other process changes a symlink into a plain file, our call to readlink() will fail with EINVAL, which we *should* respond to by trying to open() and read() the file. The old code treats the reference as missing in both of these cases, which is incorrect. So instead, handle errors more selectively: if the result of readline()/open() is a failure that is inconsistent with the result of the previous lstat(), then something is fishy. In this case jump back and start over again with a fresh call to lstat(). One race is still possible and undetected: another process could change the file from a regular file into a symlink between the call to lstat and the call to open(). The open() call would silently follow the symlink and not know that something is wrong. This situation could be detected in two ways: * On systems that support O_NOFOLLOW, pass that option to the open(). * On other systems, call fstat() on the fd returned by open() and make sure that it agrees with the stat info from the original lstat(). However, we don't use symlinks anymore, so this situation is unlikely. Moreover, it doesn't appear that treating a symlink as a regular file would have grave consequences; after all, this is exactly how the code handles non-relative symlinks. So this commit leaves that race unaddressed. Note that this solves only the part of the race within resolve_ref_unsafe. In the situation described above, we may still be depending on a cached view of the packed-refs file; that race will be dealt with in a future patch. This problem was reported and diagnosed by Jeff King <peff@peff.net>, and this solution is derived from his patch. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | resolve_ref_unsafe(): handle the case of an SHA-1 within loopMichael Haggerty2013-06-191-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one "break" statement within the loop, which jumps to the code after the loop that handles the case of a file that holds a SHA-1. So move that code from below the loop into the if statement where the break was previously located. This makes the logic flow more local. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | resolve_ref_unsafe(): extract function handle_missing_loose_ref()Michael Haggerty2013-06-191-21/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nesting was getting a bit out of hand, and it's about to get worse. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | | Merge branch 'nk/name-rev-abbreviated-refs'Junio C Hamano2013-06-302-8/+31
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git name-rev --refs=tags/v*" were forbidden, which was a bit inconvenient (you had to give a pattern to match refs fully, like --refs=refs/tags/v*). * nk/name-rev-abbreviated-refs: name-rev: allow to specify a subpath for --refs option
| * | | | | | | | name-rev: allow to specify a subpath for --refs optionNamhyung Kim2013-06-182-8/+31
| | |/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an user wants to filter specific ref using the --refs option, the pattern needs to match the full ref, e.g. --refs=refs/tags/v1.*. It'd be convenient to specify a subpath of ref pattern. For example, --refs=origin/* can find refs/remotes/origin/master by searching the pattern against its substrings in turn: refs/remotes/origin/master remotes/origin/master origin/master If it finds a match in a subpath, unambigous part of the ref path will be removed in the output. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | | Merge branch 'jk/submodule-subdirectory-ok'Junio C Hamano2013-06-309-213/+673
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow various subcommands of "git submodule" to be run not from the top of the working tree of the superproject. * jk/submodule-subdirectory-ok: submodule: drop the top-level requirement rev-parse: add --prefix option submodule: show full path in error message t7403: add missing && chaining t7403: modernize style t7401: make indentation consistent
| * | | | | | | | submodule: drop the top-level requirementJohn Keeping2013-06-176-35/+319
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the new rev-parse --prefix option to process all paths given to the submodule command, dropping the requirement that it be run from the top-level of the repository. Since the interpretation of a relative submodule URL depends on whether or not "remote.origin.url" is configured, explicitly block relative URLs in "git submodule add" when not at the top level of the working tree. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | | rev-parse: add --prefix optionJohn Keeping2013-06-173-5/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes 'git rev-parse' behave as if it were invoked from the specified subdirectory of a repository, with the difference that any file paths which it prints are prefixed with the full path from the top of the working tree. This is useful for shell scripts where we may want to cd to the top of the working tree but need to handle relative paths given by the user on the command line. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | | submodule: show full path in error messageJohn Keeping2013-06-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When --recursive was added to "submodule foreach" in commit 15fc56a (git submodule foreach: Add --recursive to recurse into nested submodules, 2009-08-19), the error message when the script returns a non-zero status was not updated to contain $prefix to show the full path. Fix this. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | | t7403: add missing && chainingJohn Keeping2013-06-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | | t7403: modernize styleJohn Keeping2013-06-171-133/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the indentation to use tabs consistently and start content on the line after the paren opening a subshell. Also don't put a space in ">file" and remove ":" from ": >file" to be consistent with the majority of tests elsewhere. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>