delta/git.git - github.com: git/git.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'rs/ref-update-check-errors-early'	Junio C Hamano	2014-06-03	2	-4/+8
\|\ \| \| \| \| \| \| \| \| \| \|	* rs/ref-update-check-errors-early: commit.c: check for lock error and return early sequencer.c: check for lock failure and bail early in fast_forward_to
\| *	commit.c: check for lock error and return earlyrs/ref-update-check-errors-early	Ronnie Sahlberg	2014-04-17	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the check for the lock failure to happen immediately after lock_any_ref_for_update(). Previously the lock and the check-if-lock-failed was separated by a handful of string manipulation statements. Moving the check to occur immediately after the failed lock makes the code slightly easier to read and makes it follow the pattern of try-to-take-a-lock(); if (check-if-lock-failed) { error(); } Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| *	sequencer.c: check for lock failure and bail early in fast_forward_to	Ronnie Sahlberg	2014-04-17	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change fast_forward_to() to check if locking the ref failed, print a nice error message and bail out early. The old code did not check if ref_lock was NULL and relied on the fact that the write_ref_sha1() would safely detect this condition and set the return variable ret to indicate an error. While that is safe, it makes the code harder to read for two reasons: * Inconsistency. Almost all other places we do check the lock for NULL explicitly, so the naive reader is confused "why don't we check here?" * And relying on write_ref_sha1() to detect and return an error for when a previous lock_any_ref_for_update() failed feels obfuscated. This change should not change any functionality or logic aside from adding an extra error message when this condition is triggered (write_ref_sha1() returns an error silently for this condition). Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* \|	Merge branch 'nd/index-pack-one-fd-per-thread'	Junio C Hamano	2014-06-03	3	-18/+17
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable threaded index-pack on platforms without thread-unsafe pread() emulation. * nd/index-pack-one-fd-per-thread: index-pack: work around thread-unsafe pread()
\| * \|	index-pack: work around thread-unsafe pread()nd/index-pack-one-fd-per-thread	Nguyễn Thái Ngọc Duy	2014-04-16	3	-18/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multi-threaing of index-pack was disabled with c0f8654 (index-pack: Disable threading on cygwin - 2012-06-26), because pread() implementations for Cygwin and MSYS were not thread safe. Recent Cygwin does offer usable pread() and we enabled multi-threading with 103d530f (Cygwin 1.7 has thread-safe pread, 2013-07-19). Work around this problem on platforms with a thread-unsafe pread() emulation by opening one file handle per thread; it would prevent parallel pread() on different file handles from stepping on each other. Also remove NO_THREAD_SAFE_PREAD that was introduced in c0f8654 because it's no longer used anywhere. This workaround is unconditional, even for platforms with thread-safe pread() because the overhead is small (a couple file handles more) and not worth fragmenting the code. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Tested-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* \| \|	Merge branch 'ym/fix-opportunistic-index-update-race'	Junio C Hamano	2014-06-03	6	-5/+90
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Read-only operations such as "git status" that internally refreshes the index write out the refreshed index to the disk to optimize future accesses to the working tree, but this could race with a "read-write" operation that modify the index while it is running. Detect such a race and avoid overwriting the index. Duy raised a good point that we may need to do the same for the normal writeout codepath, not just the "opportunistic" update codepath. While that is true, nobody sane would be running two simultaneous operations that are clearly write-oriented competing with each other against the same index file. So in that sense that can be done as a less urgent follow-up for this topic. * ym/fix-opportunistic-index-update-race: read-cache.c: verify index file before we opportunistically update it wrapper.c: add xpread() similar to xread()
\| * \| \|	read-cache.c: verify index file before we opportunistically update itym/fix-opportunistic-index-update-race	Yiannis Marangos	2014-04-10	3	-1/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before we proceed to opportunistically update the index (often done by an otherwise read-only operation like "git status" and "git diff" that internally refreshes the index), we must verify that the current index file is the same as the one that we read earlier before we took the lock on it, in order to avoid a possible race. In the example below git-status does "opportunistic update" and git-rebase updates the index, but the race can happen in general. 1. process A calls git-rebase (or does anything that uses the index) 2. process A applies 1st commit 3. process B calls git-status (or does anything that updates the index) 4. process B reads index 5. process A applies 2nd commit 6. process B takes the lock, then overwrites process A's changes. 7. process A applies 3rd commit As an end result the 3rd commit will have a revert of the 2nd commit. When process B takes the lock, it needs to make sure that the index hasn't changed since step 4. Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \|	wrapper.c: add xpread() similar to xread()	Yiannis Marangos	2014-04-10	4	-4/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is a common mistake to call read(2)/pread(2) and forget to anticipate that they may return error with EAGAIN/EINTR when the system call is interrupted. We have xread() helper to relieve callers of read(2) from having to worry about it; add xpread() helper to do the same for pread(2). Update the caller in the builtin/index-pack.c and the mmap emulation in compat/. Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* \| \| \|	Merge branch 'mh/ref-transaction'	Junio C Hamano	2014-06-03	13	-285/+585
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update "update-ref --stdin [-z]" and then introduce a transactional support for (multi-)reference updates. * mh/ref-transaction: (27 commits) ref_transaction_commit(): work with transaction->updates in place struct ref_update: add a type field struct ref_update: add a lock field ref_transaction_commit(): simplify code using temporary variables struct ref_update: store refname as a FLEX_ARRAY struct ref_update: rename field "ref_name" to "refname" refs: remove API function update_refs() update-ref --stdin: reimplement using reference transactions refs: add a concept of a reference transaction update-ref --stdin: harmonize error messages update-ref --stdin: improve the error message for unexpected EOF t1400: test one mistake at a time update-ref --stdin -z: deprecate interpreting the empty string as zeros update-ref.c: extract a new function, parse_next_sha1() t1400: test that stdin -z update treats empty <newvalue> as zeros update-ref --stdin: simplify error messages for missing oldvalues update-ref --stdin: make error messages more consistent update-ref --stdin: improve error messages for invalid values update-ref.c: extract a new function, parse_refname() parse_cmd_verify(): copy old_sha1 instead of evaluating <oldvalue> twice ...
\| * \| \| \|	ref_transaction_commit(): work with transaction->updates in placemh/ref-transaction	Michael Haggerty	2014-04-07	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we free the transaction when we are done, there is no need to make a copy of transaction->updates before working with it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	struct ref_update: add a type field	Michael Haggerty	2014-04-07	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It used to be that ref_transaction_commit() allocated a temporary array to hold the types of references while it is working. Instead, add a type field to ref_update that ref_transaction_commit() can use as its scratch space. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	struct ref_update: add a lock field	Michael Haggerty	2014-04-07	1	-17/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we manage ref_update objects internally, we can use them to hold some of the scratch space we need when actually carrying out the updates. Store the (struct ref_lock *) there. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	ref_transaction_commit(): simplify code using temporary variables	Michael Haggerty	2014-04-07	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use temporary variables in the for-loop blocks to simplify expressions in the rest of the loop. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	struct ref_update: store refname as a FLEX_ARRAY	Michael Haggerty	2014-04-07	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	struct ref_update: rename field "ref_name" to "refname"	Michael Haggerty	2014-04-07	2	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is consistent with the usual nomenclature. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	refs: remove API function update_refs()	Michael Haggerty	2014-04-07	2	-33/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It has been superseded by reference transactions. This also means that struct ref_update can become private. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: reimplement using reference transactions	Michael Haggerty	2014-04-07	1	-67/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change is mostly clerical: the parse_cmd_() functions need to use local variables rather than a struct ref_update to collect the arguments needed for each update, and then call ref_transaction_() to queue the change rather than building up the list of changes at the caller side. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	refs: add a concept of a reference transaction	Michael Haggerty	2014-04-07	2	-0/+164
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Build out the API for dealing with a bunch of reference checks and changes within a transaction. Define an opaque ref_transaction type that is managed entirely within refs.c. Introduce functions for beginning a transaction, adding updates to a transaction, and committing/rolling back a transaction. This API will soon replace update_refs(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: harmonize error messages	Michael Haggerty	2014-04-07	2	-28/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make (most of) the error messages for invalid input have the same format [1]: $COMMAND [SP $REFNAME]: $MESSAGE Update the tests accordingly. [1] A few error messages are left with their old form, because $COMMAND and $REFNAME aren't passed all the way down the call stack. Maybe those sites should be changed some day, too. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: improve the error message for unexpected EOF	Michael Haggerty	2014-04-07	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Distinguish this error from the error that an argument is missing for another reason. Update the tests accordingly. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	t1400: test one mistake at a time	Michael Haggerty	2014-04-07	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This case wants to test passing a bad refname to the "update" command. But it also passes too few arguments to "update", which muddles the situation: which error should be diagnosed? So split this test into two: * One that passes too few arguments to update * One that passes all three arguments to "update", but with a bad refname. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin -z: deprecate interpreting the empty string as zeros	Michael Haggerty	2014-04-07	3	-8/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the original version of this command, for the single case of the "update" command's <newvalue>, the empty string was interpreted as being equivalent to 40 "0"s. This shorthand is unnecessary (binary input will usually be generated programmatically anyway), and it complicates the parser and the documentation. So gently deprecate this usage: remove its description from the documentation and emit a warning if it is found. But for reasons of backwards compatibility, continue to accept it. Helped-by: Brad King <brad.king@kitware.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref.c: extract a new function, parse_next_sha1()	Michael Haggerty	2014-04-07	2	-63/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace three functions, update_store_new_sha1(), update_store_old_sha1(), and parse_next_arg(), with a single function, parse_next_sha1(). The new function takes care of a whole argument, including checking whether it is there, converting it to an SHA-1, and emitting errors on EOF or for invalid values. The return value indicates whether the argument was present or absent, which requires a bit of intelligence because absent values are represented differently depending on whether "-z" was used. The new interface means that the calling functions, parse_cmd_*(), don't have to interpret the result differently based on the line_termination mode that is in effect. It also means that parse_cmd_create() can distinguish unambiguously between an empty new value and a zeros new value, which fixes a failure in t1400. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	t1400: test that stdin -z update treats empty <newvalue> as zeros	Michael Haggerty	2014-04-07	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the (slightly inconsistent) status quo; make sure it doesn't change by accident. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: simplify error messages for missing oldvalues	Michael Haggerty	2014-04-07	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of, for example, fatal: update refs/heads/master missing [<oldvalue>] NUL emit fatal: update refs/heads/master missing <oldvalue> Update the tests accordingly. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: make error messages more consistent	Michael Haggerty	2014-04-07	2	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old error messages emitted for invalid input sometimes said "<oldvalue>"/"<newvalue>" and sometimes said "old value"/"new value". Convert them all to the former. Update the tests accordingly. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: improve error messages for invalid values	Michael Haggerty	2014-04-07	2	-15/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an invalid value is passed to "update-ref --stdin" as <oldvalue> or <newvalue>, include the command and the name of the reference at the beginning of the error message. Update the tests accordingly. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref.c: extract a new function, parse_refname()	Michael Haggerty	2014-04-07	1	-49/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no reason to obscure the fact that parse_first_arg() always parses refnames. Form the new function by combining parse_first_arg() and update_store_ref_name(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	parse_cmd_verify(): copy old_sha1 instead of evaluating <oldvalue> twice	Michael Haggerty	2014-04-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Aside from avoiding a tiny bit of work, this makes it transparently obvious that old_sha1 and new_sha1 are identical. It is arguably a bit silly to have to set new_sha1 in order to verify old_sha1, but that is a problem for another day. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update-ref --stdin: read the whole input at once	Michael Haggerty	2014-04-07	1	-62/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Read the whole input into a strbuf at once, and then parse it from there. This might also be a tad faster, but that is not the point. The point is to decouple the parsing code from the input source (the old parsing code had to read new data even in the middle of commands). Add docstrings for the parsing functions. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	update_refs(): fix constness	Michael Haggerty	2014-04-07	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old signature of update_refs() required a (const struct ref_update ) for its updates_orig argument. The "const" is presumably there to promise that the function will not modify the contents of the structures. But this declaration does not permit the function to be called with a (struct ref_update ), which is perfectly legitimate. C's type system is not powerful enough to express what we'd like. So remove the first "const" from the declaration. On the other hand, the function can promise not to modify the pointers within the array that is passed to it without inconveniencing its callers. So add a "const" that has that effect, making the final declaration (struct ref_update * const *). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	refs.h: rename the action_on_err constants	Michael Haggerty	2014-04-07	11	-29/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given that these constants are only being used when updating references, it is inappropriate to give them such generic names as "DIE_ON_ERR". So prefix their names with "UPDATE_REFS_". Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	t1400: add some more tests involving quoted arguments	Michael Haggerty	2014-04-07	1	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously there were no good tests of C-quoted arguments. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	parse_arg(): really test that argument is properly terminated	Michael Haggerty	2014-04-07	2	-7/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old parse_arg(), when fed an argument "refs/heads/a"master parsed 'refs/heads/a' off of the front of the argument and considered itself successful. It was only when parse_next_arg() tried to parse the next argument that a problem was noticed. But in fact, the definition of the input format requires arguments to be terminated by SP or NUL, so this argument is already erroneous and parse_arg() should diagnose the problem. So teach parse_arg() to verify that C-quoted arguments are terminated correctly. If not, emit a more specific error message. There is no corresponding error case of a non-C-quoted argument that is not terminated correctly, because the end of a non-quoted argument is by definition a space or NUL, so there is no way to insert other junk between the "end" of the argument and the argument terminator. Adjust the tests to expect the new error message. Add a docstring to the function, incorporating the comments that were formerly within the function plus some added information. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	t1400: provide more usual input to the command	Michael Haggerty	2014-04-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old version was passing (among other things) update SP refs/heads/c NUL NUL 0{40} NUL to "git update-ref -z --stdin" to test whether the old-value check for c is working. But the <newvalue> is empty, which is a bit off the beaten track. So, to be sure that we are testing what we want to test, provide an actual <newvalue> on the "update" line. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \|	t1400: fix name and expected result of one test	Michael Haggerty	2014-04-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test stdin -z create ref fails with zero new value actually passes an empty new value, not a zero new value. So rename the test s/zero/empty/, and change the expected error from fatal: create $c given zero new value to fatal: create $c missing <newvalue> Of course, this makes the test fail now, because although "git update-ref" tries to distinguish between these two errors, it does not succeed in this situation. Fixing it is more than a one-liner, so mark the test test_expect_failure for now. The failure will be fixed later in this patch series. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* \| \| \| \|	Merge branch 'ks/tree-diff-nway'	Junio C Hamano	2014-06-03	10	-170/+723
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of running N pair-wise diff-trees when inspecting a N-parent merge, find the set of paths that were touched by walking N+1 trees in parallel. These set of paths can then be turned into N pair-wise diff-tree results to be processed through rename detections and such. And N=2 case nicely degenerates to the usual 2-way diff-tree, which is very nice. * ks/tree-diff-nway: mingw: activate alloca combine-diff: speed it up, by using multiparent diff tree-walker directly tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Portable alloca for Git tree-diff: reuse base str(buf) memory on sub-tree recursion tree-diff: no need to call "full" diff_tree_sha1 from show_path() tree-diff: rework diff_tree interface to be sha1 based tree-diff: diff_tree() should now be static tree-diff: remove special-case diff-emitting code for empty-tree cases tree-diff: simplify tree_entry_pathcmp tree-diff: show_path prototype is not needed anymore tree-diff: rename compare_tree_entry -> tree_entry_pathcmp tree-diff: move all action-taking code out of compare_tree_entry() tree-diff: don't assume compare_tree_entry() returns -1,0,1 tree-diff: consolidate code for emitting diffs and recursion in one place tree-diff: show_tree() is not needed tree-diff: no need to pass match to skip_uninteresting() tree-diff: no need to manually verify that there is no mode change for a path combine-diff: move changed-paths scanning logic into its own function combine-diff: move show_log_first logic/action out of paths scanning
\| * \| \| \| \|	mingw: activate allocaks/tree-diff-nway	Kirill Smelkov	2014-04-09	2	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both MSVC and MINGW have alloca(3) definitions in malloc.h, so by moving win32-compat alloca.h from compat/vcbuild/include/ to compat/win32/ , which is included by both MSVC and MINGW CFLAGS, we can make alloca() work on both those Windows environments. In MINGW, malloc.h has explicit check for GNUC and if it is so, defines alloca to __builtin_alloca, so it looks like we don't need to add any code to here-shipped alloca.h to get optimum performance. Compile-tested on Windows in MSysGit. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Acked-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	combine-diff: speed it up, by using multiparent diff tree-walker directly	Kirill Smelkov	2014-04-07	2	-5/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As was recently shown in "combine-diff: optimize combine_diff_path sets intersection", combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(), we can call multiparent tree-walker directly and get ready paths. () because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s 16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: rework diff_tree() to generate diffs for multiparent cases as well	Kirill Smelkov	2014-04-07	4	-64/+465
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	Portable alloca for Git	Kirill Smelkov	2014-03-27	4	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the next patch we'll have to use alloca() for performance reasons, but since alloca is non-standardized and is not portable, let's have a trick with compatibility wrappers: 1. at configure time, determine, do we have working alloca() through alloca.h, and define #define HAVE_ALLOCA_H if yes. 2. in code #ifdef HAVE_ALLOCA_H # include <alloca.h> # define xalloca(size) (alloca(size)) # define xalloca_free(p) do {} while(0) #else # define xalloca(size) (xmalloc(size)) # define xalloca_free(p) (free(p)) #endif and use it like func() { p = xalloca(size); ... xalloca_free(p); } This way, for systems, where alloca is available, we'll have optimal on-stack allocations with fast executions. On the other hand, on systems, where alloca is not available, this gracefully fallbacks to xmalloc/free. Both autoconf and config.mak.uname configurations were updated. For autoconf, we are not bothering considering cases, when no alloca.h is available, but alloca() works some other way - its simply alloca.h is available and works or not, everything else is deep legacy. For config.mak.uname, I've tried to make my almost-sure guess for where alloca() is available, but since I only have access to Linux it is the only change I can be sure about myself, with relevant to other changed systems people Cc'ed. NOTE SunOS and Windows had explicit -DHAVE_ALLOCA_H in their configurations. I've changed that to now-common HAVE_ALLOCA_H=YesPlease which should be correct. Cc: Brandon Casey <drafnel@gmail.com> Cc: Marius Storm-Olsen <mstormo@gmail.com> Cc: Johannes Sixt <j6t@kdbg.org> Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de> Cc: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Cc: Gerrit Pape <pape@smarden.org> Cc: Petr Salinger <Petr.Salinger@seznam.cz> Cc: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Thomas Schwinge <thomas@codesourcery.com> (GNU Hurd changes) Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: reuse base str(buf) memory on sub-tree recursion	Kirill Smelkov	2014-03-27	1	-19/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of allocating it all the time for every subtree in ll_diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all callee just use it in stacking style, without memory allocations. This should be faster, and for me this change gives the following slight speedups for git log --raw --no-abbrev --no-renames --format='%H' navy.git linux.git v3.10..v3.11 before 0.618s 1.903s after 0.611s 1.889s speedup 1.1% 0.7% Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: no need to call "full" diff_tree_sha1 from show_path()	Kirill Smelkov	2014-03-27	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As described in previous commit, when recursing into sub-trees, we can use lower-level tree walker, since its interface is now sha1 based. The change is ok, because diff_tree_sha1() only invokes ll_diff_tree_sha1(), and also, if base is empty, try_to_follow_renames(). But base is not empty here, as we have added a path and '/' before recursing. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: rework diff_tree interface to be sha1 based	Kirill Smelkov	2014-03-27	1	-32/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the next commit this will allow to reduce intermediate calls, when recursing into subtrees - at that stage we know only subtree sha1, and it is natural for tree walker to start from that phase. For now we do diff_tree show_path diff_tree_sha1 diff_tree ... and the change will allow to reduce it to diff_tree show_path diff_tree Also, it will allow to omit allocating strbuf for each subtree, and just reuse the common strbuf via playing with its len. The above-mentioned improvements go in the next 2 patches. The downside is that try_to_follow_renames(), if active, we cause re-reading of 2 initial trees, which was negligible based on my timings, and which is outweighed cogently by the upsides. NOTE To keep with the current interface and semantics, I needed to rename the function from diff_tree() to diff_tree_sha1(). As diff_tree_sha1() was already used, and the function we are talking here is its more low-level helper, let's use convention for prefixing such helpers with "ll_". So the final renaming is diff_tree() -> ll_diff_tree_sha1() Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: diff_tree() should now be static	Kirill Smelkov	2014-03-26	2	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We reworked all its users to use the functionality through diff_tree_sha1 variant in recent patches (see "tree-diff: allow diff_tree_sha1 to accept NULL sha1" and what comes next). diff_tree() is now not used outside tree-diff.c - make it static. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: remove special-case diff-emitting code for empty-tree cases	Kirill Smelkov	2014-03-26	1	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While walking trees, we iterate their entries from lowest to highest in sort order, so empty tree means all entries were already went over. If we artificially assign +infinity value to such tree "entry", it will go after all usual entries, and through the usual driver loop we will be taking the same actions, which were hand-coded for special cases, i.e. t1 empty, t2 non-empty pathcmp(+∞, t2) -> +1 show_path(/t1=/NULL, t2); /* = t1 > t2 case in main loop / t1 non-empty, t2-empty pathcmp(t1, +∞) -> -1 show_path(t1, /t2=/NULL); / = t1 < t2 case in main loop */ In other words when we have t1 and t2, we return a sign that tells the caller to indicate the "earlier" one to be emitted, and by returning the sign that causes the non-empty side to be emitted, we will automatically cause the entries from the remaining side to be emitted, without attempting to touch the empty side at all. We can teach tree_entry_pathcmp() to pretend that an empty tree has an element that sorts after anything else to achieve this. Right now we never go to when compared tree descriptors are both infinity, as this condition is checked in the loop beginning as finishing criteria, but will do so in the future, when there will be several parents iterated simultaneously, and some pair of them would run to the end. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: simplify tree_entry_pathcmp	Kirill Smelkov	2014-03-20	1	-11/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since an earlier "Finally switch over tree descriptors to contain a pre-parsed entry", we can safely access all tree_desc->entry fields directly instead of first "extracting" them through tree_entry_extract. Use it. The code generated stays the same - only it now visually looks cleaner. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: show_path prototype is not needed anymore	Kirill Smelkov	2014-03-20	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We moved all action-taking code below show_path() in recent HEAD~~ (tree-diff: move all action-taking code out of compare_tree_entry). Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: rename compare_tree_entry -> tree_entry_pathcmp	Kirill Smelkov	2014-03-20	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since previous commit, this function does not compare entry hashes, and mode are compared fully outside of it. So what it does is compare entry names and DIR bit in modes. Reflect this in its name. Add documentation stating the semantics, and move the note about files/dirs comparison to it. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
\| * \| \| \| \|	tree-diff: move all action-taking code out of compare_tree_entry()	Kirill Smelkov	2014-03-20	1	-16/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- let it do only comparison. This way the code is cleaner and more structured - cmp function only compares, and the driver takes action based on comparison result. There should be no change in performance, as effectively, we just move if series from on place into another, and merge it to was-already-there same switch/if, so the result is maybe a little bit faster. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>