| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When cloning with "--recursive", we'd generally expect
submodules to show progress reports if the main clone did,
too.
In older versions of git, this mostly worked out of the
box. Since we show progress by default when stderr is a tty,
and since the child clones inherit the parent stderr, then
both processes would come to the same decision by default.
If the parent clone was asked for "--quiet", we passed down
"--quiet" to the child. However, if stderr was not a tty and
the user specified "--progress", we did not propagate this
to the child.
That's a minor bug, but things got much worse when we
switched recently to submodule--helper's update_clone
command. With that change, the stderr of the child clones
are always connected to a pipe, and we never output
progress at all.
This patch teaches git-submodule and git-submodule--helper
how to pass down an explicit "--progress" flag when cloning.
The clone command then decides to propagate that flag based
on the cloning decision made earlier (which takes into
account isatty(2) of the parent process, existing --progress
or --quiet flags, etc). Since the child processes always run
without a tty on stderr, we don't have to worry about
passing an explicit "--no-progress"; it's the default for
them.
This fixes the recent loss of progress during recursive
clones. And as a bonus, it makes:
git clone --recursive --progress ... 2>&1 | cat
work by triggering progress explicitly in the children.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
"git clone --resurse-submodules --reference $path $URL" is a way to
reduce network transfer cost by borrowing objects in an existing
$path repository when cloning the superproject from $URL; it
learned to also peek into $path for presense of corresponding
repositories of submodules and borrow objects from there when able.
* sb/submodule-clone-rr:
clone: recursive and reference option triggers submodule alternates
clone: implement optional references
clone: clarify option_reference as required
clone: factor out checking for an alternate path
submodule--helper update-clone: allow multiple references
submodule--helper module-clone: allow multiple references
t7408: merge short tests, factor out testing method
t7408: modernize style
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When `--recursive` and `--reference` is given, it is reasonable to
expect that the submodules are created with references to the submodules
of the given alternate for the superproject.
An initial attempt to do this was presented to the mailing list, which
used flags that are passed around ("--super-reference") that instructed
the submodule clone to look for a reference in the submodules of the
referenced superproject. This is not well thought out, as any further
`submodule update` should also respect the initial setup.
When a new submodule is added to the superproject and the alternate
of the superproject does not know about that submodule yet, we rather
error out informing the user instead of being unclear if we did or did
not use a submodules alternate.
To solve this problem introduce new options that store the configuration
for what the user wanted originally.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In a later patch we want to try to create alternates for submodules,
but they might not exist in the referenced superproject. So add a way
to skip the non existing references and report them.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
In the next patch we introduce optional references; To better distinguish
between optional and required references we rename the variable.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In a later patch we want to determine if a path is suitable as an
alternate from other commands than builtin/clone. Move the checking
functionality of `add_one_reference` to `compute_alternate_path` that is
defined in cache.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Allow the user to pass in multiple references to update_clone.
Currently this is only internal API, but once the shell script is
replaced by a C version, this is needed.
This fixes an API bug between the shell script and the helper.
Currently the helper accepts "--reference" "--reference=foo"
as a OPT_STRING whose value happens to be "--reference=foo", and
then uses
if (suc->reference)
argv_array_push(&child->args, suc->reference)
where suc->reference _is_ "--reference=foo" when invoking the
underlying "git clone", it cancels out.
With this change we omit one of the "--reference" arguments when
passing references from the shell script to the helper.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Allow users to pass in multiple references, just as clone accepts multiple
references as well.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Tests consisting of one line each can be consolidated to have fewer tests
to run as well as fewer lines of code.
When having just a few git commands, do not create a new shell but
use the -C flag in Git to execute in the correct directory.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
No functional change intended. This commit only changes formatting
to the style we recently use, e.g. starting the body of a test with a
single quote on the same line as the header, and then having the test
indented in the following lines.
Whenever we change directories, we do that in subshells.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Enhance "git status --porcelain" output by collecting more data on
the state of the index and the working tree files, which may
further be used to teach git-prompt (in contrib/) to make fewer
calls to git.
* jh/status-v2-porcelain:
status: unit tests for --porcelain=v2
test-lib-functions.sh: add lf_to_nul helper
git-status.txt: describe --porcelain=v2 format
status: print branch info with --porcelain=v2 --branch
status: print per-file porcelain v2 status data
status: collect per-file data for --porcelain=v2
status: support --porcelain[=<version>]
status: cleanup API to wt_status_print
status: rename long-format print routines
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Test porcelain v2 status format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add lf_to_nul helper function to test-lib-functions.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Update status manpage to include information about
porcelain v2 format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Expand porcelain v2 output to include branch and tracking
branch information. This includes the commit id, the branch,
the upstream branch, and the ahead and behind counts.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Print per-file information in porcelain v2 format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Collect extra per-file data for porcelain V2 format.
The output of `git status --porcelain` leaves out many
details about the current status that clients might like
to have. This can force them to be less efficient as they
may need to launch secondary commands (and try to match
the logic within git) to accumulate this extra information.
For example, a GUI IDE might want the file mode to display
the correct icon for a changed item (without having to stat
it afterwards).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Update --porcelain argument to take optional version parameter
to allow multiple porcelain formats to be supported in the future.
The token "v1" is the default value and indicates the traditional
porcelain format. (The token "1" is an alias for that.)
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Refactor the API between builtin/commit.c and wt-status.[ch].
Hide the details of the various wt_*status_print() routines inside
wt-status.c behind a single (new) wt_status_print() routine.
Eliminate the switch statements from builtin/commit.c.
Allow details of new status formats to be isolated within wt-status.c
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Rename the various wt_status_print*() routines to be
wt_longstatus_print*() to make it clear that these
routines are only concerned with the normal/long
status output and reduce developer confusion as other
status formats are added in the future.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Clarify various ways to specify the "revision ranges" in the
documentation.
* po/range-doc:
doc: revisions: sort examples and fix alignment of the unchanged
doc: revisions: show revision expansion in examples
doc: revisions - clarify reachability examples
doc: revisions - define `reachable`
doc: gitrevisions - clarify 'latter case' is revision walk
doc: gitrevisions - use 'reachable' in page description
doc: revisions: single vs multi-parent notation comparison
doc: revisions: extra clarification of <rev>^! notation effects
doc: revisions: give headings for the two and three dot notations
doc: show the actual left, right, and boundary marks
doc: revisions - name the left and right sides
doc: use 'symmetric difference' consistently
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The previous commit adjusted the column alignment for revision
examples which show expansion. Fix the unchanged examples and sort
those that show expansions to the end of the list.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The revisions examples show the revison arguments and the selected
commits, but do not show the intermediate step of the expansion of
the special 'range' notations. Extend the examples, including an
all-parents multi-parent merge commit example.
Sort the examples and fix the alignment for those unaffected
in the next commit.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
For the r1..r2 case, the exclusion of r1, rather than inclusion of r2,
would be the unexpected case in natural language for a simple linear
development, i.e. start..end excludes start.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Do not self-define `reachable`, which can lead to misunderstanding.
Instead define `reachability` explictly.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The prior sentence has too many clauses for easy parsing.
Replace 'the latter case' with a direct quote.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
While there, also break out the other shorthand notations and
add a title for the revision range summary (which also appears
in git-rev-parse, so keep it mixed case).
We do not quote the notation within the headings as the asciidoc ->
docbook -> groff man viewer toolchain, particularly the docbook-groff
step, does not cope with two font changes, failing to return the heading
font to bold after the quotation of the notation.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The terms left and right side originate from the symmetric
difference. Name them there.
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
"git nosuchcommand --help" said "No manual entry for gitnosuchcommand",
which was not intuitive, given that "git nosuchcommand" said "git:
'nosuchcommand' is not a git command".
* rt/help-unknown:
help: make option --help open man pages only for Git commands
help: introduce option --exclude-guides
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
If option --help is passed to a Git command, we try to open
the man page of that command. However, we do it for both commands
and concepts. Make sure it is an actual command.
This makes "git <concept> --help" not working anymore, while
"git help <concept>" still works.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Introduce option --exclude-guides to the help command. With this option
being passed, "git help" will open man pages only for actual commands.
Since we know it is a command, we can use function help_unknown_command
to give the user advice on typos.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
* js/no-html-bypass-on-windows:
Revert "display HTML in default browser using Windows' shell API"
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
An incoming "git push" that attempts to push too many bytes can now
be rejected by setting a new configuration variable at the receiving
end.
* cc/receive-pack-limit:
receive-pack: allow a maximum input size to be specified
unpack-objects: add --max-input-size=<size> option
index-pack: add --max-input-size=<size> option
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Receive-pack feeds its input to either index-pack or
unpack-objects, which will happily accept as many bytes as
a sender is willing to provide. Let's allow an arbitrary
cutoff point where we will stop writing bytes to disk.
Cleaning up what has already been written to disk is a
related problem that is not addressed by this patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When receiving a pack-file, it can be useful to abort the
`git unpack-objects`, if the pack-file is too big.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When receiving a pack-file, it can be useful to abort the
`git index-pack`, if the pack-file is too big.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
"git format-patch --cover-letter HEAD^" to format a single patch
with a separate cover letter now numbers the output as [PATCH 0/1]
and [PATCH 1/1] by default.
* jk/format-patch-number-singleton-patch-with-cover:
format-patch: show 0/1 and 1/1 for singleton patch with cover letter
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Change the default behavior of git-format-patch to generate numbered
sequence of 0/1 and 1/1 when generating both a cover-letter and a single
patch. This standardizes the cover letter to have 0/N which helps
distinguish the cover letter from the patch itself. Since the behavior
is easily changed via configuration as well as the use of -n and -N this
should be acceptable default behavior.
Add tests for the new default behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
The delta-base-cache mechanism has been a key to the performance in
a repository with a tightly packed packfile, but it did not scale
well even with a larger value of core.deltaBaseCacheLimit.
* jk/delta-base-cache:
t/perf: add basic perf tests for delta base cache
delta_base_cache: use hashmap.h
delta_base_cache: drop special treatment of blobs
delta_base_cache: use list.h for LRU
release_delta_base_cache: reuse existing detach function
clear_delta_base_cache_entry: use a more descriptive name
cache_or_unpack_entry: drop keep_cache parameter
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This just shows off the improvements done by the last few
patches, and gives us a baseline for noticing regressions in
the future. Here are the results with linux.git as the perf
"large repo":
Test origin HEAD
-------------------------------------------------------------------
0003.1: log --raw 43.41(40.36+2.69) 33.86(30.96+2.41) -22.0%
0003.2: log -S 313.61(309.74+3.78) 298.75(295.58+3.00) -4.7%
(for a large repo, the "log -S" improvements are greater if
you bump the delta base cache limit, but I think it makes
sense to test the "stock" behavior, since that is what most
people will see).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
The fundamental data structure of the delta base cache is a
hash table mapping pairs of "(packfile, offset)" into
structs containing the actual object data. The hash table
implementation dates back to e5e0161 (Implement a simple
delta_base cache, 2007-03-17), and uses a fixed-size table.
The current size is a hard-coded 256 entries.
Because we need to be able to remove objects from the hash
table, entry lookup does not do any kind of probing to
handle collisions. Colliding items simply replace whatever
is in their slot. As a result, we have fewer usable slots
than even the 256 we allocate. At half full, each new item
has a 50% chance of displacing another one. Or another way
to think about it: every item has a 1/256 chance of being
ejected due to hash collision, without regard to our LRU
strategy.
So it would be interesting to see the effect of increasing
the cache size on the runtime for some common operations. As
with the previous patch, we'll measure "git log --raw" for
tree-only operations, and "git log -Sfoo --raw" for
operations that touch trees and blobs. All times are
wall-clock best-of-3, done against fully packed repos with
--depth=50, and the default core.deltaBaseCacheLimit of
96MB.
Here are timings for various values of MAX_DELTA_CACHE
against git.git (the asterisk marks the minimum time for
each operation):
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m02.227s 0m12.821s
512 0m02.143s 0m10.602s
1024 0m02.127s 0m08.642s
2048 0m02.148s 0m07.123s
4096 0m02.194s 0m06.448s*
8192 0m02.239s 0m06.504s
16384 0m02.144s* 0m06.502s
32768 0m02.202s 0m06.622s
65536 0m02.230s 0m06.677s
The log-raw case isn't changed much at all here (probably
because our trees just aren't that big in the first place,
or possibly because we have so _few_ trees in git.git that
the 256-entry cache is enough). But once we start putting
blobs in the cache, too, we see a big improvement (almost
50%). The curve levels off around 4096, which means that we
can hold about that many entries before hitting the 96MB
memory limit (or possibly that the workload is small enough
that there is simply no more work to be optimized out by
caching more).
(As a side note, I initially timed my existing git.git pack,
which was a base of --aggressive combined with some pulls on
top. So it had quite a few deeper delta chains. The
256-cache case was more like 15s, and it still dropped to
~6.5s in the same way).
Here are the timings for linux.git:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m41.661s 5m12.410s
512 0m39.547s 5m07.920s
1024 0m37.054s 4m54.666s
2048 0m35.871s 4m41.194s*
4096 0m34.646s 4m51.648s
8192 0m33.881s 4m55.342s
16384 0m35.190s 5m00.122s
32768 0m35.060s 4m58.851s
65536 0m33.311s* 4m51.420s
As we grow we see a nice 20% speedup in the tree traversal,
and more modest 10% in the log-S. This is probably an
indication that we are bound less by the number of entries,
and more by the memory limit (more on that below). What is
interesting is that the numbers bounce around a bit;
increasing the number of entries isn't always a strict
improvement.
Partially this is due to noise in the measurement. But it
may also be an indication that our LRU ejection scheme is
not optimal. The smaller cache sizes introduce some
randomness into the ejection (due to collisions), which may
sometimes work in our favor (and sometimes not!).
So what is the optimal setting of MAX_DELTA_CACHE? The
"bouncing" in the linux.git log-S numbers notwithstanding,
it mostly seems like bigger is better. And even if we were
to try to find a "sweet spot", these are just two
repositories, that are not necessarily representative. The
shape of history, the size of trees and blobs, the memory
limit configuration, etc, all will affect the outcome.
Rather than trying to find the "right" number, another
strategy is to just switch to a hash table that can actually
store collisions: namely our hashmap.h implementation.
Here are numbers for that compared to the "best" we saw from
adjusting MAX_DELTA_CACHE:
| log-raw | log-S
| best hashmap | best hashmap
| --------- --------- | --------- ---------
git | 0m02.144s 0m02.144s | 0m06.448s 0m06.688s
linux | 0m33.311s 0m33.092s | 4m41.194s 4m57.172s
We can see the results are similar in most cases, which is
what we'd expect. We're not ejecting due to collisions at
all, so this is purely representing the LRU. So really, we'd
expect this to model most closely the larger values of the
static MAX_DELTA_CACHE limit. And that does seem to be
what's happening, including the "bounce" in the linux log-S
case.
So while the value for that case _isn't_ as good as the
optimal one measured above (which was 2048 entries), given
the bouncing I'm hesitant to suggest that 2048 is any kind
of optimum (not even for linux.git, let alone as a general
rule). The generic hashmap has the appeal that it drops the
number of tweakable numbers by one, which means we can focus
on tuning other elements, like the LRU strategy or the
core.deltaBaseCacheLimit setting.
And indeed, if we bump the cache limit to 1G (which is
probably silly for general use, but maybe something people
with big workstations would want to do), the linux.git log-S
time drops to 3m32s. That's something you really _can't_ do
easily with the static hash table, because the number of
entries needs to grow in proportion to the memory limit (so
2048 is almost certainly not going to be the right value
there).
This patch takes that direction, and drops the static hash
table entirely in favor of using the hashmap.h API.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
When the delta base cache runs out of allowed memory, it has
to drop entries. It does so by walking an LRU list, dropping
objects until we are under the memory limit. But we actually
walk the list twice: once to drop blobs, and then again to
drop other objects (which are generally trees). This comes
from 18bdec1 (Limit the size of the new delta_base_cache,
2007-03-19).
This performs poorly as the number of entries grows, because
any time dropping blobs does not satisfy the limit, we have
to walk the _entire_ list, trees included, looking for blobs
to drop, before starting to drop any trees.
It's not generally a problem now, as the cache is limited to
only 256 entries. But as we could benefit from increasing
that in a future patch, it's worth looking at how it
performs as the cache size grows. And the answer is "not
well".
The table below shows times for various operations with
different values of MAX_DELTA_CACHE (which is not a run-time
knob; I recompiled with -DMAX_DELTA_CACHE=$n for each).
I chose "git log --raw" ("log-raw" in the table) because it
will access all of the trees, but no blobs at all (so in a
sense it is a worst case for this problem, because we will
always walk over the entire list of trees once before
realizing there are no blobs to drop). This is also
representative of other tree-only operations like "rev-list
--objects" and "git log -- <path>".
I also timed "git log -Sfoo --raw" ("log-S" in the table).
It similarly accesses all of the trees, but also the blobs
for each commit. It's representative of "git log -p", though
it emphasizes the cost of blob access more, as "-S" is
cheaper than computing an actual blob diff.
All timings are best-of-3 wall-clock times (though they all
were CPU bound, so the user CPU times are similar). The
repositories were fully packed with --depth=50, and the
default core.deltaBaseCacheLimit of 96M was in effect. The
current value of MAX_DELTA_CACHE is 256, so I started there
and worked up by factors of 2.
First, here are values for git.git (the asterisk signals the
fastest run for each operation):
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m02.212s 0m12.634s
512 0m02.136s* 0m10.614s
1024 0m02.156s 0m08.614s
2048 0m02.208s 0m07.062s
4096 0m02.190s 0m06.484s*
8192 0m02.176s 0m07.635s
16384 0m02.913s 0m19.845s
32768 0m03.617s 1m05.507s
65536 0m04.031s 1m18.488s
You can see that for the tree-only log-raw case, we don't
actually benefit that much as the cache grows (all the
differences up through 8192 are basically just noise; this
is probably because we don't actually have that many
distinct trees in git.git). But for log-S, we get a definite
speed improvement as the cache grows, but the improvements
are lost as cache size grows and the linear LRU management
starts to dominate.
Here's the same thing run against linux.git:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ----------
256 0m40.987s 5m13.216s
512 0m37.949s 5m03.243s
1024 0m35.977s 4m50.580s
2048 0m33.855s 4m39.818s
4096 0m32.913s 4m47.299s*
8192 0m32.176s* 5m14.650s
16384 0m32.185s 6m31.625s
32768 0m38.056s 9m31.136s
65536 1m30.518s 17m38.549s
The pattern is similar, though the effect in log-raw is more
pronounced here. The times dip down in the middle, and then
go back up as we keep growing.
So we know there's a problem. What's the solution?
The obvious one is to improve the data structure to avoid
walking over tree entries during the looking-for-blobs
traversal. We can do this by keeping _two_ LRU lists: one
for blobs, and one for other objects. We drop items from the
blob LRU first, and then from the tree LRU (if necessary).
Here's git.git using that strategy:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ----------
256 0m02.264s 0m12.830s
512 0m02.201s 0m10.771s
1024 0m02.181s 0m08.593s
2048 0m02.205s 0m07.116s
4096 0m02.158s 0m06.537s*
8192 0m02.213s 0m07.246s
16384 0m02.155s* 0m10.975s
32768 0m02.159s 0m16.047s
65536 0m02.181s 0m16.992s
The upswing on log-raw is gone completely. But log-S still
has it (albeit much better than without this strategy).
Let's see what linux.git shows:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m42.519s 5m14.654s
512 0m39.106s 5m04.708s
1024 0m36.802s 4m51.454s
2048 0m34.685s 4m39.378s*
4096 0m33.663s 4m44.047s
8192 0m33.157s 4m50.644s
16384 0m33.090s* 4m49.648s
32768 0m33.458s 4m53.371s
65536 0m33.563s 5m04.580s
The results are similar. The tree-only case again performs
well (not surprising; we're literally just dropping the one
useless walk, and not otherwise changing the cache eviction
strategy at all). But the log-S case again does a bit worse
as the cache grows (though possibly that's within the noise,
which is much larger for this case).
Perhaps this is an indication that the "remove blobs first"
strategy is not actually optimal. The intent of it is to
avoid blowing out the tree cache when we see large blobs,
but it also means we'll throw away useful, recent blobs in
favor of older trees.
Let's run the same numbers without caring about object type
at all (i.e., one LRU list, and always evicting whatever is
at the head, regardless of type).
Here's git.git:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m02.227s 0m12.821s
512 0m02.143s 0m10.602s
1024 0m02.127s 0m08.642s
2048 0m02.148s 0m07.123s
4096 0m02.194s 0m06.448s*
8192 0m02.239s 0m06.504s
16384 0m02.144s* 0m06.502s
32768 0m02.202s 0m06.622s
65536 0m02.230s 0m06.677s
Much smoother; there's no dramatic upswing as we increase
the cache size (some remains, though it's small enough that
it's mostly run-to-run noise. E.g., in the log-raw case,
note how 8192 is 50-100ms higher than its neighbors). Note
also that we stop getting any real benefit for log-S after
about 4096 entries; that number will depend on the size of
the repository, the size of the blob entries, and the memory
limit of the cache.
Let's see what linux.git shows for the same strategy:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m41.661s 5m12.410s
512 0m39.547s 5m07.920s
1024 0m37.054s 4m54.666s
2048 0m35.871s 4m41.194s*
4096 0m34.646s 4m51.648s
8192 0m33.881s 4m55.342s
16384 0m35.190s 5m00.122s
32768 0m35.060s 4m58.851s
65536 0m33.311s* 4m51.420s
It's similarly good. As with the "separate blob LRU"
strategy, there's a lot of noise on the log-S run here. But
it's certainly not any worse, is possibly a bit better, and
the improvement over "separate blob LRU" on the git.git case
is dramatic.
So it seems like a clear winner, and that's what this patch
implements.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
We keep an LRU list of entries for when we need to drop
something from an over-full cache. The list is implemented
as a circular doubly-linked list, which is exactly what
list.h provides. We can save a few lines by using the list.h
macros and functions. More importantly, this makes the code
easier to follow, as the reader sees explicit concepts like
"list_add_tail()" instead of pointer manipulation.
As a bonus, the list_entry() macro lets us place the lru
pointers anywhere inside the delta_base_cache_entry struct
(as opposed to just casting the pointer, which requires it
at the front of the struct). This will be useful in later
patches when we need to place other items at the front of
the struct (e.g., our hashmap implementation requires this).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This function drops an entry entirely from the cache,
meaning that aside from the freeing of the buffer, it is
exactly equivalent to detach_delta_base_cache_entry(). Let's
build on top of the detach function, which shortens the code
and will make it simpler when we change out the underlying
storage in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
The delta base cache entries are stored in a fixed-length
hash table. So the way to remove an entry is to "clear" the
slot in the table, and that is what this function does.
However, the name is a leaky abstraction. If we were to
change the hash table implementation, it would no longer be
about "clearing". We should name it after _what_ it does,
not _how_ it does it. I.e., something like "remove" instead
of "clear".
But that does not tell the whole story, either. The subtle
thing about this function is that it removes the entry, but
does not free the entry data. So a more descriptive name is
"detach"; we give ownership of the data buffer to the
caller, and remove any other resources.
This patch uses the name detach_delta_base_cache_entry().
We could further model this after functions like
strbuf_detach(), which pass back all of the detached
information. However, since there are so many bits of
information in the struct (the data, the size, the type),
and so few callers (only one), it's not worth that
awkwardness. The name change and a comment can make the
intent clear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|