| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When updating the file format version for something as critical as ref
storage, the file format version must come with an extension change. The
extensions.refFormat config value is a multi-valued config value that
defaults to the pair "files" and "packed".
Add "packed-v2" as a possible value to extensions.refFormat. This
value specifies that the packed-refs file may exist in the version 2
format. (If the "packed" value does not exist, then the packed-refs file
must exist in version 2, not version 1.)
In order to select version 2 for writing, the user will have two
options. First, the user could remove "packed" and add "packed-v2" to
the extensions.refFormat list. This would imply that version 2 is the
only format available. However, this also means that version 1 files
would be ignored at read time, so this does not allow users to upgrade
repositories with existing packed-refs files.
Add a new refs.packedRefsVersion config option which allows specifying
which version to use during writes. Thus, when both "packed" and
"packed-v2" are in the extensions.refFormat list, the user can upgrade
from version 1 to version 2, or downgrade from 2 to 1.
Currently, the implementation does not use refs.packedRefsVersion, as
that is delayed until we have the code to write that file format
version. However, we can add the necessary enum values and flag
constants to communicate the presence of "packed-v2" in the
extensions.refFormat list.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The packed-refs file is a plaintext file format that starts with a
header line, then each ref is given as one or two lines (two if there is
a peeled value). These lines are written as part of a sequence of
updates which are merged with the existing ref iterator in
merge_iterator_and_updates(). That method is currently tied directly to
write_packed_entry_v1().
When creating a new version of the packed-file format, it would be
valuable to use this merging logic in an identical way. Create a new
function pointer type, write_ref_fn, and use that type in
merge_iterator_and_updates().
Notably, the function pointer type no longer depends on a FILE pointer,
but instead takes an arbitrary "void *write_data" parameter. This
flexibility will be critical in the future, since the planned v2 format
will use the chunk-format API and need a more complicated structure than
the output FILE.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
TBD
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The write_with_updates() method uses a write_error label to jump to code
that adds an error message before exiting with an error. This appears
both when the packed-refs file header is written, but also when a ref
line is written to the packed-refs file.
A future change will abstract the loop that writes the refs out of
write_with_updates(), making the goto an inconvenient pattern. For now,
remove the distinction between "goto write_error" and "goto error" by
adding the message in-line using the new static method
add_write_error(). This is functionally equivalent, but will make the
next step easier.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
In preparation for adding a new packed-refs file format, extract all
code from refs/packed-backend.c that involves knowledge of the plaintext
file format. This includes any parsing logic that cares about the
header, plaintext lines of the form "<oid> <ref>" or "^<peeled>", and
the error messages when there is an issue in the file. This also
includes the writing logic that writes the header or the individual
references.
Future changes will perform more refactoring to abstract away more of
the writing process to be more generic, but this is enough of a chunk of
code movement.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The new read_trailing_table_of_contents() mimics
read_table_of_contents() except that it reads the table of contents in
reverse from the end of the given hashfile. The file is given as a
memory-mapped section of memory and a size. Automatically calculate the
start of the trailing hash and read the table of contents in revers from
that position.
The errors come along from those in read_table_of_contents(). The one
exception is that the chunk_offset cannot be checked as going into the
table of contents since we do not have that length automatically. That
may have some surprising results for some narrow forms of corruption.
However, we do still limit the size to the size of the file plus the
part of the table of contents read so far. At minimum, the given sizes
can be used to limit parsing within the file itself.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The existing chunk formats use the table of contents at the beginning of
the file. This is intended as a way to speed up the initial loading of
the file, but comes at a cost during writes. Each example needs to fully
compute how big each chunk will be in advance, which usually requires
storing the full file contents in memory.
Future file formats may want to use the chunk format API in cases where
the writing stage is critical to performance, so we may want to stream
updates from an existing file and then only write the table of contents
at the end.
Add a new 'flags' parameter to write_chunkfile() that allows this
behavior. When this is specified, the defensive programming that checks
that the chunks are written with the precomputed sizes is disabled.
Then, the table of contents is written in reverse order at the end of
the hashfile, so a parser can read the chunk list starting from the end
of the file (minus the hash).
The parsing of these table of contents will come in a later change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
As a preparatory step to allowing trailing table of contents, store the
offsets of each chunk as we write them. This replaces an existing use of
a local variable, but the stored value will be used in the next change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
It will be helpful to allow a trailing table of contents when writing
some file types with the chunk-format API. The main reason is that it
allows dynamically computing the chunk sizes while writing the file.
This can use fewer resources than precomputing all chunk sizes in
advance.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Even though the commit-graph and multi-pack-index file formats specify a
number of chunks in their header information, this is optional. The
table of contents terminates with a null chunk ID, which can be used
instead. The extra value is helpful for some checks, but is ultimately
not necessary for the format.
This will be important in some future formats.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The extensions.refFormat extension is a multi-valued config that
specifies which ref formats are available to the current repository. By
default, Git assumes the list of "files" and "packed", unless there is
at least one of these extensions specified.
With the current values, it is possible for a user to specify only
"files" or only "packed". The only-"packed" option was already ruled as
invalid since Git's current code has too many places that require a
loose reference. This could change in the future.
However, we can now allow the user to specify extensions.refFormat=files
alone, making it impossible to create a packed-refs file (or to read one
that might exist).
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The previous change introduced the extensions.refFormat config option.
It is a multi-valued config option that currently understands "files"
and "packed", with both values assumed by default. If any value is
provided explicitly, this default is ignored and the provided settings
are used instead.
The multi-valued nature of this extension presents a way to allow a user
to specify that they never want a packed-refs file (only use "files") or
that they never want loose reference files (only use "packed"). However,
that functionality is not currently connected.
Before actually modifying the files backend to understand these
extension settings, do the basic wiring that connects the
extensions.refFormat parsing to the creation of the ref backend. A
future change will actually change the ref backend initialization based
on these settings, but this communication of the extension is
sufficiently complicated to be worth an isolated change.
For now, also forbid the setting of only "packed". This is done by
redirecting the choice of backend to the packed backend when that
selection is made. A later change will make the "files"-only extension
value ignore the packed backend.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The documentation for 'extensions.worktreeConfig' includes a bulletted
list describing certain config values that need to be moved into the
worktree config instead of the repository config file. However, since we
are already in a bulletted list, the documentation tools do not know
when that inner list is complete. Paragraphs intended to not be part of
that inner list are rendered as part of the last bullet.
Modify the format to match a similar doubly-nested list from the
'column.ui' config documentation. Reword the descriptions slightly to
make the config keys appear as their own heading in the inner list.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Git's reference storage is critical to its function. Creating new
storage formats for references requires adding an extension. This
prevents third-party tools that do not understand that format from
operating incorrectly on the repository. This makes updating ref formats
more difficult than other optional indexes, such as the commit-graph or
multi-pack-index.
However, there are a number of potential ref storage enhancements that
are underway or could be created. Git needs an established mechanism for
coordinating between these different options.
The first obvious format update is the reftable format as documented in
Documentation/technical/reftable.txt. This format has much of its
implementation already in Git, but its connection as a ref backend is
not complete. This change is similar to some changes within one of the
patches intended for the reftable effort [1].
[1] https://lore.kernel.org/git/pull.1215.git.git.1644351400761.gitgitgadget@gmail.com/
However, this change makes a distinct strategy change from the one
recommended by reftable. Here, the extensions.refFormat extension is
provided as a multi-valued list. In the reftable RFC, the extension has
a single value, "files" or "reftable" and explicitly states that this
should not change after 'git init' or 'git clone'.
The single-valued approach has some major drawbacks, including the idea
that the "files" backend cannot coexist with the "reftable" backend at
the same time. In this way, it would not be possible to create a
repository that can write loose references and combine them into a
reftable in the background. With the multi-valued approach, we could
integrate reftable as a drop-in replacement for the packed-refs file and
allow that to be a faster way to do the integration since the test suite
would only need updates when the test is explicitly testing packed-refs.
When upgrading a repository from the "files" backend to the "reftable"
backend, it can help to have a transition period where both are present,
then finally removing the "files" backend after all loose refs are
collected into the reftable.
But the reftable is not the only approach available.
One obvious improvement could be a new file format version for the
packed-refs file. Its current plaintext-based format is inefficient due
to storing object IDs as hexadecimal representations instead of in
their raw format. This extra cost will get worse with SHA-256. In
addition, binary searches need to guess a position and scan to find
newlines for a refname entry. A structured binary format could allow for
more compact representation and faster access. Adding such a format
could be seen as "files-v2", but it is really "packed-v2".
The reftable approach has a concept of a "stack" of reftable files. This
idea would also work for a stack of packed-refs files (in v1 or v2
format). It would be helpful to describe that the refs could be stored
in a stack of packed-ref files independently of whether that is in file
format v1 or v2.
Even in these two options, it might be helpful to indicate whether or
not loose ref files are present. That is one reason to not make them
appear as "files-v2" or "files-v3" options in a single-valued extension.
Even as "packed-v2" or "packed-v3" options, this approach would require
third-party tools to understand the "v2" version if they want to support
the "v3" options. Instead, by splitting the format from the layout, we
can allow third-party tools to integrate only with the most-desired
format options.
For these reasons, this change is defining the extensions.refFormat
extension as well as how the two existing values interact. By default,
Git will assume "files" and "packed" in the list. If any other value
is provided, then the extension is marked as unrecognized.
Add tests that check the behavior of extensions.refFormat, both in that
it requires core.repositoryFormatVersion=1, and Git will refuse to work
with an unknown value of the extension.
There is a gap in the current implementation, though. What happens if
exactly one of "files" or "packed" is provided? The presence of only one
would imply that the other is not available. A later change can
communicate the list contents to the repository struct and then the
reference backend could ignore one of these two layers.
Specifically, having only "files" would mean that Git should not read or
write the packed-refs file and instead only read and write loose ref
files. By contrast, having only "packed" would mean that Git should not
read or write loose ref files and instead always update the packed-refs
file on every ref update.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.
One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.
Following a similar approach to one used in the microsoft/git fork [1],
add a new config option that allows disabling this hashing during the
index write. The cost is that we can no longer validate the contents for
corruption-at-rest using the trailing hash.
[1] https://github.com/microsoft/git/commit/21fed2d91410f45d85279467f21d717a2db45201
While older Git versions will not recognize the null hash as a special
case, the file format itself is still being met in terms of its
structure. Using this null hash will still allow Git operations to
function across older versions.
The one exception is 'git fsck' which checks the hash of the index file.
Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.
As a quick comparison, I tested 'git update-index --force-write' with
and without index.computHash=false on a copy of the Linux kernel
repository.
Benchmark 1: with hash
Time (mean ± σ): 46.3 ms ± 13.8 ms [User: 34.3 ms, System: 11.9 ms]
Range (min … max): 34.3 ms … 79.1 ms 82 runs
Benchmark 2: without hash
Time (mean ± σ): 26.0 ms ± 7.9 ms [User: 11.8 ms, System: 14.2 ms]
Range (min … max): 16.3 ms … 42.0 ms 69 runs
Summary
'without hash' ran
1.78 ± 0.76 times faster than 'with hash'
These performance benefits are substantial enough to allow users the
ability to opt-in to this feature, even with the potential confusion
with older 'git fsck' versions.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| |/ / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The hashfile API is useful for generating files that include a trailing
hash of the file's contents up to that point. Using such a hash is
helpful for verifying the file for corruption-at-rest, such as a faulty
drive causing flipped bits.
Since the commit-graph and multi-pack-index files both use this trailing
hash, the chunk-format API uses a 'struct hashfile' to handle the I/O to
the file. This was very convenient to allow using the hashfile methods
during these operations.
However, hashing the file contents during write comes at a performance
penalty. It's slower to hash the bytes on their way to the disk than
without that step. If we wish to use the chunk-format API to upgrade
other file types, then this hashing is a performance penalty that might
not be worth the benefit of a trailing hash.
For example, if we create a chunk-format version of the packed-refs
file, then the file format could shrink by using raw object IDs instead
of hexadecimal representations in ASCII. That reduction in size is not
enough to counteract the performance penalty of hashing the file
contents. In cases such as deleting a reference that appears in the
packed-refs file, that write-time performance is critical. This is in
contrast to the commit-graph and multi-pack-index files which are mainly
updated in non-critical paths such as background maintenance.
One way to allow future chunked formats to not suffer this penalty would
be to create an abstraction layer around the 'struct hashfile' using a
vtable of function pointers. This would allow placing a different
representation in place of the hashfile. This option would be cumbersome
for a few reasons. First, the hashfile's buffered writes are already
highly optimized and would need to be duplicated in another code path.
The second is that the chunk-format API calls the chunk_write_fn
pointers using a hashfile. If we change that to an abstraction layer,
then those that _do_ use the hashfile API would need to change all of
their instances of hashwrite(), hashwrite_be32(), and others to use the
new abstraction layer.
Instead, this change opts for a simpler change. Introduce a new
'skip_hash' option to 'struct hashfile'. When set, the update_fn and
final_fn members of the_hash_algo are skipped. When finalizing the
hashfile, the trailing hash is replaced with the null hash.
This use of a trailing null hash would be desireable in either case,
since we do not want to special case a file format to have a different
length depending on whether it was hashed or not. When the final bytes
of a file are all zero, we can infer that it was written without
hashing, and thus that verification is not available as a check for file
consistency. This also means that we could easily toggle hashing for any
file format we desire. For the commit-graph and multi-pack-index file,
it may be possible to allow the null hash without incrementing the file
format version, since it technically fits the structure of the file
format. The only issue is that older versions would trigger a failure
during 'git fsck'. For these file formats, we may want to delay such a
change until it is justified.
However, the index file is written in critical paths. It is also
frequently updated, so corruption at rest is less likely to be an issue
than in those other file formats. This could be a good candidate to
create an option that skips the hashing operation.
A version of this patch has existed in the microsoft/git fork since
2017 [1] (the linked commit was rebased in 2018, but the original dates
back to January 2017). Here, the change to make the index use this fast
path is delayed until a later change.
[1] https://github.com/microsoft/git/commit/21fed2d91410f45d85279467f21d717a2db45201
Co-authored-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
'git worktree add' learned how to create a worktree based on an
orphaned branch with `--orphan`.
* ja/worktree-orphan:
worktree add: add --orphan flag
worktree add: Include -B in usage docs
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Adds support for creating an orphan branch when adding a new worktree.
This functionality is equivalent to git switch's --orphan flag.
The original reason this feature was implemented was to allow a user
to initialise a new repository using solely the worktree oriented
workflow. Example usage included below.
$ GIT_DIR=".git" git init --bare
$ git worktree add --orphan master master/
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| |/ / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
While -B behavior is already documented, it was not included in the
usage docs for either the man page or the help text. This change fixes
that and brings the usage docs in line with how the flags are documented
in other commands such as git checkout.
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Introduce a case insensitive mode to the Bash completion helpers.
* aw/complete-case-insensitive:
completion: add case-insensitive match of pseudorefs
completion: add optional ignore-case when matching refs
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When GIT_COMPLETION_IGNORE_CASE=1, also allow lowercase completion text
like "head" to match HEAD and other pseudorefs.
Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| |/ / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If GIT_COMPLETION_IGNORE_CASE=1 is set, --ignore-case will be added to
git for-each-ref calls so that branches and tags can be matched case
insensitively.
Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The advice message emitted by a slow "status" run is amended to
mention fsmonitor.
* rr/long-status-advice:
status: long status advice adapted to recent capabilities
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Improve the advice displayed when `git status` is slow because
of excessive numbers of untracked files. Update the `git status`
man page to explain the various configuration options.
`git status` can be slow when there are a large number of untracked
files and directories, because Git must search the entire worktree
to enumerate them. Previously, Git would print an advice message
with the elapsed search time and a suggestion to disable the search
using the `-uno` option. This suggestion also carried a warning
that might scare off some users.
Git can reduce the size and time of the untracked file search when
the `core.untrackedCache` and `core.fsmonitor` features are enabled
by caching results from previous `git status` invocations.
Update the advice to explain the various combinations of additional
configuration options and refer to (new) documentation in the man
page that explains it in more detail than what can be printed in an
advice message.
Finally, add new tests to verify the new functionality.
Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Fix an issue where core.fsmonitor on macOS would not notice created
or modified symbolic links.
* sz/macos-fsmonitor-symlinks:
fsmonitor--daemon: on macOS support symlink
|
| | |/ / / / /
| |/| | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Resolves a problem where symbolic links were not showing up in diff when
created or modified.
kFSEventStreamEventFlagItemIsSymlink is also treated as a file update.
This is because kFSEventStreamEventFlagItemIsFile is not included in
FSEvents when creating or deleting symbolic links. For example:
$ ln -snf t test
fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
$ ln -snf ci test
fsevent: '/path/to/dir/test', flags=0x40200 ItemIsSymlink|ItemRemoved|
fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
Signed-off-by: srz_zumix <zumix.cpp@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Free structures related to delta islands after use.
* ew/delta-islands-free:
delta-islands: free island-related data after use
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
On my use case involving 771 islands of Linux on kernel.org,
this reduces memory usage by around 25MB. The bulk of that
comes from free_remote_islands, since free_config_regexes only
saves around 40k.
This memory is saved early in the memory-intensive pack process,
making it available for the remainder of the long process.
Signed-off-by: Eric Wong <e@80x24.org>
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
`parse_object()` hardening when checking for the existence of a
suspected blob object.
* jk/parse-object-type-mismatch:
parse_object(): check on-disk type of suspected blob
parse_object(): drop extra "has" check before checking object type
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
In parse_object(), we try to handle blobs by streaming rather than
loading them entirely into memory. The most common case here will be
that we haven't seen the object yet and check oid_object_info(), which
tells us we have a blob.
But we trigger this code on one other case: when we have an in-memory
object struct with type OBJ_BLOB (and without its "parsed" flag set,
since otherwise we'd return early from the function). This indicates
that some other part of the code suspected we have a blob (e.g., it was
mentioned by a tree or tag) but we haven't yet looked at the on-disk
copy.
In this case before hitting the streaming path, we check if we have the
object on-disk at all. This is mostly pointless extra work, as the
streaming path would complain if it couldn't open the object (albeit
with the message "hash mismatch", which is a little misleading).
But it's also insufficient to catch all problems. The streaming code
will only tell us "yes, the on-disk object matches the oid". But it
doesn't actually confirm that what we found was indeed a blob, and
neither does repo_has_object_file().
One way to improve this would be to teach stream_object_signature() to
check the type (either by returning it to us to check, or taking an
"expected" type). But there's an even simpler fix here: if we suspect
the object is a blob, just call oid_object_info() to confirm that we
have it on-disk, and that it really is a blob.
This is slightly less efficient than teaching stream_object_signature()
to do it (since it has to open the object already). But this case very
rarely comes up. In practice, we usually don't have any clue what the
type is, in which case we already call oid_object_info(). This
"suspected" case happens only when some other code created an object
struct but didn't actually parse the blob, which is actually tricky to
trigger at all (see the discussion of the test below).
I reworked the conditional a bit so that instead of:
if ((suspected_blob && oid_object_info() == OBJ_BLOB)
(no_clue && oid_object_info() == OBJ_BLOB)
we have the simpler:
if ((suspected_blob || no_clue) && oid_object_info() == OBJ_BLOB)
This is shorter, but also reflects what we really want say, which is
"have we ruled out this being a blob; if not, check it on-disk".
In either case, if oid_object_info() fails to tell us it's a blob, we'll
skip the streaming code path and call repo_read_object_file(), just as
before. And if we really do have a mismatch with the existing object
struct, we'll eventually call lookup_commit(), etc, via
parse_object_buffer(), which will complain that it doesn't match our
existing obj->type.
So this fixes one of the lingering expect_failure cases from 0616617c7e
(t: introduce tests for unexpected object types, 2019-04-09). That test
works by peeling a tag that claims to point to a blob (triggering us to
create the struct), but really points to something else, which we later
discover when we call parse_object() as part of the actual traversal).
Prior to this commit, we'd quietly check the sha1 and mark the blob as
"parsed". Now we correctly complain about the mismatch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| |/ / / / / / /
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
When parsing an object of unknown type, we check to see if it's a blob,
so we can use our streaming code path. This uses oid_object_info() to
check the type, but before doing so we call repo_has_object_file(). This
latter is pointless, as oid_object_info() will already fail if the
object is missing. Checking it ahead of time just complicates the code
and is a waste of resources (albeit small).
Let's drop the redundant check.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Doc update.
* mh/gitcredentials-generate:
Docs: describe how a credential-generating helper works
|
| | |_|_|_|_|_|/
| |/| | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Previously the docs only described storage helpers.
A concrete example: Git Credential Manager can generate credentials
for GitHub and GitLab via OAuth.
https://github.com/GitCredentialManager/git-credential-manager
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Avoid a stray empty newline in the template when creating new notes.
* mg/notes-newline:
notes: avoid empty line in template
|
| | |/ / / / / /
| |/| | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
When `git notes` prepares the template it adds an empty newline between
the comment header and the content:
>
> #
> # Write/edit the notes for the following object:
>
> # commit 0f3c55d4c2b7864bffb2d92278eff08d0b2e083f
> # etc
This is wrong structurally because that newline is part of the comment,
too, and thus should be commented. Also, it throws off some positioning
strategies of editors and plugins, and it differs from how we do commit
templates.
Change this to follow the standard set by `git commit`:
>
> #
> # Write/edit the notes for the following object:
> #
> # commit 0f3c55d4c2b7864bffb2d92278eff08d0b2e083f
>
Tests pass unchanged after this code change.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Push all submodules recursively with
'--recurse-submodules=on-demand'.
* jt/submodule-on-demand:
Doc: document push.recurseSubmodules=only
|
| | |_|/ / / / /
| |/| | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Git learned pushing submodules without pushing the superproject by
the user specifying --recurse-submodules=only through 6c656c3fe4
("submodules: add RECURSE_SUBMODULES_ONLY value", 2016-12-20) and
225e8bf778 ("push: add option to push only submodules", 2016-12-20).
For users who use this feature regularly, it is desirable to have an
equivalent configuration.
It turns out that such a configuration (push.recurseSubmodules=only) is
already supported, even though it is neither documented nor mentioned
in the commit messages, due to the way the --recurse-submodules=only
feature was implemented (a function used to parse --recurse-submodules
was updated to support "only", but that same function is used to parse
push.recurseSubmodules too). What is left is to document it and test it,
which is what this commit does.
There is a possible point of confusion when recursing into a submodule
that itself has the push.recurseSubmodules=only configuration, because
if a repository has only its submodules pushed and not itself, its
superproject can never be pushed. Therefore, treat such configurations
as being "on-demand", and print a warning message.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Avoid setting GIT_REFLOG_ACTION to improve readability of the
sequencer internals.
* pw/rebase-no-reflog-action:
rebase: stop exporting GIT_REFLOG_ACTION
sequencer: stop exporting GIT_REFLOG_ACTION
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Now that struct replay_opts has a reflog_action member we no longer
need to export GIT_REFLOG_ACTION when starting a rebase. If the user
has set GIT_REFLOG_ACTION then we use it when initializing
reflog_action.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| |/ / / / / / /
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Each time it picks a commit the sequencer copies the GIT_REFLOG_ACITON
environment variable so it can temporarily change it and then restore
the previous value. This results in code that is hard to follow and also
leaks memory because (i) we fail to free the copy when we've finished
with it and (ii) each call to setenv() leaks the previous value. Instead
pass the reflog action around in a variable and use it to set
GIT_REFLOG_ACTION in the child environment when running "git commit".
Within the sequencer GIT_REFLOG_ACTION is no longer set and is only read
by sequencer_reflog_action(). It is still set by rebase before calling
the sequencer, that will be addressed in the next commit. cherry-pick
and revert are unaffected as they do not set GIT_REFLOG_ACTION before
calling the sequencer.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
"git receive-pack" used to use all the local refs as the boundary for
checking connectivity of the data "git push" sent, but now it uses
only the refs that it advertised to the pusher. In a repository with
the .hideRefs configuration, this reduces the resources needed to
perform the check.
* ps/receive-use-only-advertised:
receive-pack: only use visible refs for connectivity check
rev-parse: add `--exclude-hidden=` option
revision: add new parameter to exclude hidden refs
revision: introduce struct to handle exclusions
revision: move together exclusion-related functions
refs: get rid of global list of hidden refs
refs: fix memory leak when parsing hideRefs config
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.
Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.
We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.
This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 refs
as it is configured in Gitaly leads to a 4.5-fold speedup:
Benchmark 1: main
Time (mean ± σ): 30.977 s ± 0.157 s [User: 30.226 s, System: 1.083 s]
Range (min … max): 30.796 s … 31.071 s 3 runs
Benchmark 2: pks-connectivity-check-hide-refs
Time (mean ± σ): 6.799 s ± 0.063 s [User: 6.803 s, System: 0.354 s]
Range (min … max): 6.729 s … 6.850 s 3 runs
Summary
'pks-connectivity-check-hide-refs' ran
4.56 ± 0.05 times faster than 'main'
As we mostly go through the same codepaths even in the case where there
are no hidden refs at all compared to the code before there is no change
in performance when no refs are hidden:
Benchmark 1: main
Time (mean ± σ): 48.188 s ± 0.432 s [User: 49.326 s, System: 5.009 s]
Range (min … max): 47.706 s … 48.539 s 3 runs
Benchmark 2: pks-connectivity-check-hide-refs
Time (mean ± σ): 48.027 s ± 0.500 s [User: 48.934 s, System: 5.025 s]
Range (min … max): 47.504 s … 48.500 s 3 runs
Summary
'pks-connectivity-check-hide-refs' ran
1.00 ± 0.01 times faster than 'main'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
not an easy way to obtain the list of all visible or hidden refs right
now. We'll require just that though for a performance improvement in our
connectivity check.
Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.
Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs and move the `struct string_list` that holds
all wildmatch patterns of excluded refs into it. Rename functions that
operate on this struct to match its name.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Move together the definitions of functions that handle exclusions of
refs so that related functionality sits in a single place, only.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.
Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
When parsing the hideRefs configuration, we first duplicate the config
value so that we can modify it. We then subsequently append it to the
`hide_refs` string list, which is initialized with `strdup_strings`
enabled. As a consequence we again reallocate the string, but never
free the first duplicate and thus have a memory leak.
While we never clean up the static `hide_refs` variable anyway, this is
no excuse to make the leak worse by leaking every value twice. We are
also about to change the way this variable will be handled so that we do
indeed start to clean it up. So let's fix the memory leak by using the
`string_list_append_nodup()` so that we pass ownership of the allocated
string to `hide_refs`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Remove an absolute path in the "Migrating git directory" message.
* ab/submodule-no-abspath:
submodule--helper absorbgitdirs: no abspaths in "Migrating git..."
|
| | |_|_|/ / / / /
| |/| | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Change the "Migrating git directory" messages to avoid emitting
absolute paths. We could use "old_git_dir" and "new_gitdir.buf" here
sometimes, but not in all the cases covered by these tests,
i.e. sometimes the latter will be an absolute path with a different
prefix.
So let's just strip off the common prefix of the two strings, which
handles the cases where we have nested submodules nicely. Note that
this case is different than the one in get_submodule_displaypath() in
"builtin/submodule--helper.c" handles, as we're dealing with the paths
to the two ".git" here, not worktree paths.
Before this change we had no tests at all for this "Migrating git
directory" output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|