| Commit message | Author | Age | Files | Lines |
|
The httpclient implementation keeps a `read_buf` that holds the body data of the response that remains after the headers have been parsed. We store that data for subsequent calls to `git_http_client_read_body`. If we want to stop reading body data and send another request, we need to clear that cached data.
Clear the cached body data on new requests, just as we already read any outstanding data off the socket.
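A minimal sketch of the idea, using hypothetical names (libgit2's real client keeps far more state than this): any body bytes cached from the previous response are dropped before the next request goes out, in the same place where outstanding socket data is drained.
```c
#include <stddef.h>

struct http_client {
	char   read_buf[4096];  /* body bytes already read off the socket */
	size_t read_buf_len;
};

/* hypothetical: drains unread response data from the socket */
extern int drain_socket(struct http_client *client);

static int begin_request(struct http_client *client)
{
	/* forget cached body data from the previous response */
	client->read_buf_len = 0;

	/* read any outstanding data off the socket, as before */
	if (drain_socket(client) < 0)
		return -1;

	/* ... now the new request can be written ... */
	return 0;
}
```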
|
When `git_http_client_read_body` is invoked, it provides the size of the
buffer that can be read into. This will be set as the parser context's
`output_size` member. Use this as an upper limit on our reads, and
ensure that we do not read more than the client requests.
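As a sketch of that clamping (hypothetical names, not the actual parser glue): the number of body bytes handed back never exceeds the caller's buffer size.
```c
#include <stddef.h>
#include <string.h>

/* Copy at most `output_size` bytes of parsed body data into the caller's
 * buffer and report how many bytes were actually consumed. */
static size_t copy_body(char *output_buf, size_t output_size,
                        const char *body, size_t body_len)
{
	size_t to_copy = body_len < output_size ? body_len : output_size;

	memcpy(output_buf, body, to_copy);
	return to_copy;
}
```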
|
When users call `git_http_client_read_body`, it should return 0 at the end of a message. When the `on_message_complete` callback is called, this will set `client->state` to `DONE`. In our read loop, we look for this condition and exit.
Without this, when there is no data left except the end-of-message chunk (`0\r\n`) in the HTTP stream, we would block: we read the three bytes off the stream but make no progress in any `on_body` callbacks.
Listening to the `on_message_complete` callback allows us to stop trying to read from the socket once we've read the end-of-message chunk.
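The shape of that read loop, as a hedged sketch with hypothetical names: once the parser flags the end of the message, we stop pulling bytes off the socket even though the final chunk produced no body data.
```c
#include <stddef.h>

enum client_state { READING_BODY, CLIENT_DONE };

struct client {
	enum client_state state;
};

/* hypothetical: pulls data from the socket, feeds it to the HTTP parser,
 * reports body bytes written via *out_written, and switches the state to
 * CLIENT_DONE from the on_message_complete callback */
extern int feed_parser(struct client *c, char *buf, size_t len, size_t *out_written);

static int read_body(struct client *c, char *buf, size_t len)
{
	size_t written = 0;

	while (written == 0 && c->state != CLIENT_DONE) {
		if (feed_parser(c, buf, len, &written) < 0)
			return -1;
	}

	return (int)written; /* 0 once the end-of-message chunk has been seen */
}
```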
|
git_pool_init: allow the function to fail
|
Propagate failures caused by pool initialization errors.
|
Let `git_pool_init` return an int so that it can fail.
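Call sites then propagate the error in the usual way. A hedged sketch of the pattern follows; `struct pool` and `pool_init` are hypothetical stand-ins, since `git_pool` is an internal libgit2 type.
```c
#include <stddef.h>

/* Hypothetical stand-in for the internal pool type and its now-fallible
 * initializer; the point is only the error-propagation shape at call sites. */
struct pool { size_t item_size; };

extern int pool_init(struct pool *pool, size_t item_size); /* may now fail */

static int use_pool(struct pool *pool)
{
	int error;

	if ((error = pool_init(pool, 1)) < 0)
		return error; /* propagate the initialization failure */

	/* ... allocate from the pool ... */
	return 0;
}
```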
|
Handle unreadable configuration files
|
Modified `config_file_open()` so it returns 0 if the config file is not readable, which happens for global config files under macOS sandboxing (note that, for some reason, `access(F_OK)` DOES work under sandboxing, but it is lying). Without this read check, sandboxed applications on macOS cannot open any repository, because `config_file_read()` will return GIT_ERROR when it cannot read the global /Users/username/.gitconfig file, and the upper layers will simply abort on GIT_ERROR when attempting to load the global config file, so no repositories can be opened.
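One way to implement such a check, sketched with a hypothetical wrapper (libgit2's actual code path differs): an unreadable config file is treated like a missing one instead of a hard error.
```c
#include <unistd.h>

/* Return 0 (treat as "no config") when the file cannot be read, which is
 * what happens for ~/.gitconfig under macOS sandboxing. */
static int open_config_if_readable(const char *path)
{
	if (access(path, R_OK) < 0)
		return 0; /* unreadable: skip instead of reporting an error */

	/* ... open and parse the file here ... */
	return 0;
}
```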
|
According to git's index-format.txt, the path of an entry is prefixed with N (in index version 4), where N indicates the number of bytes to strip from the end of the previous entry's path before appending the entry's remaining path.
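An illustration of that prefix-compression scheme (not libgit2's actual reader): strip N bytes from the end of the previous path, then append the new suffix.
```c
#include <stdio.h>
#include <string.h>

/* Rebuild an entry path from the previous path, the strip count N and the
 * stored suffix. */
static void expand_path(char *out, size_t out_len,
                        const char *prev_path, size_t strip, const char *suffix)
{
	size_t keep = strlen(prev_path) - strip;

	snprintf(out, out_len, "%.*s%s", (int)keep, prev_path, suffix);
}

int main(void)
{
	char path[64];

	/* previous entry "src/common.h", N = 8, suffix "futils.c" -> "src/futils.c" */
	expand_path(path, sizeof(path), "src/common.h", 8, "futils.c");
	printf("%s\n", path);
	return 0;
}
```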
|
OpenSSL certificate memory leak
|
When creating a `git_cert` from the OpenSSL X509 certificate of a given stream, we do not call `X509_free()` on the certificate, leading to a memory leak as soon as the certificate is requested, e.g. by the certificate check callback.
Fix the issue by properly calling `X509_free()`.
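A usage-level sketch of the pattern (not libgit2's actual code): OpenSSL's `SSL_get_peer_certificate()` hands back a reference-counted `X509` that the caller owns, so it must be released with `X509_free()` once the information has been extracted.
```c
#include <openssl/ssl.h>

static int inspect_peer_certificate(SSL *ssl)
{
	X509 *cert = SSL_get_peer_certificate(ssl);

	if (cert == NULL)
		return -1; /* no certificate was presented */

	/* ... build the git_cert / extract whatever is needed from `cert` ... */

	X509_free(cert); /* balances the reference taken above */
	return 0;
}
```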
|
tests: checkout: fix flaky test due to mtime race
|
When trying to determine whether a file changed, we try to avoid heavy operations by first taking a look at the index and checking whether the index entry is already modified. This doesn't seem to cut it, though, as we currently have the racy checkout::index::can_disable_pathspec_match test case: sometimes the files get restored to their original contents, sometimes they don't.
The issue is caused by a racy index [1]: if we modify a file, add it to the index and then modify it again in-place without changing its file size, then we may end up with a modified file that has the same stat(3P) info as its corresponding index entry. The mitigation for this is to treat files whose mtime equals that of the index as racily modified. We already have this logic in place for the index, but not when doing a checkout.
Fix the issue by only trusting the index entry in case its mtime is older than that of the index. Previously, the following script reliably produced at least 20 failures, while now no failure can be observed anymore:
```bash
j=0
for i in $(seq 100)
do
if ! ./libgit2_clar -scheckout::index::can_disable_pathspec_match >/dev/null
then
j=$(($j + 1))
fi
done
echo "Failures: $j"
```
[1]: https://git-scm.com/docs/racy-git
|
cmake: Sort source files for reproducible builds
|
We currently use `FILE(GLOB ...)` in most places to find source and header files. This is problematic in that the order of files returned depends on the operating system's directory iteration order and may thus not be deterministic. As a result, we link object files in unspecified order, which may cause the linker to emit different code across runs.
Fix this issue by sorting all globbed source and header files used as input to the libgit2 library, improving the reliability of reproducible builds.
|
While the function `git_futils_fake_symlink` is declared with arguments `new, old`, the implementation uses the reverse order `old, new`. Let's fix the ordering so that the declaration and the implementation agree and match what symlink(3P) has. While at it, we also rename these parameters: `old` and `new` don't make a lot of sense in the context of symlinks, which is why this commit renames them to `target` and `path`.
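For reference, the POSIX prototype being matched is `symlink(target, path)`: the first argument is what the link points to, the second is the name of the link to create.
```c
#include <unistd.h>

int main(void)
{
	/* creates "link-to-readme" pointing at "README.md" */
	return symlink("README.md", "link-to-readme") == 0 ? 0 : 1;
}
```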
|
Include GIT_ASSERT_WITH_RETVAL and GIT_ASSERT_ARG_WITH_RETVAL so that functions that do not return int (or, more precisely, where `-1` would not be an error code) can assert.
This allows functions that return, e.g., NULL on error to do so by passing the return value (in this example, `NULL`) as a second parameter to the GIT_ASSERT_WITH_RETVAL functions.
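A hedged sketch of the shape (hypothetical names, not the actual libgit2 macros): the failure return value is supplied by the caller, so non-int functions can participate.
```c
#include <stdio.h>

#define MY_ASSERT_WITH_RETVAL(expr, retval) do { \
		if (!(expr)) { \
			fprintf(stderr, "assertion failed: %s\n", #expr); \
			return (retval); \
		} \
	} while (0)

struct thing { const char *name; };

/* A function returning a pointer can now assert and bail out with NULL. */
static const char *thing_name(const struct thing *t)
{
	MY_ASSERT_WITH_RETVAL(t != NULL, NULL);
	return t->name;
}
```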
|
Fall back to the system assert(3) in debug builds, which may aid in debugging.
"Safe" assertions can be enabled in debug builds by setting GIT_ASSERT_HARD=0. Similarly, hard assertions can be enabled in release builds by setting GIT_ASSERT_HARD to nonzero.
|
Provide macros to replace usages of `assert`. A true `assert` is punishing in a library; instead, we should do our best not to crash.
GIT_ASSERT_ARG(x) will now assert that the given argument complies with some format; it sets an error message and returns `-1` if it does not.
GIT_ASSERT(x) is for internal usage and is available as an internal consistency check. It will set an error message and return `-1` in the event of failure.
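A minimal sketch of the concept (hypothetical names, not libgit2's actual definitions): instead of aborting the process, a failed check records an error message and makes the enclosing function return `-1`.
```c
#include <stdio.h>

#define MY_ASSERT(expr) do { \
		if (!(expr)) { \
			fprintf(stderr, "assertion failed: %s\n", #expr); \
			return -1; \
		} \
	} while (0)

static int buffer_put(char *buf, size_t len, char c)
{
	MY_ASSERT(buf != NULL); /* invalid argument: report instead of crashing */
	MY_ASSERT(len > 0);
	buf[0] = c;
	return 0;
}
```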
|
Caught by static analysis.
|
The checkout code didn't iterate into a subdirectory if the subdirectory itself didn't match the pathspec. But since the pathspec might match files inside that subdirectory, we should recurse into it (in contrast to gitignore handling). For example, the pathspec `subdir/file.c` does not match the directory `subdir` itself, yet it matches a file within it.
Fixes #5089
|
We were previously applying the pathspec filter only to the baseline iterator and the target tree during checkout. This was an oversight; in fact, we should apply the pathspec filter to _all_ checkout targets, not just trees.
Add a helper function that sets the iterator pathspecs from the given checkout pathspecs, and call it everywhere.
|
git__hexdump: better mimic `hexdump -C`
|
sysdir: remove unused git_sysdir_get_str
|
Commit 0b5ba0d replaced this function with an "option_init" equivalent, but misspelled the replacement function. As a result, this symbol has been missing from libgit2.so ever since.
|
pack: Improve error handling for get_delta_base()
|
This makes get_delta_base() return the error code as the return value
and the delta base as an out-parameter.
|
This change moves the responsibility of setting the error upon failures of get_delta_base() to get_delta_base() itself instead of its callers. That way, the caller can always check if the return value is negative and mark the whole operation as an error instead of using garbage values, which can lead to crashes if the .pack files are malformed.
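A hedged sketch of the calling convention described above (hypothetical function, not the real get_delta_base()): the error code is the return value, the result travels through an out-parameter, and the callee reports the error, so callers never act on a garbage base.
```c
#include <stddef.h>

static int find_delta_base(size_t *base_out, const unsigned char *delta, size_t len)
{
	if (len == 0) {
		/* set a descriptive error message here */
		return -1;
	}

	*base_out = delta[0]; /* placeholder for the real base-offset decoding */
	return 0;
}

static int apply_delta(const unsigned char *delta, size_t len)
{
	size_t base;

	if (find_delta_base(&base, delta, len) < 0)
		return -1; /* propagate: never use an uninitialized base */

	/* ... resolve the base object and apply the delta ... */
	return 0;
}
```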
|
merge: cache negative cache results for similarity metrics
|
When computing renames, we cache the hash signatures for each of the potentially conflicting entries so that we do not need to repeatedly read the file and can at least halfway efficiently determine whether two files are similar enough to be deemed a rename. In order to make the hash signatures meaningful, we require at least four lines of data to be present, resulting in at least four different hashes that can be compared. Files that are deemed too small are not cached at all and will thus be repeatedly re-hashed, which is usually not a huge issue.
The problem with the above heuristic arises when a file does _not_ have at least four lines, where a line is anything separated by a consecutive run of "\n" or "\0" characters. For example, "a\nb" is two lines, but "a\0\0b" is also just two lines. Taken to the extreme, a file consisting of megabytes of consecutive space or NUL bytes may also be deemed too small and thus never gets cached. As a result, we repeatedly load its blob and calculate its hash signature, just to throw it away once we notice it is of no value. When you compare a comparatively big file of this kind against a big set of potentially renamed files, the cost simply explodes.
The issue can be trivially fixed by introducing negative cache entries. Whenever we determine that a given blob does not have a meaningful representation via a hash signature, we store this negative cache marker and from then on neither hash the blob again nor consider it a potential rename target. This already helps the "normal" case where you have a lot of small files as rename candidates, but in the above scenario its savings are extraordinarily high.
To verify that we do not hit the issue anymore with the described solution, this commit adds a test that uses the exact same setup described above, with one 50 megabyte blob of '\0' characters and 1000 other files that get renamed. Without the negative cache:
$ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
real 11m48.377s
user 11m11.576s
sys 0m35.187s
And with the negative cache:
$ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
real 0m1.972s
user 0m1.851s
sys 0m0.118s
So this represents a ~350-fold performance improvement, although the exact numbers obviously depend on how many files you have and how big the blob is. The test numbers were chosen such that one will immediately notice as soon as the bug resurfaces.
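A hedged sketch of the negative-caching idea (hypothetical names, not libgit2's actual cache): a sentinel entry remembers "this blob can never produce a useful signature", so the expensive hashing is attempted at most once per blob.
```c
#include <stddef.h>

static int too_small_sentinel;
#define SIGNATURE_TOO_SMALL ((void *)&too_small_sentinel)

static void *get_or_compute_signature(void **cache, size_t idx,
                                       void *(*compute)(size_t))
{
	void *sig = cache[idx];

	if (sig == SIGNATURE_TOO_SMALL)
		return NULL;    /* known-useless blob: skip it cheaply */
	if (sig != NULL)
		return sig;     /* positive cache hit */

	sig = compute(idx);     /* expensive: read the blob, hash its lines */
	cache[idx] = sig ? sig : SIGNATURE_TOO_SMALL;
	return sig;
}
```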
|
Handle repository format v1
|
Git has supported repository format version 1 for some time. This
format is just like version 0, but it supports extensions.
Implementations must reject extensions that they don't support.
Add support for this format version and reject any extensions but
extensions.noop, which is the only extension we currently support.
While we're at it, also clean up an error message.
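For illustration, a version 1 repository using only the extension mentioned above would carry configuration along these lines (any other `extensions.*` key must cause the repository to be rejected):
```
[core]
	repositoryformatversion = 1
[extensions]
	noop = true
```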
|
refdb_fs: remove unused header file
|
The "refdb_fs.h" header contains a single struct `git_refcache` that is
not used anywhere. As a result, we can just delete the header altogether
as it doesn't have any purpose and may confuse readers.
|
When generating a patch for a renamed file whose mode bits have changed in addition to the rename, we currently fail to parse the generated patch. Furthermore, when generating a diff we output the mode bits after the similarity metric, which is different from how upstream git handles it.
Fix both issues by adding another state transition that allows similarity indices after mode changes and by printing mode changes before the similarity index.
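For reference, upstream git orders the extended headers for a rename with a mode change roughly like this (file names here are made up):
```diff
diff --git a/build.sh b/scripts/build.sh
old mode 100644
new mode 100755
similarity index 100%
rename from build.sh
rename to scripts/build.sh
```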
|
Fix segfault when calling git_blame_buffer()
|
This change makes sure that the hunk is not null before trying to
dereference it. This avoids segfaults, especially when blaming against a
modified buffer (i.e. the index).
Fixes: #5443
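At the public API level the same rule applies: a hunk lookup can legitimately come back as NULL, so it must be checked before any field is dereferenced. A usage-level illustration:
```c
#include <stdio.h>
#include <git2.h>

static int print_hunk_start(git_blame *blame, size_t lineno)
{
	const git_blame_hunk *hunk = git_blame_get_hunk_byline(blame, lineno);

	if (hunk == NULL)
		return -1; /* no hunk covers this line */

	printf("line %zu maps to a hunk starting at final line %zu\n",
	       lineno, hunk->final_start_line_number);
	return 0;
}
```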
|
Signed-off-by: Utkarsh Gupta <utkarsh@debian.org>
|
While the `git_refdb_backend` struct has a version field, we do not initialize it correctly when calling `git_refdb_backend_fs()`. Fix this by adding a call to `git_refdb_init_backend()`.
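An illustration of the initialization pattern via the public sys API: the init helper fills in the struct's `version` field before the backend's callbacks are assigned.
```c
#include <git2/sys/refdb_backend.h>

static int prepare_backend(git_refdb_backend *backend)
{
	if (git_refdb_init_backend(backend, GIT_REFDB_BACKEND_VERSION) < 0)
		return -1;

	/* ... assign the backend's lookup/iterator/write callbacks ... */
	return 0;
}
```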
|
cmake: use install directories provided via GNUInstallDirs
|
We currently hand-code logic to configure where to install our artifacts via the `LIB_INSTALL_DIR`, `INCLUDE_INSTALL_DIR` and `BIN_INSTALL_DIR` variables. This is reinventing the wheel, as CMake already provides a way to do that via the `CMAKE_INSTALL_<DIR>` paths, e.g. `CMAKE_INSTALL_LIBDIR`. This requires users of libgit2 to know about the discrepancy and will require special hacks for any build systems that handle these variables in an automated way. One such example is Gentoo Linux, which sets up these paths in both the cmake and cmake-utils eclasses.
So let's stop doing that: the GNUInstallDirs module handles it in a better way for us, especially so as the actual values are dependent on CMAKE_INSTALL_PREFIX. This commit removes our own set of variables and instead refers users to the standard ones.
As a second benefit, this commit also fixes our pkgconfig generation to use the GNUInstallDirs module. We had a bug there where we ignored the CMAKE_INSTALL_PREFIX when configuring the libdir and includedir keys, so if libdir was set to "lib64", the resulting libdir entry would be an invalid relative path. With GNUInstallDirs, we can now use `CMAKE_INSTALL_FULL_LIBDIR`, which handles the prefix for us.
|
The Secure Transport interface we're currently using has been deprecated with macOS 10.15. As we're currently in code freeze, we cannot migrate to newer interfaces yet. As such, let's disable deprecation warnings for our Secure Transport stream.
|
Don't canonicalize symlink targets; our win32 path canonicalization routines expect an absolute path. In particular, stop using the path canonicalization routines for symlink targets, a usage introduced in commit 7d55bee6d ("win32: fix relative symlinks pointing into dirs", 2020-01-10).
Instead, use the UTF-8 to UTF-16 relative path handling functions, so that paths like "../foo" will be translated to "..\foo".
|
Add a function that takes a (possibly) relative UTF-8 path and emits a UTF-16 path with forward slashes translated to backslashes. If the given path is, in fact, absolute, it will be handled by the usual absolute path translation rules.
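A sketch of what such a helper could look like (hypothetical name; not necessarily how libgit2 implements it): convert a relative UTF-8 path to UTF-16, then flip forward slashes to backslashes. Absolute paths would instead go through the canonicalizing absolute-path conversion.
```c
#include <windows.h>

static int utf8_to_utf16_relative(wchar_t *dest, int dest_len, const char *src)
{
	int written = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
	                                  src, -1, dest, dest_len);
	if (written == 0)
		return -1;

	/* "../foo" becomes "..\foo" */
	for (wchar_t *p = dest; *p; p++) {
		if (*p == L'/')
			*p = L'\\';
	}

	return written;
}
```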
|
The path canonicalization functions on win32 are intended to canonicalize absolute paths, i.e. those with prefixes. In other words, paths that start with drive letters (`C:\`), share names (`\\server\share`), or other prefixes (`\\?\`).
This function removes leading `..` components that occur after the prefix but before the directory/file portion (e.g., turning `C:\..\..\..\foo` into `C:\foo`). This translation is not appropriate for local paths.
|
CMake booleans
|