path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* futils: fix order of declared parameters for `git_futils_fake_symlink`pks/futils-symlink-argsPatrick Steinhardt2020-05-122-6/+6
| While the function `git_futils_fake_symlink` is declared with arguments `new, old`, the implementation uses the reverse order `old, new`. Let's fix the ordering issues to be `new, old` for both, which matches what symlink(3P) has. While at it, we also rename these parameters: `old` and `new` don't really make a lot of sense in the context of symlinks, which is why this commit renames them to be called `target` and `path`.
* assert: allow non-int returning functions to assertethomson/assert_macrosEdward Thomson2020-05-111-14/+21
| | | | | | | | | | Include GIT_ASSERT_WITH_RETVAL and GIT_ASSERT_ARG_WITH_RETVAL so that functions that do not return int (or more precisely, where `-1` would not be an error code) can assert. This allows functions that return, eg, NULL on an error code to do that by passing the return value (in this example, `NULL`) as a second parameter to the GIT_ASSERT_WITH_RETVAL functions.
* assert: optionally fall-back to assert(3)Edward Thomson2020-05-112-27/+52
| Fall back to the system assert(3) in debug builds, which may aid in debugging. "Safe" assertions can be enabled in debug builds by setting GIT_ASSERT_HARD=0. Similarly, hard assertions can be enabled in release builds by setting GIT_ASSERT_HARD to nonzero.
* Introduce GIT_ASSERT macrosEdward Thomson2020-05-111-0/+27
| Provide macros to replace usages of `assert`. A true `assert` is punishing as a library. Instead we should do our best to not crash. GIT_ASSERT_ARG(x) will now assert that the given argument complies with some format; if it does not, it sets an error message and returns `-1`. GIT_ASSERT(x) is for internal usage, and available as an internal consistency check. It will set an error message and return `-1` in the event of failure.
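The "soft" assertion idea described in the assert commits above can be sketched roughly as follows. This is an illustrative sketch only, not the library's actual macros: every `SKETCH_`/`sketch_` name, the error strings, and the error-recording helper are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a library's "set last error" routine. */
static const char *sketch_last_error = NULL;

static void sketch_set_error(const char *msg)
{
	sketch_last_error = msg;
}

/* On failure: record a message and return `retval` from the enclosing function. */
#define SKETCH_ASSERT_WITH_RETVAL(expr, retval) \
	do { \
		if (!(expr)) { \
			sketch_set_error("unrecoverable internal error: '" #expr "'"); \
			return (retval); \
		} \
	} while (0)

#define SKETCH_ASSERT_ARG_WITH_RETVAL(expr, retval) \
	do { \
		if (!(expr)) { \
			sketch_set_error("invalid argument: '" #expr "'"); \
			return (retval); \
		} \
	} while (0)

/* Convenience forms for functions that return an int error code. */
#define SKETCH_ASSERT(expr)     SKETCH_ASSERT_WITH_RETVAL(expr, -1)
#define SKETCH_ASSERT_ARG(expr) SKETCH_ASSERT_ARG_WITH_RETVAL(expr, -1)

/* An int-returning function: a bad argument yields -1 instead of a crash. */
static int sketch_length(const char *s, size_t *out)
{
	SKETCH_ASSERT_ARG(s);
	SKETCH_ASSERT_ARG(out);
	*out = strlen(s);
	return 0;
}

/* A pointer-returning function: -1 is meaningless here, so NULL is passed explicitly. */
static const char *sketch_identity(const char *s)
{
	SKETCH_ASSERT_ARG_WITH_RETVAL(s, NULL);
	return s;
}
```

A caller checks the return value as usual, e.g. `if (sketch_length(str, &len) < 0)`, and can then inspect the recorded message.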
* Fix uninitialized stack memory and NULL ptr dereference in stash_to_indexPhilip Kelley2020-05-101-2/+2
| | | | Caught by static analysis.
* checkout: Fix removing untracked files by path in subdirectoriesSegev Finer2020-05-111-2/+7
| | | | | | | | The checkout code didn't iterate into a subdir if it didn't match the pathspec, but since the pathspec might match files in the subdir we should recurse into it (In contrast to gitignore handling). Fixes #5089
* checkout: filter pathspecs for _all_ checkout typesethomson/checkout_pathspecsEdward Thomson2020-05-101-9/+20
| | | | | | | | | | We were previously applying the pathspec filter for the baseline iterator during checkout, as well as the target tree. This was an oversight; in fact, we should apply the pathspec filter to _all_ checkout targets, not just trees. Add a helper function to set the iterator pathspecs from the given checkout pathspecs, and call it everywhere.
* Merge pull request #5431 from libgit2/ethomson/hexdumpEdward Thomson2020-05-101-9/+22
|\ | | | | git__hexdump: better mimic `hexdump -C`
| * git__hexdump: better mimic `hexdump -C`ethomson/hexdumpEdward Thomson2020-04-011-9/+22
| |
* | blame: add option to ignore whitespace changesCarl Schwan2020-04-141-3/+6
| |
* | Merge pull request #5485 from libgit2/ethomson/sysdir_unusedPatrick Steinhardt2020-04-052-30/+0
|\ \ | | | | | | sysdir: remove unused git_sysdir_get_str
| * | sysdir: remove unused git_sysdir_get_strethomson/sysdir_unusedEdward Thomson2020-04-052-30/+0
| | |
* | | Fix typo causing removal of symbol 'git_worktree_prune_init_options'Seth Junot2020-04-041-1/+1
|/ / | | Commit 0b5ba0d replaced this function with an "option_init" equivalent, but misspelled the replacement function. As a result, this symbol has been missing from libgit2.so ever since.
* | Merge pull request #5425 from lhchavez/fix-get-delta-basePatrick Steinhardt2020-04-043-26/+44
|\ \ | | | | | | pack: Improve error handling for get_delta_base()
| * | Re-adding the "delta offset is zero" error caselhchavez2020-04-021-0/+6
| | |
| * | Making get_delta_base() conform to the general error-handling patternlhchavez2020-04-013-25/+29
| | | | | | | | | | | | | | | This makes get_delta_base() return the error code as the return value and the delta base as an out-parameter.
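The error-handling pattern named in this commit (error code as the return value, result through an out-parameter) might look like the sketch below. It is a simplified illustration under stated assumptions: the function name, the one-byte offset encoding, and the specific checks are hypothetical and do not mirror the real packfile parsing.

```c
#include <stddef.h>

/*
 * Hypothetical decoder: reads a one-byte backwards offset from `data` and
 * computes the base position relative to `cur_offset`.
 * Returns 0 on success and stores the result in `*out`; returns -1 on
 * malformed input and leaves `*out` untouched.
 */
static int sketch_get_delta_base(
	size_t *out, const unsigned char *data, size_t len, size_t cur_offset)
{
	size_t delta;

	if (len == 0)
		return -1; /* truncated input */

	delta = data[0];

	if (delta == 0)
		return -1; /* a delta offset of zero would point at the object itself */
	if (delta > cur_offset)
		return -1; /* offset would point before the start of the file */

	*out = cur_offset - delta;
	return 0;
}
```

Because the result and the error travel on separate channels, a caller can test for a negative return value without having to reserve a magic "invalid" value of the result type.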
| * | pack: Improve error handling for get_delta_base()lhchavez2020-04-011-7/+15
| |/ | | This change moves the responsibility of setting the error upon failures of get_delta_base() to get_delta_base() instead of its callers. That way, the caller can always check if the return value is negative and mark the whole operation as an error instead of using garbage values, which can lead to crashes if the .pack files are malformed.
* | Merge pull request #5477 from pks-t/pks/rename-detection-negative-cachesPatrick Steinhardt2020-04-041-7/+20
|\ \ | | | | | | merge: cache negative cache results for similarity metrics
| * | merge: cache negative cache results for similarity metricsPatrick Steinhardt2020-04-011-7/+20
| | | When computing renames, we cache the hash signatures for each of the potentially conflicting entries so that we do not need to repeatedly read the file and can at least halfway efficiently determine whether two files are similar enough to be deemed a rename. In order to make the hash signatures meaningful, we require at least four lines of data to be present, resulting in at least four different hashes that can be compared. Files that are deemed too small are not cached at all and will thus be repeatedly re-hashed, which is usually not a huge issue. The issue with the above heuristic arises when a file does _not_ have at least four lines, where a line is anything separated by a consecutive run of "\n" or "\0" characters. For example "a\nb" is two lines, but "a\0\0b" is also just two lines. Taken to the extreme, a file that consists of megabytes of consecutive space-only or NUL-only data may also be deemed too small and thus not get cached. As a result, we will repeatedly load its blob and calculate its hash signature, just to finally throw it away as we notice it's not of any value. When you've got a comparatively big file that you compare against a big set of potentially renamed files, then the cost simply explodes. The issue can be trivially fixed by introducing negative cache entries. Whenever we determine that a given blob does not have a meaningful representation via a hash signature, we store this negative cache marker and will from then on not hash it again, but also ignore it as a potential rename target. This should already help the "normal" case where you have a lot of small files as rename candidates, but in the above scenario its savings are extraordinarily high.
To verify we do not hit the issue anymore with the described solution, this commit adds a test that uses the exact same setup described above with one 50 megabyte blob of '\0' characters and 1000 other files that get renamed. Without the negative cache: $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null real 11m48.377s user 11m11.576s sys 0m35.187s And with the negative cache: $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null real 0m1.972s user 0m1.851s sys 0m0.118s So this represents a ~350-fold performance improvement, but it obviously depends on how many files you have and how big the blob is. The test numbers were chosen in a way that one will immediately notice as soon as the bug resurfaces.
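The negative-cache idea from this commit message can be sketched as a three-state cache entry. This is a minimal illustration, not the merge code itself: the names, the toy hash, and the call counter are hypothetical; only the "at least four lines, separated by runs of `\n` or `\0`" rule is taken from the message above.

```c
#include <stddef.h>

enum sketch_state { SKETCH_UNKNOWN = 0, SKETCH_CACHED, SKETCH_NEGATIVE };

struct sketch_entry {
	enum sketch_state state;
	unsigned long signature;
};

/* Counts how often the expensive hashing step actually runs. */
static int sketch_hash_calls = 0;

/* Toy signature: fails with -1 when the data has fewer than four lines. */
static int sketch_compute_signature(unsigned long *out, const char *data, size_t len)
{
	size_t i, lines = 0;
	int in_line = 0;
	unsigned long h = 5381;

	sketch_hash_calls++;

	for (i = 0; i < len; i++) {
		if (data[i] == '\n' || data[i] == '\0') {
			in_line = 0; /* a run of separators ends the current line */
		} else {
			if (!in_line) {
				lines++;
				in_line = 1;
			}
			h = h * 33 + (unsigned char)data[i];
		}
	}

	if (lines < 4)
		return -1;

	*out = h;
	return 0;
}

/* Looks up a signature, remembering failures so they are never recomputed. */
static int sketch_lookup(
	struct sketch_entry *e, unsigned long *out, const char *data, size_t len)
{
	if (e->state == SKETCH_NEGATIVE)
		return -1; /* known to be meaningless: skip the hashing entirely */

	if (e->state == SKETCH_UNKNOWN) {
		if (sketch_compute_signature(&e->signature, data, len) < 0) {
			e->state = SKETCH_NEGATIVE;
			return -1;
		}
		e->state = SKETCH_CACHED;
	}

	*out = e->signature;
	return 0;
}
```

With only two states (unknown and cached), every lookup of a too-small blob would fall into the "unknown" branch again and re-hash it; the third state is what removes the repeated work.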
* | | Merge pull request #5388 from bk2204/repo-format-v1Patrick Steinhardt2020-04-021-9/+38
|\ \ \ | | | | | | | | Handle repository format v1
| * | | repository: handle format v1brian m. carlson2020-02-111-9/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Git has supported repository format version 1 for some time. This format is just like version 0, but it supports extensions. Implementations must reject extensions that they don't support. Add support for this format version and reject any extensions but extensions.noop, which is the only extension we currently support. While we're at it, also clean up an error message.
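The rule described above (version 1 behaves like version 0 but must reject unsupported extensions) could be checked along these lines. This is a hedged sketch: the function name and the way extensions are passed in are hypothetical, and treating version 0 as ignoring extensions entirely is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical validity check for a repository format.
 * `extensions` holds the names found under the "extensions." config section.
 * Returns 0 if the repository can be opened, -1 otherwise.
 */
static int sketch_check_format(int version, const char *const *extensions, size_t count)
{
	size_t i;

	if (version == 0)
		return 0; /* assumption: extensions are not consulted for v0 */

	if (version != 1)
		return -1; /* unknown format version */

	for (i = 0; i < count; i++) {
		/* "noop" is the only extension named as supported in the commit */
		if (strcmp(extensions[i], "noop") != 0)
			return -1;
	}

	return 0;
}
```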
* | | | Merge pull request #5461 from pks-t/pks/refdb-fs-unused-headerEdward Thomson2020-04-012-21/+0
|\ \ \ \ | |_|_|/ |/| | | refdb_fs: remove unused header file
| * | | refdb_fs: remove unused header filePatrick Steinhardt2020-03-252-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | The "refdb_fs.h" header contains a single struct `git_refcache` that is not used anywhere. As a result, we can just delete the header altogether as it doesn't have any purpose and may confuse readers.
* | | | patch: correctly handle mode changes for renamesPatrick Steinhardt2020-03-262-7/+8
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When generating a patch for a renamed file whose mode bits have changed in addition to the rename, then we currently fail to parse the generated patch. Furthermore, when generating a diff we output mode bits after the similarity metric, which is different to how upstream git handles it. Fix both issues by adding another state transition that allows similarity indices after mode changes and by printing mode changes before the similarity index.
* | | Merge pull request #5445 from lhchavez/fix-5443Edward Thomson2020-03-261-1/+1
|\ \ \ | | | | | | | | Fix segfault when calling git_blame_buffer()
| * | | Fix segfault when calling git_blame_buffer()lhchavez2020-03-231-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | This change makes sure that the hunk is not null before trying to dereference it. This avoids segfaults, especially when blaming against a modified buffer (i.e. the index). Fixes: #5443
* | | Fix spelling errorUtkarsh Gupta2020-03-261-1/+1
|/ / | | | | | | Signed-off-by: Utkarsh Gupta <utkarsh@debian.org>
* | refdb_fs: initialize backend versionPatrick Steinhardt2020-03-221-0/+3
| | | | | | | | | | | | While the `git_refdb_backend()` struct has a version, we do not initialize it correctly when calling `git_refdb_backend_fs()`. Fix this by adding the call to `git_refdb_init_backend()`.
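The versioned-struct pattern this fix relies on can be illustrated with a small sketch. The struct, its single callback, and the init function below are hypothetical stand-ins, not the real `git_refdb_backend` layout.

```c
#include <stddef.h>

#define SKETCH_BACKEND_VERSION 1

/* Hypothetical backend: a version field followed by callbacks. */
struct sketch_backend {
	unsigned int version;
	int (*exists)(const char *name);
};

/* Zeroes the struct and stamps it with a supported version. */
static int sketch_init_backend(struct sketch_backend *backend, unsigned int version)
{
	static const struct sketch_backend blank = { 0, NULL };

	if (backend == NULL || version == 0 || version > SKETCH_BACKEND_VERSION)
		return -1;

	*backend = blank;
	backend->version = version;
	return 0;
}
```

A constructor that fills in the callbacks but skips the init call leaves `version` with whatever the allocator produced, which is the kind of omission the commit above addresses.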
* | Merge pull request #5455 from pks-t/pks/cmake-install-dirsEdward Thomson2020-03-211-11/+5
|\ \ | | | | | | cmake: use install directories provided via GNUInstallDirs
| * | cmake: use install directories provided via GNUInstallDirsPatrick Steinhardt2020-03-141-11/+5
| | | We currently hand-code logic to configure where to install our artifacts via the `LIB_INSTALL_DIR`, `INCLUDE_INSTALL_DIR` and `BIN_INSTALL_DIR` variables. This is reinventing the wheel, as CMake already provides a way to do that via `CMAKE_INSTALL_<DIR>` paths, e.g. `CMAKE_INSTALL_LIBDIR`. This requires users of libgit2 to know about the discrepancy and will require special hacks for any build systems that handle these variables in an automated way. One such example is Gentoo Linux, which sets up these paths in both the cmake and cmake-utils eclass. So let's stop doing that: the GNUInstallDirs module handles it in a better way for us, especially so as the actual values are dependent on CMAKE_INSTALL_PREFIX. This commit removes our own set of variables and instead refers users to use the standard ones. As a second benefit, this commit also fixes our pkgconfig generation to use the GNUInstallDirs module. We had a bug there where we ignored the CMAKE_INSTALL_PREFIX when configuring the libdir and includedir keys, so if libdir was set to "lib64", then libdir would be an invalid path. With GNUInstallDirs, we can now use `CMAKE_INSTALL_FULL_LIBDIR`, which handles the prefix for us.
* | | cmake: ignore deprecation notes for Secure TransportPatrick Steinhardt2020-03-131-0/+4
| | | The Secure Transport interface we're currently using has been deprecated with macOS 10.15. As we're currently in code freeze, we cannot migrate to newer interfaces. As such, let's disable deprecation warnings for our "stransport.c" stream.
* | | win32: don't canonicalize symlink targetsEdward Thomson2020-03-101-2/+1
| | | Don't canonicalize symlink targets; our win32 path canonicalization routines expect an absolute path. In particular, stop using the path canonicalization routines for symlink targets (introduced in commit 7d55bee6d, "win32: fix relative symlinks pointing into dirs", 2020-01-10). Now, use the utf8 -> utf16 relative path handling functions, so that paths like "../foo" will be translated to "..\foo".
* | | win32: introduce relative path handling functionEdward Thomson2020-03-102-2/+41
| | | Add a function that takes a (possibly) relative UTF-8 path and emits a UTF-16 path with forward slashes translated to backslashes. If the given path is, in fact, absolute, it will be translated according to absolute path handling rules.
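The separator translation described here can be sketched on plain byte strings. Note the simplification: this example only swaps `/` for `\` in a `char` buffer and does no UTF-8 to UTF-16 conversion; the function name and signature are hypothetical.

```c
#include <stddef.h>

/*
 * Copies `path` into `out`, turning forward slashes into backslashes.
 * Returns 0 on success, -1 if `out` (of `out_size` bytes) is too small.
 */
static int sketch_to_backslashes(char *out, size_t out_size, const char *path)
{
	size_t i;

	for (i = 0; path[i] != '\0'; i++) {
		if (i + 1 >= out_size)
			return -1; /* no room left for this byte plus the terminator */
		out[i] = (path[i] == '/') ? '\\' : path[i];
	}

	if (out_size == 0)
		return -1;

	out[i] = '\0';
	return 0;
}
```

Leading `..` components are copied through unchanged, so a relative target such as `../foo` keeps its meaning after translation.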
* | | win32: clarify usage of path canonicalization funcsEdward Thomson2020-03-081-0/+3
|/ / | | The path canonicalization functions on win32 are intended to canonicalize absolute paths; those with prefixes. In other words, things that start with drive letters (`C:\`), share names (`\\server\share`), or other prefixes (`\\?\`). This function removes leading `..` that occur after the prefix but before the directory/file portion (eg, turning `C:\..\..\..\foo` into `C:\foo`). This translation is not appropriate for local paths.
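To illustrate why this canonicalization only suits absolute paths, here is a rough sketch of the "strip leading `..` after the prefix" step. It is a simplification under stated assumptions: only drive-letter prefixes are recognized, the path is a plain `char` string, and the function name is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Removes `..` components that directly follow a `X:\` prefix, in place. */
static void sketch_strip_leading_dotdot(char *path)
{
	const size_t prefix = 3; /* length of "C:\" */
	size_t src;

	/* Only handle drive-letter prefixes; leave everything else untouched. */
	if (!(isalpha((unsigned char)path[0]) && path[1] == ':' && path[2] == '\\'))
		return;

	src = prefix;
	while (path[src] == '.' && path[src + 1] == '.' &&
	       (path[src + 2] == '\\' || path[src + 2] == '\0'))
		src += (path[src + 2] == '\\') ? 3 : 2;

	/* Shift the remainder (including the terminator) down over the `..` run. */
	memmove(path + prefix, path + src, strlen(path + src) + 1);
}
```

A relative path such as `..\foo` has no prefix, so the sketch leaves it alone; dropping its `..` would silently change which file it names.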
* | Merge pull request #5422 from pks-t/pks/cmake-booleansEdward Thomson2020-03-061-1/+1
|\ \ | | | | | | CMake booleans
| * | cmake: fix ENABLE_TRACE parameter being too strictPatrick Steinhardt2020-02-241-1/+1
| |/ | | | | | | | | | | | | | | | | | | In order to check whether tracing support should be turned on, we check whether ENABLE_TRACE equals "ON". This is being much too strict, as CMake will also treat "on", "true", "yes" and others as true-ish, but passing them will disable tracing support now. Fix the issue by simply removing the STREQUAL, which will cause CMake to do the right thing automatically.
* | Merge pull request #5439 from ignatenkobrain/patch-2Edward Thomson2020-03-061-1/+1
|\ \ | | | | | | Set proper pkg-config dependency for pcre2
| * | Set proper pkg-config dependency for pcre2Igor Gnatenko2020-03-031-1/+1
| | | | | | | | | | | | Signed-off-by: Igor Raits <i.gnatenko.brain@gmail.com>
* | | httpclient: use a 16kb read buffer for macOSethomson/sslreadEdward Thomson2020-03-041-1/+12
|/ / | | Use a 16kb read buffer for compatibility with macOS SecureTransport. SecureTransport `SSLRead` has the following behavior: 1. It will return _at most_ one TLS packet's worth of data, and 2. It will try to give you as much data as you asked for This means that if you call `SSLRead` with a buffer size that is smaller than what _it_ reads (in other words, the maximum size of a TLS packet), then it will buffer that data for subsequent calls. However, it will also attempt to give you as much data as you requested in your SSLRead call. This means that it will guarantee a network read in the event that it has buffered data. Consider our 8kb buffer and a server sending us 12kb of data on an HTTP Keep-Alive session. Our first `SSLRead` will read the TLS packet off the network. It will return us the 8kb that we requested and buffer the remaining 4kb. Our second `SSLRead` call will see the 4kb that's buffered and decide that it could give us an additional 4kb. So it will do a network read. But there's nothing left to read; that was the end of the data. The HTTP server is waiting for us to provide a new request. The server will eventually time out, our `read` system call will return, `SSLRead` can return back to us and we can make progress. While technically correct, this is wildly inefficient. (Thanks, Tim Apple!) Moving us to use an internal buffer that is the maximum size of a TLS packet (16kb) ensures that `SSLRead` will never buffer and it will always return everything that it read (albeit decrypted).
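The buffering behavior walked through in this message can be modelled with a toy simulation. This is only a model of the two rules quoted above, not Apple's implementation: the struct, the function, and the "stalled read" counter are hypothetical, and a stall here stands for a network read that would block until the server times out.

```c
#include <stddef.h>

struct sketch_tls {
	size_t pending_record; /* bytes of one TLS record waiting on the wire, 0 if none */
	size_t buffered;       /* decrypted bytes left over from an earlier record */
	int stalled_reads;     /* network reads attempted with nothing on the wire */
};

/* Returns the number of bytes handed to the caller for a request of `want` bytes. */
static size_t sketch_read(struct sketch_tls *s, size_t want)
{
	/* Rule 2: hand out buffered data first, up to what was asked for. */
	size_t got = (s->buffered < want) ? s->buffered : want;
	s->buffered -= got;

	if (got < want) {
		/* Still short of the request: go to the network for one record. */
		if (s->pending_record == 0) {
			s->stalled_reads++; /* nothing there: this read would block */
		} else {
			size_t room = want - got;
			size_t take = (s->pending_record < room) ? s->pending_record : room;

			/* Rule 1: at most one record per call; keep what did not fit. */
			s->buffered = s->pending_record - take;
			s->pending_record = 0;
			got += take;
		}
	}

	return got;
}
```

In this model, a 12kb record read through an 8kb buffer needs two calls, and the second one stalls; the same record read through a 16kb buffer is returned whole by a single call with nothing left buffered.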
* | Merge pull request #5417 from pks-t/pks/ntlmclient-htonllPatrick Steinhardt2020-02-251-2/+2
|\ \ | | | | | | deps: ntlmclient: fix missing htonll symbols on FreeBSD and SunOS
| * | transports: auth_ntlm: fix use of strdup/strndupPatrick Steinhardt2020-02-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | In the NTLM authentication code, we accidentally use strdup(3P) and strndup(3P) instead of our own wrappers git__strdup and git__strndup, respectively. Fix the issue by using our own functions.
* | | Fix typo on GIT_USE_NECSven Strickroth2020-02-201-1/+1
| | | | | | | | | | | | Signed-off-by: Sven Strickroth <email@cs-ware.de>
* | | Merge pull request #5390 from pks-t/pks/sha1-lookupPatrick Steinhardt2020-02-194-56/+21
|\ \ \ | | | | | | | | sha1_lookup: inline its only function into "pack.c"
| * | | sha1_lookup: inline its only function into "pack.c"Patrick Steinhardt2020-02-074-56/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The file "sha1_lookup.c" contains a single function `sha1_position` only which is used only in the packfile implementation. As the function is comparatively small, to enable the compiler to optimize better and to remove symbol visibility, move it into "pack.c".
* | | | Merge pull request #5391 from pks-t/pks/coverity-fixesPatrick Steinhardt2020-02-199-93/+132
|\ \ \ \ | |_|/ / |/| | | Coverity fixes
| * | | streams: openssl: ignore return value of `git_mutex_lock`Patrick Steinhardt2020-02-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OpenSSL pre-v1.1 required us to set up a locking function to properly support multithreading. The locking function signature cannot return any error codes, and as a result we can't do anything if `git_mutex_lock` fails. To silence static analysis tools, let's just explicitly ignore its return value by casting it to `void`.
| * | | cache: fix invalid memory access in case updating cache entry failsPatrick Steinhardt2020-02-071-4/+8
| | | | When adding a new entry to our cache where an entry with the same OID exists already, then we only update the existing entry in case it is unparsed and the new entry is parsed. Currently, we do not check the return value of `git_oidmap_set` though when updating the existing entry. As a result, we will _not_ have updated the existing entry if `git_oidmap_set` fails, but have decremented its refcount and incremented the new entry's refcount. Later on, this may likely lead to dereferencing invalid memory. Fix the issue by checking the return value of `git_oidmap_set`. In case it fails, we will simply keep the existing entry stored instead, even though it's unparsed.
| * | | worktree: report errors when unable to read locking reasonPatrick Steinhardt2020-02-071-28/+44
| | | | Git worktrees have the ability to be locked in order to spare them from deletion, e.g. if a worktree is absent due to being located on a removable disk it is a good idea to lock it. When locking such worktrees, it is possible to give a locking reason in order to help the user later on when inspecting status of any such locked trees. The function `git_worktree_is_locked` serves to read out the locking status. It currently does not properly report any errors when reading the reason file, and callers do not expect any negative return values, either. Fix this by converting callers to expect error codes and checking the return code of `git_futils_readbuffer`.
| * | | repository: check error codes when reading common linkPatrick Steinhardt2020-02-071-50/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When checking whether a path is a valid repository path, we try to read the "commondir" link file. In the process, we neither confirm that constructing the file's path succeeded nor do we verify that reading the file succeeded, which might cause us to verify repositories on an empty or bogus path later on. Fix this by checking return values. As the function to verify repos doesn't currently support returning errors, this commit also refactors the function to return an error code, passing validity of the repo via an out parameter instead, and adjusts all existing callers.
| * | | pack-objects: check return code of `git_zstream_set_input`Patrick Steinhardt2020-02-071-1/+3
| | | | While `git_zstream_set_input` cannot fail right now, it might change in the future if we ever decide to have it check its parameters more rigorously. Let's thus check whether its return code signals an error.