summaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
...
| * | branch: clarify documentation around branchesEtienne Samson2019-12-071-27/+31
| | |
* | | Merge pull request #5333 from lrm29/attr_binary_macroPatrick Steinhardt2019-12-131-1/+1
|\ \ \ | | | | | | | | attr: Update definition of binary macro
| * | | attr: Update definition of binary macroLaurence McGlashan2019-12-121-1/+1
| |/ /
* | | path: support non-ascii drive letters on dosEdward Thomson2019-12-101-8/+30
| | | | | | | | | | | | | | | | | | | | | Windows/DOS only supports drive letters that are alpha characters A-Z. However, you can `subst` any one-character as a drive letter, including numbers or even emoji. Test that we can identify emoji as drive letters.
* | | path: protect NTFS everywhereEdward Thomson2019-12-103-8/+4
| | | | | | | | | | | | | | | Enable core.protectNTFS by default everywhere and in every codepath, not just on checkout.
* | | path: rename function that detects end of filenameEdward Thomson2019-12-101-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function `only_spaces_and_dots` used to detect the end of the filename on win32. Now we look at spaces and dots _before_ the end of the string _or_ a `:` character, which would signify a win32 alternate data stream. Thus, rename the function `ntfs_end_of_filename` to indicate that it detects the (virtual) end of a filename, that any further characters would be elided to the given path.
* | | path: also guard `.gitmodules` against NTFS Alternate Data StreamsJohannes Schindelin2019-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We just safe-guarded `.git` against NTFS Alternate Data Stream-related attack vectors, and now it is time to do the same for `.gitmodules`. Note: In the added regression test, we refrain from verifying all kinds of variations between short names and NTFS Alternate Data Streams: as the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it is enough to test one in order to know that all of them are guarded against. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* | | Disallow NTFS Alternate Data Stream attacks, even on Linux/macOSJohannes Schindelin2019-12-101-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A little-known feature of NTFS is that it offers to store metadata in so-called "Alternate Data Streams" (inspired by Apple's "resource forks") that are copied together with the file they are associated with. These Alternate Data Streams can be accessed via `<file name>:<stream name>:<stream type>`. Directories, too, have Alternate Data Streams, and they even have a default stream type `$INDEX_ALLOCATION`. Which means that `abc/` and `abc::$INDEX_ALLOCATION/` are actually equivalent. This is of course another attack vector on the Git directory that we definitely want to prevent. On Windows, we already do this incidentally, by disallowing colons in file/directory names. While it looks as if files'/directories' Alternate Data Streams are not accessible in the Windows Subsystem for Linux, and neither via CIFS/SMB-mounted network shares in Linux, it _is_ possible to access them on SMB-mounted network shares on macOS. Therefore, let's go the extra mile and prevent this particular attack _everywhere_. To keep things simple, let's just disallow *any* Alternate Data Stream of `.git`. This is libgit2's variant of CVE-2019-1352. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* | | Protect against 8.3 "short name" attacks also on Linux/macOSJohannes Schindelin2019-12-101-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Windows Subsystem for Linux (WSL) is getting increasingly popular, in particular because it makes it _so_ easy to run Linux software on Windows' files, via the auto-mounted Windows drives (`C:\` is mapped to `/mnt/c/`, no need to set that up manually). Unfortunately, files/directories on the Windows drives can be accessed via their _short names_, if that feature is enabled (which it is on the `C:` drive by default). Which means that we have to safeguard even our Linux users against the short name attacks. Further, while the default options of CIFS/SMB-mounts seem to disallow accessing files on network shares via their short names on Linux/macOS, it _is_ possible to do so with the right options. So let's just safe-guard against short name attacks _everywhere_. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* | MSVC: Fix warning C4133 on x64: "function": Incompatible types - from ↵Sven Strickroth2019-12-031-1/+1
| | | | | | | | | | | | "unsigned long *" to "size_t *" Signed-off-by: Sven Strickroth <email@cs-ware.de>
* | Merge pull request #5314 from pks-t/pks/dll-main-removalEdward Thomson2019-12-013-47/+12
|\ \ | | | | | | global: convert to fiber-local storage to fix exit races
| * | global: convert to fiber-local storage to fix exit racesPatrick Steinhardt2019-11-293-47/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Windows platforms, we automatically clean up the thread-local storage upon detaching a thread via `DllMain()`. The thing is that this happens for every thread of applications that link against the libgit2 DLL, even those that don't have anything to do with libgit2 itself. As a result, we cannot assume that these unsuspecting threads make use of our `git_libgit2_init()` and `git_libgit2_shutdow()` reference counting, which may lead to racy situations: Thread 1 Thread 2 git_libgit2_shutdown() DllMain(DETACH_THREAD) git__free_tls_data() git_atomic_dec() == 0 git__free_tls_data() TlsFree(_tls_index) TlsGetValue(_tls_index) Due to the second thread never having executed `git_libgit2_init()`, the first thread will clean up TLS data and as a result also free the `_tls_index` variable. When detaching the second thread, we unconditionally access the now-free'd `_tls_index` variable, which is obviously not going to work out well. Fix the issue by converting the code to use fiber-local storage instead of thread-local storage. While FLS will behave the exact same as TLS if no fibers are in use, it does allow us to specify a destructor similar to the one that is accepted by pthread_key_create(3P). Like this, we do not have to manually free indices anymore, but will let the FLS handle calling the destructor. This allows us to get rid of `DllMain()` completely, as we only used it to keep track of when threads were exiting and results in an overall simplification of TLS cleanup.
* | | patch_parse: fix out-of-bounds reads caused by integer underflowPatrick Steinhardt2019-11-281-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch format for binary files is a simple Base85 encoding with a length byte as prefix that encodes the current line's length. For each line, we thus check whether the line's actual length matches its expected length in order to not faultily apply a truncated patch. This also acts as a check to verify that we're not reading outside of the line's string: if (encoded_len > ctx->parse_ctx.line_len - 1) { error = git_parse_err(...); goto done; } There is the possibility for an integer underflow, though. Given a line with a single prefix byte, only, `line_len` will be zero when reaching this check. As a result, subtracting one from that will result in an integer underflow, causing us to assume that there's a wealth of bytes available later on. Naturally, this may result in an out-of-bounds read. Fix the issue by checking both `encoded_len` and `line_len` for a non-zero value. The binary format doesn't make use of zero-length lines anyway, so we need to know that there are both encoded bytes and remaining characters available at all. This patch also adds a test that works based on the last error message. Checking error messages is usually too tightly coupled, but in fact parsing the patch failed even before the change. Thus the only possibility is to use e.g. Valgrind, but that'd result in us not catching issues when run without Valgrind. As a result, using the error message is considered a viable tradeoff as we know that we didn't start decoding Base85 in the first place.
* | Merge pull request #5306 from herrerog/patchidPatrick Steinhardt2019-11-285-74/+52
|\ \ | | | | | | diff: complete support for git patchid
| * | diff: make patchid computation work with all types of commits.Gregory Herrero2019-11-281-61/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current implementation of patchid is not computing a correct patchid when given a patch where, for example, a new file is added or removed. Some more corner cases need to be handled to have same behavior as git patch-id command. Add some more tests to cover those corner cases. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
| * | patch_parse: correct parsing of patch containing not shown binary data.Gregory Herrero2019-11-191-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | When not shown binary data is added or removed in a patch, patch parser is currently returning 'error -1 - corrupt git binary header at line 4'. Fix it by correctly handling case where binary data is added/removed. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
| * | diff_print: add support for GIT_DIFF_FORMAT_PATCH_ID.Gregory Herrero2019-11-191-1/+6
| | | | | | | | | | | | | | | | | | | | | Git is generating patch-id using a stripped down version of a patch where hunk header and index information are not present. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
| * | diff_print: add a new 'print_index' flag when printing diff.Gregory Herrero2019-11-193-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new 'print_index' flag to let the caller decide whether or not 'index <oid>..<oid>' should be printed. Since patch id needs not to have index when hashing a patch, it will be useful soon. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
* | | Merge pull request #5243 from pks-t/pks/config-optimize-memPatrick Steinhardt2019-11-281-74/+36
|\ \ \ | | | | | | | | Memory optimizations for config entries
| * | | config_entries: micro-optimize storage of multivarsPatrick Steinhardt2019-11-051-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multivars are configuration entries that have many values for the same name; we can thus micro-optimize this case by just retaining the name of the first configuration entry and freeing all the others, letting them point to the string of the first entry. The attached test case is an extreme example that demonstrates this. It contains a section name that is approximately 500kB in size with 20.000 entries "a=b". Without the optimization, this would require at least 20000*500kB bytes, which is around 10GB. With this patch, it only requires 500kB+20000*1B=20500kB. The obvious culprit here is the section header, which we repeatedly include in each of the configuration entry's names. This makes it very easier for an adversary to provide a small configuration file that disproportionally blows up in memory during processing and is thus a feasible way for a denial-of-service attack. Unfortunately, we cannot fix the root cause by e.g. having a separate "section" field that may easily be deduplicated due to the `git_config_entry` structure being part of our public API. So this micro-optimization is the best we can do for now.
| * | | config_entries: only keep track of a single entry listPatrick Steinhardt2019-11-051-73/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever adding a configuration entry to the config entries structure, we allocate two list heads: - The first list head is added to the global list of config entries in order to be able to iterate over configuration entries in the order they were originally added. - The second list head is added to the map of entries in order to efficiently look up an entry by its name. If no entry with the same name exists in the map, then we add the new entry to the map directly. Otherwise, we append the new entry's list head to the pre-existing entry's list in order to keep track of multivars. While the former usecase is perfectly sound, the second usecase can be optimized. The only reason why we keep track of multivar entries in another separate list is to be able to determine whether an entry is unique or not by seeing whether its `next` pointer is set. So we keep track of a complete list of multivar entries just to have a single bit of information of whether it has other multivar entries with the same entry name. We can completely get rid of this secondary list by just adding a `first` field to the list structure itself. When executing `git_config_entries_append`, we will then simply check whether the configuration map already has an entry with the same name -- if so, we will set the `first` to zero to indicate that it is not the initial entry anymore. Instead of a second list head in the map, we can thus now directly store the list head of the first global list inside of the map and just refer to that bit. Note that the more obvious solution would be to store a `unique` field instead of a `first` field. But as we will only ever inspect the `first` field of the _last_ entry that has been moved into the map, these are semantically equivalent in that case. Having a `first` field also allows for a minor optimization: for multivar values, we can free the `name` field of all entries that are _not_ first and have them point to the name of the first entry instead.
| * | | config_entries: mark local functions as staticPatrick Steinhardt2019-11-051-3/+3
| | | | | | | | | | | | | | | | | | | | Some functions which are only used in "config_entries.c" are not marked as static, which is being fixed by this very commit.
* | | | Merge pull request #5307 from palmin/hash_sha256Patrick Steinhardt2019-11-281-0/+8
|\ \ \ \ | | | | | | | | | | ssh: include sha256 host key hash when supported
| * | | | ssh: include sha256 host key hash when supportedAnders Borum2019-11-201-0/+8
| | |/ / | |/| |
* | | | Merge pull request #5272 from tiennou/examples/cli-ificationPatrick Steinhardt2019-11-284-8/+0
|\ \ \ \ | | | | | | | | | | Various examples shape-ups
| * | | | global: DRY includes of assert.hEtienne Samson2019-11-064-8/+0
| | | | |
* | | | | Merge pull request #5123 from libgit2/ethomson/off_tPatrick Steinhardt2019-11-2841-163/+191
|\ \ \ \ \ | | | | | | | | | | | | Move `git_off_t` to `git_object_size_t`
| * | | | | internal: use off64_t instead of git_off_tethomson/off_tEdward Thomson2019-11-2510-75/+75
| | | | | | | | | | | | | | | | | | | | | | | | Prefer `off64_t` internally.
| * | | | | integer: use int64_t's for checksEdward Thomson2019-11-251-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Use int64_t internally for type visibility.
| * | | | | offmap: store off64_t's instead of git_off_t'sEdward Thomson2019-11-252-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | Prefer `off64_t` to `git_off_t` internally for visibility.
| * | | | | mmap: use a 64-bit signed type `off64_t` for mmapEdward Thomson2019-11-256-7/+19
| | | | | | | | | | | | | | | | | | | | | | | | Prefer `off64_t` to `git_off_t` for internal visibility.
| * | | | | mmap: remove unnecessary assertionEdward Thomson2019-11-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | 64 bit types are always 64 bit.
| * | | | | blame: use a size_t for the bufferEdward Thomson2019-11-222-3/+10
| | | | | |
| * | | | | filestamp: use `uint64_t` for object sizeEdward Thomson2019-11-222-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a signed type (`off_t`) use an unsigned `uint64_t` for the size of the files.
| * | | | | odb: use `git_object_size_t` for object sizeEdward Thomson2019-11-222-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
| * | | | | futils_filesize: use `uint64_t` for object sizeEdward Thomson2019-11-225-22/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a signed type (`off_t`) use `uint64_t` for the maximum size of files.
| * | | | | blob: use `git_object_size_t` for object sizeEdward Thomson2019-11-2215-23/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
| * | | | | odb: use `git_object_size_t` for object sizeEdward Thomson2019-11-222-6/+6
| | |/ / / | |/| | | | | | | | | | | | | | | | | | Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
* | | | | valgrind: add valgrind hints in OpenSSLEdward Thomson2019-11-242-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide usage hints to valgrind. We trust the data coming back from OpenSSL to have been properly initialized. (And if it has not, it's an OpenSSL bug, not a libgit2 bug.) We previously took the `VALGRIND` option to CMake as a hint to disable mmap. Remove that; it's broken. Now use it to pass on the `VALGRIND` definition so that sources can provide valgrind hints.
* | | | | valgrind: add suppressions for undefined useEdward Thomson2019-11-241-0/+4
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | valgrind will warn that OpenSSL will use undefined data in connect/read when talking to certain other TLS stacks. Thankfully, this only seems to occur when gcc is the compiler, so hopefully valgrind is just misunderstanding an optimization. Regardless, suppress this warning.
* | | | Merge pull request #5303 from pks-t/pks/patch-path-in-body-onlyEdward Thomson2019-11-161-5/+11
|\ \ \ \ | | | | | | | | | | patch_parse: use paths from "---"/"+++" lines for binary patches
| * | | | patch_parse: use paths from "---"/"+++" lines for binary patchesPatrick Steinhardt2019-11-101-5/+11
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For some patches, it is not possible to derive the old and new file paths from the patch header's first line, most importantly when they contain spaces. In such a case, we derive both paths from the "---" and "+++" lines, which allow for non-ambiguous parsing. We fail to use these paths when parsing binary patches without data, though, as we always expect the header paths to be filled in. Fix this by using the "---"/"+++" paths by default and only fall back to header paths if they aren't set. If neither of those paths are set, we just return an error. Add two tests to verify this behaviour, one of which would have previously caused a segfault.
* | | | Merge pull request #5285 from pcpthm/winhttp-308Edward Thomson2019-11-161-1/+6
|\ \ \ \ | | | | | | | | | | Follow 308 redirect in WinHTTP transport
| * | | | Follow 308 redirect in WinHTTP transportpcpthm2019-10-261-1/+6
| | | | |
* | | | | Merge pull request #5302 from tiennou/fix/p_lstat-errnoEdward Thomson2019-11-161-0/+1
|\ \ \ \ \ | |_|/ / / |/| | | | fileops: correct error return on p_lstat failures when mkdir
| * | | | fileops: correct error return on p_lstat failures when mkdirEtienne Samson2019-11-091-0/+1
| | |/ / | |/| | | | | | | | | | | | | | | | | | IIRC I got a strange return once from lstat, which translated in a weird error class/message being reported. As a safety measure, enforce a -1 return in that case.
* | | | Merge pull request #5299 from pks-t/pks/config-mem-snapshotsEdward Thomson2019-11-061-9/+1
|\ \ \ \ | | | | | | | | | | config_mem: implement support for snapshots
| * | | | config_mem: implement support for snapshotsPatrick Steinhardt2019-11-061-9/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar as in commit dadbb33b6 (Fix crash if snapshotting a config_snapshot, 2019-11-01), let's implement snapshots for in-memory configuration entries. As this deletes more code than it adds, it doesn't make any sense to not allow for this and allows users to treat config backends mostly the same.
* | | | patch_parse: fix segfault when header path contains whitespace onlyPatrick Steinhardt2019-11-051-12/+9
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When parsing header paths from a patch, we reject any patches with empty paths as malformed patches. We perform the check whether a path is empty before sanitizing it, though, which may lead to a path becoming empty after the check, e.g. if we have trimmed whitespace. This may lead to a segfault later when any part of our patching logic actually references such a path, which may then be a `NULL` pointer. Fix the issue by performing the check after sanitizing. Add tests to catch the issue as they would have produced a segfault previosuly.
* | | config_file: keep reference to config entries when creating iteratorPatrick Steinhardt2019-11-051-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating a configuration file iterator, then we first refresh the backend and then afterwards duplicate all refreshed configuration entries into the iterator in order to avoid seeing any concurrent modifications of the entries while iterating. The duplication of entries is not guarded, though, as we do not increase the refcount of the entries that we duplicate right now. This opens us up for a race, as another thread may concurrently refresh the repository configuration and thus swap out the current set of entries. As we didn't increase the refcount, this may lead to the entries being free'd while we iterate over them in the first thread. Fix the issue by properly handling the lifecycle of the backend's entries via `config_file_entries_take` and `git_config_entries_free`, respectively.