| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | |
| | |
| | |
| | |
| | |
| | | |
The submodule code has grown out-of-date regarding its coding style.
Update `git_submodule_reload` and `git_submodule_sync` to more closely
resemble what the rest of our code base uses.
|
| | | |
|
| | |
| | |
| | |
| | |
| | | |
git_submodule_sync should resolve submodule before writing to .git/config
to have the same behavior as git_submodule_init, which does the right thing.
|
|\ \ \
| | | |
| | | | |
http: avoid generating double slashes in url
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Prior to this change, given a remote url with a trailing slash,
such as http://localhost/a/, service requests would contain a
double slash: http://localhost/a//info/refs?service=git-receive-pack.
Detect and prevent that.
Updates #5321
|
|\ \ \
| | | |
| | | | |
patch_parse: fix undefined behaviour due to arithmetic on NULL pointers
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Doing arithmetic with NULL pointers is undefined behaviour in the C
standard. We do so regardless when parsing patches, as we happily add a
potential prefix length to prefixed paths. While this works out just
fine as the prefix length is always equal to zero in these cases, thus
resulting in another NULL pointer, it still is undefined behaviour and
was pointed out to us by OSSfuzz.
Fix the issue by checking whether paths are NULL, avoiding the
arithmetic if they are.
|
|\ \ \ \
| |_|/ /
|/| | | |
smart_pkt: fix overflow resulting in OOB read/write of one byte
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When parsing OK packets, we copy any information after the initial "ok "
prefix into the resulting packet. As newlines act as packet boundaries,
we also strip the trailing newline if there is any. We do not check
whether there is any data left after the initial "ok " prefix though,
which leads to a pointer overflow in that case as `len == 0`:
if (line[len - 1] == '\n')
--len;
This out-of-bounds read is a rather useless gadget, as we can only
deduce whether at some offset there is a newline character. In case
there accidentally is one, we overflow `len` to `SIZE_MAX` and then
write a NUL byte into an array indexed by it:
pkt->ref[len] = '\0';
Again, this doesn't seem like something that's possible to be exploited
in any meaningful way, but it may surely lead to inconsistencies or DoS.
Fix the issue by checking whether there is any trailing data after the
packet prefix.
|
|\ \ \
| |/ /
|/| | |
branch: clarify documentation around branches
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As git_reference__name will reallocate storage to account for longer
names (it's actually allocator-dependent), it will cause all existing
pointers to the old object to become dangling, as they now point to
freed memory.
Fix the issue by renaming to a more descriptive name, and pass a pointer
to the actual reference that can safely be invalidated if the realloc
succeeds.
|
| | | |
|
|\ \ \
| | | |
| | | | |
attr: Update definition of binary macro
|
| |/ / |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Windows/DOS only supports drive letters that are alpha characters A-Z.
However, you can `subst` any one-character as a drive letter, including
numbers or even emoji. Test that we can identify emoji as drive
letters.
|
| | |
| | |
| | |
| | |
| | | |
Enable core.protectNTFS by default everywhere and in every codepath, not
just on checkout.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The function `only_spaces_and_dots` used to detect the end of the
filename on win32. Now we look at spaces and dots _before_ the end of
the string _or_ a `:` character, which would signify a win32 alternate
data stream.
Thus, rename the function `ntfs_end_of_filename` to indicate that it
detects the (virtual) end of a filename, that any further characters
would be elided to the given path.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We just safe-guarded `.git` against NTFS Alternate Data Stream-related
attack vectors, and now it is time to do the same for `.gitmodules`.
Note: In the added regression test, we refrain from verifying all kinds
of variations between short names and NTFS Alternate Data Streams: as
the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it
is enough to test one in order to know that all of them are guarded
against.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
A little-known feature of NTFS is that it offers to store metadata in
so-called "Alternate Data Streams" (inspired by Apple's "resource
forks") that are copied together with the file they are associated with.
These Alternate Data Streams can be accessed via `<file name>:<stream
name>:<stream type>`.
Directories, too, have Alternate Data Streams, and they even have a
default stream type `$INDEX_ALLOCATION`. Which means that `abc/` and
`abc::$INDEX_ALLOCATION/` are actually equivalent.
This is of course another attack vector on the Git directory that we
definitely want to prevent.
On Windows, we already do this incidentally, by disallowing colons in
file/directory names.
While it looks as if files'/directories' Alternate Data Streams are not
accessible in the Windows Subsystem for Linux, and neither via
CIFS/SMB-mounted network shares in Linux, it _is_ possible to access
them on SMB-mounted network shares on macOS.
Therefore, let's go the extra mile and prevent this particular attack
_everywhere_. To keep things simple, let's just disallow *any* Alternate
Data Stream of `.git`.
This is libgit2's variant of CVE-2019-1352.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The Windows Subsystem for Linux (WSL) is getting increasingly popular,
in particular because it makes it _so_ easy to run Linux software on
Windows' files, via the auto-mounted Windows drives (`C:\` is mapped to
`/mnt/c/`, no need to set that up manually).
Unfortunately, files/directories on the Windows drives can be accessed
via their _short names_, if that feature is enabled (which it is on the
`C:` drive by default).
Which means that we have to safeguard even our Linux users against the
short name attacks.
Further, while the default options of CIFS/SMB-mounts seem to disallow
accessing files on network shares via their short names on Linux/macOS,
it _is_ possible to do so with the right options.
So let's just safe-guard against short name attacks _everywhere_.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
|
| |
| |
| |
| |
| |
| | |
"unsigned long *" to "size_t *"
Signed-off-by: Sven Strickroth <email@cs-ware.de>
|
|\ \
| | |
| | | |
global: convert to fiber-local storage to fix exit races
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
On Windows platforms, we automatically clean up the thread-local storage
upon detaching a thread via `DllMain()`. The thing is that this happens
for every thread of applications that link against the libgit2 DLL, even
those that don't have anything to do with libgit2 itself. As a result,
we cannot assume that these unsuspecting threads make use of our
`git_libgit2_init()` and `git_libgit2_shutdow()` reference counting,
which may lead to racy situations:
Thread 1 Thread 2
git_libgit2_shutdown()
DllMain(DETACH_THREAD)
git__free_tls_data()
git_atomic_dec() == 0
git__free_tls_data()
TlsFree(_tls_index)
TlsGetValue(_tls_index)
Due to the second thread never having executed `git_libgit2_init()`, the
first thread will clean up TLS data and as a result also free the
`_tls_index` variable. When detaching the second thread, we
unconditionally access the now-free'd `_tls_index` variable, which is
obviously not going to work out well.
Fix the issue by converting the code to use fiber-local storage instead
of thread-local storage. While FLS will behave the exact same as TLS if
no fibers are in use, it does allow us to specify a destructor similar
to the one that is accepted by pthread_key_create(3P). Like this, we do
not have to manually free indices anymore, but will let the FLS handle
calling the destructor. This allows us to get rid of `DllMain()`
completely, as we only used it to keep track of when threads were
exiting and results in an overall simplification of TLS cleanup.
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The patch format for binary files is a simple Base85 encoding with a
length byte as prefix that encodes the current line's length. For each
line, we thus check whether the line's actual length matches its
expected length in order to not faultily apply a truncated patch. This
also acts as a check to verify that we're not reading outside of the
line's string:
if (encoded_len > ctx->parse_ctx.line_len - 1) {
error = git_parse_err(...);
goto done;
}
There is the possibility for an integer underflow, though. Given a line
with a single prefix byte, only, `line_len` will be zero when reaching
this check. As a result, subtracting one from that will result in an
integer underflow, causing us to assume that there's a wealth of bytes
available later on. Naturally, this may result in an out-of-bounds read.
Fix the issue by checking both `encoded_len` and `line_len` for a
non-zero value. The binary format doesn't make use of zero-length lines
anyway, so we need to know that there are both encoded bytes and
remaining characters available at all.
This patch also adds a test that works based on the last error message.
Checking error messages is usually too tightly coupled, but in fact
parsing the patch failed even before the change. Thus the only
possibility is to use e.g. Valgrind, but that'd result in us not
catching issues when run without Valgrind. As a result, using the error
message is considered a viable tradeoff as we know that we didn't start
decoding Base85 in the first place.
|
|\ \
| | |
| | | |
diff: complete support for git patchid
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Current implementation of patchid is not computing a correct patchid
when given a patch where, for example, a new file is added or removed.
Some more corner cases need to be handled to have same behavior as git
patch-id command.
Add some more tests to cover those corner cases.
Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When not shown binary data is added or removed in a patch, patch parser
is currently returning 'error -1 - corrupt git binary header at line 4'.
Fix it by correctly handling case where binary data is added/removed.
Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Git is generating patch-id using a stripped down version of a patch
where hunk header and index information are not present.
Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a new 'print_index' flag to let the caller decide whether or not
'index <oid>..<oid>' should be printed.
Since patch id needs not to have index when hashing a patch, it will be
useful soon.
Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
|
|\ \ \
| | | |
| | | | |
Memory optimizations for config entries
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Multivars are configuration entries that have many values for the same
name; we can thus micro-optimize this case by just retaining the name of
the first configuration entry and freeing all the others, letting them
point to the string of the first entry.
The attached test case is an extreme example that demonstrates this. It
contains a section name that is approximately 500kB in size with 20.000
entries "a=b". Without the optimization, this would require at least
20000*500kB bytes, which is around 10GB. With this patch, it only
requires 500kB+20000*1B=20500kB.
The obvious culprit here is the section header, which we repeatedly
include in each of the configuration entry's names. This makes it very
easier for an adversary to provide a small configuration file that
disproportionally blows up in memory during processing and is thus a
feasible way for a denial-of-service attack. Unfortunately, we cannot
fix the root cause by e.g. having a separate "section" field that may
easily be deduplicated due to the `git_config_entry` structure being
part of our public API. So this micro-optimization is the best we can do
for now.
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Whenever adding a configuration entry to the config entries structure,
we allocate two list heads:
- The first list head is added to the global list of config entries
in order to be able to iterate over configuration entries in the
order they were originally added.
- The second list head is added to the map of entries in order to
efficiently look up an entry by its name. If no entry with the
same name exists in the map, then we add the new entry to the map
directly. Otherwise, we append the new entry's list head to the
pre-existing entry's list in order to keep track of multivars.
While the former usecase is perfectly sound, the second usecase can be
optimized. The only reason why we keep track of multivar entries in
another separate list is to be able to determine whether an entry is
unique or not by seeing whether its `next` pointer is set. So we keep
track of a complete list of multivar entries just to have a single bit
of information of whether it has other multivar entries with the same
entry name.
We can completely get rid of this secondary list by just adding a
`first` field to the list structure itself. When executing
`git_config_entries_append`, we will then simply check whether the
configuration map already has an entry with the same name -- if so, we
will set the `first` to zero to indicate that it is not the initial
entry anymore. Instead of a second list head in the map, we can thus now
directly store the list head of the first global list inside of the map
and just refer to that bit.
Note that the more obvious solution would be to store a `unique` field
instead of a `first` field. But as we will only ever inspect the `first`
field of the _last_ entry that has been moved into the map, these are
semantically equivalent in that case.
Having a `first` field also allows for a minor optimization: for
multivar values, we can free the `name` field of all entries that are
_not_ first and have them point to the name of the first entry instead.
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Some functions which are only used in "config_entries.c" are not marked
as static, which is being fixed by this very commit.
|
|\ \ \ \
| | | | |
| | | | | |
ssh: include sha256 host key hash when supported
|
| | |/ /
| |/| | |
|
|\ \ \ \
| | | | |
| | | | | |
Various examples shape-ups
|
| | | | | |
|
|\ \ \ \ \
| | | | | |
| | | | | | |
Move `git_off_t` to `git_object_size_t`
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Prefer `off64_t` internally.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Use int64_t internally for type visibility.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Prefer `off64_t` to `git_off_t` internally for visibility.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Prefer `off64_t` to `git_off_t` for internal visibility.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
64 bit types are always 64 bit.
|
| | | | | | |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Instead of using a signed type (`off_t`) use an unsigned `uint64_t` for
the size of the files.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Instead of using a signed type (`off_t`) use `uint64_t` for the maximum
size of files.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
|
| | |/ / /
| |/| | |
| | | | |
| | | | |
| | | | | |
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Provide usage hints to valgrind. We trust the data coming back from
OpenSSL to have been properly initialized. (And if it has not, it's an
OpenSSL bug, not a libgit2 bug.)
We previously took the `VALGRIND` option to CMake as a hint to disable
mmap. Remove that; it's broken. Now use it to pass on the `VALGRIND`
definition so that sources can provide valgrind hints.
|