summaryrefslogtreecommitdiff
path: root/tests/patch/parse.c
Commit message (Collapse)AuthorAgeFilesLines
* patch_parse: fix out-of-bounds reads caused by integer underflowPatrick Steinhardt2019-11-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch format for binary files is a simple Base85 encoding with a length byte as prefix that encodes the current line's length. For each line, we thus check whether the line's actual length matches its expected length in order to not faultily apply a truncated patch. This also acts as a check to verify that we're not reading outside of the line's string: if (encoded_len > ctx->parse_ctx.line_len - 1) { error = git_parse_err(...); goto done; } There is the possibility for an integer underflow, though. Given a line with a single prefix byte, only, `line_len` will be zero when reaching this check. As a result, subtracting one from that will result in an integer underflow, causing us to assume that there's a wealth of bytes available later on. Naturally, this may result in an out-of-bounds read. Fix the issue by checking both `encoded_len` and `line_len` for a non-zero value. The binary format doesn't make use of zero-length lines anyway, so we need to know that there are both encoded bytes and remaining characters available at all. This patch also adds a test that works based on the last error message. Checking error messages is usually too tightly coupled, but in fact parsing the patch failed even before the change. Thus the only possibility is to use e.g. Valgrind, but that'd result in us not catching issues when run without Valgrind. As a result, using the error message is considered a viable tradeoff as we know that we didn't start decoding Base85 in the first place.
* patch_parse: use paths from "---"/"+++" lines for binary patchesPatrick Steinhardt2019-11-101-0/+14
| | | | | | | | | | | | | | For some patches, it is not possible to derive the old and new file paths from the patch header's first line, most importantly when they contain spaces. In such a case, we derive both paths from the "---" and "+++" lines, which allow for non-ambiguous parsing. We fail to use these paths when parsing binary patches without data, though, as we always expect the header paths to be filled in. Fix this by using the "---"/"+++" paths by default and only fall back to header paths if they aren't set. If neither of those paths are set, we just return an error. Add two tests to verify this behaviour, one of which would have previously caused a segfault.
* patch_parse: fix segfault when header path contains whitespace onlyPatrick Steinhardt2019-11-051-0/+14
| | | | | | | | | | | | When parsing header paths from a patch, we reject any patches with empty paths as malformed patches. We perform the check whether a path is empty before sanitizing it, though, which may lead to a path becoming empty after the check, e.g. if we have trimmed whitespace. This may lead to a segfault later when any part of our patching logic actually references such a path, which may then be a `NULL` pointer. Fix the issue by performing the check after sanitizing. Add tests to catch the issue as they would have produced a segfault previosuly.
* patch_parse: detect overflow when calculating old/new line positionPatrick Steinhardt2019-10-211-0/+7
| | | | | | | | | | | When the patch contains lines close to INT_MAX, then it may happen that we end up with an integer overflow when calculating the line of the current diff hunk. Reject such patches as unreasonable to avoid the integer overflow. As the calculation is performed on integers, we introduce two new helpers `git__add_int_overflow` and `git__sub_int_overflow` that perform the integer overflow check in a generic way.
* patch_parse: fix out-of-bounds read with No-NL linesPatrick Steinhardt2019-10-191-0/+13
| | | | | | | | | | | | | | We've got two locations where we copy lines into the patch. The first one is when copying normal " ", "-" or "+" lines, while the second location gets executed when we copy "\ No newline at end of file" lines. While the first one correctly uses `git__strndup` to copy only until the newline, the other one doesn't. Thus, if the line occurs at the end of the patch and if there is no terminating NUL character, then it may result in an out-of-bounds read. Fix the issue by using `git__strndup`, as was already done in the other location. Furthermore, add allocation checks to both locations to detect out-of-memory situations.
* patch_parse: reject empty path namesPatrick Steinhardt2019-10-191-0/+7
| | | | | | | | | When parsing patch headers, we currently accept empty path names just fine, e.g. a line "--- \n" would be parsed as the empty filename. This is not a valid patch format and may cause `NULL` pointer accesses at a later place as `git_buf_detach` will return `NULL` in that case. Reject such patches as malformed with a nice error message.
* patch_parse: reject patches with multiple old/new pathsPatrick Steinhardt2019-10-191-0/+6
| | | | | | | | | | | | It's currently possible to have patches with multiple old path name headers. As we didn't check for this case, this resulted in a memory leak when overwriting the old old path with the new old path because we simply discarded the old pointer. Instead of fixing this by free'ing the old pointer, we should reject such patches altogether. It doesn't make any sense for the "---" or "+++" markers to occur multiple times within a patch n the first place. This also implicitly fixes the memory leak.
* Merge pull request #5159 from pks-t/pks/patch-parse-old-missing-nlEdward Thomson2019-07-201-3/+23
|\ | | | | patch_parse: handle missing newline indicator in old file
| * patch_parse: ensure valid patch output with EOFNLErik Aigner2019-07-111-9/+15
| |
| * patch_parse: handle missing newline indicator in old filePatrick Steinhardt2019-07-111-0/+14
| | | | | | | | | | | | | | | | | | | | | | When either the old or new file contents have no newline at the end of the file, then git-diff(1) will print out a "\ No newline at end of file" indicator. While we do correctly handle this in the case where the new file has this indcator, we fail to parse patches where the old file is missing a newline at EOF. Fix this bug by handling and missing newline indicators in the old file. Add tests to verify that we can parse such files.
* | patch_parse: do not depend on parsed buffer's lifetimePatrick Steinhardt2019-07-051-0/+20
|/ | | | | | | | | | | When parsing a patch from a buffer, we let the patch lines point into the original buffer. While this is efficient use of resources, this also ties the lifetime of the parsed patch to the parsed buffer. As this behaviour is not documented anywhere in our API it is very surprising to its users. Untie the lifetime by duplicating the lines into the parsed patch. Add a test that verifies that lifetimes are indeed independent of each other.
* patch_parse: allow parsing ambiguous patch headersPatrick Steinhardt2017-11-111-0/+6
| | | | | | | | | | | | | | | The git patch format allows for having unquoted paths with whitespaces inside. This format becomes ambiguous to parse, e.g. in the following example: diff --git a/file b/with spaces.txt b/file b/with spaces.txt While we cannot parse this in a correct way, we can instead use the "---" and "+++" lines to retrieve the file names, as the path is not followed by anything here but spans the complete remaining line. Because of this, we can simply bail outwhen parsing the "diff --git" header here without an actual error and then proceed to just take the paths from the other headers.
* patch: differentiate not found and invalid patchesEdward Thomson2016-05-261-6/+75
|
* git_patch_parse_ctx: refcount the contextEdward Thomson2016-05-261-0/+1
|
* patch: `git_patch_from_patchfile` -> `git_patch_from_buffer`Edward Thomson2016-05-261-1/+2
|
* Introduce git_patch_options, handle prefixesEdward Thomson2016-05-261-3/+5
| | | | | Handle prefixes (in terms of number of path components) for patch parsing.
* patch_parse: ensure we can parse a patchEdward Thomson2016-05-261-0/+31