summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* | | | | Merge pull request #5138 from libgit2/ethomson/cvarPatrick Steinhardt2019-07-1928-194/+209
|\ \ \ \ \ | | | | | | | | | | | | configuration: cvar -> configmap
| * | | | | configuration: deprecate git_cvar safelyethomson/cvarEdward Thomson2019-07-181-0/+14
| | | | | |
| * | | | | configuration: cvar -> configmapPatrick Steinhardt2019-07-1827-194/+195
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
* | | | | | Merge pull request #5172 from bk2204/cache-efficient-evictionPatrick Steinhardt2019-07-181-1/+4
|\ \ \ \ \ \ | |_|/ / / / |/| | | | | Evict cache items more efficiently
| * | | | | cache: evict items more efficientlybrian m. carlson2019-07-171-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When our object cache is full, we pick eight items (or the whole cache, if there are fewer) and evict them. For small cache sizes, this is fine, but when we're dealing with a large number of objects, we can repeatedly exhaust the cache and spend a large amount of time in git_oidmap_iterate trying to find items to evict. Instead, let's assume that if the cache gets full, we have a large number of objects that we're handling, and be more aggressive about evicting items. Let's remove one item for every 2048 items, but not less than 8. This causes us to scale our evictions in proportion to the size of the cache and significantly reduces the time we spend in git_oidmap_iterate. Before this change, a full pack of all the non-blob objects in the Linux repository took in excess of 30 minutes and spent 62.3% of total runtime in odb_read_1 and its children, and 44.3% of the time in git_oidmap_iterate. With this change, the same operation now takes 14 minutes and 44 seconds, and odb_read_1 accounts for only 35.9% of total time, whereas git_oidmap_iterate consists of 6.2%. Note that we do spend a little more time inflating objects and a decent amount more time in memcmp. However, overall, the time taken is significantly improved, and time in pack building is now dominated by git_delta_create_from_index (33.7%), which is what we would expect.
* | | | | | Merge pull request #5175 from pks-t/pks/clar-fix-suite-countPatrick Steinhardt2019-07-181-1/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | clar: fix suite count
| * | | | | | tests: fix undercounting of suitesPatrick Steinhardt2019-07-181-1/+1
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of data variants for suites, we started undercounting the number of suites as we didn't account for those that were executed twice. This was then adjusted to count the number of initializers instead, but this fails to account for suites without any initializers at all. Fix the suite count by counting either the number of initializers or, if there is no initializer, count it as a single suite, only.
* | | | | | Merge pull request #5163 from csware/gitignore-vs2017Patrick Steinhardt2019-07-181-0/+2
|\ \ \ \ \ \ | |_|/ / / / |/| | | | | Ignore VS2017 specific files and folders
| * | | | | Ignore VS2017 specific files and foldersSven Strickroth2019-07-121-0/+2
| |/ / / / | | | | | | | | | | | | | | | Signed-off-by: Sven Strickroth <email@cs-ware.de>
* | | | | Merge pull request #5156 from pks-t/pks/attr-macros-in-subdirPatrick Steinhardt2019-07-189-171/+365
|\ \ \ \ \ | |_|/ / / |/| | | | gitattributes: ignore macros defined in subdirectories
| * | | | attr_file: ignore macros defined in subdirectoriesPatrick Steinhardt2019-07-128-40/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now, we are unconditionally applying all macros found in a gitatttributes file. But quoting gitattributes(5): Custom macro attributes can be defined only in top-level gitattributes files ($GIT_DIR/info/attributes, the .gitattributes file at the top level of the working tree, or the global or system-wide gitattributes files), not in .gitattributes files in working tree subdirectories. The built-in macro attribute "binary" is equivalent to: So gitattribute files in subdirectories of the working tree may explicitly _not_ contain macro definitions, but we do not currently enforce this limitation. This patch introduces a new parameter to the gitattributes parser that tells whether macros are allowed in the current file or not. If set to `false`, we will still parse macros, but silently ignore them instead of adding them to the list of defined macros. Update all callers to correctly determine whether the to-be-parsed file may contain macros or not. Most importantly, when walking up the directory hierarchy, we will only set it to `true` once it reaches the root directory of the repo itself. Add a test that verifies that we are indeed not applying macros from subdirectories. Previous to these changes, the test would've failed.
| * | | | attr_file: refactor `parse_buffer` functionPatrick Steinhardt2019-07-121-32/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gitattributes code is one of our oldest and most-untouched codebases in libgit2, and as such its code style doesn't quite match our current best practices. Refactor the function `git_attr_file__parse_buffer` to better match them.
| * | | | attr_file: refactor `load_standalone` functionPatrick Steinhardt2019-07-121-18/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gitattributes code is one of our oldest and most-untouched codebases in libgit2, and as such its code style doesn't quite match our current best practices. Refactor the function `git_attr_file__lookup_standalone` to better match them.
| * | | | attrcache: fix memory leak if inserting invalid macro to cachePatrick Steinhardt2019-07-121-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A macro without any assignments is considered an invalid macro by the attributes cache and is thus not getting added to the macro map at all. But as `git_attr_cache__insert_macro` returns success with neither free'ing nor adopting the macro into its map, this will cause a memory leak. Fix this by freeing the macro in the function if it's not going to be added. This is perfectly fine to do, as callers assume that the attrcache will have the macro adopted on success anyway.
| * | | | attrcache: fix multiple memory leaks when inserting macrosPatrick Steinhardt2019-07-121-10/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function `git_attr_cache__insert_macro` is responsible for adopting macros in the per-repo macro cache. When adding a macro that replaces an already existing macro (e.g. because of re-parsing gitattributes files), then we do not free the previous macro and thus cause a memory leak. Fix this leak by first checking if the cache already has a macro defined with the same name. If so, free it before replacing the cache entry with the new instance.
| * | | | tests: attr: verify that in-memory macros are respectedPatrick Steinhardt2019-07-121-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | Add some tests to ensure that the `git_attr_add_macro` function works as expected.
| * | | | tests: attr: implement tests to verify attribute rewriting behaviourPatrick Steinhardt2019-07-122-0/+85
| | | | | | | | | | | | | | | | | | | | | | | | | Implement some tests that verify that we are correctly updating gitattributes when rewriting or unlinking the corresponding files.
| * | | | tests: attr: extract macro tests into their own suitePatrick Steinhardt2019-07-122-70/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As macros are a specific functionality in the gitattributes code, it makes sense to extract them into their own test suite, too. This makes finding macro-related tests easier.
* | | | | Merge pull request #5168 from tiennou/clar/fix-data-suite-countPatrick Steinhardt2019-07-181-1/+1
|\ \ \ \ \ | | | | | | | | | | | | clar: correctly account for "data" suites when counting
| * | | | | clar: correctly account for "data" suites when countingEtienne Samson2019-07-161-1/+1
| | |/ / / | |/| | | | | | | | | | | | | Failing to do that makes clar miss the last of the suites, as all duplicated "data" would have not been accounted for.
* | | | | Merge pull request #5170 from bk2204/packbuilder-efficient-reallocEdward Thomson2019-07-171-1/+1
|\ \ \ \ \ | |/ / / / |/| | | | Allocate memory more efficiently when packing objects
| * | | | pack-objects: allocate memory more efficientlybrian m. carlson2019-07-171-1/+1
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The packbuilder code allocates memory in chunks. When it needs to allocate, it tries to add 1024 to the number of objects and multiply by 3/2. However, it actually multiplies by 1 instead, since it performs an integral division in the expression "3 / 2" and only then multiplies by the increased number of objects. The current behavior causes the code to waste massive amounts of time copying memory when it reallocates, causing inserting all non-blob objects in the Linux repository into a new pack to take some indeterminate time greater than 5 minutes instead of 52 seconds. Correct this error by first dividing by two, and only then multiplying by 3. We still check for overflow for the multiplication, which is the only part that can overflow. This appears to be the only place in the code base which has this problem.
* | | | Merge pull request #5131 from pks-t/pks/fileops-mkdir-in-rootPatrick Steinhardt2019-07-122-3/+46
|\ \ \ \ | |/ / / |/| | | fileops: fix creation of directory in filesystem root
| * | | fileops: fix creation of directory in filesystem rootPatrick Steinhardt2019-07-112-3/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 45f24e787 (git_repository_init: stop traversing at windows root, 2019-04-12), we have fixed `git_futils_mkdir` to correctly handle the case where we create a directory in Windows-style filesystem roots like "C:\repo". The problem here is an off-by-one: previously, to that commit, we've been checking wether the parent directory's length is equal to the root directory's length incremented by one. When we call the function with "/example", then the parent directory's length ("/") is 1, but the root directory offset is 0 as the path is directly rooted without a drive prefix. This resulted in `1 == 0 + 1`, which was true. With the change, we've stopped incrementing the root directory length, and thus now compare `1 <= 0`, which is false. The previous way of doing it was kind of finicky any non-obvious, which is also why the error was introduced. So instead of just re-adding the increment, let's explicitly add a condition that aborts finding the parent if the current parent path is "/". Making this change causes Azure Pipelines to fail the testcase repo::init::nonexistent_paths on Unix-based systems. This is because we have just fixed creating directories in the filesystem root, which previously didn't work. As Docker-based tests are running as root user, we are thus able to create the non-existing path and will now succeed to create the repository that was expected to actually fail. Let's split this up into three different tests: - A test to verify that we do not create repos in a non-existing parent directoy if the flag `GIT_REPOSITORY_INIT_MKPATH` is not set. - A test to verify that we fail if the root directory does not exist. As there is a common root directory on Unix-based systems that always exist, we can only test for this on Windows-based systems. - A test to verify that we fail if trying to create a repository in an unwriteable parent directory. We can only test this if not running tests as root user, as CAP_DAC_OVERRIDE will cause us to ignore permissions when creating files.
* | | | Merge pull request #5160 from pks-t/pks/win32-fuzzersPatrick Steinhardt2019-07-125-47/+45
|\ \ \ \ | |/ / / |/| | | win32: fix fuzzers and have CI build them
| * | | ci: build fuzzers on Powershell based build jobsPatrick Steinhardt2019-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | In order to guarantee that our fuzzers build just fine on the Windows platform, let's enable building fuzzers on all Powershell-based builds.
| * | | fuzzers: clean up header includesPatrick Steinhardt2019-07-054-18/+2
| | | | | | | | | | | | | | | | | | | | | | | | There's multiple headers included in our fuzzers that aren't required at all. Furthermore, some of them are not available on Win32, causing builds to fail. Remove them to fix this.
| * | | fuzzers: use `git_buf_printf` instead of `snprintf`Patrick Steinhardt2019-07-051-16/+15
| | | | | | | | | | | | | | | | | | | | | | | | The `snprintf` function does not exist on Win32, it only has `_snprintf_s` available. Let's just avoid any cross-platform hassle and use our own `git_buf` functionality instead.
| * | | fuzzers: use POSIX emulation layer to unlink filesPatrick Steinhardt2019-07-051-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | Use `p_unlink` instead of `unlink` to remove the generated packfiles in our packfile fuzzer. Like this, we do not have to worry about using proper includes that are known on all platforms, especially Win32.
| * | | fuzzers: make printf formatters cross-platform compatiblePatrick Steinhardt2019-07-051-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The `printf` formatters in our standalone fuzzing driver are currently using the "%m" specifier, which is a GNU extension that prints the error message for the error code in `errno`. As we're using libgit2 functions in both cases anyway, let's just use `git_error_last` instead to make this valid on all platforms.
| * | | fuzzers: implement `mkdtemp` alternative for Win32Patrick Steinhardt2019-07-051-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | The `mkdtemp` function is not available on Windows, so our download_refs fuzzer will fail to compile on Windows. Provide an alternative implementation to fix it.
* | | | Merge pull request #5134 from pks-t/pks/config-parser-separationPatrick Steinhardt2019-07-115-96/+98
|\ \ \ \ | |_|_|/ |/| | | Config parser separation
| * | | config_parse: provide parser init and dispose functionsPatrick Steinhardt2019-07-115-12/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now, all configuration file backends are expected to directly mess with the configuration parser's internals in order to set it up. Let's avoid doing that by implementing both a `git_config_parser_init` and `git_config_parser_dispose` function to clearly define the interface between configuration backends and the parser. Ideally, we would make the `git_config_parser` structure definition private to its implementation. But as that would require an additional memory allocation that was not required before we just live with it being visible to others.
| * | | config_file: refactor error handling in `config_write`Patrick Steinhardt2019-07-111-40/+24
| | | | | | | | | | | | | | | | | | | | | | | | Error handling in `config_write` is rather convoluted and does not match our current code style. Refactor it to make it easier to understand.
| * | | config_file: internalize `git_config_file` structPatrick Steinhardt2019-07-112-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the previous commits, we have finally separated the config parsing logic from the specific configuration file backend. Due to that, we can now move the `git_config_file` structure into the config file backend's implementation so that no other code may accidentally start using it again. Furthermore, we rename the structure to `diskfile` to make it obvious that it is internal, only, and to unify it with naming scheme of the other diskfile structures.
| * | | config_parse: remove use of `git_config_file`Patrick Steinhardt2019-07-114-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The config parser code needs to keep track of the current parsed file's name so that we are able to provide proper error messages to the user. Right now, we do that by storing a `git_config_file` in the parser structure, but as that is a specific backend and the parser aims to be generic, it is a layering violation. Switch over to use a simple string to fix that.
| * | | config_file: embed file in diskfile parse dataPatrick Steinhardt2019-07-111-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The config file code needs to keep track of the actual `git_config_file` structure, as it not only contains the path of the current configuration file, but it also keeps tracks of all includes of that file. Right now, we keep track of that structure via the `git_config_parser`, but as that's supposed to be a backend generic implementation of configuration parsing it's a layering violation to have it in there. Switch over the config file backend to use its own config file structure that's embedded in the backend parse data. This allows us to switch over the generic config parser to avoid using the `git_config_file` structure.
| * | | config_parse: rename `data` parameter to `payload` for clarityPatrick Steinhardt2019-07-112-10/+10
|/ / / | | | | | | | | | | | | | | | | | | By convention, parameters that get passed to callbacks are usually named `payload` in our codebase. Rename the `data` parameters in the configuration parser callbacks to `payload` to avoid confusion.
* | | Merge pull request #5132 from pks-t/pks/config-stat-cachePatrick Steinhardt2019-07-118-52/+184
|\ \ \ | |/ / |/| | config_file: implement stat cache to avoid repeated rehashing
| * | config_file: avoid re-reading files on writePatrick Steinhardt2019-07-111-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we rewrite the configuration file due to any of its values being modified, we call `config_refresh` to update the in-memory representation of our config file backend. This is needlessly wasteful though, as `config_refresh` will always open the on-disk representation to reads the file contents while we already know the complete file contents at this point in time as we have just written it to disk. Implement a new function `config_refresh_from_buffer` that will refresh the backend's config entries from a buffer instead of from the config file itself. Note that this will thus _not_ update the backend's timestamp, which will cause us to re-read the buffer when performing a read operation on it. But this is still an improvement as we now lazily re-read the contents, and most importantly we will avoid constantly re-reading the contents if we perform multiple write operations. The following strace demonstrates this if we're re-writing a key multiple times. It uses our config example with `config_set` changed to update the file 10 times with different keys: $ strace lg2 config x.x z |& grep '^open.*config' open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 And now with the optimization of `config_refresh_from_buffer`: $ strace lg2 config x.x z |& grep '^open.*config' open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 As can be seen, this is quite a lot of `open` calls less.
| * | config_file: split out function that sets config entriesPatrick Steinhardt2019-07-111-22/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | Updating a config file backend's config entries is a bit more involved, as it requires clearing of the old config entries as well as handling locking correctly. As we will need this functionality in a future patch to refresh config entries from a buffer, let's extract this into its own function `config_set_entries`.
| * | config_file: split out function that reads entries from a bufferPatrick Steinhardt2019-07-111-20/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The `config_read` function currently performs both reading the on-disk config file as well as parsing the retrieved buffer contents. To optimize how we refresh our config entries from an in-memory buffer, we need to be able to directly parse buffers, though, without involving any on-disk files at all. Extract a new function `config_read_buffer` that sets up the parsing logic and then parses config entries from a buffer, only. Have `config_read` use it to avoid duplicated logic.
| * | config_file: move refresh into `write` functionPatrick Steinhardt2019-07-111-11/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are quite lazy in how we refresh our config file backend when updating any of its keys: instead of just updating our in-memory representation of the keys, we just discard the old set of keys and then re-read the config file contents from disk. This refresh currently happens separately at every callsite of `config_write`, but it is clear that we _always_ want to refresh if we have written the config file to disk. If we didn't, then we'd run around with an outdated config file backend that does not represent what we have on disk. By moving the refresh into `config_write`, we are also able to optimize the case where the config file is currently locked. Before, we would've tried to re-read the file even if we have only updated its cached contents without touching the on-disk file. Thus we'd have unnecessarily stat'd the file, even though we know that it shouldn't have been modified in the meantime due to its lock.
| * | config_file: implement stat cache to avoid repeated rehashingPatrick Steinhardt2019-07-114-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To decide whether a config file has changed, we always hash its complete contents. This is unnecessarily expensive, as well-behaved filesystems will always update stat information for files which have changed. So before computing the hash, we should first check whether the stat info has actually changed for either the configuration file or any of its includes. This avoids having to re-read the configuration file and its includes every time when we check whether it's been modified. Tracing the for-each-ref example previous to this commit, one can see that we repeatedly re-open both the repo configuration as well as the global configuration: $ strace lg2 for-each-ref |& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05290) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c051f0) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 With the change, we only do stats for those files and open them a single time, only: $ strace lg2 for-each-ref |& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffe70540d20) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540ca0) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540c80) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 The following benchmark has been performed with and without the stat cache in a best-of-ten run: ``` int lg2_repro(git_repository *repo, int argc, char **argv) { git_config *cfg; int32_t dummy; int i; UNUSED(argc); UNUSED(argv); check_lg2(git_repository_config(&cfg, repo), "Could not obtain config", NULL); for (i = 1; i < 100000; ++i) git_config_get_int32(&dummy, cfg, "foo.bar"); git_config_free(cfg); return 0; } ``` Without stat cache: $ time lg2 repro real 0m1.528s user 0m0.568s sys 0m0.944s With stat cache: $ time lg2 repro real 0m0.526s user 0m0.268s sys 0m0.258s This benchmark shows a nearly three-fold performance improvement. This change requires that we check our configuration stress tests as we're now in fact becoming more racy. If somebody is writing a configuration file at nearly the same time (there is a window of 100ns on Windows-based systems), then it might be that we realize that this file has actually changed and thus may not re-read it. This will only happen if either an external process is rewriting the configuration file or if the same process has multiple `git_config` structures pointing to the same time, where one of both is being used to write and the other one is used to read values.
| * | config: use `git_config_file` in favor of `struct config_file`Patrick Steinhardt2019-07-112-6/+6
| | |
| * | examples: implement config examplePatrick Steinhardt2019-07-113-0/+64
| | | | | | | | | | | | | | | Implement a new example that resembles git-config(1). Right now, this example can both read and set configuration keys, only.
| * | cmake: report whether we are using sub-second stat informationPatrick Steinhardt2019-07-111-0/+6
|/ / | | | | | | | | | | | | | | | | | | Depending on the platform and on build options, we may or may not build libgit2 with support for nanoseconds when using `stat` calls. It's currently unclear though whether sub-second stat information is used at all. Add feature info for this to tell at configure time whether it's being used or not.
* | Merge pull request #5143 from libgit2/ethomson/warningsPatrick Steinhardt2019-07-0514-28/+28
|\ \ | | | | | | ci: build with ENABLE_WERROR on Windows
| * | tests: trace: fix parameter type of aux callbackPatrick Steinhardt2019-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function `git_win32__stack__set_aux_cb` expects the second parameter to be a function callback of type `git_win32__stack__aux_cb_lookup`, which expects a `size_t` parameter. In our test suite trace::windows::stacktrace, we declare the callback with `unsigned int` as parameter, though, causing a compiler warning. Correct the parameter type to silence the warning.
| * | w32_stack: convert buffer length param to `size_t`Patrick Steinhardt2019-07-052-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | In both `git_win32__stack_format` and `git_win32__stack`, we handle buffer lengths via an integer variable. As we only ever pass buffer sizes to it, this should be a `size_t` though to avoid loss of precision. As we also use it to compare with other `size_t` variables, this also silences signed/unsigned comparison warnings.