path: root/fsmonitor.c
Commit message (author, date, files changed, lines -removed/+added)
* Merge branch 'jk/strvec' (Junio C Hamano, 2020-08-10, 1 file, -3/+3)
|\
    The argv_array API is useful for not just managing argv but any "vector" (NULL-terminated array) of strings, and has seen adoption to a certain degree. It has been renamed to "strvec" to reduce the barrier to adoption.

    * jk/strvec:
      strvec: rename struct fields
      strvec: drop argv_array compatibility layer
      strvec: update documention to avoid argv_array
      strvec: fix indentation in renamed calls
      strvec: convert remaining callers away from argv_array name
      strvec: convert more callers away from argv_array name
      strvec: convert builtin/ callers away from argv_array name
      quote: rename sq_dequote_to_argv_array to mention strvec
      strvec: rename files from argv-array to strvec
      argv-array: rename to strvec
      argv-array: use size_t for count and alloc
| * strvec: convert more callers away from argv_array name (Jeff King, 2020-07-28, 1 file, -3/+3)
    We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK).

    This patch converts remaining files from the first half of the alphabet, to keep the diff to a manageable size.

    The conversion was done purely mechanically with:

      git ls-files '*.c' '*.h' |
      xargs perl -i -pe '
        s/ARGV_ARRAY/STRVEC/g;
        s/argv_array/strvec/g;
      '

    and then selectively staging files with "git add '[abcdefghjkl]*'". We'll deal with any indentation/style fallouts separately.

    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Remove doubled words in various comments (Elijah Newren, 2020-07-28, 1 file, -1/+1)
|/
    Signed-off-by: Elijah Newren <newren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsmonitor: handle version 2 of the hooks that will use opaque token (Kevin Willford, 2020-01-13, 1 file, -11/+64)
    Some file monitors like watchman will use something other than a timestamp to keep better track of what changes happen in between calls to query the fsmonitor. The clockid in watchman is a string. Now that the index is storing an opaque token for the last update, the code needs to be updated to pass that opaque token to a version 2 of the fsmonitor hook.

    Because there are repos that already have version 1 of the hook and we want them to continue to work when git is updated, we need to handle both version 1 and version 2 of the hook. In order to do that, a config value, core.fsmonitorHookVersion, is being added to force what version of the hook should be used. When this is not set it will default to -1 and then the code will attempt to call version 2 of the hook first. If that fails it will fall back to trying version 1.

    Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
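
The fallback order described in this message can be sketched roughly as follows. This is a minimal illustration, not the code in fsmonitor.c: the names `demo_hook_fn`, `demo_query_hook` and `demo_v1_only_hook` are hypothetical, and the hook is modelled as a plain callback rather than a child process.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback type: runs the hook at the given interface
 * version and returns 0 on success, non-zero on failure.
 */
typedef int (*demo_hook_fn)(int version, const char *token);

/*
 * configured is the value of core.fsmonitorHookVersion, or -1 when unset.
 * Returns the hook version that succeeded, or -1 if no attempt succeeded.
 */
static int demo_query_hook(int configured, const char *token,
			   demo_hook_fn run_hook)
{
	assert(run_hook != NULL);

	/* A configured version is used as-is, with no fallback. */
	if (configured == 1 || configured == 2)
		return run_hook(configured, token) == 0 ? configured : -1;

	/* Unset: try version 2 first, then fall back to version 1. */
	if (run_hook(2, token) == 0)
		return 2;
	if (run_hook(1, token) == 0)
		return 1;
	return -1;
}

/* Example hook that only understands the version 1 interface. */
static int demo_v1_only_hook(int version, const char *token)
{
	(void)token;
	return version == 1 ? 0 : 1;
}
```

With the setting left at -1, a repository that still ships a version 1 hook keeps working, because the failed version 2 attempt is followed by a version 1 call.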
* fsmonitor: change last update timestamp on the index_state to opaque token (Kevin Willford, 2020-01-13, 1 file, -17/+32)
    Some file system monitors might not use or take a timestamp for processing and in the case of watchman could have race conditions with using a timestamp. Watchman uses something called a clockid that is used for race free queries to it. The clockid for watchman is simply a string.

    Change the fsmonitor_last_update from being a uint64_t to a char pointer so that any arbitrary data can be stored in it and passed back to the fsmonitor.

    Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
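
A rough sketch of what storing the token as a string looks like is given below. Only the field name `fsmonitor_last_update` comes from the message; the struct `demo_index` and the helper `demo_set_last_update` are assumptions made for illustration, and real git uses its own allocation wrappers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the relevant part of struct index_state. */
struct demo_index {
	char *fsmonitor_last_update;	/* opaque token, owned by the struct */
};

/* Replace the stored token with a private copy of the new one. */
static void demo_set_last_update(struct demo_index *istate, const char *token)
{
	char *copy;

	assert(istate != NULL && token != NULL);
	copy = malloc(strlen(token) + 1);
	if (!copy)
		abort();
	strcpy(copy, token);

	/* Copy first, free second, so passing the current token back is safe. */
	free(istate->fsmonitor_last_update);
	istate->fsmonitor_last_update = copy;
}
```

Because the token is just a string, a numeric timestamp rendered as decimal text and a watchman clockid can be stored the same way, and the caller never has to interpret it.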
* unpack-trees: skip stat on fsmonitor-valid files (Utsav Shah, 2019-11-21, 1 file, -6/+17)
    The index might be aware, via fsmonitor, that a file hasn't been modified, but unpack-trees did not pay attention to it and checked via ie_match_stat, which can be inefficient on certain filesystems. This significantly slows down commands that run oneway_merge, like checkout and reset --hard.

    This patch makes oneway_merge check whether a file is considered unchanged through fsmonitor and skips ie_match_stat on it. unpack-trees also now correctly copies over fsmonitor validity state from the source index. Finally, for correctness, we force a refresh of fsmonitor state in tweak_fsmonitor.

    After this change, commands like stash (that use reset --hard internally) go from 8s or more to ~2s on a 250k file repository on a mac.

    Helped-by: Junio C Hamano <gitster@pobox.com>
    Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
    Signed-off-by: Utsav Shah <utsav@dropbox.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsmonitor: do not compare bitmap size with size of split index (Junio C Hamano, 2019-11-13, 1 file, -2/+4)
    3444ec2e ("fsmonitor: don't fill bitmap with entries to be removed", 2019-10-11) added a handful of sanity checks that make sure that a bit position in fsmonitor bitmap does not go beyond the end of the index. As each bit in the bitmap corresponds to a path in the index, this is the right check most of the time.

    Except for the case when we are in the split-index mode and looking at a delta index that is to be overlaid on the base index but before the base index has actually been merged in, namely in read_ and write_fsmonitor_extension(). In these codepaths, the entries in the split/delta index are typically a small subset of the entire set of paths (otherwise why would we be using split-index?), so the bitmap used by the fsmonitor is almost always larger than the number of entries in the partial index, and the incorrect comparison would trigger the BUG().

    Reported-by: Utsav Shah <ukshah2@illinois.edu>
    Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
    Helped-by: William Baker <William.Baker@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsmonitor: don't fill bitmap with entries to be removed (William Baker, 2019-10-12, 1 file, -5/+24)
    While doing some testing with fsmonitor enabled I found that git commands would segfault after staging and unstaging an untracked file. Looking at the crash it appeared that fsmonitor_ewah_callback was attempting to adjust bits beyond the bounds of the index cache.

    Digging into how this could happen it became clear that the fsmonitor extension must have been written with more bits than there were entries in the index. The root cause ended up being that fill_fsmonitor_bitmap was populating fsmonitor_dirty with bits for all entries in the index, even those that had been marked for removal.

    To solve this problem fill_fsmonitor_bitmap has been updated to skip entries with the CE_REMOVE flag set. With this change the bits written for the fsmonitor extension will be consistent with the index entries written by do_write_index. Additionally, BUG checks have been added to detect if the number of bits in fsmonitor_dirty should ever exceed the number of entries in the index again.

    Another option that was considered was moving the call to fill_fsmonitor_bitmap closer to where the index is written (and where the fsmonitor extension itself is written). However, that did not work as the fsmonitor_dirty bitmap must be filled before the index is split during writing.

    Signed-off-by: William Baker <William.Baker@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
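
The idea behind the fix can be illustrated with a simplified sketch. It assumes a plain byte array in place of the ewah bitmap, and the flag values and names (`DEMO_REMOVE`, `DEMO_FSMONITOR_VALID`, `demo_fill_dirty`) are stand-ins rather than git's definitions. Entries marked for removal get no slot, so positions in the bitmap line up with the entries that are actually written.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; git defines CE_REMOVE and CE_FSMONITOR_VALID itself. */
#define DEMO_REMOVE          (1u << 0)
#define DEMO_FSMONITOR_VALID (1u << 1)

struct demo_entry {
	unsigned int flags;
};

/*
 * Write one slot per surviving entry into dirty[] (1 = dirty, 0 = valid).
 * Entries flagged for removal are skipped and later positions shift down.
 * Returns the number of slots written.
 */
static size_t demo_fill_dirty(const struct demo_entry *entries, size_t nr,
			      unsigned char *dirty)
{
	size_t i, skipped = 0;

	assert(entries != NULL && dirty != NULL);
	for (i = 0; i < nr; i++) {
		if (entries[i].flags & DEMO_REMOVE)
			skipped++;
		else
			dirty[i - skipped] =
				!(entries[i].flags & DEMO_FSMONITOR_VALID);
	}
	return nr - skipped;
}
```

The return value is the quantity a sanity check would compare against the number of entries written, which is the invariant the added BUG checks guard.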
* Merge branch 'cb/fsmonitor-intfix' (Junio C Hamano, 2019-07-09, 1 file, -4/+4)
|\
    Variable type fix.

    * cb/fsmonitor-intfix:
      fsmonitor: avoid signed integer overflow / infinite loop
| * fsmonitor: avoid signed integer overflow / infinite loop (Carlo Marcelo Arenas Belón, 2019-06-17, 1 file, -4/+4)
    883e248b8a ("fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.", 2017-09-22) uses an int in a loop that would wrap if index_state->cache_nr (unsigned) is bigger than INT_MAX.

    Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fsmonitor: force a refresh after the index was discarded (Johannes Schindelin, 2019-05-08, 1 file, -3/+2)
|/
    With this change, the `index_state` struct becomes the new home for the flag that says whether the fsmonitor hook has been run, i.e. it is now per-index.

    It also gets re-set when the index is discarded, fixing the bug demonstrated by the "test_expect_failure" test added in the preceding commit. In that case fsmonitor-enabled Git would miss updates under certain circumstances, see that preceding commit for details.

    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/snprintf-truncation' (Junio C Hamano, 2018-05-30, 1 file, -10/+4)
|\
    Avoid unchecked snprintf() to make future code auditing easier.

    * jk/snprintf-truncation:
      fmt_with_err: add a comment that truncation is OK
      shorten_unambiguous_ref: use xsnprintf
      fsmonitor: use internal argv_array of struct child_process
      log_write_email_headers: use strbufs
      http: use strbufs instead of fixed buffers
| * fsmonitor: use internal argv_array of struct child_process (René Scharfe, 2018-05-21, 1 file, -10/+4)
    Avoid magic array sizes and indexes by constructing the fsmonitor command line using the embedded argv_array of the child_process. The resulting code is shorter and easier to extend.

    Getting rid of the snprintf() calls is a bonus -- even though the buffers were big enough here to avoid truncation -- as it makes auditing the remaining callers easier.

    Inspired-by: Jeff King <peff@peff.net>
    Signed-off-by: Rene Scharfe <l.s.r@web.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'bp/fsmonitor-prime-index' (Junio C Hamano, 2018-04-25, 1 file, -0/+3)
|\ \
    The index file is updated to record the fsmonitor section after a full scan was made, to avoid wasting the effort that has already been spent.

    * bp/fsmonitor-prime-index:
      fsmonitor: force index write after full scan
| * | fsmonitor: force index write after full scan (Ben Peart, 2018-04-11, 1 file, -0/+3)
| |/
    fsmonitor currently only flags the index as dirty if the extension is being added or removed. This is a performance optimization that recognizes you can stat() a lot of files in less time than it takes to write out an updated index.

    This patch makes a small enhancement and flags the index dirty if we end up having to stat() all files and scan the entire working directory. The assumption being that must be expensive or you would not have turned on the feature.

    Signed-off-by: Ben Peart <benpeart@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'bp/fsmonitor-bufsize-fix' (Junio C Hamano, 2018-04-25, 1 file, -1/+1)
|\ \
| |/
|/|
    Fix an unexploitable (because the oversized contents are not under attacker's control) buffer overflow.

    * bp/fsmonitor-bufsize-fix:
      fsmonitor: fix incorrect buffer size when printing version number
| * fsmonitor: fix incorrect buffer size when printing version number (Ben Peart, 2018-04-11, 1 file, -1/+1)
    This is a trivial bug fix for passing the incorrect size to snprintf() when outputting the version. It should be passing the size of the destination buffer rather than the size of the value being printed.

    Signed-off-by: Ben Peart <benpeart@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
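
The general rule behind this fix, that snprintf() takes the size of the destination buffer, can be shown with a small sketch. The helper `demo_format_version` is hypothetical and is not the code that was patched; it only illustrates the correct size argument and a truncation check.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format version into buf. buflen must be the size of the destination
 * buffer, not the size of the integer being printed.
 * Returns the number of characters written, or -1 on error or truncation.
 */
static int demo_format_version(char *buf, size_t buflen, int version)
{
	int n;

	assert(buf != NULL && buflen > 0);
	n = snprintf(buf, buflen, "%d", version);
	return (n >= 0 && (size_t)n < buflen) ? n : -1;
}
```

A caller would typically pass `sizeof(buf)` for a local array. Passing `sizeof(version)` instead would tell snprintf() the buffer is as large as the integer type, whatever the buffer's real size.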
* | dir.c: ignore paths containing .git when invalidating untracked cache [nd/fix-untracked-cache-invalidation] (Nguyễn Thái Ngọc Duy, 2018-02-07, 1 file, -1/+1)
    read_directory() code ignores all paths named ".git" even if it's not a valid git repository. See treat_path() for details. Since ".git" is basically invisible to read_directory(), when we are asked to invalidate a path that contains ".git", we can safely ignore it because the slow path would not consider it anyway.

    This helps when fsmonitor is used and we have a real ".git" repo at worktree top. Occasionally .git/index will be updated and if the fsmonitor hook does not filter it, untracked cache is asked to invalidate the path ".git/index".

    Without this patch, we invalidate the root directory unnecessarily, which:

    - makes read_directory() fall back to slow path for root directory (slower)

    - makes the index dirty (because UNTR extension is updated). Depending on the index size, writing it down could also be slow.

    A note about the new "safe_path" knob. Since this new check could be relatively expensive, avoid it when we know it's not needed. If the path comes from the index, it can't contain ".git". If it does, we may be screwed up at many more levels, not just this one.

    Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
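
A check of the kind described here, whether a path has a ".git" component, might look like the sketch below. It is an assumption-laden illustration: the function name `demo_has_dot_git_component` is made up, the comparison is case-sensitive, and it does not reproduce whatever path validation git itself performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any '/'-separated component of path is exactly ".git",
 * 0 otherwise. Names such as ".gitignore" or "b.git" do not match.
 */
static int demo_has_dot_git_component(const char *path)
{
	assert(path != NULL);
	while (*path) {
		const char *slash = strchr(path, '/');
		size_t len = slash ? (size_t)(slash - path) : strlen(path);

		if (len == 4 && !memcmp(path, ".git", 4))
			return 1;
		if (!slash)
			break;
		path = slash + 1;
	}
	return 0;
}
```

An invalidation request for ".git/index" would match and could be dropped early, while paths that come from the index can skip the check entirely, which is the purpose of the "safe_path" knob mentioned above.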
* | fsmonitor: store fsmonitor bitmap before splitting index (Alex Vandiver, 2017-11-10, 1 file, -8/+12)
    ba1b9cac ("fsmonitor: delay updating state until after split index is merged", 2017-10-27) resolved the problem of the fsmonitor data being applied to the non-base index when reading; however, a similar problem exists when writing the index. Specifically, writing of the fsmonitor extension happens only after the work to split the index has been applied -- as such, the information in the index is only for the non-"base" index, and thus the extension information contains only partial data.

    When saving, compute the ewah bitmap before the index is split, and store it in the fsmonitor_dirty field, mirroring the behavior that occurred during reading. fsmonitor_dirty is kept from being leaked by being freed when the extension data is written -- which always happens precisely once, no matter the split index configuration.

    Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fsmonitor: delay updating state until after split index is merged (Alex Vandiver, 2017-11-01, 1 file, -16/+24)
    If the fsmonitor extension is used in conjunction with the split index extension, the set of entries in the index when it is first loaded is only a subset of the real index. This leads to only the non-"base" index being marked as CE_FSMONITOR_VALID.

    Delay the expansion of the ewah bitmap until after tweak_split_index has been called to merge in the base index as well.

    The new fsmonitor_dirty is kept from being leaked by dint of being cleaned up in post_read_index_from, which is guaranteed to be called after do_read_index in read_index_from.

    Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fsmonitor: set the PWD to the top of the working tree (Alex Vandiver, 2017-10-30, 1 file, -0/+1)
|/
    The fsmonitor command inherits the PWD of its caller, which may be anywhere in the working copy. This makes it difficult for the fsmonitor command to operate on the whole repository. Specifically, for the watchman integration, this causes each subdirectory to get its own watch entry.

    Set the CWD to the top of the working directory, for consistency.

    Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. (Ben Peart, 2017-10-01, 1 file, -0/+253)
    When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories.

    We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID.

    In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes.

    To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid.

    Signed-off-by: Ben Peart <benpeart@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
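
The core mechanism of this commit, clearing a "valid" bit for every path the monitor reports and skipping lstat() for the rest, can be sketched as follows. This is a simplified model under stated assumptions: `demo_cache_entry`, `DEMO_VALID`, `demo_mark_changed` and `demo_needs_lstat` are hypothetical names, the lookup is a linear scan rather than git's index lookup, and untracked cache handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for CE_FSMONITOR_VALID; the value is illustrative. */
#define DEMO_VALID (1u << 0)

struct demo_cache_entry {
	const char *name;
	unsigned int flags;
};

/*
 * Clear the valid bit on every entry whose name appears in the list of
 * changed paths reported by the monitor. Returns how many bits were cleared.
 */
static size_t demo_mark_changed(struct demo_cache_entry *cache, size_t nr,
				const char *const *changed, size_t changed_nr)
{
	size_t i, j, cleared = 0;

	assert(cache != NULL && changed != NULL);
	for (j = 0; j < changed_nr; j++) {
		for (i = 0; i < nr; i++) {
			if ((cache[i].flags & DEMO_VALID) &&
			    !strcmp(cache[i].name, changed[j])) {
				cache[i].flags &= ~DEMO_VALID;
				cleared++;
			}
		}
	}
	return cleared;
}

/* Entries still marked valid can skip the lstat() during a refresh. */
static int demo_needs_lstat(const struct demo_cache_entry *ce)
{
	return !(ce->flags & DEMO_VALID);
}
```

The saving comes from the second function: in a large working tree where the monitor reports only a handful of paths, nearly every entry keeps its valid bit and is never stat'ed.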