author | Junio C Hamano <gitster@pobox.com> | 2017-11-21 14:07:50 +0900 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2017-11-21 14:07:50 +0900 |
commit | e05336bddacb90cf243aacc0f7b7f34f900453d7 (patch) | |
tree | cb4a694c7de056ac7c4ec98e30b3b1c5b4cc3764 | |
parent | f5da077b1f9e28a473f8219d8b8391450b794abf (diff) | |
parent | 614a718a797e04fb037b25371896f910e464b671 (diff) | |

Merge branch 'bp/fsmonitor'

We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64

33 files changed, 1570 insertions, 23 deletions
diff --git a/Documentation/config.txt b/Documentation/config.txt index 671fcbaa0f..5f65fa9234 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -416,6 +416,13 @@ core.protectNTFS:: 8.3 "short" names. Defaults to `true` on Windows, and `false` elsewhere. +core.fsmonitor:: + If set, the value of this variable is used as a command which + will identify all files that may have changed since the + requested date/time. This information is used to speed up git by + avoiding unnecessary processing of files that have not changed. + See the "fsmonitor-watchman" section of linkgit:githooks[5]. + core.trustctime:: If false, the ctime differences between the index and the working tree are ignored; useful when the inode change time diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt index d153c17e06..3ac3e3a77d 100644 --- a/Documentation/git-ls-files.txt +++ b/Documentation/git-ls-files.txt @@ -9,7 +9,7 @@ git-ls-files - Show information about files in the index and the working tree SYNOPSIS -------- [verse] -'git ls-files' [-z] [-t] [-v] +'git ls-files' [-z] [-t] [-v] [-f] (--[cached|deleted|others|ignored|stage|unmerged|killed|modified])* (-[c|d|o|i|s|u|k|m])* [--eol] @@ -133,6 +133,11 @@ a space) at the start of each line: that are marked as 'assume unchanged' (see linkgit:git-update-index[1]). +-f:: + Similar to `-t`, but use lowercase letters for files + that are marked as 'fsmonitor valid' (see + linkgit:git-update-index[1]). + --full-name:: When run from a subdirectory, the command usually outputs paths relative to the current directory. 
This diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt index 75c7dd9dea..bdb0342593 100644 --- a/Documentation/git-update-index.txt +++ b/Documentation/git-update-index.txt @@ -16,9 +16,11 @@ SYNOPSIS [--chmod=(+|-)x] [--[no-]assume-unchanged] [--[no-]skip-worktree] + [--[no-]fsmonitor-valid] [--ignore-submodules] [--[no-]split-index] [--[no-|test-|force-]untracked-cache] + [--[no-]fsmonitor] [--really-refresh] [--unresolve] [--again | -g] [--info-only] [--index-info] [-z] [--stdin] [--index-version <n>] @@ -111,6 +113,12 @@ you will need to handle the situation manually. set and unset the "skip-worktree" bit for the paths. See section "Skip-worktree bit" below for more information. +--[no-]fsmonitor-valid:: + When one of these flags is specified, the object name recorded + for the paths are not updated. Instead, these options + set and unset the "fsmonitor valid" bit for the paths. See + section "File System Monitor" below for more information. + -g:: --again:: Runs 'git update-index' itself on the paths whose index @@ -201,6 +209,15 @@ will remove the intended effect of the option. `--untracked-cache` used to imply `--test-untracked-cache` but this option would enable the extension unconditionally. +--fsmonitor:: +--no-fsmonitor:: + Enable or disable files system monitor feature. These options + take effect whatever the value of the `core.fsmonitor` + configuration variable (see linkgit:git-config[1]). But a warning + is emitted when the change goes against the configured value, as + the configured value will take effect next time the index is + read and this will remove the intended effect of the option. + \--:: Do not interpret any more arguments as options. @@ -447,6 +464,34 @@ command reads the index; while when `--[no-|force-]untracked-cache` are used, the untracked cache is immediately added to or removed from the index. 
+File System Monitor +------------------- + +This feature is intended to speed up git operations for repos that have +large working directories. + +It enables git to work together with a file system monitor (see the +"fsmonitor-watchman" section of linkgit:githooks[5]) that can +inform it as to what files have been modified. This enables git to avoid +having to lstat() every file to find modified files. + +When used in conjunction with the untracked cache, it can further improve +performance by avoiding the cost of scanning the entire working directory +looking for new files. + +If you want to enable (or disable) this feature, it is easier to use +the `core.fsmonitor` configuration variable (see +linkgit:git-config[1]) than using the `--fsmonitor` option to +`git update-index` in each repository, especially if you want to do so +across all repositories you use, because you can set the configuration +variable to `true` (or `false`) in your `$HOME/.gitconfig` just once +and have it affect all repositories you touch. + +When the `core.fsmonitor` configuration variable is changed, the +file system monitor is added to or removed from the index the next time +a command reads the index. When `--[no-]fsmonitor` are used, the file +system monitor is immediately added to or removed from the index. + Configuration ------------- diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt index 5d3f45560e..0bb0042d8c 100644 --- a/Documentation/githooks.txt +++ b/Documentation/githooks.txt @@ -454,6 +454,34 @@ the name of the file that holds the e-mail to be sent. Exiting with a non-zero status causes 'git send-email' to abort before sending any e-mails. +fsmonitor-watchman +~~~~~~~~~~~~~~~~~~ + +This hook is invoked when the configuration option core.fsmonitor is +set to .git/hooks/fsmonitor-watchman. It takes two arguments, a version +(currently 1) and the time in elapsed nanoseconds since midnight, +January 1, 1970. 
+ +The hook should output to stdout the list of all files in the working +directory that may have changed since the requested time. The logic +should be inclusive so that it does not miss any potential changes. +The paths should be relative to the root of the working directory +and be separated by a single NUL. + +It is OK to include files which have not actually changed. All changes +including newly-created and deleted files should be included. When +files are renamed, both the old and the new name should be included. + +Git will limit what files it checks for changes as well as which +directories are checked for untracked files based on the path names +given. + +An optimized way to tell git "all files have changed" is to return +the filename '/'. + +The exit status determines whether git will use the data from the +hook to limit its search. On error, it will fall back to verifying +all files and folders. GIT --- diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index ade0b0c445..db3572626b 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -295,3 +295,22 @@ The remaining data of each directory block is grouped by type: in the previous ewah bitmap. - One NUL. + +== File System Monitor cache + + The file system monitor cache tracks files for which the core.fsmonitor + hook has told us about changes. The signature for this extension is + { 'F', 'S', 'M', 'N' }. + + The extension starts with + + - 32-bit version number: the current supported version is 1. + + - 64-bit time: the extension data reflects all changes through the given + time which is stored as the nanoseconds elapsed since midnight, + January 1, 1970. + + - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap. + + - An ewah bitmap, the n-th bit indicates whether the n-th index entry + is not CE_FSMONITOR_VALID. 
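
The hook protocol documented in the githooks.txt hunk above (paths relative to the working directory root, separated by a single NUL, with '/' meaning "all files have changed") can be illustrated with a small standalone sketch. This is not the code from this series: `for_each_changed_path`, `fsm_path_fn`, `struct path_count`, and `count_path` are hypothetical names introduced only for this example, and the callback receives an explicit length so the buffer does not have to be NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: called once per path reported by the hook. */
typedef void (*fsm_path_fn)(const char *path, size_t len, void *ctx);

/*
 * Walk a NUL-separated list of paths as described for the
 * fsmonitor-watchman hook output.
 *
 * Returns 1 if the output starts with '/', which the documentation
 * defines as "all files have changed"; no callbacks are made in that
 * case. Returns 0 otherwise, after calling fn once per path.
 */
static int for_each_changed_path(const char *buf, size_t len,
				 fsm_path_fn fn, void *ctx)
{
	size_t bol = 0; /* beginning of the current path */
	size_t i;

	assert(buf || !len);

	if (len && buf[0] == '/')
		return 1;

	for (i = 0; i < len; i++) {
		if (buf[i] != '\0')
			continue;
		fn(buf + bol, i - bol, ctx);
		bol = i + 1;
	}
	/* The last path may not be followed by a NUL separator. */
	if (bol < len)
		fn(buf + bol, len - bol, ctx);
	return 0;
}

/* Example consumer (an assumption for this sketch): count the paths. */
struct path_count {
	size_t n;         /* number of paths seen */
	size_t total_len; /* sum of their lengths */
};

static void count_path(const char *path, size_t len, void *ctx)
{
	struct path_count *pc = ctx;

	(void)path;
	pc->n++;
	pc->total_len += len;
}
```

A caller would pass the bytes captured from the hook's stdout together with their length. When the function returns 1, the caller should fall back to checking every file, which is the same fallback the documentation describes for a failing hook.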
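
The "File System Monitor cache" layout described in the index-format.txt hunk above (32-bit version, 64-bit time, 32-bit bitmap size, then the ewah bitmap) could be decoded along the following lines. This is a minimal sketch, assuming all integers are stored big-endian as the `get_be32`/`get_be64` helpers in this series suggest. `struct fsmn_header`, `read_be32`, `read_be64`, and `parse_fsmn_header` are hypothetical names for this example, and the check that the bitmap size fits in the remaining bytes is an extra sanity check added here, not something the documentation requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the fixed part of the FSMN extension. */
struct fsmn_header {
	uint32_t version;     /* currently 1 */
	uint64_t last_update; /* nanoseconds since midnight, January 1, 1970 */
	uint32_t bitmap_size; /* byte size of the ewah bitmap that follows */
};

/* Read big-endian values byte by byte, so alignment does not matter. */
static uint32_t read_be32(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

static uint64_t read_be64(const unsigned char *p)
{
	return (uint64_t)read_be32(p) << 32 | (uint64_t)read_be32(p + 4);
}

/*
 * Parse the 16-byte fixed header of the extension data.
 * Returns 0 on success and -1 if the data is too short, the version
 * is not 1, or the recorded bitmap size exceeds the remaining bytes.
 */
static int parse_fsmn_header(const unsigned char *data, size_t sz,
			     struct fsmn_header *out)
{
	const size_t fixed = 4 + 8 + 4;

	assert(data && out);

	if (sz < fixed)
		return -1;
	out->version = read_be32(data);
	if (out->version != 1)
		return -1;
	out->last_update = read_be64(data + 4);
	out->bitmap_size = read_be32(data + 12);
	if (out->bitmap_size > sz - fixed)
		return -1;
	return 0;
}
```

The ewah bitmap itself starts at offset 16. Per the format description, a set n-th bit means the n-th index entry is not CE_FSMONITOR_VALID, so a reader would start from "all valid" and clear the flag for each set bit.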
@@ -646,7 +646,9 @@ TEST_PROGRAMS_NEED_X += test-ctype TEST_PROGRAMS_NEED_X += test-config TEST_PROGRAMS_NEED_X += test-date TEST_PROGRAMS_NEED_X += test-delta +TEST_PROGRAMS_NEED_X += test-drop-caches TEST_PROGRAMS_NEED_X += test-dump-cache-tree +TEST_PROGRAMS_NEED_X += test-dump-fsmonitor TEST_PROGRAMS_NEED_X += test-dump-split-index TEST_PROGRAMS_NEED_X += test-dump-untracked-cache TEST_PROGRAMS_NEED_X += test-fake-ssh @@ -794,6 +796,7 @@ LIB_OBJS += ewah/ewah_rlw.o LIB_OBJS += exec_cmd.o LIB_OBJS += fetch-pack.o LIB_OBJS += fsck.o +LIB_OBJS += fsmonitor.o LIB_OBJS += gettext.o LIB_OBJS += gpg-interface.o LIB_OBJS += graph.o diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 8c713c47ac..2fc836e330 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -31,6 +31,7 @@ static int show_resolve_undo; static int show_modified; static int show_killed; static int show_valid_bit; +static int show_fsmonitor_bit; static int line_terminator = '\n'; static int debug_mode; static int show_eol; @@ -86,7 +87,8 @@ static const char *get_tag(const struct cache_entry *ce, const char *tag) { static char alttag[4]; - if (tag && *tag && show_valid_bit && (ce->ce_flags & CE_VALID)) { + if (tag && *tag && ((show_valid_bit && (ce->ce_flags & CE_VALID)) || + (show_fsmonitor_bit && (ce->ce_flags & CE_FSMONITOR_VALID)))) { memcpy(alttag, tag, 3); if (isalpha(tag[0])) { @@ -515,6 +517,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix) N_("identify the file status with tags")), OPT_BOOL('v', NULL, &show_valid_bit, N_("use lowercase letters for 'assume unchanged' files")), + OPT_BOOL('f', NULL, &show_fsmonitor_bit, + N_("use lowercase letters for 'fsmonitor clean' files")), OPT_BOOL('c', "cached", &show_cached, N_("show cached files in the output (default)")), OPT_BOOL('d', "deleted", &show_deleted, @@ -584,7 +588,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix) for (i = 0; i < exclude_list.nr; i++) { 
add_exclude(exclude_list.items[i].string, "", 0, el, --exclude_args); } - if (show_tag || show_valid_bit) { + if (show_tag || show_valid_bit || show_fsmonitor_bit) { tag_cached = "H "; tag_unmerged = "M "; tag_removed = "R "; diff --git a/builtin/update-index.c b/builtin/update-index.c index fefbe60167..58d1c2d282 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -16,6 +16,7 @@ #include "pathspec.h" #include "dir.h" #include "split-index.h" +#include "fsmonitor.h" /* * Default to not allowing changes to the list of files. The @@ -32,6 +33,7 @@ static int force_remove; static int verbose; static int mark_valid_only; static int mark_skip_worktree_only; +static int mark_fsmonitor_only; #define MARK_FLAG 1 #define UNMARK_FLAG 2 static struct strbuf mtime_dir = STRBUF_INIT; @@ -228,6 +230,7 @@ static int mark_ce_flags(const char *path, int flag, int mark) int namelen = strlen(path); int pos = cache_name_pos(path, namelen); if (0 <= pos) { + mark_fsmonitor_invalid(&the_index, active_cache[pos]); if (mark) active_cache[pos]->ce_flags |= flag; else @@ -460,6 +463,11 @@ static void update_one(const char *path) die("Unable to mark file %s", path); return; } + if (mark_fsmonitor_only) { + if (mark_ce_flags(path, CE_FSMONITOR_VALID, mark_fsmonitor_only == MARK_FLAG)) + die("Unable to mark file %s", path); + return; + } if (force_remove) { if (remove_file_from_cache(path)) @@ -917,6 +925,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) struct refresh_params refresh_args = {0, &has_errors}; int lock_error = 0; int split_index = -1; + int force_write = 0; + int fsmonitor = -1; struct lock_file lock_file = LOCK_INIT; struct parse_opt_ctx_t ctx; strbuf_getline_fn getline_fn; @@ -1008,6 +1018,16 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) N_("test if the filesystem supports untracked cache"), UC_TEST), OPT_SET_INT(0, "force-untracked-cache", &untracked_cache, N_("enable untracked cache without testing the 
filesystem"), UC_FORCE), + OPT_SET_INT(0, "force-write-index", &force_write, + N_("write out the index even if is not flagged as changed"), 1), + OPT_BOOL(0, "fsmonitor", &fsmonitor, + N_("enable or disable file system monitor")), + {OPTION_SET_INT, 0, "fsmonitor-valid", &mark_fsmonitor_only, NULL, + N_("mark files as fsmonitor valid"), + PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, MARK_FLAG}, + {OPTION_SET_INT, 0, "no-fsmonitor-valid", &mark_fsmonitor_only, NULL, + N_("clear fsmonitor valid bit"), + PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, UNMARK_FLAG}, OPT_END() }; @@ -1146,7 +1166,23 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) die("BUG: bad untracked_cache value: %d", untracked_cache); } - if (active_cache_changed) { + if (fsmonitor > 0) { + if (git_config_get_fsmonitor() == 0) + warning(_("core.fsmonitor is unset; " + "set it if you really want to " + "enable fsmonitor")); + add_fsmonitor(&the_index); + report(_("fsmonitor enabled")); + } else if (!fsmonitor) { + if (git_config_get_fsmonitor() == 1) + warning(_("core.fsmonitor is set; " + "remove it if you really want to " + "disable fsmonitor")); + remove_fsmonitor(&the_index); + report(_("fsmonitor disabled")); + } + + if (active_cache_changed || force_write) { if (newfd < 0) { if (refresh_args.flags & REFRESH_QUIET) exit(128); @@ -204,6 +204,7 @@ struct cache_entry { #define CE_ADDED (1 << 19) #define CE_HASHED (1 << 20) +#define CE_FSMONITOR_VALID (1 << 21) #define CE_WT_REMOVE (1 << 22) /* remove in work directory */ #define CE_CONFLICTED (1 << 23) @@ -327,6 +328,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define CACHE_TREE_CHANGED (1 << 5) #define SPLIT_INDEX_ORDERED (1 << 6) #define UNTRACKED_CHANGED (1 << 7) +#define FSMONITOR_CHANGED (1 << 8) struct split_index; struct untracked_cache; @@ -345,6 +347,7 @@ struct index_state { struct hashmap dir_hash; unsigned char sha1[20]; struct untracked_cache *untracked; + uint64_t fsmonitor_last_update; }; extern struct 
index_state the_index; @@ -700,8 +703,10 @@ extern void *read_blob_data_from_index(const struct index_state *, const char *, #define CE_MATCH_IGNORE_MISSING 0x08 /* enable stat refresh */ #define CE_MATCH_REFRESH 0x10 -extern int ie_match_stat(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int); -extern int ie_modified(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int); +/* don't refresh_fsmonitor state or do stat comparison even if CE_FSMONITOR_VALID is true */ +#define CE_MATCH_IGNORE_FSMONITOR 0X20 +extern int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int); +extern int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int); #define HASH_WRITE_OBJECT 1 #define HASH_FORMAT_CHECK 2 @@ -799,6 +804,7 @@ extern int core_apply_sparse_checkout; extern int precomposed_unicode; extern int protect_hfs; extern int protect_ntfs; +extern const char *core_fsmonitor; /* * Include broken refs in all ref iterations, which will diff --git a/compat/bswap.h b/compat/bswap.h index 7d063e9e40..5078ce5ecc 100644 --- a/compat/bswap.h +++ b/compat/bswap.h @@ -158,7 +158,9 @@ static inline uint64_t git_bswap64(uint64_t x) #define get_be16(p) ntohs(*(unsigned short *)(p)) #define get_be32(p) ntohl(*(unsigned int *)(p)) +#define get_be64(p) ntohll(*(uint64_t *)(p)) #define put_be32(p, v) do { *(unsigned int *)(p) = htonl(v); } while (0) +#define put_be64(p, v) do { *(uint64_t *)(p) = htonll(v); } while (0) #else @@ -178,6 +180,13 @@ static inline uint32_t get_be32(const void *ptr) (uint32_t)p[3] << 0; } +static inline uint64_t get_be64(const void *ptr) +{ + const unsigned char *p = ptr; + return (uint64_t)get_be32(&p[0]) << 32 | + (uint64_t)get_be32(&p[4]) << 0; +} + static inline void put_be32(void *ptr, uint32_t value) { unsigned char *p = ptr; @@ -187,4 +196,17 @@ static inline void put_be32(void *ptr, uint32_t value) p[3] = value >> 0; } 
+static inline void put_be64(void *ptr, uint64_t value) +{ + unsigned char *p = ptr; + p[0] = value >> 56; + p[1] = value >> 48; + p[2] = value >> 40; + p[3] = value >> 32; + p[4] = value >> 24; + p[5] = value >> 16; + p[6] = value >> 8; + p[7] = value >> 0; +} + #endif @@ -2156,6 +2156,20 @@ int git_config_get_max_percent_split_change(void) return -1; /* default value */ } +int git_config_get_fsmonitor(void) +{ + if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor)) + core_fsmonitor = getenv("GIT_FSMONITOR_TEST"); + + if (core_fsmonitor && !*core_fsmonitor) + core_fsmonitor = NULL; + + if (core_fsmonitor) + return 1; + + return 0; +} + NORETURN void git_die_config_linenr(const char *key, const char *filename, int linenr) { @@ -212,6 +212,7 @@ extern int git_config_get_pathname(const char *key, const char **dest); extern int git_config_get_untracked_cache(void); extern int git_config_get_split_index(void); extern int git_config_get_max_percent_split_change(void); +extern int git_config_get_fsmonitor(void); /* This dies if the configured or default date is in the future */ extern int git_config_get_expiry(const char *key, const char **output); diff --git a/diff-lib.c b/diff-lib.c index 731f0886d6..5173023cd3 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -12,6 +12,7 @@ #include "refs.h" #include "submodule.h" #include "dir.h" +#include "fsmonitor.h" /* * diff-files @@ -229,6 +230,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (!changed && !dirty_submodule) { ce_mark_uptodate(ce); + mark_fsmonitor_valid(ce); if (!revs->diffopt.flags.find_copies_harder) continue; } @@ -18,6 +18,7 @@ #include "utf8.h" #include "varint.h" #include "ewah/ewok.h" +#include "fsmonitor.h" /* * Tells read_directory_recursive how a file or directory should be treated. @@ -1733,17 +1734,23 @@ static int valid_cached_dir(struct dir_struct *dir, if (!untracked) return 0; - if (stat(path->len ? 
path->buf : ".", &st)) { - invalidate_directory(dir->untracked, untracked); - memset(&untracked->stat_data, 0, sizeof(untracked->stat_data)); - return 0; - } - if (!untracked->valid || - match_stat_data_racy(istate, &untracked->stat_data, &st)) { - if (untracked->valid) + /* + * With fsmonitor, we can trust the untracked cache's valid field. + */ + refresh_fsmonitor(istate); + if (!(dir->untracked->use_fsmonitor && untracked->valid)) { + if (stat(path->len ? path->buf : ".", &st)) { invalidate_directory(dir->untracked, untracked); - fill_stat_data(&untracked->stat_data, &st); - return 0; + memset(&untracked->stat_data, 0, sizeof(untracked->stat_data)); + return 0; + } + if (!untracked->valid || + match_stat_data_racy(istate, &untracked->stat_data, &st)) { + if (untracked->valid) + invalidate_directory(dir->untracked, untracked); + fill_stat_data(&untracked->stat_data, &st); + return 0; + } } if (untracked->check_only != !!check_only) { @@ -139,6 +139,8 @@ struct untracked_cache { int gitignore_invalidated; int dir_invalidated; int dir_opened; + /* fsmonitor invalidation data */ + unsigned int use_fsmonitor : 1; }; struct dir_struct { @@ -4,6 +4,7 @@ #include "streaming.h" #include "submodule.h" #include "progress.h" +#include "fsmonitor.h" static void create_directories(const char *path, int path_len, const struct checkout *state) @@ -373,6 +374,7 @@ finish: ce->name); fill_stat_cache_info(ce, &st); ce->ce_flags |= CE_UPDATE_IN_BASE; + mark_fsmonitor_invalid(state->istate, ce); state->istate->cache_changed |= CE_ENTRY_CHANGED; } delayed: diff --git a/environment.c b/environment.c index 8289c25b44..8fa032f307 100644 --- a/environment.c +++ b/environment.c @@ -76,6 +76,7 @@ int protect_hfs = PROTECT_HFS_DEFAULT; #define PROTECT_NTFS_DEFAULT 0 #endif int protect_ntfs = PROTECT_NTFS_DEFAULT; +const char *core_fsmonitor; /* * The character that begins a commented line in user-editable file diff --git a/fsmonitor.c b/fsmonitor.c new file mode 100644 index 
0000000000..7c1540c054 --- /dev/null +++ b/fsmonitor.c @@ -0,0 +1,253 @@ +#include "cache.h" +#include "config.h" +#include "dir.h" +#include "ewah/ewok.h" +#include "fsmonitor.h" +#include "run-command.h" +#include "strbuf.h" + +#define INDEX_EXTENSION_VERSION (1) +#define HOOK_INTERFACE_VERSION (1) + +struct trace_key trace_fsmonitor = TRACE_KEY_INIT(FSMONITOR); + +static void fsmonitor_ewah_callback(size_t pos, void *is) +{ + struct index_state *istate = (struct index_state *)is; + struct cache_entry *ce = istate->cache[pos]; + + ce->ce_flags &= ~CE_FSMONITOR_VALID; +} + +int read_fsmonitor_extension(struct index_state *istate, const void *data, + unsigned long sz) +{ + const char *index = data; + uint32_t hdr_version; + uint32_t ewah_size; + struct ewah_bitmap *fsmonitor_dirty; + int i; + int ret; + + if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t)) + return error("corrupt fsmonitor extension (too short)"); + + hdr_version = get_be32(index); + index += sizeof(uint32_t); + if (hdr_version != INDEX_EXTENSION_VERSION) + return error("bad fsmonitor version %d", hdr_version); + + istate->fsmonitor_last_update = get_be64(index); + index += sizeof(uint64_t); + + ewah_size = get_be32(index); + index += sizeof(uint32_t); + + fsmonitor_dirty = ewah_new(); + ret = ewah_read_mmap(fsmonitor_dirty, index, ewah_size); + if (ret != ewah_size) { + ewah_free(fsmonitor_dirty); + return error("failed to parse ewah bitmap reading fsmonitor index extension"); + } + + if (git_config_get_fsmonitor()) { + /* Mark all entries valid */ + for (i = 0; i < istate->cache_nr; i++) + istate->cache[i]->ce_flags |= CE_FSMONITOR_VALID; + + /* Mark all previously saved entries as dirty */ + ewah_each_bit(fsmonitor_dirty, fsmonitor_ewah_callback, istate); + + /* Now mark the untracked cache for fsmonitor usage */ + if (istate->untracked) + istate->untracked->use_fsmonitor = 1; + } + ewah_free(fsmonitor_dirty); + + trace_printf_key(&trace_fsmonitor, "read fsmonitor extension 
successful"); + return 0; +} + +void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate) +{ + uint32_t hdr_version; + uint64_t tm; + struct ewah_bitmap *bitmap; + int i; + uint32_t ewah_start; + uint32_t ewah_size = 0; + int fixup = 0; + + put_be32(&hdr_version, INDEX_EXTENSION_VERSION); + strbuf_add(sb, &hdr_version, sizeof(uint32_t)); + + put_be64(&tm, istate->fsmonitor_last_update); + strbuf_add(sb, &tm, sizeof(uint64_t)); + fixup = sb->len; + strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */ + + ewah_start = sb->len; + bitmap = ewah_new(); + for (i = 0; i < istate->cache_nr; i++) + if (!(istate->cache[i]->ce_flags & CE_FSMONITOR_VALID)) + ewah_set(bitmap, i); + ewah_serialize_strbuf(bitmap, sb); + ewah_free(bitmap); + + /* fix up size field */ + put_be32(&ewah_size, sb->len - ewah_start); + memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t)); + + trace_printf_key(&trace_fsmonitor, "write fsmonitor extension successful"); +} + +/* + * Call the query-fsmonitor hook passing the time of the last saved results. + */ +static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result) +{ + struct child_process cp = CHILD_PROCESS_INIT; + char ver[64]; + char date[64]; + const char *argv[4]; + + if (!(argv[0] = core_fsmonitor)) + return -1; + + snprintf(ver, sizeof(version), "%d", version); + snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update); + argv[1] = ver; + argv[2] = date; + argv[3] = NULL; + cp.argv = argv; + cp.use_shell = 1; + + return capture_command(&cp, query_result, 1024); +} + +static void fsmonitor_refresh_callback(struct index_state *istate, const char *name) +{ + int pos = index_name_pos(istate, name, strlen(name)); + + if (pos >= 0) { + struct cache_entry *ce = istate->cache[pos]; + ce->ce_flags &= ~CE_FSMONITOR_VALID; + } + + /* + * Mark the untracked cache dirty even if it wasn't found in the index + * as it could be a new untracked file. 
+ */ + trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name); + untracked_cache_invalidate_path(istate, name); +} + +void refresh_fsmonitor(struct index_state *istate) +{ + static int has_run_once = 0; + struct strbuf query_result = STRBUF_INIT; + int query_success = 0; + size_t bol; /* beginning of line */ + uint64_t last_update; + char *buf; + int i; + + if (!core_fsmonitor || has_run_once) + return; + has_run_once = 1; + + trace_printf_key(&trace_fsmonitor, "refresh fsmonitor"); + /* + * This could be racy so save the date/time now and query_fsmonitor + * should be inclusive to ensure we don't miss potential changes. + */ + last_update = getnanotime(); + + /* + * If we have a last update time, call query_fsmonitor for the set of + * changes since that time, else assume everything is possibly dirty + * and check it all. + */ + if (istate->fsmonitor_last_update) { + query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION, + istate->fsmonitor_last_update, &query_result); + trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor); + trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s", + core_fsmonitor, query_success ? 
"success" : "failure"); + } + + /* a fsmonitor process can return '/' to indicate all entries are invalid */ + if (query_success && query_result.buf[0] != '/') { + /* Mark all entries returned by the monitor as dirty */ + buf = query_result.buf; + bol = 0; + for (i = 0; i < query_result.len; i++) { + if (buf[i] != '\0') + continue; + fsmonitor_refresh_callback(istate, buf + bol); + bol = i + 1; + } + if (bol < query_result.len) + fsmonitor_refresh_callback(istate, buf + bol); + } else { + /* Mark all entries invalid */ + for (i = 0; i < istate->cache_nr; i++) + istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID; + + if (istate->untracked) + istate->untracked->use_fsmonitor = 0; + } + strbuf_release(&query_result); + + /* Now that we've updated istate, save the last_update time */ + istate->fsmonitor_last_update = last_update; +} + +void add_fsmonitor(struct index_state *istate) +{ + int i; + + if (!istate->fsmonitor_last_update) { + trace_printf_key(&trace_fsmonitor, "add fsmonitor"); + istate->cache_changed |= FSMONITOR_CHANGED; + istate->fsmonitor_last_update = getnanotime(); + + /* reset the fsmonitor state */ + for (i = 0; i < istate->cache_nr; i++) + istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID; + + /* reset the untracked cache */ + if (istate->untracked) { + add_untracked_cache(istate); + istate->untracked->use_fsmonitor = 1; + } + + /* Update the fsmonitor state */ + refresh_fsmonitor(istate); + } +} + +void remove_fsmonitor(struct index_state *istate) +{ + if (istate->fsmonitor_last_update) { + trace_printf_key(&trace_fsmonitor, "remove fsmonitor"); + istate->cache_changed |= FSMONITOR_CHANGED; + istate->fsmonitor_last_update = 0; + } +} + +void tweak_fsmonitor(struct index_state *istate) +{ + switch (git_config_get_fsmonitor()) { + case -1: /* keep: do nothing */ + break; + case 0: /* false */ + remove_fsmonitor(istate); + break; + case 1: /* true */ + add_fsmonitor(istate); + break; + default: /* unknown value: do nothing */ + break; + } +} diff --git 
a/fsmonitor.h b/fsmonitor.h new file mode 100644 index 0000000000..0de644e01a --- /dev/null +++ b/fsmonitor.h @@ -0,0 +1,66 @@ +#ifndef FSMONITOR_H +#define FSMONITOR_H + +extern struct trace_key trace_fsmonitor; + +/* + * Read the fsmonitor index extension and (if configured) restore the + * CE_FSMONITOR_VALID state. + */ +extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz); + +/* + * Write the CE_FSMONITOR_VALID state into the fsmonitor index extension. + */ +extern void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate); + +/* + * Add/remove the fsmonitor index extension + */ +extern void add_fsmonitor(struct index_state *istate); +extern void remove_fsmonitor(struct index_state *istate); + +/* + * Add/remove the fsmonitor index extension as necessary based on the current + * core.fsmonitor setting. + */ +extern void tweak_fsmonitor(struct index_state *istate); + +/* + * Run the configured fsmonitor integration script and clear the + * CE_FSMONITOR_VALID bit for any files returned as dirty. Also invalidate + * any corresponding untracked cache directory structures. Optimized to only + * run the first time it is called. + */ +extern void refresh_fsmonitor(struct index_state *istate); + +/* + * Set the given cache entries CE_FSMONITOR_VALID bit. This should be + * called any time the cache entry has been updated to reflect the + * current state of the file on disk. + */ +static inline void mark_fsmonitor_valid(struct cache_entry *ce) +{ + if (core_fsmonitor) { + ce->ce_flags |= CE_FSMONITOR_VALID; + trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_clean '%s'", ce->name); + } +} + +/* + * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate + * any corresponding untracked cache directory structures. 
This should + * be called any time git creates or modifies a file that should + * trigger an lstat() or invalidate the untracked cache for the + * corresponding directory + */ +static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce) +{ + if (core_fsmonitor) { + ce->ce_flags &= ~CE_FSMONITOR_VALID; + untracked_cache_invalidate_path(istate, ce->name); + trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name); + } +} + +#endif diff --git a/preload-index.c b/preload-index.c index 70a4c80878..2a83255e4e 100644 --- a/preload-index.c +++ b/preload-index.c @@ -4,6 +4,7 @@ #include "cache.h" #include "pathspec.h" #include "dir.h" +#include "fsmonitor.h" #ifdef NO_PTHREADS static void preload_index(struct index_state *index, @@ -55,15 +56,18 @@ static void *preload_thread(void *_data) continue; if (ce_skip_worktree(ce)) continue; + if (ce->ce_flags & CE_FSMONITOR_VALID) + continue; if (!ce_path_match(ce, &p->pathspec, NULL)) continue; if (threaded_has_symlink_leading_path(&cache, ce->name, ce_namelen(ce))) continue; if (lstat(ce->name, &st)) continue; - if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY)) + if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR)) continue; ce_mark_uptodate(ce); + mark_fsmonitor_valid(ce); } while (--nr > 0); cache_def_clear(&cache); return NULL; @@ -79,6 +83,8 @@ static void preload_index(struct index_state *index, return; threads = index->cache_nr / THREAD_COST; + if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST")) + threads = 2; if (threads < 2) return; if (threads > MAX_PARALLEL) diff --git a/read-cache.c b/read-cache.c index b13a1cb8f2..87e88b2642 100644 --- a/read-cache.c +++ b/read-cache.c @@ -19,6 +19,7 @@ #include "varint.h" #include "split-index.h" #include "utf8.h" +#include "fsmonitor.h" /* Mask for the name length in ce_flags in the on-disk index */ @@ -38,11 +39,12 @@ #define CACHE_EXT_RESOLVE_UNDO 0x52455543 
/* "REUC" */ #define CACHE_EXT_LINK 0x6c696e6b /* "link" */ #define CACHE_EXT_UNTRACKED 0x554E5452 /* "UNTR" */ +#define CACHE_EXT_FSMONITOR 0x46534D4E /* "FSMN" */ /* changes that can be kept in $GIT_DIR/index (basically all extensions) */ #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \ CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \ - SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED) + SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED | FSMONITOR_CHANGED) struct index_state the_index; static const char *alternate_index_output; @@ -62,6 +64,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache free(old); set_index_entry(istate, nr, ce); ce->ce_flags |= CE_UPDATE_IN_BASE; + mark_fsmonitor_invalid(istate, ce); istate->cache_changed |= CE_ENTRY_CHANGED; } @@ -150,8 +153,10 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st) if (assume_unchanged) ce->ce_flags |= CE_VALID; - if (S_ISREG(st->st_mode)) + if (S_ISREG(st->st_mode)) { ce_mark_uptodate(ce); + mark_fsmonitor_valid(ce); + } } static int ce_compare_data(const struct cache_entry *ce, struct stat *st) @@ -301,7 +306,7 @@ int match_stat_data_racy(const struct index_state *istate, return match_stat_data(sd, st); } -int ie_match_stat(const struct index_state *istate, +int ie_match_stat(struct index_state *istate, const struct cache_entry *ce, struct stat *st, unsigned int options) { @@ -309,7 +314,10 @@ int ie_match_stat(const struct index_state *istate, int ignore_valid = options & CE_MATCH_IGNORE_VALID; int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE; int assume_racy_is_modified = options & CE_MATCH_RACY_IS_DIRTY; + int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR; + if (!ignore_fsmonitor) + refresh_fsmonitor(istate); /* * If it's marked as always valid in the index, it's * valid whatever the checked-out copy says. 
@@ -320,6 +328,8 @@ int ie_match_stat(const struct index_state *istate, return 0; if (!ignore_valid && (ce->ce_flags & CE_VALID)) return 0; + if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) + return 0; /* * Intent-to-add entries have not been added, so the index entry @@ -357,7 +367,7 @@ int ie_match_stat(const struct index_state *istate, return changed; } -int ie_modified(const struct index_state *istate, +int ie_modified(struct index_state *istate, const struct cache_entry *ce, struct stat *st, unsigned int options) { @@ -778,6 +788,7 @@ int chmod_index_entry(struct index_state *istate, struct cache_entry *ce, } cache_tree_invalidate_path(istate, ce->name); ce->ce_flags |= CE_UPDATE_IN_BASE; + mark_fsmonitor_invalid(istate, ce); istate->cache_changed |= CE_ENTRY_CHANGED; return 0; @@ -1229,10 +1240,13 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate, int ignore_valid = options & CE_MATCH_IGNORE_VALID; int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE; int ignore_missing = options & CE_MATCH_IGNORE_MISSING; + int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR; if (!refresh || ce_uptodate(ce)) return ce; + if (!ignore_fsmonitor) + refresh_fsmonitor(istate); /* * CE_VALID or CE_SKIP_WORKTREE means the user promised us * that the change to the work tree does not matter and told @@ -1246,6 +1260,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate, ce_mark_uptodate(ce); return ce; } + if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) { + ce_mark_uptodate(ce); + return ce; + } if (has_symlink_leading_path(ce->name, ce_namelen(ce))) { if (ignore_missing) @@ -1283,8 +1301,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate, * because CE_UPTODATE flag is in-core only; * we are not going to write this change out. 
*/ - if (!S_ISGITLINK(ce->ce_mode)) + if (!S_ISGITLINK(ce->ce_mode)) { ce_mark_uptodate(ce); + mark_fsmonitor_valid(ce); + } return ce; } } @@ -1392,6 +1412,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, */ ce->ce_flags &= ~CE_VALID; ce->ce_flags |= CE_UPDATE_IN_BASE; + mark_fsmonitor_invalid(istate, ce); istate->cache_changed |= CE_ENTRY_CHANGED; } if (quiet) @@ -1554,6 +1575,9 @@ static int read_index_extension(struct index_state *istate, case CACHE_EXT_UNTRACKED: istate->untracked = read_untracked_extension(data, sz); break; + case CACHE_EXT_FSMONITOR: + read_fsmonitor_extension(istate, data, sz); + break; default: if (*ext < 'A' || 'Z' < *ext) return error("index uses %.4s extension, which we do not understand", @@ -1729,6 +1753,7 @@ static void post_read_index_from(struct index_state *istate) check_ce_order(istate); tweak_untracked_cache(istate); tweak_split_index(istate); + tweak_fsmonitor(istate); } /* remember to discard_cache() before reading a different cache! 
*/ @@ -2320,6 +2345,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, if (err) return -1; } + if (!strip_extensions && istate->fsmonitor_last_update) { + struct strbuf sb = STRBUF_INIT; + + write_fsmonitor_extension(&sb, istate); + err = write_index_ext_header(&c, newfd, CACHE_EXT_FSMONITOR, sb.len) < 0 + || ce_write(&c, newfd, sb.buf, sb.len) < 0; + strbuf_release(&sb); + if (err) + return -1; + } if (ce_flush(&c, newfd, istate->sha1)) return -1; diff --git a/submodule.c b/submodule.c index 3ee4a0caa7..bb531e0e59 100644 --- a/submodule.c +++ b/submodule.c @@ -62,7 +62,7 @@ int is_staging_gitmodules_ok(const struct index_state *istate) if ((pos >= 0) && (pos < istate->cache_nr)) { struct stat st; if (lstat(GITMODULES_FILE, &st) == 0 && - ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED) + ce_match_stat(istate->cache[pos], &st, CE_MATCH_IGNORE_FSMONITOR) & DATA_CHANGED) return 0; } diff --git a/t/helper/.gitignore b/t/helper/.gitignore index 7c9d28a834..d02f9b39ac 100644 --- a/t/helper/.gitignore +++ b/t/helper/.gitignore @@ -3,7 +3,9 @@ /test-config /test-date /test-delta +/test-drop-caches /test-dump-cache-tree +/test-dump-fsmonitor /test-dump-split-index /test-dump-untracked-cache /test-fake-ssh diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c new file mode 100644 index 0000000000..bd1a857d52 --- /dev/null +++ b/t/helper/test-drop-caches.c @@ -0,0 +1,164 @@ +#include "git-compat-util.h" + +#if defined(GIT_WINDOWS_NATIVE) + +static int cmd_sync(void) +{ + char Buffer[MAX_PATH]; + DWORD dwRet; + char szVolumeAccessPath[] = "\\\\.\\X:"; + HANDLE hVolWrite; + int success = 0; + + dwRet = GetCurrentDirectory(MAX_PATH, Buffer); + if ((0 == dwRet) || (dwRet > MAX_PATH)) + return error("Error getting current directory"); + + if ((Buffer[0] < 'A') || (Buffer[0] > 'Z')) + return error("Invalid drive letter '%c'", Buffer[0]); + + szVolumeAccessPath[4] = Buffer[0]; + hVolWrite = CreateFile(szVolumeAccessPath, 
GENERIC_READ | GENERIC_WRITE, + FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL); + if (INVALID_HANDLE_VALUE == hVolWrite) + return error("Unable to open volume for writing, need admin access"); + + success = FlushFileBuffers(hVolWrite); + if (!success) + error("Unable to flush volume"); + + CloseHandle(hVolWrite); + + return !success; +} + +#define STATUS_SUCCESS (0x00000000L) +#define STATUS_PRIVILEGE_NOT_HELD (0xC0000061L) + +typedef enum _SYSTEM_INFORMATION_CLASS { + SystemMemoryListInformation = 80, +} SYSTEM_INFORMATION_CLASS; + +typedef enum _SYSTEM_MEMORY_LIST_COMMAND { + MemoryCaptureAccessedBits, + MemoryCaptureAndResetAccessedBits, + MemoryEmptyWorkingSets, + MemoryFlushModifiedList, + MemoryPurgeStandbyList, + MemoryPurgeLowPriorityStandbyList, + MemoryCommandMax +} SYSTEM_MEMORY_LIST_COMMAND; + +static BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags) +{ + BOOL bResult; + DWORD dwBufferLength; + LUID luid; + TOKEN_PRIVILEGES tpPreviousState; + TOKEN_PRIVILEGES tpNewState; + + dwBufferLength = 16; + bResult = LookupPrivilegeValueA(0, lpName, &luid); + if (bResult) { + tpNewState.PrivilegeCount = 1; + tpNewState.Privileges[0].Luid = luid; + tpNewState.Privileges[0].Attributes = 0; + bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState, + (DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState), + &tpPreviousState, &dwBufferLength); + if (bResult) { + tpPreviousState.PrivilegeCount = 1; + tpPreviousState.Privileges[0].Luid = luid; + tpPreviousState.Privileges[0].Attributes = flags != 0 ? 
2 : 0; + bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState, + dwBufferLength, 0, 0); + } + } + return bResult; +} + +static int cmd_dropcaches(void) +{ + HANDLE hProcess = GetCurrentProcess(); + HANDLE hToken; + HMODULE ntdll; + DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG); + SYSTEM_MEMORY_LIST_COMMAND command; + int status; + + if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken)) + return error("Can't open current process token"); + + if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1)) + return error("Can't get SeProfileSingleProcessPrivilege"); + + CloseHandle(hToken); + + ntdll = LoadLibrary("ntdll.dll"); + if (!ntdll) + return error("Can't load ntdll.dll, wrong Windows version?"); + + NtSetSystemInformation = + (DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation"); + if (!NtSetSystemInformation) + return error("Can't get function addresses, wrong Windows version?"); + + command = MemoryPurgeStandbyList; + status = NtSetSystemInformation( + SystemMemoryListInformation, + &command, + sizeof(SYSTEM_MEMORY_LIST_COMMAND) + ); + if (status == STATUS_PRIVILEGE_NOT_HELD) + error("Insufficient privileges to purge the standby list, need admin access"); + else if (status != STATUS_SUCCESS) + error("Unable to execute the memory list command %d", status); + + FreeLibrary(ntdll); + + return status; +} + +#elif defined(__linux__) + +static int cmd_sync(void) +{ + return system("sync"); +} + +static int cmd_dropcaches(void) +{ + return system("echo 3 | sudo tee /proc/sys/vm/drop_caches"); +} + +#elif defined(__APPLE__) + +static int cmd_sync(void) +{ + return system("sync"); +} + +static int cmd_dropcaches(void) +{ + return system("sudo purge"); +} + +#else + +static int cmd_sync(void) +{ + return 0; +} + +static int cmd_dropcaches(void) +{ + return error("drop caches not implemented on this platform"); +} + +#endif + +int cmd_main(int argc, const char **argv) +{ + cmd_sync(); + 
return cmd_dropcaches(); +} diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c new file mode 100644 index 0000000000..ad452707e8 --- /dev/null +++ b/t/helper/test-dump-fsmonitor.c @@ -0,0 +1,21 @@ +#include "cache.h" + +int cmd_main(int ac, const char **av) +{ + struct index_state *istate = &the_index; + int i; + + setup_git_directory(); + if (do_read_index(istate, get_index_file(), 0) < 0) + die("unable to read index file"); + if (!istate->fsmonitor_last_update) { + printf("no fsmonitor\n"); + return 0; + } + printf("fsmonitor last update %"PRIuMAX"\n", (uintmax_t)istate->fsmonitor_last_update); + + for (i = 0; i < istate->cache_nr; i++) + printf((istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) ? "+" : "-"); + + return 0; +} diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh new file mode 100755 index 0000000000..16d1bf72e5 --- /dev/null +++ b/t/perf/p7519-fsmonitor.sh @@ -0,0 +1,184 @@ +#!/bin/sh + +test_description="Test core.fsmonitor" + +. ./perf-lib.sh + +# +# Performance test for the fsmonitor feature which enables git to talk to a +# file system change monitor and avoid having to scan the working directory +# for new or modified files. +# +# By default, the performance test will utilize the Watchman file system +# monitor if it is installed. If Watchman is not installed, it will use a +# dummy integration script that does not report any new or modified files. +# The dummy script has very little overhead which provides optimistic results. +# +# The performance test will also use the untracked cache feature if it is +# available as fsmonitor uses it to speed up scanning for untracked files. 
+# +# There are 3 environment variables that can be used to alter the default +# behavior of the performance test: +# +# GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache +# GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex +# GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor +# +# The big win for using fsmonitor is the elimination of the need to scan the +# working directory looking for changed and untracked files. If the file +# information is all cached in RAM, the benefits are reduced. +# +# GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests +# + +test_perf_large_repo +test_checkout_worktree + +test_lazy_prereq UNTRACKED_CACHE ' + { git update-index --test-untracked-cache; ret=$?; } && + test $ret -ne 1 +' + +test_lazy_prereq WATCHMAN ' + { command -v watchman >/dev/null 2>&1; ret=$?; } && + test $ret -ne 1 +' + +if test_have_prereq WATCHMAN +then + # Convert unix style paths to escaped Windows style paths for Watchman + case "$(uname -s)" in + MSYS_NT*) + GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')" + ;; + *) + GIT_WORK_TREE="$PWD" + ;; + esac +fi + +if test -n "$GIT_PERF_7519_DROP_CACHE" +then + # When using GIT_PERF_7519_DROP_CACHE, GIT_PERF_REPEAT_COUNT must be 1 to + # generate valid results. Otherwise the caching that happens for the nth + # run will negate the validity of the comparisons. 
+ if test "$GIT_PERF_REPEAT_COUNT" -ne 1 + then + echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2 + GIT_PERF_REPEAT_COUNT=1 + fi +fi + +test_expect_success "setup for fsmonitor" ' + # set untrackedCache depending on the environment + if test -n "$GIT_PERF_7519_UNTRACKED_CACHE" + then + git config core.untrackedCache "$GIT_PERF_7519_UNTRACKED_CACHE" + else + if test_have_prereq UNTRACKED_CACHE + then + git config core.untrackedCache true + else + git config core.untrackedCache false + fi + fi && + + # set core.splitindex depending on the environment + if test -n "$GIT_PERF_7519_SPLIT_INDEX" + then + git config core.splitIndex "$GIT_PERF_7519_SPLIT_INDEX" + fi && + + # set INTEGRATION_SCRIPT depending on the environment + if test -n "$GIT_PERF_7519_FSMONITOR" + then + INTEGRATION_SCRIPT="$GIT_PERF_7519_FSMONITOR" + else + # + # Choose integration script based on existence of Watchman. + # If Watchman exists, watch the work tree and attempt a query. + # If everything succeeds, use Watchman integration script, + # else fall back to an empty integration script. 
+ # + mkdir .git/hooks && + if test_have_prereq WATCHMAN + then + INTEGRATION_SCRIPT=".git/hooks/fsmonitor-watchman" && + cp "$TEST_DIRECTORY/../templates/hooks--fsmonitor-watchman.sample" "$INTEGRATION_SCRIPT" && + watchman watch "$GIT_WORK_TREE" && + watchman watch-list | grep -q -F "$GIT_WORK_TREE" + else + INTEGRATION_SCRIPT=".git/hooks/fsmonitor-empty" && + write_script "$INTEGRATION_SCRIPT"<<-\EOF + EOF + fi + fi && + + git config core.fsmonitor "$INTEGRATION_SCRIPT" && + git update-index --fsmonitor +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-drop-caches +fi + +test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-drop-caches +fi + +test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uno +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-drop-caches +fi + +test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uall +' + +test_expect_success "setup without fsmonitor" ' + unset INTEGRATION_SCRIPT && + git config --unset core.fsmonitor && + git update-index --no-fsmonitor +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-drop-caches +fi + +test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-drop-caches +fi + +test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uno +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-drop-caches +fi + +test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uall +' + +if test_have_prereq WATCHMAN +then + watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 && + + # Work around Watchman bug on Windows where it holds on to handles + # preventing the removal of the trash directory + watchman shutdown-server >/dev/null 2>&1 +fi + +test_done diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh index 22f69a410b..af9b847761 100755 --- a/t/t1700-split-index.sh +++ 
b/t/t1700-split-index.sh @@ -6,6 +6,7 @@ test_description='split index mode tests' # We need total control of index splitting here sane_unset GIT_TEST_SPLIT_INDEX +sane_unset GIT_FSMONITOR_TEST test_expect_success 'enable split index' ' git config splitIndex.maxPercentChange 100 && diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh new file mode 100755 index 0000000000..c6df85af5e --- /dev/null +++ b/t/t7519-status-fsmonitor.sh @@ -0,0 +1,304 @@ +#!/bin/sh + +test_description='git status with file system watcher' + +. ./test-lib.sh + +# +# To run the entire git test suite using fsmonitor: +# +# copy t/t7519/fsmonitor-all to a location in your path and then set +# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests. +# + +# Note, after "git reset --hard HEAD" no extensions exist other than 'TREE' +# "git update-index --fsmonitor" can be used to get the extension written +# before testing the results. + +clean_repo () { + git reset --hard HEAD && + git clean -fd +} + +dirty_repo () { + : >untracked && + : >dir1/untracked && + : >dir2/untracked && + echo 1 >modified && + echo 2 >dir1/modified && + echo 3 >dir2/modified && + echo 4 >new && + echo 5 >dir1/new && + echo 6 >dir2/new +} + +write_integration_script () { + write_script .git/hooks/fsmonitor-test<<-\EOF + if test "$#" -ne 2 + then + echo "$0: exactly 2 arguments expected" + exit 2 + fi + if test "$1" != 1 + then + echo "Unsupported core.fsmonitor hook version." 
>&2 + exit 1 + fi + printf "untracked\0" + printf "dir1/untracked\0" + printf "dir2/untracked\0" + printf "modified\0" + printf "dir1/modified\0" + printf "dir2/modified\0" + printf "new\0" + printf "dir1/new\0" + printf "dir2/new\0" + EOF +} + +test_lazy_prereq UNTRACKED_CACHE ' + { git update-index --test-untracked-cache; ret=$?; } && + test $ret -ne 1 +' + +test_expect_success 'setup' ' + mkdir -p .git/hooks && + : >tracked && + : >modified && + mkdir dir1 && + : >dir1/tracked && + : >dir1/modified && + mkdir dir2 && + : >dir2/tracked && + : >dir2/modified && + git -c core.fsmonitor= add . && + git -c core.fsmonitor= commit -m initial && + git config core.fsmonitor .git/hooks/fsmonitor-test && + cat >.gitignore <<-\EOF + .gitignore + expect* + actual* + marker* + EOF +' + +# test that the fsmonitor extension is off by default +test_expect_success 'fsmonitor extension is off by default' ' + test-dump-fsmonitor >actual && + grep "^no fsmonitor" actual +' + +# test that "update-index --fsmonitor" adds the fsmonitor extension +test_expect_success 'update-index --fsmonitor" adds the fsmonitor extension' ' + git update-index --fsmonitor && + test-dump-fsmonitor >actual && + grep "^fsmonitor last update" actual +' + +# test that "update-index --no-fsmonitor" removes the fsmonitor extension +test_expect_success 'update-index --no-fsmonitor" removes the fsmonitor extension' ' + git update-index --no-fsmonitor && + test-dump-fsmonitor >actual && + grep "^no fsmonitor" actual +' + +cat >expect <<EOF && +h dir1/modified +H dir1/tracked +h dir2/modified +H dir2/tracked +h modified +H tracked +EOF + +# test that "update-index --fsmonitor-valid" sets the fsmonitor valid bit +test_expect_success 'update-index --fsmonitor-valid" sets the fsmonitor valid bit' ' + git update-index --fsmonitor && + git update-index --fsmonitor-valid dir1/modified && + git update-index --fsmonitor-valid dir2/modified && + git update-index --fsmonitor-valid modified && + git ls-files -f >actual && + 
test_cmp expect actual +' + +cat >expect <<EOF && +H dir1/modified +H dir1/tracked +H dir2/modified +H dir2/tracked +H modified +H tracked +EOF + +# test that "update-index --no-fsmonitor-valid" clears the fsmonitor valid bit +test_expect_success 'update-index --no-fsmonitor-valid" clears the fsmonitor valid bit' ' + git update-index --no-fsmonitor-valid dir1/modified && + git update-index --no-fsmonitor-valid dir2/modified && + git update-index --no-fsmonitor-valid modified && + git ls-files -f >actual && + test_cmp expect actual +' + +cat >expect <<EOF && +H dir1/modified +H dir1/tracked +H dir2/modified +H dir2/tracked +H modified +H tracked +EOF + +# test that all files returned by the script get flagged as invalid +test_expect_success 'all files returned by integration script get flagged as invalid' ' + write_integration_script && + dirty_repo && + git update-index --fsmonitor && + git ls-files -f >actual && + test_cmp expect actual +' + +cat >expect <<EOF && +H dir1/modified +h dir1/new +H dir1/tracked +H dir2/modified +h dir2/new +H dir2/tracked +H modified +h new +H tracked +EOF + +# test that newly added files are marked valid +test_expect_success 'newly added files are marked valid' ' + git add new && + git add dir1/new && + git add dir2/new && + git ls-files -f >actual && + test_cmp expect actual +' + +cat >expect <<EOF && +H dir1/modified +h dir1/new +h dir1/tracked +H dir2/modified +h dir2/new +h dir2/tracked +H modified +h new +h tracked +EOF + +# test that all unmodified files get marked valid +test_expect_success 'all unmodified files get marked valid' ' + # modified files result in update-index returning 1 + test_must_fail git update-index --refresh --force-write-index && + git ls-files -f >actual && + test_cmp expect actual +' + +cat >expect <<EOF && +H dir1/modified +h dir1/tracked +h dir2/modified +h dir2/tracked +h modified +h tracked +EOF + +# test that *only* files returned by the integration script get flagged as invalid +test_expect_success 
'*only* files returned by the integration script get flagged as invalid' ' + write_script .git/hooks/fsmonitor-test<<-\EOF && + printf "dir1/modified\0" + EOF + clean_repo && + git update-index --refresh --force-write-index && + echo 1 >modified && + echo 2 >dir1/modified && + echo 3 >dir2/modified && + test_must_fail git update-index --refresh --force-write-index && + git ls-files -f >actual && + test_cmp expect actual +' + +# Ensure commands that call refresh_index() to move the index back in time +# properly invalidate the fsmonitor cache +test_expect_success 'refresh_index() invalidates fsmonitor cache' ' + write_script .git/hooks/fsmonitor-test<<-\EOF && + EOF + clean_repo && + dirty_repo && + git add . && + git commit -m "to reset" && + git reset HEAD~1 && + git status >actual && + git -c core.fsmonitor= status >expect && + test_i18ncmp expect actual +' + +# test fsmonitor with and without preloadIndex +preload_values="false true" +for preload_val in $preload_values +do + test_expect_success "setup preloadIndex to $preload_val" ' + git config core.preloadIndex $preload_val && + if test $preload_val = true + then + GIT_FORCE_PRELOAD_TEST=$preload_val; export GIT_FORCE_PRELOAD_TEST + else + unset GIT_FORCE_PRELOAD_TEST + fi + ' + + # test fsmonitor with and without the untracked cache (if available) + uc_values="false" + test_have_prereq UNTRACKED_CACHE && uc_values="false true" + for uc_val in $uc_values + do + test_expect_success "setup untracked cache to $uc_val" ' + git config core.untrackedcache $uc_val + ' + + # Status is well tested elsewhere so we'll just ensure that the results are + # the same when using core.fsmonitor. 
+ test_expect_success 'compare status with and without fsmonitor' ' + write_integration_script && + clean_repo && + dirty_repo && + git add new && + git add dir1/new && + git add dir2/new && + git status >actual && + git -c core.fsmonitor= status >expect && + test_i18ncmp expect actual + ' + + # Make sure it's actually skipping the check for modified and untracked + # (if enabled) files unless it is told about them. + test_expect_success "status doesn't detect unreported modifications" ' + write_script .git/hooks/fsmonitor-test<<-\EOF && + :>marker + EOF + clean_repo && + git status && + test_path_is_file marker && + dirty_repo && + rm -f marker && + git status >actual && + test_path_is_file marker && + test_i18ngrep ! "Changes not staged for commit:" actual && + if test $uc_val = true + then + test_i18ngrep ! "Untracked files:" actual + fi && + if test $uc_val = false + then + test_i18ngrep "Untracked files:" actual + fi && + rm -f marker + ' + done +done + +test_done diff --git a/t/t7519/fsmonitor-all b/t/t7519/fsmonitor-all new file mode 100755 index 0000000000..691bc94dc2 --- /dev/null +++ b/t/t7519/fsmonitor-all @@ -0,0 +1,24 @@ +#!/bin/sh +# +# An test hook script to integrate with git to test fsmonitor. +# +# The hook is passed a version (currently 1) and a time in nanoseconds +# formatted as a string and outputs to stdout all files that have been +# modified since the given time. Paths must be relative to the root of +# the working tree and separated by a single NUL. +# +#echo "$0 $*" >&2 + +if test "$#" -ne 2 +then + echo "$0: exactly 2 arguments expected" >&2 + exit 2 +fi + +if test "$1" != 1 +then + echo "Unsupported core.fsmonitor hook version." >&2 + exit 1 +fi + +echo "/" diff --git a/t/t7519/fsmonitor-none b/t/t7519/fsmonitor-none new file mode 100755 index 0000000000..ed9cf5a6a9 --- /dev/null +++ b/t/t7519/fsmonitor-none @@ -0,0 +1,22 @@ +#!/bin/sh +# +# An test hook script to integrate with git to test fsmonitor. 
+# +# The hook is passed a version (currently 1) and a time in nanoseconds +# formatted as a string and outputs to stdout all files that have been +# modified since the given time. Paths must be relative to the root of +# the working tree and separated by a single NUL. +# +#echo "$0 $*" >&2 + +if test "$#" -ne 2 +then + echo "$0: exactly 2 arguments expected" >&2 + exit 2 +fi + +if test "$1" != 1 +then + echo "Unsupported core.fsmonitor hook version." >&2 + exit 1 +fi diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman new file mode 100755 index 0000000000..a3e30bf54f --- /dev/null +++ b/t/t7519/fsmonitor-watchman @@ -0,0 +1,139 @@ +#!/usr/bin/perl + +use strict; +use warnings; +use IPC::Open2; + +# An example hook script to integrate Watchman +# (https://facebook.github.io/watchman/) with git to speed up detecting +# new and modified files. +# +# The hook is passed a version (currently 1) and a time in nanoseconds +# formatted as a string and outputs to stdout all files that have been +# modified since the given time. Paths must be relative to the root of +# the working tree and separated by a single NUL. +# +# To enable this hook, rename this file to "query-watchman" and set +# 'git config core.fsmonitor .git/hooks/query-watchman' +# +my ($version, $time) = @ARGV; +#print STDERR "$0 $version $time\n"; + +# Check the hook interface version + +if ($version == 1) { + # convert nanoseconds to seconds + $time = int $time / 1000000000; +} else { + die "Unsupported query-fsmonitor hook version '$version'.\n" . 
+ "Falling back to scanning...\n"; +} + +# Convert unix style paths to escaped Windows style paths when running +# in Windows command prompt + +my $system = `uname -s`; +$system =~ s/[\r\n]+//g; +my $git_work_tree; + +if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) { + $git_work_tree = `cygpath -aw "\$PWD"`; + $git_work_tree =~ s/[\r\n]+//g; + $git_work_tree =~ s,\\,/,g; +} else { + $git_work_tree = $ENV{'PWD'}; +} + +my $retry = 1; + +launch_watchman(); + +sub launch_watchman { + + my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j') + or die "open2() failed: $!\n" . + "Falling back to scanning...\n"; + + # In the query expression below we're asking for names of files that + # changed since $time but were not transient (ie created after + # $time but no longer exist). + # + # To accomplish this, we're using the "since" generator to use the + # recency index to select candidate nodes and "fields" to limit the + # output to file names only. Then we're using the "expression" term to + # further constrain the results. + # + # The category of transient files that we want to ignore will have a + # creation clock (cclock) newer than $time_t value and will also not + # currently exist. + + my $query = <<" END"; + ["query", "$git_work_tree", { + "since": $time, + "fields": ["name"], + "expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]] + }] + END + + open (my $fh, ">", ".git/watchman-query.json"); + print $fh $query; + close $fh; + + print CHLD_IN $query; + close CHLD_IN; + my $response = do {local $/; <CHLD_OUT>}; + + open ($fh, ">", ".git/watchman-response.json"); + print $fh $response; + close $fh; + + die "Watchman: command returned no output.\n" . + "Falling back to scanning...\n" if $response eq ""; + die "Watchman: command returned invalid output: $response\n" . 
+ "Falling back to scanning...\n" unless $response =~ /^\{/; + + my $json_pkg; + eval { + require JSON::XS; + $json_pkg = "JSON::XS"; + 1; + } or do { + require JSON::PP; + $json_pkg = "JSON::PP"; + }; + + my $o = $json_pkg->new->utf8->decode($response); + + if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) { + print STDERR "Adding '$git_work_tree' to watchman's watch list.\n"; + $retry--; + qx/watchman watch "$git_work_tree"/; + die "Failed to make watchman watch '$git_work_tree'.\n" . + "Falling back to scanning...\n" if $? != 0; + + # Watchman will always return all files on the first query so + # return the fast "everything is dirty" flag to git and do the + # Watchman query just to get it over with now so we won't pay + # the cost in git to look up each individual file. + + open ($fh, ">", ".git/watchman-output.out"); + print "/\0"; + close $fh; + + print "/\0"; + eval { launch_watchman() }; + exit 0; + } + + die "Watchman: $o->{error}.\n" . + "Falling back to scanning...\n" if $o->{error}; + + open ($fh, ">", ".git/watchman-output.out"); + binmode $fh, ":utf8"; + print $fh @{$o->{files}}; + close $fh; + + binmode STDOUT, ":utf8"; + local $, = "\0"; + print @{$o->{files}}; +} diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample new file mode 100755 index 0000000000..9eba8a7409 --- /dev/null +++ b/templates/hooks--fsmonitor-watchman.sample @@ -0,0 +1,120 @@ +#!/usr/bin/perl + +use strict; +use warnings; +use IPC::Open2; + +# An example hook script to integrate Watchman +# (https://facebook.github.io/watchman/) with git to speed up detecting +# new and modified files. +# +# The hook is passed a version (currently 1) and a time in nanoseconds +# formatted as a string and outputs to stdout all files that have been +# modified since the given time. Paths must be relative to the root of +# the working tree and separated by a single NUL. 
+# +# To enable this hook, rename this file to "query-watchman" and set +# 'git config core.fsmonitor .git/hooks/query-watchman' +# +my ($version, $time) = @ARGV; + +# Check the hook interface version + +if ($version == 1) { + # convert nanoseconds to seconds + $time = int $time / 1000000000; +} else { + die "Unsupported query-fsmonitor hook version '$version'.\n" . + "Falling back to scanning...\n"; +} + +# Convert unix style paths to escaped Windows style paths when running +# in Windows command prompt + +my $system = `uname -s`; +$system =~ s/[\r\n]+//g; +my $git_work_tree; + +if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) { + $git_work_tree = `cygpath -aw "\$PWD"`; + $git_work_tree =~ s/[\r\n]+//g; + $git_work_tree =~ s,\\,/,g; +} else { + $git_work_tree = $ENV{'PWD'}; +} + +my $retry = 1; + +launch_watchman(); + +sub launch_watchman { + + my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j') + or die "open2() failed: $!\n" . + "Falling back to scanning...\n"; + + # In the query expression below we're asking for names of files that + # changed since $time but were not transient (ie created after + # $time but no longer exist). + # + # To accomplish this, we're using the "since" generator to use the + # recency index to select candidate nodes and "fields" to limit the + # output to file names only. Then we're using the "expression" term to + # further constrain the results. + # + # The category of transient files that we want to ignore will have a + # creation clock (cclock) newer than $time_t value and will also not + # currently exist. + + my $query = <<" END"; + ["query", "$git_work_tree", { + "since": $time, + "fields": ["name"], + "expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]] + }] + END + + print CHLD_IN $query; + close CHLD_IN; + my $response = do {local $/; <CHLD_OUT>}; + + die "Watchman: command returned no output.\n" . 
+ "Falling back to scanning...\n" if $response eq ""; + die "Watchman: command returned invalid output: $response\n" . + "Falling back to scanning...\n" unless $response =~ /^\{/; + + my $json_pkg; + eval { + require JSON::XS; + $json_pkg = "JSON::XS"; + 1; + } or do { + require JSON::PP; + $json_pkg = "JSON::PP"; + }; + + my $o = $json_pkg->new->utf8->decode($response); + + if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) { + print STDERR "Adding '$git_work_tree' to watchman's watch list.\n"; + $retry--; + qx/watchman watch "$git_work_tree"/; + die "Failed to make watchman watch '$git_work_tree'.\n" . + "Falling back to scanning...\n" if $? != 0; + + # Watchman will always return all files on the first query so + # return the fast "everything is dirty" flag to git and do the + # Watchman query just to get it over with now so we won't pay + # the cost in git to look up each individual file. + print "/\0"; + eval { launch_watchman() }; + exit 0; + } + + die "Watchman: $o->{error}.\n" . + "Falling back to scanning...\n" if $o->{error}; + + binmode STDOUT, ":utf8"; + local $, = "\0"; + print @{$o->{files}}; +} diff --git a/unpack-trees.c b/unpack-trees.c index 25740cb593..bf8b602901 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -14,6 +14,7 @@ #include "dir.h" #include "submodule.h" #include "submodule-config.h" +#include "fsmonitor.h" /* * Error messages expected by scripts out of plumbing commands such as @@ -408,6 +409,7 @@ static int apply_sparse_checkout(struct index_state *istate, ce->ce_flags &= ~CE_SKIP_WORKTREE; if (was_skip_worktree != ce_skip_worktree(ce)) { ce->ce_flags |= CE_UPDATE_IN_BASE; + mark_fsmonitor_invalid(istate, ce); istate->cache_changed |= CE_ENTRY_CHANGED; } |
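The version 1 hook interface described in the script comments above (two arguments, a version and a time in nanoseconds; NUL-separated paths on stdout) can be sketched as a minimal hook. This is an illustration only, not part of the commit: the function name `fsmonitor_hook` is hypothetical, the argument checks mirror t/t7519/fsmonitor-all, and the `/` followed by NUL is the "everything is dirty" response that the Watchman sample prints on its first query.

```shell
#!/bin/sh
# Hypothetical minimal core.fsmonitor hook body, wrapped in a function so it
# can be exercised directly. Assumes hook interface version 1 as described
# in the sample scripts: $1 is the version, $2 is a time in nanoseconds.
fsmonitor_hook () {
	if test "$#" -ne 2
	then
		echo "fsmonitor_hook: exactly 2 arguments expected" >&2
		return 2
	fi
	if test "$1" != 1
	then
		echo "Unsupported core.fsmonitor hook version." >&2
		return 1
	fi
	# Report "/" terminated by NUL: the "everything is dirty" answer,
	# so git falls back to checking every path itself.
	printf "/\0"
}
```

To try something like this as a real hook, the function body would go into an executable script and `core.fsmonitor` would point at it, the same way the tests run `git config core.fsmonitor .git/hooks/fsmonitor-test`. A real integration would replace the final `printf` with the list of paths changed since the time given in `$2`.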