path: root/tree-walk.h
Commit message | Author | Age | Files | Lines
* fsck: handle bad trees like other errors (dt/tree-fsck) | David Turner | 2016-09-27 | 1 | -0/+8
  Instead of dying when fsck hits a malformed tree object, log the error like any other and continue. Now fsck can tell the user which tree is bad, too.

  Signed-off-by: David Turner <dturner@twosigma.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tree-walk: convert tree_entry_extract() to use struct object_id | brian m. carlson | 2016-04-25 | 1 | -2/+2
  Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* struct name_entry: use struct object_id instead of unsigned char sha1[20] | brian m. carlson | 2016-04-25 | 1 | -3/+3
  Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* do_compare_entry: use already-computed path | David Turner | 2016-01-05 | 1 | -0/+1
  In traverse_trees, we generate the complete traverse path for a traverse_info. Later, in do_compare_entry, we used to go do a bunch of work to compare the traverse_info to a cache_entry's name without computing that path. But since we already have that path, we don't need to do all that work. Instead, we can just put the generated path into the traverse_info, and do the comparison more directly.

  We copy the path because prune_traversal might mutate `base`. This doesn't happen in any codepaths where do_compare_entry is called, but it's better to be safe.

  This makes git checkout much faster -- about 25% on Twitter's monorepo. Deeper directory trees are likely to benefit more than shallower ones.

  Signed-off-by: David Turner <dturner@twopensource.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tree-walk: learn get_tree_entry_follow_symlinks | David Turner | 2015-05-20 | 1 | -0/+18
  Add a new function, get_tree_entry_follow_symlinks, to tree-walk.[ch]. The function is not yet used. It will be used to implement git cat-file --batch --follow-symlinks.

  The function locates an object by path, following symlinks in the repository. If the symlinks lead outside the repository, the function reports this to the caller.

  Signed-off-by: David Turner <dturner@twopensource.com>
  Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tree-walk: finally switch over tree descriptors to contain a pre-parsed entry (ks/tree-diff-walk) | Kirill Smelkov | 2014-02-24 | 1 | -1/+1
  This continues 4651ece8 (Switch over tree descriptors to contain a pre-parsed entry) and moves the only remaining computational part

      mode = canon_mode(mode)

  from tree_entry_extract() to the tree entry decode phase, i.e. to decode_tree_entry().

  The reason is that canon_mode() is at least 2 conditional jumps for regular files, and that could be noticeable should canon_mode() be invoked several times.

  That does not matter for the current Git codebase, where typical tree traversal is

      while (t->size) {
              sha1 = tree_entry_extract(t, &path, &mode);
              ...
              update_tree_entry(t);
      }

  i.e. we do the t -> sha1,path.mode "extraction" only once per entry. In such cases, it does not matter performance-wise where the mode canonicalization is done, either once in tree_entry_extract() or once in decode_tree_entry() called by update_tree_entry(); it is approximately the same.

  But for future code, which could need to work with several tree_desc's in parallel, it could be handy to operate on tree_desc descriptors and do "extracts" only when needed, or not at all, accessing only the relevant part through structure fields directly.

  And for such situations, having canon_mode() be done once in the decode phase is better: we won't need to pay the performance price of 2 extra conditional jumps on every t->mode access.

  So let's move mode canonicalization to decode_tree_entry(). That was the final bit. Now after a tree entry is decoded, it is fully ready and can be accessed either directly via field, or through tree_entry_extract(), which this time got really "totally trivial".

  Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
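The canonicalization step discussed in this commit can be sketched in isolation. This is a minimal approximation, not Git's source: the MODE_* constants and the name canon_mode_sketch() are assumptions made for illustration, and the real canon_mode() may be structured differently.

```c
/* Hypothetical constants mirroring the usual octal file-type bits. */
#define MODE_TYPE_MASK 0170000u
#define MODE_REG       0100000u
#define MODE_LNK       0120000u
#define MODE_DIR       0040000u
#define MODE_GITLINK   0160000u

/*
 * Reduce a mode to the few values a tree stores: regular files keep
 * only "executable or not", every other type drops its permission bits.
 * Doing this once while decoding an entry means later readers of the
 * decoded mode never have to repeat the branching.
 */
static unsigned int canon_mode_sketch(unsigned int mode)
{
	switch (mode & MODE_TYPE_MASK) {
	case MODE_REG:
		return MODE_REG | ((mode & 0100u) ? 0755u : 0644u);
	case MODE_LNK:
		return MODE_LNK;
	case MODE_DIR:
		return MODE_DIR;
	default:
		return MODE_GITLINK;
	}
}
```

A decoder would call such a helper once per entry and store the result, so that direct field access and an "extract" call both see the same canonical value.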
* unpack-trees: don't shift conflicts left and right | René Scharfe | 2013-06-17 | 1 | -1/+1
  If o->merge is set, the struct traverse_info member conflicts is shifted left in unpack_callback, then passed through traverse_trees_recursive to unpack_nondirectories, where it is shifted right before use. Stop the shifting and just pass the conflict bit mask as is. Rename the member to df_conflicts to prove that it isn't used anywhere else.

  Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tree_entry_interesting(): give meaningful names to return values | Nguyễn Thái Ngọc Duy | 2011-10-27 | 1 | -1/+11
  It is basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truly matches yet" (i.e. prefix matches).

  Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
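To illustrate why named return values help, here is a toy sketch. It is not the code from this commit: the enum constants, classify() and count_matches() are hypothetical, and the matcher is a plain prefix test over names sorted in byte order rather than Git's pathspec logic.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative names; the real header may spell these differently. */
enum interest {
	ALL_ENTRIES_NOT_INTERESTING = -1, /* no, and nothing later will be */
	ENTRY_NOT_INTERESTING = 0,        /* no, but keep looking */
	ENTRY_INTERESTING = 1             /* yes */
};

/* Classify one name against a prefix, assuming names arrive sorted. */
static enum interest classify(const char *name, const char *prefix)
{
	int cmp = strncmp(name, prefix, strlen(prefix));

	if (cmp == 0)
		return ENTRY_INTERESTING;
	if (cmp > 0)
		return ALL_ENTRIES_NOT_INTERESTING; /* sorted past the prefix */
	return ENTRY_NOT_INTERESTING;
}

/* Count matching names; *visited reports how many were examined. */
static size_t count_matches(const char *const *names, size_t n,
			    const char *prefix, size_t *visited)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++) {
		enum interest r = classify(names[i], prefix);

		if (r == ALL_ENTRIES_NOT_INTERESTING) {
			i++; /* the current name was still examined */
			break;
		}
		if (r == ENTRY_INTERESTING)
			hits++;
	}
	*visited = i;
	return hits;
}
```

With named constants the early exit in the loop reads as intent rather than as a comparison against a bare -1, and a further value can be added later without touching every caller.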
* tree-walk.c: do not leak internal structure in tree_entry_len() | Nguyễn Thái Ngọc Duy | 2011-10-27 | 1 | -3/+3
  tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return an incorrect value.

  Force callers to pass struct name_entry instead of two pointers (with hope that they don't manually construct struct name_entry themselves).

  Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* traverse_trees(): allow pruning with pathspec | Junio C Hamano | 2011-08-29 | 1 | -0/+1
  The traverse_trees() machinery is primarily meant for merging two (or more) trees, and because a merge is a full tree operation, it doesn't support any pruning with pathspec.

  Since d1f2d7e (Make run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19), however, we use unpack_trees() to traverse_trees() callchain to perform "diff-index", which could waste a lot of work traversing trees outside the user-supplied pathspec, only to discard at the blob comparison level in diff-lib.c::oneway_diff() which is way too late.

  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* grep: drop pathspec_matches() in favor of tree_entry_interesting() | Nguyễn Thái Ngọc Duy | 2011-02-03 | 1 | -1/+1
  Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tree_entry_interesting(): support wildcard matching | Nguyễn Thái Ngọc Duy | 2011-02-03 | 1 | -1/+1
  never_interesting optimization is disabled if there is any wildcard pathspec, even if it only matches exactly on trees.

  Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* diff-tree: convert base+baselen to writable strbuf | Nguyễn Thái Ngọc Duy | 2011-02-03 | 1 | -1/+1
  In traversing trees, a full path is split into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, does two memcpy() calls, uses it, then releases it.

  Instead this patch turns "base" into a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function.

  This avoids quite a bit of malloc() and memcpy().

  Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
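The append, use, truncate pattern described in this commit can be sketched as follows. This is only an illustration under stated assumptions: struct pathbuf and its helpers are hypothetical stand-ins for a growable string buffer, not Git's strbuf API, and entry_matches() is an invented callee.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal growable buffer (a stand-in for a strbuf). */
struct pathbuf {
	char *buf;
	size_t len, alloc;
};

/* Append n bytes of s, growing the buffer if needed; -1 on failure. */
static int pathbuf_add(struct pathbuf *pb, const char *s, size_t n)
{
	if (pb->len + n + 1 > pb->alloc) {
		size_t alloc = (pb->len + n + 1) * 2;
		char *p = realloc(pb->buf, alloc);

		if (!p)
			return -1;
		pb->buf = p;
		pb->alloc = alloc;
	}
	memcpy(pb->buf + pb->len, s, n);
	pb->len += n;
	pb->buf[pb->len] = '\0';
	return 0;
}

/* Truncate back to a previously recorded length. */
static void pathbuf_setlen(struct pathbuf *pb, size_t len)
{
	pb->len = len;
	if (pb->buf)
		pb->buf[len] = '\0';
}

/*
 * Callee side: temporarily extend base with entry, use the full path,
 * then restore base so the caller sees it unchanged.
 * Returns 1 on match, 0 on mismatch, -1 on allocation failure.
 */
static int entry_matches(struct pathbuf *base, const char *entry,
			 const char *wanted)
{
	size_t old_len = base->len;
	int match;

	if (pathbuf_add(base, entry, strlen(entry)) < 0)
		return -1;
	match = !strcmp(base->buf, wanted);
	pathbuf_setlen(base, old_len);
	return match;
}
```

Once the buffer has grown to the longest path seen, later entries in the same directory reuse the allocation, which is where the saved malloc() and memcpy() calls come from.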
* Move tree_entry_interesting() to tree-walk.c and export it | Nguyễn Thái Ngọc Duy | 2011-02-03 | 1 | -0/+2
  Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'maint' | Junio C Hamano | 2010-08-26 | 1 | -1/+4
|\
| |   * maint:
| |     for-each-ref: fix objectname:short bug
| |     tree-walk: Correct bitrotted comment about tree_entry()
| |     Fix 'git log' early pager startup error case
| * tree-walk: Correct bitrotted comment about tree_entry() | Elijah Newren | 2010-08-25 | 1 | -1/+4
| |   There was a code comment that referred to the "above two functions" but over time the functions immediately preceding the comment have changed. Just mention the relevant functions by name.
| |
| |   Signed-off-by: Elijah Newren <newren@gmail.com>
| |   Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | unpack_trees: group error messages by type | Matthieu Moy | 2010-08-11 | 1 | -0/+1
|/
  When an error is encountered, it calls add_rejected_file() which either
    - directly displays the error message and stops if in plumbing mode (i.e. if show_all_errors is not initialized at 1)
    - or stores it so that it will be displayed at the end with display_error_msgs().

  Storing the files by error type makes it possible to have a list of files for which there is the same error instead of a series of almost identical errors.

  As each bind_overlap error combines a file and an old file, a list cannot be made; therefore, these errors are not stored but directly displayed.

  Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Make 'traverse_trees()' traverse conflicting DF entries in parallel | Linus Torvalds | 2008-03-09 | 1 | -1/+2
  This makes the traverse_trees() entry comparator routine use the more relaxed form of name comparison that considers files and directories with the same name identical.

  We pass in a separate mask for just the directory entries, so that the callback routine can decide (if it wants to) to only handle one or the other type, but generally most (all?) users are expected to really want to see the case of a name 'foo' showing up in one tree as a file and in another as a directory at the same time.

  In particular, moving 'unpack_trees()' over to use this tree traversal mechanism requires this.

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add return value to 'traverse_tree()' callback | Linus Torvalds | 2008-03-09 | 1 | -2/+2
  This allows the callback to return an error value, but it can also specify which of the tree entries that it actually used up by returning a positive mask value.

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
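The callback convention described here (negative for an error, otherwise a bit mask of the entries consumed) can be sketched with a simplified parallel walk. This is a toy under stated assumptions: it walks sorted, NULL-terminated lists of names rather than tree objects, and walk_lists(), walk_fn, MAX_LISTS and count_common() are all hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LISTS 8 /* arbitrary limit for this sketch */

/* Callback: bit i of mask is set when list i has the current name. */
typedef int (*walk_fn)(int n, unsigned long mask,
		       const char *const *names, void *ctx);

/*
 * Walk n sorted, NULL-terminated name lists in parallel. The callback
 * returns a negative value to abort with an error, or a positive mask
 * of the lists whose current entry it used up; only those advance.
 */
static int walk_lists(int n, const char *const **lists, walk_fn fn, void *ctx)
{
	const char *const *pos[MAX_LISTS];
	const char *names[MAX_LISTS];
	int i;

	if (n < 1 || n > MAX_LISTS)
		return -1;
	for (i = 0; i < n; i++)
		pos[i] = lists[i];

	for (;;) {
		const char *first = NULL;
		unsigned long mask = 0;
		int ret;

		/* Find the smallest name among the current entries. */
		for (i = 0; i < n; i++)
			if (*pos[i] && (!first || strcmp(*pos[i], first) < 0))
				first = *pos[i];
		if (!first)
			return 0; /* every list is exhausted */

		for (i = 0; i < n; i++) {
			names[i] = NULL;
			if (*pos[i] && !strcmp(*pos[i], first)) {
				names[i] = *pos[i];
				mask |= 1ul << i;
			}
		}

		ret = fn(n, mask, names, ctx);
		if (ret < 0)
			return ret; /* propagate the callback's error */
		mask &= (unsigned long)ret;
		if (!mask)
			return -1; /* nothing consumed: avoid looping forever */
		for (i = 0; i < n; i++)
			if (mask & (1ul << i))
				pos[i]++;
	}
}

/* Example callback: count names present in every list, consume all. */
static int count_common(int n, unsigned long mask,
			const char *const *names, void *ctx)
{
	(void)names;
	if (mask == (1ul << n) - 1)
		++*(int *)ctx;
	return (int)mask;
}
```

Treating a zero return as an error is a choice made for this sketch only, to guarantee termination; the commit message does not say how the real traversal handles that case.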
* Make 'traverse_tree()' use linked structure rather than 'const char *base' | Linus Torvalds | 2008-03-09 | 1 | -2/+18
  This makes the calling convention a bit less obvious, but a lot more flexible. Instead of allocating and extending a new 'base' string, we just link the top-most name into a linked list of the 'info' structure when traversing a subdirectory, and we can generate the basename by following the list.

  Perhaps even more importantly, the linked list of info structures also gives us a place to naturally save off other information than just the directory name.

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
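Generating a path by following the linked list of per-directory structures might look like the sketch below. The struct walk_info type and both helpers are hypothetical simplifications; the real 'info' structure carries more fields and treats the root differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down per-directory node. */
struct walk_info {
	const struct walk_info *prev; /* parent directory, NULL at the top */
	const char *name;             /* this directory's own name */
	size_t namelen;
};

/* Length of "dir1/dir2/leaf" for a leaf found under info. */
static size_t full_path_len(const struct walk_info *info, size_t leaf_len)
{
	size_t len = leaf_len;

	for (; info; info = info->prev)
		len += info->namelen + 1; /* +1 for the '/' separator */
	return len;
}

/*
 * Write the full path into buf, which must hold full_path_len() + 1
 * bytes. The list runs from the innermost directory outwards, so the
 * buffer is filled from the end towards the start.
 */
static char *make_path(char *buf, const struct walk_info *info,
		       const char *leaf, size_t leaf_len)
{
	size_t pos = full_path_len(info, leaf_len);

	buf[pos] = '\0';
	pos -= leaf_len;
	memcpy(buf + pos, leaf, leaf_len);
	for (; info; info = info->prev) {
		buf[--pos] = '/';
		pos -= info->namelen;
		memcpy(buf + pos, info->name, info->namelen);
	}
	return buf;
}
```

Entering a subdirectory then costs only a stack-allocated node pointing at its parent, and the string is built only when a caller actually needs it.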
* rename: Break filepairs with different types. | Junio C Hamano | 2007-12-02 | 1 | -7/+0
  When we consider if a path has been totally rewritten, we did not touch changes from symlinks to files or vice versa. But a change that modifies even the type of a blob surely should count as a complete rewrite.

  While we are at it, modernise diffcore-break to be aware of gitlinks (we do not want to touch them).

  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix rev-list when showing objects involving submodules | Linus Torvalds | 2007-11-14 | 1 | -0/+7
  The function mark_tree_uninteresting() assumed that the tree entries are blobs when they are not trees. This is not so. Since we do not traverse into submodules (yet), the gitlinks should be ignored.

  In general, we should try to start moving away from using the "S_ISLNK()" like things for internal git state. It was a mistake to just assume the numbers all were same across all systems in the first place. This implementation converts to the "object_type", and then uses a case statement.

  Noticed by Ilari on IRC. Test script taken from an earlier version by Dscho.

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <gitster@pobox.com>
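The "convert to an object type, then use a case statement" approach can be sketched like this. All names here (obj_kind, kind_of_mode(), walk_action, action_for_entry()) and the MODE_* constants are assumptions for illustration, not the identifiers the commit added.

```c
/* Hypothetical constants for the file-type bits stored in a tree. */
#define MODE_TYPE_MASK 0170000u
#define MODE_DIR       0040000u
#define MODE_GITLINK   0160000u

enum obj_kind { KIND_BLOB, KIND_TREE, KIND_COMMIT };
enum walk_action { ACT_RECURSE, ACT_MARK_BLOB, ACT_SKIP };

/* Map a tree entry's mode to the kind of object it names. */
static enum obj_kind kind_of_mode(unsigned int mode)
{
	switch (mode & MODE_TYPE_MASK) {
	case MODE_DIR:
		return KIND_TREE;
	case MODE_GITLINK:
		return KIND_COMMIT; /* a submodule entry names a commit */
	default:
		return KIND_BLOB;   /* regular files and symlinks */
	}
}

/* Decide what a tree walker should do with an entry. */
static enum walk_action action_for_entry(unsigned int mode)
{
	switch (kind_of_mode(mode)) {
	case KIND_TREE:
		return ACT_RECURSE;
	case KIND_BLOB:
		return ACT_MARK_BLOB;
	default:
		return ACT_SKIP;    /* gitlink: not traversed */
	}
}
```

Using fixed constants owned by the program, rather than the platform's S_IS*() macros, keeps the meaning of stored modes identical on every system.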
* Two trivial -Wcast-qual fixes | Junio C Hamano | 2007-06-22 | 1 | -1/+1
  Luiz Fernando N. Capitulino noticed the one in tree-walk.h where we cast away constness while computing the length of a tree entry.

  Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Remove stale non-static-inline prototype for tree_entry_extract() | Matthieu Castet | 2007-05-13 | 1 | -1/+0
  When 4651ece8 made the function a "static inline", it should have removed the stale prototype but everybody missed that.

  Thomas Glanzmann noticed this broke compilation with Forte12 compiler on his Sun boxes.

  Signed-off-by: Junio C Hamano <junkio@cox.net>
* Switch over tree descriptors to contain a pre-parsed entry | Linus Torvalds | 2007-03-21 | 1 | -5/+13
  This makes the tree descriptor contain a "struct name_entry" as part of it, and it gets filled in so that it always contains a valid entry. On some benchmarks, it improves performance by up to 15%.

  That makes tree entry "extract" trivial, and means that we only actually need to decode each tree entry just once: we decode the first one when we initialize the tree descriptor, and each subsequent one when doing "update_tree_entry()". In particular, this means that we don't need to do strlen() both at extract time _and_ at update time.

  Finally, it also allows more sharing of code (entry_extract(), that wanted a "struct name_entry", just got totally trivial, along with the "tree_entry()" function).

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <junkio@cox.net>
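The "decode once, at init and at update" idea can be sketched as below. The structures and function names (init_desc(), update_entry(), decode_entry()) are simplified, hypothetical stand-ins; the sketch assumes the classic tree entry layout of an octal mode, a space, a NUL-terminated path and a 20-byte raw hash, and it performs no validation of malformed input.

```c
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20 /* raw SHA-1 size assumed by this sketch */

struct name_entry {
	const unsigned char *sha1; /* points into the tree buffer */
	const char *path;          /* NUL-terminated, also in the buffer */
	unsigned int mode;
};

struct tree_desc {
	const void *buffer;
	unsigned long size;
	struct name_entry entry;   /* always the decoded current entry */
};

/* Decode "<octal mode> <path>\0<hash>" starting at buf. */
static void decode_entry(struct tree_desc *desc, const char *buf)
{
	unsigned int mode = 0;

	while (*buf != ' ')
		mode = (mode << 3) + (unsigned int)(*buf++ - '0');
	buf++; /* skip the space */
	desc->entry.mode = mode;
	desc->entry.path = buf;
	desc->entry.sha1 = (const unsigned char *)buf + strlen(buf) + 1;
}

/* Set up a descriptor and decode the first entry, if there is one. */
static void init_desc(struct tree_desc *desc, const void *buffer,
		      unsigned long size)
{
	desc->buffer = buffer;
	desc->size = size;
	if (size)
		decode_entry(desc, buffer);
}

/* Step past the current entry and decode the next one, if any. */
static void update_entry(struct tree_desc *desc)
{
	const char *start = desc->buffer;
	const char *end = (const char *)desc->entry.sha1 + HASH_LEN;

	desc->buffer = end;
	desc->size -= (unsigned long)(end - start);
	if (desc->size)
		decode_entry(desc, end);
}
```

Because the end of the current entry is known from the already-decoded hash pointer, stepping forward needs no second strlen(); reading the current entry is a plain copy of desc->entry.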
* Initialize tree descriptors with a helper function rather than by hand. | Linus Torvalds | 2007-03-21 | 1 | -2/+3
  This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place.

  Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything.

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <junkio@cox.net>
* Remove "pathlen" from "struct name_entry" | Linus Torvalds | 2007-03-21 | 1 | -1/+0
  Since we have the "tree_entry_len()" helper function these days, and don't need to do a full strlen(), there's no point in saving the path length - it's just redundant information.

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <junkio@cox.net>
* Avoid unnecessary strlen() calls | Linus Torvalds | 2007-03-18 | 1 | -0/+5
  This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles.

  We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway.

  So pass the length of the string down with the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen().

  This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec).

  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Junio C Hamano <junkio@cox.net>
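Why the path length is "directly calculable" can be shown in a few lines. This is a sketch under the assumption that a decoded entry's hash pointer sits immediately after the path's terminating NUL in the same buffer; entry_path_len() is a hypothetical name, not the signature of tree_entry_len().

```c
#include <stddef.h>

/* Simplified decoded entry; both pointers refer to one tree buffer. */
struct name_entry {
	const unsigned char *sha1; /* first byte after the path's NUL */
	const char *path;
	unsigned int mode;
};

/*
 * The hash starts one byte past the NUL that ends the path, so the
 * path length is a pointer subtraction, with no scan of the string.
 */
static size_t entry_path_len(const struct name_entry *e)
{
	return (size_t)((const char *)e->sha1 - e->path) - 1;
}
```

The result is only meaningful for entries that really point into a tree buffer; a hand-built entry with unrelated pointers would yield garbage.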
* tree_entry(): new tree-walking helper function | Linus Torvalds | 2006-05-30 | 1 | -1/+4
  This adds a "tree_entry()" function that combines the common operation of doing a "tree_entry_extract()" + "update_tree_entry()".

  It also has a simplified calling convention, designed for simple loops that traverse over a whole tree: the arguments are pointers to the tree descriptor and a name_entry structure to fill in, and it returns a boolean "true" if there was an entry left to be gotten in the tree.

  This allows tree traversal with

      struct tree_desc desc;
      struct name_entry entry;

      desc.buf = tree->buffer;
      desc.size = tree->size;
      while (tree_entry(&desc, &entry)) {
              ... use "entry.{path, sha1, mode, pathlen}" ...
      }

  which is not only shorter than writing it out in full, it's hopefully less error prone too.

  [ It's actually a tad faster too - we don't need to recalculate the entry pathlength in both extract and update, but need to do it only once. Also, some callers can avoid doing a "strlen()" on the result, since it's returned as part of the name_entry structure.

    However, by now we're talking just 1% speedup on "git-rev-list --objects --all", and we're definitely at the point where tree walking is no longer the issue any more. ]

  NOTE! Not everybody wants to use this new helper function, since some of the tree walkers very much on purpose do the descriptor update separately from the entry extraction. So the "extract + update" sequence still remains as the core sequence, this is just a simplified interface.

  We should probably add a silly two-line inline helper function for initializing the descriptor from the "struct tree" too, just to cut down on the noise from that common "desc" initializer.

  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  Signed-off-by: Junio C Hamano <junkio@cox.net>
* get_tree_entry(): make it available from tree-walk | Junio C Hamano | 2006-04-19 | 1 | -0/+2
  Signed-off-by: Junio C Hamano <junkio@cox.net>
* tree/diff header cleanup. | Junio C Hamano | 2006-03-29 | 1 | -0/+25
  Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places.

  Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order.

  Signed-off-by: Junio C Hamano <junkio@cox.net>