path: root/builtin-grep.c
Commit message [Author, Age, Files, Lines]
* Merge branch 'jc/upload-corrupt' into next [Junio C Hamano, 2006-06-21, 1, -4/+5]
|\
    * jc/upload-corrupt:
      upload-pack/fetch-pack: support side-band communication
      Retire git-clone-pack
      upload-pack: prepare for sideband message support.
      upload-pack: avoid sending an incomplete pack upon failure
      Fix possible out-of-bounds array access
| * Fix possible out-of-bounds array access [Uwe Zeisberger, 2006-06-21, 1, -4/+5]
    If match is "", match[-1] is accessed. Let pathspec_matches return 1 in that case indicating that "" matches everything. Incidentally this fixes git-grep'ing in ".".

    Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
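The guard described in this commit can be sketched as follows. This is a minimal illustration, not git's actual pathspec_matches(): the helper name dir_prefix_matches and its exact matching rules are assumptions made for the example. It only shows why an empty match string has to be handled before the last character of the string is inspected.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (NOT git's pathspec_matches): returns 1 when
 * `name` equals `match` or lies under the directory named by `match`.
 */
static int dir_prefix_matches(const char *name, const char *match)
{
    size_t matchlen;

    assert(name != NULL && match != NULL);
    matchlen = strlen(match);

    /*
     * Guard: "" matches everything. Without this early return, the
     * match[matchlen - 1] access below would read match[-1].
     */
    if (matchlen == 0)
        return 1;

    if (strncmp(name, match, matchlen) != 0)
        return 0;

    /* Exact match, match ends in '/', or name continues with '/'. */
    return name[matchlen] == '\0' ||
           match[matchlen - 1] == '/' ||
           name[matchlen] == '/';
}
```

With the guard in place, an empty pathspec (which is what "." at the top of the tree presumably reduces to) matches every path instead of reading out of bounds.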
* | Add "named object array" concept [Linus Torvalds, 2006-06-19, 1, -10/+6]
|/
    We've had this notion of a "object_list" for a long time, which eventually grew a "name" member because some users (notably git-rev-list) wanted to name each object as it is generated.

    That object_list is great for some things, but it isn't all that wonderful for others, and the "name" member is generally not used by everybody.

    This patch splits the users of the object_list array up into two: the traditional list users, who want the list-like format, and who don't actually use or want the name. And another class of users that really used the list as an extensible array, and generally wanted to name the objects.

    The patch is fairly straightforward, but it's also biggish. Most of it really just cleans things up: switching the revision parsing and listing over to the array makes things like the builtin-diff usage much simpler (we now see exactly how many members the array has, and we don't get the objects reversed from the order they were on the command line).

    One of the main reasons for doing this at all is that the malloc overhead of the simple object list was actually pretty high, and the array is just a lot denser. So this patch brings down memory usage by git-rev-list by just under 3% (on top of all the other memory use optimizations) on the mozilla archive.

    It does add more lines than it removes, and more importantly, it adds a whole new infrastructure for maintaining lists of objects, but on the other hand, the new dynamic array code is pretty obvious. The change to builtin-diff-tree.c shows a fairly good example of why an array interface is sometimes more natural, and just much simpler for everybody.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
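To make the "extensible array with names" idea concrete, here is a rough sketch of such a container. The names named_entry, named_array and named_array_add, as well as the growth policy, are assumptions for illustration and are not taken from the patch. The point is that entries live in one contiguous allocation, so there is no per-node malloc overhead and insertion order is preserved.

```c
#include <assert.h>
#include <stdlib.h>

struct object;  /* opaque for this sketch */

/* Hypothetical entry: an object plus the name it was given. */
struct named_entry {
    struct object *item;
    const char *name;
};

/* Hypothetical growable array; nr is used slots, alloc is capacity. */
struct named_array {
    unsigned int nr;
    unsigned int alloc;
    struct named_entry *entries;
};

static void named_array_add(struct named_array *array,
                            struct object *obj, const char *name)
{
    if (array->nr >= array->alloc) {
        /* Assumed growth policy: grow geometrically from a small base. */
        unsigned int alloc = (array->alloc + 32) * 2;
        struct named_entry *grown =
            realloc(array->entries, alloc * sizeof(*grown));

        assert(grown != NULL);  /* sketch only: abort on allocation failure */
        array->entries = grown;
        array->alloc = alloc;
    }
    array->entries[array->nr].item = obj;
    array->entries[array->nr].name = name;
    array->nr++;
}
```

A caller can read array->nr to see exactly how many members were added and walk entries[0..nr-1] in command-line order, which matches the two benefits the message calls out.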
* Shrink "struct object" a bit [Linus Torvalds, 2006-06-17, 1, -4/+3]
    This shrinks "struct object" by a small amount, by getting rid of the "struct type *" pointer and replacing it with a 3-bit bitfield instead. In addition, we merge the bitfields and the "flags" field, which incidentally should also remove a useless 4-byte padding from the object when in 64-bit mode.

    Now, our "struct object" is still too damn large, but it's now less obviously bloated, and of the remaining fields, only the "util" (which is not used by most things) is clearly something that should be eventually discarded.

    This shrinks the "git-rev-list --all" memory use by about 2.5% on the kernel archive (and, perhaps more importantly, on the larger mozilla archive). That may not sound like much, but I suspect it's more on a 64-bit platform.

    There are other remaining inefficiencies (the parent lists, for example, probably have horrible malloc overhead), but this was pretty obvious.

    Most of the patch is just changing the comparison of the "type" pointer from one of the constant string pointers to the appropriate new TYPE_xxx small integer constant.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
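The layout change can be illustrated roughly like this. The struct name small_object, the field widths other than the 3-bit type, and the specific TYPE_* constant names are assumptions (the message only says "TYPE_xxx"); the real struct object has more members. The sketch shows the type tag sharing one word with the other bitfields and the flags, and a type check becoming an integer comparison.

```c
#include <assert.h>

/* Assumed constant names; the commit only refers to "TYPE_xxx". */
enum { TYPE_NONE = 0, TYPE_BLOB, TYPE_TREE, TYPE_COMMIT, TYPE_TAG };

/* Hypothetical, simplified object header. */
struct small_object {
    unsigned parsed : 1;
    unsigned used : 1;
    unsigned type : 3;    /* 3 bits hold the values 0..7 */
    unsigned flags : 27;  /* packed into the same word as the bits above */
    unsigned char sha1[20];
};

static void set_type(struct small_object *obj, unsigned type)
{
    assert(type < 8);  /* must fit in the 3-bit field */
    obj->type = type;
}

static int is_commit(const struct small_object *obj)
{
    /* Integer compare against a small constant, not a string pointer. */
    return obj->type == TYPE_COMMIT;
}
```

Whether the compiler actually packs the bitfields into a single 32-bit word is implementation-defined, so any size saving should be measured rather than assumed.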
* builtin-grep: pass ignore case option to external grep [Robert Fitzsimons, 2006-06-06, 1, -0/+2]
    Don't just read the --ignore-case/-i option, pass the flag on to the external grep program.

    Signed-off-by: Robert Fitzsimons <robfitz@273k.net>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* tree_entry(): new tree-walking helper function [Linus Torvalds, 2006-05-30, 1, -16/+10]
    This adds a "tree_entry()" function that combines the common operation of doing a "tree_entry_extract()" + "update_tree_entry()".

    It also has a simplified calling convention, designed for simple loops that traverse over a whole tree: the arguments are pointers to the tree descriptor and a name_entry structure to fill in, and it returns a boolean "true" if there was an entry left to be gotten in the tree.

    This allows tree traversal with

        struct tree_desc desc;
        struct name_entry entry;

        desc.buf = tree->buffer;
        desc.size = tree->size;
        while (tree_entry(&desc, &entry)) {
            ... use "entry.{path, sha1, mode, pathlen}" ...
        }

    which is not only shorter than writing it out in full, it's hopefully less error prone too.

    [ It's actually a tad faster too - we don't need to recalculate the entry pathlength in both extract and update, but need to do it only once. Also, some callers can avoid doing a "strlen()" on the result, since it's returned as part of the name_entry structure.

      However, by now we're talking just 1% speedup on "git-rev-list --objects --all", and we're definitely at the point where tree walking is no longer the issue any more. ]

    NOTE! Not everybody wants to use this new helper function, since some of the tree walkers very much on purpose do the descriptor update separately from the entry extraction. So the "extract + update" sequence still remains as the core sequence, this is just a simplified interface.

    We should probably add a silly two-line inline helper function for initializing the descriptor from the "struct tree" too, just to cut down on the noise from that common "desc" initializer.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
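A self-contained approximation of the combined "extract + update" step might look like the following. It is a sketch under stated assumptions, not the function added by this commit: the _sketch names are invented, error handling is reduced to assertions, and the entry layout is taken to be the usual tree-object encoding of an octal mode, a space, the path, a NUL byte and a 20-byte binary SHA-1.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the descriptor and entry structures. */
struct tree_desc_sketch {
    const unsigned char *buf;
    unsigned long size;
};

struct name_entry_sketch {
    unsigned int mode;
    const char *path;
    int pathlen;
    const unsigned char *sha1;
};

/* Fill in *entry from the front of *desc and advance; 0 when exhausted. */
static int tree_entry_sketch(struct tree_desc_sketch *desc,
                             struct name_entry_sketch *entry)
{
    const unsigned char *p = desc->buf;
    const unsigned char *end = desc->buf + desc->size;
    const unsigned char *nul;
    unsigned int mode = 0;

    if (!desc->size)
        return 0;

    /* Extract: parse the octal mode up to the separating space. */
    while (p < end && *p != ' ') {
        assert(*p >= '0' && *p <= '7');
        mode = (mode << 3) + (unsigned int)(*p++ - '0');
    }
    assert(p < end);
    p++;  /* skip the space */

    /* The path runs to the NUL; 20 raw hash bytes must follow it. */
    nul = memchr(p, '\0', (size_t)(end - p));
    assert(nul != NULL && end - nul > 20);

    entry->mode = mode;
    entry->path = (const char *)p;
    entry->pathlen = (int)(nul - p);  /* computed once, handed to the caller */
    entry->sha1 = nul + 1;

    /* Update: step the descriptor past this entry. */
    desc->size -= (unsigned long)(nul + 21 - desc->buf);
    desc->buf = nul + 21;
    return 1;
}
```

Because the path length falls out of the single scan for the NUL byte, the caller gets it for free, which is the small saving the bracketed note in the message refers to.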
* remove superflous "const" [Alex Riesen, 2006-05-21, 1, -1/+1]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: workaround for non GNU grep. [Linus Torvalds, 2006-05-17, 1, -3/+20]
    Of course, it still ignores the fact that not all grep's support some of the flags like -F/-L/-A/-C etc, but for those cases, the external grep itself will happily just say "unrecognized option -F" or similar.

    So with this change, "git grep" should handle all the flags the native grep handles, which is really quite fine. We don't _need_ to expose anything more, and if you do want our extensions, you can get them with "--uncached" and an up-to-date index.

    No configuration necessary, and we automatically take advantage of any native grep we have, if possible.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* Fix silly typo in new builtin grep [Linus Torvalds, 2006-05-15, 1, -1/+1]
    The "-F" flag apparently got mis-translated due to some over-eager copy-paste work into a duplicate "-H" when using the external grep.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: unparse more command line options. [Junio C Hamano, 2006-05-15, 1, -8/+57]
    The earlier one to use external grep missed some often used options.

    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: use external grep when we can take advantage of it [Linus Torvalds, 2006-05-14, 1, -0/+79]
    It's not perfect, but it gets the "git grep some-random-string" down to the good old half-a-second range for the kernel.

    It should convert more of the argument flags for "grep", that should be trivial to expand (I did a few just as an example). It should also bother to try to return the right "hit" value (which it doesn't, right now - the code is kind of there, but I didn't actually bother to do it _right_).

    Also, right now it _just_ limits by number of arguments, but it should also strictly speaking limit by total argument size (ie add up the length of the filenames, and do the "exec_grep()" flush call if it's bigger than some random value like 32kB).

    But I think that it's _conceptually_ doing all the right things, and it seems to work. So maybe somebody else can do some of the final polish.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
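The batching the message asks for, limiting by total argument size as well as by argument count, could be sketched as below. Everything here is hypothetical: the arg_batch structure, the MAX_ARGS value and the function names are made up for the example, the 32 kB figure is the "random value" the message itself suggests, and the flush only counts invocations where real code would run the external grep.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 64            /* hypothetical cap on paths per invocation */
#define MAX_ARG_BYTES 32768    /* the "random value like 32kB" */

struct arg_batch {
    const char *argv[MAX_ARGS + 1];  /* +1 for the terminating NULL */
    int nr;                          /* paths queued so far */
    size_t bytes;                    /* total length of queued paths */
    int flushes;                     /* how many times we "ran grep" */
};

static void batch_flush(struct arg_batch *batch)
{
    if (!batch->nr)
        return;
    batch->argv[batch->nr] = NULL;
    /* A real implementation would fork and exec grep with argv here. */
    batch->flushes++;
    batch->nr = 0;
    batch->bytes = 0;
}

static void batch_push(struct arg_batch *batch, const char *path)
{
    size_t len;

    assert(path != NULL);
    len = strlen(path) + 1;  /* count the terminating NUL too */

    /* Flush first if adding this path would exceed either limit. */
    if (batch->nr == MAX_ARGS ||
        (batch->nr > 0 && batch->bytes + len > MAX_ARG_BYTES))
        batch_flush(batch);

    batch->argv[batch->nr++] = path;
    batch->bytes += len;
}
```

A caller would push every candidate path and finish with one last batch_flush() so that a partially filled batch is not dropped.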
* builtin-grep: -F (--fixed-strings) [Junio C Hamano, 2006-05-09, 1, -4/+32]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: -w fix [Junio C Hamano, 2006-05-09, 1, -3/+3]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: typofix [Junio C Hamano, 2006-05-09, 1, -1/+1]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: tighten argument parsing. [Junio C Hamano, 2006-05-08, 1, -26/+50]
    I mistyped

        git grep next -e '"^@"' '*.c'

    and got many hits that contain "next" without complaint. Obviously what I meant to say was:

        git grep -e '"^@"' next -- '*.c'

    This tightens the argument parsing rule a bit:

     - All "grep" parameters should come first;
     - If there is no -e nor -f to specify pattern, the first non option string is the parameter;
     - After that, zero or more revs can follow.
     - An optional '--' can be present, and is skipped.
     - All the rest are pathspecs. If '--' was not there, they must be paths that exist in the working tree.

    Signed-off-by: Junio C Hamano <junkio@cox.net>
* Teach -f <file> option to builtin-grep. [Junio C Hamano, 2006-05-08, 1, -19/+42]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: -L (--files-without-match). [Junio C Hamano, 2006-05-03, 1, -0/+20]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: binary files -a and -I [Junio C Hamano, 2006-05-03, 1, -0/+44]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: terminate correctly at EOF [Junio C Hamano, 2006-05-03, 1, -0/+2]
    It barfed and segfaulted with an incomplete line.

    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: tighten path wildcard vs tree traversal. [Junio C Hamano, 2006-05-02, 1, -15/+20]
    The earlier code descended into Documentation/technical when given "Documentation/how*" as the pattern, which was too loose.

    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: support -w (--word-regexp). [Junio C Hamano, 2006-05-02, 1, -0/+30]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: support -c (--count). [Junio C Hamano, 2006-05-02, 1, -1/+20]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: allow more than one patterns. [Junio C Hamano, 2006-05-02, 1, -21/+51]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: allow -<n> and -[ABC]<n> notation for context lines. [Junio C Hamano, 2006-05-02, 1, -6/+22]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: printf %.*s length is int, not ptrdiff_t. [Junio C Hamano, 2006-05-02, 1, -1/+1]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
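The issue named in the subject line above is a general C rule: the precision argument consumed by "%.*s" must have type int, while subtracting two pointers yields a ptrdiff_t, which is wider than int on common 64-bit platforms. A minimal sketch of the pattern follows; the helper name format_span is an assumption for the example and not code from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the bytes in [bol, eol) into out. */
static int format_span(char *out, size_t outsz,
                       const char *bol, const char *eol)
{
    ptrdiff_t len = eol - bol;  /* pointer difference is ptrdiff_t */

    assert(len >= 0 && len <= INT_MAX);

    /* "%.*s" reads an int for the precision, so convert explicitly. */
    return snprintf(out, outsz, "%.*s", (int)len, bol);
}
```

Passing the ptrdiff_t directly through the variadic call would be undefined behaviour wherever the two types differ in size, which is why the explicit cast is needed.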
* builtin-grep: do not use setup_revisions() [Junio C Hamano, 2006-05-01, 1, -121/+134]
    Grep may want to grok multiple revisions, but it does not make much sense to walk revisions while doing so. This stops calling the code to parse parameters for the revision walker. The parameter parsing for the optional "-e" option becomes a lot simpler with it as well.

    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: support '-l' option. [Junio C Hamano, 2006-05-01, 1, -0/+10]
    Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep: wildcard pathspec fixes [Junio C Hamano, 2006-05-01, 1, -23/+62]
    This tweaks the pathspec wildcard used in builtin-grep to match that of ls-files. With this:

        git grep -e DEBUG -- '*/Kconfig*'

    would work like the shell script version, and you could even do:

        git grep -e DEBUG --cached -- '*/Kconfig*' ;# from index
        git grep -e DEBUG v2.6.12 -- '*/Kconfig*' ;# from rev

    Signed-off-by: Junio C Hamano <junkio@cox.net>
* built-in "git grep" [Junio C Hamano, 2006-05-01, 1, -0/+454]
    This attempts to set up built-in "git grep" to further reduce our dependence on the shell, while at the same time optionally allowing to run grep against object database. You could do funky things like these:

        git grep --cached -e pattern                ;# grep from index
        git grep -e pattern master                  ;# or in a rev
        git grep -e pattern master next             ;# or in multiple revs
        git grep -e pattern pu^@                    ;# even like this with an
                                                    ;# extension from another topic ;-)
        git grep -e pattern master..next            ;# or even from rev ranges
        git grep -e pattern master~20:Documentation ;# or an arbitrary tree
        git grep -e pattern next:git-commit.sh      ;# or an arbitrary blob

    Right now, it does not understand and/or obey many options grep should accept, and the pattern must be given with -e option due to the way the parameter parser is structured, both of which obviously need to be fixed for usability.

    But this is going in the right direction. The shell script version is one of the worst Portability offenders in the git barebone Porcelainish; it uses xargs -0 to pass paths around and shell arrays to sift flags and parameters.

    Signed-off-by: Junio C Hamano <junkio@cox.net>