<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/git.git/dir.c, branch baserock/morph</title>
<subtitle>github.com: git/git.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/'/>
<entry>
<title>Merge branch 'nd/const-struct-cache-entry'</title>
<updated>2013-07-22T18:24:01+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-07-22T18:24:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=d3aeb31dc410a71ee41b87328f0d71996417294f'/>
<id>d3aeb31dc410a71ee41b87328f0d71996417294f</id>
<content type='text'>
* nd/const-struct-cache-entry:
  Convert "struct cache_entry *" to "const ..." wherever possible
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* nd/const-struct-cache-entry:
  Convert "struct cache_entry *" to "const ..." wherever possible
</pre>
</div>
</content>
</entry>
<entry>
<title>Convert "struct cache_entry *" to "const ..." wherever possible</title>
<updated>2013-07-09T16:12:48+00:00</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2013-07-09T15:29:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=9c5e6c802cde9881785b7f1b3278b97be4aabd82'/>
<id>9c5e6c802cde9881785b7f1b3278b97be4aabd82</id>
<content type='text'>
I attempted to make index_state-&gt;cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I attempted to make index_state-&gt;cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treat_directory(): do not declare submodules to be untracked</title>
<updated>2013-07-01T21:23:24+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-07-01T21:00:32+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=26c986e118523fda4624cec8d14bb8a4a09fdd08'/>
<id>26c986e118523fda4624cec8d14bb8a4a09fdd08</id>
<content type='text'>
When the working tree walker encounters a directory, it asks the
function treat_directory() if it should descend into it, show it as
an untracked directory, or do something else.  When the directory is
the top of the submodule working tree, we used to say "That is an
untracked directory", which was bogus.

It is an entity that is tracked in the index of the repository we
are looking at, and that is not to be descended into it.  Return
path_none, not path_untracked, to report that.

The existing case that path_untracked is returned for a newly
discovered submodule that is not tracked in the index (this only
happens when DIR_NO_GITLINKS option is not used) is unchanged, but
that is exactly because the submodule is not tracked in the index.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the working tree walker encounters a directory, it asks the
function treat_directory() if it should descend into it, show it as
an untracked directory, or do something else.  When the directory is
the top of the submodule working tree, we used to say "That is an
untracked directory", which was bogus.

It is an entity that is tracked in the index of the repository we
are looking at, and that is not to be descended into it.  Return
path_none, not path_untracked, to report that.

The existing case that path_untracked is returned for a newly
discovered submodule that is not tracked in the index (this only
happens when DIR_NO_GITLINKS option is not used) is unchanged, but
that is exactly because the submodule is not tracked in the index.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'kb/status-ignored-optim-2'</title>
<updated>2013-06-03T19:58:56+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-06-03T19:58:56+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=3684101a654d5a7b598d9810df3b498f2a8928d0'/>
<id>3684101a654d5a7b598d9810df3b498f2a8928d0</id>
<content type='text'>
Fix 1.8.3 regressions in the .gitignore path exclusion logic.

* kb/status-ignored-optim-2:
  dir.c: fix ignore processing within not-ignored directories
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix 1.8.3 regressions in the .gitignore path exclusion logic.

* kb/status-ignored-optim-2:
  dir.c: fix ignore processing within not-ignored directories
</pre>
</div>
</content>
</entry>
<entry>
<title>dir.c: fix ignore processing within not-ignored directories</title>
<updated>2013-06-02T21:54:38+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-05-29T20:32:36+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=c3c327deeaf018e727a27f5ae88e140ff7a48595'/>
<id>c3c327deeaf018e727a27f5ae88e140ff7a48595</id>
<content type='text'>
As of 95c6f271 "dir.c: unify is_excluded and is_path_excluded APIs", the
is_excluded API no longer recurses into directories that match an ignore
pattern, and returns the directory's ignored state for all contained paths.

This is OK for normal ignore patterns, i.e. ignoring a directory affects
the entire contents recursively.

Unfortunately, this also "works" for negated ignore patterns ('!dir'), i.e.
the entire contents is "not-ignored" recursively, regardless of ignore
patterns that match the contents directly.

In prep_exclude, skip recursing into a directory only if it is really
ignored (i.e. the ignore pattern is not negated).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Tested-by: Øystein Walle &lt;oystwa@gmail.com&gt;
Reviewed-by: Duy Nguyen &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As of 95c6f271 "dir.c: unify is_excluded and is_path_excluded APIs", the
is_excluded API no longer recurses into directories that match an ignore
pattern, and returns the directory's ignored state for all contained paths.

This is OK for normal ignore patterns, i.e. ignoring a directory affects
the entire contents recursively.

Unfortunately, this also "works" for negated ignore patterns ('!dir'), i.e.
the entire contents is "not-ignored" recursively, regardless of ignore
patterns that match the contents directly.

In prep_exclude, skip recursing into a directory only if it is really
ignored (i.e. the ignore pattern is not negated).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Tested-by: Øystein Walle &lt;oystwa@gmail.com&gt;
Reviewed-by: Duy Nguyen &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'jn/config-ignore-inaccessible'</title>
<updated>2013-05-29T21:30:10+00:00</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-05-29T21:30:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=7ebb906ddd69628fc9874b1266257ae2256b8f14'/>
<id>7ebb906ddd69628fc9874b1266257ae2256b8f14</id>
<content type='text'>
When $HOME is misconfigured to point at an unreadable directory, we
used to complain and die. This loosens the check.

* jn/config-ignore-inaccessible:
  config: allow inaccessible configuration under $HOME
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When $HOME is misconfigured to point at an unreadable directory, we
used to complain and die. This loosens the check.

* jn/config-ignore-inaccessible:
  config: allow inaccessible configuration under $HOME
</pre>
</div>
</content>
</entry>
<entry>
<title>dir.c: git-status --ignored: don't scan the work tree twice</title>
<updated>2013-04-15T19:36:42+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-04-15T19:15:03+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=0aaf62b6e018484bad9cea47dc00644d57b7ad49'/>
<id>0aaf62b6e018484bad9cea47dc00644d57b7ad49</id>
<content type='text'>
'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.

fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).

The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.

We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.

In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.

In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.

builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.

Update API documentation to reflect the changes.

Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.

fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).

The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.

We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.

In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.

In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.

builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.

Update API documentation to reflect the changes.

Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dir.c: git-status --ignored: don't scan the work tree three times</title>
<updated>2013-04-15T19:34:01+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-04-15T19:14:22+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=defd7c7b3717394ee05b454172bf7b1e747af6ae'/>
<id>defd7c7b3717394ee05b454172bf7b1e747af6ae</id>
<content type='text'>
'git-status --ignored' recursively scans directories up to three times:

 1. To collect untracked files.

 2. To collect ignored files.

 3. When collecting ignored files, to check that an untracked directory
    that potentially contains ignored files doesn't also contain untracked
    files (i.e. isn't already listed as untracked).

Let's get rid of case 3 first.

Currently, read_directory_recursive returns a boolean whether a directory
contains the requested files or not (actually, it returns the number of
files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies
what we're looking for.

To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.

Reuse the path_treatment enum as return value of read_directory_recursive.
Split path_handled in two separate values path_excluded and path_untracked
that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't
need an extra value path_untracked_and_excluded, as directories with both
untracked and ignored files should be listed as untracked.

Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").

Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.

In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.

In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.

The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'git-status --ignored' recursively scans directories up to three times:

 1. To collect untracked files.

 2. To collect ignored files.

 3. When collecting ignored files, to check that an untracked directory
    that potentially contains ignored files doesn't also contain untracked
    files (i.e. isn't already listed as untracked).

Let's get rid of case 3 first.

Currently, read_directory_recursive returns a boolean whether a directory
contains the requested files or not (actually, it returns the number of
files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies
what we're looking for.

To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.

Reuse the path_treatment enum as return value of read_directory_recursive.
Split path_handled in two separate values path_excluded and path_untracked
that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't
need an extra value path_untracked_and_excluded, as directories with both
untracked and ignored files should be listed as untracked.

Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").

Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.

In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.

In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.

The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dir.c: git-status: avoid is_excluded checks for tracked files</title>
<updated>2013-04-15T19:34:01+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-04-15T19:13:35+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=8aaf8d7728e8ac50cbf6bcad05b6e896d4e69e0b'/>
<id>8aaf8d7728e8ac50cbf6bcad05b6e896d4e69e0b</id>
<content type='text'>
Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).

Skip is_excluded checks for files: move the cache_name_exists check from
treat_file to treat_one_path and return early if the file is tracked.

This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.

There's just one line left in treat_file, so move this to treat_one_path
as well.

Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true):

       |    status      | status --ignored
       | linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 |  1.583 | 0.321 |  2.579
after  | 0.156 |  0.988 | 0.202 |  1.279
gain   | 1.397 |  1.602 | 1.589 |  2.016

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).

Skip is_excluded checks for files: move the cache_name_exists check from
treat_file to treat_one_path and return early if the file is tracked.

This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.

There's just one line left in treat_file, so move this to treat_one_path
as well.

Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true):

       |    status      | status --ignored
       | linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 |  1.583 | 0.321 |  2.579
after  | 0.156 |  0.988 | 0.202 |  1.279
gain   | 1.397 |  1.602 | 1.589 |  2.016

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dir.c: replace is_path_excluded with now equivalent is_excluded API</title>
<updated>2013-04-15T19:34:01+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-04-15T19:12:57+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/git.git/commit/?id=b07bc8c8c3c492d657a8bedf04ff939763ea8222'/>
<id>b07bc8c8c3c492d657a8bedf04ff939763ea8222</id>
<content type='text'>
Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
