summaryrefslogtreecommitdiff
path: root/fast-import.c
Commit message (Collapse)AuthorAgeFilesLines
* Use die_errno() instead of die() when checking syscallsThomas Rast2009-06-271-2/+2
| | | | | | | | | | | | | | | | | | | | Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Convert existing die(..., strerror(errno)) to die_errno()Thomas Rast2009-06-271-2/+2
| | | | | | | | | | | Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'ar/unlink-err'Junio C Hamano2009-05-181-2/+2
|\ | | | | | | | | | | | | * ar/unlink-err: print unlink(2) errno in copy_or_link_directory replace direct calls to unlink(2) with unlink_or_warn Introduce an unlink(2) wrapper which gives warning if unlink failed
| * replace direct calls to unlink(2) with unlink_or_warnAlex Riesen2009-04-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This helps to notice when something's going wrong, especially on systems which lock open files. I used the following criteria when selecting the code for replacement: - it was already printing a warning for the unlink failures - it is in a function which already printing something or is called from such a function - it is in a static function, returning void and the function is only called from a builtin main function (cmd_) - it is in a function which handles emergency exit (signal handlers) - it is in a function which is obvously cleaning up the lockfiles Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Fix a bunch of pointer declarations (codestyle)Felipe Contreras2009-05-011-7/+7
|/ | | | | | | Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix more typos/spelling in commentsMichael J Gruber2009-04-221-1/+1
| | | | | | | A few more fixes on top of the automatic spell checker generated ones. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix typos / spelling in commentsMike Ralphson2009-04-221-3/+3
| | | | | Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jc/shared-literally'Junio C Hamano2009-04-061-3/+0
|\ | | | | | | | | | | | | | | | | * jc/shared-literally: t1301: loosen test for forced modes set_shared_perm(): sometimes we know what the final mode bits should look like move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath Move chmod(foo, 0444) into move_temp_to_file() "core.sharedrepository = 0mode" should set, not loosen
| * Move chmod(foo, 0444) into move_temp_to_file()Johan Herland2009-03-271-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | When writing out a loose object or a pack (index), move_temp_to_file() is called to finalize the resulting file. These files (loose files and packs) should all have permission mode 0444 (modulo adjust_shared_perm()). Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite (or even forgetting to chmod() at all), do the chmod() call from within move_temp_to_file(). Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Correct missing SP characters in grammar comment at top of fast-import.cElijah Newren2009-03-251-3/+4
| | | | | | | | | | | | Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Remove unused function scope local variablesBenjamin Kramer2009-03-071-5/+3
|/ | | | | | | | | | | | | | | | These variables were unused and can be removed safely: builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote builtin-fetch-pack.c::find_common(): len builtin-remote.c::mv(): symref diff.c::show_stats():show_stats(): total diffcore-break.c::should_break(): base_size fast-import.c::validate_raw_date(): date, sign fsck.c::fsck_tree(): o_sha1, sha1 xdiff-interface.c::parse_num(): read_some Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jc/maint-1.6.0-pack-directory'Junio C Hamano2009-02-251-9/+5
|\ | | | | | | | | * jc/maint-1.6.0-pack-directory: Make sure objects/pack exists before creating a new pack
| * Make sure objects/pack exists before creating a new packJunio C Hamano2009-02-251-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a repository created with git older than f49fb35 (git-init-db: create "pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is not created upon initialization. It was Ok because subdirectories are created as needed inside directories init-db creates, and back then, packfiles were recent invention. After the said commit, new codepaths started relying on the presense of objects/pack/ directory in the repository. This was exacerbated with 8b4eb6b (Do not perform cross-directory renames when creating packs, 2008-09-22) that moved the location temporary pack files are created from objects/ directory to objects/pack/ directory, because moving temporary to the final location was done carefully with lazy leading directory creation. Many packfile related operations in such an old repository can fail mysteriously because of this. This commit introduces two helper functions to make things work better. - odb_mkstemp() is a specialized version of mkstemp() to refactor the code and teach it to create leading directories as needed; - odb_pack_keep() refactors the code to create a ".keep" file while create leading directories as needed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Drop double-semicolon in CJunio C Hamano2009-02-101-1/+1
| | | | | | | | | | | | The worst offenders are "continue;;" and "break;;" in switch statements. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'maint'Junio C Hamano2009-02-101-0/+1
|\ \ | | | | | | | | | | | | * maint: Clear the delta base cache during fast-import checkpoint
| * \ Merge branch 'maint-1.6.0' into maintJunio C Hamano2009-02-101-0/+1
| |\ \ | | | | | | | | | | | | | | | | * maint-1.6.0: Clear the delta base cache during fast-import checkpoint
| | * | Clear the delta base cache during fast-import checkpointShawn O. Pearce2009-02-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we may reuse the same memory address for a totally different "struct packed_git", and a previously cached object from the prior occupant might be returned when trying to unpack an object from the new pack. Found-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Add calls to git_extract_argv0_path() in programs that call git_config_*Steffen Prohaska2009-01-261-0/+3
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Programs that use git_config need to find the global configuration. When runtime prefix computation is enabled, this requires that git_extract_argv0_path() is called early in the program's main(). This commit adds the necessary calls. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'maint-1.6.0' into maintJunio C Hamano2009-01-131-3/+4
|\ \ \ | |/ / | | | | | | | | | | | | * maint-1.6.0: fast-import: Cleanup mode setting. Git.pm: call Error::Simple() properly
| * | fast-import: Cleanup mode setting.Felipe Contreras2009-01-131-3/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | "S_IFREG | mode" makes only sense for 0644 and 0755. Even though doing (S_IFREG | mode) may not hurt when mode is any other supported value, that is only true because S_IFREG mode bit happens to be already on for S_IFLNK or S_IFGITLINK. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fast-import.c: stricter strtoul check, silence compiler warningRené Scharfe2008-12-211-4/+7
| | | | | | | | | | | | | | | | | | | | | | Store the return value of strtoul() in order to avoid compiler warnings on Ubuntu 8.10. Also check errno after each call, which is the only way to notice an overflow without making ULONG_MAX an illegal date. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'maint' to sync with GIT 1.6.0.6Junio C Hamano2008-12-191-11/+15
|\ \ | |/ | | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * fast-import: make tagger information optionalJunio C Hamano2008-12-191-11/+15
| | | | | | | | | | | | | | | | | | Even though newer Porcelain tools always record the tagger information when creating new tags, export/import pair should be able to faithfully reproduce ancient tag objects that lack tagger information. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Shawn O. Pearce <spearce@spearce.org>
* | Merge branch 'maint'Junio C Hamano2008-12-151-1/+3
|\ \ | |/ | | | | | | | | | | * maint: fast-import: close pack before unlinking it pager: do not dup2 stderr if it is already redirected git-show: do not segfault when showing a bad tag
| * fast-import: close pack before unlinking itJohannes Schindelin2008-12-151-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is sort of a companion patch to 4723ee9(Close files opened by lock_file() before unlinking.): on Windows, you cannot delete what is still open. This makes test 9300-fast-import pass on Windows for me; quite a few fast-imports leave temporary packs until the test "blank lines not necessary after other commands" actually tests for the number of files in .git/objects/pack/, which has a few temporary packs now. I guess that 8b4eb6b(Do not perform cross-directory renames when creating packs) was "responsible" for the breakage. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | git-fast-import possible memory corruption problemYONETANI Tomokazu2008-12-141-3/+4
| | | | | | | | | | | | | | | | | | | | | | Internal "allocate in bulk, we will never free this memory anyway" allocator used in fast-import had a logic to round up the size of the requested memory block in a wrong place (it computed if the available space is enough to fit the request first, and then carved a chunk of memory by size rounded up to the alignment, which could go beyond the actually available space). Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fix openssl headers conflicting with custom SHA1 implementationsNicolas Pitre2008-10-021-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On ARM I have the following compilation errors: CC fast-import.o In file included from cache.h:8, from builtin.h:6, from fast-import.c:142: arm/sha1.h:14: error: conflicting types for 'SHA_CTX' /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here arm/sha1.h:16: error: conflicting types for 'SHA1_Init' /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here arm/sha1.h:17: error: conflicting types for 'SHA1_Update' /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here arm/sha1.h:18: error: conflicting types for 'SHA1_Final' /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here make: *** [fast-import.o] Error 1 This is because openssl header files are always included in git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason. Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file. As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Merge branch 'maint'Junio C Hamano2008-09-231-2/+2
|\ \ | |/ | | | | | | | | * maint: builtin-prune.c: prune temporary packs in <object_dir>/pack directory Do not perform cross-directory renames when creating packs
| * Do not perform cross-directory renames when creating packsPetr Baudis2008-09-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | A comment on top of create_tmpfile() describes caveats ('can have problems on various systems (FAT, NFS, Coda)') that should apply in this situation as well. This in the end did not end up solving any of my personal problems, but it might be a useful cleanup patch nevertheless. Signed-off-by: Petr Baudis <pasky@suse.cz> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'np/maint-safer-pack'Junio C Hamano2008-09-021-1/+2
|\ \ | |/ | | | | | | | | | | | | | | * np/maint-safer-pack: fixup_pack_header_footer(): use nicely aligned buffer sizes index-pack: use fixup_pack_header_footer()'s validation mode pack-objects: use fixup_pack_header_footer()'s validation mode improve reliability of fixup_pack_header_footer() pack-objects: improve returned information from write_one()
| * improve reliability of fixup_pack_header_footer()Nicolas Pitre2008-08-291-1/+2
| | | | | | | | | | | | | | | | | | | | Currently, this function has the potential to read corrupted pack data from disk and give it a valid SHA1 checksum. Let's add the ability to validate SHA1 checksum of existing data along the way, including before and after any arbitrary point in the pack. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | cast pid_t's to uintmax_t to improve portabilityDavid Soria Parra2008-08-311-3/+3
|/ | | | | | | | | | | Some systems (like e.g. OpenSolaris) define pid_t as long, therefore all our sprintf that use %i/%d cause a compiler warning beacuse of the implicit long->int cast. To make sure that we fit the limits, we display pids as PRIuMAX and cast them explicitly to uintmax_t. Signed-off-by: David Soria Parra <dsp@php.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Support gitlinks in fast-import.Alexander Gavrilov2008-07-191-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | Currently fast-import/export cannot be used for repositories with submodules. This patch extends the relevant programs to make them correctly process gitlinks. Links can be represented by two forms of the Modify command: M 160000 SHA1 some/path which sets the link target explicitly, or M 160000 :mark some/path where the mark refers to a commit. The latter form can be used by importing tools to build all submodules simultaneously in one physical repository, and then simply fetch them apart. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Make usage strings dash-lessStephan Beyer2008-07-131-1/+1
| | | | | | | | | | | | | | | When you misuse a git command, you are shown the usage string. But this is currently shown in the dashed form. So if you just copy what you see, it will not work, when the dashed form is no longer supported. This patch makes git commands show the dash-less version. For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh generates a dash-less usage string now. Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Make pack creation always fsync() the resultLinus Torvalds2008-05-311-1/+1
| | | | | | | | | | This means that we can depend on packs always being stable on disk, simplifying a lot of the object serialization worries. And unlike loose objects, serializing pack creation IO isn't going to be a performance killer. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'js/config-cb'v1.5.6-rc0Junio C Hamano2008-05-251-3/+3
|\ | | | | | | | | | | | | | | | | | | * js/config-cb: Provide git_config with a callback-data parameter Conflicts: builtin-add.c builtin-cat-file.c
| * Provide git_config with a callback-data parameterJohannes Schindelin2008-05-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | git-fast-import: rename cmd_*() functions to parse_*()Miklos Vajna2008-05-161-31/+31
|/ | | | | | | | | | There is a cmd_merge() function in fast-import that will conflict with builtin-merge's cmd_merge() function. To keep it consistent, rename all cmd_*() function to parse_*() Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fast-import: Allow "reset" to delete a new branch without errorEyvind Bernhardsen2008-03-161-0/+2
| | | | | | | | | | | | | | | | | Creating a branch in fast-import and then resetting it without making any further commits to it currently causes an error message at the end of the import. This error is triggered by cvs2svn's git backend, which uses a temporary fixup branch when it creates tags, because the fixup branch is reset after each tag. This patch prevents the error, allowing "reset" to be used to delete temporary branches. Signed-off-by: Eyvind Bernhardsen <eyvind-git@orakel.ntnu.no> Acked-by: Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'maint' to sync with 1.5.4.4Junio C Hamano2008-03-081-1/+2
|\ | | | | | | | | | | | | | | * maint: GIT 1.5.4.4 ident.c: reword error message when the user name cannot be determined Fix dcommit, rebase when rewriteRoot is in use Really make the LF after reset in fast-import optional
| * Really make the LF after reset in fast-import optionalAdeodato Simó2008-03-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cmd_from() ends with a call to read_next_command(), which is needed when using cmd_from() from commands where from is not the last element. With reset, however, "from" is the last command, after which the flow returns to the main loop, which calls read_next_command() again. Because of this, always set unread_command_buf in cmd_reset_branch(), even if cmd_from() was successful. Add a test case for this in t9300-fast-import.sh. Signed-off-by: Adeodato Simó <dato@net.com.org.es> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fast-import: exit with proper message if not a git dirJean-Luc Herren2008-03-021-0/+1
| | | | | | | | | | | | | | | | | | git fast-import expects to be run from an existing (possibly empty) repository. It was dying with a suboptimal message if that wasn't the case. Signed-off-by: Jean-Luc Herren <jlh@gmx.ch> Acked-by: Shawn O. Pearce <spearce@spearce.org>
* | Finish current packfile during fast-import crash handlerShawn O. Pearce2008-02-161-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If fast-import is in the middle of crashing due to a protocol error or something like that then it can be very useful to have the mark table and all objects up until that point be available for a new import to resume from. Currently we just close the active packfile, unkeep all of our newly created packfiles (so they can be deleted), and dump the marks table to a temporary file. We don't attempt to update the refs/tags that the process has in memory as much of that data can be found in the crash report and I'm not sure it would be the right thing to do under every type of crash. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Include the fast-import marks table in crash reportsShawn O. Pearce2008-02-161-0/+10
| | | | | | | | | | | | | | | | | | | | | | If fast-import was not run with --export-marks but we are crashing the frontend application developer may still benefit from having that information available to them. We now include the marks table as part of the crash report if --export-marks was not supplied on the command line. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Include annotated tags in fast-import crash reportsShawn O. Pearce2008-02-161-0/+13
|/ | | | | | | | | | | If annotated tags were created they exist in a different namespace within the fast-import process' internal memory tables so we did not export them in the inactive branch table. Now they are written out after the branches, in the order that they were defined by the frontend process. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fast-import: check return value from unpack_entry()Shawn O. Pearce2008-02-151-0/+2
| | | | | | | | | | | If the tree object we have asked for is deltafied in the packfile and the delta did not apply correctly or was not able to be decompressed from the packfile then we can get back NULL instead of the tree data. This is (part of) the reason why read_sha1_file() can return NULL, so we need to also handle it the same way. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Document the hairy gfi_unpack_entry part of fast-importShawn O. Pearce2008-01-211-0/+33
| | | | | | | | | | | Junio pointed out this part of fast-import wasn't very clear on initial read, and it took some time for someone who was new to fast-import's "dirty little tricks" to understand how this was even working. So a little bit of commentary in the proper place may help future readers. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Teach fast-import to honor pack.compression and pack.depthShawn O. Pearce2008-01-211-3/+29
| | | | | | | | | | | | | | | | | | We now use the configured pack.compression and pack.depth values within fast-import, as like builtin-pack-objects fast-import is generating a packfile for consumption by the Git tools. We use the same behavior as builtin-pack-objects does for these options, allowing core.compression to supply the default value for pack.compression. The default setting for pack.depth within fast-import is still 10 as users will generally repack fast-import generated packfiles by `repack -f`. A large delta depth within the fast-import packfile can significantly slow down such a later repack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fast-import: Don't use a maybe-clobbered errno valueJim Meyering2008-01-181-3/+6
| | | | | | | | Without this change, each diagnostic could use an errno value clobbered by the close or unlink in rollback_lock_file. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Fix random fast-import errors when compiled with NO_MMAPShawn O. Pearce2008-01-171-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>