path: root/src/pack.c
Commit message (author, date, files changed, lines removed/added)
* Drop parsing pack filename SHA1 part, no one cares the filename (Linquize, 2014-01-23, 1 file, -5/+0)
|
* One more rename/cleanup for callback err functions (Russell Belfer, 2013-12-11, 1 file, -4/+2)
|
* Some callback error check style cleanups (Russell Belfer, 2013-12-11, 1 file, -1/+3)
|
|   I find this easier to read...
* Remove converting user error to GIT_EUSER (Russell Belfer, 2013-12-11, 1 file, -4/+5)
|
|   This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback.
|
|   To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one.
|
|   In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop.
|
|   The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation.
|
|   There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where a user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
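The propagation rule described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not libgit2's code: `last_error`, `set_error`, `error_after_callback`, `foreach_item` and `abort_on_three` are hypothetical names, and the error storage is a plain static buffer rather than the library's real error state.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error storage; an empty string means "no message set". */
static char last_error[128];

static void set_error(const char *msg)
{
	strncpy(last_error, msg, sizeof(last_error) - 1);
	last_error[sizeof(last_error) - 1] = '\0';
}

/* After a failing callback, make sure some message exists, then pass the
 * callback's own return value through unchanged. */
static int error_after_callback(int error)
{
	if (error != 0 && last_error[0] == '\0')
		set_error("callback returned an error");
	return error;
}

typedef int (*item_cb)(int item, void *payload);

/* Negative return values abort the loop and reach the caller as-is.
 * Positive values mean "skip" and never leave the loop. */
static int foreach_item(const int *items, size_t n, item_cb cb, void *payload)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int error = cb(items[i], payload);
		if (error < 0)
			return error_after_callback(error);
	}
	return 0;
}

/* Example callback: counts visits in *payload and aborts with -42 on item 3. */
static int abort_on_three(int item, void *payload)
{
	int *visited = payload;

	(*visited)++;
	return item == 3 ? -42 : 0;
}
```

With this shape the caller sees the callback's own code (here -42) rather than a generic "user error" value, and a message is always available afterwards.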
* Further EUSER and error propagation fixes (Russell Belfer, 2013-12-11, 1 file, -4/+2)
|
|   This continues auditing all the places where GIT_EUSER is being returned and making sure to clear any existing error using the new giterr_user_cancel helper. As a result, places that relied on intercepting GIT_EUSER but having the old error preserved also needed to be cleaned up to correctly stash and then retrieve the actual error.
|
|   Additionally, as I encountered places where error codes were not being propagated correctly, I tried to fix them up. A number of those fixes are included in this commit as well.
* pack: `__object_header` always returns unsigned values (Vicent Marti, 2013-11-01, 1 file, -2/+2)
|
* Fix warning on win64 (Linquize, 2013-11-01, 1 file, -1/+1)
|
* pack: move the object header function here (Carlos Martín Nieto, 2013-10-04, 1 file, -0/+32)
|
* sha1_lookup: do not use the "experimental" lookup mode (Vicent Marti, 2013-08-14, 1 file, -1/+4)
|
* Close p->mwf.fd only if necessary (Sven Strickroth, 2013-07-25, 1 file, -2/+3)
|
|   This fixes a regression introduced in revision 9d2f841a5d39fc25ce722a3904f6ebc9aa112222.
|
|   Signed-off-by: Sven Strickroth <email@cs-ware.de>
* pack: fix memory leak in error path (Rémi Duraffort, 2013-07-15, 1 file, -1/+3)
|
* Mutex init can fail (Russell Belfer, 2013-05-31, 1 file, -2/+14)
|
|   It is obviously quite a serious problem if this happens, but mutex initialization can fail and we should detect it. It's a bit like a memory allocation failure, in that you're probably pretty screwed if this occurs, but at least we'll catch it.
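As an illustration of checking mutex initialization the same way one checks an allocation, here is a minimal sketch. It assumes POSIX threads as a stand-in for the library's own mutex wrapper, and `pack_like`, `pack_like_alloc` and `pack_like_free` are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical object that owns a lock, loosely modelled on a pack file. */
struct pack_like {
	pthread_mutex_t lock;
	int fd;
};

/* Returns NULL if either the allocation or the mutex initialization fails. */
static struct pack_like *pack_like_alloc(void)
{
	struct pack_like *p = calloc(1, sizeof(*p));

	if (p == NULL)
		return NULL;

	p->fd = -1;

	/* pthread_mutex_init returns 0 on success and an error number otherwise. */
	if (pthread_mutex_init(&p->lock, NULL) != 0) {
		free(p);
		return NULL;
	}

	return p;
}

static void pack_like_free(struct pack_like *p)
{
	if (p == NULL)
		return;

	pthread_mutex_destroy(&p->lock);
	free(p);
}
```

The failure path releases the half-built object, so the caller only ever sees a fully initialized object or NULL.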
* Zero memory for major objects before freeing (Russell Belfer, 2013-05-31, 1 file, -1/+5)
|
|   By zeroing out the memory when we free larger objects (i.e. those that serve as collections of other data, such as repos, odb, refdb), I'm hoping that it will be easier for libgit2 bindings to find errors in their object management code.
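The idea can be sketched like this, with hypothetical names (`collection_like`, `collection_like_clear`, `collection_like_free`). Note that this is a debugging aid only: a compiler is allowed to drop a `memset` that immediately precedes `free`, so it should not be relied on for scrubbing secrets.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical "collection" object that owns other data. */
struct collection_like {
	void *items;
	size_t count;
};

/* Release owned data and zero every field, so that stale pointers into this
 * object are more likely to fail loudly instead of appearing to work. */
static void collection_like_clear(struct collection_like *c)
{
	free(c->items);
	memset(c, 0, sizeof(*c));
}

static void collection_like_free(struct collection_like *c)
{
	if (c == NULL)
		return;

	collection_like_clear(c);
	free(c);
}
```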
* Switch to index_version as "git_pack_file is ready" flag (Carlos Martín Nieto, 2013-05-02, 1 file, -5/+6)
|
|   We use p->index_map.data to check whether the struct has been set up and all the information about the index is stored there. This variable gets set up halfway through the setup process, however, and a thread can come along and use fields that haven't been written to yet.
|
|   Crucially, pack_entry_find_offset() needs to read the index version (which is written after index_map) to know the offset and stride length to pass to sha1_entry_pos(). If these values are wrong, assertions in it will fail, as it will be reading bogus data.
|
|   Make index_version the last field to be written and switch from using p->index_map.data to p->index_version as "git_pack_file is ready" flag as we can use it to know if every field has been written.
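A minimal sketch of the "write the ready flag last" pattern follows. It is an assumption-laden illustration: the struct and function names are hypothetical, and the C11 release/acquire atomics are an addition made here so the ordering is well defined in portable C; the commit message itself only describes changing which field is written last.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical index state; index_version doubles as the "ready" flag. */
struct pack_index_like {
	const unsigned char *map_data;
	size_t map_len;
	unsigned int num_objects;
	atomic_uint index_version; /* 0 = not ready; always written last */
};

/* Fill in every field first, then publish by storing the version. */
static void index_publish(struct pack_index_like *p,
	const unsigned char *data, size_t len,
	unsigned int num_objects, unsigned int version)
{
	p->map_data = data;
	p->map_len = len;
	p->num_objects = num_objects;

	/* Release store: the writes above become visible before the flag does. */
	atomic_store_explicit(&p->index_version, version, memory_order_release);
}

/* Readers check the flag before touching any other field. */
static int index_is_ready(struct pack_index_like *p)
{
	return atomic_load_explicit(&p->index_version, memory_order_acquire) != 0;
}
```

A reader that sees a non-zero version can then rely on `map_data`, `map_len` and `num_objects` having been written.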
* Revert "Protect sha1_entry_pos call with mutex" (Carlos Martín Nieto, 2013-05-02, 1 file, -12/+10)
|
|   This reverts commit 8c535f3f6879c6796d8107d7eb80dd8b2105621b.
* Protect sha1_entry_pos call with mutex (Russell Belfer, 2013-05-02, 1 file, -10/+12)
|
|   There is an occasional assertion failure in sha1_entry_pos from pack_entry_find_index when running threaded. Holding the mutex around the code that grabs the index_map data and processes it makes this assertion failure go away.
* Add extra locking around packfile open (Russell Belfer, 2013-05-02, 1 file, -15/+29)
|
|   We were still seeing a few issues in threaded access to packs. This adds extra locks around the opening of the mwindow to avoid a different race.
* Make git_oid_cmp public and add git_oid__cmp (Russell Belfer, 2013-04-29, 1 file, -3/+3)
|
* Consolidate packfile allocation further (Russell Belfer, 2013-04-22, 1 file, -1/+1)
|
|   Rename git_packfile_check to git_packfile_alloc since it is now being used more in that capacity. Fix the various places that use it. Consolidate some repeated code in odb_pack.c related to the allocation of a new pack_backend.
* Make indexer use shared packfile open code (Russell Belfer, 2013-04-22, 1 file, -22/+17)
|
|   The indexer was creating a packfile object separately from the code in pack.c which was a problem since I put a call to git_mutex_init into just pack.c. This commit updates the pack function for creating a new pack object (i.e. git_packfile_check()) so that it can be used in both places and then makes indexer.c use the shared initialization routine.
|
|   There are also a few minor formatting and warning message fixes.
* Further threading fixes (Russell Belfer, 2013-04-22, 1 file, -11/+8)
|
|   This builds on the earlier thread safety work to make it so that setting the odb, index, refdb, or config for a repository is done in a threadsafe manner with minimized locking time. This is done by adding a lock to the repository object and using it to guard the assignment of the above listed pointers. The lock is only held to assign the pointer value.
|
|   This also contains some minor fixes to the other work with pack files to reduce the time that locks are held and to fix an apparent memory leak.
* Add mutex around mapping and unmapping pack files (Russell Belfer, 2013-04-22, 1 file, -24/+43)
|
|   When I was writing threading tests for the new cache, the main error I kept running into was a pack file having its content unmapped underneath the running thread. This adds a lock around the routines that map and unmap the pack data so that threads can effectively reload the data when they need it.
|
|   This also required reworking the error handling paths in a couple places in the code which I tried to make consistent.
* indexer: use a hashtable for keeping track of offsets (Carlos Martín Nieto, 2013-03-03, 1 file, -5/+6)
|
|   These offsets are needed for REF_DELTA objects, which encode which object they use as a base, but not where it lies in the packfile, so we need a list. These objects are mostly from older packfiles, before OFS_DELTA was widespread.
|
|   The time spent in indexing these packfiles is greatly reduced, though remains above what git is able to do.
* Vector improvements and their fallout (Philip Kelley, 2013-01-27, 1 file, -3/+2)
|
* Fix a mutex leak in pack.c (Philip Kelley, 2013-01-26, 1 file, -0/+1)
|
* pack: evict all of the pages at once (Carlos Martín Nieto, 2013-01-14, 1 file, -31/+4)
|
|   Somewhat surprisingly, this can increase the speed considerably, as we don't bother trying to decide what to evict, and the most used entries are quickly back into the cache.
* pack: evict objects from the cache in groups of eight (Carlos Martín Nieto, 2013-01-14, 1 file, -11/+33)
|
|   This drops the cache eviction below libcrypto and zlib in the perf output. The number has been chosen empirically.
* pack: fixes to the cache (Carlos Martín Nieto, 2013-01-12, 1 file, -3/+8)
|
|   The offset should be git_off_t, and we should check the return value of the mutex lock function.
* indexer: properly free the packfile resources (Carlos Martín Nieto, 2013-01-12, 1 file, -1/+1)
|
|   The indexer needs to call the packfile's free function so it takes care of freeing the caches.
|
|   We still need to close the mwf descriptor manually so we can rename the packfile into its final name on Windows.
* Revert "pack: packfile_free -> git_packfile_free and use it in the indexers" (Carlos Martín Nieto, 2013-01-11, 1 file, -1/+1)
|
|   This reverts commit f289f886cb81bb570bed747053d5ebf8aba6bef7, which makes the tests fail on Windows. Revert until we can figure out a solution.
* Fix MSVC compilation warnings (nulltoken, 2013-01-11, 1 file, -1/+1)
|
* pack: packfile_free -> git_packfile_free and use it in the indexers (Carlos Martín Nieto, 2013-01-11, 1 file, -1/+1)
|
|   It turns out the indexers have been ignoring the pack's free function and leaking data. Plug that.
* pack: limit the amount of memory the base delta cache can use (Carlos Martín Nieto, 2013-01-11, 1 file, -2/+34)
|
|   Currently limited to 16MB (like git) and to objects up to 1MB in size.
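The two limits named in this message (16MB in total, 1MB per object) could be enforced along these lines. This is a sketch only: `cache_like`, `cache_admit` and the macro names are hypothetical, and the "drop everything when full" step is a placeholder assumption standing in for whatever eviction policy the cache uses at a given revision.

```c
#include <stddef.h>

#define CACHE_MEMORY_LIMIT ((size_t)16 * 1024 * 1024) /* total budget: 16MB */
#define CACHE_OBJECT_LIMIT ((size_t)1 * 1024 * 1024)  /* per object: 1MB */

/* Hypothetical accounting state for a delta base cache. */
struct cache_like {
	size_t memory_used;
	size_t memory_limit;
};

/* Returns 1 if the object is accounted for in the cache, 0 if it is too
 * large and should bypass the cache entirely. */
static int cache_admit(struct cache_like *c, size_t object_size)
{
	if (object_size > CACHE_OBJECT_LIMIT)
		return 0;

	/* Placeholder eviction: forget everything once the budget would be
	 * exceeded. A real cache would also free the evicted entries here. */
	if (c->memory_used + object_size > c->memory_limit)
		c->memory_used = 0;

	c->memory_used += object_size;
	return 1;
}
```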
* pack: abstract out the cache into its own functions (Carlos Martín Nieto, 2013-01-11, 1 file, -52/+81)
|
* pack: refcount entries and add a mutex around cache access (Carlos Martín Nieto, 2013-01-11, 1 file, -9/+33)
|
* pack: introduce a delta base cache (Carlos Martín Nieto, 2013-01-11, 1 file, -17/+78)
|
|   Many delta bases are re-used. Cache them to avoid inflating the same data repeatedly.
|
|   This version doesn't limit the number of entries to store, so it can end up using a considerable amount of memory.
* update copyrights (Edward Thomson, 2013-01-08, 1 file, -1/+1)
|
* Merge pull request #1091 from carlosmn/stream-object (Vicent Martí, 2012-12-07, 1 file, -0/+66)
|\
| |   Indexer speedup with large objects
| * pack: introduce a streaming API for raw objects (Carlos Martín Nieto, 2012-11-30, 1 file, -0/+66)
| |
| |   This allows us to take objects from the packfile as a stream instead of having to keep it all in memory.
* | pack: add git_packfile_resolve_header (David Michael Barr, 2012-12-03, 1 file, -0/+50)
|/
|   To paraphrase @peff:
|
|   You can get both size and type from a packed object reasonably cheaply. If you have:
|
|   * An object that is not a delta; both type and size are available in the packfile header.
|   * An object that is a delta. The packfile type will be OBJ_*_DELTA, and you have to resolve back to the base to find the real type. That means potentially a lot of packfile index lookups, but each one is relatively cheap. For the size, you inflate the first few bytes of the delta, whose header will tell you the resulting size of applying the delta to the base.
|
|   For simplicity, we just decompress the whole delta for now.
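For the non-delta case described above, type and size come straight out of the packed object's header bytes. The sketch below decodes that header as laid out in git's pack format (3-bit type and 4 low size bits in the first byte, then 7 size bits per continuation byte). The function and enum names are hypothetical and this is not the libgit2 implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Object type codes used in pack entry headers. */
enum {
	OBJ_COMMIT = 1,
	OBJ_TREE = 2,
	OBJ_BLOB = 3,
	OBJ_TAG = 4,
	OBJ_OFS_DELTA = 6,
	OBJ_REF_DELTA = 7
};

/* Decode a pack entry header from buf. Returns the number of bytes consumed,
 * or 0 if the buffer ends early or the size does not fit in 64 bits. */
static size_t parse_object_header(const unsigned char *buf, size_t len,
	int *type, uint64_t *size)
{
	size_t used = 0;
	unsigned int shift = 4;
	unsigned char c;

	if (len == 0)
		return 0;

	c = buf[used++];
	*type = (c >> 4) & 7; /* bits 6..4: object type */
	*size = c & 15;       /* bits 3..0: lowest four size bits */

	/* The high bit of each byte says whether another size byte follows. */
	while (c & 0x80) {
		if (used >= len || shift > 57)
			return 0;
		c = buf[used++];
		*size |= (uint64_t)(c & 0x7f) << shift;
		shift += 7;
	}

	return used;
}

/* For delta types the header gives the delta's size, not the final object's;
 * the real type requires walking back to the base object. */
static int is_delta(int type)
{
	return type == OBJ_OFS_DELTA || type == OBJ_REF_DELTA;
}
```

When `is_delta()` is true, the size obtained here describes the delta data, which is why the message talks about inflating the delta to read the resulting size.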
* Make git_odb_foreach_cb take const param (Russell Belfer, 2012-11-27, 1 file, -1/+1)
|
|   This makes the first OID param of the ODB callback a const pointer and also propagates that change all the way to the backends.
* Set p->mwf.fd to -1 on error (Sven Strickroth, 2012-11-24, 1 file, -2/+4)
|
|   If p->mwf.fd is e.g. -2 then it is closed in packfile_free and an exception might be thrown.
|
|   Signed-off-by: Sven Strickroth <email@cs-ware.de>
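The failure mode described here can be illustrated with a small sketch. The names (`mwf_like`, `open_ro`, `mwf_like_open`, `mwf_like_close`) are hypothetical, and `open_ro` is assumed to return negative error codes other than -1, which is how a value such as -2 or -3 could end up stored as a descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical file wrapper holding a descriptor. */
struct mwf_like {
	int fd;
};

/* Hypothetical helper that reports failure with a library-style negative
 * error code instead of the plain -1 that open(2) returns. */
static int open_ro(const char *path)
{
	int fd = open(path, O_RDONLY);
	return fd < 0 ? -3 : fd;
}

/* On failure, store -1 rather than the error code, so later cleanup code
 * cannot mistake the error code for a descriptor. */
static int mwf_like_open(struct mwf_like *f, const char *path)
{
	int fd = open_ro(path);

	if (fd < 0) {
		f->fd = -1;
		return fd;
	}

	f->fd = fd;
	return 0;
}

/* Close only when a real descriptor is held. */
static void mwf_like_close(struct mwf_like *f)
{
	if (f->fd >= 0)
		close(f->fd);
	f->fd = -1;
}
```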
* Remove use of English expletives (Martin Woodward, 2012-11-23, 1 file, -1/+1)
|
|   Remove words such as fuck, crap, shit etc. Remove other potentially offensive words from comments. Tidy up other geopolitical terms in comments.
* pack: iterate objects in offset order (David Michael Barr, 2012-09-14, 1 file, -12/+36)
|
|   Compute the ordering on demand and persist until the index is freed.
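"Compute on demand and persist" can be sketched as a lazily built, cached sort. All names below (`entry_like`, `index_like`, `index_by_offset`, `index_like_free`) are hypothetical, and the entry layout is an assumption made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: where the object lives in the pack. */
struct entry_like {
	int64_t offset;
	unsigned int index_pos;
};

struct index_like {
	struct entry_like *entries;   /* in index (OID) order */
	size_t count;
	struct entry_like *by_offset; /* NULL until first requested */
};

static int cmp_offset(const void *a, const void *b)
{
	const struct entry_like *ea = a;
	const struct entry_like *eb = b;

	/* Compare without subtracting, to avoid overflow on large offsets. */
	return (ea->offset > eb->offset) - (ea->offset < eb->offset);
}

/* Build the offset-ordered copy on first use and keep it afterwards. */
static const struct entry_like *index_by_offset(struct index_like *idx)
{
	if (idx->by_offset == NULL && idx->count > 0) {
		struct entry_like *sorted = malloc(idx->count * sizeof(*sorted));

		if (sorted == NULL)
			return NULL;

		memcpy(sorted, idx->entries, idx->count * sizeof(*sorted));
		qsort(sorted, idx->count, sizeof(*sorted), cmp_offset);
		idx->by_offset = sorted;
	}

	return idx->by_offset;
}

/* The cached ordering lives until the index is freed. */
static void index_like_free(struct index_like *idx)
{
	free(idx->by_offset);
	idx->by_offset = NULL;
}
```

Walking objects in offset order reads the pack file front to back, which is the motivation for keeping this ordering around once it has been computed.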
* Merge remote-tracking branch 'arrbee/tree-walk-fixes' into development (Vicent Marti, 2012-08-06, 1 file, -5/+6)
|\
| |   Conflicts:
| |     src/notes.c
| |     src/transports/git.c
| |     src/transports/http.c
| |     src/transports/local.c
| |     tests-clar/odb/foreach.c
| * Update iterators for consistency across library (Russell Belfer, 2012-08-03, 1 file, -5/+6)
| |
| |   This updates all the `foreach()` type functions across the library that take callbacks from the user to have a consistent behavior. The rules are:
| |
| |   * A callback terminates the loop by returning any non-zero value
| |   * Once the callback returns non-zero, it will not be called again (i.e. the loop stops all iteration regardless of state)
| |   * If the callback returns non-zero, the parent fn returns GIT_EUSER
| |   * Although the parent returns GIT_EUSER, no error will be set in the library and `giterr_last()` will return NULL if called.
| |
| |   This commit makes those changes across the library and adds tests for most of the iteration APIs to make sure that they follow the above rules.
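The rules listed in this message can be sketched as a small loop. This reflects the behavior as of this commit only (a later commit in this log changes it to propagate the callback's value). `EXAMPLE_EUSER`, `foreach_id` and `stop_after_two` are hypothetical names, and the numeric value of the constant is illustrative rather than taken from the library headers.

```c
#include <stddef.h>

/* Stand-in for GIT_EUSER; the value here is illustrative only. */
#define EXAMPLE_EUSER (-7)

typedef int (*id_cb)(const char *id, void *payload);

/* Any non-zero callback result stops the loop at once, the callback is not
 * invoked again, and the caller sees EXAMPLE_EUSER regardless of the value
 * the callback returned. No error message is recorded. */
static int foreach_id(const char *const *ids, size_t n, id_cb cb, void *payload)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (cb(ids[i], payload) != 0)
			return EXAMPLE_EUSER;
	}

	return 0;
}

/* Example callback: counts calls in *payload and asks to stop on the second. */
static int stop_after_two(const char *id, void *payload)
{
	int *calls = payload;

	(void)id;
	return ++(*calls) >= 2;
}
```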
* | portability: Improve x86/amd64 compatibility (nulltoken, 2012-07-24, 1 file, -3/+3)
|/
* odb: add git_odb_foreach() (Carlos Martín Nieto, 2012-07-03, 1 file, -0/+43)
|
|   Go through each backend and list every object that exists in them. This allows fsck-like uses.
* mwindow: allow memory-window files to deregister (Carlos Martin Nieto, 2012-06-28, 1 file, -0/+1)
|
|   Once a file is registered, there is no way to deregister it, even after the structure that contains it is no longer needed and has been freed. This may be the source of #624.
|
|   Allow and use the deregister function to remove our file from the global list.
* Actually do the mmap... unsurprisingly, this makes the indexer work on SFS (Chris Young, 2012-06-12, 1 file, -3/+3)
|
|   On RAM: the .idx and .pack files become links to a .lock and the original download respectively. Assume some feature (such as record locking) supported by SFS but not JXFS or RAM: is required.