path: root/src/pack.c
Commit message | Author | Age | Files | Lines
* Remove extra semicolon outside of a functionStefan Widgren2015-07-311-2/+2
| | | | | Without this change, compiling with gcc and `-pedantic` generates the warning: ISO C does not allow extra ‘;’ outside of a function.
* pack: use git_buf when building the index nameCarlos Martín Nieto2015-06-101-10/+11
| | | | | | The way we currently do it depends on the subtlety of strlen vs sizeof and the fact that .pack is one longer than .idx. Let's use a git_buf so we can express the manipulation we want much more clearly.
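The fragility mentioned above (mixing `strlen` and `sizeof`, and relying on `.pack` being one character longer than `.idx`) can be illustrated with a minimal, self-contained sketch. It does not use libgit2's `git_buf`; `pack_index_name` is a hypothetical helper written only to show the manipulation with explicit lengths.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not libgit2 API: returns a newly allocated
 * "<base>.idx" for a "<base>.pack" path, or NULL on error.
 * The caller frees the result. */
char *pack_index_name(const char *pack_path)
{
    static const char pack_ext[] = ".pack";
    static const char idx_ext[] = ".idx";
    size_t path_len = strlen(pack_path);
    size_t pack_ext_len = strlen(pack_ext);
    size_t idx_ext_len = strlen(idx_ext);
    size_t base_len;
    char *out;

    /* Reject paths that do not end in ".pack". */
    if (path_len < pack_ext_len ||
        strcmp(pack_path + path_len - pack_ext_len, pack_ext) != 0)
        return NULL;

    base_len = path_len - pack_ext_len;
    out = malloc(base_len + idx_ext_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, pack_path, base_len);
    /* The +1 copies the terminating NUL of ".idx" as well. */
    memcpy(out + base_len, idx_ext, idx_ext_len + 1);
    return out;
}
```

Every length here is computed by name, so nothing depends on the two extensions differing by exactly one character.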
* indexer: don't look for the index we're creatingEdward Thomson2015-05-221-0/+7
| | | | | | When creating an index, know that we do not have an index for our own packfile, preventing some unnecessary file opens and error reporting.
* Reorder some khash declarationsCarlos Martín Nieto2015-03-111-0/+3
| | | | | | Keep the definitions in the headers, while putting the declarations in the C files. Putting the function definitions in headers causes them to be duplicated if you include two headers with them.
* Merge pull request #2907 from jasonhaslam/git_packfile_unpack_raceCarlos Martín Nieto2015-02-201-2/+9
|\ | | | | Fix race in git_packfile_unpack.
| * Fix race in git_packfile_unpack.Jason Haslam2015-02-141-2/+9
| | | | | | | | | | | | Increment refcount of newly added cache entries just like existing entries looked up from the cache. Otherwise the new entry can be evicted from the cache and destroyed while it's still in use.
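A rough sketch of the idea, under simplifying assumptions: a one-slot cache, plain `int` reference counts, and a caller that already holds whatever lock protects the cache. All names are hypothetical and this is not the libgit2 cache code. The point is only that insertion hands back an entry that already carries the caller's reference, exactly as a lookup would.

```c
#include <stdlib.h>

typedef struct {
    int refcount; /* number of users currently holding the entry */
    int payload;
} toy_entry;

typedef struct {
    toy_entry *slot; /* one-slot cache, for illustration only */
} toy_cache;

/* Drop the cached entry; free it only if nobody is using it. */
void toy_cache_evict(toy_cache *c)
{
    toy_entry *e = c->slot;
    c->slot = NULL;
    if (e != NULL && e->refcount == 0)
        free(e);
}

/* Lookup takes a reference on behalf of the caller. */
toy_entry *toy_cache_lookup(toy_cache *c)
{
    if (c->slot != NULL)
        c->slot->refcount++;
    return c->slot;
}

/* Insert also takes a reference, so a later eviction cannot
 * destroy the entry while the inserting caller still uses it. */
toy_entry *toy_cache_insert(toy_cache *c, int payload)
{
    toy_entry *e = calloc(1, sizeof(*e));
    if (e == NULL)
        return NULL;
    e->payload = payload;
    e->refcount = 1;
    toy_cache_evict(c);
    c->slot = e;
    return e;
}

/* Release the caller's reference; free if evicted and unused. */
void toy_entry_release(toy_cache *c, toy_entry *e)
{
    if (--e->refcount == 0 && c->slot != e)
        free(e);
}
```

Without the `refcount = 1` in the insert path, an eviction between insert and use would free the entry while it is still in use.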
* | Make our overflow check look more like gcc/clang'sEdward Thomson2015-02-131-10/+11
| | | | | | | | | | | | | | | | | | Make our overflow checking look more like gcc and clang's, so that we can substitute it out with the compiler intrinsics on platforms that support it. This means dropping the ability to pass `NULL` as an out parameter. As a result, the macros are updated to reflect this.
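As an illustration of the calling convention, here is a minimal sketch of overflow checks shaped like the gcc/clang `__builtin_add_overflow` family: they return true on overflow and always write through a non-NULL out parameter. The function names are hypothetical and are not the macros or helpers used in libgit2.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names. Returns true if a + b overflows size_t.
 * `out` must not be NULL; its value is only meaningful when the
 * function returns false. */
bool size_add_overflow(size_t *out, size_t a, size_t b)
{
#if defined(__GNUC__) && (__GNUC__ >= 5)
    /* Same shape as the intrinsic, so it can be substituted directly. */
    return __builtin_add_overflow(a, b, out);
#else
    if (a > SIZE_MAX - b)
        return true;
    *out = a + b;
    return false;
#endif
}

/* Portable multiplication check with the same convention. */
bool size_mul_overflow(size_t *out, size_t a, size_t b)
{
    if (b != 0 && a > SIZE_MAX / b)
        return true;
    *out = a * b;
    return false;
}
```

A caller would typically compute an allocation size with these and bail out with an error when either returns true, instead of passing a wrapped value to `malloc`.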
* | allocations: test for overflow of requested sizeEdward Thomson2015-02-121-0/+7
|/ | | | | Introduce some helper macros to test integer overflow from arithmetic and set error message appropriately.
* Plug some leaksJacques Germishuys2014-12-291-0/+1
|
* Fix for misleading "missing delta bases" error - Fix #2721.Ravindra Patel2014-11-211-1/+4
|
* Removed some useless variable assignmentsPierre-Olivier Latour2014-10-271-1/+0
|
* Silence uninitialized warningJacques Germishuys2014-09-261-1/+1
|
* Several CppCat warnings fixedArkady Shapkin2014-09-031-3/+0
|
* pack: return the correct final offsetcmn/unpack-offsetCarlos Martín Nieto2014-08-261-1/+1
| | | | | | | | | | The callers of git_packfile_unpack() expect the obj_offset argument to be set to the beginning of the next object. We were mistakenly returning the offset of the object's data, which causes the CRC function to try to use the wrong offset. Set obj_offset to curpos instead of elem->offset to point to the next element and bring back expected behaviour.
* pack: free the new pack struct if we fail to insertCarlos Martín Nieto2014-06-251-3/+3
| | | | | | | | If we fail to insert the packfile in the map, make sure to free it. This makes the free function only attempt to remove its mwindows from the global list if we have opened the packfile to avoid accessing the list unlocked.
* Share packs across repository instancescmn/global-mwfCarlos Martín Nieto2014-06-231-1/+18
| | | | | | | | | | | Opening the same repository multiple times will currently open the same file multiple times, as well as map the same region of the file multiple times. This is not necessary, as the packfile data is immutable. Instead of opening and closing packfiles directly, introduce an indirection and allocate packfiles globally. This does mean locking on each packfile open, but we already use this lock for the global mwindow list so it doesn't introduce a new contention point.
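The indirection described above could look roughly like the following sketch: a process-wide registry keyed by path, guarded by one mutex, handing out reference-counted entries. The names (`shared_pack`, `shared_pack_get`, `shared_pack_put`) and the fixed-size array are assumptions made for illustration; the real code uses its own map and packfile structures.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PACKS 16 /* arbitrary limit for the sketch */

typedef struct {
    char *path;
    int refcount;
} shared_pack;

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static shared_pack *registry[MAX_PACKS];

/* Return the shared entry for `path`, creating it on first use. */
shared_pack *shared_pack_get(const char *path)
{
    shared_pack *found = NULL;
    size_t i, free_slot = MAX_PACKS;

    pthread_mutex_lock(&registry_lock);
    for (i = 0; i < MAX_PACKS; i++) {
        if (registry[i] == NULL) {
            if (free_slot == MAX_PACKS)
                free_slot = i;
        } else if (strcmp(registry[i]->path, path) == 0) {
            found = registry[i];
            break;
        }
    }
    if (found == NULL && free_slot < MAX_PACKS) {
        found = calloc(1, sizeof(*found));
        if (found != NULL) {
            found->path = malloc(strlen(path) + 1);
            if (found->path == NULL) {
                free(found);
                found = NULL;
            } else {
                strcpy(found->path, path);
                registry[free_slot] = found;
            }
        }
    }
    if (found != NULL)
        found->refcount++;
    pthread_mutex_unlock(&registry_lock);
    return found;
}

/* Drop one reference; the last user removes and frees the entry. */
void shared_pack_put(shared_pack *p)
{
    size_t i;

    pthread_mutex_lock(&registry_lock);
    if (--p->refcount == 0) {
        for (i = 0; i < MAX_PACKS; i++)
            if (registry[i] == p)
                registry[i] = NULL;
        free(p->path);
        free(p);
    }
    pthread_mutex_unlock(&registry_lock);
}
```

Two repository instances asking for the same path receive the same pointer, so the file would be opened and mapped once rather than once per instance.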
* pack: init the cache on packfile alloccmn/pack-cache-initCarlos Martín Nieto2014-05-151-8/+7
| | | | | | | | When running multithreaded, it is not enough to check for the offmap allocation. Move the call to cache_init() to packfile allocation so we can be sure it is always allocated free of races. This fixes #2355.
* pack: don't forget to cache the base objectcmn/pack-unpack-loopCarlos Martín Nieto2014-05-131-7/+8
| | | | | The base object is a good cache candidate, so we shouldn't forget to add it to the cache.
* pack: use stack allocation for smaller delta chainsCarlos Martín Nieto2014-05-131-16/+45
| | | | | | This avoids allocating the array on the heap for relatively small chains. The expected performance increase is sadly not really noticeable.
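A minimal sketch of this "small buffer first, heap only when needed" pattern follows. The 64-element threshold mirrors the figure quoted elsewhere in this log; the `chain` and `chain_elem` types and the function names are hypothetical and simplified.

```c
#include <stdlib.h>
#include <string.h>

#define SMALL_CHAIN_LEN 64

typedef struct {
    long offset; /* stand-in for the per-object unpack state */
} chain_elem;

typedef struct {
    chain_elem small[SMALL_CHAIN_LEN]; /* lives wherever the struct lives */
    chain_elem *elems;                 /* points at `small` or at the heap */
    size_t len, cap;
} chain;

void chain_init(chain *c)
{
    c->elems = c->small;
    c->len = 0;
    c->cap = SMALL_CHAIN_LEN;
}

/* Append an element, moving to the heap once `small` is full. */
int chain_push(chain *c, long offset)
{
    if (c->len == c->cap) {
        size_t new_cap = c->cap * 2;
        chain_elem *grown;

        if (c->elems == c->small) {
            grown = malloc(new_cap * sizeof(*grown));
            if (grown == NULL)
                return -1;
            memcpy(grown, c->small, c->len * sizeof(*grown));
        } else {
            grown = realloc(c->elems, new_cap * sizeof(*grown));
            if (grown == NULL)
                return -1;
        }
        c->elems = grown;
        c->cap = new_cap;
    }
    c->elems[c->len++].offset = offset;
    return 0;
}

/* Free heap storage, if any was ever needed. */
void chain_free(chain *c)
{
    if (c->elems != c->small)
        free(c->elems);
    chain_init(c);
}
```

Declaring a `chain` as a local variable keeps chains of up to 64 elements entirely on the stack; only longer chains pay for an allocation.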
* pack: expose a cached delta base directlyCarlos Martín Nieto2014-05-131-93/+92
| | | | | Instead of going through a special entry in the chain, let's pass it as an output parameter.
* pack: simplify delta chain codeCarlos Martín Nieto2014-05-091-49/+51
| | | | | | | The switch makes the loop somewhat unwieldy. Let's assume it's fine and perform the check when we're accessing the data. This makes our code look a lot more like git's.
* pack: preallocate a 64-element chainCarlos Martín Nieto2014-05-091-0/+1
| | | | | | | | | Dependency chains are often large and require a few reallocations. Allocate a 64-element chain before doing anything else to avoid allocations during the loop. This value comes from the stack-allocated one git uses. We still allocate this on the heap, but it does help performance a little bit.
* pack: make sure not to leak the dep chainCarlos Martín Nieto2014-05-091-8/+13
|
* pack: use a cache for delta bases when unpackingCarlos Martín Nieto2014-05-091-73/+72
| | | | | | Bring back the use of the delta base cache for unpacking objects. When generating the delta chain, we stop when we find a delta base in the pack's cache and use that as the starting point.
* pack: unpack using a loopCarlos Martín Nieto2014-05-091-25/+119
| | | | | | | | | | | | | | | We currently make use of recursive function calls to unpack an object, resolving the deltas as we come back down the chain. This means that we have unbounded stack growth as we look up objects in a pack. This is now done in two steps: first we figure out what the dependency chain is by looking up the delta bases until we reach a non-delta object, pushing the information we need onto a stack and then we pop from that stack and apply the deltas until there are no more left. This version of the code does not make use of the delta base cache so it is slower than what's in the mainline. A later commit will reintroduce it.
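The two steps can be sketched on a toy model, assuming an "object" is either a base holding a full value or a delta that adds an amount to its base. None of these types or names come from libgit2, and a real delta is a byte-level patch rather than an addition; the sketch only shows the shape of the loop.

```c
#include <stddef.h>

#define MAX_CHAIN 64 /* arbitrary bound for the sketch */

typedef struct {
    int is_delta;
    size_t base; /* index of the base object when is_delta is set */
    int value;   /* base: full value; delta: amount to add */
} toy_object;

/* Resolve objs[idx] without recursion. Returns 0 on success, -1 if
 * the chain is too long or points outside the array. */
int toy_unpack(const toy_object *objs, size_t nobjs, size_t idx, int *out)
{
    size_t stack[MAX_CHAIN];
    size_t depth = 0;
    int result;

    /* Step 1: walk the delta bases, pushing each delta onto a stack,
     * until a non-delta object is reached. */
    while (idx < nobjs && objs[idx].is_delta) {
        if (depth == MAX_CHAIN)
            return -1;
        stack[depth++] = idx;
        idx = objs[idx].base;
    }
    if (idx >= nobjs)
        return -1;
    result = objs[idx].value;

    /* Step 2: pop the deltas and apply them, innermost first. */
    while (depth > 0)
        result += objs[stack[--depth]].value;

    *out = result;
    return 0;
}
```

Stack usage is bounded by `MAX_CHAIN` regardless of how deep the chain is, which is the property the recursive version lacked.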
* pack: do not repeat the same error message four timesCarlos Martín Nieto2014-05-091-4/+4
| | | | | | Repeating this error message makes it harder to find out where we actually are finding the error, and they don't really describe what we're trying to do.
* pack: remove misleading commentCarlos Martín Nieto2014-05-091-7/+0
|
* Drop parsing pack filename SHA1 part, no one cares the filenameLinquize2014-01-231-5/+0
|
* One more rename/cleanup for callback err functionsRussell Belfer2013-12-111-4/+2
|
* Some callback error check style cleanupsRussell Belfer2013-12-111-1/+3
| | | | I find this easier to read...
* Remove converting user error to GIT_EUSERRussell Belfer2013-12-111-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user-provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple of places where a user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
* Further EUSER and error propagation fixesRussell Belfer2013-12-111-4/+2
| | | | | | | | | | | | | This continues auditing all the places where GIT_EUSER is being returned and making sure to clear any existing error using the new giterr_user_cancel helper. As a result, places that relied on intercepting GIT_EUSER but having the old error preserved also needed to be cleaned up to correctly stash and then retrieve the actual error. Additionally, as I encountered places where error codes were not being propagated correctly, I tried to fix them up. A number of those fixes are included in this commit as well.
* pack: `__object_header` always returns unsigned valuesVicent Marti2013-11-011-2/+2
|
* Fix warning on win64Linquize2013-11-011-1/+1
|
* pack: move the object header function hereCarlos Martín Nieto2013-10-041-0/+32
|
* sha1_lookup: do not use the "experimental" lookup modeVicent Marti2013-08-141-1/+4
|
* Close p->mwf.fd only if necessarySven Strickroth2013-07-251-2/+3
| | | | | | This fixes a regression introduced in revision 9d2f841a5d39fc25ce722a3904f6ebc9aa112222. Signed-off-by: Sven Strickroth <email@cs-ware.de>
* pack: fix memory leak in error pathRémi Duraffort2013-07-151-1/+3
|
* Mutex init can failRussell Belfer2013-05-311-2/+14
| | | | | | | It is obviously quite a serious problem if this happens, but mutex initialization can fail and we should detect it. It's a bit like a memory allocation failure, in that you're probably pretty screwed if this occurs, but at least we'll catch it.
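A small sketch of what detecting the failure can look like with POSIX threads, where `pthread_mutex_init` returns a non-zero error number on failure. The `toy_packfile` type and function names are hypothetical; libgit2 wraps mutexes behind its own abstraction, which is not shown here.

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    pthread_mutex_t lock;
    int fd;
} toy_packfile;

/* Allocate a packfile object; treat mutex-init failure like an
 * allocation failure and hand nothing half-built to the caller. */
toy_packfile *toy_packfile_alloc(void)
{
    toy_packfile *p = calloc(1, sizeof(*p));
    if (p == NULL)
        return NULL;

    if (pthread_mutex_init(&p->lock, NULL) != 0) {
        free(p);
        return NULL;
    }

    p->fd = -1; /* not opened yet */
    return p;
}

void toy_packfile_free(toy_packfile *p)
{
    if (p == NULL)
        return;
    pthread_mutex_destroy(&p->lock);
    free(p);
}
```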
* Zero memory for major objects before freeingRussell Belfer2013-05-311-1/+5
| | | | | | | By zeroing out the memory when we free larger objects (i.e. those that serve as collections of other data, such as repos, odb, refdb), I'm hoping that it will be easier for libgit2 bindings to find errors in their object management code.
* Switch to index_version as "git_pack_file is ready" flagCarlos Martín Nieto2013-05-021-5/+6
| | | | | | | | | | | | | | | | We use p->index_map.data to check whether the struct has been set up and all the information about the index is stored there. This variable gets set up halfway through the setup process, however, and a thread can come along and use fields that haven't been written to yet. Crucially, pack_entry_find_offset() needs to read the index version (which is written after index_map) to know the offset and stride length to pass to sha1_entry_pos(). If these values are wrong, assertions in it will fail, as it will be reading bogus data. Make index_version the last field to be written and switch from using p->index_map.data to p->index_version as "git_pack_file is ready" flag as we can use it to know if every field has been written.
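The "write the ready flag last" idea can be sketched as follows. This version uses C11 atomics with release/acquire ordering, which is an assumption made for the illustration and not a statement about how the commit implements it; the `toy_pack` type and its fields are likewise hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    const void *index_data;
    size_t num_objects;
    atomic_int index_version; /* 0 means "not ready"; written last */
} toy_pack;

void toy_pack_init(toy_pack *p)
{
    p->index_data = NULL;
    p->num_objects = 0;
    atomic_init(&p->index_version, 0);
}

/* Fill in every field first, then publish by storing the version. */
void toy_pack_publish(toy_pack *p, const void *data, size_t n, int version)
{
    p->index_data = data;
    p->num_objects = n;
    atomic_store_explicit(&p->index_version, version, memory_order_release);
}

/* Readers check the flag before touching any other field. */
int toy_pack_is_ready(toy_pack *p)
{
    return atomic_load_explicit(&p->index_version, memory_order_acquire) != 0;
}
```

A reader that sees a non-zero version is then guaranteed to see the other fields as they were written before the store, which is what checking a field set halfway through setup could not provide.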
* Revert "Protect sha1_entry_pos call with mutex"Carlos Martín Nieto2013-05-021-12/+10
| | | | This reverts commit 8c535f3f6879c6796d8107d7eb80dd8b2105621b.
* Protect sha1_entry_pos call with mutexRussell Belfer2013-05-021-10/+12
| | | | | | | There is an occasional assertion failure in sha1_entry_pos from pack_entry_find_index when running threaded. Holding the mutex around the code that grabs the index_map data and processes it makes this assertion failure go away.
* Add extra locking around packfile openRussell Belfer2013-05-021-15/+29
| | | | | | We were still seeing a few issues in threaded access to packs. This adds extra locks around the opening of the mwindow to avoid a different race.
* Make git_oid_cmp public and add git_oid__cmpRussell Belfer2013-04-291-3/+3
|
* Consolidate packfile allocation furtherRussell Belfer2013-04-221-1/+1
| | | | | | | Rename git_packfile_check to git_packfile_alloc since it is now being used more in that capacity. Fix the various places that use it. Consolidate some repeated code in odb_pack.c related to the allocation of a new pack_backend.
* Make indexer use shared packfile open codeRussell Belfer2013-04-221-22/+17
| | | | | | | | | | | The indexer was creating a packfile object separately from the code in pack.c which was a problem since I put a call to git_mutex_init into just pack.c. This commit updates the pack function for creating a new pack object (i.e. git_packfile_check()) so that it can be used in both places and then makes indexer.c use the shared initialization routine. There are also a few minor formatting and warning message fixes.
* Further threading fixesRussell Belfer2013-04-221-11/+8
| | | | | | | | | | | | | This builds on the earlier thread safety work to make it so that setting the odb, index, refdb, or config for a repository is done in a threadsafe manner with minimized locking time. This is done by adding a lock to the repository object and using it to guard the assignment of the above listed pointers. The lock is only held to assign the pointer value. This also contains some minor fixes to the other work with pack files to reduce the time that locks are held and to fix an apparent memory leak.
* Add mutex around mapping and unmapping pack filesRussell Belfer2013-04-221-24/+43
| | | | | | | | | | | When I was writing threading tests for the new cache, the main error I kept running into was a pack file having its content unmapped underneath the running thread. This adds a lock around the routines that map and unmap the pack data so that threads can effectively reload the data when they need it. This also required reworking the error handling paths in a couple of places in the code, which I tried to make consistent.
* indexer: use a hashtable for keeping track of offsetsCarlos Martín Nieto2013-03-031-5/+6
| | | | | | | | | | These offsets are needed for REF_DELTA objects, which encode which object they use as a base, but not where it lies in the packfile, so we need a list. These objects are mostly from older packfiles, before OFS_DELTA was widely spread. The time spent in indexing these packfiles is greatly reduced, though remains above what git is able to do.
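To make the lookup concrete: a REF_DELTA names its base by object id, so the indexer needs a fast id-to-offset map. The sketch below is a fixed-size open-addressing table written for illustration; libgit2 uses khash for this, and the types, names and table size here are assumptions.

```c
#include <stdint.h>
#include <string.h>

#define OID_LEN 20       /* SHA-1 object id length in bytes */
#define TABLE_SLOTS 1024 /* power of two; fixed size for the sketch */

typedef struct {
    unsigned char oid[OID_LEN];
    int64_t offset; /* position of the object in the packfile */
    int used;
} offset_slot;

/* Must be zero-initialised before use (e.g. static storage). */
typedef struct {
    offset_slot slots[TABLE_SLOTS];
    size_t count;
} offset_table;

/* Object ids are already hash output, so their leading bytes
 * serve as the table hash. */
static size_t slot_for(const unsigned char *oid)
{
    uint32_t h;
    memcpy(&h, oid, sizeof(h));
    return h & (TABLE_SLOTS - 1);
}

/* Record (or update) the offset for an id. Returns -1 if full. */
int offset_table_put(offset_table *t, const unsigned char *oid, int64_t offset)
{
    size_t i = slot_for(oid), probes;

    for (probes = 0; probes < TABLE_SLOTS; probes++) {
        offset_slot *s = &t->slots[i];
        if (!s->used) {
            memcpy(s->oid, oid, OID_LEN);
            s->offset = offset;
            s->used = 1;
            t->count++;
            return 0;
        }
        if (memcmp(s->oid, oid, OID_LEN) == 0) {
            s->offset = offset;
            return 0;
        }
        i = (i + 1) & (TABLE_SLOTS - 1); /* linear probing */
    }
    return -1;
}

/* Find the offset of a REF_DELTA base. Returns -1 if unknown. */
int offset_table_get(const offset_table *t, const unsigned char *oid,
                     int64_t *offset)
{
    size_t i = slot_for(oid), probes;

    for (probes = 0; probes < TABLE_SLOTS; probes++) {
        const offset_slot *s = &t->slots[i];
        if (!s->used)
            return -1;
        if (memcmp(s->oid, oid, OID_LEN) == 0) {
            *offset = s->offset;
            return 0;
        }
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return -1;
}
```

Each lookup is expected constant time, compared with scanning a list of every object seen so far for each REF_DELTA.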