object-store: allow threaded access to object reading

Allow object reading to be performed by multiple threads protecting it with an internal lock, the obj_read_mutex. The lock usage can be toggled with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the functions which can be safely called in parallel are: read_object_file_extended(), repo_read_object_file(), read_object_file(), read_object_with_reference(), read_object(), oid_object_info() and oid_object_info_extended(). It's also possible to use obj_read_lock() and obj_read_unlock() to protect other sections that cannot execute in parallel with object reading. Probably there are many spots in the functions listed above that could be executed unlocked (and thus, in parallel). But, for now, we are most interested in allowing parallel access to zlib inflation. This is one of the sections where object reading spends most of the time in (e.g. up to one-third of git-grep's execution time in the chromium repo corresponds to inflation) and it's already thread-safe. So, to take advantage of that, the obj_read_mutex is released when calling git_inflate() and re-acquired right after, for every calling spot in oid_object_info_extended()'s call chain. We may refine this lock to also exploit other possible parallel spots in the future, but for now, threaded zlib inflation should already give great speedups for threaded object reading callers. Note that add_delta_base_cache() was also modified to skip adding already present entries to the cache. This wasn't possible before, but it would be now, with the parallel inflation. Take for example the following situation, where two threads - A and B - are executing the code at unpack_entry(): 1. Thread A is performing the decompression of a base O (which is not yet in the cache) at PHASE II. Thread B is simultaneously trying to unpack O, but just starting at PHASE I. 2. Since O is not yet in the cache, B will go to PHASE II to also perform the decompression. 3. When they finish decompressing, one of them will get the object reading mutex and go to PHASE III while the other waits for the mutex. Let’s say A got the mutex first. 4. Thread A will add O to the cache, go throughout the rest of PHASE III and return. 5. Thread B gets the mutex, also add O to the cache (if the check wasn't there) and returns. Finally, it is also important to highlight that the object reading lock can only ensure thread-safety in the mentioned functions thanks to two complementary mechanisms: the use of 'struct raw_object_store's replace_mutex, which guards sections in the object reading machinery that would otherwise be thread-unsafe; and the 'struct pack_window's inuse_cnt, which protects window reading operations (such as the one performed during the inflation of a packed object), allowing them to execute without the acquisition of the obj_read_mutex. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
author: Matheus Tavares <matheus.bernardino@usp.br> 2020-01-15 23:39:53 -0300
committer: Junio C Hamano <gitster@pobox.com> 2020-01-17 13:52:14 -0800
commit: 31877c9aec21e0824fd4fcf415069cf8dfae4b72 (patch)
tree: ed78b3edfb08a2f65c47dede4d320af072b46c71 /packfile.c
parent: b1fc9da1c84a94ef03eb07df361f3ec43006b39f (diff)
download: git-31877c9aec21e0824fd4fcf415069cf8dfae4b72.tar.gz
1 files changed, 32 insertions, 0 deletions
diff --git a/packfile.c b/packfile.c
index 7e7c04e4d8..24a73fc33a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1086,7 +1086,23 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		/*
+		 * Note: the window section returned by use_pack() must be
+		 * available throughout git_inflate()'s unlocked execution. To
+		 * ensure no other thread will modify the window in the
+		 * meantime, we rely on the packed_window.inuse_cnt. This
+		 * counter is incremented before window reading and checked
+		 * before window disposal.
+		 *
+		 * Other worrying sections could be the call to close_pack_fd(),
+		 * which can close packs even with in-use windows, and to
+		 * reprepare_packed_git(). Regarding the former, mmap doc says:
+		 * "closing the file descriptor does not unmap the region". And
+		 * for the latter, it won't re-open already available packs.
+		 */
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1445,6 +1461,14 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	/*
+	 * Check required to avoid redundant entries when more than one thread
+	 * is unpacking the same object, in unpack_entry() (since its phases I
+	 * and III might run concurrently across multiple threads).
+	 */
+	if (in_delta_base_cache(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1574,7 +1598,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		/*
+		 * Note: we must ensure the window section returned by
+		 * use_pack() will be available throughout git_inflate()'s
+		 * unlocked execution. Please refer to the comment at
+		 * get_size_from_delta() to see how this is done.
+		 */
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
author	Matheus Tavares <matheus.bernardino@usp.br>	2020-01-15 23:39:53 -0300
committer	Junio C Hamano <gitster@pobox.com>	2020-01-17 13:52:14 -0800
commit	31877c9aec21e0824fd4fcf415069cf8dfae4b72 (patch)
tree	ed78b3edfb08a2f65c47dede4d320af072b46c71 /packfile.c
parent	b1fc9da1c84a94ef03eb07df361f3ec43006b39f (diff)
download	git-31877c9aec21e0824fd4fcf415069cf8dfae4b72.tar.gz