From 19428e800895ba20eacb3357285acef8d69f6d8c Mon Sep 17 00:00:00 2001
From: Yorick Peterse <yorickpeterse@gmail.com>
Date: Mon, 7 May 2018 18:22:07 +0200
Subject: Preload pipeline data for project pipelines

When displaying the pipelines of a project we now preload the following
data:

1. Authors of the commits that belong to these pipelines
2. The number of warnings per pipeline, which is used by
   Ci::Pipeline#has_warnings?

== Commit Authors

Previously this data was queried separately for every Commit, leading to
20 SQL queries being executed in the worst case. At an average of 3 to 5
milliseconds per SQL query, this could result in up to 100 milliseconds
being spent on _just_ getting Commit authors.

To preload this data Commit#author now uses BatchLoader (through
Commit#lazy_author), and a separate module
Gitlab::Ci::Pipeline::Preloader is used to ensure all authors are loaded
before they are used.
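
The batching idea can be sketched in plain Ruby. This is only an
illustration: TinyBatch is a hypothetical stand-in for the BatchLoader
gem, and USERS replaces the database lookup. Keys are collected first,
and a single resolver call runs when the first value is needed.

```ruby
# Hypothetical stand-in for the BatchLoader gem (not its real API).
class TinyBatch
  def initialize(&resolver)
    @resolver = resolver # receives an Array of keys, returns a Hash
    @pending = []
    @results = {}
  end

  # Registers a key and returns a lambda; nothing is loaded yet.
  def lazy(key)
    @pending << key
    -> { load(key) }
  end

  private

  # Resolves all pending keys in one resolver call the first time an
  # unresolved value is requested. Missing keys are stored as nil.
  def load(key)
    unless @results.key?(key)
      keys = @pending.uniq
      @pending.clear
      found = @resolver.call(keys)
      keys.each { |k| @results[k] = found[k] }
    end
    @results[key]
  end
end

# Assumed sample data standing in for the users table.
USERS = { "alice@example.com" => "Alice", "bob@example.com" => "Bob" }

queries = 0
batch = TinyBatch.new do |emails|
  queries += 1 # one "query" per batch, not per email
  USERS.slice(*emails)
end

emails = %w[alice@example.com bob@example.com nobody@example.com alice@example.com]
lazy_authors = emails.map { |email| batch.lazy(email) }
authors = lazy_authors.map(&:call)
```

Here four lookups trigger one resolver call, and the unknown address
resolves to a plain nil rather than a wrapper object.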

== Number of Warnings

This changes Ci::Pipeline#has_warnings? so it supports preloading of the
number of warnings per pipeline. This removes the need to execute a
COUNT(*) query for every pipeline just to see whether it has any
warnings.
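
As a rough sketch of the preloading idea (the Build and Pipeline classes
below are hypothetical, simplified stand-ins, not the real models), the
counts are computed once for all pipelines and then assigned, so that
has_warnings? does not need to count per pipeline:

```ruby
# Hypothetical, simplified stand-ins for illustration only.
Build = Struct.new(:pipeline_id, :warning)

class Pipeline
  attr_reader :id
  attr_writer :number_of_warnings # set by the preloading step

  def initialize(id, builds)
    @id = id
    @builds = builds
  end

  # Falls back to counting per pipeline when nothing was preloaded.
  def number_of_warnings
    @number_of_warnings ||= @builds.count do |build|
      build.pipeline_id == id && build.warning
    end
  end

  def has_warnings?
    number_of_warnings.positive?
  end
end

builds = [
  Build.new(1, true), Build.new(1, false),
  Build.new(2, false),
  Build.new(3, true), Build.new(3, true)
]
pipelines = [1, 2, 3].map { |id| Pipeline.new(id, builds) }

# One pass over all builds, analogous to a single grouped COUNT query.
counts = builds
  .select(&:warning)
  .each_with_object(Hash.new(0)) { |build, hash| hash[build.pipeline_id] += 1 }

pipelines.each { |pipeline| pipeline.number_of_warnings = counts[pipeline.id] }
flags = pipelines.map(&:has_warnings?)
```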
---
 app/models/commit.rb | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

(limited to 'app/models/commit.rb')

diff --git a/app/models/commit.rb b/app/models/commit.rb
index b46f9f34689..56d4c86774e 100644
--- a/app/models/commit.rb
+++ b/app/models/commit.rb
@@ -224,8 +224,34 @@ class Commit
     Gitlab::ClosingIssueExtractor.new(project, current_user).closed_by_message(safe_message)
   end
 
+  def lazy_author
+    BatchLoader.for(author_email.downcase).batch do |emails, loader|
+      # A Hash that maps user Emails to the corresponding User objects. The
+      # Emails at this point are the _primary_ Emails of the Users.
+      users_for_emails = User
+        .by_any_email(emails)
+        .each_with_object({}) { |user, hash| hash[user.email] = user }
+
+      users_for_ids = users_for_emails
+        .values
+        .each_with_object({}) { |user, hash| hash[user.id] = user }
+
+      # Some commits may have used an alternative Email address. In this case we
+      # need to query the "emails" table to map those addresses to User objects.
+      Email
+        .where(email: emails - users_for_emails.keys)
+        .pluck(:email, :user_id)
+        .each { |(email, id)| users_for_emails[email] = users_for_ids[id] }
+
+      users_for_emails.each { |email, user| loader.call(email, user) }
+    end
+  end
+
   def author
-    User.find_by_any_email(author_email.downcase)
+    # We use __sync so that we get the actual objects back (including an actual
+    # nil), instead of a wrapper, as returning a wrapped nil breaks a lot of
+    # code.
+    lazy_author.__sync
   end
   request_cache(:author) { author_email.downcase }
 
-- 
cgit v1.2.1