diff options
author | Sean McGivern <sean@gitlab.com> | 2019-04-03 10:46:13 +0100 |
---|---|---|
committer | Sean McGivern <sean@gitlab.com> | 2019-04-04 12:36:22 +0100 |
commit | 10ceb33ba271f603fa09d4a4b5fdca03fd7ea333 (patch) | |
tree | f96b0e786a4e2ef96287ac2099cb7fd89d8e3ebb /app/finders | |
parent | f87b7fe3b386962c45e83486634352da544857fb (diff) | |
download | gitlab-ce-10ceb33ba271f603fa09d4a4b5fdca03fd7ea333.tar.gz |
Extend CTE search optimisation to projectsextend-cte-optimisations-to-projects
When we use the `search` param on an `IssuableFinder`, we can run into
issues. We have trigram indexes to support these searches. On
GitLab.com, we often see Postgres's optimiser prioritise the (global)
trigram indexes over the index on `project_id`. For group and project
searches, we know that it will be quicker to filter by `project_id`
first, as it returns fewer rows in most cases.
For group issues search, we ran into this issue previously, and went
through the following iterations:
1. Use a CTE on the project IDs as an optimisation fence. This prevents
the planner from disregarding the index on `project_id`.
Unfortunately it breaks some types of sorting, like priority and
popularity, as they sort on a joined table.
2. Use a subquery for listing issues, and a CTE for counts. The subquery
- in the case of group lists - didn't help as much as the CTE, but
was faster than not including it. We can safely use a CTE for counts
as they don't have sorting.
Now, however, we're seeing the same issue in a project context. The
subquery doesn't help at all there (it would only return one row, after
all). In an attempt to keep total code complexity under control, this
commit removes the subquery optimisation and applies the CTE
optimisation only for sorts we know that are safe.
This means that for more complicated sorts (like priority and
popularity), the search will continue to be very slow. If this is a
high-priority issue, we can consider introducing further optimisations,
but this finder is already very complicated and additional complexity
has a cost.
The group CTE optimisation is controlled by the same feature flag as
before, `attempt_group_search_optimizations`, which is enabled by
default. The new project CTE optimisation is controlled by a new feature
flag, `attempt_project_search_optimizations`, which is disabled by
default.
Diffstat (limited to 'app/finders')
-rw-r--r-- | app/finders/issuable_finder.rb | 46 |
1 files changed, 20 insertions, 26 deletions
diff --git a/app/finders/issuable_finder.rb b/app/finders/issuable_finder.rb index 6eab8c5ee51..fa9dda2ab31 100644 --- a/app/finders/issuable_finder.rb +++ b/app/finders/issuable_finder.rb @@ -84,7 +84,7 @@ class IssuableFinder # https://www.postgresql.org/docs/current/static/queries-with.html items = by_search(items) - items = sort(items) unless use_cte_for_count? + items = sort(items) items end @@ -92,7 +92,6 @@ class IssuableFinder def filter_items(items) items = by_project(items) items = by_group(items) - items = by_subquery(items) items = by_scope(items) items = by_created_at(items) items = by_updated_at(items) @@ -132,10 +131,12 @@ class IssuableFinder # This does not apply when we are using a CTE for the search, as the labels # GROUP BY is inside the subquery in that case, so we set labels_count to 1. # - # We always use CTE when searching in Groups if the feature flag is enabled, - # but never when searching in Projects. + # Groups and projects have separate feature flags to suggest the use + # of a CTE. The CTE will not be used if the sort doesn't support it, + # but will always be used for the counts here as we ignore sorting + # anyway. labels_count = label_names.any? ? label_names.count : 1 - labels_count = 1 if use_cte_for_count? + labels_count = 1 if use_cte_for_search? finder.execute.reorder(nil).group(:state).count.each do |key, value| counts[count_key(key)] += value / labels_count @@ -309,15 +310,14 @@ class IssuableFinder end # rubocop: enable CodeReuse/ActiveRecord - def use_subquery_for_search? - strong_memoize(:use_subquery_for_search) do - !force_cte? && attempt_group_search_optimizations? - end - end + def use_cte_for_search? + strong_memoize(:use_cte_for_search) do + next false unless search + next false unless Gitlab::Database.postgresql? + # Only simple unsorted & simple sorts can use CTE + next false if params[:sort].present? && !params[:sort].in?(klass.simple_sorts.keys) - def use_cte_for_count? - strong_memoize(:use_cte_for_count) do - force_cte? && attempt_group_search_optimizations? + attempt_group_search_optimizations? || attempt_project_search_optimizations? end end @@ -332,12 +332,15 @@ class IssuableFinder end def attempt_group_search_optimizations? - search && - Gitlab::Database.postgresql? && - params[:attempt_group_search_optimizations] && + params[:attempt_group_search_optimizations] && Feature.enabled?(:attempt_group_search_optimizations, default_enabled: true) end + def attempt_project_search_optimizations? + params[:attempt_project_search_optimizations] && + Feature.enabled?(:attempt_project_search_optimizations) + end + def count_key(value) Array(value).last.to_sym end @@ -408,20 +411,11 @@ class IssuableFinder end # rubocop: enable CodeReuse/ActiveRecord - # Wrap projects and groups in a subquery if the conditions are met. - def by_subquery(items) - if use_subquery_for_search? - klass.where(id: items.select(:id)) # rubocop: disable CodeReuse/ActiveRecord - else - items - end - end - # rubocop: disable CodeReuse/ActiveRecord def by_search(items) return items unless search - if use_cte_for_count? + if use_cte_for_search? cte = Gitlab::SQL::RecursiveCTE.new(klass.table_name) cte << items |