summaryrefslogtreecommitdiff
path: root/doc/development/merge_request_performance_guidelines.md
diff options
context:
space:
mode:
authorGitLab Bot <gitlab-bot@gitlab.com>2019-12-02 12:06:45 +0000
committerGitLab Bot <gitlab-bot@gitlab.com>2019-12-02 12:06:45 +0000
commitbffcdf9bca11a4d43cc40e3f382f03088d36f7c6 (patch)
treec773436393b7a59b5f6b14388b9fa6402a9bd198 /doc/development/merge_request_performance_guidelines.md
parent259c0cc0c4f8a49001b33d1bee577f4422e16d62 (diff)
downloadgitlab-ce-bffcdf9bca11a4d43cc40e3f382f03088d36f7c6.tar.gz
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc/development/merge_request_performance_guidelines.md')
-rw-r--r--doc/development/merge_request_performance_guidelines.md48
1 files changed, 48 insertions, 0 deletions
diff --git a/doc/development/merge_request_performance_guidelines.md b/doc/development/merge_request_performance_guidelines.md
index 43059d23c79..ec50b1557d4 100644
--- a/doc/development/merge_request_performance_guidelines.md
+++ b/doc/development/merge_request_performance_guidelines.md
@@ -165,6 +165,54 @@ can quickly spiral out of control.
There are some cases where this may be needed. If this is the case this should
be clearly mentioned in the merge request description.
+## Batch process
+
+**Summary:** Iterating a single process to external services (e.g. PostgreSQL, Redis, Object Storage, etc)
+should be executed in a **batch-style** in order to reduce connection overheads.
+
+For fetching rows from various tables in a batch-style, please see [Eager Loading](#eager-loading) section.
+
+### Example: Delete multiple files from Object Storage
+
+When you delete multiple files from object storage (e.g. GCS),
+executing a single REST API call multiple times is a quite expensive
+process. Ideally, this should be done in a batch-style, for example, S3 provides
+[batch deletion API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html),
+so it'd be a good idea to consider such an approach.
+
+The `FastDestroyAll` module might help this situation. It's a
+small framework when you remove a bunch of database rows and its associated data
+in a batch style.
+
+## Timeout
+
+**Summary:** You should set a reasonable timeout when the system invokes HTTP calls
+to external services (e.g. Kubernetes), and it should be executed in Sidekiq, not
+in Puma/Unicorn threads.
+
+Often, GitLab needs to communicate with an external service such as Kubernetes
+clusters. In this case, it's hard to estimate when the external service finishes
+the requested process, for example, if it's a user-owned cluster that is inactive for some reason,
+GitLab might wait for the response forever ([Example](https://gitlab.com/gitlab-org/gitlab/issues/31475)).
+This could result in Puma/Unicorn timeout and should be avoided at all cost.
+
+You should set a reasonable timeout, gracefully handle exceptions and surface the
+errors in UI or logging internally.
+
+Using [`ReactiveCaching`](https://docs.gitlab.com/ee/development/utilities.html#reactivecaching) is one of the best solutions to fetch external data.
+
+## Keep database transaction minimal
+
+**Summary:** You should avoid accessing to external services (e.g. Gitaly) during database
+transactions, otherwise it leads to severe contention problems
+as an open transaction basically blocks the release of a Postgres backend connection.
+
+For keeping transaction as minimal as possible, please consider using `AfterCommitQueue`
+module or `after_commit` AR hook.
+
+Here is [an example](https://gitlab.com/gitlab-org/gitlab/issues/36154#note_247228859)
+that one request to Gitaly instance during transaction triggered a P1 issue.
+
## Eager Loading
**Summary:** always eager load associations when retrieving more than one row.