diff options
author | Toon Claes <toon@gitlab.com> | 2017-05-24 15:03:45 +0200 |
---|---|---|
committer | Toon Claes <toon@gitlab.com> | 2017-05-30 22:45:59 +0200 |
commit | 01c6323d238706bc8d0acb0273552a094c996ca0 (patch) | |
tree | 86efcc097adebcce7ccb7ea76aba4e14e350d6eb /lib/api/projects.rb | |
parent | a1deed629e03d8db47deb1bcf795ae8abaf2c847 (diff) | |
download | gitlab-ce-01c6323d238706bc8d0acb0273552a094c996ca0.tar.gz |
UNION of SELECT/WHERE is faster than WHERE on UNION
Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
SELECT statements, and do UNION on that.
Local tests with about 2_000_000 projects:
- 1_500_000 private projects
- 40_000 internal projects
- 400_000 public projects
For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:
```sql
SELECT "projects".*
FROM "projects"
WHERE ...
```
The original query took 1073.8ms.
The query refactored to UNION of SELECT/WHERE took 2.3ms.
The original query was:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
AND (projects.id IN
(SELECT "projects"."id"
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "projects"."pending_delete" = 'f'
AND "project_authorizations"."user_id" = 23
UNION SELECT "projects"."id"
FROM "projects"
WHERE "projects"."visibility_level" IN (20,
10)))
AND "projects"."visibility_level" = $2
AND "projects"."archived" = $3
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"],
["visibility_level", 0],
["archived", "f"]]
```
The refactored query:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
AND (projects.id IN
(SELECT "projects"."id"
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "projects"."pending_delete" = 'f'
AND "project_authorizations"."user_id" = 23
AND "projects"."visibility_level" = 0
AND "projects"."archived" = 'f'
UNION SELECT "projects"."id"
FROM "projects"
WHERE "projects"."visibility_level" IN (20,
10)
AND "projects"."visibility_level" = 0
AND "projects"."archived" = 'f'))
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"]]
```
Diffstat (limited to 'lib/api/projects.rb')
0 files changed, 0 insertions, 0 deletions