summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorToon Claes <toon@gitlab.com>2018-09-18 17:32:21 +0200
committerToon Claes <toon@gitlab.com>2018-09-20 16:27:09 +0200
commitce830d3c60fbf445c67fb923f03678ad2333eba5 (patch)
tree68a6b4a82244e15eb9c81d697ae9ac2cacad3642
parent8c2192943a5efc4d0a28c67b04bf9b979def66a1 (diff)
downloadgitlab-ce-ce830d3c60fbf445c67fb923f03678ad2333eba5.tar.gz
Add Gitlab::Database::Subquery.self_join to delete_all with limit
`delete_all` doesn't support limit, so you'd need to subquery that. And instead of subquerying with `where(id: query)`, it's better to use an `INNER JOIN`. This method also works with MySQL, while subquerying doesn't (without another layer of subquerying) Reference: https://stackoverflow.com/questions/17892762/mysql-this-version-of-mysql-doesnt-yet-support-limit-in-all-any-some-subqu/17892886#17892886
-rw-r--r--doc/development/README.md1
-rw-r--r--doc/development/database_helpers.md63
-rw-r--r--lib/gitlab/database/subquery.rb16
-rw-r--r--spec/lib/gitlab/database/subquery_spec.rb17
4 files changed, 97 insertions, 0 deletions
diff --git a/doc/development/README.md b/doc/development/README.md
index d8dbc993442..94e604d125d 100644
--- a/doc/development/README.md
+++ b/doc/development/README.md
@@ -94,6 +94,7 @@ description: 'Learn how to contribute to GitLab.'
- [Verifying database capabilities](verifying_database_capabilities.md)
- [Database Debugging and Troubleshooting](database_debugging.md)
- [Query Count Limits](query_count_limits.md)
+- [Database helper modules](database_helpers.md)
## Testing guides
diff --git a/doc/development/database_helpers.md b/doc/development/database_helpers.md
new file mode 100644
index 00000000000..21e4e725de6
--- /dev/null
+++ b/doc/development/database_helpers.md
@@ -0,0 +1,63 @@
+# Database helpers
+
+There are a number of useful helper modules defined in `/lib/gitlab/database/`.
+
+## Subquery
+
+In some cases it is not possible to perform an operation on a query.
+For example:
+
+```ruby
+Geo::EventLog.where('id < 100').limit(10).delete_all
+```
+
+Will give this error:
+
+> ActiveRecord::ActiveRecordError: delete_all doesn't support limit
+
+One solution would be to wrap it in another `where`:
+
+```ruby
+Geo::EventLog.where(id: Geo::EventLog.where('id < 100').limit(10)).delete_all
+```
+
+This works with PostgreSQL, but with MySQL it gives this error:
+
+> ActiveRecord::StatementInvalid: Mysql2::Error: This version of MySQL
+> doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
+
+Also, that query doesn't have very good performance. Using a
+`INNER JOIN` with itself is better.
+
+So instead of this query:
+
+```sql
+SELECT geo_event_log.*
+FROM geo_event_log
+WHERE geo_event_log.id IN
+ (SELECT geo_event_log.id
+ FROM geo_event_log
+ WHERE (id < 100)
+ LIMIT 10)
+```
+
+It's better to write:
+
+```sql
+SELECT geo_event_log.*
+FROM geo_event_log
+INNER JOIN
+ (SELECT geo_event_log.*
+ FROM geo_event_log
+ WHERE (id < 100)
+ LIMIT 10) t2 ON geo_event_log.id = t2.id
+```
+
+And this is where `Gitlab::Database::Subquery.self_join` can help
+you. So you can rewrite the above statement as:
+
+```ruby
+Gitlab::Database::Subquery.self_join(Geo::EventLog.where('id < 100').limit(10)).delete_all
+```
+
+And this also works with MySQL, so you don't need to worry about that.
diff --git a/lib/gitlab/database/subquery.rb b/lib/gitlab/database/subquery.rb
new file mode 100644
index 00000000000..2a6f39c6a27
--- /dev/null
+++ b/lib/gitlab/database/subquery.rb
@@ -0,0 +1,16 @@
+# frozen_string_literal: true
+
+module Gitlab
+ module Database
+ module Subquery
+ class << self
+ def self_join(relation)
+ t = relation.arel_table
+ t2 = relation.arel.as('t2')
+
+ relation.unscoped.joins(t.join(t2).on(t[:id].eq(t2[:id])).join_sources.first)
+ end
+ end
+ end
+ end
+end
diff --git a/spec/lib/gitlab/database/subquery_spec.rb b/spec/lib/gitlab/database/subquery_spec.rb
new file mode 100644
index 00000000000..70380e02f16
--- /dev/null
+++ b/spec/lib/gitlab/database/subquery_spec.rb
@@ -0,0 +1,17 @@
+# frozen_string_literal: true
+
+require 'spec_helper'
+
+RSpec.describe Gitlab::Database::Subquery do
+ describe '.self_join' do
+ set(:project) { create(:project) }
+
+ it 'allows you to delete_all rows with WHERE and LIMIT' do
+ events = create_list(:event, 8, project: project)
+
+ expect do
+ described_class.self_join(Event.where('id < ?', events[5]).recent.limit(2)).delete_all
+ end.to change { Event.count }.by(-2)
+ end
+ end
+end