<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/gitlab/gitlab-ce.git/spec/benchmarks, branch remove_sqlite_check</title>
<subtitle>gitlab.com: gitlab-org/gitlab-ce.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/'/>
<entry>
<title>Move Markdown/reference logic from Gitlab::Markdown to Banzai</title>
<updated>2015-12-15T14:51:16+00:00</updated>
<author>
<name>Douwe Maan</name>
<email>douwe@gitlab.com</email>
</author>
<published>2015-12-15T14:51:16+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=7781bda9bd82997f4a03de4cf911b1156ceb2cde'/>
<id>7781bda9bd82997f4a03de4cf911b1156ceb2cde</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Align hash literals in IssuesFinder spec</title>
<updated>2015-11-19T15:02:21+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-11-19T15:02:21+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=094e1cc01b4f98ea4b8cd664344f3b8b583af471'/>
<id>094e1cc01b4f98ea4b8cd664344f3b8b583af471</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Added benchmark for IssuesFinder</title>
<updated>2015-11-19T10:48:50+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-11-11T11:48:56+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=45840426a79f5ab251caf71400d3e4ed5f5eedbf'/>
<id>45840426a79f5ab251caf71400d3e4ed5f5eedbf</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'create-project-performance' into 'master'</title>
<updated>2015-11-04T10:14:30+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-11-04T10:14:30+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=6d91ee0095d2491ab5d9862d5deba22282d1412d'/>
<id>6d91ee0095d2491ab5d9862d5deba22282d1412d</id>
<content type='text'>

Improve performance of creating projects



See merge request !1650</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>

Improve performance of creating projects



See merge request !1650</pre>
</div>
</content>
</entry>
<entry>
<title>Added benchmark for User.all</title>
<updated>2015-11-03T10:47:23+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-11-03T10:47:23+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=0df65909eff560f2d313c4034ed21a65d643a0c3'/>
<id>0df65909eff560f2d313c4034ed21a65d643a0c3</id>
<content type='text'>
This benchmark exists to test if ordering has any noticeable impact in
the test environment.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This benchmark exists to test if ordering has any noticeable impact in
the test environment.
</pre>
</div>
</content>
</entry>
<entry>
<title>Adjusted ips/sec for find_by_any_email benchmarks</title>
<updated>2015-10-30T11:00:58+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-10-29T16:53:56+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=6d3068bec3a926d17f4f2d0da895856489bfcb7a'/>
<id>6d3068bec3a926d17f4f2d0da895856489bfcb7a</id>
<content type='text'>
While these benchmarks run at roughly 1500 i/sec, setting the threshold
to 1000 leaves some room for deviations (e.g. due to different DB
setups).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While these benchmarks run at roughly 1500 i/sec, setting the threshold
to 1000 leaves some room for deviations (e.g. due to different DB
setups).
</pre>
</div>
</content>
</entry>
<entry>
<title>Improve performance of User.find_by_any_email</title>
<updated>2015-10-30T11:00:58+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-10-28T13:43:27+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=49c081b9f38e99bbc11e7132d87773749b5b39d5'/>
<id>49c081b9f38e99bbc11e7132d87773749b5b39d5</id>
<content type='text'>
This query used to rely on a JOIN, effectively producing the following
SQL:

    SELECT users.*
    FROM users
    LEFT OUTER JOIN emails ON emails.user_id = users.id
    WHERE (users.email = X OR emails.email = X)
    LIMIT 1;

The use of a JOIN means having to scan over all emails and users, join
them together, and then filter out the rows that don't match the
criteria (though the database may already apply this filter while
joining).

In the new setup this query instead uses a sub-query, producing the
following SQL:

    SELECT *
    FROM users
    WHERE id IN (SELECT user_id FROM emails WHERE email = X)
    OR email = X
    LIMIT 1;

This query has the benefit that it:

1. Doesn't have to JOIN any rows
2. Only has to operate on a relatively small set of rows from the
   "emails" table.

Since most users will only have a handful of emails associated
(certainly not hundreds or even thousands), the size of the set returned
by the sub-query is small enough that it should not become problematic.

Performance of the old versus new version can be measured using the
following benchmark:

    # Save this in ./bench.rb
    require 'benchmark/ips'

    email = 'yorick@gitlab.com'

    def User.find_by_any_email_old(email)
      user_table = arel_table
      email_table = Email.arel_table

      query = user_table.
        project(user_table[Arel.star]).
        join(email_table, Arel::Nodes::OuterJoin).
        on(user_table[:id].eq(email_table[:user_id])).
        where(user_table[:email].eq(email).or(email_table[:email].eq(email)))

      find_by_sql(query.to_sql).first
    end

    Benchmark.ips do |bench|
      bench.report 'original' do
        User.find_by_any_email_old(email)
      end

      bench.report 'optimized' do
        User.find_by_any_email(email)
      end

      bench.compare!
    end

Running this locally using "bundle exec rails r bench.rb" produces the
following output:

    Calculating -------------------------------------
                original     1.000  i/100ms
               optimized    93.000  i/100ms
    -------------------------------------------------
                original     11.103  (± 0.0%) i/s -     56.000
               optimized    948.713  (± 5.3%) i/s -      4.743k

    Comparison:
               optimized:      948.7 i/s
                original:       11.1 i/s - 85.45x slower

In other words, the new setup is about 85x faster than the old setup,
at least when running this benchmark locally.

For GitLab.com these improvements result in User.find_by_any_email
taking only ~170 ms to run, instead of around 800 ms. While this is
"only" an improvement of about 4.5 times (instead of 85x) it's still
significantly better than before.

Fixes #3242
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This query used to rely on a JOIN, effectively producing the following
SQL:

    SELECT users.*
    FROM users
    LEFT OUTER JOIN emails ON emails.user_id = users.id
    WHERE (users.email = X OR emails.email = X)
    LIMIT 1;

The use of a JOIN means having to scan over all emails and users, join
them together, and then filter out the rows that don't match the
criteria (though the database may already apply this filter while
joining).

In the new setup this query instead uses a sub-query, producing the
following SQL:

    SELECT *
    FROM users
    WHERE id IN (SELECT user_id FROM emails WHERE email = X)
    OR email = X
    LIMIT 1;

This query has the benefit that it:

1. Doesn't have to JOIN any rows
2. Only has to operate on a relatively small set of rows from the
   "emails" table.

Since most users will only have a handful of emails associated
(certainly not hundreds or even thousands), the size of the set returned
by the sub-query is small enough that it should not become problematic.

Performance of the old versus new version can be measured using the
following benchmark:

    # Save this in ./bench.rb
    require 'benchmark/ips'

    email = 'yorick@gitlab.com'

    def User.find_by_any_email_old(email)
      user_table = arel_table
      email_table = Email.arel_table

      query = user_table.
        project(user_table[Arel.star]).
        join(email_table, Arel::Nodes::OuterJoin).
        on(user_table[:id].eq(email_table[:user_id])).
        where(user_table[:email].eq(email).or(email_table[:email].eq(email)))

      find_by_sql(query.to_sql).first
    end

    Benchmark.ips do |bench|
      bench.report 'original' do
        User.find_by_any_email_old(email)
      end

      bench.report 'optimized' do
        User.find_by_any_email(email)
      end

      bench.compare!
    end

Running this locally using "bundle exec rails r bench.rb" produces the
following output:

    Calculating -------------------------------------
                original     1.000  i/100ms
               optimized    93.000  i/100ms
    -------------------------------------------------
                original     11.103  (± 0.0%) i/s -     56.000
               optimized    948.713  (± 5.3%) i/s -      4.743k

    Comparison:
               optimized:      948.7 i/s
                original:       11.1 i/s - 85.45x slower

In other words, the new setup is about 85x faster than the old setup,
at least when running this benchmark locally.

For GitLab.com these improvements result in User.find_by_any_email
taking only ~170 ms to run, instead of around 800 ms. While this is
"only" an improvement of about 4.5 times (instead of 85x) it's still
significantly better than before.

Fixes #3242
</pre>
</div>
</content>
</entry>
<entry>
<title>Added benchmark for Projects::CreateService</title>
<updated>2015-10-29T11:09:25+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-10-20T15:44:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=6369c992a6d9279f4aa38c60c350c966f3926df1'/>
<id>6369c992a6d9279f4aa38c60c350c966f3926df1</id>
<content type='text'>
This benchmark currently runs at ~0.6 iterations per second and is
unlikely to perform any better any time soon.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This benchmark currently runs at ~0.6 iterations per second and is
unlikely to perform any better any time soon.
</pre>
</div>
</content>
</entry>
<entry>
<title>Added benchmark for ReferenceFilter</title>
<updated>2015-10-20T13:53:22+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-10-20T13:51:02+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=e1c3077e4bb718ce841fad175f708623d8375818'/>
<id>e1c3077e4bb718ce841fad175f708623d8375818</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Improve performance of sorting milestone issues</title>
<updated>2015-10-19T09:37:14+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2015-10-15T16:10:35+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=4ff75e317935f990b90dcc5869afe8ebb2b6fee6'/>
<id>4ff75e317935f990b90dcc5869afe8ebb2b6fee6</id>
<content type='text'>
This cuts down the time it takes to sort issues of a milestone by about
10x. In the previous setup the code would run a SQL query for every
issue that had to be sorted. The new setup instead runs a single SQL
query to update all the given issues at once.

The attached benchmark used to run at around 60 iterations per second;
with the new setup it hovers around 600 iterations per second.
Timing-wise, a request to update a milestone with 40-something issues
used to take about 760 ms; in the new setup it takes only about 130 ms.
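
As a rough illustration of the single-query idea (not the code from
this commit), the sketch below builds one UPDATE statement covering all
given issues. The helper name bulk_position_sql, the "issues" table and
the "position" column are assumptions made for this example only.

```ruby
# Hypothetical helper, not the code from this commit: builds one UPDATE
# statement that stores a new position for every issue ID in a single query.
# The "issues" table and "position" column names are assumptions.
def bulk_position_sql(ids)
  # Integer() rejects anything that is not a plain integer ID.
  ids = ids.map { |id| Integer(id) }

  # One WHEN branch per issue; positions start at 1 and follow the given order.
  cases = ids.each_with_index.map do |id, index|
    "WHEN #{id} THEN #{index + 1}"
  end

  "UPDATE issues SET position = CASE id #{cases.join(' ')} END " \
    "WHERE id IN (#{ids.join(', ')})"
end

puts bulk_position_sql([7, 3, 9])
```

The per-issue approach would instead send one UPDATE per ID, so the
number of round trips to the database grows with the size of the
milestone. The sketch assumes a non-empty list of IDs; an empty list
would produce invalid SQL and should be handled by the caller.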

Fixes #3066
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This cuts down the time it takes to sort issues of a milestone by about
10x. In the previous setup the code would run a SQL query for every
issue that had to be sorted. The new setup instead runs a single SQL
query to update all the given issues at once.

The attached benchmark used to run at around 60 iterations per second;
with the new setup it hovers around 600 iterations per second.
Timing-wise, a request to update a milestone with 40-something issues
used to take about 760 ms; in the new setup it takes only about 130 ms.

Fixes #3066
</pre>
</div>
</content>
</entry>
</feed>
