<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/gitlab/gitlab-ce.git/app/models/merge_request_diff.rb, branch remove-commit-tree</title>
<subtitle>gitlab.com: gitlab-org/gitlab-ce.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/'/>
<entry>
<title>Use memoization for commits on diffs</title>
<updated>2017-12-12T15:28:26+00:00</updated>
<author>
<name>Zeger-Jan van de Weg</name>
<email>git@zjvandeweg.nl</email>
</author>
<published>2017-12-11T15:38:16+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=3ab026b7d69ec97796fe4bb409933d7a716eef19'/>
<id>3ab026b7d69ec97796fe4bb409933d7a716eef19</id>
<content type='text'>
The Gitaly CommitService is being hammered by n + 1 calls, mostly when
finding commits. This leads to this gRPC being turned of on production:
https://gitlab.com/gitlab-org/gitaly/issues/514#note_48991378

Hunting down where it came from, most of them were due to
MergeRequest#show. To prove this, I set a script to request the
MergeRequest#show page 50 times. The GDK was being scraped by
Prometheus, where we have metrics on controller#action and their Gitaly
calls performed. On both occations I've restarted the full GDK so all
caches had to be rebuild.

Current master, 806a68a81f1baee, needed 435 requests
After this commit, 154 requests
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The Gitaly CommitService is being hammered by n + 1 calls, mostly when
finding commits. This leads to this gRPC being turned of on production:
https://gitlab.com/gitlab-org/gitaly/issues/514#note_48991378

Hunting down where it came from, most of them were due to
MergeRequest#show. To prove this, I set a script to request the
MergeRequest#show page 50 times. The GDK was being scraped by
Prometheus, where we have metrics on controller#action and their Gitaly
calls performed. On both occations I've restarted the full GDK so all
caches had to be rebuild.

Current master, 806a68a81f1baee, needed 435 requests
After this commit, 154 requests
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove serialised diff and commit columns</title>
<updated>2017-11-28T16:13:40+00:00</updated>
<author>
<name>Sean McGivern</name>
<email>sean@gitlab.com</email>
</author>
<published>2017-11-21T16:58:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=4ebbfe5d3e0dbe06346ee0c64a8f62ec11f9b6e8'/>
<id>4ebbfe5d3e0dbe06346ee0c64a8f62ec11f9b6e8</id>
<content type='text'>
The st_commits and st_diffs columns on merge_request_diffs historically held the
YAML-serialised data for a merge request diff, in a variety of formats.

Since 9.5, these have been migrated in the background to two new tables:
merge_request_diff_commits and merge_request_diff_files. That has the advantage
that we can actually query the data (for instance, to find out how many commits
we've stored), and that it can't be in a variety of formats, but must match the
new schema.

This is the final step of that journey, where we drop those columns and remove
all references to them. This is a breaking change to the importer, because we
can no longer import diffs created in the old format, and we cannot guarantee
the export will be in the new format unless it was generated after this commit.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The st_commits and st_diffs columns on merge_request_diffs historically held the
YAML-serialised data for a merge request diff, in a variety of formats.

Since 9.5, these have been migrated in the background to two new tables:
merge_request_diff_commits and merge_request_diff_files. That has the advantage
that we can actually query the data (for instance, to find out how many commits
we've stored), and that it can't be in a variety of formats, but must match the
new schema.

This is the final step of that journey, where we drop those columns and remove
all references to them. This is a breaking change to the importer, because we
can no longer import diffs created in the old format, and we cannot guarantee
the export will be in the new format unless it was generated after this commit.
</pre>
</div>
</content>
</entry>
<entry>
<title>Use latest_merge_request_diff association</title>
<updated>2017-11-23T12:14:56+00:00</updated>
<author>
<name>Sean McGivern</name>
<email>sean@gitlab.com</email>
</author>
<published>2017-11-15T17:22:18+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=991bf24ec8890eca248a00deb4f33f309c9ffb83'/>
<id>991bf24ec8890eca248a00deb4f33f309c9ffb83</id>
<content type='text'>
Compared to the merge_request_diff association:

1. It's simpler to query. The query uses a foreign key to the
   merge_request_diffs table, so no ordering is necessary.
2. It's faster for preloading. The merge_request_diff association has to load
   every diff for the MRs in the set, then discard all but the most recent for
   each. This association means that Rails can just query for N diffs from N
   MRs.
3. It's more complicated to update. This is a bidirectional foreign key, so we
   need to update two tables when adding a diff record. This also means we need
   to handle this as a special case when importing a GitLab project.

There is some juggling with this association in the merge request model:

* `MergeRequest#latest_merge_request_diff` is _always_ the latest diff.
* `MergeRequest#merge_request_diff` reuses
  `MergeRequest#latest_merge_request_diff` unless:
    * Arguments are passed. These are typically to force-reload the association.
    * It doesn't exist. That means we might be trying to implicitly create a
      diff. This only seems to happen in specs.
    * The association is already loaded. This is important for the reasons
      explained in the comment, which I'll reiterate here: if we a) load a
      non-latest diff, then b) get its `merge_request`, then c) get that MR's
      `merge_request_diff`, we should get the diff we loaded in c), even though
      that's not the latest diff.

Basically, `MergeRequest#merge_request_diff` is the latest diff in most cases,
but not quite all.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Compared to the merge_request_diff association:

1. It's simpler to query. The query uses a foreign key to the
   merge_request_diffs table, so no ordering is necessary.
2. It's faster for preloading. The merge_request_diff association has to load
   every diff for the MRs in the set, then discard all but the most recent for
   each. This association means that Rails can just query for N diffs from N
   MRs.
3. It's more complicated to update. This is a bidirectional foreign key, so we
   need to update two tables when adding a diff record. This also means we need
   to handle this as a special case when importing a GitLab project.

There is some juggling with this association in the merge request model:

* `MergeRequest#latest_merge_request_diff` is _always_ the latest diff.
* `MergeRequest#merge_request_diff` reuses
  `MergeRequest#latest_merge_request_diff` unless:
    * Arguments are passed. These are typically to force-reload the association.
    * It doesn't exist. That means we might be trying to implicitly create a
      diff. This only seems to happen in specs.
    * The association is already loaded. This is important for the reasons
      explained in the comment, which I'll reiterate here: if we a) load a
      non-latest diff, then b) get its `merge_request`, then c) get that MR's
      `merge_request_diff`, we should get the diff we loaded in c), even though
      that's not the latest diff.

Basically, `MergeRequest#merge_request_diff` is the latest diff in most cases,
but not quite all.
</pre>
</div>
</content>
</entry>
<entry>
<title>Optimise getting the pipeline status of commits</title>
<updated>2017-11-16T15:01:14+00:00</updated>
<author>
<name>Yorick Peterse</name>
<email>yorickpeterse@gmail.com</email>
</author>
<published>2017-11-10T19:57:11+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=ab16a6fb34c0f3e4d9afed3332c559868201e606'/>
<id>ab16a6fb34c0f3e4d9afed3332c559868201e606</id>
<content type='text'>
This adds an optimised way of getting the latest pipeline status for a
list of Commit objects (or just a single one).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds an optimised way of getting the latest pipeline status for a
list of Commit objects (or just a single one).
</pre>
</div>
</content>
</entry>
<entry>
<title>Set merge_request_diff_id on MR when creating</title>
<updated>2017-11-02T16:09:56+00:00</updated>
<author>
<name>Sean McGivern</name>
<email>sean@gitlab.com</email>
</author>
<published>2017-10-26T10:33:54+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=4768a1e26f70263be24ac681042507fe01218b9b'/>
<id>4768a1e26f70263be24ac681042507fe01218b9b</id>
<content type='text'>
Once we migrate existing MRs to have this column, we will be able to get the
latest diff for a single merge request more efficiently, and (more importantly)
get all latest diffs for a collection of MRs efficiently.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Once we migrate existing MRs to have this column, we will be able to get the
latest diff for a single merge request more efficiently, and (more importantly)
get all latest diffs for a collection of MRs efficiently.
</pre>
</div>
</content>
</entry>
<entry>
<title>Move `fetch_ref` from MergeRequestDiff#ensure_commit_shas to MergeRequest</title>
<updated>2017-10-05T08:48:25+00:00</updated>
<author>
<name>Rémy Coutable</name>
<email>remy@rymai.me</email>
</author>
<published>2017-09-22T15:46:24+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=67de21c15bd8bb24667a4d8bfa259406165a909e'/>
<id>67de21c15bd8bb24667a4d8bfa259406165a909e</id>
<content type='text'>
MergeRequest#create_merge_request_diff and MergeRequest#reload_diff are
the only places where we generate a new MR diff so that's where we
should fetch the ref.

This also ensures that the ref is not fetched when we call
merge_request.merge_request_diffs.create in
Github::Import#fetch_pull_requests.

Signed-off-by: Rémy Coutable &lt;remy@rymai.me&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MergeRequest#create_merge_request_diff and MergeRequest#reload_diff are
the only places where we generate a new MR diff so that's where we
should fetch the ref.

This also ensures that the ref is not fetched when we call
merge_request.merge_request_diffs.create in
Github::Import#fetch_pull_requests.

Signed-off-by: Rémy Coutable &lt;remy@rymai.me&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Keep only the changes that are known to work well</title>
<updated>2017-10-05T08:48:25+00:00</updated>
<author>
<name>Rémy Coutable</name>
<email>remy@rymai.me</email>
</author>
<published>2017-09-22T11:37:14+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=324f672eefcd3db1337d51f5137dd085c4697c8c'/>
<id>324f672eefcd3db1337d51f5137dd085c4697c8c</id>
<content type='text'>
Also, improved a bit the method names in
Github::Representation::PullRequest.

Last but not least, ensure temp branch names doesn't contain a `/` as
this would create the ref in a subfolder in `refs/heads` (e.g.
`refs/heads/gh-123/456/rymai/foo`), and would leave empty directories
upon branch deletion (e.g. `refs/heads/gh-123/456/rymai/`.

Signed-off-by: Rémy Coutable &lt;remy@rymai.me&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also, improved a bit the method names in
Github::Representation::PullRequest.

Last but not least, ensure temp branch names doesn't contain a `/` as
this would create the ref in a subfolder in `refs/heads` (e.g.
`refs/heads/gh-123/456/rymai/foo`), and would leave empty directories
upon branch deletion (e.g. `refs/heads/gh-123/456/rymai/`.

Signed-off-by: Rémy Coutable &lt;remy@rymai.me&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fetch and map refs/pull to refs/merge-requests in the GH import</title>
<updated>2017-10-05T08:48:25+00:00</updated>
<author>
<name>James Lopez</name>
<email>james@jameslopez.es</email>
</author>
<published>2017-08-24T08:18:01+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=149adcd1206ba3bebd20401bf45ef7e67ec690fb'/>
<id>149adcd1206ba3bebd20401bf45ef7e67ec690fb</id>
<content type='text'>
update MR diff
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
update MR diff
</pre>
</div>
</content>
</entry>
<entry>
<title>Refactor Gitlab::Git::Commit to include a repository</title>
<updated>2017-08-08T02:34:34+00:00</updated>
<author>
<name>Alejandro Rodríguez</name>
<email>alejorro70@gmail.com</email>
</author>
<published>2017-07-25T20:48:17+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=c21ae07e331ca14605410555d0582f14cb661bb6'/>
<id>c21ae07e331ca14605410555d0582f14cb661bb6</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Migrate MR commits and diffs to new tables</title>
<updated>2017-08-03T12:20:26+00:00</updated>
<author>
<name>Sean McGivern</name>
<email>sean@gitlab.com</email>
</author>
<published>2017-07-03T14:48:59+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitlab/gitlab-ce.git/commit/?id=f2d50af917b878a98e06b994ac32c0718f3d0b78'/>
<id>f2d50af917b878a98e06b994ac32c0718f3d0b78</id>
<content type='text'>
Previously, we stored these as serialised fields - `st_{commits,diffs}` - on the
`merge_request_diffs` table. These now have their own tables -
`merge_request_diff_{commits,diffs}` - with a column for each attribute of the
serialised data.

Add a background migration to go through the existing MR diffs and migrate them
to the new format. Ignore any contents that cannot be displayed. Assuming that
we have 5 million rows to migrate, and each batch of 2,500 rows can be
completed in 5 minutes, this will take about 7 days to migrate everything.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, we stored these as serialised fields - `st_{commits,diffs}` - on the
`merge_request_diffs` table. These now have their own tables -
`merge_request_diff_{commits,diffs}` - with a column for each attribute of the
serialised data.

Add a background migration to go through the existing MR diffs and migrate them
to the new format. Ignore any contents that cannot be displayed. Assuming that
we have 5 million rows to migrate, and each batch of 2,500 rows can be
completed in 5 minutes, this will take about 7 days to migrate everything.
</pre>
</div>
</content>
</entry>
</feed>
