From b47a84b4be5781c91bf1072dcfa765b6782f0a35 Mon Sep 17 00:00:00 2001 From: Yorick Peterse Date: Thu, 28 Apr 2016 21:44:21 +0200 Subject: Added performance guidelines Fixes gitlab-org/gitlab-ce#15254 gitlab-org/gitlab-ce#14277 [ci skip] --- doc/development/README.md | 1 + doc/development/performance.md | 258 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 259 insertions(+) create mode 100644 doc/development/performance.md (limited to 'doc/development') diff --git a/doc/development/README.md b/doc/development/README.md index 3f3ef068f96..aa7d54c01d0 100644 --- a/doc/development/README.md +++ b/doc/development/README.md @@ -8,6 +8,7 @@ - [How to dump production data to staging](db_dump.md) - [Instrumentation](instrumentation.md) - [Migration Style Guide](migration_style_guide.md) for creating safe migrations +- [Performance guidelines](performance.md) - [Rake tasks](rake_tasks.md) for development - [Shell commands](shell_commands.md) in the GitLab codebase - [Sidekiq debugging](sidekiq_debugging.md) diff --git a/doc/development/performance.md b/doc/development/performance.md new file mode 100644 index 00000000000..c74198650e5 --- /dev/null +++ b/doc/development/performance.md @@ -0,0 +1,258 @@ +# Performance Guidelines + +This document describes various guidelines to follow to ensure good and +consistent performance of GitLab. + +## Workflow + +The process of solving performance problems is roughly as follows: + +1. Make sure there's an issue open somewhere (e.g. on the GitLab CE issue + tracker), create one if there isn't. See [#15607][#15607] for an example. +2. Measure the performance of the code in a production environment such as + GitLab.com (see the [Tooling](#tooling) section below). Performance should be + measured over a period of _at least_ 24 hours. +3. Add your findings based on the measurement period (screenshots of graphs, + timings, etc) to the issue mentioned in step 1. +4. Solve the problem. +5. Create a merge request, assign the "performance" label and ping the right + people (e.g. [@yorickpeterse][@yorickpeterse] and [@joshfng][@joshfng]). +6. Once a change has been deployed make sure to _again_ measure for at least 24 + hours to see if your changes have any impact on the production environment. +7. Repeat until you're done. + +When providing timings make sure to provide: + +* The 95th percentile +* The 99th percentile +* The mean + +When providing screenshots of graphs make sure that both the X and Y axes and +the legend are clearly visible. If you happen to have access to GitLab.com's own +monitoring tools you should also provide a link to any relevant +graphs/dashboards. + +## Tooling + +GitLab provides two built-in tools to aid the process of improving performance: + +* [Sherlock](doc/development/profiling.md#sherlock) +* [GitLab Performance Monitoring](doc/monitoring/performance) + +GitLab employees can use GitLab.com's performance monitoring systems located at +, this requires you to log in using your +`@gitlab.com` Email address. Non-GitLab employees are advised to set up their +own InfluxDB + Grafana stack. + +## Benchmarks + +Benchmarks are almost always useless. Benchmarks usually only test small bits of +code in isolation and often only measure the best case scenario. On top of that, +benchmarks for libraries (e.g. a Gem) tend to be biased in favour of the +library. After all there's little benefit to an author publishing a benchmark +that shows they perform worse than their competitors. + +Benchmarks are only really useful when you need a rough (emphasis on "rough") +understanding of the impact of your changes. For example, if a certain method is +slow a benchmark can be used to see if the changes you're making have any impact +on the method's performance. However, even when a benchmark shows your changes +improve performance there's no guarantee the performance also improves in a +production environment. + +When writing benchmarks you should almost always use +[benchmark-ips](https://github.com/evanphx/benchmark-ips). Ruby's `Benchmark` +module that comes with the standard library is rarely useful as it runs either a +single iteration (when using `Benchmark.bm`) or two iterations (when using +`Benchmark.bmbm`). Running this few iterations means external factors (e.g. a +video streaming in the background) can very easily skew the benchmark +statistics. + +Another problem with the `Benchmark` module is that it displays timings, not +iterations. This means that if a piece of code completes in a very short period +of time it can be very difficult to compare the timings before and after a +certain change. This in turn leads to patterns such as the following: + +```ruby +Benchmark.bmbm(10) do |bench| + bench.report 'do something' do + 100.times do + ... work here ... + end + end +end +``` + +This however leads to the question: how many iterations should we run to get +meaningful statistics? + +The benchmark-ips Gem basically takes care of all this and much more, and as a +result of this should be used instead of the `Benchmark` module. + +In short: + +1. Don't trust benchmarks you find on the internet. +2. Never make claims based on just benchmarks, always measure in production to + confirm your findings. +3. X being N times faster than Y is meaningless if you don't know what impact it + will actually have on your production environment. +4. A production environment is the _only_ benchmark that always tells the truth + (unless your performance monitoring systems are not set up correctly). +5. If you must write a benchmark use the benchmark-ips Gem instead of Ruby's + `Benchmark` module. + +## Importance of Changes + +When working on performance improvements it's important to always ask yourself +the question "How important is it to improve the performance of this piece of +code?". Not every piece of code is equally important and it would be a waste to +spend a week trying to improve something that only impacts a tiny fraction of +our users. For example, spending a week trying to squeeze 10 milliseconds out of +a method is a waste of time when you could have spent a week squeezing out 10 +seconds elsewhere. + +There is no clear set of steps that you can follow to determine if a certain +piece of code is worth optimizing. The only two things you can do are: + +1. Think about what the code does, how it's used, how many times it's called and + how much time is spent in it relative to the total execution time (e.g. the + total time spent in a web request). +2. Ask others (preferably in the form of an issue). + +Some examples of changes that aren't really important/worth the effort: + +* Replacing double quotes with single quotes. +* Replacing usage of Array with Set when the list of values is very small. +* Replacing library A with library B when both only take up 0.1% of the total + execution time. +* Calling `freeze` on every string (see [String Freezing](#string-freezing)). + +## Slow Operations & Sidekiq + +Slow operations (e.g. merging branches) or operations that are prone to errors +(using external APIs) should be performed in a Sidekiq worker instead of +directly in a web request as much as possible. This has numerous benefits such +as: + +1. An error won't prevent the request from completing. +2. The process being slow won't affect the loading time of a page. +3. In case of a failure it's easy to re-try the process (Sidekiq takes care of + this automatically). +4. By isolating the code from a web request it will hopefully be easier to test + and maintain. + +It's especially important to use Sidekiq as much as possible when dealing with +Git operations as these operations can take quite some time to complete +depending on the performance of the underlying storage system. + +## Git Operations + +Care should be taken to not run unnecessary Git operations. For example, +retrieving the list of branch names using `Repository#branch_names` can be done +without an explicit check if a repository exists or not. In other words, instead +of this: + +```ruby +if repository.exists? + repository.branch_names.each do |name| + ... + end +end +``` + +You can instead just write: + +```ruby +repository.branch_names.each do |name| + ... +end +``` + +## Caching + +Operations that will often return the same result should be cached using Redis, +in particular Git operations. When caching data in Redis make sure the cache is +flushed whenever needed. For example, a cache for the list of tags should be +flushed whenever a new tag is pushed or a tag is removed. + +When adding cache expiration code for repositories this code should be placed in +one of the before/after hooks residing in the Repository class. For example, if +a cache should be flushed after importing a repository this code should be added +to `Repository#after_import`. This ensures the cache logic stays within the +Repository class instead of leaking into other classes. + +When caching data make sure to also memoize the result in an instance variable. +While retrieving data from Redis is much faster than raw Git operations it still +has overhead. By caching the result in an instance variable repeated calls to +the same method won't end up retrieving data from Redis upon every call. When +memoizing cached data in an instance variable make sure to also reset the +instance variable when flushing the cache. An example: + + +```ruby +def first_branch + @first_branch ||= cache.fetch(:first_branch) { branches.first } +end + +def expire_first_branch_cache + cache.expire(:first_branch) + @first_branch = nil +end +``` + +## Anti-Patterns + +This is a collection of [anti-patterns][anti-pattern] that should be avoided +unless these changes have a measurable, significant and positive impact on +production environments. + +### String Freezing + +In recent Ruby versions calling `freeze` on a String leads to it being allocated +only once and re-used. For example, on Ruby 2.3 this will only allocate the +"foo" String once: + +```ruby +10.times do + 'foo'.freeze +end +``` + +Blindly adding a `.freeze` call to every String is an anti-pattern that should +be avoided unless one can prove (using production data) the call actually has a +positive impact on performance. + +This feature of Ruby wasn't really meant to make things faster directly, instead +it was meant to reduce the number of allocations. Depending on the size of the +String and how frequently it would be allocated (before the `.freeze` call was +added) this _may_ make things faster, but there's no guarantee it will. + +Another common flavour of this is to not only freeze a String but also assign it +to a constant, for example: + +```ruby +SOME_CONSTANT = 'foo'.freeze + +9000.times do + SOME_CONSTANT +end +``` + +The only reason you should be doing this is to prevent somebody from mutating +the global String. However, since you can just re-assign constants in Ruby +there's nothing stopping somebody from doing this elsewhere in the code: + +```ruby +SOME_CONSTANT = 'bar' +``` + +### Moving Allocations to Constants + +Storing an object as a constant so you only allocate it once _may_ improve +performance but there's no guarantee this will. Looking up constants has an +impact on runtime performance and as such using a constant instead of +referencing an object directly may even slow code down. + +[#15607]: https://gitlab.com/gitlab-org/gitlab-ce/issues/15607 +[@yorickpeterse]: https://gitlab.com/u/yorickpeterse +[@joshfng]: https://gitlab.com/u/joshfng +[anti-pattern]: https://en.wikipedia.org/wiki/Anti-pattern -- cgit v1.2.1 From 8bd793c9c58c701882447fba055cd93e55f34af5 Mon Sep 17 00:00:00 2001 From: Achilleas Pipinellis Date: Tue, 3 May 2016 20:46:14 +0300 Subject: Copyedit performance document [ci skip] --- doc/development/performance.md | 44 +++++++++++++++++++++--------------------- 1 file changed, 22 insertions(+), 22 deletions(-) (limited to 'doc/development') diff --git a/doc/development/performance.md b/doc/development/performance.md index c74198650e5..41f415812a0 100644 --- a/doc/development/performance.md +++ b/doc/development/performance.md @@ -7,7 +7,7 @@ consistent performance of GitLab. The process of solving performance problems is roughly as follows: -1. Make sure there's an issue open somewhere (e.g. on the GitLab CE issue +1. Make sure there's an issue open somewhere (e.g., on the GitLab CE issue tracker), create one if there isn't. See [#15607][#15607] for an example. 2. Measure the performance of the code in a production environment such as GitLab.com (see the [Tooling](#tooling) section below). Performance should be @@ -27,7 +27,7 @@ When providing timings make sure to provide: * The 99th percentile * The mean -When providing screenshots of graphs make sure that both the X and Y axes and +When providing screenshots of graphs, make sure that both the X and Y axes and the legend are clearly visible. If you happen to have access to GitLab.com's own monitoring tools you should also provide a link to any relevant graphs/dashboards. @@ -37,7 +37,7 @@ graphs/dashboards. GitLab provides two built-in tools to aid the process of improving performance: * [Sherlock](doc/development/profiling.md#sherlock) -* [GitLab Performance Monitoring](doc/monitoring/performance) +* [GitLab Performance Monitoring](doc/monitoring/performance/monitoring.md) GitLab employees can use GitLab.com's performance monitoring systems located at , this requires you to log in using your @@ -48,7 +48,7 @@ own InfluxDB + Grafana stack. Benchmarks are almost always useless. Benchmarks usually only test small bits of code in isolation and often only measure the best case scenario. On top of that, -benchmarks for libraries (e.g. a Gem) tend to be biased in favour of the +benchmarks for libraries (e.g., a Gem) tend to be biased in favour of the library. After all there's little benefit to an author publishing a benchmark that shows they perform worse than their competitors. @@ -102,7 +102,7 @@ In short: ## Importance of Changes -When working on performance improvements it's important to always ask yourself +When working on performance improvements, it's important to always ask yourself the question "How important is it to improve the performance of this piece of code?". Not every piece of code is equally important and it would be a waste to spend a week trying to improve something that only impacts a tiny fraction of @@ -114,7 +114,7 @@ There is no clear set of steps that you can follow to determine if a certain piece of code is worth optimizing. The only two things you can do are: 1. Think about what the code does, how it's used, how many times it's called and - how much time is spent in it relative to the total execution time (e.g. the + how much time is spent in it relative to the total execution time (e.g., the total time spent in a web request). 2. Ask others (preferably in the form of an issue). @@ -159,7 +159,7 @@ if repository.exists? end ``` -You can instead just write: +You can just write: ```ruby repository.branch_names.each do |name| @@ -170,21 +170,21 @@ end ## Caching Operations that will often return the same result should be cached using Redis, -in particular Git operations. When caching data in Redis make sure the cache is +in particular Git operations. When caching data in Redis, make sure the cache is flushed whenever needed. For example, a cache for the list of tags should be flushed whenever a new tag is pushed or a tag is removed. -When adding cache expiration code for repositories this code should be placed in -one of the before/after hooks residing in the Repository class. For example, if -a cache should be flushed after importing a repository this code should be added -to `Repository#after_import`. This ensures the cache logic stays within the -Repository class instead of leaking into other classes. +When adding cache expiration code for repositories, this code should be placed +in one of the before/after hooks residing in the Repository class. For example, +if a cache should be flushed after importing a repository this code should be +added to `Repository#after_import`. This ensures the cache logic stays within +the Repository class instead of leaking into other classes. -When caching data make sure to also memoize the result in an instance variable. -While retrieving data from Redis is much faster than raw Git operations it still -has overhead. By caching the result in an instance variable repeated calls to +When caching data, make sure to also memoize the result in an instance variable. +While retrieving data from Redis is much faster than raw Git operations, it still +has overhead. By caching the result in an instance variable, repeated calls to the same method won't end up retrieving data from Redis upon every call. When -memoizing cached data in an instance variable make sure to also reset the +memoizing cached data in an instance variable, make sure to also reset the instance variable when flushing the cache. An example: @@ -224,10 +224,10 @@ positive impact on performance. This feature of Ruby wasn't really meant to make things faster directly, instead it was meant to reduce the number of allocations. Depending on the size of the String and how frequently it would be allocated (before the `.freeze` call was -added) this _may_ make things faster, but there's no guarantee it will. +added), this _may_ make things faster, but there's no guarantee it will. -Another common flavour of this is to not only freeze a String but also assign it -to a constant, for example: +Another common flavour of this is to not only freeze a String, but also assign +it to a constant, for example: ```ruby SOME_CONSTANT = 'foo'.freeze @@ -248,8 +248,8 @@ SOME_CONSTANT = 'bar' ### Moving Allocations to Constants Storing an object as a constant so you only allocate it once _may_ improve -performance but there's no guarantee this will. Looking up constants has an -impact on runtime performance and as such using a constant instead of +performance, but there's no guarantee this will. Looking up constants has an +impact on runtime performance, and as such, using a constant instead of referencing an object directly may even slow code down. [#15607]: https://gitlab.com/gitlab-org/gitlab-ce/issues/15607 -- cgit v1.2.1 From 70551217a6cc250dc2e9a61720fc447df74e38c7 Mon Sep 17 00:00:00 2001 From: Yorick Peterse Date: Wed, 4 May 2016 13:17:43 +0200 Subject: Fixed username links in the performance guide These would end up being rendered as: @yorickpeterse @yorickpeterse [ci skip] --- doc/development/performance.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'doc/development') diff --git a/doc/development/performance.md b/doc/development/performance.md index 41f415812a0..fb37b3a889c 100644 --- a/doc/development/performance.md +++ b/doc/development/performance.md @@ -16,7 +16,7 @@ The process of solving performance problems is roughly as follows: timings, etc) to the issue mentioned in step 1. 4. Solve the problem. 5. Create a merge request, assign the "performance" label and ping the right - people (e.g. [@yorickpeterse][@yorickpeterse] and [@joshfng][@joshfng]). + people (e.g. [@yorickpeterse][yorickpeterse] and [@joshfng][joshfng]). 6. Once a change has been deployed make sure to _again_ measure for at least 24 hours to see if your changes have any impact on the production environment. 7. Repeat until you're done. @@ -253,6 +253,6 @@ impact on runtime performance, and as such, using a constant instead of referencing an object directly may even slow code down. [#15607]: https://gitlab.com/gitlab-org/gitlab-ce/issues/15607 -[@yorickpeterse]: https://gitlab.com/u/yorickpeterse -[@joshfng]: https://gitlab.com/u/joshfng +[yorickpeterse]: https://gitlab.com/u/yorickpeterse +[joshfng]: https://gitlab.com/u/joshfng [anti-pattern]: https://en.wikipedia.org/wiki/Anti-pattern -- cgit v1.2.1 From e449a6c05c734ff9e38e13013591d07d03c0fff6 Mon Sep 17 00:00:00 2001 From: Yorick Peterse Date: Wed, 4 May 2016 13:14:11 +0200 Subject: Added documentation on how to instrument methods [ci skip] --- doc/development/instrumentation.md | 129 +++++++++++++++++++++++++++++++++++-- 1 file changed, 123 insertions(+), 6 deletions(-) (limited to 'doc/development') diff --git a/doc/development/instrumentation.md b/doc/development/instrumentation.md index c1cf2e77c26..9168c70945a 100644 --- a/doc/development/instrumentation.md +++ b/doc/development/instrumentation.md @@ -1,12 +1,125 @@ # Instrumenting Ruby Code -GitLab Performance Monitoring allows instrumenting of custom blocks of Ruby -code. This can be used to measure the time spent in a specific part of a larger -chunk of code. The resulting data is stored as a field in the transaction that -executed the block. +GitLab Performance Monitoring allows instrumenting of both methods and custom +blocks of Ruby code. Method instrumentation is the primary form of +instrumentation with block-based instrumentation only being used when we want to +drill down to specific regions of code within a method. -To start measuring a block of Ruby code you should use `Gitlab::Metrics.measure` -and give it a name: +## Instrumenting Methods + +Instrumenting methods is done by using the `Gitlab::Metrics::Instrumentation` +module. This module offers a few different methods that can be used to +instrument code: + +* `instrument_method`: instruments a single class method. +* `instrument_instance_method`: instruments a single instance method. +* `instrument_class_hierarchy`: given a Class this method will recursively + instrument all sub-classes (both class and instance methods). +* `instrument_methods`: instruments all public class methods of a Module. +* `instrument_instance_methods`: instruments all public instance methods of a + Module. + +To remove the need for typing the full `Gitlab::Metrics::Instrumentation` +namespace you can use the `configure` class method. This method simply yields +the supplied block while passing `Gitlab::Metrics::Instrumentation` as its +argument. An example: + +``` +Gitlab::Metrics::Instrumentation.configure do |conf| + conf.instrument_method(Foo, :bar) + conf.instrument_method(Foo, :baz) +end +``` + +Using this method is in general preferred over directly calling the various +instrumentation methods. + +Method instrumentation should be added in the initializer +`config/initializers/metrics.rb`. + +### Examples + +Instrumenting a single method: + +``` +Gitlab::Metrics::Instrumentation.configure do |conf| + conf.instrument_method(User, :find_by) +end +``` + +Instrumenting an entire class hierarchy: + +``` +Gitlab::Metrics::Instrumentation.configure do |conf| + conf.instrument_class_hierarchy(ActiveRecord::Base) +end +``` + +Instrumenting all public class methods: + +``` +Gitlab::Metrics::Instrumentation.configure do |conf| + conf.instrument_methods(User) +end +``` + +### Checking Instrumented Methods + +The easiest way to check if a method has been instrumented is to check its +source location. For example: + +``` +method = Rugged::TagCollection.instance_method(:[]) + +method.source_location +``` + +If the source location points to `lib/gitlab/metrics/instrumentation.rb` you +know the method has been instrumented. + +If you're using Pry you can use the `$` command to display the source code of a +method (along with its source location), this is easier than running the above +Ruby code. In case of the above snippet you'd run the following: + +``` +$ Rugged::TagCollection#[] +``` + +This will print out something along the lines of: + +``` +From: /path/to/your/gitlab/lib/gitlab/metrics/instrumentation.rb @ line 148: +Owner: # +Visibility: public +Number of lines: 21 + +def #{name}(#{args_signature}) + trans = Gitlab::Metrics::Instrumentation.transaction + + if trans + start = Time.now + retval = super + duration = (Time.now - start) * 1000.0 + + if duration >= Gitlab::Metrics.method_call_threshold + trans.increment(:method_duration, duration) + + trans.add_metric(Gitlab::Metrics::Instrumentation::SERIES, + { duration: duration }, + method: #{label.inspect}) + end + + retval + else + super + end +end +``` + +## Instrumenting Ruby Blocks + +Measuring blocks of Ruby code is done by calling `Gitlab::Metrics.measure` and +passing it a block. For example: ```ruby Gitlab::Metrics.measure(:foo) do @@ -14,6 +127,10 @@ Gitlab::Metrics.measure(:foo) do end ``` +The block is executed and the execution time is stored as a set of fields in the +currently running transaction. If no transaction is present the block is yielded +without measuring anything. + 3 values are measured for a block: 1. The real time elapsed, stored in NAME_real_time. -- cgit v1.2.1 From df9a7c6b4acf0a662366b2e31d897cda370c2216 Mon Sep 17 00:00:00 2001 From: Robert Speicher Date: Thu, 5 May 2016 17:22:18 -0400 Subject: Add a note to testing docs about the `:each` argument to RSpec hooks [ci skip] --- doc/development/testing.md | 1 + 1 file changed, 1 insertion(+) (limited to 'doc/development') diff --git a/doc/development/testing.md b/doc/development/testing.md index 672e3fb4649..33eed29ba5c 100644 --- a/doc/development/testing.md +++ b/doc/development/testing.md @@ -64,6 +64,7 @@ the command line via `bundle exec teaspoon`, or via a web browser at methods. - Use `context` to test branching logic. - Don't `describe` symbols (see [Gotchas](gotchas.md#dont-describe-symbols)). +- Don't supply the `:each` argument to hooks since it's the default. - Prefer `not_to` to `to_not`. - Try to match the ordering of tests to the ordering within the class. - Try to follow the [Four-Phase Test][four-phase-test] pattern, using newlines -- cgit v1.2.1