diff options
Diffstat (limited to 'doc/development')
-rw-r--r-- | doc/development/README.md | 1 | ||||
-rw-r--r-- | doc/development/architecture.md | 2 | ||||
-rw-r--r-- | doc/development/benchmarking.md | 69 | ||||
-rw-r--r-- | doc/development/profiling.md | 27 | ||||
-rw-r--r-- | doc/development/rake_tasks.md | 2 | ||||
-rw-r--r-- | doc/development/shared_files.md | 33 | ||||
-rw-r--r-- | doc/development/shell_commands.md | 20 |
7 files changed, 147 insertions, 7 deletions
diff --git a/doc/development/README.md b/doc/development/README.md index 6bc8e1888db..d5bf166ad32 100644 --- a/doc/development/README.md +++ b/doc/development/README.md @@ -8,3 +8,4 @@ - [UI guide](ui_guide.md) for building GitLab with existing css styles and elements - [Migration Style Guide](migration_style_guide.md) for creating safe migrations - [How to dump production data to staging](dump_db.md) +- [Benchmarking](benchmarking.md) diff --git a/doc/development/architecture.md b/doc/development/architecture.md index c00d290371e..6101a71a8de 100644 --- a/doc/development/architecture.md +++ b/doc/development/architecture.md @@ -146,7 +146,7 @@ nginx Apache httpd -- [Explanation of Apache logs](http://httpd.apache.org/docs/2.2/logs.html). +- [Explanation of Apache logs](https://httpd.apache.org/docs/2.2/logs.html). - `/var/log/apache2/` contains error and output logs (on Ubuntu). - `/var/log/httpd/` contains error and output logs (on RHEL). diff --git a/doc/development/benchmarking.md b/doc/development/benchmarking.md new file mode 100644 index 00000000000..88e18ee95f9 --- /dev/null +++ b/doc/development/benchmarking.md @@ -0,0 +1,69 @@ +# Benchmarking + +GitLab CE comes with a set of benchmarks that are executed for every build. This +makes it easier to measure performance of certain components over time. + +Benchmarks are written as RSpec tests using a few extra helpers. To write a +benchmark, first tag the top-level `describe`: + +```ruby +describe MaruTheCat, benchmark: true do + +end +``` + +This ensures the benchmark is executed separately from other test collections. +It also exposes the various RSpec matchers used for writing benchmarks to the +test group. + +Next, lets write the actual benchmark: + +```ruby +describe MaruTheCat, benchmark: true do + let(:maru) { MaruTheChat.new } + + describe '#jump_in_box' do + benchmark_subject { maru.jump_in_box } + + it { is_expected.to iterate_per_second(9000) } + end +end +``` + +Here `benchmark_subject` is a small wrapper around RSpec's `subject` method that +makes it easier to specify the subject of a benchmark. Using RSpec's regular +`subject` would require us to write the following instead: + +```ruby +subject { -> { maru.jump_in_box } } +``` + +The `iterate_per_second` matcher defines the amount of times per second a +subject should be executed. The higher the amount of iterations the better. + +By default the allowed standard deviation is a maximum of 30%. This can be +adjusted by chaining the `with_maximum_stddev` on the `iterate_per_second` +matcher: + +```ruby +it { is_expected.to iterate_per_second(9000).with_maximum_stddev(50) } +``` + +This can be useful if the code in question depends on external resources of +which the performance can vary a lot (e.g. physical HDDs, network calls, etc). +However, in most cases 30% should be enough so only change this when really +needed. + +## Benchmarks Location + +Benchmarks should be stored in `spec/benchmarks` and should follow the regular +Rails specs structure. That is, model benchmarks go in `spec/benchmark/models`, +benchmarks for code in the `lib` directory go in `spec/benchmarks/lib`, etc. + +## Underlying Technology + +The benchmark setup uses [benchmark-ips][benchmark-ips] which takes care of the +heavy lifting such as warming up code, calculating iterations, standard +deviation, etc. + +[benchmark-ips]: https://github.com/evanphx/benchmark-ips diff --git a/doc/development/profiling.md b/doc/development/profiling.md new file mode 100644 index 00000000000..e244ad4e881 --- /dev/null +++ b/doc/development/profiling.md @@ -0,0 +1,27 @@ +# Profiling + +To make it easier to track down performance problems GitLab comes with a set of +profiling tools, some of these are available by default while others need to be +explicitly enabled. + +## Sherlock + +Sherlock is a custom profiling tool built into GitLab. Sherlock is _only_ +available when running GitLab in development mode _and_ when setting the +environment variable `ENABLE_SHERLOCK` to a non empty value. For example: + + ENABLE_SHERLOCK=1 bundle exec rails s + +Recorded transactions can be found by navigating to `/sherlock/transactions`. + +## Bullet + +Bullet is a Gem that can be used to track down N+1 query problems. Because +Bullet adds quite a bit of logging noise it's disabled by default. To enable +Bullet, set the environment variable `ENABLE_BULLET` to a non-empty value before +starting GitLab. For example: + + ENABLE_BULLET=true bundle exec rails s + +Bullet will log query problems to both the Rails log as well as the Chrome +console. diff --git a/doc/development/rake_tasks.md b/doc/development/rake_tasks.md index a4a980cf0e0..9f3fd69fc4e 100644 --- a/doc/development/rake_tasks.md +++ b/doc/development/rake_tasks.md @@ -9,7 +9,7 @@ bundle exec rake setup ``` The `setup` task is a alias for `gitlab:setup`. -This tasks calls `db:setup` to create the database, calls `add_limits_mysql` that adds limits to the database schema in case of a MySQL database and fianlly it calls `db:seed_fu` to seed the database. +This tasks calls `db:setup` to create the database, calls `add_limits_mysql` that adds limits to the database schema in case of a MySQL database and finally it calls `db:seed_fu` to seed the database. Note: `db:setup` calls `db:seed` but this does nothing. ## Run tests diff --git a/doc/development/shared_files.md b/doc/development/shared_files.md new file mode 100644 index 00000000000..fcd905b54a4 --- /dev/null +++ b/doc/development/shared_files.md @@ -0,0 +1,33 @@ +# Shared files + +Historically, GitLab has been storing shared files in many different +directories: `public/uploads`, `builds`, `tmp/repositories`, `tmp/rebase` (EE), +etc. Having so many shared directories makes it difficult to deploy GitLab on +shared storage (e.g. NFS). Working towards GitLab 9.0 we are consolidating +these different directories under the `shared` directory. + +This means that if GitLab will start storing puppies in some future version +then we should put them in `shared/puppies`. Temporary puppy files should be +stored in `shared/tmp`. + +In the GitLab application code you can get the full path to the `shared` +directory with `Gitlab.config.shared.path`. + +## What is not a 'shared file' + +Files that belong to only one process, or on only one server, should not go in +`shared`. Examples include PID files and sockets. + +## Temporary files and shared storage + +Sometimes you create a temporary file on disk with the intention of it becoming +'official'. For example you might be first streaming an upload from a user to +disk in a temporary file so you can perform some checks on it. When the checks +pass, you make the file official. In scenarios like this please follow these +rules: + +- Store the temporary file under `shared/tmp`, i.e. on the same filesystem you + want the official file to be on. +- Use move/rename operations when operating on the file instead of copy + operations where possible, because renaming a file is much faster than + copying it. diff --git a/doc/development/shell_commands.md b/doc/development/shell_commands.md index 2d1d0fb4154..65cdd74bdb6 100644 --- a/doc/development/shell_commands.md +++ b/doc/development/shell_commands.md @@ -35,6 +35,16 @@ Gitlab::Popen.popen(%W(find /some/path -not -path /some/path -mmin +120 -delete) This coding style could have prevented CVE-2013-4490. +## Always use the configurable git binary path for git commands + +```ruby +# Wrong +system(*%W(git branch -d -- #{branch_name})) + +# Correct +system(*%W(#{Gitlab.config.git.bin_path} branch -d -- #{branch_name})) +``` + ## Bypass the shell by splitting commands into separate tokens When we pass shell commands as a single string to Ruby, Ruby will let `/bin/sh` evaluate the entire string. Essentially, we are asking the shell to evaluate a one-line script. This creates a risk for shell injection attacks. It is better to split the shell command into tokens ourselves. Sometimes we use the scripting capabilities of the shell to change the working directory or set environment variables. All of this can also be achieved securely straight from Ruby @@ -81,9 +91,9 @@ In the GitLab codebase, we avoid the option/argument ambiguity by _always_ using ```ruby # Wrong -system(*%W(git branch -d #{branch_name})) +system(*%W(#{Gitlab.config.git.bin_path} branch -d #{branch_name})) # Correct -system(*%W(git branch -d -- #{branch_name})) +system(*%W(#{Gitlab.config.git.bin_path} branch -d -- #{branch_name})) ``` This coding style could have prevented CVE-2013-4582. @@ -94,9 +104,9 @@ Capturing the output of shell commands with backticks reads nicely, but you are ```ruby # Wrong -logs = `cd #{repo_dir} && git log` +logs = `cd #{repo_dir} && #{Gitlab.config.git.bin_path} log` # Correct -logs, exit_status = Gitlab::Popen.popen(%W(git log), repo_dir) +logs, exit_status = Gitlab::Popen.popen(%W(#{Gitlab.config.git.bin_path} log), repo_dir) # Wrong user = `whoami` @@ -108,7 +118,7 @@ In other repositories, such as gitlab-shell you can also use `IO.popen`. ```ruby # Safe IO.popen example -logs = IO.popen(%W(git log), chdir: repo_dir) { |p| p.read } +logs = IO.popen(%W(#{Gitlab.config.git.bin_path} log), chdir: repo_dir) { |p| p.read } ``` Note that unlike `Gitlab::Popen.popen`, `IO.popen` does not capture standard error. |