From b82a205b740840e2a7d0fa3eecf3e361ca73416e Mon Sep 17 00:00:00 2001
From: Marc Radulescu
Date: Wed, 26 Nov 2014 18:51:12 +0100
Subject: added office analogy to help understanding of gitlab architecture

---
 doc/development/architecture.md | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

(limited to 'doc/development')

diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index c4813d22eaa..109b21ab2a5 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -8,6 +8,38 @@ EE releases are available not long after CE releases. To obtain the GitLab EE th
 Both EE and CE require an add-on component called gitlab-shell. It is obtained from the [gitlab-shell repository](https://gitlab.com/gitlab-org/gitlab-shell/tree/master). New versions are usually tags but staying on the master branch will give you the latest stable version. New releases are generally around the same time as GitLab CE releases with exception for informal security updates deemed critical.
+## Physical office analogy
+
+You can imagine GitLab as a physical office.
+
+**The repositories** are the goods GitLab handling.
+They can be stored in a warehouse.
+This can be either a hard disk, or something more complex, such as a NFS filesystem;
+
+**NginX** acts like the front-desk.
+Users come to NginX and request actions to be done by workers in the office;
+
+**The database** is a series of metal file cabinets with information on:
+ - The goods in the warehouse (metadata, issues, merge requests etc);
+ - The users coming to the front desk (permissions)
+
+**Redis** is a [communication board with “cubby holes”](http://cache3.asset-cache.net/gc/52392865-mail-lies-in-cubby-holes-in-the-trenton-post-gettyimages.jpg?v=1&c=IWSAsset&k=2&d=OCUJ5gVf7YdJQI2Xhkc2QMDTqXzgg%2Fa7CPCCcA9Ug%2BfL2iMdhkcAYaLLAievbZlwJI9YEbpjb1pB2Fh7Fge3%2FA%3D%3D) that can contain tasks for office workers;
+
+**Sidekiq** is a worker that primarily handles sending out emails.
+It takes tasks from the Redis communication board;
+
+**A Unicorn worker** is a worker that handles quick/mundane tasks.
+They work with the communication board (Redis).
+Their job description:
+ - check permissions by checking the user session stored in a Redis “cubby hole”;
+ - make tasks for Sidekiq;
+ - fetch stuff from the warehouse or move things around in there;
+
+**Gitlab-shell** is a third kind of worker that takes orders from a fax machine (SSH) instead of the front desk (HTTP).
+Gitlab-shell communicates with Sidekiq via the “communication board” (Redis), and asks quick questions of the Unicorn workers either directly or via the front desk.
+
+**GitLab Enterprise Edition (the application)** is the collection of processes and business practices that the office is run by.
+
 ## System Layout
 When referring to ~git in the pictures it means the home directory of the git user which is typically /home/git.
-- cgit v1.2.1

From 8ccaee19792ac10dffd7a86be0835f2ea5674d0e Mon Sep 17 00:00:00 2001
From: Marc Radulescu
Date: Wed, 26 Nov 2014 20:54:42 +0100
Subject: replaced hotlink

---
 doc/development/architecture.md | 2 +-
 doc/development/cubby_holes.jpg | Bin 0 -> 132815 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)
 create mode 100644 doc/development/cubby_holes.jpg

(limited to 'doc/development')

diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index 109b21ab2a5..68c813d4339 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -23,7 +23,7 @@ Users come to NginX and request actions to be done by workers in the office;
  - The goods in the warehouse (metadata, issues, merge requests etc);
  - The users coming to the front desk (permissions)
-**Redis** is a [communication board with “cubby holes”](http://cache3.asset-cache.net/gc/52392865-mail-lies-in-cubby-holes-in-the-trenton-post-gettyimages.jpg?v=1&c=IWSAsset&k=2&d=OCUJ5gVf7YdJQI2Xhkc2QMDTqXzgg%2Fa7CPCCcA9Ug%2BfL2iMdhkcAYaLLAievbZlwJI9YEbpjb1pB2Fh7Fge3%2FA%3D%3D) that can contain tasks for office workers;
+**Redis** is a [communication board with “cubby holes”](https://dev.gitlab.org/gitlab/gitlabhq/blob/master/doc/development/cubby_holes.jpg) that can contain tasks for office workers;
 **Sidekiq** is a worker that primarily handles sending out emails.
 It takes tasks from the Redis communication board;
diff --git a/doc/development/cubby_holes.jpg b/doc/development/cubby_holes.jpg
new file mode 100644
index 00000000000..afbb58bb950
Binary files /dev/null and b/doc/development/cubby_holes.jpg differ
-- cgit v1.2.1

From c85d4af88921aba31afd39c3403fe2d41381c2ca Mon Sep 17 00:00:00 2001
From: Marc Radulescu
Date: Thu, 27 Nov 2014 10:24:19 +0100
Subject: remove unnecessarry image

---
 doc/development/architecture.md | 2 +-
 doc/development/cubby_holes.jpg | Bin 132815 -> 0 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)
 delete mode 100644 doc/development/cubby_holes.jpg

(limited to 'doc/development')

diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index 68c813d4339..209182e7742 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -23,7 +23,7 @@ Users come to NginX and request actions to be done by workers in the office;
  - The goods in the warehouse (metadata, issues, merge requests etc);
  - The users coming to the front desk (permissions)
-**Redis** is a [communication board with “cubby holes”](https://dev.gitlab.org/gitlab/gitlabhq/blob/master/doc/development/cubby_holes.jpg) that can contain tasks for office workers;
+**Redis** is a communication board with “cubby holes” that can contain tasks for office workers;
 **Sidekiq** is a worker that primarily handles sending out emails.
 It takes tasks from the Redis communication board;
diff --git a/doc/development/cubby_holes.jpg b/doc/development/cubby_holes.jpg
deleted file mode 100644
index afbb58bb950..00000000000
Binary files a/doc/development/cubby_holes.jpg and /dev/null differ
-- cgit v1.2.1

From 64919745544cd09cdb510bf15e9522280d61fdde Mon Sep 17 00:00:00 2001
From: Jacob Vosmaer
Date: Mon, 1 Dec 2014 18:58:37 +0100
Subject: Disable Sidekiq arguments logging by default

---
 doc/development/README.md | 1 +
 1 file changed, 1 insertion(+)

(limited to 'doc/development')

diff --git a/doc/development/README.md b/doc/development/README.md
index 20db6662aca..c31e5d7ae97 100644
--- a/doc/development/README.md
+++ b/doc/development/README.md
@@ -4,3 +4,4 @@
 - [Shell commands](shell_commands.md) in the GitLab codebase
 - [Rake tasks](rake_tasks.md) for development
 - [CI setup](ci_setup.md) for testing GitLab
+- [Sidekiq debugging](sidekiq_debugging.md)
-- cgit v1.2.1

From 704b7237e6c4daa3642c01f8803072fdc3a45eaf Mon Sep 17 00:00:00 2001
From: Sytse Sijbrandij
Date: Thu, 4 Dec 2014 16:54:08 +0100
Subject: Fix notifications for developers that don't read the documentation.

---
 doc/development/rake_tasks.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

(limited to 'doc/development')

diff --git a/doc/development/rake_tasks.md b/doc/development/rake_tasks.md
index 6d9ac161e91..ffa61e66134 100644
--- a/doc/development/rake_tasks.md
+++ b/doc/development/rake_tasks.md
@@ -1,6 +1,6 @@
 # Rake tasks for developers
-## Setup db with developer seeds:
+## Setup db with developer seeds
 Note that if your db user does not have advanced privileges you must create the db manually before running this command.
@@ -8,6 +8,9 @@ Note that if your db user does not have advanced privileges you must create the
 bundle exec rake setup
 ```
+The `setup` task is a alias for `gitlab:setup`.
+This tasks calls `db:setup` to create the database, with `add_limits_mysql` it adds limits to the database schema in case of a MySQL database and fianlly it runs `db:seed_fu` to seed the database.
+
 ## Run tests
 This runs all test suites present in GitLab.
-- cgit v1.2.1

From 3dc25ba331c4f5c4708b0fcd8478d943d182d760 Mon Sep 17 00:00:00 2001
From: Sytse Sijbrandij
Date: Thu, 4 Dec 2014 21:22:21 +0100
Subject: Remove warning from db seed since it is called by db setup.

---
 doc/development/rake_tasks.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'doc/development')

diff --git a/doc/development/rake_tasks.md b/doc/development/rake_tasks.md
index ffa61e66134..53f8095cb13 100644
--- a/doc/development/rake_tasks.md
+++ b/doc/development/rake_tasks.md
@@ -9,7 +9,8 @@ bundle exec rake setup
 ```
 The `setup` task is a alias for `gitlab:setup`.
-This tasks calls `db:setup` to create the database, with `add_limits_mysql` it adds limits to the database schema in case of a MySQL database and fianlly it runs `db:seed_fu` to seed the database.
+This tasks calls `db:setup` to create the database, calls `add_limits_mysql` that adds limits to the database schema in case of a MySQL database and fianlly it calls `db:seed_fu` to seed the database.
+Note: `db:setup` calls `db:seed` but this does nothing.
 ## Run tests
-- cgit v1.2.1

From 369375d0862f16f7a9926374226b1bad028f530c Mon Sep 17 00:00:00 2001
From: Robert Schilling
Date: Sun, 7 Dec 2014 01:24:03 +0100
Subject: Move sidekiq debug docs to development folder

---
 doc/development/sidekiq_debugging.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 create mode 100644 doc/development/sidekiq_debugging.md

(limited to 'doc/development')

diff --git a/doc/development/sidekiq_debugging.md b/doc/development/sidekiq_debugging.md
new file mode 100644
index 00000000000..cea11e5f126
--- /dev/null
+++ b/doc/development/sidekiq_debugging.md
@@ -0,0 +1,14 @@
+# Sidekiq debugging
+
+## Log arguments to Sidekiq jobs
+
+If you want to see what arguments are being passed to Sidekiq jobs you can set
+the SIDEKIQ_LOG_ARGUMENTS environment variable.
+
+```
+SIDEKIQ_LOG_ARGUMENTS=1 bundle exec foreman start
+```
+
+It is not recommend to enable this setting in production because some Sidekiq
+jobs (such as sending a password reset email) take secret arguments (for
+example the password reset token).
-- cgit v1.2.1

From 82eb0a44d7afa3b6ab77b8f7c9386740496a72e1 Mon Sep 17 00:00:00 2001
From: Jacob Vosmaer
Date: Tue, 9 Dec 2014 14:51:15 +0100
Subject: Add security tips about file and paths

---
 doc/development/shell_commands.md | 63 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

(limited to 'doc/development')

diff --git a/doc/development/shell_commands.md b/doc/development/shell_commands.md
index 23c8365c340..1e51ad73e32 100644
--- a/doc/development/shell_commands.md
+++ b/doc/development/shell_commands.md
@@ -1,5 +1,8 @@
 # Guidelines for shell commands in the GitLab codebase
+This document contains guidelines for working with processes and files in the GitLab codebase.
+These guidelines are meant to make your code more reliable _and_ secure.
+
 ## References
 - [Google Ruby Security Reviewer's Guide](https://code.google.com/p/ruby-security/wiki/Guide)
@@ -109,3 +112,63 @@ logs = IO.popen(%W(git log), chdir: repo_dir).read
 ```
 Note that unlike `Gitlab::Popen.popen`, `IO.popen` does not capture standard error.
+
+## Avoid user input at the start of path strings
+
+Various methods for opening and reading files in Ruby can be used to read the
+standard output of a process instead of a file. The following two commands do
+roughly the same:
+
+```
+`touch /tmp/pawned-by-backticks`
+File.read('|touch /tmp/pawned-by-file-read')
+```
+
+The key is to open a 'file' whose name starts with a `|`.
+Affected methods include Kernel#open, File::read, File::open, IO::open and IO::read.
+
+You can protect against this behavior of 'open' and 'read' by ensuring that an
+attacker cannot control the start of the filename string you are opening. For
+instance, the following is sufficient to protect against accidentally starting
+a shell command with `|`:
+
+```
+# we assume repo_path is not controlled by the attacker (user)
+path = File.join(repo_path, user_input)
+# path cannot start with '|' now.
+File.read(path)
+```
+
+## Guard against path traversal
+
+Path traversal is a security where the program (GitLab) tries to restrict user
+access to a certain directory on disk, but the user manages to open a file
+outside that directory by taking advantage of the `../` path notation.
+
+```
+# Suppose the user gave us a path and they are trying to trick us
+user_input = '../other-repo.git/other-file'
+
+# We look up the repo path somewhere
+repo_path = 'repositories/user-repo.git'
+
+# The intention of the code below is to open a file under repo_path, but
+# because the user used '..' she can 'break out' into
+# 'repositories/other-repo.git'
+full_path = File.join(repo_path, user_input)
+File.open(full_path) do # Oops!
+```
+
+A good way to protect against this is to compare the full path with its
+'absolute path' according to Ruby's `File.absolute_path`.
+
+```
+full_path = File.join(repo_path, user_input)
+if full_path != File.absolute_path(full_path)
+  raise "Invalid path: #{full_path.inspect}"
+end
+
+File.open(full_path) do # Etc.
+```
+
+A check like this could have avoided CVE-2013-4583.
-- cgit v1.2.1

From a63187f28b18e2feea16681b313166a982254e4e Mon Sep 17 00:00:00 2001
From: Jacob Vosmaer
Date: Thu, 22 Jan 2015 15:53:16 +0100
Subject: Don't create zombies with IO.popen

The previous recommend incantation would leave the process we read from hanging around, even though it had finished. That gives you a 'defunct'/'zombie' process.

---
 doc/development/shell_commands.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'doc/development')

diff --git a/doc/development/shell_commands.md b/doc/development/shell_commands.md
index 1e51ad73e32..42f17e19536 100644
--- a/doc/development/shell_commands.md
+++ b/doc/development/shell_commands.md
@@ -108,7 +108,7 @@ In other repositories, such as gitlab-shell you can also use `IO.popen`.
 ```ruby
 # Safe IO.popen example
-logs = IO.popen(%W(git log), chdir: repo_dir).read
+logs = IO.popen(%W(git log), chdir: repo_dir) { |p| p.read }
 ```
 Note that unlike `Gitlab::Popen.popen`, `IO.popen` does not capture standard error.
-- cgit v1.2.1

From ad6c372eeee5d112ad199dd4e487df584976445d Mon Sep 17 00:00:00 2001
From: Ewan Edwards
Date: Tue, 3 Feb 2015 15:18:40 -0800
Subject: Fix a number of discovered typos, capitalization of developer and product names, plus a couple of instances of bad Markdown markup.

---
 doc/development/architecture.md | 10 +++++-----
 doc/development/ci_setup.md | 2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

(limited to 'doc/development')

diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index 209182e7742..714cc016004 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -16,8 +16,8 @@ You can imagine GitLab as a physical office.
 They can be stored in a warehouse.
 This can be either a hard disk, or something more complex, such as a NFS filesystem;
-**NginX** acts like the front-desk.
-Users come to NginX and request actions to be done by workers in the office;
+**Nginx** acts like the front-desk.
+Users come to Nginx and request actions to be done by workers in the office;
 **The database** is a series of metal file cabinets with information on:
  - The goods in the warehouse (metadata, issues, merge requests etc);
@@ -70,7 +70,7 @@ To summarize here's the [directory structure of the `git` user home directory](.
 ps aux | grep '^git'
-GitLab has several components to operate. As a system user (i.e. any user that is not the `git` user) it requires a persistent database (MySQL/PostreSQL) and redis database. It also uses Apache httpd or nginx to proxypass Unicorn. As the `git` user it starts Sidekiq and Unicorn (a simple ruby HTTP server running on port `8080` by default). Under the GitLab user there are normally 4 processes: `unicorn_rails master` (1 process), `unicorn_rails worker` (2 processes), `sidekiq` (1 process).
+GitLab has several components to operate. As a system user (i.e. any user that is not the `git` user) it requires a persistent database (MySQL/PostreSQL) and redis database. It also uses Apache httpd or Nginx to proxypass Unicorn. As the `git` user it starts Sidekiq and Unicorn (a simple ruby HTTP server running on port `8080` by default). Under the GitLab user there are normally 4 processes: `unicorn_rails master` (1 process), `unicorn_rails worker` (2 processes), `sidekiq` (1 process).
 ### Repository access
@@ -146,13 +146,13 @@ nginx
 Apache httpd
-- [Explanation of apache logs](http://httpd.apache.org/docs/2.2/logs.html).
+- [Explanation of Apache logs](http://httpd.apache.org/docs/2.2/logs.html).
 - `/var/log/apache2/` contains error and output logs (on Ubuntu).
 - `/var/log/httpd/` contains error and output logs (on RHEL).
 redis
-- `/var/log/redis/redis.log` there are also logrotated logs there.
+- `/var/log/redis/redis.log` there are also log-rotated logs there.
 PostgreSQL
diff --git a/doc/development/ci_setup.md b/doc/development/ci_setup.md
index ee16aedafe7..f417667754e 100644
--- a/doc/development/ci_setup.md
+++ b/doc/development/ci_setup.md
@@ -26,7 +26,7 @@ We use [these build scripts](https://gitlab.com/gitlab-org/gitlab-ci/blob/master
 # Build configuration on [Semaphore](https://semaphoreapp.com/gitlabhq/gitlabhq/) for testing the [GitHub.com repo](https://github.com/gitlabhq/gitlabhq)
 - Language: Ruby
-- Ruby verion: 2.1.2
+- Ruby version: 2.1.2
 - database.yml: pg
 Build commands
-- cgit v1.2.1

From 93e42f690bc057ca0e803074aaeb1b55ea9c2232 Mon Sep 17 00:00:00 2001
From: Jacob Vosmaer
Date: Thu, 19 Feb 2015 11:20:58 +0100
Subject: Document fun facts about omnibus-gitlab

---
 doc/development/omnibus.md | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 doc/development/omnibus.md

(limited to 'doc/development')

diff --git a/doc/development/omnibus.md b/doc/development/omnibus.md
new file mode 100644
index 00000000000..0ba354d28a2
--- /dev/null
+++ b/doc/development/omnibus.md
@@ -0,0 +1,32 @@
+# What you should know about omnibus packages
+
+Most users install GitLab using our omnibus packages. As a developer it can be
+good to know how the omnibus packages differ from what you have on your laptop
+when you are coding.
+
+## Files are owned by root by default
+
+All the files in the Rails tree (`app/`, `config/` etc.) are owned by 'root' in
+omnibus installations. This makes the installation simpler and it provides
+extra security. The omnibus reconfigure script contains commands that give
+write access to the 'git' user only where needed.
+
+For example, the 'git' user is allowed to write in the `log/` directory, in
+`public/uploads`, and they are allowed to rewrite the `db/schema.rb` file.
+
+In other cases, the reconfigure script tricks GitLab into not trying to write a
+file. For instance, GitLab will generate a `.secret` file if it cannot find one
+and write it to the Rails root. In the omnibus packages, reconfigure writes the
+`.secret` file first, so that GitLab never tries to write it.
+
+## Code, data and logs are in separate directories
+
+The omnibus design separates code (read-only, under `/opt/gitlab`) from data
+(read/write, under `/var/opt/gitlab`) and logs (read/write, under
+`/var/log/gitlab`). To make this happen the reconfigure script sets custom
+paths where it can in GitLab config files, and where there are no path
+settings, it uses symlinks.
+
+For example, `config/gitlab.yml` is treated as data so that file is a symlink.
+The same goes for `public/uploads`. The `log/` directory is replaced by omnibus
+with a symlink to `/var/log/gitlab/gitlab-rails`.
-- cgit v1.2.1
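
Reviewer note on the "Guard against path traversal" section added to `doc/development/shell_commands.md` earlier in this series: with a relative `repo_path` such as the `'repositories/user-repo.git'` used in that example, `File.absolute_path(full_path)` is resolved against the current working directory, so the `full_path != File.absolute_path(full_path)` comparison would also differ for harmless input. The sketch below is one possible variation, not the approach taken in the patch: it expands both the base directory and the joined path first, then checks that the result still sits under the base. The helper name `safe_join` is hypothetical and does not exist in GitLab, and the check is purely lexical (it does not resolve symlinks).

```ruby
# Hypothetical helper, not part of GitLab: resolve user_input under base_dir
# and refuse any result that lands outside base_dir.
def safe_join(base_dir, user_input)
  # File.expand_path makes the path absolute and collapses '.' and '..'.
  base = File.expand_path(base_dir)
  full = File.expand_path(File.join(base, user_input))

  # Accept the base itself or anything below "base/"; the separator avoids
  # treating "user-repo.git-evil" as being inside "user-repo.git".
  inside = full == base || full.start_with?(base + File::SEPARATOR)
  raise ArgumentError, "Invalid path: #{user_input.inspect}" unless inside

  full
end

repo_path = 'repositories/user-repo.git'

# A harmless relative path is accepted and returned as an absolute path.
puts safe_join(repo_path, 'README.md')

# The traversal attempt from the patch's example is rejected.
begin
  safe_join(repo_path, '../other-repo.git/other-file')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Expanding the base before comparing keeps the check independent of whether the caller passes a relative or an absolute repository path; whether that matters in practice depends on how the repository path is looked up.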