author | Jacob Vosmaer <contact@jacobvosmaer.nl> | 2015-09-25 18:31:54 +0200 |
---|---|---|
committer | Jacob Vosmaer <contact@jacobvosmaer.nl> | 2015-09-25 18:32:02 +0200 |
commit | 5bcd0efe3e0b1fef06147d87f843adac717d7c42 (patch) | |
tree | 17397dba894df43d599cfb6a5d073446d3fd4090 | |
parent | 7a8a892efdf59925a95cdf6504f7c74c31b87eeb (diff) | |
Add parallel-rsync-repos script and start docs
-rw-r--r-- | bin/parallel-rsync-repos | 26 |
-rw-r--r-- | doc/operations/rsyncing_repositories.md | 87 |
2 files changed, 113 insertions, 0 deletions
diff --git a/bin/parallel-rsync-repos b/bin/parallel-rsync-repos
new file mode 100644
index 00000000000..b2429f743b5
--- /dev/null
+++ b/bin/parallel-rsync-repos
@@ -0,0 +1,26 @@
+#!/bin/sh
+# this script should run as the 'git' user, not root, because of mkdir
+#
+# Example invocation:
+# find /var/opt/gitlab/git-data/repositories -maxdepth 2 | \
+#   parallel-rsync-repos /var/opt/gitlab/git-data/repositories /mnt/gitlab/repositories
+
+SRC=$1
+DEST=$2
+
+if [ -z "$JOBS" ] ; then
+  JOBS=10
+fi
+
+if [ -z "$SRC" ] || [ -z "$DEST" ] ; then
+  echo "Usage: $0 SRC DEST"
+  exit 1
+fi
+
+if ! cd $SRC ; then
+  echo "cd $SRC failed"
+  exit 1
+fi
+
+sed "s|$SRC|./|" |\
+  parallel -j$JOBS --progress "mkdir -p $DEST/{} && rsync --delete -a {}/. $DEST/{}/"
diff --git a/doc/operations/rsyncing_repositories.md b/doc/operations/rsyncing_repositories.md
new file mode 100644
index 00000000000..231e09f0462
--- /dev/null
+++ b/doc/operations/rsyncing_repositories.md
@@ -0,0 +1,87 @@
+# Moving repositories managed by GitLab
+
+Sometimes you need to move all repositories managed by GitLab to
+another filesystem or another server. In this document we will look
+at some of the ways you can copy all your repositories from
+`/var/opt/gitlab/git-data/repositories` to `/mnt/gitlab/repositories`.
+
+We will look at three scenarios: the target directory is empty, the
+target directory contains an outdated copy of the repositories, and
+how to deal with thousands of repositories.
+
+**Each of the approaches we list can/will overwrite data in the
+target directory `/mnt/gitlab/repositories`. Do not mix up the
+source and the target.**
+
+## Target directory is empty: use a tar pipe
+
+If the target directory `/mnt/gitlab/repositories` is empty the
+simplest thing to do is to use a tar pipe.
+
+```
+# As the git user
+tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
+  tar -C /mnt/gitlab/repositories -xf -
+```
+
+If you want to see progress, replace `-xf` with `-xvf`.
+
+### Tar pipe to another server
+
+You can also use a tar pipe to copy data to another server. If your
+'git' user has SSH access to the newserver as 'git@newserver', you
+can pipe the data through SSH.
+
+```
+# As the git user
+tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
+  ssh git@newserver tar -C /mnt/gitlab/repositories -xf -
+```
+
+If you want to compress the data before it goes over the network
+(which will cost you CPU cycles) you can replace `ssh` with `ssh
+-C`.
+
+## The target directory contains an outdated copy of the repositories: use rsync
+
+In this scenario it is better to use rsync. This utility is either
+already installed on your system or easily installable via apt, yum
+etc.
+
+```
+# As the 'git' user
+rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
+  /mnt/gitlab/repositories
+```
+
+The `/.` in the command above is very important, without it you can
+easily get the wrong directory structure in the target directory.
+If you want to see progress, replace `-a` with `-av`.
+
+### Single rsync to another server
+
+If the 'git' user on your source system has SSH access to the target
+server you can send the repositories over the network with rsync.
+
+```
+# As the 'git' user
+rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
+  git@newserver:/mnt/gitlab/repositories
+```
+
+## Thousands of Git repositories: use one rsync per repository
+
+Every time you start an rsync job it has to inspect all files in
+the source directory, all files in the target directory, and then
+decide what files to copy or not. If the source or target directory
+has many contents this startup phase of rsync can become a burden
+for your GitLab server. In cases like this you can make rsync's
+life easier by dividing its work in smaller pieces, and sync one
+repository at a time.
+
+In addition to rsync we will use [GNU
+Parallel](http://www.gnu.org/software/parallel/). This utility is
+not included in GitLab so you need to install it yourself with apt
+or yum. Also note that the GitLab scripts we used below were added
+in GitLab 8.???.
+
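The path-rewriting step that `bin/parallel-rsync-repos` performs before handing work to GNU Parallel can be tried in isolation. The following is a minimal sketch, not the script's own code: the helper name `relativize` and the sample repository path `group/project.git` are assumptions made for illustration, and the `sed` expression is a variant of the one in the commit (it anchors the match at the start of the line and consumes the separating slash).

```shell
#!/bin/sh
# Illustrative sketch only; SRC and the sample repository path are assumed values.
SRC=/var/opt/gitlab/git-data/repositories

# relativize (hypothetical helper): rewrite absolute paths under SRC as
# ./relative paths, reading one path per line from standard input.
# Unlike the sed expression in the commit, this variant anchors the match
# and consumes the separating slash, so the result has no double slash.
relativize() {
  sed "s|^$SRC/|./|"
}

printf '%s\n' "$SRC/group/project.git" | relativize
# prints ./group/project.git
```

Relative paths of this shape are what the script substitutes for `{}` in both the `mkdir -p` and the `rsync` command, which is why it changes into the source directory before starting.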