author: Nick Thomas <nick@gitlab.com> 2018-11-19 15:03:58 +0000
committer: Nick Thomas <nick@gitlab.com> 2018-12-06 18:58:00 +0000
commit: 9395d198f9b9ec59858d2f316e58cda22ab80050
tree: 0b494120c8d7d59316d590fada95adcbf0ac23f2 /doc/user/project/repository
parent: 79b44c16ccf3827eba6b168aae6c395ac3f3df17
Use BFG object maps to clean projects
Diffstat (limited to 'doc/user/project/repository')
-rw-r--r-- doc/user/project/repository/img/repository_cleanup.png | bin 0 -> 20833 bytes
-rw-r--r-- doc/user/project/repository/reducing_the_repo_size_using_git.md | 109
2 files changed, 83 insertions, 26 deletions
diff --git a/doc/user/project/repository/img/repository_cleanup.png b/doc/user/project/repository/img/repository_cleanup.png
new file mode 100644
index 00000000000..2749392ffa4
--- /dev/null
+++ b/doc/user/project/repository/img/repository_cleanup.png
Binary files differ
diff --git a/doc/user/project/repository/reducing_the_repo_size_using_git.md b/doc/user/project/repository/reducing_the_repo_size_using_git.md
index d534c8cbe4b..672567a8d7d 100644
--- a/doc/user/project/repository/reducing_the_repo_size_using_git.md
+++ b/doc/user/project/repository/reducing_the_repo_size_using_git.md
@@ -1,43 +1,105 @@
# Reducing the repository size using Git
A GitLab Enterprise Edition administrator can set a [repository size limit][admin-repo-size]
-which will prevent you to exceed it.
+which will prevent you from exceeding it.
When a project has reached its size limit, you will not be able to push to it,
create a new merge request, or merge existing ones. You will still be able to
create new issues, and clone the project though. Uploading LFS objects will
also be denied.
-In order to lift these restrictions, the administrator of the GitLab instance
-needs to increase the limit on the particular project that exceeded it or you
-need to instruct Git to rewrite changes.
-
If you exceed the repository size limit, your first thought might be to remove
-some data, make a new commit and push back to the repository. Unfortunately,
-it's not so easy and that workflow won't work. Deleting files in a commit doesn't
-actually reduce the size of the repo since the earlier commits and blobs are
-still around. What you need to do is rewrite history with Git's
-[`filter-branch` option][gitscm].
+some data, make a new commit and push back to the repository. Perhaps you can
+move some blobs to LFS, or remove some old dependency updates from history.
+Unfortunately, it's not so easy and that workflow won't work. Deleting files in
+a commit doesn't actually reduce the size of the repo since the earlier commits
+and blobs are still around. What you need to do is rewrite history with Git's
+[`filter-branch` option][gitscm], or a tool like the [BFG Repo-Cleaner][bfg].
Note that even with that method, until `git gc` runs on the GitLab side, the
-"removed" commits and blobs will still be around. And if a commit was ever
-included in an MR, or if a build was run for a commit, or if a user commented
-on it, it will be kept around too. So, in these cases the size will not decrease.
-
-The only fool proof way to actually decrease the repository size is to prune all
-the unneeded stuff locally, and then create a new project on GitLab and start
-using that instead.
+"removed" commits and blobs will still be around. You also need to be able to
+push the rewritten history to GitLab, which may be impossible if you've already
+exceeded the maximum size limit.
-With that being said, you can try reducing your repository size with the
-following method.
-
-## Using `git filter-branch` to purge files
+In order to lift these restrictions, the administrator of the GitLab instance
+needs to increase the limit on the particular project that exceeded it, so it's
+always better to spot that you're approaching the limit and act proactively to
+stay underneath it. If you hit the limit, and your admin can't - or won't -
+temporarily increase it for you, your only option is to prune all the unneeded
+stuff locally, and then create a new project on GitLab and start using that
+instead.
+
+If you can continue to use the original project, we recommend [using the
+BFG Repo-Cleaner](#using-the-bfg-repo-cleaner). It's faster and simpler than
+`git filter-branch`, and GitLab can use its account of what has changed to clean
+up its own internal state, maximizing the space saved.
> **Warning:**
> Make sure to first make a copy of your repository since rewriting history will
> purge the files and information you are about to delete. Also make sure to
> inform any collaborators to not use `pull` after your changes, but use `rebase`.
+> **Warning:**
+> This process is not suitable for removing sensitive data like passwords or keys
+> from your repository. Information about commits, including file content, is
+> cached in the database, and will remain visible even after it has been
+> removed from the repository.
+
+## Using the BFG Repo-Cleaner
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab-ce/issues/19376) in GitLab 11.6.
+
+1. [Install BFG](https://rtyley.github.io/bfg-repo-cleaner/).
+
+1. Navigate to your repository:
+
+ ```
+ cd my_repository/
+ ```
+
+1. Change to the branch you want to remove the big file from:
+
+ ```
+ git checkout master
+ ```
+
+1. Create a commit removing the large file from the branch, if it still exists:
+
+ ```
+ git rm path/to/big_file.mpg
+ git commit -m 'Remove unneeded large file'
+ ```
+
+1. Rewrite history:
+
+ ```
+ bfg --delete-files path/to/big_file.mpg
+ ```
+
+ An object map file will be written to `object-id-map.old-new.txt`. Keep it
+ around - you'll need it for the final step!
+
+1. Force-push the changes to GitLab:
+
+ ```
+ git push --force-with-lease origin master
+ ```
+
+ If this step fails, someone has changed the `master` branch while you were
+ rewriting history. You could restore the branch and re-run BFG to preserve
+ their changes, or use `git push --force` to overwrite their changes.
+
+1. Navigate to **Project > Settings > Repository > Repository Cleanup**:
+
+ ![Repository settings cleanup form](img/repository_cleanup.png)
+
+ Upload the `object-id-map.old-new.txt` file and press **Start cleanup**.
+ This will remove any internal Git references to the old commits, and run
+ `git gc` against the repository. You will receive an email once it has
+ completed.
+
+## Using `git filter-branch`
+
1. Navigate to your repository:
```
@@ -70,11 +132,6 @@ following method.
Your repository should now be below the size limit.
-> **Note:**
-> As an alternative to `filter-branch`, you can use the `bfg` tool with a
-> command like: `bfg --delete-files path/to/big_file.mpg`. Read the
-> [BFG Repo-Cleaner][bfg] documentation for more information.
-
[admin-repo-size]: https://docs.gitlab.com/ee/user/admin_area/settings/account_and_limit_settings.html#repository-size-limit
[bfg]: https://rtyley.github.io/bfg-repo-cleaner/
[gitscm]: https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#The-Nuclear-Option:-filter-branch
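
The BFG steps added in this change depend on uploading `object-id-map.old-new.txt` in the final step. As a minimal sketch, the map could be sanity-checked before upload. `check_object_map` is a hypothetical helper, not part of BFG or GitLab, and it assumes each line of the map has the form `<old-sha1> <new-sha1>` with 40 hexadecimal characters per ID.

```shell
# Hypothetical helper: check_object_map is not part of BFG or GitLab.
# Assumption: every line of the map is "<old-sha1> <new-sha1>",
# with each ID written as 40 lowercase hex characters.
check_object_map() {
  map_file="$1"

  # Fail early if the file is missing or empty.
  if [ ! -s "$map_file" ]; then
    echo "missing or empty: $map_file" >&2
    return 1
  fi

  # Count lines that do NOT match the assumed format.
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  bad=$(grep -Evc '^[0-9a-f]{40} [0-9a-f]{40}$' "$map_file" || true)
  if [ "$bad" -ne 0 ]; then
    echo "$bad malformed line(s) in $map_file" >&2
    return 1
  fi

  # Report how many rewritten objects the map lists.
  total=$(grep -c '' "$map_file" || true)
  echo "$total rewritten object(s) listed"
}
```

Running `check_object_map object-id-map.old-new.txt` after the BFG step prints the number of entries, or returns a non-zero status if the file does not match the assumed format.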