Branching and merging at the system level in Baserock ===================================================== NOTE: This is a spec. The code does not yet match it. As I write this, Baserock consists of just under 70 upstream projects, each of which we keep in their own git repository. We need a way to manage changes to them in a sensible manner, particularly when things touch more than one repository. What we need is a way to do branch and merge the whole system, across all our git repositories, with similar ease and efficiency as what git provides for an individual project. Where possible we need to allow the use of raw git so that we do not constrain our developers unnecessarily. There are other things we will want to do across all the Baserock git repositories, but that is outside the scope of this document, and will be dealt with later. A couple of use cases: * I have a problem on a particular device, and want to make changes to analyze and fix it. I need to branch the specific version of everything that was using the system image version that the device was running. I then want to be able to make changes to selected components and build the system with those changes. Everything I do not explicitly touch should stay at the same version. * I am developing Baserock and I want to fix something, or add a new feature, or other such change. I want to take the current newest version of everything (the mainline development branch, whatever it might be named), and make changes to some components, and build and test those changes. While I'm doing that, I don't want to get random other changes by other people, except when I explicitly ask for them (e.g. "git pull" on an individual repository.), to avoid unnecessary conflicts and building changes that don't affect my changes. In both users cases, when I'm done, I want to get my changes into the relevant branches. This might happen by merging my changes directly, by generating a pull request on a git server, or by generating a patch series for each affected repository to be mailed to people who can do the merging. Overview -------- We want a clear, convenient, and efficient way of working with multiple repositories and multiple projects at the same time. To manage this, we introduce the following concepts (FIXME: naming needs attention): * **git repository** is exactly the same as usually with git, as are all other concepts related to git * **system branch** is a collection of branches in individual git repositories that together form a particular line of development of the whole system; in other words, given all the git repositories that are part of Baserock, system branch `foo` consists of branch `foo` in each git repository that has a branch with that name * **system branch directory** contains git repositories relevant to a system branch; it need not contain all the repositories, just the ones that are being worked on by the user, or that the user for other reasons have checked out * **morph workspace** is where all Morph keeps global state and shared caches, so that creating a system branch directory is a fairly cheap operation; all the system branch directories are inside the morph workspace directory As a picture: /home/liw/ -- user's home directory baserock/ -- morph workspace .morph/ -- morph shared state, cache, etc unstable/ -- system branch directory: mainline devel morphs/ -- git repository for system, stratum morphs magnetic-frobbles/ -- system branch directory: new feature morphs/ -- system branch specific changes to morphs linux/ -- ditto for linux To use the system branching and merging, you do the following (which we'll cover in more detail later): 1. Initialize the morph workspace. This creates the `.morph` directory and populates it with some initial stuff. You can have as many workspaces as you want, but you don't have to have more than one, and you only need to initialize it once. 2. Branch the system from some version of it (e.g., `master`) to work on a new feature or bug fix. This creates a system branch directory under the workspace directory. The system branch directory initially contains a clone of the `morphs` git repository, with some changes specific to this system branch. (See petrification, below.) 3. Edit one or more components (chunks) in the project. This typically requires adding more clones of git repositories inside the system branch directory. 4. Build, test, and fix, repeating as necessary. This requires using git directly (not via morph) in the git repositories inside the system branch directory. 5. Merge the changes to relevant target branches. Depending on what the change was, it may be necessary ot merge into many branches, e.g., for each stable release branch. Walkthrough ----------- Let's walk through what happens, making things more concrete. This is still fairly high level, and implementation details will come later. morph init ~/baserock This creates the `~/baserock` directory if it does not exist, and then initializes it as a "morph workspace" directory, by creating a `.morph` subdirectory. `.morph` will contain the Morph cache directory, and other shared state between the various branches. As part of the cache, any git repositories that Morph clones get put into the cache first, and cloned into the system branch directories from there (probably using hard-linking for speed), so that if there's a need for another clone of the repository, it does not need to be cloned from a server a second time. cd ~/baserock morph branch liw/foo morph branch liw/foo baserock/stable-1.0 morph branch liw/foo --branch-off-system=/home/liw/system.img Create a new system branch, and the corresponding system branch directory. The three versions shown differ in the starting point of the branch: the first one uses the `master` branch in `morphs`, the second one uses the named branch instead, and the third one gets the SHA-1s from a system image file. Also, clone the `morphs` git repository inside the system branch directory. cd ~/baserock/liw/foo/morphs edit base-system.morph devel-system.morph git commit -a Modify the specified morphologies (or the stratum morphologies they refer to) to nail down the references to chunk repositories to use SHA-1s instead of branch names or whatever. The changes need to be committed to git manually, so that the user has a chance to give a good commit message. Petrification is useful to prevent the set of changes including changes by other team members. When a chunk is edited it will be made to refer to that ref instead of the SHA-1 that it is petrified to. Petrification can be done by resolving the chunk references against the current state of the git repositories, or it can be done by getting the SHA-1s directly from a system image, or a data file. cd ~/baserock/liw/foo morph edit linux Tell Morph that you want to edit a particular component (chunk). This will clone the repository into the system branch directory, at the point in history indicated by the morphologies in the local version of `morphs`. cd ~/baserock/liw/foo morph git -- log -p master..HEAD This allows running a git command in each git repository in a system branch directory. Morph may offer short forms ("morph status") as well, for things that are needed very commonly. cd ~/baserock/baserock/mainline morph merge liw/foo This merges the changes made in the `liw/foo` branch into the `baserock/mainline` branch. The petrification changes are automatically undone, since they're not going to be wanted in the merge. cd ~/baserock morph mass-merge liw/foo baserock/stable* Do the merge from `liw/foo` to every branch matching `baserock/stable*` (as expanded by the shell). This is a wrapper around the simpler `morph merge` command to make it easier to push a change into many branches (e.g., a security fix to all stable branches). Implementation: `morph init` -------------- Usage: morph init [DIR] DIR defaults to the current working directory. If DIR is given, but does not exist, it is created. * Create `DIR/.morph`. Implementation: `morph branch` -------------- Usage: morph branch BRANCH [COMMIT] This needs to be run in the morph workspace directory (the one initialized with `morph init`). * If `./BRANCH` as a directory exists, abort. * Create `./BRANCH` directory. * Clone the `morphs` repository to `BRANCH/morphs`. * Create a new branch called `BRANCH` in morphs, based either the tip of `master` or from `COMMIT` if given. Store the SHA-1 of the branch origin in some way so we get at it later. Implementation: `morph checkout` -------------- Usage: morph checkout BRANCH This needs to be run in the morph workspace directory. It works like `morph branch`, except it does not create the new branch and requires it to exist instead. * If `./BRANCH` as a directory exists, abort. * Create `./BRANCH` directory. * Clone the `morphs` repository to `BRANCH/morphs`. * Run `git checkout BRANCH` in the `morphs` repository. Implementation: `morph edit` -------------- Usage: morph edit REPO MORPH... where `REPO` is a chunk repository (absolute URL or one relative to one of the `git-base-url` values). The command must be run in the `morphs` directory of the system branch. * `git clone REPOURL` where the URL is constructed with `git-base-url` if necessary. * `git branch BRANCH REF` where `BRANCH` is the branch name given to `morph branch` and `REF` is the reference to the chunk we want to edit, as specified in morphologies. * Modify the affected morphologies to refer to the repository using the `BRANCH` name, and commit those changes. If the specified morphology is not a stratum morphology (that is, it is a system one), we check all the stratum morphologies mentioned and find the one that refers to the specified repository. Multiple morphologies can be specified. They must have the same original reference to the repository. However, they will all be modified. Implementation: `morph git` -------------- Usage: morph git -- log -p master..HEAD This is to be run in the morph workspace. It runs git with the arguments on the command line in each local git repository in the workspace. (The `--` is only necessary if the git arguments are to contain options.) Implementation: `morph merge` -------------- Usage: morph merge BRANCH This needs to be run inside a system branch directory's `morphs` repository, and `BRANCH` must be another system branch checked out in the morph workspace. * In each git repository modified by the `BRANCH` system branch, run `git merge --no-commit BRANCH`, then undo any changes to stratum morphologies made by `morph edit`, and finally commit the changes. Implementation: `morph mass-merge` -------------- Usage: morph mass-merge BRANCH [TARGET]... To be run in the morph workspace directory. This just runs `morph merge BRANCH` in each `TARGET` system branch. Implementation: `morph cherry-pick` -------------- Usage: morph cherry-pick BRANCH [COMMIT]... morph cherry-pick BRANCH --since-branch-point To be run in the system branch directory. In the first form: * For each git repository modified by the `BRANCH` system branch, run `git cherry-pick COMMIT` for each `COMMIT`. In the second form: * For each git repository modified by the `BRANCH` system branch, run `git cherry-pick` giving it a list of all commits made after the system branch was created.