README for lorry-controller
===========================

Overview
--------

Lorry Controller, or LC for short, manages the importing of source
code from external sources into git repositories on a Trove, GitLab,
or Gerrit server (Downstream Host). LC uses the Lorry tool to do the
actual import. Lorry can read code from several different version
control systems and convert it to git.

External repositories can be specified individually, as Lorry `.lorry`
specification files. In addition, LC can be told to mirror all the git
repositories on a Trove or GitLab server (Upstream Host).

LC runs Lorry for the right external repositories, and takes care of
running a suitable number of Lorry instances concurrently, and
recovering from any problems. LC has a web-based administration
interface, and an HTTP API for reporting and controlling its state.

This README file documents the LC configuration file and general
use. For the architecture of LC and the HTTP API, see the `ARCH.md`
file.

Installation
------------

See the `INSTALL.md` file.

Lorry Controller configuration: overview
----------------------------------------

Lorry Controller has two levels of configuration. The first level is
command line options and configuration files. This level specifies
things such as log levels and network addresses to listen on. Most
importantly, this level specifies the location of the second level.
For a list of these options, run `lorry-controller-webapp --help`.

The second level is a git repository that specifies which external
repositories and Upstream Hosts to import into the Downstream Host.
This git repository is referred to as CONFGIT in documentation, and is
specified with the `--confgit-url` command line option, or the
`confgit-url` key in the configuration file.
The configuration file could contain this, for example:

    [config]
    confgit-url = ssh://git@localhost/baserock/local-config/lorries

The system integration of a Trove automatically includes a
configuration file that contains a configuration such as the above.
The URL contains the name of the Trove, so it needs to be customised
for each Trove, but as long as you're only using LC as part of a
Baserock Trove, it's all taken care of for you automatically.

The CONFGIT repository
----------------------

The CONFGIT repository must contain at least the file
`lorry-controller.conf`. It may also contain other files, including
`.lorry` files for Lorry, but all other files are ignored unless
referenced by `lorry-controller.conf`.

The `lorry-controller.conf` file
--------------------------------

`lorry-controller.conf` is a JSON file containing a list of maps. Each
map specifies an Upstream Host or one set of `.lorry` files. Here's an
example that tells LC to mirror the `git.baserock.org` Trove and
anything in the `open-source-lorries/*.lorry` files (if any exist).

    [
        {
            "ignore": [
                "baserock/lorries"
            ],
            "interval": "2H",
            "ls-interval": "4H",
            "prefixmap": {
                "baserock": "baserock",
                "delta": "delta"
            },
            "protocol": "http",
            "host": "git.baserock.org",
            "type": "trove"
        },
        {
            "type": "lorries",
            "interval": "6H",
            "prefix": "delta",
            "globs": [
                "open-source-lorries/*.lorry"
            ]
        }
    ]

A Host specification (map) uses the following mandatory keys:

* `type` -- either `trove` or `gitlab`, depending on the type of
  Upstream Host.

* `host` -- the Upstream Host to mirror; a domain name or IP address.

* `protocol` -- specify how Lorry Controller (and Lorry) should talk
  to the Upstream Host. Allowed values are `ssh`, `https`, `http`.

* `prefixmap` -- map repository path prefixes from the Upstream Host
  to the Downstream Host. If the upstream prefix is `foo`, and the
  downstream prefix is `bar`, then upstream repository
  `foo/baserock/yeehaa` gets mirrored to downstream repository
  `bar/baserock/yeehaa`.
  If the Upstream Host has a repository that does not match a prefix,
  that repository gets ignored.

* `ls-interval` -- determine how often Lorry Controller should query
  the Upstream Host for a list of repositories it may mirror. See
  below for how the value is interpreted. The default is 24 hours.

* `interval` -- specify how often Lorry Controller should mirror the
  repositories in the spec. See below for INTERVAL. The default
  interval is 24 hours.

Additionally, the following optional keys are allowed in Host
specifications:

* `ignore` -- a list of git repositories from the Upstream Host that
  should NOT be mirrored. Each list element is a glob pattern which is
  matched against the path to the git repository (not including the
  leading slash).

* `auth` -- specify how to authenticate to the Upstream Host. This is
  used over HTTPS only. It should be a dictionary with the fields
  `username` and `password`.

A GitLab specification (map) uses an additional mandatory key:

* `private-token` -- the GitLab private token for a user who has at
  least Master permissions on every group under which you may wish to
  create repositories.

A Lorry specification (map) uses the following keys, all of them
mandatory:

* `type: lorries` -- specify it's a Lorry specification.

* `interval` -- identical in meaning to the `interval` in a Host
  specification.

* `prefix` -- a path prefix to be prepended to all repositories
  created from the `.lorry` files from this spec.

* `globs` -- a list of globs (as strings) for `.lorry` files to use.
  The globs are matched relative to the directory containing the
  configuration file that holds this spec. It is OK for the globs to
  not match anything.

For backwards compatibility with another implementation of Lorry
Controller, other fields in either type of specification are allowed
and silently ignored.

An INTERVAL value (for `interval` or `ls-interval`) is a number and a
unit to indicate a time interval.
Allowed units are minutes (`m`), hours (`h`), and days (`d`),
expressed as single-letter codes in upper or lower case. For example,
`2H` means two hours.

The syntax of `.lorry` files is specified by the Lorry program; see
its documentation for details. Lorry Controller supports an optional
`description` field in `.lorry` files that is used to set the
repository description on the Downstream Host.

HTTP proxy configuration: `proxy.conf`
--------------------------------------

Lorry Controller will look for a file called `proxy.conf` in the same
directory as the `lorry-controller.conf` configuration file. It is in
JSON format, with the following key/value pairs:

* `hostname` -- the hostname of the HTTP proxy
* `username` -- username for authenticating to the proxy
* `password` -- a **cleartext** password for authenticating to the
  proxy
* `port` -- port number for connecting to the proxy

Lorry Controller will use this information for both HTTP and HTTPS
proxying.

Do note that the **password is stored in cleartext** and that access
to the configuration file (and the git repository where it is stored)
must be controlled appropriately.

WEBAPP 'Admin' Interface
------------------------

An 'admin' interface runs locally on port 12765. For the moment you
can access this interface using an ssh tunnel, for example:

    ssh -L 12765:localhost:12765 root@lorryhost

This binds port 12765 on your localhost to port 12765 on lorryhost.
With the tunnel running, you can access the 'admin' interface at
http://localhost:12765/1.0/status-html

When used within Trove, a web interface for managing Lorry Controller
is accessible at http://trove/1.0/status-html.
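
Example: INTERVAL and `prefixmap` semantics
-------------------------------------------

The INTERVAL syntax and the `prefixmap` key described in the
`lorry-controller.conf` section can be illustrated with a short Python
sketch. This is not Lorry Controller's own code: the helper names
`parse_interval` and `map_repo_path` are hypothetical, and the sketch
assumes that an INTERVAL has no whitespace between the number and the
unit, and that a prefix only matches whole path components. The real
parser may behave differently in those edge cases.

```python
import re
from typing import Dict, Optional

# Seconds per unit. The README allows m, h and d in upper or lower case.
_UNIT_SECONDS = {"m": 60, "h": 60 * 60, "d": 24 * 60 * 60}

# Assumption: digits immediately followed by a single unit letter.
_INTERVAL_RE = re.compile(r"(\d+)([mhd])", re.IGNORECASE)


def parse_interval(text: str) -> int:
    """Convert an INTERVAL string such as "2H" into seconds (hypothetical helper)."""
    match = _INTERVAL_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid INTERVAL: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit.lower()]


def map_repo_path(prefixmap: Dict[str, str], upstream_path: str) -> Optional[str]:
    """Apply a prefixmap to an upstream repository path (hypothetical helper).

    Returns the downstream path, or None when no upstream prefix matches,
    in which case the README says the repository is ignored.
    """
    for upstream_prefix, downstream_prefix in prefixmap.items():
        # Assumption: a prefix matches only on a path component boundary.
        if upstream_path == upstream_prefix:
            return downstream_prefix
        if upstream_path.startswith(upstream_prefix + "/"):
            return downstream_prefix + upstream_path[len(upstream_prefix):]
    return None
```

For instance, `parse_interval("2H")` gives the number of seconds in
two hours, and `map_repo_path({"foo": "bar"}, "foo/baserock/yeehaa")`
gives `bar/baserock/yeehaa`, matching the `prefixmap` example earlier
in this README.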