README for lorry-controller
===========================

Overview
--------

Lorry Controller, or LC for short, manages the importing of source
code from external sources into git repositories on a Trove. A Trove
is a component in the Baserock system for hosting source code, and LC
runs inside a Trove.

LC uses the Lorry tool to do the actual import. Lorry can read code
from several different version control systems, and convert them to
git. External repositories can be specified individually, as Lorry
`.lorry` specification files. In addition, LC can be told to mirror
all the git repositories on another Trove.

LC runs Lorry for the right external repositories, and takes care of
running a suitable number of Lorry instances concurrently, and
recovering from any problems. LC has a web based administration
interface, and an HTTP API for reporting and controlling its state.

This README file documents the LC configuration file and general use.
For the architecture of LC and the HTTP API, see the `ARCH` file.

Lorry Controller configuration: overview
----------------------------------------

Lorry Controller has two levels of configuration. The first level is
command line options and configuration files. This level specifies
things such as log levels and network addresses to listen on. Most
importantly, it specifies the location of the second level. Run
`lorry-controller-webapp --help` for a list of these options.

The second level is a git repository that specifies which external
repositories and other Troves to import into the Trove LC runs on.
This git repository is referred to as CONFGIT in documentation, and is
specified with the `--confgit-url` command line option, or the
`confgit-url` key in the configuration file. The configuration file
could contain this, for example:

    [config]
    confgit-url = ssh://git@localhost/baserock/local-config/lorries

The system integration of a Trove automatically includes a
configuration file that contains a configuration such as the above.
The URL contains the name of the Trove, so it needs to be customised
for each Trove, but as long as you're only using LC as part of a
Baserock Trove, it's all taken care of for you automatically.
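
To illustrate the shape of the configuration file shown above, here is
a minimal Python sketch that reads the `confgit-url` key with the
standard `configparser` module. It is not how LC itself parses its
configuration; the function name `read_confgit_url` is made up for
this example.

```python
import configparser

# The example configuration from this README.
EXAMPLE_CONFIG = """\
[config]
confgit-url = ssh://git@localhost/baserock/local-config/lorries
"""


def read_confgit_url(text):
    """Return the confgit-url value from INI-style configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser["config"]["confgit-url"]


if __name__ == "__main__":
    print(read_confgit_url(EXAMPLE_CONFIG))
```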


The CONFGIT repository
----------------------

The CONFGIT repository must contain at least the file
`lorry-controller.conf`. It may also contain other files, including
`.lorry` files for Lorry, but all other files are ignored unless
referenced by `lorry-controller.conf`.



The `lorry-controller.conf` file
--------------------------------

`lorry-controller.conf` is a JSON file containing a list of maps. Each
map specifies another Trove, a GitLab instance, or one set of `.lorry`
files. Here's an example that tells LC to mirror the `git.baserock.org`
Trove and anything in the `open-source-lorries/*.lorry` files (if any
exist).

    [
        {
            "ignore": [
                "baserock/lorries"
            ], 
            "interval": "2H", 
            "ls-interval": "4H", 
            "prefixmap": {
                "baserock": "baserock", 
                "delta": "delta"
            }, 
            "protocol": "http", 
            "host": "git.baserock.org",
            "type": "trove"
        },
        {
            "type": "lorries",
            "interval": "6H",
            "prefix": "delta",
            "globs": [
                "open-source-lorries/*.lorry"
            ]
        }
    ]

A Trove specification (map) uses the following mandatory keys:

* `type: trove` -- specify it's a Trove specification.

* `host` -- the other Trove to mirror; a domain name or IP address.

* `protocol` -- specify how Lorry Controller (and Lorry) should talk
  to other Troves. Allowed values are `ssh`, `https`, `http`.

* `prefixmap` -- map repository path prefixes from the other Trove to
  the local Trove. If the remote prefix is `foo`, and the local prefix
  is `bar`, then remote repository `foo/baserock/yeehaa` gets mirrored
  to local repository `bar/baserock/yeehaa`. If the remote Trove has a
  repository that does not match a prefix, that repository gets
  ignored.

* `ls-interval` -- determine how often Lorry Controller should query
  the other Trove for a list of repositories it may mirror. See below
  for how the value is interpreted. The default is 24 hours.

* `interval` -- specify how often Lorry Controller should mirror the
  repositories in the spec. See below for INTERVAL. The default
  interval is 24 hours.

Additionally, the following optional keys are allowed in Trove
specifications:

* `ignore` -- a list of git repositories from the other Trove that
  should NOT be mirrored. Each list element is a glob pattern which
  is matched against the path to the git repository (not including leading
  slash).

* `auth` -- specify how to authenticate to the remote Trove over https
  (only). It should be a dictionary with the fields `username` and
  `password`.
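
The `prefixmap` and `ignore` rules above can be sketched in Python as
follows. This is only an illustration, not LC's actual code: the
function name `map_repo_path` is invented here, and the sketch assumes
that a prefix only matches when it is followed by a `/`, and that
`ignore` globs are checked against the remote path before any mapping.

```python
import fnmatch


def map_repo_path(path, prefixmap, ignore=()):
    """Return the local path for a remote repository, or None if skipped.

    path is the remote repository path without a leading slash.
    """
    # Repositories matching any ignore glob are never mirrored.
    if any(fnmatch.fnmatchcase(path, pattern) for pattern in ignore):
        return None
    # Replace the first matching remote prefix with its local prefix.
    for remote_prefix, local_prefix in prefixmap.items():
        if path.startswith(remote_prefix + "/"):
            return local_prefix + path[len(remote_prefix):]
    # No prefix matched, so the repository is ignored.
    return None
```

With the prefix map `{"foo": "bar"}`, calling
`map_repo_path("foo/baserock/yeehaa", {"foo": "bar"})` gives
`bar/baserock/yeehaa`, matching the example in the `prefixmap`
description.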

A GitLab specification (map) uses the same keys as a Trove
specification, except that its `type` is `gitlab` and it has one
additional mandatory key:

* `type: gitlab` -- specify it's a GitLab specification.

* `private-token` -- the GitLab private token for a user with at least
  master permissions on every group under which you may wish to create
  repositories.

A Lorry specification (map) uses the following keys, all of them
mandatory:

* `type: lorries` -- specify it's a Lorry specification.

* `interval` -- identical in meaning to the `interval` in a
  Trove specification.

* `prefix` -- a path prefix to be prepended to all repositories
  created from the `.lorry` files from this spec.

* `globs` -- a list of globs (as strings) for `.lorry` files to use.
  The glob is matched in the directory containing the configuration
  file in which this spec is. It is OK for the globs to not match
  anything.
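
As a sketch of how `globs` could be expanded relative to the directory
holding the configuration file, consider the following Python. The
function name `find_lorry_files` is hypothetical and the use of the
standard `glob` module is an assumption; LC's own lookup may differ in
detail.

```python
import glob
import os
import tempfile


def find_lorry_files(conf_path, globs):
    """Expand globs relative to the directory holding conf_path."""
    conf_dir = os.path.dirname(os.path.abspath(conf_path))
    found = []
    for pattern in globs:
        # A pattern that matches nothing simply contributes no files.
        found.extend(sorted(glob.glob(os.path.join(conf_dir, pattern))))
    return found


if __name__ == "__main__":
    # Build a throwaway CONFGIT-like directory to demonstrate the lookup.
    with tempfile.TemporaryDirectory() as confgit:
        lorry_dir = os.path.join(confgit, "open-source-lorries")
        os.mkdir(lorry_dir)
        for name in ("zlib.lorry", "README"):
            open(os.path.join(lorry_dir, name), "w").close()
        conf = os.path.join(confgit, "lorry-controller.conf")
        for path in find_lorry_files(conf, ["open-source-lorries/*.lorry"]):
            print(os.path.relpath(path, confgit))
```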

For backwards compatibility with another implementation of Lorry
Controller, other fields in any type of specification are allowed
and silently ignored.
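
The key lists above can be turned into a rough sanity check for a
`lorry-controller.conf` before it is committed to CONFGIT. The Python
below is a sketch under assumptions, not LC's own validation: the name
`check_conf` and the message wording are invented, and it treats every
key this README calls mandatory as required, even though `interval`
and `ls-interval` are documented with defaults, so LC itself may be
more lenient.

```python
import json

TROVE_KEYS = {"type", "host", "protocol", "prefixmap",
              "ls-interval", "interval"}
MANDATORY_KEYS = {
    "trove": TROVE_KEYS,
    "gitlab": TROVE_KEYS | {"private-token"},
    "lorries": {"type", "interval", "prefix", "globs"},
}
ALLOWED_PROTOCOLS = {"ssh", "https", "http"}


def check_conf(text):
    """Return a list of problems found in lorry-controller.conf text."""
    specs = json.loads(text)
    if not isinstance(specs, list):
        return ["top level must be a list of maps"]
    problems = []
    for index, spec in enumerate(specs):
        if not isinstance(spec, dict):
            problems.append("spec {}: not a map".format(index))
            continue
        spec_type = spec.get("type")
        if not isinstance(spec_type, str) or spec_type not in MANDATORY_KEYS:
            problems.append(
                "spec {}: unknown type {!r}".format(index, spec_type))
            continue
        # Unknown extra keys are deliberately not reported.
        for key in sorted(MANDATORY_KEYS[spec_type].difference(spec)):
            problems.append("spec {}: missing key {!r}".format(index, key))
        protocol = spec.get("protocol")
        if "protocol" in spec and protocol not in ALLOWED_PROTOCOLS:
            problems.append(
                "spec {}: protocol {!r} is not allowed".format(index, protocol))
    return problems
```

An empty list means the sketch found nothing to complain about; it
does not check the contents of `.lorry` files or the INTERVAL syntax.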

An INTERVAL value (for `interval` or `ls-interval`) is a number and a
unit to indicate a time interval. Allowed units are minutes (`m`),
hours (`h`), and days (`d`), expressed as single-letter codes in upper
or lower case.
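
As an illustration of the INTERVAL format, this Python sketch converts
a value such as `2H` into seconds. The function name `parse_interval`
is made up, and the sketch assumes the number is a non-negative
integer; how LC treats whitespace, fractions, or a missing unit is not
specified here.

```python
import re

# Seconds per unit, keyed by the lower-case single-letter code.
_UNIT_SECONDS = {"m": 60, "h": 60 * 60, "d": 24 * 60 * 60}


def parse_interval(text):
    """Convert an INTERVAL string such as '2H' or '30m' to seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("not a valid interval: {!r}".format(text))
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit.lower()]
```

For example, `parse_interval("2H")` returns 7200 and
`parse_interval("1d")` returns 86400.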

The syntax of `.lorry` files is specified by the Lorry program; see
its documentation for details.


HTTP proxy configuration: `proxy.conf`
--------------------------------------

Lorry Controller will look for a file called `proxy.conf` in the same
directory as the `lorry-controller.conf` configuration file.
It is in JSON format, with the following key/value pairs:

* `hostname` -- the hostname of the HTTP proxy
* `username` -- username for authenticating to the proxy
* `password` -- a **cleartext** password for authenticating to the
  proxy
* `port` -- port number for connecting to the proxy

Lorry Controller will use this information for both HTTP and HTTPS
proxying.

Do note that the **password is stored in cleartext** and that access
to the configuration file (and the git repository where it is stored)
must be controlled appropriately.
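
To make the four keys concrete, the Python sketch below parses a
`proxy.conf` and builds a proxy URL from it. Everything specific here
is an assumption for illustration: the host name, credentials, and
port are placeholders, the function name `proxy_url` is invented, and
the README does not say whether LC builds a URL this way or whether
`port` is stored as a number or a string.

```python
import json
from urllib.parse import quote

# Placeholder values only; a real proxy.conf holds your own settings.
EXAMPLE_PROXY_CONF = """
{
    "hostname": "proxy.example.com",
    "username": "lc",
    "password": "s3cret",
    "port": 3128
}
"""


def proxy_url(conf_text):
    """Build an http://user:password@host:port URL from proxy.conf text."""
    conf = json.loads(conf_text)
    # Percent-encode credentials so characters like '@' or ':' are safe.
    username = quote(conf["username"], safe="")
    password = quote(conf["password"], safe="")
    return "http://{}:{}@{}:{}".format(
        username, password, conf["hostname"], conf["port"])
```

A URL of this form is what tools commonly accept in the `http_proxy`
and `https_proxy` environment variables.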

WEBAPP 'Admin' Interface
------------------------

A web interface for managing Lorry Controller is accessible from
http://trove/1.0/status-html. A more detailed 'admin' interface runs locally
on port 12765.

For the moment you can access this interface using an ssh tunnel if you have
root access to the Trove. For example:

    ssh -L 12765:localhost:12765 root@trove

This binds port 12765 on your local machine to port 12765 on the Trove. While
the tunnel is running, you can access the Trove 'admin' interface at
http://localhost:12765/1.0/status-html