Title: Baserock definitions format

The Baserock definitions format
===============================

This page describes the Baserock definitions format (morph files). It is intended to be useful as
an *informal* specification. It is not guaranteed to be accurate or exhaustive.

If you are just getting started with Baserock, the wiki pages [quick-start](http://wiki.baserock.org/quick-start), [devel-with](http://wiki.baserock.org/devel-with) and [guides](http://wiki.baserock.org/guides) provide a more practical introduction.

The allowed YAML constructs are described in json-schema format here: <http://git.baserock.org/cgit/baserock/baserock/spec.git/tree/schemas>.

The data model is described using OWL here: <http://git.baserock.org/cgit/baserock/baserock/spec.git/tree/schemas/baserock.owl>.

The source code of [Morph] and [YBD] might be more useful if you need a completely accurate description of how the current Baserock definition format is used in practice.

Versioning
----------

The current version of the definitions format is version 7.

Definitions repository
----------------------

The design of Baserock aims to encourage users to keep *all* the information
needed to build and deploy their software in one Git repository. This repo is
often referred to as the 'definitions.git' repo, although nothing forces you
to call it that.

Some of this information will be Baserock 'definitions files', which describe
how to build or deploy some software. Baserock tooling should expect that all
the definitions it needs to process live in one Git repo. The definitions.git
repo can and should contain any other files needed for build and deployment as
well, such as configuration data and documentation.

The Baserock Project maintains a set of 'reference system definitions' at
[git://git.baserock.org/baserock/baserock/definitions] (which can also be
referred to as [baserock:baserock/definitions], when using the repo aliasing
feature of [Morph]). That repo contains systems that can be built and
deployed as-is, but it is important that users can fork this repo as well,
and work on systems in their version using `git merge` or `git rebase` to keep
up to date with changes from upstream.

Baserock tooling should not mandate anything about the definitions repo that
the user wants to process, other than the rules defined below.


[git://git.baserock.org/baserock/baserock/definitions]: http://git.baserock.org/cgit/baserock/baserock/definitions.git
[baserock:baserock/definitions]: http://git.baserock.org/cgit/baserock/baserock/definitions.git

### Structure

Tooling can enforce that the definitions.git repo is actually a Git repo, but
it can equally just treat it as a tree of files and directories.

The top directory of the repo must contain a file named `VERSION`, that is
valid [YAML] and contains a dict with key "version" and a value that is an
integer.

The integer specifies the version of the definitions format that this repo
uses. A tool should refuse to process a version that it doesn't support, to
avoid unpredictable errors. See the "Versioning" heading above for more detail
on versions.
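
The version check could be sketched as follows, assuming the `VERSION` file has
already been parsed by a YAML library into a Python object. The names
`check_version` and `SUPPORTED_VERSIONS` are hypothetical and are not taken
from [Morph] or [YBD].

```python
# Versions this hypothetical tool understands (assumption for illustration).
SUPPORTED_VERSIONS = {7}


def check_version(data):
    """Return the definitions format version, or raise ValueError.

    `data` is the already-parsed content of the VERSION file.
    """
    if not isinstance(data, dict) or "version" not in data:
        raise ValueError("VERSION must contain a dict with a 'version' key")
    version = data["version"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(version, bool) or not isinstance(version, int):
        raise ValueError("'version' must be an integer")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported definitions version: {version}")
    return version
```

Raising an error rather than guessing keeps the failure close to its cause,
which is the behaviour the paragraph above asks for.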

To find all the Baserock definition files in the repo, tooling can recursively
scan the contents of the repo for files matching the glob pattern "\*.morph".
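
A minimal sketch of that scan, treating the repo as a plain directory tree. The
function name `find_definitions` is hypothetical, and skipping the `.git`
directory is a choice made by this sketch rather than a rule of the format.

```python
from pathlib import Path


def find_definitions(repo_root):
    """Return all '*.morph' files below repo_root, sorted for stable output."""
    root = Path(repo_root)
    # Skip anything under the .git directory (a choice made by this sketch).
    return sorted(
        path for path in root.rglob("*.morph")
        if ".git" not in path.relative_to(root).parts
    )
```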

Definitions file syntax
-----------------------

[YAML] is used for all Baserock definitions files.

The toplevel entity in a definition is a dict, in all cases. Any syntax errors
or type errors (such as the toplevel entity being a number) should be reported
to the user.

The [Morph] tool raises an error if any unknown dictionary keys are found in
the definition, mainly so that it reports any spelling errors in key names.

### Common fields

For all definitions, use the following fields:

* `name`: the name of the definition; it must currently match the filename
  (without the `.morph` suffix); **required**
* `kind`: the kind of thing being built; **required**
* `description`: a comment to describe what the definition is for; optional
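
These rules could be checked along the following lines, assuming the definition
has already been parsed into a Python object. The function name
`check_common_fields` is hypothetical, and the set of kinds is simply the four
definition types described on this page.

```python
from pathlib import Path

# The definition kinds described in this document.
KNOWN_KINDS = {"chunk", "stratum", "system", "cluster"}


def check_common_fields(path, definition):
    """Return a list of problems found in the common fields (empty if none)."""
    if not isinstance(definition, dict):
        return ["toplevel entity must be a dict"]
    problems = []
    for field in ("name", "kind"):
        if field not in definition:
            problems.append(f"missing required field '{field}'")
    # The name must match the filename without the '.morph' suffix.
    expected = Path(path).name
    if expected.endswith(".morph"):
        expected = expected[: -len(".morph")]
    name = definition.get("name")
    if name is not None and name != expected:
        problems.append(f"name '{name}' does not match filename '{expected}'")
    kind = definition.get("kind")
    if kind is not None and kind not in KNOWN_KINDS:
        problems.append(f"unknown kind '{kind}'")
    return problems
```

Returning a list instead of raising on the first problem lets a tool report
every issue in a file at once.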

### Build definitions: Chunks, Systems and Strata

Within this document, consider 'building' to be the act of running a series of
commands in a given 'environment', where the commands and how to build the
environment are completely specified by the definitions and the build tool.

#### Chunks

A 'chunk' definition describes an individual component, which can be built from
a Git repository by executing the given sequence of commands.

The structure of a 'chunk' definition is described using [JSON-Schema] in the
[spec.git] repo:
<http://git.baserock.org/cgit/baserock/baserock/spec.git/tree/schemas/chunk.json-schema>

The fields mean the following:

* `build-system`: if the program is built using a build system known to
  `morph`, you can set this field and avoid having to set the various
  `*-commands` fields; the commands that the build system specifies can
  be overridden; the following build-systems are known:

  - `autotools`
  - `python-distutils`
  - `cpan`
  - `cmake`
  - `qmake`

  optional

* `pre-configure-commands`: a list of shell commands to run at
  the configuration phase of a build, before the list in `configure-commands`;
  optional
* `configure-commands`: a list of shell commands to run at the configuration
  phase of a build; optional
* `post-configure-commands`: a list of shell commands to run at
  the configuration phase of a build, after the list in `configure-commands`;
  optional

* `pre-build-commands`: a list of shell commands to run at
  the build phase of a build, before the list in `build-commands`;
  optional
* `build-commands`: a list of shell commands to run to build (compile) the
  project; optional
* `post-build-commands`: a list of shell commands to run at
  the build phase of a build, after the list in `build-commands`;
  optional

* `pre-test-commands`: a list of shell commands to run at
  the test phase of a build, before the list in `test-commands`;
  optional
* `test-commands`: a list of shell commands to run unit tests and other
  non-interactive tests on the built but un-installed project; optional
* `post-test-commands`: a list of shell commands to run at
  the test phase of a build, after the list in `test-commands`;
  optional

* `pre-install-commands`: a list of shell commands to run at
  the install phase of a build, before the list in `install-commands`;
  optional
* `install-commands`: a list of shell commands to install the built project
  into the directory named in the `DESTDIR` environment variable; optional
* `post-install-commands`: a list of shell commands to run at
  the install phase of a build, after the list in `install-commands`;
  optional

* `pre-strip-commands`: a list of shell commands to run at
  the strip phase of a build, before the list in `strip-commands`;
  optional
* `strip-commands`: a list of shell commands to strip debug symbols from binaries;
  this should strip binaries in the directory named in the `DESTDIR` environment
  variable, not the actual system; optional
* `post-strip-commands`: a list of shell commands to run at
  the strip phase of a build, after the list in `strip-commands`;
  optional

* `max-jobs`: a string to be given to `make` as the argument to the `-j`
  option to specify the maximum number of parallel jobs; the only sensible
  value is `"1"` (including the quotes), to prevent parallel jobs from running
  at all; parallel jobs are only used during the `build-commands` phase,
  since the other phases are often not safe when run in parallel; `morph`
  picks a default value based on the number of CPUs on the host system;
  optional

* `chunks`: a key/value map of lists of regular expressions;
  the key is the name
  of a binary chunk, the regexps match the pathnames that will be
  included in that chunk; the patterns match the pathnames that get installed
  by `install-commands` (the whole path below `DESTDIR`); every file must
  be matched by at least one pattern; by default, a single chunk gets
  created, named according to the definition, and containing all files;
  optional
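
The relationship between the `pre-`, plain and `post-` command lists can be
sketched as below. This assumes the phases run in the order they are listed
above (configure, build, test, install, strip) and ignores any commands
supplied by `build-system`; the function name `ordered_commands` is
hypothetical.

```python
# Phase order as listed in this document (assumed to be the execution order).
PHASES = ("configure", "build", "test", "install", "strip")


def ordered_commands(chunk):
    """Return (field name, command) pairs in the order they would run."""
    result = []
    for phase in PHASES:
        for prefix in ("pre-", "", "post-"):
            field = f"{prefix}{phase}-commands"
            # A missing or empty field contributes no commands.
            for command in chunk.get(field) or []:
                result.append((field, command))
    return result
```

Keeping the field name next to each command makes it easy for a tool to say
which phase a failing command belonged to.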

#### Strata

A 'stratum' is a group of related chunks. A stratum can contain only chunks.
Certain information about how to build a chunk is defined in the containing
stratum, rather than in the chunk definition.

The structure of a 'stratum' definition is described using [JSON-Schema] in the
[spec.git] repo:
<http://git.baserock.org/cgit/baserock/baserock/spec.git/tree/schemas/stratum.json-schema>

The fields mean the following:

* `build-depends`: a list of strings, each of which refers to another
  stratum that the current stratum depends on. This list may be omitted
  or empty if the stratum does not depend on anything else.
* `chunks`: a list of key/value mappings, where each mapping corresponds
  to a chunk to be included in the stratum; the mappings may use the
  following keys:
    - `name` is the chunk's name (may be different from the
      morphology name),
    - `repo` is the repository in which to find the chunk's source (defaults
      to the chunk name),
    - `ref` identifies the commit to use (typically a branch name, but
       any tree-ish git accepts is ok)
    - `morph` is a path, relative to the top of the definitions repo,
      to a chunk .morph file.
    - `build-system` specifies one of the predefined build systems. You
      must specify ONE of `morph` or `build-system` for each chunk.
  In addition to these keys, each of the sources can specify a list of
  build dependencies using the `build-depends` field. To specify one or
  more chunk dependencies, `build-depends` needs to be set to a list
  that contains the names of chunks that the source depends on in the
  same stratum. These names correspond to the values of the `name`
  fields of the other chunks.

  At the moment, the ordering is significant in chunk build-depends. This
  is used during bootstrapping, when you want to override the first build of
  a component with its second version in a staging area. This feature is kind
  of a workaround for the lack of distinction between build and runtime
  dependencies.
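
One way to turn chunk `build-depends` into a build order is a topological sort.
The sketch below uses `graphlib` from the Python standard library (3.9 or
later); the function name `chunk_build_order` is hypothetical, and the sketch
does not model the ordering significance described above.

```python
from graphlib import TopologicalSorter


def chunk_build_order(stratum):
    """Return chunk names ordered so that dependencies come first."""
    names = {chunk["name"] for chunk in stratum["chunks"]}
    graph = {}
    for chunk in stratum["chunks"]:
        depends = chunk.get("build-depends") or []
        # build-depends may only name chunks in the same stratum.
        unknown = [d for d in depends if d not in names]
        if unknown:
            raise ValueError(f"{chunk['name']}: unknown build-depends {unknown}")
        graph[chunk["name"]] = depends
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())
```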

#### Systems

In the Baserock model, a 'system' is the top level entity that you actually
build and execute. Systems contain one or more strata.

The structure of a 'system' definition is described using [JSON-Schema] in the
[spec.git] repo:
<http://git.baserock.org/cgit/baserock/baserock/spec.git/tree/schemas/system.json-schema>

The fields mean the following:

* `strata`: a list of key/value mappings, similar to the 'chunks' field of a
  stratum. Two fields are allowed (are both required?):
    - `name`: name of the artifact when the stratum is built
    - `morph`: path to a stratum .morph file relative to the top of the containing repo

#### Example chunk (simplified commands):

    name: eglibc
    kind: chunk
    configure-commands:
    - mkdir o
    - cd o && ../libc/configure --prefix=/usr
    build-commands:
    - cd o && make
    install-commands:
    - cd o && make install_root="$DESTDIR" install

#### Example stratum:

    name: foundation
    kind: stratum
    chunks:
    - name: fhs-dirs
      repo: upstream:fhs-dirs
      ref: baserock/bootstrap
      build-depends: []
    - name: linux-api-headers
      repo: upstream:linux
      ref: baserock/morph
      build-depends:
      - fhs-dirs
    - name: eglibc
      repo: upstream:eglibc
      ref: baserock/bootstrap
      build-depends:
      - linux-api-headers
    - name: busybox
      repo: upstream:busybox
      ref: baserock/bootstrap
      build-depends:
      - fhs-dirs
      - linux-api-headers

#### Example system:

    name: base
    kind: system
    strata:
    - morph: foundation
    - morph: linux-stratum

### Deployment definitions: Clusters

For 'deployment', Baserock defines an API for running 'extensions'. The
'cluster' and 'system' definitions together describe what extensions should be
run, and what should be set in their environment, in order to deploy the
system. See the [Deployment](deployment) section for how to find and execute
the extensions.

Within this document, consider "deployment" to be a process of first
post-processing a filesystem tree with one or more 'configure extensions', then 
performing an operation to convert and/or transfer the filesystem tree
using a 'write extension'.

The structure of the 'cluster' definitions is described using [JSON-Schema] in
the [spec.git] repo:
<http://git.baserock.org/cgit/baserock/baserock/spec.git/tree/schemas/cluster.json-schema>

A cluster morphology defines a list of systems to deploy, and for each system a
list of ways to deploy them. The fields are used as follows:

* **systems**: a list of systems to deploy;
    the value is a list of mappings, where each mapping has the
    following keys:

    * **morph**: the system morphology to use in the specified
        commit.

    * **deploy**: a mapping where each key identifies a
        system and each system has at least the following keys:

        * **type**: identifies the relative path, without extension, to the
            '.write' program that should be used for this system.
        * **location**: where the deployed system should end up. The syntax
            depends on the '.write' extension chosen in the 'type' field.

        Optionally, it can specify **upgrade-type** and
        **upgrade-location** as well, which should be interpreted in the same
        way.

        The system dictionary can have any number of other entries. These
        should be collected and passed to each '.configure' extension and to
        the '.write' extension, through the environment. The extensions can
        interpret any of them in any manner.

    * **deploy-defaults**: allows multiple deployments of the same
        system to share some settings, when they can. Default settings
        will be overridden by those defined inside the deploy mapping.

    * **subsystems**: structured in the same way as the 'systems' entry, this
        allows deploying something *within* a system. The Baserock reference
        definitions use this to provide an initramfs inside some of the
        reference systems.

Example:

    name: cluster-foo
    kind: cluster
    systems:
        - morph: devel-system-x86_64-generic.morph
          deploy:
              cluster-foo-x86_64-1:
                  type: extensions/kvm
                  location: kvm+ssh://user@host/x86_64-1/x86_64-1.img
                  upgrade-type: extensions/ssh-rsync
                  upgrade-location: root@localhost
                  HOSTNAME: cluster-foo-x86_64-1
                  DISK_SIZE: 4G
                  RAM_SIZE: 4G
                  VCPUS: 2
        - morph: devel-system-armv7-highbank
          deploy-defaults:
              type: extensions/pxeboot
              location: cluster-foo-pxeboot-server
          deploy:
              cluster-foo-armv7-1:
                  HOSTNAME: cluster-foo-armv7-1
              cluster-foo-armv7-2:
                  HOSTNAME: cluster-foo-armv7-2
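
The way `deploy-defaults` and `deploy` combine can be sketched as follows,
assuming one entry of the `systems` list has already been parsed into a Python
dict. The names `deployment_settings`, `extension_environment` and
`RESERVED_KEYS` are hypothetical, and whether a real tool also exports the
reserved keys to extensions is not covered by this sketch.

```python
# Keys with a defined meaning; everything else is free-form (per the text above).
RESERVED_KEYS = {"type", "location", "upgrade-type", "upgrade-location"}


def deployment_settings(system_entry):
    """Map each deployment name to its merged settings.

    Values from 'deploy-defaults' are used unless the deployment itself
    sets the same key.
    """
    defaults = system_entry.get("deploy-defaults") or {}
    merged = {}
    for name, settings in (system_entry.get("deploy") or {}).items():
        merged[name] = {**defaults, **(settings or {})}
    return merged


def extension_environment(settings):
    """Return the non-reserved settings as string-valued environment variables."""
    return {
        key: str(value)
        for key, value in settings.items()
        if key not in RESERVED_KEYS
    }
```

Values are converted to strings because environment variables cannot carry
YAML types such as integers.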

### Repo URLs

Git repository locations can (and should) be abbreviated using the 'repo-alias' feature of Baserock definitions. This is a kind of [Compact URI](http://www.w3.org/TR/2009/CR-curie-20090116/). It currently only affects the 'repo' fields in a stratum .morph file.

For example, instead of writing this:

```
- name: fhs-dirs
  repo: git://git.baserock.org/baserock/baserock/fhs-dirs.git
  ref: master
```

You can write this:

```
- name: fhs-dirs
  repo: baserock:baserock/fhs-dirs.git
  ref: master
```

There are two repo aliases that *must* be defined:

 - `baserock:` (defaulting to git://git.baserock.org/baserock/)
 - `upstream:` (defaulting to git://git.baserock.org/delta/)

Baserock tools should allow changing these values. The main benefit of this compact URI scheme is that definitions are not tied to a specific Git server, or protocol. You can build against a mirror of the original Git server, or change the protocol that is used, just by altering the repo-alias configuration.
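
A minimal sketch of alias expansion, using the two default prefixes listed
above. The names `DEFAULT_ALIASES` and `expand_repo` are hypothetical; real
tools may configure aliases differently.

```python
# Default alias prefixes as given in this document.
DEFAULT_ALIASES = {
    "baserock": "git://git.baserock.org/baserock/",
    "upstream": "git://git.baserock.org/delta/",
}


def expand_repo(repo, aliases=DEFAULT_ALIASES):
    """Expand 'alias:path' into a full URL; return other values unchanged."""
    prefix, sep, rest = repo.partition(":")
    if sep and prefix in aliases:
        return aliases[prefix] + rest
    # Not a known alias (for example a full git:// URL): leave it alone.
    return repo
```

Passing a different `aliases` mapping is enough to point every definition at a
mirror, for example `expand_repo("upstream:linux", {"upstream": "https://mirror.example/delta/"})`.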

Build environment
-----------------

### Sandboxing

Builds should be done in an isolated 'staging area', with only the specified dependencies available to the build process. The simplest approach is to install the dependencies in an empty directory, then [chroot](https://en.wikipedia.org/wiki/Chroot) into it. The more sandboxing the build tool can do, the better, because it lowers the chance of unexpected and unreproducible errors in the build process. The [Sandboxlib](https://github.com/CodethinkLabs/sandboxlib) Python library may be useful.

The exception to the above is if the 'build-mode' field for a chunk is set to 'bootstrap'. Chunks in bootstrap mode are treated specially and do have access to tools from the host system. 

FIXME: more detail is needed here!

### Environment variables

The following environment variables can be used in chunk configure/build/install commands, and must be defined by the build tool.

 - `MORPH_ARCH`: the Morph-specific architecture name; see <http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/morph.git/tree/morphlib/util.py#n473> for a list of valid architectures
 - `PREFIX`: the value of the 'prefix' field for this chunk (set in the stratum .morph file); default /usr
 - `TARGET`: the [GNU architecture triplet](http://wiki.osdev.org/Target_Triplet) for the target architecture (for example, x86_64-baserock-linux)
 - `TARGET_STAGE1`: the 'bootstrap' variant of the GNU architecture triplet. This must be different from $TARGET -- you can just change the vendor field to achieve that (e.g. x86_64-bootstrap-linux).
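
As an illustration, a build tool might assemble these variables as shown below.
This is a sketch under the assumption that the triplet has the form
`cpu-vendor-os`; the function names `stage1_triplet` and `build_environment`
are hypothetical.

```python
def stage1_triplet(target, vendor="bootstrap"):
    """Return `target` with its vendor field replaced by `vendor`."""
    fields = target.split("-")
    if len(fields) < 3:
        raise ValueError(f"expected cpu-vendor-os, got {target!r}")
    if fields[1] == vendor:
        # The result would be identical to TARGET, which is not allowed.
        raise ValueError("TARGET_STAGE1 must differ from TARGET")
    fields[1] = vendor
    return "-".join(fields)


def build_environment(arch, target, prefix="/usr"):
    """Return the variables a build tool must define for chunk commands."""
    return {
        "MORPH_ARCH": arch,
        "PREFIX": prefix,
        "TARGET": target,
        "TARGET_STAGE1": stage1_triplet(target),
    }
```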

FIXME: The `TARGET` and `TARGET_STAGE1` fields are specific to building GNU/Linux based systems, they shouldn't be mandated in the spec.

[JSON-Schema]: https://www.json-schema.org/
[Morph]: http://wiki.baserock.org/Morph/
[YBD]: http://wiki.baserock.org/ybd/
[morph.git]: git://git.baserock.org/cgit/baserock/baserock/morph.git/
[spec.git]: git://git.baserock.org/cgit/baserock/baserock/spec.git/
[YAML]: http://yaml.org/