author | Sam Thursfield <sam.thursfield@codethink.co.uk> | 2015-07-31 12:07:40 +0100 |
---|---|---|
committer | Baserock Gerrit <gerrit@baserock.org> | 2015-09-18 15:49:28 +0000 |
commit | 666c1f019c275a65fe96246a20e5bd3fbd73a8dd (patch) | |
tree | 16d164162a2b0abd5ed1493fec50e3ed3011ebd8 /schemas/README.schemas | |
parent | aeeab946dac9100be26756bfd4948f4b52df386e (diff) | |
Add schemas for Baserock definitions format
See schemas/README.schemas for information.
Change-Id: I6c384692dbf70017a3ece2ed56c1f8cbe60b493d
Diffstat (limited to 'schemas/README.schemas')
-rw-r--r-- | schemas/README.schemas | 137 |
1 files changed, 137 insertions, 0 deletions
diff --git a/schemas/README.schemas b/schemas/README.schemas
new file mode 100644
index 00000000..a7789187
--- /dev/null
+++ b/schemas/README.schemas
@@ -0,0 +1,137 @@
+Schemas for the Baserock definitions format
+===========================================
+
+The starting point for learning about the Baserock definitions format is the
+wiki page at <http://wiki.baserock.org/definitions/>.
+
+The schemas/ directory in the Baserock reference definitions.git repository is
+the canonical home for some schemas which describe the format in a
+machine-readable way.
+
+There are two parts to 'Baserock definitions'. The 'Baserock data model' is an
+abstract vocabulary for describing how to build, integrate and deploy software
+components. The 'Baserock definitions YAML representation format' is a
+serialisation format for the data model, which lets you write YAML files
+describing how to build, integrate and deploy software components.
+
+If you want to make the YAML files easier to deal with, you only need
+to care about the JSON-Schema schemas and anything that parses the YAML files.
+
+If you want to write a new tool to build, visualise, analyse or otherwise
+process Baserock definitions in some way, you can ignore the syntax altogether,
+use a pre-existing parser, and just think in terms of the data
+model.
+
+If you want to change the data model, you still have quite a difficult job,
+but at least it should be simple to write a translation layer on top of an
+existing parser so that you can interpret all the existing Baserock reference
+system definitions in terms of your new data model.
+
+
+The Baserock definitions YAML representation format
+---------------------------------------------------
+
+YAML itself is a syntax for representing data as text. The YAML specification
+is at <http://www.yaml.org/>.
+
+The data needs to be structured in a certain way for it to make sense as
+Baserock build/integration/deployment instructions. We have used JSON-Schema
+to describe the required layout of the data.
+
+The JSON-Schema standard is described at <http://json-schema.org/>. The
+JSON-Schema language was designed for use with JSON, which is another syntax
+for representing data as text, which happens to be a subset of YAML. We have
+found so far that JSON-Schema works well with YAML, at least when using the
+Python 'jsonschema' module.
+
+Definitions are represented by files with a '.morph' extension. There are four
+different kinds: 'chunk', 'stratum', 'system', and 'cluster'. Each of these is
+described with a different .json-schema file. It is possible to merge all these
+into one file, and use the 'oneOf' field to say that any .morph file should
+match exactly one of the layouts. The only issue with this approach is that
+the Python 'jsonschema' module will give you totally useless errors if anything
+is invalid (along the lines of "<dump of entire file> is not valid under any of
+the given schemas"). So for now they are separate.
+
+
+Tools for working with the Baserock YAML schemas
+------------------------------------------------
+
+You can use `scripts/yaml-jsonschema` to validate .morph files against the
+schemas. For example:
+
+    scripts/yaml-jsonschema schemas/cluster.json-schema clusters/*.morph
+
+
+The Baserock data model
+-----------------------
+
+The best way to represent information on disk may be a pretty inefficient way
+to represent that data in a computer's memory. Likewise, the way a program
+stores data internally may be totally impractical for people to edit directly.
+
+The file `baserock.owl` is an initial effort to describe the Baserock data
+model independently of any syntax or representation.
+
+We use the W3C standard Web Ontology Language (OWL), combined with the much
+simpler RDF Schema language. Together, this allows defining the vocabulary we
+can use to define build, integration and deployment instructions. There are
+various ways to represent OWL 'ontologies'; `baserock.owl` uses a
+representation format named Turtle, which is designed to be convenient for
+hand-editing.
+
+The current data model is very closely tied to the current syntax, but we are
+looking to change this and make it much more generic. This will involve
+removing the current 'Chunk', 'Stratum', 'System' and 'Cluster' classes, and
+adding something like 'thing with build instructions' and 'thing that contains
+other things' instead. Name suggestions are welcome :-)
+
+It's useful to consider existing OWL and RDF Schema vocabularies that are
+related to the Baserock data model. In future we can link the Baserock
+reference system definitions with related data published elsewhere on the Web.
+Here is an incomplete list:
+
+  - Description of a Project (DOAP): https://github.com/edumbill/doap
+  - Software Ontology: https://robertdavidstevens.wordpress.com/2014/06/19/the-software-ontology-swo/
+  - Software Package Data Exchange (SPDX): https://spdx.org/about-spdx/what-is-spdx
+
+
+Tools for working with the Baserock data model schema
+-----------------------------------------------------
+
+It's difficult to find a short, relevant 'getting started' guide. The
+website http://www.linkeddata.org/ has a lot of background that should be
+useful.
+
+The `rapper` commandline tool, which comes as part of the 'raptor2' C library,
+is helpful for converting from one syntax to another, and checking if
+`baserock.owl` is valid Turtle syntax. The 'raptor2' homepage is
+<http://www.librdf.org/>.
+
+To check the syntax of `baserock.owl` using `rapper`:
+
+    rapper -i turtle schemas/baserock.owl
+
+
+Omissions / TODO items
+----------------------
+
+- Device nodes: chunk .morph files can list a set of device nodes. In
+  `chunk.json-schema` this is recognised, but in `baserock.owl` it is missing.
+
+- 'Lorry' mirroring instructions. These contain information on where 'upstream'
+  source code is kept, which should be considered part of the data model. A
+  JSON schema may be better off in lorry.git or
+  baserock/local-config/lorries.git.
+
+- Metadata in built systems. This is currently not standardised at all.
+
+
+Comments
+--------
+
+As far as I know, Baserock is the first project to treat build, integration and
+deployment instructions as data rather than code. If you have questions about
+the schemas, the definitions format, or the overall approach, and they aren't
+answered here or in <http://wiki.baserock.org/definitions/>, then please ask on
+the baserock-dev@baserock.org mailing list.
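
The README keeps one schema per definition kind instead of a single 'oneOf' schema, so a validator has to pick the right schema for each file first. The sketch below is one possible way to do that in Python; it is not the code in `scripts/yaml-jsonschema`. It assumes that every .morph file has a top-level `kind` key and that the schemas are named `<kind>.json-schema` (the README only names `chunk.json-schema` and `cluster.json-schema`; the other two file names are guesses). The helper names `schema_path_for` and `validate_morph` are hypothetical.

```python
# The four definition kinds listed in README.schemas.
KINDS = ("chunk", "stratum", "system", "cluster")


def schema_path_for(definition, schema_dir="schemas"):
    """Return the path of the schema that should validate `definition`.

    Assumption: the parsed .morph file is a mapping with a top-level
    'kind' key, and schemas are named '<kind>.json-schema'.
    """
    kind = definition.get("kind") if isinstance(definition, dict) else None
    if kind not in KINDS:
        raise ValueError(f"unknown or missing 'kind': {kind!r}")
    return f"{schema_dir}/{kind}.json-schema"


def validate_morph(morph_path, schema_dir="schemas"):
    """Validate one .morph file against the schema for its kind.

    Needs the third-party PyYAML and jsonschema packages. They are
    imported here so that schema_path_for() works without them.
    """
    import jsonschema
    import yaml

    with open(morph_path) as f:
        definition = yaml.safe_load(f)
    with open(schema_path_for(definition, schema_dir)) as f:
        # A YAML parser reads the schema whether it is written as JSON
        # or as YAML, since JSON is a subset of YAML.
        schema = yaml.safe_load(f)
    # Raises jsonschema.ValidationError if the file does not match.
    jsonschema.validate(definition, schema)
```

Choosing the schema up front means a validation failure is reported against one specific layout, which avoids the unhelpful "is not valid under any of the given schemas" message described in the README.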