author: bst-marge-bot <marge-bot@buildstream.build> 2019-05-31 09:39:04 +0000
committer: bst-marge-bot <marge-bot@buildstream.build> 2019-05-31 09:39:04 +0000
commit: fd91071fd2666f54f6beb967d6762415525e3e56
tree: 9ce80f21f5fef0c44e4eefefad7932c99b609bf1
parent: 194bffca4f4add6b22c21bd023bf8d6995fe4422
parent: d88c4aa38a2cb119c36c0648baa722b35369f786
Merge branch 'raoul/1024-artifact-docs' into 'master'

Update docs regarding artifact and source caches

Closes #1024

See merge request BuildStream/buildstream!1362

-rw-r--r-- doc/source/arch_caches.rst | 68
-rw-r--r-- doc/source/format_project.rst | 17
-rw-r--r-- doc/source/main_architecture.rst | 1
-rw-r--r-- doc/source/main_using.rst | 2
-rw-r--r-- doc/source/using_config.rst | 6
-rw-r--r-- doc/source/using_configuring_cache_server.rst (renamed from doc/source/using_configuring_artifact_server.rst) | 63
6 files changed, 118 insertions, 39 deletions
diff --git a/doc/source/arch_caches.rst b/doc/source/arch_caches.rst
new file mode 100644
index 000000000..c415cfc47
--- /dev/null
+++ b/doc/source/arch_caches.rst
@@ -0,0 +1,68 @@
+
+.. _caches:
+
+
+Caches
+======
+
+BuildStream uses local caches to avoid repeating work, and can have remote
+caches configured to allow the results of work to be shared between multiple
+users. There are caches for both elements and sources that map keys to relevant
+metadata and point to data in CAS.
+
+Content Addressable Storage (CAS)
+---------------------------------
+
+The majority of data is stored in Content Addressable Storage, or CAS, which
+indexes stored files by the SHA256 hash of their contents. This allows for a
+flat file structure and lets any repeated data be shared across a CAS. To
+store directory structures, BuildStream's CAS uses `protocol buffers`_ to
+hold directory and file information, as defined in Google's `REAPI`_.
+
+:ref:`bst-artifact-server <artifact_command_reference>` runs a `grpc`_ CAS
+service (also defined in REAPI) that both artifact and source cache use,
+allowing them to download and upload files to a remote service.
+
+Artifact caches
+---------------
+
+An artifact stores the build results of an element and is referred to by the
+element's cache key (described in :ref:`cachekeys`). The artifact's information
+is stored in a protocol buffer, defined in ``artifact.proto``, which includes
+metadata such as the digest of the files root, the strong and weak keys, and
+the digests of log files. The digests point to the locations in the CAS of the
+relevant files and directories, allowing BuildStream to query remote CAS
+servers for this information.
+
+:ref:`bst-artifact-server <artifact_command_reference>` uses grpc to implement a
+remote API for an artifact service, which BuildStream uses to query, retrieve
+and update artifact files. BuildStream then uses this information to download
+the files and other data from the remote CAS.
+
+Source caches
+-------------
+
+Sources are cached by running the :mod:`Source.stage
+<buildstream.source.Source.stage>` method and capturing its directory output
+into the CAS, which is then referred to by the source's key. The source key is
+calculated from the plugin-defined :mod:`Plugin.get_unique_key
+<buildstream.plugin.Plugin.get_unique_key>` and, depending on whether the source
+requires previous sources to be staged (e.g. the patch plugin), the unique keys
+of all sources listed before it in an element. Source caches are simpler than
+artifacts, as they just need to map a source key to a directory digest, with no
+additional metadata.
+
+Similar to artifacts, :ref:`bst-artifact-server <artifact_command_reference>`
+uses grpc to implement a 'reference service' API that allows BuildStream to
+query for these source digests, which can then be used to retrieve sources from
+a CAS.
+
+.. note::
+
+ Not all plugins use the same result as the staged output for workspaces. As a
+ result when initialising a workspace, BuildStream may require fetching the
+ original source if it only has the source in the source cache.
+
+.. _protocol buffers: https://developers.google.com/protocol-buffers/docs/overview
+.. _grpc: https://grpc.io
+.. _REAPI: https://github.com/bazelbuild/remote-apis
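The relationship described in ``arch_caches.rst`` between a content addressed store and a key-to-digest mapping can be illustrated with a small Python sketch. This is not BuildStream's implementation: the ``MiniCAS`` class, the ``source_refs`` dictionary and the key ``example-source-key`` are hypothetical names chosen for illustration, and the store is held in memory rather than on disk or behind a grpc service.

```python
import hashlib


class MiniCAS:
    """Toy in-memory content addressable store (illustrative only)."""

    def __init__(self):
        # Maps the hex SHA256 digest of a blob to the blob itself.
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data and return the digest that addresses it."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical content always hashes to the same digest,
        # so repeated data is only stored once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Retrieve the blob addressed by digest."""
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)


cas = MiniCAS()

# Hypothetical source cache: a plain mapping from a source key to a digest.
source_refs = {}

d1 = cas.put(b"hello world\n")
d2 = cas.put(b"hello world\n")  # same content, same digest, no second copy
source_refs["example-source-key"] = d1

print(d1 == d2, len(cas))
```

The point of the sketch is the split of responsibilities: the store only knows about digests, while the cache layer on top maps a meaningful key (a cache key or source key) to a digest.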
diff --git a/doc/source/format_project.rst b/doc/source/format_project.rst
index f5758657d..ac6786c79 100644
--- a/doc/source/format_project.rst
+++ b/doc/source/format_project.rst
@@ -189,7 +189,7 @@ for more detail.
Artifact server
~~~~~~~~~~~~~~~
-If you have setup an :ref:`artifact server <artifacts>` for your
+If you have set up an :ref:`artifact server <cache_servers>` for your
project then it is convenient to configure the following in your ``project.conf``
so that users need not have any additional configuration to communicate
with an artifact share.
@@ -218,6 +218,13 @@ The use of ports are required to distinguish between pull only access and
push/pull access. For information regarding the server/client certificates
and keys, please see: :ref:`Key pair for the server <server_authentication>`.
+.. note::
+
+ BuildStream artifact servers have changed since 1.2 to use protocol buffers
+ to store artifact information rather than a directory structure, and to use a
+ new server API. As a result newer BuildStream clients won't work with older
+ servers.
+
.. _project_source_cache:
Source cache server
@@ -239,12 +246,6 @@ Exactly the same as artifact servers, source cache servers can be specified.
client-cert: client.crt
client-key: client.key
-.. note::
-
- As artifact caches work in exactly the same way, a configured artifact server
- can also be used as a source cache server. If you want to use a server as
- both you can put it under both artifacts and source caches configs.
-
.. _project_remote_execution:
Remote execution
@@ -273,7 +274,7 @@ using the `remote-execution` option:
instance-name: development-emea-1
storage-service specifies a remote CAS store and the parameters are the
-same as those used to specify an :ref:`artifact server <artifacts>`.
+same as those used to specify an :ref:`artifact server <cache_servers>`.
The action-cache-service specifies where built actions are cached, allowing
buildstream to check whether an action has already been executed and download it
diff --git a/doc/source/main_architecture.rst b/doc/source/main_architecture.rst
index b0a117ed9..d9b9f3e50 100644
--- a/doc/source/main_architecture.rst
+++ b/doc/source/main_architecture.rst
@@ -14,6 +14,7 @@ This section provides details on the overall BuildStream architecture.
arch_dependency_model
arch_scheduler
arch_cachekeys
+ arch_caches
arch_sandboxing
arch_remote_execution
diff --git a/doc/source/main_using.rst b/doc/source/main_using.rst
index d56023e74..48553087c 100644
--- a/doc/source/main_using.rst
+++ b/doc/source/main_using.rst
@@ -17,4 +17,4 @@ guides and information on user preferences and configuration.
using_examples
using_config
using_commands
- using_configuring_artifact_server
+ using_configuring_cache_server
diff --git a/doc/source/using_config.rst b/doc/source/using_config.rst
index 2b74b2de5..2582e711f 100644
--- a/doc/source/using_config.rst
+++ b/doc/source/using_config.rst
@@ -32,8 +32,8 @@ the supported configurations on a project wide basis are listed here.
Artifact server
~~~~~~~~~~~~~~~
-Although project's often specify a :ref:`remote artifact cache <artifacts>` in
-their ``project.conf``, you may also want to specify extra caches.
+Although projects often specify a :ref:`remote artifact cache <cache_servers>`
+in their ``project.conf``, you may also want to specify extra caches.
Assuming that your host/server is reachable on the internet as ``artifacts.com``
(for example), there are two ways to declare remote caches in your user
@@ -100,6 +100,8 @@ pull only access and push/pull access. For information regarding this and the
server/client certificates and keys, please see:
:ref:`Key pair for the server <server_authentication>`.
+.. _config_sources:
+
Source cache server
~~~~~~~~~~~~~~~~~~~
Similarly global and project specific source caches servers can be specified in
diff --git a/doc/source/using_configuring_artifact_server.rst b/doc/source/using_configuring_cache_server.rst
index da61f0f80..856046f35 100644
--- a/doc/source/using_configuring_artifact_server.rst
+++ b/doc/source/using_configuring_cache_server.rst
@@ -1,32 +1,34 @@
-.. _artifacts:
+.. _cache_servers:
-Configuring Artifact Server
-===========================
+Configuring Cache Servers
+=========================
BuildStream caches the results of builds in a local artifact cache, and will
avoid building an element if there is a suitable build already present in the
-local artifact cache.
+local artifact cache. Similarly it will cache sources and avoid pulling them if
+present in the local cache. See :ref:`caches <caches>` for more details.
-In addition to the local artifact cache, you can configure one or more remote
-artifact caches and BuildStream will then try to pull a suitable build from one
-of the remotes, falling back to a local build if needed.
+In addition to the local caches, you can configure one or more remote caches and
+BuildStream will then try to pull a suitable object from one of the remotes,
+falling back to performing a local build or fetching a source if needed.
Configuring BuildStream to use remote caches
--------------------------------------------
A project will often set up continuous build infrastructure that pushes
-built artifacts to a shared cache, so developers working on the project can
-make use of these pre-built artifacts instead of having to each build the whole
+cached objects to a shared cache, so developers working on the project can
+make use of these pre-made objects instead of having to each build the whole
project locally. The project can declare this cache in its
-:ref:`project configuration file <project_essentials_artifacts>`.
+project configuration file for :ref:`artifacts <project_essentials_artifacts>`
+and :ref:`sources <project_source_cache>`.
Users can declare additional remote caches in the :ref:`user configuration
<config_artifacts>`. There are several use cases for this: your project may not
define its own cache, it may be useful to have a local mirror of its cache, or
you may have a reason to share artifacts privately.
-Remote artifact caches are identified by their URL. There are currently two
-supported protocols:
+Remote caches are identified by their URL. There are currently two supported
+protocols:
* ``http``: Pull and push access, without transport-layer security
* ``https``: Pull and push access, with transport-layer security
@@ -38,23 +40,24 @@ them in a specific order:
2. Project configuration
3. User configuration
-When an artifact is built locally, BuildStream will try to push it to all the
+When an object is created locally, BuildStream will try to push it to all the
caches which have the ``push: true`` flag set. You can also manually push
-artifacts to a specific cache using the :ref:`bst artifact push command <invoking_artifact_push>`.
+artifacts to a specific cache using the :ref:`bst artifact push command
+<invoking_artifact_push>`.
-Artifacts are identified using the element's :ref:`cache key <cachekeys>` so
-the builds provided by a cache should be interchangable with those provided
+Objects are identified using the element's or source's :ref:`cache key <cachekeys>`
+so the objects provided by a cache should be interchangeable with those provided
by any other cache.
-Setting up a remote artifact cache
-----------------------------------
-The rest of this page outlines how to set up a shared artifact cache.
+Setting up a remote cache
+-------------------------
+The rest of this page outlines how to set up a shared cache.
Setting up the user
~~~~~~~~~~~~~~~~~~~
-A specific user is not needed, however, a dedicated user to own the
-artifact cache is recommended.
+A specific user is not needed, however, a dedicated user to own the cache is
+recommended.
.. code:: bash
@@ -70,11 +73,11 @@ and authorization there.
Installing the server
~~~~~~~~~~~~~~~~~~~~~
-You will also need to install BuildStream on the artifact server in order
+You will also need to install BuildStream on the cache server in order
to receive uploaded artifacts over ssh. Follow the instructions for installing
BuildStream `here <https://buildstream.build/install.html>`_.
-When installing BuildStream on the artifact server, it must be installed
+When installing BuildStream on the cache server, it must be installed
in a system wide location, with ``pip3 install .`` in the BuildStream
checkout directory.
@@ -91,6 +94,8 @@ requiring BuildStream's more exigent dependencies by setting the
BST_ARTIFACTS_ONLY=1 pip3 install .
+.. _artifact_command_reference:
+
Command reference
~~~~~~~~~~~~~~~~~
@@ -239,12 +244,14 @@ We can then check if the services are successfully running with:
For more information on systemd services see:
`Creating Systemd Service Files <https://www.devdungeon.com/content/creating-systemd-service-files>`_.
-Declaring remote artifact caches
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Remote artifact caches can be declared within either:
+Declaring remote caches
+~~~~~~~~~~~~~~~~~~~~~~~
+Remote caches can be declared within either:
-1. The :ref:`project configuration <project_essentials_artifacts>`, or
-2. The :ref:`user configuration <config_artifacts>`.
+1. The project configuration for :ref:`artifacts <project_essentials_artifacts>`
+ and :ref:`sources <project_source_cache>`, or
+2. The user configuration for :ref:`artifacts <config_artifacts>` and
+ :ref:`sources <config_sources>`.
Please follow the above links to see examples showing how we declare remote
caches in both the project configuration and the user configuration, respectively.
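The pull and push behaviour described in ``using_configuring_cache_server.rst`` (try each remote in turn when pulling, push only to remotes with ``push: true``) can be sketched in Python as follows. This is an illustration under stated assumptions, not BuildStream code: the ``Remote`` class, the ``pull`` and ``push`` functions, the in-memory ``store`` and the example URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Remote:
    """Hypothetical stand-in for one configured remote cache."""
    url: str
    push: bool = False  # mirrors the ``push: true`` flag in the configuration
    store: Dict[str, bytes] = field(default_factory=dict)


def pull(remotes: List[Remote], key: str) -> Optional[bytes]:
    """Return the object for key from the first remote that has it."""
    for remote in remotes:
        if key in remote.store:
            return remote.store[key]
    # No remote has it: the caller falls back to building or fetching locally.
    return None


def push(remotes: List[Remote], key: str, obj: bytes) -> List[str]:
    """Upload obj to every remote that allows pushing; return their URLs."""
    pushed = []
    for remote in remotes:
        if remote.push:
            remote.store[key] = obj
            pushed.append(remote.url)
    return pushed


remotes = [
    Remote("https://cache-a.example.com"),             # pull only
    Remote("https://cache-b.example.com", push=True),  # push and pull
]

missing = pull(remotes, "abc123")           # nothing cached yet
pushed = push(remotes, "abc123", b"built")  # only cache-b accepts the push
found = pull(remotes, "abc123")             # now served by cache-b
```

Because objects are identified by cache key, it does not matter which remote eventually serves the pull; the sketch returns whatever the first matching remote holds.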