author | Raoul Hidalgo Charman <raoul.hidalgocharman@codethink.co.uk> | 2019-05-28 18:22:56 +0100 |
---|---|---|
committer | Raoul Hidalgo Charman <raoul.hidalgocharman@codethink.co.uk> | 2019-05-31 10:08:54 +0100 |
commit | fe721b65df09497c0f07f5fe414ad5cb2db49095 (patch) | |
tree | 2e73585d995358e6f826d2aa06caa8fbcd205990 /doc | |
parent | de51a5a587bced0b2092573a092503352a789931 (diff) | |
doc: Add architecture section on caches
Part of #1024
Diffstat (limited to 'doc')
-rw-r--r-- | doc/source/arch_caches.rst | 68 |
-rw-r--r-- | doc/source/main_architecture.rst | 1 |
-rw-r--r-- | doc/source/using_configuring_artifact_server.rst | 2 |
3 files changed, 71 insertions, 0 deletions
diff --git a/doc/source/arch_caches.rst b/doc/source/arch_caches.rst
new file mode 100644
index 000000000..c415cfc47
--- /dev/null
+++ b/doc/source/arch_caches.rst
@@ -0,0 +1,68 @@
+
+.. _caches:
+
+
+Caches
+======
+
+BuildStream uses local caches to avoid repeating work, and can have remote
+caches configured to allow the results of work to be shared between multiple
+users. There are caches for both elements and sources that map keys to relevant
+metadata and point to data in CAS.
+
+Content Addressable Storage (CAS)
+---------------------------------
+
+The majority of data is stored in Content Addressable Storage or CAS, which
+indexes stored files by the SHA256 hash of their contents. This allows for a
+flat file structure and lets any repeated data be shared across a CAS. In
+order to store directory structures BuildStream's CAS uses `protocol buffers`_
+for storing directory and file information as defined in Google's `REAPI`_.
+
+:ref:`bst-artifact-server <artifact_command_reference>` runs a `grpc`_ CAS
+service (also defined in REAPI) that both the artifact and source caches use,
+allowing them to download and upload files to a remote service.
+
+Artifact caches
+---------------
+
+An artifact stores the build results of an element and is referred to by its
+cache key (described in :ref:`cachekeys`). The artifact's information is
+stored in a protocol buffer, defined in ``artifact.proto``, which includes
+metadata such as the digest of the files' root; strong and weak keys; and log
+file digests. The digests point to locations in the CAS of relevant files and
+directories, allowing BuildStream to query remote CAS servers for this
+information.
+
+:ref:`bst-artifact-server <artifact_command_reference>` uses grpc to implement a
+remote API for an artifact service that BuildStream then uses to query,
+retrieve and update artifact files, before using this information to download
+the files and other data from the remote CAS.
+
+Source caches
+-------------
+
+Sources are cached by running the :mod:`Source.stage
+<buildstream.source.Source.stage>` method and capturing its directory output
+in the CAS, which is then referred to by the source's key. The source key is
+calculated from the plugin's :mod:`Plugin.get_unique_key
+<buildstream.plugin.Plugin.get_unique_key>` and, depending on whether the source
+requires previous sources to be staged (e.g. the patch plugin), the unique keys
+of all sources listed before it in an element. Source caches are simpler than
+artifacts, as they just need to map a source key to a directory digest, with no
+additional metadata.
+
+Similar to artifacts, :ref:`bst-artifact-server <artifact_command_reference>`
+uses grpc to implement a 'reference service' API that allows BuildStream to
+query for these source digests, which can then be used to retrieve sources from
+a CAS.
+
+.. note::
+
+   Not all plugins use the same result as the staged output for workspaces. As a
+   result, when initialising a workspace, BuildStream may need to fetch the
+   original source if it only has the source in the source cache.
+
+.. _protocol buffers: https://developers.google.com/protocol-buffers/docs/overview
+.. _grpc: https://grpc.io
+.. _REAPI: https://github.com/bazelbuild/remote-apis
diff --git a/doc/source/main_architecture.rst b/doc/source/main_architecture.rst
index b0a117ed9..d9b9f3e50 100644
--- a/doc/source/main_architecture.rst
+++ b/doc/source/main_architecture.rst
@@ -14,6 +14,7 @@ This section provides details on the overall BuildStream architecture.
    arch_dependency_model
    arch_scheduler
    arch_cachekeys
+   arch_caches
    arch_sandboxing
    arch_remote_execution
diff --git a/doc/source/using_configuring_artifact_server.rst b/doc/source/using_configuring_artifact_server.rst
index da61f0f80..6eb64113c 100644
--- a/doc/source/using_configuring_artifact_server.rst
+++ b/doc/source/using_configuring_artifact_server.rst
@@ -91,6 +91,8 @@ requiring BuildStream's more exigent dependencies by setting the
     BST_ARTIFACTS_ONLY=1 pip3 install .
+.. _artifact_command_reference:
+
 Command reference
 ~~~~~~~~~~~~~~~~~
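
As a reading aid for the CAS and source cache sections added by this patch, here is a minimal Python sketch of the idea: blobs are indexed by the SHA256 hash of their contents, a directory is itself stored as a blob listing file digests, and a source cache only maps a key to a directory digest. This is not BuildStream code. ``ToyCAS``, ``put_directory``, ``source_cache`` and the key string are hypothetical names, and JSON is used purely as a stand-in for the REAPI protocol buffers.

```python
import hashlib
import json


class ToyCAS:
    """In-memory stand-in for a content addressable store (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is the SHA256 of the content, so identical content
        # always lands on the same entry and is stored only once.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def put_directory(self, files: dict) -> str:
        # Assumption: JSON replaces the REAPI Directory message here.
        listing = {name: self.put(content) for name, content in files.items()}
        return self.put(json.dumps(listing, sort_keys=True).encode())

    def __len__(self):
        return len(self._blobs)


cas = ToyCAS()

# A source cache in this sketch is just: source key -> directory digest.
source_cache = {}

content = b"int main(void) { return 0; }\n"
staged = {"hello.c": content, "copy.c": content}  # two files, same bytes

source_key = "example-source-key"  # placeholder, not a real cache key
source_cache[source_key] = cas.put_directory(staged)

# Resolving the key yields the directory listing, then the file blobs.
listing = json.loads(cas.get(source_cache[source_key]))
```

In this sketch the two identical files share one blob, so the store holds only two entries: the file content and the directory listing. An artifact cache would follow the same pattern but map its key to a richer metadata record rather than a bare directory digest.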