Commit message | Author | Age | Files | Lines
* fixup: Import JsonError [baserock/adamcoldrick/fix-broken-pipe-busy-loop] | Adam Coldrick | 2015-05-12 | 1 | -1/+1
    Change-Id: Ie27a76c0a9e8e44b36389c35e43c7b5cd8c4022a
* distbuild: Handle errors from socket | Sam Thursfield | 2015-05-12 | 3 | -2/+24
    We found a distbuild controller stuck in a busy loop, with the logs full of the same error message repeated:

        ... _flush(): Exception 'IOError: [Errno 32] Broken pipe' from sock.write()

    We suspect this came about because the initiator disconnected without sending an EOF. The initiator was in a VM on a laptop so it seems possible that the host OS turned off the wireless adaptor without giving the VM a chance to close its connections gracefully.

    The busy loop is because nothing in the SocketBuffer class handles the SocketError events queued by the _flush() method. Unhandled events are ignored. So the SocketBuffer stays in 'w' state without ever shifting any data and never returns.

    Adding transitions to handle the SocketError event will fix the problem. If a socket error happens now in the same scenario, it will be handled as if the initiator disconnected.
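The failure mode described in this message can be illustrated with a toy state machine. This is only a sketch: `TinyStateMachine`, the state names other than 'w', and the event names are hypothetical and are not the real distbuild `SocketBuffer` API. It shows how an event with no matching transition leaves the machine stuck, and how adding a transition lets the error be handled.

```python
# Hypothetical, simplified state machine; NOT the real distbuild SocketBuffer.
class TinyStateMachine:
    def __init__(self, initial_state, transitions):
        # transitions maps (state, event) -> next state
        self.state = initial_state
        self.transitions = transitions

    def handle(self, event):
        """Apply an event; return True if a transition handled it."""
        key = (self.state, event)
        if key not in self.transitions:
            return False  # unhandled events are silently ignored
        self.state = self.transitions[key]
        return True


# Without a transition for 'socket-error' the machine stays in 'w'.
before = TinyStateMachine('w', {('w', 'flushed'): 'idle'})
handled_before = before.handle('socket-error')

# With the extra transition the error is treated like a disconnect.
after = TinyStateMachine('w', {
    ('w', 'flushed'): 'idle',
    ('w', 'socket-error'): 'closed',
})
handled_after = after.handle('socket-error')
```

In the first machine the caller would keep retrying the write in state 'w', which is the busy loop the commit describes.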
* distbuild: Set job status to failed when sending exec-cancel | Adam Coldrick | 2015-05-12 | 1 | -0/+8
    Currently jobs may continue running after exec-cancel is sent if exec-response takes a while to be sent back. This commit sets the job's state to 'failed' when exec-cancel is sent, so that the wait for exec-response doesn't matter.

    Change-Id: I858d9efcba38c81a912cf57aee2bdd8c02cb466b
* Revert "distbuild: Track worker jobs using artifact basename only" | Adam Coldrick | 2015-05-12 | 1 | -29/+48
    This reverts commit 75ef3e9585091b463b60d2981b3b7283a2ea8eab.

    It turns out that the JobQueue may need to handle more than one build of the same artifact at once, as one may be in the process of being cancelled when another build of the same artifact is requested. So they do need an ID separate from the artifact ID.

    Change-Id: Ifa0c06987795a4aebdadbd9927de27919377b0a2
* Remove mention of MorphologyFactory in the unit tests | Adam Coldrick | 2015-05-12 | 1 | -2/+0
    Change-Id: I27f5319721aa3e996c186f92a3c2296d6df4bedb
* Clean up artifact serialisation | Adam Coldrick | 2015-05-12 | 6 | -85/+99
    We no longer serialise whole artifacts, so it doesn't make sense for things to still refer to serialise-artifact and similar.

    Change-Id: Id4d563a07041bbce77f13ac71dc3f7de39df5e23
* Move duplicate fix_chunk_build_mode function to a common location | Adam Coldrick | 2015-05-12 | 3 | -52/+33
    Change-Id: I11b4dbeb50d67068701f269ef6ac7cfbd89f6aed
* Enable native-bootstrap to continue a build after recovering from a fault. | Kejia Hu (Terry) | 2015-05-12 | 1 | -6/+13
    The previous script created a new directory for the chunk it was about to build without checking whether the directory already existed, and it failed if the directory was already there. So if a build failed, you always needed to remove all .inst directories and let the native-bootstrap script build from the beginning. This patch improves this: you can run the native-bootstrap script again directly after a failure without losing previous progress.

    A condition was added to determine whether the previous native-bootstrap run was terminated while it was building the current chunk. The .build directory for a chunk only exists during that chunk's build phase: it is created when the build starts and cleaned up after the build finishes. If the .inst directory for a chunk exists and the .build directory doesn't, the chunk was built successfully in the previous run. The second run of native-bootstrap will skip all successful chunks and start where it left off.

    Change-Id: I91ae213ecc8c98808efdfd969624291e70f7e0fe
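The resume condition described above could be expressed roughly as follows. This is a sketch under assumptions, not the native-bootstrap script itself: the function name `chunk_already_built` and the `<chunk>.inst` / `<chunk>.build` path layout inside a single working directory are hypothetical; only the ".inst exists and .build does not" rule comes from the commit message.

```python
import os
import tempfile


def chunk_already_built(workdir, chunk):
    """Return True if a previous run appears to have finished this chunk.

    Assumed layout (hypothetical): <workdir>/<chunk>.inst and
    <workdir>/<chunk>.build. A finished chunk has .inst but no .build.
    """
    inst_dir = os.path.join(workdir, chunk + '.inst')
    build_dir = os.path.join(workdir, chunk + '.build')
    return os.path.isdir(inst_dir) and not os.path.isdir(build_dir)


# Small demonstration in a throwaway directory.
workdir = tempfile.mkdtemp()
os.mkdir(os.path.join(workdir, 'done.inst'))          # finished chunk
os.mkdir(os.path.join(workdir, 'interrupted.inst'))   # killed mid-build
os.mkdir(os.path.join(workdir, 'interrupted.build'))

results = {
    chunk: chunk_already_built(workdir, chunk)
    for chunk in ('done', 'interrupted', 'untouched')
}
```

Only the chunk reported as already built would be skipped on the second run; the interrupted and untouched chunks would be built again.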
* Remove % from debug statement | Richard Ipsum | 2015-05-12 | 1 | -1/+1
    Change-Id: I674c39149aad82c07c85d2db3207280b91dfa292
* Add a common func for handling build termination | Richard Ipsum | 2015-05-12 | 1 | -20/+11
    Change-Id: I95fbfcb2ed6a8ffdd946d36eacc030b4ae1b9b21
* Add GraphProgress messages | Richard Ipsum | 2015-05-12 | 5 | -10/+91
    Adds distinct message types to give us more flexibility over message handling now that we have multiple initiator types with different requirements.

    Change-Id: Ib2af8736b83d66ef20a8e37591ca68c9441b6497
* distbuild: Fix protocol version checking for distbuild commands | Lauren Perry | 2015-05-11 | 1 | -0/+14
    This fixes an issue with distbuild-status and distbuild-cancel crashing because their respective Initiator classes did not handle 'build-failed' messages.

    Change-Id: Ia35c8e14a30e3a9bdea1e44f7726181db75dfbe5
* yarns: Add yarn for morph diff | Richard Maw | 2015-05-11 | 1 | -0/+22
    Change-Id: If3f6abdaab6518e77da911bfe1952c8ffe4bda34
* yarns: Add the ability to tag chunks and commit updates to definitions | Richard Maw | 2015-05-11 | 2 | -0/+30
    Change-Id: Ia644ddfaa5138f0ad459099cf26f51b545a9f9ca
* morph: Add morph diff subcommand | Richard Maw | 2015-05-11 | 2 | -0/+148
    Change-Id: If74c97ccd81aa4d92ef247d2be59282f9552d4a1
* morphlib: Add utility module for parsing argv into lists of systems | Richard Maw | 2015-05-11 | 3 | -0/+143
    The `morph anchor`, `morph build-morphology` and a potential `morph diff` command would all benefit from having a unified way to parse the argv for the systems it must operate on, especially in the case of the potential `morph diff`, which needs to be able to handle being given two sets of systems.

    `morph anchor` may make use of it now by passing the list of systems to the Source resolver, but `morph build-morphology` would have to iterate over the systems and graph each independently.

    Change-Id: I91ab4764ffca3aa16f144f89f68f37cc21b9f643
* distbuild: Builds currently break due to job being set twice | Lauren Perry | 2015-05-11 | 1 | -1/+0
    Remove extra job set line as self._current_job no longer exists in worker_build_scheduler.py

    Change-Id: I8849742587f11f83ebba64f48eaf97fac83e6589
* SourceResolver: Allow the resolution of multiple systems | Richard Maw | 2015-05-11 | 2 | -5/+10
    The existing Source resolution code handles resolution of multiple systems sufficiently.

    It is not appropriate to then take this source pool and attempt to create a build graph from it though, as the logical structure of the input of what we want to build, and the logical structure of what we will produce are conflated in the Source object.

    If we do not intend to create a build graph from the Source Pool we generate, then it is an appropriate data structure that may be used to analyse changes in the input to a build.

    Change-Id: If8e4a726f16f8aca000e59ecbbeb7d926cc08391
* LRC: Make get_updated_repo handle multiple refs | Richard Maw | 2015-05-11 | 1 | -16/+27
    Passing a single ref is still accepted, but if you have multiple refs you need to check from the same repository, it is more appropriate to do it in one call to get_updated_repo, as otherwise there will be unnecessary output about it not needing to be updated in multiple places.

    Change-Id: I194d7c0e3e84c4917518ba37672b508505c71b8e
* MorphologyLoader: Set filename attribute at parse time | Richard Maw | 2015-05-11 | 1 | -2/+3
    Change-Id: I0e0b8d352eb4ef1ab6c50e0ba0162263d9bac09d
* morph anchor: Handle updating refs | Richard Maw | 2015-05-11 | 1 | -5/+6
    Previously it would not attempt to make the commits it needed available locally if the commit was available on the remote repo cache.

    Now it will do the update if the commit is not available locally, and will obey --no-git-update.

    Change-Id: I80f1e351ce334641e2ef733fa4c9a6ab967f9b67
* morphlib.util: add word_join_list | Richard Maw | 2015-05-11 | 1 | -0/+10
    This is useful for representing lists of items in status or exception messages.

    Change-Id: I530eecdc311ac77fca9922dab063f550ea810c06
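The commit message does not show what the helper looks like, so the following is only a guess at the kind of function it describes. The name `join_words_sketch`, its signature, and the exact output format ("a, b and c") are assumptions made for illustration; they are not taken from `morphlib.util`.

```python
def join_words_sketch(words):
    """Join words into a human-readable list for a status message.

    Hypothetical helper; the output format is an assumption, not
    necessarily what morphlib.util.word_join_list produces.
    """
    words = list(words)
    if not words:
        return ''
    if len(words) == 1:
        return words[0]
    # Comma-separate all but the last item, then add "and <last>".
    return '%s and %s' % (', '.join(words[:-1]), words[-1])


one = join_words_sketch(['foo'])
two = join_words_sketch(['foo', 'bar'])
three = join_words_sketch(['foo', 'bar', 'baz'])
message = 'Could not find systems: %s' % three
```

A helper like this keeps message formatting out of the code that raises the exception or reports status.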
* yarns: Fix incorrect chunk name in test suite | Adam Coldrick | 2015-05-11 | 1 | -1/+1
    Change-Id: I60e808d7f42890dac2e1470a994e1a31a92401e7
* Fix mistake in sysroot.write | Sam Thursfield | 2015-05-08 | 1 | -1/+1
    The * should not be in quotes.

    Change-Id: Ieebdc7532ba1bff5ba9742f72440ed00b0c0de2a
* Raise an error if a stratum build-depends on itself | Adam Coldrick | 2015-05-08 | 2 | -0/+25
    If a stratum build-depends on itself, the build graph calculation gets stuck in an infinite loop as it adds the same stratum to the queue of morphologies to inspect over and over again.

    This commit causes MorphologyLoader.validate_stratum to raise an error if a stratum contains itself in its build-depends, as depending on itself makes no sense and will cause the above problem.

    Change-Id: I76df5b7d63d010ae3b17f72bfa39b273e74279dd
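A check of the kind described here might look roughly like this. It is a minimal sketch, not the code in `MorphologyLoader.validate_stratum`: the exception class, the function name, the dict layout, and the assumption that a build-depends entry refers to a stratum by the same identifier as the stratum's own `name` are all hypothetical.

```python
class SelfDependencyError(Exception):
    """Hypothetical error type for a stratum that build-depends on itself."""


def check_no_self_dependency(stratum):
    # Assumed layout: {'name': ..., 'build-depends': [{'morph': ...}, ...]}
    name = stratum['name']
    for dep in stratum.get('build-depends', []):
        if dep.get('morph') == name:
            raise SelfDependencyError(
                'Stratum %s build-depends on itself' % name)


ok = {'name': 'core', 'build-depends': [{'morph': 'build-essential'}]}
bad = {'name': 'core', 'build-depends': [{'morph': 'core'}]}

check_no_self_dependency(ok)  # passes silently

try:
    check_no_self_dependency(bad)
    bad_rejected = False
except SelfDependencyError:
    bad_rejected = True
```

Rejecting the morphology at validation time means the build graph calculation never sees the cycle, so the infinite loop cannot start.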
* Fix distbuild-morphology | Adam Coldrick | 2015-05-07 | 1 | -1/+1
    This fixes an error caused by not enough parameters being given to the InitiatorBuildCommand constructor in distbuild-morphology.

    Change-Id: I133bd2f267fd06cfe88a1cbf4711cc79ad00d209
* distbuild: Fix initiator hanging when protocol errors occur | Sam Thursfield | 2015-05-07 | 1 | -19/+34
    If the initiator sends an invalid build-request message, it will now exit with the following sort of error:

        ERROR: Failed to build baserock:baserock/definitions f2d78e9b7221bca65cba53af3f3b50d50d90628f systems/build-system-x86_64.morph: Invalid build-request message. Check you are using a supported version of Morph. This distbuild network uses protocol version 2.

    Previously, the controller would log an error to its log file, but it would not send any response to the initiator so the initiator would hang forever.

    Behaviour is the same as before for the case where the initiator sends a build-request message with the wrong protocol version: the initiator will exit with an error message.

    Change-Id: I94fdee02bc701d4a679a0261b3c46dbdf14cfcaf
* Fix sysroot.write trying to overwrite existing files | Sam Thursfield | 2015-05-07 | 2 | -11/+1
    Commit 807e6a90876c5469d242 changed the behaviour of sysroot.write to avoid deleting the contents of the sysroot. This was done so if you accidentally set 'sysroot=/' it wouldn't delete your whole system.

    It turns out that SDK deployments like clusters/sdk-example-cluster.morph depended on the contents of the directory being deleted. The system armv7lhf-cross-toolchain-system-x86_64.morph has a bunch of files installed by the cross-toolchain in /usr/armv7lhf-baserock-linux-gnueabi/sys-root. Previously sysroot.write would delete these, but since commit 807e6a90876c5469d242 it would fail with several errors like:

        mv: can't rename '/src/tmp/deployments/usr/armv7l.../sys-root/sbin'

    If we use 'cp -a' instead of 'mv' then it is slower to deploy, but there are no errors. I am still unsure why files from the cross-toolchain system are installed and then deleted. Although this patch fixes the immediate issue, I don't know if it's the right thing to do. It seems better to not install those files in the first place, if we do not need them.

    This commit also removes the check for the sysroot target location being empty. This doesn't work, because it runs /before/ the system being deployed is unpacked.

    Change-Id: I10671c2f3b2060cfb36f880675b83351c6cdd807
* distbuild: Allow WorkerConnection to track multiple in-flight jobs | Sam Thursfield | 2015-05-07 | 1 | -108/+115
    Although in theory a worker should only ever have one job at once, in practice this assumption doesn't hold, and can cause serious confusion.

    The worker (implemented in the JsonRouter class) will actually queue up exec-request messages and run the oldest one first. I saw a case where, due to a build not being correctly cancelled, the WorkerConnection.current_job attribute got out of sync with what the worker was actually building. This led to an error when trying to fetch the built artifacts, as the controller tried to fetch artifacts for something that wasn't actually built yet, and everything got stuck.

    To prevent this from happening, we either need to remove the exec-request queue in the worker-daemon process, or make the WorkerConnection class cope with multiple jobs at once. The latter seems like the more robust approach, so I have done that.

    Another bug this fixes is the issue where, if the 'Computing build graph' (serialise-artifact) step of a build completes on the controller while one of its WorkerConnection objects is waiting for artifacts to be fetched by the shared cache from the worker, the build hangs. This would happen because the WorkerConnection assumed that any HelperResponse message it saw was the result of its request, so would send a _JobFinished before caching had actually finished if there was an unrelated HelperResponse received in the meantime. It now checks the request ID of the HelperResponse before calling the code that is now in the new _handle_helper_result_for_job() function.

    Change-Id: Ia961f333f9dae77405b58c82c99a56e4c43e1628
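The request-ID check mentioned at the end of this message can be sketched in isolation. Everything below is illustrative: `CachingTracker`, its method names, and the ID strings are hypothetical and do not reflect the actual `WorkerConnection` code; the only idea taken from the commit is that a response is acted on only when its request ID matches one this object issued.

```python
class CachingTracker:
    """Hypothetical tracker pairing helper requests with in-flight jobs."""

    def __init__(self):
        self._pending = {}  # request id -> job name

    def start_caching(self, request_id, job):
        # Remember which job this helper request belongs to.
        self._pending[request_id] = job

    def on_helper_response(self, request_id):
        """Return the finished job, or None if the response is not ours."""
        return self._pending.pop(request_id, None)


tracker = CachingTracker()
tracker.start_caching('req-1', 'job-a')
tracker.start_caching('req-2', 'job-b')

unrelated = tracker.on_helper_response('req-99')  # someone else's request
finished = tracker.on_helper_response('req-2')    # matches job-b
still_pending = sorted(tracker._pending.values())
```

Keying the lookup on the request ID also covers the multiple-in-flight-jobs case, since each response is matched to its own job rather than to a single "current" one.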
* distbuild: Track worker jobs using artifact basename only | Sam Thursfield | 2015-05-07 | 1 | -34/+23
    Rather than generating IDs for each job, identify them by what artifact is going to be built. Artifact cache IDs need to be unique in any case.

    Change-Id: I37a0277931c45a8fb6e37ae7c2a6a942ae732fdd
* distbuild: Track state of a job in the Job class | Sam Thursfield | 2015-05-07 | 1 | -22/+31
    This is a bit more comprehensive than the previous approach of using public instance attributes, and I find it easier to reason about.

    Change-Id: I2942ecf53c95e29893dc0982d38aec689ebfa614
* distbuild: Make Jobs class into a more generic JobQueue | Sam Thursfield | 2015-05-07 | 1 | -11/+17
    The intention is to allow workers to use this class for job tracking, in addition to the controller.

    Change-Id: I355861086764476b383266bab7e850af5e05bc54
* Make str() of a GitDirectory return its location. | Sam Thursfield | 2015-05-07 | 1 | -0/+3
    Handy for log messages.

    Change-Id: I4336866c456a6225a6f3ecbfef10dfc7b864ac59
* Make listing contents of local tarball cache more robust | Sam Thursfield | 2015-05-07 | 1 | -1/+12
    Previously it would crash with a backtrace if there were unexpected filenames in the directory. It's still not amazingly robust, but I don't have time to rewrite the whole thing now.

    This code seems to have ignored that cachefs.walkfiles() returns filenames with a preceding '/', which I have fixed now.

    Change-Id: I98b3094bd6c82b26984513ee81a1eab9bf253a34
* Show progress of downloads when --verbose is passed, not --debug | Sam Thursfield | 2015-05-06 | 4 | -4/+5
    This commit undoes behaviour changes from commit aa6dfcbb70c03dfeb3f9af02.

    Change-Id: Ie677fb9c4e6bcd6edeba2cdd87f4f6125dcae7a4
* GitDir: Fix setting fetch url when push url is already on-disk | Richard Maw | 2015-05-06 | 1 | -2/+2
    290483010cfc7945cd4483fadd1d98c3b83efb3 broke morph checkout, which uses set_fetch_url on a repository that has been cloned, hence has its origin remote url config already on-disk.

    The fix prevents it changing the push_url when the fetch_url is set, unless it is an unnamed remote, as if the config is on-disk, this already does the right thing.

    Change-Id: I6204f664407bab3d7f8ecf0fcca72f5015dee55e
* Update distbuild protocol version to 3 | Sam Thursfield | 2015-05-05 | 1 | -1/+1
    Commit 84096556ea54d4af236f1fe5f7ccf61c1343016f changed the protocol without changing the protocol version. Versions of Morph between that one and this one may hang forever in 'morph distbuild' if trying to build on an incompatible distbuild network.

    Change-Id: I9194657f59a4b4a61a6fde7bd85105b56ca1a78d
* Add yarns for basic `morph anchor` functionality | Richard Maw | 2015-05-01 | 1 | -0/+131
    Change-Id: I77a8a3aab887f5d14a372690502df3fdeba6db10
* Add `morph anchor` command | Richard Maw | 2015-05-01 | 2 | -0/+227
    Change-Id: If9d92d7c75b9c4276b69c482c076c6fc1d4ccbbf
* yarns: Fix typo in system branch creation yarn | Richard Maw | 2015-04-29 | 1 | -1/+1
    Change-Id: I1df58c33987597d4aa5a8eb241b4de4ac72fe250
* yarns: Fix get-repo test falsely checking exit result | Richard Maw | 2015-04-29 | 1 | -6/+3
    "the user gets the repo" does not set exit-morph, so it is not valid to check whether it exited successfully.

    Change-Id: I05e2d5c1919eee6b714269642eb9c39bcf578bbc
* RemoteRefManager: Fail all ref updates when one fails | Richard Maw | 2015-04-29 | 1 | -1/+27
    There's no API to do it in one push yet, but we can send a delete for all the branches that *did* commit.

    Change-Id: I671e9384b84657a3e9034d62818caa0ac0d8de1e
* GitDir: Set the fetch or push url when the other is set | Richard Maw | 2015-04-29 | 1 | -0/+4
    Change-Id: I500cb81fd0f133bd9f4e76d46bc0ff8a4f57fe50
* yarns: Have non-bogus trove config | Richard Maw | 2015-04-29 | 1 | -7/+9
    Change-Id: I5dec13df6c28eeb4e8c83ec41fb4bd119e2eebb1
* CachedRepo: Fix reference to _gitdir | Richard Maw | 2015-04-29 | 1 | -1/+1
    87f8dbefda89bf6cb9e4b88f23a5317b054da0d4 added a method that used _gitdir, but the patch to change it to gitdir was merged afterwards.

    Change-Id: Ibd9bff73a0fe69b3c1c2ff6acd02df6cea4a13de
* install-files.configure: make it possible to overwrite symlinks | Javier Jardón | 2015-04-30 | 1 | -1/+10
    os.symlink will fail if the origin file/link already exists.

    Change-Id: I8175c8dce699e55c3e39e35dfd45c0c19b8bd96d
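One common way to work around this `os.symlink` behaviour is to remove any existing entry at the link path first. The sketch below is an assumption about the general approach, not the code added to install-files.configure; `force_symlink` and the file names are hypothetical, while `os.symlink`, `os.path.lexists`, `os.remove` and `os.readlink` are standard library calls.

```python
import os
import tempfile


def force_symlink(target, link_path):
    """Create link_path -> target, replacing an existing file or symlink.

    Hypothetical helper. os.path.lexists() is used instead of
    os.path.exists() so that broken symlinks are detected as well.
    Note: os.remove() will not remove a real directory.
    """
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(target, link_path)


# Demonstration in a throwaway directory.
tmp = tempfile.mkdtemp()
link = os.path.join(tmp, 'link')
os.symlink('old-target', link)      # a plain os.symlink here would now fail
force_symlink('new-target', link)
resolved = os.readlink(link)
```

There is a short window between the remove and the new symlink in which the path does not exist, which is usually acceptable in a configuration step but worth knowing about.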
* Add a test for partial deployment | Adam Coldrick | 2015-04-30 | 2 | -1/+54
    Change-Id: Iaab620f3d9ebc037fe024db933b03e8f40ca40a4
* Allow the deployment of individual chunks/strata from systems | Adam Coldrick | 2015-04-30 | 2 | -40/+219
    This commit allows the specification of one or more strata/chunks in a deployment entry in a cluster morphology to deploy instead of the full system if --partial is set. These are listed in a 'partial-deploy-components' field in each deployment definition.

    The components must be in the system, and this only works for tarball or sysroot deployments. It SHOULD NOT be used when deploying production systems, as it has a number of limitations.

    Change-Id: I04ac58af57216335d9257f6620d09f18f61ea714
* morphlib: Add command to get build-log for a given chunk and system file | Lauren Perry | 2015-04-30 | 2 | -0/+91
    Change-Id: I09e9b17ef2e0fb94dbf5a96dca91062d64433add
* Add ssh keys conf ext | Richard Ipsum | 2015-04-30 | 1 | -0/+25
    Change-Id: I4e7888cbff2e4708154538f8f0a48aeaa1a8a811