path: root/distbuild
Commit message [Author, Age, Files, Lines]
* Fix partial distbuilds of non-existent components [Adam Coldrick, 2015-04-30, files: 1, lines: -8/+9]
|   Currently, attempting to distbuild a component which is not in the given system or doesn't exist at all will cause the full system to be built, rather than an error being raised. This is because the logic which checks that all components were found is completely nonsensical. This commit makes it actually check the right thing.
|
|   Change-Id: Ide4d7e3fa5f71e433f3a7b7c8c387fe594c92e43
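A minimal sketch of the kind of check described above, not the code from the commit. The function names and the idea of comparing plain lists of component names are assumptions made for illustration.

```python
def find_missing_components(requested, available):
    """Return the requested component names that are absent from `available`.

    Both arguments are hypothetical: `requested` is what the user asked to
    build, `available` is every component name found in the system.
    """
    available = set(available)
    return [name for name in requested if name not in available]


def check_components(requested, available):
    # Raise an error instead of silently falling back to a full build.
    missing = find_missing_components(requested, available)
    if missing:
        raise ValueError('Components not found: %s' % ', '.join(missing))
```

The point of the sketch is that the check is made per requested component, so a single unknown name is enough to stop the build.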
* Ignore BuildProgress messages [Richard Ipsum, 2015-04-29, files: 1, lines: -1/+0]
|   Once building starts we close the json machine on the initiator, but we may have received build progress events between processing our build-started event and closing the json machine. Since there is no nice way to tell the different types of build progress apart (they all use BuildProgress), we will ignore all BuildProgress messages for now.
|
|   A possible fix for this is to introduce GraphProgress messages so that we can report the building of the graph without reporting other types of BuildProgress ("Waiting for worker" or "Transferring artifact to cache") that we're not interested in.
|
|   Note that we will still report build failures or build success, so if there's a mistake in the definitions this will be reported before the detach can occur. Similarly, if the system is already built this will be reported before the detach happens.
|
|   Change-Id: Ia006ccfba826d2c91f4dea6c028ecdcb5a2b02d6
* Remove n_state_machines_of_type function [Richard Ipsum, 2015-04-29, files: 2, lines: -4/+2]
|   Change-Id: Icfc3d1aa125196e208d7ac35f43f06c5f5a21ba4
* distbuild: Add distbuild status command [Lauren Perry, 2015-04-29, files: 6, lines: -10/+100]
|   Adds a command to get the status of all recently run distbuilds for a given server (e.g. Running, Finished, Failed, Cancelled), so that you can tell whether a build started via distbuild-start has finished or otherwise exited without going through the server's log files.
|
|   Change-Id: I5ce9fe54ae7b1bd8fe3e0d629f615042be8827ed
* distbuild: Add distbuild start and cancel functionality [Lauren Perry, 2015-04-29, files: 5, lines: -8/+221]
|   Add command for distbuild-start to build_plugin in morphlib, and create a boolean parameter to inform the initiator whether to disconnect the controller and leave the build running remotely.
|
|   Add distbuild-cancel command to parse currently-running distbuild build-request IDs and cancel the one matching the given argument.
|
|   Change-Id: I458a5767bb768ceb2b4d8876adf1c86075d452bd
* distbuild: Add protocol version checking for list-jobs command [Lauren Perry, 2015-04-29, files: 3, lines: -15/+30]
|   Currently, the distbuild-list-jobs command will fail if morph is outdated (i.e. the protocol versions of the client and the distbuild network don't match); a protocol_version field has been added to the list-jobs request message to fix this.
|
|   Moved the version check outside the build-request message handling to reduce duplication in new functions.
|
|   Generalised the list-request output to reduce duplication for any further additions that may require a message output.
|
|   Change-Id: I28e733cbfe8c89e8c11427df5d40ab275abd313c
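A shared version check of the kind described above might look roughly like this. Only the `protocol_version` field name comes from the commit message; the constant's value, the function name and the error wording are assumptions for illustration.

```python
PROTOCOL_VERSION = 1  # hypothetical value, not taken from morph


def check_protocol_version(message, expected=PROTOCOL_VERSION):
    """Return None if the message's version matches, else an error string.

    Keeping this outside any one message handler lets build-request and
    list-jobs handling share the same check.
    """
    received = message.get('protocol_version')
    if received == expected:
        return None
    return ('Protocol version mismatch: expected %s, got %s'
            % (expected, received))
```

A message with no `protocol_version` field at all is reported as a mismatch, which is what an outdated client would send.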
* distbuild: Fix NameError when worker disconnects [Sam Thursfield, 2015-04-28, files: 1, lines: -1/+1]
|   Change-Id: Ifdaa92c209a4ca488c4447911bef9b1bf7d61438
* Make distbuild use an ArtifactReference not an Artifact internally when building [Adam Coldrick, 2015-04-24, files: 2, lines: -28/+30]
|   We no longer serialise entire artifacts, so the output of deserialise_artifact is an ArtifactReference. This commit changes the distbuild code to deal with that rather than an Artifact.
|
|   Change-Id: I79b40d041700a85c25980e3bd70cd34dedd2a113
* Don't serialise the entire build graph [Adam Coldrick, 2015-04-24, files: 2, lines: -219/+131]
|   The controller no longer needs to know everything about an artifact as the workers can calculate the build graph themselves quickly. This reduces the amount of data which needs to be serialised by serialise-artifact, making the yaml dump quicker.
|
|   Change-Id: I6bd0bed14c2efb2f499e9d6f0a97e6188353121a
* distbuild: Add test suite for distbuild-helper [Sam Thursfield, 2015-04-22, files: 4, lines: -5/+70]
|   This is mostly to check that the 'cancel entire subprocess tree' works as expected. Revert that patch and the test fails.
|
|   There are also some tweaks included in this commit.
|
|   Change-Id: If297522e6589ebb3a07dac66a39eb243789e53aa
* distbuild: Don't create a directory for build output until we get some [Sam Thursfield, 2015-04-21, files: 1, lines: -9/+17]
|   Currently, it leaves around empty directories called build-00, build-01, etc. when you run a distbuild that fails to get as far as building something, which is annoying.
|
|   Change-Id: Id3466e248c327dedaf973bc2fe22d42e5c5570d4
* distbuild: Kill the whole process tree when cancelling a build [Sam Thursfield, 2015-04-21, files: 1, lines: -2/+4]
|   We discovered a case where a user of distbuild began a build of 'qtbase', then cancelled it 2 minutes in. The `morph worker-build` process didn't exit for over an hour -- it ran right through until the chunk artifacts had been created. Then it exited with code -9 (SIGKILL).
|
|   This seems to be due to the fact that SIGKILL doesn't kill subprocesses, and so any file descriptors the subprocesses have open will remain open.
|
|   If we set up the `morph worker-build` process as a process group leader, using os.setpgid(), then we can use os.killpg() to kill the entire process group. This should ensure that the `morph worker-build` command exits straight away, as all of its subprocesses will be killed at the same time it is.
|
|   Change-Id: I38707d18004d8c5bc994fd0cb99e90fd5def58e4
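The process-group approach can be sketched as follows. This is an illustration for POSIX systems, not the code from the commit: the helper names are hypothetical, and only `os.setpgid()` and `os.killpg()` are named in the commit message.

```python
import os
import signal
import subprocess


def start_in_new_process_group(argv):
    # os.setpgid(0, 0) runs in the child just before exec, making the child
    # the leader of a new process group that its own children will inherit.
    return subprocess.Popen(argv, preexec_fn=lambda: os.setpgid(0, 0))


def kill_process_group(process):
    # The child is the group leader, so its PID is also the process group ID.
    os.killpg(process.pid, signal.SIGKILL)
    # A negative return code means the process was killed by that signal.
    return process.wait()
```

For example, `kill_process_group(start_in_new_process_group(['sh', '-c', 'sleep 30 & wait']))` signals both the shell and its `sleep` child, rather than leaving the child running.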
* distbuild: Move SubprocessEventSource into its own module [Sam Thursfield, 2015-04-21, files: 2, lines: -0/+106]
|   Previously it was only available in the distbuild-helper program. Moving it to its own module means we can test it and reuse it.
|
|   This commit also adds a docstring to the class.
|
|   Change-Id: Iaf7854048cf0ff463a87894f1f500cdcb6a34d8b
* distbuild: Fix log message when listening for connections [Sam Thursfield, 2015-04-21, files: 1, lines: -1/+1]
|   A log message was printing the 'remote name' of a socket that was listening for connections. There isn't one, so the message always shows this:
|
|       2015-04-14 17:05:19 INFO Binding socket to sam-jetson-mason
|       2015-04-14 17:05:19 INFO Listening at None
|
|   Print the local name instead:
|
|       2015-04-14 17:05:19 INFO Binding socket to sam-jetson-mason
|       2015-04-14 17:05:19 INFO Listening at 10.24.2.125:7878
|
|   Change-Id: I22c1bbe8c9f78ef63e587b6ace516afc861fae0f
* distbuild: Add distbuild-list-jobs function [Lauren Perry, 2015-04-17, files: 5, lines: -26/+116]
|   Add InitiatorListJobs class and list-jobs message template, add distbuild-list-jobs to the morph command list, send running job information back to the initiator, split out handling of build-request and list-jobs messages into separate functions, and use a UUID instead of a random integer for message identification.
|
|   Change-Id: Id02604f2c1201dbc10f6bbd7f501b8ce1ce0deae
* distbuild: Remove unneeded debugging statement [Sam Thursfield, 2015-04-09, files: 1, lines: -6/+0]
|   A JsonMachine object can be set to log all messages that it sends, so we don't need to handle it in the WorkerConnection class as well.
|
|   Change-Id: Idfdc06953363a016708b5dda50c978eb93b1113c
* distbuild: Disable extra message debugging in worker log files [Sam Thursfield, 2015-04-09, files: 1, lines: -1/+0]
|   Worker log files are overly verbose with this enabled: each message is dumped 6 times:
|
|       2015-03-19 11:00:11 DEBUG JsonMachine: Received: '"{...}\\n"\n'
|       2015-03-19 11:00:11 DEBUG JsonMachine: line: '"{...}\\n"'
|       2015-03-19 11:00:11 DEBUG JsonRouter: got msg: {...}
|       2015-03-19 11:00:11 DEBUG JsonMachine: Sending message {...}
|       2015-03-19 11:00:11 DEBUG JsonMachine: As '"{...}\\n"'
|       2015-03-19 11:00:11 DEBUG JsonRouter: sent to client: {...}
|
|   With this setting disabled, the message is only logged by the JsonRouter class, so appears only twice:
|
|       2015-03-19 11:00:11 DEBUG JsonRouter: got msg: {...}
|       2015-03-19 11:00:11 DEBUG JsonRouter: sent to client: {...}
|
|   We've not seen any issues with message encoding/decoding recently so I think it's safe to disable this debugging output by default.
|
|   Change-Id: I7d22ed29e81d6c594cb2c639abf3b40bfb27e3ad
* distbuild: Make 'Current jobs' log message more useful [Sam Thursfield, 2015-04-09, files: 1, lines: -2/+11]
|   It's good to know which jobs are in progress and which are queued, when reading morph-controller.log.
|
|   Old output:
|
|       2015-04-09 10:40:58 DEBUG Current jobs: ['3f647933a1effbb128c857225ba77e9aa775d92314ef0acf3e58e084a7248c73.chunk.stage1-binutils-misc', 'd7279e4179a31d8a3a98c27d5b01ad1bb7387c7fab623fee1086ab68af2784bb.chunk.stage2-fhs-dirs-misc']
|
|   New output:
|
|       2015-04-09 10:40:58 DEBUG Current jobs: ['3f647933a1effbb128c857225ba77e9aa775d92314ef0acf3e58e084a7248c73.chunk.stage1-binutils-misc (given to worker1:3434)', 'd7279e4179a31d8a3a98c27d5b01ad1bb7387c7fab623fee1086ab68af2784bb.chunk.stage2-fhs-dirs-misc (given to worker2:3434)']
|
|   Change-Id: Ie89e6723b0da5f930813591a3166301fd3966804
* distbuild: Fix issues in build cancellation [Sam Thursfield, 2015-04-02, files: 1, lines: -8/+13]
|   A cancel during the 'graphing' or 'annotating' stages would be ignored as the BuildController was listening for the InitiatorDisconnect message from the wrong event source.
|
|   In the 'building' state the actual build would be stopped, but the BuildController instance would stick around because the message class was sent instead of an instance of the message.
|
|   Change-Id: I222a8aa39bf7fffab4d89e12997ffd18cd1b54fc
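The class-versus-instance mistake can be shown in isolation. This is a sketch under assumptions: the stand-in class below only borrows the `InitiatorDisconnect` name, its constructor argument is hypothetical, and matching events with `isinstance()` is an assumed dispatch mechanism rather than a description of distbuild's state machines.

```python
class InitiatorDisconnect(object):
    """Stand-in for a message type; the real class may take other arguments."""

    def __init__(self, initiator_id):
        self.initiator_id = initiator_id


def is_disconnect(event):
    # A handler that matches on instances never fires for the class object.
    return isinstance(event, InitiatorDisconnect)
```

Here `is_disconnect(InitiatorDisconnect(1))` is true, while `is_disconnect(InitiatorDisconnect)` is false, so a receiver waiting for the event would never see it.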
* Implement partial distbuilds [Adam Coldrick, 2015-04-02, files: 3, lines: -26/+96]
|   In addition to partial builds we also want to be able to do partial distbuilds, and distbuild uses a different codepath.
|
|   This commit updates the distbuild code to know what to do if a partial build is requested. It only builds up to the latest chunk/stratum that was requested, and displays where to find the artifacts for each of the chunks/strata requested upon completion of the build.
|
|   The usage is the same as for local builds.
|
|   Change-Id: I0537f74e2e65c7aefe5e71795f17999e2415fce5
* Use python3 compatible notation for catching exceptions [Javier Jardón, 2015-03-16, files: 4, lines: -6/+6]
|   Change-Id: Ibda7a938cd16e35517a531140f39ef4664d85c72
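For reference, the notation in question is `except ... as ...`, which works in Python 2.6+ and Python 3. The function below is a made-up example, not code from the commit.

```python
def describe_failure(text):
    """Return an error description if `text` is not an integer, else None."""
    try:
        int(text)
    except ValueError as e:
        # The older form `except ValueError, e:` is a syntax error in Python 3.
        return 'bad integer: %s' % e
    return None
```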
* Use the modern way of the GPL copyright header: URL instead real address [Javier Jardón, 2015-03-16, files: 29, lines: -80/+51]
|   Change-Id: I992dc0c1d40f563ade56a833162d409b02be90a0
* Fix how the morph protocol version error message is displayed [Lauren Perry, 2015-03-11, files: 1, lines: -2/+2]
|
* Merge branch 'sam/distbuild-build-logs' [Sam Thursfield, 2015-03-11, files: 8, lines: -127/+134]
|\
| |   Reviewed-By: Adam Coldrick <adam.coldrick@codethink.co.uk>
| |   Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
| * distbuild: Log in build-step-xx.log files when initiator cancels build [Sam Thursfield, 2015-02-18, files: 2, lines: -15/+22]
| |   This makes it easier to spot if an incomplete build was due to the user cancelling, or if it represents a dropped connection or internal error.
| * distbuild: Remove the build-steps message [Sam Thursfield, 2015-02-18, files: 5, lines: -53/+6]
| |   This message was hundreds of kilobytes in size, as it contained a recursive list of dependencies for each artifact in the build graph. It was used in the initiator only to print this message:
| |
| |       Build steps in total: 592
| |
| |   This message is now gone. The 'Need to build %d artifacts' build-progress message now indicates the total build steps instead:
| |
| |       Need to build 300 artifacts, of 592 total
| |
| |   This is a compatible change to the distbuild protocol: old initiators will continue to work as normal with new controllers that don't send the build-steps message.
| * distbuild: Create a new directory to store build logs for each build. [Sam Thursfield, 2015-02-18, files: 1, lines: -1/+20]
| |   It gets messy having hundreds of build-step-xx.log files in the current directory, and if two builds are run in parallel from the same directory the logs for a given chunk will be mixed together in one file.
| |
| |   Now, a new directory named build-0, build-1, build-2 etc is created for each new build.
| |
| |   If the user passes --initiator-step-output-dir the logs will be placed in that directory, instead. This behaviour is the same as before.
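One way to pick the next free `build-N` directory is sketched below. The function name and its parameters are hypothetical; only the `build-0`, `build-1`, ... naming comes from the commit message.

```python
import errno
import os


def create_build_directory(parent='.', prefix='build'):
    """Create and return the first unused `<prefix>-N` directory in `parent`."""
    n = 0
    while True:
        path = os.path.join(parent, '%s-%d' % (prefix, n))
        try:
            # mkdir fails if the name is taken, so two builds started in
            # parallel cannot both claim the same directory.
            os.mkdir(path)
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise
            n += 1
        else:
            return path
```

Trying `os.mkdir()` and handling the failure, rather than testing for existence first, is a design choice that avoids a race between parallel builds in the same directory.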
| * distbuild: Use source name, not artifact name, for build step logs [Sam Thursfield, 2015-02-18, files: 1, lines: -1/+1]
| |   Users build sources, not artifacts. So the log files should be called build-step-systemd.log and not build-step-systemd-misc.log.
| |
| |   Note strata are a kind of special case so you will still see build-step-foundation-runtime.log, build-step-foundation-devel.log etc.
| * distbuild: Fix build logs being sent to the wrong log files [Sam Thursfield, 2015-02-18, files: 1, lines: -54/+81]
| |   For a while we have seen an issue where output from build A would end up in the log file of some other random chunk.
| |
| |   The problem turns out to be that the WorkerConnection class in the controller-daemon assumes cancellation is instantaneous. If a build was cancelled, the WorkerConnection would send a cancel message for the job it was running, and then start a new job. However, the worker-daemon process would have a backlog of exec-output messages and a delayed exec-response message from the old job. The controller would receive these and would assume that they were for the new job, without checking the job ID in the messages. Thus they would be sent to the wrong log file.
| |
| |   To fix this, the WorkerConnection class now tracks jobs by job ID, and the code should be generally more robust when unexpected messages are received.
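Tracking jobs by ID, so that late output from a cancelled job is dropped rather than misfiled, can be sketched like this. It is an illustration only: the `JobRouter` class and the `id` and `stdout` message fields are assumptions, not the WorkerConnection implementation.

```python
class JobRouter(object):
    """Route worker output to per-job logs, keyed by job ID (hypothetical)."""

    def __init__(self):
        self._logs = {}  # job ID -> list of output chunks

    def start_job(self, job_id):
        self._logs[job_id] = []

    def cancel_job(self, job_id):
        # Forget the job; late messages carrying this ID will be dropped.
        self._logs.pop(job_id, None)

    def handle_output(self, message):
        log = self._logs.get(message['id'])
        if log is None:
            return False  # stale or unknown job: ignore the message
        log.append(message['stdout'])
        return True

    def output_for(self, job_id):
        return ''.join(self._logs[job_id])
```

After `cancel_job('a')` and `start_job('b')`, a delayed message with ID `'a'` is ignored instead of being appended to the log for `'b'`.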
| * Update copyright years [Sam Thursfield, 2015-02-18, files: 4, lines: -4/+4]
| |
* | Merge branch 'sam/distbuild-worker-disconnect' [Sam Thursfield, 2015-03-03, files: 1, lines: -8/+35]
|\ \
| | |   Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
| | |   Reviewed-By: Francisco Redondo Marchena <francisco.marchena@codethink.co.uk>
| | |   Reviewed-By: Mike Smith <mike.smith@codethink.co.uk>
| * | distbuild: Be more robust when a worker disconnects [Sam Thursfield, 2015-02-03, files: 1, lines: -8/+35]
| | |   The logic to handle a worker disconnecting was broken. The WorkerConnection object would remove itself from the main loop as soon as the worker disconnected. But it would not get removed from the list of available workers that the WorkerBuildQueue maintains. So the controller would continue sending messages to this dead connection, and the builds it sent would hang forever waiting for a response.
* | | Add protocol versioning for distbuild systems [Lauren Perry, 2015-02-25, files: 3, lines: -3/+24]
| | |
* | | Merge remote-tracking branch 'lauren/baserock/lauren/distbuild-invalid-input-crash' [Sam Thursfield, 2015-02-20, files: 2, lines: -13/+22]
|\ \ \
| |_|/
|/| |
| | |   Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
| | |   Reviewed-By: Sam Thursfield <sam.thursfield@codethink.co.uk>
| * | Update copyright years [Lauren Perry, 2015-02-09, files: 2, lines: -2/+2]
| | |
| * | Fix distbuild controller crashing on some invalid inputs [Lauren Perry, 2015-02-09, files: 2, lines: -11/+20]
| |/
* | Fix copyright years [Sam Thursfield, 2015-02-11, files: 3, lines: -3/+3]
| |
* | distbuild: Give more detail when requests to cache-server fail [Sam Thursfield, 2015-02-11, files: 1, lines: -2/+3]
| |   Let the end-user see the URL that distbuild was attempting to talk to, so they can more easily spot configuration errors. It's kind of silly to say 'HTTP request failed' without saying where the request was going.
* | distbuild: Simplify error when computing build graph fails [Sam Thursfield, 2015-02-11, files: 1, lines: -2/+1]
| |   The previous error looked like this by the time it had reached the initiator's console:
| |
| |       ERROR: Failed to build baserock:baserock/definitions c7292b7c81cdd7e5b9e85722406371748453c44f systems/base-system-x86_64-generic.morph.frodsham: Failed to compute build graph. Problem with serialise-artifact: ERROR: Couldn't find morphology: systems/base-system-x86_64-generic.morph.frodsham
| |
| |   New message is at least a bit simpler:
| |
| |       ERROR: Failed to build baserock:baserock/definitions c7292b7c81cdd7e5b9e85722406371748453c44f systems/base-system-x86_64-generic.morph.frodsham: ERROR: Couldn't find morphology: systems/base-system-x86_64-generic.morph.frodsham
* | distbuild: Fix case where 'computing build graph' would hang forever [Sam Thursfield, 2015-02-11, files: 2, lines: -1/+7]
| |   If there's no distbuild-helper process running on the controller then the controller would hang forever. This situation is unlikely, but it's important to give the user feedback instead of silently hanging forever.
* | distbuild: Simplify failure cases in BuildController [Sam Thursfield, 2015-02-11, files: 1, lines: -43/+19]
| |   There's no need to handle failure differently at each stage of the build. It is simpler to use the BuildFailed message for all errors. This then allows us to have a single self.fail() function that can be used everywhere.
* | distbuild: Rearrange code that sends exec-request message [Sam Thursfield, 2015-02-11, files: 1, lines: -8/+13]
| |
* | distbuild: Write name of build worker to build-step-xx.log files [Sam Thursfield, 2015-02-11, files: 1, lines: -15/+33]
|/
|   Knowing which worker built something is useful for debugging, and right now that information is only present on the initiator's console. It's good to have it in the build-step-xx.log file too so the information doesn't get lost.
* Fix lines longer than 79 characters [Sam Thursfield, 2014-10-29, files: 1, lines: -1/+2]
|
* Fix distbuild to allow passing a commit instead of a named ref to be built [Sam Thursfield, 2014-10-27, files: 3, lines: -6/+21]
|   The recent changes to the BuildCommand.build() function caused distbuild to break, because I didn't make the same change to the InitiatorBuildCommand.build() function but did change how it was called.
|
|   This commit adds the ability to have optional fields in distbuild messages. This is used to add an optional 'original_ref' field, which will get passed to `morph serialise-artifact` by new distbuild controllers, and will be ignored by older ones.
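Optional message fields could be handled along these lines. This is a sketch under assumptions: the `validate_message` function and the `repo` and `ref` field names in the usage note are hypothetical, and only `original_ref` comes from the commit message.

```python
def validate_message(message, required, optional=()):
    """Return a copy of `message` holding only the known fields.

    Missing required fields are an error; missing optional fields become
    None; unknown fields are dropped, so a receiver that does not know
    about a newer field simply ignores it.
    """
    missing = sorted(set(required) - set(message))
    if missing:
        raise ValueError('missing fields: %s' % ', '.join(missing))
    result = dict((name, message[name]) for name in required)
    for name in optional:
        result[name] = message.get(name)
    return result
```

For example, `validate_message({'repo': 'r', 'ref': 'm'}, ['repo', 'ref'], ['original_ref'])` accepts a message from an older sender and fills in `original_ref` as `None`.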
* distbuild: serialize dependent sources of graph [Richard Maw, 2014-10-08, files: 2, lines: -24/+46]
|
* Allow distbuilds to choose where to put logs [Richard Maw, 2014-10-08, files: 1, lines: -1/+7]
|
* distbuild: yaml-encode messages before json encoding [Richard Maw, 2014-10-08, files: 1, lines: -2/+8]
|   JSON can only handle unicode strings, but commands can write anything to stdout/stderr, so we do the same trick as for the serialise, and json encode yaml.
* Allow distbuilding morphologies with binary data embedded [Richard Maw, 2014-10-08, files: 1, lines: -9/+10]
|   The horrible json.dumped, yaml dump is because we need it to be both binary safe (which yaml gives us) and one line per message (which json gives us).
* Fix issues with distbuild caused by moving to building per-source [Richard Maw, 2014-10-08, files: 4, lines: -22/+26]
|