path: root/distbuild
Commit message [Author, Age, Files, Lines]
* log job stuff (baserock/adamcoldrick/wip-distbuild-fixes) [Adam Coldrick, 2015-03-27, 2, -2/+2]
|   Change-Id: Idb57f1032e24847ba20cef30476bb6a0b438a896
* readable [Adam Coldrick, 2015-03-27, 1, -0/+1]
|   Change-Id: I5dc199b918acc95b962caaee9a13c6f091d72df4
* Keep hunting [Adam Coldrick, 2015-03-27, 2, -4/+5]
|   Change-Id: Ib7332a7e4848c7eb8ad5f602653d744632f37acc
* NO [Adam Coldrick, 2015-03-26, 1, -5/+5]
|   Change-Id: I2c92ff5040779e3a14daba811333291cb7c2cfec
* Try using event.job [Adam Coldrick, 2015-03-26, 1, -4/+4]
|   Change-Id: I0fdc3b438689693f66b74a41e64ad705834385d7
* Use python3 compatible notation for catching exceptions [Javier Jardón, 2015-03-16, 4, -6/+6]
|   Change-Id: Ibda7a938cd16e35517a531140f39ef4664d85c72
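The commit above does not show its diff here. Presumably it replaces the Python 2-only `except SomeError, e:` form with `except SomeError as e:`, which is accepted by Python 2.6+ and Python 3. A minimal sketch of that notation, using a hypothetical `parse_port` helper that is not part of distbuild:

```python
def parse_port(text):
    """Return text as an integer port number, or None if it is not one."""
    try:
        return int(text)
    except ValueError as e:  # valid on Python 2.6+ and Python 3
        # The older spelling `except ValueError, e:` is a SyntaxError
        # on Python 3, which is why the `as` form is preferred.
        print('ignoring bad port %r: %s' % (text, e))
        return None
```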
* Use the modern way of the GPL copyright header: URL instead real address [Javier Jardón, 2015-03-16, 29, -80/+51]
|   Change-Id: I992dc0c1d40f563ade56a833162d409b02be90a0
* Fix how the morph protocol version error message is displayed [Lauren Perry, 2015-03-11, 1, -2/+2]
|
* Merge branch 'sam/distbuild-build-logs' [Sam Thursfield, 2015-03-11, 8, -127/+134]
|\
| |   Reviewed-By: Adam Coldrick <adam.coldrick@codethink.co.uk>
| |   Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
| * distbuild: Log in build-step-xx.log files when initiator cancels build [Sam Thursfield, 2015-02-18, 2, -15/+22]
| |   This makes it easier to spot if an incomplete build was due to the user cancelling, or if it represents a dropped connection or internal error.
| * distbuild: Remove the build-steps message [Sam Thursfield, 2015-02-18, 5, -53/+6]
| |   This message was hundreds of kilobytes in size, as it contained a recursive list of dependencies for each artifact in the build graph. It was used in the initiator only to print this message:
| |
| |       Build steps in total: 592
| |
| |   This message is now gone. The 'Need to build %d artifacts' build-progress message now indicates the total build steps instead:
| |
| |       Need to build 300 artifacts, of 592 total
| |
| |   This is a compatible change to the distbuild protocol: old initiators will continue to work as normal with new controllers that don't send the build-steps message.
| * distbuild: Create a new directory to store build logs for each build. [Sam Thursfield, 2015-02-18, 1, -1/+20]
| |   It gets messy having hundreds of build-step-xx.log files in the current directory, and if two builds are run in parallel from the same directory the logs for a given chunk will be mixed together in one file.
| |
| |   Now, a new directory named build-0, build-1, build-2 etc is created for each new build.
| |
| |   If the user passes --initiator-step-output-dir the logs will be placed in that directory, instead. This behaviour is the same as before.
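One way to get the build-0, build-1, build-2 naming described above is to try each name in turn and let `os.mkdir` decide whether it is free. This is only a sketch under that assumption; `create_build_log_dir` is a hypothetical name and the commit's actual code may differ.

```python
import errno
import os


def create_build_log_dir(parent, prefix='build-'):
    """Create and return the first unused directory parent/build-N."""
    n = 0
    while True:
        path = os.path.join(parent, '%s%d' % (prefix, n))
        try:
            # mkdir either creates the directory or fails, so two builds
            # started at the same moment cannot both claim the same name.
            os.mkdir(path)
            return path
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise  # a real problem, e.g. permission denied
            n += 1  # name already taken, try the next number
```

Relying on the `EEXIST` error rather than checking `os.path.exists()` first avoids a race between parallel builds started from the same directory.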
| * distbuild: Use source name, not artifact name, for build step logs [Sam Thursfield, 2015-02-18, 1, -1/+1]
| |   Users build sources, not artifacts. So the log files should be called build-step-systemd.log and not build-step-systemd-misc.log.
| |
| |   Note strata are a kind of special case so you will still see build-step-foundation-runtime.log, build-step-foundation-devel.log etc.
| * distbuild: Fix build logs being sent to the wrong log files [Sam Thursfield, 2015-02-18, 1, -54/+81]
| |   For a while we have seen an issue where output from build A would end up in the log file of some other random chunk.
| |
| |   The problem turns out to be that the WorkerConnection class in the controller-daemon assumes cancellation is instantaneous. If a build was cancelled, the WorkerConnection would send a cancel message for the job it was running, and then start a new job. However, the worker-daemon process would have a backlog of exec-output messages and a delayed exec-response message from the old job. The controller would receive these and would assume that they were for the new job, without checking the job ID in the messages. Thus they would be sent to the wrong log file.
| |
| |   To fix this, the WorkerConnection class now tracks jobs by job ID, and the code should be generally more robust when unexpected messages are received.
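The idea of tracking jobs by job ID can be sketched as follows. This is not the WorkerConnection code: `JobLogRouter` and the `'id'` and `'stdout'` field names are assumptions made for illustration. The point is only that a late message is matched against the set of jobs still considered live, and dropped if it matches none.

```python
class JobLogRouter(object):
    """Route worker output to per-job logs, keyed by job ID (illustrative)."""

    def __init__(self):
        self._logs = {}  # job ID -> list of output chunks for that job

    def start_job(self, job_id):
        self._logs[job_id] = []

    def cancel_job(self, job_id):
        # Forget the job at once; its late messages will no longer match.
        self._logs.pop(job_id, None)

    def handle_exec_output(self, msg):
        """Store output for a live job; return False for stale messages."""
        job_id = msg.get('id')
        if job_id not in self._logs:
            return False  # backlog from a cancelled or unknown job: drop it
        self._logs[job_id].append(msg.get('stdout', ''))
        return True

    def log_for(self, job_id):
        return ''.join(self._logs[job_id])
```

With this shape, output that arrives after a cancel is discarded instead of being appended to whichever job happens to be running next.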
| * Update copyright years [Sam Thursfield, 2015-02-18, 4, -4/+4]
| |
* | Merge branch 'sam/distbuild-worker-disconnect' [Sam Thursfield, 2015-03-03, 1, -8/+35]
|\ \
| | |   Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
| | |   Reviewed-By: Francisco Redondo Marchena <francisco.marchena@codethink.co.uk>
| | |   Reviewed-By: Mike Smith <mike.smith@codethink.co.uk>
| * | distbuild: Be more robust when a worker disconnects [Sam Thursfield, 2015-02-03, 1, -8/+35]
| | |   The logic to handle a worker disconnecting was broken. The WorkerConnection object would remove itself from the main loop as soon as the worker disconnected. But it would not get removed from the list of available workers that the WorkerBuildQueue maintains. So the controller would continue sending messages to this dead connection, and the builds it sent would hang forever for a response.
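A rough sketch of the bookkeeping involved: when a worker goes away it has to leave the list of available workers too, otherwise the queue keeps handing out a dead connection. `WorkerPool` and its method names are hypothetical and far simpler than the real WorkerBuildQueue.

```python
class WorkerPool(object):
    """Keep a list of workers that are free to take a job (illustrative)."""

    def __init__(self):
        self._available = []

    def add(self, worker):
        self._available.append(worker)

    def on_disconnect(self, worker):
        # Drop the worker from the available list so that no further
        # jobs are handed to a connection that has gone away.
        if worker in self._available:
            self._available.remove(worker)

    def next_worker(self):
        """Return a free worker, or None if there is none."""
        if self._available:
            return self._available.pop(0)
        return None
```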
* | | Add protocol versioning for distbuild systems [Lauren Perry, 2015-02-25, 3, -3/+24]
| | |
* | | Merge remote-tracking branch 'lauren/baserock/lauren/distbuild-invalid-input-crash' [Sam Thursfield, 2015-02-20, 2, -13/+22]
|\ \ \
| |_|/
|/| |
| | |   Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
| | |   Reviewed-By: Sam Thursfield <sam.thursfield@codethink.co.uk>
| * | Update copyright years [Lauren Perry, 2015-02-09, 2, -2/+2]
| | |
| * | Fix distbuild controller crashing on some invalid inputs [Lauren Perry, 2015-02-09, 2, -11/+20]
| |/
* | Fix copyright years [Sam Thursfield, 2015-02-11, 3, -3/+3]
| |
* | distbuild: Give more detail when requests to cache-server fail [Sam Thursfield, 2015-02-11, 1, -2/+3]
| |   Let the end-user see the URL that distbuild was attempting to talk to, so they can more easily spot configuration errors. It's kind of silly to say 'HTTP request failed' without saying where the request was going.
* | distbuild: Simplify error when computing build graph fails [Sam Thursfield, 2015-02-11, 1, -2/+1]
| |   The previous error looked like this by the time it had reached the initiator's console:
| |
| |       ERROR: Failed to build baserock:baserock/definitions c7292b7c81cdd7e5b9e85722406371748453c44f systems/base-system-x86_64-generic.morph.frodsham: Failed to compute build graph. Problem with serialise-artifact: ERROR: Couldn't find morphology: systems/base-system-x86_64-generic.morph.frodsham
| |
| |   New message is at least a bit simpler:
| |
| |       ERROR: Failed to build baserock:baserock/definitions c7292b7c81cdd7e5b9e85722406371748453c44f systems/base-system-x86_64-generic.morph.frodsham: ERROR: Couldn't find morphology: systems/base-system-x86_64-generic.morph.frodsham
* | distbuild: Fix case where 'computing build graph' would hang forever [Sam Thursfield, 2015-02-11, 2, -1/+7]
| |   If there's no distbuild-helper process running on the controller then the controller would hang forever. This situation is unlikely, but it's important to give the user feedback instead of silently hanging forever.
* | distbuild: Simplify failure cases in BuildController [Sam Thursfield, 2015-02-11, 1, -43/+19]
| |   There's no need to handle failure differently at each stage of the build. Simpler to use the BuildFailed message for all errors. This then allows us to have a single self.fail() function that can be used everywhere.
* | distbuild: Rearrange code that sends exec-request message [Sam Thursfield, 2015-02-11, 1, -8/+13]
| |
* | distbuild: Write name of build worker to build-step-xx.log files [Sam Thursfield, 2015-02-11, 1, -15/+33]
|/
|   Knowing which worker built something is useful for debugging, and right now that information is only present on the initiator's console. It's good to have it in the build-step-xx.log file too so the information doesn't get lost.
* Fix lines longer than 79 characters [Sam Thursfield, 2014-10-29, 1, -1/+2]
|
* Fix distbuild to allow passing a commit instead of a named ref to be built [Sam Thursfield, 2014-10-27, 3, -6/+21]
|   The recent changes to the BuildCommand.build() function caused distbuild to break, because I didn't make the same change to the InitiatorBuildCommand.build() function but did change how it was called.
|
|   This commit adds the ability to have optional fields in distbuild messages. This is used to add an optional 'original_ref' field, which will get passed to `morph serialise-artifact` by new distbuild controllers, and will be ignored by older ones.
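Optional message fields might look roughly like the sketch below. Only the 'original_ref' field name comes from the commit message; `make_message`, the tables, and the other field names are hypothetical and do not reflect the real distbuild protocol code.

```python
# Hypothetical field tables; only 'original_ref' is named in the commit.
_REQUIRED = {'build-request': ['id', 'repo', 'ref', 'morphology']}
_OPTIONAL = {'build-request': ['original_ref']}


def make_message(msg_type, **fields):
    """Build a message dict, insisting on required fields only."""
    required = _REQUIRED[msg_type]
    optional = _OPTIONAL.get(msg_type, [])
    missing = [k for k in required if k not in fields]
    if missing:
        raise KeyError('missing required fields: %s' % ', '.join(missing))
    unknown = [k for k in fields if k not in required and k not in optional]
    if unknown:
        raise KeyError('unknown fields: %s' % ', '.join(sorted(unknown)))
    msg = {'type': msg_type}
    msg.update(fields)
    return msg
```

A receiver that reads the field with `msg.get('original_ref')` works whether or not the sender included it, which is what keeps peers of different ages able to talk to each other.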
* distbuild: serialize dependent sources of graph [Richard Maw, 2014-10-08, 2, -24/+46]
|
* Allow distbuilds to choose where to put logs [Richard Maw, 2014-10-08, 1, -1/+7]
|
* distbuild: yaml-encode messages before json encoding [Richard Maw, 2014-10-08, 1, -2/+8]
|   JSON can only handle unicode strings, but commands can write anything to stdout/stderr, so we do the same trick as for the serialise, and json encode yaml.
* Allow distbuilding morphologies with binary data embedded [Richard Maw, 2014-10-08, 1, -9/+10]
|   The horrible json.dumped, yaml dump is because we need it to be both binary safe (which yaml gives us) and one line per message (which json gives us).
* Fix issues with distbuild caused by moving to building per-source [Richard Maw, 2014-10-08, 4, -22/+26]
|
* Allow ephemeral ports for distbuild services [Richard Maw, 2014-10-02, 1, -1/+1]
|
* distbuild: allow daemons to bind to ephemeral ports [Richard Maw, 2014-10-01, 1, -1/+6]
|   You can bind to an ephemeral port by passing 0 as the port number. To work out which port you actually got, you need to call getsockname().
|
|   To facilitate being able to spawn multiple copies of the daemons for testing environments, you can pass a -file option, which will make the daemon write which port it actually bound to.
|
|   If this path is a fifo, reading from it in the spawner process will allow synchronisation of only spawning services that require that port to be ready after it is.
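The port-0 technique described above can be sketched with the standard `socket` module. `bind_ephemeral` and its `port_file` parameter are hypothetical names chosen for illustration, not the daemon's actual option handling.

```python
import socket


def bind_ephemeral(host='127.0.0.1', port_file=None):
    """Listen on an OS-chosen port; return (socket, port number)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any free port
    sock.listen(5)
    port = sock.getsockname()[1]  # the port that was actually assigned
    if port_file is not None:
        # Written only after listen(), so whoever reads the file can
        # connect straight away. If the path is a fifo, open() blocks
        # until the spawning process opens it for reading.
        with open(port_file, 'w') as f:
            f.write('%d\n' % port)
    return sock, port
```

Writing the port only after the socket is listening is what lets a spawner use the file as a readiness signal for services that depend on that port.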
* Fix and integrate distbuild unit tests [Richard Maw, 2014-10-01, 2, -127/+113]
|
* Fix copyright years of distbuild code. [Sam Thursfield, 2014-09-11, 29, -29/+29]
|
* Fix all distbuild code to be GPLv2 licensed. [Sam Thursfield, 2014-09-10, 11, -21/+141]
|
* Rename for consistency [Richard Ipsum, 2014-08-14, 1, -10/+9]
|
* Remove dead functions [Richard Ipsum, 2014-08-14, 1, -20/+0]
|
* Revert distbuild parts of "Make our use of json binary path safe" (baserock/richardmaw/bugfix/stop-decoding-distbuild-comms) [Richard Maw, 2014-07-18, 3, -6/+6]
|   The "unicode fix" worked for the subset of cases relevant, and only broke distbuild because its tests have not been integrated with ./check, so the fact that it broke for any string ending with a \ escaped notice, if you will excuse the pun.
|
|   During json.load, the encode option is for specifying the character encoding of the file or string that is being loaded. During json.dump, the encode option is for the encoding of `str` keys and values.
|
|   The fact that it worked for the set of cases we cared about is a small mystery, probably caused by the strings we happened to give it being valid unicode-escape encoded `str`ings.
|
|   A full fix would require either converting all these cases to a different format, such as YAML, which will handle input data not being valid Unicode, or pre-processing the data that is passed to `json.dump` to convert all `str` instances to an appropriately escaped `unicode`, and converting back on `json.load`, but this is a quick fix to get the distbuild code working again.
* Fix JSON file handling to be binary filename safe [Lars Wirzenius, 2014-07-15, 3, -6/+6]
|\
| |   Reviewed-by: Lars Wirzenius
| |   Reviewed-by: Pedro Alvarez
| * Make our use of json binary path safe (baserock/richardmaw/bugfix/unicode-safe-json) [Richard Maw, 2014-07-11, 3, -6/+6]
| |   json only accepts unicode. Various APIs such as file paths and environment variables allow binary data, so we need to support this properly.
| |
| |   This patch changes every[1] use of json.load or json.dump to escape non-unicode data strings. This appears exactly as it used to if the input was valid unicode, if it isn't it will insert \xabcd escapes in the place of non-unicode data.
| |
| |   When loading back in, if json.load is told to unescape it with `encoding='unicode-escape'` then it will convert it back correctly.
| |
| |   This change was primarily to support file paths that weren't valid unicode, where this would choke and die. Now it works, but any tools that parsed the metadata need to unescape the paths.
| |
| |   [1]: The interface to the remote repo cache uses json data, but I haven't changed its json.load calls to unescape the data, since the repo caches haven't been made to escape the data.
* | Log the address we attempt to bind [Richard Ipsum, 2014-07-11, 1, -0/+1]
|/
|   This will make it easier to determine what is wrong if the controller daemon is run with a bad controller host address.
* distbuild: Log a message when a socket error is received [Sam Thursfield, 2014-06-19, 1, -0/+4]
|   I found an issue in distbuild where the controller was stuck in a busy loop where it was continually writing to a closed socket. With 'strace' I saw write(), SIGPIPE, write(), SIGPIPE, ad infinitum.
|
|   I got this much of a Python backtrace using GDB:
|
|       distbuild.socketsrc.SocketEventSource.write()
|       distbuild.sockbuf.SocketBuffer._flush()
|       distbuild.sm.StateMachine.handle_event()
|
|   I didn't manage to get further. However, I suspect one of the state machine transitions may be creating an event loop instead of correctly handling the error.
|
|   The log file was quiet at this point, the last entries were:
|
|       2014-06-19 08:57:36 INFO There seems to be nothing to build
|       2014-06-19 08:57:36 INFO Requested artifact is built
|       2014-06-19 08:57:36 DEBUG InitiatorConnection: sent to 10.24.1.215:53818: {'message': 'Need to build 0 artifacts', 'type': 'build-progress', 'id': 790629564}
|       2014-06-19 08:57:36 DEBUG Notifying initiator of successful build
|       2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <BuildController at 0xb6c554c, request-id InitiatorConnection-93>
|       2014-06-19 08:57:36 DEBUG InitiatorConnection: sent to 10.24.1.215:53818: {'type': 'build-finished', 'id': 790629564, 'urls': [u'http://hawkdevtrove:8080/1.0/artifacts?filename=861f640923494ca3626bbd65655b350ce1bebea4c0bf7a57693bc06ed122cef4.system.devel-system-x86_32-chroot-rootfs']}
|       2014-06-19 08:57:36 DEBUG InitiatorConnection: 10.24.1.215:53818: closing: <JsonMachine at 0xc6cb22c: socket 10.24.1.164:7878 -> 10.24.1.215:53818, max_buffer 16384>
|       2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <InitiatorConnection at 0xc6cbcec: remote 10.24.1.215:53818>
|       2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <JsonMachine at 0xc6cb22c: socket 10.24.1.164:7878 -> 10.24.1.215:53818, max_buffer 16384>
|       2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <SocketBuffer at 0xc6cbe2c: socket None max_buffer 16384>
|
|   This commit should improve matters a little: in future the log file will show the ID of the SocketEventSource object and error we hit when calling its write() function.
* Import InitiatorConnectionMachine [Richard Ipsum, 2014-06-11, 1, -1/+2]
|
* Use super [Richard Ipsum, 2014-06-11, 1, -1/+1]
|   This change is made just for consistency.
* Add InitiatorConnectionMachine [Richard Ipsum, 2014-06-11, 1, -5/+36]
|   The InitiatorConnectionMachine wraps the ConnectionMachine, so we can continue to use ConnectionMachine without providing it with an app.