summaryrefslogtreecommitdiff
path: root/morphlib
Commit message (Collapse)AuthorAgeFilesLines
* Transfer sparse files faster for kvm, vbox deploymentbaserock/liw/xfer-holeLars Wirzenius2014-09-045-8/+268
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The KVM and VirtualBox deployments use sparse files for raw disk images. This means they can store a large disk (say, tens or hundreds of gigabytes) without using more disk space than is required for the actual content (e.g., a gigabyte or so for the files in the root filesystem). The kernel and filesystem make the unwritten parts of the disk image look as if they are filled with zero bytes. This is good. However, during deployment those sparse files get transferred as if there really are a lot of zeroes. Those zeroes take a lot of time to transfer. rsync, for example, does not handle large holes efficiently. This change introduces a couple of helper tools (morphlib/xfer-hole and morphlib/recv-hole), which transfer the holes more efficiently. The xfer-hole program reads a file and outputs records like these: DATA 123 binary data (exaclyt 123 bytes and no newline at the end) HOLE 3245 xfer-hole can do this efficiently, without having to read through all the zeroes in the holes, using the SEEK_DATA and SEEK_HOLE arguments to lseek. Using this, the holes take only take a few bytes each, making it possible to transfer a disk image faster. In my benchmarks, transferring a 100G byte disk image took about 100 seconds for KVM, and 220 seconds for VirtualBox (which needs to more work at the receiver to convert the raw disk to a VDI). Both benchmarks were from a VM on my laptop to the laptop itself. The interesting bit here is that the receiver (recv-hole) is simple enough that it can be implemented in a bit of shell script, and the text of the shell script can be run on the remote end by giving it to ssh as a command line argument. This means there is no need to install any special tools on the receiver, which makes using this improvement much simpler.
* Use git's remotes to determine unpushed branchesbaserock/richardmaw/less-ls-remoteRichard Maw2014-09-022-10/+23
| | | | | | | | | | | | | | | | | | | | | | Rather than using `git ls-remote` every time to see if there are changes at the remote end, use a local cache. Git already solves this problem with its refs/remotes/$foo branch namespace, so we can just use that instead. In addition, the detection of which upstream branch to use has been improved; it now uses git's notion of which the upstream branch of your current branch is, which saves effort in the implementation, and allows the name of the local branch to differ from that of the remote branch. This now won't notice if the branch you currently have checked out had commits pushed from another source, but for some use-cases this is preferable, as the result equivalent to if you had built before the other push. It may make sense to further extend this logic to check that the local branch is not ahead of the remote branch, instead of requiring them to be equal.
* Merge branch 'sam/less-cache-key-logging'Sam Thursfield2014-09-021-15/+5
|\ | | | | | | | | Reviewed-By: Richard Maw <richard.maw@codethink.co.uk> Reviewed-By: Lars Wirzenius <lars.wirzenius@codethink.co.uk>
| * Remove some cache key related log messagesSam Thursfield2014-08-291-15/+5
| | | | | | | | | | | | | | | | | | There's no need to log every time we look something up in a dict. This just makes log files huge. The CacheKeyComputer.compute_key() function still logs the first time it calculates a cache key for an Artifact instance. This message now includes the hex string that is used to identify the artifact.
* | deploy: Make Python extensions log debug messages to MORPH_LOG_FD by defaultSam Thursfield2014-09-011-1/+24
| | | | | | | | | | | | | | | | | | Previously logging was disabled for Python deploy extensions. We get a lot of useful information for free in the log file by doing this: the environment will be written when the subprocess starts, and if it crashes the full backtrace will be written there too. Subcommand execution with cliapp.runcmd() will also be logged.
* | deploy: Allow extensions to write to Morph log fileRichard Maw2014-09-012-26/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is achieved by passing it the write end of a pipe, so that the extension has somewhere to write debug messages without clobbering either stdout or stderr. Previously deployment extensions could only display status messages on stdout, which is good for telling the user what is happening but is not useful when trying to do post-mortem debugging, when more information is usually required. This uses the asyncore asynchronous event framework, which is rather specific to sockets, but can be made to work with file descriptors, and has the advantage that it's part of the python standard library. It relies on polling file descriptors, but there's a trick with pipes to use them as a notification that a process has exited: 1. Ensure the write-end of the pipe is only open in the process you want to know when it exits 2. Make sure the pipe's FDs are set in non-blocking read/write mode 3. Call select or poll on the read-end of the file descriptor 4. If select/poll says you can read from that FD, but you get an EOF, then the process has closed it, and if you know it doesn't do that in normal operation, then the process has terminated. It took a few weird hacks to get the async_chat module to unregister its event handlers on EOF, but the result is an event loop that is asleep until the kernel tells it that it has to do something.
* | Add `morph upgrade` command, deprecate `morph deploy --upgrade`Sam Thursfield2014-09-012-6/+28
| | | | | | | | | | | | The arguments to `morph deploy` can get quite long, any way we can make it shorter and clearer is useful. We can also avoid having the strange --no-upgrade flag in future.
* | Clarify that multiple images can be deployed at once by `morph deploy`Sam Thursfield2014-09-011-1/+1
|/
* Merge remote-tracking branch 'origin/baserock/richardmaw/bugfix/http-fail'Richard Maw2014-08-262-5/+12
|\ | | | | | | | | Reviewed-by: Francisco Redondo Marchena Reviewed-by: Sam Thursfield
| * Only autodetect morphology when result is 404baserock/richardmaw/bugfix/http-failRichard Maw2014-08-192-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MorphologyFactory class will use a RemoteRepoCache to see if a morphology file exists, and if it doesn't, uses a file listing to see if it can detect what build-system is uses, hence what the default morphology should be. However, it was overly generic in what error cases it would accept as the morphology not being found, so if the RemoteRepoCache was suddenly un-resolvable for a brief period, then it would assume the morphology didn't exist, and use the default one. This happened to a user, and the result was a full rebuild. So we now fix this by only raising the exception that means the file didn't exist, if we got a HTTP 404.
* | Fix `morph edit` for non-file URIsRichard Maw2014-08-261-1/+1
| | | | | | | | | | | | | | | | | | The clone_into function is non-functional when you pass it a sha1 ref. If you have a file:// URI then this doesn't get used, which is how it slipped past the tests. Reviewed-by: Lars Wirzenius +2
* | Prevent git-replace refs affecting git operationsRichard Maw2014-08-2111-139/+171
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We assumed that the sha1 of the tree of the commit of the ref we care about was sufficient to cache, but `git replace` means that you need to know the state of other branches too, since anything in refs/replace can completely change what the tree you check-out is. This behaviour can be disabled globally by setting GIT_NO_REPLACE_OBJECTS, so we're going to do that. If we need to integrate a project that uses git-replace to change the contents of their git trees then we could support that by: if any(refs/replace/*): potentially_replacable_objects = [ `git rev-parse HEAD`, `git rev-parse HEAD^{commit}`, `git rev-parse HEAD^{tree}`] potentially_replacable_objects.extend( `git ls-tree -r HEAD | awk '{print $3}'`) # NOTE: doesn't handle submodules, and you'd need to expand this # set whenever you process a replacement for object in refs/replace/*: if basename(object) not in potentially_replacable_objects: continue cache_key['replacements'][basename(object)] = `git rev-parse $object` If we were to support this would need to traverse the tree anyway, doing replacements, so we may as well use libgit to do the checkout anyway, and list which replacements were used. However, since the expected use-case of `git replace` is as a better way to do history grafting, we're unlikely to need it, as it would only have any effect if it replaced the commit we were using with a different one. Rubber-stamped-by: Daniel Silverstone
* | deploy: Check correct usage of --upgrade for rawdisk deploymentsSam Thursfield2014-08-191-0/+21
| | | | | | | | | | | | | | | | | | | | This avoids confusion when the user expected to be doing an initial deployment and wasn't aware that a file with the same name as the target already existed. Previously rawdisk.write would try to mount the file and upgrade it. Now we require the user to pass '--upgrade' when they intend to upgrade, as with other deployment extensions.
* | Cache configuration values in GitDirectory.Daniel Silverstone2014-08-191-2/+6
|/ | | | | | | | This reduces the vast number of 'git config -z core.bare' which we used to get a lot of, to one or two per run. It doesn't save much time, but it does make logs less full of confusion. Reviewed-By: Richard maw <richard.maw@codethink.co.uk> +2
* Fix morphloader to unset_defaults for chunksFrancisco Redondo Marchena2014-08-151-2/+13
|
* Update morphloader_test.pyFrancisco Redondo Marchena2014-08-151-12/+13
|
* Set chunk static default to Nonebaserock/franred/fixes-needed-for-organize-definitionsFrancisco Redondo Marchena2014-08-151-12/+12
|
* Add system-integration to chunk _static_defaultsFrancisco Redondo Marchena2014-08-151-0/+1
|
* Add deploy-defaults before deploy in the MorphologyDumper keyorderFrancisco Redondo Marchena2014-08-151-0/+1
|
* Merge branch 'baserock/richardmaw/james/writeexts_support_jetson'James Thomas2014-08-142-22/+83
|\ | | | | | | | | Reviewed by: Richard Maw <richard.maw@codethink.co.uk> Merged by: James Thomas <james.thomas@codethink.co.uk>
| * Merge remote-tracking branch 'origin/baserock/james/writeexts_support_jetson'baserock/richardmaw/james/writeexts_support_jetsonRichard Maw2014-08-122-22/+83
| |\
| | * Add support for a device tree to be set using DTB_PATHbaserock/james/writeexts_support_jetsonJames Thomas2014-08-031-0/+21
| | |
| | * Make bootloader config/install more genericJames Thomas2014-08-031-17/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the BOOTLOADER environment variable and instead favour BOOTLOADER_CONFIG_FORMAT to set the desired bootloader format, and BOOTLOADER_INSTALL to set the type of bootloader to install. For example, since u-boot can boot using extlinux.conf files, it's conceivable that someone might want to do CONFIG_FORMAT=extlinux.conf, INSTALL=u-boot. However, for most platforms you would want to set INSTALL to "none"
| | * Support setting a different root device using ROOT_DEVICEJames Thomas2014-08-031-2/+5
| |/
* | Rename morph3 to morphologyAdam Coldrick2014-08-146-57/+57
| | | | | | | | | | | | Instead of leaving morph3 with a potentially confusing name, rename it to `morphology` since it is now the only implementation of the Morphology class.
* | Revert "Ensure that none of the _orig_* fields are in the cache key"Adam Coldrick2014-08-141-4/+1
| | | | | | | | | | | | | | | | | | This reverts commit 4124c50b8dc3dfb0ffb933153d0fe6385edf389c. This commit is no longer needed now that morph2 is gone. Conflicts: morphlib/cachekeycomputer.py
* | Remove morph2 and its testsAdam Coldrick2014-08-143-705/+0
| | | | | | | | This commit removes the now unneeded morph2 and its associated tests.
* | unittests: Make the unittests use morphloaderAdam Coldrick2014-08-146-297/+254
| | | | | | | | | | This commit removes all use of morph2 from the unittests, replacing it with morphloader/morph3. It also converts the test morphologies to YAML.
* | buildsystem: Generate a Morphology not textAdam Coldrick2014-08-144-30/+29
| | | | | | | | | | Rather than generating the text of a morphology which is later loaded, generate a Morphology object and return that.
* | morphologyfactory: Use morphloader not morph2Adam Coldrick2014-08-142-166/+75
| | | | | | | | | | | | | | | | This commit reworks morphologyfactory to stop using morph2. It removes the validation that is already handled by morphloader, and uses morphloader to load the morphology rather than using morph2. The unit tests are also changed to turn the example morphologies into YAML, and also to check for the correct exceptions.
* | morphloader: Get commands when loading morphologyAdam Coldrick2014-08-144-3/+14
| | | | | | | | | | | | Rather than having a `get_commands` method to obtain missing commands from the build system when they are needed, get the commands when loading a morphology.
* | morphloader: Add and remove some default valuesAdam Coldrick2014-08-142-12/+21
| | | | | | | | | | | | | | | | | | This commit stops morphloader setting the `morph` field in chunk specs, and also makes it set defaults for the prefix and build-mode fields. Not setting the `morph` field is necessary as its presence in chunk specs is used by `traverse_morphs` to mean that the morphology file is in the definitions repository, not the chunk source repository. If we set a default value here, we end up looking for files which do not exist.
* | morphset: Don't expect to be able to find `morph` or `name`Adam Coldrick2014-08-141-3/+5
| | | | | | | | | | | | | | | | Currently, morphloader sets the morph field in chunk specs by default. A later commit will remove this behaviour, so morphset needs to stop expecting the `morph` field to exist in morphology specs. However, stratum specs don't have a `name` field so we can't expect to be able to find that either.
* | show-dependencies: Use sanitise_morphology_path instead of stripping `.morph`Adam Coldrick2014-08-141-2/+2
| |
* | cachedrepo: Remove unused load_morphology methodAdam Coldrick2014-08-142-13/+0
| |
* | Merge branch 'baserock/richardmaw/misc-fixups'baserock/richardmaw/tmpRichard Maw2014-08-125-147/+52
|\ \ | | | | | | | | | | | | Reviewed-by: Daniel Silverstone Reviewed-by: Pedro Alvarez
| * | Fix `morph edit` when repo has the same ref as system branchRichard Maw2014-08-122-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There was a check in it to see whether it needed to do the git branch and git checkout based on whether the name of the branch matched that in the morphology. This had a couple of problems: 1. Now that we aren't always building from HEAD, we need to be able to roll its commit back, so using the existing branch isn't always the best idea. 2. It only checks the "ref" field, not "unpetrify-ref", so even though we clone the right ref in there, it's checking the commit id against the system branch name, so would always try to re-create the branch, and fail when it already exists. So now, we remove the original ref and re-create it with our preferred HEAD. A better solution might be to change the clone logic to not automatically checkout HEAD, and instead require an explicit branch then checkout, but the initial clone logic is shared with build, and I didn't feel like tracking down all the different places that it was used.
| * | Make morph show-branch-root print the pathRichard Maw2014-08-122-15/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | The help for the show-branch-root command said it returns a path, but the command and the yarns just showed the aliased url it was cloned from. Given I found myself needing the path in some scripts, not the repo url, I think it's more useful to reconcile the difference this way.
| * | Remove petrify and unpetrify commandsRichard Maw2014-08-123-127/+0
|/ / | | | | | | | | | | | | We don't use this any more, and instead prefer to always keep definitions.git petrified, and update the refs ourselves. branch-from-image still uses some of the remaining petrify code.
* | Avoid creating and pushing temporary build branches when they aren't necessary.Richard Maw2014-08-124-99/+129
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sorry about the big lump, I can split it into a nicer set of changes, but they didn't naturally emerge in a nice series. This creates a pushed_build_branch context manager, to eliminate code duplication between build and deploy. Rather than the build branch being constructed knowing whether it needs to push the branch, it infers that from the state of the repositories, and whether a local build would be possible. If there are no uncommitted changes and all local branches are pushed, then it doesn't create temporary branches or push them, and instead uses what it already has. It will currently create and use temporary build branches even for chunks that have no local changes, but it's pretty cheap, and doesn't require re-working the build-ref injection code to check whether there are local changes.
* | Treat untracked morphologies as uncommitted changesRichard Maw2014-08-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to coincidentally include morphologies because we later include every morphology in the build graph into our temporary build branches. Since we're now checking whether there's any uncommitted changes before attempting to create a temporary build branch, this means that we can no longer build uncommitted morphologies if they aren't reported as changes. So this patch makes an exception to the untracked changes rule for anything that ends with .morph. It's still confusing that some files aren't included in temporary build branches, but that would cause performance regressions, so we'll limit it to just morphologies for now, until we make a decision on what uncomitted content we care about.
* | Cut BuildBranch out of morphlib.extensionsRichard Maw2014-08-123-30/+20
| | | | | | | | | | | | | | | | | | | | | | This was previously used just so it could get the right repo and ref to read the file out of. However, there was a subtle bug in this behaviour, as if we had not previously used morph build in that branch, it would attempt to read the extensions from a branch which didn't exist. So now it reads it from the working tree, which always exists.
* | Turn BuildBranch methods into regular functionsRichard Maw2014-08-123-18/+28
|/ | | | | | | | | | | | | | | | | | | | | | | | Previously they were generator functions, which yielded interesting context at interesting times so that the caller could respond by printing status messages. The only benefits this had over callbacks were: 1. 1 fewer function scope to worry about. I don't have data on the amount of memory used for a function scope vs a generator, but it could be less cognitive load for determining which variables are defined in the callback's body. 2. It is possible to yield in the caller, so you could make that into a coroutine too, however this wasn't required in this case, as the yielded value was intended to be informational. The downsides to this are: 1. It's a rather peculiar construct, so can be difficult to understand what's going on, and the implications, which led to 2. If you call the function, but don't use the iterator it returned, then it won't do anything, which is very confusing indeed, if you're not used to how generator functions work.
* Merge branch 'baserock/adamcoldrick/no-orig-fields-in-cache-key'Adam Coldrick2014-08-011-1/+4
|\ | | | | | | | | | | | | | | This fixes a bug where distbuild was calculating a different cache local build. It will no longer be needed once morph2 is removed. Reviewed-by: Sam Thursfield Reviewed-by: Richard Maw
| * Ensure that none of the _orig_* fields are in the cache keybaserock/adamcoldrick/no-orig-fields-in-cache-keybaserock/adamcoldrick/no-fake-morphAdam Coldrick2014-08-011-1/+2
| | | | | | | | | | | | | | | | morph2.Morphology adds a number of fields beginning with _orig_, which contain default values. There is no reason for these values to be in the cache key, and this fixes a bug where distbuild created systems with a different cache key than local builds did when the configuration extensions list was empty.
* | Merge remote-tracking branch ↵Lars Wirzenius2014-08-011-2/+2
|\ \ | | | | | | | | | 'remotes/origin/baserock/richardmaw/bugfix/distbuild-eglibc'
| * | StagingArea: Ensure staging are paths are bytestringsbaserock/richardmaw/bugfix/distbuild-eglibcRichard Maw2014-07-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the destdir path returned when creating a staging area is a unicode string, then when attempting to `os.walk(destdir)`, it will encounter unicode errors if there are file paths in the destdir that are not representable as unicode strings. For various as-yet unknown reasons, when building stage-2 eglibc it produces file paths that are not unicode compatible. There was previously a patch to fix this issue with regards to creating the metadata files, but it did not fix all the issues, because the build at the time was local rather than distributed. This is failing during a distributed build because morphologies are serialised into json, and during deserialisation their string values are left as unicode. Rather than doing the byte-string conversion during deserialisation, I have chosen to do it when the contents of the morphology are used, because it's only at the point where it's used to create a file path, that it matters whether it's unicode or not.
* | | Use sanitise_morphology_path in serialise_artifactRichard Ipsum2014-07-311-1/+1
| | |
* | | Dump multi-line strings in yaml documents in '|' formbaserock/richardmaw/bugfix/yaml-multi-line-dumpRichard Maw2014-07-303-3/+60
| |/ |/| | | | | | | | | | | | | This prevents the description fields of morphologies being mangled. This does not preserve the original formatting, so much as happen to dump it in the same way we wrote it, but given we chose that form because we think it looks the nicest, that's not a problem.
* | Merge branch 'baserock/richardmaw/bugfix/fix-list-artifacts'Adam Coldrick2014-07-291-1/+1
|\ \ | | | | | | | | | | | | Reviewed-by: Daniel Silverstone Reviewed-by: Adam Coldrick