path: root/heat/engine/stack.py
Commit message (author, date; files, lines)
* Fix hacking warnings (Andreas Jaeger, 2020-04-16; 1 file, -4/+4)
|   Fix some warnings found by hacking and enable them again.
|   Change-Id: Ia09de4d0fda752a009b9246b4e6d485601cd9562
* Make tags handling more robust (Zane Bitter, 2019-10-10; 1 file, -17/+23)
|   Avoid loading the tags from the DB and then re-saving every time the stack is stored when the stack has tags. Avoid attempting to lazy-load the tags from the DB multiple times when there are no tags. Avoid lazy-loading the existing tags on an update when we intend to overwrite them anyway. Avoid writing the same set of tags multiple times. e.g. in a legacy update, previously we rewrote the current set of tags when changing the state to IN_PROGRESS, then wrote the new set of tags regardless of whether they had changed, then wrote the new set of tags again when the update completed. In a convergence update we also did three writes but in a different order, so that the new tags were written every time. With this change we write the new set of tags only once. This could also have prevented stacks with tags from being updated from legacy to convergence, because the (unchanged) tag list would get rewritten from inside a DB transaction. This is not expected so the stack_tags_set() API does not pass subtransactions=True when creating a transaction, which would cause a DB error.
|   Change-Id: Ia52818cfc9479d5fa6e3b236988694f47998acda Task: 37001
* Merge parameters and templates when resetting stack status (Rabi Mishra, 2019-07-17; 1 file, -14/+30)
|   We keep the new template in the prev_raw_template_id. When setting the stacks to FAILED we should also merge the templates for both the existing and backup stacks.
|   Change-Id: Ic67a4833672d1c562980ee19fd8071f84dd9500a Task: 35842
* Load existing resources using correct environment (Zane Bitter, 2019-04-01; 1 file, -2/+13)
|   In convergence we were loading resources from the database using the current environment. This is incorrect when a previous update has failed, meaning the resources in the database were created with a non-current template and environment. If an attempt was made to change the type of a resource but that resource was never updated, this will result in us loading a resource with the wrong type. If the type has been removed then it can result in errors just trying to show the stack. Note that the Resource.load() method used during a convergence traversal already does the Right Thing - it only uses the new type if it is a valid substitution for the old type, and UpdateReplace is later raised in Resource.update_convergence() if the type does not match that specified in the new environment. So we don't see any problems with stack updates, just with API calls. Since we cannot change the signature of Resource.__new__() without also modifying the signature of __init__() in every resource plugin that has implemented it (many of which are out of tree), instead substitute the stack definition for the duration of creating the Resource object. This will result in stack.env returning the environment the resource was last updated with.
|   Change-Id: I3fbd14324fc4681b26747ee7505000b8fc9439f1 Story: #2005090 Task: 29688
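The temporary substitution described in this commit could look roughly like the following. This is a minimal sketch, not Heat's code: `FakeStack`, its `defn` attribute and `swapped_definition` are hypothetical names chosen for illustration.

```python
import contextlib


class FakeStack:
    """Hypothetical stand-in for a stack; 'defn' holds its current definition."""

    def __init__(self, defn):
        self.defn = defn


@contextlib.contextmanager
def swapped_definition(stack, defn):
    """Temporarily replace stack.defn, restoring the original afterwards."""
    original = stack.defn
    stack.defn = defn
    try:
        yield stack
    finally:
        # Restore even if creating the resource object raised.
        stack.defn = original


stack = FakeStack(defn="current")
with swapped_definition(stack, "as-last-updated"):
    seen_inside = stack.defn  # what a resource constructor would observe
```

A context manager keeps the swap scoped to object creation, so constructor signatures in plugins do not need to change.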
* Merge "Fix SoftwareDeployment on DELETE action" (Zuul, 2019-03-20; 1 file, -5/+5)
|\
| * Fix SoftwareDeployment on DELETE action (Ethan Lynn, 2019-03-13; 1 file, -5/+5)
| |   When we specify a software deployment (SD) for the DELETE action, os-collect-config cannot authenticate because we did not load access_allowed_handlers after the stack entered the delete phase. This patch makes sure we load the necessary access_allowed_handlers even in the stack delete phase.
| |   Change-Id: I43c1a865f507f7cb7757e26ae5c503ce484ee280 Story: #2004661 Task: #28628
* | Improve best existing resource selection (Zane Bitter, 2019-01-29; 1 file, -18/+26)
|/
|   Rank all existing versions of a resource in a convergence stack to improve the likelihood that we find the best one to update. This allows us to roll back to the original version of a resource (or even attempt an in-place update of it) when replacing it has failed. Previously this only worked during automatic rollback; on subsequent updates we would always work on the failed replacement (which inevitably meant attempting another replacement in almost all cases).
|   Change-Id: Ia231fae85d1ddb9fc7b7de4e82cec0c0e0fd06b7 Story: #2003579 Task: 24881
* Streamline conversion of resources to convergence (Zane Bitter, 2018-12-17; 1 file, -18/+13)
|   Use a single write to the database to convert each resource. Add a method to the versioned object class that encapsulates the DB-specific information, and get rid of the Resource.set_requires() classmethod that just calls a method on the versioned object instance that's passed to it.
|   Change-Id: Ieca7e0f0642c38c44fb8d7729333a0ccd93c9cb4
* Merge "Delete db resources not in template" (Zuul, 2018-12-10; 1 file, -1/+10)
|\
| * Delete db resources not in template (rabi, 2018-10-15; 1 file, -1/+10)
| |   When migrating stacks to convergence, if there are resources in the database that are not in the current_template_id of the stack, they are possibly of no use, so it would be better to delete those resources from the db to avoid any future update issues.
| |   Change-Id: Ica99cec6765d22d7ee2262e2d402b2e98cb5bd5e Story: #2004071 Task: 27092
* | Don't depend on string interning (Zane Bitter, 2018-10-12; 1 file, -1/+1)
|/
|   Use '!=' instead of 'is not' to compare strings. In practice, short strings that appear in the source code are interned in CPython, but this is implementation-specific.
|   Change-Id: If3f305c2d647fcd7515cb0a326a30f4eda93acd3
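The difference between value comparison and identity comparison can be illustrated with a short sketch. The function name `status_changed` is hypothetical and not taken from stack.py; the point is only that `!=` gives the intended answer regardless of whether the two strings are the same object.

```python
def status_changed(old_status, new_status):
    """Return True when the two status strings differ in value."""
    # '!=' compares contents; 'is not' would compare object identity,
    # which only happens to work when both strings are interned.
    return old_status != new_status


literal = "COMPLETE"
# Built at runtime, so not guaranteed to be the same object as the literal.
computed = "".join(["COMP", "LETE"])

unchanged = status_changed(literal, computed)
changed = status_changed("FAILED", literal)
```

Whether `literal is computed` holds depends on the interpreter, which is why the example avoids relying on it.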
* Merge "Refactor deferral of stack state persistence" (Zuul, 2018-08-09; 1 file, -15/+29)
|\
| * Refactor deferral of stack state persistence (Zane Bitter, 2018-07-31; 1 file, -15/+29)
| |   When we hold a StackLock, we defer any persistence of COMPLETE or FAILED states in state_set() until we release the lock, to avoid a race on the client side. The logic for doing this was scattered about and needed to be updated together, which has caused bugs in the past. Collect all of the logic into a single implementation, for better documentation and so that nothing can fall through the cracks.
| |   Change-Id: I6757d911a63708a6c6356f70c24ccf1d1b5ec076
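The deferral pattern described in this commit might be sketched as below. This is an illustration under stated assumptions, not the actual implementation: `SketchStack`, its `stored` list (standing in for DB writes) and the `lock()` context manager are hypothetical.

```python
import contextlib


class SketchStack:
    """Hypothetical stack that defers final states while a lock is held."""

    def __init__(self):
        self.stored = []       # stand-in for rows written to the database
        self._locked = False
        self._deferred = None

    def state_set(self, action, status):
        if self._locked and status in ("COMPLETE", "FAILED"):
            # Hold back the final state so clients cannot observe it
            # while the lock is still held.
            self._deferred = (action, status)
            return
        self.stored.append((action, status))

    @contextlib.contextmanager
    def lock(self):
        self._locked = True
        try:
            yield self
        finally:
            self._locked = False
            if self._deferred is not None:
                self.stored.append(self._deferred)
                self._deferred = None


stack = SketchStack()
with stack.lock():
    stack.state_set("UPDATE", "IN_PROGRESS")
    stack.state_set("UPDATE", "COMPLETE")
    during_lock = list(stack.stored)
after_release = list(stack.stored)
```

Keeping the deferral in one place (here, `state_set` plus the lock's exit path) mirrors the commit's goal of a single implementation.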
* | Store resources convergence-style in stack check/suspend/resume (Zane Bitter, 2018-08-06; 1 file, -0/+9)
| |   If there are resources in a template that don't exist in the database at the time of a stack check (or suspend or resume) operation, running the stack check will cause them to be stored in the database. Since these operations have not been converted to convergence (story 1727142), they do not set the current_template_id as a convergence update would. If this occurs then the stack will be unrecoverable. To avoid this, when convergence is enabled store any missing resources in the same manner that a convergence update would, prior to running the stack check/suspend/resume. Just in case, make sure the stack doesn't get stuck if we do end up in the wrong state, by not trying to load a template with None as an ID.
| |   Change-Id: Iedba67c5de39dc2d58938da5505dda5dd147c130 Story: #2003062 Task: 23101
* | Merge "Handle exceptions in initial convergence traversal setup" (Zuul, 2018-08-05; 1 file, -9/+29)
|\ \
| |/
|/|
| * Handle exceptions in initial convergence traversal setup (Zane Bitter, 2018-07-31; 1 file, -9/+29)
| |   If an exception occurs while doing the initial setup of a convergence traversal (including sending the check_resource messages for leaf nodes), mark the stack as failed. If we don't do this then any error or exception in this method will cause the stack to hang IN_PROGRESS.
| |   Change-Id: Ib8231321e823634a3dc23cff9a1c7d560f64fd6e Story: #2003125 Task: 23245
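A minimal sketch of the guard this commit describes follows. The names `FakeStack`, `mark_failed`, `start_traversal` and `broken_setup` are hypothetical; the real method and its arguments are not shown in this log.

```python
class FakeStack:
    """Hypothetical stack that starts out IN_PROGRESS."""

    def __init__(self):
        self.status = "IN_PROGRESS"
        self.status_reason = ""

    def mark_failed(self, reason):
        self.status = "FAILED"
        self.status_reason = reason


def start_traversal(stack, setup):
    """Run the traversal setup; on any exception mark the stack FAILED."""
    try:
        setup()
    except Exception as exc:
        # Without this, the stack would stay IN_PROGRESS forever.
        stack.mark_failed(str(exc))
        return False
    return True


def broken_setup():
    raise RuntimeError("could not send check_resource")


stack = FakeStack()
started = start_traversal(stack, broken_setup)
```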
* | Eliminate client races in legacy operations (Zane Bitter, 2018-07-30; 1 file, -13/+39)
|/
|   Wait for the legacy stack to move to the IN_PROGRESS state before returning from the API call in the stack update, suspend, resume, check, and restore operations. For the stack delete operation, do the same provided that we can acquire the stack lock immediately, and thus don't need to wait for existing operations to be cancelled before we can change the state to IN_PROGRESS. In other cases there is still a race.
|   Change-Id: Id94d009d69342f311a00ed3859f4ca8ac6b0af09 Story: #1669608 Task: 23175
* Eliminate client race condition in convergence delete (Zane Bitter, 2018-07-30; 1 file, -3/+7)
|   Previously when doing a delete in convergence, we spawned a new thread to start the delete. This was to ensure the request returned without waiting for potentially slow operations like deleting snapshots and stopping existing workers (which could have caused RPC timeouts). The result, however, was that the stack was not guaranteed to be DELETE_IN_PROGRESS by the time the request returned. In the case where a previous delete had failed, a client request to show the stack issued soon after the delete had returned would likely show the stack status as DELETE_FAILED still. Only a careful examination of the updated_at timestamp would reveal that this corresponded to the previous delete and not the one just issued. In the case of a nested stack, this could leave the parent stack effectively undeletable. (Since the updated_at time is not modified on delete in the legacy path, we never checked it when deleting a nested stack.) To prevent this, change the order of operations so that the stack is first put into the DELETE_IN_PROGRESS state before the delete_stack call returns. Only after the state is stored, spawn a thread to complete the operation. Since there is no stack lock in convergence, this gives us the flexibility to cancel other in-progress workers after we've already written to the Stack itself to start a new traversal. The previous patch in the series means that snapshots are now also deleted after the stack is marked as DELETE_IN_PROGRESS. This is consistent with the legacy path.
|   Change-Id: Ib767ce8b39293c2279bf570d8399c49799cbaa70 Story: #1669608 Task: 23174
* Delete snapshots using contemporary resources (Zane Bitter, 2018-07-24; 1 file, -14/+31)
|   When deleting a snapshot, we used the current resources in the stack to call delete_snapshot() on. However, there is no guarantee that the resources that existed at the time the snapshot was created were of the same type as any current resources of the same name. Use resources created using the template in the snapshot to do the snapshot deletion. This also solves the problem addressed in df1708b1a83f21f249aa08827fa88a25fc62c9e5, whereby snapshots had to be deleted before the stack deletion was started in convergence because otherwise the 'latest' template contained no resources. That allows us to once again move the snapshot deletion after the start of the stack deletion, which is consistent with when it happens in the legacy path. Amongst other things, this ensures that any failures can be reported correctly.
|   Change-Id: I1d239e9fcda30fec4795a82eba20c3fb11e9e72a
* Merge "Docs: Eliminate warnings in docs generation" (Zuul, 2018-07-23; 1 file, -6/+8)
|\
| * Docs: Eliminate warnings in docs generation (Zane Bitter, 2018-06-21; 1 file, -6/+8)
| |   Fix all of the existing sphinx warnings, and treat warnings as errors in future.
| |   Change-Id: I084ef65da1002c47c7d05a68d6f0268b89a36a7a Depends-On: https://review.openstack.org/553639 Depends-On: https://review.openstack.org/559348
* | Merge "delete_trust failure will not block a stack delete" (Zuul, 2018-07-12; 1 file, -3/+5)
|\ \
| * | delete_trust failure will not block a stack delete (Nakul Dahiwade, 2018-07-02; 1 file, -3/+5)
| |/
| |   Deleting a tenant that has active stacks required issuing a stack delete twice per stack to delete those stacks, because delete_trust would fail but the credentials were still cleared. This change just logs the error and does not fail the stack delete.
| |   Change-Id: I9d770a91b20d1db137b3fc313c794fcee4a5e4bf Story: 2002619 Task: 22248
* | Don't re-use resource-properties-data in backup stacks (Zane Bitter, 2018-06-26; 1 file, -0/+1)
|/
|   When purging events we are only able to (efficiently) search for references to resource properties data from events and resources in the same stack. This results in foreign key constraint errors if the resource properties data is referenced from a backup stack. To avoid this, don't reuse resource properties data IDs after moving a resource between the backup and main stacks, but duplicate the data instead.
|   Change-Id: I93329197c99a2dba37b0e1dbd7efe7b2b17bc036 Story: #2002643 Task: 22510
* Stop using needed_by field in resource (Zane Bitter, 2018-06-14; 1 file, -18/+0)
|   During the original prototype development, I missed the fact that the `needed_by` field of a resource is no longer needed for the actual convergence algorithm: https://github.com/zaneb/heat-convergence-prototype/commit/c74aac1f07e3fdf1fe382a7edce6c4828eda13e3 Since nothing is using this any more, we can avoid an unnecessary DB write to every resource at the start of a stack update. For now, just write an empty list to that field any time we are storing the resource. In future, we can stop reading/writing the field altogether, and in a subsequent release we could drop the column from the DB.
|   Change-Id: I0c9c77f395db1131b16e5bd9d579a092033400b1
* Calculate convergence required_by from graph in Stack (Zane Bitter, 2018-06-14; 1 file, -0/+16)
|   In convergence, resources can continue to exist in the database that are no longer part of the latest template. When calling Resource.required_by() for such resources, we still want to get a list of those resources that depend on them. Previously we did this using the `needed_by` field in the resource. Since this is the only actual use of needed_by, get the information from the Stack's convergence graph instead (which is generated from the resources' `requires` field and ignores `needed_by`). This eliminates any risk that the `requires` and `needed_by` fields get out of sync, and allows us to get rid of `needed_by` altogether in the future.
|   Change-Id: I64e1c66817151f39829d5c54b0a740c56ea8edad
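Deriving the reverse relationship from `requires` alone can be sketched as follows. This is an illustration only: the graph is modelled as a plain dict from a resource ID to the set of IDs it requires, and the function name `required_by` here is a free function, not the actual `Resource.required_by()` method.

```python
def required_by(requires, rsrc_id):
    """Return the IDs of resources that list rsrc_id among their requirements."""
    return sorted(other for other, needs in requires.items() if rsrc_id in needs)


# Hypothetical graph: resource 2 requires 1; resource 3 requires 1 and 2.
requires = {1: set(), 2: {1}, 3: {1, 2}}

dependants_of_1 = required_by(requires, 1)
dependants_of_3 = required_by(requires, 3)
```

Because the answer is computed from a single source of truth, there is no second field that can drift out of sync.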
* Merge "Add catch-all for property errors in implicit dependencies" (Zuul, 2018-05-02; 1 file, -1/+5)
|\
| * Add catch-all for property errors in implicit dependencies (Zane Bitter, 2017-09-19; 1 file, -1/+5)
| |   The previous patch ensures that we ignore errors getting properties in all extant add_dependencies() methods for calculating implicit dependencies. To guard against similar errors in future, also ignore and log any uncaught ValueError or TypeError exceptions encountered during implicit dependency calculation.
| |   Change-Id: I2cac0add975e36a9c52b9cbb50f0660882322754 Related-Bug: #1708209
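The catch-all described here could be sketched like this. The resource classes and the `add_implicit_dependencies` helper are hypothetical stand-ins; only the idea of catching `ValueError` and `TypeError`, logging, and continuing comes from the commit message.

```python
import logging

LOG = logging.getLogger(__name__)


class GoodResource:
    name = "server"

    def add_dependencies(self, deps):
        deps.append(self.name)


class BrokenResource:
    name = "port"

    def add_dependencies(self, deps):
        raise ValueError("property could not be resolved")


def add_implicit_dependencies(resources, deps):
    """Collect implicit dependencies, logging and skipping property errors."""
    skipped = []
    for res in resources:
        try:
            res.add_dependencies(deps)
        except (ValueError, TypeError) as exc:
            LOG.warning("Ignoring error adding implicit dependencies "
                        "for %s: %s", res.name, exc)
            skipped.append(res.name)
    return skipped


deps = []
skipped = add_implicit_dependencies([BrokenResource(), GoodResource()], deps)
```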
* | Log traversal ID when beginning (Zane Bitter, 2018-04-17; 1 file, -2/+2)
| |   At the beginning of a convergence traversal, log the traversal ID along with the dependency graph for the traversal. This could be useful in debugging. Also, log it at the DEBUG, not INFO level.
| |   Change-Id: Ic7c567b6f949bdec9b3cface4fa07748fbe585eb
* | Merge "Return nested parameters for resource group." (Zuul, 2018-02-28; 1 file, -0/+31)
|\ \
| * | Return nested parameters for resource group. (Thomas Herve, 2018-02-26; 1 file, -0/+31)
| | |   This refactors the building of schema from parameter validation to use a new method (which doesn't keep stacks in memory), and uses that new method to provide a proper schema for a resource group when the size is 0.
| | |   Change-Id: Id3020e8f3fd94e2cef413d5eb9de9d1cd16ddeaa Closes-Bug: #1751074 Closes-Bug: #1626025
* | | Merge "Delete redundant code" (Zuul, 2018-02-26; 1 file, -2/+1)
|\ \ \
| |/ /
|/| |
| * | Delete redundant code (chenaidong1, 2017-08-16; 1 file, -2/+1)
| | |   In update_task, the code already checks that action is in (self.UPDATE, self.ROLLBACK, self.RESTORE), so the deleted code is not necessary:
| | |       if action not in (self.UPDATE, self.ROLLBACK, self.RESTORE):
| | |           LOG.error("Unexpected action %s passed to update!", action)
| | |           self.state_set(self.UPDATE, self.FAILED, "Invalid action %s" % action)
| | |           return
| | |   Change-Id: I85f4aaf4c294923358172896cf8efcb5af238957
* | | Prioritise resource deletion over creation (Zane Bitter, 2018-02-08; 1 file, -2/+3)
| | |   Because of quotas, there are times when creating a resource and then deleting another resource may fail where doing it in the reverse order would work, even though the resources are independent of one another. When enqueueing 'check_resource' messages, send those for cleanup nodes prior to those for update nodes. This means that all things being equal (i.e. no dependency relationship), deletions will be started first. It doesn't guarantee success when quotas allow, since only a dependency relationship will cause Heat to wait for the deletion to complete before starting creation, but it is a risk-free way to give us a better chance of succeeding.
| | |   Change-Id: I9727d906cd0ad8c4bf9c5e632a47af6d7aad0c72 Partial-Bug: #1713900
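One way to send cleanup nodes ahead of update nodes is a stable sort on the update flag. The sketch below assumes nodes are `(resource id, is_update)` tuples and uses a hypothetical helper name, `order_for_dispatch`; it is not the code from this change.

```python
def order_for_dispatch(leaves):
    """Order leaf nodes so cleanup nodes (is_update=False) come first."""
    # sorted() is stable and False sorts before True, so the relative
    # order within the cleanup and update groups is preserved.
    return sorted(leaves, key=lambda node: node[1])


leaves = [(1, True), (2, False), (3, True), (4, False)]
ordered = order_for_dispatch(leaves)
```

As the commit message notes, this only changes which messages are sent first; it does not make creation wait for deletion to finish.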
* | | Remove OS::Heat::HARestarter (ricolin, 2018-01-29; 1 file, -38/+0)
| | |   The OS::Heat::HARestarter has been deprecated since Kilo. Now is the time to eliminate support and hide it from the documentation. This replaces it with a placeholder resource (like OS::Heat::None) and marks it as hidden.
| | |   Change-Id: I56cd1f2d0b3323399ef02c3a0a05d79cc69af956
* | | Merge "Define resource/output definition sections with constants" (Zuul, 2018-01-08; 1 file, -7/+9)
|\ \ \
| * | | Define resource/output definition sections with constants (Zane Bitter, 2017-11-16; 1 file, -7/+9)
| | | |   It was unclear what the valid arguments to Template.get_section_name() were (especially since the function is mis-named for what it actually does in HOT). Define the arguments as constants and don't pass string literals any more. Be consistent in how we define paths, standardising on the method in Resource.validate_template().
| | | |   Change-Id: Ifd073d9889ff60502f78aaa54532cec2b7814d93
* | | | Merge "Downgrade WARNING-level log" (Zuul, 2018-01-04; 1 file, -7/+5)
|\ \ \ \
| * | | | Downgrade WARNING-level log (Zane Bitter, 2017-11-03; 1 file, -7/+5)
| |/ / /
| | | |   It's expected that when a resource completes and the traversal it was processing is no longer active, we won't update the stack state in the database. Therefore don't log at WARNING level for this routine event.
| | | |   Change-Id: Ic901e46c251400c4eeb90a4d9ad4a97d61fb1af8
* | | | Merge "Remove unused variable" (Zuul, 2017-12-13; 1 file, -1/+1)
|\ \ \ \
| * | | | Remove unused variable (Béla Vancsics, 2017-03-01; 1 file, -1/+1)
| | | | |   TrivialFix
| | | | |   Change-Id: I81e3a62c890f0ad043dd604443f56766ba6c11b4
* | | | | Ignore resources with non-existent template (rabi, 2017-11-28; 1 file, -1/+4)
| |/ / /
|/| | |
| | | |   When listing resources with nested depth, we get all resources for the root stack and cache them in the context. By the time we iterate through the resources for a nested stack, if the current_template_id of a resource does not match the stack template, we try to load the template. However, it's possible that the template does not exist anymore. It would be good to ignore those resources.
| | | |   Change-Id: I838320539838ed74f490bca8601cde96eaf7c7ee Closes-Bug: #1734815
* | | | Merge "Eager load resource_properties_data in resource" (Jenkins, 2017-10-13; 1 file, -1/+1)
|\ \ \ \
| * | | | Eager load resource_properties_data in resource (Crag Wolfe, 2017-07-31; 1 file, -1/+1)
| | | | |   Eager load resource_properties_data in resources in the typical resource-loading scenarios where properties data will be accessed. Thus, we can save an extra db query per resource when loading all the resources in a stack, for instance. Fall back to lazy loading properties data in other scenarios. Also, the resource object doesn't need to store a copy of its ResourcePropertiesData object in self.rsrc_prop_data, so don't.
| | | | |   Change-Id: Ib7684af3fe06f818628fd21f1216de5047872948 Closes-Bug: #1665503
* | | | | Use a namedtuple for convergence graph nodes (Zane Bitter, 2017-09-26; 1 file, -7/+16)
| | | | |   The node key in the convergence graph is a (resource id, update/!cleanup) tuple. Sometimes it would be convenient to access the members by name, so convert to a namedtuple.
| | | | |   Change-Id: Id8c159b0137df091e96f1f8d2312395d4a5664ee
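A namedtuple keeps tuple behaviour (equality, hashing, unpacking) while adding named access. In the sketch below the type name `ConvergenceNode` and the field names `rsrc_id` and `is_update` are assumptions made for illustration; the log only states that the key is a (resource id, update/!cleanup) pair.

```python
import collections

# Assumed names; the commit message does not list the actual field names.
ConvergenceNode = collections.namedtuple("ConvergenceNode",
                                         ["rsrc_id", "is_update"])

node = ConvergenceNode(rsrc_id=42, is_update=True)

# Still interchangeable with a plain tuple as a graph or dict key.
graph = {node: set()}
found_by_plain_tuple = (42, True) in graph

# Members can be read by name or unpacked positionally.
rsrc_id, is_update = node
```

Existing code that builds or compares plain tuples keeps working, which makes the conversion low-risk.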
* | | | | Remove the existing snapshots from the backend (huangtianhua, 2017-09-14; 1 file, -1/+2)
| |_|_|/
|/| | |
| | | |   We only have to remove the existing snapshots for resources when deleting a stack.
| | | |   Change-Id: Ia195f3c3380fe71e0888c8291209dd4562318951 Closes-Bug: #1716612
* | | | Merge "Show correct version of data in convergence resource list" (Jenkins, 2017-08-29; 1 file, -1/+3)
|\ \ \ \
| * | | | Show correct version of data in convergence resource list (Zane Bitter, 2017-07-13; 1 file, -1/+3)
| | | | |   In convergence, there can be multiple versions of the same resource present in a stack at one time. The fix for bug 1301320 was supposed to make data for all versions available from the resource list API. And while it does go through all of the resources in the database and create Resource objects for each one, they were each created with the correct version of the resource definition but DB data from the latest resource. (Only the name is passed to the Resource object, it has to figure out which database row to load data from itself - and it will choose the same one every time.) With this change, we always load the correct DB data into the newly-created Resource object.
| | | | |   Change-Id: I6b9d1b86b3dbf767bccebddd78275bbf0933029a Closes-Bug: #1704194
* | | | | Rollback stack with correct tags (huangtianhua, 2017-08-14; 1 file, -3/+6)
| |_|_|/
|/| | |
| | | |   Set the old tags for old_stack to make sure the stack has the correct tags after rollback.
| | | |   Change-Id: I8450df08c84b5a467ab8ac991451c5b108ee96e7 Closes-Bug: #1702251
* | | | Add converge flag in stack update for observing on reality (ricolin, 2017-08-07; 1 file, -2/+5)
| | | |   Add a converge parameter to the stack update API and RPC call that allows triggering observation of reality. It is triggered by an API call that includes the converge argument (with a True or False value). This flag also works for resources within nested stacks. Implements bp get-reality-for-resources
| | | |   Change-Id: I151b575b714dcc9a5971a1573c126152ecd7ea93