Commit message | Author | Age | Files | Lines
* Always fall back from hard linking to copying filesstable/2023.1Dmitry Tantsur2023-04-113-107/+40
| The current check is insufficient: it passes for Kubernetes shared volumes, although hard-linking between them is not possible. This patch changes the approach: try a hard link first and fall back to copyfile if it fails.
|
| The patch relies on optimizations in Python 3.8 and thus should not be backported beyond the Zed series, to avoid a performance regression.
|
| Change-Id: I929944685b3ac61b2f63d2549198a2d8a1c8fe35
| (cherry picked from commit 59c6ad96ce35c9deecfedb5698c5806f3883a8af)
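The try-then-fall-back approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the patch; the helper name `link_or_copy` is hypothetical.

```python
import os
import shutil


def link_or_copy(src, dest):
    """Hard-link src to dest, copying instead if linking fails.

    Hypothetical helper for illustration. Returns True if a hard link
    was created and False if the file was copied.
    """
    try:
        os.link(src, dest)
        return True
    except OSError:
        # Linking can fail between different file systems or volumes,
        # so fall back to a regular copy instead of checking up front.
        shutil.copyfile(src, dest)
        return False
```

Attempting the link and handling the failure avoids relying on an up-front check that, as the message notes, can give the wrong answer for shared volumes.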
* Add error logging on lookup failures in the APIDmitry Tantsur2023-04-041-1/+5
| | | | | | | | Lookup returns generic 404 errors for security reasons. Logging is the only way of debugging any issues during it. Change-Id: I860ed6b90468a403f0f6cdec9c3d84bc872fda06 (cherry picked from commit 21437135ab3a8c9aa2fea99c48ab42eb45630941)
* Merge "Wipe Agent Token when cleaning timeout occcurs" into stable/2023.1Zuul2023-03-163-2/+15
|\
| * Wipe Agent Token when cleaning timeout occcursJulia Kreger2023-03-143-2/+15
| | If cleaning started but then timed out, due to lost communications or a hard failure of the machine, an agent token could previously be orphaned, preventing re-cleaning. We now explicitly remove the token in this case.
| |
| | Change-Id: I236cdf6ddb040284e9fd1fa10136ad17ef665638
| | (cherry picked from commit 47b5909486c336352c536eb2cadd121afea8cf12)
* | Merge "Clean out agent token even if power is already off" into stable/2023.1Zuul2023-03-153-0/+38
|\ \ | |/ |/|
| * Clean out agent token even if power is already offJulia Kreger2023-03-143-0/+38
| | While investigating a very curious report, I discovered that if the power to a node was somehow *already* turned off, say through an incorrect BMC *or* human action, and Ironic were to pick it up (as it does by default, because it checks before applying the power state), then it would not wipe the token information, preventing the agent from connecting on the next action/attempt/operation.
| |
| | We now remove the token on all calls to the conductor utilities' node_power_action method when appropriate, even if no other work is required.
| |
| | Change-Id: Ie89e8be9ad2887467f277772445d4bef79fa5ea1
| | (cherry picked from commit bcf6c12269168c5b4f0d9d4d3212e813f1827494)
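The ordering issue described above can be illustrated with a small sketch. It uses a plain dict as a stand-in for a node; the function body, the dict layout, and the `agent_secret_token` key are assumptions made for illustration and are not taken from the patch. It also assumes that "when appropriate" means a power-off request.

```python
def node_power_action(node, new_state):
    """Apply new_state to a dict-based stand-in for a node.

    Illustrative only. Returns True if the power state changed.
    """
    if new_state == "power off":
        # Wipe the token before the early return below, so that it is
        # removed even when the node is already powered off.
        node["driver_internal_info"].pop("agent_secret_token", None)
    if node["power_state"] == new_state:
        # Nothing else to do: the node is already in the requested state.
        return False
    node["power_state"] = new_state
    return True
```

Placing the token removal ahead of the "already in that state" check is the point of the sketch: the early return no longer skips the cleanup.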
* | Do not recalculate checksum if disk_format is not changedDmitry Tantsur2023-03-136-22/+151
|/ | | | | | | | Even if a glance image is raw, we still recalculate the checksum after "converting" it to raw. This process may take exceptionally long. Change-Id: Id93d518b8d2b8064ff901f1a0452abd825e366c0 (cherry picked from commit f00da959eaa70a7e77059655c0050137cee78568)
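A rough sketch of the idea, assuming the stored checksum is still valid whenever the image format does not change. The function names and parameters are hypothetical and do not come from the patch.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large images are not read into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_after_conversion(path, known_checksum, disk_format,
                              target_format="raw"):
    """Reuse the known checksum if no real conversion took place."""
    if disk_format == target_format:
        # The bytes on disk are unchanged, so hashing again is wasted work.
        return known_checksum
    return file_checksum(path)
```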
* Update TOX_CONSTRAINTS_FILE for stable/2023.1OpenStack Release Bot2023-03-091-5/+5
| Update the URL to the upper-constraints file to point to the redirect rule on releases.openstack.org so that anyone working on this branch will switch to the correct upper-constraints list automatically when the requirements repository branches.
|
| Until the requirements repository has a stable/2023.1 branch, tests will continue to use the upper-constraints list on master.
|
| Change-Id: Iac3bf10942721195369d095149e1015fe0c9f8ef
* Update .gitreview for stable/2023.1OpenStack Release Bot2023-03-091-0/+1
| | | | Change-Id: I03af74e3bc2bd0db5c254f5b0f8dda3714f73e37
* Update release mappings for 21.4 release21.4.0Julia Kreger2023-03-072-4/+66
| This mapping allows object version upgrades to be navigated and needs to be updated pre-release, otherwise we break the inherent upgrade job to the latest state of the development branch.
|
| Also, had to backfill the records for the bugfix branch since, while not required for that version to run, they are required to upgrade from that version.
|
| Also lists antelope and 2023.1 as "named" releases; given the ambiguity and configuration, it just seemed better to be on the safe side.
|
| Change-Id: I633275caf8c3dc750023fbb27bd8a3f4d23e9fa5
* Merge "Add missing include for inventory API reference"Zuul2023-03-073-3/+8
|\
| * Add missing include for inventory API referenceDmitry Tantsur2023-02-233-3/+8
| | | | | | | | | | | | Also fix up the title and make sure the linter checks API ref files. Change-Id: I360fd4fab699e732ac03dc07faab33e18fe2bf13
* | Merge "Fix online upgrades for Bios/Traits"Zuul2023-03-073-12/+62
|\ \
| * | Fix online upgrades for Bios/TraitsJulia Kreger2023-03-073-12/+62
| | | ... And tags, but nobody uses tags since they are not available via the API.
| | |
| | | Anyhow, the online upgrade code was written under the assumption that *all* tables had an "id" column. This is not always true in the ironic data model for tables which started as pure extensions of the Nodes table, and fails in particular when:
| | |
| | | 1) A database row has data stored in an earlier version of the object
| | | 2) That same object gets a version upgrade.
| | |
| | | In the case which discovered this, BIOSSetting was added at version 1.0, and later updated to include additional fields which incremented the version to 1.1. When the upgrade went to evaluate and iterate through the fields, the command failed because the table was designed around "node_id" instead of "id".
| | |
| | | Story: 2010632
| | | Task: 47590
| | | Change-Id: I7bec6cfacb9d1558bc514c07386583436759f4df
* | | Merge "Add prelude for OpenStack 2023.1 Ironic release"Zuul2023-03-061-0/+14
|\ \ \ | |/ / |/| |
| * | Add prelude for OpenStack 2023.1 Ironic releaseJay Faulkner2023-03-061-0/+14
| | | | | | | | | | | | | | | | | | We need a prelude. I added one. Change-Id: I48a7ca99439ce2ac3f954ec382971c1a4382ac58
* | | Merge "Add Yoga versions to release notes"Zuul2023-03-030-0/+0
|\ \ \ | |/ / |/| |
| * | Add Yoga versions to release notesDmitry Tantsur2022-08-291-3/+3
| | | | | | | | | | | | Change-Id: Iff987cbe0b1e6e2a00d73f5db6d13a00403abf72
* | | Merge "Do not move nodes to CLEAN FAILED with empty last_error"Zuul2023-03-029-27/+80
|\ \ \
| * | | Do not move nodes to CLEAN FAILED with empty last_errorDmitry Tantsur2023-03-019-27/+80
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When cleaning fails, we power off the node, unless it has been running a clean step already. This happens when aborting cleaning or on a boot failure. This change makes sure that the power action does not wipe the last_error field, resulting in a node with provision_state=CLEANFAIL and last_error=None for several seconds. I've hit this in Metal3. Also when aborting cleaning, make sure last_error is set during the transition to CLEANFAIL, not when the clean up thread starts running. While here, make sure to log the current step in all cases, not only when aborting a non-abortable step. Change-Id: Id21dd7eb44dad149661ebe2d75a9b030aa70526f Story: #2010603 Task: #47476
* | | Merge "Respond to rpc requests on stop until hash ring reset"Zuul2023-02-284-7/+118
|\ \ \
| * | | Respond to rpc requests on stop until hash ring resetSteve Baker2023-02-274-7/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when a conductor is stopped, the rpc service stops responding to requests as soon as self.manager.del_host returns. This means that until the hash ring is reset on the whole cluster, requests can be sent to a service which is stopped. This change waits for the remaining seconds to delay stopping until CONF.hash_ring_reset_interval has elapsed. This will improve the reliability of the cluster when scaling down or rolling out updates. This delay only occurs when there is more than one online conductor, to allow fast restarts on single-node ironic installs (bifrost, metal3). Change-Id: I643eb34f9605532c5c12dd2a42f4ea67bf3e0b40
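The "remaining seconds" calculation described above might look something like the sketch below. The function and parameter names are hypothetical, and how online conductors are counted is an assumption; only the idea of waiting out the rest of the hash ring reset interval comes from the commit message.

```python
import time


def remaining_stop_delay(deregistered_at, reset_interval,
                         online_conductors, now=None):
    """Seconds to keep answering RPC requests after deregistering.

    deregistered_at and now are monotonic timestamps in seconds.
    """
    if online_conductors <= 1:
        # Single-conductor installs have no peers whose hash ring could
        # still point here, so stop immediately.
        return 0.0
    if now is None:
        now = time.monotonic()
    elapsed = now - deregistered_at
    # Never return a negative delay if the interval has already passed.
    return max(0.0, reset_interval - elapsed)
```

For example, with a 15 second interval and 5 seconds already elapsed, the service would keep responding for another 10 seconds before stopping.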
* | | | Merge "Fix expired links"Zuul2023-02-281-1/+1
|\ \ \ \
| * | | | Fix expired linksrenliang172023-02-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | update the address for https://e.huawei.com/en/products/computing/kunpeng/accessories/ibmc Change-Id: I7c1ea261b44e2ff7f399942796e54a9277a9b1d5
* | | | | Merge "Add configurable delays to the fake drivers"Zuul2023-02-275-0/+187
|\ \ \ \ \
| * | | | | Add configurable delays to the fake driversSteve Baker2022-10-135-0/+187
| | | | | | | Simulating workloads with the fake driver currently misses the reality that some operations take time to complete, rather than occurring instantly. This makes it difficult to mock real workloads for performance and functional testing of ironic itself.
| | | | | | |
| | | | | | | This change adds configurable random wait times for fake drivers in a new ironic.conf [fake] section. Each supported driver has one configuration option controlling the delay. These delays are applied to operations which typically block in other drivers.
| | | | | | |
| | | | | | | The default value of zero continues the existing behaviour of no delay. A single integer value will result in a constant delay in seconds. Two values separated by a comma will result in a triangular distribution weighted by the first value, specifically in python[1]:
| | | | | | |
| | | | | | |     random.triangular(a, b, a)
| | | | | | |
| | | | | | | Change-Id: I7cb1b50d035939e6c4538b3373002a309bfedea4
| | | | | | |
| | | | | | | [1] https://docs.python.org/3/library/random.html#random.triangular
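As a sketch of how such a delay value could be interpreted, assuming the option arrives as a string such as "0", "5" or "2,10": the function name `fake_delay` is hypothetical, and only the `random.triangular(a, b, a)` call is taken from the commit message.

```python
import random


def fake_delay(value):
    """Turn a delay setting into a number of seconds to sleep.

    "0" or "5" gives a constant delay; "2,10" samples a triangular
    distribution between 2 and 10 whose mode is the first value.
    """
    parts = [int(p) for p in str(value).split(",")]
    if len(parts) == 1:
        return float(parts[0])
    a, b = parts
    # Passing a as the mode weights the samples towards the first value.
    return random.triangular(a, b, a)
```

A caller would then pass the result to `time.sleep()` before returning from the simulated operation.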
* | | | | | Merge "Get conductor metric data"Zuul2023-02-2710-63/+348
|\ \ \ \ \ \ | |_|_|/ / / |/| | | | |
| * | | | | Get conductor metric dataJulia Kreger2023-02-2310-63/+348
| | | | | | | This change adds the capability for the ironic-conductor and standalone service process to transmit timer and counter metrics to the message bus notifier, where they may be consumed by ceilometer, ironic-prometheus-exporter, or another consumer of metrics event data on the message bus.
| | | | | | |
| | | | | | | This functionality is not presently supported on dedicated API services such as those running as an ``ironic-api`` application process, or Ironic WSGI application. This is due to the lack of an internal trigger mechanism to transmit the data in a metrics update to the message bus and/or notifier plugin.
| | | | | | |
| | | | | | | This change requires ironic-lib 5.4.0 to collect and ship metrics via the message bus.
| | | | | | |
| | | | | | | Depends-On: https://review.opendev.org/c/openstack/ironic-lib/+/865311
| | | | | | | Change-Id: If6941f970241a22d96e06d88365f76edc4683364
* | | | | | Merge "Add a comment about node sharding to API versions"Zuul2023-02-241-0/+1
|\ \ \ \ \ \
| * | | | | | Add a comment about node sharding to API versionsJakub Jelinek2023-02-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Follow-up to I385594339028c20cfc83fdcc4cbbec107efdacff Story: 2010378 Task: 46624 Change-Id: I95f3caaaf3fd92d60ce39b5803747728f65bbc17
* | | | | | | Merge "Set lockutils default logging"Zuul2023-02-232-0/+11
|\ \ \ \ \ \ \
| * | | | | | | Set lockutils default loggingJulia Kreger2023-02-202-0/+11
| | | | | | | | While developing some internal metrics collection capability, and realizing that a lock was needed, we found that the lock activity itself would be a bit noisy. Image actions also get lock logging, which is really noisy but not very helpful for troubleshooting.
| | | | | | | |
| | | | | | | | So, set it to WARNING instead.
| | | | | | | |
| | | | | | | | For the discussion, see: https://review.opendev.org/c/openstack/ironic-lib/+/865311
| | | | | | | |
| | | | | | | | Change-Id: I3ab14ee5b5cc063784d26e3c760f1422c692060d
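With the standard library alone, raising one logger's threshold to WARNING looks roughly like this. The logger name `oslo_concurrency.lockutils` is an assumption about where the lock messages originate, and the patch itself may set the level through configuration defaults rather than a direct call.

```python
import logging

# Assumed logger name for lock messages; adjust if the messages are
# emitted under a different name.
LOCK_LOGGER = "oslo_concurrency.lockutils"


def quiet_lock_logging(logger_name=LOCK_LOGGER):
    """Hide DEBUG and INFO messages from the lock logger."""
    logging.getLogger(logger_name).setLevel(logging.WARNING)
```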
* | | | | | | | Merge "Relaxing console pid looking"Zuul2023-02-232-1/+7
|\ \ \ \ \ \ \ \
| * | | | | | | | Relaxing console pid lookingKaifeng Wang2023-02-152-1/+7
| | | | | | | | | Recently we hit an issue where the pid file was missing. The current logic simply removes the pid file if the corresponding process is not found, but if the pid file is lost then the console can never be stopped and, furthermore, restarted, regardless of whether the process is there or not.
| | | | | | | | |
| | | | | | | | | This patch adds FileNotFound to the exception handling to allow console recovery.
| | | | | | | | |
| | | | | | | | | Change-Id: I1a0b8347e960c6cff8aca10a22c67b710f7d617e
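A minimal sketch of tolerating a missing pid file, assuming the file holds a single integer. The helper names are hypothetical and this is not the code from the patch.

```python
import os


def read_console_pid(pid_file):
    """Return the pid stored in pid_file, or None if the file is gone."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        # A lost pid file is treated as "no known console process"
        # rather than an error, so stop and restart can still proceed.
        return None


def remove_pid_file(pid_file):
    """Delete pid_file, ignoring the case where it is already missing."""
    try:
        os.unlink(pid_file)
    except FileNotFoundError:
        pass
```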
* | | | | | | | | Merge "fix inspectwait logic"Zuul2023-02-233-2/+56
|\ \ \ \ \ \ \ \ \ | |_|_|_|_|_|_|/ / |/| | | | | | | |
| * | | | | | | | fix inspectwait logicJulia Kreger2023-02-153-2/+56
| | |_|_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tl;dr is that we changed ``inspecting`` to include a ``inspect wait`` state. Unfortunately we never spotted the logic inside of the db API. We never spotted it because our testing in inspection code uses a mocked task manager... and we *really* don't have intense db testing because we expect the objects and higher level interactions to validate the lowest db level. Unfortunately, because of the out of band inspection workflow, we have to cover both cases in terms of what the starting state and ending state could be, but we've added tests to validate this is handled as we expect. Change-Id: Icccbc6d65531e460c55555e021bf81d362f5fc8b
* | | | | | | | Merge "Add release note for node sharding"Zuul2023-02-201-0/+14
|\ \ \ \ \ \ \ \
| * | | | | | | | Add release note for node shardingJay Faulkner2023-02-171-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Release note covers changes in the previous 4 commits in this chain. Change-Id: I5388e82e958acd930295215c9f9427080650866d
* | | | | | | | | Merge "Make metrics names a little more consistent"Zuul2023-02-202-3/+12
|\ \ \ \ \ \ \ \ \
| * | | | | | | | | Make metrics names a little more consistentJulia Kreger2023-01-182-3/+12
| | | | | | | | | | Some of these metrics decorators were labeled without a class, which would result in semi-confusing structures for the metrics counters.
| | | | | | | | | |
| | | | | | | | | | Now, we should be semi-consistent.
| | | | | | | | | |
| | | | | | | | | | Change-Id: Ie2795419991dc941f2a2b2bc0c6116b92d285041
* | | | | | | | | | Merge "Fixes console port conflict occurs in certain path"Zuul2023-02-205-42/+24
|\ \ \ \ \ \ \ \ \ \
| * | | | | | | | | | Fixes console port conflict occurs in certain pathKaifeng Wang2023-02-155-42/+24
| | | | | | | | | | | The dynamically allocated console port for a node is saved into the database and reused on subsequent console operations. In certain code paths the port record cannot be trusted and we should do a re-allocation.
| | | | | | | | | | |
| | | | | | | | | | | This patch fixes the issue by ignoring the previous allocation record. The extra cleanup in the takeover is not required anymore and is removed as well.
| | | | | | | | | | |
| | | | | | | | | | | Change-Id: I1a07ea9b30a2c760af7a6a4e39f3ff227df28fff
| | | | | | | | | | | Story: 2010489
| | | | | | | | | | | Task: 47061
* | | | | | | | | | | Merge "Use association_proxy for port groups node_uuid"Zuul2023-02-199-47/+41
|\ \ \ \ \ \ \ \ \ \ \
| * | | | | | | | | | | Use association_proxy for port groups node_uuidHarald Jensås2022-12-149-47/+41
| | | | | | | | | | | | This change adds 'node_uuid' to ironic.objects.portgroup.Portgroup.
| | | | | | | | | | | |
| | | | | | | | | | | | 'node_uuid' is a relationship using association_proxy in models.Portgroup. Using the association_proxy removes the need to do the node lookup to populate node uuid for portgroups in the api controller.
| | | | | | | | | | | |
| | | | | | | | | | | | NOTE: On portgroup create, a read is added to read the port from the database. This ensures node_uuid is loaded and solves the DetachedInstanceError which is otherwise raised.
| | | | | | | | | | | |
| | | | | | | | | | | | The test test_list_with_deleted_port_group was deleted: if the portgroup does not exist, portgroup_uuid on the port will be None, so there is no need for extra handling of that case.
| | | | | | | | | | | |
| | | | | | | | | | | | Bumps Portgroup object version to 1.5.
| | | | | | | | | | | |
| | | | | | | | | | | | Change-Id: I4317d034b6661da4248935cb0b9cb095982cc052
* | | | | | | | | | | | Merge "Fix Inventory DB"Zuul2023-02-177-60/+11
|\ \ \ \ \ \ \ \ \ \ \ \
| * | | | | | | | | | | | Fix Inventory DBJakub Jelinek2023-02-167-60/+11
| | | | | | | | | | | | | Follow-up to I6b830e5cc30f1fa1f1900e7c45e6f246fa1ec51c
| | | | | | | | | | | | |
| | | | | | | | | | | | | The original change introduced some errors, such as mismatched arguments for exceptions.
| | | | | | | | | | | | |
| | | | | | | | | | | | | Story: 2010275
| | | | | | | | | | | | | Task: 46204
| | | | | | | | | | | | | Change-Id: I550e048ab22a6cd25502b41d1c579819df369249
* | | | | | | | | | | | | Merge "Indicate maintenance mode"Zuul2023-02-171-3/+3
|\ \ \ \ \ \ \ \ \ \ \ \ \ | |_|_|_|_|_|_|_|_|_|/ / / |/| | | | | | | | | | | |
| * | | | | | | | | | | | Indicate maintenance modeJakub Jelinek2023-02-161-3/+3
| | | | | | | | | | | | | Follow-up to I74b19f7a42c1326d7ec04e6320176e81639ebfb4
| | | | | | | | | | | | |
| | | | | | | | | | | | | Mention the need for maintenance mode to orphan swift objects during node clean up.
| | | | | | | | | | | | |
| | | | | | | | | | | | | Story: 2010275
| | | | | | | | | | | | | Task: 46204
| | | | | | | | | | | | | Change-Id: Ie95a5bd333b0dab3e97254dfb4eb532bdbfd2650
* | | | | | | | | | | | | Merge "Minor spelling/grammar fixes for release docs"Zuul2023-02-151-17/+17
|\ \ \ \ \ \ \ \ \ \ \ \ \
| * | | | | | | | | | | | | Minor spelling/grammar fixes for release docsJay Faulkner2023-01-261-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix spelling, make Ironic capitalized throughout. Change-Id: Ia689954279034d21c60dea4bca73ee5b1bb41d81