author | Stephen Finucane <stephenfin@redhat.com> | 2020-08-05 14:27:06 +0100 |
---|---|---|
committer | Stephen Finucane <stephenfin@redhat.com> | 2020-09-01 16:19:27 +0100 |
commit | 44376d2e212e0f9405a58dc7fc4d5b38d70ac42e (patch) | |
tree | 9baaead7a36417b840c4d4a1cfc4e756aac09849 /nova/tests/functional/libvirt/test_numa_servers.py | |
parent | 662398a1a4bb69c11a08fa5fa7196a89ed1acc87 (diff) | |
download | nova-44376d2e212e0f9405a58dc7fc4d5b38d70ac42e.tar.gz |
Don't unset Instance.old_flavor, new_flavor until necessary
Since change Ia6d8a7909081b0b856bd7e290e234af7e42a2b38, the resource
tracker's 'drop_move_claim' method has been capable of freeing up
resource usage. However, this relies on accurate resource reporting.
It transpires that there's a race whereby the resource tracker's
'update_available_resource' periodic task can end up not accounting for
usage from migrations that are in the process of being completed. The
root cause is the resource tracker's reliance on the stashed flavor in a
given migration record [1]. Previously, this information was deleted by
the compute manager at the start of the confirm migration operation [2].
The compute manager would then call the virt driver [3], which could
take a significant amount of time to return, before finally
dropping the move claim. If the periodic task ran between the clearing
of the stashed flavor and the return of the virt driver, it would find a
migration record with no stashed flavor and would therefore ignore this
record for accounting purposes [4]. The result was an incorrect usage
record for the compute node and an exception when 'drop_move_claim'
attempted to free up resources that weren't being tracked.
The solution to this issue is pretty simple. Instead of unsetting the
old flavor record from the migration at the start of the various move
operations, do it afterwards.
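
The ordering problem described above can be sketched with a toy model. This is only an illustration: `ToyInstance`, `ToyTracker`, `confirm`, and the `clear_first` flag are hypothetical stand-ins and not nova's actual classes or signatures. The stashed flavor is reduced to a pinned-CPU count, and the periodic task is assumed to fire while the virt driver call is in flight.

```python
# Toy model only: hypothetical stand-ins, not nova's real Instance,
# ResourceTracker, or compute manager code.


class ToyInstance:
    def __init__(self, old_flavor_cpus):
        # The stashed old flavor, reduced here to a pinned-CPU count.
        self.old_flavor_cpus = old_flavor_cpus


class ToyTracker:
    def __init__(self):
        self.pinned = 0

    def update_available_resource(self, instance):
        """Periodic task: rebuild usage from the in-progress migration."""
        if instance.old_flavor_cpus is None:
            # No stashed flavor, so the migration is ignored for accounting.
            self.pinned = 0
        else:
            self.pinned = instance.old_flavor_cpus

    def drop_move_claim(self, cpus):
        """Free the usage held by the migration."""
        if cpus > self.pinned:
            raise RuntimeError("cannot unpin CPUs that are not tracked")
        self.pinned -= cpus


def confirm(instance, tracker, clear_first):
    """Confirm a migration; clear_first=True mimics the old ordering."""
    cpus = instance.old_flavor_cpus
    if clear_first:
        # Old ordering: the stashed flavor is unset before the slow call.
        instance.old_flavor_cpus = None
    # Assumption: the periodic task runs while the virt driver call is busy.
    tracker.update_available_resource(instance)
    tracker.drop_move_claim(cpus)
    # New ordering: the stashed flavor is unset only once the claim is gone.
    instance.old_flavor_cpus = None
    return tracker.pinned
```

With `clear_first=False` the periodic task still sees the stashed flavor, so the claim is dropped cleanly. With `clear_first=True` the tracker has already forgotten the usage and `drop_move_claim` raises, which loosely mirrors the failure on confirm that the functional test below used to assert.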
[1] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1288
[2] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4310-L4315
[3] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4330-L4331
[4] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1300
Change-Id: I4760b01b695c94fa371b72216d398388cf981d28
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Partial-Bug: #1879878
Related-Bug: #1834349
Related-Bug: #1818914
Diffstat (limited to 'nova/tests/functional/libvirt/test_numa_servers.py')
-rw-r--r-- | nova/tests/functional/libvirt/test_numa_servers.py | 13 |
1 files changed, 4 insertions, 9 deletions
diff --git a/nova/tests/functional/libvirt/test_numa_servers.py b/nova/tests/functional/libvirt/test_numa_servers.py
index 86a9dcdfbd..e5aa401ba0 100644
--- a/nova/tests/functional/libvirt/test_numa_servers.py
+++ b/nova/tests/functional/libvirt/test_numa_servers.py
@@ -696,8 +696,7 @@ class NUMAServersTest(NUMAServersTestBase):
                 self.ctxt, dst_host,
             ).numa_topology,
         )
-        # FIXME(stephenfin): There should still be two pinned cores here
-        self.assertEqual(0, len(src_numa_topology.cells[0].pinned_cpus))
+        self.assertEqual(2, len(src_numa_topology.cells[0].pinned_cpus))
         self.assertEqual(2, len(dst_numa_topology.cells[0].pinned_cpus))
 
         # before continuing with the actualy confirm process
@@ -738,14 +737,10 @@ class NUMAServersTest(NUMAServersTestBase):
 
         # Now confirm the resize
 
-        # FIXME(stephenfin): This should be successful, but it's failing with a
-        # HTTP 500 due to bug #1879878
         post = {'confirmResize': None}
-        exc = self.assertRaises(
-            client.OpenStackApiException,
-            self.api.post_server_action, server['id'], post)
-        self.assertEqual(500, exc.response.status_code)
-        self.assertIn('CPUUnpinningInvalid', str(exc))
+        self.api.post_server_action(server['id'], post)
+
+        server = self._wait_for_state_change(server, 'ACTIVE')
 
 
 class NUMAServerTestWithCountingQuotaFromPlacement(NUMAServersTest):