| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When cpu_policy is mixed the scheduler tries to find a valid CPU pinning
for each instance NUMA cell. However, if an instance NUMA cell does not
request any pinned CPUs, this logic calculates empty pinning information
for that cell. The scheduler logic then wrongly assumes that an empty
pinning result means no valid pinning was found. However, there is a
difference between a None result, meaning no valid pinning was found,
and an empty result [], meaning there was nothing to pin.
This patch makes sure that pinning == None is differentiated from
pinning == [].
Closes-Bug: #1994526
Change-Id: I5a35a45abfcfbbb858a94927853777f112e73e5b
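A minimal sketch of the distinction, using a hypothetical helper name
(the real logic lives in nova's NUMA fitting code):

# Sketch only: 'pinning' is the per-cell result of the pinning search.
# None means no valid pinning was found; [] means nothing needed pinning.
def cell_pinning_ok(pinning):
    return pinning is not None  # an empty list still counts as success

assert cell_pinning_ok([]) and not cell_pinning_ok(None)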
|
|
|
|
|
| |
Related-Bug: #1994526
Change-Id: I52ee068377cc48ef4b4cdcb4b05fdc8d926faddf
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing generic restart_compute_service() call in the nova test
base class is not appropriate for the libvirt functional tests that need
to reconfigure the libvirt connection, as it is not aware of the
libvirt-specific mocking needed when a compute service is started.
So this patch adds a specific restart_compute_service() call
to nova.tests.functional.libvirt.base.ServersTestBase. This will be used
by a later patch testing [pci]device_spec reconfiguration scenarios.
This change showed that some of the existing libvirt functional tests
used the incomplete restart_compute_service from the base class. Others
used local mocking to inject new PCI config on restart. I moved all of
these to the new function and removed the local mocking.
Change-Id: Ic717dc42ac6b6cace59d344acaf12f9d1ee35564
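As a rough illustration of the helper's shape (hypothetical names and
signature, not the actual test API):

# Sketch: restart a compute service while re-applying the libvirt
# mocking that start_compute() sets up.
def restart_compute_service(self, hostname, pci_info=None):
    self.computes[hostname].stop()
    # restart under the same fake libvirt connection, optionally with
    # a new PCI device configuration
    return self.start_compute(hostname=hostname, pci_info=pci_info)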
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we no longer support py27, we can use the standard library
unittest.mock module instead of the third party mock lib. Most of this
is autogenerated, as described below, but there is one manual change
necessary:
nova/tests/functional/regressions/test_bug_1781286.py
We need to avoid using 'fixtures.MockPatch' since fixtures is using
'mock' (the library) under the hood and a call to 'mock.patch.stop'
found in that test will now "stop" mocks from the wrong library. We
have discussed making this configurable but the option proposed isn't
that pretty [1] so this is better.
The remainder was auto-generated with the following (hacky) script, with
one or two manual tweaks after the fact:
import glob

for path in glob.glob('nova/tests/**/*.py', recursive=True):
    with open(path) as fh:
        lines = fh.readlines()

    if 'import mock\n' not in lines:
        continue

    # Find where in the import block 'from unittest import mock'
    # should be inserted, keeping imports alphabetically sorted.
    import_group_found = False
    create_first_party_group = False
    for num, line in enumerate(lines):
        line = line.strip()
        if line.startswith('import ') or line.startswith('from '):
            tokens = line.split()
            for lib in (
                'ddt', 'six', 'webob', 'fixtures', 'testtools',
                'neutron', 'cinder', 'ironic', 'keystone', 'oslo',
            ):
                if lib in tokens[1]:
                    create_first_party_group = True
                    break
            if create_first_party_group:
                break
            import_group_found = True
        if not import_group_found:
            continue
        if line.startswith('import ') or line.startswith('from '):
            tokens = line.split()
            if tokens[1] > 'unittest':
                break
            elif tokens[1] == 'unittest' and (
                len(tokens) == 2 or tokens[4] > 'mock'
            ):
                break
        elif not line:
            break

    if create_first_party_group:
        lines.insert(num, 'from unittest import mock\n\n')
    else:
        lines.insert(num, 'from unittest import mock\n')

    del lines[lines.index('import mock\n')]

    with open(path, 'w+') as fh:
        fh.writelines(lines)
Note that we cannot remove mock from our requirements files yet due to
importing pypowervm unit test code in nova unit tests. This library
still uses the mock lib, and since we are importing test code and that
lib (correctly) only declares mock in its test-requirements.txt, mock
would not otherwise be installed and would cause errors while loading
nova unit test code.
[1] https://github.com/testing-cabal/fixtures/pull/49
Change-Id: Id5b04cf2f6ca24af8e366d23f15cf0e5cac8e1cc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|
|
|
|
|
|
| |
This change fixes some typos in the unit tests as well
as in the nova code base.
Change-Id: I209bbb270baf889fcb2b9a4d1ce0ab4a962d0d0e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a race condition between an incoming resize and an
update_available_resource periodic in the resource tracker. The race
window starts when the resize_instance RPC finishes and ends when the
finish_resize compute RPC finally applies the migration context on the
instance.
In the race window, if the update_available_resource periodic is run on
the destination node, then it will see the instance as being tracked on
this host as the instance.node is already pointing to the dest. But the
instance.numa_topology still points to the source host topology as the
migration context is not applied yet. This leads to a CPU pinning error
if the source topology does not fit the dest topology. It also stops the
periodic task and leaves the tracker in an inconsistent state. The
inconsistent state is only cleaned up after the periodic is run outside
of the race window.
This patch applies the migration context temporarily to the specific
instances during the periodic to keep resource accounting correct.
Change-Id: Icaad155e22c9e2d86e464a0deb741c73f0dfb28a
Closes-Bug: #1953359
Closes-Bug: #1952915
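A hedged sketch of the idea (simplified; the real change lives in the
resource tracker's periodic accounting):

# Sketch: consult the not-yet-applied migration context for accounting,
# so the periodic sees the destination NUMA topology without persisting it.
def numa_topology_for_accounting(instance, incoming_migration):
    if incoming_migration and instance.migration_context:
        return instance.migration_context.new_numa_topology
    return instance.numa_topology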
|
|
|
|
|
|
|
|
|
|
| |
This patch extends the original reproduction
I4be429c56aaa15ee12f448978c38214e741eae63 to cover
bug 1952915 as well, since the two bugs have a common root cause.
Change-Id: I57982131768d87e067d1413012b96f1baa68052b
Related-Bug: #1953359
Related-Bug: #1952915
|
|
|
|
|
|
|
|
| |
This patch adds a functional test that reproduces a race between an
incoming migration and the update_available_resource periodic.
Change-Id: I4be429c56aaa15ee12f448978c38214e741eae63
Related-Bug: #1953359
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During resize, on the source host, in resize_instance(), the
instance.host and instance.node are updated to point to the destination
host. This indicates to the source host's resource tracker that the
allocation of this instance does not need to be tracked as an instance
but as an outbound migration instead. However, for the source host's
resource tracker to do that, it needs to use the instance.old_flavor.
Unfortunately, the instance.old_flavor is only set during
finish_resize() on the destination host. (resize_instance casts to
finish_resize.) So it is possible that a running resize_instance() sets
the instance.host to point to the destination, and then, before
finish_resize() could set the old_flavor, an update_available_resources
periodic runs on the source host. As a result the allocation of this
instance is not tracked as an instance, since instance.host points to
the destination, but it is not tracked as a migration either, since
instance.old_flavor is not yet set. So the allocation on the source host
is simply dropped by the periodic job.
When such a migration is confirmed, confirm_resize() tries to drop the
same resource allocation again but fails, as the pinned CPUs of the
instance are already freed.
When such a migration is reverted instead, the revert succeeds, but the
source host resource allocation will not contain the resource allocation
of the instance until the next update_available_resources periodic runs
and corrects it.
This does not affect resources tracked exclusively in placement (e.g.
VCPU, MEMORY_MB, DISK_GB) but it does affect NUMA-related resources that
are still tracked in the resource tracker (e.g. huge pages, pinned
CPUs).
This patch moves the setting of instance.old_flavor on the source node
into the same transaction that sets the instance.host to point to the
destination host, hence solving the race condition.
Change-Id: Ic0d6c59147abe5e094e8f13e0d3523b178daeba9
Closes-Bug: #1944759
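Schematically (illustrative names; the real code is in the compute
manager's resize path):

# Sketch: stash old_flavor in the same save() that repoints the host,
# so the source periodic never sees host == dest without old_flavor.
instance.old_flavor = instance.flavor
instance.host = migration.dest_compute
instance.node = migration.dest_node
instance.save()  # a single transaction closes the race window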
|
|
|
|
|
|
|
|
|
| |
Add functional tests to reproduce the race between resize_instance()
and update_available_resources().
Related-Bug: #1944759
Change-Id: Icb7e3379248fe00f9a94f9860181b5de44902379
|
|
|
|
|
|
|
|
|
| |
There's only one driver now, which means there isn't really a driver at
all. Move the code into the manager altogether and avoid a useless layer
of abstraction.
Change-Id: I609df5b707e05ea70c8a738701423ca751682575
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nova has never supported specifying per-NUMA-node CPU
topologies. Logically, the CPU topology of a guest
is independent of its NUMA topology, and there is no
way to model different CPU topologies per NUMA node
or implement that in hardware.
The code in nova that allowed the generation
of these invalid configurations has now been removed, as it
broke the automatic selection of CPU topologies based
on the hw:max_[cpus|sockets|threads] flavor and image properties.
This change removes the incorrect code and the related unit
tests that asserted nova could generate invalid topologies.
Closes-Bug: #1910466
Change-Id: Ia81a0fdbd950b51dbcc70c65ba492549a224ce2b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change reproduces bug #1910466
When hw:cpu_max_[sockets|cores|threads] is configured
in addition to an explicit NUMA topology and CPU pinning,
nova is currently incapable of generating the correct
virtual CPU topology, resulting in an index out of range
error as we attempt to retrieve the first topology
from an empty list.
This change reproduces the error via a new functional
test.
Related-Bug: #1910466
Change-Id: I333b3d85deed971678141307dd06545e308cf989
|
|
|
|
|
|
|
|
|
|
|
| |
The fake_notifier uses module globals and also needs careful stub and
reset calls to work properly. This patch wraps the fake_notifier into a
proper Fixture that automates the complexity.
This is a fairly large patch, but it does not change any logic; it just
redirects calls from the fake_notifier to the new NotificationFixture.
Change-Id: I456f685f480b8de71014cf232a8f08c731605ad8
|
|
|
|
|
|
|
|
| |
These were left to last since there's a bit of cleanup necessary to move
everything across.
Change-Id: I921c812ac03f7d32eec31200772020c17f292851
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|
|
|
|
|
|
|
| |
Move these to the central place. There's a large amount of test damage
but it's pretty trivial.
Change-Id: If581eb7aa463c9dde13714f34f0f1b41549a7130
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Previously, the default for kB_mem was 2048. That's 2MB of RAM. Change
this to 16GB, and clean up tests that were forced to explicitly set a
sane value by removing the kB_mem argument. In cases where tests were
using the default CPU configuration, arguments to HostInfo are removed
entirely.
Change-Id: I1c97776583f9e43c4a284fb9ef59bd6293730a28
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If using the 'mixed' CPU policy then, by design, the instance is
consuming both "shared" and "dedicated" host CPUs. However, we were only
checking the latter. Fix this.
Change-Id: Ic21c918736e7046ad32a2b8a3330496ce42950b0
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1898272
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If you have two host NUMA nodes and you mark the entirety of one node
for shared vCPUs (VCPU) and the other for dedicated vCPUs (PCPU), then
it should not be possible to boot an instance with a 'mixed' CPU policy
and a single NUMA node, since satisfying that request should necessitate
splitting across host nodes. This is not currently the case. Prove that
out.
Change-Id: I72afe21d117e932cc00c9f616618fb3722bf8ca4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Related-Bug: #1898272
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I'm not sure why these weren't included in the series, but they help
prove out what this is doing.
This highlights a small "bug" in the code, whereby the topology object
associated with the instance's NUMA cell doesn't have the correct number
of CPUs. This has no adverse effects since that attribute isn't actually
used except to indicate a minimum thread count necessary for the cell,
but it is wrong, so we fix it all the same.
Some tests are renamed now that we're testing more than one "legacy"
policy.
Part of blueprint use-pcpu-and-vcpu-in-one-instance
Change-Id: I30a040d549f48d53cab2c59c00bb269f821ace88
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|/
|
|
|
|
|
|
|
| |
Replace six.text_type with str.
This patch completes six removal.
Change-Id: I779bd1446dc1f070fa5100ccccda7881fa508d79
Implements: blueprint six-removal
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This rather beefy (but also quite simple) patch replaces the
'stub_out_image_service' call and associated cleanup in all functional
tests with a new 'GlanceFixture', based on the old 'FakeImageService'.
The use of a fixture means we don't have to worry about teardown and
allows us to stub Glance in the same manner as Cinder, Neutron,
Placement etc.
Unit test cleanup is handled in a later patch.
Change-Id: I6daea47988181dfa6dde3d9c42004c0ecf6ae87a
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable the policy fixture by default, which should yield more realistic
functional tests. We need to update some tests to use admin APIs where
policy dictates they are necessary. Note that we're currently testing
the legacy policy - not the updated, scoped policy - since the legacy
policy is the default one currently.
Note that we also need to modify the 'SingleCellSimple' fixture in this
change to use the same project ID as the 'OSAPIFixture'.
Change-Id: Ia3dea78f16cb3c7081714c4db36e20d5ee76ed7d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change Id3f77c4ecccfdc4caa6dbf120c3df4fbdfce9d0f added the
'start_computes' function as a helper to start multiple compute hosts
with different 'HostInfo' objects. Unfortunately, there are cases where
you might also want to start multiple computes with e.g. different
'PCIInfo' objects and this can't handle those cases. We could expand the
'host_info_dict' parameter to transition from a mapping of hostnames ->
HostInfo objects to a mapping of hostnames -> HostInfo and PCIInfo
objects, but after a while that gets frankly quite ridiculous. Instead,
replace 'start_computes' with 'start_compute', a new helper that can
handle creating a new service with its own unique 'HostInfo' and various
other attributes currently accepted by the '_get_connection' helper.
This allows us to remove 'start_computes' in its entirety.
Change-Id: I79a16a0a62c6060cd3062174ce68fd8cbde9f3fc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
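Illustrative usage of the new singular helper (the keyword arguments
shown are examples, not an exhaustive signature):

# Sketch: each call starts one compute with its own fake host hardware.
self.start_compute(hostname='host_a', host_info=host_info_a)
self.start_compute(hostname='host_b', pci_info=pci_info_b)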
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since change Ia6d8a7909081b0b856bd7e290e234af7e42a2b38, the resource
tracker's 'drop_move_claim' method has been capable of freeing up
resource usage. However, this relies on accurate resource reporting.
It transpires that there's a race whereby the resource tracker's
'update_available_resource' periodic task can end up not accounting for
usage from migrations that are in the process of being completed. The
root cause is the resource tracker's reliance on the stashed flavor in a
given migration record [1]. Previously, this information was deleted by
the compute manager at the start of the confirm migration operation [2].
The compute manager would then call the virt driver [3], which could
take a not insignificant amount of time to return, before finally
dropping the move claim. If the periodic task ran between the clearing
of the stashed flavor and the return of the virt driver, it would find a
migration record with no stashed flavor and would therefore ignore this
record for accounting purposes [4], resulting in an incorrect record for
the compute node, and an exception when the 'drop_move_claim' attempts
to free up the resources that aren't being tracked.
The solution to this issue is pretty simple. Instead of unsetting the
old flavor record from the migration at the start of the various move
operations, do it afterwards.
[1] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1288
[2] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4310-L4315
[3] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4330-L4331
[4] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1300
Change-Id: I4760b01b695c94fa371b72216d398388cf981d28
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Partial-Bug: #1879878
Related-Bug: #1834349
Related-Bug: #1818914
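The reordering is roughly as follows (a sketch with illustrative names,
not the verbatim change):

# Sketch: drop the move claim, which needs the stashed old flavor,
# *before* clearing that flavor from the migration record.
self.rt.drop_move_claim(context, instance, migration.source_node)
migration.old_flavor = None  # only now is it safe to unset
migration.save()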
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Attempting to boot an instance with 'hw:cpu_policy=dedicated' will
result in a request from nova-scheduler to placement for allocation
candidates with $flavor.vcpu 'PCPU' inventory. Similarly, booting an
instance with 'hw:cpu_thread_policy=isolate' will result in a request
for allocation candidates with 'HW_CPU_HYPERTHREADING=forbidden', i.e.
hosts without hyperthreading. This has been the case since the
cpu-resources feature was implemented in Train. However, as part of that
work and to enable upgrades from hosts that predated Train, we also make
a second request for candidates with $flavor.vcpu 'VCPU' inventory. The
idea behind this is that old compute nodes would only report 'VCPU' and
should be usable, and any new compute nodes that got caught up in this
second request could never actually be scheduled to since there wouldn't
be enough cores from 'ComputeNode.numa_topology.cells.[*].pcpuset'
available to schedule to, resulting in rejection by the
'NUMATopologyFilter'. However, if a host was rejected in the first
query because it reported the 'HW_CPU_HYPERTHREADING' trait, it could
get picked up by the second query and would happily be scheduled to,
resulting in an instance consuming 'VCPU' inventory from a host that
properly supported 'PCPU' inventory.
The solution is simple, though also a huge hack. If we detect that the
host is using new style configuration and should be able to report
'PCPU', check if the instance asked for no hyperthreading and whether
the host has it. If all are True, reject the request.
Change-Id: Id39aaaac09585ca1a754b669351c86e234b89dd9
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1889633
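The guard is roughly of this shape (a hedged sketch, not the verbatim
nova code):

# Sketch: reject a fallback 'VCPU' candidate when the instance forbade
# hyperthreading but this new-style (PCPU-reporting) host has it.
def reject_vcpu_fallback(host_reports_pcpu, host_has_ht, thread_policy):
    return (
        host_reports_pcpu
        and thread_policy == 'isolate'
        and host_has_ht
    )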
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With the introduction of the cpu-resources work [1], (libvirt) hosts can
now report 'PCPU' inventory separate from 'VCPU' inventory, which is
consumed by instances with pinned CPUs ('hw:cpu_policy=dedicated'). As
part of that effort, we had to drop support for the ability to boot
instances with 'hw:cpu_thread_policy=isolate' (i.e. I don't want
hyperthreads) on hosts with hyperthreading. This had been previously
implemented by marking thread siblings of the host cores used by such an
instance as reserved and unusable by other instances, but such a design
wasn't possible in a world where we had to track resource consumption in
placement before landing in the host. Instead, the 'isolate' policy now
simply means "give me a host without hyperthreads". This is enforced by
hosts with hyperthreads reporting the 'HW_CPU_HYPERTHREADING' trait, and
instances with the 'isolate' policy requesting
'HW_CPU_HYPERTHREADING=forbidden'.
Or at least, that's how it should work. We also have a fallback query
for placement to find hosts with 'VCPU' inventory and that doesn't care
about the 'HW_CPU_HYPERTHREADING' trait. This was envisioned to ensure
hosts with old style configuration ('[DEFAULT] vcpu_pin_set') could
continue to be scheduled to. We figured that this second fallback query
could accidentally pick up hosts with new-style configuration, but we
are also tracking the available and used cores from those listed in the
'[compute] cpu_dedicated_set' as part of the host 'NUMATopology' objects
(specifically, via the 'pcpuset' and 'cpu_pinning' fields of the
'NUMACell' child objects). These are validated by both the
'NUMATopologyFilter' and the virt driver itself, which means hosts with
new style configuration that got caught up in this second query would be
rejected by this filter or by a late failure on the host. (Hint: there's
much more detail on this in the spec).
Unfortunately we didn't think about hyperthreading. If a host gets
picked up in the second request, it might well have enough PCPU
inventory but simply be rejected in the first query since it had
hyperthreads. In this case, because it has enough free cores available
for pinning, neither the filter nor the virt driver will reject the
request, resulting in a situation whereby the instance ends up falling
back to the old code paths and consuming $flavor.vcpu host cores, plus
the thread siblings for each of these cores. Despite this, it will be
marked as consuming $flavor.vcpu VCPU (not PCPU) inventory in placement.
This patch proves this to be the case, allowing us to resolve the issue
later.
[1] https://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html
Change-Id: I87cd4d14192b1a40cbdca6e3af0f818f2cab613e
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Related-Bug: #1889633
|
| |
| |
| |
| |
| |
| |
| | |
These are also useful for vTPM tests.
Change-Id: I5d6db24a440397e588ba69f98a9cd2b8a846adc2
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|\ \ |
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It's unnecessary, particularly when none of the other service
fixtures use it.
Change-Id: If849f80c0372872b2de57b20e8b63c069a54ccff
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When one resizes a pinned instance, the instance claims host CPUs for
pinning purposes on the destination. However, the host CPUs on the
source are not immediately relinquished. Rather, they are held by the
migration record, to handle the event that the resize is reverted. It is
only when one confirms this resize that the old cores are finally
relinquished.
It appears there is a potential race between the resource tracker's
periodic task and the freeing of these resources, resulting in attempts
to unpin host cores that have already been unpinned. This test
highlights that bug pending a fix.
Change-Id: Ie092628ac71eb87c9dfa7220255a2953ada9e04d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Related-Bug: #1879878
|
|
|
|
|
|
|
|
| |
While working on I36dd64dc272ea1743995b3b696323a9431666489, it was
noticed that test_numa_server.py was needlessly duplicating code
around compute service startup and restarting. This patch fixes this.
Change-Id: I00cd09fac0a8cb88191cc4cb3cbaf306987a97d6
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Validate basic huge page assignment to an instance. When assigning huge
pages to an instance, a NUMA topology will automatically be added, thus
this is a subclass of the NUMA tests.
This will prove useful as a starting base for functional tests if we
ever do manage to get memory pages modelled in placement.
Change-Id: I8c760ed242a8ffd9ad963a5f51364f541909cd4c
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When resizing, it's possible to change the NUMA topology of an instance,
or remove it entirely, due to different extra specs in the new flavor.
Unfortunately we cache the instance's NUMA topology object in
'RequestSpec.numa_topology' and don't update it when resizing. This
means if a given host doesn't have enough free CPUs or mempages of the
size requested by the *old* flavor, that host can be rejected by the
filter.
Correct this by regenerating the 'RequestSpec.numa_topology' field as
part of the resize operation, ensuring that we revert to the old field
value in the case of a resize-revert.
Change-Id: I0ca50665b86b9fdb4618192d4d6a3bcaa6ea2291
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Co-Authored-By: He Jie Xu <hejie.xu@intel.com>
Closes-bug: #1805767
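A hedged sketch of the regeneration; hardware.numa_get_constraints is
nova's real constraint builder, while the wrapper is illustrative:

from nova.virt import hardware

# Sketch: derive the NUMA topology from the *new* flavor and image
# instead of reusing the stale cached RequestSpec field.
def refresh_numa_topology(request_spec, new_flavor, image_meta):
    request_spec.numa_topology = hardware.numa_get_constraints(
        new_flavor, image_meta)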
|
|
|
|
|
|
|
|
| |
This is a *very* common pattern. Centralize it. Only a few users are
added for now, but we can migrate more later (it's rather tedious work).
Change-Id: I84c58de90dad6d86271767363aef90ddac0f1730
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'_IntegratedTestBase' has subclassed 'InstanceHelperMixin' since change
I0d21cb94c932e6e556eca964c57868c705b2d120, which means both now provide
a '_build_minimal_create_server_request' function. However, only
'_IntegratedTestBase' provides a '_build_server' function. The
'_build_minimal_create_server_request' and '_build_server' functions do
pretty much the same thing but there are some differences. Combine these
under the '_build_server' alias.
Change-Id: I91fa2f73185fef48e9aae9b7f61389c374e06676
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses a number of typos and minor
issues raised during review of [1][2]. The changes
are corrections to typos in comments, a correction
to the exception message, an update to the release
note, and the addition of debug logging.
[1] I0322d872bdff68936033a6f5a54e8296a6fb3434
[2] I48bccc4b9adcac3c7a3e42769c11fdeb8f6fd132
Related-Bug: #1804502
Related-Bug: #1763766
Change-Id: I8975e524cd5a9c7dfb065bb2dc8ceb03f1b89e7b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change leverages the new NUMA constraint checking added
in I0322d872bdff68936033a6f5a54e8296a6fb3434 to allow the
NUMATopologyFilter to be skipped on rebuild.
As the new behavior of rebuild enforces that no changes to the
NUMA constraints are allowed on rebuild, we no longer need to
execute the NUMATopologyFilter. Previously the NUMATopologyFilter
would process the rebuild request as if it were a request to
spawn a new instance, as the numa_fit_instance_to_host function
is not rebuild-aware.
As such, prior to this change a rebuild would only succeed if a
host had enough additional capacity for a second instance on the
same host meeting the requirements of the new image and existing
flavor. This behavior was incorrect on two counts, as a rebuild
uses a noop claim. First, the resource usage cannot change, so it
was incorrect to require the additional capacity to rebuild an
instance. Second, it was incorrect not to assert that the
resource usage remained the same.
I0322d872bdff68936033a6f5a54e8296a6fb3434 addressed guarding the
rebuild against altering the resource usage, and this change
allows in-place rebuild.
This change found a latent bug that will be addressed in a
follow-up change, and updates the functional tests to note the
incorrect behavior.
Change-Id: I48bccc4b9adcac3c7a3e42769c11fdeb8f6fd132
Closes-Bug: #1804502
Implements: blueprint inplace-rebuild-of-numa-instances
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the image changes during a rebuild, it is possible for the requested
NUMA topology to change. As a rebuild uses a noop claim in the
resource tracker, the NUMA topology will not be updated as part of
a rebuild.
If the NUMA constraints do not change, a rebuild will continue as normal.
If the new constraints conflict with the existing NUMA constraints of the
instance, the rebuild will be rejected without altering the status of the
instance.
This change introduces an API check to block rebuild when the NUMA
requirements for the new image do not match the existing NUMA constraints.
This is in line with the previous check introduced to prevent the rebuild of
volume-backed instances, which similarly are not supported.
This change adds functional tests to assert the expected behaviour of
rebuilding NUMA instances with new images. It also asserts that
in-place rebuilds of NUMA instances are currently not supported.
Closes-Bug: #1763766
Partial-implements: blueprint inplace-rebuild-of-numa-instances
Change-Id: I0322d872bdff68936033a6f5a54e8296a6fb3434
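The API-level guard is roughly as follows (a sketch; the exception name
is illustrative rather than guaranteed verbatim):

from nova import exception
from nova.virt import hardware

# Sketch: compare NUMA constraints derived from the old and new images
# and reject the rebuild when they differ.
def check_rebuild_numa(instance, flavor, old_image_meta, new_image_meta):
    old = hardware.numa_get_constraints(flavor, old_image_meta)
    new = hardware.numa_get_constraints(flavor, new_image_meta)
    if old != new:
        raise exception.ImageNUMATopologyRebuildConflict()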
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both 'ServersTestBase' and 'InstanceHelperMixin' provide implementations
of '_wait_for_state_change' but they behave differently. The former
waits for an instance to transition *from* the provided state, while the
latter, somewhat more sanely, waits for the transition *to* the provided
state. Switch to using the latter everywhere and make the necessary
changes. Due to class hierarchies, we end up with two nearly identical
implementations but these will be merged in a future change.
Change-Id: I80cdc0a33ec27b1389130c22f9c3a8ff69f6b1a0
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want 'ServerBase' to inherit from 'InstanceHelperMixin'. They both
have implementations of a "wait until server is deleted" function. The
'ServerBase' implementation is called '_wait_for_deletion' and takes a
'server_id', while the 'InstanceHelperMixin' implementation is called
'_wait_until_deleted' and takes a full server JSON-y dict. A later
change, I0c56841d098d3e9d72db65be3143f3c893f0b6ba, will rework the
'ServersTestBase' version to bring it inline with 'InstanceHelperMixin'
version. However, 'ServerBase._delete_server' currently calls a
'_wait_for_deletion' function and passes it a 'server_id'. As such,
'_delete_server' itself is only passed a server_id. After the future
change, this 'server_id' will no longer be enough as '_delete_server'
will need to call the newly merged '_wait_until_deleted' that takes a
full server JSON-y dict, so '_delete_server' itself needs to receive a
full server JSON-y dict.
Do this work now to simplify the future patch.
Change-Id: Iceafa5ff2b7abff7c6d974ba49036ef03fb1c85f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes functional tests need to start multiple compute hosts with
different HostInfo objects (representing, for example, different NUMA
topologies). In the future this will also be needed by both NUMA-aware
live migration functional tests and a regression test for bug 1843639.
This patch adds a helper function to the base libvirt function test
class that takes a hostname to HostInfo dict and starts a compute for
each host. Existing tests that can make use of this new helper are
refactored.
Change-Id: Id3f77c4ecccfdc4caa6dbf120c3df4fbdfce9d0f
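Illustrative usage (the HostInfo constructor arguments are examples):

from nova.tests.unit.virt.libvirt import fakelibvirt

# Sketch: one HostInfo per hostname; a compute service is started for
# each entry in the mapping.
host_a = fakelibvirt.HostInfo(cpu_nodes=2, cpu_sockets=1,
                              cpu_cores=2, cpu_threads=2)
host_b = fakelibvirt.HostInfo(cpu_nodes=1, cpu_sockets=1,
                              cpu_cores=4, cpu_threads=1)
self.start_computes({'host_a': host_a, 'host_b': host_b})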
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This never actually worked. What we wanted to do was test a setup where
we had two hosts, one with the necessary configuration needed for a
specific request and one without, and attempt to cold migrate the
instance from the former to the latter, resulting in a failure. However,
because it's not possible to use different configuration for different
hosts, we were attempting to "break" the configuration on one host.
Unfortunately, the driver correctly detects this broken behavior,
resulting in an error message like so:
ERROR [nova.compute.manager] Error updating resources for node test_compute1.
Traceback (most recent call last):
  File "nova/compute/manager.py", line 8524, in _update_available_resource_for_node
    startup=startup)
  File "nova/compute/resource_tracker.py", line 867, in update_available_resource
    resources = self.driver.get_available_resource(nodename)
  File "nova/virt/libvirt/driver.py", line 7907, in get_available_resource
    numa_topology = self._get_host_numa_topology()
  File "nova/virt/libvirt/driver.py", line 7057, in _get_host_numa_topology
    physnet_affinities = _get_physnet_numa_affinity()
  File "nova/virt/libvirt/driver.py", line 7039, in _get_physnet_numa_affinity
    raise exception.InvalidNetworkNUMAAffinity(reason=msg)
InvalidNetworkNUMAAffinity: Invalid NUMA network affinity configured: node 1 for
physnet foo is not present in host affinity set {0: set([])}
There isn't really an alternative. We can't configure compute nodes
separately, so that's ruled out, and the virt driver will correctly detect
any other attempts to break the configuration. Since the test never
actually worked, the correct thing to do seems to be to remove it.
Change-Id: I14637d788205408dcf9a007d7727358c03033dcd
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|