author | melanie witt <melwittt@gmail.com> | 2023-05-11 19:23:52 +0000 |
---|---|---|
committer | melanie witt <melwittt@gmail.com> | 2023-05-17 00:57:37 +0000 |
commit | c095cfe04e2c71efcfbfdd95948af080a98065e6 | |
tree | 2d0d9e73f6c7b08747b43e59b0898c6205eaffd7 | |
parent | e9a54ff3508efbb1dea6b80fc5d970a8385c6ed4 | |
tests: Use GreenThreadPoolExecutor.shutdown(wait=True)

We are still having some issues in the gate where greenlets from
previous tests continue to run while the next test starts, causing
false negative failures in unit or functional test jobs.

This adds a new fixture that will ensure
GreenThreadPoolExecutor.shutdown() is called with wait=True, to wait
for greenlets in the pool to finish running before moving on.

In local testing, doing this does not appear to adversely affect test
run times, which was my primary concern.

As a baseline, I ran a subset of functional tests in a loop
until failure without the patch and after 11 hours, I got a failure
reproducing the bug. With the patch, running the same subset of
functional tests in a loop has been running for 24 hours and has not
failed yet.

Based on this, I think it may be worth trying this out to see if it
will help stability of our unit and functional test jobs. And if it
ends up impacting test run times or causes other issues, we can
revert it.

Partial-Bug: #1946339
Change-Id: Ia916310522b007061660172fa4d63d0fde9a55ac

-rw-r--r-- | nova/test.py | 7 |
-rw-r--r-- | nova/tests/fixtures/nova.py | 35 |
2 files changed, 42 insertions, 0 deletions

diff --git a/nova/test.py b/nova/test.py
index e37967b06d..1cf605f10a 100644
--- a/nova/test.py
+++ b/nova/test.py
@@ -317,6 +317,13 @@ class TestCase(base.BaseTestCase):
         # all other tests.
         scheduler_utils.reset_globals()
 
+        # Wait for bare greenlets spawn_n()'ed from a GreenThreadPoolExecutor
+        # to finish before moving on from the test. When greenlets from a
+        # previous test remain running, they may attempt to access structures
+        # (like the database) that have already been torn down and can cause
+        # the currently running test to fail.
+        self.useFixture(nova_fixtures.GreenThreadPoolShutdownWait())
+
     def _setup_cells(self):
         """Setup a normal cellsv2 environment.
 
diff --git a/nova/tests/fixtures/nova.py b/nova/tests/fixtures/nova.py
index abfc3ecc6c..be0691f7aa 100644
--- a/nova/tests/fixtures/nova.py
+++ b/nova/tests/fixtures/nova.py
@@ -1938,3 +1938,38 @@ class ComputeNodeIdFixture(fixtures.Fixture):
             'nova.compute.manager.ComputeManager.'
             '_ensure_existing_node_identity',
             mock.DEFAULT))
+
+
+class GreenThreadPoolShutdownWait(fixtures.Fixture):
+    """Always wait for greenlets in greenpool to finish.
+
+    We use the futurist.GreenThreadPoolExecutor, for example, in compute
+    manager to run live migration jobs. It runs those jobs in bare greenlets
+    created by eventlet.spawn_n(). Bare greenlets cannot be killed the same
+    way as GreenThreads created by eventlet.spawn().
+
+    Because they cannot be killed, in the test environment we must either let
+    them run to completion or move on while they are still running (which can
+    cause test failures as the leaked greenlets attempt to access structures
+    that have already been torn down).
+
+    When a compute service is stopped by Service.stop(), the compute manager's
+    cleanup_host() method is called and while cleaning up, the compute manager
+    calls the GreenThreadPoolExecutor.shutdown() method with wait=False. This
+    means that a test running GreenThreadPoolExecutor jobs will not wait for
+    the bare greenlets to finish running -- it will instead move on immediately
+    while greenlets are still running.
+
+    This fixture will ensure GreenThreadPoolExecutor.shutdown() is always
+    called with wait=True in an effort to reduce the number of leaked bare
+    greenlets.
+
+    See https://bugs.launchpad.net/nova/+bug/1946339 for details.
+    """
+
+    def setUp(self):
+        super().setUp()
+        real_shutdown = futurist.GreenThreadPoolExecutor.shutdown
+        self.useFixture(fixtures.MockPatch(
+            'futurist.GreenThreadPoolExecutor.shutdown',
+            lambda self, wait: real_shutdown(self, wait=True)))
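
The patching technique the fixture relies on can be tried outside of nova with only the standard library. The sketch below is an illustration and not part of the commit: it assumes concurrent.futures.ThreadPoolExecutor as a stand-in for futurist.GreenThreadPoolExecutor and unittest.mock.patch.object in place of fixtures.MockPatch, and the names run_with_forced_wait and shutdown_and_wait are hypothetical. Giving wait a default value in the replacement is a choice made for this sketch.

```python
import concurrent.futures
import threading
import time
from unittest import mock


def run_with_forced_wait(job_seconds=0.2):
    """Return True if the job had finished by the time shutdown() returned."""
    finished = threading.Event()

    def job():
        # Simulates work that would otherwise outlive the test.
        time.sleep(job_seconds)
        finished.set()

    # Keep a reference to the unpatched method, as the fixture does.
    real_shutdown = concurrent.futures.ThreadPoolExecutor.shutdown

    def shutdown_and_wait(self, wait=True, **kwargs):
        # Ignore the caller's wait value and always block until jobs finish.
        return real_shutdown(self, wait=True, **kwargs)

    with mock.patch.object(
            concurrent.futures.ThreadPoolExecutor, 'shutdown',
            shutdown_and_wait):
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        executor.submit(job)
        # The caller asks not to wait, but the patched method waits anyway.
        executor.shutdown(wait=False)
        return finished.is_set()


if __name__ == '__main__':
    print(run_with_forced_wait())
```

Here run_with_forced_wait() should return True: shutdown() blocks until the submitted job has completed even though the caller passed wait=False, which is the behavior the fixture aims for with greenlets.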