author melanie witt <melwittt@gmail.com> 2023-05-11 19:23:52 +0000
committer melanie witt <melwittt@gmail.com> 2023-05-17 00:57:37 +0000
commit c095cfe04e2c71efcfbfdd95948af080a98065e6
tree 2d0d9e73f6c7b08747b43e59b0898c6205eaffd7
parent e9a54ff3508efbb1dea6b80fc5d970a8385c6ed4
tests: Use GreenThreadPoolExecutor.shutdown(wait=True)

We are still having some issues in the gate where greenlets from
previous tests continue to run while the next test starts, causing
false negative failures in unit or functional test jobs.

This adds a new fixture that will ensure
GreenThreadPoolExecutor.shutdown() is called with wait=True, to wait
for greenlets in the pool to finish running before moving on.

In local testing, doing this does not appear to adversely affect test
run times, which was my primary concern.

As a baseline, I ran a subset of functional tests in a loop until
failure without the patch and after 11 hours, I got a failure
reproducing the bug. With the patch, running the same subset of
functional tests in a loop has been running for 24 hours and has not
failed yet.

Based on this, I think it may be worth trying this out to see if it
will help stability of our unit and functional test jobs. And if it
ends up impacting test run times or causes other issues, we can revert
it.

Partial-Bug: #1946339
Change-Id: Ia916310522b007061660172fa4d63d0fde9a55ac

-rw-r--r-- nova/test.py 7
-rw-r--r-- nova/tests/fixtures/nova.py 35
2 files changed, 42 insertions, 0 deletions
diff --git a/nova/test.py b/nova/test.py
index e37967b06d..1cf605f10a 100644
--- a/nova/test.py
+++ b/nova/test.py
@@ -317,6 +317,13 @@ class TestCase(base.BaseTestCase):
         # all other tests.
         scheduler_utils.reset_globals()
 
+        # Wait for bare greenlets spawn_n()'ed from a GreenThreadPoolExecutor
+        # to finish before moving on from the test. When greenlets from a
+        # previous test remain running, they may attempt to access structures
+        # (like the database) that have already been torn down and can cause
+        # the currently running test to fail.
+        self.useFixture(nova_fixtures.GreenThreadPoolShutdownWait())
+
     def _setup_cells(self):
         """Setup a normal cellsv2 environment.
diff --git a/nova/tests/fixtures/nova.py b/nova/tests/fixtures/nova.py
index abfc3ecc6c..be0691f7aa 100644
--- a/nova/tests/fixtures/nova.py
+++ b/nova/tests/fixtures/nova.py
@@ -1938,3 +1938,38 @@ class ComputeNodeIdFixture(fixtures.Fixture):
'nova.compute.manager.ComputeManager.'
'_ensure_existing_node_identity',
mock.DEFAULT))
+
+
+class GreenThreadPoolShutdownWait(fixtures.Fixture):
+    """Always wait for greenlets in greenpool to finish.
+
+    We use the futurist.GreenThreadPoolExecutor, for example, in compute
+    manager to run live migration jobs. It runs those jobs in bare greenlets
+    created by eventlet.spawn_n(). Bare greenlets cannot be killed the same
+    way as GreenThreads created by eventlet.spawn().
+
+    Because they cannot be killed, in the test environment we must either let
+    them run to completion or move on while they are still running (which can
+    cause test failures as the leaked greenlets attempt to access structures
+    that have already been torn down).
+
+    When a compute service is stopped by Service.stop(), the compute manager's
+    cleanup_host() method is called and while cleaning up, the compute manager
+    calls the GreenThreadPoolExecutor.shutdown() method with wait=False. This
+    means that a test running GreenThreadPoolExecutor jobs will not wait for
+    the bare greenlets to finish running -- it will instead move on immediately
+    while greenlets are still running.
+
+    This fixture will ensure GreenThreadPoolExecutor.shutdown() is always
+    called with wait=True in an effort to reduce the number of leaked bare
+    greenlets.
+
+    See https://bugs.launchpad.net/nova/+bug/1946339 for details.
+    """
+
+    def setUp(self):
+        super().setUp()
+        real_shutdown = futurist.GreenThreadPoolExecutor.shutdown
+        self.useFixture(fixtures.MockPatch(
+            'futurist.GreenThreadPoolExecutor.shutdown',
+            lambda self, wait: real_shutdown(self, wait=True)))
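
The patching technique used by the fixture can be tried outside of nova with a minimal sketch. It makes several assumptions that are not part of this commit: the standard library's concurrent.futures.ThreadPoolExecutor stands in for futurist.GreenThreadPoolExecutor, unittest.mock stands in for fixtures.MockPatch, and the helpers patch_shutdown_to_wait() and run_job_then_shutdown() are hypothetical names invented for illustration. Unlike the lambda in the patch, the replacement here gives wait a default value so that a bare shutdown() call is also accepted; that is a choice made for the sketch, not something the commit does.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from unittest import mock


def patch_shutdown_to_wait(executor_cls):
    """Return a patcher that forces executor_cls.shutdown(wait=True).

    Hypothetical helper for illustration; not part of nova or futurist.
    """
    real_shutdown = executor_cls.shutdown

    def shutdown_and_wait(self, wait=True, **kwargs):
        # Ignore the caller's ``wait`` value and always block until the
        # submitted jobs have finished.
        return real_shutdown(self, wait=True, **kwargs)

    return mock.patch.object(executor_cls, 'shutdown', shutdown_and_wait)


def run_job_then_shutdown():
    """Submit a slow job, request shutdown(wait=False), report completion."""
    done = threading.Event()

    def job():
        time.sleep(0.2)
        done.set()

    executor = ThreadPoolExecutor(max_workers=1)
    executor.submit(job)
    # The caller asks not to wait, as the compute manager does on cleanup.
    executor.shutdown(wait=False)
    finished = done.is_set()
    # Let the job finish so nothing leaks out of this function.
    done.wait(timeout=5)
    return finished


# Without the patch, shutdown(wait=False) returns while the job still runs.
unpatched = run_job_then_shutdown()

# With the patch, the same call blocks until the job is done.
with patch_shutdown_to_wait(ThreadPoolExecutor):
    patched = run_job_then_shutdown()

print('finished before moving on (unpatched):', unpatched)
print('finished before moving on (patched):', patched)
```

In this sketch the unpatched run should report that the job had not finished when shutdown() returned, while the patched run should report that it had. The first result depends on the 0.2 second sleep outlasting the check, so it is a timing-based demonstration and not a guarantee.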