author | Martin Schwenke <martin@meltin.net> | 2017-02-15 19:33:02 +1100 |
---|---|---|
committer | Martin Schwenke <martins@samba.org> | 2017-02-24 07:47:11 +0100 |
commit | fdc0dbee29f8cb81dfcb1c995df6468469fd75ce (patch) | |
tree | 6a84ca04933cfa117c7d97eab48581c1c3bb3bfb | |
parent | 2d22454f17a691648dc6d26864a896588de944b2 (diff) | |
download | samba-fdc0dbee29f8cb81dfcb1c995df6468469fd75ce.tar.gz |
ctdb-tests: Add synchronisation points in reload IPs tests

The use of ipreallocate() by "ctdb reloadips" can result in spurious
takeover runs. This can cause a subsequent "ctdb reloadips" to fail
to disable takeover runs (due to there being one already in progress).

There are various possible improvements but a proper fix probably
requires a protocol change. That would mean receiving an ACK for a
takeover run request to indicate that the request will be processed
and then a broadcast to indicate a completed takeover run.

There are various other partial fixes (e.g. de-duping queued takeover
run requests against those in the in-progress queue) and workarounds
(e.g. always do a double ipreallocate() in the tool, which should
absorb the spurious takeover run).

However, this is unlikely to be a real-world problem. Real use cases
should not involve repeatedly reloading the IP configuration.

Instead, work around the problem of flaky tests by manually adding
"ctdb sync" commands to cause extra no-op takeover runs. These should
not add spurious takeover runs and will create synchronisation points
to help avoid the issue.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
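
The workaround described in the commit message can be sketched as follows. This is only an illustration of the ordering, not the test suite's actual code: `try_command_on_node` is replaced here by a hypothetical stub that records commands instead of running them on a cluster, and the values of `CTDB` and `test_node` are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical stand-ins so the sketch runs without a cluster.
# In the real tests, try_command_on_node and $CTDB come from the CTDB
# test harness; this stub only records what would have been run.
CTDB="ctdb"
test_node=0
command_log=""

try_command_on_node ()
{
    node="$1"
    shift
    # Record "node: command" rather than executing anything
    command_log="${command_log}${node}: $*
"
}

# Reload the public IP configuration on the node under test
try_command_on_node $test_node "$CTDB reloadips"

# Synchronisation point: an extra no-op takeover run, requested before
# the next reloadips tries to disable takeover runs
try_command_on_node any $CTDB sync

# The next reload now follows the synchronisation point
try_command_on_node $test_node "$CTDB reloadips"

printf '%s' "$command_log"
```

The recorded sequence shows the `sync` call sitting between the two `reloadips` calls, which is the ordering the patch below adds to the tests.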
-rwxr-xr-x | ctdb/tests/complex/18_ctdb_reloadips.sh | 6 |
-rwxr-xr-x | ctdb/tests/simple/18_ctdb_reloadips.sh | 2 |
2 files changed, 8 insertions, 0 deletions
diff --git a/ctdb/tests/complex/18_ctdb_reloadips.sh b/ctdb/tests/complex/18_ctdb_reloadips.sh
index c61dcdc7c6d..5ac2830ce34 100755
--- a/ctdb/tests/complex/18_ctdb_reloadips.sh
+++ b/ctdb/tests/complex/18_ctdb_reloadips.sh
@@ -199,6 +199,8 @@ try_command_on_node $test_node "$CTDB reloadips"
 
 check_ips $test_node "$iface" "$prefix" 1 $new_ip_max
 
+try_command_on_node any $CTDB sync
+
 ####################
 
 # This should be the primary. Ensure that no other IPs are lost
@@ -211,6 +213,8 @@ try_command_on_node $test_node "$CTDB reloadips"
 
 check_ips $test_node "$iface" "$prefix" 2 $new_ip_max
 
+try_command_on_node any $CTDB sync
+
 ####################
 
 # Get rid of about 1/2 the IPs
@@ -224,6 +228,8 @@ try_command_on_node $test_node "$CTDB reloadips"
 
 check_ips $test_node "$iface" "$prefix" $start $new_ip_max
 
+try_command_on_node any $CTDB sync
+
 ####################
 
 # Delete the rest
diff --git a/ctdb/tests/simple/18_ctdb_reloadips.sh b/ctdb/tests/simple/18_ctdb_reloadips.sh
index b68ecfa617b..6b92878c3dc 100755
--- a/ctdb/tests/simple/18_ctdb_reloadips.sh
+++ b/ctdb/tests/simple/18_ctdb_reloadips.sh
@@ -61,6 +61,8 @@ fi
 
 echo "GOOD: no IPs left on node $test_node"
 
+try_command_on_node any $CTDB sync
+
 echo "Restoring addresses"
 restore_public_addresses