author Martin Schwenke <martin@meltin.net> 2017-02-15 19:33:02 +1100
committer Martin Schwenke <martins@samba.org> 2017-02-24 07:47:11 +0100
commit fdc0dbee29f8cb81dfcb1c995df6468469fd75ce
tree 6a84ca04933cfa117c7d97eab48581c1c3bb3bfb
parent 2d22454f17a691648dc6d26864a896588de944b2
ctdb-tests: Add synchronisation points in reload IPs tests

The use of ipreallocate() by "ctdb reloadips" can result in spurious
takeover runs. This can cause a subsequent "ctdb reloadips" to fail to
disable takeover runs (due to there being one already in progress).

There are various possible improvements but a proper fix probably
requires a protocol change. That would mean receiving an ACK for a
takeover run request to indicate that the request will be processed
and then a broadcast to indicate a completed takeover run. There are
various other partial fixes (e.g. de-duping queued takeover run
requests against those in the in-progress queue) and workarounds
(e.g. always do a double ipreallocate() in the tool, which should
absorb the spurious takeover run).

However, this is unlikely to be a real-world problem. Real use cases
should not involve repeatedly reloading the IP configuration.

Instead, work around the problem of flaky tests by manually adding
"ctdb sync" commands to cause extra no-op takeover runs. These should
not add spurious takeover runs and will create synchronisation points
to help avoid the issue.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

-rwxr-xr-x ctdb/tests/complex/18_ctdb_reloadips.sh 6
-rwxr-xr-x ctdb/tests/simple/18_ctdb_reloadips.sh 2
2 files changed, 8 insertions, 0 deletions
diff --git a/ctdb/tests/complex/18_ctdb_reloadips.sh b/ctdb/tests/complex/18_ctdb_reloadips.sh
index c61dcdc7c6d..5ac2830ce34 100755
--- a/ctdb/tests/complex/18_ctdb_reloadips.sh
+++ b/ctdb/tests/complex/18_ctdb_reloadips.sh
@@ -199,6 +199,8 @@ try_command_on_node $test_node "$CTDB reloadips"
check_ips $test_node "$iface" "$prefix" 1 $new_ip_max
+try_command_on_node any $CTDB sync
+
####################
# This should be the primary. Ensure that no other IPs are lost
@@ -211,6 +213,8 @@ try_command_on_node $test_node "$CTDB reloadips"
check_ips $test_node "$iface" "$prefix" 2 $new_ip_max
+try_command_on_node any $CTDB sync
+
####################
# Get rid of about 1/2 the IPs
@@ -224,6 +228,8 @@ try_command_on_node $test_node "$CTDB reloadips"
check_ips $test_node "$iface" "$prefix" $start $new_ip_max
+try_command_on_node any $CTDB sync
+
####################
# Delete the rest
diff --git a/ctdb/tests/simple/18_ctdb_reloadips.sh b/ctdb/tests/simple/18_ctdb_reloadips.sh
index b68ecfa617b..6b92878c3dc 100755
--- a/ctdb/tests/simple/18_ctdb_reloadips.sh
+++ b/ctdb/tests/simple/18_ctdb_reloadips.sh
@@ -61,6 +61,8 @@ fi
echo "GOOD: no IPs left on node $test_node"
+try_command_on_node any $CTDB sync
+
echo "Restoring addresses"
restore_public_addresses
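
The pattern this patch applies (follow each "ctdb reloadips" step with a "ctdb sync" before the next step starts) can be sketched as a standalone script. This is only an illustration, not code from the commit: `try_command_on_node` is stubbed so the script runs without a CTDB cluster, and `reloadips_with_sync` and the `calls` variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only. In the real test suite try_command_on_node is provided by
# the test framework and runs the command on a cluster node. Here it is a
# stub (an assumption for illustration) that just records what it was
# asked to run.
CTDB="ctdb"
calls=""

try_command_on_node ()
{
    _node="$1"
    shift
    # Record "node:command;" instead of executing anything
    calls="${calls}${_node}:$*;"
}

# Hypothetical helper (not in the commit): reload IPs on one node, then
# add a synchronisation point before the next step begins.
reloadips_with_sync ()
{
    _test_node="$1"
    try_command_on_node "$_test_node" "$CTDB reloadips"
    # The real tests verify the resulting IP layout at this point
    # (e.g. with check_ips) before synchronising.
    try_command_on_node any $CTDB sync
}

# Two back-to-back reloads, each followed by a synchronisation point
reloadips_with_sync 1
reloadips_with_sync 1

echo "$calls"
```

The point of the ordering is that every reload is separated from the next by a "ctdb sync", which according to the commit message triggers an extra no-op takeover run and so gives any spurious takeover run a chance to finish first.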