| author | Alan Conway <aconway@apache.org> | 2014-04-24 17:54:05 +0000 |
|---|---|---|
| committer | Alan Conway <aconway@apache.org> | 2014-04-24 17:54:05 +0000 |
| commit | 1d3b4560f8a7f212976b536376a976b3b41f489b (patch) | |
| tree | 82c4baadc8f4159bea4fa8ad872f9858061c727e /qpid/java/client/example/src | |
| parent | 67f29e0685b4bfaa0721a25ae901c3b5e18c0db3 (diff) | |
| download | qpid-python-1d3b4560f8a7f212976b536376a976b3b41f489b.tar.gz | |
QPID-5719: HA becomes unresponsive once any of the brokers are SIGSTOPed
- Added a timeout to qpid-ha.
- The qpidd init script now pings the broker to verify it is not hung.
- Updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.
The new results for the cases mentioned in the bug:
a] Stopped ALL brokers: rgmanager restarts the entire cluster, but data is lost.
This is equivalent to killing all the brokers at once. It does not affect quorum because
only the qpidd services are affected, not other services managed by cman.
b] Stopped the primary: rgmanager restarts the primary after a timeout and promotes one of the backups.
c] Stopped a backup: rgmanager restarts the backup after a timeout.
Clients that are actively sending messages may see a delay while the backup is restarted.
Note that you need to set link-heartbeat-interval in qpidd.conf. The default is very
high (120 seconds); set it lower to see recovery from SIGSTOP in a
reasonable time.
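As an illustration of the note above, a qpidd.conf entry might look like the fragment below. The value of 5 seconds is an assumption chosen for the example, not a recommendation from this commit; pick a value that suits your cluster.

```
# qpidd.conf (fragment)
# Assumed example value; the default is 120 seconds.
link-heartbeat-interval=5
```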
See the updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.
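The idea behind the init-script check described above (ping the broker and treat a missing reply as a hang) can be sketched as follows. This is a minimal illustration, not the code from this commit: the helper name `responds_within` and the stand-in commands are hypothetical, and a real check would run whatever ping command your deployment uses in place of the stand-ins.

```python
import subprocess
import sys


def responds_within(cmd, timeout_seconds):
    """Return True if cmd exits with status 0 before the timeout expires.

    A probe that never returns (for example, because the process it talks
    to has been SIGSTOPed) is reported as a failure instead of blocking
    the caller forever.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,  # subprocess.run kills the child on timeout
        )
    except subprocess.TimeoutExpired:
        return False  # no answer in time: treat as hung
    except OSError:
        return False  # the probe command could not be started
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in commands (assumptions): one that answers at once,
    # one that sleeps far longer than the timeout, like a stopped broker.
    healthy = [sys.executable, "-c", "pass"]
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(responds_within(healthy, 5))
    print(responds_within(hung, 0.5))
```

Bounding the probe with a timeout is what lets a supervisor such as rgmanager tell a stopped process apart from a healthy one, since a stopped process neither exits nor refuses connections.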
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589807 13f79535-47bb-0310-9956-ffa450edef68
Diffstat (limited to 'qpid/java/client/example/src')
0 files changed, 0 insertions, 0 deletions
