| author | Alan Conway <aconway@apache.org> | 2014-05-29 15:02:15 +0000 |
|---|---|---|
| committer | Alan Conway <aconway@apache.org> | 2014-05-29 15:02:15 +0000 |
| commit | ac33dffad49541ae5e9e27eea996ea43bbdd1327 (patch) | |
| tree | d2bcb08dce12301f6806885e28d8f7d0830dae6a | |
| parent | e69fa09aae08f1ea7770793044d9b54cac4ac1a1 (diff) | |
NO-JIRA: HA documentation: security configuration troubleshooting
Common issue for new users is cluster failing to start due to incorrect
security configuration. Added some notes to highlight the need for
security configuration and updated the troubleshooting section.
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1598315 13f79535-47bb-0310-9956-ffa450edef68
| -rw-r--r-- | qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml | 84 |
1 files changed, 62 insertions, 22 deletions
diff --git a/qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml b/qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml
index 6e0225a2af..246a0a4ab5 100644
--- a/qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml
+++ b/qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml
@@ -219,6 +219,12 @@ under the License.
 The broker must load the <filename>ha</filename> module, it is loaded by default.
 The following broker options are available for the HA module.
 </para>
+ <note>
+ <para>
+ Incorrect security settings are a common cause of problems when
+ getting started, see <xref linkend="ha-security"/>.
+ </para>
+ </note>
 <table frame="all" id="ha-broker-options">
 <title>Broker Options for High Availability Messaging Cluster</title>
 <tgroup align="left" cols="2" colsep="1" rowsep="1">
@@ -822,8 +828,22 @@ connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconn
 Please see <xref linkend="chap-Messaging_User_Guide-Security"/> for more
 details on enabling authentication and setting up Access Control Lists.
 </para>
+ <note>
+ <para>
+ Unless you disable authentication with <literal>auth=no</literal> in
+ your configuration, you <emphasis>must</emphasis> set the options below
+ and you <emphasis>must</emphasis> have an ACL file with at least the
+ entry described below.
+ </para>
+ <para>
+ Backups will be <emphasis>unable to connect to the primary</emphasis> if
+ the security configuration is incorrect. See also <xref
+ linkend="ha-troubleshoot-security"/>
+ </para>
+ </note>
 <para>
- When authentication is enabled, HA brokers use the credentials set by the following options:
+ When authentication is enabled you must set the credentials used by HA
+ brokers with following options:
 </para>
 <table frame="all" id="ha-security-options">
 <title>HA Security Options</title>
@@ -848,7 +868,13 @@ connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconn
 </row>
 <row>
 <entry><para><literal>ha-mechanism</literal> <replaceable>MECHANISM</replaceable></para></entry>
- <entry><para>Mechanism for HA brokers.</para></entry>
+ <entry>
+ <para>
+ Mechanism for HA brokers. Any mechanism you enable for
+ broker-to-broker communication can also be used by a client, so
+ do not use ha-mechanism=ANONYMOUS in a secure environment.
+ </para>
+ </entry>
 </row>
 </tbody>
 </tgroup>
@@ -922,27 +948,41 @@ qpid-ha -b <replaceable>broker-address</replaceable> promote
 This section applies to clusters that are using rgmanager as the
 cluster manager.
 </para>
- <section id="authentication-failures">
- <title>Authentication failures</title>
+ <section id="ha-troubleshoot-no-primary">
+ <title>No primary broker</title>
+ <para>
+ When you initially start a HA cluster, all brokers are in
+ <literal>joining</literal> mode. The brokers do not automatically select
+ a primary, they rely on the cluster manager <literal>rgmanager</literal>
+ to do so. If <literal>rgmanager</literal> is not running or is not
+ configured correctly, brokers will remain in the
+ <literal>joining</literal> state. See <xref linkend="ha-rm-config"/>
+ </para>
+ </section>
+ <section id="ha-troubleshoot-security">
+ <title>Authentication and ACL failures</title>
 <para>
- If a broker is unable to establish a connection to another broker
- in the cluster due to authentication problems, the log will
- contain SASL errors, for example:
+ If a broker is unable to establish a connection to another broker in the
+ cluster due to authentication or ACL problems the logs may contain
+ errors like the following:
+ <programlisting>
+info SASL: Authentication failed: SASL(-13): user not found: Password verification failed
+ </programlisting>
+ <programlisting>
+warning Client closed connection with 320: User anonymous@QPID federation connection denied. Systems with authentication enabled must specify ACL create link rules.
+ </programlisting>
 <programlisting>
-2012-aug-04 10:17:37 info SASL: Authentication failed: SASL(-13): user not found: Password verification failed
+warning Client closed connection with 320: ACL denied anonymous@QPID creating a federation link.
 </programlisting>
 </para>
 <para>
- Set the SASL user name and password used to connect to other
- brokers using the ha-username and ha-password properties when you
- start the broker. Set the SASL mode using ha-mechanism. Any
- mechanism you enable for broker-to-broker communication can also
- be used by a client, so do not enable ha-mechanism=ANONYMOUS in a
- secure environment. Once the cluster is running, run qpid-ha to
- make sure that the brokers are running as one cluster.
+ Set the HA security configuration and ACL file as described in <xref
+ linkend="ha-security"/>. Once the cluster is running and the primary is
+ promoted , run <literal>qpid-ha</literal> to make sure that the brokers
+ are running as one cluster.
 </para>
 </section>
- <section id="slow-recovery-times">
+ <section id="ha-troubleshoot-slow-recovery">
 <title>Slow recovery times</title>
 <para>
 The following configuration settings affect recovery time. The
@@ -950,7 +990,7 @@ qpid-ha -b <replaceable>broker-address</replaceable> promote
 loaded system. You should run tests to determine if the values are
 appropriate for your system and load conditions.
 </para>
- <section id="cluster.conf">
+ <section id="ha-troubleshoot-cluster.conf">
 <title>cluster.conf:</title>
 <programlisting>
 <rm status_poll_interval=1>
@@ -970,7 +1010,7 @@ qpid-ha -b <replaceable>broker-address</replaceable> promote
 failing over the VIP to a new address.
 </para>
 </section>
- <section id="qpidd.conf">
+ <section id="ha-troubleshoot-qpidd.conf">
 <title>qpidd.conf</title>
 <programlisting>
 link-maintenance-interval=0.1
@@ -1006,7 +1046,7 @@ link-heartbeat-interval=5
 </para>
 </section>
 </section>
- <section id="total-cluster-failure">
+ <section id="ha-troubleshoot-total-cluster-failure">
 <title>Total cluster failure</title>
 <para>
 The cluster can only guarantee availability as long as there is at
@@ -1047,7 +1087,7 @@ link-heartbeat-interval=5
 If the surviving broker fails before that the cluster will fail in one of
 two modes (depending on the exact timing of failures)
 </para>
- <section id="the-cluster-hangs">
+ <section id="ha-troubleshoot-the-cluster-hangs">
 <title>1. The cluster hangs</title>
 <para>
 All brokers are in joining or catch-up mode. rgmanager tries to
@@ -1080,7 +1120,7 @@ service:qpidd-primary-service (20.0.10.33) stopped
 with clusvcadm, then restart (primary last)
 </para>
 </section>
- <section id="the-cluster-reboots">
+ <section id="ha-troubleshoot-the-cluster-reboots">
 <title>2. The cluster reboots</title>
 <para>
 A new primary is promoted and the cluster is functional but all
@@ -1088,7 +1128,7 @@ service:qpidd-primary-service (20.0.10.33) stopped
 </para>
 </section>
 </section>
- <section id="fencing-and-network-partitions">
+ <section id="ha-troubleshoot-fencing-and-network-partitions">
 <title>Fencing and network partitions</title>
 <para>
 A network partition is a a network failure that divides the
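The new "Authentication and ACL failures" section quotes three log messages. As a rough illustration of how an operator might triage them, the sketch below sorts a broker log line into an authentication problem or an ACL problem. The match strings are copied from the log excerpts in the diff; the function name `classify_ha_log_line`, the category labels, and the hint wording are hypothetical and are not part of Qpid.

```python
import re

# Substrings taken from the log excerpts quoted in the diff above.
# The category names and hint text are illustrative assumptions, not Qpid output.
PATTERNS = [
    (re.compile(r"SASL: Authentication failed"), "authentication"),
    (re.compile(r"federation connection denied"), "acl"),
    (re.compile(r"ACL denied .* creating a federation link"), "acl"),
]

HINTS = {
    "authentication": "check ha-username, ha-password and ha-mechanism",
    "acl": "check that the ACL file has the entry described under ha-security",
}


def classify_ha_log_line(line):
    """Return 'authentication', 'acl', or None for an unrelated log line."""
    for pattern, kind in PATTERNS:
        if pattern.search(line):
            return kind
    return None


if __name__ == "__main__":
    sample = ("warning Client closed connection with 320: "
              "ACL denied anonymous@QPID creating a federation link.")
    kind = classify_ha_log_line(sample)
    print(kind, "->", HINTS.get(kind, "no hint"))
```

Both kinds of failure lead to the same remedy in the documentation (the HA security options and the ACL file), so the split only narrows down which half of that configuration to inspect first.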
