path: root/ctdb
Commit message | Author | Age | Files | Lines
* ctdb-tests: Make process exists test more resilient | Martin Schwenke | 2019-11-06 | 1 | -1/+1
  This can fail as follows:

    --==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--
    Running test ./tests/UNIT/tool/ctdb.process-exists.003.sh (02:26:30)
    --==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--
    ctdb.process-exists.003 - ctdbd process with multiple connections on node 0

    Setting up fake ctdbd
    <10||0|
    OK
    <10|PID 26107 exists |0|
    OK

    ==================================================
    Running "ctdb -d NOTICE process-exists 26107 0x1234567812345678"
    PASSED

    ==================================================
    Running "ctdb -d NOTICE process-exists 26107 0xaebbccdd12345678"
    Registered SRVID 0xaebbccdd12345678
    --------------------------------------------------
    Output (Exit status: 1):
    --------------------------------------------------
    PID 26107 with SRVID 0xaebbccdd12345678 does not exist
    --------------------------------------------------
    Required output (Exit status: 0):
    --------------------------------------------------
    PID 26107 with SRVID 0xaebbccdd12345678 exists

    FAILED
    connection to daemon closed, exiting
    ==========================================================================
    TEST FAILED: ./tests/UNIT/tool/ctdb.process-exists.003.sh (status 1) (duration: 0s)
    ==========================================================================

  This happens when dummy_client has not registered the SRVID (for its
  10th connection) before the 2nd simple_test. Change the initial wait
  to ensure that the SRVID is registered.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Wed Nov 6 02:46:24 UTC 2019 on sn-devel-184
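The fix in the commit above amounts to waiting for the exact condition the test depends on (the SRVID being registered) instead of a loosely related one. A minimal Python sketch of that pattern follows. The helper `wait_until`, the `registered` set and `srvid_is_registered` are hypothetical names used for illustration only; the real CTDB tests are written in shell and C and are not shown here.

```python
import time

def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Hypothetical stand-in for the fake daemon's table of registered SRVIDs.
registered = set()

def srvid_is_registered(srvid):
    """Report whether the (hypothetical) daemon has seen this SRVID."""
    return srvid in registered
```

A test would call `wait_until(lambda: srvid_is_registered(srvid))` before running the check that relies on the registration, and fail early with a clear message if it returns `False`.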
* ctdb-tests: Improve code quality in ctdb_init() | Martin Schwenke | 2019-11-06 | 1 | -7/+9
  Improve quoting and indentation. Print a clear error if the cluster
  goes back into recovery and doesn't come back out.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tests: No longer retry starting the cluster | Martin Schwenke | 2019-11-06 | 1 | -30/+4
  Retrying like this hides bugs. The cluster should come up first time,
  every time.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tcp: Drop tracking of file descriptor for incoming connections | Martin Schwenke | 2019-11-06 | 4 | -11/+0
  This file descriptor is owned by the incoming queue. It will be
  closed when the queue is torn down.

  BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175
  RN: Avoid communication breakdown on node reconnect

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tcp: Avoid orphaning the TCP incoming queue | Martin Schwenke | 2019-11-06 | 1 | -0/+7
  CTDB's incoming queue handling does not check whether a queue already
  exists, so it can overwrite the pointer to the queue. This used to be
  harmless until commit c68b6f96f26664459187ab2fbd56767fb31767e0 changed
  the read callback to use a parent structure as the callback data.
  Instead of cleaning up an orphaned queue on disconnect, as before,
  this will now free the new queue.

  At first glance it doesn't seem possible that 2 incoming connections
  from the same node could be processed before the intervening
  disconnect. However, the incoming connections and disconnect occur on
  different file descriptors. The queue can become orphaned on node A
  when the following sequence occurs:

  1. Node A comes up
  2. Node A accepts an incoming connection from node B
  3. Node B processes a timeout before noticing that the outgoing queue
     is writable
  4. Node B tears down the outgoing connection to node A
  5. Node B initiates a new connection to node A
  6. Node A accepts an incoming connection from node B

  Node A then processes the disconnect of the old incoming connection
  from (2) but tears down the new incoming connection from (6). This
  then occurs until the originally affected node is restarted.

  However, due to the number of outgoing connection attempts and
  associated teardowns, this induces the same behaviour on the
  corresponding incoming queue on all nodes that node A attempts to
  connect to. Therefore, other nodes become affected and need to be
  restarted too.

  As a result, the whole cluster probably needs to be restarted to
  recover from this situation.

  The problem can occur any time CTDB is started on a node.

  The fix is to avoid accepting new incoming connections when a queue
  for incoming connections is already present. The connecting node will
  simply retry establishing its outgoing connection.

  BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
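The guard described in the commit above can be modelled in a few lines. This is only an illustrative Python sketch, not the actual C change in ctdb's TCP transport: `IncomingState`, `accept_incoming` and `handle_disconnect` are hypothetical names, and the queue is reduced to a small dictionary.

```python
class IncomingState:
    """Hypothetical per-node state; `queue` stands in for the incoming queue."""

    def __init__(self):
        self.queue = None


def accept_incoming(state, fd):
    """Accept a connection only when no incoming queue exists yet."""
    if state.queue is not None:
        # Refuse the connection: the connecting node is expected to retry.
        return False
    state.queue = {"fd": fd}
    return True


def handle_disconnect(state):
    """Tear down whatever queue the node state currently points to."""
    state.queue = None
```

In the sequence from the commit message, the connection in step (6) would be refused while the queue from step (2) still exists, so the later disconnect tears down the old queue rather than the new one. Once that disconnect has been processed, a retried connection is accepted.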
* ctdb-tcp: Check incoming queue to see if incoming connection is up | Martin Schwenke | 2019-11-06 | 1 | -1/+1
  This makes it consistent with the reverse case. Also, in_fd will soon
  be removed.

  BUG: https://bugzilla.samba.org/show_bug.cgi?id=14175

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb/utils/smnotify/smnotify.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -5/+5
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/utils/scsi_io/scsi_io.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -10/+10
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/server/ctdb_daemon.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -4/+4
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/server/ctdb_client.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -3/+3
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/server/ctdb_call.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -3/+3
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/include/ctdb_private.h: typo fixes | Björn Jacke | 2019-10-31 | 1 | -1/+1
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/ib/ibwrapper_test.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -1/+1
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/ib/ibw_ctdb.c: typo fixes | Björn Jacke | 2019-10-31 | 1 | -1/+1
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/doc/readonlyrecords.txt: typo fixes | Björn Jacke | 2019-10-31 | 1 | -3/+3
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/doc/ctdb.1.xml: typo fixes | Björn Jacke | 2019-10-31 | 1 | -2/+2
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/doc/ctdb-tunables.7.xml: typo fixes | Björn Jacke | 2019-10-31 | 1 | -6/+6
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/doc/ctdb-statistics.7.xml: typo fixes | Björn Jacke | 2019-10-31 | 1 | -1/+1
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/common/srvid.h: typo fixes | Björn Jacke | 2019-10-31 | 1 | -1/+1
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb/client/client.h: typo fixes | Björn Jacke | 2019-10-31 | 1 | -1/+1
  Signed-off-by: Bjoern Jacke <bjacke@samba.org>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-tests: Add vacuuming tests | Martin Schwenke | 2019-10-24 | 9 | -0/+1011
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Thu Oct 24 05:28:21 UTC 2019 on sn-devel-184
* ctdb-tests: Add handling of process clean-up on a cluster node | Martin Schwenke | 2019-10-24 | 1 | -0/+28
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tests: Factor out function check_cattdb_num_records() | Martin Schwenke | 2019-10-24 | 2 | -10/+34
  This can be used in multiple vacuuming tests.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tests: Add ctdb-db-test tool | Martin Schwenke | 2019-10-24 | 2 | -0/+796
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-client: Factor out function client_db_tdb() | Martin Schwenke | 2019-10-24 | 2 | -16/+25
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-daemon: Implement DB_VACUUM control | Martin Schwenke | 2019-10-24 | 3 | -0/+103
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-vacuum: Only schedule next vacuum event if vacuuming is scheduled | Martin Schwenke | 2019-10-24 | 1 | -3/+12
  At the moment vacuuming is always scheduled.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-daemon: Factor out code to create vacuuming child | Martin Schwenke | 2019-10-24 | 1 | -48/+86
  This changes the behaviour for some failures from exiting to simply
  attempting to schedule the next run.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-vacuum: Simplify recording of in-progress vacuuming child | Martin Schwenke | 2019-10-24 | 2 | -13/+9
  There can only be one, so simplify the logic.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-protocol: Add marshalling for control DB_VACUUM | Martin Schwenke | 2019-10-24 | 7 | -2/+71
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-protocol: Add marshalling for struct ctdb_db_vacuum | Martin Schwenke | 2019-10-24 | 5 | -0/+92
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-protocol: Add new control CTDB_CONTROL_DB_VACUUM | Martin Schwenke | 2019-10-24 | 1 | -0/+8
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-vacuum: Avoid processing any more packets | Amitay Isaacs | 2019-10-24 | 1 | -3/+0
  All the vacuum operations, where required, have an event loop to
  ensure completion of pending operations. Once all the steps are
  complete, there is no reason to process any more packets.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-daemon: Avoid memory leak when packet is deferred | Amitay Isaacs | 2019-10-24 | 1 | -1/+2
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-recoverd: No need for database detach handler | Amitay Isaacs | 2019-10-24 | 2 | -43/+0
  The only reason for recoverd attaching to databases was to migrate
  records to the local node as part of vacuuming. Recovery daemon does
  not take part in database vacuuming any more.

  The actual database recovery is handled via the recovery_helper and
  recovery daemon should not need to attach to the databases any more.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-recoverd: Drop VACUUM_FETCH message handling | Amitay Isaacs | 2019-10-24 | 1 | -149/+0
  This is now implemented in the ctdb daemon using VACUUM_FETCH control.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-vacuum: Replace VACUUM_FETCH message with control | Amitay Isaacs | 2019-10-24 | 1 | -9/+9
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-vacuum: Add processing of fetch queue | Amitay Isaacs | 2019-10-24 | 1 | -3/+189
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-daemon: Add implementation of VACUUM_FETCH control | Amitay Isaacs | 2019-10-24 | 5 | -1/+86
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-tests: Add marshalling tests for new control | Amitay Isaacs | 2019-10-24 | 3 | -2/+17
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-protocol: Add marshalling for new control VACUUM_FETCH | Amitay Isaacs | 2019-10-24 | 4 | -0/+51
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-protocol: Add new control VACUUM_FETCH | Amitay Isaacs | 2019-10-24 | 1 | -0/+1
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-tests: Drop code related to obsolete controls | Amitay Isaacs | 2019-10-24 | 1 | -78/+0
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-protocol: Drop code related to obsolete controls | Amitay Isaacs | 2019-10-24 | 2 | -69/+0
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb: Avoid malloc/memcpy/free in ctdb_ltdb_fetch() | Volker Lendecke | 2019-10-24 | 1 | -31/+72
  Make use of tdb_parse_record()

  Signed-off-by: Volker Lendecke <vl@samba.org>
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
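The idea behind switching to a parser callback is that the caller inspects the stored record in place instead of receiving a freshly allocated copy that must later be freed. The following Python sketch is a loose analogy only: `Store`, its `parse_record` method, `parse_header` and the two-field header layout are all hypothetical, and none of it reflects the real C signature of `tdb_parse_record()` or CTDB's actual record header.

```python
import struct


class Store:
    """Hypothetical in-memory stand-in for a TDB database."""

    def __init__(self):
        self._records = {}

    def put(self, key, value):
        self._records[key] = bytes(value)

    def parse_record(self, key, parser):
        """Call parser(key, view) on the stored bytes without copying them."""
        data = self._records.get(key)
        if data is None:
            return None
        with memoryview(data) as view:
            return parser(key, view)


# Hypothetical header layout for illustration: two little-endian uint32 fields.
HEADER = struct.Struct("<II")


def parse_header(key, view):
    """Decode only the header fields; the payload itself is never copied."""
    rsn, flags = HEADER.unpack_from(view, 0)
    return rsn, flags, len(view) - HEADER.size
```

The callback returns only the small values it needs, so no intermediate buffer has to be allocated, filled and released on every fetch.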
* ctdb-tests: Add -l option to set number of local daemons | Martin Schwenke | 2019-10-22 | 2 | -5/+4
  This is the only place where setting an environment variable by hand
  is recommended, so remove the anomaly.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Ralph Böhme <slow@samba.org>
  Autobuild-Date(master): Tue Oct 22 21:02:11 UTC 2019 on sn-devel-184
* ctdb-tests: Prefix remaining environment variables with CTDB_ | Martin Schwenke | 2019-10-22 | 11 | -33/+33
  Now they are clearly all part of CTDB.

  TEST_SOCKET_WRAPPER_SO_PATH gets too long in
  integration_local_daemons.bash, so change it to
  CTDB_TEST_SWRAP_SO_PATH instead of just prefixing.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tests: Drop setting of test state directory for testonly target | Martin Schwenke | 2019-10-22 | 1 | -1/+1
  This is the default and deciding this should be left to run_tests.sh.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tests: Enable printing of logs on failure in autobuild | Martin Schwenke | 2019-10-22 | 1 | -1/+1
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-tests: Add run_tests.sh option to print logs on test failure | Martin Schwenke | 2019-10-22 | 2 | -1/+21
  Implement this for local daemons integration tests, dumping last 100
  lines of logs.

  This makes it possible to debug some failures in automated tests
  where the logs are unavailable for analysis.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
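Dumping only the tail of a log on failure keeps automated test output readable while still showing the most recent events. A minimal Python sketch of the idea follows, assuming hypothetical helper names `tail_lines` and `dump_log_tail`; the actual run_tests.sh option is implemented in shell and is not reproduced here.

```python
from collections import deque


def tail_lines(lines, count=100):
    """Return the last `count` items from an iterable of lines."""
    # A bounded deque discards older lines as newer ones arrive.
    return list(deque(lines, maxlen=count))


def dump_log_tail(path, count=100):
    """Read a log file and return its last `count` lines."""
    with open(path, "r", errors="replace") as f:
        return tail_lines(f, count)
```

Because the deque is bounded, memory use stays proportional to `count` rather than to the size of the log.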