From 9eb8cbca769e8e462bacdc4db424633ca093e3e4 Mon Sep 17 00:00:00 2001
From: Beniamino Galvani
Date: Mon, 29 Aug 2022 09:08:40 +0200
Subject: device: don't emit recheck-assume if there is a queued activation request

The @dracut_NM_vlan_over_team_no_boot sometimes fails, among other
things, because it fails to assume an indicated connection after a
restart.

That seems to happen because after the decision to activate the
indicated connection, the device does not move from DISCONNECTED state
quickly enough. Another assumption recheck runs in between and decides
to generate a connection, because the assume state was already reset in
between.

First start, creates and activates b3a61b68-f744-4a4c-a513-61399c154a67
on vlan0017:

  NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting... (asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)
  ...
  settings: update[b3a61b68-f744-4a4c-a513-61399c154a67]: adding connection "vlan0017" (45113870df0a4cfb/keyfile)

Second start:

  NetworkManager (version 1.41.1-30921.55767cf5.el9) is starting... (after a restart, asserts:10000, boot:caf7301a-19cd-498b-b5ba-5d36ee939ffe)

Assumption attempt successfully picks the right connection and thus
proceeds to reset the assume state:

  manager: (vlan0017): assume: will attempt to assume matching connection 'vlan0017' (b3a61b68-f744-4a4c-a513-61399c154a67) (indicated)
  device[c7c5101cf0b73f5f] (vlan0017): assume-state: set guess-assume=0, connection=(null)

Everything great so far, activation of the right connection is enqueued
and the device moves away from unavailable state.
However, the activation can't proceed immediately:

  device (vlan0017): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'assume')
  device (vlan0017): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'assume')
  active-connection[0x55ba1162f1c0]: set device "vlan0017" [0x55ba1163c4f0]
  device[c7c5101cf0b73f5f] (vlan0017): queue activation request waiting for carrier

Now another assumption attempt is done. The original assume state is
gone, so a connection is generated:

  platform-linux: UDEV event: action 'add' subsys 'net' device 'vlan0017' (6); seqnum=1959
  device[c7c5101cf0b73f5f] (vlan0017): queued link change for ifindex 6
  manager: (vlan0017): assume: generated connection 'vlan0017' (57627119-8c20-4f9e-bf4d-4fc427b4a6a9)
  keyfile: commit: 57627119-8c20-4f9e-bf4d-4fc427b4a6a9 (vlan0017) added as "/run/NetworkManager/system-connections/vlan0017-57627119-8c20-4f9e-bf4d-4fc427b4a6a9.nmconnection" (nm-generated,volatile,external)

I think this shouldn't have happened. We've picked the correct
connection already and it's enqueued for activation!

Change the check in nm_device_emit_recheck_assume() to also consider
any queued activation.

Fixes-test: @dracut_NM_vlan_over_team_no_boot

Co-authored-by: Lubomir Rintel
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1351
---
 src/core/devices/nm-device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/core/devices/nm-device.c b/src/core/devices/nm-device.c
index 0a046f1e45..6be8b06a6a 100644
--- a/src/core/devices/nm-device.c
+++ b/src/core/devices/nm-device.c
@@ -8956,7 +8956,7 @@ nm_device_emit_recheck_assume(gpointer user_data)
     priv = NM_DEVICE_GET_PRIVATE(self);
 
     priv->recheck_assume_id = 0;
-    if (!nm_device_get_act_request(self))
+    if (!priv->queued_act_request && !nm_device_get_act_request(self))
         g_signal_emit(self, signals[RECHECK_ASSUME], 0);
 
     return G_SOURCE_REMOVE;
-- 
cgit v1.2.1
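
Illustration (not part of the patch): the effect of the changed condition
can be seen in isolation with a small standalone model. This is only a
sketch under stated assumptions. ToyDevice, should_emit_old(),
should_emit_new() and toy_demo() are hypothetical names made up for this
note and are not NetworkManager code; the two pointer fields merely stand
in for "an activation is in progress" and "an activation is queued", as
described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant device state (not NMDevicePrivate). */
typedef struct {
    const void *act_request;        /* activation currently in progress, if any  */
    const void *queued_act_request; /* activation enqueued but not yet started   */
} ToyDevice;

/* Condition before the patch: only an in-progress activation
 * suppresses the recheck-assume signal. */
bool
should_emit_old(const ToyDevice *dev)
{
    return dev->act_request == NULL;
}

/* Condition after the patch: a queued activation suppresses it too. */
bool
should_emit_new(const ToyDevice *dev)
{
    return dev->queued_act_request == NULL && dev->act_request == NULL;
}

/* Models the state from the log: the activation request is queued
 * (waiting for carrier) and nothing is in progress yet. */
void
toy_demo(void)
{
    static const int marker = 0; /* any non-NULL address will do */
    ToyDevice waiting = { .act_request = NULL, .queued_act_request = &marker };

    assert(should_emit_old(&waiting));  /* old check would recheck assumption */
    assert(!should_emit_new(&waiting)); /* new check leaves the queue alone   */
}
```

In this model, the "queue activation request waiting for carrier" state
makes the old condition true, which corresponds to the second assumption
attempt that generated a connection, while the new condition is false.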