summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* Set release date for 2.13.5.v2.13.5Ilya Maximets2021-10-212-2/+3
| | | | | | Acked-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* python: idl: Avoid sending transactions when the DB is not synced up.Terry Wilson2021-10-211-0/+7
| | | | | | | | | | | | | | This ports the C IDL change f50714b to the Python IDL: Until now the code here would happily try to send transactions to the database server even if the database connection was not in the correct state. In some cases this could lead to strange behavior, such as sending a database transaction for a database that the IDL had just learned did not exist on the server. Signed-off-by: Terry Wilson <twilson@redhat.com> (cherry picked from commit 34fbdc41084c16f855efa9c56086b03644408b7d) Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* datapath-windows: add layers when adding the deferred actionsWilson Peng2021-10-193-6/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the layers info propogated to ProcessDeferredActions may be incorrect. Because of this, any subsequent usage of layers might result in undesired behavior. Accordingly in this patch it will add the related layers in the deferred action to make sure the layers consistent with the related NBL. In the specified case 229, we have encountered one issue when doing the decap Geneve Packet and doing the twice NAT(via two flow tables) and found the HTTP packet will be changed the TCP sequence. After debugging, we found the issue is caused by the not-updated layers value isTcp and isUdp for Geneve decapping case. The related function call chains are listed below, OvsExecuteDpIoctl—>OvsActionsExecute—>OvsDoExecuteActions->OvsTunnelPortRx ——>OvsDoExecuteActions——〉nat ct action and recircle action ->OvsActionsExecute->defered_actions processing for nat and recircle action For the Geneve packet decaping, it will firstly set the layers for Udp packet. Then it will go on doing OVS flow extract to get the inner packet layers and Processing the first nat action and first recircle action. After that datapath Will do defered_actions processing on OvsActionsExecute. And it does inherit The incorrect geneve packet layers value( isTCP 0 and isUdp 1).So in the second Nat action processing it will get the wrong TCP Headers in OvsUpdateAddressAndPort And it will update related TCP check field value but in this case it will change The packet Tcp seq value. Reported-at:https://github.com/openvswitch/ovs-issues/issues/229 Signed-off-by: Wilson Peng <pweisong@vmware.com> Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org>
* ofproto-dpif-xlate: Fix zone set from non-frozen-metadata fields.Peng He2021-10-134-8/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CT zone could be set from a field that is not included in frozen metadata. Consider the example rules which are typically seen in OpenStack security group rules: priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0) priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2 The zone is set from the first rule's ct action. These two rules will generate two megaflows: the first one uses zone=5 to query the CT module, the second one sets the zone-id from the first megaflow and commit to CT. The current implementation will generate a megaflow that does not use ct_zone=5 as a match, but directly commit into the ct using zone=5, as zone is set by an Imm not a field. Consider a situation that one changes the zone id (for example to 15) in the first rule, however, still keep the second rule unchanged. During this change, there is traffic hitting the two generated megaflows, the revaldiator would revalidate all megaflows, however, the revalidator will not change the second megaflow, because zone=5 is recorded in the megaflow, so the xlate will still translate the commit action into zone=5, and the new traffic will still commit to CT as zone=5, not zone=15, resulting in taffic drops and other issues. Just like OVS set-field convention, if a field X is set by Y (Y is a variable not an Imm), we should also mask Y as a match in the generated megaflow. An exception is that if the zone-id is set by the field that is included in the frozen state (i.e. regs) and this upcall is a resume of a thawed xlate, the un-wildcarding can be skipped, as the recirc_id is a hash of the values in these fields, and it will change following the changes of these fields. When the recirc_id changes, all megaflows with the old recirc id will be invalid later. Fixes: 07659514c3 ("Add support for connection tracking.") Reported-by: Sai Su <susai.ss@bytedance.com> Signed-off-by: Peng He <hepeng.0320@bytedance.com> Acked-by: Mark D. Gray <mark.d.gray@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-netdev: Fix use-after-free on PACKET_OUT of IP fragments.Ilya Maximets2021-10-131-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IP fragmentation engine may not only steal the packet but also add more. For example, after receiving the last fragment, it will add all previous fragments to a batch. Unfortunately, it will also free the original last fragment and replace it with a copy. This invalidates the 'packet_clone' pointer in the dpif_netdev_execute() leading to the use-after-free: ==3525086==ERROR: AddressSanitizer: heap-use-after-free on address 0x61600020439c at pc 0x000000688a6d READ of size 1 at 0x61600020439c thread T0 #0 0x688a6c in dp_packet_swap ./lib/dp-packet.h:265:5 #1 0x68781d in dpif_netdev_execute lib/dpif-netdev.c:4103:9 #2 0x6675db in dpif_netdev_operate lib/dpif-netdev.c:4129:25 #3 0x691e5e in dpif_operate lib/dpif.c:1367:13 #4 0x692909 in dpif_execute lib/dpif.c:1321:9 #5 0x5b19c6 in packet_execute ofproto/ofproto-dpif.c:4991:5 #6 0x5a2861 in ofproto_packet_out_finish ofproto/ofproto.c:3662:5 #7 0x5a65c6 in do_bundle_commit ofproto/ofproto.c:8270:13 #8 0x5a0cae in handle_bundle_control ofproto/ofproto.c:8309:17 #9 0x59a476 in handle_single_part_openflow ofproto/ofproto.c:8593:16 #10 0x5877ac in handle_openflow ofproto/ofproto.c:8674:21 #11 0x6296f1 in ofconn_run ofproto/connmgr.c:1329:13 #12 0x62925d in connmgr_run ofproto/connmgr.c:356:9 #13 0x586904 in ofproto_run ofproto/ofproto.c:1879:5 #14 0x55c830 in bridge_run__ vswitchd/bridge.c:3251:9 #15 0x55c015 in bridge_run vswitchd/bridge.c:3310:5 #16 0x575f31 in main vswitchd/ovs-vswitchd.c:127:9 #17 0x7f01099d3492 in __libc_start_main (/lib64/libc.so.6+0x23492) #18 0x47d96d in _start (vswitchd/ovs-vswitchd+0x47d96d) 0x61600020439c is located 28 bytes inside of 560-byte region freed by thread T0 here: #0 0x5177a8 in free (vswitchd/ovs-vswitchd+0x5177a8) #1 0x6b17b6 in dp_packet_delete ./lib/dp-packet.h:256:9 #2 0x6afeee in ipf_extract_frags_from_batch lib/ipf.c:947:17 #3 0x6afd63 in ipf_preprocess_conntrack lib/ipf.c:1232:9 #4 0x946b2c in conntrack_execute lib/conntrack.c:1446:5 #5 0x67e3ed in dp_execute_cb lib/dpif-netdev.c:8277:9 #6 0x7097d7 in odp_execute_actions lib/odp-execute.c:865:17 #7 0x66409e in dp_netdev_execute_actions lib/dpif-netdev.c:8322:5 #8 0x6877ad in dpif_netdev_execute lib/dpif-netdev.c:4090:5 #9 0x6675db in dpif_netdev_operate lib/dpif-netdev.c:4129:25 #10 0x691e5e in dpif_operate lib/dpif.c:1367:13 #11 0x692909 in dpif_execute lib/dpif.c:1321:9 #12 0x5b19c6 in packet_execute ofproto/ofproto-dpif.c:4991:5 #13 0x5a2861 in ofproto_packet_out_finish ofproto/ofproto.c:3662:5 #14 0x5a65c6 in do_bundle_commit ofproto/ofproto.c:8270:13 #15 0x5a0cae in handle_bundle_control ofproto/ofproto.c:8309:17 #16 0x59a476 in handle_single_part_openflow ofproto/ofproto.c:8593:16 #17 0x5877ac in handle_openflow ofproto/ofproto.c:8674:21 #18 0x6296f1 in ofconn_run ofproto/connmgr.c:1329:13 #19 0x62925d in connmgr_run ofproto/connmgr.c:356:9 #20 0x586904 in ofproto_run ofproto/ofproto.c:1879:5 #21 0x55c830 in bridge_run__ vswitchd/bridge.c:3251:9 #22 0x55c015 in bridge_run vswitchd/bridge.c:3310:5 #23 0x575f31 in main vswitchd/ovs-vswitchd.c:127:9 #24 0x7f01099d3492 in __libc_start_main (/lib64/libc.so.6+0x23492) The issue can be reproduced with system-userspace testsuite on the 'conntrack - IPv4 fragmentation with fragments specified' test. Previously, there was a leak inside the IP fragmentation module that kept the original segment, so 'packet_clone' remained a valid pointer. But commit 803ed12e31b0 ("ipf: release unhandled packets from the batch") fixed the leak leading to use-after-free. Using the packet from a batch instead of 'packet_clone' to swap packet content to avoid the issue. While investigating this problem, more issues uncovered. One of them is that IP fragmentation engine can add more packets to the batch, but there is no way to get them to a caller. Adding an extra branch for this case with a 'FIXME' comment in order to highlight the issue. Another one is that IP fragmentation engine will keep only 32 fragments dropping all other fragments while refilling a batch, but that should be fixed separately. Fixes: 7e6b41ac8d9d ("dpif-netdev: Fix crash when PACKET_OUT is metered.") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com>
* tunnel-push-pop.at: Mask source port in tunnel header.Ilya Maximets2021-10-121-4/+6
| | | | | | | | | | Source port is based on a packet hash and hash depends on a chosen implementation. Masking it to avoid test failures with '-msse4.2'. Fixes: 7e6b41ac8d9d ("dpif-netdev: Fix crash when PACKET_OUT is metered.") Reported-by: Kumar Amber <kumar.amber@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org>
* ipf: release unhandled packets from the batchAaron Conole2021-10-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf framework unconditionally allocates a new dp_packet to track individual fragments. This prevents a use-after-free. However, an additional issue was present - even when the packet buffer is cloned, if the ip fragment handling code keeps it, the original buffer is leaked during the refill loop. Even in the original processing code, the hardcoded dnsteal branches would always leak a packet buffer from the refill loop. This can be confirmed with valgrind: ==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390 ==717566== at 0x484086F: malloc (vg_replace_malloc.c:380) ==717566== by 0x537BFD: xmalloc__ (util.c:137) ==717566== by 0x537BFD: xmalloc (util.c:172) ==717566== by 0x46DDD4: dp_packet_new (dp-packet.c:153) ==717566== by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163) ==717566== by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262) ==717566== by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511) ==717566== by 0x4AB7E0: netdev_rxq_recv (netdev.c:727) ==717566== by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699) ==717566== by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957) ==717566== by 0x4331D2: type_run (ofproto-dpif.c:370) ==717566== by 0x41DFD8: ofproto_type_run (ofproto.c:1768) ==717566== by 0x40A7FB: bridge_run__ (bridge.c:3245) ==717566== by 0x411269: bridge_run (bridge.c:3310) ==717566== by 0x406E6C: main (ovs-vswitchd.c:127) The fix is to delete the original packet when it isn't able to be reinserted into the packet batch. Subsequent valgrind runs show that the packets are not leaked from the batch any longer. Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.") Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") Reported-by: Wan Junjie <wanjunjie@bytedance.com> Reported-at: https://github.com/openvswitch/ovs-issues/issues/226 Signed-off-by: Aaron Conole <aconole@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: Wan Junjie <wanjunjie@bytedance.com> Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org>
* datapath-windows:adjust Offset when processing packet in POP_VLAN actionwilsonpeng2021-09-301-3/+15
| | | | | | | | | | | | | | In one typical setup, on the Windows VM running OVS Windows Kernel, a Geneva packet with 8021.q VLAN tag is received. Then it may do POP_VLAN action processing in Actions.c, if the packet does not have Ieee8021QNetBufferListInfo in the oob of the packet, it will be processed by function OvsPopVlanInPktBuf(). In the function it will go on remove VLAN header present in the nbl, but related layers is never readjusted for the offset value at this moment. As a result, it will cause function OvsValidateIPChecksum drop the packet. Reported-at:https://github.com/openvswitch/ovs-issues/issues/225 Signed-off-by: wilsonpeng <pweisong@vmware.com> Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org>
* cirrus: Reduce memory requirements for FreeBSD VMs.Ilya Maximets2021-09-231-1/+1
| | | | | | | | | | | | According to memory usage graphs, our builds are using 3GB at most. Reducing memory requirements to 4GB to have some room. This change doesn't affect time needed to finish the build, but should have a slight positive effect on scheduling time on a community cluster. And it's also not cool from our side to reserve shared resources that we're not using, while they could be used by some other project. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com>
* netdev-linux: Fix a null pointer dereference in netdev_linux_notify_sock().Yunjian Wang2021-09-161-1/+1
| | | | | | | | | | | | | If nl_sock_join_mcgroup() returns an error, the 'sock' is freed and set to NULL. This issues will lead to null pointer deference in nl_sock_listen_all_nsid(). To fix it, we call nl_sock_listen_all_nsid() before joining the mcgroups. Fixes: cf114a7fce80 ("netlink linux: enable listening to all nsids") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* pcap-file: Fix memory leak in ovs_pcap_open().Yunjian Wang2021-09-161-0/+1
| | | | | | | | | | | In ovs_pcap_open(), we allocate memory for the 'p_file' structure but not released when fopen fails. Addresses-Coverity: ("Resource leak") Fixes: b6e840aed03e ("pcap-file: Add nanosecond resolution pcap support.") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* odp-util: Fix a null pointer dereference in odp_flow_format().Yunjian Wang2021-09-161-1/+1
| | | | | | | | | | | | | This patch fixes (dereference after null check) coverity issue. For this reason, we should add null check of 'mask' before calling nl_attr_find__() because the 'mask' maybe null. Addresses-Coverity: ("Dereference after null check") Fixes: e6cc0babc25d ("ovs-dpctl: Add mega flow support") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Vasu Dasari <vdasari@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* odp-util: Fix a null pointer dereference in odp_nsh_key_from_attr__().Yunjian Wang2021-09-161-1/+1
| | | | | | | | | | | | This patch fixes (dereference after null check) coverity issue. We should add null check of 'nsh_mask' to avoid passing NULL pointer to memcpy() because the 'nsh_mask' maybe null. Addresses-Coverity: ("Dereference after null check") Fixes: 81fdabb94dd7 ("nsh: fix nested mask for OVS_KEY_ATTR_NSH") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ipf: Fix only nat the first fragment in the reass process.wenxu2021-09-152-38/+82
| | | | | | | | | | | | | | | | | | | | | | The ipf collect original fragment packets and reass a new pkt to do the conntrack logic. After finish the conntrack things copy the ct meta info to each original packet and modify the l4 header in the first fragment. It should modify the ip src/dst info for all the fragments. Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") Signed-off-by: wenxu <wenxu@ucloud.cn> Co-authored-by: luke.li <luke.li@ucloud.cn> Signed-off-by: luke.li <luke.li@ucloud.cn> Reviewed-by: wangze <wangze712@gmail.com> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Test case: Co-authored-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-netdev: Fix crash when PACKET_OUT is metered.Tony van der Peet2021-09-084-1/+101
| | | | | | | | | | | | | | When a PACKET_OUT has output port of OFPP_TABLE, and the rule table includes a meter and this causes the packet to be deleted, execute with a clone of the packet, restoring the original packet if it is changed by the execution. Add tests to verify the original issue is fixed, and that the fix doesn't break tunnel processing. Reported-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz> Signed-off-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* tc: Set action flags for tunnel_key release.Vlad Buslov2021-08-161-0/+1
| | | | | | | | | | | | The commit that enabled 'no_percpu' flag for compatible actions missed the tunnel_key release action code. Add missing call to nl_msg_put_act_flags(). Fixes: 292d5bd9bb34 ("tc: Set 'no_percpu' flag for compatible actions") Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Tested-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netlink-socket: Replace error with txn->error when logging nacked transactions.Paolo Valerio2021-08-161-1/+1
| | | | | | | | | | | | | in case nl_msg_nlmsgerr returns true which basically means that the nlmsg_type == NLMSG_ERROR, we need to log the error code, besides the descriptive representation, stored by nl_msg_nlmsgerr instead of "error". Fixes: 72d32ac0b3a1 ("netlink-socket: Make caller provide message receive buffers.") Suggested-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dynamic-string: Fix a crash in ds_clone().Sriharsha Basavapatna2021-08-161-0/+4
| | | | | | | | | | | | | | | | | | | | | ds_clone() crashes while trying to clone an empty dynamic string. It happens because it doesn't check if memory was allocated and tries to read from the NULL pointer. ds_init() doesn't allocate any memory. For example: In netdev_offload_dpdk_flow_create() when an offload request fails, dump_flow() is called to log a warning message. The 's_tnl' string in flow_patterns gets initialized in vport_to_rte_tunnel() conditionally via ds_put_format(). If it is not initialized, it crashes later in dump_flow_attr()->ds_clone()->memcpy() while dereferencing this string. To fix this, check if memory for the src string has been allocated, before copying it to the dst string. Fixes: fa44a4a3ff7b ("ovn-controller: Persist desired conntrack groups.") Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-netdev: Fix offloads of modified flows.Eli Britstein2021-08-021-9/+5
| | | | | | | | | | | | | | | | | | | | | | Association of a mark to a flow is done as part of its offload handling, in the offloading thread. However, the PMD thread specifies whether an offload request is an "add" or "modify" by the association of a mark to the flow. This is exposed to a race condition. A flow might be created with actions that cannot be fully offloaded, for example flooding (before MAC learning), and later modified to have actions that can be fully offloaded. If the two requests are queued before the offload thread handling, they are both marked as "add". When the offload thread handles them, the first request is partially offloaded, and the second one is ignored as the flow is already considered as offloaded. Fix it by specifying add/modify of an offload request by the actual flow state change, without relying on the mark. Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.") Signed-off-by: Eli Britstein <elibr@nvidia.com> Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-netdev: Fix flow modification after failure.Eli Britstein2021-08-021-2/+2
| | | | | | | | | | | | dp_netdev_flow_offload_main thread is asynchronous, by the cited commit. There might be a case where there are modification requests of the same flow submitted before handled. Then, if the first handling fails, the rule for the flow is deleted, and the mark is freed. Then, the following one should not be handled as a modification, but rather as an "add". Fixes: 02bb2824e51d ("dpif-netdev: do hw flow offload in a thread") Signed-off-by: Eli Britstein <elibr@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* daemon-unix: Fix leak of a fork error message.Ilya Maximets2021-07-261-0/+1
| | | | | | | | | | | | | | | | | | 19 bytes in 1 blocks are definitely lost in loss record 24 of 121 at 0x4839748: malloc (vg_replace_malloc.c:306) by 0x483BD63: realloc (vg_replace_malloc.c:834) by 0x521C26: xrealloc (util.c:149) by 0x478F91: ds_reserve (dynamic-string.c:63) by 0x47928B: ds_put_format_valist (dynamic-string.c:161) by 0x47920A: ds_put_format (dynamic-string.c:142) by 0x506DE5: process_status_msg (process.c:0) by 0x52A6D0: fork_and_wait_for_startup (daemon-unix.c:284) by 0x52A54D: daemonize_start (daemon-unix.c:453) by 0x40EB3E: main (ovs-vswitchd.c:91) Fixes: b925336a36e6 ("daemon: restart child process if it died before signaling its readiness") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Roi Dayan <roid@nvidia.com>
* datapath-windows:Correct checksum for DNAT actionWilson Peng2021-07-211-0/+12
| | | | | | | | | | | | | | | | | | While testing OVS-windows flows for the DNAT action, the checksum In TCP header is set incorrectly when TCP offload is enabled by Default. As a result, the packet will be dropped on receiver linuxVM. >>>sample flow default configuration on both Windows VM and Linux VM (src=40.0.1.2,dst=10.150.0.1) --dnat--> (src=40.0.1.2,dst==30.1.0.2) Without the fix for some TCP packet(40.0.1.2->30.1.0.2 with payload len 207) the TCP checksum will be pseduo header checksum and the value is 0x01d6. With the fix the checksum will be 0x47ee, it could be got the correct TCP checksum on the receiver Linux VM. Signed-off-by: Wilson Peng<pweisong@vmware.com> Signed-off-by: Anand Kumar<kumaranand@vmware.com> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org>
* dpif-netlink: Fix report_loss() message.Mark Gray2021-07-161-2/+2
| | | | | | | | Fixes: 1579cf677fcb ("dpif-linux: Implement the API functions to allow multiple handler threads read upcall.") Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Document all-zero IP SNAT behavior and add a test case.Eelco Chaudron2021-07-1613-1/+148
| | | | | | | | | | | | | | | | | | | | Currently, conntrack in the kernel has an undocumented feature referred to as all-zero IP address SNAT. Basically, when a source port collision is detected during the commit, the source port will be translated to an ephemeral port. If there is no collision, no SNAT is performed. This patchset documents this behavior and adds a self-test to verify it's not changing. In addition, a datapath feature flag is added for the all-zero IP SNAT case. This will help applications on top of OVS, like OVN, to determine this feature can be used. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovsdb-server: Fix memleak when failing to read storage.Dumitru Ceara2021-07-151-5/+3
| | | | | | Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-linux: Ignore TSO packets when TSO is not enabled for userspace.Eelco Chaudron2021-07-091-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When TSO is disabled from a userspace forwarding datapath perspective, but TSO has been wrongly enabled on the kernel side, log a warning message, and drop the packet. With the current implementation, OVS will crash. [i.maximets]: The call stack looks like this: 0 dp_packet_set_size (b=0x0, b=0x0, v=13028) at lib/dp-packet.h:578 1 netdev_linux_batch_rxq_recv_sock at lib/netdev-linux.c:1310 2 netdev_linux_rxq_recv at lib/netdev-linux.c 3 netdev_rxq_recv at lib/netdev.c 4 dp_netdev_process_rxq_port at lib/dpif-netdev.c The problem is that the code assumes that (mmsgs[i].msg_len > std_len) can only be true if userpace-tso is enabled and additional buffers are provided to the kernel. However, since recvmmsg() is called with MSG_TRUNC, the resulting msg_len reflects the original packet size before truncation, and it can be larger than the buffer if TSO / GRO is enabled on the network interface. If TSO support for user space is not enabled in OVS, the aux_bufs are not allocated and are left NULL, resulting in a crash. Fixes: 73858f9dbe83 ("netdev-linux: Prepend the std packet in the TSO packet") Fixes: 2109841b7984 ("Use batch process recv for tap and raw socket in netdev datapath") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Handle already natted packets.Paolo Valerio2021-07-092-1/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a packet gets dnatted and then recirculated, it could be possible that it matches another rule that performs another nat action. The kernel datapath handles this situation turning to a no-op the second nat action, so natting only once the packet. In the userspace datapath instead, when the ct action gets executed, an initial lookup of the translated packet fails to retrieve the connection related to the packet, leading to the creation of a new entry in ct for the src nat action with a subsequent failure of the connection establishment. with the following flows: table=0,priority=30,in_port=1,ip,nw_dst=192.168.2.100,actions=ct(commit,nat(dst=10.1.1.2:80),table=1) table=0,priority=20,in_port=2,ip,actions=ct(nat,table=1) table=0,priority=10,ip,actions=resubmit(,2) table=0,priority=10,arp,actions=NORMAL table=0,priority=0,actions=drop table=1,priority=5,ip,actions=ct(commit,nat(src=10.1.1.240),table=2) table=2,in_port=ovs-l0,actions=2 table=2,in_port=ovs-r0,actions=1 Establishing a connection from 10.1.1.1 to 192.168.2.100 the outcome is: tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=4000,dport=80), reply=(src=10.1.1.2,dst=10.1.1.240,sport=80,dport=4000), protoinfo=(state=ESTABLISHED) tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=4000,dport=80), reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=4000), protoinfo=(state=ESTABLISHED) With this patch applied the outcome is: tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=4000,dport=80), reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=4000), protoinfo=(state=ESTABLISHED) The patch performs, for already natted packets, a lookup of the reverse key in order to retrieve the related entry, it also adds a test case that besides testing the scenario ensures that the other ct actions are executed. Reported-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* python: Fix Idl.run change_seqno update.Bodo Petermann2021-07-071-5/+6
| | | | | | | | | | | | | | | | | Fix an issue where Idl.run() returned False even if there was a change. If Idl.run() reads multiple messages from the database server, some may constitute changes and some may not. Changed the way change_seqno is reset: if a message is not a change, reset change_seqno only to the value before reading this message, not to the value before reading the first message. This will fix the return value in a scenario where some message was a change and the last one wasn't. The new change_seqno will now be the value after handling the message with the last change. Fixes: c39751e44539 ("python: Monitor Database table to manage lifecycle of IDL client.") Signed-off-by: Bodo Petermann <b.petermann@syseleven.de> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* Prepare for 2.13.5.Ilya Maximets2021-07-013-1/+10
| | | | | Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* Set release date for 2.13.4.v2.13.4Ilya Maximets2021-07-012-2/+3
| | | | | Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netlink: removed incorrect optimizationToms Atteka2021-06-302-1/+55
| | | | | | | | | | | | | | | This optimization caused FLOW_TNL_F_UDPIF flag not to be used in hash calculation for geneve tunnel when revalidating flows which resulted in different cache hash values and incorrect behaviour. Added test to prevent regression. CC: Jesse Gross <jesse@nicira.com> Fixes: 6728d578f64e ("dpif-netdev: Translate Geneve options per-flow, not per-packet.") Reported-at: https://github.com/vmware-tanzu/antrea/issues/897 Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com> Acked-by: Ansis Atteka <aatteka@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovs-actions.xml: Add missing bracket.Paolo Valerio2021-06-301-1/+1
| | | | | | | | | | | | | | | A bracket is apparently missing in ovs-actions(7). The patch changes the relevant row from: ct(argument]...) to: ct([argument]...) Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Use nl_msg_put_flag for OVS_TUNNEL_KEY_ATTR_CSUM.Paolo Valerio2021-06-301-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a tunnel port gets added to the bridge setting the checksum option to true: ovs-vsctl add-port br0 geneve0 \ -- set interface geneve0 type=geneve \ options:remote_ip=<remote_ip> options:key=<key> options:csum=true the flow dump for the outgoing traffic will include a "bad key length 1 ..." message: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, key6(bad key length 1, expected 0)(01)flags(key))) ,genev_sys_6081 This is due to a mismatch present between the expected length (zero for OVS_TUNNEL_KEY_ATTR_CSUM in ovs_tun_key_attr_lens) and the current one. With this patch the same flow dump becomes: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, flags(csum|key))),genev_sys_6081 Fixes: d9677a1f0eaf ("netdev-tc-offloads: TC csum option is not matched with tunnel configuration") Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* datapath-windows: Specify external include pathsAlin Gabriel Serdean2021-06-301-0/+17
| | | | | | | | | | | | | | | | VStudio 16.10 adds usermode includes before including the driver kit ones. Bug tracked at: https://developercommunity.visualstudio.com/t/error-lnk2019-unresolved-external-symbol-stdio-com/1434674 Fixes appveyor build reported by forcing external includes. Reported-by: Ilya Maximets <i.maximets@ovn.org> Reported-by: Frank Wagner <frank.wagner@dbosoft.eu> Reported-at: https://github.com/openvswitch/ovs-issues/issues/209#issuecomment-861385852 Reported-at: https://github.com/openvswitch/ovs-issues/issues/211 Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org>
* Remove Python 2 leftovers.Rosemarie O'Riorden2021-06-2216-63/+35
| | | | | | | Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.") Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949875 Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.Aaron Conole2021-06-153-16/+6
| | | | | | | | | | | | | | | | | | | | | As reported by Wang Liang, the way packets are passed to the ipf module doesn't allow for use later on in reassembly. Such packets may be get released anyway, such as during cleanup of tx processing. Because the ipf module lacks a way of forcing the dp_packet to be retained, it will later reuse the packet. Instead, just clone the packet and let the ipf queue own the copy until the queue is destroyed. After this change, there are no more in-tree users of the batch 'do_not_steal' flag. Thus, we remove it as well. Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") Fixes: 0b3ff31d35f5 ("dp-packet: Add 'do_not_steal' packet batch flag.") Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-April/382098.html Reported-by: Wang Liang <wangliangrt@didiglobal.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Co-authored-by: Wang Liang <wangliangrt@didiglobal.com> Signed-off-by: Wang Liang <wangliangrt@didiglobal.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto: Fix potential NULL dereference in ofproto_ct_*_zone_timeout_policy().Dumitru Ceara2021-06-031-2/+2
| | | | | | | | | Spotted during code inspection. Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto: Fix potential NULL dereference in ofproto_get_datapath_cap().Dumitru Ceara2021-06-031-1/+1
| | | | | | | | | | | | | | Reproducer: ovs-vsctl \ -- add-br br \ -- set bridge br datapath-type=foo \ -- --id=@m create Datapath datapath_version=0 'capabilities={}' \ -- set Open_vSwitch . datapaths:"foo"=@m Fixes: 27501802d09f ("ofproto-dpif: Expose datapath capability to ovsdb.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-netlink: Fix send of uninitialized memory in ct limit requests.Ilya Maximets2021-05-241-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ct limit requests never initializes the whole 'struct ovs_zone_limit' sending uninitialized stack memory to kernel: Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s) at 0x5E23867: sendmsg (in /usr/lib64/libpthread-2.28.so) by 0x54F761: nl_sock_transact_multiple__ (netlink-socket.c:858) by 0x54FB6E: nl_sock_transact_multiple.part.9 (netlink-socket.c:1079) by 0x54FCC0: nl_sock_transact_multiple (netlink-socket.c:1044) by 0x54FCC0: nl_sock_transact (netlink-socket.c:1108) by 0x550B6F: nl_transact (netlink-socket.c:1804) by 0x53BEA2: dpif_netlink_ct_get_limits (dpif-netlink.c:3052) by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178) by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870) by 0x52C241: process_command (unixctl.c:310) by 0x52C241: run_connection (unixctl.c:344) by 0x52C241: unixctl_server_run (unixctl.c:395) by 0x407526: main (ovs-vswitchd.c:128) Address 0x10b87480 is 32 bytes inside a block of size 4,096 alloc'd at 0x4C30F0B: malloc (vg_replace_malloc.c:307) by 0x52CDE4: xmalloc (util.c:138) by 0x4F7E07: ofpbuf_init (ofpbuf.c:123) by 0x4F7E07: ofpbuf_new (ofpbuf.c:151) by 0x53BDE3: dpif_netlink_ct_get_limits (dpif-netlink.c:3025) by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178) by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870) by 0x52C241: process_command (unixctl.c:310) by 0x52C241: run_connection (unixctl.c:344) by 0x52C241: unixctl_server_run (unixctl.c:395) by 0x407526: main (ovs-vswitchd.c:128) Uninitialised value was created by a stack allocation at 0x46AAA0: ct_dpif_get_limits (ct-dpif.c:197) Fix that by using designated initializers that will clear all the non-specified fields. Fixes: 906ff9d229ee ("dpif-netlink: Implement conntrack zone limit") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
* ofproto-dpif: Fix use of uninitialized attributes of timeout policy.Ilya Maximets2021-05-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | 'cdtp' is allocated on a stack and it has uninitialized 'present' field that specifies which attributes are actually set. This causes use of uninitialized attributes. Conditional jump or move depends on uninitialised value(s) at 0x539651: dpif_netlink_get_nl_tp_udp_attrs (dpif-netlink.c:3206) by 0x539651: dpif_netlink_get_nl_tp_attrs (dpif-netlink.c:3234) by 0x539651: dpif_netlink_ct_set_timeout_policy (dpif-netlink.c:3370) by 0x42B615: ct_add_timeout_policy_to_dpif (ofproto-dpif.c:5421) by 0x42B615: ct_set_zone_timeout_policy (ofproto-dpif.c:5525) by 0x40BD98: ct_zones_reconfigure (bridge.c:756) by 0x40BD98: datapath_reconfigure (bridge.c:804) by 0x40D737: bridge_reconfigure (bridge.c:903) by 0x411720: bridge_run (bridge.c:3331) by 0x40751C: main (ovs-vswitchd.c:127) Clearing the whole structure to avoid this kind of problems. Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
* netdev-linux: Fix use of uninitialized LAG master name.Ilya Maximets2021-05-241-1/+3
| | | | | | | | | | | | | | | | | | | | | | 'if_indextoname' may fail leaving the 'master_name' uninitialized: Conditional jump or move depends on uninitialised value(s) at 0x4C34329: strlen (vg_replace_strmem.c:459) by 0x51C638: hash_string (hash.h:342) by 0x51C638: hash_name (shash.c:28) by 0x51CC51: shash_find (shash.c:231) by 0x51CD38: shash_find_data (shash.c:245) by 0x4A797F: netdev_from_name (netdev.c:2013) by 0x544148: netdev_linux_update_lag (netdev-linux.c:676) by 0x544148: netdev_linux_run (netdev-linux.c:769) by 0x4A5997: netdev_run (netdev.c:186) by 0x40752B: main (ovs-vswitchd.c:129) Uninitialised value was created by a stack allocation at 0x543AFA: netdev_linux_run (netdev-linux.c:722) Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload LAG slaves to TC") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
* ofp_actions: Fix set_mpls_tc formatting.Adrian Moreno2021-05-193-4/+60
| | | | | | | | | | | Apart from a cut-and-paste typo, the man page claims that mpls_labels can be provided in hexadecimal format but that's currently not the case. Fix mpls ofp-action formatting, add size checks on ofp-action parsing and add some unit tests. Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-netdev: Remove meter rate from the bucket size calculation.Ilya Maximets2021-05-183-18/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implementation of meters supposed to be a classic token bucket with 2 typical parameters: rate and burst size. Burst size in this schema is the maximum number of bytes/packets that could pass without being rate limited. Recent changes to userspace datapath made meter implementation to be in line with the kernel one, and this uncovered several issues. The main problem is that maximum bucket size for unknown reason accounts not only burst size, but also the numerical value of rate. This creates a lot of confusion around behavior of meters. For example, if rate is configured as 1000 pps and burst size set to 1, this should mean that meter will tolerate bursts of 1 packet at most, i.e. not a single packet above the rate should pass the meter. However, current implementation calculates maximum bucket size as (rate + burst size), so the effective bucket size will be 1001. This means that first 1000 packets will not be rate limited and average rate might be twice as high as the configured rate. This also makes it practically impossible to configure meter that will have burst size lower than the rate, which might be a desirable configuration if the rate is high. Inability to configure low values of a burst size and overall inability for a user to predict what will be a maximum and average rate from the configured parameters of a meter without looking at the OVS and kernel code might be also classified as a security issue, because drop meters are frequently used as a way of protection from DoS attacks. This change removes rate from the calculation of a bucket size, making it in line with the classic token bucket algorithm and essentially making the rate and burst tolerance being predictable from a users' perspective. Same change will be proposed for the kernel implementation. Unit tests changed back to their correct version and enhanced. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
* ovsdb-idl: Consider all tables when computing expected cond seqno.Dumitru Ceara2021-05-181-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ovsdb_idl_db_set_condition(), take into account all pending condition changes for all tables when computing the db->cond_seqno at which the monitor is expected to be updated. In the following scenario, with two tables, A and B, the old code performed the following steps: 1. Initial db->cond_seqno = X. 2. Client changes condition for table A: - A->new_cond gets set - expected cond seqno returned to the client: X + 1 3. ovsdb-idl sends the monitor_cond_change for table A - A->req_cond <- A->new_cond 4. Client changes condition for table B: - B->new_cond gets set - expected cond seqno returned to the client: X + 1 - however, because the condition change at step 3 is still not replied to, table B's monitor_cond_change request is not sent yet. 5. ovsdb-idl receives the reply for the condition change at step 3: - db->cond_seqno <- X + 1 6. ovsdb-idl sends the monitor_cond_change for table B 7. ovsdb-idl receives the reply for the condition change at step 6: - db->cond_seqno <- X + 2 The client was incorrectly informed that it will have all relevant updates for table B at seqno X + 1 while actually that happens later, at seqno X + 2. Fixes: 46437c5232bd ("ovsdb-idl: Enhance conditional monitoring API") Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> (cherry picked from commit b5bb044fbe4c1395dcde5cc7d5081ef0099bb8b3) Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovs-ofctl: Fix coredump when using "add-groups" command.Wang Yibo2021-05-141-1/+1
| | | | | | | | | | | | When using ovs-ofctl add-groups with only "switch" argument, a coredump is generated. The main reason is that the command "ovs-ofctl add-groups" need two arguments but the limitation of min-args of this command is set to 1. Fixes: 7395c05254df ("Implement OpenFlow 1.1+ "groups" protocol.") Submitted-at: https://github.com/openvswitch/ovs/pull/360 Signed-off-by: Wang Yibo <bobxxwang@126.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* raft: Transfer leadership before creating snapshots.Ilya Maximets2021-05-144-7/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With a big database writing snapshot could take a lot of time, for example, on one of the systems compaction of 300MB database takes about 10 seconds to complete. For the clustered database, 40% of this time takes conversion of the database to the file transaction json format, the rest of time is formatting a string and writing to disk. Of course, this highly depends on the disc and CPU speeds. 300MB is the very possible database size for the OVN Southbound DB, and it might be even bigger than that. During compaction the database is not available and the ovsdb-server doesn't do any other tasks. If leader spends 10-15 seconds writing a snapshot, the cluster is not functional for that time period. Leader also, likely, has some monitors to serve, so the one poll interval may be 15-20 seconds long in the end. Systems with so big databases typically has very high election timers configured (16 seconds), so followers will start election only after this significant amount of time. Once leader is back to the operational state, it will re-connect and try to join the cluster back. In some cases, this might also trigger the 'connected' state flapping on the old leader triggering a re-connection of clients. This issue has been observed with large-scale OVN deployments. One of the methods to improve the situation is to transfer leadership before compacting. This allows to keep the cluster functional, while one of the servers writes a snapshot. Additionally logging the time spent for compaction if it was longer than 1 second. This adds a bit of visibility to 'unreasonably long poll interval's. Reported-at: https://bugzilla.redhat.com/1960391 Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>
* dpdk: Use DPDK 19.11.8 release.Hariprasad Govindharajan2021-05-125-10/+10
| | | | | | | | | | | | | | | | Modify ci linux build script to use the latest DPDK stable release. Modify Documentation to use the latest DPDK stable release 19.11.8 Update NEWS file to reflect the latest DPDK stable release. Building DPDK releases 19.11.6 and 19.11.7,with meson was found to break the OvS build. DPDK 19.11.8 was released to fix the above said issue. Link to the discussion about the issue can be found below: http://mails.dpdk.org/archives/stable/2021-April/029786.html Signed-off-by: Hariprasad Govindharajan <hariprasad.govindharajan@intel.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* github: Fix up malformed /etc/hosts.Ilya Maximets2021-05-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | For some reason /etc/hosts in GHA now contains a plain text line like this: Note: Don't Delete this file. Also, don't remove this line. ... This breaks libunbound and makes a series of unit tests to emit following warning: |00001|dns_resolve|WARN|Failed to read etc/hosts: syntax error Working around this issue by removing a bad line from /etc/hosts until this fixed properly by GitHub team. This in combination with other fixes should unblock CI. Bug for virtual-environments: https://github.com/actions/virtual-environments/issues/3353 Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com>
* doc: automake: Add support for sphinx 4.0.Ilya Maximets2021-05-111-2/+7
| | | | | | | | | | | | | | | | | | | | File layout for man pages in sphinx 4 by default changed [1] from: Documentation/_ref/man/page.section to: Documentation/_ref/man/section/page.section Ajusting our build scripts so they will be able to locate files in new places. This fixes our CI build. [1] https://github.com/sphinx-doc/sphinx/issues/7996 Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Aaron Conole <aconole@redhat.com>
* cirrus: Look up existing versions of python dependencies.Ilya Maximets2021-05-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | FreeBSD sometimes changes the base version of python3 that is used for packages. This affects package names. For example, currently CI is broken, because there is no more py37- versions of sphinx and openssl available, only py38- ones: pkg: No packages available to install matching 'py37-openssl' have been found in the repositories pkg: No packages available to install matching 'py37-sphinx' have been found in the repositories We had the same issue last year with 3.6 -> 3.7 transition: dfa2e3d04948 ("cirrus: Use python 3.7 packages on FreeBSD.") Fixing that by searching for a package instead of using a specific version. This should help to avoid same issues in the future. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Aaron Conole <aconole@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>