From patchwork Wed Apr 10 12:53:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Marchand X-Patchwork-Id: 52592 X-Patchwork-Delegate: ferruh.yigit@amd.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8946A1B122; Wed, 10 Apr 2019 14:54:11 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 7CB407CDA; Wed, 10 Apr 2019 14:54:05 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D1818316A8EC; Wed, 10 Apr 2019 12:54:04 +0000 (UTC) Received: from dmarchan.remote.csb (ovpn-204-129.brq.redhat.com [10.40.204.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6A8B65C223; Wed, 10 Apr 2019 12:54:03 +0000 (UTC) From: David Marchand To: dev@dpdk.org Cc: chas3@att.com, p.oltarzewski@gmail.com, stable@dpdk.org Date: Wed, 10 Apr 2019 14:53:46 +0200 Message-Id: <1554900829-16180-2-git-send-email-david.marchand@redhat.com> In-Reply-To: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> References: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Wed, 10 Apr 2019 12:54:04 +0000 (UTC) Subject: [dpdk-dev] [PATCH 1/4] net/bonding: fix oob access in LACP mode when sending many packets X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" We'd better consolidate the fast queue and the normal tx burst functions under a common inline wrapper for maintenance. But looking closer at the bufs_slave_port_idxs[] mapping array in those tx burst functions, its size is invalid since up to nb_bufs are handled here. A previous patch [1] fixed this issue for balance tx burst function without mentionning it. 802.3ad and balance modes are functionally equivalent on transmit. The only difference is on the slave id distribution. Add an additional inline wrapper to consolidate even more and fix this issue. [1]: https://git.dpdk.org/dpdk/commit/?id=c5224f623431 Fixes: 09150784a776 ("net/bonding: burst mode hash calculation") Cc: stable@dpdk.org Signed-off-by: David Marchand --- drivers/net/bonding/rte_eth_bond_pmd.c | 213 ++++++++------------------------- 1 file changed, 51 insertions(+), 162 deletions(-) diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c index f30422a..c193d6d 100644 --- a/drivers/net/bonding/rte_eth_bond_pmd.c +++ b/drivers/net/bonding/rte_eth_bond_pmd.c @@ -2,6 +2,7 @@ * Copyright(c) 2010-2017 Intel Corporation */ #include +#include #include #include @@ -295,97 +296,6 @@ } static uint16_t -bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs, - uint16_t nb_bufs) -{ - struct bond_tx_queue *bd_tx_q = (struct bond_tx_queue *)queue; - struct bond_dev_private *internals = bd_tx_q->dev_private; - - uint16_t slave_port_ids[RTE_MAX_ETHPORTS]; - uint16_t slave_count; - - uint16_t dist_slave_port_ids[RTE_MAX_ETHPORTS]; - uint16_t dist_slave_count; - - /* 2-D array to sort mbufs for transmission on each slave into */ - struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_bufs]; - /* Number of mbufs for transmission on each slave */ - uint16_t slave_nb_bufs[RTE_MAX_ETHPORTS] = { 0 }; - /* Mapping array generated by hash function to map mbufs to slaves */ - uint16_t bufs_slave_port_idxs[RTE_MAX_ETHPORTS] = { 0 }; - - uint16_t slave_tx_count; - uint16_t total_tx_count = 0, total_tx_fail_count = 0; - - uint16_t i; - - if (unlikely(nb_bufs == 0)) - return 0; - - /* Copy slave list to protect against slave up/down changes during tx - * bursting */ - slave_count = internals->active_slave_count; - if (unlikely(slave_count < 1)) - return 0; - - memcpy(slave_port_ids, internals->active_slaves, - sizeof(slave_port_ids[0]) * slave_count); - - - dist_slave_count = 0; - for (i = 0; i < slave_count; i++) { - struct port *port = &bond_mode_8023ad_ports[slave_port_ids[i]]; - - if (ACTOR_STATE(port, DISTRIBUTING)) - dist_slave_port_ids[dist_slave_count++] = - slave_port_ids[i]; - } - - if (unlikely(dist_slave_count < 1)) - return 0; - - /* - * Populate slaves mbuf with the packets which are to be sent on it - * selecting output slave using hash based on xmit policy - */ - internals->burst_xmit_hash(bufs, nb_bufs, dist_slave_count, - bufs_slave_port_idxs); - - for (i = 0; i < nb_bufs; i++) { - /* Populate slave mbuf arrays with mbufs for that slave. */ - uint16_t slave_idx = bufs_slave_port_idxs[i]; - - slave_bufs[slave_idx][slave_nb_bufs[slave_idx]++] = bufs[i]; - } - - - /* Send packet burst on each slave device */ - for (i = 0; i < dist_slave_count; i++) { - if (slave_nb_bufs[i] == 0) - continue; - - slave_tx_count = rte_eth_tx_burst(dist_slave_port_ids[i], - bd_tx_q->queue_id, slave_bufs[i], - slave_nb_bufs[i]); - - total_tx_count += slave_tx_count; - - /* If tx burst fails move packets to end of bufs */ - if (unlikely(slave_tx_count < slave_nb_bufs[i])) { - int slave_tx_fail_count = slave_nb_bufs[i] - - slave_tx_count; - total_tx_fail_count += slave_tx_fail_count; - memcpy(&bufs[nb_bufs - total_tx_fail_count], - &slave_bufs[i][slave_tx_count], - slave_tx_fail_count * sizeof(bufs[0])); - } - } - - return total_tx_count; -} - - -static uint16_t bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -1200,16 +1110,13 @@ struct bwg_slave { return num_tx_total; } -static uint16_t -bond_ethdev_tx_burst_balance(void *queue, struct rte_mbuf **bufs, - uint16_t nb_bufs) +static inline uint16_t +tx_burst_balance(void *queue, struct rte_mbuf **bufs, uint16_t nb_bufs, + uint16_t *slave_port_ids, uint16_t slave_count) { struct bond_tx_queue *bd_tx_q = (struct bond_tx_queue *)queue; struct bond_dev_private *internals = bd_tx_q->dev_private; - uint16_t slave_port_ids[RTE_MAX_ETHPORTS]; - uint16_t slave_count; - /* Array to sort mbufs for transmission on each slave into */ struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_bufs]; /* Number of mbufs for transmission on each slave */ @@ -1222,18 +1129,6 @@ struct bwg_slave { uint16_t i; - if (unlikely(nb_bufs == 0)) - return 0; - - /* Copy slave list to protect against slave up/down changes during tx - * bursting */ - slave_count = internals->active_slave_count; - if (unlikely(slave_count < 1)) - return 0; - - memcpy(slave_port_ids, internals->active_slaves, - sizeof(slave_port_ids[0]) * slave_count); - /* * Populate slaves mbuf with the packets which are to be sent on it * selecting output slave using hash based on xmit policy @@ -1274,7 +1169,7 @@ struct bwg_slave { } static uint16_t -bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs, +bond_ethdev_tx_burst_balance(void *queue, struct rte_mbuf **bufs, uint16_t nb_bufs) { struct bond_tx_queue *bd_tx_q = (struct bond_tx_queue *)queue; @@ -1283,18 +1178,36 @@ struct bwg_slave { uint16_t slave_port_ids[RTE_MAX_ETHPORTS]; uint16_t slave_count; + if (unlikely(nb_bufs == 0)) + return 0; + + /* Copy slave list to protect against slave up/down changes during tx + * bursting + */ + slave_count = internals->active_slave_count; + if (unlikely(slave_count < 1)) + return 0; + + memcpy(slave_port_ids, internals->active_slaves, + sizeof(slave_port_ids[0]) * slave_count); + return tx_burst_balance(queue, bufs, nb_bufs, slave_port_ids, + slave_count); +} + +static inline uint16_t +tx_burst_8023ad(void *queue, struct rte_mbuf **bufs, uint16_t nb_bufs, + bool dedicated_txq) +{ + struct bond_tx_queue *bd_tx_q = (struct bond_tx_queue *)queue; + struct bond_dev_private *internals = bd_tx_q->dev_private; + + uint16_t slave_port_ids[RTE_MAX_ETHPORTS]; + uint16_t slave_count; + uint16_t dist_slave_port_ids[RTE_MAX_ETHPORTS]; uint16_t dist_slave_count; - /* 2-D array to sort mbufs for transmission on each slave into */ - struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_bufs]; - /* Number of mbufs for transmission on each slave */ - uint16_t slave_nb_bufs[RTE_MAX_ETHPORTS] = { 0 }; - /* Mapping array generated by hash function to map mbufs to slaves */ - uint16_t bufs_slave_port_idxs[RTE_MAX_ETHPORTS] = { 0 }; - uint16_t slave_tx_count; - uint16_t total_tx_count = 0, total_tx_fail_count = 0; uint16_t i; @@ -1307,6 +1220,9 @@ struct bwg_slave { memcpy(slave_port_ids, internals->active_slaves, sizeof(slave_port_ids[0]) * slave_count); + if (dedicated_txq) + goto skip_tx_ring; + /* Check for LACP control packets and send if available */ for (i = 0; i < slave_count; i++) { struct port *port = &bond_mode_8023ad_ports[slave_port_ids[i]]; @@ -1328,6 +1244,7 @@ struct bwg_slave { } } +skip_tx_ring: if (unlikely(nb_bufs == 0)) return 0; @@ -1340,53 +1257,25 @@ struct bwg_slave { slave_port_ids[i]; } - if (likely(dist_slave_count > 0)) { - - /* - * Populate slaves mbuf with the packets which are to be sent - * on it, selecting output slave using hash based on xmit policy - */ - internals->burst_xmit_hash(bufs, nb_bufs, dist_slave_count, - bufs_slave_port_idxs); - - for (i = 0; i < nb_bufs; i++) { - /* - * Populate slave mbuf arrays with mbufs for that - * slave - */ - uint16_t slave_idx = bufs_slave_port_idxs[i]; - - slave_bufs[slave_idx][slave_nb_bufs[slave_idx]++] = - bufs[i]; - } - - - /* Send packet burst on each slave device */ - for (i = 0; i < dist_slave_count; i++) { - if (slave_nb_bufs[i] == 0) - continue; - - slave_tx_count = rte_eth_tx_burst( - dist_slave_port_ids[i], - bd_tx_q->queue_id, slave_bufs[i], - slave_nb_bufs[i]); - - total_tx_count += slave_tx_count; + if (unlikely(dist_slave_count < 1)) + return 0; - /* If tx burst fails move packets to end of bufs */ - if (unlikely(slave_tx_count < slave_nb_bufs[i])) { - int slave_tx_fail_count = slave_nb_bufs[i] - - slave_tx_count; - total_tx_fail_count += slave_tx_fail_count; + return tx_burst_balance(queue, bufs, nb_bufs, dist_slave_port_ids, + dist_slave_count); +} - memcpy(&bufs[nb_bufs - total_tx_fail_count], - &slave_bufs[i][slave_tx_count], - slave_tx_fail_count * sizeof(bufs[0])); - } - } - } +static uint16_t +bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs, + uint16_t nb_bufs) +{ + return tx_burst_8023ad(queue, bufs, nb_bufs, false); +} - return total_tx_count; +static uint16_t +bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs, + uint16_t nb_bufs) +{ + return tx_burst_8023ad(queue, bufs, nb_bufs, true); } static uint16_t From patchwork Wed Apr 10 12:53:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Marchand X-Patchwork-Id: 52593 X-Patchwork-Delegate: ferruh.yigit@amd.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id F2F491B12A; Wed, 10 Apr 2019 14:54:16 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 01D871B118; Wed, 10 Apr 2019 14:54:07 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6399B3082E50; Wed, 10 Apr 2019 12:54:06 +0000 (UTC) Received: from dmarchan.remote.csb (ovpn-204-129.brq.redhat.com [10.40.204.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1F2C95C28F; Wed, 10 Apr 2019 12:54:04 +0000 (UTC) From: David Marchand To: dev@dpdk.org Cc: chas3@att.com, p.oltarzewski@gmail.com, stable@dpdk.org Date: Wed, 10 Apr 2019 14:53:47 +0200 Message-Id: <1554900829-16180-3-git-send-email-david.marchand@redhat.com> In-Reply-To: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> References: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.46]); Wed, 10 Apr 2019 12:54:06 +0000 (UTC) Subject: [dpdk-dev] [PATCH 2/4] net/bonding: fix LACP fast queue Rx handler X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" fast queue Rx burst function is missing checks on promisc and the slave collecting state. Define an inline wrapper to have a common base. Fixes: 112891cd27e5 ("net/bonding: add dedicated HW queues for LACP control") Cc: stable@dpdk.org Signed-off-by: David Marchand Signed-off-by: Chas Williams --- drivers/net/bonding/rte_eth_bond_pmd.c | 73 +++++++++++++--------------------- 1 file changed, 27 insertions(+), 46 deletions(-) diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c index c193d6d..989be5c 100644 --- a/drivers/net/bonding/rte_eth_bond_pmd.c +++ b/drivers/net/bonding/rte_eth_bond_pmd.c @@ -256,48 +256,9 @@ return 0; } -static uint16_t -bond_ethdev_rx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs, - uint16_t nb_pkts) -{ - struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue; - struct bond_dev_private *internals = bd_rx_q->dev_private; - uint16_t num_rx_total = 0; /* Total number of received packets */ - uint16_t slaves[RTE_MAX_ETHPORTS]; - uint16_t slave_count; - uint16_t active_slave; - uint16_t i; - - /* Copy slave list to protect against slave up/down changes during tx - * bursting */ - slave_count = internals->active_slave_count; - active_slave = internals->active_slave; - memcpy(slaves, internals->active_slaves, - sizeof(internals->active_slaves[0]) * slave_count); - - for (i = 0; i < slave_count && nb_pkts; i++) { - uint16_t num_rx_slave; - - /* Read packets from this slave */ - num_rx_slave = rte_eth_rx_burst(slaves[active_slave], - bd_rx_q->queue_id, - bufs + num_rx_total, nb_pkts); - num_rx_total += num_rx_slave; - nb_pkts -= num_rx_slave; - - if (++active_slave == slave_count) - active_slave = 0; - } - - if (++internals->active_slave >= slave_count) - internals->active_slave = 0; - - return num_rx_total; -} - -static uint16_t -bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs, - uint16_t nb_pkts) +static inline uint16_t +rx_burst_8023ad(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts, + bool dedicated_rxq) { /* Cast to structure, containing bonded device's port id and queue id */ struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue; @@ -357,10 +318,16 @@ hdr = rte_pktmbuf_mtod(bufs[j], struct ether_hdr *); subtype = ((struct slow_protocol_frame *)hdr)->slow_protocol.subtype; - /* Remove packet from array if it is slow packet or slave is not - * in collecting state or bonding interface is not in promiscuous - * mode and packet address does not match. */ - if (unlikely(is_lacp_packets(hdr->ether_type, subtype, bufs[j]) || + /* Remove packet from array if: + * - it is slow packet but no dedicated rxq is present, + * - slave is not in collecting state, + * - bonding interface is not in promiscuous mode and + * packet is not multicast and address does not match, + */ + if (unlikely( + (!dedicated_rxq && + is_lacp_packets(hdr->ether_type, subtype, + bufs[j])) || !collecting || (!promisc && !is_multicast_ether_addr(&hdr->d_addr) && @@ -392,6 +359,20 @@ return num_rx_total; } +static uint16_t +bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs, + uint16_t nb_pkts) +{ + return rx_burst_8023ad(queue, bufs, nb_pkts, false); +} + +static uint16_t +bond_ethdev_rx_burst_8023ad_fast_queue(void *queue, struct rte_mbuf **bufs, + uint16_t nb_pkts) +{ + return rx_burst_8023ad(queue, bufs, nb_pkts, true); +} + #if defined(RTE_LIBRTE_BOND_DEBUG_ALB) || defined(RTE_LIBRTE_BOND_DEBUG_ALB_L1) uint32_t burstnumberRX; uint32_t burstnumberTX; From patchwork Wed Apr 10 12:53:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Marchand X-Patchwork-Id: 52594 X-Patchwork-Delegate: ferruh.yigit@amd.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 265FD1B13B; Wed, 10 Apr 2019 14:54:20 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id ACEAF1B118; Wed, 10 Apr 2019 14:54:08 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 13A633001244; Wed, 10 Apr 2019 12:54:08 +0000 (UTC) Received: from dmarchan.remote.csb (ovpn-204-129.brq.redhat.com [10.40.204.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id C4B5261B7F; Wed, 10 Apr 2019 12:54:06 +0000 (UTC) From: David Marchand To: dev@dpdk.org Cc: chas3@att.com, p.oltarzewski@gmail.com, stable@dpdk.org Date: Wed, 10 Apr 2019 14:53:48 +0200 Message-Id: <1554900829-16180-4-git-send-email-david.marchand@redhat.com> In-Reply-To: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> References: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Wed, 10 Apr 2019 12:54:08 +0000 (UTC) Subject: [dpdk-dev] [PATCH 3/4] net/bonding: fix unicast packets filtering when not in promisc X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" By default, the 802.3ad code enables promisc mode on all slaves. To avoid all packets going to the application (unless the application asked for promiscuous mode), all frames are supposed to be filtered in the rx burst handler. However the incriminated commit broke this because non pure ethernet frames (basically any unicast Ether()/IP() packet) are not filtered anymore. Fixes: 71b7b37ec959 ("net/bonding: use ptype flags for LACP Rx filtering") Cc: stable@dpdk.org Signed-off-by: David Marchand --- drivers/net/bonding/rte_eth_bond_pmd.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c index 989be5c..9ef0717 100644 --- a/drivers/net/bonding/rte_eth_bond_pmd.c +++ b/drivers/net/bonding/rte_eth_bond_pmd.c @@ -305,13 +305,6 @@ /* Handle slow protocol packets. */ while (j < num_rx_total) { - - /* If packet is not pure L2 and is known, skip it */ - if ((bufs[j]->packet_type & ~RTE_PTYPE_L2_ETHER) != 0) { - j++; - continue; - } - if (j + 3 < num_rx_total) rte_prefetch0(rte_pktmbuf_mtod(bufs[j + 3], void *)); From patchwork Wed Apr 10 12:53:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Marchand X-Patchwork-Id: 52595 X-Patchwork-Delegate: ferruh.yigit@amd.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DC36C1B14D; Wed, 10 Apr 2019 14:54:23 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 22C1C1B118 for ; Wed, 10 Apr 2019 14:54:10 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 87A95316EB73; Wed, 10 Apr 2019 12:54:09 +0000 (UTC) Received: from dmarchan.remote.csb (ovpn-204-129.brq.redhat.com [10.40.204.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 717AA5C28F; Wed, 10 Apr 2019 12:54:08 +0000 (UTC) From: David Marchand To: dev@dpdk.org Cc: chas3@att.com, p.oltarzewski@gmail.com Date: Wed, 10 Apr 2019 14:53:49 +0200 Message-Id: <1554900829-16180-5-git-send-email-david.marchand@redhat.com> In-Reply-To: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> References: <1554900829-16180-1-git-send-email-david.marchand@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Wed, 10 Apr 2019 12:54:09 +0000 (UTC) Subject: [dpdk-dev] [PATCH 4/4] net/bonding: prefer allmulti to promisc for LACP X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Rather than the global promiscuous mode on all slaves, let's try to use allmulti. To do this, we apply the same mechanism than for promiscuous and drop in rx_burst. When adding a slave, we first try to use allmulti, else we try promiscuous. Because of this, we must be able to handle allmulti on the bonding interface, so the associated dev_ops is added with checks on which mode has been applied on each slave. Rather than add a new flag in the bond private structure, we can remove promiscuous_en since ethdev already maintains this information. When starting the bond device, there is no promisc/allmulti re-apply as, again, ethdev does this itself. On reception, we must inspect if the packets are multicast or unicast and take a decision to drop based on which mode has been enabled on the bonding interface. Note: in the future, if we define an API so that we can add/remove mc addresses to a port (rather than the current global set_mc_addr_list API), we can refine this to only register the LACP multicast mac address. Signed-off-by: David Marchand --- drivers/net/bonding/rte_eth_bond_8023ad.c | 51 +++++++++- drivers/net/bonding/rte_eth_bond_8023ad_private.h | 7 ++ drivers/net/bonding/rte_eth_bond_pmd.c | 113 +++++++++++++++++----- drivers/net/bonding/rte_eth_bond_private.h | 3 - 4 files changed, 148 insertions(+), 26 deletions(-) diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c index 5004898..adf6b7e 100644 --- a/drivers/net/bonding/rte_eth_bond_8023ad.c +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c @@ -907,6 +907,49 @@ bond_mode_8023ad_periodic_cb, arg); } +static int +bond_mode_8023ad_register_lacp_mac(uint16_t slave_id) +{ + rte_eth_allmulticast_enable(slave_id); + if (rte_eth_allmulticast_get(slave_id)) { + RTE_BOND_LOG(DEBUG, "forced allmulti for port %u", + slave_id); + bond_mode_8023ad_ports[slave_id].forced_rx_flags = + BOND_8023AD_FORCED_ALLMULTI; + return 0; + } + + rte_eth_promiscuous_enable(slave_id); + if (rte_eth_promiscuous_get(slave_id)) { + RTE_BOND_LOG(DEBUG, "forced promiscuous for port %u", + slave_id); + bond_mode_8023ad_ports[slave_id].forced_rx_flags = + BOND_8023AD_FORCED_PROMISC; + return 0; + } + + return -1; +} + +static void +bond_mode_8023ad_unregister_lacp_mac(uint16_t slave_id) +{ + switch (bond_mode_8023ad_ports[slave_id].forced_rx_flags) { + case BOND_8023AD_FORCED_ALLMULTI: + RTE_BOND_LOG(DEBUG, "unset allmulti for port %u", slave_id); + rte_eth_allmulticast_disable(slave_id); + break; + + case BOND_8023AD_FORCED_PROMISC: + RTE_BOND_LOG(DEBUG, "unset promisc for port %u", slave_id); + rte_eth_promiscuous_disable(slave_id); + break; + + default: + break; + } +} + void bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint16_t slave_id) @@ -948,7 +991,11 @@ /* use this port as agregator */ port->aggregator_port_id = slave_id; - rte_eth_promiscuous_enable(slave_id); + + if (bond_mode_8023ad_register_lacp_mac(slave_id) < 0) { + RTE_BOND_LOG(WARNING, "slave %u is most likely broken and won't receive LACP packets", + slave_id); + } timer_cancel(&port->warning_timer); @@ -1022,6 +1069,8 @@ old_partner_state = port->partner_state; record_default(port); + bond_mode_8023ad_unregister_lacp_mac(slave_id); + /* If partner timeout state changes then disable timer */ if (!((old_partner_state ^ port->partner_state) & STATE_LACP_SHORT_TIMEOUT)) diff --git a/drivers/net/bonding/rte_eth_bond_8023ad_private.h b/drivers/net/bonding/rte_eth_bond_8023ad_private.h index f91902e..edda841 100644 --- a/drivers/net/bonding/rte_eth_bond_8023ad_private.h +++ b/drivers/net/bonding/rte_eth_bond_8023ad_private.h @@ -109,6 +109,13 @@ struct port { uint16_t sm_flags; enum rte_bond_8023ad_selection selected; + /** Indicates if either allmulti or promisc has been enforced on the + * slave so that we can receive lacp packets + */ +#define BOND_8023AD_FORCED_ALLMULTI (1 << 0) +#define BOND_8023AD_FORCED_PROMISC (1 << 1) + uint8_t forced_rx_flags; + uint64_t current_while_timer; uint64_t periodic_timer; uint64_t wait_while_timer; diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c index 9ef0717..07e19c4 100644 --- a/drivers/net/bonding/rte_eth_bond_pmd.c +++ b/drivers/net/bonding/rte_eth_bond_pmd.c @@ -274,7 +274,8 @@ uint16_t slave_count, idx; uint8_t collecting; /* current slave collecting status */ - const uint8_t promisc = internals->promiscuous_en; + const uint8_t promisc = rte_eth_promiscuous_get(internals->port_id); + const uint8_t allmulti = rte_eth_allmulticast_get(internals->port_id); uint8_t subtype; uint16_t i; uint16_t j; @@ -314,8 +315,10 @@ /* Remove packet from array if: * - it is slow packet but no dedicated rxq is present, * - slave is not in collecting state, - * - bonding interface is not in promiscuous mode and - * packet is not multicast and address does not match, + * - bonding interface is not in promiscuous mode: + * - packet is unicast and address does not match, + * - packet is multicast and bonding interface + * is not in allmulti, */ if (unlikely( (!dedicated_rxq && @@ -323,9 +326,11 @@ bufs[j])) || !collecting || (!promisc && - !is_multicast_ether_addr(&hdr->d_addr) && - !is_same_ether_addr(bond_mac, - &hdr->d_addr)))) { + ((is_unicast_ether_addr(&hdr->d_addr) && + !is_same_ether_addr(bond_mac, + &hdr->d_addr)) || + (!allmulti && + is_multicast_ether_addr(&hdr->d_addr)))))) { if (hdr->ether_type == ether_type_slow_be) { bond_mode_8023ad_handle_slow_pkt( @@ -1924,10 +1929,6 @@ struct bwg_slave { } } - /* If bonded device is configure in promiscuous mode then re-apply config */ - if (internals->promiscuous_en) - bond_ethdev_promiscuous_enable(eth_dev); - if (internals->mode == BONDING_MODE_8023AD) { if (internals->mode4.dedicated_queues.enabled == 1) { internals->mode4.dedicated_queues.rx_qid = @@ -2445,18 +2446,17 @@ struct bwg_slave { struct bond_dev_private *internals = eth_dev->data->dev_private; int i; - internals->promiscuous_en = 1; - switch (internals->mode) { /* Promiscuous mode is propagated to all slaves */ case BONDING_MODE_ROUND_ROBIN: case BONDING_MODE_BALANCE: case BONDING_MODE_BROADCAST: - for (i = 0; i < internals->slave_count; i++) - rte_eth_promiscuous_enable(internals->slaves[i].port_id); - break; - /* In mode4 promiscus mode is managed when slave is added/removed */ case BONDING_MODE_8023AD: + for (i = 0; i < internals->slave_count; i++) { + uint16_t port_id = internals->slaves[i].port_id; + + rte_eth_promiscuous_enable(port_id); + } break; /* Promiscuous mode is propagated only to primary slave */ case BONDING_MODE_ACTIVE_BACKUP: @@ -2476,18 +2476,21 @@ struct bwg_slave { struct bond_dev_private *internals = dev->data->dev_private; int i; - internals->promiscuous_en = 0; - switch (internals->mode) { /* Promiscuous mode is propagated to all slaves */ case BONDING_MODE_ROUND_ROBIN: case BONDING_MODE_BALANCE: case BONDING_MODE_BROADCAST: - for (i = 0; i < internals->slave_count; i++) - rte_eth_promiscuous_disable(internals->slaves[i].port_id); - break; - /* In mode4 promiscus mode is set managed when slave is added/removed */ case BONDING_MODE_8023AD: + for (i = 0; i < internals->slave_count; i++) { + uint16_t port_id = internals->slaves[i].port_id; + + if (internals->mode == BONDING_MODE_8023AD && + bond_mode_8023ad_ports[port_id].forced_rx_flags == + BOND_8023AD_FORCED_PROMISC) + continue; + rte_eth_promiscuous_disable(port_id); + } break; /* Promiscuous mode is propagated only to primary slave */ case BONDING_MODE_ACTIVE_BACKUP: @@ -2502,6 +2505,70 @@ struct bwg_slave { } static void +bond_ethdev_allmulticast_enable(struct rte_eth_dev *eth_dev) +{ + struct bond_dev_private *internals = eth_dev->data->dev_private; + int i; + + switch (internals->mode) { + /* allmulti mode is propagated to all slaves */ + case BONDING_MODE_ROUND_ROBIN: + case BONDING_MODE_BALANCE: + case BONDING_MODE_BROADCAST: + case BONDING_MODE_8023AD: + for (i = 0; i < internals->slave_count; i++) { + uint16_t port_id = internals->slaves[i].port_id; + + rte_eth_allmulticast_enable(port_id); + } + break; + /* allmulti mode is propagated only to primary slave */ + case BONDING_MODE_ACTIVE_BACKUP: + case BONDING_MODE_TLB: + case BONDING_MODE_ALB: + default: + /* Do not touch allmulti when there cannot be primary ports */ + if (internals->slave_count == 0) + break; + rte_eth_allmulticast_enable(internals->current_primary_port); + } +} + +static void +bond_ethdev_allmulticast_disable(struct rte_eth_dev *eth_dev) +{ + struct bond_dev_private *internals = eth_dev->data->dev_private; + int i; + + switch (internals->mode) { + /* allmulti mode is propagated to all slaves */ + case BONDING_MODE_ROUND_ROBIN: + case BONDING_MODE_BALANCE: + case BONDING_MODE_BROADCAST: + case BONDING_MODE_8023AD: + for (i = 0; i < internals->slave_count; i++) { + uint16_t port_id = internals->slaves[i].port_id; + + if (internals->mode == BONDING_MODE_8023AD && + bond_mode_8023ad_ports[port_id].forced_rx_flags == + BOND_8023AD_FORCED_ALLMULTI) + continue; + rte_eth_allmulticast_disable(port_id); + } + break; + /* allmulti mode is propagated only to primary slave */ + case BONDING_MODE_ACTIVE_BACKUP: + case BONDING_MODE_TLB: + case BONDING_MODE_ALB: + default: + /* Do not touch allmulti when there cannot be primary ports */ + if (internals->slave_count == 0) + break; + rte_eth_allmulticast_disable(internals->current_primary_port); + } +} + +static void bond_ethdev_delayed_lsc_propagation(void *arg) { if (arg == NULL) @@ -2893,6 +2960,8 @@ struct bwg_slave { .stats_reset = bond_ethdev_stats_reset, .promiscuous_enable = bond_ethdev_promiscuous_enable, .promiscuous_disable = bond_ethdev_promiscuous_disable, + .allmulticast_enable = bond_ethdev_allmulticast_enable, + .allmulticast_disable = bond_ethdev_allmulticast_disable, .reta_update = bond_ethdev_rss_reta_update, .reta_query = bond_ethdev_rss_reta_query, .rss_hash_update = bond_ethdev_rss_hash_update, diff --git a/drivers/net/bonding/rte_eth_bond_private.h b/drivers/net/bonding/rte_eth_bond_private.h index 8afef39..4154e17 100644 --- a/drivers/net/bonding/rte_eth_bond_private.h +++ b/drivers/net/bonding/rte_eth_bond_private.h @@ -122,9 +122,6 @@ struct bond_dev_private { uint8_t user_defined_mac; /**< Flag for whether MAC address is user defined or not */ - uint8_t promiscuous_en; - /**< Enabled/disable promiscuous mode on bonding device */ - uint8_t link_status_polling_enabled; uint32_t link_status_polling_interval_ms;