[v5,3/3] net/ixgbe: implement recycle buffer mode

Message ID 20230330062939.1206267-4-feifei.wang2@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series Recycle buffers from Tx to Rx

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation fail ninja build failure
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/github-robot: build fail github build: failed
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-unit-testing success Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-abi-testing warning Testing issues

Commit Message

Feifei Wang March 30, 2023, 6:29 a.m. UTC
  Define the driver-specific recycle buffer functions for ixgbe.
Currently, recycle buffer mode supports the 128-bit vector path,
and it can be enabled in both fast-free and no-fast-free mode.
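
For context, a minimal usage sketch follows. The ethdev-level wrapper
names are assumptions based on patch 1/3 of this series; only the
driver callbacks are defined in this patch.

    /* Sketch only: run-to-completion forwarding loop that recycles
     * freed Tx mbufs straight into the Rx descriptor ring.
     * rte_eth_rxq_buf_recycle_info_get() and rte_eth_dev_buf_recycle()
     * are assumed ethdev wrappers around the driver callbacks below.
     */
    uint16_t rx_port = 0, rx_queue = 0, tx_port = 1, tx_queue = 0;
    struct rte_eth_rxq_buf_recycle_info info;
    struct rte_mbuf *pkts[32];
    uint16_t nb_rx, nb_tx, i;

    /* Query the Rx queue's recycle info once at setup time. */
    rte_eth_rxq_buf_recycle_info_get(rx_port, rx_queue, &info);

    for (;;) {
        /* Stash freed Tx mbufs into the Rx buffer ring, then refill
         * the Rx descriptors they now back.
         */
        rte_eth_dev_buf_recycle(rx_port, rx_queue, tx_port, tx_queue,
                                &info);

        nb_rx = rte_eth_rx_burst(rx_port, rx_queue, pkts, 32);
        nb_tx = rte_eth_tx_burst(tx_port, tx_queue, pkts, nb_rx);
        for (i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(pkts[i]); /* drop what Tx didn't take */
    }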

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.h |   3 +
 drivers/net/ixgbe/ixgbe_rxtx.c   | 153 +++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.h   |   4 +
 4 files changed, 161 insertions(+)
  

Comments

Ferruh Yigit April 19, 2023, 2:46 p.m. UTC | #1
On 3/30/2023 7:29 AM, Feifei Wang wrote:
> Define the driver-specific recycle buffer functions for ixgbe.
> Currently, recycle buffer mode supports the 128-bit vector path,
> and it can be enabled in both fast-free and no-fast-free mode.
> 
> Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.h |   3 +
>  drivers/net/ixgbe/ixgbe_rxtx.c   | 153 +++++++++++++++++++++++++++++++
>  drivers/net/ixgbe/ixgbe_rxtx.h   |   4 +
>  4 files changed, 161 insertions(+)
> 

What do you think about extracting the buf_recycle related code in
drivers into its own file? This may make the code's maintainership
easier to manage.

<...>

> +uint16_t
> +ixgbe_tx_buf_stash_vec(void *tx_queue,
> +		struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info)
> +{
> +	struct ixgbe_tx_queue *txq = tx_queue;
> +	struct ixgbe_tx_entry *txep;
> +	struct rte_mbuf **rxep;
> +	struct rte_mbuf *m[RTE_IXGBE_TX_MAX_FREE_BUF_SZ];
> +	int i, j, n;
> +	uint32_t status;
> +	uint16_t avail = 0;
> +	uint16_t buf_ring_size = rxq_buf_recycle_info->buf_ring_size;
> +	uint16_t mask = rxq_buf_recycle_info->buf_ring_size - 1;
> +	uint16_t refill_request = rxq_buf_recycle_info->refill_request;
> +	uint16_t refill_head = *rxq_buf_recycle_info->refill_head;
> +	uint16_t receive_tail = *rxq_buf_recycle_info->receive_tail;
> +
> +	/* Get available recycling Rx buffers. */
> +	avail = (buf_ring_size - (refill_head - receive_tail)) & mask;
> +
> +	/* Check Tx free thresh and Rx available space. */
> +	if (txq->nb_tx_free > txq->tx_free_thresh || avail <= txq->tx_rs_thresh)
> +		return 0;
> +
> +	/* check DD bits on threshold descriptor */
> +	status = txq->tx_ring[txq->tx_next_dd].wb.status;
> +	if (!(status & IXGBE_ADVTXD_STAT_DD))
> +		return 0;
> +
> +	n = txq->tx_rs_thresh;
> +
> +	/* Buffer recycling does not support Rx buffer ring wraparound.
> +	 * Two cases are handled:
> +	 *
> +	 * Case 1: The refill head of the Rx buffer ring must stay aligned
> +	 * with the buffer ring size. Here the number of Tx buffers to be
> +	 * freed must equal refill_request.
> +	 *
> +	 * Case 2: The refill head of the Rx buffer ring need not be aligned
> +	 * with the buffer ring size. Here the updated refill head must not
> +	 * exceed the Rx buffer ring size.
> +	 */
> +	if (refill_request != n ||
> +		(!refill_request && (refill_head + n > buf_ring_size)))
> +		return 0;
> +
> +	/* First buffer to free from S/W ring is at index
> +	 * tx_next_dd - (tx_rs_thresh-1).
> +	 */
> +	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
> +	rxep = rxq_buf_recycle_info->buf_ring;
> +	rxep += refill_head;
> +
> +	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
> +		/* Directly put mbufs from Tx to Rx. */
> +		for (i = 0; i < n; i++, rxep++, txep++)
> +			*rxep = txep[0].mbuf;
> +	} else {
> +		for (i = 0, j = 0; i < n; i++) {
> +			/* Avoid txq contains buffers from expected mempoo. */

mempool (unless trying to introduce a new concept :)
  
Feifei Wang April 26, 2023, 7:36 a.m. UTC | #2
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, April 19, 2023 10:47 PM
> To: Feifei Wang <Feifei.Wang2@arm.com>; Qiming Yang
> <qiming.yang@intel.com>; Wenjun Wu <wenjun1.wu@intel.com>
> Cc: dev@dpdk.org; konstantin.v.ananyev@yandex.ru;
> mb@smartsharesystems.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>
> Subject: Re: [PATCH v5 3/3] net/ixgbe: implement recycle buffer mode
> 
> On 3/30/2023 7:29 AM, Feifei Wang wrote:
> > Define the driver-specific recycle buffer functions for ixgbe.
> > Currently, recycle buffer mode supports the 128-bit vector path, and it
> > can be enabled in both fast-free and no-fast-free mode.
> >
> > Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > ---
> >  drivers/net/ixgbe/ixgbe_ethdev.c |   1 +
> >  drivers/net/ixgbe/ixgbe_ethdev.h |   3 +
> >  drivers/net/ixgbe/ixgbe_rxtx.c   | 153
> +++++++++++++++++++++++++++++++
> >  drivers/net/ixgbe/ixgbe_rxtx.h   |   4 +
> >  4 files changed, 161 insertions(+)
> >
> 
> What do you think about extracting the buf_recycle related code in drivers
> into its own file? This may make the code's maintainership easier to manage.
Good comment, this will make the code cleaner and easier to maintain.
> 
> <...>
> 
> > +uint16_t
> > +ixgbe_tx_buf_stash_vec(void *tx_queue,
> > +		struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info) {
> > +	struct ixgbe_tx_queue *txq = tx_queue;
> > +	struct ixgbe_tx_entry *txep;
> > +	struct rte_mbuf **rxep;
> > +	struct rte_mbuf *m[RTE_IXGBE_TX_MAX_FREE_BUF_SZ];
> > +	int i, j, n;
> > +	uint32_t status;
> > +	uint16_t avail = 0;
> > +	uint16_t buf_ring_size = rxq_buf_recycle_info->buf_ring_size;
> > +	uint16_t mask = rxq_buf_recycle_info->buf_ring_size - 1;
> > +	uint16_t refill_request = rxq_buf_recycle_info->refill_request;
> > +	uint16_t refill_head = *rxq_buf_recycle_info->refill_head;
> > +	uint16_t receive_tail = *rxq_buf_recycle_info->receive_tail;
> > +
> > +	/* Get available recycling Rx buffers. */
> > +	avail = (buf_ring_size - (refill_head - receive_tail)) & mask;
> > +
> > +	/* Check Tx free thresh and Rx available space. */
> > +	if (txq->nb_tx_free > txq->tx_free_thresh || avail <= txq->tx_rs_thresh)
> > +		return 0;
> > +
> > +	/* check DD bits on threshold descriptor */
> > +	status = txq->tx_ring[txq->tx_next_dd].wb.status;
> > +	if (!(status & IXGBE_ADVTXD_STAT_DD))
> > +		return 0;
> > +
> > +	n = txq->tx_rs_thresh;
> > +
> > +	/* Buffer recycling does not support Rx buffer ring wraparound.
> > +	 * Two cases are handled:
> > +	 *
> > +	 * Case 1: The refill head of the Rx buffer ring must stay aligned
> > +	 * with the buffer ring size. Here the number of Tx buffers to be
> > +	 * freed must equal refill_request.
> > +	 *
> > +	 * Case 2: The refill head of the Rx buffer ring need not be aligned
> > +	 * with the buffer ring size. Here the updated refill head must not
> > +	 * exceed the Rx buffer ring size.
> > +	 */
> > +	if (refill_request != n ||
> > +		(!refill_request && (refill_head + n > buf_ring_size)))
> > +		return 0;
> > +
> > +	/* First buffer to free from S/W ring is at index
> > +	 * tx_next_dd - (tx_rs_thresh-1).
> > +	 */
> > +	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
> > +	rxep = rxq_buf_recycle_info->buf_ring;
> > +	rxep += refill_head;
> > +
> > +	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
> > +		/* Directly put mbufs from Tx to Rx. */
> > +		for (i = 0; i < n; i++, rxep++, txep++)
> > +			*rxep = txep[0].mbuf;
> > +	} else {
> > +		for (i = 0, j = 0; i < n; i++) {
> > +			/* Avoid txq contains buffers from expected mempoo.
> */
> 
> mempool (unless trying to introduce a new concept :)
Agree.
  

Patch

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 88118bc305..3bada9abbd 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -543,6 +543,7 @@  static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.set_mc_addr_list     = ixgbe_dev_set_mc_addr_list,
 	.rxq_info_get         = ixgbe_rxq_info_get,
 	.txq_info_get         = ixgbe_txq_info_get,
+	.rxq_buf_recycle_info_get = ixgbe_rxq_buf_recycle_info_get,
 	.timesync_enable      = ixgbe_timesync_enable,
 	.timesync_disable     = ixgbe_timesync_disable,
 	.timesync_read_rx_timestamp = ixgbe_timesync_read_rx_timestamp,
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 48290af512..ca6aa0da64 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -625,6 +625,9 @@  void ixgbe_rxq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 void ixgbe_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 	struct rte_eth_txq_info *qinfo);
 
+void ixgbe_rxq_buf_recycle_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
+		struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info);
+
 int ixgbevf_dev_rx_init(struct rte_eth_dev *dev);
 
 void ixgbevf_dev_tx_init(struct rte_eth_dev *dev);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index c9d6ca9efe..ee27121315 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -953,6 +953,133 @@  ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	return nb_tx;
 }
 
+uint16_t
+ixgbe_tx_buf_stash_vec(void *tx_queue,
+		struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info)
+{
+	struct ixgbe_tx_queue *txq = tx_queue;
+	struct ixgbe_tx_entry *txep;
+	struct rte_mbuf **rxep;
+	struct rte_mbuf *m[RTE_IXGBE_TX_MAX_FREE_BUF_SZ];
+	int i, j, n;
+	uint32_t status;
+	uint16_t avail = 0;
+	uint16_t buf_ring_size = rxq_buf_recycle_info->buf_ring_size;
+	uint16_t mask = rxq_buf_recycle_info->buf_ring_size - 1;
+	uint16_t refill_request = rxq_buf_recycle_info->refill_request;
+	uint16_t refill_head = *rxq_buf_recycle_info->refill_head;
+	uint16_t receive_tail = *rxq_buf_recycle_info->receive_tail;
+
+	/* Get available recycling Rx buffers. */
+	avail = (buf_ring_size - (refill_head - receive_tail)) & mask;
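+	/* Example (illustrative only), assuming a 128-entry ring
+	 * (mask = 127): with refill_head = 120 and receive_tail = 10,
+	 * 110 buffers are still in flight, so avail = (128 - 110) & 127
+	 * = 18 slots can take recycled buffers.
+	 */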
+
+	/* Check Tx free thresh and Rx available space. */
+	if (txq->nb_tx_free > txq->tx_free_thresh || avail <= txq->tx_rs_thresh)
+		return 0;
+
+	/* check DD bits on threshold descriptor */
+	status = txq->tx_ring[txq->tx_next_dd].wb.status;
+	if (!(status & IXGBE_ADVTXD_STAT_DD))
+		return 0;
+
+	n = txq->tx_rs_thresh;
+
+	/* Buffer recycling does not support Rx buffer ring wraparound.
+	 * Two cases are handled:
+	 *
+	 * Case 1: The refill head of the Rx buffer ring must stay aligned
+	 * with the buffer ring size. Here the number of Tx buffers to be
+	 * freed must equal refill_request.
+	 *
+	 * Case 2: The refill head of the Rx buffer ring need not be aligned
+	 * with the buffer ring size. Here the updated refill head must not
+	 * exceed the Rx buffer ring size.
+	 */
+	if (refill_request != n ||
+		(!refill_request && (refill_head + n > buf_ring_size)))
+		return 0;
+
+	/* First buffer to free from S/W ring is at index
+	 * tx_next_dd - (tx_rs_thresh-1).
+	 */
+	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
+	rxep = rxq_buf_recycle_info->buf_ring;
+	rxep += refill_head;
+
+	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
+		/* Directly put mbufs from Tx to Rx. */
+		for (i = 0; i < n; i++, rxep++, txep++)
+			*rxep = txep[0].mbuf;
+	} else {
+		for (i = 0, j = 0; i < n; i++) {
+			/* Avoid txq contains buffers from expected mempoo. */
+			if (unlikely(rxq_buf_recycle_info->mp
+						!= txep[i].mbuf->pool))
+				return 0;
+
+			m[j] = rte_pktmbuf_prefree_seg(txep[i].mbuf);
+
+			/* In case 1, each of Tx buffers should be the
+			 * last reference.
+			 */
+			if (unlikely(m[j] == NULL && refill_request))
+				return 0;
+			/* In case 2, the number of valid Tx free
+			 * buffers should be recorded.
+			 */
+			j++;
+		}
+		rte_memcpy(rxep, m, sizeof(void *) * j);
+	}
+
+	/* Update counters for Tx. */
+	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free + txq->tx_rs_thresh);
+	txq->tx_next_dd = (uint16_t)(txq->tx_next_dd + txq->tx_rs_thresh);
+	if (txq->tx_next_dd >= txq->nb_tx_desc)
+		txq->tx_next_dd = (uint16_t)(txq->tx_rs_thresh - 1);
+
+	return n;
+}
+
+uint16_t
+ixgbe_rx_descriptors_refill_vec(void *rx_queue, uint16_t nb)
+{
+	struct ixgbe_rx_queue *rxq = rx_queue;
+	struct ixgbe_rx_entry *rxep;
+	volatile union ixgbe_adv_rx_desc *rxdp;
+	uint16_t rx_id;
+	uint64_t paddr;
+	uint64_t dma_addr;
+	uint16_t i;
+
+	rxdp = rxq->rx_ring + rxq->rxrearm_start;
+	rxep = &rxq->sw_ring[rxq->rxrearm_start];
+
+	for (i = 0; i < nb; i++) {
+		/* Initialize rxdp descs. */
+		paddr = (rxep[i].mbuf)->buf_iova + RTE_PKTMBUF_HEADROOM;
+		dma_addr = rte_cpu_to_le_64(paddr);
+		/* flush desc with pa dma_addr */
+		rxdp[i].read.hdr_addr = 0;
+		rxdp[i].read.pkt_addr = dma_addr;
+	}
+
+	/* Update the descriptor initializer index */
+	rxq->rxrearm_start += nb;
+	if (rxq->rxrearm_start >= rxq->nb_rx_desc)
+		rxq->rxrearm_start = 0;
+
+	rxq->rxrearm_nb -= nb;
+
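+	/* The tail register must point at the last refilled descriptor,
+	 * i.e. one position behind rxrearm_start (wrapping to the ring
+	 * end when rxrearm_start is 0).
+	 */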
+	rx_id = (uint16_t)((rxq->rxrearm_start == 0) ?
+			(rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1));
+
+	/* Update the tail pointer on the NIC */
+	IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id);
+
+	return nb;
+}
+
 /*********************************************************************
  *
  *  TX prep functions
@@ -2558,6 +2685,7 @@  ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq)
 				(rte_eal_process_type() != RTE_PROC_PRIMARY ||
 					ixgbe_txq_vec_setup(txq) == 0)) {
 			PMD_INIT_LOG(DEBUG, "Vector tx enabled.");
+			dev->tx_buf_stash = ixgbe_tx_buf_stash_vec;
 			dev->tx_pkt_burst = ixgbe_xmit_pkts_vec;
 		} else
 		dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
@@ -4823,6 +4951,7 @@  ixgbe_set_rx_function(struct rte_eth_dev *dev)
 					    "callback (port=%d).",
 				     dev->data->port_id);
 
+			dev->rx_descriptors_refill = ixgbe_rx_descriptors_refill_vec;
 			dev->rx_pkt_burst = ixgbe_recv_scattered_pkts_vec;
 		} else if (adapter->rx_bulk_alloc_allowed) {
 			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
@@ -4852,6 +4981,7 @@  ixgbe_set_rx_function(struct rte_eth_dev *dev)
 			     RTE_IXGBE_DESCS_PER_LOOP,
 			     dev->data->port_id);
 
+		dev->rx_descriptors_refill = ixgbe_rx_descriptors_refill_vec;
 		dev->rx_pkt_burst = ixgbe_recv_pkts_vec;
 	} else if (adapter->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
@@ -5623,6 +5753,29 @@  ixgbe_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 	qinfo->conf.tx_deferred_start = txq->tx_deferred_start;
 }
 
+void
+ixgbe_rxq_buf_recycle_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
+	struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info)
+{
+	struct ixgbe_rx_queue *rxq;
+	struct ixgbe_adapter *adapter = dev->data->dev_private;
+
+	rxq = dev->data->rx_queues[queue_id];
+
+	rxq_buf_recycle_info->buf_ring = (void *)rxq->sw_ring;
+	rxq_buf_recycle_info->mp = rxq->mb_pool;
+	rxq_buf_recycle_info->buf_ring_size = rxq->nb_rx_desc;
+	rxq_buf_recycle_info->receive_tail = &rxq->rx_tail;
+
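+	/* The vector Rx path rearms descriptors in fixed bursts of
+	 * RTE_IXGBE_RXQ_REARM_THRESH, so refill_request must match that
+	 * granularity; the scalar path refills rx_free_thresh
+	 * descriptors at a time.
+	 */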
+	if (adapter->rx_vec_allowed) {
+		rxq_buf_recycle_info->refill_request = RTE_IXGBE_RXQ_REARM_THRESH;
+		rxq_buf_recycle_info->refill_head = &rxq->rxrearm_start;
+	} else {
+		rxq_buf_recycle_info->refill_request = rxq->rx_free_thresh;
+		rxq_buf_recycle_info->refill_head = &rxq->rx_free_trigger;
+	}
+}
+
 /*
  * [VF] Initializes Receive Unit.
  */
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
index 668a5b9814..18f890f91a 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.h
+++ b/drivers/net/ixgbe/ixgbe_rxtx.h
@@ -295,6 +295,10 @@  int ixgbe_dev_tx_done_cleanup(void *tx_queue, uint32_t free_cnt);
 extern const uint32_t ptype_table[IXGBE_PACKET_TYPE_MAX];
 extern const uint32_t ptype_table_tn[IXGBE_PACKET_TYPE_TN_MAX];
 
+uint16_t ixgbe_tx_buf_stash_vec(void *tx_queue,
+		struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info);
+uint16_t ixgbe_rx_descriptors_refill_vec(void *rx_queue, uint16_t nb);
+
 uint16_t ixgbe_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 				    uint16_t nb_pkts);
 int ixgbe_txq_vec_setup(struct ixgbe_tx_queue *txq);