[v3,05/33] net/ena: fix fast mbuf free
Commit Message
From: Shai Brandes <shaibran@amazon.com>
When the application enables the fast mbuf release optimization,
the driver frees 256 TX mbufs in bulk upon reaching the
TX free threshold.
The existing implementation uses rte_mempool_put_bulk to bulk-free
the TX mbufs, but that API supports only direct mbufs.
When the application transmits indirect mbufs, the driver must
also decrement the mbuf reference count and unlink the mbuf segment,
so it should use rte_pktmbuf_free_bulk instead.
Fixes: c339f53823f3 ("net/ena: support fast mbuf free")
Cc: stable@dpdk.org
Signed-off-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
---
doc/guides/rel_notes/release_24_03.rst | 1 +
drivers/net/ena/ena_ethdev.c | 6 ++----
2 files changed, 3 insertions(+), 4 deletions(-)
Comments
On 3/6/2024 12:24 PM, shaibran@amazon.com wrote:
> From: Shai Brandes <shaibran@amazon.com>
>
> In case the application enables fast mbuf release optimization,
> the driver releases 256 TX mbufs in bulk upon reaching the
> TX free threshold.
> The existing implementation utilizes rte_mempool_put_bulk for bulk
> freeing TXs, which exclusively supports direct mbufs.
> In case the application transmits indirect bufs, the driver must
> also decrement the mbuf reference count and unlink the mbuf segment.
> For such case, the driver should employ rte_pktmbuf_free_bulk.
>
Ack.
I wonder if you observe any performance impact from this change, just
for reference if we encounter similar decision in the future.
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, March 8, 2024 7:23 PM
> To: Brandes, Shai <shaibran@amazon.com>
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: RE: [EXTERNAL] [PATCH v3 05/33] net/ena: fix fast mbuf free
>
>
> Ack.
>
> I wonder if you observe any performance impact from this change, just for
> reference if we encounter similar decision in the future.
[Brandes, Shai] We did not see a performance impact in our testing.
The issue was discovered by a new latency application we crafted that uses the bulk free option: it transmitted packets one by one, each copied from a common buffer, and revealed that packets were going missing.
On 3/10/2024 2:58 PM, Brandes, Shai wrote:
ack.
@@ -105,6 +105,7 @@ New Features
* Removed the reporting of `rx_overruns` errors from xstats and instead updated `imissed` stat with its value.
* Added support for sub-optimal configuration notifications from the device.
+ * Restructured fast release of mbufs when RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE optimization is enabled.
* **Updated Atomic Rules' Arkville driver.**
@@ -3122,8 +3122,7 @@ ena_tx_cleanup_mbuf_fast(struct rte_mbuf **mbufs_to_clean,
m_next = mbuf->next;
mbufs_to_clean[mbuf_cnt++] = mbuf;
if (mbuf_cnt == buf_size) {
- rte_mempool_put_bulk(mbufs_to_clean[0]->pool, (void **)mbufs_to_clean,
- (unsigned int)mbuf_cnt);
+ rte_pktmbuf_free_bulk(mbufs_to_clean, mbuf_cnt);
mbuf_cnt = 0;
}
mbuf = m_next;
@@ -3191,8 +3190,7 @@ static int ena_tx_cleanup(void *txp, uint32_t free_pkt_cnt)
}
if (mbuf_cnt != 0)
- rte_mempool_put_bulk(mbufs_to_clean[0]->pool,
- (void **)mbufs_to_clean, mbuf_cnt);
+ rte_pktmbuf_free_bulk(mbufs_to_clean, mbuf_cnt);
/* Notify completion handler that full cleanup was performed */
if (free_pkt_cnt == 0 || total_tx_pkts < cleanup_budget)