[1/3] net/af_xdp: Increase max batch size to 512

Message ID 20210224111852.11947-2-ciara.loftus@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Series AF_XDP Preferred Busy Polling

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Loftus, Ciara Feb. 24, 2021, 11:18 a.m. UTC
  Prior to this the max size was 32 which was unnecessarily
small. Also enforce the max batch size for TX for both
copy and zero copy modes. Prior to this only copy mode
enforced the max size.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
  

Comments

Ferruh Yigit March 1, 2021, 4:31 p.m. UTC | #1
On 2/24/2021 11:18 AM, Ciara Loftus wrote:
> Prior to this the max size was 32 which was unnecessarily
> small. 

Can you please describe the impact? Why was it changed from 32 to 512?
I assume this is to improve performance, but can you please explicitly 
document it in the commit log?

> Also enforce the max batch size for TX for both
> copy and zero copy modes. Prior to this only copy mode
> enforced the max size.
> 

By enforcing this, the PMD silently ignores the user-provided burst value if it is 
more than the PMD-supported MAX. Also there is no way to discover this MAX value 
without checking the code.

Overall, why are these max values required at all? After a quick check I can see 
they are used for some bulk operations, which I assume can be eliminated. What do 
you think?

> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>

<...>
  
Loftus, Ciara March 3, 2021, 3:07 p.m. UTC | #2
> 
> On 2/24/2021 11:18 AM, Ciara Loftus wrote:
> > Prior to this the max size was 32 which was unnecessarily
> > small.
> 
> Can you please describe the impact? Why was it changed from 32 to 512?
> I assume this is to improve performance, but can you please explicitly
> document it in the commit log?

Indeed - improved performance due to bulk operations and fewer ring accesses and syscalls.
The value 512 was arbitrary. I will change this to the default ring size as defined by libbpf (2048) in v2.
Will update the commit log with this info.

> 
> > Also enforce the max batch size for TX for both
> > copy and zero copy modes. Prior to this only copy mode
> > enforced the max size.
> >
> 
> By enforcing this, the PMD silently ignores the user-provided burst value if it is
> more than the PMD-supported MAX. Also there is no way to discover this MAX value
> without checking the code.
> 
> Overall, why are these max values required at all? After a quick check I can see
> they are used for some bulk operations, which I assume can be eliminated. What do
> you think?

We need to size some arrays at compile time with this max value.

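For example, the copy-mode TX path fills a per-burst scratch array of buffer addresses, and that array has to be sized at compile time. A simplified, standalone illustration of the pattern (not the driver code itself; the names and the memcpy stand in for the real ring operations):

#include <stdint.h>
#include <string.h>

#define BATCH_SIZE 512	/* plays the role of ETH_AF_XDP_TX_BATCH_SIZE */

/* Illustrative only: the per-burst scratch array lives on the stack and is
 * sized at compile time, so the burst must be capped before the array is
 * filled. */
static uint16_t
tx_burst(const uint64_t *pkt_addrs, uint16_t nb_pkts)
{
	uint64_t addrs[BATCH_SIZE];	/* compile-time sized scratch array */

	if (nb_pkts > BATCH_SIZE)	/* same effect as the RTE_MIN() cap */
		nb_pkts = BATCH_SIZE;

	/* stand-in for the bulk dequeue / TX ring submission */
	memcpy(addrs, pkt_addrs, nb_pkts * sizeof(addrs[0]));

	return nb_pkts;
}

Removing the compile-time bound would mean either allocating that array on the fast path or dropping the bulk operations, both of which we would like to avoid.
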
Instead of removing the bulk operations, which may impact performance, how about taking an approach where we split batches that are > 2048 into smaller batches and still handle all the packets, instead of discarding those beyond 2048? Something like what's done in ixgbe, for example:
http://code.dpdk.org/dpdk/v21.02/source/drivers/net/ixgbe/ixgbe_rxtx.c#L318
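
Roughly, the wrapper would look like the following. This is only an illustrative sketch with placeholder names (xmit_pkts/xmit_fixed_burst), not the actual driver code:

#include <stdint.h>

#define MAX_BATCH 2048	/* e.g. the default ring size used by libbpf */

struct pkt;	/* stand-in for struct rte_mbuf */

/* Inner burst function, limited to MAX_BATCH packets per call; a stub here,
 * standing in for the existing copy/zero-copy TX paths. */
static uint16_t
xmit_fixed_burst(void *queue, struct pkt **bufs, uint16_t nb_pkts)
{
	(void)queue;
	(void)bufs;
	return nb_pkts;	/* pretend every packet was transmitted */
}

/* ixgbe-style wrapper: split an oversized burst into MAX_BATCH-sized chunks
 * so that no packets are silently dropped. Stop early if the inner function
 * sends fewer packets than requested (e.g. the ring is full). */
static uint16_t
xmit_pkts(void *queue, struct pkt **bufs, uint16_t nb_pkts)
{
	uint16_t nb_tx = 0;

	while (nb_pkts) {
		uint16_t n = nb_pkts < MAX_BATCH ? nb_pkts : MAX_BATCH;
		uint16_t ret = xmit_fixed_burst(queue, &bufs[nb_tx], n);

		nb_tx += ret;
		nb_pkts -= ret;
		if (ret < n)
			break;
	}

	return nb_tx;
}

That way a caller passing a burst larger than the ring size still has all its packets handled (or gets an accurate return count), rather than having the excess silently dropped.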

Thanks,
Ciara

> 
> > Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> 
> <...>
  
Ferruh Yigit March 3, 2021, 3:38 p.m. UTC | #3
On 3/3/2021 3:07 PM, Loftus, Ciara wrote:
>>
>> On 2/24/2021 11:18 AM, Ciara Loftus wrote:
>>> Prior to this the max size was 32 which was unnecessarily
>>> small.
>>
>> Can you please describe the impact? Why was it changed from 32 to 512?
>> I assume this is to improve performance, but can you please explicitly
>> document it in the commit log?
> 
> Indeed - improved performance due to bulk operations and fewer ring accesses and syscalls.
> The value 512 was arbitrary. I will change this to the default ring size as defined by libbpf (2048) in v2.
> Will update the commit log with this info.
> 
>>
>>> Also enforce the max batch size for TX for both
>>> copy and zero copy modes. Prior to this only copy mode
>>> enforced the max size.
>>>
>>
>> By enforcing this, the PMD silently ignores the user-provided burst value if it is
>> more than the PMD-supported MAX. Also there is no way to discover this MAX value
>> without checking the code.
>>
>> Overall, why are these max values required at all? After a quick check I can see
>> they are used for some bulk operations, which I assume can be eliminated. What do
>> you think?
> 
> We need to size some arrays at compile time with this max value.
> 
> Instead of removing the bulk operations, which may impact performance, how about taking an approach where we split batches that are > 2048 into smaller batches and still handle all the packets, instead of discarding those beyond 2048? Something like what's done in ixgbe, for example:
> http://code.dpdk.org/dpdk/v21.02/source/drivers/net/ixgbe/ixgbe_rxtx.c#L318
> 

If there is no reasonable way to eliminate the fixed-size arrays, the above 
suggestion looks good.

Thanks,
ferruh
  

Patch

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index b8d5ad0d91..b51db90204 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -66,8 +66,8 @@  RTE_LOG_REGISTER(af_xdp_logtype, pmd.net.af_xdp, NOTICE);
 #define ETH_AF_XDP_DFLT_START_QUEUE_IDX	0
 #define ETH_AF_XDP_DFLT_QUEUE_COUNT	1
 
-#define ETH_AF_XDP_RX_BATCH_SIZE	32
-#define ETH_AF_XDP_TX_BATCH_SIZE	32
+#define ETH_AF_XDP_RX_BATCH_SIZE	512
+#define ETH_AF_XDP_TX_BATCH_SIZE	512
 
 
 struct xsk_umem_info {
@@ -535,8 +535,6 @@  af_xdp_tx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	uint32_t idx_tx;
 	struct xsk_ring_cons *cq = &txq->pair->cq;
 
-	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
-
 	pull_umem_cq(umem, nb_pkts, cq);
 
 	nb_pkts = rte_ring_dequeue_bulk(umem->buf_ring, addrs,
@@ -580,6 +578,8 @@  af_xdp_tx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 static uint16_t
 eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
+	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
+
 #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
 	return af_xdp_tx_zc(queue, bufs, nb_pkts);
 #else