[v3] net/af_packet: fix ignoring full ring on tx

Message ID 1635868044-69456-1-git-send-email-tudor.cornea@gmail.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [v3] net/af_packet: fix ignoring full ring on tx |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS

Commit Message

Tudor Cornea Nov. 2, 2021, 3:47 p.m. UTC
  The poll call can return POLLERR which is ignored, or it can return
POLLOUT, even if there are no free frames in the mmap-ed area.

We can account for both of these cases by re-checking if the next
frame is empty before writing into it.

We have attempted to reproduce this issue with pktgen-dpdk, using the
following configuration.

pktgen -l 1-4 -n 4 --proc-type=primary --no-pci --no-telemetry \
    --no-huge -m 512 \
    --vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192, \
    framecnt=2048,qpairs=1,qdisc_bypass=0 \
    -- \
    -P \
    -T \
    -m "3.0" \
    -f themes/black-yellow.theme

We configure a low tx rate (~ 335 packets / second) and a small
packet size, of about 300 Bytes from the pktgen CLI.

set 0 size 300
set 0 rate 0.008
set 0 burst 1
start 0

After bringing the interface down, and up again, we seem to arrive
in a state in which the tx rate is inconsistent, and does not recover.

ifconfig eth1 down; sleep 7; ifconfig eth1 up

[1] http://code.dpdk.org/pktgen-dpdk/pktgen-20.11.2/source/INSTALL.md

Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>

---
v2:
* Added check for POLLERR
* Used tx_ring_status_available() for checking TP_STATUS_AVAILABLE
---
 drivers/net/af_packet/rte_eth_af_packet.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)
  

Comments

Ferruh Yigit Nov. 2, 2021, 4:47 p.m. UTC | #1
On 11/2/2021 3:47 PM, Tudor Cornea wrote:
> The poll call can return POLLERR which is ignored, or it can return
> POLLOUT, even if there are no free frames in the mmap-ed area.
> 
> We can account for both of these cases by re-checking if the next
> frame is empty before writing into it.
> 
> We have attempted to reproduce this issue with pktgen-dpdk, using the
> following configuration.
> 
> pktgen -l 1-4 -n 4 --proc-type=primary --no-pci --no-telemetry \
>      --no-huge -m 512 \
>      --vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192, \
>      framecnt=2048,qpairs=1,qdisc_bypass=0 \
>      -- \
>      -P \
>      -T \
>      -m "3.0" \
>      -f themes/black-yellow.theme
> 
> We configure a low tx rate (~ 335 packets / second) and a small
> packet size, of about 300 Bytes from the pktgen CLI.
> 
> set 0 size 300
> set 0 rate 0.008
> set 0 burst 1
> start 0
> 
> After bringing the interface down, and up again, we seem to arrive
> in a state in which the tx rate is inconsistent, and does not recover.
> 
> ifconfig eth1 down; sleep 7; ifconfig eth1 up
> 
> [1] http://code.dpdk.org/pktgen-dpdk/pktgen-20.11.2/source/INSTALL.md
> 
> Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
> Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
> 
> ---
> v2:
> * Added check for POLLERR
> * Used tx_ring_status_available() for checking TP_STATUS_AVAILABLE
> ---
>   drivers/net/af_packet/rte_eth_af_packet.c | 26 ++++++++++++++++++++++++--
>   1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
> index 559f5a0..d3d3104 100644
> --- a/drivers/net/af_packet/rte_eth_af_packet.c
> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> @@ -237,8 +237,30 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>   		}
>   
>   		/* point at the next incoming frame */
> -		if (!tx_ring_status_available(ppd->tp_status) &&
> -		    poll(&pfd, 1, -1) < 0)
> +		if (!tx_ring_status_available(ppd->tp_status)) {
> +			if (poll(&pfd, 1, -1) < 0)
> +				break;
> +			if (pfd.revents & POLLERR)
> +				break;
> +		}
> +
> +		/*
> +		 * Poll can return POLLERR if the interface is down
> +		 *

Above comment fits better to above check, or can remove it completely.
Can you send a new version, or I can remove it while merging if you prefer?

Except from above comment,
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

> +		 * It will almost always return POLLOUT, even if there
> +		 * are no extra buffers available
> +		 *
> +		 * This happens, because packet_poll() calls datagram_poll()
> +		 * which checks the space left in the socket buffer and,
> +		 * in the case of packet_mmap, the default socket buffer length
> +		 * doesn't match the requested size for the tx_ring.
> +		 * As such, there is almost always space left in socket buffer,
> +		 * which doesn't seem to be correlated to the requested size
> +		 * for the tx_ring in packet_mmap.
> +		 *
> +		 * This results in poll() returning POLLOUT.
> +		 */
> +		if (!tx_ring_status_available(ppd->tp_status))
>   			break;
>   
>   		/* copy the tx frame data */
>
  

Patch

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 559f5a0..d3d3104 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -237,8 +237,30 @@  eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		/* point at the next incoming frame */
-		if (!tx_ring_status_available(ppd->tp_status) &&
-		    poll(&pfd, 1, -1) < 0)
+		if (!tx_ring_status_available(ppd->tp_status)) {
+			if (poll(&pfd, 1, -1) < 0)
+				break;
+			if (pfd.revents & POLLERR)
+				break;
+		}
+
+		/*
+		 * Poll can return POLLERR if the interface is down
+		 *
+		 * It will almost always return POLLOUT, even if there
+		 * are no extra buffers available
+		 *
+		 * This happens, because packet_poll() calls datagram_poll()
+		 * which checks the space left in the socket buffer and,
+		 * in the case of packet_mmap, the default socket buffer length
+		 * doesn't match the requested size for the tx_ring.
+		 * As such, there is almost always space left in socket buffer,
+		 * which doesn't seem to be correlated to the requested size
+		 * for the tx_ring in packet_mmap.
+		 *
+		 * This results in poll() returning POLLOUT.
+		 */
+		if (!tx_ring_status_available(ppd->tp_status))
 			break;
 
 		/* copy the tx frame data */