[1/1] net/sfc: stop misuse of Rx ingress m-port metadata on EF100

Message ID 20230309041101.8321-1-ivan.malov@arknetworks.am (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [1/1] net/sfc: stop misuse of Rx ingress m-port metadata on EF100 |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed

Commit Message

Ivan Malov March 9, 2023, 4:11 a.m. UTC
  The driver supports representor functionality. In it,
packets coming from VFs to the dedicated back-end Rx
queue get demultiplexed into front-end Rx queues of
representor ethdevs as per the per-packet metadata
indicating logical HW ingress ports. On transmit,
packets are provided with symmetrical metadata
by front-end Tx queues, and the back-end queue
transforms the data into so-called Tx override
descriptors. These let the packets bypass flow
lookup and go directly to the represented VFs.

However, in the Rx part, the driver extracts
the said metadata on every HW Rx queue, that
is, not just on the one used by representors
Doing so leads to a buggy behaviour. It is
revealed by operating testpmd as follows:

dpdk-testpmd -a 0000:c6:00.0 -a 0000:c6:00.1 -- -i

testpmd> flow create 0 transfer pattern port_representor \
 port_id is 0 / end actions port_representor port_id 1 / end
Flow rule #0 created

testpmd> set fwd io
testpmd> start tx_first

testpmd> flow destroy 0 rule 0
Flow rule #0 destroyed

testpmd> stop

  ---------------------- Forward statistics for port 0  -----------------
  RX-packets: 19196498       RX-dropped: 0             RX-total: 19196498
  TX-packets: 19196535       TX-dropped: 0             TX-total: 19196535
  -----------------------------------------------------------------------

  ---------------------- Forward statistics for port 1  -----------------
  RX-packets: 19196503       RX-dropped: 0             RX-total: 19196503
  TX-packets: 19196530       TX-dropped: 0             TX-total: 19196530
  -----------------------------------------------------------------------

In this scenario, two physical functions of the adapter
do not have any corresponding "back-to-back" forwarder
on peer host. Packets transmitted from port 0 can only
be forwarded to port 1 by means of a special flow rule.

The flow rule indeed works, but destroying it does not
stop forwarding. Port statistics carry on incrementing.

Also, it is apparent that forwarding in the opposite
direction must not have worked in this case as the
flow is meant to target only one of the directions.

Because of the bug, the first 32 mbufs received
as a result of the flow rule operation have the
said metadata present. In io mode, testpmd does
not tamper with mbufs and passes them directly
to transmit path, so this data remains in them
instructing the PMD to override destinations
of the packets via Tx option descriptors.

Expected behaviour is as follows:

  ---------------------- Forward statistics for port 0  -----------------
  RX-packets: 0              RX-dropped: 0             RX-total: 0
  TX-packets: 15787496       TX-dropped: 0             TX-total: 15787496
  -----------------------------------------------------------------------

  ---------------------- Forward statistics for port 1  -----------------
  RX-packets: 15787464       RX-dropped: 0             RX-total: 15787464
  TX-packets: 32             TX-dropped: 0             TX-total: 32
  -----------------------------------------------------------------------

These figures show the rule work only for one direction.
Also, removing the flow shall cause forwarding to cease.

Provided patch fixes the bug accordingly.

Fixes: d0f981a3efd8 ("net/sfc: handle ingress mport in EF100 Rx prefix")
Cc: stable@dpdk.org

Signed-off-by: Ivan Malov <ivan.malov@arknetworks.am>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
---
 drivers/net/sfc/sfc_dp_rx.h    | 3 +++
 drivers/net/sfc/sfc_ef100_rx.c | 3 ++-
 drivers/net/sfc/sfc_rx.c       | 3 +++
 3 files changed, 8 insertions(+), 1 deletion(-)
  

Comments

Ferruh Yigit March 10, 2023, 2:01 p.m. UTC | #1
On 3/9/2023 4:11 AM, Ivan Malov wrote:
> The driver supports representor functionality. In it,
> packets coming from VFs to the dedicated back-end Rx
> queue get demultiplexed into front-end Rx queues of
> representor ethdevs as per the per-packet metadata
> indicating logical HW ingress ports. On transmit,
> packets are provided with symmetrical metadata
> by front-end Tx queues, and the back-end queue
> transforms the data into so-called Tx override
> descriptors. These let the packets bypass flow
> lookup and go directly to the represented VFs.
> 
> However, in the Rx part, the driver extracts
> the said metadata on every HW Rx queue, that
> is, not just on the one used by representors
> Doing so leads to a buggy behaviour. It is
> revealed by operating testpmd as follows:
> 
> dpdk-testpmd -a 0000:c6:00.0 -a 0000:c6:00.1 -- -i
> 
> testpmd> flow create 0 transfer pattern port_representor \
>  port_id is 0 / end actions port_representor port_id 1 / end
> Flow rule #0 created
> 
> testpmd> set fwd io
> testpmd> start tx_first
> 
> testpmd> flow destroy 0 rule 0
> Flow rule #0 destroyed
> 
> testpmd> stop
> 
>   ---------------------- Forward statistics for port 0  -----------------
>   RX-packets: 19196498       RX-dropped: 0             RX-total: 19196498
>   TX-packets: 19196535       TX-dropped: 0             TX-total: 19196535
>   -----------------------------------------------------------------------
> 
>   ---------------------- Forward statistics for port 1  -----------------
>   RX-packets: 19196503       RX-dropped: 0             RX-total: 19196503
>   TX-packets: 19196530       TX-dropped: 0             TX-total: 19196530
>   -----------------------------------------------------------------------
> 
> In this scenario, two physical functions of the adapter
> do not have any corresponding "back-to-back" forwarder
> on peer host. Packets transmitted from port 0 can only
> be forwarded to port 1 by means of a special flow rule.
> 
> The flow rule indeed works, but destroying it does not
> stop forwarding. Port statistics carry on incrementing.
> 
> Also, it is apparent that forwarding in the opposite
> direction must not have worked in this case as the
> flow is meant to target only one of the directions.
> 
> Because of the bug, the first 32 mbufs received
> as a result of the flow rule operation have the
> said metadata present. In io mode, testpmd does
> not tamper with mbufs and passes them directly
> to transmit path, so this data remains in them
> instructing the PMD to override destinations
> of the packets via Tx option descriptors.
> 
> Expected behaviour is as follows:
> 
>   ---------------------- Forward statistics for port 0  -----------------
>   RX-packets: 0              RX-dropped: 0             RX-total: 0
>   TX-packets: 15787496       TX-dropped: 0             TX-total: 15787496
>   -----------------------------------------------------------------------
> 
>   ---------------------- Forward statistics for port 1  -----------------
>   RX-packets: 15787464       RX-dropped: 0             RX-total: 15787464
>   TX-packets: 32             TX-dropped: 0             TX-total: 32
>   -----------------------------------------------------------------------
> 
> These figures show the rule work only for one direction.
> Also, removing the flow shall cause forwarding to cease.
> 
> Provided patch fixes the bug accordingly.
> 
> Fixes: d0f981a3efd8 ("net/sfc: handle ingress mport in EF100 Rx prefix")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ivan Malov <ivan.malov@arknetworks.am>
> Reviewed-by: Andy Moreton <amoreton@xilinx.com>

Applied to dpdk-next-net/main, thanks.
  
Andrew Rybchenko March 10, 2023, 2:18 p.m. UTC | #2
On 3/9/23 07:11, Ivan Malov wrote:
> The driver supports representor functionality. In it,
> packets coming from VFs to the dedicated back-end Rx
> queue get demultiplexed into front-end Rx queues of
> representor ethdevs as per the per-packet metadata
> indicating logical HW ingress ports. On transmit,
> packets are provided with symmetrical metadata
> by front-end Tx queues, and the back-end queue
> transforms the data into so-called Tx override
> descriptors. These let the packets bypass flow
> lookup and go directly to the represented VFs.
> 
> However, in the Rx part, the driver extracts
> the said metadata on every HW Rx queue, that
> is, not just on the one used by representors
> Doing so leads to a buggy behaviour. It is
> revealed by operating testpmd as follows:
> 
> dpdk-testpmd -a 0000:c6:00.0 -a 0000:c6:00.1 -- -i
> 
> testpmd> flow create 0 transfer pattern port_representor \
>   port_id is 0 / end actions port_representor port_id 1 / end
> Flow rule #0 created
> 
> testpmd> set fwd io
> testpmd> start tx_first
> 
> testpmd> flow destroy 0 rule 0
> Flow rule #0 destroyed
> 
> testpmd> stop
> 
>    ---------------------- Forward statistics for port 0  -----------------
>    RX-packets: 19196498       RX-dropped: 0             RX-total: 19196498
>    TX-packets: 19196535       TX-dropped: 0             TX-total: 19196535
>    -----------------------------------------------------------------------
> 
>    ---------------------- Forward statistics for port 1  -----------------
>    RX-packets: 19196503       RX-dropped: 0             RX-total: 19196503
>    TX-packets: 19196530       TX-dropped: 0             TX-total: 19196530
>    -----------------------------------------------------------------------
> 
> In this scenario, two physical functions of the adapter
> do not have any corresponding "back-to-back" forwarder
> on peer host. Packets transmitted from port 0 can only
> be forwarded to port 1 by means of a special flow rule.
> 
> The flow rule indeed works, but destroying it does not
> stop forwarding. Port statistics carry on incrementing.
> 
> Also, it is apparent that forwarding in the opposite
> direction must not have worked in this case as the
> flow is meant to target only one of the directions.
> 
> Because of the bug, the first 32 mbufs received
> as a result of the flow rule operation have the
> said metadata present. In io mode, testpmd does
> not tamper with mbufs and passes them directly
> to transmit path, so this data remains in them
> instructing the PMD to override destinations
> of the packets via Tx option descriptors.
> 
> Expected behaviour is as follows:
> 
>    ---------------------- Forward statistics for port 0  -----------------
>    RX-packets: 0              RX-dropped: 0             RX-total: 0
>    TX-packets: 15787496       TX-dropped: 0             TX-total: 15787496
>    -----------------------------------------------------------------------
> 
>    ---------------------- Forward statistics for port 1  -----------------
>    RX-packets: 15787464       RX-dropped: 0             RX-total: 15787464
>    TX-packets: 32             TX-dropped: 0             TX-total: 32
>    -----------------------------------------------------------------------
> 
> These figures show the rule work only for one direction.
> Also, removing the flow shall cause forwarding to cease.
> 
> Provided patch fixes the bug accordingly.
> 
> Fixes: d0f981a3efd8 ("net/sfc: handle ingress mport in EF100 Rx prefix")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ivan Malov <ivan.malov@arknetworks.am>
> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
> ---
>   drivers/net/sfc/sfc_dp_rx.h    | 3 +++
>   drivers/net/sfc/sfc_ef100_rx.c | 3 ++-
>   drivers/net/sfc/sfc_rx.c       | 3 +++
>   3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
> index 246adbd87c..51a44bd034 100644
> --- a/drivers/net/sfc/sfc_dp_rx.h
> +++ b/drivers/net/sfc/sfc_dp_rx.h
> @@ -10,6 +10,8 @@
>   #ifndef _SFC_DP_RX_H
>   #define _SFC_DP_RX_H
>   
> +#include <stdbool.h>
> +
>   #include <rte_mempool.h>
>   #include <ethdev_driver.h>
>   
> @@ -27,6 +29,7 @@ extern "C" {
>    */
>   struct sfc_dp_rxq {
>   	struct sfc_dp_queue	dpq;
> +	bool			need_ingress_mport;

May be I'm late, but it is a wrong location for the
information. The information is EF100-specific, but
the structure is a generic one.

I'd say that corresponding flag should be set in rxq->flag
upon creation based on request in qcreate info and
cleared if Rx prefix does not support it. May be it
should be an error if we need the information, but Rx prefix
does not provide it.

>   };
>   
>   /** Datapath receive queue descriptor number limitations */
> diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
> index 16cd8524d3..c4d256b40d 100644
> --- a/drivers/net/sfc/sfc_ef100_rx.c
> +++ b/drivers/net/sfc/sfc_ef100_rx.c
> @@ -876,7 +876,8 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
>   	else
>   		rxq->flags &= ~SFC_EF100_RXQ_USER_MARK;
>   
> -	if ((unsup_rx_prefix_fields &
> +	if (dp_rxq->need_ingress_mport &&
> +	    (unsup_rx_prefix_fields &
>   	     (1U << EFX_RX_PREFIX_FIELD_INGRESS_MPORT)) == 0)
>   		rxq->flags |= SFC_EF100_RXQ_INGRESS_MPORT;
>   	else
> diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
> index 5ea98187c3..3d3d7d42e3 100644
> --- a/drivers/net/sfc/sfc_rx.c
> +++ b/drivers/net/sfc/sfc_rx.c
> @@ -1265,6 +1265,9 @@ sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
>   	if (rc != 0)
>   		goto fail_dp_rx_qcreate;
>   
> +	rxq_info->dp->need_ingress_mport =
> +		((rxq_info->type_flags & EFX_RXQ_FLAG_INGRESS_MPORT) != 0);
> +
>   	evq->dp_rxq = rxq_info->dp;
>   
>   	rxq_info->state = SFC_RXQ_INITIALIZED;
  
Ferruh Yigit March 10, 2023, 2:43 p.m. UTC | #3
On 3/10/2023 2:18 PM, Andrew Rybchenko wrote:
> On 3/9/23 07:11, Ivan Malov wrote:
>> The driver supports representor functionality. In it,
>> packets coming from VFs to the dedicated back-end Rx
>> queue get demultiplexed into front-end Rx queues of
>> representor ethdevs as per the per-packet metadata
>> indicating logical HW ingress ports. On transmit,
>> packets are provided with symmetrical metadata
>> by front-end Tx queues, and the back-end queue
>> transforms the data into so-called Tx override
>> descriptors. These let the packets bypass flow
>> lookup and go directly to the represented VFs.
>>
>> However, in the Rx part, the driver extracts
>> the said metadata on every HW Rx queue, that
>> is, not just on the one used by representors
>> Doing so leads to a buggy behaviour. It is
>> revealed by operating testpmd as follows:
>>
>> dpdk-testpmd -a 0000:c6:00.0 -a 0000:c6:00.1 -- -i
>>
>> testpmd> flow create 0 transfer pattern port_representor \
>>   port_id is 0 / end actions port_representor port_id 1 / end
>> Flow rule #0 created
>>
>> testpmd> set fwd io
>> testpmd> start tx_first
>>
>> testpmd> flow destroy 0 rule 0
>> Flow rule #0 destroyed
>>
>> testpmd> stop
>>
>>    ---------------------- Forward statistics for port 0 
>> -----------------
>>    RX-packets: 19196498       RX-dropped: 0             RX-total:
>> 19196498
>>    TX-packets: 19196535       TX-dropped: 0             TX-total:
>> 19196535
>>   
>> -----------------------------------------------------------------------
>>
>>    ---------------------- Forward statistics for port 1 
>> -----------------
>>    RX-packets: 19196503       RX-dropped: 0             RX-total:
>> 19196503
>>    TX-packets: 19196530       TX-dropped: 0             TX-total:
>> 19196530
>>   
>> -----------------------------------------------------------------------
>>
>> In this scenario, two physical functions of the adapter
>> do not have any corresponding "back-to-back" forwarder
>> on peer host. Packets transmitted from port 0 can only
>> be forwarded to port 1 by means of a special flow rule.
>>
>> The flow rule indeed works, but destroying it does not
>> stop forwarding. Port statistics carry on incrementing.
>>
>> Also, it is apparent that forwarding in the opposite
>> direction must not have worked in this case as the
>> flow is meant to target only one of the directions.
>>
>> Because of the bug, the first 32 mbufs received
>> as a result of the flow rule operation have the
>> said metadata present. In io mode, testpmd does
>> not tamper with mbufs and passes them directly
>> to transmit path, so this data remains in them
>> instructing the PMD to override destinations
>> of the packets via Tx option descriptors.
>>
>> Expected behaviour is as follows:
>>
>>    ---------------------- Forward statistics for port 0 
>> -----------------
>>    RX-packets: 0              RX-dropped: 0             RX-total: 0
>>    TX-packets: 15787496       TX-dropped: 0             TX-total:
>> 15787496
>>   
>> -----------------------------------------------------------------------
>>
>>    ---------------------- Forward statistics for port 1 
>> -----------------
>>    RX-packets: 15787464       RX-dropped: 0             RX-total:
>> 15787464
>>    TX-packets: 32             TX-dropped: 0             TX-total: 32
>>   
>> -----------------------------------------------------------------------
>>
>> These figures show the rule work only for one direction.
>> Also, removing the flow shall cause forwarding to cease.
>>
>> Provided patch fixes the bug accordingly.
>>
>> Fixes: d0f981a3efd8 ("net/sfc: handle ingress mport in EF100 Rx prefix")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Ivan Malov <ivan.malov@arknetworks.am>
>> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
>> ---
>>   drivers/net/sfc/sfc_dp_rx.h    | 3 +++
>>   drivers/net/sfc/sfc_ef100_rx.c | 3 ++-
>>   drivers/net/sfc/sfc_rx.c       | 3 +++
>>   3 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
>> index 246adbd87c..51a44bd034 100644
>> --- a/drivers/net/sfc/sfc_dp_rx.h
>> +++ b/drivers/net/sfc/sfc_dp_rx.h
>> @@ -10,6 +10,8 @@
>>   #ifndef _SFC_DP_RX_H
>>   #define _SFC_DP_RX_H
>>   +#include <stdbool.h>
>> +
>>   #include <rte_mempool.h>
>>   #include <ethdev_driver.h>
>>   @@ -27,6 +29,7 @@ extern "C" {
>>    */
>>   struct sfc_dp_rxq {
>>       struct sfc_dp_queue    dpq;
>> +    bool            need_ingress_mport;
> 
> May be I'm late, but it is a wrong location for the
> information. The information is EF100-specific, but
> the structure is a generic one.
> 
> I'd say that corresponding flag should be set in rxq->flag
> upon creation based on request in qcreate info and
> cleared if Rx prefix does not support it. May be it
> should be an error if we need the information, but Rx prefix
> does not provide it.
> 

Thanks Andrew, patch dropped from next-net, updating patchwork status as
"change requested".

>>   };
>>     /** Datapath receive queue descriptor number limitations */
>> diff --git a/drivers/net/sfc/sfc_ef100_rx.c
>> b/drivers/net/sfc/sfc_ef100_rx.c
>> index 16cd8524d3..c4d256b40d 100644
>> --- a/drivers/net/sfc/sfc_ef100_rx.c
>> +++ b/drivers/net/sfc/sfc_ef100_rx.c
>> @@ -876,7 +876,8 @@ sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq,
>> unsigned int evq_read_ptr,
>>       else
>>           rxq->flags &= ~SFC_EF100_RXQ_USER_MARK;
>>   -    if ((unsup_rx_prefix_fields &
>> +    if (dp_rxq->need_ingress_mport &&
>> +        (unsup_rx_prefix_fields &
>>            (1U << EFX_RX_PREFIX_FIELD_INGRESS_MPORT)) == 0)
>>           rxq->flags |= SFC_EF100_RXQ_INGRESS_MPORT;
>>       else
>> diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
>> index 5ea98187c3..3d3d7d42e3 100644
>> --- a/drivers/net/sfc/sfc_rx.c
>> +++ b/drivers/net/sfc/sfc_rx.c
>> @@ -1265,6 +1265,9 @@ sfc_rx_qinit(struct sfc_adapter *sa,
>> sfc_sw_index_t sw_index,
>>       if (rc != 0)
>>           goto fail_dp_rx_qcreate;
>>   +    rxq_info->dp->need_ingress_mport =
>> +        ((rxq_info->type_flags & EFX_RXQ_FLAG_INGRESS_MPORT) != 0);
>> +
>>       evq->dp_rxq = rxq_info->dp;
>>         rxq_info->state = SFC_RXQ_INITIALIZED;
>
  

Patch

diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index 246adbd87c..51a44bd034 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -10,6 +10,8 @@ 
 #ifndef _SFC_DP_RX_H
 #define _SFC_DP_RX_H
 
+#include <stdbool.h>
+
 #include <rte_mempool.h>
 #include <ethdev_driver.h>
 
@@ -27,6 +29,7 @@  extern "C" {
  */
 struct sfc_dp_rxq {
 	struct sfc_dp_queue	dpq;
+	bool			need_ingress_mport;
 };
 
 /** Datapath receive queue descriptor number limitations */
diff --git a/drivers/net/sfc/sfc_ef100_rx.c b/drivers/net/sfc/sfc_ef100_rx.c
index 16cd8524d3..c4d256b40d 100644
--- a/drivers/net/sfc/sfc_ef100_rx.c
+++ b/drivers/net/sfc/sfc_ef100_rx.c
@@ -876,7 +876,8 @@  sfc_ef100_rx_qstart(struct sfc_dp_rxq *dp_rxq, unsigned int evq_read_ptr,
 	else
 		rxq->flags &= ~SFC_EF100_RXQ_USER_MARK;
 
-	if ((unsup_rx_prefix_fields &
+	if (dp_rxq->need_ingress_mport &&
+	    (unsup_rx_prefix_fields &
 	     (1U << EFX_RX_PREFIX_FIELD_INGRESS_MPORT)) == 0)
 		rxq->flags |= SFC_EF100_RXQ_INGRESS_MPORT;
 	else
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 5ea98187c3..3d3d7d42e3 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -1265,6 +1265,9 @@  sfc_rx_qinit(struct sfc_adapter *sa, sfc_sw_index_t sw_index,
 	if (rc != 0)
 		goto fail_dp_rx_qcreate;
 
+	rxq_info->dp->need_ingress_mport =
+		((rxq_info->type_flags & EFX_RXQ_FLAG_INGRESS_MPORT) != 0);
+
 	evq->dp_rxq = rxq_info->dp;
 
 	rxq_info->state = SFC_RXQ_INITIALIZED;