net/mana: fix wrong indexing on CQE error when coalescing is used
Checks
Commit Message
From: Long Li <longli@microsoft.com>
On a fatal CQE error when coalescing is used, update the correct index
and allow proceeding to the next CQE.
Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
Signed-off-by: Long Li <longli@microsoft.com>
---
drivers/net/mana/rx.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
Comments
On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
>
> On a fatal CQE error when coalescing is used, update the correct index
> and allow proceeding to the next CQE.
>
> Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
>
Is above fixes commit correct?
Logic for 'CQE_RX_COALESCED_4' is not changed with this commit, in above
commit and in this commit both does breaks the loop.
This commit changes logic for 'CQE_RX_TRUNCATED' and 'default' cases,
which are added with different commits, not the one in fixes line.
"fatal CQE error when coalescing" mentioned in the commit log, to which
switch case does this corresponds to?
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
> drivers/net/mana/rx.c | 18 ++++++++----------
> 1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c
> index cacfd9ae1b..220b372b15 100644
> --- a/drivers/net/mana/rx.c
> +++ b/drivers/net/mana/rx.c
> @@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>
> switch (oob->cqe_hdr.cqe_type) {
> case CQE_RX_OKAY:
> + case CQE_RX_COALESCED_4:
> /* Proceed to process mbuf */
> break;
>
> case CQE_RX_TRUNCATED:
> - DP_LOG(DEBUG, "Drop a truncated packet");
> + default:
> + DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
> + oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
> + oob->cqe_hdr.vendor_err);
> +
> rxq->stats.errors++;
> rte_pktmbuf_free(mbuf);
> - goto drop;
> -
> - case CQE_RX_COALESCED_4:
> - /* Proceed to process mbuf */
> - break;
>
> - default:
> - DP_LOG(ERR, "Unknown RX CQE type %d",
> - oob->cqe_hdr.cqe_type);
> - continue;
> + i++;
> + goto drop;
> }
>
> DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",
> Subject: Re: [PATCH] net/mana: fix wrong indexing on CQE error when coalescing
> is used
>
> On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
> > From: Long Li <longli@microsoft.com>
> >
> > On a fatal CQE error when coalescing is used, update the correct index
> > and allow proceeding to the next CQE.
> >
> > Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
> >
>
> Is above fixes commit correct?
> Logic for 'CQE_RX_COALESCED_4' is not changed with this commit, in above
> commit and in this commit both does breaks the loop.
Yes, the "Fixes" tag is correct. Here CQE_RX_COALESCED_4 is rearranged to make it easier to read, but it doesn't change the behavior.
>
> This commit changes logic for 'CQE_RX_TRUNCATED' and 'default' cases, which
> are added with different commits, not the one in fixes line.
>
> "fatal CQE error when coalescing" mentioned in the commit log, to which switch
> case does this corresponds to?
The previous patch (3409e0f172f6 ) introduced variable "i", an index to completion CQEs. But both 'default' and 'CQE_RX_TRUNCATED' cases don't advance "i", hence not advance to next CQE on error.
The fatal CQE error means the "default" case. On 'CQE_RX_TRUNCATED' the code can recover when all CQEs are read. But on "default", it's a dead loop.
>
>
> > Signed-off-by: Long Li <longli@microsoft.com>
> > ---
> > drivers/net/mana/rx.c | 18 ++++++++----------
> > 1 file changed, 8 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c index
> > cacfd9ae1b..220b372b15 100644
> > --- a/drivers/net/mana/rx.c
> > +++ b/drivers/net/mana/rx.c
> > @@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf
> > **pkts, uint16_t pkts_n)
> >
> > switch (oob->cqe_hdr.cqe_type) {
> > case CQE_RX_OKAY:
> > + case CQE_RX_COALESCED_4:
> > /* Proceed to process mbuf */
> > break;
> >
> > case CQE_RX_TRUNCATED:
> > - DP_LOG(DEBUG, "Drop a truncated packet");
> > + default:
> > + DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
> > + oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
> > + oob->cqe_hdr.vendor_err);
> > +
> > rxq->stats.errors++;
> > rte_pktmbuf_free(mbuf);
> > - goto drop;
> > -
> > - case CQE_RX_COALESCED_4:
> > - /* Proceed to process mbuf */
> > - break;
> >
> > - default:
> > - DP_LOG(ERR, "Unknown RX CQE type %d",
> > - oob->cqe_hdr.cqe_type);
> > - continue;
> > + i++;
> > + goto drop;
> > }
> >
> > DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",
On 7/7/2023 7:01 PM, Long Li wrote:
>> Subject: Re: [PATCH] net/mana: fix wrong indexing on CQE error when coalescing
>> is used
>>
>> On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
>>> From: Long Li <longli@microsoft.com>
>>>
>>> On a fatal CQE error when coalescing is used, update the correct index
>>> and allow proceeding to the next CQE.
>>>
>>> Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing")
>>>
>>
>> Is above fixes commit correct?
>> Logic for 'CQE_RX_COALESCED_4' is not changed with this commit, in above
>> commit and in this commit both does breaks the loop.
>
> Yes, the "Fixes" tag is correct. Here CQE_RX_COALESCED_4 is rearranged to make it easier to read, but it doesn't change the behavior.
>
>>
>> This commit changes logic for 'CQE_RX_TRUNCATED' and 'default' cases, which
>> are added with different commits, not the one in fixes line.
>>
>> "fatal CQE error when coalescing" mentioned in the commit log, to which switch
>> case does this corresponds to?
>
> The previous patch (3409e0f172f6 ) introduced variable "i", an index to completion CQEs. But both 'default' and 'CQE_RX_TRUNCATED' cases don't advance "i", hence not advance to next CQE on error.
>
Right, thanks for clarification, I was checking from CQE_RX_COALESCED_4
perspective.
> The fatal CQE error means the "default" case. On 'CQE_RX_TRUNCATED' the code can recover when all CQEs are read. But on "default", it's a dead loop.
>
ack.
>>
>>
>>> Signed-off-by: Long Li <longli@microsoft.com>
>>> ---
>>> drivers/net/mana/rx.c | 18 ++++++++----------
>>> 1 file changed, 8 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c index
>>> cacfd9ae1b..220b372b15 100644
>>> --- a/drivers/net/mana/rx.c
>>> +++ b/drivers/net/mana/rx.c
>>> @@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf
>>> **pkts, uint16_t pkts_n)
>>>
>>> switch (oob->cqe_hdr.cqe_type) {
>>> case CQE_RX_OKAY:
>>> + case CQE_RX_COALESCED_4:
>>> /* Proceed to process mbuf */
>>> break;
>>>
>>> case CQE_RX_TRUNCATED:
>>> - DP_LOG(DEBUG, "Drop a truncated packet");
>>> + default:
>>> + DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
>>> + oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
>>> + oob->cqe_hdr.vendor_err);
>>> +
>>> rxq->stats.errors++;
>>> rte_pktmbuf_free(mbuf);
>>> - goto drop;
>>> -
>>> - case CQE_RX_COALESCED_4:
>>> - /* Proceed to process mbuf */
>>> - break;
>>>
>>> - default:
>>> - DP_LOG(ERR, "Unknown RX CQE type %d",
>>> - oob->cqe_hdr.cqe_type);
>>> - continue;
>>> + i++;
>>> + goto drop;
>>> }
>>>
>>> DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",
>
On 7/7/2023 1:17 AM, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
>
> On a fatal CQE error when coalescing is used, update the correct index
> and allow proceeding to the next CQE.
>
> Fixes: 3409e0f172f6 ("net/mana: implement Rx CQE coalescing">
> Signed-off-by: Long Li <longli@microsoft.com>
>
Applied to dpdk-next-net/main, thanks.
@@ -416,23 +416,21 @@ mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
switch (oob->cqe_hdr.cqe_type) {
case CQE_RX_OKAY:
+ case CQE_RX_COALESCED_4:
/* Proceed to process mbuf */
break;
case CQE_RX_TRUNCATED:
- DP_LOG(DEBUG, "Drop a truncated packet");
+ default:
+ DP_LOG(ERR, "RX CQE type %d client %d vendor %d",
+ oob->cqe_hdr.cqe_type, oob->cqe_hdr.client_type,
+ oob->cqe_hdr.vendor_err);
+
rxq->stats.errors++;
rte_pktmbuf_free(mbuf);
- goto drop;
-
- case CQE_RX_COALESCED_4:
- /* Proceed to process mbuf */
- break;
- default:
- DP_LOG(ERR, "Unknown RX CQE type %d",
- oob->cqe_hdr.cqe_type);
- continue;
+ i++;
+ goto drop;
}
DP_LOG(DEBUG, "mana_rx_comp_oob type %d rxq %p",