[0/8] Optimize qede use of rx/tx_entries

Message ID cover.1614938727.git.bnemeth@redhat.com (mailing list archive)

Balazs Nemeth March 5, 2021, 1:13 p.m. UTC
  This patch set optimizes qede_{rx,tx}_entry and introduces
rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
improvement depends on the use-case; in a physical-virtual-physical test
on a ThunderX2 99xx system with two SMT threads used in ovs,
and two cores used in a vm, an improvement of around 2.55% is observed
due to this patch set.

Balazs Nemeth (8):
  net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
  net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
  net/qede: assume txq->sw_tx_ring[idx] is never null in
    qede_free_tx_pkt
  net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
  net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
  net/qede: prefetch txq->hw_cons_ptr
  net/qede: prefetch next packet to free
  net/qede: remove page_offset from struct qede_rx_entry and simplify

 drivers/net/qede/qede_rxtx.c | 148 +++++++++++++++++++----------------
 drivers/net/qede/qede_rxtx.h |  21 +----
 2 files changed, 81 insertions(+), 88 deletions(-)

--
2.29.2
  

Comments

Jerin Jacob March 8, 2021, 6:13 p.m. UTC | #1
On Fri, Mar 5, 2021 at 6:44 PM Balazs Nemeth <bnemeth@redhat.com> wrote:
>
> This patch set optimizes qede_{rx,tx}_entry and introduces
> rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
> improvement depends on the use-case; in a physical-virtual-physical test
> on a ThunderX2 99xx system with two SMT threads used in ovs,
> and two cores used in a vm, an improvement of around 2.55% is observed
> due to this patch set.


Cc: rmody@marvell.com
Cc: shshaikh@marvell.com

Hi Rasesh, Shahed
Could you review this series from Balazs?


>
> Balazs Nemeth (8):
>   net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
>   net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
>   net/qede: assume txq->sw_tx_ring[idx] is never null in
>     qede_free_tx_pkt
>   net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
>   net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
>   net/qede: prefetch txq->hw_cons_ptr
>   net/qede: prefetch next packet to free
>   net/qede: remove page_offset from struct qede_rx_entry and simplify
>
>  drivers/net/qede/qede_rxtx.c | 148 +++++++++++++++++++----------------
>  drivers/net/qede/qede_rxtx.h |  21 +----
>  2 files changed, 81 insertions(+), 88 deletions(-)
>
> --
> 2.29.2
>
  
Igor Russkikh March 10, 2021, 6:43 a.m. UTC | #2
> On Fri, Mar 5, 2021 at 6:44 PM Balazs Nemeth <bnemeth@redhat.com> wrote:
>>
>> This patch set optimizes qede_{rx,tx}_entry and introduces
>> rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
>> improvement depends on the use-case; in a physical-virtual-physical test
>> on a ThunderX2 99xx system with two SMT threads used in ovs,
>> and two cores used in a vm, an improvement of around 2.55% is observed
>> due to this patch set.

Hi Balazs, thanks for posting, quickly reviewed it, looks reasonable.
Doing a detailed review.

> Cc: rmody@marvell.com
> Cc: shshaikh@marvell.com
>
> Hi Rasesh, Shahed
> Could you review this series from Balazs?

Shahed is not with Marvell; adding Devendra from our DPDK support team.

Igor
  
Jerin Jacob March 10, 2021, 7:51 a.m. UTC | #3
On Wed, Mar 10, 2021 at 12:13 PM Igor Russkikh <irusskikh@marvell.com> wrote:
>
>
>
> > On Fri, Mar 5, 2021 at 6:44 PM Balazs Nemeth <bnemeth@redhat.com> wrote:
> >>
> >> This patch set optimizes qede_{rx,tx}_entry and introduces
> >> rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
> >> improvement depends on the use-case; in a physical-virtual-physical test
> >> on a ThunderX2 99xx system with two SMT threads used in ovs,
> >> and two cores used in a vm, an improvement of around 2.55% is observed
> >> due to this patch set.
>
> Hi Balazs, thanks for posting, quickly reviewed it, looks reasonable.
> Doing a detailed review.
>
> > Cc: rmody@marvell.com
> > Cc: shshaikh@marvell.com
> >
> > Hi Rasesh, Shahed
> > Could you review this series from Balazs?
>
> Shahed is not with Marvell; adding Devendra from our DPDK support team.

Could you update the MAINTAINERS to reflect that status?


>
> Igor
  
Igor Russkikh March 10, 2021, 8:17 a.m. UTC | #4
>>> Cc: rmody@marvell.com
>>> Cc: shshaikh@marvell.com
>>>
>>> Hi Rasesh, Shahed
>>> Could you review this series from Balazs?
>>
>> Shahed is not with Marvell; adding Devendra from our DPDK support team.
> 
> Could you update the MAINTAINERS to reflect that status?

Sure, done, just sent!

Igor
  
Jerin Jacob March 20, 2021, 1:16 p.m. UTC | #5
On Wed, Mar 10, 2021 at 12:13 PM Igor Russkikh <irusskikh@marvell.com> wrote:
>
>
>
> > On Fri, Mar 5, 2021 at 6:44 PM Balazs Nemeth <bnemeth@redhat.com> wrote:
> >>
> >> This patch set optimizes qede_{rx,tx}_entry and introduces
> >> rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
> >> improvement depends on the use-case; in a physical-virtual-physical test
> >> on a ThunderX2 99xx system with two SMT threads used in ovs,
> >> and two cores used in a vm, an improvement of around 2.55% is observed
> >> due to this patch set.
>
> Hi Balazs, thanks for posting, quickly reviewed it, looks reasonable.

Waiting for the review/ack to merge this series

> Doing a detailed review.
>
> > Cc: rmody@marvell.com
> > Cc: shshaikh@marvell.com
> >
> > Hi Rasesh, Shahed
> > Could you review this series from Balazs?
>
> Shahed is not with Marvell; adding Devendra from our DPDK support team.
>
> Igor
  
Igor Russkikh March 22, 2021, 5:08 p.m. UTC | #6
On 3/5/2021 2:13 PM, Balazs Nemeth wrote:
> This patch set optimizes qede_{rx,tx}_entry and introduces
> rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
> improvement depends on the use-case; in a physical-virtual-physical test
> on a ThunderX2 99xx system with two SMT threads used in ovs,
> and two cores used in a vm, an improvement of around 2.55% is observed
> due to this patch set.
> 
> Balazs Nemeth (8):
>   net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
>   net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
>   net/qede: assume txq->sw_tx_ring[idx] is never null in
>     qede_free_tx_pkt
>   net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
>   net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
>   net/qede: prefetch txq->hw_cons_ptr
>   net/qede: prefetch next packet to free
>   net/qede: remove page_offset from struct qede_rx_entry and simplify
> 
>  drivers/net/qede/qede_rxtx.c | 148 +++++++++++++++++++----------------
>  drivers/net/qede/qede_rxtx.h |  21 +----
>  2 files changed, 81 insertions(+), 88 deletions(-)

Series reviewed, for the series

Acked-by: Igor Russkikh <irusskikh@marvell.com>

One checkpatch warning I see in patchwork output, probably worth fixing:

ERROR:POINTER_LOCATION: "(foo**)" should be "(foo **)"
#120: FILE: drivers/net/qede/qede_rxtx.c:56:
+	ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);

Thanks
  Igor
  
Jerin Jacob March 24, 2021, 9:18 a.m. UTC | #7
On Mon, Mar 22, 2021 at 10:38 PM Igor Russkikh <irusskikh@marvell.com> wrote:
>
>
>
> On 3/5/2021 2:13 PM, Balazs Nemeth wrote:
> > This patch set optimizes qede_{rx,tx}_entry and introduces
> > rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
> > improvement depends on the use-case; in a physical-virtual-physical test
> > on a ThunderX2 99xx system with two SMT threads used in ovs,
> > and two cores used in a vm, an improvement of around 2.55% is observed
> > due to this patch set.
> >
> > Balazs Nemeth (8):
> >   net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
> >   net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
> >   net/qede: assume txq->sw_tx_ring[idx] is never null in
> >     qede_free_tx_pkt
> >   net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
> >   net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
> >   net/qede: prefetch txq->hw_cons_ptr
> >   net/qede: prefetch next packet to free
> >   net/qede: remove page_offset from struct qede_rx_entry and simplify
> >
> >  drivers/net/qede/qede_rxtx.c | 148 +++++++++++++++++++----------------
> >  drivers/net/qede/qede_rxtx.h |  21 +----
> >  2 files changed, 81 insertions(+), 88 deletions(-)
>
> Series reviewed, for the series
>
> Acked-by: Igor Russkikh <irusskikh@marvell.com>
>
> One checkpatch warning I see in patchwork output, probably worth fixing:
>
> ERROR:POINTER_LOCATION: "(foo**)" should be "(foo **)"
> #120: FILE: drivers/net/qede/qede_rxtx.c:56:
> +       ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);


Hi @Balazs Nemeth

Please fix the following checkpatches.sh and check-git-log.sh issues and
add Igor's Reviewed-by in the next version.
I will merge the next version with the fixes. Updated this series'
status in patchwork to "Changes requested".


Wrong headline format:
        net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
        net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
        net/qede: assume txq->sw_tx_ring[idx] is never null in qede_free_tx_pkt
        net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
        net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
        net/qede: prefetch txq->hw_cons_ptr
        net/qede: remove page_offset from struct qede_rx_entry and simplify
Headline too long:
        net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
        net/qede: assume txq->sw_tx_ring[idx] is never null in qede_free_tx_pkt
        net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
        net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
        net/qede: remove page_offset from struct qede_rx_entry and simplify

Invalid patch(es) found - checked 8 patches
check-git-log failed



### net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf

WARNING:LONG_LINE: line length of 89 exceeds 80 columns
#37: FILE: drivers/net/qede/qede_rxtx.c:429:
+                                            (sizeof(txq->sw_tx_ring) * txq->nb_tx_desc),

total: 0 errors, 1 warnings, 0 checks, 87 lines checked

### net/qede: avoid repeatedly calling ecore_chain_get_cons_idx

WARNING:BRACES: braces {} are not necessary for single statement blocks
#82: FILE: drivers/net/qede/qede_rxtx.c:934:
+       while (remaining) {
+               remaining -= qede_free_tx_pkt(txq);
+       }

total: 0 errors, 1 warnings, 0 checks, 67 lines checked

### net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free

WARNING:LONG_LINE: line length of 89 exceeds 80 columns
#48: FILE: drivers/net/qede/qede_rxtx.c:930:
+               rte_pktmbuf_free_bulk(&txq->sw_tx_ring[first_idx], mask - first_idx + 1);

WARNING:LONG_LINE: line length of 84 exceeds 80 columns
#51: FILE: drivers/net/qede/qede_rxtx.c:933:
+               rte_pktmbuf_free_bulk(&txq->sw_tx_ring[first_idx], idx - first_idx);

total: 0 errors, 2 warnings, 0 checks, 32 lines checked
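
For readers following the two flagged calls above: they appear to free a
range of a power-of-two ring in at most two contiguous chunks when the
range wraps past the end of the array. Below is a minimal, self-contained
sketch of that pattern. It is not the driver code; `fake_mbuf`,
`fake_free_bulk` and `ring_free_range` are hypothetical stand-ins
(`fake_free_bulk` takes the place of `rte_pktmbuf_free_bulk`), and the
assumption that the second call starts again at slot 0 is mine, not
something shown in the checkpatch excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf; it only counts how often it was freed. */
struct fake_mbuf {
	int freed;
};

/* Hypothetical stand-in for rte_pktmbuf_free_bulk(): release count entries. */
static void fake_free_bulk(struct fake_mbuf **pkts, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		pkts[i]->freed++;
}

/*
 * Free the ring entries in [first_idx, idx). Both indices are assumed to be
 * already masked into [0, mask], where mask is the ring size minus one and
 * the ring size is a power of two. first_idx == idx means nothing to free.
 * Returns the number of entries released.
 */
static unsigned int ring_free_range(struct fake_mbuf **ring, uint16_t mask,
				    uint16_t first_idx, uint16_t idx)
{
	unsigned int freed = 0;

	assert(((mask + 1) & mask) == 0);

	if (idx < first_idx) {
		/* Wrapped: free the tail of the array, then restart at slot 0. */
		unsigned int tail = (unsigned int)mask - first_idx + 1;

		fake_free_bulk(&ring[first_idx], tail);
		freed += tail;
		first_idx = 0;
	}
	fake_free_bulk(&ring[first_idx], (unsigned int)idx - first_idx);
	freed += (unsigned int)idx - first_idx;
	return freed;
}
```

With an 8-entry ring (mask 7), freeing from index 6 up to index 3 releases
slots 6 and 7 in the first call and slots 0 to 2 in the second, so each
contiguous chunk needs only one bulk call.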

### net/qede: prefetch next packet to free

WARNING:REPEATED_WORD: Possible repeated word: 'next'
#6:
While handling the current mbuf, pull the next next mbuf into the cache.

WARNING:BLOCK_COMMENT_STYLE: Block comments use * on subsequent lines
#21: FILE: drivers/net/qede/qede_rxtx.c:919:
+               /* Prefetch the next mbuf. Note that at least the last 4 mbufs
+                  that are prefetched will not be used in the current call. */

WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#21: FILE: drivers/net/qede/qede_rxtx.c:919:
+                  that are prefetched will not be used in the current call. */

total: 0 errors, 3 warnings, 0 checks, 11 lines checked

### net/qede: remove page_offset from struct qede_rx_entry and simplify

WARNING:LONG_LINE: line length of 87 exceeds 80 columns
#49: FILE: drivers/net/qede/qede_rxtx.c:56:
+       ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);

ERROR:POINTER_LOCATION: "(foo**)" should be "(foo **)"
#49: FILE: drivers/net/qede/qede_rxtx.c:56:
+       ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);

total: 1 errors, 1 warnings, 0 checks, 174 lines checked



>
> Thanks
>   Igor
  
Balazs Nemeth March 24, 2021, 9:34 a.m. UTC | #8
On Wed, 2021-03-24 at 14:48 +0530, Jerin Jacob wrote:
> On Mon, Mar 22, 2021 at 10:38 PM Igor Russkikh
> <irusskikh@marvell.com> wrote:
> > 
> > 
> > 
> > On 3/5/2021 2:13 PM, Balazs Nemeth wrote:
> > > This patch set optimizes qede_{rx,tx}_entry and introduces
> > > rte_pktmbuf_free_bulk in qede_process_tx_compl. The overall performance
> > > improvement depends on the use-case; in a physical-virtual-physical test
> > > on a ThunderX2 99xx system with two SMT threads used in ovs,
> > > and two cores used in a vm, an improvement of around 2.55% is observed
> > > due to this patch set.
> > > 
> > > Balazs Nemeth (8):
> > >   net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
> > >   net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
> > >   net/qede: assume txq->sw_tx_ring[idx] is never null in
> > >     qede_free_tx_pkt
> > >   net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
> > >   net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
> > >   net/qede: prefetch txq->hw_cons_ptr
> > >   net/qede: prefetch next packet to free
> > >   net/qede: remove page_offset from struct qede_rx_entry and simplify
> > > 
> > >  drivers/net/qede/qede_rxtx.c | 148 +++++++++++++++++++----------------
> > >  drivers/net/qede/qede_rxtx.h |  21 +----
> > >  2 files changed, 81 insertions(+), 88 deletions(-)
> > 
> > Series reviewed, for the series
> > 
> > Acked-by: Igor Russkikh <irusskikh@marvell.com>
> > 
> > One checkpatch warning I see in patchwork output, probably worth fixing:
> > 
> > ERROR:POINTER_LOCATION: "(foo**)" should be "(foo **)"
> > #120: FILE: drivers/net/qede/qede_rxtx.c:56:
> > +       ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);
> 
> 
> Hi @Balazs Nemeth
> 
> Please fix the following checkpatches.sh and check-git-log.sh issues and
> add Igor's Reviewed-by in the next version.
> I will merge the next version with the fixes. Updated this series'
> status in patchwork to "Changes requested".
> 

Ok I will provide a new version next week. Thanks for the feedback!

> 
> Wrong headline format:
>         net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
>         net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
>         net/qede: assume txq->sw_tx_ring[idx] is never null in qede_free_tx_pkt
>         net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
>         net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
>         net/qede: prefetch txq->hw_cons_ptr
>         net/qede: remove page_offset from struct qede_rx_entry and simplify
> Headline too long:
>         net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
>         net/qede: assume txq->sw_tx_ring[idx] is never null in qede_free_tx_pkt
>         net/qede: inline qede_free_tx_pkt to prepare for rte_pktmbuf_free_bulk
>         net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
>         net/qede: remove page_offset from struct qede_rx_entry and simplify
> 
> Invalid patch(es) found - checked 8 patches
> check-git-log failed
> 
> 
> 
> ### net/qede: remove flags from qede_tx_entry and simplify to rte_mbuf
> 
> WARNING:LONG_LINE: line length of 89 exceeds 80 columns
> #37: FILE: drivers/net/qede/qede_rxtx.c:429:
> +                                            (sizeof(txq->sw_tx_ring) * txq->nb_tx_desc),
> 
> total: 0 errors, 1 warnings, 0 checks, 87 lines checked
> 
> ### net/qede: avoid repeatedly calling ecore_chain_get_cons_idx
> 
> WARNING:BRACES: braces {} are not necessary for single statement blocks
> #82: FILE: drivers/net/qede/qede_rxtx.c:934:
> +       while (remaining) {
> +               remaining -= qede_free_tx_pkt(txq);
> +       }
> 
> total: 0 errors, 1 warnings, 0 checks, 67 lines checked
> 
> ### net/qede: use rte_pktmbuf_free_bulk instead of rte_pktmbuf_free
> 
> WARNING:LONG_LINE: line length of 89 exceeds 80 columns
> #48: FILE: drivers/net/qede/qede_rxtx.c:930:
> +               rte_pktmbuf_free_bulk(&txq->sw_tx_ring[first_idx], mask - first_idx + 1);
> 
> WARNING:LONG_LINE: line length of 84 exceeds 80 columns
> #51: FILE: drivers/net/qede/qede_rxtx.c:933:
> +               rte_pktmbuf_free_bulk(&txq->sw_tx_ring[first_idx], idx - first_idx);
> 
> total: 0 errors, 2 warnings, 0 checks, 32 lines checked
> 
> ### net/qede: prefetch next packet to free
> 
> WARNING:REPEATED_WORD: Possible repeated word: 'next'
> #6:
> While handling the current mbuf, pull the next next mbuf into the cache.
> 
> WARNING:BLOCK_COMMENT_STYLE: Block comments use * on subsequent lines
> #21: FILE: drivers/net/qede/qede_rxtx.c:919:
> +               /* Prefetch the next mbuf. Note that at least the last 4 mbufs
> +                  that are prefetched will not be used in the current call. */
> 
> WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
> #21: FILE: drivers/net/qede/qede_rxtx.c:919:
> +                  that are prefetched will not be used in the current call. */
> 
> total: 0 errors, 3 warnings, 0 checks, 11 lines checked
> 
> ### net/qede: remove page_offset from struct qede_rx_entry and simplify
> 
> WARNING:LONG_LINE: line length of 87 exceeds 80 columns
> #49: FILE: drivers/net/qede/qede_rxtx.c:56:
> +       ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);
> 
> ERROR:POINTER_LOCATION: "(foo**)" should be "(foo **)"
> #49: FILE: drivers/net/qede/qede_rxtx.c:56:
> +       ret = rte_mempool_get_bulk(rxq->mb_pool, (void**)&rxq->sw_rx_ring[idx], count);
> 
> total: 1 errors, 1 warnings, 0 checks, 174 lines checked
> 
> 
> 
> > 
> > Thanks
> >   Igor
> 

Regards,
Balazs