Message ID | 1619414983-131070-1-git-send-email-wenzhuo.lu@intel.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Qi Zhang |
Headers | show |
Series | net/iavf: fix performance drop | expand |
Context | Check | Description |
---|---|---|
ci/iol-mellanox-Performance | success | Performance Testing PASS |
ci/github-robot | success | github build: passed |
ci/iol-testing | success | Testing PASS |
ci/iol-abi-testing | success | Testing PASS |
ci/intel-Testing | success | Testing PASS |
ci/iol-intel-Performance | success | Performance Testing PASS |
ci/Intel-compilation | success | Compilation OK |
ci/iol-intel-Functional | success | Functional Testing PASS |
ci/checkpatch | success | coding style OK |
> -----Original Message----- > From: dev <dev-bounces@dpdk.org> On Behalf Of Wenzhuo Lu > Sent: Monday, April 26, 2021 1:30 PM > To: dev@dpdk.org > Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com> > Subject: [dpdk-dev] [PATCH] net/iavf: fix performance drop > > AVX2 and SSE don't have the offload path. > Not necessary doing any check. Or the scalar path will be chosen. > > Fixes: eff56a7b9f97 ("net/iavf: add offload path for Rx AVX512") > > Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> > --- > drivers/net/iavf/iavf_rxtx.c | 12 +++++------- > 1 file changed, 5 insertions(+), 7 deletions(-) > > diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c index > 3f3cf63..0ba19dbf 100644 > --- a/drivers/net/iavf/iavf_rxtx.c > +++ b/drivers/net/iavf/iavf_rxtx.c > @@ -2401,13 +2401,11 @@ > check_ret = iavf_rx_vec_dev_check(dev); > if (check_ret >= 0 && > rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) { > - if (check_ret == IAVF_VECTOR_PATH) { > - use_sse = true; > - if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > - rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) > && > - rte_vect_get_max_simd_bitwidth() >= > RTE_VECT_SIMD_256) > - use_avx2 = true; > - } > + use_sse = true; > + if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > + use_avx2 = true; Not sure if the right path will be selected when avx512 is not true, (CC_AVX512_SUPPORT is disable) while check_ret is IAVF_VECTOR_OFFLOAD_PATH? Currently we have if (!use_sse && !use_avx2 && !use_avx512) goto normal; Should we also add below check? if (!use_avx512 && check_ret == IAVF_VECTOR_OFFLOAD_PATH) goto normal; > > #ifdef CC_AVX512_SUPPORT > if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 && > -- > 1.9.3
> > +use_sse = true; > > +if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > > + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > > + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > > +use_avx2 = true; > > > Not sure if the right path will be selected when avx512 is not true, > (CC_AVX512_SUPPORT is disable) while check_ret is > IAVF_VECTOR_OFFLOAD_PATH? > > Currently we have > > if (!use_sse && !use_avx2 && !use_avx512) goto normal; > > Should we also add below check? > > if (!use_avx512 && check_ret == IAVF_VECTOR_OFFLOAD_PATH) > goto normal; Not necessary. As explained in the commit log, AVX2 and SSE support the offload features. The purpose of this patch is to let AVX2 be chosen when offload needed and AVX512 disable.
> -----Original Message----- > From: Lu, Wenzhuo <wenzhuo.lu@intel.com> > Sent: 2021年4月27日 9:51 > To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH] net/iavf: fix performance drop > > > > +use_sse = true; > > > +if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > > > + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > > > + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > > > +use_avx2 = true; > > > > > > Not sure if the right path will be selected when avx512 is not true, > > (CC_AVX512_SUPPORT is disable) while check_ret is > > IAVF_VECTOR_OFFLOAD_PATH? > > > > Currently we have > > > > if (!use_sse && !use_avx2 && !use_avx512) goto normal; > > > > Should we also add below check? > > > > if (!use_avx512 && check_ret == IAVF_VECTOR_OFFLOAD_PATH) > > goto normal; > Not necessary. As explained in the commit log, AVX2 and SSE support the > offload features. The purpose of this patch is to let AVX2 be chosen when > offload needed and AVX512 disable. OK, now I understand it, I will suggest to change the commit log. From AVX2 and SSE don't have the offload path. To AVX2 and SSE Tx path already support offloads but can't selected. If you agree, will apply above during change during patch merge.
> > > > +use_sse = true; > > > > +if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > > > > + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > > > > + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > > > > +use_avx2 = true; > > > > > > > > > Not sure if the right path will be selected when avx512 is not true, > > > (CC_AVX512_SUPPORT is disable) while check_ret is > > > IAVF_VECTOR_OFFLOAD_PATH? > > > > > > Currently we have > > > > > > if (!use_sse && !use_avx2 && !use_avx512) goto normal; > > > > > > Should we also add below check? > > > > > > if (!use_avx512 && check_ret == IAVF_VECTOR_OFFLOAD_PATH) > > > goto normal; > > Not necessary. As explained in the commit log, AVX2 and SSE support > > the offload features. The purpose of this patch is to let AVX2 be > > chosen when offload needed and AVX512 disable. > > OK, now I understand it, I will suggest to change the commit log. > > From > AVX2 and SSE don't have the offload path. > > To > AVX2 and SSE Tx path already support offloads but can't selected. > > If you agree, will apply above during change during patch merge. Sure. Thanks for that.
> -----Original Message----- > From: dev <dev-bounces@dpdk.org> On Behalf Of Wenzhuo Lu > Sent: Monday, April 26, 2021 1:30 PM > To: dev@dpdk.org > Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com> > Subject: [dpdk-dev] [PATCH] net/iavf: fix performance drop > > AVX2 and SSE don't have the offload path. > Not necessary doing any check. Or the scalar path will be chosen. AVX2 and SSE Tx path already support offloads but can't be selected. The patch remove the unnecessary check or the scalar path will always be chosen. > > Fixes: eff56a7b9f97 ("net/iavf: add offload path for Rx AVX512") > > Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com> Applied to dpdk-next-net-intel after refined commit log. Thanks Qi
On 4/26/2021 6:29 AM, Wenzhuo Lu wrote: > AVX2 and SSE don't have the offload path. > Not necessary doing any check. Or the scalar path > will be chosen. Hi Wenzhuo, The fix by not changing Rx implementation, but making sure correct Rx path selected, right? Can you please clarify this in the commit log? So the performance drop fixed for whoever have the vector path supported and offloads enabled, can be good to highlight in the patch title, otherwise it is too generic. > > Fixes: eff56a7b9f97 ("net/iavf: add offload path for Rx AVX512") > I am not clear what caused the performance drop. Before above patch, vector path was not supporting the offloads and scalar path should be selected. After above patch, still scalar path selected, although vector path supports offloads, but for both before and after scalar path is selected, so why/when the performance drop happens? Can you please clarify in the commit log, how to reproduce performance drop? > Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> > --- > drivers/net/iavf/iavf_rxtx.c | 12 +++++------- > 1 file changed, 5 insertions(+), 7 deletions(-) > > diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c > index 3f3cf63..0ba19dbf 100644 > --- a/drivers/net/iavf/iavf_rxtx.c > +++ b/drivers/net/iavf/iavf_rxtx.c > @@ -2401,13 +2401,11 @@ > check_ret = iavf_rx_vec_dev_check(dev); > if (check_ret >= 0 && > rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) { > - if (check_ret == IAVF_VECTOR_PATH) { > - use_sse = true; > - if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > - rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > - rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > - use_avx2 = true; > - } > + use_sse = true; > + if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > + use_avx2 = true; > > #ifdef CC_AVX512_SUPPORT > if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 && >
On 4/27/2021 3:57 AM, Zhang, Qi Z wrote: > > >> -----Original Message----- >> From: Lu, Wenzhuo <wenzhuo.lu@intel.com> >> Sent: 2021年4月27日 9:51 >> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org >> Subject: RE: [dpdk-dev] [PATCH] net/iavf: fix performance drop >> >>>> +use_sse = true; >>>> +if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || >>>> + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && >>>> + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) >>>> +use_avx2 = true; >>> >>> >>> Not sure if the right path will be selected when avx512 is not true, >>> (CC_AVX512_SUPPORT is disable) while check_ret is >>> IAVF_VECTOR_OFFLOAD_PATH? >>> >>> Currently we have >>> >>> if (!use_sse && !use_avx2 && !use_avx512) goto normal; >>> >>> Should we also add below check? >>> >>> if (!use_avx512 && check_ret == IAVF_VECTOR_OFFLOAD_PATH) >>> goto normal; >> Not necessary. As explained in the commit log, AVX2 and SSE support the >> offload features. The purpose of this patch is to let AVX2 be chosen when >> offload needed and AVX512 disable. > > OK, now I understand it, I will suggest to change the commit log. > > From > AVX2 and SSE don't have the offload path. > > To > AVX2 and SSE Tx path already support offloads but can't selected. > Isn't this related to the _Rx_ path? > If you agree, will apply above during change during patch merge. >
> -----Original Message----- > From: Yigit, Ferruh <ferruh.yigit@intel.com> > Sent: Wednesday, April 28, 2021 7:32 PM > To: Zhang, Qi Z <qi.z.zhang@intel.com>; Lu, Wenzhuo > <wenzhuo.lu@intel.com>; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] net/iavf: fix performance drop > > On 4/27/2021 3:57 AM, Zhang, Qi Z wrote: > > > > > >> -----Original Message----- > >> From: Lu, Wenzhuo <wenzhuo.lu@intel.com> > >> Sent: 2021年4月27日 9:51 > >> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org > >> Subject: RE: [dpdk-dev] [PATCH] net/iavf: fix performance drop > >> > >>>> +use_sse = true; > >>>> +if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || > >>>> + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && > >>>> + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) > >>>> +use_avx2 = true; > >>> > >>> > >>> Not sure if the right path will be selected when avx512 is not true, > >>> (CC_AVX512_SUPPORT is disable) while check_ret is > >>> IAVF_VECTOR_OFFLOAD_PATH? > >>> > >>> Currently we have > >>> > >>> if (!use_sse && !use_avx2 && !use_avx512) goto normal; > >>> > >>> Should we also add below check? > >>> > >>> if (!use_avx512 && check_ret == IAVF_VECTOR_OFFLOAD_PATH) > >>> goto normal; > >> Not necessary. As explained in the commit log, AVX2 and SSE support > >> the offload features. The purpose of this patch is to let AVX2 be > >> chosen when offload needed and AVX512 disable. > > > > OK, now I understand it, I will suggest to change the commit log. > > > > From > > AVX2 and SSE don't have the offload path. > > > > To > > AVX2 and SSE Tx path already support offloads but can't selected. > > > > Isn't this related to the _Rx_ path? Yes, it should be Rx Path. > > > If you agree, will apply above during change during patch merge. > >
> The fix by not changing Rx implementation, but making sure correct Rx path > selected, right? Can you please clarify this in the commit log? > > So the performance drop fixed for whoever have the vector path supported and > offloads enabled, can be good to highlight in the patch title, otherwise it is too > generic. Yes, the implementation is not changed. It's only about the path selection. I'll clarify it. > > > > > Fixes: eff56a7b9f97 ("net/iavf: add offload path for Rx AVX512") > > > > I am not clear what caused the performance drop. > Before above patch, vector path was not supporting the offloads and scalar > path should be selected. After above patch, still scalar path selected, although > vector path supports offloads, but for both before and after scalar path is > selected, so why/when the performance drop happens? > Can you please clarify in the commit log, how to reproduce performance drop? Sure. I'll describe more about what's the issue.
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c index 3f3cf63..0ba19dbf 100644 --- a/drivers/net/iavf/iavf_rxtx.c +++ b/drivers/net/iavf/iavf_rxtx.c @@ -2401,13 +2401,11 @@ check_ret = iavf_rx_vec_dev_check(dev); if (check_ret >= 0 && rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) { - if (check_ret == IAVF_VECTOR_PATH) { - use_sse = true; - if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || - rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && - rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) - use_avx2 = true; - } + use_sse = true; + if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) + use_avx2 = true; #ifdef CC_AVX512_SUPPORT if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 &&
AVX2 and SSE don't have the offload path. Not necessary doing any check. Or the scalar path will be chosen. Fixes: eff56a7b9f97 ("net/iavf: add offload path for Rx AVX512") Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> --- drivers/net/iavf/iavf_rxtx.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)