[v2] config: enable packet data prefetch

Message ID 20200923015131.101203-1-yong.liu@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series [v2] config: enable packet data prefetch

Checks

Context                      Check    Description
ci/checkpatch                success  coding style OK
ci/iol-broadcom-Functional   success  Functional Testing PASS
ci/iol-intel-Functional      success  Functional Testing PASS
ci/iol-broadcom-Performance  success  Performance Testing PASS
ci/iol-testing               success  Testing PASS
ci/Intel-compilation         success  Compilation OK
ci/iol-intel-Performance     success  Performance Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/travis-robot              success  Travis build: passed

Commit Message

Marvin Liu Sept. 23, 2020, 1:51 a.m. UTC
Data prefetch instructions can preload data into the CPU's cache
hierarchy before the data is accessed. Virtualized data paths like
virtio utilize this feature for acceleration. Since most modern CPUs
support prefetch, we can enable packet data prefetch by default.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
---
v2: move define from meson.build to rte_config.h
---
 config/rte_config.h | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Thomas Monjalon Oct. 14, 2020, 10:02 p.m. UTC | #1
23/09/2020 03:51, Marvin Liu:
> Data prefetch instruction can preload data into cpu’s hierarchical
> cache before data access. Virtualized data paths like virtio utilized
> this feature for acceleration. Since most modern cpus have support
> prefetch function, we can enable packet data prefetch as default.
> 
> Signed-off-by: Marvin Liu <yong.liu@intel.com>
> ---
> +#define RTE_PMD_PACKET_PREFETCH 1

We could also remove the related #ifdefs.

What would be the drawback of always enabling those prefetches?
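
For readers following the thread, the "related #ifdefs" can be pictured with a minimal sketch. It assumes a driver wraps its prefetch hint in a macro gated on RTE_PMD_PACKET_PREFETCH. The names pkt_prefetch and sum_first_bytes are hypothetical, and the GCC/Clang builtin __builtin_prefetch stands in for DPDK's own prefetch helpers so the example builds without DPDK headers.

```c
#include <stddef.h>
#include <stdint.h>

/* The define added by this patch; in DPDK it would come from rte_config.h. */
#define RTE_PMD_PACKET_PREFETCH 1

#ifdef RTE_PMD_PACKET_PREFETCH
/* Hypothetical wrapper: hint that *p will be read soon (GCC/Clang builtin). */
#define pkt_prefetch(p) __builtin_prefetch((p), 0, 3)
#else
/* Option disabled: the hint compiles away to nothing. */
#define pkt_prefetch(p) do { (void)(p); } while (0)
#endif

/* Sum the first byte of each packet, prefetching the next packet's data. */
static uint32_t
sum_first_bytes(const uint8_t *const *pkts, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (i + 1 < n)
			pkt_prefetch(pkts[i + 1]);
		sum += pkts[i][0];
	}
	return sum;
}
```

Removing the #ifdef would amount to keeping only the first macro definition, so the hint is always emitted.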
  
Marvin Liu Oct. 15, 2020, 1:21 a.m. UTC | #2
> 
> 23/09/2020 03:51, Marvin Liu:
> > Data prefetch instruction can preload data into cpu’s hierarchical
> > cache before data access. Virtualized data paths like virtio utilized
> > this feature for acceleration. Since most modern cpus have support
> > prefetch function, we can enable packet data prefetch as default.
> >
> > Signed-off-by: Marvin Liu <yong.liu@intel.com>
> > ---
> > +#define RTE_PMD_PACKET_PREFETCH 1
> 
> We could also remove the related #ifdefs.
> 
> What can be the drawback of always enable those prefetches?
> 

Hi Thomas,
I think the potential drawback is that the current prefetch locations cannot guarantee the best performance across different platforms.
Each developer has tuned performance by adding prefetch instructions and verified the result on their own platform.
So the prefetch locations are based on a particular platform, and it will be hard for developers to compare results across platforms.

Thanks,
Marvin
  
Honnappa Nagarahalli Oct. 15, 2020, 4:09 a.m. UTC | #3
<snip>

> >
> > 23/09/2020 03:51, Marvin Liu:
> > > Data prefetch instruction can preload data into cpu’s hierarchical
> > > cache before data access. Virtualized data paths like virtio
> > > utilized this feature for acceleration. Since most modern cpus have
> > > support prefetch function, we can enable packet data prefetch as default.
> > >
> > > Signed-off-by: Marvin Liu <yong.liu@intel.com>
> > > ---
> > > +#define RTE_PMD_PACKET_PREFETCH 1
> >
> > We could also remove the related #ifdefs.
> >
> > What can be the drawback of always enable those prefetches?
> >
> 
> Hi Thomas,
> I think the potential drawback is that current prefetch location cannot
> guarantee the best performance across different platforms.
Then, does it make sense to enable this by default?

> Each developer has tuned the performance by adding prefetch instruction
> and verified the result on himself platform.
> So prefetch location is based on certain platform, also it will be hard for
> developer to compare the results across platforms.
> 
> Thanks,
> Marvin
  
Marvin Liu Oct. 15, 2020, 8:23 a.m. UTC | #4
> 
> <snip>
> 
> > >
> > > 23/09/2020 03:51, Marvin Liu:
> > > > Data prefetch instruction can preload data into cpu’s hierarchical
> > > > cache before data access. Virtualized data paths like virtio
> > > > utilized this feature for acceleration. Since most modern cpus have
> > > > support prefetch function, we can enable packet data prefetch as
> default.
> > > >
> > > > Signed-off-by: Marvin Liu <yong.liu@intel.com>
> > > > ---
> > > > +#define RTE_PMD_PACKET_PREFETCH 1
> > >
> > > We could also remove the related #ifdefs.
> > >
> > > What can be the drawback of always enable those prefetches?
> > >
> >
> > Hi Thomas,
> > I think the potential drawback is that current prefetch location cannot
> > guarantee the best performance across different platforms.
> Then, does it make sense to enable this by default?
> 

Most prefetch actions are now placed right after the data pointer becomes valid. I think this methodology can benefit all platforms.
It's hard to say that it's the best choice for all, but I have no better solution in mind.
At least, we need to allow users to enable packet data prefetch.

Regards,
Marvin

> > Each developer has tuned the performance by adding prefetch instruction
> > and verified the result on himself platform.
> > So prefetch location is based on certain platform, also it will be hard for
> > developer to compare the results across platforms.
> >
> > Thanks,
> > Marvin
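
The placement described in the reply above (issue the prefetch as soon as the data pointer is valid) can be illustrated with a short sketch. This is not code from any DPDK driver: struct desc, BURST_MAX and rx_burst_sum are hypothetical names, and __builtin_prefetch (GCC/Clang) is used in place of DPDK's prefetch helpers.

```c
#include <stddef.h>
#include <stdint.h>

#define BURST_MAX 32 /* hypothetical burst size for this sketch */

/* Hypothetical descriptor: where the packet data lives and how long it is. */
struct desc {
	const uint8_t *addr;
	uint16_t len;
};

/*
 * Two-pass burst. Pass 1 reads each descriptor and prefetches the packet
 * data as soon as its address is known. Pass 2 touches the data, by which
 * time the cache lines may already have arrived.
 */
static uint32_t
rx_burst_sum(const struct desc *ring, size_t n)
{
	const uint8_t *data[BURST_MAX];
	uint16_t len[BURST_MAX];
	uint32_t sum = 0;

	if (n > BURST_MAX)
		n = BURST_MAX;

	for (size_t i = 0; i < n; i++) {
		data[i] = ring[i].addr; /* pointer is valid from here on */
		len[i] = ring[i].len;
		__builtin_prefetch(data[i], 0, 3);
	}

	for (size_t i = 0; i < n; i++)
		for (uint16_t j = 0; j < len[i]; j++)
			sum += data[i][j];

	return sum;
}
```

Whether the gap between the two passes is long enough to hide memory latency depends on the platform, which is the tuning concern raised earlier in the thread.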
  
Thomas Monjalon Oct. 15, 2020, 9:21 a.m. UTC | #5
15/10/2020 10:23, Liu, Yong:
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > 23/09/2020 03:51, Marvin Liu:
> > > > > Data prefetch instruction can preload data into cpu’s hierarchical
> > > > > cache before data access. Virtualized data paths like virtio
> > > > > utilized this feature for acceleration. Since most modern cpus have
> > > > > support prefetch function, we can enable packet data prefetch as
> > default.
> > > > >
> > > > > Signed-off-by: Marvin Liu <yong.liu@intel.com>
> > > > > ---
> > > > > +#define RTE_PMD_PACKET_PREFETCH 1
> > > >
> > > > We could also remove the related #ifdefs.
> > > >
> > > > What can be the drawback of always enable those prefetches?
> > > >
> > >
> > > Hi Thomas,
> > > I think the potential drawback is that current prefetch location cannot
> > > guarantee the best performance across different platforms.
> > Then, does it make sense to enable this by default?
> > 
> 
> Now most of prefetch actions are placed after pointer of data is valid.  I think this methodology can benefit all platforms.
> It's hard to say that it’s the best choice for all. But no more better solution in my mind. 
> At least, we need to allow user to enable packet data prefetch.

In my opinion, it can be tested and measured.

> > > Each developer has tuned the performance by adding prefetch instruction
> > > and verified the result on himself platform.
> > > So prefetch location is based on certain platform, also it will be hard for
> > > developer to compare the results across platforms.

If it shows a benefit on an architecture, then it should be enabled
with #ifdef RTE_ARCH_XX.

I am for removing the option RTE_PMD_PACKET_PREFETCH.
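
A rough sketch of the per-architecture alternative suggested above, under stated assumptions: RTE_ARCH_X86_64 is picked purely as an example of an RTE_ARCH_* macro, not because the thread reports a measured benefit there. The names arch_pkt_prefetch and xor_first_bytes are hypothetical, and __builtin_prefetch (GCC/Clang) replaces DPDK's prefetch helpers so the code compiles standalone.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only: gate the hint on an architecture macro instead of a
 * global option. Outside a DPDK build RTE_ARCH_X86_64 is undefined, so the
 * hint compiles away and the function below still behaves the same.
 */
#if defined(RTE_ARCH_X86_64)
#define arch_pkt_prefetch(p) __builtin_prefetch((p), 0, 3)
#else
#define arch_pkt_prefetch(p) do { (void)(p); } while (0)
#endif

/* XOR the first byte of each packet, hinting the next packet's data. */
static uint8_t
xor_first_bytes(const uint8_t *const *pkts, size_t n)
{
	uint8_t acc = 0;

	for (size_t i = 0; i < n; i++) {
		if (i + 1 < n)
			arch_pkt_prefetch(pkts[i + 1]);
		acc ^= pkts[i][0];
	}
	return acc;
}
```

With this shape, each architecture opts in only after its own measurements, and RTE_PMD_PACKET_PREFETCH is no longer needed.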
  
Honnappa Nagarahalli Oct. 15, 2020, 2:38 p.m. UTC | #6
<snip>

> 15/10/2020 10:23, Liu, Yong:
> > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > 23/09/2020 03:51, Marvin Liu:
> > > > > > Data prefetch instruction can preload data into cpu’s
> > > > > > hierarchical cache before data access. Virtualized data paths
> > > > > > like virtio utilized this feature for acceleration. Since most
> > > > > > modern cpus have support prefetch function, we can enable
> > > > > > packet data prefetch as
> > > default.
> > > > > >
> > > > > > Signed-off-by: Marvin Liu <yong.liu@intel.com>
> > > > > > ---
> > > > > > +#define RTE_PMD_PACKET_PREFETCH 1
> > > > >
> > > > > We could also remove the related #ifdefs.
> > > > >
> > > > > What can be the drawback of always enable those prefetches?
> > > > >
> > > >
> > > > Hi Thomas,
> > > > I think the potential drawback is that current prefetch location
> > > > cannot guarantee the best performance across different platforms.
> > > Then, does it make sense to enable this by default?
> > >
> >
> > Now most of prefetch actions are placed after pointer of data is valid.  I
> think this methodology can benefit all platforms.
> > It's hard to say that it’s the best choice for all. But no more better solution
> in my mind.
> > At least, we need to allow user to enable packet data prefetch.
> 
> In my opinion, it can be tested and measured.
+ Joyce, to test this for VirtIO on Arm

> 
> > > > Each developer has tuned the performance by adding prefetch
> > > > instruction and verified the result on himself platform.
> > > > So prefetch location is based on certain platform, also it will be
> > > > hard for developer to compare the results across platforms.
> 
> If it shows benefit on an architecture, then it should be enabled with #ifdef
> RTE_ARCH_XX
> 
> I am for removing the option RTE_PMD_PACKET_PREFETCH.
> 
>
  

Patch

diff --git a/config/rte_config.h b/config/rte_config.h
index 0bae630fd9..8b007c4c31 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -101,6 +101,7 @@ 
 #define RTE_LIBRTE_GRAPH_STATS 1
 
 /****** driver defines ********/
+#define RTE_PMD_PACKET_PREFETCH 1
 
 /* QuickAssist device */
 /* Max. number of QuickAssist devices which can be attached */