Message ID | 20201009081410.63944-1-yong.liu@intel.com (mailing list archive) |
---|---|
Headers |
From: Marvin Liu <yong.liu@intel.com> To: maxime.coquelin@redhat.com, chenbo.xia@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu <yong.liu@intel.com> Date: Fri, 9 Oct 2020 16:14:05 +0800 Message-Id: <20201009081410.63944-1-yong.liu@intel.com> In-Reply-To: <20200819032414.51430-2-yong.liu@intel.com> References: <20200819032414.51430-2-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v3 0/5] vhost add vectorized data path List-Id: DPDK patches and discussions <dev.dpdk.org> |
Series | vhost add vectorized data path |
Message
Marvin Liu
Oct. 9, 2020, 8:14 a.m. UTC
The packed ring format was introduced in virtio spec 1.1. When the packed
ring format is negotiated, all descriptors are compacted into one single
ring, so ring operations can be accelerated in a straightforward way with
SIMD instructions.

This patch set introduces a vectorized data path in the vhost library.
When the vectorized option is on, operations like descriptor checks,
descriptor writeback and address translation are accelerated by SIMD
instructions. On a Skylake server, this brings about a 6% performance
gain in the loopback case and around a 4% performance gain in the PvP
case.

A vhost application can choose whether to use vectorized acceleration,
just like the external buffer feature. If the platform or the ring format
does not support the vectorized functions, vhost falls back to the
default batch functions, so there is no impact on the current data path.

v3:
* rename vectorized datapath file
* eliminate the impact when avx512 disabled
* dynamically allocate memory regions structure
* remove unlikely hint for in_order

v2:
* add vIOMMU support
* add dequeue offloading
* rebase code

Marvin Liu (5):
  vhost: add vectorized data path
  vhost: reuse packed ring functions
  vhost: prepare memory regions addresses
  vhost: add packed ring vectorized dequeue
  vhost: add packed ring vectorized enqueue

 doc/guides/nics/vhost.rst           |   5 +
 doc/guides/prog_guide/vhost_lib.rst |  12 +
 drivers/net/vhost/rte_eth_vhost.c   |  17 +-
 lib/librte_vhost/meson.build        |  16 ++
 lib/librte_vhost/rte_vhost.h        |   1 +
 lib/librte_vhost/socket.c           |   5 +
 lib/librte_vhost/vhost.c            |  11 +
 lib/librte_vhost/vhost.h            | 239 +++++++++++++++++++
 lib/librte_vhost/vhost_user.c       |  26 +++
 lib/librte_vhost/virtio_net.c       | 258 ++++-----------------
 lib/librte_vhost/virtio_net_avx.c   | 344 ++++++++++++++++++++++++++++
 11 files changed, 718 insertions(+), 216 deletions(-)
 create mode 100644 lib/librte_vhost/virtio_net_avx.c
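As a rough illustration of the batched descriptor check the cover letter describes, the scalar logic that the AVX512 path vectorizes can be sketched in plain C as below. The structure layout, flag values, and function names are simplified assumptions for illustration, not the exact DPDK definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified packed-ring descriptor (illustrative only;
 * the real layout lives in DPDK's vhost headers). */
#define VRING_DESC_F_AVAIL (1 << 7)
#define VRING_DESC_F_USED  (1 << 15)

struct packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* A packed-ring descriptor is available to the device when its AVAIL
 * bit matches the ring's wrap counter and its USED bit does not. */
int desc_is_avail(const struct packed_desc *d, int wrap)
{
    int avail = !!(d->flags & VRING_DESC_F_AVAIL);
    int used  = !!(d->flags & VRING_DESC_F_USED);
    return avail == wrap && used != wrap;
}

/* Batch check over four descriptors at once: this loop is the scalar
 * model of what one wide SIMD load plus compare does in a vectorized
 * data path. */
int desc_batch_avail(const struct packed_desc *d, int wrap)
{
    int ok = 1;
    for (int i = 0; i < 4; i++)
        ok &= desc_is_avail(&d[i], wrap);
    return ok;
}
```

When the whole batch is available, a fast path can process four descriptors per iteration; otherwise processing falls back to handling descriptors one by one, which matches the fallback behavior described in the cover letter.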
Comments
Hi Marvin,

On 10/9/20 10:14 AM, Marvin Liu wrote:
> Packed ring format is imported since virtio spec 1.1. All descriptors
> are compacted into one single ring when packed ring format is on. It is
> straight forward that ring operations can be accelerated by utilizing
> SIMD instructions.
>
> This patch set will introduce vectorized data path in vhost library. If
> vectorized option is on, operations like descs check, descs writeback,
> address translation will be accelerated by SIMD instructions. On skylake
> server, it can bring 6% performance gain in loopback case and around 4%
> performance gain in PvP case.

IMHO, 4% gain on PVP is not a significant gain if we compare to the
added complexity. Moreover, I guess this is 4% gain with testpmd-based
PVP? If this is the case it may be even lower with OVS-DPDK PVP
benchmark, I will try to do a benchmark this week.

Thanks,
Maxime

> Vhost application can choose whether using vectorized acceleration, just
> like external buffer feature. If platform or ring format not support
> vectorized function, vhost will fallback to use default batch function.
> There will be no impact in current data path.
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Monday, October 12, 2020 4:22 PM
> To: Liu, Yong <yong.liu@intel.com>; Xia, Chenbo <chenbo.xia@intel.com>;
> Wang, Zhihong <zhihong.wang@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 0/5] vhost add vectorized data path
>
> Hi Marvin,
>
> On 10/9/20 10:14 AM, Marvin Liu wrote:
> > Packed ring format is imported since virtio spec 1.1. All descriptors
> > are compacted into one single ring when packed ring format is on. It is
> > straight forward that ring operations can be accelerated by utilizing
> > SIMD instructions.
> >
> > This patch set will introduce vectorized data path in vhost library. If
> > vectorized option is on, operations like descs check, descs writeback,
> > address translation will be accelerated by SIMD instructions. On skylake
> > server, it can bring 6% performance gain in loopback case and around 4%
> > performance gain in PvP case.
>
> IMHO, 4% gain on PVP is not a significant gain if we compare to the
> added complexity. Moreover, I guess this is 4% gain with testpmd-based
> PVP? If this is the case it may be even lower with OVS-DPDK PVP
> benchmark, I will try to do a benchmark this week.
>

Maxime,
I have observed around 3% gain with OVS-DPDK in first version. But the
number is not reliable as datapath has been changed.
I will try again after fixed OVS integration issue with latest dpdk.

> Thanks,
> Maxime
>
> > Vhost application can choose whether using vectorized acceleration, just
> > like external buffer feature. If platform or ring format not support
> > vectorized function, vhost will fallback to use default batch function.
> > There will be no impact in current data path.
Hi Marvin,

On 10/12/20 11:10 AM, Liu, Yong wrote:
>
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Monday, October 12, 2020 4:22 PM
>> To: Liu, Yong <yong.liu@intel.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>> Wang, Zhihong <zhihong.wang@intel.com>
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v3 0/5] vhost add vectorized data path
>>
>> Hi Marvin,
>>
>> On 10/9/20 10:14 AM, Marvin Liu wrote:
>>> Packed ring format is imported since virtio spec 1.1. All descriptors
>>> are compacted into one single ring when packed ring format is on. It is
>>> straight forward that ring operations can be accelerated by utilizing
>>> SIMD instructions.
>>>
>>> This patch set will introduce vectorized data path in vhost library. If
>>> vectorized option is on, operations like descs check, descs writeback,
>>> address translation will be accelerated by SIMD instructions. On skylake
>>> server, it can bring 6% performance gain in loopback case and around 4%
>>> performance gain in PvP case.
>>
>> IMHO, 4% gain on PVP is not a significant gain if we compare to the
>> added complexity. Moreover, I guess this is 4% gain with testpmd-based
>> PVP? If this is the case it may be even lower with OVS-DPDK PVP
>> benchmark, I will try to do a benchmark this week.
>
> Maxime,
> I have observed around 3% gain with OVS-DPDK in first version. But the
> number is not reliable as datapath has been changed.
> I will try again after fixed OVS integration issue with latest dpdk.

Thanks for the information.

Also, wouldn't using AVX512 lower the CPU frequency?
If so, could it have an impact on the workload running on the other
CPUs?

Thanks,
Maxime

>> Thanks,
>> Maxime
>>
>>> Vhost application can choose whether using vectorized acceleration, just
>>> like external buffer feature. If platform or ring format not support
>>> vectorized function, vhost will fallback to use default batch function.
>>> There will be no impact in current data path.
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Monday, October 12, 2020 5:57 PM
> To: Liu, Yong <yong.liu@intel.com>; Xia, Chenbo <chenbo.xia@intel.com>;
> Wang, Zhihong <zhihong.wang@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 0/5] vhost add vectorized data path
>
> Hi Marvin,
>
> On 10/12/20 11:10 AM, Liu, Yong wrote:
> >> IMHO, 4% gain on PVP is not a significant gain if we compare to the
> >> added complexity. Moreover, I guess this is 4% gain with testpmd-based
> >> PVP? If this is the case it may be even lower with OVS-DPDK PVP
> >> benchmark, I will try to do a benchmark this week.
> >
> > Maxime,
> > I have observed around 3% gain with OVS-DPDK in first version. But the
> > number is not reliable as datapath has been changed.
> > I will try again after fixed OVS integration issue with latest dpdk.
>
> Thanks for the information.
>
> Also, wouldn't using AVX512 lower the CPU frequency?
> If so, could it have an impact on the workload running on the other
> CPUs?
>

All AVX512 instructions used in vhost are lightweight ones, frequency
won't be affected. Theoretically system performance won't be affected if
only lightweight instructions are used.

Thanks.

> Thanks,
> Maxime
Hi All,
Performance gain from the vectorized data path in OVS-DPDK is around 1%,
meanwhile it has a small impact on the original data path. On the other
hand, it will increase the complexity of vhost (new parameter introduced,
prepare memory information for address translation). After weighing the
pros and cons, I'd like to withdraw this patch set. Thanks for your time.

Regards,
Marvin

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Monday, October 12, 2020 4:22 PM
> To: Liu, Yong <yong.liu@intel.com>; Xia, Chenbo <chenbo.xia@intel.com>;
> Wang, Zhihong <zhihong.wang@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 0/5] vhost add vectorized data path
>
> Hi Marvin,
>
> IMHO, 4% gain on PVP is not a significant gain if we compare to the
> added complexity. Moreover, I guess this is 4% gain with testpmd-based
> PVP? If this is the case it may be even lower with OVS-DPDK PVP
> benchmark, I will try to do a benchmark this week.
>
> Thanks,
> Maxime
Hi Marvin,

On 10/15/20 5:28 PM, Liu, Yong wrote:
> Hi All,
> Performance gain from vectorized datapath in OVS-DPDK is around 1%,
> meanwhile it have a small impact of original datapath.
> On the other hand, it will increase the complexity of vhost (new
> parameter introduced, prepare memory information for address
> translation).
> After weighed the procs and co, I'd like to drawback this patch set.
> Thanks for your time.

Thanks for running the test with the new version.
I have removed it from Patchwork.

Thanks,
Maxime