[v13,2/2] kni: support IOVA mode

Message ID 20191115111807.20935-3-vattunuru@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series kni: support IOVA mode |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/travis-robot success Travis build: passed

Commit Message

Vamsi Krishna Attunuru Nov. 15, 2019, 11:18 a.m. UTC
  From: Vamsi Attunuru <vattunuru@marvell.com>

Current KNI implementation only operates in IOVA_PA mode
patch adds required functionality to enable KNI in
IOVA_VA mode.

KNI loopback mode tests will have performance impact in
this mode due to IOVA to KVA address translations.
However, In KNI real world use cases, the performace
impact will be based on Linux kernel stack and scheduler
latencies. Performance varies based on the KNI use case.
If bus iommu scheme is IOVA_DC and KNI module is loaded,
DPDK chooses IOVA as PA as existing behaviour.

During KNI creation, app's iova_mode details are passed to
the KNI kernel module, accordingly kernel module translates
PA/IOVA addresses to KVA and vice-versa.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 doc/guides/prog_guide/kernel_nic_interface.rst | 15 +++++++++++++++
 doc/guides/rel_notes/release_19_11.rst         | 15 ++++++++++++++-
 lib/librte_eal/linux/eal/eal.c                 | 23 ++++++++++++++++-------
 lib/librte_kni/rte_kni.c                       |  6 ++++++
 4 files changed, 51 insertions(+), 8 deletions(-)
  

Comments

Ferruh Yigit Nov. 15, 2019, 12:11 p.m. UTC | #1
On 11/15/2019 11:18 AM, vattunuru@marvell.com wrote:
> From: Vamsi Attunuru <vattunuru@marvell.com>
> 
> Current KNI implementation only operates in IOVA_PA mode
> patch adds required functionality to enable KNI in
> IOVA_VA mode.
> 
> KNI loopback mode tests will have performance impact in
> this mode due to IOVA to KVA address translations.
> However, In KNI real world use cases, the performace
> impact will be based on Linux kernel stack and scheduler
> latencies. Performance varies based on the KNI use case.
> If bus iommu scheme is IOVA_DC and KNI module is loaded,
> DPDK chooses IOVA as PA as existing behaviour.
> 
> During KNI creation, app's iova_mode details are passed to
> the KNI kernel module, accordingly kernel module translates
> PA/IOVA addresses to KVA and vice-versa.
> 
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>  doc/guides/prog_guide/kernel_nic_interface.rst | 15 +++++++++++++++
>  doc/guides/rel_notes/release_19_11.rst         | 15 ++++++++++++++-
>  lib/librte_eal/linux/eal/eal.c                 | 23 ++++++++++++++++-------
>  lib/librte_kni/rte_kni.c                       |  6 ++++++
>  4 files changed, 51 insertions(+), 8 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst b/doc/guides/prog_guide/kernel_nic_interface.rst
> index 2fd58e1..3bed18f 100644
> --- a/doc/guides/prog_guide/kernel_nic_interface.rst
> +++ b/doc/guides/prog_guide/kernel_nic_interface.rst
> @@ -300,6 +300,21 @@ The sk_buff is then freed and the mbuf sent in the tx_q FIFO.
>  The DPDK TX thread dequeues the mbuf and sends it to the PMD via ``rte_eth_tx_burst()``.
>  It then puts the mbuf back in the cache.
>  
> +IOVA = VA: Support
> +------------------
> +
> +KNI operates in IOVA_VA scheme when
> +
> +- LINUX_VERSION_CODE >= KERNEL_VERSION(4, 6, 0) and
> +- eal option `iova-mode=va` is passed or bus IOVA scheme in the DPDK is selected
> +  as RTE_IOVA_VA.
> +
> +KNI loopback mode tests will have performance impact in this mode due to IOVA to
> +KVA address translations. However, In KNI real world use cases, the performace
> +impact will be based on Linux kernel stack and scheduler latencies. Performance
> +varies based on the KNI use case. If bus iommu scheme is IOVA_DC and KNI module
> +is loaded, DPDK chooses IOVA as PA as existing behaviour.

I understand you are trying to neglect the affect but mentioning the Linux
kernel stack and scheduler latency etc is confusing I think, also I am for
mentioning the availability of the eal "--iova-mode=pa" option, what about
following like more simple message:

"
Due to IOVA to KVA address translations, based on the KNI use case there can be
a performance impact. For mitigation forcing IOVA to PA via eal "--iova-mode=pa"
option can be used.
"

IOVA_DC bus iommu scheme resulting IOVA as PA either can be another paragraph I
think, as a single sentences.
  
David Marchand Nov. 15, 2019, 12:59 p.m. UTC | #2
I can't see an interest in splitting this patch from the kmod update.
Ferruh, what do you think?


On Fri, Nov 15, 2019 at 12:19 PM <vattunuru@marvell.com> wrote:
>
> From: Vamsi Attunuru <vattunuru@marvell.com>
>
> Current KNI implementation only operates in IOVA_PA mode
> patch adds required functionality to enable KNI in
> IOVA_VA mode.
>
> KNI loopback mode tests will have performance impact in
> this mode due to IOVA to KVA address translations.
> However, In KNI real world use cases, the performace

performance

> impact will be based on Linux kernel stack and scheduler
> latencies. Performance varies based on the KNI use case.
> If bus iommu scheme is IOVA_DC and KNI module is loaded,
> DPDK chooses IOVA as PA as existing behaviour.
>
> During KNI creation, app's iova_mode details are passed to
> the KNI kernel module, accordingly kernel module translates
> PA/IOVA addresses to KVA and vice-versa.
>
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>  doc/guides/prog_guide/kernel_nic_interface.rst | 15 +++++++++++++++
>  doc/guides/rel_notes/release_19_11.rst         | 15 ++++++++++++++-
>  lib/librte_eal/linux/eal/eal.c                 | 23 ++++++++++++++++-------
>  lib/librte_kni/rte_kni.c                       |  6 ++++++
>  4 files changed, 51 insertions(+), 8 deletions(-)
>

[snip]

> diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
> index 9e2d50c..53ca84b 100644
> --- a/lib/librte_eal/linux/eal/eal.c
> +++ b/lib/librte_eal/linux/eal/eal.c
> @@ -1086,14 +1086,23 @@ rte_eal_init(int argc, char **argv)
>                         }
>                 }
>  #ifdef RTE_LIBRTE_KNI
> -               /* Workaround for KNI which requires physical address to work */
> -               if (iova_mode == RTE_IOVA_VA &&
> -                               rte_eal_check_module("rte_kni") == 1) {
> -                       if (phys_addrs) {
> +               if (rte_eal_check_module("rte_kni") == 1) {
> +#if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
> +                       if (iova_mode == RTE_IOVA_VA) {
>                                 iova_mode = RTE_IOVA_PA;
> -                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because KNI module is loaded\n");
> -                       } else {
> -                               RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
> +                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because "
> +                                               "Kernel version supports only 'PA' mode for KNI module\n");
> +                       }
> +#endif
> +                       if (rte_bus_get_iommu_class() == RTE_IOVA_DC)
> +                               iova_mode = RTE_IOVA_PA;

If physical addresses are unavailable, this code forces PA anyway.


> +
> +                       if (iova_mode == RTE_IOVA_PA) {
> +                               if (phys_addrs && is_iommu_enabled())
> +                                       RTE_LOG(WARNING, EAL, "Forced IOVA as 'PA' because KNI module is loaded\n");
> +
> +                               if (!phys_addrs)
> +                                       RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
>                         }

Checking physical addresses availability after, and having a log is not enough.


So far, KNI could not work with IOVA as VA.
Your patchset adds support for IOVA as VA if kernel is >= 4.6.
Repeating my proposal (as far as eal.c is concerned) of just changing:

@@ -1085,7 +1085,7 @@ rte_eal_init(int argc, char **argv)
                                RTE_LOG(DEBUG, EAL, "IOMMU is not
available, selecting IOVA as PA mode.\n");
                        }
                }
-#ifdef RTE_LIBRTE_KNI
+#if defined(RTE_LIBRTE_KNI) && KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
                /* Workaround for KNI which requires physical address to work */
                if (iova_mode == RTE_IOVA_VA &&



--
David Marchand
  
Vamsi Krishna Attunuru Nov. 15, 2019, 1:35 p.m. UTC | #3
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, November 15, 2019 6:29 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>
> Cc: dev <dev@dpdk.org>; Thomas Monjalon <thomas@monjalon.net>; Jerin
> Jacob Kollanukkaran <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Olivier Matz <olivier.matz@6wind.com>;
> Burakov, Anatoly <anatoly.burakov@intel.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Stephen Hemminger
> <stephen@networkplumber.org>
> Subject: [EXT] Re: [dpdk-dev] [PATCH v13 2/2] kni: support IOVA mode
> 
> External Email
> 
> ----------------------------------------------------------------------
> I can't see an interest in splitting this patch from the kmod update.
> Ferruh, what do you think?
> 
> 
> On Fri, Nov 15, 2019 at 12:19 PM <vattunuru@marvell.com> wrote:
> >
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > Current KNI implementation only operates in IOVA_PA mode patch adds
> > required functionality to enable KNI in IOVA_VA mode.
> >
> > KNI loopback mode tests will have performance impact in this mode due
> > to IOVA to KVA address translations.
> > However, In KNI real world use cases, the performace
> 
> performance
> 
> > impact will be based on Linux kernel stack and scheduler latencies.
> > Performance varies based on the KNI use case.
> > If bus iommu scheme is IOVA_DC and KNI module is loaded, DPDK chooses
> > IOVA as PA as existing behaviour.
> >
> > During KNI creation, app's iova_mode details are passed to the KNI
> > kernel module, accordingly kernel module translates PA/IOVA addresses
> > to KVA and vice-versa.
> >
> > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> > Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > ---
> >  doc/guides/prog_guide/kernel_nic_interface.rst | 15 +++++++++++++++
> >  doc/guides/rel_notes/release_19_11.rst         | 15 ++++++++++++++-
> >  lib/librte_eal/linux/eal/eal.c                 | 23 ++++++++++++++++-------
> >  lib/librte_kni/rte_kni.c                       |  6 ++++++
> >  4 files changed, 51 insertions(+), 8 deletions(-)
> >
> 
> [snip]
> 
> > diff --git a/lib/librte_eal/linux/eal/eal.c
> > b/lib/librte_eal/linux/eal/eal.c index 9e2d50c..53ca84b 100644
> > --- a/lib/librte_eal/linux/eal/eal.c
> > +++ b/lib/librte_eal/linux/eal/eal.c
> > @@ -1086,14 +1086,23 @@ rte_eal_init(int argc, char **argv)
> >                         }
> >                 }
> >  #ifdef RTE_LIBRTE_KNI
> > -               /* Workaround for KNI which requires physical address to work */
> > -               if (iova_mode == RTE_IOVA_VA &&
> > -                               rte_eal_check_module("rte_kni") == 1) {
> > -                       if (phys_addrs) {
> > +               if (rte_eal_check_module("rte_kni") == 1) { #if
> > +KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
> > +                       if (iova_mode == RTE_IOVA_VA) {
> >                                 iova_mode = RTE_IOVA_PA;
> > -                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because KNI
> module is loaded\n");
> > -                       } else {
> > -                               RTE_LOG(DEBUG, EAL, "KNI can not work since physical
> addresses are unavailable\n");
> > +                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because "
> > +                                               "Kernel version supports only 'PA' mode for KNI
> module\n");
> > +                       }
> > +#endif
> > +                       if (rte_bus_get_iommu_class() == RTE_IOVA_DC)
> > +                               iova_mode = RTE_IOVA_PA;
> 
> If physical addresses are unavailable, this code forces PA anyway.
> 
> 
> > +
> > +                       if (iova_mode == RTE_IOVA_PA) {
> > +                               if (phys_addrs && is_iommu_enabled())
> > +                                       RTE_LOG(WARNING, EAL, "Forced
> > + IOVA as 'PA' because KNI module is loaded\n");
> > +
> > +                               if (!phys_addrs)
> > +                                       RTE_LOG(DEBUG, EAL, "KNI can
> > + not work since physical addresses are unavailable\n");
> >                         }
> 
> Checking physical addresses availability after, and having a log is not enough.
> 
> 
> So far, KNI could not work with IOVA as VA.
> Your patchset adds support for IOVA as VA if kernel is >= 4.6.
> Repeating my proposal (as far as eal.c is concerned) of just changing:

To keep the existing behavior intact when bus iommu returns IOVA_DC, had to handle those case also here.

How about below scheme:

#ifdef RTE_LIBRTE_KNI
-               /* Workaround for KNI which requires physical address to work */
-               if (iova_mode == RTE_IOVA_VA &&
-                               rte_eal_check_module("rte_kni") == 1) {
-                       if (phys_addrs) {
-                               iova_mode = RTE_IOVA_PA;
-                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because KNI module is loaded\n");
-                       } else {
-                               RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
+               if (rte_eal_check_module("rte_kni") == 1) {
+                       bool force_iova_as_pa = false;
+#if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
+                       if (iova_mode == RTE_IOVA_VA) {
+                               force_iova_as_pa = true;
+                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because "
+                                               "Kernel version supports only 'PA' mode for KNI module\n");
+                       }
+#endif
+                       if (rte_bus_get_iommu_class() == RTE_IOVA_DC)
+                               force_iova_as_pa = true;
+
+                       if (force_iova_as_pa) {
+                               if (phys_addrs) {
+                                       iova_mode = RTE_IOVA_PA;
+                                       RTE_LOG(WARNING, EAL, "Forced IOVA as 'PA' because KNI module is loaded\n");
+                               } else
+                                       RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
                        }
                }
 #endif



> 
> @@ -1085,7 +1085,7 @@ rte_eal_init(int argc, char **argv)
>                                 RTE_LOG(DEBUG, EAL, "IOMMU is not available, selecting
> IOVA as PA mode.\n");
>                         }
>                 }
> -#ifdef RTE_LIBRTE_KNI
> +#if defined(RTE_LIBRTE_KNI) && KERNEL_VERSION(4, 6, 0) >
> +LINUX_VERSION_CODE
>                 /* Workaround for KNI which requires physical address to work */
>                 if (iova_mode == RTE_IOVA_VA &&
> 
> 
> 
> --
> David Marchand
  
David Marchand Nov. 15, 2019, 1:40 p.m. UTC | #4
On Fri, Nov 15, 2019 at 2:36 PM Vamsi Krishna Attunuru
<vattunuru@marvell.com> wrote:
>
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Friday, November 15, 2019 6:29 PM
> > To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; Yigit, Ferruh
> > <ferruh.yigit@intel.com>
> > Cc: dev <dev@dpdk.org>; Thomas Monjalon <thomas@monjalon.net>; Jerin
> > Jacob Kollanukkaran <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> > <kirankumark@marvell.com>; Olivier Matz <olivier.matz@6wind.com>;
> > Burakov, Anatoly <anatoly.burakov@intel.com>; Andrew Rybchenko
> > <arybchenko@solarflare.com>; Stephen Hemminger
> > <stephen@networkplumber.org>
> > Subject: [EXT] Re: [dpdk-dev] [PATCH v13 2/2] kni: support IOVA mode
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > I can't see an interest in splitting this patch from the kmod update.
> > Ferruh, what do you think?
> >
> >
> > On Fri, Nov 15, 2019 at 12:19 PM <vattunuru@marvell.com> wrote:
> > >
> > > From: Vamsi Attunuru <vattunuru@marvell.com>
> > >
> > > Current KNI implementation only operates in IOVA_PA mode patch adds
> > > required functionality to enable KNI in IOVA_VA mode.
> > >
> > > KNI loopback mode tests will have performance impact in this mode due
> > > to IOVA to KVA address translations.
> > > However, In KNI real world use cases, the performace
> >
> > performance
> >
> > > impact will be based on Linux kernel stack and scheduler latencies.
> > > Performance varies based on the KNI use case.
> > > If bus iommu scheme is IOVA_DC and KNI module is loaded, DPDK chooses
> > > IOVA as PA as existing behaviour.
> > >
> > > During KNI creation, app's iova_mode details are passed to the KNI
> > > kernel module, accordingly kernel module translates PA/IOVA addresses
> > > to KVA and vice-versa.
> > >
> > > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > > Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> > > Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > > ---
> > >  doc/guides/prog_guide/kernel_nic_interface.rst | 15 +++++++++++++++
> > >  doc/guides/rel_notes/release_19_11.rst         | 15 ++++++++++++++-
> > >  lib/librte_eal/linux/eal/eal.c                 | 23 ++++++++++++++++-------
> > >  lib/librte_kni/rte_kni.c                       |  6 ++++++
> > >  4 files changed, 51 insertions(+), 8 deletions(-)
> > >
> >
> > [snip]
> >
> > > diff --git a/lib/librte_eal/linux/eal/eal.c
> > > b/lib/librte_eal/linux/eal/eal.c index 9e2d50c..53ca84b 100644
> > > --- a/lib/librte_eal/linux/eal/eal.c
> > > +++ b/lib/librte_eal/linux/eal/eal.c
> > > @@ -1086,14 +1086,23 @@ rte_eal_init(int argc, char **argv)
> > >                         }
> > >                 }
> > >  #ifdef RTE_LIBRTE_KNI
> > > -               /* Workaround for KNI which requires physical address to work */
> > > -               if (iova_mode == RTE_IOVA_VA &&
> > > -                               rte_eal_check_module("rte_kni") == 1) {
> > > -                       if (phys_addrs) {
> > > +               if (rte_eal_check_module("rte_kni") == 1) { #if
> > > +KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
> > > +                       if (iova_mode == RTE_IOVA_VA) {
> > >                                 iova_mode = RTE_IOVA_PA;
> > > -                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because KNI
> > module is loaded\n");
> > > -                       } else {
> > > -                               RTE_LOG(DEBUG, EAL, "KNI can not work since physical
> > addresses are unavailable\n");
> > > +                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because "
> > > +                                               "Kernel version supports only 'PA' mode for KNI
> > module\n");
> > > +                       }
> > > +#endif
> > > +                       if (rte_bus_get_iommu_class() == RTE_IOVA_DC)
> > > +                               iova_mode = RTE_IOVA_PA;
> >
> > If physical addresses are unavailable, this code forces PA anyway.
> >
> >
> > > +
> > > +                       if (iova_mode == RTE_IOVA_PA) {
> > > +                               if (phys_addrs && is_iommu_enabled())
> > > +                                       RTE_LOG(WARNING, EAL, "Forced
> > > + IOVA as 'PA' because KNI module is loaded\n");
> > > +
> > > +                               if (!phys_addrs)
> > > +                                       RTE_LOG(DEBUG, EAL, "KNI can
> > > + not work since physical addresses are unavailable\n");
> > >                         }
> >
> > Checking physical addresses availability after, and having a log is not enough.
> >
> >
> > So far, KNI could not work with IOVA as VA.
> > Your patchset adds support for IOVA as VA if kernel is >= 4.6.
> > Repeating my proposal (as far as eal.c is concerned) of just changing:
>
> To keep the existing behavior intact when bus iommu returns IOVA_DC, had to handle those case also here.

No.
If you want to change something, do it in a separate patch with full
explanation.
  
Jerin Jacob Nov. 15, 2019, 1:40 p.m. UTC | #5
On Fri, Nov 15, 2019 at 6:29 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> I can't see an interest in splitting this patch from the kmod update.
> Ferruh, what do you think?
>
>
> On Fri, Nov 15, 2019 at 12:19 PM <vattunuru@marvell.com> wrote:
> >
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > Current KNI implementation only operates in IOVA_PA mode
> > patch adds required functionality to enable KNI in
> > IOVA_VA mode.
> >
> > KNI loopback mode tests will have performance impact in
> > this mode due to IOVA to KVA address translations.
> > However, In KNI real world use cases, the performace
>
> performance
>
> > impact will be based on Linux kernel stack and scheduler
> > latencies. Performance varies based on the KNI use case.
> > If bus iommu scheme is IOVA_DC and KNI module is loaded,
> > DPDK chooses IOVA as PA as existing behaviour.
> >
> > During KNI creation, app's iova_mode details are passed to
> > the KNI kernel module, accordingly kernel module translates
> > PA/IOVA addresses to KVA and vice-versa.
> >
> > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
> > Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > ---
> >  doc/guides/prog_guide/kernel_nic_interface.rst | 15 +++++++++++++++
> >  doc/guides/rel_notes/release_19_11.rst         | 15 ++++++++++++++-
> >  lib/librte_eal/linux/eal/eal.c                 | 23 ++++++++++++++++-------
> >  lib/librte_kni/rte_kni.c                       |  6 ++++++
> >  4 files changed, 51 insertions(+), 8 deletions(-)
> >
>
> [snip]
>
> > diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
> > index 9e2d50c..53ca84b 100644
> > --- a/lib/librte_eal/linux/eal/eal.c
> > +++ b/lib/librte_eal/linux/eal/eal.c
> > @@ -1086,14 +1086,23 @@ rte_eal_init(int argc, char **argv)
> >                         }
> >                 }
> >  #ifdef RTE_LIBRTE_KNI
> > -               /* Workaround for KNI which requires physical address to work */
> > -               if (iova_mode == RTE_IOVA_VA &&
> > -                               rte_eal_check_module("rte_kni") == 1) {
> > -                       if (phys_addrs) {
> > +               if (rte_eal_check_module("rte_kni") == 1) {
> > +#if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
> > +                       if (iova_mode == RTE_IOVA_VA) {
> >                                 iova_mode = RTE_IOVA_PA;
> > -                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because KNI module is loaded\n");
> > -                       } else {
> > -                               RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
> > +                               RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because "
> > +                                               "Kernel version supports only 'PA' mode for KNI module\n");
> > +                       }
> > +#endif
> > +                       if (rte_bus_get_iommu_class() == RTE_IOVA_DC)
> > +                               iova_mode = RTE_IOVA_PA;
>
> If physical addresses are unavailable, this code forces PA anyway.
>
>
> > +
> > +                       if (iova_mode == RTE_IOVA_PA) {
> > +                               if (phys_addrs && is_iommu_enabled())
> > +                                       RTE_LOG(WARNING, EAL, "Forced IOVA as 'PA' because KNI module is loaded\n");
> > +
> > +                               if (!phys_addrs)
> > +                                       RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
> >                         }
>
> Checking physical addresses availability after, and having a log is not enough.
>
>
> So far, KNI could not work with IOVA as VA.
> Your patchset adds support for IOVA as VA if kernel is >= 4.6.
> Repeating my proposal (as far as eal.c is concerned) of just changing:

We need achive the following.

IOVA as PA  has performance implication on KNI case. So we need to
make IOVA as PA when KNI module is loaded.
Your suggestion makes IOVA as PA when bus return IOVA as VA due the
fact the KNI check is latter. We need to move that up.

How about the following

[master]dell[dpdk.org] $ git diff
diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
index 9e2d50cfb..085fde767 100644
--- a/lib/librte_eal/linux/eal/eal.c
+++ b/lib/librte_eal/linux/eal/eal.c
@@ -1064,6 +1064,21 @@ rte_eal_init(int argc, char **argv)
                /* autodetect the IOVA mapping mode */
                enum rte_iova_mode iova_mode = rte_bus_get_iommu_class();

+#if defined(RTE_LIBRTE_KNI) && KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
+               /* IOVA as PA gives better performance for KNI. Choose IOVA as
+                 * PA when bus returns RTE_IOVA_DC and KNI module is present.
+                 */
+               if (iova_mode == RTE_IOVA_DC &&
+                               rte_eal_check_module("rte_kni") == 1) {
+                       if (phys_addrs) {
+                               iova_mode = RTE_IOVA_PA;
+                               RTE_LOG(WARNING, EAL, "Forcing IOVA as
'PA' because KNI module is loaded\n");
+                       } else {
+                               RTE_LOG(DEBUG, EAL, "KNI can not work
since physical addresses are unavailable\n");
+                       }
+               }
+#endif
+
                if (iova_mode == RTE_IOVA_DC) {
                        RTE_LOG(DEBUG, EAL, "Buses did not request a
specific IOVA mode.\n");

@@ -1085,18 +1100,6 @@ rte_eal_init(int argc, char **argv)
                                RTE_LOG(DEBUG, EAL, "IOMMU is not
available, selecting IOVA as PA mode.\n");
                        }
                }
-#ifdef RTE_LIBRTE_KNI
-               /* Workaround for KNI which requires physical address to work */
-               if (iova_mode == RTE_IOVA_VA &&
-                               rte_eal_check_module("rte_kni") == 1) {
-                       if (phys_addrs) {
-                               iova_mode = RTE_IOVA_PA;
-                               RTE_LOG(WARNING, EAL, "Forcing IOVA as
'PA' because KNI module is loaded\n");
-                       } else {
-                               RTE_LOG(DEBUG, EAL, "KNI can not work
since physical addresses are unavailable\n");
-                       }
-               }
-#endif
                rte_eal_get_configuration()->iova_mode = iova_mode;
        } else {
                rte_eal_get_configuration()->iova_mode =
[master]dell[dpdk.org] $
>
> @@ -1085,7 +1085,7 @@ rte_eal_init(int argc, char **argv)
>                                 RTE_LOG(DEBUG, EAL, "IOMMU is not
> available, selecting IOVA as PA mode.\n");
>                         }
>                 }
> -#ifdef RTE_LIBRTE_KNI
> +#if defined(RTE_LIBRTE_KNI) && KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
>                 /* Workaround for KNI which requires physical address to work */
>                 if (iova_mode == RTE_IOVA_VA &&
>
>
>
> --
> David Marchand
>
  
David Marchand Nov. 15, 2019, 2:56 p.m. UTC | #6
On Fri, Nov 15, 2019 at 2:40 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> On Fri, Nov 15, 2019 at 6:29 PM David Marchand
> <david.marchand@redhat.com> wrote:
> > So far, KNI could not work with IOVA as VA.
> > Your patchset adds support for IOVA as VA if kernel is >= 4.6.
>
> We need achive the following.
>
> IOVA as PA  has performance implication on KNI case. So we need to
> make IOVA as PA when KNI module is loaded.

- The current behaviour is:
  * buses announce PA, nothing to do wrt KNI,
  * buses announce VA or DC (whatever the considerations on iommu), if
physical addresses are available, then PA is used and KNI works,

- Now, with this new feature (on kernels >= 4.6), we can have KNI work
with VA, the previous workaround can be skipped.
There is another consideration wrt performance, as a consequence, for
kernels 4.6, if physical addresses are available, we can select PA for
KNI.

_But_ I did not see people complaining about the current behavior.
I see no reason to change this if the VA support can't be used (kernels < 4.6).

So how about (I inverted the #if LINUX_VERSION_CODE >=
KERNEL_VERSION(4, 6, 0), it was causing me headaches):

diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
index 9e2d50cfb..33f3c674c 100644
--- a/lib/librte_eal/linux/eal/eal.c
+++ b/lib/librte_eal/linux/eal/eal.c
@@ -1073,6 +1073,11 @@ rte_eal_init(int argc, char **argv)
                                 */
                                iova_mode = RTE_IOVA_VA;
                                RTE_LOG(DEBUG, EAL, "Physical
addresses are unavailable, selecting IOVA as VA mode.\n");
+#if defined(RTE_LIBRTE_KNI) && LINUX_VERSION_CODE >= KERNEL_VERSION(4, 6, 0)
+                       } else if (rte_eal_check_module("rte_kni") == 1) {
+                               iova_mode = RTE_IOVA_PA;
+                               RTE_LOG(DEBUG, EAL, "Forcing IOVA as
'PA' for better perfomance with KNI\n");
+#endif
                        } else if (is_iommu_enabled()) {
                                /* we have an IOMMU, pick IOVA as VA mode */
                                iova_mode = RTE_IOVA_VA;
@@ -1085,7 +1090,7 @@ rte_eal_init(int argc, char **argv)
                                RTE_LOG(DEBUG, EAL, "IOMMU is not
available, selecting IOVA as PA mode.\n");
                        }
                }
-#ifdef RTE_LIBRTE_KNI
-               /* Workaround for KNI which requires physical address to work */
+#if defined(RTE_LIBRTE_KNI) && LINUX_VERSION_CODE < KERNEL_VERSION(4, 6, 0)
+               /* Workaround for KNI which requires physical address
to work with kernels < 4.6 */
                if (iova_mode == RTE_IOVA_VA &&
                                rte_eal_check_module("rte_kni") == 1) {
  
Jerin Jacob Nov. 15, 2019, 3:22 p.m. UTC | #7
On Fri, Nov 15, 2019 at 8:27 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> On Fri, Nov 15, 2019 at 2:40 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > On Fri, Nov 15, 2019 at 6:29 PM David Marchand
> > <david.marchand@redhat.com> wrote:
> > > So far, KNI could not work with IOVA as VA.
> > > Your patchset adds support for IOVA as VA if kernel is >= 4.6.
> >
> > We need achive the following.
> >
> > IOVA as PA  has performance implication on KNI case. So we need to
> > make IOVA as PA when KNI module is loaded.
>
> - The current behaviour is:
>   * buses announce PA, nothing to do wrt KNI,
>   * buses announce VA or DC (whatever the considerations on iommu), if
> physical addresses are available, then PA is used and KNI works,
>
> - Now, with this new feature (on kernels >= 4.6), we can have KNI work
> with VA, the previous workaround can be skipped.
> There is another consideration wrt performance, as a consequence, for
> kernels 4.6, if physical addresses are available, we can select PA for
> KNI.
>
> _But_ I did not see people complaining about the current behavior.
> I see no reason to change this if the VA support can't be used (kernels < 4.6).
>
> So how about (I inverted the #if LINUX_VERSION_CODE >=
> KERNEL_VERSION(4, 6, 0), it was causing me headaches):


That works too David. Thanks for the review.

>
> diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
> index 9e2d50cfb..33f3c674c 100644
> --- a/lib/librte_eal/linux/eal/eal.c
> +++ b/lib/librte_eal/linux/eal/eal.c
> @@ -1073,6 +1073,11 @@ rte_eal_init(int argc, char **argv)
>                                  */
>                                 iova_mode = RTE_IOVA_VA;
>                                 RTE_LOG(DEBUG, EAL, "Physical
> addresses are unavailable, selecting IOVA as VA mode.\n");
> +#if defined(RTE_LIBRTE_KNI) && LINUX_VERSION_CODE >= KERNEL_VERSION(4, 6, 0)
> +                       } else if (rte_eal_check_module("rte_kni") == 1) {
> +                               iova_mode = RTE_IOVA_PA;
> +                               RTE_LOG(DEBUG, EAL, "Forcing IOVA as
> 'PA' for better perfomance with KNI\n");
> +#endif
>                         } else if (is_iommu_enabled()) {
>                                 /* we have an IOMMU, pick IOVA as VA mode */
>                                 iova_mode = RTE_IOVA_VA;
> @@ -1085,7 +1090,7 @@ rte_eal_init(int argc, char **argv)
>                                 RTE_LOG(DEBUG, EAL, "IOMMU is not
> available, selecting IOVA as PA mode.\n");
>                         }
>                 }
> -#ifdef RTE_LIBRTE_KNI
> -               /* Workaround for KNI which requires physical address to work */
> +#if defined(RTE_LIBRTE_KNI) && LINUX_VERSION_CODE < KERNEL_VERSION(4, 6, 0)
> +               /* Workaround for KNI which requires physical address
> to work with kernels < 4.6 */
>                 if (iova_mode == RTE_IOVA_VA &&
>                                 rte_eal_check_module("rte_kni") == 1) {
>
>
> --
> David Marchand
>
  

Patch

diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst b/doc/guides/prog_guide/kernel_nic_interface.rst
index 2fd58e1..3bed18f 100644
--- a/doc/guides/prog_guide/kernel_nic_interface.rst
+++ b/doc/guides/prog_guide/kernel_nic_interface.rst
@@ -300,6 +300,21 @@  The sk_buff is then freed and the mbuf sent in the tx_q FIFO.
 The DPDK TX thread dequeues the mbuf and sends it to the PMD via ``rte_eth_tx_burst()``.
 It then puts the mbuf back in the cache.
 
+IOVA = VA: Support
+------------------
+
+KNI operates in IOVA_VA scheme when
+
+- LINUX_VERSION_CODE >= KERNEL_VERSION(4, 6, 0) and
+- eal option `iova-mode=va` is passed or bus IOVA scheme in the DPDK is selected
+  as RTE_IOVA_VA.
+
+KNI loopback mode tests will have performance impact in this mode due to IOVA to
+KVA address translations. However, In KNI real world use cases, the performace
+impact will be based on Linux kernel stack and scheduler latencies. Performance
+varies based on the KNI use case. If bus iommu scheme is IOVA_DC and KNI module
+is loaded, DPDK chooses IOVA as PA as existing behaviour.
+
 Ethtool
 -------
 
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 682c1bd..c207c93 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -292,8 +292,21 @@  New Features
   compilers store their internal representation of the source code that
   the linker uses at the final stage of compilation process.
 
-  See :doc:`../prog_guide/lto` for more information:
+* **Added IOVA as VA support for KNI.**
+
+  * Added IOVA = VA support for KNI, KNI can operate in IOVA = VA mode when
+    `iova-mode=va` eal option is passed to the application or when bus IOVA
+    scheme is selected as RTE_IOVA_VA. This mode only works on Linux Kernel
+    versions 4.6.0 and above.
 
+  * KNI loopback mode tests will have performance impact in this mode due
+    to IOVA to KVA address translations. However, In KNI real world use cases,
+    the performace impact will be based on Linux kernel stack and scheduler
+    latencies. Performance varies based on the KNI use case. If bus iommu
+    scheme is IOVA_DC and KNI module is loaded, DPDK chooses IOVA as PA as
+    existing behaviour.
+
+  See :doc:`../prog_guide/lto` for more information:
 
 
 Removed Items
diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
index 9e2d50c..53ca84b 100644
--- a/lib/librte_eal/linux/eal/eal.c
+++ b/lib/librte_eal/linux/eal/eal.c
@@ -1086,14 +1086,23 @@  rte_eal_init(int argc, char **argv)
 			}
 		}
 #ifdef RTE_LIBRTE_KNI
-		/* Workaround for KNI which requires physical address to work */
-		if (iova_mode == RTE_IOVA_VA &&
-				rte_eal_check_module("rte_kni") == 1) {
-			if (phys_addrs) {
+		if (rte_eal_check_module("rte_kni") == 1) {
+#if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
+			if (iova_mode == RTE_IOVA_VA) {
 				iova_mode = RTE_IOVA_PA;
-				RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because KNI module is loaded\n");
-			} else {
-				RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
+				RTE_LOG(WARNING, EAL, "Forcing IOVA as 'PA' because "
+						"Kernel version supports only 'PA' mode for KNI module\n");
+			}
+#endif
+			if (rte_bus_get_iommu_class() == RTE_IOVA_DC)
+				iova_mode = RTE_IOVA_PA;
+
+			if (iova_mode == RTE_IOVA_PA) {
+				if (phys_addrs && is_iommu_enabled())
+					RTE_LOG(WARNING, EAL, "Forced IOVA as 'PA' because KNI module is loaded\n");
+
+				if (!phys_addrs)
+					RTE_LOG(DEBUG, EAL, "KNI can not work since physical addresses are unavailable\n");
 			}
 		}
 #endif
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 7fbcf22..a609d2f 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -6,6 +6,8 @@ 
 #error "KNI is not supported"
 #endif
 
+#include <linux/version.h>
+
 #include <string.h>
 #include <fcntl.h>
 #include <unistd.h>
@@ -97,10 +99,12 @@  static volatile int kni_fd = -1;
 int
 rte_kni_init(unsigned int max_kni_ifaces __rte_unused)
 {
+#if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
 	if (rte_eal_iova_mode() != RTE_IOVA_PA) {
 		RTE_LOG(ERR, KNI, "KNI requires IOVA as PA\n");
 		return -1;
 	}
+#endif
 
 	/* Check FD and open */
 	if (kni_fd < 0) {
@@ -302,6 +306,8 @@  rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 	kni->group_id = conf->group_id;
 	kni->mbuf_size = conf->mbuf_size;
 
+	dev_info.iova_mode = (rte_eal_iova_mode() == RTE_IOVA_VA) ? 1 : 0;
+
 	ret = ioctl(kni_fd, RTE_KNI_IOCTL_CREATE, &dev_info);
 	if (ret < 0)
 		goto ioctl_fail;