[v6,0/4] add IOVA = VA support in KNI

Message ID 20190625035700.2953-1-vattunuru@marvell.com

Message

Vamsi Krishna Attunuru June 25, 2019, 3:56 a.m. UTC
From: Vamsi Attunuru <vattunuru@marvell.com>

----
V6 changes:
* Added a new mempool flag to ensure mbuf memory is not scattered
across page boundaries.
* Added the PCI device information required by the KNI kernel module.
* Modified the KNI example application to create its mempool with the
new mempool flag.

V5 changes:
* Fixed a build issue with the 32-bit build.

V4 changes:
* Fixed build issues with older kernel versions.
* This approach only works with kernels above 4.4.0.

V3 changes:
* Added a new approach to make KNI work in IOVA=VA mode using the
iommu_iova_to_phys API.

Kiran Kumar K (1):
  kernel/linux/kni: add IOVA support in kni module

Vamsi Attunuru (3):
  lib/mempool: skip populating mempool objs that falls on page
    boundaries
  lib/kni: add PCI related information
  example/kni: add IOVA support for kni application

 examples/kni/main.c                               | 53 +++++++++++++++-
 kernel/linux/kni/kni_dev.h                        |  3 +
 kernel/linux/kni/kni_misc.c                       | 62 +++++++++++++++---
 kernel/linux/kni/kni_net.c                        | 76 +++++++++++++++++++----
 lib/librte_eal/linux/eal/eal.c                    |  8 ---
 lib/librte_eal/linux/eal/include/rte_kni_common.h |  8 +++
 lib/librte_kni/rte_kni.c                          |  7 +++
 lib/librte_mempool/rte_mempool.c                  |  2 +-
 lib/librte_mempool/rte_mempool.h                  |  2 +
 lib/librte_mempool/rte_mempool_ops_default.c      | 30 +++++++++
 10 files changed, 219 insertions(+), 32 deletions(-)
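
A minimal sketch of the idea behind the new mempool flag, assuming a flat memory region and a power-of-two page size. The function names here are hypothetical and this is not the code from the patch: an object that would straddle a page boundary is simply started at the next page instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if [addr, addr + obj_sz) touches more than one page.
 * pg_sz must be a power of two. */
static int
obj_crosses_page(uintptr_t addr, size_t obj_sz, size_t pg_sz)
{
	uintptr_t mask = ~((uintptr_t)pg_sz - 1);

	return (addr & mask) != ((addr + obj_sz - 1) & mask);
}

/* Count the objects that fit in [start, start + len) when an object
 * that would straddle a page boundary is moved to the next page start. */
static size_t
count_objs_no_page_cross(uintptr_t start, size_t len, size_t obj_sz,
			 size_t pg_sz)
{
	uintptr_t addr = start;
	uintptr_t end = start + len;
	size_t n = 0;

	assert(pg_sz != 0 && (pg_sz & (pg_sz - 1)) == 0);
	assert(obj_sz > 0 && obj_sz <= pg_sz);

	while (addr + obj_sz <= end) {
		if (obj_crosses_page(addr, obj_sz, pg_sz)) {
			/* Skip the tail of this page. */
			addr = (addr & ~((uintptr_t)pg_sz - 1)) + pg_sz;
			continue;
		}
		n++;
		addr += obj_sz;
	}
	return n;
}
```

With two 4K pages and 1500-byte objects this places four objects, where dense packing would place five.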

Comments

Burakov, Anatoly June 25, 2019, 10 a.m. UTC | #1
On 25-Jun-19 4:56 AM, vattunuru@marvell.com wrote:
> From: Vamsi Attunuru <vattunuru@marvell.com>
> 
> ----
> V6 Changes:
> * Added new mempool flag to ensure mbuf memory is not scattered
> across page boundaries.
> * Added KNI kernel module required PCI device information.
> * Modified KNI example application to create mempool with new
> mempool flag.
> 
Others can chime in, but my 2 cents: this reduces the usefulness of KNI 
because it limits the kinds of mempools one can use it with, and makes 
it so that code that works with every other PMD requires changes to 
work with KNI.
Jerin Jacob Kollanukkaran June 25, 2019, 11:15 a.m. UTC | #2
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Burakov, Anatoly
> Sent: Tuesday, June 25, 2019 3:30 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com;
> arybchenko@solarflare.com
> Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
> 
> On 25-Jun-19 4:56 AM, vattunuru@marvell.com wrote:
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > ----
> > V6 Changes:
> > * Added new mempool flag to ensure mbuf memory is not scattered across
> > page boundaries.
> > * Added KNI kernel module required PCI device information.
> > * Modified KNI example application to create mempool with new mempool
> > flag.
> >
> Others can chime in, but my 2 cents: this reduces the usefulness of KNI because
> it limits the kinds of mempools one can use them with, and makes it so that the
> code that works with every other PMD requires changes to work with KNI.

# One option is to make this flag the default only for packet mempools (do not allow an object to be allocated across a page boundary).
In the real world the overhead will be minimal, considering the huge page size is 1G or 512M.
# Enable this flag explicitly only for IOVA = VA mode, in the library. There is no need to expose it to the application.
# I don't think any PMD-specific change is needed to make KNI work in IOVA = VA mode.
# No preference on whether the flag is passed by the application or set in the library. But IMO this change is
needed in mempool to support KNI in IOVA = VA mode.
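
A rough way to check the overhead claim, assuming objects are packed from a page-aligned start. The helper name and the 2304-byte object size are illustrative assumptions, not values from the patch: the bytes lost per page are the remainder of the page size divided by the object size.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes left unused at the end of each page when objects of obj_sz
 * bytes are packed from a page-aligned start and never cross a page
 * boundary. */
static size_t
page_tail_waste(size_t pg_sz, size_t obj_sz)
{
	assert(obj_sz > 0 && obj_sz <= pg_sz);
	return pg_sz % obj_sz;
}
```

For a 2304-byte object this gives 512 bytes per 2M page, 2048 bytes per 512M page and 1792 bytes per 1G page, all well under 0.1% of the page. On a 4K page the same object wastes 1792 bytes, which is 43.75% of the page.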



> 
> --
> Thanks,
> Anatoly
Burakov, Anatoly June 25, 2019, 11:30 a.m. UTC | #3
On 25-Jun-19 12:15 PM, Jerin Jacob Kollanukkaran wrote:
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Burakov, Anatoly
>> Sent: Tuesday, June 25, 2019 3:30 PM
>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
>> Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com;
>> arybchenko@solarflare.com
>> Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
>>
>> On 25-Jun-19 4:56 AM, vattunuru@marvell.com wrote:
>>> From: Vamsi Attunuru <vattunuru@marvell.com>
>>>
>>> ----
>>> V6 Changes:
>>> * Added new mempool flag to ensure mbuf memory is not scattered across
>>> page boundaries.
>>> * Added KNI kernel module required PCI device information.
>>> * Modified KNI example application to create mempool with new mempool
>>> flag.
>>>
>> Others can chime in, but my 2 cents: this reduces the usefulness of KNI because
>> it limits the kinds of mempools one can use them with, and makes it so that the
>> code that works with every other PMD requires changes to work with KNI.
> 
> # One option to make this flag as default only for packet mempool(not allow allocate on page boundary).
> In real world the overhead will be very minimal considering Huge page size is 1G or 512M
> # Enable this flag explicitly only IOVA = VA mode in library. Not  need to expose to application
> # I don’t think, there needs to be any PMD specific change to make KNI with IOVA = VA mode
> # No preference on flags to be passed by application vs in library. But IMO this change would be
> needed in mempool support KNI in IOVA = VA mode.
> 

I would be OK with just making it the default behavior to not cross page 
boundaries when allocating buffers. This would solve the problem for KNI 
and for any other use case that relies on PA-contiguous buffers in 
the face of IOVA as VA mode.

We could also add a flag to explicitly allow page crossing without also 
making mbufs IOVA-non-contiguous, but I'm not sure if there are use 
cases that would benefit from this.

> 
> 
>>
>> --
>> Thanks,
>> Anatoly
Burakov, Anatoly June 25, 2019, 1:38 p.m. UTC | #4
On 25-Jun-19 12:30 PM, Burakov, Anatoly wrote:
> On 25-Jun-19 12:15 PM, Jerin Jacob Kollanukkaran wrote:
>>> -----Original Message-----
>>> From: dev <dev-bounces@dpdk.org> On Behalf Of Burakov, Anatoly
>>> Sent: Tuesday, June 25, 2019 3:30 PM
>>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
>>> Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com;
>>> arybchenko@solarflare.com
>>> Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
>>>
>>> On 25-Jun-19 4:56 AM, vattunuru@marvell.com wrote:
>>>> From: Vamsi Attunuru <vattunuru@marvell.com>
>>>>
>>>> ----
>>>> V6 Changes:
>>>> * Added new mempool flag to ensure mbuf memory is not scattered across
>>>> page boundaries.
>>>> * Added KNI kernel module required PCI device information.
>>>> * Modified KNI example application to create mempool with new mempool
>>>> flag.
>>>>
>>> Others can chime in, but my 2 cents: this reduces the usefulness of 
>>> KNI because
>>> it limits the kinds of mempools one can use them with, and makes it 
>>> so that the
>>> code that works with every other PMD requires changes to work with KNI.
>>
>> # One option to make this flag as default only for packet mempool(not 
>> allow allocate on page boundary).
>> In real world the overhead will be very minimal considering Huge page 
>> size is 1G or 512M
>> # Enable this flag explicitly only IOVA = VA mode in library. Not  
>> need to expose to application
>> # I don’t think, there needs to be any PMD specific change to make KNI 
>> with IOVA = VA mode
>> # No preference on flags to be passed by application vs in library. 
>> But IMO this change would be
>> needed in mempool support KNI in IOVA = VA mode.
>>
> 
> I would be OK to just make it default behavior to not cross page 
> boundaries when allocating buffers. This would solve the problem for KNI 
> and for any other use case that would rely on PA-contiguous buffers in 
> face of IOVA as VA mode.
> 
> We could also add a flag to explicitly allow page crossing without also 
> making mbufs IOVA-non-contiguous, but i'm not sure if there are use 
> cases that would benefit from this.

On second thought, such a default would break on 4K pages for 
packets bigger than the page size (i.e. jumbo frames). Should we care?
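
To make the jumbo frame concern concrete, here is a small sketch with hypothetical helper names and an assumed 9216-byte object: an object larger than the page has to touch several pages whatever its alignment, so a "never cross a page boundary" rule cannot be satisfied for it.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum number of pages an object must touch, whatever its alignment. */
static size_t
min_pages_spanned(size_t obj_sz, size_t pg_sz)
{
	assert(obj_sz > 0 && pg_sz > 0);
	return (obj_sz + pg_sz - 1) / pg_sz;
}

/* An object can be kept inside one page only if it is no larger
 * than the page. */
static int
fits_in_one_page(size_t obj_sz, size_t pg_sz)
{
	return min_pages_spanned(obj_sz, pg_sz) == 1;
}
```

A 9216-byte object needs at least three 4K pages, while the same object fits inside a single 2M hugepage.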

> 
>>
>>
>>>
>>> -- 
>>> Thanks,
>>> Anatoly
> 
>
Jerin Jacob Kollanukkaran June 27, 2019, 9:34 a.m. UTC | #5
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Tuesday, June 25, 2019 7:09 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Vamsi Krishna Attunuru
> <vattunuru@marvell.com>; dev@dpdk.org
> Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com;
> arybchenko@solarflare.com
> Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
> 
> On 25-Jun-19 12:30 PM, Burakov, Anatoly wrote:
> > On 25-Jun-19 12:15 PM, Jerin Jacob Kollanukkaran wrote:
> >>> -----Original Message-----
> >>> From: dev <dev-bounces@dpdk.org> On Behalf Of Burakov, Anatoly
> >>> Sent: Tuesday, June 25, 2019 3:30 PM
> >>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>; dev@dpdk.org
> >>> Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com;
> >>> arybchenko@solarflare.com
> >>> Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
> >>>
> >>> On 25-Jun-19 4:56 AM, vattunuru@marvell.com wrote:
> >>>> From: Vamsi Attunuru <vattunuru@marvell.com>
> >>>>
> >>>> ----
> >>>> V6 Changes:
> >>>> * Added new mempool flag to ensure mbuf memory is not scattered
> >>>> across page boundaries.
> >>>> * Added KNI kernel module required PCI device information.
> >>>> * Modified KNI example application to create mempool with new
> >>>> mempool flag.
> >>>>
> >>> Others can chime in, but my 2 cents: this reduces the usefulness of
> >>> KNI because it limits the kinds of mempools one can use them with,
> >>> and makes it so that the code that works with every other PMD
> >>> requires changes to work with KNI.
> >>
> >> # One option to make this flag as default only for packet mempool(not
> >> allow allocate on page boundary).
> >> In real world the overhead will be very minimal considering Huge page
> >> size is 1G or 512M # Enable this flag explicitly only IOVA = VA mode
> >> in library. Not need to expose to application # I don’t think, there
> >> needs to be any PMD specific change to make KNI with IOVA = VA mode #
> >> No preference on flags to be passed by application vs in library.
> >> But IMO this change would be
> >> needed in mempool support KNI in IOVA = VA mode.
> >>
> >
> > I would be OK to just make it default behavior to not cross page
> > boundaries when allocating buffers. This would solve the problem for
> > KNI and for any other use case that would rely on PA-contiguous
> > buffers in face of IOVA as VA mode.
> >
> > We could also add a flag to explicitly allow page crossing without
> > also making mbufs IOVA-non-contiguous, but i'm not sure if there are
> > use cases that would benefit from this.
> 
> On another thought, such a default would break 4K pages in case for packets
> bigger than page size (i.e. jumbo frames). Should we care?

The hugepage size will not be 4K, right?

Olivier,

As a maintainer, any thoughts on exposing or not exposing the new mempool flag to
skip the page boundaries?

All,
Either option is fine. Asking for feedback so that we can proceed further.
Vamsi Krishna Attunuru July 1, 2019, 1:51 p.m. UTC | #6
ping..
Vamsi Krishna Attunuru July 4, 2019, 6:42 a.m. UTC | #7
Hi All,


Just to summarize, the items below have arisen from the initial review.

1) Can the new mempool flag be made the default for all pools, and are there cases where the new flag functionality would fail for some page sizes?

2) Will adding HW device info (PCI dev info) to the KNI device structure break KNI on virtual devices in VA or PA mode?


Can someone suggest whether any changes are required to address the above issues?
Jerin Jacob Kollanukkaran July 4, 2019, 9:48 a.m. UTC | #8
>From: Vamsi Krishna Attunuru 
>Sent: Thursday, July 4, 2019 12:13 PM
>To: dev@dpdk.org
>Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com; arybchenko@solarflare.com; Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
>Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
>
>Hi All,
>
>Just to summarize, below items have arisen from the initial review.
>1) Can the new mempool flag be made default to all the pools and will there be case that new flag functionality would fail  for some page sizes.?

The minimum huge page size is 2MB, and the normal huge page size is 512MB or 1G. So I think the new flag can be the default, as skipping the page boundaries for 
mempool objects has nearly zero overhead. But I leave the decision to the maintainers.

>2) Adding HW device info(pci dev info) to KNI device structure, will it break KNI on virtual devices in VA or PA mode.?

The iommu_domain will be created only for PCI devices when the system runs in IOVA_VA mode. Virtual devices (IOVA_DC, i.e. don't care) and
IOVA_PA devices still work without the PCI device structure.

It is a useful feature that lets KNI run without root privilege, and it has been pending for a long time. Requesting review to close this.

>
>Can someone suggest if any changes required to address above issues.
Ferruh Yigit July 11, 2019, 4:21 p.m. UTC | #9
On 7/4/2019 10:48 AM, Jerin Jacob Kollanukkaran wrote:
>> From: Vamsi Krishna Attunuru 
>> Sent: Thursday, July 4, 2019 12:13 PM
>> To: dev@dpdk.org
>> Cc: ferruh.yigit@intel.com; olivier.matz@6wind.com; arybchenko@solarflare.com; Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v6 0/4] add IOVA = VA support in KNI
>>
>> Hi All,
>>
>> Just to summarize, below items have arisen from the initial review.
>> 1) Can the new mempool flag be made default to all the pools and will there be case that new flag functionality would fail  for some page sizes.?
> 
> If the minimum huge page size is 2MB and normal huge page size is 512MB or 1G. So I think, new flags can be default as skipping the page boundaries for 
> Mempool objects has nearly zero overhead. But I leave decision to maintainers.
> 
>> 2) Adding HW device info(pci dev info) to KNI device structure, will it break KNI on virtual devices in VA or PA mode.?
> 
> Iommu_domain will be created only for PCI devices and the system runs in IOVA_VA mode. Virtual devices(IOVA_DC(don't care) or
> IOVA_PA devices still it works without PCI device structure)
> 
> It is  a useful feature where KNI can run without root privilege and it is pending for long time. Request to review and close this

I support the idea of no longer having 'kni' force IOVA=PA mode, but I am also not
sure about forcing all KNI users to update their code to allocate the mempool in a
very specific way.

What about giving more control to the user on this?

Any user who wants to use IOVA=VA and KNI together can update the application to adjust
the memory allocation for KNI and give an explicit "kni iova_mode=1" config.
Anyone who wants to use the existing KNI implementation can continue to use it in IOVA=PA
mode, which is the current case. For this case the user may need to force the DPDK
application to IOVA=PA, but at least there is a workaround.

And the kni sample application should have a sample for both cases. Although this
increases the testing and maintenance cost, I hope we can get support from you
on the iova_mode=1 use case.

What do you think?
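
One way to read the opt-in suggested above, as a sketch only. The enum, the struct and the function name are hypothetical and are not an existing KNI or EAL API: IOVA=VA is used for KNI only when it is both available and explicitly requested, otherwise the current IOVA=PA behavior is kept.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum iova_mode { IOVA_MODE_PA, IOVA_MODE_VA };

struct kni_user_conf {
	int iova_mode; /* 1 = user explicitly asks for IOVA=VA with KNI */
};

/* Use VA only when the platform detected VA and the user opted in;
 * every other combination falls back to PA. */
static enum iova_mode
kni_pick_iova_mode(enum iova_mode detected, const struct kni_user_conf *conf)
{
	assert(conf != NULL);
	if (detected == IOVA_MODE_VA && conf->iova_mode == 1)
		return IOVA_MODE_VA;
	return IOVA_MODE_PA;
}
```

Under this reading, applications that do not set the option see no change in behavior.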



> 
>>
>> Can someone suggest if any changes required to address above issues. 