mbox series

[v2,0/7] fix DMA mask check

Message ID 20181101195330.19464-1-alejandro.lucero@netronome.com (mailing list archive)
Headers show
Series fix DMA mask check | expand

Message

Alejandro Lucero Nov. 1, 2018, 7:53 p.m. UTC
A patchset sent introducing DMA mask checks has several critical
issues precluding apps to execute. The patchset was reviewed and
finally accepted after three versions. Obviously it did not go 
through the proper testing what can be explained, at least from my
side, due to the big changes to the memory initialization code these
last months. It turns out the patchset did work with legacy memory
and I'm afraid that was mainly my testing.

This patchset should solve the main problems reported:

 - deadlock duriing initialization
 - segmentation fault with secondary processes

For solving the deadlock, a new API is introduced:

 rte_mem_check_dma_mask_safe/unsafe

making the previous rte_mem_check_dma_mask the one those new functions
end calling. A boolean param is used for calling rte_memseg_walk thread
safe or thread unsafe. This second option is needed for avoiding the 
deadlock.

For the secondary processes problem, the call to check the dma mask is
avoided from code being executed before the memory initialization. 
Instead, a new API function, rte_mem_set_dma_mask is introduced, which 
will be used in those cases. The dma mask check is done once the memory
initialization is completed.

This last change implies the IOVA mode can not be set depending on IOMMU
hardware limitations, and it is assumed IOVA VA is possible. If the dma
mask check reports a problem after memory initilization, the error 
message includes now advice for trying with --iova-mode option set to 
pa.

The patchet also includes the dma mask check for legacy memory and the 
no hugepage option.

Finally, all the DMA mask API has been updated for using the same prefix
than other EAL memory code.

An initial version of this patchset has been tested by Intel DPDK 
Validation team and it seems it solves all the problems reported. This 
final patchset has the same functionality with minor changes. I have
successfully tested the patchset with my limited testbench.

v2:
 - modify error messages with more descriptive information
 - change safe/unsafe versions for dma check to previous one plus a
   thread_unsafe version.  
 - use rte_eal_using_phys_addr instead of getuid
 - fix comments
 - reorder patches

Alejandro Lucero (7):
  mem: fix call to DMA mask check
  mem: use proper prefix
  mem: add function for setting DMA mask
  bus/pci: avoid call to DMA mask check
  mem: modify error message for DMA mask check
  eal/mem: use DMA mask check for legacy memory
  mem: add thread unsafe version for checking DMA mask

 doc/guides/rel_notes/release_18_11.rst     |  2 +-
 drivers/bus/pci/linux/pci.c                | 11 +++++-
 drivers/net/nfp/nfp_net.c                  |  2 +-
 lib/librte_eal/common/eal_common_memory.c  | 42 +++++++++++++++++++--
 lib/librte_eal/common/include/rte_memory.h | 41 ++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.c        | 43 ++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 20 ++++++++++
 lib/librte_eal/rte_eal_version.map         |  4 +-
 8 files changed, 148 insertions(+), 17 deletions(-)

Comments

Ferruh Yigit Nov. 2, 2018, 6:38 p.m. UTC | #1
On 11/1/2018 7:53 PM, Alejandro Lucero wrote:
> A patchset sent introducing DMA mask checks has several critical
> issues precluding apps to execute. The patchset was reviewed and
> finally accepted after three versions. Obviously it did not go 
> through the proper testing what can be explained, at least from my
> side, due to the big changes to the memory initialization code these
> last months. It turns out the patchset did work with legacy memory
> and I'm afraid that was mainly my testing.
> 
> This patchset should solve the main problems reported:
> 
>  - deadlock duriing initialization
>  - segmentation fault with secondary processes
> 
> For solving the deadlock, a new API is introduced:
> 
>  rte_mem_check_dma_mask_safe/unsafe
> 
> making the previous rte_mem_check_dma_mask the one those new functions
> end calling. A boolean param is used for calling rte_memseg_walk thread
> safe or thread unsafe. This second option is needed for avoiding the 
> deadlock.
> 
> For the secondary processes problem, the call to check the dma mask is
> avoided from code being executed before the memory initialization. 
> Instead, a new API function, rte_mem_set_dma_mask is introduced, which 
> will be used in those cases. The dma mask check is done once the memory
> initialization is completed.
> 
> This last change implies the IOVA mode can not be set depending on IOMMU
> hardware limitations, and it is assumed IOVA VA is possible. If the dma
> mask check reports a problem after memory initilization, the error 
> message includes now advice for trying with --iova-mode option set to 
> pa.
> 
> The patchet also includes the dma mask check for legacy memory and the 
> no hugepage option.
> 
> Finally, all the DMA mask API has been updated for using the same prefix
> than other EAL memory code.
> 
> An initial version of this patchset has been tested by Intel DPDK 
> Validation team and it seems it solves all the problems reported. This 
> final patchset has the same functionality with minor changes. I have
> successfully tested the patchset with my limited testbench.
> 
> v2:
>  - modify error messages with more descriptive information
>  - change safe/unsafe versions for dma check to previous one plus a
>    thread_unsafe version.  
>  - use rte_eal_using_phys_addr instead of getuid
>  - fix comments
>  - reorder patches
> 
> Alejandro Lucero (7):
>   mem: fix call to DMA mask check
>   mem: use proper prefix
>   mem: add function for setting DMA mask
>   bus/pci: avoid call to DMA mask check
>   mem: modify error message for DMA mask check
>   eal/mem: use DMA mask check for legacy memory
>   mem: add thread unsafe version for checking DMA mask

Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>

Fixing the deadlock issue, and build/per-patch build is good.
Thomas Monjalon Nov. 5, 2018, 12:03 a.m. UTC | #2
02/11/2018 19:38, Ferruh Yigit:
> On 11/1/2018 7:53 PM, Alejandro Lucero wrote:
> > A patchset sent introducing DMA mask checks has several critical
> > issues precluding apps to execute. The patchset was reviewed and
> > finally accepted after three versions. Obviously it did not go 
> > through the proper testing what can be explained, at least from my
> > side, due to the big changes to the memory initialization code these
> > last months. It turns out the patchset did work with legacy memory
> > and I'm afraid that was mainly my testing.
> > 
> > This patchset should solve the main problems reported:
> > 
> >  - deadlock duriing initialization
> >  - segmentation fault with secondary processes
> > 
> > For solving the deadlock, a new API is introduced:
> > 
> >  rte_mem_check_dma_mask_safe/unsafe
> > 
> > making the previous rte_mem_check_dma_mask the one those new functions
> > end calling. A boolean param is used for calling rte_memseg_walk thread
> > safe or thread unsafe. This second option is needed for avoiding the 
> > deadlock.
> > 
> > For the secondary processes problem, the call to check the dma mask is
> > avoided from code being executed before the memory initialization. 
> > Instead, a new API function, rte_mem_set_dma_mask is introduced, which 
> > will be used in those cases. The dma mask check is done once the memory
> > initialization is completed.
> > 
> > This last change implies the IOVA mode can not be set depending on IOMMU
> > hardware limitations, and it is assumed IOVA VA is possible. If the dma
> > mask check reports a problem after memory initilization, the error 
> > message includes now advice for trying with --iova-mode option set to 
> > pa.
> > 
> > The patchet also includes the dma mask check for legacy memory and the 
> > no hugepage option.
> > 
> > Finally, all the DMA mask API has been updated for using the same prefix
> > than other EAL memory code.
> > 
> > An initial version of this patchset has been tested by Intel DPDK 
> > Validation team and it seems it solves all the problems reported. This 
> > final patchset has the same functionality with minor changes. I have
> > successfully tested the patchset with my limited testbench.
> > 
> > v2:
> >  - modify error messages with more descriptive information
> >  - change safe/unsafe versions for dma check to previous one plus a
> >    thread_unsafe version.  
> >  - use rte_eal_using_phys_addr instead of getuid
> >  - fix comments
> >  - reorder patches
> > 
> > Alejandro Lucero (7):
> >   mem: fix call to DMA mask check
> >   mem: use proper prefix
> >   mem: add function for setting DMA mask
> >   bus/pci: avoid call to DMA mask check
> >   mem: modify error message for DMA mask check
> >   eal/mem: use DMA mask check for legacy memory
> >   mem: add thread unsafe version for checking DMA mask
> 
> Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Fixing the deadlock issue, and build/per-patch build is good.

Applied, thanks