bus/pci: fix mapping BAR containing MSI-X table

Message ID 20200925021435.16114-1-hyonkim@cisco.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Headers
Series bus/pci: fix mapping BAR containing MSI-X table |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/travis-robot success Travis build: passed
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

Hyong Youb Kim (hyonkim) Sept. 25, 2020, 2:14 a.m. UTC
  When the BAR contains MSI-X table, pci_vfio_mmap_bar() tries to skip
the table and map the rest. "map around it" is the phrase used in the
source. The function splits the BAR into two regions: the region
before the table (first part or memreg[0]) and the region after the
table (second part or memreg[1]).

For hardware that has MSI-X vector table offset 0, the first part does
not exist (memreg[0].size == 0).

  Capabilities: [60] MSI-X: Enable- Count=48 Masked-
         Vector table: BAR=2 offset=00000000
         PBA: BAR=2 offset=00001000

The mapping part of the function maps the first part, if it
exists. Then, it maps the second part, if it exists and "if mapping the
first part succeeded".

The recent change that replaces MAP_FAILED with NULL breaks the "if
mapping the first part succeeded" condition (1) in the snippet below.

    void *map_addr = NULL;
    if (memreg[0].size) {
	    /* actual map of first part */
	    map_addr = pci_map_resource(...);
    }

    /* if there's a second part, try to map it */
    if (map_addr != NULL                              // -- (1)
	    && memreg[1].offset && memreg[1].size) {
	[...]
    }

    if (map_addr == NULL) {
            RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
                    bar_index);
            return -1;
    }

When the first part does not exist, (1) sees map_addr is still NULL,
and the function fails. This behavior is a regression and fails
probing hardware with vector table offset 0.

Previously, (1) was "map_addr != MAP_FAILED", which meant
pci_map_resource() was actually attempted and failed. So, expand (1)
to check if the first part exists as well, to match the semantics of
MAP_FAILED.

Bugzilla ID: 539
Fixes: e200535c1ca3 ("mem: drop mapping API workaround")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
---
 drivers/bus/pci/linux/pci_vfio.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
  

Comments

Burakov, Anatoly Sept. 25, 2020, 11:48 a.m. UTC | #1
On 25-Sep-20 3:14 AM, Hyong Youb Kim wrote:
> When the BAR contains MSI-X table, pci_vfio_mmap_bar() tries to skip
> the table and map the rest. "map around it" is the phrase used in the
> source. The function splits the BAR into two regions: the region
> before the table (first part or memreg[0]) and the region after the
> table (second part or memreg[1]).
> 
> For hardware that has MSI-X vector table offset 0, the first part does
> not exist (memreg[0].size == 0).
> 
>    Capabilities: [60] MSI-X: Enable- Count=48 Masked-
>           Vector table: BAR=2 offset=00000000
>           PBA: BAR=2 offset=00001000
> 
> The mapping part of the function maps the first part, if it
> exists. Then, it maps the second part, if it exists and "if mapping the
> first part succeeded".
> 
> The recent change that replaces MAP_FAILED with NULL breaks the "if
> mapping the first part succeeded" condition (1) in the snippet below.
> 
>      void *map_addr = NULL;
>      if (memreg[0].size) {
> 	    /* actual map of first part */
> 	    map_addr = pci_map_resource(...);
>      }
> 
>      /* if there's a second part, try to map it */
>      if (map_addr != NULL                              // -- (1)
> 	    && memreg[1].offset && memreg[1].size) {
> 	[...]
>      }
> 
>      if (map_addr == NULL) {
>              RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
>                      bar_index);
>              return -1;
>      }
> 
> When the first part does not exist, (1) sees map_addr is still NULL,
> and the function fails. This behavior is a regression and fails
> probing hardware with vector table offset 0.
> 
> Previously, (1) was "map_addr != MAP_FAILED", which meant
> pci_map_resource() was actually attempted and failed. So, expand (1)
> to check if the first part exists as well, to match the semantics of
> MAP_FAILED.
> 
> Bugzilla ID: 539
> Fixes: e200535c1ca3 ("mem: drop mapping API workaround")
> 
> Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
> ---

LGTM

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
David Marchand Sept. 28, 2020, 9:42 a.m. UTC | #2
On Fri, Sep 25, 2020 at 4:15 AM Hyong Youb Kim <hyonkim@cisco.com> wrote:
>
> When the BAR contains MSI-X table, pci_vfio_mmap_bar() tries to skip
> the table and map the rest. "map around it" is the phrase used in the
> source. The function splits the BAR into two regions: the region
> before the table (first part or memreg[0]) and the region after the
> table (second part or memreg[1]).
>
> For hardware that has MSI-X vector table offset 0, the first part does
> not exist (memreg[0].size == 0).
>
>   Capabilities: [60] MSI-X: Enable- Count=48 Masked-
>          Vector table: BAR=2 offset=00000000
>          PBA: BAR=2 offset=00001000
>
> The mapping part of the function maps the first part, if it
> exists. Then, it maps the second part, if it exists and "if mapping the
> first part succeeded".
>
> The recent change that replaces MAP_FAILED with NULL breaks the "if
> mapping the first part succeeded" condition (1) in the snippet below.
>
>     void *map_addr = NULL;
>     if (memreg[0].size) {
>             /* actual map of first part */
>             map_addr = pci_map_resource(...);
>     }
>
>     /* if there's a second part, try to map it */
>     if (map_addr != NULL                              // -- (1)
>             && memreg[1].offset && memreg[1].size) {
>         [...]
>     }
>
>     if (map_addr == NULL) {
>             RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
>                     bar_index);
>             return -1;
>     }
>
> When the first part does not exist, (1) sees map_addr is still NULL,
> and the function fails. This behavior is a regression and fails
> probing hardware with vector table offset 0.
>
> Previously, (1) was "map_addr != MAP_FAILED", which meant
> pci_map_resource() was actually attempted and failed. So, expand (1)
> to check if the first part exists as well, to match the semantics of
> MAP_FAILED.
>
> Bugzilla ID: 539
> Fixes: e200535c1ca3 ("mem: drop mapping API workaround")
>
> Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

Applied, thanks.
  

Patch

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 34b5da80df..d2073994fa 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -564,8 +564,20 @@  pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
 							RTE_MAP_FORCE_ADDRESS);
 		}
 
+		/*
+		 * Regarding "memreg[0].size == 0":
+		 * If this BAR has MSI-X table, memreg[0].size (the
+		 * first part or the part before the table) can
+		 * legitimately be 0 for hardware using vector table
+		 * offset 0 (i.e. first part does not exist).
+		 *
+		 * When memreg[0].size is 0, "mapping the first part"
+		 * never happens, and map_addr is NULL at this
+		 * point. So check that mapping has been actually
+		 * attempted.
+		 */
 		/* if there's a second part, try to map it */
-		if (map_addr != NULL
+		if ((map_addr != NULL || memreg[0].size == 0)
 			&& memreg[1].offset && memreg[1].size) {
 			void *second_addr = RTE_PTR_ADD(bar_addr,
 						(uintptr_t)(memreg[1].offset -