[v3] pci_vfio: Support 64KB kernel page_size with vfio-pci driver

Message ID 1542595044-27145-1-git-send-email-tone.zhang@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [v3] pci_vfio: Support 64KB kernel page_size with vfio-pci driver |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/intel-Performance-Testing success Performance Testing PASS
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/Intel-compilation success Compilation OK

Commit Message

tone.zhang Nov. 19, 2018, 2:37 a.m. UTC
  With a larger PAGE_SIZE it is possible for the MSI table to very
close to the end of the BAR s.t. when we align the start and end
of the MSI table to the PAGE_SIZE, the end offset of the MSI
table is out of the PCI BAR boundary.

This patch addresses the issue by comparing both the start and the
end offset of the MSI table with the BAR size, and skip the mapping
if it is out of Bar scope.

The patch fixes the debug log as below:
EAL: Skipping BAR0

Signed-off-by: tone.zhang <tone.zhang@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Steve Capper <Steve.Capper@arm.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/pci/linux/pci_vfio.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)
  

Comments

tone.zhang Dec. 3, 2018, 7:25 a.m. UTC | #1
Hi all,

Could anyone please have a review and re-test the change?

Thanks a lot!

Br,
Tone

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of tone.zhang
> Sent: Monday, November 19, 2018 10:37 AM
> To: dev@dpdk.org
> Cc: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; anatoly.burakov@intel.com; nd <nd@arm.com>
> Subject: [dpdk-dev] [PATCH v3] pci_vfio: Support 64KB kernel page_size with
> vfio-pci driver
> 
> With a larger PAGE_SIZE it is possible for the MSI table to very close to the end
> of the BAR s.t. when we align the start and end of the MSI table to the
> PAGE_SIZE, the end offset of the MSI table is out of the PCI BAR boundary.
> 
> This patch addresses the issue by comparing both the start and the end offset of
> the MSI table with the BAR size, and skip the mapping if it is out of Bar scope.
> 
> The patch fixes the debug log as below:
> EAL: Skipping BAR0
> 
> Signed-off-by: tone.zhang <tone.zhang@arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Steve Capper <Steve.Capper@arm.com>
> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  drivers/bus/pci/linux/pci_vfio.c | 27 ++++++++++++++++++++++-----
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
> index ffd26f1..0e7bfd7 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -457,9 +457,11 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> mapped_pci_resource *vfio_res,
>  	struct pci_msix_table *msix_table = &vfio_res->msix_table;
>  	struct pci_map *bar = &vfio_res->maps[bar_index];
> 
> -	if (bar->size == 0)
> +	if (bar->size == 0) {
>  		/* Skip this BAR */
> +		RTE_LOG(INFO, EAL, "Skipping BAR%d\n", bar_index);
>  		return 0;
> +	}
> 
>  	if (msix_table->bar_index == bar_index) {
>  		/*
> @@ -468,8 +470,14 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> mapped_pci_resource *vfio_res,
>  		 */
>  		uint32_t table_start = msix_table->offset;
>  		uint32_t table_end = table_start + msix_table->size;
> -		table_end = (table_end + ~PAGE_MASK) & PAGE_MASK;
> -		table_start &= PAGE_MASK;
> +		table_end = RTE_ALIGN(table_end, PAGE_SIZE);
> +		table_start = RTE_ALIGN(table_start, PAGE_SIZE);
> +
> +		/* If page-aligned start of MSI-X table is beyond BAR size,
> +		 * shrink the mapping size to MSI-X table start address.
> +		 */
> +		if (table_start >= bar->size)
> +			table_start = msix_table->offset;
> 
>  		if (table_start == 0 && table_end >= bar->size) {
>  			/* Cannot map this BAR */
> @@ -481,8 +489,17 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> mapped_pci_resource *vfio_res,
> 
>  		memreg[0].offset = bar->offset;
>  		memreg[0].size = table_start;
> -		memreg[1].offset = bar->offset + table_end;
> -		memreg[1].size = bar->size - table_end;
> +		if (bar->size < table_end) {
> +			/*
> +			 * If MSI-X table end is beyond BAR end, don't attempt
> +			 * to perform second mapping.
> +			 */
> +			memreg[1].offset = 0;
> +			memreg[1].size = 0;
> +		} else {
> +			memreg[1].offset = bar->offset + table_end;
> +			memreg[1].size = bar->size - table_end;
> +		}
> 
>  		RTE_LOG(DEBUG, EAL,
>  			"Trying to map BAR%d that contains the MSI-X "
> --
> 2.7.4
  
Anatoly Burakov Dec. 10, 2018, 11:40 a.m. UTC | #2
On 03-Dec-18 7:25 AM, Tone Zhang (Arm Technology China) wrote:
> Hi all,
> 
> Could anyone please have a review and re-test the change?
> 
> Thanks a lot!
> 
> Br,
> Tone

Tested with latest 18.11, all seems to work fine.

Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
Anatoly Burakov Dec. 10, 2018, 11:45 a.m. UTC | #3
On 19-Nov-18 2:37 AM, tone.zhang wrote:
> With a larger PAGE_SIZE it is possible for the MSI table to very
> close to the end of the BAR s.t. when we align the start and end
> of the MSI table to the PAGE_SIZE, the end offset of the MSI
> table is out of the PCI BAR boundary.
> 
> This patch addresses the issue by comparing both the start and the
> end offset of the MSI table with the BAR size, and skip the mapping
> if it is out of Bar scope.
> 
> The patch fixes the debug log as below:
> EAL: Skipping BAR0
> 
> Signed-off-by: tone.zhang <tone.zhang@arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Steve Capper <Steve.Capper@arm.com>
> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>   drivers/bus/pci/linux/pci_vfio.c | 27 ++++++++++++++++++++++-----
>   1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
> index ffd26f1..0e7bfd7 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -457,9 +457,11 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
>   	struct pci_msix_table *msix_table = &vfio_res->msix_table;
>   	struct pci_map *bar = &vfio_res->maps[bar_index];
>   
> -	if (bar->size == 0)
> +	if (bar->size == 0) {
>   		/* Skip this BAR */
> +		RTE_LOG(INFO, EAL, "Skipping BAR%d\n", bar_index);
>   		return 0;

I would perhaps make it a DEBUG rather than INFO.

> +	}
>   
>   	if (msix_table->bar_index == bar_index) {
>   		/*
> @@ -468,8 +470,14 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
>   		 */
>   		uint32_t table_start = msix_table->offset;
>   		uint32_t table_end = table_start + msix_table->size;
> -		table_end = (table_end + ~PAGE_MASK) & PAGE_MASK;
> -		table_start &= PAGE_MASK;
> +		table_end = RTE_ALIGN(table_end, PAGE_SIZE);
> +		table_start = RTE_ALIGN(table_start, PAGE_SIZE);

I believe table_start &= PAGE_MASK is equivalent to table_start = 
RTE_ALIGN_FLOOR(), not RTE_ALIGN() (which is an alias for RTE_ALIGN_CEIL()).

> +
> +		/* If page-aligned start of MSI-X table is beyond BAR size,
> +		 * shrink the mapping size to MSI-X table start address.
> +		 */
> +		if (table_start >= bar->size)
> +			table_start = msix_table->offset;
>   
>   		if (table_start == 0 && table_end >= bar->size) {
>   			/* Cannot map this BAR */
> @@ -481,8 +489,17 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
>   
>   		memreg[0].offset = bar->offset;
>   		memreg[0].size = table_start;
> -		memreg[1].offset = bar->offset + table_end;
> -		memreg[1].size = bar->size - table_end;
> +		if (bar->size < table_end) {
> +			/*
> +			 * If MSI-X table end is beyond BAR end, don't attempt
> +			 * to perform second mapping.
> +			 */
> +			memreg[1].offset = 0;
> +			memreg[1].size = 0;
> +		} else {
> +			memreg[1].offset = bar->offset + table_end;
> +			memreg[1].size = bar->size - table_end;
> +		}
>   
>   		RTE_LOG(DEBUG, EAL,
>   			"Trying to map BAR%d that contains the MSI-X "
>
  
Stephen Hemminger Dec. 10, 2018, 3:55 p.m. UTC | #4
On Mon, 10 Dec 2018 11:45:37 +0000
"Burakov, Anatoly" <anatoly.burakov@intel.com> wrote:

> >   		/* Skip this BAR */
> > +		RTE_LOG(INFO, EAL, "Skipping BAR%d\n", bar_index);
> >   		return 0;  
> 
> I would perhaps make it a DEBUG rather than INFO.

And drop the comment...
  
tone.zhang Dec. 12, 2018, 10:48 a.m. UTC | #5
Hi Anatoly,

Thanks for the comments.

> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Monday, December 10, 2018 7:46 PM
> To: Tone Zhang (Arm Technology China) <Tone.Zhang@arm.com>;
> dev@dpdk.org
> Cc: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH v3] pci_vfio: Support 64KB kernel page_size with vfio-pci
> driver
> 
> On 19-Nov-18 2:37 AM, tone.zhang wrote:
> > With a larger PAGE_SIZE it is possible for the MSI table to very close
> > to the end of the BAR s.t. when we align the start and end of the MSI
> > table to the PAGE_SIZE, the end offset of the MSI table is out of the
> > PCI BAR boundary.
> >
> > This patch addresses the issue by comparing both the start and the end
> > offset of the MSI table with the BAR size, and skip the mapping if it
> > is out of Bar scope.
> >
> > The patch fixes the debug log as below:
> > EAL: Skipping BAR0
> >
> > Signed-off-by: tone.zhang <tone.zhang@arm.com>
> > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Steve Capper <Steve.Capper@arm.com>
> > Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > ---
> >   drivers/bus/pci/linux/pci_vfio.c | 27 ++++++++++++++++++++++-----
> >   1 file changed, 22 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/bus/pci/linux/pci_vfio.c
> > b/drivers/bus/pci/linux/pci_vfio.c
> > index ffd26f1..0e7bfd7 100644
> > --- a/drivers/bus/pci/linux/pci_vfio.c
> > +++ b/drivers/bus/pci/linux/pci_vfio.c
> > @@ -457,9 +457,11 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> mapped_pci_resource *vfio_res,
> >   	struct pci_msix_table *msix_table = &vfio_res->msix_table;
> >   	struct pci_map *bar = &vfio_res->maps[bar_index];
> >
> > -	if (bar->size == 0)
> > +	if (bar->size == 0) {
> >   		/* Skip this BAR */
> > +		RTE_LOG(INFO, EAL, "Skipping BAR%d\n", bar_index);
> >   		return 0;
> 
> I would perhaps make it a DEBUG rather than INFO.

Will update in v4 version soon.

> 
> > +	}
> >
> >   	if (msix_table->bar_index == bar_index) {
> >   		/*
> > @@ -468,8 +470,14 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> mapped_pci_resource *vfio_res,
> >   		 */
> >   		uint32_t table_start = msix_table->offset;
> >   		uint32_t table_end = table_start + msix_table->size;
> > -		table_end = (table_end + ~PAGE_MASK) & PAGE_MASK;
> > -		table_start &= PAGE_MASK;
> > +		table_end = RTE_ALIGN(table_end, PAGE_SIZE);
> > +		table_start = RTE_ALIGN(table_start, PAGE_SIZE);
> 
> I believe table_start &= PAGE_MASK is equivalent to table_start =
> RTE_ALIGN_FLOOR(), not RTE_ALIGN() (which is an alias for RTE_ALIGN_CEIL()).

Will change to RTE_ALIGN_FLOOR. Then table_start will be less than bar->size, the below code will be updated further in v4.

> 
> > +
> > +		/* If page-aligned start of MSI-X table is beyond BAR size,
> > +		 * shrink the mapping size to MSI-X table start address.
> > +		 */
> > +		if (table_start >= bar->size)
> > +			table_start = msix_table->offset;
> >
> >   		if (table_start == 0 && table_end >= bar->size) {
> >   			/* Cannot map this BAR */
> > @@ -481,8 +489,17 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> > mapped_pci_resource *vfio_res,
> >
> >   		memreg[0].offset = bar->offset;
> >   		memreg[0].size = table_start;
> > -		memreg[1].offset = bar->offset + table_end;
> > -		memreg[1].size = bar->size - table_end;
> > +		if (bar->size < table_end) {
> > +			/*
> > +			 * If MSI-X table end is beyond BAR end, don't attempt
> > +			 * to perform second mapping.
> > +			 */
> > +			memreg[1].offset = 0;
> > +			memreg[1].size = 0;
> > +		} else {
> > +			memreg[1].offset = bar->offset + table_end;
> > +			memreg[1].size = bar->size - table_end;
> > +		}
> >
> >   		RTE_LOG(DEBUG, EAL,
> >   			"Trying to map BAR%d that contains the MSI-X "
> >
> 
> 
> --
> Thanks,
> Anatoly

Thnaks!

Br,
Tone
  
tone.zhang Dec. 12, 2018, 10:49 a.m. UTC | #6
Hi Stephen,

Thanks for the comments.

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, December 10, 2018 11:56 PM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>
> Cc: Tone Zhang (Arm Technology China) <Tone.Zhang@arm.com>;
> dev@dpdk.org; Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Steve Capper
> <Steve.Capper@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v3] pci_vfio: Support 64KB kernel page_size with
> vfio-pci driver
> 
> On Mon, 10 Dec 2018 11:45:37 +0000
> "Burakov, Anatoly" <anatoly.burakov@intel.com> wrote:
> 
> > >   		/* Skip this BAR */
> > > +		RTE_LOG(INFO, EAL, "Skipping BAR%d\n", bar_index);
> > >   		return 0;
> >
> > I would perhaps make it a DEBUG rather than INFO.
> 
> And drop the comment...

Will drop the comment in v4.

Thanks.

Br,
Tone
  

Patch

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index ffd26f1..0e7bfd7 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -457,9 +457,11 @@  pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
 	struct pci_msix_table *msix_table = &vfio_res->msix_table;
 	struct pci_map *bar = &vfio_res->maps[bar_index];
 
-	if (bar->size == 0)
+	if (bar->size == 0) {
 		/* Skip this BAR */
+		RTE_LOG(INFO, EAL, "Skipping BAR%d\n", bar_index);
 		return 0;
+	}
 
 	if (msix_table->bar_index == bar_index) {
 		/*
@@ -468,8 +470,14 @@  pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
 		 */
 		uint32_t table_start = msix_table->offset;
 		uint32_t table_end = table_start + msix_table->size;
-		table_end = (table_end + ~PAGE_MASK) & PAGE_MASK;
-		table_start &= PAGE_MASK;
+		table_end = RTE_ALIGN(table_end, PAGE_SIZE);
+		table_start = RTE_ALIGN(table_start, PAGE_SIZE);
+
+		/* If page-aligned start of MSI-X table is beyond BAR size,
+		 * shrink the mapping size to MSI-X table start address.
+		 */
+		if (table_start >= bar->size)
+			table_start = msix_table->offset;
 
 		if (table_start == 0 && table_end >= bar->size) {
 			/* Cannot map this BAR */
@@ -481,8 +489,17 @@  pci_vfio_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource *vfio_res,
 
 		memreg[0].offset = bar->offset;
 		memreg[0].size = table_start;
-		memreg[1].offset = bar->offset + table_end;
-		memreg[1].size = bar->size - table_end;
+		if (bar->size < table_end) {
+			/*
+			 * If MSI-X table end is beyond BAR end, don't attempt
+			 * to perform second mapping.
+			 */
+			memreg[1].offset = 0;
+			memreg[1].size = 0;
+		} else {
+			memreg[1].offset = bar->offset + table_end;
+			memreg[1].size = bar->size - table_end;
+		}
 
 		RTE_LOG(DEBUG, EAL,
 			"Trying to map BAR%d that contains the MSI-X "