[v4,3/4] bus/cdx: add support for MSI

Message ID 20230508111812.2655-4-nipun.gupta@amd.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series Support AMD CDX bus, for FPGA based CDX devices. The CDX |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Gupta, Nipun May 8, 2023, 11:18 a.m. UTC
MSIs are exposed to the devices using VFIO (vfio-cdx). This
patch uses the same to add support for MSI for the devices on
the CDX bus.

A couple of APIs have been introduced in the EAL interrupt
framework:
- rte_intr_irq_count_set: This API is used to set the total
    number of interrupts on the interrupt handle. This would be
    provided by VFIO (irq.count) for VFIO enabled devices.
- rte_intr_irq_count_get: This API returns the total number
    of interrupts which were set.

Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
---
 drivers/bus/cdx/bus_cdx_driver.h       |  25 ++++
 drivers/bus/cdx/cdx.c                  |  11 ++
 drivers/bus/cdx/cdx_vfio.c             | 182 ++++++++++++++++++++++++-
 drivers/bus/cdx/version.map            |   2 +
 lib/eal/common/eal_common_interrupts.c |  21 +++
 lib/eal/common/eal_interrupts.h        |   1 +
 lib/eal/include/rte_interrupts.h       |  32 +++++
 lib/eal/version.map                    |   2 +
 8 files changed, 274 insertions(+), 2 deletions(-)
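
The two EAL helpers named in the commit message are a plain setter/getter pair on the interrupt handle. A minimal sketch of that contract follows, assuming a stand-in structure: `struct demo_intr_handle`, `demo_irq_count_set` and `demo_irq_count_get` are hypothetical names used only for illustration, and the rejection of negative counts is an addition of this sketch, not something the patch does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_intr_handle; only the new field is modelled. */
struct demo_intr_handle {
	uint32_t irq_count;	/* total IRQ count reported for the device */
};

/* Store the IRQ count on the handle; reject a NULL handle or a negative count. */
static int
demo_irq_count_set(struct demo_intr_handle *h, int irq_count)
{
	if (h == NULL || irq_count < 0)
		return -EINVAL;

	h->irq_count = (uint32_t)irq_count;
	return 0;
}

/* Return the stored IRQ count, or a negative errno value for a NULL handle. */
static int
demo_irq_count_get(const struct demo_intr_handle *h)
{
	if (h == NULL)
		return -EINVAL;

	return (int)h->irq_count;
}
```

In the patch itself the bus code calls the setter with `irq.count` while setting up interrupts and reads the value back when building the `vfio_irq_set` request.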
  

Comments

Thomas Monjalon May 24, 2023, 11:06 a.m. UTC | #1
08/05/2023 13:18, Nipun Gupta:
> MSI's are exposed to the devices using VFIO (vfio-cdx). This
> patch uses the same to add support for MSI for the devices on
> the cdx bus.
> 
> A couple of API's have been introduced in the EAL interrupt
> framework:
> - rte_intr_irq_count_set: This API is used to set the total
>     interrupts on the interrupt handle. This would be provided
>     by VFIO (irq.count) for VFIO enabled devices.
> - rte_intr_irq_count_get: This API returns the total number
>     interrupts which were set.
[...]
> --- a/lib/eal/common/eal_interrupts.h
> +++ b/lib/eal/common/eal_interrupts.h
> @@ -16,6 +16,7 @@ struct rte_intr_handle {
>  	};
>  	uint32_t alloc_flags;	/**< flags passed at allocation */
>  	enum rte_intr_handle_type type;  /**< handle type */
> +	uint32_t irq_count;		/**< IRQ count provided via VFIO */

Why only via VFIO?

[...]
> +/**
> + * @internal
> + * Set the irq count field of interrupt handle with user
> + * provided irq count value.
> + *
> + * @param intr_handle
> + *  pointer to the interrupt handle.
> + * @param irq_count
> + *  IRQ count

Please write IRQ all uppercase consistently.
Same for CDX.

> +	rte_intr_irq_count_get;
> +	rte_intr_irq_count_set;

Adding a new API in EAL deserves a separate commit with a different audience.
It looks like it has been hidden from EAL reviewers' eyes so far.
  
Gupta, Nipun May 24, 2023, 5:06 p.m. UTC | #2
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, May 24, 2023 4:37 PM
> To: Gupta, Nipun <Nipun.Gupta@amd.com>
> Cc: dev@dpdk.org; david.marchand@redhat.com; Yigit, Ferruh
> <Ferruh.Yigit@amd.com>; Anand, Harpreet <harpreet.anand@amd.com>;
> Agarwal, Nikhil <nikhil.agarwal@amd.com>
> Subject: Re: [PATCH v4 3/4] bus/cdx: add support for MSI
>
> 08/05/2023 13:18, Nipun Gupta:
> > MSI's are exposed to the devices using VFIO (vfio-cdx). This
> > patch uses the same to add support for MSI for the devices on
> > the cdx bus.
> >
> > A couple of API's have been introduced in the EAL interrupt
> > framework:
> > - rte_intr_irq_count_set: This API is used to set the total
> >     interrupts on the interrupt handle. This would be provided
> >     by VFIO (irq.count) for VFIO enabled devices.
> > - rte_intr_irq_count_get: This API returns the total number
> >     interrupts which were set.
> [...]
> > --- a/lib/eal/common/eal_interrupts.h
> > +++ b/lib/eal/common/eal_interrupts.h
> > @@ -16,6 +16,7 @@ struct rte_intr_handle {
> >       };
> >       uint32_t alloc_flags;   /**< flags passed at allocation */
> >       enum rte_intr_handle_type type;  /**< handle type */
> > +     uint32_t irq_count;             /**< IRQ count provided via VFIO */
>
> Why only via VFIO?

Though this represents the total IRQ count, for VFIO it is returned by the
VFIO_DEVICE_GET_IRQ_INFO ioctl call. I am not sure about UIO, so I added the
comment w.r.t. VFIO only. I will make it generic in the next spin.
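
For reference, the VFIO query mentioned above can be sketched as follows. This is a minimal illustration that assumes Linux with the kernel UAPI header `linux/vfio.h` available; `demo_vfio_irq_count` is a hypothetical helper name and is not part of DPDK or of this patch.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Ask VFIO how many interrupts it reports for one IRQ index of a device.
 * demo_vfio_irq_count() is a hypothetical helper used for illustration.
 * Returns 0 and stores the count on success, -1 if the ioctl fails
 * (errno is left as set by ioctl); *count is untouched on failure.
 */
static int
demo_vfio_irq_count(int vfio_dev_fd, uint32_t index, uint32_t *count)
{
	struct vfio_irq_info irq = { .argsz = sizeof(irq) };

	irq.index = index;
	if (ioctl(vfio_dev_fd, VFIO_DEVICE_GET_IRQ_INFO, &irq) < 0)
		return -1;

	*count = irq.count;
	return 0;
}
```

The value written to `*count` corresponds to the `irq.count` that the patch stores on the interrupt handle.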

>
> [...]
> > +/**
> > + * @internal
> > + * Set the irq count field of interrupt handle with user
> > + * provided irq count value.
> > + *
> > + * @param intr_handle
> > + *  pointer to the interrupt handle.
> > + * @param irq_count
> > + *  IRQ count
>
> Please write IRQ all uppercase consistently.
> Same for CDX.

Sure.

>
> > +     rte_intr_irq_count_get;
> > +     rte_intr_irq_count_set;
>
> Adding a new API in EAL deserves a separate commit with a different audience.
> It looks like it has been hidden from EAL reviewers eyes so far.

Will make a separate commit for this.

Thanks,
Nipun
  

Patch

diff --git a/drivers/bus/cdx/bus_cdx_driver.h b/drivers/bus/cdx/bus_cdx_driver.h
index 7edcb019eb..fdeaf46664 100644
--- a/drivers/bus/cdx/bus_cdx_driver.h
+++ b/drivers/bus/cdx/bus_cdx_driver.h
@@ -72,6 +72,7 @@  struct rte_cdx_device {
 	struct rte_cdx_id id;			/**< CDX ID. */
 	struct rte_mem_resource mem_resource[CDX_MAX_RESOURCE];
 						/**< CDX Memory Resource */
+	struct rte_intr_handle *intr_handle;	/**< Interrupt handle */
 };
 
 /**
@@ -173,6 +174,30 @@  void rte_cdx_unmap_device(struct rte_cdx_device *dev);
 __rte_internal
 void rte_cdx_register(struct rte_cdx_driver *driver);
 
+/**
+ * Enables VFIO Interrupts for CDX bus devices.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ *
+ *  @return
+ *  0 on success, -1 on error.
+ */
+__rte_internal
+int rte_cdx_vfio_intr_enable(const struct rte_intr_handle *intr_handle);
+
+/**
+ * Disable VFIO Interrupts for CDX bus devices.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ *
+ *  @return
+ *  0 on success, -1 on error.
+ */
+__rte_internal
+int rte_cdx_vfio_intr_disable(const struct rte_intr_handle *intr_handle);
+
 /**
  * Helper for CDX device registration from driver (eth, crypto, raw) instance
  */
diff --git a/drivers/bus/cdx/cdx.c b/drivers/bus/cdx/cdx.c
index 8cc273336e..6c9ceaaf7f 100644
--- a/drivers/bus/cdx/cdx.c
+++ b/drivers/bus/cdx/cdx.c
@@ -204,6 +204,15 @@  cdx_scan_one(const char *dirname, const char *dev_name)
 		goto err;
 	}
 
+	/* Allocate interrupt instance for cdx device */
+	dev->intr_handle =
+		rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+	if (dev->intr_handle == NULL) {
+		CDX_BUS_ERR("Failed to create interrupt instance for %s\n",
+			dev->device.name);
+		return -ENOMEM;
+	}
+
 	/*
 	 * Check if device is bound to 'vfio-cdx' driver, so that user-space
 	 * can gracefully access the device.
@@ -394,6 +403,8 @@  cdx_probe_one_driver(struct rte_cdx_driver *dr,
 	return ret;
 
 error_probe:
+	rte_intr_instance_free(dev->intr_handle);
+	dev->intr_handle = NULL;
 	cdx_vfio_unmap_resource(dev);
 error_map_device:
 	return ret;
diff --git a/drivers/bus/cdx/cdx_vfio.c b/drivers/bus/cdx/cdx_vfio.c
index ae11f589b3..1422b98503 100644
--- a/drivers/bus/cdx/cdx_vfio.c
+++ b/drivers/bus/cdx/cdx_vfio.c
@@ -60,6 +60,10 @@  struct mapped_cdx_resource {
 /** mapped cdx device list */
 TAILQ_HEAD(mapped_cdx_res_list, mapped_cdx_resource);
 
+/* irq set buffer length for MSI interrupts */
+#define MSI_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+			      sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))
+
 static struct rte_tailq_elem cdx_vfio_tailq = {
 	.name = "VFIO_CDX_RESOURCE_LIST",
 };
@@ -104,6 +108,27 @@  cdx_vfio_unmap_resource_primary(struct rte_cdx_device *dev)
 	char cdx_addr[PATH_MAX] = {0};
 	struct mapped_cdx_resource *vfio_res = NULL;
 	struct mapped_cdx_res_list *vfio_res_list;
+	int ret, vfio_dev_fd;
+
+	if (rte_intr_fd_get(dev->intr_handle) < 0)
+		return -1;
+
+	if (close(rte_intr_fd_get(dev->intr_handle)) < 0) {
+		CDX_BUS_ERR("Error when closing eventfd file descriptor for %s",
+			dev->device.name);
+		return -1;
+	}
+
+	vfio_dev_fd = rte_intr_dev_fd_get(dev->intr_handle);
+	if (vfio_dev_fd < 0)
+		return -1;
+
+	ret = rte_vfio_release_device(rte_cdx_get_sysfs_path(), dev->device.name,
+				      vfio_dev_fd);
+	if (ret < 0) {
+		CDX_BUS_ERR("Cannot release VFIO device");
+		return ret;
+	}
 
 	vfio_res_list =
 		RTE_TAILQ_CAST(cdx_vfio_tailq.head, mapped_cdx_res_list);
@@ -126,6 +151,18 @@  cdx_vfio_unmap_resource_secondary(struct rte_cdx_device *dev)
 {
 	struct mapped_cdx_resource *vfio_res = NULL;
 	struct mapped_cdx_res_list *vfio_res_list;
+	int ret, vfio_dev_fd;
+
+	vfio_dev_fd = rte_intr_dev_fd_get(dev->intr_handle);
+	if (vfio_dev_fd < 0)
+		return -1;
+
+	ret = rte_vfio_release_device(rte_cdx_get_sysfs_path(), dev->device.name,
+				      vfio_dev_fd);
+	if (ret < 0) {
+		CDX_BUS_ERR("Cannot release VFIO device");
+		return ret;
+	}
 
 	vfio_res_list =
 		RTE_TAILQ_CAST(cdx_vfio_tailq.head, mapped_cdx_res_list);
@@ -150,9 +187,80 @@  cdx_vfio_unmap_resource(struct rte_cdx_device *dev)
 		return cdx_vfio_unmap_resource_secondary(dev);
 }
 
+/* set up interrupt support (but not enable interrupts) */
 static int
-cdx_rte_vfio_setup_device(int vfio_dev_fd)
+cdx_vfio_setup_interrupts(struct rte_cdx_device *dev, int vfio_dev_fd,
+		int num_irqs)
 {
+	int i, ret;
+
+	if (num_irqs == 0)
+		return 0;
+
+	/* start from MSI interrupt type */
+	for (i = 0; i < num_irqs; i++) {
+		struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+		int fd = -1;
+
+		irq.index = i;
+
+		ret = ioctl(vfio_dev_fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+		if (ret < 0) {
+			CDX_BUS_ERR("Cannot get VFIO IRQ info, error %i (%s)",
+				errno, strerror(errno));
+			return -1;
+		}
+
+		/* if this vector cannot be used with eventfd, fail if we explicitly
+		 * specified interrupt type, otherwise continue
+		 */
+		if ((irq.flags & VFIO_IRQ_INFO_EVENTFD) == 0)
+			continue;
+
+		if (rte_intr_irq_count_set(dev->intr_handle, irq.count))
+			return -1;
+
+		/* Reallocate the efds and elist fields of intr_handle based
+		 * on CDX device MSI size.
+		 */
+		if ((uint32_t)rte_intr_nb_intr_get(dev->intr_handle) < irq.count &&
+				rte_intr_event_list_update(dev->intr_handle, irq.count))
+			return -1;
+
+		/* set up an eventfd for interrupts */
+		fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+		if (fd < 0) {
+			CDX_BUS_ERR("Cannot set up eventfd, error %i (%s)",
+				errno, strerror(errno));
+			return -1;
+		}
+
+		if (rte_intr_fd_set(dev->intr_handle, fd))
+			return -1;
+
+		/* DPDK CDX bus currently supports only MSI-X */
+		if (rte_intr_type_set(dev->intr_handle, RTE_INTR_HANDLE_VFIO_MSIX))
+			return -1;
+
+		if (rte_intr_dev_fd_set(dev->intr_handle, vfio_dev_fd))
+			return -1;
+
+		return 0;
+	}
+
+	/* if we're here, we haven't found a suitable interrupt vector */
+	return -1;
+}
+
+static int
+cdx_vfio_setup_device(struct rte_cdx_device *dev, int vfio_dev_fd,
+		int num_irqs)
+{
+	if (cdx_vfio_setup_interrupts(dev, vfio_dev_fd, num_irqs) != 0) {
+		CDX_BUS_ERR("Error setting up interrupts!");
+		return -1;
+	}
+
 	/*
 	 * Reset the device. If the device is not capable of resetting,
 	 * then it updates errno as EINVAL.
@@ -288,6 +396,9 @@  cdx_vfio_map_resource_primary(struct rte_cdx_device *dev)
 	struct cdx_map *maps;
 	int vfio_dev_fd, i, ret;
 
+	if (rte_intr_fd_set(dev->intr_handle, -1))
+		return -1;
+
 	ret = rte_vfio_setup_device(rte_cdx_get_sysfs_path(), dev_name,
 				    &vfio_dev_fd, &device_info);
 	if (ret)
@@ -353,7 +464,7 @@  cdx_vfio_map_resource_primary(struct rte_cdx_device *dev)
 		free(reg);
 	}
 
-	if (cdx_rte_vfio_setup_device(vfio_dev_fd) < 0) {
+	if (cdx_vfio_setup_device(dev, vfio_dev_fd, device_info.num_irqs) < 0) {
 		CDX_BUS_ERR("%s setup device failed", dev_name);
 		goto err_vfio_res;
 	}
@@ -383,6 +494,9 @@  cdx_vfio_map_resource_secondary(struct rte_cdx_device *dev)
 	const char *dev_name = dev->device.name;
 	struct cdx_map *maps;
 
+	if (rte_intr_fd_set(dev->intr_handle, -1))
+		return -1;
+
 	/* if we're in a secondary process, just find our tailq entry */
 	TAILQ_FOREACH(vfio_res, vfio_res_list, next) {
 		if (strcmp(vfio_res->name, dev_name))
@@ -416,6 +530,10 @@  cdx_vfio_map_resource_secondary(struct rte_cdx_device *dev)
 		dev->mem_resource[i].len = maps[i].size;
 	}
 
+	/* we need to save vfio_dev_fd, so it can be used during release */
+	if (rte_intr_dev_fd_set(dev->intr_handle, vfio_dev_fd))
+		goto err_vfio_dev_fd;
+
 	return 0;
 err_vfio_dev_fd:
 	rte_vfio_release_device(rte_cdx_get_sysfs_path(),
@@ -435,3 +553,63 @@  cdx_vfio_map_resource(struct rte_cdx_device *dev)
 	else
 		return cdx_vfio_map_resource_secondary(dev);
 }
+
+int
+rte_cdx_vfio_intr_enable(const struct rte_intr_handle *intr_handle)
+{
+	char irq_set_buf[MSI_IRQ_SET_BUF_LEN];
+	struct vfio_irq_set *irq_set;
+	int *fd_ptr, vfio_dev_fd, i;
+	int ret;
+
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->count = rte_intr_irq_count_get(intr_handle);
+	irq_set->argsz = sizeof(struct vfio_irq_set) +
+			 (sizeof(int) * irq_set->count);
+
+	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = 0;
+	irq_set->start = 0;
+	fd_ptr = (int *) &irq_set->data;
+
+	for (i = 0; i < rte_intr_nb_efd_get(intr_handle); i++)
+		fd_ptr[i] = rte_intr_efds_index_get(intr_handle, i);
+
+	vfio_dev_fd = rte_intr_dev_fd_get(intr_handle);
+	ret = ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret) {
+		CDX_BUS_ERR("Error enabling MSI interrupts for fd %d",
+			rte_intr_fd_get(intr_handle));
+		return -1;
+	}
+
+	return 0;
+}
+
+/* disable MSI interrupts */
+int
+rte_cdx_vfio_intr_disable(const struct rte_intr_handle *intr_handle)
+{
+	struct vfio_irq_set *irq_set;
+	char irq_set_buf[MSI_IRQ_SET_BUF_LEN];
+	int len, ret, vfio_dev_fd;
+
+	len = sizeof(struct vfio_irq_set);
+
+	irq_set = (struct vfio_irq_set *) irq_set_buf;
+	irq_set->argsz = len;
+	irq_set->count = 0;
+	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
+	irq_set->index = 0;
+	irq_set->start = 0;
+
+	vfio_dev_fd = rte_intr_dev_fd_get(intr_handle);
+	ret = ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+	if (ret)
+		CDX_BUS_ERR("Error disabling MSI interrupts for fd %d",
+			rte_intr_fd_get(intr_handle));
+
+	return ret;
+}
diff --git a/drivers/bus/cdx/version.map b/drivers/bus/cdx/version.map
index 957fcab978..2f3d484ebd 100644
--- a/drivers/bus/cdx/version.map
+++ b/drivers/bus/cdx/version.map
@@ -6,6 +6,8 @@  INTERNAL {
 	rte_cdx_register;
 	rte_cdx_unmap_device;
 	rte_cdx_unregister;
+	rte_cdx_vfio_intr_disable;
+	rte_cdx_vfio_intr_enable;
 
 	local: *;
 };
diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
index 97b64fed58..a0167d9ad4 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -398,6 +398,27 @@  int rte_intr_elist_index_set(struct rte_intr_handle *intr_handle,
 	return -rte_errno;
 }
 
+int rte_intr_irq_count_set(struct rte_intr_handle *intr_handle,
+	int irq_count)
+{
+	CHECK_VALID_INTR_HANDLE(intr_handle);
+
+	intr_handle->irq_count = irq_count;
+
+	return 0;
+fail:
+	return -rte_errno;
+}
+
+int rte_intr_irq_count_get(const struct rte_intr_handle *intr_handle)
+{
+	CHECK_VALID_INTR_HANDLE(intr_handle);
+
+	return intr_handle->irq_count;
+fail:
+	return -rte_errno;
+}
+
 int rte_intr_vec_list_alloc(struct rte_intr_handle *intr_handle,
 	const char *name, int size)
 {
diff --git a/lib/eal/common/eal_interrupts.h b/lib/eal/common/eal_interrupts.h
index 482781b862..237f471a76 100644
--- a/lib/eal/common/eal_interrupts.h
+++ b/lib/eal/common/eal_interrupts.h
@@ -16,6 +16,7 @@  struct rte_intr_handle {
 	};
 	uint32_t alloc_flags;	/**< flags passed at allocation */
 	enum rte_intr_handle_type type;  /**< handle type */
+	uint32_t irq_count;		/**< IRQ count provided via VFIO */
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
 	uint8_t efd_counter_size;      /**< size of efd counter, used for vdev */
diff --git a/lib/eal/include/rte_interrupts.h b/lib/eal/include/rte_interrupts.h
index 487e3c8875..bc477a483f 100644
--- a/lib/eal/include/rte_interrupts.h
+++ b/lib/eal/include/rte_interrupts.h
@@ -506,6 +506,38 @@  __rte_internal
 int
 rte_intr_max_intr_get(const struct rte_intr_handle *intr_handle);
 
+/**
+ * @internal
+ * Set the irq count field of interrupt handle with user
+ * provided irq count value.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ * @param irq_count
+ *  IRQ count
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value and rte_errno is set.
+ */
+__rte_internal
+int
+rte_intr_irq_count_set(struct rte_intr_handle *intr_handle, int irq_count);
+
+/**
+ * @internal
+ * Returns the irq count field of the given interrupt handle instance.
+ *
+ * @param intr_handle
+ *  pointer to the interrupt handle.
+ *
+ * @return
+ *  - On success, IRQ count.
+ *  - On failure, a negative value and rte_errno is set.
+ */
+__rte_internal
+int rte_intr_irq_count_get(const struct rte_intr_handle *intr_handle);
+
 /**
  * @internal
  * Set the number of event fd field of interrupt handle
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 6d6978f108..14bf7ade77 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -458,6 +458,8 @@  INTERNAL {
 	rte_intr_instance_dup;
 	rte_intr_instance_windows_handle_get;
 	rte_intr_instance_windows_handle_set;
+	rte_intr_irq_count_get;
+	rte_intr_irq_count_set;
 	rte_intr_max_intr_get;
 	rte_intr_max_intr_set;
 	rte_intr_nb_efd_get;