[RFC] vhost: support raising device error

Message ID 1603697101-8947-1-git-send-email-xuemingl@nvidia.com (mailing list archive)
State Changes Requested, archived
Delegated to: Maxime Coquelin
Series: [RFC] vhost: support raising device error

Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  success  Compilation OK

Commit Message

Xueming Li Oct. 26, 2020, 7:25 a.m. UTC
According to the virtio spec, the device SHOULD set DEVICE_NEEDS_RESET when
it enters an error state that a reset is needed. If DRIVER_OK is set,
after it sets DEVICE_NEEDS_RESET, the device MUST send a device
configuration change notification to the driver.

This patch introduces a new API to raise a vDPA hardware error and
escalate the configuration change to vhost via the client message
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG.

The vhost should check DRIVER_OK and decide whether to notify the driver.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_vhost/rte_vdpa_dev.h | 12 ++++++++++++
 lib/librte_vhost/version.map    |  1 +
 lib/librte_vhost/vhost_user.c   | 14 ++++++++++++++
 3 files changed, 27 insertions(+)
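
The status handling described in the commit message can be sketched in isolation. This is a minimal illustration, not the patch's implementation: `struct sketch_dev` and `sketch_raise_error()` are hypothetical names, and only the two status bit values are taken from the virtio specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define STATUS_DRIVER_OK          0x04
#define STATUS_DEVICE_NEEDS_RESET 0x40

/* Hypothetical stand-in for the vhost device state; not the DPDK struct. */
struct sketch_dev {
	uint8_t status;
};

/*
 * Mark the device as needing a reset and report whether a configuration
 * change notification is required, i.e. whether DRIVER_OK is set.
 */
static bool
sketch_raise_error(struct sketch_dev *dev)
{
	dev->status |= STATUS_DEVICE_NEEDS_RESET;
	return (dev->status & STATUS_DRIVER_OK) != 0;
}
```

In this sketch the caller decides what to do with the return value; the patch instead always sends the message and leaves the DRIVER_OK check to the vhost side.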
  

Comments

Maxime Coquelin Jan. 5, 2021, 9:32 a.m. UTC | #1
Hi Xueming,

On 10/26/20 8:25 AM, Xueming Li wrote:
> According to virtio spec, The device SHOULD set DEVICE_NEEDS_RESET when
> it enters an error state that a reset is needed. If DRIVER_OK is set,
> after it sets DEVICE_NEEDS_RESET, the device MUST send a device
> configuration change notification to the driver.
> 
> This patch introduces new api to raise vDPA hardware error and escalates
> configuration change to vhost via client message
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG.
> 
> The vhost should check DRIVER_OK and decide whether to notify driver.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
>  lib/librte_vhost/rte_vdpa_dev.h | 12 ++++++++++++
>  lib/librte_vhost/version.map    |  1 +
>  lib/librte_vhost/vhost_user.c   | 14 ++++++++++++++
>  3 files changed, 27 insertions(+)
> 
> diff --git a/lib/librte_vhost/rte_vdpa_dev.h b/lib/librte_vhost/rte_vdpa_dev.h
> index a60183f780..87b7397c6f 100644
> --- a/lib/librte_vhost/rte_vdpa_dev.h
> +++ b/lib/librte_vhost/rte_vdpa_dev.h
> @@ -117,6 +117,18 @@ rte_vdpa_unregister_device(struct rte_vdpa_device *dev);
>  int
>  rte_vhost_host_notifier_ctrl(int vid, uint16_t qid, bool enable);
>  
> +/**
> + * Set device hardware error and notify host.
> + *
> + * @param vid
> + *  vhost device id
> + * @return
> + *  0 on success, -1 on failure
> + */
> +__rte_experimental
> +int
> +rte_vhost_host_raise_error(int vid);
> +
>  /**
>   * Synchronize the used ring from mediated ring to guest, log dirty
>   * page for each writeable buffer, caller should handle the used
> diff --git a/lib/librte_vhost/version.map b/lib/librte_vhost/version.map
> index 9183d6f2fc..5a4c5dc818 100644
> --- a/lib/librte_vhost/version.map
> +++ b/lib/librte_vhost/version.map
> @@ -76,4 +76,5 @@ EXPERIMENTAL {
>  	rte_vhost_async_channel_unregister;
>  	rte_vhost_submit_enqueue_burst;
>  	rte_vhost_poll_enqueue_completed;
> +	rte_vhost_host_raise_error;
>  };
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index d20c8c57ad..d8353176f2 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -2992,6 +2992,20 @@ rte_vhost_slave_config_change(int vid, bool need_reply)
>  	return vhost_user_slave_config_change(dev, need_reply);
>  }
>  
> +int
> +rte_vhost_host_raise_error(int vid)
> +{
> +	struct virtio_net *dev;
> +
> +	dev = get_device(vid);
> +	if (!dev)
> +		return -ENODEV;
> +
> +	dev->status |= VIRTIO_DEVICE_STATUS_DEV_NEED_RESET;
> +
> +	return vhost_user_slave_config_change(dev, 0);

To send a VHOST_USER_SLAVE_CONFIG_CHANGE_MSG request, we need to
negotiate the VHOST_USER_PROTOCOL_F_CONFIG feature and ensure the
master supports it. From the vhost-user spec in QEMU:

"
``VHOST_USER_SLAVE_CONFIG_CHANGE_MSG``
  :id: 2
  :equivalent ioctl: N/A
  :slave payload: N/A
  :master payload: N/A

  When ``VHOST_USER_PROTOCOL_F_CONFIG`` is negotiated, vhost-user
  slave sends such messages to notify that the virtio device's
  configuration space has changed, for those host devices which can
  support such feature, host driver can send ``VHOST_USER_GET_CONFIG``
  message to slave to get the latest content. If
  ``VHOST_USER_PROTOCOL_F_REPLY_ACK`` is negotiated, and slave set the
  ``VHOST_USER_NEED_REPLY`` flag, master must respond with zero when
  operation is successfully completed, or non-zero otherwise.
"

Also, the spec does not clearly say that this message should be sent on
status updates.

Finally, if we advertise support for VHOST_USER_PROTOCOL_F_CONFIG, we
will have to handle VHOST_USER_GET_CONFIG and VHOST_USER_SET_CONFIG
requests in the backend.

I think it might be interesting to support it in the case of vDPA, but
full support needs to be added, otherwise it will break.

Thanks,
Maxime

> +}
> +
>  static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev,
>  						    int index, int fd,
>  						    uint64_t offset,
>
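
The negotiation requirement raised in the review can be sketched as a standalone check. This is only an illustration under stated assumptions: the two helper functions are hypothetical and are not part of librte_vhost, and the caller is assumed to pass in the already-negotiated protocol feature mask. The bit numbers are those given in the vhost-user specification.

```c
#include <errno.h>
#include <stdint.h>

/* Protocol feature bit numbers from the vhost-user specification. */
#define VHOST_USER_PROTOCOL_F_REPLY_ACK 3
#define VHOST_USER_PROTOCOL_F_CONFIG    9

/*
 * Hypothetical helper: return 0 if a config change message may be sent
 * for the given negotiated protocol features, -ENOTSUP otherwise.
 */
static int
sketch_config_change_allowed(uint64_t protocol_features)
{
	if (!(protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CONFIG)))
		return -ENOTSUP;
	return 0;
}

/*
 * Hypothetical helper: the NEED_REPLY flag is only meaningful when
 * REPLY_ACK has been negotiated. Returns 1 if it may be set, else 0.
 */
static int
sketch_need_reply_allowed(uint64_t protocol_features)
{
	return (protocol_features &
		(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK)) != 0;
}
```

A guard of this kind would sit before the call to vhost_user_slave_config_change(); it does not address the other point of the review, namely that advertising the feature also requires handling VHOST_USER_GET_CONFIG and VHOST_USER_SET_CONFIG.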
  

Patch

diff --git a/lib/librte_vhost/rte_vdpa_dev.h b/lib/librte_vhost/rte_vdpa_dev.h
index a60183f780..87b7397c6f 100644
--- a/lib/librte_vhost/rte_vdpa_dev.h
+++ b/lib/librte_vhost/rte_vdpa_dev.h
@@ -117,6 +117,18 @@  rte_vdpa_unregister_device(struct rte_vdpa_device *dev);
 int
 rte_vhost_host_notifier_ctrl(int vid, uint16_t qid, bool enable);
 
+/**
+ * Set device hardware error and notify host.
+ *
+ * @param vid
+ *  vhost device id
+ * @return
+ *  0 on success, negative value on failure
+ */
+__rte_experimental
+int
+rte_vhost_host_raise_error(int vid);
+
 /**
  * Synchronize the used ring from mediated ring to guest, log dirty
  * page for each writeable buffer, caller should handle the used
diff --git a/lib/librte_vhost/version.map b/lib/librte_vhost/version.map
index 9183d6f2fc..5a4c5dc818 100644
--- a/lib/librte_vhost/version.map
+++ b/lib/librte_vhost/version.map
@@ -76,4 +76,5 @@  EXPERIMENTAL {
 	rte_vhost_async_channel_unregister;
 	rte_vhost_submit_enqueue_burst;
 	rte_vhost_poll_enqueue_completed;
+	rte_vhost_host_raise_error;
 };
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index d20c8c57ad..d8353176f2 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2992,6 +2992,20 @@  rte_vhost_slave_config_change(int vid, bool need_reply)
 	return vhost_user_slave_config_change(dev, need_reply);
 }
 
+int
+rte_vhost_host_raise_error(int vid)
+{
+	struct virtio_net *dev;
+
+	dev = get_device(vid);
+	if (!dev)
+		return -ENODEV;
+
+	dev->status |= VIRTIO_DEVICE_STATUS_DEV_NEED_RESET;
+
+	return vhost_user_slave_config_change(dev, 0);
+}
+
 static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev,
 						    int index, int fd,
 						    uint64_t offset,