vdpa/mlx5: fix notification timing
Commit Message
The issue is relevant only for the timer event modes: 0 and 1.

When the HW finishes consuming a burst of the guest Rx descriptors,
it creates a CQE in the CQ.
When traffic stops, the mlx5 driver arms the CQ to get a notification
when a specific CQE index is created. The index to be armed is the
next CQE index which should be polled by the driver.

The mlx5 driver configured the kernel driver to send a notification to
the guest callfd at the same time as the armed CQE event.
This means that the guest was notified only for the first CQE in each
poll cycle, so if the driver polled the CQEs of all the available
virtio queue descriptors, the guest was not notified again for the rest
because there was no new CQE to trigger the guest notification.

Hence, the Rx queues might get stuck when the guest did not work in
poll mode.

Remove the prior kernel notification, and do a manual notification
after CQ polling.

Fixes: a9dd7275a149 ("vdpa/mlx5: optimize notification events")
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
---
drivers/vdpa/mlx5/mlx5_vdpa_event.c | 17 +++--------------
1 file changed, 3 insertions(+), 14 deletions(-)
Comments
On 7/27/20 4:00 PM, Matan Azrad wrote:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 7/27/20 4:00 PM, Matan Azrad wrote:
Applied to dpdk-next-virtio/master
Thanks,
Maxime
@@ -172,17 +172,6 @@
 		rte_errno = errno;
 		goto error;
 	}
-	if (callfd != -1 &&
-	    priv->event_mode != MLX5_VDPA_EVENT_MODE_ONLY_INTERRUPT) {
-		ret = mlx5_glue->devx_subscribe_devx_event_fd(priv->eventc,
-							      callfd,
-							      cq->cq->obj, 0);
-		if (ret) {
-			DRV_LOG(ERR, "Failed to subscribe CQE event fd.");
-			rte_errno = errno;
-			goto error;
-		}
-	}
 	cq->callfd = callfd;
 	/* Init CQ to ones to be in HW owner in the start. */
 	cq->cqes[0].op_own = MLX5_CQE_OWNER_MASK;
@@ -352,11 +341,11 @@
 						   struct mlx5_vdpa_virtq, eqp);
 		mlx5_vdpa_cq_poll(cq);
+		/* Notify guest for descs consuming. */
+		if (cq->callfd != -1)
+			eventfd_write(cq->callfd, (eventfd_t)1);
 		if (priv->event_mode == MLX5_VDPA_EVENT_MODE_ONLY_INTERRUPT) {
 			mlx5_vdpa_cq_arm(priv, cq);
-			/* Notify guest for descs consuming. */
-			if (cq->callfd != -1)
-				eventfd_write(cq->callfd, (eventfd_t)1);
 			return;
 		}
 		/* Don't arm again - timer will take control. */
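
The ordering introduced by the second hunk (poll the CQ first, then write to the guest callfd when one is set) can be tried outside the driver with a plain Linux eventfd. The following is only a minimal sketch under stated assumptions: `struct demo_cq`, `demo_cq_poll()` and `demo_poll_and_notify()` are hypothetical stand-ins invented for illustration and are not part of the mlx5 vDPA driver. Only the `callfd != -1` check and the `eventfd_write()` call mirror the patch.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the driver CQ; only callfd mirrors the patch. */
struct demo_cq {
	int callfd;           /* guest call eventfd, or -1 when not set */
	unsigned int pending; /* simulated CQEs waiting to be polled */
};

/* Simulated poll: consume every pending CQE and return how many there were. */
static unsigned int
demo_cq_poll(struct demo_cq *cq)
{
	unsigned int n = cq->pending;

	cq->pending = 0;
	return n;
}

/*
 * Poll first, then notify the guest manually, in the same order as the
 * second hunk. Returns 0 on success or when no callfd is set, and -1 if
 * the eventfd write fails.
 */
static int
demo_poll_and_notify(struct demo_cq *cq)
{
	demo_cq_poll(cq);
	if (cq->callfd != -1)
		return eventfd_write(cq->callfd, (eventfd_t)1);
	return 0;
}
```

In this sketch the notification no longer depends on a new CQE arriving after the poll: each call adds 1 to the eventfd counter, so a guest blocked on the callfd is woken even when the poll has already drained everything.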