vhost:fix crash on port deletion and VM shutdown at same time
Checks
Commit Message
The vhost_user_read_cb() and rte_vhost_driver_unregister() can be
called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb()
and removes the vsocket from conn_list, then thread2 calls
rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the
conn_list. So thread1 will access invalid memory when trying to reconnect.
The fix is to move the "removing of vsocket from conn_list" to end of the
vhost_user_read_cb(), then avoid the race condition.
The core trace is
Program terminated with signal 11, Segmentation fault.
Signed-off-by: Zhike Wang <wangzhike@jd.com>
---
lib/librte_vhost/socket.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
Comments
On 1/16/20 3:07 AM, Zhike Wang wrote:
> The vhost_user_read_cb() and rte_vhost_driver_unregister() can be
> called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb()
> and removes the vsocket from conn_list, then thread2 calls
> rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the
> conn_list. So thread1 will access invalid memory when trying to reconnect.
>
> The fix is to move the "removing of vsocket from conn_list" to end of the
> vhost_user_read_cb(), then avoid the race condition.
>
> The core trace is
> Program terminated with signal 11, Segmentation fault.
Need to add the Fixes tag;
Fixes: af1475918124 ("vhost: introduce API to start a specific driver")
Cc: stable@dpdk.org
> Signed-off-by: Zhike Wang <wangzhike@jd.com>
> ---
> lib/librte_vhost/socket.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index ebb2ff6..01a9dec 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -318,16 +318,16 @@ struct vhost_user {
>
> vhost_destroy_device(conn->vid);
>
> + if (vsocket->reconnect) {
> + create_unix_socket(vsocket);
> + vhost_user_start_client(vsocket);
> + }
> +
> pthread_mutex_lock(&vsocket->conn_mutex);
> TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> pthread_mutex_unlock(&vsocket->conn_mutex);
>
> free(conn);
> -
> - if (vsocket->reconnect) {
> - create_unix_socket(vsocket);
> - vhost_user_start_client(vsocket);
> - }
> }
> }
>
>
No need to send v2, I'll add it when applying to my tree.
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 1/16/20 3:07 AM, Zhike Wang wrote:
> The vhost_user_read_cb() and rte_vhost_driver_unregister() can be
> called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb()
> and removes the vsocket from conn_list, then thread2 calls
> rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the
> conn_list. So thread1 will access invalid memory when trying to reconnect.
>
> The fix is to move the "removing of vsocket from conn_list" to end of the
> vhost_user_read_cb(), then avoid the race condition.
>
> The core trace is
> Program terminated with signal 11, Segmentation fault.
>
> Signed-off-by: Zhike Wang <wangzhike@jd.com>
> ---
> lib/librte_vhost/socket.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
Applied to dpdk-next-virtio tree.
Thanks,
Maxime
@@ -318,16 +318,16 @@ struct vhost_user {
vhost_destroy_device(conn->vid);
+ if (vsocket->reconnect) {
+ create_unix_socket(vsocket);
+ vhost_user_start_client(vsocket);
+ }
+
pthread_mutex_lock(&vsocket->conn_mutex);
TAILQ_REMOVE(&vsocket->conn_list, conn, next);
pthread_mutex_unlock(&vsocket->conn_mutex);
free(conn);
-
- if (vsocket->reconnect) {
- create_unix_socket(vsocket);
- vhost_user_start_client(vsocket);
- }
}
}