vhost:fix crash on port deletion and VM shutdown at same time

Message ID 1579140457-14805-1-git-send-email-wangzhike@jd.com (mailing list archive)
State Accepted, archived
Delegated to: Maxime Coquelin
Headers
Series vhost:fix crash on port deletion and VM shutdown at same time |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/Intel-compilation fail Compilation issues
ci/travis-robot success Travis build: passed
ci/iol-testing success Testing PASS

Commit Message

王志克 Jan. 16, 2020, 2:07 a.m. UTC
  The vhost_user_read_cb() and rte_vhost_driver_unregister() can be
called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb()
and removes the vsocket from conn_list, then thread2 calls
rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the
conn_list. So thread1 will access invalid memory when trying to reconnect.

The fix is to move the "removing of vsocket from conn_list" to end of the
vhost_user_read_cb(), then avoid the race condition.

The core trace is
Program terminated with signal 11, Segmentation fault.

Signed-off-by: Zhike Wang <wangzhike@jd.com>
---
 lib/librte_vhost/socket.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
  

Comments

Maxime Coquelin Feb. 4, 2020, 6:16 p.m. UTC | #1
On 1/16/20 3:07 AM, Zhike Wang wrote:
> The vhost_user_read_cb() and rte_vhost_driver_unregister() can be
> called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb()
> and removes the vsocket from conn_list, then thread2 calls
> rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the
> conn_list. So thread1 will access invalid memory when trying to reconnect.
> 
> The fix is to move the "removing of vsocket from conn_list" to end of the
> vhost_user_read_cb(), then avoid the race condition.
> 
> The core trace is
> Program terminated with signal 11, Segmentation fault.

Need to add the Fixes tag;

Fixes: af1475918124 ("vhost: introduce API to start a specific driver")
Cc: stable@dpdk.org

> Signed-off-by: Zhike Wang <wangzhike@jd.com>
> ---
>  lib/librte_vhost/socket.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index ebb2ff6..01a9dec 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -318,16 +318,16 @@ struct vhost_user {
>  
>  		vhost_destroy_device(conn->vid);
>  
> +		if (vsocket->reconnect) {
> +			create_unix_socket(vsocket);
> +			vhost_user_start_client(vsocket);
> +		}
> +
>  		pthread_mutex_lock(&vsocket->conn_mutex);
>  		TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>  		pthread_mutex_unlock(&vsocket->conn_mutex);
>  
>  		free(conn);
> -
> -		if (vsocket->reconnect) {
> -			create_unix_socket(vsocket);
> -			vhost_user_start_client(vsocket);
> -		}
>  	}
>  }
>  
> 

No need to send v2, I'll add it when applying to my tree.

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime
  
Maxime Coquelin Feb. 5, 2020, 9:46 a.m. UTC | #2
On 1/16/20 3:07 AM, Zhike Wang wrote:
> The vhost_user_read_cb() and rte_vhost_driver_unregister() can be
> called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb()
> and removes the vsocket from conn_list, then thread2 calls
> rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the
> conn_list. So thread1 will access invalid memory when trying to reconnect.
> 
> The fix is to move the "removing of vsocket from conn_list" to end of the
> vhost_user_read_cb(), then avoid the race condition.
> 
> The core trace is
> Program terminated with signal 11, Segmentation fault.
> 
> Signed-off-by: Zhike Wang <wangzhike@jd.com>
> ---
>  lib/librte_vhost/socket.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)

Applied to dpdk-next-virtio tree.

Thanks,
Maxime
  

Patch

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index ebb2ff6..01a9dec 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -318,16 +318,16 @@  struct vhost_user {
 
 		vhost_destroy_device(conn->vid);
 
+		if (vsocket->reconnect) {
+			create_unix_socket(vsocket);
+			vhost_user_start_client(vsocket);
+		}
+
 		pthread_mutex_lock(&vsocket->conn_mutex);
 		TAILQ_REMOVE(&vsocket->conn_list, conn, next);
 		pthread_mutex_unlock(&vsocket->conn_mutex);
 
 		free(conn);
-
-		if (vsocket->reconnect) {
-			create_unix_socket(vsocket);
-			vhost_user_start_client(vsocket);
-		}
 	}
 }