Message ID | 20190128065549.98266-1-findtheonlyway@gmail.com |
---|---|
State | New |
Delegated to: | Maxime Coquelin |
Context | Check | Description |
---|---|---|
ci/mellanox-Performance-Testing | success | Performance Testing PASS |
ci/intel-Performance-Testing | success | Performance Testing PASS |
ci/Intel-compilation | success | Compilation OK |
ci/checkpatch | success | coding style OK |
On 1/28/19 7:55 AM, sunwenjie wrote:
> When rte_vhost_driver_unregister deletes the connection fd,
> fdset_try_del will always retry and does not release the
> vhost_user.mutex if the fd is busy, but the fdset_event_dispatch
> will set the fd to busy and call vhost_user_msg_handler to get
> vhost_user.mutex, which will cause deadlock. Unlock the
> vhost_user.mutex if fdset_try_del fails and relock it when retrying.

What about this wording:

In rte_vhost_driver_unregister(), the connection fd is removed from
the fdset using fdset_try_del(). A call to this function may fail
if the corresponding fd is in busy state, indicating that the event
dispatcher is executing the read or write callback on this fd.
When it happens, rte_vhost_driver_unregister() keeps trying to
remove the fd from the set until it is no longer busy.

This situation is causing a deadlock, because
rte_vhost_driver_unregister() keeps trying to remove the fd from
the set with vhost_user.mutex held, while the callback executed
by the dispatcher, vhost_user_read_cb(), also takes this mutex at
numerous places.

The fix consists of releasing vhost_user.mutex between each retry
in rte_vhost_driver_unregister().

>
> Fixes: 8b4b949144b8 ("vhost: fix dead lock on closing in server mode")
> Cc: stable@dpdk.org
>
> Signed-off-by: sunwenjie <findtheonlyway@gmail.com>

We need your real name for legal reasons:
Signed-off-by: Firstname Lastname <findtheonlyway@gmail.com>

No need to resubmit, I can handle the commit message fixup and
the fix looks good to me:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

As soon as I get your name in the above format I will apply the patch
in the Virtio tree. Thanks for submitting the fix.

Maxime

> ---
> lib/librte_vhost/socket.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index 9cf34ad17..9883b0491 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -961,13 +961,13 @@ rte_vhost_driver_unregister(const char *path)
>  	int count;
>  	struct vhost_user_connection *conn, *next;
>
> +again:
>  	pthread_mutex_lock(&vhost_user.mutex);
>
>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>
>  		if (!strcmp(vsocket->path, path)) {
> -again:
>  			pthread_mutex_lock(&vsocket->conn_mutex);
>  			for (conn = TAILQ_FIRST(&vsocket->conn_list);
>  			     conn != NULL;
> @@ -983,6 +983,7 @@ rte_vhost_driver_unregister(const char *path)
>  						  conn->connfd) == -1) {
>  					pthread_mutex_unlock(
>  							&vsocket->conn_mutex);
> +					pthread_mutex_unlock(&vhost_user.mutex);
>  					goto again;
>  				}
>
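The "busy" behaviour described in the proposed wording can be pictured with a small, self-contained sketch. This is not the DPDK fdset code: `struct fd_table`, `table_add()`, `table_set_busy()` and `table_try_del()` are hypothetical names, and the return convention (0 when the fd is removed or absent, -1 when it is busy) is an assumption made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_FDS 8

/* Hypothetical, simplified fd table; NOT the DPDK fdset layout. */
struct fd_entry {
	int fd;    /* -1 marks a free slot */
	bool busy; /* true while a callback is running on this fd */
};

struct fd_table {
	pthread_mutex_t lock; /* protects entries[] */
	struct fd_entry entries[MAX_FDS];
};

void table_init(struct fd_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	for (int i = 0; i < MAX_FDS; i++) {
		t->entries[i].fd = -1;
		t->entries[i].busy = false;
	}
}

/* Caller must hold t->lock. Returns the slot index, or -1 if not found. */
static int find_slot(struct fd_table *t, int fd)
{
	for (int i = 0; i < MAX_FDS; i++)
		if (t->entries[i].fd == fd)
			return i;
	return -1;
}

/* Returns 0 on success, -1 if the table is full. */
int table_add(struct fd_table *t, int fd)
{
	pthread_mutex_lock(&t->lock);
	int i = find_slot(t, -1); /* look for a free slot */
	if (i != -1) {
		t->entries[i].fd = fd;
		t->entries[i].busy = false;
	}
	pthread_mutex_unlock(&t->lock);
	return i == -1 ? -1 : 0;
}

/* A dispatcher would set busy before a callback and clear it afterwards. */
int table_set_busy(struct fd_table *t, int fd, bool busy)
{
	pthread_mutex_lock(&t->lock);
	int i = find_slot(t, fd);
	if (i != -1)
		t->entries[i].busy = busy;
	pthread_mutex_unlock(&t->lock);
	return i == -1 ? -1 : 0;
}

/* Assumed convention: 0 if removed or absent, -1 if the fd is busy. */
int table_try_del(struct fd_table *t, int fd)
{
	int ret = 0;

	pthread_mutex_lock(&t->lock);
	int i = find_slot(t, fd);
	if (i != -1 && t->entries[i].busy)
		ret = -1; /* a callback is running: caller has to retry */
	else if (i != -1)
		t->entries[i].fd = -1;
	pthread_mutex_unlock(&t->lock);
	return ret;
}

/* Usage: deletion fails while the fd is busy and succeeds afterwards. */
void table_demo(void)
{
	struct fd_table t;

	table_init(&t);
	assert(table_add(&t, 5) == 0);
	assert(table_set_busy(&t, 5, true) == 0);
	assert(table_try_del(&t, 5) == -1);
	assert(table_set_busy(&t, 5, false) == 0);
	assert(table_try_del(&t, 5) == 0);
}
```

Because the delete attempt only fails rather than blocks, the caller decides what to do between attempts, which is where the locks it still holds start to matter.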
Thanks, Maxime. Your description is better. My real name is Wenjie Sun.

Signed-off-by: Wenjie Sun <findtheonlyway@gmail.com>

On Fri, Feb 8, 2019 at 10:12 PM, Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
>
> On 1/28/19 7:55 AM, sunwenjie wrote:
> > When rte_vhost_driver_unregister deletes the connection fd,
> > fdset_try_del will always retry and does not release the
> > vhost_user.mutex if the fd is busy, but the fdset_event_dispatch
> > will set the fd to busy and call vhost_user_msg_handler to get
> > vhost_user.mutex, which will cause deadlock. Unlock the
> > vhost_user.mutex if fdset_try_del fails and relock it when retrying.
>
> What about this wording:
>
> In rte_vhost_driver_unregister(), the connection fd is removed from
> the fdset using fdset_try_del(). A call to this function may fail
> if the corresponding fd is in busy state, indicating that the event
> dispatcher is executing the read or write callback on this fd.
> When it happens, rte_vhost_driver_unregister() keeps trying to
> remove the fd from the set until it is no longer busy.
>
> This situation is causing a deadlock, because
> rte_vhost_driver_unregister() keeps trying to remove the fd from
> the set with vhost_user.mutex held, while the callback executed
> by the dispatcher, vhost_user_read_cb(), also takes this mutex at
> numerous places.
>
> The fix consists of releasing vhost_user.mutex between each retry
> in rte_vhost_driver_unregister().
>
> >
> > Fixes: 8b4b949144b8 ("vhost: fix dead lock on closing in server mode")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: sunwenjie <findtheonlyway@gmail.com>
>
> We need your real name for legal reasons:
> Signed-off-by: Firstname Lastname <findtheonlyway@gmail.com>
>
> No need to resubmit, I can handle the commit message fixup and
> the fix looks good to me:
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> As soon as I get your name in the above format I will apply the patch
> in the Virtio tree. Thanks for submitting the fix.
>
> Maxime
>
> > ---
> > lib/librte_vhost/socket.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> > index 9cf34ad17..9883b0491 100644
> > --- a/lib/librte_vhost/socket.c
> > +++ b/lib/librte_vhost/socket.c
> > @@ -961,13 +961,13 @@ rte_vhost_driver_unregister(const char *path)
> >  	int count;
> >  	struct vhost_user_connection *conn, *next;
> >
> > +again:
> >  	pthread_mutex_lock(&vhost_user.mutex);
> >
> >  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
> >  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> >
> >  		if (!strcmp(vsocket->path, path)) {
> > -again:
> >  			pthread_mutex_lock(&vsocket->conn_mutex);
> >  			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> >  			     conn != NULL;
> > @@ -983,6 +983,7 @@ rte_vhost_driver_unregister(const char *path)
> >  						  conn->connfd) == -1) {
> >  					pthread_mutex_unlock(
> >  							&vsocket->conn_mutex);
> > +					pthread_mutex_unlock(&vhost_user.mutex);
> >  					goto again;
> >  				}
> >
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 9cf34ad17..9883b0491 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -961,13 +961,13 @@ rte_vhost_driver_unregister(const char *path)
 	int count;
 	struct vhost_user_connection *conn, *next;

+again:
 	pthread_mutex_lock(&vhost_user.mutex);

 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];

 		if (!strcmp(vsocket->path, path)) {
-again:
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
 			     conn != NULL;
@@ -983,6 +983,7 @@ rte_vhost_driver_unregister(const char *path)
 						  conn->connfd) == -1) {
 					pthread_mutex_unlock(
 							&vsocket->conn_mutex);
+					pthread_mutex_unlock(&vhost_user.mutex);
 					goto again;
 				}
When rte_vhost_driver_unregister deletes the connection fd,
fdset_try_del will always retry and does not release the
vhost_user.mutex if the fd is busy, but the fdset_event_dispatch
will set the fd to busy and call vhost_user_msg_handler to get
vhost_user.mutex, which will cause deadlock. Unlock the
vhost_user.mutex if fdset_try_del fails and relock it when retrying.

Fixes: 8b4b949144b8 ("vhost: fix dead lock on closing in server mode")
Cc: stable@dpdk.org

Signed-off-by: sunwenjie <findtheonlyway@gmail.com>
---
lib/librte_vhost/socket.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
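
The locking pattern behind the fix can be reproduced outside DPDK with plain pthreads. The sketch below is a minimal illustration under stated assumptions: `global_mutex` stands in for vhost_user.mutex, `fd_busy` for the fdset busy flag, `read_cb()` for a dispatcher callback that needs the global mutex, and `unregister_fd()` for the retry loop. All of these names are hypothetical, and the short sleep between retries exists only to keep the demo from spinning; the patch itself does not sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-ins for vhost_user.mutex and the fdset state. */
static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t fd_mutex = PTHREAD_MUTEX_INITIALIZER;
static int fd_busy;    /* 1 while the callback is considered running */
static int fd_removed; /* 1 once the fd has been deleted */

static void short_sleep(void)
{
	struct timespec ts = { 0, 1000000 }; /* 1 ms, demo only */
	nanosleep(&ts, NULL);
}

/* Mimics a try-delete: fails with -1 while the fd is busy. */
static int try_del(void)
{
	int ret = 0;

	pthread_mutex_lock(&fd_mutex);
	if (fd_busy)
		ret = -1;
	else
		fd_removed = 1;
	pthread_mutex_unlock(&fd_mutex);
	return ret;
}

/* Callback side: needs the global mutex before it can finish. */
static void *read_cb(void *arg)
{
	(void)arg;
	short_sleep();
	pthread_mutex_lock(&global_mutex); /* would block forever if the
					    * retry loop never released it */
	pthread_mutex_unlock(&global_mutex);

	pthread_mutex_lock(&fd_mutex);
	fd_busy = 0; /* callback done, fd can now be deleted */
	pthread_mutex_unlock(&fd_mutex);
	return NULL;
}

/* Retry loop: the global mutex is dropped before every retry. */
static int unregister_fd(void)
{
	int retries = 0;

again:
	pthread_mutex_lock(&global_mutex);
	if (try_del() == -1) {
		pthread_mutex_unlock(&global_mutex); /* the key step */
		retries++;
		short_sleep();
		goto again;
	}
	pthread_mutex_unlock(&global_mutex);
	return retries;
}

/* Returns the number of retries needed, or -1 if the thread could not start. */
int run_demo(void)
{
	pthread_t tid;
	int retries;

	fd_busy = 1; /* pretend the dispatcher already marked the fd busy */
	fd_removed = 0;
	if (pthread_create(&tid, NULL, read_cb, NULL) != 0)
		return -1;
	retries = unregister_fd();
	pthread_join(tid, NULL);
	assert(fd_removed == 1);
	return retries;
}
```

If the `pthread_mutex_unlock(&global_mutex)` inside the retry branch is removed and the `again:` label is moved below the lock call, `read_cb()` can never acquire the mutex, the busy flag is never cleared, and `unregister_fd()` spins forever, which is the deadlock this patch addresses.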