From patchwork Thu Feb 29 12:24:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137480 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2F0CB43C36; Thu, 29 Feb 2024 13:25:14 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1AD9A41133; Thu, 29 Feb 2024 13:25:12 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 5FAC240E4A for ; Thu, 29 Feb 2024 13:25:10 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209509; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Xypbq0nHIUw3GjM/Pybi6OIeg99F/kSbPa6LqQhXBzs=; b=YfMF/HXXlRzxlKZWAWTyyQb6EBFsnfGMf+BeOkJefKmIrYP0Js3e+AucbajYvuMW2+8YS1 TAbUk4lhvgUmdVM4OvZh9F88OQ9pK9jrYvXcMhHtaoMqd7yMHPWUxf95ehuQEeoBikMOhr 5PO35eGy1665eZkbGsIeMQcAC9ccdHk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-91-Mw5sRnjxPGmyuRwApqYK6w-1; Thu, 29 Feb 2024 07:25:08 -0500 X-MC-Unique: Mw5sRnjxPGmyuRwApqYK6w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 391F3185A787; Thu, 29 Feb 2024 12:25:07 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id B77FC112132A; Thu, 29 Feb 2024 12:25:05 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin , stable@dpdk.org Subject: [PATCH 1/7] vhost: fix VDUSE device destruction failure Date: Thu, 29 Feb 2024 13:24:56 +0100 Message-ID: <20240229122502.2572343-2-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org VDUSE_DESTROY_DEVICE ioctl can fail because the device's chardev is not released despite close syscall having been called. It happens because the events handler thread is still polling the file descriptor. fdset_pipe_notify() is not enough because it does not ensure the notification has been handled by the event thread, it just returns once the notification is sent. To fix this, this patch introduces a synchronization mechanism based on pthread's condition, so that fdset_pipe_notify() only returns once the pipe's read callback has been executed. Fixes: 51d018fdac4e ("vhost: add VDUSE events handler") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 21 ++++++++++++++++++--- lib/vhost/fd_man.h | 5 +++++ 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 79a8d2c006..42ce059039 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -309,10 +309,11 @@ fdset_event_dispatch(void *arg) } static void -fdset_pipe_read_cb(int readfd, void *dat __rte_unused, +fdset_pipe_read_cb(int readfd, void *dat, int *remove __rte_unused) { char charbuf[16]; + struct fdset *fdset = dat; int r = read(readfd, charbuf, sizeof(charbuf)); /* * Just an optimization, we don't care if read() failed @@ -320,6 +321,11 @@ fdset_pipe_read_cb(int readfd, void *dat __rte_unused, * compiler happy */ RTE_SET_USED(r); + + pthread_mutex_lock(&fdset->sync_mutex); + fdset->sync = true; + pthread_cond_broadcast(&fdset->sync_cond); + pthread_mutex_unlock(&fdset->sync_mutex); } void @@ -342,7 +348,7 @@ fdset_pipe_init(struct fdset *fdset) } ret = fdset_add(fdset, fdset->u.readfd, - fdset_pipe_read_cb, NULL, NULL); + fdset_pipe_read_cb, NULL, fdset); if (ret < 0) { VHOST_FDMAN_LOG(ERR, @@ -359,7 +365,12 @@ fdset_pipe_init(struct fdset *fdset) void fdset_pipe_notify(struct fdset *fdset) { - int r = write(fdset->u.writefd, "1", 1); + int r; + + pthread_mutex_lock(&fdset->sync_mutex); + + fdset->sync = false; + r = write(fdset->u.writefd, "1", 1); /* * Just an optimization, we don't care if write() failed * so ignore explicitly its return value to make the @@ -367,4 +378,8 @@ fdset_pipe_notify(struct fdset *fdset) */ RTE_SET_USED(r); + while (!fdset->sync) + pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex); + + pthread_mutex_unlock(&fdset->sync_mutex); } diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h index 6315904c8e..cc19937612 100644 --- a/lib/vhost/fd_man.h +++ b/lib/vhost/fd_man.h @@ -6,6 +6,7 @@ #define _FD_MAN_H_ #include #include +#include #define MAX_FDS 1024 @@ -35,6 +36,10 @@ struct fdset { int writefd; }; } u; + + pthread_mutex_t sync_mutex; + pthread_cond_t sync_cond; + bool sync; }; From patchwork Thu Feb 29 12:24:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137481 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A458243C36; Thu, 29 Feb 2024 13:25:20 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 60825427E5; Thu, 29 Feb 2024 13:25:14 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 7F313411F3 for ; Thu, 29 Feb 2024 13:25:12 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209512; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AbKRMTcvqci7XHWdD+4oMUp5pT+Nkfptl//SiN1Jdlc=; b=dzrHXMc7WmCFVdowcp5zplXDtRuM8BavgINKi1UWM2J5e/BysBAgSGAK3GepQbWYKokbIM 6XFrxB0um/H517430MD0m8xhV9/n+RZGimgFp2DYuUhnQNFk77tZ+cad7ufFbN4br2bMwH oMopLKSSax4mQ0pWFExIdR9Guhf6zCM= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-341-VX4ArW1gMtyWSnZBhg_xtg-1; Thu, 29 Feb 2024 07:25:09 -0500 X-MC-Unique: VX4ArW1gMtyWSnZBhg_xtg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BE6B6280A9AE; Thu, 29 Feb 2024 12:25:08 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7B9B4111D3DE; Thu, 29 Feb 2024 12:25:07 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH 2/7] vhost: rename polling mutex Date: Thu, 29 Feb 2024 13:24:57 +0100 Message-ID: <20240229122502.2572343-3-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This trivial patch fixes a typo in fd's manager polling mutex name. Signed-off-by: Maxime Coquelin Reviewed-by: David Marchand --- lib/vhost/fd_man.c | 8 ++++---- lib/vhost/fd_man.h | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 42ce059039..5dde40e51a 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -125,9 +125,9 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) pthread_mutex_lock(&pfdset->fd_mutex); i = pfdset->num < MAX_FDS ? pfdset->num++ : -1; if (i == -1) { - pthread_mutex_lock(&pfdset->fd_pooling_mutex); + pthread_mutex_lock(&pfdset->fd_polling_mutex); fdset_shrink_nolock(pfdset); - pthread_mutex_unlock(&pfdset->fd_pooling_mutex); + pthread_mutex_unlock(&pfdset->fd_polling_mutex); i = pfdset->num < MAX_FDS ? pfdset->num++ : -1; if (i == -1) { pthread_mutex_unlock(&pfdset->fd_mutex); @@ -244,9 +244,9 @@ fdset_event_dispatch(void *arg) numfds = pfdset->num; pthread_mutex_unlock(&pfdset->fd_mutex); - pthread_mutex_lock(&pfdset->fd_pooling_mutex); + pthread_mutex_lock(&pfdset->fd_polling_mutex); val = poll(pfdset->rwfds, numfds, 1000 /* millisecs */); - pthread_mutex_unlock(&pfdset->fd_pooling_mutex); + pthread_mutex_unlock(&pfdset->fd_polling_mutex); if (val < 0) continue; diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h index cc19937612..2517ae5a9b 100644 --- a/lib/vhost/fd_man.h +++ b/lib/vhost/fd_man.h @@ -24,7 +24,7 @@ struct fdset { struct pollfd rwfds[MAX_FDS]; struct fdentry fd[MAX_FDS]; pthread_mutex_t fd_mutex; - pthread_mutex_t fd_pooling_mutex; + pthread_mutex_t fd_polling_mutex; int num; /* current fd number of this fdset */ union pipefds { From patchwork Thu Feb 29 12:24:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137483 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A74DF43C36; Thu, 29 Feb 2024 13:25:35 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8DE7942DBE; Thu, 29 Feb 2024 13:25:20 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 9B4E042D0B for ; Thu, 29 Feb 2024 13:25:14 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209514; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/EZoy7c4cRfT1LBbFPTEDCbGNG/pWOLRtgZqlJGlryc=; b=Iu5qbjaMqQHdmugnnaLCrqesPLESVuIybr+nCEpKRwaEBCVGBajXC/l/5xUwx0kRQWQojL j2u5ofZ5D43p2nUE3qhTljJhgRvmbnoB4pdtpNOq8rOFtlL2DpsSpM58y7IllBHIUKVsU5 sLdU5EDVXnSTfYETP51DJ7bZ1/8QL+U= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-481--RY2u7nOOqCmt1vIQdPiqA-1; Thu, 29 Feb 2024 07:25:10 -0500 X-MC-Unique: -RY2u7nOOqCmt1vIQdPiqA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 859CA3C0FCA0; Thu, 29 Feb 2024 12:25:10 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0DC1C111D3DA; Thu, 29 Feb 2024 12:25:08 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH 3/7] vhost: make use of FD manager init function Date: Thu, 29 Feb 2024 13:24:58 +0100 Message-ID: <20240229122502.2572343-4-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Instead of statically initialize the fdset, this patch converts VDUSE and Vhost-user to use fdset_init() function, which now also initialize the mutexes. This is preliminary rework to hide FDs manager pipe from its users. Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 9 +++++++-- lib/vhost/fd_man.h | 2 +- lib/vhost/socket.c | 11 +++++------ lib/vhost/vduse.c | 14 ++++++-------- 4 files changed, 19 insertions(+), 17 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 5dde40e51a..d33036a171 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -96,19 +96,24 @@ fdset_add_fd(struct fdset *pfdset, int idx, int fd, pfd->revents = 0; } -void +int fdset_init(struct fdset *pfdset) { int i; if (pfdset == NULL) - return; + return -1; + + pthread_mutex_init(&pfdset->fd_mutex, NULL); + pthread_mutex_init(&pfdset->fd_polling_mutex, NULL); for (i = 0; i < MAX_FDS; i++) { pfdset->fd[i].fd = -1; pfdset->fd[i].dat = NULL; } pfdset->num = 0; + + return 0; } /** diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h index 2517ae5a9b..92d24d8591 100644 --- a/lib/vhost/fd_man.h +++ b/lib/vhost/fd_man.h @@ -43,7 +43,7 @@ struct fdset { }; -void fdset_init(struct fdset *pfdset); +int fdset_init(struct fdset *pfdset); int fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat); diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index a2fdac30a4..b544e39be7 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -89,12 +89,6 @@ static int create_unix_socket(struct vhost_user_socket *vsocket); static int vhost_user_start_client(struct vhost_user_socket *vsocket); static struct vhost_user vhost_user = { - .fdset = { - .fd = { [0 ... MAX_FDS - 1] = {-1, NULL, NULL, NULL, 0} }, - .fd_mutex = PTHREAD_MUTEX_INITIALIZER, - .fd_pooling_mutex = PTHREAD_MUTEX_INITIALIZER, - .num = 0 - }, .vsocket_cnt = 0, .mutex = PTHREAD_MUTEX_INITIALIZER, }; @@ -1187,6 +1181,11 @@ rte_vhost_driver_start(const char *path) return vduse_device_create(path, vsocket->net_compliant_ol_flags); if (fdset_tid.opaque_id == 0) { + if (fdset_init(&vhost_user.fdset) < 0) { + VHOST_CONFIG_LOG(path, ERR, "Failed to init Vhost-user fdset"); + return -1; + } + /** * create a pipe which will be waited by poll and notified to * rebuild the wait list of poll. diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index d462428d2c..d83d7b0d7c 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -31,14 +31,7 @@ struct vduse { struct fdset fdset; }; -static struct vduse vduse = { - .fdset = { - .fd = { [0 ... MAX_FDS - 1] = {-1, NULL, NULL, NULL, 0} }, - .fd_mutex = PTHREAD_MUTEX_INITIALIZER, - .fd_pooling_mutex = PTHREAD_MUTEX_INITIALIZER, - .num = 0 - }, -}; +static struct vduse vduse; static bool vduse_events_thread; @@ -434,6 +427,11 @@ vduse_device_create(const char *path, bool compliant_ol_flags) /* If first device, create events dispatcher thread */ if (vduse_events_thread == false) { + if (fdset_init(&vduse.fdset) < 0) { + VHOST_CONFIG_LOG(path, ERR, "Failed to init VDUSE fdset"); + return -1; + } + /** * create a pipe which will be waited by poll and notified to * rebuild the wait list of poll. From patchwork Thu Feb 29 12:24:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137482 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 00CA843C36; Thu, 29 Feb 2024 13:25:29 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 4A73242D95; Thu, 29 Feb 2024 13:25:16 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 41000427E3 for ; Thu, 29 Feb 2024 13:25:14 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209513; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SpHAD+sEUkONSYBX2n/Fr7MTwj+jPzpXhuPbvGo0StU=; b=jFzOb6Aw/0Pi+lt8eUXbpOIsbrTvtE+IO+ZAaMeKWGXbl1MOlJCSCjM5wC1SG6Wd+BcTJ1 GxRugv8Z3BBEO5ZmzWtJV9c2fGKFLakrZPvmIhMpBFVrf8hGhi5CaXFCjD+KjvXUNASlhS u5axEbufE0TfOc9maneB4uoSnavI8vA= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-68-UuMLBIgxNOucC_5Gfa9p9A-1; Thu, 29 Feb 2024 07:25:12 -0500 X-MC-Unique: UuMLBIgxNOucC_5Gfa9p9A-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 192753C0FC8C; Thu, 29 Feb 2024 12:25:12 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id C9024112132A; Thu, 29 Feb 2024 12:25:10 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH 4/7] vhost: hide synchronization within FD manager Date: Thu, 29 Feb 2024 13:24:59 +0100 Message-ID: <20240229122502.2572343-5-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch forces synchronization for all FDs additions or deletions in the FD set. With that, it is no more necessary for the user to know about the FD set pipe, so hide its initialization in the FD manager. Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 174 +++++++++++++++++++++++++-------------------- lib/vhost/fd_man.h | 7 +- lib/vhost/socket.c | 12 +--- lib/vhost/vduse.c | 18 +---- 4 files changed, 101 insertions(+), 110 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index d33036a171..0ae481b785 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -2,7 +2,9 @@ * Copyright(c) 2010-2014 Intel Corporation */ +#include #include +#include #include #include @@ -17,6 +19,87 @@ RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO); #define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL) +static void +fdset_pipe_read_cb(int readfd, void *dat, + int *remove __rte_unused) +{ + char charbuf[16]; + struct fdset *fdset = dat; + int r = read(readfd, charbuf, sizeof(charbuf)); + /* + * Just an optimization, we don't care if read() failed + * so ignore explicitly its return value to make the + * compiler happy + */ + RTE_SET_USED(r); + + pthread_mutex_lock(&fdset->sync_mutex); + fdset->sync = true; + pthread_cond_broadcast(&fdset->sync_cond); + pthread_mutex_unlock(&fdset->sync_mutex); +} + +static void +fdset_pipe_uninit(struct fdset *fdset) +{ + fdset_del(fdset, fdset->u.readfd); + close(fdset->u.readfd); + fdset->u.readfd = -1; + close(fdset->u.writefd); + fdset->u.writefd = -1; +} + +static int +fdset_pipe_init(struct fdset *fdset) +{ + int ret; + + pthread_mutex_init(&fdset->sync_mutex, NULL); + pthread_cond_init(&fdset->sync_cond, NULL); + + if (pipe(fdset->u.pipefd) < 0) { + VHOST_FDMAN_LOG(ERR, + "failed to create pipe for vhost fdset"); + return -1; + } + + ret = fdset_add(fdset, fdset->u.readfd, + fdset_pipe_read_cb, NULL, fdset); + if (ret < 0) { + VHOST_FDMAN_LOG(ERR, + "failed to add pipe readfd %d into vhost server fdset", + fdset->u.readfd); + + fdset_pipe_uninit(fdset); + return -1; + } + + return 0; +} + +static void +fdset_sync(struct fdset *fdset) +{ + int ret; + + pthread_mutex_lock(&fdset->sync_mutex); + + fdset->sync = false; + ret = write(fdset->u.writefd, "1", 1); + if (ret < 0) { + VHOST_FDMAN_LOG(ERR, + "Failed to write to notification pipe: %s", + strerror(errno)); + goto out_unlock; + } + + while (!fdset->sync) + pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex); + +out_unlock: + pthread_mutex_unlock(&fdset->sync_mutex); +} + static int get_last_valid_idx(struct fdset *pfdset, int last_valid_idx) { @@ -96,6 +179,12 @@ fdset_add_fd(struct fdset *pfdset, int idx, int fd, pfd->revents = 0; } +void +fdset_uninit(struct fdset *pfdset) +{ + fdset_pipe_uninit(pfdset); +} + int fdset_init(struct fdset *pfdset) { @@ -113,7 +202,7 @@ fdset_init(struct fdset *pfdset) } pfdset->num = 0; - return 0; + return fdset_pipe_init(pfdset); } /** @@ -143,6 +232,8 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) fdset_add_fd(pfdset, i, fd, rcb, wcb, dat); pthread_mutex_unlock(&pfdset->fd_mutex); + fdset_sync(pfdset); + return 0; } @@ -174,6 +265,8 @@ fdset_del(struct fdset *pfdset, int fd) pthread_mutex_unlock(&pfdset->fd_mutex); } while (i != -1); + fdset_sync(pfdset); + return dat; } @@ -207,6 +300,9 @@ fdset_try_del(struct fdset *pfdset, int fd) } pthread_mutex_unlock(&pfdset->fd_mutex); + + fdset_sync(pfdset); + return 0; } @@ -312,79 +408,3 @@ fdset_event_dispatch(void *arg) return 0; } - -static void -fdset_pipe_read_cb(int readfd, void *dat, - int *remove __rte_unused) -{ - char charbuf[16]; - struct fdset *fdset = dat; - int r = read(readfd, charbuf, sizeof(charbuf)); - /* - * Just an optimization, we don't care if read() failed - * so ignore explicitly its return value to make the - * compiler happy - */ - RTE_SET_USED(r); - - pthread_mutex_lock(&fdset->sync_mutex); - fdset->sync = true; - pthread_cond_broadcast(&fdset->sync_cond); - pthread_mutex_unlock(&fdset->sync_mutex); -} - -void -fdset_pipe_uninit(struct fdset *fdset) -{ - fdset_del(fdset, fdset->u.readfd); - close(fdset->u.readfd); - close(fdset->u.writefd); -} - -int -fdset_pipe_init(struct fdset *fdset) -{ - int ret; - - if (pipe(fdset->u.pipefd) < 0) { - VHOST_FDMAN_LOG(ERR, - "failed to create pipe for vhost fdset"); - return -1; - } - - ret = fdset_add(fdset, fdset->u.readfd, - fdset_pipe_read_cb, NULL, fdset); - - if (ret < 0) { - VHOST_FDMAN_LOG(ERR, - "failed to add pipe readfd %d into vhost server fdset", - fdset->u.readfd); - - fdset_pipe_uninit(fdset); - return -1; - } - - return 0; -} - -void -fdset_pipe_notify(struct fdset *fdset) -{ - int r; - - pthread_mutex_lock(&fdset->sync_mutex); - - fdset->sync = false; - r = write(fdset->u.writefd, "1", 1); - /* - * Just an optimization, we don't care if write() failed - * so ignore explicitly its return value to make the - * compiler happy - */ - RTE_SET_USED(r); - - while (!fdset->sync) - pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex); - - pthread_mutex_unlock(&fdset->sync_mutex); -} diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h index 92d24d8591..c18e3a435c 100644 --- a/lib/vhost/fd_man.h +++ b/lib/vhost/fd_man.h @@ -42,6 +42,7 @@ struct fdset { bool sync; }; +void fdset_uninit(struct fdset *pfdset); int fdset_init(struct fdset *pfdset); @@ -53,10 +54,4 @@ int fdset_try_del(struct fdset *pfdset, int fd); uint32_t fdset_event_dispatch(void *arg); -int fdset_pipe_init(struct fdset *fdset); - -void fdset_pipe_uninit(struct fdset *fdset); - -void fdset_pipe_notify(struct fdset *fdset); - #endif diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index b544e39be7..2f93d48c31 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -278,7 +278,6 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket) TAILQ_INSERT_TAIL(&vsocket->conn_list, conn, next); pthread_mutex_unlock(&vsocket->conn_mutex); - fdset_pipe_notify(&vhost_user.fdset); return; err_cleanup: @@ -1186,20 +1185,11 @@ rte_vhost_driver_start(const char *path) return -1; } - /** - * create a pipe which will be waited by poll and notified to - * rebuild the wait list of poll. - */ - if (fdset_pipe_init(&vhost_user.fdset) < 0) { - VHOST_CONFIG_LOG(path, ERR, "failed to create pipe for vhost fdset"); - return -1; - } - int ret = rte_thread_create_internal_control(&fdset_tid, "vhost-evt", fdset_event_dispatch, &vhost_user.fdset); if (ret != 0) { VHOST_CONFIG_LOG(path, ERR, "failed to create fdset handling thread"); - fdset_pipe_uninit(&vhost_user.fdset); + fdset_uninit(&vhost_user.fdset); return -1; } } diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index d83d7b0d7c..257285a89f 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -225,7 +225,6 @@ vduse_vring_setup(struct virtio_net *dev, unsigned int index) close(vq->kickfd); vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; } - fdset_pipe_notify(&vduse.fdset); vhost_enable_guest_notification(dev, vq, 1); VHOST_CONFIG_LOG(dev->ifname, INFO, "Ctrl queue event handler installed"); } @@ -238,10 +237,8 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index) struct vduse_vq_eventfd vq_efd; int ret; - if (vq == dev->cvq && vq->kickfd >= 0) { + if (vq == dev->cvq && vq->kickfd >= 0) fdset_del(&vduse.fdset, vq->kickfd); - fdset_pipe_notify(&vduse.fdset); - } vq_efd.index = index; vq_efd.fd = VDUSE_EVENTFD_DEASSIGN; @@ -432,20 +429,11 @@ vduse_device_create(const char *path, bool compliant_ol_flags) return -1; } - /** - * create a pipe which will be waited by poll and notified to - * rebuild the wait list of poll. - */ - if (fdset_pipe_init(&vduse.fdset) < 0) { - VHOST_CONFIG_LOG(path, ERR, "failed to create pipe for vduse fdset"); - return -1; - } - ret = rte_thread_create_internal_control(&fdset_tid, "vduse-evt", fdset_event_dispatch, &vduse.fdset); if (ret != 0) { VHOST_CONFIG_LOG(path, ERR, "failed to create vduse fdset handling thread"); - fdset_pipe_uninit(&vduse.fdset); + fdset_uninit(&vduse.fdset); return -1; } @@ -573,7 +561,6 @@ vduse_device_create(const char *path, bool compliant_ol_flags) dev->vduse_dev_fd); goto out_dev_destroy; } - fdset_pipe_notify(&vduse.fdset); free(dev_config); @@ -616,7 +603,6 @@ vduse_device_destroy(const char *path) vduse_device_stop(dev); fdset_del(&vduse.fdset, dev->vduse_dev_fd); - fdset_pipe_notify(&vduse.fdset); if (dev->vduse_dev_fd >= 0) { close(dev->vduse_dev_fd); From patchwork Thu Feb 29 12:25:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137484 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9762043C36; Thu, 29 Feb 2024 13:25:41 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D4AEC42DCA; Thu, 29 Feb 2024 13:25:21 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 20C2742D72 for ; Thu, 29 Feb 2024 13:25:16 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209515; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kWOrb3PzXAY7zdFlc7KXowRti6dJPbdTpf1Qdcptsm8=; b=iwInk2Niaw5+QuRMCf8pvFM6H4FYafUIO/hJ2pKjq1H3neYU1pwx97Z4BByVhTMwTb5u0F GYq7wchK9FCt5up11TidGMf/u9rxZMbL6vwkk922UHQCgTqGrc4KpXX69Ht0Qsc8y+haNH eFCCUHSgT1QixCCy9gHmePLPO/lDHQE= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-584-jFvpeUmAMQ2nmGHx3Eiikw-1; Thu, 29 Feb 2024 07:25:14 -0500 X-MC-Unique: jFvpeUmAMQ2nmGHx3Eiikw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D2A18280A9C8; Thu, 29 Feb 2024 12:25:13 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5C0101121312; Thu, 29 Feb 2024 12:25:12 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH 5/7] vhost: improve fdset initialization Date: Thu, 29 Feb 2024 13:25:00 +0100 Message-ID: <20240229122502.2572343-6-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch heavily reworks fdset initialization: - fdsets are now dynamically allocated by the FD manager - the event dispatcher is now created by the FD manager - struct fdset is now opaque to VDUSE and Vhost Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 177 +++++++++++-- lib/vhost/fd_man.c.orig | 538 ++++++++++++++++++++++++++++++++++++++++ lib/vhost/fd_man.h | 39 +-- lib/vhost/socket.c | 24 +- lib/vhost/vduse.c | 29 +-- 5 files changed, 715 insertions(+), 92 deletions(-) create mode 100644 lib/vhost/fd_man.c.orig diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 0ae481b785..8b47c97d45 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -3,12 +3,16 @@ */ #include +#include #include #include #include #include #include +#include +#include +#include #include "fd_man.h" @@ -19,6 +23,79 @@ RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO); #define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL) +struct fdentry { + int fd; /* -1 indicates this entry is empty */ + fd_cb rcb; /* callback when this fd is readable. */ + fd_cb wcb; /* callback when this fd is writeable.*/ + void *dat; /* fd context */ + int busy; /* whether this entry is being used in cb. */ +}; + +struct fdset { + char name[RTE_THREAD_NAME_SIZE]; + struct pollfd rwfds[MAX_FDS]; + struct fdentry fd[MAX_FDS]; + rte_thread_t tid; + pthread_mutex_t fd_mutex; + pthread_mutex_t fd_polling_mutex; + int num; /* current fd number of this fdset */ + + union pipefds { + struct { + int pipefd[2]; + }; + struct { + int readfd; + int writefd; + }; + } u; + + pthread_mutex_t sync_mutex; + pthread_cond_t sync_cond; + bool sync; + bool destroy; +}; + +static int fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat); +static uint32_t fdset_event_dispatch(void *arg); + +#define MAX_FDSETS 8 + +static struct fdset *fdsets[MAX_FDSETS]; +pthread_mutex_t fdsets_mutex = PTHREAD_MUTEX_INITIALIZER; + +static struct fdset * +fdset_lookup(const char *name) +{ + int i; + + for (i = 0; i < MAX_FDSETS; i++) { + struct fdset *fdset = fdsets[i]; + if (fdset == NULL) + continue; + + if (!strncmp(fdset->name, name, RTE_THREAD_NAME_SIZE)) + return fdset; + } + + return NULL; +} + +static int +fdset_insert(struct fdset *fdset) +{ + int i; + + for (i = 0; i < MAX_FDSETS; i++) { + if (fdsets[i] == NULL) { + fdsets[i] = fdset; + return 0; + } + } + + return -1; +} + static void fdset_pipe_read_cb(int readfd, void *dat, int *remove __rte_unused) @@ -63,7 +140,7 @@ fdset_pipe_init(struct fdset *fdset) return -1; } - ret = fdset_add(fdset, fdset->u.readfd, + ret = fdset_add_no_sync(fdset, fdset->u.readfd, fdset_pipe_read_cb, NULL, fdset); if (ret < 0) { VHOST_FDMAN_LOG(ERR, @@ -179,37 +256,82 @@ fdset_add_fd(struct fdset *pfdset, int idx, int fd, pfd->revents = 0; } -void -fdset_uninit(struct fdset *pfdset) -{ - fdset_pipe_uninit(pfdset); -} - -int -fdset_init(struct fdset *pfdset) +struct fdset * +fdset_init(const char *name) { + struct fdset *fdset; + uint32_t val; int i; - if (pfdset == NULL) - return -1; + if (name == NULL) { + VHOST_FDMAN_LOG(ERR, "Invalid name"); + goto err; + } - pthread_mutex_init(&pfdset->fd_mutex, NULL); - pthread_mutex_init(&pfdset->fd_polling_mutex, NULL); + pthread_mutex_lock(&fdsets_mutex); + fdset = fdset_lookup(name); + if (fdset) { + pthread_mutex_unlock(&fdsets_mutex); + return fdset; + } + + fdset = rte_zmalloc(NULL, sizeof(*fdset), 0); + if (!fdset) { + VHOST_FDMAN_LOG(ERR, "Failed to alloc fdset %s", name); + goto err_unlock; + } + + rte_strscpy(fdset->name, name, RTE_THREAD_NAME_SIZE); + + pthread_mutex_init(&fdset->fd_mutex, NULL); + pthread_mutex_init(&fdset->fd_polling_mutex, NULL); for (i = 0; i < MAX_FDS; i++) { - pfdset->fd[i].fd = -1; - pfdset->fd[i].dat = NULL; + fdset->fd[i].fd = -1; + fdset->fd[i].dat = NULL; } - pfdset->num = 0; + fdset->num = 0; - return fdset_pipe_init(pfdset); + if (fdset_pipe_init(fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to init pipe for %s", name); + goto err_free; + } + + if (rte_thread_create_internal_control(&fdset->tid, fdset->name, + fdset_event_dispatch, fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to create %s event dispatch thread", + fdset->name); + goto err_pipe; + } + + if (fdset_insert(fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to insert fdset %s", name); + goto err_thread; + } + + pthread_mutex_unlock(&fdsets_mutex); + + return fdset; + +err_thread: + fdset->destroy = true; + fdset_sync(fdset); + rte_thread_join(fdset->tid, &val); +err_pipe: + fdset_pipe_uninit(fdset); +err_free: + rte_free(fdset); +err_unlock: + pthread_mutex_unlock(&fdsets_mutex); +err: + return NULL; } /** * Register the fd in the fdset with read/write handler and context. */ -int -fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +static int +fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) { int i; @@ -232,6 +354,18 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) fdset_add_fd(pfdset, i, fd, rcb, wcb, dat); pthread_mutex_unlock(&pfdset->fd_mutex); + return 0; +} + +int +fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +{ + int ret; + + ret = fdset_add_no_sync(pfdset, fd, rcb, wcb, dat); + if (ret < 0) + return ret; + fdset_sync(pfdset); return 0; @@ -315,7 +449,7 @@ fdset_try_del(struct fdset *pfdset, int fd) * will wait until the flag is reset to zero(which indicates the callback is * finished), then it could free the context after fdset_del. */ -uint32_t +static uint32_t fdset_event_dispatch(void *arg) { int i; @@ -404,6 +538,9 @@ fdset_event_dispatch(void *arg) if (need_shrink) fdset_shrink(pfdset); + + if (pfdset->destroy) + break; } return 0; diff --git a/lib/vhost/fd_man.c.orig b/lib/vhost/fd_man.c.orig new file mode 100644 index 0000000000..c0149fbf4e --- /dev/null +++ b/lib/vhost/fd_man.c.orig @@ -0,0 +1,538 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2014 Intel Corporation + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "fd_man.h" + +RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO); +#define RTE_LOGTYPE_VHOST_FDMAN vhost_fdset_logtype +#define VHOST_FDMAN_LOG(level, ...) \ + RTE_LOG_LINE(level, VHOST_FDMAN, "" __VA_ARGS__) + +#define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL) + +struct fdentry { + int fd; /* -1 indicates this entry is empty */ + fd_cb rcb; /* callback when this fd is readable. */ + fd_cb wcb; /* callback when this fd is writeable.*/ + void *dat; /* fd context */ + int busy; /* whether this entry is being used in cb. */ +}; + +struct fdset { + char name[RTE_THREAD_NAME_SIZE]; + struct pollfd rwfds[MAX_FDS]; + struct fdentry fd[MAX_FDS]; + rte_thread_t tid; + pthread_mutex_t fd_mutex; + pthread_mutex_t fd_polling_mutex; + int num; /* current fd number of this fdset */ + + int sync_fd; + pthread_mutex_t sync_mutex; + pthread_cond_t sync_cond; + bool sync; + + bool destroy; +}; + +static int fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat); +static uint32_t fdset_event_dispatch(void *arg); + +#define MAX_FDSETS 8 + +static struct fdset *fdsets[MAX_FDSETS]; +pthread_mutex_t fdsets_mutex = PTHREAD_MUTEX_INITIALIZER; + +static struct fdset * +fdset_lookup(const char *name) +{ + int i; + + for (i = 0; i < MAX_FDSETS; i++) { + struct fdset *fdset = fdsets[i]; + if (fdset == NULL) + continue; + + if (!strncmp(fdset->name, name, RTE_THREAD_NAME_SIZE)) + return fdset; + } + + return NULL; +} + +static int +fdset_insert(struct fdset *fdset) +{ + int i; + + for (i = 0; i < MAX_FDSETS; i++) { + if (fdsets[i] == NULL) { + fdsets[i] = fdset; + return 0; + } + } + + return -1; +} + +static void +fdset_sync_read_cb(int sync_fd, void *dat, int *remove __rte_unused) +{ + eventfd_t val; + struct fdset *fdset = dat; + int r = eventfd_read(sync_fd, &val); + /* + * Just an optimization, we don't care if read() failed + * so ignore explicitly its return value to make the + * compiler happy + */ + RTE_SET_USED(r); + + pthread_mutex_lock(&fdset->sync_mutex); + fdset->sync = true; + pthread_cond_broadcast(&fdset->sync_cond); + pthread_mutex_unlock(&fdset->sync_mutex); +} + +static void +fdset_sync_uninit(struct fdset *fdset) +{ + fdset_del(fdset, fdset->sync_fd); + close(fdset->sync_fd); + fdset->sync_fd = -1; +} + +static int +fdset_sync_init(struct fdset *fdset) +{ + int ret; + + pthread_mutex_init(&fdset->sync_mutex, NULL); + pthread_cond_init(&fdset->sync_cond, NULL); + + fdset->sync_fd = eventfd(0, 0); + if (fdset->sync_fd < 0) { + VHOST_FDMAN_LOG(ERR, "failed to create eventfd for %s fdset", fdset->name); + return -1; + } + +<<<<<<< HEAD + ret = fdset_add_no_sync(fdset, fdset->u.readfd, + fdset_pipe_read_cb, NULL, fdset); +======= + ret = fdset_add(fdset, fdset->sync_fd, fdset_sync_read_cb, NULL, fdset); +>>>>>>> 3474bf77e2 (vhost: convert fdset sync to eventfd) + if (ret < 0) { + VHOST_FDMAN_LOG(ERR, "failed to add eventfd %d to %s fdset", + fdset->sync_fd, fdset->name); + + fdset_sync_uninit(fdset); + return -1; + } + + return 0; +} + +static void +fdset_sync(struct fdset *fdset) +{ + int ret; + + pthread_mutex_lock(&fdset->sync_mutex); + + fdset->sync = false; + ret = eventfd_write(fdset->sync_fd, (eventfd_t)1); + if (ret < 0) { + VHOST_FDMAN_LOG(ERR, "Failed to write sync eventfd for %s fdset: %s", + fdset->name, strerror(errno)); + goto out_unlock; + } + + while (!fdset->sync) + pthread_cond_wait(&fdset->sync_cond, &fdset->sync_mutex); + +out_unlock: + pthread_mutex_unlock(&fdset->sync_mutex); +} + +static int +get_last_valid_idx(struct fdset *pfdset, int last_valid_idx) +{ + int i; + + for (i = last_valid_idx; i >= 0 && pfdset->fd[i].fd == -1; i--) + ; + + return i; +} + +static void +fdset_move(struct fdset *pfdset, int dst, int src) +{ + pfdset->fd[dst] = pfdset->fd[src]; + pfdset->rwfds[dst] = pfdset->rwfds[src]; +} + +static void +fdset_shrink_nolock(struct fdset *pfdset) +{ + int i; + int last_valid_idx = get_last_valid_idx(pfdset, pfdset->num - 1); + + for (i = 0; i < last_valid_idx; i++) { + if (pfdset->fd[i].fd != -1) + continue; + + fdset_move(pfdset, i, last_valid_idx); + last_valid_idx = get_last_valid_idx(pfdset, last_valid_idx - 1); + } + pfdset->num = last_valid_idx + 1; +} + +/* + * Find deleted fd entries and remove them + */ +static void +fdset_shrink(struct fdset *pfdset) +{ + pthread_mutex_lock(&pfdset->fd_mutex); + fdset_shrink_nolock(pfdset); + pthread_mutex_unlock(&pfdset->fd_mutex); +} + +/** + * Returns the index in the fdset for a given fd. + * @return + * index for the fd, or -1 if fd isn't in the fdset. + */ +static int +fdset_find_fd(struct fdset *pfdset, int fd) +{ + int i; + + for (i = 0; i < pfdset->num && pfdset->fd[i].fd != fd; i++) + ; + + return i == pfdset->num ? -1 : i; +} + +static void +fdset_add_fd(struct fdset *pfdset, int idx, int fd, + fd_cb rcb, fd_cb wcb, void *dat) +{ + struct fdentry *pfdentry = &pfdset->fd[idx]; + struct pollfd *pfd = &pfdset->rwfds[idx]; + + pfdentry->fd = fd; + pfdentry->rcb = rcb; + pfdentry->wcb = wcb; + pfdentry->dat = dat; + + pfd->fd = fd; + pfd->events = rcb ? POLLIN : 0; + pfd->events |= wcb ? POLLOUT : 0; + pfd->revents = 0; +} + +struct fdset * +fdset_init(const char *name) +{ + struct fdset *fdset; + uint32_t val; + int i; + + if (name == NULL) { + VHOST_FDMAN_LOG(ERR, "Invalid name"); + goto err; + } + + pthread_mutex_lock(&fdsets_mutex); + fdset = fdset_lookup(name); + if (fdset) { + pthread_mutex_unlock(&fdsets_mutex); + return fdset; + } + + fdset = rte_zmalloc(NULL, sizeof(*fdset), 0); + if (!fdset) { + VHOST_FDMAN_LOG(ERR, "Failed to alloc fdset %s", name); + goto err_unlock; + } + + strncpy(fdset->name, name, RTE_THREAD_NAME_SIZE - 1); + + pthread_mutex_init(&fdset->fd_mutex, NULL); + pthread_mutex_init(&fdset->fd_polling_mutex, NULL); + + for (i = 0; i < MAX_FDS; i++) { + fdset->fd[i].fd = -1; + fdset->fd[i].dat = NULL; + } + fdset->num = 0; + + if (fdset_sync_init(fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to init sync for %s", name); + goto err_free; + } + + if (rte_thread_create_internal_control(&fdset->tid, fdset->name, + fdset_event_dispatch, fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to create %s event dispatch thread", + fdset->name); + goto err_sync; + } + + if (fdset_insert(fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to insert fdset %s", name); + goto err_thread; + } + + pthread_mutex_unlock(&fdsets_mutex); + + return fdset; + +err_thread: + fdset->destroy = true; + fdset_sync(fdset); + rte_thread_join(fdset->tid, &val); +err_sync: + fdset_sync_uninit(fdset); +err_free: + rte_free(fdset); +err_unlock: + pthread_mutex_unlock(&fdsets_mutex); +err: + return NULL; +} + +/** + * Register the fd in the fdset with read/write handler and context. + */ +static int +fdset_add_no_sync(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +{ + int i; + + if (pfdset == NULL || fd == -1) + return -1; + + pthread_mutex_lock(&pfdset->fd_mutex); + i = pfdset->num < MAX_FDS ? pfdset->num++ : -1; + if (i == -1) { + pthread_mutex_lock(&pfdset->fd_polling_mutex); + fdset_shrink_nolock(pfdset); + pthread_mutex_unlock(&pfdset->fd_polling_mutex); + i = pfdset->num < MAX_FDS ? pfdset->num++ : -1; + if (i == -1) { + pthread_mutex_unlock(&pfdset->fd_mutex); + return -2; + } + } + + fdset_add_fd(pfdset, i, fd, rcb, wcb, dat); + pthread_mutex_unlock(&pfdset->fd_mutex); + + return 0; +} + +int +fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat) +{ + int ret; + + ret = fdset_add_no_sync(pfdset, fd, rcb, wcb, dat); + if (ret < 0) + return ret; + + fdset_sync(pfdset); + + return 0; +} + +/** + * Unregister the fd from the fdset. + * Returns context of a given fd or NULL. + */ +void * +fdset_del(struct fdset *pfdset, int fd) +{ + int i; + void *dat = NULL; + + if (pfdset == NULL || fd == -1) + return NULL; + + do { + pthread_mutex_lock(&pfdset->fd_mutex); + + i = fdset_find_fd(pfdset, fd); + if (i != -1 && pfdset->fd[i].busy == 0) { + /* busy indicates r/wcb is executing! */ + dat = pfdset->fd[i].dat; + pfdset->fd[i].fd = -1; + pfdset->fd[i].rcb = pfdset->fd[i].wcb = NULL; + pfdset->fd[i].dat = NULL; + i = -1; + } + pthread_mutex_unlock(&pfdset->fd_mutex); + } while (i != -1); + + fdset_sync(pfdset); + + return dat; +} + +/** + * Unregister the fd from the fdset. + * + * If parameters are invalid, return directly -2. + * And check whether fd is busy, if yes, return -1. + * Otherwise, try to delete the fd from fdset and + * return true. + */ +int +fdset_try_del(struct fdset *pfdset, int fd) +{ + int i; + + if (pfdset == NULL || fd == -1) + return -2; + + pthread_mutex_lock(&pfdset->fd_mutex); + i = fdset_find_fd(pfdset, fd); + if (i != -1 && pfdset->fd[i].busy) { + pthread_mutex_unlock(&pfdset->fd_mutex); + return -1; + } + + if (i != -1) { + pfdset->fd[i].fd = -1; + pfdset->fd[i].rcb = pfdset->fd[i].wcb = NULL; + pfdset->fd[i].dat = NULL; + } + + pthread_mutex_unlock(&pfdset->fd_mutex); + + fdset_sync(pfdset); + + return 0; +} + +/** + * This functions runs in infinite blocking loop until there is no fd in + * pfdset. It calls corresponding r/w handler if there is event on the fd. + * + * Before the callback is called, we set the flag to busy status; If other + * thread(now rte_vhost_driver_unregister) calls fdset_del concurrently, it + * will wait until the flag is reset to zero(which indicates the callback is + * finished), then it could free the context after fdset_del. + */ +static uint32_t +fdset_event_dispatch(void *arg) +{ + int i; + struct pollfd *pfd; + struct fdentry *pfdentry; + fd_cb rcb, wcb; + void *dat; + int fd, numfds; + int remove1, remove2; + int need_shrink; + struct fdset *pfdset = arg; + int val; + + if (pfdset == NULL) + return 0; + + while (1) { + + /* + * When poll is blocked, other threads might unregister + * listenfds from and register new listenfds into fdset. + * When poll returns, the entries for listenfds in the fdset + * might have been updated. It is ok if there is unwanted call + * for new listenfds. + */ + pthread_mutex_lock(&pfdset->fd_mutex); + numfds = pfdset->num; + pthread_mutex_unlock(&pfdset->fd_mutex); + + pthread_mutex_lock(&pfdset->fd_polling_mutex); + val = poll(pfdset->rwfds, numfds, 1000 /* millisecs */); + pthread_mutex_unlock(&pfdset->fd_polling_mutex); + if (val < 0) + continue; + + need_shrink = 0; + for (i = 0; i < numfds; i++) { + pthread_mutex_lock(&pfdset->fd_mutex); + + pfdentry = &pfdset->fd[i]; + fd = pfdentry->fd; + pfd = &pfdset->rwfds[i]; + + if (fd < 0) { + need_shrink = 1; + pthread_mutex_unlock(&pfdset->fd_mutex); + continue; + } + + if (!pfd->revents) { + pthread_mutex_unlock(&pfdset->fd_mutex); + continue; + } + + remove1 = remove2 = 0; + + rcb = pfdentry->rcb; + wcb = pfdentry->wcb; + dat = pfdentry->dat; + pfdentry->busy = 1; + + pthread_mutex_unlock(&pfdset->fd_mutex); + + if (rcb && pfd->revents & (POLLIN | FDPOLLERR)) + rcb(fd, dat, &remove1); + if (wcb && pfd->revents & (POLLOUT | FDPOLLERR)) + wcb(fd, dat, &remove2); + pfdentry->busy = 0; + /* + * fdset_del needs to check busy flag. + * We don't allow fdset_del to be called in callback + * directly. + */ + /* + * When we are to clean up the fd from fdset, + * because the fd is closed in the cb, + * the old fd val could be reused by when creates new + * listen fd in another thread, we couldn't call + * fdset_del. + */ + if (remove1 || remove2) { + pfdentry->fd = -1; + need_shrink = 1; + } + } + + if (need_shrink) + fdset_shrink(pfdset); + + if (pfdset->destroy) + break; + } + + return 0; +} diff --git a/lib/vhost/fd_man.h b/lib/vhost/fd_man.h index c18e3a435c..079fa0155f 100644 --- a/lib/vhost/fd_man.h +++ b/lib/vhost/fd_man.h @@ -8,50 +8,19 @@ #include #include +struct fdset; + #define MAX_FDS 1024 typedef void (*fd_cb)(int fd, void *dat, int *remove); -struct fdentry { - int fd; /* -1 indicates this entry is empty */ - fd_cb rcb; /* callback when this fd is readable. */ - fd_cb wcb; /* callback when this fd is writeable.*/ - void *dat; /* fd context */ - int busy; /* whether this entry is being used in cb. */ -}; - -struct fdset { - struct pollfd rwfds[MAX_FDS]; - struct fdentry fd[MAX_FDS]; - pthread_mutex_t fd_mutex; - pthread_mutex_t fd_polling_mutex; - int num; /* current fd number of this fdset */ - - union pipefds { - struct { - int pipefd[2]; - }; - struct { - int readfd; - int writefd; - }; - } u; - - pthread_mutex_t sync_mutex; - pthread_cond_t sync_cond; - bool sync; -}; - -void fdset_uninit(struct fdset *pfdset); - -int fdset_init(struct fdset *pfdset); +struct fdset *fdset_init(const char *name); int fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat); void *fdset_del(struct fdset *pfdset, int fd); -int fdset_try_del(struct fdset *pfdset, int fd); -uint32_t fdset_event_dispatch(void *arg); +int fdset_try_del(struct fdset *pfdset, int fd); #endif diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index 2f93d48c31..9eebc63479 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -76,7 +76,7 @@ struct vhost_user_connection { #define MAX_VHOST_SOCKET 1024 struct vhost_user { struct vhost_user_socket *vsockets[MAX_VHOST_SOCKET]; - struct fdset fdset; + struct fdset *fdset; int vsocket_cnt; pthread_mutex_t mutex; }; @@ -261,7 +261,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket) conn->connfd = fd; conn->vsocket = vsocket; conn->vid = vid; - ret = fdset_add(&vhost_user.fdset, fd, vhost_user_read_cb, + ret = fdset_add(vhost_user.fdset, fd, vhost_user_read_cb, NULL, conn); if (ret < 0) { VHOST_CONFIG_LOG(vsocket->path, ERR, @@ -394,7 +394,7 @@ vhost_user_start_server(struct vhost_user_socket *vsocket) if (ret < 0) goto err; - ret = fdset_add(&vhost_user.fdset, fd, vhost_user_server_new_connection, + ret = fdset_add(vhost_user.fdset, fd, vhost_user_server_new_connection, NULL, vsocket); if (ret < 0) { VHOST_CONFIG_LOG(path, ERR, "failed to add listen fd %d to vhost server fdset", @@ -1079,7 +1079,7 @@ rte_vhost_driver_unregister(const char *path) * mutex lock, and try again since the r/wcb * may use the mutex lock. */ - if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) { + if (fdset_try_del(vhost_user.fdset, vsocket->socket_fd) == -1) { pthread_mutex_unlock(&vhost_user.mutex); goto again; } @@ -1099,7 +1099,7 @@ rte_vhost_driver_unregister(const char *path) * try again since the r/wcb may use the * conn_mutex and mutex locks. */ - if (fdset_try_del(&vhost_user.fdset, + if (fdset_try_del(vhost_user.fdset, conn->connfd) == -1) { pthread_mutex_unlock(&vsocket->conn_mutex); pthread_mutex_unlock(&vhost_user.mutex); @@ -1167,7 +1167,6 @@ int rte_vhost_driver_start(const char *path) { struct vhost_user_socket *vsocket; - static rte_thread_t fdset_tid; pthread_mutex_lock(&vhost_user.mutex); vsocket = find_vhost_user_socket(path); @@ -1179,19 +1178,12 @@ rte_vhost_driver_start(const char *path) if (vsocket->is_vduse) return vduse_device_create(path, vsocket->net_compliant_ol_flags); - if (fdset_tid.opaque_id == 0) { - if (fdset_init(&vhost_user.fdset) < 0) { + if (vhost_user.fdset == NULL) { + vhost_user.fdset = fdset_init("vhost-evt"); + if (vhost_user.fdset == NULL) { VHOST_CONFIG_LOG(path, ERR, "Failed to init Vhost-user fdset"); return -1; } - - int ret = rte_thread_create_internal_control(&fdset_tid, - "vhost-evt", fdset_event_dispatch, &vhost_user.fdset); - if (ret != 0) { - VHOST_CONFIG_LOG(path, ERR, "failed to create fdset handling thread"); - fdset_uninit(&vhost_user.fdset); - return -1; - } } if (vsocket->is_server) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index 257285a89f..ef2573bdf0 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -28,13 +28,11 @@ #define VDUSE_CTRL_PATH "/dev/vduse/control" struct vduse { - struct fdset fdset; + struct fdset *fdset; }; static struct vduse vduse; -static bool vduse_events_thread; - static const char * const vduse_reqs_str[] = { "VDUSE_GET_VQ_STATE", "VDUSE_SET_STATUS", @@ -215,7 +213,7 @@ vduse_vring_setup(struct virtio_net *dev, unsigned int index) } if (vq == dev->cvq) { - ret = fdset_add(&vduse.fdset, vq->kickfd, vduse_control_queue_event, NULL, dev); + ret = fdset_add(vduse.fdset, vq->kickfd, vduse_control_queue_event, NULL, dev); if (ret) { VHOST_CONFIG_LOG(dev->ifname, ERR, "Failed to setup kickfd handler for VQ %u: %s", @@ -238,7 +236,7 @@ vduse_vring_cleanup(struct virtio_net *dev, unsigned int index) int ret; if (vq == dev->cvq && vq->kickfd >= 0) - fdset_del(&vduse.fdset, vq->kickfd); + fdset_del(vduse.fdset, vq->kickfd); vq_efd.index = index; vq_efd.fd = VDUSE_EVENTFD_DEASSIGN; @@ -413,7 +411,6 @@ int vduse_device_create(const char *path, bool compliant_ol_flags) { int control_fd, dev_fd, vid, ret; - rte_thread_t fdset_tid; uint32_t i, max_queue_pairs, total_queues; struct virtio_net *dev; struct virtio_net_config vnet_config = {{ 0 }}; @@ -422,22 +419,12 @@ vduse_device_create(const char *path, bool compliant_ol_flags) struct vduse_dev_config *dev_config = NULL; const char *name = path + strlen("/dev/vduse/"); - /* If first device, create events dispatcher thread */ - if (vduse_events_thread == false) { - if (fdset_init(&vduse.fdset) < 0) { + if (vduse.fdset == NULL) { + vduse.fdset = fdset_init("vduse-evt"); + if (vduse.fdset == NULL) { VHOST_CONFIG_LOG(path, ERR, "Failed to init VDUSE fdset"); return -1; } - - ret = rte_thread_create_internal_control(&fdset_tid, "vduse-evt", - fdset_event_dispatch, &vduse.fdset); - if (ret != 0) { - VHOST_CONFIG_LOG(path, ERR, "failed to create vduse fdset handling thread"); - fdset_uninit(&vduse.fdset); - return -1; - } - - vduse_events_thread = true; } control_fd = open(VDUSE_CTRL_PATH, O_RDWR); @@ -555,7 +542,7 @@ vduse_device_create(const char *path, bool compliant_ol_flags) dev->cvq = dev->virtqueue[max_queue_pairs * 2]; - ret = fdset_add(&vduse.fdset, dev->vduse_dev_fd, vduse_events_handler, NULL, dev); + ret = fdset_add(vduse.fdset, dev->vduse_dev_fd, vduse_events_handler, NULL, dev); if (ret) { VHOST_CONFIG_LOG(name, ERR, "Failed to add fd %d to vduse fdset", dev->vduse_dev_fd); @@ -602,7 +589,7 @@ vduse_device_destroy(const char *path) vduse_device_stop(dev); - fdset_del(&vduse.fdset, dev->vduse_dev_fd); + fdset_del(vduse.fdset, dev->vduse_dev_fd); if (dev->vduse_dev_fd >= 0) { close(dev->vduse_dev_fd); From patchwork Thu Feb 29 12:25:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137485 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 92E5643C36; Thu, 29 Feb 2024 13:25:48 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 4E35442DEF; Thu, 29 Feb 2024 13:25:23 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id A545742D72 for ; Thu, 29 Feb 2024 13:25:17 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209517; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LmF7qPGs1eZr0guitFSXpVk4EJJ8BM7JQyUuHQzD+ws=; b=hVp6jSsb9PukqW0pGjWa4xQrZUZAbgHYLAh/20k5WichlnH/NNGNXByCKziEkQFyOu3RMJ h7SGVd16v1QiBmVftB5RYzrjKdmyAFJBaQMS9mmiQ6RDBDgvJzoy8pwBZxtRKx3oRjxbDD NEW4JZnzN18o02zqypoibmvrms32poE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-371-VoG6xYfWPvSoFoA5ab2Fog-1; Thu, 29 Feb 2024 07:25:15 -0500 X-MC-Unique: VoG6xYfWPvSoFoA5ab2Fog-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6BCE110651E6; Thu, 29 Feb 2024 12:25:15 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 291AE111D3DA; Thu, 29 Feb 2024 12:25:14 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH 6/7] vhost: convert fdset sync to eventfd Date: Thu, 29 Feb 2024 13:25:01 +0100 Message-ID: <20240229122502.2572343-7-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch converts fdset's sync mechanism from pipe to eventfd, as we only use it to send notification events. Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 65 +++++++++++++++++++--------------------------- 1 file changed, 26 insertions(+), 39 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 8b47c97d45..6a5bd74656 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include @@ -40,19 +41,11 @@ struct fdset { pthread_mutex_t fd_polling_mutex; int num; /* current fd number of this fdset */ - union pipefds { - struct { - int pipefd[2]; - }; - struct { - int readfd; - int writefd; - }; - } u; - + int sync_fd; pthread_mutex_t sync_mutex; pthread_cond_t sync_cond; bool sync; + bool destroy; }; @@ -97,12 +90,11 @@ fdset_insert(struct fdset *fdset) } static void -fdset_pipe_read_cb(int readfd, void *dat, - int *remove __rte_unused) +fdset_sync_read_cb(int sync_fd, void *dat, int *remove __rte_unused) { - char charbuf[16]; + eventfd_t val; struct fdset *fdset = dat; - int r = read(readfd, charbuf, sizeof(charbuf)); + int r = eventfd_read(sync_fd, &val); /* * Just an optimization, we don't care if read() failed * so ignore explicitly its return value to make the @@ -117,37 +109,33 @@ fdset_pipe_read_cb(int readfd, void *dat, } static void -fdset_pipe_uninit(struct fdset *fdset) +fdset_sync_uninit(struct fdset *fdset) { - fdset_del(fdset, fdset->u.readfd); - close(fdset->u.readfd); - fdset->u.readfd = -1; - close(fdset->u.writefd); - fdset->u.writefd = -1; + fdset_del(fdset, fdset->sync_fd); + close(fdset->sync_fd); + fdset->sync_fd = -1; } static int -fdset_pipe_init(struct fdset *fdset) +fdset_sync_init(struct fdset *fdset) { int ret; pthread_mutex_init(&fdset->sync_mutex, NULL); pthread_cond_init(&fdset->sync_cond, NULL); - if (pipe(fdset->u.pipefd) < 0) { - VHOST_FDMAN_LOG(ERR, - "failed to create pipe for vhost fdset"); + fdset->sync_fd = eventfd(0, 0); + if (fdset->sync_fd < 0) { + VHOST_FDMAN_LOG(ERR, "failed to create eventfd for %s fdset", fdset->name); return -1; } - ret = fdset_add_no_sync(fdset, fdset->u.readfd, - fdset_pipe_read_cb, NULL, fdset); + ret = fdset_add_no_sync(fdset, fdset->sync_fd, fdset_sync_read_cb, NULL, fdset); if (ret < 0) { - VHOST_FDMAN_LOG(ERR, - "failed to add pipe readfd %d into vhost server fdset", - fdset->u.readfd); + VHOST_FDMAN_LOG(ERR, "failed to add eventfd %d to %s fdset", + fdset->sync_fd, fdset->name); - fdset_pipe_uninit(fdset); + fdset_sync_uninit(fdset); return -1; } @@ -162,11 +150,10 @@ fdset_sync(struct fdset *fdset) pthread_mutex_lock(&fdset->sync_mutex); fdset->sync = false; - ret = write(fdset->u.writefd, "1", 1); + ret = eventfd_write(fdset->sync_fd, (eventfd_t)1); if (ret < 0) { - VHOST_FDMAN_LOG(ERR, - "Failed to write to notification pipe: %s", - strerror(errno)); + VHOST_FDMAN_LOG(ERR, "Failed to write sync eventfd for %s fdset: %s", + fdset->name, strerror(errno)); goto out_unlock; } @@ -292,8 +279,8 @@ fdset_init(const char *name) } fdset->num = 0; - if (fdset_pipe_init(fdset)) { - VHOST_FDMAN_LOG(ERR, "Failed to init pipe for %s", name); + if (fdset_sync_init(fdset)) { + VHOST_FDMAN_LOG(ERR, "Failed to init sync for %s", name); goto err_free; } @@ -301,7 +288,7 @@ fdset_init(const char *name) fdset_event_dispatch, fdset)) { VHOST_FDMAN_LOG(ERR, "Failed to create %s event dispatch thread", fdset->name); - goto err_pipe; + goto err_sync; } if (fdset_insert(fdset)) { @@ -317,8 +304,8 @@ fdset_init(const char *name) fdset->destroy = true; fdset_sync(fdset); rte_thread_join(fdset->tid, &val); -err_pipe: - fdset_pipe_uninit(fdset); +err_sync: + fdset_sync_uninit(fdset); err_free: rte_free(fdset); err_unlock: From patchwork Thu Feb 29 12:25:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 137486 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E733443C36; Thu, 29 Feb 2024 13:25:56 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 016AF42E3F; Thu, 29 Feb 2024 13:25:25 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id DF64742DCA for ; Thu, 29 Feb 2024 13:25:20 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709209520; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K6mQQgwdhlpajEZBksW0jmAmhsd0DBYf1QD4SGBzyk4=; b=RBdGeI6a0amtHLzoENdi0HnVJMFReR/pgRhfAd9mPj6CcdqGLCeA0f4hHe8StwypBntFI8 dZ0B+KgDW04UMwLrihgnIO2W5nI/de9EM7X7d2XCHs6TRF2Q2Xx2jaBFpb9XAlUdohnUJK eWtdy9waGuCBNH4IGCv5EUAAZlFzD0s= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-619-m30Na_ZeOaiveYmbZ8f5Ow-1; Thu, 29 Feb 2024 07:25:17 -0500 X-MC-Unique: m30Na_ZeOaiveYmbZ8f5Ow-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id ED9C810651E0; Thu, 29 Feb 2024 12:25:16 +0000 (UTC) Received: from max-p1.redhat.com (unknown [10.39.208.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id B471F1121312; Thu, 29 Feb 2024 12:25:15 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbox@nvidia.com Cc: Maxime Coquelin Subject: [PATCH 7/7] vhost: improve FD manager logging Date: Thu, 29 Feb 2024 13:25:02 +0100 Message-ID: <20240229122502.2572343-8-maxime.coquelin@redhat.com> In-Reply-To: <20240229122502.2572343-1-maxime.coquelin@redhat.com> References: <20240229122502.2572343-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Convert the logging macro to pass the fdset name as argument. Signed-off-by: Maxime Coquelin --- lib/vhost/fd_man.c | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/lib/vhost/fd_man.c b/lib/vhost/fd_man.c index 6a5bd74656..18426095b4 100644 --- a/lib/vhost/fd_man.c +++ b/lib/vhost/fd_man.c @@ -19,8 +19,8 @@ RTE_LOG_REGISTER_SUFFIX(vhost_fdset_logtype, fdset, INFO); #define RTE_LOGTYPE_VHOST_FDMAN vhost_fdset_logtype -#define VHOST_FDMAN_LOG(level, ...) \ - RTE_LOG_LINE(level, VHOST_FDMAN, "" __VA_ARGS__) +#define VHOST_FDMAN_LOG(prefix, level, fmt, args...) \ + RTE_LOG_LINE(level, VHOST_FDMAN, "(%s) " fmt, prefix, ##args) #define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL) @@ -126,15 +126,13 @@ fdset_sync_init(struct fdset *fdset) fdset->sync_fd = eventfd(0, 0); if (fdset->sync_fd < 0) { - VHOST_FDMAN_LOG(ERR, "failed to create eventfd for %s fdset", fdset->name); + VHOST_FDMAN_LOG(fdset->name, ERR, "failed to create eventfd"); return -1; } ret = fdset_add_no_sync(fdset, fdset->sync_fd, fdset_sync_read_cb, NULL, fdset); if (ret < 0) { - VHOST_FDMAN_LOG(ERR, "failed to add eventfd %d to %s fdset", - fdset->sync_fd, fdset->name); - + VHOST_FDMAN_LOG(fdset->name, ERR, "failed to add eventfd %d", fdset->sync_fd); fdset_sync_uninit(fdset); return -1; } @@ -152,8 +150,8 @@ fdset_sync(struct fdset *fdset) fdset->sync = false; ret = eventfd_write(fdset->sync_fd, (eventfd_t)1); if (ret < 0) { - VHOST_FDMAN_LOG(ERR, "Failed to write sync eventfd for %s fdset: %s", - fdset->name, strerror(errno)); + VHOST_FDMAN_LOG(fdset->name, ERR, "Failed to write sync eventfd: %s", + strerror(errno)); goto out_unlock; } @@ -251,7 +249,7 @@ fdset_init(const char *name) int i; if (name == NULL) { - VHOST_FDMAN_LOG(ERR, "Invalid name"); + VHOST_FDMAN_LOG("fdset", ERR, "Invalid name"); goto err; } @@ -264,7 +262,7 @@ fdset_init(const char *name) fdset = rte_zmalloc(NULL, sizeof(*fdset), 0); if (!fdset) { - VHOST_FDMAN_LOG(ERR, "Failed to alloc fdset %s", name); + VHOST_FDMAN_LOG(name, ERR, "Failed to alloc fdset"); goto err_unlock; } @@ -280,19 +278,18 @@ fdset_init(const char *name) fdset->num = 0; if (fdset_sync_init(fdset)) { - VHOST_FDMAN_LOG(ERR, "Failed to init sync for %s", name); + VHOST_FDMAN_LOG(fdset->name, ERR, "Failed to init sync"); goto err_free; } if (rte_thread_create_internal_control(&fdset->tid, fdset->name, fdset_event_dispatch, fdset)) { - VHOST_FDMAN_LOG(ERR, "Failed to create %s event dispatch thread", - fdset->name); + VHOST_FDMAN_LOG(fdset->name, ERR, "Failed to create event dispatch thread"); goto err_sync; } if (fdset_insert(fdset)) { - VHOST_FDMAN_LOG(ERR, "Failed to insert fdset %s", name); + VHOST_FDMAN_LOG(fdset->name, ERR, "Failed to insert fdset"); goto err_thread; }