From patchwork Fri Mar 31 15:42:33 2023
X-Patchwork-Id: 125677
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com
Cc: Maxime Coquelin, stable@dpdk.org
Subject: [RFC 01/27] vhost: fix missing guest notif stat increment
Date: Fri, 31 Mar 2023 17:42:33 +0200
Message-Id: <20230331154259.1447831-2-maxime.coquelin@redhat.com>

The guest notification counter was only incremented for the split ring;
this patch also increments it for the packed ring.
Fixes: 1ea74efd7fa4 ("vhost: add statistics for guest notification")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin
Reviewed-by: Chenbo Xia
---
 lib/vhost/vhost.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 8fdab13c70..8554ab4002 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -973,6 +973,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
 kick:
 	if (kick) {
 		eventfd_write(vq->callfd, (eventfd_t)1);
+		if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
+			vq->stats.guest_notifications++;
 		if (dev->notify_ops->guest_notified)
 			dev->notify_ops->guest_notified(dev->vid);
 	}
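(Not part of the patch.) For context, a sketch of how an application could read this counter, assuming the per-virtqueue statistics API added along with these counters, rte_vhost_vring_stats_get_names() and rte_vhost_vring_stats_get(), and assuming stats were enabled with the RTE_VHOST_USER_NET_STATS_ENABLE registration flag; the exact stat name and signatures should be checked against rte_vhost.h:

/* Illustrative only: dump the "guest_notifications" counter of one virtqueue. */
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <rte_vhost.h>

static void
dump_guest_notifications(int vid, uint16_t queue_id)
{
	struct rte_vhost_stat_name *names = NULL;
	struct rte_vhost_stat *stats = NULL;
	int i, count;

	/* First call with NULL returns the number of available stats. */
	count = rte_vhost_vring_stats_get_names(vid, queue_id, NULL, 0);
	if (count <= 0)
		return;

	names = calloc(count, sizeof(*names));
	stats = calloc(count, sizeof(*stats));
	if (names == NULL || stats == NULL)
		goto out;

	if (rte_vhost_vring_stats_get_names(vid, queue_id, names, count) != count ||
			rte_vhost_vring_stats_get(vid, queue_id, stats, count) != count)
		goto out;

	for (i = 0; i < count; i++) {
		/* Stat name assumed to match the vq->stats field name. */
		if (strcmp(names[i].name, "guest_notifications") == 0)
			printf("vq %u guest_notifications: %" PRIu64 "\n",
				(unsigned int)queue_id, stats[i].value);
	}
out:
	free(names);
	free(stats);
}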
From patchwork Fri Mar 31 15:42:34 2023
X-Patchwork-Id: 125678
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com
Cc: Maxime Coquelin, stable@dpdk.org
Subject: [RFC 02/27] vhost: fix invalid call FD handling
Date: Fri, 31 Mar 2023 17:42:34 +0200
Message-Id: <20230331154259.1447831-3-maxime.coquelin@redhat.com>

This patch fixes cases where IRQ injection is attempted while the call FD
is not valid, which should not happen.

Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification suppression")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin
Reviewed-by: Chenbo Xia
---
 lib/vhost/vhost.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 8554ab4002..40863f7bfd 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -902,9 +902,9 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
 			"%s: used_event_idx=%d, old=%d, new=%d\n",
 			__func__, vhost_used_event(vq), old, new);
 
-	if ((vhost_need_event(vhost_used_event(vq), new, old) &&
-				(vq->callfd >= 0)) ||
-			unlikely(!signalled_used_valid)) {
+	if ((vhost_need_event(vhost_used_event(vq), new, old) ||
+			unlikely(!signalled_used_valid)) &&
+			vq->callfd >= 0) {
 		eventfd_write(vq->callfd, (eventfd_t) 1);
 		if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
 			vq->stats.guest_notifications++;
@@ -971,7 +971,7 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
 		if (vhost_need_event(off, new, old))
 			kick = true;
 kick:
-	if (kick) {
+	if (kick && vq->callfd >= 0) {
 		eventfd_write(vq->callfd, (eventfd_t)1);
 		if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
 			vq->stats.guest_notifications++;
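(Not part of the patch.) A minimal standalone sketch of the condition regrouping, with plain booleans standing in for the vhost fields, showing why the old grouping could reach eventfd_write() with an invalid descriptor:

/* Standalone illustration of the vhost_vring_call_split() condition fix:
 * with the old grouping, !signalled_used_valid alone was enough to reach
 * eventfd_write() even when callfd was invalid (-1). */
#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	bool need_event = false;           /* guest did not request an event */
	bool signalled_used_valid = false; /* e.g. right after vring setup */
	int callfd = -1;                   /* no call FD provided yet */

	bool old_kick = (need_event && callfd >= 0) || !signalled_used_valid;
	bool new_kick = (need_event || !signalled_used_valid) && callfd >= 0;

	printf("old grouping kicks: %s (would write to fd %d)\n",
		old_kick ? "yes" : "no", callfd);
	printf("new grouping kicks: %s\n", new_kick ? "yes" : "no");

	return 0;
}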
From patchwork Fri Mar 31 15:42:35 2023
X-Patchwork-Id: 125679
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com
Cc: Maxime Coquelin, stable@dpdk.org
Subject: [RFC 03/27] vhost: fix IOTLB entries overlap check with previous entry
Date: Fri, 31 Mar 2023 17:42:35 +0200
Message-Id: <20230331154259.1447831-4-maxime.coquelin@redhat.com>

Commit 22b6d0ac691a ("vhost: fix madvise IOTLB entries pages overlap check")
fixed the check to ensure the entry to be removed does not overlap with the
next one in the IOTLB cache before marking it as DONTDUMP with madvise().

This is not enough, because the same issue is present when comparing with
the previous entry in the cache, where the end address of the previous
entry should be used, not its start address.

Fixes: dea092d0addb ("vhost: fix madvise arguments alignment")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin
Acked-by: Mike Pattrick
Reviewed-by: Chenbo Xia
---
 lib/vhost/iotlb.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c
index 3f45bc6061..870c8acb88 100644
--- a/lib/vhost/iotlb.c
+++ b/lib/vhost/iotlb.c
@@ -178,8 +178,8 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net *dev, struct vhost_virtque
 			mask = ~(alignment - 1);
 
 			/* Don't disable coredump if the previous node is in the same page */
-			if (prev_node == NULL ||
-					(node->uaddr & mask) != (prev_node->uaddr & mask)) {
+			if (prev_node == NULL || (node->uaddr & mask) !=
+					((prev_node->uaddr + prev_node->size - 1) & mask)) {
 				next_node = RTE_TAILQ_NEXT(node, next);
 				/* Don't disable coredump if the next node is in the same page */
 				if (next_node == NULL || ((node->uaddr + node->size - 1) & mask) !=
@@ -283,8 +283,8 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, struct vhost_virtqueue *vq
 			mask = ~(alignment-1);
 
 			/* Don't disable coredump if the previous node is in the same page */
-			if (prev_node == NULL ||
-					(node->uaddr & mask) != (prev_node->uaddr & mask)) {
+			if (prev_node == NULL || (node->uaddr & mask) !=
+					((prev_node->uaddr + prev_node->size - 1) & mask)) {
 				next_node = RTE_TAILQ_NEXT(node, next);
 				/* Don't disable coredump if the next node is in the same page */
 				if (next_node == NULL || ((node->uaddr + node->size - 1) & mask) !=
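(Not part of the patch.) The issue is easiest to see with concrete numbers; a minimal standalone sketch, assuming 4 KiB pages and hypothetical example addresses, where the previous entry spans several pages:

/* Standalone illustration of the previous-entry page overlap check:
 * prev starts in page 1 and ends in page 3, node starts in page 3,
 * so the page is shared. */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t align = 4096;
	uint64_t mask = ~(align - 1);

	uint64_t prev_uaddr = 0x1000;  /* starts in page 1 */
	uint64_t prev_size  = 0x2800;  /* last byte at 0x37ff, page 3 */
	uint64_t node_uaddr = 0x3800;  /* starts in page 3 */

	int old_check = (node_uaddr & mask) != (prev_uaddr & mask);
	int new_check = (node_uaddr & mask) !=
			((prev_uaddr + prev_size - 1) & mask);

	/* Old check compares the page of prev's *start* (page 1) with the page
	 * of node's start (page 3): "different", so the page would be marked
	 * DONTDUMP although prev still lives in it. */
	printf("old check says pages differ: %s\n", old_check ? "yes" : "no");
	/* New check compares the page of prev's *end* (page 3) with the page of
	 * node's start (page 3): sharing is detected, coredump stays enabled. */
	printf("new check says pages differ: %s\n", new_check ? "yes" : "no");

	return 0;
}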
From patchwork Fri Mar 31 15:42:36 2023
X-Patchwork-Id: 125680
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com
Cc: Maxime Coquelin
Subject: [RFC 04/27] vhost: add helper of IOTLB entries coredump
Date: Fri, 31 Mar 2023 17:42:36 +0200
Message-Id: <20230331154259.1447831-5-maxime.coquelin@redhat.com>

This patch reworks the IOTLB code to extract the madvise-related bits into
a dedicated helper. This refactoring improves code sharing.
Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/iotlb.c | 77 +++++++++++++++++++++++++---------------------- 1 file changed, 41 insertions(+), 36 deletions(-) diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c index 870c8acb88..e8f1cb661e 100644 --- a/lib/vhost/iotlb.c +++ b/lib/vhost/iotlb.c @@ -23,6 +23,34 @@ struct vhost_iotlb_entry { #define IOTLB_CACHE_SIZE 2048 +static void +vhost_user_iotlb_set_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node) +{ + uint64_t align; + + align = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr); + + mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, false, align); +} + +static void +vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node, + struct vhost_iotlb_entry *prev, struct vhost_iotlb_entry *next) +{ + uint64_t align, mask; + + align = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr); + mask = ~(align - 1); + + /* Don't disable coredump if the previous node is in the same page */ + if (prev == NULL || (node->uaddr & mask) != ((prev->uaddr + prev->size - 1) & mask)) { + /* Don't disable coredump if the next node is in the same page */ + if (next == NULL || + ((node->uaddr + node->size - 1) & mask) != (next->uaddr & mask)) + mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, false, align); + } +} + static struct vhost_iotlb_entry * vhost_user_iotlb_pool_get(struct vhost_virtqueue *vq) { @@ -149,8 +177,8 @@ vhost_user_iotlb_cache_remove_all(struct virtio_net *dev, struct vhost_virtqueue rte_rwlock_write_lock(&vq->iotlb_lock); RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) { - mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, false, - hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr)); + vhost_user_iotlb_set_dump(dev, node); + TAILQ_REMOVE(&vq->iotlb_list, node, next); vhost_user_iotlb_pool_put(vq, node); } @@ -164,7 +192,6 @@ static void vhost_user_iotlb_cache_random_evict(struct virtio_net *dev, struct vhost_virtqueue *vq) { struct vhost_iotlb_entry *node, *temp_node, *prev_node = NULL; - uint64_t alignment, mask; int entry_idx; rte_rwlock_write_lock(&vq->iotlb_lock); @@ -173,20 +200,10 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net *dev, struct vhost_virtque RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) { if (!entry_idx) { - struct vhost_iotlb_entry *next_node; - alignment = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr); - mask = ~(alignment - 1); - - /* Don't disable coredump if the previous node is in the same page */ - if (prev_node == NULL || (node->uaddr & mask) != - ((prev_node->uaddr + prev_node->size - 1) & mask)) { - next_node = RTE_TAILQ_NEXT(node, next); - /* Don't disable coredump if the next node is in the same page */ - if (next_node == NULL || ((node->uaddr + node->size - 1) & mask) != - (next_node->uaddr & mask)) - mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, - false, alignment); - } + struct vhost_iotlb_entry *next_node = RTE_TAILQ_NEXT(node, next); + + vhost_user_iotlb_clear_dump(dev, node, prev_node, next_node); + TAILQ_REMOVE(&vq->iotlb_list, node, next); vhost_user_iotlb_pool_put(vq, node); vq->iotlb_cache_nr--; @@ -240,16 +257,16 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, struct vhost_virtqueue *vq vhost_user_iotlb_pool_put(vq, new_node); goto unlock; } else if (node->iova > new_node->iova) { - mem_set_dump((void *)(uintptr_t)new_node->uaddr, new_node->size, true, - hua_to_alignment(dev->mem, (void *)(uintptr_t)new_node->uaddr)); + 
vhost_user_iotlb_set_dump(dev, new_node); + TAILQ_INSERT_BEFORE(node, new_node, next); vq->iotlb_cache_nr++; goto unlock; } } - mem_set_dump((void *)(uintptr_t)new_node->uaddr, new_node->size, true, - hua_to_alignment(dev->mem, (void *)(uintptr_t)new_node->uaddr)); + vhost_user_iotlb_set_dump(dev, new_node); + TAILQ_INSERT_TAIL(&vq->iotlb_list, new_node, next); vq->iotlb_cache_nr++; @@ -265,7 +282,6 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, struct vhost_virtqueue *vq uint64_t iova, uint64_t size) { struct vhost_iotlb_entry *node, *temp_node, *prev_node = NULL; - uint64_t alignment, mask; if (unlikely(!size)) return; @@ -278,20 +294,9 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, struct vhost_virtqueue *vq break; if (iova < node->iova + node->size) { - struct vhost_iotlb_entry *next_node; - alignment = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr); - mask = ~(alignment-1); - - /* Don't disable coredump if the previous node is in the same page */ - if (prev_node == NULL || (node->uaddr & mask) != - ((prev_node->uaddr + prev_node->size - 1) & mask)) { - next_node = RTE_TAILQ_NEXT(node, next); - /* Don't disable coredump if the next node is in the same page */ - if (next_node == NULL || ((node->uaddr + node->size - 1) & mask) != - (next_node->uaddr & mask)) - mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, - false, alignment); - } + struct vhost_iotlb_entry *next_node = RTE_TAILQ_NEXT(node, next); + + vhost_user_iotlb_clear_dump(dev, node, prev_node, next_node); TAILQ_REMOVE(&vq->iotlb_list, node, next); vhost_user_iotlb_pool_put(vq, node); From patchwork Fri Mar 31 15:42:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125682 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 20E3342887; Fri, 31 Mar 2023 17:43:53 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A28BF42D69; Fri, 31 Mar 2023 17:43:26 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id E60EB42D62 for ; Fri, 31 Mar 2023 17:43:23 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277403; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DzI2mzfZqWy/oUf+X58/CEpCAfnbfcJJtVS9mWXswDg=; b=hXdhlYaewLnQ3q/1I775nErLqoMX+q9c6y3ZXBPm4PBR7pE5vSN6kYuVsKoda3MixADV6D wEIqyw3LNgiyfdtvYwpvdB8NIaInpBzHzEaZ7zPRZIV7bTPZttEymxlc1ONSELkoP12omR PEHtU+I02nFEo10rm75Oh8I3aEdqa1Q= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-488-NLRI65qmPlOSiqIGlFJ7MA-1; Fri, 31 Mar 2023 11:43:19 -0400 X-MC-Unique: NLRI65qmPlOSiqIGlFJ7MA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS 
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com
Cc: Maxime Coquelin
Subject: [RFC 05/27] vhost: add helper for IOTLB entries shared page check
Date: Fri, 31 Mar 2023 17:42:37 +0200
Message-Id: <20230331154259.1447831-6-maxime.coquelin@redhat.com>

This patch introduces a helper to check whether two IOTLB entries share
a page.

Signed-off-by: Maxime Coquelin
Acked-by: Mike Pattrick
Reviewed-by: Chenbo Xia
---
 lib/vhost/iotlb.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c
index e8f1cb661e..d919f74704 100644
--- a/lib/vhost/iotlb.c
+++ b/lib/vhost/iotlb.c
@@ -23,6 +23,23 @@ struct vhost_iotlb_entry {
 
 #define IOTLB_CACHE_SIZE	2048
 
+static bool
+vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entry *b,
+		uint64_t align)
+{
+	uint64_t a_end, b_start;
+
+	if (a == NULL || b == NULL)
+		return false;
+
+	/* Assumes entry a lower than entry b */
+	RTE_ASSERT(a->uaddr < b->uaddr);
+	a_end = RTE_ALIGN_CEIL(a->uaddr + a->size, align);
+	b_start = RTE_ALIGN_FLOOR(b->uaddr, align);
+
+	return a_end > b_start;
+}
+
 static void
 vhost_user_iotlb_set_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node)
 {
@@ -37,16 +54,14 @@ static void
 vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node,
 		struct vhost_iotlb_entry *prev, struct vhost_iotlb_entry *next)
 {
-	uint64_t align, mask;
+	uint64_t align;
 
 	align = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr);
-	mask = ~(align - 1);
 
 	/* Don't disable coredump if the previous node is in the same page */
-	if (prev == NULL || (node->uaddr & mask) != ((prev->uaddr + prev->size - 1) & mask)) {
+	if (!vhost_user_iotlb_share_page(prev, node, align)) {
 		/* Don't disable coredump if the next node is in the same page */
-		if (next == NULL ||
-				((node->uaddr + node->size - 1) & mask) != (next->uaddr & mask))
+		if (!vhost_user_iotlb_share_page(node, next, align))
 			mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, false, align);
 	}
 }
From patchwork Fri Mar 31 15:42:38 2023
X-Patchwork-Id: 125681
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com
Cc: Maxime Coquelin, stable@dpdk.org
Subject: [RFC 06/27] vhost: don't dump unneeded pages with IOTLB
Date: Fri, 31 Mar 2023 17:42:38 +0200
Message-Id: <20230331154259.1447831-7-maxime.coquelin@redhat.com>

On IOTLB entry removal, previous fixes took care of not marking pages
shared with other IOTLB entries as DONTDUMP. However, if an IOTLB entry
spans multiple pages, the other pages were kept as DODUMP even though they
might not be shared with other entries, needlessly increasing the coredump
size.

This patch addresses the issue by excluding only the shared pages from
madvise's DONTDUMP.
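(Not part of the patch.) A minimal standalone sketch of the new trimming logic: only the first and last pages are skipped when they are shared with a neighbouring entry, while the middle pages are still marked DONTDUMP. Local ALIGN_CEIL/ALIGN_FLOOR macros stand in for DPDK's RTE_ALIGN_CEIL/RTE_ALIGN_FLOOR, and the addresses are hypothetical:

#include <inttypes.h>
#include <stdio.h>

#define ALIGN_FLOOR(v, a) ((v) & ~((a) - 1))
#define ALIGN_CEIL(v, a)  ALIGN_FLOOR((v) + (a) - 1, (a))

int main(void)
{
	uint64_t align = 4096;
	/* Entry spans pages 1..4; page 1 is shared with the previous entry,
	 * page 4 with the next one. */
	uint64_t start = 0x1800;
	uint64_t end = 0x4800;  /* uaddr + size */
	int shared_with_prev = 1;
	int shared_with_next = 1;

	if (shared_with_prev)
		start = ALIGN_CEIL(start, align);  /* skip first page -> 0x2000 */
	if (shared_with_next)
		end = ALIGN_FLOOR(end, align);     /* skip last page  -> 0x4000 */

	if (end > start)
		printf("madvise DONTDUMP range: [0x%" PRIx64 ", 0x%" PRIx64 "), %" PRIu64 " bytes\n",
			start, end, end - start);

	return 0;
}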
Fixes: dea092d0addb ("vhost: fix madvise arguments alignment")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin
Acked-by: Mike Pattrick
Reviewed-by: Chenbo Xia
---
 lib/vhost/iotlb.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c
index d919f74704..f598c0a8c4 100644
--- a/lib/vhost/iotlb.c
+++ b/lib/vhost/iotlb.c
@@ -54,16 +54,23 @@ static void
 vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node,
 		struct vhost_iotlb_entry *prev, struct vhost_iotlb_entry *next)
 {
-	uint64_t align;
+	uint64_t align, start, end;
+
+	start = node->uaddr;
+	end = node->uaddr + node->size;
 
 	align = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr);
 
-	/* Don't disable coredump if the previous node is in the same page */
-	if (!vhost_user_iotlb_share_page(prev, node, align)) {
-		/* Don't disable coredump if the next node is in the same page */
-		if (!vhost_user_iotlb_share_page(node, next, align))
-			mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, false, align);
-	}
+	/* Skip first page if shared with previous entry. */
+	if (vhost_user_iotlb_share_page(prev, node, align))
+		start = RTE_ALIGN_CEIL(start, align);
+
+	/* Skip last page if shared with next entry. */
+	if (vhost_user_iotlb_share_page(node, next, align))
+		end = RTE_ALIGN_FLOOR(end, align);
+
+	if (end > start)
+		mem_set_dump((void *)(uintptr_t)start, end - start, false, align);
 }
 
 static struct vhost_iotlb_entry *

From patchwork Fri Mar 31 15:42:39 2023
X-Patchwork-Id: 125683
by smtp.corp.redhat.com (Postfix) with ESMTP id D22E32027040; Fri, 31 Mar 2023 15:43:21 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 07/27] vhost: change to single IOTLB cache per device Date: Fri, 31 Mar 2023 17:42:39 +0200 Message-Id: <20230331154259.1447831-8-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch simplifies IOTLB implementation and improves IOTLB memory consumption by having a single IOTLB cache per device, instead of having one per queue. In order to not impact performance, it keeps an IOTLB lock per virtqueue, so that there is no contention between multiple queue trying to acquire it. Signed-off-by: Maxime Coquelin --- lib/vhost/iotlb.c | 212 +++++++++++++++++++---------------------- lib/vhost/iotlb.h | 43 ++++++--- lib/vhost/vhost.c | 18 ++-- lib/vhost/vhost.h | 16 ++-- lib/vhost/vhost_user.c | 25 +++-- 5 files changed, 160 insertions(+), 154 deletions(-) diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c index f598c0a8c4..a91115cf1c 100644 --- a/lib/vhost/iotlb.c +++ b/lib/vhost/iotlb.c @@ -74,86 +74,81 @@ vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct vhost_iotlb_entry *no } static struct vhost_iotlb_entry * -vhost_user_iotlb_pool_get(struct vhost_virtqueue *vq) +vhost_user_iotlb_pool_get(struct virtio_net *dev) { struct vhost_iotlb_entry *node; - rte_spinlock_lock(&vq->iotlb_free_lock); - node = SLIST_FIRST(&vq->iotlb_free_list); + rte_spinlock_lock(&dev->iotlb_free_lock); + node = SLIST_FIRST(&dev->iotlb_free_list); if (node != NULL) - SLIST_REMOVE_HEAD(&vq->iotlb_free_list, next_free); - rte_spinlock_unlock(&vq->iotlb_free_lock); + SLIST_REMOVE_HEAD(&dev->iotlb_free_list, next_free); + rte_spinlock_unlock(&dev->iotlb_free_lock); return node; } static void -vhost_user_iotlb_pool_put(struct vhost_virtqueue *vq, - struct vhost_iotlb_entry *node) +vhost_user_iotlb_pool_put(struct virtio_net *dev, struct vhost_iotlb_entry *node) { - rte_spinlock_lock(&vq->iotlb_free_lock); - SLIST_INSERT_HEAD(&vq->iotlb_free_list, node, next_free); - rte_spinlock_unlock(&vq->iotlb_free_lock); + rte_spinlock_lock(&dev->iotlb_free_lock); + SLIST_INSERT_HEAD(&dev->iotlb_free_list, node, next_free); + rte_spinlock_unlock(&dev->iotlb_free_lock); } static void -vhost_user_iotlb_cache_random_evict(struct virtio_net *dev, struct vhost_virtqueue *vq); +vhost_user_iotlb_cache_random_evict(struct virtio_net *dev); static void -vhost_user_iotlb_pending_remove_all(struct vhost_virtqueue *vq) +vhost_user_iotlb_pending_remove_all(struct virtio_net *dev) { struct vhost_iotlb_entry *node, *temp_node; - rte_rwlock_write_lock(&vq->iotlb_pending_lock); + rte_rwlock_write_lock(&dev->iotlb_pending_lock); - RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_pending_list, next, temp_node) { - TAILQ_REMOVE(&vq->iotlb_pending_list, node, next); - vhost_user_iotlb_pool_put(vq, node); + RTE_TAILQ_FOREACH_SAFE(node, 
&dev->iotlb_pending_list, next, temp_node) { + TAILQ_REMOVE(&dev->iotlb_pending_list, node, next); + vhost_user_iotlb_pool_put(dev, node); } - rte_rwlock_write_unlock(&vq->iotlb_pending_lock); + rte_rwlock_write_unlock(&dev->iotlb_pending_lock); } bool -vhost_user_iotlb_pending_miss(struct vhost_virtqueue *vq, uint64_t iova, - uint8_t perm) +vhost_user_iotlb_pending_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm) { struct vhost_iotlb_entry *node; bool found = false; - rte_rwlock_read_lock(&vq->iotlb_pending_lock); + rte_rwlock_read_lock(&dev->iotlb_pending_lock); - TAILQ_FOREACH(node, &vq->iotlb_pending_list, next) { + TAILQ_FOREACH(node, &dev->iotlb_pending_list, next) { if ((node->iova == iova) && (node->perm == perm)) { found = true; break; } } - rte_rwlock_read_unlock(&vq->iotlb_pending_lock); + rte_rwlock_read_unlock(&dev->iotlb_pending_lock); return found; } void -vhost_user_iotlb_pending_insert(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint64_t iova, uint8_t perm) +vhost_user_iotlb_pending_insert(struct virtio_net *dev, uint64_t iova, uint8_t perm) { struct vhost_iotlb_entry *node; - node = vhost_user_iotlb_pool_get(vq); + node = vhost_user_iotlb_pool_get(dev); if (node == NULL) { VHOST_LOG_CONFIG(dev->ifname, DEBUG, - "IOTLB pool for vq %"PRIu32" empty, clear entries for pending insertion\n", - vq->index); - if (!TAILQ_EMPTY(&vq->iotlb_pending_list)) - vhost_user_iotlb_pending_remove_all(vq); + "IOTLB pool empty, clear entries for pending insertion\n"); + if (!TAILQ_EMPTY(&dev->iotlb_pending_list)) + vhost_user_iotlb_pending_remove_all(dev); else - vhost_user_iotlb_cache_random_evict(dev, vq); - node = vhost_user_iotlb_pool_get(vq); + vhost_user_iotlb_cache_random_evict(dev); + node = vhost_user_iotlb_pool_get(dev); if (node == NULL) { VHOST_LOG_CONFIG(dev->ifname, ERR, - "IOTLB pool vq %"PRIu32" still empty, pending insertion failure\n", - vq->index); + "IOTLB pool still empty, pending insertion failure\n"); return; } } @@ -161,22 +156,21 @@ vhost_user_iotlb_pending_insert(struct virtio_net *dev, struct vhost_virtqueue * node->iova = iova; node->perm = perm; - rte_rwlock_write_lock(&vq->iotlb_pending_lock); + rte_rwlock_write_lock(&dev->iotlb_pending_lock); - TAILQ_INSERT_TAIL(&vq->iotlb_pending_list, node, next); + TAILQ_INSERT_TAIL(&dev->iotlb_pending_list, node, next); - rte_rwlock_write_unlock(&vq->iotlb_pending_lock); + rte_rwlock_write_unlock(&dev->iotlb_pending_lock); } void -vhost_user_iotlb_pending_remove(struct vhost_virtqueue *vq, - uint64_t iova, uint64_t size, uint8_t perm) +vhost_user_iotlb_pending_remove(struct virtio_net *dev, uint64_t iova, uint64_t size, uint8_t perm) { struct vhost_iotlb_entry *node, *temp_node; - rte_rwlock_write_lock(&vq->iotlb_pending_lock); + rte_rwlock_write_lock(&dev->iotlb_pending_lock); - RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_pending_list, next, + RTE_TAILQ_FOREACH_SAFE(node, &dev->iotlb_pending_list, next, temp_node) { if (node->iova < iova) continue; @@ -184,81 +178,78 @@ vhost_user_iotlb_pending_remove(struct vhost_virtqueue *vq, continue; if ((node->perm & perm) != node->perm) continue; - TAILQ_REMOVE(&vq->iotlb_pending_list, node, next); - vhost_user_iotlb_pool_put(vq, node); + TAILQ_REMOVE(&dev->iotlb_pending_list, node, next); + vhost_user_iotlb_pool_put(dev, node); } - rte_rwlock_write_unlock(&vq->iotlb_pending_lock); + rte_rwlock_write_unlock(&dev->iotlb_pending_lock); } static void -vhost_user_iotlb_cache_remove_all(struct virtio_net *dev, struct vhost_virtqueue *vq) 
+vhost_user_iotlb_cache_remove_all(struct virtio_net *dev) { struct vhost_iotlb_entry *node, *temp_node; - rte_rwlock_write_lock(&vq->iotlb_lock); + vhost_user_iotlb_wr_lock_all(dev); - RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) { + RTE_TAILQ_FOREACH_SAFE(node, &dev->iotlb_list, next, temp_node) { vhost_user_iotlb_set_dump(dev, node); - TAILQ_REMOVE(&vq->iotlb_list, node, next); - vhost_user_iotlb_pool_put(vq, node); + TAILQ_REMOVE(&dev->iotlb_list, node, next); + vhost_user_iotlb_pool_put(dev, node); } - vq->iotlb_cache_nr = 0; + dev->iotlb_cache_nr = 0; - rte_rwlock_write_unlock(&vq->iotlb_lock); + vhost_user_iotlb_wr_unlock_all(dev); } static void -vhost_user_iotlb_cache_random_evict(struct virtio_net *dev, struct vhost_virtqueue *vq) +vhost_user_iotlb_cache_random_evict(struct virtio_net *dev) { struct vhost_iotlb_entry *node, *temp_node, *prev_node = NULL; int entry_idx; - rte_rwlock_write_lock(&vq->iotlb_lock); + vhost_user_iotlb_wr_lock_all(dev); - entry_idx = rte_rand() % vq->iotlb_cache_nr; + entry_idx = rte_rand() % dev->iotlb_cache_nr; - RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) { + RTE_TAILQ_FOREACH_SAFE(node, &dev->iotlb_list, next, temp_node) { if (!entry_idx) { struct vhost_iotlb_entry *next_node = RTE_TAILQ_NEXT(node, next); vhost_user_iotlb_clear_dump(dev, node, prev_node, next_node); - TAILQ_REMOVE(&vq->iotlb_list, node, next); - vhost_user_iotlb_pool_put(vq, node); - vq->iotlb_cache_nr--; + TAILQ_REMOVE(&dev->iotlb_list, node, next); + vhost_user_iotlb_pool_put(dev, node); + dev->iotlb_cache_nr--; break; } prev_node = node; entry_idx--; } - rte_rwlock_write_unlock(&vq->iotlb_lock); + vhost_user_iotlb_wr_unlock_all(dev); } void -vhost_user_iotlb_cache_insert(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint64_t iova, uint64_t uaddr, +vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t uaddr, uint64_t size, uint8_t perm) { struct vhost_iotlb_entry *node, *new_node; - new_node = vhost_user_iotlb_pool_get(vq); + new_node = vhost_user_iotlb_pool_get(dev); if (new_node == NULL) { VHOST_LOG_CONFIG(dev->ifname, DEBUG, - "IOTLB pool vq %"PRIu32" empty, clear entries for cache insertion\n", - vq->index); - if (!TAILQ_EMPTY(&vq->iotlb_list)) - vhost_user_iotlb_cache_random_evict(dev, vq); + "IOTLB pool empty, clear entries for cache insertion\n"); + if (!TAILQ_EMPTY(&dev->iotlb_list)) + vhost_user_iotlb_cache_random_evict(dev); else - vhost_user_iotlb_pending_remove_all(vq); - new_node = vhost_user_iotlb_pool_get(vq); + vhost_user_iotlb_pending_remove_all(dev); + new_node = vhost_user_iotlb_pool_get(dev); if (new_node == NULL) { VHOST_LOG_CONFIG(dev->ifname, ERR, - "IOTLB pool vq %"PRIu32" still empty, cache insertion failed\n", - vq->index); + "IOTLB pool still empty, cache insertion failed\n"); return; } } @@ -268,49 +259,47 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, struct vhost_virtqueue *vq new_node->size = size; new_node->perm = perm; - rte_rwlock_write_lock(&vq->iotlb_lock); + vhost_user_iotlb_wr_lock_all(dev); - TAILQ_FOREACH(node, &vq->iotlb_list, next) { + TAILQ_FOREACH(node, &dev->iotlb_list, next) { /* * Entries must be invalidated before being updated. * So if iova already in list, assume identical. 
*/ if (node->iova == new_node->iova) { - vhost_user_iotlb_pool_put(vq, new_node); + vhost_user_iotlb_pool_put(dev, new_node); goto unlock; } else if (node->iova > new_node->iova) { vhost_user_iotlb_set_dump(dev, new_node); TAILQ_INSERT_BEFORE(node, new_node, next); - vq->iotlb_cache_nr++; + dev->iotlb_cache_nr++; goto unlock; } } vhost_user_iotlb_set_dump(dev, new_node); - TAILQ_INSERT_TAIL(&vq->iotlb_list, new_node, next); - vq->iotlb_cache_nr++; + TAILQ_INSERT_TAIL(&dev->iotlb_list, new_node, next); + dev->iotlb_cache_nr++; unlock: - vhost_user_iotlb_pending_remove(vq, iova, size, perm); - - rte_rwlock_write_unlock(&vq->iotlb_lock); + vhost_user_iotlb_pending_remove(dev, iova, size, perm); + vhost_user_iotlb_wr_unlock_all(dev); } void -vhost_user_iotlb_cache_remove(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint64_t iova, uint64_t size) +vhost_user_iotlb_cache_remove(struct virtio_net *dev, uint64_t iova, uint64_t size) { struct vhost_iotlb_entry *node, *temp_node, *prev_node = NULL; if (unlikely(!size)) return; - rte_rwlock_write_lock(&vq->iotlb_lock); + vhost_user_iotlb_wr_lock_all(dev); - RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) { + RTE_TAILQ_FOREACH_SAFE(node, &dev->iotlb_list, next, temp_node) { /* Sorted list */ if (unlikely(iova + size < node->iova)) break; @@ -320,19 +309,19 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, struct vhost_virtqueue *vq vhost_user_iotlb_clear_dump(dev, node, prev_node, next_node); - TAILQ_REMOVE(&vq->iotlb_list, node, next); - vhost_user_iotlb_pool_put(vq, node); - vq->iotlb_cache_nr--; - } else + TAILQ_REMOVE(&dev->iotlb_list, node, next); + vhost_user_iotlb_pool_put(dev, node); + dev->iotlb_cache_nr--; + } else { prev_node = node; + } } - rte_rwlock_write_unlock(&vq->iotlb_lock); + vhost_user_iotlb_wr_unlock_all(dev); } uint64_t -vhost_user_iotlb_cache_find(struct vhost_virtqueue *vq, uint64_t iova, - uint64_t *size, uint8_t perm) +vhost_user_iotlb_cache_find(struct virtio_net *dev, uint64_t iova, uint64_t *size, uint8_t perm) { struct vhost_iotlb_entry *node; uint64_t offset, vva = 0, mapped = 0; @@ -340,7 +329,7 @@ vhost_user_iotlb_cache_find(struct vhost_virtqueue *vq, uint64_t iova, if (unlikely(!*size)) goto out; - TAILQ_FOREACH(node, &vq->iotlb_list, next) { + TAILQ_FOREACH(node, &dev->iotlb_list, next) { /* List sorted by iova */ if (unlikely(iova < node->iova)) break; @@ -373,60 +362,57 @@ vhost_user_iotlb_cache_find(struct vhost_virtqueue *vq, uint64_t iova, } void -vhost_user_iotlb_flush_all(struct virtio_net *dev, struct vhost_virtqueue *vq) +vhost_user_iotlb_flush_all(struct virtio_net *dev) { - vhost_user_iotlb_cache_remove_all(dev, vq); - vhost_user_iotlb_pending_remove_all(vq); + vhost_user_iotlb_cache_remove_all(dev); + vhost_user_iotlb_pending_remove_all(dev); } int -vhost_user_iotlb_init(struct virtio_net *dev, struct vhost_virtqueue *vq) +vhost_user_iotlb_init(struct virtio_net *dev) { unsigned int i; int socket = 0; - if (vq->iotlb_pool) { + if (dev->iotlb_pool) { /* * The cache has already been initialized, * just drop all cached and pending entries. 
*/ - vhost_user_iotlb_flush_all(dev, vq); - rte_free(vq->iotlb_pool); + vhost_user_iotlb_flush_all(dev); + rte_free(dev->iotlb_pool); } #ifdef RTE_LIBRTE_VHOST_NUMA - if (get_mempolicy(&socket, NULL, 0, vq, MPOL_F_NODE | MPOL_F_ADDR) != 0) + if (get_mempolicy(&socket, NULL, 0, dev, MPOL_F_NODE | MPOL_F_ADDR) != 0) socket = 0; #endif - rte_spinlock_init(&vq->iotlb_free_lock); - rte_rwlock_init(&vq->iotlb_lock); - rte_rwlock_init(&vq->iotlb_pending_lock); + rte_spinlock_init(&dev->iotlb_free_lock); + rte_rwlock_init(&dev->iotlb_pending_lock); - SLIST_INIT(&vq->iotlb_free_list); - TAILQ_INIT(&vq->iotlb_list); - TAILQ_INIT(&vq->iotlb_pending_list); + SLIST_INIT(&dev->iotlb_free_list); + TAILQ_INIT(&dev->iotlb_list); + TAILQ_INIT(&dev->iotlb_pending_list); if (dev->flags & VIRTIO_DEV_SUPPORT_IOMMU) { - vq->iotlb_pool = rte_calloc_socket("iotlb", IOTLB_CACHE_SIZE, + dev->iotlb_pool = rte_calloc_socket("iotlb", IOTLB_CACHE_SIZE, sizeof(struct vhost_iotlb_entry), 0, socket); - if (!vq->iotlb_pool) { - VHOST_LOG_CONFIG(dev->ifname, ERR, - "Failed to create IOTLB cache pool for vq %"PRIu32"\n", - vq->index); + if (!dev->iotlb_pool) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to create IOTLB cache pool\n"); return -1; } for (i = 0; i < IOTLB_CACHE_SIZE; i++) - vhost_user_iotlb_pool_put(vq, &vq->iotlb_pool[i]); + vhost_user_iotlb_pool_put(dev, &dev->iotlb_pool[i]); } - vq->iotlb_cache_nr = 0; + dev->iotlb_cache_nr = 0; return 0; } void -vhost_user_iotlb_destroy(struct vhost_virtqueue *vq) +vhost_user_iotlb_destroy(struct virtio_net *dev) { - rte_free(vq->iotlb_pool); + rte_free(dev->iotlb_pool); } diff --git a/lib/vhost/iotlb.h b/lib/vhost/iotlb.h index 73b5465b41..3490b9e6be 100644 --- a/lib/vhost/iotlb.h +++ b/lib/vhost/iotlb.h @@ -37,20 +37,37 @@ vhost_user_iotlb_wr_unlock(struct vhost_virtqueue *vq) rte_rwlock_write_unlock(&vq->iotlb_lock); } -void vhost_user_iotlb_cache_insert(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint64_t iova, uint64_t uaddr, +static __rte_always_inline void +vhost_user_iotlb_wr_lock_all(struct virtio_net *dev) + __rte_no_thread_safety_analysis +{ + uint32_t i; + + for (i = 0; i < dev->nr_vring; i++) + rte_rwlock_write_lock(&dev->virtqueue[i]->iotlb_lock); +} + +static __rte_always_inline void +vhost_user_iotlb_wr_unlock_all(struct virtio_net *dev) + __rte_no_thread_safety_analysis +{ + uint32_t i; + + for (i = 0; i < dev->nr_vring; i++) + rte_rwlock_write_unlock(&dev->virtqueue[i]->iotlb_lock); +} + +void vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t uaddr, uint64_t size, uint8_t perm); -void vhost_user_iotlb_cache_remove(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint64_t iova, uint64_t size); -uint64_t vhost_user_iotlb_cache_find(struct vhost_virtqueue *vq, uint64_t iova, +void vhost_user_iotlb_cache_remove(struct virtio_net *dev, uint64_t iova, uint64_t size); +uint64_t vhost_user_iotlb_cache_find(struct virtio_net *dev, uint64_t iova, uint64_t *size, uint8_t perm); -bool vhost_user_iotlb_pending_miss(struct vhost_virtqueue *vq, uint64_t iova, - uint8_t perm); -void vhost_user_iotlb_pending_insert(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint64_t iova, uint8_t perm); -void vhost_user_iotlb_pending_remove(struct vhost_virtqueue *vq, uint64_t iova, +bool vhost_user_iotlb_pending_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm); +void vhost_user_iotlb_pending_insert(struct virtio_net *dev, uint64_t iova, uint8_t perm); +void vhost_user_iotlb_pending_remove(struct virtio_net *dev, uint64_t iova, 
uint64_t size, uint8_t perm); -void vhost_user_iotlb_flush_all(struct virtio_net *dev, struct vhost_virtqueue *vq); -int vhost_user_iotlb_init(struct virtio_net *dev, struct vhost_virtqueue *vq); -void vhost_user_iotlb_destroy(struct vhost_virtqueue *vq); +void vhost_user_iotlb_flush_all(struct virtio_net *dev); +int vhost_user_iotlb_init(struct virtio_net *dev); +void vhost_user_iotlb_destroy(struct virtio_net *dev); + #endif /* _VHOST_IOTLB_H_ */ diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index ef37943817..d35075b96c 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -63,7 +63,7 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, tmp_size = *size; - vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm); + vva = vhost_user_iotlb_cache_find(dev, iova, &tmp_size, perm); if (tmp_size == *size) { if (dev->flags & VIRTIO_DEV_STATS_ENABLED) vq->stats.iotlb_hits++; @@ -75,7 +75,7 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, iova += tmp_size; - if (!vhost_user_iotlb_pending_miss(vq, iova, perm)) { + if (!vhost_user_iotlb_pending_miss(dev, iova, perm)) { /* * iotlb_lock is read-locked for a full burst, * but it only protects the iotlb cache. @@ -85,12 +85,12 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, */ vhost_user_iotlb_rd_unlock(vq); - vhost_user_iotlb_pending_insert(dev, vq, iova, perm); + vhost_user_iotlb_pending_insert(dev, iova, perm); if (vhost_user_iotlb_miss(dev, iova, perm)) { VHOST_LOG_DATA(dev->ifname, ERR, "IOTLB miss req failed for IOVA 0x%" PRIx64 "\n", iova); - vhost_user_iotlb_pending_remove(vq, iova, 1, perm); + vhost_user_iotlb_pending_remove(dev, iova, 1, perm); } vhost_user_iotlb_rd_lock(vq); @@ -397,7 +397,6 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq) vhost_free_async_mem(vq); rte_spinlock_unlock(&vq->access_lock); rte_free(vq->batch_copy_elems); - vhost_user_iotlb_destroy(vq); rte_free(vq->log_cache); rte_free(vq); } @@ -575,7 +574,7 @@ vring_invalidate(struct virtio_net *dev __rte_unused, struct vhost_virtqueue *vq } static void -init_vring_queue(struct virtio_net *dev, struct vhost_virtqueue *vq, +init_vring_queue(struct virtio_net *dev __rte_unused, struct vhost_virtqueue *vq, uint32_t vring_idx) { int numa_node = SOCKET_ID_ANY; @@ -595,8 +594,6 @@ init_vring_queue(struct virtio_net *dev, struct vhost_virtqueue *vq, } #endif vq->numa_node = numa_node; - - vhost_user_iotlb_init(dev, vq); } static void @@ -631,6 +628,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx) dev->virtqueue[i] = vq; init_vring_queue(dev, vq, i); rte_spinlock_init(&vq->access_lock); + rte_rwlock_init(&vq->iotlb_lock); vq->avail_wrap_counter = 1; vq->used_wrap_counter = 1; vq->signalled_used_valid = false; @@ -795,6 +793,10 @@ vhost_setup_virtio_net(int vid, bool enable, bool compliant_ol_flags, bool stats dev->flags |= VIRTIO_DEV_SUPPORT_IOMMU; else dev->flags &= ~VIRTIO_DEV_SUPPORT_IOMMU; + + if (vhost_user_iotlb_init(dev) < 0) + VHOST_LOG_CONFIG("device", ERR, "failed to init IOTLB\n"); + } void diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 40863f7bfd..67cc4a2fdb 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -302,13 +302,6 @@ struct vhost_virtqueue { struct log_cache_entry *log_cache; rte_rwlock_t iotlb_lock; - rte_rwlock_t iotlb_pending_lock; - struct vhost_iotlb_entry *iotlb_pool; - TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list; - TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list; - int iotlb_cache_nr; - rte_spinlock_t iotlb_free_lock; - 
SLIST_HEAD(, vhost_iotlb_entry) iotlb_free_list; /* Used to notify the guest (trigger interrupt) */ int callfd; @@ -483,6 +476,15 @@ struct virtio_net { int extbuf; int linearbuf; struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; + + rte_rwlock_t iotlb_pending_lock; + struct vhost_iotlb_entry *iotlb_pool; + TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list; + TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list; + int iotlb_cache_nr; + rte_spinlock_t iotlb_free_lock; + SLIST_HEAD(, vhost_iotlb_entry) iotlb_free_list; + struct inflight_mem_info *inflight_info; #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) char ifname[IF_NAME_SZ]; diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index d60e39b6bc..81ebef0137 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -7,7 +7,7 @@ * The vhost-user protocol connection is an external interface, so it must be * robust against invalid inputs. * - * This is important because the vhost-user frontend is only one step removed +* This is important because the vhost-user frontend is only one step removed * from the guest. Malicious guests that have escaped will then launch further * attacks from the vhost-user frontend. * @@ -237,6 +237,8 @@ vhost_backend_cleanup(struct virtio_net *dev) } dev->postcopy_listening = 0; + + vhost_user_iotlb_destroy(dev); } static void @@ -539,7 +541,6 @@ numa_realloc(struct virtio_net **pdev, struct vhost_virtqueue **pvq) if (vq != dev->virtqueue[vq->index]) { VHOST_LOG_CONFIG(dev->ifname, INFO, "reallocated virtqueue on node %d\n", node); dev->virtqueue[vq->index] = vq; - vhost_user_iotlb_init(dev, vq); } if (vq_is_packed(dev)) { @@ -664,6 +665,8 @@ numa_realloc(struct virtio_net **pdev, struct vhost_virtqueue **pvq) return; } dev->guest_pages = gp; + + vhost_user_iotlb_init(dev); } #else static void @@ -1360,8 +1363,7 @@ vhost_user_set_mem_table(struct virtio_net **pdev, /* Flush IOTLB cache as previous HVAs are now invalid */ if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) - for (i = 0; i < dev->nr_vring; i++) - vhost_user_iotlb_flush_all(dev, dev->virtqueue[i]); + vhost_user_iotlb_flush_all(dev); free_mem_region(dev); rte_free(dev->mem); @@ -2194,7 +2196,7 @@ vhost_user_get_vring_base(struct virtio_net **pdev, ctx->msg.size = sizeof(ctx->msg.payload.state); ctx->fd_num = 0; - vhost_user_iotlb_flush_all(dev, vq); + vhost_user_iotlb_flush_all(dev); vring_invalidate(dev, vq); @@ -2639,15 +2641,14 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, if (!vva) return RTE_VHOST_MSG_RESULT_ERR; + vhost_user_iotlb_cache_insert(dev, imsg->iova, vva, len, imsg->perm); + for (i = 0; i < dev->nr_vring; i++) { struct vhost_virtqueue *vq = dev->virtqueue[i]; if (!vq) continue; - vhost_user_iotlb_cache_insert(dev, vq, imsg->iova, vva, - len, imsg->perm); - if (is_vring_iotlb(dev, vq, imsg)) { rte_spinlock_lock(&vq->access_lock); translate_ring_addresses(&dev, &vq); @@ -2657,15 +2658,14 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, } break; case VHOST_IOTLB_INVALIDATE: + vhost_user_iotlb_cache_remove(dev, imsg->iova, imsg->size); + for (i = 0; i < dev->nr_vring; i++) { struct vhost_virtqueue *vq = dev->virtqueue[i]; if (!vq) continue; - vhost_user_iotlb_cache_remove(dev, vq, imsg->iova, - imsg->size); - if (is_vring_iotlb(dev, vq, imsg)) { rte_spinlock_lock(&vq->access_lock); vring_invalidate(dev, vq); @@ -2674,8 +2674,7 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, } break; default: - VHOST_LOG_CONFIG(dev->ifname, ERR, - "invalid IOTLB message type (%d)\n", + 
VHOST_LOG_CONFIG(dev->ifname, ERR, "invalid IOTLB message type (%d)\n", imsg->type); return RTE_VHOST_MSG_RESULT_ERR; } From patchwork Fri Mar 31 15:42:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125684 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 69AA942887; Fri, 31 Mar 2023 17:44:14 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1F74F42DAE; Fri, 31 Mar 2023 17:43:32 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 18BB142D8A for ; Fri, 31 Mar 2023 17:43:31 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277410; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=r2uqriny4pAN9hCZLk/rX5SCmJuwp02o7GyBccibjNI=; b=PPbOFa+0Ijs5px2FQV6a5QX6P+3wVYUjOBJzgVLwRXaWyryusrbBvvc9G2kbn7kGK3taS7 3/S4BGxG39RD4LfeP1rLJOSbxI+ud4fc9P/tShV/67TN8wpi5pJo/wG4EEnBv1rqlkMFso bnMoKwDmVruZe8ncZCw+O6GiLwMLSic= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-509-ItZqSpdWPe2S7LhjJAGlTw-1; Fri, 31 Mar 2023 11:43:27 -0400 X-MC-Unique: ItZqSpdWPe2S7LhjJAGlTw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 194FC8028B3; Fri, 31 Mar 2023 15:43:27 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9AB262027041; Fri, 31 Mar 2023 15:43:24 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 08/27] vhost: add offset field to IOTLB entries Date: Fri, 31 Mar 2023 17:42:40 +0200 Message-Id: <20230331154259.1447831-9-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch is a preliminary work to prepare for VDUSE support, for which we need to keep track of the mmaped base address and offset in order to be able to unmap it later when IOTLB entry is invalidated. 
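To make the intent concrete, here is a minimal standalone sketch (not part of this patch; all sketch_ names are hypothetical): the IOVA-to-VVA translation works on the mmap() base plus the offset, while the cleanup on invalidation has to munmap() from the base address. The size passed to munmap() assumes the mapping covers uoffset + size bytes starting at uaddr.

    #include <stdint.h>
    #include <sys/mman.h>

    struct sketch_iotlb_entry {
            uint64_t iova;    /* guest I/O virtual address */
            uint64_t uaddr;   /* mmap() base address, required for munmap() */
            uint64_t uoffset; /* offset of the guest buffer within the mapping */
            uint64_t size;
            uint8_t  perm;
    };

    /* Address translation uses base + offset. */
    static inline uint64_t
    sketch_iova_to_vva(const struct sketch_iotlb_entry *e, uint64_t iova)
    {
            return e->uaddr + e->uoffset + (iova - e->iova);
    }

    /* Invalidation unmaps from the base address, which is why uaddr and
     * uoffset must be kept separately in the entry. */
    static inline void
    sketch_unmap_on_invalidate(const struct sketch_iotlb_entry *e)
    {
            munmap((void *)(uintptr_t)e->uaddr, e->uoffset + e->size);
    }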
Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/iotlb.c | 30 ++++++++++++++++++------------ lib/vhost/iotlb.h | 2 +- lib/vhost/vhost_user.c | 2 +- 3 files changed, 20 insertions(+), 14 deletions(-) diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c index a91115cf1c..51f118bc48 100644 --- a/lib/vhost/iotlb.c +++ b/lib/vhost/iotlb.c @@ -17,6 +17,7 @@ struct vhost_iotlb_entry { uint64_t iova; uint64_t uaddr; + uint64_t uoffset; uint64_t size; uint8_t perm; }; @@ -27,15 +28,18 @@ static bool vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entry *b, uint64_t align) { - uint64_t a_end, b_start; + uint64_t a_start, a_end, b_start; if (a == NULL || b == NULL) return false; + a_start = a->uaddr + a->uoffset; + b_start = b->uaddr + b->uoffset; + /* Assumes entry a lower than entry b */ - RTE_ASSERT(a->uaddr < b->uaddr); - a_end = RTE_ALIGN_CEIL(a->uaddr + a->size, align); - b_start = RTE_ALIGN_FLOOR(b->uaddr, align); + RTE_ASSERT(a_start < b_start); + a_end = RTE_ALIGN_CEIL(a_start + a->size, align); + b_start = RTE_ALIGN_FLOOR(b_start, align); return a_end > b_start; } @@ -43,11 +47,12 @@ vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entr static void vhost_user_iotlb_set_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node) { - uint64_t align; + uint64_t align, start; - align = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr); + start = node->uaddr + node->uoffset; + align = hua_to_alignment(dev->mem, (void *)(uintptr_t)start); - mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, false, align); + mem_set_dump((void *)(uintptr_t)start, node->size, false, align); } static void @@ -56,10 +61,10 @@ vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct vhost_iotlb_entry *no { uint64_t align, start, end; - start = node->uaddr; - end = node->uaddr + node->size; + start = node->uaddr + node->uoffset; + end = start + node->size; - align = hua_to_alignment(dev->mem, (void *)(uintptr_t)node->uaddr); + align = hua_to_alignment(dev->mem, (void *)(uintptr_t)start); /* Skip first page if shared with previous entry. 
*/ if (vhost_user_iotlb_share_page(prev, node, align)) @@ -234,7 +239,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net *dev) void vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t uaddr, - uint64_t size, uint8_t perm) + uint64_t uoffset, uint64_t size, uint8_t perm) { struct vhost_iotlb_entry *node, *new_node; @@ -256,6 +261,7 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t ua new_node->iova = iova; new_node->uaddr = uaddr; + new_node->uoffset = uoffset; new_node->size = size; new_node->perm = perm; @@ -344,7 +350,7 @@ vhost_user_iotlb_cache_find(struct virtio_net *dev, uint64_t iova, uint64_t *siz offset = iova - node->iova; if (!vva) - vva = node->uaddr + offset; + vva = node->uaddr + node->uoffset + offset; mapped += node->size - offset; iova = node->iova + node->size; diff --git a/lib/vhost/iotlb.h b/lib/vhost/iotlb.h index 3490b9e6be..bee36c5903 100644 --- a/lib/vhost/iotlb.h +++ b/lib/vhost/iotlb.h @@ -58,7 +58,7 @@ vhost_user_iotlb_wr_unlock_all(struct virtio_net *dev) } void vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t uaddr, - uint64_t size, uint8_t perm); + uint64_t uoffset, uint64_t size, uint8_t perm); void vhost_user_iotlb_cache_remove(struct virtio_net *dev, uint64_t iova, uint64_t size); uint64_t vhost_user_iotlb_cache_find(struct virtio_net *dev, uint64_t iova, uint64_t *size, uint8_t perm); diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index 81ebef0137..93673d3902 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -2641,7 +2641,7 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, if (!vva) return RTE_VHOST_MSG_RESULT_ERR; - vhost_user_iotlb_cache_insert(dev, imsg->iova, vva, len, imsg->perm); + vhost_user_iotlb_cache_insert(dev, imsg->iova, vva, 0, len, imsg->perm); for (i = 0; i < dev->nr_vring; i++) { struct vhost_virtqueue *vq = dev->virtqueue[i]; From patchwork Fri Mar 31 15:42:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125685 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5E30C42887; Fri, 31 Mar 2023 17:44:22 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 804EC42D9E; Fri, 31 Mar 2023 17:43:36 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 9855942D8A for ; Fri, 31 Mar 2023 17:43:35 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277415; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hTZ+4i+hQeEyPGPJbiUse7ShJ5pAwX5L41uqoatABHE=; b=UPUebvBmAa18s8Yz8uJIvRZgRYZkolBXKDBRofuEpDlBoV03wQxcTiHkVF3hujIp2LyUNh 5tMMf+p0BZM87+X2XkHjRWI5I5IUsryaHqjyEhCTne1qIoEDnR7j0vHe4x1H+fA0X7s2xg slmZ8qtj7K2Ax6oG0lUSDqh+loxNRRA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-9-ADyD4_-7N_ilwD-Iwkr4KA-1; Fri, 31 Mar 2023 11:43:30 -0400 X-MC-Unique: ADyD4_-7N_ilwD-Iwkr4KA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id AFFB2101A531; Fri, 31 Mar 2023 15:43:29 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5F68B202701E; Fri, 31 Mar 2023 15:43:27 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 09/27] vhost: add page size info to IOTLB entry Date: Fri, 31 Mar 2023 17:42:41 +0200 Message-Id: <20230331154259.1447831-10-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org VDUSE will close the file descriptor after having mapped the shared memory, so it will not be possible to get the page size afterwards. This patch adds an new page_shift field to the IOTLB entry, so that the information will be passed at IOTLB cache insertion time. The information is stored as a bit shift value so that IOTLB entry keeps fitting in a single cacheline. 
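As an illustration of the encoding (standalone sketch, not part of the patch; assumes page sizes are powers of two such as 4 KB, 2 MB or 1 GB), the page size collapses from a uint64_t to a single byte:

    #include <assert.h>
    #include <stdint.h>

    /* Store a power-of-two page size as a shift so it fits in a uint8_t. */
    static inline uint8_t
    sketch_size_to_shift(uint64_t page_size)
    {
            return (uint8_t)__builtin_ctzll(page_size);
    }

    static inline uint64_t
    sketch_shift_to_size(uint8_t page_shift)
    {
            /* Equivalent to RTE_BIT64(page_shift). */
            return UINT64_C(1) << page_shift;
    }

    int
    main(void)
    {
            assert(sketch_size_to_shift(4096) == 12);              /* 4 KB page */
            assert(sketch_size_to_shift(2 * 1024 * 1024) == 21);   /* 2 MB page */
            assert(sketch_shift_to_size(30) == UINT64_C(1) << 30); /* 1 GB page */
            return 0;
    }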
Signed-off-by: Maxime Coquelin --- lib/vhost/iotlb.c | 46 ++++++++++++++++++++---------------------- lib/vhost/iotlb.h | 2 +- lib/vhost/vhost.h | 1 - lib/vhost/vhost_user.c | 8 +++++--- 4 files changed, 28 insertions(+), 29 deletions(-) diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c index 51f118bc48..188dfb8e38 100644 --- a/lib/vhost/iotlb.c +++ b/lib/vhost/iotlb.c @@ -19,14 +19,14 @@ struct vhost_iotlb_entry { uint64_t uaddr; uint64_t uoffset; uint64_t size; + uint8_t page_shift; uint8_t perm; }; #define IOTLB_CACHE_SIZE 2048 static bool -vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entry *b, - uint64_t align) +vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entry *b) { uint64_t a_start, a_end, b_start; @@ -38,44 +38,41 @@ vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entr /* Assumes entry a lower than entry b */ RTE_ASSERT(a_start < b_start); - a_end = RTE_ALIGN_CEIL(a_start + a->size, align); - b_start = RTE_ALIGN_FLOOR(b_start, align); + a_end = RTE_ALIGN_CEIL(a_start + a->size, RTE_BIT64(a->page_shift)); + b_start = RTE_ALIGN_FLOOR(b_start, RTE_BIT64(b->page_shift)); return a_end > b_start; } static void -vhost_user_iotlb_set_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node) +vhost_user_iotlb_set_dump(struct vhost_iotlb_entry *node) { - uint64_t align, start; + uint64_t start; start = node->uaddr + node->uoffset; - align = hua_to_alignment(dev->mem, (void *)(uintptr_t)start); - - mem_set_dump((void *)(uintptr_t)start, node->size, false, align); + mem_set_dump((void *)(uintptr_t)start, node->size, false, RTE_BIT64(node->page_shift)); } static void -vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct vhost_iotlb_entry *node, +vhost_user_iotlb_clear_dump(struct vhost_iotlb_entry *node, struct vhost_iotlb_entry *prev, struct vhost_iotlb_entry *next) { - uint64_t align, start, end; + uint64_t start, end; start = node->uaddr + node->uoffset; end = start + node->size; - align = hua_to_alignment(dev->mem, (void *)(uintptr_t)start); - /* Skip first page if shared with previous entry. */ - if (vhost_user_iotlb_share_page(prev, node, align)) - start = RTE_ALIGN_CEIL(start, align); + if (vhost_user_iotlb_share_page(prev, node)) + start = RTE_ALIGN_CEIL(start, RTE_BIT64(node->page_shift)); /* Skip last page if shared with next entry. 
*/ - if (vhost_user_iotlb_share_page(node, next, align)) - end = RTE_ALIGN_FLOOR(end, align); + if (vhost_user_iotlb_share_page(node, next)) + end = RTE_ALIGN_FLOOR(end, RTE_BIT64(node->page_shift)); if (end > start) - mem_set_dump((void *)(uintptr_t)start, end - start, false, align); + mem_set_dump((void *)(uintptr_t)start, end - start, false, + RTE_BIT64(node->page_shift)); } static struct vhost_iotlb_entry * @@ -198,7 +195,7 @@ vhost_user_iotlb_cache_remove_all(struct virtio_net *dev) vhost_user_iotlb_wr_lock_all(dev); RTE_TAILQ_FOREACH_SAFE(node, &dev->iotlb_list, next, temp_node) { - vhost_user_iotlb_set_dump(dev, node); + vhost_user_iotlb_set_dump(node); TAILQ_REMOVE(&dev->iotlb_list, node, next); vhost_user_iotlb_pool_put(dev, node); @@ -223,7 +220,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net *dev) if (!entry_idx) { struct vhost_iotlb_entry *next_node = RTE_TAILQ_NEXT(node, next); - vhost_user_iotlb_clear_dump(dev, node, prev_node, next_node); + vhost_user_iotlb_clear_dump(node, prev_node, next_node); TAILQ_REMOVE(&dev->iotlb_list, node, next); vhost_user_iotlb_pool_put(dev, node); @@ -239,7 +236,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net *dev) void vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t uaddr, - uint64_t uoffset, uint64_t size, uint8_t perm) + uint64_t uoffset, uint64_t size, uint64_t page_size, uint8_t perm) { struct vhost_iotlb_entry *node, *new_node; @@ -263,6 +260,7 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t ua new_node->uaddr = uaddr; new_node->uoffset = uoffset; new_node->size = size; + new_node->page_shift = __builtin_ctz(page_size); new_node->perm = perm; vhost_user_iotlb_wr_lock_all(dev); @@ -276,7 +274,7 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t ua vhost_user_iotlb_pool_put(dev, new_node); goto unlock; } else if (node->iova > new_node->iova) { - vhost_user_iotlb_set_dump(dev, new_node); + vhost_user_iotlb_set_dump(new_node); TAILQ_INSERT_BEFORE(node, new_node, next); dev->iotlb_cache_nr++; @@ -284,7 +282,7 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t ua } } - vhost_user_iotlb_set_dump(dev, new_node); + vhost_user_iotlb_set_dump(new_node); TAILQ_INSERT_TAIL(&dev->iotlb_list, new_node, next); dev->iotlb_cache_nr++; @@ -313,7 +311,7 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, uint64_t iova, uint64_t si if (iova < node->iova + node->size) { struct vhost_iotlb_entry *next_node = RTE_TAILQ_NEXT(node, next); - vhost_user_iotlb_clear_dump(dev, node, prev_node, next_node); + vhost_user_iotlb_clear_dump(node, prev_node, next_node); TAILQ_REMOVE(&dev->iotlb_list, node, next); vhost_user_iotlb_pool_put(dev, node); diff --git a/lib/vhost/iotlb.h b/lib/vhost/iotlb.h index bee36c5903..81ca04df21 100644 --- a/lib/vhost/iotlb.h +++ b/lib/vhost/iotlb.h @@ -58,7 +58,7 @@ vhost_user_iotlb_wr_unlock_all(struct virtio_net *dev) } void vhost_user_iotlb_cache_insert(struct virtio_net *dev, uint64_t iova, uint64_t uaddr, - uint64_t uoffset, uint64_t size, uint8_t perm); + uint64_t uoffset, uint64_t size, uint64_t page_size, uint8_t perm); void vhost_user_iotlb_cache_remove(struct virtio_net *dev, uint64_t iova, uint64_t size); uint64_t vhost_user_iotlb_cache_find(struct virtio_net *dev, uint64_t iova, uint64_t *size, uint8_t perm); diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 67cc4a2fdb..4ace5ab081 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -1016,6 +1016,5 @@ 
mbuf_is_consumed(struct rte_mbuf *m) return true; } -uint64_t hua_to_alignment(struct rte_vhost_memory *mem, void *ptr); void mem_set_dump(void *ptr, size_t size, bool enable, uint64_t alignment); #endif /* _VHOST_NET_CDEV_H_ */ diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index 93673d3902..a989f2c46d 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -743,7 +743,7 @@ log_addr_to_gpa(struct virtio_net *dev, struct vhost_virtqueue *vq) return log_gpa; } -uint64_t +static uint64_t hua_to_alignment(struct rte_vhost_memory *mem, void *ptr) { struct rte_vhost_mem_region *r; @@ -2632,7 +2632,7 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct virtio_net *dev = *pdev; struct vhost_iotlb_msg *imsg = &ctx->msg.payload.iotlb; uint16_t i; - uint64_t vva, len; + uint64_t vva, len, pg_sz; switch (imsg->type) { case VHOST_IOTLB_UPDATE: @@ -2641,7 +2641,9 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, if (!vva) return RTE_VHOST_MSG_RESULT_ERR; - vhost_user_iotlb_cache_insert(dev, imsg->iova, vva, 0, len, imsg->perm); + pg_sz = hua_to_alignment(dev->mem, (void *)(uintptr_t)vva); + + vhost_user_iotlb_cache_insert(dev, imsg->iova, vva, 0, len, pg_sz, imsg->perm); for (i = 0; i < dev->nr_vring; i++) { struct vhost_virtqueue *vq = dev->virtqueue[i]; From patchwork Fri Mar 31 15:42:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125686 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7032842887; Fri, 31 Mar 2023 17:44:29 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id AFD5D42D93; Fri, 31 Mar 2023 17:43:38 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 37D0742D8A for ; Fri, 31 Mar 2023 17:43:36 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277415; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=S4GTKEPafA3RH5QazuKFQxx/XU1wLhXXdTAUsoo6+xM=; b=eOqQgA2Gsqsg+cjTcbxgKbc5HESBe/0t0W3p2ccRbdUmAmz2Fx+TPOJ7wSO1WguTlorzCg IhH53bwYdzPdF1NbMun7kZ7GClw0SmJd/vTFiOE6QVHUPagbSxpCgotxQJ3OBqTM5o24TY rc6ajNsIprL1PFQDWc7YA4gG7uPRsmo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-253-mnzhWGLONaCmnmdzUYnwvA-1; Fri, 31 Mar 2023 11:43:32 -0400 X-MC-Unique: mnzhWGLONaCmnmdzUYnwvA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3CAA23C10693; Fri, 31 Mar 2023 15:43:32 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0DB4B2027041; Fri, 31 Mar 2023 15:43:29 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, 
fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 10/27] vhost: retry translating IOVA after IOTLB miss Date: Fri, 31 Mar 2023 17:42:42 +0200 Message-Id: <20230331154259.1447831-11-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Vhost-user backend IOTLB misses and updates are asynchronous, so IOVA address translation function just fails after having sent an IOTLB miss update if needed entry was not in the IOTLB cache. This is not the case for VDUSE, for which the needed IOTLB update is returned directly when sending an IOTLB miss. This patch retry again finding the needed entry in the IOTLB cache after having sent an IOTLB miss. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vhost.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index d35075b96c..4f16307e4d 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -96,6 +96,12 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, vhost_user_iotlb_rd_lock(vq); } + tmp_size = *size; + /* Retry in case of VDUSE, as it is synchronous */ + vva = vhost_user_iotlb_cache_find(dev, iova, &tmp_size, perm); + if (tmp_size == *size) + return vva; + return 0; } From patchwork Fri Mar 31 15:42:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125687 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 972AA42887; Fri, 31 Mar 2023 17:44:36 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E963F42DAA; Fri, 31 Mar 2023 17:43:41 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id E4A5F42D9B for ; Fri, 31 Mar 2023 17:43:38 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277418; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K1HiwmdWGZNQqKxh0U2Bsu9jfh4mu0cl1yEOM0Jp3Sc=; b=iZdKqk4Rj+P8TmLQvk4ZBeW/WylQhv6fvuuFLp8/dRwsGsaWzZ7Y42kRRuqjzHASKoRNt7 eE9cFCu2T6cyTFomh/iXxfO0dJQgRJAEo1DTUHxhOnR6z8QdOlRpWXqMlCo7ompyqNsgm2 OjleaObC8X/PFSAsZdG/jGBRnZ5AXVU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-48-vgbSIVJDNG-P3a1ye3wr4A-1; Fri, 31 Mar 2023 11:43:35 -0400 X-MC-Unique: vgbSIVJDNG-P3a1ye3wr4A-1 Received: from smtp.corp.redhat.com 
(int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E626E280BF67; Fri, 31 Mar 2023 15:43:34 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9050D2028E8F; Fri, 31 Mar 2023 15:43:32 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 11/27] vhost: introduce backend ops Date: Fri, 31 Mar 2023 17:42:43 +0200 Message-Id: <20230331154259.1447831-12-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch introduces backend ops struct, that will enable calling backend specifics callbacks (Vhost-user, VDUSE), in shared code. This is an empty shell for now, it will be filled in later patches. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/socket.c | 2 +- lib/vhost/vhost.c | 8 +++++++- lib/vhost/vhost.h | 10 +++++++++- lib/vhost/vhost_user.c | 8 ++++++++ lib/vhost/vhost_user.h | 1 + 5 files changed, 26 insertions(+), 3 deletions(-) diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index 669c322e12..ba54263824 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -221,7 +221,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket) return; } - vid = vhost_new_device(); + vid = vhost_user_new_device(); if (vid == -1) { goto err; } diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index 4f16307e4d..41f212315e 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -676,11 +676,16 @@ reset_device(struct virtio_net *dev) * there is a new virtio device being attached). */ int -vhost_new_device(void) +vhost_new_device(struct vhost_backend_ops *ops) { struct virtio_net *dev; int i; + if (ops == NULL) { + VHOST_LOG_CONFIG("device", ERR, "missing backend ops.\n"); + return -1; + } + pthread_mutex_lock(&vhost_dev_lock); for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) { if (vhost_devices[i] == NULL) @@ -708,6 +713,7 @@ vhost_new_device(void) dev->backend_req_fd = -1; dev->postcopy_ufd = -1; rte_spinlock_init(&dev->backend_req_lock); + dev->backend_ops = ops; return i; } diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 4ace5ab081..cc5c707205 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -89,6 +89,12 @@ for (iter = val; iter < num; iter++) #endif +/** + * Structure that contains backend-specific ops. + */ +struct vhost_backend_ops { +}; + /** * Structure contains buffer address, length and descriptor index * from vring to do scatter RX. 
@@ -513,6 +519,8 @@ struct virtio_net { void *extern_data; /* pre and post vhost user message handlers for the device */ struct rte_vhost_user_extern_ops extern_ops; + + struct vhost_backend_ops *backend_ops; } __rte_cache_aligned; static inline void @@ -812,7 +820,7 @@ get_device(int vid) return dev; } -int vhost_new_device(void); +int vhost_new_device(struct vhost_backend_ops *ops); void cleanup_device(struct virtio_net *dev, int destroy); void reset_device(struct virtio_net *dev); void vhost_destroy_device(int); diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index a989f2c46d..2d5dec5bc1 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -3464,3 +3464,11 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t qid, bool enable) return ret; } + +static struct vhost_backend_ops vhost_user_backend_ops; + +int +vhost_user_new_device(void) +{ + return vhost_new_device(&vhost_user_backend_ops); +} diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h index a0987a58f9..61456049c8 100644 --- a/lib/vhost/vhost_user.h +++ b/lib/vhost/vhost_user.h @@ -185,5 +185,6 @@ int vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm); int read_fd_message(char *ifname, int sockfd, char *buf, int buflen, int *fds, int max_fds, int *fd_num); int send_fd_message(char *ifname, int sockfd, char *buf, int buflen, int *fds, int fd_num); +int vhost_user_new_device(void); #endif From patchwork Fri Mar 31 15:42:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125688 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3600B42887; Fri, 31 Mar 2023 17:44:44 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id F258142DB9; Fri, 31 Mar 2023 17:43:42 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id A55F042DA2 for ; Fri, 31 Mar 2023 17:43:41 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277421; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LAPRlCaLVxKzzAIdEEBfv34fCej9/BW0KuBIVnmwQaQ=; b=GLYUd0kjDD0i17CkujPpVpqTUrkYOpKC87udUpocbScSFE5d1xtw219wUDcijAwMONoaiT no85BwOi9uAT4ESngGBw/tCf8lWckPizf8CBTdyweI8cnd/k4Snq5Fx7R1byDw1X2spXJD 3AsL5rBqfzP+Rup6paxwxk2dS1CR+p4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-346-q0OAGq2-OQaMH9ykGE2Mgw-1; Fri, 31 Mar 2023 11:43:37 -0400 X-MC-Unique: q0OAGq2-OQaMH9ykGE2Mgw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7A922185A7A2; Fri, 31 Mar 2023 15:43:37 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3935A20290A5; Fri, 
31 Mar 2023 15:43:35 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 12/27] vhost: add IOTLB cache entry removal callback Date: Fri, 31 Mar 2023 17:42:44 +0200 Message-Id: <20230331154259.1447831-13-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org VDUSE will need to munmap() the IOTLB entry on removal from the cache, as it performs mmap() before insertion. This patch introduces a callback that VDUSE layer will implement to achieve this. Signed-off-by: Maxime Coquelin --- lib/vhost/iotlb.c | 12 ++++++++++++ lib/vhost/vhost.h | 4 ++++ 2 files changed, 16 insertions(+) diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c index 188dfb8e38..86b0be62b4 100644 --- a/lib/vhost/iotlb.c +++ b/lib/vhost/iotlb.c @@ -25,6 +25,15 @@ struct vhost_iotlb_entry { #define IOTLB_CACHE_SIZE 2048 +static void +vhost_user_iotlb_remove_notify(struct virtio_net *dev, struct vhost_iotlb_entry *entry) +{ + if (dev->backend_ops->iotlb_remove_notify == NULL) + return; + + dev->backend_ops->iotlb_remove_notify(entry->uaddr, entry->uoffset, entry->size); +} + static bool vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct vhost_iotlb_entry *b) { @@ -198,6 +207,7 @@ vhost_user_iotlb_cache_remove_all(struct virtio_net *dev) vhost_user_iotlb_set_dump(node); TAILQ_REMOVE(&dev->iotlb_list, node, next); + vhost_user_iotlb_remove_notify(dev, node); vhost_user_iotlb_pool_put(dev, node); } @@ -223,6 +233,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net *dev) vhost_user_iotlb_clear_dump(node, prev_node, next_node); TAILQ_REMOVE(&dev->iotlb_list, node, next); + vhost_user_iotlb_remove_notify(dev, node); vhost_user_iotlb_pool_put(dev, node); dev->iotlb_cache_nr--; break; @@ -314,6 +325,7 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, uint64_t iova, uint64_t si vhost_user_iotlb_clear_dump(node, prev_node, next_node); TAILQ_REMOVE(&dev->iotlb_list, node, next); + vhost_user_iotlb_remove_notify(dev, node); vhost_user_iotlb_pool_put(dev, node); dev->iotlb_cache_nr--; } else { diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index cc5c707205..2ad26f6951 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -89,10 +89,14 @@ for (iter = val; iter < num; iter++) #endif +struct virtio_net; +typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off, uint64_t size); + /** * Structure that contains backend-specific ops. 
*/ struct vhost_backend_ops { + vhost_iotlb_remove_notify iotlb_remove_notify; }; /** From patchwork Fri Mar 31 15:42:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125691 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5EF8B42887; Fri, 31 Mar 2023 17:45:03 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 02F7242DCC; Fri, 31 Mar 2023 17:43:53 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id E85F942D3D for ; Fri, 31 Mar 2023 17:43:51 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277431; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EnzJ2XPQzBwT4KAwoRaboJ+9eIYObJCLw4vEDG1mLb0=; b=J9XECcKRL6Vtf3IiscNEz9KCYVw9oNWy9RhusS+XjQKdWfRoNtMBlZfOTBlxXwsyX7yGuG o64ySSPuvoDtoB/F363PhI09u+PZrQte2Tx+JuRKj3hUmvOr6VbbKSxm5jOoxlMc97LkGz iySssOzIG6Awj7Rj2dBWyxRMgpqfDao= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-33-VrDlnqSDNBWINpij9ku6IA-1; Fri, 31 Mar 2023 11:43:40 -0400 X-MC-Unique: VrDlnqSDNBWINpij9ku6IA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2888A85A5A3; Fri, 31 Mar 2023 15:43:40 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id CE047202701E; Fri, 31 Mar 2023 15:43:37 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 13/27] vhost: add helper for IOTLB misses Date: Fri, 31 Mar 2023 17:42:45 +0200 Message-Id: <20230331154259.1447831-14-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds a helper for sending IOTLB misses as VDUSE will use an ioctl while Vhost-user use a dedicated Vhost-user backend request. 
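A standalone sketch of the dispatch pattern this series builds up (simplified stand-ins for the library types, not the actual DPDK structures; the printf only marks where a real backend would send its miss request): shared code calls through the ops struct and never knows whether the miss is serviced by a Vhost-user backend request or a VDUSE ioctl.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    struct sketch_dev;
    typedef int (*sketch_iotlb_miss_cb)(struct sketch_dev *dev, uint64_t iova, uint8_t perm);

    struct sketch_backend_ops {
            sketch_iotlb_miss_cb iotlb_miss;
    };

    struct sketch_dev {
            const struct sketch_backend_ops *ops;
    };

    /* Vhost-user flavour: would send an IOTLB miss request to the
     * front-end and return without waiting for the update. */
    static int
    sketch_vhost_user_miss(struct sketch_dev *dev, uint64_t iova, uint8_t perm)
    {
            (void)dev;
            printf("vhost-user miss: iova=0x%" PRIx64 " perm=%u\n", iova, (unsigned)perm);
            return 0;
    }

    /* Shared code only ever goes through the registered ops. */
    static int
    sketch_iotlb_miss(struct sketch_dev *dev, uint64_t iova, uint8_t perm)
    {
            return dev->ops->iotlb_miss(dev, iova, perm);
    }

    int
    main(void)
    {
            static const struct sketch_backend_ops vhost_user_ops = {
                    .iotlb_miss = sketch_vhost_user_miss,
            };
            struct sketch_dev dev = { .ops = &vhost_user_ops };

            return sketch_iotlb_miss(&dev, 0x1000, 3);
    }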
Signed-off-by: Maxime Coquelin --- lib/vhost/vhost.c | 13 ++++++++++++- lib/vhost/vhost.h | 3 +++ lib/vhost/vhost_user.c | 6 ++++-- lib/vhost/vhost_user.h | 1 - 4 files changed, 19 insertions(+), 4 deletions(-) diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index 41f212315e..790eb06b28 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -52,6 +52,12 @@ static const struct vhost_vq_stats_name_off vhost_vq_stat_strings[] = { #define VHOST_NB_VQ_STATS RTE_DIM(vhost_vq_stat_strings) +static int +vhost_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm) +{ + return dev->backend_ops->iotlb_miss(dev, iova, perm); +} + uint64_t __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, uint64_t iova, uint64_t *size, uint8_t perm) @@ -86,7 +92,7 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, vhost_user_iotlb_rd_unlock(vq); vhost_user_iotlb_pending_insert(dev, iova, perm); - if (vhost_user_iotlb_miss(dev, iova, perm)) { + if (vhost_iotlb_miss(dev, iova, perm)) { VHOST_LOG_DATA(dev->ifname, ERR, "IOTLB miss req failed for IOVA 0x%" PRIx64 "\n", iova); @@ -686,6 +692,11 @@ vhost_new_device(struct vhost_backend_ops *ops) return -1; } + if (ops->iotlb_miss == NULL) { + VHOST_LOG_CONFIG("device", ERR, "missing IOTLB miss backend op.\n"); + return -1; + } + pthread_mutex_lock(&vhost_dev_lock); for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) { if (vhost_devices[i] == NULL) diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 2ad26f6951..ee7640e901 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -92,11 +92,14 @@ struct virtio_net; typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off, uint64_t size); +typedef int (*vhost_iotlb_miss_cb)(struct virtio_net *dev, uint64_t iova, uint8_t perm); + /** * Structure that contains backend-specific ops. 
*/ struct vhost_backend_ops { vhost_iotlb_remove_notify iotlb_remove_notify; + vhost_iotlb_miss_cb iotlb_miss; }; /** diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index 2d5dec5bc1..6a9f32972a 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -3305,7 +3305,7 @@ vhost_user_msg_handler(int vid, int fd) return ret; } -int +static int vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm) { int ret; @@ -3465,7 +3465,9 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t qid, bool enable) return ret; } -static struct vhost_backend_ops vhost_user_backend_ops; +static struct vhost_backend_ops vhost_user_backend_ops = { + .iotlb_miss = vhost_user_iotlb_miss, +}; int vhost_user_new_device(void) diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h index 61456049c8..1ffeca92f3 100644 --- a/lib/vhost/vhost_user.h +++ b/lib/vhost/vhost_user.h @@ -179,7 +179,6 @@ struct __rte_packed vhu_msg_context { /* vhost_user.c */ int vhost_user_msg_handler(int vid, int fd); -int vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm); /* socket.c */ int read_fd_message(char *ifname, int sockfd, char *buf, int buflen, int *fds, int max_fds, From patchwork Fri Mar 31 15:42:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125689 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BFAC942887; Fri, 31 Mar 2023 17:44:52 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8AF9F42D3A; Fri, 31 Mar 2023 17:43:48 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id B2CC142D33 for ; Fri, 31 Mar 2023 17:43:47 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277427; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=aOBLjEobRVaJIQA0lIk4GLINRlU5Goj1AlU+vw9Oxwo=; b=UgRFr4FAiJKUVCmtwgCoqDx3N77miD3n+6K7vIFqTnb2ew9HF8olwzyGnV0o+EAnVHeH8G vZs/RYP9cCVolC/0+R1bZaigIrCUXk0moBYHgG8eME4byQW8lblIgM6AMDjyEo97KuNfa6 VOjhRIdB5FCD+e8XzzFKu7QetbrFvOA= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-467-4kIAusRWOkGtXVKvqpEmXA-1; Fri, 31 Mar 2023 11:43:44 -0400 X-MC-Unique: 4kIAusRWOkGtXVKvqpEmXA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B370D3C10693; Fri, 31 Mar 2023 15:43:43 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7142D2027040; Fri, 31 Mar 2023 15:43:40 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, 
xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 14/27] vhost: add helper for interrupt injection Date: Fri, 31 Mar 2023 17:42:46 +0200 Message-Id: <20230331154259.1447831-15-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Vhost-user uses eventfd to inject IRQs, but VDUSE uses an ioctl. This patch prepares vhost_vring_call_split() and vhost_vring_call_packed() to support VDUSE by introducing a new helper. It also adds a new counter to for guest notification failures, which could happen in case of uninitialized call file descriptor for example. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vhost.c | 6 +++++ lib/vhost/vhost.h | 54 +++++++++++++++++++++++------------------- lib/vhost/vhost_user.c | 10 ++++++++ 3 files changed, 46 insertions(+), 24 deletions(-) diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index 790eb06b28..c07028f2b3 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -44,6 +44,7 @@ static const struct vhost_vq_stats_name_off vhost_vq_stat_strings[] = { {"size_1024_1518_packets", offsetof(struct vhost_virtqueue, stats.size_bins[6])}, {"size_1519_max_packets", offsetof(struct vhost_virtqueue, stats.size_bins[7])}, {"guest_notifications", offsetof(struct vhost_virtqueue, stats.guest_notifications)}, + {"guest_notifications_error", offsetof(struct vhost_virtqueue, stats.guest_notifications_error)}, {"iotlb_hits", offsetof(struct vhost_virtqueue, stats.iotlb_hits)}, {"iotlb_misses", offsetof(struct vhost_virtqueue, stats.iotlb_misses)}, {"inflight_submitted", offsetof(struct vhost_virtqueue, stats.inflight_submitted)}, @@ -697,6 +698,11 @@ vhost_new_device(struct vhost_backend_ops *ops) return -1; } + if (ops->inject_irq == NULL) { + VHOST_LOG_CONFIG("device", ERR, "missing IRQ injection backend op.\n"); + return -1; + } + pthread_mutex_lock(&vhost_dev_lock); for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) { if (vhost_devices[i] == NULL) diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index ee7640e901..8f0875b4e2 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -90,16 +90,20 @@ #endif struct virtio_net; +struct vhost_virtqueue; + typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off, uint64_t size); typedef int (*vhost_iotlb_miss_cb)(struct virtio_net *dev, uint64_t iova, uint8_t perm); +typedef int (*vhost_vring_inject_irq_cb)(struct virtio_net *dev, struct vhost_virtqueue *vq); /** * Structure that contains backend-specific ops. 
*/ struct vhost_backend_ops { vhost_iotlb_remove_notify iotlb_remove_notify; vhost_iotlb_miss_cb iotlb_miss; + vhost_vring_inject_irq_cb inject_irq; }; /** @@ -149,6 +153,7 @@ struct virtqueue_stats { /* Size bins in array as RFC 2819, undersized [0], 64 [1], etc */ uint64_t size_bins[8]; uint64_t guest_notifications; + uint64_t guest_notifications_error; uint64_t iotlb_hits; uint64_t iotlb_misses; uint64_t inflight_submitted; @@ -900,6 +905,24 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old) return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old); } +static __rte_always_inline void +vhost_vring_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq) +{ + int ret; + + ret = dev->backend_ops->inject_irq(dev, vq); + if (ret) { + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + vq->stats.guest_notifications_error++; + return; + } + + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + vq->stats.guest_notifications++; + if (dev->notify_ops->guest_notified) + dev->notify_ops->guest_notified(dev->vid); +} + static __rte_always_inline void vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) { @@ -919,25 +942,13 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) "%s: used_event_idx=%d, old=%d, new=%d\n", __func__, vhost_used_event(vq), old, new); - if ((vhost_need_event(vhost_used_event(vq), new, old) || - unlikely(!signalled_used_valid)) && - vq->callfd >= 0) { - eventfd_write(vq->callfd, (eventfd_t) 1); - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - vq->stats.guest_notifications++; - if (dev->notify_ops->guest_notified) - dev->notify_ops->guest_notified(dev->vid); - } + if (vhost_need_event(vhost_used_event(vq), new, old) || + unlikely(!signalled_used_valid)) + vhost_vring_inject_irq(dev, vq); } else { /* Kick the guest if necessary. 
*/ - if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT) - && (vq->callfd >= 0)) { - eventfd_write(vq->callfd, (eventfd_t)1); - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - vq->stats.guest_notifications++; - if (dev->notify_ops->guest_notified) - dev->notify_ops->guest_notified(dev->vid); - } + if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) + vhost_vring_inject_irq(dev, vq); } } @@ -988,13 +999,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq) if (vhost_need_event(off, new, old)) kick = true; kick: - if (kick && vq->callfd >= 0) { - eventfd_write(vq->callfd, (eventfd_t)1); - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - vq->stats.guest_notifications++; - if (dev->notify_ops->guest_notified) - dev->notify_ops->guest_notified(dev->vid); - } + if (kick) + vhost_vring_inject_irq(dev, vq); } static __rte_always_inline void diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index 6a9f32972a..2e4a9fdea4 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -3465,8 +3465,18 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t qid, bool enable) return ret; } +static int +vhost_user_inject_irq(struct virtio_net *dev __rte_unused, struct vhost_virtqueue *vq) +{ + if (vq->callfd < 0) + return -1; + + return eventfd_write(vq->callfd, (eventfd_t)1); +} + static struct vhost_backend_ops vhost_user_backend_ops = { .iotlb_miss = vhost_user_iotlb_miss, + .inject_irq = vhost_user_inject_irq, }; int From patchwork Fri Mar 31 15:42:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125690 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A6D3942887; Fri, 31 Mar 2023 17:44:57 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id F2DA742D49; Fri, 31 Mar 2023 17:43:51 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id EAB4842D3E for ; Fri, 31 Mar 2023 17:43:50 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277430; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Dlk/nURVg0Md1OSIcxv3lo8Of9GNwnZALhXR1S1uzW0=; b=PX+uzRly2y8+qmo0mhAQXOYqD/D6N8WCMaLVCP5b4CQ51wgbzOoYabTtuo4QAVcHr3ZLxP gy6LTuR9Y61Mm8EiRDhqYQvqjegAtkHryC2kty2YL6TfVd3S04WwFfjh0KTST+XyBK70+5 FTRAl19ZFDGbnHVCmEsHSH4PRAQKOIU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-398-MFhVoKgSMrajHtQQ3p2Ikg-1; Fri, 31 Mar 2023 11:43:47 -0400 X-MC-Unique: MFhVoKgSMrajHtQQ3p2Ikg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4335B101A531; Fri, 31 Mar 2023 15:43:46 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com 
(Postfix) with ESMTP id 12BC7202701E; Fri, 31 Mar 2023 15:43:43 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 15/27] vhost: add API to set max queue pairs Date: Fri, 31 Mar 2023 17:42:47 +0200 Message-Id: <20230331154259.1447831-16-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch introduces a new rte_vhost_driver_set_max_queues API as preliminary work for multiqueue support with VDUSE. Indeed, with VDUSE we need to pre-allocate the vrings at device creation time, so we need such API not to allocate the 128 queue pairs supported by the Vhost library. Calling the API is optional, 128 queue pairs remaining the default. Signed-off-by: Maxime Coquelin --- doc/guides/prog_guide/vhost_lib.rst | 4 ++++ lib/vhost/rte_vhost.h | 17 ++++++++++++++ lib/vhost/socket.c | 36 +++++++++++++++++++++++++++-- lib/vhost/version.map | 3 +++ 4 files changed, 58 insertions(+), 2 deletions(-) diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst index e8bb8c9b7b..cd4b109139 100644 --- a/doc/guides/prog_guide/vhost_lib.rst +++ b/doc/guides/prog_guide/vhost_lib.rst @@ -334,6 +334,10 @@ The following is an overview of some key Vhost API functions: Clean DMA vChannel finished to use. After this function is called, the specified DMA vChannel should no longer be used by the Vhost library. +* ``rte_vhost_driver_set_max_queue_num(path, max_queue_pairs)`` + + Set the maximum number of queue pairs supported by the device. + Vhost-user Implementations -------------------------- diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h index 58a5d4be92..44cbfcb469 100644 --- a/lib/vhost/rte_vhost.h +++ b/lib/vhost/rte_vhost.h @@ -588,6 +588,23 @@ rte_vhost_driver_get_protocol_features(const char *path, int rte_vhost_driver_get_queue_num(const char *path, uint32_t *queue_num); +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice. + * + * Set the maximum number of queue pairs supported by the device. 
+ * + * @param path + * The vhost-user socket file path + * @param max_queue_pairs + * The maximum number of queue pairs + * @return + * 0 on success, -1 on failure + */ +__rte_experimental +int +rte_vhost_driver_set_max_queue_num(const char *path, uint32_t max_queue_pairs); + /** * Get the feature bits after negotiation * diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index ba54263824..e95c3ffeac 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -56,6 +56,8 @@ struct vhost_user_socket { uint64_t protocol_features; + uint32_t max_queue_pairs; + struct rte_vdpa_device *vdpa_dev; struct rte_vhost_device_ops const *notify_ops; @@ -821,7 +823,7 @@ rte_vhost_driver_get_queue_num(const char *path, uint32_t *queue_num) vdpa_dev = vsocket->vdpa_dev; if (!vdpa_dev) { - *queue_num = VHOST_MAX_QUEUE_PAIRS; + *queue_num = vsocket->max_queue_pairs; goto unlock_exit; } @@ -831,7 +833,36 @@ rte_vhost_driver_get_queue_num(const char *path, uint32_t *queue_num) goto unlock_exit; } - *queue_num = RTE_MIN((uint32_t)VHOST_MAX_QUEUE_PAIRS, vdpa_queue_num); + *queue_num = RTE_MIN(vsocket->max_queue_pairs, vdpa_queue_num); + +unlock_exit: + pthread_mutex_unlock(&vhost_user.mutex); + return ret; +} + +int +rte_vhost_driver_set_max_queue_num(const char *path, uint32_t max_queue_pairs) +{ + struct vhost_user_socket *vsocket; + int ret = 0; + + VHOST_LOG_CONFIG(path, INFO, "Setting max queue pairs to %u\n", max_queue_pairs); + + if (max_queue_pairs > VHOST_MAX_QUEUE_PAIRS) { + VHOST_LOG_CONFIG(path, ERR, "Library only supports up to %u queue pairs\n", + VHOST_MAX_QUEUE_PAIRS); + return -1; + } + + pthread_mutex_lock(&vhost_user.mutex); + vsocket = find_vhost_user_socket(path); + if (!vsocket) { + VHOST_LOG_CONFIG(path, ERR, "socket file is not registered yet.\n"); + ret = -1; + goto unlock_exit; + } + + vsocket->max_queue_pairs = max_queue_pairs; unlock_exit: pthread_mutex_unlock(&vhost_user.mutex); @@ -890,6 +921,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags) goto out_free; } vsocket->vdpa_dev = NULL; + vsocket->max_queue_pairs = VHOST_MAX_QUEUE_PAIRS; vsocket->extbuf = flags & RTE_VHOST_USER_EXTBUF_SUPPORT; vsocket->linearbuf = flags & RTE_VHOST_USER_LINEARBUF_SUPPORT; vsocket->async_copy = flags & RTE_VHOST_USER_ASYNC_COPY; diff --git a/lib/vhost/version.map b/lib/vhost/version.map index d322a4a888..dffb126aa8 100644 --- a/lib/vhost/version.map +++ b/lib/vhost/version.map @@ -98,6 +98,9 @@ EXPERIMENTAL { # added in 22.11 rte_vhost_async_dma_unconfigure; rte_vhost_vring_call_nonblock; + + # added in 23.07 + rte_vhost_driver_set_max_queue_num; }; INTERNAL { From patchwork Fri Mar 31 15:42:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125692 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4B95842887; Fri, 31 Mar 2023 17:45:09 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1242542D53; Fri, 31 Mar 2023 17:43:56 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id B327542D3D for ; Fri, 31 Mar 2023 17:43:54 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277434; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HduW2NuZBk3ssn+Fqx7jajLaDjhDHP9CK95oM/BFQPY=; b=UEfizj7xt+rzzIsA2l3tIoB5hyG1OKT1OjUvVmh35T4v5ltFawMc1pKkDtmrmdn/RD4/rS 7wiB0t4ixXCJZbWzCFqhKASo8sscqk7hGtEqOrOFhBztjmalxMqKdre8w/vccrcvfOWUkQ B+zsTPSFoUlJ7bnV6Gn7wPh28m1PzFY= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-517-1WXajjy2OrCy7KgcXO8ZQQ-1; Fri, 31 Mar 2023 11:43:49 -0400 X-MC-Unique: 1WXajjy2OrCy7KgcXO8ZQQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CEA363C025B9; Fri, 31 Mar 2023 15:43:48 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8AE212027040; Fri, 31 Mar 2023 15:43:46 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 16/27] net/vhost: use API to set max queue pairs Date: Fri, 31 Mar 2023 17:42:48 +0200 Message-Id: <20230331154259.1447831-17-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to support multiqueue with VDUSE, we need to be able to limit the maximum number of queue pairs, to avoid unnecessary memory consumption since the maximum number of queue pairs need to be allocated at device creation time, as opposed to Vhost-user which allocate only when the frontend initialize them. 
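As an illustration of the new API's intended usage outside of the PMD (not part of this patch; the socket path and the value of 4 queue pairs below are arbitrary examples, and callback registration is omitted for brevity), a backend built directly on librte_vhost would call it between socket registration and driver start:

    #include <rte_vhost.h>

    static int
    setup_vhost_backend(void)
    {
            const char *path = "/tmp/vhost-user-0"; /* example socket path */

            if (rte_vhost_driver_register(path, 0) < 0)
                    return -1;

            /* Cap vring pre-allocation; must be called after registration
             * and before the driver is started. */
            if (rte_vhost_driver_set_max_queue_num(path, 4) < 0)
                    return -1;

            return rte_vhost_driver_start(path);
    }

The net/vhost PMD change below follows the same pattern, passing the queue count it already tracks internally.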
Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- drivers/net/vhost/rte_eth_vhost.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index 62ef955ebc..8d37ec9775 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -1013,6 +1013,9 @@ vhost_driver_setup(struct rte_eth_dev *eth_dev) goto drv_unreg; } + if (rte_vhost_driver_set_max_queue_num(internal->iface_name, internal->max_queues)) + goto drv_unreg; + if (rte_vhost_driver_callback_register(internal->iface_name, &vhost_ops) < 0) { VHOST_LOG(ERR, "Can't register callbacks\n"); From patchwork Fri Mar 31 15:42:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125693 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id AD1BF42887; Fri, 31 Mar 2023 17:45:14 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1FBBD42F8E; Fri, 31 Mar 2023 17:43:59 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id DFCDA42D38 for ; Fri, 31 Mar 2023 17:43:55 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277435; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e5/Ox2X3o7Rm2M+qv79YPDUy/u994wUkBeLeh+H1bII=; b=XvFUXOGcIJkJm7dlb/q3YdlScDah4yxtvqOAOdsbv82ZcLyWDSkC/d7TbH6BbiFrMxnQJL mwLJt94P2IzjWCTCMILDPD5yk4RPAB25dHAmMgdWeO1R9fa0DKLSeEzkxT+1TBE9FmvVa0 16dMvox+G9a9Z4NqVcGP96MUIBpMOJc= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-528-YS753o_zNAK7_jR8eCOLGA-1; Fri, 31 Mar 2023 11:43:52 -0400 X-MC-Unique: YS753o_zNAK7_jR8eCOLGA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6F7DF1C0A592; Fri, 31 Mar 2023 15:43:51 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 33E802027040; Fri, 31 Mar 2023 15:43:49 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 17/27] vhost: add control virtqueue support Date: Fri, 31 Mar 2023 17:42:49 +0200 Message-Id: <20230331154259.1447831-18-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 
Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org In order to support multi-queue with VDUSE, having control queue support in required. This patch adds control queue implementation, it will be used later when adding VDUSE support. Only split ring layout is supported for now, packed ring support will be added later. Signed-off-by: Maxime Coquelin --- lib/vhost/meson.build | 1 + lib/vhost/vhost.h | 2 + lib/vhost/virtio_net_ctrl.c | 282 ++++++++++++++++++++++++++++++++++++ lib/vhost/virtio_net_ctrl.h | 10 ++ 4 files changed, 295 insertions(+) create mode 100644 lib/vhost/virtio_net_ctrl.c create mode 100644 lib/vhost/virtio_net_ctrl.h diff --git a/lib/vhost/meson.build b/lib/vhost/meson.build index 197a51d936..cdcd403df3 100644 --- a/lib/vhost/meson.build +++ b/lib/vhost/meson.build @@ -28,6 +28,7 @@ sources = files( 'vhost_crypto.c', 'vhost_user.c', 'virtio_net.c', + 'virtio_net_ctrl.c', ) headers = files( 'rte_vdpa.h', diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 8f0875b4e2..76663aed24 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -525,6 +525,8 @@ struct virtio_net { int postcopy_ufd; int postcopy_listening; + struct vhost_virtqueue *cvq; + struct rte_vdpa_device *vdpa_dev; /* context data for the external message handlers */ diff --git a/lib/vhost/virtio_net_ctrl.c b/lib/vhost/virtio_net_ctrl.c new file mode 100644 index 0000000000..16ea63b42f --- /dev/null +++ b/lib/vhost/virtio_net_ctrl.c @@ -0,0 +1,282 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2023 Red Hat, Inc. + */ + +#undef RTE_ANNOTATE_LOCKS + +#include +#include +#include + +#include "vhost.h" +#include "virtio_net_ctrl.h" + +struct virtio_net_ctrl { + uint8_t class; + uint8_t command; + uint8_t command_data[]; +}; + +struct virtio_net_ctrl_elem { + struct virtio_net_ctrl *ctrl_req; + uint16_t head_idx; + uint16_t n_descs; + uint8_t *desc_ack; +}; + +static int +virtio_net_ctrl_pop(struct virtio_net *dev, struct virtio_net_ctrl_elem *ctrl_elem) +{ + struct vhost_virtqueue *cvq = dev->cvq; + uint16_t avail_idx, desc_idx, n_descs = 0; + uint64_t desc_len, desc_addr, desc_iova, data_len = 0; + uint8_t *ctrl_req; + struct vring_desc *descs; + + avail_idx = __atomic_load_n(&cvq->avail->idx, __ATOMIC_ACQUIRE); + if (avail_idx == cvq->last_avail_idx) { + VHOST_LOG_CONFIG(dev->ifname, DEBUG, "Control queue empty\n"); + return 0; + } + + desc_idx = cvq->avail->ring[cvq->last_avail_idx]; + if (desc_idx >= cvq->size) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Out of range desc index, dropping\n"); + goto err; + } + + ctrl_elem->head_idx = desc_idx; + + if (cvq->desc[desc_idx].flags & VRING_DESC_F_INDIRECT) { + desc_len = cvq->desc[desc_idx].len; + desc_iova = cvq->desc[desc_idx].addr; + + descs = (struct vring_desc *)(uintptr_t)vhost_iova_to_vva(dev, cvq, + desc_iova, &desc_len, VHOST_ACCESS_RO); + if (!descs || desc_len != cvq->desc[desc_idx].len) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to map ctrl indirect descs\n"); + goto err; + } + + desc_idx = 0; + } else { + descs = cvq->desc; + } + + while (1) { + desc_len = descs[desc_idx].len; + desc_iova = descs[desc_idx].addr; + + n_descs++; + + if (descs[desc_idx].flags & VRING_DESC_F_WRITE) { + if (ctrl_elem->desc_ack) { + VHOST_LOG_CONFIG(dev->ifname, ERR, + "Unexpected ctrl chain layout\n"); + goto err; + } + + if (desc_len != sizeof(uint8_t)) { + VHOST_LOG_CONFIG(dev->ifname, ERR, + "Invalid ack size for ctrl req, dropping\n"); + goto 
err; + } + + ctrl_elem->desc_ack = (uint8_t *)(uintptr_t)vhost_iova_to_vva(dev, cvq, + desc_iova, &desc_len, VHOST_ACCESS_WO); + if (!ctrl_elem->desc_ack || desc_len != sizeof(uint8_t)) { + VHOST_LOG_CONFIG(dev->ifname, ERR, + "Failed to map ctrl ack descriptor\n"); + goto err; + } + } else { + if (ctrl_elem->desc_ack) { + VHOST_LOG_CONFIG(dev->ifname, ERR, + "Unexpected ctrl chain layout\n"); + goto err; + } + + data_len += desc_len; + } + + if (!(descs[desc_idx].flags & VRING_DESC_F_NEXT)) + break; + + desc_idx = descs[desc_idx].next; + } + + desc_idx = ctrl_elem->head_idx; + + if (cvq->desc[desc_idx].flags & VRING_DESC_F_INDIRECT) + ctrl_elem->n_descs = 1; + else + ctrl_elem->n_descs = n_descs; + + if (!ctrl_elem->desc_ack) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Missing ctrl ack descriptor\n"); + goto err; + } + + if (data_len < sizeof(ctrl_elem->ctrl_req->class) + sizeof(ctrl_elem->ctrl_req->command)) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Invalid control header size\n"); + goto err; + } + + ctrl_elem->ctrl_req = malloc(data_len); + if (!ctrl_elem->ctrl_req) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to alloc ctrl request\n"); + goto err; + } + + ctrl_req = (uint8_t *)ctrl_elem->ctrl_req; + + if (cvq->desc[desc_idx].flags & VRING_DESC_F_INDIRECT) { + desc_len = cvq->desc[desc_idx].len; + desc_iova = cvq->desc[desc_idx].addr; + + descs = (struct vring_desc *)(uintptr_t)vhost_iova_to_vva(dev, cvq, + desc_iova, &desc_len, VHOST_ACCESS_RO); + if (!descs || desc_len != cvq->desc[desc_idx].len) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to map ctrl indirect descs\n"); + goto err; + } + + desc_idx = 0; + } else { + descs = cvq->desc; + } + + while (!(descs[desc_idx].flags & VRING_DESC_F_WRITE)) { + desc_len = descs[desc_idx].len; + desc_iova = descs[desc_idx].addr; + + desc_addr = vhost_iova_to_vva(dev, cvq, desc_iova, &desc_len, VHOST_ACCESS_RO); + if (!desc_addr || desc_len < descs[desc_idx].len) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to map ctrl descriptor\n"); + goto free_err; + } + + memcpy(ctrl_req, (void *)(uintptr_t)desc_addr, desc_len); + ctrl_req += desc_len; + + if (!(descs[desc_idx].flags & VRING_DESC_F_NEXT)) + break; + + desc_idx = descs[desc_idx].next; + } + + cvq->last_avail_idx++; + if (cvq->last_avail_idx >= cvq->size) + cvq->last_avail_idx -= cvq->size; + + if (dev->features & (1ULL << VIRTIO_RING_F_EVENT_IDX)) + vhost_avail_event(cvq) = cvq->last_avail_idx; + + return 1; + +free_err: + free(ctrl_elem->ctrl_req); +err: + cvq->last_avail_idx++; + if (cvq->last_avail_idx >= cvq->size) + cvq->last_avail_idx -= cvq->size; + + if (dev->features & (1ULL << VIRTIO_RING_F_EVENT_IDX)) + vhost_avail_event(cvq) = cvq->last_avail_idx; + + return -1; +} + +static uint8_t +virtio_net_ctrl_handle_req(struct virtio_net *dev, struct virtio_net_ctrl *ctrl_req) +{ + uint8_t ret = VIRTIO_NET_ERR; + + if (ctrl_req->class == VIRTIO_NET_CTRL_MQ && + ctrl_req->command == VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET) { + uint16_t queue_pairs; + uint32_t i; + + queue_pairs = *(uint16_t *)(uintptr_t)ctrl_req->command_data; + VHOST_LOG_CONFIG(dev->ifname, INFO, "Ctrl req: MQ %u queue pairs\n", queue_pairs); + ret = VIRTIO_NET_OK; + + for (i = 0; i < dev->nr_vring; i++) { + struct vhost_virtqueue *vq = dev->virtqueue[i]; + bool enable; + + if (vq == dev->cvq) + continue; + + if (i < queue_pairs * 2) + enable = true; + else + enable = false; + + vq->enabled = enable; + if (dev->notify_ops->vring_state_changed) + dev->notify_ops->vring_state_changed(dev->vid, i, enable); + } + } + + return ret; +} + 
+static int +virtio_net_ctrl_push(struct virtio_net *dev, struct virtio_net_ctrl_elem *ctrl_elem) +{ + struct vhost_virtqueue *cvq = dev->cvq; + struct vring_used_elem *used_elem; + + used_elem = &cvq->used->ring[cvq->last_used_idx]; + used_elem->id = ctrl_elem->head_idx; + used_elem->len = ctrl_elem->n_descs; + + cvq->last_used_idx++; + if (cvq->last_used_idx >= cvq->size) + cvq->last_used_idx -= cvq->size; + + __atomic_store_n(&cvq->used->idx, cvq->last_used_idx, __ATOMIC_RELEASE); + + free(ctrl_elem->ctrl_req); + + return 0; +} + +int +virtio_net_ctrl_handle(struct virtio_net *dev) +{ + int ret = 0; + + if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Packed ring not supported yet\n"); + return -1; + } + + if (!dev->cvq) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "missing control queue\n"); + return -1; + } + + rte_spinlock_lock(&dev->cvq->access_lock); + + while (1) { + struct virtio_net_ctrl_elem ctrl_elem; + + memset(&ctrl_elem, 0, sizeof(struct virtio_net_ctrl_elem)); + + ret = virtio_net_ctrl_pop(dev, &ctrl_elem); + if (ret <= 0) + break; + + *ctrl_elem.desc_ack = virtio_net_ctrl_handle_req(dev, ctrl_elem.ctrl_req); + + ret = virtio_net_ctrl_push(dev, &ctrl_elem); + if (ret < 0) + break; + } + + rte_spinlock_unlock(&dev->cvq->access_lock); + + return ret; +} diff --git a/lib/vhost/virtio_net_ctrl.h b/lib/vhost/virtio_net_ctrl.h new file mode 100644 index 0000000000..9a90f4b9da --- /dev/null +++ b/lib/vhost/virtio_net_ctrl.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2023 Red Hat, Inc. + */ + +#ifndef _VIRTIO_NET_CTRL_H +#define _VIRTIO_NET_CTRL_H + +int virtio_net_ctrl_handle(struct virtio_net *dev); + +#endif From patchwork Fri Mar 31 15:42:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125694 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A75CA42887; Fri, 31 Mar 2023 17:45:21 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id B7A7042FAE; Fri, 31 Mar 2023 17:44:00 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 733A142D3C for ; Fri, 31 Mar 2023 17:43:59 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277439; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M4rJzQxi8J7R0YK6JL+Av0+bk291qerrzLq2kXuaZ3Y=; b=XQjWnZN/d4Fhv6NfF8fMw7BWkGY5CJx8jLXMJ6cxeEcRNnSBlWlHmu/s0Ey90IyWklNFcN Y7/CTxASj3t9OjcBMzZtQDEYD6A0+/MWLkJRwxyzFGbDxw9CF0AeScH4iUiSfLstHdJr78 b+ZTUPVg56n6q6PuLi04w+6YNtRZrR0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-574-i-lyCn32N7mRSljawpoI5A-1; Fri, 31 Mar 2023 11:43:54 -0400 X-MC-Unique: i-lyCn32N7mRSljawpoI5A-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher 
AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1AD8E811E7C; Fri, 31 Mar 2023 15:43:54 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id BE51B2027040; Fri, 31 Mar 2023 15:43:51 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 18/27] vhost: add VDUSE device creation and destruction Date: Fri, 31 Mar 2023 17:42:50 +0200 Message-Id: <20230331154259.1447831-19-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds initial support for VDUSE, which includes the device creation and destruction. It does not include the virtqueues configuration, so this is not functionnal at this point. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/meson.build | 4 + lib/vhost/socket.c | 34 +++++--- lib/vhost/vduse.c | 184 ++++++++++++++++++++++++++++++++++++++++++ lib/vhost/vduse.h | 33 ++++++++ lib/vhost/vhost.h | 2 + 5 files changed, 245 insertions(+), 12 deletions(-) create mode 100644 lib/vhost/vduse.c create mode 100644 lib/vhost/vduse.h diff --git a/lib/vhost/meson.build b/lib/vhost/meson.build index cdcd403df3..a57a15937f 100644 --- a/lib/vhost/meson.build +++ b/lib/vhost/meson.build @@ -30,6 +30,10 @@ sources = files( 'virtio_net.c', 'virtio_net_ctrl.c', ) +if cc.has_header('linux/vduse.h') + sources += files('vduse.c') + cflags += '-DVHOST_HAS_VDUSE' +endif headers = files( 'rte_vdpa.h', 'rte_vhost.h', diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index e95c3ffeac..a8a1c4cd2b 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -18,6 +18,7 @@ #include #include "fd_man.h" +#include "vduse.h" #include "vhost.h" #include "vhost_user.h" @@ -35,6 +36,7 @@ struct vhost_user_socket { int socket_fd; struct sockaddr_un un; bool is_server; + bool is_vduse; bool reconnect; bool iommu_support; bool use_builtin_virtio_net; @@ -992,18 +994,21 @@ rte_vhost_driver_register(const char *path, uint64_t flags) #endif } - if ((flags & RTE_VHOST_USER_CLIENT) != 0) { - vsocket->reconnect = !(flags & RTE_VHOST_USER_NO_RECONNECT); - if (vsocket->reconnect && reconn_tid == 0) { - if (vhost_user_reconnect_init() != 0) - goto out_mutex; - } + if (!strncmp("/dev/vduse/", path, strlen("/dev/vduse/"))) { + vsocket->is_vduse = true; } else { - vsocket->is_server = true; - } - ret = create_unix_socket(vsocket); - if (ret < 0) { - goto out_mutex; + if ((flags & RTE_VHOST_USER_CLIENT) != 0) { + vsocket->reconnect = !(flags & RTE_VHOST_USER_NO_RECONNECT); + if (vsocket->reconnect && reconn_tid == 0) { + if (vhost_user_reconnect_init() != 0) + goto out_mutex; + } + } else { + vsocket->is_server = true; + } + ret = create_unix_socket(vsocket); + if (ret < 0) + goto out_mutex; } vhost_user.vsockets[vhost_user.vsocket_cnt++] = vsocket; @@ -1068,7 +1073,9 @@ 
rte_vhost_driver_unregister(const char *path) if (strcmp(vsocket->path, path)) continue; - if (vsocket->is_server) { + if (vsocket->is_vduse) { + vduse_device_destroy(path); + } else if (vsocket->is_server) { /* * If r/wcb is executing, release vhost_user's * mutex lock, and try again since the r/wcb @@ -1171,6 +1178,9 @@ rte_vhost_driver_start(const char *path) if (!vsocket) return -1; + if (vsocket->is_vduse) + return vduse_device_create(path); + if (fdset_tid == 0) { /** * create a pipe which will be waited by poll and notified to diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c new file mode 100644 index 0000000000..336761c97a --- /dev/null +++ b/lib/vhost/vduse.c @@ -0,0 +1,184 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2023 Red Hat, Inc. + */ + +#include +#include +#include +#include + + +#include +#include + +#include +#include + +#include + +#include "vduse.h" +#include "vhost.h" + +#define VHOST_VDUSE_API_VERSION 0 +#define VDUSE_CTRL_PATH "/dev/vduse/control" + +#define VDUSE_NET_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \ + (1ULL << VIRTIO_F_ANY_LAYOUT) | \ + (1ULL << VIRTIO_F_VERSION_1) | \ + (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | \ + (1ULL << VIRTIO_RING_F_EVENT_IDX) | \ + (1ULL << VIRTIO_F_IN_ORDER) | \ + (1ULL << VIRTIO_F_IOMMU_PLATFORM)) + +static struct vhost_backend_ops vduse_backend_ops = { +}; + +int +vduse_device_create(const char *path) +{ + int control_fd, dev_fd, vid, ret; + uint32_t i; + struct virtio_net *dev; + uint64_t ver = VHOST_VDUSE_API_VERSION; + struct vduse_dev_config *dev_config = NULL; + const char *name = path + strlen("/dev/vduse/"); + + control_fd = open(VDUSE_CTRL_PATH, O_RDWR); + if (control_fd < 0) { + VHOST_LOG_CONFIG(name, ERR, "Failed to open %s: %s\n", + VDUSE_CTRL_PATH, strerror(errno)); + return -1; + } + + if (ioctl(control_fd, VDUSE_SET_API_VERSION, &ver)) { + VHOST_LOG_CONFIG(name, ERR, "Failed to set API version: %" PRIu64 ": %s\n", + ver, strerror(errno)); + ret = -1; + goto out_ctrl_close; + } + + dev_config = malloc(offsetof(struct vduse_dev_config, config)); + if (!dev_config) { + VHOST_LOG_CONFIG(name, ERR, "Failed to allocate VDUSE config\n"); + ret = -1; + goto out_ctrl_close; + } + + memset(dev_config, 0, sizeof(struct vduse_dev_config)); + + strncpy(dev_config->name, name, VDUSE_NAME_MAX - 1); + dev_config->device_id = VIRTIO_ID_NET; + dev_config->vendor_id = 0; + dev_config->features = VDUSE_NET_SUPPORTED_FEATURES; + dev_config->vq_num = 2; + dev_config->vq_align = sysconf(_SC_PAGE_SIZE); + dev_config->config_size = 0; + + ret = ioctl(control_fd, VDUSE_CREATE_DEV, dev_config); + if (ret < 0) { + VHOST_LOG_CONFIG(name, ERR, "Failed to create VDUSE device: %s\n", + strerror(errno)); + goto out_free; + } + + dev_fd = open(path, O_RDWR); + if (dev_fd < 0) { + VHOST_LOG_CONFIG(name, ERR, "Failed to open device %s: %s\n", + path, strerror(errno)); + ret = -1; + goto out_dev_close; + } + + vid = vhost_new_device(&vduse_backend_ops); + if (vid < 0) { + VHOST_LOG_CONFIG(name, ERR, "Failed to create new Vhost device\n"); + ret = -1; + goto out_dev_close; + } + + dev = get_device(vid); + if (!dev) { + ret = -1; + goto out_dev_close; + } + + strncpy(dev->ifname, path, IF_NAME_SZ - 1); + dev->vduse_ctrl_fd = control_fd; + dev->vduse_dev_fd = dev_fd; + vhost_setup_virtio_net(dev->vid, true, true, true, true); + + for (i = 0; i < 2; i++) { + struct vduse_vq_config vq_cfg = { 0 }; + + ret = alloc_vring_queue(dev, i); + if (ret) { + VHOST_LOG_CONFIG(name, ERR, "Failed to alloc vring %d metadata\n", i); + goto 
out_dev_destroy; + } + + vq_cfg.index = i; + vq_cfg.max_size = 1024; + + ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_SETUP, &vq_cfg); + if (ret) { + VHOST_LOG_CONFIG(name, ERR, "Failed to set-up VQ %d\n", i); + goto out_dev_destroy; + } + } + + free(dev_config); + + return 0; + +out_dev_destroy: + vhost_destroy_device(vid); +out_dev_close: + if (dev_fd >= 0) + close(dev_fd); + ioctl(control_fd, VDUSE_DESTROY_DEV, name); +out_free: + free(dev_config); +out_ctrl_close: + close(control_fd); + + return ret; +} + +int +vduse_device_destroy(const char *path) +{ + const char *name = path + strlen("/dev/vduse/"); + struct virtio_net *dev; + int vid, ret; + + for (vid = 0; vid < RTE_MAX_VHOST_DEVICE; vid++) { + dev = vhost_devices[vid]; + + if (dev == NULL) + continue; + + if (!strcmp(path, dev->ifname)) + break; + } + + if (vid == RTE_MAX_VHOST_DEVICE) + return -1; + + if (dev->vduse_dev_fd >= 0) { + close(dev->vduse_dev_fd); + dev->vduse_dev_fd = -1; + } + + if (dev->vduse_ctrl_fd >= 0) { + ret = ioctl(dev->vduse_ctrl_fd, VDUSE_DESTROY_DEV, name); + if (ret) + VHOST_LOG_CONFIG(name, ERR, "Failed to destroy VDUSE device: %s\n", + strerror(errno)); + close(dev->vduse_ctrl_fd); + dev->vduse_ctrl_fd = -1; + } + + vhost_destroy_device(vid); + + return 0; +} diff --git a/lib/vhost/vduse.h b/lib/vhost/vduse.h new file mode 100644 index 0000000000..a15e5d4c16 --- /dev/null +++ b/lib/vhost/vduse.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2023 Red Hat, Inc. + */ + +#ifndef _VDUSE_H +#define _VDUSE_H + +#include "vhost.h" + +#ifdef VHOST_HAS_VDUSE + +int vduse_device_create(const char *path); +int vduse_device_destroy(const char *path); + +#else + +static inline int +vduse_device_create(const char *path) +{ + VHOST_LOG_CONFIG(path, ERR, "VDUSE support disabled at build time\n"); + return -1; +} + +static inline int +vduse_device_destroy(const char *path) +{ + VHOST_LOG_CONFIG(path, ERR, "VDUSE support disabled at build time\n"); + return -1; +} + +#endif /* VHOST_HAS_VDUSE */ + +#endif /* _VDUSE_H */ diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 76663aed24..c8f2a0d43a 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -524,6 +524,8 @@ struct virtio_net { int postcopy_ufd; int postcopy_listening; + int vduse_ctrl_fd; + int vduse_dev_fd; struct vhost_virtqueue *cvq; From patchwork Fri Mar 31 15:42:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125695 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1084942887; Fri, 31 Mar 2023 17:45:28 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E915542FBA; Fri, 31 Mar 2023 17:44:02 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id EC17C42FB2 for ; Fri, 31 Mar 2023 17:44:00 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277440; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=T9EhuTMuxE2dZA7DO5G54JiKG+HKVhBVDlfMcsaBgs8=; b=XmBdkjcaQJj/MHYZTTT9jf8/b98G8/BsIYJnJoMK3WcuKOIVhXMFTcPznG8nZrhigY9rzl KKU8LAkkTsMCbQP712ceqJ0ruB9PG1Di2ptbtivfxI0wqPRONWz4xSuhoU0iFzkcOnYhnh 7/V2c8yL2eZa/CDCv4mEnClsUYEXkoA= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-610-OulI6V_vNaSD6Pb-nVH2dA-1; Fri, 31 Mar 2023 11:43:57 -0400 X-MC-Unique: OulI6V_vNaSD6Pb-nVH2dA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B4DB11C0A594; Fri, 31 Mar 2023 15:43:56 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 643462027040; Fri, 31 Mar 2023 15:43:54 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 19/27] vhost: add VDUSE callback for IOTLB miss Date: Fri, 31 Mar 2023 17:42:51 +0200 Message-Id: <20230331154259.1447831-20-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch implements the VDUSE callback for IOTLB misses, where it unmaps the pages from the invalidated IOTLB entry. 
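Concretely, on a miss the callback asks the VDUSE kernel module for a file descriptor backing the faulting IOVA, maps it, and inserts the translation into the IOTLB cache. A condensed sketch of that flow, mirroring the callback added below (illustrative only; error handling and logging trimmed, iova being the faulting address):

    struct vduse_iotlb_entry entry;
    struct stat st;
    void *addr;
    int fd;

    entry.start = iova;
    entry.last = iova + 1;
    /* The kernel widens the entry to the full region and returns an FD. */
    fd = ioctl(dev->vduse_dev_fd, VDUSE_IOTLB_GET_FD, &entry);

    /* entry.offset is the region's start offset within the FD. */
    addr = mmap(0, entry.last - entry.start + 1 + entry.offset,
                entry.perm, MAP_SHARED, fd, 0);
    fstat(fd, &st);         /* st.st_blksize is used as the page size */
    close(fd);              /* the mapping keeps the memory alive */

    vhost_user_iotlb_cache_insert(dev, entry.start, (uint64_t)(uintptr_t)addr,
                entry.offset, entry.last - entry.start + 1,
                (uint64_t)st.st_blksize, entry.perm);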
Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vduse.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index 336761c97a..f46823f589 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -13,9 +13,11 @@ #include #include +#include #include +#include "iotlb.h" #include "vduse.h" #include "vhost.h" @@ -30,7 +32,63 @@ (1ULL << VIRTIO_F_IN_ORDER) | \ (1ULL << VIRTIO_F_IOMMU_PLATFORM)) +static int +vduse_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm __rte_unused) +{ + struct vduse_iotlb_entry entry; + uint64_t size, page_size; + struct stat stat; + void *mmap_addr; + int fd, ret; + + entry.start = iova; + entry.last = iova + 1; + + ret = ioctl(dev->vduse_dev_fd, VDUSE_IOTLB_GET_FD, &entry); + if (ret < 0) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to get IOTLB entry for 0x%" PRIx64 "\n", + iova); + return -1; + } + + fd = ret; + + VHOST_LOG_CONFIG(dev->ifname, DEBUG, "New IOTLB entry:\n"); + VHOST_LOG_CONFIG(dev->ifname, DEBUG, "\tIOVA: %" PRIx64 " - %" PRIx64 "\n", + (uint64_t)entry.start, (uint64_t)entry.last); + VHOST_LOG_CONFIG(dev->ifname, DEBUG, "\toffset: %" PRIx64 "\n", (uint64_t)entry.offset); + VHOST_LOG_CONFIG(dev->ifname, DEBUG, "\tfd: %d\n", fd); + VHOST_LOG_CONFIG(dev->ifname, DEBUG, "\tperm: %x\n", entry.perm); + + size = entry.last - entry.start + 1; + mmap_addr = mmap(0, size + entry.offset, entry.perm, MAP_SHARED, fd, 0); + if (!mmap_addr) { + VHOST_LOG_CONFIG(dev->ifname, ERR, + "Failed to mmap IOTLB entry for 0x%" PRIx64 "\n", iova); + ret = -1; + goto close_fd; + } + + ret = fstat(fd, &stat); + if (ret < 0) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to get page size.\n"); + munmap(mmap_addr, entry.offset + size); + goto close_fd; + } + page_size = (uint64_t)stat.st_blksize; + + vhost_user_iotlb_cache_insert(dev, entry.start, (uint64_t)(uintptr_t)mmap_addr, + entry.offset, size, page_size, entry.perm); + + ret = 0; +close_fd: + close(fd); + + return ret; +} + static struct vhost_backend_ops vduse_backend_ops = { + .iotlb_miss = vduse_iotlb_miss, }; int From patchwork Fri Mar 31 15:42:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125697 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1EADF42887; Fri, 31 Mar 2023 17:45:40 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 7F9CA42FAA; Fri, 31 Mar 2023 17:44:07 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 9996342D55 for ; Fri, 31 Mar 2023 17:44:05 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277445; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XuzRDiKDa1X4zYFpy1/4othekXqlcbAxdq6a6Nwk6Q8=; b=ZtUnkI9k6LKGQIqLdBmMLU6O4UDPc0D7OCNQOg/mL2BWqFA65E54R4BR6XHpIOq6kKeMuY ObeGR6+pf5q7AV+lkL67HPBkqcXMk57W2GmuT2Rw8/bKa+DnGz5pxPVckF+tnGl8RutZ2+ xL53THMsDYYu1XuWA3OQnwt1mJ6cikE= Received: 
from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-650-6AMvldC5OsC-fmTbEb1yNQ-1; Fri, 31 Mar 2023 11:43:59 -0400 X-MC-Unique: 6AMvldC5OsC-fmTbEb1yNQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6405C280BF67; Fri, 31 Mar 2023 15:43:59 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 06F7D2027042; Fri, 31 Mar 2023 15:43:56 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 20/27] vhost: add VDUSE callback for IOTLB entry removal Date: Fri, 31 Mar 2023 17:42:52 +0200 Message-Id: <20230331154259.1447831-21-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch implements the VDUSE callback for IOTLB misses, where it unmaps the pages from the invalidated IOTLB entry Signed-off-by: Maxime Coquelin --- lib/vhost/vduse.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index f46823f589..ff4c9e72f1 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -32,6 +32,12 @@ (1ULL << VIRTIO_F_IN_ORDER) | \ (1ULL << VIRTIO_F_IOMMU_PLATFORM)) +static void +vduse_iotlb_remove_notify(uint64_t addr, uint64_t offset, uint64_t size) +{ + munmap((void *)(uintptr_t)addr, offset + size); +} + static int vduse_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm __rte_unused) { @@ -89,6 +95,7 @@ vduse_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm __rte_unuse static struct vhost_backend_ops vduse_backend_ops = { .iotlb_miss = vduse_iotlb_miss, + .iotlb_remove_notify = vduse_iotlb_remove_notify, }; int From patchwork Fri Mar 31 15:42:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125696 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 214A242887; Fri, 31 Mar 2023 17:45:34 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 50B2F42FA8; Fri, 31 Mar 2023 17:44:06 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 3694842FC6 for ; Fri, 31 Mar 2023 17:44:04 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277443; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oQBHZae0hhbo7kI1Acv2HX3YLnFYonX+KzXKZO4OTNg=; b=jHnf/Of+AODv+Xmjqd71TyXf2yPWMS/gHGDOkjFxF7m2pFoiOWhSQuLpL2EQa3H+8epMgf dwFD8bEv4Rw66C3EaFQhjuiKpNpkhRhgKsWJIggPI81w/TWLgk4gsKgwY4HAt3XWhGhXaR jNGkjl3PjUx6hXtUo+zignWHCZxqyc0= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-304-_ELMxngyOwiNEPqm3_qVzw-1; Fri, 31 Mar 2023 11:44:02 -0400 X-MC-Unique: _ELMxngyOwiNEPqm3_qVzw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id EBE96280BF65; Fri, 31 Mar 2023 15:44:01 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id B5ABC202701E; Fri, 31 Mar 2023 15:43:59 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 21/27] vhost: add VDUSE callback for IRQ injection Date: Fri, 31 Mar 2023 17:42:53 +0200 Message-Id: <20230331154259.1447831-22-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch implements the VDUSE callback for kicking virtqueues. 
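Note that "kicking" is meant from the driver's point of view: the op lets the backend notify the Virtio driver (i.e. inject a virtqueue interrupt) once descriptors have been processed, through the VDUSE_VQ_INJECT_IRQ ioctl which takes the virtqueue index. A hypothetical dispatch helper (not part of this patch; the dev->backend_ops field name is an assumption, inferred from the vduse_backend_ops structure registered at device creation) shows where such an op plugs into the generic notification path:

    /* Hypothetical helper: dispatch guest notification to the backend op.
     * For VDUSE this ends up in vduse_inject_irq() below, i.e.
     * ioctl(dev->vduse_dev_fd, VDUSE_VQ_INJECT_IRQ, &vq->index). */
    static int
    vhost_backend_notify(struct virtio_net *dev, struct vhost_virtqueue *vq)
    {
            if (dev->backend_ops == NULL || dev->backend_ops->inject_irq == NULL)
                    return -1;

            return dev->backend_ops->inject_irq(dev, vq);
    }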
Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vduse.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index ff4c9e72f1..afa8a39498 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -32,6 +32,12 @@ (1ULL << VIRTIO_F_IN_ORDER) | \ (1ULL << VIRTIO_F_IOMMU_PLATFORM)) +static int +vduse_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq) +{ + return ioctl(dev->vduse_dev_fd, VDUSE_VQ_INJECT_IRQ, &vq->index); +} + static void vduse_iotlb_remove_notify(uint64_t addr, uint64_t offset, uint64_t size) { @@ -96,6 +102,7 @@ vduse_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm __rte_unuse static struct vhost_backend_ops vduse_backend_ops = { .iotlb_miss = vduse_iotlb_miss, .iotlb_remove_notify = vduse_iotlb_remove_notify, + .inject_irq = vduse_inject_irq, }; int From patchwork Fri Mar 31 15:42:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125699 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 94D7042887; Fri, 31 Mar 2023 17:45:52 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C717942FC6; Fri, 31 Mar 2023 17:44:13 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 87BE542D4E for ; Fri, 31 Mar 2023 17:44:10 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277450; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O345baDM46Gp6ExT5NuaFsThDsbC4LbpAu/1ohE6OAE=; b=SA0IBUC6s5Skh/JYDfrSiYFgpKKy77TmS6b9w9VrIPKdGj8rIAfCl3Tpf/Wpks2mes1LTZ AoqFBde0GbRLzR7vig3HBl0svHrkd6lIlDMtZNmo+RnGRbifJuVBjwHsy1GywxHQ2MtGFG kV+yXKSrC/26+UkdC7/+QcsNQVxKw8U= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-262-cTGIupy9P8mp89fubNAiIQ-1; Fri, 31 Mar 2023 11:44:05 -0400 X-MC-Unique: cTGIupy9P8mp89fubNAiIQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 84521858F0E; Fri, 31 Mar 2023 15:44:04 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4A7672027040; Fri, 31 Mar 2023 15:44:02 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 22/27] vhost: add VDUSE events handler Date: Fri, 31 Mar 2023 17:42:54 +0200 Message-Id: <20230331154259.1447831-23-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: 
<20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch makes use of Vhost lib's FD manager to install a handler for VDUSE events occurring on the VDUSE device FD. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vduse.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index afa8a39498..2a183130d3 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -17,6 +17,7 @@ #include +#include "fd_man.h" #include "iotlb.h" #include "vduse.h" #include "vhost.h" @@ -32,6 +33,27 @@ (1ULL << VIRTIO_F_IN_ORDER) | \ (1ULL << VIRTIO_F_IOMMU_PLATFORM)) +struct vduse { + struct fdset fdset; +}; + +static struct vduse vduse = { + .fdset = { + .fd = { [0 ... MAX_FDS - 1] = {-1, NULL, NULL, NULL, 0} }, + .fd_mutex = PTHREAD_MUTEX_INITIALIZER, + .fd_pooling_mutex = PTHREAD_MUTEX_INITIALIZER, + .num = 0 + }, +}; + +static bool vduse_events_thread; + +static const char * const vduse_reqs_str[] = { + "VDUSE_GET_VQ_STATE", + "VDUSE_SET_STATUS", + "VDUSE_UPDATE_IOTLB", +}; + static int vduse_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq) { @@ -105,16 +127,84 @@ static struct vhost_backend_ops vduse_backend_ops = { .inject_irq = vduse_inject_irq, }; +static void +vduse_events_handler(int fd, void *arg, int *remove __rte_unused) +{ + struct virtio_net *dev = arg; + struct vduse_dev_request req; + struct vduse_dev_response resp; + int ret; + + memset(&resp, 0, sizeof(resp)); + + ret = read(fd, &req, sizeof(req)); + if (ret < 0) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to read request: %s\n", + strerror(errno)); + return; + } else if (ret < (int)sizeof(req)) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Incomplete to read request %d\n", ret); + return; + } + + pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL); + + VHOST_LOG_CONFIG(dev->ifname, INFO, "New request: %s (%u)\n", + req.type < RTE_DIM(vduse_reqs_str) ? + vduse_reqs_str[req.type] : "Unknown", + req.type); + + switch (req.type) { + default: + resp.result = VDUSE_REQ_RESULT_FAILED; + break; + } + + pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL); + + resp.request_id = req.request_id; + + ret = write(dev->vduse_dev_fd, &resp, sizeof(resp)); + if (ret != sizeof(resp)) { + VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to write response %s\n", + strerror(errno)); + return; + } +} + int vduse_device_create(const char *path) { int control_fd, dev_fd, vid, ret; + pthread_t fdset_tid; uint32_t i; struct virtio_net *dev; uint64_t ver = VHOST_VDUSE_API_VERSION; struct vduse_dev_config *dev_config = NULL; const char *name = path + strlen("/dev/vduse/"); + /* If first device, create events dispatcher thread */ + if (vduse_events_thread == false) { + /** + * create a pipe which will be waited by poll and notified to + * rebuild the wait list of poll. 
+ */ + if (fdset_pipe_init(&vduse.fdset) < 0) { + VHOST_LOG_CONFIG(path, ERR, "failed to create pipe for vduse fdset\n"); + return -1; + } + + ret = rte_ctrl_thread_create(&fdset_tid, "vduse-events", NULL, + fdset_event_dispatch, &vduse.fdset); + if (ret != 0) { + VHOST_LOG_CONFIG(path, ERR, "failed to create vduse fdset handling thread\n"); + fdset_pipe_uninit(&vduse.fdset); + return -1; + } + + vduse_events_thread = true; + } + control_fd = open(VDUSE_CTRL_PATH, O_RDWR); if (control_fd < 0) { VHOST_LOG_CONFIG(name, ERR, "Failed to open %s: %s\n", @@ -198,6 +288,13 @@ vduse_device_create(const char *path) } } + ret = fdset_add(&vduse.fdset, dev->vduse_dev_fd, vduse_events_handler, NULL, dev); + if (ret) { + VHOST_LOG_CONFIG(name, ERR, "Failed to add fd %d to vduse fdset\n", + dev->vduse_dev_fd); + goto out_dev_destroy; + } + free(dev_config); return 0; @@ -236,11 +333,16 @@ vduse_device_destroy(const char *path) if (vid == RTE_MAX_VHOST_DEVICE) return -1; + fdset_del(&vduse.fdset, dev->vduse_dev_fd); + if (dev->vduse_dev_fd >= 0) { close(dev->vduse_dev_fd); dev->vduse_dev_fd = -1; } + sleep(2); //ToDo: Need to rework fdman to ensure the deleted FD is no + //more being polled, otherwise VDUSE_DESTROY_DEV will fail. + if (dev->vduse_ctrl_fd >= 0) { ret = ioctl(dev->vduse_ctrl_fd, VDUSE_DESTROY_DEV, name); if (ret) From patchwork Fri Mar 31 15:42:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125698 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A222842887; Fri, 31 Mar 2023 17:45:46 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A4B6E42D84; Fri, 31 Mar 2023 17:44:10 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 4AE5C42D4D for ; Fri, 31 Mar 2023 17:44:09 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277448; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kieZpiCDBHUAoWPF7P+q9LY85nCdxSk55sLKeytb21E=; b=R1ddEdn06bqSm0wXjFsKKKOBImSnwN+vqRkBIOfMEXBXaKGifVmlw2SfUzf33AwxhJuwgk eFyNvqP5m1vj1IUs2InGOFiIR3viKiierF+yjlMCF8/bLedYzgunTB61UWPLaXRG7ccaGT +RjSDYO6KOrLBqJuBmgnE9ANPas3PIU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-252-X-EbhL-VPfScFIWzO02Xdg-1; Fri, 31 Mar 2023 11:44:07 -0400 X-MC-Unique: X-EbhL-VPfScFIWzO02Xdg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 21E53280BF65; Fri, 31 Mar 2023 15:44:07 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id CCBCC2027040; Fri, 31 Mar 2023 15:44:04 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, 
chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 23/27] vhost: add support for virtqueue state get event Date: Fri, 31 Mar 2023 17:42:55 +0200 Message-Id: <20230331154259.1447831-24-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds support for VDUSE_GET_VQ_STATE event handling, which consists in providing the backend last available index for the specified virtqueue. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vduse.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index 2a183130d3..36028b7315 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -133,6 +133,7 @@ vduse_events_handler(int fd, void *arg, int *remove __rte_unused) struct virtio_net *dev = arg; struct vduse_dev_request req; struct vduse_dev_response resp; + struct vhost_virtqueue *vq; int ret; memset(&resp, 0, sizeof(resp)); @@ -155,6 +156,13 @@ vduse_events_handler(int fd, void *arg, int *remove __rte_unused) req.type); switch (req.type) { + case VDUSE_GET_VQ_STATE: + vq = dev->virtqueue[req.vq_state.index]; + VHOST_LOG_CONFIG(dev->ifname, INFO, "\tvq index: %u, avail_index: %u\n", + req.vq_state.index, vq->last_avail_idx); + resp.vq_state.split.avail_index = vq->last_avail_idx; + resp.result = VDUSE_REQ_RESULT_OK; + break; default: resp.result = VDUSE_REQ_RESULT_FAILED; break; From patchwork Fri Mar 31 15:42:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125700 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 548FD42887; Fri, 31 Mar 2023 17:46:01 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6057B42FCC; Fri, 31 Mar 2023 17:44:15 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 961F742D81 for ; Fri, 31 Mar 2023 17:44:13 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277453; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xSRxyQDPKUWur8YOMG25njDpgBUFENDtPLWy2ZJx4Jo=; b=hwcM2GoADpb33N1W6fTUbunRSCsn+BVYh1oi7h3Ti+Inw6rSBLdGJTAfUAgoQ1Zq8ldbPz PQ6DabFEPWAFoDQxPE2CM2Cmkrw+YfU49BN9DiJqdsB6ywR2ZmM94UGkg90jnXEjyBi4nz 2r7D7EEwP3e2LKpDapH3OWJw8Saahw8= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-267-E3HmIgzcOue35eevDJtrxg-1; Fri, 31 Mar 2023 11:44:10 -0400 X-MC-Unique: E3HmIgzcOue35eevDJtrxg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A75CF1C0A594; Fri, 31 Mar 2023 15:44:09 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 667E3202701E; Fri, 31 Mar 2023 15:44:07 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 24/27] vhost: add support for VDUSE status set event Date: Fri, 31 Mar 2023 17:42:56 +0200 Message-Id: <20230331154259.1447831-25-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds support for VDUSE_SET_STATUS event handling, which consists in updating the Virtio device status set by the Virtio driver. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vduse.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c index 36028b7315..7d59a5f709 100644 --- a/lib/vhost/vduse.c +++ b/lib/vhost/vduse.c @@ -163,6 +163,12 @@ vduse_events_handler(int fd, void *arg, int *remove __rte_unused) resp.vq_state.split.avail_index = vq->last_avail_idx; resp.result = VDUSE_REQ_RESULT_OK; break; + case VDUSE_SET_STATUS: + VHOST_LOG_CONFIG(dev->ifname, INFO, "\tnew status: 0x%08x\n", + req.s.status); + dev->status = req.s.status; + resp.result = VDUSE_REQ_RESULT_OK; + break; default: resp.result = VDUSE_REQ_RESULT_FAILED; break; From patchwork Fri Mar 31 15:42:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 125701 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 339AF42887; Fri, 31 Mar 2023 17:46:07 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 934AC42F9A; Fri, 31 Mar 2023 17:44:17 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 3732F42F9A for ; Fri, 31 Mar 2023 17:44:16 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680277455; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=TmeYJB0kxqet/xBlQ/5cqCPx6hmegE4LURNqTdYg4Rk=; b=CEWVEWJTENuoHEMt09vgPp/0o++VnF0jbQxQXPp0wYjHEfWRv34kIvfEbsyOM8LVwFo5sn Nqp0jZYnqiRqUkgFi17m/gwwkjGNrjxbs573t763oiNjk3g+LvqJxU/w8qXFvCI314WbXD exwu+Jqpbt+JAmE7DDtwqWY7mWuDsGc= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-20-qmho9HLsOm6dJ60d1EUMRg-1; Fri, 31 Mar 2023 11:44:12 -0400 X-MC-Unique: qmho9HLsOm6dJ60d1EUMRg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5C29838025E8; Fri, 31 Mar 2023 15:44:12 +0000 (UTC) Received: from max-t490s.redhat.com (unknown [10.39.208.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 05DF22027040; Fri, 31 Mar 2023 15:44:09 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com, mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com, xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com, amorenoz@redhat.com Cc: Maxime Coquelin Subject: [RFC 25/27] vhost: add support for VDUSE IOTLB update event Date: Fri, 31 Mar 2023 17:42:57 +0200 Message-Id: <20230331154259.1447831-26-maxime.coquelin@redhat.com> In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com> References: <20230331154259.1447831-1-maxime.coquelin@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds support for VDUSE_UPDATE_IOTLB event handling, which consists in invaliding IOTLB entries for the range specified in the request. 
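Together with the IOTLB miss (patch 19) and IOTLB entry removal (patch 20) callbacks, this closes the loop on IOTLB maintenance. An annotated sketch of the sequence, illustrative only and using names from this series:

    /* 1. The VDUSE kernel module reports that the IOTLB mapping within
     *    [req.iova.start, req.iova.last] is invalidated. */

    /* 2. The handler below drops every cached translation in that range;
     *    each removed entry is munmap()'ed via vduse_iotlb_remove_notify(). */
    vhost_user_iotlb_cache_remove(dev, req.iova.start,
                    req.iova.last - req.iova.start + 1);

    /* 3. The next translation attempt on that range misses the cache and goes
     *    through vduse_iotlb_miss(), which re-maps the region via
     *    VDUSE_IOTLB_GET_FD and re-populates the cache. */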
Signed-off-by: Maxime Coquelin
---
 lib/vhost/vduse.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index 7d59a5f709..b5b9fa2eb1 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -169,6 +169,12 @@ vduse_events_handler(int fd, void *arg, int *remove __rte_unused)
         dev->status = req.s.status;
         resp.result = VDUSE_REQ_RESULT_OK;
         break;
+    case VDUSE_UPDATE_IOTLB:
+        VHOST_LOG_CONFIG(dev->ifname, INFO, "\tIOVA range: %" PRIx64 " - %" PRIx64 "\n",
+                (uint64_t)req.iova.start, (uint64_t)req.iova.last);
+        vhost_user_iotlb_cache_remove(dev, req.iova.start,
+                req.iova.last - req.iova.start + 1);
+        break;
     default:
         resp.result = VDUSE_REQ_RESULT_FAILED;
         break;
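The one subtle point in the new case is the length computation: the VDUSE request carries an inclusive IOVA range, so the number of bytes to invalidate is req.iova.last - req.iova.start + 1. A minimal sketch of that invalidation over a hypothetical flat cache (not the library's IOTLB implementation) follows.

#include <stdint.h>

/* Hypothetical cache entry, used only for this illustration. */
struct iotlb_entry {
    uint64_t iova;  /* first IOVA covered by the mapping */
    uint64_t size;  /* mapping length in bytes */
    int valid;
};

/*
 * Invalidate every entry overlapping the inclusive range [start, last].
 * The length the patch passes to the removal helper is last - start + 1,
 * because 'last' is the last valid address, not one past the end.
 */
static void
iotlb_invalidate_range(struct iotlb_entry *cache, unsigned int nb_entries,
        uint64_t start, uint64_t last)
{
    unsigned int i;

    for (i = 0; i < nb_entries; i++) {
        struct iotlb_entry *e = &cache[i];
        uint64_t e_last = e->iova + e->size - 1;

        if (e->valid && e->iova <= last && e_last >= start)
            e->valid = 0;
    }
}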
From patchwork Fri Mar 31 15:42:58 2023
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 125702
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com,
 mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com,
 xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com,
 amorenoz@redhat.com
Cc: Maxime Coquelin
Subject: [RFC 26/27] vhost: add VDUSE device startup
Date: Fri, 31 Mar 2023 17:42:58 +0200
Message-Id: <20230331154259.1447831-27-maxime.coquelin@redhat.com>
In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com>
References: <20230331154259.1447831-1-maxime.coquelin@redhat.com>

This patch adds the initialization of the device and its virtqueues once
the Virtio driver has set the DRIVER_OK bit in the Virtio status register.

Signed-off-by: Maxime Coquelin
Reviewed-by: Chenbo Xia
---
 lib/vhost/vduse.c | 118 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 118 insertions(+)

diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index b5b9fa2eb1..1cd04b4872 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -127,6 +127,120 @@ static struct vhost_backend_ops vduse_backend_ops = {
     .inject_irq = vduse_inject_irq,
 };
 
+static void
+vduse_vring_setup(struct virtio_net *dev, unsigned int index)
+{
+    struct vhost_virtqueue *vq = dev->virtqueue[index];
+    struct vhost_vring_addr *ra = &vq->ring_addrs;
+    struct vduse_vq_info vq_info;
+    struct vduse_vq_eventfd vq_efd;
+    int ret;
+
+    vq_info.index = index;
+    ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_GET_INFO, &vq_info);
+    if (ret) {
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to get VQ %u info: %s\n",
+                index, strerror(errno));
+        return;
+    }
+
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "VQ %u info:\n", index);
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "\tnum: %u\n", vq_info.num);
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "\tdesc_addr: %llx\n", vq_info.desc_addr);
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "\tdriver_addr: %llx\n", vq_info.driver_addr);
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "\tdevice_addr: %llx\n", vq_info.device_addr);
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "\tavail_idx: %u\n", vq_info.split.avail_index);
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "\tready: %u\n", vq_info.ready);
+
+    vq->last_avail_idx = vq_info.split.avail_index;
+    vq->size = vq_info.num;
+    vq->ready = vq_info.ready;
+    vq->enabled = true;
+    ra->desc_user_addr = vq_info.desc_addr;
+    ra->avail_user_addr = vq_info.driver_addr;
+    ra->used_user_addr = vq_info.device_addr;
+
+    vq->shadow_used_split = rte_malloc_socket(NULL,
+            vq->size * sizeof(struct vring_used_elem),
+            RTE_CACHE_LINE_SIZE, 0);
+    vq->batch_copy_elems = rte_malloc_socket(NULL,
+            vq->size * sizeof(struct batch_copy_elem),
+            RTE_CACHE_LINE_SIZE, 0);
+
+    vhost_user_iotlb_rd_lock(vq);
+    if (vring_translate(dev, vq))
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to translate vring %d addresses\n",
+                index);
+    vhost_user_iotlb_rd_unlock(vq);
+
+    vq->kickfd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+    if (vq->kickfd < 0) {
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to init kickfd for VQ %u: %s\n",
+                index, strerror(errno));
+        vq->kickfd = VIRTIO_INVALID_EVENTFD;
+        return;
+    }
+
+    vq_efd.index = index;
+    vq_efd.fd = vq->kickfd;
+
+    ret = ioctl(dev->vduse_dev_fd, VDUSE_VQ_SETUP_KICKFD, &vq_efd);
+    if (ret) {
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to setup kickfd for VQ %u: %s\n",
+                index, strerror(errno));
+        close(vq->kickfd);
+        vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+        return;
+    }
+}
+
+static void
+vduse_device_start(struct virtio_net *dev)
+{
+    unsigned int i, ret;
+
+    dev->notify_ops = vhost_driver_callback_get(dev->ifname);
+    if (!dev->notify_ops) {
+        VHOST_LOG_CONFIG(dev->ifname, ERR,
+                "Failed to get callback ops for driver\n");
+        return;
+    }
+
+    ret = ioctl(dev->vduse_dev_fd, VDUSE_DEV_GET_FEATURES, &dev->features);
+    if (ret) {
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to get features: %s\n",
+                strerror(errno));
+        return;
+    }
+
+    VHOST_LOG_CONFIG(dev->ifname, INFO, "negotiated Virtio features: 0x%" PRIx64 "\n",
+            dev->features);
+
+    if (dev->features &
+        ((1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+         (1ULL << VIRTIO_F_VERSION_1) |
+         (1ULL << VIRTIO_F_RING_PACKED))) {
+        dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+    } else {
+        dev->vhost_hlen = sizeof(struct virtio_net_hdr);
+    }
+
+    for (i = 0; i < dev->nr_vring; i++)
+        vduse_vring_setup(dev, i);
+
+    dev->flags |= VIRTIO_DEV_READY;
+
+    if (dev->notify_ops->new_device(dev->vid) == 0)
+        dev->flags |= VIRTIO_DEV_RUNNING;
+
+    for (i = 0; i < dev->nr_vring; i++) {
+        struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+        if (dev->notify_ops->vring_state_changed)
+            dev->notify_ops->vring_state_changed(dev->vid, i, vq->enabled);
+    }
+}
+
 static void
 vduse_events_handler(int fd, void *arg, int *remove __rte_unused)
 {
@@ -167,6 +281,10 @@ vduse_events_handler(int fd, void *arg, int *remove __rte_unused)
         VHOST_LOG_CONFIG(dev->ifname, INFO, "\tnew status: 0x%08x\n",
                 req.s.status);
         dev->status = req.s.status;
+
+        if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK)
+            vduse_device_start(dev);
+
         resp.result = VDUSE_REQ_RESULT_OK;
         break;
     case VDUSE_UPDATE_IOTLB:
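Stripped of the DPDK-specific bookkeeping, the per-virtqueue bring-up performed by vduse_vring_setup() boils down to two ioctls plus an eventfd. The sketch below illustrates that sequence using only the VDUSE calls the patch relies on (VDUSE_VQ_GET_INFO and VDUSE_VQ_SETUP_KICKFD); error handling is reduced to a bare minimum and the returned kick fd would still have to be polled by the application.

#include <errno.h>
#include <linux/vduse.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Distilled per-virtqueue bring-up: query the ring layout chosen by the
 * driver, then register an eventfd so the kernel can forward kicks.
 * Returns the kick eventfd on success, -1 on error.
 */
static int
setup_vring(int dev_fd, unsigned int index)
{
    struct vduse_vq_info info = { .index = index };
    struct vduse_vq_eventfd efd;
    int kickfd;

    if (ioctl(dev_fd, VDUSE_VQ_GET_INFO, &info) != 0) {
        fprintf(stderr, "VQ %u: get info failed: %s\n", index, strerror(errno));
        return -1;
    }

    /* info.num, info.desc_addr, info.driver_addr, info.device_addr and
     * info.split.avail_index now describe the ring as programmed by the
     * driver, exactly the fields the patch logs and copies into the vq. */

    kickfd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (kickfd < 0)
        return -1;

    efd.index = index;
    efd.fd = kickfd;
    if (ioctl(dev_fd, VDUSE_VQ_SETUP_KICKFD, &efd) != 0) {
        close(kickfd);
        return -1;
    }

    return kickfd;
}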
From patchwork Fri Mar 31 15:42:59 2023
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 125703
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Maxime Coquelin
To: dev@dpdk.org, david.marchand@redhat.com, chenbo.xia@intel.com,
 mkp@redhat.com, fbl@redhat.com, jasowang@redhat.com, cunming.liang@intel.com,
 xieyongji@bytedance.com, echaudro@redhat.com, eperezma@redhat.com,
 amorenoz@redhat.com
Cc: Maxime Coquelin
Subject: [RFC 27/27] vhost: add multiqueue support to VDUSE
Date: Fri, 31 Mar 2023 17:42:59 +0200
Message-Id: <20230331154259.1447831-28-maxime.coquelin@redhat.com>
In-Reply-To: <20230331154259.1447831-1-maxime.coquelin@redhat.com>
References: <20230331154259.1447831-1-maxime.coquelin@redhat.com>

This patch enables control queue support, which is required for
multiqueue support.

Signed-off-by: Maxime Coquelin
Reviewed-by: Chenbo Xia
---
 lib/vhost/vduse.c | 69 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 63 insertions(+), 6 deletions(-)

diff --git a/lib/vhost/vduse.c b/lib/vhost/vduse.c
index 1cd04b4872..135e78fc35 100644
--- a/lib/vhost/vduse.c
+++ b/lib/vhost/vduse.c
@@ -21,6 +21,7 @@
 #include "iotlb.h"
 #include "vduse.h"
 #include "vhost.h"
+#include "virtio_net_ctrl.h"
 
 #define VHOST_VDUSE_API_VERSION 0
 #define VDUSE_CTRL_PATH "/dev/vduse/control"
@@ -31,7 +32,9 @@
                 (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | \
                 (1ULL << VIRTIO_RING_F_EVENT_IDX) | \
                 (1ULL << VIRTIO_F_IN_ORDER) | \
-                (1ULL << VIRTIO_F_IOMMU_PLATFORM))
+                (1ULL << VIRTIO_F_IOMMU_PLATFORM) | \
+                (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
+                (1ULL << VIRTIO_NET_F_MQ))
 
 struct vduse {
     struct fdset fdset;
@@ -127,6 +130,25 @@ static struct vhost_backend_ops vduse_backend_ops = {
     .inject_irq = vduse_inject_irq,
 };
 
+static void
+vduse_control_queue_event(int fd, void *arg, int *remove __rte_unused)
+{
+    struct virtio_net *dev = arg;
+    uint64_t buf;
+    int ret;
+
+    ret = read(fd, &buf, sizeof(buf));
+    if (ret < 0) {
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to read control queue event: %s\n",
+                strerror(errno));
+        return;
+    }
+
+    VHOST_LOG_CONFIG(dev->ifname, DEBUG, "Control queue kicked\n");
+    if (virtio_net_ctrl_handle(dev))
+        VHOST_LOG_CONFIG(dev->ifname, ERR, "Failed to handle ctrl request\n");
+}
+
 static void
 vduse_vring_setup(struct virtio_net *dev, unsigned int index)
 {
@@ -192,6 +214,18 @@ vduse_vring_setup(struct virtio_net *dev, unsigned int index)
         vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
         return;
     }
+
+    if (vq == dev->cvq) {
+        vhost_enable_guest_notification(dev, vq, 1);
+        ret = fdset_add(&vduse.fdset, vq->kickfd, vduse_control_queue_event, NULL, dev);
+        if (ret) {
+            VHOST_LOG_CONFIG(dev->ifname, ERR,
+                    "Failed to setup kickfd handler for VQ %u: %s\n",
+                    index, strerror(errno));
+            close(vq->kickfd);
+            vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+        }
+    }
 }
 
 static void
@@ -236,6 +270,9 @@ vduse_device_start(struct virtio_net *dev)
     for (i = 0; i < dev->nr_vring; i++) {
         struct vhost_virtqueue *vq = dev->virtqueue[i];
 
+        if (vq == dev->cvq)
+            continue;
+
         if (dev->notify_ops->vring_state_changed)
             dev->notify_ops->vring_state_changed(dev->vid, i, vq->enabled);
     }
@@ -315,8 +352,9 @@ vduse_device_create(const char *path)
 {
     int control_fd, dev_fd, vid, ret;
     pthread_t fdset_tid;
-    uint32_t i;
+    uint32_t i, max_queue_pairs;
     struct virtio_net *dev;
+    struct virtio_net_config vnet_config = { 0 };
     uint64_t ver = VHOST_VDUSE_API_VERSION;
     struct vduse_dev_config *dev_config = NULL;
     const char *name = path + strlen("/dev/vduse/");
@@ -357,22 +395,33 @@ vduse_device_create(const char *path)
         goto out_ctrl_close;
     }
 
-    dev_config = malloc(offsetof(struct vduse_dev_config, config));
+    dev_config = malloc(offsetof(struct vduse_dev_config, config) +
+            sizeof(vnet_config));
     if (!dev_config) {
         VHOST_LOG_CONFIG(name, ERR, "Failed to allocate VDUSE config\n");
         ret = -1;
         goto out_ctrl_close;
     }
 
+    ret = rte_vhost_driver_get_queue_num(path, &max_queue_pairs);
+    if (ret < 0) {
+        VHOST_LOG_CONFIG(name, ERR, "Failed to get max queue pairs\n");
+        goto out_free;
+    }
+
+    VHOST_LOG_CONFIG(path, INFO, "VDUSE max queue pairs: %u\n", max_queue_pairs);
+
+    vnet_config.max_virtqueue_pairs = max_queue_pairs;
     memset(dev_config, 0, sizeof(struct vduse_dev_config));
 
     strncpy(dev_config->name, name, VDUSE_NAME_MAX - 1);
     dev_config->device_id = VIRTIO_ID_NET;
     dev_config->vendor_id = 0;
     dev_config->features = VDUSE_NET_SUPPORTED_FEATURES;
-    dev_config->vq_num = 2;
+    dev_config->vq_num = max_queue_pairs * 2 + 1; /* Includes ctrl queue */
     dev_config->vq_align = sysconf(_SC_PAGE_SIZE);
-    dev_config->config_size = 0;
+    dev_config->config_size = sizeof(struct virtio_net_config);
+    memcpy(dev_config->config, &vnet_config, sizeof(vnet_config));
 
     ret = ioctl(control_fd, VDUSE_CREATE_DEV, dev_config);
     if (ret < 0) {
@@ -407,7 +456,7 @@
     dev->vduse_dev_fd = dev_fd;
     vhost_setup_virtio_net(dev->vid, true, true, true, true);
 
-    for (i = 0; i < 2; i++) {
+    for (i = 0; i < max_queue_pairs * 2 + 1; i++) {
         struct vduse_vq_config vq_cfg = { 0 };
 
         ret = alloc_vring_queue(dev, i);
@@ -426,6 +475,8 @@
         }
     }
 
+    dev->cvq = dev->virtqueue[max_queue_pairs * 2];
+
     ret = fdset_add(&vduse.fdset, dev->vduse_dev_fd, vduse_events_handler, NULL, dev);
     if (ret) {
         VHOST_LOG_CONFIG(name, ERR, "Failed to add fd %d to vduse fdset\n",
@@ -471,6 +522,12 @@ vduse_device_destroy(const char *path)
     if (vid == RTE_MAX_VHOST_DEVICE)
         return -1;
 
+    if (dev->cvq && dev->cvq->kickfd >= 0) {
+        fdset_del(&vduse.fdset, dev->cvq->kickfd);
+        close(dev->cvq->kickfd);
+        dev->cvq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+    }
+
     fdset_del(&vduse.fdset, dev->vduse_dev_fd);
 
     if (dev->vduse_dev_fd >= 0) {