From patchwork Mon Oct 8 15:25:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46261 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0CF611B16A; Mon, 8 Oct 2018 17:26:22 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 178CC1B159; Mon, 8 Oct 2018 17:26:20 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 715B43001922; Mon, 8 Oct 2018 15:26:18 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 00C5F68C7F; Mon, 8 Oct 2018 15:26:14 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:39 +0200 Message-Id: <20181008152557.14275-2-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.47]); Mon, 08 Oct 2018 15:26:18 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 01/19] vhost: fix messages error checks X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The return value of message handlers is now an enum that takes a positive, non-zero value when a reply is needed. The code checking that value afterwards was not updated, so successfully handled messages that need a reply were treated as errors. Fixes: 2f270595c05d ("vhost: rework message handling as a callback array") Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets --- lib/librte_vhost/vhost_user.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 7ef3fb4a4..060b41893 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1783,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd) } skip_to_post_handle: - if (!ret && dev->extern_ops.post_msg_handle) { + if (ret != VH_RESULT_ERR && dev->extern_ops.post_msg_handle) { uint32_t need_reply; ret = (*dev->extern_ops.post_msg_handle)( @@ -1800,10 +1800,10 @@ vhost_user_msg_handler(int vid, int fd) vhost_user_unlock_all_queue_pairs(dev); if (msg.flags & VHOST_USER_NEED_REPLY) { - msg.payload.u64 = !!ret; + msg.payload.u64 = ret == VH_RESULT_ERR; msg.size = sizeof(msg.payload.u64); send_vhost_reply(fd, &msg); - } else if (ret) { + } else if (ret == VH_RESULT_ERR) { RTE_LOG(ERR, VHOST_CONFIG, "vhost message handling failed.\n"); return -1; From patchwork Mon Oct 8 15:25:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46262 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6C1501B155; Mon, 8 Oct 2018 17:26:25 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 5A64E1B155; Mon, 8 Oct 2018 17:26:23 
+0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 975303082B56; Mon, 8 Oct 2018 15:26:22 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id C6E3419742; Mon, 8 Oct 2018 15:26:18 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:40 +0200 Message-Id: <20181008152557.14275-3-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Mon, 08 Oct 2018 15:26:22 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 02/19] vhost: fix return code of messages requiring replies X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" VHOST_USER_GET_PROTOCOL_FEATURES, VHOST_USER_GET_VRING_BASE and VHOST_USER_SET_LOG_BASE require replies, so their handlers should return VH_RESULT_REPLY, not VH_RESULT_OK. 
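The distinction between the two success codes can be sketched in isolation. This is only an illustration, not code from the series: the `sketch_*` names and the trimmed-down message struct are hypothetical, and the numeric enum values are an assumption chosen to match the description of a negative error, zero for plain success and a positive value when a reply is prepared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the handler result enum; values are assumed. */
enum sketch_result {
	SKETCH_RESULT_ERR = -1,  /* handling failed */
	SKETCH_RESULT_OK = 0,    /* handled, nothing to send back */
	SKETCH_RESULT_REPLY = 1, /* handled, reply payload prepared */
};

/* Hypothetical, trimmed-down message: only what the sketch needs. */
struct sketch_msg {
	uint64_t payload;
	uint32_t size;
};

typedef int (*sketch_handler_t)(struct sketch_msg *msg);

/* A "get"-style request fills the payload, so it must ask for a reply. */
static int
sketch_get_protocol_features(struct sketch_msg *msg)
{
	msg->payload = 0x3f; /* arbitrary illustrative value */
	msg->size = sizeof(msg->payload);
	return SKETCH_RESULT_REPLY;
}

/* A "set"-style request has nothing to send back. */
static int
sketch_set_owner(struct sketch_msg *msg)
{
	(void)msg;
	return SKETCH_RESULT_OK;
}

/*
 * Returns -1 only when the handler reports an error. *replied records
 * whether a reply would have been sent to the peer.
 */
static int
sketch_dispatch(sketch_handler_t handler, struct sketch_msg *msg, bool *replied)
{
	int ret;

	assert(handler != NULL);
	*replied = false;

	ret = handler(msg);
	if (ret == SKETCH_RESULT_ERR)
		return -1;
	if (ret == SKETCH_RESULT_REPLY)
		*replied = true; /* real code would send msg on the socket here */
	return 0;
}
```

With this shape, a handler that prepares a payload but returns the plain success code would leave `*replied` false, and the peer would wait for an answer that never arrives, which is the kind of mismatch this patch addresses.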
Fixes: 0bff510b5ea6 ("vhost: unify message handling function signature") Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets Reviewed-by: Tiwei Bie --- lib/librte_vhost/vhost_user.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 060b41893..ce0ac0098 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1161,7 +1161,7 @@ vhost_user_get_vring_base(struct virtio_net **pdev, msg->size = sizeof(msg->payload.state); - return VH_RESULT_OK; + return VH_RESULT_REPLY; } /* @@ -1218,7 +1218,7 @@ vhost_user_get_protocol_features(struct virtio_net **pdev, msg->payload.u64 = protocol_features; msg->size = sizeof(msg->payload.u64); - return VH_RESULT_OK; + return VH_RESULT_REPLY; } static int @@ -1298,7 +1298,7 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg) msg->size = sizeof(msg->payload.u64); - return VH_RESULT_OK; + return VH_RESULT_REPLY; } static int vhost_user_set_log_fd(struct virtio_net **pdev __rte_unused, From patchwork Mon Oct 8 15:25:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46263 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 706A21B17D; Mon, 8 Oct 2018 17:26:28 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id E75461B176; Mon, 8 Oct 2018 17:26:26 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 35B7D3F73C; Mon, 8 Oct 2018 15:26:26 +0000 (UTC) Received: from localhost.localdomain 
(ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id F41491A930; Mon, 8 Oct 2018 15:26:22 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:41 +0200 Message-Id: <20181008152557.14275-4-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Mon, 08 Oct 2018 15:26:26 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 03/19] vhost: clarify reply-ack in case a reply was already sent X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" For messages that require a reply, a second ack should not be sent when the reply-ack protocol feature is negotiated, even if the corresponding flag is set in the message. The code is compliant with the spec, but this is not obvious when reading it, so this patch adds a comment to make it explicit. 
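The behaviour described above can be sketched as follows. This is a simplified model rather than the library code: the `ack_*` names, the message layout and the flag's bit position are assumptions made for illustration, and the step of clearing the flag inside the send helper follows the comment this patch adds.

```c
#include <stdint.h>

/* Assumed bit position for the "need reply" flag (hypothetical name). */
#define ACK_NEED_REPLY (1u << 3)

/* Assumed result values: negative error, zero success, positive reply. */
enum ack_result {
	ACK_RESULT_ERR = -1,
	ACK_RESULT_OK = 0,
	ACK_RESULT_REPLY = 1,
};

/* Hypothetical, trimmed-down message. */
struct ack_msg {
	uint32_t flags;
	uint32_t size;
	uint64_t payload;
};

/* Counts how many messages the sketch "sent" back to the peer. */
static unsigned int ack_sent_count;

/* Sending a reply clears the flag, so no second ack can follow it. */
static void
ack_send_reply(struct ack_msg *msg)
{
	msg->flags &= ~ACK_NEED_REPLY;
	ack_sent_count++;
}

/* Tail of message handling: reply first, then the optional reply-ack. */
static void
ack_finish(struct ack_msg *msg, int ret)
{
	if (ret == ACK_RESULT_REPLY)
		ack_send_reply(msg);

	/* Skipped when a reply was already sent: the flag is now clear. */
	if (msg->flags & ACK_NEED_REPLY) {
		msg->payload = (ret == ACK_RESULT_ERR);
		msg->size = sizeof(msg->payload);
		ack_send_reply(msg);
	}
}
```

A request that carries the flag and whose handler prepares a reply therefore results in exactly one message on the wire. A request that carries the flag but needs no reply gets a single ack whose payload is 0 on success and 1 on error.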
Suggested-by: Ilya Maximets Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets Reviewed-by: Tiwei Bie --- lib/librte_vhost/vhost_user.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index ce0ac0098..7f3e86778 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1799,6 +1799,11 @@ vhost_user_msg_handler(int vid, int fd) if (unlock_required) vhost_user_unlock_all_queue_pairs(dev); + /* + * If the request required a reply that was already sent, + * this optional reply-ack won't be sent as the + * VHOST_USER_NEED_REPLY was cleared in send_vhost_reply(). + */ if (msg.flags & VHOST_USER_NEED_REPLY) { msg.payload.u64 = ret == VH_RESULT_ERR; msg.size = sizeof(msg.payload.u64); From patchwork Mon Oct 8 15:25:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46264 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 354731B186; Mon, 8 Oct 2018 17:26:32 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id B19181B180; Mon, 8 Oct 2018 17:26:30 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0462C11530; Mon, 8 Oct 2018 15:26:30 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id B619C170FE; Mon, 8 Oct 2018 15:26:26 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, 
i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:42 +0200 Message-Id: <20181008152557.14275-5-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 08 Oct 2018 15:26:30 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 04/19] vhost: fix payload size of reply X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" QEMU doesn't expect any payload for the reply of VHOST_USER_SET_LOG_BASE request, so don't send any. Note that the Vhost-user specification isn't clear about it and would need to be fixed. Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request") Cc: stable@dpdk.org Reported-by: Ilya Maximets Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets --- lib/librte_vhost/vhost_user.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 7f3e86778..71a0e7dd7 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1296,7 +1296,11 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg) dev->log_base = dev->log_addr + off; dev->log_size = size; - msg->size = sizeof(msg->payload.u64); + /* + * The spec is not clear about it (yet), but QEMU doesn't expect + * any payload in the reply. 
+ */ + msg->size = 0; return VH_RESULT_REPLY; } From patchwork Mon Oct 8 15:25:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46265 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AFD501B17A; Mon, 8 Oct 2018 17:26:36 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 677201B18A; Mon, 8 Oct 2018 17:26:34 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 841F2A5D95; Mon, 8 Oct 2018 15:26:33 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5422C170FE; Mon, 8 Oct 2018 15:26:30 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:43 +0200 Message-Id: <20181008152557.14275-6-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 08 Oct 2018 15:26:33 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 05/19] vhost: fix error handling when mem table gets updated X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and 
discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" When the memory table gets updated, the ring addresses need to be translated again. If the translation fails, we need to exit cleanly by unmapping the memory regions. Fixes: d5022533c20a ("vhost: retranslate vring addr when memory table changes") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets --- lib/librte_vhost/vhost_user.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 71a0e7dd7..3f01926e2 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -964,7 +964,7 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg) dev = translate_ring_addresses(dev, i); if (!dev) - return VH_RESULT_ERR; + goto err_mmap; *pdev = dev; } From patchwork Mon Oct 8 15:25:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46266 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id BD83F1B1A0; Mon, 8 Oct 2018 17:26:39 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id ADA681B159; Mon, 8 Oct 2018 17:26:37 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F2FDD308FB9E; Mon, 8 Oct 2018 15:26:36 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2090D194A8; Mon, 8 Oct 2018 15:26:33 +0000 (UTC) From: Maxime 
Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:44 +0200 Message-Id: <20181008152557.14275-7-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.43]); Mon, 08 Oct 2018 15:26:37 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 06/19] vhost: define postcopy protocol flag X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets Reviewed-by: Tiwei Bie --- lib/librte_vhost/rte_vhost.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index b02673d4a..b3cc6990d 100644 --- a/lib/librte_vhost/rte_vhost.h +++ b/lib/librte_vhost/rte_vhost.h @@ -66,6 +66,10 @@ extern "C" { #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 11 #endif +#ifndef VHOST_USER_PROTOCOL_F_PAGEFAULT +#define VHOST_USER_PROTOCOL_F_PAGEFAULT 8 +#endif + /** Indicate whether protocol features negotiation is supported. 
*/ #ifndef VHOST_USER_F_PROTOCOL_FEATURES #define VHOST_USER_F_PROTOCOL_FEATURES 30 From patchwork Mon Oct 8 15:25:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46267 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 823461B1C4; Mon, 8 Oct 2018 17:26:43 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 370651B147; Mon, 8 Oct 2018 17:26:41 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7D13530BCCC0; Mon, 8 Oct 2018 15:26:40 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5FC8B5783; Mon, 8 Oct 2018 15:26:37 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:45 +0200 Message-Id: <20181008152557.14275-8-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Mon, 08 Oct 2018 15:26:40 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 07/19] vhost: add number of fds to vhost-user messages and use it X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 
Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" As soon as some ancillary data (fds) is received, it is copied without checking its length. This patch adds the number of fds received to the message, which is set in read_vhost_message(). This is preliminary work to support sending fds to QEMU. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/socket.c | 25 ++++++++++++++++++++----- lib/librte_vhost/vhost_user.c | 2 +- lib/librte_vhost/vhost_user.h | 4 +++- 3 files changed, 24 insertions(+), 7 deletions(-) diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c index d63031747..7cad5593e 100644 --- a/lib/librte_vhost/socket.c +++ b/lib/librte_vhost/socket.c @@ -94,18 +94,24 @@ static struct vhost_user vhost_user = { .mutex = PTHREAD_MUTEX_INITIALIZER, }; -/* return bytes# of read on success or negative val on failure. */ +/* + * return bytes# of read on success or negative val on failure. Update fdnum + * with number of fds read. 
+ */ int -read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num) +read_fd_message(int sockfd, char *buf, int buflen, int *fds, int max_fds, + int *fd_num) { struct iovec iov; struct msghdr msgh; - size_t fdsize = fd_num * sizeof(int); - char control[CMSG_SPACE(fdsize)]; + char control[CMSG_SPACE(max_fds * sizeof(int))]; struct cmsghdr *cmsg; int got_fds = 0; + int *tmp_fds; int ret; + *fd_num = 0; + memset(&msgh, 0, sizeof(msgh)); iov.iov_base = buf; iov.iov_len = buflen; @@ -131,13 +137,22 @@ read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num) if ((cmsg->cmsg_level == SOL_SOCKET) && (cmsg->cmsg_type == SCM_RIGHTS)) { got_fds = (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int); + if (got_fds > max_fds) { + RTE_LOG(ERR, VHOST_CONFIG, + "Received msg contains more fds than supported\n"); + tmp_fds = (int *)CMSG_DATA(cmsg); + while (got_fds--) + close(tmp_fds[got_fds]); + return -1; + } + *fd_num = got_fds; memcpy(fds, CMSG_DATA(cmsg), got_fds * sizeof(int)); break; } } /* Clear out unused file descriptors */ - while (got_fds < fd_num) + while (got_fds < max_fds) fds[got_fds++] = -1; return ret; diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 3f01926e2..8b63aadc6 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1517,7 +1517,7 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg) int ret; ret = read_fd_message(sockfd, (char *)msg, VHOST_USER_HDR_SIZE, - msg->fds, VHOST_MEMORY_MAX_NREGIONS); + msg->fds, VHOST_MEMORY_MAX_NREGIONS, &msg->fd_num); if (ret <= 0) return ret; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index 42166adf2..dd0262f8f 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -132,6 +132,7 @@ typedef struct VhostUserMsg { VhostUserVringArea area; } payload; int fds[VHOST_MEMORY_MAX_NREGIONS]; + int fd_num; } __attribute((packed)) VhostUserMsg; #define VHOST_USER_HDR_SIZE 
offsetof(VhostUserMsg, payload.u64) @@ -146,7 +147,8 @@ int vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm); int vhost_user_host_notifier_ctrl(int vid, bool enable); /* socket.c */ -int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num); +int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int max_fds, + int *fd_num); int send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num); #endif From patchwork Mon Oct 8 15:25:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46268 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 961861B1D6; Mon, 8 Oct 2018 17:26:46 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 003AA1B1CD; Mon, 8 Oct 2018 17:26:44 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3230C30CF681; Mon, 8 Oct 2018 15:26:44 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id B6D605783; Mon, 8 Oct 2018 15:26:40 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:46 +0200 Message-Id: <20181008152557.14275-9-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: 
<20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.47]); Mon, 08 Oct 2018 15:26:44 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 08/19] vhost: pass socket fd to message handling callbacks X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This is not used for now, but will be needed for the special handling of VHOST_USER_SET_MEM_TABLE message once postcopy will be supported. Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 71 +++++++++++++++++++++++------------ 1 file changed, 47 insertions(+), 24 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 8b63aadc6..48a070277 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -138,14 +138,16 @@ vhost_backend_cleanup(struct virtio_net *dev) */ static int vhost_user_set_owner(struct virtio_net **pdev __rte_unused, - struct VhostUserMsg *msg __rte_unused) + struct VhostUserMsg *msg __rte_unused, + int main_fd __rte_unused) { return VH_RESULT_OK; } static int vhost_user_reset_owner(struct virtio_net **pdev, - struct VhostUserMsg *msg __rte_unused) + struct VhostUserMsg *msg __rte_unused, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; vhost_destroy_device_notify(dev); @@ -159,7 +161,8 @@ vhost_user_reset_owner(struct virtio_net **pdev, * The features that we support are requested. 
*/ static int -vhost_user_get_features(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_get_features(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; uint64_t features = 0; @@ -176,7 +179,8 @@ vhost_user_get_features(struct virtio_net **pdev, struct VhostUserMsg *msg) * The queue number that we support are requested. */ static int -vhost_user_get_queue_num(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_get_queue_num(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; uint32_t queue_num = 0; @@ -193,7 +197,8 @@ vhost_user_get_queue_num(struct virtio_net **pdev, struct VhostUserMsg *msg) * We receive the negotiated features supported by us and the virtio device. */ static int -vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; uint64_t features = msg->payload.u64; @@ -275,7 +280,8 @@ vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg *msg) */ static int vhost_user_set_vring_num(struct virtio_net **pdev, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct vhost_virtqueue *vq = dev->virtqueue[msg->payload.state.index]; @@ -637,7 +643,8 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index) * This function then converts these to our address space. 
*/ static int -vhost_user_set_vring_addr(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_vring_addr(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct vhost_virtqueue *vq; @@ -674,7 +681,8 @@ vhost_user_set_vring_addr(struct virtio_net **pdev, struct VhostUserMsg *msg) */ static int vhost_user_set_vring_base(struct virtio_net **pdev, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; dev->virtqueue[msg->payload.state.index]->last_used_idx = @@ -807,7 +815,8 @@ vhost_memory_changed(struct VhostUserMemory *new, } static int -vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct VhostUserMemory memory = msg->payload.memory; @@ -1021,7 +1030,8 @@ virtio_is_ready(struct virtio_net *dev) } static int -vhost_user_set_vring_call(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_vring_call(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct vhost_vring_file file; @@ -1045,7 +1055,8 @@ vhost_user_set_vring_call(struct virtio_net **pdev, struct VhostUserMsg *msg) } static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { if (!(msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)) close(msg->fds[0]); @@ -1055,7 +1066,8 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused, } static int -vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct vhost_vring_file file; @@ -1114,7 +1126,8 @@ 
free_zmbufs(struct vhost_virtqueue *vq) */ static int vhost_user_get_vring_base(struct virtio_net **pdev, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct vhost_virtqueue *vq = dev->virtqueue[msg->payload.state.index]; @@ -1170,7 +1183,8 @@ vhost_user_get_vring_base(struct virtio_net **pdev, */ static int vhost_user_set_vring_enable(struct virtio_net **pdev, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; int enable = (int)msg->payload.state.num; @@ -1198,7 +1212,8 @@ vhost_user_set_vring_enable(struct virtio_net **pdev, static int vhost_user_get_protocol_features(struct virtio_net **pdev, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; uint64_t features, protocol_features; @@ -1223,7 +1238,8 @@ vhost_user_get_protocol_features(struct virtio_net **pdev, static int vhost_user_set_protocol_features(struct virtio_net **pdev, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; uint64_t protocol_features = msg->payload.u64; @@ -1240,7 +1256,8 @@ vhost_user_set_protocol_features(struct virtio_net **pdev, } static int -vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; int fd = msg->fds[0]; @@ -1306,7 +1323,8 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg) } static int vhost_user_set_log_fd(struct virtio_net **pdev __rte_unused, - struct VhostUserMsg *msg) + struct VhostUserMsg *msg, + int main_fd __rte_unused) { close(msg->fds[0]); RTE_LOG(INFO, VHOST_CONFIG, "not implemented.\n"); @@ -1323,7 +1341,8 @@ static int vhost_user_set_log_fd(struct virtio_net **pdev __rte_unused, * a flag 'broadcast_rarp' to 
let rte_vhost_dequeue_burst() inject it. */ static int -vhost_user_send_rarp(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_send_rarp(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; uint8_t *mac = (uint8_t *)&msg->payload.u64; @@ -1353,7 +1372,8 @@ vhost_user_send_rarp(struct virtio_net **pdev, struct VhostUserMsg *msg) } static int -vhost_user_net_set_mtu(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_net_set_mtu(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; if (msg->payload.u64 < VIRTIO_MIN_MTU || @@ -1370,7 +1390,8 @@ vhost_user_net_set_mtu(struct virtio_net **pdev, struct VhostUserMsg *msg) } static int -vhost_user_set_req_fd(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_set_req_fd(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; int fd = msg->fds[0]; @@ -1437,7 +1458,8 @@ is_vring_iotlb_invalidate(struct vhost_virtqueue *vq, } static int -vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg) +vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) { struct virtio_net *dev = *pdev; struct vhost_iotlb_msg *imsg = &msg->payload.iotlb; @@ -1482,7 +1504,8 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg) } typedef int (*vhost_message_handler_t)(struct virtio_net **pdev, - struct VhostUserMsg *msg); + struct VhostUserMsg *msg, + int main_fd); static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = { [VHOST_USER_NONE] = NULL, [VHOST_USER_GET_FEATURES] = vhost_user_get_features, @@ -1760,7 +1783,7 @@ vhost_user_msg_handler(int vid, int fd) if (request > VHOST_USER_NONE && request < VHOST_USER_MAX) { if (!vhost_message_handlers[request]) goto skip_to_post_handle; - ret = vhost_message_handlers[request](&dev, &msg); + 
ret = vhost_message_handlers[request](&dev, &msg, fd); switch (ret) { case VH_RESULT_ERR: From patchwork Mon Oct 8 15:25:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46269 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6D2FB1B1F4; Mon, 8 Oct 2018 17:26:49 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 5793D1B1EE; Mon, 8 Oct 2018 17:26:48 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 833A730BCCCC; Mon, 8 Oct 2018 15:26:47 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5C70758773; Mon, 8 Oct 2018 15:26:44 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:47 +0200 Message-Id: <20181008152557.14275-10-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Mon, 08 Oct 2018 15:26:47 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 09/19] vhost: enable fds passing when sending vhost-user messages X-BeenThere: dev@dpdk.org X-Mailman-Version: 
2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Passing userfault fds to Qemu will be required for the postcopy live-migration feature. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 48a070277..20f38267d 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -171,6 +171,7 @@ vhost_user_get_features(struct virtio_net **pdev, struct VhostUserMsg *msg, msg->payload.u64 = features; msg->size = sizeof(msg->payload.u64); + msg->fd_num = 0; return VH_RESULT_REPLY; } @@ -189,6 +190,7 @@ vhost_user_get_queue_num(struct virtio_net **pdev, struct VhostUserMsg *msg, msg->payload.u64 = (uint64_t)queue_num; msg->size = sizeof(msg->payload.u64); + msg->fd_num = 0; return VH_RESULT_REPLY; } @@ -1173,6 +1175,7 @@ vhost_user_get_vring_base(struct virtio_net **pdev, vq->batch_copy_elems = NULL; msg->size = sizeof(msg->payload.state); + msg->fd_num = 0; return VH_RESULT_REPLY; } @@ -1232,6 +1235,7 @@ vhost_user_get_protocol_features(struct virtio_net **pdev, msg->payload.u64 = protocol_features; msg->size = sizeof(msg->payload.u64); + msg->fd_num = 0; return VH_RESULT_REPLY; } @@ -1318,6 +1322,7 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg, * any payload in the reply.
*/ msg->size = 0; + msg->fd_num = 0; return VH_RESULT_REPLY; } @@ -1564,13 +1569,13 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg) } static int -send_vhost_message(int sockfd, struct VhostUserMsg *msg, int *fds, int fd_num) +send_vhost_message(int sockfd, struct VhostUserMsg *msg) { if (!msg) return 0; return send_fd_message(sockfd, (char *)msg, - VHOST_USER_HDR_SIZE + msg->size, fds, fd_num); + VHOST_USER_HDR_SIZE + msg->size, msg->fds, msg->fd_num); } static int @@ -1584,19 +1589,18 @@ send_vhost_reply(int sockfd, struct VhostUserMsg *msg) msg->flags |= VHOST_USER_VERSION; msg->flags |= VHOST_USER_REPLY_MASK; - return send_vhost_message(sockfd, msg, NULL, 0); + return send_vhost_message(sockfd, msg); } static int -send_vhost_slave_message(struct virtio_net *dev, struct VhostUserMsg *msg, - int *fds, int fd_num) +send_vhost_slave_message(struct virtio_net *dev, struct VhostUserMsg *msg) { int ret; if (msg->flags & VHOST_USER_NEED_REPLY) rte_spinlock_lock(&dev->slave_req_lock); - ret = send_vhost_message(dev->slave_req_fd, msg, fds, fd_num); + ret = send_vhost_message(dev->slave_req_fd, msg); if (ret < 0 && (msg->flags & VHOST_USER_NEED_REPLY)) rte_spinlock_unlock(&dev->slave_req_lock); @@ -1834,6 +1838,7 @@ vhost_user_msg_handler(int vid, int fd) if (msg.flags & VHOST_USER_NEED_REPLY) { msg.payload.u64 = ret == VH_RESULT_ERR; msg.size = sizeof(msg.payload.u64); + msg.fd_num = 0; send_vhost_reply(fd, &msg); } else if (ret == VH_RESULT_ERR) { RTE_LOG(ERR, VHOST_CONFIG, @@ -1917,7 +1922,7 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm) }, }; - ret = send_vhost_message(dev->slave_req_fd, &msg, NULL, 0); + ret = send_vhost_message(dev->slave_req_fd, &msg); if (ret < 0) { RTE_LOG(ERR, VHOST_CONFIG, "Failed to send IOTLB miss message (%d)\n", @@ -1933,8 +1938,6 @@ static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev, uint64_t offset, uint64_t size) { - int *fdp = NULL; - size_t fd_num = 0; int ret; struct 
VhostUserMsg msg = { .request.slave = VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG, @@ -1950,11 +1953,11 @@ static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev, if (fd < 0) msg.payload.area.u64 |= VHOST_USER_VRING_NOFD_MASK; else { - fdp = &fd; - fd_num = 1; + msg.fds[0] = fd; + msg.fd_num = 1; } - ret = send_vhost_slave_message(dev, &msg, fdp, fd_num); + ret = send_vhost_slave_message(dev, &msg); if (ret < 0) { RTE_LOG(ERR, VHOST_CONFIG, "Failed to set host notifier (%d)\n", ret); From patchwork Mon Oct 8 15:25:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46270 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 095421B1FE; Mon, 8 Oct 2018 17:26:53 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 828821B1E9; Mon, 8 Oct 2018 17:26:51 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CCCF461BB0; Mon, 8 Oct 2018 15:26:50 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id E73695783; Mon, 8 Oct 2018 15:26:47 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:48 +0200 Message-Id: <20181008152557.14275-11-maxime.coquelin@redhat.com> In-Reply-To: 
<20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Mon, 08 Oct 2018 15:26:50 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 10/19] vhost: add config flag for postcopy feature X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The postcopy live-migration feature relies on userfaultfd, which was only introduced in kernel v4.3. This patch introduces a new define to allow building the vhost library on kernels that do not support userfaultfd. With the legacy build system, the user has to explicitly set CONFIG_RTE_LIBRTE_VHOST_POSTCOPY to 'y'. With the Meson build system, RTE_LIBRTE_VHOST_POSTCOPY is defined automatically if the userfaultfd kernel header is present.
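As a rough illustration of how such a build-time define is usually consumed, the following sketch gates userfaultfd-dependent code behind a macro. It is not taken from the patch: EXAMPLE_VHOST_POSTCOPY and example_postcopy_supported() are hypothetical names, and the __has_include probe is only an assumed stand-in for the Meson header check.

```c
/*
 * Sketch only: gate userfaultfd-dependent code behind a build-time define.
 * EXAMPLE_VHOST_POSTCOPY is a hypothetical stand-in for
 * RTE_LIBRTE_VHOST_POSTCOPY; the __has_include probe imitates a
 * "is linux/userfaultfd.h present?" check done by the build system.
 */
#if defined(__has_include)
#  if __has_include(<linux/userfaultfd.h>)
#    define EXAMPLE_VHOST_POSTCOPY 1
#  endif
#endif

#ifdef EXAMPLE_VHOST_POSTCOPY
#include <linux/userfaultfd.h>
#endif

/* Returns 1 when postcopy support was compiled in, 0 otherwise. */
static int
example_postcopy_supported(void)
{
#ifdef EXAMPLE_VHOST_POSTCOPY
	return 1;
#else
	return 0;
#endif
}
```

With this pattern the library still builds when the header is missing; the feature is simply reported as unavailable.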
Suggested-by: Ilya Maximets Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets --- config/common_linuxapp | 1 + lib/librte_vhost/meson.build | 2 ++ 2 files changed, 3 insertions(+) diff --git a/config/common_linuxapp b/config/common_linuxapp index 485e1467d..424100a75 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -14,6 +14,7 @@ CONFIG_RTE_LIBRTE_KNI=y CONFIG_RTE_LIBRTE_PMD_KNI=y CONFIG_RTE_LIBRTE_VHOST=y CONFIG_RTE_LIBRTE_VHOST_NUMA=y +CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n CONFIG_RTE_LIBRTE_PMD_VHOST=y CONFIG_RTE_LIBRTE_IFC_PMD=y CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y diff --git a/lib/librte_vhost/meson.build b/lib/librte_vhost/meson.build index 9d25b4d88..e33e6fc16 100644 --- a/lib/librte_vhost/meson.build +++ b/lib/librte_vhost/meson.build @@ -7,6 +7,8 @@ endif if has_libnuma == 1 dpdk_conf.set10('RTE_LIBRTE_VHOST_NUMA', true) endif +dpdk_conf.set('RTE_LIBRTE_VHOST_POSTCOPY', + cc.has_header('linux/userfaultfd.h')) version = 4 allow_experimental_apis = true cflags += '-fno-strict-aliasing' From patchwork Mon Oct 8 15:25:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46271 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 730781B17E; Mon, 8 Oct 2018 17:26:56 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 10CE81B1C3; Mon, 8 Oct 2018 17:26:55 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5A0404E91F; Mon, 8 Oct 2018 15:26:54 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) 
by smtp.corp.redhat.com (Postfix) with ESMTP id 2C05828DF0; Mon, 8 Oct 2018 15:26:50 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:49 +0200 Message-Id: <20181008152557.14275-12-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 08 Oct 2018 15:26:54 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 11/19] vhost: introduce postcopy's advise message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch opens a userfaultfd and sends it back to Qemu's VHOST_USER_POSTCOPY_ADVISE request. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost.h | 2 ++ lib/librte_vhost/vhost_user.c | 49 +++++++++++++++++++++++++++++++++++ lib/librte_vhost/vhost_user.h | 3 ++- 3 files changed, 53 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 25ffd7614..21722d8a8 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -363,6 +363,8 @@ struct virtio_net { int slave_req_fd; rte_spinlock_t slave_req_lock; + int postcopy_ufd; + /* * Device id to identify a specific backend device. * It's set to -1 for the default software implementation. 
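The advise step described in this patch (open a userfaultfd, negotiate the API, keep the descriptor so it can be handed back) can be sketched in isolation roughly as follows. This is an illustrative stand-alone sketch, not the patch's code: example_open_userfaultfd() and EXAMPLE_HAVE_UFFD are hypothetical names, and the __has_include probe is an assumption used so the snippet also builds where the header is absent.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>

#if defined(__has_include)
#  if __has_include(<linux/userfaultfd.h>)
#    include <linux/userfaultfd.h>
#    define EXAMPLE_HAVE_UFFD 1	/* hypothetical helper macro */
#  endif
#endif

/*
 * Open a userfaultfd and negotiate the API version.
 * Returns the descriptor on success, -1 on failure or when the
 * feature is not available at build time.
 */
static int
example_open_userfaultfd(void)
{
#if defined(EXAMPLE_HAVE_UFFD) && defined(__NR_userfaultfd)
	struct uffdio_api api;
	int ufd = (int)syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);

	if (ufd < 0)
		return -1;	/* kernel too old or not permitted */

	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = 0;	/* request no optional features */
	if (ioctl(ufd, UFFDIO_API, &api) < 0) {
		close(ufd);	/* do not leak the fd on failure */
		return -1;
	}
	return ufd;
#else
	return -1;
#endif
}
```

The helper never returns a closed descriptor. A caller that stores the result in a long-lived field would probably want to reset that field to -1 on failure as well, so that a later cleanup path does not close it a second time.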
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 20f38267d..3cdd2af28 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -24,13 +24,19 @@ #include #include #include +#include +#include #include #include #include +#include #include #ifdef RTE_LIBRTE_VHOST_NUMA #include #endif +#ifdef RTE_LIBRTE_VHOST_POSTCOPY +#include +#endif #include #include @@ -69,6 +75,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_IOTLB_MSG] = "VHOST_USER_IOTLB_MSG", [VHOST_USER_CRYPTO_CREATE_SESS] = "VHOST_USER_CRYPTO_CREATE_SESS", [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", + [VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE", }; /* The possible results of a message handling function */ @@ -130,6 +137,11 @@ vhost_backend_cleanup(struct virtio_net *dev) close(dev->slave_req_fd); dev->slave_req_fd = -1; } + + if (dev->postcopy_ufd >= 0) { + close(dev->postcopy_ufd); + dev->postcopy_ufd = -1; + } } /* @@ -1508,6 +1520,42 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg, return VH_RESULT_OK; } +static int +vhost_user_set_postcopy_advise(struct virtio_net **pdev, + struct VhostUserMsg *msg, + int main_fd __rte_unused) +{ + struct virtio_net *dev = *pdev; +#ifdef RTE_LIBRTE_VHOST_POSTCOPY + struct uffdio_api api_struct; + + dev->postcopy_ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK); + + if (dev->postcopy_ufd == -1) { + RTE_LOG(ERR, VHOST_CONFIG, "Userfaultfd not available: %s\n", + strerror(errno)); + return VH_RESULT_ERR; + } + api_struct.api = UFFD_API; + api_struct.features = 0; + if (ioctl(dev->postcopy_ufd, UFFDIO_API, &api_struct)) { + RTE_LOG(ERR, VHOST_CONFIG, "UFFDIO_API ioctl failure: %s\n", + strerror(errno)); + close(dev->postcopy_ufd); + return VH_RESULT_ERR; + } + msg->fds[0] = dev->postcopy_ufd; + msg->fd_num = 1; + + return VH_RESULT_REPLY; +#else + dev->postcopy_ufd = -1; + msg->fd_num = 0; + + return VH_RESULT_ERR; +#endif +} 
+ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev, struct VhostUserMsg *msg, int main_fd); @@ -1535,6 +1583,7 @@ static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = { [VHOST_USER_NET_SET_MTU] = vhost_user_net_set_mtu, [VHOST_USER_SET_SLAVE_REQ_FD] = vhost_user_set_req_fd, [VHOST_USER_IOTLB_MSG] = vhost_user_iotlb_msg, + [VHOST_USER_POSTCOPY_ADVISE] = vhost_user_set_postcopy_advise, }; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index dd0262f8f..2030b40a5 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -50,7 +50,8 @@ typedef enum VhostUserRequest { VHOST_USER_IOTLB_MSG = 22, VHOST_USER_CRYPTO_CREATE_SESS = 26, VHOST_USER_CRYPTO_CLOSE_SESS = 27, - VHOST_USER_MAX = 28 + VHOST_USER_POSTCOPY_ADVISE = 28, + VHOST_USER_MAX = 29 } VhostUserRequest; typedef enum VhostUserSlaveRequest { From patchwork Mon Oct 8 15:25:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46272 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DB5A31B20A; Mon, 8 Oct 2018 17:27:00 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id DB7BF1B201; Mon, 8 Oct 2018 17:26:58 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 38ADB3082133; Mon, 8 Oct 2018 15:26:58 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id B316219742; Mon, 8 Oct 2018 15:26:54 +0000 (UTC) From: Maxime Coquelin To: 
dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:50 +0200 Message-Id: <20181008152557.14275-13-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Mon, 08 Oct 2018 15:26:58 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 12/19] vhost: add support for postcopy's listen message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost.h | 1 + lib/librte_vhost/vhost_user.c | 21 +++++++++++++++++++++ lib/librte_vhost/vhost_user.h | 3 ++- 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 21722d8a8..9453cb28d 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -364,6 +364,7 @@ struct virtio_net { rte_spinlock_t slave_req_lock; int postcopy_ufd; + int postcopy_listening; /* * Device id to identify a specific backend device. 
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 3cdd2af28..7a79145c2 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -76,6 +76,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_CRYPTO_CREATE_SESS] = "VHOST_USER_CRYPTO_CREATE_SESS", [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", [VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE", + [VHOST_USER_POSTCOPY_LISTEN] = "VHOST_USER_POSTCOPY_LISTEN", }; /* The possible results of a message handling function */ @@ -142,6 +143,8 @@ vhost_backend_cleanup(struct virtio_net *dev) close(dev->postcopy_ufd); dev->postcopy_ufd = -1; } + + dev->postcopy_listening = 0; } /* @@ -1556,6 +1559,23 @@ vhost_user_set_postcopy_advise(struct virtio_net **pdev, #endif } +static int +vhost_user_set_postcopy_listen(struct virtio_net **pdev, + struct VhostUserMsg *msg __rte_unused, + int main_fd __rte_unused) +{ + struct virtio_net *dev = *pdev; + + if (dev->mem && dev->mem->nregions) { + RTE_LOG(ERR, VHOST_CONFIG, + "Regions already registered at postcopy-listen\n"); + return VH_RESULT_ERR; + } + dev->postcopy_listening = 1; + + return VH_RESULT_OK; +} + typedef int (*vhost_message_handler_t)(struct virtio_net **pdev, struct VhostUserMsg *msg, int main_fd); @@ -1584,6 +1604,7 @@ static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = { [VHOST_USER_SET_SLAVE_REQ_FD] = vhost_user_set_req_fd, [VHOST_USER_IOTLB_MSG] = vhost_user_iotlb_msg, [VHOST_USER_POSTCOPY_ADVISE] = vhost_user_set_postcopy_advise, + [VHOST_USER_POSTCOPY_LISTEN] = vhost_user_set_postcopy_listen, }; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index 2030b40a5..55d8659ba 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -51,7 +51,8 @@ typedef enum VhostUserRequest { VHOST_USER_CRYPTO_CREATE_SESS = 26, VHOST_USER_CRYPTO_CLOSE_SESS = 27, VHOST_USER_POSTCOPY_ADVISE = 28, - VHOST_USER_MAX = 29 
+ VHOST_USER_POSTCOPY_LISTEN = 29, + VHOST_USER_MAX = 30 } VhostUserRequest; typedef enum VhostUserSlaveRequest { From patchwork Mon Oct 8 15:25:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46273 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9EEBC1B274; Mon, 8 Oct 2018 17:27:03 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id CFAD51B1F3; Mon, 8 Oct 2018 17:27:02 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E19A15F799; Mon, 8 Oct 2018 15:27:01 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9C3FF194A8; Mon, 8 Oct 2018 15:26:58 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:51 +0200 Message-Id: <20181008152557.14275-14-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Mon, 08 Oct 2018 15:27:02 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 13/19] vhost: register new regions with userfaultfd X-BeenThere: dev@dpdk.org 
X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 7a79145c2..46c97836a 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -975,6 +975,32 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, mmap_size, alignment, mmap_offset); + + if (dev->postcopy_listening) { +#ifdef RTE_LIBRTE_VHOST_POSTCOPY + struct uffdio_register reg_struct; + + reg_struct.range.start = (uint64_t)(uintptr_t)mmap_addr; + reg_struct.range.len = mmap_size; + reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING; + + if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, + &reg_struct)) { + RTE_LOG(ERR, VHOST_CONFIG, + "Failed to register ufd for region %d: (ufd = %d) %s\n", + i, dev->postcopy_ufd, + strerror(errno)); + goto err_mmap; + } + RTE_LOG(INFO, VHOST_CONFIG, + "\t userfaultfd registered for range : %llx - %llx\n", + reg_struct.range.start, + reg_struct.range.start + + reg_struct.range.len - 1); +#else + goto err_mmap; +#endif + } } for (i = 0; i < dev->nr_vring; i++) { From patchwork Mon Oct 8 15:25:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46274 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 434401B295; Mon, 8 Oct 2018 17:27:08 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 530FE1B1EC; Mon, 8 Oct 2018 17:27:06
+0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8856C3091D4F; Mon, 8 Oct 2018 15:27:05 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5AB07170FE; Mon, 8 Oct 2018 15:27:02 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:52 +0200 Message-Id: <20181008152557.14275-15-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Mon, 08 Oct 2018 15:27:05 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 14/19] vhost: avoid useless VhostUserMemory copy X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The VHOST_USER_SET_MEM_TABLE payload is copied when handled, whereas it could directly be referenced. This is not very important, but next, we'll need to update the payload and send it back to Qemu. Signed-off-by: Dr. 
David Alan Gilbert Signed-off-by: Maxime Coquelin Acked-by: Ilya Maximets --- lib/librte_vhost/vhost_user.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 46c97836a..9a50b962b 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -836,7 +836,7 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, int main_fd __rte_unused) { struct virtio_net *dev = *pdev; - struct VhostUserMemory memory = msg->payload.memory; + struct VhostUserMemory *memory = &msg->payload.memory; struct rte_vhost_mem_region *reg; void *mmap_addr; uint64_t mmap_size; @@ -846,17 +846,17 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, int populate; int fd; - if (memory.nregions > VHOST_MEMORY_MAX_NREGIONS) { + if (memory->nregions > VHOST_MEMORY_MAX_NREGIONS) { RTE_LOG(ERR, VHOST_CONFIG, - "too many memory regions (%u)\n", memory.nregions); + "too many memory regions (%u)\n", memory->nregions); return VH_RESULT_ERR; } - if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) { + if (dev->mem && !vhost_memory_changed(memory, dev->mem)) { RTE_LOG(INFO, VHOST_CONFIG, "(%d) memory regions not changed\n", dev->vid); - for (i = 0; i < memory.nregions; i++) + for (i = 0; i < memory->nregions; i++) close(msg->fds[i]); return VH_RESULT_OK; @@ -888,25 +888,25 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, } dev->mem = rte_zmalloc("vhost-mem-table", sizeof(struct rte_vhost_memory) + - sizeof(struct rte_vhost_mem_region) * memory.nregions, 0); + sizeof(struct rte_vhost_mem_region) * memory->nregions, 0); if (dev->mem == NULL) { RTE_LOG(ERR, VHOST_CONFIG, "(%d) failed to allocate memory for dev->mem\n", dev->vid); return VH_RESULT_ERR; } - dev->mem->nregions = memory.nregions; + dev->mem->nregions = memory->nregions; - for (i = 0; i < memory.nregions; i++) { + for (i = 0; i < 
memory->nregions; i++) { fd = msg->fds[i]; reg = &dev->mem->regions[i]; - reg->guest_phys_addr = memory.regions[i].guest_phys_addr; - reg->guest_user_addr = memory.regions[i].userspace_addr; - reg->size = memory.regions[i].memory_size; + reg->guest_phys_addr = memory->regions[i].guest_phys_addr; + reg->guest_user_addr = memory->regions[i].userspace_addr; + reg->size = memory->regions[i].memory_size; reg->fd = fd; - mmap_offset = memory.regions[i].mmap_offset; + mmap_offset = memory->regions[i].mmap_offset; /* Check for memory_size + mmap_offset overflow */ if (mmap_offset >= -reg->size) { From patchwork Mon Oct 8 15:25:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46275 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B16D61B2AE; Mon, 8 Oct 2018 17:27:11 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 0FA7B1B187; Mon, 8 Oct 2018 17:27:10 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3F0B5308213B; Mon, 8 Oct 2018 15:27:09 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id E1AF219742; Mon, 8 Oct 2018 15:27:05 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:53 +0200 Message-Id: 
<20181008152557.14275-16-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Mon, 08 Oct 2018 15:27:09 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 15/19] vhost: send userfault range addresses back to qemu X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 49 ++++++++++++++++++++++++++++++++--- 1 file changed, 46 insertions(+), 3 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 9a50b962b..75e555515 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -89,6 +89,11 @@ enum vh_result { VH_RESULT_REPLY = 1, }; +static int +send_vhost_reply(int sockfd, struct VhostUserMsg *msg); +static int +read_vhost_message(int sockfd, struct VhostUserMsg *msg); + static uint64_t get_blk_size(int fd) { @@ -833,7 +838,7 @@ vhost_memory_changed(struct VhostUserMemory *new, static int vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, - int main_fd __rte_unused) + int main_fd) { struct virtio_net *dev = *pdev; struct VhostUserMemory *memory = &msg->payload.memory; @@ -977,11 +982,49 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg, mmap_offset); if (dev->postcopy_listening) { + /* + * We haven't a better way right now than sharing + * DPDK's virtual address with Qemu, so that Qemu can + * retrieve the region offset when handling userfaults. 
+			 */
+			memory->regions[i].userspace_addr =
+				reg->host_user_addr;
+		}
+	}
+	if (dev->postcopy_listening) {
+		/* Send the addresses back to qemu */
+		msg->fd_num = 0;
+		send_vhost_reply(main_fd, msg);
+
+		/* Wait for qemu to acknowledge it's got the addresses
+		 * we've got to wait before we're allowed to generate faults.
+		 */
+		VhostUserMsg ack_msg;
+		if (read_vhost_message(main_fd, &ack_msg) <= 0) {
+			RTE_LOG(ERR, VHOST_CONFIG,
+				"Failed to read qemu ack on postcopy set-mem-table\n");
+			goto err_mmap;
+		}
+		if (ack_msg.request.master != VHOST_USER_SET_MEM_TABLE) {
+			RTE_LOG(ERR, VHOST_CONFIG,
+				"Bad qemu ack on postcopy set-mem-table (%d)\n",
+				ack_msg.request.master);
+			goto err_mmap;
+		}
+
+		/* Now userfault register and we can use the memory */
+		for (i = 0; i < memory->nregions; i++) {
 #ifdef RTE_LIBRTE_VHOST_POSTCOPY
+			reg = &dev->mem->regions[i];
 			struct uffdio_register reg_struct;
-			reg_struct.range.start = (uint64_t)(uintptr_t)mmap_addr;
-			reg_struct.range.len = mmap_size;
+			/*
+			 * Let's register all the mmap'ed area to ensure
+			 * alignment on page boundary.
+ */ + reg_struct.range.start = + (uint64_t)(uintptr_t)reg->mmap_addr; + reg_struct.range.len = reg->mmap_size; reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING; if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, From patchwork Mon Oct 8 15:25:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46276 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id CE7161B3A1; Mon, 8 Oct 2018 17:27:14 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 92B361B17C; Mon, 8 Oct 2018 17:27:13 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C75143084293; Mon, 8 Oct 2018 15:27:12 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9D35F19742; Mon, 8 Oct 2018 15:27:09 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:54 +0200 Message-Id: <20181008152557.14275-17-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.40]); Mon, 08 Oct 2018 15:27:13 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 
16/19] vhost: add support to postcopy's end request X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The master sends this message before stopping handling userfaults, so that the backend closes the userfaultfd. The master waits for the slave to acknowledge the request with an empty 64bits payload for synchronization purpose. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 21 +++++++++++++++++++++ lib/librte_vhost/vhost_user.h | 3 ++- 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 75e555515..7c588ad54 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -77,6 +77,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", [VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE", [VHOST_USER_POSTCOPY_LISTEN] = "VHOST_USER_POSTCOPY_LISTEN", + [VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END", }; /* The possible results of a message handling function */ @@ -1645,6 +1646,25 @@ vhost_user_set_postcopy_listen(struct virtio_net **pdev, return VH_RESULT_OK; } +static int +vhost_user_postcopy_end(struct virtio_net **pdev, struct VhostUserMsg *msg, + int main_fd __rte_unused) +{ + struct virtio_net *dev = *pdev; + + dev->postcopy_listening = 0; + if (dev->postcopy_ufd >= 0) { + close(dev->postcopy_ufd); + dev->postcopy_ufd = -1; + } + + msg->payload.u64 = 0; + msg->size = sizeof(msg->payload.u64); + msg->fd_num = 0; + + return VH_RESULT_REPLY; +} + typedef int (*vhost_message_handler_t)(struct virtio_net **pdev, struct VhostUserMsg *msg, int main_fd); @@ -1674,6 +1694,7 @@ static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = { 
[VHOST_USER_IOTLB_MSG] = vhost_user_iotlb_msg, [VHOST_USER_POSTCOPY_ADVISE] = vhost_user_set_postcopy_advise, [VHOST_USER_POSTCOPY_LISTEN] = vhost_user_set_postcopy_listen, + [VHOST_USER_POSTCOPY_END] = vhost_user_postcopy_end, }; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index 55d8659ba..73b1fe2b9 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -52,7 +52,8 @@ typedef enum VhostUserRequest { VHOST_USER_CRYPTO_CLOSE_SESS = 27, VHOST_USER_POSTCOPY_ADVISE = 28, VHOST_USER_POSTCOPY_LISTEN = 29, - VHOST_USER_MAX = 30 + VHOST_USER_POSTCOPY_END = 30, + VHOST_USER_MAX = 31 } VhostUserRequest; typedef enum VhostUserSlaveRequest { From patchwork Mon Oct 8 15:25:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46277 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 749F31B1AB; Mon, 8 Oct 2018 17:27:19 +0200 (CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id AD6D51B14D; Mon, 8 Oct 2018 17:27:17 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F237D5F799; Mon, 8 Oct 2018 15:27:16 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 31D56194A8; Mon, 8 Oct 2018 15:27:12 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: 
dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:55 +0200 Message-Id: <20181008152557.14275-18-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Mon, 08 Oct 2018 15:27:17 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 17/19] vhost: enable postcopy protocol feature X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Enable postcopy protocol feature except if dequeue zero-copy is enabled. In this case, guest memory requires to be populated, which is not compatible with userfaultfd. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 7 +++++++ lib/librte_vhost/vhost_user.h | 3 ++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 7c588ad54..580669630 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1318,6 +1318,13 @@ vhost_user_get_protocol_features(struct virtio_net **pdev, if (!(features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))) protocol_features &= ~(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK); + /* + * If dequeue zerocopy is enabled, guest memory requires to be + * populated, which is not compatible with postcopy. 
+	 */
+	if (dev->dequeue_zero_copy)
+		protocol_features &= ~(1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT);
+
 	msg->payload.u64 = protocol_features;
 	msg->size = sizeof(msg->payload.u64);
 	msg->fd_num = 0;
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 73b1fe2b9..dc97be843 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -22,7 +22,8 @@
 					 (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER))
+					 (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
+					 (1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT))
 
 typedef enum VhostUserRequest {
 	VHOST_USER_NONE = 0,

From patchwork Mon Oct 8 15:25:56 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maxime Coquelin
X-Patchwork-Id: 46278
X-Patchwork-Delegate: maxime.coquelin@redhat.com
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 1BD701B396; Mon, 8 Oct 2018 17:27:23 +0200 (CEST)
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 9D7371B188; Mon, 8 Oct 2018 17:27:21 +0200 (CEST)
Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id EC9117F6CB; Mon, 8 Oct 2018 15:27:20 +0000 (UTC)
Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 543B05783; Mon, 8 Oct 2018 15:27:17 +0000 (UTC)
From: Maxime Coquelin
To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com,
 alejandro.lucero@netronome.com
Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin
Date: Mon, 8 Oct 2018 17:25:56 +0200
Message-Id: <20181008152557.14275-19-maxime.coquelin@redhat.com>
In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com>
References: <20181008152557.14275-1-maxime.coquelin@redhat.com>
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Mon, 08 Oct 2018 15:27:21 +0000 (UTC)
Subject: [dpdk-dev] [PATCH v4 18/19] vhost: add flag to enable postcopy live-migration
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

The postcopy live-migration feature requires the application not to
populate the guest memory. As the vhost library cannot prevent the
application from doing so (e.g. by preventing it from calling
mlockall()), the feature is disabled by default.

The application should only enable the feature if it does not force
the guest memory to be populated.

If the user passes the RTE_VHOST_USER_POSTCOPY_SUPPORT flag at
registration but the feature was not compiled in, registration fails.

Signed-off-by: Maxime Coquelin
---
 doc/guides/prog_guide/vhost_lib.rst |  8 ++++++++
 lib/librte_vhost/rte_vhost.h        |  1 +
 lib/librte_vhost/socket.c           | 19 +++++++++++++++++--
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 77af4d775..c77df338f 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -106,6 +106,14 @@ The following is an overview of some key Vhost API functions:
     Enabling this flag with these Qemu version results in Qemu being blocked
     when multiple queue pairs are declared.
 
+  - ``RTE_VHOST_USER_POSTCOPY_SUPPORT``
+
+    Postcopy live-migration support will be enabled when this flag is set.
+    It is disabled by default.
+
+    Enabling this flag should only be done when the calling application does
+    not pre-fault the guest shared memory, otherwise migration would fail.
+
 * ``rte_vhost_driver_set_features(path, features)``
 
   This function sets the feature bits the vhost-user driver supports. The
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index b3cc6990d..b26afbffa 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -28,6 +28,7 @@ extern "C" {
 #define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
 #define RTE_VHOST_USER_DEQUEUE_ZERO_COPY	(1ULL << 2)
 #define RTE_VHOST_USER_IOMMU_SUPPORT	(1ULL << 3)
+#define RTE_VHOST_USER_POSTCOPY_SUPPORT	(1ULL << 4)
 
 /** Protocol features. */
 #ifndef VHOST_USER_PROTOCOL_F_MQ
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 7cad5593e..bdce26ff9 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -51,6 +51,8 @@ struct vhost_user_socket {
 	uint64_t supported_features;
 	uint64_t features;
 
+	uint64_t protocol_features;
+
 	/*
 	 * Device id to identify a specific backend device.
 	 * It's set to -1 for the default software implementation.
@@ -735,7 +737,7 @@ rte_vhost_driver_get_protocol_features(const char *path, did = vsocket->vdpa_dev_id; vdpa_dev = rte_vdpa_get_device(did); if (!vdpa_dev || !vdpa_dev->ops->get_protocol_features) { - *protocol_features = VHOST_USER_PROTOCOL_FEATURES; + *protocol_features = vsocket->protocol_features; goto unlock_exit; } @@ -748,7 +750,7 @@ rte_vhost_driver_get_protocol_features(const char *path, goto unlock_exit; } - *protocol_features = VHOST_USER_PROTOCOL_FEATURES + *protocol_features = vsocket->protocol_features & vdpa_protocol_features; unlock_exit: @@ -867,6 +869,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags) vsocket->use_builtin_virtio_net = true; vsocket->supported_features = VIRTIO_NET_SUPPORTED_FEATURES; vsocket->features = VIRTIO_NET_SUPPORTED_FEATURES; + vsocket->protocol_features = VHOST_USER_PROTOCOL_FEATURES; /* Dequeue zero copy can't assure descriptors returned in order */ if (vsocket->dequeue_zero_copy) { @@ -879,6 +882,18 @@ rte_vhost_driver_register(const char *path, uint64_t flags) vsocket->features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM); } + if (!(flags & RTE_VHOST_USER_POSTCOPY_SUPPORT)) { + vsocket->protocol_features &= + ~(1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT); + } else { +#ifndef RTE_LIBRTE_VHOST_POSTCOPY + RTE_LOG(ERR, VHOST_CONFIG, + "Postcopy requested but not compiled\n"); + ret = -1; + goto out_mutex; +#endif + } + if ((flags & RTE_VHOST_USER_CLIENT) != 0) { vsocket->reconnect = !(flags & RTE_VHOST_USER_NO_RECONNECT); if (vsocket->reconnect && reconn_tid == 0) { From patchwork Mon Oct 8 15:25:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 46279 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3E79F1B17D; Mon, 8 Oct 2018 17:27:31 +0200 
(CEST) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 327841B17D; Mon, 8 Oct 2018 17:27:30 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 78A42A5D9B; Mon, 8 Oct 2018 15:27:29 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-51.ams2.redhat.com [10.36.112.51]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7671D194A8; Mon, 8 Oct 2018 15:27:21 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com, nicknickolaev@gmail.com, i.maximets@samsung.com, bruce.richardson@intel.com, alejandro.lucero@netronome.com Cc: dgilbert@redhat.com, stable@dpdk.org, Maxime Coquelin Date: Mon, 8 Oct 2018 17:25:57 +0200 Message-Id: <20181008152557.14275-20-maxime.coquelin@redhat.com> In-Reply-To: <20181008152557.14275-1-maxime.coquelin@redhat.com> References: <20181008152557.14275-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 08 Oct 2018 15:27:29 +0000 (UTC) Subject: [dpdk-dev] [PATCH v4 19/19] net/vhost: add parameter to enable postcopy support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Introduce a new postcopy-support parameter to Vhost PMD that passes the RTE_VHOST_USER_POSTCOPY_SUPPORT flag at vhost device register time. Flag should only be set if application does not prefault guest memory using, for example, mlockall() syscall. Default value is 0, meaning that postcopy support is disabled unless specified explicitly. 
Example to enable postcopy support for a given device:

--vdev 'net_vhost0,iface=/tmp/vhost-user1,postcopy-support=1'

Signed-off-by: Maxime Coquelin
---
 doc/guides/nics/vhost.rst         |  5 +++++
 drivers/net/vhost/rte_eth_vhost.c | 13 +++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/doc/guides/nics/vhost.rst b/doc/guides/nics/vhost.rst
index 4f7ae8990..23f2e87aa 100644
--- a/doc/guides/nics/vhost.rst
+++ b/doc/guides/nics/vhost.rst
@@ -71,6 +71,11 @@ The user can specify below arguments in `--vdev` option.
     It is used to enable iommu support in vhost library.
     (Default: 0 (disabled))
 
+#. ``postcopy-support``:
+
+    It is used to enable postcopy live-migration support in vhost library.
+    (Default: 0 (disabled))
+
 Vhost PMD event handling
 ------------------------
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index aa6052221..1330f06ba 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -30,6 +30,7 @@ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 #define ETH_VHOST_CLIENT_ARG		"client"
 #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
 #define ETH_VHOST_IOMMU_SUPPORT		"iommu-support"
+#define ETH_VHOST_POSTCOPY_SUPPORT	"postcopy-support"
 #define VHOST_MAX_PKT_BURST 32
 
 static const char *valid_arguments[] = {
@@ -38,6 +39,7 @@ static const char *valid_arguments[] = {
 	ETH_VHOST_CLIENT_ARG,
 	ETH_VHOST_DEQUEUE_ZERO_COPY,
 	ETH_VHOST_IOMMU_SUPPORT,
+	ETH_VHOST_POSTCOPY_SUPPORT,
 	NULL
 };
 
@@ -1339,6 +1341,7 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int client_mode = 0;
 	int dequeue_zero_copy = 0;
 	int iommu_support = 0;
+	int postcopy_support = 0;
 	struct rte_eth_dev *eth_dev;
 	const char *name = rte_vdev_device_name(dev);
 
@@ -1411,6 +1414,16 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 			flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
 	}
 
+	if (rte_kvargs_count(kvlist, ETH_VHOST_POSTCOPY_SUPPORT) == 1) {
+		ret = rte_kvargs_process(kvlist, ETH_VHOST_POSTCOPY_SUPPORT,
+					 &open_int, &postcopy_support);
+		if (ret < 0)
+			goto out_free;
+
+		if (postcopy_support)
+			flags |= RTE_VHOST_USER_POSTCOPY_SUPPORT;
+	}
+
 	if (dev->device.numa_node == SOCKET_ID_ANY)
 		dev->device.numa_node = rte_socket_id();