From patchwork Thu Aug 23 16:51:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43823 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id C54F14C97; Thu, 23 Aug 2018 18:52:10 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id F35782C60 for ; Thu, 23 Aug 2018 18:52:08 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 746E5401B3A6; Thu, 23 Aug 2018 16:52:08 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1B4FC2166BA1; Thu, 23 Aug 2018 16:52:06 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:48 +0200 Message-Id: <20180823165157.30001-2-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:08 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:08 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 01/10] vhost: define postcopy protocol flag X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/rte_vhost.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index b02673d4a..b3cc6990d 100644 --- a/lib/librte_vhost/rte_vhost.h +++ b/lib/librte_vhost/rte_vhost.h @@ -66,6 +66,10 @@ extern "C" { #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 11 #endif +#ifndef VHOST_USER_PROTOCOL_F_PAGEFAULT +#define VHOST_USER_PROTOCOL_F_PAGEFAULT 8 +#endif + /** Indicate whether protocol features negotiation is supported. */ #ifndef VHOST_USER_F_PROTOCOL_FEATURES #define VHOST_USER_F_PROTOCOL_FEATURES 30 From patchwork Thu Aug 23 16:51:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43824 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B4F1F4CA1; Thu, 23 Aug 2018 18:52:13 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 88C354C8D for ; Thu, 23 Aug 2018 18:52:10 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2C484401B3C4; Thu, 23 Aug 2018 16:52:10 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id B8BE62166BA1; Thu, 23 Aug 2018 16:52:08 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:49 +0200 Message-Id: <20180823165157.30001-3-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:10 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:10 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 02/10] vhost: add number of fds to vhost-user messages and use it X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" As soons as some anciliarry datai (fds) are received, it is copied without checking its length. This patch adds adds the number of fds received to the message, which is set in read_vhost_message(). This is preliminary work to support sending fds to Qemu. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/socket.c | 21 ++++++++++++++++----- lib/librte_vhost/vhost_user.c | 2 +- lib/librte_vhost/vhost_user.h | 4 +++- 3 files changed, 20 insertions(+), 7 deletions(-) diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c index d63031747..c04d3d305 100644 --- a/lib/librte_vhost/socket.c +++ b/lib/librte_vhost/socket.c @@ -94,18 +94,23 @@ static struct vhost_user vhost_user = { .mutex = PTHREAD_MUTEX_INITIALIZER, }; -/* return bytes# of read on success or negative val on failure. */ +/* + * return bytes# of read on success or negative val on failure. Update fdnum + * with number of fds read. + */ int -read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num) +read_fd_message(int sockfd, char *buf, int buflen, int *fds, int max_fds, + int *fd_num) { struct iovec iov; struct msghdr msgh; - size_t fdsize = fd_num * sizeof(int); - char control[CMSG_SPACE(fdsize)]; + char control[CMSG_SPACE(max_fds * sizeof(int))]; struct cmsghdr *cmsg; int got_fds = 0; int ret; + *fd_num = 0; + memset(&msgh, 0, sizeof(msgh)); iov.iov_base = buf; iov.iov_len = buflen; @@ -131,13 +136,19 @@ read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num) if ((cmsg->cmsg_level == SOL_SOCKET) && (cmsg->cmsg_type == SCM_RIGHTS)) { got_fds = (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int); + if (got_fds > max_fds) { + RTE_LOG(ERR, VHOST_CONFIG, + "Received msg contains more fds than supported\n"); + return -1; + } + *fd_num = got_fds; memcpy(fds, CMSG_DATA(cmsg), got_fds * sizeof(int)); break; } } /* Clear out unused file descriptors */ - while (got_fds < fd_num) + while (got_fds < max_fds) fds[got_fds++] = -1; return ret; diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index a2d4c9ffc..0a7cbfa14 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1410,7 +1410,7 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg) int ret; ret = read_fd_message(sockfd, (char *)msg, VHOST_USER_HDR_SIZE, - msg->fds, VHOST_MEMORY_MAX_NREGIONS); + msg->fds, VHOST_MEMORY_MAX_NREGIONS, &msg->fd_num); if (ret <= 0) return ret; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index 42166adf2..dd0262f8f 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -132,6 +132,7 @@ typedef struct VhostUserMsg { VhostUserVringArea area; } payload; int fds[VHOST_MEMORY_MAX_NREGIONS]; + int fd_num; } __attribute((packed)) VhostUserMsg; #define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64) @@ -146,7 +147,8 @@ int vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm); int vhost_user_host_notifier_ctrl(int vid, bool enable); /* socket.c */ -int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num); +int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int max_fds, + int *fd_num); int send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num); #endif From patchwork Thu Aug 23 16:51:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43825 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id BD7EA4CAD; Thu, 23 Aug 2018 18:52:15 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id BFBA44C9F for ; Thu, 23 Aug 2018 18:52:12 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 661A6401B1F7; Thu, 23 Aug 2018 16:52:12 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 72D182166BA1; Thu, 23 Aug 2018 16:52:10 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:50 +0200 Message-Id: <20180823165157.30001-4-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:12 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:12 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 03/10] vhost: enable fds passing when sending vhost-user messages X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Passing userfault fds to Qemu will be required for postcopy live-migration feature. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 0a7cbfa14..9d1e9cf75 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1168,6 +1168,7 @@ vhost_user_get_protocol_features(struct virtio_net *dev, msg->payload.u64 = protocol_features; msg->size = sizeof(msg->payload.u64); + msg->fd_num = 0; } static void @@ -1434,13 +1435,13 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg) } static int -send_vhost_message(int sockfd, struct VhostUserMsg *msg, int *fds, int fd_num) +send_vhost_message(int sockfd, struct VhostUserMsg *msg) { if (!msg) return 0; return send_fd_message(sockfd, (char *)msg, - VHOST_USER_HDR_SIZE + msg->size, fds, fd_num); + VHOST_USER_HDR_SIZE + msg->size, msg->fds, msg->fd_num); } static int @@ -1454,19 +1455,18 @@ send_vhost_reply(int sockfd, struct VhostUserMsg *msg) msg->flags |= VHOST_USER_VERSION; msg->flags |= VHOST_USER_REPLY_MASK; - return send_vhost_message(sockfd, msg, NULL, 0); + return send_vhost_message(sockfd, msg); } static int -send_vhost_slave_message(struct virtio_net *dev, struct VhostUserMsg *msg, - int *fds, int fd_num) +send_vhost_slave_message(struct virtio_net *dev, struct VhostUserMsg *msg) { int ret; if (msg->flags & VHOST_USER_NEED_REPLY) rte_spinlock_lock(&dev->slave_req_lock); - ret = send_vhost_message(dev->slave_req_fd, msg, fds, fd_num); + ret = send_vhost_message(dev->slave_req_fd, msg); if (ret < 0 && (msg->flags & VHOST_USER_NEED_REPLY)) rte_spinlock_unlock(&dev->slave_req_lock); @@ -1651,6 +1651,7 @@ vhost_user_msg_handler(int vid, int fd) case VHOST_USER_GET_FEATURES: msg.payload.u64 = vhost_user_get_features(dev); msg.size = sizeof(msg.payload.u64); + msg.fd_num = 0; send_vhost_reply(fd, &msg); break; case VHOST_USER_SET_FEATURES: @@ -1683,6 +1684,7 @@ vhost_user_msg_handler(int vid, int fd) /* it needs a reply */ msg.size = sizeof(msg.payload.u64); + msg.fd_num = 0; send_vhost_reply(fd, &msg); break; case VHOST_USER_SET_LOG_FD: @@ -1722,6 +1724,7 @@ vhost_user_msg_handler(int vid, int fd) case VHOST_USER_GET_QUEUE_NUM: msg.payload.u64 = (uint64_t)vhost_user_get_queue_num(dev); msg.size = sizeof(msg.payload.u64); + msg.fd_num = 0; send_vhost_reply(fd, &msg); break; @@ -1769,6 +1772,7 @@ vhost_user_msg_handler(int vid, int fd) if (msg.flags & VHOST_USER_NEED_REPLY) { msg.payload.u64 = !!ret; msg.size = sizeof(msg.payload.u64); + msg.fd_num = 0; send_vhost_reply(fd, &msg); } @@ -1848,7 +1852,7 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm) }, }; - ret = send_vhost_message(dev->slave_req_fd, &msg, NULL, 0); + ret = send_vhost_message(dev->slave_req_fd, &msg); if (ret < 0) { RTE_LOG(ERR, VHOST_CONFIG, "Failed to send IOTLB miss message (%d)\n", @@ -1864,8 +1868,6 @@ static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev, uint64_t offset, uint64_t size) { - int *fdp = NULL; - size_t fd_num = 0; int ret; struct VhostUserMsg msg = { .request.slave = VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG, @@ -1881,11 +1883,11 @@ static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev, if (fd < 0) msg.payload.area.u64 |= VHOST_USER_VRING_NOFD_MASK; else { - fdp = &fd; - fd_num = 1; + msg.fds[0] = fd; + msg.fd_num = 1; } - ret = send_vhost_slave_message(dev, &msg, fdp, fd_num); + ret = send_vhost_slave_message(dev, &msg); if (ret < 0) { RTE_LOG(ERR, VHOST_CONFIG, "Failed to set host notifier (%d)\n", ret); From patchwork Thu Aug 23 16:51:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43826 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 952B34CB5; Thu, 23 Aug 2018 18:52:17 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 929CB4CA5 for ; Thu, 23 Aug 2018 18:52:14 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 39FB9401B1F7; Thu, 23 Aug 2018 16:52:14 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id AE3902166BA1; Thu, 23 Aug 2018 16:52:12 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:51 +0200 Message-Id: <20180823165157.30001-5-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:14 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:14 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 04/10] vhost: introduce postcopy's advise message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch opens a userfaultfd and sends it back to Qemu's VHOST_USER_POSTCOPY_ADVISE request. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost.h | 2 ++ lib/librte_vhost/vhost_user.c | 37 +++++++++++++++++++++++++++++++++++ lib/librte_vhost/vhost_user.h | 3 ++- 3 files changed, 41 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 760a09c0d..2606743d9 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -363,6 +363,8 @@ struct virtio_net { int slave_req_fd; rte_spinlock_t slave_req_lock; + int postcopy_ufd; + /* * Device id to identify a specific backend device. * It's set to -1 for the default software implementation. diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 9d1e9cf75..beca18a18 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -24,9 +24,13 @@ #include #include #include +#include +#include +#include #include #include #include +#include #include #ifdef RTE_LIBRTE_VHOST_NUMA #include @@ -69,6 +73,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_IOTLB_MSG] = "VHOST_USER_IOTLB_MSG", [VHOST_USER_CRYPTO_CREATE_SESS] = "VHOST_USER_CRYPTO_CREATE_SESS", [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", + [VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE", }; static uint64_t @@ -1404,6 +1409,33 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, struct VhostUserMsg *msg) return 0; } +static int +vhost_user_set_postcopy_advise(struct virtio_net *dev, struct VhostUserMsg *msg) +{ + struct uffdio_api api_struct; + + + dev->postcopy_ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK); + + if (dev->postcopy_ufd == -1) { + RTE_LOG(ERR, VHOST_CONFIG, "Userfaultfd not available: %s\n", + strerror(errno)); + return -1; + } + api_struct.api = UFFD_API; + api_struct.features = 0; + if (ioctl(dev->postcopy_ufd, UFFDIO_API, &api_struct)) { + RTE_LOG(ERR, VHOST_CONFIG, "UFFDIO_API ioctl failure: %s\n", + strerror(errno)); + close(dev->postcopy_ufd); + return -1; + } + msg->fds[0] = dev->postcopy_ufd; + msg->fd_num = 1; + + return 0; +} + /* return bytes# of read on success or negative val on failure. */ static int read_vhost_message(int sockfd, struct VhostUserMsg *msg) @@ -1747,6 +1779,11 @@ vhost_user_msg_handler(int vid, int fd) ret = vhost_user_iotlb_msg(&dev, &msg); break; + case VHOST_USER_POSTCOPY_ADVISE: + vhost_user_set_postcopy_advise(dev, &msg); + send_vhost_reply(fd, &msg); + break; + default: ret = -1; break; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index dd0262f8f..2030b40a5 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -50,7 +50,8 @@ typedef enum VhostUserRequest { VHOST_USER_IOTLB_MSG = 22, VHOST_USER_CRYPTO_CREATE_SESS = 26, VHOST_USER_CRYPTO_CLOSE_SESS = 27, - VHOST_USER_MAX = 28 + VHOST_USER_POSTCOPY_ADVISE = 28, + VHOST_USER_MAX = 29 } VhostUserRequest; typedef enum VhostUserSlaveRequest { From patchwork Thu Aug 23 16:51:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43827 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id ABC694F93; Thu, 23 Aug 2018 18:52:19 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id B272698 for ; Thu, 23 Aug 2018 18:52:16 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 337BD807689A; Thu, 23 Aug 2018 16:52:16 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 81F562166BA1; Thu, 23 Aug 2018 16:52:14 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:52 +0200 Message-Id: <20180823165157.30001-6-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 23 Aug 2018 16:52:16 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 23 Aug 2018 16:52:16 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 05/10] vhost: add support for postcopy's listen message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost.h | 1 + lib/librte_vhost/vhost_user.c | 18 ++++++++++++++++++ lib/librte_vhost/vhost_user.h | 4 +++- 3 files changed, 22 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 2606743d9..a4367152c 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -364,6 +364,7 @@ struct virtio_net { rte_spinlock_t slave_req_lock; int postcopy_ufd; + int postcopy_listening; /* * Device id to identify a specific backend device. diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index beca18a18..d93febbc3 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -74,6 +74,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_CRYPTO_CREATE_SESS] = "VHOST_USER_CRYPTO_CREATE_SESS", [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", [VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE", + [VHOST_USER_POSTCOPY_LISTEN] = "VHOST_USER_POSTCOPY_LISTEN", }; static uint64_t @@ -1436,6 +1437,19 @@ vhost_user_set_postcopy_advise(struct virtio_net *dev, struct VhostUserMsg *msg) return 0; } +static int +vhost_user_set_postcopy_listen(struct virtio_net *dev) +{ + if (dev->mem && dev->mem->nregions) { + RTE_LOG(ERR, VHOST_CONFIG, + "Regions already registered at postcopy-listen\n"); + return -1; + } + dev->postcopy_listening = 1; + + return 0; +} + /* return bytes# of read on success or negative val on failure. */ static int read_vhost_message(int sockfd, struct VhostUserMsg *msg) @@ -1784,6 +1798,10 @@ vhost_user_msg_handler(int vid, int fd) send_vhost_reply(fd, &msg); break; + case VHOST_USER_POSTCOPY_LISTEN: + ret = vhost_user_set_postcopy_listen(dev); + break; + default: ret = -1; break; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index 2030b40a5..73b1fe2b9 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -51,7 +51,9 @@ typedef enum VhostUserRequest { VHOST_USER_CRYPTO_CREATE_SESS = 26, VHOST_USER_CRYPTO_CLOSE_SESS = 27, VHOST_USER_POSTCOPY_ADVISE = 28, - VHOST_USER_MAX = 29 + VHOST_USER_POSTCOPY_LISTEN = 29, + VHOST_USER_POSTCOPY_END = 30, + VHOST_USER_MAX = 31 } VhostUserRequest; typedef enum VhostUserSlaveRequest { From patchwork Thu Aug 23 16:51:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43828 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 519E8532C; Thu, 23 Aug 2018 18:52:21 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 70FDC4CBD for ; Thu, 23 Aug 2018 18:52:18 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E005487A78; Thu, 23 Aug 2018 16:52:17 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 787622166BA1; Thu, 23 Aug 2018 16:52:16 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:53 +0200 Message-Id: <20180823165157.30001-7-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Thu, 23 Aug 2018 16:52:17 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Thu, 23 Aug 2018 16:52:17 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 06/10] vhost: register new regions with userfaultfd X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index d93febbc3..1c75d8015 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -926,6 +926,28 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) mmap_size, alignment, mmap_offset); + + if (dev->postcopy_listening) { + struct uffdio_register reg_struct; + + reg_struct.range.start = (uint64_t)(uintptr_t)mmap_addr; + reg_struct.range.len = mmap_size; + reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING; + + if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, + ®_struct)) { + RTE_LOG(ERR, VHOST_CONFIG, + "Failed to register ufd for region %d: (ufd = %d) %s\n", + i, dev->postcopy_ufd, + strerror(errno)); + continue; + } + RTE_LOG(INFO, VHOST_CONFIG, + "\t userfaultfd registered for range : %llx - %llx\n", + reg_struct.range.start, + reg_struct.range.start + + reg_struct.range.len - 1); + } } for (i = 0; i < dev->nr_vring; i++) { From patchwork Thu Aug 23 16:51:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43829 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D42CE5699; Thu, 23 Aug 2018 18:52:22 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 140164CC3 for ; Thu, 23 Aug 2018 18:52:20 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9EECA87A78; Thu, 23 Aug 2018 16:52:19 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 385BF2166BA1; Thu, 23 Aug 2018 16:52:18 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:54 +0200 Message-Id: <20180823165157.30001-8-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Thu, 23 Aug 2018 16:52:19 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Thu, 23 Aug 2018 16:52:19 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 07/10] vhost: avoid useless VhostUserMemory copy X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The VHOST_USER_SET_MEM_TABLE payload is copied when handled, whereas it could directly be referenced. This is not very important, but next, we'll need to update the payload and send it back to Qemu. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 1c75d8015..0861feff1 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -787,7 +787,7 @@ static int vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) { struct virtio_net *dev = *pdev; - struct VhostUserMemory memory = pmsg->payload.memory; + struct VhostUserMemory *memory = &pmsg->payload.memory; struct rte_vhost_mem_region *reg; void *mmap_addr; uint64_t mmap_size; @@ -797,17 +797,17 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) int populate; int fd; - if (memory.nregions > VHOST_MEMORY_MAX_NREGIONS) { + if (memory->nregions > VHOST_MEMORY_MAX_NREGIONS) { RTE_LOG(ERR, VHOST_CONFIG, - "too many memory regions (%u)\n", memory.nregions); + "too many memory regions (%u)\n", memory->nregions); return -1; } - if (dev->mem && !vhost_memory_changed(&memory, dev->mem)) { + if (dev->mem && !vhost_memory_changed(memory, dev->mem)) { RTE_LOG(INFO, VHOST_CONFIG, "(%d) memory regions not changed\n", dev->vid); - for (i = 0; i < memory.nregions; i++) + for (i = 0; i < memory->nregions; i++) close(pmsg->fds[i]); return 0; @@ -839,25 +839,25 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) } dev->mem = rte_zmalloc("vhost-mem-table", sizeof(struct rte_vhost_memory) + - sizeof(struct rte_vhost_mem_region) * memory.nregions, 0); + sizeof(struct rte_vhost_mem_region) * memory->nregions, 0); if (dev->mem == NULL) { RTE_LOG(ERR, VHOST_CONFIG, "(%d) failed to allocate memory for dev->mem\n", dev->vid); return -1; } - dev->mem->nregions = memory.nregions; + dev->mem->nregions = memory->nregions; - for (i = 0; i < memory.nregions; i++) { + for (i = 0; i < memory->nregions; i++) { fd = pmsg->fds[i]; reg = &dev->mem->regions[i]; - reg->guest_phys_addr = memory.regions[i].guest_phys_addr; - reg->guest_user_addr = memory.regions[i].userspace_addr; - reg->size = memory.regions[i].memory_size; + reg->guest_phys_addr = memory->regions[i].guest_phys_addr; + reg->guest_user_addr = memory->regions[i].userspace_addr; + reg->size = memory->regions[i].memory_size; reg->fd = fd; - mmap_offset = memory.regions[i].mmap_offset; + mmap_offset = memory->regions[i].mmap_offset; /* Check for memory_size + mmap_offset overflow */ if (mmap_offset >= -reg->size) { From patchwork Thu Aug 23 16:51:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43830 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 45D5558C4; Thu, 23 Aug 2018 18:52:27 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 4555B58C3 for ; Thu, 23 Aug 2018 18:52:25 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E1BDA401B1F7; Thu, 23 Aug 2018 16:52:24 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id E3A292166BA1; Thu, 23 Aug 2018 16:52:19 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:55 +0200 Message-Id: <20180823165157.30001-9-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:24 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 23 Aug 2018 16:52:24 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 08/10] vhost: send userfault range addresses back to qemu X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 48 ++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 4 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 0861feff1..29e3e2a07 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -77,6 +77,11 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_POSTCOPY_LISTEN] = "VHOST_USER_POSTCOPY_LISTEN", }; +static int +send_vhost_reply(int sockfd, struct VhostUserMsg *msg); +static int +read_vhost_message(int sockfd, struct VhostUserMsg *msg); + static uint64_t get_blk_size(int fd) { @@ -784,7 +789,8 @@ vhost_memory_changed(struct VhostUserMemory *new, } static int -vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) +vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg, + int main_fd) { struct virtio_net *dev = *pdev; struct VhostUserMemory *memory = &pmsg->payload.memory; @@ -928,10 +934,44 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *pmsg) mmap_offset); if (dev->postcopy_listening) { + /* + * We haven't a better way right now than sharing + * DPDK's virtual address with Qemu, so that Qemu can + * retreive the region offset when handling userfaults. + */ + memory->regions[i].userspace_addr = + (uint64_t)(uintptr_t)mmap_addr; + } + } + if (dev->postcopy_listening) { + /* Send the addresses back to qemu */ + pmsg->fd_num = 0; + send_vhost_reply(main_fd, pmsg); + + /* Wait for qemu to acknolwedge it's got the addresses + * we've got to wait before we're allowed to generate faults. + */ + VhostUserMsg ack_msg; + if (read_vhost_message(main_fd, &ack_msg) <= 0) { + RTE_LOG(ERR, VHOST_CONFIG, + "Failed to read qemu ack on postcopy set-mem-table\n"); + goto err_mmap; + } + if (ack_msg.request.master != VHOST_USER_SET_MEM_TABLE) { + RTE_LOG(ERR, VHOST_CONFIG, + "Bad qemu ack on postcopy set-mem-table (%d)\n", + ack_msg.request.master); + goto err_mmap; + } + + /* Now userfault register and we can use the memory */ + for (i = 0; i < memory->nregions; i++) { + reg = &dev->mem->regions[i]; struct uffdio_register reg_struct; - reg_struct.range.start = (uint64_t)(uintptr_t)mmap_addr; - reg_struct.range.len = mmap_size; + reg_struct.range.start = + (uint64_t)(uintptr_t)reg->mmap_addr; + reg_struct.range.len = reg->mmap_size; reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING; if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, @@ -1744,7 +1784,7 @@ vhost_user_msg_handler(int vid, int fd) break; case VHOST_USER_SET_MEM_TABLE: - ret = vhost_user_set_mem_table(&dev, &msg); + ret = vhost_user_set_mem_table(&dev, &msg, fd); break; case VHOST_USER_SET_LOG_BASE: From patchwork Thu Aug 23 16:51:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43831 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id A926F58FE; Thu, 23 Aug 2018 18:52:29 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 056B84D27 for ; Thu, 23 Aug 2018 18:52:27 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A28FB7A7EA; Thu, 23 Aug 2018 16:52:26 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 33C8D2166BA1; Thu, 23 Aug 2018 16:52:25 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:56 +0200 Message-Id: <20180823165157.30001-10-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Thu, 23 Aug 2018 16:52:26 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Thu, 23 Aug 2018 16:52:26 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 09/10] vhost: add support to postcopy's end request X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The master sends this message before stopping handling userfaults, so that the backend closes the userfaultfd. The master waits for the slave to acknowledge the request with an empty 64bits payload for synchronization purpose. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 29e3e2a07..54692775c 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -75,6 +75,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_CRYPTO_CLOSE_SESS] = "VHOST_USER_CRYPTO_CLOSE_SESS", [VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE", [VHOST_USER_POSTCOPY_LISTEN] = "VHOST_USER_POSTCOPY_LISTEN", + [VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END", }; static int @@ -1512,6 +1513,18 @@ vhost_user_set_postcopy_listen(struct virtio_net *dev) return 0; } +static int +vhost_user_postcopy_end(struct virtio_net *dev) +{ + dev->postcopy_listening = 0; + if (dev->postcopy_ufd > 0) { + close(dev->postcopy_ufd); + dev->postcopy_ufd = -1; + } + + return 0; +} + /* return bytes# of read on success or negative val on failure. */ static int read_vhost_message(int sockfd, struct VhostUserMsg *msg) @@ -1864,6 +1877,14 @@ vhost_user_msg_handler(int vid, int fd) ret = vhost_user_set_postcopy_listen(dev); break; + case VHOST_USER_POSTCOPY_END: + vhost_user_postcopy_end(dev); + msg.payload.u64 = 0; + msg.size = sizeof(msg.payload.u64); + msg.fd_num = 0; + send_vhost_reply(fd, &msg); + break; + default: ret = -1; break; From patchwork Thu Aug 23 16:51:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 43832 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 555DF5A6E; Thu, 23 Aug 2018 18:52:31 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id CC95858FA for ; Thu, 23 Aug 2018 18:52:28 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 758F97A7EA; Thu, 23 Aug 2018 16:52:28 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-46.ams2.redhat.com [10.36.112.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id ED4852166BA1; Thu, 23 Aug 2018 16:52:26 +0000 (UTC) From: Maxime Coquelin To: dev@dpdk.org, tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com Cc: dgilbert@redhat.com, Maxime Coquelin Date: Thu, 23 Aug 2018 18:51:57 +0200 Message-Id: <20180823165157.30001-11-maxime.coquelin@redhat.com> In-Reply-To: <20180823165157.30001-1-maxime.coquelin@redhat.com> References: <20180823165157.30001-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Thu, 23 Aug 2018 16:52:28 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Thu, 23 Aug 2018 16:52:28 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [RFC 10/10] vhost: enable postcopy protocol feature X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Enable postcopy protocol feature except if dequeue zero-copy is enabled. In this case, guest memory requires to be populated, which is not compatible with userfaultfd. Signed-off-by: Dr. David Alan Gilbert Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 7 +++++++ lib/librte_vhost/vhost_user.h | 3 ++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 54692775c..1a31d8551 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1235,6 +1235,13 @@ vhost_user_get_protocol_features(struct virtio_net *dev, if (!(features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))) protocol_features &= ~(1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK); + /* + * If dequeue zerocopy is enabled, guest memory requires to be + * populated, which is not compatible with postcopy. + */ + if (dev->dequeue_zero_copy) + protocol_features &= ~(1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT); + msg->payload.u64 = protocol_features; msg->size = sizeof(msg->payload.u64); msg->fd_num = 0; diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index 73b1fe2b9..dc97be843 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -22,7 +22,8 @@ (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \ (1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \ (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \ - (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER)) + (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \ + (1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT)) typedef enum VhostUserRequest { VHOST_USER_NONE = 0,