Message ID | 1450321921-27799-3-git-send-email-yuanhan.liu@linux.intel.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 9ABAB8DAC; Thu, 17 Dec 2015 04:12:01 +0100 (CET) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 80BD88D3C for <dev@dpdk.org>; Thu, 17 Dec 2015 04:11:56 +0100 (CET) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP; 16 Dec 2015 19:11:47 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,439,1444719600"; d="scan'208";a="873183519" Received: from yliu-dev.sh.intel.com ([10.239.66.49]) by orsmga002.jf.intel.com with ESMTP; 16 Dec 2015 19:11:46 -0800 From: Yuanhan Liu <yuanhan.liu@linux.intel.com> To: dev@dpdk.org Date: Thu, 17 Dec 2015 11:11:57 +0800 Message-Id: <1450321921-27799-3-git-send-email-yuanhan.liu@linux.intel.com> X-Mailer: git-send-email 1.9.0 In-Reply-To: <1450321921-27799-1-git-send-email-yuanhan.liu@linux.intel.com> References: <1449027793-30975-1-git-send-email-yuanhan.liu@linux.intel.com> <1450321921-27799-1-git-send-email-yuanhan.liu@linux.intel.com> Cc: "Michael S. Tsirkin" <mst@redhat.com>, Victor Kaplansky <vkaplans@redhat.com> Subject: [dpdk-dev] [PATCH v2 2/6] vhost: introduce vhost_log_write X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK <dev.dpdk.org> List-Unsubscribe: <http://dpdk.org/ml/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://dpdk.org/ml/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <http://dpdk.org/ml/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Commit Message
Yuanhan Liu
Dec. 17, 2015, 3:11 a.m. UTC
Introduce vhost_log_write() helper function to log the dirty pages we
touched. Page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
page is represented by 1 bit.

Therefore, vhost_log_write() simply finds the right bit for the related
page we are going to change, and sets it to 1. dev->log_base denotes the
start of the dirty page bitmap.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
---
 lib/librte_vhost/rte_virtio_net.h | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
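To make the bit layout concrete, here is a minimal sketch of the address-to-bit arithmetic the commit message describes. The helper names log_byte_index() and log_bit_mask() and the LOG_PAGE_SIZE macro are hypothetical, introduced only for illustration; the patch itself does this arithmetic inline in vhost_log_page().

```c
#include <stdint.h>

/* Mirrors VHOST_LOG_PAGE from the patch; the name is local to this sketch. */
#define LOG_PAGE_SIZE 4096

/* Hypothetical helper: index of the bitmap byte that holds the bit for addr. */
static inline uint64_t log_byte_index(uint64_t addr)
{
	uint64_t page = addr / LOG_PAGE_SIZE;

	return page / 8;
}

/* Hypothetical helper: mask selecting the bit for addr within that byte. */
static inline uint8_t log_bit_mask(uint64_t addr)
{
	uint64_t page = addr / LOG_PAGE_SIZE;

	return (uint8_t)(1u << (page % 8));
}
```

For example, guest address 0x5000 lies in page 5, so it maps to bit 5 (mask 0x20) of byte 0, while the first address of page 8 maps to bit 0 of byte 1.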
Comments
On 12/17/2015 11:11 AM, Yuanhan Liu wrote:
> Introduce vhost_log_write() helper function to log the dirty pages we
> touched. Page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
> page is represented by 1 bit.
>
> Therefore, vhost_log_write() simply finds the right bit for the related
> page we are going to change, and sets it to 1. dev->log_base denotes the
> start of the dirty page bitmap.
>
> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
> Signed-off-by: Victor Kaplansky <victork@redhat.com>
> ---
>  lib/librte_vhost/rte_virtio_net.h | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
>
> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> index 8acee02..5726683 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -40,6 +40,7 @@
>   */
>
>  #include <stdint.h>
> +#include <linux/vhost.h>
>  #include <linux/virtio_ring.h>
>  #include <linux/virtio_net.h>
>  #include <sys/eventfd.h>
> @@ -59,6 +60,8 @@ struct rte_mbuf;
>  /* Backend value set by guest. */
>  #define VIRTIO_DEV_STOPPED -1
>
> +#define VHOST_LOG_PAGE	4096
> +
>
>  /* Enum for virtqueue management. */
>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> @@ -205,6 +208,32 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
>  	return vhost_va;
>  }
>
> +static inline void __attribute__((always_inline))
> +vhost_log_page(uint8_t *log_base, uint64_t page)
> +{
> +	log_base[page / 8] |= 1 << (page % 8);
> +}
> +

Those logging functions are not supposed to be API. Could we move them
into an internal header file?

> +static inline void __attribute__((always_inline))
> +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
> +{
> +	uint64_t page;
> +

Before we log, we need a memory barrier to make sure updates are in place.

> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> +		   !dev->log_base || !len))
> +		return;
> +
> +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
> +		return;
> +
> +	page = addr / VHOST_LOG_PAGE;
> +	while (page * VHOST_LOG_PAGE < addr + len) {

Let us have a page_end var to make the code simpler?

> +		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
> +		page += VHOST_LOG_PAGE;

page += 1?

> +	}
> +}
> +
> +
>  /**
>   * Disable features in feature_mask. Returns 0 on success.
>   */
On Mon, Dec 21, 2015 at 03:06:43PM +0000, Xie, Huawei wrote:
> On 12/17/2015 11:11 AM, Yuanhan Liu wrote:
> > Introduce vhost_log_write() helper function to log the dirty pages we
> > touched. Page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
> > page is represented by 1 bit.
> >
> > Therefore, vhost_log_write() simply finds the right bit for the related
> > page we are going to change, and sets it to 1. dev->log_base denotes the
> > start of the dirty page bitmap.
> >
> > Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
> > Signed-off-by: Victor Kaplansky <victork@redhat.com>
> > ---
> >  lib/librte_vhost/rte_virtio_net.h | 29 +++++++++++++++++++++++++++++
> >  1 file changed, 29 insertions(+)
> >
> > diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> > index 8acee02..5726683 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -40,6 +40,7 @@
> >   */
> >
> >  #include <stdint.h>
> > +#include <linux/vhost.h>
> >  #include <linux/virtio_ring.h>
> >  #include <linux/virtio_net.h>
> >  #include <sys/eventfd.h>
> > @@ -59,6 +60,8 @@ struct rte_mbuf;
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> >
> > +#define VHOST_LOG_PAGE	4096
> > +
> >
> >  /* Enum for virtqueue management. */
> >  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> > @@ -205,6 +208,32 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
> >  	return vhost_va;
> >  }
> >
> > +static inline void __attribute__((always_inline))
> > +vhost_log_page(uint8_t *log_base, uint64_t page)
> > +{
> > +	log_base[page / 8] |= 1 << (page % 8);
> > +}
> > +
> Those logging functions are not supposed to be API. Could we move them
> into an internal header file?

Agreed. I should have put them into vhost_rxtx.c

> > +static inline void __attribute__((always_inline))
> > +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
> > +{
> > +	uint64_t page;
> > +
> Before we log, we need a memory barrier to make sure updates are in place.
> > +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> > +		   !dev->log_base || !len))
> > +		return;

Put a memory barrier inside set_features()?

I see no var dependence here, so why put a barrier there? We are
accessing and modifying the same var; won't the cache MESI protocol
get rid of your concerns?

> > +
> > +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
> > +		return;
> > +
> > +	page = addr / VHOST_LOG_PAGE;
> > +	while (page * VHOST_LOG_PAGE < addr + len) {
> Let us have a page_end var to make the code simpler?

Could do that.

> > +		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
> > +		page += VHOST_LOG_PAGE;
> page += 1?

Oops, right.

	--yliu

> > +	}
> > +}
> > +
> > +
> >  /**
> >   * Disable features in feature_mask. Returns 0 on success.
> >   */
>
On 12/22/2015 10:40 AM, Yuanhan Liu wrote:
> On Mon, Dec 21, 2015 at 03:06:43PM +0000, Xie, Huawei wrote:
>> On 12/17/2015 11:11 AM, Yuanhan Liu wrote:
>>> Introduce vhost_log_write() helper function to log the dirty pages we
>>> touched. Page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
>>> page is represented by 1 bit.
>>>
>>> Therefore, vhost_log_write() simply finds the right bit for the related
>>> page we are going to change, and sets it to 1. dev->log_base denotes the
>>> start of the dirty page bitmap.
>>>
>>> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
>>> Signed-off-by: Victor Kaplansky <victork@redhat.com>
>>> ---
>>>  lib/librte_vhost/rte_virtio_net.h | 29 +++++++++++++++++++++++++++++
>>>  1 file changed, 29 insertions(+)
>>>
>>> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
>>> index 8acee02..5726683 100644
>>> --- a/lib/librte_vhost/rte_virtio_net.h
>>> +++ b/lib/librte_vhost/rte_virtio_net.h
>>> @@ -40,6 +40,7 @@
>>>   */
>>>
>>>  #include <stdint.h>
>>> +#include <linux/vhost.h>
>>>  #include <linux/virtio_ring.h>
>>>  #include <linux/virtio_net.h>
>>>  #include <sys/eventfd.h>
>>> @@ -59,6 +60,8 @@ struct rte_mbuf;
>>>  /* Backend value set by guest. */
>>>  #define VIRTIO_DEV_STOPPED -1
>>>
>>> +#define VHOST_LOG_PAGE	4096
>>> +
>>>
>>>  /* Enum for virtqueue management. */
>>>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
>>> @@ -205,6 +208,32 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
>>>  	return vhost_va;
>>>  }
>>>
>>> +static inline void __attribute__((always_inline))
>>> +vhost_log_page(uint8_t *log_base, uint64_t page)
>>> +{
>>> +	log_base[page / 8] |= 1 << (page % 8);
>>> +}
>>> +
>> Those logging functions are not supposed to be API. Could we move them
>> into an internal header file?
> Agreed. I should have put them into vhost_rxtx.c
>
>>> +static inline void __attribute__((always_inline))
>>> +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
>>> +{
>>> +	uint64_t page;
>>> +
>> Before we log, we need a memory barrier to make sure updates are in place.
>>> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
>>> +		   !dev->log_base || !len))
>>> +		return;
> Put a memory barrier inside set_features()?
>
> I see no var dependence here, so why put a barrier there? We are
> accessing and modifying the same var; won't the cache MESI protocol
> get rid of your concerns?

This fence isn't about the feature var. It is to ensure that updates to the
guest buffer are committed before the logging.
For the IA strong memory model, a compiler barrier is enough. For other,
weaker memory models, a fence is required.

>>> +
>>> +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
>>> +		return;
>>> +
>>> +	page = addr / VHOST_LOG_PAGE;
>>> +	while (page * VHOST_LOG_PAGE < addr + len) {
>> Let us have a page_end var to make the code simpler?
> Could do that.
>
>
>>> +		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
>>> +		page += VHOST_LOG_PAGE;
>> page += 1?
> Oops, right.
>
> 	--yliu
>
>>> +	}
>>> +}
>>> +
>>> +
>>>  /**
>>>   * Disable features in feature_mask. Returns 0 on success.
>>>   */
On Tue, Dec 22, 2015 at 02:45:52AM +0000, Xie, Huawei wrote:
> >>> +static inline void __attribute__((always_inline))
> >>> +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
> >>> +{
> >>> +	uint64_t page;
> >>> +
> >> Before we log, we need a memory barrier to make sure updates are in place.
> >>> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> >>> +		   !dev->log_base || !len))
> >>> +		return;
> > Put a memory barrier inside set_features()?
> >
> > I see no var dependence here, so why put a barrier there? We are
> > accessing and modifying the same var; won't the cache MESI protocol
> > get rid of your concerns?
> This fence isn't about the feature var. It is to ensure that updates to the
> guest buffer are committed before the logging.

Oh, I thought you were talking about the concurrent access and
modification of the "dev->features" field that you mentioned in V1.

> For the IA strong memory model, a compiler barrier is enough. For other,
> weaker memory models, a fence is required.
> >>> +
> >>> +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
> >>> +		return;

So I should put a "rte_mb()" __here__?

	--yliu

> >>> +
> >>> +	page = addr / VHOST_LOG_PAGE;
> >>> +	while (page * VHOST_LOG_PAGE < addr + len) {
> >> Let us have a page_end var to make the code simpler?
> > Could do that.
> >
> >
> >>> +		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
> >>> +		page += VHOST_LOG_PAGE;
> >> page += 1?
> > Oops, right.
> >
> > 	--yliu
> >
> >>> +	}
> >>> +}
> >>> +
> >>> +
> >>>  /**
> >>>   * Disable features in feature_mask. Returns 0 on success.
> >>>   */
>
On Thu, Dec 17, 2015 at 11:11:57AM +0800, Yuanhan Liu wrote:
> +static inline void __attribute__((always_inline))
> +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
> +{
> +	uint64_t page;
> +
> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> +		   !dev->log_base || !len))
> +		return;
> +
> +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))

Should it be "<="?

Peter
On Tue, Dec 22, 2015 at 01:11:02PM +0800, Peter Xu wrote:
> On Thu, Dec 17, 2015 at 11:11:57AM +0800, Yuanhan Liu wrote:
> > +static inline void __attribute__((always_inline))
> > +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
> > +{
> > +	uint64_t page;
> > +
> > +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> > +		   !dev->log_base || !len))
> > +		return;
> > +
> > +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
>
> Should it be "<="?

Right, thanks for catching it.

	--yliu
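The off-by-one discussed here can be checked with a small sketch. It assumes, as the "<=" suggestion implies, that log_size is the size of the dirty bitmap in bytes, so the valid byte indexes are 0 through log_size - 1. The function name log_range_fits() and the LOG_PAGE_SIZE macro are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors VHOST_LOG_PAGE from the patch; the name is local to this sketch. */
#define LOG_PAGE_SIZE 4096

/*
 * Hypothetical helper: true if the dirty bits for [addr, addr + len) all
 * fall inside a bitmap of log_size bytes. Assumes len > 0 and that
 * addr + len does not overflow.
 */
static inline bool log_range_fits(uint64_t log_size, uint64_t addr, uint64_t len)
{
	/* Index of the last bitmap byte this write would touch. */
	uint64_t last_byte = (addr + len - 1) / LOG_PAGE_SIZE / 8;

	/* Equivalent to rejecting the write when log_size <= last_byte. */
	return last_byte < log_size;
}
```

With a 1-byte bitmap (pages 0 to 7), a 2-byte write starting at address 32767 touches page 8, whose bit lives in byte 1. The original `log_size < last_byte` test evaluates 1 < 1 and lets that write through; the "<=" form rejects it.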
On 12/22/2015 11:03 AM, Yuanhan Liu wrote:
> On Tue, Dec 22, 2015 at 02:45:52AM +0000, Xie, Huawei wrote:
>>>>> +static inline void __attribute__((always_inline))
>>>>> +vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
>>>>> +{
>>>>> +	uint64_t page;
>>>>> +
>>>> Before we log, we need a memory barrier to make sure updates are in place.
>>>>> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
>>>>> +		   !dev->log_base || !len))
>>>>> +		return;
>>> Put a memory barrier inside set_features()?
>>>
>>> I see no var dependence here, so why put a barrier there? We are
>>> accessing and modifying the same var; won't the cache MESI protocol
>>> get rid of your concerns?
>> This fence isn't about the feature var. It is to ensure that updates to the
>> guest buffer are committed before the logging.
> Oh, I thought you were talking about the concurrent access and
> modification of the "dev->features" field that you mentioned in V1.
>
>> For the IA strong memory model, a compiler barrier is enough. For other,
>> weaker memory models, a fence is required.
>>>>> +
>>>>> +	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
>>>>> +		return;
> So I should put a "rte_mb()" __here__?
>
> 	--yliu

I find that we already have arch-dependent versions of rte_smp_wmb().

--huawei

>>>>> +
>>>>> +	page = addr / VHOST_LOG_PAGE;
>>>>> +	while (page * VHOST_LOG_PAGE < addr + len) {
>>>> Let us have a page_end var to make the code simpler?
>>> Could do that.
>>>
>>>
>>>>> +		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
>>>>> +		page += VHOST_LOG_PAGE;
>>>> page += 1?
>>> Oops, right.
>>>
>>> 	--yliu
>>>
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +
>>>>>  /**
>>>>>   * Disable features in feature_mask. Returns 0 on success.
>>>>>   */
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 8acee02..5726683 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -40,6 +40,7 @@
  */
 
 #include <stdint.h>
+#include <linux/vhost.h>
 #include <linux/virtio_ring.h>
 #include <linux/virtio_net.h>
 #include <sys/eventfd.h>
@@ -59,6 +60,8 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1
 
+#define VHOST_LOG_PAGE	4096
+
 
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
@@ -205,6 +208,32 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
 	return vhost_va;
 }
 
+static inline void __attribute__((always_inline))
+vhost_log_page(uint8_t *log_base, uint64_t page)
+{
+	log_base[page / 8] |= 1 << (page % 8);
+}
+
+static inline void __attribute__((always_inline))
+vhost_log_write(struct virtio_net *dev, uint64_t addr, uint64_t len)
+{
+	uint64_t page;
+
+	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
+		   !dev->log_base || !len))
+		return;
+
+	if (unlikely(dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8)))
+		return;
+
+	page = addr / VHOST_LOG_PAGE;
+	while (page * VHOST_LOG_PAGE < addr + len) {
+		vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
+		page += VHOST_LOG_PAGE;
+	}
+}
+
+
 /**
  * Disable features in feature_mask. Returns 0 on success.
  */
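Putting the review comments together, one possible shape for the helper is sketched below. This is not the revised patch that was later posted; it is a self-contained illustration that applies the "<=" bounds check, the page_end variable, and the "page += 1" fix. It assumes a stripped-down stand-in struct (sketch_dev), renamed macros, and a GCC/Clang __atomic_thread_fence() release fence standing in for DPDK's rte_smp_wmb(). Where the barrier belongs was still an open question in this thread, so its position here is also an assumption.

```c
#include <stdint.h>

/* Local stand-ins so the sketch builds without DPDK or kernel headers. */
#define SKETCH_LOG_PAGE  4096   /* mirrors VHOST_LOG_PAGE from the patch */
#define SKETCH_F_LOG_ALL 26     /* same bit number as VHOST_F_LOG_ALL in <linux/vhost.h> */

/* Hypothetical, reduced version of struct virtio_net: only the fields used here. */
struct sketch_dev {
	uint64_t features;
	uint64_t log_base;  /* address of the dirty page bitmap */
	uint64_t log_size;  /* assumed to be the bitmap size in bytes */
};

static inline void sketch_log_page(uint8_t *log_base, uint64_t page)
{
	log_base[page / 8] |= (uint8_t)(1u << (page % 8));
}

static inline void
sketch_log_write(struct sketch_dev *dev, uint64_t addr, uint64_t len)
{
	uint64_t page, page_end;

	if ((dev->features & (1ULL << SKETCH_F_LOG_ALL)) == 0 ||
	    !dev->log_base || !len)
		return;

	/* "<=" as suggested in review: byte index log_size is already out of range. */
	if (dev->log_size <= (addr + len - 1) / SKETCH_LOG_PAGE / 8)
		return;

	/* Stand-in for rte_smp_wmb(): order guest buffer writes before the log update. */
	__atomic_thread_fence(__ATOMIC_RELEASE);

	page = addr / SKETCH_LOG_PAGE;
	page_end = (addr + len - 1) / SKETCH_LOG_PAGE;  /* last page touched, inclusive */
	for (; page <= page_end; page++)
		sketch_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
}
```

As a usage example, with a 2-byte bitmap and logging enabled, a 2-byte write at address 4095 straddles pages 0 and 1 and sets the low two bits of the first byte, while a write to page 16 is ignored because its bit would fall outside the bitmap.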