Subject: Re: [dpdk-dev] [PATCH 03/21] vhost: protect virtio_net device struct
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Tiwei Bie
Cc: dev@dpdk.org, yliu@fridaylinux.org, jfreiman@redhat.com, mst@redhat.com,
 vkaplans@redhat.com, jasowang@redhat.com, lei.a.yao@intel.com,
 cunming.liang@intel.com
Date: Wed, 6 Sep 2017 22:02:29 +0200
In-Reply-To: <20170906073021.GA15842@debian-ZGViaWFuCg>

On 09/06/2017 09:30 AM, Tiwei Bie wrote:
> On Wed, Sep 06, 2017 at 09:15:47AM +0200, Maxime Coquelin wrote:
>> Hi Tiwei,
>>
>> On 09/06/2017 03:15 AM, Tiwei Bie wrote:
>>> On Tue, Sep 05, 2017 at 01:00:42PM +0200, Maxime Coquelin wrote:
>>>> On 09/05/2017 12:07 PM, Tiwei Bie wrote:
>>>>> On Tue, Sep 05, 2017 at 11:24:14AM +0200, Maxime Coquelin wrote:
>>>>>> On 09/05/2017 06:45 AM, Tiwei Bie wrote:
>>>>>>> On Thu, Aug 31, 2017 at 11:50:05AM +0200, Maxime Coquelin wrote:
>>>>>>>> The virtio_net device might be accessed while being reallocated
>>>>>>>> in case of NUMA awareness. This case might be theoretical,
>>>>>>>> but it will be needed anyway to protect vring pages against
>>>>>>>> invalidation.
>>>>>>>>
>>>>>>>> The virtio_net devs are now protected with a readers/writers
>>>>>>>> lock, so that before reallocating the device, it is ensured
>>>>>>>> that it is not being referenced by the processing threads.
>>>>>>>>
>>>>>>> [...]
>>>>>>>> +struct virtio_net *
>>>>>>>> +get_device(int vid)
>>>>>>>> +{
>>>>>>>> +	struct virtio_net *dev;
>>>>>>>> +
>>>>>>>> +	rte_rwlock_read_lock(&vhost_devices[vid].lock);
>>>>>>>> +
>>>>>>>> +	dev = __get_device(vid);
>>>>>>>> +	if (unlikely(!dev))
>>>>>>>> +		rte_rwlock_read_unlock(&vhost_devices[vid].lock);
>>>>>>>> +
>>>>>>>> +	return dev;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +void
>>>>>>>> +put_device(int vid)
>>>>>>>> +{
>>>>>>>> +	rte_rwlock_read_unlock(&vhost_devices[vid].lock);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>
>>>>>>> This patch introduced a per-device rwlock which needs to be acquired
>>>>>>> unconditionally in the data path. So for each vhost device, the IO
>>>>>>> threads of different queues will need to acquire/release this lock
>>>>>>> during each enqueue and dequeue operation, which will cause cache
>>>>>>> contention when multiple queues are enabled and handled by different
>>>>>>> cores. With this patch alone, I saw a ~7% performance drop when
>>>>>>> enabling 6 queues in a 64-byte iofwd loopback test. Is there any way
>>>>>>> to avoid introducing this lock to the data path?
>>>>>>
>>>>>> First, I'd like to thank you for running the MQ test.
>>>>>> I agree it may have a performance impact in this case.
>>>>>>
>>>>>> This lock currently has two purposes:
>>>>>> 1. Prevent referencing a freed virtio_net struct in case of numa_realloc.
>>>>>> 2. Protect vring pages against invalidation.
>>>>>>
>>>>>> For 2., it can be fixed by using the per-vq IOTLB lock (it was not the
>>>>>> case in my early prototypes, which had a per-device IOTLB cache).
>>>>>>
>>>>>> For 1., this is an existing problem, so we might consider it
>>>>>> acceptable to keep the current state. Maybe it could be improved by
>>>>>> only reallocating in case VQ0 is not on the right NUMA node, the other
>>>>>> VQs not being initialized at this point.
>>>>>>
>>>>>> If we do this we might be able to get rid of this lock; I need some
>>>>>> more time though to ensure I'm not missing something.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>
>>>>> Cool. So it's possible that the lock in the data path will be
>>>>> acquired only when the IOMMU feature is enabled. That will be
>>>>> great!
>>>>>
>>>>> Besides, I just did a very simple MQ test to verify my thoughts.
>>>>> Lei (CC'ed in this mail) may do a thorough performance test for
>>>>> this patch set to evaluate the performance impacts.
>>>>
>>>> I'll try to post v2 this week including the proposed change.
>>>> Maybe it'd be better if Lei waits for the v2.
>>>>
>>>
>>> Cool. Sure. Thank you! :)
>>
>> I have done the changes, you can find the v2 on my gitlab repo:
>> https://gitlab.com/mcoquelin/dpdk-next-virtio/commits/vhost_iotlb_v2
>>
>> I'm testing it right now, but if you'd like to run some early benchmarks
>> before I post the series, there it is!
>>
>
> Got it. Thanks! :)

Just to let you know that I have updated my branch to remove another
regression with iommu=off, by inlining the no-IOMMU part of the
vhost_iova_to_vva() call (see below for the patch, which is squashed
into my branch). Without this, when running microbenchmarks (txonly,
rxonly, ...), I noticed a 4% performance degradation.
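For reference, the cost Tiwei measured comes from the data path taking
the per-device lock once per burst. A minimal sketch of the pattern
follows; only get_device()/put_device() are from the patch, the wrapper
name and burst body are illustrative:

/*
 * Sketch only: how each enqueue/dequeue burst brackets its work with
 * the per-device rwlock introduced by this patch. Assumes the DPDK
 * vhost internal headers (struct virtio_net, get_device/put_device).
 */
static uint16_t
vhost_burst_locked(int vid, uint16_t queue_id)
{
	struct virtio_net *dev;
	uint16_t done = 0;

	/* Atomic read-lock on vhost_devices[vid].lock: a single cache
	 * line shared by all cores servicing any queue of this device. */
	dev = get_device(vid);
	if (unlikely(!dev))
		return 0; /* get_device() already dropped the lock */

	/* ... burst processing on dev->virtqueue[queue_id] ... */

	/* Read-unlock: taken and dropped once per burst, per core. */
	put_device(vid);

	return done;
}

Even uncontended, the read-lock is an atomic update of a cache line
shared across cores, so with 6 queues polled from different cores the
line bounces on every burst, which is consistent with the ~7% drop.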
I think I'll have to post the series without testing PVP, because I had
to change the machine I use as a packet generator, and it now has X710
NICs, which seem to be unsupported by MoonGen :(. I have been advised
to use TRex instead, but I'll need some time to set it up...

Regards,
Maxime

ps: Are you coming to Dublin?

> Best regards,
> Tiwei Bie

Subject: [PATCH] vhost: inline IOMMU feature check

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost.c |  5 +----
 lib/librte_vhost/vhost.h | 12 +++++++++++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 938b3abf2..256184ac2 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -55,7 +55,7 @@
 
 struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
 
-uint64_t vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
+uint64_t __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			uint64_t iova, uint64_t size, uint8_t perm)
 {
 	uint64_t vva, tmp_size;
@@ -63,9 +63,6 @@ uint64_t vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	if (unlikely(!size))
 		return 0;
 
-	if (!(dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
-		return rte_vhost_gpa_to_vva(dev->mem, iova);
-
 	tmp_size = size;
 
 	vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 191e6c5f1..969f1108b 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -355,8 +355,18 @@ struct vhost_device_ops const *vhost_driver_callback_get(const char *path);
  */
 void vhost_backend_cleanup(struct virtio_net *dev);
 
-uint64_t vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
+uint64_t __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			uint64_t iova, uint64_t size, uint8_t perm);
+
+static __rte_always_inline uint64_t
+vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
+			uint64_t iova, uint64_t size, uint8_t perm)
+{
+	if (!(dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
+		return rte_vhost_gpa_to_vva(dev->mem, iova);
+
+	return __vhost_iova_to_vva(dev, vq, iova, size, perm);
+}
 
 int vring_translate(struct virtio_net *dev, struct vhost_virtqueue *vq);
 void vring_invalidate(struct virtio_net *dev, struct vhost_virtqueue *vq);
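To make the effect on the data path concrete, here is roughly what a
caller of the inlined wrapper ends up looking like. This fragment is
illustrative only, not part of the patch: the helper name, parameters,
and the VHOST_ACCESS_RO permission flag are my own choices, assuming
the DPDK vhost internal headers are available.

/*
 * Illustrative caller, not from the series: with vhost_iova_to_vva()
 * inlined, the common iommu=off case reduces to the feature-bit test
 * plus the rte_vhost_gpa_to_vva() GPA->VVA lookup, with no call into
 * the IOTLB code. __vhost_iova_to_vva() (the IOTLB cache lookup) is
 * only entered when VIRTIO_F_IOMMU_PLATFORM was negotiated.
 */
static void *
desc_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
	    uint64_t desc_addr, uint64_t desc_len)
{
	uint64_t vva;

	vva = vhost_iova_to_vva(dev, vq, desc_addr, desc_len,
				VHOST_ACCESS_RO);
	if (unlikely(!vva))
		return NULL; /* IOTLB miss: an update must be requested */

	return (void *)(uintptr_t)vva;
}

Keeping the IOTLB lookup out of line in __vhost_iova_to_vva() keeps the
inlined fast path down to a couple of instructions, which is what
recovers the 4% lost on the no-IOMMU microbenchmarks.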