From patchwork Wed Jun 27 14:49:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41666 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 426B41BF43; Wed, 27 Jun 2018 16:50:18 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id AAC8B1BF2C for ; Wed, 27 Jun 2018 16:50:14 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2DC1C818F047; Wed, 27 Jun 2018 14:50:14 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1917F20389E0; Wed, 27 Jun 2018 14:50:12 +0000 (UTC) From: Maxime Coquelin To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org Cc: Maxime Coquelin Date: Wed, 27 Jun 2018 16:49:53 +0200 Message-Id: <20180627144959.17277-2-maxime.coquelin@redhat.com> In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com> References: <20180627144959.17277-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 27 Jun 2018 14:50:14 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 27 Jun 2018 14:50:14 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [PATCH v3 1/7] vhost: use shadow used ring in dequeue path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Relax used ring contention by reusing the shadow used ring feature
already used by the enqueue path.

Signed-off-by: Maxime Coquelin
Reviewed-by: Tiwei Bie
---
 lib/librte_vhost/virtio_net.c | 45 ++++++++++---------------------------------
 1 file changed, 10 insertions(+), 35 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 98ad8e936..7e70a927f 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1019,35 +1019,6 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, return error; } -static __rte_always_inline void -update_used_ring(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint32_t used_idx, uint32_t desc_idx) -{ - vq->used->ring[used_idx].id = desc_idx; - vq->used->ring[used_idx].len = 0; - vhost_log_cache_used_vring(dev, vq, - offsetof(struct vring_used, ring[used_idx]), - sizeof(vq->used->ring[used_idx])); -} - -static __rte_always_inline void -update_used_idx(struct virtio_net *dev, struct vhost_virtqueue *vq, - uint32_t count) -{ - if (unlikely(count == 0)) - return; - - rte_smp_wmb(); - rte_smp_rmb(); - - vhost_log_cache_sync(dev, vq); - - vq->used->idx += count; - vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx), - sizeof(vq->used->idx)); - vhost_vring_call(dev, vq); -} - static __rte_always_inline struct zcopy_mbuf * get_zmbuf(struct vhost_virtqueue *vq) {
@@ -1146,6 +1117,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, goto out_access_unlock; vq->batch_copy_nb_elems = 0; + vq->shadow_used_idx = 0; if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) vhost_user_iotlb_rd_lock(vq);
@@ -1164,8 +1136,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, if (mbuf_is_consumed(zmbuf->mbuf)) { used_idx = vq->last_used_idx++ & (vq->size - 1); - update_used_ring(dev, vq,
used_idx, - zmbuf->desc_idx); + update_shadow_used_ring(vq, zmbuf->desc_idx, 0); nr_updated += 1; TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next); @@ -1176,7 +1147,9 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, } } - update_used_idx(dev, vq, nr_updated); + flush_shadow_used_ring(dev, vq); + vhost_vring_call(dev, vq); + vq->shadow_used_idx = 0; } /* @@ -1233,7 +1206,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, desc_indexes[i] = vq->avail->ring[avail_idx]; if (likely(dev->dequeue_zero_copy == 0)) - update_used_ring(dev, vq, used_idx, desc_indexes[i]); + update_shadow_used_ring(vq, desc_indexes[i], 0); } /* Prefetch descriptor index. */ @@ -1326,8 +1299,10 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, if (likely(dev->dequeue_zero_copy == 0)) { do_data_copy_dequeue(vq); - vq->last_used_idx += i; - update_used_idx(dev, vq, i); + if (unlikely(i < count)) + vq->shadow_used_idx = i; + flush_shadow_used_ring(dev, vq); + vhost_vring_call(dev, vq); } out: From patchwork Wed Jun 27 14:49:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41667 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0FB601BF4D; Wed, 27 Jun 2018 16:50:21 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 122DD1BF3F for ; Wed, 27 Jun 2018 16:50:15 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 76FAC407C570; Wed, 27 Jun 2018 14:50:15 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com 
(Postfix) with ESMTP id 6F00C20389E0; Wed, 27 Jun 2018 14:50:14 +0000 (UTC)
From: Maxime Coquelin
To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org
Cc: Maxime Coquelin
Date: Wed, 27 Jun 2018 16:49:54 +0200
Message-Id: <20180627144959.17277-3-maxime.coquelin@redhat.com>
In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com>
References: <20180627144959.17277-1-maxime.coquelin@redhat.com>
Subject: [dpdk-dev] [PATCH v3 2/7] vhost: make gpa to hpa failure an error

The fix for CVE-2018-1059 ensures that memory which is contiguous in
gpa space is also contiguous in hva space. Incidentally, it also
ensures that this memory is contiguous in hpa space. We can therefore
simplify the code by treating gpa-contiguous memory that is
discontiguous in hpa space as an error.

Signed-off-by: Maxime Coquelin
---
 lib/librte_vhost/virtio_net.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 7e70a927f..ec4bcc400 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -884,13 +884,13 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail); - /* - * A desc buf might across two host physical pages that are - * not continuous.
In such case (gpa_to_hpa returns 0), data - * will be copied even though zero copy is enabled. - */ - if (unlikely(dev->dequeue_zero_copy && (hpa = gpa_to_hpa(dev, - desc_gaddr + desc_offset, cpy_len)))) { + if (unlikely(dev->dequeue_zero_copy)) { + hpa = gpa_to_hpa(dev, + desc_gaddr + desc_offset, cpy_len); + if (unlikely(!hpa)) { + error = -1; + goto out; + } cur->data_len = cpy_len; cur->data_off = 0; cur->buf_addr = (void *)(uintptr_t)(desc_addr From patchwork Wed Jun 27 14:49:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41668 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9F3B91BF55; Wed, 27 Jun 2018 16:50:22 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 233171BF43 for ; Wed, 27 Jun 2018 16:50:17 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B87CF818F047; Wed, 27 Jun 2018 14:50:16 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com (Postfix) with ESMTP id B401320389E0; Wed, 27 Jun 2018 14:50:15 +0000 (UTC) From: Maxime Coquelin To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org Cc: Maxime Coquelin Date: Wed, 27 Jun 2018 16:49:55 +0200 Message-Id: <20180627144959.17277-4-maxime.coquelin@redhat.com> In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com> References: <20180627144959.17277-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 
(mx1.redhat.com [10.11.55.8]); Wed, 27 Jun 2018 14:50:16 +0000 (UTC)
Subject: [dpdk-dev] [PATCH v3 3/7] vhost: use buffer vectors in dequeue path

To ease packed ring layout integration, this patch makes the dequeue
path reuse the buffer vectors implemented for the enqueue path. As a
result, copy_desc_to_mbuf() is now agnostic to the ring layout type.

Signed-off-by: Maxime Coquelin
---
 lib/librte_vhost/virtio_net.c | 143 ++++++++++--------------------------------
 1 file changed, 33 insertions(+), 110 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index ec4bcc400..4816e8003 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -750,11 +750,9 @@ put_zmbuf(struct zcopy_mbuf *zmbuf) static __rte_always_inline int copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, - struct vring_desc *descs, uint16_t max_desc, - struct rte_mbuf *m, uint16_t desc_idx, - struct rte_mempool *mbuf_pool) + struct buf_vector *buf_vec, uint16_t nr_vec, + struct rte_mbuf *m, struct rte_mempool *mbuf_pool) { - struct vring_desc *desc; uint64_t desc_addr, desc_gaddr; uint32_t desc_avail, desc_offset; uint32_t mbuf_avail, mbuf_offset;
@@ -764,24 +762,18 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, struct virtio_net_hdr tmp_hdr; struct virtio_net_hdr *hdr = NULL; /* A counter to avoid desc dead loop chain */ - uint32_t nr_desc = 1; + uint16_t vec_idx = 0; struct batch_copy_elem *batch_copy = vq->batch_copy_elems; int
error = 0; - desc = &descs[desc_idx]; - if (unlikely((desc->len < dev->vhost_hlen)) || - (desc->flags & VRING_DESC_F_INDIRECT)) { - error = -1; - goto out; - } - - desc_chunck_len = desc->len; - desc_gaddr = desc->addr; + desc_chunck_len = buf_vec[vec_idx].buf_len; + desc_gaddr = buf_vec[vec_idx].buf_addr; desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr, &desc_chunck_len, VHOST_ACCESS_RO); - if (unlikely(!desc_addr)) { + if (unlikely(buf_vec[vec_idx].buf_len < dev->vhost_hlen || + !desc_addr)) { error = -1; goto out; } @@ -828,16 +820,12 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, * for Tx: the first for storing the header, and others * for storing the data. */ - if (likely((desc->len == dev->vhost_hlen) && - (desc->flags & VRING_DESC_F_NEXT) != 0)) { - desc = &descs[desc->next]; - if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) { - error = -1; + if (likely(buf_vec[vec_idx].buf_len == dev->vhost_hlen)) { + if (unlikely(++vec_idx >= nr_vec)) goto out; - } - desc_chunck_len = desc->len; - desc_gaddr = desc->addr; + desc_chunck_len = buf_vec[vec_idx].buf_len; + desc_gaddr = buf_vec[vec_idx].buf_addr; desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr, &desc_chunck_len, @@ -848,10 +836,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, } desc_offset = 0; - desc_avail = desc->len; - nr_desc += 1; + desc_avail = buf_vec[vec_idx].buf_len; } else { - desc_avail = desc->len - dev->vhost_hlen; + desc_avail = buf_vec[vec_idx].buf_len - dev->vhost_hlen; if (unlikely(desc_chunck_len < dev->vhost_hlen)) { desc_chunck_len = desc_avail; @@ -906,7 +893,8 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, if (likely(cpy_len > MAX_BATCH_LEN || vq->batch_copy_nb_elems >= vq->size || (hdr && cur == m) || - desc->len != desc_chunck_len)) { + buf_vec[vec_idx].buf_len != + desc_chunck_len)) { rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, mbuf_offset), (void *)((uintptr_t)(desc_addr + @@ -933,22 +921,11 @@ 
copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, /* This desc reaches to its end, get the next one */ if (desc_avail == 0) { - if ((desc->flags & VRING_DESC_F_NEXT) == 0) + if (++vec_idx >= nr_vec) break; - if (unlikely(desc->next >= max_desc || - ++nr_desc > max_desc)) { - error = -1; - goto out; - } - desc = &descs[desc->next]; - if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) { - error = -1; - goto out; - } - - desc_chunck_len = desc->len; - desc_gaddr = desc->addr; + desc_chunck_len = buf_vec[vec_idx].buf_len; + desc_gaddr = buf_vec[vec_idx].buf_addr; desc_addr = vhost_iova_to_vva(dev, vq, desc_gaddr, &desc_chunck_len, @@ -961,7 +938,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, rte_prefetch0((void *)(uintptr_t)desc_addr); desc_offset = 0; - desc_avail = desc->len; + desc_avail = buf_vec[vec_idx].buf_len; PRINT_PACKET(dev, (uintptr_t)desc_addr, (uint32_t)desc_chunck_len, 0); @@ -1085,11 +1062,8 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, struct virtio_net *dev; struct rte_mbuf *rarp_mbuf = NULL; struct vhost_virtqueue *vq; - uint32_t desc_indexes[MAX_PKT_BURST]; - uint32_t used_idx; uint32_t i = 0; uint16_t free_entries; - uint16_t avail_idx; dev = get_device(vid); if (!dev) @@ -1135,7 +1109,6 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, next = TAILQ_NEXT(zmbuf, next); if (mbuf_is_consumed(zmbuf->mbuf)) { - used_idx = vq->last_used_idx++ & (vq->size - 1); update_shadow_used_ring(vq, zmbuf->desc_idx, 0); nr_updated += 1; @@ -1182,89 +1155,43 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, } free_entries = *((volatile uint16_t *)&vq->avail->idx) - - vq->last_avail_idx; + vq->last_avail_idx; if (free_entries == 0) goto out; VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__); - /* Prefetch available and used ring */ - avail_idx = vq->last_avail_idx & (vq->size - 1); - used_idx = vq->last_used_idx & (vq->size - 1); - rte_prefetch0(&vq->avail->ring[avail_idx]); - 
rte_prefetch0(&vq->used->ring[used_idx]); - count = RTE_MIN(count, MAX_PKT_BURST); count = RTE_MIN(count, free_entries); VHOST_LOG_DEBUG(VHOST_DATA, "(%d) about to dequeue %u buffers\n", dev->vid, count); - /* Retrieve all of the head indexes first to avoid caching issues. */ for (i = 0; i < count; i++) { - avail_idx = (vq->last_avail_idx + i) & (vq->size - 1); - used_idx = (vq->last_used_idx + i) & (vq->size - 1); - desc_indexes[i] = vq->avail->ring[avail_idx]; - - if (likely(dev->dequeue_zero_copy == 0)) - update_shadow_used_ring(vq, desc_indexes[i], 0); - } - - /* Prefetch descriptor index. */ - rte_prefetch0(&vq->desc[desc_indexes[0]]); - for (i = 0; i < count; i++) { - struct vring_desc *desc, *idesc = NULL; - uint16_t sz, idx; - uint64_t dlen; + struct buf_vector buf_vec[BUF_VECTOR_MAX]; + uint16_t head_idx, dummy_len; + uint32_t nr_vec = 0; int err; - if (likely(i + 1 < count)) - rte_prefetch0(&vq->desc[desc_indexes[i + 1]]); - - if (vq->desc[desc_indexes[i]].flags & VRING_DESC_F_INDIRECT) { - dlen = vq->desc[desc_indexes[i]].len; - desc = (struct vring_desc *)(uintptr_t) - vhost_iova_to_vva(dev, vq, - vq->desc[desc_indexes[i]].addr, - &dlen, - VHOST_ACCESS_RO); - if (unlikely(!desc)) - break; - - if (unlikely(dlen < vq->desc[desc_indexes[i]].len)) { - /* - * The indirect desc table is not contiguous - * in process VA space, we have to copy it. 
- */ - idesc = alloc_copy_ind_table(dev, vq, - &vq->desc[desc_indexes[i]]); - if (unlikely(!idesc)) - break; - - desc = idesc; - } + if (unlikely(fill_vec_buf(dev, vq, + vq->last_avail_idx + i, + &nr_vec, buf_vec, + &head_idx, &dummy_len) < 0)) + break; - rte_prefetch0(desc); - sz = vq->desc[desc_indexes[i]].len / sizeof(*desc); - idx = 0; - } else { - desc = vq->desc; - sz = vq->size; - idx = desc_indexes[i]; - } + if (likely(dev->dequeue_zero_copy == 0)) + update_shadow_used_ring(vq, head_idx, 0); pkts[i] = rte_pktmbuf_alloc(mbuf_pool); if (unlikely(pkts[i] == NULL)) { RTE_LOG(ERR, VHOST_DATA, "Failed to allocate memory for mbuf.\n"); - free_ind_table(idesc); break; } - err = copy_desc_to_mbuf(dev, vq, desc, sz, pkts[i], idx, - mbuf_pool); + err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, pkts[i], + mbuf_pool); if (unlikely(err)) { rte_pktmbuf_free(pkts[i]); - free_ind_table(idesc); break; } @@ -1274,11 +1201,10 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, zmbuf = get_zmbuf(vq); if (!zmbuf) { rte_pktmbuf_free(pkts[i]); - free_ind_table(idesc); break; } zmbuf->mbuf = pkts[i]; - zmbuf->desc_idx = desc_indexes[i]; + zmbuf->desc_idx = head_idx; /* * Pin lock the mbuf; we will check later to see @@ -1291,9 +1217,6 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, vq->nr_zmbuf += 1; TAILQ_INSERT_TAIL(&vq->zmbuf_list, zmbuf, next); } - - if (unlikely(!!idesc)) - free_ind_table(idesc); } vq->last_avail_idx += i; From patchwork Wed Jun 27 14:49:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41669 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DB91D1BF62; Wed, 27 Jun 2018 16:50:24 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org 
(Postfix) with ESMTP id 8BD521BF48 for ; Wed, 27 Jun 2018 16:50:18 +0200 (CEST)
From: Maxime Coquelin
To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org
Cc: Maxime Coquelin
Date: Wed, 27 Jun 2018 16:49:56 +0200
Message-Id: <20180627144959.17277-5-maxime.coquelin@redhat.com>
In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com>
References: <20180627144959.17277-1-maxime.coquelin@redhat.com>
Subject: [dpdk-dev] [PATCH v3 4/7] vhost: translate iovas at vectors fill time

This patch simplifies the desc to mbuf and mbuf to desc copy functions
by performing the iova to hva translations at vectors fill time. As a
consequence, a desc buffer that is not contiguous in hva space gets
split into multiple vectors.
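
To illustrate the splitting of a buffer into multiple vectors, here is
a minimal standalone sketch; it is not code from this series. It assumes
a hypothetical translator, toy_iova_to_vva(), that stands in for
vhost_iova_to_vva() and pretends each 4 KiB IOVA region is mapped at an
unrelated host address. struct toy_vec, TOY_VEC_MAX and toy_split() are
likewise made-up names, loosely modelled on struct buf_vector,
BUF_VECTOR_MAX and the per-descriptor loop in fill_vec_buf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_REGION_SIZE 0x1000ULL /* assumed size of one contiguous mapping */
#define TOY_VEC_MAX 8             /* hypothetical stand-in for BUF_VECTOR_MAX */

struct toy_vec {
	uint64_t iova; /* guest I/O virtual address of the chunk */
	uint64_t addr; /* host virtual address of the chunk */
	uint32_t len;  /* length of the chunk in bytes */
};

/*
 * Hypothetical translator (an assumption, not the DPDK function):
 * clamps *len to the end of the region containing iova and maps each
 * region to a different, non-adjacent host address.
 */
static uint64_t
toy_iova_to_vva(uint64_t iova, uint64_t *len)
{
	uint64_t region_end = (iova | (TOY_REGION_SIZE - 1)) + 1;

	if (iova + *len > region_end)
		*len = region_end - iova;

	return 0x100000000ULL + (iova / TOY_REGION_SIZE) * 0x10000ULL +
		(iova % TOY_REGION_SIZE);
}

/*
 * Split one buffer [iova, iova + len) into host-contiguous chunks,
 * storing one vector per chunk. Returns 0 on success, -1 if a
 * translation fails or more than TOY_VEC_MAX vectors are needed.
 */
static int
toy_split(uint64_t iova, uint64_t len, struct toy_vec *vec, uint32_t *nr_vec)
{
	uint32_t n = 0;

	assert(vec != NULL && nr_vec != NULL);

	while (len) {
		uint64_t chunk_len = len;
		uint64_t addr;

		if (n >= TOY_VEC_MAX)
			return -1;

		/* chunk_len is shrunk to the contiguous part. */
		addr = toy_iova_to_vva(iova, &chunk_len);
		if (!addr)
			return -1;

		vec[n].iova = iova;
		vec[n].addr = addr;
		vec[n].len = (uint32_t)chunk_len;

		len -= chunk_len;
		iova += chunk_len;
		n++;
	}

	*nr_vec = n;
	return 0;
}
```

Under these assumptions, a 0x300-byte buffer at IOVA 0xf00 crosses a
region boundary and yields two vectors, of 0x100 and 0x200 bytes. The
copy functions can then walk the vectors without translating again.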
Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost.h | 1 + lib/librte_vhost/virtio_net.c | 348 ++++++++++++++++++------------------------ 2 files changed, 151 insertions(+), 198 deletions(-) diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 786a74f64..e3b2ed2ff 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -43,6 +43,7 @@ * from vring to do scatter RX. */ struct buf_vector { + uint64_t buf_iova; uint64_t buf_addr; uint32_t buf_len; uint32_t desc_idx; diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 4816e8003..371d0e646 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -225,12 +225,12 @@ static __rte_always_inline int fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq, uint32_t avail_idx, uint32_t *vec_idx, struct buf_vector *buf_vec, uint16_t *desc_chain_head, - uint16_t *desc_chain_len) + uint16_t *desc_chain_len, uint8_t perm) { uint16_t idx = vq->avail->ring[avail_idx & (vq->size - 1)]; uint32_t vec_id = *vec_idx; uint32_t len = 0; - uint64_t dlen; + uint64_t dlen, desc_avail, desc_iova; struct vring_desc *descs = vq->desc; struct vring_desc *idesc = NULL; @@ -261,16 +261,43 @@ fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq, } while (1) { - if (unlikely(vec_id >= BUF_VECTOR_MAX || idx >= vq->size)) { + if (unlikely(idx >= vq->size)) { free_ind_table(idesc); return -1; } + len += descs[idx].len; - buf_vec[vec_id].buf_addr = descs[idx].addr; - buf_vec[vec_id].buf_len = descs[idx].len; - buf_vec[vec_id].desc_idx = idx; - vec_id++; + desc_avail = descs[idx].len; + desc_iova = descs[idx].addr; + + while (desc_avail) { + uint64_t desc_addr; + uint64_t desc_chunck_len = desc_avail; + + if (unlikely(vec_id >= BUF_VECTOR_MAX)) { + free_ind_table(idesc); + return -1; + } + + desc_addr = vhost_iova_to_vva(dev, vq, + desc_iova, + &desc_chunck_len, + perm); + if (unlikely(!desc_addr)) { + free_ind_table(idesc); + return -1; + } + 
+ buf_vec[vec_id].buf_iova = desc_iova; + buf_vec[vec_id].buf_addr = desc_addr; + buf_vec[vec_id].buf_len = desc_chunck_len; + buf_vec[vec_id].desc_idx = idx; + + desc_avail -= desc_chunck_len; + desc_iova += desc_chunck_len; + vec_id++; + } if ((descs[idx].flags & VRING_DESC_F_NEXT) == 0) break; @@ -293,7 +320,8 @@ fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq, static inline int reserve_avail_buf(struct virtio_net *dev, struct vhost_virtqueue *vq, uint32_t size, struct buf_vector *buf_vec, - uint16_t *num_buffers, uint16_t avail_head) + uint16_t *num_buffers, uint16_t avail_head, + uint16_t *nr_vec) { uint16_t cur_idx; uint32_t vec_idx = 0; @@ -315,7 +343,8 @@ reserve_avail_buf(struct virtio_net *dev, struct vhost_virtqueue *vq, return -1; if (unlikely(fill_vec_buf(dev, vq, cur_idx, &vec_idx, buf_vec, - &head_idx, &len) < 0)) + &head_idx, &len, + VHOST_ACCESS_RW) < 0)) return -1; len = RTE_MIN(len, size); update_shadow_used_ring(vq, head_idx, len); @@ -334,21 +363,22 @@ reserve_avail_buf(struct virtio_net *dev, struct vhost_virtqueue *vq, return -1; } + *nr_vec = vec_idx; + return 0; } static __rte_always_inline int copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf *m, struct buf_vector *buf_vec, - uint16_t num_buffers) + uint16_t nr_vec, uint16_t num_buffers) { uint32_t vec_idx = 0; - uint64_t desc_addr, desc_gaddr; uint32_t mbuf_offset, mbuf_avail; - uint32_t desc_offset, desc_avail; + uint32_t buf_offset, buf_avail; + uint64_t buf_addr, buf_iova, buf_len; uint32_t cpy_len; - uint64_t desc_chunck_len; - uint64_t hdr_addr, hdr_phys_addr; + uint64_t hdr_addr; struct rte_mbuf *hdr_mbuf; struct batch_copy_elem *batch_copy = vq->batch_copy_elems; struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL; @@ -359,82 +389,57 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, goto out; } - desc_chunck_len = buf_vec[vec_idx].buf_len; - desc_gaddr = buf_vec[vec_idx].buf_addr; - desc_addr = 
vhost_iova_to_vva(dev, vq, - desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RW); - if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr) { + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; + + if (unlikely(buf_len < dev->vhost_hlen && nr_vec <= 1)) { error = -1; goto out; } hdr_mbuf = m; - hdr_addr = desc_addr; - if (unlikely(desc_chunck_len < dev->vhost_hlen)) + hdr_addr = buf_addr; + if (unlikely(buf_len < dev->vhost_hlen)) hdr = &tmp_hdr; else hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)hdr_addr; - hdr_phys_addr = desc_gaddr; rte_prefetch0((void *)(uintptr_t)hdr_addr); VHOST_LOG_DEBUG(VHOST_DATA, "(%d) RX: num merge buffers %d\n", dev->vid, num_buffers); - desc_avail = buf_vec[vec_idx].buf_len - dev->vhost_hlen; - if (unlikely(desc_chunck_len < dev->vhost_hlen)) { - desc_chunck_len = desc_avail; - desc_gaddr += dev->vhost_hlen; - desc_addr = vhost_iova_to_vva(dev, vq, - desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RW); - if (unlikely(!desc_addr)) { - error = -1; - goto out; - } - - desc_offset = 0; + if (unlikely(buf_len < dev->vhost_hlen)) { + buf_offset = dev->vhost_hlen - buf_len; + vec_idx++; + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; + buf_avail = buf_len - buf_offset; } else { - desc_offset = dev->vhost_hlen; - desc_chunck_len -= dev->vhost_hlen; + buf_offset = dev->vhost_hlen; + buf_avail = buf_len - dev->vhost_hlen; } - mbuf_avail = rte_pktmbuf_data_len(m); mbuf_offset = 0; while (mbuf_avail != 0 || m->next != NULL) { - /* done with current desc buf, get the next one */ - if (desc_avail == 0) { + /* done with current buf, get the next one */ + if (buf_avail == 0) { vec_idx++; - desc_chunck_len = buf_vec[vec_idx].buf_len; - desc_gaddr = buf_vec[vec_idx].buf_addr; - desc_addr = - vhost_iova_to_vva(dev, vq, - desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RW); - if (unlikely(!desc_addr)) { + if 
(unlikely(vec_idx >= nr_vec)) { error = -1; goto out; } + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; + /* Prefetch buffer address. */ - rte_prefetch0((void *)(uintptr_t)desc_addr); - desc_offset = 0; - desc_avail = buf_vec[vec_idx].buf_len; - } else if (unlikely(desc_chunck_len == 0)) { - desc_chunck_len = desc_avail; - desc_gaddr += desc_offset; - desc_addr = vhost_iova_to_vva(dev, vq, - desc_gaddr, - &desc_chunck_len, VHOST_ACCESS_RW); - if (unlikely(!desc_addr)) { - error = -1; - goto out; - } - desc_offset = 0; + rte_prefetch0((void *)(uintptr_t)buf_addr); + buf_offset = 0; + buf_avail = buf_len; } /* done with current mbuf, get the next one */ @@ -455,18 +460,12 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, uint64_t len; uint64_t remain = dev->vhost_hlen; uint64_t src = (uint64_t)(uintptr_t)hdr, dst; - uint64_t guest_addr = hdr_phys_addr; + uint64_t iova = buf_vec[0].buf_iova; + uint16_t hdr_vec_idx = 0; while (remain) { len = remain; - dst = vhost_iova_to_vva(dev, vq, - guest_addr, &len, - VHOST_ACCESS_RW); - if (unlikely(!dst || !len)) { - error = -1; - goto out; - } - + dst = buf_vec[hdr_vec_idx].buf_addr; rte_memcpy((void *)(uintptr_t)dst, (void *)(uintptr_t)src, len); @@ -474,50 +473,50 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, PRINT_PACKET(dev, (uintptr_t)dst, (uint32_t)len, 0); vhost_log_cache_write(dev, vq, - guest_addr, len); + iova, len); remain -= len; - guest_addr += len; + iova += len; src += len; + hdr_vec_idx++; } } else { PRINT_PACKET(dev, (uintptr_t)hdr_addr, dev->vhost_hlen, 0); - vhost_log_cache_write(dev, vq, hdr_phys_addr, + vhost_log_cache_write(dev, vq, + buf_vec[0].buf_iova, dev->vhost_hlen); } hdr_addr = 0; } - cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail); + cpy_len = RTE_MIN(buf_len, mbuf_avail); if (likely(cpy_len > MAX_BATCH_LEN || vq->batch_copy_nb_elems >= vq->size)) { - rte_memcpy((void 
*)((uintptr_t)(desc_addr + - desc_offset)), + rte_memcpy((void *)((uintptr_t)(buf_addr + buf_offset)), rte_pktmbuf_mtod_offset(m, void *, mbuf_offset), cpy_len); - vhost_log_cache_write(dev, vq, desc_gaddr + desc_offset, + vhost_log_cache_write(dev, vq, buf_iova + buf_offset, cpy_len); - PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset), + PRINT_PACKET(dev, (uintptr_t)(buf_addr + buf_offset), cpy_len, 0); } else { batch_copy[vq->batch_copy_nb_elems].dst = - (void *)((uintptr_t)(desc_addr + desc_offset)); + (void *)((uintptr_t)(buf_addr + buf_offset)); batch_copy[vq->batch_copy_nb_elems].src = rte_pktmbuf_mtod_offset(m, void *, mbuf_offset); batch_copy[vq->batch_copy_nb_elems].log_addr = - desc_gaddr + desc_offset; + buf_iova + buf_offset; batch_copy[vq->batch_copy_nb_elems].len = cpy_len; vq->batch_copy_nb_elems++; } mbuf_avail -= cpy_len; mbuf_offset += cpy_len; - desc_avail -= cpy_len; - desc_offset += cpy_len; - desc_chunck_len -= cpy_len; + buf_avail -= cpy_len; + buf_offset += cpy_len; } out: @@ -568,10 +567,11 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, avail_head = *((volatile uint16_t *)&vq->avail->idx); for (pkt_idx = 0; pkt_idx < count; pkt_idx++) { uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen; + uint16_t nr_vec = 0; if (unlikely(reserve_avail_buf(dev, vq, pkt_len, buf_vec, &num_buffers, - avail_head) < 0)) { + avail_head, &nr_vec) < 0)) { VHOST_LOG_DEBUG(VHOST_DATA, "(%d) failed to get enough desc from vring\n", dev->vid); @@ -584,7 +584,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, vq->last_avail_idx + num_buffers); if (copy_mbuf_to_desc(dev, vq, pkts[pkt_idx], - buf_vec, num_buffers) < 0) { + buf_vec, nr_vec, + num_buffers) < 0) { vq->shadow_used_idx -= num_buffers; break; } @@ -753,11 +754,10 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, struct buf_vector *buf_vec, uint16_t nr_vec, struct rte_mbuf *m, struct rte_mempool *mbuf_pool) { - uint64_t desc_addr, desc_gaddr; - 
uint32_t desc_avail, desc_offset; + uint32_t buf_avail, buf_offset; + uint64_t buf_addr, buf_iova, buf_len; uint32_t mbuf_avail, mbuf_offset; uint32_t cpy_len; - uint64_t desc_chunck_len; struct rte_mbuf *cur = m, *prev = m; struct virtio_net_hdr tmp_hdr; struct virtio_net_hdr *hdr = NULL; @@ -766,25 +766,25 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, struct batch_copy_elem *batch_copy = vq->batch_copy_elems; int error = 0; - desc_chunck_len = buf_vec[vec_idx].buf_len; - desc_gaddr = buf_vec[vec_idx].buf_addr; - desc_addr = vhost_iova_to_vva(dev, - vq, desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RO); - if (unlikely(buf_vec[vec_idx].buf_len < dev->vhost_hlen || - !desc_addr)) { + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; + + if (unlikely(buf_len < dev->vhost_hlen && nr_vec <= 1)) { error = -1; goto out; } + if (likely(nr_vec > 1)) + rte_prefetch0((void *)(uintptr_t)buf_vec[1].buf_addr); + if (virtio_net_with_host_offload(dev)) { - if (unlikely(desc_chunck_len < sizeof(struct virtio_net_hdr))) { - uint64_t len = desc_chunck_len; + if (unlikely(buf_len < sizeof(struct virtio_net_hdr))) { + uint64_t len; uint64_t remain = sizeof(struct virtio_net_hdr); - uint64_t src = desc_addr; + uint64_t src; uint64_t dst = (uint64_t)(uintptr_t)&tmp_hdr; - uint64_t guest_addr = desc_gaddr; + uint16_t hdr_vec_idx = 0; /* * No luck, the virtio-net header doesn't fit @@ -792,25 +792,18 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, */ while (remain) { len = remain; - src = vhost_iova_to_vva(dev, vq, - guest_addr, &len, - VHOST_ACCESS_RO); - if (unlikely(!src || !len)) { - error = -1; - goto out; - } - + src = buf_vec[hdr_vec_idx].buf_addr; rte_memcpy((void *)(uintptr_t)dst, (void *)(uintptr_t)src, len); - guest_addr += len; remain -= len; dst += len; + hdr_vec_idx++; } hdr = &tmp_hdr; } else { - hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr); + hdr 
= (struct virtio_net_hdr *)((uintptr_t)buf_addr); rte_prefetch0(hdr); } } @@ -820,68 +813,51 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, * for Tx: the first for storing the header, and others * for storing the data. */ - if (likely(buf_vec[vec_idx].buf_len == dev->vhost_hlen)) { + if (unlikely(buf_len < dev->vhost_hlen)) { + buf_offset = dev->vhost_hlen - buf_len; + vec_idx++; + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; + buf_avail = buf_len - buf_offset; + } else if (buf_len == dev->vhost_hlen) { if (unlikely(++vec_idx >= nr_vec)) goto out; + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; - desc_chunck_len = buf_vec[vec_idx].buf_len; - desc_gaddr = buf_vec[vec_idx].buf_addr; - desc_addr = vhost_iova_to_vva(dev, - vq, desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RO); - if (unlikely(!desc_addr)) { - error = -1; - goto out; - } - - desc_offset = 0; - desc_avail = buf_vec[vec_idx].buf_len; + buf_offset = 0; + buf_avail = buf_len; } else { - desc_avail = buf_vec[vec_idx].buf_len - dev->vhost_hlen; - - if (unlikely(desc_chunck_len < dev->vhost_hlen)) { - desc_chunck_len = desc_avail; - desc_gaddr += dev->vhost_hlen; - desc_addr = vhost_iova_to_vva(dev, - vq, desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RO); - if (unlikely(!desc_addr)) { - error = -1; - goto out; - } - - desc_offset = 0; - } else { - desc_offset = dev->vhost_hlen; - desc_chunck_len -= dev->vhost_hlen; - } + buf_offset = dev->vhost_hlen; + buf_avail = buf_vec[vec_idx].buf_len - dev->vhost_hlen; } - rte_prefetch0((void *)(uintptr_t)(desc_addr + desc_offset)); + rte_prefetch0((void *)(uintptr_t) + (buf_addr + buf_offset)); - PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset), - (uint32_t)desc_chunck_len, 0); + PRINT_PACKET(dev, + (uintptr_t)(buf_addr + buf_offset), + (uint32_t)buf_avail, 0); mbuf_offset = 0; mbuf_avail = 
m->buf_len - RTE_PKTMBUF_HEADROOM; while (1) { uint64_t hpa; - cpy_len = RTE_MIN(desc_chunck_len, mbuf_avail); + cpy_len = RTE_MIN(buf_avail, mbuf_avail); if (unlikely(dev->dequeue_zero_copy)) { - hpa = gpa_to_hpa(dev, - desc_gaddr + desc_offset, cpy_len); + hpa = gpa_to_hpa(dev, buf_iova + buf_offset, cpy_len); if (unlikely(!hpa)) { error = -1; goto out; } cur->data_len = cpy_len; cur->data_off = 0; - cur->buf_addr = (void *)(uintptr_t)(desc_addr - + desc_offset); + cur->buf_addr = + (void *)(uintptr_t)(buf_addr + buf_offset); cur->buf_iova = hpa; /* @@ -892,21 +868,19 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, } else { if (likely(cpy_len > MAX_BATCH_LEN || vq->batch_copy_nb_elems >= vq->size || - (hdr && cur == m) || - buf_vec[vec_idx].buf_len != - desc_chunck_len)) { + (hdr && cur == m))) { rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, mbuf_offset), - (void *)((uintptr_t)(desc_addr + - desc_offset)), + (void *)((uintptr_t)(buf_addr + + buf_offset)), cpy_len); } else { batch_copy[vq->batch_copy_nb_elems].dst = rte_pktmbuf_mtod_offset(cur, void *, mbuf_offset); batch_copy[vq->batch_copy_nb_elems].src = - (void *)((uintptr_t)(desc_addr + - desc_offset)); + (void *)((uintptr_t)(buf_addr + + buf_offset)); batch_copy[vq->batch_copy_nb_elems].len = cpy_len; vq->batch_copy_nb_elems++; @@ -915,48 +889,25 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, mbuf_avail -= cpy_len; mbuf_offset += cpy_len; - desc_avail -= cpy_len; - desc_chunck_len -= cpy_len; - desc_offset += cpy_len; + buf_avail -= cpy_len; + buf_offset += cpy_len; - /* This desc reaches to its end, get the next one */ - if (desc_avail == 0) { + /* This buf reaches to its end, get the next one */ + if (buf_avail == 0) { if (++vec_idx >= nr_vec) break; - desc_chunck_len = buf_vec[vec_idx].buf_len; - desc_gaddr = buf_vec[vec_idx].buf_addr; - desc_addr = vhost_iova_to_vva(dev, - vq, desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RO); - if 
(unlikely(!desc_addr)) { - error = -1; - goto out; - } - - rte_prefetch0((void *)(uintptr_t)desc_addr); + buf_addr = buf_vec[vec_idx].buf_addr; + buf_iova = buf_vec[vec_idx].buf_iova; + buf_len = buf_vec[vec_idx].buf_len; - desc_offset = 0; - desc_avail = buf_vec[vec_idx].buf_len; + rte_prefetch0((void *)(uintptr_t)buf_addr); - PRINT_PACKET(dev, (uintptr_t)desc_addr, - (uint32_t)desc_chunck_len, 0); - } else if (unlikely(desc_chunck_len == 0)) { - desc_chunck_len = desc_avail; - desc_gaddr += desc_offset; - desc_addr = vhost_iova_to_vva(dev, vq, - desc_gaddr, - &desc_chunck_len, - VHOST_ACCESS_RO); - if (unlikely(!desc_addr)) { - error = -1; - goto out; - } - desc_offset = 0; + buf_offset = 0; + buf_avail = buf_len; - PRINT_PACKET(dev, (uintptr_t)desc_addr, - (uint32_t)desc_chunck_len, 0); + PRINT_PACKET(dev, (uintptr_t)buf_addr, + (uint32_t)buf_avail, 0); } /* @@ -1175,7 +1126,8 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, if (unlikely(fill_vec_buf(dev, vq, vq->last_avail_idx + i, &nr_vec, buf_vec, - &head_idx, &dummy_len) < 0)) + &head_idx, &dummy_len, + VHOST_ACCESS_RO) < 0)) break; if (likely(dev->dequeue_zero_copy == 0)) From patchwork Wed Jun 27 14:49:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41670 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B0A941BF66; Wed, 27 Jun 2018 16:50:26 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id F07411BF4B for ; Wed, 27 Jun 2018 16:50:19 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with 
ESMTPS id 9621F401022E; Wed, 27 Jun 2018 14:50:19 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com (Postfix) with ESMTP id 749FF20389E0; Wed, 27 Jun 2018 14:50:18 +0000 (UTC) From: Maxime Coquelin To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org Cc: Maxime Coquelin Date: Wed, 27 Jun 2018 16:49:57 +0200 Message-Id: <20180627144959.17277-6-maxime.coquelin@redhat.com> In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com> References: <20180627144959.17277-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Wed, 27 Jun 2018 14:50:19 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Wed, 27 Jun 2018 14:50:19 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [PATCH v3 5/7] vhost: improve prefetching in dequeue path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Optimize the dequeue path by prefetching the next buffer while the current one is being processed. 
Signed-off-by: Maxime Coquelin --- lib/librte_vhost/virtio_net.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 371d0e646..404968cd0 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -901,7 +901,13 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; - rte_prefetch0((void *)(uintptr_t)buf_addr); + /* + * Prefetch desc n + 1 buffer while + * desc n buffer is processed. + */ + if (vec_idx + 1 < nr_vec) + rte_prefetch0((void *)(uintptr_t) + buf_vec[vec_idx + 1].buf_addr); buf_offset = 0; buf_avail = buf_len; @@ -1133,6 +1139,8 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, if (likely(dev->dequeue_zero_copy == 0)) update_shadow_used_ring(vq, head_idx, 0); + rte_prefetch0((void *)(uintptr_t)buf_vec[0].buf_addr); + pkts[i] = rte_pktmbuf_alloc(mbuf_pool); if (unlikely(pkts[i] == NULL)) { RTE_LOG(ERR, VHOST_DATA, From patchwork Wed Jun 27 14:49:58 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41671 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 051D51BF6E; Wed, 27 Jun 2018 16:50:28 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id 60C851BF50 for ; Wed, 27 Jun 2018 16:50:21 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 063F8818F6F5; Wed, 27 Jun 2018 14:50:21 +0000 (UTC) Received: from localhost.localdomain 
(ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com (Postfix) with ESMTP id E460F20389E0; Wed, 27 Jun 2018 14:50:19 +0000 (UTC) From: Maxime Coquelin To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org Cc: Maxime Coquelin Date: Wed, 27 Jun 2018 16:49:58 +0200 Message-Id: <20180627144959.17277-7-maxime.coquelin@redhat.com> In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com> References: <20180627144959.17277-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 27 Jun 2018 14:50:21 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 27 Jun 2018 14:50:21 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [PATCH v3 6/7] vhost: prefetch first descriptor in dequeue path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Signed-off-by: Maxime Coquelin --- lib/librte_vhost/virtio_net.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 404968cd0..ff6fa8d61 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -1082,6 +1082,8 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, vq->shadow_used_idx = 0; } + rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]); + /* * Construct a RARP broadcast packet, and inject it to the "pkts" * array, to looks like that guest actually send such packet. 
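Note on the hunk above: the prefetched address is the avail ring slot holding the next descriptor head. The slot is obtained by masking the free-running last_avail_idx with (size - 1), which only works because a split ring's size is a power of two. A minimal standalone sketch of that index computation follows; ring_slot() is a hypothetical helper written for illustration and is not part of librte_vhost.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a DPDK function: map a free-running 16-bit
 * index onto a ring whose size is a power of two.
 */
static inline uint16_t ring_slot(uint16_t idx, uint16_t size)
{
	/* The mask trick is only valid for non-zero power-of-two sizes. */
	assert(size != 0 && (size & (size - 1)) == 0);
	return (uint16_t)(idx & (size - 1));
}
```

With a ring of 256 entries, index 257 maps to slot 1, and the index can wrap past 65535 without any special handling, which is why the patch can compute the prefetch address with a single AND.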
From patchwork Wed Jun 27 14:49:59 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxime Coquelin X-Patchwork-Id: 41672 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8194F1BF7A; Wed, 27 Jun 2018 16:50:30 +0200 (CEST) Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id D67701BF59 for ; Wed, 27 Jun 2018 16:50:22 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 62E08407C570; Wed, 27 Jun 2018 14:50:22 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5793220389E0; Wed, 27 Jun 2018 14:50:21 +0000 (UTC) From: Maxime Coquelin To: tiwei.bie@intel.com, zhihong.wang@intel.com, dev@dpdk.org Cc: Maxime Coquelin Date: Wed, 27 Jun 2018 16:49:59 +0200 Message-Id: <20180627144959.17277-8-maxime.coquelin@redhat.com> In-Reply-To: <20180627144959.17277-1-maxime.coquelin@redhat.com> References: <20180627144959.17277-1-maxime.coquelin@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Wed, 27 Jun 2018 14:50:22 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Wed, 27 Jun 2018 14:50:22 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: [dpdk-dev] [PATCH v3 7/7] vhost: improve prefetching in enqueue path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 
Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Optimize the enqueue path by prefetching the next buffer while the current one is being processed. Signed-off-by: Maxime Coquelin --- lib/librte_vhost/virtio_net.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index ff6fa8d61..81377e79a 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -393,6 +393,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; + if (nr_vec > 1) + rte_prefetch0((void *)(uintptr_t)buf_vec[1].buf_addr); + if (unlikely(buf_len < dev->vhost_hlen && nr_vec <= 1)) { error = -1; goto out; @@ -404,7 +407,6 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, hdr = &tmp_hdr; else hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)hdr_addr; - rte_prefetch0((void *)(uintptr_t)hdr_addr); VHOST_LOG_DEBUG(VHOST_DATA, "(%d) RX: num merge buffers %d\n", dev->vid, num_buffers); @@ -436,8 +438,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; - /* Prefetch buffer address. */ - rte_prefetch0((void *)(uintptr_t)buf_addr); + /* Prefetch next buffer address. */ + if (vec_idx + 1 < nr_vec) + rte_prefetch0((void *)(uintptr_t) + buf_vec[vec_idx + 1].buf_addr); buf_offset = 0; buf_avail = buf_len; } @@ -579,6 +583,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, break; } + rte_prefetch0((void *)(uintptr_t)buf_vec[0].buf_addr); + VHOST_LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n", dev->vid, vq->last_avail_idx, vq->last_avail_idx + num_buffers);
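For readers outside DPDK, the access pattern these prefetch patches introduce can be sketched in isolation: warm buffer 0 before the loop, then issue a bounds-checked prefetch for buffer n + 1 while buffer n is processed. The code below is an illustration under stated assumptions, not the vhost implementation: struct buf_elem, prefetch0() and sum_bufs() are hypothetical names, and the GCC/Clang builtin __builtin_prefetch stands in for rte_prefetch0().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vhost buf_vec entries; not the DPDK type. */
struct buf_elem {
	const uint8_t *addr;
	size_t len;
};

/*
 * Assumed substitute for rte_prefetch0(). A prefetch is only a hint,
 * so it never changes the computed result.
 */
static inline void prefetch0(const void *p)
{
#if defined(__GNUC__) || defined(__clang__)
	__builtin_prefetch(p, 0, 3);
#else
	(void)p;
#endif
}

/* Sum all bytes, prefetching buffer n + 1 while buffer n is processed. */
static uint64_t sum_bufs(const struct buf_elem *vec, uint16_t nr_vec)
{
	uint64_t sum = 0;
	uint16_t i;
	size_t j;

	assert(vec != NULL || nr_vec == 0);
	if (nr_vec == 0)
		return 0;

	/* Warm the first buffer before the loop starts. */
	prefetch0(vec[0].addr);

	for (i = 0; i < nr_vec; i++) {
		/* Same bounds check as "vec_idx + 1 < nr_vec" in the patch. */
		if (i + 1 < nr_vec)
			prefetch0(vec[i + 1].addr);

		for (j = 0; j < vec[i].len; j++)
			sum += vec[i].addr[j];
	}
	return sum;
}
```

The bounds check matters because the prefetch reads the next vector entry to get its address; without it the last iteration would index one element past the end of the array. Whether the prefetch actually helps depends on buffer sizes and cache behaviour, so any gain should be measured rather than assumed.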