From patchwork Wed May 17 09:08:47 2023
X-Patchwork-Submitter: Eelco Chaudron
X-Patchwork-Id: 126909
X-Patchwork-Delegate: maxime.coquelin@redhat.com
From: Eelco Chaudron
To: maxime.coquelin@redhat.com, chenbo.xia@intel.com, david.marchand@redhat.com
Cc: dev@dpdk.org
Subject: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
Date: Wed, 17 May 2023 11:08:47 +0200
Message-Id: <168431452543.558450.14131829672896784074.stgit@ebuild.local>
In-Reply-To: <168431450017.558450.16680518469610688737.stgit@ebuild.local>
References: <168431450017.558450.16680518469610688737.stgit@ebuild.local>

This change will allow the vhost interrupt datapath handling to be split
between two processes without one of them having to hold an explicit lock.
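To illustrate the intent of the change, below is a minimal sketch (not part of the patch) of the locking pattern this series moves to: several datapath or interrupt-injection threads can hold the per-queue lock in shared (read) mode at the same time, while control-path reconfiguration still takes it exclusively. The struct demo_queue, demo_datapath_poll() and demo_control_reset() names are made up for illustration; only the rte_rwlock_* calls are real DPDK API.

#include <stdint.h>
#include <string.h>
#include <rte_rwlock.h>

/* Hypothetical per-queue state, mirroring the role of vhost_virtqueue. */
struct demo_queue {
	rte_rwlock_t access_lock;	/* replaces the former rte_spinlock_t */
	uint64_t stats[8];
	int enabled;
};

static void
demo_queue_init(struct demo_queue *q)
{
	rte_rwlock_init(&q->access_lock);
	q->enabled = 1;
}

/* Datapath / interrupt side: several threads may run this concurrently,
 * since they only need to exclude control-path reconfiguration.
 */
static void
demo_datapath_poll(struct demo_queue *q)
{
	rte_rwlock_read_lock(&q->access_lock);
	if (q->enabled)
		q->stats[0]++;	/* real code makes such counters atomic, see patch 2/4 */
	rte_rwlock_read_unlock(&q->access_lock);
}

/* Control path: exclusive access, like async channel (un)registration
 * or stats reset in this patch.
 */
static void
demo_control_reset(struct demo_queue *q)
{
	rte_rwlock_write_lock(&q->access_lock);
	memset(q->stats, 0, sizeof(q->stats));
	rte_rwlock_write_unlock(&q->access_lock);
}

Note also that the trylock return convention differs between the two lock types: rte_spinlock_trylock() returns 1 on success, whereas rte_rwlock_read_trylock()/rte_rwlock_write_trylock() return 0 on success and non-zero when busy, which is why the patch inverts the trylock conditions below.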
Signed-off-by: Eelco Chaudron Tested-by: Maxime Coquelin Reviewed-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/eal/include/generic/rte_rwlock.h | 17 ++++++ lib/vhost/vhost.c | 46 +++++++++-------- lib/vhost/vhost.h | 4 +- lib/vhost/vhost_user.c | 14 +++-- lib/vhost/virtio_net.c | 90 +++++++++++++++++----------------- 5 files changed, 94 insertions(+), 77 deletions(-) diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h index 71e2d8d5f4..9e083bbc61 100644 --- a/lib/eal/include/generic/rte_rwlock.h +++ b/lib/eal/include/generic/rte_rwlock.h @@ -236,6 +236,23 @@ rte_rwlock_write_unlock(rte_rwlock_t *rwl) __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE); } +/** + * Test if the write lock is taken. + * + * @param rwl + * A pointer to a rwlock structure. + * @return + * 1 if the write lock is currently taken; 0 otherwise. + */ +static inline int +rte_rwlock_write_is_locked(rte_rwlock_t *rwl) +{ + if (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) & RTE_RWLOCK_WRITE) + return 1; + + return 0; +} + /** * Try to execute critical section in a hardware memory transaction, if it * fails or not available take a read lock diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index ef37943817..74bdbfd810 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -393,9 +393,9 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq) else rte_free(vq->shadow_used_split); - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); vhost_free_async_mem(vq); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); rte_free(vq->batch_copy_elems); vhost_user_iotlb_destroy(vq); rte_free(vq->log_cache); @@ -630,7 +630,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx) dev->virtqueue[i] = vq; init_vring_queue(dev, vq, i); - rte_spinlock_init(&vq->access_lock); + rte_rwlock_init(&vq->access_lock); vq->avail_wrap_counter = 1; vq->used_wrap_counter = 1; vq->signalled_used_valid = false; @@ -1305,14 +1305,14 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx) if (!vq) return -1; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_read_lock(&vq->access_lock); if (vq_is_packed(dev)) vhost_vring_call_packed(dev, vq); else vhost_vring_call_split(dev, vq); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); return 0; } @@ -1334,7 +1334,7 @@ rte_vhost_vring_call_nonblock(int vid, uint16_t vring_idx) if (!vq) return -1; - if (!rte_spinlock_trylock(&vq->access_lock)) + if (rte_rwlock_read_trylock(&vq->access_lock)) return -EAGAIN; if (vq_is_packed(dev)) @@ -1342,7 +1342,7 @@ rte_vhost_vring_call_nonblock(int vid, uint16_t vring_idx) else vhost_vring_call_split(dev, vq); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); return 0; } @@ -1365,7 +1365,7 @@ rte_vhost_avail_entries(int vid, uint16_t queue_id) if (!vq) return 0; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); if (unlikely(!vq->enabled || vq->avail == NULL)) goto out; @@ -1373,7 +1373,7 @@ rte_vhost_avail_entries(int vid, uint16_t queue_id) ret = *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx; out: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return ret; } @@ -1457,12 +1457,12 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) if (!vq) return -1; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); vq->notif_enable = enable; ret = 
vhost_enable_guest_notification(dev, vq, enable); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return ret; } @@ -1520,7 +1520,7 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid) if (vq == NULL) return 0; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); if (unlikely(!vq->enabled || vq->avail == NULL)) goto out; @@ -1528,7 +1528,7 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid) ret = *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx; out: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return ret; } @@ -1757,9 +1757,9 @@ rte_vhost_async_channel_register(int vid, uint16_t queue_id) if (unlikely(vq == NULL || !dev->async_copy || dev->vdpa_dev != NULL)) return -1; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); ret = async_channel_register(dev, vq); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return ret; } @@ -1804,7 +1804,7 @@ rte_vhost_async_channel_unregister(int vid, uint16_t queue_id) if (vq == NULL) return ret; - if (!rte_spinlock_trylock(&vq->access_lock)) { + if (rte_rwlock_write_trylock(&vq->access_lock)) { VHOST_LOG_CONFIG(dev->ifname, ERR, "failed to unregister async channel, virtqueue busy.\n"); return ret; @@ -1821,7 +1821,7 @@ rte_vhost_async_channel_unregister(int vid, uint16_t queue_id) ret = 0; } - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return ret; } @@ -1954,7 +1954,7 @@ rte_vhost_async_get_inflight(int vid, uint16_t queue_id) if (vq == NULL) return ret; - if (!rte_spinlock_trylock(&vq->access_lock)) { + if (rte_rwlock_write_trylock(&vq->access_lock)) { VHOST_LOG_CONFIG(dev->ifname, DEBUG, "failed to check in-flight packets. 
virtqueue busy.\n"); return ret; @@ -1963,7 +1963,7 @@ rte_vhost_async_get_inflight(int vid, uint16_t queue_id) if (vq->async) ret = vq->async->pkts_inflight_n; - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return ret; } @@ -2084,13 +2084,13 @@ rte_vhost_vring_stats_get(int vid, uint16_t queue_id, vq = dev->virtqueue[queue_id]; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); for (i = 0; i < VHOST_NB_VQ_STATS; i++) { stats[i].value = *(uint64_t *)(((char *)vq) + vhost_vq_stat_strings[i].offset); stats[i].id = i; } - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return VHOST_NB_VQ_STATS; } @@ -2111,9 +2111,9 @@ int rte_vhost_vring_stats_reset(int vid, uint16_t queue_id) vq = dev->virtqueue[queue_id]; - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); memset(&vq->stats, 0, sizeof(vq->stats)); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return 0; } diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 8fdab13c70..5c939ef06f 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -277,7 +277,7 @@ struct vhost_virtqueue { bool access_ok; bool ready; - rte_spinlock_t access_lock; + rte_rwlock_t access_lock; union { @@ -517,7 +517,7 @@ static inline void vq_assert_lock__(struct virtio_net *dev, struct vhost_virtqueue *vq, const char *func) __rte_assert_exclusive_lock(&vq->access_lock) { - if (unlikely(!rte_spinlock_is_locked(&vq->access_lock))) + if (unlikely(!rte_rwlock_write_is_locked(&vq->access_lock))) rte_panic("VHOST_CONFIG: (%s) %s() called without access lock taken.\n", dev->ifname, func); } diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index d60e39b6bc..c9454ce3d9 100644 --- a/lib/vhost/vhost_user.c +++ b/lib/vhost/vhost_user.c @@ -399,7 +399,7 @@ vhost_user_set_features(struct virtio_net **pdev, cleanup_vq_inflight(dev, vq); /* vhost_user_lock_all_queue_pairs locked all qps */ vq_assert_lock(dev, vq); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); free_vq(dev, vq); } } @@ -2649,10 +2649,10 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, len, imsg->perm); if (is_vring_iotlb(dev, vq, imsg)) { - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); translate_ring_addresses(&dev, &vq); *pdev = dev; - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); } } break; @@ -2667,9 +2667,9 @@ vhost_user_iotlb_msg(struct virtio_net **pdev, imsg->size); if (is_vring_iotlb(dev, vq, imsg)) { - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); vring_invalidate(dev, vq); - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); } } break; @@ -3030,7 +3030,7 @@ vhost_user_lock_all_queue_pairs(struct virtio_net *dev) struct vhost_virtqueue *vq = dev->virtqueue[i]; if (vq) { - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_write_lock(&vq->access_lock); vq_num++; } i++; @@ -3048,7 +3048,7 @@ vhost_user_unlock_all_queue_pairs(struct virtio_net *dev) struct vhost_virtqueue *vq = dev->virtqueue[i]; if (vq) { - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); vq_num++; } i++; diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index fe10a78586..d7624d18c8 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -55,7 +55,7 @@ is_valid_virt_queue_idx(uint32_t idx, int is_tx, uint32_t nr_vring) static 
inline void vhost_queue_stats_update(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf **pkts, uint16_t count) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct virtqueue_stats *stats = &vq->stats; int i; @@ -102,7 +102,7 @@ static __rte_always_inline int64_t vhost_async_dma_transfer_one(struct virtio_net *dev, struct vhost_virtqueue *vq, int16_t dma_id, uint16_t vchan_id, uint16_t flag_idx, struct vhost_iov_iter *pkt) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct async_dma_vchan_info *dma_info = &dma_copy_track[dma_id].vchans[vchan_id]; uint16_t ring_mask = dma_info->ring_mask; @@ -152,7 +152,7 @@ static __rte_always_inline uint16_t vhost_async_dma_transfer(struct virtio_net *dev, struct vhost_virtqueue *vq, int16_t dma_id, uint16_t vchan_id, uint16_t head_idx, struct vhost_iov_iter *pkts, uint16_t nr_pkts) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct async_dma_vchan_info *dma_info = &dma_copy_track[dma_id].vchans[vchan_id]; int64_t ret, nr_copies = 0; @@ -454,7 +454,7 @@ vhost_async_shadow_enqueue_packed_batch(struct vhost_virtqueue *vq, static __rte_always_inline void vhost_async_shadow_dequeue_packed_batch(struct vhost_virtqueue *vq, uint16_t *ids) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { uint16_t i; struct vhost_async *async = vq->async; @@ -1131,7 +1131,7 @@ static __rte_always_inline int async_fill_seg(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf *m, uint32_t mbuf_offset, uint64_t buf_iova, uint32_t cpy_len, bool to_desc) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { struct vhost_async *async = vq->async; @@ -1211,7 +1211,7 @@ static __rte_always_inline int mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf *m, struct buf_vector *buf_vec, uint16_t nr_vec, uint16_t num_buffers, bool is_async) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint32_t vec_idx = 0; @@ -1341,7 +1341,7 @@ vhost_enqueue_single_packed(struct virtio_net *dev, struct rte_mbuf *pkt, struct buf_vector *buf_vec, uint16_t *nr_descs) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint16_t nr_vec = 0; @@ -1403,7 +1403,7 @@ vhost_enqueue_single_packed(struct virtio_net *dev, static __rte_noinline uint32_t virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf **pkts, uint32_t count) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint32_t pkt_idx = 0; @@ -1635,7 +1635,7 @@ static __rte_always_inline int16_t virtio_dev_rx_single_packed(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf *pkt) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { struct buf_vector buf_vec[BUF_VECTOR_MAX]; @@ -1661,7 +1661,7 @@ virtio_dev_rx_packed(struct virtio_net *dev, struct vhost_virtqueue *__rte_restrict vq, struct rte_mbuf **__rte_restrict pkts, uint32_t count) - 
__rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint32_t pkt_idx = 0; @@ -1701,7 +1701,7 @@ virtio_dev_rx(struct virtio_net *dev, struct vhost_virtqueue *vq, uint32_t nb_tx = 0; VHOST_LOG_DATA(dev->ifname, DEBUG, "%s\n", __func__); - rte_spinlock_lock(&vq->access_lock); + rte_rwlock_read_lock(&vq->access_lock); if (unlikely(!vq->enabled)) goto out_access_unlock; @@ -1727,7 +1727,7 @@ virtio_dev_rx(struct virtio_net *dev, struct vhost_virtqueue *vq, vhost_user_iotlb_rd_unlock(vq); out_access_unlock: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); return nb_tx; } @@ -1760,7 +1760,7 @@ rte_vhost_enqueue_burst(int vid, uint16_t queue_id, static __rte_always_inline uint16_t async_get_first_inflight_pkt_idx(struct vhost_virtqueue *vq) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct vhost_async *async = vq->async; @@ -2165,7 +2165,7 @@ virtio_dev_rx_async_submit_packed(struct virtio_net *dev, struct vhost_virtqueue static __rte_always_inline void write_back_completed_descs_split(struct vhost_virtqueue *vq, uint16_t n_descs) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct vhost_async *async = vq->async; uint16_t nr_left = n_descs; @@ -2198,7 +2198,7 @@ write_back_completed_descs_split(struct vhost_virtqueue *vq, uint16_t n_descs) static __rte_always_inline void write_back_completed_descs_packed(struct vhost_virtqueue *vq, uint16_t n_buffers) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct vhost_async *async = vq->async; uint16_t from = async->last_buffer_idx_packed; @@ -2263,7 +2263,7 @@ write_back_completed_descs_packed(struct vhost_virtqueue *vq, static __rte_always_inline uint16_t vhost_poll_enqueue_completed(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct vhost_async *async = vq->async; struct async_inflight_info *pkts_info = async->pkts_info; @@ -2357,7 +2357,7 @@ rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id, vq = dev->virtqueue[queue_id]; - if (!rte_spinlock_trylock(&vq->access_lock)) { + if (rte_rwlock_read_trylock(&vq->access_lock)) { VHOST_LOG_DATA(dev->ifname, DEBUG, "%s: virtqueue %u is busy.\n", __func__, queue_id); @@ -2377,7 +2377,7 @@ rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id, vq->stats.inflight_completed += n_pkts_cpl; out: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); return n_pkts_cpl; } @@ -2465,7 +2465,7 @@ rte_vhost_clear_queue(int vid, uint16_t queue_id, struct rte_mbuf **pkts, vq = dev->virtqueue[queue_id]; - if (!rte_spinlock_trylock(&vq->access_lock)) { + if (rte_rwlock_read_trylock(&vq->access_lock)) { VHOST_LOG_DATA(dev->ifname, DEBUG, "%s: virtqueue %u is busy.\n", __func__, queue_id); return 0; @@ -2495,7 +2495,7 @@ rte_vhost_clear_queue(int vid, uint16_t queue_id, struct rte_mbuf **pkts, vq->stats.inflight_completed += n_pkts_cpl; out_access_unlock: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); return n_pkts_cpl; } @@ -2516,7 +2516,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, struct vhost_virtqueue *vq, return 0; } - rte_spinlock_lock(&vq->access_lock); + 
rte_rwlock_write_lock(&vq->access_lock); if (unlikely(!vq->enabled || !vq->async)) goto out_access_unlock; @@ -2544,7 +2544,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, struct vhost_virtqueue *vq, vhost_user_iotlb_rd_unlock(vq); out_access_unlock: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_write_unlock(&vq->access_lock); return nb_tx; } @@ -2866,7 +2866,7 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, struct buf_vector *buf_vec, uint16_t nr_vec, struct rte_mbuf *m, struct rte_mempool *mbuf_pool, bool legacy_ol_flags, uint16_t slot_idx, bool is_async) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint32_t buf_avail, buf_offset, buf_len; @@ -3073,7 +3073,7 @@ static uint16_t virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint16_t i; @@ -3178,7 +3178,7 @@ static uint16_t virtio_dev_tx_split_legacy(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, true); @@ -3189,7 +3189,7 @@ static uint16_t virtio_dev_tx_split_compliant(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, false); @@ -3393,7 +3393,7 @@ vhost_dequeue_single_packed(struct virtio_net *dev, uint16_t *buf_id, uint16_t *desc_count, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { struct buf_vector buf_vec[BUF_VECTOR_MAX]; @@ -3443,7 +3443,7 @@ virtio_dev_tx_single_packed(struct virtio_net *dev, struct rte_mempool *mbuf_pool, struct rte_mbuf *pkts, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { @@ -3475,7 +3475,7 @@ virtio_dev_tx_packed(struct virtio_net *dev, struct rte_mbuf **__rte_restrict pkts, uint32_t count, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint32_t pkt_idx = 0; @@ -3520,7 +3520,7 @@ static uint16_t virtio_dev_tx_packed_legacy(struct virtio_net *dev, struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **__rte_restrict pkts, uint32_t count) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, true); @@ -3531,7 +3531,7 @@ static uint16_t virtio_dev_tx_packed_compliant(struct virtio_net *dev, struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **__rte_restrict pkts, uint32_t count) - 
__rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, false); @@ -3566,7 +3566,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, vq = dev->virtqueue[queue_id]; - if (unlikely(rte_spinlock_trylock(&vq->access_lock) == 0)) + if (unlikely(rte_rwlock_read_trylock(&vq->access_lock) != 0)) return 0; if (unlikely(!vq->enabled)) { @@ -3636,7 +3636,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, vhost_user_iotlb_rd_unlock(vq); out_access_unlock: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); if (unlikely(rarp_mbuf != NULL)) count += 1; @@ -3648,7 +3648,7 @@ static __rte_always_inline uint16_t async_poll_dequeue_completed(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { uint16_t start_idx, from, i; uint16_t nr_cpl_pkts = 0; @@ -3695,7 +3695,7 @@ static __rte_always_inline uint16_t virtio_dev_tx_async_split(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { static bool allocerr_warned; @@ -3844,7 +3844,7 @@ virtio_dev_tx_async_split_legacy(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_async_split(dev, vq, mbuf_pool, @@ -3857,7 +3857,7 @@ virtio_dev_tx_async_split_compliant(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_async_split(dev, vq, mbuf_pool, @@ -3867,7 +3867,7 @@ virtio_dev_tx_async_split_compliant(struct virtio_net *dev, static __rte_always_inline void vhost_async_shadow_dequeue_single_packed(struct vhost_virtqueue *vq, uint16_t buf_id, uint16_t count) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) { struct vhost_async *async = vq->async; uint16_t idx = async->buffer_idx_packed; @@ -3889,7 +3889,7 @@ virtio_dev_tx_async_single_packed(struct virtio_net *dev, struct rte_mbuf *pkts, uint16_t slot_idx, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { int err; @@ -3942,7 +3942,7 @@ virtio_dev_tx_async_packed_batch(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mbuf **pkts, uint16_t slot_idx, uint16_t dma_id, uint16_t vchan_id) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint16_t avail_idx = vq->last_avail_idx; @@ -4000,7 +4000,7 @@ static __rte_always_inline uint16_t 
virtio_dev_tx_async_packed(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, uint16_t dma_id, uint16_t vchan_id, bool legacy_ol_flags) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { uint32_t pkt_idx = 0; @@ -4111,7 +4111,7 @@ static uint16_t virtio_dev_tx_async_packed_legacy(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, uint16_t dma_id, uint16_t vchan_id) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_async_packed(dev, vq, mbuf_pool, @@ -4123,7 +4123,7 @@ static uint16_t virtio_dev_tx_async_packed_compliant(struct virtio_net *dev, struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, uint16_t dma_id, uint16_t vchan_id) - __rte_exclusive_locks_required(&vq->access_lock) + __rte_shared_locks_required(&vq->access_lock) __rte_shared_locks_required(&vq->iotlb_lock) { return virtio_dev_tx_async_packed(dev, vq, mbuf_pool, @@ -4173,7 +4173,7 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id, vq = dev->virtqueue[queue_id]; - if (unlikely(rte_spinlock_trylock(&vq->access_lock) == 0)) + if (unlikely(rte_rwlock_read_trylock(&vq->access_lock) != 0)) return 0; if (unlikely(vq->enabled == 0)) { @@ -4255,7 +4255,7 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id, vhost_user_iotlb_rd_unlock(vq); out_access_unlock: - rte_spinlock_unlock(&vq->access_lock); + rte_rwlock_read_unlock(&vq->access_lock); if (unlikely(rarp_mbuf != NULL)) count += 1; From patchwork Wed May 17 09:08:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eelco Chaudron X-Patchwork-Id: 126910 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D83FA42B2A; Wed, 17 May 2023 11:09:02 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A0FE442D2F; Wed, 17 May 2023 11:09:02 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 650854114A for ; Wed, 17 May 2023 11:09:00 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684314539; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q4L5prnidxqi8bdQ9JYTRVD+0dnlAfo32PGFitT4KoE=; b=JSXlntnrPGhoDmqP90urLkS2PMD2q7Lq9cFpmRX/UnlQjEjdN8rtVZB/O8mWvwBqh4PryN HcyHTsAHfYASVcc+y1VYkBUS900YF8dz9UHJl85fmCkPFNwH+WC2+cCbKNxyD/Hp+M0QLi vlqDIkK57srCs2XrT6gcAQ/GgrrXr8Q= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-345-WPZ1DnAoN-yEPyV2cgVjig-1; Wed, 17 May 2023 05:08:58 -0400 X-MC-Unique: WPZ1DnAoN-yEPyV2cgVjig-1 Received: from smtp.corp.redhat.com 
(int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4459485A588; Wed, 17 May 2023 09:08:58 +0000 (UTC) Received: from ebuild.redhat.com (unknown [10.39.195.120]) by smtp.corp.redhat.com (Postfix) with ESMTP id 874A5492B00; Wed, 17 May 2023 09:08:57 +0000 (UTC) From: Eelco Chaudron To: maxime.coquelin@redhat.com, chenbo.xia@intel.com, david.marchand@redhat.com Cc: dev@dpdk.org Subject: [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic Date: Wed, 17 May 2023 11:08:56 +0200 Message-Id: <168431453456.558450.8798179744539843068.stgit@ebuild.local> In-Reply-To: <168431450017.558450.16680518469610688737.stgit@ebuild.local> References: <168431450017.558450.16680518469610688737.stgit@ebuild.local> User-Agent: StGit/1.5 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Making the guest_notifications statistic counter atomic, allows it to be safely incremented while holding the read access_lock. Signed-off-by: Eelco Chaudron Reviewed-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- lib/vhost/vhost.c | 8 ++++++++ lib/vhost/vhost.h | 9 ++++++--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index 74bdbfd810..8ff6434c93 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -2086,6 +2086,10 @@ rte_vhost_vring_stats_get(int vid, uint16_t queue_id, rte_rwlock_write_lock(&vq->access_lock); for (i = 0; i < VHOST_NB_VQ_STATS; i++) { + /* + * No need to the read atomic counters as such, due to the + * above write access_lock preventing them to be updated. + */ stats[i].value = *(uint64_t *)(((char *)vq) + vhost_vq_stat_strings[i].offset); stats[i].id = i; @@ -2112,6 +2116,10 @@ int rte_vhost_vring_stats_reset(int vid, uint16_t queue_id) vq = dev->virtqueue[queue_id]; rte_rwlock_write_lock(&vq->access_lock); + /* + * No need to the reset atomic counters as such, due to the + * above write access_lock preventing them to be updated. + */ memset(&vq->stats, 0, sizeof(vq->stats)); rte_rwlock_write_unlock(&vq->access_lock); diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 5c939ef06f..37609c7c8d 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -135,11 +135,12 @@ struct virtqueue_stats { uint64_t broadcast; /* Size bins in array as RFC 2819, undersized [0], 64 [1], etc */ uint64_t size_bins[8]; - uint64_t guest_notifications; uint64_t iotlb_hits; uint64_t iotlb_misses; uint64_t inflight_submitted; uint64_t inflight_completed; + /* Counters below are atomic, and should be incremented as such. 
*/ + uint64_t guest_notifications; }; /** @@ -907,7 +908,8 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) unlikely(!signalled_used_valid)) { eventfd_write(vq->callfd, (eventfd_t) 1); if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - vq->stats.guest_notifications++; + __atomic_fetch_add(&vq->stats.guest_notifications, + 1, __ATOMIC_RELAXED); if (dev->notify_ops->guest_notified) dev->notify_ops->guest_notified(dev->vid); } @@ -917,7 +919,8 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) && (vq->callfd >= 0)) { eventfd_write(vq->callfd, (eventfd_t)1); if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - vq->stats.guest_notifications++; + __atomic_fetch_add(&vq->stats.guest_notifications, + 1, __ATOMIC_RELAXED); if (dev->notify_ops->guest_notified) dev->notify_ops->guest_notified(dev->vid); } From patchwork Wed May 17 09:09:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eelco Chaudron X-Patchwork-Id: 126912 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1472342B2A; Wed, 17 May 2023 11:09:18 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 04B0542D1A; Wed, 17 May 2023 11:09:18 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id BBA104067B for ; Wed, 17 May 2023 11:09:16 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684314556; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=N7A4+saLOt44saselSeN5om0pZFSj4g81RxexG2h/xY=; b=OQah3m9kkCQOUEljacQDK7xuoWnP1eQnW9r+JDDecJ68HPgT775jG6nYdVncc07jXEAZy8 CmlhhorHU2lTb5fRIfIgcRJLJB0yrpYQ3W84RNA43Hr4jLoeXo2zLfmx75WfjPzLiv2HDI WKaVb8RuCpxm5MJEeOHxR1fX2D+la5s= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-573-Ecv8p0c_Mfe8x1oJj0dnvw-1; Wed, 17 May 2023 05:09:07 -0400 X-MC-Unique: Ecv8p0c_Mfe8x1oJj0dnvw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0B193857E06; Wed, 17 May 2023 09:09:07 +0000 (UTC) Received: from ebuild.redhat.com (unknown [10.39.195.120]) by smtp.corp.redhat.com (Postfix) with ESMTP id 550074078908; Wed, 17 May 2023 09:09:06 +0000 (UTC) From: Eelco Chaudron To: maxime.coquelin@redhat.com, chenbo.xia@intel.com, david.marchand@redhat.com Cc: dev@dpdk.org Subject: [PATCH v3 3/4] vhost: fix invalid call FD handling Date: Wed, 17 May 2023 11:09:05 +0200 Message-Id: <168431454344.558450.2397970324914136724.stgit@ebuild.local> In-Reply-To: <168431450017.558450.16680518469610688737.stgit@ebuild.local> References: <168431450017.558450.16680518469610688737.stgit@ebuild.local> User-Agent: StGit/1.5 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Mimecast-Spam-Score: 0 
X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch fixes cases where IRQ injection is tried while the call FD is not valid, which should not happen. Fixes: b1cce26af1dc ("vhost: add notification for packed ring") Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification suppression") Signed-off-by: Maxime Coquelin Signed-off-by: Eelco Chaudron Reviewed-by: Chenbo Xia --- lib/vhost/vhost.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 37609c7c8d..23a4e2b1a7 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -903,9 +903,9 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) "%s: used_event_idx=%d, old=%d, new=%d\n", __func__, vhost_used_event(vq), old, new); - if ((vhost_need_event(vhost_used_event(vq), new, old) && - (vq->callfd >= 0)) || - unlikely(!signalled_used_valid)) { + if ((vhost_need_event(vhost_used_event(vq), new, old) || + unlikely(!signalled_used_valid)) && + vq->callfd >= 0) { eventfd_write(vq->callfd, (eventfd_t) 1); if (dev->flags & VIRTIO_DEV_STATS_ENABLED) __atomic_fetch_add(&vq->stats.guest_notifications, @@ -974,7 +974,7 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq) if (vhost_need_event(off, new, old)) kick = true; kick: - if (kick) { + if (kick && vq->callfd >= 0) { eventfd_write(vq->callfd, (eventfd_t)1); if (dev->notify_ops->guest_notified) dev->notify_ops->guest_notified(dev->vid); From patchwork Wed May 17 09:09:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eelco Chaudron X-Patchwork-Id: 126913 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id A621942B2A; Wed, 17 May 2023 11:09:22 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2F2A142D42; Wed, 17 May 2023 11:09:19 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 1947E42D3A for ; Wed, 17 May 2023 11:09:18 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684314557; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZDF3QMO8cn3elD20Y0P+iBVGYCEI/QDhL/8POYxdPnM=; b=ATEfUD2AvpMCb3RUOLusdqU+S61qYsBU2wG1UBy55JKuK/Dip1UJtfuWqwJi09Hjdp/Z2K RKh6WS6vHRbYSACcEWLWf7bCUfEQgpQEiCHLFRmdFqHD1qiAuQVQCO8F7sk0sYLRkukrji 3iqLNjM6bUee/5J2N3J+hkycLl3pg/o= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-414-XsoAaRK0P5Ozaqly7QqM8Q-1; Wed, 17 May 2023 05:09:16 -0400 X-MC-Unique: XsoAaRK0P5Ozaqly7QqM8Q-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client 
certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0177184AF31; Wed, 17 May 2023 09:09:16 +0000 (UTC) Received: from ebuild.redhat.com (unknown [10.39.195.120]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3778C14171C0; Wed, 17 May 2023 09:09:15 +0000 (UTC) From: Eelco Chaudron To: maxime.coquelin@redhat.com, chenbo.xia@intel.com, david.marchand@redhat.com Cc: dev@dpdk.org Subject: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Date: Wed, 17 May 2023 11:09:13 +0200 Message-Id: <168431455219.558450.14986601389394385835.stgit@ebuild.local> In-Reply-To: <168431450017.558450.16680518469610688737.stgit@ebuild.local> References: <168431450017.558450.16680518469610688737.stgit@ebuild.local> User-Agent: StGit/1.5 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org This patch adds an operation callback which gets called every time the library wants to call eventfd_write(). This eventfd_write() call could result in a system call, which could potentially block the PMD thread. The callback function can decide whether it's ok to handle the eventfd_write() now or have the newly introduced function, rte_vhost_notify_guest(), called at a later time. This can be used by 3rd party applications, like OVS, to avoid system calls being called as part of the PMD threads. Signed-off-by: Eelco Chaudron Reviewed-by: Maxime Coquelin --- lib/vhost/meson.build | 2 ++ lib/vhost/rte_vhost.h | 23 +++++++++++++++++- lib/vhost/socket.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++--- lib/vhost/version.map | 9 +++++++ lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++ lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++------------- 6 files changed, 171 insertions(+), 22 deletions(-) diff --git a/lib/vhost/meson.build b/lib/vhost/meson.build index 0d1abf6283..05679447db 100644 --- a/lib/vhost/meson.build +++ b/lib/vhost/meson.build @@ -38,3 +38,5 @@ driver_sdk_headers = files( 'vdpa_driver.h', ) deps += ['ethdev', 'cryptodev', 'hash', 'pci', 'dmadev'] + +use_function_versioning = true diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h index 58a5d4be92..7a10bc36cf 100644 --- a/lib/vhost/rte_vhost.h +++ b/lib/vhost/rte_vhost.h @@ -298,7 +298,13 @@ struct rte_vhost_device_ops { */ void (*guest_notified)(int vid); - void *reserved[1]; /**< Reserved for future extension */ + /** + * If this callback is registered, notification to the guest can + * be handled by the front-end calling rte_vhost_notify_guest(). + * If it's not handled, 'false' should be returned. This can be used + * to remove the "slow" eventfd_write() syscall from the datapath. + */ + bool (*guest_notify)(int vid, uint16_t queue_id); }; /** @@ -433,6 +439,21 @@ void rte_vhost_log_used_vring(int vid, uint16_t vring_idx, int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable); +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice. + * + * Inject the offloaded interrupt into the vhost device's queue. For more + * details see the 'guest_notify' vhost device operation. + * + * @param vid + * vhost device ID + * @param queue_id + * virtio queue index + */ +__rte_experimental +void rte_vhost_notify_guest(int vid, uint16_t queue_id); + /** * Register vhost driver. 
path could be different for multiple * instance support. diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index 669c322e12..f2c02075fe 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -15,6 +15,7 @@ #include #include +#include #include #include "fd_man.h" @@ -59,6 +60,7 @@ struct vhost_user_socket { struct rte_vdpa_device *vdpa_dev; struct rte_vhost_device_ops const *notify_ops; + struct rte_vhost_device_ops *malloc_notify_ops; }; struct vhost_user_connection { @@ -846,6 +848,11 @@ vhost_user_socket_mem_free(struct vhost_user_socket *vsocket) vsocket->path = NULL; } + if (vsocket && vsocket->malloc_notify_ops) { + free(vsocket->malloc_notify_ops); + vsocket->malloc_notify_ops = NULL; + } + if (vsocket) { free(vsocket); vsocket = NULL; @@ -1099,21 +1106,69 @@ rte_vhost_driver_unregister(const char *path) /* * Register ops so that we can add/remove device to data core. */ -int -rte_vhost_driver_callback_register(const char *path, - struct rte_vhost_device_ops const * const ops) +static int +vhost_driver_callback_register(const char *path, + struct rte_vhost_device_ops const * const ops, + struct rte_vhost_device_ops *malloc_ops) { struct vhost_user_socket *vsocket; pthread_mutex_lock(&vhost_user.mutex); vsocket = find_vhost_user_socket(path); - if (vsocket) + if (vsocket) { vsocket->notify_ops = ops; + free(vsocket->malloc_notify_ops); + vsocket->malloc_notify_ops = malloc_ops; + } pthread_mutex_unlock(&vhost_user.mutex); return vsocket ? 0 : -1; } +int __vsym +rte_vhost_driver_callback_register_v24(const char *path, + struct rte_vhost_device_ops const * const ops) +{ + return vhost_driver_callback_register(path, ops, NULL); +} + +int __vsym +rte_vhost_driver_callback_register_v23(const char *path, + struct rte_vhost_device_ops const * const ops) +{ + int ret; + + /* + * Although the ops structure is a const structure, we do need to + * override the guest_notify operation. This is because with the + * previous APIs it was "reserved" and if any garbage value was passed, + * it could crash the application. + */ + if (ops && !ops->guest_notify) { + struct rte_vhost_device_ops *new_ops; + + new_ops = malloc(sizeof(*new_ops)); + if (new_ops == NULL) + return -1; + + memcpy(new_ops, ops, sizeof(*new_ops)); + new_ops->guest_notify = NULL; + + ret = vhost_driver_callback_register(path, new_ops, new_ops); + } else { + ret = vhost_driver_callback_register(path, ops, NULL); + } + + return ret; +} + +/* Mark the v23 function as the old version, and v24 as the default version. 
*/ +VERSION_SYMBOL(rte_vhost_driver_callback_register, _v23, 23); +BIND_DEFAULT_SYMBOL(rte_vhost_driver_callback_register, _v24, 24); +MAP_STATIC_SYMBOL(int rte_vhost_driver_callback_register(const char *path, + struct rte_vhost_device_ops const * const ops), + rte_vhost_driver_callback_register_v24); + struct rte_vhost_device_ops const * vhost_driver_callback_get(const char *path) { diff --git a/lib/vhost/version.map b/lib/vhost/version.map index d322a4a888..7bcbfd12cf 100644 --- a/lib/vhost/version.map +++ b/lib/vhost/version.map @@ -64,6 +64,12 @@ DPDK_23 { local: *; }; +DPDK_24 { + global: + + rte_vhost_driver_callback_register; +} DPDK_23; + EXPERIMENTAL { global: @@ -98,6 +104,9 @@ EXPERIMENTAL { # added in 22.11 rte_vhost_async_dma_unconfigure; rte_vhost_vring_call_nonblock; + + # added in 23.07 + rte_vhost_notify_guest; }; INTERNAL { diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c index 8ff6434c93..79e88f986e 100644 --- a/lib/vhost/vhost.c +++ b/lib/vhost/vhost.c @@ -44,6 +44,10 @@ static const struct vhost_vq_stats_name_off vhost_vq_stat_strings[] = { {"size_1024_1518_packets", offsetof(struct vhost_virtqueue, stats.size_bins[6])}, {"size_1519_max_packets", offsetof(struct vhost_virtqueue, stats.size_bins[7])}, {"guest_notifications", offsetof(struct vhost_virtqueue, stats.guest_notifications)}, + {"guest_notifications_offloaded", offsetof(struct vhost_virtqueue, + stats.guest_notifications_offloaded)}, + {"guest_notifications_error", offsetof(struct vhost_virtqueue, + stats.guest_notifications_error)}, {"iotlb_hits", offsetof(struct vhost_virtqueue, stats.iotlb_hits)}, {"iotlb_misses", offsetof(struct vhost_virtqueue, stats.iotlb_misses)}, {"inflight_submitted", offsetof(struct vhost_virtqueue, stats.inflight_submitted)}, @@ -1467,6 +1471,40 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) return ret; } +void +rte_vhost_notify_guest(int vid, uint16_t queue_id) +{ + struct virtio_net *dev = get_device(vid); + struct vhost_virtqueue *vq; + + if (!dev || queue_id >= VHOST_MAX_VRING) + return; + + vq = dev->virtqueue[queue_id]; + if (!vq) + return; + + rte_rwlock_read_lock(&vq->access_lock); + + if (vq->callfd >= 0) { + int ret = eventfd_write(vq->callfd, (eventfd_t)1); + + if (ret) { + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + __atomic_fetch_add(&vq->stats.guest_notifications_error, + 1, __ATOMIC_RELAXED); + } else { + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + __atomic_fetch_add(&vq->stats.guest_notifications, + 1, __ATOMIC_RELAXED); + if (dev->notify_ops->guest_notified) + dev->notify_ops->guest_notified(dev->vid); + } + } + + rte_rwlock_read_unlock(&vq->access_lock); +} + void rte_vhost_log_write(int vid, uint64_t addr, uint64_t len) { diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index 23a4e2b1a7..8ad53e9bb5 100644 --- a/lib/vhost/vhost.h +++ b/lib/vhost/vhost.h @@ -141,6 +141,8 @@ struct virtqueue_stats { uint64_t inflight_completed; /* Counters below are atomic, and should be incremented as such. 
*/ uint64_t guest_notifications; + uint64_t guest_notifications_offloaded; + uint64_t guest_notifications_error; }; /** @@ -884,6 +886,34 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old) return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old); } +static __rte_always_inline void +vhost_vring_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq) +{ + int ret; + + if (dev->notify_ops->guest_notify && + dev->notify_ops->guest_notify(dev->vid, vq->index)) { + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + __atomic_fetch_add(&vq->stats.guest_notifications_offloaded, + 1, __ATOMIC_RELAXED); + return; + } + + ret = eventfd_write(vq->callfd, (eventfd_t) 1); + if (ret) { + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + __atomic_fetch_add(&vq->stats.guest_notifications_error, + 1, __ATOMIC_RELAXED); + return; + } + + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) + __atomic_fetch_add(&vq->stats.guest_notifications, + 1, __ATOMIC_RELAXED); + if (dev->notify_ops->guest_notified) + dev->notify_ops->guest_notified(dev->vid); +} + static __rte_always_inline void vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) { @@ -906,23 +936,13 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) if ((vhost_need_event(vhost_used_event(vq), new, old) || unlikely(!signalled_used_valid)) && vq->callfd >= 0) { - eventfd_write(vq->callfd, (eventfd_t) 1); - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - __atomic_fetch_add(&vq->stats.guest_notifications, - 1, __ATOMIC_RELAXED); - if (dev->notify_ops->guest_notified) - dev->notify_ops->guest_notified(dev->vid); + vhost_vring_inject_irq(dev, vq); } } else { /* Kick the guest if necessary. */ if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT) && (vq->callfd >= 0)) { - eventfd_write(vq->callfd, (eventfd_t)1); - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) - __atomic_fetch_add(&vq->stats.guest_notifications, - 1, __ATOMIC_RELAXED); - if (dev->notify_ops->guest_notified) - dev->notify_ops->guest_notified(dev->vid); + vhost_vring_inject_irq(dev, vq); } } } @@ -974,11 +994,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq) if (vhost_need_event(off, new, old)) kick = true; kick: - if (kick && vq->callfd >= 0) { - eventfd_write(vq->callfd, (eventfd_t)1); - if (dev->notify_ops->guest_notified) - dev->notify_ops->guest_notified(dev->vid); - } + if (kick && vq->callfd >= 0) + vhost_vring_inject_irq(dev, vq); } static __rte_always_inline void @@ -1017,4 +1034,11 @@ mbuf_is_consumed(struct rte_mbuf *m) uint64_t hua_to_alignment(struct rte_vhost_memory *mem, void *ptr); void mem_set_dump(void *ptr, size_t size, bool enable, uint64_t alignment); + +/* Versioned functions */ +int rte_vhost_driver_callback_register_v23(const char *path, + struct rte_vhost_device_ops const * const ops); +int rte_vhost_driver_callback_register_v24(const char *path, + struct rte_vhost_device_ops const * const ops); + #endif /* _VHOST_NET_CDEV_H_ */
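As a usage illustration for the new device op (not part of the series itself), the sketch below shows how an application such as OVS might plug into the guest_notify callback: the callback only records which queue needs a kick and returns true, and a separate non-PMD thread later performs the actual eventfd write via the experimental rte_vhost_notify_guest(). The pending_kick bookkeeping, app_guest_notify(), app_flush_guest_notifications() and MAX_PENDING are hypothetical application code; a real application would use a proper lock-free ring rather than the simplistic array shown here.

#include <stdbool.h>
#include <stdint.h>
#include <rte_vhost.h>	/* needs ALLOW_EXPERIMENTAL_API for rte_vhost_notify_guest() */

/* Hypothetical application-side queue of deferred guest notifications. */
struct pending_kick {
	int vid;
	uint16_t queue_id;
};
#define MAX_PENDING 1024
static struct pending_kick pending[MAX_PENDING];
static volatile unsigned int n_pending;	/* sketch only: not thread-safe as-is */

/* Called by the vhost library from the datapath; must not block. */
static bool
app_guest_notify(int vid, uint16_t queue_id)
{
	if (n_pending >= MAX_PENDING)
		return false;	/* let the library do the eventfd_write itself */
	pending[n_pending].vid = vid;
	pending[n_pending].queue_id = queue_id;
	n_pending++;
	return true;	/* the application will deliver the interrupt later */
}

/* Run from a dedicated (non-PMD) thread, where a syscall is acceptable. */
static void
app_flush_guest_notifications(void)
{
	while (n_pending > 0) {
		n_pending--;
		rte_vhost_notify_guest(pending[n_pending].vid,
				       pending[n_pending].queue_id);
	}
}

static const struct rte_vhost_device_ops app_ops = {
	.guest_notify = app_guest_notify,
	/* other callbacks (new_device, destroy_device, ...) omitted */
};

/* Registration at socket setup time, e.g.:
 *	rte_vhost_driver_callback_register("/tmp/vhost-user.sock", &app_ops);
 */

When the callback returns false (or is not registered), the library falls back to writing the eventfd inline as before; offloaded and failed notifications are accounted in the new guest_notifications_offloaded and guest_notifications_error statistics added by this patch.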