From patchwork Sat Oct 1 00:42:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Volodymyr Fialko
X-Patchwork-Id: 117237
X-Patchwork-Delegate: gakhil@marvell.com
From: Volodymyr Fialko
To: , Jerin Jacob , Abhinandan Gujjar , Pavan Nikhilesh , Shijith Thotton , Hemant Agrawal , Sachin Saxena
CC: , , Volodymyr Fialko
Subject: [PATCH v3 1/2] eventdev: introduce event cryptodev vector type
Date: Sat, 1 Oct 2022 02:42:12 +0200
Message-ID: <20221001004213.2911114-2-vfialko@marvell.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221001004213.2911114-1-vfialko@marvell.com>
References: <20220926113607.1613674-1-vfialko@marvell.com> <20221001004213.2911114-1-vfialko@marvell.com>
List-Id: DPDK patches and discussions

Introduce the ability to aggregate crypto operations processed by the event crypto adapter into a single event containing rte_event_vector whose event type is RTE_EVENT_TYPE_CRYPTODEV_VECTOR.

To enable vectorization, the application should set RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR in rte_event_crypto_adapter_queue_conf::flags and provide a vector configuration that respects rte_event_crypto_adapter_vector_limits, which can be obtained by calling rte_event_crypto_adapter_vector_limits_get().

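For illustration only, the range check that a vector configuration has to pass can be sketched in plain C as below. This is a minimal sketch, not the adapter code: `struct vector_limits`, `struct queue_vector_conf` and `vector_conf_is_valid()` are hypothetical stand-ins for the DPDK types, reduced to the fields that matter here, and the mempool element-size check done by the adapter is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_event_crypto_adapter_vector_limits. */
struct vector_limits {
	uint16_t min_sz;
	uint16_t max_sz;
	uint64_t min_timeout_ns;
	uint64_t max_timeout_ns;
};

/* Hypothetical stand-in for the vector fields of the queue configuration. */
struct queue_vector_conf {
	uint16_t vector_sz;
	uint64_t vector_timeout_ns;
	const void *vector_mp; /* stands in for struct rte_mempool * */
};

/* Return true when the configuration lies within the reported limits. */
static bool
vector_conf_is_valid(const struct queue_vector_conf *conf,
		     const struct vector_limits *limits)
{
	if (conf == NULL || limits == NULL)
		return false;
	/* A mempool is mandatory once vectorization is requested. */
	if (conf->vector_mp == NULL)
		return false;
	if (conf->vector_sz < limits->min_sz ||
	    conf->vector_sz > limits->max_sz)
		return false;
	if (conf->vector_timeout_ns < limits->min_timeout_ns ||
	    conf->vector_timeout_ns > limits->max_timeout_ns)
		return false;
	return true;
}
```

An application could run such a check on its own values before calling rte_event_crypto_adapter_queue_pair_add(), so that a bad size or timeout is caught before the adapter rejects the configuration with -EINVAL.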
The event crypto adapter is responsible for vectorizing the crypto operations based on the response information provided in rte_event_crypto_metadata::response_info.

Update the drivers and tests to use the new API.

Signed-off-by: Volodymyr Fialko
---
 app/test-eventdev/test_perf_common.c          |  11 +-
 app/test/test_event_crypto_adapter.c          |  12 +-
 .../prog_guide/event_crypto_adapter.rst       |  23 +++-
 drivers/event/cnxk/cn10k_eventdev.c           |   4 +-
 drivers/event/cnxk/cn9k_eventdev.c            |   5 +-
 drivers/event/dpaa/dpaa_eventdev.c            |   9 +-
 drivers/event/dpaa2/dpaa2_eventdev.c          |   9 +-
 drivers/event/octeontx/ssovf_evdev.c          |   4 +-
 lib/eventdev/eventdev_pmd.h                   |  35 +++++-
 lib/eventdev/eventdev_trace.h                 |   6 +-
 lib/eventdev/rte_event_crypto_adapter.c       | 105 ++++++++++++++++--
 lib/eventdev/rte_event_crypto_adapter.h       | 101 ++++++++++++++++-
 lib/eventdev/rte_eventdev.h                   |   8 ++
 13 files changed, 285 insertions(+), 47 deletions(-)

diff --git a/app/test-eventdev/test_perf_common.c b/app/test-eventdev/test_perf_common.c
index 81420be73a..8472a87b99 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -837,14 +837,13 @@ perf_event_crypto_adapter_setup(struct test_perf *t, struct prod_data *p)
 } if (cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND) { - struct rte_event response_info; + struct rte_event_crypto_adapter_queue_conf conf; - response_info.event = 0; - response_info.sched_type = RTE_SCHED_TYPE_ATOMIC; - response_info.queue_id = p->queue_id; + memset(&conf, 0, sizeof(conf)); + conf.ev.sched_type = RTE_SCHED_TYPE_ATOMIC; + conf.ev.queue_id = p->queue_id; ret = rte_event_crypto_adapter_queue_pair_add( - TEST_PERF_CA_ID, p->ca.cdev_id, p->ca.cdev_qp_id, - &response_info); + TEST_PERF_CA_ID, p->ca.cdev_id, p->ca.cdev_qp_id, &conf); } else { ret = rte_event_crypto_adapter_queue_pair_add( TEST_PERF_CA_ID, p->ca.cdev_id, p->ca.cdev_qp_id, NULL);
diff --git a/app/test/test_event_crypto_adapter.c b/app/test/test_event_crypto_adapter.c index
2ecc7e2cea..bb617c1042 100644 --- a/app/test/test_event_crypto_adapter.c +++ b/app/test/test_event_crypto_adapter.c @@ -1175,6 +1175,10 @@ test_crypto_adapter_create(void) static int test_crypto_adapter_qp_add_del(void) { + struct rte_event_crypto_adapter_queue_conf queue_conf = { + .ev = response_info, + }; + uint32_t cap; int ret; @@ -1183,7 +1187,7 @@ test_crypto_adapter_qp_add_del(void) if (cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND) { ret = rte_event_crypto_adapter_queue_pair_add(TEST_ADAPTER_ID, - TEST_CDEV_ID, TEST_CDEV_QP_ID, &response_info); + TEST_CDEV_ID, TEST_CDEV_QP_ID, &queue_conf); } else ret = rte_event_crypto_adapter_queue_pair_add(TEST_ADAPTER_ID, TEST_CDEV_ID, TEST_CDEV_QP_ID, NULL); @@ -1206,6 +1210,10 @@ configure_event_crypto_adapter(enum rte_event_crypto_adapter_mode mode) .new_event_threshold = 1200, }; + struct rte_event_crypto_adapter_queue_conf queue_conf = { + .ev = response_info, + }; + uint32_t cap; int ret; @@ -1238,7 +1246,7 @@ configure_event_crypto_adapter(enum rte_event_crypto_adapter_mode mode) if (cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND) { ret = rte_event_crypto_adapter_queue_pair_add(TEST_ADAPTER_ID, - TEST_CDEV_ID, TEST_CDEV_QP_ID, &response_info); + TEST_CDEV_ID, TEST_CDEV_QP_ID, &queue_conf); } else ret = rte_event_crypto_adapter_queue_pair_add(TEST_ADAPTER_ID, TEST_CDEV_ID, TEST_CDEV_QP_ID, NULL); diff --git a/doc/guides/prog_guide/event_crypto_adapter.rst b/doc/guides/prog_guide/event_crypto_adapter.rst index 4fb5c688e0..554df7e358 100644 --- a/doc/guides/prog_guide/event_crypto_adapter.rst +++ b/doc/guides/prog_guide/event_crypto_adapter.rst @@ -201,10 +201,10 @@ capability, event information must be passed to the add API. 
ret = rte_event_crypto_adapter_caps_get(id, evdev, &cap); if (cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND) { - struct rte_event event; + struct rte_event_crypto_adapter_queue_conf conf; - // Fill in event information & pass it to add API - rte_event_crypto_adapter_queue_pair_add(id, cdev_id, qp_id, &event); + // Fill in conf.ev information & pass it to add API + rte_event_crypto_adapter_queue_pair_add(id, cdev_id, qp_id, &conf); } else rte_event_crypto_adapter_queue_pair_add(id, cdev_id, qp_id, NULL); @@ -291,6 +291,23 @@ the ``rte_crypto_op``. rte_memcpy(op + len, &m_data, sizeof(m_data)); } +Enable event vectorization +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The event crypto adapter can aggregate outgoing crypto operations based on +the response information in ``rte_event_crypto_metadata::response_info`` +and generate an ``rte_event`` containing ``rte_event_vector`` whose event type +is ``RTE_EVENT_TYPE_CRYPTODEV_VECTOR``. +To enable vectorization, the application should set +``RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR`` in +``rte_event_crypto_adapter_queue_conf::flags`` and provide a vector +configuration (size, mempool, etc.) with respect to +``rte_event_crypto_adapter_vector_limits``, which can be obtained by calling +``rte_event_crypto_adapter_vector_limits_get()``. + +The ``RTE_EVENT_CRYPTO_ADAPTER_CAP_EVENT_VECTOR`` capability indicates whether +the PMD supports this feature. 
+ Start the adapter instance ~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c index bbaa6d0361..c55d69724b 100644 --- a/drivers/event/cnxk/cn10k_eventdev.c +++ b/drivers/event/cnxk/cn10k_eventdev.c @@ -1034,12 +1034,12 @@ static int cn10k_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev, int32_t queue_pair_id, - const struct rte_event *event) + const struct rte_event_crypto_adapter_queue_conf *conf) { struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev); int ret; - RTE_SET_USED(event); + RTE_SET_USED(conf); CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k"); CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k"); diff --git a/drivers/event/cnxk/cn9k_eventdev.c b/drivers/event/cnxk/cn9k_eventdev.c index 764963db85..fca7b5f3a5 100644 --- a/drivers/event/cnxk/cn9k_eventdev.c +++ b/drivers/event/cnxk/cn9k_eventdev.c @@ -1125,12 +1125,13 @@ cn9k_crypto_adapter_caps_get(const struct rte_eventdev *event_dev, static int cn9k_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev, - int32_t queue_pair_id, const struct rte_event *event) + int32_t queue_pair_id, + const struct rte_event_crypto_adapter_queue_conf *conf) { struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev); int ret; - RTE_SET_USED(event); + RTE_SET_USED(conf); CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn9k"); CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn9k"); diff --git a/drivers/event/dpaa/dpaa_eventdev.c b/drivers/event/dpaa/dpaa_eventdev.c index 8e470584ea..4b3d16735b 100644 --- a/drivers/event/dpaa/dpaa_eventdev.c +++ b/drivers/event/dpaa/dpaa_eventdev.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include #include @@ -775,10 +776,10 @@ static int dpaa_eventdev_crypto_queue_add(const struct rte_eventdev *dev, const struct rte_cryptodev *cryptodev, int32_t rx_queue_id, - const struct rte_event *ev) + const struct 
rte_event_crypto_adapter_queue_conf *conf) { struct dpaa_eventdev *priv = dev->data->dev_private; - uint8_t ev_qid = ev->queue_id; + uint8_t ev_qid = conf->ev.queue_id; u16 ch_id = priv->evq_info[ev_qid].ch_id; int ret; @@ -786,10 +787,10 @@ dpaa_eventdev_crypto_queue_add(const struct rte_eventdev *dev, if (rx_queue_id == -1) return dpaa_eventdev_crypto_queue_add_all(dev, - cryptodev, ev); + cryptodev, &conf->ev); ret = dpaa_sec_eventq_attach(cryptodev, rx_queue_id, - ch_id, ev); + ch_id, &conf->ev); if (ret) { DPAA_EVENTDEV_ERR( "dpaa_sec_eventq_attach failed: ret: %d\n", ret); diff --git a/drivers/event/dpaa2/dpaa2_eventdev.c b/drivers/event/dpaa2/dpaa2_eventdev.c index 1001297cda..f499d0d015 100644 --- a/drivers/event/dpaa2/dpaa2_eventdev.c +++ b/drivers/event/dpaa2/dpaa2_eventdev.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include @@ -865,10 +866,10 @@ static int dpaa2_eventdev_crypto_queue_add(const struct rte_eventdev *dev, const struct rte_cryptodev *cryptodev, int32_t rx_queue_id, - const struct rte_event *ev) + const struct rte_event_crypto_adapter_queue_conf *conf) { struct dpaa2_eventdev *priv = dev->data->dev_private; - uint8_t ev_qid = ev->queue_id; + uint8_t ev_qid = conf->ev.queue_id; struct dpaa2_dpcon_dev *dpcon = priv->evq_info[ev_qid].dpcon; int ret; @@ -876,10 +877,10 @@ dpaa2_eventdev_crypto_queue_add(const struct rte_eventdev *dev, if (rx_queue_id == -1) return dpaa2_eventdev_crypto_queue_add_all(dev, - cryptodev, ev); + cryptodev, &conf->ev); ret = dpaa2_sec_eventq_attach(cryptodev, rx_queue_id, - dpcon, ev); + dpcon, &conf->ev); if (ret) { DPAA2_EVENTDEV_ERR( "dpaa2_sec_eventq_attach failed: ret: %d\n", ret); diff --git a/drivers/event/octeontx/ssovf_evdev.c b/drivers/event/octeontx/ssovf_evdev.c index 9d4347a16a..650266b996 100644 --- a/drivers/event/octeontx/ssovf_evdev.c +++ b/drivers/event/octeontx/ssovf_evdev.c @@ -746,12 +746,12 @@ static int ssovf_crypto_adapter_qp_add(const struct rte_eventdev *dev, const 
struct rte_cryptodev *cdev, int32_t queue_pair_id, - const struct rte_event *event) + const struct rte_event_crypto_adapter_queue_conf *conf) { struct cpt_instance *qp; uint8_t qp_id; - RTE_SET_USED(event); + RTE_SET_USED(conf); if (queue_pair_id == -1) { for (qp_id = 0; qp_id < cdev->data->nb_queue_pairs; qp_id++) { diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h index 2c74332c4a..e49ff23db5 100644 --- a/lib/eventdev/eventdev_pmd.h +++ b/lib/eventdev/eventdev_pmd.h @@ -910,6 +910,7 @@ rte_event_pmd_selftest_seqn(struct rte_mbuf *mbuf) } struct rte_cryptodev; +struct rte_event_crypto_adapter_queue_conf; /** * This API may change without prior notice @@ -964,11 +965,11 @@ typedef int (*eventdev_crypto_adapter_caps_get_t) * - <0: Error code returned by the driver function. * */ -typedef int (*eventdev_crypto_adapter_queue_pair_add_t) - (const struct rte_eventdev *dev, - const struct rte_cryptodev *cdev, - int32_t queue_pair_id, - const struct rte_event *event); +typedef int (*eventdev_crypto_adapter_queue_pair_add_t)( + const struct rte_eventdev *dev, + const struct rte_cryptodev *cdev, + int32_t queue_pair_id, + const struct rte_event_crypto_adapter_queue_conf *queue_conf); /** @@ -1077,6 +1078,27 @@ typedef int (*eventdev_crypto_adapter_stats_reset) (const struct rte_eventdev *dev, const struct rte_cryptodev *cdev); +struct rte_event_crypto_adapter_vector_limits; +/** + * Get event vector limits for a given event, crypto device pair. + * + * @param dev + * Event device pointer + * + * @param cdev + * Crypto device pointer + * + * @param[out] limits + * Pointer to the limits structure to be filled. + * + * @return + * - 0: Success. + * - <0: Error code returned by the driver function. + */ +typedef int (*eventdev_crypto_adapter_vector_limits_get_t)( + const struct rte_eventdev *dev, const struct rte_cryptodev *cdev, + struct rte_event_crypto_adapter_vector_limits *limits); + /** * Retrieve the event device's eth Tx adapter capabilities. 
* @@ -1402,6 +1424,9 @@ struct eventdev_ops { /**< Get crypto stats */ eventdev_crypto_adapter_stats_reset crypto_adapter_stats_reset; /**< Reset crypto stats */ + eventdev_crypto_adapter_vector_limits_get_t + crypto_adapter_vector_limits_get; + /**< Get event vector limits for the crypto adapter */ eventdev_eth_rx_adapter_q_stats_get eth_rx_adapter_queue_stats_get; /**< Get ethernet Rx queue stats */ diff --git a/lib/eventdev/eventdev_trace.h b/lib/eventdev/eventdev_trace.h index 5ec43d80ee..d48cd58850 100644 --- a/lib/eventdev/eventdev_trace.h +++ b/lib/eventdev/eventdev_trace.h @@ -18,6 +18,7 @@ extern "C" { #include #include "rte_eventdev.h" +#include "rte_event_crypto_adapter.h" #include "rte_event_eth_rx_adapter.h" #include "rte_event_timer_adapter.h" @@ -271,11 +272,12 @@ RTE_TRACE_POINT( RTE_TRACE_POINT( rte_eventdev_trace_crypto_adapter_queue_pair_add, RTE_TRACE_POINT_ARGS(uint8_t adptr_id, uint8_t cdev_id, - const void *event, int32_t queue_pair_id), + int32_t queue_pair_id, + const struct rte_event_crypto_adapter_queue_conf *conf), rte_trace_point_emit_u8(adptr_id); rte_trace_point_emit_u8(cdev_id); rte_trace_point_emit_i32(queue_pair_id); - rte_trace_point_emit_ptr(event); + rte_trace_point_emit_ptr(conf); ) RTE_TRACE_POINT( diff --git a/lib/eventdev/rte_event_crypto_adapter.c b/lib/eventdev/rte_event_crypto_adapter.c index a8ef5bac06..49e5305800 100644 --- a/lib/eventdev/rte_event_crypto_adapter.c +++ b/lib/eventdev/rte_event_crypto_adapter.c @@ -921,11 +921,12 @@ int rte_event_crypto_adapter_queue_pair_add(uint8_t id, uint8_t cdev_id, int32_t queue_pair_id, - const struct rte_event *event) + const struct rte_event_crypto_adapter_queue_conf *conf) { + struct rte_event_crypto_adapter_vector_limits limits; struct event_crypto_adapter *adapter; - struct rte_eventdev *dev; struct crypto_device_info *dev_info; + struct rte_eventdev *dev; uint32_t cap; int ret; @@ -950,11 +951,49 @@ rte_event_crypto_adapter_queue_pair_add(uint8_t id, return ret; } - if ((cap 
& RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND) && - (event == NULL)) { - RTE_EDEV_LOG_ERR("Conf value can not be NULL for dev_id=%u", - cdev_id); - return -EINVAL; + if (conf == NULL) { + if (cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND) { + RTE_EDEV_LOG_ERR("Conf value can not be NULL for dev_id=%u", + cdev_id); + return -EINVAL; + } + } else { + if (conf->flags & RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR) { + if ((cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_EVENT_VECTOR) == 0) { + RTE_EDEV_LOG_ERR("Event vectorization is not supported," + " dev %" PRIu8 " cdev %" PRIu8, id, + cdev_id); + return -ENOTSUP; + } + + ret = rte_event_crypto_adapter_vector_limits_get( + adapter->eventdev_id, cdev_id, &limits); + if (ret < 0) { + RTE_EDEV_LOG_ERR("Failed to get event device vector " + "limits, dev %" PRIu8 " cdev %" PRIu8, + id, cdev_id); + return -EINVAL; + } + + if (conf->vector_sz < limits.min_sz || + conf->vector_sz > limits.max_sz || + conf->vector_timeout_ns < limits.min_timeout_ns || + conf->vector_timeout_ns > limits.max_timeout_ns || + conf->vector_mp == NULL) { + RTE_EDEV_LOG_ERR("Invalid event vector configuration," + " dev %" PRIu8 " cdev %" PRIu8, + id, cdev_id); + return -EINVAL; + } + + if (conf->vector_mp->elt_size < (sizeof(struct rte_event_vector) + + (sizeof(uintptr_t) * conf->vector_sz))) { + RTE_EDEV_LOG_ERR("Invalid event vector configuration," + " dev %" PRIu8 " cdev %" PRIu8, + id, cdev_id); + return -EINVAL; + } + } } dev_info = &adapter->cdevs[cdev_id]; @@ -989,7 +1028,7 @@ rte_event_crypto_adapter_queue_pair_add(uint8_t id, ret = (*dev->dev_ops->crypto_adapter_queue_pair_add)(dev, dev_info->dev, queue_pair_id, - event); + conf); if (ret) return ret; @@ -1029,8 +1068,8 @@ rte_event_crypto_adapter_queue_pair_add(uint8_t id, rte_service_component_runstate_set(adapter->service_id, 1); } - rte_eventdev_trace_crypto_adapter_queue_pair_add(id, cdev_id, event, - queue_pair_id); + rte_eventdev_trace_crypto_adapter_queue_pair_add(id, cdev_id, 
+ queue_pair_id, conf); return 0; } @@ -1288,3 +1327,49 @@ rte_event_crypto_adapter_event_port_get(uint8_t id, uint8_t *event_port_id) return 0; } + +int +rte_event_crypto_adapter_vector_limits_get( + uint8_t dev_id, uint16_t cdev_id, + struct rte_event_crypto_adapter_vector_limits *limits) +{ + struct rte_cryptodev *cdev; + struct rte_eventdev *dev; + uint32_t cap; + int ret; + + RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL); + + if (!rte_cryptodev_is_valid_dev(cdev_id)) { + RTE_EDEV_LOG_ERR("Invalid cdev_id=%" PRIu16, cdev_id); + return -EINVAL; + } + + if (limits == NULL) { + RTE_EDEV_LOG_ERR("Invalid limits storage provided"); + return -EINVAL; + } + + dev = &rte_eventdevs[dev_id]; + cdev = rte_cryptodev_pmd_get_dev(cdev_id); + + ret = rte_event_crypto_adapter_caps_get(dev_id, cdev_id, &cap); + if (ret) { + RTE_EDEV_LOG_ERR("Failed to get adapter caps edev %" PRIu8 + " cdev %" PRIu16, dev_id, cdev_id); + return ret; + } + + if (!(cap & RTE_EVENT_CRYPTO_ADAPTER_CAP_EVENT_VECTOR)) { + RTE_EDEV_LOG_ERR("Event vectorization is not supported," + " dev %" PRIu8 " cdev %" PRIu16, dev_id, cdev_id); + return -ENOTSUP; + } + + RTE_FUNC_PTR_OR_ERR_RET( + *dev->dev_ops->crypto_adapter_vector_limits_get, + -ENOTSUP); + + return dev->dev_ops->crypto_adapter_vector_limits_get( + dev, cdev, limits); +} diff --git a/lib/eventdev/rte_event_crypto_adapter.h b/lib/eventdev/rte_event_crypto_adapter.h index d90a19e72c..83d154a6ce 100644 --- a/lib/eventdev/rte_event_crypto_adapter.h +++ b/lib/eventdev/rte_event_crypto_adapter.h @@ -253,6 +253,78 @@ struct rte_event_crypto_adapter_conf { */ }; +#define RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR 0x1 +/**< This flag indicates that crypto operations processed on the crypto + * adapter need to be vectorized + * @see rte_event_crypto_adapter_queue_conf::flags + */ + +/** + * Adapter queue configuration structure + */ +struct rte_event_crypto_adapter_queue_conf { + uint32_t flags; + /**< Flags for handling crypto operations + * @see 
RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR + */ + struct rte_event ev; + /**< If HW supports cryptodev queue pair to event queue binding, + * application is expected to fill in event information. + * @see RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND + */ + uint16_t vector_sz; + /**< Indicates the maximum number for crypto operations to combine and + * form a vector. + * @see rte_event_crypto_adapter_vector_limits::min_sz + * @see rte_event_crypto_adapter_vector_limits::max_sz + * Valid when RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR flag is set in + * @see rte_event_crypto_adapter_queue_conf::flags + */ + uint64_t vector_timeout_ns; + /**< + * Indicates the maximum number of nanoseconds to wait for aggregating + * crypto operations. Should be within vectorization limits of the + * adapter + * @see rte_event_crypto_adapter_vector_limits::min_timeout_ns + * @see rte_event_crypto_adapter_vector_limits::max_timeout_ns + * Valid when RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR flag is set in + * @see rte_event_crypto_adapter_queue_conf::flags + */ + struct rte_mempool *vector_mp; + /**< Indicates the mempool that should be used for allocating + * rte_event_vector container. + * Should be created by using `rte_event_vector_pool_create`. + * Valid when RTE_EVENT_CRYPTO_ADAPTER_EVENT_VECTOR flag is set in + * @see rte_event_crypto_adapter_queue_conf::flags. + */ +}; + +/** + * A structure used to retrieve event crypto adapter vector limits. + */ +struct rte_event_crypto_adapter_vector_limits { + uint16_t min_sz; + /**< Minimum vector limit configurable. + * @see rte_event_crypto_adapter_queue_conf::vector_sz + */ + uint16_t max_sz; + /**< Maximum vector limit configurable. + * @see rte_event_crypto_adapter_queue_conf::vector_sz + */ + uint8_t log2_sz; + /**< True if the size configured should be in log2. + * @see rte_event_crypto_adapter_queue_conf::vector_sz + */ + uint64_t min_timeout_ns; + /**< Minimum vector timeout configurable. 
+ * @see rte_event_crypto_adapter_queue_conf::vector_timeout_ns + */ + uint64_t max_timeout_ns; + /**< Maximum vector timeout configurable. + * @see rte_event_crypto_adapter_queue_conf::vector_timeout_ns + */ +}; + /** * Function type used for adapter configuration callback. The callback is * used to fill in members of the struct rte_event_crypto_adapter_conf, this @@ -392,10 +464,9 @@ rte_event_crypto_adapter_free(uint8_t id); * Cryptodev queue pair identifier. If queue_pair_id is set -1, * adapter adds all the pre configured queue pairs to the instance. * - * @param event - * if HW supports cryptodev queue pair to event queue binding, application is - * expected to fill in event information, else it will be NULL. - * @see RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND + * @param conf + * Additional configuration structure of type + * *rte_event_crypto_adapter_queue_conf* * * @return * - 0: Success, queue pair added correctly. @@ -405,7 +476,7 @@ int rte_event_crypto_adapter_queue_pair_add(uint8_t id, uint8_t cdev_id, int32_t queue_pair_id, - const struct rte_event *event); + const struct rte_event_crypto_adapter_queue_conf *conf); /** * Delete a queue pair from an event crypto adapter. @@ -523,6 +594,26 @@ rte_event_crypto_adapter_service_id_get(uint8_t id, uint32_t *service_id); int rte_event_crypto_adapter_event_port_get(uint8_t id, uint8_t *event_port_id); +/** + * Retrieve vector limits for a given event dev and crypto dev pair. + * @see rte_event_crypto_adapter_vector_limits + * + * @param dev_id + * Event device identifier. + * @param cdev_id + * Crypto device identifier. + * @param [out] limits + * A pointer to rte_event_crypto_adapter_vector_limits structure that has to + * be filled. + * + * @return + * - 0: Success. + * - <0: Error code on failure. 
+ */ +int rte_event_crypto_adapter_vector_limits_get( + uint8_t dev_id, uint16_t cdev_id, + struct rte_event_crypto_adapter_vector_limits *limits); + /** * Enqueue a burst of crypto operations as event objects supplied in *rte_event* * structure on an event crypto adapter designated by its event *dev_id* through diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index 88e7c809c0..60e9043ac4 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -1220,6 +1220,9 @@ struct rte_event_vector { #define RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR \ (RTE_EVENT_TYPE_VECTOR | RTE_EVENT_TYPE_ETH_RX_ADAPTER) /**< The event vector generated from eth Rx adapter. */ +#define RTE_EVENT_TYPE_CRYPTODEV_VECTOR \ + (RTE_EVENT_TYPE_VECTOR | RTE_EVENT_TYPE_CRYPTODEV) +/**< The event vector generated from cryptodev adapter. */ #define RTE_EVENT_TYPE_MAX 0x10 /**< Maximum number of event types */ @@ -1437,6 +1440,11 @@ rte_event_timer_adapter_caps_get(uint8_t dev_id, uint32_t *caps); * the private data information along with the crypto session. */ +#define RTE_EVENT_CRYPTO_ADAPTER_CAP_EVENT_VECTOR 0x10 +/**< Flag indicates HW is capable of aggregating processed + * crypto operations into rte_event_vector. 
+ */ + /** * Retrieve the event device's crypto adapter capabilities for the * specified cryptodev device

From patchwork Sat Oct 1 00:42:13 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Volodymyr Fialko
X-Patchwork-Id: 117238
X-Patchwork-Delegate: gakhil@marvell.com
From: Volodymyr Fialko
To: , Ankur Dwivedi , Anoob Joseph , Tejasree Kondoj , Ray Kinsella , Pavan Nikhilesh , Shijith Thotton
CC: , , , Volodymyr Fialko
Subject: [PATCH v3 2/2] crypto/cnxk: add vectorization for event crypto
Date: Sat, 1 Oct 2022 02:42:13 +0200
Message-ID: <20221001004213.2911114-3-vfialko@marvell.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221001004213.2911114-1-vfialko@marvell.com>
References: <20220926113607.1613674-1-vfialko@marvell.com> <20221001004213.2911114-1-vfialko@marvell.com>

Add support for vector aggregation of crypto operations for cn10k. Crypto operations are grouped by the sub event type, flow id, scheduler type and queue id fields of rte_event_crypto_metadata::response_info.

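To make the grouping rule concrete, the sketch below packs the response fields into a 32-bit tag the way the driver changes in this patch build the tag part of w2 (event type in the top 4 bits, sub event type in the next 8, flow id in the low 20) and treats two operations as belonging to the same vector when tag, scheduler type and queue id all match. It is a simplified model under stated assumptions: `struct response_key`, `response_tag()` and `same_vector_group()` are hypothetical names, and the `EV_TYPE_*` values are assumed to mirror rte_eventdev.h and should be checked against the headers in use.

```c
#include <stdint.h>

/* Assumed values mirroring RTE_EVENT_TYPE_CRYPTODEV and RTE_EVENT_TYPE_VECTOR. */
#define EV_TYPE_CRYPTODEV 0x1u
#define EV_TYPE_VECTOR    0x8u

/* Hypothetical stand-in for the response_info fields used for grouping. */
struct response_key {
	uint32_t flow_id;        /* only the low 20 bits are used */
	uint8_t  sub_event_type; /* 8 bits */
	uint8_t  sched_type;
	uint8_t  queue_id;
};

/* Pack event type, sub event type and flow id into one 32-bit tag. */
static uint32_t
response_tag(const struct response_key *k, int vectorized)
{
	uint32_t type = vectorized ? (EV_TYPE_VECTOR | EV_TYPE_CRYPTODEV)
				   : EV_TYPE_CRYPTODEV;

	return (type << 28) | ((uint32_t)k->sub_event_type << 20) |
	       (k->flow_id & 0xFFFFFu);
}

/* Two operations share a vector only if every grouping field matches. */
static int
same_vector_group(const struct response_key *a, const struct response_key *b)
{
	return response_tag(a, 1) == response_tag(b, 1) &&
	       a->sched_type == b->sched_type &&
	       a->queue_id == b->queue_id;
}
```

The driver itself compares the complete precomputed w2 word rather than individual fields; the field-wise comparison here is only meant to show which inputs decide the grouping.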
Signed-off-by: Volodymyr Fialko
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 398 +++++++++++++++++++---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.h |   2 +
 drivers/crypto/cnxk/cnxk_cryptodev_ops.h  |   9 +-
 drivers/crypto/cnxk/version.map           |   1 +
 drivers/event/cnxk/cn10k_eventdev.c       |  31 +-
 drivers/event/cnxk/cn10k_worker.h         |   6 +-
 drivers/event/cnxk/cn9k_eventdev.c        |   7 +-
 drivers/event/cnxk/cnxk_eventdev.h        |   4 +-
 drivers/event/cnxk/cnxk_eventdev_adptr.c  |  17 +-
 9 files changed, 415 insertions(+), 60 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 586941cd70..7bbe8726e3 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -19,6 +19,25 @@

 #include "roc_api.h"

+#define PKTS_PER_LOOP	32
+#define PKTS_PER_STEORL 16
+
+/* Holds information required to send crypto operations in one burst */
+struct ops_burst {
+	struct rte_crypto_op *op[PKTS_PER_LOOP];
+	uint64_t w2[PKTS_PER_LOOP];
+	struct cn10k_sso_hws *ws;
+	struct cnxk_cpt_qp *qp;
+	uint16_t nb_ops;
+};
+
+/* Holds information required to send vector of operations */
+struct vec_request {
+	struct cpt_inflight_req *req;
+	struct rte_event_vector *vec;
+	uint64_t w2;
+};
+
 static inline struct cnxk_se_sess *
 cn10k_cpt_sym_temp_sess_create(struct cnxk_cpt_qp *qp, struct rte_crypto_op *op)
 {
@@ -164,9 +183,6 @@ cn10k_cpt_fill_inst(struct cnxk_cpt_qp *qp, struct rte_crypto_op *ops[],
 	return 1;
 }

-#define PKTS_PER_LOOP	32
-#define PKTS_PER_STEORL 16
-
 static uint16_t
 cn10k_cpt_enqueue_burst(void *qptr, struct rte_crypto_op **ops, uint16_t nb_ops)
 {
@@ -267,9 +283,9 @@ cn10k_cpt_crypto_adapter_ev_mdata_set(struct rte_cryptodev *dev __rte_unused,
 	union rte_event_crypto_metadata *ec_mdata = mdata;
 	struct rte_event *rsp_info;
 	struct cnxk_cpt_qp *qp;
+	uint64_t w2, tag_type;
 	uint8_t cdev_id;
 	int16_t qp_id;
-	uint64_t w2;

 	/* Get queue pair */
 	cdev_id = ec_mdata->request_info.cdev_id;
@@ -277,9 +293,9 @@ cn10k_cpt_crypto_adapter_ev_mdata_set(struct rte_cryptodev *dev __rte_unused,
 	qp = rte_cryptodevs[cdev_id].data->queue_pairs[qp_id];

 	/* Prepare w2 */
+	tag_type = qp->ca.vector_sz ? RTE_EVENT_TYPE_CRYPTODEV_VECTOR : RTE_EVENT_TYPE_CRYPTODEV;
 	rsp_info = &ec_mdata->response_info;
-	w2 = CNXK_CPT_INST_W2((RTE_EVENT_TYPE_CRYPTODEV << 28) |
-				      (rsp_info->sub_event_type << 20) |
+	w2 = CNXK_CPT_INST_W2((tag_type << 28) | (rsp_info->sub_event_type << 20) |
 				      rsp_info->flow_id,
 			      rsp_info->sched_type, rsp_info->queue_id, 0);

@@ -373,18 +389,236 @@ cn10k_ca_meta_info_extract(struct rte_crypto_op *op,
 	return 0;
 }

+static inline void
+cn10k_cpt_vec_inst_fill(struct vec_request *vec_req, struct cpt_inst_s *inst,
+			struct cnxk_cpt_qp *qp)
+{
+	const union cpt_res_s res = {.cn10k.compcode = CPT_COMP_NOT_DONE};
+	struct cpt_inflight_req *infl_req = vec_req->req;
+
+	const union cpt_inst_w4 w4 = {
+		.s.opcode_major = ROC_SE_MAJOR_OP_MISC,
+		.s.opcode_minor = ROC_SE_MISC_MINOR_OP_PASSTHROUGH,
+		.s.param1 = 1,
+		.s.param2 = 1,
+		.s.dlen = 0,
+	};
+
+	infl_req->vec = vec_req->vec;
+	infl_req->qp = qp;
+
+	inst->res_addr = (uint64_t)&infl_req->res;
+	__atomic_store_n(&infl_req->res.u64[0], res.u64[0], __ATOMIC_RELAXED);
+
+	inst->w0.u64 = 0;
+	inst->w2.u64 = vec_req->w2;
+	inst->w3.u64 = CNXK_CPT_INST_W3(1, infl_req);
+	inst->w4.u64 = w4.u64;
+	inst->w7.u64 = ROC_CPT_DFLT_ENG_GRP_SE << 61;
+}
+
+static void
+cn10k_cpt_vec_pkt_submission_timeout_handle(void)
+{
+	plt_dp_err("Vector packet submission timedout");
+	abort();
+}
+
+static inline void
+cn10k_cpt_vec_submit(struct vec_request vec_tbl[], uint16_t vec_tbl_len, struct cnxk_cpt_qp *qp)
+{
+	uint64_t lmt_base, lmt_arg, lmt_id, io_addr;
+	union cpt_fc_write_s fc;
+	struct cpt_inst_s *inst;
+	uint16_t burst_size;
+	uint64_t *fc_addr;
+	int i;
+
+	if (vec_tbl_len == 0)
+		return;
+
+	const uint32_t fc_thresh = qp->lmtline.fc_thresh;
+	/*
+	 * Use 10 mins timeout for the poll. It is not possible to recover from partial submission
+	 * of vector packet. Actual packets for processing are submitted to CPT prior to this
+	 * routine. Hence, any failure for submission of vector packet would indicate an
+	 * unrecoverable error for the application.
+	 */
+	const uint64_t timeout = rte_get_timer_cycles() + 10 * 60 * rte_get_timer_hz();
+
+	lmt_base = qp->lmtline.lmt_base;
+	io_addr = qp->lmtline.io_addr;
+	fc_addr = qp->lmtline.fc_addr;
+	ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
+	inst = (struct cpt_inst_s *)lmt_base;
+
+again:
+	burst_size = RTE_MIN(PKTS_PER_STEORL, vec_tbl_len);
+	for (i = 0; i < burst_size; i++)
+		cn10k_cpt_vec_inst_fill(&vec_tbl[i], &inst[i * 2], qp);
+
+	do {
+		fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);
+		if (likely(fc.s.qsize < fc_thresh))
+			break;
+		if (unlikely(rte_get_timer_cycles() > timeout))
+			cn10k_cpt_vec_pkt_submission_timeout_handle();
+	} while (true);
+
+	lmt_arg = ROC_CN10K_CPT_LMT_ARG | (i - 1) << 12 | lmt_id;
+	roc_lmt_submit_steorl(lmt_arg, io_addr);
+
+	rte_io_wmb();
+
+	vec_tbl_len -= i;
+
+	if (vec_tbl_len > 0) {
+		vec_tbl += i;
+		goto again;
+	}
+}
+
+static inline int
+ca_lmtst_vec_submit(struct ops_burst *burst, struct vec_request vec_tbl[], uint16_t *vec_tbl_len)
+{
+	struct cpt_inflight_req *infl_reqs[PKTS_PER_LOOP];
+	uint64_t lmt_base, lmt_arg, io_addr;
+	uint16_t lmt_id, len = *vec_tbl_len;
+	struct cpt_inst_s *inst, *inst_base;
+	struct cpt_inflight_req *infl_req;
+	struct rte_event_vector *vec;
+	union cpt_fc_write_s fc;
+	struct cnxk_cpt_qp *qp;
+	uint64_t *fc_addr;
+	int ret, i, vi;
+
+	qp = burst->qp;
+
+	lmt_base = qp->lmtline.lmt_base;
+	io_addr = qp->lmtline.io_addr;
+	fc_addr = qp->lmtline.fc_addr;
+
+	const uint32_t fc_thresh = qp->lmtline.fc_thresh;
+
+	ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
+	inst_base = (struct cpt_inst_s *)lmt_base;
+
+#ifdef CNXK_CRYPTODEV_DEBUG
+	if (unlikely(!qp->ca.enabled)) {
+		rte_errno = EINVAL;
+		return 0;
+	}
+#endif
+
+	/* Perform fc check before putting packets into vectors */
+	fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);
+	if (unlikely(fc.s.qsize > fc_thresh)) {
+		rte_errno = EAGAIN;
+		return 0;
+	}
+
+	if (unlikely(rte_mempool_get_bulk(qp->ca.req_mp, (void **)infl_reqs, burst->nb_ops))) {
+		rte_errno = ENOMEM;
+		return 0;
+	}
+
+	for (i = 0; i < burst->nb_ops; i++) {
+		inst = &inst_base[2 * i];
+		infl_req = infl_reqs[i];
+		infl_req->op_flags = 0;
+
+		ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req);
+		if (unlikely(ret != 1)) {
+			plt_cpt_dbg("Could not process op: %p", burst->op[i]);
+			if (i != 0)
+				goto submit;
+			else
+				goto put;
+		}
+
+		infl_req->res.cn10k.compcode = CPT_COMP_NOT_DONE;
+		infl_req->qp = qp;
+		inst->w3.u64 = 0x1;
+
+		/* Lookup for existing vector by w2 */
+		for (vi = len - 1; vi >= 0; vi--) {
+			if (vec_tbl[vi].w2 != burst->w2[i])
+				continue;
+			vec = vec_tbl[vi].vec;
+			if (unlikely(vec->nb_elem == qp->ca.vector_sz))
+				continue;
+			vec->ptrs[vec->nb_elem++] = infl_req;
+			goto next_op; /* continue outer loop */
+		}
+
+		/* No available vectors found, allocate a new one */
+		if (unlikely(rte_mempool_get(qp->ca.vector_mp, (void **)&vec_tbl[len].vec))) {
+			rte_errno = ENOMEM;
+			if (i != 0)
+				goto submit;
+			else
+				goto put;
+		}
+		/* Also preallocate in-flight request, that will be used to
+		 * submit misc passthrough instruction
+		 */
+		if (unlikely(rte_mempool_get(qp->ca.req_mp, (void **)&vec_tbl[len].req))) {
+			rte_mempool_put(qp->ca.vector_mp, vec_tbl[len].vec);
+			rte_errno = ENOMEM;
+			if (i != 0)
+				goto submit;
+			else
+				goto put;
+		}
+		vec_tbl[len].w2 = burst->w2[i];
+		vec_tbl[len].vec->ptrs[0] = infl_req;
+		vec_tbl[len].vec->nb_elem = 1;
+		len++;
+
+next_op:;
+	}
+
+	/* Submit operations in burst */
+submit:
+	if (CNXK_TT_FROM_TAG(burst->ws->gw_rdata) == SSO_TT_ORDERED)
+		roc_sso_hws_head_wait(burst->ws->base);
+
+	if (i > PKTS_PER_STEORL) {
+		lmt_arg = ROC_CN10K_CPT_LMT_ARG | (PKTS_PER_STEORL - 1) << 12 | (uint64_t)lmt_id;
+		roc_lmt_submit_steorl(lmt_arg, io_addr);
+		lmt_arg = ROC_CN10K_CPT_LMT_ARG | (i - PKTS_PER_STEORL - 1) << 12 |
+			  (uint64_t)(lmt_id + PKTS_PER_STEORL);
+		roc_lmt_submit_steorl(lmt_arg, io_addr);
+	} else {
+		lmt_arg = ROC_CN10K_CPT_LMT_ARG | (i - 1) << 12 | (uint64_t)lmt_id;
+		roc_lmt_submit_steorl(lmt_arg, io_addr);
+	}
+
+	rte_io_wmb();
+
+put:
+	if (i != burst->nb_ops)
+		rte_mempool_put_bulk(qp->ca.req_mp, (void *)&infl_reqs[i], burst->nb_ops - i);
+
+	*vec_tbl_len = len;
+
+	return i;
+}
+
 static inline uint16_t
-ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_qp *qp,
-		      struct rte_crypto_op *op[], uint16_t nb_ops)
+ca_lmtst_burst_submit(struct ops_burst *burst)
 {
 	struct cpt_inflight_req *infl_reqs[PKTS_PER_LOOP];
 	uint64_t lmt_base, lmt_arg, io_addr;
 	struct cpt_inst_s *inst, *inst_base;
 	struct cpt_inflight_req *infl_req;
 	union cpt_fc_write_s fc;
+	struct cnxk_cpt_qp *qp;
 	uint64_t *fc_addr;
 	uint16_t lmt_id;
-	int ret, i;
+	int ret, i, j;
+
+	qp = burst->qp;

 	lmt_base = qp->lmtline.lmt_base;
 	io_addr = qp->lmtline.io_addr;
@@ -395,24 +629,26 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q
 	ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
 	inst_base = (struct cpt_inst_s *)lmt_base;

+#ifdef CNXK_CRYPTODEV_DEBUG
 	if (unlikely(!qp->ca.enabled)) {
 		rte_errno = EINVAL;
 		return 0;
 	}
+#endif

-	if (unlikely(rte_mempool_get_bulk(qp->ca.req_mp, (void **)infl_reqs, nb_ops))) {
+	if (unlikely(rte_mempool_get_bulk(qp->ca.req_mp, (void **)infl_reqs, burst->nb_ops))) {
 		rte_errno = ENOMEM;
 		return 0;
 	}

-	for (i = 0; i < nb_ops; i++) {
+	for (i = 0; i < burst->nb_ops; i++) {
 		inst = &inst_base[2 * i];
 		infl_req = infl_reqs[i];
 		infl_req->op_flags = 0;

-		ret = cn10k_cpt_fill_inst(qp, &op[i], inst, infl_req);
+		ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req);
 		if (unlikely(ret != 1)) {
-			plt_dp_dbg("Could not process op: %p", op[i]);
+			plt_dp_dbg("Could not process op: %p", burst->op[i]);
 			if (i != 0)
 				goto submit;
 			else
@@ -423,20 +659,25 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q
 		infl_req->qp = qp;
 		inst->w0.u64 = 0;
 		inst->res_addr = (uint64_t)&infl_req->res;
-		inst->w2.u64 = w2[i];
+		inst->w2.u64 = burst->w2[i];
 		inst->w3.u64 = CNXK_CPT_INST_W3(1, infl_req);
 	}

 	fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);
 	if (unlikely(fc.s.qsize > fc_thresh)) {
 		rte_errno = EAGAIN;
+		for (j = 0; j < i; j++) {
+			infl_req = infl_reqs[j];
+			if (unlikely(infl_req->op_flags & CPT_OP_FLAGS_METABUF))
+				rte_mempool_put(qp->meta_info.pool, infl_req->mdata);
+		}
 		i = 0;
 		goto put;
 	}

 submit:
-	if (CNXK_TT_FROM_TAG(ws->gw_rdata) == SSO_TT_ORDERED)
-		roc_sso_hws_head_wait(ws->base);
+	if (CNXK_TT_FROM_TAG(burst->ws->gw_rdata) == SSO_TT_ORDERED)
+		roc_sso_hws_head_wait(burst->ws->base);

 	if (i > PKTS_PER_STEORL) {
 		lmt_arg = ROC_CN10K_CPT_LMT_ARG | (PKTS_PER_STEORL - 1) << 12 | (uint64_t)lmt_id;
@@ -452,8 +693,8 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q
 	rte_io_wmb();

 put:
-	if (unlikely(i != nb_ops))
-		rte_mempool_put_bulk(qp->ca.req_mp, (void *)&infl_reqs[i], nb_ops - i);
+	if (unlikely(i != burst->nb_ops))
+		rte_mempool_put_bulk(qp->ca.req_mp, (void *)&infl_reqs[i], burst->nb_ops - i);

 	return i;
 }
@@ -461,42 +702,76 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q
 uint16_t __rte_hot
 cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event ev[], uint16_t nb_events)
 {
-	struct rte_crypto_op *ops[PKTS_PER_LOOP], *op;
-	struct cnxk_cpt_qp *qp, *curr_qp = NULL;
-	uint64_t w2s[PKTS_PER_LOOP], w2;
-	uint16_t submitted, count = 0;
-	int ret, i, ops_len = 0;
+	uint16_t submitted, count = 0, vec_tbl_len = 0;
+	struct vec_request vec_tbl[nb_events];
+	struct rte_crypto_op *op;
+	struct ops_burst burst;
+	struct cnxk_cpt_qp *qp;
+	bool is_vector = false;
+	uint64_t w2;
+	int ret, i;
+
+	burst.ws = ws;
+	burst.qp = NULL;
+	burst.nb_ops = 0;

 	for (i = 0; i < nb_events; i++) {
 		op = ev[i].event_ptr;
 		ret = cn10k_ca_meta_info_extract(op, &qp, &w2);
 		if (unlikely(ret)) {
 			rte_errno = EINVAL;
-			return count;
+			goto vec_submit;
 		}

-		if (qp != curr_qp) {
-			if (ops_len) {
-				submitted = ca_lmtst_burst_submit(ws, w2s, curr_qp, ops, ops_len);
+		/* Queue pair change check */
+		if (qp != burst.qp) {
+			if (burst.nb_ops) {
+				if (is_vector) {
+					submitted =
+						ca_lmtst_vec_submit(&burst, vec_tbl, &vec_tbl_len);
+					/*
+					 * Vector submission is required on qp change, but not in
+					 * other cases, since we could send several vectors per
+					 * lmtst instruction only for same qp
+					 */
+					cn10k_cpt_vec_submit(vec_tbl, vec_tbl_len, burst.qp);
+					vec_tbl_len = 0;
+				} else {
+					submitted = ca_lmtst_burst_submit(&burst);
+				}
 				count += submitted;
-				if (unlikely(submitted != ops_len))
-					return count;
-				ops_len = 0;
+				if (unlikely(submitted != burst.nb_ops))
+					goto vec_submit;
+				burst.nb_ops = 0;
 			}
-			curr_qp = qp;
+			is_vector = qp->ca.vector_sz;
+			burst.qp = qp;
 		}
-		w2s[ops_len] = w2;
-		ops[ops_len] = op;
-		if (++ops_len == PKTS_PER_LOOP) {
-			submitted = ca_lmtst_burst_submit(ws, w2s, curr_qp, ops, ops_len);
+		burst.w2[burst.nb_ops] = w2;
+		burst.op[burst.nb_ops] = op;
+
+		/* Max nb_ops per burst check */
+		if (++burst.nb_ops == PKTS_PER_LOOP) {
+			if (is_vector)
+				submitted = ca_lmtst_vec_submit(&burst, vec_tbl, &vec_tbl_len);
+			else
+				submitted = ca_lmtst_burst_submit(&burst);
 			count += submitted;
-			if (unlikely(submitted != ops_len))
-				return count;
-			ops_len = 0;
+			if (unlikely(submitted != burst.nb_ops))
+				goto vec_submit;
+			burst.nb_ops = 0;
 		}
 	}
-	if (ops_len)
-		count += ca_lmtst_burst_submit(ws, w2s, curr_qp, ops, ops_len);
+	/* Submit the rest of crypto operations */
+	if (burst.nb_ops) {
+		if (is_vector)
+			count += ca_lmtst_vec_submit(&burst, vec_tbl, &vec_tbl_len);
+		else
+			count += ca_lmtst_burst_submit(&burst);
+	}
+
+vec_submit:
+	cn10k_cpt_vec_submit(vec_tbl, vec_tbl_len, burst.qp);

 	return count;
 }
@@ -654,6 +929,49 @@ cn10k_cpt_crypto_adapter_dequeue(uintptr_t get_work1)
 	return (uintptr_t)cop;
 }

+uintptr_t
+cn10k_cpt_crypto_adapter_vector_dequeue(uintptr_t get_work1)
+{
+	struct cpt_inflight_req *infl_req, *vec_infl_req;
+	struct rte_mempool *meta_mp, *req_mp;
+	struct rte_event_vector *vec;
+	struct rte_crypto_op *cop;
+	struct cnxk_cpt_qp *qp;
+	union cpt_res_s res;
+	int i;
+
+	vec_infl_req = (struct cpt_inflight_req *)(get_work1);
+
+	vec = vec_infl_req->vec;
+	qp = vec_infl_req->qp;
+	meta_mp = qp->meta_info.pool;
+	req_mp = qp->ca.req_mp;
+
+#ifdef CNXK_CRYPTODEV_DEBUG
+	res.u64[0] = __atomic_load_n(&vec_infl_req->res.u64[0], __ATOMIC_RELAXED);
+	PLT_ASSERT(res.cn10k.compcode == CPT_COMP_WARN);
+	PLT_ASSERT(res.cn10k.uc_compcode == 0);
+#endif
+
+	for (i = 0; i < vec->nb_elem; i++) {
+		infl_req = vec->ptrs[i];
+		cop = infl_req->cop;
+
+		res.u64[0] = __atomic_load_n(&infl_req->res.u64[0], __ATOMIC_RELAXED);
+		cn10k_cpt_dequeue_post_process(qp, cop, infl_req, &res.cn10k);
+
+		vec->ptrs[i] = cop;
+		if (unlikely(infl_req->op_flags & CPT_OP_FLAGS_METABUF))
+			rte_mempool_put(meta_mp, infl_req->mdata);
+
+		rte_mempool_put(req_mp, infl_req);
+	}
+
+	rte_mempool_put(req_mp, vec_infl_req);
+
+	return (uintptr_t)vec;
+}
+
 static uint16_t
 cn10k_cpt_dequeue_burst(void *qptr, struct rte_crypto_op **ops, uint16_t nb_ops)
 {
diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.h b/drivers/crypto/cnxk/cn10k_cryptodev_ops.h
index 628d6a567c..8104310c30 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.h
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.h
@@ -18,5 +18,7 @@ uint16_t __rte_hot cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event e
 		uint16_t nb_events);
 __rte_internal
 uintptr_t cn10k_cpt_crypto_adapter_dequeue(uintptr_t get_work1);
+__rte_internal
+uintptr_t cn10k_cpt_crypto_adapter_vector_dequeue(uintptr_t get_work1);

 #endif /* _CN10K_CRYPTODEV_OPS_H_ */
diff --git a/drivers/crypto/cnxk/cnxk_cryptodev_ops.h b/drivers/crypto/cnxk/cnxk_cryptodev_ops.h
index ffe4ae19aa..d9ed43b40b 100644
--- a/drivers/crypto/cnxk/cnxk_cryptodev_ops.h
+++ b/drivers/crypto/cnxk/cnxk_cryptodev_ops.h
@@ -37,7 +37,10 @@ struct cpt_qp_meta_info {

 struct cpt_inflight_req {
 	union cpt_res_s res;
-	struct rte_crypto_op *cop;
+	union {
+		struct rte_crypto_op *cop;
+		struct rte_event_vector *vec;
+	};
 	void *mdata;
 	uint8_t op_flags;
 	void *qp;
@@ -63,6 +66,10 @@ struct crypto_adpter_info {
 	/**< Set if queue pair is added to crypto adapter */
 	struct rte_mempool *req_mp;
 	/**< CPT inflight request mempool */
+	uint16_t vector_sz;
+	/** Maximum number of cops to combine into single vector */
+	struct rte_mempool *vector_mp;
+	/** Pool for allocating rte_event_vector */
 };

 struct cnxk_cpt_qp {
diff --git a/drivers/crypto/cnxk/version.map b/drivers/crypto/cnxk/version.map
index 0178c416ec..4735e70550 100644
--- a/drivers/crypto/cnxk/version.map
+++ b/drivers/crypto/cnxk/version.map
@@ -5,6 +5,7 @@ INTERNAL {
 	cn9k_cpt_crypto_adapter_dequeue;
 	cn10k_cpt_crypto_adapter_enqueue;
 	cn10k_cpt_crypto_adapter_dequeue;
+	cn10k_cpt_crypto_adapter_vector_dequeue;

 	local: *;
 };
diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c
index c55d69724b..742e43a5c6 100644
--- a/drivers/event/cnxk/cn10k_eventdev.c
+++ b/drivers/event/cnxk/cn10k_eventdev.c
@@ -1025,7 +1025,8 @@ cn10k_crypto_adapter_caps_get(const struct rte_eventdev *event_dev,
 	CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k");

 	*caps = RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD |
-		RTE_EVENT_CRYPTO_ADAPTER_CAP_SESSION_PRIVATE_DATA;
+		RTE_EVENT_CRYPTO_ADAPTER_CAP_SESSION_PRIVATE_DATA |
+		RTE_EVENT_CRYPTO_ADAPTER_CAP_EVENT_VECTOR;

 	return 0;
 }
@@ -1039,23 +1040,20 @@ cn10k_crypto_adapter_qp_add(const struct rte_eventdev *event_dev,
 	struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
 	int ret;

-	RTE_SET_USED(conf);
-
 	CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k");
 	CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k");

 	dev->is_ca_internal_port = 1;
 	cn10k_sso_fp_fns_set((struct rte_eventdev *)(uintptr_t)event_dev);

-	ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id);
+	ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id, conf);
 	cn10k_sso_set_priv_mem(event_dev, NULL, 0);

 	return ret;
 }

 static int
-cn10k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev,
-			    const struct rte_cryptodev *cdev,
+cn10k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev,
 			    int32_t queue_pair_id)
 {
 	CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k");
@@ -1072,6 +1070,26 @@ cn10k_tim_caps_get(const struct rte_eventdev *evdev, uint64_t flags,
 				  cn10k_sso_set_priv_mem);
 }

+static int
+cn10k_crypto_adapter_vec_limits(const struct rte_eventdev *event_dev,
+				const struct rte_cryptodev *cdev,
+				struct rte_event_crypto_adapter_vector_limits *limits)
+{
+	CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k");
+	CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k");
+
+	limits->log2_sz = false;
+	limits->min_sz = 0;
+	limits->max_sz = UINT16_MAX;
+	/* Unused timeout, in software implementation we aggregate all crypto
+	 * operations passed to the enqueue function
+	 */
+	limits->min_timeout_ns = 0;
+	limits->max_timeout_ns = 0;
+
+	return 0;
+}
+
 static struct eventdev_ops cn10k_sso_dev_ops = {
 	.dev_infos_get = cn10k_sso_info_get,
 	.dev_configure = cn10k_sso_dev_configure,
@@ -1109,6 +1127,7 @@ static struct eventdev_ops cn10k_sso_dev_ops = {
 	.crypto_adapter_caps_get = cn10k_crypto_adapter_caps_get,
 	.crypto_adapter_queue_pair_add = cn10k_crypto_adapter_qp_add,
 	.crypto_adapter_queue_pair_del = cn10k_crypto_adapter_qp_del,
+	.crypto_adapter_vector_limits_get = cn10k_crypto_adapter_vec_limits,

 	.xstats_get = cnxk_sso_xstats_get,
 	.xstats_reset = cnxk_sso_xstats_reset,
diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h
index 41b6ba8912..7a82dd352a 100644
--- a/drivers/event/cnxk/cn10k_worker.h
+++ b/drivers/event/cnxk/cn10k_worker.h
@@ -230,6 +230,9 @@ cn10k_sso_hws_post_process(struct cn10k_sso_hws *ws, uint64_t *u64,
 	if ((flags & CPT_RX_WQE_F) &&
 	    (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_CRYPTODEV)) {
 		u64[1] = cn10k_cpt_crypto_adapter_dequeue(u64[1]);
+	} else if ((flags & CPT_RX_WQE_F) &&
+		   (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_CRYPTODEV_VECTOR)) {
+		u64[1] = cn10k_cpt_crypto_adapter_vector_dequeue(u64[1]);
 	} else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_ETHDEV) {
 		uint8_t port = CNXK_SUB_EVENT_FROM_TAG(u64[0]);
 		uint64_t mbuf;
@@ -272,8 +275,7 @@ cn10k_sso_hws_post_process(struct cn10k_sso_hws *ws, uint64_t *u64,
 			cn10k_sso_process_tstamp(u64[1], mbuf, ws->tstamp[port]);
 		u64[1] = mbuf;
-	} else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) ==
-		   RTE_EVENT_TYPE_ETHDEV_VECTOR) {
+	} else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_ETHDEV_VECTOR) {
 		uint8_t port = CNXK_SUB_EVENT_FROM_TAG(u64[0]);
 		__uint128_t vwqe_hdr = *(__uint128_t *)u64[1];

diff --git a/drivers/event/cnxk/cn9k_eventdev.c b/drivers/event/cnxk/cn9k_eventdev.c
index fca7b5f3a5..f5a42a86f8 100644
--- a/drivers/event/cnxk/cn9k_eventdev.c
+++ b/drivers/event/cnxk/cn9k_eventdev.c
@@ -1131,23 +1131,20 @@ cn9k_crypto_adapter_qp_add(const struct rte_eventdev *event_dev,
 	struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
 	int ret;

-	RTE_SET_USED(conf);
-
 	CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn9k");
 	CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn9k");

 	dev->is_ca_internal_port = 1;
 	cn9k_sso_fp_fns_set((struct rte_eventdev *)(uintptr_t)event_dev);

-	ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id);
+	ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id, conf);
 	cn9k_sso_set_priv_mem(event_dev, NULL, 0);

 	return ret;
 }

 static int
-cn9k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev,
-			   const struct rte_cryptodev *cdev,
+cn9k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev,
 			   int32_t queue_pair_id)
 {
 	CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn9k");
diff --git a/drivers/event/cnxk/cnxk_eventdev.h b/drivers/event/cnxk/cnxk_eventdev.h
index 293e0fff3f..f68c2aee23 100644
--- a/drivers/event/cnxk/cnxk_eventdev.h
+++ b/drivers/event/cnxk/cnxk_eventdev.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -305,7 +306,8 @@ int cnxk_sso_tx_adapter_start(uint8_t id, const struct rte_eventdev *event_dev);
 int cnxk_sso_tx_adapter_stop(uint8_t id, const struct rte_eventdev *event_dev);
 int cnxk_sso_tx_adapter_free(uint8_t id, const struct rte_eventdev *event_dev);
 int cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev,
-			       const struct rte_cryptodev *cdev, int32_t queue_pair_id);
+			       const struct rte_cryptodev *cdev, int32_t queue_pair_id,
+			       const struct rte_event_crypto_adapter_queue_conf *conf);
 int cnxk_crypto_adapter_qp_del(const struct rte_cryptodev *cdev, int32_t queue_pair_id);

 #endif /* __CNXK_EVENTDEV_H__ */
diff --git a/drivers/event/cnxk/cnxk_eventdev_adptr.c b/drivers/event/cnxk/cnxk_eventdev_adptr.c
index 3ba5b246f0..5ec436382c 100644
--- a/drivers/event/cnxk/cnxk_eventdev_adptr.c
+++ b/drivers/event/cnxk/cnxk_eventdev_adptr.c
@@ -641,7 +641,8 @@ cnxk_sso_tx_adapter_free(uint8_t id __rte_unused,
 }

 static int
-crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp)
+crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp,
+			const struct rte_event_crypto_adapter_queue_conf *conf)
 {
 	char name[RTE_MEMPOOL_NAMESIZE];
 	uint32_t cache_size, nb_req;
@@ -674,6 +675,10 @@ crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp
 	if (qp->ca.req_mp == NULL)
 		return -ENOMEM;

+	if (conf != NULL) {
+		qp->ca.vector_sz = conf->vector_sz;
+		qp->ca.vector_mp = conf->vector_mp;
+	}
 	qp->ca.enabled = true;

 	return 0;
@@ -681,7 +686,8 @@ crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp

 int
 cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev,
-			   int32_t queue_pair_id)
+			   int32_t queue_pair_id,
+			   const struct rte_event_crypto_adapter_queue_conf *conf)
 {
 	struct cnxk_sso_evdev *sso_evdev = cnxk_sso_pmd_priv(event_dev);
 	uint32_t adptr_xae_cnt = 0;
@@ -693,7 +699,7 @@ cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rt
 		for (qp_id = 0; qp_id < cdev->data->nb_queue_pairs; qp_id++) {
 			qp = cdev->data->queue_pairs[qp_id];
-			ret = crypto_adapter_qp_setup(cdev, qp);
+			ret = crypto_adapter_qp_setup(cdev, qp, conf);
 			if (ret) {
 				cnxk_crypto_adapter_qp_del(cdev, -1);
 				return ret;
@@ -702,7 +708,7 @@ cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rt
 		}
 	} else {
 		qp = cdev->data->queue_pairs[queue_pair_id];
-		ret = crypto_adapter_qp_setup(cdev, qp);
+		ret = crypto_adapter_qp_setup(cdev, qp, conf);
 		if (ret)
 			return ret;
 		adptr_xae_cnt = qp->ca.req_mp->size;
@@ -733,7 +739,8 @@ crypto_adapter_qp_free(struct cnxk_cpt_qp *qp)
 }

 int
-cnxk_crypto_adapter_qp_del(const struct rte_cryptodev *cdev, int32_t queue_pair_id)
+cnxk_crypto_adapter_qp_del(const struct rte_cryptodev *cdev,
+			   int32_t queue_pair_id)
 {
 	struct cnxk_cpt_qp *qp;
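
Note for reviewers: the grouping rule used by ca_lmtst_vec_submit() (scan the vector table from newest to oldest, append to the first non-full vector with a matching w2, otherwise open a new entry) may be easier to follow in isolation. The following is only a minimal standalone sketch under stated assumptions, not the driver code: struct vec, struct vec_slot, vec_tbl_add(), VEC_SZ and TBL_CAP are hypothetical names invented for illustration, the mempool allocations are replaced by a fixed array, and error handling is reduced to a -1 return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_SZ	4 /* hypothetical stand-in for qp->ca.vector_sz */
#define TBL_CAP 8 /* hypothetical table capacity, not taken from the patch */

/* Simplified stand-in for rte_event_vector: a counted array of pointers. */
struct vec {
	uint16_t nb_elem;
	void *ptrs[VEC_SZ];
};

/* Simplified stand-in for struct vec_request: one open vector per slot. */
struct vec_slot {
	uint64_t w2;
	struct vec v;
};

/*
 * Add req to the newest non-full vector whose w2 matches; otherwise open a
 * new slot. Returns the slot index used, or -1 when the table is full.
 */
static int
vec_tbl_add(struct vec_slot tbl[], uint16_t *len, uint64_t w2, void *req)
{
	int vi;

	assert(*len <= TBL_CAP);

	/* Newest entries first, mirroring the backwards scan in the patch. */
	for (vi = (int)*len - 1; vi >= 0; vi--) {
		if (tbl[vi].w2 != w2)
			continue;
		if (tbl[vi].v.nb_elem == VEC_SZ)
			continue; /* full, keep looking at older slots */
		tbl[vi].v.ptrs[tbl[vi].v.nb_elem++] = req;
		return vi;
	}

	/* No usable vector found: open a new slot if there is room. */
	if (*len == TBL_CAP)
		return -1;

	vi = *len;
	tbl[vi].w2 = w2;
	tbl[vi].v.ptrs[0] = req;
	tbl[vi].v.nb_elem = 1;
	(*len)++;
	return vi;
}
```

In this sketch, operations sharing a w2 land in the same slot until it holds VEC_SZ entries, after which the next matching operation opens a fresh slot. In the patch itself each new table entry additionally takes an rte_event_vector from qp->ca.vector_mp and an in-flight request from qp->ca.req_mp.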