From: Volodymyr Fialko
To: Ankur Dwivedi, Anoob Joseph, Tejasree Kondoj, Ray Kinsella, Pavan Nikhilesh, Shijith Thotton
CC: Volodymyr Fialko
Subject: [PATCH v3 2/2] crypto/cnxk: add vectorization for event crypto
Date: Sat, 1 Oct 2022 02:42:13 +0200
Message-ID: <20221001004213.2911114-3-vfialko@marvell.com>
In-Reply-To: <20221001004213.2911114-1-vfialko@marvell.com>
References: <20220926113607.1613674-1-vfialko@marvell.com> <20221001004213.2911114-1-vfialko@marvell.com>
X-Patchwork-Id: 117238
X-Patchwork-Delegate: gakhil@marvell.com
List-Id: DPDK patches and discussions

Add support for vector aggregation of crypto operations for cn10k.
Crypto operations will be grouped by sub event type, flow id, scheduler
type and queue id fields from rte_event_crypto_metadata::response_info.

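The grouping described above can be pictured with a small standalone sketch. It is not the driver code: struct rsp_info, struct group, group_key() and group_add() are hypothetical names, and the table sizes are arbitrary. Only the low 32 bits of the key follow the tag layout used in this patch (event type << 28 | sub event type << 20 | flow id); where sched_type and queue_id are placed in the key is an assumption made for illustration and is not the CNXK_CPT_INST_W2 hardware layout. The lookup walks the table newest-first and skips full vectors, which mirrors the loop in ca_lmtst_vec_submit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GROUPS 32 /* hypothetical table capacity */
#define MAX_ELEMS  8  /* hypothetical per-vector capacity */

/* Hypothetical stand-in for the response_info fields that select a group. */
struct rsp_info {
	uint8_t event_type;     /* low 4 bits used */
	uint8_t sub_event_type; /* 8 bits */
	uint32_t flow_id;       /* low 20 bits used */
	uint8_t sched_type;
	uint8_t queue_id;
};

/* Hypothetical stand-in for one event vector plus the key it was opened with. */
struct group {
	uint64_t key;
	uint16_t nb_elem;
	void *ptrs[MAX_ELEMS];
};

/*
 * Pack the grouping fields into one 64-bit key. The low 32 bits follow the
 * tag layout from the patch; the positions of sched_type and queue_id are
 * illustrative only.
 */
static uint64_t
group_key(const struct rsp_info *r)
{
	uint64_t tag = ((uint64_t)(r->event_type & 0xf) << 28) |
		       ((uint64_t)r->sub_event_type << 20) |
		       (r->flow_id & 0xfffff);

	return tag | ((uint64_t)r->sched_type << 32) |
	       ((uint64_t)r->queue_id << 40);
}

/*
 * Append op to the newest group with a matching key that still has room,
 * otherwise open a new group. Returns the group index, or -1 if the table
 * is full.
 */
static int
group_add(struct group tbl[], uint16_t *len, uint16_t max_per_vec,
	  uint64_t key, void *op)
{
	int i;

	assert(max_per_vec > 0 && max_per_vec <= MAX_ELEMS);

	/* Search newest-first and skip groups that are already full. */
	for (i = (int)*len - 1; i >= 0; i--) {
		if (tbl[i].key != key || tbl[i].nb_elem == max_per_vec)
			continue;
		tbl[i].ptrs[tbl[i].nb_elem++] = op;
		return i;
	}

	if (*len == MAX_GROUPS)
		return -1;

	/* No matching group with room: open a new one. */
	tbl[*len].key = key;
	tbl[*len].ptrs[0] = op;
	tbl[*len].nb_elem = 1;
	return (*len)++;
}
```

In this sketch, two operations whose response_info differs only in flow_id land in different groups, and a third operation that matches a full group opens a new one instead of being rejected.
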
Signed-off-by: Volodymyr Fialko --- drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 398 +++++++++++++++++++--- drivers/crypto/cnxk/cn10k_cryptodev_ops.h | 2 + drivers/crypto/cnxk/cnxk_cryptodev_ops.h | 9 +- drivers/crypto/cnxk/version.map | 1 + drivers/event/cnxk/cn10k_eventdev.c | 31 +- drivers/event/cnxk/cn10k_worker.h | 6 +- drivers/event/cnxk/cn9k_eventdev.c | 7 +- drivers/event/cnxk/cnxk_eventdev.h | 4 +- drivers/event/cnxk/cnxk_eventdev_adptr.c | 17 +- 9 files changed, 415 insertions(+), 60 deletions(-) diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c index 586941cd70..7bbe8726e3 100644 --- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c +++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c @@ -19,6 +19,25 @@ #include "roc_api.h" +#define PKTS_PER_LOOP 32 +#define PKTS_PER_STEORL 16 + +/* Holds information required to send crypto operations in one burst */ +struct ops_burst { + struct rte_crypto_op *op[PKTS_PER_LOOP]; + uint64_t w2[PKTS_PER_LOOP]; + struct cn10k_sso_hws *ws; + struct cnxk_cpt_qp *qp; + uint16_t nb_ops; +}; + +/* Holds information required to send vector of operations */ +struct vec_request { + struct cpt_inflight_req *req; + struct rte_event_vector *vec; + uint64_t w2; +}; + static inline struct cnxk_se_sess * cn10k_cpt_sym_temp_sess_create(struct cnxk_cpt_qp *qp, struct rte_crypto_op *op) { @@ -164,9 +183,6 @@ cn10k_cpt_fill_inst(struct cnxk_cpt_qp *qp, struct rte_crypto_op *ops[], return 1; } -#define PKTS_PER_LOOP 32 -#define PKTS_PER_STEORL 16 - static uint16_t cn10k_cpt_enqueue_burst(void *qptr, struct rte_crypto_op **ops, uint16_t nb_ops) { @@ -267,9 +283,9 @@ cn10k_cpt_crypto_adapter_ev_mdata_set(struct rte_cryptodev *dev __rte_unused, union rte_event_crypto_metadata *ec_mdata = mdata; struct rte_event *rsp_info; struct cnxk_cpt_qp *qp; + uint64_t w2, tag_type; uint8_t cdev_id; int16_t qp_id; - uint64_t w2; /* Get queue pair */ cdev_id = ec_mdata->request_info.cdev_id; @@ -277,9 +293,9 @@ 
cn10k_cpt_crypto_adapter_ev_mdata_set(struct rte_cryptodev *dev __rte_unused, qp = rte_cryptodevs[cdev_id].data->queue_pairs[qp_id]; /* Prepare w2 */ + tag_type = qp->ca.vector_sz ? RTE_EVENT_TYPE_CRYPTODEV_VECTOR : RTE_EVENT_TYPE_CRYPTODEV; rsp_info = &ec_mdata->response_info; - w2 = CNXK_CPT_INST_W2((RTE_EVENT_TYPE_CRYPTODEV << 28) | - (rsp_info->sub_event_type << 20) | + w2 = CNXK_CPT_INST_W2((tag_type << 28) | (rsp_info->sub_event_type << 20) | rsp_info->flow_id, rsp_info->sched_type, rsp_info->queue_id, 0); @@ -373,18 +389,236 @@ cn10k_ca_meta_info_extract(struct rte_crypto_op *op, return 0; } +static inline void +cn10k_cpt_vec_inst_fill(struct vec_request *vec_req, struct cpt_inst_s *inst, + struct cnxk_cpt_qp *qp) +{ + const union cpt_res_s res = {.cn10k.compcode = CPT_COMP_NOT_DONE}; + struct cpt_inflight_req *infl_req = vec_req->req; + + const union cpt_inst_w4 w4 = { + .s.opcode_major = ROC_SE_MAJOR_OP_MISC, + .s.opcode_minor = ROC_SE_MISC_MINOR_OP_PASSTHROUGH, + .s.param1 = 1, + .s.param2 = 1, + .s.dlen = 0, + }; + + infl_req->vec = vec_req->vec; + infl_req->qp = qp; + + inst->res_addr = (uint64_t)&infl_req->res; + __atomic_store_n(&infl_req->res.u64[0], res.u64[0], __ATOMIC_RELAXED); + + inst->w0.u64 = 0; + inst->w2.u64 = vec_req->w2; + inst->w3.u64 = CNXK_CPT_INST_W3(1, infl_req); + inst->w4.u64 = w4.u64; + inst->w7.u64 = ROC_CPT_DFLT_ENG_GRP_SE << 61; +} + +static void +cn10k_cpt_vec_pkt_submission_timeout_handle(void) +{ + plt_dp_err("Vector packet submission timed out"); + abort(); +} + +static inline void +cn10k_cpt_vec_submit(struct vec_request vec_tbl[], uint16_t vec_tbl_len, struct cnxk_cpt_qp *qp) +{ + uint64_t lmt_base, lmt_arg, lmt_id, io_addr; + union cpt_fc_write_s fc; + struct cpt_inst_s *inst; + uint16_t burst_size; + uint64_t *fc_addr; + int i; + + if (vec_tbl_len == 0) + return; + + const uint32_t fc_thresh = qp->lmtline.fc_thresh; + /* + * Use 10 mins timeout for the poll. 
It is not possible to recover from partial submission + * of vector packet. Actual packets for processing are submitted to CPT prior to this + * routine. Hence, any failure for submission of vector packet would indicate an + * unrecoverable error for the application. + */ + const uint64_t timeout = rte_get_timer_cycles() + 10 * 60 * rte_get_timer_hz(); + + lmt_base = qp->lmtline.lmt_base; + io_addr = qp->lmtline.io_addr; + fc_addr = qp->lmtline.fc_addr; + ROC_LMT_BASE_ID_GET(lmt_base, lmt_id); + inst = (struct cpt_inst_s *)lmt_base; + +again: + burst_size = RTE_MIN(PKTS_PER_STEORL, vec_tbl_len); + for (i = 0; i < burst_size; i++) + cn10k_cpt_vec_inst_fill(&vec_tbl[i], &inst[i * 2], qp); + + do { + fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED); + if (likely(fc.s.qsize < fc_thresh)) + break; + if (unlikely(rte_get_timer_cycles() > timeout)) + cn10k_cpt_vec_pkt_submission_timeout_handle(); + } while (true); + + lmt_arg = ROC_CN10K_CPT_LMT_ARG | (i - 1) << 12 | lmt_id; + roc_lmt_submit_steorl(lmt_arg, io_addr); + + rte_io_wmb(); + + vec_tbl_len -= i; + + if (vec_tbl_len > 0) { + vec_tbl += i; + goto again; + } +} + +static inline int +ca_lmtst_vec_submit(struct ops_burst *burst, struct vec_request vec_tbl[], uint16_t *vec_tbl_len) +{ + struct cpt_inflight_req *infl_reqs[PKTS_PER_LOOP]; + uint64_t lmt_base, lmt_arg, io_addr; + uint16_t lmt_id, len = *vec_tbl_len; + struct cpt_inst_s *inst, *inst_base; + struct cpt_inflight_req *infl_req; + struct rte_event_vector *vec; + union cpt_fc_write_s fc; + struct cnxk_cpt_qp *qp; + uint64_t *fc_addr; + int ret, i, vi; + + qp = burst->qp; + + lmt_base = qp->lmtline.lmt_base; + io_addr = qp->lmtline.io_addr; + fc_addr = qp->lmtline.fc_addr; + + const uint32_t fc_thresh = qp->lmtline.fc_thresh; + + ROC_LMT_BASE_ID_GET(lmt_base, lmt_id); + inst_base = (struct cpt_inst_s *)lmt_base; + +#ifdef CNXK_CRYPTODEV_DEBUG + if (unlikely(!qp->ca.enabled)) { + rte_errno = EINVAL; + return 0; + } +#endif + + /* Perform fc check before 
putting packets into vectors */ + fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED); + if (unlikely(fc.s.qsize > fc_thresh)) { + rte_errno = EAGAIN; + return 0; + } + + if (unlikely(rte_mempool_get_bulk(qp->ca.req_mp, (void **)infl_reqs, burst->nb_ops))) { + rte_errno = ENOMEM; + return 0; + } + + for (i = 0; i < burst->nb_ops; i++) { + inst = &inst_base[2 * i]; + infl_req = infl_reqs[i]; + infl_req->op_flags = 0; + + ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req); + if (unlikely(ret != 1)) { + plt_cpt_dbg("Could not process op: %p", burst->op[i]); + if (i != 0) + goto submit; + else + goto put; + } + + infl_req->res.cn10k.compcode = CPT_COMP_NOT_DONE; + infl_req->qp = qp; + inst->w3.u64 = 0x1; + + /* Lookup for existing vector by w2 */ + for (vi = len - 1; vi >= 0; vi--) { + if (vec_tbl[vi].w2 != burst->w2[i]) + continue; + vec = vec_tbl[vi].vec; + if (unlikely(vec->nb_elem == qp->ca.vector_sz)) + continue; + vec->ptrs[vec->nb_elem++] = infl_req; + goto next_op; /* continue outer loop */ + } + + /* No available vectors found, allocate a new one */ + if (unlikely(rte_mempool_get(qp->ca.vector_mp, (void **)&vec_tbl[len].vec))) { + rte_errno = ENOMEM; + if (i != 0) + goto submit; + else + goto put; + } + /* Also preallocate in-flight request, that will be used to + * submit misc passthrough instruction + */ + if (unlikely(rte_mempool_get(qp->ca.req_mp, (void **)&vec_tbl[len].req))) { + rte_mempool_put(qp->ca.vector_mp, vec_tbl[len].vec); + rte_errno = ENOMEM; + if (i != 0) + goto submit; + else + goto put; + } + vec_tbl[len].w2 = burst->w2[i]; + vec_tbl[len].vec->ptrs[0] = infl_req; + vec_tbl[len].vec->nb_elem = 1; + len++; + +next_op:; + } + + /* Submit operations in burst */ +submit: + if (CNXK_TT_FROM_TAG(burst->ws->gw_rdata) == SSO_TT_ORDERED) + roc_sso_hws_head_wait(burst->ws->base); + + if (i > PKTS_PER_STEORL) { + lmt_arg = ROC_CN10K_CPT_LMT_ARG | (PKTS_PER_STEORL - 1) << 12 | (uint64_t)lmt_id; + roc_lmt_submit_steorl(lmt_arg, io_addr); + 
lmt_arg = ROC_CN10K_CPT_LMT_ARG | (i - PKTS_PER_STEORL - 1) << 12 | + (uint64_t)(lmt_id + PKTS_PER_STEORL); + roc_lmt_submit_steorl(lmt_arg, io_addr); + } else { + lmt_arg = ROC_CN10K_CPT_LMT_ARG | (i - 1) << 12 | (uint64_t)lmt_id; + roc_lmt_submit_steorl(lmt_arg, io_addr); + } + + rte_io_wmb(); + +put: + if (i != burst->nb_ops) + rte_mempool_put_bulk(qp->ca.req_mp, (void *)&infl_reqs[i], burst->nb_ops - i); + + *vec_tbl_len = len; + + return i; +} + static inline uint16_t -ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_qp *qp, - struct rte_crypto_op *op[], uint16_t nb_ops) +ca_lmtst_burst_submit(struct ops_burst *burst) { struct cpt_inflight_req *infl_reqs[PKTS_PER_LOOP]; uint64_t lmt_base, lmt_arg, io_addr; struct cpt_inst_s *inst, *inst_base; struct cpt_inflight_req *infl_req; union cpt_fc_write_s fc; + struct cnxk_cpt_qp *qp; uint64_t *fc_addr; uint16_t lmt_id; - int ret, i; + int ret, i, j; + + qp = burst->qp; lmt_base = qp->lmtline.lmt_base; io_addr = qp->lmtline.io_addr; @@ -395,24 +629,26 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q ROC_LMT_BASE_ID_GET(lmt_base, lmt_id); inst_base = (struct cpt_inst_s *)lmt_base; +#ifdef CNXK_CRYPTODEV_DEBUG if (unlikely(!qp->ca.enabled)) { rte_errno = EINVAL; return 0; } +#endif - if (unlikely(rte_mempool_get_bulk(qp->ca.req_mp, (void **)infl_reqs, nb_ops))) { + if (unlikely(rte_mempool_get_bulk(qp->ca.req_mp, (void **)infl_reqs, burst->nb_ops))) { rte_errno = ENOMEM; return 0; } - for (i = 0; i < nb_ops; i++) { + for (i = 0; i < burst->nb_ops; i++) { inst = &inst_base[2 * i]; infl_req = infl_reqs[i]; infl_req->op_flags = 0; - ret = cn10k_cpt_fill_inst(qp, &op[i], inst, infl_req); + ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req); if (unlikely(ret != 1)) { - plt_dp_dbg("Could not process op: %p", op[i]); + plt_dp_dbg("Could not process op: %p", burst->op[i]); if (i != 0) goto submit; else @@ -423,20 +659,25 @@ ca_lmtst_burst_submit(struct 
cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q infl_req->qp = qp; inst->w0.u64 = 0; inst->res_addr = (uint64_t)&infl_req->res; - inst->w2.u64 = w2[i]; + inst->w2.u64 = burst->w2[i]; inst->w3.u64 = CNXK_CPT_INST_W3(1, infl_req); } fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED); if (unlikely(fc.s.qsize > fc_thresh)) { rte_errno = EAGAIN; + for (j = 0; j < i; j++) { + infl_req = infl_reqs[j]; + if (unlikely(infl_req->op_flags & CPT_OP_FLAGS_METABUF)) + rte_mempool_put(qp->meta_info.pool, infl_req->mdata); + } i = 0; goto put; } submit: - if (CNXK_TT_FROM_TAG(ws->gw_rdata) == SSO_TT_ORDERED) - roc_sso_hws_head_wait(ws->base); + if (CNXK_TT_FROM_TAG(burst->ws->gw_rdata) == SSO_TT_ORDERED) + roc_sso_hws_head_wait(burst->ws->base); if (i > PKTS_PER_STEORL) { lmt_arg = ROC_CN10K_CPT_LMT_ARG | (PKTS_PER_STEORL - 1) << 12 | (uint64_t)lmt_id; @@ -452,8 +693,8 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q rte_io_wmb(); put: - if (unlikely(i != nb_ops)) - rte_mempool_put_bulk(qp->ca.req_mp, (void *)&infl_reqs[i], nb_ops - i); + if (unlikely(i != burst->nb_ops)) + rte_mempool_put_bulk(qp->ca.req_mp, (void *)&infl_reqs[i], burst->nb_ops - i); return i; } @@ -461,42 +702,76 @@ ca_lmtst_burst_submit(struct cn10k_sso_hws *ws, uint64_t w2[], struct cnxk_cpt_q uint16_t __rte_hot cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event ev[], uint16_t nb_events) { - struct rte_crypto_op *ops[PKTS_PER_LOOP], *op; - struct cnxk_cpt_qp *qp, *curr_qp = NULL; - uint64_t w2s[PKTS_PER_LOOP], w2; - uint16_t submitted, count = 0; - int ret, i, ops_len = 0; + uint16_t submitted, count = 0, vec_tbl_len = 0; + struct vec_request vec_tbl[nb_events]; + struct rte_crypto_op *op; + struct ops_burst burst; + struct cnxk_cpt_qp *qp; + bool is_vector = false; + uint64_t w2; + int ret, i; + + burst.ws = ws; + burst.qp = NULL; + burst.nb_ops = 0; for (i = 0; i < nb_events; i++) { op = ev[i].event_ptr; ret = cn10k_ca_meta_info_extract(op, &qp, &w2); if 
(unlikely(ret)) { rte_errno = EINVAL; - return count; + goto vec_submit; } - if (qp != curr_qp) { - if (ops_len) { - submitted = ca_lmtst_burst_submit(ws, w2s, curr_qp, ops, ops_len); + /* Queue pair change check */ + if (qp != burst.qp) { + if (burst.nb_ops) { + if (is_vector) { + submitted = + ca_lmtst_vec_submit(&burst, vec_tbl, &vec_tbl_len); + /* + * Vector submission is required on qp change, but not in + * other cases, since we could send several vectors per + * lmtst instruction only for same qp + */ + cn10k_cpt_vec_submit(vec_tbl, vec_tbl_len, burst.qp); + vec_tbl_len = 0; + } else { + submitted = ca_lmtst_burst_submit(&burst); + } count += submitted; - if (unlikely(submitted != ops_len)) - return count; - ops_len = 0; + if (unlikely(submitted != burst.nb_ops)) + goto vec_submit; + burst.nb_ops = 0; } - curr_qp = qp; + is_vector = qp->ca.vector_sz; + burst.qp = qp; } - w2s[ops_len] = w2; - ops[ops_len] = op; - if (++ops_len == PKTS_PER_LOOP) { - submitted = ca_lmtst_burst_submit(ws, w2s, curr_qp, ops, ops_len); + burst.w2[burst.nb_ops] = w2; + burst.op[burst.nb_ops] = op; + + /* Max nb_ops per burst check */ + if (++burst.nb_ops == PKTS_PER_LOOP) { + if (is_vector) + submitted = ca_lmtst_vec_submit(&burst, vec_tbl, &vec_tbl_len); + else + submitted = ca_lmtst_burst_submit(&burst); count += submitted; - if (unlikely(submitted != ops_len)) - return count; - ops_len = 0; + if (unlikely(submitted != burst.nb_ops)) + goto vec_submit; + burst.nb_ops = 0; } } - if (ops_len) - count += ca_lmtst_burst_submit(ws, w2s, curr_qp, ops, ops_len); + /* Submit the rest of crypto operations */ + if (burst.nb_ops) { + if (is_vector) + count += ca_lmtst_vec_submit(&burst, vec_tbl, &vec_tbl_len); + else + count += ca_lmtst_burst_submit(&burst); + } + +vec_submit: + cn10k_cpt_vec_submit(vec_tbl, vec_tbl_len, burst.qp); return count; } @@ -654,6 +929,49 @@ cn10k_cpt_crypto_adapter_dequeue(uintptr_t get_work1) return (uintptr_t)cop; } +uintptr_t 
+cn10k_cpt_crypto_adapter_vector_dequeue(uintptr_t get_work1) +{ + struct cpt_inflight_req *infl_req, *vec_infl_req; + struct rte_mempool *meta_mp, *req_mp; + struct rte_event_vector *vec; + struct rte_crypto_op *cop; + struct cnxk_cpt_qp *qp; + union cpt_res_s res; + int i; + + vec_infl_req = (struct cpt_inflight_req *)(get_work1); + + vec = vec_infl_req->vec; + qp = vec_infl_req->qp; + meta_mp = qp->meta_info.pool; + req_mp = qp->ca.req_mp; + +#ifdef CNXK_CRYPTODEV_DEBUG + res.u64[0] = __atomic_load_n(&vec_infl_req->res.u64[0], __ATOMIC_RELAXED); + PLT_ASSERT(res.cn10k.compcode == CPT_COMP_WARN); + PLT_ASSERT(res.cn10k.uc_compcode == 0); +#endif + + for (i = 0; i < vec->nb_elem; i++) { + infl_req = vec->ptrs[i]; + cop = infl_req->cop; + + res.u64[0] = __atomic_load_n(&infl_req->res.u64[0], __ATOMIC_RELAXED); + cn10k_cpt_dequeue_post_process(qp, cop, infl_req, &res.cn10k); + + vec->ptrs[i] = cop; + if (unlikely(infl_req->op_flags & CPT_OP_FLAGS_METABUF)) + rte_mempool_put(meta_mp, infl_req->mdata); + + rte_mempool_put(req_mp, infl_req); + } + + rte_mempool_put(req_mp, vec_infl_req); + + return (uintptr_t)vec; +} + static uint16_t cn10k_cpt_dequeue_burst(void *qptr, struct rte_crypto_op **ops, uint16_t nb_ops) { diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.h b/drivers/crypto/cnxk/cn10k_cryptodev_ops.h index 628d6a567c..8104310c30 100644 --- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.h +++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.h @@ -18,5 +18,7 @@ uint16_t __rte_hot cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event e uint16_t nb_events); __rte_internal uintptr_t cn10k_cpt_crypto_adapter_dequeue(uintptr_t get_work1); +__rte_internal +uintptr_t cn10k_cpt_crypto_adapter_vector_dequeue(uintptr_t get_work1); #endif /* _CN10K_CRYPTODEV_OPS_H_ */ diff --git a/drivers/crypto/cnxk/cnxk_cryptodev_ops.h b/drivers/crypto/cnxk/cnxk_cryptodev_ops.h index ffe4ae19aa..d9ed43b40b 100644 --- a/drivers/crypto/cnxk/cnxk_cryptodev_ops.h +++ 
b/drivers/crypto/cnxk/cnxk_cryptodev_ops.h @@ -37,7 +37,10 @@ struct cpt_qp_meta_info { struct cpt_inflight_req { union cpt_res_s res; - struct rte_crypto_op *cop; + union { + struct rte_crypto_op *cop; + struct rte_event_vector *vec; + }; void *mdata; uint8_t op_flags; void *qp; @@ -63,6 +66,10 @@ struct crypto_adpter_info { /**< Set if queue pair is added to crypto adapter */ struct rte_mempool *req_mp; /**< CPT inflight request mempool */ + uint16_t vector_sz; + /**< Maximum number of cops to combine into single vector */ + struct rte_mempool *vector_mp; + /**< Pool for allocating rte_event_vector */ }; struct cnxk_cpt_qp { diff --git a/drivers/crypto/cnxk/version.map b/drivers/crypto/cnxk/version.map index 0178c416ec..4735e70550 100644 --- a/drivers/crypto/cnxk/version.map +++ b/drivers/crypto/cnxk/version.map @@ -5,6 +5,7 @@ INTERNAL { cn9k_cpt_crypto_adapter_dequeue; cn10k_cpt_crypto_adapter_enqueue; cn10k_cpt_crypto_adapter_dequeue; + cn10k_cpt_crypto_adapter_vector_dequeue; local: *; }; diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c index c55d69724b..742e43a5c6 100644 --- a/drivers/event/cnxk/cn10k_eventdev.c +++ b/drivers/event/cnxk/cn10k_eventdev.c @@ -1025,7 +1025,8 @@ cn10k_crypto_adapter_caps_get(const struct rte_eventdev *event_dev, CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k"); *caps = RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD | - RTE_EVENT_CRYPTO_ADAPTER_CAP_SESSION_PRIVATE_DATA; + RTE_EVENT_CRYPTO_ADAPTER_CAP_SESSION_PRIVATE_DATA | + RTE_EVENT_CRYPTO_ADAPTER_CAP_EVENT_VECTOR; return 0; } @@ -1039,23 +1040,20 @@ cn10k_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev); int ret; - RTE_SET_USED(conf); - CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k"); CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k"); dev->is_ca_internal_port = 1; cn10k_sso_fp_fns_set((struct rte_eventdev *)(uintptr_t)event_dev); - ret = 
cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id); + ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id, conf); cn10k_sso_set_priv_mem(event_dev, NULL, 0); return ret; } static int -cn10k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev, - const struct rte_cryptodev *cdev, +cn10k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev, int32_t queue_pair_id) { CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k"); @@ -1072,6 +1070,26 @@ cn10k_tim_caps_get(const struct rte_eventdev *evdev, uint64_t flags, cn10k_sso_set_priv_mem); } +static int +cn10k_crypto_adapter_vec_limits(const struct rte_eventdev *event_dev, + const struct rte_cryptodev *cdev, + struct rte_event_crypto_adapter_vector_limits *limits) +{ + CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k"); + CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn10k"); + + limits->log2_sz = false; + limits->min_sz = 0; + limits->max_sz = UINT16_MAX; + /* Unused timeout, in software implementation we aggregate all crypto + * operations passed to the enqueue function + */ + limits->min_timeout_ns = 0; + limits->max_timeout_ns = 0; + + return 0; +} + static struct eventdev_ops cn10k_sso_dev_ops = { .dev_infos_get = cn10k_sso_info_get, .dev_configure = cn10k_sso_dev_configure, @@ -1109,6 +1127,7 @@ static struct eventdev_ops cn10k_sso_dev_ops = { .crypto_adapter_caps_get = cn10k_crypto_adapter_caps_get, .crypto_adapter_queue_pair_add = cn10k_crypto_adapter_qp_add, .crypto_adapter_queue_pair_del = cn10k_crypto_adapter_qp_del, + .crypto_adapter_vector_limits_get = cn10k_crypto_adapter_vec_limits, .xstats_get = cnxk_sso_xstats_get, .xstats_reset = cnxk_sso_xstats_reset, diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h index 41b6ba8912..7a82dd352a 100644 --- a/drivers/event/cnxk/cn10k_worker.h +++ b/drivers/event/cnxk/cn10k_worker.h @@ -230,6 +230,9 @@ cn10k_sso_hws_post_process(struct cn10k_sso_hws *ws, uint64_t 
*u64, if ((flags & CPT_RX_WQE_F) && (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_CRYPTODEV)) { u64[1] = cn10k_cpt_crypto_adapter_dequeue(u64[1]); + } else if ((flags & CPT_RX_WQE_F) && + (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_CRYPTODEV_VECTOR)) { + u64[1] = cn10k_cpt_crypto_adapter_vector_dequeue(u64[1]); } else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_ETHDEV) { uint8_t port = CNXK_SUB_EVENT_FROM_TAG(u64[0]); uint64_t mbuf; @@ -272,8 +275,7 @@ cn10k_sso_hws_post_process(struct cn10k_sso_hws *ws, uint64_t *u64, cn10k_sso_process_tstamp(u64[1], mbuf, ws->tstamp[port]); u64[1] = mbuf; - } else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == - RTE_EVENT_TYPE_ETHDEV_VECTOR) { + } else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_ETHDEV_VECTOR) { uint8_t port = CNXK_SUB_EVENT_FROM_TAG(u64[0]); __uint128_t vwqe_hdr = *(__uint128_t *)u64[1]; diff --git a/drivers/event/cnxk/cn9k_eventdev.c b/drivers/event/cnxk/cn9k_eventdev.c index fca7b5f3a5..f5a42a86f8 100644 --- a/drivers/event/cnxk/cn9k_eventdev.c +++ b/drivers/event/cnxk/cn9k_eventdev.c @@ -1131,23 +1131,20 @@ cn9k_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev); int ret; - RTE_SET_USED(conf); - CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn9k"); CNXK_VALID_DEV_OR_ERR_RET(cdev->device, "crypto_cn9k"); dev->is_ca_internal_port = 1; cn9k_sso_fp_fns_set((struct rte_eventdev *)(uintptr_t)event_dev); - ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id); + ret = cnxk_crypto_adapter_qp_add(event_dev, cdev, queue_pair_id, conf); cn9k_sso_set_priv_mem(event_dev, NULL, 0); return ret; } static int -cn9k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev, - const struct rte_cryptodev *cdev, +cn9k_crypto_adapter_qp_del(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev, int32_t queue_pair_id) { CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn9k"); diff --git 
a/drivers/event/cnxk/cnxk_eventdev.h b/drivers/event/cnxk/cnxk_eventdev.h index 293e0fff3f..f68c2aee23 100644 --- a/drivers/event/cnxk/cnxk_eventdev.h +++ b/drivers/event/cnxk/cnxk_eventdev.h @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -305,7 +306,8 @@ int cnxk_sso_tx_adapter_start(uint8_t id, const struct rte_eventdev *event_dev); int cnxk_sso_tx_adapter_stop(uint8_t id, const struct rte_eventdev *event_dev); int cnxk_sso_tx_adapter_free(uint8_t id, const struct rte_eventdev *event_dev); int cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, - const struct rte_cryptodev *cdev, int32_t queue_pair_id); + const struct rte_cryptodev *cdev, int32_t queue_pair_id, + const struct rte_event_crypto_adapter_queue_conf *conf); int cnxk_crypto_adapter_qp_del(const struct rte_cryptodev *cdev, int32_t queue_pair_id); #endif /* __CNXK_EVENTDEV_H__ */ diff --git a/drivers/event/cnxk/cnxk_eventdev_adptr.c b/drivers/event/cnxk/cnxk_eventdev_adptr.c index 3ba5b246f0..5ec436382c 100644 --- a/drivers/event/cnxk/cnxk_eventdev_adptr.c +++ b/drivers/event/cnxk/cnxk_eventdev_adptr.c @@ -641,7 +641,8 @@ cnxk_sso_tx_adapter_free(uint8_t id __rte_unused, } static int -crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp) +crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp, + const struct rte_event_crypto_adapter_queue_conf *conf) { char name[RTE_MEMPOOL_NAMESIZE]; uint32_t cache_size, nb_req; @@ -674,6 +675,10 @@ crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp if (qp->ca.req_mp == NULL) return -ENOMEM; + if (conf != NULL) { + qp->ca.vector_sz = conf->vector_sz; + qp->ca.vector_mp = conf->vector_mp; + } qp->ca.enabled = true; return 0; @@ -681,7 +686,8 @@ crypto_adapter_qp_setup(const struct rte_cryptodev *cdev, struct cnxk_cpt_qp *qp int cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rte_cryptodev *cdev, - int32_t 
queue_pair_id) + int32_t queue_pair_id, + const struct rte_event_crypto_adapter_queue_conf *conf) { struct cnxk_sso_evdev *sso_evdev = cnxk_sso_pmd_priv(event_dev); uint32_t adptr_xae_cnt = 0; @@ -693,7 +699,7 @@ cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rt for (qp_id = 0; qp_id < cdev->data->nb_queue_pairs; qp_id++) { qp = cdev->data->queue_pairs[qp_id]; - ret = crypto_adapter_qp_setup(cdev, qp); + ret = crypto_adapter_qp_setup(cdev, qp, conf); if (ret) { cnxk_crypto_adapter_qp_del(cdev, -1); return ret; @@ -702,7 +708,7 @@ cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rt } } else { qp = cdev->data->queue_pairs[queue_pair_id]; - ret = crypto_adapter_qp_setup(cdev, qp); + ret = crypto_adapter_qp_setup(cdev, qp, conf); if (ret) return ret; adptr_xae_cnt = qp->ca.req_mp->size; @@ -733,7 +739,8 @@ crypto_adapter_qp_free(struct cnxk_cpt_qp *qp) } int -cnxk_crypto_adapter_qp_del(const struct rte_cryptodev *cdev, int32_t queue_pair_id) +cnxk_crypto_adapter_qp_del(const struct rte_cryptodev *cdev, + int32_t queue_pair_id) { struct cnxk_cpt_qp *qp;