From patchwork Sat Sep 4 07:46:29 2021
X-Patchwork-Submitter: Shijith Thotton
X-Patchwork-Id: 97979
X-Patchwork-Delegate: jerinj@marvell.com
From: Shijith Thotton
Date: Sat, 4 Sep 2021 13:16:29 +0530
Message-ID: <2cc3afedfb25bb53bed04c3f286b1849aa20e6b8.1630740268.git.sthotton@marvell.com>
X-Mailer: git-send-email 2.25.1
Subject: [dpdk-dev] [PATCH v2] examples/l3fwd: add changes to use event vector

Added support for receiving packets as event vectors. This is disabled
by default and can be enabled with the --enable-vector option. The
vector size and the timeout to form a vector can be configured with the
--vector-size and --vector-tmo-ns options.
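For context (not part of this patch), the sketch below shows how a
worker consumes vector events once vectorization is enabled. dev_id,
port_id and process_pkt() are hypothetical placeholders for an already
configured event device port and the application's per-packet handler;
rte_event_dequeue_burst() and struct rte_event_vector come from
<rte_eventdev.h>:

	struct rte_event ev;
	uint16_t i;

	while (rte_event_dequeue_burst(dev_id, port_id, &ev, 1, 0)) {
		/* ev.vec points to a vector of mbufs instead of one mbuf */
		struct rte_event_vector *vec = ev.vec;

		for (i = 0; i < vec->nb_elem; i++)
			process_pkt(vec->mbufs[i]); /* hypothetical handler */
	}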
Example:
	dpdk-l3fwd -l 0-3 -n 4 -- -p 0x03 --mode=eventdev \
		--eventq-sched=ordered --enable-vector --vector-size 16

Signed-off-by: Shijith Thotton
---
Depends-on: series=18322 (eventdev: simplify Rx adapter event vector config)

v2:
 * Fixed setting event vector attribute.

 doc/guides/sample_app_ug/l3_forward.rst    |   7 +
 examples/l3fwd/l3fwd.h                     |  26 ++++
 examples/l3fwd/l3fwd_em.c                  | 104 +++++++++++++
 examples/l3fwd/l3fwd_em.h                  |  37 +++++
 examples/l3fwd/l3fwd_em_hlm.h              |  69 +++++++++
 examples/l3fwd/l3fwd_em_sequential.h       |  25 ++++
 examples/l3fwd/l3fwd_event.c               |  57 ++++---
 examples/l3fwd/l3fwd_event.h               |  25 ++++
 examples/l3fwd/l3fwd_event_internal_port.c |  28 +++-
 examples/l3fwd/l3fwd_fib.c                 | 164 +++++++++++++++++++++
 examples/l3fwd/l3fwd_lpm.c                 | 121 +++++++++++++++
 examples/l3fwd/main.c                      |  58 ++++++++
 12 files changed, 698 insertions(+), 23 deletions(-)

diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst
index 2d5cd5f1c0..be11e7d7f7 100644
--- a/doc/guides/sample_app_ug/l3_forward.rst
+++ b/doc/guides/sample_app_ug/l3_forward.rst
@@ -74,6 +74,7 @@ The application has a number of command line options::
              [--mode]
              [--eventq-sched]
              [--event-eth-rxqs]
+             [--enable-vector [--vector-size SIZE] [--vector-tmo-ns NS]]
              [-E]
              [-L]
 
@@ -115,6 +116,12 @@ Where,
 * ``--event-eth-rxqs:`` Optional, Number of ethernet RX queues per device.
   Only valid if --mode=eventdev.
 
+* ``--enable-vector:`` Optional, Enable event vectorization. Only valid if --mode=eventdev.
+
+* ``--vector-size:`` Optional, Max vector size if event vectorization is enabled.
+
+* ``--vector-tmo-ns:`` Optional, Max timeout to form vector in nanoseconds if event vectorization is enabled.
+
 * ``-E:`` Optional, enable exact match,
   legacy flag, please use ``--lookup=em`` instead.
 
diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index a808d60247..9607ee0fbb 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -28,6 +28,8 @@
 #define MEMPOOL_CACHE_SIZE 256
 #define MAX_RX_QUEUE_PER_LCORE 16
+#define VECTOR_SIZE_DEFAULT MAX_PKT_BURST
+#define VECTOR_TMO_NS_DEFAULT 1E6 /* 1ms */
 
 /*
  * Try to avoid TX buffering if we have at least MAX_TX_BURST packets to send.
  */
@@ -221,6 +223,14 @@ int
 lpm_event_main_loop_tx_q(__rte_unused void *dummy);
 int
 lpm_event_main_loop_tx_q_burst(__rte_unused void *dummy);
+int
+lpm_event_main_loop_tx_d_vector(__rte_unused void *dummy);
+int
+lpm_event_main_loop_tx_d_burst_vector(__rte_unused void *dummy);
+int
+lpm_event_main_loop_tx_q_vector(__rte_unused void *dummy);
+int
+lpm_event_main_loop_tx_q_burst_vector(__rte_unused void *dummy);
 
 int
 em_event_main_loop_tx_d(__rte_unused void *dummy);
@@ -230,6 +240,14 @@ int
 em_event_main_loop_tx_q(__rte_unused void *dummy);
 int
 em_event_main_loop_tx_q_burst(__rte_unused void *dummy);
+int
+em_event_main_loop_tx_d_vector(__rte_unused void *dummy);
+int
+em_event_main_loop_tx_d_burst_vector(__rte_unused void *dummy);
+int
+em_event_main_loop_tx_q_vector(__rte_unused void *dummy);
+int
+em_event_main_loop_tx_q_burst_vector(__rte_unused void *dummy);
 
 int
 fib_event_main_loop_tx_d(__rte_unused void *dummy);
@@ -239,6 +257,14 @@ int
 fib_event_main_loop_tx_q(__rte_unused void *dummy);
 int
 fib_event_main_loop_tx_q_burst(__rte_unused void *dummy);
+int
+fib_event_main_loop_tx_d_vector(__rte_unused void *dummy);
+int
+fib_event_main_loop_tx_d_burst_vector(__rte_unused void *dummy);
+int
+fib_event_main_loop_tx_q_vector(__rte_unused void *dummy);
+int
+fib_event_main_loop_tx_q_burst_vector(__rte_unused void *dummy);
 
 /* Return ipv4/ipv6 fwd lookup struct for LPM, EM or FIB. */
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 2a8ab6aab5..ff5e56766c 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -878,6 +878,110 @@ em_event_main_loop_tx_q_burst(__rte_unused void *dummy)
 	return 0;
 }
 
+/* Same eventdev loop for single and burst of vector */
+static __rte_always_inline void
+em_event_loop_vector(struct l3fwd_event_resources *evt_rsrc,
+		     const uint8_t flags)
+{
+	const int event_p_id = l3fwd_get_free_event_port(evt_rsrc);
+	const uint8_t tx_q_id =
+		evt_rsrc->evq.event_q_id[evt_rsrc->evq.nb_queues - 1];
+	const uint8_t event_d_id = evt_rsrc->event_d_id;
+	const uint16_t deq_len = evt_rsrc->deq_depth;
+	struct rte_event events[MAX_PKT_BURST];
+	struct lcore_conf *lconf;
+	unsigned int lcore_id;
+	int i, nb_enq, nb_deq;
+
+	if (event_p_id < 0)
+		return;
+
+	lcore_id = rte_lcore_id();
+	lconf = &lcore_conf[lcore_id];
+
+	RTE_LOG(INFO, L3FWD, "entering %s on lcore %u\n", __func__, lcore_id);
+
+	while (!force_quit) {
+		/* Read events from RX queues */
+		nb_deq = rte_event_dequeue_burst(event_d_id, event_p_id, events,
+						 deq_len, 0);
+		if (nb_deq == 0) {
+			rte_pause();
+			continue;
+		}
+
+		for (i = 0; i < nb_deq; i++) {
+			if (flags & L3FWD_EVENT_TX_ENQ) {
+				events[i].queue_id = tx_q_id;
+				events[i].op = RTE_EVENT_OP_FORWARD;
+			}
+
+#if defined RTE_ARCH_X86 || defined __ARM_NEON
+			l3fwd_em_process_event_vector(events[i].vec, lconf);
+#else
+			l3fwd_em_no_opt_process_event_vector(events[i].vec,
+							     lconf);
+#endif
+			if (flags & L3FWD_EVENT_TX_DIRECT)
+				event_vector_txq_set(events[i].vec, 0);
+		}
+
+		if (flags & L3FWD_EVENT_TX_ENQ) {
+			nb_enq = rte_event_enqueue_burst(event_d_id, event_p_id,
+							 events, nb_deq);
+			while (nb_enq < nb_deq && !force_quit)
+				nb_enq += rte_event_enqueue_burst(
+					event_d_id, event_p_id, events + nb_enq,
+					nb_deq - nb_enq);
+		}
+
+		if (flags & L3FWD_EVENT_TX_DIRECT) {
+			nb_enq = rte_event_eth_tx_adapter_enqueue(
+				event_d_id, event_p_id, events, nb_deq, 0);
+			while (nb_enq < nb_deq && !force_quit)
+				nb_enq += rte_event_eth_tx_adapter_enqueue(
+					event_d_id, event_p_id, events + nb_enq,
+					nb_deq - nb_enq, 0);
+		}
+	}
+}
+
+int __rte_noinline
+em_event_main_loop_tx_d_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	em_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_DIRECT);
+	return 0;
+}
+
+int __rte_noinline
+em_event_main_loop_tx_d_burst_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	em_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_DIRECT);
+	return 0;
+}
+
+int __rte_noinline
+em_event_main_loop_tx_q_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	em_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_ENQ);
+	return 0;
+}
+
+int __rte_noinline
+em_event_main_loop_tx_q_burst_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	em_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_ENQ);
+	return 0;
+}
+
 /* Initialize exact match (hash) parameters. 8< */
 void
 setup_hash(const int socketid)
diff --git a/examples/l3fwd/l3fwd_em.h b/examples/l3fwd/l3fwd_em.h
index b992a21da4..56c93e4463 100644
--- a/examples/l3fwd/l3fwd_em.h
+++ b/examples/l3fwd/l3fwd_em.h
@@ -175,4 +175,41 @@ l3fwd_em_no_opt_process_events(int nb_rx, struct rte_event **events,
 		l3fwd_em_simple_process(events[j]->mbuf, qconf);
 }
 
+static inline void
+l3fwd_em_no_opt_process_event_vector(struct rte_event_vector *vec,
+				     struct lcore_conf *qconf)
+{
+	struct rte_mbuf **mbufs = vec->mbufs;
+	int32_t i;
+
+	/* Prefetch first packets */
+	for (i = 0; i < PREFETCH_OFFSET && i < vec->nb_elem; i++)
+		rte_prefetch0(rte_pktmbuf_mtod(mbufs[i], void *));
+
+	/* Process first packet to init vector attributes */
+	l3fwd_em_simple_process(mbufs[0], qconf);
+	if (vec->attr_valid) {
+		if (mbufs[0]->port != BAD_PORT)
+			vec->port = mbufs[0]->port;
+		else
+			vec->attr_valid = 0;
+	}
+
+	/*
+	 * Prefetch and forward already prefetched packets.
+	 */
+	for (i = 1; i < (vec->nb_elem - PREFETCH_OFFSET); i++) {
+		rte_prefetch0(
+			rte_pktmbuf_mtod(mbufs[i + PREFETCH_OFFSET], void *));
+		l3fwd_em_simple_process(mbufs[i], qconf);
+		event_vector_attr_validate(vec, mbufs[i]);
+	}
+
+	/* Forward remaining prefetched packets */
+	for (; i < vec->nb_elem; i++) {
+		l3fwd_em_simple_process(mbufs[i], qconf);
+		event_vector_attr_validate(vec, mbufs[i]);
+	}
+}
+
 #endif /* __L3FWD_EM_H__ */
diff --git a/examples/l3fwd/l3fwd_em_hlm.h b/examples/l3fwd/l3fwd_em_hlm.h
index 278707c18c..e76f2760b0 100644
--- a/examples/l3fwd/l3fwd_em_hlm.h
+++ b/examples/l3fwd/l3fwd_em_hlm.h
@@ -318,4 +318,73 @@ l3fwd_em_process_events(int nb_rx, struct rte_event **ev,
 		process_packet(pkts_burst[j], &pkts_burst[j]->port);
 	}
 }
+
+static inline void
+l3fwd_em_process_event_vector(struct rte_event_vector *vec,
+			      struct lcore_conf *qconf)
+{
+	struct rte_mbuf **mbufs = vec->mbufs;
+	uint16_t dst_port[MAX_PKT_BURST];
+	int32_t i, j, n, pos;
+
+	for (j = 0; j < EM_HASH_LOOKUP_COUNT && j < vec->nb_elem; j++)
+		rte_prefetch0(
+			rte_pktmbuf_mtod(mbufs[j], struct rte_ether_hdr *) + 1);
+
+	if (vec->attr_valid)
+		vec->port = em_get_dst_port(qconf, mbufs[0], mbufs[0]->port);
+
+	n = RTE_ALIGN_FLOOR(vec->nb_elem, EM_HASH_LOOKUP_COUNT);
+	for (j = 0; j < n; j += EM_HASH_LOOKUP_COUNT) {
+		uint32_t pkt_type =
+			RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_TCP | RTE_PTYPE_L4_UDP;
+		uint32_t l3_type, tcp_or_udp;
+
+		for (i = 0; i < EM_HASH_LOOKUP_COUNT; i++)
+			pkt_type &= mbufs[j + i]->packet_type;
+
+		l3_type = pkt_type & RTE_PTYPE_L3_MASK;
+		tcp_or_udp = pkt_type & (RTE_PTYPE_L4_TCP | RTE_PTYPE_L4_UDP);
+
+		for (i = 0, pos = j + EM_HASH_LOOKUP_COUNT;
+		     i < EM_HASH_LOOKUP_COUNT && pos < vec->nb_elem;
+		     i++, pos++) {
+			rte_prefetch0(rte_pktmbuf_mtod(mbufs[pos],
+						       struct rte_ether_hdr *) +
+				      1);
+		}
+
+		if (tcp_or_udp && (l3_type == RTE_PTYPE_L3_IPV4)) {
+			em_get_dst_port_ipv4xN_events(qconf, &mbufs[j],
+						      &dst_port[j]);
+		} else if (tcp_or_udp && (l3_type == RTE_PTYPE_L3_IPV6)) {
+			em_get_dst_port_ipv6xN_events(qconf, &mbufs[j],
+						      &dst_port[j]);
+		} else {
+			for (i = 0; i < EM_HASH_LOOKUP_COUNT; i++) {
+				mbufs[j + i]->port =
+					em_get_dst_port(qconf, mbufs[j + i],
+							mbufs[j + i]->port);
+				process_packet(mbufs[j + i],
+					       &mbufs[j + i]->port);
+				event_vector_attr_validate(vec, mbufs[j + i]);
+			}
+			continue;
+		}
+		processx4_step3(&mbufs[j], &dst_port[j]);
+
+		for (i = 0; i < EM_HASH_LOOKUP_COUNT; i++) {
+			mbufs[j + i]->port = dst_port[j + i];
+			event_vector_attr_validate(vec, mbufs[j + i]);
+		}
+	}
+
+	for (; j < vec->nb_elem; j++) {
+		mbufs[j]->port =
+			em_get_dst_port(qconf, mbufs[j], mbufs[j]->port);
+		process_packet(mbufs[j], &mbufs[j]->port);
+		event_vector_attr_validate(vec, mbufs[j]);
+	}
+}
+
 #endif /* __L3FWD_EM_HLM_H__ */
diff --git a/examples/l3fwd/l3fwd_em_sequential.h b/examples/l3fwd/l3fwd_em_sequential.h
index 6170052cf8..f426c508ef 100644
--- a/examples/l3fwd/l3fwd_em_sequential.h
+++ b/examples/l3fwd/l3fwd_em_sequential.h
@@ -121,4 +121,29 @@ l3fwd_em_process_events(int nb_rx, struct rte_event **events,
 		process_packet(mbuf, &mbuf->port);
 	}
 }
+
+static inline void
+l3fwd_em_process_event_vector(struct rte_event_vector *vec,
+			      struct lcore_conf *qconf)
+{
+	struct rte_mbuf **mbufs = vec->mbufs;
+	int32_t i, j;
+
+	rte_prefetch0(rte_pktmbuf_mtod(mbufs[0], struct rte_ether_hdr *) + 1);
+
+	if (vec->attr_valid)
+		vec->port = em_get_dst_port(qconf, mbufs[0], mbufs[0]->port);
+
+	for (i = 0, j = 1; i < vec->nb_elem; i++, j++) {
+		if (j < vec->nb_elem)
+			rte_prefetch0(rte_pktmbuf_mtod(mbufs[j],
+						       struct rte_ether_hdr *) +
+				      1);
+		mbufs[i]->port =
+			em_get_dst_port(qconf, mbufs[i], mbufs[i]->port);
+		process_packet(mbufs[i], &mbufs[i]->port);
+		event_vector_attr_validate(vec, mbufs[i]);
+	}
+}
+
 #endif /* __L3FWD_EM_SEQUENTIAL_H__ */
diff --git a/examples/l3fwd/l3fwd_event.c b/examples/l3fwd/l3fwd_event.c
index 961860ea18..29172e590b 100644
--- a/examples/l3fwd/l3fwd_event.c
+++ b/examples/l3fwd/l3fwd_event.c
@@ -215,23 +215,35 @@ void
 l3fwd_event_resource_setup(struct rte_eth_conf *port_conf)
 {
 	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
-	const event_loop_cb lpm_event_loop[2][2] = {
-		[0][0] = lpm_event_main_loop_tx_d,
-		[0][1] = lpm_event_main_loop_tx_d_burst,
-		[1][0] = lpm_event_main_loop_tx_q,
-		[1][1] = lpm_event_main_loop_tx_q_burst,
+	const event_loop_cb lpm_event_loop[2][2][2] = {
+		[0][0][0] = lpm_event_main_loop_tx_d,
+		[0][0][1] = lpm_event_main_loop_tx_d_burst,
+		[0][1][0] = lpm_event_main_loop_tx_q,
+		[0][1][1] = lpm_event_main_loop_tx_q_burst,
+		[1][0][0] = lpm_event_main_loop_tx_d_vector,
+		[1][0][1] = lpm_event_main_loop_tx_d_burst_vector,
+		[1][1][0] = lpm_event_main_loop_tx_q_vector,
+		[1][1][1] = lpm_event_main_loop_tx_q_burst_vector,
 	};
-	const event_loop_cb em_event_loop[2][2] = {
-		[0][0] = em_event_main_loop_tx_d,
-		[0][1] = em_event_main_loop_tx_d_burst,
-		[1][0] = em_event_main_loop_tx_q,
-		[1][1] = em_event_main_loop_tx_q_burst,
+	const event_loop_cb em_event_loop[2][2][2] = {
+		[0][0][0] = em_event_main_loop_tx_d,
+		[0][0][1] = em_event_main_loop_tx_d_burst,
+		[0][1][0] = em_event_main_loop_tx_q,
+		[0][1][1] = em_event_main_loop_tx_q_burst,
+		[1][0][0] = em_event_main_loop_tx_d_vector,
+		[1][0][1] = em_event_main_loop_tx_d_burst_vector,
+		[1][1][0] = em_event_main_loop_tx_q_vector,
+		[1][1][1] = em_event_main_loop_tx_q_burst_vector,
 	};
-	const event_loop_cb fib_event_loop[2][2] = {
-		[0][0] = fib_event_main_loop_tx_d,
-		[0][1] = fib_event_main_loop_tx_d_burst,
-		[1][0] = fib_event_main_loop_tx_q,
-		[1][1] = fib_event_main_loop_tx_q_burst,
+	const event_loop_cb fib_event_loop[2][2][2] = {
+		[0][0][0] = fib_event_main_loop_tx_d,
+		[0][0][1] = fib_event_main_loop_tx_d_burst,
+		[0][1][0] = fib_event_main_loop_tx_q,
+		[0][1][1] = fib_event_main_loop_tx_q_burst,
+		[1][0][0] = fib_event_main_loop_tx_d_vector,
+		[1][0][1] = fib_event_main_loop_tx_d_burst_vector,
+		[1][1][0] = fib_event_main_loop_tx_q_vector,
+		[1][1][1] = fib_event_main_loop_tx_q_burst_vector,
 	};
 	uint32_t event_queue_cfg;
 	int ret;
@@ -265,12 +277,15 @@ l3fwd_event_resource_setup(struct rte_eth_conf *port_conf)
 	if (ret < 0)
 		rte_exit(EXIT_FAILURE, "Error in starting eventdev");
 
-	evt_rsrc->ops.lpm_event_loop = lpm_event_loop[evt_rsrc->tx_mode_q]
-						      [evt_rsrc->has_burst];
+	evt_rsrc->ops.lpm_event_loop =
+		lpm_event_loop[evt_rsrc->vector_enabled][evt_rsrc->tx_mode_q]
+			      [evt_rsrc->has_burst];
 
-	evt_rsrc->ops.em_event_loop = em_event_loop[evt_rsrc->tx_mode_q]
-						    [evt_rsrc->has_burst];
+	evt_rsrc->ops.em_event_loop =
+		em_event_loop[evt_rsrc->vector_enabled][evt_rsrc->tx_mode_q]
+			     [evt_rsrc->has_burst];
 
-	evt_rsrc->ops.fib_event_loop = fib_event_loop[evt_rsrc->tx_mode_q]
-						      [evt_rsrc->has_burst];
+	evt_rsrc->ops.fib_event_loop =
+		fib_event_loop[evt_rsrc->vector_enabled][evt_rsrc->tx_mode_q]
+			      [evt_rsrc->has_burst];
 }
diff --git a/examples/l3fwd/l3fwd_event.h b/examples/l3fwd/l3fwd_event.h
index 3ad1902ab5..f139632016 100644
--- a/examples/l3fwd/l3fwd_event.h
+++ b/examples/l3fwd/l3fwd_event.h
@@ -65,6 +65,7 @@ struct l3fwd_event_resources {
 	uint8_t disable_implicit_release;
 	struct l3fwd_event_setup_ops ops;
 	struct rte_mempool * (*pkt_pool)[NB_SOCKETS];
+	struct rte_mempool **vec_pool;
 	struct l3fwd_event_queues evq;
 	struct l3fwd_event_ports evp;
 	uint32_t port_mask;
@@ -76,8 +77,32 @@ struct l3fwd_event_resources {
 	uint8_t has_burst;
 	uint8_t enabled;
 	uint8_t eth_rx_queues;
+	uint8_t vector_enabled;
+	uint16_t vector_size;
+	uint64_t vector_tmo_ns;
 };
 
+static inline void
+event_vector_attr_validate(struct rte_event_vector *vec, struct rte_mbuf *mbuf)
+{
+	/* l3fwd application only changes mbuf port while processing */
+	if (vec->attr_valid && (vec->port != mbuf->port))
+		vec->attr_valid = 0;
+}
+
+static inline void
+event_vector_txq_set(struct rte_event_vector *vec, uint16_t txq)
+{
+	if (vec->attr_valid) {
+		vec->queue = txq;
+	} else {
+		int i;
+
+		for (i = 0; i < vec->nb_elem; i++)
+			rte_event_eth_tx_adapter_txq_set(vec->mbufs[i], txq);
+	}
+}
+
 struct l3fwd_event_resources *l3fwd_get_eventdev_rsrc(void);
 void l3fwd_event_resource_setup(struct rte_eth_conf *port_conf);
 int l3fwd_get_free_event_port(struct l3fwd_event_resources *eventdev_rsrc);
diff --git a/examples/l3fwd/l3fwd_event_internal_port.c b/examples/l3fwd/l3fwd_event_internal_port.c
index 9916a7f556..7b30cc37ca 100644
--- a/examples/l3fwd/l3fwd_event_internal_port.c
+++ b/examples/l3fwd/l3fwd_event_internal_port.c
@@ -215,12 +215,36 @@ l3fwd_rx_tx_adapter_setup_internal_port(void)
 			rte_panic("Failed to allocate memory for Rx adapter\n");
 	}
 
-
 	RTE_ETH_FOREACH_DEV(port_id) {
 		if ((evt_rsrc->port_mask & (1 << port_id)) == 0)
 			continue;
+
+		if (evt_rsrc->vector_enabled) {
+			uint32_t cap;
+
+			if (rte_event_eth_rx_adapter_caps_get(event_d_id,
+							      port_id, &cap))
+				rte_panic(
+					"Failed to get event rx adapter capability");
+
+			if (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR) {
+				eth_q_conf.vector_sz = evt_rsrc->vector_size;
+				eth_q_conf.vector_timeout_ns =
+					evt_rsrc->vector_tmo_ns;
+				eth_q_conf.vector_mp =
+					evt_rsrc->per_port_pool ?
+					evt_rsrc->vec_pool[port_id] :
+					evt_rsrc->vec_pool[0];
+				eth_q_conf.rx_queue_flags |=
+					RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR;
+			} else {
+				rte_panic(
+					"Rx adapter doesn't support event vector");
+			}
+		}
+
 		ret = rte_event_eth_rx_adapter_create(adapter_id, event_d_id,
-						      &evt_rsrc->def_p_conf);
+					&evt_rsrc->def_p_conf);
 		if (ret)
 			rte_panic("Failed to create rx adapter[%d]\n",
 				  adapter_id);
diff --git a/examples/l3fwd/l3fwd_fib.c b/examples/l3fwd/l3fwd_fib.c
index f8d6a3ac39..9b7487f62d 100644
--- a/examples/l3fwd/l3fwd_fib.c
+++ b/examples/l3fwd/l3fwd_fib.c
@@ -412,6 +412,170 @@ fib_event_main_loop_tx_q_burst(__rte_unused void *dummy)
 	return 0;
 }
 
+static __rte_always_inline void
+fib_process_event_vector(struct rte_event_vector *vec)
+{
+	uint8_t ipv6_arr[MAX_PKT_BURST][RTE_FIB6_IPV6_ADDR_SIZE];
+	uint64_t hopsv4[MAX_PKT_BURST], hopsv6[MAX_PKT_BURST];
+	uint32_t ipv4_arr_assem, ipv6_arr_assem;
+	struct rte_mbuf **mbufs = vec->mbufs;
+	uint32_t ipv4_arr[MAX_PKT_BURST];
+	uint8_t type_arr[MAX_PKT_BURST];
+	uint32_t ipv4_cnt, ipv6_cnt;
+	struct lcore_conf *lconf;
+	uint16_t nh;
+	int i;
+
+	lconf = &lcore_conf[rte_lcore_id()];
+
+	/* Reset counters. */
+	ipv4_cnt = 0;
+	ipv6_cnt = 0;
+	ipv4_arr_assem = 0;
+	ipv6_arr_assem = 0;
+
+	/* Prefetch first packets. */
+	for (i = 0; i < FIB_PREFETCH_OFFSET && i < vec->nb_elem; i++)
+		rte_prefetch0(rte_pktmbuf_mtod(mbufs[i], void *));
+
+	/* Parse packet info and prefetch. */
+	for (i = 0; i < (vec->nb_elem - FIB_PREFETCH_OFFSET); i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(mbufs[i + FIB_PREFETCH_OFFSET],
+					       void *));
+		fib_parse_packet(mbufs[i], &ipv4_arr[ipv4_cnt], &ipv4_cnt,
+				 ipv6_arr[ipv6_cnt], &ipv6_cnt, &type_arr[i]);
+	}
+
+	/* Parse remaining packet info. */
+	for (; i < vec->nb_elem; i++)
+		fib_parse_packet(mbufs[i], &ipv4_arr[ipv4_cnt], &ipv4_cnt,
+				 ipv6_arr[ipv6_cnt], &ipv6_cnt, &type_arr[i]);
+
+	/* Lookup IPv4 hops if IPv4 packets are present. */
+	if (likely(ipv4_cnt > 0))
+		rte_fib_lookup_bulk(lconf->ipv4_lookup_struct, ipv4_arr, hopsv4,
+				    ipv4_cnt);
+
+	/* Lookup IPv6 hops if IPv6 packets are present. */
+	if (ipv6_cnt > 0)
+		rte_fib6_lookup_bulk(lconf->ipv6_lookup_struct, ipv6_arr,
+				     hopsv6, ipv6_cnt);
+
+	if (vec->attr_valid) {
+		nh = type_arr[0] ? (uint16_t)hopsv4[0] : (uint16_t)hopsv6[0];
+		if (nh != FIB_DEFAULT_HOP)
+			vec->port = nh;
+		else
+			vec->attr_valid = 0;
+	}
+
+	/* Assign ports looked up in fib depending on IPv4 or IPv6 */
+	for (i = 0; i < vec->nb_elem; i++) {
+		if (type_arr[i])
+			nh = (uint16_t)hopsv4[ipv4_arr_assem++];
+		else
+			nh = (uint16_t)hopsv6[ipv6_arr_assem++];
+		if (nh != FIB_DEFAULT_HOP)
+			mbufs[i]->port = nh;
+		event_vector_attr_validate(vec, mbufs[i]);
+	}
+}
+
+static __rte_always_inline void
+fib_event_loop_vector(struct l3fwd_event_resources *evt_rsrc,
+		      const uint8_t flags)
+{
+	const int event_p_id = l3fwd_get_free_event_port(evt_rsrc);
+	const uint8_t tx_q_id =
+		evt_rsrc->evq.event_q_id[evt_rsrc->evq.nb_queues - 1];
+	const uint8_t event_d_id = evt_rsrc->event_d_id;
+	const uint16_t deq_len = evt_rsrc->deq_depth;
+	struct rte_event events[MAX_PKT_BURST];
+	int nb_enq, nb_deq, i;
+
+	if (event_p_id < 0)
+		return;
+
+	RTE_LOG(INFO, L3FWD, "entering %s on lcore %u\n", __func__,
+		rte_lcore_id());
+
+	while (!force_quit) {
+		/* Read events from RX queues. */
+		nb_deq = rte_event_dequeue_burst(event_d_id, event_p_id, events,
+						 deq_len, 0);
+		if (nb_deq == 0) {
+			rte_pause();
+			continue;
+		}
+
+		for (i = 0; i < nb_deq; i++) {
+			if (flags & L3FWD_EVENT_TX_ENQ) {
+				events[i].queue_id = tx_q_id;
+				events[i].op = RTE_EVENT_OP_FORWARD;
+			}
+
+			fib_process_event_vector(events[i].vec);
+
+			if (flags & L3FWD_EVENT_TX_DIRECT)
+				event_vector_txq_set(events[i].vec, 0);
+		}
+
+		if (flags & L3FWD_EVENT_TX_ENQ) {
+			nb_enq = rte_event_enqueue_burst(event_d_id, event_p_id,
+							 events, nb_deq);
+			while (nb_enq < nb_deq && !force_quit)
+				nb_enq += rte_event_enqueue_burst(
+					event_d_id, event_p_id, events + nb_enq,
+					nb_deq - nb_enq);
+		}
+
+		if (flags & L3FWD_EVENT_TX_DIRECT) {
+			nb_enq = rte_event_eth_tx_adapter_enqueue(
+				event_d_id, event_p_id, events, nb_deq, 0);
+			while (nb_enq < nb_deq && !force_quit)
+				nb_enq += rte_event_eth_tx_adapter_enqueue(
+					event_d_id, event_p_id, events + nb_enq,
+					nb_deq - nb_enq, 0);
+		}
+	}
+}
+
+int __rte_noinline
+fib_event_main_loop_tx_d_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	fib_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_DIRECT);
+	return 0;
+}
+
+int __rte_noinline
+fib_event_main_loop_tx_d_burst_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	fib_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_DIRECT);
+	return 0;
+}
+
+int __rte_noinline
+fib_event_main_loop_tx_q_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	fib_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_ENQ);
+	return 0;
+}
+
+int __rte_noinline
+fib_event_main_loop_tx_q_burst_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	fib_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_ENQ);
+	return 0;
+}
+
 /* Function to setup fib. 8< */
 void
 setup_fib(const int socketid)
diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index 7200160164..5b1351c5bb 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -427,6 +427,127 @@ lpm_event_main_loop_tx_q_burst(__rte_unused void *dummy)
 	return 0;
 }
 
+static __rte_always_inline void
+lpm_process_event_vector(struct rte_event_vector *vec, struct lcore_conf *lconf)
+{
+	struct rte_mbuf **mbufs = vec->mbufs;
+	int i;
+
+	/* Process first packet to init vector attributes */
+	lpm_process_event_pkt(lconf, mbufs[0]);
+	if (vec->attr_valid) {
+		if (mbufs[0]->port != BAD_PORT)
+			vec->port = mbufs[0]->port;
+		else
+			vec->attr_valid = 0;
+	}
+
+	for (i = 1; i < vec->nb_elem; i++) {
+		lpm_process_event_pkt(lconf, mbufs[i]);
+		event_vector_attr_validate(vec, mbufs[i]);
+	}
+}
+
+/* Same eventdev loop for single and burst of vector */
+static __rte_always_inline void
+lpm_event_loop_vector(struct l3fwd_event_resources *evt_rsrc,
+		      const uint8_t flags)
+{
+	const int event_p_id = l3fwd_get_free_event_port(evt_rsrc);
+	const uint8_t tx_q_id =
+		evt_rsrc->evq.event_q_id[evt_rsrc->evq.nb_queues - 1];
+	const uint8_t event_d_id = evt_rsrc->event_d_id;
+	const uint16_t deq_len = evt_rsrc->deq_depth;
+	struct rte_event events[MAX_PKT_BURST];
+	struct lcore_conf *lconf;
+	unsigned int lcore_id;
+	int i, nb_enq, nb_deq;
+
+	if (event_p_id < 0)
+		return;
+
+	lcore_id = rte_lcore_id();
+	lconf = &lcore_conf[lcore_id];
+
+	RTE_LOG(INFO, L3FWD, "entering %s on lcore %u\n", __func__, lcore_id);
+
+	while (!force_quit) {
+		/* Read events from RX queues */
+		nb_deq = rte_event_dequeue_burst(event_d_id, event_p_id, events,
+						 deq_len, 0);
+		if (nb_deq == 0) {
+			rte_pause();
+			continue;
+		}
+
+		for (i = 0; i < nb_deq; i++) {
+			if (flags & L3FWD_EVENT_TX_ENQ) {
+				events[i].queue_id = tx_q_id;
+				events[i].op = RTE_EVENT_OP_FORWARD;
+			}
+
+			lpm_process_event_vector(events[i].vec, lconf);
+
+			if (flags & L3FWD_EVENT_TX_DIRECT)
+				event_vector_txq_set(events[i].vec, 0);
+		}
+
+		if (flags & L3FWD_EVENT_TX_ENQ) {
+			nb_enq = rte_event_enqueue_burst(event_d_id, event_p_id,
+							 events, nb_deq);
+			while (nb_enq < nb_deq && !force_quit)
+				nb_enq += rte_event_enqueue_burst(
+					event_d_id, event_p_id, events + nb_enq,
+					nb_deq - nb_enq);
+		}
+
+		if (flags & L3FWD_EVENT_TX_DIRECT) {
+			nb_enq = rte_event_eth_tx_adapter_enqueue(
+				event_d_id, event_p_id, events, nb_deq, 0);
+			while (nb_enq < nb_deq && !force_quit)
+				nb_enq += rte_event_eth_tx_adapter_enqueue(
+					event_d_id, event_p_id, events + nb_enq,
+					nb_deq - nb_enq, 0);
+		}
+	}
+}
+
+int __rte_noinline
+lpm_event_main_loop_tx_d_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	lpm_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_DIRECT);
+	return 0;
+}
+
+int __rte_noinline
+lpm_event_main_loop_tx_d_burst_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	lpm_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_DIRECT);
+	return 0;
+}
+
+int __rte_noinline
+lpm_event_main_loop_tx_q_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	lpm_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_ENQ);
+	return 0;
+}
+
+int __rte_noinline
+lpm_event_main_loop_tx_q_burst_vector(__rte_unused void *dummy)
+{
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+
+	lpm_event_loop_vector(evt_rsrc, L3FWD_EVENT_TX_ENQ);
+	return 0;
+}
+
 void
 setup_lpm(const int socketid)
 {
diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 00ac267af1..7382a557aa 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -137,6 +137,7 @@ static struct rte_eth_conf port_conf = {
 };
 
 static struct rte_mempool *pktmbuf_pool[RTE_MAX_ETHPORTS][NB_SOCKETS];
+static struct rte_mempool *vector_pool[RTE_MAX_ETHPORTS];
 static uint8_t lkp_per_socket[NB_SOCKETS];
 
 struct l3fwd_lkp_mode {
@@ -334,6 +335,7 @@ print_usage(const char *prgname)
 		" [--per-port-pool]"
 		" [--mode]"
 		" [--eventq-sched]"
+		" [--enable-vector [--vector-size SIZE] [--vector-tmo-ns NS]]"
 		" [-E]"
 		" [-L]\n\n"
 
@@ -361,6 +363,9 @@ print_usage(const char *prgname)
 		" --event-eth-rxqs: Number of ethernet RX queues per device.\n"
 		"                   Default: 1\n"
 		"                   Valid only if --mode=eventdev\n"
+		" --enable-vector:  Enable event vectorization.\n"
+		" --vector-size:    Max vector size if event vectorization is enabled.\n"
+		" --vector-tmo-ns:  Max timeout to form vector in nanoseconds if event vectorization is enabled\n"
 		" -E : Enable exact match, legacy flag please use --lookup=em instead\n"
 		" -L : Enable longest prefix match, legacy flag please use --lookup=lpm instead\n\n",
 		prgname);
@@ -574,6 +579,10 @@ static const char short_options[] =
 #define CMD_LINE_OPT_EVENTQ_SYNC "eventq-sched"
 #define CMD_LINE_OPT_EVENT_ETH_RX_QUEUES "event-eth-rxqs"
 #define CMD_LINE_OPT_LOOKUP "lookup"
+#define CMD_LINE_OPT_ENABLE_VECTOR "enable-vector"
+#define CMD_LINE_OPT_VECTOR_SIZE "vector-size"
+#define CMD_LINE_OPT_VECTOR_TMO_NS "vector-tmo-ns"
+
 enum {
 	/* long options mapped to a short option */
 
@@ -592,6 +601,9 @@ enum {
 	CMD_LINE_OPT_EVENTQ_SYNC_NUM,
 	CMD_LINE_OPT_EVENT_ETH_RX_QUEUES_NUM,
 	CMD_LINE_OPT_LOOKUP_NUM,
+	CMD_LINE_OPT_ENABLE_VECTOR_NUM,
+	CMD_LINE_OPT_VECTOR_SIZE_NUM,
+	CMD_LINE_OPT_VECTOR_TMO_NS_NUM
 };
 
 static const struct option lgopts[] = {
@@ -608,6 +620,9 @@ static const struct option lgopts[] = {
 	{CMD_LINE_OPT_EVENT_ETH_RX_QUEUES, 1, 0,
 	 CMD_LINE_OPT_EVENT_ETH_RX_QUEUES_NUM},
 	{CMD_LINE_OPT_LOOKUP, 1, 0, CMD_LINE_OPT_LOOKUP_NUM},
+	{CMD_LINE_OPT_ENABLE_VECTOR, 0, 0, CMD_LINE_OPT_ENABLE_VECTOR_NUM},
+	{CMD_LINE_OPT_VECTOR_SIZE, 1, 0, CMD_LINE_OPT_VECTOR_SIZE_NUM},
+	{CMD_LINE_OPT_VECTOR_TMO_NS, 1, 0, CMD_LINE_OPT_VECTOR_TMO_NS_NUM},
 	{NULL, 0, 0, 0}
 };
 
@@ -774,6 +789,16 @@ parse_args(int argc, char **argv)
 				return -1;
 			break;
 
+		case CMD_LINE_OPT_ENABLE_VECTOR_NUM:
+			printf("event vectorization is enabled\n");
+			evt_rsrc->vector_enabled = 1;
+			break;
+		case CMD_LINE_OPT_VECTOR_SIZE_NUM:
+			evt_rsrc->vector_size = strtol(optarg, NULL, 10);
+			break;
+		case CMD_LINE_OPT_VECTOR_TMO_NS_NUM:
+			evt_rsrc->vector_tmo_ns = strtoull(optarg, NULL, 10);
+			break;
 		default:
 			print_usage(prgname);
 			return -1;
@@ -795,6 +820,19 @@ parse_args(int argc, char **argv)
 		return -1;
 	}
 
+	if (evt_rsrc->vector_enabled && !evt_rsrc->vector_size) {
+		evt_rsrc->vector_size = VECTOR_SIZE_DEFAULT;
+		fprintf(stderr, "vector size set to default (%" PRIu16 ")\n",
+			evt_rsrc->vector_size);
+	}
+
+	if (evt_rsrc->vector_enabled && !evt_rsrc->vector_tmo_ns) {
+		evt_rsrc->vector_tmo_ns = VECTOR_TMO_NS_DEFAULT;
+		fprintf(stderr,
+			"vector timeout set to default (%" PRIu64 " ns)\n",
+			evt_rsrc->vector_tmo_ns);
+	}
+
 	/*
 	 * Nothing is selected, pick longest-prefix match
 	 * as default match.
@@ -833,6 +871,7 @@ print_ethaddr(const char *name, const struct rte_ether_addr *eth_addr)
 int
 init_mem(uint16_t portid, unsigned int nb_mbuf)
 {
+	struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
 	struct lcore_conf *qconf;
 	int socketid;
 	unsigned lcore_id;
@@ -876,6 +915,24 @@ init_mem(uint16_t portid, unsigned int nb_mbuf)
 			lkp_per_socket[socketid] = 1;
 		}
 	}
+
+	if (evt_rsrc->vector_enabled && vector_pool[portid] == NULL) {
+		unsigned int nb_vec;
+
+		nb_vec = (nb_mbuf + evt_rsrc->vector_size - 1) /
+			 evt_rsrc->vector_size;
+		snprintf(s, sizeof(s), "vector_pool_%d", portid);
+		vector_pool[portid] = rte_event_vector_pool_create(
+			s, nb_vec, 0, evt_rsrc->vector_size, socketid);
+		if (vector_pool[portid] == NULL)
+			rte_exit(EXIT_FAILURE,
+				 "Failed to create vector pool for port %d\n",
+				 portid);
+		else
+			printf("Allocated vector pool for port %d\n",
+			       portid);
+	}
+
 	qconf = &lcore_conf[lcore_id];
 	qconf->ipv4_lookup_struct =
 		l3fwd_lkp.get_ipv4_lookup_struct(socketid);
@@ -1307,6 +1364,7 @@ main(int argc, char **argv)
 
 	evt_rsrc->per_port_pool = per_port_pool;
 	evt_rsrc->pkt_pool = pktmbuf_pool;
+	evt_rsrc->vec_pool = vector_pool;
 	evt_rsrc->port_mask = enabled_port_mask;
 	/* Configure eventdev parameters if user has requested */
 	if (evt_rsrc->enabled) {
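For reference (not part of this patch), the Rx adapter side of
vectorization configured in l3fwd_rx_tx_adapter_setup_internal_port()
above reduces to the sketch below. dev_id, eth_port_id and vec_pool
(obtained from rte_event_vector_pool_create()) are assumed to exist,
the remaining queue conf fields (e.g. the ev member) are omitted, and
the vector_sz/vector_timeout_ns/vector_mp members of the queue conf are
only available with the "simplify Rx adapter event vector config"
series this patch depends on:

	struct rte_event_eth_rx_adapter_queue_conf qconf;
	uint32_t cap;

	memset(&qconf, 0, sizeof(qconf));
	if (rte_event_eth_rx_adapter_caps_get(dev_id, eth_port_id, &cap) == 0 &&
	    (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR)) {
		/* Ask the adapter to aggregate mbufs into event vectors. */
		qconf.rx_queue_flags |=
			RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR;
		qconf.vector_sz = 16;              /* --vector-size */
		qconf.vector_timeout_ns = 1000000; /* --vector-tmo-ns (1 ms) */
		qconf.vector_mp = vec_pool;
	}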