From patchwork Thu Sep 19 06:25:48 2019
X-Patchwork-Id: 59354
From: Leyi Rong
To: haiyue.wang@intel.com, wenzhuo.lu@intel.com, qi.z.zhang@intel.com, xiaolong.ye@intel.com
Cc: dev@dpdk.org
Date: Thu, 19 Sep 2019 14:25:48 +0800
Message-Id: <20190919062553.79257-2-leyi.rong@intel.com>
Subject: [dpdk-dev] [PATCH v4 1/6] net/ice: add Rx flex descriptor definition

From: Haiyue Wang

The Rx flex descriptor comes in 16B and 32B sizes, with field definitions that differ from the legacy descriptor type.
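For orientation, a minimal sketch (not part of the patch) of how the flex write-back fields are read; the union layout itself lives in the shared base code, and the field and macro names below are the ones the later patches in this series use:

/* Sketch only: pkt_len and ptype are plain little-endian 16-bit fields in
 * the flex write-back, unlike the legacy format where they are packed into
 * the 64-bit qword1.
 */
static inline uint16_t
flex_desc_pkt_len(volatile union ice_rx_flex_desc *rxdp)
{
	return rte_le_to_cpu_16(rxdp->wb.pkt_len) & ICE_RX_FLX_DESC_PKT_LEN_M;
}

static inline uint16_t
flex_desc_ptype(volatile union ice_rx_flex_desc *rxdp)
{
	return rte_le_to_cpu_16(rxdp->wb.ptype_flex_flags0) &
	       ICE_RX_FLEX_DESC_PTYPE_M;
}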
Signed-off-by: Haiyue Wang
---
 drivers/net/ice/ice_rxtx.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index e9214110c..64e891875 100644 --- a/drivers/net/ice/ice_rxtx.h +++ b/drivers/net/ice/ice_rxtx.h @@ -21,8 +21,10 @@ #ifdef RTE_LIBRTE_ICE_16BYTE_RX_DESC #define ice_rx_desc ice_16byte_rx_desc +#define ice_rx_flex_desc ice_16b_rx_flex_desc #else #define ice_rx_desc ice_32byte_rx_desc +#define ice_rx_flex_desc ice_32b_rx_flex_desc #endif #define ICE_SUPPORT_CHAIN_NUM 5

From patchwork Thu Sep 19 06:25:49 2019
X-Patchwork-Id: 59355
From: Leyi Rong
To: haiyue.wang@intel.com, wenzhuo.lu@intel.com, qi.z.zhang@intel.com, xiaolong.ye@intel.com
Cc: dev@dpdk.org
Date: Thu, 19 Sep 2019 14:25:49 +0800
Message-Id: <20190919062553.79257-3-leyi.rong@intel.com>
Subject: [dpdk-dev] [PATCH v4 2/6] net/ice: handle the Rx flex descriptor

From: Haiyue Wang

Set the RXDID to the flex descriptor type by default, and change the Rx functions to support the new descriptor handling.

Signed-off-by: Haiyue Wang
Reviewed-by: Xiaolong Ye
---
 drivers/net/ice/ice_rxtx.c | 236 +++++++++++++++++--------------------
 1 file changed, 111 insertions(+), 125 deletions(-)
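The heart of the change is the descriptor-done test; here is a hedged side-by-side sketch distilled from the hunks below (the helper names are illustrative, not part of the patch):

/* legacy: the status bits are packed into the 64-bit qword1 and must be
 * shifted out before the DD bit can be tested
 */
static inline int
legacy_desc_done(volatile union ice_rx_desc *rxdp)
{
	uint64_t qword1 = rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len);

	return ((qword1 & ICE_RXD_QW1_STATUS_M) >> ICE_RXD_QW1_STATUS_S) &
	       (1 << ICE_RX_DESC_STATUS_DD_S);
}

/* flex: status_error0 is a dedicated 16-bit field, so the DD test is a
 * single masked 16-bit load
 */
static inline int
flex_desc_done(volatile union ice_rx_flex_desc *rxdp)
{
	return rte_le_to_cpu_16(rxdp->wb.status_error0) &
	       (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
}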
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index 0282b5375..d2e36853f 100644 --- a/drivers/net/ice/ice_rxtx.c +++ b/drivers/net/ice/ice_rxtx.c @@ -13,7 +13,6 @@ PKT_TX_TCP_SEG | \ PKT_TX_OUTER_IP_CKSUM) -#define ICE_RX_ERR_BITS 0x3f static enum ice_status ice_program_hw_rx_queue(struct ice_rx_queue *rxq) @@ -25,18 +24,9 @@ ice_program_hw_rx_queue(struct ice_rx_queue *rxq) enum ice_status err; uint16_t buf_size, len; struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode; + uint32_t rxdid = ICE_RXDID_COMMS_GENERIC; uint32_t regval; - /** - * The kernel driver uses flex descriptor. It sets the register - * to flex descriptor mode. - * DPDK uses legacy descriptor. It should set the register back - * to the default value, then uses legacy descriptor mode. - */ - regval = (0x01 << QRXFLXP_CNTXT_RXDID_PRIO_S) & - QRXFLXP_CNTXT_RXDID_PRIO_M; - ICE_WRITE_REG(hw, QRXFLXP_CNTXT(rxq->reg_idx), regval); - /* Set buffer size as the head split is disabled. */ buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mp) - RTE_PKTMBUF_HEADROOM); @@ -94,6 +84,21 @@ ice_program_hw_rx_queue(struct ice_rx_queue *rxq) rx_ctx.showiv = 0; rx_ctx.crcstrip = (rxq->crc_len == 0) ? 1 : 0; + /* Enable Flexible Descriptors in the queue context which + * allows this driver to select a specific receive descriptor format + */ + regval = (rxdid << QRXFLXP_CNTXT_RXDID_IDX_S) & + QRXFLXP_CNTXT_RXDID_IDX_M; + + /* increasing context priority to pick up profile ID; + * default is 0x01; setting to 0x03 to ensure profile + * is programmed if prev context is of same priority + */ + regval |= (0x03 << QRXFLXP_CNTXT_RXDID_PRIO_S) & + QRXFLXP_CNTXT_RXDID_PRIO_M; + + ICE_WRITE_REG(hw, QRXFLXP_CNTXT(rxq->reg_idx), regval); + err = ice_clear_rxq_ctx(hw, rxq->reg_idx); if (err) { PMD_DRV_LOG(ERR, "Failed to clear Lan Rx queue (%u) context", @@ -961,16 +966,15 @@ uint32_t ice_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) { #define ICE_RXQ_SCAN_INTERVAL 4 - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_queue *rxq; uint16_t desc = 0; rxq = dev->data->rx_queues[rx_queue_id]; - rxdp = &rxq->rx_ring[rxq->rx_tail]; + rxdp = (volatile union ice_rx_flex_desc *)&rxq->rx_ring[rxq->rx_tail]; while ((desc < rxq->nb_rx_desc) && - ((rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len) & - ICE_RXD_QW1_STATUS_M) >> ICE_RXD_QW1_STATUS_S) & - (1 << ICE_RX_DESC_STATUS_DD_S)) { + rte_le_to_cpu_16(rxdp->wb.status_error0) & + (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S)) { /** * Check the DD bit of a rx descriptor of each 4 in a group, * to avoid checking too frequently and downgrading performance @@ -979,79 +983,77 @@ ice_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) desc += ICE_RXQ_SCAN_INTERVAL; rxdp += ICE_RXQ_SCAN_INTERVAL; if (rxq->rx_tail + desc >= rxq->nb_rx_desc) - rxdp = &(rxq->rx_ring[rxq->rx_tail + + rxdp = (volatile union ice_rx_flex_desc *) + &(rxq->rx_ring[rxq->rx_tail + desc - rxq->nb_rx_desc]); } return desc; } -/* Translate the rx descriptor status to pkt flags */ -static inline uint64_t -ice_rxd_status_to_pkt_flags(uint64_t qword) -{ - uint64_t flags; - - /* Check if RSS_HASH */ - flags = (((qword >> ICE_RX_DESC_STATUS_FLTSTAT_S) & - ICE_RX_DESC_FLTSTAT_RSS_HASH) == - ICE_RX_DESC_FLTSTAT_RSS_HASH) ?
PKT_RX_RSS_HASH : 0; - - return flags; -} +#define ICE_RX_FLEX_ERR0_BITS \ + ((1 << ICE_RX_FLEX_DESC_STATUS0_HBO_S) | \ + (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S) | \ + (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S) | \ + (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S) | \ + (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S) | \ + (1 << ICE_RX_FLEX_DESC_STATUS0_RXE_S)) /* Rx L3/L4 checksum */ static inline uint64_t -ice_rxd_error_to_pkt_flags(uint64_t qword) +ice_rxd_error_to_pkt_flags(uint16_t stat_err0) { uint64_t flags = 0; - uint64_t error_bits = (qword >> ICE_RXD_QW1_ERROR_S); - if (likely((error_bits & ICE_RX_ERR_BITS) == 0)) { + /* check if HW has decoded the packet and checksum */ + if (unlikely(!(stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_L3L4P_S)))) + return 0; + + if (likely(!(stat_err0 & ICE_RX_FLEX_ERR0_BITS))) { flags |= (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD); return flags; } - if (unlikely(error_bits & (1 << ICE_RX_DESC_ERROR_IPE_S))) + if (unlikely(stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S))) flags |= PKT_RX_IP_CKSUM_BAD; else flags |= PKT_RX_IP_CKSUM_GOOD; - if (unlikely(error_bits & (1 << ICE_RX_DESC_ERROR_L4E_S))) + if (unlikely(stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S))) flags |= PKT_RX_L4_CKSUM_BAD; else flags |= PKT_RX_L4_CKSUM_GOOD; - if (unlikely(error_bits & (1 << ICE_RX_DESC_ERROR_EIPE_S))) + if (unlikely(stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S))) flags |= PKT_RX_EIP_CKSUM_BAD; return flags; } static inline void -ice_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union ice_rx_desc *rxdp) +ice_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union ice_rx_flex_desc *rxdp) { - if (rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len) & - (1 << ICE_RX_DESC_STATUS_L2TAG1P_S)) { + if (rte_le_to_cpu_16(rxdp->wb.status_error0) & + (1 << ICE_RX_FLEX_DESC_STATUS0_L2TAG1P_S)) { mb->ol_flags |= PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED; mb->vlan_tci = - rte_le_to_cpu_16(rxdp->wb.qword0.lo_dword.l2tag1); + rte_le_to_cpu_16(rxdp->wb.l2tag1); PMD_RX_LOG(DEBUG, "Descriptor l2tag1: %u", - rte_le_to_cpu_16(rxdp->wb.qword0.lo_dword.l2tag1)); + rte_le_to_cpu_16(rxdp->wb.l2tag1)); } else { mb->vlan_tci = 0; } #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC - if (rte_le_to_cpu_16(rxdp->wb.qword2.ext_status) & - (1 << ICE_RX_DESC_EXT_STATUS_L2TAG2P_S)) { + if (rte_le_to_cpu_16(rxdp->wb.status_error1) & + (1 << ICE_RX_FLEX_DESC_STATUS1_L2TAG2P_S)) { mb->ol_flags |= PKT_RX_QINQ_STRIPPED | PKT_RX_QINQ | PKT_RX_VLAN_STRIPPED | PKT_RX_VLAN; mb->vlan_tci_outer = mb->vlan_tci; - mb->vlan_tci = rte_le_to_cpu_16(rxdp->wb.qword2.l2tag2_2); + mb->vlan_tci = rte_le_to_cpu_16(rxdp->wb.l2tag2_2nd); PMD_RX_LOG(DEBUG, "Descriptor l2tag2_1: %u, l2tag2_2: %u", - rte_le_to_cpu_16(rxdp->wb.qword2.l2tag2_1), - rte_le_to_cpu_16(rxdp->wb.qword2.l2tag2_2)); + rte_le_to_cpu_16(rxdp->wb.l2tag2_1st), + rte_le_to_cpu_16(rxdp->wb.l2tag2_2nd)); } else { mb->vlan_tci_outer = 0; } @@ -1060,6 +1062,21 @@ ice_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union ice_rx_desc *rxdp) mb->vlan_tci, mb->vlan_tci_outer); } +static inline void +ice_rxd_to_pkt_fields(struct rte_mbuf *mb, + volatile union ice_rx_flex_desc *rxdp) +{ + volatile struct ice_32b_rx_flex_desc_comms *desc = + (volatile struct ice_32b_rx_flex_desc_comms *)rxdp; + uint16_t stat_err; + + stat_err = rte_le_to_cpu_16(desc->status_error0); + if (likely(stat_err & (1 << ICE_RX_FLEX_DESC_STATUS0_RSS_VALID_S))) { + mb->ol_flags |= PKT_RX_RSS_HASH; + mb->hash.rss = rte_le_to_cpu_32(desc->rss_hash); + } +} + #ifdef RTE_LIBRTE_ICE_RX_ALLOW_BULK_ALLOC 
#define ICE_LOOK_AHEAD 8 #if (ICE_LOOK_AHEAD != 8) @@ -1068,25 +1085,23 @@ ice_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union ice_rx_desc *rxdp) static inline int ice_rx_scan_hw_ring(struct ice_rx_queue *rxq) { - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *rxep; struct rte_mbuf *mb; + uint16_t stat_err0; uint16_t pkt_len; - uint64_t qword1; - uint32_t rx_status; int32_t s[ICE_LOOK_AHEAD], nb_dd; int32_t i, j, nb_rx = 0; uint64_t pkt_flags = 0; uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl; - rxdp = &rxq->rx_ring[rxq->rx_tail]; + rxdp = (volatile union ice_rx_flex_desc *)&rxq->rx_ring[rxq->rx_tail]; rxep = &rxq->sw_ring[rxq->rx_tail]; - qword1 = rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len); - rx_status = (qword1 & ICE_RXD_QW1_STATUS_M) >> ICE_RXD_QW1_STATUS_S; + stat_err0 = rte_le_to_cpu_16(rxdp->wb.status_error0); /* Make sure there is at least 1 packet to receive */ - if (!(rx_status & (1 << ICE_RX_DESC_STATUS_DD_S))) + if (!(stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S))) return 0; /** @@ -1096,42 +1111,31 @@ ice_rx_scan_hw_ring(struct ice_rx_queue *rxq) for (i = 0; i < ICE_RX_MAX_BURST; i += ICE_LOOK_AHEAD, rxdp += ICE_LOOK_AHEAD, rxep += ICE_LOOK_AHEAD) { /* Read desc statuses backwards to avoid race condition */ - for (j = ICE_LOOK_AHEAD - 1; j >= 0; j--) { - qword1 = rte_le_to_cpu_64( - rxdp[j].wb.qword1.status_error_len); - s[j] = (qword1 & ICE_RXD_QW1_STATUS_M) >> - ICE_RXD_QW1_STATUS_S; - } + for (j = ICE_LOOK_AHEAD - 1; j >= 0; j--) + s[j] = rte_le_to_cpu_16(rxdp[j].wb.status_error0); rte_smp_rmb(); /* Compute how many status bits were set */ for (j = 0, nb_dd = 0; j < ICE_LOOK_AHEAD; j++) - nb_dd += s[j] & (1 << ICE_RX_DESC_STATUS_DD_S); + nb_dd += s[j] & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S); nb_rx += nb_dd; /* Translate descriptor info to mbuf parameters */ for (j = 0; j < nb_dd; j++) { mb = rxep[j].mbuf; - qword1 = rte_le_to_cpu_64( - rxdp[j].wb.qword1.status_error_len); - pkt_len = ((qword1 & ICE_RXD_QW1_LEN_PBUF_M) >> - ICE_RXD_QW1_LEN_PBUF_S) - rxq->crc_len; + pkt_len = (rte_le_to_cpu_16(rxdp[j].wb.pkt_len) & + ICE_RX_FLX_DESC_PKT_LEN_M) - rxq->crc_len; mb->data_len = pkt_len; mb->pkt_len = pkt_len; mb->ol_flags = 0; - pkt_flags = ice_rxd_status_to_pkt_flags(qword1); - pkt_flags |= ice_rxd_error_to_pkt_flags(qword1); - if (pkt_flags & PKT_RX_RSS_HASH) - mb->hash.rss = - rte_le_to_cpu_32( - rxdp[j].wb.qword0.hi_dword.rss); - mb->packet_type = ptype_tbl[(uint8_t)( - (qword1 & - ICE_RXD_QW1_PTYPE_M) >> - ICE_RXD_QW1_PTYPE_S)]; + stat_err0 = rte_le_to_cpu_16(rxdp[j].wb.status_error0); + pkt_flags = ice_rxd_error_to_pkt_flags(stat_err0); + mb->packet_type = ptype_tbl[ICE_RX_FLEX_DESC_PTYPE_M & + rte_le_to_cpu_16(rxdp[j].wb.ptype_flex_flags0)]; ice_rxd_to_vlan_tci(mb, &rxdp[j]); + ice_rxd_to_pkt_fields(mb, &rxdp[j]); mb->ol_flags |= pkt_flags; } @@ -1312,8 +1316,8 @@ ice_recv_scattered_pkts(void *rx_queue, { struct ice_rx_queue *rxq = rx_queue; volatile union ice_rx_desc *rx_ring = rxq->rx_ring; - volatile union ice_rx_desc *rxdp; - union ice_rx_desc rxd; + volatile union ice_rx_flex_desc *rxdp; + union ice_rx_flex_desc rxd; struct ice_rx_entry *sw_ring = rxq->sw_ring; struct ice_rx_entry *rxe; struct rte_mbuf *first_seg = rxq->pkt_first_seg; @@ -1324,21 +1328,18 @@ ice_recv_scattered_pkts(void *rx_queue, uint16_t nb_rx = 0; uint16_t nb_hold = 0; uint16_t rx_packet_len; - uint32_t rx_status; - uint64_t qword1; + uint16_t rx_stat_err0; uint64_t dma_addr; - uint64_t pkt_flags = 0; + uint64_t pkt_flags; uint32_t 
*ptype_tbl = rxq->vsi->adapter->ptype_tbl; struct rte_eth_dev *dev; while (nb_rx < nb_pkts) { - rxdp = &rx_ring[rx_id]; - qword1 = rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len); - rx_status = (qword1 & ICE_RXD_QW1_STATUS_M) >> - ICE_RXD_QW1_STATUS_S; + rxdp = (volatile union ice_rx_flex_desc *)&rx_ring[rx_id]; + rx_stat_err0 = rte_le_to_cpu_16(rxdp->wb.status_error0); /* Check the DD bit first */ - if (!(rx_status & (1 << ICE_RX_DESC_STATUS_DD_S))) + if (!(rx_stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S))) break; /* allocate mbuf */ @@ -1377,14 +1378,10 @@ ice_recv_scattered_pkts(void *rx_queue, /* Set data buffer address and data length of the mbuf */ rxdp->read.hdr_addr = 0; rxdp->read.pkt_addr = dma_addr; - rx_packet_len = (qword1 & ICE_RXD_QW1_LEN_PBUF_M) >> - ICE_RXD_QW1_LEN_PBUF_S; + rx_packet_len = rte_le_to_cpu_16(rxd.wb.pkt_len) & + ICE_RX_FLX_DESC_PKT_LEN_M; rxm->data_len = rx_packet_len; rxm->data_off = RTE_PKTMBUF_HEADROOM; - ice_rxd_to_vlan_tci(rxm, rxdp); - rxm->packet_type = ptype_tbl[(uint8_t)((qword1 & - ICE_RXD_QW1_PTYPE_M) >> - ICE_RXD_QW1_PTYPE_S)]; /** * If this is the first buffer of the received packet, set the @@ -1410,7 +1407,7 @@ ice_recv_scattered_pkts(void *rx_queue, * update the pointer to the last mbuf of the current scattered * packet and continue to parse the RX ring. */ - if (!(rx_status & (1 << ICE_RX_DESC_STATUS_EOF_S))) { + if (!(rx_stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_EOF_S))) { last_seg = rxm; continue; } @@ -1442,13 +1439,11 @@ ice_recv_scattered_pkts(void *rx_queue, first_seg->port = rxq->port_id; first_seg->ol_flags = 0; - - pkt_flags = ice_rxd_status_to_pkt_flags(qword1); - pkt_flags |= ice_rxd_error_to_pkt_flags(qword1); - if (pkt_flags & PKT_RX_RSS_HASH) - first_seg->hash.rss = - rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); - + first_seg->packet_type = ptype_tbl[ICE_RX_FLEX_DESC_PTYPE_M & + rte_le_to_cpu_16(rxd.wb.ptype_flex_flags0)]; + ice_rxd_to_vlan_tci(first_seg, &rxd); + ice_rxd_to_pkt_fields(first_seg, &rxd); + pkt_flags = ice_rxd_error_to_pkt_flags(rx_stat_err0); first_seg->ol_flags |= pkt_flags; /* Prefetch data of first segment, if configured to do so. 
*/ rte_prefetch0(RTE_PTR_ADD(first_seg->buf_addr, @@ -1538,9 +1533,8 @@ ice_dev_supported_ptypes_get(struct rte_eth_dev *dev) int ice_rx_descriptor_status(void *rx_queue, uint16_t offset) { + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_queue *rxq = rx_queue; - volatile uint64_t *status; - uint64_t mask; uint32_t desc; if (unlikely(offset >= rxq->nb_rx_desc)) @@ -1553,10 +1547,9 @@ ice_rx_descriptor_status(void *rx_queue, uint16_t offset) if (desc >= rxq->nb_rx_desc) desc -= rxq->nb_rx_desc; - status = &rxq->rx_ring[desc].wb.qword1.status_error_len; - mask = rte_cpu_to_le_64((1ULL << ICE_RX_DESC_STATUS_DD_S) << - ICE_RXD_QW1_STATUS_S); - if (*status & mask) + rxdp = (volatile union ice_rx_flex_desc *)&rxq->rx_ring[desc]; + if (rte_le_to_cpu_16(rxdp->wb.status_error0) & + (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S)) return RTE_ETH_RX_DESC_DONE; return RTE_ETH_RX_DESC_AVAIL; @@ -1642,8 +1635,8 @@ ice_recv_pkts(void *rx_queue, { struct ice_rx_queue *rxq = rx_queue; volatile union ice_rx_desc *rx_ring = rxq->rx_ring; - volatile union ice_rx_desc *rxdp; - union ice_rx_desc rxd; + volatile union ice_rx_flex_desc *rxdp; + union ice_rx_flex_desc rxd; struct ice_rx_entry *sw_ring = rxq->sw_ring; struct ice_rx_entry *rxe; struct rte_mbuf *nmb; /* new allocated mbuf */ @@ -1652,21 +1645,18 @@ ice_recv_pkts(void *rx_queue, uint16_t nb_rx = 0; uint16_t nb_hold = 0; uint16_t rx_packet_len; - uint32_t rx_status; - uint64_t qword1; + uint16_t rx_stat_err0; uint64_t dma_addr; - uint64_t pkt_flags = 0; + uint64_t pkt_flags; uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl; struct rte_eth_dev *dev; while (nb_rx < nb_pkts) { - rxdp = &rx_ring[rx_id]; - qword1 = rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len); - rx_status = (qword1 & ICE_RXD_QW1_STATUS_M) >> - ICE_RXD_QW1_STATUS_S; + rxdp = (volatile union ice_rx_flex_desc *)&rx_ring[rx_id]; + rx_stat_err0 = rte_le_to_cpu_16(rxdp->wb.status_error0); /* Check the DD bit first */ - if (!(rx_status & (1 << ICE_RX_DESC_STATUS_DD_S))) + if (!(rx_stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S))) break; /* allocate mbuf */ @@ -1696,8 +1686,8 @@ ice_recv_pkts(void *rx_queue, rxdp->read.pkt_addr = dma_addr; /* calculate rx_packet_len of the received pkt */ - rx_packet_len = ((qword1 & ICE_RXD_QW1_LEN_PBUF_M) >> - ICE_RXD_QW1_LEN_PBUF_S) - rxq->crc_len; + rx_packet_len = (rte_le_to_cpu_16(rxd.wb.pkt_len) & + ICE_RX_FLX_DESC_PKT_LEN_M) - rxq->crc_len; /* fill old mbuf with received descriptor: rxd */ rxm->data_off = RTE_PKTMBUF_HEADROOM; @@ -1707,15 +1697,11 @@ ice_recv_pkts(void *rx_queue, rxm->pkt_len = rx_packet_len; rxm->data_len = rx_packet_len; rxm->port = rxq->port_id; - ice_rxd_to_vlan_tci(rxm, rxdp); - rxm->packet_type = ptype_tbl[(uint8_t)((qword1 & - ICE_RXD_QW1_PTYPE_M) >> - ICE_RXD_QW1_PTYPE_S)]; - pkt_flags = ice_rxd_status_to_pkt_flags(qword1); - pkt_flags |= ice_rxd_error_to_pkt_flags(qword1); - if (pkt_flags & PKT_RX_RSS_HASH) - rxm->hash.rss = - rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); + rxm->packet_type = ptype_tbl[ICE_RX_FLEX_DESC_PTYPE_M & + rte_le_to_cpu_16(rxd.wb.ptype_flex_flags0)]; + ice_rxd_to_vlan_tci(rxm, &rxd); + ice_rxd_to_pkt_fields(rxm, &rxd); + pkt_flags = ice_rxd_error_to_pkt_flags(rx_stat_err0); rxm->ol_flags |= pkt_flags; /* copy old mbuf to rx_pkts */ rx_pkts[nb_rx++] = rxm;

From patchwork Thu Sep 19 06:25:50 2019
X-Patchwork-Id: 59356
From: Leyi Rong
To: haiyue.wang@intel.com, wenzhuo.lu@intel.com, qi.z.zhang@intel.com, xiaolong.ye@intel.com
Cc: dev@dpdk.org
Date: Thu, 19 Sep 2019 14:25:50 +0800
Message-Id: <20190919062553.79257-4-leyi.rong@intel.com>
Subject: [dpdk-dev] [PATCH v4 3/6] net/ice: add protocol extraction support for per Rx queue

From: Haiyue Wang

The ice hardware can extract protocol fields into the flex descriptor, programmed per queue. The ice PMD puts the extracted protocol fields into rte_mbuf::udata64 with different type formats, so applications can access the protocol fields quickly.

Signed-off-by: Haiyue Wang
---
 doc/guides/nics/ice.rst | 101 +++++++++
 doc/guides/rel_notes/release_19_11.rst | 4 +
 drivers/net/ice/Makefile | 3 +
 drivers/net/ice/ice_ethdev.c | 301 +++++++++++++++++++++++++
 drivers/net/ice/ice_ethdev.h | 4 +
 drivers/net/ice/ice_rxtx.c | 61 +++++
 drivers/net/ice/ice_rxtx.h | 2 +
 drivers/net/ice/ice_rxtx_vec_common.h | 3 +
 drivers/net/ice/meson.build | 2 +
 drivers/net/ice/rte_pmd_ice.h | 152 +++++++++++++
 10 files changed, 633 insertions(+)
 create mode 100644 drivers/net/ice/rte_pmd_ice.h

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst index 03819d29f..8a6f60e71 100644 --- a/doc/guides/nics/ice.rst +++ b/doc/guides/nics/ice.rst @@ -61,6 +61,107 @@ Runtime Config Options NOTE: In Safe mode, only very limited features are available, features like RSS, checksum, fdir, tunneling ... are all disabled. +- ``Protocol extraction for per queue`` + + Configure the RX queues to do protocol extraction into ``rte_mbuf::udata64`` + for protocol handling acceleration, like checking the TCP SYN packets quickly. + + The argument format is:: + + -w 18:00.0,proto_xtr=[...] + -w 18:00.0,proto_xtr= + + Queues are grouped by ``(`` and ``)``; within a group, the ``-`` character + is used as a range separator and ``,`` is used as a single number separator. + The grouping ``()`` can be omitted for a single-element group. If no queues are + specified, the PMD will use this protocol extraction type for all queues. + + The protocol is one of: ``vlan, ipv4, ipv6, ipv6_flow, tcp``. + + .. code-block:: console
+ + testpmd -w 18:00.0,proto_xtr='[(1,2-3,8-9):tcp,10-13:vlan]' + + This setting means queues 1, 2-3 and 8-9 use TCP extraction, queues 10-13 use + VLAN extraction, and the other queues run with no protocol extraction. + + .. code-block:: console + + testpmd -w 18:00.0,proto_xtr=vlan,proto_xtr='[(1,2-3,8-9):tcp,10-23:ipv6]' + + This setting means queues 1, 2-3 and 8-9 use TCP extraction, queues 10-23 use + IPv6 extraction, and the other queues use the default VLAN extraction. + + The extraction will be copied into the lower 32 bits of ``rte_mbuf::udata64``. + + .. table:: Protocol extraction : ``vlan`` + + +----------------------------+----------------------------+ + | VLAN2 | VLAN1 | + +======+===+=================+======+===+=================+ + | PCP | D | VID | PCP | D | VID | + +------+---+-----------------+------+---+-----------------+ + + VLAN1 - single or EVLAN (first for QinQ). + + VLAN2 - C-VLAN (second for QinQ). + + .. table:: Protocol extraction : ``ipv4`` + + +----------------------------+----------------------------+ + | IPHDR2 | IPHDR1 | + +======+=======+=============+==============+=============+ + | Ver |Hdr Len| ToS | TTL | Protocol | + +------+-------+-------------+--------------+-------------+ + + IPHDR1 - IPv4 header word 4, "TTL" and "Protocol" fields. + + IPHDR2 - IPv4 header word 0, "Ver", "Hdr Len" and "Type of Service" fields. + + .. table:: Protocol extraction : ``ipv6`` + + +----------------------------+----------------------------+ + | IPHDR2 | IPHDR1 | + +=====+=============+========+=============+==============+ + | Ver |Traffic class| Flow | Next Header | Hop Limit | + +-----+-------------+--------+-------------+--------------+ + + IPHDR1 - IPv6 header word 3, "Next Header" and "Hop Limit" fields. + + IPHDR2 - IPv6 header word 0, "Ver", "Traffic class" and high 4 bits of + "Flow Label" fields. + + .. table:: Protocol extraction : ``ipv6_flow`` + + +----------------------------+----------------------------+ + | IPHDR2 | IPHDR1 | + +=====+=============+========+============================+ + | Ver |Traffic class| Flow Label | + +-----+-------------+-------------------------------------+ + + IPHDR1 - IPv6 header word 1, 16 low bits of the "Flow Label" field. + + IPHDR2 - IPv6 header word 0, "Ver", "Traffic class" and high 4 bits of + "Flow Label" fields. + + .. table:: Protocol extraction : ``tcp`` + + +----------------------------+----------------------------+ + | TCPHDR2 | TCPHDR1 | + +============================+======+======+==============+ + | Reserved |Offset| RSV | Flags | + +----------------------------+------+------+--------------+ + + TCPHDR1 - TCP header word 6, "Data Offset" and "Flags" fields. + + TCPHDR2 - Reserved + + Use ``get_proto_xtr_flds(struct rte_mbuf *mb)`` to access the protocol + extraction result; do not use ``rte_mbuf::udata64`` directly. + + The ``dump_proto_xtr_flds(struct rte_mbuf *mb)`` routine shows how to + access the protocol extraction result in ``struct rte_mbuf``. Driver compilation and testing ------------------------------ diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst index 8490d897c..382806229 100644 --- a/doc/guides/rel_notes/release_19_11.rst +++ b/doc/guides/rel_notes/release_19_11.rst @@ -21,6 +21,10 @@ DPDK Release 19.11 xdg-open build/doc/html/guides/rel_notes/release_19_11.html +* **Updated the ICE driver.** + + * Added support for handling Receive Flex Descriptor. + * Added support for protocol extraction on per Rx queue.
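To make the documented access pattern concrete, here is a hedged application-side sketch built on the helpers this patch adds in rte_pmd_ice.h (the function name is illustrative; the SYN check mirrors the TCP use case mentioned above):

#include <rte_pmd_ice.h>

/* Count TCP SYN packets from the extracted fields, without parsing headers.
 * The magic/type guard ensures we only trust fields the PMD actually filled.
 */
static uint32_t
count_tcp_syn(struct rte_mbuf **pkts, uint16_t nb_pkts)
{
	uint32_t syn = 0;
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		struct proto_xtr_flds *xtr = get_proto_xtr_flds(pkts[i]);

		if (xtr->magic == PROTO_XTR_MAGIC_ID &&
		    xtr->type == PROTO_XTR_TCP && xtr->u.tcp.syn)
			syn++;
	}

	return syn;
}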
New Features ------------ diff --git a/drivers/net/ice/Makefile b/drivers/net/ice/Makefile index ae53c2646..4a279f196 100644 --- a/drivers/net/ice/Makefile +++ b/drivers/net/ice/Makefile @@ -82,4 +82,7 @@ ifeq ($(CC_AVX2_SUPPORT), 1) endif SRCS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice_generic_flow.c +# install this header file +SYMLINK-$(CONFIG_RTE_LIBRTE_ICE_PMD)-include := rte_pmd_ice.h + include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index 44a14cb8a..7c74b6169 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -19,9 +19,11 @@ /* devargs */ #define ICE_SAFE_MODE_SUPPORT_ARG "safe-mode-support" +#define ICE_PROTO_XTR_ARG "proto_xtr" static const char * const ice_valid_args[] = { ICE_SAFE_MODE_SUPPORT_ARG, + ICE_PROTO_XTR_ARG, NULL }; @@ -257,6 +259,280 @@ ice_init_controlq_parameter(struct ice_hw *hw) hw->mailboxq.sq_buf_size = ICE_MAILBOXQ_BUF_SZ; } +static int +lookup_proto_xtr_type(const char *xtr_name) +{ + static struct { + const char *name; + enum proto_xtr_type type; + } xtr_type_map[] = { + { "vlan", PROTO_XTR_VLAN }, + { "ipv4", PROTO_XTR_IPV4 }, + { "ipv6", PROTO_XTR_IPV6 }, + { "ipv6_flow", PROTO_XTR_IPV6_FLOW }, + { "tcp", PROTO_XTR_TCP }, + }; + uint32_t i; + + for (i = 0; i < RTE_DIM(xtr_type_map); i++) { + if (strcmp(xtr_name, xtr_type_map[i].name) == 0) + return xtr_type_map[i].type; + } + + return -1; +} + +/* + * Parse an elem, which can be a single number/range or a '(' ')' group + * 1) A single number elem, e.g. 9 + * 2) A single range elem, two numbers with a '-' between, e.g. 2-6 + * 3) A group elem, which combines multiple 1) or 2) with '( )', e.g. (0,2-4,6) + * Within a group elem, '-' is used as a range separator; + * ',' is used as a single number separator.
+ */ +static int +parse_queue_set(const char *input, int xtr_type, struct ice_devargs *devargs) +{ + const char *str = input; + char *end = NULL; + uint32_t min, max; + uint32_t idx; + + while (isblank(*str)) + str++; + + if (!isdigit(*str) && *str != '(') + return -1; + + /* process single number or single range of number */ + if (*str != '(') { + errno = 0; + idx = strtoul(str, &end, 10); + if (errno || end == NULL || idx >= ICE_MAX_QUEUE_NUM) + return -1; + + while (isblank(*end)) + end++; + + min = idx; + max = idx; + + /* process single - */ + if (*end == '-') { + end++; + while (isblank(*end)) + end++; + if (!isdigit(*end)) + return -1; + + errno = 0; + idx = strtoul(end, &end, 10); + if (errno || end == NULL || idx >= ICE_MAX_QUEUE_NUM) + return -1; + + max = idx; + while (isblank(*end)) + end++; + } + + if (*end != ':') + return -1; + + for (idx = RTE_MIN(min, max); + idx <= RTE_MAX(min, max); idx++) + devargs->proto_xtr[idx] = xtr_type; + + return 0; + } + + /* process set within bracket */ + str++; + while (isblank(*str)) + str++; + if (*str == '\0') + return -1; + + min = ICE_MAX_QUEUE_NUM; + do { + /* go ahead to the first digit */ + while (isblank(*str)) + str++; + if (!isdigit(*str)) + return -1; + + /* get the digit value */ + errno = 0; + idx = strtoul(str, &end, 10); + if (errno || end == NULL || idx >= ICE_MAX_QUEUE_NUM) + return -1; + + /* go ahead to separator '-',',' and ')' */ + while (isblank(*end)) + end++; + if (*end == '-') { + if (min == ICE_MAX_QUEUE_NUM) + min = idx; + else /* avoid continuous '-' */ + return -1; + } else if (*end == ',' || *end == ')') { + max = idx; + if (min == ICE_MAX_QUEUE_NUM) + min = idx; + + for (idx = RTE_MIN(min, max); + idx <= RTE_MAX(min, max); idx++) + devargs->proto_xtr[idx] = xtr_type; + + min = ICE_MAX_QUEUE_NUM; + } else { + return -1; + } + + str = end + 1; + } while (*end != ')' && *end != '\0'); + + return 0; +} + +static int +parse_queue_proto_xtr(const char *queues, struct ice_devargs *devargs) +{ + const char *queue_start; + uint32_t idx; + int xtr_type; + char xtr_name[32]; + + while (isblank(*queues)) + queues++; + + if (*queues != '[') { + xtr_type = lookup_proto_xtr_type(queues); + if (xtr_type < 0) + return -1; + + memset(devargs->proto_xtr, xtr_type, + sizeof(devargs->proto_xtr)); + + return 0; + } + + queues++; + do { + while (isblank(*queues)) + queues++; + if (*queues == '\0') + return -1; + + queue_start = queues; + + /* go across a complete bracket */ + if (*queue_start == '(') { + queues += strcspn(queues, ")"); + if (*queues != ')') + return -1; + } + + /* scan the separator ':' */ + queues += strcspn(queues, ":"); + if (*queues++ != ':') + return -1; + while (isblank(*queues)) + queues++; + + for (idx = 0; ; idx++) { + if (isblank(queues[idx]) || + queues[idx] == ',' || + queues[idx] == ']' || + queues[idx] == '\0') + break; + + if (idx > sizeof(xtr_name) - 2) + return -1; + + xtr_name[idx] = queues[idx]; + } + xtr_name[idx] = '\0'; + xtr_type = lookup_proto_xtr_type(xtr_name); + if (xtr_type < 0) + return -1; + + queues += idx; + + while (isblank(*queues) || *queues == ',' || *queues == ']') + queues++; + + if (parse_queue_set(queue_start, xtr_type, devargs) < 0) + return -1; + } while (*queues != '\0'); + + return 0; +} + +static int +handle_proto_xtr_arg(__rte_unused const char *key, const char *value, + void *extra_args) +{ + struct ice_devargs *devargs = extra_args; + + if (value == NULL || extra_args == NULL) + return -EINVAL; + + if (parse_queue_proto_xtr(value, devargs) < 0) { + PMD_DRV_LOG(ERR, + 
"The protocol extraction parameter is wrong : '%s'", + value); + return -1; + } + + return 0; +} + +static bool +ice_proto_xtr_support(struct ice_hw *hw) +{ +#define FLX_REG(val, fld, idx) \ + (((val) & GLFLXP_RXDID_FLX_WRD_##idx##_##fld##_M) >> \ + GLFLXP_RXDID_FLX_WRD_##idx##_##fld##_S) + static struct { + uint32_t rxdid; + uint16_t protid_0; + uint16_t protid_1; + } xtr_sets[] = { + { ICE_RXDID_COMMS_AUX_VLAN, ICE_PROT_EVLAN_O, ICE_PROT_VLAN_O }, + { ICE_RXDID_COMMS_AUX_IPV4, ICE_PROT_IPV4_OF_OR_S, + ICE_PROT_IPV4_OF_OR_S }, + { ICE_RXDID_COMMS_AUX_IPV6, ICE_PROT_IPV6_OF_OR_S, + ICE_PROT_IPV6_OF_OR_S }, + { ICE_RXDID_COMMS_AUX_IPV6_FLOW, ICE_PROT_IPV6_OF_OR_S, + ICE_PROT_IPV6_OF_OR_S }, + { ICE_RXDID_COMMS_AUX_TCP, ICE_PROT_TCP_IL, ICE_PROT_ID_INVAL }, + }; + uint32_t i; + + for (i = 0; i < RTE_DIM(xtr_sets); i++) { + uint32_t rxdid = xtr_sets[i].rxdid; + uint32_t v; + + if (xtr_sets[i].protid_0 != ICE_PROT_ID_INVAL) { + v = ICE_READ_REG(hw, GLFLXP_RXDID_FLX_WRD_4(rxdid)); + + if (FLX_REG(v, PROT_MDID, 4) != xtr_sets[i].protid_0 || + FLX_REG(v, RXDID_OPCODE, 4) != ICE_RX_OPC_EXTRACT) + return false; + } + + if (xtr_sets[i].protid_1 != ICE_PROT_ID_INVAL) { + v = ICE_READ_REG(hw, GLFLXP_RXDID_FLX_WRD_5(rxdid)); + + if (FLX_REG(v, PROT_MDID, 5) != xtr_sets[i].protid_1 || + FLX_REG(v, RXDID_OPCODE, 5) != ICE_RX_OPC_EXTRACT) + return false; + } + } + + return true; +} + static int ice_res_pool_init(struct ice_res_pool_info *pool, uint32_t base, uint32_t num) @@ -1079,6 +1355,8 @@ ice_interrupt_handler(void *param) static int ice_pf_sw_init(struct rte_eth_dev *dev) { + struct ice_adapter *ad = + ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); struct ice_hw *hw = ICE_PF_TO_HW(pf); @@ -1088,6 +1366,16 @@ ice_pf_sw_init(struct rte_eth_dev *dev) pf->lan_nb_qps = pf->lan_nb_qp_max; + if (ice_proto_xtr_support(hw)) + pf->proto_xtr = rte_zmalloc(NULL, pf->lan_nb_qps, 0); + + if (pf->proto_xtr != NULL) + rte_memcpy(pf->proto_xtr, ad->devargs.proto_xtr, + RTE_MIN((size_t)pf->lan_nb_qps, + sizeof(ad->devargs.proto_xtr))); + else + PMD_DRV_LOG(NOTICE, "Protocol extraction is disabled"); + return 0; } @@ -1378,9 +1666,18 @@ static int ice_parse_devargs(struct rte_eth_dev *dev) return -EINVAL; } + memset(ad->devargs.proto_xtr, PROTO_XTR_NONE, + sizeof(ad->devargs.proto_xtr)); + + ret = rte_kvargs_process(kvlist, ICE_PROTO_XTR_ARG, + &handle_proto_xtr_arg, &ad->devargs); + if (ret) + goto bail; + ret = rte_kvargs_process(kvlist, ICE_SAFE_MODE_SUPPORT_ARG, &parse_bool, &ad->devargs.safe_mode_support); +bail: rte_kvargs_free(kvlist); return ret; } @@ -1547,6 +1844,7 @@ ice_dev_init(struct rte_eth_dev *dev) ice_sched_cleanup_all(hw); rte_free(hw->port_info); ice_shutdown_all_ctrlq(hw); + rte_free(pf->proto_xtr); return ret; } @@ -1672,6 +1970,8 @@ ice_dev_close(struct rte_eth_dev *dev) rte_free(hw->port_info); hw->port_info = NULL; ice_shutdown_all_ctrlq(hw); + rte_free(pf->proto_xtr); + pf->proto_xtr = NULL; } static int @@ -3795,6 +4095,7 @@ RTE_PMD_REGISTER_PCI(net_ice, rte_ice_pmd); RTE_PMD_REGISTER_PCI_TABLE(net_ice, pci_id_ice_map); RTE_PMD_REGISTER_KMOD_DEP(net_ice, "* igb_uio | uio_pci_generic | vfio-pci"); RTE_PMD_REGISTER_PARAM_STRING(net_ice, + ICE_PROTO_XTR_ARG "=[queue:]" ICE_SAFE_MODE_SUPPORT_ARG "=<0|1>"); RTE_INIT(ice_init_log) diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index f569da833..adbb66322 100644 --- a/drivers/net/ice/ice_ethdev.h +++ b/drivers/net/ice/ice_ethdev.h @@ -263,6 +263,7 @@ 
struct ice_pf { uint16_t lan_nb_qp_max; uint16_t lan_nb_qps; /* The number of queue pairs of LAN */ uint16_t base_queue; /* The base queue pairs index in the device */ + uint8_t *proto_xtr; /* Protocol extraction type for all queues */ struct ice_hw_port_stats stats_offset; struct ice_hw_port_stats stats; /* internal packet statistics, it should be excluded from the total */ @@ -273,11 +274,14 @@ struct ice_pf { struct ice_flow_list flow_list; }; +#define ICE_MAX_QUEUE_NUM 2048 + /** * Cache devargs parse result. */ struct ice_devargs { int safe_mode_support; + uint8_t proto_xtr[ICE_MAX_QUEUE_NUM]; }; /** diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index d2e36853f..e28310b96 100644 --- a/drivers/net/ice/ice_rxtx.c +++ b/drivers/net/ice/ice_rxtx.c @@ -13,6 +13,36 @@ PKT_TX_TCP_SEG | \ PKT_TX_OUTER_IP_CKSUM) +static inline uint8_t +ice_rxdid_to_proto_xtr_type(uint8_t rxdid) +{ + static uint8_t xtr_map[] = { + [ICE_RXDID_COMMS_AUX_VLAN] = PROTO_XTR_VLAN, + [ICE_RXDID_COMMS_AUX_IPV4] = PROTO_XTR_IPV4, + [ICE_RXDID_COMMS_AUX_IPV6] = PROTO_XTR_IPV6, + [ICE_RXDID_COMMS_AUX_IPV6_FLOW] = PROTO_XTR_IPV6_FLOW, + [ICE_RXDID_COMMS_AUX_TCP] = PROTO_XTR_TCP, + }; + + return rxdid < RTE_DIM(xtr_map) ? xtr_map[rxdid] : PROTO_XTR_NONE; +} + +static inline uint8_t +ice_proto_xtr_type_to_rxdid(uint8_t xtr_type) +{ + static uint8_t rxdid_map[] = { + [PROTO_XTR_VLAN] = ICE_RXDID_COMMS_AUX_VLAN, + [PROTO_XTR_IPV4] = ICE_RXDID_COMMS_AUX_IPV4, + [PROTO_XTR_IPV6] = ICE_RXDID_COMMS_AUX_IPV6, + [PROTO_XTR_IPV6_FLOW] = ICE_RXDID_COMMS_AUX_IPV6_FLOW, + [PROTO_XTR_TCP] = ICE_RXDID_COMMS_AUX_TCP, + }; + uint8_t rxdid; + + rxdid = xtr_type < RTE_DIM(rxdid_map) ? rxdid_map[xtr_type] : 0; + + return rxdid != 0 ? rxdid : ICE_RXDID_COMMS_GENERIC; +} static enum ice_status ice_program_hw_rx_queue(struct ice_rx_queue *rxq) @@ -84,6 +114,11 @@ ice_program_hw_rx_queue(struct ice_rx_queue *rxq) rx_ctx.showiv = 0; rx_ctx.crcstrip = (rxq->crc_len == 0) ? 1 : 0; + rxdid = ice_proto_xtr_type_to_rxdid(rxq->proto_xtr); + + PMD_DRV_LOG(DEBUG, "Port (%u) - Rx queue (%u) is set with RXDID : %u", + rxq->port_id, rxq->queue_id, rxdid); + /* Enable Flexible Descriptors in the queue context which * allows this driver to select a specific receive descriptor format */ @@ -641,6 +676,8 @@ ice_rx_queue_setup(struct rte_eth_dev *dev, rxq->drop_en = rx_conf->rx_drop_en; rxq->vsi = vsi; rxq->rx_deferred_start = rx_conf->rx_deferred_start; + rxq->proto_xtr = pf->proto_xtr != NULL ? + pf->proto_xtr[queue_idx] : PROTO_XTR_NONE; /* Allocate the maximum number of RX ring hardware descriptors.
*/ len = ICE_MAX_RING_DESC; @@ -1062,6 +1099,10 @@ ice_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union ice_rx_flex_desc *rxdp) mb->vlan_tci, mb->vlan_tci_outer); } +#define ICE_RX_PROTO_XTR_VALID \ + ((1 << ICE_RX_FLEX_DESC_STATUS1_XTRMD4_VALID_S) | \ + (1 << ICE_RX_FLEX_DESC_STATUS1_XTRMD5_VALID_S)) + static inline void ice_rxd_to_pkt_fields(struct rte_mbuf *mb, volatile union ice_rx_flex_desc *rxdp) @@ -1075,6 +1116,26 @@ ice_rxd_to_pkt_fields(struct rte_mbuf *mb, mb->ol_flags |= PKT_RX_RSS_HASH; mb->hash.rss = rte_le_to_cpu_32(desc->rss_hash); } + +#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC + init_proto_xtr_flds(mb); + + stat_err = rte_le_to_cpu_16(desc->status_error1); + if (stat_err & ICE_RX_PROTO_XTR_VALID) { + struct proto_xtr_flds *xtr = get_proto_xtr_flds(mb); + + if (stat_err & (1 << ICE_RX_FLEX_DESC_STATUS1_XTRMD4_VALID_S)) + xtr->u.raw.data0 = + rte_le_to_cpu_16(desc->flex_ts.flex.aux0); + + if (stat_err & (1 << ICE_RX_FLEX_DESC_STATUS1_XTRMD5_VALID_S)) + xtr->u.raw.data1 = + rte_le_to_cpu_16(desc->flex_ts.flex.aux1); + + xtr->type = ice_rxdid_to_proto_xtr_type(desc->rxdid); + xtr->magic = PROTO_XTR_MAGIC_ID; + } +#endif } #ifdef RTE_LIBRTE_ICE_RX_ALLOW_BULK_ALLOC diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index 64e891875..de16637f3 100644 --- a/drivers/net/ice/ice_rxtx.h +++ b/drivers/net/ice/ice_rxtx.h @@ -5,6 +5,7 @@ #ifndef _ICE_RXTX_H_ #define _ICE_RXTX_H_ +#include "rte_pmd_ice.h" #include "ice_ethdev.h" #define ICE_ALIGN_RING_DESC 32 @@ -78,6 +79,7 @@ struct ice_rx_queue { uint16_t max_pkt_len; /* Maximum packet length */ bool q_set; /* indicate if rx queue has been configured */ bool rx_deferred_start; /* don't start this queue in dev start */ + uint8_t proto_xtr; /* Protocol extraction from flexible descriptor */ ice_rx_release_mbufs_t rx_rel_mbufs; }; diff --git a/drivers/net/ice/ice_rxtx_vec_common.h b/drivers/net/ice/ice_rxtx_vec_common.h index c5f0d564f..080ca4175 100644 --- a/drivers/net/ice/ice_rxtx_vec_common.h +++ b/drivers/net/ice/ice_rxtx_vec_common.h @@ -234,6 +234,9 @@ ice_rx_vec_queue_default(struct ice_rx_queue *rxq) if (rxq->nb_rx_desc % rxq->rx_free_thresh) return -1; + if (rxq->proto_xtr != PROTO_XTR_NONE) + return -1; + return 0; } diff --git a/drivers/net/ice/meson.build b/drivers/net/ice/meson.build index 36b4b3c85..6828170a9 100644 --- a/drivers/net/ice/meson.build +++ b/drivers/net/ice/meson.build @@ -34,3 +34,5 @@ if arch_subdir == 'x86' objs += ice_avx2_lib.extract_objects('ice_rxtx_vec_avx2.c') endif endif + +install_headers('rte_pmd_ice.h') diff --git a/drivers/net/ice/rte_pmd_ice.h b/drivers/net/ice/rte_pmd_ice.h new file mode 100644 index 000000000..719487e1e --- /dev/null +++ b/drivers/net/ice/rte_pmd_ice.h @@ -0,0 +1,152 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _RTE_PMD_ICE_H_ +#define _RTE_PMD_ICE_H_ + +#include <stdio.h> +#include <rte_mbuf.h> +#include <rte_ethdev.h> + +#ifdef __cplusplus extern "C" { #endif enum proto_xtr_type { + PROTO_XTR_NONE, + PROTO_XTR_VLAN, + PROTO_XTR_IPV4, + PROTO_XTR_IPV6, + PROTO_XTR_IPV6_FLOW, + PROTO_XTR_TCP, +}; + +struct proto_xtr_flds { + union { + struct { + uint16_t data0; + uint16_t data1; + } raw; + struct { + uint16_t stag_vid:12, + stag_dei:1, + stag_pcp:3; + uint16_t ctag_vid:12, + ctag_dei:1, + ctag_pcp:3; + } vlan; + struct { + uint16_t protocol:8, + ttl:8; + uint16_t tos:8, + ihl:4, + version:4; + } ipv4; + struct { + uint16_t hoplimit:8, + nexthdr:8; + uint16_t flowhi4:4, + tc:8, + version:4; + } ipv6; + struct { + uint16_t flowlo16; +
uint16_t flowhi4:4, + tc:8, + version:4; + } ipv6_flow; + struct { + uint16_t fin:1, + syn:1, + rst:1, + psh:1, + ack:1, + urg:1, + ece:1, + cwr:1, + res1:4, + doff:4; + uint16_t rsvd; + } tcp; + } u; + + uint16_t rsvd; + + uint8_t type; + +#define PROTO_XTR_MAGIC_ID 0xCE + uint8_t magic; +}; + +static inline void +init_proto_xtr_flds(struct rte_mbuf *mb) +{ + mb->udata64 = 0; +} + +static inline struct proto_xtr_flds * +get_proto_xtr_flds(struct rte_mbuf *mb) +{ + RTE_BUILD_BUG_ON(sizeof(struct proto_xtr_flds) > sizeof(mb->udata64)); + + return (struct proto_xtr_flds *)&mb->udata64; +} + +static inline void +dump_proto_xtr_flds(struct rte_mbuf *mb) +{ + struct proto_xtr_flds *xtr = get_proto_xtr_flds(mb); + + if (xtr->magic != PROTO_XTR_MAGIC_ID || xtr->type == PROTO_XTR_NONE) + return; + + printf(" - Protocol Extraction:[0x%04x:0x%04x],", + xtr->u.raw.data0, xtr->u.raw.data1); + + if (xtr->type == PROTO_XTR_VLAN) + printf("vlan,stag=%u:%u:%u,ctag=%u:%u:%u ", + xtr->u.vlan.stag_pcp, + xtr->u.vlan.stag_dei, + xtr->u.vlan.stag_vid, + xtr->u.vlan.ctag_pcp, + xtr->u.vlan.ctag_dei, + xtr->u.vlan.ctag_vid); + else if (xtr->type == PROTO_XTR_IPV4) + printf("ipv4,ver=%u,hdrlen=%u,tos=%u,ttl=%u,proto=%u ", + xtr->u.ipv4.version, + xtr->u.ipv4.ihl, + xtr->u.ipv4.tos, + xtr->u.ipv4.ttl, + xtr->u.ipv4.protocol); + else if (xtr->type == PROTO_XTR_IPV6) + printf("ipv6,ver=%u,tc=%u,flow_hi4=0x%x,nexthdr=%u,hoplimit=%u ", + xtr->u.ipv6.version, + xtr->u.ipv6.tc, + xtr->u.ipv6.flowhi4, + xtr->u.ipv6.nexthdr, + xtr->u.ipv6.hoplimit); + else if (xtr->type == PROTO_XTR_IPV6_FLOW) + printf("ipv6_flow,ver=%u,tc=%u,flow=0x%x%04x ", + xtr->u.ipv6_flow.version, + xtr->u.ipv6_flow.tc, + xtr->u.ipv6_flow.flowhi4, + xtr->u.ipv6_flow.flowlo16); + else if (xtr->type == PROTO_XTR_TCP) + printf("tcp,doff=%u,flags=%s%s%s%s%s%s%s%s ", + xtr->u.tcp.doff, + xtr->u.tcp.cwr ? "C" : "", + xtr->u.tcp.ece ? "E" : "", + xtr->u.tcp.urg ? "U" : "", + xtr->u.tcp.ack ? "A" : "", + xtr->u.tcp.psh ? "P" : "", + xtr->u.tcp.rst ? "R" : "", + xtr->u.tcp.syn ? "S" : "", + xtr->u.tcp.fin ? 
"F" : ""); +} + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PMD_ICE_H_ */ From patchwork Thu Sep 19 06:25:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leyi Rong X-Patchwork-Id: 59357 X-Patchwork-Delegate: xiaolong.ye@intel.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9E13D1E920; Thu, 19 Sep 2019 08:29:02 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id C65361E90A for ; Thu, 19 Sep 2019 08:28:57 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Sep 2019 23:28:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,522,1559545200"; d="scan'208";a="362421969" Received: from dpdk-lrong-srv-04.sh.intel.com ([10.67.119.187]) by orsmga005.jf.intel.com with ESMTP; 18 Sep 2019 23:28:55 -0700 From: Leyi Rong To: haiyue.wang@intel.com, wenzhuo.lu@intel.com, qi.z.zhang@intel.com, xiaolong.ye@intel.com Cc: dev@dpdk.org Date: Thu, 19 Sep 2019 14:25:51 +0800 Message-Id: <20190919062553.79257-5-leyi.rong@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190919062553.79257-1-leyi.rong@intel.com> References: <20190829023421.112551-1-leyi.rong@intel.com> <20190919062553.79257-1-leyi.rong@intel.com> Subject: [dpdk-dev] [PATCH v4 4/6] net/ice: switch to flexible descriptor in SSE path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Wenzhuo Lu With this path, the flexible descriptor is supported in SSE path. And the legacy descriptor is not supported. Signed-off-by: Wenzhuo Lu --- drivers/net/ice/ice_rxtx_vec_sse.c | 239 +++++++++++++---------------- 1 file changed, 110 insertions(+), 129 deletions(-) diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c b/drivers/net/ice/ice_rxtx_vec_sse.c index 967a7b16b..dafcb081a 100644 --- a/drivers/net/ice/ice_rxtx_vec_sse.c +++ b/drivers/net/ice/ice_rxtx_vec_sse.c @@ -15,14 +15,14 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) { int i; uint16_t rx_id; - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start]; struct rte_mbuf *mb0, *mb1; __m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM, RTE_PKTMBUF_HEADROOM); __m128i dma_addr0, dma_addr1; - rxdp = rxq->rx_ring + rxq->rxrearm_start; + rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start; /* Pull 'n' more MBUFs into the software ring */ if (rte_mempool_get_bulk(rxq->mp, @@ -88,93 +88,90 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq, __m128i descs[4], const __m128i mbuf_init = _mm_set_epi64x(0, rxq->mbuf_initializer); __m128i rearm0, rearm1, rearm2, rearm3; - __m128i vlan0, vlan1, rss, l3_l4e; + __m128i tmp_desc, flags, rss_vlan; - /* mask everything except RSS, flow director and VLAN flags - * bit2 is for VLAN tag, bit11 for flow director indication - * bit13:12 for RSS indication. + /* mask everything except checksum, RSS and VLAN flags. + * bit6:4 for checksum. + * bit12 for RSS indication. + * bit13 for VLAN indication. 
diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c b/drivers/net/ice/ice_rxtx_vec_sse.c index 967a7b16b..dafcb081a 100644 --- a/drivers/net/ice/ice_rxtx_vec_sse.c +++ b/drivers/net/ice/ice_rxtx_vec_sse.c @@ -15,14 +15,14 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) { int i; uint16_t rx_id; - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start]; struct rte_mbuf *mb0, *mb1; __m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM, RTE_PKTMBUF_HEADROOM); __m128i dma_addr0, dma_addr1; - rxdp = rxq->rx_ring + rxq->rxrearm_start; + rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start; /* Pull 'n' more MBUFs into the software ring */ if (rte_mempool_get_bulk(rxq->mp, @@ -88,93 +88,90 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq, __m128i descs[4], const __m128i mbuf_init = _mm_set_epi64x(0, rxq->mbuf_initializer); __m128i rearm0, rearm1, rearm2, rearm3; - __m128i vlan0, vlan1, rss, l3_l4e; + __m128i tmp_desc, flags, rss_vlan; - /* mask everything except RSS, flow director and VLAN flags - * bit2 is for VLAN tag, bit11 for flow director indication - * bit13:12 for RSS indication. + /* mask everything except checksum, RSS and VLAN flags. + * bit6:4 for checksum. + * bit12 for RSS indication. + * bit13 for VLAN indication. */ - const __m128i rss_vlan_msk = _mm_set_epi32(0x1c03804, 0x1c03804, - 0x1c03804, 0x1c03804); + const __m128i desc_mask = _mm_set_epi32(0x3070, 0x3070, + 0x3070, 0x3070); - const __m128i cksum_mask = _mm_set_epi32(PKT_RX_IP_CKSUM_GOOD | - PKT_RX_IP_CKSUM_BAD | - PKT_RX_L4_CKSUM_GOOD | - PKT_RX_L4_CKSUM_BAD | + const __m128i cksum_mask = _mm_set_epi32(PKT_RX_IP_CKSUM_MASK | + PKT_RX_L4_CKSUM_MASK | PKT_RX_EIP_CKSUM_BAD, - PKT_RX_IP_CKSUM_GOOD | - PKT_RX_IP_CKSUM_BAD | - PKT_RX_L4_CKSUM_GOOD | - PKT_RX_L4_CKSUM_BAD | + PKT_RX_IP_CKSUM_MASK | + PKT_RX_L4_CKSUM_MASK | PKT_RX_EIP_CKSUM_BAD, - PKT_RX_IP_CKSUM_GOOD | - PKT_RX_IP_CKSUM_BAD | - PKT_RX_L4_CKSUM_GOOD | - PKT_RX_L4_CKSUM_BAD | + PKT_RX_IP_CKSUM_MASK | + PKT_RX_L4_CKSUM_MASK | PKT_RX_EIP_CKSUM_BAD, - PKT_RX_IP_CKSUM_GOOD | - PKT_RX_IP_CKSUM_BAD | - PKT_RX_L4_CKSUM_GOOD | - PKT_RX_L4_CKSUM_BAD | + PKT_RX_IP_CKSUM_MASK | + PKT_RX_L4_CKSUM_MASK | PKT_RX_EIP_CKSUM_BAD); - /* map rss and vlan type to rss hash and vlan flag */ - const __m128i vlan_flags = _mm_set_epi8(0, 0, 0, 0, - 0, 0, 0, 0, - 0, 0, 0, PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, - 0, 0, 0, 0); - - const __m128i rss_flags = _mm_set_epi8(0, 0, 0, 0, - 0, 0, 0, 0, - PKT_RX_RSS_HASH | PKT_RX_FDIR, PKT_RX_RSS_HASH, 0, 0, - 0, 0, PKT_RX_FDIR, 0); - - const __m128i l3_l4e_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, + /* map the checksum, rss and vlan fields to the checksum, rss + * and vlan flags + */ + const __m128i cksum_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, /* shift right 1 bit to make sure it not exceed 255 */ (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_EIP_CKSUM_BAD | - PKT_RX_L4_CKSUM_BAD) >> 1, - (PKT_RX_EIP_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_EIP_CKSUM_BAD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD | + PKT_RX_IP_CKSUM_GOOD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | + PKT_RX_IP_CKSUM_BAD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | + PKT_RX_IP_CKSUM_GOOD) >> 1, (PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_BAD) >> 1, - PKT_RX_IP_CKSUM_BAD >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD) >> 1); - - vlan0 = _mm_unpackhi_epi32(descs[0], descs[1]); - vlan1 = _mm_unpackhi_epi32(descs[2], descs[3]); - vlan0 = _mm_unpacklo_epi64(vlan0, vlan1); - - vlan1 = _mm_and_si128(vlan0, rss_vlan_msk); - vlan0 = _mm_shuffle_epi8(vlan_flags, vlan1); - - rss = _mm_srli_epi32(vlan1, 11); - rss = _mm_shuffle_epi8(rss_flags, rss); + (PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1, + (PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1, + (PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >> 1); - l3_l4e = _mm_srli_epi32(vlan1, 22); - l3_l4e = _mm_shuffle_epi8(l3_l4e_flags, l3_l4e); + const __m128i rss_vlan_flags = _mm_set_epi8(0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + PKT_RX_RSS_HASH | PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, + PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, + PKT_RX_RSS_HASH, 0); + + /* merge 4 descriptors */ + flags = _mm_unpackhi_epi32(descs[0], descs[1]); + tmp_desc = _mm_unpackhi_epi32(descs[2], descs[3]); + tmp_desc = _mm_unpacklo_epi64(flags, tmp_desc); + tmp_desc = _mm_and_si128(flags, desc_mask); + + /* checksum flags */ + tmp_desc = _mm_srli_epi32(tmp_desc, 4); + flags = _mm_shuffle_epi8(cksum_flags, tmp_desc); /* then we shift left 1 bit */ - l3_l4e = _mm_slli_epi32(l3_l4e, 1); - /* we need to mask out the reduntant bits */ - l3_l4e = _mm_and_si128(l3_l4e, cksum_mask); + flags = _mm_slli_epi32(flags, 1); + /* we
need to mask out the redundant bits introduced by RSS or + * VLAN fields. + */ + flags = _mm_and_si128(flags, cksum_mask); - vlan0 = _mm_or_si128(vlan0, rss); - vlan0 = _mm_or_si128(vlan0, l3_l4e); + /* RSS, VLAN flag */ + tmp_desc = _mm_srli_epi32(tmp_desc, 8); + rss_vlan = _mm_shuffle_epi8(rss_vlan_flags, tmp_desc); + + /* merge the flags */ + flags = _mm_or_si128(flags, rss_vlan); /** * At this point, we have the 4 sets of flags in the low 16-bits - * of each 32-bit value in vlan0. + * of each 32-bit value in flags. * We want to extract these, and merge them with the mbuf init data * so we can do a single 16-byte write to the mbuf to set the flags * and all the other initialization fields. Extracting the * appropriate flags means that we have to do a shift and blend for * each mbuf before we do the write. */ - rearm0 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(vlan0, 8), 0x10); - rearm1 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(vlan0, 4), 0x10); - rearm2 = _mm_blend_epi16(mbuf_init, vlan0, 0x10); - rearm3 = _mm_blend_epi16(mbuf_init, _mm_srli_si128(vlan0, 4), 0x10); + rearm0 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(flags, 8), 0x10); + rearm1 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(flags, 4), 0x10); + rearm2 = _mm_blend_epi16(mbuf_init, flags, 0x10); + rearm3 = _mm_blend_epi16(mbuf_init, _mm_srli_si128(flags, 4), 0x10); /* write the rearm data and the olflags in one write */ RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, ol_flags) != @@ -187,22 +184,24 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq, __m128i descs[4], _mm_store_si128((__m128i *)&rx_pkts[3]->rearm_data, rearm3); } -#define PKTLEN_SHIFT 10 - static inline void ice_rx_desc_to_ptype_v(__m128i descs[4], struct rte_mbuf **rx_pkts, uint32_t *ptype_tbl) { - __m128i ptype0 = _mm_unpackhi_epi64(descs[0], descs[1]); - __m128i ptype1 = _mm_unpackhi_epi64(descs[2], descs[3]); - - ptype0 = _mm_srli_epi64(ptype0, 30); - ptype1 = _mm_srli_epi64(ptype1, 30); - - rx_pkts[0]->packet_type = ptype_tbl[_mm_extract_epi8(ptype0, 0)]; - rx_pkts[1]->packet_type = ptype_tbl[_mm_extract_epi8(ptype0, 8)]; - rx_pkts[2]->packet_type = ptype_tbl[_mm_extract_epi8(ptype1, 0)]; - rx_pkts[3]->packet_type = ptype_tbl[_mm_extract_epi8(ptype1, 8)]; + const __m128i ptype_mask = _mm_set_epi16(0, ICE_RX_FLEX_DESC_PTYPE_M, + 0, ICE_RX_FLEX_DESC_PTYPE_M, + 0, ICE_RX_FLEX_DESC_PTYPE_M, + 0, ICE_RX_FLEX_DESC_PTYPE_M); + __m128i ptype_01 = _mm_unpacklo_epi32(descs[0], descs[1]); + __m128i ptype_23 = _mm_unpacklo_epi32(descs[2], descs[3]); + __m128i ptype_all = _mm_unpacklo_epi64(ptype_01, ptype_23); + + ptype_all = _mm_and_si128(ptype_all, ptype_mask); + + rx_pkts[0]->packet_type = ptype_tbl[_mm_extract_epi16(ptype_all, 1)]; + rx_pkts[1]->packet_type = ptype_tbl[_mm_extract_epi16(ptype_all, 3)]; + rx_pkts[2]->packet_type = ptype_tbl[_mm_extract_epi16(ptype_all, 5)]; + rx_pkts[3]->packet_type = ptype_tbl[_mm_extract_epi16(ptype_all, 7)]; } /** @@ -215,21 +214,39 @@ static inline uint16_t _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, uint8_t *split_packet) { - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *sw_ring; uint16_t nb_pkts_recd; int pos; uint64_t var; - __m128i shuf_msk; uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl; - __m128i crc_adjust = _mm_set_epi16 - (0, 0, 0, /* ignore non-length fields */ + (0, 0, 0, /* ignore non-length fields */ -rxq->crc_len, /* sub crc on data_len */ 0, /* ignore high-16bits of pkt_len */ -rxq->crc_len, /* sub crc on
pkt_len */ - 0, 0 /* ignore pkt_type field */ + 0, 0 /* ignore pkt_type field */ ); + const __m128i zero = _mm_setzero_si128(); + /* mask to shuffle from desc. to mbuf */ + const __m128i shuf_msk = _mm_set_epi8 + (0xFF, 0xFF, 0xFF, 0xFF, /* rss not supported */ + 11, 10, /* octet 10~11, 16 bits vlan_macip */ + 5, 4, /* octet 4~5, 16 bits data_len */ + 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero out */ + 5, 4, /* octet 4~5, low 16 bits pkt_len */ + 0xFF, 0xFF, /* pkt_type set as unknown */ + 0xFF, 0xFF /* pkt_type set as unknown */ + ); + const __m128i eop_shuf_mask = _mm_set_epi8(0xFF, 0xFF, + 0xFF, 0xFF, + 0xFF, 0xFF, + 0xFF, 0xFF, + 0xFF, 0xFF, + 0xFF, 0xFF, + 0x04, 0x0C, + 0x00, 0x08); + /** * compile-time check the above crc_adjust layout is correct. * NOTE: the first field (lowest address) is given last in set_epi16 @@ -239,7 +256,13 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, offsetof(struct rte_mbuf, rx_descriptor_fields1) + 4); RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) != offsetof(struct rte_mbuf, rx_descriptor_fields1) + 8); - __m128i dd_check, eop_check; + + /* 4 packets DD mask */ + const __m128i dd_check = _mm_set_epi64x(0x0000000100000001LL, + 0x0000000100000001LL); + /* 4 packets EOP mask */ + const __m128i eop_check = _mm_set_epi64x(0x0000000200000002LL, + 0x0000000200000002LL); /* nb_pkts shall be less equal than ICE_MAX_RX_BURST */ nb_pkts = RTE_MIN(nb_pkts, ICE_MAX_RX_BURST); @@ -250,7 +273,7 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* Just the act of getting into the function from the application is * going to cost about 7 cycles */ - rxdp = rxq->rx_ring + rxq->rx_tail; + rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rx_tail; rte_prefetch0(rxdp); @@ -263,26 +286,10 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* Before we start moving massive data around, check to see if * there is actually a packet available */ - if (!(rxdp->wb.qword1.status_error_len & - rte_cpu_to_le_32(1 << ICE_RX_DESC_STATUS_DD_S))) + if (!(rxdp->wb.status_error0 & + rte_cpu_to_le_32(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S))) return 0; - /* 4 packets DD mask */ - dd_check = _mm_set_epi64x(0x0000000100000001LL, 0x0000000100000001LL); - - /* 4 packets EOP mask */ - eop_check = _mm_set_epi64x(0x0000000200000002LL, 0x0000000200000002LL); - - /* mask to shuffle from desc. to mbuf */ - shuf_msk = _mm_set_epi8 - (7, 6, 5, 4, /* octet 4~7, 32bits rss */ - 3, 2, /* octet 2~3, low 16 bits vlan_macip */ - 15, 14, /* octet 15~14, 16 bits data_len */ - 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero out */ - 15, 14, /* octet 15~14, low 16 bits pkt_len */ - 0xFF, 0xFF, /* pkt_type set as unknown */ - 0xFF, 0xFF /*pkt_type set as unknown */ - ); /** * Compile-time verify the shuffle mask * NOTE: some field positions already verified above, but duplicated @@ -315,7 +322,7 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, rxdp += ICE_DESCS_PER_LOOP) { __m128i descs[ICE_DESCS_PER_LOOP]; __m128i pkt_mb1, pkt_mb2, pkt_mb3, pkt_mb4; - __m128i zero, staterr, sterr_tmp1, sterr_tmp2; + __m128i staterr, sterr_tmp1, sterr_tmp2; /* 2 64 bit or 4 32 bit mbuf pointers in one XMM reg. 
*/ __m128i mbp1; #if defined(RTE_ARCH_X86_64) @@ -359,14 +366,6 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* avoid compiler reorder optimization */ rte_compiler_barrier(); - /* pkt 3,4 shift the pktlen field to be 16-bit aligned*/ - const __m128i len3 = _mm_slli_epi32(descs[3], PKTLEN_SHIFT); - const __m128i len2 = _mm_slli_epi32(descs[2], PKTLEN_SHIFT); - - /* merge the now-aligned packet length fields back in */ - descs[3] = _mm_blend_epi16(descs[3], len3, 0x80); - descs[2] = _mm_blend_epi16(descs[2], len2, 0x80); - /* D.1 pkt 3,4 convert format from desc to pktmbuf */ pkt_mb4 = _mm_shuffle_epi8(descs[3], shuf_msk); pkt_mb3 = _mm_shuffle_epi8(descs[2], shuf_msk); @@ -382,20 +381,11 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, pkt_mb4 = _mm_add_epi16(pkt_mb4, crc_adjust); pkt_mb3 = _mm_add_epi16(pkt_mb3, crc_adjust); - /* pkt 1,2 shift the pktlen field to be 16-bit aligned*/ - const __m128i len1 = _mm_slli_epi32(descs[1], PKTLEN_SHIFT); - const __m128i len0 = _mm_slli_epi32(descs[0], PKTLEN_SHIFT); - - /* merge the now-aligned packet length fields back in */ - descs[1] = _mm_blend_epi16(descs[1], len1, 0x80); - descs[0] = _mm_blend_epi16(descs[0], len0, 0x80); - /* D.1 pkt 1,2 convert format from desc to pktmbuf */ pkt_mb2 = _mm_shuffle_epi8(descs[1], shuf_msk); pkt_mb1 = _mm_shuffle_epi8(descs[0], shuf_msk); /* C.2 get 4 pkts staterr value */ - zero = _mm_xor_si128(dd_check, dd_check); staterr = _mm_unpacklo_epi32(sterr_tmp1, sterr_tmp2); /* D.3 copy final 3,4 data to rx_pkts */ @@ -412,15 +402,6 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* C* extract and record EOP bit */ if (split_packet) { - __m128i eop_shuf_mask = _mm_set_epi8(0xFF, 0xFF, - 0xFF, 0xFF, - 0xFF, 0xFF, - 0xFF, 0xFF, - 0xFF, 0xFF, - 0xFF, 0xFF, - 0x04, 0x0C, - 0x00, 0x08); - /* and with mask to extract bits, flipping 1-0 */ __m128i eop_bits = _mm_andnot_si128(staterr, eop_check); /* the staterr values are not in order, as the count From patchwork Thu Sep 19 06:25:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leyi Rong X-Patchwork-Id: 59358 X-Patchwork-Delegate: xiaolong.ye@intel.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 570B81E931; Thu, 19 Sep 2019 08:29:05 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id 9C8CF1E91B for ; Thu, 19 Sep 2019 08:29:00 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Sep 2019 23:28:59 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,522,1559545200"; d="scan'208";a="362421990" Received: from dpdk-lrong-srv-04.sh.intel.com ([10.67.119.187]) by orsmga005.jf.intel.com with ESMTP; 18 Sep 2019 23:28:57 -0700 From: Leyi Rong To: haiyue.wang@intel.com, wenzhuo.lu@intel.com, qi.z.zhang@intel.com, xiaolong.ye@intel.com Cc: dev@dpdk.org, Leyi Rong Date: Thu, 19 Sep 2019 14:25:52 +0800 Message-Id: <20190919062553.79257-6-leyi.rong@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190919062553.79257-1-leyi.rong@intel.com> References: <20190829023421.112551-1-leyi.rong@intel.com> <20190919062553.79257-1-leyi.rong@intel.com> Subject: [dpdk-dev] [PATCH v4 5/6] 
net/ice: switch to Rx flexible descriptor in AVX path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Switch to Rx flexible descriptor format instead of legacy descriptor format. Signed-off-by: Leyi Rong --- drivers/net/ice/ice_rxtx_vec_avx2.c | 224 ++++++++++++++-------------- 1 file changed, 109 insertions(+), 115 deletions(-) diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c b/drivers/net/ice/ice_rxtx_vec_avx2.c index 5ce29c2a2..46776fa12 100644 --- a/drivers/net/ice/ice_rxtx_vec_avx2.c +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c @@ -15,10 +15,10 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) { int i; uint16_t rx_id; - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start]; - rxdp = rxq->rx_ring + rxq->rxrearm_start; + rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start; /* Pull 'n' more MBUFs into the software ring */ if (rte_mempool_get_bulk(rxq->mp, @@ -132,8 +132,6 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) ICE_PCI_REG_WRITE(rxq->qrx_tail, rx_id); } -#define PKTLEN_SHIFT 10 - static inline uint16_t _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, uint8_t *split_packet) @@ -144,7 +142,8 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, const __m256i mbuf_init = _mm256_set_epi64x(0, 0, 0, rxq->mbuf_initializer); struct ice_rx_entry *sw_ring = &rxq->sw_ring[rxq->rx_tail]; - volatile union ice_rx_desc *rxdp = rxq->rx_ring + rxq->rx_tail; + volatile union ice_rx_flex_desc *rxdp = + (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rx_tail; const int avx_aligned = ((rxq->rx_tail & 1) == 0); rte_prefetch0(rxdp); @@ -161,8 +160,8 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* Before we start moving massive data around, check to see if * there is actually a packet available */ - if (!(rxdp->wb.qword1.status_error_len & - rte_cpu_to_le_32(1 << ICE_RX_DESC_STATUS_DD_S))) + if (!(rxdp->wb.status_error0 & + rte_cpu_to_le_32(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S))) return 0; /* constants used in processing loop */ @@ -193,21 +192,23 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, const __m256i shuf_msk = _mm256_set_epi8 (/* first descriptor */ - 7, 6, 5, 4, /* octet 4~7, 32bits rss */ - 3, 2, /* octet 2~3, low 16 bits vlan_macip */ - 15, 14, /* octet 15~14, 16 bits data_len */ - 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero out */ - 15, 14, /* octet 15~14, low 16 bits pkt_len */ - 0xFF, 0xFF, /* pkt_type set as unknown */ - 0xFF, 0xFF, /*pkt_type set as unknown */ + 0xFF, 0xFF, + 0xFF, 0xFF, /* rss not supported */ + 11, 10, /* octet 10~11, 16 bits vlan_macip */ + 5, 4, /* octet 4~5, 16 bits data_len */ + 0xFF, 0xFF, /* skip hi 16 bits pkt_len, zero out */ + 5, 4, /* octet 4~5, 16 bits pkt_len */ + 0xFF, 0xFF, /* pkt_type set as unknown */ + 0xFF, 0xFF, /*pkt_type set as unknown */ /* second descriptor */ - 7, 6, 5, 4, /* octet 4~7, 32bits rss */ - 3, 2, /* octet 2~3, low 16 bits vlan_macip */ - 15, 14, /* octet 15~14, 16 bits data_len */ - 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero out */ - 15, 14, /* octet 15~14, low 16 bits pkt_len */ - 0xFF, 0xFF, /* pkt_type set as unknown */ - 0xFF, 0xFF /*pkt_type set as unknown */ + 0xFF, 0xFF, + 0xFF, 0xFF, /* rss not supported */ + 
11, 10, /* octet 10~11, 16 bits vlan_macip */ + 5, 4, /* octet 4~5, 16 bits data_len */ + 0xFF, 0xFF, /* skip hi 16 bits pkt_len, zero out */ + 5, 4, /* octet 4~5, 16 bits pkt_len */ + 0xFF, 0xFF, /* pkt_type set as unknown */ + 0xFF, 0xFF /* pkt_type set as unknown */ ); /** * compile-time check the above crc and shuffle layout is correct. @@ -225,68 +226,68 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* Status/Error flag masks */ /** - * mask everything except RSS, flow director and VLAN flags - * bit2 is for VLAN tag, bit11 for flow director indication - * bit13:12 for RSS indication. Bits 3-5 of error - * field (bits 22-24) are for IP/L4 checksum errors + * mask everything except Checksum Reports, RSS indication + * and VLAN indication. + * bit6:4 for IP/L4 checksum errors. + * bit12 is for RSS indication. + * bit13 is for VLAN indication. */ const __m256i flags_mask = - _mm256_set1_epi32((1 << 2) | (1 << 11) | - (3 << 12) | (7 << 22)); - /** - * data to be shuffled by result of flag mask. If VLAN bit is set, - * (bit 2), then position 4 in this array will be used in the - * destination - */ - const __m256i vlan_flags_shuf = - _mm256_set_epi32(0, 0, PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, 0, - 0, 0, PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, 0); - /** - * data to be shuffled by result of flag mask, shifted down 11. - * If RSS/FDIR bits are set, shuffle moves appropriate flags in - * place. - */ - const __m256i rss_flags_shuf = - _mm256_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, - PKT_RX_RSS_HASH | PKT_RX_FDIR, PKT_RX_RSS_HASH, - 0, 0, 0, 0, PKT_RX_FDIR, 0,/* end up 128-bits */ - 0, 0, 0, 0, 0, 0, 0, 0, - PKT_RX_RSS_HASH | PKT_RX_FDIR, PKT_RX_RSS_HASH, - 0, 0, 0, 0, PKT_RX_FDIR, 0); - + _mm256_set1_epi32((7 << 4) | (1 << 12) | (1 << 13)); /** - * data to be shuffled by the result of the flags mask shifted by 22 + * data to be shuffled by the result of the flags mask shifted by 4 * bits. This gives us the l3_l4 flags.
*/ const __m256i l3_l4_flags_shuf = _mm256_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, /* shift right 1 bit to make sure it not exceed 255 */ (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_EIP_CKSUM_BAD | - PKT_RX_L4_CKSUM_BAD) >> 1, - (PKT_RX_EIP_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_EIP_CKSUM_BAD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD | + PKT_RX_IP_CKSUM_GOOD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | + PKT_RX_IP_CKSUM_BAD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | + PKT_RX_IP_CKSUM_GOOD) >> 1, (PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_BAD) >> 1, - PKT_RX_IP_CKSUM_BAD >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD) >> 1, + (PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1, + (PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1, + (PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >> 1, /* second 128-bits */ 0, 0, 0, 0, 0, 0, 0, 0, (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_EIP_CKSUM_BAD | - PKT_RX_L4_CKSUM_BAD) >> 1, - (PKT_RX_EIP_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_EIP_CKSUM_BAD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD | + PKT_RX_IP_CKSUM_GOOD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | + PKT_RX_IP_CKSUM_BAD) >> 1, + (PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | + PKT_RX_IP_CKSUM_GOOD) >> 1, (PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_BAD) >> 1, - PKT_RX_IP_CKSUM_BAD >> 1, - (PKT_RX_IP_CKSUM_GOOD | PKT_RX_L4_CKSUM_GOOD) >> 1); - + (PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1, + (PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1, + (PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >> 1); const __m256i cksum_mask = _mm256_set1_epi32(PKT_RX_IP_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD | PKT_RX_L4_CKSUM_GOOD | PKT_RX_L4_CKSUM_BAD | PKT_RX_EIP_CKSUM_BAD); + /** + * data to be shuffled by result of flag mask, shifted down 12. + * If RSS(bit12)/VLAN(bit13) are set, + * shuffle moves appropriate flags in place. + */ + const __m256i rss_vlan_flags_shuf = _mm256_set_epi8(0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + PKT_RX_RSS_HASH | PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, + PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, + PKT_RX_RSS_HASH, 0, + /* end up 128-bits */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + PKT_RX_RSS_HASH | PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, + PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED, + PKT_RX_RSS_HASH, 0); RTE_SET_USED(avx_aligned); /* for 32B descriptors we don't use this */ @@ -369,73 +370,66 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, } /** - * convert descriptors 4-7 into mbufs, adjusting length and - * re-arranging fields. Then write into the mbuf + * convert descriptors 4-7 into mbufs, re-arrange fields. + * Then write into the mbuf. 
*/ - const __m256i len6_7 = _mm256_slli_epi32(raw_desc6_7, - PKTLEN_SHIFT); - const __m256i len4_5 = _mm256_slli_epi32(raw_desc4_5, - PKTLEN_SHIFT); - const __m256i desc6_7 = _mm256_blend_epi16(raw_desc6_7, - len6_7, 0x80); - const __m256i desc4_5 = _mm256_blend_epi16(raw_desc4_5, - len4_5, 0x80); - __m256i mb6_7 = _mm256_shuffle_epi8(desc6_7, shuf_msk); - __m256i mb4_5 = _mm256_shuffle_epi8(desc4_5, shuf_msk); + __m256i mb6_7 = _mm256_shuffle_epi8(raw_desc6_7, shuf_msk); + __m256i mb4_5 = _mm256_shuffle_epi8(raw_desc4_5, shuf_msk); mb6_7 = _mm256_add_epi16(mb6_7, crc_adjust); mb4_5 = _mm256_add_epi16(mb4_5, crc_adjust); /** - * to get packet types, shift 64-bit values down 30 bits - * and so ptype is in lower 8-bits in each + * to get packet types, the ptype is located in bits 16-25 + * of each 128-bit half */ - const __m256i ptypes6_7 = _mm256_srli_epi64(desc6_7, 30); - const __m256i ptypes4_5 = _mm256_srli_epi64(desc4_5, 30); - const uint8_t ptype7 = _mm256_extract_epi8(ptypes6_7, 24); - const uint8_t ptype6 = _mm256_extract_epi8(ptypes6_7, 8); - const uint8_t ptype5 = _mm256_extract_epi8(ptypes4_5, 24); - const uint8_t ptype4 = _mm256_extract_epi8(ptypes4_5, 8); + const __m256i ptype_mask = + _mm256_set1_epi16(ICE_RX_FLEX_DESC_PTYPE_M); + const __m256i ptypes6_7 = + _mm256_and_si256(raw_desc6_7, ptype_mask); + const __m256i ptypes4_5 = + _mm256_and_si256(raw_desc4_5, ptype_mask); + const uint16_t ptype7 = _mm256_extract_epi16(ptypes6_7, 9); + const uint16_t ptype6 = _mm256_extract_epi16(ptypes6_7, 1); + const uint16_t ptype5 = _mm256_extract_epi16(ptypes4_5, 9); + const uint16_t ptype4 = _mm256_extract_epi16(ptypes4_5, 1); mb6_7 = _mm256_insert_epi32(mb6_7, ptype_tbl[ptype7], 4); mb6_7 = _mm256_insert_epi32(mb6_7, ptype_tbl[ptype6], 0); mb4_5 = _mm256_insert_epi32(mb4_5, ptype_tbl[ptype5], 4); mb4_5 = _mm256_insert_epi32(mb4_5, ptype_tbl[ptype4], 0); /* merge the status bits into one register */ - const __m256i status4_7 = _mm256_unpackhi_epi32(desc6_7, - desc4_5); + const __m256i status4_7 = _mm256_unpackhi_epi32(raw_desc6_7, + raw_desc4_5); /** - * convert descriptors 0-3 into mbufs, adjusting length and - * re-arranging fields. Then write into the mbuf + * convert descriptors 0-3 into mbufs, re-arrange fields. + * Then write into the mbuf.
*/ - const __m256i len2_3 = _mm256_slli_epi32(raw_desc2_3, - PKTLEN_SHIFT); - const __m256i len0_1 = _mm256_slli_epi32(raw_desc0_1, - PKTLEN_SHIFT); - const __m256i desc2_3 = _mm256_blend_epi16(raw_desc2_3, - len2_3, 0x80); - const __m256i desc0_1 = _mm256_blend_epi16(raw_desc0_1, - len0_1, 0x80); - __m256i mb2_3 = _mm256_shuffle_epi8(desc2_3, shuf_msk); - __m256i mb0_1 = _mm256_shuffle_epi8(desc0_1, shuf_msk); + __m256i mb2_3 = _mm256_shuffle_epi8(raw_desc2_3, shuf_msk); + __m256i mb0_1 = _mm256_shuffle_epi8(raw_desc0_1, shuf_msk); mb2_3 = _mm256_add_epi16(mb2_3, crc_adjust); mb0_1 = _mm256_add_epi16(mb0_1, crc_adjust); - /* get the packet types */ - const __m256i ptypes2_3 = _mm256_srli_epi64(desc2_3, 30); - const __m256i ptypes0_1 = _mm256_srli_epi64(desc0_1, 30); - const uint8_t ptype3 = _mm256_extract_epi8(ptypes2_3, 24); - const uint8_t ptype2 = _mm256_extract_epi8(ptypes2_3, 8); - const uint8_t ptype1 = _mm256_extract_epi8(ptypes0_1, 24); - const uint8_t ptype0 = _mm256_extract_epi8(ptypes0_1, 8); + /** + * to get packet types, the ptype is located in bits 16-25 + * of each 128-bit half + */ + const __m256i ptypes2_3 = + _mm256_and_si256(raw_desc2_3, ptype_mask); + const __m256i ptypes0_1 = + _mm256_and_si256(raw_desc0_1, ptype_mask); + const uint16_t ptype3 = _mm256_extract_epi16(ptypes2_3, 9); + const uint16_t ptype2 = _mm256_extract_epi16(ptypes2_3, 1); + const uint16_t ptype1 = _mm256_extract_epi16(ptypes0_1, 9); + const uint16_t ptype0 = _mm256_extract_epi16(ptypes0_1, 1); mb2_3 = _mm256_insert_epi32(mb2_3, ptype_tbl[ptype3], 4); mb2_3 = _mm256_insert_epi32(mb2_3, ptype_tbl[ptype2], 0); mb0_1 = _mm256_insert_epi32(mb0_1, ptype_tbl[ptype1], 4); mb0_1 = _mm256_insert_epi32(mb0_1, ptype_tbl[ptype0], 0); /* merge the status bits into one register */ - const __m256i status0_3 = _mm256_unpackhi_epi32(desc2_3, - desc0_1); + const __m256i status0_3 = _mm256_unpackhi_epi32(raw_desc2_3, + raw_desc0_1); /** * take the two sets of status bits and merge to one @@ -450,24 +444,24 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* get only flag/error bits we want */ const __m256i flag_bits = _mm256_and_si256(status0_7, flags_mask); - /* set vlan and rss flags */ - const __m256i vlan_flags = - _mm256_shuffle_epi8(vlan_flags_shuf, flag_bits); - const __m256i rss_flags = - _mm256_shuffle_epi8(rss_flags_shuf, - _mm256_srli_epi32(flag_bits, 11)); /** * l3_l4_error flags, shuffle, then shift to correct adjustment * of flags in flags_shuf, and finally mask out extra bits */ __m256i l3_l4_flags = _mm256_shuffle_epi8(l3_l4_flags_shuf, - _mm256_srli_epi32(flag_bits, 22)); + _mm256_srli_epi32(flag_bits, 4)); l3_l4_flags = _mm256_slli_epi32(l3_l4_flags, 1); l3_l4_flags = _mm256_and_si256(l3_l4_flags, cksum_mask); + /* set rss and vlan flags */ + const __m256i rss_vlan_flag_bits = + _mm256_srli_epi32(flag_bits, 12); + const __m256i rss_vlan_flags = + _mm256_shuffle_epi8(rss_vlan_flags_shuf, + rss_vlan_flag_bits); /* merge flags */ const __m256i mbuf_flags = _mm256_or_si256(l3_l4_flags, - _mm256_or_si256(rss_flags, vlan_flags)); + rss_vlan_flags); /** * At this point, we have the 8 sets of flags in the low 16-bits * of each 32-bit value in vlan0.
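The flag handling in this patch leans on a single idiom: a vector byte shuffle used as a 16-entry table lookup, with a few descriptor status bits shifted down to form the table index. The sketch below shows that idiom in miniature with SSE intrinsics; it is a standalone illustration with made-up flag values, not the driver's real PKT_RX_* bits or descriptor layout.

#include <stdio.h>
#include <stdint.h>
#include <tmmintrin.h>	/* SSSE3: _mm_shuffle_epi8 (PSHUFB) */

/* Placeholder flag bits -- illustrative only, not real PKT_RX_* values. */
#define F_IP_OK		0x01
#define F_IP_BAD	0x02
#define F_L4_OK		0x04
#define F_L4_BAD	0x08

int main(void)
{
	/* Fake status words for 4 packets; bit0 = IP checksum error,
	 * bit1 = L4 checksum error, mimicking how the driver shifts the
	 * checksum report down into a small table index.
	 */
	__m128i status = _mm_set_epi32(3, 2, 1, 0);

	/* 16-entry byte table indexed by that 2-bit code; unused
	 * entries stay zero.
	 */
	const __m128i lut = _mm_setr_epi8(
		F_IP_OK | F_L4_OK,	/* 0: both checksums good */
		F_IP_BAD | F_L4_OK,	/* 1: IP checksum error */
		F_IP_OK | F_L4_BAD,	/* 2: L4 checksum error */
		F_IP_BAD | F_L4_BAD,	/* 3: both bad */
		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);

	/* Mask so every byte is a valid index, then perform all four
	 * lookups with a single PSHUFB.
	 */
	status = _mm_and_si128(status, _mm_set1_epi32(0x0F));
	__m128i flags = _mm_shuffle_epi8(lut, status);

	uint32_t out[4];
	_mm_storeu_si128((__m128i *)out, flags);
	for (int i = 0; i < 4; i++)
		printf("pkt %d -> flags 0x%02x\n", i, (unsigned)(out[i] & 0xFF));
	return 0;
}

Because PSHUFB/VPSHUFB operate independently within each 128-bit lane, the AVX2 tables above have to repeat their contents in both halves of the 256-bit constant; that is why every *_shuf table in the patch lists its byte values twice.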
From patchwork Thu Sep 19 06:25:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leyi Rong X-Patchwork-Id: 59359 X-Patchwork-Delegate: xiaolong.ye@intel.com Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AADD41E93B; Thu, 19 Sep 2019 08:29:07 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id E6EF61E91E for ; Thu, 19 Sep 2019 08:29:01 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Sep 2019 23:29:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,522,1559545200"; d="scan'208";a="362422013" Received: from dpdk-lrong-srv-04.sh.intel.com ([10.67.119.187]) by orsmga005.jf.intel.com with ESMTP; 18 Sep 2019 23:28:59 -0700 From: Leyi Rong To: haiyue.wang@intel.com, wenzhuo.lu@intel.com, qi.z.zhang@intel.com, xiaolong.ye@intel.com Cc: dev@dpdk.org Date: Thu, 19 Sep 2019 14:25:53 +0800 Message-Id: <20190919062553.79257-7-leyi.rong@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190919062553.79257-1-leyi.rong@intel.com> References: <20190829023421.112551-1-leyi.rong@intel.com> <20190919062553.79257-1-leyi.rong@intel.com> Subject: [dpdk-dev] [PATCH v4 6/6] net/ice: remove Rx legacy descriptor definition X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Haiyue Wang Now that the ice PMD only handles the Rx flex descriptor, remove the legacy descriptor definition. Signed-off-by: Haiyue Wang Reviewed-by: Xiaolong Ye --- drivers/net/ice/ice_rxtx.c | 25 ++++++++++++------------- drivers/net/ice/ice_rxtx.h | 4 +--- drivers/net/ice/ice_rxtx_vec_avx2.c | 5 ++--- drivers/net/ice/ice_rxtx_vec_sse.c | 4 ++-- 4 files changed, 17 insertions(+), 21 deletions(-) diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index e28310b96..40186131f 100644 --- a/drivers/net/ice/ice_rxtx.c +++ b/drivers/net/ice/ice_rxtx.c @@ -171,7 +171,7 @@ ice_alloc_rx_queue_mbufs(struct ice_rx_queue *rxq) uint16_t i; for (i = 0; i < rxq->nb_rx_desc; i++) { - volatile union ice_rx_desc *rxd; + volatile union ice_rx_flex_desc *rxd; struct rte_mbuf *mbuf = rte_mbuf_raw_alloc(rxq->mp); if (unlikely(!mbuf)) { @@ -346,7 +346,7 @@ ice_reset_rx_queue(struct ice_rx_queue *rxq) #endif /* RTE_LIBRTE_ICE_RX_ALLOW_BULK_ALLOC */ len = rxq->nb_rx_desc; - for (i = 0; i < len * sizeof(union ice_rx_desc); i++) + for (i = 0; i < len * sizeof(union ice_rx_flex_desc); i++) ((volatile char *)rxq->rx_ring)[i] = 0; #ifdef RTE_LIBRTE_ICE_RX_ALLOW_BULK_ALLOC @@ -691,7 +691,7 @@ ice_rx_queue_setup(struct rte_eth_dev *dev, #endif /* Allocate the maximum number of RX ring hardware descriptor.
*/ - ring_size = sizeof(union ice_rx_desc) * len; + ring_size = sizeof(union ice_rx_flex_desc) * len; ring_size = RTE_ALIGN(ring_size, ICE_DMA_MEM_ALIGN); rz = rte_eth_dma_zone_reserve(dev, "rx_ring", queue_idx, ring_size, ICE_RING_BASE_ALIGN, @@ -1008,7 +1008,7 @@ ice_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) uint16_t desc = 0; rxq = dev->data->rx_queues[rx_queue_id]; - rxdp = (volatile union ice_rx_flex_desc *)&rxq->rx_ring[rxq->rx_tail]; + rxdp = &rxq->rx_ring[rxq->rx_tail]; while ((desc < rxq->nb_rx_desc) && rte_le_to_cpu_16(rxdp->wb.status_error0) & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S)) { @@ -1020,8 +1020,7 @@ ice_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) desc += ICE_RXQ_SCAN_INTERVAL; rxdp += ICE_RXQ_SCAN_INTERVAL; if (rxq->rx_tail + desc >= rxq->nb_rx_desc) - rxdp = (volatile union ice_rx_flex_desc *) - &(rxq->rx_ring[rxq->rx_tail + + rxdp = &(rxq->rx_ring[rxq->rx_tail + desc - rxq->nb_rx_desc]); } @@ -1156,7 +1155,7 @@ ice_rx_scan_hw_ring(struct ice_rx_queue *rxq) uint64_t pkt_flags = 0; uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl; - rxdp = (volatile union ice_rx_flex_desc *)&rxq->rx_ring[rxq->rx_tail]; + rxdp = &rxq->rx_ring[rxq->rx_tail]; rxep = &rxq->sw_ring[rxq->rx_tail]; stat_err0 = rte_le_to_cpu_16(rxdp->wb.status_error0); @@ -1241,7 +1240,7 @@ ice_rx_fill_from_stage(struct ice_rx_queue *rxq, static inline int ice_rx_alloc_bufs(struct ice_rx_queue *rxq) { - volatile union ice_rx_desc *rxdp; + volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *rxep; struct rte_mbuf *mb; uint16_t alloc_idx, i; @@ -1376,7 +1375,7 @@ ice_recv_scattered_pkts(void *rx_queue, uint16_t nb_pkts) { struct ice_rx_queue *rxq = rx_queue; - volatile union ice_rx_desc *rx_ring = rxq->rx_ring; + volatile union ice_rx_flex_desc *rx_ring = rxq->rx_ring; volatile union ice_rx_flex_desc *rxdp; union ice_rx_flex_desc rxd; struct ice_rx_entry *sw_ring = rxq->sw_ring; @@ -1396,7 +1395,7 @@ ice_recv_scattered_pkts(void *rx_queue, struct rte_eth_dev *dev; while (nb_rx < nb_pkts) { - rxdp = (volatile union ice_rx_flex_desc *)&rx_ring[rx_id]; + rxdp = &rx_ring[rx_id]; rx_stat_err0 = rte_le_to_cpu_16(rxdp->wb.status_error0); /* Check the DD bit first */ @@ -1608,7 +1607,7 @@ ice_rx_descriptor_status(void *rx_queue, uint16_t offset) if (desc >= rxq->nb_rx_desc) desc -= rxq->nb_rx_desc; - rxdp = (volatile union ice_rx_flex_desc *)&rxq->rx_ring[desc]; + rxdp = &rxq->rx_ring[desc]; if (rte_le_to_cpu_16(rxdp->wb.status_error0) & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S)) return RTE_ETH_RX_DESC_DONE; @@ -1695,7 +1694,7 @@ ice_recv_pkts(void *rx_queue, uint16_t nb_pkts) { struct ice_rx_queue *rxq = rx_queue; - volatile union ice_rx_desc *rx_ring = rxq->rx_ring; + volatile union ice_rx_flex_desc *rx_ring = rxq->rx_ring; volatile union ice_rx_flex_desc *rxdp; union ice_rx_flex_desc rxd; struct ice_rx_entry *sw_ring = rxq->sw_ring; @@ -1713,7 +1712,7 @@ ice_recv_pkts(void *rx_queue, struct rte_eth_dev *dev; while (nb_rx < nb_pkts) { - rxdp = (volatile union ice_rx_flex_desc *)&rx_ring[rx_id]; + rxdp = &rx_ring[rx_id]; rx_stat_err0 = rte_le_to_cpu_16(rxdp->wb.status_error0); /* Check the DD bit first */ diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index de16637f3..25b3822df 100644 --- a/drivers/net/ice/ice_rxtx.h +++ b/drivers/net/ice/ice_rxtx.h @@ -21,10 +21,8 @@ #define ICE_CHK_Q_ENA_INTERVAL_US 100 #ifdef RTE_LIBRTE_ICE_16BYTE_RX_DESC -#define ice_rx_desc ice_16byte_rx_desc #define ice_rx_flex_desc ice_16b_rx_flex_desc #else -#define ice_rx_desc 
ice_32byte_rx_desc #define ice_rx_flex_desc ice_32b_rx_flex_desc #endif @@ -48,7 +46,7 @@ struct ice_rx_entry { struct ice_rx_queue { struct rte_mempool *mp; /* mbuf pool to populate RX ring */ - volatile union ice_rx_desc *rx_ring;/* RX ring virtual address */ + volatile union ice_rx_flex_desc *rx_ring;/* RX ring virtual address */ rte_iova_t rx_ring_dma; /* RX ring DMA address */ struct ice_rx_entry *sw_ring; /* address of RX soft ring */ uint16_t nb_rx_desc; /* number of RX descriptors */ diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c b/drivers/net/ice/ice_rxtx_vec_avx2.c index 46776fa12..f32222bb4 100644 --- a/drivers/net/ice/ice_rxtx_vec_avx2.c +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c @@ -18,7 +18,7 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) volatile union ice_rx_flex_desc *rxdp; struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start]; - rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start; + rxdp = rxq->rx_ring + rxq->rxrearm_start; /* Pull 'n' more MBUFs into the software ring */ if (rte_mempool_get_bulk(rxq->mp, @@ -142,8 +142,7 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, const __m256i mbuf_init = _mm256_set_epi64x(0, 0, 0, rxq->mbuf_initializer); struct ice_rx_entry *sw_ring = &rxq->sw_ring[rxq->rx_tail]; - volatile union ice_rx_flex_desc *rxdp = - (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rx_tail; + volatile union ice_rx_flex_desc *rxdp = rxq->rx_ring + rxq->rx_tail; const int avx_aligned = ((rxq->rx_tail & 1) == 0); rte_prefetch0(rxdp); diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c b/drivers/net/ice/ice_rxtx_vec_sse.c index dafcb081a..2ae9370f4 100644 --- a/drivers/net/ice/ice_rxtx_vec_sse.c +++ b/drivers/net/ice/ice_rxtx_vec_sse.c @@ -22,7 +22,7 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) RTE_PKTMBUF_HEADROOM); __m128i dma_addr0, dma_addr1; - rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start; + rxdp = rxq->rx_ring + rxq->rxrearm_start; /* Pull 'n' more MBUFs into the software ring */ if (rte_mempool_get_bulk(rxq->mp, @@ -273,7 +273,7 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts, /* Just the act of getting into the function from the application is * going to cost about 7 cycles */ - rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rx_tail; + rxdp = rxq->rx_ring + rxq->rx_tail; rte_prefetch0(rxdp);
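One pattern touched by this last patch is worth spelling out: ice_rx_queue_count() walks the ring in fixed strides, testing only the DD (descriptor done) bit, and wraps its pointer at the ring end. The following standalone sketch reproduces that scan; the struct and constants are mock stand-ins for the real ice_rx_flex_desc, ICE_RXQ_SCAN_INTERVAL and ICE_RX_FLEX_DESC_STATUS0_DD_S, so treat it as illustrative rather than driver code.

#include <stdio.h>
#include <stdint.h>

#define SCAN_INTERVAL	4		/* stand-in for ICE_RXQ_SCAN_INTERVAL */
#define DD_BIT		(1u << 0)	/* stand-in for the STATUS0 DD bit */

struct mock_flex_desc {
	uint16_t status_error0;	/* hardware sets DD here on write-back */
};

/* Count completed descriptors from rx_tail, probing in strides and
 * wrapping the pointer when the index runs past the ring end.
 */
static uint16_t
rx_queue_count(const struct mock_flex_desc *ring, uint16_t nb_desc,
	       uint16_t rx_tail)
{
	const struct mock_flex_desc *rxdp = &ring[rx_tail];
	uint16_t desc = 0;

	while (desc < nb_desc && (rxdp->status_error0 & DD_BIT)) {
		desc += SCAN_INTERVAL;
		rxdp += SCAN_INTERVAL;
		if (rx_tail + desc >= nb_desc)
			rxdp = &ring[rx_tail + desc - nb_desc];
	}
	return desc;
}

int main(void)
{
	struct mock_flex_desc ring[16] = { { 0 } };
	int i;

	for (i = 0; i < 10; i++)
		ring[i].status_error0 = DD_BIT;	/* 10 completed descriptors */
	printf("~%d descriptors done\n", (int)rx_queue_count(ring, 16, 0));
	return 0;
}

Probing every SCAN_INTERVAL-th descriptor trades precision for speed: the count is only accurate to the stride, which is fine for a heuristic queue-occupancy query, and it is also why the receive loops themselves still check DD per descriptor.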