[dpdk-dev,v2,08/16] fm10k: add Vector RX scatter function
Commit Message
From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
Add func fm10k_recv_scattered_pkts_vec to receive chained packets
with SSE instructions.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
---
drivers/net/fm10k/fm10k.h | 2 +
drivers/net/fm10k/fm10k_rxtx_vec.c | 88 ++++++++++++++++++++++++++++++++++++
2 files changed, 90 insertions(+), 0 deletions(-)
Comments
Hi,
On 10/22/2015 5:44 PM, Chen Jing D(Mark) wrote:
> From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
>
> Add func fm10k_recv_scattered_pkts_vec to receive chained packets
> with SSE instructions.
>
> Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> ---
> drivers/net/fm10k/fm10k.h | 2 +
> drivers/net/fm10k/fm10k_rxtx_vec.c | 88 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 90 insertions(+), 0 deletions(-)
>
[...]
> +
> +/*
> + * vPMD receive routine that reassembles scattered packets
> + *
> + * Notice:
> + * - don't support ol_flags for rss and csum err
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + * numbers of DD bit
In order to make sure nb_pkts > RTE_IXGBE_MAX_RX_BURST, it's necessary
to do RTE_MIN().
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +fm10k_recv_scattered_pkts_vec(void *rx_queue,
> + struct rte_mbuf **rx_pkts,
> + uint16_t nb_pkts)
> +{
> + struct fm10k_rx_queue *rxq = rx_queue;
> + uint8_t split_flags[RTE_FM10K_MAX_RX_BURST] = {0};
> + unsigned i = 0;
> +
> + /* get some new buffers */
> + uint16_t nb_bufs = fm10k_recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
> + split_flags);
> + if (nb_bufs == 0)
> + return 0;
> +
> + /* happy day case, full burst + no packets to be joined */
> + const uint64_t *split_fl64 = (uint64_t *)split_flags;
> + if (rxq->pkt_first_seg == NULL &&
> + split_fl64[0] == 0 && split_fl64[1] == 0 &&
> + split_fl64[2] == 0 && split_fl64[3] == 0)
> + return nb_bufs;
> +
> + /* reassemble any packets that need reassembly*/
> + if (rxq->pkt_first_seg == NULL) {
> + /* find the first split flag, and only reassemble then*/
> + while (i < nb_bufs && !split_flags[i])
> + i++;
> + if (i == nb_bufs)
> + return nb_bufs;
> + }
> + return i + fm10k_reassemble_packets(rxq, &rx_pkts[i], nb_bufs - i,
> + &split_flags[i]);
> +}
Hi, Steve,
Best Regards,
Mark
> -----Original Message-----
> From: Liang, Cunming
> Sent: Tuesday, October 27, 2015 1:28 PM
> To: Chen, Jing D; dev@dpdk.org
> Cc: Tao, Zhe; He, Shaopeng; Ananyev, Konstantin; Richardson, Bruce
> Subject: Re: [PATCH v2 08/16] fm10k: add Vector RX scatter function
>
> Hi,
>
> On 10/22/2015 5:44 PM, Chen Jing D(Mark) wrote:
> > From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
> >
> > Add func fm10k_recv_scattered_pkts_vec to receive chained packets
> > with SSE instructions.
> >
> > Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> > ---
> > drivers/net/fm10k/fm10k.h | 2 +
> > drivers/net/fm10k/fm10k_rxtx_vec.c | 88 ++++++++++++++++++++++++++++++++++++
> > 2 files changed, 90 insertions(+), 0 deletions(-)
> >
> [...]
> > +
> > +/*
> > + * vPMD receive routine that reassembles scattered packets
> > + *
> > + * Notice:
> > + * - don't support ol_flags for rss and csum err
> > + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> > + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> > + * numbers of DD bit
> In order to make sure nb_pkts > RTE_IXGBE_MAX_RX_BURST, it's necessary
> to do RTE_MIN().
I'll remove the incorrect comments. fm10k_recv_raw_pkts_vec uses nb_pkts as the
bound when it iterates. After that, fm10k_recv_scattered_pkts_vec uses the number
of packets actually received, nb_bufs, as the bound.
So I think RTE_MIN() is not necessary?
Hi, Steve,
Best Regards,
Mark
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chen, Jing D
> Sent: Tuesday, October 27, 2015 1:44 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 08/16] fm10k: add Vector RX scatter function
>
> Hi, Steve,
>
> Best Regards,
> Mark
>
>
> > -----Original Message-----
> > From: Liang, Cunming
> > Sent: Tuesday, October 27, 2015 1:28 PM
> > To: Chen, Jing D; dev@dpdk.org
> > Cc: Tao, Zhe; He, Shaopeng; Ananyev, Konstantin; Richardson, Bruce
> > Subject: Re: [PATCH v2 08/16] fm10k: add Vector RX scatter function
> >
> > Hi,
> >
> > On 10/22/2015 5:44 PM, Chen Jing D(Mark) wrote:
> > > From: "Chen Jing D(Mark)" <jing.d.chen@intel.com>
> > >
> > > Add func fm10k_recv_scattered_pkts_vec to receive chained packets
> > > with SSE instructions.
> > >
> > > Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> > > ---
> > > drivers/net/fm10k/fm10k.h | 2 +
> > > drivers/net/fm10k/fm10k_rxtx_vec.c | 88 ++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 90 insertions(+), 0 deletions(-)
> > >
> > [...]
> > > +
> > > +/*
> > > + * vPMD receive routine that reassembles scattered packets
> > > + *
> > > + * Notice:
> > > + * - don't support ol_flags for rss and csum err
> > > + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> > > + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> > > + * numbers of DD bit
> > In order to make sure nb_pkts > RTE_IXGBE_MAX_RX_BURST, it's necessary
> > to do RTE_MIN().
My bad. You mean nb_pkts should be less than or equal to RTE_IXGBE_MAX_RX_BURST.
I'll change it accordingly.
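For illustration, a minimal sketch of what such a clamp could look like. This is not the follow-up patch: MAX_RX_BURST, MIN_U16, clamp_burst and no_split_flags are hypothetical names, and the value 32 is an assumption inferred from the four 64-bit words the patch tests. The point is that split_flags has a fixed size, so the request has to be bounded before any code indexes it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for RTE_FM10K_MAX_RX_BURST; 32 is an assumption
 * that matches the four 64-bit words checked in the patch. */
#define MAX_RX_BURST 32

/* Local min helper so the sketch needs no DPDK headers (stands in for
 * RTE_MIN). */
#define MIN_U16(a, b) ((uint16_t)((a) < (b) ? (a) : (b)))

/* Bound the caller's request so a per-burst flag array cannot overflow. */
static uint16_t
clamp_burst(uint16_t nb_pkts)
{
	return MIN_U16(nb_pkts, MAX_RX_BURST);
}

/* Return 1 if no split flag is set in the whole array, 0 otherwise.
 * Copying into 64-bit words keeps the word-at-a-time test while avoiding
 * a cast from a uint8_t pointer to a uint64_t pointer. */
static int
no_split_flags(const uint8_t *flags)
{
	uint64_t words[MAX_RX_BURST / sizeof(uint64_t)];
	size_t i;

	assert(flags != NULL);
	memcpy(words, flags, sizeof(words));
	for (i = 0; i < sizeof(words) / sizeof(words[0]); i++)
		if (words[i] != 0)
			return 0;
	return 1;
}
```

With a clamp like this at the top of the receive function, both the raw receive and the fast-path check only ever touch the MAX_RX_BURST entries the array actually has.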
>
> I'll remove the incorrect comments. fm10k_recv_raw_pkts_vec uses nb_pkts as the
> bound when it iterates. After that, fm10k_recv_scattered_pkts_vec uses the number
> of packets actually received, nb_bufs, as the bound.
> So, I think RTE_MIN() is not necessary?
>
@@ -329,4 +329,6 @@ uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
+uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
+ uint16_t);
#endif
@@ -508,3 +508,91 @@ fm10k_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
{
return fm10k_recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
}
+
+static inline uint16_t
+fm10k_reassemble_packets(struct fm10k_rx_queue *rxq,
+ struct rte_mbuf **rx_bufs,
+ uint16_t nb_bufs, uint8_t *split_flags)
+{
+ struct rte_mbuf *pkts[RTE_FM10K_MAX_RX_BURST]; /*finished pkts*/
+ struct rte_mbuf *start = rxq->pkt_first_seg;
+ struct rte_mbuf *end = rxq->pkt_last_seg;
+ unsigned pkt_idx, buf_idx;
+
+
+ for (buf_idx = 0, pkt_idx = 0; buf_idx < nb_bufs; buf_idx++) {
+ if (end != NULL) {
+ /* processing a split packet */
+ end->next = rx_bufs[buf_idx];
+ start->nb_segs++;
+ start->pkt_len += rx_bufs[buf_idx]->data_len;
+ end = end->next;
+
+ if (!split_flags[buf_idx]) {
+ /* it's the last packet of the set */
+ start->hash = end->hash;
+ start->ol_flags = end->ol_flags;
+ pkts[pkt_idx++] = start;
+ start = end = NULL;
+ }
+ } else {
+ /* not processing a split packet */
+ if (!split_flags[buf_idx]) {
+ /* not a split packet, save and skip */
+ pkts[pkt_idx++] = rx_bufs[buf_idx];
+ continue;
+ }
+ end = start = rx_bufs[buf_idx];
+ }
+ }
+
+ /* save the partial packet for next time */
+ rxq->pkt_first_seg = start;
+ rxq->pkt_last_seg = end;
+ memcpy(rx_bufs, pkts, pkt_idx * (sizeof(*pkts)));
+ return pkt_idx;
+}
+
+/*
+ * vPMD receive routine that reassembles scattered packets
+ *
+ * Notice:
+ * - don't support ol_flags for rss and csum err
+ * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
+ * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
+ * numbers of DD bit
+ * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
+ */
+uint16_t
+fm10k_recv_scattered_pkts_vec(void *rx_queue,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct fm10k_rx_queue *rxq = rx_queue;
+ uint8_t split_flags[RTE_FM10K_MAX_RX_BURST] = {0};
+ unsigned i = 0;
+
+ /* get some new buffers */
+ uint16_t nb_bufs = fm10k_recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
+ split_flags);
+ if (nb_bufs == 0)
+ return 0;
+
+ /* happy day case, full burst + no packets to be joined */
+ const uint64_t *split_fl64 = (uint64_t *)split_flags;
+ if (rxq->pkt_first_seg == NULL &&
+ split_fl64[0] == 0 && split_fl64[1] == 0 &&
+ split_fl64[2] == 0 && split_fl64[3] == 0)
+ return nb_bufs;
+
+ /* reassemble any packets that need reassembly*/
+ if (rxq->pkt_first_seg == NULL) {
+ /* find the first split flag, and only reassemble then*/
+ while (i < nb_bufs && !split_flags[i])
+ i++;
+ if (i == nb_bufs)
+ return nb_bufs;
+ }
+ return i + fm10k_reassemble_packets(rxq, &rx_pkts[i], nb_bufs - i,
+ &split_flags[i]);
+}