Message ID | 1477486575-25148-2-git-send-email-tomaszx.kulasek@intel.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 7C633BB54; Wed, 26 Oct 2016 14:58:17 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 97F03BB52 for <dev@dpdk.org>; Wed, 26 Oct 2016 14:58:14 +0200 (CEST) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga103.fm.intel.com with ESMTP; 26 Oct 2016 05:58:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos; i="5.31,550,1473145200"; d="scan'208"; a="1059400899" Received: from unknown (HELO Sent) ([10.103.102.234]) by fmsmga001.fm.intel.com with SMTP; 26 Oct 2016 05:58:00 -0700 Received: by Sent (sSMTP sendmail emulation); Wed, 26 Oct 2016 14:57:16 +0200 From: Tomasz Kulasek <tomaszx.kulasek@intel.com> To: dev@dpdk.org Date: Wed, 26 Oct 2016 14:56:10 +0200 Message-Id: <1477486575-25148-2-git-send-email-tomaszx.kulasek@intel.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1477486575-25148-1-git-send-email-tomaszx.kulasek@intel.com> References: <1477327917-18564-1-git-send-email-tomaszx.kulasek@intel.com> <1477486575-25148-1-git-send-email-tomaszx.kulasek@intel.com> Subject: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK <dev.dpdk.org> List-Unsubscribe: <http://dpdk.org/ml/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://dpdk.org/ml/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <http://dpdk.org/ml/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Commit Message
Tomasz Kulasek
Oct. 26, 2016, 12:56 p.m. UTC
Added API for `rte_eth_tx_prep`
uint16_t rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
Added fields to the `struct rte_eth_desc_lim`:
uint16_t nb_seg_max;
/**< Max number of segments per whole packet. */
uint16_t nb_mtu_seg_max;
/**< Max number of segments per one MTU */
Added functions:
int rte_validate_tx_offload(struct rte_mbuf *m)
to validate general requirements for tx offload set in mbuf of packet
such a flag completness. In current implementation this function is
called optionaly when RTE_LIBRTE_ETHDEV_DEBUG is enabled.
int rte_phdr_cksum_fix(struct rte_mbuf *m)
to fix pseudo header checksum for TSO and non-TSO tcp/udp packets
before hardware tx checksum offload.
- for non-TSO tcp/udp packets full pseudo-header checksum is
counted and set.
- for TSO the IP payload length is not included.
PERFORMANCE TESTS
-----------------
This feature was tested with modified csum engine from test-pmd.
The packet checksum preparation was moved from application to Tx
preparation step placed before burst.
We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)
We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.
IMPACT:
1) For unimplemented Tx preparation callback the performance impact is
negligible,
2) For packet condition check without checksum modifications (nb_segs,
available offloads, etc.) is 14626628/14252168 (~2.62% drop),
3) Full support in ixgbe driver (point 2 + packet checksum
initialization) is 14060924/13588094 (~3.48% drop)
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
config/common_base | 1 +
lib/librte_ether/rte_ethdev.h | 103 +++++++++++++++++++++++++++++++++++++++++
lib/librte_mbuf/rte_mbuf.h | 64 +++++++++++++++++++++++++
lib/librte_net/rte_net.h | 85 ++++++++++++++++++++++++++++++++++
4 files changed, 253 insertions(+)
Comments
On 10/26/2016 02:56 PM, Tomasz Kulasek wrote: > Added API for `rte_eth_tx_prep` > > [...] > > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Hi Tomasz, This is a major new function in the API and I still have some comments. 2016-10-26 14:56, Tomasz Kulasek: > --- a/config/common_base > +++ b/config/common_base > +CONFIG_RTE_ETHDEV_TX_PREP=y We cannot enable it until it is implemented in every drivers. > struct rte_eth_dev { > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > struct rte_eth_dev_data *data; /**< Pointer to device data */ > const struct eth_driver *driver;/**< Driver for this device */ > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ Could you confirm why tx_pkt_prep is not in dev_ops? I guess we want to have several implementations? Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? > +rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id, struct rte_mbuf **tx_pkts, > + uint16_t nb_pkts) The word "prep" can be understood as "prepend". Why not rte_eth_tx_prepare? > +/** > + * Fix pseudo header checksum > + * > + * This function fixes pseudo header checksum for TSO and non-TSO tcp/udp in > + * provided mbufs packet data. > + * > + * - for non-TSO tcp/udp packets full pseudo-header checksum is counted and set > + * in packet data, > + * - for TSO the IP payload length is not included in pseudo header. > + * > + * This function expects that used headers are in the first data segment of > + * mbuf, are not fragmented and can be safely modified. What happens otherwise? > + * > + * @param m > + * The packet mbuf to be fixed. > + * @return > + * 0 if checksum is initialized properly > + */ > +static inline int > +rte_phdr_cksum_fix(struct rte_mbuf *m) Could we find a better name for this function? - About the prefix, rte_ip_ ? - About the scope, where this phdr_cksum is specified? Isn't it an intel_phdr_cksum to match what hardware expects? - About the verb, is it really fixing something broken? Or just writing into a mbuf? I would suggest rte_ip_intel_cksum_prepare.
> > Hi Tomasz, > > This is a major new function in the API and I still have some comments. > > 2016-10-26 14:56, Tomasz Kulasek: > > --- a/config/common_base > > +++ b/config/common_base > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > We cannot enable it until it is implemented in every drivers. Not sure why? If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. Right now it is not mandatory for the PMD to implement it. > > > struct rte_eth_dev { > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > const struct eth_driver *driver;/**< Driver for this device */ > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > Could you confirm why tx_pkt_prep is not in dev_ops? > I guess we want to have several implementations? Yes, it depends on configuration options, same as tx_pkt_burst. > > Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? That's probably a good idea, but I suppose it is out of scope for that patch. Konstantin
2016-10-27 15:52, Ananyev, Konstantin: > > > > > Hi Tomasz, > > > > This is a major new function in the API and I still have some comments. > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > --- a/config/common_base > > > +++ b/config/common_base > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > We cannot enable it until it is implemented in every drivers. > > Not sure why? > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > Right now it is not mandatory for the PMD to implement it. If it is not implemented, the application must do the preparation by itself. From patch 6: " Removed pseudo header calculation for udp/tcp/tso packets from application and used Tx preparation API for packet preparation and verification. " So how does it behave with other drivers? > > > struct rte_eth_dev { > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > const struct eth_driver *driver;/**< Driver for this device */ > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > I guess we want to have several implementations? > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? > > That's probably a good idea, but I suppose it is out of scope for that patch. No it's not out of scope. It answers to the question "why is it added in this structure and not dev_ops". We won't do this change when nothing else is changed in the struct.
> -----Original Message----- > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > Sent: Thursday, October 27, 2016 5:02 PM > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > Hi Tomasz, > > > > > > This is a major new function in the API and I still have some comments. > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > --- a/config/common_base > > > > +++ b/config/common_base > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > We cannot enable it until it is implemented in every drivers. > > > > Not sure why? > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > Right now it is not mandatory for the PMD to implement it. > > If it is not implemented, the application must do the preparation by itself. > From patch 6: > " > Removed pseudo header calculation for udp/tcp/tso packets from > application and used Tx preparation API for packet preparation and > verification. > " > So how does it behave with other drivers? Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. My bad, missed that part completely. Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. Probably a new fwd mode or just extra parameter for the existing one? Any other suggestions? > > > > > struct rte_eth_dev { > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > I guess we want to have several implementations? > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? > > > > That's probably a good idea, but I suppose it is out of scope for that patch. > > No it's not out of scope. > It answers to the question "why is it added in this structure and not dev_ops". > We won't do this change when nothing else is changed in the struct. Not sure I understood you here: Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced as part of that patch? But that's a lot of changes all over rte_ethdev.[h,c]. It definitely worse a separate patch (might be some discussion) for me. Konstantin
Hi Thomas, > -----Original Message----- > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > Sent: Thursday, October 27, 2016 17:01 > To: Kulasek, TomaszX <tomaszx.kulasek@intel.com> > Cc: dev@dpdk.org; olivier.matz@6wind.com > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > Hi Tomasz, > > This is a major new function in the API and I still have some comments. > > 2016-10-26 14:56, Tomasz Kulasek: > > --- a/config/common_base > > +++ b/config/common_base > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > We cannot enable it until it is implemented in every drivers. > For most of drivers it's safe to enable it by default and if this feature is not supported, no checks/modifications are done. In that meaning the processing path is the same as without using Tx preparation. Introducing this macro was discussed in the threads: http://dpdk.org/ml/archives/dev/2016-September/046437.html http://dpdk.org/dev/patchwork/patch/15770/ Short conclusion: Jerin Jacob pointed, that it can have significant impact on some architectures (such a low-end ARMv7, ARMv8 targets which may not have PCIE-RC support and have only integrated NIC controller), even if this feature is not implemented. We've added this macro to provide an ability to use NOOP operation and allow turn off this feature if will have adverse effect on specific configuration/hardware. > > struct rte_eth_dev { > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. > */ > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. > */ > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare > function. */ > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > const struct eth_driver *driver;/**< Driver for this device */ > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > Could you confirm why tx_pkt_prep is not in dev_ops? > I guess we want to have several implementations? > Yes, the implementation may vary on selected tx_burst path (e.g. vector implementation, simple implementation, full featured, and so on, and can have another requirements, such a implemented features, performance requirements for each implementation). The path is chosen based on the application requirements transparently and we have a pair of callbacks -- tx_burst and corresponding callback (which depends directly on tx_burst path). > Shouldn't we have a const struct control_dev_ops and a struct > datapath_dev_ops? > > > +rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id, struct rte_mbuf > **tx_pkts, > > + uint16_t nb_pkts) > > The word "prep" can be understood as "prepend". > Why not rte_eth_tx_prepare? > I do not mind. > > +/** > > + * Fix pseudo header checksum > > + * > > + * This function fixes pseudo header checksum for TSO and non-TSO > tcp/udp in > > + * provided mbufs packet data. > > + * > > + * - for non-TSO tcp/udp packets full pseudo-header checksum is counted > and set > > + * in packet data, > > + * - for TSO the IP payload length is not included in pseudo header. > > + * > > + * This function expects that used headers are in the first data > segment of > > + * mbuf, are not fragmented and can be safely modified. > > What happens otherwise? > There are requirements for this helper function. For Tx preparation callback we check this requirement and if it fails, -NOTSUP errno is returned. > > + * > > + * @param m > > + * The packet mbuf to be fixed. > > + * @return > > + * 0 if checksum is initialized properly > > + */ > > +static inline int > > +rte_phdr_cksum_fix(struct rte_mbuf *m) > > Could we find a better name for this function? > - About the prefix, rte_ip_ ? > - About the scope, where this phdr_cksum is specified? > Isn't it an intel_phdr_cksum to match what hardware expects? > - About the verb, is it really fixing something broken? > Or just writing into a mbuf? > I would suggest rte_ip_intel_cksum_prepare. Fixes in the meaning of requirements for offloads, which states e.g. that to use specific Tx offload we should to fill checksums in a proper way, if not, thee settings are not valid and should be fixed. But you're right, prepare is better word. About the function name, maybe rte_net_intel_chksum_prepare will be better while it prepares also tcp/udp headers and is placed in rte_net.h? Tomasz
2016-10-27 16:24, Ananyev, Konstantin: > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > Hi Tomasz, > > > > > > > > This is a major new function in the API and I still have some comments. > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > --- a/config/common_base > > > > > +++ b/config/common_base > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > Not sure why? > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > Right now it is not mandatory for the PMD to implement it. > > > > If it is not implemented, the application must do the preparation by itself. > > From patch 6: > > " > > Removed pseudo header calculation for udp/tcp/tso packets from > > application and used Tx preparation API for packet preparation and > > verification. > > " > > So how does it behave with other drivers? > > Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. > My bad, missed that part completely. > Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. > Probably a new fwd mode or just extra parameter for the existing one? > Any other suggestions? Please think how we can use it in every applications. It is not ready. Either we introduce the API without enabling it, or we implement it in every drivers. > > > > > struct rte_eth_dev { > > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > > I guess we want to have several implementations? > > > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? > > > > > > That's probably a good idea, but I suppose it is out of scope for that patch. > > > > No it's not out of scope. > > It answers to the question "why is it added in this structure and not dev_ops". > > We won't do this change when nothing else is changed in the struct. > > Not sure I understood you here: > Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced as part of that patch? > But that's a lot of changes all over rte_ethdev.[h,c]. > It definitely worse a separate patch (might be some discussion) for me. Yes it could be a separate patch in the same patchset.
Hi > -----Original Message----- > From: Ananyev, Konstantin > Sent: Thursday, October 27, 2016 18:24 > To: Thomas Monjalon <thomas.monjalon@6wind.com> > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > > -----Original Message----- > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > Sent: Thursday, October 27, 2016 5:02 PM > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > > > > Hi Tomasz, > > > > > > > > This is a major new function in the API and I still have some > comments. > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > --- a/config/common_base > > > > > +++ b/config/common_base > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > Not sure why? > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > Right now it is not mandatory for the PMD to implement it. > > > > If it is not implemented, the application must do the preparation by > itself. > > From patch 6: > > " > > Removed pseudo header calculation for udp/tcp/tso packets from > > application and used Tx preparation API for packet preparation and > > verification. > > " > > So how does it behave with other drivers? > > Hmm so it seems that we broke testpmd csumonly mode for non-intel > drivers.. > My bad, missed that part completely. > Yes, then I suppose for now we'll need to support both (with and without) > code paths for testpmd. > Probably a new fwd mode or just extra parameter for the existing one? > Any other suggestions? > I had sent txprep engine in v2 (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on the suggestions. If you like it I can resent it in place of csumonly modification. Tomasz > > > > > > > struct rte_eth_dev { > > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive > function. */ > > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit > > > > > function. */ > > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit > > > > > +prepare function. */ > > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by > > > > > PMD */ > > > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > > I guess we want to have several implementations? > > > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct > datapath_dev_ops? > > > > > > That's probably a good idea, but I suppose it is out of scope for that > patch. > > > > No it's not out of scope. > > It answers to the question "why is it added in this structure and not > dev_ops". > > We won't do this change when nothing else is changed in the struct. > > Not sure I understood you here: > Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced > as part of that patch? > But that's a lot of changes all over rte_ethdev.[h,c]. > It definitely worse a separate patch (might be some discussion) for me. > Konstantin > >
Hi Tomasz, > > Hi > > > -----Original Message----- > > From: Ananyev, Konstantin > > Sent: Thursday, October 27, 2016 18:24 > > To: Thomas Monjalon <thomas.monjalon@6wind.com> > > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > > > > > > -----Original Message----- > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > Sent: Thursday, October 27, 2016 5:02 PM > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > > > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > > > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > > > > > > > Hi Tomasz, > > > > > > > > > > This is a major new function in the API and I still have some > > comments. > > > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > --- a/config/common_base > > > > > > +++ b/config/common_base > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > Not sure why? > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > If it is not implemented, the application must do the preparation by > > itself. > > > From patch 6: > > > " > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > application and used Tx preparation API for packet preparation and > > > verification. > > > " > > > So how does it behave with other drivers? > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel > > drivers.. > > My bad, missed that part completely. > > Yes, then I suppose for now we'll need to support both (with and without) > > code paths for testpmd. > > Probably a new fwd mode or just extra parameter for the existing one? > > Any other suggestions? > > > > I had sent txprep engine in v2 (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on the suggestions. If you like it I can resent > it in place of csumonly modification. I still not sure it is worth to have another version of csum... Can we introduce a new global variable in testpmd and a new command: testpmd> csum tx_prep or so? Looking at current testpmd patch, I suppose the changes will be minimal. What do you think? Konstantin > > Tomasz > > > > > > > > > > struct rte_eth_dev { > > > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive > > function. */ > > > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit > > > > > > function. */ > > > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit > > > > > > +prepare function. */ > > > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by > > > > > > PMD */ > > > > > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > > > I guess we want to have several implementations? > > > > > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct > > datapath_dev_ops? > > > > > > > > That's probably a good idea, but I suppose it is out of scope for that > > patch. > > > > > > No it's not out of scope. > > > It answers to the question "why is it added in this structure and not > > dev_ops". > > > We won't do this change when nothing else is changed in the struct. > > > > Not sure I understood you here: > > Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced > > as part of that patch? > > But that's a lot of changes all over rte_ethdev.[h,c]. > > It definitely worse a separate patch (might be some discussion) for me. > > Konstantin > > > >
Hi Konstantin, > -----Original Message----- > From: Ananyev, Konstantin > Sent: Friday, October 28, 2016 12:16 > To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; Thomas Monjalon > <thomas.monjalon@6wind.com> > Cc: dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > Hi Tomasz, > > > > > Hi > > > > > -----Original Message----- > > > From: Ananyev, Konstantin > > > Sent: Thursday, October 27, 2016 18:24 > > > To: Thomas Monjalon <thomas.monjalon@6wind.com> > > > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > > > Subject: RE: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > > > > > > > > > > -----Original Message----- > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > Sent: Thursday, October 27, 2016 5:02 PM > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com> > > > > Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org > > > > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > > > > > > > > > > Hi Tomasz, > > > > > > > > > > > > This is a major new function in the API and I still have some > > > comments. > > > > > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > --- a/config/common_base > > > > > > > +++ b/config/common_base > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > Not sure why? > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as > noop. > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > If it is not implemented, the application must do the preparation > > > > by > > > itself. > > > > From patch 6: > > > > " > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > application and used Tx preparation API for packet preparation and > > > > verification. > > > > " > > > > So how does it behave with other drivers? > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel > > > drivers.. > > > My bad, missed that part completely. > > > Yes, then I suppose for now we'll need to support both (with and > > > without) code paths for testpmd. > > > Probably a new fwd mode or just extra parameter for the existing one? > > > Any other suggestions? > > > > > > > I had sent txprep engine in v2 > > (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on the > suggestions. If you like it I can resent it in place of csumonly > modification. > > I still not sure it is worth to have another version of csum... > Can we introduce a new global variable in testpmd and a new command: > testpmd> csum tx_prep > or so? > Looking at current testpmd patch, I suppose the changes will be minimal. > What do you think? > Konstantin > This is not a problem. Tomasz
2016-10-28 10:15, Ananyev, Konstantin: > > From: Ananyev, Konstantin > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > --- a/config/common_base > > > > > > > +++ b/config/common_base > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > Not sure why? > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > If it is not implemented, the application must do the preparation by > > > itself. > > > > From patch 6: > > > > " > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > application and used Tx preparation API for packet preparation and > > > > verification. > > > > " > > > > So how does it behave with other drivers? > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel > > > drivers.. > > > My bad, missed that part completely. > > > Yes, then I suppose for now we'll need to support both (with and without) > > > code paths for testpmd. > > > Probably a new fwd mode or just extra parameter for the existing one? > > > Any other suggestions? > > > > > > > I had sent txprep engine in v2 (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on the suggestions. If you like it I can resent > > it in place of csumonly modification. > > I still not sure it is worth to have another version of csum... > Can we introduce a new global variable in testpmd and a new command: > testpmd> csum tx_prep > or so? > Looking at current testpmd patch, I suppose the changes will be minimal. > What do you think? No please no! The problem is not in testpmd. The problem is in every applications. Should we prepare the checksums or let tx_prep do it? The result will depend of the driver used.
> -----Original Message----- > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > Sent: Friday, October 28, 2016 11:22 AM > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kulasek, TomaszX <tomaszx.kulasek@intel.com> > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > 2016-10-28 10:15, Ananyev, Konstantin: > > > From: Ananyev, Konstantin > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > > --- a/config/common_base > > > > > > > > +++ b/config/common_base > > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > > > Not sure why? > > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > > > If it is not implemented, the application must do the preparation by > > > > itself. > > > > > From patch 6: > > > > > " > > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > > application and used Tx preparation API for packet preparation and > > > > > verification. > > > > > " > > > > > So how does it behave with other drivers? > > > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel > > > > drivers.. > > > > My bad, missed that part completely. > > > > Yes, then I suppose for now we'll need to support both (with and without) > > > > code paths for testpmd. > > > > Probably a new fwd mode or just extra parameter for the existing one? > > > > Any other suggestions? > > > > > > > > > > I had sent txprep engine in v2 (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on the suggestions. If you like it I can > resent > > > it in place of csumonly modification. > > > > I still not sure it is worth to have another version of csum... > > Can we introduce a new global variable in testpmd and a new command: > > testpmd> csum tx_prep > > or so? > > Looking at current testpmd patch, I suppose the changes will be minimal. > > What do you think? > > No please no! > The problem is not in testpmd. > The problem is in every applications. > Should we prepare the checksums or let tx_prep do it? Not sure, I understood you... Right now we don't' change other apps. They would work as before. If people would like to start to use tx_prep in their apps - they are free to do that. If they like to keep doing that manually - that's fine too. From other side we need an ability to test (and demonstrate) that new functionality. So we do need changes in testpmd. Konstantin > The result will depend of the driver used.
> -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin > Sent: Friday, October 28, 2016 11:29 AM > To: Thomas Monjalon <thomas.monjalon@6wind.com>; Kulasek, TomaszX > <tomaszx.kulasek@intel.com> > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > > -----Original Message----- > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > Sent: Friday, October 28, 2016 11:22 AM > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Kulasek, > > TomaszX <tomaszx.kulasek@intel.com> > > Cc: dev@dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation > > > > 2016-10-28 10:15, Ananyev, Konstantin: > > > > From: Ananyev, Konstantin > > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > > > --- a/config/common_base > > > > > > > > > +++ b/config/common_base > > > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > > > > > We cannot enable it until it is implemented in every > drivers. > > > > > > > > > > > > > > Not sure why? > > > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act > as noop. > > > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > > > > > If it is not implemented, the application must do the > > > > > > preparation by > > > > > itself. > > > > > > From patch 6: > > > > > > " > > > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > > > application and used Tx preparation API for packet preparation > > > > > > and verification. > > > > > > " > > > > > > So how does it behave with other drivers? > > > > > > > > > > Hmm so it seems that we broke testpmd csumonly mode for > > > > > non-intel drivers.. > > > > > My bad, missed that part completely. > > > > > Yes, then I suppose for now we'll need to support both (with and > > > > > without) code paths for testpmd. > > > > > Probably a new fwd mode or just extra parameter for the existing > one? > > > > > Any other suggestions? > > > > > > > > > > > > > I had sent txprep engine in v2 > > > > (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on > > > > the suggestions. If you like it I can > > resent > > > > it in place of csumonly modification. > > > > > > I still not sure it is worth to have another version of csum... > > > Can we introduce a new global variable in testpmd and a new command: > > > testpmd> csum tx_prep > > > or so? > > > Looking at current testpmd patch, I suppose the changes will be > minimal. > > > What do you think? > > > > No please no! > > The problem is not in testpmd. > > The problem is in every applications. > > Should we prepare the checksums or let tx_prep do it? > > Not sure, I understood you... > Right now we don't' change other apps. > They would work as before. > If people would like to start to use tx_prep in their apps - they are free > to do that. > If they like to keep doing that manually - that's fine too. > From other side we need an ability to test (and demonstrate) that new > functionality. > So we do need changes in testpmd. > Konstantin > Just my 2c on this: * given this is new functionality, and no apps are currently using it, I'm not sure I see the harm in having the function available by default. We just need to be clear about the limits of the function and the fact that apps need to do work themselves if the driver doesn't provide the function. * having it enabled will then allow any apps that want to use it do to so. * however, for our sample apps, and by default in testpmd, we *shouldn't* use this functionality, in the absence of any fallback, so that is where I would look to have the enable/disable switch, not in the library. * going forward, I think a SW fallback inside the ethdev API itself would be a good addition to make this fully generic. Hope this helps, [and also that I haven't missed some subtlety in the discussion!] /Bruce
On Fri, Oct 28, 2016 at 10:15:47AM +0000, Ananyev, Konstantin wrote: > Hi Tomasz, > > > > > > Not sure why? > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > If it is not implemented, the application must do the preparation by > > > itself. > > > > From patch 6: > > > > " > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > application and used Tx preparation API for packet preparation and > > > > verification. > > > > " > > > > So how does it behave with other drivers? > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel > > > drivers.. > > > My bad, missed that part completely. > > > Yes, then I suppose for now we'll need to support both (with and without) > > > code paths for testpmd. > > > Probably a new fwd mode or just extra parameter for the existing one? > > > Any other suggestions? > > > > > > > I had sent txprep engine in v2 (http://dpdk.org/dev/patchwork/patch/15775/), but I'm opened on the suggestions. If you like it I can resent > > it in place of csumonly modification. > > I still not sure it is worth to have another version of csum... > Can we introduce a new global variable in testpmd and a new command: > testpmd> csum tx_prep Just my 2 cents, As "tx_prep" is a generic API and if PMD tries to fix-up some other limitation(not csum) then in that case it is difficult for the application to know in which PMD&application combination it needs be used. > or so? > Looking at current testpmd patch, I suppose the changes will be minimal. > What do you think? > Konstantin > > > > > Tomasz > > > > > > > > > > > > > struct rte_eth_dev { > > > > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive > > > function. */ > > > > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit > > > > > > > function. */ > > > > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit > > > > > > > +prepare function. */ > > > > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by > > > > > > > PMD */ > > > > > > > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > > > > I guess we want to have several implementations? > > > > > > > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > > > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct > > > datapath_dev_ops? > > > > > > > > > > That's probably a good idea, but I suppose it is out of scope for that > > > patch. > > > > > > > > No it's not out of scope. > > > > It answers to the question "why is it added in this structure and not > > > dev_ops". > > > > We won't do this change when nothing else is changed in the struct. > > > > > > Not sure I understood you here: > > > Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced > > > as part of that patch? > > > But that's a lot of changes all over rte_ethdev.[h,c]. > > > It definitely worse a separate patch (might be some discussion) for me. > > > Konstantin > > > > > > >
Hi Thomasz, > > 2016-10-27 16:24, Ananyev, Konstantin: > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > Hi Tomasz, > > > > > > > > > > This is a major new function in the API and I still have some comments. > > > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > --- a/config/common_base > > > > > > +++ b/config/common_base > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > Not sure why? > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > If it is not implemented, the application must do the preparation by itself. > > > From patch 6: > > > " > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > application and used Tx preparation API for packet preparation and > > > verification. > > > " > > > So how does it behave with other drivers? > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. > > My bad, missed that part completely. > > Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. > > Probably a new fwd mode or just extra parameter for the existing one? > > Any other suggestions? > > Please think how we can use it in every applications. > It is not ready. > Either we introduce the API without enabling it, or we implement it > in every drivers. I understand your position here, but just like to point that: 1) It is a new functionality optional to use. The app is free not to use that functionality and still do the preparation itself (as it has to do it now). All existing apps would keep working as expected without using that function. Though if the app developer knows that for all HW models he plans to run on tx_prep is implemented - he is free to use it. 2) It would be difficult for Tomasz (and other Intel guys) to implement tx_prep() for all non-Intel HW that DPDK supports right now. We just don't have all the actual HW in stock and probably adequate knowledge of it. So we depend here on the good will of other PMD mainaners/developers to implement tx_prep() for these devices. From other side, if it will be disabled by default, then, I think, PMD developers just wouldn't be motivated to implement it. So it will be left untested and unused forever. > > > > > > > struct rte_eth_dev { > > > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > > > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > > > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > > > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > > > > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > > > I guess we want to have several implementations? > > > > > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? > > > > > > > > That's probably a good idea, but I suppose it is out of scope for that patch. > > > > > > No it's not out of scope. > > > It answers to the question "why is it added in this structure and not dev_ops". > > > We won't do this change when nothing else is changed in the struct. > > > > Not sure I understood you here: > > Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced as part of that patch? > > But that's a lot of changes all over rte_ethdev.[h,c]. > > It definitely worse a separate patch (might be some discussion) for me. > > Yes it could be a separate patch in the same patchset. Honestly, I think it is a good idea, but it is too late and too risky to do such change right now. We are on RC2 right now, just few days before RC3... Can't that wait till 17.02? From my understanding - it is pure code restructuring, without any functionality affected. Konstantin
> > Hi Thomasz, > > > > > 2016-10-27 16:24, Ananyev, Konstantin: > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > Hi Tomasz, > > > > > > > > > > > > This is a major new function in the API and I still have some comments. > > > > > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > --- a/config/common_base > > > > > > > +++ b/config/common_base > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > Not sure why? > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > If it is not implemented, the application must do the preparation by itself. > > > > From patch 6: > > > > " > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > application and used Tx preparation API for packet preparation and > > > > verification. > > > > " > > > > So how does it behave with other drivers? > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. > > > My bad, missed that part completely. > > > Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. > > > Probably a new fwd mode or just extra parameter for the existing one? > > > Any other suggestions? > > > > Please think how we can use it in every applications. > > It is not ready. > > Either we introduce the API without enabling it, or we implement it > > in every drivers. > > I understand your position here, but just like to point that: > 1) It is a new functionality optional to use. > The app is free not to use that functionality and still do the preparation itself > (as it has to do it now). > All existing apps would keep working as expected without using that function. > Though if the app developer knows that for all HW models he plans to run on > tx_prep is implemented - he is free to use it. > 2) It would be difficult for Tomasz (and other Intel guys) to implement tx_prep() > for all non-Intel HW that DPDK supports right now. > We just don't have all the actual HW in stock and probably adequate knowledge of it. > So we depend here on the good will of other PMD mainaners/developers to implement > tx_prep() for these devices. > From other side, if it will be disabled by default, then, I think, > PMD developers just wouldn't be motivated to implement it. > So it will be left untested and unused forever. Actually as another thought: Can we have it enabled by default, but mark it as experimental or so? If memory serves me right, we've done that for cryptodev in the past, no? Konstantin > > > > > > > > > > struct rte_eth_dev { > > > > > > > eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ > > > > > > > eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ > > > > > > > + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ > > > > > > > struct rte_eth_dev_data *data; /**< Pointer to device data */ > > > > > > > const struct eth_driver *driver;/**< Driver for this device */ > > > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ > > > > > > > > > > > > Could you confirm why tx_pkt_prep is not in dev_ops? > > > > > > I guess we want to have several implementations? > > > > > > > > > > Yes, it depends on configuration options, same as tx_pkt_burst. > > > > > > > > > > > > > > > > > Shouldn't we have a const struct control_dev_ops and a struct datapath_dev_ops? > > > > > > > > > > That's probably a good idea, but I suppose it is out of scope for that patch. > > > > > > > > No it's not out of scope. > > > > It answers to the question "why is it added in this structure and not dev_ops". > > > > We won't do this change when nothing else is changed in the struct. > > > > > > Not sure I understood you here: > > > Are you saying datapath_dev_ops/controlpath_dev_ops have to be introduced as part of that patch? > > > But that's a lot of changes all over rte_ethdev.[h,c]. > > > It definitely worse a separate patch (might be some discussion) for me. > > > > Yes it could be a separate patch in the same patchset. > > Honestly, I think it is a good idea, but it is too late and too risky to do such change right now. > We are on RC2 right now, just few days before RC3... > Can't that wait till 17.02? > From my understanding - it is pure code restructuring, without any functionality affected. > Konstantin >
2016-10-28 11:34, Ananyev, Konstantin: > > > 2016-10-27 16:24, Ananyev, Konstantin: > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > > --- a/config/common_base > > > > > > > > +++ b/config/common_base > > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > > > Not sure why? > > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > > > If it is not implemented, the application must do the preparation by itself. > > > > > From patch 6: > > > > > " > > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > > application and used Tx preparation API for packet preparation and > > > > > verification. > > > > > " > > > > > So how does it behave with other drivers? > > > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. > > > > My bad, missed that part completely. > > > > Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. > > > > Probably a new fwd mode or just extra parameter for the existing one? > > > > Any other suggestions? > > > > > > Please think how we can use it in every applications. > > > It is not ready. > > > Either we introduce the API without enabling it, or we implement it > > > in every drivers. > > > > I understand your position here, but just like to point that: > > 1) It is a new functionality optional to use. > > The app is free not to use that functionality and still do the preparation itself > > (as it has to do it now). > > All existing apps would keep working as expected without using that function. > > Though if the app developer knows that for all HW models he plans to run on > > tx_prep is implemented - he is free to use it. > > 2) It would be difficult for Tomasz (and other Intel guys) to implement tx_prep() > > for all non-Intel HW that DPDK supports right now. > > We just don't have all the actual HW in stock and probably adequate knowledge of it. > > So we depend here on the good will of other PMD mainaners/developers to implement > > tx_prep() for these devices. > > From other side, if it will be disabled by default, then, I think, > > PMD developers just wouldn't be motivated to implement it. > > So it will be left untested and unused forever. > > Actually as another thought: > Can we have it enabled by default, but mark it as experimental or so? > If memory serves me right, we've done that for cryptodev in the past, no? Cryptodev was a whole new library. We won't play the game "find which function is experimental or not". We should not enable a function until it is fully implemented. If the user really understands that it will work only with few drivers then he can change the build configuration himself. Enabling in the default configuration is a message to say that it works everywhere without any risk. It's so simple that I don't even understand why I must argue for. And by the way, it is late for 16.11. I suggest to integrate it in the beginning of 17.02 cycle, with the hope that you can convince other developers to implement it in other drivers, so we could finally enable it in the default config. Oh, and I don't trust that nobody were thinking that it would break testpmd for non-Intel drivers.
> > 2016-10-28 11:34, Ananyev, Konstantin: > > > > 2016-10-27 16:24, Ananyev, Konstantin: > > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > > > --- a/config/common_base > > > > > > > > > +++ b/config/common_base > > > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > > > > > Not sure why? > > > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > > > > > If it is not implemented, the application must do the preparation by itself. > > > > > > From patch 6: > > > > > > " > > > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > > > application and used Tx preparation API for packet preparation and > > > > > > verification. > > > > > > " > > > > > > So how does it behave with other drivers? > > > > > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. > > > > > My bad, missed that part completely. > > > > > Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. > > > > > Probably a new fwd mode or just extra parameter for the existing one? > > > > > Any other suggestions? > > > > > > > > Please think how we can use it in every applications. > > > > It is not ready. > > > > Either we introduce the API without enabling it, or we implement it > > > > in every drivers. > > > > > > I understand your position here, but just like to point that: > > > 1) It is a new functionality optional to use. > > > The app is free not to use that functionality and still do the preparation itself > > > (as it has to do it now). > > > All existing apps would keep working as expected without using that function. > > > Though if the app developer knows that for all HW models he plans to run on > > > tx_prep is implemented - he is free to use it. > > > 2) It would be difficult for Tomasz (and other Intel guys) to implement tx_prep() > > > for all non-Intel HW that DPDK supports right now. > > > We just don't have all the actual HW in stock and probably adequate knowledge of it. > > > So we depend here on the good will of other PMD mainaners/developers to implement > > > tx_prep() for these devices. > > > From other side, if it will be disabled by default, then, I think, > > > PMD developers just wouldn't be motivated to implement it. > > > So it will be left untested and unused forever. > > > > Actually as another thought: > > Can we have it enabled by default, but mark it as experimental or so? > > If memory serves me right, we've done that for cryptodev in the past, no? > > Cryptodev was a whole new library. > We won't play the game "find which function is experimental or not". > > We should not enable a function until it is fully implemented. > > If the user really understands that it will work only with few drivers > then he can change the build configuration himself. > Enabling in the default configuration is a message to say that it works > everywhere without any risk. > It's so simple that I don't even understand why I must argue for. > > And by the way, it is late for 16.11. Ok, I understand your concern about enabling it by default and testpmd breakage, but what else you believe is not ready? > I suggest to integrate it in the beginning of 17.02 cycle, with the hope > that you can convince other developers to implement it in other drivers, > so we could finally enable it in the default config. Ok, any insights then, how we can convince people to do that? BTW, it means then that tx_prep() should become part of mandatory API to be implemented by each PMD doing TX offloads, right? > > Oh, and I don't trust that nobody were thinking that it would break testpmd > for non-Intel drivers. Well, believe it or not, but yes, I missed that one. I think I already admitted that it was my fault, and apologized for that. But sure, it is your choice to trust me here or not. Konstantin
2016-10-28 12:59, Ananyev, Konstantin: > > 2016-10-28 11:34, Ananyev, Konstantin: > > > > > 2016-10-27 16:24, Ananyev, Konstantin: > > > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > > > > 2016-10-27 15:52, Ananyev, Konstantin: > > > > > > > > > 2016-10-26 14:56, Tomasz Kulasek: > > > > > > > > > > --- a/config/common_base > > > > > > > > > > +++ b/config/common_base > > > > > > > > > > +CONFIG_RTE_ETHDEV_TX_PREP=y > > > > > > > > > > > > > > > > > > We cannot enable it until it is implemented in every drivers. > > > > > > > > > > > > > > > > Not sure why? > > > > > > > > If tx_pkt_prep == NULL, then rte_eth_tx_prep() would just act as noop. > > > > > > > > Right now it is not mandatory for the PMD to implement it. > > > > > > > > > > > > > > If it is not implemented, the application must do the preparation by itself. > > > > > > > From patch 6: > > > > > > > " > > > > > > > Removed pseudo header calculation for udp/tcp/tso packets from > > > > > > > application and used Tx preparation API for packet preparation and > > > > > > > verification. > > > > > > > " > > > > > > > So how does it behave with other drivers? > > > > > > > > > > > > Hmm so it seems that we broke testpmd csumonly mode for non-intel drivers.. > > > > > > My bad, missed that part completely. > > > > > > Yes, then I suppose for now we'll need to support both (with and without) code paths for testpmd. > > > > > > Probably a new fwd mode or just extra parameter for the existing one? > > > > > > Any other suggestions? > > > > > > > > > > Please think how we can use it in every applications. > > > > > It is not ready. > > > > > Either we introduce the API without enabling it, or we implement it > > > > > in every drivers. > > > > > > > > I understand your position here, but just like to point that: > > > > 1) It is a new functionality optional to use. > > > > The app is free not to use that functionality and still do the preparation itself > > > > (as it has to do it now). > > > > All existing apps would keep working as expected without using that function. > > > > Though if the app developer knows that for all HW models he plans to run on > > > > tx_prep is implemented - he is free to use it. > > > > 2) It would be difficult for Tomasz (and other Intel guys) to implement tx_prep() > > > > for all non-Intel HW that DPDK supports right now. > > > > We just don't have all the actual HW in stock and probably adequate knowledge of it. > > > > So we depend here on the good will of other PMD mainaners/developers to implement > > > > tx_prep() for these devices. > > > > From other side, if it will be disabled by default, then, I think, > > > > PMD developers just wouldn't be motivated to implement it. > > > > So it will be left untested and unused forever. > > > > > > Actually as another thought: > > > Can we have it enabled by default, but mark it as experimental or so? > > > If memory serves me right, we've done that for cryptodev in the past, no? > > > > Cryptodev was a whole new library. > > We won't play the game "find which function is experimental or not". > > > > We should not enable a function until it is fully implemented. > > > > If the user really understands that it will work only with few drivers > > then he can change the build configuration himself. > > Enabling in the default configuration is a message to say that it works > > everywhere without any risk. > > It's so simple that I don't even understand why I must argue for. > > > > And by the way, it is late for 16.11. > > Ok, I understand your concern about enabling it by default and testpmd breakage, > but what else you believe is not ready? That's already a lot! I commented also about function naming. All these things are trivial to fix. But it is late. After RC1, we should stop integrating new features. > > I suggest to integrate it in the beginning of 17.02 cycle, with the hope > > that you can convince other developers to implement it in other drivers, > > so we could finally enable it in the default config. > > Ok, any insights then, how we can convince people to do that? You just have to explain clearly what this new feature is bringing and what will be the future possibilities. > BTW, it means then that tx_prep() should become part of mandatory API > to be implemented by each PMD doing TX offloads, right? Right. The question is "what means mandatory"? Should we block some patches for non-compliant drivers? Should we remove offloads capability from non-compliant drivers? > > Oh, and I don't trust that nobody were thinking that it would break testpmd > > for non-Intel drivers. > > Well, believe it or not, but yes, I missed that one. > I think I already admitted that it was my fault, and apologized for that. And it's my fault not having seen that before. I was hoping that good reviews would be done by other contributors. > But sure, it is your choice to trust me here or not. Konstantin I trust you. However if nobody else was reviewing this patchset at Intel, this is probably an issue. And more importantly we must encourage other vendors to review such major patch for the ethdev API.
Hi Thomas, > > > I suggest to integrate it in the beginning of 17.02 cycle, with the hope > > > that you can convince other developers to implement it in other drivers, > > > so we could finally enable it in the default config. > > > > Ok, any insights then, how we can convince people to do that? > > You just have to explain clearly what this new feature is bringing > and what will be the future possibilities. > > > BTW, it means then that tx_prep() should become part of mandatory API > > to be implemented by each PMD doing TX offloads, right? > > Right. > The question is "what means mandatory"? For me "mandatory" here would mean that: - if the PMD supports TX offloads AND - if to be able use any of these offloads the upper layer SW would have to: - modify the contents of the packet OR - obey HW specific restrictions then it is a PMD developer responsibility to provide tx_prep() that would implement expected modifications of the packet contents and restriction checks. Otherwise, tx_prep() implementation is not required and can be safely set to NULL. Does it sounds good enough to everyone? > Should we block some patches for non-compliant drivers? If we agree that it should be a 'mandatory' one - and patch in question breaks that requirement, then probably yes. > Should we remove offloads capability from non-compliant drivers? Do you mean existing PMDs? Are there any particular right now, that can't work properly with testpmd csumonly mode? Konstantin
2016-11-01 12:57, Ananyev, Konstantin: > > > > I suggest to integrate it in the beginning of 17.02 cycle, with the hope > > > > that you can convince other developers to implement it in other drivers, > > > > so we could finally enable it in the default config. > > > > > > Ok, any insights then, how we can convince people to do that? > > > > You just have to explain clearly what this new feature is bringing > > and what will be the future possibilities. > > > > > BTW, it means then that tx_prep() should become part of mandatory API > > > to be implemented by each PMD doing TX offloads, right? > > > > Right. > > The question is "what means mandatory"? > > For me "mandatory" here would mean that: > - if the PMD supports TX offloads AND > - if to be able use any of these offloads the upper layer SW would have to: > - modify the contents of the packet OR > - obey HW specific restrictions > then it is a PMD developer responsibility to provide tx_prep() that would implement > expected modifications of the packet contents and restriction checks. > Otherwise, tx_prep() implementation is not required and can be safely set to NULL. > > Does it sounds good enough to everyone? Yes, good definition, thanks. > > Should we block some patches for non-compliant drivers? > > If we agree that it should be a 'mandatory' one - and patch in question breaks > that requirement, then probably yes. > > > Should we remove offloads capability from non-compliant drivers? > > Do you mean existing PMDs? > Are there any particular right now, that can't work properly with testpmd csumonly mode? I cannot answer to this question. Before txprep, there is only one API: the application must prepare the packets checksum itself (get_psd_sum in testpmd). With txprep, the application have 2 choices: keep doing the job itself or call txprep which calls a PMD-specific function. The question is: does non-Intel drivers need a checksum preparation for TSO? Will it behave well if txprep does nothing in these drivers? When looking at the code, most of drivers handle the TSO flags. But it is hard to know whether they rely on the pseudo checksum or not. git grep -l 'PKT_TX_UDP_CKSUM\|PKT_TX_TCP_CKSUM\|PKT_TX_TCP_SEG' drivers/net/ drivers/net/bnxt/bnxt_txr.c drivers/net/cxgbe/sge.c drivers/net/e1000/em_rxtx.c drivers/net/e1000/igb_rxtx.c drivers/net/ena/ena_ethdev.c drivers/net/enic/enic_rxtx.c drivers/net/fm10k/fm10k_rxtx.c drivers/net/i40e/i40e_rxtx.c drivers/net/ixgbe/ixgbe_rxtx.c drivers/net/mlx4/mlx4.c drivers/net/mlx5/mlx5_rxtx.c drivers/net/nfp/nfp_net.c drivers/net/qede/qede_rxtx.c drivers/net/thunderx/nicvf_rxtx.c drivers/net/virtio/virtio_rxtx.c drivers/net/vmxnet3/vmxnet3_rxtx.c
diff --git a/config/common_base b/config/common_base index c7fd3db..619284b 100644 --- a/config/common_base +++ b/config/common_base @@ -120,6 +120,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024 CONFIG_RTE_LIBRTE_IEEE1588=n CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y +CONFIG_RTE_ETHDEV_TX_PREP=y # # Support NIC bypass logic diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 38641e8..cf6f68e 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -182,6 +182,7 @@ extern "C" { #include <rte_pci.h> #include <rte_dev.h> #include <rte_devargs.h> +#include <rte_errno.h> #include "rte_ether.h" #include "rte_eth_ctrl.h" #include "rte_dev_info.h" @@ -699,6 +700,8 @@ struct rte_eth_desc_lim { uint16_t nb_max; /**< Max allowed number of descriptors. */ uint16_t nb_min; /**< Min allowed number of descriptors. */ uint16_t nb_align; /**< Number of descriptors should be aligned to. */ + uint16_t nb_seg_max; /**< Max number of segments per whole packet. */ + uint16_t nb_mtu_seg_max; /**< Max number of segments per one MTU */ }; /** @@ -1188,6 +1191,11 @@ typedef uint16_t (*eth_tx_burst_t)(void *txq, uint16_t nb_pkts); /**< @internal Send output packets on a transmit queue of an Ethernet device. */ +typedef uint16_t (*eth_tx_prep_t)(void *txq, + struct rte_mbuf **tx_pkts, + uint16_t nb_pkts); +/**< @internal Prepare output packets on a transmit queue of an Ethernet device. */ + typedef int (*flow_ctrl_get_t)(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf); /**< @internal Get current flow control parameter on an Ethernet device */ @@ -1622,6 +1630,7 @@ struct rte_eth_rxtx_callback { struct rte_eth_dev { eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ + eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare function. */ struct rte_eth_dev_data *data; /**< Pointer to device data */ const struct eth_driver *driver;/**< Driver for this device */ const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */ @@ -2816,6 +2825,100 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id, return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, nb_pkts); } +/** + * Process a burst of output packets on a transmit queue of an Ethernet device. + * + * The rte_eth_tx_prep() function is invoked to prepare output packets to be + * transmitted on the output queue *queue_id* of the Ethernet device designated + * by its *port_id*. + * The *nb_pkts* parameter is the number of packets to be prepared which are + * supplied in the *tx_pkts* array of *rte_mbuf* structures, each of them + * allocated from a pool created with rte_pktmbuf_pool_create(). + * For each packet to send, the rte_eth_tx_prep() function performs + * the following operations: + * + * - Check if packet meets devices requirements for tx offloads. + * + * - Check limitations about number of segments. + * + * - Check additional requirements when debug is enabled. + * + * - Update and/or reset required checksums when tx offload is set for packet. + * + * Since this function can modify packet data, provided mbufs must be safely + * writable (e.g. modified data cannot be in shared segment). + * + * The rte_eth_tx_prep() function returns the number of packets ready to be + * sent. A return value equal to *nb_pkts* means that all packets are valid and + * ready to be sent, otherwise stops processing on the first invalid packet and + * leaves the rest packets untouched. + * + * @param port_id + * The port identifier of the Ethernet device. + * The value must be a valid port id. + * @param queue_id + * The index of the transmit queue through which output packets must be + * sent. + * The value must be in the range [0, nb_tx_queue - 1] previously supplied + * to rte_eth_dev_configure(). + * @param tx_pkts + * The address of an array of *nb_pkts* pointers to *rte_mbuf* structures + * which contain the output packets. + * @param nb_pkts + * The maximum number of packets to process. + * @return + * The number of packets correct and ready to be sent. The return value can be + * less than the value of the *tx_pkts* parameter when some packet doesn't + * meet devices requirements with rte_errno set appropriately: + * - -EINVAL: offload flags are not correctly set + * - -ENOTSUP: the offload feature is not supported by the hardware + * + */ + +#ifdef RTE_ETHDEV_TX_PREP + +static inline uint16_t +rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id, struct rte_mbuf **tx_pkts, + uint16_t nb_pkts) +{ + struct rte_eth_dev *dev; + +#ifdef RTE_LIBRTE_ETHDEV_DEBUG + if (!rte_eth_dev_is_valid_port(port_id)) { + RTE_PMD_DEBUG_TRACE("Invalid TX port_id=%d\n", port_id); + rte_errno = -EINVAL; + return 0; + } +#endif + + dev = &rte_eth_devices[port_id]; + +#ifdef RTE_LIBRTE_ETHDEV_DEBUG + if (queue_id >= dev->data->nb_tx_queues) { + RTE_PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id); + rte_errno = -EINVAL; + return 0; + } +#endif + + if (!dev->tx_pkt_prep) + return nb_pkts; + + return (*dev->tx_pkt_prep)(dev->data->tx_queues[queue_id], + tx_pkts, nb_pkts); +} + +#else + +static inline uint16_t +rte_eth_tx_prep(__rte_unused uint8_t port_id, __rte_unused uint16_t queue_id, + __rte_unused struct rte_mbuf **tx_pkts, uint16_t nb_pkts) +{ + return nb_pkts; +} + +#endif + typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t count, void *userdata); diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index 109e666..ff9e749 100644 --- a/lib/librte_mbuf/rte_mbuf.h +++ b/lib/librte_mbuf/rte_mbuf.h @@ -283,6 +283,19 @@ extern "C" { */ #define PKT_TX_OUTER_IPV6 (1ULL << 60) +/** + * Bit Mask of all supported packet Tx offload features flags, which can be set + * for packet. + */ +#define PKT_TX_OFFLOAD_MASK ( \ + PKT_TX_IP_CKSUM | \ + PKT_TX_L4_MASK | \ + PKT_TX_OUTER_IP_CKSUM | \ + PKT_TX_TCP_SEG | \ + PKT_TX_QINQ_PKT | \ + PKT_TX_VLAN_PKT | \ + PKT_TX_TUNNEL_MASK) + #define __RESERVED (1ULL << 61) /**< reserved for future mbuf use */ #define IND_ATTACHED_MBUF (1ULL << 62) /**< Indirect attached mbuf */ @@ -1647,6 +1660,57 @@ static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail } /** + * Validate general requirements for tx offload in mbuf. + * + * This function checks correctness and completeness of Tx offload settings. + * + * @param m + * The packet mbuf to be validated. + * @return + * 0 if packet is valid + */ +static inline int +rte_validate_tx_offload(const struct rte_mbuf *m) +{ + uint64_t ol_flags = m->ol_flags; + uint64_t inner_l3_offset = m->l2_len; + + /* Does packet set any of available offloads? */ + if (!(ol_flags & PKT_TX_OFFLOAD_MASK)) + return 0; + + if (ol_flags & PKT_TX_OUTER_IP_CKSUM) + inner_l3_offset += m->outer_l2_len + m->outer_l3_len; + + /* Headers are fragmented */ + if (rte_pktmbuf_data_len(m) < inner_l3_offset + m->l3_len + m->l4_len) + return -ENOTSUP; + + /* IP checksum can be counted only for IPv4 packet */ + if ((ol_flags & PKT_TX_IP_CKSUM) && (ol_flags & PKT_TX_IPV6)) + return -EINVAL; + + /* IP type not set when required */ + if (ol_flags & (PKT_TX_L4_MASK | PKT_TX_TCP_SEG)) + if (!(ol_flags & (PKT_TX_IPV4 | PKT_TX_IPV6))) + return -EINVAL; + + /* Check requirements for TSO packet */ + if (ol_flags & PKT_TX_TCP_SEG) + if ((m->tso_segsz == 0) || + ((ol_flags & PKT_TX_IPV4) && + !(ol_flags & PKT_TX_IP_CKSUM))) + return -EINVAL; + + /* PKT_TX_OUTER_IP_CKSUM set for non outer IPv4 packet. */ + if ((ol_flags & PKT_TX_OUTER_IP_CKSUM) && + !(ol_flags & PKT_TX_OUTER_IPV4)) + return -EINVAL; + + return 0; +} + +/** * Dump an mbuf structure to a file. * * Dump all fields for the given packet mbuf and all its associated diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h index d4156ae..553d5f1 100644 --- a/lib/librte_net/rte_net.h +++ b/lib/librte_net/rte_net.h @@ -38,6 +38,11 @@ extern "C" { #endif +#include <rte_ip.h> +#include <rte_udp.h> +#include <rte_tcp.h> +#include <rte_sctp.h> + /** * Structure containing header lengths associated to a packet, filled * by rte_net_get_ptype(). @@ -86,6 +91,86 @@ struct rte_net_hdr_lens { uint32_t rte_net_get_ptype(const struct rte_mbuf *m, struct rte_net_hdr_lens *hdr_lens, uint32_t layers); +/** + * Fix pseudo header checksum + * + * This function fixes pseudo header checksum for TSO and non-TSO tcp/udp in + * provided mbufs packet data. + * + * - for non-TSO tcp/udp packets full pseudo-header checksum is counted and set + * in packet data, + * - for TSO the IP payload length is not included in pseudo header. + * + * This function expects that used headers are in the first data segment of + * mbuf, are not fragmented and can be safely modified. + * + * @param m + * The packet mbuf to be fixed. + * @return + * 0 if checksum is initialized properly + */ +static inline int +rte_phdr_cksum_fix(struct rte_mbuf *m) +{ + struct ipv4_hdr *ipv4_hdr; + struct ipv6_hdr *ipv6_hdr; + struct tcp_hdr *tcp_hdr; + struct udp_hdr *udp_hdr; + uint64_t ol_flags = m->ol_flags; + uint64_t inner_l3_offset = m->l2_len; + + if (ol_flags & PKT_TX_OUTER_IP_CKSUM) + inner_l3_offset += m->outer_l2_len + m->outer_l3_len; + + if ((ol_flags & PKT_TX_UDP_CKSUM) == PKT_TX_UDP_CKSUM) { + if (ol_flags & PKT_TX_IPV4) { + ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, + inner_l3_offset); + + if (ol_flags & PKT_TX_IP_CKSUM) + ipv4_hdr->hdr_checksum = 0; + + udp_hdr = (struct udp_hdr *)((char *)ipv4_hdr + + m->l3_len); + udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(ipv4_hdr, + ol_flags); + } else { + ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *, + inner_l3_offset); + /* non-TSO udp */ + udp_hdr = rte_pktmbuf_mtod_offset(m, struct udp_hdr *, + inner_l3_offset + m->l3_len); + udp_hdr->dgram_cksum = rte_ipv6_phdr_cksum(ipv6_hdr, + ol_flags); + } + } else if ((ol_flags & PKT_TX_TCP_CKSUM) || + (ol_flags & PKT_TX_TCP_SEG)) { + if (ol_flags & PKT_TX_IPV4) { + ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, + inner_l3_offset); + + if (ol_flags & PKT_TX_IP_CKSUM) + ipv4_hdr->hdr_checksum = 0; + + /* non-TSO tcp or TSO */ + tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + + m->l3_len); + tcp_hdr->cksum = rte_ipv4_phdr_cksum(ipv4_hdr, + ol_flags); + } else { + ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *, + inner_l3_offset); + /* non-TSO tcp or TSO */ + tcp_hdr = rte_pktmbuf_mtod_offset(m, struct tcp_hdr *, + inner_l3_offset + m->l3_len); + tcp_hdr->cksum = rte_ipv6_phdr_cksum(ipv6_hdr, + ol_flags); + } + } + + return 0; +} + #ifdef __cplusplus } #endif