[RFC] ethdev: support hairpin queue

Message ID 1565703468-55617-1-git-send-email-orika@mellanox.com
State New
Delegated to: Ferruh Yigit
Series
  • [RFC] ethdev: support hairpin queue

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/checkpatch warning coding style issues

Commit Message

Ori Kam Aug. 13, 2019, 1:37 p.m.
This RFC replaces RFC[1].

The hairpin feature (an alternative name could be "forward") acts as a "bump on
the wire": a packet received from the wire can be modified by offloaded actions
and then sent back to the wire without application intervention, which saves CPU
cycles.

Hairpin is the inverse of loopback, in which the application
sends a packet that is then received again by the
application without being sent to the wire.

Hairpin can be used by a number of different VNFs, for example a load
balancer, a gateway, and so on.

As can be seen from this description, a hairpin is basically an Rx queue
connected to a Tx queue.

During the design phase I considered two ways to implement this
feature: the first is adding a new rte_flow action; the second
is creating a special kind of queue.

The advantages of the queue approach:
1. More control for the application, e.g. over the queue depth (the memory size
that should be used).
2. Enables QoS. QoS is normally a parameter of a queue, so with this approach it
will be easy to integrate with such a system.
3. Native integration with the rte_flow API. Simply setting the target
queue/RSS to a hairpin queue routes the traffic
to the hairpin queue.
4. Enables queue offloading.

Each hairpin Rx queue can be connected to one Tx queue or to a number of Tx
queues, which can belong to different ports if the PMD supports it. The same
goes the other way: each hairpin Tx queue can be connected to one or more Rx
queues. This is why both the Tx queue setup and the Rx queue setup take the
hairpin configuration structure.
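To make the binding concrete, here is a small self-contained sketch of the relationship described above. It is a conceptual model only, not the proposed rte_ethdev API; every name in it (hairpin_peer, hairpin_conf, hairpin_queue_setup, MAX_PEERS) is hypothetical:

```c
#include <stdint.h>

/* Hypothetical model of the binding described above: each hairpin queue
 * lists the peer queue(s) it is wired to, and a peer may live on a
 * different port when the PMD supports cross-port hairpin. */
#define MAX_PEERS 4

struct hairpin_peer {
    uint16_t port;   /* peer port id (may differ from the local port) */
    uint16_t queue;  /* peer queue index on that port */
};

struct hairpin_conf {
    uint16_t peer_count;
    struct hairpin_peer peers[MAX_PEERS];
};

/* Both the Rx and the Tx setup receive the same configuration
 * structure, mirroring the RFC's design. Returns 0 on success,
 * -1 if the peer list is empty/too long or names a foreign port
 * on a PMD without cross-port support. */
static int hairpin_queue_setup(const struct hairpin_conf *conf,
                               int pmd_supports_cross_port,
                               uint16_t local_port)
{
    if (conf->peer_count == 0 || conf->peer_count > MAX_PEERS)
        return -1;
    for (uint16_t i = 0; i < conf->peer_count; i++) {
        if (conf->peers[i].port != local_port &&
            !pmd_supports_cross_port)
            return -1; /* cross-port hairpin not supported */
    }
    return 0;
}
```

Passing the same structure to both setup sides is what lets one Rx queue fan out to several Tx queues (and vice versa) without a third binding call.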

From the PMD's perspective, the number of Rx/Tx queues is the total of
standard queues plus hairpin queues.

To configure a hairpin queue the user should call
rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup instead
of the normal queue setup functions.

The hairpin queues are not part of the normal RSS functionality.

To use the queues, the user simply creates a flow whose queue/RSS
actions point to hairpin queues.
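To illustrate how flow rules, standard queues, and hairpin queues fit together, here is a self-contained conceptual sketch. It is not DPDK code; the names are hypothetical, and the assumption that hairpin queues occupy the indices after the standard queues is purely illustrative:

```c
#include <stdint.h>

/* Hypothetical model: queue indices form one space of standard +
 * hairpin queues (the RFC states the PMD sees the total of both).
 * A flow rule that targets an index in the hairpin range causes the
 * packet to be hairpinned back to the wire in HW; a standard index
 * delivers it to the application as usual. */
enum pkt_path { TO_APPLICATION, HAIRPINNED_TO_WIRE, DROPPED };

struct port_model {
    uint16_t nb_std_rxq;
    uint16_t nb_hairpin_rxq;
};

static uint16_t total_rxq(const struct port_model *p)
{
    return (uint16_t)(p->nb_std_rxq + p->nb_hairpin_rxq);
}

/* Resolve where a packet ends up once a flow rule has routed it to
 * target_queue. */
static enum pkt_path dispatch(const struct port_model *p,
                              uint16_t target_queue)
{
    if (target_queue >= total_rxq(p))
        return DROPPED;            /* no such queue */
    if (target_queue >= p->nb_std_rxq)
        return HAIRPINNED_TO_WIRE; /* hairpin range: no CPU work */
    return TO_APPLICATION;         /* standard Rx queue */
}
```

The point of the model is that, from the flow API's perspective, a hairpin queue is addressed exactly like any other queue; only the packet's fate differs.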

[1]
http://inbox.dpdk.org/dev/AM4PR05MB3425E55B721A4090FCBE7D80DB1E0@AM4PR05MB3425.eurprd05.prod.outlook.com/

Signed-off-by: Ori Kam <orika@mellanox.com>
---
 lib/librte_ethdev/rte_ethdev.h | 124 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

Comments

Stephen Hemminger Aug. 13, 2019, 3:46 p.m. | #1
On Tue, 13 Aug 2019 13:37:48 +0000
Ori Kam <orika@mellanox.com> wrote:

> This RFC replaces RFC[1].
> 
> The hairpin feature (different name can be forward) acts as "bump on the wire",
> meaning that a packet that is received from the wire can be modified using
> offloaded action and then sent back to the wire without application intervention
> which save CPU cycles.
> 
> The hairpin is the inverse function of loopback in which application
> sends a packet then it is received again by the
> application without being sent to the wire.
> 
> The hairpin can be used by a number of different NVF, for example load
> balancer, gateway and so on.
> 
> As can be seen from the hairpin description, hairpin is basically RX queue
> connected to TX queue.
> 
> During the design phase I was thinking of two ways to implement this
> feature the first one is adding a new rte flow action. and the second
> one is create a special kind of queue.


Life would be easier for users if the hairpin was an attribute
of queue configuration, not a separate API call.
Ori Kam Aug. 14, 2019, 5:35 a.m. | #2
Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, August 13, 2019 6:46 PM
> To: Ori Kam <orika@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> On Tue, 13 Aug 2019 13:37:48 +0000
> Ori Kam <orika@mellanox.com> wrote:
> 
> > This RFC replaces RFC[1].
> >
> > The hairpin feature (different name can be forward) acts as "bump on the
> wire",
> > meaning that a packet that is received from the wire can be modified using
> > offloaded action and then sent back to the wire without application
> intervention
> > which save CPU cycles.
> >
> > The hairpin is the inverse function of loopback in which application
> > sends a packet then it is received again by the
> > application without being sent to the wire.
> >
> > The hairpin can be used by a number of different NVF, for example load
> > balancer, gateway and so on.
> >
> > As can be seen from the hairpin description, hairpin is basically RX queue
> > connected to TX queue.
> >
> > During the design phase I was thinking of two ways to implement this
> > feature the first one is adding a new rte flow action. and the second
> > one is create a special kind of queue.
> 
> 
> Life would be easier for users if the hairpin was an attribute
> of queue configuration, not a separate API call.

I was thinking about it. The reason I split the functions is that they use different
parameter sets. For example, a hairpin queue doesn't need a memory region, while it does need
the hairpin configuration. So in each case (hairpin queue / normal queue) there would be
parameters that are not in use. I think this is less preferred. What do you think?

Thanks,
Ori
Ori Kam Aug. 14, 2019, 6:05 a.m. | #3
> -----Original Message-----
> From: Ori Kam
> Sent: Wednesday, August 14, 2019 8:36 AM
> To: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <Alexr@mellanox.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> Hi Stephen,
> 
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Tuesday, August 13, 2019 6:46 PM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > <alexr@mellanox.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > On Tue, 13 Aug 2019 13:37:48 +0000
> > Ori Kam <orika@mellanox.com> wrote:
> >
> > > This RFC replaces RFC[1].
> > >
> > > The hairpin feature (different name can be forward) acts as "bump on the
> > wire",
> > > meaning that a packet that is received from the wire can be modified using
> > > offloaded action and then sent back to the wire without application
> > intervention
> > > which save CPU cycles.
> > >
> > > The hairpin is the inverse function of loopback in which application
> > > sends a packet then it is received again by the
> > > application without being sent to the wire.
> > >
> > > The hairpin can be used by a number of different NVF, for example load
> > > balancer, gateway and so on.
> > >
> > > As can be seen from the hairpin description, hairpin is basically RX queue
> > > connected to TX queue.
> > >
> > > During the design phase I was thinking of two ways to implement this
> > > feature the first one is adding a new rte flow action. and the second
> > > one is create a special kind of queue.
> >
> >
> > Life would be easier for users if the hairpin was an attribute
> > of queue configuration, not a separate API call.
> 
> I was thinking about it. the reason that I split the functions is that they use
> different
> parameters sets. For example the hairpin queue doesn't need memory region
> while it does need
> the hairpin configuration. So in each case hairpin queue / normal queue there
> will be
> parameters that are not in use. I think this is less preferred. What do you think?
> 

I forgot a few more reasons I had for this in my last mail:
1. Changing the existing function would break the API and force all applications to update.
2. Two APIs are easier to document and explain.
3. The reason stated above: there would be unused parameters in each call.

What do you think?


> Thanks,
> Ori
Stephen Hemminger Aug. 14, 2019, 2:56 p.m. | #4
On Wed, 14 Aug 2019 06:05:13 +0000
Ori Kam <orika@mellanox.com> wrote:

> > -----Original Message-----
> > From: Ori Kam
> > Sent: Wednesday, August 14, 2019 8:36 AM
> > To: Stephen Hemminger <stephen@networkplumber.org>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > <Alexr@mellanox.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > 
> > Hi Stephen,
> >   
> > > -----Original Message-----
> > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > Sent: Tuesday, August 13, 2019 6:46 PM
> > > To: Ori Kam <orika@mellanox.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> > > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > <alexr@mellanox.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > > On Tue, 13 Aug 2019 13:37:48 +0000
> > > Ori Kam <orika@mellanox.com> wrote:
> > >  
> > > > This RFC replaces RFC[1].
> > > >
> > > > The hairpin feature (different name can be forward) acts as "bump on the  
> > > wire",  
> > > > meaning that a packet that is received from the wire can be modified using
> > > > offloaded action and then sent back to the wire without application  
> > > intervention  
> > > > which save CPU cycles.
> > > >
> > > > The hairpin is the inverse function of loopback in which application
> > > > sends a packet then it is received again by the
> > > > application without being sent to the wire.
> > > >
> > > > The hairpin can be used by a number of different NVF, for example load
> > > > balancer, gateway and so on.
> > > >
> > > > As can be seen from the hairpin description, hairpin is basically RX queue
> > > > connected to TX queue.
> > > >
> > > > During the design phase I was thinking of two ways to implement this
> > > > feature the first one is adding a new rte flow action. and the second
> > > > one is create a special kind of queue.  
> > >
> > >
> > > Life would be easier for users if the hairpin was an attribute
> > > of queue configuration, not a separate API call.  
> > 
> > I was thinking about it. the reason that I split the functions is that they use
> > different
> > parameters sets. For example the hairpin queue doesn't need memory region
> > while it does need
> > the hairpin configuration. So in each case hairpin queue / normal queue there
> > will be
> > parameters that are not in use. I think this is less preferred. What do you think?
> >   
> 
> Forgot in my last mail two more reasons I had for this for this:
> 1. changing to existing function will break API, and will force all applications to update date.
> 2.  2 API are easier to document and explain.
> 3. the reason stated above that there will be unused parameters in each call.

New API's are like system calls, they create longer term support overhead.
It would be good if there was support for this on multiple NIC types.
Ori Kam Aug. 15, 2019, 4:41 a.m. | #5
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, August 14, 2019 5:56 PM
> To: Ori Kam <orika@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> On Wed, 14 Aug 2019 06:05:13 +0000
> Ori Kam <orika@mellanox.com> wrote:
> 
> > > -----Original Message-----
> > > From: Ori Kam
> > > Sent: Wednesday, August 14, 2019 8:36 AM
> > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> Slava
> > > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > <Alexr@mellanox.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > > Hi Stephen,
> > >
> > > > -----Original Message-----
> > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > Sent: Tuesday, August 13, 2019 6:46 PM
> > > > To: Ori Kam <orika@mellanox.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> Slava
> > > > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > > <alexr@mellanox.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > > >
> > > > On Tue, 13 Aug 2019 13:37:48 +0000
> > > > Ori Kam <orika@mellanox.com> wrote:
> > > >
> > > > > This RFC replaces RFC[1].
> > > > >
> > > > > The hairpin feature (different name can be forward) acts as "bump on
> the
> > > > wire",
> > > > > meaning that a packet that is received from the wire can be modified
> using
> > > > > offloaded action and then sent back to the wire without application
> > > > intervention
> > > > > which save CPU cycles.
> > > > >
> > > > > The hairpin is the inverse function of loopback in which application
> > > > > sends a packet then it is received again by the
> > > > > application without being sent to the wire.
> > > > >
> > > > > The hairpin can be used by a number of different NVF, for example load
> > > > > balancer, gateway and so on.
> > > > >
> > > > > As can be seen from the hairpin description, hairpin is basically RX
> queue
> > > > > connected to TX queue.
> > > > >
> > > > > During the design phase I was thinking of two ways to implement this
> > > > > feature the first one is adding a new rte flow action. and the second
> > > > > one is create a special kind of queue.
> > > >
> > > >
> > > > Life would be easier for users if the hairpin was an attribute
> > > > of queue configuration, not a separate API call.
> > >
> > > I was thinking about it. the reason that I split the functions is that they use
> > > different
> > > parameters sets. For example the hairpin queue doesn't need memory
> region
> > > while it does need
> > > the hairpin configuration. So in each case hairpin queue / normal queue
> there
> > > will be
> > > parameters that are not in use. I think this is less preferred. What do you
> think?
> > >
> >
> > Forgot in my last mail two more reasons I had for this for this:
> > 1. changing to existing function will break API, and will force all applications
> to update date.
> > 2.  2 API are easier to document and explain.
> > 3. the reason stated above that there will be unused parameters in each call.
> 
> New API's are like system calls, they create longer term support overhead.
> It would be good if there was support for this on multiple NIC types.

I don't know the capabilities of other NICs, but I think this is a good feature that can be embraced
and implemented by other NICs (maybe they can even have a SW implementation for this
that still uses the CPU but gives a faster packet rate, since they know how their HW works).
Regarding long-term support, I'm sorry but I don't see the support issue as that important, since for this
exact reason I think a dedicated API is much easier to maintain. Also, maybe in the future there will be
a new type, and then a generic function would have a lot of unused code, which is hard to maintain
and debug.

Thanks,
Ori
Ori Kam Aug. 25, 2019, 2:06 p.m. | #6
Hi Stephen,

Does my answer resolve your concerns?

Thanks,
Ori

> -----Original Message-----
> From: Ori Kam
> Sent: Thursday, August 15, 2019 7:42 AM
> To: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <Alexr@mellanox.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> 
> 
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Wednesday, August 14, 2019 5:56 PM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > <alexr@mellanox.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > On Wed, 14 Aug 2019 06:05:13 +0000
> > Ori Kam <orika@mellanox.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Ori Kam
> > > > Sent: Wednesday, August 14, 2019 8:36 AM
> > > > To: Stephen Hemminger <stephen@networkplumber.org>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> > Slava
> > > > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > > <Alexr@mellanox.com>; dev@dpdk.org
> > > > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > > >
> > > > Hi Stephen,
> > > >
> > > > > -----Original Message-----
> > > > > From: Stephen Hemminger <stephen@networkplumber.org>
> > > > > Sent: Tuesday, August 13, 2019 6:46 PM
> > > > > To: Ori Kam <orika@mellanox.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > > > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> > Slava
> > > > > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > > > <alexr@mellanox.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > > > >
> > > > > On Tue, 13 Aug 2019 13:37:48 +0000
> > > > > Ori Kam <orika@mellanox.com> wrote:
> > > > >
> > > > > > This RFC replaces RFC[1].
> > > > > >
> > > > > > The hairpin feature (different name can be forward) acts as "bump on
> > the
> > > > > wire",
> > > > > > meaning that a packet that is received from the wire can be modified
> > using
> > > > > > offloaded action and then sent back to the wire without application
> > > > > intervention
> > > > > > which save CPU cycles.
> > > > > >
> > > > > > The hairpin is the inverse function of loopback in which application
> > > > > > sends a packet then it is received again by the
> > > > > > application without being sent to the wire.
> > > > > >
> > > > > > The hairpin can be used by a number of different NVF, for example
> load
> > > > > > balancer, gateway and so on.
> > > > > >
> > > > > > As can be seen from the hairpin description, hairpin is basically RX
> > queue
> > > > > > connected to TX queue.
> > > > > >
> > > > > > During the design phase I was thinking of two ways to implement this
> > > > > > feature the first one is adding a new rte flow action. and the second
> > > > > > one is create a special kind of queue.
> > > > >
> > > > >
> > > > > Life would be easier for users if the hairpin was an attribute
> > > > > of queue configuration, not a separate API call.
> > > >
> > > > I was thinking about it. the reason that I split the functions is that they use
> > > > different
> > > > parameters sets. For example the hairpin queue doesn't need memory
> > region
> > > > while it does need
> > > > the hairpin configuration. So in each case hairpin queue / normal queue
> > there
> > > > will be
> > > > parameters that are not in use. I think this is less preferred. What do you
> > think?
> > > >
> > >
> > > Forgot in my last mail two more reasons I had for this for this:
> > > 1. changing to existing function will break API, and will force all applications
> > to update date.
> > > 2.  2 API are easier to document and explain.
> > > 3. the reason stated above that there will be unused parameters in each
> call.
> >
> > New API's are like system calls, they create longer term support overhead.
> > It would be good if there was support for this on multiple NIC types.
> 
> I don't know the capability of other NICs. I think this is a good feature that can
> be embrace
> and implemented by other NICS (may be they can even have some SW
> implementation for this
> that will still use CPU but will give faster packet rate since they know how their
> HW works)
> Regarding the long term support, I'm sorry but I don't see the longer support
> issue that important since for this
> exact reason I think a dedicated API is much easer to maintain. Also my be in
> future there will be
> a new type and then the generic function will have a lot of unused code which
> is hard to maintain
> and debug.
> 
> Thanks,
> Ori
Wu, Jingjing Sept. 5, 2019, 4 a.m. | #7
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ori Kam
> Sent: Tuesday, August 13, 2019 9:38 PM
> To: thomas@monjalon.net; Yigit, Ferruh <ferruh.yigit@intel.com>;
> arybchenko@solarflare.com; shahafs@mellanox.com; viacheslavo@mellanox.com;
> alexr@mellanox.com
> Cc: dev@dpdk.org; orika@mellanox.com
> Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> This RFC replaces RFC[1].
> 
> The hairpin feature (different name can be forward) acts as "bump on the wire",
> meaning that a packet that is received from the wire can be modified using
> offloaded action and then sent back to the wire without application intervention
> which save CPU cycles.
> 
> The hairpin is the inverse function of loopback in which application
> sends a packet then it is received again by the
> application without being sent to the wire.
> 
> The hairpin can be used by a number of different NVF, for example load
> balancer, gateway and so on.
> 
> As can be seen from the hairpin description, hairpin is basically RX queue
> connected to TX queue.
> 
> During the design phase I was thinking of two ways to implement this
> feature the first one is adding a new rte flow action. and the second
> one is create a special kind of queue.
> 
> The advantages of using the queue approch:
> 1. More control for the application. queue depth (the memory size that
> should be used).
> 2. Enable QoS. QoS is normaly a parametr of queue, so in this approch it
> will be easy to integrate with such system.


Which kind of QoS?

> 3. Native integression with the rte flow API. Just setting the target
> queue/rss to hairpin queue, will result that the traffic will be routed
> to the hairpin queue.
> 4. Enable queue offloading.
> 
Looks like the hairpin queue is just a hardware queue; it has no relationship with host memory. That makes the queue concept a little confusing. And why do we need to set up queues? Maybe some info in eth_conf would be enough.

I am not sure how your hardware makes the hairpin work. Is rte_flow used for the packet modification offload? Then how does the HW distribute packets to those hardware queues: classification? If so, why not just extend rte_flow with a hairpin action?

> Each hairpin Rxq can be connected Txq / number of Txqs which can belong to a
> different ports assuming the PMD supports it. The same goes the other
> way each hairpin Txq can be connected to one or more Rxqs.
> This is the reason that both the Txq setup and Rxq setup are getting the
> hairpin configuration structure.
> 
> From PMD prespctive the number of Rxq/Txq is the total of standard
> queues + hairpin queues.
> 
> To configure hairpin queue the user should call
> rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup insteed
> of the normal queue setup functions.

If the new API is introduced to avoid an ABI change, would one API, rte_eth_rx_hairpin_setup, be enough?

Thanks
Jingjing
Ori Kam Sept. 5, 2019, 5:44 a.m. | #8
Hi Wu, 
Thanks for your comments, please see my answers below.

Ori

> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Thursday, September 5, 2019 7:01 AM
> To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ori Kam
> > Sent: Tuesday, August 13, 2019 9:38 PM
> > To: thomas@monjalon.net; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > arybchenko@solarflare.com; shahafs@mellanox.com;
> viacheslavo@mellanox.com;
> > alexr@mellanox.com
> > Cc: dev@dpdk.org; orika@mellanox.com
> > Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > This RFC replaces RFC[1].
> >
> > The hairpin feature (different name can be forward) acts as "bump on the
> wire",
> > meaning that a packet that is received from the wire can be modified using
> > offloaded action and then sent back to the wire without application
> intervention
> > which save CPU cycles.
> >
> > The hairpin is the inverse function of loopback in which application
> > sends a packet then it is received again by the
> > application without being sent to the wire.
> >
> > The hairpin can be used by a number of different NVF, for example load
> > balancer, gateway and so on.
> >
> > As can be seen from the hairpin description, hairpin is basically RX queue
> > connected to TX queue.
> >
> > During the design phase I was thinking of two ways to implement this
> > feature the first one is adding a new rte flow action. and the second
> > one is create a special kind of queue.
> >
> > The advantages of using the queue approch:
> > 1. More control for the application. queue depth (the memory size that
> > should be used).
> > 2. Enable QoS. QoS is normaly a parametr of queue, so in this approch it
> > will be easy to integrate with such system.
> 
> 
> Which kind of QoS?

For example latency and packet rate; those kinds of parameters make sense at the queue level.
I know we don't have any current support, but I think we will have some during the next year.

> 
> > 3. Native integression with the rte flow API. Just setting the target
> > queue/rss to hairpin queue, will result that the traffic will be routed
> > to the hairpin queue.
> > 4. Enable queue offloading.
> >
> Looks like the hairpin queue is just hardware queue, it has no relationship with
> host memory. It makes the queue concept a little bit confusing. And why do we
> need to setup queues, maybe some info in eth_conf is enough?

As stated above, it makes sense to have queue-related parameters.
For example, I can think of an application where most packets go through the hairpin queue, but some control packets come
from the application. The application can then configure the QoS between those two queues. In addition, this enables the application
to use the queue like a normal queue from rte_flow (see comment below) and in every other aspect.
 
> 
> Not sure how your hardware make the hairpin work? Use rte_flow for packet
> modification offload? Then how does HW distribute packets to those hardware
> queue, classification? If So, why not just extend rte_flow with the hairpin
> action?
> 

You are correct; the application uses rte_flow and just points the traffic to the requested hairpin queue/RSS.
We could have added a new rte_flow command. The reasons we didn't:
1. As stated above, some of the hairpin configuration makes sense at the queue level.
2. In the near future we will also want to support hairpin between different ports. That makes much more
sense using queues.
  
> > Each hairpin Rxq can be connected Txq / number of Txqs which can belong to
> a
> > different ports assuming the PMD supports it. The same goes the other
> > way each hairpin Txq can be connected to one or more Rxqs.
> > This is the reason that both the Txq setup and Rxq setup are getting the
> > hairpin configuration structure.
> >
> > From PMD prespctive the number of Rxq/Txq is the total of standard
> > queues + hairpin queues.
> >
> > To configure hairpin queue the user should call
> > rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup insteed
> > of the normal queue setup functions.
> 
> If the new API introduced to avoid ABI change, would one API
> rte_eth_rx_hairpin_setup be enough?

I'm not sure I understand your comment.
The rx_hairpin_setup function was created for two main reasons:
1. To avoid an API change.
2. I think it is more correct to use a different API since the parameters are different.

The reason we have both Rx and Tx setup functions is that we want the user to have control over binding the two queues.
This is most important when we advance to hairpin between ports.

> 
> Thanks
> Jingjing

Thanks,
Ori
Wu, Jingjing Sept. 6, 2019, 3:08 a.m. | #9
Hi, Ori

Thanks for the explanation. I have more question below.

Thanks
Jingjing

> -----Original Message-----
> From: Ori Kam [mailto:orika@mellanox.com]
> Sent: Thursday, September 5, 2019 1:45 PM
> To: Wu, Jingjing <jingjing.wu@intel.com>; Thomas Monjalon <thomas@monjalon.net>;
> Yigit, Ferruh <ferruh.yigit@intel.com>; arybchenko@solarflare.com; Shahaf Shuler
> <shahafs@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>; Alex
> Rosenbaum <alexr@mellanox.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> Hi Wu,
> Thanks for your comments PSB,
> 
> Ori
> 
> > -----Original Message-----
> > From: Wu, Jingjing <jingjing.wu@intel.com>
> > Sent: Thursday, September 5, 2019 7:01 AM
> > To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > <alexr@mellanox.com>
> > Cc: dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ori Kam
> > > Sent: Tuesday, August 13, 2019 9:38 PM
> > > To: thomas@monjalon.net; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > arybchenko@solarflare.com; shahafs@mellanox.com;
> > viacheslavo@mellanox.com;
> > > alexr@mellanox.com
> > > Cc: dev@dpdk.org; orika@mellanox.com
> > > Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > > This RFC replaces RFC[1].
> > >
> > > The hairpin feature (different name can be forward) acts as "bump on the
> > wire",
> > > meaning that a packet that is received from the wire can be modified using
> > > offloaded action and then sent back to the wire without application
> > intervention
> > > which save CPU cycles.
> > >
> > > The hairpin is the inverse function of loopback in which application
> > > sends a packet then it is received again by the
> > > application without being sent to the wire.
> > >
> > > The hairpin can be used by a number of different NVF, for example load
> > > balancer, gateway and so on.
> > >
> > > As can be seen from the hairpin description, hairpin is basically RX queue
> > > connected to TX queue.
> > >
> > > During the design phase I was thinking of two ways to implement this
> > > feature the first one is adding a new rte flow action. and the second
> > > one is create a special kind of queue.
> > >
> > > The advantages of using the queue approch:
> > > 1. More control for the application. queue depth (the memory size that
> > > should be used).
> > > 2. Enable QoS. QoS is normaly a parametr of queue, so in this approch it
> > > will be easy to integrate with such system.
> >
> >
> > Which kind of QoS?
> 
> For example latency and packet rate; those kinds of parameters make sense at the queue level.
> I know we don't have any current support, but I think we will have it during the next year.
> 
Where would the QoS API live? The TM API? Or do you propose something new?
> >
> > > 3. Native integression with the rte flow API. Just setting the target
> > > queue/rss to hairpin queue, will result that the traffic will be routed
> > > to the hairpin queue.
> > > 4. Enable queue offloading.
> > >
> > Looks like the hairpin queue is just hardware queue, it has no relationship with
> > host memory. It makes the queue concept a little bit confusing. And why do we
> > need to setup queues, maybe some info in eth_conf is enough?
> 
> Like stated above, it makes sense to have queue-related parameters.
> For example, I can think of an application where most packets go through the hairpin
> queue, but some control packets come
> from the application. So the application can configure the QoS between those two
> queues. In addition, this will enable the application
> to use the queue like a normal queue from rte_flow (see comment below) and in every
> other aspect.
> 
> Yes, it is a typical use case. And rte_flow is used to classify to different queues?
> If I understand correctly, your hairpin queue uses host memory or on-card memory for buffering, but the CPU cannot touch it; all the packet processing is done by the NIC.
> When the queue is created, where is the queue ID used? The Tx queue ID may be used as an action of rte_flow? I still don't understand where the hairpin Rx queue ID would be used.
> In my opinion, if there is no rx/tx function, it is not a true queue from the host's view.

> >
> > Not sure how your hardware make the hairpin work? Use rte_flow for packet
> > modification offload? Then how does HW distribute packets to those hardware
> > queue, classification? If So, why not just extend rte_flow with the hairpin
> > action?
> >
> 
> You are correct, the application uses rte_flow and just points the traffic to the
> requested hairpin queue/rss.
> We could have added a new rte_flow command. The reasons we didn't:
> 1. Like stated above, some of the hairpin configuration makes sense at the queue level.
> 2. In the near future, we will also want to support hairpin between different ports.
> This makes much more sense using queues.
> 
> > > Each hairpin Rxq can be connected Txq / number of Txqs which can belong to
> > a
> > > different ports assuming the PMD supports it. The same goes the other
> > > way each hairpin Txq can be connected to one or more Rxqs.
> > > This is the reason that both the Txq setup and Rxq setup are getting the
> > > hairpin configuration structure.
> > >
> > > From PMD prespctive the number of Rxq/Txq is the total of standard
> > > queues + hairpin queues.
> > >
> > > To configure hairpin queue the user should call
> > > rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup insteed
> > > of the normal queue setup functions.
> >
> > If the new API is introduced to avoid ABI change, would one API
> > rte_eth_rx_hairpin_setup be enough?
> 
> I'm not sure I understand your comment.
> The rx_hairpin_setup was created for two main reasons:
> 1. To avoid ABI breakage.
> 2. I think it is more correct to use a different API since the parameters are different.
> 
I mean not using the queue setup concept, but setting the hairpin feature through one hairpin configuration API.

> The reason we have both rx and tx setup functions is that we want the user to have
> control over binding the two queues.
> It will be most important when we advance to hairpin between ports.

Hairpin between ports? That sounds like a switch rather than a hairpin, right?
> 
> >
> > Thanks
> > Jingjing
> 
> Thanks,
> Ori
Ori Kam Sept. 8, 2019, 6:44 a.m. | #10
Hi Jingjing,

PSB

> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Friday, September 6, 2019 6:08 AM
> To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> 
> Hi, Ori
> 
> Thanks for the explanation. I have more question below.
> 
> Thanks
> Jingjing
> 
> > -----Original Message-----
> > From: Ori Kam [mailto:orika@mellanox.com]
> > Sent: Thursday, September 5, 2019 1:45 PM
> > To: Wu, Jingjing <jingjing.wu@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>;
> > Yigit, Ferruh <ferruh.yigit@intel.com>; arybchenko@solarflare.com; Shahaf
> Shuler
> > <shahafs@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>;
> Alex
> > Rosenbaum <alexr@mellanox.com>
> > Cc: dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > Hi Wu,
> > Thanks for your comments PSB,
> >
> > Ori
> >
> > > -----Original Message-----
> > > From: Wu, Jingjing <jingjing.wu@intel.com>
> > > Sent: Thursday, September 5, 2019 7:01 AM
> > > To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > arybchenko@solarflare.com; Shahaf Shuler <shahafs@mellanox.com>;
> Slava
> > > Ovsiienko <viacheslavo@mellanox.com>; Alex Rosenbaum
> > > <alexr@mellanox.com>
> > > Cc: dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ori Kam
> > > > Sent: Tuesday, August 13, 2019 9:38 PM
> > > > To: thomas@monjalon.net; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > > arybchenko@solarflare.com; shahafs@mellanox.com;
> > > viacheslavo@mellanox.com;
> > > > alexr@mellanox.com
> > > > Cc: dev@dpdk.org; orika@mellanox.com
> > > > Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > > >
> > > > This RFC replaces RFC[1].
> > > >
> > > > The hairpin feature (different name can be forward) acts as "bump on the
> > > wire",
> > > > meaning that a packet that is received from the wire can be modified
> using
> > > > offloaded action and then sent back to the wire without application
> > > intervention
> > > > which save CPU cycles.
> > > >
> > > > The hairpin is the inverse function of loopback in which application
> > > > sends a packet then it is received again by the
> > > > application without being sent to the wire.
> > > >
> > > > The hairpin can be used by a number of different NVF, for example load
> > > > balancer, gateway and so on.
> > > >
> > > > As can be seen from the hairpin description, hairpin is basically RX queue
> > > > connected to TX queue.
> > > >
> > > > During the design phase I was thinking of two ways to implement this
> > > > feature the first one is adding a new rte flow action. and the second
> > > > one is create a special kind of queue.
> > > >
> > > > The advantages of using the queue approch:
> > > > 1. More control for the application. queue depth (the memory size that
> > > > should be used).
> > > > 2. Enable QoS. QoS is normaly a parametr of queue, so in this approch it
> > > > will be easy to integrate with such system.
> > >
> > >
> > > Which kind of QoS?
> >
> > For example latency , packet rate those kinds of makes sense in the queue
> level.
> > I know we don't have any current support but I think we will have during the
> next year.
> >
> Where would be the QoS API loading? TM API? Or propose other new?

I think it will be a new API. The TM API is more about limiting the bandwidth of a target flow, while
QoS should influence the priority between the queues.

> > >
> > > > 3. Native integression with the rte flow API. Just setting the target
> > > > queue/rss to hairpin queue, will result that the traffic will be routed
> > > > to the hairpin queue.
> > > > 4. Enable queue offloading.
> > > >
> > > Looks like the hairpin queue is just hardware queue, it has no relationship
> with
> > > host memory. It makes the queue concept a little bit confusing. And why do
> we
> > > need to setup queues, maybe some info in eth_conf is enough?
> >
> > Like stated above it makes sense to have queue related parameters.
> > For example I can think of application that most packets are going threw that
> hairpin
> > queue, but some control packets are
> > from the application. So the application can configure the QoS between those
> two
> > queues. In addtion this will enable the application
> > to use the queue like normal queue from rte_flow (see comment below) and
> every other
> > aspect.
> >
> Yes, it is typical use case. And rte_flow is used to classify to different queue?
> If I understand correct, your hairpin queue is using host memory/or on-card
> memory for buffering, but CPU cannot touch it, all the packet processing is
> done by NIC.
> Queue is created, where the queue ID is used? Tx queue ID may be used as
> action of rte_flow? I still don't understand where the hairpin Rx queue ID be
> used.
> In my opinion, if no rx/tx function, it should not be a true queue from host view.
> 

Yes, rte_flow is used to classify the traffic between the queues. In the basic usage, to use the hairpin
feature the application just inserts any ingress flow whose target queue/RSS is a hairpin queue.
For example, assuming that queue index 4 is a hairpin queue, a hairpin rule will look something like this:
Flow create 0 ingress group 0 pattern eth / ipv4 .... / end actions decap / encap / queue index 4 / end

I understand your point that if there is no rx/tx function it is not a queue, but I don't agree with it.
With a hairpin queue we are offloading the data path. Unrelated to this RFC, we are working on a vDPA driver.
It is not an ethdev driver, but what it does is offload vhost, including the enqueue and dequeue functions[1].

> > >
> > > Not sure how your hardware make the hairpin work? Use rte_flow for
> packet
> > > modification offload? Then how does HW distribute packets to those
> hardware
> > > queue, classification? If So, why not just extend rte_flow with the hairpin
> > > action?
> > >
> >
> > You are correct, the application uses rte_flow and just points the traffic to the
> requested
> > hairpin queue/rss.
> > We could have added a new rte_flow command. The reasons we didn't:
> > 1. Like stated above some of the hairpin makes sense in queue level.
> > 2.  In the near future, we will also want to support hairpin between different
> ports. This
> > makes much more
> > sense using queues.
> >
> > > > Each hairpin Rxq can be connected Txq / number of Txqs which can
> belong to
> > > a
> > > > different ports assuming the PMD supports it. The same goes the other
> > > > way each hairpin Txq can be connected to one or more Rxqs.
> > > > This is the reason that both the Txq setup and Rxq setup are getting the
> > > > hairpin configuration structure.
> > > >
> > > > From PMD prespctive the number of Rxq/Txq is the total of standard
> > > > queues + hairpin queues.
> > > >
> > > > To configure hairpin queue the user should call
> > > > rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup
> insteed
> > > > of the normal queue setup functions.
> > >
> > > If the new API introduced to avoid ABI change, would one API
> > > rte_eth_rx_hairpin_setup be enough?
> >
> > I'm not sure I understand your comment.
> > The rx_hairpin_setup was created for two main reasons:
> > 1. Avoid API change.
> > 2. I think it is more correct to use different API since the parameters are
> different.
> >
> I mean not use queue setup concept, set hairpin feature through one hairpin
> configuration API.
> 

I'm not sure I understand.
Would an API that looks something like this be better?
int hairpin_bind(uint16_t rx_port, uint16_t rx_queue, struct hairpin_conf *rx_hairpin_conf,
                 uint16_t tx_port, uint16_t tx_queue, struct hairpin_conf *tx_hairpin_conf);

The problem with such an API is that it will cause issues for NICs that support one-to-many connections,
for example a NIC that can connect one Rx queue to 4 Tx queues.
Also, we still need to configure the hairpin queues themselves. So if I understand you correctly,
the hairpin queues will not be set up separately; this one API will set them up.
 
> > The reason we have both rx and tx setup functions is that we want the user to
> have
> > control binding the two queues.
> > It is most important when we will advance to hairpin between ports.
> 
> Hairpin between ports? It looks like switch but not hairpin, right?

A switch, from my understanding, is between VMs, meaning traffic sent from one VM is routed
directly to the target VM. This is not the case for hairpin. In hairpin, traffic comes from the wire and goes
back to the wire; there are no VMs in the system. Example applications for hairpin are load balancers or gateways,
where, for example, one port is connected to one system and the second port is connected to a second system.
It is the job of the application to check whether the packet should pass and, if so, modify it to match the second system.
For example, moving a VXLAN tunnel packet to an MPLS tunnel in the other system.
  
> >
> > >
> > > Thanks
> > > Jingjing
> >
> > Thanks,
> > Ori

Thanks,
Ori

Patch

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index dc6596b..fb54162 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -804,6 +804,15 @@  struct rte_eth_txconf {
 };
 
 /**
+ * A structure used to configure hairpin binding.
+ */
+struct rte_eth_hairpin_conf {
+	uint16_t peer_n; /**< The number of peer ports and queues. */
+	uint16_t (*ports)[]; /**< The peer ports. */
+	uint16_t (*queues)[]; /**< The peer queues. */
+};
+
+/**
  * A structure contains information about HW descriptor ring limitations.
  */
 struct rte_eth_desc_lim {
@@ -1013,6 +1022,7 @@  struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
+#define DEV_RX_OFFLOAD_HAIRPIN		0x00080000
 
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
@@ -1075,6 +1085,7 @@  struct rte_eth_conf {
  * Application must set PKT_TX_METADATA and mbuf metadata field.
  */
 #define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
+#define DEV_TX_OFFLOAD_HAIRPIN		0x00400000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
@@ -1769,6 +1780,56 @@  int rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		struct rte_mempool *mb_pool);
 
 /**
+ * Allocate and set up a hairpin receive queue for an Ethernet device.
+ *
+ * The function sets up the selected queue to be used in hairpin.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param rx_queue_id
+ *   The index of the receive queue to set up.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param nb_rx_desc
+ *   The number of receive descriptors to allocate for the receive ring.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in case of NUMA.
+ *   The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for
+ *   the DMA memory allocated for the receive descriptors of the ring.
+ * @param rx_conf
+ *   The pointer to the configuration data to be used for the receive queue.
+ *   NULL value is allowed, in which case default RX configuration
+ *   will be used.
+ *   The *rx_conf* structure contains an *rx_thresh* structure with the values
+ *   of the Prefetch, Host, and Write-Back threshold registers of the receive
+ *   ring.
+ *   In addition it contains the hardware offloads features to activate using
+ *   the DEV_RX_OFFLOAD_* flags.
+ *   If an offloading set in rx_conf->offloads
+ *   hasn't been set in the input argument eth_conf->rxmode.offloads
+ *   to rte_eth_dev_configure(), it is a new added offloading, it must be
+ *   per-queue type and it is enabled for the queue.
+ *   No need to repeat any bit in rx_conf->offloads which has already been
+ *   enabled in rte_eth_dev_configure() at port level. An offloading enabled
+ *   at port level can't be disabled at queue level.
+ * @param hairpin_conf
+ *   The pointer to the hairpin binding configuration.
+ * @return
+ *   - 0: Success, receive queue correctly set up.
+ *   - -EINVAL: The size of network buffers which can be allocated from the
+ *      memory pool does not fit the various buffer sizes allowed by the
+ *      device controller.
+ *   - -ENOMEM: Unable to allocate the receive ring descriptors or to
+ *      allocate network memory buffers from the memory pool when
+ *      initializing receive descriptors.
+ */
+int rte_eth_rx_hairpin_queue_setup
+	(uint16_t port_id, uint16_t rx_queue_id,
+	 uint16_t nb_rx_desc, unsigned int socket_id,
+	 const struct rte_eth_rxconf *rx_conf,
+	 const struct rte_eth_hairpin_conf *hairpin_conf);
+
+/**
  * Allocate and set up a transmit queue for an Ethernet device.
  *
  * @param port_id
@@ -1821,6 +1882,69 @@  int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
 		const struct rte_eth_txconf *tx_conf);
 
 /**
+ * Allocate and set up a transmit hairpin queue for an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param tx_queue_id
+ *   The index of the transmit queue to set up.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param nb_tx_desc
+ *   The number of transmit descriptors to allocate for the transmit ring.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in case of NUMA.
+ *   Its value can be *SOCKET_ID_ANY* if there is no NUMA constraint for
+ *   the DMA memory allocated for the transmit descriptors of the ring.
+ * @param tx_conf
+ *   The pointer to the configuration data to be used for the transmit queue.
+ *   NULL value is allowed, in which case default TX configuration
+ *   will be used.
+ *   The *tx_conf* structure contains the following data:
+ *   - The *tx_thresh* structure with the values of the Prefetch, Host, and
+ *     Write-Back threshold registers of the transmit ring.
+ *     When setting the Write-Back threshold to a value greater than zero,
+ *     the *tx_rs_thresh* value should be explicitly set to one.
+ *   - The *tx_free_thresh* value indicates the [minimum] number of network
+ *     buffers that must be pending in the transmit ring to trigger their
+ *     [implicit] freeing by the driver transmit function.
+ *   - The *tx_rs_thresh* value indicates the [minimum] number of transmit
+ *     descriptors that must be pending in the transmit ring before setting the
+ *     RS bit on a descriptor by the driver transmit function.
+ *     The *tx_rs_thresh* value should be less than or equal to the
+ *     *tx_free_thresh* value, and both of them should be less than
+ *     *nb_tx_desc* - 3.
+ *   - The *txq_flags* member contains flags to pass to the TX queue setup
+ *     function to configure the behavior of the TX queue. This should be set
+ *     to 0 if no special configuration is required.
+ *     This API is obsolete and will be deprecated. Applications
+ *     should set it to ETH_TXQ_FLAGS_IGNORE and use
+ *     the offloads field below.
+ *   - The *offloads* member contains Tx offloads to be enabled.
+ *     If an offloading set in tx_conf->offloads
+ *     hasn't been set in the input argument eth_conf->txmode.offloads
+ *     to rte_eth_dev_configure(), it is a new added offloading, it must be
+ *     per-queue type and it is enabled for the queue.
+ *     No need to repeat any bit in tx_conf->offloads which has already been
+ *     enabled in rte_eth_dev_configure() at port level. An offloading enabled
+ *     at port level can't be disabled at queue level.
+ *
+ *     Note that setting *tx_free_thresh* or *tx_rs_thresh* value to 0 forces
+ *     the transmit function to use default values.
+ * @param hairpin_conf
+ *   The hairpin binding configuration.
+ *
+ * @return
+ *   - 0: Success, the transmit queue is correctly set up.
+ *   - -ENOMEM: Unable to allocate the transmit ring descriptors.
+ */
+int rte_eth_tx_hairpin_queue_setup
+	(uint16_t port_id, uint16_t tx_queue_id,
+	 uint16_t nb_tx_desc, unsigned int socket_id,
+	 const struct rte_eth_txconf *tx_conf,
+	 const struct rte_eth_hairpin_conf *hairpin_conf);
+
+/**
  * Return the NUMA socket to which an Ethernet device is connected
  *
  * @param port_id