[dpdk-dev,v2,1/2] ethdev: add buffered tx api

Message ID 1456333729-3804-2-git-send-email-tomaszx.kulasek@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Commit Message

Tomasz Kulasek Feb. 24, 2016, 5:08 p.m. UTC
  Many sample apps include internal buffering for single-packet-at-a-time
operation. Since this is such a common paradigm, this functionality is
better suited to being implemented in the ethdev API.

The new APIs in the ethdev library are:
* rte_eth_tx_buffer_init - initialize buffer
* rte_eth_tx_buffer - buffer up a single packet for future transmission
* rte_eth_tx_buffer_flush - flush any unsent buffered packets
* rte_eth_tx_buffer_set_err_callback - set up a callback to be called in
  case transmitting a buffered burst fails. By default, we just free the
  unsent packets.

As well as these, an additional reference callback is provided, which
frees the packets (as the default callback does), as well as updating a
user-provided counter, so that the number of dropped packets can be
tracked.

v2 changes:
 - reworked to use new buffer model
 - buffer data and callbacks are removed from rte_eth_dev/rte_eth_dev_data,
   so this patch doesn't brake an ABI anymore
 - introduced RTE_ETH_TX_BUFFER macro and rte_eth_tx_buffer_init
 - buffers are not attached to the port-queue
 - buffers can be allocated dynamically during application work
 - size of buffer can be changed without port restart

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
 lib/librte_ether/rte_ethdev.c          |   36 +++++++
 lib/librte_ether/rte_ethdev.h          |  182 +++++++++++++++++++++++++++++++-
 lib/librte_ether/rte_ether_version.map |    9 ++
 3 files changed, 226 insertions(+), 1 deletion(-)
  

Comments

Thomas Monjalon March 8, 2016, 10:52 p.m. UTC | #1
Hi,

It is an overlay on the tx burst API.
Probably it doesn't hurt to add it but we have to be really cautious
with the API definition to try keeping it stable in the future.

2016-02-24 18:08, Tomasz Kulasek:
> +/**
> + * Structure used to buffer packets for future TX
> + * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush
> + */
> +struct rte_eth_dev_tx_buffer {
> +	unsigned nb_pkts;

What about "length"?
Why is it unsigned and the size is uint16_t?

> +	uint64_t errors;
> +	/**< Total number of queue packets to sent that are dropped. */

The errors are passed as userdata to the default callback.
If we really want to have this kind of counter, we can define our
own callback. So why defining this field as standard?
I would like to keep it as simple as possible.

> +	buffer_tx_error_fn cbfn;

Why not simply "callback" as name?

> +	void *userdata;
> +	uint16_t size;           /**< Size of buffer for buffered tx */
> +	struct rte_mbuf *pkts[];
> +};

What is the benefit of exposing this structure in the API,
except that it is used in some inline functions?

> +static inline uint16_t
> +rte_eth_tx_buffer_flush(uint8_t port_id, uint16_t queue_id,
> +		struct rte_eth_dev_tx_buffer *buffer)
> +{
> +	uint16_t sent;
> +
> +	uint16_t to_send = buffer->nb_pkts;
> +
> +	if (to_send == 0)
> +		return 0;

Why this check is done in the lib?
What is the performance gain if we are idle?
It can be done outside if needed.

> +	sent = rte_eth_tx_burst(port_id, queue_id, buffer->pkts, to_send);
> +
> +	buffer->nb_pkts = 0;
> +
> +	/* All packets sent, or to be dealt with by callback below */
> +	if (unlikely(sent != to_send))
> +		buffer->cbfn(&buffer->pkts[sent], to_send - sent,
> +				buffer->userdata);
> +
> +	return sent;
> +}
[...]
> +/**
> + * Callback function for tracking unsent buffered packets.
> + *
> + * This function can be passed to rte_eth_tx_buffer_set_err_callback() to
> + * adjust the default behaviour when buffered packets cannot be sent. This
> + * function drops any unsent packets, but also updates a user-supplied counter
> + * to track the overall number of packets dropped. The counter should be an
> + * uint64_t variable.
> + *
> + * NOTE: this function should not be called directly, instead it should be used
> + *       as a callback for packet buffering.
> + *
> + * NOTE: when configuring this function as a callback with
> + *       rte_eth_tx_buffer_set_err_callback(), the final, userdata parameter
> + *       should point to an uint64_t value.

Please forget this idea of counter in the default callback.

[...]
> +void
> +rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t unsent,
> +		void *userdata);

What about rte_eth_tx_buffer_default_callback as name?
  
Ananyev, Konstantin March 9, 2016, 1:36 p.m. UTC | #2
Hi Thomas,

> 
> Hi,
> 
> It is an overlay on the tx burst API.
> Probably it doesn't hurt to add it but we have to be really cautious
> with the API definition to try keeping it stable in the future.
> 
> 2016-02-24 18:08, Tomasz Kulasek:
> > +/**
> > + * Structure used to buffer packets for future TX
> > + * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush
> > + */
> > +struct rte_eth_dev_tx_buffer {
> > +	unsigned nb_pkts;
> 
> What about "length"?
> Why is it unsigned and the size is uint16_t?

Good point,  yes need to make it consistent.

> 
> > +	uint64_t errors;
> > +	/**< Total number of queue packets to sent that are dropped. */
> 
> The errors are passed as userdata to the default callback.
> If we really want to have this kind of counter, we can define our
> own callback. So why defining this field as standard?
> I would like to keep it as simple as possible.
> 
> > +	buffer_tx_error_fn cbfn;
> 
> Why not simply "callback" as name?
> 
> > +	void *userdata;
> > +	uint16_t size;           /**< Size of buffer for buffered tx */
> > +	struct rte_mbuf *pkts[];
> > +};
> 
> What is the benefit of exposing this structure in the API,
> except that it is used in some inline functions?
> 

I described the benefits, I think it provides here:
http://dpdk.org/ml/archives/dev/2016-February/033058.html

> > +static inline uint16_t
> > +rte_eth_tx_buffer_flush(uint8_t port_id, uint16_t queue_id,
> > +		struct rte_eth_dev_tx_buffer *buffer)
> > +{
> > +	uint16_t sent;
> > +
> > +	uint16_t to_send = buffer->nb_pkts;
> > +
> > +	if (to_send == 0)
> > +		return 0;
> 
> Why this check is done in the lib?
> What is the performance gain if we are idle?
> It can be done outside if needed.

Yes, that could be done outside, but if user has to do it anyway,
why not to put it inside?
I don't expect any performance gain/loss because of that -
just seems a bit more convenient to the user.

Konstantin

> 
> > +	sent = rte_eth_tx_burst(port_id, queue_id, buffer->pkts, to_send);
> > +
> > +	buffer->nb_pkts = 0;
> > +
> > +	/* All packets sent, or to be dealt with by callback below */
> > +	if (unlikely(sent != to_send))
> > +		buffer->cbfn(&buffer->pkts[sent], to_send - sent,
> > +				buffer->userdata);
> > +
> > +	return sent;
> > +}
> [...]
> > +/**
> > + * Callback function for tracking unsent buffered packets.
> > + *
> > + * This function can be passed to rte_eth_tx_buffer_set_err_callback() to
> > + * adjust the default behaviour when buffered packets cannot be sent. This
> > + * function drops any unsent packets, but also updates a user-supplied counter
> > + * to track the overall number of packets dropped. The counter should be an
> > + * uint64_t variable.
> > + *
> > + * NOTE: this function should not be called directly, instead it should be used
> > + *       as a callback for packet buffering.
> > + *
> > + * NOTE: when configuring this function as a callback with
> > + *       rte_eth_tx_buffer_set_err_callback(), the final, userdata parameter
> > + *       should point to an uint64_t value.
> 
> Please forget this idea of counter in the default callback.
> 
> [...]
> > +void
> > +rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t unsent,
> > +		void *userdata);
> 
> What about rte_eth_tx_buffer_default_callback as name?
  
Thomas Monjalon March 9, 2016, 2:25 p.m. UTC | #3
2016-03-09 13:36, Ananyev, Konstantin:
> > > +   if (to_send == 0)
> > > +           return 0;
> > 
> > Why this check is done in the lib?
> > What is the performance gain if we are idle?
> > It can be done outside if needed.
> 
> Yes, that could be done outside, but if user has to do it anyway,
> why not to put it inside?
> I don't expect any performance gain/loss because of that -
> just seems a bit more convenient to the user.

It is handling an idle case so there is no gain obviously.
But the condition branching is surely a loss.
So why the user would you like to do this check?
  
Ananyev, Konstantin March 9, 2016, 3:23 p.m. UTC | #4
> 
> 2016-03-09 13:36, Ananyev, Konstantin:
> > > > +   if (to_send == 0)
> > > > +           return 0;
> > >
> > > Why this check is done in the lib?
> > > What is the performance gain if we are idle?
> > > It can be done outside if needed.
> >
> > Yes, that could be done outside, but if user has to do it anyway,
> > why not to put it inside?
> > I don't expect any performance gain/loss because of that -
> > just seems a bit more convenient to the user.
> 
> It is handling an idle case so there is no gain obviously.
> But the condition branching is surely a loss.

I suppose that condition should always be checked:
either in user code prior to function call or inside the
function call itself.
So don't expect any difference in performance here...
Do you have any particular example when you think it would? 
Or are you talking about rte_eth_tx_buffer() calling
rte_eth_tx_buffer_flush() internally?
For that one - both are flush is 'static inline' , so I expect
compiler be smart enough to remove this redundant check.  

> So why the user would you like to do this check?
Just for user convenience - to save him doing that manually.

Konstantin
  
Thomas Monjalon March 9, 2016, 3:26 p.m. UTC | #5
2016-03-09 15:23, Ananyev, Konstantin:
> > 
> > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > +   if (to_send == 0)
> > > > > +           return 0;
> > > >
> > > > Why this check is done in the lib?
> > > > What is the performance gain if we are idle?
> > > > It can be done outside if needed.
> > >
> > > Yes, that could be done outside, but if user has to do it anyway,
> > > why not to put it inside?
> > > I don't expect any performance gain/loss because of that -
> > > just seems a bit more convenient to the user.
> > 
> > It is handling an idle case so there is no gain obviously.
> > But the condition branching is surely a loss.
> 
> I suppose that condition should always be checked:
> either in user code prior to function call or inside the
> function call itself.
> So don't expect any difference in performance here...
> Do you have any particular example when you think it would? 
> Or are you talking about rte_eth_tx_buffer() calling
> rte_eth_tx_buffer_flush() internally?
> For that one - both are flush is 'static inline' , so I expect
> compiler be smart enough to remove this redundant check.  
> 
> > So why the user would you like to do this check?
> Just for user convenience - to save him doing that manually.

Probably I've missed something. If we remove this check, the function
will do nothing, right? How is it changing the behaviour?
  
Tomasz Kulasek March 9, 2016, 3:32 p.m. UTC | #6
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, March 9, 2016 16:27
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Kulasek, TomaszX <tomaszx.kulasek@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] ethdev: add buffered tx api
> 
> 2016-03-09 15:23, Ananyev, Konstantin:
> > >
> > > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > > +   if (to_send == 0)
> > > > > > +           return 0;
> > > > >
> > > > > Why this check is done in the lib?
> > > > > What is the performance gain if we are idle?
> > > > > It can be done outside if needed.
> > > >
> > > > Yes, that could be done outside, but if user has to do it anyway,
> > > > why not to put it inside?
> > > > I don't expect any performance gain/loss because of that - just
> > > > seems a bit more convenient to the user.
> > >
> > > It is handling an idle case so there is no gain obviously.
> > > But the condition branching is surely a loss.
> >
> > I suppose that condition should always be checked:
> > either in user code prior to function call or inside the function call
> > itself.
> > So don't expect any difference in performance here...
> > Do you have any particular example when you think it would?
> > Or are you talking about rte_eth_tx_buffer() calling
> > rte_eth_tx_buffer_flush() internally?
> > For that one - both are flush is 'static inline' , so I expect
> > compiler be smart enough to remove this redundant check.
> >
> > > So why the user would you like to do this check?
> > Just for user convenience - to save him doing that manually.
> 
> Probably I've missed something. If we remove this check, the function will
> do nothing, right? How is it changing the behaviour?

If we remove this check, function will try to send 0 packets and check condition for error. So we gain nothing with removing that.

Tomasz
  
Thomas Monjalon March 9, 2016, 3:37 p.m. UTC | #7
2016-03-09 15:32, Kulasek, TomaszX:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-03-09 15:23, Ananyev, Konstantin:
> > > >
> > > > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > > > +   if (to_send == 0)
> > > > > > > +           return 0;
> > > > > >
> > > > > > Why this check is done in the lib?
> > > > > > What is the performance gain if we are idle?
> > > > > > It can be done outside if needed.
> > > > >
> > > > > Yes, that could be done outside, but if user has to do it anyway,
> > > > > why not to put it inside?
> > > > > I don't expect any performance gain/loss because of that - just
> > > > > seems a bit more convenient to the user.
> > > >
> > > > It is handling an idle case so there is no gain obviously.
> > > > But the condition branching is surely a loss.
> > >
> > > I suppose that condition should always be checked:
> > > either in user code prior to function call or inside the function call
> > > itself.
> > > So don't expect any difference in performance here...
> > > Do you have any particular example when you think it would?
> > > Or are you talking about rte_eth_tx_buffer() calling
> > > rte_eth_tx_buffer_flush() internally?
> > > For that one - both are flush is 'static inline' , so I expect
> > > compiler be smart enough to remove this redundant check.
> > >
> > > > So why the user would you like to do this check?
> > > Just for user convenience - to save him doing that manually.
> > 
> > Probably I've missed something. If we remove this check, the function will
> > do nothing, right? How is it changing the behaviour?
> 
> If we remove this check, function will try to send 0 packets and check
> condition for error. So we gain nothing with removing that.

Actually I should not arguing why removing it,
but you should arguing why adding it :)
  
Ananyev, Konstantin March 9, 2016, 3:42 p.m. UTC | #8
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, March 09, 2016 3:27 PM
> To: Ananyev, Konstantin
> Cc: Kulasek, TomaszX; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] ethdev: add buffered tx api
> 
> 2016-03-09 15:23, Ananyev, Konstantin:
> > >
> > > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > > +   if (to_send == 0)
> > > > > > +           return 0;
> > > > >
> > > > > Why this check is done in the lib?
> > > > > What is the performance gain if we are idle?
> > > > > It can be done outside if needed.
> > > >
> > > > Yes, that could be done outside, but if user has to do it anyway,
> > > > why not to put it inside?
> > > > I don't expect any performance gain/loss because of that -
> > > > just seems a bit more convenient to the user.
> > >
> > > It is handling an idle case so there is no gain obviously.
> > > But the condition branching is surely a loss.
> >
> > I suppose that condition should always be checked:
> > either in user code prior to function call or inside the
> > function call itself.
> > So don't expect any difference in performance here...
> > Do you have any particular example when you think it would?
> > Or are you talking about rte_eth_tx_buffer() calling
> > rte_eth_tx_buffer_flush() internally?
> > For that one - both are flush is 'static inline' , so I expect
> > compiler be smart enough to remove this redundant check.
> >
> > > So why the user would you like to do this check?
> > Just for user convenience - to save him doing that manually.
> 
> Probably I've missed something. If we remove this check, the function
> will do nothing, right? How is it changing the behaviour?

If we'll remove that check, then 
rte_eth_tx_burst(...,nb_pkts=0)->(*dev->tx_pkt_burst)(...,nb_pkts=0)
will be called.
So in that case it might be even slower, as we'll have to do a proper call.
Of course user can avoid it by:

If(tx_buffer->nb_pkts != 0)
	rte_eth_tx_buffer_flush(port, queue, tx_buffer);

But as I said what for to force user to do that?
Why not to  make this check inside the function? 
Konstantin
  
Thomas Monjalon March 9, 2016, 3:52 p.m. UTC | #9
2016-03-09 15:42, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-03-09 15:23, Ananyev, Konstantin:
> > > >
> > > > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > > > +   if (to_send == 0)
> > > > > > > +           return 0;
> > > > > >
> > > > > > Why this check is done in the lib?
> > > > > > What is the performance gain if we are idle?
> > > > > > It can be done outside if needed.
> > > > >
> > > > > Yes, that could be done outside, but if user has to do it anyway,
> > > > > why not to put it inside?
> > > > > I don't expect any performance gain/loss because of that -
> > > > > just seems a bit more convenient to the user.
> > > >
> > > > It is handling an idle case so there is no gain obviously.
> > > > But the condition branching is surely a loss.
> > >
> > > I suppose that condition should always be checked:
> > > either in user code prior to function call or inside the
> > > function call itself.
> > > So don't expect any difference in performance here...
> > > Do you have any particular example when you think it would?
> > > Or are you talking about rte_eth_tx_buffer() calling
> > > rte_eth_tx_buffer_flush() internally?
> > > For that one - both are flush is 'static inline' , so I expect
> > > compiler be smart enough to remove this redundant check.
> > >
> > > > So why the user would you like to do this check?
> > > Just for user convenience - to save him doing that manually.
> > 
> > Probably I've missed something. If we remove this check, the function
> > will do nothing, right? How is it changing the behaviour?
> 
> If we'll remove that check, then 
> rte_eth_tx_burst(...,nb_pkts=0)->(*dev->tx_pkt_burst)(...,nb_pkts=0)
> will be called.
> So in that case it might be even slower, as we'll have to do a proper call.

If there is no packet, we have time to do a useless call.

> Of course user can avoid it by:
> 
> If(tx_buffer->nb_pkts != 0)
> 	rte_eth_tx_buffer_flush(port, queue, tx_buffer);
> 
> But as I said what for to force user to do that?
> Why not to  make this check inside the function?

Because it may be slower when there are some packets
and will "accelerate" only the no-packet case.

We do not progress in this discussion. It is not a big deal, just a non sense.
So I agree to keep it if we change the website to announce that DPDK
accelerates the idle processing ;)
  
Ananyev, Konstantin March 9, 2016, 4:17 p.m. UTC | #10
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, March 09, 2016 3:52 PM
> To: Ananyev, Konstantin
> Cc: Kulasek, TomaszX; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] ethdev: add buffered tx api
> 
> 2016-03-09 15:42, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2016-03-09 15:23, Ananyev, Konstantin:
> > > > >
> > > > > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > > > > +   if (to_send == 0)
> > > > > > > > +           return 0;
> > > > > > >
> > > > > > > Why this check is done in the lib?
> > > > > > > What is the performance gain if we are idle?
> > > > > > > It can be done outside if needed.
> > > > > >
> > > > > > Yes, that could be done outside, but if user has to do it anyway,
> > > > > > why not to put it inside?
> > > > > > I don't expect any performance gain/loss because of that -
> > > > > > just seems a bit more convenient to the user.
> > > > >
> > > > > It is handling an idle case so there is no gain obviously.
> > > > > But the condition branching is surely a loss.
> > > >
> > > > I suppose that condition should always be checked:
> > > > either in user code prior to function call or inside the
> > > > function call itself.
> > > > So don't expect any difference in performance here...
> > > > Do you have any particular example when you think it would?
> > > > Or are you talking about rte_eth_tx_buffer() calling
> > > > rte_eth_tx_buffer_flush() internally?
> > > > For that one - both are flush is 'static inline' , so I expect
> > > > compiler be smart enough to remove this redundant check.
> > > >
> > > > > So why the user would you like to do this check?
> > > > Just for user convenience - to save him doing that manually.
> > >
> > > Probably I've missed something. If we remove this check, the function
> > > will do nothing, right? How is it changing the behaviour?
> >
> > If we'll remove that check, then
> > rte_eth_tx_burst(...,nb_pkts=0)->(*dev->tx_pkt_burst)(...,nb_pkts=0)
> > will be called.
> > So in that case it might be even slower, as we'll have to do a proper call.
> 
> If there is no packet, we have time to do a useless call.

One lcore can do TX for several queues/ports.
Let say we have N queues to handle, but right now traffic is going only through
one of them. 
That means we'll have to do N-1 useless calls and reduce number of cycles
available to send actual traffic.

> 
> > Of course user can avoid it by:
> >
> > If(tx_buffer->nb_pkts != 0)
> > 	rte_eth_tx_buffer_flush(port, queue, tx_buffer);
> >
> > But as I said what for to force user to do that?
> > Why not to  make this check inside the function?
> 
> Because it may be slower when there are some packets
> and will "accelerate" only the no-packet case.
> 
> We do not progress in this discussion.
> It is not a big deal, 

Exactly.

>just a non sense.

Look at what most of current DPDK examples do: they do check manually
does nb_pkts==0 or not, if not call tx_burst().
For me it makes sense to move that check into the library function -
so each and every caller doesn't have to do it manually.

> So I agree to keep it if we change the website to announce that DPDK
> accelerates the idle processing ;)

That's fine by me, but at first I suppose you'll have to provide some data
showing that this approach slowdowns things, right? :)

Konstantin
  
Thomas Monjalon March 9, 2016, 4:21 p.m. UTC | #11
2016-03-09 16:17, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-03-09 15:42, Ananyev, Konstantin:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2016-03-09 15:23, Ananyev, Konstantin:
> > > > > >
> > > > > > 2016-03-09 13:36, Ananyev, Konstantin:
> > > > > > > > > +   if (to_send == 0)
> > > > > > > > > +           return 0;
> > > > > > > >
> > > > > > > > Why this check is done in the lib?
> > > > > > > > What is the performance gain if we are idle?
> > > > > > > > It can be done outside if needed.
> > > > > > >
> > > > > > > Yes, that could be done outside, but if user has to do it anyway,
> > > > > > > why not to put it inside?
> > > > > > > I don't expect any performance gain/loss because of that -
> > > > > > > just seems a bit more convenient to the user.
> > > > > >
> > > > > > It is handling an idle case so there is no gain obviously.
> > > > > > But the condition branching is surely a loss.
> > > > >
> > > > > I suppose that condition should always be checked:
> > > > > either in user code prior to function call or inside the
> > > > > function call itself.
> > > > > So don't expect any difference in performance here...
> > > > > Do you have any particular example when you think it would?
> > > > > Or are you talking about rte_eth_tx_buffer() calling
> > > > > rte_eth_tx_buffer_flush() internally?
> > > > > For that one - both are flush is 'static inline' , so I expect
> > > > > compiler be smart enough to remove this redundant check.
> > > > >
> > > > > > So why the user would you like to do this check?
> > > > > Just for user convenience - to save him doing that manually.
> > > >
> > > > Probably I've missed something. If we remove this check, the function
> > > > will do nothing, right? How is it changing the behaviour?
> > >
> > > If we'll remove that check, then
> > > rte_eth_tx_burst(...,nb_pkts=0)->(*dev->tx_pkt_burst)(...,nb_pkts=0)
> > > will be called.
> > > So in that case it might be even slower, as we'll have to do a proper call.
> > 
> > If there is no packet, we have time to do a useless call.
> 
> One lcore can do TX for several queues/ports.
> Let say we have N queues to handle, but right now traffic is going only through
> one of them. 
> That means we'll have to do N-1 useless calls and reduce number of cycles
> available to send actual traffic.

OK, good justification, thanks.

> > > Of course user can avoid it by:
> > >
> > > If(tx_buffer->nb_pkts != 0)
> > > 	rte_eth_tx_buffer_flush(port, queue, tx_buffer);
> > >
> > > But as I said what for to force user to do that?
> > > Why not to  make this check inside the function?
> > 
> > Because it may be slower when there are some packets
> > and will "accelerate" only the no-packet case.
> > 
> > We do not progress in this discussion.
> > It is not a big deal, 
> 
> Exactly.
> 
> >just a non sense.
> 
> Look at what most of current DPDK examples do: they do check manually
> does nb_pkts==0 or not, if not call tx_burst().
> For me it makes sense to move that check into the library function -
> so each and every caller doesn't have to do it manually.
> 
> > So I agree to keep it if we change the website to announce that DPDK
> > accelerates the idle processing ;)
> 
> That's fine by me, but at first I suppose you'll have to provide some data
> showing that this approach slowdowns things, right? :)

You got me
  
Tomasz Kulasek March 9, 2016, 4:35 p.m. UTC | #12
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, March 8, 2016 23:52
> To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> Cc: dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] ethdev: add buffered tx api
> 
> Hi,
> 

[...]

> > +/**
> > + * Callback function for tracking unsent buffered packets.
> > + *
> > + * This function can be passed to
> > +rte_eth_tx_buffer_set_err_callback() to
> > + * adjust the default behaviour when buffered packets cannot be sent.
> > +This
> > + * function drops any unsent packets, but also updates a
> > +user-supplied counter
> > + * to track the overall number of packets dropped. The counter should
> > +be an
> > + * uint64_t variable.
> > + *
> > + * NOTE: this function should not be called directly, instead it should
> be used
> > + *       as a callback for packet buffering.
> > + *
> > + * NOTE: when configuring this function as a callback with
> > + *       rte_eth_tx_buffer_set_err_callback(), the final, userdata
> parameter
> > + *       should point to an uint64_t value.
> 
> Please forget this idea of counter in the default callback.
> 

Ok, I forgot.

> [...]
> > +void
> > +rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t
> unsent,
> > +		void *userdata);
> 
> What about rte_eth_tx_buffer_default_callback as name?

This function is used now as default to count silently dropped packets and update error counter in tx_buffer structure. When I remove error counter and set silent drop as default behavior, it's better to have two callbacks to choice:

1) silently dropping packets (set as default)
2) as defined above to dropping with counter.

Maybe better is to define two default callbacks while many applications can still update it's internal error counter,
So IHMO these names are more descriptive:

rte_eth_tx_buffer_drop_callback
rte_eth_tx_buffer_count_callback

What you think?

Tomasz
  
Thomas Monjalon March 9, 2016, 5:06 p.m. UTC | #13
2016-03-09 16:35, Kulasek, TomaszX:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > +void
> > > +rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t
> > unsent,
> > > +		void *userdata);
> > 
> > What about rte_eth_tx_buffer_default_callback as name?
> 
> This function is used now as default to count silently dropped packets and update error counter in tx_buffer structure. When I remove error counter and set silent drop as default behavior, it's better to have two callbacks to choice:
> 
> 1) silently dropping packets (set as default)
> 2) as defined above to dropping with counter.
> 
> Maybe better is to define two default callbacks while many applications can still update it's internal error counter,
> So IHMO these names are more descriptive:
> 
> rte_eth_tx_buffer_drop_callback
> rte_eth_tx_buffer_count_callback
> 
> What you think?

I think you are right about the name.

Are you sure providing a "count" callback is needed?
Is it just to refactor the examples?
  
Tomasz Kulasek March 9, 2016, 6:12 p.m. UTC | #14
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, March 9, 2016 18:07
> To: Kulasek, TomaszX <tomaszx.kulasek@intel.com>
> Cc: dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] ethdev: add buffered tx api
> 
> 2016-03-09 16:35, Kulasek, TomaszX:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > +void
> > > > +rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts,
> > > > +uint16_t
> > > unsent,
> > > > +		void *userdata);
> > >
> > > What about rte_eth_tx_buffer_default_callback as name?
> >
> > This function is used now as default to count silently dropped packets
> and update error counter in tx_buffer structure. When I remove error
> counter and set silent drop as default behavior, it's better to have two
> callbacks to choice:
> >
> > 1) silently dropping packets (set as default)
> > 2) as defined above to dropping with counter.
> >
> > Maybe better is to define two default callbacks while many
> > applications can still update it's internal error counter, So IHMO these
> names are more descriptive:
> >
> > rte_eth_tx_buffer_drop_callback
> > rte_eth_tx_buffer_count_callback
> >
> > What you think?
> 
> I think you are right about the name.
> 
> Are you sure providing a "count" callback is needed?
> Is it just to refactor the examples?

I think it's useful to have a callback which let you easily to track the overall number of packets dropped. It's handy when you want to drop packets and not leave them untracked.

It's good to have it, but it's not critical.

Changing the examples is not a problem while I've got copy-paste superpower.

Tomasz
  

Patch

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 756b234..b8ab747 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1307,6 +1307,42 @@  rte_eth_tx_queue_setup(uint8_t port_id, uint16_t tx_queue_id,
 }
 
 void
+rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t unsent,
+		void *userdata)
+{
+	uint64_t *count = userdata;
+	unsigned i;
+
+	for (i = 0; i < unsent; i++)
+		rte_pktmbuf_free(pkts[i]);
+
+	*count += unsent;
+}
+
+int
+rte_eth_tx_buffer_set_err_callback(struct rte_eth_dev_tx_buffer *buffer,
+		buffer_tx_error_fn cbfn, void *userdata)
+{
+	buffer->cbfn = cbfn;
+	buffer->userdata = userdata;
+	return 0;
+}
+
+int
+rte_eth_tx_buffer_init(struct rte_eth_dev_tx_buffer *buffer, uint16_t size)
+{
+	if (buffer == NULL)
+		return -EINVAL;
+
+	buffer->size = size;
+	if (buffer->cbfn == NULL)
+		rte_eth_tx_buffer_set_err_callback(buffer,
+				rte_eth_count_unsent_packet_callback, (void *)&buffer->errors);
+
+	return 0;
+}
+
+void
 rte_eth_promiscuous_enable(uint8_t port_id)
 {
 	struct rte_eth_dev *dev;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 16da821..b0d4932 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1,7 +1,7 @@ 
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -2655,6 +2655,186 @@  rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
 	return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, nb_pkts);
 }
 
+typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t count,
+		void *userdata);
+
+/**
+ * Structure used to buffer packets for future TX
+ * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush
+ */
+struct rte_eth_dev_tx_buffer {
+	unsigned nb_pkts;
+	uint64_t errors;
+	/**< Total number of queue packets to sent that are dropped. */
+	buffer_tx_error_fn cbfn;
+	void *userdata;
+	uint16_t size;           /**< Size of buffer for buffered tx */
+	struct rte_mbuf *pkts[];
+};
+
+/**
+ * Calculate the size of the tx buffer.
+ *
+ * @param sz
+ *   Number of stored packets.
+ */
+#define RTE_ETH_TX_BUFFER_SIZE(sz) \
+	(sizeof(struct rte_eth_dev_tx_buffer) + (sz) * sizeof(struct rte_mbuf *))
+
+/**
+ * Initialize default values for buffered transmitting
+ *
+ * @param buffer
+ *   Tx buffer to be initialized.
+ * @param size
+ *   Buffer size
+ * @return
+ *   0 if no error
+ */
+int
+rte_eth_tx_buffer_init(struct rte_eth_dev_tx_buffer *buffer, uint16_t size);
+
+/**
+ * Send any packets queued up for transmission on a port and HW queue
+ *
+ * This causes an explicit flush of packets previously buffered via the
+ * rte_eth_tx_buffer() function. It returns the number of packets successfully
+ * sent to the NIC, and calls the error callback for any unsent packets. Unless
+ * explicitly set up otherwise, the default callback simply frees the unsent
+ * packets back to the owning mempool.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the transmit queue through which output packets must be
+ *   sent.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param buffer
+ *   Buffer of packets to be transmit.
+ * @return
+ *   The number of packets successfully sent to the Ethernet device. The error
+ *   callback is called for any packets which could not be sent.
+ */
+static inline uint16_t
+rte_eth_tx_buffer_flush(uint8_t port_id, uint16_t queue_id,
+		struct rte_eth_dev_tx_buffer *buffer)
+{
+	uint16_t sent;
+
+	uint16_t to_send = buffer->nb_pkts;
+
+	if (to_send == 0)
+		return 0;
+
+	sent = rte_eth_tx_burst(port_id, queue_id, buffer->pkts, to_send);
+
+	buffer->nb_pkts = 0;
+
+	/* All packets sent, or to be dealt with by callback below */
+	if (unlikely(sent != to_send))
+		buffer->cbfn(&buffer->pkts[sent], to_send - sent,
+				buffer->userdata);
+
+	return sent;
+}
+
+/**
+ * Buffer a single packet for future transmission on a port and queue
+ *
+ * This function takes a single mbuf/packet and buffers it for later
+ * transmission on the particular port and queue specified. Once the buffer is
+ * full of packets, an attempt will be made to transmit all the buffered
+ * packets. In case of error, where not all packets can be transmitted, a
+ * callback is called with the unsent packets as a parameter. If no callback
+ * is explicitly set up, the unsent packets are just freed back to the owning
+ * mempool. The function returns the number of packets actually sent i.e.
+ * 0 if no buffer flush occurred, otherwise the number of packets successfully
+ * flushed
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the transmit queue through which output packets must be
+ *   sent.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param buffer
+ *   Buffer used to collect packets to be sent.
+ * @param tx_pkt
+ *   Pointer to the packet mbuf to be sent.
+ * @return
+ *   0 = packet has been buffered for later transmission
+ *   N > 0 = packet has been buffered, and the buffer was subsequently flushed,
+ *     causing N packets to be sent, and the error callback to be called for
+ *     the rest.
+ */
+static inline uint16_t __attribute__((always_inline))
+rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
+		struct rte_eth_dev_tx_buffer *buffer, struct rte_mbuf *tx_pkt)
+{
+	buffer->pkts[buffer->nb_pkts++] = tx_pkt;
+	if (buffer->nb_pkts < buffer->size)
+		return 0;
+
+	return rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
+}
+
+/**
+ * Configure a callback for buffered packets which cannot be sent
+ *
+ * Register a specific callback to be called when an attempt is made to send
+ * all packets buffered on an ethernet port, but not all packets can
+ * successfully be sent. The callback registered here will be called only
+ * from calls to rte_eth_tx_buffer() and rte_eth_tx_buffer_flush() APIs.
+ * The default callback configured for each queue by default just frees the
+ * packets back to the calling mempool. If additional behaviour is required,
+ * for example, to count dropped packets, or to retry transmission of packets
+ * which cannot be sent, this function should be used to register a suitable
+ * callback function to implement the desired behaviour.
+ * The example callback "rte_eth_count_unsent_packet_callback()" is also
+ * provided as reference.
+ *
+ * @param buffer
+ *   The port identifier of the Ethernet device.
+ * @param cbfn
+ *   The function to be used as the callback.
+ * @param userdata
+ *   Arbitrary parameter to be passed to the callback function
+ * @return
+ *   0 on success, or -1 on error with rte_errno set appropriately
+ */
+int
+rte_eth_tx_buffer_set_err_callback(struct rte_eth_dev_tx_buffer *buffer,
+		buffer_tx_error_fn cbfn, void *userdata);
+
+/**
+ * Callback function for tracking unsent buffered packets.
+ *
+ * This function can be passed to rte_eth_tx_buffer_set_err_callback() to
+ * adjust the default behaviour when buffered packets cannot be sent. This
+ * function drops any unsent packets, but also updates a user-supplied counter
+ * to track the overall number of packets dropped. The counter should be an
+ * uint64_t variable.
+ *
+ * NOTE: this function should not be called directly, instead it should be used
+ *       as a callback for packet buffering.
+ *
+ * NOTE: when configuring this function as a callback with
+ *       rte_eth_tx_buffer_set_err_callback(), the final, userdata parameter
+ *       should point to an uint64_t value.
+ *
+ * @param pkts
+ *   The previously buffered packets which could not be sent
+ * @param unsent
+ *   The number of unsent packets in the pkts array
+ * @param userdata
+ *   Pointer to an unsigned long value, which will be incremented by unsent
+ */
+void
+rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t unsent,
+		void *userdata);
+
 /**
  * The eth device event type for interrupt, and maybe others in the future.
  */
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index d8db24d..ad11c71 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -117,3 +117,12 @@  DPDK_2.2 {
 
 	local: *;
 };
+
+DPDK_2.3 {
+	global:
+
+	rte_eth_count_unsent_packet_callback;
+	rte_eth_tx_buffer_init;
+	rte_eth_tx_buffer_set_err_callback;
+
+} DPDK_2.2;