[dpdk-dev,v5,1/3] ethdev: new API to free consumed buffers in Tx ring

Message ID 20170127183800.27466-2-bmcfall@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel compilation success Compilation OK

Commit Message

Billy McFall Jan. 27, 2017, 6:37 p.m. UTC
  Add a new API to force the freeing of consumed buffers on the Tx ring.
The API returns the number of packets freed (0-n), or an error code if the
feature is not supported (-ENOTSUP) or the input is invalid (-ENODEV).

Signed-off-by: Billy McFall <bmcfall@redhat.com>
---
 doc/guides/nics/features/default.ini  |  1 +
 doc/guides/prog_guide/mempool_lib.rst | 29 +++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.c         | 14 ++++++++++++++
 lib/librte_ether/rte_ethdev.h         | 31 +++++++++++++++++++++++++++++++
 4 files changed, 75 insertions(+)
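
The return contract in the commit message (a count of freed packets on
success, -ENOTSUP or -ENODEV on failure) suggests a calling pattern along
these lines. This is only a sketch: stub_tx_done_cleanup() is a hypothetical
stand-in for rte_eth_tx_done_cleanup() so the example builds without DPDK,
and the port numbering and the "4 pending packets" are invented for
illustration.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for rte_eth_tx_done_cleanup().
 * Assumptions for illustration only: ports 0 and 1 exist, only port 0
 * implements the feature, and 4 transmitted packets are waiting in the ring.
 */
static int
stub_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt)
{
	(void)queue_id;
	if (port_id > 1)
		return -ENODEV;		/* invalid interface */
	if (port_id == 1)
		return -ENOTSUP;	/* driver lacks the callback */
	/* free_cnt == 0 means "free everything that is no longer in use" */
	return (free_cnt == 0 || free_cnt > 4) ? 4 : (int)free_cnt;
}

/*
 * Caller-side wrapper: treat "not supported" as "nothing freed", since the
 * driver will still release its mbufs on its own schedule, and pass any
 * other error through unchanged.
 */
static int
reclaim_tx(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt)
{
	int ret = stub_tx_done_cleanup(port_id, queue_id, free_cnt);

	if (ret == -ENOTSUP)
		return 0;
	return ret;
}
```

Whether -ENOTSUP should be folded into 0 like this is an application
decision; a packet generator that relies on a clean state between runs may
prefer to treat it as a hard error.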
  

Comments

Thomas Monjalon Feb. 27, 2017, 1:48 p.m. UTC | #1
2017-01-27 13:37, Billy McFall:
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -55,6 +55,7 @@ FW version           =
>  EEPROM dump          =
>  Registers dump       =
>  Multiprocess aware   =
> +Free TX ring buffers =

I'm afraid this wording will be confusing, because every drivers
free their buffers :)
What about "Free Tx mbuf on demand" ?

And please, move this line upper, just after "Rx interrupt".

We also need to carefully review the doc you provided (thanks).
First quick comment, please wrap lines shorter in the doc.

About the function prototype, I've seen a double space :)
I think you could use rte_errno (while keeping negative return codes).
  
Billy McFall March 7, 2017, 2:29 p.m. UTC | #2
Thomas,

Thanks for your comments. See inline.

On Mon, Feb 27, 2017 at 8:48 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> 2017-01-27 13:37, Billy McFall:
> > --- a/doc/guides/nics/features/default.ini
> > +++ b/doc/guides/nics/features/default.ini
> > @@ -55,6 +55,7 @@ FW version           =
> >  EEPROM dump          =
> >  Registers dump       =
> >  Multiprocess aware   =
> > +Free TX ring buffers =
>
> I'm afraid this wording will be confusing, because every drivers
> free their buffers :)
> What about "Free Tx mbuf on demand" ?
>
>
I definitely like your wording of the feature better than mine. All the
existing features were under 20 characters and I was trying to stay under
that.


> And please, move this line upper, just after "Rx interrupt".
>

Done


> We also need to carefully review the doc you provided (thanks).
> First quick comment, please wrap lines shorter in the doc.
>

Done


> About the function prototype, I've seen a double space :)
>

Done


> I think you could use rte_errno (while keeping negative return codes).
>

I can do that if you want, but if I understand your comment, it will make
the implementation of the function not as clean. I cannot use the existing
RTE_ETH_VALID_PORTID_OR_ERR_RET(..) and RTE_FUNC_PTR_OR_ERR_RET(..) MACROs
because they are handling the return on error. Or am I missing something?
  
Thomas Monjalon March 7, 2017, 4:03 p.m. UTC | #3
2017-03-07 09:29, Billy McFall:
> On Mon, Feb 27, 2017 at 8:48 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
> > I think you could use rte_errno (while keeping negative return codes).
> >
> 
> I can do that if you want, but if I understand your comment, it will make
> the implementation of the function not as clean. I cannot use the existing
> RTE_ETH_VALID_PORTID_OR_ERR_RET(..) and RTE_FUNC_PTR_OR_ERR_RET(..) MACROs
> because they are handling the return on error. Or am I missing something?

Yes. Maybe we need new macros for basic error management with rte_errno.
  
John McNamara March 7, 2017, 4:42 p.m. UTC | #4
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Billy McFall
> Sent: Friday, January 27, 2017 6:38 PM
> To: thomas.monjalon@6wind.com; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> olivier.matz@6wind.com
> Cc: dev@dpdk.org; Billy McFall <bmcfall@redhat.com>
> Subject: [dpdk-dev] [PATCH v5 1/3] ethdev: new API to free consumed
> buffers in Tx ring
>

Hi Billy,

Thanks for this. Some minor comments below.

> +Driver Cache
> +~~~~~~~~~~~~

I think this could be at the same level as the "Local Cache" earlier in
the doc.

> +
> +In addition to the a core’s local cache, many of the drivers don't

The apostrophe after core is non-ascii. That can mess with the PDF output.

> +release the mbuf back to the mempool, or local cache, immediately after
> the packet has been transmitted.
> +Instead, they leave the mbuf in their txRing and either perform a bulk

I'd suggest s/txRing/Tx ring/ here and below.

> +release when the tx_rs_thresh has been crossed or free the mbuf when a

You should use fixed-width quoting for ``tx_rs_thresh`` here and below.

> slot in the txRing is needed.
> +
> +An application can request the driver to release used mbufs with the
> ``rte_eth_tx_done_cleanup()`` API.
> +This API requests the driver to release mbufs that are no longer in
> +use, independent of whether or not the tx_rs_thresh has been crossed.
> +There are two scenarios when an application may want the mbuf back
> immediately:

s/back/released/ maybe?

> +
> +* When a given packet needs to be sent to multiple destination interfaces
> (either for Layer 2 flooding or Layer 3 multi-cast).
> +  One option is to make a copy of the packet or a copy of the header
> portion that needs to manipulated.

s/to/to be/

John
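
The second option in the quoted documentation (transmit, then poll the
cleanup API until the driver drops its reference, then reuse the same packet
for the next interface) might look roughly like the sketch below. Everything
here is a hypothetical stand-in so the example builds without DPDK:
fake_mbuf, fake_txq, fake_tx() and fake_tx_done_cleanup() only mimic the
roles of an mbuf, a driver Tx ring, rte_eth_tx_burst() and the driver's
tx_done_cleanup callback. The fake ring completes transmission instantly;
with real hardware the polling loop would need a bound or a timeout.

```c
#include <stdint.h>

#define FAKE_RING_SIZE 8

struct fake_mbuf {
	uint16_t refcnt;			/* 1 == owned by the app only */
};

struct fake_txq {
	struct fake_mbuf *done[FAKE_RING_SIZE];	/* sent, not yet freed */
	unsigned int nb_done;
};

/* Queue one packet; returns 1 on success, 0 if the ring is full. */
static int
fake_tx(struct fake_txq *q, struct fake_mbuf *m)
{
	if (q->nb_done == FAKE_RING_SIZE)
		return 0;
	q->done[q->nb_done++] = m;
	return 1;
}

/* Drop the ring's reference on up to free_cnt packets (0 == all). */
static int
fake_tx_done_cleanup(struct fake_txq *q, uint32_t free_cnt)
{
	unsigned int n = q->nb_done;
	unsigned int i;

	if (free_cnt != 0 && free_cnt < n)
		n = free_cnt;
	for (i = 0; i < n; i++)
		q->done[i]->refcnt--;
	for (i = n; i < q->nb_done; i++)	/* compact what remains */
		q->done[i - n] = q->done[i];
	q->nb_done -= n;
	return (int)n;
}

/* Send the same packet out of every queue in turn; returns packets sent. */
static int
flood(struct fake_txq *queues, unsigned int nb_queues, struct fake_mbuf *m)
{
	unsigned int i;
	int sent = 0;

	for (i = 0; i < nb_queues; i++) {
		m->refcnt++;			/* reference held by the ring */
		if (!fake_tx(&queues[i], m)) {
			m->refcnt--;		/* ring full: take it back */
			continue;
		}
		sent++;
		/* Poll until the ring has released its reference. */
		while (m->refcnt > 1)
			fake_tx_done_cleanup(&queues[i], 0);
	}
	return sent;
}
```

The extra reference taken before each transmit is what keeps the packet
alive when the driver frees it; without it the mbuf would go back to the
mempool instead of returning to the application.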
  
Billy McFall March 9, 2017, 3:49 p.m. UTC | #5
On Tue, Mar 7, 2017 at 11:03 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> 2017-03-07 09:29, Billy McFall:
> > On Mon, Feb 27, 2017 at 8:48 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> > wrote:
> > > I think you could use rte_errno (while keeping negative return codes).
> > >
> >
> > I can do that if you want, but if I understand your comment, it will make
> > the implementation of the function not as clean. I cannot use the existing
> > RTE_ETH_VALID_PORTID_OR_ERR_RET(..) and RTE_FUNC_PTR_OR_ERR_RET(..) MACROs
> > because they are handling the return on error. Or am I missing something?
>
> Yes. Maybe we need new macros for basic error management with rte_errno.
>

Looking at the code. Do you think we need new MACROs or just set rte_errno
in the existing? What would be the down side to setting rte_errno for all
the existing calls?

Looks like RTE_ETH_VALID_PORTID_OR_ERR_RET(..) is being called ~135 times.
Most calls are with retval set to either -ENODEV or -EINVAL. A few
instances of 0 and -1, but not many.

Looks like RTE_FUNC_PTR_OR_ERR_RET(..) is being called ~100 times. Most
calls are with retval set to -ENOTSUP. A few instances of 0, but not many.

I was thinking:
/* Macros to check for valid port */
#define RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, retval) do { \
	if (!rte_eth_dev_is_valid_port(port_id)) { \
		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); \
+		if (retval < 0) \
+			rte_errno = -retval; \
		return retval; \
	} \
} while (0)

Thanks,
Billy
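
The proposal above can be tried outside DPDK with a small self-contained
sketch. The names below (fake_rte_errno, fake_port_is_valid,
VALID_PORTID_OR_ERR_RET, demo_cleanup) are hypothetical stand-ins for
rte_errno, rte_eth_dev_is_valid_port(), the real macro and a caller; the
four-port limit is an assumption made for the example. The one deliberate
difference from the proposal is that the macro arguments are parenthesized,
so an expression passed as retval is still negated as a whole.

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in for rte_errno (which is per-lcore in DPDK). */
static int fake_rte_errno;

/* Stand-in for rte_eth_dev_is_valid_port(): assume ports 0..3 exist. */
static int
fake_port_is_valid(int port_id)
{
	return port_id >= 0 && port_id < 4;
}

/*
 * Return retval from the enclosing function when the port is invalid.
 * Negative return codes are mirrored into the errno-style variable as a
 * positive value; 0 and positive return values leave it untouched.
 */
#define VALID_PORTID_OR_ERR_RET(port_id, retval) do { \
	if (!fake_port_is_valid(port_id)) { \
		fprintf(stderr, "Invalid port_id=%d\n", (port_id)); \
		if ((retval) < 0) \
			fake_rte_errno = -(retval); \
		return (retval); \
	} \
} while (0)

/* Example caller: the macro handles both the return and the errno update. */
static int
demo_cleanup(int port_id)
{
	VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
	return 0;
}
```

Because the check on retval is done inside the macro, the existing call
sites that pass 0 or a positive value would not need to change.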
  
Thomas Monjalon March 9, 2017, 5:11 p.m. UTC | #6
2017-03-09 10:49, Billy McFall:
> On Tue, Mar 7, 2017 at 11:03 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
> > 2017-03-07 09:29, Billy McFall:
> > > On Mon, Feb 27, 2017 at 8:48 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> > > wrote:
> > > > I think you could use rte_errno (while keeping negative return codes).
> > > >
> > >
> > > I can do that if you want, but if I understand your comment, it will make
> > > the implementation of the function not as clean. I cannot use the existing
> > > RTE_ETH_VALID_PORTID_OR_ERR_RET(..) and RTE_FUNC_PTR_OR_ERR_RET(..) MACROs
> > > because they are handling the return on error. Or am I missing something?
> >
> > Yes. Maybe we need new macros for basic error management with rte_errno.
> >
> Looking at the code. Do you think we need new MACROs or just set rte_errno
> in the existing? What would be the down side to setting rte_errno for all
> the existing calls?
> 
> Looks like RTE_ETH_VALID_PORTID_OR_ERR_RET(..) is being called ~135 times.
> Most calls are with retval set to either -ENODEV or -EINVAL. A few
> instances of 0 and -1, but not many.
> 
> Looks like RTE_FUNC_PTR_OR_ERR_RET(..) is being called ~100 times. Most
> calls are with retval set to -ENOTSUP. A few instances of 0, but not many.
> 
> I was thinking:
> /* Macros to check for valid port */
> #define RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, retval) do { \
> 	if (!rte_eth_dev_is_valid_port(port_id)) { \
> 		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); \
> +		if (retval < 0) \
> +			rte_errno = -retval; \
> 		return retval; \
> 	} \
> } while (0)

Yes it seems acceptable.
This rework may be done separately.
  

Patch

diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index 6e4a043..e2d8c83 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -55,6 +55,7 @@  FW version           =
 EEPROM dump          =
 Registers dump       =
 Multiprocess aware   =
+Free TX ring buffers =
 BSD nic_uio          =
 Linux UIO            =
 Linux VFIO           =
diff --git a/doc/guides/prog_guide/mempool_lib.rst b/doc/guides/prog_guide/mempool_lib.rst
index ffdc109..92c6fd5 100644
--- a/doc/guides/prog_guide/mempool_lib.rst
+++ b/doc/guides/prog_guide/mempool_lib.rst
@@ -132,6 +132,35 @@  These user-owned caches can be explicitly passed to ``rte_mempool_generic_put()`
 The ``rte_mempool_default_cache()`` call returns the default internal cache if any.
 In contrast to the default caches, user-owned caches can be used by non-EAL threads too.
 
+
+Driver Cache
+~~~~~~~~~~~~
+
+In addition to the a core’s local cache, many of the drivers don't release the mbuf back to the mempool, or local cache,
+immediately after the packet has been transmitted.
+Instead, they leave the mbuf in their txRing and either perform a bulk release when the tx_rs_thresh has been crossed
+or free the mbuf when a slot in the txRing is needed.
+
+An application can request the driver to release used mbufs with the ``rte_eth_tx_done_cleanup()`` API.
+This API requests the driver to release mbufs that are no longer in use, independent of whether or not the tx_rs_thresh
+has been crossed.
+There are two scenarios when an application may want the mbuf back immediately:
+
+* When a given packet needs to be sent to multiple destination interfaces (either for Layer 2 flooding or Layer 3 multi-cast).
+  One option is to make a copy of the packet or a copy of the header portion that needs to manipulated.
+  A second option is to transmit the packet and then poll the ``rte_eth_tx_done_cleanup()`` API until the reference count
+  on the packet is decremented.
+  Then the same packet can be transmitted to the next destination interface.
+
+* If an application is designed to make multiple runs, like a packet generator, and one run has completed.
+  The application may want to reset to a clean state.
+  In this case, it may want to call the ``rte_eth_tx_done_cleanup()`` API to request each destination interface it has been
+  using to release all of its used mbufs.
+
+To determine if a driver supports this API, check for the *Free TX ring buffers* feature in the *Network Interface
+Controller Drivers* document.
+
+
 Mempool Handlers
 ------------------------
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 61f44e2..1cbf6d0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1253,6 +1253,20 @@  rte_eth_tx_buffer_set_err_callback(struct rte_eth_dev_tx_buffer *buffer,
 }
 
 int
+rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id,  uint32_t free_cnt)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	/* Validate Input Data. Bail if not valid or not supported. */
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
+
+	/* Call driver to free pending mbufs. */
+	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+			free_cnt);
+}
+
+int
 rte_eth_tx_buffer_init(struct rte_eth_dev_tx_buffer *buffer, uint16_t size)
 {
 	int ret = 0;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c17bbda..b23886c 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1183,6 +1183,9 @@  typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
 				     char *fw_version, size_t fw_size);
 /**< @internal Get firmware information of an Ethernet device. */
 
+typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
+/**< @internal Force mbufs to be freed from TX ring. */
+
 typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
 	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
 
@@ -1487,6 +1490,7 @@  struct eth_dev_ops {
 	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
 	eth_queue_release_t        tx_queue_release; /**< Release TX queue. */
+	eth_tx_done_cleanup_t      tx_done_cleanup;/**< Free tx ring mbufs */
 
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
 	eth_dev_led_off_t          dev_led_off;   /**< Turn off LED. */
@@ -3091,6 +3095,33 @@  rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
 }
 
 /**
+ * Request the driver to free mbufs currently cached by the driver. The
+ * driver will only free the mbuf if it is no longer in use. It is the
+ * application's responsibility to ensure rte_eth_tx_buffer_flush(..) is
+ * called if needed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the transmit queue whose consumed mbufs are to be
+ *   freed.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param free_cnt
+ *   Maximum number of packets to free. Use 0 to indicate all possible packets
+ *   should be freed. Note that a packet may be using multiple mbufs.
+ * @return
+ *   Failure: < 0
+ *     -ENODEV: Invalid interface
+ *     -ENOTSUP: Driver does not support function
+ *   Success: >= 0
+ *     0-n: Number of packets freed. More packets may still remain in ring that
+ *     are in use.
+ */
+int
+rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id,  uint32_t free_cnt);
+
+/**
  * Configure a callback for buffered packets which cannot be sent
  *
  * Register a specific callback to be called when an attempt is made to send