[dpdk-dev,v9,1/8] ethdev: use locks to protect Rx/Tx callback lists

Message ID 1465897108-26548-2-git-send-email-reshma.pattan@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Commit Message

Pattan, Reshma June 14, 2016, 9:38 a.m. UTC
Added spinlocks around add/remove logic of Rx and Tx callbacks
to avoid corruption of callback lists in multithreaded context.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 82 +++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 42 deletions(-)
  

Comments

Thomas Monjalon June 14, 2016, 7:59 p.m. UTC | #1
2016-06-14 10:38, Reshma Pattan:
> Added spinlocks around add/remove logic of Rx and Tx callbacks
> to avoid corruption of callback lists in multithreaded context.
> 
> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>

Why cb->next is not locked in burst functions?
Just protecting add/remove but not its usage seems useless.
  
Pattan, Reshma June 15, 2016, 5:30 a.m. UTC | #2
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, June 14, 2016 9:00 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx
> callback lists
> 
> 2016-06-14 10:38, Reshma Pattan:
> > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > avoid corruption of callback lists in multithreaded context.
> >
> > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> 
> Why cb->next is not locked in burst functions?
It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.

> Just protecting add/remove but not its usage seems useless.
Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads 
i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request). 

Thanks,
Reshma
  
Thomas Monjalon June 15, 2016, 8:19 a.m. UTC | #3
2016-06-15 05:30, Pattan, Reshma:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-06-14 10:38, Reshma Pattan:
> > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > avoid corruption of callback lists in multithreaded context.
> > >
> > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > 
> > Why cb->next is not locked in burst functions?
> It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.
> 
> > Just protecting add/remove but not its usage seems useless.
> Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads 
> i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request). 

So read and write can be done by different threads.
I think the read access would need locking but we do not want it
in fast path.
Are you sure there is no issue in this design?
  
Ananyev, Konstantin June 15, 2016, 8:37 a.m. UTC | #4
Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, June 15, 2016 9:19 AM
> To: Pattan, Reshma
> Cc: dev@dpdk.org; Ananyev, Konstantin
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> 2016-06-15 05:30, Pattan, Reshma:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2016-06-14 10:38, Reshma Pattan:
> > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > avoid corruption of callback lists in multithreaded context.
> > > >
> > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > >
> > > Why cb->next is not locked in burst functions?
> > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.
> >
> > > Just protecting add/remove but not its usage seems useless.
> > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads
> > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> 
> So read and write can be done by different threads.

Yes, and this is possible even in current DPDK version (16.04).
What is added by Reshma's patch - now it is possible to have concurrent write
from 2 different thread to that list.  

> I think the read access would need locking but we do not want it
> in fast path.

I don't think it would be needed.
As I said - read/write interaction didn't change from what we have right now.
But if you have some particular scenario in mind that you believe would cause
a race condition - please speak up.  
Konstantin

> Are you sure there is no issue in this design?
  
Thomas Monjalon June 15, 2016, 8:48 a.m. UTC | #5
2016-06-15 08:37, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2016-06-15 05:30, Pattan, Reshma:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2016-06-14 10:38, Reshma Pattan:
> > > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > > avoid corruption of callback lists in multithreaded context.
> > > > >
> > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > >
> > > > Why cb->next is not locked in burst functions?
> > > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane) thread.
> > >
> > > > Just protecting add/remove but not its usage seems useless.
> > > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2 threads
> > > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> > 
> > So read and write can be done by different threads.
> 
> Yes, and this is possible even in current DPDK version (16.04).
> What is added by Reshma's patch - now it is possible to have concurrent write
> from 2 different thread to that list.  
> 
> > I think the read access would need locking but we do not want it
> > in fast path.
> 
> I don't think it would be needed.
> As I said - read/write interaction didn't change from what we have right now.
> But if you have some particular scenario in mind that you believe would cause
> a race condition - please speak up.  

If we add/remove a callback during a burst? Is it possible that the next
pointer would have a wrong value leading to a crash?
Maybe we need a comment to state that we should not alter burst
callbacks while running burst functions.
  
Ananyev, Konstantin June 15, 2016, 9:54 a.m. UTC | #6
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, June 15, 2016 9:49 AM
> To: Ananyev, Konstantin
> Cc: Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> 2016-06-15 08:37, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2016-06-15 05:30, Pattan, Reshma:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > > 2016-06-14 10:38, Reshma Pattan:
> > > > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > > > avoid corruption of callback lists in multithreaded context.
> > > > > >
> > > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > > >
> > > > > Why cb->next is not locked in burst functions?
> > > > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane)
> thread.
> > > >
> > > > > Just protecting add/remove but not its usage seems useless.
> > > > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2
> threads
> > > > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> > >
> > > So read and write can be done by different threads.
> >
> > Yes, and this is possible even in current DPDK version (16.04).
> > What is added by Reshma's patch - now it is possible to have concurrent write
> > from 2 different thread to that list.
> >
> > > I think the read access would need locking but we do not want it
> > > in fast path.
> >
> > I don't think it would be needed.
> > As I said - read/write interaction didn't change from what we have right now.
> > But if you have some particular scenario in mind that you believe would cause
> > a race condition - please speak up.
> 
> If we add/remove a callback during a burst? Is it possible that the next
> pointer would have a wrong value leading to a crash?
> Maybe we need a comment to state that we should not alter burst
> callbacks while running burst functions.

Current status (16.04):
It is safe to add/remove RX/TX callbacks while 
another thread is doing simultaneously RX/TX burst over same queue.
I.E: it is supposed to be safe to invoke
rte_eth_add(/remove)_rx(/tx)_callback() and rte_eth_rx_burst()/rte_eth_tx_burst()
from different threads simultaneously.
Though it is not safe to free/modify that rte_eth_rxtx_callback while current
rte_eth_rx_burst()/rte_eth_tx_burst() are still active.
That exactly what comments for rte_eth_remove_rx_callback() say:

* Note: the callback is removed from the callback list but it isn't freed
 * since the it may still be in use. The memory for the callback can be
 * subsequently freed back by the application by calling rte_free():
 *
 * - Immediately - if the port is stopped, or the user knows that no
 *   callbacks are in flight e.g. if called from the thread doing RX/TX
 *   on that queue.
 *
 * - After a short delay - where the delay is sufficient to allow any
 *   in-flight callbacks to complete.

In other words, right now there only way to know for sure that it is safe
to free the removed callback - is to stop the port.

Does it need to be changed, so when rte_eth_remove_rx_callback() returns
user can safely free the callback (or even better rte_eth_remove_rx_callback free the callback for us)?
In my opinion - yes.
Though, I think, it has nothing to do with pdump patches, and I think should be a matter
for separate a patch/discussion.

Now with pdump library introduction - there is possibility that 2 different threads
can try to  add/remove callbacks for the same queue simultaneously.
First one - thread executing control requests from local user,
second  one - pdump control thread executing pdump requests from pdump client.
That lock is introduced to avoid race condition between such 2 threads:
i.e. to prevent multiple threads to modify same list simultaneously.   
It is not intended to synchronise read/write accesses to the list, see above. 

Konstantin
  
Thomas Monjalon June 15, 2016, 11:17 a.m. UTC | #7
2016-06-15 09:54, Ananyev, Konstantin:
> 
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > Sent: Wednesday, June 15, 2016 9:49 AM
> > To: Ananyev, Konstantin
> > Cc: Pattan, Reshma; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > 2016-06-15 08:37, Ananyev, Konstantin:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2016-06-15 05:30, Pattan, Reshma:
> > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > > > 2016-06-14 10:38, Reshma Pattan:
> > > > > > > Added spinlocks around add/remove logic of Rx and Tx callbacks to
> > > > > > > avoid corruption of callback lists in multithreaded context.
> > > > > > >
> > > > > > > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
> > > > > >
> > > > > > Why cb->next is not locked in burst functions?
> > > > > It is safe to do "read access" here and doesn't require any locking as rx/tx burst is initiated  by only local user(control plane)
> > thread.
> > > > >
> > > > > > Just protecting add/remove but not its usage seems useless.
> > > > > Here locks were required  around add/remove to protect "write access"  because write to callback list is now done from 2
> > threads
> > > > > i.e. one from local user thread(control plane) and another from pdump control thread(initiated by remote pdump request).
> > > >
> > > > So read and write can be done by different threads.
> > >
> > > Yes, and this is possible even in current DPDK version (16.04).
> > > What is added by Reshma's patch - now it is possible to have concurrent write
> > > from 2 different thread to that list.
> > >
> > > > I think the read access would need locking but we do not want it
> > > > in fast path.
> > >
> > > I don't think it would be needed.
> > > As I said - read/write interaction didn't change from what we have right now.
> > > But if you have some particular scenario in mind that you believe would cause
> > > a race condition - please speak up.
> > 
> > If we add/remove a callback during a burst? Is it possible that the next
> > pointer would have a wrong value leading to a crash?
> > Maybe we need a comment to state that we should not alter burst
> > callbacks while running burst functions.
> 
> Current status (16.04):
> It is safe to add/remove RX/TX callbacks while 
> another thread is doing simultaneously RX/TX burst over same queue.
> I.E: it is supposed to be safe to invoke
> rte_eth_add(/remove)_rx(/tx)_callback() and rte_eth_rx_burst()/rte_eth_tx_burst()
> from different threads simultaneously.
> Though it is not safe to free/modify that rte_eth_rxtx_callback while current
> rte_eth_rx_burst()/rte_eth_tx_burst() are still active.
> That exactly what comments for rte_eth_remove_rx_callback() say:
> 
> * Note: the callback is removed from the callback list but it isn't freed
>  * since the it may still be in use. The memory for the callback can be
>  * subsequently freed back by the application by calling rte_free():
>  *
>  * - Immediately - if the port is stopped, or the user knows that no
>  *   callbacks are in flight e.g. if called from the thread doing RX/TX
>  *   on that queue.
>  *
>  * - After a short delay - where the delay is sufficient to allow any
>  *   in-flight callbacks to complete.
> 
> In other words, right now there only way to know for sure that it is safe
> to free the removed callback - is to stop the port.
> 
> Does it need to be changed, so when rte_eth_remove_rx_callback() returns
> user can safely free the callback (or even better rte_eth_remove_rx_callback free the callback for us)?
> In my opinion - yes.
> Though, I think, it has nothing to do with pdump patches, and I think should be a matter
> for separate a patch/discussion.
> 
> Now with pdump library introduction - there is possibility that 2 different threads
> can try to  add/remove callbacks for the same queue simultaneously.
> First one - thread executing control requests from local user,
> second  one - pdump control thread executing pdump requests from pdump client.
> That lock is introduced to avoid race condition between such 2 threads:
> i.e. to prevent multiple threads to modify same list simultaneously.   
> It is not intended to synchronise read/write accesses to the list, see above. 

OK thanks for the explanations
  
Ivan Boule June 15, 2016, 12:15 p.m. UTC | #8
On 06/15/2016 10:48 AM, Thomas Monjalon wrote:

>>
>>> I think the read access would need locking but we do not want it
>>> in fast path.
>>
>> I don't think it would be needed.
>> As I said - read/write interaction didn't change from what we have right now.
>> But if you have some particular scenario in mind that you believe would cause
>> a race condition - please speak up.
>
> If we add/remove a callback during a burst? Is it possible that the next
> pointer would have a wrong value leading to a crash?
> Maybe we need a comment to state that we should not alter burst
> callbacks while running burst functions.
>

Hi Reshma,

You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the 
function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list 
of RX callbacks associated with the polled RX queue to safely remove RX 
callback(s) in parallel.
The problem is not [only] with the setting and the loading of "cb->next" 
that you assume to be atomic operations, which is certainly true on most 
CPUs.
I see the 2 important following issues:

1) the "rte_eth_rxtx_callback" data structure associated with a removed 
RX callback could still be accessed in the callback parsing loop of the 
function "rte_eth_rx_burst()" after having been freed in parallel.

BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE 
MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that 
does not free the "rte_eth_rxtx_callback" data structure associated with 
the removed callback !

2) As a consequence of 1), RX callbacks can be invoked/executed 
while/after being removed.
If the application must free resources that it dynamically allocated to 
be used by the RX callback being removed, how to guarantee that the last 
invocation of that RX callback has been completed and that such a 
callback will never be invoked again, so that the resources can safely 
be freed?

This is an example of a well-known more generic object deletion problem 
which must arrange to guarantee that a deleted object is not used and 
not accessible for use anymore before being actually deleted (freed, for 
instance).

Note that a lock cannot be used in the execution path of the 
rte_eth_rx_burst() function to address this issue, as locks MUST NEVER 
be introduced in the RX/TX path of the DPDK framework.

Of course, the same issues stand for TX callbacks.

Regards,
Ivan
  
Ananyev, Konstantin June 15, 2016, 12:40 p.m. UTC | #9
Hi Ivan,

> -----Original Message-----
> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> Sent: Wednesday, June 15, 2016 1:15 PM
> To: Thomas Monjalon; Ananyev, Konstantin
> Cc: Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> 
> >>
> >>> I think the read access would need locking but we do not want it
> >>> in fast path.
> >>
> >> I don't think it would be needed.
> >> As I said - read/write interaction didn't change from what we have right now.
> >> But if you have some particular scenario in mind that you believe would cause
> >> a race condition - please speak up.
> >
> > If we add/remove a callback during a burst? Is it possible that the next
> > pointer would have a wrong value leading to a crash?
> > Maybe we need a comment to state that we should not alter burst
> > callbacks while running burst functions.
> >
> 
> Hi Reshma,
> 
> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> of RX callbacks associated with the polled RX queue to safely remove RX
> callback(s) in parallel.
> The problem is not [only] with the setting and the loading of "cb->next"
> that you assume to be atomic operations, which is certainly true on most
> CPUs.
> I see the 2 important following issues:
> 
> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> RX callback could still be accessed in the callback parsing loop of the
> function "rte_eth_rx_burst()" after having been freed in parallel.
> 
> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> does not free the "rte_eth_rxtx_callback" data structure associated with
> the removed callback !

Yes, though it is documented behaviour, someone can probably
refer it as a feature, not a bug ;)

> 
> 2) As a consequence of 1), RX callbacks can be invoked/executed
> while/after being removed.
> If the application must free resources that it dynamically allocated to
> be used by the RX callback being removed, how to guarantee that the last
> invocation of that RX callback has been completed and that such a
> callback will never be invoked again, so that the resources can safely
> be freed?
> 
> This is an example of a well-known more generic object deletion problem
> which must arrange to guarantee that a deleted object is not used and
> not accessible for use anymore before being actually deleted (freed, for
> instance).

Yes, and as I wrote in other mail, IMO it needs to be addressed.
But again it is already existing problem in rte_ethdev,
and I think it shouldn't stop pdump integration.
Konstantin

> 
> Note that a lock cannot be used in the execution path of the
> rte_eth_rx_burst() function to address this issue, as locks MUST NEVER
> be introduced in the RX/TX path of the DPDK framework.
> 
> Of course, the same issues stand for TX callbacks.
> 
> Regards,
> Ivan
> 
> 
> 
> --
> Ivan Boule
> 6WIND Development Engineer
  
Bruce Richardson June 15, 2016, 1:29 p.m. UTC | #10
On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> Hi Ivan,
> 
> > -----Original Message-----
> > From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > Sent: Wednesday, June 15, 2016 1:15 PM
> > To: Thomas Monjalon; Ananyev, Konstantin
> > Cc: Pattan, Reshma; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > 
> > >>
> > >>> I think the read access would need locking but we do not want it
> > >>> in fast path.
> > >>
> > >> I don't think it would be needed.
> > >> As I said - read/write interaction didn't change from what we have right now.
> > >> But if you have some particular scenario in mind that you believe would cause
> > >> a race condition - please speak up.
> > >
> > > If we add/remove a callback during a burst? Is it possible that the next
> > > pointer would have a wrong value leading to a crash?
> > > Maybe we need a comment to state that we should not alter burst
> > > callbacks while running burst functions.
> > >
> > 
> > Hi Reshma,
> > 
> > You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> > function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > of RX callbacks associated with the polled RX queue to safely remove RX
> > callback(s) in parallel.
> > The problem is not [only] with the setting and the loading of "cb->next"
> > that you assume to be atomic operations, which is certainly true on most
> > CPUs.
> > I see the 2 important following issues:
> > 
> > 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > RX callback could still be accessed in the callback parsing loop of the
> > function "rte_eth_rx_burst()" after having been freed in parallel.
> > 
> > BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > does not free the "rte_eth_rxtx_callback" data structure associated with
> > the removed callback !
> 
> Yes, though it is documented behaviour, someone can probably
> refer it as a feature, not a bug ;)
> 

+1
This is definitely not a bug, this is absolutely by design. One may argue with
the design, but it was done for a definite reason, so as to avoid paying the
penalty of having locks. It pushes more responsibility onto the app, but it
does allow the app to choose the best solution for managing the freeing of
memory for its situation. The alternative is to force all apps to pay the cost
of having locks, even if better options for freeing the memory are available.

/Bruce
  
Thomas Monjalon June 15, 2016, 1:49 p.m. UTC | #11
I agree this patch do not bring a new issue.
But the current status deserves to be discussed.

2016-06-15 09:54, Ananyev, Konstantin:
> It is safe to add/remove RX/TX callbacks while 
> another thread is doing simultaneously RX/TX burst over same queue.

You are probably right, but I don't why it is safe? On which CPU?
How can we be sure that read and write of the "next" pointer are atomic?
  
Ivan Boule June 15, 2016, 2:07 p.m. UTC | #12
On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
>> Hi Ivan,
>>
>>> -----Original Message-----
>>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
>>> Sent: Wednesday, June 15, 2016 1:15 PM
>>> To: Thomas Monjalon; Ananyev, Konstantin
>>> Cc: Pattan, Reshma; dev@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>
>>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
>>>
>>>>>
>>>>>> I think the read access would need locking but we do not want it
>>>>>> in fast path.
>>>>>
>>>>> I don't think it would be needed.
>>>>> As I said - read/write interaction didn't change from what we have right now.
>>>>> But if you have some particular scenario in mind that you believe would cause
>>>>> a race condition - please speak up.
>>>>
>>>> If we add/remove a callback during a burst? Is it possible that the next
>>>> pointer would have a wrong value leading to a crash?
>>>> Maybe we need a comment to state that we should not alter burst
>>>> callbacks while running burst functions.
>>>>
>>>
>>> Hi Reshma,
>>>
>>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
>>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
>>> of RX callbacks associated with the polled RX queue to safely remove RX
>>> callback(s) in parallel.
>>> The problem is not [only] with the setting and the loading of "cb->next"
>>> that you assume to be atomic operations, which is certainly true on most
>>> CPUs.
>>> I see the 2 important following issues:
>>>
>>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
>>> RX callback could still be accessed in the callback parsing loop of the
>>> function "rte_eth_rx_burst()" after having been freed in parallel.
>>>
>>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
>>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
>>> does not free the "rte_eth_rxtx_callback" data structure associated with
>>> the removed callback !
>>
>> Yes, though it is documented behaviour, someone can probably
>> refer it as a feature, not a bug ;)
>>
>
> +1
> This is definitely not a bug, this is absolutely by design. One may argue with
> the design, but it was done for a definite reason, so as to avoid paying the
> penalty of having locks. It pushes more responsibility onto the app, but it
> does allow the app to choose the best solution for managing the freeing of
> memory for its situation. The alternative is to force all apps to pay the cost
> of having locks, even if better options for freeing the memory are available.
>
> /Bruce
>

-1 (not to say 0xFFFFFFFF)

This is definitely an API design bug !
I would say that if you don't know how to free a resource that you 
allocate, it is very likely that you are wrong allocating it.
And this is exactly what happens here with RX/TX callback data structures.
This problem can easily be addressed by just changing the API as follows:

Change
     void *
     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
                             rte_rx_callback_fn fn, void *user_param)

to
     int
     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
                             struct rte_eth_rxtx_callback *cb)

In addition of solving the problem, this approach makes the API 
consistent and let the application allocate "rte_eth_rxtx_callback" data 
structures through any appropriate mean.

Regards,
Ivan
  
Bruce Richardson June 15, 2016, 2:19 p.m. UTC | #13
On Wed, Jun 15, 2016 at 04:07:20PM +0200, Ivan Boule wrote:
> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> >On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> >>Hi Ivan,
> >>
> >>>-----Original Message-----
> >>>From: Ivan Boule [mailto:ivan.boule@6wind.com]
> >>>Sent: Wednesday, June 15, 2016 1:15 PM
> >>>To: Thomas Monjalon; Ananyev, Konstantin
> >>>Cc: Pattan, Reshma; dev@dpdk.org
> >>>Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> >>>
> >>>On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> >>>
> >>>>>
> >>>>>>I think the read access would need locking but we do not want it
> >>>>>>in fast path.
> >>>>>
> >>>>>I don't think it would be needed.
> >>>>>As I said - read/write interaction didn't change from what we have right now.
> >>>>>But if you have some particular scenario in mind that you believe would cause
> >>>>>a race condition - please speak up.
> >>>>
> >>>>If we add/remove a callback during a burst? Is it possible that the next
> >>>>pointer would have a wrong value leading to a crash?
> >>>>Maybe we need a comment to state that we should not alter burst
> >>>>callbacks while running burst functions.
> >>>>
> >>>
> >>>Hi Reshma,
> >>>
> >>>You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> >>>function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> >>>of RX callbacks associated with the polled RX queue to safely remove RX
> >>>callback(s) in parallel.
> >>>The problem is not [only] with the setting and the loading of "cb->next"
> >>>that you assume to be atomic operations, which is certainly true on most
> >>>CPUs.
> >>>I see the 2 important following issues:
> >>>
> >>>1) the "rte_eth_rxtx_callback" data structure associated with a removed
> >>>RX callback could still be accessed in the callback parsing loop of the
> >>>function "rte_eth_rx_burst()" after having been freed in parallel.
> >>>
> >>>BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> >>>MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> >>>does not free the "rte_eth_rxtx_callback" data structure associated with
> >>>the removed callback !
> >>
> >>Yes, though it is documented behaviour, someone can probably
> >>refer it as a feature, not a bug ;)
> >>
> >
> >+1
> >This is definitely not a bug, this is absolutely by design. One may argue with
> >the design, but it was done for a definite reason, so as to avoid paying the
> >penalty of having locks. It pushes more responsibility onto the app, but it
> >does allow the app to choose the best solution for managing the freeing of
> >memory for its situation. The alternative is to force all apps to pay the cost
> >of having locks, even if better options for freeing the memory are available.
> >
> >/Bruce
> >
> 
> -1 (not to say 0xFFFFFFFF)
> 
> This is definitely an API design bug !
> I would say that if you don't know how to free a resource that you allocate,
> it is very likely that you are wrong allocating it.
> And this is exactly what happens here with RX/TX callback data structures.
> This problem can easily be addressed by just changing the API as follows:
> 
> Change
>     void *
>     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                             rte_rx_callback_fn fn, void *user_param)
> 
> to
>     int
>     rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                             struct rte_eth_rxtx_callback *cb)
> 
> In addition of solving the problem, this approach makes the API consistent
> and let the application allocate "rte_eth_rxtx_callback" data structures
> through any appropriate mean.
> 

That looks like a reasonable change to me. It keeps the important part of the
existing API behaviour, while making the API more consistent.

/Bruce
  
Ananyev, Konstantin June 15, 2016, 2:20 p.m. UTC | #14
> -----Original Message-----
> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> Sent: Wednesday, June 15, 2016 3:07 PM
> To: Richardson, Bruce; Ananyev, Konstantin
> Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> >> Hi Ivan,
> >>
> >>> -----Original Message-----
> >>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> >>> Sent: Wednesday, June 15, 2016 1:15 PM
> >>> To: Thomas Monjalon; Ananyev, Konstantin
> >>> Cc: Pattan, Reshma; dev@dpdk.org
> >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> >>>
> >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> >>>
> >>>>>
> >>>>>> I think the read access would need locking but we do not want it
> >>>>>> in fast path.
> >>>>>
> >>>>> I don't think it would be needed.
> >>>>> As I said - read/write interaction didn't change from what we have right now.
> >>>>> But if you have some particular scenario in mind that you believe would cause
> >>>>> a race condition - please speak up.
> >>>>
> >>>> If we add/remove a callback during a burst? Is it possible that the next
> >>>> pointer would have a wrong value leading to a crash?
> >>>> Maybe we need a comment to state that we should not alter burst
> >>>> callbacks while running burst functions.
> >>>>
> >>>
> >>> Hi Reshma,
> >>>
> >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> >>> of RX callbacks associated with the polled RX queue to safely remove RX
> >>> callback(s) in parallel.
> >>> The problem is not [only] with the setting and the loading of "cb->next"
> >>> that you assume to be atomic operations, which is certainly true on most
> >>> CPUs.
> >>> I see the 2 important following issues:
> >>>
> >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> >>> RX callback could still be accessed in the callback parsing loop of the
> >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> >>>
> >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> >>> the removed callback !
> >>
> >> Yes, though it is documented behaviour, someone can probably
> >> refer it as a feature, not a bug ;)
> >>
> >
> > +1
> > This is definitely not a bug, this is absolutely by design. One may argue with
> > the design, but it was done for a definite reason, so as to avoid paying the
> > penalty of having locks. It pushes more responsibility onto the app, but it
> > does allow the app to choose the best solution for managing the freeing of
> > memory for its situation. The alternative is to force all apps to pay the cost
> > of having locks, even if better options for freeing the memory are available.
> >
> > /Bruce
> >
> 
> -1 (not to say 0xFFFFFFFF)
> 
> This is definitely an API design bug !
> I would say that if you don't know how to free a resource that you
> allocate, it is very likely that you are wrong allocating it.
> And this is exactly what happens here with RX/TX callback data structures.
> This problem can easily be addressed by just changing the API as follows:
> 
> Change
>      void *
>      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                              rte_rx_callback_fn fn, void *user_param)
> 
> to
>      int
>      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>                              struct rte_eth_rxtx_callback *cb)
> 
> In addition of solving the problem, this approach makes the API
> consistent and let the application allocate "rte_eth_rxtx_callback" data
> structures through any appropriate mean.

That might make API a bit cleaner, but I don't see how it fixes the generic problem:
there is still no way to know by the upper layer when it is safe to free/re-use
removed callback, but to make sure that all IO on that queue is stopped
(I.E. some external synchronisation around the queue).   
As you said in previous mail: 
> This is an example of a well-known more generic object deletion problem
> which must arrange to guarantee that a deleted object is not used and
> not accessible for use anymore before being actually deleted (freed, for
> instance).
Konstantin

> 
> Regards,
> Ivan
> 
> --
> Ivan Boule
> 6WIND Development Engineer
  
Bruce Richardson June 15, 2016, 2:22 p.m. UTC | #15
On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > Sent: Wednesday, June 15, 2016 3:07 PM
> > To: Richardson, Bruce; Ananyev, Konstantin
> > Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> > >> Hi Ivan,
> > >>
> > >>> -----Original Message-----
> > >>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > >>> Sent: Wednesday, June 15, 2016 1:15 PM
> > >>> To: Thomas Monjalon; Ananyev, Konstantin
> > >>> Cc: Pattan, Reshma; dev@dpdk.org
> > >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > >>>
> > >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > >>>
> > >>>>>
> > >>>>>> I think the read access would need locking but we do not want it
> > >>>>>> in fast path.
> > >>>>>
> > >>>>> I don't think it would be needed.
> > >>>>> As I said - read/write interaction didn't change from what we have right now.
> > >>>>> But if you have some particular scenario in mind that you believe would cause
> > >>>>> a race condition - please speak up.
> > >>>>
> > >>>> If we add/remove a callback during a burst? Is it possible that the next
> > >>>> pointer would have a wrong value leading to a crash?
> > >>>> Maybe we need a comment to state that we should not alter burst
> > >>>> callbacks while running burst functions.
> > >>>>
> > >>>
> > >>> Hi Reshma,
> > >>>
> > >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> > >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > >>> of RX callbacks associated with the polled RX queue to safely remove RX
> > >>> callback(s) in parallel.
> > >>> The problem is not [only] with the setting and the loading of "cb->next"
> > >>> that you assume to be atomic operations, which is certainly true on most
> > >>> CPUs.
> > >>> I see the 2 important following issues:
> > >>>
> > >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > >>> RX callback could still be accessed in the callback parsing loop of the
> > >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> > >>>
> > >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> > >>> the removed callback !
> > >>
> > >> Yes, though it is documented behaviour, someone can probably
> > >> refer it as a feature, not a bug ;)
> > >>
> > >
> > > +1
> > > This is definitely not a bug, this is absolutely by design. One may argue with
> > > the design, but it was done for a definite reason, so as to avoid paying the
> > > penalty of having locks. It pushes more responsibility onto the app, but it
> > > does allow the app to choose the best solution for managing the freeing of
> > > memory for its situation. The alternative is to force all apps to pay the cost
> > > of having locks, even if better options for freeing the memory are available.
> > >
> > > /Bruce
> > >
> > 
> > -1 (not to say 0xFFFFFFFF)
> > 
> > This is definitely an API design bug !
> > I would say that if you don't know how to free a resource that you
> > allocate, it is very likely that you are wrong allocating it.
> > And this is exactly what happens here with RX/TX callback data structures.
> > This problem can easily be addressed by just changing the API as follows:
> > 
> > Change
> >      void *
> >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> >                              rte_rx_callback_fn fn, void *user_param)
> > 
> > to
> >      int
> >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> >                              struct rte_eth_rxtx_callback *cb)
> > 
> > In addition of solving the problem, this approach makes the API
> > consistent and let the application allocate "rte_eth_rxtx_callback" data
> > structures through any appropriate mean.
> 
> That might make API a bit cleaner, but I don't see how it fixes the generic problem:
> there is still no way to know by the upper layer when it is safe to free/re-use
> removed callback, but to make sure that all IO on that queue is stopped
> (I.E. some external synchronisation around the queue).   

Actually, it allows other, more creative solutions, like an app having a
statically allocated set/pool of callback structures that it doesn't need to
allocate or free at all.

/Bruce
  
Ananyev, Konstantin June 15, 2016, 2:27 p.m. UTC | #16
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Wednesday, June 15, 2016 3:22 PM
> To: Ananyev, Konstantin
> Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> 
> On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > > Sent: Wednesday, June 15, 2016 3:07 PM
> > > To: Richardson, Bruce; Ananyev, Konstantin
> > > Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > >
> > > On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > > > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> > > >> Hi Ivan,
> > > >>
> > > >>> -----Original Message-----
> > > >>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
> > > >>> Sent: Wednesday, June 15, 2016 1:15 PM
> > > >>> To: Thomas Monjalon; Ananyev, Konstantin
> > > >>> Cc: Pattan, Reshma; dev@dpdk.org
> > > >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > > >>>
> > > >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > > >>>
> > > >>>>>
> > > >>>>>> I think the read access would need locking but we do not want it
> > > >>>>>> in fast path.
> > > >>>>>
> > > >>>>> I don't think it would be needed.
> > > >>>>> As I said - read/write interaction didn't change from what we have right now.
> > > >>>>> But if you have some particular scenario in mind that you believe would cause
> > > >>>>> a race condition - please speak up.
> > > >>>>
> > > >>>> If we add/remove a callback during a burst? Is it possible that the next
> > > >>>> pointer would have a wrong value leading to a crash?
> > > >>>> Maybe we need a comment to state that we should not alter burst
> > > >>>> callbacks while running burst functions.
> > > >>>>
> > > >>>
> > > >>> Hi Reshma,
> > > >>>
> > > >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> > > >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > > >>> of RX callbacks associated with the polled RX queue to safely remove RX
> > > >>> callback(s) in parallel.
> > > >>> The problem is not [only] with the setting and the loading of "cb->next"
> > > >>> that you assume to be atomic operations, which is certainly true on most
> > > >>> CPUs.
> > > >>> I see the 2 important following issues:
> > > >>>
> > > >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > > >>> RX callback could still be accessed in the callback parsing loop of the
> > > >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> > > >>>
> > > >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > > >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > > >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> > > >>> the removed callback !
> > > >>
> > > >> Yes, though it is documented behaviour, someone can probably
> > > >> refer it as a feature, not a bug ;)
> > > >>
> > > >
> > > > +1
> > > > This is definitely not a bug, this is absolutely by design. One may argue with
> > > > the design, but it was done for a definite reason, so as to avoid paying the
> > > > penalty of having locks. It pushes more responsibility onto the app, but it
> > > > does allow the app to choose the best solution for managing the freeing of
> > > > memory for its situation. The alternative is to force all apps to pay the cost
> > > > of having locks, even if better options for freeing the memory are available.
> > > >
> > > > /Bruce
> > > >
> > >
> > > -1 (not to say 0xFFFFFFFF)
> > >
> > > This is definitely an API design bug !
> > > I would say that if you don't know how to free a resource that you
> > > allocate, it is very likely that you are wrong allocating it.
> > > And this is exactly what happens here with RX/TX callback data structures.
> > > This problem can easily be addressed by just changing the API as follows:
> > >
> > > Change
> > >      void *
> > >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> > >                              rte_rx_callback_fn fn, void *user_param)
> > >
> > > to
> > >      int
> > >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> > >                              struct rte_eth_rxtx_callback *cb)
> > >
> > > In addition of solving the problem, this approach makes the API
> > > consistent and let the application allocate "rte_eth_rxtx_callback" data
> > > structures through any appropriate mean.
> >
> > That might make API a bit cleaner, but I don't see how it fixes the generic problem:
> > there is still no way to know by the upper layer when it is safe to free/re-use
> > removed callback, but to make sure that all IO on that queue is stopped
> > (I.E. some external synchronisation around the queue).
> 
> Actually, it allows other, more creative solutions, like an app having a
> statically allocated set/pool of callback structures that it doesn't need to
> allocate or free at all.

I understand that, and as I said I am not against that change.
But it doesn't solve the problem in general.
Ok, if you have a static object, you wouldn't need to call free() for it,
but you still want to re-use it after remove_cb() has finished, right?
So there is no actual difference - the main question is:
at what point after remove_cb() it is safe to modify it's contents.
Konstantin

> 
> /Bruce
  
Ivan Boule June 15, 2016, 3:33 p.m. UTC | #17
On 06/15/2016 04:27 PM, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Richardson, Bruce
>> Sent: Wednesday, June 15, 2016 3:22 PM
>> To: Ananyev, Konstantin
>> Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>
>> On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
>>>> Sent: Wednesday, June 15, 2016 3:07 PM
>>>> To: Richardson, Bruce; Ananyev, Konstantin
>>>> Cc: Thomas Monjalon; Pattan, Reshma; dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>>
>>>> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
>>>>> On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
>>>>>> Hi Ivan,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ivan Boule [mailto:ivan.boule@6wind.com]
>>>>>>> Sent: Wednesday, June 15, 2016 1:15 PM
>>>>>>> To: Thomas Monjalon; Ananyev, Konstantin
>>>>>>> Cc: Pattan, Reshma; dev@dpdk.org
>>>>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>>>>>
>>>>>>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think the read access would need locking but we do not want it
>>>>>>>>>> in fast path.
>>>>>>>>>
>>>>>>>>> I don't think it would be needed.
>>>>>>>>> As I said - read/write interaction didn't change from what we have right now.
>>>>>>>>> But if you have some particular scenario in mind that you believe would cause
>>>>>>>>> a race condition - please speak up.
>>>>>>>>
>>>>>>>> If we add/remove a callback during a burst? Is it possible that the next
>>>>>>>> pointer would have a wrong value leading to a crash?
>>>>>>>> Maybe we need a comment to state that we should not alter burst
>>>>>>>> callbacks while running burst functions.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Reshma,
>>>>>>>
>>>>>>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
>>>>>>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
>>>>>>> of RX callbacks associated with the polled RX queue to safely remove RX
>>>>>>> callback(s) in parallel.
>>>>>>> The problem is not [only] with the setting and the loading of "cb->next"
>>>>>>> that you assume to be atomic operations, which is certainly true on most
>>>>>>> CPUs.
>>>>>>> I see the 2 important following issues:
>>>>>>>
>>>>>>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
>>>>>>> RX callback could still be accessed in the callback parsing loop of the
>>>>>>> function "rte_eth_rx_burst()" after having been freed in parallel.
>>>>>>>
>>>>>>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
>>>>>>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
>>>>>>> does not free the "rte_eth_rxtx_callback" data structure associated with
>>>>>>> the removed callback !
>>>>>>
>>>>>> Yes, though it is documented behaviour, someone can probably
>>>>>> refer it as a feature, not a bug ;)
>>>>>>
>>>>>
>>>>> +1
>>>>> This is definitely not a bug, this is absolutely by design. One may argue with
>>>>> the design, but it was done for a definite reason, so as to avoid paying the
>>>>> penalty of having locks. It pushes more responsibility onto the app, but it
>>>>> does allow the app to choose the best solution for managing the freeing of
>>>>> memory for its situation. The alternative is to force all apps to pay the cost
>>>>> of having locks, even if better options for freeing the memory are available.
>>>>>
>>>>> /Bruce
>>>>>
>>>>
>>>> -1 (not to say 0xFFFFFFFF)
>>>>
>>>> This is definitely an API design bug !
>>>> I would say that if you don't know how to free a resource that you
>>>> allocate, it is very likely that you are wrong allocating it.
>>>> And this is exactly what happens here with RX/TX callback data structures.
>>>> This problem can easily be addressed by just changing the API as follows:
>>>>
>>>> Change
>>>>       void *
>>>>       rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>>>>                               rte_rx_callback_fn fn, void *user_param)
>>>>
>>>> to
>>>>       int
>>>>       rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>>>>                               struct rte_eth_rxtx_callback *cb)
>>>>
>>>> In addition of solving the problem, this approach makes the API
>>>> consistent and let the application allocate "rte_eth_rxtx_callback" data
>>>> structures through any appropriate mean.
>>>
>>> That might make API a bit cleaner, but I don't see how it fixes the generic problem:
>>> there is still no way to know by the upper layer when it is safe to free/re-use
>>> removed callback, but to make sure that all IO on that queue is stopped
>>> (I.E. some external synchronisation around the queue).
>>
>> Actually, it allows other, more creative solutions, like an app having a
>> statically allocated set/pool of callback structures that it doesn't need to
>> allocate or free at all.
>
> I understand that, and as I said I am not against that change.
> But it doesn't solve the problem in general.
> Ok, if you have a static object, you wouldn't need to call free() for it,
> but you still want to re-use it after remove_cb() has finished, right?
> So there is no actual difference - the main question is:
> at what point after remove_cb() it is safe to modify it's contents.
> Konstantin
>
>>
>> /Bruce
>
Hi Bruce/Konstantin

There is no need to use locks to ensure that a RX callback being removed 
is not executed/invoked and will never be invoked again.
This issue can be easily addressed at application level through the 
common synchronization mechanism that consists in having the control 
thread of the application that manages the adding/removal of RX 
callbacks to:
1) Ask the packet processing function of the application to stop 
invoking the function rte_eth_rx_burst() on the target RX queue.
2) Once the packet processing function acked it stopped polling the 
target RX queue, safely remove the RX callback and free whatever 
resource needs to be freed.
3) Enable the packet processing function of the application to invoke 
again the function rte_eth_rx_burst() on the target RX queue.

Enjoy :-)

Ivan
  

Patch

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..ce70d58 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -77,6 +77,12 @@  static uint8_t nb_ports;
 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for add/remove rx callbacks */
+static rte_spinlock_t rte_eth_rx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* spinlock for add/remove tx callbacks */
+static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -1634,7 +1640,6 @@  rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id, uint16_t rx_queue_id,
 			STAT_QMAP_RX);
 }
 
-
 void
 rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 {
@@ -2905,7 +2910,6 @@  rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 		rte_errno = EINVAL;
 		return NULL;
 	}
-
 	struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
 
 	if (cb == NULL) {
@@ -2916,6 +2920,7 @@  rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.rx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
@@ -2928,6 +2933,7 @@  rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
 	return cb;
 }
@@ -2957,6 +2963,7 @@  rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 	cb->fn.tx = fn;
 	cb->param = user_param;
 
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
 	/* Add the callbacks in fifo order. */
 	struct rte_eth_rxtx_callback *tail =
 		rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id];
@@ -2969,6 +2976,7 @@  rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
 			tail = tail->next;
 		tail->next = cb;
 	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
 	return cb;
 }
@@ -2987,29 +2995,24 @@  rte_eth_remove_rx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->post_rx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+	int ret = -EINVAL;
+
+	rte_spinlock_lock(&rte_eth_rx_cb_lock);
+	prev_cb = &dev->post_rx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_rx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int
@@ -3026,29 +3029,24 @@  rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-	struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
-	struct rte_eth_rxtx_callback *prev_cb;
-
-	/* Reset head pointer and remove user cb if first in the list. */
-	if (cb == user_cb) {
-		dev->pre_tx_burst_cbs[queue_id] = user_cb->next;
-		return 0;
-	}
-
-	/* Remove the user cb from the callback list. */
-	do {
-		prev_cb = cb;
-		cb = cb->next;
-
+	int ret = -EINVAL;
+	struct rte_eth_rxtx_callback *cb;
+	struct rte_eth_rxtx_callback **prev_cb;
+
+	rte_spinlock_lock(&rte_eth_tx_cb_lock);
+	prev_cb = &dev->pre_tx_burst_cbs[queue_id];
+	for (; *prev_cb != NULL; prev_cb = &cb->next) {
+		cb = *prev_cb;
 		if (cb == user_cb) {
-			prev_cb->next = user_cb->next;
-			return 0;
+			/* Remove the user cb from the callback list. */
+			*prev_cb = cb->next;
+			ret = 0;
+			break;
 		}
+	}
+	rte_spinlock_unlock(&rte_eth_tx_cb_lock);
 
-	} while (cb != NULL);
-
-	/* Callback wasn't found. */
-	return -EINVAL;
+	return ret;
 }
 
 int