[v5,3/3] kni: fix kernel deadlock when using mlx devices

Message ID 20210329143655.521750-3-ferruh.yigit@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Series [v5,1/3] kni: refactor user request processing

Checks

Context                      Check    Description
ci/intel-Testing             success  Testing PASS
ci/Intel-compilation         success  Compilation OK
ci/iol-testing               success  Testing PASS
ci/iol-abi-testing           success  Testing PASS
ci/iol-mellanox-Performance  success  Performance Testing PASS
ci/iol-intel-Performance     success  Performance Testing PASS
ci/travis-robot              success  travis build: passed
ci/checkpatch                warning  coding style issues

Commit Message

Ferruh Yigit March 29, 2021, 2:36 p.m. UTC
KNI runs the userspace callback with the rtnl lock held. This does not
work with some devices that need to interact with the kernel interface
in the callback, like Mellanox devices.

The solution is to release the rtnl lock before calling the userspace
callback, but this requires two considerations:

1. The rtnl lock needs to be released before taking 'kni->sync_lock',
   otherwise it causes a deadlock with multiple KNI devices. See A. below
   for the details of the deadlock condition.

2. Releasing the rtnl lock for the interface down event causes a
   regression and a deadlock, so the rtnl lock can't be released for the
   interface down event. See B. below for the details.

As a solution, the interface down event is handled asynchronously, and
for all other events the rtnl lock is released before processing the
callback.

A. KNI sync lock is being locked while rtnl is held.
If two threads are calling kni_net_process_request(),
then the first one will take the sync lock, release the rtnl lock, then
sleep. The second thread will try to take the sync lock while holding
rtnl. The first thread will wake and try to lock rtnl, resulting in a
deadlock. The remedy is to release rtnl before locking the KNI sync
lock.
Since nothing in between accesses the Linux network state, no rtnl
locking is needed.
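
The ordering problem in A. can be illustrated with a small userspace
sketch. It is not kernel code: the tracker below is a hypothetical,
single-threaded stand-in that only records which lock was taken while
another was held, and all names in it are assumptions made for
illustration. A path that takes the locks in both orders is one that two
threads can deadlock on.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-ins for the two locks discussed above (names are illustrative). */
enum { RTNL, SYNC, NLOCKS };

struct order_tracker {
	bool held[NLOCKS];
	/* edge[a][b] is set when lock b was acquired while lock a was held */
	bool edge[NLOCKS][NLOCKS];
};

static void acquire(struct order_tracker *t, int lock)
{
	for (int other = 0; other < NLOCKS; other++)
		if (t->held[other])
			t->edge[other][lock] = true;
	t->held[lock] = true;
}

static void release(struct order_tracker *t, int lock)
{
	assert(t->held[lock]);
	t->held[lock] = false;
}

/* Both orders observed: two threads following this path can deadlock. */
static bool has_inversion(const struct order_tracker *t)
{
	return t->edge[RTNL][SYNC] && t->edge[SYNC][RTNL];
}

/* Ordering described in A: sync lock taken first, rtnl dropped afterwards. */
static bool problematic_ordering_inverts(void)
{
	struct order_tracker t;

	memset(&t, 0, sizeof(t));
	acquire(&t, RTNL);	/* caller enters with rtnl held */
	acquire(&t, SYNC);	/* sync lock taken under rtnl */
	release(&t, RTNL);	/* rtnl dropped while waiting for userspace */
	acquire(&t, RTNL);	/* rtnl re-taken while still holding sync */
	release(&t, SYNC);
	return has_inversion(&t);
}

/* Remedy: rtnl is dropped before the sync lock is taken. */
static bool fixed_ordering_inverts(void)
{
	struct order_tracker t;

	memset(&t, 0, sizeof(t));
	acquire(&t, RTNL);	/* caller enters with rtnl held */
	release(&t, RTNL);	/* drop rtnl first */
	acquire(&t, SYNC);
	release(&t, SYNC);
	acquire(&t, RTNL);	/* restore rtnl for the caller */
	return has_inversion(&t);
}
```

In this model the first path records both "sync under rtnl" and "rtnl
under sync", while the second path never holds the two locks at the same
time, which matches the remedy described above.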

B. There is a race condition in __dev_close_many() processing the
close_list while the application terminates.
It looks like if two KNI interfaces are terminating,
and one releases the rtnl lock, the other takes it and
updates the close_list while it is in an unstable state,
causing the close_list to become a circular linked list,
hence list_for_each_entry() will loop endlessly inside
__dev_close_many().

To summarize:
request != interface down: unlock rtnl, send request to user-space,
wait for response, send the response error code to the caller in user-space.

request == interface down: send request to user-space, return immediately
with error code of 0 (success) to user-space.
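
The two cases in the summary can be sketched as a small userspace model.
This is an illustration under stated assumptions, not the KNI
implementation: the `model` state, `user_callback()` and `send_event()`
are hypothetical names, the rtnl lock is reduced to a boolean, and the -5
error value is arbitrary. Only `req->async` and the final
`(ret == 0) ? req.result : ret` expression are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

struct request {
	bool async;	/* mirrors req->async from the patch */
	int result;	/* response code reported by userspace */
};

/* Hypothetical model state, not part of the KNI module. */
static struct {
	bool rtnl_held;		/* stand-in for the rtnl lock */
	bool cb_saw_rtnl;	/* did the callback run under rtnl? */
	int queued_async;	/* async requests sent without waiting */
} model = { .rtnl_held = true };

/* Hypothetical userspace callback that reports a failure. */
static int user_callback(void)
{
	model.cb_saw_rtnl = model.rtnl_held;
	return -5;
}

static int process_request(struct request *req)
{
	assert(model.rtnl_held);	/* plays the role of ASSERT_RTNL() */

	if (req->async) {
		/* interface down: send and return, rtnl stays held */
		model.queued_async++;
		return 0;
	}

	model.rtnl_held = false;	/* unlock rtnl before the callback */
	req->result = user_callback();	/* wait for the response */
	model.rtnl_held = true;		/* re-take rtnl for the caller */
	return 0;
}

/* Mirrors how callers derive the code returned to user-space. */
static int send_event(bool async)
{
	struct request req = { .async = async, .result = 0 };
	int ret = process_request(&req);

	return (ret == 0) ? req.result : ret;
}
```

In this model a synchronous event returns the callback's error code and
the callback never runs with rtnl held, while an asynchronous event
(interface down) always returns 0 and the callback result is not
reported back to the caller.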

Fixes: 3fc5ca2f6352 ("kni: initial import")
Cc: stable@dpdk.org

Signed-off-by: Elad Nachman <eladv6@gmail.com>
---
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Igor Ryzhov <iryzhov@nfware.com>
Cc: Dan Gora <dg@adax.com>

 #	kernel/linux/kni/kni_net.c.rej
---
 kernel/linux/kni/kni_net.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Ferruh Yigit April 9, 2021, 2:56 p.m. UTC | #1
Hi Elad, Igor,

Can you please review/test this set when you have time?

Thanks,
ferruh
Elad Nachman April 12, 2021, 2:35 p.m. UTC | #2
Hi,

The new patch is fine by me.

Tested several dozen restarts of our proprietary application without
any apparent problem.

FYI,

Elad.

Thomas Monjalon April 20, 2021, 11:07 p.m. UTC | #3
12/04/2021 16:35, Elad Nachman:
> Hi,
> 
> The new patch is fine by me.
> 
> Tested several dozens restarts of our proprietary application without
> apparent problem.

Series applied, thanks.
Igor Ryzhov April 23, 2021, 8:41 a.m. UTC | #4
This patch changes the behavior for KNI interface shutdown.
Previously we would receive a real response from the driver, now we
always receive success.
I think this should be reflected in the docs/release notes.

Igor

Ferruh Yigit April 23, 2021, 8:59 a.m. UTC | #5
On 4/23/2021 9:41 AM, Igor Ryzhov wrote:
> This patch changes the behavior for KNI interface shutdown.
> Previously we would receive a real response from the driver, now we 
> always receive success.
> I think this should be reflected in the docs/release notes.
> 

Hi Igor,

Makes sense, I can add it.

Meanwhile, do you think there is a benefit in making the shutdown behavior
configurable? Async/sync shutdown based on a module param?

Igor Ryzhov April 23, 2021, 12:43 p.m. UTC | #6
Hi Ferruh,

Thanks. I think it would be great to make this configurable, and maybe even
make shutdown synchronous by default to preserve the old behavior.

I would be grateful if you could spend time on the work and I am ready to
review it.

Igor

Igor Ryzhov April 23, 2021, 12:58 p.m. UTC | #7
Sorry, I remembered the problem with the deadlock.

We can't just make the shutdown command synchronous, because
we can't release the rtnl_lock anyway. So regardless of the process
mode (sync/async), we have to preserve the lock when processing
the shutdown. It looks like two different settings...


Patch

diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 6cf99da0dc92..f259327954b2 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -113,6 +113,14 @@  kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
 
 	ASSERT_RTNL();
 
+	/* If we need to wait and RTNL mutex is held
+	 * drop the mutex and hold reference to keep device
+	 */
+	if (req->async == 0) {
+		dev_hold(dev);
+		rtnl_unlock();
+	}
+
 	mutex_lock(&kni->sync_lock);
 
 	/* Construct data */
@@ -152,6 +160,10 @@  kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
 
 fail:
 	mutex_unlock(&kni->sync_lock);
+	if (req->async == 0) {
+		rtnl_lock();
+		dev_put(dev);
+	}
 	return ret;
 }
 
@@ -194,6 +206,10 @@  kni_net_release(struct net_device *dev)
 
 	/* Setting if_up to 0 means down */
 	req.if_up = 0;
+
+	/* request async because of the deadlock problem */
+	req.async = 1;
+
 	ret = kni_net_process_request(dev, &req);
 
 	return (ret == 0) ? req.result : ret;