[v2,02/10] kni: separate releasing netdev from freeing KNI interface

Message ID 20180629015508.26599-3-dg@adax.com (mailing list archive)
State Changes Requested, archived
Delegated to: Thomas Monjalon
Headers
Series [v2,01/10] kni: remove unused variables from struct kni_dev |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues

Commit Message

Dan Gora June 29, 2018, 1:55 a.m. UTC
  Currently the rte_kni kernel driver suffers from a problem where
when the interface is released, it generates a callback to the DPDK
application to change the interface state to Down.  However, after the
DPDK application handles the callback and generates a response back to
the kernel, the rte_kni driver cannot wake the thread which is asleep
waiting for the response, because it is holding the kni_link_lock
semaphore and it has already removed the 'struct kni_dev' from the
list of interfaces to poll for responses.

This means that if the KNI interface is in the Up state when
rte_kni_release() is called, it will always sleep for three seconds
until kni_net_release gives up waiting for a response from the DPDK
application.

To fix this, we must separate the step to release the kernel network
interface from the steps to remove the KNI interface from the list
of interfaces to poll.

When the kernel network interface is removed with unregister_netdev(),
if the interface is up, it will generate a callback to mark the
interface down, which calls kni_net_release().  kni_net_release() will
block waiting for the DPDK application to call rte_kni_handle_request()
to handle the callback, but it also needs the thread in the KNI driver
(either the per-dev thread for multi-thread or the per-driver thread)
to call kni_net_poll_resp() in order to wake the thread sleeping in
kni_net_release (actually kni_net_process_request()).

So now, KNI interfaces should be removed as such:

1) The user calls rte_kni_release().  This only unregisters the
netdev in the kernel, but touches nothing else.  This allows all the
threads to run which are necessary to handle the callback into the
DPDK application to mark the interface down.

2) The user stops the thread running rte_kni_handle_request().
After rte_kni_release() has been called, there will be no more
callbacks for that interface so it is not necessary.  It cannot be
running at the same time that rte_kni_free() frees all of the FIFOs
and DPDK memory for that KNI interface.

3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
ioctl which calls kni_ioctl_free().  This function removes the struct
kni_dev from the list of interfaces to poll (and kills the per-dev
kthread, if configured for multi-thread), then frees the memory in
the FIFOs.

Signed-off-by: Dan Gora <dg@adax.com>
---
 kernel/linux/kni/kni_misc.c                   | 69 +++++++++++++++----
 .../eal/include/exec-env/rte_kni_common.h     |  1 +
 2 files changed, 58 insertions(+), 12 deletions(-)
  

Comments

Ferruh Yigit Aug. 29, 2018, 10:59 a.m. UTC | #1
On 6/29/2018 2:55 AM, Dan Gora wrote:
> Currently the rte_kni kernel driver suffers from a problem where
> when the interface is released, it generates a callback to the DPDK
> application to change the interface state to Down.  However, after the
> DPDK application handles the callback and generates a response back to
> the kernel, the rte_kni driver cannot wake the thread which is asleep
> waiting for the response, because it is holding the kni_link_lock
> semaphore and it has already removed the 'struct kni_dev' from the
> list of interfaces to poll for responses.
> 
> This means that if the KNI interface is in the Up state when
> rte_kni_release() is called, it will always sleep for three seconds
> until kni_net_release gives up waiting for a response from the DPDK
> application.
> 
> To fix this, we must separate the step to release the kernel network
> interface from the steps to remove the KNI interface from the list
> of interfaces to poll.
> 
> When the kernel network interface is removed with unregister_netdev(),
> if the interface is up, it will generate a callback to mark the
> interface down, which calls kni_net_release().  kni_net_release() will
> block waiting for the DPDK application to call rte_kni_handle_request()
> to handle the callback, but it also needs the thread in the KNI driver
> (either the per-dev thread for multi-thread or the per-driver thread)
> to call kni_net_poll_resp() in order to wake the thread sleeping in
> kni_net_release (actually kni_net_process_request()).
> 
> So now, KNI interfaces should be removed as such:
> 
> 1) The user calls rte_kni_release().  This only unregisters the
> netdev in the kernel, but touches nothing else.  This allows all the
> threads to run which are necessary to handle the callback into the
> DPDK application to mark the interface down.
> 
> 2) The user stops the thread running rte_kni_handle_request().
> After rte_kni_release() has been called, there will be no more
> callbacks for that interface so it is not necessary.  It cannot be
> running at the same time that rte_kni_free() frees all of the FIFOs
> and DPDK memory for that KNI interface.
> 
> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
> ioctl which calls kni_ioctl_free().  This function removes the struct
> kni_dev from the list of interfaces to poll (and kills the per-dev
> kthread, if configured for multi-thread), then frees the memory in
> the FIFOs.
> 
> Signed-off-by: Dan Gora <dg@adax.com>


You are right, that problem exits.
Although I don't see problem related to holding the kni_list_lock, polling
thread terminated before unregister interface cause the problem.

And it has a reason to terminate polling thread first, because it uses device
resources.

Separating unregister and free steps looks good, but I am not sure if this
should be reflected to the user, with a new ioctl and API.
When user done with interface it calls rte_kni_release() to release them, does
user really need a rte_kni_free() API or need to know the difference of two, is
there any action to take in userspace between these two APIs? I think no.

What about keeping single rte_kni_release() API and solve the issue internally
in KNI?

Previously it was doing:
- Stop threads (also there is another single/multi thread error [1])
- kni_dev_remove()
	- unregister and free netdev() [2]
	- kni_net_release_fifo_phy() [3]

Instead internally can we do:
a- Unregister kernel interfaces, rte_kni_unregister()?
b- stop threads
c- kni_net_release_fifo_phy
d- free netdev

The challenge I can see is some time required between a) and b) to let userspace
app to response, we need a way to know response received before stopping the thread.

Another thing is there are two release path, kni_release() and
kni_ioctl_release() both should be fixed.



[1]
If multi thread enabled they have been stopped, but if single thread used it has
not been stopped that is why you don't see the 3 seconds delay for default
single thread case, but not stopping the polling thread but removing the
interface is wrong.

[2]
unregistering netdev will trigger a userspace request but response won't be read
because polling thread also polls the response queue, and that thread is already
stopped at this stage.

[3]
This is also wrong as you have pointed in later patch in your series,
kni_net_release_fifo_phy() moves packets from rxq/alloq queue to free queue,
queues are still allocated but the references kept in kernel may be invalid at
this stage because of free netdev()
  
Dan Gora Sept. 4, 2018, 12:20 a.m. UTC | #2
Hi Ferruh,

thanks for your feedback.. I apologize for the delay getting back to
you on this..

On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> You are right, that problem exits.
> Although I don't see problem related to holding the kni_list_lock, polling
> thread terminated before unregister interface cause the problem.

Actually no, the polling thread is not terminated when the deadlock occurs..

The code path is this:  Note that this only occurs when the interface
is in the UP state!

DPDK app calls
 rte_kni_release()->
  ioctl (RTE_KNI_IOCTL_RELEASE);

Down in the kernel in the KNI driver:
 kni_ioctl (RTE_KNI_IOCTL_RELEASE) ->
  kni_ioctl_release()->
   down_write(kni_list_lock);
   -- This prevents the polling thread from running because it does a
   -- "down_read(kni_list_lock)".
   kni_dev_remove()->
    unregister_netdev() - This generates the callback into the DPDK application.
     ndo_stop - kni_net_release() ->
      kni_net_process_request() ->
       wait_event_interruptible_timeout(3 seconds);

Back in the DPDK app (Note this has to be in a different thread than
the one which called rte_kni_release()!)
  rte_kni_handle_request() - RTE_KNI_REQ_CFG_NETWORK_IF ->
   ** Application callback for config_network_if **
  kni_fifo_put the response back to the kernel in the resp_q fifo.

This is the deadlock... The polling thread cannot run because
kni_ioctl_release() is still holding the kni_list_lock, so
kni_net_process_request() will not see the response back from the DPDK
app in the resp_q fifo.  Only when wait_event_interruptible_timeout
times out does kni_net_process_request see the response back from the
DPDK app, which causes kni_dev_remove() to return, allowing
kni_ioctl_release to drop the kni_list_lock.

> And it has a reason to terminate polling thread first, because it uses device
> resources.
>
> Separating unregister and free steps looks good, but I am not sure if this
> should be reflected to the user, with a new ioctl and API.
> When user done with interface it calls rte_kni_release() to release them, does
> user really need a rte_kni_free() API or need to know the difference of two, is
> there any action to take in userspace between these two APIs? I think no.
>
> What about keeping single rte_kni_release() API and solve the issue internally
> in KNI?
>
> Previously it was doing:
> - Stop threads (also there is another single/multi thread error [1])
> - kni_dev_remove()
>         - unregister and free netdev() [2]
>         - kni_net_release_fifo_phy() [3]
>
> Instead internally can we do:
> a- Unregister kernel interfaces, rte_kni_unregister()?
> b- stop threads
> c- kni_net_release_fifo_phy
> d- free netdev
>
> The challenge I can see is some time required between a) and b) to let userspace
> app to response, we need a way to know response received before stopping the thread.
>
> Another thing is there are two release path, kni_release() and
> kni_ioctl_release() both should be fixed.
>
>
>
> [1]
> If multi thread enabled they have been stopped, but if single thread used it has
> not been stopped that is why you don't see the 3 seconds delay for default
> single thread case, but not stopping the polling thread but removing the
> interface is wrong.


You see the problem in the single thread case as well if the interface
is in the 'UP' state when it is removed.  That is what triggers the
callback to the DPDK application to mark the interface 'down', as
shown above...

>
> [2]
> unregistering netdev will trigger a userspace request but response won't be read
> because polling thread also polls the response queue, and that thread is already
> stopped at this stage.
>

Not exactly.. The polling thread is not dead, just locked out because
kni_ioctl_release() is holding the kni_list_lock semaphore.


> [3]
> This is also wrong as you have pointed in later patch in your series,
> kni_net_release_fifo_phy() moves packets from rxq/alloq queue to free queue,
> queues are still allocated but the references kept in kernel may be invalid at
> this stage because of free netdev()

It's worse than that actually because you cannot touch 'dev' at all
after the free_netdev() because the 'struct kni_dev' is embedded in
the 'struct net_dev' allocated with alloc_netdev(), so you cannot call
kni_net_release_fifo_phy() after free_netdev:

kni_dev_remove()
{
    if (dev->net_dev) {
        unregister_netdev(dev->net_dev);
        free_netdev(dev->net_dev);
    }
    kni_net_release_fifo_phy(dev);
}

Because 'dev' has already been freed.

Similarly in kni_ioctl_release() you cannot do:

kni_dev_remove(dev);
list_del(&dev->list);

Because currently kni_dev_remove() calls free_netdevk(), which frees
the memory pointed to by 'dev'.

Maybe this all can be resolved more easily by allocating the 'struct
kni_dev' in separate memory rather than embedding it in the net_dev,
that way we can drop the 'kni_list_lock' before calling
kni_dev_remove() and we can still touch 'dev' after calling
free_netdev(), so we can then grab the lock again and remove it from
the list...

I'll noodle this a bit more and see if I can get it all to work.. I
just wanted to get back to you about how to reproduce the lockup.

thanks
dan
  
Dan Gora Sept. 4, 2018, 12:36 a.m. UTC | #3
Hi Ferruh,

I remembered now the motivation behind separating rte_kni_release()
and rte_kni_free().

The problem is that the DPDK thread which calls rte_kni_release()
_cannot_ be the same thread which handles callbacks from the KNI
driver via rte_kni_handle_request().  This is because the thread which
calls rte_kni_release() will be stuck down in
ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
that thread cannot call rte_kni_handle_request(), the callback would
then just timeout unless some other thread calls
rte_kni_handle_request().

So then you are in a bit of a chicken and egg situation.  You _have_
to have a separate thread calling rte_kni_handle_request periodically,
but that thread also _cannot_ run after rte_kni_release returns
(actually it's worse than that because it's actually after the
ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).

So in order to resolve this, I separated the release from the freeing
stages. This allows the DPDK application to keep the
rte_kni_handle_request() thread running while rte_kni_release() is
called so that it can handle the interface state callback, then kill
that thread so that it cannot touch any 'struct rte_kni' resources,
then free the struct rte_kni resources.


thanks
dan

On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:

>> When the kernel network interface is removed with unregister_netdev(),
>> if the interface is up, it will generate a callback to mark the
>> interface down, which calls kni_net_release().  kni_net_release() will
>> block waiting for the DPDK application to call rte_kni_handle_request()
>> to handle the callback, but it also needs the thread in the KNI driver
>> (either the per-dev thread for multi-thread or the per-driver thread)
>> to call kni_net_poll_resp() in order to wake the thread sleeping in
>> kni_net_release (actually kni_net_process_request()).
>>
>> So now, KNI interfaces should be removed as such:
>>
>> 1) The user calls rte_kni_release().  This only unregisters the
>> netdev in the kernel, but touches nothing else.  This allows all the
>> threads to run which are necessary to handle the callback into the
>> DPDK application to mark the interface down.
>>
>> 2) The user stops the thread running rte_kni_handle_request().
>> After rte_kni_release() has been called, there will be no more
>> callbacks for that interface so it is not necessary.  It cannot be
>> running at the same time that rte_kni_free() frees all of the FIFOs
>> and DPDK memory for that KNI interface.
>>
>> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
>> ioctl which calls kni_ioctl_free().  This function removes the struct
>> kni_dev from the list of interfaces to poll (and kills the per-dev
>> kthread, if configured for multi-thread), then frees the memory in
>> the FIFOs.
  
Ferruh Yigit Oct. 10, 2018, 5:24 p.m. UTC | #4
On 9/4/2018 1:36 AM, Dan Gora wrote:
> Hi Ferruh,
> 
> I remembered now the motivation behind separating rte_kni_release()
> and rte_kni_free().
> 
> The problem is that the DPDK thread which calls rte_kni_release()
> _cannot_ be the same thread which handles callbacks from the KNI
> driver via rte_kni_handle_request().  This is because the thread which
> calls rte_kni_release() will be stuck down in
> ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
> RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
> that thread cannot call rte_kni_handle_request(), the callback would
> then just timeout unless some other thread calls
> rte_kni_handle_request().
> 
> So then you are in a bit of a chicken and egg situation.  You _have_
> to have a separate thread calling rte_kni_handle_request periodically,
> but that thread also _cannot_ run after rte_kni_release returns
> (actually it's worse than that because it's actually after the
> ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).

I see, so we have problem in both end, -userspace side and kernel side.

Agreed that separating release & free may help, but I am not sure about adding a
new API for KNI.

Very simply, what about prevent kni_net_release() send callback to userspace?
This is already not working and removing it resolves the issues you mentioned.
Sample application calls rte_eth_dev_stop() after release itself, so behavior
will be same.

But the issues in kernel you mentioned, using `dev` after free_netdev() called
should be addressed.

> 
> So in order to resolve this, I separated the release from the freeing
> stages. This allows the DPDK application to keep the
> rte_kni_handle_request() thread running while rte_kni_release() is
> called so that it can handle the interface state callback, then kill
> that thread so that it cannot touch any 'struct rte_kni' resources,
> then free the struct rte_kni resources.
> 
> 
> thanks
> dan
> 
> On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
>>> When the kernel network interface is removed with unregister_netdev(),
>>> if the interface is up, it will generate a callback to mark the
>>> interface down, which calls kni_net_release().  kni_net_release() will
>>> block waiting for the DPDK application to call rte_kni_handle_request()
>>> to handle the callback, but it also needs the thread in the KNI driver
>>> (either the per-dev thread for multi-thread or the per-driver thread)
>>> to call kni_net_poll_resp() in order to wake the thread sleeping in
>>> kni_net_release (actually kni_net_process_request()).
>>>
>>> So now, KNI interfaces should be removed as such:
>>>
>>> 1) The user calls rte_kni_release().  This only unregisters the
>>> netdev in the kernel, but touches nothing else.  This allows all the
>>> threads to run which are necessary to handle the callback into the
>>> DPDK application to mark the interface down.
>>>
>>> 2) The user stops the thread running rte_kni_handle_request().
>>> After rte_kni_release() has been called, there will be no more
>>> callbacks for that interface so it is not necessary.  It cannot be
>>> running at the same time that rte_kni_free() frees all of the FIFOs
>>> and DPDK memory for that KNI interface.
>>>
>>> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
>>> ioctl which calls kni_ioctl_free().  This function removes the struct
>>> kni_dev from the list of interfaces to poll (and kills the per-dev
>>> kthread, if configured for multi-thread), then frees the memory in
>>> the FIFOs.
  
Dan Gora Oct. 10, 2018, 6:18 p.m. UTC | #5
On Wed, Oct 10, 2018 at 2:25 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 9/4/2018 1:36 AM, Dan Gora wrote:
> > Hi Ferruh,
> >
> > I remembered now the motivation behind separating rte_kni_release()
> > and rte_kni_free().
> >
> > The problem is that the DPDK thread which calls rte_kni_release()
> > _cannot_ be the same thread which handles callbacks from the KNI
> > driver via rte_kni_handle_request().  This is because the thread which
> > calls rte_kni_release() will be stuck down in
> > ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
> > RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
> > that thread cannot call rte_kni_handle_request(), the callback would
> > then just timeout unless some other thread calls
> > rte_kni_handle_request().
> >
> > So then you are in a bit of a chicken and egg situation.  You _have_
> > to have a separate thread calling rte_kni_handle_request periodically,
> > but that thread also _cannot_ run after rte_kni_release returns
> > (actually it's worse than that because it's actually after the
> > ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).
>
> I see, so we have problem in both end, -userspace side and kernel side.
>
> Agreed that separating release & free may help, but I am not sure about adding a
> new API for KNI.
>
> Very simply, what about prevent kni_net_release() send callback to userspace?

No, because how is the DPDK application going to know when the user
does 'ip link set down dev <kniX>'?   It's important for the DPDK
application to know when the KNI interface is marked down.

> This is already not working and removing it resolves the issues you mentioned.

Huh?  How is it not working?  Of course it works.

> Sample application calls rte_eth_dev_stop() after release itself, so behavior
> will be same.

Huh?

> But the issues in kernel you mentioned, using `dev` after free_netdev() called
> should be addressed.

Yes, that's why I fixed them in the patches that I sent.

d
  
Ferruh Yigit Oct. 10, 2018, 10:51 p.m. UTC | #6
On 10/10/2018 7:18 PM, Dan Gora wrote:
> On Wed, Oct 10, 2018 at 2:25 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> On 9/4/2018 1:36 AM, Dan Gora wrote:
>>> Hi Ferruh,
>>>
>>> I remembered now the motivation behind separating rte_kni_release()
>>> and rte_kni_free().
>>>
>>> The problem is that the DPDK thread which calls rte_kni_release()
>>> _cannot_ be the same thread which handles callbacks from the KNI
>>> driver via rte_kni_handle_request().  This is because the thread which
>>> calls rte_kni_release() will be stuck down in
>>> ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
>>> RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
>>> that thread cannot call rte_kni_handle_request(), the callback would
>>> then just timeout unless some other thread calls
>>> rte_kni_handle_request().
>>>
>>> So then you are in a bit of a chicken and egg situation.  You _have_
>>> to have a separate thread calling rte_kni_handle_request periodically,
>>> but that thread also _cannot_ run after rte_kni_release returns
>>> (actually it's worse than that because it's actually after the
>>> ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).
>>
>> I see, so we have problem in both end, -userspace side and kernel side.
>>
>> Agreed that separating release & free may help, but I am not sure about adding a
>> new API for KNI.
>>
>> Very simply, what about prevent kni_net_release() send callback to userspace?
> 
> No, because how is the DPDK application going to know when the user
> does 'ip link set down dev <kniX>'?   It's important for the DPDK
> application to know when the KNI interface is marked down.

I mean kni_net_release() called because of unregister_netdev(),

it is possible to set a flag in kni_dev_remove(), before unregister_netdev(),
and in kni_net_release() don't call kni_net_process_request() if flag is set.

Looks like it can work and only a few lines of code, what do you think?

> 
>> This is already not working and removing it resolves the issues you mentioned.
> 
> Huh?  How is it not working?  Of course it works.

The kni_net_release() called because of unregister_netdev() is not working, as
you explained in userspace the thread handles request already terminated, even
if not in kernel side response not received and timed off because of lock...

> 
>> Sample application calls rte_eth_dev_stop() after release itself, so behavior
>> will be same.
> 
> Huh?

in kni sample app, in kni_free_kni() rte_eth_dev_stop() is called after
rte_kni_release().
So if you prevent kni_net_release() called because of unregister_netdev() to
send callback it won't be problem because of existing rte_eth_dev_stop()

> 
>> But the issues in kernel you mentioned, using `dev` after free_netdev() called
>> should be addressed.
> 
> Yes, that's why I fixed them in the patches that I sent.
> 
> d
>
  
Dan Gora Oct. 10, 2018, 11:38 p.m. UTC | #7
I'll have a look at this.. Maybe it will work, I'll have to try it and
test it to see if there are any gotchas that are not obvious.

d

On Wed, Oct 10, 2018 at 7:51 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >> Very simply, what about prevent kni_net_release() send callback to userspace?
> >
> > No, because how is the DPDK application going to know when the user
> > does 'ip link set down dev <kniX>'?   It's important for the DPDK
> > application to know when the KNI interface is marked down.
>
> I mean kni_net_release() called because of unregister_netdev(),
>
> it is possible to set a flag in kni_dev_remove(), before unregister_netdev(),
> and in kni_net_release() don't call kni_net_process_request() if flag is set.
>
> Looks like it can work and only a few lines of code, what do you think?
>
> >
> >> This is already not working and removing it resolves the issues you mentioned.
> >
> > Huh?  How is it not working?  Of course it works.
>
> The kni_net_release() called because of unregister_netdev() is not working, as
> you explained in userspace the thread handles request already terminated, even
> if not in kernel side response not received and timed off because of lock...
>
> >
> >> Sample application calls rte_eth_dev_stop() after release itself, so behavior
> >> will be same.
> >
> > Huh?
>
> in kni sample app, in kni_free_kni() rte_eth_dev_stop() is called after
> rte_kni_release().
> So if you prevent kni_net_release() called because of unregister_netdev() to
> send callback it won't be problem because of existing rte_eth_dev_stop()
>
  

Patch

diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index fa69f8e63..d889ffc4b 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -192,8 +192,6 @@  kni_dev_remove(struct kni_dev *dev)
 		free_netdev(dev->net_dev);
 	}
 
-	kni_net_release_fifo_phy(dev);
-
 	return 0;
 }
 
@@ -224,6 +222,7 @@  kni_release(struct inode *inode, struct file *file)
 		}
 
 		kni_dev_remove(dev);
+		kni_net_release_fifo_phy(dev);
 		list_del(&dev->list);
 	}
 	up_write(&knet->kni_list_lock);
@@ -263,7 +262,6 @@  kni_run_thread(struct kni_net *knet, struct kni_dev *kni, uint8_t force_bind)
 		kni->pthread = kthread_create(kni_thread_multiple,
 			(void *)kni, "kni_%s", kni->name);
 		if (IS_ERR(kni->pthread)) {
-			kni_dev_remove(kni);
 			return -ECANCELED;
 		}
 
@@ -278,7 +276,6 @@  kni_run_thread(struct kni_net *knet, struct kni_dev *kni, uint8_t force_bind)
 				(void *)knet, "kni_single");
 			if (IS_ERR(knet->kni_kthread)) {
 				mutex_unlock(&knet->kni_kthread_lock);
-				kni_dev_remove(kni);
 				return -ECANCELED;
 			}
 
@@ -460,15 +457,17 @@  kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	if (ret) {
 		pr_err("error %i registering device \"%s\"\n",
 					ret, dev_info.name);
-		kni->net_dev = NULL;
-		kni_dev_remove(kni);
+		kni_net_release_fifo_phy(kni);
 		free_netdev(net_dev);
 		return -ENODEV;
 	}
 
 	ret = kni_run_thread(knet, kni, dev_info.force_bind);
-	if (ret != 0)
+	if (ret != 0) {
+		kni_dev_remove(kni);
+		kni_net_release_fifo_phy(kni);
 		return ret;
+	}
 
 	down_write(&knet->kni_list_lock);
 	list_add(&kni->list, &knet->kni_list_head);
@@ -495,6 +494,46 @@  kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 		return -EIO;
 	}
 
+	/* Release the network device according to its name */
+	if (strlen(dev_info.name) == 0)
+		return ret;
+
+	down_read(&knet->kni_list_lock);
+	list_for_each_entry_safe(dev, n, &knet->kni_list_head, list) {
+		if (strncmp(dev->name, dev_info.name, RTE_KNI_NAMESIZE) != 0)
+			continue;
+
+		kni_dev_remove(dev);
+		up_read(&knet->kni_list_lock);
+		pr_info("Successfully released kni interface: %s\n",
+			dev_info.name);
+		return 0;
+	}
+	up_read(&knet->kni_list_lock);
+	pr_info("Error: failed to find kni interface: %s\n",
+		dev_info.name);
+
+	return ret;
+}
+
+static int
+kni_ioctl_free(struct net *net, uint32_t ioctl_num,
+		unsigned long ioctl_param)
+{
+	struct kni_net *knet = net_generic(net, kni_net_id);
+	int ret = -EINVAL;
+	struct kni_dev *dev, *n;
+	struct rte_kni_device_info dev_info;
+
+	if (_IOC_SIZE(ioctl_num) > sizeof(dev_info))
+		return -EINVAL;
+
+	ret = copy_from_user(&dev_info, (void *)ioctl_param, sizeof(dev_info));
+	if (ret) {
+		pr_err("copy_from_user in kni_ioctl_release");
+		return -EIO;
+	}
+
 	/* Release the network device according to its name */
 	if (strlen(dev_info.name) == 0)
 		return ret;
@@ -509,14 +548,17 @@  kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 			dev->pthread = NULL;
 		}
 
-		kni_dev_remove(dev);
+		kni_net_release_fifo_phy(dev);
 		list_del(&dev->list);
-		ret = 0;
-		break;
+		up_write(&knet->kni_list_lock);
+		pr_info("Successfully freed kni interface: %s\n",
+			dev_info.name);
+		return 0;
 	}
 	up_write(&knet->kni_list_lock);
-	pr_info("%s release kni named %s\n",
-		(ret == 0 ? "Successfully" : "Unsuccessfully"), dev_info.name);
+
+	pr_info("Error: failed to find kni interface: %s\n",
+		dev_info.name);
 
 	return ret;
 }
@@ -542,6 +584,9 @@  kni_ioctl(struct inode *inode, uint32_t ioctl_num, unsigned long ioctl_param)
 	case _IOC_NR(RTE_KNI_IOCTL_RELEASE):
 		ret = kni_ioctl_release(net, ioctl_num, ioctl_param);
 		break;
+	case _IOC_NR(RTE_KNI_IOCTL_FREE):
+		ret = kni_ioctl_free(net, ioctl_num, ioctl_param);
+		break;
 	default:
 		pr_debug("IOCTL default\n");
 		break;
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index cfa9448bd..318a3f939 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -129,5 +129,6 @@  struct rte_kni_device_info {
 #define RTE_KNI_IOCTL_TEST    _IOWR(0, 1, int)
 #define RTE_KNI_IOCTL_CREATE  _IOWR(0, 2, struct rte_kni_device_info)
 #define RTE_KNI_IOCTL_RELEASE _IOWR(0, 3, struct rte_kni_device_info)
+#define RTE_KNI_IOCTL_FREE    _IOWR(0, 4, struct rte_kni_device_info)
 
 #endif /* _RTE_KNI_COMMON_H_ */