[dpdk-dev,07/11] ethdev: add lock to port allocation check

Message ID 20180509094337.26112-8-thomas@monjalon.net (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Thomas Monjalon May 9, 2018, 9:43 a.m. UTC
  From: Matan Azrad <matan@mellanox.com>

When comparing the port name, there can be a race condition with
a thread allocating a new port and writing the name at the same time.
It can lead to match with a partial name by error.

The check of the port is now considered as a critical section
protected with locks.

This fix will be even more required for multi-process when the
port availability will rely only on the name, in a following patch.

Fixes: 84934303a17c ("ethdev: synchronize port allocation")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ethdev/rte_ethdev.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)
  

Comments

Gaëtan Rivet May 9, 2018, 12:21 p.m. UTC | #1
Hi,

On Wed, May 09, 2018 at 11:43:33AM +0200, Thomas Monjalon wrote:
> From: Matan Azrad <matan@mellanox.com>
> 
> When comparing the port name, there can be a race condition with
> a thread allocating a new port and writing the name at the same time.
> It can lead to match with a partial name by error.
> 
> The check of the port is now considered as a critical section
> protected with locks.
> 
> This fix will be even more required for multi-process when the
> port availability will rely only on the name, in a following patch.
> 
> Fixes: 84934303a17c ("ethdev: synchronize port allocation")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  lib/librte_ethdev/rte_ethdev.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index ae86d0ba7..357be2dca 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -227,8 +227,8 @@ rte_eth_dev_shared_data_prepare(void)
>  	rte_spinlock_unlock(&rte_eth_shared_data_lock);
>  }
>  
> -struct rte_eth_dev *
> -rte_eth_dev_allocated(const char *name)
> +static struct rte_eth_dev *
> +rte_eth_dev_allocated_lock_free(const char *name)

A suggestion about the naming here.
Reading subsequent patches, we can see this function being used during
ethdev allocation routines. The _lock_free suffix is a little
misleading, as for an instant one can think that there is something
being freed about an allocated ethdev lock.

I would suggest

  * rte_eth_dev_allocated_nolock
    or
  * rte_eth_dev_allocated_lockless
    (or even rte_eth_lockless_dev_allocated)

instead.

>  {
>  	unsigned i;
>  
> @@ -240,6 +240,22 @@ rte_eth_dev_allocated(const char *name)
>  	return NULL;
>  }
>  
> +struct rte_eth_dev *
> +rte_eth_dev_allocated(const char *name)
> +{
> +	struct rte_eth_dev *ethdev;
> +
> +	rte_eth_dev_shared_data_prepare();
> +
> +	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> +
> +	ethdev = rte_eth_dev_allocated_lock_free(name);
> +
> +	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> +
> +	return ethdev;
> +}
> +
>  static uint16_t
>  rte_eth_dev_find_free_port(void)
>  {
> @@ -286,7 +302,7 @@ rte_eth_dev_allocate(const char *name)
>  		goto unlock;
>  	}
>  
> -	if (rte_eth_dev_allocated(name) != NULL) {
> +	if (rte_eth_dev_allocated_lock_free(name) != NULL) {
>  		ethdev_log(ERR,
>  			"Ethernet Device with name %s already allocated!",
>  			name);
> -- 
> 2.16.2
>
  
Thomas Monjalon May 9, 2018, 12:25 p.m. UTC | #2
09/05/2018 14:21, Gaëtan Rivet:
> Hi,
> 
> On Wed, May 09, 2018 at 11:43:33AM +0200, Thomas Monjalon wrote:
> > From: Matan Azrad <matan@mellanox.com>
> > 
> > When comparing the port name, there can be a race condition with
> > a thread allocating a new port and writing the name at the same time.
> > It can lead to match with a partial name by error.
> > 
> > The check of the port is now considered as a critical section
> > protected with locks.
> > 
> > This fix will be even more required for multi-process when the
> > port availability will rely only on the name, in a following patch.
> > 
> > Fixes: 84934303a17c ("ethdev: synchronize port allocation")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  lib/librte_ethdev/rte_ethdev.c | 22 +++++++++++++++++++---
> >  1 file changed, 19 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> > index ae86d0ba7..357be2dca 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -227,8 +227,8 @@ rte_eth_dev_shared_data_prepare(void)
> >  	rte_spinlock_unlock(&rte_eth_shared_data_lock);
> >  }
> >  
> > -struct rte_eth_dev *
> > -rte_eth_dev_allocated(const char *name)
> > +static struct rte_eth_dev *
> > +rte_eth_dev_allocated_lock_free(const char *name)
> 
> A suggestion about the naming here.
> Reading subsequent patches, we can see this function being used during
> ethdev allocation routines. The _lock_free suffix is a little
> misleading, as for an instant one can think that there is something
> being freed about an allocated ethdev lock.
> 
> I would suggest
> 
>   * rte_eth_dev_allocated_nolock
>     or
>   * rte_eth_dev_allocated_lockless
>     (or even rte_eth_lockless_dev_allocated)
> 
> instead.

Good suggestions. I vote for rte_eth_dev_allocated_nolock.
Thanks
  
Stephen Hemminger May 10, 2018, 8:33 p.m. UTC | #3
On Wed,  9 May 2018 11:43:33 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

>  
> +struct rte_eth_dev *
> +rte_eth_dev_allocated(const char *name)
> +{
> +	struct rte_eth_dev *ethdev;
> +
> +	rte_eth_dev_shared_data_prepare();
> +
> +	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> +
> +	ethdev = rte_eth_dev_allocated_lock_free(name);
> +
> +	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> +
> +	return ethdev;
> +}
> +

Not sure about this. The code it self is correct, but it creates
a racy semantic.

If caller doesn't already hold a lock then there is no guarantee that
the device returned won't be destroyed by some other thread or that
the name was just allocated by some other process.
  
Stephen Hemminger May 10, 2018, 8:35 p.m. UTC | #4
On Wed, 9 May 2018 14:21:17 +0200
Gaëtan Rivet <gaetan.rivet@6wind.com> wrote:

> A suggestion about the naming here.
> Reading subsequent patches, we can see this function being used during
> ethdev allocation routines. The _lock_free suffix is a little
> misleading, as for an instant one can think that there is something
> being freed about an allocated ethdev lock.
> 
> I would suggest
> 
>   * rte_eth_dev_allocated_nolock
>     or
>   * rte_eth_dev_allocated_lockless
>     (or even rte_eth_lockless_dev_allocated)
> 
> instead.

Personally, used to the convention of:
    rte_eth_dev_find(name)
and
    _rte_eth_dev_find(name)

The _ implies internal version without lock.

Also allocated to me implies a boolean test only.
  
Thomas Monjalon May 10, 2018, 10:10 p.m. UTC | #5
10/05/2018 22:33, Stephen Hemminger:
> On Wed,  9 May 2018 11:43:33 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> >  
> > +struct rte_eth_dev *
> > +rte_eth_dev_allocated(const char *name)
> > +{
> > +	struct rte_eth_dev *ethdev;
> > +
> > +	rte_eth_dev_shared_data_prepare();
> > +
> > +	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> > +
> > +	ethdev = rte_eth_dev_allocated_lock_free(name);
> > +
> > +	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> > +
> > +	return ethdev;
> > +}
> > +
> 
> Not sure about this. The code it self is correct, but it creates
> a racy semantic.
> 
> If caller doesn't already hold a lock then there is no guarantee that
> the device returned won't be destroyed by some other thread

It is an old high level design decision in DPDK:
We do not hold a lock during the whole life of a port.
So it is the application responsibility to not mess its own ports.
The consequence is that one port must be managed by only one thread.

We can discuss the original thread design but it is out of the
scope of this patchset.

> or that the name was just allocated by some other process.

It does not say which process allocated the port, yes.
But the name is unique among processes.
So the process knows for sure what to do with the port having this name.
  
Thomas Monjalon May 10, 2018, 10:13 p.m. UTC | #6
10/05/2018 22:35, Stephen Hemminger:
> On Wed, 9 May 2018 14:21:17 +0200
> Gaëtan Rivet <gaetan.rivet@6wind.com> wrote:
> 
> > A suggestion about the naming here.
> > Reading subsequent patches, we can see this function being used during
> > ethdev allocation routines. The _lock_free suffix is a little
> > misleading, as for an instant one can think that there is something
> > being freed about an allocated ethdev lock.
> > 
> > I would suggest
> > 
> >   * rte_eth_dev_allocated_nolock
> >     or
> >   * rte_eth_dev_allocated_lockless
> >     (or even rte_eth_lockless_dev_allocated)
> > 
> > instead.
> 
> Personally, used to the convention of:
>     rte_eth_dev_find(name)
> and
>     _rte_eth_dev_find(name)
> 
> The _ implies internal version without lock.

It is a matter of taste.
We have chosen "nolock" in v2, and I think it is explicit.

> Also allocated to me implies a boolean test only.

Yes, the name is a bit strange.
But it is old and out of the scope of this patch.
If we want to change it, it is an API deprecation.
  
Stephen Hemminger May 10, 2018, 10:29 p.m. UTC | #7
On Fri, 11 May 2018 00:10:19 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

> 10/05/2018 22:33, Stephen Hemminger:
> > On Wed,  9 May 2018 11:43:33 +0200
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >   
> > >  
> > > +struct rte_eth_dev *
> > > +rte_eth_dev_allocated(const char *name)
> > > +{
> > > +	struct rte_eth_dev *ethdev;
> > > +
> > > +	rte_eth_dev_shared_data_prepare();
> > > +
> > > +	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> > > +
> > > +	ethdev = rte_eth_dev_allocated_lock_free(name);
> > > +
> > > +	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> > > +
> > > +	return ethdev;
> > > +}
> > > +  
> > 
> > Not sure about this. The code it self is correct, but it creates
> > a racy semantic.
> > 
> > If caller doesn't already hold a lock then there is no guarantee that
> > the device returned won't be destroyed by some other thread  
> 
> It is an old high level design decision in DPDK:
> We do not hold a lock during the whole life of a port.
> So it is the application responsibility to not mess its own ports.
> The consequence is that one port must be managed by only one thread.
> 
> We can discuss the original thread design but it is out of the
> scope of this patchset.
> 
> > or that the name was just allocated by some other process.  
> 
> It does not say which process allocated the port, yes.
> But the name is unique among processes.
> So the process knows for sure what to do with the port having this name.

For future, I would like to change rte_eth_devices from an array of structures to
an array of pointers.  Reserving a port could be done with atomic exchange, and keep
a bitmap as hint for next free port to choose.  When supporting tunnels etc, it makes sense
to support lots of ports (like > 16 bit); and devices may come and go.

Also, change link state in eth device into full operational state value.
That should be enough to cover both tunnel and failsafe usage, and existing state
value can go away.

The ownership model should also be expressed more as functional operations in the
device model. It needs to be used consistently in multiple places, allow more layering
and also have more error checks builtin.
  
Thomas Monjalon May 10, 2018, 11:47 p.m. UTC | #8
11/05/2018 00:13, Thomas Monjalon:
> 10/05/2018 22:35, Stephen Hemminger:
> > On Wed, 9 May 2018 14:21:17 +0200
> > Gaëtan Rivet <gaetan.rivet@6wind.com> wrote:
> > 
> > > A suggestion about the naming here.
> > > Reading subsequent patches, we can see this function being used during
> > > ethdev allocation routines. The _lock_free suffix is a little
> > > misleading, as for an instant one can think that there is something
> > > being freed about an allocated ethdev lock.
> > > 
> > > I would suggest
> > > 
> > >   * rte_eth_dev_allocated_nolock
> > >     or
> > >   * rte_eth_dev_allocated_lockless
> > >     (or even rte_eth_lockless_dev_allocated)
> > > 
> > > instead.
> > 
> > Personally, used to the convention of:
> >     rte_eth_dev_find(name)
> > and
> >     _rte_eth_dev_find(name)
> > 
> > The _ implies internal version without lock.
> 
> It is a matter of taste.
> We have chosen "nolock" in v2, and I think it is explicit.

After a second thought, I have decided to follow the underscore convention:
	_rte_eth_dev_allocated
  

Patch

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index ae86d0ba7..357be2dca 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -227,8 +227,8 @@  rte_eth_dev_shared_data_prepare(void)
 	rte_spinlock_unlock(&rte_eth_shared_data_lock);
 }
 
-struct rte_eth_dev *
-rte_eth_dev_allocated(const char *name)
+static struct rte_eth_dev *
+rte_eth_dev_allocated_lock_free(const char *name)
 {
 	unsigned i;
 
@@ -240,6 +240,22 @@  rte_eth_dev_allocated(const char *name)
 	return NULL;
 }
 
+struct rte_eth_dev *
+rte_eth_dev_allocated(const char *name)
+{
+	struct rte_eth_dev *ethdev;
+
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	ethdev = rte_eth_dev_allocated_lock_free(name);
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
+	return ethdev;
+}
+
 static uint16_t
 rte_eth_dev_find_free_port(void)
 {
@@ -286,7 +302,7 @@  rte_eth_dev_allocate(const char *name)
 		goto unlock;
 	}
 
-	if (rte_eth_dev_allocated(name) != NULL) {
+	if (rte_eth_dev_allocated_lock_free(name) != NULL) {
 		ethdev_log(ERR,
 			"Ethernet Device with name %s already allocated!",
 			name);