[dpdk-dev,07/11] ethdev: add lock to port allocation check
Checks
Commit Message
From: Matan Azrad <matan@mellanox.com>
When comparing the port name, there can be a race condition with
a thread allocating a new port and writing the name at the same time.
It can lead to match with a partial name by error.
The check of the port is now considered as a critical section
protected with locks.
This fix will be even more required for multi-process when the
port availability will rely only on the name, in a following patch.
Fixes: 84934303a17c ("ethdev: synchronize port allocation")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
lib/librte_ethdev/rte_ethdev.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
Comments
Hi,
On Wed, May 09, 2018 at 11:43:33AM +0200, Thomas Monjalon wrote:
> From: Matan Azrad <matan@mellanox.com>
>
> When comparing the port name, there can be a race condition with
> a thread allocating a new port and writing the name at the same time.
> It can lead to match with a partial name by error.
>
> The check of the port is now considered as a critical section
> protected with locks.
>
> This fix will be even more required for multi-process when the
> port availability will rely only on the name, in a following patch.
>
> Fixes: 84934303a17c ("ethdev: synchronize port allocation")
> Cc: stable@dpdk.org
>
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> ---
> lib/librte_ethdev/rte_ethdev.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index ae86d0ba7..357be2dca 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -227,8 +227,8 @@ rte_eth_dev_shared_data_prepare(void)
> rte_spinlock_unlock(&rte_eth_shared_data_lock);
> }
>
> -struct rte_eth_dev *
> -rte_eth_dev_allocated(const char *name)
> +static struct rte_eth_dev *
> +rte_eth_dev_allocated_lock_free(const char *name)
A suggestion about the naming here.
Reading subsequent patches, we can see this function being used during
ethdev allocation routines. The _lock_free suffix is a little
misleading, as for an instant one can think that there is something
being freed about an allocated ethdev lock.
I would suggest
* rte_eth_dev_allocated_nolock
or
* rte_eth_dev_allocated_lockless
(or even rte_eth_lockless_dev_allocated)
instead.
> {
> unsigned i;
>
> @@ -240,6 +240,22 @@ rte_eth_dev_allocated(const char *name)
> return NULL;
> }
>
> +struct rte_eth_dev *
> +rte_eth_dev_allocated(const char *name)
> +{
> + struct rte_eth_dev *ethdev;
> +
> + rte_eth_dev_shared_data_prepare();
> +
> + rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> +
> + ethdev = rte_eth_dev_allocated_lock_free(name);
> +
> + rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> +
> + return ethdev;
> +}
> +
> static uint16_t
> rte_eth_dev_find_free_port(void)
> {
> @@ -286,7 +302,7 @@ rte_eth_dev_allocate(const char *name)
> goto unlock;
> }
>
> - if (rte_eth_dev_allocated(name) != NULL) {
> + if (rte_eth_dev_allocated_lock_free(name) != NULL) {
> ethdev_log(ERR,
> "Ethernet Device with name %s already allocated!",
> name);
> --
> 2.16.2
>
09/05/2018 14:21, Gaëtan Rivet:
> Hi,
>
> On Wed, May 09, 2018 at 11:43:33AM +0200, Thomas Monjalon wrote:
> > From: Matan Azrad <matan@mellanox.com>
> >
> > When comparing the port name, there can be a race condition with
> > a thread allocating a new port and writing the name at the same time.
> > It can lead to match with a partial name by error.
> >
> > The check of the port is now considered as a critical section
> > protected with locks.
> >
> > This fix will be even more required for multi-process when the
> > port availability will rely only on the name, in a following patch.
> >
> > Fixes: 84934303a17c ("ethdev: synchronize port allocation")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> > lib/librte_ethdev/rte_ethdev.c | 22 +++++++++++++++++++---
> > 1 file changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> > index ae86d0ba7..357be2dca 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -227,8 +227,8 @@ rte_eth_dev_shared_data_prepare(void)
> > rte_spinlock_unlock(&rte_eth_shared_data_lock);
> > }
> >
> > -struct rte_eth_dev *
> > -rte_eth_dev_allocated(const char *name)
> > +static struct rte_eth_dev *
> > +rte_eth_dev_allocated_lock_free(const char *name)
>
> A suggestion about the naming here.
> Reading subsequent patches, we can see this function being used during
> ethdev allocation routines. The _lock_free suffix is a little
> misleading, as for an instant one can think that there is something
> being freed about an allocated ethdev lock.
>
> I would suggest
>
> * rte_eth_dev_allocated_nolock
> or
> * rte_eth_dev_allocated_lockless
> (or even rte_eth_lockless_dev_allocated)
>
> instead.
Good suggestions. I vote for rte_eth_dev_allocated_nolock.
Thanks
On Wed, 9 May 2018 11:43:33 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:
>
> +struct rte_eth_dev *
> +rte_eth_dev_allocated(const char *name)
> +{
> + struct rte_eth_dev *ethdev;
> +
> + rte_eth_dev_shared_data_prepare();
> +
> + rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> +
> + ethdev = rte_eth_dev_allocated_lock_free(name);
> +
> + rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> +
> + return ethdev;
> +}
> +
Not sure about this. The code it self is correct, but it creates
a racy semantic.
If caller doesn't already hold a lock then there is no guarantee that
the device returned won't be destroyed by some other thread or that
the name was just allocated by some other process.
On Wed, 9 May 2018 14:21:17 +0200
Gaëtan Rivet <gaetan.rivet@6wind.com> wrote:
> A suggestion about the naming here.
> Reading subsequent patches, we can see this function being used during
> ethdev allocation routines. The _lock_free suffix is a little
> misleading, as for an instant one can think that there is something
> being freed about an allocated ethdev lock.
>
> I would suggest
>
> * rte_eth_dev_allocated_nolock
> or
> * rte_eth_dev_allocated_lockless
> (or even rte_eth_lockless_dev_allocated)
>
> instead.
Personally, used to the convention of:
rte_eth_dev_find(name)
and
_rte_eth_dev_find(name)
The _ implies internal version without lock.
Also allocated to me implies a boolean test only.
10/05/2018 22:33, Stephen Hemminger:
> On Wed, 9 May 2018 11:43:33 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
>
> >
> > +struct rte_eth_dev *
> > +rte_eth_dev_allocated(const char *name)
> > +{
> > + struct rte_eth_dev *ethdev;
> > +
> > + rte_eth_dev_shared_data_prepare();
> > +
> > + rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> > +
> > + ethdev = rte_eth_dev_allocated_lock_free(name);
> > +
> > + rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> > +
> > + return ethdev;
> > +}
> > +
>
> Not sure about this. The code it self is correct, but it creates
> a racy semantic.
>
> If caller doesn't already hold a lock then there is no guarantee that
> the device returned won't be destroyed by some other thread
It is an old high level design decision in DPDK:
We do not hold a lock during the whole life of a port.
So it is the application responsibility to not mess its own ports.
The consequence is that one port must be managed by only one thread.
We can discuss the original thread design but it is out of the
scope of this patchset.
> or that the name was just allocated by some other process.
It does not say which process allocated the port, yes.
But the name is unique among processes.
So the process knows for sure what to do with the port having this name.
10/05/2018 22:35, Stephen Hemminger:
> On Wed, 9 May 2018 14:21:17 +0200
> Gaëtan Rivet <gaetan.rivet@6wind.com> wrote:
>
> > A suggestion about the naming here.
> > Reading subsequent patches, we can see this function being used during
> > ethdev allocation routines. The _lock_free suffix is a little
> > misleading, as for an instant one can think that there is something
> > being freed about an allocated ethdev lock.
> >
> > I would suggest
> >
> > * rte_eth_dev_allocated_nolock
> > or
> > * rte_eth_dev_allocated_lockless
> > (or even rte_eth_lockless_dev_allocated)
> >
> > instead.
>
> Personally, used to the convention of:
> rte_eth_dev_find(name)
> and
> _rte_eth_dev_find(name)
>
> The _ implies internal version without lock.
It is a matter of taste.
We have chosen "nolock" in v2, and I think it is explicit.
> Also allocated to me implies a boolean test only.
Yes, the name is a bit strange.
But it is old and out of the scope of this patch.
If we want to change it, it is an API deprecation.
On Fri, 11 May 2018 00:10:19 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:
> 10/05/2018 22:33, Stephen Hemminger:
> > On Wed, 9 May 2018 11:43:33 +0200
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > >
> > > +struct rte_eth_dev *
> > > +rte_eth_dev_allocated(const char *name)
> > > +{
> > > + struct rte_eth_dev *ethdev;
> > > +
> > > + rte_eth_dev_shared_data_prepare();
> > > +
> > > + rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
> > > +
> > > + ethdev = rte_eth_dev_allocated_lock_free(name);
> > > +
> > > + rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
> > > +
> > > + return ethdev;
> > > +}
> > > +
> >
> > Not sure about this. The code it self is correct, but it creates
> > a racy semantic.
> >
> > If caller doesn't already hold a lock then there is no guarantee that
> > the device returned won't be destroyed by some other thread
>
> It is an old high level design decision in DPDK:
> We do not hold a lock during the whole life of a port.
> So it is the application responsibility to not mess its own ports.
> The consequence is that one port must be managed by only one thread.
>
> We can discuss the original thread design but it is out of the
> scope of this patchset.
>
> > or that the name was just allocated by some other process.
>
> It does not say which process allocated the port, yes.
> But the name is unique among processes.
> So the process knows for sure what to do with the port having this name.
For future, I would like to change rte_eth_devices from an array of structures to
an array of pointers. Reserving a port could be done with atomic exchange, and keep
a bitmap as hint for next free port to choose. When supporting tunnels etc, it makes sense
to support lots of ports (like > 16 bit); and devices may come and go.
Also, change link state in eth device into full operational state value.
That should be enough to cover both tunnel and failsafe usage, and existing state
value can go away.
The ownership model should also be expressed more as functional operations in the
device model. It needs to be used consistently in multiple places, allow more layering
and also have more error checks builtin.
11/05/2018 00:13, Thomas Monjalon:
> 10/05/2018 22:35, Stephen Hemminger:
> > On Wed, 9 May 2018 14:21:17 +0200
> > Gaëtan Rivet <gaetan.rivet@6wind.com> wrote:
> >
> > > A suggestion about the naming here.
> > > Reading subsequent patches, we can see this function being used during
> > > ethdev allocation routines. The _lock_free suffix is a little
> > > misleading, as for an instant one can think that there is something
> > > being freed about an allocated ethdev lock.
> > >
> > > I would suggest
> > >
> > > * rte_eth_dev_allocated_nolock
> > > or
> > > * rte_eth_dev_allocated_lockless
> > > (or even rte_eth_lockless_dev_allocated)
> > >
> > > instead.
> >
> > Personally, used to the convention of:
> > rte_eth_dev_find(name)
> > and
> > _rte_eth_dev_find(name)
> >
> > The _ implies internal version without lock.
>
> It is a matter of taste.
> We have chosen "nolock" in v2, and I think it is explicit.
After a second thought, I have decided to follow the underscore convention:
_rte_eth_dev_allocated
@@ -227,8 +227,8 @@ rte_eth_dev_shared_data_prepare(void)
rte_spinlock_unlock(&rte_eth_shared_data_lock);
}
-struct rte_eth_dev *
-rte_eth_dev_allocated(const char *name)
+static struct rte_eth_dev *
+rte_eth_dev_allocated_lock_free(const char *name)
{
unsigned i;
@@ -240,6 +240,22 @@ rte_eth_dev_allocated(const char *name)
return NULL;
}
+struct rte_eth_dev *
+rte_eth_dev_allocated(const char *name)
+{
+ struct rte_eth_dev *ethdev;
+
+ rte_eth_dev_shared_data_prepare();
+
+ rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+ ethdev = rte_eth_dev_allocated_lock_free(name);
+
+ rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
+ return ethdev;
+}
+
static uint16_t
rte_eth_dev_find_free_port(void)
{
@@ -286,7 +302,7 @@ rte_eth_dev_allocate(const char *name)
goto unlock;
}
- if (rte_eth_dev_allocated(name) != NULL) {
+ if (rte_eth_dev_allocated_lock_free(name) != NULL) {
ethdev_log(ERR,
"Ethernet Device with name %s already allocated!",
name);