[2/2] net/failsafe: fix primary/secondary mutex

Message ID 20210315192722.35490-3-stephen@networkplumber.org (mailing list archive)
State Changes Requested, archived
Delegated to: Andrew Rybchenko
Headers
Series Mark shared pthread mutex |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues
ci/iol-abi-testing success Testing PASS
ci/travis-robot success travis build: passed
ci/github-robot success github build: passed
ci/iol-testing fail Testing issues
ci/iol-mellanox-Functional success Functional Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/intel-Testing success Testing PASS

Commit Message

Stephen Hemminger March 15, 2021, 7:27 p.m. UTC
  Set mutex used in failsafe driver to protect when used by
both primary and secondary process. Without this fix, the failsafe
lock is not really locking when there are multiple secondary processes.

Bugzilla ID: 662
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
Cc: matan@mellanox.com
---
 drivers/net/failsafe/failsafe.c | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

Ferruh Yigit April 14, 2021, 1:10 p.m. UTC | #1
On 3/15/2021 7:27 PM, Stephen Hemminger wrote:
> Set mutex used in failsafe driver to protect when used by
> both primary and secondary process. Without this fix, the failsafe
> lock is not really locking when there are multiple secondary processes.
> 
> Bugzilla ID: 662
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> Cc: matan@mellanox.com
> ---
>   drivers/net/failsafe/failsafe.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
> index e3bda0df2bf9..5b7e560dbc08 100644
> --- a/drivers/net/failsafe/failsafe.c
> +++ b/drivers/net/failsafe/failsafe.c
> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
>   		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
>   		return ret;
>   	}
> +	/* Allow mutex to protect primary/secondary */
> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> +	if (ret)
> +		ERROR("Cannot set mutex shared - %s", strerror(ret));
> +
>   	/* Allow mutex relocks for the thread holding the mutex. */
>   	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
>   	if (ret) {
> 

Overall looks good to me.

Gaetan, Matan,

Can you please test the patch? To be sure it is not causing any unexpected 
functional/performance issues.

Thanks,
ferruh
  
Ferruh Yigit April 16, 2021, 8:19 a.m. UTC | #2
On 4/14/2021 2:10 PM, Ferruh Yigit wrote:
> On 3/15/2021 7:27 PM, Stephen Hemminger wrote:
>> Set mutex used in failsafe driver to protect when used by
>> both primary and secondary process. Without this fix, the failsafe
>> lock is not really locking when there are multiple secondary processes.
>>
>> Bugzilla ID: 662
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
>> Cc: matan@mellanox.com
>> ---
>>   drivers/net/failsafe/failsafe.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
>> index e3bda0df2bf9..5b7e560dbc08 100644
>> --- a/drivers/net/failsafe/failsafe.c
>> +++ b/drivers/net/failsafe/failsafe.c
>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
>>           ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
>>           return ret;
>>       }
>> +    /* Allow mutex to protect primary/secondary */
>> +    ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
>> +    if (ret)
>> +        ERROR("Cannot set mutex shared - %s", strerror(ret));
>> +
>>       /* Allow mutex relocks for the thread holding the mutex. */
>>       ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
>>       if (ret) {
>>
> 
> Overall looks good to me.
> 
> Gaetan, Matan,
> 
> Can you please test the patch? To be sure it is not causing any unexpected 
> functional/performance issues.
> 

Ping.
  
Thomas Monjalon April 19, 2021, 5:08 p.m. UTC | #3
About the title, better to speak about multi-process,
it is less confusing than primary/secondary.

15/03/2021 20:27, Stephen Hemminger:
> Set mutex used in failsafe driver to protect when used by
> both primary and secondary process. Without this fix, the failsafe
> lock is not really locking when there are multiple secondary processes.
> 
> Bugzilla ID: 662
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> Cc: matan@mellanox.com

The correct order for above lines is:

Bugzilla ID: 662
Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

> ---
> --- a/drivers/net/failsafe/failsafe.c
> +++ b/drivers/net/failsafe/failsafe.c
> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
>  		return ret;
>  	}
> +	/* Allow mutex to protect primary/secondary */
> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> +	if (ret)
> +		ERROR("Cannot set mutex shared - %s", strerror(ret));

Why not returning an error here?
  
Andrew Rybchenko June 8, 2021, 8 a.m. UTC | #4
On 4/19/21 8:08 PM, Thomas Monjalon wrote:
> About the title, better to speak about multi-process,
> it is less confusing than primary/secondary.
> 
> 15/03/2021 20:27, Stephen Hemminger:
>> Set mutex used in failsafe driver to protect when used by
>> both primary and secondary process. Without this fix, the failsafe
>> lock is not really locking when there are multiple secondary processes.
>>
>> Bugzilla ID: 662
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
>> Cc: matan@mellanox.com
> 
> The correct order for above lines is:
> 
> Bugzilla ID: 662
> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
>> ---
>> --- a/drivers/net/failsafe/failsafe.c
>> +++ b/drivers/net/failsafe/failsafe.c
>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
>>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
>>  		return ret;
>>  	}
>> +	/* Allow mutex to protect primary/secondary */
>> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
>> +	if (ret)
>> +		ERROR("Cannot set mutex shared - %s", strerror(ret));
> 
> Why not returning an error here?

+1

I think it would be safer to return an error here.
  
Stephen Hemminger June 8, 2021, 3:42 p.m. UTC | #5
On Tue, 8 Jun 2021 11:00:37 +0300
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:

> On 4/19/21 8:08 PM, Thomas Monjalon wrote:
> > About the title, better to speak about multi-process,
> > it is less confusing than primary/secondary.
> > 
> > 15/03/2021 20:27, Stephen Hemminger:  
> >> Set mutex used in failsafe driver to protect when used by
> >> both primary and secondary process. Without this fix, the failsafe
> >> lock is not really locking when there are multiple secondary processes.
> >>
> >> Bugzilla ID: 662
> >> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >> Cc: matan@mellanox.com  
> > 
> > The correct order for above lines is:
> > 
> > Bugzilla ID: 662
> > Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >   
> >> ---
> >> --- a/drivers/net/failsafe/failsafe.c
> >> +++ b/drivers/net/failsafe/failsafe.c
> >> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
> >>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> >>  		return ret;
> >>  	}
> >> +	/* Allow mutex to protect primary/secondary */
> >> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> >> +	if (ret)
> >> +		ERROR("Cannot set mutex shared - %s", strerror(ret));  
> > 
> > Why not returning an error here?  
> 
> +1
> 
> I think it would be safer to return an error here.

Ok but it never happens.
  
Andrew Rybchenko June 8, 2021, 3:55 p.m. UTC | #6
On 6/8/21 6:42 PM, Stephen Hemminger wrote:
> On Tue, 8 Jun 2021 11:00:37 +0300
> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> 
>> On 4/19/21 8:08 PM, Thomas Monjalon wrote:
>>> About the title, better to speak about multi-process,
>>> it is less confusing than primary/secondary.
>>>
>>> 15/03/2021 20:27, Stephen Hemminger:  
>>>> Set mutex used in failsafe driver to protect when used by
>>>> both primary and secondary process. Without this fix, the failsafe
>>>> lock is not really locking when there are multiple secondary processes.
>>>>
>>>> Bugzilla ID: 662
>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
>>>> Cc: matan@mellanox.com  
>>>
>>> The correct order for above lines is:
>>>
>>> Bugzilla ID: 662
>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
>>>
>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>   
>>>> ---
>>>> --- a/drivers/net/failsafe/failsafe.c
>>>> +++ b/drivers/net/failsafe/failsafe.c
>>>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
>>>>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
>>>>  		return ret;
>>>>  	}
>>>> +	/* Allow mutex to protect primary/secondary */
>>>> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
>>>> +	if (ret)
>>>> +		ERROR("Cannot set mutex shared - %s", strerror(ret));  
>>>
>>> Why not returning an error here?  
>>
>> +1
>>
>> I think it would be safer to return an error here.
> 
> Ok but it never happens.
> 

May I ask why? 'man pthread_mutexattr_setpshared' says that it
is possible.
  
Stephen Hemminger June 8, 2021, 8:48 p.m. UTC | #7
On Tue, 8 Jun 2021 18:55:17 +0300
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:

> On 6/8/21 6:42 PM, Stephen Hemminger wrote:
> > On Tue, 8 Jun 2021 11:00:37 +0300
> > Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> >   
> >> On 4/19/21 8:08 PM, Thomas Monjalon wrote:  
> >>> About the title, better to speak about multi-process,
> >>> it is less confusing than primary/secondary.
> >>>
> >>> 15/03/2021 20:27, Stephen Hemminger:    
> >>>> Set mutex used in failsafe driver to protect when used by
> >>>> both primary and secondary process. Without this fix, the failsafe
> >>>> lock is not really locking when there are multiple secondary processes.
> >>>>
> >>>> Bugzilla ID: 662
> >>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>> Cc: matan@mellanox.com    
> >>>
> >>> The correct order for above lines is:
> >>>
> >>> Bugzilla ID: 662
> >>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>
> >>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>     
> >>>> ---
> >>>> --- a/drivers/net/failsafe/failsafe.c
> >>>> +++ b/drivers/net/failsafe/failsafe.c
> >>>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
> >>>>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> >>>>  		return ret;
> >>>>  	}
> >>>> +	/* Allow mutex to protect primary/secondary */
> >>>> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> >>>> +	if (ret)
> >>>> +		ERROR("Cannot set mutex shared - %s", strerror(ret));    
> >>>
> >>> Why not returning an error here?    
> >>
> >> +1
> >>
> >> I think it would be safer to return an error here.  
> > 
> > Ok but it never happens.
> >   
> 
> May I ask why? 'man pthread_mutexattr_setpshared' says that it
> is possible.
> 

The glibc implementation of pthread_mutexattr_setpshared is:


int
pthread_mutexattr_setpshared (pthread_mutexattr_t *attr, int pshared)
{
  struct pthread_mutexattr *iattr;

  int err = futex_supports_pshared (pshared);
  if (err != 0)
    return err;

  iattr = (struct pthread_mutexattr *) attr;

  if (pshared == PTHREAD_PROCESS_PRIVATE)
    iattr->mutexkind &= ~PTHREAD_MUTEXATTR_FLAG_PSHARED;
  else
    iattr->mutexkind |= PTHREAD_MUTEXATTR_FLAG_PSHARED;

  return 0;
}

And

/* FUTEX_SHARED is always supported by the Linux kernel.  */
static __always_inline int
futex_supports_pshared (int pshared)
{
  if (__glibc_likely (pshared == PTHREAD_PROCESS_PRIVATE))
    return 0;
  else if (pshared == PTHREAD_PROCESS_SHARED)
    return 0;
  else
    return EINVAL;
}


There for the code as written can not return an error.
The check was only because someone could report a bogus
issue from a broken c library.
  
Andrew Rybchenko June 9, 2021, 10:04 a.m. UTC | #8
On 6/8/21 11:48 PM, Stephen Hemminger wrote:
> On Tue, 8 Jun 2021 18:55:17 +0300
> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> 
>> On 6/8/21 6:42 PM, Stephen Hemminger wrote:
>>> On Tue, 8 Jun 2021 11:00:37 +0300
>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
>>>   
>>>> On 4/19/21 8:08 PM, Thomas Monjalon wrote:  
>>>>> About the title, better to speak about multi-process,
>>>>> it is less confusing than primary/secondary.
>>>>>
>>>>> 15/03/2021 20:27, Stephen Hemminger:    
>>>>>> Set mutex used in failsafe driver to protect when used by
>>>>>> both primary and secondary process. Without this fix, the failsafe
>>>>>> lock is not really locking when there are multiple secondary processes.
>>>>>>
>>>>>> Bugzilla ID: 662
>>>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
>>>>>> Cc: matan@mellanox.com    
>>>>>
>>>>> The correct order for above lines is:
>>>>>
>>>>> Bugzilla ID: 662
>>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
>>>>>
>>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>>>     
>>>>>> ---
>>>>>> --- a/drivers/net/failsafe/failsafe.c
>>>>>> +++ b/drivers/net/failsafe/failsafe.c
>>>>>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
>>>>>>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
>>>>>>  		return ret;
>>>>>>  	}
>>>>>> +	/* Allow mutex to protect primary/secondary */
>>>>>> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
>>>>>> +	if (ret)
>>>>>> +		ERROR("Cannot set mutex shared - %s", strerror(ret));    
>>>>>
>>>>> Why not returning an error here?    
>>>>
>>>> +1
>>>>
>>>> I think it would be safer to return an error here.  
>>>
>>> Ok but it never happens.
>>>   
>>
>> May I ask why? 'man pthread_mutexattr_setpshared' says that it
>> is possible.
>>
> 
> The glibc implementation of pthread_mutexattr_setpshared is:
> 
> 
> int
> pthread_mutexattr_setpshared (pthread_mutexattr_t *attr, int pshared)
> {
>   struct pthread_mutexattr *iattr;
> 
>   int err = futex_supports_pshared (pshared);
>   if (err != 0)
>     return err;
> 
>   iattr = (struct pthread_mutexattr *) attr;
> 
>   if (pshared == PTHREAD_PROCESS_PRIVATE)
>     iattr->mutexkind &= ~PTHREAD_MUTEXATTR_FLAG_PSHARED;
>   else
>     iattr->mutexkind |= PTHREAD_MUTEXATTR_FLAG_PSHARED;
> 
>   return 0;
> }
> 
> And
> 
> /* FUTEX_SHARED is always supported by the Linux kernel.  */
> static __always_inline int
> futex_supports_pshared (int pshared)
> {
>   if (__glibc_likely (pshared == PTHREAD_PROCESS_PRIVATE))
>     return 0;
>   else if (pshared == PTHREAD_PROCESS_SHARED)
>     return 0;
>   else
>     return EINVAL;
> }
> 
> 
> There for the code as written can not return an error.
> The check was only because someone could report a bogus
> issue from a broken c library.
> 

Many thanks for detailed description.
I thought that it is better to follow API
definition and it is not that hard to check
return code and handle it. Yes, glibc is not
the only C library.
  
Gaëtan Rivet June 14, 2021, 2:43 p.m. UTC | #9
On Wed, Jun 9, 2021, at 12:04, Andrew Rybchenko wrote:
> On 6/8/21 11:48 PM, Stephen Hemminger wrote:
> > On Tue, 8 Jun 2021 18:55:17 +0300
> > Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> > 
> >> On 6/8/21 6:42 PM, Stephen Hemminger wrote:
> >>> On Tue, 8 Jun 2021 11:00:37 +0300
> >>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> >>>   
> >>>> On 4/19/21 8:08 PM, Thomas Monjalon wrote:  
> >>>>> About the title, better to speak about multi-process,
> >>>>> it is less confusing than primary/secondary.
> >>>>>
> >>>>> 15/03/2021 20:27, Stephen Hemminger:    
> >>>>>> Set mutex used in failsafe driver to protect when used by
> >>>>>> both primary and secondary process. Without this fix, the failsafe
> >>>>>> lock is not really locking when there are multiple secondary processes.
> >>>>>>
> >>>>>> Bugzilla ID: 662
> >>>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>>>> Cc: matan@mellanox.com    
> >>>>>
> >>>>> The correct order for above lines is:
> >>>>>
> >>>>> Bugzilla ID: 662
> >>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>>>
> >>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>>>     
> >>>>>> ---
> >>>>>> --- a/drivers/net/failsafe/failsafe.c
> >>>>>> +++ b/drivers/net/failsafe/failsafe.c
> >>>>>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
> >>>>>>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> >>>>>>  		return ret;
> >>>>>>  	}
> >>>>>> +	/* Allow mutex to protect primary/secondary */
> >>>>>> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> >>>>>> +	if (ret)
> >>>>>> +		ERROR("Cannot set mutex shared - %s", strerror(ret));    
> >>>>>
> >>>>> Why not returning an error here?    
> >>>>
> >>>> +1
> >>>>
> >>>> I think it would be safer to return an error here.  
> >>>
> >>> Ok but it never happens.
> >>>   
> >>
> >> May I ask why? 'man pthread_mutexattr_setpshared' says that it
> >> is possible.
> >>
> > 
> > The glibc implementation of pthread_mutexattr_setpshared is:
> > 
> > 
> > int
> > pthread_mutexattr_setpshared (pthread_mutexattr_t *attr, int pshared)
> > {
> >   struct pthread_mutexattr *iattr;
> > 
> >   int err = futex_supports_pshared (pshared);
> >   if (err != 0)
> >     return err;
> > 
> >   iattr = (struct pthread_mutexattr *) attr;
> > 
> >   if (pshared == PTHREAD_PROCESS_PRIVATE)
> >     iattr->mutexkind &= ~PTHREAD_MUTEXATTR_FLAG_PSHARED;
> >   else
> >     iattr->mutexkind |= PTHREAD_MUTEXATTR_FLAG_PSHARED;
> > 
> >   return 0;
> > }
> > 
> > And
> > 
> > /* FUTEX_SHARED is always supported by the Linux kernel.  */
> > static __always_inline int
> > futex_supports_pshared (int pshared)
> > {
> >   if (__glibc_likely (pshared == PTHREAD_PROCESS_PRIVATE))
> >     return 0;
> >   else if (pshared == PTHREAD_PROCESS_SHARED)
> >     return 0;
> >   else
> >     return EINVAL;
> > }
> > 
> > 
> > There for the code as written can not return an error.
> > The check was only because someone could report a bogus
> > issue from a broken c library.
> > 
> 
> Many thanks for detailed description.
> I thought that it is better to follow API
> definition and it is not that hard to check
> return code and handle it. Yes, glibc is not
> the only C library.
> 

On principle the API spec should be respected without assuming a specific
implementation.

Another way to think about it is that a future dev having zero knowledge of
this thread, reading this code and checking the POSIX manual, will also need to
check that usual c lib implementations are unlikely to generate an error before
concluding that this code is alright. It should not be necessary.
  
Madhuker Mythri Oct. 17, 2022, 10:40 a.m. UTC | #10
> On Wed, Jun 9, 2021, at 12:04, Andrew Rybchenko wrote:
>> On 6/8/21 11:48 PM, Stephen Hemminger wrote:
>> > On Tue, 8 Jun 2021 18:55:17 +0300
>> > Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
>> > 
>> >> On 6/8/21 6:42 PM, Stephen Hemminger wrote:
> >>> On Tue, 8 Jun 2021 11:00:37 +0300
> >>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
> >>>   
> >>>> On 4/19/21 8:08 PM, Thomas Monjalon wrote:  
> >>>>> About the title, better to speak about multi-process, it is less 
> >>>>> confusing than primary/secondary.
> >>>>>
> >>>>> 15/03/2021 20:27, Stephen Hemminger:    
> >>>>>> Set mutex used in failsafe driver to protect when used by both 
> >>>>>> primary and secondary process. Without this fix, the failsafe 
> >>>>>> lock is not really locking when there are multiple secondary processes.
> >>>>>>
> >>>>>> Bugzilla ID: 662
> >>>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>>>> Cc: matan@mellanox.com    
> >>>>>
> >>>>> The correct order for above lines is:
> >>>>>
> >>>>> Bugzilla ID: 662
> >>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>>>
> >>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>>>     
> >>>>>> ---
> > >>>>>> --- a/drivers/net/failsafe/failsafe.c
> > >>>>>> +++ b/drivers/net/failsafe/failsafe.c
> >>>>>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
> > >>>>>>  		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> > >>>>>>  		return ret;
> >>>>>>  	}
> > >>>>>> +	/* Allow mutex to protect primary/secondary */
> > >>>>>> +	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> > >>>>>> +	if (ret)
> > >>>>>> +		ERROR("Cannot set mutex shared - %s", strerror(ret));    
> >>>>>
> > >>>>> Why not returning an error here?    
> >>>>
> > >>>> +1
> >>>>
> > >>>> I think it would be safer to return an error here.  
> >>>
> > >>> Ok but it never happens.
> >>>   
> >>
> > >> May I ask why? 'man pthread_mutexattr_setpshared' says that it is 
> > >> possible.
> >>
> > 
> > > The glibc implementation of pthread_mutexattr_setpshared is:
> > 
> > 
> > > int
> > > pthread_mutexattr_setpshared (pthread_mutexattr_t *attr, int 
> > > pshared) {
> > >   struct pthread_mutexattr *iattr;
> > 
> > >   int err = futex_supports_pshared (pshared);
> > >   if (err != 0)
> > >     return err;
> > > 
> > >   iattr = (struct pthread_mutexattr *) attr;
> > > 
> > >   if (pshared == PTHREAD_PROCESS_PRIVATE)
> > >     iattr->mutexkind &= ~PTHREAD_MUTEXATTR_FLAG_PSHARED;
> > >   else
> > >     iattr->mutexkind |= PTHREAD_MUTEXATTR_FLAG_PSHARED;
> > > 
> > >   return 0;
> > > }
> > > 
> > > And
> > > 
> > > /* FUTEX_SHARED is always supported by the Linux kernel.  */ static 
> > > __always_inline int futex_supports_pshared (int pshared) {
> > >   if (__glibc_likely (pshared == PTHREAD_PROCESS_PRIVATE))
> > >     return 0;
> > >   else if (pshared == PTHREAD_PROCESS_SHARED)
> > >     return 0;
> > >   else
> > >     return EINVAL;
> > }
> > > 
> > 
> > > There for the code as written can not return an error.
> > > The check was only because someone could report a bogus issue from a 
> > > broken c library.
> > > 
> > 
> > Many thanks for detailed description.
> > I thought that it is better to follow API definition and it is not 
> > that hard to check return code and handle it. Yes, glibc is not the 
> > only C library.
> > 
>
> On principle the API spec should be respected without assuming a specific implementation.
>
> Another way to think about it is that a future dev having zero knowledge of this thread, reading this code and checking the POSIX manual, will also need to check that usual c lib implementations are unlikely > to generate an error before concluding that this code is alright. It should not be necessary.
>
 
We are also facing similar issue, while probe of fail-safe PMD b/w multi-process.
rte_eth_dev_attach_secondary(), API return's error, while probing from secondary process in rte_pmd_tap_probe().
So, can you please let us know, if any fix available on such issue ?

Thanks,
Madhuker.
  

Patch

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index e3bda0df2bf9..5b7e560dbc08 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -140,6 +140,11 @@  fs_mutex_init(struct fs_priv *priv)
 		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
 		return ret;
 	}
+	/* Allow mutex to protect primary/secondary */
+	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+	if (ret)
+		ERROR("Cannot set mutex shared - %s", strerror(ret));
+
 	/* Allow mutex relocks for the thread holding the mutex. */
 	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
 	if (ret) {