[v4] eal: fix proc type auto detection

Message ID 67a795e77bc9f5ac79ab78a878ae19abbead9f50.1563984454.git.anatoly.burakov@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: David Marchand
Headers
Series [v4] eal: fix proc type auto detection |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-Compile-Testing success Compile Testing PASS
ci/Intel-compilation success Compilation OK
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/intel-Performance-Testing success Performance Testing PASS

Commit Message

Anatoly Burakov July 24, 2019, 4:07 p.m. UTC
  Currently, primary process holds an exclusive lock on the config
file, thereby preventing other primaries from spinning up. However,
when the primary dies, the lock is no longer being held, even though
there might be other secondary processes still running.

The fix is two-fold. First of all, downgrade the primary process's
exclusive lock to a shared lock once we have it. Second of all,
also take out shared locks on the config from the secondaries. We
are using fcntl() locks, which get dropped when the file handle is
closed, so also remove the closure of config file handle.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fixed FreeBSD log message to match Linux version
    
    v3:
    - Added similar changes to FreeBSD version
    
    v2:
    - Adjusted indentation

 lib/librte_eal/freebsd/eal/eal.c | 36 +++++++++++++++++++++++++++----
 lib/librte_eal/linux/eal/eal.c   | 37 +++++++++++++++++++++++++++-----
 2 files changed, 64 insertions(+), 9 deletions(-)
  

Comments

Thomas Monjalon July 30, 2019, 8:13 a.m. UTC | #1
24/07/2019 18:07, Anatoly Burakov:
> Currently, primary process holds an exclusive lock on the config
> file, thereby preventing other primaries from spinning up. However,
> when the primary dies, the lock is no longer being held, even though
> there might be other secondary processes still running.
> 
> The fix is two-fold. First of all, downgrade the primary process's
> exclusive lock to a shared lock once we have it. Second of all,
> also take out shared locks on the config from the secondaries. We
> are using fcntl() locks, which get dropped when the file handle is
> closed, so also remove the closure of config file handle.
> 
> Fixes: af75078fece3 ("first public release")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

This is not a new bug, and the fix is a bit complex,
so it is deferred to 19.11 cycle. OK?
  
Anatoly Burakov July 30, 2019, 9:19 a.m. UTC | #2
On 30-Jul-19 9:13 AM, Thomas Monjalon wrote:
> 24/07/2019 18:07, Anatoly Burakov:
>> Currently, primary process holds an exclusive lock on the config
>> file, thereby preventing other primaries from spinning up. However,
>> when the primary dies, the lock is no longer being held, even though
>> there might be other secondary processes still running.
>>
>> The fix is two-fold. First of all, downgrade the primary process's
>> exclusive lock to a shared lock once we have it. Second of all,
>> also take out shared locks on the config from the secondaries. We
>> are using fcntl() locks, which get dropped when the file handle is
>> closed, so also remove the closure of config file handle.
>>
>> Fixes: af75078fece3 ("first public release")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> This is not a new bug, and the fix is a bit complex,
> so it is deferred to 19.11 cycle. OK?
> 

Yes, i'm OK with that.
  
David Marchand Aug. 12, 2019, 10:03 a.m. UTC | #3
On Wed, Jul 24, 2019 at 6:08 PM Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
>
> Currently, primary process holds an exclusive lock on the config
> file, thereby preventing other primaries from spinning up. However,
> when the primary dies, the lock is no longer being held, even though
> there might be other secondary processes still running.
>
> The fix is two-fold. First of all, downgrade the primary process's
> exclusive lock to a shared lock once we have it. Second of all,
> also take out shared locks on the config from the secondaries. We
> are using fcntl() locks, which get dropped when the file handle is
> closed, so also remove the closure of config file handle.
>
> Fixes: af75078fece3 ("first public release")
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>
> Notes:
>     v4:
>     - Fixed FreeBSD log message to match Linux version
>
>     v3:
>     - Added similar changes to FreeBSD version
>
>     v2:
>     - Adjusted indentation
>
>  lib/librte_eal/freebsd/eal/eal.c | 36 +++++++++++++++++++++++++++----
>  lib/librte_eal/linux/eal/eal.c   | 37 +++++++++++++++++++++++++++-----
>  2 files changed, 64 insertions(+), 9 deletions(-)
>
> diff --git a/lib/librte_eal/freebsd/eal/eal.c b/lib/librte_eal/freebsd/eal/eal.c
> index d53f0fe69..69995bf8f 100644
> --- a/lib/librte_eal/freebsd/eal/eal.c
> +++ b/lib/librte_eal/freebsd/eal/eal.c
> @@ -72,6 +72,13 @@ static struct flock wr_lock = {
>                 .l_len = sizeof(early_mem_config.memsegs),
>  };
>
> +static struct flock rd_lock = {
> +       .l_type = F_RDLCK,
> +       .l_whence = SEEK_SET,
> +       .l_start = offsetof(struct rte_mem_config, memsegs),
> +       .l_len = sizeof(early_mem_config.memsegs),
> +};
> +
>  /* Address of global and public configuration */
>  static struct rte_config rte_config = {
>                 .mem_config = &early_mem_config,
> @@ -249,8 +256,21 @@ rte_eal_config_create(void)
>         if (retval < 0){
>                 close(mem_cfg_fd);
>                 mem_cfg_fd = -1;
> -               RTE_LOG(ERR, EAL, "Cannot create lock on '%s'. Is another primary "
> -                       "process running?\n", pathname);
> +               RTE_LOG(ERR, EAL, "Cannot create exclusive lock on '%s'. "
> +                       "Is another process running?\n", pathname);
> +               return -1;
> +       }
> +
> +       /* we hold an exclusive lock - now downgrade it to a read lock to allow
> +        * other processes to also hold onto this file while preventing other
> +        * primaries from spinning up.
> +        */
> +       retval = fcntl(mem_cfg_fd, F_SETLK, &rd_lock);
> +       if (retval < 0) {
> +               close(mem_cfg_fd);
> +               mem_cfg_fd = -1;
> +               RTE_LOG(ERR, EAL, "Cannot downgrade to shared lock on '%s': %s\n",
> +                       pathname, strerror(errno));
>                 return -1;
>         }
>
> @@ -292,6 +312,16 @@ rte_eal_config_attach(void)
>                         return -1;
>                 }
>         }
> +       /* lock the file to prevent primary from initializing while this
> +        * process is still running.
> +        */
> +       if (fcntl(mem_cfg_fd, F_SETLK, &rd_lock) < 0) {
> +               close(mem_cfg_fd);
> +               mem_cfg_fd = -1;
> +               RTE_LOG(ERR, EAL, "Cannot create shared lock on '%s': %s\n",
> +                               pathname, strerror(errno));
> +               return -1;
> +       }
>
>         rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
>                                 PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
> @@ -330,8 +360,6 @@ rte_eal_config_reattach(void)
>         mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
>                         sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
>                         mem_cfg_fd, 0);
> -       close(mem_cfg_fd);
> -       mem_cfg_fd = -1;
>
>         if (mem_config == MAP_FAILED) {
>                 RTE_LOG(ERR, EAL, "Cannot mmap memory for rte_config! error %i (%s)\n",

We are missing a mem_cfg_fd cleanup if mmap failed.


> diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
> index 34db78753..0f0726703 100644
> --- a/lib/librte_eal/linux/eal/eal.c
> +++ b/lib/librte_eal/linux/eal/eal.c
> @@ -83,6 +83,13 @@ static struct flock wr_lock = {
>                 .l_len = sizeof(early_mem_config.memsegs),
>  };
>
> +static struct flock rd_lock = {
> +       .l_type = F_RDLCK,
> +       .l_whence = SEEK_SET,
> +       .l_start = offsetof(struct rte_mem_config, memsegs),
> +       .l_len = sizeof(early_mem_config.memsegs),
> +};
> +
>  /* Address of global and public configuration */
>  static struct rte_config rte_config = {
>                 .mem_config = &early_mem_config,
> @@ -343,8 +350,21 @@ rte_eal_config_create(void)
>         if (retval < 0){
>                 close(mem_cfg_fd);
>                 mem_cfg_fd = -1;
> -               RTE_LOG(ERR, EAL, "Cannot create lock on '%s'. Is another primary "
> -                       "process running?\n", pathname);
> +               RTE_LOG(ERR, EAL, "Cannot create exclusive lock on '%s'. "
> +                       "Is another process running?\n", pathname);
> +               return -1;
> +       }
> +
> +       /* we hold an exclusive lock - now downgrade it to a read lock to allow
> +        * other processes to also hold onto this file while preventing other
> +        * primaries from spinning up.
> +        */
> +       retval = fcntl(mem_cfg_fd, F_SETLK, &rd_lock);
> +       if (retval < 0) {
> +               close(mem_cfg_fd);
> +               mem_cfg_fd = -1;
> +               RTE_LOG(ERR, EAL, "Cannot downgrade to shared lock on '%s': %s\n",
> +                       pathname, strerror(errno));
>                 return -1;
>         }
>
> @@ -389,6 +409,16 @@ rte_eal_config_attach(void)
>                         return -1;
>                 }
>         }
> +       /* lock the file to prevent primary from initializing while this
> +        * process is still running.
> +        */
> +       if (fcntl(mem_cfg_fd, F_SETLK, &rd_lock) < 0) {
> +               close(mem_cfg_fd);
> +               mem_cfg_fd = -1;
> +               RTE_LOG(ERR, EAL, "Cannot create shared lock on '%s': %s\n",
> +                               pathname, strerror(errno));
> +               return -1;
> +       }
>
>         /* map it as read-only first */
>         mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
> @@ -427,9 +457,6 @@ rte_eal_config_reattach(void)
>                         sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
>                         mem_cfg_fd, 0);
>
> -       close(mem_cfg_fd);
> -       mem_cfg_fd = -1;
> -
>         if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
>                 if (mem_config != MAP_FAILED) {
>                         /* errno is stale, don't use */

Idem.

Reviewed-by: David Marchand <david.marchand@redhat.com>

With https://patchwork.dpdk.org/patch/56501/, the code is now really
close between Linux and FreeBSD, could it go to common entirely?
  
Van Haaren, Harry Aug. 12, 2019, 10:21 a.m. UTC | #4
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Marchand
> Sent: Monday, August 12, 2019 11:04 AM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>
> Cc: dev <dev@dpdk.org>; Richardson, Bruce <bruce.richardson@intel.com>;
> Stephen Hemminger <stephen@networkplumber.org>; dpdk stable <stable@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v4] eal: fix proc type auto detection
> 
> On Wed, Jul 24, 2019 at 6:08 PM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
> >
> > Currently, primary process holds an exclusive lock on the config
> > file, thereby preventing other primaries from spinning up. However,
> > when the primary dies, the lock is no longer being held, even though
> > there might be other secondary processes still running.
> >
> > The fix is two-fold. First of all, downgrade the primary process's
> > exclusive lock to a shared lock once we have it. Second of all,
> > also take out shared locks on the config from the secondaries. We
> > are using fcntl() locks, which get dropped when the file handle is
> > closed, so also remove the closure of config file handle.
> >
> > Fixes: af75078fece3 ("first public release")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>


Apologies I'm late to the conversation.

Will the rte_eal_primary_proc_alive() function still detect the primary
as alive, and not confuse secondaries with primaries in this new method?

Currently, the pri_proc_alive() code uses lockf(fd, F_TEST, 0); to detect
if a primary is alive. I'm not familiar enough with shared locks to know 
if the new behavior would be consistent with the old.

-H
  
Anatoly Burakov Aug. 12, 2019, 1:30 p.m. UTC | #5
On 12-Aug-19 11:21 AM, Van Haaren, Harry wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Marchand
>> Sent: Monday, August 12, 2019 11:04 AM
>> To: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Cc: dev <dev@dpdk.org>; Richardson, Bruce <bruce.richardson@intel.com>;
>> Stephen Hemminger <stephen@networkplumber.org>; dpdk stable <stable@dpdk.org>
>> Subject: Re: [dpdk-dev] [PATCH v4] eal: fix proc type auto detection
>>
>> On Wed, Jul 24, 2019 at 6:08 PM Anatoly Burakov
>> <anatoly.burakov@intel.com> wrote:
>>>
>>> Currently, primary process holds an exclusive lock on the config
>>> file, thereby preventing other primaries from spinning up. However,
>>> when the primary dies, the lock is no longer being held, even though
>>> there might be other secondary processes still running.
>>>
>>> The fix is two-fold. First of all, downgrade the primary process's
>>> exclusive lock to a shared lock once we have it. Second of all,
>>> also take out shared locks on the config from the secondaries. We
>>> are using fcntl() locks, which get dropped when the file handle is
>>> closed, so also remove the closure of config file handle.
>>>
>>> Fixes: af75078fece3 ("first public release")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> 
> Apologies I'm late to the conversation.
> 
> Will the rte_eal_primary_proc_alive() function still detect the primary
> as alive, and not confuse secondaries with primaries in this new method?

Good question, i'll have to investigate. Maybe we'll have to change the 
lock from the fcntl() locks to flock()-based locks.

> 
> Currently, the pri_proc_alive() code uses lockf(fd, F_TEST, 0); to detect
> if a primary is alive. I'm not familiar enough with shared locks to know
> if the new behavior would be consistent with the old.
> 
> -H
>
  
Anatoly Burakov Aug. 12, 2019, 1:31 p.m. UTC | #6
On 12-Aug-19 11:03 AM, David Marchand wrote:
> On Wed, Jul 24, 2019 at 6:08 PM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>>
>> Currently, primary process holds an exclusive lock on the config
>> file, thereby preventing other primaries from spinning up. However,
>> when the primary dies, the lock is no longer being held, even though
>> there might be other secondary processes still running.
>>
>> The fix is two-fold. First of all, downgrade the primary process's
>> exclusive lock to a shared lock once we have it. Second of all,
>> also take out shared locks on the config from the secondaries. We
>> are using fcntl() locks, which get dropped when the file handle is
>> closed, so also remove the closure of config file handle.
>>
>> Fixes: af75078fece3 ("first public release")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---

<snip>

>>          rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
>>                                  PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
>> @@ -330,8 +360,6 @@ rte_eal_config_reattach(void)
>>          mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
>>                          sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
>>                          mem_cfg_fd, 0);
>> -       close(mem_cfg_fd);
>> -       mem_cfg_fd = -1;
>>
>>          if (mem_config == MAP_FAILED) {
>>                  RTE_LOG(ERR, EAL, "Cannot mmap memory for rte_config! error %i (%s)\n",
> 
> We are missing a mem_cfg_fd cleanup if mmap failed.
> 

Good catch! Will fix.

> 
>> diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
>> index 34db78753..0f0726703 100644
>> --- a/lib/librte_eal/linux/eal/eal.c
>> +++ b/lib/librte_eal/linux/eal/eal.c
>> @@ -83,6 +83,13 @@ static struct flock wr_lock = {
>>                  .l_len = sizeof(early_mem_config.memsegs),
>>   };
>>
>> +static struct flock rd_lock = {
>> +       .l_type = F_RDLCK,

<snip>

>> -
>>          if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
>>                  if (mem_config != MAP_FAILED) {
>>                          /* errno is stale, don't use */
> 
> Idem.
> 
> Reviewed-by: David Marchand <david.marchand@redhat.com>
> 
> With https://patchwork.dpdk.org/patch/56501/, the code is now really
> close between Linux and FreeBSD, could it go to common entirely?

I would prefer to keep them separate due to upcoming Windows port.
  
Anatoly Burakov Oct. 24, 2019, 4:07 p.m. UTC | #7
On 12-Aug-19 11:03 AM, David Marchand wrote:
> On Wed, Jul 24, 2019 at 6:08 PM Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>>
>> Currently, primary process holds an exclusive lock on the config
>> file, thereby preventing other primaries from spinning up. However,
>> when the primary dies, the lock is no longer being held, even though
>> there might be other secondary processes still running.
>>
>> The fix is two-fold. First of all, downgrade the primary process's
>> exclusive lock to a shared lock once we have it. Second of all,
>> also take out shared locks on the config from the secondaries. We
>> are using fcntl() locks, which get dropped when the file handle is
>> closed, so also remove the closure of config file handle.
>>
>> Fixes: af75078fece3 ("first public release")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>

Hi David,

I've been investigating how to improve this patch, and i've hit a dead end.

The problems here are two-fold. Using fcntl() and flock() locks together 
is not advisable, so both primary-secondary detection and 
rte_eal_primary_proc_alive() (as per Harry's point) have to use the same 
method for checking locks.

Using flock() would work for this purpose. Unfortunately, on FreeBSD, 
converting exclusive lock into a shared lock involves unlocking first 
[1] (creating a race). On Linux it doesn't specifically say that it does 
that, but it does mention that it is not guaranteed to be atomic [2]. 
So, we can't use flock() here.

It seems that fcntl() lock conversions are atomic, however fcntl() locks 
on both Linux and FreeBSD are implemented in a very stupid way in that 
/any/ closure of a file descriptor drops /all/ locks on that file. 
Meaning, the moment secondary does the check in primary_proc_alive() and 
closes the config file fd, the process-wide lock drops. Mind you, 
primary_proc_alive() is implemented using lockf() rather than fcntl(), 
which is an issue in itself, but on Linux, lockf() is implemented on top 
of fcntl() locks and thus suffers from the same issue.

So, unless you have better ideas, i think this patch can be marked as 
rejected.

[1] https://www.freebsd.org/cgi/man.cgi?query=flock&sektion=2
[2] https://linux.die.net/man/2/flock
  
David Marchand Oct. 27, 2019, 7:41 p.m. UTC | #8
On Thu, Oct 24, 2019 at 6:07 PM Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 12-Aug-19 11:03 AM, David Marchand wrote:
> > On Wed, Jul 24, 2019 at 6:08 PM Anatoly Burakov
> > <anatoly.burakov@intel.com> wrote:
> >>
> >> Currently, primary process holds an exclusive lock on the config
> >> file, thereby preventing other primaries from spinning up. However,
> >> when the primary dies, the lock is no longer being held, even though
> >> there might be other secondary processes still running.
> >>
> >> The fix is two-fold. First of all, downgrade the primary process's
> >> exclusive lock to a shared lock once we have it. Second of all,
> >> also take out shared locks on the config from the secondaries. We
> >> are using fcntl() locks, which get dropped when the file handle is
> >> closed, so also remove the closure of config file handle.
> >>
> >> Fixes: af75078fece3 ("first public release")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >> ---
> >>
>
> Hi David,
>
> I've been investigating how to improve this patch, and i've hit a dead end.
>
> The problems here are two-fold. Using fcntl() and flock() locks together
> is not advisable, so both primary-secondary detection and
> rte_eal_primary_proc_alive() (as per Harry's point) have to use the same
> method for checking locks.
>
> Using flock() would work for this purpose. Unfortunately, on FreeBSD,
> converting exclusive lock into a shared lock involves unlocking first
> [1] (creating a race). On Linux it doesn't specifically say that it does
> that, but it does mention that it is not guaranteed to be atomic [2].
> So, we can't use flock() here.
>
> It seems that fcntl() lock conversions are atomic, however fcntl() locks
> on both Linux and FreeBSD are implemented in a very stupid way in that
> /any/ closure of a file descriptor drops /all/ locks on that file.
> Meaning, the moment secondary does the check in primary_proc_alive() and
> closes the config file fd, the process-wide lock drops. Mind you,
> primary_proc_alive() is implemented using lockf() rather than fcntl(),
> which is an issue in itself, but on Linux, lockf() is implemented on top
> of fcntl() locks and thus suffers from the same issue.
>
> So, unless you have better ideas, i think this patch can be marked as
> rejected.
>
> [1] https://www.freebsd.org/cgi/man.cgi?query=flock&sektion=2
> [2] https://linux.die.net/man/2/flock

Sorry, hard to digest, I would need to look at this again later.
If you have no easy solution, let's revisit after 19.11.

Thanks Anatoly.
  

Patch

diff --git a/lib/librte_eal/freebsd/eal/eal.c b/lib/librte_eal/freebsd/eal/eal.c
index d53f0fe69..69995bf8f 100644
--- a/lib/librte_eal/freebsd/eal/eal.c
+++ b/lib/librte_eal/freebsd/eal/eal.c
@@ -72,6 +72,13 @@  static struct flock wr_lock = {
 		.l_len = sizeof(early_mem_config.memsegs),
 };
 
+static struct flock rd_lock = {
+	.l_type = F_RDLCK,
+	.l_whence = SEEK_SET,
+	.l_start = offsetof(struct rte_mem_config, memsegs),
+	.l_len = sizeof(early_mem_config.memsegs),
+};
+
 /* Address of global and public configuration */
 static struct rte_config rte_config = {
 		.mem_config = &early_mem_config,
@@ -249,8 +256,21 @@  rte_eal_config_create(void)
 	if (retval < 0){
 		close(mem_cfg_fd);
 		mem_cfg_fd = -1;
-		RTE_LOG(ERR, EAL, "Cannot create lock on '%s'. Is another primary "
-			"process running?\n", pathname);
+		RTE_LOG(ERR, EAL, "Cannot create exclusive lock on '%s'. "
+			"Is another process running?\n", pathname);
+		return -1;
+	}
+
+	/* we hold an exclusive lock - now downgrade it to a read lock to allow
+	 * other processes to also hold onto this file while preventing other
+	 * primaries from spinning up.
+	 */
+	retval = fcntl(mem_cfg_fd, F_SETLK, &rd_lock);
+	if (retval < 0) {
+		close(mem_cfg_fd);
+		mem_cfg_fd = -1;
+		RTE_LOG(ERR, EAL, "Cannot downgrade to shared lock on '%s': %s\n",
+			pathname, strerror(errno));
 		return -1;
 	}
 
@@ -292,6 +312,16 @@  rte_eal_config_attach(void)
 			return -1;
 		}
 	}
+	/* lock the file to prevent primary from initializing while this
+	 * process is still running.
+	 */
+	if (fcntl(mem_cfg_fd, F_SETLK, &rd_lock) < 0) {
+		close(mem_cfg_fd);
+		mem_cfg_fd = -1;
+		RTE_LOG(ERR, EAL, "Cannot create shared lock on '%s': %s\n",
+				pathname, strerror(errno));
+		return -1;
+	}
 
 	rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
 				PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
@@ -330,8 +360,6 @@  rte_eal_config_reattach(void)
 	mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
 			sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
 			mem_cfg_fd, 0);
-	close(mem_cfg_fd);
-	mem_cfg_fd = -1;
 
 	if (mem_config == MAP_FAILED) {
 		RTE_LOG(ERR, EAL, "Cannot mmap memory for rte_config! error %i (%s)\n",
diff --git a/lib/librte_eal/linux/eal/eal.c b/lib/librte_eal/linux/eal/eal.c
index 34db78753..0f0726703 100644
--- a/lib/librte_eal/linux/eal/eal.c
+++ b/lib/librte_eal/linux/eal/eal.c
@@ -83,6 +83,13 @@  static struct flock wr_lock = {
 		.l_len = sizeof(early_mem_config.memsegs),
 };
 
+static struct flock rd_lock = {
+	.l_type = F_RDLCK,
+	.l_whence = SEEK_SET,
+	.l_start = offsetof(struct rte_mem_config, memsegs),
+	.l_len = sizeof(early_mem_config.memsegs),
+};
+
 /* Address of global and public configuration */
 static struct rte_config rte_config = {
 		.mem_config = &early_mem_config,
@@ -343,8 +350,21 @@  rte_eal_config_create(void)
 	if (retval < 0){
 		close(mem_cfg_fd);
 		mem_cfg_fd = -1;
-		RTE_LOG(ERR, EAL, "Cannot create lock on '%s'. Is another primary "
-			"process running?\n", pathname);
+		RTE_LOG(ERR, EAL, "Cannot create exclusive lock on '%s'. "
+			"Is another process running?\n", pathname);
+		return -1;
+	}
+
+	/* we hold an exclusive lock - now downgrade it to a read lock to allow
+	 * other processes to also hold onto this file while preventing other
+	 * primaries from spinning up.
+	 */
+	retval = fcntl(mem_cfg_fd, F_SETLK, &rd_lock);
+	if (retval < 0) {
+		close(mem_cfg_fd);
+		mem_cfg_fd = -1;
+		RTE_LOG(ERR, EAL, "Cannot downgrade to shared lock on '%s': %s\n",
+			pathname, strerror(errno));
 		return -1;
 	}
 
@@ -389,6 +409,16 @@  rte_eal_config_attach(void)
 			return -1;
 		}
 	}
+	/* lock the file to prevent primary from initializing while this
+	 * process is still running.
+	 */
+	if (fcntl(mem_cfg_fd, F_SETLK, &rd_lock) < 0) {
+		close(mem_cfg_fd);
+		mem_cfg_fd = -1;
+		RTE_LOG(ERR, EAL, "Cannot create shared lock on '%s': %s\n",
+				pathname, strerror(errno));
+		return -1;
+	}
 
 	/* map it as read-only first */
 	mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
@@ -427,9 +457,6 @@  rte_eal_config_reattach(void)
 			sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
 			mem_cfg_fd, 0);
 
-	close(mem_cfg_fd);
-	mem_cfg_fd = -1;
-
 	if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
 		if (mem_config != MAP_FAILED) {
 			/* errno is stale, don't use */