[dpdk-dev,1/2] eal/linux: move plugin load to very start of eal init

Message ID 1425912999-13118-2-git-send-email-david.marchand@6wind.com (mailing list archive)
State Rejected, archived

Commit Message

David Marchand March 9, 2015, 2:56 p.m. UTC
Loading shared libraries should be done at the very start of eal init so that
the code statically built into dpdk and the code loaded from shared objects are
handled (almost) the same way with respect to the call to rte_eal_init().
The only thing that must be done beforehand is filling the solib_list, which is
done by eal_parse_args().

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/linuxapp/eal/eal.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
  

Comments

Neil Horman March 9, 2015, 3:21 p.m. UTC | #1
On Mon, Mar 09, 2015 at 03:56:38PM +0100, David Marchand wrote:
> Loading shared libraries should be done at the very start of eal init so that
> the code statically built into dpdk and the code loaded from shared objects are
> handled (almost) the same way with respect to the call to rte_eal_init().
> The only thing that must be done beforehand is filling the solib_list, which is
> done by eal_parse_args().
> 
> Signed-off-by: David Marchand <david.marchand@6wind.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 16f9e7c..c1c103d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -725,6 +725,13 @@ rte_eal_init(int argc, char **argv)
>  	if (fctret < 0)
>  		exit(1);
>  
> +	TAILQ_FOREACH(solib, &solib_list, next) {
> +		RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> +		solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> +		if (solib->lib_handle == NULL)
> +			RTE_LOG(WARNING, EAL, "%s\n", dlerror());
> +	}
> +
>  	/* set log level as early as possible */
>  	rte_set_log_level(internal_config.log_level);
>  
> @@ -797,13 +804,6 @@ rte_eal_init(int argc, char **argv)
>  
>  	rte_eal_mcfg_complete();
>  
> -	TAILQ_FOREACH(solib, &solib_list, next) {
> -		RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> -		solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> -		if (solib->lib_handle == NULL)
> -			RTE_LOG(WARNING, EAL, "%s\n", dlerror());
> -	}
> -
>  	eal_thread_init_master(rte_config.master_lcore);
>  
>  	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
> -- 
> 1.7.10.4
> 
> 

I don't see anything explicitly wrong with this, but at the same time it doesn't
seem to fix anything.  Is there a particular bug that you're fixing in relation
to your cover letter here?  Or is there some expectation that PMDs loaded in
this fashion expect the dpdk to be completely uninitialized?  That would seem
like a strange operational requirement to me.

Neil
  
David Marchand March 10, 2015, 9:08 a.m. UTC | #2
Hello Neil,

On Mon, Mar 9, 2015 at 4:21 PM, Neil Horman <nhorman@tuxdriver.com> wrote:

> On Mon, Mar 09, 2015 at 03:56:38PM +0100, David Marchand wrote:
> > Loading shared libraries should be done at the very start of eal init so
> > that the code statically built into dpdk and the code loaded from shared
> > objects are handled (almost) the same way with respect to the call to
> > rte_eal_init().
> > The only thing that must be done beforehand is filling the solib_list,
> > which is done by eal_parse_args().
> >
>
> I don't see anything explicitly wrong with this, but at the same time it
> doesn't seem to fix anything.  Is there a particular bug that you're fixing
> in relation to your cover letter here?  Or is there some expectation that
> PMDs loaded in this fashion expect the dpdk to be completely uninitialized?
> That would seem like a strange operational requirement to me.
>

Well, at first, I wanted to fix the virtio pmd init issue (iopl() not
called at the right place wrt other pthreads created in rte_eal_init()).
With the next patch, this issue is fixed for the statically built-in virtio
pmd, but for the virtio pmd as a shared object, the dlopen comes too late.
So, yes, I moved the dlopen() for this reason.

From a more general point of view, since we support both static and dso
pmds, I would say that it is more logical to have the dlopen come very
early, since static code is "loaded" even earlier: if the current pmds
needed more than just registering to the driver list, they would already
have triggered segfaults and/or bugs.


I know this change comes really late for 2.0.
I am open to other ideas but I don't want to see more #ifdef <some feature>
in eal.c (especially for a pmd), this is nonsense.

I would say that at least patch 2 is needed for 2.0: it fixes the
static case, but without patch 1 the virtio pmd triggers a segfault on
interrupt receipt when built as a dso.
  
Neil Horman March 10, 2015, 10:55 a.m. UTC | #3
On Tue, Mar 10, 2015 at 10:08:24AM +0100, David Marchand wrote:
> Hello Neil,
> 
> On Mon, Mar 9, 2015 at 4:21 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> > On Mon, Mar 09, 2015 at 03:56:38PM +0100, David Marchand wrote:
> > > Loading shared libraries should be done at the very start of eal init so
> > > that the code statically built into dpdk and the code loaded from shared
> > > objects are handled (almost) the same way with respect to the call to
> > > rte_eal_init().
> > > The only thing that must be done beforehand is filling the solib_list,
> > > which is done by eal_parse_args().
> > >
> >
> > I don't see anything explicitly wrong with this, but at the same time it
> > doesn't seem to fix anything.  Is there a particular bug that you're fixing
> > in relation to your cover letter here?  Or is there some expectation that
> > PMDs loaded in this fashion expect the dpdk to be completely uninitialized?
> > That would seem like a strange operational requirement to me.
> >
> 
> Well, at first, I wanted to fix the virtio pmd init issue (iopl() not
> called at the right place wrt other pthreads created in rte_eal_init()).
Ah, this is what you were addressing:
http://dpdk.org/ml/archives/dev/2015-March/014765.html

> With the next patch, this issue is fixed for the statically built-in virtio
> pmd, but for the virtio pmd as a shared object, the dlopen comes too late.
> So, yes, I moved the dlopen() for this reason.
> 
But this doesn't do anything to help you.  The goal, according to the above
thread, is to initialize the pmd earlier so that you can call iopl prior to doing
any forks (so that io privileges are inherited).  But both static and dynamic
pmds have constructors that just register their driver structures.  No
initialization happens until rte_eal_dev_init is called.  So this movement does
nothing to change the time at which any given driver's init routine is called.
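
A minimal sketch of the register-now, initialize-later pattern described
above.  Every name here (demo_driver, demo_virtio, demo_dev_init, ...) is a
hypothetical stand-in, not the real DPDK structures or rte_eal_dev_init, and
__attribute__((constructor)) assumes GCC or Clang.  The constructor only links
the driver into a list; the driver's init routine runs when the later
device-init step is called, regardless of when the constructor ran.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical driver record, loosely modelled on a PMD registration. */
struct demo_driver {
	TAILQ_ENTRY(demo_driver) next;
	const char *name;
	int (*init)(void);	/* the real work, deferred until device init */
	int initialized;
};

TAILQ_HEAD(demo_driver_list, demo_driver);
static struct demo_driver_list demo_drivers =
	TAILQ_HEAD_INITIALIZER(demo_drivers);

/* Placeholder for whatever a driver does when it is actually initialized. */
static int demo_virtio_init(void)
{
	return 0;
}

static struct demo_driver demo_virtio = {
	.name = "demo_virtio",
	.init = demo_virtio_init,
};

/*
 * Runs when the object is loaded: before main() if linked statically,
 * inside dlopen() if built as a shared object.  It only registers.
 */
__attribute__((constructor))
static void demo_virtio_register(void)
{
	TAILQ_INSERT_TAIL(&demo_drivers, &demo_virtio, next);
}

/* Stand-in for the later device-init step: this is where init happens. */
int demo_dev_init(void)
{
	struct demo_driver *drv;
	int count = 0;

	TAILQ_FOREACH(drv, &demo_drivers, next) {
		assert(drv->init != NULL);
		if (drv->init() == 0) {
			drv->initialized = 1;
			count++;
		}
	}
	return count;
}

/* Number of drivers registered so far. */
int demo_driver_count(void)
{
	struct demo_driver *drv;
	int count = 0;

	TAILQ_FOREACH(drv, &demo_drivers, next)
		count++;
	return count;
}

/* Number of drivers whose init routine has completed. */
int demo_initialized_count(void)
{
	struct demo_driver *drv;
	int count = 0;

	TAILQ_FOREACH(drv, &demo_drivers, next)
		count += drv->initialized;
	return count;
}
```

In this sketch, moving the point at which the constructor fires changes only
when the driver shows up in the list, not when demo_virtio_init() is called.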

> From a more general point of view, since we support both static and dso
> pmds, I would say that it is more logical to have the dlopen come very
> early, since static code is "loaded" even earlier: if the current pmds
> needed more than just registering to the driver list, they would already
> have triggered segfaults and/or bugs.
> 
No, not really.  I suppose it doesn't hurt anything, but moving this earlier in
a function doesn't really buy you anything, as statically allocated pmds have
their constructors called by the gcc start code prior to an application's main
routine running, so we're never actually going to get close to parity there, nor
do we need to, because the actual init happens at rte_eal_dev_init, which is in
parity for both static and dynamic drivers.

> 
> I know this change comes really late for 2.0.
> I am open to other ideas but I don't want to see more #ifdef <some feature>
> in eal.c (especially for a pmd), this is nonsense.
>
> I would say that at least patch 2 is needed for 2.0: it fixes the
> static case, but without patch 1 the virtio pmd triggers a segfault on
> interrupt receipt when built as a dso.
> 
The static case suffers from problems as well I think, in that it's possible to
architect multiple processes that are not started from fork that use the same
pmd, which would create the same issue.  I think a better course of action would
be to document the need for an application to call iopl before rte_eal_init.
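
A rough sketch of that ordering, under stated assumptions: app_start() and the
demo_* stubs are hypothetical names invented for illustration, and the two
steps are passed in as function pointers so the ordering can be exercised
without root.  In a real x86 Linux application the first pointer would
presumably be iopl() from <sys/io.h> (which needs CAP_SYS_RAWIO) and the second
rte_eal_init().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook types matching int iopl(int) and int rte_eal_init(int, char **). */
typedef int (*io_priv_fn)(int level);
typedef int (*eal_init_fn)(int argc, char **argv);

/*
 * Raise I/O privileges first, then start the EAL, so that threads created
 * during EAL init inherit the privilege level of the calling thread.
 */
int app_start(io_priv_fn raise_io_priv, eal_init_fn eal_init,
	      int argc, char **argv)
{
	assert(raise_io_priv != NULL && eal_init != NULL);

	/* Level 3 is the level that allows access to all I/O ports on x86. */
	if (raise_io_priv(3) != 0)
		return -1;	/* never start the EAL without I/O privileges */

	return eal_init(argc, argv);
}

/* Recording stubs, only here so the call order can be checked. */
static char demo_trace[3];
static size_t demo_trace_len;

static void demo_record(char c)
{
	if (demo_trace_len < 2)
		demo_trace[demo_trace_len++] = c;
}

int demo_fake_iopl(int level)
{
	demo_record('i');
	return level == 3 ? 0 : -1;
}

int demo_failing_iopl(int level)
{
	(void)level;
	return -1;
}

int demo_fake_eal_init(int argc, char **argv)
{
	(void)argc;
	(void)argv;
	demo_record('e');
	return 0;
}

/* Returns the recorded call order as a NUL-terminated string. */
const char *demo_trace_str(void)
{
	return demo_trace;
}
```

With the real functions, the call would look like
app_start(iopl, rte_eal_init, argc, argv) at the top of main().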

Neil

> 
> -- 
> David Marchand
  
Stephen Hemminger Oct. 14, 2015, 12:05 a.m. UTC | #4
On Tue, 10 Mar 2015 06:55:41 -0400
Neil Horman <nhorman@tuxdriver.com> wrote:

> The static case suffers from problems as well I think, in that it's possible to
> architect multiple processes that are not started from fork that use the same
> pmd, which would create the same issue.  I think a better course of action would
> be to document the need for an application to call iopl before rte_eal_init.
> 

Given all this, I recommend that Thomas not apply this patch.
Please resubmit if there is a real problem with drivers (something in tree).
There are enough other bugs to fix without chasing ghosts.
  
David Marchand Oct. 14, 2015, 9:55 a.m. UTC | #5
On Wed, Oct 14, 2015 at 2:05 AM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> Given all this, I recommend that Thomas not apply this patch.
> Please resubmit if there is a real problem with drivers (something in tree).
> There are enough other bugs to fix without chasing ghosts.
>

Yes, this patch can be dropped.
Removed it from patchwork.
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 16f9e7c..c1c103d 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -725,6 +725,13 @@  rte_eal_init(int argc, char **argv)
 	if (fctret < 0)
 		exit(1);
 
+	TAILQ_FOREACH(solib, &solib_list, next) {
+		RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
+		solib->lib_handle = dlopen(solib->name, RTLD_NOW);
+		if (solib->lib_handle == NULL)
+			RTE_LOG(WARNING, EAL, "%s\n", dlerror());
+	}
+
 	/* set log level as early as possible */
 	rte_set_log_level(internal_config.log_level);
 
@@ -797,13 +804,6 @@  rte_eal_init(int argc, char **argv)
 
 	rte_eal_mcfg_complete();
 
-	TAILQ_FOREACH(solib, &solib_list, next) {
-		RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
-		solib->lib_handle = dlopen(solib->name, RTLD_NOW);
-		if (solib->lib_handle == NULL)
-			RTE_LOG(WARNING, EAL, "%s\n", dlerror());
-	}
-
 	eal_thread_init_master(rte_config.master_lcore);
 
 	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);