[v4] eal: refactor rte_eal_init into sub-functions

Message ID 1706103911-6907-1-git-send-email-rahulgupt@linux.microsoft.com (mailing list archive)
State Changes Requested, archived
Delegated to: David Marchand
Headers
Series [v4] eal: refactor rte_eal_init into sub-functions |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation success Compilation OK
ci/iol-intel-Performance success Performance Testing PASS
ci/github-robot: build fail github build: failed
ci/iol-intel-Functional fail Functional Testing issues
ci/Intel-compilation fail Compilation issues
ci/intel-Testing success Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-sample-apps-testing warning Testing issues
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-compile-amd64-testing fail Testing issues
ci/iol-unit-amd64-testing fail Testing issues
ci/iol-unit-arm64-testing fail Testing issues
ci/iol-compile-arm64-testing success Testing PASS
ci/loongarch-unit-testing fail Unit Testing FAIL

Commit Message

Rahul Gupta Jan. 24, 2024, 1:45 p.m. UTC
  From: Rahul Gupta <rahulgupt@microsoft.com>

In continuation to the following email, I am sending this patch.
(https://inbox.dpdk.org/dev/20231110172523.GA17466@microsoft.com/)

Initialization requires rte_eal_init + rte_pktmbuf_pool_create which
can consume a total time of 500-600 ms:
a) For many devices FLR may take a significant chunk of time
   (200-250 ms in our use-case), this FLR is triggered during device
   probe in rte_eal_init().
b) rte_pktmbuf_pool_create() can consume up to 300-350 ms for
applications that require huge memory.

This cost is incurred on each restart (which happens in our use-case
during binary updates for servicing).
This patch provides an optimization using pthreads that applications
can use and which can save 200-230ms.

In this patch, rte_eal_init() is refactored into two parts-
a) 1st part is dependent code ie- it’s a perquisite of the FLR and
   mempool creation. So this code needs to be executed before any
   pthreads. Its named as rte_eal_init_setup()
b) 2nd part of code is independent code ie- it can execute in parallel
   to mempool creation in a pthread. Its named as rte_eal_init_async_setup().

In existing applications no changes are required unless they wish to leverage
the optimization.

If the application wants to leverage this optimization, then it needs to call
rte_eal_init_async() (instead of call rte_eal_init()), then it can create a
thread using rte_eal_remote_launch() to schedule a task it would like todo in
parallel rte_eal_init_async_setup(), this task can be a mbuf pool creation
using- rte_pktmbuf_pool_create()

After this, if next operations require completion of above task, then
user can use rte_eal_init_wait_async_setup_complete(),
or if user wants to just check status of that thread, then use-
rte_eal_init_async_setup_done()

---
v2: Address Stephen Hemminger's comment
---
v3: address support for single lcore
---
v4: address Brue Richardson and Stephen Hemminger comment
Existing application need not do any changes if bootup optimization is not needed.

 app/test-pmd/testpmd.c    |  24 ++++++++-
 lib/eal/include/rte_eal.h | 107 ++++++++++++++++++++++++++++++++++++++
 lib/eal/linux/eal.c       |  62 ++++++++++++++++++++--
 lib/eal/version.map       |   7 +++
 4 files changed, 196 insertions(+), 4 deletions(-)
  

Comments

Stephen Hemminger Jan. 24, 2024, 3:47 p.m. UTC | #1
On Wed, 24 Jan 2024 05:45:11 -0800
Rahul Gupta <rahulgupt@linux.microsoft.com> wrote:

> +
> +/*
> + * waits until function executing on given lcore finishes.
> + * returns value returned by the function executing on that lcore.
> + */
> +__rte_experimental int
> +rte_eal_init_wait_async_setup_complete(void)
> +{
> +	int lcore_id = -1;
> +	lcore_id = rte_lcore_id();

Useless initialization.
  
Stephen Hemminger Jan. 24, 2024, 3:47 p.m. UTC | #2
On Wed, 24 Jan 2024 05:45:11 -0800
Rahul Gupta <rahulgupt@linux.microsoft.com> wrote:

> + * returns current status of execution on a given lcore
> + */
> +__rte_experimental int
> +rte_eal_init_async_setup_done(int lcore_id)
> +{
> +	int ret = (lcore_config[lcore_id].state)

parenthesis not needed here
  
David Marchand Jan. 24, 2024, 3:53 p.m. UTC | #3
On Wed, Jan 24, 2024 at 2:45 PM Rahul Gupta
<rahulgupt@linux.microsoft.com> wrote:
>
> From: Rahul Gupta <rahulgupt@microsoft.com>
>
> In continuation to the following email, I am sending this patch.
> (https://inbox.dpdk.org/dev/20231110172523.GA17466@microsoft.com/)
>
> Initialization requires rte_eal_init + rte_pktmbuf_pool_create which
> can consume a total time of 500-600 ms:
> a) For many devices FLR may take a significant chunk of time
>    (200-250 ms in our use-case), this FLR is triggered during device
>    probe in rte_eal_init().
> b) rte_pktmbuf_pool_create() can consume up to 300-350 ms for
> applications that require huge memory.
>
> This cost is incurred on each restart (which happens in our use-case
> during binary updates for servicing).
> This patch provides an optimization using pthreads that applications
> can use and which can save 200-230ms.
>
> In this patch, rte_eal_init() is refactored into two parts-
> a) 1st part is dependent code ie- it’s a perquisite of the FLR and
>    mempool creation. So this code needs to be executed before any
>    pthreads. Its named as rte_eal_init_setup()
> b) 2nd part of code is independent code ie- it can execute in parallel
>    to mempool creation in a pthread. Its named as rte_eal_init_async_setup().
>
> In existing applications no changes are required unless they wish to leverage
> the optimization.
>
> If the application wants to leverage this optimization, then it needs to call
> rte_eal_init_async() (instead of call rte_eal_init()), then it can create a
> thread using rte_eal_remote_launch() to schedule a task it would like todo in
> parallel rte_eal_init_async_setup(), this task can be a mbuf pool creation
> using- rte_pktmbuf_pool_create()
>
> After this, if next operations require completion of above task, then
> user can use rte_eal_init_wait_async_setup_complete(),
> or if user wants to just check status of that thread, then use-
> rte_eal_init_async_setup_done()

Looking at what this patch does.. I am under the impression all you
really need is rte_eal_init without initial probing.
Such behavior can probably be achieved with a allowlist set to a non
existing device (like for example "-a 0000:00:00.0"), then later, use
device hotplug.


Some quick note on this patch.

- don't expose symbols externally if they are only for internal use in
the same library,
- current version is 24.03, not 24.01 (wrt comment in version.map),
- other OSes are not handled by this patch, please do the work for
FreeBSD and Windows,
- as a followup of the previous point, please check if we can share
code between OSes and, if so, move the shared code to lib/eal/common,
- did you test this series with multiprocess?
- why should telemetry and other parts of the current rte_eal_init()
be left in the second stage of this initialisation pipeline?
  
Tyler Retzlaff Jan. 24, 2024, 5:49 p.m. UTC | #4
On Wed, Jan 24, 2024 at 05:45:11AM -0800, Rahul Gupta wrote:
> From: Rahul Gupta <rahulgupt@microsoft.com>
> 
> In continuation to the following email, I am sending this patch.
> (https://inbox.dpdk.org/dev/20231110172523.GA17466@microsoft.com/)
> 
> Initialization requires rte_eal_init + rte_pktmbuf_pool_create which
> can consume a total time of 500-600 ms:
> a) For many devices FLR may take a significant chunk of time
>    (200-250 ms in our use-case), this FLR is triggered during device
>    probe in rte_eal_init().
> b) rte_pktmbuf_pool_create() can consume up to 300-350 ms for
> applications that require huge memory.
> 
> This cost is incurred on each restart (which happens in our use-case
> during binary updates for servicing).
> This patch provides an optimization using pthreads that applications
> can use and which can save 200-230ms.
> 
> In this patch, rte_eal_init() is refactored into two parts-
> a) 1st part is dependent code ie- it’s a perquisite of the FLR and
>    mempool creation. So this code needs to be executed before any
>    pthreads. Its named as rte_eal_init_setup()
> b) 2nd part of code is independent code ie- it can execute in parallel
>    to mempool creation in a pthread. Its named as rte_eal_init_async_setup().
> 
> In existing applications no changes are required unless they wish to leverage
> the optimization.
> 
> If the application wants to leverage this optimization, then it needs to call
> rte_eal_init_async() (instead of call rte_eal_init()), then it can create a
> thread using rte_eal_remote_launch() to schedule a task it would like todo in
> parallel rte_eal_init_async_setup(), this task can be a mbuf pool creation
> using- rte_pktmbuf_pool_create()
> 
> After this, if next operations require completion of above task, then
> user can use rte_eal_init_wait_async_setup_complete(),
> or if user wants to just check status of that thread, then use-
> rte_eal_init_async_setup_done()
> 
> ---
> v2: Address Stephen Hemminger's comment
> ---
> v3: address support for single lcore
> ---
> v4: address Brue Richardson and Stephen Hemminger comment
> Existing application need not do any changes if bootup optimization is not needed.
> 
>  app/test-pmd/testpmd.c    |  24 ++++++++-
>  lib/eal/include/rte_eal.h | 107 ++++++++++++++++++++++++++++++++++++++
>  lib/eal/linux/eal.c       |  62 ++++++++++++++++++++--
>  lib/eal/version.map       |   7 +++
>  4 files changed, 196 insertions(+), 4 deletions(-)

NAK

changes to EAL need to cover all support OS/platforms, EAL is not
Linux-only library.
  
Rahul Gupta Jan. 29, 2024, 5:35 a.m. UTC | #5
On (01/24/24 16:53), David Marchand wrote:
> Date: Wed, 24 Jan 2024 16:53:33 +0100
> From: David Marchand <david.marchand@redhat.com>
> To: Rahul Gupta <rahulgupt@linux.microsoft.com>
> Cc: dev@dpdk.org, thomas@monjalon.net, bruce.richardson@intel.com,
>  dmitry.kozliuk@gmail.com, stephen@networkplumber.org,
>  sovaradh@linux.microsoft.com, okaya@kernel.org,
>  sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com,
>  krathinavel@microsoft.com, rahulrgupta27@gmail.com, Rahul Gupta
>  <rahulgupt@microsoft.com>
> Subject: Re: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into
>  sub-functions
> 
> On Wed, Jan 24, 2024 at 2:45 PM Rahul Gupta
> <rahulgupt@linux.microsoft.com> wrote:
> >
> > From: Rahul Gupta <rahulgupt@microsoft.com>
> >
> > In continuation to the following email, I am sending this patch.
> > (https://inbox.dpdk.org/dev/20231110172523.GA17466@microsoft.com/)
> >
> > Initialization requires rte_eal_init + rte_pktmbuf_pool_create which
> > can consume a total time of 500-600 ms:
> > a) For many devices FLR may take a significant chunk of time
> >    (200-250 ms in our use-case), this FLR is triggered during device
> >    probe in rte_eal_init().
> > b) rte_pktmbuf_pool_create() can consume up to 300-350 ms for
> > applications that require huge memory.
> >
> > This cost is incurred on each restart (which happens in our use-case
> > during binary updates for servicing).
> > This patch provides an optimization using pthreads that applications
> > can use and which can save 200-230ms.
> >
> > In this patch, rte_eal_init() is refactored into two parts-
> > a) 1st part is dependent code ie- it’s a perquisite of the FLR and
> >    mempool creation. So this code needs to be executed before any
> >    pthreads. Its named as rte_eal_init_setup()
> > b) 2nd part of code is independent code ie- it can execute in parallel
> >    to mempool creation in a pthread. Its named as rte_eal_init_async_setup().
> >
> > In existing applications no changes are required unless they wish to leverage
> > the optimization.
> >
> > If the application wants to leverage this optimization, then it needs to call
> > rte_eal_init_async() (instead of call rte_eal_init()), then it can create a
> > thread using rte_eal_remote_launch() to schedule a task it would like todo in
> > parallel rte_eal_init_async_setup(), this task can be a mbuf pool creation
> > using- rte_pktmbuf_pool_create()
> >
> > After this, if next operations require completion of above task, then
> > user can use rte_eal_init_wait_async_setup_complete(),
> > or if user wants to just check status of that thread, then use-
> > rte_eal_init_async_setup_done()
> 
> Looking at what this patch does.. I am under the impression all you
> really need is rte_eal_init without initial probing.
> Such behavior can probably be achieved with a allowlist set to a non
> existing device (like for example "-a 0000:00:00.0"), then later, use
> device hotplug.
The patch will be useful to all the adapters irrespective of their
host plug support.
> 
> Some quick note on this patch.
> 
> - don't expose symbols externally if they are only for internal use in
> the same library,
done in next patch.
> - current version is 24.03, not 24.01 (wrt comment in version.map),
done
> - other OSes are not handled by this patch, please do the work for
> FreeBSD and Windows,
I can send patch to support non-linux OS, but due to lack of setup,
I will need help to test same.

Also, I am planning to do the porting at the end (ie after incorporating
all review comments, in order to prevent duplication of efforts).

> - as a followup of the previous point, please check if we can share
> code between OSes and, if so, move the shared code to lib/eal/common,
The code for rte_eal_init() is different for all three distros, (even if
I consider just the 1st part of rte_eal_init() ie rte_eal_init_setup()).
So its difficult to move to common dir. We will have todo it separately
for all OS.
> - did you test this series with multiprocess?
Yes it works fine, I have tested it with simple_mp app.
> - why should telemetry and other parts of the current rte_eal_init()
> be left in the second stage of this initialisation pipeline?
Actually motivation of this patch was todo most of the work in parallel
ie in second stage, so not only probe/FLR but telemetry and any other work
which can be executed in parallel are done here. (pls refer to the link
in the commit message for more details)
> 
> 
> -- 
> David Marchand


Thanks for the reviewing, my comments are inline.

Thanks,
Rahul.
  
David Marchand Jan. 29, 2024, 7:55 a.m. UTC | #6
On Mon, Jan 29, 2024 at 6:35 AM Rahul Gupta
<rahulgupt@linux.microsoft.com> wrote:
> > Looking at what this patch does.. I am under the impression all you
> > really need is rte_eal_init without initial probing.
> > Such behavior can probably be achieved with a allowlist set to a non
> > existing device (like for example "-a 0000:00:00.0"), then later, use
> > device hotplug.
> The patch will be useful to all the adapters irrespective of their
> host plug support.

I did not say hotplug support is needed.
If what I described already works, this patch adds nothing.
  
Thomas Monjalon Feb. 2, 2024, 10:21 a.m. UTC | #7
29/01/2024 08:55, David Marchand:
> On Mon, Jan 29, 2024 at 6:35 AM Rahul Gupta
> <rahulgupt@linux.microsoft.com> wrote:
> > > Looking at what this patch does.. I am under the impression all you
> > > really need is rte_eal_init without initial probing.
> > > Such behavior can probably be achieved with a allowlist set to a non
> > > existing device (like for example "-a 0000:00:00.0"), then later, use
> > > device hotplug.
> > The patch will be useful to all the adapters irrespective of their
> > host plug support.
> 
> I did not say hotplug support is needed.
> If what I described already works, this patch adds nothing.

I agree with David.
Disabling initial probing should provide what you want.
Did you test his proposal?
  
Rahul Gupta Feb. 3, 2024, 12:57 p.m. UTC | #8
On (02/02/24 11:21), Thomas Monjalon wrote:
> Date: Fri, 02 Feb 2024 11:21:59 +0100
> From: Thomas Monjalon <thomas@monjalon.net>
> To: Rahul Gupta <rahulgupt@linux.microsoft.com>
> Cc: David Marchand <david.marchand@redhat.com>, dev@dpdk.org,
>  bruce.richardson@intel.com, dmitry.kozliuk@gmail.com,
>  stephen@networkplumber.org, sovaradh@linux.microsoft.com,
>  okaya@kernel.org, sujithsankar@microsoft.com,
>  sowmini.varadhan@microsoft.com, krathinavel@microsoft.com,
>  rahulrgupta27@gmail.com
> Subject: Re: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into
>  sub-functions
> 
> 29/01/2024 08:55, David Marchand:
> > On Mon, Jan 29, 2024 at 6:35 AM Rahul Gupta
> > <rahulgupt@linux.microsoft.com> wrote:
> > > > Looking at what this patch does.. I am under the impression all you
> > > > really need is rte_eal_init without initial probing.
> > > > Such behavior can probably be achieved with a allowlist set to a non
> > > > existing device (like for example "-a 0000:00:00.0"), then later, use
> > > > device hotplug.
> > > The patch will be useful to all the adapters irrespective of their
> > > host plug support.
> > 
> > I did not say hotplug support is needed.
> > If what I described already works, this patch adds nothing.
> 
> I agree with David.
> Disabling initial probing should provide what you want.
> Did you test his proposal?
> 
> 
Yes, I was about to reply after testing same, will be done with testing in few days.
But I think the bootup time saved by my patch and hot plug patch will be almost same,
because apart from FLR (probe()) the extra work done by my patch (i.e. telemetry,
rte_service_init() in parallel to mbuf pool creation) are consuming very less bootup time.
So in future if more things are added to 2nd part of eal_init (i.e. rte_eal_init_async_setup()),
then the bootup time will be less if we use my patch.
I think we can defer this patch till then.

Thanks,
Rahul.
  
Rahul Gupta Feb. 8, 2024, 5:05 p.m. UTC | #9
On (02/03/24 04:57), Rahul Gupta wrote:
> Date: Sat, 3 Feb 2024 04:57:49 -0800
> From: Rahul Gupta <rahulgupt@linux.microsoft.com>
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: David Marchand <david.marchand@redhat.com>, dev@dpdk.org,
>  bruce.richardson@intel.com, dmitry.kozliuk@gmail.com,
>  stephen@networkplumber.org, sovaradh@linux.microsoft.com,
>  okaya@kernel.org, sujithsankar@microsoft.com,
>  sowmini.varadhan@microsoft.com, krathinavel@microsoft.com,
>  rahulrgupta27@gmail.com
> Subject: Re: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into
>  sub-functions
> User-Agent: Mutt/1.5.21 (2010-09-15)
> 
> On (02/02/24 11:21), Thomas Monjalon wrote:
> > Date: Fri, 02 Feb 2024 11:21:59 +0100
> > From: Thomas Monjalon <thomas@monjalon.net>
> > To: Rahul Gupta <rahulgupt@linux.microsoft.com>
> > Cc: David Marchand <david.marchand@redhat.com>, dev@dpdk.org,
> >  bruce.richardson@intel.com, dmitry.kozliuk@gmail.com,
> >  stephen@networkplumber.org, sovaradh@linux.microsoft.com,
> >  okaya@kernel.org, sujithsankar@microsoft.com,
> >  sowmini.varadhan@microsoft.com, krathinavel@microsoft.com,
> >  rahulrgupta27@gmail.com
> > Subject: Re: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into
> >  sub-functions
> > 
> > 29/01/2024 08:55, David Marchand:
> > > On Mon, Jan 29, 2024 at 6:35 AM Rahul Gupta
> > > <rahulgupt@linux.microsoft.com> wrote:
> > > > > Looking at what this patch does.. I am under the impression all you
> > > > > really need is rte_eal_init without initial probing.
> > > > > Such behavior can probably be achieved with a allowlist set to a non
> > > > > existing device (like for example "-a 0000:00:00.0"), then later, use
> > > > > device hotplug.
> > > > The patch will be useful to all the adapters irrespective of their
> > > > host plug support.
> > > 
> > > I did not say hotplug support is needed.
> > > If what I described already works, this patch adds nothing.
> > 
> > I agree with David.
> > Disabling initial probing should provide what you want.
> > Did you test his proposal?
> > 
> > 
> Yes, I was about to reply after testing same, will be done with testing in few days.
> But I think the bootup time saved by my patch and hot plug patch will be almost same,
> because apart from FLR (probe()) the extra work done by my patch (i.e. telemetry,
> rte_service_init() in parallel to mbuf pool creation) are consuming very less bootup time.
> So in future if more things are added to 2nd part of eal_init (i.e. rte_eal_init_async_setup()),
> then the bootup time will be less if we use my patch.
> I think we can defer this patch till then.
> 
> Thanks,
> Rahul.

Initial tests looks ok wrt application bootup time saving by using the hot plug APIs
and so we may not need the patch for rte_eal_init().

Thanks for revewing patch and suggestions.
Thanks,
Rahul.
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9e4e99e53b..c8eb194f64 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -4531,6 +4531,8 @@  main(int argc, char** argv)
 	portid_t port_id;
 	uint16_t count;
 	int ret;
+	int lcore_id;
+	int main_lcore_id;
 
 #ifdef RTE_EXEC_ENV_WINDOWS
 	signal(SIGINT, signal_handler);
@@ -4550,11 +4552,31 @@  main(int argc, char** argv)
 		rte_exit(EXIT_FAILURE, "Cannot register log type");
 	rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);
 
-	diag = rte_eal_init(argc, argv);
+	diag = rte_eal_init_async(argc, argv);
 	if (diag < 0)
 		rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n",
 			 rte_strerror(rte_errno));
 
+	main_lcore_id = rte_get_main_lcore();
+	lcore_id = rte_get_next_lcore(main_lcore_id, 0, 1);
+	/* Gives status of rte_eal_init_async() */
+	if (main_lcore_id != lcore_id)
+		while (rte_eal_init_async_setup_done(lcore_id) == 0)
+			;
+
+	/*
+	 * Use rte_eal_init_wait_async_setup_complete() to get return value of
+	 * rte_eal_init_async().
+	 * Or
+	 * if testpmd application don't want to know progress/status of
+	 * rte_eal_init_async() and just want to wait till it finishes
+	 * then use following function.
+	 */
+	ret = rte_eal_init_wait_async_setup_complete();
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Cannot init EAL: "
+			 "rte_eal_init_async() failed: %s\n",
+			 strerror(ret));
 	/* allocate port structures, and init them */
 	init_port();
 
diff --git a/lib/eal/include/rte_eal.h b/lib/eal/include/rte_eal.h
index c2256f832e..6d7044b632 100644
--- a/lib/eal/include/rte_eal.h
+++ b/lib/eal/include/rte_eal.h
@@ -111,6 +111,113 @@  int rte_eal_iopl_init(void);
  */
 int rte_eal_init(int argc, char **argv);
 
+/**
+ * Initialize the Environment Abstraction Layer (EAL).
+ *
+ * This function is to be executed on the MAIN lcore only, as soon
+ * as possible in the application's main() function.
+ * It puts the WORKER lcores in the WAIT state.
+ *
+ * @param argc
+ *   A non-negative value.  If it is greater than 0, the array members
+ *   for argv[0] through argv[argc] (non-inclusive) shall contain pointers
+ *   to strings.
+ * @param argv
+ *   An array of strings.  The contents of the array, as well as the strings
+ *   which are pointed to by the array, may be modified by this function.
+ *   The program name pointer argv[0] is copied into the last parsed argv
+ *   so that argv[0] is still the same after deducing the parsed arguments.
+ * @return
+ *   - On success, the number of parsed arguments, which is greater or
+ *     equal to zero. After the call to rte_eal_init_async(),
+ *     all arguments argv[x] with x < ret may have been modified by this
+ *     function call and should not be further interpreted by the
+ *     application.  The EAL does not take any ownership of the memory used
+ *     for either the argv array, or its members.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.  In some instances, the application will need to be
+ *     restarted as part of clearing the issue.
+ *
+ *   Error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            setup may be attempted again.
+ *
+ *     EALREADY indicates that the rte_eal_init_async function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     ENOMEM indicates failure likely caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EPROTO indicates that the PCI bus is either not present, or is not
+ *            readable by the eal.
+ *
+ *     ENOEXEC indicates that a service core failed to launch successfully.
+ */
+int rte_eal_init_async(int argc, char **argv);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Initial setup
+ *
+ * Its called from rte_eal_init() on MAIN lcore only and must NOT be directly
+ * called by user application.
+ * The driver dependent code is present in this function, ie before calling any other
+ * function eal library function this function must be complete successfully.
+ *
+ * return value is same as rte_eal_init().
+ */
+
+__rte_experimental
+int rte_eal_init_setup(int argc, char **argv);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): FLR and probe device
+ *
+ * Its thread is forked by rte_eal_init() and must NOT be directly called by user application.
+ * Launched on next available worker lcore.
+ * In this function initialisation needed for memory pool creation is completed,
+ * so this code can be executed in parallel to non device related operations
+ * like mbuf pool creation.
+ *
+ * return value is same as rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_async_setup(__attribute__((unused)) void *arg);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Indication of rte_eal_init() completion
+ *
+ * It waits for rte_eal_init_async() to finish. It MUST be called from application,
+ * when a thread join is needed. Typically application will call this function after
+ * it performs all device independent operation (like mbuf pool creation) on initial lcore.
+ *
+ * return value is same as rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_wait_async_setup_complete(void);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Indication of rte_eal_init() completion
+ *
+ * It shows status of rte_eal_init_async() ie the function is executing or completed.
+ * It MUST be called from application,
+ * Typically an application will call this function when it wants to know status of
+ * rte_eal_init_async() (ie FLR and probe thread).
+ *
+ * return value is same as rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_async_setup_done(int lcore_id);
+
 /**
  * Clean up the Environment Abstraction Layer (EAL)
  *
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index fd422f1f62..e86aac5980 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -962,9 +962,8 @@  eal_worker_thread_create(unsigned int lcore_id)
 	return ret;
 }
 
-/* Launch threads, called at application init(). */
-int
-rte_eal_init(int argc, char **argv)
+__rte_experimental
+int rte_eal_init_setup(int argc, char **argv)
 {
 	int i, fctret, ret;
 	static RTE_ATOMIC(uint32_t) run_once;
@@ -1268,7 +1267,15 @@  rte_eal_init(int argc, char **argv)
 	 */
 	rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MAIN);
 	rte_eal_mp_wait_lcore();
+	return fctret;
+}
 
+__rte_experimental int
+rte_eal_init_async_setup(__attribute__((unused)) void *arg)
+{
+	int ret = 0;
+	struct internal_config *internal_conf =
+		eal_get_internal_configuration();
 	/* initialize services so vdevs register service during bus_probe. */
 	ret = rte_service_init();
 	if (ret) {
@@ -1322,6 +1329,55 @@  rte_eal_init(int argc, char **argv)
 
 	eal_mcfg_complete();
 
+	return 0;
+}
+
+/*
+ * waits until function executing on given lcore finishes.
+ * returns value returned by the function executing on that lcore.
+ */
+__rte_experimental int
+rte_eal_init_wait_async_setup_complete(void)
+{
+	int lcore_id = -1;
+	lcore_id = rte_lcore_id();
+	lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+	int ret = rte_eal_wait_lcore(lcore_id);
+	return ret;
+}
+
+/*
+ * returns current status of execution on a given lcore
+ */
+__rte_experimental int
+rte_eal_init_async_setup_done(int lcore_id)
+{
+	int ret = (lcore_config[lcore_id].state);
+	return (ret == WAIT);
+}
+
+/* Launch threads, called at application init(). */
+int
+rte_eal_init(int argc, char **argv)
+{
+	int fctret = rte_eal_init_setup(argc, argv);
+	if (fctret < 0)
+		return fctret;
+	return rte_eal_init_async_setup(NULL);
+}
+
+/* Launch threads, called at application init(). */
+__rte_experimental int
+rte_eal_init_async(int argc, char **argv)
+{
+	int lcore_id;
+	int fctret = rte_eal_init_setup(argc, argv);	/* initial lcore*/
+	if (fctret < 0)
+		return fctret;
+	lcore_id = rte_lcore_id();
+	lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+	/* running on a worker lcore */
+	rte_eal_remote_launch(rte_eal_init_async_setup, NULL, lcore_id);
 	return fctret;
 }
 
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 5e0cd47c82..5e7ccb67c4 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -393,6 +393,13 @@  EXPERIMENTAL {
 	# added in 23.07
 	rte_memzone_max_get;
 	rte_memzone_max_set;
+
+	# added in 24.01
+	rte_eal_init_async;
+	rte_eal_init_setup;
+	rte_eal_init_async_setup;
+	rte_eal_init_async_setup_done;
+	rte_eal_init_wait_async_setup_complete;
 };
 
 INTERNAL {