eal/linux: register mp hotplug callback after memory init

Message ID 20230531065506.63021-1-wangzhihong.wzh@bytedance.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Headers
Series eal/linux: register mp hotplug callback after memory init |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-unit-testing success Testing PASS

Commit Message

王志宏 May 31, 2023, 6:55 a.m. UTC
  Secondary would crash if it tries to handle mp requests before memory
init, since globals such as eth_dev_shared_data_lock are not accessible
to it at this moment.
---
 lib/eal/linux/eal.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
  

Comments

Anatoly Burakov June 1, 2023, 12:26 p.m. UTC | #1
On 5/31/2023 7:55 AM, Zhihong Wang wrote:
> Secondary would crash if it tries to handle mp requests before memory
> init, since globals such as eth_dev_shared_data_lock are not accessible
> to it at this moment.
> ---

Please correct me if I'm wrong, but if init is not completed, none of 
the IPC stuff is initialized either, and any hotplug requests would not 
trigger any callbacks in the first place?
  
王志宏 June 2, 2023, 3:33 a.m. UTC | #2
Apologize that I have to go directly to function names to explain :)
  - rte_eal_intr_init creates eal_intr_thread_main which starts
eal_intr_handle_interrupts
  - rte_mp_channel_init creates mp_handle which processes messages
registered by rte_mp_action_register
  - then, eal_mp_dev_hotplug_init calls rte_mp_action_register to register
handle_primary_request for EAL_DEV_MP_ACTION_REQUEST

At this point the whole messaging mechanism starts to function: When
primary attaches/detaches devices, it sends EAL_DEV_MP_ACTION_REQUEST, and
handle_primary_request invokes __handle_primary_request, which
calls local_dev_probe/remove. In the end it goes to for
example rte_eth_dev_attach_secondary.

Now, if secondary is somewhere after eal_mp_dev_hotplug_init but before
memory init done, it will crash due to memory access violation.

Thanks
Zhihong
From: "Burakov, Anatoly"<anatoly.burakov@intel.com>
Date:  Thu, Jun 1, 2023, 20:26
Subject:  [External] Re: [PATCH] eal/linux: register mp hotplug callback
after memory init
To: "Zhihong Wang"<wangzhihong.wzh@bytedance.com>, <dev@dpdk.org>, <
qi.z.zhang@intel.com>
On 5/31/2023 7:55 AM, Zhihong Wang wrote:
> Secondary would crash if it tries to handle mp requests before memory
> init, since globals such as eth_dev_shared_data_lock are not accessible
> to it at this moment.
> ---

Please correct me if I'm wrong, but if init is not completed, none of
the IPC stuff is initialized either, and any hotplug requests would not
trigger any callbacks in the first place?
  
Anatoly Burakov June 2, 2023, 1:10 p.m. UTC | #3
On 6/2/2023 4:33 AM, 王志宏 wrote:
> Apologize that I have to go directly to function names to explain :)
>    - rte_eal_intr_init creates eal_intr_thread_main which starts 
> eal_intr_handle_interrupts
>    - rte_mp_channel_init creates mp_handle which processes messages 
> registered by rte_mp_action_register
>    - then, eal_mp_dev_hotplug_init calls rte_mp_action_register to 
> register handle_primary_request for EAL_DEV_MP_ACTION_REQUEST
> 
> At this point the whole messaging mechanism starts to function: When 
> primary attaches/detaches devices, it sends EAL_DEV_MP_ACTION_REQUEST, 
> and handle_primary_request invokes __handle_primary_request, which 
> calls local_dev_probe/remove. In the end it goes to for 
> example rte_eth_dev_attach_secondary.
> 
> Now, if secondary is somewhere after eal_mp_dev_hotplug_init but before 
> memory init done, it will crash due to memory access violation.
> 
> Thanks
> Zhihong

Hi,

I went into the code for IPC and you're right: we're only ignoring 
requests from *unregistered* callbacks, but if they were already 
registered, we handle them, which would cause your issue.
  
Anatoly Burakov June 2, 2023, 1:10 p.m. UTC | #4
On 5/31/2023 7:55 AM, Zhihong Wang wrote:
> Secondary would crash if it tries to handle mp requests before memory
> init, since globals such as eth_dev_shared_data_lock are not accessible
> to it at this moment.
> ---

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
David Marchand June 7, 2023, 8:21 p.m. UTC | #5
Hello Zhihong,

On Wed, May 31, 2023 at 8:55 AM Zhihong Wang
<wangzhihong.wzh@bytedance.com> wrote:
>
> Secondary would crash if it tries to handle mp requests before memory
> init, since globals such as eth_dev_shared_data_lock are not accessible
> to it at this moment.

Can you please resend this patch with a SoB ?
Thanks.
  
王志宏 June 8, 2023, 7:29 a.m. UTC | #6
Done. Thanks :)

From: "David Marchand"<david.marchand@redhat.com>
Date:  Thu, Jun 8, 2023, 04:21
Subject:  [External] Re: [PATCH] eal/linux: register mp hotplug callback
after memory init
To: "Zhihong Wang"<wangzhihong.wzh@bytedance.com>
Cc: <dev@dpdk.org>, <anatoly.burakov@intel.com>, <qi.z.zhang@intel.com>
Hello Zhihong,

On Wed, May 31, 2023 at 8:55 AM Zhihong Wang
<wangzhihong.wzh@bytedance.com> wrote:
>
> Secondary would crash if it tries to handle mp requests before memory
> init, since globals such as eth_dev_shared_data_lock are not accessible
> to it at this moment.

Can you please resend this patch with a SoB ?
Thanks.
  

Patch

diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index ae323cd492..a74d564597 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1058,12 +1058,6 @@  rte_eal_init(int argc, char **argv)
 		}
 	}
 
-	/* register multi-process action callbacks for hotplug */
-	if (eal_mp_dev_hotplug_init() < 0) {
-		rte_eal_init_alert("failed to register mp callback for hotplug");
-		return -1;
-	}
-
 	if (rte_bus_scan()) {
 		rte_eal_init_alert("Cannot scan the buses for devices");
 		rte_errno = ENODEV;
@@ -1221,6 +1215,12 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* register multi-process action callbacks for hotplug after memory init */
+	if (eal_mp_dev_hotplug_init() < 0) {
+		rte_eal_init_alert("failed to register mp callback for hotplug");
+		return -1;
+	}
+
 	if (rte_eal_tailqs_init() < 0) {
 		rte_eal_init_alert("Cannot init tail queues for objects");
 		rte_errno = EFAULT;