[v2,2/2] crypto/ipsec_mb: fix QP release in secondary

Message ID 20250407052532.1913-2-ming.1.yang@nokia-sbell.com (mailing list archive)
State New
Delegated to: David Marchand
Headers
Series [v2,1/2] eal: prevent socket closure before MP sync |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-mellanox-Functional success Functional Testing PASS
ci/iol-marvell-Functional success Functional Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-compile-amd64-testing success Testing PASS

Commit Message

Yang Ming April 7, 2025, 5:25 a.m. UTC
From: myang <ming.1.yang@nokia-sbell.com>

When a secondary process tries to release a queue pair (QP) that
does not belong to it, error logs occur:
CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release
qp_id=0
EAL: Message data is too long
EAL: Fail to handle message: ipsec_mb_mp_msg
EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket:
ipsec_mb_mp_msg

This patch ensures that a secondary process only frees a QP if
it actually owns it, preventing conflicts and resolving the
issue.

Signed-off-by: myang <ming.1.yang@nokia-sbell.com>
---
 drivers/crypto/ipsec_mb/ipsec_mb_ops.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
  

Comments

Yang Ming April 24, 2025, 2:26 p.m. UTC | #1
Hi,

On 2025/4/7 13:25, Yang Ming wrote:
> From: myang <ming.1.yang@nokia-sbell.com>
>
> When a secondary process tries to release a queue pair (QP) that
> does not belong to it, error logs occur:
> CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release
> qp_id=0
> EAL: Message data is too long
> EAL: Fail to handle message: ipsec_mb_mp_msg
> EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket:
> ipsec_mb_mp_msg
>
> This patch ensures that a secondary process only frees a QP if
> it actually owns it, preventing conflicts and resolving the
> issue.
>
> Signed-off-by: myang <ming.1.yang@nokia-sbell.com>
> ---
>   drivers/crypto/ipsec_mb/ipsec_mb_ops.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> index 910efb1a97..50ee140ccd 100644
> --- a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> +++ b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> @@ -138,6 +138,7 @@ int
>   ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
>   {
>   	struct ipsec_mb_qp *qp = dev->data->queue_pairs[qp_id];
> +	uint16_t process_id = (uint16_t)getpid();
>   
>   	if (!qp)
>   		return 0;
> @@ -152,8 +153,10 @@ ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
>   		rte_free(qp);
>   		dev->data->queue_pairs[qp_id] = NULL;
>   	} else { /* secondary process */
> -		return ipsec_mb_secondary_qp_op(dev->data->dev_id, qp_id,
> -					NULL, 0, RTE_IPSEC_MB_MP_REQ_QP_FREE);
> +		if (qp->qp_used_by_pid == process_id)
> +			return ipsec_mb_secondary_qp_op(dev->data->dev_id,
> +						qp_id, NULL, 0,
> +						RTE_IPSEC_MB_MP_REQ_QP_FREE);
>   	}
>   	return 0;
>   }

Hi Experts,

Is there any chance to review and accept this patch?

Brs,

Yang Ming
  
David Marchand May 5, 2025, 3:31 p.m. UTC | #2
Hello,

On Thu, Apr 24, 2025 at 4:26 PM Yang Ming <ming.1.yang@nokia-sbell.com> wrote:
>
> Hi,
>
> On 2025/4/7 13:25, Yang Ming wrote:
> > From: myang <ming.1.yang@nokia-sbell.com>
> >
> > When a secondary process tries to release a queue pair (QP) that
> > does not belong to it, error logs occur:
> > CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release
> > qp_id=0
> > EAL: Message data is too long
> > EAL: Fail to handle message: ipsec_mb_mp_msg
> > EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket:
> > ipsec_mb_mp_msg
> >
> > This patch ensures that a secondary process only frees a QP if
> > it actually owns it, preventing conflicts and resolving the
> > issue.
> >
> > Signed-off-by: myang <ming.1.yang@nokia-sbell.com>
> > ---
> >   drivers/crypto/ipsec_mb/ipsec_mb_ops.c | 7 +++++--
> >   1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> > index 910efb1a97..50ee140ccd 100644
> > --- a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> > +++ b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> > @@ -138,6 +138,7 @@ int
> >   ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
> >   {
> >       struct ipsec_mb_qp *qp = dev->data->queue_pairs[qp_id];
> > +     uint16_t process_id = (uint16_t)getpid();
> >
> >       if (!qp)
> >               return 0;
> > @@ -152,8 +153,10 @@ ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
> >               rte_free(qp);
> >               dev->data->queue_pairs[qp_id] = NULL;
> >       } else { /* secondary process */
> > -             return ipsec_mb_secondary_qp_op(dev->data->dev_id, qp_id,
> > -                                     NULL, 0, RTE_IPSEC_MB_MP_REQ_QP_FREE);
> > +             if (qp->qp_used_by_pid == process_id)
> > +                     return ipsec_mb_secondary_qp_op(dev->data->dev_id,
> > +                                             qp_id, NULL, 0,
> > +                                             RTE_IPSEC_MB_MP_REQ_QP_FREE);
> >       }
> >       return 0;
> >   }
>
> Hi Experts,
>
> Is there any chance to review and accept this patch?

Don't forget to Cc maintainers.
  
Ji, Kai May 7, 2025, 3:25 p.m. UTC | #3
Hi Yangming,

PID check is implemented here:
https://github.com/DPDK/dpdk/blob/75f179ebe347b6098cf3af26d3d3b7168fe3fe24/drivers/crypto/ipsec_mb/ipsec_mb_ops.c#L376

Can you share the steps to re-produce the error ?

Regards

Kai
  
Moses Young May 12, 2025, 10:10 a.m. UTC | #4
On 5/7/2025 11:25 PM, Ji, Kai wrote:

> Hi Yangming,
>
> PID check is implemented here:
> https://github.com/DPDK/dpdk/blob/75f179ebe347b6098cf3af26d3d3b7168fe3fe24/drivers/crypto/ipsec_mb/ipsec_mb_ops.c#L376 
> <https://github.com/DPDK/dpdk/blob/75f179ebe347b6098cf3af26d3d3b7168fe3fe24/drivers/crypto/ipsec_mb/ipsec_mb_ops.c#L376>
>
> Can you share the steps to re-produce the error ?
>
> Regards
>
> Kai
>
> ------------------------------------------------------------------------
> *From:* Yang Ming
> *Sent:* Thursday, April 24, 2025 15:26
> *To:* dev@dpdk.org
> *Subject:* Re: [PATCH v2 2/2] crypto/ipsec_mb: fix QP release in 
> secondary
>
> Hi,
>
> On 2025/4/7 13:25, Yang Ming wrote:
> > From: myang <ming.1.yang@nokia-sbell.com>
> >
> > When a secondary process tries to release a queue pair (QP) that
> > does not belong to it, error logs occur:
> > CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release
> > qp_id=0
> > EAL: Message data is too long
> > EAL: Fail to handle message: ipsec_mb_mp_msg
> > EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket:
> > ipsec_mb_mp_msg
> >
> > This patch ensures that a secondary process only frees a QP if
> > it actually owns it, preventing conflicts and resolving the
> > issue.
> >
> > Signed-off-by: myang <ming.1.yang@nokia-sbell.com>
> > ---
> >   drivers/crypto/ipsec_mb/ipsec_mb_ops.c | 7 +++++--
> >   1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c 
> b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> > index 910efb1a97..50ee140ccd 100644
> > --- a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> > +++ b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
> > @@ -138,6 +138,7 @@ int
> >   ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
> >   {
> >        struct ipsec_mb_qp *qp = dev->data->queue_pairs[qp_id];
> > +     uint16_t process_id = (uint16_t)getpid();
> >
> >        if (!qp)
> >                return 0;
> > @@ -152,8 +153,10 @@ ipsec_mb_qp_release(struct rte_cryptodev *dev, 
> uint16_t qp_id)
> >                rte_free(qp);
> >                dev->data->queue_pairs[qp_id] = NULL;
> >        } else { /* secondary process */
> > -             return ipsec_mb_secondary_qp_op(dev->data->dev_id, qp_id,
> > -                                     NULL, 0, 
> RTE_IPSEC_MB_MP_REQ_QP_FREE);
> > +             if (qp->qp_used_by_pid == process_id)
> > +                     return ipsec_mb_secondary_qp_op(dev->data->dev_id,
> > +                                             qp_id, NULL, 0,
> > + RTE_IPSEC_MB_MP_REQ_QP_FREE);
> >        }
> >        return 0;
> >   }
>
> Hi Experts,
>
> Is there any chance to review and accept this patch?
>
> Brs,
>
> Yang Ming
>
Hi,

David. Thanks for your feedback. I will Cc maintainers in next version.

Kai, Thanks for your feedback. Here's our test steps after applying the 
previous patch called "eal: prevent socket closure before MP sync" in 
this serious:
1. Start the primary process: Run the DPDK primary process with the 
IPsec MB crypto device.
2. Launch the secondary process: Start a DPDK secondary process using 
the same device parameters.
3. Exit the secondary normally.

On secondary exit, error logs show below as my commit log's description:
CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release qp_id=0
EAL: Message data is too long
EAL: Fail to handle message: ipsec_mb_mp_msg
EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket: 
ipsec_mb_mp_msg

This message corresponds exactly to the PID check in the code you 
referenced:
if (qp->qp_used_by_pid != req_param->process_id) {
     CDEV_LOG_ERR("Unable to release qp_id=%d", qp_id);
     goto out;
}

In our view, this log indicates an abnormal condition: a secondary 
process should be able to release its own QPs without triggering an 
error during normal shutdown.

Thanks for your help.

Best,
Yang Ming
  
Ji, Kai May 19, 2025, 10:46 a.m. UTC | #5
Hi Yangming,

How did you configure the queue pairs differently between the primary and secondary processes?

The application must call rte_cryptodev_queue_pair_setup() with a unique qp_id for each process. This implies that each process should receive distinct queue pair configurations at runtime. From what I can tell, it’s likely that the secondary process is still using the same queue pair as the primary process.

Regards

Kai
  
Moses Young May 30, 2025, 4:14 p.m. UTC | #6
On 5/19/2025 6:46 PM, Ji, Kai wrote:

> Hi Yangming,
>
> How did you configure the queue pairs differently between the primary 
> and secondary processes?
>
> The application must call rte_cryptodev_queue_pair_setup() with a 
> unique qp_id for each process. This implies that each process should 
> receive distinct queue pair configurations at runtime. From what I can 
> tell, it’s likely that the secondary process is still using the same 
> queue pair as the primary process.
>
> Regards
>
> Kai
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Moses Young <mosesyyoung@gmail.com>
> *Sent:* Monday, May 12, 2025 11:10
> *To:* Ji, Kai <kai.ji@intel.com>; Yang Ming 
> <ming.1.yang@nokia-sbell.com>; dev@dpdk.org <dev@dpdk.org>
> *Subject:* Re: [PATCH v2 2/2] crypto/ipsec_mb: fix QP release in 
> secondary
>
> On 5/7/2025 11:25 PM, Ji, Kai wrote:
>
>     Hi Yangming,
>
>     PID check is implemented here:
>     https://github.com/DPDK/dpdk/blob/75f179ebe347b6098cf3af26d3d3b7168fe3fe24/drivers/crypto/ipsec_mb/ipsec_mb_ops.c#L376
>     <https://github.com/DPDK/dpdk/blob/75f179ebe347b6098cf3af26d3d3b7168fe3fe24/drivers/crypto/ipsec_mb/ipsec_mb_ops.c#L376>
>
>     Can you share the steps to re-produce the error ?
>
>     Regards
>
>     Kai
>
>     ------------------------------------------------------------------------
>     *From:* Yang Ming
>     *Sent:* Thursday, April 24, 2025 15:26
>     *To:* dev@dpdk.org <mailto:dev@dpdk.org>
>     *Subject:* Re: [PATCH v2 2/2] crypto/ipsec_mb: fix QP release in
>     secondary
>
>     Hi,
>
>     On 2025/4/7 13:25, Yang Ming wrote:
>     > From: myang <ming.1.yang@nokia-sbell.com>
>     <mailto:ming.1.yang@nokia-sbell.com>
>     >
>     > When a secondary process tries to release a queue pair (QP) that
>     > does not belong to it, error logs occur:
>     > CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release
>     > qp_id=0
>     > EAL: Message data is too long
>     > EAL: Fail to handle message: ipsec_mb_mp_msg
>     > EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket:
>     > ipsec_mb_mp_msg
>     >
>     > This patch ensures that a secondary process only frees a QP if
>     > it actually owns it, preventing conflicts and resolving the
>     > issue.
>     >
>     > Signed-off-by: myang <ming.1.yang@nokia-sbell.com>
>     <mailto:ming.1.yang@nokia-sbell.com>
>     > ---
>     >   drivers/crypto/ipsec_mb/ipsec_mb_ops.c | 7 +++++--
>     >   1 file changed, 5 insertions(+), 2 deletions(-)
>     >
>     > diff --git a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
>     b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
>     > index 910efb1a97..50ee140ccd 100644
>     > --- a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
>     > +++ b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
>     > @@ -138,6 +138,7 @@ int
>     >   ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
>     >   {
>     >        struct ipsec_mb_qp *qp = dev->data->queue_pairs[qp_id];
>     > +     uint16_t process_id = (uint16_t)getpid();
>     >
>     >        if (!qp)
>     >                return 0;
>     > @@ -152,8 +153,10 @@ ipsec_mb_qp_release(struct rte_cryptodev
>     *dev, uint16_t qp_id)
>     >                rte_free(qp);
>     >                dev->data->queue_pairs[qp_id] = NULL;
>     >        } else { /* secondary process */
>     > -             return ipsec_mb_secondary_qp_op(dev->data->dev_id,
>     qp_id,
>     > -                                     NULL, 0,
>     RTE_IPSEC_MB_MP_REQ_QP_FREE);
>     > +             if (qp->qp_used_by_pid == process_id)
>     > +                     return
>     ipsec_mb_secondary_qp_op(dev->data->dev_id,
>     > +                                             qp_id, NULL, 0,
>     > + RTE_IPSEC_MB_MP_REQ_QP_FREE);
>     >        }
>     >        return 0;
>     >   }
>
>     Hi Experts,
>
>     Is there any chance to review and accept this patch?
>
>     Brs,
>
>     Yang Ming
>
> Hi,
>
> David. Thanks for your feedback. I will Cc maintainers in next version.
>
> Kai, Thanks for your feedback. Here's our test steps after applying 
> the previous patch called "eal: prevent socket closure before MP sync" 
> in this serious:
> 1. Start the primary process: Run the DPDK primary process with the 
> IPsec MB crypto device.
> 2. Launch the secondary process: Start a DPDK secondary process using 
> the same device parameters.
> 3. Exit the secondary normally.
>
> On secondary exit, error logs show below as my commit log's description:
> CRYPTODEV: ipsec_mb_ipc_request() line 373: Unable to release qp_id=0
> EAL: Message data is too long
> EAL: Fail to handle message: ipsec_mb_mp_msg
> EAL: Fail to recv reply for request /tmp/dpdk/l2hi/mp_socket: 
> ipsec_mb_mp_msg
>
> This message corresponds exactly to the PID check in the code you 
> referenced:
> if (qp->qp_used_by_pid != req_param->process_id) {
>     CDEV_LOG_ERR("Unable to release qp_id=%d", qp_id);
>     goto out;
> }
>
> In our view, this log indicates an abnormal condition: a secondary 
> process should be able to release its own QPs without triggering an 
> error during normal shutdown.
>
> Thanks for your help.
>
> Best,
> Yang Ming


Hi Kai,

The queue pairs is shared between the primary and secondary processes. 
Even if each process calls rte_cryptodev_queue_pair_setup() with a 
unique qp_id, the primary and secondary still end up using the same 
shared queue pair structure, because cryptodev->data is shared between them.

 From the code path, cryptodev->data is allocated in the primary via 
rte_cryptodev_data_alloc() (inside 
ipsec_mb_create-->rte_cryptodev_pmd_create-->rte_cryptodev_pmd_allocate-->rte_cryptodev_data_alloc). 
This memory is placed in a shared memzone (rte_cryptodev_data_%u), so 
both primary and secondary processes reference the same cryptodev->data, 
including nb_queue_pairs and queue_pairs[].
As a result, when the secondary process exits, ipsec_mb_remove() is 
called (inside 
rte_eal_cleanup-->eal_bus_cleanup-->vdev_cleanup-->rte_vdev_driver-->ipsec_mb_remove-->ipsec_mb_qp_release-->ipsec_mb_secondary_qp_op) 
and it loops through all queue pairs using:
for (qp_id = 0; qp_id < cryptodev->data->nb_queue_pairs; qp_id++)
     ipsec_mb_qp_release(cryptodev, qp_id);

This causes the secondary to attempt releasing queue pairs it doesn't 
own, triggering the error logs mentioned in previous mail.

Best regards,
Yang Ming
  

Patch

diff --git a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
index 910efb1a97..50ee140ccd 100644
--- a/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
+++ b/drivers/crypto/ipsec_mb/ipsec_mb_ops.c
@@ -138,6 +138,7 @@  int
 ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
 {
 	struct ipsec_mb_qp *qp = dev->data->queue_pairs[qp_id];
+	uint16_t process_id = (uint16_t)getpid();
 
 	if (!qp)
 		return 0;
@@ -152,8 +153,10 @@  ipsec_mb_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
 		rte_free(qp);
 		dev->data->queue_pairs[qp_id] = NULL;
 	} else { /* secondary process */
-		return ipsec_mb_secondary_qp_op(dev->data->dev_id, qp_id,
-					NULL, 0, RTE_IPSEC_MB_MP_REQ_QP_FREE);
+		if (qp->qp_used_by_pid == process_id)
+			return ipsec_mb_secondary_qp_op(dev->data->dev_id,
+						qp_id, NULL, 0,
+						RTE_IPSEC_MB_MP_REQ_QP_FREE);
 	}
 	return 0;
 }