eal: fix alignment of RISCV xmm vector type

Message ID 1700083003-6641-1-git-send-email-roretzla@linux.microsoft.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Headers
Series eal: fix alignment of RISCV xmm vector type |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/intel-Testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-unit-arm64-testing success Testing PASS
ci/github-robot: build success github build: passed
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/intel-Functional success Functional PASS

Commit Message

Tyler Retzlaff Nov. 15, 2023, 9:16 p.m. UTC
  Fix the alignment for rte_xmm_t it should be 16 instead of 8 bytes.

Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
Cc: maz@semihalf.com
Cc: stable@dpdk.org
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
---
 lib/eal/riscv/include/rte_vect.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Morten Brørup Nov. 15, 2023, 9:31 p.m. UTC | #1
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Wednesday, 15 November 2023 22.17
> 
> Fix the alignment for rte_xmm_t it should be 16 instead of 8 bytes.
> 
> Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> Cc: maz@semihalf.com
> Cc: stable@dpdk.org
> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> ---

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

As mentioned in the other thread:

We need to urgently decide if this bug should live on in DPDK 23.11, or if the fix should be included although we are very late in the release process.

Stanislaw, what do you think?

Furthermore, I wonder if it can be backported to stable, and to what extent backporting it would break the ABI/API.
  
Stanislaw Kardach Nov. 15, 2023, 11:20 p.m. UTC | #2
On Wed, Nov 15, 2023 at 10:31 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > Sent: Wednesday, 15 November 2023 22.17
> >
> > Fix the alignment for rte_xmm_t it should be 16 instead of 8 bytes.
> >
> > Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> > Cc: maz@semihalf.com
> > Cc: stable@dpdk.org
> > Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> > ---
>
> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
>
> As mentioned in the other thread:
>
> We need to urgently decide if this bug should live on in DPDK 23.11, or if the fix should be included although we are very late in the release process.
>
> Stanislaw, what do you think?
Good catch! As for backporting I'm not sure of the urgency given that
our examples still use scalar instructions for handling xmm_t. The
question is whether there is a platform in use which has vector
extensions enabled and that utilizes DPDK. I'm not that sure of it
though I'd be happy to be proven wrong.

Reviewed-by: Stanislaw Kardach <kda@semihalf.com>

>
> Furthermore, I wonder if it can be backported to stable, and to what extent backporting it would break the ABI/API.
>


--
Best Regards,
Stanisław Kardach
  
Morten Brørup Nov. 16, 2023, 7:45 a.m. UTC | #3
> From: Stanisław Kardach [mailto:kda@semihalf.com]
> Sent: Thursday, 16 November 2023 00.21
> 
> On Wed, Nov 15, 2023 at 10:31 PM Morten Brørup
> <mb@smartsharesystems.com> wrote:
> >
> > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > Sent: Wednesday, 15 November 2023 22.17
> > >
> > > Fix the alignment for rte_xmm_t it should be 16 instead of 8 bytes.
> > >
> > > Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> > > Cc: maz@semihalf.com
> > > Cc: stable@dpdk.org
> > > Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> > > ---
> >
> > Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> >
> > As mentioned in the other thread:
> >
> > We need to urgently decide if this bug should live on in DPDK 23.11,
> or if the fix should be included although we are very late in the
> release process.
> >
> > Stanislaw, what do you think?
> Good catch! As for backporting I'm not sure of the urgency given that
> our examples still use scalar instructions for handling xmm_t. The
> question is whether there is a platform in use which has vector
> extensions enabled and that utilizes DPDK. I'm not that sure of it
> though I'd be happy to be proven wrong.

Can we extrapolate this, and also conclude that postponing this bugfix until the next ABI/API breaking release, DPDK 24.11, is not really going to hurt anyone?

Stanislaw, please confirm?

Bruce, I don't feel 100 % confident in making this postponement recommendation. Could you please provide a second opinion regarding the timing of fixing this bug? Or rather: do you have any strong arguments *against* postponing it to DPDK 24.11?

> 
> Reviewed-by: Stanislaw Kardach <kda@semihalf.com>
> 
> >
> > Furthermore, I wonder if it can be backported to stable, and to what
> extent backporting it would break the ABI/API.

Based on your input regarding the current deployment status, backporting the bugfix clearly isn't worth the effort. Excellent!

> >
> 
> 
> --
> Best Regards,
> Stanisław Kardach
  
Bruce Richardson Nov. 17, 2023, 10:54 a.m. UTC | #4
On Thu, Nov 16, 2023 at 08:45:35AM +0100, Morten Brørup wrote:
> > From: Stanisław Kardach [mailto:kda@semihalf.com]
> > Sent: Thursday, 16 November 2023 00.21
> > 
> > On Wed, Nov 15, 2023 at 10:31 PM Morten Brørup
> > <mb@smartsharesystems.com> wrote:
> > >
> > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > Sent: Wednesday, 15 November 2023 22.17
> > > >
> > > > Fix the alignment for rte_xmm_t it should be 16 instead of 8 bytes.
> > > >
> > > > Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> > > > Cc: maz@semihalf.com
> > > > Cc: stable@dpdk.org
> > > > Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> > > > ---
> > >
> > > Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> > >
> > > As mentioned in the other thread:
> > >
> > > We need to urgently decide if this bug should live on in DPDK 23.11,
> > or if the fix should be included although we are very late in the
> > release process.
> > >
> > > Stanislaw, what do you think?
> > Good catch! As for backporting I'm not sure of the urgency given that
> > our examples still use scalar instructions for handling xmm_t. The
> > question is whether there is a platform in use which has vector
> > extensions enabled and that utilizes DPDK. I'm not that sure of it
> > though I'd be happy to be proven wrong.
> 
> Can we extrapolate this, and also conclude that postponing this bugfix until the next ABI/API breaking release, DPDK 24.11, is not really going to hurt anyone?
> 
> Stanislaw, please confirm?
> 
> Bruce, I don't feel 100 % confident in making this postponement recommendation. Could you please provide a second opinion regarding the timing of fixing this bug? Or rather: do you have any strong arguments *against* postponing it to DPDK 24.11?
> 

Not sure I'm any better placed to make an argument either way! However, I
would very much tend to say that we should include this in 23.11, on the
basis that if it turns out to be important we can't backport it later
without affecting ABI. Right now, the code looks broken to me, and I'm also
struggling to find circumstances where increasing the alignment will
actually stop something from working. There could well be performance
implications of having extra padding, but things should still work, AFAIK.
On the other hand, if we don't include the fix, it is possible that a
system (possibly a future one) could break and segfault due to incorrect
vector code. I'd take a possible slowdown over a segfault!

Is my assessment correct, or perhaps I'm missing some detail.

/Bruce
  
Morten Brørup Nov. 17, 2023, 11:18 a.m. UTC | #5
+CC Thomas, this patch is ready to be applied to 23.11.

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Friday, 17 November 2023 11.55
> 
> On Thu, Nov 16, 2023 at 08:45:35AM +0100, Morten Brørup wrote:
> > > From: Stanisław Kardach [mailto:kda@semihalf.com]
> > > Sent: Thursday, 16 November 2023 00.21
> > >
> > > On Wed, Nov 15, 2023 at 10:31 PM Morten Brørup
> > > <mb@smartsharesystems.com> wrote:
> > > >
> > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > Sent: Wednesday, 15 November 2023 22.17
> > > > >
> > > > > Fix the alignment for rte_xmm_t it should be 16 instead of 8
> bytes.
> > > > >
> > > > > Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> > > > > Cc: maz@semihalf.com
> > > > > Cc: stable@dpdk.org
> > > > > Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> > > > > ---
> > > >
> > > > Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> > > >
> > > > As mentioned in the other thread:
> > > >
> > > > We need to urgently decide if this bug should live on in DPDK
> 23.11,
> > > or if the fix should be included although we are very late in the
> > > release process.
> > > >
> > > > Stanislaw, what do you think?
> > > Good catch! As for backporting I'm not sure of the urgency given
> that
> > > our examples still use scalar instructions for handling xmm_t. The
> > > question is whether there is a platform in use which has vector
> > > extensions enabled and that utilizes DPDK. I'm not that sure of it
> > > though I'd be happy to be proven wrong.
> >
> > Can we extrapolate this, and also conclude that postponing this
> bugfix until the next ABI/API breaking release, DPDK 24.11, is not
> really going to hurt anyone?
> >
> > Stanislaw, please confirm?
> >
> > Bruce, I don't feel 100 % confident in making this postponement
> recommendation. Could you please provide a second opinion regarding the
> timing of fixing this bug? Or rather: do you have any strong arguments
> *against* postponing it to DPDK 24.11?
> >
> 
> Not sure I'm any better placed to make an argument either way!

Bruce, I picked you because of your experience with vector code.

> However, I
> would very much tend to say that we should include this in 23.11, on
> the
> basis that if it turns out to be important we can't backport it later
> without affecting ABI. Right now, the code looks broken to me, and I'm
> also
> struggling to find circumstances where increasing the alignment will
> actually stop something from working. There could well be performance
> implications of having extra padding, but things should still work,
> AFAIK.
> On the other hand, if we don't include the fix, it is possible that a
> system (possibly a future one) could break and segfault due to
> incorrect
> vector code. I'd take a possible slowdown over a segfault!

The risk of slowdown isn't a factor for me at this stage.

I'm trying to balance the risk of fixing the bug vs. breaking something this late in the 23.11 release cycle.

You have a strong point that we also need to consider the bugfix in the context of the total lifetime of DPDK 23.11 in the wild.
With RISC-V's current traction, that certainly speaks in favor of including it in 23.11.

> 
> Is my assessment correct, or perhaps I'm missing some detail.

Thank you for your valuable feedback, Bruce.

I was just being overly cautious... After all, 23.11 is still at a stage where bug fixes are accepted!

New conclusion: Let's get it into 23.11.
  
Thomas Monjalon Nov. 18, 2023, 8:04 a.m. UTC | #6
17/11/2023 12:18, Morten Brørup:
> +CC Thomas, this patch is ready to be applied to 23.11.
> 
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 17 November 2023 11.55
> > 
> > On Thu, Nov 16, 2023 at 08:45:35AM +0100, Morten Brørup wrote:
> > > > From: Stanisław Kardach [mailto:kda@semihalf.com]
> > > > Sent: Thursday, 16 November 2023 00.21
> > > >
> > > > On Wed, Nov 15, 2023 at 10:31 PM Morten Brørup
> > > > <mb@smartsharesystems.com> wrote:
> > > > >
> > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > Sent: Wednesday, 15 November 2023 22.17
> > > > > >
> > > > > > Fix the alignment for rte_xmm_t it should be 16 instead of 8
> > bytes.
> > > > > >
> > > > > > Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> > > > > > Cc: maz@semihalf.com
> > > > > > Cc: stable@dpdk.org
> > > > > > Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> > > > > > ---
> > > > >
> > > > > Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
> > > > >
> > > > > As mentioned in the other thread:
> > > > >
> > > > > We need to urgently decide if this bug should live on in DPDK
> > 23.11,
> > > > or if the fix should be included although we are very late in the
> > > > release process.
> > > > >
> > > > > Stanislaw, what do you think?
> > > > Good catch! As for backporting I'm not sure of the urgency given
> > that
> > > > our examples still use scalar instructions for handling xmm_t. The
> > > > question is whether there is a platform in use which has vector
> > > > extensions enabled and that utilizes DPDK. I'm not that sure of it
> > > > though I'd be happy to be proven wrong.
> > >
> > > Can we extrapolate this, and also conclude that postponing this
> > bugfix until the next ABI/API breaking release, DPDK 24.11, is not
> > really going to hurt anyone?
> > >
> > > Stanislaw, please confirm?
> > >
> > > Bruce, I don't feel 100 % confident in making this postponement
> > recommendation. Could you please provide a second opinion regarding the
> > timing of fixing this bug? Or rather: do you have any strong arguments
> > *against* postponing it to DPDK 24.11?
> > >
> > 
> > Not sure I'm any better placed to make an argument either way!
> 
> Bruce, I picked you because of your experience with vector code.
> 
> > However, I
> > would very much tend to say that we should include this in 23.11, on
> > the
> > basis that if it turns out to be important we can't backport it later
> > without affecting ABI. Right now, the code looks broken to me, and I'm
> > also
> > struggling to find circumstances where increasing the alignment will
> > actually stop something from working. There could well be performance
> > implications of having extra padding, but things should still work,
> > AFAIK.
> > On the other hand, if we don't include the fix, it is possible that a
> > system (possibly a future one) could break and segfault due to
> > incorrect
> > vector code. I'd take a possible slowdown over a segfault!
> 
> The risk of slowdown isn't a factor for me at this stage.
> 
> I'm trying to balance the risk of fixing the bug vs. breaking something this late in the 23.11 release cycle.
> 
> You have a strong point that we also need to consider the bugfix in the context of the total lifetime of DPDK 23.11 in the wild.
> With RISC-V's current traction, that certainly speaks in favor of including it in 23.11.
> 
> > 
> > Is my assessment correct, or perhaps I'm missing some detail.
> 
> Thank you for your valuable feedback, Bruce.
> 
> I was just being overly cautious... After all, 23.11 is still at a stage where bug fixes are accepted!
> 
> New conclusion: Let's get it into 23.11.

Applied, thanks.

I've kept CC:stable as it is a real fix.
I don't see a problem in breaking API/ABI for RISC-V which is quite new.
Anyway stable maintainers will choose what to do.
  

Patch

diff --git a/lib/eal/riscv/include/rte_vect.h b/lib/eal/riscv/include/rte_vect.h
index 2f97f43..da9092a 100644
--- a/lib/eal/riscv/include/rte_vect.h
+++ b/lib/eal/riscv/include/rte_vect.h
@@ -29,7 +29,7 @@ 
 	uint32_t	u32[XMM_SIZE / sizeof(uint32_t)];
 	uint64_t	u64[XMM_SIZE / sizeof(uint64_t)];
 	double		pd[XMM_SIZE / sizeof(double)];
-} __rte_aligned(8) rte_xmm_t;
+} __rte_aligned(16) rte_xmm_t;
 
 static inline xmm_t
 vect_load_128(void *p)