[3/3] acl: adjust the tests

Message ID 20190408182420.4398-4-aconole@redhat.com (mailing list archive)
State Changes Requested, archived
Delegated to: Thomas Monjalon
Headers
Series librte_acl: fixes related to testing with the meson build |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Aaron Conole April 8, 2019, 6:24 p.m. UTC
  This makes the tests pass, and also ensures that on platforms where the
testing is supported, we can properly test the implementation specific
code.  One edge case is when we run on x86_64 systems that don't support
AVX2, but where the compiler can generate such instructions.  That could
be an enhancement in the future, but for now at least the tests will
pass.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 app/test/test_acl.c             | 62 +++++++++++++--------------------
 lib/librte_acl/Makefile         |  1 +
 lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
 lib/librte_acl/meson.build      |  4 +--
 4 files changed, 73 insertions(+), 40 deletions(-)
 create mode 100644 lib/librte_acl/acl_run_notsup.c
  

Comments

Ananyev, Konstantin April 9, 2019, 8:41 a.m. UTC | #1
Hi Aaron,

> 
> This makes the tests pass, and also ensures that on platforms where the
> testing is supported, we can properly test the implementation specific
> code.  One edge case is when we run on x86_64 systems that don't support
> AVX2, but where the compiler can generate such instructions.  That could
> be an enhancement in the future, but for now at least the tests will
> pass.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  app/test/test_acl.c             | 62 +++++++++++++--------------------
>  lib/librte_acl/Makefile         |  1 +
>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
>  lib/librte_acl/meson.build      |  4 +--
>  4 files changed, 73 insertions(+), 40 deletions(-)
>  create mode 100644 lib/librte_acl/acl_run_notsup.c
> 
> diff --git a/app/test/test_acl.c b/app/test/test_acl.c
> index b1f75d1bc..c44faa251 100644
> --- a/app/test/test_acl.c
> +++ b/app/test/test_acl.c
> @@ -408,6 +408,9 @@ test_classify(void)
>  		return -1;
>  	}
> 
> +	/* Always use the scalar testing for now. */
> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> +
>  	ret = 0;
>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> 
> @@ -547,6 +550,7 @@ test_build_ports_range(void)
>  	for (i = 0; i != RTE_DIM(test_data); i++)
>  		data[i] = (uint8_t *)&test_data[i];
> 
> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
>  		rte_acl_reset(acx);
>  		ret = test_classify_buid(acx, test_rules, i + 1);
> @@ -911,6 +915,8 @@ test_convert_rules(const char *desc,
>  		return -1;
>  	}
> 
> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> +

As I understand here and above, on x86 you replaced default algo (SSE, AVX2)
with scalar one, right?
That looks like reduction of test coverage for x86.
Konstantin
  
Aaron Conole April 9, 2019, 1:01 p.m. UTC | #2
"Ananyev, Konstantin" <konstantin.ananyev@intel.com> writes:

> Hi Aaron,
>
>> 
>> This makes the tests pass, and also ensures that on platforms where the
>> testing is supported, we can properly test the implementation specific
>> code.  One edge case is when we run on x86_64 systems that don't support
>> AVX2, but where the compiler can generate such instructions.  That could
>> be an enhancement in the future, but for now at least the tests will
>> pass.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  app/test/test_acl.c             | 62 +++++++++++++--------------------
>>  lib/librte_acl/Makefile         |  1 +
>>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
>>  lib/librte_acl/meson.build      |  4 +--
>>  4 files changed, 73 insertions(+), 40 deletions(-)
>>  create mode 100644 lib/librte_acl/acl_run_notsup.c
>> 
>> diff --git a/app/test/test_acl.c b/app/test/test_acl.c
>> index b1f75d1bc..c44faa251 100644
>> --- a/app/test/test_acl.c
>> +++ b/app/test/test_acl.c
>> @@ -408,6 +408,9 @@ test_classify(void)
>>  		return -1;
>>  	}
>> 
>> +	/* Always use the scalar testing for now. */
>> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>> +
>>  	ret = 0;
>>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
>> 
>> @@ -547,6 +550,7 @@ test_build_ports_range(void)
>>  	for (i = 0; i != RTE_DIM(test_data); i++)
>>  		data[i] = (uint8_t *)&test_data[i];
>> 
>> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
>>  		rte_acl_reset(acx);
>>  		ret = test_classify_buid(acx, test_rules, i + 1);
>> @@ -911,6 +915,8 @@ test_convert_rules(const char *desc,
>>  		return -1;
>>  	}
>> 
>> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>> +
>
> As I understand here and above, on x86 you replaced default algo (SSE, AVX2)
> with scalar one, right?
> That looks like reduction of test coverage for x86.

In one way, you're right.  However, the tests weren't testing what they
purported anyway.  Actually, it's just a shift I think (previously, it
would have tested the AVX2 but I don't see AVX2 having a fallback into
the SSE code - unlike the SSE code falling back into scalar).

The tests were failing for a number of reasons when built with meson,
and on the systems I tested with (including tests under travis).

1. Any meson build that I observed didn't correctly fill anything but
   the scalar variable.  I had to remove the -ENOTSUP definitions in the
   rte_acl.c file (forgot to git add it), and make the second version.

2. The tests never selected scalar, or nor sse implementations.  Rather,
   they selected only what the currently running platform provided.
   This meant that I was always seeing the AVX2 code executed, but never
   SSE nor scalar (but for one case) - at least as far as I could see.

There were others - I iterated on these for a few days.

This is why I changed a block to run through each implementation in one
of the versions.

HOWEVER, it's still deficient.

We need to fully cover all the cases.  BUT it's better than the failure
that currently happens on almost every system I've tried - including
shipping the build to travis to run.  So, I figured running > failing with
almost no reason why.  And looking into the failure revealed that the
meson build didn't even include the platform specific builds.

During my rework, I can change the test cases to iterate as in other
test cases.  It will extend the time.  And I don't know how to resolve
the case where we run on a system that doesn't support AVX2 but we have
a compiler that supports AVX2 (since that case will fail - but we
shouldn't even attempt it).

WDYT?

> Konstantin
  
Ananyev, Konstantin April 9, 2019, 4:03 p.m. UTC | #3
> > Hi Aaron,
> >
> >>
> >> This makes the tests pass, and also ensures that on platforms where the
> >> testing is supported, we can properly test the implementation specific
> >> code.  One edge case is when we run on x86_64 systems that don't support
> >> AVX2, but where the compiler can generate such instructions.  That could
> >> be an enhancement in the future, but for now at least the tests will
> >> pass.
> >>
> >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> >> ---
> >>  app/test/test_acl.c             | 62 +++++++++++++--------------------
> >>  lib/librte_acl/Makefile         |  1 +
> >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> >>  lib/librte_acl/meson.build      |  4 +--
> >>  4 files changed, 73 insertions(+), 40 deletions(-)
> >>  create mode 100644 lib/librte_acl/acl_run_notsup.c
> >>
> >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c
> >> index b1f75d1bc..c44faa251 100644
> >> --- a/app/test/test_acl.c
> >> +++ b/app/test/test_acl.c
> >> @@ -408,6 +408,9 @@ test_classify(void)
> >>  		return -1;
> >>  	}
> >>
> >> +	/* Always use the scalar testing for now. */
> >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> >> +
> >>  	ret = 0;
> >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> >>
> >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> >>  		data[i] = (uint8_t *)&test_data[i];
> >>
> >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> >>  		rte_acl_reset(acx);
> >>  		ret = test_classify_buid(acx, test_rules, i + 1);
> >> @@ -911,6 +915,8 @@ test_convert_rules(const char *desc,
> >>  		return -1;
> >>  	}
> >>
> >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> >> +
> >
> > As I understand here and above, on x86 you replaced default algo (SSE, AVX2)
> > with scalar one, right?
> > That looks like reduction of test coverage for x86.
> 
> In one way, you're right.  However, the tests weren't testing what they
> purported anyway.

Could you explain a bit more here?
What I am seeing: tests were running bot sse(or avx2) and scalar classify() method.
Now they always running scalar only.
To me it definitely looks like reduction in coverage.

>  Actually, it's just a shift I think (previously, it
> would have tested the AVX2 but I don't see AVX2 having a fallback into
> the SSE code - unlike the SSE code falling back into scalar).

Not sure I understand you here.
What fallback for AVX2 you expect that you think is missing?

> 
> The tests were failing for a number of reasons when built with meson,

Ok, but with legacy build system (make) on x86 all tests passes, right?
So the problem is in new build system, not in the test itself.
Why we should compromise our test coverage to make it work with
new tools? 
That just hides the problem without fixing it.
Instead I think the build system needs to be fixed.
Looking at it a bit closely, for .so meson+ninja generates code with
correct version of the function:
 
nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep acl_classify_sse
000000000000fa50 t rte_acl_classify_sse

So for 'meson -Ddefault_library=shared'
acl_autotest passes without the problem. 

Though for static lib we have both:
nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep acl_classify_sse
0000000000000000 W rte_acl_classify_sse
0000000000004880 T rte_acl_classify_sse

And then linker pickups the wrong one:
nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep acl_classify_sse
00000000005f6100 W rte_acl_classify_sse

While for make:
$ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep acl_classify_sse 
0000000000000000 W rte_acl_classify_sse
0000000000004880 T rte_acl_classify_sse
$ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
0000000000240440 T rte_acl_classify_sse

Linker pickups the right one.


> and on the systems I tested with (including tests under travis).
> 
> 1. Any meson build that I observed didn't correctly fill anything but
>    the scalar variable.  I had to remove the -ENOTSUP definitions in the
>    rte_acl.c file (forgot to git add it), and make the second version.
> 
> 2. The tests never selected scalar, or nor sse implementations. 

As I can see test_classify_run() *always* run both default method (sse/avx2 on x86)
and then scalar one. 

>  Rather,
>    they selected only what the currently running platform provided.
>    This meant that I was always seeing the AVX2 code executed, but never
>    SSE nor scalar (but for one case) - at least as far as I could see.
> 
> There were others - I iterated on these for a few days.
> 
> This is why I changed a block to run through each implementation in one
> of the versions.
> 
> HOWEVER, it's still deficient.
> 
> We need to fully cover all the cases.  BUT it's better than the failure
> that currently happens on almost every system I've tried - including
> shipping the build to travis to run.  So, I figured running > failing with
> almost no reason why.  And looking into the failure revealed that the
> meson build didn't even include the platform specific builds.
> 
> During my rework, I can change the test cases to iterate as in other
> test cases.  It will extend the time.  And I don't know how to resolve
> the case where we run on a system that doesn't support AVX2 but we have
> a compiler that supports AVX2 (since that case will fail - but we
> shouldn't even attempt it).

I don't see why that should happen.
At rte_acl_init() we do check does that machine supports AVX2(SSE, NEON) 
instructions or not.
Are you saying under some circumstances rte_acl_init() are not working properly,
or not get invoked?

Konstantin
  
Ananyev, Konstantin April 9, 2019, 5:04 p.m. UTC | #4
> 
> > > Hi Aaron,
> > >
> > >>
> > >> This makes the tests pass, and also ensures that on platforms where the
> > >> testing is supported, we can properly test the implementation specific
> > >> code.  One edge case is when we run on x86_64 systems that don't support
> > >> AVX2, but where the compiler can generate such instructions.  That could
> > >> be an enhancement in the future, but for now at least the tests will
> > >> pass.
> > >>
> > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> > >> ---
> > >>  app/test/test_acl.c             | 62 +++++++++++++--------------------
> > >>  lib/librte_acl/Makefile         |  1 +
> > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> > >>  lib/librte_acl/meson.build      |  4 +--
> > >>  4 files changed, 73 insertions(+), 40 deletions(-)
> > >>  create mode 100644 lib/librte_acl/acl_run_notsup.c
> > >>
> > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c
> > >> index b1f75d1bc..c44faa251 100644
> > >> --- a/app/test/test_acl.c
> > >> +++ b/app/test/test_acl.c
> > >> @@ -408,6 +408,9 @@ test_classify(void)
> > >>  		return -1;
> > >>  	}
> > >>
> > >> +	/* Always use the scalar testing for now. */
> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > >> +
> > >>  	ret = 0;
> > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> > >>
> > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> > >>  		data[i] = (uint8_t *)&test_data[i];
> > >>
> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> > >>  		rte_acl_reset(acx);
> > >>  		ret = test_classify_buid(acx, test_rules, i + 1);
> > >> @@ -911,6 +915,8 @@ test_convert_rules(const char *desc,
> > >>  		return -1;
> > >>  	}
> > >>
> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > >> +
> > >
> > > As I understand here and above, on x86 you replaced default algo (SSE, AVX2)
> > > with scalar one, right?
> > > That looks like reduction of test coverage for x86.
> >
> > In one way, you're right.  However, the tests weren't testing what they
> > purported anyway.
> 
> Could you explain a bit more here?
> What I am seeing: tests were running bot sse(or avx2) and scalar classify() method.
> Now they always running scalar only.
> To me it definitely looks like reduction in coverage.
> 
> >  Actually, it's just a shift I think (previously, it
> > would have tested the AVX2 but I don't see AVX2 having a fallback into
> > the SSE code - unlike the SSE code falling back into scalar).
> 
> Not sure I understand you here.
> What fallback for AVX2 you expect that you think is missing?
> 
> >
> > The tests were failing for a number of reasons when built with meson,
> 
> Ok, but with legacy build system (make) on x86 all tests passes, right?
> So the problem is in new build system, not in the test itself.
> Why we should compromise our test coverage to make it work with
> new tools?
> That just hides the problem without fixing it.
> Instead I think the build system needs to be fixed.
> Looking at it a bit closely, for .so meson+ninja generates code with
> correct version of the function:
> 
> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep acl_classify_sse
> 000000000000fa50 t rte_acl_classify_sse
> 
> So for 'meson -Ddefault_library=shared'
> acl_autotest passes without the problem.
> 
> Though for static lib we have both:
> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep acl_classify_sse
> 0000000000000000 W rte_acl_classify_sse
> 0000000000004880 T rte_acl_classify_sse
> 
> And then linker pickups the wrong one:
> nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep acl_classify_sse
> 00000000005f6100 W rte_acl_classify_sse
> 
> While for make:
> $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep acl_classify_sse
> 0000000000000000 W rte_acl_classify_sse
> 0000000000004880 T rte_acl_classify_sse
> $ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
> 0000000000240440 T rte_acl_classify_sse
> 
> Linker pickups the right one.

And the changes below make linker to pick-up the proper version of the function
and make acl_autotest to pass for static build too.

diff --git a/app/test/meson.build b/app/test/meson.build
index 867cc5863..4364be932 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -328,6 +328,7 @@ test_dep_objs += cc.find_library('execinfo', required: false)
 link_libs = []
 if get_option('default_library') == 'static'
        link_libs = dpdk_drivers
+       link_libs += dpdk_static_libraries
 endif

 if get_option('tests')
diff --git a/meson.build b/meson.build
index a96486597..df1e1c41c 100644
--- a/meson.build
+++ b/meson.build
@@ -62,6 +62,7 @@ configure_file(output: build_cfg,
 # for static builds, include the drivers as libs and we need to "whole-archive"
 # them.
 dpdk_drivers = ['-Wl,--whole-archive'] + dpdk_drivers + ['-Wl,--no-whole-archive']
+dpdk_static_libraries = ['-Wl,--whole-archive'] + dpdk_static_libraries + ['-Wl,--no-whole-archive']

Not saying that's the proper patch, but just to prove that linking librte_acl.a
with '--whole-archive' does fix the problem.
Bruce, could you point is the best way to fix things here
(my meson knowledge is very limited)?
Do we need extra container here as 'whole_archive_static_libraries[]' or so?
Thanks
Konstantin


> 
> 
> > and on the systems I tested with (including tests under travis).
> >
> > 1. Any meson build that I observed didn't correctly fill anything but
> >    the scalar variable.  I had to remove the -ENOTSUP definitions in the
> >    rte_acl.c file (forgot to git add it), and make the second version.
> >
> > 2. The tests never selected scalar, or nor sse implementations.
> 
> As I can see test_classify_run() *always* run both default method (sse/avx2 on x86)
> and then scalar one.
> 
> >  Rather,
> >    they selected only what the currently running platform provided.
> >    This meant that I was always seeing the AVX2 code executed, but never
> >    SSE nor scalar (but for one case) - at least as far as I could see.
> >
> > There were others - I iterated on these for a few days.
> >
> > This is why I changed a block to run through each implementation in one
> > of the versions.
> >
> > HOWEVER, it's still deficient.
> >
> > We need to fully cover all the cases.  BUT it's better than the failure
> > that currently happens on almost every system I've tried - including
> > shipping the build to travis to run.  So, I figured running > failing with
> > almost no reason why.  And looking into the failure revealed that the
> > meson build didn't even include the platform specific builds.
> >
> > During my rework, I can change the test cases to iterate as in other
> > test cases.  It will extend the time.  And I don't know how to resolve
> > the case where we run on a system that doesn't support AVX2 but we have
> > a compiler that supports AVX2 (since that case will fail - but we
> > shouldn't even attempt it).
> 
> I don't see why that should happen.
> At rte_acl_init() we do check does that machine supports AVX2(SSE, NEON)
> instructions or not.
> Are you saying under some circumstances rte_acl_init() are not working properly,
> or not get invoked?
> 
> Konstantin
  
Bruce Richardson April 9, 2019, 5:05 p.m. UTC | #5
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, April 9, 2019 5:03 PM
> To: Aaron Conole <aconole@redhat.com>
> Cc: dev@dpdk.org; Jerin Jacob <jerinj@marvell.com>; Gavin Hu
> <gavin.hu@arm.com>; Richardson, Bruce <bruce.richardson@intel.com>;
> Michael Santana <msantana@redhat.com>
> Subject: RE: [PATCH 3/3] acl: adjust the tests
> 
> 
> > > Hi Aaron,
> > >
> > >>
> > >> This makes the tests pass, and also ensures that on platforms where
> > >> the testing is supported, we can properly test the implementation
> > >> specific code.  One edge case is when we run on x86_64 systems that
> > >> don't support AVX2, but where the compiler can generate such
> > >> instructions.  That could be an enhancement in the future, but for
> > >> now at least the tests will pass.
> > >>
> > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> > >> ---
> > >>  app/test/test_acl.c             | 62 +++++++++++++------------------
> --
> > >>  lib/librte_acl/Makefile         |  1 +
> > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> > >>  lib/librte_acl/meson.build      |  4 +--
> > >>  4 files changed, 73 insertions(+), 40 deletions(-)  create mode
> > >> 100644 lib/librte_acl/acl_run_notsup.c
> > >>
> > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c index
> > >> b1f75d1bc..c44faa251 100644
> > >> --- a/app/test/test_acl.c
> > >> +++ b/app/test/test_acl.c
> > >> @@ -408,6 +408,9 @@ test_classify(void)
> > >>  		return -1;
> > >>  	}
> > >>
> > >> +	/* Always use the scalar testing for now. */
> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > >> +
> > >>  	ret = 0;
> > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> > >>
> > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> > >>  		data[i] = (uint8_t *)&test_data[i];
> > >>
> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> > >>  		rte_acl_reset(acx);
> > >>  		ret = test_classify_buid(acx, test_rules, i + 1); @@ -
> 911,6
> > >> +915,8 @@ test_convert_rules(const char *desc,
> > >>  		return -1;
> > >>  	}
> > >>
> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > >> +
> > >
> > > As I understand here and above, on x86 you replaced default algo
> > > (SSE, AVX2) with scalar one, right?
> > > That looks like reduction of test coverage for x86.
> >
> > In one way, you're right.  However, the tests weren't testing what
> > they purported anyway.
> 
> Could you explain a bit more here?
> What I am seeing: tests were running bot sse(or avx2) and scalar
> classify() method.
> Now they always running scalar only.
> To me it definitely looks like reduction in coverage.
> 
> >  Actually, it's just a shift I think (previously, it would have tested
> > the AVX2 but I don't see AVX2 having a fallback into the SSE code -
> > unlike the SSE code falling back into scalar).
> 
> Not sure I understand you here.
> What fallback for AVX2 you expect that you think is missing?
> 
> >
> > The tests were failing for a number of reasons when built with meson,
> 
> Ok, but with legacy build system (make) on x86 all tests passes, right?
> So the problem is in new build system, not in the test itself.
> Why we should compromise our test coverage to make it work with new tools?
> That just hides the problem without fixing it.
> Instead I think the build system needs to be fixed.
> Looking at it a bit closely, for .so meson+ninja generates code with
> correct version of the function:
> 
> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep
> acl_classify_sse
> 000000000000fa50 t rte_acl_classify_sse
> 
> So for 'meson -Ddefault_library=shared'
> acl_autotest passes without the problem.
> 
> Though for static lib we have both:
> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep
> acl_classify_sse
> 0000000000000000 W rte_acl_classify_sse
> 0000000000004880 T rte_acl_classify_sse
> 
> And then linker pickups the wrong one:
> nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep
> acl_classify_sse
> 00000000005f6100 W rte_acl_classify_sse
> 
> While for make:
> $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep
> acl_classify_sse
> 0000000000000000 W rte_acl_classify_sse
> 0000000000004880 T rte_acl_classify_sse
> $ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
> 0000000000240440 T rte_acl_classify_sse
> 
> Linker pickups the right one.
> 

I assume the same issues occurs for AVX2, but for SSE specifically why do we even compile up two copies of the function for x86 platforms, since SSE will always be supported?

/Bruce
  
Ananyev, Konstantin April 9, 2019, 6:29 p.m. UTC | #6
> >
> > > > Hi Aaron,
> > > >
> > > >>
> > > >> This makes the tests pass, and also ensures that on platforms where
> > > >> the testing is supported, we can properly test the implementation
> > > >> specific code.  One edge case is when we run on x86_64 systems that
> > > >> don't support AVX2, but where the compiler can generate such
> > > >> instructions.  That could be an enhancement in the future, but for
> > > >> now at least the tests will pass.
> > > >>
> > > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> > > >> ---
> > > >>  app/test/test_acl.c             | 62 +++++++++++++------------------
> > --
> > > >>  lib/librte_acl/Makefile         |  1 +
> > > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> > > >>  lib/librte_acl/meson.build      |  4 +--
> > > >>  4 files changed, 73 insertions(+), 40 deletions(-)  create mode
> > > >> 100644 lib/librte_acl/acl_run_notsup.c
> > > >>
> > > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c index
> > > >> b1f75d1bc..c44faa251 100644
> > > >> --- a/app/test/test_acl.c
> > > >> +++ b/app/test/test_acl.c
> > > >> @@ -408,6 +408,9 @@ test_classify(void)
> > > >>  		return -1;
> > > >>  	}
> > > >>
> > > >> +	/* Always use the scalar testing for now. */
> > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > >> +
> > > >>  	ret = 0;
> > > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> > > >>
> > > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> > > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> > > >>  		data[i] = (uint8_t *)&test_data[i];
> > > >>
> > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> > > >>  		rte_acl_reset(acx);
> > > >>  		ret = test_classify_buid(acx, test_rules, i + 1); @@ -
> > 911,6
> > > >> +915,8 @@ test_convert_rules(const char *desc,
> > > >>  		return -1;
> > > >>  	}
> > > >>
> > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > >> +
> > > >
> > > > As I understand here and above, on x86 you replaced default algo
> > > > (SSE, AVX2) with scalar one, right?
> > > > That looks like reduction of test coverage for x86.
> > >
> > > In one way, you're right.  However, the tests weren't testing what
> > > they purported anyway.
> >
> > Could you explain a bit more here?
> > What I am seeing: tests were running bot sse(or avx2) and scalar
> > classify() method.
> > Now they always running scalar only.
> > To me it definitely looks like reduction in coverage.
> >
> > >  Actually, it's just a shift I think (previously, it would have tested
> > > the AVX2 but I don't see AVX2 having a fallback into the SSE code -
> > > unlike the SSE code falling back into scalar).
> >
> > Not sure I understand you here.
> > What fallback for AVX2 you expect that you think is missing?
> >
> > >
> > > The tests were failing for a number of reasons when built with meson,
> >
> > Ok, but with legacy build system (make) on x86 all tests passes, right?
> > So the problem is in new build system, not in the test itself.
> > Why we should compromise our test coverage to make it work with new tools?
> > That just hides the problem without fixing it.
> > Instead I think the build system needs to be fixed.
> > Looking at it a bit closely, for .so meson+ninja generates code with
> > correct version of the function:
> >
> > nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep
> > acl_classify_sse
> > 000000000000fa50 t rte_acl_classify_sse
> >
> > So for 'meson -Ddefault_library=shared'
> > acl_autotest passes without the problem.
> >
> > Though for static lib we have both:
> > nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep
> > acl_classify_sse
> > 0000000000000000 W rte_acl_classify_sse
> > 0000000000004880 T rte_acl_classify_sse
> >
> > And then linker pickups the wrong one:
> > nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep
> > acl_classify_sse
> > 00000000005f6100 W rte_acl_classify_sse
> >
> > While for make:
> > $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep
> > acl_classify_sse
> > 0000000000000000 W rte_acl_classify_sse
> > 0000000000004880 T rte_acl_classify_sse
> > $ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
> > 0000000000240440 T rte_acl_classify_sse
> >
> > Linker pickups the right one.
> >
> 
> I assume the same issues occurs for AVX2, 

Yes, I just used sse because it is always available on x86. 

but for SSE specifically why do we even compile up two copies of the function for x86 platforms,
> since SSE will always be supported?

for non IA  platforms.
Konstantin
  
Bruce Richardson April 10, 2019, 8:13 a.m. UTC | #7
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, April 9, 2019 6:05 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Aaron Conole
> <aconole@redhat.com>
> Cc: dev@dpdk.org; Jerin Jacob <jerinj@marvell.com>; Gavin Hu
> <gavin.hu@arm.com>; Richardson, Bruce <bruce.richardson@intel.com>;
> Michael Santana <msantana@redhat.com>
> Subject: RE: [PATCH 3/3] acl: adjust the tests
> 
> 
> 
> >
> > > > Hi Aaron,
> > > >
> > > >>
> > > >> This makes the tests pass, and also ensures that on platforms
> > > >> where the testing is supported, we can properly test the
> > > >> implementation specific code.  One edge case is when we run on
> > > >> x86_64 systems that don't support AVX2, but where the compiler
> > > >> can generate such instructions.  That could be an enhancement in
> > > >> the future, but for now at least the tests will pass.
> > > >>
> > > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> > > >> ---
> > > >>  app/test/test_acl.c             | 62 +++++++++++++----------------
> ----
> > > >>  lib/librte_acl/Makefile         |  1 +
> > > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> > > >>  lib/librte_acl/meson.build      |  4 +--
> > > >>  4 files changed, 73 insertions(+), 40 deletions(-)  create mode
> > > >> 100644 lib/librte_acl/acl_run_notsup.c
> > > >>
> > > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c index
> > > >> b1f75d1bc..c44faa251 100644
> > > >> --- a/app/test/test_acl.c
> > > >> +++ b/app/test/test_acl.c
> > > >> @@ -408,6 +408,9 @@ test_classify(void)
> > > >>  		return -1;
> > > >>  	}
> > > >>
> > > >> +	/* Always use the scalar testing for now. */
> > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > >> +
> > > >>  	ret = 0;
> > > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> > > >>
> > > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> > > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> > > >>  		data[i] = (uint8_t *)&test_data[i];
> > > >>
> > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> > > >>  		rte_acl_reset(acx);
> > > >>  		ret = test_classify_buid(acx, test_rules, i + 1); @@ -
> 911,6
> > > >> +915,8 @@ test_convert_rules(const char *desc,
> > > >>  		return -1;
> > > >>  	}
> > > >>
> > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > >> +
> > > >
> > > > As I understand here and above, on x86 you replaced default algo
> > > > (SSE, AVX2) with scalar one, right?
> > > > That looks like reduction of test coverage for x86.
> > >
> > > In one way, you're right.  However, the tests weren't testing what
> > > they purported anyway.
> >
> > Could you explain a bit more here?
> > What I am seeing: tests were running bot sse(or avx2) and scalar
> classify() method.
> > Now they always running scalar only.
> > To me it definitely looks like reduction in coverage.
> >
> > >  Actually, it's just a shift I think (previously, it would have
> > > tested the AVX2 but I don't see AVX2 having a fallback into the SSE
> > > code - unlike the SSE code falling back into scalar).
> >
> > Not sure I understand you here.
> > What fallback for AVX2 you expect that you think is missing?
> >
> > >
> > > The tests were failing for a number of reasons when built with
> > > meson,
> >
> > Ok, but with legacy build system (make) on x86 all tests passes, right?
> > So the problem is in new build system, not in the test itself.
> > Why we should compromise our test coverage to make it work with new
> > tools?
> > That just hides the problem without fixing it.
> > Instead I think the build system needs to be fixed.
> > Looking at it a bit closely, for .so meson+ninja generates code with
> > correct version of the function:
> >
> > nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep
> > acl_classify_sse
> > 000000000000fa50 t rte_acl_classify_sse
> >
> > So for 'meson -Ddefault_library=shared'
> > acl_autotest passes without the problem.
> >
> > Though for static lib we have both:
> > nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep
> > acl_classify_sse
> > 0000000000000000 W rte_acl_classify_sse
> > 0000000000004880 T rte_acl_classify_sse
> >
> > And then linker pickups the wrong one:
> > nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep
> > acl_classify_sse
> > 00000000005f6100 W rte_acl_classify_sse
> >
> > While for make:
> > $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep
> > acl_classify_sse
> > 0000000000000000 W rte_acl_classify_sse
> > 0000000000004880 T rte_acl_classify_sse $ nm
> > x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
> > 0000000000240440 T rte_acl_classify_sse
> >
> > Linker pickups the right one.
> 
> And the changes below make linker to pick-up the proper version of the
> function and make acl_autotest to pass for static build too.
> 
> diff --git a/app/test/meson.build b/app/test/meson.build index
> 867cc5863..4364be932 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -328,6 +328,7 @@ test_dep_objs += cc.find_library('execinfo', required:
> false)  link_libs = []  if get_option('default_library') == 'static'
>         link_libs = dpdk_drivers
> +       link_libs += dpdk_static_libraries
>  endif
> 
>  if get_option('tests')
> diff --git a/meson.build b/meson.build
> index a96486597..df1e1c41c 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -62,6 +62,7 @@ configure_file(output: build_cfg,  # for static builds,
> include the drivers as libs and we need to "whole-archive"
>  # them.
>  dpdk_drivers = ['-Wl,--whole-archive'] + dpdk_drivers + ['-Wl,--no-whole-
> archive']
> +dpdk_static_libraries = ['-Wl,--whole-archive'] + dpdk_static_libraries
> ++ ['-Wl,--no-whole-archive']
> 
> Not saying that's the proper patch, but just to prove that linking
> librte_acl.a with '--whole-archive' does fix the problem.
> Bruce, could you point is the best way to fix things here (my meson
> knowledge is very limited)?
> Do we need extra container here as 'whole_archive_static_libraries[]' or
> so?
> Thanks
> Konstantin
> 
I'll look into this. I'd really rather avoid having to have all DPDK libraries
linked with link-whole, but if not, we'll need some sort of similar solution.

Regards,
/Bruce
  
Bruce Richardson April 10, 2019, 9:06 a.m. UTC | #8
On Tue, Apr 09, 2019 at 07:29:09PM +0100, Ananyev, Konstantin wrote:
> 
> > >
> > > > > Hi Aaron,
> > > > >
> > > > >>
> > > > >> This makes the tests pass, and also ensures that on platforms where
> > > > >> the testing is supported, we can properly test the implementation
> > > > >> specific code.  One edge case is when we run on x86_64 systems that
> > > > >> don't support AVX2, but where the compiler can generate such
> > > > >> instructions.  That could be an enhancement in the future, but for
> > > > >> now at least the tests will pass.
> > > > >>
> > > > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> > > > >> ---
> > > > >>  app/test/test_acl.c             | 62 +++++++++++++------------------
> > > --
> > > > >>  lib/librte_acl/Makefile         |  1 +
> > > > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> > > > >>  lib/librte_acl/meson.build      |  4 +--
> > > > >>  4 files changed, 73 insertions(+), 40 deletions(-)  create mode
> > > > >> 100644 lib/librte_acl/acl_run_notsup.c
> > > > >>
> > > > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c index
> > > > >> b1f75d1bc..c44faa251 100644
> > > > >> --- a/app/test/test_acl.c
> > > > >> +++ b/app/test/test_acl.c
> > > > >> @@ -408,6 +408,9 @@ test_classify(void)
> > > > >>  		return -1;
> > > > >>  	}
> > > > >>
> > > > >> +	/* Always use the scalar testing for now. */
> > > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > > >> +
> > > > >>  	ret = 0;
> > > > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> > > > >>
> > > > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> > > > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> > > > >>  		data[i] = (uint8_t *)&test_data[i];
> > > > >>
> > > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> > > > >>  		rte_acl_reset(acx);
> > > > >>  		ret = test_classify_buid(acx, test_rules, i + 1); @@ -
> > > 911,6
> > > > >> +915,8 @@ test_convert_rules(const char *desc,
> > > > >>  		return -1;
> > > > >>  	}
> > > > >>
> > > > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> > > > >> +
> > > > >
> > > > > As I understand here and above, on x86 you replaced default algo
> > > > > (SSE, AVX2) with scalar one, right?
> > > > > That looks like reduction of test coverage for x86.
> > > >
> > > > In one way, you're right.  However, the tests weren't testing what
> > > > they purported anyway.
> > >
> > > Could you explain a bit more here?
> > > What I am seeing: tests were running bot sse(or avx2) and scalar
> > > classify() method.
> > > Now they always running scalar only.
> > > To me it definitely looks like reduction in coverage.
> > >
> > > >  Actually, it's just a shift I think (previously, it would have tested
> > > > the AVX2 but I don't see AVX2 having a fallback into the SSE code -
> > > > unlike the SSE code falling back into scalar).
> > >
> > > Not sure I understand you here.
> > > What fallback for AVX2 you expect that you think is missing?
> > >
> > > >
> > > > The tests were failing for a number of reasons when built with meson,
> > >
> > > Ok, but with legacy build system (make) on x86 all tests passes, right?
> > > So the problem is in new build system, not in the test itself.
> > > Why we should compromise our test coverage to make it work with new tools?
> > > That just hides the problem without fixing it.
> > > Instead I think the build system needs to be fixed.
> > > Looking at it a bit closely, for .so meson+ninja generates code with
> > > correct version of the function:
> > >
> > > nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep
> > > acl_classify_sse
> > > 000000000000fa50 t rte_acl_classify_sse
> > >
> > > So for 'meson -Ddefault_library=shared'
> > > acl_autotest passes without the problem.
> > >
> > > Though for static lib we have both:
> > > nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep
> > > acl_classify_sse
> > > 0000000000000000 W rte_acl_classify_sse
> > > 0000000000004880 T rte_acl_classify_sse
> > >
> > > And then linker pickups the wrong one:
> > > nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep
> > > acl_classify_sse
> > > 00000000005f6100 W rte_acl_classify_sse
> > >
> > > While for make:
> > > $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep
> > > acl_classify_sse
> > > 0000000000000000 W rte_acl_classify_sse
> > > 0000000000004880 T rte_acl_classify_sse
> > > $ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
> > > 0000000000240440 T rte_acl_classify_sse
> > >
> > > Linker pickups the right one.
> > >
> > 
> > I assume the same issues occurs for AVX2, 
> 
> Yes, I just used sse because it is always available on x86. 
> 
> but for SSE specifically why do we even compile up two copies of the function for x86 platforms,
> > since SSE will always be supported?
> 
> for non IA  platforms.

Yes, I realise that, but there is no point in compiling the weak version
for IA platforms, since the normal version will be guaranteed available. In
any case, it doesn't matter much, since the issue needs fixing for AVX2
anyway.

/Bruce
  
Aaron Conole April 10, 2019, 1:10 p.m. UTC | #9
"Ananyev, Konstantin" <konstantin.ananyev@intel.com> writes:

>> 
>> > > Hi Aaron,
>> > >
>> > >>
>> > >> This makes the tests pass, and also ensures that on platforms where the
>> > >> testing is supported, we can properly test the implementation specific
>> > >> code.  One edge case is when we run on x86_64 systems that don't support
>> > >> AVX2, but where the compiler can generate such instructions.  That could
>> > >> be an enhancement in the future, but for now at least the tests will
>> > >> pass.
>> > >>
>> > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> > >> ---
>> > >>  app/test/test_acl.c             | 62 +++++++++++++--------------------
>> > >>  lib/librte_acl/Makefile         |  1 +
>> > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
>> > >>  lib/librte_acl/meson.build      |  4 +--
>> > >>  4 files changed, 73 insertions(+), 40 deletions(-)
>> > >>  create mode 100644 lib/librte_acl/acl_run_notsup.c
>> > >>
>> > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c
>> > >> index b1f75d1bc..c44faa251 100644
>> > >> --- a/app/test/test_acl.c
>> > >> +++ b/app/test/test_acl.c
>> > >> @@ -408,6 +408,9 @@ test_classify(void)
>> > >>  		return -1;
>> > >>  	}
>> > >>
>> > >> +	/* Always use the scalar testing for now. */
>> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>> > >> +
>> > >>  	ret = 0;
>> > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
>> > >>
>> > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
>> > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
>> > >>  		data[i] = (uint8_t *)&test_data[i];
>> > >>
>> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>> > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
>> > >>  		rte_acl_reset(acx);
>> > >>  		ret = test_classify_buid(acx, test_rules, i + 1);
>> > >> @@ -911,6 +915,8 @@ test_convert_rules(const char *desc,
>> > >>  		return -1;
>> > >>  	}
>> > >>
>> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
>> > >> +
>> > >
>> > > As I understand here and above, on x86 you replaced default algo (SSE, AVX2)
>> > > with scalar one, right?
>> > > That looks like reduction of test coverage for x86.
>> >
>> > In one way, you're right.  However, the tests weren't testing what they
>> > purported anyway.
>> 
>> Could you explain a bit more here?
>> What I am seeing: tests were running bot sse(or avx2) and scalar classify() method.
>> Now they always running scalar only.
>> To me it definitely looks like reduction in coverage.
>> 
>> >  Actually, it's just a shift I think (previously, it
>> > would have tested the AVX2 but I don't see AVX2 having a fallback into
>> > the SSE code - unlike the SSE code falling back into scalar).
>> 
>> Not sure I understand you here.
>> What fallback for AVX2 you expect that you think is missing?
>> 
>> >
>> > The tests were failing for a number of reasons when built with meson,
>> 
>> Ok, but with legacy build system (make) on x86 all tests passes, right?
>> So the problem is in new build system, not in the test itself.
>> Why we should compromise our test coverage to make it work with
>> new tools?
>> That just hides the problem without fixing it.
>> Instead I think the build system needs to be fixed.
>> Looking at it a bit closely, for .so meson+ninja generates code with
>> correct version of the function:
>> 
>> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep acl_classify_sse
>> 000000000000fa50 t rte_acl_classify_sse
>> 
>> So for 'meson -Ddefault_library=shared'
>> acl_autotest passes without the problem.
>> 
>> Though for static lib we have both:
>> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep acl_classify_sse
>> 0000000000000000 W rte_acl_classify_sse
>> 0000000000004880 T rte_acl_classify_sse
>> 
>> And then linker pickups the wrong one:
>> nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep acl_classify_sse
>> 00000000005f6100 W rte_acl_classify_sse
>> 
>> While for make:
>> $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep acl_classify_sse
>> 0000000000000000 W rte_acl_classify_sse
>> 0000000000004880 T rte_acl_classify_sse
>> $ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
>> 0000000000240440 T rte_acl_classify_sse
>> 
>> Linker pickups the right one.
>
> And the changes below make linker to pick-up the proper version of the function
> and make acl_autotest to pass for static build too.
>
> diff --git a/app/test/meson.build b/app/test/meson.build
> index 867cc5863..4364be932 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -328,6 +328,7 @@ test_dep_objs += cc.find_library('execinfo', required: false)
>  link_libs = []
>  if get_option('default_library') == 'static'
>         link_libs = dpdk_drivers
> +       link_libs += dpdk_static_libraries
>  endif
>
>  if get_option('tests')
> diff --git a/meson.build b/meson.build
> index a96486597..df1e1c41c 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -62,6 +62,7 @@ configure_file(output: build_cfg,
>  # for static builds, include the drivers as libs and we need to "whole-archive"
>  # them.
>  dpdk_drivers = ['-Wl,--whole-archive'] + dpdk_drivers + ['-Wl,--no-whole-archive']
> +dpdk_static_libraries = ['-Wl,--whole-archive'] + dpdk_static_libraries + ['-Wl,--no-whole-archive']
>
> Not saying that's the proper patch, but just to prove that linking librte_acl.a
> with '--whole-archive' does fix the problem.
> Bruce, could you point is the best way to fix things here
> (my meson knowledge is very limited)?
> Do we need extra container here as 'whole_archive_static_libraries[]' or so?
> Thanks
> Konstantin

Okay - I'll look at this part more.  I think I went down the path of
explicitly setting these because the comments didn't match with what was
occuring (for example, in the section that I changed that loops through
all versions, only the AVX2 and Scalar were being tested on my system,
while the comment implied SSE).

I also believe that I split out the functions because of the linking
issue (I guess the way the linker resolves the functions works properly
when the weak versions are in a different translation unit)?  I'll spend
some time trying to get it working in a different way.

Regardless, this wasn't ready for posting as 'PATCH' - I meant it as
RFC.  I don't intend to change the first two patches, though.

And thank you for the all the feedback!

>
>> 
>> 
>> > and on the systems I tested with (including tests under travis).
>> >
>> > 1. Any meson build that I observed didn't correctly fill anything but
>> >    the scalar variable.  I had to remove the -ENOTSUP definitions in the
>> >    rte_acl.c file (forgot to git add it), and make the second version.
>> >
>> > 2. The tests never selected scalar, or nor sse implementations.
>> 
>> As I can see test_classify_run() *always* run both default method (sse/avx2 on x86)
>> and then scalar one.
>> 
>> >  Rather,
>> >    they selected only what the currently running platform provided.
>> >    This meant that I was always seeing the AVX2 code executed, but never
>> >    SSE nor scalar (but for one case) - at least as far as I could see.
>> >
>> > There were others - I iterated on these for a few days.
>> >
>> > This is why I changed a block to run through each implementation in one
>> > of the versions.
>> >
>> > HOWEVER, it's still deficient.
>> >
>> > We need to fully cover all the cases.  BUT it's better than the failure
>> > that currently happens on almost every system I've tried - including
>> > shipping the build to travis to run.  So, I figured running > failing with
>> > almost no reason why.  And looking into the failure revealed that the
>> > meson build didn't even include the platform specific builds.
>> >
>> > During my rework, I can change the test cases to iterate as in other
>> > test cases.  It will extend the time.  And I don't know how to resolve
>> > the case where we run on a system that doesn't support AVX2 but we have
>> > a compiler that supports AVX2 (since that case will fail - but we
>> > shouldn't even attempt it).
>> 
>> I don't see why that should happen.
>> At rte_acl_init() we do check does that machine supports AVX2(SSE, NEON)
>> instructions or not.
>> Are you saying under some circumstances rte_acl_init() are not working properly,
>> or not get invoked?
>> 
>> Konstantin
  
Bruce Richardson April 10, 2019, 1:24 p.m. UTC | #10
On Wed, Apr 10, 2019 at 09:10:25AM -0400, Aaron Conole wrote:
> "Ananyev, Konstantin" <konstantin.ananyev@intel.com> writes:
> 
> >> 
> >> > > Hi Aaron,
> >> > >
> >> > >>
> >> > >> This makes the tests pass, and also ensures that on platforms where the
> >> > >> testing is supported, we can properly test the implementation specific
> >> > >> code.  One edge case is when we run on x86_64 systems that don't support
> >> > >> AVX2, but where the compiler can generate such instructions.  That could
> >> > >> be an enhancement in the future, but for now at least the tests will
> >> > >> pass.
> >> > >>
> >> > >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> >> > >> ---
> >> > >>  app/test/test_acl.c             | 62 +++++++++++++--------------------
> >> > >>  lib/librte_acl/Makefile         |  1 +
> >> > >>  lib/librte_acl/acl_run_notsup.c | 46 ++++++++++++++++++++++++
> >> > >>  lib/librte_acl/meson.build      |  4 +--
> >> > >>  4 files changed, 73 insertions(+), 40 deletions(-)
> >> > >>  create mode 100644 lib/librte_acl/acl_run_notsup.c
> >> > >>
> >> > >> diff --git a/app/test/test_acl.c b/app/test/test_acl.c
> >> > >> index b1f75d1bc..c44faa251 100644
> >> > >> --- a/app/test/test_acl.c
> >> > >> +++ b/app/test/test_acl.c
> >> > >> @@ -408,6 +408,9 @@ test_classify(void)
> >> > >>  		return -1;
> >> > >>  	}
> >> > >>
> >> > >> +	/* Always use the scalar testing for now. */
> >> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> >> > >> +
> >> > >>  	ret = 0;
> >> > >>  	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
> >> > >>
> >> > >> @@ -547,6 +550,7 @@ test_build_ports_range(void)
> >> > >>  	for (i = 0; i != RTE_DIM(test_data); i++)
> >> > >>  		data[i] = (uint8_t *)&test_data[i];
> >> > >>
> >> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> >> > >>  	for (i = 0; i != RTE_DIM(test_rules); i++) {
> >> > >>  		rte_acl_reset(acx);
> >> > >>  		ret = test_classify_buid(acx, test_rules, i + 1);
> >> > >> @@ -911,6 +915,8 @@ test_convert_rules(const char *desc,
> >> > >>  		return -1;
> >> > >>  	}
> >> > >>
> >> > >> +	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
> >> > >> +
> >> > >
> >> > > As I understand here and above, on x86 you replaced default algo (SSE, AVX2)
> >> > > with scalar one, right?
> >> > > That looks like reduction of test coverage for x86.
> >> >
> >> > In one way, you're right.  However, the tests weren't testing what they
> >> > purported anyway.
> >> 
> >> Could you explain a bit more here?
> >> What I am seeing: tests were running bot sse(or avx2) and scalar classify() method.
> >> Now they always running scalar only.
> >> To me it definitely looks like reduction in coverage.
> >> 
> >> >  Actually, it's just a shift I think (previously, it
> >> > would have tested the AVX2 but I don't see AVX2 having a fallback into
> >> > the SSE code - unlike the SSE code falling back into scalar).
> >> 
> >> Not sure I understand you here.
> >> What fallback for AVX2 you expect that you think is missing?
> >> 
> >> >
> >> > The tests were failing for a number of reasons when built with meson,
> >> 
> >> Ok, but with legacy build system (make) on x86 all tests passes, right?
> >> So the problem is in new build system, not in the test itself.
> >> Why we should compromise our test coverage to make it work with
> >> new tools?
> >> That just hides the problem without fixing it.
> >> Instead I think the build system needs to be fixed.
> >> Looking at it a bit closely, for .so meson+ninja generates code with
> >> correct version of the function:
> >> 
> >> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.so.2 | grep acl_classify_sse
> >> 000000000000fa50 t rte_acl_classify_sse
> >> 
> >> So for 'meson -Ddefault_library=shared'
> >> acl_autotest passes without the problem.
> >> 
> >> Though for static lib we have both:
> >> nm x86_64-native-linuxapp-gcc-meson/lib/librte_acl.a | grep acl_classify_sse
> >> 0000000000000000 W rte_acl_classify_sse
> >> 0000000000004880 T rte_acl_classify_sse
> >> 
> >> And then linker pickups the wrong one:
> >> nm x86_64-native-linuxapp-gcc-meson/app/test/dpdk-test | grep acl_classify_sse
> >> 00000000005f6100 W rte_acl_classify_sse
> >> 
> >> While for make:
> >> $ nm x86_64-native-linuxapp-gcc-aesmb/lib/librte_acl.a | grep acl_classify_sse
> >> 0000000000000000 W rte_acl_classify_sse
> >> 0000000000004880 T rte_acl_classify_sse
> >> $ nm x86_64-native-linuxapp-gcc-aesmb/app/test | grep acl_classify_sse
> >> 0000000000240440 T rte_acl_classify_sse
> >> 
> >> Linker pickups the right one.
> >
> > And the changes below make linker to pick-up the proper version of the function
> > and make acl_autotest to pass for static build too.
> >
> > diff --git a/app/test/meson.build b/app/test/meson.build
> > index 867cc5863..4364be932 100644
> > --- a/app/test/meson.build
> > +++ b/app/test/meson.build
> > @@ -328,6 +328,7 @@ test_dep_objs += cc.find_library('execinfo', required: false)
> >  link_libs = []
> >  if get_option('default_library') == 'static'
> >         link_libs = dpdk_drivers
> > +       link_libs += dpdk_static_libraries
> >  endif
> >
> >  if get_option('tests')
> > diff --git a/meson.build b/meson.build
> > index a96486597..df1e1c41c 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -62,6 +62,7 @@ configure_file(output: build_cfg,
> >  # for static builds, include the drivers as libs and we need to "whole-archive"
> >  # them.
> >  dpdk_drivers = ['-Wl,--whole-archive'] + dpdk_drivers + ['-Wl,--no-whole-archive']
> > +dpdk_static_libraries = ['-Wl,--whole-archive'] + dpdk_static_libraries + ['-Wl,--no-whole-archive']
> >
> > Not saying that's the proper patch, but just to prove that linking librte_acl.a
> > with '--whole-archive' does fix the problem.
> > Bruce, could you point is the best way to fix things here
> > (my meson knowledge is very limited)?
> > Do we need extra container here as 'whole_archive_static_libraries[]' or so?
> > Thanks
> > Konstantin
> 
> Okay - I'll look at this part more.  I think I went down the path of
> explicitly setting these because the comments didn't match with what was
> occuring (for example, in the section that I changed that loops through
> all versions, only the AVX2 and Scalar were being tested on my system,
> while the comment implied SSE).
> 
> I also believe that I split out the functions because of the linking
> issue (I guess the way the linker resolves the functions works properly
> when the weak versions are in a different translation unit)?  I'll spend
> some time trying to get it working in a different way.
> 
> Regardless, this wasn't ready for posting as 'PATCH' - I meant it as
> RFC.  I don't intend to change the first two patches, though.
> 
> And thank you for the all the feedback!
> 
I've dug into this a bit, and I'm doing up a patch to remove the use of
weak symbols in our libraries (note, just libs, not drivers) entirely.
That's fairly easy to do, and not a big change, but should make this
problem go away.

/Bruce
  
Bruce Richardson April 10, 2019, 1:46 p.m. UTC | #11
On Wed, Apr 10, 2019 at 02:24:56PM +0100, Bruce Richardson wrote:
> On Wed, Apr 10, 2019 at 09:10:25AM -0400, Aaron Conole wrote:
> > 
> > Okay - I'll look at this part more.  I think I went down the path of
> > explicitly setting these because the comments didn't match with what was
> > occuring (for example, in the section that I changed that loops through
> > all versions, only the AVX2 and Scalar were being tested on my system,
> > while the comment implied SSE).
> > 
> > I also believe that I split out the functions because of the linking
> > issue (I guess the way the linker resolves the functions works properly
> > when the weak versions are in a different translation unit)?  I'll spend
> > some time trying to get it working in a different way.
> > 
> > Regardless, this wasn't ready for posting as 'PATCH' - I meant it as
> > RFC.  I don't intend to change the first two patches, though.
> > 
> > And thank you for the all the feedback!
> > 
> I've dug into this a bit, and I'm doing up a patch to remove the use of
> weak symbols in our libraries (note, just libs, not drivers) entirely.
> That's fairly easy to do, and not a big change, but should make this
> problem go away.
> 
> /Bruce

Ref: http://patches.dpdk.org/project/dpdk/list/?series=4242
  

Patch

diff --git a/app/test/test_acl.c b/app/test/test_acl.c
index b1f75d1bc..c44faa251 100644
--- a/app/test/test_acl.c
+++ b/app/test/test_acl.c
@@ -408,6 +408,9 @@  test_classify(void)
 		return -1;
 	}
 
+	/* Always use the scalar testing for now. */
+	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
+
 	ret = 0;
 	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
 
@@ -547,6 +550,7 @@  test_build_ports_range(void)
 	for (i = 0; i != RTE_DIM(test_data); i++)
 		data[i] = (uint8_t *)&test_data[i];
 
+	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
 	for (i = 0; i != RTE_DIM(test_rules); i++) {
 		rte_acl_reset(acx);
 		ret = test_classify_buid(acx, test_rules, i + 1);
@@ -911,6 +915,8 @@  test_convert_rules(const char *desc,
 		return -1;
 	}
 
+	rte_acl_set_ctx_classify(acx, RTE_ACL_CLASSIFY_SCALAR);
+
 	rc = convert_rules(acx, convert, acl_test_rules,
 		RTE_DIM(acl_test_rules));
 	if (rc != 0)
@@ -1352,7 +1358,7 @@  test_invalid_parameters(void)
 	struct rte_acl_param param;
 	struct rte_acl_ctx *acx;
 	struct rte_acl_ipv4vlan_rule rule;
-	int result;
+	int i, result;
 
 	uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {0};
 
@@ -1513,45 +1519,25 @@  test_invalid_parameters(void)
 		return -1;
 	}
 
-	/* SSE classify test */
-
-	/* cover zero categories in classify (should not fail) */
-	result = rte_acl_classify(acx, NULL, NULL, 0, 0);
-	if (result != 0) {
-		printf("Line %i: SSE classify with zero categories "
-				"failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
-
-	/* cover invalid but positive categories in classify */
-	result = rte_acl_classify(acx, NULL, NULL, 0, 3);
-	if (result == 0) {
-		printf("Line %i: SSE classify with 3 categories "
-				"should have failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
-
-	/* scalar classify test */
+	for (i = RTE_ACL_CLASSIFY_DEFAULT; i < RTE_ACL_CLASSIFY_NUM; ++i) {
+		rte_acl_set_ctx_classify(acx, i); /* set up the classify code */
 
-	/* cover zero categories in classify (should not fail) */
-	result = rte_acl_classify_alg(acx, NULL, NULL, 0, 0,
-		RTE_ACL_CLASSIFY_SCALAR);
-	if (result != 0) {
-		printf("Line %i: Scalar classify with zero categories "
-				"failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
+		/* cover zero categories in classify (should not fail) */
+		result = rte_acl_classify(acx, NULL, NULL, 0, 0);
+		if (result != 0 && result != -ENOTSUP) {
+			printf("AGL: %d, ACL classify with zero categories failed: %d!\n",
+			       i, result);
+			return -1;
+		}
 
-	/* cover invalid but positive categories in classify */
-	result = rte_acl_classify(acx, NULL, NULL, 0, 3);
-	if (result == 0) {
-		printf("Line %i: Scalar classify with 3 categories "
-				"should have failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
+		/* cover invalid but positive categories in classify */
+		result = rte_acl_classify(acx, NULL, NULL, 0, 3);
+		/* we don't check for -ENOTSUP here, since it is a failure */
+		if (result == 0) {
+			printf("AGL: %d, ACL classify with 3 categories should fail!\n",
+			       i);
+			return -1;
+		}
 	}
 
 	/* free ACL context */
diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index ea5edf00a..c5dfdb832 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -21,6 +21,7 @@  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_notsup.c
 
 ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM) $(CONFIG_RTE_ARCH_ARM64)),)
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_neon.c
diff --git a/lib/librte_acl/acl_run_notsup.c b/lib/librte_acl/acl_run_notsup.c
new file mode 100644
index 000000000..2bcc6e67f
--- /dev/null
+++ b/lib/librte_acl/acl_run_notsup.c
@@ -0,0 +1,46 @@ 
+#include <rte_acl.h>
+#include "acl.h"
+
+/*
+ * If the compiler doesn't support AVX2 instructions,
+ * then the dummy one would be used instead for AVX2 classify method.
+ */
+int __rte_weak
+rte_acl_classify_avx2(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
+
+int __rte_weak
+rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
+
+int __rte_weak
+rte_acl_classify_neon(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
+
+int __rte_weak
+rte_acl_classify_altivec(__rte_unused const struct rte_acl_ctx *ctx,
+	__rte_unused const uint8_t **data,
+	__rte_unused uint32_t *results,
+	__rte_unused uint32_t num,
+	__rte_unused uint32_t categories)
+{
+	return -ENOTSUP;
+}
diff --git a/lib/librte_acl/meson.build b/lib/librte_acl/meson.build
index 03c19e4e5..fc8689aa9 100644
--- a/lib/librte_acl/meson.build
+++ b/lib/librte_acl/meson.build
@@ -2,8 +2,8 @@ 
 # Copyright(c) 2017 Intel Corporation
 
 version = 2
-sources = files('acl_bld.c', 'acl_gen.c', 'acl_run_scalar.c',
-		'rte_acl.c', 'tb_mem.c')
+sources = files('acl_bld.c', 'acl_gen.c', 'acl_run_notsup.c',
+		'acl_run_scalar.c', 'rte_acl.c', 'tb_mem.c')
 headers = files('rte_acl.h', 'rte_acl_osdep.h')
 
 if arch_subdir == 'x86'