diff mbox series

[v8,2/4] meson: add infra to support machine specific flags

Message ID 20190410161400.9361-2-jerinj@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers show
Series [v8,1/4] mk: introduce helper to check valid compiler argument | expand

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Jerin Jacob April 10, 2019, 4:13 p.m. UTC
From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Currently, RTE_* flags are set based on the implementer ID but there might
be some micro arch specific differences from the same vendor
eg. CACHE_LINESIZE. Add support to set micro arch specific flags.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 24 deletions(-)

Comments

Yongseok Koh April 10, 2019, 5:37 p.m. UTC | #1
> On Apr 10, 2019, at 9:13 AM, jerinjacobk@gmail.com wrote:
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> Currently, RTE_* flags are set based on the implementer ID but there might
> be some micro arch specific differences from the same vendor
> eg. CACHE_LINESIZE. Add support to set micro arch specific flags.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---
> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
> 1 file changed, 32 insertions(+), 24 deletions(-)
> 
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 170a4981a..24bce2b39 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
> 
> arm_force_native_march = false
> 
> -machine_args_generic = [
> -	['default', ['-march=armv8-a+crc+crypto']],
> -	['native', ['-march=native']],
> -	['0xd03', ['-mcpu=cortex-a53']],
> -	['0xd04', ['-mcpu=cortex-a35']],
> -	['0xd05', ['-mcpu=cortex-a55']],
> -	['0xd07', ['-mcpu=cortex-a57']],
> -	['0xd08', ['-mcpu=cortex-a72']],
> -	['0xd09', ['-mcpu=cortex-a73']],
> -	['0xd0a', ['-mcpu=cortex-a75']],
> -	['0xd0b', ['-mcpu=cortex-a76']],
> -]
> -machine_args_cavium = [
> -	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> -	['native', ['-march=native']],
> -	['0xa1', ['-mcpu=thunderxt88']],
> -	['0xa2', ['-mcpu=thunderxt81']],
> -	['0xa3', ['-mcpu=thunderxt83']]]
> -
> flags_common_default = [
> 	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
> 	# to determine the best threshold in code. Refer to notes in source file
> @@ -52,12 +33,10 @@ flags_generic = [
> 	['RTE_USE_C11_MEM_MODEL', true],
> 	['RTE_CACHE_LINE_SIZE', 128]]
> flags_cavium = [
> -	['RTE_MACHINE', '"thunderx"'],
> 	['RTE_CACHE_LINE_SIZE', 128],
> 	['RTE_MAX_NUMA_NODES', 2],
> 	['RTE_MAX_LCORE', 96],
> -	['RTE_MAX_VFIO_GROUPS', 128],
> -	['RTE_USE_C11_MEM_MODEL', false]]
> +	['RTE_MAX_VFIO_GROUPS', 128]]
> flags_dpaa = [
> 	['RTE_MACHINE', '"dpaa"'],
> 	['RTE_USE_C11_MEM_MODEL', true],
> @@ -71,6 +50,27 @@ flags_dpaa2 = [
> 	['RTE_MAX_NUMA_NODES', 1],
> 	['RTE_MAX_LCORE', 16],
> 	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
> +flags_default_extra = []
> +flags_thunderx_extra = [
> +	['RTE_MACHINE', '"thunderx"'],
> +	['RTE_USE_C11_MEM_MODEL', false]]
> +
> +machine_args_generic = [
> +	['default', ['-march=armv8-a+crc+crypto']],
> +	['native', ['-march=native']],
> +	['0xd03', ['-mcpu=cortex-a53']],
> +	['0xd04', ['-mcpu=cortex-a35']],
> +	['0xd07', ['-mcpu=cortex-a57']],
> +	['0xd08', ['-mcpu=cortex-a72']],
> +	['0xd09', ['-mcpu=cortex-a73']],
> +	['0xd0a', ['-mcpu=cortex-a75']]]
> +
> +machine_args_cavium = [
> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> +	['native', ['-march=native']],
> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
> 
> ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
> impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
> @@ -157,8 +157,16 @@ else
> 	endif
> 	foreach marg: machine[2]
> 		if marg[0] == impl_pn
> -			foreach f: marg[1]
> -				machine_args += f
> +			foreach flag: marg[1]
> +				if cc.has_argument(flag)
> +					machine_args += flag
> +				endif
> +			endforeach
> +			# Apply any extra machine specific flags.
> +			foreach flag: marg.get(2, flags_default_extra)
> +				if flag.length() > 0
> +					dpdk_conf.set(flag[0], flag[1])
> +				endif

Let me continue the discussion from v7 here.
Seems I wan't clear enough.

Let me take an example. If the host is thunderx2 (0xaf) and compiler is older
than v7, flags_thunderx2_extra isn't set. This means, for example,
RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might want
to set it regardless of gcc version. You could skip setting -mcpu with setting
the extra flags.

Thoughts?

Thanks,
Yongseok
Pavan Nikhilesh Bhagavatula April 11, 2019, 6:07 a.m. UTC | #2
Hi Yongseok,

>-----Original Message-----
>From: Yongseok Koh <yskoh@mellanox.com>
>Sent: Wednesday, April 10, 2019 11:08 PM
>To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Jerin
>Jacob Kollanukkaran <jerinj@marvell.com>; jerinjacobk@gmail.com
>Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>machine specific flags
>
>External Email
>
>----------------------------------------------------------------------
>
>> On Apr 10, 2019, at 9:13 AM, jerinjacobk@gmail.com wrote:
>>
>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>
>> Currently, RTE_* flags are set based on the implementer ID but there
>> might be some micro arch specific differences from the same vendor eg.
>> CACHE_LINESIZE. Add support to set micro arch specific flags.
>>
>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>> ---
>> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
>> 1 file changed, 32 insertions(+), 24 deletions(-)
>>
>> diff --git a/config/arm/meson.build b/config/arm/meson.build index
>> 170a4981a..24bce2b39 100644
>> --- a/config/arm/meson.build
>> +++ b/config/arm/meson.build
>> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
>>
>> arm_force_native_march = false
>>
>> -machine_args_generic = [
>> -	['default', ['-march=armv8-a+crc+crypto']],
>> -	['native', ['-march=native']],
>> -	['0xd03', ['-mcpu=cortex-a53']],
>> -	['0xd04', ['-mcpu=cortex-a35']],
>> -	['0xd05', ['-mcpu=cortex-a55']],
>> -	['0xd07', ['-mcpu=cortex-a57']],
>> -	['0xd08', ['-mcpu=cortex-a72']],
>> -	['0xd09', ['-mcpu=cortex-a73']],
>> -	['0xd0a', ['-mcpu=cortex-a75']],
>> -	['0xd0b', ['-mcpu=cortex-a76']],
>> -]
>> -machine_args_cavium = [
>> -	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>> -	['native', ['-march=native']],
>> -	['0xa1', ['-mcpu=thunderxt88']],
>> -	['0xa2', ['-mcpu=thunderxt81']],
>> -	['0xa3', ['-mcpu=thunderxt83']]]
>> -
>> flags_common_default = [
>> 	# Accelarate rte_memcpy. Be sure to run unit test
>(memcpy_perf_autotest)
>> 	# to determine the best threshold in code. Refer to notes in source
>> file @@ -52,12 +33,10 @@ flags_generic = [
>> 	['RTE_USE_C11_MEM_MODEL', true],
>> 	['RTE_CACHE_LINE_SIZE', 128]]
>> flags_cavium = [
>> -	['RTE_MACHINE', '"thunderx"'],
>> 	['RTE_CACHE_LINE_SIZE', 128],
>> 	['RTE_MAX_NUMA_NODES', 2],
>> 	['RTE_MAX_LCORE', 96],
>> -	['RTE_MAX_VFIO_GROUPS', 128],
>> -	['RTE_USE_C11_MEM_MODEL', false]]
>> +	['RTE_MAX_VFIO_GROUPS', 128]]
>> flags_dpaa = [
>> 	['RTE_MACHINE', '"dpaa"'],
>> 	['RTE_USE_C11_MEM_MODEL', true],
>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>> 	['RTE_MAX_NUMA_NODES', 1],
>> 	['RTE_MAX_LCORE', 16],
>> 	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>> +flags_default_extra = []
>> +flags_thunderx_extra = [
>> +	['RTE_MACHINE', '"thunderx"'],
>> +	['RTE_USE_C11_MEM_MODEL', false]]
>> +
>> +machine_args_generic = [
>> +	['default', ['-march=armv8-a+crc+crypto']],
>> +	['native', ['-march=native']],
>> +	['0xd03', ['-mcpu=cortex-a53']],
>> +	['0xd04', ['-mcpu=cortex-a35']],
>> +	['0xd07', ['-mcpu=cortex-a57']],
>> +	['0xd08', ['-mcpu=cortex-a72']],
>> +	['0xd09', ['-mcpu=cortex-a73']],
>> +	['0xd0a', ['-mcpu=cortex-a75']]]
>> +
>> +machine_args_cavium = [
>> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>> +	['native', ['-march=native']],
>> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
>>
>> ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page
>> G7-5321) impl_generic = ['Generic armv8', flags_generic,
>> machine_args_generic] @@ -157,8 +157,16 @@ else
>> 	endif
>> 	foreach marg: machine[2]
>> 		if marg[0] == impl_pn
>> -			foreach f: marg[1]
>> -				machine_args += f
>> +			foreach flag: marg[1]
>> +				if cc.has_argument(flag)
>> +					machine_args += flag
>> +				endif
>> +			endforeach
>> +			# Apply any extra machine specific flags.
>> +			foreach flag: marg.get(2, flags_default_extra)
>> +				if flag.length() > 0
>> +					dpdk_conf.set(flag[0], flag[1])
>> +				endif
>
>Let me continue the discussion from v7 here.
>Seems I wan't clear enough.
>
>Let me take an example. If the host is thunderx2 (0xaf) and compiler is older
>than v7, flags_thunderx2_extra isn't set. This means, for example,
>RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
>RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might
>want to set it regardless of gcc version. You could skip setting -mcpu with setting
>the extra flags.
>

Thanks for the detailed explanation.
I think since we have the check to skip mcpu flag when cc doesn't support it (cc.has_argument(flag))
It will be safe to remove 
`
        # Primary part number based mcpu flags are supported
        # for gcc versions > 7
        if cc.version().version_compare(
                        '<7.0') or cmd_output.length() == 0
                if not meson.is_cross_build() and arm_force_native_march == true
                        impl_pn = 'native'
                else
                        impl_pn = 'default'
                endif
        endif
`

The command output check can also be removed as it is handled when calling the command script itself.

Thoughts?

PS. I think the safest way to set CACHELINE_SIZE is to read the cache type register[1] but sadly only few latest kernels 
have the support through sysfs (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size) 

>Thoughts?
>
>Thanks,
>Yongseok
>
>
>

Regards,
Pavan.
Yongseok Koh April 11, 2019, 8:12 p.m. UTC | #3
> On Apr 10, 2019, at 11:07 PM, Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com> wrote:
> 
> Hi Yongseok,
> 
>> -----Original Message-----
>> From: Yongseok Koh <yskoh@mellanox.com>
>> Sent: Wednesday, April 10, 2019 11:08 PM
>> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>> Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Jerin
>> Jacob Kollanukkaran <jerinj@marvell.com>; jerinjacobk@gmail.com
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>> machine specific flags
>> 
>> External Email
>> 
>> ----------------------------------------------------------------------
>> 
>>> On Apr 10, 2019, at 9:13 AM, jerinjacobk@gmail.com wrote:
>>> 
>>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>> 
>>> Currently, RTE_* flags are set based on the implementer ID but there
>>> might be some micro arch specific differences from the same vendor eg.
>>> CACHE_LINESIZE. Add support to set micro arch specific flags.
>>> 
>>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>>> ---
>>> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
>>> 1 file changed, 32 insertions(+), 24 deletions(-)
>>> 
>>> diff --git a/config/arm/meson.build b/config/arm/meson.build index
>>> 170a4981a..24bce2b39 100644
>>> --- a/config/arm/meson.build
>>> +++ b/config/arm/meson.build
>>> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
>>> 
>>> arm_force_native_march = false
>>> 
>>> -machine_args_generic = [
>>> -	['default', ['-march=armv8-a+crc+crypto']],
>>> -	['native', ['-march=native']],
>>> -	['0xd03', ['-mcpu=cortex-a53']],
>>> -	['0xd04', ['-mcpu=cortex-a35']],
>>> -	['0xd05', ['-mcpu=cortex-a55']],
>>> -	['0xd07', ['-mcpu=cortex-a57']],
>>> -	['0xd08', ['-mcpu=cortex-a72']],
>>> -	['0xd09', ['-mcpu=cortex-a73']],
>>> -	['0xd0a', ['-mcpu=cortex-a75']],
>>> -	['0xd0b', ['-mcpu=cortex-a76']],
>>> -]
>>> -machine_args_cavium = [
>>> -	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>> -	['native', ['-march=native']],
>>> -	['0xa1', ['-mcpu=thunderxt88']],
>>> -	['0xa2', ['-mcpu=thunderxt81']],
>>> -	['0xa3', ['-mcpu=thunderxt83']]]
>>> -
>>> flags_common_default = [
>>> 	# Accelarate rte_memcpy. Be sure to run unit test
>> (memcpy_perf_autotest)
>>> 	# to determine the best threshold in code. Refer to notes in source
>>> file @@ -52,12 +33,10 @@ flags_generic = [
>>> 	['RTE_USE_C11_MEM_MODEL', true],
>>> 	['RTE_CACHE_LINE_SIZE', 128]]
>>> flags_cavium = [
>>> -	['RTE_MACHINE', '"thunderx"'],
>>> 	['RTE_CACHE_LINE_SIZE', 128],
>>> 	['RTE_MAX_NUMA_NODES', 2],
>>> 	['RTE_MAX_LCORE', 96],
>>> -	['RTE_MAX_VFIO_GROUPS', 128],
>>> -	['RTE_USE_C11_MEM_MODEL', false]]
>>> +	['RTE_MAX_VFIO_GROUPS', 128]]
>>> flags_dpaa = [
>>> 	['RTE_MACHINE', '"dpaa"'],
>>> 	['RTE_USE_C11_MEM_MODEL', true],
>>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>>> 	['RTE_MAX_NUMA_NODES', 1],
>>> 	['RTE_MAX_LCORE', 16],
>>> 	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>>> +flags_default_extra = []
>>> +flags_thunderx_extra = [
>>> +	['RTE_MACHINE', '"thunderx"'],
>>> +	['RTE_USE_C11_MEM_MODEL', false]]
>>> +
>>> +machine_args_generic = [
>>> +	['default', ['-march=armv8-a+crc+crypto']],
>>> +	['native', ['-march=native']],
>>> +	['0xd03', ['-mcpu=cortex-a53']],
>>> +	['0xd04', ['-mcpu=cortex-a35']],
>>> +	['0xd07', ['-mcpu=cortex-a57']],
>>> +	['0xd08', ['-mcpu=cortex-a72']],
>>> +	['0xd09', ['-mcpu=cortex-a73']],
>>> +	['0xd0a', ['-mcpu=cortex-a75']]]
>>> +
>>> +machine_args_cavium = [
>>> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>> +	['native', ['-march=native']],
>>> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>>> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>>> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
>>> 
>>> ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page
>>> G7-5321) impl_generic = ['Generic armv8', flags_generic,
>>> machine_args_generic] @@ -157,8 +157,16 @@ else
>>> 	endif
>>> 	foreach marg: machine[2]
>>> 		if marg[0] == impl_pn
>>> -			foreach f: marg[1]
>>> -				machine_args += f
>>> +			foreach flag: marg[1]
>>> +				if cc.has_argument(flag)
>>> +					machine_args += flag
>>> +				endif
>>> +			endforeach
>>> +			# Apply any extra machine specific flags.
>>> +			foreach flag: marg.get(2, flags_default_extra)
>>> +				if flag.length() > 0
>>> +					dpdk_conf.set(flag[0], flag[1])
>>> +				endif
>> 
>> Let me continue the discussion from v7 here.
>> Seems I wan't clear enough.
>> 
>> Let me take an example. If the host is thunderx2 (0xaf) and compiler is older
>> than v7, flags_thunderx2_extra isn't set. This means, for example,
>> RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
>> RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might
>> want to set it regardless of gcc version. You could skip setting -mcpu with setting
>> the extra flags.
>> 
> 
> Thanks for the detailed explanation.
> I think since we have the check to skip mcpu flag when cc doesn't support it (cc.has_argument(flag))
> It will be safe to remove 
> `
>        # Primary part number based mcpu flags are supported
>        # for gcc versions > 7
>        if cc.version().version_compare(
>                        '<7.0') or cmd_output.length() == 0
>                if not meson.is_cross_build() and arm_force_native_march == true
>                        impl_pn = 'native'
>                else
>                        impl_pn = 'default'
>                endif
>        endif
> `

+1

> 
> The command output check can also be removed as it is handled when calling the command script itself.

+1

> 
> Thoughts?
> 
> PS. I think the safest way to set CACHELINE_SIZE is to read the cache type register[1] but sadly only few latest kernels 
> have the support through sysfs (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size) 

+1

In summary, +3. LoL

I'll also submit a patch to change the default cacheline size of cortex-a72 with the new flags_*_extra[]


thanks,
Yongseok
Thomas Monjalon April 11, 2019, 11:37 p.m. UTC | #4
10/04/2019 18:13, jerinjacobk@gmail.com:
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> Currently, RTE_* flags are set based on the implementer ID but there might
> be some micro arch specific differences from the same vendor
> eg. CACHE_LINESIZE. Add support to set micro arch specific flags.

I don't like how flags are set in config/arm/meson.build.
It is a real mess to find which flag applies to which machine.
Adding the flags_*_extra in the machine_args_* is adding more mess.

[...]
>  flags_common_default = [
>  	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
>  	# to determine the best threshold in code. Refer to notes in source file
> @@ -52,12 +33,10 @@ flags_generic = [
>  	['RTE_USE_C11_MEM_MODEL', true],
>  	['RTE_CACHE_LINE_SIZE', 128]]
>  flags_cavium = [
> -	['RTE_MACHINE', '"thunderx"'],
>  	['RTE_CACHE_LINE_SIZE', 128],
>  	['RTE_MAX_NUMA_NODES', 2],
>  	['RTE_MAX_LCORE', 96],
> -	['RTE_MAX_VFIO_GROUPS', 128],
> -	['RTE_USE_C11_MEM_MODEL', false]]
> +	['RTE_MAX_VFIO_GROUPS', 128]]
>  flags_dpaa = [
>  	['RTE_MACHINE', '"dpaa"'],
>  	['RTE_USE_C11_MEM_MODEL', true],
> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>  	['RTE_MAX_NUMA_NODES', 1],
>  	['RTE_MAX_LCORE', 16],
>  	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
> +flags_default_extra = []
> +flags_thunderx_extra = [
> +	['RTE_MACHINE', '"thunderx"'],
> +	['RTE_USE_C11_MEM_MODEL', false]]
> +
> +machine_args_generic = [
> +	['default', ['-march=armv8-a+crc+crypto']],
> +	['native', ['-march=native']],
> +	['0xd03', ['-mcpu=cortex-a53']],
> +	['0xd04', ['-mcpu=cortex-a35']],
> +	['0xd07', ['-mcpu=cortex-a57']],
> +	['0xd08', ['-mcpu=cortex-a72']],
> +	['0xd09', ['-mcpu=cortex-a73']],
> +	['0xd0a', ['-mcpu=cortex-a75']]]
> +
> +machine_args_cavium = [
> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> +	['native', ['-march=native']],
> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]

I think we should have a simpler model.
We need only to know the machine name and get all the related machine config.
In native compilation, machine name is guessed from implementor id and pn
(from config/arm/armv8_machine.py). We can directly output the machine name
from this script and leave the naming logic in this script.
In the cross-compilation config files (config/arm/*),
we can just specify the machine name.
Then every machine config (machine_args and dpdk_conf) would be specified
in some arrays based on the machine name.
Of course, we can keep some common default values.

Thoughts?
Yongseok Koh April 12, 2019, 1:59 a.m. UTC | #5
> On Apr 11, 2019, at 4:37 PM, Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> 10/04/2019 18:13, jerinjacobk@gmail.com:
>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> 
>> Currently, RTE_* flags are set based on the implementer ID but there might
>> be some micro arch specific differences from the same vendor
>> eg. CACHE_LINESIZE. Add support to set micro arch specific flags.
> 
> I don't like how flags are set in config/arm/meson.build.
> It is a real mess to find which flag applies to which machine.
> Adding the flags_*_extra in the machine_args_* is adding more mess.
> 
> [...]
>> flags_common_default = [
>> 	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
>> 	# to determine the best threshold in code. Refer to notes in source file
>> @@ -52,12 +33,10 @@ flags_generic = [
>> 	['RTE_USE_C11_MEM_MODEL', true],
>> 	['RTE_CACHE_LINE_SIZE', 128]]
>> flags_cavium = [
>> -	['RTE_MACHINE', '"thunderx"'],
>> 	['RTE_CACHE_LINE_SIZE', 128],
>> 	['RTE_MAX_NUMA_NODES', 2],
>> 	['RTE_MAX_LCORE', 96],
>> -	['RTE_MAX_VFIO_GROUPS', 128],
>> -	['RTE_USE_C11_MEM_MODEL', false]]
>> +	['RTE_MAX_VFIO_GROUPS', 128]]
>> flags_dpaa = [
>> 	['RTE_MACHINE', '"dpaa"'],
>> 	['RTE_USE_C11_MEM_MODEL', true],
>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>> 	['RTE_MAX_NUMA_NODES', 1],
>> 	['RTE_MAX_LCORE', 16],
>> 	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>> +flags_default_extra = []
>> +flags_thunderx_extra = [
>> +	['RTE_MACHINE', '"thunderx"'],
>> +	['RTE_USE_C11_MEM_MODEL', false]]
>> +
>> +machine_args_generic = [
>> +	['default', ['-march=armv8-a+crc+crypto']],
>> +	['native', ['-march=native']],
>> +	['0xd03', ['-mcpu=cortex-a53']],
>> +	['0xd04', ['-mcpu=cortex-a35']],
>> +	['0xd07', ['-mcpu=cortex-a57']],
>> +	['0xd08', ['-mcpu=cortex-a72']],
>> +	['0xd09', ['-mcpu=cortex-a73']],
>> +	['0xd0a', ['-mcpu=cortex-a75']]]
>> +
>> +machine_args_cavium = [
>> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>> +	['native', ['-march=native']],
>> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
> 
> I think we should have a simpler model.
> We need only to know the machine name and get all the related machine config.
> In native compilation, machine name is guessed from implementor id and pn

It can be guessed unless machine name is specified as an option.
We can add machine_name option?
One observation. We do have machine option but it isn't used much for arm.
And in config/arm/meson.build, march_opt isn't used either.

> (from config/arm/armv8_machine.py). We can directly output the machine name
> from this script and leave the naming logic in this script.
> In the cross-compilation config files (config/arm/*),
> we can just specify the machine name.
> Then every machine config (machine_args and dpdk_conf) would be specified
> in some arrays based on the machine name.
> Of course, we can keep some common default values.
> 
> Thoughts?

Sounds like we can cleanly reorganize it with the suggestion.

Thanks,
Yongseok
Yongseok Koh April 12, 2019, 2:04 a.m. UTC | #6
> On Apr 11, 2019, at 1:12 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
>> 
>> On Apr 10, 2019, at 11:07 PM, Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com> wrote:
>> 
>> Hi Yongseok,
>> 
>>> -----Original Message-----
>>> From: Yongseok Koh <yskoh@mellanox.com>
>>> Sent: Wednesday, April 10, 2019 11:08 PM
>>> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>>> Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Jerin
>>> Jacob Kollanukkaran <jerinj@marvell.com>; jerinjacobk@gmail.com
>>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>>> machine specific flags
>>> 
>>> External Email
>>> 
>>> ----------------------------------------------------------------------
>>> 
>>>> On Apr 10, 2019, at 9:13 AM, jerinjacobk@gmail.com wrote:
>>>> 
>>>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>>> 
>>>> Currently, RTE_* flags are set based on the implementer ID but there
>>>> might be some micro arch specific differences from the same vendor eg.
>>>> CACHE_LINESIZE. Add support to set micro arch specific flags.
>>>> 
>>>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>>> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>>>> ---
>>>> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
>>>> 1 file changed, 32 insertions(+), 24 deletions(-)
>>>> 
>>>> diff --git a/config/arm/meson.build b/config/arm/meson.build index
>>>> 170a4981a..24bce2b39 100644
>>>> --- a/config/arm/meson.build
>>>> +++ b/config/arm/meson.build
>>>> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
>>>> 
>>>> arm_force_native_march = false
>>>> 
>>>> -machine_args_generic = [
>>>> -	['default', ['-march=armv8-a+crc+crypto']],
>>>> -	['native', ['-march=native']],
>>>> -	['0xd03', ['-mcpu=cortex-a53']],
>>>> -	['0xd04', ['-mcpu=cortex-a35']],
>>>> -	['0xd05', ['-mcpu=cortex-a55']],
>>>> -	['0xd07', ['-mcpu=cortex-a57']],
>>>> -	['0xd08', ['-mcpu=cortex-a72']],
>>>> -	['0xd09', ['-mcpu=cortex-a73']],
>>>> -	['0xd0a', ['-mcpu=cortex-a75']],
>>>> -	['0xd0b', ['-mcpu=cortex-a76']],
>>>> -]
>>>> -machine_args_cavium = [
>>>> -	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>>> -	['native', ['-march=native']],
>>>> -	['0xa1', ['-mcpu=thunderxt88']],
>>>> -	['0xa2', ['-mcpu=thunderxt81']],
>>>> -	['0xa3', ['-mcpu=thunderxt83']]]
>>>> -
>>>> flags_common_default = [
>>>> 	# Accelarate rte_memcpy. Be sure to run unit test
>>> (memcpy_perf_autotest)
>>>> 	# to determine the best threshold in code. Refer to notes in source
>>>> file @@ -52,12 +33,10 @@ flags_generic = [
>>>> 	['RTE_USE_C11_MEM_MODEL', true],
>>>> 	['RTE_CACHE_LINE_SIZE', 128]]
>>>> flags_cavium = [
>>>> -	['RTE_MACHINE', '"thunderx"'],
>>>> 	['RTE_CACHE_LINE_SIZE', 128],
>>>> 	['RTE_MAX_NUMA_NODES', 2],
>>>> 	['RTE_MAX_LCORE', 96],
>>>> -	['RTE_MAX_VFIO_GROUPS', 128],
>>>> -	['RTE_USE_C11_MEM_MODEL', false]]
>>>> +	['RTE_MAX_VFIO_GROUPS', 128]]
>>>> flags_dpaa = [
>>>> 	['RTE_MACHINE', '"dpaa"'],
>>>> 	['RTE_USE_C11_MEM_MODEL', true],
>>>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>>>> 	['RTE_MAX_NUMA_NODES', 1],
>>>> 	['RTE_MAX_LCORE', 16],
>>>> 	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>>>> +flags_default_extra = []
>>>> +flags_thunderx_extra = [
>>>> +	['RTE_MACHINE', '"thunderx"'],
>>>> +	['RTE_USE_C11_MEM_MODEL', false]]
>>>> +
>>>> +machine_args_generic = [
>>>> +	['default', ['-march=armv8-a+crc+crypto']],
>>>> +	['native', ['-march=native']],
>>>> +	['0xd03', ['-mcpu=cortex-a53']],
>>>> +	['0xd04', ['-mcpu=cortex-a35']],
>>>> +	['0xd07', ['-mcpu=cortex-a57']],
>>>> +	['0xd08', ['-mcpu=cortex-a72']],
>>>> +	['0xd09', ['-mcpu=cortex-a73']],
>>>> +	['0xd0a', ['-mcpu=cortex-a75']]]
>>>> +
>>>> +machine_args_cavium = [
>>>> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>>> +	['native', ['-march=native']],
>>>> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>>>> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>>>> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
>>>> 
>>>> ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page
>>>> G7-5321) impl_generic = ['Generic armv8', flags_generic,
>>>> machine_args_generic] @@ -157,8 +157,16 @@ else
>>>> 	endif
>>>> 	foreach marg: machine[2]
>>>> 		if marg[0] == impl_pn
>>>> -			foreach f: marg[1]
>>>> -				machine_args += f
>>>> +			foreach flag: marg[1]
>>>> +				if cc.has_argument(flag)
>>>> +					machine_args += flag
>>>> +				endif
>>>> +			endforeach
>>>> +			# Apply any extra machine specific flags.
>>>> +			foreach flag: marg.get(2, flags_default_extra)
>>>> +				if flag.length() > 0
>>>> +					dpdk_conf.set(flag[0], flag[1])
>>>> +				endif
>>> 
>>> Let me continue the discussion from v7 here.
>>> Seems I wan't clear enough.
>>> 
>>> Let me take an example. If the host is thunderx2 (0xaf) and compiler is older
>>> than v7, flags_thunderx2_extra isn't set. This means, for example,
>>> RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
>>> RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might
>>> want to set it regardless of gcc version. You could skip setting -mcpu with setting
>>> the extra flags.
>>> 
>> 
>> Thanks for the detailed explanation.
>> I think since we have the check to skip mcpu flag when cc doesn't support it (cc.has_argument(flag))
>> It will be safe to remove 
>> `
>>       # Primary part number based mcpu flags are supported
>>       # for gcc versions > 7
>>       if cc.version().version_compare(
>>                       '<7.0') or cmd_output.length() == 0
>>               if not meson.is_cross_build() and arm_force_native_march == true
>>                       impl_pn = 'native'
>>               else
>>                       impl_pn = 'default'
>>               endif
>>       endif
>> `
> 
> +1

I've tested it but still have an issue with old gcc.
Even if -mcpu isn't set due to cc.has_argument(), -march isn't set either.
So, it spews error due to lack of CRC feature.
-march should have '+crc'. The error I got was:

> ninja: Entering directory `build'
> [942/1452] Compiling C object 'drivers/drivers...c@sta/net_softnic_rte_eth_softnic_action.c.o'.
> FAILED: drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o
> cc -Idrivers/drivers@@tmp_rte_pmd_softnic@sta -Idrivers -I../drivers -Idrivers/net/softnic -I../drivers/net/softnic -Ilib/librte_ethdev -I../lib/librte_ethdev -I. -I../ -Iconfig -I../config-Ilib/librte_eal/common/include -I../lib/librte_eal/common/include -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common -I../lib/librte_eal/common -Ilib/librte_eal/
> common/include/arch/arm -I../lib/librte_eal/common/include/arch/arm -Ilib/librte_eal -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_cmdline -I../lib/librte_cmdline -Ilib/lib
> rte_meter -I../lib/librte_meter -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/librte_pipeline -I../lib/librte_pipeline -Ilib/librte_port -I../lib/librte_port -Ilib/librte_sched -I../lib/librte_sched -Ilib/librte_ip_frag -I../lib/librte_ip_frag -Ilib/librte_h
> ash -I../lib/librte_hash -Ilib/librte_cryptodev -I../lib/librte_cryptodev -Ilib/librte_kni -I../lib/librte_kni -Ilib/librte_table -I../lib/librte_table -Ilib/librte_lpm -I../lib/librte_lpm -Ilib/librte_acl -I../lib/librte_acl -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERI
> MENTAL_API  -MD -MQ 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o' -MF 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o.d' -o 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o' -c ../drivers/net/softnic/rte_eth_softnic_action.c
> {standard input}: Assembler messages:
> {standard input}:14: Error: selected processor does not support `crc32cx w3,w3,x0'
> {standard input}:37: Error: selected processor does not support `crc32cx w1,w1,x3'
> {standard input}:40: Error: selected processor does not support `crc32cx w0,w0,x2'


My machine has 0x41(Arm) and 0xd08(cortex-a72). gcc is '4.8.5 20150623 (Red Hat 4.8.5-28)'

Thanks,
Yongseok


> 
>> 
>> The command output check can also be removed as it is handled when calling the command script itself.
> 
> +1
> 
>> 
>> Thoughts?
>> 
>> PS. I think the safest way to set CACHELINE_SIZE is to read the cache type register[1] but sadly only few latest kernels 
>> have the support through sysfs (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size) 
> 
> +1
> 
> In summary, +3. LoL
> 
> I'll also submit a patch to change the default cacheline size of cortex-a72 with the new flags_*_extra[]
> 
> 
> thanks,
> Yongseok
Jerin Jacob Kollanukkaran April 12, 2019, 6:07 a.m. UTC | #7
> -----Original Message-----
> From: Yongseok Koh <yskoh@mellanox.com>
> Sent: Friday, April 12, 2019 7:35 AM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Jerin
> Jacob Kollanukkaran <jerinj@marvell.com>; jerinjacobk@gmail.com
> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
> machine specific flags
> 
> External Email
> 
> I've tested it but still have an issue with old gcc.
> Even if -mcpu isn't set due to cc.has_argument(), -march isn't set either.
> So, it spews error due to lack of CRC feature.
> -march should have '+crc'. The error I got was:
> 
> > ninja: Entering directory `build'
> > [942/1452] Compiling C object
> 'drivers/drivers...c@sta/net_softnic_rte_eth_softnic_action.c.o'.
> > FAILED:
> >
> drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
> _a
> > ction.c.o cc -Idrivers/drivers@@tmp_rte_pmd_softnic@sta -Idrivers
> > -I../drivers -Idrivers/net/softnic -I../drivers/net/softnic
> > -Ilib/librte_ethdev -I../lib/librte_ethdev -I. -I../ -Iconfig
> > -I../config-Ilib/librte_eal/common/include
> > -I../lib/librte_eal/common/include
> > -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common
> > -I../lib/librte_eal/common -Ilib/librte_eal/ common/include/arch/arm
> > -I../lib/librte_eal/common/include/arch/arm -Ilib/librte_eal
> > -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs
> > -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf
> > -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool
> > -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_cmdline
> > -I../lib/librte_cmdline -Ilib/lib rte_meter -I../lib/librte_meter
> > -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux
> > -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev
> > -I../drivers/bus/vdev -Ilib/librte_pipeline -I../lib/librte_pipeline
> > -Ilib/librte_port -I../lib/librte_port -Ilib/librte_sched
> > -I../lib/librte_sched -Ilib/librte_ip_frag -I../lib/librte_ip_frag
> > -Ilib/librte_h ash -I../lib/librte_hash -Ilib/librte_cryptodev
> > -I../lib/librte_cryptodev -Ilib/librte_kni -I../lib/librte_kni
> > -Ilib/librte_table -I../lib/librte_table -Ilib/librte_lpm
> > -I../lib/librte_lpm -Ilib/librte_acl -I../lib/librte_acl -pipe
> > -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h
> > -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERI
> > MENTAL_API  -MD -MQ
> >
> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
> _
> > action.c.o' -MF
> >
> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
> _
> > action.c.o.d' -o
> >
> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
> _
> > action.c.o' -c ../drivers/net/softnic/rte_eth_softnic_action.c
> > {standard input}: Assembler messages:
> > {standard input}:14: Error: selected processor does not support `crc32cx
> w3,w3,x0'
> > {standard input}:37: Error: selected processor does not support `crc32cx
> w1,w1,x3'
> > {standard input}:40: Error: selected processor does not support `crc32cx
> w0,w0,x2'
> 
> 
> My machine has 0x41(Arm) and 0xd08(cortex-a72). gcc is '4.8.5 20150623 (Red
> Hat 4.8.5-28)'

Are you testing with very latest master where the following patch available in build?
http://patches.dpdk.org/patch/52367/
It should fix that issue.


> 
> Thanks,
> Yongseok
> 
> 
> >
> >>
> >> The command output check can also be removed as it is handled when
> calling the command script itself.
> >
> > +1
> >
> >>
> >> Thoughts?
> >>
> >> PS. I think the safest way to set CACHELINE_SIZE is to read the cache
> >> type register[1] but sadly only few latest kernels have the support
> >> through sysfs
> >> (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size)
> >
> > +1
> >
> > In summary, +3. LoL
> >
> > I'll also submit a patch to change the default cacheline size of
> > cortex-a72 with the new flags_*_extra[]
> >
> >
> > thanks,
> > Yongseok
Yongseok Koh April 12, 2019, 6:43 a.m. UTC | #8
> On Apr 11, 2019, at 11:07 PM, Jerin Jacob Kollanukkaran <jerinj@marvell.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Yongseok Koh <yskoh@mellanox.com>
>> Sent: Friday, April 12, 2019 7:35 AM
>> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>> Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Jerin
>> Jacob Kollanukkaran <jerinj@marvell.com>; jerinjacobk@gmail.com
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>> machine specific flags
>> 
>> External Email
>> 
>> I've tested it but still have an issue with old gcc.
>> Even if -mcpu isn't set due to cc.has_argument(), -march isn't set either.
>> So, it spews error due to lack of CRC feature.
>> -march should have '+crc'. The error I got was:
>> 
>>> ninja: Entering directory `build'
>>> [942/1452] Compiling C object
>> 'drivers/drivers...c@sta/net_softnic_rte_eth_softnic_action.c.o'.
>>> FAILED:
>>> 
>> drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>> _a
>>> ction.c.o cc -Idrivers/drivers@@tmp_rte_pmd_softnic@sta -Idrivers
>>> -I../drivers -Idrivers/net/softnic -I../drivers/net/softnic
>>> -Ilib/librte_ethdev -I../lib/librte_ethdev -I. -I../ -Iconfig
>>> -I../config-Ilib/librte_eal/common/include
>>> -I../lib/librte_eal/common/include
>>> -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common
>>> -I../lib/librte_eal/common -Ilib/librte_eal/ common/include/arch/arm
>>> -I../lib/librte_eal/common/include/arch/arm -Ilib/librte_eal
>>> -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs
>>> -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf
>>> -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool
>>> -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_cmdline
>>> -I../lib/librte_cmdline -Ilib/lib rte_meter -I../lib/librte_meter
>>> -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux
>>> -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev
>>> -I../drivers/bus/vdev -Ilib/librte_pipeline -I../lib/librte_pipeline
>>> -Ilib/librte_port -I../lib/librte_port -Ilib/librte_sched
>>> -I../lib/librte_sched -Ilib/librte_ip_frag -I../lib/librte_ip_frag
>>> -Ilib/librte_h ash -I../lib/librte_hash -Ilib/librte_cryptodev
>>> -I../lib/librte_cryptodev -Ilib/librte_kni -I../lib/librte_kni
>>> -Ilib/librte_table -I../lib/librte_table -Ilib/librte_lpm
>>> -I../lib/librte_lpm -Ilib/librte_acl -I../lib/librte_acl -pipe
>>> -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h
>>> -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERI
>>> MENTAL_API  -MD -MQ
>>> 
>> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>> _
>>> action.c.o' -MF
>>> 
>> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>> _
>>> action.c.o.d' -o
>>> 
>> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>> _
>>> action.c.o' -c ../drivers/net/softnic/rte_eth_softnic_action.c
>>> {standard input}: Assembler messages:
>>> {standard input}:14: Error: selected processor does not support `crc32cx
>> w3,w3,x0'
>>> {standard input}:37: Error: selected processor does not support `crc32cx
>> w1,w1,x3'
>>> {standard input}:40: Error: selected processor does not support `crc32cx
>> w0,w0,x2'
>> 
>> 
>> My machine has 0x41(Arm) and 0xd08(cortex-a72). gcc is '4.8.5 20150623 (Red
>> Hat 4.8.5-28)'
> 
> Are you testing with very latest master where the following patch available in build?
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatches.dpdk.org%2Fpatch%2F52367%2F&amp;data=02%7C01%7Cyskoh%40mellanox.com%7C9adebf1529ab4bbc189708d6bf0d1d00%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636906460384104429&amp;sdata=vG65lezE%2BZacEpn38cUoozwEYm%2BBUGvuYBQ2ToEKnSI%3D&amp;reserved=0
> It should fix that issue.

Thanks, that fixes the issue.
But I've encountered another one. Are you aware of this?

ninja: Entering directory `build'
[1151/1452] Compiling C object 'drivers/drivers@@tmp_r...d_octeontx_event@sta/event_octeontx_timvf_worker.c.o'.
FAILED: drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_timvf_worker.c.o
cc -Idrivers/drivers@@tmp_rte_pmd_octeontx_event@sta -Idrivers -I../drivers -Idrivers/event/octeontx -I../drivers/event/octeontx -Ilib/librte_eventdev -I../lib/librte_eventdev -I. -I../ -Iconfig -I../config -Ilib/librte_eal/common/include -I../lib/librte_eal/common/include -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common -I../lib/librte_eal/common -Ilib/librte_eal/common/include/arch/arm -I../lib/librte_eal/co
mmon/include/arch/arm -Ilib/librte_eal -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_ethdev -I../lib/librte_ethdev -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool -Ilib/librte_cmdline -I../lib/librte_cmdline -Ilib/librte_meter -I../lib/librte_meter -Ilib/librte_hash -I../lib/librte_h
ash -Ilib/librte_timer -I../lib/librte_timer -Ilib/librte_cryptodev -I../lib/librte_cryptodev -Idrivers/common/octeontx -I../drivers/common/octeontx -Idrivers/mempool/octeontx -I../drivers/mempool/octeontx -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Idrivers/net/octeontx -I../drivers/net/octeontx -Idrivers/net/octeontx/base
 -I../drivers/net/octeontx/base -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERIMENTAL_API  -MD -MQ 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_timvf_worker.c.o' -MF 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_timvf_worker.c.o.d' -o 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_t
imvf_worker.c.o' -c ../drivers/event/octeontx/timvf_worker.c
../drivers/event/octeontx/timvf_worker.c: In function ‘timvf_timer_arm_burst_sp’:
../drivers/event/octeontx/timvf_worker.c:88:1: error: could not split insn
 }
 ^
(insn 95 98 99 (parallel [
            (set (reg:DI 3 x3 [orig:98 D.8656 ] [98])
                (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64]))
            (set (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64])
                (unspec_volatile:DI [
                        (plus:DI (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64])
                            (const_int -281474976710656 [0xffff000000000000]))
                        (const_int 0 [0])
                    ] UNSPECV_ATOMIC_OP))
            (clobber (reg:CC 66 cc))
            (clobber (reg:DI 0 x0))
            (clobber (reg:SI 1 x1))
        ]) ../drivers/event/octeontx/timvf_worker.h:95 1832 {atomic_fetch_adddi}
     (expr_list:REG_UNUSED (reg:CC 66 cc)
        (expr_list:REG_UNUSED (reg:SI 1 x1)
            (expr_list:REG_UNUSED (reg:DI 0 x0)
                (nil)))))
../drivers/event/octeontx/timvf_worker.c:88:1: internal compiler error: in final_scan_insn, at final.c:2897
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
{standard input}: Assembler messages:
{standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
Preprocessed source stored into /tmp/ccnQRbOm.out file, please attach this to your bugreport.
[1168/1452] Compiling C object 'drivers/drivers@@tmp_r...ntx_crypto@sta/crypto_octeontx_otx_cryptodev_ops.c.o'.
ninja: build stopped: subcommand failed.

Thanks
Yongseok
Jerin Jacob Kollanukkaran April 12, 2019, 7 a.m. UTC | #9
> -----Original Message-----
> From: Yongseok Koh <yskoh@mellanox.com>
> Sent: Friday, April 12, 2019 12:14 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Thomas
> Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>;
> jerinjacobk@gmail.com
> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
> machine specific flags
> 
> 
> Thanks, that fixes the issue.
> But I've encountered another one. Are you aware of this?

Yes. It is a compiler bug. This patch set is NOT introducing this.

Fixed same on legacy build with arm64 . If you are using < 4.8.6 compiler
For meson, this patched needs to be ported to meson

$ git show f3af3e44a444cdfe3fa7b3e2c042be351401eb23
commit f3af3e44a444cdfe3fa7b3e2c042be351401eb23
Author: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Date:   Mon Sep 3 15:01:10 2018 +0530

    mk: disable OcteonTx for buggy compilers only on arm64
    
    Disable octeontx for gcc 4.8.5 as the compiler is emitting "internal
    compiler error" for aarch64. The GCC "internal compiler error" was
    observed only for arm64 architecture so disable the PMD only
    for arm64.
    
    Fixes: 4f760550a093 ("mk: disable OcteonTx for buggy compilers")
    Cc: stable@dpdk.org
    
    Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
    Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

> 
> ninja: Entering directory `build'
> [1151/1452] Compiling C object
> 'drivers/drivers@@tmp_r...d_octeontx_event@sta/event_octeontx_timvf_
> worker.c.o'.
> FAILED:
> drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_ti
> mvf_worker.c.o
> cc -Idrivers/drivers@@tmp_rte_pmd_octeontx_event@sta -Idrivers -
> I../drivers -Idrivers/event/octeontx -I../drivers/event/octeontx -
> Ilib/librte_eventdev -I../lib/librte_eventdev -I. -I../ -Iconfig -I../config -
> Ilib/librte_eal/common/include -I../lib/librte_eal/common/include -
> I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common -
> I../lib/librte_eal/common -Ilib/librte_eal/common/include/arch/arm -
> I../lib/librte_eal/co mmon/include/arch/arm -Ilib/librte_eal -I../lib/librte_eal
> -Ilib/librte_kvargs -I../lib/librte_kvargs -Ilib/librte_ring -I../lib/librte_ring -
> Ilib/librte_ethdev -I../lib/librte_ethdev -Ilib/librte_net -I../lib/librte_net -
> Ilib/librte_mbuf -I../lib/librte_mbuf -Ilib/librte_mempool -
> I../lib/librte_mempool -Ilib/librte_cmdline -I../lib/librte_cmdline -
> Ilib/librte_meter -I../lib/librte_meter -Ilib/librte_hash -I../lib/librte_h ash -
> Ilib/librte_timer -I../lib/librte_timer -Ilib/librte_cryptodev -
> I../lib/librte_cryptodev -Idrivers/common/octeontx -
> I../drivers/common/octeontx -Idrivers/mempool/octeontx -
> I../drivers/mempool/octeontx -Idrivers/bus/pci -I../drivers/bus/pci -
> I../drivers/bus/pci/linux -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev -
> I../drivers/bus/vdev -Idrivers/net/octeontx -I../drivers/net/octeontx -
> Idrivers/net/octeontx/base  -I../drivers/net/octeontx/base -pipe -
> D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h -
> Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -
> DALLOW_EXPERIMENTAL_API  -MD -MQ
> 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_ti
> mvf_worker.c.o' -MF
> 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_ti
> mvf_worker.c.o.d' -o
> 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_t
> imvf_worker.c.o' -c ../drivers/event/octeontx/timvf_worker.c
> ../drivers/event/octeontx/timvf_worker.c: In function
> ‘timvf_timer_arm_burst_sp’:
> ../drivers/event/octeontx/timvf_worker.c:88:1: error: could not split insn  }
> ^ (insn 95 98 99 (parallel [
>             (set (reg:DI 3 x3 [orig:98 D.8656 ] [98])
>                 (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64]))
>             (set (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64])
>                 (unspec_volatile:DI [
>                         (plus:DI (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8
> A64])
>                             (const_int -281474976710656 [0xffff000000000000]))
>                         (const_int 0 [0])
>                     ] UNSPECV_ATOMIC_OP))
>             (clobber (reg:CC 66 cc))
>             (clobber (reg:DI 0 x0))
>             (clobber (reg:SI 1 x1))
>         ]) ../drivers/event/octeontx/timvf_worker.h:95 1832
> {atomic_fetch_adddi}
>      (expr_list:REG_UNUSED (reg:CC 66 cc)
>         (expr_list:REG_UNUSED (reg:SI 1 x1)
>             (expr_list:REG_UNUSED (reg:DI 0 x0)
>                 (nil)))))
> ../drivers/event/octeontx/timvf_worker.c:88:1: internal compiler error: in
> final_scan_insn, at final.c:2897 Please submit a full bug report, with
> preprocessed source if appropriate.
> See <http://bugzilla.redhat.com/bugzilla> for instructions.
> {standard input}: Assembler messages:
> {standard input}: Error: open CFI at the end of file; missing .cfi_endproc
> directive Preprocessed source stored into /tmp/ccnQRbOm.out file, please
> attach this to your bugreport.
> [1168/1452] Compiling C object
> 'drivers/drivers@@tmp_r...ntx_crypto@sta/crypto_octeontx_otx_cryptode
> v_ops.c.o'.
> ninja: build stopped: subcommand failed.
> 
> Thanks
> Yongseok
Yongseok Koh April 12, 2019, 7:09 a.m. UTC | #10
> On Apr 11, 2019, at 11:43 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> 
>> On Apr 11, 2019, at 11:07 PM, Jerin Jacob Kollanukkaran <jerinj@marvell.com> wrote:
>> 
>> 
>> 
>>> -----Original Message-----
>>> From: Yongseok Koh <yskoh@mellanox.com>
>>> Sent: Friday, April 12, 2019 7:35 AM
>>> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>>> Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Jerin
>>> Jacob Kollanukkaran <jerinj@marvell.com>; jerinjacobk@gmail.com
>>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>>> machine specific flags
>>> 
>>> External Email
>>> 
>>> I've tested it but still have an issue with old gcc.
>>> Even if -mcpu isn't set due to cc.has_argument(), -march isn't set either.
>>> So, it spews error due to lack of CRC feature.
>>> -march should have '+crc'. The error I got was:
>>> 
>>>> ninja: Entering directory `build'
>>>> [942/1452] Compiling C object
>>> 'drivers/drivers...c@sta/net_softnic_rte_eth_softnic_action.c.o'.
>>>> FAILED:
>>>> 
>>> drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>>> _a
>>>> ction.c.o cc -Idrivers/drivers@@tmp_rte_pmd_softnic@sta -Idrivers
>>>> -I../drivers -Idrivers/net/softnic -I../drivers/net/softnic
>>>> -Ilib/librte_ethdev -I../lib/librte_ethdev -I. -I../ -Iconfig
>>>> -I../config-Ilib/librte_eal/common/include
>>>> -I../lib/librte_eal/common/include
>>>> -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common
>>>> -I../lib/librte_eal/common -Ilib/librte_eal/ common/include/arch/arm
>>>> -I../lib/librte_eal/common/include/arch/arm -Ilib/librte_eal
>>>> -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs
>>>> -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf
>>>> -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool
>>>> -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_cmdline
>>>> -I../lib/librte_cmdline -Ilib/lib rte_meter -I../lib/librte_meter
>>>> -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux
>>>> -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev
>>>> -I../drivers/bus/vdev -Ilib/librte_pipeline -I../lib/librte_pipeline
>>>> -Ilib/librte_port -I../lib/librte_port -Ilib/librte_sched
>>>> -I../lib/librte_sched -Ilib/librte_ip_frag -I../lib/librte_ip_frag
>>>> -Ilib/librte_h ash -I../lib/librte_hash -Ilib/librte_cryptodev
>>>> -I../lib/librte_cryptodev -Ilib/librte_kni -I../lib/librte_kni
>>>> -Ilib/librte_table -I../lib/librte_table -Ilib/librte_lpm
>>>> -I../lib/librte_lpm -Ilib/librte_acl -I../lib/librte_acl -pipe
>>>> -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h
>>>> -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERI
>>>> MENTAL_API  -MD -MQ
>>>> 
>>> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>>> _
>>>> action.c.o' -MF
>>>> 
>>> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>>> _
>>>> action.c.o.d' -o
>>>> 
>>> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic
>>> _
>>>> action.c.o' -c ../drivers/net/softnic/rte_eth_softnic_action.c
>>>> {standard input}: Assembler messages:
>>>> {standard input}:14: Error: selected processor does not support `crc32cx
>>> w3,w3,x0'
>>>> {standard input}:37: Error: selected processor does not support `crc32cx
>>> w1,w1,x3'
>>>> {standard input}:40: Error: selected processor does not support `crc32cx
>>> w0,w0,x2'
>>> 
>>> 
>>> My machine has 0x41(Arm) and 0xd08(cortex-a72). gcc is '4.8.5 20150623 (Red
>>> Hat 4.8.5-28)'
>> 
>> Are you testing with very latest master where the following patch available in build?
>> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatches.dpdk.org%2Fpatch%2F52367%2F&amp;data=02%7C01%7Cyskoh%40mellanox.com%7C5909260f30a64e07a95108d6bf123bde%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636906482386596820&amp;sdata=4%2BfRfELXK37SNY4wdFNGPF8lpU7S3DEfPoDfAH5K7GE%3D&amp;reserved=0
>> It should fix that issue.
> 
> Thanks, that fixes the issue.
> But I've encountered another one. Are you aware of this?
> 
> ninja: Entering directory `build'
> [1151/1452] Compiling C object 'drivers/drivers@@tmp_r...d_octeontx_event@sta/event_octeontx_timvf_worker.c.o'.
> FAILED: drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_timvf_worker.c.o
> cc -Idrivers/drivers@@tmp_rte_pmd_octeontx_event@sta -Idrivers -I../drivers -Idrivers/event/octeontx -I../drivers/event/octeontx -Ilib/librte_eventdev -I../lib/librte_eventdev -I. -I../ -Iconfig -I../config -Ilib/librte_eal/common/include -I../lib/librte_eal/common/include -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common -I../lib/librte_eal/common -Ilib/librte_eal/common/include/arch/arm -I../lib/librte_eal/co
> mmon/include/arch/arm -Ilib/librte_eal -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_ethdev -I../lib/librte_ethdev -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool -Ilib/librte_cmdline -I../lib/librte_cmdline -Ilib/librte_meter -I../lib/librte_meter -Ilib/librte_hash -I../lib/librte_h
> ash -Ilib/librte_timer -I../lib/librte_timer -Ilib/librte_cryptodev -I../lib/librte_cryptodev -Idrivers/common/octeontx -I../drivers/common/octeontx -Idrivers/mempool/octeontx -I../drivers/mempool/octeontx -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Idrivers/net/octeontx -I../drivers/net/octeontx -Idrivers/net/octeontx/base
> -I../drivers/net/octeontx/base -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERIMENTAL_API  -MD -MQ 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_timvf_worker.c.o' -MF 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_timvf_worker.c.o.d' -o 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_t
> imvf_worker.c.o' -c ../drivers/event/octeontx/timvf_worker.c
> ../drivers/event/octeontx/timvf_worker.c: In function ‘timvf_timer_arm_burst_sp’:
> ../drivers/event/octeontx/timvf_worker.c:88:1: error: could not split insn
> }
> ^
> (insn 95 98 99 (parallel [
>            (set (reg:DI 3 x3 [orig:98 D.8656 ] [98])
>                (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64]))
>            (set (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64])
>                (unspec_volatile:DI [
>                        (plus:DI (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64])
>                            (const_int -281474976710656 [0xffff000000000000]))
>                        (const_int 0 [0])
>                    ] UNSPECV_ATOMIC_OP))
>            (clobber (reg:CC 66 cc))
>            (clobber (reg:DI 0 x0))
>            (clobber (reg:SI 1 x1))
>        ]) ../drivers/event/octeontx/timvf_worker.h:95 1832 {atomic_fetch_adddi}
>     (expr_list:REG_UNUSED (reg:CC 66 cc)
>        (expr_list:REG_UNUSED (reg:SI 1 x1)
>            (expr_list:REG_UNUSED (reg:DI 0 x0)
>                (nil)))))
> ../drivers/event/octeontx/timvf_worker.c:88:1: internal compiler error: in final_scan_insn, at final.c:2897
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbugzilla.redhat.com%2Fbugzilla&amp;data=02%7C01%7Cyskoh%40mellanox.com%7C5909260f30a64e07a95108d6bf123bde%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636906482386596820&amp;sdata=2AH1gInkxui7UEDb7LLNppMxDEaf%2F5N5TEHhDRTSDJY%3D&amp;reserved=0> for instructions.
> {standard input}: Assembler messages:
> {standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
> Preprocessed source stored into /tmp/ccnQRbOm.out file, please attach this to your bugreport.
> [1168/1452] Compiling C object 'drivers/drivers@@tmp_r...ntx_crypto@sta/crypto_octeontx_otx_cryptodev_ops.c.o'.
> ninja: build stopped: subcommand failed.


One more issue.
With gcc7.2, crypto isn't enabled if -mcpu is set.
How about thunderx/octeon?
Looks it should be like -mcpu=cortex-a72+crypto
I'll take care of this in a separate patch.
Because I want to add a new option to control it.
The reason is a binary having crypto support can't be run on a cpu w/o crypto extension, it is panicked.
And -mcpu=cortex-a72 includes 'crc' support by default.


Thanks,
Yongseok
Jerin Jacob Kollanukkaran April 12, 2019, 7:12 a.m. UTC | #11
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, April 12, 2019 5:07 AM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Jerin Jacob
> Kollanukkaran <jerinj@marvell.com>
> Cc: dev@dpdk.org; jerinjacobk@gmail.com; yskoh@mellanox.com;
> bruce.richardson@intel.com
> Subject: Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support machine
> specific flags
> 
> 10/04/2019 18:13, jerinjacobk@gmail.com:
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > Currently, RTE_* flags are set based on the implementer ID but there
> > might be some micro arch specific differences from the same vendor eg.
> > CACHE_LINESIZE. Add support to set micro arch specific flags.
> 
> I don't like how flags are set in config/arm/meson.build.
> It is a real mess to find which flag applies to which machine.
> Adding the flags_*_extra in the machine_args_* is adding more mess.
> 
> [...]
> >  flags_common_default = [
> >  	# Accelarate rte_memcpy. Be sure to run unit test
> (memcpy_perf_autotest)
> >  	# to determine the best threshold in code. Refer to notes in source
> > file @@ -52,12 +33,10 @@ flags_generic = [
> >  	['RTE_USE_C11_MEM_MODEL', true],
> >  	['RTE_CACHE_LINE_SIZE', 128]]
> >  flags_cavium = [
> > -	['RTE_MACHINE', '"thunderx"'],
> >  	['RTE_CACHE_LINE_SIZE', 128],
> >  	['RTE_MAX_NUMA_NODES', 2],
> >  	['RTE_MAX_LCORE', 96],
> > -	['RTE_MAX_VFIO_GROUPS', 128],
> > -	['RTE_USE_C11_MEM_MODEL', false]]
> > +	['RTE_MAX_VFIO_GROUPS', 128]]
> >  flags_dpaa = [
> >  	['RTE_MACHINE', '"dpaa"'],
> >  	['RTE_USE_C11_MEM_MODEL', true],
> > @@ -71,6 +50,27 @@ flags_dpaa2 = [
> >  	['RTE_MAX_NUMA_NODES', 1],
> >  	['RTE_MAX_LCORE', 16],
> >  	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
> > +flags_default_extra = []
> > +flags_thunderx_extra = [
> > +	['RTE_MACHINE', '"thunderx"'],
> > +	['RTE_USE_C11_MEM_MODEL', false]]
> > +
> > +machine_args_generic = [
> > +	['default', ['-march=armv8-a+crc+crypto']],
> > +	['native', ['-march=native']],
> > +	['0xd03', ['-mcpu=cortex-a53']],
> > +	['0xd04', ['-mcpu=cortex-a35']],
> > +	['0xd07', ['-mcpu=cortex-a57']],
> > +	['0xd08', ['-mcpu=cortex-a72']],
> > +	['0xd09', ['-mcpu=cortex-a73']],
> > +	['0xd0a', ['-mcpu=cortex-a75']]]
> > +
> > +machine_args_cavium = [
> > +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> > +	['native', ['-march=native']],
> > +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
> > +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
> > +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
> 
> I think we should have a simpler model.
> We need only to know the machine name and get all the related machine
> config.
> In native compilation, machine name is guessed from implementor id and pn
> (from config/arm/armv8_machine.py). We can directly output the machine
> name from this script and leave the naming logic in this script.
> In the cross-compilation config files (config/arm/*), we can just specify the
> machine name.
> Then every machine config (machine_args and dpdk_conf) would be
> specified in some arrays based on the machine name.
> Of course, we can keep some common default values.

Thomas,

This patch was around last three months. It reached upto v8.
I think, in that last minute for RC2, We cannot take major rework on this as it needs to tested for
Other arm64 platform too. It was pulled out from RC1 because other pcap issue from meson.
Now its not fair to say to rework the meson stuff now.
I suggest to take other rework in next release.


> 
> Thoughts?
>
Yongseok Koh April 12, 2019, 7:34 a.m. UTC | #12
> On Apr 12, 2019, at 12:00 AM, Jerin Jacob Kollanukkaran <jerinj@marvell.com> wrote:
> 
>> -----Original Message-----
>> From: Yongseok Koh <yskoh@mellanox.com>
>> Sent: Friday, April 12, 2019 12:14 PM
>> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Thomas
>> Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>;
>> jerinjacobk@gmail.com
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>> machine specific flags
>> 
>> 
>> Thanks, that fixes the issue.
>> But I've encountered another one. Are you aware of this?
> 
> Yes. It is a compiler bug. This patch set is NOT introducing this.
> 
> Fixed same on legacy build with arm64 . If you are using < 4.8.6 compiler
> For meson, this patched needs to be ported to meson

Okay, I've ported it. Will submit it soon.

Yongseok


> 
> $ git show f3af3e44a444cdfe3fa7b3e2c042be351401eb23
> commit f3af3e44a444cdfe3fa7b3e2c042be351401eb23
> Author: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Date:   Mon Sep 3 15:01:10 2018 +0530
> 
>    mk: disable OcteonTx for buggy compilers only on arm64
> 
>    Disable octeontx for gcc 4.8.5 as the compiler is emitting "internal
>    compiler error" for aarch64. The GCC "internal compiler error" was
>    observed only for arm64 architecture so disable the PMD only
>    for arm64.
> 
>    Fixes: 4f760550a093 ("mk: disable OcteonTx for buggy compilers")
>    Cc: stable@dpdk.org
> 
>    Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>    Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
>> 
>> ninja: Entering directory `build'
>> [1151/1452] Compiling C object
>> 'drivers/drivers@@tmp_r...d_octeontx_event@sta/event_octeontx_timvf_
>> worker.c.o'.
>> FAILED:
>> drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_ti
>> mvf_worker.c.o
>> cc -Idrivers/drivers@@tmp_rte_pmd_octeontx_event@sta -Idrivers -
>> I../drivers -Idrivers/event/octeontx -I../drivers/event/octeontx -
>> Ilib/librte_eventdev -I../lib/librte_eventdev -I. -I../ -Iconfig -I../config -
>> Ilib/librte_eal/common/include -I../lib/librte_eal/common/include -
>> I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common -
>> I../lib/librte_eal/common -Ilib/librte_eal/common/include/arch/arm -
>> I../lib/librte_eal/co mmon/include/arch/arm -Ilib/librte_eal -I../lib/librte_eal
>> -Ilib/librte_kvargs -I../lib/librte_kvargs -Ilib/librte_ring -I../lib/librte_ring -
>> Ilib/librte_ethdev -I../lib/librte_ethdev -Ilib/librte_net -I../lib/librte_net -
>> Ilib/librte_mbuf -I../lib/librte_mbuf -Ilib/librte_mempool -
>> I../lib/librte_mempool -Ilib/librte_cmdline -I../lib/librte_cmdline -
>> Ilib/librte_meter -I../lib/librte_meter -Ilib/librte_hash -I../lib/librte_h ash -
>> Ilib/librte_timer -I../lib/librte_timer -Ilib/librte_cryptodev -
>> I../lib/librte_cryptodev -Idrivers/common/octeontx -
>> I../drivers/common/octeontx -Idrivers/mempool/octeontx -
>> I../drivers/mempool/octeontx -Idrivers/bus/pci -I../drivers/bus/pci -
>> I../drivers/bus/pci/linux -Ilib/librte_pci -I../lib/librte_pci -Idrivers/bus/vdev -
>> I../drivers/bus/vdev -Idrivers/net/octeontx -I../drivers/net/octeontx -
>> Idrivers/net/octeontx/base  -I../drivers/net/octeontx/base -pipe -
>> D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h -
>> Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -
>> DALLOW_EXPERIMENTAL_API  -MD -MQ
>> 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_ti
>> mvf_worker.c.o' -MF
>> 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_ti
>> mvf_worker.c.o.d' -o
>> 'drivers/drivers@@tmp_rte_pmd_octeontx_event@sta/event_octeontx_t
>> imvf_worker.c.o' -c ../drivers/event/octeontx/timvf_worker.c
>> ../drivers/event/octeontx/timvf_worker.c: In function
>> ‘timvf_timer_arm_burst_sp’:
>> ../drivers/event/octeontx/timvf_worker.c:88:1: error: could not split insn  }
>> ^ (insn 95 98 99 (parallel [
>>            (set (reg:DI 3 x3 [orig:98 D.8656 ] [98])
>>                (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64]))
>>            (set (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8 A64])
>>                (unspec_volatile:DI [
>>                        (plus:DI (mem/v:DI (reg/f:DI 21 x21 [orig:88 D.8662 ] [88]) [-1  S8
>> A64])
>>                            (const_int -281474976710656 [0xffff000000000000]))
>>                        (const_int 0 [0])
>>                    ] UNSPECV_ATOMIC_OP))
>>            (clobber (reg:CC 66 cc))
>>            (clobber (reg:DI 0 x0))
>>            (clobber (reg:SI 1 x1))
>>        ]) ../drivers/event/octeontx/timvf_worker.h:95 1832
>> {atomic_fetch_adddi}
>>     (expr_list:REG_UNUSED (reg:CC 66 cc)
>>        (expr_list:REG_UNUSED (reg:SI 1 x1)
>>            (expr_list:REG_UNUSED (reg:DI 0 x0)
>>                (nil)))))
>> ../drivers/event/octeontx/timvf_worker.c:88:1: internal compiler error: in
>> final_scan_insn, at final.c:2897 Please submit a full bug report, with
>> preprocessed source if appropriate.
>> See <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbugzilla.redhat.com%2Fbugzilla&amp;data=02%7C01%7Cyskoh%40mellanox.com%7C557fa46858104e15443508d6bf1494e9%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636906492451079366&amp;sdata=EpdRkARHJDQCd6LRSe7gWjCghjIec%2Fx%2BJjbJiDNEYm4%3D&amp;reserved=0> for instructions.
>> {standard input}: Assembler messages:
>> {standard input}: Error: open CFI at the end of file; missing .cfi_endproc
>> directive Preprocessed source stored into /tmp/ccnQRbOm.out file, please
>> attach this to your bugreport.
>> [1168/1452] Compiling C object
>> 'drivers/drivers@@tmp_r...ntx_crypto@sta/crypto_octeontx_otx_cryptode
>> v_ops.c.o'.
>> ninja: build stopped: subcommand failed.
>> 
>> Thanks
>> Yongseok
>
Jerin Jacob Kollanukkaran April 12, 2019, 7:35 a.m. UTC | #13
> -----Original Message-----
> From: Yongseok Koh <yskoh@mellanox.com>
> Sent: Friday, April 12, 2019 12:39 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Thomas
> Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>;
> jerinjacobk@gmail.com
> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
> machine specific flags
> 
> 
> 
> One more issue.
> With gcc7.2, crypto isn't enabled if -mcpu is set.
> How about thunderx/octeon?

It is enabled when when mcpu=thunderxt88 is provided.
In default case, it picks up -march=armv8-a+crc+crypto if mpcu is not matching. 
So I don’t see any issue.

> Looks it should be like -mcpu=cortex-a72+crypto I'll take care of this in a

mcpu should be enough. Looks like like it a compiler bug, Is cortex-a72 available with out crypto?
mcpu suppose to tune for specific cpu.


> separate patch.
> Because I want to add a new option to control it.
> The reason is a binary having crypto support can't be run on a cpu w/o crypto
> extension, it is panicked.
> And -mcpu=cortex-a72 includes 'crc' support by default.

I suggest to add config/arm/arm64_a72_linux_gcc, If we need to catch all this issue in
cross compilation otherwise it will come only on native compilation on that specific board.
Now the devtools/test-meson-builds.sh goes over all the config/arm/* on the native compilation 
On arm64 board and x86. If you need set compiler test for a72 then please add the cross compile config.


> 
> 
> Thanks,
> Yongseok
>
Yongseok Koh April 12, 2019, 7:47 a.m. UTC | #14
> On Apr 12, 2019, at 12:35 AM, Jerin Jacob Kollanukkaran <jerinj@marvell.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Yongseok Koh <yskoh@mellanox.com>
>> Sent: Friday, April 12, 2019 12:39 PM
>> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Thomas
>> Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>;
>> jerinjacobk@gmail.com
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>> machine specific flags
>> 
>> 
>> 
>> One more issue.
>> With gcc7.2, crypto isn't enabled if -mcpu is set.
>> How about thunderx/octeon?
> 
> It is enabled when when mcpu=thunderxt88 is provided.
> In default case, it picks up -march=armv8-a+crc+crypto if mpcu is not matching. 
> So I don’t see any issue.
> 
>> Looks it should be like -mcpu=cortex-a72+crypto I'll take care of this in a
> 
> mcpu should be enough. Looks like like it a compiler bug, Is cortex-a72 available with out crypto?
> mcpu suppose to tune for specific cpu.

Yes, Mellanox BlueField is designed based on a72 and there are two variants in production.
One with crypto, the other w/o crypto. That might be the reason why -mcpu=cortex-a72
doesn't enable crypto by default.

>> separate patch.
>> Because I want to add a new option to control it.
>> The reason is a binary having crypto support can't be run on a cpu w/o crypto
>> extension, it is panicked.
>> And -mcpu=cortex-a72 includes 'crc' support by default.
> 
> I suggest to add config/arm/arm64_a72_linux_gcc, If we need to catch all this issue in
> cross compilation otherwise it will come only on native compilation on that specific board.
> Now the devtools/test-meson-builds.sh goes over all the config/arm/* on the native compilation 
> On arm64 board and x86. If you need set compiler test for a72 then please add the cross compile config.

Yes, I'm going to add config/arm/arm64_bluefield_linux_gcc soon. Patch is ready now.

Thanks,
Yongseok
Thomas Monjalon April 12, 2019, 8:45 a.m. UTC | #15
12/04/2019 09:12, Jerin Jacob Kollanukkaran:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 10/04/2019 18:13, jerinjacobk@gmail.com:
> > > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > >
> > > Currently, RTE_* flags are set based on the implementer ID but there
> > > might be some micro arch specific differences from the same vendor eg.
> > > CACHE_LINESIZE. Add support to set micro arch specific flags.
> > 
> > I don't like how flags are set in config/arm/meson.build.
> > It is a real mess to find which flag applies to which machine.
> > Adding the flags_*_extra in the machine_args_* is adding more mess.
> > 
> > [...]
> > >  flags_common_default = [
> > >  	# Accelarate rte_memcpy. Be sure to run unit test
> > (memcpy_perf_autotest)
> > >  	# to determine the best threshold in code. Refer to notes in source
> > > file @@ -52,12 +33,10 @@ flags_generic = [
> > >  	['RTE_USE_C11_MEM_MODEL', true],
> > >  	['RTE_CACHE_LINE_SIZE', 128]]
> > >  flags_cavium = [
> > > -	['RTE_MACHINE', '"thunderx"'],
> > >  	['RTE_CACHE_LINE_SIZE', 128],
> > >  	['RTE_MAX_NUMA_NODES', 2],
> > >  	['RTE_MAX_LCORE', 96],
> > > -	['RTE_MAX_VFIO_GROUPS', 128],
> > > -	['RTE_USE_C11_MEM_MODEL', false]]
> > > +	['RTE_MAX_VFIO_GROUPS', 128]]
> > >  flags_dpaa = [
> > >  	['RTE_MACHINE', '"dpaa"'],
> > >  	['RTE_USE_C11_MEM_MODEL', true],
> > > @@ -71,6 +50,27 @@ flags_dpaa2 = [
> > >  	['RTE_MAX_NUMA_NODES', 1],
> > >  	['RTE_MAX_LCORE', 16],
> > >  	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
> > > +flags_default_extra = []
> > > +flags_thunderx_extra = [
> > > +	['RTE_MACHINE', '"thunderx"'],
> > > +	['RTE_USE_C11_MEM_MODEL', false]]
> > > +
> > > +machine_args_generic = [
> > > +	['default', ['-march=armv8-a+crc+crypto']],
> > > +	['native', ['-march=native']],
> > > +	['0xd03', ['-mcpu=cortex-a53']],
> > > +	['0xd04', ['-mcpu=cortex-a35']],
> > > +	['0xd07', ['-mcpu=cortex-a57']],
> > > +	['0xd08', ['-mcpu=cortex-a72']],
> > > +	['0xd09', ['-mcpu=cortex-a73']],
> > > +	['0xd0a', ['-mcpu=cortex-a75']]]
> > > +
> > > +machine_args_cavium = [
> > > +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> > > +	['native', ['-march=native']],
> > > +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
> > > +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
> > > +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
> > 
> > I think we should have a simpler model.
> > We need only to know the machine name and get all the related machine
> > config.
> > In native compilation, machine name is guessed from implementor id and pn
> > (from config/arm/armv8_machine.py). We can directly output the machine
> > name from this script and leave the naming logic in this script.
> > In the cross-compilation config files (config/arm/*), we can just specify the
> > machine name.
> > Then every machine config (machine_args and dpdk_conf) would be
> > specified in some arrays based on the machine name.
> > Of course, we can keep some common default values.
> 
> Thomas,
> 
> This patch was around last three months. It reached upto v8.
> I think, in that last minute for RC2, We cannot take major rework on this as it needs to tested for
> Other arm64 platform too. It was pulled out from RC1 because other pcap issue from meson.
> Now its not fair to say to rework the meson stuff now.
> I suggest to take other rework in next release.

I was not confortable with this patch without being able to say why.
Yesterday I spent more time to understand and see what may be improved.
I agree it is late, so it won't block this patch for 19.05.
Do you agree this file can be improved?
Please would you like to look at reworking during next cycle?
Thanks
Jerin Jacob Kollanukkaran April 13, 2019, 6:24 a.m. UTC | #16
> > > > +machine_args_cavium = [
> > > > +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> > > > +	['native', ['-march=native']],
> > > > +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
> > > > +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
> > > > +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
> > >
> > > I think we should have a simpler model.
> > > We need only to know the machine name and get all the related
> > > machine config.
> > > In native compilation, machine name is guessed from implementor id
> > > and pn (from config/arm/armv8_machine.py). We can directly output
> > > the machine name from this script and leave the naming logic in this script.
> > > In the cross-compilation config files (config/arm/*), we can just
> > > specify the machine name.
> > > Then every machine config (machine_args and dpdk_conf) would be
> > > specified in some arrays based on the machine name.
> > > Of course, we can keep some common default values.
> >
> > Thomas,
> >
> > This patch was around last three months. It reached upto v8.
> > I think, in that last minute for RC2, We cannot take major rework on
> > this as it needs to tested for Other arm64 platform too. It was pulled out from
> RC1 because other pcap issue from meson.
> > Now its not fair to say to rework the meson stuff now.
> > I suggest to take other rework in next release.
> 
> I was not confortable with this patch without being able to say why.
> Yesterday I spent more time to understand and see what may be improved.
> I agree it is late, so it won't block this patch for 19.05.
> Do you agree this file can be improved?

Moving to  the all to static config file is an option but we lose the flexibility
of runtime detecting the options and few of them are probing at runtime based
on gcc versions and mcpu combination etc.
I am not expert in meson area and not sure meson/python has better data strcture for this other than
list/array combo. If Bruce has any feedback on this, then we
will try to prototype it.






> Please would you like to look at reworking during next cycle?
> Thanks
>
Thomas Monjalon April 13, 2019, 8:42 p.m. UTC | #17
13/04/2019 08:24, Jerin Jacob Kollanukkaran:
> > I was not confortable with this patch without being able to say why.
> > Yesterday I spent more time to understand and see what may be improved.
> > I agree it is late, so it won't block this patch for 19.05.
> > Do you agree this file can be improved?
> 
> Moving to  the all to static config file is an option but we lose the flexibility
> of runtime detecting the options and few of them are probing at runtime based
> on gcc versions and mcpu combination etc.

I think there is a misunderstanding.
I'm suggesting to symplify arrays by indexing only by machine name.
It should not change the behaviour.

> I am not expert in meson area and not sure meson/python has better data strcture for this other than
> list/array combo. If Bruce has any feedback on this, then we
> will try to prototype it.
> 
> > Please would you like to look at reworking during next cycle?
> > Thanks
Pavan Nikhilesh Bhagavatula April 14, 2019, 2:40 p.m. UTC | #18
Hi Thomas, 

There is no guarantee of primary part number (machine names) uniqueness between implementors. 
If we limit lookups to only machine names through primary part number we would have a lot of repetitive defines.
Also, moving the arrays into the python script is not feasible as meson needs to reparse the standard out from the python script

Currently, config is split into three parts :
	1. Implementor specific defines.
	2. Micro-arch specific compiler flags.
	3. Micro-arch specific defines.

I think from a configurability point of view the above three are really important for fine grained control.

Thoughts?

Regards,
Pavan.

>-----Original Message-----
>From: Thomas Monjalon <thomas@monjalon.net>
>Sent: Sunday, April 14, 2019 2:13 AM
>To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
>dev@dpdk.org; jerinjacobk@gmail.com; yskoh@mellanox.com;
>bruce.richardson@intel.com
>Subject: Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support machine
>specific flags
>
>13/04/2019 08:24, Jerin Jacob Kollanukkaran:
>> > I was not confortable with this patch without being able to say why.
>> > Yesterday I spent more time to understand and see what may be improved.
>> > I agree it is late, so it won't block this patch for 19.05.
>> > Do you agree this file can be improved?
>>
>> Moving to  the all to static config file is an option but we lose the
>> flexibility of runtime detecting the options and few of them are
>> probing at runtime based on gcc versions and mcpu combination etc.
>
>I think there is a misunderstanding.
>I'm suggesting to symplify arrays by indexing only by machine name.
>It should not change the behaviour.
>
>> I am not expert in meson area and not sure meson/python has better
>> data strcture for this other than list/array combo. If Bruce has any
>> feedback on this, then we will try to prototype it.
>>
>> > Please would you like to look at reworking during next cycle?
>> > Thanks
>
>
Thomas Monjalon April 14, 2019, 5:44 p.m. UTC | #19
14/04/2019 16:40, Pavan Nikhilesh Bhagavatula:
> Hi Thomas, 
> 
> There is no guarantee of primary part number (machine names) uniqueness between implementors.

I think we don't speak the same language :)
By machine name, I mean what we set in RTE_MACHINE, like octeontx2.

> If we limit lookups to only machine names through primary part number we would have a lot of repetitive defines.
> Also, moving the arrays into the python script is not feasible as meson needs to reparse the standard out from the python script

I will probably need to write a PoC.

> Currently, config is split into three parts :
> 	1. Implementor specific defines.
> 	2. Micro-arch specific compiler flags.
> 	3. Micro-arch specific defines.

This is currently unreadable in my opinion.

> I think from a configurability point of view the above three are really important for fine grained control.

I agree fine grain is required.


> Thoughts?
> 
> Regards,
> Pavan.
> 
> >-----Original Message-----
> >From: Thomas Monjalon <thomas@monjalon.net>
> >Sent: Sunday, April 14, 2019 2:13 AM
> >To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> >Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
> >dev@dpdk.org; jerinjacobk@gmail.com; yskoh@mellanox.com;
> >bruce.richardson@intel.com
> >Subject: Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support machine
> >specific flags
> >
> >13/04/2019 08:24, Jerin Jacob Kollanukkaran:
> >> > I was not confortable with this patch without being able to say why.
> >> > Yesterday I spent more time to understand and see what may be improved.
> >> > I agree it is late, so it won't block this patch for 19.05.
> >> > Do you agree this file can be improved?
> >>
> >> Moving to  the all to static config file is an option but we lose the
> >> flexibility of runtime detecting the options and few of them are
> >> probing at runtime based on gcc versions and mcpu combination etc.
> >
> >I think there is a misunderstanding.
> >I'm suggesting to symplify arrays by indexing only by machine name.
> >It should not change the behaviour.
> >
> >> I am not expert in meson area and not sure meson/python has better
> >> data strcture for this other than list/array combo. If Bruce has any
> >> feedback on this, then we will try to prototype it.
> >>
> >> > Please would you like to look at reworking during next cycle?
> >> > Thanks
> >
> >
> 
>
Jerin Jacob Kollanukkaran April 14, 2019, 6:19 p.m. UTC | #20
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Sunday, April 14, 2019 11:14 PM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; dev@dpdk.org;
> jerinjacobk@gmail.com; yskoh@mellanox.com; bruce.richardson@intel.com
> Subject: Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support machine
> specific flags
> 
> 14/04/2019 16:40, Pavan Nikhilesh Bhagavatula:
> > Hi Thomas,
> >
> > There is no guarantee of primary part number (machine names) uniqueness
> between implementors.
> 
> I think we don't speak the same language :) By machine name, I mean what we
> set in RTE_MACHINE, like octeontx2.

As you know, The system probes "implementor_id" and "implementor_pn"
Values. There is nothing like machine name in meson and in order to keep 
Synergy with native build, we need to just follow, "implementor_id"
and "implementor_pn". Now, it is possible to have "implemetor_id" to
"implementor_pn"  to machine name lookup but Unlike, "make"  based
Build system, meson supports supporting a lot of machines(like RTE_MACHINE),
with that structure. So converting to another intermediate called "machine string"
will have more overhead IMO.

['0xa1', ['-mcpu=thunderxt88']],
['0xa2', ['-mcpu=thunderxt81']],
['0xa3', ['-mcpu=thunderxt83']]]
['0xd03', ['-mcpu=cortex-a53']],
['0xd04', ['-mcpu=cortex-a35']],
['0xd05', ['-mcpu=cortex-a55']],
['0xd07', ['-mcpu=cortex-a57']],
['0xd08', ['-mcpu=cortex-a72']],
['0xd09', ['-mcpu=cortex-a73']],
['0xd0a', ['-mcpu=cortex-a75']],
['0xd0b', ['-mcpu=cortex-a76']]
impl_0x41 = ['Arm', flags_generic, machine_args_generic]
impl_0x42 = ['Broadcom', flags_generic, machine_args_generic]
impl_0x43 = ['Cavium', flags_cavium, machine_args_cavium]
impl_0x44 = ['DEC', flags_generic, machine_args_generic]
impl_0x49 = ['Infineon', flags_generic, machine_args_generic]
impl_0x4d = ['Motorola', flags_generic, machine_args_generic]
impl_0x4e = ['NVIDIA', flags_generic, machine_args_generic]
impl_0x50 = ['AppliedMicro', flags_generic, machine_args_generic]
impl_0x51 = ['Qualcomm', flags_generic, machine_args_generic]
impl_0x53 = ['Samsung', flags_generic, machine_args_generic]
impl_0x56 = ['Marvell', flags_generic, machine_args_generic]
impl_0x69 = ['Intel', flags_generic, machine_args_generic]
impl_dpaa = ['NXP DPAA', flags_dpaa, machine_args_generic]
impl_dpaa2 = ['NXP DPAA2', flags_dpaa2, machine_args_generic]



> 
> > If we limit lookups to only machine names through primary part number we
> would have a lot of repetitive defines.
> > Also, moving the arrays into the python script is not feasible as
> > meson needs to reparse the standard out from the python script
> 
> I will probably need to write a PoC.

Yes. Please


> 
> > Currently, config is split into three parts :
> > 	1. Implementor specific defines.
> > 	2. Micro-arch specific compiler flags.
> > 	3. Micro-arch specific defines.
> 
> This is currently unreadable in my opinion.

Bit  subjective.
If we want to keep all the fine grained control along with
"native" /"cross" build,  "distribution" build, cache line
differences etc, makes it bit difficult.

But if you think, it can be improved. Please share patch,
We are happy to review and test in  the platforms we have.

> > I think from a configurability point of view the above three are really
> important for fine grained control.
> 
> I agree fine grain is required.
> 
> 
> > Thoughts?
> >
> > Regards,
> > Pavan.
> >
> > >-----Original Message-----
> > >From: Thomas Monjalon <thomas@monjalon.net>
> > >Sent: Sunday, April 14, 2019 2:13 AM
> > >To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> > >Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
> > >dev@dpdk.org; jerinjacobk@gmail.com; yskoh@mellanox.com;
> > >bruce.richardson@intel.com
> > >Subject: Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
> > >machine specific flags
> > >
> > >13/04/2019 08:24, Jerin Jacob Kollanukkaran:
> > >> > I was not confortable with this patch without being able to say why.
> > >> > Yesterday I spent more time to understand and see what may be
> improved.
> > >> > I agree it is late, so it won't block this patch for 19.05.
> > >> > Do you agree this file can be improved?
> > >>
> > >> Moving to  the all to static config file is an option but we lose
> > >> the flexibility of runtime detecting the options and few of them
> > >> are probing at runtime based on gcc versions and mcpu combination etc.
> > >
> > >I think there is a misunderstanding.
> > >I'm suggesting to symplify arrays by indexing only by machine name.
> > >It should not change the behaviour.
> > >
> > >> I am not expert in meson area and not sure meson/python has better
> > >> data strcture for this other than list/array combo. If Bruce has
> > >> any feedback on this, then we will try to prototype it.
> > >>
> > >> > Please would you like to look at reworking during next cycle?
> > >> > Thanks
> > >
> > >
> >
> >
> 
> 
> 
>
Thomas Monjalon April 14, 2019, 6:29 p.m. UTC | #21
14/04/2019 20:19, Jerin Jacob Kollanukkaran:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 14/04/2019 16:40, Pavan Nikhilesh Bhagavatula:
> > > Hi Thomas,
> > >
> > > There is no guarantee of primary part number (machine names) uniqueness
> > between implementors.
> > 
> > I think we don't speak the same language :) By machine name, I mean what we
> > set in RTE_MACHINE, like octeontx2.
> 
> As you know, The system probes "implementor_id" and "implementor_pn"
> Values. There is nothing like machine name in meson and in order to keep 
> Synergy with native build, we need to just follow, "implementor_id"
> and "implementor_pn". Now, it is possible to have "implemetor_id" to
> "implementor_pn"  to machine name lookup but Unlike, "make"  based
> Build system, meson supports supporting a lot of machines(like RTE_MACHINE),
> with that structure. So converting to another intermediate called "machine string"
> will have more overhead IMO.

We already have this string with RTE_MACHINE.
You already set RTE_MACHINE based on id and pn.
I don't see any overhead.
Anyway, no need to discuss it more without any real code.
diff mbox series

Patch

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 170a4981a..24bce2b39 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -7,25 +7,6 @@  march_opt = '-march=@0@'.format(machine)
 
 arm_force_native_march = false
 
-machine_args_generic = [
-	['default', ['-march=armv8-a+crc+crypto']],
-	['native', ['-march=native']],
-	['0xd03', ['-mcpu=cortex-a53']],
-	['0xd04', ['-mcpu=cortex-a35']],
-	['0xd05', ['-mcpu=cortex-a55']],
-	['0xd07', ['-mcpu=cortex-a57']],
-	['0xd08', ['-mcpu=cortex-a72']],
-	['0xd09', ['-mcpu=cortex-a73']],
-	['0xd0a', ['-mcpu=cortex-a75']],
-	['0xd0b', ['-mcpu=cortex-a76']],
-]
-machine_args_cavium = [
-	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
-	['native', ['-march=native']],
-	['0xa1', ['-mcpu=thunderxt88']],
-	['0xa2', ['-mcpu=thunderxt81']],
-	['0xa3', ['-mcpu=thunderxt83']]]
-
 flags_common_default = [
 	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
 	# to determine the best threshold in code. Refer to notes in source file
@@ -52,12 +33,10 @@  flags_generic = [
 	['RTE_USE_C11_MEM_MODEL', true],
 	['RTE_CACHE_LINE_SIZE', 128]]
 flags_cavium = [
-	['RTE_MACHINE', '"thunderx"'],
 	['RTE_CACHE_LINE_SIZE', 128],
 	['RTE_MAX_NUMA_NODES', 2],
 	['RTE_MAX_LCORE', 96],
-	['RTE_MAX_VFIO_GROUPS', 128],
-	['RTE_USE_C11_MEM_MODEL', false]]
+	['RTE_MAX_VFIO_GROUPS', 128]]
 flags_dpaa = [
 	['RTE_MACHINE', '"dpaa"'],
 	['RTE_USE_C11_MEM_MODEL', true],
@@ -71,6 +50,27 @@  flags_dpaa2 = [
 	['RTE_MAX_NUMA_NODES', 1],
 	['RTE_MAX_LCORE', 16],
 	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
+flags_default_extra = []
+flags_thunderx_extra = [
+	['RTE_MACHINE', '"thunderx"'],
+	['RTE_USE_C11_MEM_MODEL', false]]
+
+machine_args_generic = [
+	['default', ['-march=armv8-a+crc+crypto']],
+	['native', ['-march=native']],
+	['0xd03', ['-mcpu=cortex-a53']],
+	['0xd04', ['-mcpu=cortex-a35']],
+	['0xd07', ['-mcpu=cortex-a57']],
+	['0xd08', ['-mcpu=cortex-a72']],
+	['0xd09', ['-mcpu=cortex-a73']],
+	['0xd0a', ['-mcpu=cortex-a75']]]
+
+machine_args_cavium = [
+	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
+	['native', ['-march=native']],
+	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
+	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
+	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
 
 ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
 impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
@@ -157,8 +157,16 @@  else
 	endif
 	foreach marg: machine[2]
 		if marg[0] == impl_pn
-			foreach f: marg[1]
-				machine_args += f
+			foreach flag: marg[1]
+				if cc.has_argument(flag)
+					machine_args += flag
+				endif
+			endforeach
+			# Apply any extra machine specific flags.
+			foreach flag: marg.get(2, flags_default_extra)
+				if flag.length() > 0
+					dpdk_conf.set(flag[0], flag[1])
+				endif
 			endforeach
 		endif
 	endforeach