Commit Message

Konstantin Ananyev April 23, 2021, 1:54 p.m. UTC
gcc 11 with '-O2' complains about some variables being used without
being initialized:

In file included from ../lib/librte_acl/acl_run_avx512x8.h:201,
                 from ../lib/librte_acl/acl_run_avx512.c:110:
In function ‘start_flow_avx512x8’,
    inlined from ‘search_trie_avx512x8.constprop’ at ../lib/librte_acl/acl_run_avx512_common.h:317:2:
../lib/librte_acl/acl_run_avx512_common.h:210:13: warning: ‘pdata’ is used uninitialized [-Wuninitialized]
In file included from ../lib/librte_acl/acl_run_avx512x8.h:201,
                 from ../lib/librte_acl/acl_run_avx512.c:110:
../lib/librte_acl/acl_run_avx512_common.h: In function ‘search_trie_avx512x8.constprop’:
../lib/librte_acl/acl_run_avx512_common.h:314:32: note: ‘pdata’ declared here
In file included from ../lib/librte_acl/acl_run_avx512x8.h:201,
                 from ../lib/librte_acl/acl_run_avx512.c:110:

Indeed, these variables are not explicitly initialized,
but this is done intentionally.
We rely on the constant mask value that we pass to start_flow*() functions
as a parameter. Note that gcc 11 with '-O3' and gcc 9/10 with both
'-O2' and '-O3' don't produce this warning, and neither does clang.
That makes me think that they are able to successfully propagate
this constant mask value through the code.
Also, even though gcc 11 with '-O2' produces a warning, it is able to
generate an output binary with properly propagated constant values.
Anyway, to support a clean build with gcc-11, this patch adds
explicit initialization for these variables.
I checked the output binary: with '-O3' both clang and gcc 10/11
generate no extra code for it.
Also, the performance test didn't reveal any regressions.
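
For illustration only, here is a minimal scalar sketch of the pattern
described above. It is not the ACL code: LANES, MASK_ALL, masked_load()
and sum_lanes() are hypothetical names, and a plain loop stands in for
the masked SIMD operations. The callee writes only the lanes selected by
the mask, so with a constant all-ones mask every lane is overwritten and
the explicit zeroing is redundant at run time. A compiler that fails to
propagate the constant cannot prove that, which is presumably what
triggers the warning.

```c
#include <stdint.h>

#define LANES    8
#define MASK_ALL 0xffu	/* hypothetical stand-in for a constant full mask */

/*
 * Hypothetical scalar stand-in for a masked SIMD load:
 * lane i of dst is overwritten only when bit i of mask is set,
 * all other lanes keep whatever value they had before the call.
 */
static inline void
masked_load(uint32_t dst[LANES], const uint32_t src[LANES], uint32_t mask)
{
	int i;

	for (i = 0; i < LANES; i++) {
		if (mask & (1u << i))
			dst[i] = src[i];
	}
}

/* Load all lanes with a constant full mask and return their sum. */
static uint32_t
sum_lanes(const uint32_t src[LANES])
{
	/*
	 * Explicit initialization: redundant when the mask is MASK_ALL,
	 * but it makes the "used uninitialized" diagnostic impossible.
	 */
	uint32_t v[LANES] = {0};
	uint32_t sum = 0;
	int i;

	masked_load(v, src, MASK_ALL);
	for (i = 0; i < LANES; i++)
		sum += v[i];
	return sum;
}
```

When the mask is a compile-time constant and masked_load() is inlined,
an optimizing compiler can usually drop the zeroing as a dead store,
which is consistent with the "no extra code" observation above, though
this sketch has not been checked against any particular compiler.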

Bugzilla ID: 673
Fixes: b64c2295f7fc ("acl: add 256-bit AVX512 classify method")
Fixes: 45da22e42ec3 ("acl: add 512-bit AVX512 classify method")
Cc: stable@dpdk.org

Reported-by: Ali Alnubani <alialnu@nvidia.com>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
 lib/acl/acl_run_avx512_common.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
diff --git a/lib/acl/acl_run_avx512_common.h b/lib/acl/acl_run_avx512_common.h
index fafaf591e..fbad74d45 100644
--- a/lib/acl/acl_run_avx512_common.h
+++ b/lib/acl/acl_run_avx512_common.h
@@ -303,6 +303,28 @@  _F_(match_check_process)(struct acl_flow_avx512 *flow, uint32_t fm[2],
+static inline void
+_F_(reset_flow_vars)(_T_simd di[2], _T_simd idx[2], _T_simd pdata[4],
+	_T_simd tr_lo[2], _T_simd tr_hi[2])
+{
+	di[0] = _M_SI_(setzero)();
+	di[1] = _M_SI_(setzero)();
+
+	idx[0] = _M_SI_(setzero)();
+	idx[1] = _M_SI_(setzero)();
+
+	pdata[0] = _M_SI_(setzero)();
+	pdata[1] = _M_SI_(setzero)();
+	pdata[2] = _M_SI_(setzero)();
+	pdata[3] = _M_SI_(setzero)();
+
+	tr_lo[0] = _M_SI_(setzero)();
+	tr_lo[1] = _M_SI_(setzero)();
+
+	tr_hi[0] = _M_SI_(setzero)();
+	tr_hi[1] = _M_SI_(setzero)();
+}
+
 /*
  * Perform search for up to (2 * _N_) flows in parallel.
  * Use two sets of metadata, each serves _N_ flows max.
@@ -313,6 +335,8 @@  _F_(search_trie)(struct acl_flow_avx512 *flow)
 	uint32_t fm[2];
 	_T_simd di[2], idx[2], in[2], pdata[4], tr_lo[2], tr_hi[2];
 
+	_F_(reset_flow_vars)(di, idx, pdata, tr_lo, tr_hi);
+
 	/* first 1B load */
 	_F_(start_flow)(flow, _SIMD_MASK_BIT_, _SIMD_MASK_MAX_,
 			&pdata[0], &idx[0], &di[0]);