From patchwork Thu Jul 25 09:03:01 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Hunt, David"
X-Patchwork-Id: 57074
From: David Hunt
To: dev@dpdk.org
Cc: david.hunt@intel.com, Liang Ma, stable@dpdk.org
Date: Thu, 25 Jul 2019 10:03:01 +0100
Message-Id: <20190725090301.4821-1-david.hunt@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190725085437.4634-1-david.hunt@intel.com>
References: <20190725085437.4634-1-david.hunt@intel.com>
Subject: [dpdk-dev] [PATCH v2] lib/distributor: fix livelock issue on flush
List-Id: DPDK patches and discussions

From: Liang Ma

The distributor autotest can lock up if run enough times. The worker and
distributor threads get into a livelock, each waiting on the other.

Issue first encountered by Red Hat in Travis CI:
https://bugs.dpdk.org/show_bug.cgi?id=316

To reproduce:
`while sudo sh -c "echo 'distributor_autotest' | ./build/app/test/dpdk-test"; do :; done`

The root cause is that, when flushing on exit, we do not wait for all
worker packets to be returned before exiting. Add a delay on flush so
that all worker packets are returned before the flush completes.

Bugzilla ID: 316
Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: stable@dpdk.org

Tested-by: Michael Santana
Signed-off-by: David Hunt
Signed-off-by: Liang Ma
---
v2: add correct patch attribution
---
 lib/librte_distributor/rte_distributor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 9fa05f69a..21eb1fb0a 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -542,6 +542,9 @@ rte_distributor_flush_v1705(struct rte_distributor *d)
 	while (total_outstanding(d) > 0)
 		rte_distributor_process(d, NULL, 0);
 
+	/* wait 10ms to allow all worker drain the pkts */
+	rte_delay_us(10000);
+
 	/*
 	 * Send empty burst to all workers to allow them to exit
 	 * gracefully, should they need to.
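
For readers following the commit message, the ordering that the patch relies on can be sketched with a small toy model. This is only an illustration under stated assumptions, not the DPDK implementation: `toy_state`, `toy_process`, `toy_tick` and `toy_flush` are hypothetical names, the model is single-threaded, and one "tick" stands in for the time a worker needs to hand back one packet. It only shows that packets can still be with the workers once the outstanding count reaches zero, and that waiting before the final empty burst lets them drain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in DPDK. */
struct toy_state {
	unsigned int tracked;  /* packets the outstanding counter still reports */
	unsigned int draining; /* packets workers hold but have not handed back */
};

/* One processing step: a tracked packet moves to a worker's return path. */
static void toy_process(struct toy_state *s)
{
	if (s->tracked > 0) {
		s->tracked--;
		s->draining++;
	}
}

/* One tick of waiting: a worker hands back at most one packet. */
static void toy_tick(struct toy_state *s)
{
	if (s->draining > 0)
		s->draining--;
}

/*
 * Flush in the same order as the patched function: process until nothing
 * is tracked, wait, then (conceptually) send the empty burst.
 * Returns how many packets are still with workers at that point.
 */
static unsigned int toy_flush(struct toy_state *s, unsigned int wait_ticks)
{
	unsigned int i;

	assert(s != NULL);

	while (s->tracked > 0)
		toy_process(s);

	/* Stand-in for the delay added by the patch. */
	for (i = 0; i < wait_ticks; i++)
		toy_tick(s);

	return s->draining;
}
```

With three packets in flight, `toy_flush(&s, 0)` leaves three packets with the workers, while `toy_flush(&s, 3)` leaves none. In this model the wait has to cover the workers' drain time, which is the role the 10ms delay plays in the patch.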