Message ID | 20200923132541.21417-1-l.wojciechow@partner.samsung.com (mailing list archive) |
---|---|
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C8C0AA04B1; Wed, 23 Sep 2020 15:25:46 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D812C1DA20; Wed, 23 Sep 2020 15:25:45 +0200 (CEST) Received: from mailout2.w1.samsung.com (mailout2.w1.samsung.com [210.118.77.12]) by dpdk.org (Postfix) with ESMTP id 6F1C61D37F for <dev@dpdk.org>; Wed, 23 Sep 2020 15:25:45 +0200 (CEST) Received: from eucas1p2.samsung.com (unknown [182.198.249.207]) by mailout2.w1.samsung.com (KnoxPortal) with ESMTP id 20200923132544euoutp02a7a1fc67df4764de77708b3bc1b58df5~3bIziCpni1049510495euoutp02w for <dev@dpdk.org>; Wed, 23 Sep 2020 13:25:44 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout2.w1.samsung.com 20200923132544euoutp02a7a1fc67df4764de77708b3bc1b58df5~3bIziCpni1049510495euoutp02w DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1600867544; bh=BsdQbbJrYchi0rSMMZ62olnwP4DS5iFrlDgGgAJ10M4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=r0COzBvs0k/sI7SN+U4wkm0V7Hygzb+rCWXtiC6fDys2shpbTxiRS4skKq/XjWTH4 lFCTdibCJjxczg2ZufTjsHJYM4PmeDW440T+OsisSWsTjqoyaAO87KvpZlS9BTZxrp pOybY/yhvKS0kXoTpIF+hQ/yBHjjfsp+EiRk0sQM= Received: from eusmges1new.samsung.com (unknown [203.254.199.242]) by eucas1p2.samsung.com (KnoxPortal) with ESMTP id 20200923132544eucas1p21b40ecdf6c87d51adc3175de989b4759~3bIzYXvBo2884028840eucas1p2C; Wed, 23 Sep 2020 13:25:44 +0000 (GMT) Received: from eucas1p2.samsung.com ( [182.198.249.207]) by eusmges1new.samsung.com (EUCPMTA) with SMTP id 92.B3.06456.8DC4B6F5; Wed, 23 Sep 2020 14:25:44 +0100 (BST) Received: from eusmtrp1.samsung.com (unknown [182.198.249.138]) by eucas1p2.samsung.com (KnoxPortal) with ESMTPA id 20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69~3bIzERfOt0346103461eucas1p2n; Wed, 23 Sep 2020 13:25:44 +0000 (GMT) Received: from eusmgms2.samsung.com (unknown [182.198.249.180]) by eusmtrp1.samsung.com (KnoxPortal) with ESMTP id 20200923132544eusmtrp15fe170e19c2d66f750d508874701d8aa~3bIzDzK2v3185331853eusmtrp1l; Wed, 23 Sep 2020 13:25:44 +0000 (GMT) X-AuditID: cbfec7f2-809ff70000001938-17-5f6b4cd8fb23 Received: from eusmtip2.samsung.com ( [203.254.199.222]) by eusmgms2.samsung.com (EUCPMTA) with SMTP id 72.FA.06017.8DC4B6F5; Wed, 23 Sep 2020 14:25:44 +0100 (BST) Received: from Padamandas.fritz.box (unknown [106.210.88.70]) by eusmtip2.samsung.com (KnoxPortal) with ESMTPA id 20200923132543eusmtip2965d5e2b7c384dec634881207269dfb8~3bIyqx-pr2716427164eusmtip2H; Wed, 23 Sep 2020 13:25:43 +0000 (GMT) From: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> To: Cc: dev@dpdk.org, l.wojciechow@partner.samsung.com Date: Wed, 23 Sep 2020 15:25:33 +0200 Message-Id: <20200923132541.21417-1-l.wojciechow@partner.samsung.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200923014713.16932-1-l.wojciechow@partner.samsung.com> X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFrrMIsWRmVeSWpSXmKPExsWy7djP87o3fLLjDS4c0rJ492k7k8WznnWM DkwevxYsZfU4+G4PUwBTFJdNSmpOZllqkb5dAlfG3+YtbAVPuCsOTLrH2MC4jLOLkZNDQsBE 4uH85SwgtpDACkaJu13eXYxcQPYXRokvc9+wQzifGSXa3/1kguk4u+45E0RiOaPEnsNPodo/ MUq87tcDsdkEbCWOzPzKCmKLCLBIrPz+HayGWcBI4mX3RGYQW1jATuLo1AZGEJtFQFVi9qX1 YAt4BVwlZiw7D7VMXmL1hgNA9RwcnAJuEk//yYPslRDYwSZx8voNqBoXiRffvkLZwhKvjm9h h7BlJE5P7mGBaNjGKHH1909GCGc/o8T13hVQVdYSh//9ZgPZwCygKbF+lz5E2FFizu+7YGEJ AT6JG28FIe7nk5i0bTozRJhXoqNNCKJaT+Jpz1RGmLV/1j5hgbA9JPq6XjJCwmomo8TbjWtZ JzDKz0JYtoCRcRWjeGppcW56arFhXmq5XnFibnFpXrpecn7uJkZgVJ/+d/zTDsavl5IOMQpw MCrx8HLoZscLsSaWFVfmHmKU4GBWEuF1Ons6Tog3JbGyKrUoP76oNCe1+BCjNAeLkjiv8aKX sUIC6YklqdmpqQWpRTBZJg5OKWDgV/xMncUiumPtngk8u+S3RllZlvNVzy4Lme3329yl6yH/ At+tgkzPdzvOfLlv8dey+ob8F0vD6i553/DrNYoPOyrgo7P4hnPyrv7wub0dV9bN1v7lem7F jjJLxrvcv4zYN+YcK70umG2t+cDXIjQkIUzksIOdgrbVrniHzrMLHKad6lbr5lZiKc5INNRi LipOBABoUvkX5gIAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFtrCLMWRmVeSWpSXmKPExsVy+t/xe7o3fLLjDa5vFbV492k7k8WznnWM DkwevxYsZfU4+G4PUwBTlJ5NUX5pSapCRn5xia1StKGFkZ6hpYWekYmlnqGxeayVkamSvp1N SmpOZllqkb5dgl7G3+YtbAVPuCsOTLrH2MC4jLOLkZNDQsBE4uy650xdjFwcQgJLGSWO7FnA 1sXIAZSQkfhwSQCiRljiz7UuNoiaD4wSXx40M4Mk2ARsJY7M/MoKYosIsEis/P6dBcRmBhp6 e14TG4gtLGAncXRqAyOIzSKgKjH70nomEJtXwFVixrLzTBAL5CVWbzjADLKXU8BN4uk/eZCw EFBJ34lvbBMY+RYwMqxiFEktLc5Nzy020itOzC0uzUvXS87P3cQIDLFtx35u2cHY9S74EKMA B6MSDy+Hbna8EGtiWXFl7iFGCQ5mJRFep7On44R4UxIrq1KL8uOLSnNSiw8xmgLdNJFZSjQ5 Hxj+eSXxhqaG5haWhubG5sZmFkrivB0CB2OEBNITS1KzU1MLUotg+pg4OKUaGFM/fnUz2HXF 5VbJZ8OTJbOmf0rovCxmYlhrs1NG94X0yh1Zb2z0YpwDj2wW2FW8KjJQO+4P69McHzF/AR+7 nL3eO9c0cfM/Ept1t3hu3ecV58r8Ko8U3EiJK2s8G9navEjQdnWzrK3ljJ0yh2L8IqOaX8yc pPXxWNaLRxnJXC90H4tMXSe/X4mlOCPRUIu5qDgRAPbrjaVHAgAA X-CMS-MailID: 20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69 X-Msg-Generator: CA Content-Type: text/plain; charset="utf-8" X-RootMTR: 20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69 X-EPHeader: CA CMS-TYPE: 201P X-CMS-RootMailID: 20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69 References: <20200923014713.16932-1-l.wojciechow@partner.samsung.com> <CGME20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69@eucas1p2.samsung.com> Subject: [dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions <dev.dpdk.org> List-Unsubscribe: <https://mails.dpdk.org/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://mails.dpdk.org/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <https://mails.dpdk.org/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Series |
fix distributor synchronization issues
|
|
Message
Lukasz Wojciechowski
Sept. 23, 2020, 1:25 p.m. UTC
During review and verification of the patch created by Sarosh Arif: "test_distributor: prevent memory leakages from the pool" I found out that running distributor unit tests multiple times in a row causes fails. So I investigated all the issues I found. There are few synchronization issues that might cause deadlocks or corrupted data. They are fixed with this set of patches for both tests and librte_distributor library. --- v3: * add missing acked and tested by statements from v1 v2: * assign NULL to freed mbufs in distributor test * fix handshake check on legacy single distributor rte_distributor_return_pkt_single() * add patch 7 passing NULL to legacy API calls if no bufs are returned * add patch 8 fixing API documentation Lukasz Wojciechowski (8): app/test: fix deadlock in distributor test app/test: synchronize statistics between lcores app/test: fix freeing mbufs in distributor tests app/test: collect return mbufs in distributor test distributor: fix missing handshake synchronization distributor: fix handshake deadlock distributor: do not use oldpkt when not needed distributor: align API documentation with code app/test/test_distributor.c | 113 +++++++++++------- lib/librte_distributor/rte_distributor.c | 27 ++++- lib/librte_distributor/rte_distributor.h | 23 ++-- .../rte_distributor_single.c | 4 + 4 files changed, 110 insertions(+), 57 deletions(-)
Comments
Hello Lukasz, On Wed, Sep 23, 2020 at 3:25 PM Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> wrote: > > During review and verification of the patch created by Sarosh Arif: > "test_distributor: prevent memory leakages from the pool" I found out > that running distributor unit tests multiple times in a row causes fails. > So I investigated all the issues I found. > > There are few synchronization issues that might cause deadlocks > or corrupted data. They are fixed with this set of patches for both tests > and librte_distributor library. > > --- > v3: > * add missing acked and tested by statements from v1 > > v2: > * assign NULL to freed mbufs in distributor test > * fix handshake check on legacy single distributor > rte_distributor_return_pkt_single() > * add patch 7 passing NULL to legacy API calls if no bufs are returned > * add patch 8 fixing API documentation > > Lukasz Wojciechowski (8): > app/test: fix deadlock in distributor test > app/test: synchronize statistics between lcores > app/test: fix freeing mbufs in distributor tests > app/test: collect return mbufs in distributor test For these patches, we can use the "test/distributor: " prefix, and we then avoid repeating "in distributor test" > distributor: fix missing handshake synchronization > distributor: fix handshake deadlock > distributor: do not use oldpkt when not needed > distributor: align API documentation with code Thanks for working on those fixes ! Here is a suggestion: - we can move this new patch 7 before patch 3 in the series, and update the unit test: * by passing NULL to the first call to rte_distributor_get_pkt(), there is no need for buf[] array init in handle_work(), handle_work_with_free_mbufs() and handle_work_for_shutdown_test(), * at all points of those functions the buf[] array then contains only [0, num[ valid entries, * bonus point, this makes the UT check passing NULL oldpkt, - the former patch 3 is then easier to do since there is no need for buf[] array clearing, This gives the following diff, wdyt? diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index f31b54edf3..b7ab93ecbe 100644 --- a/app/test/test_distributor.c +++ b/app/test/test_distributor.c @@ -65,13 +65,10 @@ handle_work(void *arg) struct rte_mbuf *buf[8] __rte_cache_aligned; struct worker_params *wp = arg; struct rte_distributor *db = wp->dist; - unsigned int count = 0, num = 0; + unsigned int count = 0, num; unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED); - int i; - for (i = 0; i < 8; i++) - buf[i] = NULL; - num = rte_distributor_get_pkt(db, id, buf, buf, num); + num = rte_distributor_get_pkt(db, id, buf, NULL, 0); while (!quit) { __atomic_fetch_add(&worker_stats[id].handled_packets, num, __ATOMIC_ACQ_REL); @@ -277,22 +274,17 @@ handle_work_with_free_mbufs(void *arg) struct rte_mbuf *buf[8] __rte_cache_aligned; struct worker_params *wp = arg; struct rte_distributor *d = wp->dist; + unsigned int num; unsigned int i; - unsigned int num = 0; unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED); - for (i = 0; i < 8; i++) - buf[i] = NULL; - num = rte_distributor_get_pkt(d, id, buf, buf, 0); + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); while (!quit) { __atomic_fetch_add(&worker_stats[id].handled_packets, num, __ATOMIC_ACQ_REL); - for (i = 0; i < num; i++) { + for (i = 0; i < num; i++) rte_pktmbuf_free(buf[i]); - buf[i] = NULL; - } - num = rte_distributor_get_pkt(d, - id, buf, buf, 0); + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); } __atomic_fetch_add(&worker_stats[id].handled_packets, num, __ATOMIC_ACQ_REL); @@ -347,15 +339,13 @@ handle_work_for_shutdown_test(void *arg) struct rte_mbuf *buf[8] __rte_cache_aligned; struct worker_params *wp = arg; struct rte_distributor *d = wp->dist; - unsigned int num = 0; + unsigned int num; unsigned int i; unsigned int zero_id = 0; const unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED); - for (i = 0; i < 8; i++) - buf[i] = NULL; - num = rte_distributor_get_pkt(d, id, buf, buf, 0); + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE); if (id == zero_id && num > 0) { @@ -369,12 +359,9 @@ handle_work_for_shutdown_test(void *arg) while (!quit && !(id == zero_id && zero_quit)) { __atomic_fetch_add(&worker_stats[id].handled_packets, num, __ATOMIC_ACQ_REL); - for (i = 0; i < num; i++) { + for (i = 0; i < num; i++) rte_pktmbuf_free(buf[i]); - buf[i] = NULL; - } - num = rte_distributor_get_pkt(d, - id, buf, buf, 0); + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE); if (id == zero_id && num > 0) { @@ -392,21 +379,16 @@ handle_work_for_shutdown_test(void *arg) while (zero_quit) usleep(100); - for (i = 0; i < num; i++) { + for (i = 0; i < num; i++) rte_pktmbuf_free(buf[i]); - buf[i] = NULL; - } - num = rte_distributor_get_pkt(d, - id, buf, buf, 0); + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); while (!quit) { __atomic_fetch_add(&worker_stats[id].handled_packets, num, __ATOMIC_ACQ_REL); - for (i = 0; i < num; i++) { + for (i = 0; i < num; i++) rte_pktmbuf_free(buf[i]); - buf[i] = NULL; - } - num = rte_distributor_get_pkt(d, id, buf, buf, 0); + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); } } rte_distributor_return_pkt(d, id, buf, num);
Hello David, Thank you for your review. W dniu 25.09.2020 o 14:31, David Marchand pisze: > Hello Lukasz, > > On Wed, Sep 23, 2020 at 3:25 PM Lukasz Wojciechowski > <l.wojciechow@partner.samsung.com> wrote: >> During review and verification of the patch created by Sarosh Arif: >> "test_distributor: prevent memory leakages from the pool" I found out >> that running distributor unit tests multiple times in a row causes fails. >> So I investigated all the issues I found. >> >> There are few synchronization issues that might cause deadlocks >> or corrupted data. They are fixed with this set of patches for both tests >> and librte_distributor library. >> >> --- >> v3: >> * add missing acked and tested by statements from v1 >> >> v2: >> * assign NULL to freed mbufs in distributor test >> * fix handshake check on legacy single distributor >> rte_distributor_return_pkt_single() >> * add patch 7 passing NULL to legacy API calls if no bufs are returned >> * add patch 8 fixing API documentation >> >> Lukasz Wojciechowski (8): >> app/test: fix deadlock in distributor test >> app/test: synchronize statistics between lcores >> app/test: fix freeing mbufs in distributor tests >> app/test: collect return mbufs in distributor test > For these patches, we can use the "test/distributor: " prefix, and we > then avoid repeating "in distributor test" Changed > >> distributor: fix missing handshake synchronization >> distributor: fix handshake deadlock >> distributor: do not use oldpkt when not needed >> distributor: align API documentation with code > Thanks for working on those fixes ! > > Here is a suggestion: > > - we can move this new patch 7 before patch 3 in the series, and > update the unit test: > * by passing NULL to the first call to rte_distributor_get_pkt(), > there is no need for buf[] array init in handle_work(), > handle_work_with_free_mbufs() and handle_work_for_shutdown_test(), > * at all points of those functions the buf[] array then contains only > [0, num[ valid entries, > * bonus point, this makes the UT check passing NULL oldpkt, > > - the former patch 3 is then easier to do since there is no need for > buf[] array clearing, > > This gives the following diff, wdyt? I reorder patches as you suggested. I added unit test changes in the same patch that changes distributor lib. The changes follow your diff. This also simplified "fix freeing mbufs" patch. It's applied in v4. > > diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c > index f31b54edf3..b7ab93ecbe 100644 > --- a/app/test/test_distributor.c > +++ b/app/test/test_distributor.c > @@ -65,13 +65,10 @@ handle_work(void *arg) > struct rte_mbuf *buf[8] __rte_cache_aligned; > struct worker_params *wp = arg; > struct rte_distributor *db = wp->dist; > - unsigned int count = 0, num = 0; > + unsigned int count = 0, num; > unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED); > - int i; > > - for (i = 0; i < 8; i++) > - buf[i] = NULL; > - num = rte_distributor_get_pkt(db, id, buf, buf, num); > + num = rte_distributor_get_pkt(db, id, buf, NULL, 0); > while (!quit) { > __atomic_fetch_add(&worker_stats[id].handled_packets, num, > __ATOMIC_ACQ_REL); > @@ -277,22 +274,17 @@ handle_work_with_free_mbufs(void *arg) > struct rte_mbuf *buf[8] __rte_cache_aligned; > struct worker_params *wp = arg; > struct rte_distributor *d = wp->dist; > + unsigned int num; > unsigned int i; > - unsigned int num = 0; > unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED); > > - for (i = 0; i < 8; i++) > - buf[i] = NULL; > - num = rte_distributor_get_pkt(d, id, buf, buf, 0); > + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); > while (!quit) { > __atomic_fetch_add(&worker_stats[id].handled_packets, num, > __ATOMIC_ACQ_REL); > - for (i = 0; i < num; i++) { > + for (i = 0; i < num; i++) > rte_pktmbuf_free(buf[i]); > - buf[i] = NULL; > - } > - num = rte_distributor_get_pkt(d, > - id, buf, buf, 0); > + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); > } > __atomic_fetch_add(&worker_stats[id].handled_packets, num, > __ATOMIC_ACQ_REL); > @@ -347,15 +339,13 @@ handle_work_for_shutdown_test(void *arg) > struct rte_mbuf *buf[8] __rte_cache_aligned; > struct worker_params *wp = arg; > struct rte_distributor *d = wp->dist; > - unsigned int num = 0; > + unsigned int num; > unsigned int i; > unsigned int zero_id = 0; > const unsigned int id = __atomic_fetch_add(&worker_idx, 1, > __ATOMIC_RELAXED); > - for (i = 0; i < 8; i++) > - buf[i] = NULL; > > - num = rte_distributor_get_pkt(d, id, buf, buf, 0); > + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); > > zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE); > if (id == zero_id && num > 0) { > @@ -369,12 +359,9 @@ handle_work_for_shutdown_test(void *arg) > while (!quit && !(id == zero_id && zero_quit)) { > __atomic_fetch_add(&worker_stats[id].handled_packets, num, > __ATOMIC_ACQ_REL); > - for (i = 0; i < num; i++) { > + for (i = 0; i < num; i++) > rte_pktmbuf_free(buf[i]); > - buf[i] = NULL; > - } > - num = rte_distributor_get_pkt(d, > - id, buf, buf, 0); > + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); > > zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE); > if (id == zero_id && num > 0) { > @@ -392,21 +379,16 @@ handle_work_for_shutdown_test(void *arg) > while (zero_quit) > usleep(100); > > - for (i = 0; i < num; i++) { > + for (i = 0; i < num; i++) > rte_pktmbuf_free(buf[i]); > - buf[i] = NULL; > - } > - num = rte_distributor_get_pkt(d, > - id, buf, buf, 0); > + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); > > while (!quit) { > __atomic_fetch_add(&worker_stats[id].handled_packets, > num, __ATOMIC_ACQ_REL); > - for (i = 0; i < num; i++) { > + for (i = 0; i < num; i++) > rte_pktmbuf_free(buf[i]); > - buf[i] = NULL; > - } > - num = rte_distributor_get_pkt(d, id, buf, buf, 0); > + num = rte_distributor_get_pkt(d, id, buf, NULL, 0); > } > } > rte_distributor_return_pkt(d, id, buf, num); > >