Message ID | 1579012036-326214-1-git-send-email-vladimir.medvedkin@intel.com (mailing list archive) |
---|---|
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3AF70A04FF; Tue, 14 Jan 2020 15:27:21 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 1367B1C1B1; Tue, 14 Jan 2020 15:27:21 +0100 (CET) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id 57E221C116 for <dev@dpdk.org>; Tue, 14 Jan 2020 15:27:20 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 14 Jan 2020 06:27:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,433,1571727600"; d="scan'208";a="256316500" Received: from silpixa00400072.ir.intel.com ([10.237.222.213]) by fmsmga002.fm.intel.com with ESMTP; 14 Jan 2020 06:27:18 -0800 From: Vladimir Medvedkin <vladimir.medvedkin@intel.com> To: dev@dpdk.org Cc: konstantin.ananyev@intel.com, akhil.goyal@nxp.com Date: Tue, 14 Jan 2020 14:27:11 +0000 Message-Id: <1579012036-326214-1-git-send-email-vladimir.medvedkin@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1578920122-228017-1-git-send-email-vladimir.medvedkin@intel.com> References: <1578920122-228017-1-git-send-email-vladimir.medvedkin@intel.com> Subject: [dpdk-dev] [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions <dev.dpdk.org> List-Unsubscribe: <https://mails.dpdk.org/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://mails.dpdk.org/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <https://mails.dpdk.org/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Series |
integrate librte_ipsec SAD into ipsec-secgw
|
|
Message
Vladimir Medvedkin
Jan. 14, 2020, 2:27 p.m. UTC
This series integrates SA database (SAD) capabilities from ipsec library. The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. Also patch series removes hardcoded limitation for maximum number of SA's and SP's. v4: - put tunnel SA's into SAD with SPI_ONLY type for performance reason v3: - parse SA and SP into sorted array instead of linked list v2: - get rid of maximum sp limitation Vladimir Medvedkin (5): ipsec: move ipsec sad name length into .h examples/ipsec-secgw: implement inbound SAD examples/ipsec-secgw: integrate inbound SAD examples/ipsec-secgw: get rid of maximum sa limitation examples/ipsec-secgw: get rid of maximum sp limitation examples/ipsec-secgw/Makefile | 1 + examples/ipsec-secgw/ipsec-secgw.c | 4 +- examples/ipsec-secgw/ipsec.h | 11 +- examples/ipsec-secgw/meson.build | 2 +- examples/ipsec-secgw/parser.c | 4 + examples/ipsec-secgw/parser.h | 9 ++ examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++-------------- examples/ipsec-secgw/sad.c | 90 +++++++++++++ examples/ipsec-secgw/sad.h | 74 +++++++++++ examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- examples/ipsec-secgw/sp6.c | 112 +++++++++++----- lib/librte_ipsec/ipsec_sad.c | 20 +-- lib/librte_ipsec/rte_ipsec_sad.h | 2 + 13 files changed, 528 insertions(+), 171 deletions(-) create mode 100644 examples/ipsec-secgw/sad.c create mode 100644 examples/ipsec-secgw/sad.h
Comments
Hi Vladimir, There is more than 10% drop with this patchset on NXP hardware with both legacy mode and the ipsec lib mode. This would need some debugging. Didn't you see any drop on intel? Regards, Akhil > -----Original Message----- > From: Vladimir Medvedkin <vladimir.medvedkin@intel.com> > Sent: Tuesday, January 14, 2020 7:57 PM > To: dev@dpdk.org > Cc: konstantin.ananyev@intel.com; Akhil Goyal <akhil.goyal@nxp.com> > Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > This series integrates SA database (SAD) capabilities from ipsec library. > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > Also patch series removes hardcoded limitation for maximum number of SA's > and SP's. > > v4: > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > v3: > - parse SA and SP into sorted array instead of linked list > > v2: > - get rid of maximum sp limitation > > Vladimir Medvedkin (5): > ipsec: move ipsec sad name length into .h > examples/ipsec-secgw: implement inbound SAD > examples/ipsec-secgw: integrate inbound SAD > examples/ipsec-secgw: get rid of maximum sa limitation > examples/ipsec-secgw: get rid of maximum sp limitation > > examples/ipsec-secgw/Makefile | 1 + > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > examples/ipsec-secgw/ipsec.h | 11 +- > examples/ipsec-secgw/meson.build | 2 +- > examples/ipsec-secgw/parser.c | 4 + > examples/ipsec-secgw/parser.h | 9 ++ > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++-------------- > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > examples/ipsec-secgw/sad.h | 74 +++++++++++ > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > lib/librte_ipsec/ipsec_sad.c | 20 +-- > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > 13 files changed, 528 insertions(+), 171 deletions(-) > create mode 100644 examples/ipsec-secgw/sad.c > create mode 100644 examples/ipsec-secgw/sad.h > > -- > 2.7.4
Hi Vladimir, The lookup logic for SAD has been brought more closer to real use case, but it looks very high on CPU and should be optimized. We cannot have 10-15% drop because of this change in SA lookup for small packet(82B) sizes where CPU is bottleneck. For large packet sizes it will not impact. > > Hi Vladimir, > > There is more than 10% drop with this patchset on NXP hardware with both > legacy mode and the ipsec lib mode. This would need some debugging. > Didn't you see any drop on intel? > > Regards, > Akhil > > > > > This series integrates SA database (SAD) capabilities from ipsec library. > > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > > Also patch series removes hardcoded limitation for maximum number of SA's > > and SP's. > > > > v4: > > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > > > v3: > > - parse SA and SP into sorted array instead of linked list > > > > v2: > > - get rid of maximum sp limitation > > > > Vladimir Medvedkin (5): > > ipsec: move ipsec sad name length into .h > > examples/ipsec-secgw: implement inbound SAD > > examples/ipsec-secgw: integrate inbound SAD > > examples/ipsec-secgw: get rid of maximum sa limitation > > examples/ipsec-secgw: get rid of maximum sp limitation > > > > examples/ipsec-secgw/Makefile | 1 + > > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > > examples/ipsec-secgw/ipsec.h | 11 +- > > examples/ipsec-secgw/meson.build | 2 +- > > examples/ipsec-secgw/parser.c | 4 + > > examples/ipsec-secgw/parser.h | 9 ++ > > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++------------- > - > > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > > examples/ipsec-secgw/sad.h | 74 +++++++++++ > > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > > lib/librte_ipsec/ipsec_sad.c | 20 +-- > > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > > 13 files changed, 528 insertions(+), 171 deletions(-) > > create mode 100644 examples/ipsec-secgw/sad.c > > create mode 100644 examples/ipsec-secgw/sad.h > > > > -- > > 2.7.4
Hi Akhil, Indeed with our tests we also seeing ~15% perf drop for small packets (~90B) and ~3-4% drop for 1KB packets. While I am looking on a ways to minimize the drop, I think it would be hard, if possible at all to eliminate it completely. Reason for that: current SAD implementation is completely synthetic (using plain array structure indexed by SPI value). That provides a very low overhead, but doesn't provide expected functionality and can't be used in proper implementation. To measure plain IPsec performance without SAD user can still use '--signle-sa' option. On 15/01/2020 15:45, Akhil Goyal wrote: > Hi Vladimir, > > There is more than 10% drop with this patchset on NXP hardware with both legacy mode and the ipsec lib mode. This would need some debugging. > Didn't you see any drop on intel? > > Regards, > Akhil > >> -----Original Message----- >> From: Vladimir Medvedkin <vladimir.medvedkin@intel.com> >> Sent: Tuesday, January 14, 2020 7:57 PM >> To: dev@dpdk.org >> Cc: konstantin.ananyev@intel.com; Akhil Goyal <akhil.goyal@nxp.com> >> Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw >> >> This series integrates SA database (SAD) capabilities from ipsec library. >> The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. >> Also patch series removes hardcoded limitation for maximum number of SA's >> and SP's. >> >> v4: >> - put tunnel SA's into SAD with SPI_ONLY type for performance reason >> >> v3: >> - parse SA and SP into sorted array instead of linked list >> >> v2: >> - get rid of maximum sp limitation >> >> Vladimir Medvedkin (5): >> ipsec: move ipsec sad name length into .h >> examples/ipsec-secgw: implement inbound SAD >> examples/ipsec-secgw: integrate inbound SAD >> examples/ipsec-secgw: get rid of maximum sa limitation >> examples/ipsec-secgw: get rid of maximum sp limitation >> >> examples/ipsec-secgw/Makefile | 1 + >> examples/ipsec-secgw/ipsec-secgw.c | 4 +- >> examples/ipsec-secgw/ipsec.h | 11 +- >> examples/ipsec-secgw/meson.build | 2 +- >> examples/ipsec-secgw/parser.c | 4 + >> examples/ipsec-secgw/parser.h | 9 ++ >> examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++-------------- >> examples/ipsec-secgw/sad.c | 90 +++++++++++++ >> examples/ipsec-secgw/sad.h | 74 +++++++++++ >> examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- >> examples/ipsec-secgw/sp6.c | 112 +++++++++++----- >> lib/librte_ipsec/ipsec_sad.c | 20 +-- >> lib/librte_ipsec/rte_ipsec_sad.h | 2 + >> 13 files changed, 528 insertions(+), 171 deletions(-) >> create mode 100644 examples/ipsec-secgw/sad.c >> create mode 100644 examples/ipsec-secgw/sad.h >> >> -- >> 2.7.4
Hi Vladimir, The SA lookup logic and management is purely requirement based for the application. The application may only cater to <128 SAs which can be handled based on the current logic. –single-sa option cannot handle this. Sample applications in DPDK are there to showcase the best a hardware can deliver. IMO, we cannot allow this logic on NXP hardwares. We give performance numbers based on IPSec app to customers and we cannot allow 15% degradation. Other vendors(Marvell, ARM, AMD) please comment? Regards, Akhil From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com> Sent: Friday, January 17, 2020 10:35 PM To: Akhil Goyal <akhil.goyal@nxp.com>; dev@dpdk.org Cc: konstantin.ananyev@intel.com Subject: Re: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw Hi Akhil, Indeed with our tests we also seeing ~15% perf drop for small packets (~90B) and ~3-4% drop for 1KB packets. While I am looking on a ways to minimize the drop, I think it would be hard, if possible at all to eliminate it completely. Reason for that: current SAD implementation is completely synthetic (using plain array structure indexed by SPI value). That provides a very low overhead, but doesn't provide expected functionality and can't be used in proper implementation. To measure plain IPsec performance without SAD user can still use '--signle-sa' option. On 15/01/2020 15:45, Akhil Goyal wrote: Hi Vladimir, There is more than 10% drop with this patchset on NXP hardware with both legacy mode and the ipsec lib mode. This would need some debugging. Didn't you see any drop on intel? Regards, Akhil -----Original Message----- From: Vladimir Medvedkin <vladimir.medvedkin@intel.com><mailto:vladimir.medvedkin@intel.com> Sent: Tuesday, January 14, 2020 7:57 PM To: dev@dpdk.org<mailto:dev@dpdk.org> Cc: konstantin.ananyev@intel.com<mailto:konstantin.ananyev@intel.com>; Akhil Goyal <akhil.goyal@nxp.com><mailto:akhil.goyal@nxp.com> Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw This series integrates SA database (SAD) capabilities from ipsec library. The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. Also patch series removes hardcoded limitation for maximum number of SA's and SP's. v4: - put tunnel SA's into SAD with SPI_ONLY type for performance reason v3: - parse SA and SP into sorted array instead of linked list v2: - get rid of maximum sp limitation Vladimir Medvedkin (5): ipsec: move ipsec sad name length into .h examples/ipsec-secgw: implement inbound SAD examples/ipsec-secgw: integrate inbound SAD examples/ipsec-secgw: get rid of maximum sa limitation examples/ipsec-secgw: get rid of maximum sp limitation examples/ipsec-secgw/Makefile | 1 + examples/ipsec-secgw/ipsec-secgw.c | 4 +- examples/ipsec-secgw/ipsec.h | 11 +- examples/ipsec-secgw/meson.build | 2 +- examples/ipsec-secgw/parser.c | 4 + examples/ipsec-secgw/parser.h | 9 ++ examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++-------------- examples/ipsec-secgw/sad.c | 90 +++++++++++++ examples/ipsec-secgw/sad.h | 74 +++++++++++ examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- examples/ipsec-secgw/sp6.c | 112 +++++++++++----- lib/librte_ipsec/ipsec_sad.c | 20 +-- lib/librte_ipsec/rte_ipsec_sad.h | 2 + 13 files changed, 528 insertions(+), 171 deletions(-) create mode 100644 examples/ipsec-secgw/sad.c create mode 100644 examples/ipsec-secgw/sad.h -- 2.7.4 -- Regards, Vladimir -->
Hi Vladimir, Akhil,
Marvell is also observing 10-15% drop with the SAD change. I agree with Akhil's opinion and we are not in favor of making this change.
Thanks,
Anoob
From: Akhil Goyal <akhil.goyal@nxp.com>
Sent: Monday, January 20, 2020 12:14 PM
To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>; dev@dpdk.org
Cc: konstantin.ananyev@intel.com; Anoob Joseph <anoobj@marvell.com>; Thomas Monjalon <thomas@monjalon.net>; Ravi Kumar <ravi1.kumar@amd.com>; Ruifeng Wang <ruifeng.wang@arm.com>
Subject: [EXT] RE: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw
External Email
Hi Akhil, > Hi Vladimir, > The SA lookup logic and management is purely requirement based for the application. >The application may only cater to <128 SAs which can > be handled based on the current logic. Not always, current implementation can handle < 128 SA, whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). Yes, what we have right now has nearly zero overhead, and might be ok for some really simple show-cases. But for majority of production IPsec implementations, I believe that definitely wouldn't be enough. > –single-sa option cannot handle this. > Sample applications in DPDK are there to showcase the best a hardware can deliver. My thought was - that's the reason we have single-sa option - demonstrate best possible HW perf without minimal SW intervention. For something more serious than that, we use generic SAD implementation. > IMO, we cannot allow this logic on NXP hardwares. We > give performance numbers based on IPSec app to customers and we cannot allow 15% degradation. As Vladimir said, we are looking how to improve current SAD numbers and minimize the drop. But with same equals - plain array will always be faster than hash table, so not sure we will be able to match existing performance. So two questions: 1. What exact case you use for perf testing (total number of SAs, packets per burst belong to the same/different SAs)? Might be there is a way to speedup it. Again if 10-15% is not an affordable drop, which one is: zero or ...? 2. I think there are 2 different directions for ipsec-secgw: From one-side there is a desire to use it as a show-case for best-possible HW IPsec performance (which is understandable). From other side - attempt to make it as close as real-world generic ipsec processing app as possible (support for ESN, replay window, fragmented packets, generic proper SAD, etc). Obviously these goals contradict and it makes really hard for the same app to fulfill both. Any thoughts how to deal with that? One obvious would be to split the app, anything else? Konstantin > Other vendors(Marvell, ARM, AMD) please comment? > Regards, > Akhil > From: Medvedkin, Vladimir <mailto:vladimir.medvedkin@intel.com> > Sent: Friday, January 17, 2020 10:35 PM > To: Akhil Goyal <mailto:akhil.goyal@nxp.com>; mailto:dev@dpdk.org > Cc: mailto:konstantin.ananyev@intel.com > Subject: Re: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > Hi Akhil, > Indeed with our tests we also seeing ~15% perf drop for small packets (~90B) and ~3-4% drop for 1KB packets. While I am looking on a ways > to minimize the drop, I think it would be hard, if possible at all to eliminate it completely. > Reason for that: current SAD implementation is completely synthetic (using plain array structure indexed by SPI value). That provides a very > low overhead, but doesn't provide expected functionality and can't be used in proper implementation. > To measure plain IPsec performance without SAD user can still use '--signle-sa' option. > On 15/01/2020 15:45, Akhil Goyal wrote: > Hi Vladimir, > > There is more than 10% drop with this patchset on NXP hardware with both legacy mode and the ipsec lib mode. This would need some > debugging. > Didn't you see any drop on intel? > > Regards, > Akhil > > -----Original Message----- > From: Vladimir Medvedkin mailto:vladimir.medvedkin@intel.com > Sent: Tuesday, January 14, 2020 7:57 PM > To: mailto:dev@dpdk.org > Cc: mailto:konstantin.ananyev@intel.com; Akhil Goyal mailto:akhil.goyal@nxp.com > Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > This series integrates SA database (SAD) capabilities from ipsec library. > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > Also patch series removes hardcoded limitation for maximum number of SA's > and SP's. > > v4: > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > v3: > - parse SA and SP into sorted array instead of linked list > > v2: > - get rid of maximum sp limitation > > Vladimir Medvedkin (5): > ipsec: move ipsec sad name length into .h > examples/ipsec-secgw: implement inbound SAD > examples/ipsec-secgw: integrate inbound SAD > examples/ipsec-secgw: get rid of maximum sa limitation > examples/ipsec-secgw: get rid of maximum sp limitation > > examples/ipsec-secgw/Makefile | 1 + > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > examples/ipsec-secgw/ipsec.h | 11 +- > examples/ipsec-secgw/meson.build | 2 +- > examples/ipsec-secgw/parser.c | 4 + > examples/ipsec-secgw/parser.h | 9 ++ > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++-------------- > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > examples/ipsec-secgw/sad.h | 74 +++++++++++ > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > lib/librte_ipsec/ipsec_sad.c | 20 +-- > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > 13 files changed, 528 insertions(+), 171 deletions(-) > create mode 100644 examples/ipsec-secgw/sad.c > create mode 100644 examples/ipsec-secgw/sad.h > > -- > 2.7.4 > > -- > Regards, > Vladimir > -->
Hi Konstantin, > > Hi Akhil, > > > Hi Vladimir, > > The SA lookup logic and management is purely requirement based for the > application. > >The application may only cater to <128 SAs which can > > be handled based on the current logic. > > Not always, current implementation can handle < 128 SA, > whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). > Yes, what we have right now has nearly zero overhead, > and might be ok for some really simple show-cases. > But for majority of production IPsec implementations, > I believe that definitely wouldn't be enough. > > > –single-sa option cannot handle this. > > Sample applications in DPDK are there to showcase the best a hardware can > deliver. > > My thought was - that's the reason we have single-sa option - > demonstrate best possible HW perf without minimal SW intervention. > For something more serious than that, we use generic SAD implementation. > > > IMO, we cannot allow this logic on NXP hardwares. We > > give performance numbers based on IPSec app to customers and we cannot > allow 15% degradation. > > As Vladimir said, we are looking how to improve current SAD numbers > and minimize the drop. > But with same equals - plain array will always be faster than hash table, > so not sure we will be able to match existing performance. > So two questions: > 1. What exact case you use for perf testing > (total number of SAs, packets per burst belong to the same/different SAs)? > Might be there is a way to speedup it. > Again if 10-15% is not an affordable drop, which one is: zero or ...? We should add features judiciously, we cannot drop the performance of a benchmarking Application in lieu of adding functionality. We should only add features which are not Impacting the performance significantly. Every vendor may have different cases. We cannot tune for everybody. However, I see drop in 64 outbound 64 inbound SAs all with different SPI and IPs. Packets per burst = 32 all with different SAs. > 2. I think there are 2 different directions for ipsec-secgw: > From one-side there is a desire to use it as a show-case for best-possible HW > IPsec performance > (which is understandable). > From other side - attempt to make it as close as real-world generic ipsec > processing app as possible > (support for ESN, replay window, fragmented packets, generic proper SAD, > etc). > Obviously these goals contradict and it makes really hard for the same app to > fulfill both. > Any thoughts how to deal with that? > One obvious would be to split the app, anything else? We can have a fallback mechanism back to original functionality for whatever feature which has some perf drop. Splitting an app can be thought of but that would be similar to a full fledged IPSec stack like VPP-IPSec. > > Konstantin > > > Other vendors(Marvell, ARM, AMD) please comment? > > Regards, > > Akhil > > From: Medvedkin, Vladimir <mailto:vladimir.medvedkin@intel.com> > > Sent: Friday, January 17, 2020 10:35 PM > > To: Akhil Goyal <mailto:akhil.goyal@nxp.com>; mailto:dev@dpdk.org > > Cc: mailto:konstantin.ananyev@intel.com > > Subject: Re: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > Hi Akhil, > > Indeed with our tests we also seeing ~15% perf drop for small packets (~90B) > and ~3-4% drop for 1KB packets. While I am looking on a ways > > to minimize the drop, I think it would be hard, if possible at all to eliminate it > completely. > > Reason for that: current SAD implementation is completely synthetic (using > plain array structure indexed by SPI value). That provides a very > > low overhead, but doesn't provide expected functionality and can't be used in > proper implementation. > > To measure plain IPsec performance without SAD user can still use '--signle-sa' > option. > > On 15/01/2020 15:45, Akhil Goyal wrote: > > Hi Vladimir, > > > > There is more than 10% drop with this patchset on NXP hardware with both > legacy mode and the ipsec lib mode. This would need some > > debugging. > > Didn't you see any drop on intel? > > > > Regards, > > Akhil > > > > -----Original Message----- > > From: Vladimir Medvedkin mailto:vladimir.medvedkin@intel.com > > Sent: Tuesday, January 14, 2020 7:57 PM > > To: mailto:dev@dpdk.org > > Cc: mailto:konstantin.ananyev@intel.com; Akhil Goyal > mailto:akhil.goyal@nxp.com > > Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > This series integrates SA database (SAD) capabilities from ipsec library. > > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > > Also patch series removes hardcoded limitation for maximum number of SA's > > and SP's. > > > > v4: > > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > > > v3: > > - parse SA and SP into sorted array instead of linked list > > > > v2: > > - get rid of maximum sp limitation > > > > Vladimir Medvedkin (5): > > ipsec: move ipsec sad name length into .h > > examples/ipsec-secgw: implement inbound SAD > > examples/ipsec-secgw: integrate inbound SAD > > examples/ipsec-secgw: get rid of maximum sa limitation > > examples/ipsec-secgw: get rid of maximum sp limitation > > > > examples/ipsec-secgw/Makefile | 1 + > > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > > examples/ipsec-secgw/ipsec.h | 11 +- > > examples/ipsec-secgw/meson.build | 2 +- > > examples/ipsec-secgw/parser.c | 4 + > > examples/ipsec-secgw/parser.h | 9 ++ > > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++------------- > - > > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > > examples/ipsec-secgw/sad.h | 74 +++++++++++ > > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > > lib/librte_ipsec/ipsec_sad.c | 20 +-- > > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > > 13 files changed, 528 insertions(+), 171 deletions(-) > > create mode 100644 examples/ipsec-secgw/sad.c > > create mode 100644 examples/ipsec-secgw/sad.h > > > > -- > > 2.7.4 > > > > -- > > Regards, > > Vladimir > > -->
Hi All, > > Hi Konstantin, > > > > Hi Akhil, > > > > > Hi Vladimir, > > > The SA lookup logic and management is purely requirement based for the > > application. > > >The application may only cater to <128 SAs which can > > > be handled based on the current logic. > > > > Not always, current implementation can handle < 128 SA, > > whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). > > Yes, what we have right now has nearly zero overhead, > > and might be ok for some really simple show-cases. > > But for majority of production IPsec implementations, > > I believe that definitely wouldn't be enough. > > > > > –single-sa option cannot handle this. > > > Sample applications in DPDK are there to showcase the best a hardware can > > deliver. > > > > My thought was - that's the reason we have single-sa option - > > demonstrate best possible HW perf without minimal SW intervention. > > For something more serious than that, we use generic SAD implementation. > > > > > IMO, we cannot allow this logic on NXP hardwares. We > > > give performance numbers based on IPSec app to customers and we cannot > > allow 15% degradation. > > > > As Vladimir said, we are looking how to improve current SAD numbers > > and minimize the drop. > > But with same equals - plain array will always be faster than hash table, > > so not sure we will be able to match existing performance. > > So two questions: > > 1. What exact case you use for perf testing > > (total number of SAs, packets per burst belong to the same/different SAs)? > > Might be there is a way to speedup it. > > Again if 10-15% is not an affordable drop, which one is: zero or ...? > > We should add features judiciously, we cannot drop the performance of a > benchmarking > Application in lieu of adding functionality. We should only add features which > are not > Impacting the performance significantly. > Every vendor may have different cases. We cannot tune for everybody. > However, I see drop in 64 outbound 64 inbound SAs all with different SPI and IPs. > Packets per burst = 32 all with different SAs. > We can have two modes of lookup similar to l3fwd - EM and LPM. LPM is O(1) while EM is more realistic. Similar logic can be added here as well. With L3fwd also we showcase performance for best case(lpm) and the worst case(em) What Say? As discussed in the DPDK-status meeting today, this patchset need to be discussed in Techboard meeting. Please include this topic in the upcoming meeting on 29th Jan. -Akhil > > 2. I think there are 2 different directions for ipsec-secgw: > > From one-side there is a desire to use it as a show-case for best-possible HW > > IPsec performance > > (which is understandable). > > From other side - attempt to make it as close as real-world generic ipsec > > processing app as possible > > (support for ESN, replay window, fragmented packets, generic proper SAD, > > etc). > > Obviously these goals contradict and it makes really hard for the same app to > > fulfill both. > > Any thoughts how to deal with that? > > One obvious would be to split the app, anything else? > > We can have a fallback mechanism back to original functionality for whatever > feature > which has some perf drop. > Splitting an app can be thought of but that would be similar to a full fledged > IPSec stack > like VPP-IPSec. > > > > > Konstantin > > > > > Other vendors(Marvell, ARM, AMD) please comment? > > > Regards, > > > Akhil > > > From: Medvedkin, Vladimir <mailto:vladimir.medvedkin@intel.com> > > > Sent: Friday, January 17, 2020 10:35 PM > > > To: Akhil Goyal <mailto:akhil.goyal@nxp.com>; mailto:dev@dpdk.org > > > Cc: mailto:konstantin.ananyev@intel.com > > > Subject: Re: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > > > Hi Akhil, > > > Indeed with our tests we also seeing ~15% perf drop for small packets (~90B) > > and ~3-4% drop for 1KB packets. While I am looking on a ways > > > to minimize the drop, I think it would be hard, if possible at all to eliminate it > > completely. > > > Reason for that: current SAD implementation is completely synthetic (using > > plain array structure indexed by SPI value). That provides a very > > > low overhead, but doesn't provide expected functionality and can't be used > in > > proper implementation. > > > To measure plain IPsec performance without SAD user can still use '--signle- > sa' > > option. > > > On 15/01/2020 15:45, Akhil Goyal wrote: > > > Hi Vladimir, > > > > > > There is more than 10% drop with this patchset on NXP hardware with both > > legacy mode and the ipsec lib mode. This would need some > > > debugging. > > > Didn't you see any drop on intel? > > > > > > Regards, > > > Akhil > > > > > > -----Original Message----- > > > From: Vladimir Medvedkin mailto:vladimir.medvedkin@intel.com > > > Sent: Tuesday, January 14, 2020 7:57 PM > > > To: mailto:dev@dpdk.org > > > Cc: mailto:konstantin.ananyev@intel.com; Akhil Goyal > > mailto:akhil.goyal@nxp.com > > > Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > > > This series integrates SA database (SAD) capabilities from ipsec library. > > > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > > > Also patch series removes hardcoded limitation for maximum number of SA's > > > and SP's. > > > > > > v4: > > > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > > > > > v3: > > > - parse SA and SP into sorted array instead of linked list > > > > > > v2: > > > - get rid of maximum sp limitation > > > > > > Vladimir Medvedkin (5): > > > ipsec: move ipsec sad name length into .h > > > examples/ipsec-secgw: implement inbound SAD > > > examples/ipsec-secgw: integrate inbound SAD > > > examples/ipsec-secgw: get rid of maximum sa limitation > > > examples/ipsec-secgw: get rid of maximum sp limitation > > > > > > examples/ipsec-secgw/Makefile | 1 + > > > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > > > examples/ipsec-secgw/ipsec.h | 11 +- > > > examples/ipsec-secgw/meson.build | 2 +- > > > examples/ipsec-secgw/parser.c | 4 + > > > examples/ipsec-secgw/parser.h | 9 ++ > > > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++----------- > -- > > - > > > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > > > examples/ipsec-secgw/sad.h | 74 +++++++++++ > > > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > > > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > > > lib/librte_ipsec/ipsec_sad.c | 20 +-- > > > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > > > 13 files changed, 528 insertions(+), 171 deletions(-) > > > create mode 100644 examples/ipsec-secgw/sad.c > > > create mode 100644 examples/ipsec-secgw/sad.h > > > > > > -- > > > 2.7.4 > > > > > > -- > > > Regards, > > > Vladimir > > > -->
Hi Akhil, > > > > Hi Vladimir, > > > > The SA lookup logic and management is purely requirement based for the > > > application. > > > >The application may only cater to <128 SAs which can > > > > be handled based on the current logic. > > > > > > Not always, current implementation can handle < 128 SA, > > > whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). > > > Yes, what we have right now has nearly zero overhead, > > > and might be ok for some really simple show-cases. > > > But for majority of production IPsec implementations, > > > I believe that definitely wouldn't be enough. > > > > > > > –single-sa option cannot handle this. > > > > Sample applications in DPDK are there to showcase the best a hardware can > > > deliver. > > > > > > My thought was - that's the reason we have single-sa option - > > > demonstrate best possible HW perf without minimal SW intervention. > > > For something more serious than that, we use generic SAD implementation. > > > > > > > IMO, we cannot allow this logic on NXP hardwares. We > > > > give performance numbers based on IPSec app to customers and we cannot > > > allow 15% degradation. > > > > > > As Vladimir said, we are looking how to improve current SAD numbers > > > and minimize the drop. > > > But with same equals - plain array will always be faster than hash table, > > > so not sure we will be able to match existing performance. > > > So two questions: > > > 1. What exact case you use for perf testing > > > (total number of SAs, packets per burst belong to the same/different SAs)? > > > Might be there is a way to speedup it. > > > Again if 10-15% is not an affordable drop, which one is: zero or ...? > > > > We should add features judiciously, we cannot drop the performance of a > > benchmarking > > Application in lieu of adding functionality. We should only add features which > > are not > > Impacting the performance significantly. > > Every vendor may have different cases. We cannot tune for everybody. > > However, I see drop in 64 outbound 64 inbound SAs all with different SPI and IPs. > > Packets per burst = 32 all with different SAs. > > > > We can have two modes of lookup similar to l3fwd - EM and LPM. > LPM is O(1) while EM is more realistic. Similar logic can be added here as well. > With L3fwd also we showcase performance for best case(lpm) and the worst case(em) > What Say? We discussed it off-line with Vladimir and came up with similar idea: Have a proper/generic SAD implementation and add limited size plain-array on top of it as 1xway associative cache. So for the case when all active SAs fit into the cache and no SPI collisions, we should have same performance as now (with plain array). From other side, we'll still have generic/scalable/rfc compliant implementation. Sort of best sides from two words. Plans are to submit v4 with such approach in next few days. > > As discussed in the DPDK-status meeting today, this patchset need to be discussed in > Techboard meeting. Please include this topic in the upcoming meeting on 29th Jan. As I said above, I think we found a way to deal with it without any perf drop for existing cases. Though sure, if you feel some extra discussion is needed, let's request to put it into agenda. Konstantin > > -Akhil > > > > 2. I think there are 2 different directions for ipsec-secgw: > > > From one-side there is a desire to use it as a show-case for best-possible HW > > > IPsec performance > > > (which is understandable). > > > From other side - attempt to make it as close as real-world generic ipsec > > > processing app as possible > > > (support for ESN, replay window, fragmented packets, generic proper SAD, > > > etc). > > > Obviously these goals contradict and it makes really hard for the same app to > > > fulfill both. > > > Any thoughts how to deal with that? > > > One obvious would be to split the app, anything else? > > > > We can have a fallback mechanism back to original functionality for whatever > > feature > > which has some perf drop. > > Splitting an app can be thought of but that would be similar to a full fledged > > IPSec stack > > like VPP-IPSec. > > > > > > > > Konstantin > > > > > > > Other vendors(Marvell, ARM, AMD) please comment? > > > > Regards, > > > > Akhil > > > > From: Medvedkin, Vladimir <mailto:vladimir.medvedkin@intel.com> > > > > Sent: Friday, January 17, 2020 10:35 PM > > > > To: Akhil Goyal <mailto:akhil.goyal@nxp.com>; mailto:dev@dpdk.org > > > > Cc: mailto:konstantin.ananyev@intel.com > > > > Subject: Re: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > > > > > Hi Akhil, > > > > Indeed with our tests we also seeing ~15% perf drop for small packets (~90B) > > > and ~3-4% drop for 1KB packets. While I am looking on a ways > > > > to minimize the drop, I think it would be hard, if possible at all to eliminate it > > > completely. > > > > Reason for that: current SAD implementation is completely synthetic (using > > > plain array structure indexed by SPI value). That provides a very > > > > low overhead, but doesn't provide expected functionality and can't be used > > in > > > proper implementation. > > > > To measure plain IPsec performance without SAD user can still use '--signle- > > sa' > > > option. > > > > On 15/01/2020 15:45, Akhil Goyal wrote: > > > > Hi Vladimir, > > > > > > > > There is more than 10% drop with this patchset on NXP hardware with both > > > legacy mode and the ipsec lib mode. This would need some > > > > debugging. > > > > Didn't you see any drop on intel? > > > > > > > > Regards, > > > > Akhil > > > > > > > > -----Original Message----- > > > > From: Vladimir Medvedkin mailto:vladimir.medvedkin@intel.com > > > > Sent: Tuesday, January 14, 2020 7:57 PM > > > > To: mailto:dev@dpdk.org > > > > Cc: mailto:konstantin.ananyev@intel.com; Akhil Goyal > > > mailto:akhil.goyal@nxp.com > > > > Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > > > > > This series integrates SA database (SAD) capabilities from ipsec library. > > > > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > > > > Also patch series removes hardcoded limitation for maximum number of SA's > > > > and SP's. > > > > > > > > v4: > > > > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > > > > > > > v3: > > > > - parse SA and SP into sorted array instead of linked list > > > > > > > > v2: > > > > - get rid of maximum sp limitation > > > > > > > > Vladimir Medvedkin (5): > > > > ipsec: move ipsec sad name length into .h > > > > examples/ipsec-secgw: implement inbound SAD > > > > examples/ipsec-secgw: integrate inbound SAD > > > > examples/ipsec-secgw: get rid of maximum sa limitation > > > > examples/ipsec-secgw: get rid of maximum sp limitation > > > > > > > > examples/ipsec-secgw/Makefile | 1 + > > > > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > > > > examples/ipsec-secgw/ipsec.h | 11 +- > > > > examples/ipsec-secgw/meson.build | 2 +- > > > > examples/ipsec-secgw/parser.c | 4 + > > > > examples/ipsec-secgw/parser.h | 9 ++ > > > > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++----------- > > -- > > > - > > > > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > > > > examples/ipsec-secgw/sad.h | 74 +++++++++++ > > > > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > > > > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > > > > lib/librte_ipsec/ipsec_sad.c | 20 +-- > > > > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > > > > 13 files changed, 528 insertions(+), 171 deletions(-) > > > > create mode 100644 examples/ipsec-secgw/sad.c > > > > create mode 100644 examples/ipsec-secgw/sad.h > > > > > > > > -- > > > > 2.7.4 > > > > > > > > -- > > > > Regards, > > > > Vladimir > > > > -->
Hi Konstantin, > > Hi Akhil, > > > > > > Hi Vladimir, > > > > > The SA lookup logic and management is purely requirement based for the > > > > application. > > > > >The application may only cater to <128 SAs which can > > > > > be handled based on the current logic. > > > > > > > > Not always, current implementation can handle < 128 SA, > > > > whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). > > > > Yes, what we have right now has nearly zero overhead, > > > > and might be ok for some really simple show-cases. > > > > But for majority of production IPsec implementations, > > > > I believe that definitely wouldn't be enough. > > > > > > > > > –single-sa option cannot handle this. > > > > > Sample applications in DPDK are there to showcase the best a hardware > can > > > > deliver. > > > > > > > > My thought was - that's the reason we have single-sa option - > > > > demonstrate best possible HW perf without minimal SW intervention. > > > > For something more serious than that, we use generic SAD implementation. > > > > > > > > > IMO, we cannot allow this logic on NXP hardwares. We > > > > > give performance numbers based on IPSec app to customers and we > cannot > > > > allow 15% degradation. > > > > > > > > As Vladimir said, we are looking how to improve current SAD numbers > > > > and minimize the drop. > > > > But with same equals - plain array will always be faster than hash table, > > > > so not sure we will be able to match existing performance. > > > > So two questions: > > > > 1. What exact case you use for perf testing > > > > (total number of SAs, packets per burst belong to the same/different SAs)? > > > > Might be there is a way to speedup it. > > > > Again if 10-15% is not an affordable drop, which one is: zero or ...? > > > > > > We should add features judiciously, we cannot drop the performance of a > > > benchmarking > > > Application in lieu of adding functionality. We should only add features which > > > are not > > > Impacting the performance significantly. > > > Every vendor may have different cases. We cannot tune for everybody. > > > However, I see drop in 64 outbound 64 inbound SAs all with different SPI and > IPs. > > > Packets per burst = 32 all with different SAs. > > > > > > > We can have two modes of lookup similar to l3fwd - EM and LPM. > > LPM is O(1) while EM is more realistic. Similar logic can be added here as well. > > With L3fwd also we showcase performance for best case(lpm) and the worst > case(em) > > What Say? > > We discussed it off-line with Vladimir and came up with similar idea: > Have a proper/generic SAD implementation and add limited size plain-array > on top of it as 1xway associative cache. > So for the case when all active SAs fit into the cache and no SPI collisions, > we should have same performance as now (with plain array). > From other side, we'll still have generic/scalable/rfc compliant implementation. > Sort of best sides from two words. > Plans are to submit v4 with such approach in next few days. OK lets check the v4 before moving the discussion to techboard. @Thomas: Do you have more thoughts on this? Should we get it added in the agenda Or wait for the v4? > > > > > As discussed in the DPDK-status meeting today, this patchset need to be > discussed in > > Techboard meeting. Please include this topic in the upcoming meeting on 29th > Jan. > > As I said above, I think we found a way to deal with it without any perf drop > for existing cases. > Though sure, if you feel some extra discussion is needed, let's request to > put it into agenda. > > Konstantin > > > > > -Akhil > > > > > > 2. I think there are 2 different directions for ipsec-secgw: > > > > From one-side there is a desire to use it as a show-case for best-possible > HW > > > > IPsec performance > > > > (which is understandable). > > > > From other side - attempt to make it as close as real-world generic ipsec > > > > processing app as possible > > > > (support for ESN, replay window, fragmented packets, generic proper > SAD, > > > > etc). > > > > Obviously these goals contradict and it makes really hard for the same > app to > > > > fulfill both. > > > > Any thoughts how to deal with that? > > > > One obvious would be to split the app, anything else? > > > > > > We can have a fallback mechanism back to original functionality for > whatever > > > feature > > > which has some perf drop. > > > Splitting an app can be thought of but that would be similar to a full fledged > > > IPSec stack > > > like VPP-IPSec. > > > > > > > > > > > Konstantin > > > > > > > > > Other vendors(Marvell, ARM, AMD) please comment? > > > > > Regards, > > > > > Akhil > > > > > From: Medvedkin, Vladimir <mailto:vladimir.medvedkin@intel.com> > > > > > Sent: Friday, January 17, 2020 10:35 PM > > > > > To: Akhil Goyal <mailto:akhil.goyal@nxp.com>; mailto:dev@dpdk.org > > > > > Cc: mailto:konstantin.ananyev@intel.com > > > > > Subject: Re: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > > > > > > > Hi Akhil, > > > > > Indeed with our tests we also seeing ~15% perf drop for small packets > (~90B) > > > > and ~3-4% drop for 1KB packets. While I am looking on a ways > > > > > to minimize the drop, I think it would be hard, if possible at all to > eliminate it > > > > completely. > > > > > Reason for that: current SAD implementation is completely synthetic > (using > > > > plain array structure indexed by SPI value). That provides a very > > > > > low overhead, but doesn't provide expected functionality and can't be > used > > > in > > > > proper implementation. > > > > > To measure plain IPsec performance without SAD user can still use '-- > signle- > > > sa' > > > > option. > > > > > On 15/01/2020 15:45, Akhil Goyal wrote: > > > > > Hi Vladimir, > > > > > > > > > > There is more than 10% drop with this patchset on NXP hardware with > both > > > > legacy mode and the ipsec lib mode. This would need some > > > > > debugging. > > > > > Didn't you see any drop on intel? > > > > > > > > > > Regards, > > > > > Akhil > > > > > > > > > > -----Original Message----- > > > > > From: Vladimir Medvedkin mailto:vladimir.medvedkin@intel.com > > > > > Sent: Tuesday, January 14, 2020 7:57 PM > > > > > To: mailto:dev@dpdk.org > > > > > Cc: mailto:konstantin.ananyev@intel.com; Akhil Goyal > > > > mailto:akhil.goyal@nxp.com > > > > > Subject: [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw > > > > > > > > > > This series integrates SA database (SAD) capabilities from ipsec library. > > > > > The goal is to make ipsec-secgw RFC compliant regarding inbound SAD. > > > > > Also patch series removes hardcoded limitation for maximum number of > SA's > > > > > and SP's. > > > > > > > > > > v4: > > > > > - put tunnel SA's into SAD with SPI_ONLY type for performance reason > > > > > > > > > > v3: > > > > > - parse SA and SP into sorted array instead of linked list > > > > > > > > > > v2: > > > > > - get rid of maximum sp limitation > > > > > > > > > > Vladimir Medvedkin (5): > > > > > ipsec: move ipsec sad name length into .h > > > > > examples/ipsec-secgw: implement inbound SAD > > > > > examples/ipsec-secgw: integrate inbound SAD > > > > > examples/ipsec-secgw: get rid of maximum sa limitation > > > > > examples/ipsec-secgw: get rid of maximum sp limitation > > > > > > > > > > examples/ipsec-secgw/Makefile | 1 + > > > > > examples/ipsec-secgw/ipsec-secgw.c | 4 +- > > > > > examples/ipsec-secgw/ipsec.h | 11 +- > > > > > examples/ipsec-secgw/meson.build | 2 +- > > > > > examples/ipsec-secgw/parser.c | 4 + > > > > > examples/ipsec-secgw/parser.h | 9 ++ > > > > > examples/ipsec-secgw/sa.c | 256 +++++++++++++++++++++++------ > ----- > > > -- > > > > - > > > > > examples/ipsec-secgw/sad.c | 90 +++++++++++++ > > > > > examples/ipsec-secgw/sad.h | 74 +++++++++++ > > > > > examples/ipsec-secgw/sp4.c | 114 ++++++++++++----- > > > > > examples/ipsec-secgw/sp6.c | 112 +++++++++++----- > > > > > lib/librte_ipsec/ipsec_sad.c | 20 +-- > > > > > lib/librte_ipsec/rte_ipsec_sad.h | 2 + > > > > > 13 files changed, 528 insertions(+), 171 deletions(-) > > > > > create mode 100644 examples/ipsec-secgw/sad.c > > > > > create mode 100644 examples/ipsec-secgw/sad.h > > > > > > > > > > -- > > > > > 2.7.4 > > > > > > > > > > -- > > > > > Regards, > > > > > Vladimir > > > > > -->
23/01/2020 13:56, Akhil Goyal: > Hi Konstantin, > > > > Hi Akhil, > > > > > > > > Hi Vladimir, > > > > > > The SA lookup logic and management is purely requirement based for the > > > > > application. > > > > > >The application may only cater to <128 SAs which can > > > > > > be handled based on the current logic. > > > > > > > > > > Not always, current implementation can handle < 128 SA, > > > > > whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). > > > > > Yes, what we have right now has nearly zero overhead, > > > > > and might be ok for some really simple show-cases. > > > > > But for majority of production IPsec implementations, > > > > > I believe that definitely wouldn't be enough. > > > > > > > > > > > –single-sa option cannot handle this. > > > > > > Sample applications in DPDK are there to showcase the best a hardware > > can > > > > > deliver. > > > > > > > > > > My thought was - that's the reason we have single-sa option - > > > > > demonstrate best possible HW perf without minimal SW intervention. > > > > > For something more serious than that, we use generic SAD implementation. > > > > > > > > > > > IMO, we cannot allow this logic on NXP hardwares. We > > > > > > give performance numbers based on IPSec app to customers and we > > cannot > > > > > allow 15% degradation. > > > > > > > > > > As Vladimir said, we are looking how to improve current SAD numbers > > > > > and minimize the drop. > > > > > But with same equals - plain array will always be faster than hash table, > > > > > so not sure we will be able to match existing performance. > > > > > So two questions: > > > > > 1. What exact case you use for perf testing > > > > > (total number of SAs, packets per burst belong to the same/different SAs)? > > > > > Might be there is a way to speedup it. > > > > > Again if 10-15% is not an affordable drop, which one is: zero or ...? > > > > > > > > We should add features judiciously, we cannot drop the performance of a > > > > benchmarking > > > > Application in lieu of adding functionality. We should only add features which > > > > are not > > > > Impacting the performance significantly. > > > > Every vendor may have different cases. We cannot tune for everybody. > > > > However, I see drop in 64 outbound 64 inbound SAs all with different SPI and > > IPs. > > > > Packets per burst = 32 all with different SAs. > > > > > > > > > > We can have two modes of lookup similar to l3fwd - EM and LPM. > > > LPM is O(1) while EM is more realistic. Similar logic can be added here as well. > > > With L3fwd also we showcase performance for best case(lpm) and the worst > > case(em) > > > What Say? > > > > We discussed it off-line with Vladimir and came up with similar idea: > > Have a proper/generic SAD implementation and add limited size plain-array > > on top of it as 1xway associative cache. > > So for the case when all active SAs fit into the cache and no SPI collisions, > > we should have same performance as now (with plain array). > > From other side, we'll still have generic/scalable/rfc compliant implementation. > > Sort of best sides from two words. > > Plans are to submit v4 with such approach in next few days. > > OK lets check the v4 before moving the discussion to techboard. > @Thomas: Do you have more thoughts on this? Should we get it added in the agenda > Or wait for the v4? If v4 is good for both cases, it lowers the priority of the discussion. But still, it would be interesting to state the objectives of the examples: - show API usage? - show feature performance? - show best hardware performance? - what else?
> > > > > > > The SA lookup logic and management is purely requirement based for the > > > > > > application. > > > > > > >The application may only cater to <128 SAs which can > > > > > > > be handled based on the current logic. > > > > > > > > > > > > Not always, current implementation can handle < 128 SA, > > > > > > whose SPI%128 never match (let say it cant't handle SPI=1 and SPI=129). > > > > > > Yes, what we have right now has nearly zero overhead, > > > > > > and might be ok for some really simple show-cases. > > > > > > But for majority of production IPsec implementations, > > > > > > I believe that definitely wouldn't be enough. > > > > > > > > > > > > > –single-sa option cannot handle this. > > > > > > > Sample applications in DPDK are there to showcase the best a hardware > > > can > > > > > > deliver. > > > > > > > > > > > > My thought was - that's the reason we have single-sa option - > > > > > > demonstrate best possible HW perf without minimal SW intervention. > > > > > > For something more serious than that, we use generic SAD implementation. > > > > > > > > > > > > > IMO, we cannot allow this logic on NXP hardwares. We > > > > > > > give performance numbers based on IPSec app to customers and we > > > cannot > > > > > > allow 15% degradation. > > > > > > > > > > > > As Vladimir said, we are looking how to improve current SAD numbers > > > > > > and minimize the drop. > > > > > > But with same equals - plain array will always be faster than hash table, > > > > > > so not sure we will be able to match existing performance. > > > > > > So two questions: > > > > > > 1. What exact case you use for perf testing > > > > > > (total number of SAs, packets per burst belong to the same/different SAs)? > > > > > > Might be there is a way to speedup it. > > > > > > Again if 10-15% is not an affordable drop, which one is: zero or ...? > > > > > > > > > > We should add features judiciously, we cannot drop the performance of a > > > > > benchmarking > > > > > Application in lieu of adding functionality. We should only add features which > > > > > are not > > > > > Impacting the performance significantly. > > > > > Every vendor may have different cases. We cannot tune for everybody. > > > > > However, I see drop in 64 outbound 64 inbound SAs all with different SPI and > > > IPs. > > > > > Packets per burst = 32 all with different SAs. > > > > > > > > > > > > > We can have two modes of lookup similar to l3fwd - EM and LPM. > > > > LPM is O(1) while EM is more realistic. Similar logic can be added here as well. > > > > With L3fwd also we showcase performance for best case(lpm) and the worst > > > case(em) > > > > What Say? > > > > > > We discussed it off-line with Vladimir and came up with similar idea: > > > Have a proper/generic SAD implementation and add limited size plain-array > > > on top of it as 1xway associative cache. > > > So for the case when all active SAs fit into the cache and no SPI collisions, > > > we should have same performance as now (with plain array). > > > From other side, we'll still have generic/scalable/rfc compliant implementation. > > > Sort of best sides from two words. > > > Plans are to submit v4 with such approach in next few days. > > > > OK lets check the v4 before moving the discussion to techboard. > > @Thomas: Do you have more thoughts on this? Should we get it added in the agenda > > Or wait for the v4? > > If v4 is good for both cases, it lowers the priority of the discussion. > > But still, it would be interesting to state the objectives of the examples: > - show API usage? > - show feature performance? > - show best hardware performance? > - what else? Agree, that’s a good topic to discuss.