
[RFC,0/3] add feature arc in rte_graph

Message ID 20240907073115.1531070-1-nsaxena@marvell.com (mailing list archive)

Message

Nitin Saxena Sept. 7, 2024, 7:31 a.m. UTC
A feature arc represents an ordered list of features/protocols at a given
networking layer. It is a high-level abstraction to connect various
rte_graph nodes as feature nodes, and allow packet steering across
these nodes in a generic manner.

Features (or feature nodes) are nodes which handle partial or complete
processing of a protocol in the fast path, like the ipv4-rewrite node,
which adds rewrite data to an outgoing IPv4 packet.

In the above example, the outgoing interface (say "eth0") may have an
outbound IPsec policy enabled, hence packets must be steered from the
ipv4-rewrite node to the ipsec-outbound-policy node for outbound IPsec
policy lookup. On the other hand, packets routed to another interface
(eth1) will not be sent to the ipsec-outbound-policy node, as the IPsec
feature is disabled on eth1. Feature arc allows rte_graph applications
to manage such constraints easily.

The feature arc abstraction allows an rte_graph based application to:

1. Seamlessly steer packets across feature nodes based on whether a
feature is enabled or disabled on an interface. Features enabled on one
interface may not be enabled on another interface within the same
feature arc.

2. Enable/disable features on an interface at runtime, so that if a
feature is disabled, packets associated with that interface are not
steered to the corresponding feature node.

3. Hook custom/user-defined nodes to a feature node and steer packets
from the feature node to the custom node without changing the former's
fast path function.

4. Express features in a particular sequential order so that packets
are steered in an ordered way across nodes in the fast path. For
example, if the IPsec and IPv4 features are enabled on an ingress
interface, packets must be sent to the IPsec inbound policy node first
and then to the ipv4-lookup node.

This patch series adds a feature arc library in rte_graph and also adds
"ipv4-output" feature arc handling to the "ipv4-rewrite" node.

Nitin Saxena (3):
  graph: add feature arc support
  graph: add feature arc option in graph create
  graph: add IPv4 output feature arc

 lib/graph/graph.c                        |   1 +
 lib/graph/graph_feature_arc.c            | 959 +++++++++++++++++++++++
 lib/graph/graph_populate.c               |   7 +-
 lib/graph/graph_private.h                |   3 +
 lib/graph/meson.build                    |   2 +
 lib/graph/node.c                         |   2 +
 lib/graph/rte_graph.h                    |   3 +
 lib/graph/rte_graph_feature_arc.h        | 373 +++++++++
 lib/graph/rte_graph_feature_arc_worker.h | 548 +++++++++++++
 lib/graph/version.map                    |  17 +
 lib/node/ip4_rewrite.c                   | 476 ++++++++---
 lib/node/ip4_rewrite_priv.h              |   9 +-
 lib/node/node_private.h                  |  19 +-
 lib/node/rte_node_ip4_api.h              |   3 +
 14 files changed, 2325 insertions(+), 97 deletions(-)
 create mode 100644 lib/graph/graph_feature_arc.c
 create mode 100644 lib/graph/rte_graph_feature_arc.h
 create mode 100644 lib/graph/rte_graph_feature_arc_worker.h
  

Comments

David Marchand Oct. 8, 2024, 8:04 a.m. UTC | #1
Hi graph guys,


I see no non-RFC series following this original submission.
It will slip to the next release unless there is an objection.

Btw, I suggest copying Robin (and Christophe) for graph related changes.
  
Nitin Saxena Oct. 8, 2024, 2:26 p.m. UTC | #2
Hi David,

I just sent the v2 version of this patch series.

Will add Robin and Christophe from the next version onwards.

Thanks,
Nitin

  
Nitin Saxena Oct. 14, 2024, 11:11 a.m. UTC | #3
Hi David and all,

>> I see no non-RFC series following this original submission.
>> It will slip to next release unless there is an objection.

I had pushed the non-RFC patch series before the -rc1 date (11th Oct).
We have an ABI change in this patch series: https://patches.dpdk.org/project/dpdk/patch/20241010133111.2764712-3-nsaxena@marvell.com/
Could you help merge this patch series in rc2? Otherwise it has to wait for the next LTS.

Thanks,
Nitin

  
David Marchand Oct. 16, 2024, 9:24 a.m. UTC | #4
On Mon, Oct 14, 2024 at 1:12 PM Nitin Saxena <nsaxena@marvell.com> wrote:
> I had pushed non RFC patch series before -rc1 date (11th oct).
> We have an ABI change in this patch series https://patches.dpdk.org/project/dpdk/patch/20241010133111.2764712-3-nsaxena@marvell.com/
> Could you help merge this patch series in rc2 otherwise it has to wait for next LTS

Just read through the series; I am not confident with this addition.
It requires a lot of changes in the node code to support it, whereas it
should be something handled in/facilitated by the graph library
itself.
I did not read much from Robin or Christophe who have been writing
more node code than me.
I would prefer their opinion before going forward.


On the ABI topic.
As far as I can see, the only issue would be in extending struct
rte_node_register, but this can be solved with function versioning.
That change would have to be announced.

Am I missing something else?
  
Robin Jarry Oct. 16, 2024, 9:38 a.m. UTC | #5
Hi folks,

David Marchand, Oct 16, 2024 at 11:24:
> On Mon, Oct 14, 2024 at 1:12 PM Nitin Saxena <nsaxena@marvell.com> wrote:
>> I had pushed non RFC patch series before -rc1 date (11th oct).
>> We have an ABI change in this patch series https://patches.dpdk.org/project/dpdk/patch/20241010133111.2764712-3-nsaxena@marvell.com/
>> Could you help merge this patch series in rc2 otherwise it has to wait for next LTS
>
> Just read through the series, I am not confident with this addition.
> It requires a lot of changes in the node code for supporting it, where
> it should be something handled in/facilitated by the graph library
> itself.

As far as I can tell, it will be very complicated (if not impossible) to 
determine in a generic manner whether a packet must be steered towards 
a sub tree or not. The decision *must* come from the originating node in 
some way or another.

> I did not read much from Robin or Christophe who have been writing
> more node code than me.
> I would prefer their opinion before going forward.

This series is indeed very dense. I like the concept of having 
extensible sub trees in the graph but it feels like the implementation 
is more complex than it should be.

Lacking another solution, we went for a naive approach in grout. 
Basically, some nodes have undefined next nodes which are extended using 
a dedicated API.

https://github.com/DPDK/grout/blob/v0.2/modules/infra/datapath/eth_input.c#L23-L31

This API can be used by other nodes to attach themselves to these 
extensible nodes:

https://github.com/DPDK/grout/blob/v0.2/modules/ip/datapath/arp_input.c#L143
https://github.com/DPDK/grout/blob/v0.2/modules/ip/datapath/ip_input.c#L124
https://github.com/DPDK/grout/blob/v0.2/modules/ip6/datapath/ip6_input.c#L122

After which, the extensible nodes can steer the packets towards the 
correct downstream edge based on the dedicated classifier field:

https://github.com/DPDK/grout/blob/v0.2/modules/infra/datapath/eth_input.c#L79

Obviously, this does not natively support a per-interface sub tree 
traversal, but it can be done in the originating node based on packet 
private context data.
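
As a rough illustration of that pattern (not grout's actual API, just a
sketch built on the public rte_node edge functions), an extensible node
can be declared with no next nodes and grown at runtime by whoever
attaches to it:

    #include <rte_graph.h>

    /* Append "next_name" as a new outgoing edge of node "id" and return
     * the edge number the classifier can later use to steer packets. */
    static rte_edge_t
    attach_next_node(rte_node_t id, const char *next_name)
    {
        rte_edge_t edge = rte_node_edge_count(id);

        /* RTE_EDGE_ID_INVALID as "from" appends after the existing edges */
        if (rte_node_edge_update(id, RTE_EDGE_ID_INVALID, &next_name, 1) == 0)
            return RTE_EDGE_ID_INVALID; /* failed to extend the node */

        return edge;
    }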

This raises a more important question: how can we standardize the way 
private application data is passed from node to node? And how could we 
enforce this declaratively in the node register API?

Do you think we could find some middle ground that would not require 
such extensive changes?

Cheers,
Robin
  
Nitin Saxena Oct. 16, 2024, 1:50 p.m. UTC | #6
Hi Robin,

Thanks for the review
Please see my replies inline

Thanks,
Nitin

On Wed, Oct 16, 2024 at 3:08 PM Robin Jarry <rjarry@redhat.com> wrote:
>
> Hi folks,
>
> David Marchand, Oct 16, 2024 at 11:24:
> > On Mon, Oct 14, 2024 at 1:12 PM Nitin Saxena <nsaxena@marvell.com> wrote:
> >> I had pushed non RFC patch series before -rc1 date (11th oct).
> >> We have an ABI change in this patch series https://patches.dpdk.org/project/dpdk/patch/20241010133111.2764712-3-nsaxena@marvell.com/
> >> Could you help merge this patch series in rc2 otherwise it has to wait for next LTS
> >
> > Just read through the series, I am not confident with this addition.
> > It requires a lot of changes in the node code for supporting it, where
> > it should be something handled in/facilitated by the graph library
> > itself.
>
> As far as I can tell, it will be very complicated (if not impossible) to
> determine in a generic manner whether a packet must be steered towards
> a sub tree or not. The decision *must* come from the originating node in
> some way or another.

Nitin> I am not sure it *must* always come from the originating node.
What about a control plane which wants to enable the "IP4 feature" on
interface 'X' by assigning an IP address?
An originating node (say ip4-input) *must not* activate the IP4 lookup
sub-graph for interface "X" until the control plane assigns an IP
address to it.

Regarding the complexity of adopting feature arc changes in the fast path:
- a sub-optimal change for feature arc would be simple and trivial, but
at the cost of performance.
- complexity increases when feature arc changes are optimally
integrated (like the "ip4_rewrite" changes in the patch) with no
performance regression.

>
> > I did not read much from Robin or Christophe who have been writing
> > more node code than me.
> > I would prefer their opinion before going forward.
>
> This series is indeed very dense. I like the concept of having
> extensible sub trees in the graph but it feels like the implementation
> is more complex than it should be.
>
> Lacking of another solution, we went for a naive approach in grout.
> Basically, some nodes have undefined next nodes which are extended using
> a dedicated API.

Nitin> At an initial glance, it looks like "grout" is trying to
solve a use-case where a child is being added to the parent's
undefined next node. This creates a runtime parent-child
relationship.

On the other hand, feature arc not only creates parent-child
relationships but sibling-sibling relationships as well. Also,
enabling sub-graphs per interface is critical functionality in feature
arc that adds complexity.

Let's assume a use-case in the ingress direction, at the IPv4 layer,
where ipv4-input is the *originating node* and

- On interface X, the IPsec-policy, IPv4-classify and IPv4-lookup
sub-graphs are enabled in a sequential order
- On interface Y, the IPv4-classify and IPv4-lookup sub-graphs are
enabled in a sequential order, i.e. IPsec-policy is *disabled* on
interface Y

In the fast path, the following processing should happen for "mbuf0",
which is received on interface "X":
- "ipv4-input" sends mbuf0 to the first enabled sub-graph node for
interface X, "IPsec-policy"
- In "IPsec-policy" node processing, if the policy action results in a
"bypass" action for mbuf0, it must then be sent to the next enabled
sub-graph, i.e. "IPv4-classify" (from the "IPsec-policy" node)
- In "IPv4-classify" node processing, if classify fails for mbuf0, then
it should finally be sent to the "IPv4-lookup" node (from the
"IPv4-classify" node)

whereas for "mbuf1", received on interface Y, the following fast path
processing must happen:
- "ipv4-input" sends mbuf1 to the first enabled sub-graph node for
interface Y, "IPv4-classify"
- If "IPv4-classify" fails for mbuf1, then it should finally be sent
to the IPv4-lookup node

To behave differently for interface X and interface Y as above:
- First of all, IPsec-policy/IPv4-classify/IPv4-lookup must be
connected to the "ipv4-input" node (parent-child relationship)
- Also, IPsec-policy/IPv4-classify/IPv4-lookup must be connected
with each other (sibling-sibling relationship)
- Fast path APIs provide an *rte_edge_t* to send an mbuf from one node to
another node (see the sketch below)
   1. Based on the interface (either interface X or interface Y)
   2. Based on which node the fast path APIs are called from. The next
enabled feature/sub-graph can only be determined from the previous
enabled feature/sub-graph in the fast path
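
Here is a rough sketch of what the fast path of such a feature node
could look like (the helpers marked as hypothetical are mine, for
illustration only, and not the API proposed in this series):

    #include <rte_graph.h>
    #include <rte_graph_worker.h>
    #include <rte_mbuf.h>

    /* Hypothetical helpers, for illustration only */
    extern uint16_t iface_from_mbuf(struct rte_mbuf *mbuf);
    extern int feature_consumes_mbuf(struct rte_mbuf *mbuf);
    extern rte_edge_t next_enabled_feature_edge(struct rte_node *node,
                                                uint16_t iface);

    #define EDGE_TO_CONSUMER 0 /* e.g. edge towards IPv4-lookup (assumed) */

    static uint16_t
    feature_node_process(struct rte_graph *graph, struct rte_node *node,
                         void **objs, uint16_t nb_objs)
    {
        for (uint16_t i = 0; i < nb_objs; i++) {
            struct rte_mbuf *mbuf = objs[i];
            uint16_t iface = iface_from_mbuf(mbuf);
            rte_edge_t next;

            if (feature_consumes_mbuf(mbuf))
                /* e.g. IPsec policy matched: keep the packet in this
                 * feature's own sub-graph */
                next = EDGE_TO_CONSUMER;
            else
                /* bypass: steer to the next feature enabled on this
                 * interface, relative to the current node */
                next = next_enabled_feature_edge(node, iface);

            rte_node_enqueue_x1(graph, node, next, mbuf);
        }
        return nb_objs;
    }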

Not sure if grout handles the above use-cases in the same manner. AFAIR,
for any control plane change grout re-creates "graph" objects, which
may not be required with feature arc.

>
> https://github.com/DPDK/grout/blob/v0.2/modules/infra/datapath/eth_input.c#L23-L31
>
> This API can be used by other nodes to attach themselves to these
> extensible nodes:
>
> https://github.com/DPDK/grout/blob/v0.2/modules/ip/datapath/arp_input.c#L143
> https://github.com/DPDK/grout/blob/v0.2/modules/ip/datapath/ip_input.c#L124
> https://github.com/DPDK/grout/blob/v0.2/modules/ip6/datapath/ip6_input.c#L122
>
> After which, the extensible nodes can steer the packets towards the
> correct downstream edge based on the dedicated classifier field:
>
> https://github.com/DPDK/grout/blob/v0.2/modules/infra/datapath/eth_input.c#L79
>
> Obviously, this does not natively support a per-interface sub tree
> traversal, but it can be done in the originating node based on packet
> private context data.

Nitin> Expressing per interface sub-tree traversal is the key
functionality of feature arc.

>
> This raises a more important question: how can we standardize the way
> private application data is passed from node to node? And how could we
> enforce this declaratively in the node register API?

Nitin> What you are suggesting here (node to node application data
exchange) can be done in the rte_node_register API but, IMO, this is not
related to feature arc.
Feature arc is not just between parent and child nodes but also
between siblings (as explained above).

>
> Do you think we could find some middle ground that would not require
> such extensive changes?

Nitin> If you are pointing to the ipv4-rewrite changes, I had an internal
review comment to add those changes without any *performance
regression*.
Sub-optimal code would be much simpler but at the cost of performance.

>
> Cheers,
> Robin
>
  
Nitin Saxena Oct. 17, 2024, 7:03 a.m. UTC | #7
Hi Robin/David and all,

We realized the feature arc patch series is difficult to understand as
a new concept. Our objectives with the feature arc changes are the following:

1. Allow reusability of standard DPDK nodes (defined in lib/node/*)
with out-of-tree applications (like grout). Currently, out-of-tree
graph applications are duplicating standard nodes instead of reusing the
ones which are available. In the long term, we would like to mature
standard DPDK nodes with the flexibility of hooking them to out-of-tree
application nodes.

2. Flexibility to enable/disable sub-graphs per interface based on
runtime configuration updates. Protocol sub-graphs can be selectively
enabled for a few (or all) interfaces at runtime.

3. More than one sub-graph/feature can be enabled on an interface,
so a packet has to follow a sequentially ordered node path on worker
cores. Packets may need to move from one sub-graph to another sub-graph
per interface.

4. Last but not least, an optimized implementation which does not stop
(or only minimally stops) worker cores for any control plane runtime
update. Any performance regression should also be avoided.

I am planning to create a draft presentation on feature arc which I
can share, when ready, to discuss. If needed, I can also plan to
present it in one of the DPDK community meetings.
There we can also discuss if there are any alternatives for achieving
the above objectives.

Thanks,
Nitin
  
Robin Jarry Oct. 17, 2024, 7:50 a.m. UTC | #8
Hi Nitin, all,

Nitin Saxena, Oct 17, 2024 at 09:03:
> Hi Robin/David and all,
>
> We realized the feature arc patch series is difficult to understand as 
> a new concept. Our objectives are following with feature arc changes
>
> 1. Allow reusability of standard DPDK nodes (defined in lib/nodes/*) 
>    with out-of-tree applications (like grout). Currently out-of-tree 
>    graph applications are duplicating standard nodes but not reusing 
>    the standard ones which are available. In the long term, we would 
>    like to mature standard DPDK nodes with flexibility of hooking them 
>    to out-of-tree application nodes.

It would be ideal if the in-built nodes could be reused. When we started 
working on grout, I tried multiple approaches where I could reuse these 
nodes, but all failed. The nodes' public API seems tailored for app/graph 
but does not fit well with other control plane implementations.

One of the main issues I had is that the ethdev_rx and ethdev_tx nodes 
are cloned per rxq / txq associated with a graph worker. The rte_node 
API requires that every clone has a unique name. This in turn makes hot 
plugging of DPDK ports very complex, if not impossible.

For example, with the in-built nodes, it is not possible to change the 
number of ports or their number of RX queues without destroying the 
whole graph and creating a new one from scratch.

Also, the current implementation of "ip{4,6}-rewrite" handles writing 
Ethernet header data. This would prevent using this node for an 
IP-in-IP tunnel interface as we did in grout.

Do you think we could change the in-built nodes to enforce OSI layer 
separation of concerns? It would make them much more flexible. It may 
cause a slight drop of performance because you'd be splitting processing 
in two different nodes. But I think flexibility is more important. 
Otherwise, the in-built nodes can only be used for very specific 
use-cases.

Finally, I would like to improve the rte_node API to allow defining and 
enforcing per-packet metadata that every node expects as input. The 
current in-built nodes rely on mbuf dynamic fields for this but this 
means you only have 9x32 bits available. And using all of these may 
break some drivers (ixgbe) that rely on dynfields to work. Have you 
considered using mbuf private data for this?
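
(For readers less familiar with the current mechanism, here is a minimal 
sketch of how a per-packet field is registered and accessed with the 
existing dynfield API; the field name is made up:)

    #include <stdalign.h>
    #include <rte_mbuf.h>
    #include <rte_mbuf_dyn.h>

    /* Made-up field name, for illustration only */
    static const struct rte_mbuf_dynfield node_meta_desc = {
        .name = "example_node_meta",
        .size = sizeof(uint32_t),
        .align = alignof(uint32_t),
    };
    static int node_meta_offset = -1;

    static int
    node_meta_init(void)
    {
        node_meta_offset = rte_mbuf_dynfield_register(&node_meta_desc);
        return node_meta_offset < 0 ? -1 : 0;
    }

    static inline uint32_t *
    node_meta(struct rte_mbuf *m)
    {
        /* Returns a pointer to this node's 32-bit metadata in the mbuf */
        return RTE_MBUF_DYNFIELD(m, node_meta_offset, uint32_t *);
    }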

>
> 2. Flexibility to enable/disable sub-graphs per interface based on the 
>    runtime configuration updates. Protocol sub-graphs can be 
>    selectively enabled for few (or all interfaces) at runtime
>
> 3. More than one sub-graphs/features can be enabled on an interface. 
>    So a packet has to follow a sequential ordering node path on worker 
>    cores. Packets may need to move from one sub-graph to another 
>    sub-graph per interface
>
> 4. Last but not least, an optimized implementation which does not (or 
>    minimally) stop worker cores for any control plane runtime updates. 
>    Any performance regression should also be avoided
>
> I am planning to create a draft presentation on feature arc which 
> I can share, when ready, to discuss. If needed, I can also plan to 
> present that in one of the DPDK community meetings. Their we can also 
> discuss if there are any alternatives of achieving above objectives

Looking forward to this.

Thanks!
  
Christophe Fontaine Oct. 17, 2024, 8:32 a.m. UTC | #9
Hi all,

What about the following steps:
- update the nodes so they work on the current layer (example: for all L3 nodes, the current mbuf data offset *must* be pointing to the IP header)
- define a public data structure that would be shared across nodes through priv data, and not dynfields? This structure would be the "internal API" (so, one that has to be tracked across DPDK releases) between nodes.
We'd need common data shared for all the nodes as well as specific data between 2 nodes.
As we get to this point, this (hopefully) will help with node reusability.

- Update the feature arcs to leverage this well-known structure, and refine the API
- Define which part of the stack needs to be defined as a feature arc, with the benefit of the generic API to enable/disable that feature, and which part needs to be dynamically pluggable.
For instance, for a router, it may not make sense to define IPv4 support as a feature arc.
So, we'd statically connect eth_input to ip_input.
Yet, LLDP support is a good candidate for a feature arc: we need to configure it per interface, and it is independent of the main graph.

WDYT?
Christophe

  
Nitin Saxena Oct. 17, 2024, 8:48 a.m. UTC | #10
Hi Robin,

See inline comments

Thanks,
Nitin

On Thu, Oct 17, 2024 at 1:20 PM Robin Jarry <rjarry@redhat.com> wrote:
>
> Hi Nitin, all,
>
> Nitin Saxena, Oct 17, 2024 at 09:03:
> > Hi Robin/David and all,
> >
> > We realized the feature arc patch series is difficult to understand as
> > a new concept. Our objectives are following with feature arc changes
> >
> > 1. Allow reusability of standard DPDK nodes (defined in lib/nodes/*)
> >    with out-of-tree applications (like grout). Currently out-of-tree
> >    graph applications are duplicating standard nodes but not reusing
> >    the standard ones which are available. In the long term, we would
> >    like to mature standard DPDK nodes with flexibility of hooking them
> >    to out-of-tree application nodes.
>
> It would be ideal if the in-built nodes could be reused. When we started
> working on grout, I tried multiple approaches where I could reuse these
> nodes, but all failed. The nodes public API seems tailored for app/graph
> but does not fit well with other control plane implementations.
>
> One of the main issues I had is that the ethdev_rx and ethdev_tx nodes
> are cloned per rxq / txq associated with a graph worker. The rte_node
> API requires that every clone has a unique name. This in turn makes hot
> plugging of DPDK ports very complex, if not impossible.

Agreed. I guess hot plugging of DPDK ports was not an objective when the
initial changes went in. But we can add hot-plugging functionality
without affecting performance.

>
> For example, with the in-built nodes, it is not possible to change the
> number of ports or their number of RX queues without destroying the
> whole graph and creating a new one from scratch.

Coincidentally, I have also encountered these technical issues while
writing an out-of-tree application [1]. I had internal discussions
with @Jerin Jacob  and other graph maintainers to fix these
shortcomings. If you want, we can collaborate on fixing these issues

For [port, rq] pair mapping to a worker core, I have an alternate design
[2] which currently stops worker cores. It can be enhanced by an
RCU-based scheme for an ideal DPDK implementation.

[1]: https://marvellembeddedprocessors.github.io/dao/guides/applications/secgw-graph.html
[2]: https://github.com/MarvellEmbeddedProcessors/dao/blob/dao-devel/app/secgw-graph/nodes/rxtx/ethdev-rx.c#L27

>
> Also, the current implementation of "ip{4,6}-rewrite" handles writing
> ethernet header data. This would prevent it from using this node for an
> IP-in-IP tunnel interface as we did in grout.

For IP-in-IP, a separate rewrite node would be required which computes
the checksum etc. but does not add rewrite data.

>
> Do you think we could change the in-built nodes to enforce OSI layer
> separation of concerns? It would make them much more flexible.

Yes. We are also in agreement to make RFC-compliant, optimized in-built
nodes with such flexibility in place.

> It may
> cause a slight drop of performance because you'd be splitting processing
> in two different nodes. But I think flexibility is more important.
> Otherwise, the in-built nodes can only be used for very specific
> use-cases.
>
> Finally, I would like to improve the rte_node API to allow defining and
> enforcing per-packet metadata that every node expects as input. The
> current in-built nodes rely on mbuf dynamic fields for this but this
> means you only have 9x32 bits available. And using all of these may
> break some drivers (ixgbe) that rely on dynfields to work. Have you
> considered using mbuf private data for this?

IMO, "node_mbuf_priv_t" would be ideal for most of the use-cases as
it fits in the second 64B cache line. With mbuf private data, the fast
path has to access another cache line per packet, which may not be
efficient from a performance PoV. But we can discuss it in more detail.
Also, I thought of adding "sw_if_index" (which is not the same as
port_id) to accommodate IP-in-IP like software interfaces.

>
> >
> > 2. Flexibility to enable/disable sub-graphs per interface based on the
> >    runtime configuration updates. Protocol sub-graphs can be
> >    selectively enabled for few (or all interfaces) at runtime
> >
> > 3. More than one sub-graphs/features can be enabled on an interface.
> >    So a packet has to follow a sequential ordering node path on worker
> >    cores. Packets may need to move from one sub-graph to another
> >    sub-graph per interface
> >
> > 4. Last but not least, an optimized implementation which does not (or
> >    minimally) stop worker cores for any control plane runtime updates.
> >    Any performance regression should also be avoided
> >
> > I am planning to create a draft presentation on feature arc which
> > I can share, when ready, to discuss. If needed, I can also plan to
> > present that in one of the DPDK community meetings. Their we can also
> > discuss if there are any alternatives of achieving above objectives
>
> Looking forward to this.

Sure. Will share ppt asap

>
> Thanks!
>
  
Nitin Saxena Oct. 17, 2024, 10:56 a.m. UTC | #11
Hi Christophe,

Please see inline comments

Thanks,
Nitin

On Thu, Oct 17, 2024 at 2:02 PM Christophe Fontaine <cfontain@redhat.com> wrote:
>
> Hi all,
>
> What about the following steps:
> - update the nodes so they work on the current layer (example: for all L3 nodes, the current mbuf data offset *must* be pointing to the IP header)

Agreed. It would be better if nodes used the
rte_pktmbuf_[append()/shrink(), etc.] APIs to manipulate the layer data
offset.

> - define a public data structure that would be shared across nodes through priv data, and not dynfields ?

Eventually, public data structures should be defined to serve *a
purpose*. Do you refer to creating a generic public structure? If yes,
IMO, it may not be tuned for performance.
IMO, we need to create public structures for each specific purpose.
Feature arc is also a public data structure which optimally saves the
following variables in an 8-byte compact structure
(rte_graph_feature_daa_t) for every interface:
- rte_edge_t (uint16_t)
- next enabled feature (uint8_t) per index (from the current node)
- per-interface, feature-specific user_data (uint32_t)

Due to its compact nature, 8 such objects per interface can be saved
in one 64B cache line. So, IMO, it is better to create public
structures for a given purpose and optimally define their fields and
APIs.
Also, feature arc specific 3B data is saved in an mbuf dynfield. Hard to
say if priv data would provide a better solution.
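
A sketch of the layout implied above, purely to illustrate the sizing
argument (the field names below are assumptions, not the structure
defined in the patch):

    #include <stdint.h>

    /* Assumed layout of the compact per-interface feature data described
     * above; actual names/packing in the series may differ. */
    struct feature_data_sketch {
        uint16_t edge;         /* rte_edge_t to the feature node */
        uint8_t next_feature;  /* next enabled feature index from this node */
        uint8_t pad;           /* explicit padding for alignment */
        uint32_t user_data;    /* per-interface, feature-specific data */
    };
    /* sizeof(struct feature_data_sketch) == 8, so 8 entries fit in one
     * 64B cache line */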

> This structure would be the "internal api" (so, that has to be tracked across dpdk releases) between nodes.
> We’d need common data shared for all the nodes as well as specific data between 2 nodes.
> As we get to this point, this (hopefully) will help with the node reusability.

Feature arc also maintains data between 2 nodes per interface and also
for all nodes which are added as features.

>
> - Update the feature arcs to leverage this well known structure, and refine the api
> - Define which part of the stack needs to be defined as a feature arc, with the benefit of the generic API to enable/disable that feature, and which part needs to be dynamically pluggable.
> For instance, for a router, it may not make sense to define IPv4 support as a feature arc.
> So, we’d statically connect eth_input to ip_input.

Agreed

> Yet, lldp support is a good candidate for a feature arc: we need to configure it per interface, and this is independent of the main graph.
>

There would be more protocols which need to be enabled per interface.

> WDYT?
> Christophe