[RFC,0/3] introduce IF proxy library

Message ID 20200114142517.29522-1-aostruszka@marvell.com (mailing list archive)

Andrzej Ostruszka [C] Jan. 14, 2020, 2:25 p.m. UTC
What is this useful for
=======================

Usually, when an ethernet port is assigned to DPDK it vanishes from the
system and the user loses the ability to control it via normal configuration
utilities (e.g. those from the iproute2 package).  Moreover, by default a
DPDK application is not aware of the network configuration of the system.

To address both of these issues the application needs to:
- add some command line interface (or other mechanism) allowing for
  control of the port and its configuration
- query the status of network configuration and monitor its changes

The purpose of this library is to help with both of these tasks (as long
as they remain in the domain of configuration available to the system).  In
other words, if a DPDK application has some special needs that cannot be
addressed by the normal system configuration utilities, then they need
to be solved by the application itself.

The connection between DPDK and the system is based on the existence of
ports that are visible to both DPDK and the system (like Tap, KNI and
possibly some other drivers).  These ports serve as interface
proxies.

Let's visualize the action of the library by the following example:

              Linux             |            DPDK
==============================================================
                                |
                                |   +-------+       +-------+
                                |   | Port1 |       | Port2 |
"ip link set dev tap1 mtu 1600" |   +-------+       +-------+
                          |     |       ^               ^
                          |  +------+   | mtu_change    |
                          `->| Tap1 |---' callback      |
                             +------+                   |
"ip addr add 198.51.100.14 \    |                       |
                  dev tap2"     |                       |
                          |  +------+                   |
                          `->| Tap2 |-------------------'
                             +------+   addr_add callback 
                                |
"ip route add 198.0.2.0/24 \    |
                  dev eth0"     |
                          |     |   route_add callback
                          `------------->
                                |

So we have two ports, Port1 and Port2, that are not visible to the system.
We create two proxy interfaces (here based on the Tap driver) and bind the
ports to their proxies.  When the user issues a command changing the MTU of
the Tap1 interface, the library notes this and calls the "mtu_change"
callback for Port1.  Similarly, when the user adds an IPv4 address to the
Tap2 interface, the "addr_add" callback is called for Port2.  Note also that
not only port related callbacks are available - for example you can also
get information about the routing table.  See below for a complete list of
available callbacks.

Please note that nothing has been mentioned about forwarding of
packets between the system and DPDK.  Since the proxies are normal DPDK
ports you can receive/send on them via the usual RX/TX burst API.  However,
since the library is not aware of the structure of packet processing
used by the application, it cannot automatically forward the packets - it
is the responsibility of the application to include proxy ports in its
packet processing engine.

As mentioned above the intention of the library is to:
- provide information about network configuration that would allow
  application to decide what to do with the packets received on DPDK
  ports,
- allow for control of the ports via standard configuration utilities

Although the library only helps you to identify proxy for given port
(and vice versa) and calls appropriate callbacks it does open some
interesting possibilities.  For example you can use the proxy ports to
forward packets for protocols that you do not wish to handle in DPDK
application to the system protocol stack and just listen to the
configuration changes - so that way you can "offload" handling of those
protocols to the system.


Why this RFC
============

We would like to solicit some input from the community:
- regarding usefulness of this library
- what is missing or what needs to be changed
- about currently proposed API
- any other suggestions and/or improvements are also welcome


How to use it
=============

Usage of this library is rather simple.  You have to:
1. Create proxy (if you don't have port suitable for being proxy or you
  have one but do not wish to use it as a proxy).
2. Bind port to proxy.
3. Register callbacks.
4. Start listening to the network configuration.

The only mandatory requirement for DPDK port to be able to act as
a proxy is that it is visible in the system - this is checked during
port to proxy binding by calling rte_eth_dev_info_get() on proxy port
and inspecting 'if_index' field (it has to be non-zero).
One can create such port in the application by calling:

  proxy_id = rte_ifpx_create(RTE_IFPX_DEFAULT);

Upon success this returns id of DPDK proxy port created
(RTE_MAX_ETHPORTS on failure).  The argument selects type of proxy port
to create (currently Tap/KNI only).  This function actually is just
a wrapper around:

  uint16_t rte_ifpx_create_by_devarg(const char *devarg);

creating a valid 'devarg' string for the chosen type of proxy.  If you have
another driver capable of acting as a proxy you can call
rte_ifpx_create_by_devarg() directly, passing the appropriate argument.

Once you have id of both port and proxy you can bind the two via:

  rte_ifpx_port_bind(port_id, proxy_id);

This creates a logical binding - as mentioned above there is no automatic
packet forwarding.  With this binding, whenever the user changes the state of
the proxy interface in the system (link up/down, MAC/MTU change, IPv4/IPv6
address add/remove) the appropriate callback is called for the bound port.

So far we've mentioned several times that the library calls callbacks.
They are grouped in 'struct rte_ifpx_callbacks' and user provides them
to the library via:

  rte_ifpx_callbacks_register(&cbs);

It is worth mentioning that the context (lcore/thread) in which these
callbacks are called is implementation defined.  It might differ between
different platforms, so the application needs to assume that some kind
of inter lcore/thread synchronization/communication is required.
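
One possible way to do that hand-off is sketched below.  This is only an
illustration, not code from the patches: it assumes all callbacks are invoked
from a single thread and that a single data-plane lcore consumes the events,
so a single-producer/single-consumer ring built on C11 atomics is enough.
The names cfg_event, cfg_ring and CFG_RING_SIZE are made up for the example;
a real DPDK application would more likely use rte_ring.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

enum cfg_event_type { CFG_EV_MTU, CFG_EV_LINK };

struct cfg_event {
    enum cfg_event_type type;
    uint16_t port_id;
    uint32_t value;              /* new MTU or link state */
};

#define CFG_RING_SIZE 64u        /* power of two, so wrapping is a mask */

struct cfg_ring {
    struct cfg_event slots[CFG_RING_SIZE];
    atomic_uint head;            /* written only by the callback context */
    atomic_uint tail;            /* written only by the consuming lcore */
};

static struct cfg_ring ring;

/* Producer side: returns false when the ring is full. */
static bool cfg_ring_push(struct cfg_ring *r, struct cfg_event ev)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == CFG_RING_SIZE)
        return false;
    r->slots[head & (CFG_RING_SIZE - 1)] = ev;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool cfg_ring_pop(struct cfg_ring *r, struct cfg_event *ev)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *ev = r->slots[tail & (CFG_RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Has the mtu_change signature: it only enqueues, the lcore applies it. */
static void on_mtu_change(uint16_t port_id, uint16_t mtu)
{
    struct cfg_event ev = {
        .type = CFG_EV_MTU, .port_id = port_id, .value = mtu,
    };

    (void)cfg_ring_push(&ring, ev);   /* event is dropped on overflow */
}
```

The data-plane loop would call cfg_ring_pop() between bursts and apply the
change itself, so no port state is touched from the callback context.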

Once we have bindings in place and callbacks registered, the only
essential part that remains is to get the current network configuration
and start listening to its changes.  This is accomplished via a call to:

  rte_ifpx_listen();

And basically this is all one needs to understand how to use this
library.  Other less essential parts include:
- ability to query what callbacks are available for given platform
- getting mapping between proxy and port
- unbinding the ports from proxy
- destroying proxy port
- closing the listening service
- getting basic information about proxy


Currently available features and implementation
===============================================

The library's API is system independent but it obviously needs some
system dependent parts.  We provide an example Linux implementation (based
on netlink sockets).  A very similar implementation is possible for
FreeBSD (using PF_ROUTE sockets).  A Windows implementation would have to
differ considerably (the IP Helper library would probably be of some help).

Here is the list of currently implemented callbacks:

struct rte_ifpx_callbacks {
  void  (*mac_change)(uint16_t port_id, const struct rte_ether_addr *mac);
  void  (*mtu_change)(uint16_t port_id, uint16_t mtu);
  void (*link_change)(uint16_t port_id, int is_up);
  void    (*addr_add)(uint16_t port_id, uint32_t ip);
  void    (*addr_del)(uint16_t port_id, uint32_t ip);
  void   (*addr6_add)(uint16_t port_id, const uint8_t *ip);
  void   (*addr6_del)(uint16_t port_id, const uint8_t *ip);
  void   (*route_add)(uint32_t ip, uint8_t depth);
  void   (*route_del)(uint32_t ip, uint8_t depth);
  void  (*route6_add)(const uint8_t *ip, uint8_t depth);
  void  (*route6_del)(const uint8_t *ip, uint8_t depth);
  void (*cfg_finished)(void);
};

They are all rather self-descriptive with the exception of the last one.
When the user calls rte_ifpx_listen() the library first queries the
system for its current configuration.  That might require several
request/reply exchanges between DPDK and system and once it is finished
this callback is called to let application know that all info has been
gathered.
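
A small sketch of how an application might use this callback: routes reported
during the initial query are staged, and lookups are only allowed once
cfg_finished has fired.  The staging array, MAX_ROUTES and routes_usable()
are invented for the example, and it covers only the initial snapshot (routes
added later would need proper synchronization with the readers).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ROUTES 16            /* arbitrary limit for the sketch */

struct route {
    uint32_t ip;
    uint8_t depth;
};

static struct route staged[MAX_ROUTES];
static unsigned staged_cnt;
static atomic_bool cfg_ready;    /* false until the initial query is done */

/* Has the route_add signature: collect routes seen during the query. */
static void on_route_add(uint32_t ip, uint8_t depth)
{
    if (staged_cnt < MAX_ROUTES)
        staged[staged_cnt++] = (struct route){ ip, depth };
}

/* Has the cfg_finished signature: publish the staged snapshot. */
static void on_cfg_finished(void)
{
    atomic_store_explicit(&cfg_ready, true, memory_order_release);
}

/* Readers check this before consulting the staged routes. */
static bool routes_usable(void)
{
    return atomic_load_explicit(&cfg_ready, memory_order_acquire);
}
```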

BTW at the moment all IPv4 addresses are passed in host byte order.
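
Because of the host byte order, a conversion is needed before the value is
handed to the socket API.  The helpers below are a sketch, not library code;
they additionally assume that 'depth' in the route callbacks is the prefix
length (as in rte_lpm), which this cover letter does not state explicitly.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Format a host-order IPv4 address as dotted quad; NULL on failure. */
static const char *ipv4_host_to_str(uint32_t ip_host, char *buf, size_t len)
{
    struct in_addr a = { .s_addr = htonl(ip_host) };

    return inet_ntop(AF_INET, &a, buf, (socklen_t)len);
}

/* Prefix length (assumed meaning of 'depth') to host-order netmask. */
static uint32_t depth_to_mask(uint8_t depth)
{
    if (depth == 0)
        return 0;
    if (depth >= 32)
        return UINT32_MAX;
    return (uint32_t)(UINT32_MAX << (32 - depth));
}

/* Does dst_ip fall inside route_ip/depth?  All values in host order. */
static bool route_matches(uint32_t route_ip, uint8_t depth, uint32_t dst_ip)
{
    uint32_t mask = depth_to_mask(depth);

    return (dst_ip & mask) == (route_ip & mask);
}
```

For example, the address from the diagram (198.51.100.14) arrives in the
addr_add callback as the integer 0xC633640E.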

It is also worth mentioning that while the typical case would be a 1-to-1
mapping between port and proxy, a 1-to-many mapping is also supported.
In that case port related callbacks will be called for each port bound
to the given proxy interface, and it is the application's responsibility
to define the semantics of such a mapping (e.g. all changes apply to all
ports, or link changes apply to all but others are accepted in "round
robin" fashion, or ...).
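
To make the 1-to-many case concrete, here is a toy model of the fan-out with
the simplest policy ("all changes apply to all ports").  The binding table
and every name in it are invented for the example; this is not how the
patches store bindings.

```c
#include <stdint.h>

#define MAX_PORTS 8              /* arbitrary size for the sketch */
#define NO_PROXY  UINT16_MAX

static uint16_t proxy_of[MAX_PORTS];   /* proxy_of[port] = bound proxy */
static uint16_t port_mtu[MAX_PORTS];   /* application state per port */

static void bindings_init(void)
{
    for (uint16_t p = 0; p < MAX_PORTS; p++)
        proxy_of[p] = NO_PROXY;
}

/* Several ports may name the same proxy. */
static int bind_port(uint16_t port_id, uint16_t proxy_id)
{
    if (port_id >= MAX_PORTS || proxy_of[port_id] != NO_PROXY)
        return -1;
    proxy_of[port_id] = proxy_id;
    return 0;
}

/* Application policy: every bound port takes over the new MTU. */
static void on_mtu_change(uint16_t port_id, uint16_t mtu)
{
    port_mtu[port_id] = mtu;
}

/* One event on the proxy results in one callback per bound port. */
static unsigned notify_mtu(uint16_t proxy_id, uint16_t mtu,
                           void (*cb)(uint16_t port_id, uint16_t mtu))
{
    unsigned calls = 0;

    for (uint16_t p = 0; p < MAX_PORTS; p++) {
        if (proxy_of[p] == proxy_id) {
            cb(p, mtu);
            calls++;
        }
    }
    return calls;
}
```

A "round robin" policy would keep a per-proxy counter in the callback and
accept the change only for the port whose turn it is.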

As mentioned above, the Linux implementation is based on a netlink socket.
This socket is registered as a file descriptor in EAL interrupts
(similarly to how EAL alarms are implemented).
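
For readers not familiar with netlink, the Linux side can be pictured as in
the sketch below.  It is not the code from the patches: it uses only the
standard rtnetlink interface to open a socket subscribed to link, address and
route notifications and to pull the MTU out of an RTM_NEWLINK message.  The
registration with EAL interrupts is left out, and build_newlink_mtu() exists
only so the parser can be exercised without a live kernel event.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Open a NETLINK_ROUTE socket subscribed to configuration changes. */
static int open_cfg_socket(void)
{
    struct sockaddr_nl sa;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR |
                   RTMGRP_IPV4_ROUTE | RTMGRP_IPV6_ROUTE;
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Return the MTU carried by an RTM_NEWLINK message, or 0 if absent. */
static uint32_t link_msg_mtu(struct nlmsghdr *nh)
{
    struct ifinfomsg *ifi;
    struct rtattr *rta;
    int len;

    if (nh->nlmsg_type != RTM_NEWLINK)
        return 0;
    ifi = NLMSG_DATA(nh);
    rta = IFLA_RTA(ifi);
    len = IFLA_PAYLOAD(nh);
    for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        if (rta->rta_type == IFLA_MTU) {
            uint32_t mtu;

            memcpy(&mtu, RTA_DATA(rta), sizeof(mtu));
            return mtu;
        }
    }
    return 0;
}

/* Test helper: build a minimal RTM_NEWLINK message with one IFLA_MTU. */
static size_t build_newlink_mtu(void *buf, size_t buflen, int ifindex,
                                uint32_t mtu)
{
    size_t total = NLMSG_SPACE(sizeof(struct ifinfomsg)) +
                   RTA_SPACE(sizeof(mtu));
    struct nlmsghdr *nh = buf;
    struct ifinfomsg *ifi;
    struct rtattr *rta;

    if (buflen < total)
        return 0;
    memset(buf, 0, total);
    nh->nlmsg_type = RTM_NEWLINK;
    nh->nlmsg_len = NLMSG_SPACE(sizeof(struct ifinfomsg)) +
                    RTA_LENGTH(sizeof(mtu));
    ifi = NLMSG_DATA(nh);
    ifi->ifi_index = ifindex;
    rta = IFLA_RTA(ifi);
    rta->rta_type = IFLA_MTU;
    rta->rta_len = RTA_LENGTH(sizeof(mtu));
    memcpy(RTA_DATA(rta), &mtu, sizeof(mtu));
    return nh->nlmsg_len;
}
```

In a real receive loop the buffer filled by recv() would be walked with
NLMSG_OK()/NLMSG_NEXT() and each message dispatched by its nlmsg_type.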


What is inside this RFC
=======================
- 1 commit for API
- 1 commit for implementation - this is just to show PoC, and allow for
  early playing around with the idea (e.g. run the test/example from the
  next commit)
- 1 commit for test/example - just to show how this can be used


Next steps
==========

- gather community feedback
- polish the implementation:
  * call the notification callbacks without lock held (at the moment
    attempts to modify callbacks from within the callback would deadlock)
  * separate the system dependent parts from the rest so that it is easy
    to figure out what needs to be reimplemented on different platforms
  * apply community suggestions - if any
- add neighbour callbacks (ARP table)
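
The first polishing item above (calling callbacks without the lock held)
could take roughly the following shape.  This is a sketch of the general
"copy under the lock, call outside the lock" pattern with made-up names
(ifpx_cbs, cbs_register, notify_mtu), not the planned implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct ifpx_cbs {
    void (*mtu_change)(uint16_t port_id, uint16_t mtu);
};

static pthread_mutex_t cbs_lock = PTHREAD_MUTEX_INITIALIZER;
static struct ifpx_cbs cbs;

/* Replace the registered callbacks; NULL clears them. */
static void cbs_register(const struct ifpx_cbs *new_cbs)
{
    pthread_mutex_lock(&cbs_lock);
    cbs = new_cbs != NULL ? *new_cbs : (struct ifpx_cbs){ NULL };
    pthread_mutex_unlock(&cbs_lock);
}

static void notify_mtu(uint16_t port_id, uint16_t mtu)
{
    struct ifpx_cbs snapshot;

    pthread_mutex_lock(&cbs_lock);
    snapshot = cbs;              /* copy the pointers under the lock */
    pthread_mutex_unlock(&cbs_lock);

    /* The lock is not held here, so the callback may call
     * cbs_register() itself without deadlocking. */
    if (snapshot.mtu_change != NULL)
        snapshot.mtu_change(port_id, mtu);
}

/* Example callback that unregisters itself after the first event. */
static unsigned one_shot_calls;

static void one_shot_mtu(uint16_t port_id, uint16_t mtu)
{
    (void)port_id;
    (void)mtu;
    one_shot_calls++;
    cbs_register(NULL);
}
```

The price of this pattern is that a callback can still run once after it has
been unregistered from another thread, which the application has to tolerate.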

Best regards
Andrzej Ostruszka

Andrzej Ostruszka (3):
  lib: introduce IF proxy library (API)
  if_proxy: add preliminary Linux implementation
  if_proxy: add example, test and documentation

 app/test/Makefile                             |   5 +
 app/test/meson.build                          |   1 +
 app/test/test_if_proxy.c                      | 431 ++++++++++
 config/common_base                            |   5 +
 doc/guides/prog_guide/if_proxy_lib.rst        | 103 +++
 doc/guides/prog_guide/index.rst               |   1 +
 examples/Makefile                             |   1 +
 examples/if_proxy/Makefile                    |  58 ++
 examples/if_proxy/main.c                      | 203 +++++
 examples/if_proxy/meson.build                 |  12 +
 examples/meson.build                          |   2 +-
 lib/Makefile                                  |   2 +
 .../common/include/rte_eal_interrupts.h       |   2 +
 lib/librte_eal/linux/eal/eal_interrupts.c     |  14 +-
 lib/librte_if_proxy/Makefile                  |  25 +
 lib/librte_if_proxy/meson.build               |   7 +
 lib/librte_if_proxy/rte_if_proxy.c            | 803 ++++++++++++++++++
 lib/librte_if_proxy/rte_if_proxy.h            | 364 ++++++++
 lib/meson.build                               |   2 +-
 19 files changed, 2035 insertions(+), 6 deletions(-)
 create mode 100644 app/test/test_if_proxy.c
 create mode 100644 doc/guides/prog_guide/if_proxy_lib.rst
 create mode 100644 examples/if_proxy/Makefile
 create mode 100644 examples/if_proxy/main.c
 create mode 100644 examples/if_proxy/meson.build
 create mode 100644 lib/librte_if_proxy/Makefile
 create mode 100644 lib/librte_if_proxy/meson.build
 create mode 100644 lib/librte_if_proxy/rte_if_proxy.c
 create mode 100644 lib/librte_if_proxy/rte_if_proxy.h
  

Comments

Morten Brørup Jan. 14, 2020, 3:16 p.m. UTC | #1
Andrzej,

Basically you are adding a very small subset of the Linux IP stack to interface with DPDK applications via callbacks. The library also seems to support interfacing to the route table, so it is not "interface proxy", but "IP stack proxy".

You already mention ARP table as future work. How about namespaces, ip tables, and other advanced features... I foresee the Devil in the details for any real use case.

Unless the library is an O/S wrapper to make Linux NETLINK-like messages available from other operating systems, I don't really see the value in this library... if it is Linux specific, why not just use NETLINK in the DPDK application's control plane?


Med venlig hilsen / kind regards
- Morten Brørup

  
Andrzej Ostruszka Jan. 14, 2020, 5:38 p.m. UTC | #2
On 1/14/20 4:16 PM, Morten Brørup wrote:
> Andrzej,

Hello Morten

> Basically you are adding a very small subset of the Linux IP stack
> to interface with DPDK applications via callbacks.

Yes, at the moment this is limited - we'd prefer first to solicit
some input from community.

> The library also seems to support interfacing to the route table,
> so it is not "interface proxy" but "IP stack proxy".

True, to some extent - for example you can bring the interface up and
down which has nothing to do with IP stack.  As for the name of the
library - that is actually part where we are completely open.  The proxy
represents port (thus the name) but that is not all, so any better name
proposals are welcome.

> You already mention ARP table as future work. How about namespaces,
> ip tables, and other advanced features... I foresee the Devil in the
> details for any real use case.

Right now I don't know what other things are needed.  This idea is still
at an early stage.  However, imagine you'd like to use DPDK to speed up
packet processing of the IP stack - would you like to implement all the
protocols that are needed?  Or would you rather let the system handle the
control path, handle only the data path yourself, and sniff the control
parameters from the system?

> Unless the library is an O/S wrapper to make Linux NETLINK-like messages
> available from other operating systems, ...

The idea is to make this system independent - and in the "Next steps"
I've mentioned splitting the current implementation into common and
system-dependent parts.  AFAIK we do not plan to provide implementations
for other systems, but we would like to shape it so that it is clear how to
do that.  As mentioned in the description, a FreeBSD implementation could be
really similar, but the Windows one would probably require a thread
periodically polling the system with "IP Helper" library calls - I'm not a
Windows programmer.  So no - the intent is not to provide "NETLINK-like
messages" to other systems.

> ... I don't really see the value in this library... if it is Linux
> specific, why not just use NETLINK in the DPDK application's control
> plane?

NETLINK is just Linux specific implementation.  And if you confine
yourself only to a Linux specific world - you can think of this library
as what you have just described :).  Free implementation of NETLINK
handling - with a defined API.

Best regards
Andrzej
  
Bruce Richardson Jan. 15, 2020, 10:15 a.m. UTC | #3
On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> On 1/14/20 4:16 PM, Morten Brørup wrote:
> > Andrzej,
> 
> Hello Morten
> 
> > Basically you are adding a very small subset of the Linux IP stack
> > to interface with DPDK applications via callbacks.
> 
> Yes, at the moment this is limited - we'd prefer first to solicit
> some input from community.
> 
> > The library also seems to support interfacing to the route table,
> > so it is not "interface proxy" but "IP stack proxy".
> 
> True, to some extent - for example you can bring the interface up and
> down which has nothing to do with IP stack.  As for the name of the
> library - that is actually part where we are completely open.  The proxy
> represents port (thus the name) but that is not all, so any better name
> proposals are welcome.
> 
> > You already mention ARP table as future work. How about namespaces,
> > ip tables, and other advanced features... I foresee the Devil in the
> > details for any real use case.
> 
> Right now I don't know what other things are needed.  This idea is still
> early.  However imagine you'd like to use DPDK to speed up packet
> processing of IP stack - would you like to implement all the protocols
> that are needed?  Or just let the system handle the control path and
> handle the data path and sniff the control params from the system.
>
Like Morten, I'd be a bit concerned at the possible scope of the work if we
start pulling in functionality from the IP stack like ARP etc. To avoid
this becoming a massive effort, how useful would it be if we just limited
the scope to physical NIC setup only, and did not do anything above the l2
layer?
  
Jerin Jacob Jan. 15, 2020, 11:27 a.m. UTC | #4
On Wed, Jan 15, 2020 at 3:45 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > Andrzej,
> >
> > Hello Morten
> >
> > > Basically you are adding a very small subset of the Linux IP stack
> > > to interface with DPDK applications via callbacks.
> >
> > Yes, at the moment this is limited - we'd prefer first to solicit
> > some input from community.
> >
> > > The library also seems to support interfacing to the route table,
> > > so it is not "interface proxy" but "IP stack proxy".
> >
> > True, to some extent - for example you can bring the interface up and
> > down which has nothing to do with IP stack.  As for the name of the
> > library - that is actually part where we are completely open.  The proxy
> > represents port (thus the name) but that is not all, so any better name
> > proposals are welcome.
> >
> > > You already mention ARP table as future work. How about namespaces,
> > > ip tables, and other advanced features... I foresee the Devil in the
> > > details for any real use case.
> >
> > Right now I don't know what other things are needed.  This idea is still
> > early.  However imagine you'd like to use DPDK to speed up packet
> > processing of IP stack - would you like to implement all the protocols
> > that are needed?  Or just let the system handle the control path and
> > handle the data path and sniff the control params from the system.
> >
> Like Morten, I'd be a bit concerned at the possible scope of the work if we
> start pulling in functionality from the IP stack like ARP etc. To avoid
> this becoming a massive effort, how useful would it be if we just limited
> the scope to physical NIC setup only, and did not do anything above the l2
> layer?

Like the IPSec library, Marvell would like to add support for additional
protocols (probably beginning with IPv4 and UDP) to DPDK.  One of our
concerns was the control plane interface for those protocols for effective
use in DPDK.  Since DPDK now supports FreeBSD and Windows, we cannot use
NETLINK directly in the library.  The sole intention of this library is to
provide an abstract control plane interface.  We can start with only the L2
layer for now, but when we add the L3 layer in the future we will need to
add the additional items.
Suggestions?
  
Morten Brørup Jan. 15, 2020, 12:28 p.m. UTC | #5
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, January 15, 2020 11:16 AM
> 
> On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > Andrzej,
> >
> > Hello Morten
> >
> > > Basically you are adding a very small subset of the Linux IP stack>
> to interface with DPDK applications via callbacks.
> >
> > Yes, at the moment this is limited - we'd prefer first to solicit
> > some input from community.
> >
> > > The library also seems to support interfacing to the route table,
> > > so it is not "interface proxy" but "IP stack proxy".
> >
> > True, to some extent - for example you can bring the interface up and
> > down which has nothing to do with IP stack.  As for the name of the
> > library - that is actually part where we are completely open.  The
> proxy
> > represents port (thus the name) but that is not all, so any better
> name
> > proposals are welcome.
> >
> > > You already mention ARP table as future work. How about namespaces,
> > > ip tables, and other advanced features... I foresee the Devil in
> the
> > > details for any real use case.
> >
> > Right now I don't know what other things are needed.  This idea is
> still
> > early.  However imagine you'd like to use DPDK to speed up packet
> > processing of IP stack - would you like to implement all the
> protocols
> > that are needed?  Or just let the system handle the control path and
> > handle the data path and sniff the control params from the system.
> >
> Like Morten, I'd be a bit concerned at the possible scope of the work
> if we
> start pulling in functionality from the IP stack like ARP etc. To avoid
> this becoming a massive effort, how useful would it be if we just
> limited
> the scope to physical NIC setup only, and did not do anything above the
> l2
> layer?

Think about it... Regardless of scope, this is clearly a control plane API, not a data plane API.

It provides a proxy API for the O/S control plane (NETLINK in the case of Linux), so the DPDK application can use the user interface that the O/S already provides (e.g. "ip link set dev tap1 mtu 1600" etc.) for its control plane, instead of implementing its own CLI (or GUI or whatever).

In order to provide significant value, it will have to grow massively, so I can use it as imagined: To make a Linux firewall where the DPDK application handles the data plane, and the normal Linux commands are used for setting up the firewall, incl. firewall rules, port forwarding, NAPT, etc. The Devil is in the details here!

Although I like the concept and idea behind it, I don't think a control plane proxy API belongs in DPDK. But it could possibly be hosted by the DPDK project, if approved as such.


Med venlig hilsen / kind regards
- Morten Brørup
  
Jerin Jacob Jan. 15, 2020, 12:57 p.m. UTC | #6
On Wed, Jan 15, 2020 at 5:58 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Wednesday, January 15, 2020 11:16 AM
> >
> > On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > > Andrzej,
> > >
> > > Hello Morten
> > >
> > > > Basically you are adding a very small subset of the Linux IP stack>
> > to interface with DPDK applications via callbacks.
> > >
> > > Yes, at the moment this is limited - we'd prefer first to solicit
> > > some input from community.
> > >
> > > > The library also seems to support interfacing to the route table,
> > > > so it is not "interface proxy" but "IP stack proxy".
> > >
> > > True, to some extent - for example you can bring the interface up and
> > > down which has nothing to do with IP stack.  As for the name of the
> > > library - that is actually part where we are completely open.  The
> > proxy
> > > represents port (thus the name) but that is not all, so any better
> > name
> > > proposals are welcome.
> > >
> > > > You already mention ARP table as future work. How about namespaces,
> > > > ip tables, and other advanced features... I foresee the Devil in
> > the
> > > > details for any real use case.
> > >
> > > Right now I don't know what other things are needed.  This idea is
> > still
> > > early.  However imagine you'd like to use DPDK to speed up packet
> > > processing of IP stack - would you like to implement all the
> > protocols
> > > that are needed?  Or just let the system handle the control path and
> > > handle the data path and sniff the control params from the system.
> > >
> > Like Morten, I'd be a bit concerned at the possible scope of the work
> > if we
> > start pulling in functionality from the IP stack like ARP etc. To avoid
> > this becoming a massive effort, how useful would it be if we just
> > limited
> > the scope to physical NIC setup only, and did not do anything above the
> > l2
> > layer?
>
> Think about it... Regardless of scope, this is clearly a control plane API, not a data plane API.
>
> It provides a proxy API for the O/S control plane (NETLINK in the case of Linux), so the DPDK application can use the user interface that the O/S already provides (e.g. "ip link set dev tap1 mtu 1600" etc.) for its control plane, instead of implementing its own CLI (or GUI or whatever).

Yes.

>
> In order to provide significant value, it will have to grow massively, so I can use it as imagined: To make a Linux firewall where the DPDK application handles the data plane, and the normal Linux commands are used for setting up the firewall, incl. firewall rules, port forwarding, NAPT, etc.. The Devil is in the details here!

Yes.
Another use case would be exception handling, where DPDK may not
handle all the traffic: traffic such as ARP can be redirected to the OS.
This would enable the DP to focus on the real fast path protocols such
as IPv4, UDP, etc.

>
> Although I like the concept and idea behind it, I don't think a control plane proxy API belongs in DPDK. But it could possibly be hosted by the DPDK project if approved as such.

Why? rte_flow and rte_tm are control plane APIs, and they are part of
DPDK. IMO, in order to have effective use of the data plane, the control
plane has to be integrated with it in an OS-independent way.

>
>
> Med venlig hilsen / kind regards
> - Morten Brørup
  
Bruce Richardson Jan. 15, 2020, 2:09 p.m. UTC | #7
On Wed, Jan 15, 2020 at 01:28:46PM +0100, Morten Brørup wrote:
> > -----Original Message----- From: dev [mailto:dev-bounces@dpdk.org] On
> > Behalf Of Bruce Richardson Sent: Wednesday, January 15, 2020 11:16 AM
> > 
> > On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > > Andrzej,
> > >
> > > Hello Morten
> > >
> > > > Basically you are adding a very small subset of the Linux IP stack>
> > to interface with DPDK applications via callbacks.
> > >
> > > Yes, at the moment this is limited - we'd prefer first to solicit
> > > some input from community.
> > >
> > > > The library also seems to support interfacing to the route table,
> > > > so it is not "interface proxy" but "IP stack proxy".
> > >
> > > True, to some extent - for example you can bring the interface up and
> > > down which has nothing to do with IP stack.  As for the name of the
> > > library - that is actually part where we are completely open.  The
> > proxy
> > > represents port (thus the name) but that is not all, so any better
> > name
> > > proposals are welcome.
> > >
> > > > You already mention ARP table as future work. How about namespaces,
> > > > ip tables, and other advanced features... I foresee the Devil in
> > the
> > > > details for any real use case.
> > >
> > > Right now I don't know what other things are needed.  This idea is
> > still
> > > early.  However imagine you'd like to use DPDK to speed up packet
> > > processing of IP stack - would you like to implement all the
> > protocols
> > > that are needed?  Or just let the system handle the control path and
> > > handle the data path and sniff the control params from the system.
> > >
> > Like Morten, I'd be a bit concerned at the possible scope of the work
> > if we start pulling in functionality from the IP stack like ARP etc. To
> > avoid this becoming a massive effort, how useful would it be if we just
> > limited the scope to physical NIC setup only, and did not do anything
> > above the l2 layer?
> 
> Think about it... Regardless of scope, this is clearly a control plane
> API, not a data plane API.
> 
> It provides a proxy API for the O/S control plane (NETLINK in the case of
> Linux), so the DPDK application can use the user interface that the O/S
> already provides (e.g. "ip link set dev tap1 mtu 1600" etc.) for its
> control plane, instead of implementing its own CLI (or GUI or whatever).
> 
> In order to provide significant value, it will have to grow massively, so
> I can use it as imagined: To make a Linux firewall where the DPDK
> application handles the data plane, and the normal Linux commands are
> used for setting up the firewall, incl. firewall rules, port forwarding,
> NAPT, etc.. The Devil is in the details here!
> 
> Although I like the concept and idea behind it, I don't think a control
> plane proxy API belongs in DPDK. But it could possibly be hosted by the
> DPDK project, if approved as such.
> 
Personally, I wouldn't worry too much about control plane vs userplane for
this. If it's of significant benefit to DPDK users then it should be
considered.

/Bruce
  
Morten Brørup Jan. 15, 2020, 3:30 p.m. UTC | #8
> -----Original Message-----
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Wednesday, January 15, 2020 1:57 PM
> 
> On Wed, Jan 15, 2020 at 5:58 PM Morten Brørup
> <mb@smartsharesystems.com> wrote:
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce
> Richardson
> > > Sent: Wednesday, January 15, 2020 11:16 AM
> > >
> > > On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > > > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > > > Andrzej,
> > > >
> > > > Hello Morten
> > > >
> > > > > Basically you are adding a very small subset of the Linux IP
> stack>
> > > to interface with DPDK applications via callbacks.
> > > >
> > > > Yes, at the moment this is limited - we'd prefer first to solicit
> > > > some input from community.
> > > >
> > > > > The library also seems to support interfacing to the route
> table,
> > > > > so it is not "interface proxy" but "IP stack proxy".
> > > >
> > > > True, to some extent - for example you can bring the interface up
> and
> > > > down which has nothing to do with IP stack.  As for the name of
> the
> > > > library - that is actually part where we are completely open.
> The
> > > proxy
> > > > represents port (thus the name) but that is not all, so any
> better
> > > name
> > > > proposals are welcome.
> > > >
> > > > > You already mention ARP table as future work. How about
> namespaces,
> > > > > ip tables, and other advanced features... I foresee the Devil
> in
> > > the
> > > > > details for any real use case.
> > > >
> > > > Right now I don't know what other things are needed.  This idea
> is
> > > still
> > > > early.  However imagine you'd like to use DPDK to speed up packet
> > > > processing of IP stack - would you like to implement all the
> > > protocols
> > > > that are needed?  Or just let the system handle the control path
> and
> > > > handle the data path and sniff the control params from the
> system.
> > > >
> > > Like Morten, I'd be a bit concerned at the possible scope of the
> work
> > > if we
> > > start pulling in functionality from the IP stack like ARP etc. To
> avoid
> > > this becoming a massive effort, how useful would it be if we just
> > > limited
> > > the scope to physical NIC setup only, and did not do anything above
> the
> > > l2
> > > layer?
> >
> > Think about it... Regardless of scope, this is clearly a control
> plane API, not a data plane API.
> >
> > It provides a proxy API for the O/S control plane (NETLINK in the
> case of Linux), so the DPDK application can use the user interface that
> the O/S already provides (e.g. "ip link set dev tap1 mtu 1600" etc.)
> for its control plane, instead of implementing its own CLI (or GUI or
> whatever).
> 
> Yes.
> 
> >
> > In order to provide significant value, it will have to grow
> massively, so I can use it as imagined: To make a Linux firewall where
> the DPDK application handles the data plane, and the normal Linux
> commands are used for setting up the firewall, incl. firewall rules,
> port forwarding, NAPT, etc.. The Devil is in the details here!
> 
> Yes.
> Another use case would be to handle exception where DPDK may not
> handle all the traffic, Traffic such ARP can be redirected to OS. This
> would enable DP to focus on the real fast path protocols such as IPv4,
> UDP etc.
> 

These are use cases for DPDK being used in an environment where the IP stack features provided by Linux suffice. It would be great for a simple CPE or Wi-Fi router, e.g. OpenWRT with a DPDK data plane replacing the Linux kernel's data plane.

For this use case, I think an example application would be a much more useful way to achieve your goal. Implementing it as an application will also uncover what is really needed, instead of us all speculating about what a proxy library might need to include.

But consider an advanced router with VRFs, VLANs, policy based routing, multiple WANs provided through network namespaces... the library will be huge!

> >
> > Although I like the concept and idea behind it, I don't think a
> control plane proxy API belongs in DPDK. But it could possibly be
> hosted by the DPDK project if approved as such.
> 
> Why? rte_flow, rte_tm all control plane APIs and it is part of
> DPDK.

Yes, there are some DPDK libraries leaning more towards control plane than data plane. Another example to prove your point: The whole process scheduling library has very little to do with packet processing. Vaguely related features are creeping in when objections are not strong enough.

> IMO, in order to have effective use of data plane, the control
> plane has to be integrated together in an OS-independent way.
> 

Also remember that not all DPDK applications need an IP stack resembling what Linux has. E.g. the SmartShare StraightShaper is a transparent bandwidth optimization appliance, and it doesn't perform any routing, it doesn't use any O/S-like features in the data path, and thus it doesn't need to integrate with the IP stack in the O/S. (The management interface uses the Linux IP stack, but it is completely isolated from the DPDK application's data plane.) The same can be said about e.g. T-Rex.

Obviously, not all DPDK applications use all DPDK libraries, and since I'm not obligated to use it, I'm not strongly opposed to it. I only question its usefulness outside of the specific use case of replacing the fast path in the Linux kernel.

-Morten
  
Jerin Jacob Jan. 15, 2020, 4:04 p.m. UTC | #9
On Wed, Jan 15, 2020 at 9:00 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Wednesday, January 15, 2020 1:57 PM
> >
> > On Wed, Jan 15, 2020 at 5:58 PM Morten Brørup
> > <mb@smartsharesystems.com> wrote:
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce
> > Richardson
> > > > Sent: Wednesday, January 15, 2020 11:16 AM
> > > >
> > > > On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > > > > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > > > > Andrzej,
> > > > >
> > > > > Hello Morten
> > > > >
> > > > > > Basically you are adding a very small subset of the Linux IP
> > stack>
> > > > to interface with DPDK applications via callbacks.
> > > > >
> > > > > Yes, at the moment this is limited - we'd prefer first to solicit
> > > > > some input from community.
> > > > >
> > > > > > The library also seems to support interfacing to the route
> > table,
> > > > > > so it is not "interface proxy" but "IP stack proxy".
> > > > >
> > > > > True, to some extent - for example you can bring the interface up
> > and
> > > > > down which has nothing to do with IP stack.  As for the name of
> > the
> > > > > library - that is actually part where we are completely open.
> > The
> > > > proxy
> > > > > represents port (thus the name) but that is not all, so any
> > better
> > > > name
> > > > > proposals are welcome.
> > > > >
> > > > > > You already mention ARP table as future work. How about
> > namespaces,
> > > > > > ip tables, and other advanced features... I foresee the Devil
> > in
> > > > the
> > > > > > details for any real use case.
> > > > >
> > > > > Right now I don't know what other things are needed.  This idea
> > is
> > > > still
> > > > > early.  However imagine you'd like to use DPDK to speed up packet
> > > > > processing of IP stack - would you like to implement all the
> > > > protocols
> > > > > that are needed?  Or just let the system handle the control path
> > and
> > > > > handle the data path and sniff the control params from the
> > system.
> > > > >
> > > > Like Morten, I'd be a bit concerned at the possible scope of the
> > work
> > > > if we
> > > > start pulling in functionality from the IP stack like ARP etc. To
> > avoid
> > > > this becoming a massive effort, how useful would it be if we just
> > > > limited
> > > > the scope to physical NIC setup only, and did not do anything above
> > the
> > > > l2
> > > > layer?
> > >
> > > Think about it... Regardless of scope, this is clearly a control
> > plane API, not a data plane API.
> > >
> > > It provides a proxy API for the O/S control plane (NETLINK in the
> > case of Linux), so the DPDK application can use the user interface that
> > the O/S already provides (e.g. "ip link set dev tap1 mtu 1600" etc.)
> > for its control plane, instead of implementing its own CLI (or GUI or
> > whatever).
> >
> > Yes.
> >
> > >
> > > In order to provide significant value, it will have to grow
> > massively, so I can use it as imagined: To make a Linux firewall where
> > the DPDK application handles the data plane, and the normal Linux
> > commands are used for setting up the firewall, incl. firewall rules,
> > port forwarding, NAPT, etc.. The Devil is in the details here!
> >
> > Yes.
> > Another use case would be to handle exception where DPDK may not
> > handle all the traffic, Traffic such ARP can be redirected to OS. This
> > would enable DP to focus on the real fast path protocols such as IPv4,
> > UDP etc.
> >
>
> These are use cases for DPDK being used in an environment where the IP stack features provided by Linux suffices. It would be great for a simple CPE or Wi-Fi router, e.g. OpenWRT with a DPDK data plane replacing the Linux kernel's data plane.

IMO, it is not replacing the Linux IP stack; instead, it is using the
slow path services from Linux or any other OS. The use cases would vary
from a simple Wi-Fi router to a 5G transport stack.

>
> For this use case, I think an example application would be a much more useful way to achieve your goal. Implementing it as an application will also uncover what is really needed, instead of us all speculating about what a proxy library might need to include.
>
> But consider an advanced router with VRFs, VLANs, policy based routing, multiple WANs provided through network namespaces... the library will be huge!

We thought of adding the infrastructure first; then we can scale it up
on a per-need basis. There is no such infrastructure in DPDK now.
At least, if someone wishes to contribute to this area, there should be
a path to improve things wrt the current situation.

>
> > >
> > > Although I like the concept and idea behind it, I don't think a
> > control plane proxy API belongs in DPDK. But it could possibly be
> > hosted by the DPDK project if approved as such.
> >
> > Why? rte_flow, rte_tm all control plane APIs and it is part of
> > DPDK.
>
> Yes, there are some DPDK libraries leaning more towards control plane than data plane. Another example to prove your point: The whole process scheduling library has very little to do with packet processing. Vaguely related features are creeping in when objections are not strong enough.

Yes. That's the reason why the control path vs data path argument
doesn't have any value.
If it is useful for packet processing use cases then have it.

>
> > IMO, in order to have effective use of data plane, the control
> > plane has to be integrated together in an OS-independent way.
> >
>
> Also remember that not all DPDK applications need an IP stack resembling what Linux has. E.g. the SmartShare StraightShaper is a transparent bandwidth optimization appliance, and it doesn't perform any routing, it doesn't use any O/S-like features in the data path, and thus it doesn't need to integrate with the IP stack in the O/S. (The management interface uses the Linux IP stack, but it is completely isolated from the DPDK application's data plane.) The same can be said about e.g. T-Rex.
>
> Obviously, not all DPDK applications use all DPDK libraries, and since I'm not obligated to use it, I'm not strongly opposed against it. I only question its usefulness outside of the specific use case of replacing the fast path in the Linux kernel.

Of course, we still follow the "à la carte" model, where we are not
forcing anyone to use the library in the end-user application. You can
always use whatever control path makes sense for the end-user
application.
But if some application wants to write control plane SW that needs to
work on Linux/FreeBSD/Windows or some RTOS, then it can be used (again,
if someone wishes to do so).
We can also provide the means for NOPing out the callbacks or overriding
them with something specific to the end-user library as well, so that
complete flexibility wrt the usage will still be with the application.
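
To make the "NOPing out or overriding the callbacks" idea concrete, here
is a minimal sketch in plain C. It is not the API from the patch set: the
names proxy_event, proxy_set_cb and proxy_dispatch are hypothetical and
chosen only for illustration. An unset (NULL) slot behaves as a no-op,
and the application overrides only the events it cares about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event types; illustrative only, not the RFC's API. */
enum proxy_event_type {
    PROXY_EV_MTU_CHANGE,
    PROXY_EV_LINK_CHANGE,
    PROXY_EV_NUM
};

struct proxy_event {
    enum proxy_event_type type;
    uint16_t port_id;
    uint32_t value;   /* new MTU, or 1/0 for link up/down */
};

/* A callback returns 1 if the application handled the event, 0 otherwise. */
typedef int (*proxy_cb_t)(const struct proxy_event *ev);

/* Zero-initialised table: every slot starts as NULL, i.e. a no-op. */
static proxy_cb_t proxy_cbs[PROXY_EV_NUM];

/* Install an application callback; passing NULL restores the no-op. */
static void proxy_set_cb(enum proxy_event_type type, proxy_cb_t cb)
{
    assert(type < PROXY_EV_NUM);
    proxy_cbs[type] = cb;
}

/* Called by the (hypothetical) library when the OS reports a change. */
static int proxy_dispatch(const struct proxy_event *ev)
{
    assert(ev != NULL && ev->type < PROXY_EV_NUM);
    proxy_cb_t cb = proxy_cbs[ev->type];
    return cb != NULL ? cb(ev) : 0;
}

/* Example application override: remember the last MTU requested. */
static uint32_t app_last_mtu;

static int app_mtu_cb(const struct proxy_event *ev)
{
    app_last_mtu = ev->value;
    return 1;
}
```

With this shape, an "ip link set dev tap1 mtu 1600" on the system side
would end up as one proxy_dispatch() call with a PROXY_EV_MTU_CHANGE
event, and an application that never registered app_mtu_cb would simply
ignore it.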

>
> -Morten
>
  
Morten Brørup Jan. 15, 2020, 6:15 p.m. UTC | #10
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> Sent: Wednesday, January 15, 2020 5:04 PM
> 
> On Wed, Jan 15, 2020 at 9:00 PM Morten Brørup <mb@smartsharesystems.com>
> wrote:
> >
> > > -----Original Message-----
> > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > Sent: Wednesday, January 15, 2020 1:57 PM
> > >
> > > On Wed, Jan 15, 2020 at 5:58 PM Morten Brørup
> > > <mb@smartsharesystems.com> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce
> > > Richardson
> > > > > Sent: Wednesday, January 15, 2020 11:16 AM
> > > > >
> > > > > On Tue, Jan 14, 2020 at 06:38:37PM +0100, Andrzej Ostruszka wrote:
> > > > > > On 1/14/20 4:16 PM, Morten Brørup wrote:
> > > > > > > Andrzej,
> > > > > >
> > > > > > Hello Morten
> > > > > >
> > > > > > > Basically you are adding a very small subset of the Linux IP
> > > stack>
> > > > > to interface with DPDK applications via callbacks.
> > > > > >
> > > > > > Yes, at the moment this is limited - we'd prefer first to solicit
> > > > > > some input from community.
> > > > > >
> > > > > > > The library also seems to support interfacing to the route
> > > table,
> > > > > > > so it is not "interface proxy" but "IP stack proxy".
> > > > > >
> > > > > > True, to some extent - for example you can bring the interface up
> > > and
> > > > > > down which has nothing to do with IP stack.  As for the name of
> > > the
> > > > > > library - that is actually part where we are completely open.
> > > The
> > > > > proxy
> > > > > > represents port (thus the name) but that is not all, so any
> > > better
> > > > > name
> > > > > > proposals are welcome.
> > > > > >
> > > > > > > You already mention ARP table as future work. How about
> > > namespaces,
> > > > > > > ip tables, and other advanced features... I foresee the Devil
> > > in
> > > > > the
> > > > > > > details for any real use case.
> > > > > >
> > > > > > Right now I don't know what other things are needed.  This idea
> > > is
> > > > > still
> > > > > > early.  However imagine you'd like to use DPDK to speed up packet
> > > > > > processing of IP stack - would you like to implement all the
> > > > > protocols
> > > > > > that are needed?  Or just let the system handle the control path
> > > and
> > > > > > handle the data path and sniff the control params from the
> > > system.
> > > > > >
> > > > > Like Morten, I'd be a bit concerned at the possible scope of the
> > > work
> > > > > if we
> > > > > start pulling in functionality from the IP stack like ARP etc. To
> > > avoid
> > > > > this becoming a massive effort, how useful would it be if we just
> > > > > limited
> > > > > the scope to physical NIC setup only, and did not do anything above
> > > the
> > > > > l2
> > > > > layer?
> > > >
> > > > Think about it... Regardless of scope, this is clearly a control
> > > plane API, not a data plane API.
> > > >
> > > > It provides a proxy API for the O/S control plane (NETLINK in the
> > > case of Linux), so the DPDK application can use the user interface that
> > > the O/S already provides (e.g. "ip link set dev tap1 mtu 1600" etc.)
> > > for its control plane, instead of implementing its own CLI (or GUI or
> > > whatever).
> > >
> > > Yes.
> > >
> > > >
> > > > In order to provide significant value, it will have to grow
> > > massively, so I can use it as imagined: To make a Linux firewall where
> > > the DPDK application handles the data plane, and the normal Linux
> > > commands are used for setting up the firewall, incl. firewall rules,
> > > port forwarding, NAPT, etc.. The Devil is in the details here!
> > >
> > > Yes.
> > > Another use case would be to handle exception where DPDK may not
> > > handle all the traffic, Traffic such ARP can be redirected to OS. This
> > > would enable DP to focus on the real fast path protocols such as IPv4,
> > > UDP etc.
> > >
> >
> > These are use cases for DPDK being used in an environment where the IP
> stack features provided by Linux suffices. It would be great for a simple
> CPE or Wi-Fi router, e.g. OpenWRT with a DPDK data plane replacing the
> Linux kernel's data plane.
> 
> IMO, it not replacing the Linux IP stack, instead, using the slow path
> services from Linux or any OS. The use case would vary from simple
> WiFI router to 5G transport stack.

I only mentioned the special case where a DPDK application uses the Linux slow path services (through your proxy API) and the DPDK application handles the data plane instead of the packets being handled by the Linux kernel's data plane. But I agree, the concept is broader than that.

> 
> >
> > For this use case, I think an example application would be a much more
> useful way to achieve your goal. Implementing it as an application will
> also uncover what is really needed, instead of us all speculating about
> what a proxy library might need to include.
> >
> > But consider an advanced router with VRFs, VLANs, policy based routing,
> multiple WANs provided through network namespaces... the library will be
> huge!
> 
> We thought of adding the infrastructure and the need per basics, we
> can scale it up. There is no such infrastructure now with DPDK.
> At least if someone wishes to contribute to this these area then there
> should be the path to improve things wrt current situation.
> 
> >
> > > >
> > > > Although I like the concept and idea behind it, I don't think a
> > > control plane proxy API belongs in DPDK. But it could possibly be
> > > hosted by the DPDK project if approved as such.
> > >
> > > Why? rte_flow, rte_tm all control plane APIs and it is part of
> > > DPDK.
> >
> > Yes, there are some DPDK libraries leaning more towards control plane
> than data plane. Another example to prove your point: The whole process
> scheduling library has very little to do with packet processing. Vaguely
> related features are creeping in when objections are not strong enough.
> 
> Yes. That's the reason for the control path vs data path argument that
> doesn't have any value.
> If it is useful for packet processing use cases then have it.
> 
> >
> > > IMO, in order to have effective use of data plane, the control
> > > plane has to be integrated together in an OS-independent way.
> > >
> >
> > Also remember that not all DPDK applications need an IP stack resembling
> what Linux has. E.g. the SmartShare StraightShaper is a transparent
> bandwidth optimization appliance, and it doesn't perform any routing, it
> doesn't use any O/S-like features in the data path, and thus it doesn't
> need to integrate with the IP stack in the O/S. (The management interface
> uses the Linux IP stack, but it is completely isolated from the DPDK
> application's data plane.) The same can be said about e.g. T-Rex.
> >
> > Obviously, not all DPDK applications use all DPDK libraries, and since
> I'm not obligated to use it, I'm not strongly opposed against it. I only
> question its usefulness outside of the specific use case of replacing the
> fast path in the Linux kernel.
> 
> Of course, We still follow the "À la carte" model, where we are not
> forcing to use the library in the end-user application. You can always
> use whatever control path that makes sense with the end-user
> applications.
> But if some application wants to write control plane SW that needs to
> work Linux/FreeBSD/Windows or other RTOS then it can be used (Again if
> someone wishes to do so).
> We can also provide the means for NOPing out the callbacks or override
> with something it is the specific end-user library as well, so that
> complete flexibly will be still with the application wrt the usage.
> 

OK, you convinced me that a general API for interfacing to the O/S control plane might be useful. So let me switch from arguing against it to providing some constructive feedback:

You should consider that most DPDK APIs are not thread safe, meaning that their internal structures cannot be manipulated/reconfigured by a control plane thread while data plane threads are accessing them. E.g. a route cannot be added in the DPDK route library while it is also being used for lookups by a DPDK data plane thread. The same goes for the hash table library. This means that callbacks are probably not the right design pattern.

AFAIK, the DPDK documentation doesn't mention any "best practices" for interaction between the control plane and data plane threads, so I understand why you chose a design pattern similar to the NIC Link Status Change interrupt design pattern.

Furthermore, I have now skimmed the other parts of your patch set. If I got it right, it looks like there's a limit of 64 callbacks; this will probably not suffice in the long run.

And on the administrative side, I assume one of you guys will volunteer as the maintainer of this library?

-Morten
  
Jerin Jacob Jan. 16, 2020, 7:15 a.m. UTC | #11
On Wed, Jan 15, 2020 at 11:45 PM Morten Brørup <mb@smartsharesystems.com> wrote:

> > > > IMO, in order to have effective use of data plane, the control
> > > > plane has to be integrated together in an OS-independent way.
> > > >
> > >
> > > Also remember that not all DPDK applications need an IP stack resembling
> > what Linux has. E.g. the SmartShare StraightShaper is a transparent
> > bandwidth optimization appliance, and it doesn't perform any routing, it
> > doesn't use any O/S-like features in the data path, and thus it doesn't
> > need to integrate with the IP stack in the O/S. (The management interface
> > uses the Linux IP stack, but it is completely isolated from the DPDK
> > application's data plane.) The same can be said about e.g. T-Rex.
> > >
> > > Obviously, not all DPDK applications use all DPDK libraries, and since
> > I'm not obligated to use it, I'm not strongly opposed against it. I only
> > question its usefulness outside of the specific use case of replacing the
> > fast path in the Linux kernel.
> >
> > Of course, We still follow the "À la carte" model, where we are not
> > forcing to use the library in the end-user application. You can always
> > use whatever control path that makes sense with the end-user
> > applications.
> > But if some application wants to write control plane SW that needs to
> > work Linux/FreeBSD/Windows or other RTOS then it can be used (Again if
> > someone wishes to do so).
> > We can also provide the means for NOPing out the callbacks or override
> > with something it is the specific end-user library as well, so that
> > complete flexibly will be still with the application wrt the usage.
> >
>
> OK, you convinced me that a general API for interfacing to the O/S control plane might be useful. So let me switch from arguing against it to providing some constructive feedback:

Good news :-)

>
> You should consider that most DPDK APIs are not thread safe, meaning that their internal structures cannot be manipulated/reconfigured by a control plane thread while data plane threads are accessing them. E.g. a route cannot be added in the DPDK route library while it is also being used for lookups by a DPDK data plane thread. The same goes for the hash table library. This means that callbacks are probably not the right design pattern.

I think, we can have only two design patterns for this case.

1) Push model (i.e. callback). In this case, the DP gets the callback; if it
is not the correct time to apply the configuration then the DP can store
it in its own queue and pull it later.
2) Pull model. In this case, the library stores the events. When the DP
needs the events, it can pull them from the library.

Do you have any other model in mind? And what is your preference between the two?

>
> AFAIK, the DPDK documentation doesn't mention any "best practices" for interaction between the control plane and data plane threads, so I understand why you chose a design pattern similar to the NIC Link Status Change interrupt design pattern.
>
> Furthermore, I have now skimmed the other parts of your patch set. If I got it right, it looks like there's a limit of 64 callbacks; this will probably not suffice in the long run.

Yes. We will increase it.

> And on the administrative side, I assume one of you guys will volunteer as the maintainer of this library?

Yes

>
> -Morten
  
Andrzej Ostruszka Jan. 16, 2020, 9:09 a.m. UTC | #12
On 1/15/20 7:15 PM, Morten Brørup wrote:
[...]
> OK, you convinced me that a general API for interfacing to the O/S
> control plane might be useful.

Glad to hear that.

[...]
> You should consider that most DPDK APIs are not thread safe,
> meaning that their internal structures cannot be manipulated/reconfigured
> by a control plane thread while data plane threads are accessing them.
> E.g. a route cannot be added in the DPDK route library while it is also
> being used for lookups by a DPDK data plane thread. The same goes
> for the hash table library.

You are already thinking about modification of the application data.
That is actually beyond the scope of the library.  The intention of the
library is to provide notification of a change.  It is the task of the
callback (provided by the user) to act on the change.  It can store the
change to be picked up at the next packet burst iteration, use some RCU
synchronization, or even stop the world and push the change (if the
application writer deems that appropriate).
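
The first option above (store the change and pick it up at the next
packet burst iteration) could be sketched roughly as follows.  This is
only an illustration in plain C11: the names pending_mtu, on_mtu_change
and apply_pending_config are hypothetical and do not come from the patch
set, and a single control thread plus a single data plane thread is
assumed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application state; none of these names are from the patch set. */
struct pending_mtu {
    _Atomic uint32_t mtu;   /* value recorded by the control thread */
    atomic_bool dirty;      /* true while a recorded value awaits pickup */
};

static struct pending_mtu pending;
static uint32_t active_mtu = 1500;  /* touched only by the data plane thread */

/* Callback run on the control thread: it records the change and applies nothing. */
static int on_mtu_change(uint32_t new_mtu)
{
    atomic_store_explicit(&pending.mtu, new_mtu, memory_order_relaxed);
    atomic_store_explicit(&pending.dirty, true, memory_order_release);
    return 0;
}

/* Called by the data plane thread between packet bursts.
 * Returns true if a pending change was applied. */
static bool apply_pending_config(void)
{
    if (!atomic_exchange_explicit(&pending.dirty, false, memory_order_acquire))
        return false;
    active_mtu = atomic_load_explicit(&pending.mtu, memory_order_relaxed);
    return true;
}
```

With this shape the callback never touches data plane structures; if two
changes arrive before the data plane looks, only the latest value is
applied.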

> This means that callbacks are probably not the right design pattern.

What are the other possibilities?  The library could keep a "copy" of the
interesting configuration, update it periodically and mark the changes
so that the application notices them.  But that would be inefficient - I
would have to query all the data to check for the diff.  So I think the
callback is the right design - we get only the changes.  However, please
note the explanation above: it is up to the application writer to provide
a callback that fits the design of the application and, in cooperation
with it, moves the network config change into internal data structures.

> Furthermore, I have now skimmed the other parts of your patch set.
> If I got it right, it looks like there's a limit of 64 callbacks;
> this will probably not suffice in the long run.

This is interesting.  What has given you that impression?  I'm really
curious since I've written it :).  There is a limit on the number of
proxies (but this is the same as the limit on DPDK ports - so not really a
limitation of this lib).  BTW since this is a slow path and I don't
need fast access, I keep proxies in a list, so that only the active ones
have allocated memory.

Each type of callback is just a member of the rte_ifpx_callbacks struct -
and yes, as you previously noted, this struct will grow as additional
functionality is added, but there is no real limit on it.  At the moment
callbacks are meant to be global - there is a list of callback sets
(ifpx_callbacks) that is common to all proxies.

I expect that the most common use will be just one set of callbacks per
application.  But instead of having just one global var I keep a list of
sets so that many can be registered.  There are other options possible:
- each type of callback can be a list
- callbacks could be "per proxy" - meaning that each proxy port could
  have its own callbacks

The first one could be beneficial if the user wants many callbacks
registered for some particular type of notification and is not
interested in the others.
The second one can be useful if different proxies should be treated
differently - in that case one could avoid conditionals in the callback
that switch behaviour depending on the proxy used.

But again, these kinds of use are not what I expect to be common,
so I went with the current design.
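
To make the "list of callback sets" idea concrete, here is a rough
sketch.  It is not the code from the patch set: the struct name, the
member names and their signatures are assumptions made for illustration;
only the general shape (one struct with a member per callback type, and
a global list of such structs) follows the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: member names and signatures are assumptions. */
struct ifpx_callbacks_sketch {
    int (*mtu_change)(uint16_t port_id, uint16_t mtu);
    int (*link_change)(uint16_t port_id, int is_up);
    struct ifpx_callbacks_sketch *next;  /* next registered set */
};

/* Global list of callback sets, shared by all proxies. */
static struct ifpx_callbacks_sketch *cb_sets;

/* Register one more set by pushing it onto the front of the list. */
static void cb_register(struct ifpx_callbacks_sketch *set)
{
    set->next = cb_sets;
    cb_sets = set;
}

/* Deliver an MTU change to every set that handles it.
 * Returns the number of callbacks invoked. */
static int notify_mtu_change(uint16_t port_id, uint16_t mtu)
{
    int called = 0;

    for (struct ifpx_callbacks_sketch *s = cb_sets; s != NULL; s = s->next) {
        if (s->mtu_change == NULL)
            continue;  /* this set is not interested in MTU changes */
        s->mtu_change(port_id, mtu);
        called++;
    }
    return called;
}

/* Example user callback: remembers the last MTU it was told about. */
static uint16_t last_mtu;

static int record_mtu(uint16_t port_id, uint16_t mtu)
{
    (void)port_id;
    last_mtu = mtu;
    return 0;
}
```

A set that leaves a member NULL simply does not receive that type of
notification, which is why adding members to the struct does not impose
a limit on the number of registered sets.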

> And on the administrative side, I assume one of you guys will volunteer
> as the maintainer of this library?

Yes.

With regards
Andrzej Ostruszka
  
Morten Brørup Jan. 16, 2020, 9:11 a.m. UTC | #13
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> Sent: Thursday, January 16, 2020 8:15 AM
> 
> On Wed, Jan 15, 2020 at 11:45 PM Morten Brørup
> <mb@smartsharesystems.com> wrote:
> 
> > > > > IMO, in order to have effective use of data plane, the control
> > > > > plane has to be integrated together in an OS-independent way.
> > > > >
> > > >
> > > > Also remember that not all DPDK applications need an IP stack
> resembling
> > > what Linux has. E.g. the SmartShare StraightShaper is a transparent
> > > bandwidth optimization appliance, and it doesn't perform any
> routing, it
> > > doesn't use any O/S-like features in the data path, and thus it
> doesn't
> > > need to integrate with the IP stack in the O/S. (The management
> interface
> > > uses the Linux IP stack, but it is completely isolated from the
> DPDK
> > > application's data plane.) The same can be said about e.g. T-Rex.
> > > >
> > > > Obviously, not all DPDK applications use all DPDK libraries, and
> since
> > > I'm not obligated to use it, I'm not strongly opposed against it. I
> only
> > > question its usefulness outside of the specific use case of
> replacing the
> > > fast path in the Linux kernel.
> > >
> > > Of course, We still follow the "À la carte" model, where we are not
> > > forcing to use the library in the end-user application. You can
> always
> > > use whatever control path that makes sense with the end-user
> > > applications.
> > > But if some application wants to write control plane SW that needs
> to
> > > work Linux/FreeBSD/Windows or other RTOS then it can be used (Again
> if
> > > someone wishes to do so).
> > > We can also provide the means for NOPing out the callbacks or
> override
> > > with something it is the specific end-user library as well, so that
> > > complete flexibly will be still with the application wrt the usage.
> > >
> >
> > OK, you convinced me that a general API for interfacing to the O/S
> control plane might be useful. So let me switch from arguing against it
> to providing some constructive feedback:
> 
> Good news :-)
> 
> >
> > You should consider that most DPDK APIs are not thread safe, meaning
> that their internal structures cannot be manipulated/reconfigured by a
> control plane thread while data plane threads are accessing them. E.g.
> a route cannot be added in the DPDK route library while it is also
> being used by for lookups by a DPDK data plane thread. The same goes
> for the hash table library. This means that callbacks are probably not
> the right design pattern.
> 
> I think, we can have only two design patterns for this case.
> 
> 1) push model(i.e callback). In this case, DP gets the callback, if it
> is not the correct time to apply the configuration then DP can store
> it in its own queue and pull it later.
> 2) pull model. In this case, the library stores the events. When DP
> needs the events, it can pull the events from the library.
> 
> Do you have any other model in mind? and what is your preference among
> two?
> 

This library interfaces to the O/S on the one side, and a DPDK application on the other side.

Looking at the interface towards the DPDK application, I would personally prefer a pull model. It will allow the DPDK application to handle the events when it is convenient and safe for the DPDK application to manipulate its non-thread safe data structures.
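
As a rough illustration of such a pull model, the sketch below uses a small single-producer/single-consumer queue in plain C11. Everything in it (cfg_event, ev_queue, evq_push, evq_pull, the event types) is hypothetical and not part of the proposed library; in a DPDK application a DPDK ring would be the natural replacement for the hand-written queue.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define EVQ_SIZE 64u  /* must be a power of two */

/* Hypothetical event description; not taken from the patch set. */
enum ev_type { EV_MTU_CHANGE, EV_LINK_CHANGE };

struct cfg_event {
    enum ev_type type;
    uint16_t port_id;
    uint32_t value;
};

/* Single-producer/single-consumer queue: the library side pushes,
 * the data plane thread pulls when it is safe to do so. */
struct ev_queue {
    struct cfg_event slots[EVQ_SIZE];
    _Atomic uint32_t head;  /* next slot to write, advanced by the producer */
    _Atomic uint32_t tail;  /* next slot to read, advanced by the consumer */
};

/* Producer side. Returns false when the queue is full. */
static bool evq_push(struct ev_queue *q, struct cfg_event ev)
{
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head - tail == EVQ_SIZE)
        return false;
    q->slots[head % EVQ_SIZE] = ev;
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false when there is nothing to pull. */
static bool evq_pull(struct ev_queue *q, struct cfg_event *out)
{
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (head == tail)
        return false;
    *out = q->slots[tail % EVQ_SIZE];
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}
```

The data plane loop would call evq_pull() between bursts until it returns false, applying each event to its own data structures at a point where no lookups are in flight.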

Looking at the interface towards the O/S, Linux Netlink is well defined and described in RFC 3549, and message queues (e.g. DPDK rings) seem like a perfect match for this.

I don't know enough about the Windows network stack to tell if the same applies here, so you should look into this before proceeding. On the other hand, the "memif" Memory Interface PMD is Linux only; so you could also consider limiting your library support to operating systems often used as routers, i.e. Linux and BSD, and explicitly omit Windows support.

I have no preferences about the message format, but since Linux Netlink is described in an RFC, you could consider using this exact message format or a closely related message format. The RFC authors probably thought this through very thoroughly.

> >
> > AFAIK, the DPDK documentation doesn't mention any "best practices"
> for interaction between the control plane and data plans threads, so I
> understand why you chose a design pattern similar to the NIC Link
> Status Change interrupt design pattern.
> >
> > Furthermore, I have now skimmed the other parts of your patch set. If
> I got it right, it looks like there's a limit of 64 callbacks; this
> will probably not suffice in the long run.
> 
> Yes. We will increase it.
> 
> > And on the administrative side, I assume one of you guys will
> volunteer as the maintainer of this library?
> 
> Yes

Great.

-Morten
  
Morten Brørup Jan. 16, 2020, 9:30 a.m. UTC | #14
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrzej Ostruszka
> Sent: Thursday, January 16, 2020 10:10 AM
> 
> On 1/15/20 7:15 PM, Morten Brørup wrote:
> [...]
> > OK, you convinced me that a general API for interfacing to the O/S
> > control plane might be useful.
> 
> Glad to hear that.
> 
> [...]
> > You should consider that most DPDK APIs are not thread safe,
> > meaning that their internal structures cannot be
> manipulated/reconfigured
> > by a control plane thread while data plane threads are accessing
> them.
> > E.g. a route cannot be added in the DPDK route library while it is
> also
> > being used by for lookups by a DPDK data plane thread. The same goes
> > for the hash table library.
> 
> You are thinking already about modification of the application data.
> That is actually beyond the scope of the library.

Yes, it is beyond the scope of the library; but I prefer the library to be designed for how typical applications are going to use it.

I suggest that you supplement the library with an example DPDK application that is a simple IPv4 router, forwarding packets and responding to ARP requests - according to its configuration applied in the O/S via your proxy library. You could even add support for relevant ICMP packets (e.g. respond to ICMP Echo Request and send TTL Exceeded when appropriate). It will help you determine what is required by the library, and how the library best interfaces to a "typical" DPDK application.

> The intention of the
> library is to provide with notification of a change.  It is meant to be
> the task of the callback (provided by the user) to act on the change.
> It can store the change to be picked up at the next packet burst
> iteration, or use some RCU synchronization or even stop the world and
> push the change (if the writer of application deems that appropriate).
> 
> > This means that callbacks are probably not the right design pattern.
> 
> What are other possibilities?  The library could keep "copy" of the
> interesting configuration and periodically update it and mark the
> changes to let application notice.  But that would be inefficient - I
> would have to query all data to check for the diff.  So I think the
> callback is the right design - we get only changes.  However please
> note
> above explanation, that it is up to application writer to provide
> callback that would fit design of the application and in cooperation
> with it will move the network config change into internal data
> structures.
> 

I think a poll based design pattern is more appropriate. Getting a Netlink message from the O/S and converting it to a callback in the library, and then converting it back to a message in the DPDK application seems like crossing the river to get water.

> > Furthermore, I have now skimmed the other parts of your patch set.
> > If I got it right, it looks like there's a limit of 64 callbacks;
> > this will probably not suffice in the long run.
> 
> This is interesting.  What has given you that impression?  I'm really
> curious since I've written it :).

It was a bitmap of wanted callbacks. I only skimmed the source code, so I'm probably wrong about this. Forget I mentioned it.

> There is a limit on a number of
> proxies (but this is the same as limit on DPDK ports - so not really a
> limitation of this lib).  BTW since this is a slow path, and I don't
> need a fast access I keep proxies in a list, so that only those active
> have allocated memory.
> 
> Each type of callback is just a member of rte_ifpx_callbacks struct -
> and yes, as you previously noted, this struct will grow with additional
> functionality added, but there is no real limit on it.  At the moment
> callbacks are meant to be global - there is a list of callback sets
> (ifpx_callbacks) that is common for all proxies.
> 
> I expect that the most common use will be just one set of callbacks for
> application.  But instead of having just one global var I keep a list
> of
> sets so many can be registered.  There are other options possible:
> - each type of callback can be a list
> - callbacks could be "per proxy" - meaning that each proxy port could
>   have its own callbacks
> 
> The first one could be beneficial if user wants many callbacks
> registered for some particular type of notification and is not
> interested in others.
> The second one can be useful if different proxies should be treated
> differently - in that case one could avoid conditionals in callback
> switching behaviour depending on the proxy used.
> 
> But again this kind of uses are not what I expect as a common use case
> so I went with current design.
> 
> > And on the administrative side, I assume one of you guys will
> volunteer
> > as the maintainer of this library?
> 
> Yes.
> 
> With regards
> Andrzej Ostruszka
  
Andrzej Ostruszka Jan. 16, 2020, 10:42 a.m. UTC | #15
On 1/16/20 10:30 AM, Morten Brørup wrote:
[...]
>> You are thinking already about modification of the application data.
>> That is actually beyond the scope of the library.
> 
> Yes, it is beyond the scope of the library; but I prefer the library to
> be designed for how typical applications are going to use it.
> 
> I suggest that you supplement the library with an example DPDK application
> that is a simple IPv4 router, forwarding packets and responding to ARP
> requests - according to its configuration applied in the O/S via your proxy
> library. You could even add support for relevant ICMP packets (e.g. respond
> to ICMP Echo Request and send TTL Exceeded when appropriate).

Actually our thinking was more along these lines: such a router would see
these control packets, so it would send them (tx burst) to the proxy port and
let the system stack do its job: change the config and possibly send a reply.
The former would be listened for on NETLINK (in Linux) and the latter would
just be read from the proxy port and forwarded to the bound port.  That way
the DPDK application would not have to re-implement these control protocols.
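
As a rough sketch of the first step (deciding which received frames
should be handed to the proxy port), the helper below picks out ARP and
ICMP-over-IPv4 frames.  The function name and constants are hypothetical
and this is not code from the patch set; VLAN tags, IPv6 and
destination address checks are deliberately ignored to keep it short.
In a real application the matching packets would then be sent to the
proxy port with a tx burst.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Well-known protocol numbers; the _SK names are local to this sketch. */
#define ETHERTYPE_ARP_SK   0x0806
#define ETHERTYPE_IPV4_SK  0x0800
#define IP_PROTO_ICMP_SK   1

#define ETH_HDR_LEN_SK     14  /* dst MAC + src MAC + ethertype, no VLAN tag */
#define IPV4_MIN_HDR_SK    20
#define IPV4_PROTO_OFF_SK  9   /* offset of the protocol field in the IPv4 header */

/* Returns true if the untagged Ethernet frame carries ARP or ICMP over IPv4,
 * i.e. traffic that the system stack behind the proxy port could handle. */
static bool is_control_frame(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    if (len < ETH_HDR_LEN_SK)
        return false;

    /* The ethertype is stored in network byte order at offset 12. */
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype == ETHERTYPE_ARP_SK)
        return true;

    if (ethertype == ETHERTYPE_IPV4_SK &&
        len >= ETH_HDR_LEN_SK + IPV4_MIN_HDR_SK)
        return frame[ETH_HDR_LEN_SK + IPV4_PROTO_OFF_SK] == IP_PROTO_ICMP_SK;

    return false;
}
```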

> It will help you determine what is required by the library, and how
> the library best interfaces to a "typical" DPDK application.

Yes indeed, that kind of usage discovery exercise would be good.

> I think a poll based design pattern is more appropriate. Getting a Netlink
> message from the O/S and converting it to a callback in the library, and
> then converting it back to a message in the DPDK application seems like
> crossing the river to get water.

You'd still need to repack the message and that could be the job of the
callback.

At the moment we don't have much experience with the library, and to me
the callback is the more generic approach, with which one can achieve
different designs.  However, nothing here is carved in stone, so if we
figure out that this is too generic we will change it.

[...]
> It was a bitmap of wanted callbacks.

Ah, right.  Currently the set of available callbacks is returned as a
bitmask.  This API will change if we find that we need more callbacks.
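
For readers wondering where a "limit of 64" could come from, here is a
small sketch of reporting available callbacks as a bitmask.  It assumes
a 64-bit mask, which matches the number mentioned earlier in the thread
but is otherwise a guess; the enum and function names are hypothetical
and not taken from the patch set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback kinds; one bit in the mask per kind. */
enum cb_kind_sketch {
    CB_MTU_CHANGE,
    CB_LINK_CHANGE,
    CB_ADDR_ADD,
    CB_ROUTE_ADD,
    CB_KIND_COUNT
};

/* A uint64_t mask can describe at most 64 kinds: the limit is in the
 * reporting API, not in the number of callbacks that can be registered. */
_Static_assert(CB_KIND_COUNT <= 64, "too many callback kinds for a 64-bit mask");

/* Bit that represents one callback kind. */
static inline uint64_t cb_bit(enum cb_kind_sketch kind)
{
    return UINT64_C(1) << kind;
}

/* True if the given kind is marked as available in the mask. */
static bool cb_available(uint64_t mask, enum cb_kind_sketch kind)
{
    return (mask & cb_bit(kind)) != 0;
}
```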

With regards
Andrzej Ostruszka
  
Morten Brørup Jan. 16, 2020, 10:58 a.m. UTC | #16
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrzej Ostruszka
> Sent: Thursday, January 16, 2020 11:43 AM
> 
> On 1/16/20 10:30 AM, Morten Brørup wrote:
> [...]
> >> You are thinking already about modification of the application data.
> >> That is actually beyond the scope of the library.
> >
> > Yes, it is beyond the scope of the library; but I prefer the library
> to
> > be designed for how typical applications are going to use it.
> >
> > I suggest that you supplement the library with an example DPDK
> application
> > that is a simple IPv4 router, forwarding packets and responding to
> ARP
> > requests - according to its configuration applied in the O/S via your
> proxy
> > library. You could even add support for relevant ICMP packets (e.g.
> respond
> > to ICMP Echo Request and send TTL Exceeded when appropriate).
> 
> Actually our thinking was more along the way: such router would see
> these control packets so it will send them (tx burst) to proxy port and
> let the system stack do its job: change config and possibly send reply.
> The former would be listened for on NETLINK (in Linux) and the latter would
> be just read from proxy port and forwarded to the bound port.  That way
> DPDK application would not have to re-implement these control
> protocols.
> 

You are right. I momentarily forgot that.

And the example application will show how to do this.

> > It will help you determine what is required by the library, and how
> > the library best interfaces to a "typical" DPDK application.
> 
> Yes indeed, that kind of usage discovery exercise would be good.
> 
> > I think a poll based design pattern is more appropriate. Getting a
> Netlink
> > message from the O/S and converting it to a callback in the library,
> and
> > then converting it back to a message in the DPDK application seems
> like
> > crossing the river to get water.
> 
> You'd still need to repack the message and that could be the job of the
> callback.
> 
> At the moment we don't have much experience with the library and to me
> the callback is more generic approach with which one can achieve
> different designs.  However nothing here is carved in stone so if we
> figure out that this is too generic we will change it.
> 

Please re-read my reply to Jerin Jacob for why I prefer a pull model instead:
https://mails.dpdk.org/archives/dev/2020-January/155386.html

Take a stab at the example application, and see which design pattern is the best fit.
  
Andrzej Ostruszka Jan. 16, 2020, 12:06 p.m. UTC | #17
Morten

First of all thank you for your feedback.  If anything else pops into
your mind please do not hesitate to share it.

We just had a quick internal discussion and we decided that we'll try to
come up with both options (callback and message queue).

On 1/16/20 11:58 AM, Morten Brørup wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrzej Ostruszka
>> Sent: Thursday, January 16, 2020 11:43 AM
[...]
>> You'd still need to repack the message and that could be the job of the
>> callback.
>>
>> At the moment we don't have much experience with the library and to me
>> the callback is more generic approach with which one can achieve
>> different designs.  However nothing here is carved in stone so if we
>> figure out that this is too generic we will change it.
>>
> 
> Please re-read my reply to Jerin Jacob why I prefer a pull model instead:
> https://mails.dpdk.org/archives/dev/2020-January/155386.html

Yes - I got your point the first time.  The remark above was not meant to
imply that "pull mode" is not a valid way (it is perfectly valid and
probably the one most often used in DPDK).  I just noted that one can
still implement it while staying at the callback level only.

But it is true that this way would impose more burden on the application
writer - so instead we now plan to provide both options.

> Take a stab at the example application, and see which design pattern is the best fit.

We will.  It is definitely a good idea to work things out in "battle".

With regards
Andrzej Ostruszka