[v2,1/3] eventdev: introduce event dispatcher

Message ID 20230616074041.159675-2-mattias.ronnblom@ericsson.com (mailing list archive)
State Changes Requested, archived
Delegated to: Jerin Jacob
Series: Add event dispatcher

Checks

Context Check Description
ci/checkpatch warning coding style issues

Commit Message

Mattias Rönnblom June 16, 2023, 7:40 a.m. UTC
  The purpose of the event dispatcher is to help reduce coupling in an
Eventdev-based DPDK application.

In addition, the event dispatcher provides a convenient and flexible
way for the application to use service cores for application-level
processing.
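
A minimal sketch of how an application might wire up the event
dispatcher, using the API introduced by this patch (the dispatcher id,
the batch size and the error handling are illustrative only):

#include <rte_event_dispatcher.h>

#define APP_DISPATCHER_ID 0 /* application-chosen id */

static int
app_setup_dispatcher(uint8_t event_dev_id, uint8_t event_port_id,
		     unsigned int lcore_id)
{
	int rc;

	rc = rte_event_dispatcher_create(APP_DISPATCHER_ID, event_dev_id);
	if (rc < 0)
		return rc;

	/* Dequeue up to 32 events per burst, with no dequeue timeout,
	 * when the dispatcher service is run on lcore_id.
	 */
	rc = rte_event_dispatcher_bind_port_to_lcore(APP_DISPATCHER_ID,
						     event_port_id, 32, 0,
						     lcore_id);
	if (rc < 0)
		return rc;

	/* Handlers (match + process callback pairs) are added with
	 * rte_event_dispatcher_register().
	 */

	return rte_event_dispatcher_start(APP_DISPATCHER_ID);
}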

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Tested-by: Peter Nilsson <peter.j.nilsson@ericsson.com>
Reviewed-by: Heng Wang <heng.wang@ericsson.com>

--
PATCH v2:
 o Add dequeue batch count statistic.
 o Add statistics reset function to API.
 o Clarify MT safety guarantees (or lack thereof) in the API documentation.
 o Change loop variable type in evd_lcore_get_handler_by_id() to uint16_t,
   to be consistent with similar loops elsewhere in the dispatcher.
 o Fix variable names in finalizer unregister function.

PATCH:
 o Change prefix from RED to EVD, to avoid confusion with random
   early detection.

RFC v4:
 o Move handlers to per-lcore data structures.
 o Introduce mechanism which rearranges handlers so that often-used
   handlers tend to be tried first.
 o Terminate dispatch loop in case all events are delivered.
 o To avoid the dispatcher's service function hogging the CPU, process
   only one batch per call.
 o Have service function return -EAGAIN if no work is performed.
 o Events delivered in the process function are no longer marked 'const',
   since modifying them may be useful for the application and cause
   no difficulties for the dispatcher.
 o Various minor API documentation improvements.

RFC v3:
 o Add stats_get() function to the version.map file.
---
 lib/eventdev/meson.build            |   2 +
 lib/eventdev/rte_event_dispatcher.c | 793 ++++++++++++++++++++++++++++
 lib/eventdev/rte_event_dispatcher.h | 480 +++++++++++++++++
 lib/eventdev/version.map            |  14 +
 4 files changed, 1289 insertions(+)
 create mode 100644 lib/eventdev/rte_event_dispatcher.c
 create mode 100644 lib/eventdev/rte_event_dispatcher.h
  

Comments

Jerin Jacob Aug. 18, 2023, 6:09 a.m. UTC | #1
On Fri, Jun 16, 2023 at 1:17 PM Mattias Rönnblom
<mattias.ronnblom@ericsson.com> wrote:
>
> The purpose of the event dispatcher is to help reduce coupling in an
> Eventdev-based DPDK application.
>
> In addition, the event dispatcher also provides a convenient and
> flexible way for the application to use service cores for
> application-level processing.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> Tested-by: Peter Nilsson <peter.j.nilsson@ericsson.com>
> Reviewed-by: Heng Wang <heng.wang@ericsson.com>

Adding eventdev maintainers and tech board,

Hi Mattias,

Finally got some time to review this series, and thanks for the excellent
documentation.

I understand the use case for the dispatcher, but the following are some
of my concerns:

1) To decouple the application-specific business logic, one needs to
use two function pointers to access the per-packet (match and process)
functions.
2) The need to enforce a service core for its usage.

IMO, both are a given application's choice; not all applications need
to use this scheme. Keeping the code in lib/eventdev has the
following issues:

1) It is kind of enforcing the above scheme for all application
modeling, which may not be applicable to every application use case, and
the eventdev device does not dictate a specific framework model.
2) We have never kept framework code in a device class library; i.e.,
such public APIs are implemented on top of the device class API and
have no hook into the PMD API.
For example, we never kept lib/distributor/ code in lib/ethdev.

Other than the placement of this code, I agree with the use case and
solution at a high level. The following are options for the placement of
this library. Based on that, we can have the next level of review.

1) It is possible to plug this into lib/graph by adding a new graph
model (@zhirun.yan@intel.com recently added
RTE_GRAPH_MODEL_MCORE_DISPATCH)

Based on my understanding, that can translate to:
a) Adding a new graph model which allows having it on the graph walk
(a graph walk is nothing but calling the registered dispatcher routines)
b) It is possible to add model-specific APIs via
rte_graph_model_model_name_xxxx()
c) The graph library is not using a match-callback kind of scheme. Instead,
nodes will process the packet, find its downstream node, and enqueue
to it, and then graph_walk() calls the downstream node's specific process
function.
With that, we can meet the original goal of business logic decoupling.
However, currently nodes are not aware of what kind of graph model they
are running; that could be one issue here, as eventdev has more
scheduling properties,
like schedule type etc. To overcome that issue, it may be possible to
introduce node-to-graph-model compatibility (where nodes can
advertise the supported graph models)
d) Currently we are planning to make the graph API stable; if we are
taking this path, we need to hold
https://patches.dpdk.org/project/dpdk/patch/20230810180515.113700-1-stephen@networkplumber.org/
as we may need to update some public APIs.

2) Have a new library, lib/event_dispatcher

3) Move to the examples directory to showcase the framework

4) Move to the app/test-eventdev directory to showcase the framework.


Thoughts?
  
Mattias Rönnblom Aug. 22, 2023, 8:42 a.m. UTC | #2
On 2023-08-18 08:09, Jerin Jacob wrote:
> On Fri, Jun 16, 2023 at 1:17 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>>
>> The purpose of the event dispatcher is to help reduce coupling in an
>> Eventdev-based DPDK application.
>>
>> In addition, the event dispatcher also provides a convenient and
>> flexible way for the application to use service cores for
>> application-level processing.
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Tested-by: Peter Nilsson <peter.j.nilsson@ericsson.com>
>> Reviewed-by: Heng Wang <heng.wang@ericsson.com>
> 
> Adding eventdev maintainers and tech board,
> 
> Hi Mattais,
> 
> Finally, got some time to review this series, and thanks for excellent
> documentation.
> 
> I understand the use case for the dispatcher, But following are some
> of my concern
> 
> 1) To decouple the application specific business logic, one need to
> use two function pointers to access per packet (match and process)
> function.

The API design is based on community feedback, which suggested more 
flexibility was required than the initial 
"dispatching-based-on-queue-id-only" functionality the first RFC provided.

Where I expected to land was a design where I would have something like 
the RFC v2 design with match+process callbacks, and then a supplementary 
"hard-coded" dispatch-internal match API as well, where only the process 
function would be used (much like how RFC v1 worked).

It turned out the special-case API was not performing better (rather the 
opposite) for the evaluated use cases, so I dropped that idea.

That said, it could always be a future extension to re-introduce 
dispatcher-internal matching.
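
To make the match+process pair concrete: in application code, a handler
registration ends up looking something like the sketch below (the module
context, the queue-id-based match and module_handle_event() are all
illustrative; dispatcher_id is the application-chosen dispatcher id):

struct module_ctx {
	uint8_t queue_id;
	/* ... other module state ... */
};

/* Hypothetical per-module event processing function. */
static void module_handle_event(struct module_ctx *ctx,
				struct rte_event *event);

static bool
module_match(const struct rte_event *event, void *cb_data)
{
	const struct module_ctx *ctx = cb_data;

	/* Claim all events arriving on this module's queue. */
	return event->queue_id == ctx->queue_id;
}

static void
module_process(uint8_t event_dev_id, uint8_t event_port_id,
	       struct rte_event *events, uint16_t num, void *cb_data)
{
	struct module_ctx *ctx = cb_data;
	uint16_t i;

	RTE_SET_USED(event_dev_id);
	RTE_SET_USED(event_port_id);

	for (i = 0; i < num; i++)
		module_handle_event(ctx, &events[i]);
}

static struct module_ctx module = { .queue_id = 7 /* illustrative */ };

/* In application setup code: */
int handler_id;

handler_id = rte_event_dispatcher_register(dispatcher_id, module_match,
					   &module, module_process,
					   &module);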

> 2) Need to enforce service core for its usage.
> 

Well, Eventdev does that already, except on systems where all required 
event adapters have the appropriate INTERNAL_PORT capability.
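
For reference, the service core "wiring" the application needs for the
dispatcher boils down to something like the sketch below (dispatcher_id
and service_lcore_id are application-chosen, and the exact service setup
may vary):

	uint32_t service_id;

	rte_event_dispatcher_service_id_get(dispatcher_id, &service_id);

	rte_service_lcore_add(service_lcore_id);
	rte_service_map_lcore_set(service_id, service_lcore_id, 1);
	rte_service_runstate_set(service_id, 1);
	rte_service_lcore_start(service_lcore_id);

	rte_event_dispatcher_start(dispatcher_id);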

> IMO, Both are a given application's choice, All the application does
> not need to use this scheme. Keeping the code in lib/eventdev has the
> following issue.
> 
> 1)It is kind of enforcing above scheme for all the application
> modeling, which may not applicable for application use cases and
> eventdev device does not dictate a specific framework model.
> 2) The framework code, we never kept in device class library. i.e.,
> public APIs are implemented through device class API and public API
> don't have any no hook to PMD API.
> For example, we never kept lib/distributor/ code in lib/ethdev.
> 
> Other than the placement of this code, I agree with use case and
> solution at high level . The following could option for placement of
> this library. Based on that, we can have next level review.
> 

I'm fine with keeping this as a separate library, although I also don't 
see the harm in having some utility-type functionality in eventdev itself.

> 1) It is possible to plug in this to lib/graph by adding new graph
> model(@zhirun.yan@intel.com recently added
> RTE_GRAPH_MODEL_MCORE_DISPATCH)
> 
> Based on my understanding, That can translate to
> a)  Adding new graph model which allows to have it on graph walk
> (Graph walk is nothing but calling registered dispatcher routines)
> b) It is possible to add model specific APIs via
> rte_graph_model_model_name_xxxx()
> c) Graph library is not using match callback kind of scheme. Instead,
> nodes will process the packet and find its downstream node and enqueue
> to it and then graph_walk() calls the downstream node specific process
> function.
> With that, we can meet the original goal of business logic decoupling.
> However, Currently, nodes are not aware of what kind of graph model it
> is running, that could be one issue here as eventdev has more
> scheduling properties
> like schedule type etc., to overcome that issue, it may be possible to
> introduce nodes to graph model compatibility (where nodes can
> advertise the supported graph models)
> d) Currently we are planning to make graph API as stable, if we are
> taking this path, we need to hold
> https://patches.dpdk.org/project/dpdk/patch/20230810180515.113700-1-stephen@networkplumber.org/
> as
> we may need to update some public APIs.
> 
> 2) Have new library lib/event_dispatcher
> 
> 3) Move to example directory to showcase the framework
> 
> 4) Move to app/test-eventdev directory  to show the case of the framework.
> 
> 
> Thoughts?

I'm not sure I follow. Are you suggesting rte_graph could use 
rte_event_dispatcher, or that an application could use rte_graph to 
solve the same problem as rte_event_dispatcher solves?

I didn't review rte_graph in detail, but if it's anything like fd.io 
VPP's graph model, it's not a programming model that you switch to 
without significant application code impact.
  
Jerin Jacob Aug. 22, 2023, 12:32 p.m. UTC | #3
On Tue, Aug 22, 2023 at 2:12 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>
> On 2023-08-18 08:09, Jerin Jacob wrote:
> > On Fri, Jun 16, 2023 at 1:17 PM Mattias Rönnblom
> > <mattias.ronnblom@ericsson.com> wrote:
> >>
> >
> > Hi Mattais,
> >
> > Finally, got some time to review this series, and thanks for excellent
> > documentation.
> >
> > I understand the use case for the dispatcher, But following are some
> > of my concern
> >
> > 1) To decouple the application specific business logic, one need to
> > use two function pointers to access per packet (match and process)
> > function.
>
> The API design is based on community feedback, which suggested more
> flexibility was required than the initial
> "dispatching-based-on-queue-id-only" functionality the first RFC provided.
>
> Where I expected to land was a design where I would have something like
> the RFC v2 design with match+process callbacks, and then a supplementary
> "hard-coded" dispatch-internal match API as well, where only the process
> function would be used (much like how RFC v1 worked).
>
> It turned out the special-case API was not performing better (rather the
> opposite) for the evaluated use cases, so I dropped that idea.
>
> That said, it could always be a future extension to re-introduce
> dispatcher-internal matching.

Ack.

>
> > 2) Need to enforce service core for its usage.
> >
>
> Well, Eventdev does that already, except on systems where all required
> event adapters have the appropriate INTERNAL_PORT capability.

Yes. The difference is that one is for HW feature emulation and the other
is for framework usage.


>
> > IMO, Both are a given application's choice, All the application does
> > not need to use this scheme. Keeping the code in lib/eventdev has the
> > following issue.
> >
> > 1)It is kind of enforcing above scheme for all the application
> > modeling, which may not applicable for application use cases and
> > eventdev device does not dictate a specific framework model.
> > 2) The framework code, we never kept in device class library. i.e.,
> > public APIs are implemented through device class API and public API
> > don't have any no hook to PMD API.
> > For example, we never kept lib/distributor/ code in lib/ethdev.
> >
> > Other than the placement of this code, I agree with use case and
> > solution at high level . The following could option for placement of
> > this library. Based on that, we can have next level review.
> >
>
> I'm fine with keeping this as a separate library, although I also don't
> see the harm in having some utility-type functionality in eventdev itself.

I see harm, as:

1) It is kind of enforcing the above scheme for all application
modeling, which may not be applicable to all application use cases, and
the eventdev device does not dictate a specific framework model.

2) We have never kept framework code in a device class library; i.e.,
such public APIs are implemented on top of the device class API and
have no hook into the PMD API.
For example, we never kept lib/distributor/ code in lib/ethdev.

I would like to keep the eventdev library's scope to abstracting event
device features. We have some common code in the library whose scope is
sharing between PMDs, not a framework on top of the eventdev public APIs.

Having said that, I am supportive of getting this framework in as a new
library, and happy to review the new library.

>
> > 1) It is possible to plug in this to lib/graph by adding new graph
> > model(@zhirun.yan@intel.com recently added
> > RTE_GRAPH_MODEL_MCORE_DISPATCH)
> >
> > Based on my understanding, That can translate to
> > a)  Adding new graph model which allows to have it on graph walk
> > (Graph walk is nothing but calling registered dispatcher routines)
> > b) It is possible to add model specific APIs via
> > rte_graph_model_model_name_xxxx()
> > c) Graph library is not using match callback kind of scheme. Instead,
> > nodes will process the packet and find its downstream node and enqueue
> > to it and then graph_walk() calls the downstream node specific process
> > function.
> > With that, we can meet the original goal of business logic decoupling.
> > However, Currently, nodes are not aware of what kind of graph model it
> > is running, that could be one issue here as eventdev has more
> > scheduling properties
> > like schedule type etc., to overcome that issue, it may be possible to
> > introduce nodes to graph model compatibility (where nodes can
> > advertise the supported graph models)
> > d) Currently we are planning to make graph API as stable, if we are
> > taking this path, we need to hold
> > https://patches.dpdk.org/project/dpdk/patch/20230810180515.113700-1-stephen@networkplumber.org/
> > as
> > we may need to update some public APIs.
> >
> > 2) Have new library lib/event_dispatcher
> >
> > 3) Move to example directory to showcase the framework
> >
> > 4) Move to app/test-eventdev directory  to show the case of the framework.
> >
> >
> > Thoughts?
>
> I'm not sure I follow. Are you suggesting rte_graph could use
> rte_event_dispatcher, or that an application could use rte_graph to
> solve the same problem as rte_event_dispatcher solves?

The latter one: an application could use rte_graph to solve the same problem
as rte_event_dispatcher solves.
In fact, both are solving similar problems. See below.


>
> I didn't review rte_graph in detail, but if it's anything like fd.io
> VPP's graph model, it's not a programming model that you switch to
> without significant application code impact.

This is a new library, right? So, which existing applications?

It is not strictly like the VPP graph model. rte_graph supports plugging in
different graph models.

The following models are available:
https://doc.dpdk.org/guides/prog_guide/graph_lib.html
See
62.4.5.1. RTC (Run-To-Completion)
62.4.5.2. Dispatch model

RTC is similar to fd.io VPP. The other model is not like VPP.

If we choose to take the new library route instead of a new rte_graph model
for eventdev, then
https://doc.dpdk.org/guides/contributing/new_library.html describes the process.
  
Mattias Rönnblom Aug. 24, 2023, 11:17 a.m. UTC | #4
On 2023-08-22 14:32, Jerin Jacob wrote:
> On Tue, Aug 22, 2023 at 2:12 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>>
>> On 2023-08-18 08:09, Jerin Jacob wrote:
>>> On Fri, Jun 16, 2023 at 1:17 PM Mattias Rönnblom
>>> <mattias.ronnblom@ericsson.com> wrote:
>>>>
>>>
>>> Hi Mattais,
>>>
>>> Finally, got some time to review this series, and thanks for excellent
>>> documentation.
>>>
>>> I understand the use case for the dispatcher, But following are some
>>> of my concern
>>>
>>> 1) To decouple the application specific business logic, one need to
>>> use two function pointers to access per packet (match and process)
>>> function.
>>
>> The API design is based on community feedback, which suggested more
>> flexibility was required than the initial
>> "dispatching-based-on-queue-id-only" functionality the first RFC provided.
>>
>> Where I expected to land was a design where I would have something like
>> the RFC v2 design with match+process callbacks, and then a supplementary
>> "hard-coded" dispatch-internal match API as well, where only the process
>> function would be used (much like how RFC v1 worked).
>>
>> It turned out the special-case API was not performing better (rather the
>> opposite) for the evaluated use cases, so I dropped that idea.
>>
>> That said, it could always be a future extension to re-introduce
>> dispatcher-internal matching.
> 
> Ack.
> 
>>
>>> 2) Need to enforce service core for its usage.
>>>
>>
>> Well, Eventdev does that already, except on systems where all required
>> event adapters have the appropriate INTERNAL_PORT capability.
> 
> Yes. The difference is, one is for HW feature emulation and other one
> for framework usage.
> 

Can you elaborate why that difference is relevant?

Both the adapters and the event dispatcher are optional, so if you have 
an issue with service cores, you can avoid their use.

> 
>>
>>> IMO, Both are a given application's choice, All the application does
>>> not need to use this scheme. Keeping the code in lib/eventdev has the
>>> following issue.
>>>
>>> 1)It is kind of enforcing above scheme for all the application
>>> modeling, which may not applicable for application use cases and
>>> eventdev device does not dictate a specific framework model.
>>> 2) The framework code, we never kept in device class library. i.e.,
>>> public APIs are implemented through device class API and public API
>>> don't have any no hook to PMD API.
>>> For example, we never kept lib/distributor/ code in lib/ethdev.
>>>
>>> Other than the placement of this code, I agree with use case and
>>> solution at high level . The following could option for placement of
>>> this library. Based on that, we can have next level review.
>>>
>>
>> I'm fine with keeping this as a separate library, although I also don't
>> see the harm in having some utility-type functionality in eventdev itself.
> 
> I see harm as
> 
> 1)It is kind of enforcing above scheme for all the application
> modeling, which may not applicable for all application use cases and
> eventdev device does not dictate a specific framework model.
> 

What scheme is being enforced? Using this thing is optional.

> 2) The framework code, we never kept in device class library. i.e.,
> public APIs are implemented through device class API and public API
> don't have any no hook to PMD API.
> For example, we never kept lib/distributor/ code in lib/ethdev.
> 
> I would to keep eventDEV library scope as abstracting event device features.
> We have some common code in library whose scope is sharing between PMDs
> not a framework on top eventdev public APIs.
> 
> Having said that, I supportive to get this framework as new library and
> happy to review the new library.
> 

Thanks.

I'll reshape the event dispatcher as a separate library and submit a new 
patch.

>>
>>> 1) It is possible to plug in this to lib/graph by adding new graph
>>> model(@zhirun.yan@intel.com recently added
>>> RTE_GRAPH_MODEL_MCORE_DISPATCH)
>>>
>>> Based on my understanding, That can translate to
>>> a)  Adding new graph model which allows to have it on graph walk
>>> (Graph walk is nothing but calling registered dispatcher routines)
>>> b) It is possible to add model specific APIs via
>>> rte_graph_model_model_name_xxxx()
>>> c) Graph library is not using match callback kind of scheme. Instead,
>>> nodes will process the packet and find its downstream node and enqueue
>>> to it and then graph_walk() calls the downstream node specific process
>>> function.
>>> With that, we can meet the original goal of business logic decoupling.
>>> However, Currently, nodes are not aware of what kind of graph model it
>>> is running, that could be one issue here as eventdev has more
>>> scheduling properties
>>> like schedule type etc., to overcome that issue, it may be possible to
>>> introduce nodes to graph model compatibility (where nodes can
>>> advertise the supported graph models)
>>> d) Currently we are planning to make graph API as stable, if we are
>>> taking this path, we need to hold
>>> https://patches.dpdk.org/project/dpdk/patch/20230810180515.113700-1-stephen@networkplumber.org/
>>> as
>>> we may need to update some public APIs.
>>>
>>> 2) Have new library lib/event_dispatcher
>>>
>>> 3) Move to example directory to showcase the framework
>>>
>>> 4) Move to app/test-eventdev directory  to show the case of the framework.
>>>
>>>
>>> Thoughts?
>>
>> I'm not sure I follow. Are you suggesting rte_graph could use
>> rte_event_dispatcher, or that an application could use rte_graph to
>> solve the same problem as rte_event_dispatcher solves?
> 
> Later one, Application could use rte_graph to solve the same problem
> as rte_event_dispatcher solves.
> In fact, both are solving similar problems. See below.
> 
> 
>>
>> I didn't review rte_graph in detail, but if it's anything like fd.io
>> VPP's graph model, it's not a programming model that you switch to
>> without significant application code impact.
> 
> This is a new library, right? So, which existing applications?
> 

Existing DPDK applications whose domain logic is not organized as a
graph. Which, I'm guessing, are many.

Moving from "raw" event device dequeue to the event dispatcher model is
a trivial, non-intrusive operation.
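
For comparison, the kind of "raw" dequeue loop being replaced typically
looks something like the below (the queue ids and per-module functions
are of course application-defined):

	struct rte_event events[32];
	uint16_t i, n;

	n = rte_event_dequeue_burst(event_dev_id, event_port_id, events,
				    RTE_DIM(events), 0);

	for (i = 0; i < n; i++) {
		switch (events[i].queue_id) {
		case MODULE_A_QUEUE_ID:
			module_a_process(&events[i]);
			break;
		case MODULE_B_QUEUE_ID:
			module_b_process(&events[i]);
			break;
		default:
			/* unexpected event; application-specific */
			break;
		}
	}

With the dispatcher, each queue id test becomes a match callback, each
case body a process callback, and the dequeue loop itself moves into the
dispatcher's service function.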

> It is not strictly like VPP graph model. rte_graph supports plugin for
> the different graph models.
> 
> Following are the models available.
> https://doc.dpdk.org/guides/prog_guide/graph_lib.html
> See
> 62.4.5.1. RTC (Run-To-Completion)
> 62.4.5.2. Dispatch model
> 
> RTC is similar to fd.io VPP. Other model is not like VPP.
> 
> If we choose to take new library route instead of new rte_graph model
> for eventdev then
> https://doc.dpdk.org/guides/contributing/new_library.html will be the process.
  
Jerin Jacob Aug. 25, 2023, 7:27 a.m. UTC | #5
On Thu, Aug 24, 2023 at 4:47 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>
> On 2023-08-22 14:32, Jerin Jacob wrote:

> >> Well, Eventdev does that already, except on systems where all required
> >> event adapters have the appropriate INTERNAL_PORT capability.
> >
> > Yes. The difference is, one is for HW feature emulation and other one
> > for framework usage.
> >
>
> Can you elaborate why that difference is relevant?
>
> Both the adapters and the event dispatcher are optional, so if you have
> an issue with service cores, you can avoid their use.

The adapter's service core is not optional if the HW doesn't have that
feature via the adapter API.


> >
> > 1)It is kind of enforcing above scheme for all the application
> > modeling, which may not applicable for all application use cases and
> > eventdev device does not dictate a specific framework model.
> >
>
> What scheme is being enforced? Using this thing is optional.

Yes. With the API exposed in the rte_event_... namespace and the framework
kept in lib/eventdev, one could think it is a layered SW model and that the
topmost event dispatcher needs to be used. Changing the namespace and moving
to a different library will fix that problem.


>
> > 2) The framework code, we never kept in device class library. i.e.,
> > public APIs are implemented through device class API and public API
> > don't have any no hook to PMD API.
> > For example, we never kept lib/distributor/ code in lib/ethdev.
> >
> > I would to keep eventDEV library scope as abstracting event device features.
> > We have some common code in library whose scope is sharing between PMDs
> > not a framework on top eventdev public APIs.
> >
> > Having said that, I supportive to get this framework as new library and
> > happy to review the new library.
> >
>
> Thanks.
>
> I'll reshape the event dispatcher as a separate library and submit a new
> patch.

Ack

> >>
> >> I didn't review rte_graph in detail, but if it's anything like fd.io
> >> VPP's graph model, it's not a programming model that you switch to
> >> without significant application code impact.
> >
> > This is a new library, right? So, which existing applications?
> >
>
> Existing DPDK applications, which domain logic is not organized as a
> graph. Which, I'm guessing, are many.

Yes. But I was comparing a new application based on the graph model vs. the
new event dispatch model,
not graph vs. the "raw" event device.

Nevertheless, if there is some in-house application which is
following the event dispatch model, and
one wants to upstream that model as a new library, no
objections from my side.


>
> Moving from "raw" event device dequeue to the event dispatcher model is
> a trivial, non-intrusive, operation.
>
> > It is not strictly like VPP graph model. rte_graph supports plugin for
> > the different graph models.
> >
> > Following are the models available.
> > https://doc.dpdk.org/guides/prog_guide/graph_lib.html
> > See
> > 62.4.5.1. RTC (Run-To-Completion)
> > 62.4.5.2. Dispatch model
> >
> > RTC is similar to fd.io VPP. Other model is not like VPP.
> >
> > If we choose to take new library route instead of new rte_graph model
> > for eventdev then
> > https://doc.dpdk.org/guides/contributing/new_library.html will be the process.
  
Mattias Rönnblom Sept. 1, 2023, 10:53 a.m. UTC | #6
On 2023-08-18 08:09, Jerin Jacob wrote:

<snip>

> 2) Have new library lib/event_dispatcher
> 

Should the library be named event_dispatcher, or simply dispatcher?

An underscore in the library name isn't exactly aesthetically pleasing, and
shorter is better. Also, the rte_event_* namespace is avoided altogether.

On the other hand, "dispatcher" is a little too generic, and a somewhat
grandiose name, for a relatively simple thing. "event_dispatcher" makes
the relation to eventdev obvious.

<snip>
  
Jerin Jacob Sept. 1, 2023, 10:56 a.m. UTC | #7
On Fri, Sep 1, 2023 at 4:23 PM Mattias Rönnblom <hofors@lysator.liu.se> wrote:
>
> On 2023-08-18 08:09, Jerin Jacob wrote:
>
> <snip>
>
> > 2) Have new library lib/event_dispatcher
> >
>
> Should the library be named event_dispatcher, or simply dispatcher?

Looks good to me.

>
> Underscore in library isn't exactly aesthetically pleasing, and shorter

> is better. Also, the rte_event_* namespace is avoided altogether.

+1

>
> On the other hand "dispatcher" is a little too generic, and somewhat
> grandiose name, for a relatively simple thing. "event_dispatcher" makes
> the relation to eventdev obvious.
>
> <snip>
  
Stephen Hemminger Sept. 6, 2023, 7:32 p.m. UTC | #8
On Mon, 4 Sep 2023 15:03:10 +0200
Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:

> The purpose of the dispatcher library is to decouple different parts
> of an eventdev-based application (e.g., processing pipeline stages),
> sharing the same underlying event device.
> 
> The dispatcher replaces the conditional logic (often, a switch
> statement) that typically follows an event device dequeue operation,
> where events are dispatched to different parts of the application
> based on event meta data, such as the queue id or scheduling type.
> 
> The concept is similar to a UNIX file descriptor event loop library.
> Instead of tying callback functions to fds as for example libevent
> does, the dispatcher relies on application-supplied matching callback
> functions to decide where to deliver events.
> 
> A dispatcher is configured to dequeue events from a specific event
> device, and ties into the service core framework, to do its (and the
> application's) work.
> 
> The dispatcher provides a convenient way for an eventdev-based
> application to use service cores for application-level processing, and
> thus for sharing those cores with other DPDK services.
> 
> Although the dispatcher adds some overhead, experience suggests that
> the net effect on the application (both synthetic benchmarks and more
> real-world applications) may well be positive. This is primarily due
> to clustering (see programming guide) reducing cache misses.
> 
> Benchmarking indicates that the overhead is ~10 cc/event (on a
> large core), with a handful of often-used handlers.
> 
> The dispatcher does not support run-time reconfiguration.
> 
> The use of the dispatcher library is optional, and an eventdev-based
> application may still opt to access the event device using direct
> eventdev API calls, or by some other means.

My experience with event libraries is mixed.
There are already multiple choices: libevent, libev, and libsystemd, as
well as rte_epoll(). Not sure if adding another one is helpful.

The main issue is dealing with external (non-DPDK) events, which usually
are handled as file descriptors (signalfd, epoll, etc.). The other issue
is thread safety. Most event libraries are not thread safe, which
makes handling one event waking up another difficult.

With DPDK, there is the additional question of use from non-EAL threads.

For the test suite, you should look at what libsystemd does.
  
Mattias Rönnblom Sept. 6, 2023, 8:28 p.m. UTC | #9
On 2023-09-06 21:32, Stephen Hemminger wrote:
> On Mon, 4 Sep 2023 15:03:10 +0200
> Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:
> 
>> The purpose of the dispatcher library is to decouple different parts
>> of an eventdev-based application (e.g., processing pipeline stages),
>> sharing the same underlying event device.
>>
>> The dispatcher replaces the conditional logic (often, a switch
>> statement) that typically follows an event device dequeue operation,
>> where events are dispatched to different parts of the application
>> based on event meta data, such as the queue id or scheduling type.
>>
>> The concept is similar to a UNIX file descriptor event loop library.
>> Instead of tying callback functions to fds as for example libevent
>> does, the dispatcher relies on application-supplied matching callback
>> functions to decide where to deliver events.
>>
>> A dispatcher is configured to dequeue events from a specific event
>> device, and ties into the service core framework, to do its (and the
>> application's) work.
>>
>> The dispatcher provides a convenient way for an eventdev-based
>> application to use service cores for application-level processing, and
>> thus for sharing those cores with other DPDK services.
>>
>> Although the dispatcher adds some overhead, experience suggests that
>> the net effect on the application (both synthetic benchmarks and more
>> real-world applications) may well be positive. This is primarily due
>> to clustering (see programming guide) reducing cache misses.
>>
>> Benchmarking indicates that the overhead is ~10 cc/event (on a
>> large core), with a handful of often-used handlers.
>>
>> The dispatcher does not support run-time reconfiguration.
>>
>> The use of the dispatcher library is optional, and an eventdev-based
>> application may still opt to access the event device using direct
>> eventdev API calls, or by some other means.
> 
> My experience with event libraries is mixed.
> There are already multiple choices libevent, libev, and libsystemd as
> well as rte_epoll().  Not sure if adding another one is helpful.
> 

This library *conceptually* provides the same kind of functionality as 
libevent, but has nothing to do with file descriptor events. These are 
eventdev events, and thus are tied to the arrival of a packet, a 
notification of some kind of hardware offload, a timeout, or something else 
DPDK PMD-related.

> The main issue is dealing with external (non DPDK) events which usually
> are handled as file descriptors (signalfd, epoll, etc). The other issue
> is the thread safety. Most event libraries are not thread safe which
> makes handling one event waking up another difficult.
> 
This machinery is for use exclusively by EAL threads, for fast-path 
packet processing. No syscalls, no non-DPDK events.

> With DPDK, there is the additional questions about use from non-EAL threads.
> 

See above.

> For the test suite, you should look at what libsystemd does.
>
  

Patch

diff --git a/lib/eventdev/meson.build b/lib/eventdev/meson.build
index 6edf98dfa5..c0edc744fe 100644
--- a/lib/eventdev/meson.build
+++ b/lib/eventdev/meson.build
@@ -19,6 +19,7 @@  sources = files(
         'rte_event_crypto_adapter.c',
         'rte_event_eth_rx_adapter.c',
         'rte_event_eth_tx_adapter.c',
+        'rte_event_dispatcher.c',
         'rte_event_ring.c',
         'rte_event_timer_adapter.c',
         'rte_eventdev.c',
@@ -27,6 +28,7 @@  headers = files(
         'rte_event_crypto_adapter.h',
         'rte_event_eth_rx_adapter.h',
         'rte_event_eth_tx_adapter.h',
+        'rte_event_dispatcher.h',
         'rte_event_ring.h',
         'rte_event_timer_adapter.h',
         'rte_eventdev.h',
diff --git a/lib/eventdev/rte_event_dispatcher.c b/lib/eventdev/rte_event_dispatcher.c
new file mode 100644
index 0000000000..d4bd39754a
--- /dev/null
+++ b/lib/eventdev/rte_event_dispatcher.c
@@ -0,0 +1,793 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_branch_prediction.h>
+#include <rte_common.h>
+#include <rte_lcore.h>
+#include <rte_random.h>
+#include <rte_service_component.h>
+
+#include "eventdev_pmd.h"
+
+#include <rte_event_dispatcher.h>
+
+#define EVD_MAX_PORTS_PER_LCORE 4
+#define EVD_MAX_HANDLERS 32
+#define EVD_MAX_FINALIZERS 16
+#define EVD_AVG_PRIO_INTERVAL 2000
+
+struct rte_event_dispatcher_lcore_port {
+	uint8_t port_id;
+	uint16_t batch_size;
+	uint64_t timeout;
+};
+
+struct rte_event_dispatcher_handler {
+	int id;
+	rte_event_dispatcher_match_t match_fun;
+	void *match_data;
+	rte_event_dispatcher_process_t process_fun;
+	void *process_data;
+};
+
+struct rte_event_dispatcher_finalizer {
+	int id;
+	rte_event_dispatcher_finalize_t finalize_fun;
+	void *finalize_data;
+};
+
+struct rte_event_dispatcher_lcore {
+	uint8_t num_ports;
+	uint16_t num_handlers;
+	int32_t prio_count;
+	struct rte_event_dispatcher_lcore_port ports[EVD_MAX_PORTS_PER_LCORE];
+	struct rte_event_dispatcher_handler handlers[EVD_MAX_HANDLERS];
+	struct rte_event_dispatcher_stats stats;
+} __rte_cache_aligned;
+
+struct rte_event_dispatcher {
+	uint8_t id;
+	uint8_t event_dev_id;
+	int socket_id;
+	uint32_t service_id;
+	struct rte_event_dispatcher_lcore lcores[RTE_MAX_LCORE];
+	uint16_t num_finalizers;
+	struct rte_event_dispatcher_finalizer finalizers[EVD_MAX_FINALIZERS];
+};
+
+static struct rte_event_dispatcher *dispatchers[UINT8_MAX];
+
+static bool
+evd_has_dispatcher(uint8_t id)
+{
+	return dispatchers[id] != NULL;
+}
+
+static struct rte_event_dispatcher *
+evd_get_dispatcher(uint8_t id)
+{
+	return dispatchers[id];
+}
+
+static void
+evd_set_dispatcher(uint8_t id, struct rte_event_dispatcher *dispatcher)
+{
+	dispatchers[id] = dispatcher;
+}
+
+#define EVD_VALID_ID_OR_RET_EINVAL(id)					\
+	do {								\
+		if (unlikely(!evd_has_dispatcher(id))) {		\
+			RTE_EDEV_LOG_ERR("Invalid dispatcher id %d\n", id); \
+			return -EINVAL;					\
+		}							\
+	} while (0)
+
+static int
+evd_lookup_handler_idx(struct rte_event_dispatcher_lcore *lcore,
+		       const struct rte_event *event)
+{
+	uint16_t i;
+
+	for (i = 0; i < lcore->num_handlers; i++) {
+		struct rte_event_dispatcher_handler *handler =
+			&lcore->handlers[i];
+
+		if (handler->match_fun(event, handler->match_data))
+			return i;
+	}
+
+	return -1;
+}
+
+static void
+evd_prioritize_handler(struct rte_event_dispatcher_lcore *lcore,
+		       int handler_idx)
+{
+	struct rte_event_dispatcher_handler tmp;
+
+	if (handler_idx == 0)
+		return;
+
+	/* Let the lucky handler "bubble" up the list */
+
+	tmp = lcore->handlers[handler_idx - 1];
+
+	lcore->handlers[handler_idx - 1] = lcore->handlers[handler_idx];
+
+	lcore->handlers[handler_idx] = tmp;
+}
+
+static inline void
+evd_consider_prioritize_handler(struct rte_event_dispatcher_lcore *lcore,
+				int handler_idx, uint16_t handler_events)
+{
+	lcore->prio_count -= handler_events;
+
+	if (unlikely(lcore->prio_count <= 0)) {
+		evd_prioritize_handler(lcore, handler_idx);
+
+		/*
+		 * Randomize the interval in the unlikely case
+		 * the traffic follows some very strict pattern.
+		 */
+		lcore->prio_count =
+			rte_rand_max(EVD_AVG_PRIO_INTERVAL) +
+			EVD_AVG_PRIO_INTERVAL / 2;
+	}
+}
+
+static inline void
+evd_dispatch_events(struct rte_event_dispatcher *dispatcher,
+		    struct rte_event_dispatcher_lcore *lcore,
+		    struct rte_event_dispatcher_lcore_port *port,
+		    struct rte_event *events, uint16_t num_events)
+{
+	int i;
+	struct rte_event bursts[EVD_MAX_HANDLERS][num_events];
+	uint16_t burst_lens[EVD_MAX_HANDLERS] = { 0 };
+	uint16_t drop_count = 0;
+	uint16_t dispatch_count;
+	uint16_t dispatched = 0;
+
+	for (i = 0; i < num_events; i++) {
+		struct rte_event *event = &events[i];
+		int handler_idx;
+
+		handler_idx = evd_lookup_handler_idx(lcore, event);
+
+		if (unlikely(handler_idx < 0)) {
+			drop_count++;
+			continue;
+		}
+
+		bursts[handler_idx][burst_lens[handler_idx]] = *event;
+		burst_lens[handler_idx]++;
+	}
+
+	dispatch_count = num_events - drop_count;
+
+	for (i = 0; i < lcore->num_handlers &&
+		 dispatched < dispatch_count; i++) {
+		struct rte_event_dispatcher_handler *handler =
+			&lcore->handlers[i];
+		uint16_t len = burst_lens[i];
+
+		if (len == 0)
+			continue;
+
+		handler->process_fun(dispatcher->event_dev_id, port->port_id,
+				     bursts[i], len, handler->process_data);
+
+		dispatched += len;
+
+		/*
+		 * Safe, since any reshuffling will only involve
+		 * already-processed handlers.
+		 */
+		evd_consider_prioritize_handler(lcore, i, len);
+	}
+
+	lcore->stats.ev_batch_count++;
+	lcore->stats.ev_dispatch_count += dispatch_count;
+	lcore->stats.ev_drop_count += drop_count;
+
+	for (i = 0; i < dispatcher->num_finalizers; i++) {
+		struct rte_event_dispatcher_finalizer *finalizer =
+			&dispatcher->finalizers[i];
+
+		finalizer->finalize_fun(dispatcher->event_dev_id,
+					port->port_id,
+					finalizer->finalize_data);
+	}
+}
+
+static __rte_always_inline uint16_t
+evd_port_dequeue(struct rte_event_dispatcher *dispatcher,
+		 struct rte_event_dispatcher_lcore *lcore,
+		 struct rte_event_dispatcher_lcore_port *port)
+{
+	uint16_t batch_size = port->batch_size;
+	struct rte_event events[batch_size];
+	uint16_t n;
+
+	n = rte_event_dequeue_burst(dispatcher->event_dev_id, port->port_id,
+				    events, batch_size, port->timeout);
+
+	if (likely(n > 0))
+		evd_dispatch_events(dispatcher, lcore, port, events, n);
+
+	lcore->stats.poll_count++;
+
+	return n;
+}
+
+static __rte_always_inline uint16_t
+evd_lcore_process(struct rte_event_dispatcher *dispatcher,
+		  struct rte_event_dispatcher_lcore *lcore)
+{
+	uint16_t i;
+	uint16_t event_count = 0;
+
+	for (i = 0; i < lcore->num_ports; i++) {
+		struct rte_event_dispatcher_lcore_port *port =
+			&lcore->ports[i];
+
+		event_count += evd_port_dequeue(dispatcher, lcore, port);
+	}
+
+	return event_count;
+}
+
+static int32_t
+evd_process(void *userdata)
+{
+	struct rte_event_dispatcher *dispatcher = userdata;
+	unsigned int lcore_id = rte_lcore_id();
+	struct rte_event_dispatcher_lcore *lcore =
+		&dispatcher->lcores[lcore_id];
+	uint64_t event_count;
+
+	event_count = evd_lcore_process(dispatcher, lcore);
+
+	if (unlikely(event_count == 0))
+		return -EAGAIN;
+
+	return 0;
+}
+
+static int
+evd_service_register(struct rte_event_dispatcher *dispatcher)
+{
+	struct rte_service_spec service = {
+		.callback = evd_process,
+		.callback_userdata = dispatcher,
+		.capabilities = RTE_SERVICE_CAP_MT_SAFE,
+		.socket_id = dispatcher->socket_id
+	};
+	int rc;
+
+	snprintf(service.name, RTE_SERVICE_NAME_MAX - 1, "evd_%d",
+		 dispatcher->id);
+
+	rc = rte_service_component_register(&service, &dispatcher->service_id);
+
+	if (rc)
+		RTE_EDEV_LOG_ERR("Registration of event dispatcher service "
+				 "%s failed with error code %d\n",
+				 service.name, rc);
+
+	return rc;
+}
+
+static int
+evd_service_unregister(struct rte_event_dispatcher *dispatcher)
+{
+	int rc;
+
+	rc = rte_service_component_unregister(dispatcher->service_id);
+
+	if (rc)
+		RTE_EDEV_LOG_ERR("Unregistration of event dispatcher service "
+				 "failed with error code %d\n", rc);
+
+	return rc;
+}
+
+int
+rte_event_dispatcher_create(uint8_t id, uint8_t event_dev_id)
+{
+	int socket_id;
+	struct rte_event_dispatcher *dispatcher;
+	int rc;
+
+	if (evd_has_dispatcher(id)) {
+		RTE_EDEV_LOG_ERR("Dispatcher with id %d already exists\n",
+				 id);
+		return -EEXIST;
+	}
+
+	socket_id = rte_event_dev_socket_id(event_dev_id);
+
+	dispatcher =
+		rte_malloc_socket("event dispatcher",
+				  sizeof(struct rte_event_dispatcher),
+				  RTE_CACHE_LINE_SIZE, socket_id);
+
+	if (dispatcher == NULL) {
+		RTE_EDEV_LOG_ERR("Unable to allocate memory for event "
+				 "dispatcher\n");
+		return -ENOMEM;
+	}
+
+	*dispatcher = (struct rte_event_dispatcher) {
+		.id = id,
+		.event_dev_id = event_dev_id,
+		.socket_id = socket_id
+	};
+
+	rc = evd_service_register(dispatcher);
+
+	if (rc < 0) {
+		rte_free(dispatcher);
+		return rc;
+	}
+
+	evd_set_dispatcher(id, dispatcher);
+
+	return 0;
+}
+
+int
+rte_event_dispatcher_free(uint8_t id)
+{
+	struct rte_event_dispatcher *dispatcher;
+	int rc;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	rc = evd_service_unregister(dispatcher);
+
+	if (rc)
+		return rc;
+
+	evd_set_dispatcher(id, NULL);
+
+	rte_free(dispatcher);
+
+	return 0;
+}
+
+int
+rte_event_dispatcher_service_id_get(uint8_t id, uint32_t *service_id)
+{
+	struct rte_event_dispatcher *dispatcher;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	*service_id = dispatcher->service_id;
+
+	return 0;
+}
+
+static int
+lcore_port_index(struct rte_event_dispatcher_lcore *lcore,
+		 uint8_t event_port_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < lcore->num_ports; i++) {
+		struct rte_event_dispatcher_lcore_port *port =
+			&lcore->ports[i];
+
+		if (port->port_id == event_port_id)
+			return i;
+	}
+
+	return -1;
+}
+
+int
+rte_event_dispatcher_bind_port_to_lcore(uint8_t id, uint8_t event_port_id,
+					uint16_t batch_size, uint64_t timeout,
+					unsigned int lcore_id)
+{
+	struct rte_event_dispatcher *dispatcher;
+	struct rte_event_dispatcher_lcore *lcore;
+	struct rte_event_dispatcher_lcore_port *port;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	lcore =	&dispatcher->lcores[lcore_id];
+
+	if (lcore->num_ports == EVD_MAX_PORTS_PER_LCORE)
+		return -ENOMEM;
+
+	if (lcore_port_index(lcore, event_port_id) >= 0)
+		return -EEXIST;
+
+	port = &lcore->ports[lcore->num_ports];
+
+	*port = (struct rte_event_dispatcher_lcore_port) {
+		.port_id = event_port_id,
+		.batch_size = batch_size,
+		.timeout = timeout
+	};
+
+	lcore->num_ports++;
+
+	return 0;
+}
+
+int
+rte_event_dispatcher_unbind_port_from_lcore(uint8_t id, uint8_t event_port_id,
+					    unsigned int lcore_id)
+{
+	struct rte_event_dispatcher *dispatcher;
+	struct rte_event_dispatcher_lcore *lcore;
+	int port_idx;
+	struct rte_event_dispatcher_lcore_port *port;
+	struct rte_event_dispatcher_lcore_port *last;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	lcore =	&dispatcher->lcores[lcore_id];
+
+	port_idx = lcore_port_index(lcore, event_port_id);
+
+	if (port_idx < 0)
+		return -ENOENT;
+
+	port = &lcore->ports[port_idx];
+	last = &lcore->ports[lcore->num_ports - 1];
+
+	if (port != last)
+		*port = *last;
+
+	lcore->num_ports--;
+
+	return 0;
+}
+
+static struct rte_event_dispatcher_handler*
+evd_lcore_get_handler_by_id(struct rte_event_dispatcher_lcore *lcore,
+			    int handler_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < lcore->num_handlers; i++) {
+		struct rte_event_dispatcher_handler *handler =
+			&lcore->handlers[i];
+
+		if (handler->id == handler_id)
+			return handler;
+	}
+
+	return NULL;
+}
+
+static int
+evd_alloc_handler_id(struct rte_event_dispatcher *dispatcher)
+{
+	int handler_id = 0;
+	struct rte_event_dispatcher_lcore *reference_lcore =
+		&dispatcher->lcores[0];
+
+	if (reference_lcore->num_handlers == EVD_MAX_HANDLERS)
+		return -1;
+
+	while (evd_lcore_get_handler_by_id(reference_lcore, handler_id) != NULL)
+		handler_id++;
+
+	return handler_id;
+}
+
+static void
+evd_lcore_install_handler(struct rte_event_dispatcher_lcore *lcore,
+		    const struct rte_event_dispatcher_handler *handler)
+{
+	int handler_idx = lcore->num_handlers;
+
+	lcore->handlers[handler_idx] = *handler;
+	lcore->num_handlers++;
+}
+
+static void
+evd_install_handler(struct rte_event_dispatcher *dispatcher,
+		    const struct rte_event_dispatcher_handler *handler)
+{
+	int i;
+
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		struct rte_event_dispatcher_lcore *lcore =
+			&dispatcher->lcores[i];
+		evd_lcore_install_handler(lcore, handler);
+	}
+}
+
+int
+rte_event_dispatcher_register(uint8_t id,
+			      rte_event_dispatcher_match_t match_fun,
+			      void *match_data,
+			      rte_event_dispatcher_process_t process_fun,
+			      void *process_data)
+{
+	struct rte_event_dispatcher *dispatcher;
+	struct rte_event_dispatcher_handler handler = {
+		.match_fun = match_fun,
+		.match_data = match_data,
+		.process_fun = process_fun,
+		.process_data = process_data
+	};
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	handler.id = evd_alloc_handler_id(dispatcher);
+
+	if (handler.id < 0)
+		return -ENOMEM;
+
+	evd_install_handler(dispatcher, &handler);
+
+	return handler.id;
+}
+
+static int
+evd_lcore_uninstall_handler(struct rte_event_dispatcher_lcore *lcore,
+			    int handler_id)
+{
+	struct rte_event_dispatcher_handler *unreg_handler;
+	int handler_idx;
+	uint16_t last_idx;
+
+	unreg_handler = evd_lcore_get_handler_by_id(lcore, handler_id);
+
+	if (unreg_handler == NULL)
+		return -EINVAL;
+
+	handler_idx = unreg_handler - &lcore->handlers[0];
+
+	last_idx = lcore->num_handlers - 1;
+
+	if (handler_idx != last_idx) {
+		/* move all handlers to maintain handler order */
+		int n = last_idx - handler_idx;
+		memmove(unreg_handler, unreg_handler + 1,
+			sizeof(struct rte_event_dispatcher_handler) * n);
+	}
+
+	lcore->num_handlers--;
+
+	return 0;
+}
+
+static int
+evd_uninstall_handler(struct rte_event_dispatcher *dispatcher,
+		      int handler_id)
+{
+	unsigned int lcore_id;
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		struct rte_event_dispatcher_lcore *lcore =
+			&dispatcher->lcores[lcore_id];
+		int rc;
+
+		rc = evd_lcore_uninstall_handler(lcore, handler_id);
+
+		if (rc < 0)
+			return rc;
+	}
+
+	return 0;
+}
+
+int
+rte_event_dispatcher_unregister(uint8_t id, int handler_id)
+{
+	struct rte_event_dispatcher *dispatcher;
+	int rc;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	rc = evd_uninstall_handler(dispatcher, handler_id);
+
+	return rc;
+}
+
+static struct rte_event_dispatcher_finalizer*
+evd_get_finalizer_by_id(struct rte_event_dispatcher *dispatcher,
+		       int handler_id)
+{
+	int i;
+
+	for (i = 0; i < dispatcher->num_finalizers; i++) {
+		struct rte_event_dispatcher_finalizer *finalizer =
+			&dispatcher->finalizers[i];
+
+		if (finalizer->id == handler_id)
+			return finalizer;
+	}
+
+	return NULL;
+}
+
+static int
+evd_alloc_finalizer_id(struct rte_event_dispatcher *dispatcher)
+{
+	int finalizer_id = 0;
+
+	while (evd_get_finalizer_by_id(dispatcher, finalizer_id) != NULL)
+		finalizer_id++;
+
+	return finalizer_id;
+}
+
+static struct rte_event_dispatcher_finalizer *
+evd_alloc_finalizer(struct rte_event_dispatcher *dispatcher)
+{
+	int finalizer_idx;
+	struct rte_event_dispatcher_finalizer *finalizer;
+
+	if (dispatcher->num_finalizers == EVD_MAX_FINALIZERS)
+		return NULL;
+
+	finalizer_idx = dispatcher->num_finalizers;
+	finalizer = &dispatcher->finalizers[finalizer_idx];
+
+	finalizer->id = evd_alloc_finalizer_id(dispatcher);
+
+	dispatcher->num_finalizers++;
+
+	return finalizer;
+}
+
+int
+rte_event_dispatcher_finalize_register(uint8_t id,
+			      rte_event_dispatcher_finalize_t finalize_fun,
+			      void *finalize_data)
+{
+	struct rte_event_dispatcher *dispatcher;
+	struct rte_event_dispatcher_finalizer *finalizer;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	finalizer = evd_alloc_finalizer(dispatcher);
+
+	if (finalizer == NULL)
+		return -ENOMEM;
+
+	finalizer->finalize_fun = finalize_fun;
+	finalizer->finalize_data = finalize_data;
+
+	return finalizer->id;
+}
+
+int
+rte_event_dispatcher_finalize_unregister(uint8_t id, int handler_id)
+{
+	struct rte_event_dispatcher *dispatcher;
+	struct rte_event_dispatcher_finalizer *unreg_finalizer;
+	int finalizer_idx;
+	uint16_t last_idx;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	unreg_finalizer = evd_get_finalizer_by_id(dispatcher, handler_id);
+
+	if (unreg_finalizer == NULL)
+		return -EINVAL;
+
+	finalizer_idx = unreg_finalizer - &dispatcher->finalizers[0];
+
+	last_idx = dispatcher->num_finalizers - 1;
+
+	if (finalizer_idx != last_idx) {
+		/* move all finalizers to maintain order */
+		int n = last_idx - finalizer_idx;
+		memmove(unreg_finalizer, unreg_finalizer + 1,
+			sizeof(struct rte_event_dispatcher_finalizer) * n);
+	}
+
+	dispatcher->num_finalizers--;
+
+	return 0;
+}
+
+static int
+evd_set_service_runstate(uint8_t id, int state)
+{
+	struct rte_event_dispatcher *dispatcher;
+	int rc;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	rc = rte_service_component_runstate_set(dispatcher->service_id,
+						state);
+
+	if (rc != 0) {
+		RTE_EDEV_LOG_ERR("Unexpected error %d occurred while setting "
+				 "service component run state to %d\n", rc,
+				 state);
+		RTE_ASSERT(0);
+	}
+
+	return 0;
+}
+
+int
+rte_event_dispatcher_start(uint8_t id)
+{
+	return evd_set_service_runstate(id, 1);
+}
+
+int
+rte_event_dispatcher_stop(uint8_t id)
+{
+	return evd_set_service_runstate(id, 0);
+}
+
+static void
+evd_aggregate_stats(struct rte_event_dispatcher_stats *result,
+		    const struct rte_event_dispatcher_stats *part)
+{
+	result->poll_count += part->poll_count;
+	result->ev_batch_count += part->ev_batch_count;
+	result->ev_dispatch_count += part->ev_dispatch_count;
+	result->ev_drop_count += part->ev_drop_count;
+}
+
+int
+rte_event_dispatcher_stats_get(uint8_t id,
+			       struct rte_event_dispatcher_stats *stats)
+{
+	struct rte_event_dispatcher *dispatcher;
+	unsigned int lcore_id;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+	*stats = (struct rte_event_dispatcher_stats) {};
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		struct rte_event_dispatcher_lcore *lcore =
+			&dispatcher->lcores[lcore_id];
+
+		evd_aggregate_stats(stats, &lcore->stats);
+	}
+
+	return 0;
+}
+
+int
+rte_event_dispatcher_stats_reset(uint8_t id)
+{
+	struct rte_event_dispatcher *dispatcher;
+	unsigned int lcore_id;
+
+	EVD_VALID_ID_OR_RET_EINVAL(id);
+	dispatcher = evd_get_dispatcher(id);
+
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		struct rte_event_dispatcher_lcore *lcore =
+			&dispatcher->lcores[lcore_id];
+
+		lcore->stats = (struct rte_event_dispatcher_stats) {};
+	}
+
+	return 0;
+
+}
diff --git a/lib/eventdev/rte_event_dispatcher.h b/lib/eventdev/rte_event_dispatcher.h
new file mode 100644
index 0000000000..8847c8ac1c
--- /dev/null
+++ b/lib/eventdev/rte_event_dispatcher.h
@@ -0,0 +1,480 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Ericsson AB
+ */
+
+#ifndef __RTE_EVENT_DISPATCHER_H__
+#define __RTE_EVENT_DISPATCHER_H__
+
+/**
+ * @file
+ *
+ * RTE Event Dispatcher
+ *
+ * The purpose of the event dispatcher is to help decouple different parts
+ * of an application (e.g., modules), sharing the same underlying
+ * event device.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_eventdev.h>
+
+/**
+ * Function prototype for match callbacks.
+ *
+ * Match callbacks are used by an application to decide how the
+ * event dispatcher distributes events to different parts of the
+ * application.
+ *
+ * The application is not expected to process the event at the point
+ * of the match call. Such matters should be deferred to the process
+ * callback invocation.
+ *
+ * The match callback may be used as an opportunity to prefetch data.
+ *
+ * @param event
+ *  Pointer to event
+ *
+ * @param cb_data
+ *  The pointer supplied by the application in
+ *  rte_event_dispatcher_register().
+ *
+ * @return
+ *   Returns true in case this event should be delivered (via
+ *   the process callback), and false otherwise.
+ */
+typedef bool
+(*rte_event_dispatcher_match_t)(const struct rte_event *event, void *cb_data);
+
+/**
+ * Function prototype for process callbacks.
+ *
+ * The process callbacks are used by the event dispatcher to deliver
+ * events for processing.
+ *
+ * @param event_dev_id
+ *  The originating event device id.
+ *
+ * @param event_port_id
+ *  The originating event port.
+ *
+ * @param events
+ *  Pointer to an array of events.
+ *
+ * @param num
+ *  The number of events in the @p events array.
+ *
+ * @param cb_data
+ *  The pointer supplied by the application in
+ *  rte_event_dispatcher_register().
+ */
+
+typedef void
+(*rte_event_dispatcher_process_t)(uint8_t event_dev_id, uint8_t event_port_id,
+				  struct rte_event *events, uint16_t num,
+				  void *cb_data);
+
+/**
+ * Function prototype for finalize callbacks.
+ *
+ * The finalize callbacks are used by the event dispatcher to notify
+ * the application it has delivered all events from a particular batch
+ * dequeued from the event device.
+ *
+ * @param event_dev_id
+ *  The originating event device id.
+ *
+ * @param event_port_id
+ *  The originating event port.
+ *
+ * @param cb_data
+ *  The pointer supplied by the application in
+ *  rte_event_dispatcher_finalize_register().
+ */
+
+typedef void
+(*rte_event_dispatcher_finalize_t)(uint8_t event_dev_id, uint8_t event_port_id,
+				   void *cb_data);
+
+/**
+ * Event dispatcher statistics
+ */
+struct rte_event_dispatcher_stats {
+	uint64_t poll_count;
+	/**< Number of event dequeue calls made toward the event device. */
+	uint64_t ev_batch_count;
+	/**< Number of non-empty event batches dequeued from event device.*/
+	uint64_t ev_dispatch_count;
+	/**< Number of events dispatched to a handler.*/
+	uint64_t ev_drop_count;
+	/**< Number of events dropped because no handler was found. */
+};
+
+/**
+ * Create an event dispatcher with the specified id.
+ *
+ * @param id
+ *  An application-specified, unique (across all event dispatcher
+ *  instances) identifier.
+ *
+ * @param event_dev_id
+ *  The identifier of the event device from which this event dispatcher
+ *  will dequeue events.
+ *
+ * @return
+ *   - 0: Success
+ *   - <0: Error code on failure
+ */
+__rte_experimental
+int
+rte_event_dispatcher_create(uint8_t id, uint8_t event_dev_id);
+
+/**
+ * Free an event dispatcher.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @return
+ *  - 0: Success
+ *  - <0: Error code on failure
+ */
+__rte_experimental
+int
+rte_event_dispatcher_free(uint8_t id);
+
+/**
+ * Retrieve the service identifier of an event dispatcher.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param [out] service_id
+ *  A pointer to a caller-supplied buffer where the event dispatcher's
+ *  service id will be stored.
+ *
+ * @return
+ *  - 0: Success
+ *  - <0: Error code on failure.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_service_id_get(uint8_t id, uint32_t *service_id);
+
+/**
+ * Binds an event device port to a specific lcore on the specified
+ * event dispatcher.
+ *
+ * This function configures the event port id to be used by the event
+ * dispatcher service, if run on the specified lcore.
+ *
+ * Multiple event device ports may be bound to the same lcore. A
+ * particular port must not be bound to more than one lcore.
+ *
+ * If the event dispatcher service is mapped (with
+ * rte_service_map_lcore_set()) to an lcore to which no ports are
+ * bound, the service function will be a no-operation.
+ *
+ * This function may be called by any thread (including unregistered
+ * non-EAL threads), but not while the event dispatcher is running on
+ * the lcore specified by @c lcore_id.
+ *
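+ * A minimal usage sketch; the dispatcher id, event port id, batch
+ * size, timeout and lcore id below are hypothetical,
+ * application-level values:
+ *
+ * @code{.c}
+ * int rc;
+ *
+ * rc = rte_event_dispatcher_bind_port_to_lcore(app_dispatcher_id,
+ *                                              app_event_port_id,
+ *                                              32, 0, app_lcore_id);
+ * if (rc != 0)
+ *         rte_panic("Unable to bind event port\n");
+ * @endcode
+ *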
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param event_port_id
+ *  The event device port identifier.
+ *
+ * @param batch_size
+ *  The batch size to use in rte_event_dequeue_burst(), for the
+ *  configured event device port and lcore.
+ *
+ * @param timeout
+ *  The timeout parameter to use in rte_event_dequeue_burst(), for the
+ *  configured event device port and lcore.
+ *
+ * @param lcore_id
+ *  The lcore by which this event port will be used.
+ *
+ * @return
+ *  - 0: Success
+ *  - -ENOMEM: Unable to allocate sufficient resources.
+ *  - -EEXIST: Event port is already configured.
+ *  - -EINVAL: Invalid arguments.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_bind_port_to_lcore(uint8_t id, uint8_t event_port_id,
+					uint16_t batch_size, uint64_t timeout,
+					unsigned int lcore_id);
+
+/**
+ * Unbind an event device port from a specific lcore.
+ *
+ * This function may be called by any thread (including unregistered
+ * non-EAL threads), but not while the event dispatcher is running on
+ * the lcore specified by @c lcore_id.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param event_port_id
+ *  The event device port identifier.
+ *
+ * @param lcore_id
+ *  The lcore which was using this event port.
+ *
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: Invalid @c id.
+ *  - -ENOENT: Event port id not bound to this @c lcore_id.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_unbind_port_from_lcore(uint8_t id, uint8_t event_port_id,
+					    unsigned int lcore_id);
+
+/**
+ * Register an event handler.
+ *
+ * The match callback function is used to select if a particular event
+ * should be delivered, using the corresponding process callback
+ * function.
+ *
+ * The reason for having two distinct steps is to allow the dispatcher
+ * to deliver all matching events as a batch. This in turn causes
+ * events of a particular kind to be processed back-to-back,
+ * improving cache locality.
+ *
+ * The list of handler callback functions is shared among all lcores,
+ * but will only be executed on lcores which have an eventdev port
+ * bound to them, and which are running the event dispatcher service.
+ *
+ * An event is delivered to at most one handler. Events where no
+ * handler is found are dropped.
+ *
+ * The application must not depend on the order in which the match
+ * functions are invoked.
+ *
+ * Ordering of events is not guaranteed to be maintained between
+ * different process callbacks. For example, suppose there are two
+ * handlers registered, matching different subsets of events arriving
+ * on an atomic queue. A batch of events [ev0, ev1, ev2] is dequeued
+ * on a particular port, all pertaining to the same flow. The match
+ * callback for registration A returns true for ev0 and ev2, and the
+ * match callback for registration B returns true for ev1. In that
+ * scenario, the event dispatcher may choose to deliver first [ev0,
+ * ev2] using A's process callback, and then [ev1] using B's - or vice
+ * versa.
+ *
+ * rte_event_dispatcher_register() may be called by any thread
+ * (including unregistered non-EAL threads), but not while the event
+ * dispatcher is running on any service lcore.
+ *
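+ * A registration sketch, using the hypothetical app_match_ethdev()
+ * and app_process_ethdev() callbacks sketched in the callback type
+ * documentation above:
+ *
+ * @code{.c}
+ * int handler_id;
+ *
+ * handler_id = rte_event_dispatcher_register(app_dispatcher_id,
+ *                                            app_match_ethdev, NULL,
+ *                                            app_process_ethdev,
+ *                                            NULL);
+ * if (handler_id < 0)
+ *         rte_panic("Unable to register handler\n");
+ * @endcode
+ *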
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param match_fun
+ *  The match callback function.
+ *
+ * @param match_cb_data
+ *  A pointer to some application-specific opaque data (or NULL),
+ *  which is supplied back to the application when match_fun is
+ *  called.
+ *
+ * @param process_fun
+ *  The process callback function.
+ *
+ * @param process_cb_data
+ *  A pointer to some application-specific opaque data (or NULL),
+ *  which is supplied back to the application when process_fun is
+ *  called.
+ *
+ * @return
+ *  - >= 0: The identifier for this registration.
+ *  - -ENOMEM: Unable to allocate sufficient resources.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_register(uint8_t id,
+			      rte_event_dispatcher_match_t match_fun,
+			      void *match_cb_data,
+			      rte_event_dispatcher_process_t process_fun,
+			      void *process_cb_data);
+
+/**
+ * Unregister an event handler.
+ *
+ * This function may be called by any thread (including unregistered
+ * non-EAL threads), but not while the event dispatcher is running on
+ * any service lcore.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param handler_id
+ *  The handler registration id returned by the original
+ *  rte_event_dispatcher_register() call.
+ *
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: The @c id and/or the @c handler_id parameter was invalid.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_unregister(uint8_t id, int handler_id);
+
+/**
+ * Register a finalize callback function.
+ *
+ * An application may optionally install one or more finalize
+ * callbacks.
+ *
+ * All finalize callbacks are invoked by the event dispatcher when a
+ * complete batch of events (retrieved using rte_event_dequeue_burst())
+ * has been delivered to the application (or been dropped).
+ *
+ * The finalize callback is not tied to any particular handler.
+ *
+ * The finalize callback provides an opportunity for the application
+ * to do per-batch processing. One case where this may be useful is if
+ * an event output buffer is used, and is shared among several
+ * handlers. In such a case, proper output buffer flushing may be
+ * ensured using a finalize callback.
+ *
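+ * A registration sketch, using the hypothetical app_flush_events()
+ * callback and app_event_buffer structure sketched in the
+ * rte_event_dispatcher_finalize_t documentation above, with
+ * app_buffer being an instance of the latter:
+ *
+ * @code{.c}
+ * int reg_id;
+ *
+ * reg_id = rte_event_dispatcher_finalize_register(app_dispatcher_id,
+ *                                                 app_flush_events,
+ *                                                 &app_buffer);
+ * if (reg_id < 0)
+ *         rte_panic("Unable to register finalize callback\n");
+ * @endcode
+ *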
+ * rte_event_dispatcher_finalize_register() may be called by any
+ * thread (including unregistered non-EAL threads), but not while the
+ * event dispatcher is running on any service lcore.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param finalize_fun
+ *  The function called after completing the processing of a
+ *  dequeue batch.
+ *
+ * @param finalize_data
+ *  A pointer to some application-specific opaque data (or NULL),
+ *  which is supplied back to the application when @c finalize_fun is
+ *  called.
+ *
+ * @return
+ *  - >= 0: The identifier for this registration.
+ *  - -ENOMEM: Unable to allocate sufficient resources.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_finalize_register(uint8_t id,
+			    rte_event_dispatcher_finalize_t finalize_fun,
+			    void *finalize_data);
+
+/**
+ * Unregister a finalize callback.
+ *
+ * This function may be called by any thread (including unregistered
+ * non-EAL threads), but not while the event dispatcher is running on
+ * any service lcore.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @param reg_id
+ *  The finalize registration id returned by the original
+ *  rte_event_dispatcher_finalize_register() call.
+ *
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: The @c id and/or the @c reg_id parameter was invalid.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_finalize_unregister(uint8_t id, int reg_id);
+
+/**
+ * Start an event dispatcher instance.
+ *
+ * Enables the event dispatcher service.
+ *
+ * The underlying event device must have been started prior to calling
+ * rte_event_dispatcher_start().
+ *
+ * For the event dispatcher to actually perform work (i.e., dispatch
+ * events), its service must have been mapped to one or more service
+ * lcores, and its service run state set to '1'. An event dispatcher's
+ * service id is retrieved using rte_event_dispatcher_service_id_get().
+ *
+ * Each service lcore to which the event dispatcher is mapped should
+ * have at least one event port configured. Such configuration is
+ * performed by calling rte_event_dispatcher_bind_port_to_lcore(),
+ * prior to starting the event dispatcher.
+ *
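+ * Below is a start-up sketch, assuming a single service lcore
+ * (app_lcore_id) which has already been added as a service lcore and
+ * started, and an already configured event device and event port
+ * (app_event_dev_id, app_event_port_id). The names are hypothetical,
+ * application-level values, and error handling is omitted for
+ * brevity:
+ *
+ * @code{.c}
+ * uint32_t service_id;
+ *
+ * rte_event_dispatcher_create(app_dispatcher_id, app_event_dev_id);
+ *
+ * rte_event_dispatcher_bind_port_to_lcore(app_dispatcher_id,
+ *                                         app_event_port_id, 32, 0,
+ *                                         app_lcore_id);
+ *
+ * rte_event_dispatcher_service_id_get(app_dispatcher_id, &service_id);
+ *
+ * rte_service_map_lcore_set(service_id, app_lcore_id, 1);
+ * rte_service_runstate_set(service_id, 1);
+ *
+ * rte_event_dev_start(app_event_dev_id);
+ *
+ * rte_event_dispatcher_start(app_dispatcher_id);
+ * @endcode
+ *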
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: Invalid @c id.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_start(uint8_t id);
+
+/**
+ * Stop a running event dispatcher instance.
+ *
+ * Disables the event dispatcher service.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: Invalid @c id.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_stop(uint8_t id);
+
+/**
+ * Retrieve statistics for an event dispatcher instance.
+ *
+ * This function is MT safe and may be called by any thread
+ * (including unregistered non-EAL threads).
+ *
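+ * A usage sketch, printing a couple of the counters (app_dispatcher_id
+ * is a hypothetical, application-level identifier):
+ *
+ * @code{.c}
+ * struct rte_event_dispatcher_stats stats;
+ *
+ * if (rte_event_dispatcher_stats_get(app_dispatcher_id, &stats) == 0)
+ *         printf("%" PRIu64 " events dispatched in %" PRIu64
+ *                " batches.\n", stats.ev_dispatch_count,
+ *                stats.ev_batch_count);
+ * @endcode
+ *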
+ * @param id
+ *  The event dispatcher identifier.
+ * @param[out] stats
+ *   A pointer to a structure to fill with statistics.
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: The @c id parameter was invalid.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_stats_get(uint8_t id,
+			       struct rte_event_dispatcher_stats *stats);
+
+/**
+ * Reset statistics for an event dispatcher instance.
+ *
+ * This function may be called by any thread (including unregistered
+ * non-EAL threads), but may not produce the correct result if the
+ * event dispatcher is running on any service lcore.
+ *
+ * @param id
+ *  The event dispatcher identifier.
+ *
+ * @return
+ *  - 0: Success
+ *  - -EINVAL: The @c id parameter was invalid.
+ */
+__rte_experimental
+int
+rte_event_dispatcher_stats_reset(uint8_t id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __RTE_EVENT_DISPATCHER__ */
diff --git a/lib/eventdev/version.map b/lib/eventdev/version.map
index 89068a5713..edf7ffe1b2 100644
--- a/lib/eventdev/version.map
+++ b/lib/eventdev/version.map
@@ -131,6 +131,20 @@  EXPERIMENTAL {
 	rte_event_eth_tx_adapter_runtime_params_init;
 	rte_event_eth_tx_adapter_runtime_params_set;
 	rte_event_timer_remaining_ticks_get;
+
+	rte_event_dispatcher_bind_port_to_lcore;
+	rte_event_dispatcher_create;
+	rte_event_dispatcher_finalize_register;
+	rte_event_dispatcher_finalize_unregister;
+	rte_event_dispatcher_free;
+	rte_event_dispatcher_register;
+	rte_event_dispatcher_service_id_get;
+	rte_event_dispatcher_start;
+	rte_event_dispatcher_stats_get;
+	rte_event_dispatcher_stats_reset;
+	rte_event_dispatcher_stop;
+	rte_event_dispatcher_unbind_port_from_lcore;
+	rte_event_dispatcher_unregister;
 };
 
 INTERNAL {