[v4,08/10] examples/l2fwd-event: add eventdev main loop

Message ID 20190924094209.3827-9-pbhagavatula@marvell.com (mailing list archive)
State Superseded, archived
Series example/l2fwd-event: introduce l2fwd-event example

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK

Commit Message

Pavan Nikhilesh Bhagavatula Sept. 24, 2019, 9:42 a.m. UTC
  From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Add event dev main loop based on enabled l2fwd options and eventdev
capabilities.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 examples/l2fwd-event/l2fwd_eventdev.c | 273 ++++++++++++++++++++++++++
 examples/l2fwd-event/main.c           |  10 +-
 2 files changed, 280 insertions(+), 3 deletions(-)
  

Comments

Nipun Gupta Sept. 27, 2019, 1:28 p.m. UTC | #1
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of
> pbhagavatula@marvell.com
> Sent: Tuesday, September 24, 2019 3:12 PM
> To: jerinj@marvell.com; bruce.richardson@intel.com; Akhil Goyal
> <akhil.goyal@nxp.com>; Marko Kovacevic <marko.kovacevic@intel.com>;
> Ori Kam <orika@mellanox.com>; Radu Nicolau <radu.nicolau@intel.com>;
> Tomasz Kantecki <tomasz.kantecki@intel.com>; Sunil Kumar Kori
> <skori@marvell.com>; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v4 08/10] examples/l2fwd-event: add eventdev
> main loop
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> Add event dev main loop based on enabled l2fwd options and eventdev
> capabilities.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---

<snip>

> +		if (flags & L2FWD_EVENT_TX_DIRECT) {
> +			rte_event_eth_tx_adapter_txq_set(mbuf, 0);
> +			while
> (!rte_event_eth_tx_adapter_enqueue(event_d_id,
> +								port_id,
> +								&ev, 1) &&
> +					!*done)
> +				;
> +		}

In the TX direct mode we can send packets directly to the Ethernet device using the
ethdev APIs. This will save unnecessary indirections and event unfolds within the driver.

> +
> +		if (timer_period > 0)
> +			__atomic_fetch_add(&eventdev_rsrc->stats[mbuf-
> >port].tx,
> +					   1, __ATOMIC_RELAXED);
> +	}
> +}
  
Pavan Nikhilesh Bhagavatula Sept. 27, 2019, 2:35 p.m. UTC | #2
>>
>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>
>> Add event dev main loop based on enabled l2fwd options and
>eventdev
>> capabilities.
>>
>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> ---
>
><snip>
>
>> +		if (flags & L2FWD_EVENT_TX_DIRECT) {
>> +			rte_event_eth_tx_adapter_txq_set(mbuf, 0);
>> +			while
>> (!rte_event_eth_tx_adapter_enqueue(event_d_id,
>> +								port_id,
>> +								&ev, 1)
>&&
>> +					!*done)
>> +				;
>> +		}
>
>In the TX direct mode we can send packets directly to the ethernet
>device using ethdev
>API's. This will save unnecessary indirections and event unfolds within
>the driver.

How would we guarantee atomicity of access to Tx queues between cores, given that we can only use one Tx queue?
Also, if SCHED_TYPE is ORDERED, how would we guarantee flow ordering?
The MT_LOCKFREE capability and flow ordering are abstracted through `rte_event_eth_tx_adapter_enqueue()`.

@see examples/eventdev_pipeline and app/test-eventdev/test_pipeline_*.

  
Nipun Gupta Sept. 30, 2019, 5:38 a.m. UTC | #3
> >> <snip>
> >
> >In the TX direct mode we can send packets directly to the ethernet
> >device using ethdev
> >API's. This will save unnecessary indirections and event unfolds within
> >the driver.
> 
> How would we guarantee atomicity of access to Tx queues? Between cores
> as we can only use one Tx queue.
> Also, if SCHED_TYPE is ORDERED how would we guarantee flow ordering?
> The capability of MT_LOCKFREE and flow ordering is abstracted through `
> rte_event_eth_tx_adapter_enqueue `.

I understand your objective here. Probably in your case DIRECT is equivalent
to giving the packet to the scheduler, which passes it on to the destined device.
On the NXP platform, DIRECT implies sending the packet directly to the device (eth/crypto),
and the scheduler pitches in internally.
Here we will need another option to send it directly to the device.
We can set up a call to discuss this, or send you a patch to incorporate
into your series.

> 
> @see examples/eventdev_pipeline and app/test-eventdev/test_pipeline_*.

Yes, we are aware of those. They are one way of showing how to build a complete eventdev pipeline,
but they don't work on NXP HW.
We plan to send patches to fix them for NXP HW soon.

Regards,
Nipun

  
Jerin Jacob Sept. 30, 2019, 6:38 a.m. UTC | #4
On Mon, Sep 30, 2019 at 11:08 AM Nipun Gupta <nipun.gupta@nxp.com> wrote:
>
> > <snip>
> > How would we guarantee atomicity of access to Tx queues? Between cores
> > as we can only use one Tx queue.
> > Also, if SCHED_TYPE is ORDERED how would we guarantee flow ordering?
> > The capability of MT_LOCKFREE and flow ordering is abstracted through `
> > rte_event_eth_tx_adapter_enqueue `.
>
> I understand your objective here. Probably in your case the DIRECT is equivalent
> to giving the packet to the scheduler, which will pass on the packet to the destined device.
> On NXP platform, DIRECT implies sending the packet directly to the device (eth/crypto),
> and scheduler will internally pitch in.
> Here we will need another option to send it directly to the device.
> We can set up a call to discuss the same, or send patch regarding this to you to incorporate
> the same in your series.

Yes. Sending the patch will make us understand better.

Currently, we have two different means of abstracting the Tx adapter fast
path:
a) SINGLE LINK QUEUE
b) rte_event_eth_tx_adapter_enqueue()

Could you please share why neither of the above schemes works for NXP HW?
If there is no additional functionality in
rte_event_eth_tx_adapter_enqueue(), you could
simply call the ethdev Tx burst function pointer directly, keeping the
abstraction intact and avoiding
one more code flow in the fast path.

If I guess right, since NXP HW supports MT_LOCKFREE and only atomic scheduling,
calling rte_eth_tx_burst() would be sufficient. But abstracting
over rte_event_eth_tx_adapter_enqueue()
makes the application's life easy. You can call the low-level DPAA2 Tx function in
rte_event_eth_tx_adapter_enqueue() to avoid any performance impact (we
are doing the same).


  
Nipun Gupta Sept. 30, 2019, 7:46 a.m. UTC | #5
> <snip>
> 
> Yes. Sending the patch will make us understand better.
> 
> Currently, We have two different means for abstracting Tx adapter fast
> path changes,
> a) SINGLE LINK QUEUE
> b) rte_event_eth_tx_adapter_enqueue()
> 
> Could you please share why any of the above schemes do not work for NXP
> HW?
> If there is no additional functionality in
> rte_event_eth_tx_adapter_enqueue(), you could
> simply call direct ethdev tx burst function pointer to make
> abstraction  intact to avoid
> one more code flow in the fast path.
> 
> If I guess it right since NXP HW supports MT_LOCKFREE and only atomic, due
> to
> that, calling eth_dev_tx_burst will be sufficient. But abstracting
> over rte_event_eth_tx_adapter_enqueue()
> makes application life easy. You can call the low level DPPA2 Tx function in
> rte_event_eth_tx_adapter_enqueue() to avoid any performance
> impact(We
> are doing the same).

Yes, that’s correct regarding our H/W capability.
Agreed that the application will become more complex by adding one more code flow,
but calling the Tx functions internally may cost additional CPU cycles.
Give us a couple of days to analyze the performance impact; as you say, I too
don't think it would be much. We should be able to manage it within our driver.

  
Pavan Nikhilesh Bhagavatula Sept. 30, 2019, 8:09 a.m. UTC | #6
>> <snip>
>
>Yes, that’s correct regarding our H/W capability.
>Agree that the application will become complex by adding more code
>flow,
>but calling Tx functions internally may lead to additional CPU cycles.
>Give us a couple of days to analyze the performance impact, and as you
>also say, I too
>don't think it would be much. We should be able to manage it in within
>our driver.

When the application calls rte_event_eth_tx_adapter_queue_add(), based on
the eth_dev_id the underlying event device can point
rte_event_eth_tx_adapter_enqueue() at a function that does the
platform-specific Tx directly.

i.e. if the eth dev is net/dpaa and the event dev is from the same platform, we need _not_ call
`rte_eth_tx_burst()` in `rte_event_eth_tx_adapter_enqueue()`; it can directly
invoke the platform-specific Tx function, which avoids the function pointer
indirection.

  
Nipun Gupta Sept. 30, 2019, 5:50 p.m. UTC | #7
> > <snip>
> 
> When application calls rte_event_eth_tx_adapter_queue_add() based on
> the eth_dev_id the underlying eventdevice can set
> set rte_event_eth_tx_adapter_enqueue() to directly call a function which
> does the platform specific Tx.
> 
> i.e if eth_dev is net/dpaa and event dev is also net/dpaa we need _not_ call
> `rte_eth_tx_burst()` in ` rte_event_eth_tx_adapter_enqueue()` it can directly
> Invoke the platform specific Rx function which would avoid function pointer
> indirection.

I have a performance concern regarding the burst mode; not w.r.t. the
function call sequence, but w.r.t. the burst functionality.

The API `rte_event_eth_tx_adapter_enqueue()` is called with `nb_rx` events. If we call the
Ethernet APIs directly from within the adapter, we will still need to send the events to
the Ethernet device separately rather than in a burst (or scan and separate the packets
internally per Ethernet device and queue pair). This separation is more complex in the
driver than in the application, as the application is aware of the eth devs and queues
it is using and can easily bifurcate the events.

I suggest adding a flag to the `rte_event_eth_tx_adapter_enqueue()` API to indicate
that the application is sending all the packets in a particular call to a single destination,
so that the driver can act smartly and send the burst to the Eth Tx function based on the
fields set in the first mbuf.

Does this seem fine to you? I plan to send a patch for this soon.

Regards,
Nipun

  
Pavan Nikhilesh Bhagavatula Oct. 1, 2019, 5:59 a.m. UTC | #8
>-----Original Message-----
>From: dev <dev-bounces@dpdk.org> On Behalf Of Nipun Gupta
>Sent: Monday, September 30, 2019 11:21 PM
>To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Jerin
>Jacob <jerinjacobk@gmail.com>
>Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
>bruce.richardson@intel.com; Akhil Goyal <akhil.goyal@nxp.com>;
>Marko Kovacevic <marko.kovacevic@intel.com>; Ori Kam
><orika@mellanox.com>; Radu Nicolau <radu.nicolau@intel.com>;
>Tomasz Kantecki <tomasz.kantecki@intel.com>; Sunil Kumar Kori
><skori@marvell.com>; dev@dpdk.org; Hemant Agrawal
><hemant.agrawal@nxp.com>
>Subject: Re: [dpdk-dev] [PATCH v4 08/10] examples/l2fwd-event: add
>eventdev main loop
>
>
>
>> -----Original Message-----
>> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>> Sent: Monday, September 30, 2019 1:39 PM
>> To: Nipun Gupta <nipun.gupta@nxp.com>; Jerin Jacob
><jerinjacobk@gmail.com>
>> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
>> bruce.richardson@intel.com; Akhil Goyal <akhil.goyal@nxp.com>;
>Marko
>> Kovacevic <marko.kovacevic@intel.com>; Ori Kam
><orika@mellanox.com>;
>> Radu Nicolau <radu.nicolau@intel.com>; Tomasz Kantecki
>> <tomasz.kantecki@intel.com>; Sunil Kumar Kori
><skori@marvell.com>;
>> dev@dpdk.org
>> Subject: RE: [dpdk-dev] [PATCH v4 08/10] examples/l2fwd-event: add
>eventdev
>> main loop
>>
>>
>>
>> >-----Original Message-----
>> >From: Nipun Gupta <nipun.gupta@nxp.com>
>> >Sent: Monday, September 30, 2019 1:17 PM
>> >To: Jerin Jacob <jerinjacobk@gmail.com>
>> >Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Jerin
>> >Jacob Kollanukkaran <jerinj@marvell.com>;
>> >bruce.richardson@intel.com; Akhil Goyal <akhil.goyal@nxp.com>;
>> >Marko Kovacevic <marko.kovacevic@intel.com>; Ori Kam
>> ><orika@mellanox.com>; Radu Nicolau <radu.nicolau@intel.com>;
>> >Tomasz Kantecki <tomasz.kantecki@intel.com>; Sunil Kumar Kori
>> ><skori@marvell.com>; dev@dpdk.org
>> >Subject: RE: [dpdk-dev] [PATCH v4 08/10] examples/l2fwd-event:
>add
>> >eventdev main loop
>> >
>> >
>> >
>> >> On Monday, September 30, 2019 12:08 PM, Jerin Jacob <jerinjacobk@gmail.com> wrote:
>> >>
>> >> On Mon, Sep 30, 2019 at 11:08 AM Nipun Gupta <nipun.gupta@nxp.com> wrote:
>> >> >
>> >> >
>> >> >
>> >> > > On Friday, September 27, 2019 8:05 PM, Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com> wrote:
>> >> > >
>> >> > > >>
>> >> > > >> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >> > > >>
>> >> > > >> Add event dev main loop based on enabled l2fwd options and
>> >> > > >> eventdev capabilities.
>> >> > > >>
>> >> > > >> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >> > > >> ---
>> >> > > >
>> >> > > ><snip>
>> >> > > >
>> >> > > >> +          if (flags & L2FWD_EVENT_TX_DIRECT) {
>> >> > > >> +                  rte_event_eth_tx_adapter_txq_set(mbuf, 0);
>> >> > > >> +                  while (!rte_event_eth_tx_adapter_enqueue(
>> >> > > >> +                                  event_d_id, port_id, &ev, 1) &&
>> >> > > >> +                                  !*done)
>> >> > > >> +                          ;
>> >> > > >> +          }
>> >> > > >
>> >> > > >In TX direct mode we can send packets directly to the ethernet
>> >> > > >device using ethdev APIs. This will save unnecessary
>> >> > > >indirections and event unfolds within the driver.
>> >> > >
>> >> > > How would we guarantee atomicity of access to Tx queues
>> >> > > between cores, as we can only use one Tx queue?
>> >> > > Also, if SCHED_TYPE is ORDERED, how would we guarantee flow
>> >> > > ordering? The capability of MT_LOCKFREE and flow ordering is
>> >> > > abstracted through `rte_event_eth_tx_adapter_enqueue`.
>> >> >
>> >> > I understand your objective here. Probably in your case DIRECT is
>> >> > equivalent to giving the packet to the scheduler, which will pass
>> >> > on the packet to the destined device.
>> >> > On NXP platforms, DIRECT implies sending the packet directly to
>> >> > the device (eth/crypto), and the scheduler will pitch in
>> >> > internally. Here we will need another option to send it directly
>> >> > to the device.
>> >> > We can set up a call to discuss the same, or send you a patch
>> >> > regarding this to incorporate in your series.
>> >>
>> >> Yes. Sending the patch will make us understand better.
>> >>
>> >> Currently, we have two different means for abstracting Tx adapter
>> >> fast path changes:
>> >> a) SINGLE LINK QUEUE
>> >> b) rte_event_eth_tx_adapter_enqueue()
>> >>
>> >> Could you please share why any of the above schemes does not work
>> >> for NXP HW? If there is no additional functionality in
>> >> rte_event_eth_tx_adapter_enqueue(), you could simply call the
>> >> direct ethdev tx burst function pointer to keep the abstraction
>> >> intact and avoid one more code flow in the fast path.
>> >>
>> >> If I guess right, since NXP HW supports MT_LOCKFREE and only
>> >> atomic, calling eth_dev_tx_burst will be sufficient. But
>> >> abstracting over rte_event_eth_tx_adapter_enqueue() makes
>> >> application life easy. You can call the low-level DPAA2 Tx
>> >> function in rte_event_eth_tx_adapter_enqueue() to avoid any
>> >> performance impact (we are doing the same).
>> >
>> >Yes, that's correct regarding our H/W capability.
>> >Agree that the application will become complex by adding more code
>> >flow, but calling Tx functions internally may lead to additional CPU
>> >cycles. Give us a couple of days to analyze the performance impact;
>> >as you also say, I too don't think it would be much. We should be
>> >able to manage it within our driver.
>>
>> When the application calls rte_event_eth_tx_adapter_queue_add(),
>> based on the eth_dev_id the underlying event device can set
>> rte_event_eth_tx_adapter_enqueue() to directly call a function which
>> does the platform-specific Tx.
>>
>> i.e. if the eth_dev is net/dpaa and the event dev is also dpaa, we
>> need _not_ call `rte_eth_tx_burst()` in
>> `rte_event_eth_tx_adapter_enqueue()`; it can directly invoke the
>> platform-specific Tx function, which would avoid the function pointer
>> indirection.
>
>I have some performance concerns regarding the burst mode; not w.r.t.
>the function call sequence, but w.r.t. the burst functionality.
>
>The API `rte_event_eth_tx_adapter_enqueue()` is called with `nb_rx`
>events. In case we are calling the Ethernet APIs directly from within
>the adapter, we will still need to send all of them separately to the
>Ethernet device rather than in a burst (or scan and separate the
>packets internally per Ethernet device and queue pair). This
>separation in the driver is more complex than in the application, as
>the application is aware of the Eth dev and queues it is using and
>thus can easily bifurcate the events.
>
>I suggest having a flag in the `rte_event_eth_tx_adapter_enqueue()`
>API to indicate that the application is sending all the packets in a
>particular API call to a single destination, so that the driver can
>act smartly and send the burst to the Eth Tx function on the basis of
>the fields set in the first mbuf.
>

We could have a flag for the above, but the application would still need
to segregate packets based on port_id, as `rte_event_dequeue_burst`
doesn't guarantee that all the packets arrive from the same ethernet
port/queue.

I think that since the application already sets mbuf->port and calls
`rte_event_eth_tx_adapter_txq_set`, the segregation should be done in
`rte_event_eth_tx_adapter_enqueue`, as it would be the same logic for
every application and reduces application complexity.

Regards,
Pavan.

>Seems fine to you guys? I plan to send the patch regarding this soon.
>
>Regards,
>Nipun
>
>>
>> >
>> >>
>> >>
>> >> >
>> >> > >
>> >> > > @see examples/eventdev_pipeline and
>> >> > > app/test-eventdev/test_pipeline_*.
>> >> >
>> >> > Yes, we are aware of that. They are one way of representing how
>> >> > to build a complete eventdev pipeline. They don't work on NXP HW.
>> >> > We plan to send patches to fix them for NXP HW soon.
>> >> >
>> >> > Regards,
>> >> > Nipun
>> >> >
>> >> > >
>> >> > > >
>> >> > > >> +
>> >> > > >> +          if (timer_period > 0)
>> >> > > >> +                  __atomic_fetch_add(&eventdev_rsrc->stats[mbuf->port].tx,
>> >> > > >> +                                     1, __ATOMIC_RELAXED);
>> >> > > >> +  }
>> >> > > >> +}
  

Patch

diff --git a/examples/l2fwd-event/l2fwd_eventdev.c b/examples/l2fwd-event/l2fwd_eventdev.c
index f964c69d6..345d9d15b 100644
--- a/examples/l2fwd-event/l2fwd_eventdev.c
+++ b/examples/l2fwd-event/l2fwd_eventdev.c
@@ -18,6 +18,12 @@ 
 #include "l2fwd_common.h"
 #include "l2fwd_eventdev.h"
 
+#define L2FWD_EVENT_SINGLE	0x1
+#define L2FWD_EVENT_BURST	0x2
+#define L2FWD_EVENT_TX_DIRECT	0x4
+#define L2FWD_EVENT_TX_ENQ	0x8
+#define L2FWD_EVENT_UPDT_MAC	0x10
+
 static void
 print_ethaddr(const char *name, const struct rte_ether_addr *eth_addr)
 {
@@ -211,10 +217,272 @@  eventdev_capability_setup(void)
 		eventdev_set_internal_port_ops(&eventdev_rsrc->ops);
 }
 
+static __rte_noinline int
+get_free_event_port(struct eventdev_resources *eventdev_rsrc)
+{
+	static int index;
+	int port_id;
+
+	rte_spinlock_lock(&eventdev_rsrc->evp.lock);
+	if (index >= eventdev_rsrc->evp.nb_ports) {
+		printf("No free event port is available\n");
+		rte_spinlock_unlock(&eventdev_rsrc->evp.lock);
+		return -1;
+	}
+
+	port_id = eventdev_rsrc->evp.event_p_id[index];
+	index++;
+	rte_spinlock_unlock(&eventdev_rsrc->evp.lock);
+
+	return port_id;
+}
+
+static __rte_always_inline void
+l2fwd_event_updt_mac(struct rte_mbuf *m, const struct rte_ether_addr *dst_mac,
+		     uint8_t dst_port)
+{
+	struct rte_ether_hdr *eth;
+	void *tmp;
+
+	eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
+
+	/* 02:00:00:00:00:xx */
+	tmp = &eth->d_addr.addr_bytes[0];
+	*((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);
+
+	/* src addr */
+	rte_ether_addr_copy(dst_mac, &eth->s_addr);
+}
+
+static __rte_always_inline void
+l2fwd_event_loop_single(struct eventdev_resources *eventdev_rsrc,
+		      const uint32_t flags)
+{
+	const uint8_t is_master = rte_get_master_lcore() == rte_lcore_id();
+	const uint64_t timer_period = eventdev_rsrc->timer_period;
+	uint64_t prev_tsc = 0, diff_tsc, cur_tsc, timer_tsc = 0;
+	const int port_id = get_free_event_port(eventdev_rsrc);
+	const uint8_t tx_q_id = eventdev_rsrc->evq.event_q_id[
+					eventdev_rsrc->evq.nb_queues - 1];
+	const uint8_t event_d_id = eventdev_rsrc->event_d_id;
+	volatile bool *done = eventdev_rsrc->done;
+	struct rte_mbuf *mbuf;
+	uint16_t dst_port;
+	struct rte_event ev;
+
+	if (port_id < 0)
+		return;
+
+	printf("%s(): entering eventdev main loop on lcore %u\n", __func__,
+		rte_lcore_id());
+
+	while (!*done) {
+		/* if timer is enabled */
+		if (is_master && timer_period > 0) {
+			cur_tsc = rte_rdtsc();
+			diff_tsc = cur_tsc - prev_tsc;
+
+			/* advance the timer */
+			timer_tsc += diff_tsc;
+
+			/* if timer has reached its timeout */
+			if (unlikely(timer_tsc >= timer_period)) {
+				print_stats();
+				/* reset the timer */
+				timer_tsc = 0;
+			}
+			prev_tsc = cur_tsc;
+		}
+
+		/* Read packet from eventdev */
+		if (!rte_event_dequeue_burst(event_d_id, port_id, &ev, 1, 0))
+			continue;
+
+
+		mbuf = ev.mbuf;
+		dst_port = eventdev_rsrc->dst_ports[mbuf->port];
+		rte_prefetch0(rte_pktmbuf_mtod(mbuf, void *));
+
+		if (timer_period > 0)
+			__atomic_fetch_add(&eventdev_rsrc->stats[mbuf->port].rx,
+					   1, __ATOMIC_RELAXED);
+
+		mbuf->port = dst_port;
+		if (flags & L2FWD_EVENT_UPDT_MAC)
+			l2fwd_event_updt_mac(mbuf,
+				&eventdev_rsrc->ports_eth_addr[dst_port],
+				dst_port);
+
+		if (flags & L2FWD_EVENT_TX_ENQ) {
+			ev.queue_id = tx_q_id;
+			ev.op = RTE_EVENT_OP_FORWARD;
+			while (!rte_event_enqueue_burst(event_d_id, port_id,
+							&ev, 1) && !*done)
+				;
+		}
+
+		if (flags & L2FWD_EVENT_TX_DIRECT) {
+			rte_event_eth_tx_adapter_txq_set(mbuf, 0);
+			while (!rte_event_eth_tx_adapter_enqueue(event_d_id,
+								port_id,
+								&ev, 1) &&
+					!*done)
+				;
+		}
+
+		if (timer_period > 0)
+			__atomic_fetch_add(&eventdev_rsrc->stats[mbuf->port].tx,
+					   1, __ATOMIC_RELAXED);
+	}
+}
+
+static __rte_always_inline void
+l2fwd_event_loop_burst(struct eventdev_resources *eventdev_rsrc,
+		       const uint32_t flags)
+{
+	const uint8_t is_master = rte_get_master_lcore() == rte_lcore_id();
+	const uint64_t timer_period = eventdev_rsrc->timer_period;
+	uint64_t prev_tsc = 0, diff_tsc, cur_tsc, timer_tsc = 0;
+	const int port_id = get_free_event_port(eventdev_rsrc);
+	const uint8_t tx_q_id = eventdev_rsrc->evq.event_q_id[
+					eventdev_rsrc->evq.nb_queues - 1];
+	const uint8_t event_d_id = eventdev_rsrc->event_d_id;
+	const uint8_t deq_len = eventdev_rsrc->deq_depth;
+	volatile bool *done = eventdev_rsrc->done;
+	struct rte_event ev[MAX_PKT_BURST];
+	struct rte_mbuf *mbuf;
+	uint16_t nb_rx, nb_tx;
+	uint16_t dst_port;
+	uint8_t i;
+
+	if (port_id < 0)
+		return;
+
+	printf("%s(): entering eventdev main loop on lcore %u\n", __func__,
+		rte_lcore_id());
+
+	while (!*done) {
+		/* if timer is enabled */
+		if (is_master && timer_period > 0) {
+			cur_tsc = rte_rdtsc();
+			diff_tsc = cur_tsc - prev_tsc;
+
+			/* advance the timer */
+			timer_tsc += diff_tsc;
+
+			/* if timer has reached its timeout */
+			if (unlikely(timer_tsc >= timer_period)) {
+				print_stats();
+				/* reset the timer */
+				timer_tsc = 0;
+			}
+			prev_tsc = cur_tsc;
+		}
+
+		/* Read packet from eventdev */
+		nb_rx = rte_event_dequeue_burst(event_d_id, port_id, ev,
+						deq_len, 0);
+		if (nb_rx == 0)
+			continue;
+
+
+		for (i = 0; i < nb_rx; i++) {
+			mbuf = ev[i].mbuf;
+			dst_port = eventdev_rsrc->dst_ports[mbuf->port];
+			rte_prefetch0(rte_pktmbuf_mtod(mbuf, void *));
+
+			if (timer_period > 0) {
+				__atomic_fetch_add(
+					&eventdev_rsrc->stats[mbuf->port].rx,
+					1, __ATOMIC_RELAXED);
+				__atomic_fetch_add(
+					&eventdev_rsrc->stats[mbuf->port].tx,
+					1, __ATOMIC_RELAXED);
+			}
+			mbuf->port = dst_port;
+			if (flags & L2FWD_EVENT_UPDT_MAC)
+				l2fwd_event_updt_mac(mbuf,
+						&eventdev_rsrc->ports_eth_addr[
+								dst_port],
+						dst_port);
+
+			if (flags & L2FWD_EVENT_TX_ENQ) {
+				ev[i].queue_id = tx_q_id;
+				ev[i].op = RTE_EVENT_OP_FORWARD;
+			}
+
+			if (flags & L2FWD_EVENT_TX_DIRECT)
+				rte_event_eth_tx_adapter_txq_set(mbuf, 0);
+
+		}
+
+		if (flags & L2FWD_EVENT_TX_ENQ) {
+			nb_tx = rte_event_enqueue_burst(event_d_id, port_id,
+							ev, nb_rx);
+			while (nb_tx < nb_rx && !*done)
+				nb_tx += rte_event_enqueue_burst(event_d_id,
+						port_id, ev + nb_tx,
+						nb_rx - nb_tx);
+		}
+
+		if (flags & L2FWD_EVENT_TX_DIRECT) {
+			nb_tx = rte_event_eth_tx_adapter_enqueue(event_d_id,
+								 port_id, ev,
+								 nb_rx);
+			while (nb_tx < nb_rx && !*done)
+				nb_tx += rte_event_eth_tx_adapter_enqueue(
+						event_d_id, port_id,
+						ev + nb_tx, nb_rx - nb_tx);
+		}
+	}
+}
+
+static __rte_always_inline void
+l2fwd_event_loop(struct eventdev_resources *eventdev_rsrc,
+			const uint32_t flags)
+{
+	if (flags & L2FWD_EVENT_SINGLE)
+		l2fwd_event_loop_single(eventdev_rsrc, flags);
+	if (flags & L2FWD_EVENT_BURST)
+		l2fwd_event_loop_burst(eventdev_rsrc, flags);
+}
+
+#define L2FWD_EVENT_MODE						\
+FP(tx_d,	0, 0, 0, L2FWD_EVENT_TX_DIRECT | L2FWD_EVENT_SINGLE)	\
+FP(tx_d_burst,	0, 0, 1, L2FWD_EVENT_TX_DIRECT | L2FWD_EVENT_BURST)	\
+FP(tx_q,	0, 1, 0, L2FWD_EVENT_TX_ENQ | L2FWD_EVENT_SINGLE)	\
+FP(tx_q_burst,	0, 1, 1, L2FWD_EVENT_TX_ENQ | L2FWD_EVENT_BURST)	\
+FP(tx_d_mac,	1, 0, 0, L2FWD_EVENT_UPDT_MAC | L2FWD_EVENT_TX_DIRECT | \
+			 L2FWD_EVENT_SINGLE)				\
+FP(tx_d_brst_mac, 1, 0, 1, L2FWD_EVENT_UPDT_MAC | L2FWD_EVENT_TX_DIRECT | \
+				L2FWD_EVENT_BURST)			\
+FP(tx_q_mac,	  1, 1, 0, L2FWD_EVENT_UPDT_MAC | L2FWD_EVENT_TX_ENQ |	\
+				L2FWD_EVENT_SINGLE)			\
+FP(tx_q_brst_mac, 1, 1, 1, L2FWD_EVENT_UPDT_MAC | L2FWD_EVENT_TX_ENQ |	\
+				L2FWD_EVENT_BURST)
+
+
+#define FP(_name, _f3, _f2, _f1, flags)					\
+static void __rte_noinline						\
+l2fwd_event_main_loop_ ## _name(void)					\
+{									\
+	struct eventdev_resources *eventdev_rsrc = get_eventdev_rsrc();	\
+	l2fwd_event_loop(eventdev_rsrc, flags);				\
+}
+
+L2FWD_EVENT_MODE
+#undef FP
+
 void
 eventdev_resource_setup(void)
 {
 	struct eventdev_resources *eventdev_rsrc = get_eventdev_rsrc();
+	/* [MAC_UPDT][TX_MODE][BURST] */
+	const event_loop_cb event_loop[2][2][2] = {
+#define FP(_name, _f3, _f2, _f1, flags) \
+		[_f3][_f2][_f1] = l2fwd_event_main_loop_ ## _name,
+		L2FWD_EVENT_MODE
+#undef FP
+	};
 	uint16_t ethdev_count = rte_eth_dev_count_avail();
 	uint32_t event_queue_cfg = 0;
 	uint32_t service_id;
@@ -260,4 +528,9 @@  eventdev_resource_setup(void)
 	ret = rte_event_dev_start(eventdev_rsrc->event_d_id);
 	if (ret < 0)
 		rte_exit(EXIT_FAILURE, "Error in starting eventdev");
+
+	eventdev_rsrc->ops.l2fwd_event_loop = event_loop
+					[eventdev_rsrc->mac_updt]
+					[eventdev_rsrc->tx_mode_q]
+					[eventdev_rsrc->has_burst];
 }
diff --git a/examples/l2fwd-event/main.c b/examples/l2fwd-event/main.c
index 60882da52..43f0b114c 100644
--- a/examples/l2fwd-event/main.c
+++ b/examples/l2fwd-event/main.c
@@ -271,8 +271,12 @@  static void l2fwd_main_loop(void)
 static int
 l2fwd_launch_one_lcore(void *args)
 {
-	RTE_SET_USED(args);
-	l2fwd_main_loop();
+	struct eventdev_resources *eventdev_rsrc = args;
+
+	if (eventdev_rsrc->enabled)
+		eventdev_rsrc->ops.l2fwd_event_loop();
+	else
+		l2fwd_main_loop();
 
 	return 0;
 }
@@ -773,7 +777,7 @@  main(int argc, char **argv)
 
 	ret = 0;
 	/* launch per-lcore init on every lcore */
-	rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, NULL,
+	rte_eal_mp_remote_launch(l2fwd_launch_one_lcore, eventdev_rsrc,
 				 CALL_MASTER);
 	rte_eal_mp_wait_lcore();