[v2] app/testpmd: flowgen support ip and udp fields

Message ID 20210809065200.31134-1-wangzhihong.wzh@bytedance.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series [v2] app/testpmd: flowgen support ip and udp fields

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot-build success github build: passed
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-x86_64-unit-testing fail Testing issues
ci/iol-x86_64-compile-testing success Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS

Commit Message

王志宏 Aug. 9, 2021, 6:52 a.m. UTC
  This patch aims to:
 1. Add flexibility by supporting IP & UDP src/dst fields
 2. Improve multi-core performance by using per-core vars

v2: fix assigning ip header cksum

Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
---
 app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
 1 file changed, 86 insertions(+), 51 deletions(-)
  

Comments

Singh, Aman Deep Aug. 9, 2021, 12:21 p.m. UTC | #1
Hi Wang,

On 8/9/2021 12:22 PM, Zhihong Wang wrote:
> This patch aims to:
>   1. Add flexibility by supporting IP & UDP src/dst fields
>   2. Improve multi-core performance by using per-core vars
>
> v2: fix assigning ip header cksum
>
> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>   <snip>

From the definition of flowgen as per the documentation:

    flowgen: Multi-flow generation mode. Originates a number of flows
    (with varying destination IP addresses), and terminates receive traffic.

So changing the src IP address may not fit the definition of a source
generating real traffic.

I would like to check this part further.
  
Ferruh Yigit Aug. 9, 2021, 3:18 p.m. UTC | #2
On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> This patch aims to:
>  1. Add flexibility by supporting IP & UDP src/dst fields

What is the reason/"use case" of this flexibility?

>  2. Improve multi-core performance by using per-core vars>

On multi-core this also has a synchronization problem, OK to make it per-core. Do
you have any observed performance difference, and if so how much is it?

And can you please separate this into its own patch? It can come before the ip/udp update.

> v2: fix assigning ip header cksum
> 

+1 to the update, can you please make it a separate patch?

So overall this can be a patchset with 4 patches:
1- Fix retry logic (nb_rx -> nb_pkt)
2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
3- Use a per-core variable (for 'next_flow')
4- Support ip/udp src/dst variety of packets

> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
>  1 file changed, 86 insertions(+), 51 deletions(-)
> 

<...>

> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>  		}
>  		pkts_burst[nb_pkt] = pkt;
>  
> -		next_flow = (next_flow + 1) % cfg_n_flows;
> +		if (++next_udp_dst < cfg_n_udp_dst)
> +			continue;
> +		next_udp_dst = 0;
> +		if (++next_udp_src < cfg_n_udp_src)
> +			continue;
> +		next_udp_src = 0;
> +		if (++next_ip_dst < cfg_n_ip_dst)
> +			continue;
> +		next_ip_dst = 0;
> +		if (++next_ip_src < cfg_n_ip_src)
> +			continue;
> +		next_ip_src = 0;

What is the logic here? Can you please clarify the packet generation logic, both
in a comment here and in the commit log?

>  	}
>  
>  	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
>  	/*
>  	 * Retry if necessary
>  	 */
> -	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> +	if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
>  		retry = 0;
> -		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> +		while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
>  			rte_delay_us(burst_tx_delay_time);
>  			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> -					&pkts_burst[nb_tx], nb_rx - nb_tx);
> +					&pkts_burst[nb_tx], nb_pkt - nb_tx);
>  		}

+1 to this fix, thanks for it. But can you please make a separate patch for
this, with a proper 'Fixes:' tag etc..

>  	}
> -	fs->tx_packets += nb_tx;
>  
>  	inc_tx_burst_stats(fs, nb_tx);
> -	if (unlikely(nb_tx < nb_pkt)) {
> -		/* Back out the flow counter. */
> -		next_flow -= (nb_pkt - nb_tx);
> -		while (next_flow < 0)
> -			next_flow += cfg_n_flows;
> +	fs->tx_packets += nb_tx;
> +	/* Catch up flow idx by actual sent. */
> +	for (i = 0; i < nb_tx; ++i) {
> +		RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> +		if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> +			continue;
> +		RTE_PER_LCORE(_next_udp_dst) = 0;
> +		RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> +		if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> +			continue;
> +		RTE_PER_LCORE(_next_udp_src) = 0;
> +		RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> +		if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> +			continue;
> +		RTE_PER_LCORE(_next_ip_dst) = 0;
> +		RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> +		if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> +			continue;
> +		RTE_PER_LCORE(_next_ip_src) = 0;
> +	}

Why are per-core variables not used in the forward function, but local variables
(like 'next_ip_src' etc..) instead? Is it for performance, and if so what is the
impact?

And why not directly assign from the local variables to the per-core variables,
instead of having the catch-up loop above?
  
王志宏 Aug. 10, 2021, 7:30 a.m. UTC | #3
Hi Aman,

On Mon, Aug 9, 2021 at 8:21 PM Singh, Aman Deep
<aman.deep.singh@intel.com> wrote:
>
> Hi Wang,
>
> On 8/9/2021 12:22 PM, Zhihong Wang wrote:
>
> This patch aims to:
>  1. Add flexibility by supporting IP & UDP src/dst fields
>  2. Improve multi-core performance by using per-core vars
>
> v2: fix assigning ip header cksum
>
> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> ---
>  <snip>
>
> From the definition of flowgen as per the documentation:
>
> flowgen: Multi-flow generation mode. Originates a number of flows (with varying destination IP addresses), and terminates receive traffic.
>
> So changing the src IP address may not fit the definition of a source generating real traffic.
>
> I would like to check this part further.

Thanks for the review. The purpose of introducing them is to emulate
pkt generator behaviors.

Do you think keeping cfg_n_ip_src & cfg_n_udp_src while setting them =
1 by default makes sense? Or maybe update the documentation?
  
王志宏 Aug. 10, 2021, 7:57 a.m. UTC | #4
Thanks for the review Ferruh :)

On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> > This patch aims to:
> >  1. Add flexibility by supporting IP & UDP src/dst fields
>
> What is the reason/"use case" of this flexibility?

The purpose is to emulate pkt generator behaviors.

>
> >  2. Improve multi-core performance by using per-core vars>
>
> On multi-core this also has a synchronization problem, OK to make it per-core. Do
> you have any observed performance difference, and if so how much is it?

Huge difference, one example: 8-core flowgen -> rxonly results: 43
Mpps (per-core) vs. 9.3 Mpps (shared); of course the numbers vary
depending on system configuration.

>
> And can you please separate this into its own patch? It can come before the ip/udp update.

Will do.

>
> > v2: fix assigning ip header cksum
> >
>
> +1 to the update, can you please make it a separate patch?

Sure.

>
> So overall this can be a patchset with 4 patches:
> 1- Fix retry logic (nb_rx -> nb_pkt)
> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
> 3- Use a per-core variable (for 'next_flow')
> 4- Support ip/udp src/dst variety of packets
>

Great summary. Thanks a lot.

> > Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> > ---
> >  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
> >  1 file changed, 86 insertions(+), 51 deletions(-)
> >
>
> <...>
>
> > @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
> >               }
> >               pkts_burst[nb_pkt] = pkt;
> >
> > -             next_flow = (next_flow + 1) % cfg_n_flows;
> > +             if (++next_udp_dst < cfg_n_udp_dst)
> > +                     continue;
> > +             next_udp_dst = 0;
> > +             if (++next_udp_src < cfg_n_udp_src)
> > +                     continue;
> > +             next_udp_src = 0;
> > +             if (++next_ip_dst < cfg_n_ip_dst)
> > +                     continue;
> > +             next_ip_dst = 0;
> > +             if (++next_ip_src < cfg_n_ip_src)
> > +                     continue;
> > +             next_ip_src = 0;
>
> What is the logic here? Can you please clarify the packet generation logic, both
> in a comment here and in the commit log?

It's round-robin field by field. Will add the comments.
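The round-robin, field-by-field advance works like a mixed-radix "odometer": the innermost field increments on every packet and carries into the next field when it wraps. A minimal sketch, with illustrative field counts (not testpmd defaults) standing in for the cfg_n_* globals:

```c
/* Illustrative field counts standing in for the cfg_n_* globals. */
enum { N_UDP_DST = 3, N_UDP_SRC = 2, N_IP_DST = 2, N_IP_SRC = 2 };

struct flow_idx {
	unsigned udp_dst, udp_src, ip_dst, ip_src;
};

/* Advance one packet: increment the innermost field; on wrap, reset it
 * and carry into the next field, mirroring the chain of
 * "if (++next_x < cfg_n_x) continue;" statements in the patch. */
static void flow_idx_advance(struct flow_idx *f)
{
	if (++f->udp_dst < N_UDP_DST)
		return;
	f->udp_dst = 0;
	if (++f->udp_src < N_UDP_SRC)
		return;
	f->udp_src = 0;
	if (++f->ip_dst < N_IP_DST)
		return;
	f->ip_dst = 0;
	if (++f->ip_src < N_IP_SRC)
		return;
	f->ip_src = 0;
}
```

With these counts the full cycle is 3 * 2 * 2 * 2 = 24 packets, after which all indexes return to zero.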

>
> >       }
> >
> >       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
> >       /*
> >        * Retry if necessary
> >        */
> > -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> > +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
> >               retry = 0;
> > -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> >                       rte_delay_us(burst_tx_delay_time);
> >                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
> > +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
> >               }
>
> +1 to this fix, thanks for it. But can you please make a separate patch for
> this, with a proper 'Fixes:' tag etc..

Ok.

>
> >       }
> > -     fs->tx_packets += nb_tx;
> >
> >       inc_tx_burst_stats(fs, nb_tx);
> > -     if (unlikely(nb_tx < nb_pkt)) {
> > -             /* Back out the flow counter. */
> > -             next_flow -= (nb_pkt - nb_tx);
> > -             while (next_flow < 0)
> > -                     next_flow += cfg_n_flows;
> > +     fs->tx_packets += nb_tx;
> > +     /* Catch up flow idx by actual sent. */
> > +     for (i = 0; i < nb_tx; ++i) {
> > +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> > +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> > +                     continue;
> > +             RTE_PER_LCORE(_next_udp_dst) = 0;
> > +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> > +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> > +                     continue;
> > +             RTE_PER_LCORE(_next_udp_src) = 0;
> > +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> > +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> > +                     continue;
> > +             RTE_PER_LCORE(_next_ip_dst) = 0;
> > +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> > +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> > +                     continue;
> > +             RTE_PER_LCORE(_next_ip_src) = 0;
> > +     }
>
> Why are per-core variables not used in the forward function, but local variables
> (like 'next_ip_src' etc..) instead? Is it for performance, and if so what is the
> impact?
>
> And why not directly assign from the local variables to the per-core variables,
> instead of having the catch-up loop above?
>
>

Local vars are for generating pkts; the global ones catch up at the end, once
nb_tx is known.
So the flow indexes only increase by the actual sent pkt number.
It serves the same purpose as the original "/* backout the flow counter */".
My math isn't good enough to make it look more intelligent though.
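For what it's worth, the catch-up could be made O(1) instead of a per-sent-packet loop by treating the four indexes as digits of one mixed-radix number and adding nb_tx to it. A hypothetical sketch (illustrative field counts, not the cfg_n_* defaults):

```c
/* Illustrative field counts standing in for the cfg_n_* globals. */
enum { N_UDP_DST = 3, N_UDP_SRC = 2, N_IP_DST = 2, N_IP_SRC = 2 };

struct flow_idx {
	unsigned udp_dst, udp_src, ip_dst, ip_src;
};

/* Advance the counter by n packets at once: flatten the digits to a
 * single linear index, add n modulo the full cycle, and split back.
 * Equivalent to running the per-packet carry chain n times. */
static void flow_idx_advance_n(struct flow_idx *f, unsigned n)
{
	unsigned linear = ((f->ip_src * N_IP_DST + f->ip_dst) * N_UDP_SRC
			   + f->udp_src) * N_UDP_DST + f->udp_dst;

	linear = (linear + n) %
		 (N_UDP_DST * N_UDP_SRC * N_IP_DST * N_IP_SRC);
	f->udp_dst = linear % N_UDP_DST;
	linear /= N_UDP_DST;
	f->udp_src = linear % N_UDP_SRC;
	linear /= N_UDP_SRC;
	f->ip_dst = linear % N_IP_DST;
	f->ip_src = linear / N_IP_DST;
}
```

Each call is equivalent to running the per-packet increment n times, so advancing by the actual sent count needs no loop.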
  
Ferruh Yigit Aug. 10, 2021, 9:12 a.m. UTC | #5
On 8/10/2021 8:57 AM, 王志宏 wrote:
> Thanks for the review Ferruh :)
> 
> On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
>>> This patch aims to:
>>>  1. Add flexibility by supporting IP & UDP src/dst fields
>>
>> What is the reason/"use case" of this flexibility?
> 
> The purpose is to emulate pkt generator behaviors.
> 

'flowgen' forwarding already emulates a pkt generator, but it was only
changing the destination IP.

What additional benefit does changing the udp ports of the packets bring? What is
your usecase for this change?

>>
>>>  2. Improve multi-core performance by using per-core vars>
>>
>> On multi-core this also has a synchronization problem, OK to make it per-core. Do
>> you have any observed performance difference, and if so how much is it?
> 
> Huge difference, one example: 8-core flowgen -> rxonly results: 43
> Mpps (per-core) vs. 9.3 Mpps (shared); of course the numbers vary
> depending on system configuration.
> 

Thanks for clarification.

>>
>> And can you please separate this into its own patch? It can come before the ip/udp update.
> 
> Will do.
> 
>>
>>> v2: fix assigning ip header cksum
>>>
>>
>> +1 to the update, can you please make it a separate patch?
> 
> Sure.
> 
>>
>> So overall this can be a patchset with 4 patches:
>> 1- Fix retry logic (nb_rx -> nb_pkt)
>> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
>> 3- Use a per-core variable (for 'next_flow')
>> 4- Support ip/udp src/dst variety of packets
>>
> 
> Great summary. Thanks a lot.
> 
>>> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
>>> ---
>>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
>>>  1 file changed, 86 insertions(+), 51 deletions(-)
>>>
>>
>> <...>
>>
>>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>>>               }
>>>               pkts_burst[nb_pkt] = pkt;
>>>
>>> -             next_flow = (next_flow + 1) % cfg_n_flows;
>>> +             if (++next_udp_dst < cfg_n_udp_dst)
>>> +                     continue;
>>> +             next_udp_dst = 0;
>>> +             if (++next_udp_src < cfg_n_udp_src)
>>> +                     continue;
>>> +             next_udp_src = 0;
>>> +             if (++next_ip_dst < cfg_n_ip_dst)
>>> +                     continue;
>>> +             next_ip_dst = 0;
>>> +             if (++next_ip_src < cfg_n_ip_src)
>>> +                     continue;
>>> +             next_ip_src = 0;
>>
>> What is the logic here? Can you please clarify the packet generation logic, both
>> in a comment here and in the commit log?
> 
> It's round-robin field by field. Will add the comments.
> 

Thanks. If the receiving end is doing RSS based on IP address, the dst address will
change every 100 packets and the src will change every 10000 packets. This is
a slight behavior change.

When it was only the dst ip, it was simple to just increment it; not sure about it
in this case. I wonder if we should set all fields randomly for each packet. I don't
know what the better logic is here, we can discuss it more in the next version.
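The periods behind this observation follow from the nesting order: a field changes once per full cycle of every field nested inside it. A small sketch, with purely illustrative counts chosen to reproduce 100 / 10000 packet periods (not testpmd's actual cfg_n_* values):

```c
/* Radices of the round-robin fields, innermost first:
 * udp_dst, udp_src, ip_dst, ip_src. Counts are illustrative only. */
static const unsigned radix[4] = { 10, 10, 100, 100 };

/* Period (in packets) with which field i changes: the product of the
 * radices of all fields nested inside it (the inner fields must wrap
 * once before field i advances by one). */
static unsigned change_period(unsigned i)
{
	unsigned p = 1, j;

	for (j = 0; j < i; j++)
		p *= radix[j];
	return p;
}
```

With these counts the ip dst changes every 10 * 10 = 100 packets and the ip src every 10 * 10 * 100 = 10000 packets, which is the RSS-visible behavior being discussed.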

>>
>>>       }
>>>
>>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
>>>       /*
>>>        * Retry if necessary
>>>        */
>>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
>>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
>>>               retry = 0;
>>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
>>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
>>>                       rte_delay_us(burst_tx_delay_time);
>>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
>>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
>>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
>>>               }
>>
>> +1 to this fix, thanks for it. But can you please make a separate patch for
>> this, with a proper 'Fixes:' tag etc..
> 
> Ok.
> 
>>
>>>       }
>>> -     fs->tx_packets += nb_tx;
>>>
>>>       inc_tx_burst_stats(fs, nb_tx);
>>> -     if (unlikely(nb_tx < nb_pkt)) {
>>> -             /* Back out the flow counter. */
>>> -             next_flow -= (nb_pkt - nb_tx);
>>> -             while (next_flow < 0)
>>> -                     next_flow += cfg_n_flows;
>>> +     fs->tx_packets += nb_tx;
>>> +     /* Catch up flow idx by actual sent. */
>>> +     for (i = 0; i < nb_tx; ++i) {
>>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
>>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
>>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
>>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_udp_src) = 0;
>>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
>>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
>>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
>>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
>>> +                     continue;
>>> +             RTE_PER_LCORE(_next_ip_src) = 0;
>>> +     }
>>
>> Why are per-core variables not used in the forward function, but local variables
>> (like 'next_ip_src' etc..) instead? Is it for performance, and if so what is the
>> impact?
>>
>> And why not directly assign from the local variables to the per-core variables,
>> instead of having the catch-up loop above?
>>
>>
> 
> Local vars are for generating pkts; the global ones catch up at the end, once
> nb_tx is known.

Why are you not using the global ones to generate packets? That would remove the need
for the catch-up?

> So the flow indexes only increase by the actual sent pkt number.
> It serves the same purpose as the original "/* backout the flow counter */".
> My math isn't good enough to make it look more intelligent though.
> 

Maybe I am missing something; for this case why not just assign back from the locals
to the globals?
  
王志宏 Aug. 11, 2021, 2:48 a.m. UTC | #6
On Tue, Aug 10, 2021 at 5:12 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 8/10/2021 8:57 AM, 王志宏 wrote:
> > Thanks for the review Ferruh :)
> >
> > On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >>
> >> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> >>> This patch aims to:
> >>>  1. Add flexibility by supporting IP & UDP src/dst fields
> >>
> >> What is the reason/"use case" of this flexibility?
> >
> > The purpose is to emulate pkt generator behaviors.
> >
>
> 'flowgen' forwarding already emulates a pkt generator, but it was only
> changing the destination IP.
>
> What additional benefit does changing the udp ports of the packets bring? What is
> your usecase for this change?

Pkt generators like pktgen/trex/ixia/spirent can change various fields
including ip/udp src/dst.

Keeping the cfg_n_* while setting cfg_n_ip_dst = 1024 and the others = 1
keeps the default behavior exactly unchanged. Do you think it makes
sense?

>
> >>
> >>>  2. Improve multi-core performance by using per-core vars>
> >>
> >> On multi-core this also has a synchronization problem, OK to make it per-core. Do
> >> you have any observed performance difference, and if so how much is it?
> >
> > Huge difference, one example: 8-core flowgen -> rxonly results: 43
> > Mpps (per-core) vs. 9.3 Mpps (shared); of course the numbers vary
> > depending on system configuration.
> >
>
> Thanks for clarification.
>
> >>
> >> And can you please separate this into its own patch? It can come before the ip/udp update.
> >
> > Will do.
> >
> >>
> >>> v2: fix assigning ip header cksum
> >>>
> >>
> >> +1 to the update, can you please make it a separate patch?
> >
> > Sure.
> >
> >>
> >> So overall this can be a patchset with 4 patches:
> >> 1- Fix retry logic (nb_rx -> nb_pkt)
> >> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
> >> 3- Use a per-core variable (for 'next_flow')
> >> 4- Support ip/udp src/dst variety of packets
> >>
> >
> > Great summary. Thanks a lot.
> >
> >>> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> >>> ---
> >>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
> >>>  1 file changed, 86 insertions(+), 51 deletions(-)
> >>>
> >>
> >> <...>
> >>
> >>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
> >>>               }
> >>>               pkts_burst[nb_pkt] = pkt;
> >>>
> >>> -             next_flow = (next_flow + 1) % cfg_n_flows;
> >>> +             if (++next_udp_dst < cfg_n_udp_dst)
> >>> +                     continue;
> >>> +             next_udp_dst = 0;
> >>> +             if (++next_udp_src < cfg_n_udp_src)
> >>> +                     continue;
> >>> +             next_udp_src = 0;
> >>> +             if (++next_ip_dst < cfg_n_ip_dst)
> >>> +                     continue;
> >>> +             next_ip_dst = 0;
> >>> +             if (++next_ip_src < cfg_n_ip_src)
> >>> +                     continue;
> >>> +             next_ip_src = 0;
> >>
> >> What is the logic here? Can you please clarify the packet generation logic, both
> >> in a comment here and in the commit log?
> >
> > It's round-robin field by field. Will add the comments.
> >
>
> Thanks. If the receiving end is doing RSS based on IP address, the dst address will
> change every 100 packets and the src will change every 10000 packets. This is
> a slight behavior change.
>
> When it was only the dst ip, it was simple to just increment it; not sure about it
> in this case. I wonder if we should set all fields randomly for each packet. I don't
> know what the better logic is here, we can discuss it more in the next version.

A more sophisticated pkt generator provides various options among
"step-by-step" / "random" / etc.

But supporting multiple fields naturally brings this implicitly. It
won't be a problem as it can be configured by setting the cfg_n_* as
we discussed above.

I think rte_rand() is a good option, anyway this can be tweaked easily
once the framework becomes shaped.

>
> >>
> >>>       }
> >>>
> >>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
> >>>       /*
> >>>        * Retry if necessary
> >>>        */
> >>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> >>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
> >>>               retry = 0;
> >>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> >>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> >>>                       rte_delay_us(burst_tx_delay_time);
> >>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> >>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
> >>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
> >>>               }
> >>
> >> +1 to this fix, thanks for it. But can you please make a separate patch for
> >> this, with a proper 'Fixes:' tag etc..
> >
> > Ok.
> >
> >>
> >>>       }
> >>> -     fs->tx_packets += nb_tx;
> >>>
> >>>       inc_tx_burst_stats(fs, nb_tx);
> >>> -     if (unlikely(nb_tx < nb_pkt)) {
> >>> -             /* Back out the flow counter. */
> >>> -             next_flow -= (nb_pkt - nb_tx);
> >>> -             while (next_flow < 0)
> >>> -                     next_flow += cfg_n_flows;
> >>> +     fs->tx_packets += nb_tx;
> >>> +     /* Catch up flow idx by actual sent. */
> >>> +     for (i = 0; i < nb_tx; ++i) {
> >>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> >>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
> >>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> >>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_udp_src) = 0;
> >>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> >>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
> >>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> >>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_ip_src) = 0;
> >>> +     }
> >>
> >> Why are per-core variables not used in the forward function, but local variables
> >> (like 'next_ip_src' etc..) instead? Is it for performance, and if so what is the
> >> impact?
> >>
> >> And why not directly assign from the local variables to the per-core variables,
> >> instead of having the catch-up loop above?
> >>
> >>
> >
> > Local vars are for generating pkts; the global ones catch up at the end, once
> > nb_tx is known.
>
> Why are you not using the global ones to generate packets? That would remove the need
> for the catch-up?

When there are multiple fields, backing out the overrun index caused by
dropped packets is not that straightforward -- it's the "carry" issue
in addition.

>
> > So the flow indexes only increase by the actual sent pkt number.
> > It serves the same purpose as the original "/* backout the flow counter */".
> > My math isn't good enough to make it look more intelligent though.
> >
>
> Maybe I am missing something; for this case why not just assign back from the locals
> to the globals?

As above.

However, this can be simplified if we discard the "back out"
mechanism: generate 32 pkts and send 20 of them while the remaining 12 are
dropped; the difference is whether the idx is going to start from 21 or 33
next time.
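The two options can be stated numerically: with the catch-up loop the persistent index advances only by the packets actually sent, while dropping the back-out would let it run ahead by the packets that were built but never left. A sketch with illustrative numbers (a single linear flow index and an arbitrary cycle length):

```c
/* Next flow index after a burst, for an illustrative cycle of
 * 'cycle' flows, given how many packets were generated ('built')
 * and how many actually left the port ('sent'). */

/* Catch-up semantics (the patch): advance only by what was sent,
 * so the dropped flow values are regenerated in the next burst. */
static unsigned next_idx_catchup(unsigned start, unsigned sent, unsigned cycle)
{
	return (start + sent) % cycle;
}

/* No back-out: keep the index the generation loop reached, so the
 * dropped flow values are simply skipped. */
static unsigned next_idx_no_backout(unsigned start, unsigned built, unsigned cycle)
{
	return (start + built) % cycle;
}
```

For the 32-built / 20-sent example above, catch-up resumes with the 21st flow value (0-based index 20) and no-back-out with the 33rd (index 32).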
  
Ferruh Yigit Aug. 11, 2021, 10:31 a.m. UTC | #7
On 8/11/2021 3:48 AM, 王志宏 wrote:
> On Tue, Aug 10, 2021 at 5:12 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> On 8/10/2021 8:57 AM, 王志宏 wrote:
>>> Thanks for the review Ferruh :)
>>>
>>> On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>>>
>>>> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
>>>>> This patch aims to:
>>>>>  1. Add flexibility by supporting IP & UDP src/dst fields
>>>>
>>>> What is the reason/"use case" of this flexibility?
>>>
>>> The purpose is to emulate pkt generator behaviors.
>>>
>>
>> 'flowgen' forwarding is already to emulate pkt generator, but it was only
>> changing destination IP.
>>
>> What additional benefit does changing udp ports of the packets brings? What is
>> your usecase for this change?
> 
> Pkt generators like pktgen/trex/ixia/spirent can change various fields
> including ip/udp src/dst.
> 

But testpmd is not a packet generator; it has a very simple 'flowgen' forwarding
engine, and I would like to understand the motivation to make it more complex.

> Keeping the cfg_n_* while setting cfg_n_ip_dst = 1024 and the others = 1
> keeps the default behavior exactly unchanged. Do you think it makes
> sense?
> 
>>
>>>>
>>>>>  2. Improve multi-core performance by using per-core vars>
>>>>
>>>> On multi-core this also has a synchronization problem, OK to make it per-core. Do
>>>> you have any observed performance difference, and if so how much is it?
>>>
>>> Huge difference, one example: 8-core flowgen -> rxonly results: 43
>>> Mpps (per-core) vs. 9.3 Mpps (shared); of course the numbers vary
>>> depending on system configuration.
>>>
>>
>> Thanks for clarification.
>>
>>>>
>>>> And can you please separate this into its own patch? It can come before the ip/udp update.
>>>
>>> Will do.
>>>
>>>>
>>>>> v2: fix assigning ip header cksum
>>>>>
>>>>
>>>> +1 to the update, can you please make it a separate patch?
>>>
>>> Sure.
>>>
>>>>
>>>> So overall this can be a patchset with 4 patches:
>>>> 1- Fix retry logic (nb_rx -> nb_pkt)
>>>> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
>>>> 3- Use a per-core variable (for 'next_flow')
>>>> 4- Support ip/udp src/dst variety of packets
>>>>
>>>
>>> Great summary. Thanks a lot.
>>>
>>>>> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
>>>>> ---
>>>>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
>>>>>  1 file changed, 86 insertions(+), 51 deletions(-)
>>>>>
>>>>
>>>> <...>
>>>>
>>>>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>>>>>               }
>>>>>               pkts_burst[nb_pkt] = pkt;
>>>>>
>>>>> -             next_flow = (next_flow + 1) % cfg_n_flows;
>>>>> +             if (++next_udp_dst < cfg_n_udp_dst)
>>>>> +                     continue;
>>>>> +             next_udp_dst = 0;
>>>>> +             if (++next_udp_src < cfg_n_udp_src)
>>>>> +                     continue;
>>>>> +             next_udp_src = 0;
>>>>> +             if (++next_ip_dst < cfg_n_ip_dst)
>>>>> +                     continue;
>>>>> +             next_ip_dst = 0;
>>>>> +             if (++next_ip_src < cfg_n_ip_src)
>>>>> +                     continue;
>>>>> +             next_ip_src = 0;
>>>>
>>>> What is the logic here? Can you please clarify the packet generation logic, both
>>>> in a comment here and in the commit log?
>>>
>>> It's round-robin field by field. Will add the comments.
>>>
>>
>> Thanks. If the receiving end is doing RSS based on IP address, the dst address will
>> change every 100 packets and the src will change every 10000 packets. This is
>> a slight behavior change.
>>
>> When it was only the dst ip, it was simple to just increment it; not sure about it
>> in this case. I wonder if we should set all fields randomly for each packet. I don't
>> know what the better logic is here, we can discuss it more in the next version.
> 
> A more sophisticated pkt generator provides various options among
> "step-by-step" / "random" / etc.
> 
> But supporting multiple fields naturally brings this implicitly. It
> won't be a problem as it can be configured by setting the cfg_n_* as
> we discussed above.
> 
> I think rte_rand() is a good option, anyway this can be tweaked easily
> once the framework becomes shaped.
> 

Can be done, but do we really want to add more packet generator capability to
testpmd?

>>
>>>>
>>>>>       }
>>>>>
>>>>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
>>>>>       /*
>>>>>        * Retry if necessary
>>>>>        */
>>>>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
>>>>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
>>>>>               retry = 0;
>>>>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
>>>>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
>>>>>                       rte_delay_us(burst_tx_delay_time);
>>>>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
>>>>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
>>>>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
>>>>>               }
>>>>
>>>> +1 to this fix, thanks for it. But can you please make a separate patch for
>>>> this, with a proper 'Fixes:' tag etc..
>>>
>>> Ok.
>>>
>>>>
>>>>>       }
>>>>> -     fs->tx_packets += nb_tx;
>>>>>
>>>>>       inc_tx_burst_stats(fs, nb_tx);
>>>>> -     if (unlikely(nb_tx < nb_pkt)) {
>>>>> -             /* Back out the flow counter. */
>>>>> -             next_flow -= (nb_pkt - nb_tx);
>>>>> -             while (next_flow < 0)
>>>>> -                     next_flow += cfg_n_flows;
>>>>> +     fs->tx_packets += nb_tx;
>>>>> +     /* Catch up flow idx by actual sent. */
>>>>> +     for (i = 0; i < nb_tx; ++i) {
>>>>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
>>>>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
>>>>> +                     continue;
>>>>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
>>>>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
>>>>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
>>>>> +                     continue;
>>>>> +             RTE_PER_LCORE(_next_udp_src) = 0;
>>>>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
>>>>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
>>>>> +                     continue;
>>>>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
>>>>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
>>>>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
>>>>> +                     continue;
>>>>> +             RTE_PER_LCORE(_next_ip_src) = 0;
>>>>> +     }
>>>>
>>>> Why are per-core variables not used in the forward function, with local
>>>> variables (like 'next_ip_src' etc..) used instead? Is it for performance, and
>>>> if so what is the impact?
>>>>
>>>> And why not assign directly from local variables to per-core variables,
>>>> instead of having the above catch-up loop?
>>>>
>>>>
>>>
>>> Local vars are for generating pkts; the global ones catch up at the end once
>>> nb_tx is known.
>>
>> Why are you not using the global ones to generate packets? That would remove
>> the need for the catch-up?
> 
> > When there are multiple fields, backing out the overrun index caused by
> > dropped packets is not that straightforward -- it's the "carry" issue
> > in addition.
> 
>>
>>> So flow indexes only increase by the actual sent pkt number.
>>> It serves the same purpose as the original "/* backout the flow counter */".
>>> My math isn't good enough to make it look more intelligent though.
>>>
>>
>> Maybe I am missing something, for this case why not just assign back from locals
>> to globals?
> 
> As above.
> 
> However, this can be simplified if we discard the "back out"
> mechanism: generate 32 pkts and send 20 of them while the remaining
> 12 are dropped; the difference is whether the idx starts from 21 or
> 33 next time.
> 

I am not sure about the point of "back out"; I think we can remove it if there is
no objection, so the receiving end can recognize failed packets.
  
王志宏 Aug. 12, 2021, 9:32 a.m. UTC | #8
On Wed, Aug 11, 2021 at 6:31 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 8/11/2021 3:48 AM, 王志宏 wrote:
> > On Tue, Aug 10, 2021 at 5:12 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >>
> >> On 8/10/2021 8:57 AM, 王志宏 wrote:
> >>> Thanks for the review Ferruh :)
> >>>
> >>> On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >>>>
> >>>> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> >>>>> This patch aims to:
> >>>>>  1. Add flexibility by supporting IP & UDP src/dst fields
> >>>>
> >>>> What is the reason/"use case" of this flexibility?
> >>>
> >>> The purpose is to emulate pkt generator behaviors.
> >>>
> >>
> >> 'flowgen' forwarding is already meant to emulate a pkt generator, but it was
> >> only changing the destination IP.
> >>
> >> What additional benefit does changing the udp ports of the packets bring? What
> >> is your use case for this change?
> >
> > Pkt generators like pktgen/trex/ixia/spirent can change various fields
> > including ip/udp src/dst.
> >
>
> But testpmd is not a packet generator; it has a very simple 'flowgen' forwarding
> engine, and I would like to understand the motivation for making it more complex.

I agree with this *simplicity* point. In fact my sole intention is to make
flowgen usable for multi-core tests. I'll keep the original setup in
the next patch.

>
> > Keeping the cfg_n_* while setting cfg_n_ip_dst = 1024 and others = 1
> > leaves the default behavior exactly unchanged. Do you think it makes
> > sense?
> >
> >>
> >>>>
> >>>>>  2. Improve multi-core performance by using per-core vars>
> >>>>
> >>>> On multi-core this also has a synchronization problem, so OK to make it
> >>>> per-core. Do you have any observed performance difference, and if so how
> >>>> much is it?
> >>>
> >>> Huge difference, one example: 8 core flowgen -> rxonly results: 43
> >>> Mpps (per-core) vs. 9.3 Mpps (shared), of course the numbers "varies
> >>> depending on system configuration".
> >>>
> >>
> >> Thanks for clarification.
> >>
> >>>>
> >>>> And can you please separate this to its own patch? This can be before ip/udp update.
> >>>
> >>> Will do.
> >>>
> >>>>
> >>>>> v2: fix assigning ip header cksum
> >>>>>
> >>>>
> >>>> +1 to the update, can you please make it a separate patch?
> >>>
> >>> Sure.
> >>>
> >>>>
> >>>> So overall this can be a patchset with 4 patches:
> >>>> 1- Fix retry logic (nb_rx -> nb_pkt)
> >>>> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
> >>>> 3- Use per-core variable (for 'next_flow')
> >>>> 4- Support ip/udp src/dst variety in packets
> >>>>
> >>>
> >>> Great summary. Thanks a lot.
> >>>
> >>>>> Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
> >>>>> ---
> >>>>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
> >>>>>  1 file changed, 86 insertions(+), 51 deletions(-)
> >>>>>
> >>>>
> >>>> <...>
> >>>>
> >>>>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
> >>>>>               }
> >>>>>               pkts_burst[nb_pkt] = pkt;
> >>>>>
> >>>>> -             next_flow = (next_flow + 1) % cfg_n_flows;
> >>>>> +             if (++next_udp_dst < cfg_n_udp_dst)
> >>>>> +                     continue;
> >>>>> +             next_udp_dst = 0;
> >>>>> +             if (++next_udp_src < cfg_n_udp_src)
> >>>>> +                     continue;
> >>>>> +             next_udp_src = 0;
> >>>>> +             if (++next_ip_dst < cfg_n_ip_dst)
> >>>>> +                     continue;
> >>>>> +             next_ip_dst = 0;
> >>>>> +             if (++next_ip_src < cfg_n_ip_src)
> >>>>> +                     continue;
> >>>>> +             next_ip_src = 0;
> >>>>
> >>>> What is the logic here? Can you please clarify the packet generation logic
> >>>> both in a comment here and in the commit log?
> >>>
> >>> It's round-robin field by field. Will add the comments.
> >>>
> >>
> >> Thanks. If the receiving end is doing RSS based on IP address, the dst address
> >> will change every 100 packets and the src every 10000 packets. This is
> >> a slight behavior change.
> >>
> >> When it was only dst ip, it was simple to just increment it, not sure about it
> >> in this case. I wonder if we should set all randomly for each packet. I don't
> >> know what is the better logic here, we can discuss it more in the next version.
> >
> > A more sophisticated pkt generator provides various options among
> > "step-by-step" / "random" / etc.
> >
> > But supporting multiple fields naturally brings this implicitly. It
> > won't be a problem as it can be configured by setting the cfg_n_* as
> > we discussed above.
> >
> > I think rte_rand() is a good option, anyway this can be tweaked easily
> > once the framework becomes shaped.
> >
>
> Can be done, but do we really want to add more packet generator capability to
> testpmd?
>
> >>
> >>>>
> >>>>>       }
> >>>>>
> >>>>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
> >>>>>       /*
> >>>>>        * Retry if necessary
> >>>>>        */
> >>>>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> >>>>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
> >>>>>               retry = 0;
> >>>>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> >>>>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> >>>>>                       rte_delay_us(burst_tx_delay_time);
> >>>>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> >>>>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
> >>>>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
> >>>>>               }
> >>>>
> >>>> +1 to this fix, thanks for it. But can you please make a separate patch for
> >>>> this, with a proper 'Fixes:' tag etc..
> >>>
> >>> Ok.
> >>>
> >>>>
> >>>>>       }
> >>>>> -     fs->tx_packets += nb_tx;
> >>>>>
> >>>>>       inc_tx_burst_stats(fs, nb_tx);
> >>>>> -     if (unlikely(nb_tx < nb_pkt)) {
> >>>>> -             /* Back out the flow counter. */
> >>>>> -             next_flow -= (nb_pkt - nb_tx);
> >>>>> -             while (next_flow < 0)
> >>>>> -                     next_flow += cfg_n_flows;
> >>>>> +     fs->tx_packets += nb_tx;
> >>>>> +     /* Catch up flow idx by actual sent. */
> >>>>> +     for (i = 0; i < nb_tx; ++i) {
> >>>>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
> >>>>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_udp_src) = 0;
> >>>>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
> >>>>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> >>>>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> >>>>> +                     continue;
> >>>>> +             RTE_PER_LCORE(_next_ip_src) = 0;
> >>>>> +     }
> >>>>
> >>>> Why are per-core variables not used in the forward function, with local
> >>>> variables (like 'next_ip_src' etc..) used instead? Is it for performance,
> >>>> and if so what is the impact?
> >>>>
> >>>> And why not assign directly from local variables to per-core variables,
> >>>> instead of having the above catch-up loop?
> >>>>
> >>>>
> >>>
> >>> Local vars are for generating pkts; the global ones catch up at the end once
> >>> nb_tx is known.
> >>
> >> Why are you not using the global ones to generate packets? That would remove
> >> the need for the catch-up?
> >
> > When there are multiple fields, backing out the overrun index caused by
> > dropped packets is not that straightforward -- it's the "carry" issue
> > in addition.
> >
> >>
> >>> So flow indexes only increase by the actual sent pkt number.
> >>> It serves the same purpose as the original "/* backout the flow counter */".
> >>> My math isn't good enough to make it look more intelligent though.
> >>>
> >>
> >> Maybe I am missing something, for this case why not just assign back from locals
> >> to globals?
> >
> > As above.
> >
> > However, this can be simplified if we discard the "back out"
> > mechanism: generate 32 pkts and send 20 of them while the remaining
> > 12 are dropped; the difference is whether the idx starts from 21 or
> > 33 next time.
> >
>
> I am not sure about the point of "back out"; I think we can remove it if there
> is no objection, so the receiving end can recognize failed packets.
>
  

Patch

diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 3bf6e1ce97..333c3b2cd2 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -40,41 +40,37 @@ 
 
 #include "testpmd.h"
 
-/* hardcoded configuration (for now) */
-static unsigned cfg_n_flows	= 1024;
-static uint32_t cfg_ip_src	= RTE_IPV4(10, 254, 0, 0);
-static uint32_t cfg_ip_dst	= RTE_IPV4(10, 253, 0, 0);
-static uint16_t cfg_udp_src	= 1000;
-static uint16_t cfg_udp_dst	= 1001;
+/*
+ * Hardcoded range for flow generation.
+ *
+ * Total number of flows =
+ *     cfg_n_ip_src * cfg_n_ip_dst * cfg_n_udp_src * cfg_n_udp_dst
+ */
+static uint32_t cfg_n_ip_src = 100;
+static uint32_t cfg_n_ip_dst = 100;
+static uint32_t cfg_n_udp_src = 10;
+static uint32_t cfg_n_udp_dst = 10;
+
+/* Base ip and port for flow generation. */
+static uint32_t cfg_ip_src_base = RTE_IPV4(10, 254, 0, 0);
+static uint32_t cfg_ip_dst_base = RTE_IPV4(10, 253, 0, 0);
+static uint16_t cfg_udp_src_base = 1000;
+static uint16_t cfg_udp_dst_base = 1001;
 static struct rte_ether_addr cfg_ether_src =
 	{{ 0x00, 0x01, 0x02, 0x03, 0x04, 0x00 }};
 static struct rte_ether_addr cfg_ether_dst =
 	{{ 0x00, 0x01, 0x02, 0x03, 0x04, 0x01 }};
 
+RTE_DEFINE_PER_LCORE(uint32_t, _next_ip_src);
+RTE_DEFINE_PER_LCORE(uint32_t, _next_ip_dst);
+RTE_DEFINE_PER_LCORE(uint32_t, _next_udp_src);
+RTE_DEFINE_PER_LCORE(uint32_t, _next_udp_dst);
+
 #define IP_DEFTTL  64   /* from RFC 1340. */
 
 /* Use this type to inform GCC that ip_sum violates aliasing rules. */
 typedef unaligned_uint16_t alias_int16_t __attribute__((__may_alias__));
 
-static inline uint16_t
-ip_sum(const alias_int16_t *hdr, int hdr_len)
-{
-	uint32_t sum = 0;
-
-	while (hdr_len > 1)
-	{
-		sum += *hdr++;
-		if (sum & 0x80000000)
-			sum = (sum & 0xFFFF) + (sum >> 16);
-		hdr_len -= 2;
-	}
-
-	while (sum >> 16)
-		sum = (sum & 0xFFFF) + (sum >> 16);
-
-	return ~sum;
-}
-
 /*
  * Multi-flow generation mode.
  *
@@ -85,7 +81,7 @@  ip_sum(const alias_int16_t *hdr, int hdr_len)
 static void
 pkt_burst_flow_gen(struct fwd_stream *fs)
 {
-	unsigned pkt_size = tx_pkt_length - 4;	/* Adjust FCS */
+	uint32_t pkt_size = tx_pkt_length - 4; /* Adjust FCS */
 	struct rte_mbuf  *pkts_burst[MAX_PKT_BURST];
 	struct rte_mempool *mbp;
 	struct rte_mbuf  *pkt = NULL;
@@ -102,15 +98,18 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 	uint32_t retry;
 	uint64_t tx_offloads;
 	uint64_t start_tsc = 0;
-	static int next_flow = 0;
+	uint32_t next_ip_src = RTE_PER_LCORE(_next_ip_src);
+	uint32_t next_ip_dst = RTE_PER_LCORE(_next_ip_dst);
+	uint32_t next_udp_src = RTE_PER_LCORE(_next_udp_src);
+	uint32_t next_udp_dst = RTE_PER_LCORE(_next_udp_dst);
 
 	get_start_cycles(&start_tsc);
 
 	/* Receive a burst of packets and discard them. */
 	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
 				 nb_pkt_per_burst);
+	inc_rx_burst_stats(fs, nb_rx);
 	fs->rx_packets += nb_rx;
-
 	for (i = 0; i < nb_rx; i++)
 		rte_pktmbuf_free(pkts_burst[i]);
 
@@ -144,7 +143,8 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 			eth_hdr = rte_pktmbuf_mtod(pkt, struct rte_ether_hdr *);
 			rte_ether_addr_copy(&cfg_ether_dst, &eth_hdr->d_addr);
 			rte_ether_addr_copy(&cfg_ether_src, &eth_hdr->s_addr);
-			eth_hdr->ether_type = rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4);
+			eth_hdr->ether_type =
+				rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4);
 
 			/* Initialize IP header. */
 			ip_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
@@ -155,22 +155,30 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 			ip_hdr->time_to_live	= IP_DEFTTL;
 			ip_hdr->next_proto_id	= IPPROTO_UDP;
 			ip_hdr->packet_id	= 0;
-			ip_hdr->src_addr	= rte_cpu_to_be_32(cfg_ip_src);
-			ip_hdr->dst_addr	= rte_cpu_to_be_32(cfg_ip_dst +
-								   next_flow);
-			ip_hdr->total_length	= RTE_CPU_TO_BE_16(pkt_size -
-								   sizeof(*eth_hdr));
-			ip_hdr->hdr_checksum	= ip_sum((const alias_int16_t *)ip_hdr,
-							 sizeof(*ip_hdr));
+			ip_hdr->src_addr	=
+				rte_cpu_to_be_32(cfg_ip_src_base
+						+ next_ip_src);
+			ip_hdr->dst_addr	=
+				rte_cpu_to_be_32(cfg_ip_dst_base
+						+ next_ip_dst);
+			ip_hdr->total_length	=
+				RTE_CPU_TO_BE_16(pkt_size
+						- sizeof(*eth_hdr));
+			ip_hdr->hdr_checksum = rte_ipv4_cksum(ip_hdr);
 
 			/* Initialize UDP header. */
 			udp_hdr = (struct rte_udp_hdr *)(ip_hdr + 1);
-			udp_hdr->src_port	= rte_cpu_to_be_16(cfg_udp_src);
-			udp_hdr->dst_port	= rte_cpu_to_be_16(cfg_udp_dst);
+			udp_hdr->src_port	=
+				rte_cpu_to_be_16(cfg_udp_src_base
+						+ next_udp_src);
+			udp_hdr->dst_port	=
+				rte_cpu_to_be_16(cfg_udp_dst_base
+						+ next_udp_dst);
+			udp_hdr->dgram_len	=
+				RTE_CPU_TO_BE_16(pkt_size
+						- sizeof(*eth_hdr)
+						- sizeof(*ip_hdr));
 			udp_hdr->dgram_cksum	= 0; /* No UDP checksum. */
-			udp_hdr->dgram_len	= RTE_CPU_TO_BE_16(pkt_size -
-								   sizeof(*eth_hdr) -
-								   sizeof(*ip_hdr));
 			pkt->nb_segs		= 1;
 			pkt->pkt_len		= pkt_size;
 			pkt->ol_flags		&= EXT_ATTACHED_MBUF;
@@ -185,30 +193,57 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 		}
 		pkts_burst[nb_pkt] = pkt;
 
-		next_flow = (next_flow + 1) % cfg_n_flows;
+		if (++next_udp_dst < cfg_n_udp_dst)
+			continue;
+		next_udp_dst = 0;
+		if (++next_udp_src < cfg_n_udp_src)
+			continue;
+		next_udp_src = 0;
+		if (++next_ip_dst < cfg_n_ip_dst)
+			continue;
+		next_ip_dst = 0;
+		if (++next_ip_src < cfg_n_ip_src)
+			continue;
+		next_ip_src = 0;
 	}
 
 	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
 	/*
 	 * Retry if necessary
 	 */
-	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
+	if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
 		retry = 0;
-		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
+		while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
 			rte_delay_us(burst_tx_delay_time);
 			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
+					&pkts_burst[nb_tx], nb_pkt - nb_tx);
 		}
 	}
-	fs->tx_packets += nb_tx;
 
 	inc_tx_burst_stats(fs, nb_tx);
-	if (unlikely(nb_tx < nb_pkt)) {
-		/* Back out the flow counter. */
-		next_flow -= (nb_pkt - nb_tx);
-		while (next_flow < 0)
-			next_flow += cfg_n_flows;
+	fs->tx_packets += nb_tx;
+	/* Catch up flow idx by actual sent. */
+	for (i = 0; i < nb_tx; ++i) {
+		RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
+		if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
+			continue;
+		RTE_PER_LCORE(_next_udp_dst) = 0;
+		RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
+		if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
+			continue;
+		RTE_PER_LCORE(_next_udp_src) = 0;
+		RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
+		if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
+			continue;
+		RTE_PER_LCORE(_next_ip_dst) = 0;
+		RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
+		if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
+			continue;
+		RTE_PER_LCORE(_next_ip_src) = 0;
+	}
 
+	if (unlikely(nb_tx < nb_pkt)) {
+		fs->fwd_dropped += nb_pkt - nb_tx;
 		do {
 			rte_pktmbuf_free(pkts_burst[nb_tx]);
 		} while (++nb_tx < nb_pkt);