[v3] app/pdump: enhance to support multi-core capture

Message ID 20190328150406.12051-1-vipin.varghese@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series [v3] app/pdump: enhance to support multi-core capture

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/intel-Performance-Testing success Performance Testing PASS
ci/mellanox-Performance-Testing fail Performance Testing issues

Commit Message

Varghese, Vipin March 28, 2019, 3:04 p.m. UTC
  Enhance the pdump application to allow the user to run capture on multiple cores.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
---

Change Log:

V3:
 - correct the parse_usage - Vipin Varghese
 - add change log - Vipin Varghese

V2:
 - Replace option '-c' to '-l' - Keith Wiles
---
 app/pdump/main.c           | 75 ++++++++++++++++++++++++++++++++------
 doc/guides/tools/pdump.rst |  5 +++
 2 files changed, 69 insertions(+), 11 deletions(-)
  

Comments

Wiles, Keith March 28, 2019, 3:34 p.m. UTC | #1
> On Mar 28, 2019, at 10:04 AM, Varghese, Vipin <vipin.varghese@intel.com> wrote:
> 
> Enhance pdump application, to allow user to run on multiple cores.
> 
> Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
> ---
> 
> Change Log:
> 
> V3:
> - correct the parse_usage - Vipin Varghese
> - add change log - Vipin Varghese
> 
> V2:
> - Replace option '-c' to '-l' - Keith Wiles
> ---
> app/pdump/main.c           | 75 ++++++++++++++++++++++++++++++++------
> doc/guides/tools/pdump.rst |  5 +++
> 2 files changed, 69 insertions(+), 11 deletions(-)
> 

Reviewed-by: Keith Wiles <keith.wiles@intel.com>

Regards,
Keith
  
Pattan, Reshma March 29, 2019, 10:08 a.m. UTC | #2
> -----Original Message-----
> From: Varghese, Vipin
> 
>  /* true if x is a power of 2 */
>  #define POWEROF2(x) ((((x)-1) & (x)) == 0)
> @@ -144,7 +145,7 @@  static volatile uint8_t quit_signal;
>  static void
>  pdump_usage(const char *prgname)
>  {
> -	printf("usage: %s [EAL options] -- --pdump "
> +	printf("usage: %s [EAL options] -- [-l <list of cores>] --pdump "

Using the same -l option as EAL is confusing. Please use another name.
Also, how about moving this new option inside --pdump "" so that it is clear which core is associated with which tuple?

Also, I have some major concerns; see my comments below.

>  			"'(port=<port id> | device_id=<pci id or vdev name>),"
>  			"(queue=<queue_id>),"
>  			"(rx-dev=<iface or pcap file> |"
> @@ -415,6 +416,7 @@ print_pdump_stats(void)
>  	for (i = 0; i < num_tuples; i++) {
>  		printf("##### PDUMP DEBUG STATS #####\n");
>  		pt = &pdump_t[i];
> +		printf(" == DPDK interface (%d) ==\n", i);

Here it would be good to print the port id/device id and queue info instead of the pdump tuple index i. The user might not understand the index.
Use ### instead of === as above.
 
> +
>  static inline void
>  dump_packets(void)
>  {
>  	int i;
> -	struct pdump_tuples *pt;
> +	uint32_t lcore_id = 0;
> +
> +	lcore_id = rte_get_next_lcore(lcore_id, 1, 1);
> +
> +	if (rte_lcore_count() == 1) {
> +		while (!quit_signal) {
> +			for (i = 0; i < num_tuples; i++) {
> +				struct pdump_tuples *pt = &pdump_t[i];
> +				pdump_packets(pt);
> +			}
> +		}
> +	} else {
> +		printf(" Tuples (%u) lcores (%u)\n",
> +			num_tuples, rte_lcore_count());
> +
> +		if ((uint32_t)num_tuples >= rte_lcore_count()) {
> +			printf("Insufficent Cores\n");
Typo %s/Insufficent/


> +	for (i = 0; i < argc; i++) {
> +		if (strstr(argv[i], "-l")) {
> +			snprintf(c_flag, RTE_DIM(c_flag), "-l %s", argv[i+1]);

You are taking this as an application argument and then turning it into an EAL argument to run the application.
Why not enable the needed cores in the core mask using the EAL option -l, and add a new core parameter to the pdump tuple to run that tuple on that core?

Ex:
In l3fwd, for example, the cores are enabled using -c or -l, and a separate l3fwd --config option specifies the core on which
the packet processing should run. Please check that; something similar would be good here too.

> +			strlcpy(argv[i], "", 2);
> +			strlcpy(argv[i + 1], "", 2);

Why is this? Anyway, rte_strlcpy should be used instead of strlcpy.
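
For reference, the scan in the hunk above uses strstr(), which matches any argument that merely contains "-l" (for example --log-level), and it reads argv[i + 1] without checking that it exists. A minimal sketch of a stricter lookup, assuming the option stays a separate "-l <list>" pair; find_core_list is a hypothetical helper and not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): return the value that follows
 * an exact "-l" argument, or NULL when "-l" is absent or has no value.
 * Unlike strstr(argv[i], "-l"), strcmp() does not match "--log-level".
 */
static const char *
find_core_list(int argc, char **argv)
{
	int i;

	assert(argv != NULL);

	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "-l") != 0)
			continue;
		/* the value must exist before it is dereferenced */
		if (i + 1 >= argc)
			return NULL;
		return argv[i + 1];
	}
	return NULL;
}
```

The caller would copy the returned string into c_flag only when the result is not NULL, and keep the default otherwise.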

Thanks,
Reshma
  
Varghese, Vipin March 29, 2019, 10:22 a.m. UTC | #3
Hi Reshma,

snipped
> >
> >  /* true if x is a power of 2 */
> >  #define POWEROF2(x) ((((x)-1) & (x)) == 0)
> > @@ -144,7 +145,7 @@  static volatile uint8_t quit_signal;
> >  static void
> >  pdump_usage(const char *prgname)
> >  {
> > -	printf("usage: %s [EAL options] -- --pdump "
> > +	printf("usage: %s [EAL options] -- [-l <list of cores>] --pdump "
> 
> Using -l option same as eal is confusing. Please use other name.
The current implementation passes the core mask '-cx1' as an EAL argument. The check for the user argument '-l <core1,core2,core3>' is done before rte_eal_init. Once identified, it replaces the default c_flag.

Hence I disagree that it is confusing.

> Also how about moving this  new option inside --pdump"" so it will be clearly
> known that the particular core will be associated to that tuple.
Yes, this can be done.

> 
> Also, I have some major concern, check my below comments.
Thanks for your concerns, let me try to address them below.

> 
> >  			"'(port=<port id> | device_id=<pci id or vdev name>),"
> >  			"(queue=<queue_id>),"
> >  			"(rx-dev=<iface or pcap file> |"
> > @@ -415,6 +416,7 @@ print_pdump_stats(void)
> >  	for (i = 0; i < num_tuples; i++) {
> >  		printf("##### PDUMP DEBUG STATS #####\n");
> >  		pt = &pdump_t[i];
> > +		printf(" == DPDK interface (%d) ==\n", i);
> 
> Here good to print the portid/deviceid and queue info, instead of printing pdump
> tuple index  i? User might not understand that.
I am not sure why you say that I am displaying the tuple index with i here?

> Use ### instead of === as above.
I can do this, but is there a specific reason for "####", given that it is used to mark the main header?

> 
> > +
> >  static inline void
> >  dump_packets(void)
> >  {
> >  	int i;
> > -	struct pdump_tuples *pt;
> > +	uint32_t lcore_id = 0;
> > +
> > +	lcore_id = rte_get_next_lcore(lcore_id, 1, 1);
> > +
> > +	if (rte_lcore_count() == 1) {
> > +		while (!quit_signal) {
> > +			for (i = 0; i < num_tuples; i++) {
> > +				struct pdump_tuples *pt = &pdump_t[i];
> > +				pdump_packets(pt);
> > +			}
> > +		}
> > +	} else {
> > +		printf(" Tuples (%u) lcores (%u)\n",
> > +			num_tuples, rte_lcore_count());
> > +
> > +		if ((uint32_t)num_tuples >= rte_lcore_count()) {
> > +			printf("Insufficent Cores\n");
> Typo %s/Insufficent/
Ok

> 
> 
> > +	for (i = 0; i < argc; i++) {
> > +		if (strstr(argv[i], "-l")) {
> > +			snprintf(c_flag, RTE_DIM(c_flag), "-l %s", argv[i+1]);
> 
> You are taking this as application arguments then making it as eal argument  to
> run the application.
I have explained the same above.

> Why not enable the needed number of cores in core mask using eal options -l 
I think what you are saying is "allow the user to pass the -l or -c option before `--`", and then replace it before invoking rte_eal_init. Is this your requirement?

> and
> have new core param in pdump tuple to run that tuple on that core.
> 
> Ex:
> If you check l3fwd as an example the cores should enabled using -c or -l and then
> they have separate --config l3fwd option in which they specify the core on which
> the packet processing should be run. Please check that and similar would be good
> here too.
As already explained, the pdump application statically assigns '-cx1'. If you try passing '-c' or '-l', the error check in rte_eal_init will prevent such an assignment.

> 
> > +			strlcpy(argv[i], "", 2);
> > +			strlcpy(argv[i + 1], "", 2);
> 
> Why is this?
I have explained this above.


> Anyway, rte_strlcpy should be used instead of strlcpy.
Ok

> 
> Thanks,
> Reshma
Hi Reshma, thanks for the feedback on cosmetics, spelling and using rte_strlcpy.
  
Pattan, Reshma March 29, 2019, 10:52 a.m. UTC | #4
> -----Original Message-----
> From: Varghese, Vipin
> Subject: RE: [PATCH v3] app/pdump: enhance to support multi-core capture
> >
> > >  			"'(port=<port id> | device_id=<pci id or vdev name>),"
> > >  			"(queue=<queue_id>),"
> > >  			"(rx-dev=<iface or pcap file> |"
> > > @@ -415,6 +416,7 @@ print_pdump_stats(void)
> > >  	for (i = 0; i < num_tuples; i++) {
> > >  		printf("##### PDUMP DEBUG STATS #####\n");
> > >  		pt = &pdump_t[i];
> > > +		printf(" == DPDK interface (%d) ==\n", i);
> >
> > Here good to print the portid/deviceid and queue info, instead of
> > printing pdump tuple index  i? User might not understand that.
> I am not sure, why you mention that I am displaying tuple index with I here?
> 

What is i here?
i is the index used in the for loop to iterate through the pdump_tuples array.
So printing i doesn't make sense; port and queue info are better
options to print if you want to have this log.
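
One way to act on this is to build the header from the tuple's port and queue. A minimal sketch, assuming the tuple carries a 16-bit port id and queue id and that UINT16_MAX marks "all queues" (as RTE_PDUMP_ALL_QUEUES does); format_stats_header and ALL_QUEUES are hypothetical names and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the "all queues" marker (assumption: UINT16_MAX). */
#define ALL_QUEUES UINT16_MAX

/*
 * Hypothetical helper: write a stats header that names the port and
 * queue instead of the tuple index. Returns what snprintf() returns.
 */
static int
format_stats_header(char *buf, size_t len, uint16_t port, uint16_t queue)
{
	assert(buf != NULL && len > 0);

	if (queue == ALL_QUEUES)
		return snprintf(buf, len,
			"##### PDUMP DEBUG STATS: port %u queue * #####",
			(unsigned int)port);

	return snprintf(buf, len,
		"##### PDUMP DEBUG STATS: port %u queue %u #####",
		(unsigned int)port, (unsigned int)queue);
}
```

A tuple identified by device_id rather than port would need a string variant of the same header.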

> > Use ### instead of === as above.
> I can do this, but is there specific reasoning for "####" as it is used to represent
> main header?

To stay consistent with the existing ### display.

> > Why not enable the needed number of cores in core mask using eal
> > options -l
> I think what you are saying is "allow user to pass -l option or -c option before `--

Yes, and remove the existing static -c1 method.
So the cores to be enabled will come from the EAL options, and --pdump will
have a new core parameter that is used to launch the packet dump function for that pdump tuple.
While parsing the pdump core parameter, you should check that this core is enabled in the EAL coremask. That is how other applications do it.
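
The check described above could look roughly like the following. This is only a sketch under stated assumptions: a 64-bit mask stands in for the EAL view of the enabled lcores so that it builds without DPDK (in the application, rte_lcore_is_enabled() would answer this), and the master lcore is rejected because the patch already skips it when picking capture cores. capture_core_is_valid is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: bit n set in enabled_mask means lcore n was
 * enabled through the EAL -l/-c options (stand-in for the EAL state).
 */
static bool
capture_core_is_valid(uint64_t enabled_mask, unsigned int master,
		unsigned int core)
{
	assert(master < 64);

	if (core >= 64)
		return false;	/* outside the range this sketch models */
	if ((enabled_mask & (UINT64_C(1) << core)) == 0)
		return false;	/* not enabled through the EAL options */

	return core != master;	/* master lcore is not used for capture */
}
```

A tuple whose core fails this check would be rejected at parse time, before any capture thread is launched.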

> `". Then before invoking rte_eal_init replace it. Is this your requirement?
> 

With the new suggestion of removing the static -c1 approach, the above question no longer applies.

> > Ex:
> > If you check l3fwd as an example the cores should enabled using -c or
> > -l and then they have separate --config l3fwd option in which they
> > specify the core on which the packet processing should be run. Please
> > check that and similar would be good here too.
> I have already explained, pdump application makes static assignment of '-cx1'. If
> you try passing '-c' or '-l' the error check in rte_eal_init will prevent such
> assignment.

Once you remove the existing static -c1 assignment and use the -l or -c passed to EAL, this should be fine.
  
Ferruh Yigit March 29, 2019, 5:03 p.m. UTC | #5
On 3/29/2019 10:22 AM, Varghese, Vipin wrote:
> Hi Reshma,
> 
> snipped
>>>
>>>  /* true if x is a power of 2 */
>>>  #define POWEROF2(x) ((((x)-1) & (x)) == 0)
>>> @@ -144,7 +145,7 @@  static volatile uint8_t quit_signal;
>>>  static void
>>>  pdump_usage(const char *prgname)
>>>  {
>>> -	printf("usage: %s [EAL options] -- --pdump "
>>> +	printf("usage: %s [EAL options] -- [-l <list of cores>] --pdump "
>>
>> Using -l option same as eal is confusing. Please use other name.
> Current implementation passes core-mask '-cx1' as EAL argument. The check for user argument '-l <core1,core2,core3' is done before rte_eal_init. Once identified it is replaced with c_flag.
> 
> Hence I disagree to the point it is confusing.

I agree with Reshma: if there is a need to run on different cores, let's remove
the hardcoded core information from the application and manage the core selection
at the EAL level, instead of having it in the application.

And at the app level, you can say which core to use for that specific pdump, overall
something like:

dpdk-pdump -l 20,23 -- --pdump 'port=0,queue=*,core=21,rx-dev=/tmp/rx.pcap'
--pdump 'port=1,queue=*,core=22,tx-dev=/tmp/tx.pcap'
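
To make the proposed syntax concrete, here is a minimal sketch of pulling the core value out of such a tuple. The "core" key is only a proposal in this thread, and tuple_get_core is a hypothetical name; the application parses its tuples with rte_kvargs, so a real change would more likely add a key there than hand-roll a parser like this.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: return the value of core=<n> from a
 * comma-separated tuple, or -1 when the key is absent or malformed.
 */
static int
tuple_get_core(const char *tuple)
{
	const char *p = tuple;
	char *end;
	long val;

	assert(tuple != NULL);

	while (p != NULL && *p != '\0') {
		if (strncmp(p, "core=", 5) == 0) {
			errno = 0;
			val = strtol(p + 5, &end, 10);
			if (errno != 0 || end == p + 5 ||
					val < 0 || val > INT_MAX)
				return -1;
			/* the number must end at a separator */
			if (*end != ',' && *end != '\0')
				return -1;
			return (int)val;
		}
		p = strchr(p, ',');
		if (p != NULL)
			p++;	/* step past the comma to the next key */
	}
	return -1;
}
```

With the first tuple of the command above this yields 21; a tuple without the key yields -1, which leaves room for a default policy for tuples that do not name a core.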


> 
>> Also how about moving this  new option inside --pdump"" so it will be clearly
>> known that the particular core will be associated to that tuple.
> Yes, this can be done.
> 
>>
>> Also, I have some major concern, check my below comments.
> Thanks for your concerns, let me try to address them below.
> 
>>
>>>  			"'(port=<port id> | device_id=<pci id or vdev name>),"
>>>  			"(queue=<queue_id>),"
>>>  			"(rx-dev=<iface or pcap file> |"
>>> @@ -415,6 +416,7 @@ print_pdump_stats(void)
>>>  	for (i = 0; i < num_tuples; i++) {
>>>  		printf("##### PDUMP DEBUG STATS #####\n");
>>>  		pt = &pdump_t[i];
>>> +		printf(" == DPDK interface (%d) ==\n", i);
>>
>> Here good to print the portid/deviceid and queue info, instead of printing pdump
>> tuple index  i? User might not understand that.
> I am not sure, why you mention that I am displaying tuple index with I here?
> 
>> Use ### instead of === as above.
> I can do this, but is there specific reasoning for "####" as it is used to represent main header?
> 
>>
>>> +
>>>  static inline void
>>>  dump_packets(void)
>>>  {
>>>  	int i;
>>> -	struct pdump_tuples *pt;
>>> +	uint32_t lcore_id = 0;
>>> +
>>> +	lcore_id = rte_get_next_lcore(lcore_id, 1, 1);
>>> +
>>> +	if (rte_lcore_count() == 1) {
>>> +		while (!quit_signal) {
>>> +			for (i = 0; i < num_tuples; i++) {
>>> +				struct pdump_tuples *pt = &pdump_t[i];
>>> +				pdump_packets(pt);
>>> +			}
>>> +		}
>>> +	} else {
>>> +		printf(" Tuples (%u) lcores (%u)\n",
>>> +			num_tuples, rte_lcore_count());
>>> +
>>> +		if ((uint32_t)num_tuples >= rte_lcore_count()) {
>>> +			printf("Insufficent Cores\n");
>> Typo %s/Insufficent/
> Ok
> 
>>
>>
>>> +	for (i = 0; i < argc; i++) {
>>> +		if (strstr(argv[i], "-l")) {
>>> +			snprintf(c_flag, RTE_DIM(c_flag), "-l %s", argv[i+1]);
>>
>> You are taking this as application arguments then making it as eal argument  to
>> run the application.
> I have explained the same above.
> 
>> Why not enable the needed number of cores in core mask using eal options -l 
> I think what you are saying is "allow user to pass -l option or -c option before `--`". Then before invoking rte_eal_init replace it. Is this your requirement?
> 
> and
>> have new core param in pdump tuple to run that tuple on that core.
>>
>> Ex:
>> If you check l3fwd as an example the cores should enabled using -c or -l and then
>> they have separate --config l3fwd option in which they specify the core on which
>> the packet processing should be run. Please check that and similar would be good
>> here too.
> I have already explained, pdump application makes static assignment of '-cx1'. If you try passing '-c' or '-l' the error check in rte_eal_init will prevent such assignment.
> 
>>
>>> +			strlcpy(argv[i], "", 2);
>>> +			strlcpy(argv[i + 1], "", 2);
>>
>> Why is this?
> I have explained this above.
> 
> 
>  Anyway, rte_strlcpy should be used instead of strlcpy.
> Ok
> 
>>
>> Thanks,
>> Reshma
> Hi Reshma, thanks for feedbacks on cosmetic, spelling and using rte_strlcpy
>
  
Varghese, Vipin April 1, 2019, 4:05 a.m. UTC | #6
Hi Reshma & Ferruh,

Summarizing the discussion with the maintainer and the proposed changes below.

1. Agreed to migrate 'strlcpy' to 'rte_strlcpy'.
2. Agreed to fix the spelling error.
3. Informed that it is the user's decision to enable multi-core capture for a queue pair.
4. Informed that the master core cannot participate in capture, as it is required for bookkeeping for future enhancements such as file size and packet count.
5. Disagreed with the statement that the '-l cores' option is confusing, as the user should have the option to specify the cores.
6. Disagreed with the maintainer's suggestion to extend '--pdump' with a core parameter, as it leads to mixed combinations when the option is passed for some tuples and not for others.
7. Agreed to print the port-queue pair instance with its core as debug output. Sharing this information once capture has stopped does not look useful, but I will add it as well.
8. In my humble opinion, there should be a default core for the callback, which is core 0. Removing 'c_flag' after rte_eal_init is not the right way. Hence user arguments (especially if pdump is to run on multiple cores) should be checked before rte_eal_init.

Action Items:
1. Send or wait for patch, to remove the core default value.
2. Send v5 for the above agreed points.

Thanks
Vipin Varghese


> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, March 29, 2019 10:33 PM
> To: Varghese, Vipin <vipin.varghese@intel.com>; Pattan, Reshma
> <reshma.pattan@intel.com>; dev@dpdk.org
> Cc: Wiles, Keith <keith.wiles@intel.com>; Mcnamara, John
> <john.mcnamara@intel.com>; Byrne, Stephen1 <stephen1.byrne@intel.com>;
> Tamboli, Amit S <amit.tamboli@intel.com>; Padubidri, Sanjay A
> <sanjay.padubidri@intel.com>; Patel, Amol <amol.patel@intel.com>; Kovacevic,
> Marko <marko.kovacevic@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3] app/pdump: enhance to support multi-core
> capture
> 
> On 3/29/2019 10:22 AM, Varghese, Vipin wrote:
> > Hi Reshma,
> >
> > snipped
> >>>
> >>>  /* true if x is a power of 2 */
> >>>  #define POWEROF2(x) ((((x)-1) & (x)) == 0)
> >>> @@ -144,7 +145,7 @@  static volatile uint8_t quit_signal;
> >>>  static void
> >>>  pdump_usage(const char *prgname)
> >>>  {
> >>> -	printf("usage: %s [EAL options] -- --pdump "
> >>> +	printf("usage: %s [EAL options] -- [-l <list of cores>] --pdump "
> >>
> >> Using -l option same as eal is confusing. Please use other name.
> > Current implementation passes core-mask '-cx1' as EAL argument. The check for
> user argument '-l <core1,core2,core3' is done before rte_eal_init. Once identified
> it is replaced with c_flag.
> >
> > Hence I disagree to the point it is confusing.
> 
> I agree with Reshma, if there is a need to run in different cores, lets remove the
> hardcoded core information from application and manage the core selection in
> eal level, instead of having this in application.
> 
> And in app level, you can say which core to use for that specific pdump, overall
> something like:
> 
> dpdk-pdump -l 20,23 -- --pdump 'port=0,queue=*,core=21,rx-dev=/tmp/rx.pcap'
> --pdump 'port=1,queue=*,core=22,tx-dev=/tmp/tx.pcap'
> 
> 
> >
> >> Also how about moving this  new option inside --pdump"" so it will be
> >> clearly known that the particular core will be associated to that tuple.
> > Yes, this can be done.
> >
> >>
> >> Also, I have some major concern, check my below comments.
> > Thanks for your concerns, let me try to address them below.
> >
> >>
> >>>  			"'(port=<port id> | device_id=<pci id or vdev name>),"
> >>>  			"(queue=<queue_id>),"
> >>>  			"(rx-dev=<iface or pcap file> |"
> >>> @@ -415,6 +416,7 @@ print_pdump_stats(void)
> >>>  	for (i = 0; i < num_tuples; i++) {
> >>>  		printf("##### PDUMP DEBUG STATS #####\n");
> >>>  		pt = &pdump_t[i];
> >>> +		printf(" == DPDK interface (%d) ==\n", i);
> >>
> >> Here good to print the portid/deviceid and queue info, instead of
> >> printing pdump tuple index  i? User might not understand that.
> > I am not sure, why you mention that I am displaying tuple index with I here?
> >
> >> Use ### instead of === as above.
> > I can do this, but is there specific reasoning for "####" as it is used to represent
> main header?
> >
> >>
> >>> +
> >>>  static inline void
> >>>  dump_packets(void)
> >>>  {
> >>>  	int i;
> >>> -	struct pdump_tuples *pt;
> >>> +	uint32_t lcore_id = 0;
> >>> +
> >>> +	lcore_id = rte_get_next_lcore(lcore_id, 1, 1);
> >>> +
> >>> +	if (rte_lcore_count() == 1) {
> >>> +		while (!quit_signal) {
> >>> +			for (i = 0; i < num_tuples; i++) {
> >>> +				struct pdump_tuples *pt = &pdump_t[i];
> >>> +				pdump_packets(pt);
> >>> +			}
> >>> +		}
> >>> +	} else {
> >>> +		printf(" Tuples (%u) lcores (%u)\n",
> >>> +			num_tuples, rte_lcore_count());
> >>> +
> >>> +		if ((uint32_t)num_tuples >= rte_lcore_count()) {
> >>> +			printf("Insufficent Cores\n");
> >> Typo %s/Insufficent/
> > Ok
> >
> >>
> >>
> >>> +	for (i = 0; i < argc; i++) {
> >>> +		if (strstr(argv[i], "-l")) {
> >>> +			snprintf(c_flag, RTE_DIM(c_flag), "-l %s", argv[i+1]);
> >>
> >> You are taking this as application arguments then making it as eal
> >> argument  to run the application.
> > I have explained the same above.
> >
> >> Why not enable the needed number of cores in core mask using eal
> >> options -l
> > I think what you are saying is "allow user to pass -l option or -c option before `--
> `". Then before invoking rte_eal_init replace it. Is this your requirement?
> >
> > and
> >> have new core param in pdump tuple to run that tuple on that core.
> >>
> >> Ex:
> >> If you check l3fwd as an example the cores should enabled using -c or
> >> -l and then they have separate --config l3fwd option in which they
> >> specify the core on which the packet processing should be run. Please
> >> check that and similar would be good here too.
> > I have already explained, pdump application makes static assignment of '-cx1'. If
> you try passing '-c' or '-l' the error check in rte_eal_init will prevent such
> assignment.
> >
> >>
> >>> +			strlcpy(argv[i], "", 2);
> >>> +			strlcpy(argv[i + 1], "", 2);
> >>
> >> Why is this?
> > I have explained this above.
> >
> >
> >  Anyway, rte_strlcpy should be used instead of strlcpy.
> > Ok
> >
> >>
> >> Thanks,
> >> Reshma
> > Hi Reshma, thanks for feedbacks on cosmetic, spelling and using
> > rte_strlcpy
> >
  

Patch

diff --git a/app/pdump/main.c b/app/pdump/main.c
index ccf2a1d2f..a2e092420 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -62,6 +62,7 @@ 
 #define SIZE 256
 #define BURST_SIZE 32
 #define NUM_VDEVS 2
+#define CORES_STR_SIZE 32
 
 /* true if x is a power of 2 */
 #define POWEROF2(x) ((((x)-1) & (x)) == 0)
@@ -144,7 +145,7 @@  static volatile uint8_t quit_signal;
 static void
 pdump_usage(const char *prgname)
 {
-	printf("usage: %s [EAL options] -- --pdump "
+	printf("usage: %s [EAL options] -- [-l <list of cores>] --pdump "
 			"'(port=<port id> | device_id=<pci id or vdev name>),"
 			"(queue=<queue_id>),"
 			"(rx-dev=<iface or pcap file> |"
@@ -415,6 +416,7 @@  print_pdump_stats(void)
 	for (i = 0; i < num_tuples; i++) {
 		printf("##### PDUMP DEBUG STATS #####\n");
 		pt = &pdump_t[i];
+		printf(" == DPDK interface (%d) ==\n", i);
 		printf(" -packets dequeued:			%"PRIu64"\n",
 							pt->stats.dequeue_pkts);
 		printf(" -packets transmitted to vdev:		%"PRIu64"\n",
@@ -834,22 +836,62 @@  enable_pdump(void)
 	}
 }
 
+static inline void
+pdump_packets(struct pdump_tuples *pt)
+{
+	if (pt->dir & RTE_PDUMP_FLAG_RX)
+		pdump_rxtx(pt->rx_ring, pt->rx_vdev_id, &pt->stats);
+	if (pt->dir & RTE_PDUMP_FLAG_TX)
+		pdump_rxtx(pt->tx_ring, pt->tx_vdev_id, &pt->stats);
+}
+
+static int
+dump_packets_core(void *arg)
+{
+	struct pdump_tuples *pt = (struct pdump_tuples *) arg;
+
+	while (!quit_signal)
+		pdump_packets(pt);
+
+	return 0;
+}
+
 static inline void
 dump_packets(void)
 {
 	int i;
-	struct pdump_tuples *pt;
+	uint32_t lcore_id = 0;
+
+	lcore_id = rte_get_next_lcore(lcore_id, 1, 1);
+
+	if (rte_lcore_count() == 1) {
+		while (!quit_signal) {
+			for (i = 0; i < num_tuples; i++) {
+				struct pdump_tuples *pt = &pdump_t[i];
+				pdump_packets(pt);
+			}
+		}
+	} else {
+		printf(" Tuples (%u) lcores (%u)\n",
+			num_tuples, rte_lcore_count());
+
+		if ((uint32_t)num_tuples >= rte_lcore_count()) {
+			printf("Insufficent Cores\n");
+			return;
+		}
 
-	while (!quit_signal) {
 		for (i = 0; i < num_tuples; i++) {
-			pt = &pdump_t[i];
-			if (pt->dir & RTE_PDUMP_FLAG_RX)
-				pdump_rxtx(pt->rx_ring, pt->rx_vdev_id,
-					&pt->stats);
-			if (pt->dir & RTE_PDUMP_FLAG_TX)
-				pdump_rxtx(pt->tx_ring, pt->tx_vdev_id,
-					&pt->stats);
+			/* use remote launch for n interfaces */
+			rte_eal_remote_launch(dump_packets_core,
+				&pdump_t[i], lcore_id);
+			lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+
+			if (rte_eal_wait_lcore(lcore_id) < 0)
+				rte_exit(EXIT_FAILURE, "failed to wait\n");
 		}
+
+		while (!quit_signal)
+			;
 	}
 }
 
@@ -860,7 +902,7 @@  main(int argc, char **argv)
 	int ret;
 	int i;
 
-	char c_flag[] = "-c1";
+	char c_flag[CORES_STR_SIZE] = "-c1";
 	char n_flag[] = "-n4";
 	char mp_flag[] = "--proc-type=secondary";
 	char *argp[argc + 3];
@@ -868,6 +910,17 @@  main(int argc, char **argv)
 	/* catch ctrl-c so we can print on exit */
 	signal(SIGINT, signal_handler);
 
+	for (i = 0; i < argc; i++) {
+		if (strstr(argv[i], "-l")) {
+			snprintf(c_flag, RTE_DIM(c_flag), "-l %s", argv[i+1]);
+			strlcpy(argv[i], "", 2);
+			strlcpy(argv[i + 1], "", 2);
+			break;
+		}
+	}
+
+	printf(" c_flag %s", c_flag);
+
 	argp[0] = argv[0];
 	argp[1] = c_flag;
 	argp[2] = n_flag;
diff --git a/doc/guides/tools/pdump.rst b/doc/guides/tools/pdump.rst
index 7c2b73e72..3cfd9fc44 100644
--- a/doc/guides/tools/pdump.rst
+++ b/doc/guides/tools/pdump.rst
@@ -35,6 +35,7 @@  The tool has a number of command line options:
 .. code-block:: console
 
    ./build/app/dpdk-pdump --
+                          [-l <list of cores>]
                           --pdump '(port=<port id> | device_id=<pci id or vdev name>),
                                    (queue=<queue_id>),
                                    (rx-dev=<iface or pcap file> |
@@ -43,6 +44,9 @@  The tool has a number of command line options:
                                    [mbuf-size=<mbuf data size>],
                                    [total-num-mbufs=<number of mbufs>]'
 
+The ``-l`` command line option is optional and it takes list cores on which
+capture for each ``--pdump`` is to be run.
+
 The ``--pdump`` command line option is mandatory and it takes various sub arguments which are described in
 below section.
 
@@ -113,3 +117,4 @@  Example
 .. code-block:: console
 
    $ sudo ./build/app/dpdk-pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap'
+   $ sudo ./build/app/dpdk-pdump -- -l 3,4,5 --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap' --pdump 'port=1,queue=*,tx-dev=/tmp/tx.pcap'