examples/ip_fragmentation: support bigger packets

Message ID 1544703399-32621-1-git-send-email-noae@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series: examples/ip_fragmentation: support bigger packets

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/intel-Performance-Testing success Performance Testing PASS

Commit Message

Noa Ezra Dec. 13, 2018, 12:17 p.m. UTC
  Add MTU and mbuf size configuration to the application's command
line, so that the NIC and the DPDK application can receive packets
of all sizes.
The maximum transmission unit (MTU) is the largest packet size, in
bytes, that can be sent on the network; before the MTU parameter was
added, the NIC could not receive packets larger than 1500 bytes,
which is the default MTU size.
The mbuf is the memory buffer that contains the packet. Before the
mbuf parameter was added, the DPDK application could not receive
packets larger than 2KB, which is the default mbuf size.

Signed-off-by: Noa Ezra <noae@mellanox.com>
---
 doc/guides/sample_app_ug/ip_frag.rst | 18 ++++++++-
 examples/ip_fragmentation/main.c     | 77 +++++++++++++++++++++++++++++++++---
 2 files changed, 88 insertions(+), 7 deletions(-)
  

Comments

Thomas Monjalon Dec. 19, 2018, 9:30 p.m. UTC | #1
13/12/2018 13:17, Noa Ezra:
> Adding MTU and mbuf size configuration to the application's command
> line, in order to be able to receive all packet sizes by the NIC and
> DPDK application.
> The maximum transmission unit (MTU) is the largest size packet in
> bytes that can be sent on the network, therefore before adding MTU
> parameter, the NIC could not receive packets larger than 1500 bytes,
> which is the default MTU size.
> The mbuf is the memory buffer that contains the packet. Before adding
> mbuf parameter, the DPDK application could not receive packets larger
> than 2KB, which is the default mbuf size.
> 
> Signed-off-by: Noa Ezra <noae@mellanox.com>
> ---
>  doc/guides/sample_app_ug/ip_frag.rst | 18 ++++++++-
>  examples/ip_fragmentation/main.c     | 77 +++++++++++++++++++++++++++++++++---
>  2 files changed, 88 insertions(+), 7 deletions(-)

Konstantin, any comment please?
  
Ananyev, Konstantin Dec. 20, 2018, 12:18 a.m. UTC | #2
Hi, 

> 
> Adding MTU and mbuf size configuration to the application's command
> line, in order to be able to receive all packet sizes by the NIC and
> DPDK application.
> The maximum transmission unit (MTU) is the largest size packet in
> bytes that can be sent on the network, therefore before adding MTU
> parameter, the NIC could not receive packets larger than 1500 bytes,
> which is the default MTU size.

I wonder why that is.
Currently ip_fragmentation sets max_rx_pkt_len to up to 9.5KB:

static struct rte_eth_conf port_conf = {
        .rxmode = {
                .max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
...

local_port_conf.rxmode.max_rx_pkt_len = RTE_MIN(
                    dev_info.max_rx_pktlen,
                    local_port_conf.rxmode.max_rx_pkt_len);

That in theory should be enough to enable jumbo frame RX.
Did you find that it is not working as expected? If so, on which NIC?

> The mbuf is the memory buffer that contains the packet. Before adding
> mbuf parameter, the DPDK application could not receive packets larger
> than 2KB, which is the default mbuf size.

Again, why is that?
All PMDs that support scatter-gather RX should be able to receive
frames bigger than the mbuf size (if properly configured, of course).
Are you trying to make it work on NICs with no multi-segment support for RX/TX?
But then how do you plan to do TX (it is usually symmetric for RX/TX)?

> 
> Signed-off-by: Noa Ezra <noae@mellanox.com>
> ---
>  doc/guides/sample_app_ug/ip_frag.rst | 18 ++++++++-
>  examples/ip_fragmentation/main.c     | 77 +++++++++++++++++++++++++++++++++---
>  2 files changed, 88 insertions(+), 7 deletions(-)
> 
> diff --git a/doc/guides/sample_app_ug/ip_frag.rst b/doc/guides/sample_app_ug/ip_frag.rst
> index 7914a97..13933c7 100644
> --- a/doc/guides/sample_app_ug/ip_frag.rst
> +++ b/doc/guides/sample_app_ug/ip_frag.rst
> @@ -53,7 +53,7 @@ Application usage:
> 
>  .. code-block:: console
> 
> -    ./build/ip_fragmentation [EAL options] -- -p PORTMASK [-q NQ]
> +    ./build/ip_fragmentation [EAL options] -- -p PORTMASK [-q NQ] [-b MBUFSIZE] [-m MTUSIZE]
> 
>  where:
> 
> @@ -61,6 +61,15 @@ where:
> 
>  *   -q NQ is the number of queue (=ports) per lcore (the default is 1)
> 
> +*   -b MBUFSIZE is the mbuf size in bytes (the default is 2048)
> +
> +*   -m MTUSIZE is the mtu size in bytes (the default is 1500)
> +
> +The MTU is the maximum size of a single data unit that can be transmitted over the network, therefore it must be
> +greater than the requested max packet size, otherwise the NIC won't be able to get the packet.
> +The mbuf is a buffer that is used by the DPDK application to store message buffers.If not using scatter then the
> +mbuf size must be greater than the requested max packet size, otherwise the DPDK will not be able to receive the
> +packet.
>  To run the example in linuxapp environment with 2 lcores (2,4) over 2 ports(0,2) with 1 RX queue per lcore:
> 
>  .. code-block:: console
> @@ -96,6 +105,13 @@ To run the example in linuxapp environment with 1 lcore (4) over 2 ports(0,2) wi
> 
>      ./build/ip_fragmentation -l 4 -n 3 -- -p 5 -q 2
> 
> +To run the example with defined MTU size 4000 bytes and mbuf size 9000 byes:
> +
> +.. code-block:: console
> +
> +    ./build/ip_fragmentation -l 4 -n 3 -- -p 5 -m 4000 -b 9000
> +
> +
>  To test the application, flows should be set up in the flow generator that match the values in the
>  l3fwd_ipv4_route_array and/or l3fwd_ipv6_route_array table.
> 
> diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
> index 17a877d..0cf23b4 100644
> --- a/examples/ip_fragmentation/main.c
> +++ b/examples/ip_fragmentation/main.c
> @@ -111,6 +111,10 @@
> 
>  static int rx_queue_per_lcore = 1;
> 
> +static int mbuf_size = RTE_MBUF_DEFAULT_BUF_SIZE;
> +
> +static int mtu_size = ETHER_MTU;
> +
>  #define MBUF_TABLE_SIZE  (2 * MAX(MAX_PKT_BURST, MAX_PACKET_FRAG))
> 
>  struct mbuf_table {
> @@ -425,7 +429,6 @@ struct rte_lpm6_config lpm6_config = {
>  		 * Read packet from RX queues
>  		 */
>  		for (i = 0; i < qconf->n_rx_queue; i++) {
> -
>  			portid = qconf->rx_queue_list[i].portid;
>  			nb_rx = rte_eth_rx_burst(portid, 0, pkts_burst,
>  						 MAX_PKT_BURST);
> @@ -455,9 +458,11 @@ struct rte_lpm6_config lpm6_config = {
>  static void
>  print_usage(const char *prgname)
>  {
> -	printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
> +	printf("%s [EAL options] -- -p PORTMASK [-q NQ] [-b MBUFSIZE] [-m MTUSIZE]\n"
>  	       "  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
> -	       "  -q NQ: number of queue (=ports) per lcore (default is 1)\n",
> +	       "  -q NQ: number of queue (=ports) per lcore (default is 1)\n"
> +	       "  -b MBUFSIZE: set the data size of mbuf\n"
> +	       "  -m MTUSIZE: set the MTU size\n",
>  	       prgname);
>  }
> 
> @@ -496,6 +501,38 @@ struct rte_lpm6_config lpm6_config = {
>  	return n;
>  }
> 
> +static int
> +parse_mbufsize(const char *q_arg)
> +{
> +	char *end = NULL;
> +	unsigned long mbuf;
> +
> +	/* parse hexadecimal string */

You expect a decimal string below.

> +	mbuf = strtoul(q_arg, &end, 10);
> +	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
> +		return -1;

You probably need to set and test errno too.

> +	if (mbuf == 0)
> +		return -1;
> +
> +	return mbuf;
> +}


These 2 parse functions look identical.
Why do you need two of them?
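A sketch of the two suggestions combined (one shared helper instead of two identical functions, plus `errno` checking around `strtoul`); `parse_uint` is a hypothetical name, not something in the patch:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical merged helper: parse a positive decimal integer,
 * returning -1 on empty input, trailing garbage, overflow, or zero.
 * Could serve both the mbuf size (-b) and MTU size (-m) options. */
static int
parse_uint(const char *q_arg)
{
	char *end = NULL;
	unsigned long val;

	errno = 0;
	val = strtoul(q_arg, &end, 10);
	if (errno != 0 || q_arg[0] == '\0' || end == NULL || *end != '\0')
		return -1;
	if (val == 0 || val > INT_MAX)
		return -1;

	return (int)val;
}
```

Both option handlers could then call `parse_uint(optarg)` and keep their own error messages.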

> +
> +static int
> +parse_mtusize(const char *q_arg)
> +{
> +	char *end = NULL;
> +	unsigned long mtu;
> +
> +	/* parse hexadecimal string */
> +	mtu = strtoul(q_arg, &end, 10);
> +	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
> +		return -1;
> +	if (mtu == 0)
> +		return -1;
> +
> +	return mtu;
> +}
> +
>  /* Parse the argument given in the command line of the application */
>  static int
>  parse_args(int argc, char **argv)
> @@ -510,7 +547,7 @@ struct rte_lpm6_config lpm6_config = {
> 
>  	argvopt = argv;
> 
> -	while ((opt = getopt_long(argc, argvopt, "p:q:",
> +	while ((opt = getopt_long(argc, argvopt, "p:q:b:m:",
>  				  lgopts, &option_index)) != EOF) {
> 
>  		switch (opt) {
> @@ -534,6 +571,26 @@ struct rte_lpm6_config lpm6_config = {
>  			}
>  			break;
> 
> +		/* mbufsize */
> +		case 'b':
> +			mbuf_size = parse_mbufsize(optarg);
> +			if (mbuf_size < 0) {
> +				printf("invalid mbuf size\n");
> +				print_usage(prgname);
> +				return -1;
> +			}
> +			break;
> +
> +		/* mtusize */
> +		case 'm':
> +			mtu_size = parse_mtusize(optarg);
> +			if (mtu_size < 0) {
> +				printf("invalid mtu size\n");
> +				print_usage(prgname);
> +				return -1;
> +			}
> +			break;
> +
>  		/* long options */
>  		case 0:
>  			print_usage(prgname);
> @@ -777,9 +834,8 @@ struct rte_lpm6_config lpm6_config = {
>  			RTE_LOG(INFO, IP_FRAG, "Creating direct mempool on socket %i\n",
>  					socket);
>  			snprintf(buf, sizeof(buf), "pool_direct_%i", socket);
> -
>  			mp = rte_pktmbuf_pool_create(buf, NB_MBUF, 32,
> -				0, RTE_MBUF_DEFAULT_BUF_SIZE, socket);
> +				0, mbuf_size, socket);
>  			if (mp == NULL) {
>  				RTE_LOG(ERR, IP_FRAG, "Cannot create direct mempool\n");
>  				return -1;
> @@ -892,6 +948,15 @@ struct rte_lpm6_config lpm6_config = {
>  		    dev_info.max_rx_pktlen,
>  		    local_port_conf.rxmode.max_rx_pkt_len);
> 
> +		/* set the mtu to the maximum received packet size */
> +		ret = rte_eth_dev_set_mtu(portid, mtu_size);
> +		if (ret < 0) {
> +			printf("\n");
> +			rte_exit(EXIT_FAILURE, "Set MTU failed: "
> +				"err=%d, port=%d\n",
> +			ret, portid);
> +		}
> +
>  		/* get the lcore_id for this port */
>  		while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
>  		       qconf->n_rx_queue == (unsigned)rx_queue_per_lcore) {
> --
> 1.8.3.1
  
Noa Ezra Dec. 20, 2018, 9:45 a.m. UTC | #3
Hi,
For some vendors (like Mellanox, which reflects the Linux behavior in this case), the Rx and Tx configurations must be the same; therefore it is not enough to configure max_rx_pkt_len to JUMBO_FRAME_MAX_SIZE, we also need to configure the MTU in order to receive large packets.
To avoid adding another option to the command line, we can configure the MTU to be equal to max_rx_pkt_len; this won't change the functionality of the test.

In addition, there are PMDs that need scatter-gather enabled in order to receive frames bigger than the mbuf size. We can add that configuration and avoid changing the mbuf size.

What do you think about this solution?

Thanks,
Noa.

  
Ananyev, Konstantin Dec. 20, 2018, 12:11 p.m. UTC | #4
> 
> Hi,
> In some vendores (like Mellanox, that reflects the linux behavior in this case) the Rx and Tx configurations must be the same, therefore it is
> not enough to configure the max_rx_pkt_len to JUMBO_FRAME_MAX_SIZE, we also need to configure the MTU size in order to receive
> large packets.

Interesting, didn't know that.
Does it mean that l3fwd and a bunch of other sample apps that use max_rx_pkt_len to set up the MTU
don't work properly with mlx devices (i.e. cannot run with jumbo frames enabled)?

> In order to avoid from adding another configuration to the command line, we can configure the MTU to be equal to  max_rx_pkt_len, and it
> won't change the functionality of the test.

I don't have any objections to a new option for the MTU size (-m) if you need it.
It was just unclear to me why it was necessary for you.
But yes, setting the MTU to a value correlated with max_rx_pkt_len sounds like a good idea.

> 
> In addition there are PMDs that need to enable the scatter-gather so it will be functional for RX frames bigger then mbuf size. We can add
> the configuration and avoid changing the mbuf size.

You mean: check if the MTU is bigger than the mbuf size, and if so set up
(DEV_RX_OFFLOAD_JUMBO_FRAME | DEV_RX_OFFLOAD_SCATTER)?
Konstantin
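
A minimal sketch of that idea, meant to slot into the port-init loop of examples/ip_fragmentation/main.c (variables `local_port_conf`, `dev_info`, `mbuf_size`, `portid`, `ret` as in the patch; offload and ether macro names per the DPDK 18.11 ethdev API). This illustrates the proposal under discussion, not the committed fix:

```c
/* Clamp max_rx_pkt_len to what the device supports (as the app
 * already does), then derive everything else from it. */
local_port_conf.rxmode.max_rx_pkt_len = RTE_MIN(
		dev_info.max_rx_pktlen,
		local_port_conf.rxmode.max_rx_pkt_len);

/* If a full frame may not fit into one mbuf data room, request
 * jumbo + scatter RX instead of forcing a bigger mbuf size. */
if (local_port_conf.rxmode.max_rx_pkt_len >
		(uint32_t)mbuf_size - RTE_PKTMBUF_HEADROOM)
	local_port_conf.rxmode.offloads |=
		DEV_RX_OFFLOAD_JUMBO_FRAME | DEV_RX_OFFLOAD_SCATTER;

/* Align the device MTU with max_rx_pkt_len; the MTU counts L2
 * payload, so subtract the Ethernet header and CRC. */
ret = rte_eth_dev_set_mtu(portid,
		local_port_conf.rxmode.max_rx_pkt_len
		- ETHER_HDR_LEN - ETHER_CRC_LEN);
if (ret < 0)
	rte_exit(EXIT_FAILURE, "Set MTU failed: err=%d, port=%d\n",
		ret, portid);
```

This keeps a single source of truth (max_rx_pkt_len) and removes the need for separate -b and -m options.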


  

Patch

diff --git a/doc/guides/sample_app_ug/ip_frag.rst b/doc/guides/sample_app_ug/ip_frag.rst
index 7914a97..13933c7 100644
--- a/doc/guides/sample_app_ug/ip_frag.rst
+++ b/doc/guides/sample_app_ug/ip_frag.rst
@@ -53,7 +53,7 @@  Application usage:
 
 .. code-block:: console
 
-    ./build/ip_fragmentation [EAL options] -- -p PORTMASK [-q NQ]
+    ./build/ip_fragmentation [EAL options] -- -p PORTMASK [-q NQ] [-b MBUFSIZE] [-m MTUSIZE]
 
 where:
 
@@ -61,6 +61,15 @@  where:
 
 *   -q NQ is the number of queue (=ports) per lcore (the default is 1)
 
+*   -b MBUFSIZE is the mbuf size in bytes (the default is 2048)
+
+*   -m MTUSIZE is the mtu size in bytes (the default is 1500)
+
+The MTU is the maximum size of a single data unit that can be transmitted over the network, therefore it must be
+greater than the requested max packet size, otherwise the NIC won't be able to get the packet.
+The mbuf is a buffer that is used by the DPDK application to store message buffers.If not using scatter then the
+mbuf size must be greater than the requested max packet size, otherwise the DPDK will not be able to receive the
+packet.
 To run the example in linuxapp environment with 2 lcores (2,4) over 2 ports(0,2) with 1 RX queue per lcore:
 
 .. code-block:: console
@@ -96,6 +105,13 @@  To run the example in linuxapp environment with 1 lcore (4) over 2 ports(0,2) wi
 
     ./build/ip_fragmentation -l 4 -n 3 -- -p 5 -q 2
 
+To run the example with defined MTU size 4000 bytes and mbuf size 9000 byes:
+
+.. code-block:: console
+
+    ./build/ip_fragmentation -l 4 -n 3 -- -p 5 -m 4000 -b 9000
+
+
 To test the application, flows should be set up in the flow generator that match the values in the
 l3fwd_ipv4_route_array and/or l3fwd_ipv6_route_array table.
 
diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 17a877d..0cf23b4 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -111,6 +111,10 @@ 
 
 static int rx_queue_per_lcore = 1;
 
+static int mbuf_size = RTE_MBUF_DEFAULT_BUF_SIZE;
+
+static int mtu_size = ETHER_MTU;
+
 #define MBUF_TABLE_SIZE  (2 * MAX(MAX_PKT_BURST, MAX_PACKET_FRAG))
 
 struct mbuf_table {
@@ -425,7 +429,6 @@  struct rte_lpm6_config lpm6_config = {
 		 * Read packet from RX queues
 		 */
 		for (i = 0; i < qconf->n_rx_queue; i++) {
-
 			portid = qconf->rx_queue_list[i].portid;
 			nb_rx = rte_eth_rx_burst(portid, 0, pkts_burst,
 						 MAX_PKT_BURST);
@@ -455,9 +458,11 @@  struct rte_lpm6_config lpm6_config = {
 static void
 print_usage(const char *prgname)
 {
-	printf("%s [EAL options] -- -p PORTMASK [-q NQ]\n"
+	printf("%s [EAL options] -- -p PORTMASK [-q NQ] [-b MBUFSIZE] [-m MTUSIZE]\n"
 	       "  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
-	       "  -q NQ: number of queue (=ports) per lcore (default is 1)\n",
+	       "  -q NQ: number of queue (=ports) per lcore (default is 1)\n"
+	       "  -b MBUFSIZE: set the data size of mbuf\n"
+	       "  -m MTUSIZE: set the MTU size\n",
 	       prgname);
 }
 
@@ -496,6 +501,38 @@  struct rte_lpm6_config lpm6_config = {
 	return n;
 }
 
+static int
+parse_mbufsize(const char *q_arg)
+{
+	char *end = NULL;
+	unsigned long mbuf;
+
+	/* parse hexadecimal string */
+	mbuf = strtoul(q_arg, &end, 10);
+	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+	if (mbuf == 0)
+		return -1;
+
+	return mbuf;
+}
+
+static int
+parse_mtusize(const char *q_arg)
+{
+	char *end = NULL;
+	unsigned long mtu;
+
+	/* parse hexadecimal string */
+	mtu = strtoul(q_arg, &end, 10);
+	if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+	if (mtu == 0)
+		return -1;
+
+	return mtu;
+}
+
 /* Parse the argument given in the command line of the application */
 static int
 parse_args(int argc, char **argv)
@@ -510,7 +547,7 @@  struct rte_lpm6_config lpm6_config = {
 
 	argvopt = argv;
 
-	while ((opt = getopt_long(argc, argvopt, "p:q:",
+	while ((opt = getopt_long(argc, argvopt, "p:q:b:m:",
 				  lgopts, &option_index)) != EOF) {
 
 		switch (opt) {
@@ -534,6 +571,26 @@  struct rte_lpm6_config lpm6_config = {
 			}
 			break;
 
+		/* mbufsize */
+		case 'b':
+			mbuf_size = parse_mbufsize(optarg);
+			if (mbuf_size < 0) {
+				printf("invalid mbuf size\n");
+				print_usage(prgname);
+				return -1;
+			}
+			break;
+
+		/* mtusize */
+		case 'm':
+			mtu_size = parse_mtusize(optarg);
+			if (mtu_size < 0) {
+				printf("invalid mtu size\n");
+				print_usage(prgname);
+				return -1;
+			}
+			break;
+
 		/* long options */
 		case 0:
 			print_usage(prgname);
@@ -777,9 +834,8 @@  struct rte_lpm6_config lpm6_config = {
 			RTE_LOG(INFO, IP_FRAG, "Creating direct mempool on socket %i\n",
 					socket);
 			snprintf(buf, sizeof(buf), "pool_direct_%i", socket);
-
 			mp = rte_pktmbuf_pool_create(buf, NB_MBUF, 32,
-				0, RTE_MBUF_DEFAULT_BUF_SIZE, socket);
+				0, mbuf_size, socket);
 			if (mp == NULL) {
 				RTE_LOG(ERR, IP_FRAG, "Cannot create direct mempool\n");
 				return -1;
@@ -892,6 +948,15 @@  struct rte_lpm6_config lpm6_config = {
 		    dev_info.max_rx_pktlen,
 		    local_port_conf.rxmode.max_rx_pkt_len);
 
+		/* set the mtu to the maximum received packet size */
+		ret = rte_eth_dev_set_mtu(portid, mtu_size);
+		if (ret < 0) {
+			printf("\n");
+			rte_exit(EXIT_FAILURE, "Set MTU failed: "
+				"err=%d, port=%d\n",
+			ret, portid);
+		}
+
 		/* get the lcore_id for this port */
 		while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
 		       qconf->n_rx_queue == (unsigned)rx_queue_per_lcore) {