[dpdk-dev,v3,7/9] net/virtio: Add MTU feature support

Message ID 20170312163406.17714-8-maxime.coquelin@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/checkpatch success coding style OK

Commit Message

Maxime Coquelin March 12, 2017, 4:34 p.m. UTC
  This patch implements support for the Virtio MTU feature.
When negotiated, the host shares its maximum supported MTU,
which is used as initial MTU and as maximum MTU the application
can set.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 doc/guides/nics/features/virtio.ini |  1 +
 drivers/net/virtio/virtio_ethdev.c  | 45 +++++++++++++++++++++++++++++++++++--
 drivers/net/virtio/virtio_ethdev.h  |  3 ++-
 drivers/net/virtio/virtio_pci.h     |  3 +++
 4 files changed, 49 insertions(+), 3 deletions(-)
  

Comments

Jianfeng Tan April 5, 2017, 4:52 a.m. UTC | #1
Hi Maxime,

Have some confusion about this feature. Please help confirm.

(1) With this feature, we only support to advertise MTU value which is 
defined by QEMU to frontend and backend driver separately. (2) But it 
does not allow frontend driver to set a different MTU to QEMU and then 
to vhost backend.

Correct?

If it's correct, why not MTU works like (2)?

Thanks,
Jianfeng
  
Maxime Coquelin April 5, 2017, 7:11 a.m. UTC | #2
Hi Jianfeng,

On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
> Hi Maxime,
>
> Have some confusion about this feature. Please help confirm.
>
> (1) With this feature, we only support to advertise MTU value which is
> defined by QEMU to frontend and backend driver separately. (2) But it
> does not allow frontend driver to set a different MTU to QEMU and then
> to vhost backend.
>
> Correct?
> If it's correct, why not MTU works like (2)?

Because idea is that the hosts advertises the maximum MTU value it
supports. The frontend driver is free to use a smaller value. The goal
of this change is to make possible to set a uniform MTU value across
the infrastructure, the management tools giving a hint to the guests on
the MTU value it should use.

Having the frontend setting the backend MTU dynamically would require 
some negotiation at runtime, and is out of the scope of this change.
Do you have some use cases for this?

Regards,
Maxime
> Thanks,
> Jianfeng
  
Jianfeng Tan April 5, 2017, 9:42 a.m. UTC | #3
Hi Maxime,

Thank you for replying.

On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
> Hi Jianfeng,
>
> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>> Hi Maxime,
>>
>> Have some confusion about this feature. Please help confirm.
>>
>> (1) With this feature, we only support to advertise MTU value which is
>> defined by QEMU to frontend and backend driver separately. (2) But it
>> does not allow frontend driver to set a different MTU to QEMU and then
>> to vhost backend.
>>
>> Correct?
>> If it's correct, why not MTU works like (2)?
>
> Because idea is that the hosts advertises the maximum MTU value it
> supports. The frontend driver is free to use a smaller value. The goal
> of this change is to make possible to set a uniform MTU value across
> the infrastructure, the management tools giving a hint to the guests on
> the MTU value it should use.

Based on that MTU is the maximum packet size that can be sent instead of 
that can be received:
(1) Why vhost (as a device) does not drop any packets which are longer 
than MTU when dequeue()?
(2) See some NICs also use MTU to calculate maximum packet size that can 
be received, like ixgbe, i40e, shall we also do that?

>
> Having the frontend setting the backend MTU dynamically would require 
> some negotiation at runtime, and is out of the scope of this change.
> Do you have some use cases for this?

No, just to understand more about this feature. Thank you!

Thanks,
Jianfeng

>
> Regards,
> Maxime
>> Thanks,
>> Jianfeng
  
Maxime Coquelin April 5, 2017, 1:54 p.m. UTC | #4
On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
> Hi Maxime,
>
> Thank you for replying.
>
> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
>> Hi Jianfeng,
>>
>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>>> Hi Maxime,
>>>
>>> Have some confusion about this feature. Please help confirm.
>>>
>>> (1) With this feature, we only support to advertise MTU value which is
>>> defined by QEMU to frontend and backend driver separately. (2) But it
>>> does not allow frontend driver to set a different MTU to QEMU and then
>>> to vhost backend.
>>>
>>> Correct?
>>> If it's correct, why not MTU works like (2)?
>>
>> Because idea is that the hosts advertises the maximum MTU value it
>> supports. The frontend driver is free to use a smaller value. The goal
>> of this change is to make possible to set a uniform MTU value across
>> the infrastructure, the management tools giving a hint to the guests on
>> the MTU value it should use.
>
> Based on that MTU is the maximum packet size that can be sent instead of
> that can be received:
> (1) Why vhost (as a device) does not drop any packets which are longer
> than MTU when dequeue()?
That's a good point.
As when MTU value is negotiated, the guest agrees not to send larger
packets. But we cannot trust the guest, so vhost needs to check the
packet length.

> (2) See some NICs also use MTU to calculate maximum packet size that can
> be received, like ixgbe, i40e, shall we also do that?
Can you give me some pointers to the code?

Thanks,
Maxime
  
Jianfeng Tan April 5, 2017, 2:50 p.m. UTC | #5
On 4/5/2017 9:54 PM, Maxime Coquelin wrote:
>
>
> On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
>> Hi Maxime,
>>
>> Thank you for replying.
>>
>> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
>>> Hi Jianfeng,
>>>
>>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>>>> Hi Maxime,
>>>>
>>>> Have some confusion about this feature. Please help confirm.
>>>>
>>>> (1) With this feature, we only support to advertise MTU value which is
>>>> defined by QEMU to frontend and backend driver separately. (2) But it
>>>> does not allow frontend driver to set a different MTU to QEMU and then
>>>> to vhost backend.
>>>>
>>>> Correct?
>>>> If it's correct, why not MTU works like (2)?
>>>
>>> Because idea is that the hosts advertises the maximum MTU value it
>>> supports. The frontend driver is free to use a smaller value. The goal
>>> of this change is to make possible to set a uniform MTU value across
>>> the infrastructure, the management tools giving a hint to the guests on
>>> the MTU value it should use.
>>
>> Based on that MTU is the maximum packet size that can be sent instead of
>> that can be received:
>> (1) Why vhost (as a device) does not drop any packets which are longer
>> than MTU when dequeue()?
> That's a good point.
> As when MTU value is negotiated, the guest agrees not to send larger
> packets. But we cannot trust the guest, so vhost needs to check the
> packet length.
>
>> (2) See some NICs also use MTU to calculate maximum packet size that can
>> be received, like ixgbe, i40e, shall we also do that?
> Can you give me some pointers to the code?

Please refer ixgbe_dev_mtu_set(), and i40e_dev_mtu_set().

Thanks,
Jianfeng
  
Maxime Coquelin April 7, 2017, 4:40 p.m. UTC | #6
Hi Jianfeng,

On 04/05/2017 04:50 PM, Tan, Jianfeng wrote:
>
>
> On 4/5/2017 9:54 PM, Maxime Coquelin wrote:
>>
>>
>> On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
>>> Hi Maxime,
>>>
>>> Thank you for replying.
>>>
>>> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
>>>> Hi Jianfeng,
>>>>
>>>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>>>>> Hi Maxime,
>>>>>
>>>>> Have some confusion about this feature. Please help confirm.
>>>>>
>>>>> (1) With this feature, we only support to advertise MTU value which is
>>>>> defined by QEMU to frontend and backend driver separately. (2) But it
>>>>> does not allow frontend driver to set a different MTU to QEMU and then
>>>>> to vhost backend.
>>>>>
>>>>> Correct?
>>>>> If it's correct, why not MTU works like (2)?
>>>>
>>>> Because idea is that the hosts advertises the maximum MTU value it
>>>> supports. The frontend driver is free to use a smaller value. The goal
>>>> of this change is to make possible to set a uniform MTU value across
>>>> the infrastructure, the management tools giving a hint to the guests on
>>>> the MTU value it should use.
>>>
>>> Based on that MTU is the maximum packet size that can be sent instead of
>>> that can be received:
>>> (1) Why vhost (as a device) does not drop any packets which are longer
>>> than MTU when dequeue()?
>> That's a good point.
>> As when MTU value is negotiated, the guest agrees not to send larger
>> packets. But we cannot trust the guest, so vhost needs to check the
>> packet length.
>>
>>> (2) See some NICs also use MTU to calculate maximum packet size that can
>>> be received, like ixgbe, i40e, shall we also do that?
>> Can you give me some pointers to the code?
>
> Please refer ixgbe_dev_mtu_set(), and i40e_dev_mtu_set().

Thanks. Yes, we could also do that.

I can send a patch for this and another one to check the size of the
packet respects negotiated MTU value. Or maybe you want to do this?

Regards,
Maxime
  
Jianfeng Tan April 10, 2017, 6:48 a.m. UTC | #7
> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Saturday, April 8, 2017 12:41 AM
> To: Tan, Jianfeng; aconole@redhat.com; sodey@sonusnet.com;
> yuanhan.liu@linux.intel.com; thomas.monjalon@6wind.com; dev@dpdk.org
> Subject: Re: [PATCH v3 7/9] net/virtio: Add MTU feature support
> 
> Hi Jianfeng,
> 
> On 04/05/2017 04:50 PM, Tan, Jianfeng wrote:
> >
> >
> > On 4/5/2017 9:54 PM, Maxime Coquelin wrote:
> >>
> >>
> >> On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
> >>> Hi Maxime,
> >>>
> >>> Thank you for replying.
> >>>
> >>> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
> >>>> Hi Jianfeng,
> >>>>
> >>>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
> >>>>> Hi Maxime,
> >>>>>
> >>>>> Have some confusion about this feature. Please help confirm.
> >>>>>
> >>>>> (1) With this feature, we only support to advertise MTU value which is
> >>>>> defined by QEMU to frontend and backend driver separately. (2) But
> it
> >>>>> does not allow frontend driver to set a different MTU to QEMU and
> then
> >>>>> to vhost backend.
> >>>>>
> >>>>> Correct?
> >>>>> If it's correct, why not MTU works like (2)?
> >>>>
> >>>> Because idea is that the hosts advertises the maximum MTU value it
> >>>> supports. The frontend driver is free to use a smaller value. The goal
> >>>> of this change is to make possible to set a uniform MTU value across
> >>>> the infrastructure, the management tools giving a hint to the guests on
> >>>> the MTU value it should use.
> >>>
> >>> Based on that MTU is the maximum packet size that can be sent instead
> of
> >>> that can be received:
> >>> (1) Why vhost (as a device) does not drop any packets which are longer
> >>> than MTU when dequeue()?
> >> That's a good point.
> >> As when MTU value is negotiated, the guest agrees not to send larger
> >> packets. But we cannot trust the guest, so vhost needs to check the
> >> packet length.
> >>
> >>> (2) See some NICs also use MTU to calculate maximum packet size that
> can
> >>> be received, like ixgbe, i40e, shall we also do that?
> >> Can you give me some pointers to the code?
> >
> > Please refer ixgbe_dev_mtu_set(), and i40e_dev_mtu_set().
> 
> Thanks. Yes, we could also do that.
> 
> I can send a patch for this and another one to check the size of the
> packet respects negotiated MTU value. Or maybe you want to do this?

I'll help review then.

Thanks,
Jianfeng
  

Patch

diff --git a/doc/guides/nics/features/virtio.ini b/doc/guides/nics/features/virtio.ini
index 84d2012..8e3aca1 100644
--- a/doc/guides/nics/features/virtio.ini
+++ b/doc/guides/nics/features/virtio.ini
@@ -25,3 +25,4 @@  ARMv8                = Y
 x86-32               = Y
 x86-64               = Y
 Usage doc            = Y
+MTU update           = Y
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 4dc03b9..fd1213c 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -721,10 +721,13 @@  virtio_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
 	uint32_t ether_hdr_len = ETHER_HDR_LEN + VLAN_TAG_LEN +
 				 hw->vtnet_hdr_size;
 	uint32_t frame_size = mtu + ether_hdr_len;
+	uint32_t max_frame_size = hw->max_mtu + ether_hdr_len;
 
-	if (mtu < ETHER_MIN_MTU || frame_size > VIRTIO_MAX_RX_PKTLEN) {
+	max_frame_size = RTE_MIN(max_frame_size, VIRTIO_MAX_RX_PKTLEN);
+
+	if (mtu < ETHER_MIN_MTU || frame_size > max_frame_size) {
 		PMD_INIT_LOG(ERR, "MTU should be between %d and %d",
-			ETHER_MIN_MTU, VIRTIO_MAX_RX_PKTLEN - ether_hdr_len);
+			ETHER_MIN_MTU, max_frame_size - ether_hdr_len);
 		return -EINVAL;
 	}
 	return 0;
@@ -1158,6 +1161,18 @@  virtio_negotiate_features(struct virtio_hw *hw, uint64_t req_features)
 	PMD_INIT_LOG(DEBUG, "host_features before negotiate = %" PRIx64,
 		host_features);
 
+	/* If supported, ensure MTU value is valid before acknowledging it. */
+	if (host_features & req_features & (1ULL << VIRTIO_NET_F_MTU)) {
+		struct virtio_net_config config;
+
+		vtpci_read_dev_config(hw,
+			offsetof(struct virtio_net_config, mtu),
+			&config.mtu, sizeof(config.mtu));
+
+		if (config.mtu < ETHER_MIN_MTU)
+			req_features &= ~(1ULL << VIRTIO_NET_F_MTU);
+	}
+
 	/*
 	 * Negotiate features: Subset of device feature bits are written back
 	 * guest feature bits.
@@ -1392,6 +1407,32 @@  virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
 
 		hw->max_queue_pairs = config->max_virtqueue_pairs;
 
+		if (vtpci_with_feature(hw, VIRTIO_NET_F_MTU)) {
+			vtpci_read_dev_config(hw,
+				offsetof(struct virtio_net_config, mtu),
+				&config->mtu,
+				sizeof(config->mtu));
+
+			/*
+			 * MTU value has already been checked at negotiation
+			 * time, but check again in case it has changed since
+			 * then, which should not happen.
+			 */
+			if (config->mtu < ETHER_MIN_MTU) {
+				PMD_INIT_LOG(ERR, "invalid max MTU value (%u)",
+						config->mtu);
+				return -1;
+			}
+
+			hw->max_mtu = config->mtu;
+			/* Set initial MTU to maximum one supported by vhost */
+			eth_dev->data->mtu = config->mtu;
+
+		} else {
+			hw->max_mtu = VIRTIO_MAX_RX_PKTLEN - ETHER_HDR_LEN -
+				VLAN_TAG_LEN - hw->vtnet_hdr_size;
+		}
+
 		PMD_INIT_LOG(DEBUG, "config->max_virtqueue_pairs=%d",
 				config->max_virtqueue_pairs);
 		PMD_INIT_LOG(DEBUG, "config->status=%d", config->status);
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index 777a14b..aa78adc 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -51,7 +51,7 @@ 
 #define VIRTIO_MAX_TX_QUEUES 128U
 #define VIRTIO_MAX_MAC_ADDRS 64
 #define VIRTIO_MIN_RX_BUFSIZE 64
-#define VIRTIO_MAX_RX_PKTLEN  9728
+#define VIRTIO_MAX_RX_PKTLEN  9728U
 
 /* Features desired/implemented by this driver. */
 #define VIRTIO_PMD_DEFAULT_GUEST_FEATURES	\
@@ -66,6 +66,7 @@ 
 	 1u << VIRTIO_NET_F_HOST_TSO4	  |	\
 	 1u << VIRTIO_NET_F_HOST_TSO6	  |	\
 	 1u << VIRTIO_NET_F_MRG_RXBUF	  |	\
+	 1u << VIRTIO_NET_F_MTU	| \
 	 1u << VIRTIO_RING_F_INDIRECT_DESC |    \
 	 1ULL << VIRTIO_F_VERSION_1       |	\
 	 1ULL << VIRTIO_F_IOMMU_PLATFORM)
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 59e45c4..771f5b1 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -106,6 +106,7 @@  struct virtnet_ctl;
 /* The feature bitmap for virtio net */
 #define VIRTIO_NET_F_CSUM	0	/* Host handles pkts w/ partial csum */
 #define VIRTIO_NET_F_GUEST_CSUM	1	/* Guest handles pkts w/ partial csum */
+#define VIRTIO_NET_F_MTU	3	/* Initial MTU advice. */
 #define VIRTIO_NET_F_MAC	5	/* Host has given MAC address. */
 #define VIRTIO_NET_F_GUEST_TSO4	7	/* Guest can handle TSOv4 in. */
 #define VIRTIO_NET_F_GUEST_TSO6	8	/* Guest can handle TSOv6 in. */
@@ -251,6 +252,7 @@  struct virtio_hw {
 	uint64_t    req_guest_features;
 	uint64_t    guest_features;
 	uint32_t    max_queue_pairs;
+	uint16_t	max_mtu;
 	uint16_t    vtnet_hdr_size;
 	uint8_t	    vlan_strip;
 	uint8_t	    use_msix;
@@ -296,6 +298,7 @@  struct virtio_net_config {
 	/* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
 	uint16_t   status;
 	uint16_t   max_virtqueue_pairs;
+	uint16_t   mtu;
 } __attribute__((packed));
 
 /*