[dpdk-dev,v3,7/9] net/virtio: Add MTU feature support
Checks
Commit Message
This patch implements support for the Virtio MTU feature.
When negotiated, the host shares its maximum supported MTU,
which is used as initial MTU and as maximum MTU the application
can set.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
doc/guides/nics/features/virtio.ini | 1 +
drivers/net/virtio/virtio_ethdev.c | 45 +++++++++++++++++++++++++++++++++++--
drivers/net/virtio/virtio_ethdev.h | 3 ++-
drivers/net/virtio/virtio_pci.h | 3 +++
4 files changed, 49 insertions(+), 3 deletions(-)
Comments
Hi Maxime,
Have some confusion about this feature. Please help confirm.
(1) With this feature, we only support to advertise MTU value which is
defined by QEMU to frontend and backend driver separately. (2) But it
does not allow frontend driver to set a different MTU to QEMU and then
to vhost backend.
Correct?
If it's correct, why not MTU works like (2)?
Thanks,
Jianfeng
Hi Jianfeng,
On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
> Hi Maxime,
>
> Have some confusion about this feature. Please help confirm.
>
> (1) With this feature, we only support to advertise MTU value which is
> defined by QEMU to frontend and backend driver separately. (2) But it
> does not allow frontend driver to set a different MTU to QEMU and then
> to vhost backend.
>
> Correct?
> If it's correct, why not MTU works like (2)?
Because idea is that the hosts advertises the maximum MTU value it
supports. The frontend driver is free to use a smaller value. The goal
of this change is to make possible to set a uniform MTU value across
the infrastructure, the management tools giving a hint to the guests on
the MTU value it should use.
Having the frontend setting the backend MTU dynamically would require
some negotiation at runtime, and is out of the scope of this change.
Do you have some use cases for this?
Regards,
Maxime
> Thanks,
> Jianfeng
Hi Maxime,
Thank you for replying.
On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
> Hi Jianfeng,
>
> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>> Hi Maxime,
>>
>> Have some confusion about this feature. Please help confirm.
>>
>> (1) With this feature, we only support to advertise MTU value which is
>> defined by QEMU to frontend and backend driver separately. (2) But it
>> does not allow frontend driver to set a different MTU to QEMU and then
>> to vhost backend.
>>
>> Correct?
>> If it's correct, why not MTU works like (2)?
>
> Because idea is that the hosts advertises the maximum MTU value it
> supports. The frontend driver is free to use a smaller value. The goal
> of this change is to make possible to set a uniform MTU value across
> the infrastructure, the management tools giving a hint to the guests on
> the MTU value it should use.
Based on that MTU is the maximum packet size that can be sent instead of
that can be received:
(1) Why vhost (as a device) does not drop any packets which are longer
than MTU when dequeue()?
(2) See some NICs also use MTU to calculate maximum packet size that can
be received, like ixgbe, i40e, shall we also do that?
>
> Having the frontend setting the backend MTU dynamically would require
> some negotiation at runtime, and is out of the scope of this change.
> Do you have some use cases for this?
No, just to understand more about this feature. Thank you!
Thanks,
Jianfeng
>
> Regards,
> Maxime
>> Thanks,
>> Jianfeng
On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
> Hi Maxime,
>
> Thank you for replying.
>
> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
>> Hi Jianfeng,
>>
>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>>> Hi Maxime,
>>>
>>> Have some confusion about this feature. Please help confirm.
>>>
>>> (1) With this feature, we only support to advertise MTU value which is
>>> defined by QEMU to frontend and backend driver separately. (2) But it
>>> does not allow frontend driver to set a different MTU to QEMU and then
>>> to vhost backend.
>>>
>>> Correct?
>>> If it's correct, why not MTU works like (2)?
>>
>> Because idea is that the hosts advertises the maximum MTU value it
>> supports. The frontend driver is free to use a smaller value. The goal
>> of this change is to make possible to set a uniform MTU value across
>> the infrastructure, the management tools giving a hint to the guests on
>> the MTU value it should use.
>
> Based on that MTU is the maximum packet size that can be sent instead of
> that can be received:
> (1) Why vhost (as a device) does not drop any packets which are longer
> than MTU when dequeue()?
That's a good point.
As when MTU value is negotiated, the guest agrees not to send larger
packets. But we cannot trust the guest, so vhost needs to check the
packet length.
> (2) See some NICs also use MTU to calculate maximum packet size that can
> be received, like ixgbe, i40e, shall we also do that?
Can you give me some pointers to the code?
Thanks,
Maxime
On 4/5/2017 9:54 PM, Maxime Coquelin wrote:
>
>
> On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
>> Hi Maxime,
>>
>> Thank you for replying.
>>
>> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
>>> Hi Jianfeng,
>>>
>>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>>>> Hi Maxime,
>>>>
>>>> Have some confusion about this feature. Please help confirm.
>>>>
>>>> (1) With this feature, we only support to advertise MTU value which is
>>>> defined by QEMU to frontend and backend driver separately. (2) But it
>>>> does not allow frontend driver to set a different MTU to QEMU and then
>>>> to vhost backend.
>>>>
>>>> Correct?
>>>> If it's correct, why not MTU works like (2)?
>>>
>>> Because idea is that the hosts advertises the maximum MTU value it
>>> supports. The frontend driver is free to use a smaller value. The goal
>>> of this change is to make possible to set a uniform MTU value across
>>> the infrastructure, the management tools giving a hint to the guests on
>>> the MTU value it should use.
>>
>> Based on that MTU is the maximum packet size that can be sent instead of
>> that can be received:
>> (1) Why vhost (as a device) does not drop any packets which are longer
>> than MTU when dequeue()?
> That's a good point.
> As when MTU value is negotiated, the guest agrees not to send larger
> packets. But we cannot trust the guest, so vhost needs to check the
> packet length.
>
>> (2) See some NICs also use MTU to calculate maximum packet size that can
>> be received, like ixgbe, i40e, shall we also do that?
> Can you give me some pointers to the code?
Please refer ixgbe_dev_mtu_set(), and i40e_dev_mtu_set().
Thanks,
Jianfeng
Hi Jianfeng,
On 04/05/2017 04:50 PM, Tan, Jianfeng wrote:
>
>
> On 4/5/2017 9:54 PM, Maxime Coquelin wrote:
>>
>>
>> On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
>>> Hi Maxime,
>>>
>>> Thank you for replying.
>>>
>>> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
>>>> Hi Jianfeng,
>>>>
>>>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
>>>>> Hi Maxime,
>>>>>
>>>>> Have some confusion about this feature. Please help confirm.
>>>>>
>>>>> (1) With this feature, we only support to advertise MTU value which is
>>>>> defined by QEMU to frontend and backend driver separately. (2) But it
>>>>> does not allow frontend driver to set a different MTU to QEMU and then
>>>>> to vhost backend.
>>>>>
>>>>> Correct?
>>>>> If it's correct, why not MTU works like (2)?
>>>>
>>>> Because idea is that the hosts advertises the maximum MTU value it
>>>> supports. The frontend driver is free to use a smaller value. The goal
>>>> of this change is to make possible to set a uniform MTU value across
>>>> the infrastructure, the management tools giving a hint to the guests on
>>>> the MTU value it should use.
>>>
>>> Based on that MTU is the maximum packet size that can be sent instead of
>>> that can be received:
>>> (1) Why vhost (as a device) does not drop any packets which are longer
>>> than MTU when dequeue()?
>> That's a good point.
>> As when MTU value is negotiated, the guest agrees not to send larger
>> packets. But we cannot trust the guest, so vhost needs to check the
>> packet length.
>>
>>> (2) See some NICs also use MTU to calculate maximum packet size that can
>>> be received, like ixgbe, i40e, shall we also do that?
>> Can you give me some pointers to the code?
>
> Please refer ixgbe_dev_mtu_set(), and i40e_dev_mtu_set().
Thanks. Yes, we could also do that.
I can send a patch for this and another one to check the size of the
packet respects negotiated MTU value. Or maybe you want to do this?
Regards,
Maxime
> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Saturday, April 8, 2017 12:41 AM
> To: Tan, Jianfeng; aconole@redhat.com; sodey@sonusnet.com;
> yuanhan.liu@linux.intel.com; thomas.monjalon@6wind.com; dev@dpdk.org
> Subject: Re: [PATCH v3 7/9] net/virtio: Add MTU feature support
>
> Hi Jianfeng,
>
> On 04/05/2017 04:50 PM, Tan, Jianfeng wrote:
> >
> >
> > On 4/5/2017 9:54 PM, Maxime Coquelin wrote:
> >>
> >>
> >> On 04/05/2017 11:42 AM, Tan, Jianfeng wrote:
> >>> Hi Maxime,
> >>>
> >>> Thank you for replying.
> >>>
> >>> On 4/5/2017 3:11 PM, Maxime Coquelin wrote:
> >>>> Hi Jianfeng,
> >>>>
> >>>> On 04/05/2017 06:52 AM, Tan, Jianfeng wrote:
> >>>>> Hi Maxime,
> >>>>>
> >>>>> Have some confusion about this feature. Please help confirm.
> >>>>>
> >>>>> (1) With this feature, we only support to advertise MTU value which is
> >>>>> defined by QEMU to frontend and backend driver separately. (2) But
> it
> >>>>> does not allow frontend driver to set a different MTU to QEMU and
> then
> >>>>> to vhost backend.
> >>>>>
> >>>>> Correct?
> >>>>> If it's correct, why not MTU works like (2)?
> >>>>
> >>>> Because idea is that the hosts advertises the maximum MTU value it
> >>>> supports. The frontend driver is free to use a smaller value. The goal
> >>>> of this change is to make possible to set a uniform MTU value across
> >>>> the infrastructure, the management tools giving a hint to the guests on
> >>>> the MTU value it should use.
> >>>
> >>> Based on that MTU is the maximum packet size that can be sent instead
> of
> >>> that can be received:
> >>> (1) Why vhost (as a device) does not drop any packets which are longer
> >>> than MTU when dequeue()?
> >> That's a good point.
> >> As when MTU value is negotiated, the guest agrees not to send larger
> >> packets. But we cannot trust the guest, so vhost needs to check the
> >> packet length.
> >>
> >>> (2) See some NICs also use MTU to calculate maximum packet size that
> can
> >>> be received, like ixgbe, i40e, shall we also do that?
> >> Can you give me some pointers to the code?
> >
> > Please refer ixgbe_dev_mtu_set(), and i40e_dev_mtu_set().
>
> Thanks. Yes, we could also do that.
>
> I can send a patch for this and another one to check the size of the
> packet respects negotiated MTU value. Or maybe you want to do this?
I'll help review then.
Thanks,
Jianfeng
@@ -25,3 +25,4 @@ ARMv8 = Y
x86-32 = Y
x86-64 = Y
Usage doc = Y
+MTU update = Y
@@ -721,10 +721,13 @@ virtio_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
uint32_t ether_hdr_len = ETHER_HDR_LEN + VLAN_TAG_LEN +
hw->vtnet_hdr_size;
uint32_t frame_size = mtu + ether_hdr_len;
+ uint32_t max_frame_size = hw->max_mtu + ether_hdr_len;
- if (mtu < ETHER_MIN_MTU || frame_size > VIRTIO_MAX_RX_PKTLEN) {
+ max_frame_size = RTE_MIN(max_frame_size, VIRTIO_MAX_RX_PKTLEN);
+
+ if (mtu < ETHER_MIN_MTU || frame_size > max_frame_size) {
PMD_INIT_LOG(ERR, "MTU should be between %d and %d",
- ETHER_MIN_MTU, VIRTIO_MAX_RX_PKTLEN - ether_hdr_len);
+ ETHER_MIN_MTU, max_frame_size - ether_hdr_len);
return -EINVAL;
}
return 0;
@@ -1158,6 +1161,18 @@ virtio_negotiate_features(struct virtio_hw *hw, uint64_t req_features)
PMD_INIT_LOG(DEBUG, "host_features before negotiate = %" PRIx64,
host_features);
+ /* If supported, ensure MTU value is valid before acknowledging it. */
+ if (host_features & req_features & (1ULL << VIRTIO_NET_F_MTU)) {
+ struct virtio_net_config config;
+
+ vtpci_read_dev_config(hw,
+ offsetof(struct virtio_net_config, mtu),
+ &config.mtu, sizeof(config.mtu));
+
+ if (config.mtu < ETHER_MIN_MTU)
+ req_features &= ~(1ULL << VIRTIO_NET_F_MTU);
+ }
+
/*
* Negotiate features: Subset of device feature bits are written back
* guest feature bits.
@@ -1392,6 +1407,32 @@ virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
hw->max_queue_pairs = config->max_virtqueue_pairs;
+ if (vtpci_with_feature(hw, VIRTIO_NET_F_MTU)) {
+ vtpci_read_dev_config(hw,
+ offsetof(struct virtio_net_config, mtu),
+ &config->mtu,
+ sizeof(config->mtu));
+
+ /*
+ * MTU value has already been checked at negotiation
+ * time, but check again in case it has changed since
+ * then, which should not happen.
+ */
+ if (config->mtu < ETHER_MIN_MTU) {
+ PMD_INIT_LOG(ERR, "invalid max MTU value (%u)",
+ config->mtu);
+ return -1;
+ }
+
+ hw->max_mtu = config->mtu;
+ /* Set initial MTU to maximum one supported by vhost */
+ eth_dev->data->mtu = config->mtu;
+
+ } else {
+ hw->max_mtu = VIRTIO_MAX_RX_PKTLEN - ETHER_HDR_LEN -
+ VLAN_TAG_LEN - hw->vtnet_hdr_size;
+ }
+
PMD_INIT_LOG(DEBUG, "config->max_virtqueue_pairs=%d",
config->max_virtqueue_pairs);
PMD_INIT_LOG(DEBUG, "config->status=%d", config->status);
@@ -51,7 +51,7 @@
#define VIRTIO_MAX_TX_QUEUES 128U
#define VIRTIO_MAX_MAC_ADDRS 64
#define VIRTIO_MIN_RX_BUFSIZE 64
-#define VIRTIO_MAX_RX_PKTLEN 9728
+#define VIRTIO_MAX_RX_PKTLEN 9728U
/* Features desired/implemented by this driver. */
#define VIRTIO_PMD_DEFAULT_GUEST_FEATURES \
@@ -66,6 +66,7 @@
1u << VIRTIO_NET_F_HOST_TSO4 | \
1u << VIRTIO_NET_F_HOST_TSO6 | \
1u << VIRTIO_NET_F_MRG_RXBUF | \
+ 1u << VIRTIO_NET_F_MTU | \
1u << VIRTIO_RING_F_INDIRECT_DESC | \
1ULL << VIRTIO_F_VERSION_1 | \
1ULL << VIRTIO_F_IOMMU_PLATFORM)
@@ -106,6 +106,7 @@ struct virtnet_ctl;
/* The feature bitmap for virtio net */
#define VIRTIO_NET_F_CSUM 0 /* Host handles pkts w/ partial csum */
#define VIRTIO_NET_F_GUEST_CSUM 1 /* Guest handles pkts w/ partial csum */
+#define VIRTIO_NET_F_MTU 3 /* Initial MTU advice. */
#define VIRTIO_NET_F_MAC 5 /* Host has given MAC address. */
#define VIRTIO_NET_F_GUEST_TSO4 7 /* Guest can handle TSOv4 in. */
#define VIRTIO_NET_F_GUEST_TSO6 8 /* Guest can handle TSOv6 in. */
@@ -251,6 +252,7 @@ struct virtio_hw {
uint64_t req_guest_features;
uint64_t guest_features;
uint32_t max_queue_pairs;
+ uint16_t max_mtu;
uint16_t vtnet_hdr_size;
uint8_t vlan_strip;
uint8_t use_msix;
@@ -296,6 +298,7 @@ struct virtio_net_config {
/* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
uint16_t status;
uint16_t max_virtqueue_pairs;
+ uint16_t mtu;
} __attribute__((packed));
/*