[v5] vhost: add support for large buffers
Commit Message
rte_vhost_dequeue_burst() supports two ways of dequeuing data.
If the data fits into a buffer, then all data is copied and a
single linear buffer is returned. Otherwise it allocates
additional mbufs and chains them together to return a
multi-segment mbuf.
While that covers most use cases, it forces applications that
need to work with larger data sizes to support multi-segment
mbufs. The non-linear characteristic brings complexity and
performance implications to the application.
To resolve the issue, add support for attaching an external
buffer to a pktmbuf, and let the application indicate during
registration whether attaching an external buffer to a pktmbuf
is supported and whether only linear buffers are supported.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
---
doc/guides/prog_guide/vhost_lib.rst | 35 +++++++++
lib/librte_vhost/rte_vhost.h | 4 +
lib/librte_vhost/socket.c | 22 ++++++
lib/librte_vhost/vhost.c | 22 ++++++
lib/librte_vhost/vhost.h | 4 +
lib/librte_vhost/virtio_net.c | 109 ++++++++++++++++++++++++----
6 files changed, 182 insertions(+), 14 deletions(-)
- Changelog:
v5:
- fixed to destroy mutex if incompatible flags
v4:
- allow to use pktmbuf if there is exact space
- removed log message if the buffer is too big
- fixed the length to include align padding
- free allocated buf if shinfo fails
v3:
- prevent the new features to be used with zero copy
- fixed sizeof() usage
- fixed log msg indentation
- removed/replaced asserts
- used the correct virt2iova function
- fixed the patch's title
- OvS PoC code:
https://github.com/fleitner/ovs/tree/rte_malloc-v3
v2:
- Used rte_malloc() instead of another mempool as suggested by Shahaf.
- Added the documentation section.
- Using driver registration to negotiate the features.
- OvS PoC code:
https://github.com/fleitner/ovs/commit/8fc197c40b1d4fda331686a7b919e9e2b670dda7
Comments
Hi Flavio,
On 10/15/19 8:59 PM, Flavio Leitner wrote:
> [...]
I went through the patch, and it looks good to me.
I appreciate the fact that there is no rte_vhost_dequeue_burst API
change.
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 10/15/19 8:59 PM, Flavio Leitner wrote:
> [...]
Applied to dpdk-next-virtio/master.
Thanks,
Maxime
On 16.10.2019 13:13, Maxime Coquelin wrote:
>> [...]
>
> Applied to dpdk-next-virtio/master.
Thanks Maxime,
But can we return back the mbuf allocation failure message?
I mean:
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 66f0c7206..f8af4e0b3 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1354,8 +1354,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net *dev, struct rte_mempool *mp,
{
struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp);
- if (unlikely(pkt == NULL))
+ if (unlikely(pkt == NULL)) {
+ RTE_LOG(ERR, VHOST_DATA,
+ "Failed to allocate memory for mbuf.\n");
return NULL;
+ }
if (rte_pktmbuf_tailroom(pkt) >= data_len)
return pkt;
---
It's a hard failure that highlights some significant issues with
memory pool size or a mbuf leak.
We still have the message for subsequent chained mbufs, but not
for the first one. Without this error message we could never
notice that something is wrong with our memory pool. Only the
network traffic will stop flowing.
The message was very useful previously while catching the root
cause of the mbuf leak in OVS.
I could send a separate patch for this if needed.
Best regards, Ilya Maximets.
On 10/16/19 3:32 PM, Ilya Maximets wrote:
> On 16.10.2019 13:13, Maxime Coquelin wrote:
>> On 10/15/19 8:59 PM, Flavio Leitner wrote:
>>> [...]
>>
>> Applied to dpdk-next-virtio/master.
>
> Thanks Maxime,
>
> But can we return back the mbuf allocation failure message?
Good point, I missed it.
> I mean:
>
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 66f0c7206..f8af4e0b3 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1354,8 +1354,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net *dev,
> struct rte_mempool *mp,
> {
> struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp);
>
> - if (unlikely(pkt == NULL))
> + if (unlikely(pkt == NULL)) {
> + RTE_LOG(ERR, VHOST_DATA,
> + "Failed to allocate memory for mbuf.\n");
> return NULL;
> + }
>
> if (rte_pktmbuf_tailroom(pkt) >= data_len)
> return pkt;
> ---
>
> It's a hard failure that highlights some significant issues with
> memory pool size or a mbuf leak.
I agree, we need this error message.
> We still have the message for subsequent chained mbufs, but not
> for the first one. Without this error message we could never
> notice that something is wrong with our memory pool. Only the
> network traffic will stop flowing.
> The message was very useful previously while catching the root
> cause of the mbuf leak in OVS.
>
> I could send a separate patch for this if needed.
We need a separate patch.
If you want to do it, please do! Otherwise I'll do it.
Thanks!
Maxime
> Best regards, Ilya Maximets.
On Wed, 16 Oct 2019 15:46:15 +0200
Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
> On 10/16/19 3:32 PM, Ilya Maximets wrote:
> > On 16.10.2019 13:13, Maxime Coquelin wrote:
> >>> [...]
> >>
> >> Applied to dpdk-next-virtio/master.
> >
> > Thanks Maxime,
> >
> > But can we return back the mbuf allocation failure message?
>
> Good point, I missed it.
>
> > [...]
>
> I agree, we need this error message.
If it runs out of memory and there are many packets coming, then it
will flood the logs. Maybe we could use something to report in a more
moderated way, for example:
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index da69ab1db..b1572b3a4 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1354,8 +1354,14 @@ virtio_dev_pktmbuf_alloc(struct virtio_net *dev, struct rte_mempool *mp,
{
struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp);
- if (unlikely(pkt == NULL))
+ if (unlikely(pkt == NULL)) {
+ static int rate_lim = 0;
+
+ if (rate_lim++ % 1000 == 0)
+ RTE_LOG...
+
return NULL;
+ }
if (rte_pktmbuf_tailroom(pkt) >= data_len)
return pkt;
Or track this at mempool level and keep stats of failed requests and
report that in OvS. That would cover other points too.
fbl
On 16.10.2019 15:46, Maxime Coquelin wrote:
>
>
> On 10/16/19 3:32 PM, Ilya Maximets wrote:
>> On 16.10.2019 13:13, Maxime Coquelin wrote:
>>>> [...]
>>>
>>> Applied to dpdk-next-virtio/master.
>>
>> Thanks Maxime,
>>
>> But can we return back the mbuf allocation failure message?
>
> Good point, I missed it.
>
>> [...]
>>
>> I could send a separate patch for this if needed.
>
> We need a separate patch.
> If you want to do it, please do! Otherwise I'll do it.
OK. I'll send a patch soon.
Best regards, Ilya Maximets.
On 16.10.2019 16:02, Flavio Leitner wrote:
> On Wed, 16 Oct 2019 15:46:15 +0200
> Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
>
>> On 10/16/19 3:32 PM, Ilya Maximets wrote:
>>> [...]
>>
>> I agree, we need this error message.
>
> If it runs out of memory and there are many packets coming, then it
> will flood the logs. Maybe we could use something to report in a more
> moderated way, for example:
>
> [...]
>
>
> Or track this at mempool level and keep stats of failed requests and
> report that in OvS. That would cover other points too.
OVS is already rate-limiting DPDK logs. Basically, I introduced this
functionality to limit exactly this message.
See: 9fd38f6867df ("netdev-dpdk: Limit rate of DPDK logs.")
Best regards, Ilya Maximets.
On Wed, 16 Oct 2019 16:08:54 +0200
Ilya Maximets <i.maximets@ovn.org> wrote:
> On 16.10.2019 16:02, Flavio Leitner wrote:
> > [...]
>
> OVS is already rate-limiting DPDK logs. Basically, I introduced this
> functionality to limit exactly this message.
> See: 9fd38f6867df ("netdev-dpdk: Limit rate of DPDK logs.")
Thanks, I missed that commit.
fbl
On Tue, Oct 15, 2019 at 9:00 PM Flavio Leitner <fbl@sysclose.org> wrote:
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 5b85b832d..da69ab1db 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1289,6 +1289,93 @@ get_zmbuf(struct vhost_virtqueue *vq)
> return NULL;
> }
>
> +static void
> +virtio_dev_extbuf_free(void *addr __rte_unused, void *opaque)
> +{
> + rte_free(opaque);
> +}
> +
> +static int
> +virtio_dev_extbuf_alloc(struct rte_mbuf *pkt, uint32_t size)
> +{
> + struct rte_mbuf_ext_shared_info *shinfo = NULL;
> + uint32_t total_len = RTE_PKTMBUF_HEADROOM + size;
> + uint16_t buf_len;
> + rte_iova_t iova;
> + void *buf;
> +
> + /* Try to use pkt buffer to store shinfo to reduce the amount of memory
> + * required, otherwise store shinfo in the new buffer.
> + */
> + if (rte_pktmbuf_tailroom(pkt) >= sizeof(*shinfo))
> + shinfo = rte_pktmbuf_mtod(pkt,
> + struct rte_mbuf_ext_shared_info *);
> + else {
> + total_len += sizeof(*shinfo) + sizeof(uintptr_t);
> + total_len = RTE_ALIGN_CEIL(total_len, sizeof(uintptr_t));
> + }
> +
> + if (unlikely(total_len > UINT16_MAX))
> + return -ENOSPC;
> +
> + buf_len = total_len;
> + buf = rte_malloc(NULL, buf_len, RTE_CACHE_LINE_SIZE);
Using rte_malloc() means that the allocation can end up on any numa node.
This external buffer might end up on a different node than the mbuf
(which resides on mp->socket_id node).
On Tue, 29 Oct 2019 10:02:57 +0100
David Marchand <david.marchand@redhat.com> wrote:
> On Tue, Oct 15, 2019 at 9:00 PM Flavio Leitner <fbl@sysclose.org>
> wrote:
> > diff --git a/lib/librte_vhost/virtio_net.c
> > b/lib/librte_vhost/virtio_net.c index 5b85b832d..da69ab1db 100644
> > --- a/lib/librte_vhost/virtio_net.c
> > +++ b/lib/librte_vhost/virtio_net.c
> > @@ -1289,6 +1289,93 @@ get_zmbuf(struct vhost_virtqueue *vq)
> > return NULL;
> > }
> >
> > +static void
> > +virtio_dev_extbuf_free(void *addr __rte_unused, void *opaque)
> > +{
> > + rte_free(opaque);
> > +}
> > +
> > +static int
> > +virtio_dev_extbuf_alloc(struct rte_mbuf *pkt, uint32_t size)
> > +{
> > + struct rte_mbuf_ext_shared_info *shinfo = NULL;
> > + uint32_t total_len = RTE_PKTMBUF_HEADROOM + size;
> > + uint16_t buf_len;
> > + rte_iova_t iova;
> > + void *buf;
> > +
> > + /* Try to use pkt buffer to store shinfo to reduce the amount of memory
> > + * required, otherwise store shinfo in the new buffer.
> > + */
> > + if (rte_pktmbuf_tailroom(pkt) >= sizeof(*shinfo))
> > + shinfo = rte_pktmbuf_mtod(pkt,
> > + struct rte_mbuf_ext_shared_info *);
> > + else {
> > + total_len += sizeof(*shinfo) + sizeof(uintptr_t);
> > + total_len = RTE_ALIGN_CEIL(total_len, sizeof(uintptr_t));
> > + }
> > +
> > + if (unlikely(total_len > UINT16_MAX))
> > + return -ENOSPC;
> > +
> > + buf_len = total_len;
> > + buf = rte_malloc(NULL, buf_len, RTE_CACHE_LINE_SIZE);
>
> Using rte_malloc() means that the allocation can end up on any numa
> node. This external buffer might end up on a different node than the
> mbuf (which resides on mp->socket_id node).
Only if there is no memory in the local socket. It seems better than a
failure to me.
fbl
On Tue, Oct 29, 2019 at 1:21 PM Flavio Leitner <fbl@sysclose.org> wrote:
> On Tue, 29 Oct 2019 10:02:57 +0100
> David Marchand <david.marchand@redhat.com> wrote:
> > Using rte_malloc() means that the allocation can end up on any numa
> > node. This external buffer might end up on a different node than the
> > mbuf (which resides on mp->socket_id node).
>
> Only if there is no memory in the local socket. It seems better than a
> failure to me.
This won't fail, but you don't know how it will impact the application.
We are moving from a model where the vhost library used a mempool with
a fixed number of buffers on a known socket to a model where we can't
guarantee how much memory vhost will eat, and where the allocations
happen.
I am just scared we will get bitten by random allocation failures on a
system that was "used to run fine".
--
David Marchand
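For reference, the mitigation implied by David's concern would be to request the external buffer on the mbuf's own node first. DPDK provides rte_malloc_socket(), which takes an explicit socket_id (with SOCKET_ID_ANY meaning no preference). The stand-alone sketch below only illustrates the prefer-then-fall-back policy with a hypothetical allocator stub (fake_malloc_socket, where node 1 is artificially "full"); it is not the patch's actual code.

```c
#include <assert.h>
#include <stdlib.h>

#define SOCKET_ID_ANY -1

/* Hypothetical stand-in for rte_malloc_socket(): returns NULL when the
 * requested node is out of memory. Node 1 is "full" here so the
 * fallback path gets exercised. */
static void *
fake_malloc_socket(size_t size, int socket)
{
	if (socket == 1)
		return NULL; /* simulate an exhausted NUMA node */
	return malloc(size);
}

/* Prefer the mbuf's node, then fall back to any node, mirroring the
 * behavior discussed above. */
static void *
alloc_extbuf_near(size_t size, int preferred_socket)
{
	void *buf = fake_malloc_socket(size, preferred_socket);

	if (buf == NULL)
		buf = fake_malloc_socket(size, SOCKET_ID_ANY);
	return buf;
}
```

The real trade-off is the one debated in this thread: pinning the allocation to mp->socket_id keeps data local but can fail under node-local memory pressure, while falling back keeps the dequeue path working at a possible cross-node access cost.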
@@ -117,6 +117,41 @@ The following is an overview of some key Vhost API functions:
Enabling this flag should only be done when the calling application does
not pre-fault the guest shared memory, otherwise migration would fail.
+ - ``RTE_VHOST_USER_LINEARBUF_SUPPORT``
+
+ Enabling this flag forces the vhost dequeue function to provide only linear
+ pktmbufs (no multi-segmented pktmbufs).
+
+ The vhost library by default provides a single pktmbuf for a given
+ packet, but if for some reason the data doesn't fit into a single
+ pktmbuf (e.g., TSO is enabled), the library will allocate additional
+ pktmbufs from the same mempool and chain them together to create a
+ multi-segmented pktmbuf.
+
+ However, the vhost application needs to support the multi-segmented format.
+ If the vhost application does not support that format and requires large
+ buffers to be dequeued, this flag should be enabled to force only linear
+ buffers (see RTE_VHOST_USER_EXTBUF_SUPPORT) or drop the packet.
+
+ It is disabled by default.
+
+ - ``RTE_VHOST_USER_EXTBUF_SUPPORT``
+
+ Enabling this flag allows the vhost dequeue function to allocate and attach
+ an external buffer to a pktmbuf if the pktmbuf doesn't provide enough
+ space to store all data.
+
+ This is useful when the vhost application wants to support large packets
+ but doesn't want to increase the default mempool object size nor to
+ support multi-segmented mbufs (non-linear). In this case, a fresh buffer
+ is allocated using rte_malloc() which gets attached to a pktmbuf using
+ rte_pktmbuf_attach_extbuf().
+
+ See RTE_VHOST_USER_LINEARBUF_SUPPORT as well to disable multi-segmented
+ mbufs for applications that don't support chained mbufs.
+
+ It is disabled by default.
+
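Taken together, an application that wants large packets delivered as single linear mbufs would pass both flags to rte_vhost_driver_register(). The self-contained sketch below reproduces the patch's flag bit definitions and the zero-copy compatibility rule the patch adds to registration; the helper function is illustrative only, not part of the library API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as defined in rte_vhost.h by this patch. */
#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY (1ULL << 2)
#define RTE_VHOST_USER_EXTBUF_SUPPORT    (1ULL << 5)
#define RTE_VHOST_USER_LINEARBUF_SUPPORT (1ULL << 6)

/* Mirrors the check added to rte_vhost_driver_register(): dequeue
 * zero copy cannot be combined with either of the new flags. Returns
 * true when the combination is acceptable. */
static bool
vhost_flags_compatible(uint64_t flags)
{
	if (flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY) {
		if (flags & RTE_VHOST_USER_EXTBUF_SUPPORT)
			return false;
		if (flags & RTE_VHOST_USER_LINEARBUF_SUPPORT)
			return false;
	}
	return true;
}
```

A typical registration for linear-only large buffers would then be along the lines of `rte_vhost_driver_register(path, RTE_VHOST_USER_EXTBUF_SUPPORT | RTE_VHOST_USER_LINEARBUF_SUPPORT)`.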
* ``rte_vhost_driver_set_features(path, features)``
This function sets the feature bits the vhost-user driver supports. The
@@ -30,6 +30,10 @@ extern "C" {
#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY (1ULL << 2)
#define RTE_VHOST_USER_IOMMU_SUPPORT (1ULL << 3)
#define RTE_VHOST_USER_POSTCOPY_SUPPORT (1ULL << 4)
+/* support mbuf with external buffer attached */
+#define RTE_VHOST_USER_EXTBUF_SUPPORT (1ULL << 5)
+/* support only linear buffers (no chained mbufs) */
+#define RTE_VHOST_USER_LINEARBUF_SUPPORT (1ULL << 6)
/** Protocol features. */
#ifndef VHOST_USER_PROTOCOL_F_MQ
@@ -40,6 +40,8 @@ struct vhost_user_socket {
bool dequeue_zero_copy;
bool iommu_support;
bool use_builtin_virtio_net;
+ bool extbuf;
+ bool linearbuf;
/*
* The "supported_features" indicates the feature bits the
@@ -232,6 +234,12 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
if (vsocket->dequeue_zero_copy)
vhost_enable_dequeue_zero_copy(vid);
+ if (vsocket->extbuf)
+ vhost_enable_extbuf(vid);
+
+ if (vsocket->linearbuf)
+ vhost_enable_linearbuf(vid);
+
RTE_LOG(INFO, VHOST_CONFIG, "new device, handle is %d\n", vid);
if (vsocket->notify_ops->new_connection) {
@@ -870,6 +878,8 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
goto out_free;
}
vsocket->dequeue_zero_copy = flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
+ vsocket->extbuf = flags & RTE_VHOST_USER_EXTBUF_SUPPORT;
+ vsocket->linearbuf = flags & RTE_VHOST_USER_LINEARBUF_SUPPORT;
/*
* Set the supported features correctly for the builtin vhost-user
@@ -894,6 +904,18 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
* not compatible with postcopy.
*/
if (vsocket->dequeue_zero_copy) {
+ if (vsocket->extbuf) {
+ RTE_LOG(ERR, VHOST_CONFIG,
+ "error: zero copy is incompatible with external buffers\n");
+ ret = -1;
+ goto out_mutex;
+ }
+ if (vsocket->linearbuf) {
+ RTE_LOG(ERR, VHOST_CONFIG,
+ "error: zero copy is incompatible with linear buffers\n");
+ ret = -1;
+ goto out_mutex;
+ }
vsocket->supported_features &= ~(1ULL << VIRTIO_F_IN_ORDER);
vsocket->features &= ~(1ULL << VIRTIO_F_IN_ORDER);
@@ -605,6 +605,28 @@ vhost_set_builtin_virtio_net(int vid, bool enable)
dev->flags &= ~VIRTIO_DEV_BUILTIN_VIRTIO_NET;
}
+void
+vhost_enable_extbuf(int vid)
+{
+ struct virtio_net *dev = get_device(vid);
+
+ if (dev == NULL)
+ return;
+
+ dev->extbuf = 1;
+}
+
+void
+vhost_enable_linearbuf(int vid)
+{
+ struct virtio_net *dev = get_device(vid);
+
+ if (dev == NULL)
+ return;
+
+ dev->linearbuf = 1;
+}
+
int
rte_vhost_get_mtu(int vid, uint16_t *mtu)
{
@@ -302,6 +302,8 @@ struct virtio_net {
rte_atomic16_t broadcast_rarp;
uint32_t nr_vring;
int dequeue_zero_copy;
+ int extbuf;
+ int linearbuf;
struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
char ifname[IF_NAME_SZ];
@@ -476,6 +478,8 @@ void vhost_attach_vdpa_device(int vid, int did);
void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
void vhost_enable_dequeue_zero_copy(int vid);
void vhost_set_builtin_virtio_net(int vid, bool enable);
+void vhost_enable_extbuf(int vid);
+void vhost_enable_linearbuf(int vid);
struct vhost_device_ops const *vhost_driver_callback_get(const char *path);
@@ -1289,6 +1289,93 @@ get_zmbuf(struct vhost_virtqueue *vq)
return NULL;
}
+static void
+virtio_dev_extbuf_free(void *addr __rte_unused, void *opaque)
+{
+ rte_free(opaque);
+}
+
+static int
+virtio_dev_extbuf_alloc(struct rte_mbuf *pkt, uint32_t size)
+{
+ struct rte_mbuf_ext_shared_info *shinfo = NULL;
+ uint32_t total_len = RTE_PKTMBUF_HEADROOM + size;
+ uint16_t buf_len;
+ rte_iova_t iova;
+ void *buf;
+
+ /* Try to use pkt buffer to store shinfo to reduce the amount of memory
+ * required, otherwise store shinfo in the new buffer.
+ */
+ if (rte_pktmbuf_tailroom(pkt) >= sizeof(*shinfo))
+ shinfo = rte_pktmbuf_mtod(pkt,
+ struct rte_mbuf_ext_shared_info *);
+ else {
+ total_len += sizeof(*shinfo) + sizeof(uintptr_t);
+ total_len = RTE_ALIGN_CEIL(total_len, sizeof(uintptr_t));
+ }
+
+ if (unlikely(total_len > UINT16_MAX))
+ return -ENOSPC;
+
+ buf_len = total_len;
+ buf = rte_malloc(NULL, buf_len, RTE_CACHE_LINE_SIZE);
+ if (unlikely(buf == NULL))
+ return -ENOMEM;
+
+ /* Initialize shinfo */
+ if (shinfo) {
+ shinfo->free_cb = virtio_dev_extbuf_free;
+ shinfo->fcb_opaque = buf;
+ rte_mbuf_ext_refcnt_set(shinfo, 1);
+ } else {
+ shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,
+ virtio_dev_extbuf_free, buf);
+ if (unlikely(shinfo == NULL)) {
+ rte_free(buf);
+ RTE_LOG(ERR, VHOST_DATA, "Failed to init shinfo\n");
+ return -1;
+ }
+ }
+
+ iova = rte_malloc_virt2iova(buf);
+ rte_pktmbuf_attach_extbuf(pkt, buf, iova, buf_len, shinfo);
+ rte_pktmbuf_reset_headroom(pkt);
+
+ return 0;
+}
+
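The sizing rules in virtio_dev_extbuf_alloc() can be exercised in isolation. The stand-alone sketch below mirrors the arithmetic: headroom plus data, plus shinfo and alignment padding when the shinfo cannot live in the pkt's tailroom, with the uint16_t overflow check. SHINFO_SIZE, the headroom value, and the simplified power-of-two RTE_ALIGN_CEIL are illustrative assumptions, not values taken from a real build.

```c
#include <assert.h>
#include <stdint.h>

#define RTE_PKTMBUF_HEADROOM 128 /* default DPDK headroom */
/* Simplified power-of-two version of DPDK's alignment macro. */
#define RTE_ALIGN_CEIL(v, align) (((v) + (align) - 1) & ~((uintptr_t)(align) - 1))
/* Stand-in for sizeof(struct rte_mbuf_ext_shared_info); illustrative only. */
#define SHINFO_SIZE 24

/* Mirrors the sizing logic above: when shinfo cannot be stored in the
 * pkt's tailroom, the external buffer must also hold shinfo plus
 * alignment padding, and the result must still fit in a uint16_t
 * buf_len. Returns 0 on overflow (the -ENOSPC path). */
static uint32_t
extbuf_total_len(uint32_t size, int shinfo_in_pkt)
{
	uint32_t total_len = RTE_PKTMBUF_HEADROOM + size;

	if (!shinfo_in_pkt) {
		total_len += SHINFO_SIZE + sizeof(uintptr_t);
		total_len = RTE_ALIGN_CEIL(total_len, sizeof(uintptr_t));
	}
	if (total_len > UINT16_MAX)
		return 0; /* would not fit in buf_len */
	return total_len;
}
```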
+/*
+ * Allocate a host supported pktmbuf.
+ */
+static __rte_always_inline struct rte_mbuf *
+virtio_dev_pktmbuf_alloc(struct virtio_net *dev, struct rte_mempool *mp,
+ uint32_t data_len)
+{
+ struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp);
+
+ if (unlikely(pkt == NULL))
+ return NULL;
+
+ if (rte_pktmbuf_tailroom(pkt) >= data_len)
+ return pkt;
+
+ /* attach an external buffer if supported */
+ if (dev->extbuf && !virtio_dev_extbuf_alloc(pkt, data_len))
+ return pkt;
+
+ /* check if chained buffers are allowed */
+ if (!dev->linearbuf)
+ return pkt;
+
+ /* Data doesn't fit into the buffer and the host supports
+ * only linear buffers
+ */
+ rte_pktmbuf_free(pkt);
+
+ return NULL;
+}
+
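The fallthrough order in virtio_dev_pktmbuf_alloc() is the crux of the patch: plain mbuf first, then external buffer if enabled, then chaining unless the application demanded linear buffers. The self-contained sketch below captures just that decision policy; extbuf_ok stands in for a successful virtio_dev_extbuf_alloc(), and the enum is purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum alloc_result {
	ALLOC_POOL_BUF,	/* data fits in the mempool mbuf */
	ALLOC_EXT_BUF,	/* external buffer attached */
	ALLOC_CHAINED,	/* caller will chain additional mbufs */
	ALLOC_FAIL	/* linear-only and data does not fit */
};

/* Mirrors the decision order in virtio_dev_pktmbuf_alloc() above. */
static enum alloc_result
pktmbuf_alloc_policy(uint32_t tailroom, uint32_t data_len,
		     bool extbuf, bool extbuf_ok, bool linearbuf)
{
	if (tailroom >= data_len)
		return ALLOC_POOL_BUF;
	if (extbuf && extbuf_ok)
		return ALLOC_EXT_BUF;
	if (!linearbuf)
		return ALLOC_CHAINED;
	return ALLOC_FAIL;
}
```

Note that chaining remains the default whenever neither new flag is set, which preserves the pre-patch behavior for existing applications.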
static __rte_noinline uint16_t
virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
@@ -1343,26 +1430,23 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
for (i = 0; i < count; i++) {
struct buf_vector buf_vec[BUF_VECTOR_MAX];
uint16_t head_idx;
- uint32_t dummy_len;
+ uint32_t buf_len;
uint16_t nr_vec = 0;
int err;
if (unlikely(fill_vec_buf_split(dev, vq,
vq->last_avail_idx + i,
&nr_vec, buf_vec,
- &head_idx, &dummy_len,
+ &head_idx, &buf_len,
VHOST_ACCESS_RO) < 0))
break;
if (likely(dev->dequeue_zero_copy == 0))
update_shadow_used_ring_split(vq, head_idx, 0);
- pkts[i] = rte_pktmbuf_alloc(mbuf_pool);
- if (unlikely(pkts[i] == NULL)) {
- RTE_LOG(ERR, VHOST_DATA,
- "Failed to allocate memory for mbuf.\n");
+ pkts[i] = virtio_dev_pktmbuf_alloc(dev, mbuf_pool, buf_len);
+ if (unlikely(pkts[i] == NULL))
break;
- }
err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, pkts[i],
mbuf_pool);
@@ -1451,14 +1535,14 @@ virtio_dev_tx_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
for (i = 0; i < count; i++) {
struct buf_vector buf_vec[BUF_VECTOR_MAX];
uint16_t buf_id;
- uint32_t dummy_len;
+ uint32_t buf_len;
uint16_t desc_count, nr_vec = 0;
int err;
if (unlikely(fill_vec_buf_packed(dev, vq,
vq->last_avail_idx, &desc_count,
buf_vec, &nr_vec,
- &buf_id, &dummy_len,
+ &buf_id, &buf_len,
VHOST_ACCESS_RO) < 0))
break;
@@ -1466,12 +1550,9 @@ virtio_dev_tx_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
update_shadow_used_ring_packed(vq, buf_id, 0,
desc_count);
- pkts[i] = rte_pktmbuf_alloc(mbuf_pool);
- if (unlikely(pkts[i] == NULL)) {
- RTE_LOG(ERR, VHOST_DATA,
- "Failed to allocate memory for mbuf.\n");
+ pkts[i] = virtio_dev_pktmbuf_alloc(dev, mbuf_pool, buf_len);
+ if (unlikely(pkts[i] == NULL))
break;
- }
err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, pkts[i],
mbuf_pool);