net/virtio: fix check scatter on all Rx queues

Message ID 20210804083128.64981-1-zhihongx.peng@intel.com (mailing list archive)
State Changes Requested
Delegated to: Maxime Coquelin

Checks

Context                          Check    Description
ci/iol-x86_64-compile-testing    fail     Testing issues
ci/iol-abi-testing               success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/intel-Testing                 success  Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/github-robot                  success  github build: passed
ci/checkpatch                    success  coding style OK

Commit Message

Peng, ZhihongX Aug. 4, 2021, 8:31 a.m. UTC
From: Zhihong Peng <zhihongx.peng@intel.com>

This patch fixes the wrong way to obtain virtqueue.
The end of virtqueue cannot be judged based on whether
the array is NULL.

Fixes: 4e8169eb0d2d (net/virtio: fix Rx scatter offload)
Cc: stable@dpdk.org

Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
---
 drivers/net/virtio/virtio_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Maxime Coquelin Sept. 13, 2021, 3:34 p.m. UTC | #1
Hi Zhihong,

On 8/4/21 10:31 AM, zhihongx.peng@intel.com wrote:
> From: Zhihong Peng <zhihongx.peng@intel.com>
> 
> This patch fixes the wrong way to obtain virtqueue.
> The end of virtqueue cannot be judged based on whether
> the array is NULL.

My understanding is that it causes an issue because it confuses the
control queue with an Rx queue? If so, please be more specific in the
commit message about the issue it is fixing.

> Fixes: 4e8169eb0d2d (net/virtio: fix Rx scatter offload)
> Cc: stable@dpdk.org
> 
> Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
> ---
>  drivers/net/virtio/virtio_ethdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index e58085a2c9..f2d19dc9d6 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -873,8 +873,8 @@ virtio_check_scatter_on_all_rx_queues(struct rte_eth_dev *dev,
>  	if (hw->vqs == NULL)
>  		return true;
>  
> -	for (qidx = 0; (vq = hw->vqs[2 * qidx + VTNET_SQ_RQ_QUEUE_IDX]) != NULL;
> -	     qidx++) {
> +	for (qidx = 0; qidx < hw->max_queue_pairs; qidx++) {
> +		vq = hw->vqs[2 * qidx + VTNET_SQ_RQ_QUEUE_IDX];

I agree with the change, but I would add a check to ensure vq is not
NULL, to be safe with respect to NULL pointer dereferencing.

>  		rxvq = &vq->rxq;
>  		if (rxvq->mpool == NULL)
>  			continue;
> 

Thanks,
Maxime
David Marchand Sept. 15, 2021, 6:37 p.m. UTC | #2
On Wed, Aug 4, 2021 at 10:36 AM <zhihongx.peng@intel.com> wrote:
>
> From: Zhihong Peng <zhihongx.peng@intel.com>
>
> This patch fixes the wrong way to obtain virtqueue.
> The end of virtqueue cannot be judged based on whether
> the array is NULL.

Indeed, good catch.

I can reproduce a crash with v20.11.3, which has the backport of 4e8169eb0d2d.
I cannot see it with main: maybe due to a lucky allocation or the size
requested to rte_zmalloc... ?

The use case is simple; I am surprised no validation caught it.

# gdb ./build/app/dpdk-testpmd -ex 'run --vdev
net_virtio_user0,path=/dev/vhost-net,iface=titi,queues=3 -a 0:0:0.0 --
-i'

...

Thread 1 "dpdk-testpmd" received signal SIGSEGV, Segmentation fault.
virtio_rx_mem_pool_buf_size (mp=0x110429983) at
../drivers/net/virtio/virtio_ethdev.c:873
873        return rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM;
Missing separate debuginfos, use: yum debuginfo-install
elfutils-libelf-0.182-3.el8.x86_64 libbpf-0.2.0-1.el8.x86_64
(gdb) bt
#0  virtio_rx_mem_pool_buf_size (mp=0x110429983) at
../drivers/net/virtio/virtio_ethdev.c:873
#1  0x0000000000e370d1 in virtio_check_scatter_on_all_rx_queues
(frame_size=1530, dev=0x1799a40 <rte_eth_devices>) at
../drivers/net/virtio/virtio_ethdev.c:907
#2  virtio_mtu_set (dev=0x1799a40 <rte_eth_devices>, mtu=<optimized
out>) at ../drivers/net/virtio/virtio_ethdev.c:938
#3  0x00000000008c30e5 in rte_eth_dev_set_mtu
(port_id=port_id@entry=0, mtu=<optimized out>) at
../lib/librte_ethdev/rte_ethdev.c:3484
#4  0x00000000006a61d8 in update_jumbo_frame_offload
(portid=portid@entry=0) at ../app/test-pmd/testpmd.c:3371
#5  0x00000000006a62bc in init_config_port_offloads (pid=0,
socket_id=0) at ../app/test-pmd/testpmd.c:1416
#6  0x000000000061770c in init_config () at ../app/test-pmd/testpmd.c:1505
#7  main (argc=<optimized out>, argv=<optimized out>) at
../app/test-pmd/testpmd.c:3800
(gdb) f 1
#1  0x0000000000e370d1 in virtio_check_scatter_on_all_rx_queues
(frame_size=1530, dev=0x1799a40 <rte_eth_devices>) at
../drivers/net/virtio/virtio_ethdev.c:907
907            buf_size = virtio_rx_mem_pool_buf_size(rxvq->mpool);
(gdb) p hw->max_queue_pairs
$1 = 3
(gdb) p qidx
$2 = 5
(gdb) p hw->vqs[0]
$3 = (struct virtqueue *) 0x17ffb03c0
(gdb) p hw->vqs[2]
$4 = (struct virtqueue *) 0x17ff9dcc0
(gdb) p hw->vqs[4]
$5 = (struct virtqueue *) 0x17ff8acc0
(gdb) p hw->vqs[6]
$6 = (struct virtqueue *) 0x17ff77cc0
(gdb) p hw->vqs[7]
$7 = (struct virtqueue *) 0x0
(gdb) p hw->vqs[8]
$8 = (struct virtqueue *) 0x100004ac0
(gdb) p hw->vqs[9]
$9 = (struct virtqueue *) 0x17ffb1600
(gdb) p hw->vqs[10]
$10 = (struct virtqueue *) 0x17ffb18c0
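
The index arithmetic behind the session above can be sketched in a few lines of C. This is only an illustration, not driver code: it assumes VTNET_SQ_RQ_QUEUE_IDX is 0 (consistent with vqs[0], vqs[2] and vqs[4] being the Rx queues above) and that hw->vqs holds two entries per queue pair plus one control queue, which is a reading of Maxime's comment rather than something the trace proves. The names rx_vq_index, nr_vq_slots and classify_rx_lookup are hypothetical.

```c
#include <stddef.h>

/* Assumed stand-in for VTNET_SQ_RQ_QUEUE_IDX (Rx is the even slot). */
#define RQ_IDX 0

/* Outcome of looking up the "Rx" slot for a given queue pair index. */
enum slot_kind { SLOT_RX, SLOT_CTRL, SLOT_OUT_OF_BOUNDS };

/* Slot in vqs[] that the loop reads for queue pair qidx. */
static size_t
rx_vq_index(unsigned int qidx)
{
	return 2 * (size_t)qidx + RQ_IDX;
}

/* Assumed array size: Rx/Tx per pair plus one control queue. */
static size_t
nr_vq_slots(unsigned int max_queue_pairs)
{
	return 2 * (size_t)max_queue_pairs + 1;
}

/* What the NULL-terminated loop actually touches for a given qidx. */
static enum slot_kind
classify_rx_lookup(unsigned int qidx, unsigned int max_queue_pairs)
{
	if (qidx < max_queue_pairs)
		return SLOT_RX;
	if (rx_vq_index(qidx) < nr_vq_slots(max_queue_pairs))
		return SLOT_CTRL;
	return SLOT_OUT_OF_BOUNDS;
}
```

Under these assumptions, with max_queue_pairs = 3 the old loop treats slot 6 (non-NULL, presumably the control queue) as an Rx queue, then keeps reading slots 8 and 10, which lie past the end of the array. That matches qidx = 5 in the trace. Slot 7 is NULL but is never tested, because only even slots are read.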
Kevin Traynor Sept. 21, 2021, 5:45 p.m. UTC | #3
On 15/09/2021 19:37, David Marchand wrote:
> On Wed, Aug 4, 2021 at 10:36 AM <zhihongx.peng@intel.com> wrote:
>>
>> From: Zhihong Peng <zhihongx.peng@intel.com>
>>
>> This patch fixes the wrong way to obtain virtqueue.
>> The end of virtqueue cannot be judged based on whether
>> the array is NULL.
> 
> Indeed, good catch.
> 
> I can reproduce a crash with v20.11.3 which has backport of 4e8169eb0d2d.
> I can not see it with main: maybe due to a lucky allocation or size
> requested to rte_zmalloc... ?
> 
> The usecase is simple, I am surprised no validation caught it.
> 
> # gdb ./build/app/dpdk-testpmd -ex 'run --vdev
> net_virtio_user0,path=/dev/vhost-net,iface=titi,queues=3 -a 0:0:0.0 --
> -i'
> 
> ...
> 
> Thread 1 "dpdk-testpmd" received signal SIGSEGV, Segmentation fault.
> virtio_rx_mem_pool_buf_size (mp=0x110429983) at
> ../drivers/net/virtio/virtio_ethdev.c:873
> 873        return rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM;
> Missing separate debuginfos, use: yum debuginfo-install
> elfutils-libelf-0.182-3.el8.x86_64 libbpf-0.2.0-1.el8.x86_64
> (gdb) bt
> #0  virtio_rx_mem_pool_buf_size (mp=0x110429983) at
> ../drivers/net/virtio/virtio_ethdev.c:873
> #1  0x0000000000e370d1 in virtio_check_scatter_on_all_rx_queues
> (frame_size=1530, dev=0x1799a40 <rte_eth_devices>) at
> ../drivers/net/virtio/virtio_ethdev.c:907
> #2  virtio_mtu_set (dev=0x1799a40 <rte_eth_devices>, mtu=<optimized
> out>) at ../drivers/net/virtio/virtio_ethdev.c:938
> #3  0x00000000008c30e5 in rte_eth_dev_set_mtu
> (port_id=port_id@entry=0, mtu=<optimized out>) at
> ../lib/librte_ethdev/rte_ethdev.c:3484
> #4  0x00000000006a61d8 in update_jumbo_frame_offload
> (portid=portid@entry=0) at ../app/test-pmd/testpmd.c:3371
> #5  0x00000000006a62bc in init_config_port_offloads (pid=0,
> socket_id=0) at ../app/test-pmd/testpmd.c:1416
> #6  0x000000000061770c in init_config () at ../app/test-pmd/testpmd.c:1505
> #7  main (argc=<optimized out>, argv=<optimized out>) at
> ../app/test-pmd/testpmd.c:3800
> (gdb) f 1
> #1  0x0000000000e370d1 in virtio_check_scatter_on_all_rx_queues
> (frame_size=1530, dev=0x1799a40 <rte_eth_devices>) at
> ../drivers/net/virtio/virtio_ethdev.c:907
> 907            buf_size = virtio_rx_mem_pool_buf_size(rxvq->mpool);
> (gdb) p hw->max_queue_pairs
> $1 = 3
> (gdb) p qidx
> $2 = 5
> (gdb) p hw->vqs[0]
> $3 = (struct virtqueue *) 0x17ffb03c0
> (gdb) p hw->vqs[2]
> $4 = (struct virtqueue *) 0x17ff9dcc0
> (gdb) p hw->vqs[4]
> $5 = (struct virtqueue *) 0x17ff8acc0
> (gdb) p hw->vqs[6]
> $6 = (struct virtqueue *) 0x17ff77cc0
> (gdb) p hw->vqs[7]
> $7 = (struct virtqueue *) 0x0
> (gdb) p hw->vqs[8]
> $8 = (struct virtqueue *) 0x100004ac0
> (gdb) p hw->vqs[9]
> $9 = (struct virtqueue *) 0x17ffb1600
> (gdb) p hw->vqs[10]
> $10 = (struct virtqueue *) 0x17ffb18c0
> 
> 
> 

For reference, also observed when 20.11.3 is paired with OVS

https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/387940.html
Peng, ZhihongX Sept. 22, 2021, 8:13 a.m. UTC | #4
> -----Original Message-----
> From: Kevin Traynor <ktraynor@redhat.com>
> Sent: Wednesday, September 22, 2021 1:45 AM
> To: David Marchand <david.marchand@redhat.com>; Peng, ZhihongX
> <zhihongx.peng@intel.com>
> Cc: Xia, Chenbo <chenbo.xia@intel.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; dev <dev@dpdk.org>; Ivan Ilchenko
> <ivan.ilchenko@oktetlabs.ru>; dpdk stable <stable@dpdk.org>; Christian
> Ehrhardt <christian.ehrhardt@canonical.com>; Xueming(Steven) Li
> <xuemingl@nvidia.com>; Luca Boccassi <bluca@debian.org>
> Subject: Re: [dpdk-stable] [DPDK] net/virtio: fix check scatter on all Rx
> queues
> 
> On 15/09/2021 19:37, David Marchand wrote:
> > On Wed, Aug 4, 2021 at 10:36 AM <zhihongx.peng@intel.com> wrote:
> >>
> >> From: Zhihong Peng <zhihongx.peng@intel.com>
> >>
> >> This patch fixes the wrong way to obtain virtqueue.
> >> The end of virtqueue cannot be judged based on whether the array is
> >> NULL.
> >
> > Indeed, good catch.
> >
> > I can reproduce a crash with v20.11.3 which has backport of 4e8169eb0d2d.
> > I can not see it with main: maybe due to a lucky allocation or size
> > requested to rte_zmalloc... ?
> >

This problem was discovered with Google ASan (AddressSanitizer); we have submitted a DPDK ASan patch:
http://patchwork.dpdk.org/project/dpdk/patch/20210918074155.872358-1-zhihongx.peng@intel.com/


> > The usecase is simple, I am surprised no validation caught it.
> >
> > # gdb ./build/app/dpdk-testpmd -ex 'run --vdev
> > net_virtio_user0,path=/dev/vhost-net,iface=titi,queues=3 -a 0:0:0.0 --
> > -i'
> >
> > ...
> >
> > Thread 1 "dpdk-testpmd" received signal SIGSEGV, Segmentation fault.
> > virtio_rx_mem_pool_buf_size (mp=0x110429983) at
> > ../drivers/net/virtio/virtio_ethdev.c:873
> > 873        return rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM;
> > Missing separate debuginfos, use: yum debuginfo-install
> > elfutils-libelf-0.182-3.el8.x86_64 libbpf-0.2.0-1.el8.x86_64
> > (gdb) bt
> > #0  virtio_rx_mem_pool_buf_size (mp=0x110429983) at
> > ../drivers/net/virtio/virtio_ethdev.c:873
> > #1  0x0000000000e370d1 in virtio_check_scatter_on_all_rx_queues
> > (frame_size=1530, dev=0x1799a40 <rte_eth_devices>) at
> > ../drivers/net/virtio/virtio_ethdev.c:907
> > #2  virtio_mtu_set (dev=0x1799a40 <rte_eth_devices>, mtu=<optimized
> > out>) at ../drivers/net/virtio/virtio_ethdev.c:938
> > #3  0x00000000008c30e5 in rte_eth_dev_set_mtu
> > (port_id=port_id@entry=0, mtu=<optimized out>) at
> > ../lib/librte_ethdev/rte_ethdev.c:3484
> > #4  0x00000000006a61d8 in update_jumbo_frame_offload
> > (portid=portid@entry=0) at ../app/test-pmd/testpmd.c:3371
> > #5  0x00000000006a62bc in init_config_port_offloads (pid=0,
> > socket_id=0) at ../app/test-pmd/testpmd.c:1416
> > #6  0x000000000061770c in init_config () at
> > ../app/test-pmd/testpmd.c:1505
> > #7  main (argc=<optimized out>, argv=<optimized out>) at
> > ../app/test-pmd/testpmd.c:3800
> > (gdb) f 1
> > #1  0x0000000000e370d1 in virtio_check_scatter_on_all_rx_queues
> > (frame_size=1530, dev=0x1799a40 <rte_eth_devices>) at
> > ../drivers/net/virtio/virtio_ethdev.c:907
> > 907            buf_size = virtio_rx_mem_pool_buf_size(rxvq->mpool);
> > (gdb) p hw->max_queue_pairs
> > $1 = 3
> > (gdb) p qidx
> > $2 = 5
> > (gdb) p hw->vqs[0]
> > $3 = (struct virtqueue *) 0x17ffb03c0
> > (gdb) p hw->vqs[2]
> > $4 = (struct virtqueue *) 0x17ff9dcc0
> > (gdb) p hw->vqs[4]
> > $5 = (struct virtqueue *) 0x17ff8acc0
> > (gdb) p hw->vqs[6]
> > $6 = (struct virtqueue *) 0x17ff77cc0
> > (gdb) p hw->vqs[7]
> > $7 = (struct virtqueue *) 0x0
> > (gdb) p hw->vqs[8]
> > $8 = (struct virtqueue *) 0x100004ac0
> > (gdb) p hw->vqs[9]
> > $9 = (struct virtqueue *) 0x17ffb1600
> > (gdb) p hw->vqs[10]
> > $10 = (struct virtqueue *) 0x17ffb18c0
> >
> >
> >
> 
> For reference, also observed when 20.11.3 is paired with OVS
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/387940.html
David Marchand Sept. 30, 2021, 6:43 p.m. UTC | #5
On Wed, Sep 22, 2021 at 10:13 AM Peng, ZhihongX <zhihongx.peng@intel.com> wrote:
> > On 15/09/2021 19:37, David Marchand wrote:
> > > On Wed, Aug 4, 2021 at 10:36 AM <zhihongx.peng@intel.com> wrote:
> > >>
> > >> From: Zhihong Peng <zhihongx.peng@intel.com>
> > >>
> > >> This patch fixes the wrong way to obtain virtqueue.
> > >> The end of virtqueue cannot be judged based on whether the array is
> > >> NULL.
> > >
> > > Indeed, good catch.
> > >
> > > I can reproduce a crash with v20.11.3 which has backport of 4e8169eb0d2d.
> > > I can not see it with main: maybe due to a lucky allocation or size
> > > requested to rte_zmalloc... ?
> > >
>
> This problem was discovered through google asan, we have submitted DPDK ASan patch.
> http://patchwork.dpdk.org/project/dpdk/patch/20210918074155.872358-1-zhihongx.peng@intel.com/

Can you work on a v2?
Maxime had comments.
https://inbox.dpdk.org/dev/e1c512c1-9766-1cd8-816b-8a77d13d53d6@redhat.com/

Patch

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index e58085a2c9..f2d19dc9d6 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -873,8 +873,8 @@  virtio_check_scatter_on_all_rx_queues(struct rte_eth_dev *dev,
 	if (hw->vqs == NULL)
 		return true;
 
-	for (qidx = 0; (vq = hw->vqs[2 * qidx + VTNET_SQ_RQ_QUEUE_IDX]) != NULL;
-	     qidx++) {
+	for (qidx = 0; qidx < hw->max_queue_pairs; qidx++) {
+		vq = hw->vqs[2 * qidx + VTNET_SQ_RQ_QUEUE_IDX];
 		rxvq = &vq->rxq;
 		if (rxvq->mpool == NULL)
 			continue;
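
For readers following the review, here is a minimal, self-contained sketch of the loop shape that results from combining this patch with Maxime's request for a NULL check. It is not the v2 patch: the mock_* types are hypothetical stand-ins for the driver structures, RQ_IDX stands in for VTNET_SQ_RQ_QUEUE_IDX (assumed to be 0), the "frame fits in one buffer" test is a placeholder for the driver's real scatter check, and skipping a NULL virtqueue with continue is one possible policy chosen here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for VTNET_SQ_RQ_QUEUE_IDX. */
#define RQ_IDX 0

/* Hypothetical, heavily reduced versions of the driver structures. */
struct mock_mempool { uint16_t buf_size; };
struct mock_rxq { struct mock_mempool *mpool; };
struct mock_virtqueue { struct mock_rxq rxq; };
struct mock_hw {
	struct mock_virtqueue **vqs;
	uint16_t max_queue_pairs;
};

/*
 * Return false if any configured Rx queue cannot hold frame_size in a
 * single buffer (placeholder for the real scatter check).
 */
static bool
mock_check_all_rx_queues(const struct mock_hw *hw, uint16_t frame_size)
{
	uint16_t qidx;

	if (hw->vqs == NULL)
		return true;

	/* Bound the loop by the queue-pair count, not by a NULL entry. */
	for (qidx = 0; qidx < hw->max_queue_pairs; qidx++) {
		struct mock_virtqueue *vq = hw->vqs[2 * qidx + RQ_IDX];

		/* Extra safety requested in review: never dereference NULL. */
		if (vq == NULL)
			continue;
		/* Queue not set up yet: nothing to check. */
		if (vq->rxq.mpool == NULL)
			continue;
		if (frame_size > vq->rxq.mpool->buf_size)
			return false;
	}
	return true;
}
```

Bounding the loop by max_queue_pairs means the slot after the last Rx/Tx pair is never read, whatever it contains, and the NULL check covers Rx slots that have not been allocated.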