
[dpdk-dev] Memory corruption in librte_ether?

Message ID 5441873F.90500@bisdn.de (mailing list archive)
State Not Applicable, archived

Commit Message

Marc Sune Oct. 17, 2014, 9:16 p.m. UTC
Hi all,

I was rebasing the KNI mempool v4 patch (I have it finalised, but wanted 
to check) onto the latest master HEAD 
(075e064089e1c2b6899db58c69be1a387eb5ffa7) when I ran into problems with 
the current KNI example with em interfaces in a VM. I then switched to 
plain master HEAD and retried (so without the KNI mempool patch!) and got 
the *same behaviour*. Everything listed here is with master HEAD, so it 
has nothing to do with the patch I am working on.

The *VM*, emulated with qemu, has 4 e1000 interfaces attached to several 
bridges. The qemu version is 1.1.2, running on Debian 7 64-bit. With this 
setup I get the error:

(gdb) r
Starting program: /home/marc/dpdk_vanilla/examples/kni/build/kni -c 0x3 
-n 2 -- -p 0x3 -P --config=\(0,1,1,1\),\(1,0,0,0\)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: Setting up memory...
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7ffff6e00000 (size = 0x200000)
EAL: Ask a virtual area of 0x800000 bytes
EAL: Virtual area found at 0x7ffff6400000 (size = 0x800000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7ffff5e00000 (size = 0x400000)
EAL: Ask a virtual area of 0x17000000 bytes
EAL: Virtual area found at 0x7fffdec00000 (size = 0x17000000)
EAL: Ask a virtual area of 0x1e00000 bytes
EAL: Virtual area found at 0x7fffdcc00000 (size = 0x1e00000)
EAL: Ask a virtual area of 0x1400000 bytes
EAL: Virtual area found at 0x7fffdb600000 (size = 0x1400000)
EAL: Ask a virtual area of 0x800000 bytes
EAL: Virtual area found at 0x7fffdac00000 (size = 0x800000)
EAL: Ask a virtual area of 0x2000000 bytes
EAL: Virtual area found at 0x7fffd8a00000 (size = 0x2000000)
EAL: Ask a virtual area of 0x2c00000 bytes
EAL: Virtual area found at 0x7fffd5c00000 (size = 0x2c00000)
EAL: Ask a virtual area of 0x7c00000 bytes
EAL: Virtual area found at 0x7fffcde00000 (size = 0x7c00000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7fffcd800000 (size = 0x400000)
EAL: Ask a virtual area of 0xc00000 bytes
EAL: Virtual area found at 0x7fffcca00000 (size = 0xc00000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7fffcc400000 (size = 0x400000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7fffcc000000 (size = 0x200000)
EAL: Requesting 331 pages of size 2MB from socket 0
[New Thread 0x7fffcbfff700 (LWP 19279)]
EAL: TSC frequency is ~2494343 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using 
unreliable clock cycles !
EAL: Master core 0 is ready (tid=f7ff0800)
[New Thread 0x7fffcb7fc700 (LWP 19280)]
EAL: Core 1 is ready (tid=cb7fc700)
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   0000:00:03.0 not managed by UIO driver, skipping
EAL: PCI device 0000:00:06.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f9a000
PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x100e
EAL: PCI device 0000:00:07.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f7a000
PMD: eth_em_dev_init(): port_id 1 vendorID=0x8086 deviceID=0x100e
EAL: PCI device 0000:00:08.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f5a000
PMD: eth_em_dev_init(): port_id 2 vendorID=0x8086 deviceID=0x100e
EAL: PCI device 0000:00:09.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f3a000
PMD: eth_em_dev_init(): port_id 3 vendorID=0x8086 deviceID=0x100e
APP: Port ID: 0
APP: Rx lcore ID: 1, Tx lcore ID: 1
APP: Kernel thread lcore ID: 1
APP: Port ID: 1
APP: Rx lcore ID: 0, Tx lcore ID: 0
APP: Kernel thread lcore ID: 0
APP: Initialising port 0 ...
PMD: eth_em_rx_queue_setup(): sw_ring=0x7fffcd4e7d00 
hw_ring=0x7ffff6fdaac0 dma_addr=0x5daac0
PMD: eth_em_tx_queue_setup(): sw_ring=0x7fffcd4e5c00 
hw_ring=0x7ffff6feaac0 dma_addr=0x5eaac0
PMD: eth_em_start(): <<
KNI: pci: 00:06:00      8086:100e
APP: Initialising port 1 ...
PMD: eth_em_rx_queue_setup(): drop_en functionality not supported by device
EAL: Error - exiting with code: 1
   Cause: Could not setup up RX queue for port1 (-22)
[Thread 0x7fffcb7fc700 (LWP 19280) exited]
[Thread 0x7ffff7ff0800 (LWP 19278) exited]

The default rx_conf in librte_pmd_e1000/igb_ethdev.c seems OK, setting 
drop_en to 0.
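
A minimal sketch of why a polluted rx_conf would produce this failure, assuming the queue setup rejects any nonzero drop_en with -EINVAL (which is -22 on Linux, matching the exit cause above). The struct and function names are hypothetical stand-ins, not the em PMD's actual code:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, reduced stand-in for struct rte_eth_rxconf. Field names
 * follow the gdb output in this thread; the layout is an assumption. */
struct rxconf_sketch {
    uint16_t rx_free_thresh;
    uint8_t  rx_drop_en;
    uint8_t  rx_deferred_start;
};

/* Reject a configuration that asks for drop_en, which the device in this
 * report does not support. Returns 0 on success, -EINVAL otherwise. */
static int check_rx_conf(const struct rxconf_sketch *conf)
{
    assert(conf != NULL);
    if (conf->rx_drop_en != 0) {
        fprintf(stderr, "drop_en functionality not supported by device\n");
        return -EINVAL; /* -22 on Linux */
    }
    return 0;
}
```

With a zeroed structure the check passes. With any nonzero byte left in rx_drop_en it fails, even though the application never asked for that feature.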

Debugging the e1000 PMD (the 4 NICs emulate exactly the same device):

marc@dpdk:~/dpdk/lib$ git diff


Now the RX queue is set up correctly (only the compiler flags changed, 
which points to memory corruption!), so rx_conf appears to be OK, but 
tx_conf is now wrong:

(gdb) r
Starting program: /home/marc/dpdk_vanilla/examples/kni/build/kni -c 0x3 
-n 2 -- -p 0x3 -P --config=\(0,1,1,1\),\(1,0,0,0\)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: Setting up memory...
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7ffff6e00000 (size = 0x200000)
EAL: Ask a virtual area of 0x800000 bytes
EAL: Virtual area found at 0x7ffff6400000 (size = 0x800000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7ffff5e00000 (size = 0x400000)
EAL: Ask a virtual area of 0x17000000 bytes
EAL: Virtual area found at 0x7fffdec00000 (size = 0x17000000)
EAL: Ask a virtual area of 0x1e00000 bytes
EAL: Virtual area found at 0x7fffdcc00000 (size = 0x1e00000)
EAL: Ask a virtual area of 0x1400000 bytes
EAL: Virtual area found at 0x7fffdb600000 (size = 0x1400000)
EAL: Ask a virtual area of 0x800000 bytes
EAL: Virtual area found at 0x7fffdac00000 (size = 0x800000)
EAL: Ask a virtual area of 0x2000000 bytes
EAL: Virtual area found at 0x7fffd8a00000 (size = 0x2000000)
EAL: Ask a virtual area of 0x2c00000 bytes
EAL: Virtual area found at 0x7fffd5c00000 (size = 0x2c00000)
EAL: Ask a virtual area of 0x7c00000 bytes
EAL: Virtual area found at 0x7fffcde00000 (size = 0x7c00000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7fffcd800000 (size = 0x400000)
EAL: Ask a virtual area of 0xc00000 bytes
EAL: Virtual area found at 0x7fffcca00000 (size = 0xc00000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7fffcc400000 (size = 0x400000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7fffcc000000 (size = 0x200000)
EAL: Requesting 331 pages of size 2MB from socket 0
[New Thread 0x7fffcbfff700 (LWP 22143)]
EAL: TSC frequency is ~2494343 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using 
unreliable clock cycles !
EAL: Master core 0 is ready (tid=f7ff0800)
[New Thread 0x7fffcb7fc700 (LWP 22144)]
EAL: Core 1 is ready (tid=cb7fc700)
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   0000:00:03.0 not managed by UIO driver, skipping
EAL: PCI device 0000:00:06.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f9a000
PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x100e
EAL: PCI device 0000:00:07.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f7a000
PMD: eth_em_dev_init(): port_id 1 vendorID=0x8086 deviceID=0x100e
EAL: PCI device 0000:00:08.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f5a000
PMD: eth_em_dev_init(): port_id 2 vendorID=0x8086 deviceID=0x100e
EAL: PCI device 0000:00:09.0 on NUMA socket -1
EAL:   probe driver: 8086:100e rte_em_pmd
EAL:   PCI memory mapped at 0x7ffff7f3a000
PMD: eth_em_dev_init(): port_id 3 vendorID=0x8086 deviceID=0x100e
APP: Port ID: 0
APP: Rx lcore ID: 1, Tx lcore ID: 1
APP: Kernel thread lcore ID: 1
APP: Port ID: 1
APP: Rx lcore ID: 0, Tx lcore ID: 0
APP: Kernel thread lcore ID: 0
APP: Initialising port 0 ...
PMD: eth_em_rx_queue_setup(): sw_ring=0x7fffcd4e7d00 
hw_ring=0x7ffff6fdaac0 dma_addr=0x5daac0
PMD: eth_em_tx_queue_setup(): sw_ring=0x7fffcd4e5c00 
hw_ring=0x7ffff6feaac0 dma_addr=0x5eaac0
PMD: eth_em_start(): <<
KNI: pci: 00:06:00      8086:100e
APP: Initialising port 1 ...
PMD: eth_em_rx_queue_setup(): sw_ring=0x7fffcd4e5600 
hw_ring=0x7fffcd50c1c0 dma_addr=0x2cb0c1c0
PMD: eth_em_tx_queue_setup(): tx_free_thresh must be less than the 
number of TX descriptors minus 3. (tx_free_thresh=65535 port=1 queue=0)
EAL: Error - exiting with code: 1
   Cause: Could not setup up TX queue for port1 (-22)
[Thread 0x7fffcbfff700 (LWP 22143) exited]
[Thread 0x7ffff7ff0800 (LWP 22140) exited]
[Inferior 1 (process 22140) exited with code 01]
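
The rule quoted in the log message can be sketched as below. This is only an illustration of the stated condition ("tx_free_thresh must be less than the number of TX descriptors minus 3"); the function name is hypothetical and the real driver may apply defaults or further checks first:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Returns 0 if tx_free_thresh is acceptable for a ring of nb_desc TX
 * descriptors, -EINVAL otherwise. Hypothetical helper for illustration. */
static int check_tx_free_thresh(uint16_t tx_free_thresh, uint16_t nb_desc)
{
    assert(nb_desc > 3);
    if (tx_free_thresh >= (uint16_t)(nb_desc - 3))
        return -EINVAL;
    return 0;
}
```

With nb_desc=512, the limit is 509, so the reported tx_free_thresh=65535 is rejected, while a zeroed tx_conf would pass this particular check.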

Debugging it:

PMD: eth_em_rx_queue_setup(): sw_ring=0x7fffcd4e7d00 
hw_ring=0x7ffff6fdaac0 dma_addr=0x5daac0

Breakpoint 1, eth_em_tx_queue_setup (dev=0x796420, queue_idx=0, 
nb_desc=512, socket_id=4294967295, tx_conf=0x7fffffffe39c)
     at /home/marc/dpdk_vanilla/lib/librte_pmd_e1000/em_rxtx.c:1208
1208        hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
(gdb) print dev->data->name
$1 = "0:6.0", '\000' <repeats 26 times>
(gdb) print tx_conf
$2 = (const struct rte_eth_txconf *) 0x7fffffffe39c
(gdb) print *tx_conf
$3 = {tx_thresh = {pthresh = 0 '\000', hthresh = 0 '\000', wthresh = 0 
'\000'}, tx_rs_thresh = 0, tx_free_thresh = 0, txq_flags = 0, 
tx_deferred_start = 0 '\000'}
(gdb) c
Continuing.
PMD: eth_em_tx_queue_setup(): sw_ring=0x7fffcd4e5c00 
hw_ring=0x7ffff6feaac0 dma_addr=0x5eaac0
PMD: eth_em_start(): <<
KNI: pci: 00:06:00      8086:100e
APP: Initialising port 1 ...
PMD: eth_em_rx_queue_setup(): sw_ring=0x7fffcd4e5600 
hw_ring=0x7fffcd50c1c0 dma_addr=0x2cb0c1c0

Breakpoint 1, eth_em_tx_queue_setup (dev=0x796460, queue_idx=0, 
nb_desc=512, socket_id=4294967295, tx_conf=0x7fffffffe39c)
     at /home/marc/dpdk_vanilla/lib/librte_pmd_e1000/em_rxtx.c:1208
1208        hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
(gdb) print dev->data->name
$4 = "0:7.0", '\000' <repeats 26 times>
(gdb) print *tx_conf
$5 = {tx_thresh = {pthresh = 0 '\000', hthresh = 0 '\000', wthresh = 0 
'\000'}, tx_rs_thresh = 58608, tx_free_thresh = 65535, txq_flags = 32767,
   tx_deferred_start = 0 '\000'}
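
One possible reading of the second dump, offered as an interpretation and not a confirmed diagnosis: assuming x86-64 (little-endian) and natural alignment in the field order gdb prints, tx_rs_thresh, tx_free_thresh and txq_flags occupy eight contiguous bytes. The sketch below reassembles them into one 64-bit word; the function name is hypothetical:

```c
#include <stdint.h>

/* Assumed layout: tx_rs_thresh is a uint16_t at offset 4, tx_free_thresh
 * a uint16_t at offset 6, txq_flags a uint32_t at offset 8. On a
 * little-endian machine the lowest-offset field holds the lowest bits. */
static uint64_t reassemble_tx_garbage(uint16_t tx_rs_thresh,
                                      uint16_t tx_free_thresh,
                                      uint32_t txq_flags)
{
    return (uint64_t)tx_rs_thresh |
           ((uint64_t)tx_free_thresh << 16) |
           ((uint64_t)txq_flags << 32);
}
```

For the values above (58608, 65535, 32767) this gives 0x7fffffffe4f0, which lies in the same address range as the tx_conf pointer itself (0x7fffffffe39c). That would be consistent with leftover stack contents rather than values written by the driver.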

The KNI example runs *perfectly* in the VM, with the same launch 
parameters, with v1.7.1, and seems to work fine up to 
27b31ee33fa5e7cc9a086c690b98ed8e1a153c6a. So the commit that breaks it 
(that is, where the example starts failing, not necessarily the commit 
that is wrong) seems to be:

commit 81f7ecd934372fc9f592d1322f8eff86350fa4f5
Author: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Date:   Wed Oct 1 10:49:05 2014 +0100

     examples: use factorized default Rx/Tx configuration

     For apps that were using default rte_eth_rxconf and rte_eth_txconf
     structures, these have been removed and now they are obtained by
     calling rte_eth_dev_info_get, just before setting up RX/TX queues.

     Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
     Acked-by: David Marchand <david.marchand@6wind.com>


This seems to indicate that rte_eth_dev_info_get() is somehow corrupting 
memory (?). I haven't figured out the problem yet, but I suspect:

commit fbde27f19ab8f1d386868275bd8c016e693cf073
Author: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Date:   Wed Oct 1 10:49:04 2014 +0100

     ethdev: get default Rx/Tx configuration from dev info

     Many sample apps use duplicated code to set rte_eth_txconf and
     rte_eth_rxconf structures. This patch allows the user to get a
     default optimal RX/TX configuration through rte_eth_dev_info get,
     and still any parameters may be tweaked as wished, before setting
     up queues.

     Besides, if a NULL pointer is passed to rte_eth_rx_queue_setup or
     rte_eth_tx_queue_setup, these functions get internally the default
     RX/TX configuration for the user.

     Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
     Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
     Acked-by: David Marchand <david.marchand@6wind.com>
     [Thomas: split patch]

commit a30268e9a2d0618902e8cf96b90b27db4fb02d54
Author: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Date:   Wed Oct 1 10:49:03 2014 +0100

     ethdev: reset whole dev info structure before filling

     To guarantee that RX/TX configuration structures are reseted
     before modifying them, plus the other dev info fields,
     dev info structure is zeroed beforehand.

     Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
     Acked-by: David Marchand <david.marchand@6wind.com>


Can anyone confirm it?

Marc

p.s. Has someone managed to run a dpdk app with valgrind?

Comments

De Lara Guarch, Pablo Oct. 20, 2014, 5:31 p.m. UTC | #1
Hi Marc,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marc Sune
> Sent: Friday, October 17, 2014 10:17 PM
> To: <dev@dpdk.org>
> Subject: [dpdk-dev] Memory corruption in librte_ether?
> 
[...] 
> Which seems to indicate rte_eth_dev_info_get() is somehow corrupting
> memory(?¿). But I haven't figure out the problem (yet). I suspect of:
> 
> commit fbde27f19ab8f1d386868275bd8c016e693cf073
> Author: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> Date:   Wed Oct 1 10:49:04 2014 +0100
> 
>      ethdev: get default Rx/Tx configuration from dev info
> 
>      Many sample apps use duplicated code to set rte_eth_txconf and
> rte_eth_rxconf
>      structures. This patch allows the user to get a default optimal
> RX/TX configuration
>      through rte_eth_dev_info get, and still any parameters may be
> tweaked as wished,
>      before setting up queues.
> 
>      Besides, if a NULL pointer is passed to rte_eth_rx_queue_setup or
>      rte_eth_tx_queue_setup, these functions get internally the default
> RX/TX
>      configuration for the user.
> 
>      Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
>      Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
>      Acked-by: David Marchand <david.marchand@6wind.com>
>      [Thomas: split patch]
> 
> commit a30268e9a2d0618902e8cf96b90b27db4fb02d54
> Author: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> Date:   Wed Oct 1 10:49:03 2014 +0100
> 
>      ethdev: reset whole dev info structure before filling
> 
>      To guarantee that RX/TX configuration structures are reseted
>      before modifying them, plus the other dev info fields,
>      dev info structure is zeroed beforehand.
> 
>      Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
>      Acked-by: David Marchand <david.marchand@6wind.com>
> 
> 
> Can anyone confirm it?

I just pushed a fix for that problem. Indeed, the dev_info structure was polluted, 
because I was calling the PMD-specific dev_info_get function directly, 
instead of calling rte_eth_dev_info_get in rte_ethdev.c, which means that
the dev_info structure was not being reset. 
In your case, the em PMD does not overwrite the rte_eth_rxconf and 
rte_eth_txconf structures, so you find random data in those structures.
Well spotted, and thanks very much for all the details. 
I would appreciate it if you could verify that this patch works for you.
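
The difference between the two call paths can be sketched as follows. This is a reduced model under stated assumptions, not the actual patch: the struct and function names are hypothetical stand-ins, and the only behaviour taken from this thread is that the generic function zeroes the structure before the PMD fills it, while the em PMD leaves the default RX/TX configuration untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-ins for the DPDK structures. */
struct txconf_sketch {
    uint16_t tx_rs_thresh;
    uint16_t tx_free_thresh;
    uint32_t txq_flags;
};

struct dev_info_sketch {
    uint16_t max_rx_queues;
    struct txconf_sketch default_txconf;
};

/* PMD-level callback that fills in some fields but never writes
 * default_txconf (the illustrative value 1 is an assumption). */
static void pmd_infos_get(struct dev_info_sketch *info)
{
    assert(info != NULL);
    info->max_rx_queues = 1;
}

/* Generic wrapper: zero the whole structure first, then ask the PMD. */
static void dev_info_get(struct dev_info_sketch *info)
{
    assert(info != NULL);
    memset(info, 0, sizeof(*info));
    pmd_infos_get(info);
}
```

If a caller's structure is pre-filled with a pattern such as 0xA5 to simulate stale memory, calling pmd_infos_get() directly leaves default_txconf.tx_free_thresh at 0xA5A5, whereas going through dev_info_get() yields 0.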

Thanks,
Pablo
> 
> Marc
> 
> p.s. Has someone managed to run a dpdk app with valgrind?
Marc Sune Oct. 21, 2014, 8:12 a.m. UTC | #2
Pablo,

I've only tried it with the kni-autotest, but it seems to work fine. Thanks!

Btw, at least in my development VM, the kni-autotest fails both in the 
current head (455d09e i40e: generic filter control) and in v1.7.1:

RTE>>kni_autotest
master lcore: 0
count: 2
PMD: eth_em_rx_queue_setup(): sw_ring=0x7f27ab4e8100 
hw_ring=0x7f27aa600000 dma_addr=0x36e00000
PMD: eth_em_tx_queue_setup(): sw_ring=0x7f27ab4e6000 
hw_ring=0x7f27aa610000 dma_addr=0x36e10000
PMD: eth_em_start(): <<
KNI: pci: 00:06:00      8086:100e
KNI: Invalid KNI request operation.
KNI: Invalid kni info.
KNI: The KNI request operationhas already registered.
Change MTU of port 0 to 1450
Change MTU of port 0 to 1450 successfully.
KNI: Invalid kni info.
The ingress/egress number should not be less than 100
Test Failed
RTE>>

Maybe it is simply a lack of resources in the qemu VM.

Saludos
marc


Patch

diff --git a/lib/librte_pmd_e1000/Makefile b/lib/librte_pmd_e1000/Makefile
index 14bc4a2..e50b715 100644
--- a/lib/librte_pmd_e1000/Makefile
+++ b/lib/librte_pmd_e1000/Makefile
@@ -36,7 +36,7 @@  include $(RTE_SDK)/mk/rte.vars.mk
  #
  LIB = librte_pmd_e1000.a

-CFLAGS += -O3
+CFLAGS += -g -O0
  CFLAGS += $(WERROR_FLAGS)

Something seems to be wrong:

First iface (PCI 0:6.0):

(gdb) print dev->data->name
$4 = "0:6.0", '\000' <repeats 26 times>
(gdb) print *rx_conf
$5 = {rx_thresh = {pthresh = 0 '\000', hthresh = 0 '\000', wthresh = 0 
'\000'}, rx_free_thresh = 0, rx_drop_en = 0 '\000', rx_deferred_start = 
0 '\000'}
(gdb)

Second iface (PCI 0:7.0):

(gdb) print dev->data->name
$6 = "0:7.0", '\000' <repeats 26 times>
(gdb) print *rx_conf
$7 = {rx_thresh = {pthresh = 0 '\000', hthresh = 0 '\000', wthresh = 0 
'\000'}, rx_free_thresh = 33088, rx_drop_en = 176 '\260', 
rx_deferred_start = 44 ','}

Note that rx_free_thresh (along with rx_drop_en and rx_deferred_start) 
has polluted values on the second interface.

However, when adding -g -O0 in ethdev:

marc@dpdk:~/dpdk/lib$ git diff
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index b310f8b..ec385ef 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -36,7 +36,7 @@  include $(RTE_SDK)/mk/rte.vars.mk
  #
  LIB = libethdev.a

-CFLAGS += -O3
+CFLAGS += -g -O0
  CFLAGS += $(WERROR_FLAGS)

  SRCS-y += rte_ethdev.c
diff --git a/lib/librte_pmd_e1000/Makefile b/lib/librte_pmd_e1000/Makefile
index 14bc4a2..e50b715 100644
--- a/lib/librte_pmd_e1000/Makefile
+++ b/lib/librte_pmd_e1000/Makefile
@@ -36,7 +36,7 @@  include $(RTE_SDK)/mk/rte.vars.mk
  #
  LIB = librte_pmd_e1000.a

-CFLAGS += -O3
+CFLAGS += -g -O0
  CFLAGS += $(WERROR_FLAGS)

  ifeq ($(CC), icc)