[v1] net/e1000: do not update link status in secondary process
Checks
Commit Message
The code to update link status is not safe in secondary process.
If called from secondary it will crash, example from dumpcap:
eth_em_link_update
Signed-off-by: Jun Wang <junwang01@cestc.cn>
---
drivers/net/e1000/em_ethdev.c | 3 +++
1 file changed, 3 insertions(+)
Comments
On Fri, 12 Jul 2024 19:30:47 +0800
Jun Wang <junwang01@cestc.cn> wrote:
> The code to update link status is not safe in secondary process.
> If called from secondary it will crash, example from dumpcap:
> eth_em_link_update
>
> Signed-off-by: Jun Wang <junwang01@cestc.cn>
Wouldn't it be better to fix the code in e1000_check_link to work in
secondary process. There are network virtual appliances that use
secondary process for all processing.
>> The code to update link status is not safe in secondary process.
>> If called from secondary it will crash, example from dumpcap:
>> eth_em_link_update
>>
>> Signed-off-by: Jun Wang <junwang01@cestc.cn>
>
> Wouldn't it be better to fix the code in e1000_check_link to work in
> secondary process. There are network virtual appliances that use
> secondary process for all processing.
Yes, the e1000 virtual network card currently does not work properly
in the secondary process. After skipping eth_em_link_update, I tested
the e1000 card and it was able to capture packets normally. For the
secondary process, I think eth_em_link_update is not necessary.
Jun Wang
On Sun, Jul 14, 2024 at 04:26:26PM +0800, Jun Wang wrote:
> >> The code to update link status is not safe in secondary process.
> >> If called from secondary it will crash, example from dumpcap:
> >> eth_em_link_update
> >>
> >> Signed-off-by: Jun Wang <junwang01@cestc.cn>
> >
> > Wouldn't it be better to fix the code in e1000_check_link to work in
> > secondary process. There are network virtual appliances that use
> > secondary process for all processing.
>
> Yes, the e1000 virtual network card currently does not work properly
>
> in the secondary process. After skipping eth_em_link_update, I tested
>
> the e1000 card and it was able to capture packets normally. For the
>
> secondary process, I think eth_em_link_update is not necessary.
> __________________________________________________________________
>
Hi Jun,
can you provide some instructions as to how I can go about reproducing the
issue? I used a VM with emulated e1000 NICs on it and was able to run
testpmd and dumpcap side by side. Similarly, I was able to run two
instances of the symmetric_mp app side by size without seeing any crashes.
I'd just like to verify the issue and confirm this fixes a problem before
merging.
Thanks,
/Bruce
I used the e1000 NIC with OVS-DPDK and experienced a failure when using the
/dpdk/app/dpdk-dumpcap command to capture packets. After making modifications,
it worked fine.
/dpdk/app/dpdk-dumpcap -i 0000:00:04.0
File: /tmp/dpdk-dumpcap_0_0000:00:04.0_20240723020203.pcapng
Segmentation fault (core dumped)
ovs-vsctl list open .
_uuid : 82223d04-0a60-44fe-83ac-398a01b0bca3
bridges : [8f188622-c17d-4d21-ad30-22c30860f050, ca88255e-54e5-4b06-aaff-fdde55018f35]
cur_cfg : 15
datapath_types : [netdev, system]
datapaths : {netdev=d801e891-7990-4f77-ae9a-c776ba4430a4, system=25e5a761-3160-4ea1-adab-b4cddd8309e4}
db_version : "8.3.0"
dpdk_initialized : true
dpdk_version : "DPDK 23.11.0"
external_ids : {hostname=cell1-xgw-dpdk, ovn-bridge-datapath-type=netdev, ovn-bridge-mappings="external:br-tun,share:br-tun,direct-connect:br-tun", ovn-encap-ip="172.16.0.13", ovn-encap-tos=inherit, ovn-encap-type=geneve, ovn-remote="tcp:[192.168.122.171]:6642", ovn-set-local-ip="true", rundir="/var/run/openvswitch", system-id=cell1-xgw-dpdk}
iface_types : [bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options : []
next_cfg : 15
other_config : {dpdk-extra=" -a 0000:00:04.0" , dpdk-init="true", pmd-perf-metrics="false", vlan-limit="0"}
ovs_version : "2.17.5"
ssl : []
statistics : {}
system_type : cclinux
system_version : "22.09.2"
status : {bus_info="bus_name=pci, vendor_id=8086, device_id=100e", driver_name=net_e1000_em, if_descr="DPDK 23.11.0 net_e1000_em", if_type="6", link_speed="1Gbps", max_hash_mac_addrs="0", max_mac_addrs="15", max_rx_pktlen="1518", max_rx_queues="2", max_tx_queues="2", max_vfs="0", max_vmdq_pools="0", min_rx_bufsize="256", n_rxq="1", n_txq="2", numa_id="-1", port_no="0", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="false", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
/usr/local/bin/dpdk-devbind.py --status
Network devices using DPDK-compatible driver
============================================
0000:00:04.0 '82540EM Gigabit Ethernet Controller 100e' drv=uio_pci_generic unused=
Network devices using kernel driver
===================================
0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' if=eth0 drv=e1000 unused=uio_pci_generic *Active*
Jun Wang
On Fri, Jul 12, 2024 at 07:30:47PM +0800, Jun Wang wrote:
> The code to update link status is not safe in secondary process.
> If called from secondary it will crash, example from dumpcap:
> eth_em_link_update
>
> Signed-off-by: Jun Wang <junwang01@cestc.cn>
> ---
> drivers/net/e1000/em_ethdev.c | 3 +++
> 1 file changed, 3 insertions(+)
>
Given this is fixing an issue experienced in the real-world I think we
should take this patch. As Stephen says, a better solution would be to have
the whole function work properly in secondary, but I'd rather avoid crashes
as a priority.
Fixes: 805803445a02 ("e1000: support EM devices (also known as e1000/e1000e)")
Cc: stable@dpdk.org
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
On Thu, Aug 22, 2024 at 04:58:59PM +0100, Bruce Richardson wrote:
> On Fri, Jul 12, 2024 at 07:30:47PM +0800, Jun Wang wrote:
> > The code to update link status is not safe in secondary process.
> > If called from secondary it will crash, example from dumpcap:
> > eth_em_link_update
> >
> > Signed-off-by: Jun Wang <junwang01@cestc.cn>
> > ---
> > drivers/net/e1000/em_ethdev.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
>
> Given this is fixing an issue experienced in the real-world I think we
> should take this patch. As Stephen says, a better solution would be to have
> the whole function work properly in secondary, but I'd rather avoid crashes
> as a priority.
>
> Fixes: 805803445a02 ("e1000: support EM devices (also known as e1000/e1000e)")
> Cc: stable@dpdk.org
>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Applied to dpdk-next-net-intel.
Thanks,
/Bruce
@@ -1136,6 +1136,9 @@ static int eth_em_pci_remove(struct rte_pci_device *pci_dev)
struct rte_eth_link link;
int link_up, count;
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+ return -1;
+
link_up = 0;
hw->mac.get_link_status = 1;