From patchwork Thu Sep 24 20:44:33 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tim Shearer
X-Patchwork-Id: 7160
From: Tim Shearer
To: "dev@dpdk.org"
Date: Thu, 24 Sep 2015 20:44:33 +0000
Message-ID: <33526A3108217C45B7DAFFA5277E4B67485277B3@mbx024-e1-nj-2.exch024.domain.local>
Subject: [dpdk-dev] [PATCH] librte: Link status interrupt race condition, IGB E1000

I encountered an issue with DPDK 2.1.0 which occasionally causes the link
status interrupt callback not to be called after the interface is started
for the first time. I traced the problem back to the function
eth_igb_link_update(), which is used to determine if the link has changed
state since the previous time it was called.

It appears that this function can be called simultaneously from two
different threads:

(1) From the main application/configuration thread, via
    rte_eth_dev_start(), where it is reached through the function pointer
    (*dev->dev_ops->link_update)
(2) From the EAL interrupt thread, via eth_igb_interrupt_action(), to check
    if the link state has transitioned up or down. The user callback is
    only executed if the link has changed state.

The race condition manifests itself as follows:

- Main thread configures the interface with link status interrupt (LSI)
  enabled, sets up the queues etc.
- Main thread calls rte_eth_dev_start(). The interface is started and then
  we call eth_igb_link_update().
- While in this call, the link goes up. Accordingly, we detect the
  transition, and write the new link state (up) into the global
  rte_eth_dev struct.
- The interrupt fires, which also drops into eth_igb_link_update() and
  finds that the global link status has already been set to up (no
  change).
- Therefore, the handler thinks the interrupt was spurious, and the
  callback doesn't get called.

I suspect that rte_eth_dev_start() shouldn't be checking the link state if
interrupts are enabled. Would someone mind taking a quick look at the patch
below?

Thanks!
Tim

--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1300,7 +1300,7 @@ rte_eth_dev_start(uint8_t port_id)
 
 	rte_eth_dev_config_restore(port_id);
 
-	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+	if (dev->data->dev_conf.intr_conf.lsc == 0) {
 		FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update, -ENOTSUP);
 		(*dev->dev_ops->link_update)(dev, 0);
 	}
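
For anyone who wants to reason about the interleaving without hardware, the
sequence described above can be replayed in a small single-threaded model.
This is only a sketch under stated assumptions: struct sim_dev and all sim_*
functions are hypothetical stand-ins, not DPDK code, and the real
eth_igb_link_update() does considerably more than compare two flags. The
model fixes the ordering to the problematic one (link comes up during
start, interrupt is handled afterwards) and counts how often the user
callback would run.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-port state; not a DPDK structure. */
struct sim_dev {
	bool stored_link_up;   /* last link state written to the shared struct */
	bool hw_link_up;       /* what the simulated hardware reports right now */
	int lsc_enabled;       /* plays the role of dev_conf.intr_conf.lsc */
	int callbacks_fired;   /* how many times the user LSC callback ran */
};

/* Simplified model of a link update: store the current hardware state and
 * report whether it differs from what was stored before. */
static bool sim_link_update(struct sim_dev *dev)
{
	bool changed = (dev->stored_link_up != dev->hw_link_up);

	dev->stored_link_up = dev->hw_link_up;
	return changed;
}

/* Simplified model of the interrupt path: the callback only runs when the
 * link update reports a state change. */
static void sim_interrupt_action(struct sim_dev *dev)
{
	if (sim_link_update(dev))
		dev->callbacks_fired++;
}

/* Simplified model of the tail of the start routine. The link is forced up
 * while start is running, which is the window described in this mail.
 * poll_when_lsc_enabled == true  models the current check (lsc != 0),
 * poll_when_lsc_enabled == false models the proposed check (lsc == 0). */
static void sim_dev_start(struct sim_dev *dev, bool poll_when_lsc_enabled)
{
	bool do_poll = poll_when_lsc_enabled ? (dev->lsc_enabled != 0)
					     : (dev->lsc_enabled == 0);

	dev->hw_link_up = true;
	if (do_poll)
		sim_link_update(dev);
}

/* Replay the first start of a port with LSC interrupts enabled, followed
 * by the interrupt handler. Returns the number of callbacks delivered. */
int simulate_first_start(bool poll_when_lsc_enabled)
{
	struct sim_dev dev = {
		.stored_link_up = false,
		.hw_link_up = false,
		.lsc_enabled = 1,
		.callbacks_fired = 0,
	};

	sim_dev_start(&dev, poll_when_lsc_enabled);
	sim_interrupt_action(&dev);
	return dev.callbacks_fired;
}
```

In this model, simulate_first_start(true) returns 0 (the start path already
stored "up", so the interrupt sees no change), while
simulate_first_start(false) returns 1 (the interrupt path is the first to
observe the transition). It does not model real concurrency, so it only
illustrates the ordering problem and says nothing about other interleavings.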