From patchwork Wed Sep 19 12:57:57 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luca Boccassi
X-Patchwork-Id: 44911
X-Patchwork-Delegate: thomas@monjalon.net
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 6031C4C96;
 Wed, 19 Sep 2018 14:58:18 +0200 (CEST)
Received: from mail-wr1-f45.google.com (mail-wr1-f45.google.com
 [209.85.221.45]) by dpdk.org (Postfix) with ESMTP id 0F4EA4C90
 for ; Wed, 19 Sep 2018 14:58:17 +0200 (CEST)
Received: by mail-wr1-f45.google.com with SMTP id y8-v6so2049622wrh.7
 for ; Wed, 19 Sep 2018 05:58:17 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=OoN1luTMKuIv4XtXNZbAo4G+yixREYqmXo693o6dr9w=;
 b=VHOwXmdeEIQalt/ptA9tolcSE9sXB1xzhgFlGOLfzL9ZQMyhE7oM2Knb6+czOhhpZ8
 +S3aN2G9voMgZFXgj1W76i5/Uv+D9Coete2zUr7TmDpbqu0ZFTVKgxvnIhbCSxYkJnKu
 w7/Eec2wz5l5c/d4KYe3KrV50zDQI9Br4KSI/JXSYaJCdfeOIkCMJJSmvjnc8FX2WpY6
 HHmNSlwcE7Scyci+vE440QKx9hDStx4vEorUhO1VuMLvXFkcpC7rYn9gl6QUHC6DWD2J
 TNNX7v7Z2vgPyJ2bZQf5oWANsN0fuQ/FqwcQYew4CKuWJUq/dLKrNejqpr/Y91ZFJnMs
 WwWw==
X-Gm-Message-State: APzg51D+LP+gsKauqtYuGVkna/Qp6796WC5mqSsRVkk+vZqiWWovOLqP
 y4cpzoCXye0TopC6Awrk5/ymmtS21QI=
X-Google-Smtp-Source: ANB0VdaRv54f7pdgFOCYz1feR2nEWlZpbg7oBrt4fVJ4QHP+8CmbY9vK29df+pc7B0bgQsi+x+5RvA==
X-Received: by 2002:a05:6000:10d0:: with SMTP id
 b16mr28803961wrx.226.1537361896373;
 Wed, 19 Sep 2018 05:58:16 -0700 (PDT)
Received: from localhost ([2001:1be0:110d:fcfe:489f:80a9:5d59:c6bd])
 by smtp.gmail.com with ESMTPSA id g17-v6sm3333006wmh.19.2018.09.19.05.58.15
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Wed, 19 Sep 2018 05:58:15 -0700 (PDT)
From: Luca Boccassi
To: dev@dpdk.org
Cc: maxime.coquelin@redhat.com, tiwei.bie@intel.com,
 yongwang@vmware.com, 3chas3@gmail.com, bruce.richardson@intel.com,
 jianfeng.tan@intel.com, anatoly.burakov@intel.com, llouis@vmware.com,
 brussell@vyatta.att-mail.com
Date: Wed, 19 Sep 2018 13:57:57 +0100
Message-Id: <20180919125757.17938-3-bluca@debian.org>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180919125757.17938-1-bluca@debian.org>
References: <20180816135032.28283-1-bluca@debian.org>
 <20180919125757.17938-1-bluca@debian.org>
Subject: [dpdk-dev] [PATCH v2 3/3] eal/linux: handle uio read failure in
 interrupt handler
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

If a device is unplugged while an interrupt is pending, the read call
on the uio device, which is what removes it from the poll wait list,
can fail. The device then stays in the wait list and is polled forever.

This change checks whether the read failed and, if so, unregisters the
device as an interrupt source and causes the wait list to be rebuilt.

This race has been reported and observed in production.

Fixes: 0a45657a6794 ("pci: rework interrupt handling")
Cc: stable@dpdk.org

Signed-off-by: Brian Russell
Signed-off-by: Luca Boccassi
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 4076c6d6ca..34584db883 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -627,7 +627,7 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 	bool call = false;
 	int n, bytes_read;
 	struct rte_intr_source *src;
-	struct rte_intr_callback *cb;
+	struct rte_intr_callback *cb, *next;
 	union rte_intr_read_buffer buf;
 	struct rte_intr_callback active_cb;
 
@@ -701,6 +701,23 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 					"descriptor %d: %s\n",
 					events[n].data.fd,
 					strerror(errno));
+				/*
+				 * The device is unplugged or buggy, remove
+				 * it as an interrupt source and return to
+				 * force the wait list to be rebuilt.
+				 */
+				rte_spinlock_lock(&intr_lock);
+				TAILQ_REMOVE(&intr_sources, src, next);
+				rte_spinlock_unlock(&intr_lock);
+
+				for (cb = TAILQ_FIRST(&src->callbacks); cb;
+						cb = next) {
+					next = TAILQ_NEXT(cb, next);
+					TAILQ_REMOVE(&src->callbacks, cb, next);
+					free(cb);
+				}
+				free(src);
+				return -1;
 			} else if (bytes_read == 0)
 				RTE_LOG(ERR, EAL, "Read nothing from file "
 					"descriptor %d\n", events[n].data.fd);
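
Note on the teardown loop: the added hunk saves the successor of each
callback before freeing it, because TAILQ_NEXT() must not be evaluated
on memory that has already been released. A minimal standalone sketch
of that idiom follows, using only <sys/queue.h>. The demo_* types and
functions are hypothetical stand-ins for rte_intr_source and
rte_intr_callback and are not part of DPDK. The locking around the
source list that the patch does with intr_lock is left out here, so
this is an illustration of the list handling only, not of the EAL
code path.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct rte_intr_callback. */
struct demo_cb {
	TAILQ_ENTRY(demo_cb) next;
	int id;
};
TAILQ_HEAD(demo_cb_list, demo_cb);

/* Hypothetical stand-in for struct rte_intr_source. */
struct demo_src {
	TAILQ_ENTRY(demo_src) next;
	int fd;
	struct demo_cb_list callbacks;
};
TAILQ_HEAD(demo_src_list, demo_src);

/* Allocate a source with ncb callbacks and append it to the list. */
static struct demo_src *
demo_src_add(struct demo_src_list *list, int fd, int ncb)
{
	struct demo_src *src = calloc(1, sizeof(*src));
	int i;

	assert(src != NULL);
	src->fd = fd;
	TAILQ_INIT(&src->callbacks);
	for (i = 0; i < ncb; i++) {
		struct demo_cb *cb = calloc(1, sizeof(*cb));

		assert(cb != NULL);
		cb->id = i;
		TAILQ_INSERT_TAIL(&src->callbacks, cb, next);
	}
	TAILQ_INSERT_TAIL(list, src, next);
	return src;
}

/*
 * Unlink src from the list, then free its callbacks and src itself.
 * The successor is read before free(cb), as in the patch.
 * Returns the number of callbacks freed.
 */
static int
demo_src_drop(struct demo_src_list *list, struct demo_src *src)
{
	struct demo_cb *cb, *next;
	int freed = 0;

	TAILQ_REMOVE(list, src, next);
	for (cb = TAILQ_FIRST(&src->callbacks); cb; cb = next) {
		next = TAILQ_NEXT(cb, next);
		TAILQ_REMOVE(&src->callbacks, cb, next);
		free(cb);
		freed++;
	}
	free(src);
	return freed;
}

/* Count the sources still registered. */
static int
demo_src_count(struct demo_src_list *list)
{
	struct demo_src *src;
	int n = 0;

	TAILQ_FOREACH(src, list, next)
		n++;
	return n;
}
```

A caller would TAILQ_INIT() a struct demo_src_list, register sources
with demo_src_add(), and call demo_src_drop() for the one whose read
failed. The remaining sources stay linked, which corresponds to the
wait list being rebuilt without the dead device.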