From patchwork Wed Oct 31 18:39:45 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luca Boccassi
X-Patchwork-Id: 47634
X-Patchwork-Delegate: thomas@monjalon.net
From: Luca Boccassi
To: dev@dpdk.org
Cc: yongwang@vmware.com, 3chas3@gmail.com,
 bruce.richardson@intel.com, anatoly.burakov@intel.com, thomas@monjalon.net,
 llouis@vmware.com, Luca Boccassi, stable@dpdk.org, Brian Russell
Date: Wed, 31 Oct 2018 18:39:45 +0000
Message-Id: <20181031183945.29509-3-bluca@debian.org>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20181031183945.29509-1-bluca@debian.org>
References: <20180816135032.28283-1-bluca@debian.org>
 <20181031183945.29509-1-bluca@debian.org>
Subject: [dpdk-dev] [PATCH v3 3/3] eal/linux: handle uio read failure in
 interrupt handler

If a device is unplugged while an interrupt is pending, the read call to
the uio device to remove it from the poll wait list can fail resulting
in it being continually polled forever. This change checks for the read
failing and if so, unregisters the device as an interrupt source and
causes the wait list to be rebuilt.

This race has been reported and observed in production.

Fixes: 0a45657a6794 ("pci: rework interrupt handling")
Cc: stable@dpdk.org

Signed-off-by: Brian Russell
Signed-off-by: Luca Boccassi
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 39252a887..cbac451e1 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -700,7 +700,7 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 	bool call = false;
 	int n, bytes_read;
 	struct rte_intr_source *src;
-	struct rte_intr_callback *cb;
+	struct rte_intr_callback *cb, *next;
 	union rte_intr_read_buffer buf;
 	struct rte_intr_callback active_cb;
 
@@ -780,6 +780,23 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 					"descriptor %d: %s\n",
 					events[n].data.fd,
 					strerror(errno));
+				/*
+				 * The device is unplugged or buggy, remove
+				 * it as an interrupt source and return to
+				 * force the wait list to be rebuilt.
+				 */
+				rte_spinlock_lock(&intr_lock);
+				TAILQ_REMOVE(&intr_sources, src, next);
+				rte_spinlock_unlock(&intr_lock);
+
+				for (cb = TAILQ_FIRST(&src->callbacks); cb;
+							cb = next) {
+					next = TAILQ_NEXT(cb, next);
+					TAILQ_REMOVE(&src->callbacks, cb, next);
+					free(cb);
+				}
+				free(src);
+				return -1;
 			} else if (bytes_read == 0)
 				RTE_LOG(ERR, EAL, "Read nothing from file "
 					"descriptor %d\n", events[n].data.fd);
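
Note on the callback teardown loop in the second hunk: it saves the successor in `next` before freeing `cb`, because a plain TAILQ_FOREACH would read the link field of an element that has already been freed. The standalone sketch below shows the same idiom outside of DPDK. The names `struct callback`, `struct cb_list`, `push_callback` and `drain_callbacks` are hypothetical and exist only for illustration; they are not part of the EAL, and the sketch takes no lock, unlike the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct rte_intr_callback. */
struct callback {
	TAILQ_ENTRY(callback) next;
	int id;
};

/* Hypothetical stand-in for the per-source callback list. */
TAILQ_HEAD(cb_list, callback);

/* Append one heap-allocated callback; returns 0 on success, -1 on failure. */
static int
push_callback(struct cb_list *list, int id)
{
	struct callback *cb = malloc(sizeof(*cb));

	if (cb == NULL)
		return -1;
	cb->id = id;
	TAILQ_INSERT_TAIL(list, cb, next);
	return 0;
}

/*
 * Remove and free every element. The successor is fetched before the
 * current element is freed, mirroring the loop added by the patch.
 * Returns the number of elements freed.
 */
static int
drain_callbacks(struct cb_list *list)
{
	struct callback *cb, *next;
	int freed = 0;

	for (cb = TAILQ_FIRST(list); cb != NULL; cb = next) {
		next = TAILQ_NEXT(cb, next);
		TAILQ_REMOVE(list, cb, next);
		free(cb);
		freed++;
	}
	assert(TAILQ_EMPTY(list));
	return freed;
}
```

A caller would initialise the list with TAILQ_INIT, add entries with `push_callback`, and call `drain_callbacks` once the owning source is no longer reachable from any shared list, which is the role played by the locked TAILQ_REMOVE on `intr_sources` in the patch.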