Message ID | 1589859720-16224-1-git-send-email-wangyunjian@huawei.com |
---|---|
State | Accepted, archived |
Delegated to: | David Marchand |
Headers | show |
Series |
|
Related | show |
Context | Check | Description |
---|---|---|
ci/iol-mellanox-Performance | success | Performance Testing PASS |
ci/iol-nxp-Performance | success | Performance Testing PASS |
ci/iol-intel-Performance | success | Performance Testing PASS |
ci/Intel-compilation | success | Compilation OK |
ci/checkpatch | success | coding style OK |
On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote: > > From: Yunjian Wang <wangyunjian@huawei.com> > > The issue is that a file descriptor at 0 is a valid one. Currently > the file not found, the return value will be set to 0. As a result, > it is impossible to distinguish between a correct descriptor and a > failed return value. Fix it to return -ENOENT instead of 0. > > Fixes: b758423bc4fe ("vfio: fix race condition with sysfs") > Fixes: ff0b67d1c868 ("vfio: DMA mappinge") > Cc: stable@dpdk.org This patch reverts the (marked for stable) fix b758423bc4fe ("vfio: fix race condition with sysfs") and comes with a different fix. - Kevin, Luca, I would put this b758423bc4fe backport on hold. - Anatoly, I don't want to put 20.05 at risk. My simple question for 20.05 is, should we revert b758423bc4fe? - This patchset will go to 20.08 as I don't feel confident in taking it now.
On 19-May-20 8:43 AM, David Marchand wrote: > On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote: >> >> From: Yunjian Wang <wangyunjian@huawei.com> >> >> The issue is that a file descriptor at 0 is a valid one. Currently >> the file not found, the return value will be set to 0. As a result, >> it is impossible to distinguish between a correct descriptor and a >> failed return value. Fix it to return -ENOENT instead of 0. >> >> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs") >> Fixes: ff0b67d1c868 ("vfio: DMA mappinge") >> Cc: stable@dpdk.org > > This patch reverts the (marked for stable) fix b758423bc4fe ("vfio: > fix race condition with sysfs") and comes with a different fix. > > - Kevin, Luca, I would put this b758423bc4fe backport on hold. > > - Anatoly, I don't want to put 20.05 at risk. > My simple question for 20.05 is, should we revert b758423bc4fe? No, let's not revert anything. As far as i can tell, this patch is a more complete fix, but it essentially does the same thing, just in a different (and better) way. Still, i haven't reviewed it in detail. > > - This patchset will go to 20.08 as I don't feel confident in taking it now. > Yes, let's leave it for 20.08.
On Thu, May 21, 2020 at 2:54 PM Burakov, Anatoly <anatoly.burakov@intel.com> wrote: > > On 19-May-20 8:43 AM, David Marchand wrote: > > On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote: > >> > >> From: Yunjian Wang <wangyunjian@huawei.com> > >> > >> The issue is that a file descriptor at 0 is a valid one. Currently > >> the file not found, the return value will be set to 0. As a result, > >> it is impossible to distinguish between a correct descriptor and a > >> failed return value. Fix it to return -ENOENT instead of 0. > >> > >> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs") > >> Fixes: ff0b67d1c868 ("vfio: DMA mappinge") > >> Cc: stable@dpdk.org > > > > This patch reverts the (marked for stable) fix b758423bc4fe ("vfio: > > fix race condition with sysfs") and comes with a different fix. > > > > - Kevin, Luca, I would put this b758423bc4fe backport on hold. > > > > - Anatoly, I don't want to put 20.05 at risk. > > My simple question for 20.05 is, should we revert b758423bc4fe? > > No, let's not revert anything. > > As far as i can tell, this patch is a more complete fix, but it > essentially does the same thing, just in a different (and better) way. > Still, i haven't reviewed it in detail. > > > > > - This patchset will go to 20.08 as I don't feel confident in taking it now. > > > > Yes, let's leave it for 20.08. Thanks for the analysis Anatoly.
On 21/05/2020 13:53, Burakov, Anatoly wrote: > On 19-May-20 8:43 AM, David Marchand wrote: >> On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> wrote: >>> >>> From: Yunjian Wang <wangyunjian@huawei.com> >>> >>> The issue is that a file descriptor at 0 is a valid one. Currently >>> the file not found, the return value will be set to 0. As a result, >>> it is impossible to distinguish between a correct descriptor and a >>> failed return value. Fix it to return -ENOENT instead of 0. >>> >>> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs") >>> Fixes: ff0b67d1c868 ("vfio: DMA mappinge") >>> Cc: stable@dpdk.org >> >> This patch reverts the (marked for stable) fix b758423bc4fe ("vfio: >> fix race condition with sysfs") and comes with a different fix. >> >> - Kevin, Luca, I would put this b758423bc4fe backport on hold. >> >> - Anatoly, I don't want to put 20.05 at risk. >> My simple question for 20.05 is, should we revert b758423bc4fe? > > No, let's not revert anything. > > As far as i can tell, this patch is a more complete fix, but it > essentially does the same thing, just in a different (and better) way. > Still, i haven't reviewed it in detail. > b758423bc4fe was not reverted in master - should that be backported now? or should we wait until this new fix is ready for backport too? >> >> - This patchset will go to 20.08 as I don't feel confident in taking it now. >> > > Yes, let's leave it for 20.08. >
Ping for review Thanks, Yunjian > -----Original Message----- > From: Kevin Traynor [mailto:ktraynor@redhat.com] > Sent: Wednesday, May 27, 2020 6:45 PM > To: Burakov, Anatoly <anatoly.burakov@intel.com>; David Marchand > <david.marchand@redhat.com>; wangyunjian <wangyunjian@huawei.com>; > Luca Boccassi <bluca@debian.org> > Cc: dev <dev@dpdk.org>; Hemant Agrawal <hemant.agrawal@nxp.com>; > Sachin Saxena <sachin.saxena@nxp.com>; Lilijun (Jerry) > <jerry.lilijun@huawei.com>; xudingke <xudingke@huawei.com>; dpdk stable > <stable@dpdk.org>; Luca Boccassi <bluca@debian.org> > Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH 1/2] vfio: fix check for > vfio_group_fd > > On 21/05/2020 13:53, Burakov, Anatoly wrote: > > On 19-May-20 8:43 AM, David Marchand wrote: > >> On Tue, May 19, 2020 at 5:42 AM wangyunjian <wangyunjian@huawei.com> > wrote: > >>> > >>> From: Yunjian Wang <wangyunjian@huawei.com> > >>> > >>> The issue is that a file descriptor at 0 is a valid one. Currently > >>> the file not found, the return value will be set to 0. As a result, > >>> it is impossible to distinguish between a correct descriptor and a > >>> failed return value. Fix it to return -ENOENT instead of 0. > >>> > >>> Fixes: b758423bc4fe ("vfio: fix race condition with sysfs") > >>> Fixes: ff0b67d1c868 ("vfio: DMA mappinge") > >>> Cc: stable@dpdk.org > >> > >> This patch reverts the (marked for stable) fix b758423bc4fe ("vfio: > >> fix race condition with sysfs") and comes with a different fix. > >> > >> - Kevin, Luca, I would put this b758423bc4fe backport on hold. > >> > >> - Anatoly, I don't want to put 20.05 at risk. > >> My simple question for 20.05 is, should we revert b758423bc4fe? > > > > No, let's not revert anything. > > > > As far as i can tell, this patch is a more complete fix, but it > > essentially does the same thing, just in a different (and better) way. > > Still, i haven't reviewed it in detail. > > > > b758423bc4fe was not reverted in master - should that be backported now? > or should we wait until this new fix is ready for backport too? > > >> > >> - This patchset will go to 20.08 as I don't feel confident in taking it now. > >> > > > > Yes, let's leave it for 20.08. > >
On 19-May-20 4:42 AM, wangyunjian wrote: > From: Yunjian Wang <wangyunjian@huawei.com> > > The issue is that a file descriptor at 0 is a valid one. Currently > the file not found, the return value will be set to 0. As a result, > it is impossible to distinguish between a correct descriptor and a > failed return value. Fix it to return -ENOENT instead of 0. > > Fixes: b758423bc4fe ("vfio: fix race condition with sysfs") > Fixes: ff0b67d1c868 ("vfio: DMA mappinge") > Cc: stable@dpdk.org > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > --- Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c index d26e1649a..8c5a13be6 100644 --- a/lib/librte_eal/linux/eal_vfio.c +++ b/lib/librte_eal/linux/eal_vfio.c @@ -293,7 +293,7 @@ vfio_open_group_fd(int iommu_group_num) strerror(errno)); return -1; } - return 0; + return -ENOENT; } /* noiommu group found */ } @@ -318,12 +318,12 @@ vfio_open_group_fd(int iommu_group_num) vfio_group_fd = mp_rep->fds[0]; } else if (p->result == SOCKET_NO_FD) { RTE_LOG(ERR, EAL, " bad VFIO group fd\n"); - vfio_group_fd = 0; + vfio_group_fd = -ENOENT; } } free(mp_reply.msgs); - if (vfio_group_fd < 0) + if (vfio_group_fd < 0 && vfio_group_fd != -ENOENT) RTE_LOG(ERR, EAL, " cannot request group fd\n"); return vfio_group_fd; } @@ -379,9 +379,9 @@ vfio_get_group_fd(struct vfio_config *vfio_cfg, } vfio_group_fd = vfio_open_group_fd(iommu_group_num); - if (vfio_group_fd <= 0) { + if (vfio_group_fd < 0) { RTE_LOG(ERR, EAL, "Failed to open group %d\n", iommu_group_num); - return -1; + return vfio_group_fd; } cur_grp->group_num = iommu_group_num; @@ -728,11 +728,14 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr, /* get the actual group fd */ vfio_group_fd = rte_vfio_get_group_fd(iommu_group_num); - if (vfio_group_fd < 0) + if (vfio_group_fd < 0 && vfio_group_fd != -ENOENT) return -1; - /* if group_fd == 0, that means the device isn't managed by VFIO */ - if (vfio_group_fd == 0) { + /* + * if vfio_group_fd == -ENOENT, that means the device + * isn't managed by VFIO + */ + if (vfio_group_fd == -ENOENT) { RTE_LOG(WARNING, EAL, " %s not managed by VFIO driver, skipping\n", dev_addr); return 1; @@ -955,10 +958,10 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr, /* get the actual group fd */ vfio_group_fd = rte_vfio_get_group_fd(iommu_group_num); - if (vfio_group_fd <= 0) { + if (vfio_group_fd < 0) { RTE_LOG(INFO, EAL, "rte_vfio_get_group_fd failed for %s\n", dev_addr); - ret = -1; + ret = vfio_group_fd; goto out; } diff --git a/lib/librte_eal/linux/eal_vfio_mp_sync.c b/lib/librte_eal/linux/eal_vfio_mp_sync.c index 5f2a5fc1d..6254696ae 100644 --- a/lib/librte_eal/linux/eal_vfio_mp_sync.c +++ b/lib/librte_eal/linux/eal_vfio_mp_sync.c @@ -44,9 +44,9 @@ vfio_mp_primary(const struct rte_mp_msg *msg, const void *peer) r->req = SOCKET_REQ_GROUP; r->group_num = m->group_num; fd = rte_vfio_get_group_fd(m->group_num); - if (fd < 0) + if (fd < 0 && fd != -ENOENT) r->result = SOCKET_ERR; - else if (fd == 0) + else if (fd == -ENOENT) /* if VFIO group exists but isn't bound to VFIO driver */ r->result = SOCKET_NO_FD; else {