Message ID | 5790A5D4.1090703@samsung.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers |
Return-Path: <dev-bounces@dpdk.org> X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id 628B13990; Thu, 21 Jul 2016 12:37:13 +0200 (CEST) Received: from mailout4.w1.samsung.com (mailout4.w1.samsung.com [210.118.77.14]) by dpdk.org (Postfix) with ESMTP id D9B9819F5 for <dev@dpdk.org>; Thu, 21 Jul 2016 12:37:11 +0200 (CEST) Received: from eucpsbgm1.samsung.com (unknown [203.254.199.244]) by mailout4.w1.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014)) with ESMTP id <0OAN00CF3UTYXR70@mailout4.w1.samsung.com> for dev@dpdk.org; Thu, 21 Jul 2016 11:37:10 +0100 (BST) X-AuditID: cbfec7f4-f796c6d000001486-c8-5790a5d6c791 Received: from eusync3.samsung.com ( [203.254.199.213]) by eucpsbgm1.samsung.com (EUCPMTA) with SMTP id C5.BD.05254.6D5A0975; Thu, 21 Jul 2016 11:37:10 +0100 (BST) Received: from [106.109.129.180] by eusync3.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014)) with ESMTPA id <0OAN008LEUTX7990@eusync3.samsung.com>; Thu, 21 Jul 2016 11:37:10 +0100 (BST) To: Yuanhan Liu <yuanhan.liu@linux.intel.com> References: <1469089275-15209-1-git-send-email-i.maximets@samsung.com> <20160721093714.GD28708@yliu-dev.sh.intel.com> <579099BC.9050603@samsung.com> <20160721101311.GE28708@yliu-dev.sh.intel.com> Cc: dev@dpdk.org, Huawei Xie <huawei.xie@intel.com>, Dyasly Sergey <s.dyasly@samsung.com>, Heetae Ahn <heetae82.ahn@samsung.com>, Thomas Monjalon <thomas.monjalon@6wind.com> From: Ilya Maximets <i.maximets@samsung.com> Message-id: <5790A5D4.1090703@samsung.com> Date: Thu, 21 Jul 2016 13:37:08 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0 MIME-version: 1.0 In-reply-to: <20160721101311.GE28708@yliu-dev.sh.intel.com> Content-type: text/plain; charset=windows-1252 Content-transfer-encoding: 7bit X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFrrFLMWRmVeSWpSXmKPExsVy+t/xq7rXlk4INzgzi9Xi3aftTBbTPt9m t2ifeZbJ4kr7T3aLybOlLL5sms5mcX3CBVYHdo+L/XcYPX4tWMrqsXjPSyaPeScDPfq2rGIM YI3isklJzcksSy3St0vgyjgx+z1zwWvpijcnljA3MN4Q62Lk5JAQMJHo+v6BCcIWk7hwbz1b FyMXh5DAUkaJjV3v2UESQgIvGCVmXJcDsYUFrCWmTXzBBmKLCOhKPJ2zjhWi4TijROO+RmYQ h1lgI6PEgqmnwarYBHQkTq0+wghi8wpoSTQ+uQI2lUVAVWLZ0yfMILaoQITErO0/mCBqBCV+ TL7HAmJzAm3r7FwGZHMADdWTuH9RCyTMLCAvsXnNW+YJjAKzkHTMQqiahaRqASPzKkbR1NLk guKk9FxDveLE3OLSvHS95PzcTYyQEP+yg3HxMatDjAIcjEo8vAkr+8OFWBPLiitzDzFKcDAr ifDaLZwQLsSbklhZlVqUH19UmpNafIhRmoNFSZx37q73IUIC6YklqdmpqQWpRTBZJg5OqQbG hrKlJVncKzmvLNXeJH1Udf7KO5d8d5zkf6nn+Vp1oYlRO6f1vlbeG4HrfOeWeamZHrdLyY7Z ws3RsSFKW2RR/iyffUtiT50/YXb49m8p2XaJ6F1N4vIvI0S5Qqec/yj+4cfa7+Jf9iYa7Pq3 qtZx9omXu/lPzF0iPl8/VsHlMKv5/drq4rrNSizFGYmGWsxFxYkAYPCuQW0CAAA= Subject: Re: [dpdk-dev] [PATCH] vhost: fix connect hang in client mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK <dev.dpdk.org> List-Unsubscribe: <http://dpdk.org/ml/options/dev>, <mailto:dev-request@dpdk.org?subject=unsubscribe> List-Archive: <http://dpdk.org/ml/archives/dev/> List-Post: <mailto:dev@dpdk.org> List-Help: <mailto:dev-request@dpdk.org?subject=help> List-Subscribe: <http://dpdk.org/ml/listinfo/dev>, <mailto:dev-request@dpdk.org?subject=subscribe> Errors-To: dev-bounces@dpdk.org Sender: "dev" <dev-bounces@dpdk.org> |
Commit Message
Ilya Maximets
July 21, 2016, 10:37 a.m. UTC
On 21.07.2016 13:13, Yuanhan Liu wrote: > On Thu, Jul 21, 2016 at 12:45:32PM +0300, Ilya Maximets wrote: >> On 21.07.2016 12:37, Yuanhan Liu wrote: >>> On Thu, Jul 21, 2016 at 11:21:15AM +0300, Ilya Maximets wrote: >>>> If something abnormal happened to QEMU, 'connect()' can block calling >>>> thread (e.g. main thread of OVS) forever or for a really long time. >>>> This can break whole application or block the reconnection thread. >>>> >>>> Example with OVS: >>>> >>>> ovs_rcu(urcu2)|WARN|blocked 512000 ms waiting for main to quiesce >>>> (gdb) bt >>>> #0 connect () from /lib64/libpthread.so.0 >>>> #1 vhost_user_create_client (vsocket=0xa816e0) >>>> #2 rte_vhost_driver_register >>>> #3 netdev_dpdk_vhost_user_construct >>>> #4 netdev_open (name=0xa664b0 "vhost1") >>>> [...] >>>> #11 main >>>> >>>> Fix that by setting non-blocking mode for client sockets for connection. >>>> >>>> Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode") >>> >>> Thanks for spotting and fixing yet another bug! >>> >>>> >>>> +static int >>>> +vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) >>> >>> I don't quite understand why this is needed: connect() with O_NONBLOCK >>> flag set is not enough? >> >> There is a little issue with non-blocking connect() call. Connection >> establishing may be started but '-1' returned with 'errno = EINPROGRESS'. >> In this case we must wait on fd until it will be available for writing. >> After that we need to check current status of connection using getsockopt(). >> >> I don't sure that we're able to get such situation, but it's documented, >> and, I think, we should handle it. >> >> See 'man connect' for details. > > I see. Thanks. > > But basically, I don't like the way of introduing yet another > fdset here. I'm wondering we could leverage current fdset code > to achieve that. This might need some work though. > > So how about making it simple and stupid at this stage: sleep a > while (maybe 1ms, or maybe 1s) when that happens, and give up > when the connection is still not established? Hmm, how about this fixup: ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ ? We will not check the EINPROGRESS, but subsequent 'connect()' will return EISCONN if connection already established. getsockopt() is kept just in case. Subsequent 'connect()' will happen on the next iteration of reconnection cycle (1 second sleep). Best regards, Ilya Maximets.
Comments
On 21.07.2016 13:37, Ilya Maximets wrote: > > > On 21.07.2016 13:13, Yuanhan Liu wrote: >> On Thu, Jul 21, 2016 at 12:45:32PM +0300, Ilya Maximets wrote: >>> On 21.07.2016 12:37, Yuanhan Liu wrote: >>>> On Thu, Jul 21, 2016 at 11:21:15AM +0300, Ilya Maximets wrote: >>>>> If something abnormal happened to QEMU, 'connect()' can block calling >>>>> thread (e.g. main thread of OVS) forever or for a really long time. >>>>> This can break whole application or block the reconnection thread. >>>>> >>>>> Example with OVS: >>>>> >>>>> ovs_rcu(urcu2)|WARN|blocked 512000 ms waiting for main to quiesce >>>>> (gdb) bt >>>>> #0 connect () from /lib64/libpthread.so.0 >>>>> #1 vhost_user_create_client (vsocket=0xa816e0) >>>>> #2 rte_vhost_driver_register >>>>> #3 netdev_dpdk_vhost_user_construct >>>>> #4 netdev_open (name=0xa664b0 "vhost1") >>>>> [...] >>>>> #11 main >>>>> >>>>> Fix that by setting non-blocking mode for client sockets for connection. >>>>> >>>>> Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode") >>>> >>>> Thanks for spotting and fixing yet another bug! >>>> >>>>> >>>>> +static int >>>>> +vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) >>>> >>>> I don't quite understand why this is needed: connect() with O_NONBLOCK >>>> flag set is not enough? >>> >>> There is a little issue with non-blocking connect() call. Connection >>> establishing may be started but '-1' returned with 'errno = EINPROGRESS'. >>> In this case we must wait on fd until it will be available for writing. >>> After that we need to check current status of connection using getsockopt(). >>> >>> I don't sure that we're able to get such situation, but it's documented, >>> and, I think, we should handle it. >>> >>> See 'man connect' for details. >> >> I see. Thanks. >> >> But basically, I don't like the way of introduing yet another >> fdset here. I'm wondering we could leverage current fdset code >> to achieve that. This might need some work though. >> >> So how about making it simple and stupid at this stage: sleep a >> while (maybe 1ms, or maybe 1s) when that happens, and give up >> when the connection is still not established? > > Hmm, how about this fixup: > ------------------------------------------------------------------------------ > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c > index 8626d13..b0f45e6 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) > errno = EINVAL; > > ret = connect(fd, un, sz); > - if (ret == -1 && errno != EINPROGRESS) > - return -1; > - if (ret == 0) > - goto connected; > - > - FD_ZERO(&fdset); > - FD_SET(fd, &fdset); > - > - ret = select(fd + 1, NULL, &fdset, NULL, &tv); > - if (!ret) > - errno = ETIMEDOUT; > - if (ret != 1) > + if (ret < 0 && errno != EISCONN) > return -1; > > ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len); > @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) > return -1; > } > > -connected: > flags = fcntl(fd, F_GETFL, 0); > if (flags < 0) { > RTE_LOG(ERR, VHOST_CONFIG, > ------------------------------------------------------------------------------ > ? > > We will not check the EINPROGRESS, but subsequent 'connect()' will return > EISCONN if connection already established. getsockopt() is kept just in > case. Subsequent 'connect()' will happen on the next iteration of > reconnection cycle (1 second sleep). I've sent v2 with this changes. Best regards, Ilya Maximets.
On Thu, Jul 21, 2016 at 02:14:59PM +0300, Ilya Maximets wrote: > > Hmm, how about this fixup: > > ------------------------------------------------------------------------------ > > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c > > index 8626d13..b0f45e6 100644 > > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c > > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c > > @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) > > errno = EINVAL; > > > > ret = connect(fd, un, sz); > > - if (ret == -1 && errno != EINPROGRESS) > > - return -1; > > - if (ret == 0) > > - goto connected; > > - > > - FD_ZERO(&fdset); > > - FD_SET(fd, &fdset); > > - > > - ret = select(fd + 1, NULL, &fdset, NULL, &tv); > > - if (!ret) > > - errno = ETIMEDOUT; > > - if (ret != 1) > > + if (ret < 0 && errno != EISCONN) > > return -1; > > > > ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len); > > @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) > > return -1; > > } > > > > -connected: > > flags = fcntl(fd, F_GETFL, 0); > > if (flags < 0) { > > RTE_LOG(ERR, VHOST_CONFIG, > > ------------------------------------------------------------------------------ > > ? > > > > We will not check the EINPROGRESS, but subsequent 'connect()' will return > > EISCONN if connection already established. getsockopt() is kept just in > > case. Subsequent 'connect()' will happen on the next iteration of > > reconnection cycle (1 second sleep). > > I've sent v2 with this changes. Thanks. But still, it doesn't look clean to me. I was thinking following might be cleaner? diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user. index f0f92f8..c0ef290 100644 --- a/lib/librte_vhost/vhost_user/vhost-net-user.c +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c @@ -532,6 +532,10 @@ vhost_user_client_reconnect(void *arg __rte_unused) reconn != NULL; reconn = next) { next = TAILQ_NEXT(reconn, next); + if (reconn->conn_inprogress) { + /* do connect check here */ + } + if (connect(reconn->fd, (struct sockaddr *)&reconn->un, sizeof(reconn->un)) < 0) continue; @@ -605,6 +609,7 @@ vhost_user_create_client(struct vhost_user_socket *vsocket) reconn->un = un; reconn->fd = fd; reconn->vsocket = vsocket; + reconn->conn_inprogress = errno == EINPROGRESS; pthread_mutex_lock(&reconn_list.mutex); TAILQ_INSERT_TAIL(&reconn_list.head, reconn, next); pthread_mutex_unlock(&reconn_list.mutex); It's just a rough diff, hopefully it shows my idea clearly. And of course, we should not call connect() anymore when conn_inprogress is set. What do you think of it? --yliu
On 21.07.2016 14:40, Yuanhan Liu wrote: > On Thu, Jul 21, 2016 at 02:14:59PM +0300, Ilya Maximets wrote: >>> Hmm, how about this fixup: >>> ------------------------------------------------------------------------------ >>> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c >>> index 8626d13..b0f45e6 100644 >>> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c >>> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c >>> @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) >>> errno = EINVAL; >>> >>> ret = connect(fd, un, sz); >>> - if (ret == -1 && errno != EINPROGRESS) >>> - return -1; >>> - if (ret == 0) >>> - goto connected; >>> - >>> - FD_ZERO(&fdset); >>> - FD_SET(fd, &fdset); >>> - >>> - ret = select(fd + 1, NULL, &fdset, NULL, &tv); >>> - if (!ret) >>> - errno = ETIMEDOUT; >>> - if (ret != 1) >>> + if (ret < 0 && errno != EISCONN) >>> return -1; >>> >>> ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len); >>> @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) >>> return -1; >>> } >>> >>> -connected: >>> flags = fcntl(fd, F_GETFL, 0); >>> if (flags < 0) { >>> RTE_LOG(ERR, VHOST_CONFIG, >>> ------------------------------------------------------------------------------ >>> ? >>> >>> We will not check the EINPROGRESS, but subsequent 'connect()' will return >>> EISCONN if connection already established. getsockopt() is kept just in >>> case. Subsequent 'connect()' will happen on the next iteration of >>> reconnection cycle (1 second sleep). >> >> I've sent v2 with this changes. > > Thanks. But still, it doesn't look clean to me. I was thinking following > might be cleaner? > > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c > b/lib/librte_vhost/vhost_user/vhost-net-user. > index f0f92f8..c0ef290 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -532,6 +532,10 @@ vhost_user_client_reconnect(void *arg __rte_unused) > reconn != NULL; reconn = next) { > next = TAILQ_NEXT(reconn, next); > > + if (reconn->conn_inprogress) { > + /* do connect check here */ > + } > + > if (connect(reconn->fd, (struct sockaddr *)&reconn->un, > sizeof(reconn->un)) < 0) > continue; > @@ -605,6 +609,7 @@ vhost_user_create_client(struct vhost_user_socket *vsocket) > reconn->un = un; > reconn->fd = fd; > reconn->vsocket = vsocket; > + reconn->conn_inprogress = errno == EINPROGRESS; > pthread_mutex_lock(&reconn_list.mutex); > TAILQ_INSERT_TAIL(&reconn_list.head, reconn, next); > pthread_mutex_unlock(&reconn_list.mutex); > > It's just a rough diff, hopefully it shows my idea clearly. And of > course, we should not call connect() anymore when conn_inprogress > is set. > > What do you think of it? I found that we can't check connection status without select/poll on it. 'getsockopt()' will return 0 with no errors if connection is not still established just like if it was. So, I think, the first version of this patch is the only acceptable solution. Best regards, Ilya Maximets.
On 21.07.2016 15:10, Ilya Maximets wrote: > On 21.07.2016 14:40, Yuanhan Liu wrote: >> On Thu, Jul 21, 2016 at 02:14:59PM +0300, Ilya Maximets wrote: >>>> Hmm, how about this fixup: >>>> ------------------------------------------------------------------------------ >>>> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c >>>> index 8626d13..b0f45e6 100644 >>>> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c >>>> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c >>>> @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) >>>> errno = EINVAL; >>>> >>>> ret = connect(fd, un, sz); >>>> - if (ret == -1 && errno != EINPROGRESS) >>>> - return -1; >>>> - if (ret == 0) >>>> - goto connected; >>>> - >>>> - FD_ZERO(&fdset); >>>> - FD_SET(fd, &fdset); >>>> - >>>> - ret = select(fd + 1, NULL, &fdset, NULL, &tv); >>>> - if (!ret) >>>> - errno = ETIMEDOUT; >>>> - if (ret != 1) >>>> + if (ret < 0 && errno != EISCONN) >>>> return -1; >>>> >>>> ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len); >>>> @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) >>>> return -1; >>>> } >>>> >>>> -connected: >>>> flags = fcntl(fd, F_GETFL, 0); >>>> if (flags < 0) { >>>> RTE_LOG(ERR, VHOST_CONFIG, >>>> ------------------------------------------------------------------------------ >>>> ? >>>> >>>> We will not check the EINPROGRESS, but subsequent 'connect()' will return >>>> EISCONN if connection already established. getsockopt() is kept just in >>>> case. Subsequent 'connect()' will happen on the next iteration of >>>> reconnection cycle (1 second sleep). >>> >>> I've sent v2 with this changes. >> >> Thanks. But still, it doesn't look clean to me. I was thinking following >> might be cleaner? >> >> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c >> b/lib/librte_vhost/vhost_user/vhost-net-user. >> index f0f92f8..c0ef290 100644 >> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c >> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c >> @@ -532,6 +532,10 @@ vhost_user_client_reconnect(void *arg __rte_unused) >> reconn != NULL; reconn = next) { >> next = TAILQ_NEXT(reconn, next); >> >> + if (reconn->conn_inprogress) { >> + /* do connect check here */ >> + } >> + >> if (connect(reconn->fd, (struct sockaddr *)&reconn->un, >> sizeof(reconn->un)) < 0) >> continue; >> @@ -605,6 +609,7 @@ vhost_user_create_client(struct vhost_user_socket *vsocket) >> reconn->un = un; >> reconn->fd = fd; >> reconn->vsocket = vsocket; >> + reconn->conn_inprogress = errno == EINPROGRESS; >> pthread_mutex_lock(&reconn_list.mutex); >> TAILQ_INSERT_TAIL(&reconn_list.head, reconn, next); >> pthread_mutex_unlock(&reconn_list.mutex); >> >> It's just a rough diff, hopefully it shows my idea clearly. And of >> course, we should not call connect() anymore when conn_inprogress >> is set. >> >> What do you think of it? > > I found that we can't check connection status without select/poll > on it. 'getsockopt()' will return 0 with no errors if connection > is not still established just like if it was. > So, I think, the first version of this patch is the only > acceptable solution. Sorry, v2 is acceptable too, because it always calls 'connect()'.
On Thu, Jul 21, 2016 at 03:13:14PM +0300, Ilya Maximets wrote: > >> > >> What do you think of it? > > > > I found that we can't check connection status without select/poll > > on it. 'getsockopt()' will return 0 with no errors if connection > > is not still established just like if it was. > > So, I think, the first version of this patch is the only > > acceptable solution. > > Sorry, v2 is acceptable too, because it always calls 'connect()'. So you have done the test that it works? I'm more curious to know could your above case hit the getsockopt() code path, I mean, the path that errno is set to EINPROGRESS or EISCONN? --yliu
On 21.07.2016 15:35, Yuanhan Liu wrote: > On Thu, Jul 21, 2016 at 03:13:14PM +0300, Ilya Maximets wrote: >>>> >>>> What do you think of it? >>> >>> I found that we can't check connection status without select/poll >>> on it. 'getsockopt()' will return 0 with no errors if connection >>> is not still established just like if it was. >>> So, I think, the first version of this patch is the only >>> acceptable solution. >> >> Sorry, v2 is acceptable too, because it always calls 'connect()'. > > So you have done the test that it works? No, it's just theory. I don't know how to test this. > I'm more curious to know > could your above case hit the getsockopt() code path, I mean, the > path that errno is set to EINPROGRESS or EISCONN? As I already told, I don't sure that we're able to get EINPROGRESS on our AF_UNIX sockets. In v2 'getsockopt()' check is unnecessary.
On 21.07.2016 15:58, Yuanhan Liu wrote: > On Thu, Jul 21, 2016 at 03:42:54PM +0300, Ilya Maximets wrote: >> On 21.07.2016 15:35, Yuanhan Liu wrote: >>> On Thu, Jul 21, 2016 at 03:13:14PM +0300, Ilya Maximets wrote: >>>>>> >>>>>> What do you think of it? >>>>> >>>>> I found that we can't check connection status without select/poll >>>>> on it. 'getsockopt()' will return 0 with no errors if connection >>>>> is not still established just like if it was. >>>>> So, I think, the first version of this patch is the only >>>>> acceptable solution. >>>> >>>> Sorry, v2 is acceptable too, because it always calls 'connect()'. >>> >>> So you have done the test that it works? >> >> No, it's just theory. I don't know how to test this. >> >>> I'm more curious to know >>> could your above case hit the getsockopt() code path, I mean, the >>> path that errno is set to EINPROGRESS or EISCONN? >> >> As I already told, I don't sure that we're able to get EINPROGRESS >> on our AF_UNIX sockets. >> In v2 'getsockopt()' check is unnecessary. > > We then have no reason to keep it? You want me to re-send v2 without 'getsockopt()' check?
On Thu, Jul 21, 2016 at 03:42:54PM +0300, Ilya Maximets wrote: > On 21.07.2016 15:35, Yuanhan Liu wrote: > > On Thu, Jul 21, 2016 at 03:13:14PM +0300, Ilya Maximets wrote: > >>>> > >>>> What do you think of it? > >>> > >>> I found that we can't check connection status without select/poll > >>> on it. 'getsockopt()' will return 0 with no errors if connection > >>> is not still established just like if it was. > >>> So, I think, the first version of this patch is the only > >>> acceptable solution. > >> > >> Sorry, v2 is acceptable too, because it always calls 'connect()'. > > > > So you have done the test that it works? > > No, it's just theory. I don't know how to test this. > > > I'm more curious to know > > could your above case hit the getsockopt() code path, I mean, the > > path that errno is set to EINPROGRESS or EISCONN? > > As I already told, I don't sure that we're able to get EINPROGRESS > on our AF_UNIX sockets. > In v2 'getsockopt()' check is unnecessary. We then have no reason to keep it? --yliu
On Thu, Jul 21, 2016 at 03:58:11PM +0300, Ilya Maximets wrote: > On 21.07.2016 15:58, Yuanhan Liu wrote: > > On Thu, Jul 21, 2016 at 03:42:54PM +0300, Ilya Maximets wrote: > >> On 21.07.2016 15:35, Yuanhan Liu wrote: > >>> On Thu, Jul 21, 2016 at 03:13:14PM +0300, Ilya Maximets wrote: > >>>>>> > >>>>>> What do you think of it? > >>>>> > >>>>> I found that we can't check connection status without select/poll > >>>>> on it. 'getsockopt()' will return 0 with no errors if connection > >>>>> is not still established just like if it was. > >>>>> So, I think, the first version of this patch is the only > >>>>> acceptable solution. > >>>> > >>>> Sorry, v2 is acceptable too, because it always calls 'connect()'. > >>> > >>> So you have done the test that it works? > >> > >> No, it's just theory. I don't know how to test this. > >> > >>> I'm more curious to know > >>> could your above case hit the getsockopt() code path, I mean, the > >>> path that errno is set to EINPROGRESS or EISCONN? > >> > >> As I already told, I don't sure that we're able to get EINPROGRESS > >> on our AF_UNIX sockets. > >> In v2 'getsockopt()' check is unnecessary. > > > > We then have no reason to keep it? > > You want me to re-send v2 without 'getsockopt()' check? Yes, because I'm not sure it will work without select or poll. --yliu
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c index 8626d13..b0f45e6 100644 --- a/lib/librte_vhost/vhost_user/vhost-net-user.c +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) errno = EINVAL; ret = connect(fd, un, sz); - if (ret == -1 && errno != EINPROGRESS) - return -1; - if (ret == 0) - goto connected; - - FD_ZERO(&fdset); - FD_SET(fd, &fdset); - - ret = select(fd + 1, NULL, &fdset, NULL, &tv); - if (!ret) - errno = ETIMEDOUT; - if (ret != 1) + if (ret < 0 && errno != EISCONN) return -1; ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len); @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz) return -1; } -connected: flags = fcntl(fd, F_GETFL, 0); if (flags < 0) { RTE_LOG(ERR, VHOST_CONFIG,