[dpdk-dev,RFC] librte_vhost: Add unix domain socket fd registration

Message ID 1466177556-14891-1-git-send-email-aconole@redhat.com (mailing list archive)
State Rejected, archived
Delegated to: Yuanhan Liu

Commit Message

Aaron Conole June 17, 2016, 3:32 p.m. UTC
  Prior to this commit, the only way to add a vhost-user socket to the
system is by relying on librte_vhost to open the unix domain socket and
add it to the unix socket list.  This is problematic for applications
which would like to set the permissions, or applications which are not
directly allowed to open sockets due to policy restrictions.

This patch provides a new API and ABI to allow application developers to
acquire the unix domain socket via whatever mechanism fits and pass it
to the vhost driver registration process.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 doc/guides/prog_guide/vhost_lib.rst          |  8 +++++
 lib/librte_vhost/rte_vhost_version.map       |  6 ++++
 lib/librte_vhost/rte_virtio_net.h            |  6 ++++
 lib/librte_vhost/vhost_user/vhost-net-user.c | 47 ++++++++++++++++++----------
 4 files changed, 50 insertions(+), 17 deletions(-)
  

Comments

Yuanhan Liu June 21, 2016, 7:21 a.m. UTC | #1
On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
> Prior to this commit, the only way to add a vhost-user socket to the
> system is by relying on librte_vhost to open the unix domain socket and
> add it to the unix socket list.  This is problematic for applications
> which would like to set the permissions,

So, you want to address the issue raised by the following patch?

    http://dpdk.org/dev/patchwork/patch/12222/

I would still like to stick to my proposal, that is, to introduce a
new API to do the permission change at any time, if we end up
wanting to introduce a new API.

> or applications which are not
> directly allowed to open sockets due to policy restrictions.

Could you name a specific example?

BTW, JFYI, since 16.07, DPDK supports client mode. In that mode it is QEMU
(acting as the server) that creates the socket file. I guess that would
diminish (or even avoid?) the permission pain that DPDK acting as server
brings. I doubt the API to do the permission change is really needed then.

	--yliu
  
Aaron Conole June 21, 2016, 1:15 p.m. UTC | #2
Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:

> On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
>> Prior to this commit, the only way to add a vhost-user socket to the
>> system is by relying on librte_vhost to open the unix domain socket and
>> add it to the unix socket list.  This is problematic for applications
>> which would like to set the permissions,
>
> So, you want to address the issue raised by following patch?
>
>     http://dpdk.org/dev/patchwork/patch/12222/

That patch does try to address the issue; however, it has some
problems.  The biggest is a TOCTTOU issue when using chown.  The way to
solve that issue properly is different depending on which operating
system is being used (for instance, FreeBSD doesn't honor
fchown(), fchmod() on file descriptors).  My solution is basically to
punt that responsibility to the controlling application.

> I would still like to stick to my proposal, that is to introduce a
> new API to do the permission change at anytime, if we end up with
> wanting to introduce a new API.

I've spent a lot of time looking at the TOCTTOU problem, and I think
that is a really hard problem to solve portably.  Might be good to just
start with the flexible mechanism here that lets the application
developer satisfy their own needs.

>> or applications which are not
>> directly allowed to open sockets due to policy restrictions.
>
> Could you name a specific example?

SELinux policy might require one application to open the socket, and
pass it back via a dbus mechanism.  I can't actually think of a concrete
implemented case, so it may not be valid.

> BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
> as the server) will create the socket file. I guess that would diminish
> (or even avoid?) the permission pain that DPDK acting as server brings.
> I doubt the API to do the permission change is really needed then.

I wouldn't say it 'solves' the issue so much as hopes no one uses server
mode in DPDK.  I agree, for OvS, it could.

> 	--yliu

Thanks so much for your thoughts and review on this, Yuanhan Liu!

-Aaron
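
For the policy case mentioned above, where one process opens the socket and hands it to another, the descriptor has to cross a process boundary. A minimal sketch of doing that directly over a connected Unix domain socket with SCM_RIGHTS ancillary data follows. The helper names `send_fd` and `recv_fd` are hypothetical, and a real deployment might use a higher-level transport instead.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send `fd` over the connected Unix domain socket `chan`. 0 on success. */
static int
send_fd(int chan, int fd)
{
	char byte = 0;	/* at least one byte of real data must accompany it */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `chan`. Returns the new fd, or -1 on error. */
static int
recv_fd(int chan)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(chan, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiving side gets its own descriptor referring to the same open socket, so a listening socket created by a privileged helper could be received this way and then passed to the registration call proposed in the patch.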
  
Yuanhan Liu June 24, 2016, 2:31 a.m. UTC | #3
On Tue, Jun 21, 2016 at 09:15:03AM -0400, Aaron Conole wrote:
> Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:
> 
> > On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
> >> Prior to this commit, the only way to add a vhost-user socket to the
> >> system is by relying on librte_vhost to open the unix domain socket and
> >> add it to the unix socket list.  This is problematic for applications
> >> which would like to set the permissions,
> >
> > So, you want to address the issue raised by following patch?
> >
> >     http://dpdk.org/dev/patchwork/patch/12222/
> 
> That patch does try to address the issue, however - it has some
> problems.  The biggest is a TOCTTOU issue when using chown.  The way to
> solve that issue properly is different depending on which operating
> system is being used (for instance, FreeBSD doesn't honor
> fchown(),fchmod() on file descriptors).  My solution is basically to
> punt that responsibility to the controlling application.
> 
> > I would still like to stick to my proposal, that is to introduce a
> > new API to do the permission change at anytime, if we end up with
> > wanting to introduce a new API.
> 
> I've spent a lot of time looking at the TOCTTOU problem, and I think
> that is a really hard problem to solve portably.  Might be good to just
> start with the flexible mechanism here that lets the application
> developer satisfy their own needs.
> 
> >> or applications which are not
> >> directly allowed to open sockets due to policy restrictions.
> >
> > Could you name a specific example?
> 
> SELinux policy might require one application to open the socket, and
> pass it back via a dbus mechanism.  I can't actually think of a concrete
> implemented case, so it may not be valid.
> 
> > BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
> > as the server) will create the socket file. I guess that would diminish
> > (or even avoid?) the permission pain that DPDK acting as server brings.
> > I doubt the API to do the permission change is really needed then.
> 
> I wouldn't say it 'solves' the issue so much as hopes no one uses server
> mode in DPDK.  I agree, for OvS, it could.

Actually, I would (personally) suggest that people switch to DPDK
vhost-user client mode, for two good reasons:

- it should solve the socket permission issue raised by you and Christian.

- it has had the "reconnect" feature since 16.07, which means guest
  networking will keep working across a DPDK vhost-user restart or crash.
  DPDK vhost-user as server simply doesn't support that.

And FYI, Loftus is doing the DPDK for OVS integration. I'm not quite sure
whether she made client mode the default, though.

> Thanks so much for your thoughts and review on this, Yuanhan Liu!

Thank you for proposing ideas to make DPDK better!

	--yliu
  
Loftus, Ciara June 24, 2016, 7:43 a.m. UTC | #4
> 
> On Tue, Jun 21, 2016 at 09:15:03AM -0400, Aaron Conole wrote:
> > Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:
> >
> > > On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
> > >> Prior to this commit, the only way to add a vhost-user socket to the
> > >> system is by relying on librte_vhost to open the unix domain socket and
> > >> add it to the unix socket list.  This is problematic for applications
> > >> which would like to set the permissions,
> > >
> > > So, you want to address the issue raised by following patch?
> > >
> > >     http://dpdk.org/dev/patchwork/patch/12222/
> >
> > That patch does try to address the issue, however - it has some
> > problems.  The biggest is a TOCTTOU issue when using chown.  The way to
> > solve that issue properly is different depending on which operating
> > system is being used (for instance, FreeBSD doesn't honor
> > fchown(),fchmod() on file descriptors).  My solution is basically to
> > punt that responsibility to the controlling application.
> >
> > > I would still like to stick to my proposal, that is to introduce a
> > > new API to do the permission change at anytime, if we end up with
> > > wanting to introduce a new API.
> >
> > I've spent a lot of time looking at the TOCTTOU problem, and I think
> > that is a really hard problem to solve portably.  Might be good to just
> > start with the flexible mechanism here that lets the application
> > developer satisfy their own needs.
> >
> > >> or applications which are not
> > >> directly allowed to open sockets due to policy restrictions.
> > >
> > > Could you name a specific example?
> >
> > SELinux policy might require one application to open the socket, and
> > pass it back via a dbus mechanism.  I can't actually think of a concrete
> > implemented case, so it may not be valid.
> >
> > > BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
> > > as the server) will create the socket file. I guess that would diminish
> > > (or even avoid?) the permission pain that DPDK acting as server brings.
> > > I doubt the API to do the permission change is really needed then.
> >
> > I wouldn't say it 'solves' the issue so much as hopes no one uses server
> > mode in DPDK.  I agree, for OvS, it could.
> 
> Actually, I think I would (personally) suggest people to switch to DPDK
> vhost-user client mode, for two good reasons:
> 
> - it should solve the socket permission issue raised by you and Christian.
> 
> - it has the "reconnect" feature since 16.07. Which means guest network
>   will still work from a DPDK vhost-user restart/crash. DPDK vhost-user
>   as server simply doesn't support that.
> 
> And FYI, Loftus is doing the DPDK for OVS integration. Not quite sure
> whether she put the client mode as the default mode though.

Hi Yuanhan,

I intend to keep the DPDK server-mode as the default. My reasoning is that not
all users will have access to QEMU v2.7.0 initially. We will keep operating as before
but have an option to switch to DPDK client mode, and then perhaps look at
switching the default in a later release.

Thanks,
Ciara

> 
> > Thanks so much for your thoughts and review on this, Yuanhan Liu!
> 
> Thank you for proposing ideas to make DPDK better!
> 
> 	--yliu
  
Yuanhan Liu June 24, 2016, 7:51 a.m. UTC | #5
On Fri, Jun 24, 2016 at 07:43:29AM +0000, Loftus, Ciara wrote:
> > 
> > On Tue, Jun 21, 2016 at 09:15:03AM -0400, Aaron Conole wrote:
> > > Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:
> > >
> > > > On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
> > > >> Prior to this commit, the only way to add a vhost-user socket to the
> > > >> system is by relying on librte_vhost to open the unix domain socket and
> > > >> add it to the unix socket list.  This is problematic for applications
> > > >> which would like to set the permissions,
> > > >
> > > > So, you want to address the issue raised by following patch?
> > > >
> > > >     http://dpdk.org/dev/patchwork/patch/12222/
> > >
> > > That patch does try to address the issue, however - it has some
> > > problems.  The biggest is a TOCTTOU issue when using chown.  The way to
> > > solve that issue properly is different depending on which operating
> > > system is being used (for instance, FreeBSD doesn't honor
> > > fchown(),fchmod() on file descriptors).  My solution is basically to
> > > punt that responsibility to the controlling application.
> > >
> > > > I would still like to stick to my proposal, that is to introduce a
> > > > new API to do the permission change at anytime, if we end up with
> > > > wanting to introduce a new API.
> > >
> > > I've spent a lot of time looking at the TOCTTOU problem, and I think
> > > that is a really hard problem to solve portably.  Might be good to just
> > > start with the flexible mechanism here that lets the application
> > > developer satisfy their own needs.
> > >
> > > >> or applications which are not
> > > >> directly allowed to open sockets due to policy restrictions.
> > > >
> > > > Could you name a specific example?
> > >
> > > SELinux policy might require one application to open the socket, and
> > > pass it back via a dbus mechanism.  I can't actually think of a concrete
> > > implemented case, so it may not be valid.
> > >
> > > > BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
> > > > as the server) will create the socket file. I guess that would diminish
> > > > (or even avoid?) the permission pain that DPDK acting as server brings.
> > > > I doubt the API to do the permission change is really needed then.
> > >
> > > I wouldn't say it 'solves' the issue so much as hopes no one uses server
> > > mode in DPDK.  I agree, for OvS, it could.
> > 
> > Actually, I think I would (personally) suggest people to switch to DPDK
> > vhost-user client mode, for two good reasons:
> > 
> > - it should solve the socket permission issue raised by you and Christian.
> > 
> > - it has the "reconnect" feature since 16.07. Which means guest network
> >   will still work from a DPDK vhost-user restart/crash. DPDK vhost-user
> >   as server simply doesn't support that.
> > 
> > And FYI, Loftus is doing the DPDK for OVS integration. Not quite sure
> > whether she put the client mode as the default mode though.
> 
> Hi Yuanhan,

Hi Ciara,

Thanks for the note.

> I intend to keep the DPDK server-mode as the default. My reasoning is that not
> all users will have access to QEMU v2.7.0 initially. We will keep operating as before
> but have an option to switch to DPDK client mode,

And yes, good point.

> and then perhaps look at
> switching the default in a later release.

Also okay with me.

	--yliu
  
Aaron Conole June 24, 2016, 12:23 p.m. UTC | #6
Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:

> On Fri, Jun 24, 2016 at 07:43:29AM +0000, Loftus, Ciara wrote:
>> > 
>> > On Tue, Jun 21, 2016 at 09:15:03AM -0400, Aaron Conole wrote:
>> > > Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:
>> > >
>> > > > On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
>> > > >> Prior to this commit, the only way to add a vhost-user socket to the
>> > > >> system is by relying on librte_vhost to open the unix domain socket and
>> > > >> add it to the unix socket list.  This is problematic for applications
>> > > >> which would like to set the permissions,
>> > > >
>> > > > So, you want to address the issue raised by following patch?
>> > > >
>> > > >     http://dpdk.org/dev/patchwork/patch/12222/
>> > >
>> > > That patch does try to address the issue, however - it has some
>> > > problems.  The biggest is a TOCTTOU issue when using chown.  The way to
>> > > solve that issue properly is different depending on which operating
>> > > system is being used (for instance, FreeBSD doesn't honor
>> > > fchown(),fchmod() on file descriptors).  My solution is basically to
>> > > punt that responsibility to the controlling application.
>> > >
>> > > > I would still like to stick to my proposal, that is to introduce a
>> > > > new API to do the permission change at anytime, if we end up with
>> > > > wanting to introduce a new API.
>> > >
>> > > I've spent a lot of time looking at the TOCTTOU problem, and I think
>> > > that is a really hard problem to solve portably.  Might be good to just
>> > > start with the flexible mechanism here that lets the application
>> > > developer satisfy their own needs.
>> > >
>> > > >> or applications which are not
>> > > >> directly allowed to open sockets due to policy restrictions.
>> > > >
>> > > > Could you name a specific example?
>> > >
>> > > SELinux policy might require one application to open the socket, and
>> > > pass it back via a dbus mechanism.  I can't actually think of a concrete
>> > > implemented case, so it may not be valid.
>> > >
>> > > > BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
>> > > > as the server) will create the socket file. I guess that would diminish
>> > > > (or even avoid?) the permission pain that DPDK acting as server brings.
>> > > > I doubt the API to do the permission change is really needed then.
>> > >
>> > > I wouldn't say it 'solves' the issue so much as hopes no one uses server
>> > > mode in DPDK.  I agree, for OvS, it could.
>> > 
>> > Actually, I think I would (personally) suggest people to switch to DPDK
>> > vhost-user client mode, for two good reasons:
>> > 
>> > - it should solve the socket permission issue raised by you and Christian.
>> > 
>> > - it has the "reconnect" feature since 16.07. Which means guest network
>> >   will still work from a DPDK vhost-user restart/crash. DPDK vhost-user
>> >   as server simply doesn't support that.
>> > 
>> > And FYI, Loftus is doing the DPDK for OVS integration. Not quite sure
>> > whether she put the client mode as the default mode though.
>> 
>> Hi Yuanhan,
>
> Hi Ciara,
>
> Thanks for the note.
>
>> I intend to keep the DPDK server-mode as the default. My reasoning is that not
>> all users will have access to QEMU v2.7.0 initially. We will keep
>> operating as before
>> but have an option to switch to DPDK client mode,
>
> And yes, good point.
>
>> and then perhaps look at
>> switching the default in a later release.
>
> Also okay to me.

Is there still merit to this patch, given the above?  If so, I'd finish my
integration and testing work and submit it formally.
  
Yuanhan Liu June 27, 2016, 11:53 a.m. UTC | #7
On Fri, Jun 24, 2016 at 08:23:52AM -0400, Aaron Conole wrote:
> Is there still merit to this patch, given above?  If so, I'd finish my
> integration and testing work and submit it formally.

Sorry, I don't see a strong need for this patch (at least so far),
given that vhost-user in client mode could solve the issue your
patch is meant to resolve.

	--yliu
  
Aaron Conole June 27, 2016, 7:19 p.m. UTC | #8
Yuanhan Liu <yuanhan.liu@linux.intel.com> writes:

> On Fri, Jun 24, 2016 at 08:23:52AM -0400, Aaron Conole wrote:
>> Is there still merit to this patch, given above?  If so, I'd finish my
>> integration and testing work and submit it formally.
>
> Sorry, I don't see the strong need of this patch (at least so far),
> judging that vhost-user as the client could solve the issue your
> patch meant to resolve.

Okay, thanks so much for the review and consideration, Yuanhan!

-Aaron
  

Patch

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 48e1fff..22d0c6d 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -49,6 +49,14 @@  Vhost API Overview
       For vhost-user, a Unix domain socket server will be created with the parameter as
       the local socket path.
 
+      Alternatively, rte_vhost_driver_register_socket registers an existing Unix
+      domain socket with the library.
+      The host application should acquire this socket descriptor through
+      some mechanism (either fd passing or by creating the Unix domain socket
+      itself).
+      The file descriptor passed in this way must still be a Unix domain socket
+      server.
+
 *   Vhost session start
 
       rte_vhost_driver_session_start starts the vhost session loop.
diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index 3d8709e..fe58967 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -20,3 +20,9 @@  DPDK_2.1 {
 	rte_vhost_driver_unregister;
 
 } DPDK_2.0;
+
+DPDK_16.07 {
+	global:
+
+	rte_vhost_driver_register_socket;
+} DPDK_2.1;
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 600b20b..d2959ff 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -236,6 +236,12 @@  int rte_vhost_enable_guest_notification(struct virtio_net *dev, uint16_t queue_i
 /* Register vhost driver. dev_name could be different for multiple instance support. */
 int rte_vhost_driver_register(const char *dev_name);
 
+/* Register vhost driver using the provided unix domain socket. The socket MUST
+ * already be fully created and in a listening state (by calling listen()).
+ */
+int rte_vhost_driver_register_socket(const char *dev_name,
+	int vhost_unix_socket);
+
 /* Unregister vhost driver. This is only meaningful to vhost user. */
 int rte_vhost_driver_unregister(const char *dev_name);
 
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index df2bd64..0fe72db 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -446,45 +446,58 @@  vserver_message_handler(int connfd, void *dat, int *remove)
 	}
 }
 
+
 /**
- * Creates and initialise the vhost server.
+ * Appends a socket to the vhost server polling list
  */
 int
-rte_vhost_driver_register(const char *path)
+rte_vhost_driver_register_socket(const char *dev_name, int vhost_unix_socket)
 {
 	struct vhost_server *vserver;
 
-	pthread_mutex_lock(&g_vhost_server.server_mutex);
-
-	if (g_vhost_server.vserver_cnt == MAX_VHOST_SERVER) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"error: the number of servers reaches maximum\n");
-		pthread_mutex_unlock(&g_vhost_server.server_mutex);
-		return -1;
-	}
-
 	vserver = calloc(sizeof(struct vhost_server), 1);
 	if (vserver == NULL) {
-		pthread_mutex_unlock(&g_vhost_server.server_mutex);
 		return -1;
 	}
 
-	vserver->listenfd = uds_socket(path);
-	if (vserver->listenfd < 0) {
+	vserver->listenfd = vhost_unix_socket;
+	vserver->path = strdup(dev_name);
+	if (!vserver->path) {
 		free(vserver);
-		pthread_mutex_unlock(&g_vhost_server.server_mutex);
 		return -1;
 	}
 
-	vserver->path = strdup(path);
+	pthread_mutex_lock(&g_vhost_server.server_mutex);
+
+	if (g_vhost_server.vserver_cnt == MAX_VHOST_SERVER) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"error: the number of servers reaches maximum\n");
+		pthread_mutex_unlock(&g_vhost_server.server_mutex);
+		free(vserver->path);
+		free(vserver);
+		return -1;
+	}
 
 	fdset_add(&g_vhost_server.fdset, vserver->listenfd,
 		vserver_new_vq_conn, NULL, vserver);
 
 	g_vhost_server.server[g_vhost_server.vserver_cnt++] = vserver;
 	pthread_mutex_unlock(&g_vhost_server.server_mutex);
+}
 
-	return 0;
+
+/**
+ * Creates and initialise the vhost server.
+ */
+int
+rte_vhost_driver_register(const char *dev_name)
+{
+
+	int listenfd = uds_socket(dev_name);
+	if (listenfd < 0)
+		return -1;
+
+	return rte_vhost_driver_register_socket(dev_name, listenfd);
 }