[dpdk-dev,RFC] eal: provide option to set vhost_user socket owner/permissions

Message ID 1461575896-17409-1-git-send-email-christian.ehrhardt@canonical.com (mailing list archive)
State Superseded, archived
Delegated to: Yuanhan Liu
Headers

Commit Message

Christian Ehrhardt April 25, 2016, 9:18 a.m. UTC
  The API doesn't hold a way to specify a owner/permission set for vhost_user
created sockets. I don't even think an API change would make that much sense.

Projects consuming DPDK start to do 'their own workarounds' like openvswitch
https://patchwork.ozlabs.org/patch/559043/
https://patchwork.ozlabs.org/patch/559045/
But for this specific example they are blocked/stalled behind a bigger
rework (https://patchwork.ozlabs.org/patch/604898/).
Also one could ask why each project would need their own workaround.

At the same time - as I want it for existing code linking against DPDK I
wanted to avoid changing API/ABI. That way I want to provide something existing
users could utilize. So I created a DPDK EAL commandline option based ideas in
the former patches.

For myself I consider this a nice interim solution for existing released
Openvswitch+DPDK solution. And I intend to put it as delta into the DPDK 2.2
currently packaged in Ubuntu to get it working more smoothly with
openvswitch 2.5.

But I'd be interested if DPDK in general would be interested in:
a) an approach like this?
b) would prefer a change of the API?
c) consider it an issue of consuming projects and let them take care?

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
 doc/guides/testpmd_app_ug/run_app.rst        |  19 +++
 lib/librte_eal/common/eal_common_options.c   |   4 +
 lib/librte_eal/common/eal_internal_cfg.h     |   2 +
 lib/librte_eal/common/eal_options.h          |   4 +
 lib/librte_eal/common/include/rte_eal.h      |   5 +
 lib/librte_eal/linuxapp/eal/eal.c            | 182 +++++++++++++++++++++++++++
 lib/librte_vhost/vhost_user/vhost-net-user.c |   4 +
 7 files changed, 220 insertions(+)
  

Comments

Yuanhan Liu April 26, 2016, 4:16 a.m. UTC | #1
On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
> The API doesn't hold a way to specify a owner/permission set for vhost_user
> created sockets.

Yes, it's kind of like a known issue. So, thanks for bringing it, with
a solution, for dicussion (cc'ed more people).

> I don't even think an API change would make that much sense.
> 
> Projects consuming DPDK start to do 'their own workarounds' like openvswitch
> https://patchwork.ozlabs.org/patch/559043/
> https://patchwork.ozlabs.org/patch/559045/
> But for this specific example they are blocked/stalled behind a bigger
> rework (https://patchwork.ozlabs.org/patch/604898/).
> Also one could ask why each project would need their own workaround.
> 
> At the same time - as I want it for existing code linking against DPDK I
> wanted to avoid changing API/ABI. That way I want to provide something existing
> users could utilize. So I created a DPDK EAL commandline option based ideas in
> the former patches.
> 
> For myself I consider this a nice interim solution for existing released
> Openvswitch+DPDK solution. And I intend to put it as delta into the DPDK 2.2
> currently packaged in Ubuntu to get it working more smoothly with
> openvswitch 2.5.
> 
> But I'd be interested if DPDK in general would be interested in:
> a) an approach like this?

You were trying to add a vhost specific stuff as EAL command option,
which is something we might should try to avoid.

> b) would prefer a change of the API?

Adding a new option to the current register API might will not work well,
either. It gives you no ability to do a dynamic change later. I mean,
taking OVS as an example, OVS provides you the flexible ability to do all
kinds of configuration in a dynamic way, say number of rx queues. If we
do the permissions setup in the register time, there would be no way to
change it later, right?

So, I'm thinking that we may could add a new API for that? It then would
allow applications to change it at anytime.

> c) consider it an issue of consuming projects and let them take care?

It's not exactly an issue of consuming projects; we created the socket
file after all.

And I'd like to hear what others would say.

Thanks.

	--yliu
  
Christian Ehrhardt April 26, 2016, 7:24 a.m. UTC | #2
Thanks,
great that you added more on CC for a wider discussion - I think that is
the only right way to go.

Just to "defend" a bit - solution a) was created under the special
circumstance that I wanted a workaround that would work today.
But that is/was special to what I package with DPDK 2.2 + OVS 2.5 as of
today - and therefore was the right place for a fast interim fix for me.
I totally agree that the A in EAL was meant for abstraction and we might
want to avoid vhost specific things in there that in the long run.

I like your suggestion of a new API as a proper long term solution, but I
don't feel deeply enough involved yet on the API level to give it any
judgement.
So I look forward for more opinions on it.

P.S. the patch bot hit me hard with 2 pages of space/bracket issues, sorry
for that - but it was only meant as RFC after all :-)


Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Tue, Apr 26, 2016 at 6:16 AM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
wrote:

> On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
> > The API doesn't hold a way to specify a owner/permission set for
> vhost_user
> > created sockets.
>
> Yes, it's kind of like a known issue. So, thanks for bringing it, with
> a solution, for dicussion (cc'ed more people).
>
> > I don't even think an API change would make that much sense.
> >
> > Projects consuming DPDK start to do 'their own workarounds' like
> openvswitch
> > https://patchwork.ozlabs.org/patch/559043/
> > https://patchwork.ozlabs.org/patch/559045/
> > But for this specific example they are blocked/stalled behind a bigger
> > rework (https://patchwork.ozlabs.org/patch/604898/).
> > Also one could ask why each project would need their own workaround.
> >
> > At the same time - as I want it for existing code linking against DPDK I
> > wanted to avoid changing API/ABI. That way I want to provide something
> existing
> > users could utilize. So I created a DPDK EAL commandline option based
> ideas in
> > the former patches.
> >
> > For myself I consider this a nice interim solution for existing released
> > Openvswitch+DPDK solution. And I intend to put it as delta into the DPDK
> 2.2
> > currently packaged in Ubuntu to get it working more smoothly with
> > openvswitch 2.5.
> >
> > But I'd be interested if DPDK in general would be interested in:
> > a) an approach like this?
>
> You were trying to add a vhost specific stuff as EAL command option,
> which is something we might should try to avoid.
>
> > b) would prefer a change of the API?
>
> Adding a new option to the current register API might will not work well,
> either. It gives you no ability to do a dynamic change later. I mean,
> taking OVS as an example, OVS provides you the flexible ability to do all
> kinds of configuration in a dynamic way, say number of rx queues. If we
> do the permissions setup in the register time, there would be no way to
> change it later, right?
>
> So, I'm thinking that we may could add a new API for that? It then would
> allow applications to change it at anytime.
>
> > c) consider it an issue of consuming projects and let them take care?
>
> It's not exactly an issue of consuming projects; we created the socket
> file after all.
>
> And I'd like to hear what others would say.
>
> Thanks.
>
>         --yliu
>
  
Thomas Monjalon April 26, 2016, 8:52 a.m. UTC | #3
2016-04-25 21:16, Yuanhan Liu:
> On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
> > The API doesn't hold a way to specify a owner/permission set for vhost_user
> > created sockets.
> 
> Yes, it's kind of like a known issue. So, thanks for bringing it, with
> a solution, for dicussion (cc'ed more people).
[...]
> > But I'd be interested if DPDK in general would be interested in:
> > a) an approach like this?
> 
> You were trying to add a vhost specific stuff as EAL command option,
> which is something we might should try to avoid.

Yes, -1

> > b) would prefer a change of the API?
> 
> Adding a new option to the current register API might will not work well,
> either. It gives you no ability to do a dynamic change later. I mean,
> taking OVS as an example, OVS provides you the flexible ability to do all
> kinds of configuration in a dynamic way, say number of rx queues. If we
> do the permissions setup in the register time, there would be no way to
> change it later, right?
> 
> So, I'm thinking that we may could add a new API for that? It then would
> allow applications to change it at anytime.

A vhost API in the library?
And for vhost PMD? What about devargs parameters?

> > c) consider it an issue of consuming projects and let them take care?
> 
> It's not exactly an issue of consuming projects; we created the socket
> file after all.

Yes
  
Aaron Conole April 26, 2016, 1:33 p.m. UTC | #4
Thomas Monjalon <thomas.monjalon@6wind.com> writes:

> 2016-04-25 21:16, Yuanhan Liu:
>> On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
>> > The API doesn't hold a way to specify a owner/permission set for vhost_user
>> > created sockets.
>> 
>> Yes, it's kind of like a known issue. So, thanks for bringing it, with
>> a solution, for dicussion (cc'ed more people).
> [...]
>> > But I'd be interested if DPDK in general would be interested in:
>> > a) an approach like this?
>> 
>> You were trying to add a vhost specific stuff as EAL command option,
>> which is something we might should try to avoid.
>
> Yes, -1
>
>> > b) would prefer a change of the API?
>> 
>> Adding a new option to the current register API might will not work well,
>> either. It gives you no ability to do a dynamic change later. I mean,
>> taking OVS as an example, OVS provides you the flexible ability to do all
>> kinds of configuration in a dynamic way, say number of rx queues. If we
>> do the permissions setup in the register time, there would be no way to
>> change it later, right?
>> 
>> So, I'm thinking that we may could add a new API for that? It then would
>> allow applications to change it at anytime.
>
> A vhost API in the library?
> And for vhost PMD? What about devargs parameters?

I don't know the most sane solution here, other than to echo the
sentiment that a new API for this is probably appropriate. Where that
API lives, and how it looks should be hashed out. For now, I'm working
on a solution in OVS because no such API or facility exists in DPDK.

Actually, there are a number of edge cases with vhost-user sockets. I
don't want to get into all of them, but since we're discussing the API a
bit here, I'd like to also bring up the following:

  What is the desired behavior w.r.t. file cleanup when the application
  crashes, restarts, and tells DPDK to use that file again (which hasn't
  been cleaned up due to the crash)?
  At present, the vhost-user code errors out. But how does the
  application correct the situation without deleting arbitrary files on
  the filesystem?

>> > c) consider it an issue of consuming projects and let them take care?
>> 
>> It's not exactly an issue of consuming projects; we created the socket
>> file after all.
>
> Yes

Just want to reiterate at present there is no solution, so projects will
invent their own. I can point to Ubuntu and Red Hat customer bugs which
require silly workarounds like "after you started a bunch of stuff, go
to the directory and run chmod/chown."

I'm actually not opposed to any solution that seems sane. If DPDK takes
the stance that the file is specified by the application, and therefore
"file management" activities (removal, permissions, ownership, etc.) are
the responsibility of the application, so be it. If the stance is that
DPDK owns the management of the file, so be that as well. I think the
first case is easier for the library maintainers (do nothing), the
second is easier for the applications (use these semantics).

If it really is the responsibility of DPDK, then I think the only sane
approach is an API for managing this. That may require an additional
library framework to link the vhost-user PMD and rte_ethdev facilities
so that a common API could be provided.

Just my $.02.

Thanks,
Aaron
  
Yuanhan Liu April 27, 2016, 11:08 p.m. UTC | #5
On Tue, Apr 26, 2016 at 09:33:48AM -0400, Aaron Conole wrote:
> >> > b) would prefer a change of the API?
> >> 
> >> Adding a new option to the current register API might will not work well,
> >> either. It gives you no ability to do a dynamic change later. I mean,
> >> taking OVS as an example, OVS provides you the flexible ability to do all
> >> kinds of configuration in a dynamic way, say number of rx queues. If we
> >> do the permissions setup in the register time, there would be no way to
> >> change it later, right?
> >> 
> >> So, I'm thinking that we may could add a new API for that? It then would
> >> allow applications to change it at anytime.
> >
> > A vhost API in the library?

Yes, I supposed so.

> > And for vhost PMD?

Technically, vhost PMD is an application (or precisely, an user) of
vhost lib. So, it's supposed to invoke the new API.

> What about devargs parameters?

Yes, and it then invoke the API, as stated above.

> 
> I don't know the most sane solution here, other than to echo the
> sentiment that a new API for this is probably appropriate. Where that
> API lives, and how it looks should be hashed out. For now, I'm working
> on a solution in OVS because no such API or facility exists in DPDK.
> 
> Actually, there are a number of edge cases with vhost-user sockets. I
> don't want to get into all of them, but since we're discussing the API a
> bit here, I'd like to also bring up the following:
> 
>   What is the desired behavior w.r.t. file cleanup when the application
>   crashes, restarts, and tells DPDK to use that file again (which hasn't
>   been cleaned up due to the crash)?
>   At present, the vhost-user code errors out. But how does the
>   application correct the situation without deleting arbitrary files on
>   the filesystem?

Oops, yes, that's another one. We also had some discussion before:

    http://dpdk.org/ml/archives/dev/2015-December/030326.html

It ended up with an agreement that we should let the application to
handle it, due to it's a path provided by the application, though
it's DPDK does the creation.

> 
> >> > c) consider it an issue of consuming projects and let them take care?
> >> 
> >> It's not exactly an issue of consuming projects; we created the socket
> >> file after all.
> >
> > Yes
> 
> Just want to reiterate at present there is no solution, so projects will
> invent their own. I can point to Ubuntu and Red Hat customer bugs which
> require silly workarounds like "after you started a bunch of stuff, go
> to the directory and run chmod/chown."
> 
> I'm actually not opposed to any solution that seems sane. If DPDK takes
> the stance that the file is specified by the application, and therefore
> "file management" activities (removal, permissions, ownership, etc.) are
> the responsibility of the application, so be it.

Exactly. But DPDK, as a library, could provides some handy APIs to make
the application developer's life be less painful. So, that also echoes
to what we have said before: we provide the tool, you use it, and it's
you to make sure it's right.

	--yliu

> If the stance is that
> DPDK owns the management of the file, so be that as well. I think the
> first case is easier for the library maintainers (do nothing), the
> second is easier for the applications (use these semantics).
> 
> If it really is the responsibility of DPDK, then I think the only sane
> approach is an API for managing this. That may require an additional
> library framework to link the vhost-user PMD and rte_ethdev facilities
> so that a common API could be provided.
> 
> Just my $.02.
> 
> Thanks,
> Aaron
  
Thomas Monjalon Feb. 15, 2017, 8:55 a.m. UTC | #6
Was there any progress on this topic?
Can we close the request?
	http://dpdk.org/patch/12222/


2016-04-27 16:08, Yuanhan Liu:
> On Tue, Apr 26, 2016 at 09:33:48AM -0400, Aaron Conole wrote:
> > >> > b) would prefer a change of the API?
> > >> 
> > >> Adding a new option to the current register API might will not work well,
> > >> either. It gives you no ability to do a dynamic change later. I mean,
> > >> taking OVS as an example, OVS provides you the flexible ability to do all
> > >> kinds of configuration in a dynamic way, say number of rx queues. If we
> > >> do the permissions setup in the register time, there would be no way to
> > >> change it later, right?
> > >> 
> > >> So, I'm thinking that we may could add a new API for that? It then would
> > >> allow applications to change it at anytime.
> > >
> > > A vhost API in the library?
> 
> Yes, I supposed so.
> 
> > > And for vhost PMD?
> 
> Technically, vhost PMD is an application (or precisely, an user) of
> vhost lib. So, it's supposed to invoke the new API.
> 
> > What about devargs parameters?
> 
> Yes, and it then invoke the API, as stated above.
> 
> > 
> > I don't know the most sane solution here, other than to echo the
> > sentiment that a new API for this is probably appropriate. Where that
> > API lives, and how it looks should be hashed out. For now, I'm working
> > on a solution in OVS because no such API or facility exists in DPDK.
> > 
> > Actually, there are a number of edge cases with vhost-user sockets. I
> > don't want to get into all of them, but since we're discussing the API a
> > bit here, I'd like to also bring up the following:
> > 
> >   What is the desired behavior w.r.t. file cleanup when the application
> >   crashes, restarts, and tells DPDK to use that file again (which hasn't
> >   been cleaned up due to the crash)?
> >   At present, the vhost-user code errors out. But how does the
> >   application correct the situation without deleting arbitrary files on
> >   the filesystem?
> 
> Oops, yes, that's another one. We also had some discussion before:
> 
>     http://dpdk.org/ml/archives/dev/2015-December/030326.html
> 
> It ended up with an agreement that we should let the application to
> handle it, due to it's a path provided by the application, though
> it's DPDK does the creation.
> 
> > 
> > >> > c) consider it an issue of consuming projects and let them take care?
> > >> 
> > >> It's not exactly an issue of consuming projects; we created the socket
> > >> file after all.
> > >
> > > Yes
> > 
> > Just want to reiterate at present there is no solution, so projects will
> > invent their own. I can point to Ubuntu and Red Hat customer bugs which
> > require silly workarounds like "after you started a bunch of stuff, go
> > to the directory and run chmod/chown."
> > 
> > I'm actually not opposed to any solution that seems sane. If DPDK takes
> > the stance that the file is specified by the application, and therefore
> > "file management" activities (removal, permissions, ownership, etc.) are
> > the responsibility of the application, so be it.
> 
> Exactly. But DPDK, as a library, could provides some handy APIs to make
> the application developer's life be less painful. So, that also echoes
> to what we have said before: we provide the tool, you use it, and it's
> you to make sure it's right.
> 
> 	--yliu
> 
> > If the stance is that
> > DPDK owns the management of the file, so be that as well. I think the
> > first case is easier for the library maintainers (do nothing), the
> > second is easier for the applications (use these semantics).
> > 
> > If it really is the responsibility of DPDK, then I think the only sane
> > approach is an API for managing this. That may require an additional
> > library framework to link the vhost-user PMD and rte_ethdev facilities
> > so that a common API could be provided.
> > 
> > Just my $.02.
> > 
> > Thanks,
> > Aaron
  
Aaron Conole Feb. 15, 2017, 2:32 p.m. UTC | #7
Thomas Monjalon <thomas.monjalon@6wind.com> writes:

> Was there any progress on this topic?
> Can we close the request?
> 	http://dpdk.org/patch/12222/

No update in almost a year is probably a bad sign.

From the OVS side, we've dropped our patches due to too many corner
cases handling this - instead we're opting to use vhost-user client
mode, and punt the issue to the 'other' side.  Additionally, I do agree
with the criticism that providing this via an EAL command line option is
wrong. So, I think it can be safely dropped.

That said, if this is something that folks really want to solve, I think
the sanest way is to allow the application to pass in a
file-descriptor.  The way unix domain sockets permissions work changes
based on far too many factors to ever provide an API that makes sense,
unless we also bake in tests for various OSes (and possibly even flavors
of OSes).

Instead, I think DPDK should allow the application to open the unix
domain socket in whatever manner it chooses, doing whatever tricks are
required to set permissions or other flags the way it wants, and then
pass the file descriptor into the framework.  This would allow doing
sane things like fork()ing, changing current owner and umask, opening
the socket, and passing the descriptor back through a pipe (which works
on FreeBSD and Linux to set the permissions).  It allows the application
to be concerned with the file-system details, and lets DPDK merely worry
about the important thing: passing packet data.

> 2016-04-27 16:08, Yuanhan Liu:
>> On Tue, Apr 26, 2016 at 09:33:48AM -0400, Aaron Conole wrote:
>> > >> > b) would prefer a change of the API?
>> > >> 
>> > >> Adding a new option to the current register API might will not work well,
>> > >> either. It gives you no ability to do a dynamic change later. I mean,
>> > >> taking OVS as an example, OVS provides you the flexible ability to do all
>> > >> kinds of configuration in a dynamic way, say number of rx queues. If we
>> > >> do the permissions setup in the register time, there would be no way to
>> > >> change it later, right?
>> > >> 
>> > >> So, I'm thinking that we may could add a new API for that? It then would
>> > >> allow applications to change it at anytime.
>> > >
>> > > A vhost API in the library?
>> 
>> Yes, I supposed so.
>> 
>> > > And for vhost PMD?
>> 
>> Technically, vhost PMD is an application (or precisely, an user) of
>> vhost lib. So, it's supposed to invoke the new API.
>> 
>> > What about devargs parameters?
>> 
>> Yes, and it then invoke the API, as stated above.
>> 
>> > 
>> > I don't know the most sane solution here, other than to echo the
>> > sentiment that a new API for this is probably appropriate. Where that
>> > API lives, and how it looks should be hashed out. For now, I'm working
>> > on a solution in OVS because no such API or facility exists in DPDK.
>> > 
>> > Actually, there are a number of edge cases with vhost-user sockets. I
>> > don't want to get into all of them, but since we're discussing the API a
>> > bit here, I'd like to also bring up the following:
>> > 
>> >   What is the desired behavior w.r.t. file cleanup when the application
>> >   crashes, restarts, and tells DPDK to use that file again (which hasn't
>> >   been cleaned up due to the crash)?
>> >   At present, the vhost-user code errors out. But how does the
>> >   application correct the situation without deleting arbitrary files on
>> >   the filesystem?
>> 
>> Oops, yes, that's another one. We also had some discussion before:
>> 
>>     http://dpdk.org/ml/archives/dev/2015-December/030326.html
>> 
>> It ended up with an agreement that we should let the application to
>> handle it, due to it's a path provided by the application, though
>> it's DPDK does the creation.
>> 
>> > 
>> > >> > c) consider it an issue of consuming projects and let them take care?
>> > >> 
>> > >> It's not exactly an issue of consuming projects; we created the socket
>> > >> file after all.
>> > >
>> > > Yes
>> > 
>> > Just want to reiterate at present there is no solution, so projects will
>> > invent their own. I can point to Ubuntu and Red Hat customer bugs which
>> > require silly workarounds like "after you started a bunch of stuff, go
>> > to the directory and run chmod/chown."
>> > 
>> > I'm actually not opposed to any solution that seems sane. If DPDK takes
>> > the stance that the file is specified by the application, and therefore
>> > "file management" activities (removal, permissions, ownership, etc.) are
>> > the responsibility of the application, so be it.
>> 
>> Exactly. But DPDK, as a library, could provides some handy APIs to make
>> the application developer's life be less painful. So, that also echoes
>> to what we have said before: we provide the tool, you use it, and it's
>> you to make sure it's right.
>> 
>> 	--yliu
>> 
>> > If the stance is that
>> > DPDK owns the management of the file, so be that as well. I think the
>> > first case is easier for the library maintainers (do nothing), the
>> > second is easier for the applications (use these semantics).
>> > 
>> > If it really is the responsibility of DPDK, then I think the only sane
>> > approach is an API for managing this. That may require an additional
>> > library framework to link the vhost-user PMD and rte_ethdev facilities
>> > so that a common API could be provided.
>> > 
>> > Just my $.02.
>> > 
>> > Thanks,
>> > Aaron
  
Yuanhan Liu Feb. 20, 2017, 8:48 a.m. UTC | #8
On Wed, Feb 15, 2017 at 09:32:27AM -0500, Aaron Conole wrote:
> Thomas Monjalon <thomas.monjalon@6wind.com> writes:
> 
> > Was there any progress on this topic?
> > Can we close the request?
> > 	http://dpdk.org/patch/12222/
> 
> No update in almost a year is probably a bad sign.
> 
> >From the OVS side, we've dropped our patches due to too many corner
> cases handling this - instead we're opting to use vhost-user client
> mode, and punt the issue to the 'other' side.  Additionally, I do agree
> with the criticism that providing this via an EAL command line option is
> wrong. So, I think it can be safely dropped.

Yes and I think we already came an agreement on that before. The reason
I still kept in patchwork is to let it be a (soft) reminder: it's
something might could be enhanced.

> That said, if this is something that folks really want to solve, I think
> the sanest way is to allow the application to pass in a
> file-descriptor.  The way unix domain sockets permissions work changes
> based on far too many factors to ever provide an API that makes sense,
> unless we also bake in tests for various OSes (and possibly even flavors
> of OSes).
> 
> Instead, I think DPDK should allow the application to open the unix
> domain socket in whatever manner it chooses, doing whatever tricks are
> required to set permissions or other flags the way it wants, and then
> pass the file descriptor into the framework.  This would allow doing
> sane things like fork()ing, changing current owner and umask, opening
> the socket, and passing the descriptor back through a pipe (which works
> on FreeBSD and Linux to set the permissions).  It allows the application
> to be concerned with the file-system details, and lets DPDK merely worry
> about the important thing: passing packet data.

I think I could consider that if that's really needed, say, a real need
from some customers.

My last reply to that is, you could use the vhost-user as client mode,
whereas the socket file is managed/created by QEMU, thus permission
won't be an issue any more.

Another thing worth noting is that, IIUC, OVS guys at Ireland were proposing
to drop the vhost-user support and replace it with vhost-pmd, whereas you
could specific vdev options to specify the socket permissions (and etc) for
the server mode. It won't use EAL command line options any more, thus,
no block issue.

	--yliu
> 
> > 2016-04-27 16:08, Yuanhan Liu:
> >> On Tue, Apr 26, 2016 at 09:33:48AM -0400, Aaron Conole wrote:
> >> > >> > b) would prefer a change of the API?
> >> > >> 
> >> > >> Adding a new option to the current register API might will not work well,
> >> > >> either. It gives you no ability to do a dynamic change later. I mean,
> >> > >> taking OVS as an example, OVS provides you the flexible ability to do all
> >> > >> kinds of configuration in a dynamic way, say number of rx queues. If we
> >> > >> do the permissions setup in the register time, there would be no way to
> >> > >> change it later, right?
> >> > >> 
> >> > >> So, I'm thinking that we may could add a new API for that? It then would
> >> > >> allow applications to change it at anytime.
> >> > >
> >> > > A vhost API in the library?
> >> 
> >> Yes, I supposed so.
> >> 
> >> > > And for vhost PMD?
> >> 
> >> Technically, vhost PMD is an application (or precisely, an user) of
> >> vhost lib. So, it's supposed to invoke the new API.
> >> 
> >> > What about devargs parameters?
> >> 
> >> Yes, and it then invoke the API, as stated above.
> >> 
> >> > 
> >> > I don't know the most sane solution here, other than to echo the
> >> > sentiment that a new API for this is probably appropriate. Where that
> >> > API lives, and how it looks should be hashed out. For now, I'm working
> >> > on a solution in OVS because no such API or facility exists in DPDK.
> >> > 
> >> > Actually, there are a number of edge cases with vhost-user sockets. I
> >> > don't want to get into all of them, but since we're discussing the API a
> >> > bit here, I'd like to also bring up the following:
> >> > 
> >> >   What is the desired behavior w.r.t. file cleanup when the application
> >> >   crashes, restarts, and tells DPDK to use that file again (which hasn't
> >> >   been cleaned up due to the crash)?
> >> >   At present, the vhost-user code errors out. But how does the
> >> >   application correct the situation without deleting arbitrary files on
> >> >   the filesystem?
> >> 
> >> Oops, yes, that's another one. We also had some discussion before:
> >> 
> >>     http://dpdk.org/ml/archives/dev/2015-December/030326.html
> >> 
> >> It ended up with an agreement that we should let the application to
> >> handle it, due to it's a path provided by the application, though
> >> it's DPDK does the creation.
> >> 
> >> > 
> >> > >> > c) consider it an issue of consuming projects and let them take care?
> >> > >> 
> >> > >> It's not exactly an issue of consuming projects; we created the socket
> >> > >> file after all.
> >> > >
> >> > > Yes
> >> > 
> >> > Just want to reiterate at present there is no solution, so projects will
> >> > invent their own. I can point to Ubuntu and Red Hat customer bugs which
> >> > require silly workarounds like "after you started a bunch of stuff, go
> >> > to the directory and run chmod/chown."
> >> > 
> >> > I'm actually not opposed to any solution that seems sane. If DPDK takes
> >> > the stance that the file is specified by the application, and therefore
> >> > "file management" activities (removal, permissions, ownership, etc.) are
> >> > the responsibility of the application, so be it.
> >> 
> >> Exactly. But DPDK, as a library, could provides some handy APIs to make
> >> the application developer's life be less painful. So, that also echoes
> >> to what we have said before: we provide the tool, you use it, and it's
> >> you to make sure it's right.
> >> 
> >> 	--yliu
> >> 
> >> > If the stance is that
> >> > DPDK owns the management of the file, so be that as well. I think the
> >> > first case is easier for the library maintainers (do nothing), the
> >> > second is easier for the applications (use these semantics).
> >> > 
> >> > If it really is the responsibility of DPDK, then I think the only sane
> >> > approach is an API for managing this. That may require an additional
> >> > library framework to link the vhost-user PMD and rte_ethdev facilities
> >> > so that a common API could be provided.
> >> > 
> >> > Just my $.02.
> >> > 
> >> > Thanks,
> >> > Aaron
  

Patch

diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
index f605564..24c9c01 100644
--- a/doc/guides/testpmd_app_ug/run_app.rst
+++ b/doc/guides/testpmd_app_ug/run_app.rst
@@ -156,6 +156,25 @@  See the DPDK Getting Started Guides for more information on these options.
 
     Use malloc instead of hugetlbfs.
 
+*   ``--vhost-owner``
+
+     When creating vhost_user sockets change owner and group to the specified value.
+     This can be given as ``user:group``, but also only ``user`` or ``:group`` are supported.
+
+    Examples::
+
+       --vhost-owner 'libvirt-qemu:kvm'
+       --vhost-owner 'libvirt-qemu'
+       --vhost-owner ':kvm'
+
+*   ``--vhost-perm``
+
+    When creating vhost_user sockets set them up with these permissions.
+
+    For example::
+
+       --vhost-perm '0664'
+
 
 Testpmd Command-line Options
 ----------------------------
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 2b418d5..073198b 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -95,6 +95,8 @@  eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_XEN_DOM0,          0, NULL, OPT_XEN_DOM0_NUM         },
+	{OPT_VHOST_OWNER,       1, NULL, OPT_VHOST_OWNER_NUM      },
+	{OPT_VHOST_PERM,        1, NULL, OPT_VHOST_PERM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -153,6 +155,8 @@  eal_reset_internal_config(struct internal_config *internal_cfg)
 #endif
 	internal_cfg->vmware_tsc_map = 0;
 	internal_cfg->create_uio_dev = 0;
+	internal_cfg->vhost_sock_owner = NULL;
+	internal_cfg->vhost_sock_perm = NULL;
 }
 
 static int
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5f1367e..bdf34e3 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -83,6 +83,8 @@  struct internal_config {
 	volatile enum rte_intr_mode vfio_intr_mode;
 	const char *hugefile_prefix;      /**< the base filename of hugetlbfs files */
 	const char *hugepage_dir;         /**< specific hugetlbfs directory to use */
+	const char *vhost_sock_owner;     /**< owner:group of vhost_user sockets */
+	const char *vhost_sock_perm;      /**< permissions of vhost_user sockets */
 
 	unsigned num_hugepage_sizes;      /**< how many sizes on this system */
 	struct hugepage_info hugepage_info[MAX_HUGEPAGE_SIZES];
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a881c62..1161083 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -83,6 +83,10 @@  enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_XEN_DOM0          "xen-dom0"
 	OPT_XEN_DOM0_NUM,
+#define OPT_VHOST_OWNER       "vhost-owner"
+	OPT_VHOST_OWNER_NUM,
+#define OPT_VHOST_PERM        "vhost-perm"
+	OPT_VHOST_PERM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index a71d6f5..506cf24 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -252,6 +252,11 @@  static inline int rte_gettid(void)
 	return RTE_PER_LCORE(_thread_id);
 }
 
+/**
+ * Set owner/permissions on sockets if requested on EAL commandline
+ */
+void rte_eal_set_socket_permissions(const char *);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8aafd51..3d0b709 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -53,6 +53,9 @@ 
 #if defined(RTE_ARCH_X86)
 #include <sys/io.h>
 #endif
+#include <sys/types.h>
+#include <pwd.h>
+#include <grp.h>
 
 #include <rte_common.h>
 #include <rte_debug.h>
@@ -343,6 +346,8 @@  eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_XEN_DOM0"          Support running on Xen dom0 without hugetlbfs\n"
+	       "  --"OPT_VHOST_OWNER"       Create vhost-user sockets with this owner:group\n"
+	       "  --"OPT_VHOST_PERM"        Create vhost-user sockets with these permissions\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
@@ -618,6 +623,14 @@  eal_parse_args(int argc, char **argv)
 			internal_config.create_uio_dev = 1;
 			break;
 
+		case OPT_VHOST_OWNER_NUM:
+			internal_config.vhost_sock_owner = optarg;
+			break;
+
+		case OPT_VHOST_PERM_NUM:
+			internal_config.vhost_sock_perm = optarg;
+			break;
+
 		default:
 			if (opt < OPT_LONG_MIN_NUM && isprint(opt)) {
 				RTE_LOG(ERR, EAL, "Option %c is not supported "
@@ -934,3 +947,172 @@  rte_eal_check_module(const char *module_name)
 	/* Module has been found */
 	return 1;
 }
+
+/* Try to double the size of '*buf', return true
+ * if successful, and '*sizep' will be updated with
+ * the new size. Otherwise, return false.  */
+static int
+enlarge_buffer(char **buf, size_t *sizep)
+{
+    size_t newsize = *sizep * 2;
+
+    if (newsize > *sizep) {
+        *buf = realloc(*buf, newsize);
+        *sizep = newsize;
+        return 1;
+    }
+
+    return 0;
+}
+
+static int
+get_owners_from_str(const char *user_spec, uid_t *uid, gid_t *gid)
+{
+	size_t bufsize = 4096;
+
+	char *pos = strchr(user_spec, ':');
+	user_spec += strspn(user_spec, " \t\r\n");
+	size_t len = pos ? (size_t)(pos - user_spec) : strlen(user_spec);
+
+	char *buf = NULL;
+	struct passwd pwd, *res;
+	int e;
+
+	buf = malloc(bufsize);
+	char *user_search = NULL;
+	if (len) {
+		user_search = malloc(len + 1);
+		memcpy(user_search, user_spec, len);
+		user_search[len] = '\0';
+		while ((e = getpwnam_r(user_search, &pwd, buf, bufsize, &res)) == ERANGE) {
+			if (!enlarge_buffer(&buf, &bufsize)) {
+				break;
+			}
+		}
+
+		if (e != 0) {
+			RTE_LOG(ERR, EAL,"Failed to retrive user %s's uid (%s), aborting.",
+				user_search, strerror(e));
+			goto release;
+		}
+		if (res == NULL) {
+			RTE_LOG(ERR, EAL,"user %s not found,  aborting.",
+				user_search);
+			e = -1;
+			goto release;
+		}
+	} else {
+		/* User name is not specified, use current user.  */
+		while ((e = getpwuid_r(getuid(), &pwd, buf, bufsize, &res)) == ERANGE) {
+			if (!enlarge_buffer(&buf, &bufsize)) {
+				break;
+			}
+		}
+
+		if (e != 0) {
+			RTE_LOG(ERR, EAL,"Failed to retrive current user's uid "
+				"(%s), aborting.", strerror(e));
+			goto release;
+		}
+		user_search = strdup(pwd.pw_name);
+	}
+
+	if (uid)
+		*uid = pwd.pw_uid;
+
+	free(buf);
+	buf = NULL;
+
+	if (pos) {
+		char *grpstr = pos + 1;
+		grpstr += strspn(grpstr, " \t\r\n");
+
+		if (*grpstr) {
+			struct group grp, *res;
+
+			bufsize = 4096;
+			buf = malloc(bufsize);
+			while ((e = getgrnam_r(grpstr, &grp, buf, bufsize, &res))
+					 == ERANGE) {
+				if (!enlarge_buffer(&buf, &bufsize)) {
+					break;
+				}
+			}
+
+			if (e) {
+				RTE_LOG(ERR, EAL,"Failed to get group entry for %s, "
+					"(%s), aborting.", grpstr,
+					strerror(e));
+				goto release;
+			}
+			if (res == NULL) {
+				RTE_LOG(ERR, EAL,"Group %s not found, aborting.",
+					grpstr);
+				e = -1;
+				goto release;
+			}
+
+			if (gid)
+				*gid = grp.gr_gid;
+		}
+	}
+
+ release:
+	free(buf);
+	free(user_search);
+	return e;
+}
+
+static void
+vhost_set_permissions(const char *vhost_sock_location)
+{
+	unsigned long int mode = strtoul(internal_config.vhost_sock_perm, NULL, 0);
+	int err = chmod(vhost_sock_location, (mode_t)mode);
+	if (err) {
+		RTE_LOG(ERR, EAL,"vhost-user socket cannot set"
+			" permissions to %s (%s).\n",
+			internal_config.vhost_sock_perm, strerror(err));
+		return;
+	}
+	RTE_LOG(INFO, EAL,"Socket %s changed permissions"
+			" to %s\n", vhost_sock_location,
+			internal_config.vhost_sock_perm);
+}
+
+static void
+vhost_set_ownership(const char *vhost_sock_location)
+{
+	uid_t vhuid=0;
+	gid_t vhgid=0;
+
+	if (get_owners_from_str(internal_config.vhost_sock_owner, &vhuid, &vhgid)) {
+		RTE_LOG(ERR, EAL,"vhost-user socket unable to get"
+			" specified user/group: %s\n",
+			internal_config.vhost_sock_owner);
+		return;
+	}
+
+	int err = chown(vhost_sock_location, vhuid, vhgid);
+	if (err) {
+		RTE_LOG(ERR, EAL,"vhost-user socket unable to set"
+			" ownership to %s (%s).\n",
+			internal_config.vhost_sock_owner, strerror(err));
+		return;
+	}
+
+	RTE_LOG(INFO, EAL,"Socket %s changed ownership"
+			" to %s.\n", vhost_sock_location,
+			internal_config.vhost_sock_owner);
+}
+
+void
+rte_eal_set_socket_permissions(const char *path)
+{
+	if (internal_config.vhost_sock_perm) {
+		vhost_set_permissions(path);
+	}
+
+	if (internal_config.vhost_sock_owner) {
+		vhost_set_ownership(path);
+	}
+}
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index df2bd64..ea8dee9 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -51,6 +51,8 @@ 
 #include "vhost-net.h"
 #include "virtio-net-user.h"
 
+#include <rte_eal.h>
+
 #define MAX_VIRTIO_BACKLOG 128
 
 static void vserver_new_vq_conn(int fd, void *data, int *remove);
@@ -476,6 +478,8 @@  rte_vhost_driver_register(const char *path)
 		return -1;
 	}
 
+	rte_eal_set_socket_permissions(path);
+
 	vserver->path = strdup(path);
 
 	fdset_add(&g_vhost_server.fdset, vserver->listenfd,