[dpdk-dev,2/2] vhost: start vhost servers once

Message ID: 1482959452-18486-2-git-send-email-ciwillia@brocade.com (mailing list archive)
State: Superseded, archived
Delegated to: Yuanhan Liu
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel compilation  success  Compilation OK

Commit Message

Chas Williams Dec. 28, 2016, 9:10 p.m. UTC
  Start a vhost server once during devinit instead of during device start
and stop.  Some vhost clients, such as QEMU, don't re-attach to sockets when
the vhost server is stopped and later started.  Preserve the existing
start/stop behavior when DPDK itself acts as the vhost client.

Fixes: ee584e9710b9 ("vhost: add driver on top of the library")

Signed-off-by: Chas Williams <ciwillia@brocade.com>
---
 drivers/net/vhost/rte_eth_vhost.c | 36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)
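
Editor's sketch: the behavior the commit message describes can be modelled without DPDK. The block below is a minimal, standalone approximation, assuming a boolean `registered` field stands in for both the driver's `once` guard and the existence of the socket. All `sketch_*` names and `SKETCH_USER_CLIENT` are hypothetical stand-ins, not identifiers from rte_eth_vhost.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for RTE_VHOST_USER_CLIENT; the value is illustrative only. */
#define SKETCH_USER_CLIENT (1u << 0)

/* Hypothetical, reduced version of the driver's private state. */
struct sketch_internal {
	unsigned int flags;
	bool registered;	/* models the "once" guard and the socket */
};

/* Models vhost_dev_start(): bring the socket up if it is not up yet. */
static int
sketch_vhost_start(struct sketch_internal *in)
{
	assert(in != NULL);
	if (!in->registered)
		in->registered = true;
	return 0;
}

/* Models vhost_dev_stop(): take the socket down. */
static void
sketch_vhost_stop(struct sketch_internal *in)
{
	assert(in != NULL);
	in->registered = false;
}

/* Device creation: a server-mode socket comes up here and stays up. */
static int
sketch_create(struct sketch_internal *in, unsigned int flags)
{
	in->flags = flags;
	in->registered = false;
	if ((flags & SKETCH_USER_CLIENT) == 0)
		return sketch_vhost_start(in);
	return 0;
}

/* .dev_start(): only client-mode ports touch the socket. */
static int
sketch_dev_start(struct sketch_internal *in)
{
	if (in->flags & SKETCH_USER_CLIENT)
		return sketch_vhost_start(in);
	return 0;
}

/* .dev_stop(): only client-mode ports touch the socket. */
static void
sketch_dev_stop(struct sketch_internal *in)
{
	if (in->flags & SKETCH_USER_CLIENT)
		sketch_vhost_stop(in);
}

/* Device removal: stop the port, then tear down a server-mode socket. */
static void
sketch_remove(struct sketch_internal *in)
{
	sketch_dev_stop(in);
	if ((in->flags & SKETCH_USER_CLIENT) == 0)
		sketch_vhost_stop(in);
}
```

In this model a server-mode port keeps `registered` set across any number of sketch_dev_stop()/sketch_dev_start() calls and only loses it in sketch_remove(), while a client-mode port still toggles it on every start and stop.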
  

Comments

Yuanhan Liu Dec. 29, 2016, 8:52 a.m. UTC | #1
On Wed, Dec 28, 2016 at 04:10:52PM -0500, Charles (Chas) Williams wrote:
> Start a vhost server once during devinit instead of during device start
> and stop.  Some vhost clients, QEMU, don't re-attaching to sockets when
> the vhost server is stopped and later started.  Preserve existing behavior
> for vhost clients.

I didn't quite get what you are trying to fix.

	--yliu
  
Chas Williams Dec. 29, 2016, 3:58 p.m. UTC | #2
On 12/29/2016 03:52 AM, Yuanhan Liu wrote:
> On Wed, Dec 28, 2016 at 04:10:52PM -0500, Charles (Chas) Williams wrote:
>> Start a vhost server once during devinit instead of during device start
>> and stop.  Some vhost clients, QEMU, don't re-attaching to sockets when
>> the vhost server is stopped and later started.  Preserve existing behavior
>> for vhost clients.
>
> I didn't quite get the idea what you are going to fix.

The issue I am trying to fix is QEMU interaction when DPDK's vhost is
acting as a server to QEMU vhost clients.  If you create a vhost server
device, it doesn't create the actual Unix domain socket until you call
.dev_start().  If you call .dev_stop(), it also deletes that socket.
For QEMU, this is a problem since QEMU doesn't know how to re-attach to
a socket that has gone away.

.dev_start()/.dev_stop() seem to roughly mean link up and link down,
so I understand why you might want to add/remove the sockets.
However, in practice, this doesn't seem to make much sense for a DPDK
vhost server.  This doesn't seem like the right way to indicate link
status to vhost clients.

It seems like it would just be easier to do this for both clients and
servers, but I don't know why it was done this way originally, so I
chose to keep the client behavior.
  
Yuanhan Liu Dec. 30, 2016, 3:15 a.m. UTC | #3
On Thu, Dec 29, 2016 at 10:58:11AM -0500, Charles (Chas) Williams wrote:
> On 12/29/2016 03:52 AM, Yuanhan Liu wrote:
> >On Wed, Dec 28, 2016 at 04:10:52PM -0500, Charles (Chas) Williams wrote:
> >>Start a vhost server once during devinit instead of during device start
> >>and stop.  Some vhost clients, QEMU, don't re-attaching to sockets when
> >>the vhost server is stopped and later started.  Preserve existing behavior
> >>for vhost clients.
> >
> >I didn't quite get the idea what you are going to fix.
> 
> The issue I am trying to fix is QEMU interaction when DPDK's vhost is
> acting as a server to QEMU vhost clients.  If you create a vhost server
> device, it doesn't create the actual datagram socket until you call
> .dev_start().  If you call .dev_stop() is also deletes those sockets.
> For QEMU, this is a problem since QEMU doesn't know how to re-attach to
> datagram sockets that have gone away.

Thanks! I'd have appreciated it if you had written the commit log
this way in the first place.

> .dev_start()/.dev_stop() seems to roughly means link up and link down
> so I understand why you might want to add/remove the datagram sockets.
> However, in practice, this doesn't seem to make much sense for a DPDK
> vhost server. 

Agree.

> This doesn't seem like the right way to indicate link
> status to vhost clients.
> 
> It seems like it would just be easier to do this for both clients and
> servers, but I don't know why it was done this way originally so I
> choose to keep the client behavior.

I don't think there are any differences between DPDK acting as client or
server. To me, the right logic (for both DPDK as server and client)
seems to be:

For register,
- register the vhost socket at probe stage (either at rte_pmd_vhost_probe
  or at eth_dev_vhost_create).
- start the vhost session right after the register when we haven't started
  it before.

For unregister,
- invoke rte_vhost_driver_unregister() at rte_pmd_vhost_remove().

For dev_start/stop,
- set allow_queuing to 1/0 for start/stop, respectively.

	--yliu
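
Editor's sketch: the three-part proposal above (register at probe, unregister at remove, toggle allow_queuing on start/stop) could be modelled roughly as follows. This is a standalone approximation under stated assumptions, not driver code: `struct proposal_port`, the `proposal_*` functions and the `proposal_session_started` flag are hypothetical, and the comments only mark where the real rte_vhost_driver_register()/rte_vhost_driver_unregister() calls would presumably sit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced port state; not the driver's real pmd_internal. */
struct proposal_port {
	bool registered;	/* vhost socket registered with the library */
	bool allow_queuing;	/* burst functions may touch the rings */
};

/* Models the one-time start of the vhost session. */
static bool proposal_session_started;

/* Probe: register the socket, then start the session if not yet running. */
static void
proposal_probe(struct proposal_port *p)
{
	assert(p != NULL);
	p->registered = true;	/* where rte_vhost_driver_register() would go */
	p->allow_queuing = false;
	if (!proposal_session_started)
		proposal_session_started = true;
}

/* Remove: the only place where the socket goes away. */
static void
proposal_remove(struct proposal_port *p)
{
	assert(p != NULL);
	p->allow_queuing = false;
	p->registered = false;	/* where rte_vhost_driver_unregister() would go */
}

/* .dev_start(): only opens the queues, the socket is left alone. */
static void
proposal_dev_start(struct proposal_port *p)
{
	assert(p != NULL);
	p->allow_queuing = true;
}

/* .dev_stop(): only closes the queues, the socket is left alone. */
static void
proposal_dev_stop(struct proposal_port *p)
{
	assert(p != NULL);
	p->allow_queuing = false;
}
```

In this model a stop followed by a start never changes `registered`, so a connected peer would not see the socket disappear. Note that proposal_dev_start() sets allow_queuing whether or not any peer has connected.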
  
Chas Williams Dec. 30, 2016, 9:26 p.m. UTC | #4
On 12/29/2016 10:15 PM, Yuanhan Liu wrote:
> On Thu, Dec 29, 2016 at 10:58:11AM -0500, Charles (Chas) Williams wrote:
>> On 12/29/2016 03:52 AM, Yuanhan Liu wrote:
>>> On Wed, Dec 28, 2016 at 04:10:52PM -0500, Charles (Chas) Williams wrote:
>>>> Start a vhost server once during devinit instead of during device start
>>>> and stop.  Some vhost clients, QEMU, don't re-attaching to sockets when
>>>> the vhost server is stopped and later started.  Preserve existing behavior
>>>> for vhost clients.
>>>
>>> I didn't quite get the idea what you are going to fix.
>>
>> The issue I am trying to fix is QEMU interaction when DPDK's vhost is
>> acting as a server to QEMU vhost clients.  If you create a vhost server
>> device, it doesn't create the actual datagram socket until you call
>> .dev_start().  If you call .dev_stop() is also deletes those sockets.
>> For QEMU, this is a problem since QEMU doesn't know how to re-attach to
>> datagram sockets that have gone away.
>
> Thanks! And I'd appreciate it if you could have written the commit log
> this way firstly.
>
>> .dev_start()/.dev_stop() seems to roughly means link up and link down
>> so I understand why you might want to add/remove the datagram sockets.
>> However, in practice, this doesn't seem to make much sense for a DPDK
>> vhost server.
>
> Agree.
>
>> This doesn't seem like the right way to indicate link
>> status to vhost clients.
>>
>> It seems like it would just be easier to do this for both clients and
>> servers, but I don't know why it was done this way originally so I
>> choose to keep the client behavior.
>
> I don't think there are any differences between DPDK acting as client or
> server. To me, the right logic seems to be (for both DPDK as server and
> client).
>
> For register,
> - register the vhost socket at probe stage (either at rte_pmd_vhost_probe
>   or at eth_dev_vhost_create).
> - start the vhost session right after the register when we haven't started
>   it before.
>
> For unregister,
> - invoke rte_vhost_driver_unregister() at rte_pmd_vhost_remove().

OK. This will be much easier than what I submitted.

> For dev_start/stop,
> - set allow_queuing to 1/0 for start/stop, respectively.

Unfortunately, I don't think this will work.  new_device() doesn't happen
until a client connects.  allow_queuing seems to follow the status
of the "wire", as it were.  .dev_start()/.dev_stop() is the link of the local
port connected to the wire (administratively up or down, as it were).

.dev_start() can happen before new_device() and attempting to RX for a
client that doesn't exist doesn't seem like a good idea.  Perhaps another
flag that follows dev_started, but for the queues?
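
Editor's sketch: one way to read the "another flag" idea is to track the administrative state and the peer attachment separately and allow queuing only when both hold. The block below is a minimal standalone model of that reading. The `gate_*` names and the `attached` field are hypothetical, and nothing here is taken from the later v3 patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port state with two independent inputs. */
struct gate_port {
	bool started;		/* set by .dev_start(), cleared by .dev_stop() */
	bool attached;		/* set by new_device(), cleared by destroy_device() */
	bool allow_queuing;	/* derived: true only when both inputs are true */
};

/* Recompute the derived flag after any input changes. */
static void
gate_update(struct gate_port *p)
{
	assert(p != NULL);
	p->allow_queuing = p->started && p->attached;
}

/* Administrative "link up". */
static void
gate_dev_start(struct gate_port *p)
{
	p->started = true;
	gate_update(p);
}

/* Administrative "link down". */
static void
gate_dev_stop(struct gate_port *p)
{
	p->started = false;
	gate_update(p);
}

/* A vhost peer has connected. */
static void
gate_new_device(struct gate_port *p)
{
	p->attached = true;
	gate_update(p);
}

/* The vhost peer has gone away. */
static void
gate_destroy_device(struct gate_port *p)
{
	p->attached = false;
	gate_update(p);
}
```

With this gating, calling gate_dev_start() before gate_new_device() leaves allow_queuing at false, so the burst functions would not try to RX for a client that does not exist yet. The order in which the two events arrive does not matter.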
  
Yuanhan Liu Jan. 3, 2017, 8:16 a.m. UTC | #5
On Fri, Dec 30, 2016 at 04:26:27PM -0500, Charles (Chas) Williams wrote:
> 
> 
> On 12/29/2016 10:15 PM, Yuanhan Liu wrote:
> >On Thu, Dec 29, 2016 at 10:58:11AM -0500, Charles (Chas) Williams wrote:
> >>On 12/29/2016 03:52 AM, Yuanhan Liu wrote:
> >>>On Wed, Dec 28, 2016 at 04:10:52PM -0500, Charles (Chas) Williams wrote:
> >>>>Start a vhost server once during devinit instead of during device start
> >>>>and stop.  Some vhost clients, QEMU, don't re-attaching to sockets when
> >>>>the vhost server is stopped and later started.  Preserve existing behavior
> >>>>for vhost clients.
> >>>
> >>>I didn't quite get the idea what you are going to fix.
> >>
> >>The issue I am trying to fix is QEMU interaction when DPDK's vhost is
> >>acting as a server to QEMU vhost clients.  If you create a vhost server
> >>device, it doesn't create the actual datagram socket until you call
> >>.dev_start().  If you call .dev_stop() is also deletes those sockets.
> >>For QEMU, this is a problem since QEMU doesn't know how to re-attach to
> >>datagram sockets that have gone away.
> >
> >Thanks! And I'd appreciate it if you could have written the commit log
> >this way firstly.
> >
> >>.dev_start()/.dev_stop() seems to roughly means link up and link down
> >>so I understand why you might want to add/remove the datagram sockets.
> >>However, in practice, this doesn't seem to make much sense for a DPDK
> >>vhost server.
> >
> >Agree.
> >
> >>This doesn't seem like the right way to indicate link
> >>status to vhost clients.
> >>
> >>It seems like it would just be easier to do this for both clients and
> >>servers, but I don't know why it was done this way originally so I
> >>choose to keep the client behavior.
> >
> >I don't think there are any differences between DPDK acting as client or
> >server. To me, the right logic seems to be (for both DPDK as server and
> >client).
> >
> >For register,
> >- register the vhost socket at probe stage (either at rte_pmd_vhost_probe
> >  or at eth_dev_vhost_create).
> >- start the vhost session right after the register when we haven't started
> >  it before.
> >
> >For unregister,
> >- invoke rte_vhost_driver_unregister() at rte_pmd_vhost_remove().
> 
> OK. This will be much easier than what I submitted.

Good.

> 
> >For dev_start/stop,
> >- set allow_queuing to 1/0 for start/stop, respectively.
> 
> Unfortunately, I don't think this will work.  new_device() doesn't happen
> until a client connects.  allow_queueing seems to be following the status
> of the "wire" as it where.  .dev_start()/.dev_stop() is the link of local
> port connected to the wire (administratively up or down as it where).
> 
> .dev_start() can happen before new_device() and attempting to RX for a
> client that doesn't exist doesn't seem like a good idea. 

Right.

> Perhaps another
> flag that follows dev_started, but for the queues?

I will comment on it in your v3 patches.

	--yliu
  

Patch

diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index ba559ff..914d83f 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -772,9 +772,8 @@  vhost_driver_session_stop(void)
 }
 
 static int
-eth_dev_start(struct rte_eth_dev *dev)
+vhost_dev_start(struct pmd_internal *internal)
 {
-	struct pmd_internal *internal = dev->data->dev_private;
 	int ret = 0;
 
 	if (rte_atomic16_cmpset(&internal->once, 0, 1)) {
@@ -792,10 +791,8 @@  eth_dev_start(struct rte_eth_dev *dev)
 }
 
 static void
-eth_dev_stop(struct rte_eth_dev *dev)
+vhost_dev_stop(struct pmd_internal *internal)
 {
-	struct pmd_internal *internal = dev->data->dev_private;
-
 	if (rte_atomic16_cmpset(&internal->once, 1, 0)) {
 		rte_vhost_driver_unregister(internal->iface_name);
 
@@ -805,6 +802,27 @@  eth_dev_stop(struct rte_eth_dev *dev)
 }
 
 static int
+eth_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internal *internal = dev->data->dev_private;
+	int ret = 0;
+
+	if (internal->flags & RTE_VHOST_USER_CLIENT)
+		ret = vhost_dev_start(internal);
+
+	return ret;
+}
+
+static void
+eth_dev_stop(struct rte_eth_dev *dev)
+{
+	struct pmd_internal *internal = dev->data->dev_private;
+
+	if (internal->flags & RTE_VHOST_USER_CLIENT)
+		vhost_dev_stop(internal);
+}
+
+static int
 eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
 		   uint16_t nb_rx_desc __rte_unused,
 		   unsigned int socket_id,
@@ -1079,6 +1097,11 @@  eth_dev_vhost_create(const char *name, char *iface_name, int16_t queues,
 	eth_dev->rx_pkt_burst = eth_vhost_rx;
 	eth_dev->tx_pkt_burst = eth_vhost_tx;
 
+	if ((flags & RTE_VHOST_USER_CLIENT) == 0) {
+		if (vhost_dev_start(internal))
+			goto error;
+	}
+
 	return data->port_id;
 
 error:
@@ -1216,6 +1239,9 @@  rte_pmd_vhost_remove(const char *name)
 
 	eth_dev_stop(eth_dev);
 
+	if ((internal->flags & RTE_VHOST_USER_CLIENT) == 0)
+		vhost_dev_stop(internal);
+
 	rte_free(vring_states[eth_dev->data->port_id]);
 	vring_states[eth_dev->data->port_id] = NULL;