[dpdk-dev,[PATCH,v2] 01/13] virtio: Introduce config RTE_VIRTIO_INC_VECTOR

Message ID 1450098032-21198-2-git-send-email-sshukla@mvista.com (mailing list archive)
State Superseded, archived

Commit Message

Santosh Shukla Dec. 14, 2015, 1 p.m. UTC
  virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
instructions. For arm64 in particular, no virtio vector implementation exists
yet (todo).

So the virtio PMD driver does not build for targets like i686 and arm64. With
RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
will work in non-vectored virtio mode.

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
---
 config/common_linuxapp           |    1 +
 drivers/net/virtio/Makefile      |    2 +-
 drivers/net/virtio/virtio_rxtx.c |    7 +++++++
 3 files changed, 9 insertions(+), 1 deletion(-)
  

Comments

Santosh Shukla Dec. 17, 2015, 12:02 p.m. UTC | #1
On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
> virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
> instructions. For arm64 in particular, no virtio vector implementation exists
> yet (todo).
>
> So the virtio PMD driver does not build for targets like i686 and arm64. With
> RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
> will work in non-vectored virtio mode.
>
> Signed-off-by: Santosh Shukla <sshukla@mvista.com>
> ---

Ping?

any review  / comment on this patch much appreciated. Thanks

>  config/common_linuxapp           |    1 +
>  drivers/net/virtio/Makefile      |    2 +-
>  drivers/net/virtio/virtio_rxtx.c |    7 +++++++
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index ba9e55d..275fb40 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -273,6 +273,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
> +CONFIG_RTE_VIRTIO_INC_VECTOR=y
>
>  #
>  # Compile burst-oriented VMXNET3 PMD driver
> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
> index 43835ba..25a842d 100644
> --- a/drivers/net/virtio/Makefile
> +++ b/drivers/net/virtio/Makefile
> @@ -50,7 +50,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtqueue.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_pci.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
> -SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
> +SRCS-$(CONFIG_RTE_VIRTIO_INC_VECTOR) += virtio_rxtx_simple.c
>
>  # this lib depends upon:
>  DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 74b39ef..23be1ff 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -438,7 +438,9 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,
>
>         dev->data->rx_queues[queue_idx] = vq;
>
> +#ifdef RTE_VIRTIO_INC_VECTOR
>         virtio_rxq_vec_setup(vq);
> +#endif
>
>         return 0;
>  }
> @@ -464,7 +466,10 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
>                         const struct rte_eth_txconf *tx_conf)
>  {
>         uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
> +
> +#ifdef RTE_VIRTIO_INC_VECTOR
>         struct virtio_hw *hw = dev->data->dev_private;
> +#endif
>         struct virtqueue *vq;
>         uint16_t tx_free_thresh;
>         int ret;
> @@ -477,6 +482,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
>                 return -EINVAL;
>         }
>
> +#ifdef RTE_VIRTIO_INC_VECTOR
>         /* Use simple rx/tx func if single segment and no offloads */
>         if ((tx_conf->txq_flags & VIRTIO_SIMPLE_FLAGS) == VIRTIO_SIMPLE_FLAGS &&
>              !vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
> @@ -485,6 +491,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
>                 dev->rx_pkt_burst = virtio_recv_pkts_vec;
>                 use_simple_rxtx = 1;
>         }
> +#endif
>
>         ret = virtio_dev_queue_setup(dev, VTNET_TQ, queue_idx, vtpci_queue_idx,
>                         nb_desc, socket_id, &vq);
> --
> 1.7.9.5
>
  
Thomas Monjalon Dec. 17, 2015, 12:03 p.m. UTC | #2
2015-12-17 17:32, Santosh Shukla:
> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
> > virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
> > instructions. For arm64 in particular, no virtio vector implementation exists
> > yet (todo).
> >
> > So the virtio PMD driver does not build for targets like i686 and arm64. With
> > RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
> > will work in non-vectored virtio mode.
> >
> > Signed-off-by: Santosh Shukla <sshukla@mvista.com>
> > ---
> 
> Ping?
> 
> any review  / comment on this patch much appreciated. Thanks

Why not check for SSE/AVX support instead of adding yet another config option?
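
For reference, a minimal, self-contained sketch (not DPDK code) of the kind of
compile-time check suggested here; it assumes nothing beyond the standard
compiler predefines __SSE3__, __AVX__ and __ARM_NEON:

/* Pick a receive path from compiler-defined CPU-feature macros instead
 * of a hand-maintained config option.  Illustrative only. */
#include <stdio.h>

static const char *
virtio_rx_path(void)
{
#if defined(__SSE3__) || defined(__AVX__)
	return "vectored rx path (SSE/AVX)";
#elif defined(__ARM_NEON)
	return "vectored rx path (NEON) -- once such an implementation exists";
#else
	return "scalar rx fallback";
#endif
}

int
main(void)
{
	printf("selected path: %s\n", virtio_rx_path());
	return 0;
}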
  
Santosh Shukla Dec. 17, 2015, 12:18 p.m. UTC | #3
On Thu, Dec 17, 2015 at 5:33 PM, Thomas Monjalon
<thomas.monjalon@6wind.com> wrote:
> 2015-12-17 17:32, Santosh Shukla:
>> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
>> > virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
>> > instructions. For arm64 in particular, no virtio vector implementation exists
>> > yet (todo).
>> >
>> > So the virtio PMD driver does not build for targets like i686 and arm64. With
>> > RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
>> > will work in non-vectored virtio mode.
>> >
>> > Signed-off-by: Santosh Shukla <sshukla@mvista.com>
>> > ---
>>
>> Ping?
>>
>> any review  / comment on this patch much appreciated. Thanks
>
> Why not check for SSE/AVX support instead of adding yet another config option?

Ok, a check for SSE/AVX across the patch would not hold for future virtio
vectored implementations, say for the arm/arm64 case using sse2neon-style
intrinsics. That would mean users have to keep appending / adding checks,
for sse2neon for example, and so forth.

On the other hand, the motivation for the INC_VECTOR config option was
inspired by IXGBE and the other PMD drivers that support a vectored
SSE/AVX Rx path and can also work without vectored mode. Current virtio
is missing such support, and arm has no vectored sse2neon-style
implementation right now, so it is a blocker for the arm case. Also,
keeping the virtio PMD driver flexible enough to work in non-vectored
mode is a requirement / a feature.
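
A rough sketch of the contrast drawn above, for illustration only; the guard
macro names are hypothetical and only RTE_VIRTIO_INC_VECTOR comes from this
patch:

/* (1) ISA-based guard: every new vectored port (for example a future
 *     sse2neon-style arm64 one) has to extend this test in the sources. */
#if defined(__SSE3__) || defined(__AVX__) || defined(__ARM_NEON)
#define VIRTIO_VEC_BY_ISA 1
#else
#define VIRTIO_VEC_BY_ISA 0
#endif

/* (2) Config-based guard: the per-target config file decides, and the
 *     sources keep a single switch no matter how many ports exist. */
#ifdef RTE_VIRTIO_INC_VECTOR
#define VIRTIO_VEC_BY_CONFIG 1
#else
#define VIRTIO_VEC_BY_CONFIG 0
#endif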
  
Stephen Hemminger Dec. 17, 2015, 11:24 p.m. UTC | #4
On Thu, 17 Dec 2015 17:32:38 +0530
Santosh Shukla <sshukla@mvista.com> wrote:

> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
> > virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
> > instructions. For arm64 in particular, no virtio vector implementation exists
> > yet (todo).
> >
> > So the virtio PMD driver does not build for targets like i686 and arm64. With
> > RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
> > will work in non-vectored virtio mode.
> >
> > Signed-off-by: Santosh Shukla <sshukla@mvista.com>
> > ---
> 
> Ping?
> 
> any review  / comment on this patch much appreciated. Thanks

The patches I posted (and were ignored by Intel) to support indirect
and any layout should have much bigger performance gain than all this
low level SSE bit twiddling.
  
Yuanhan Liu Dec. 18, 2015, 1:31 a.m. UTC | #5
On Thu, Dec 17, 2015 at 03:24:35PM -0800, Stephen Hemminger wrote:
> On Thu, 17 Dec 2015 17:32:38 +0530
> Santosh Shukla <sshukla@mvista.com> wrote:
> 
> > On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
> > > virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
> > > instructions. For arm64 in particular, no virtio vector implementation exists
> > > yet (todo).
> > >
> > > So the virtio PMD driver does not build for targets like i686 and arm64. With
> > > RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
> > > will work in non-vectored virtio mode.
> > >
> > > Signed-off-by: Santosh Shukla <sshukla@mvista.com>
> > > ---
> > 
> > Ping?
> > 
> > any review  / comment on this patch much appreciated. Thanks
> 
> The patches I posted (and were ignored by Intel) to support indirect

Sorry, I thought it had been reviewed and applied (and I only started
reviewing virtio patches recently).

So, would you please send it out again? I will review and test it ASAP.

	--yliu

> and any layout should have much bigger performance gain than all this
> low level SSE bit twiddling.
  
Huawei Xie Dec. 18, 2015, 9:52 a.m. UTC | #6
On 12/18/2015 7:25 AM, Stephen Hemminger wrote:
> On Thu, 17 Dec 2015 17:32:38 +0530
> Santosh Shukla <sshukla@mvista.com> wrote:
>
>> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
>>> virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
>>> instructions. For arm64 in particular, no virtio vector implementation exists
>>> yet (todo).
>>>
>>> So the virtio PMD driver does not build for targets like i686 and arm64. With
>>> RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
>>> will work in non-vectored virtio mode.
>>>
>>> Signed-off-by: Santosh Shukla <sshukla@mvista.com>
>>> ---
>> Ping?
>>
>> any review  / comment on this patch much appreciated. Thanks
> The patches I posted (and were ignored by Intel) to support indirect
> and any layout should have much bigger performance gain than all this
> low level SSE bit twiddling.
Hi Stephen:
We only did SSE twiddling for RX, which almost doubles the performance
compared to the normal path in the virtio/vhost performance test case.
Enabling the indirect and any-layout features mostly helps TX. We also
did some optimization for the single-segment, non-offload case in TX,
without using SSE, which also gives ~60% performance improvement in
Qian's results. My optimization is mostly for the single-segment,
non-offload case, which I call simple rx/tx.
I plan to add a virtio/vhost performance benchmark so that we can easily
measure the performance difference for each patch.

Indirect and any-layout features are useful for multi-segment
transmitted packet mbufs. I had acked your patch the first time and
thought it had been applied. I don't understand why you say it was
ignored by Intel.

>
>
  
Thomas Monjalon Dec. 18, 2015, 10:41 a.m. UTC | #7
2015-12-18 09:52, Xie, Huawei:
> On 12/18/2015 7:25 AM, Stephen Hemminger wrote:
> > On Thu, 17 Dec 2015 17:32:38 +0530
> > Santosh Shukla <sshukla@mvista.com> wrote:
> >
> >> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
> >>> virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
> >>> instructions. For arm64 in particular, no virtio vector implementation exists
> >>> yet (todo).
> >>>
> >>> So the virtio PMD driver does not build for targets like i686 and arm64. With
> >>> RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
> >>> will work in non-vectored virtio mode.
> >>>
> >>> Signed-off-by: Santosh Shukla <sshukla@mvista.com>
> >>> ---
> >> Ping?
> >>
> >> any review  / comment on this patch much appreciated. Thanks
> > The patches I posted (and were ignored by Intel) to support indirect
> > and any layout should have much bigger performance gain than all this
> > low level SSE bit twiddling.
> Hi Stephen:
> We only did SSE twiddling to RX, which almost doubles the performance
> comparing to normal path in virtio/vhost performance test case. Indirect
> and any layout feature enabling are mostly for TX. We also did some
> optimization for single segment and non-offload case in TX, without
> using SSE, which also gives ~60% performance improvement, in Qian's
> result. My optimization is mostly for single segment and non-offload
> case, which i calls simple rx/tx.
> I plan to add virtio/vhost performance benchmark so that we could easily
> measure the performance difference for each patch.
> 
> Indirect and any layout features are useful for multiple segment
> transmitted packet mbufs. I had acked your patch at the first time, and
> thought it is applied. I don't understand why you say it is ignored by
> Intel.

There was an error and Stephen never replied nor pinged about it:
	http://dpdk.org/ml/archives/dev/2015-October/026984.html
It happens.
Reminder: it is the responsibility of the author to get patches reviewed
and accepted.
Please let's avoid useless blaming.
  
Santosh Shukla Dec. 18, 2015, 12:46 p.m. UTC | #8
On Fri, Dec 18, 2015 at 4:54 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Thu, 17 Dec 2015 17:32:38 +0530
> Santosh Shukla <sshukla@mvista.com> wrote:
>
>> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
>> > virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
>> > instructions. For arm64 in particular, no virtio vector implementation exists
>> > yet (todo).
>> >
>> > So the virtio PMD driver does not build for targets like i686 and arm64. With
>> > RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
>> > will work in non-vectored virtio mode.
>> >
>> > Signed-off-by: Santosh Shukla <sshukla@mvista.com>
>> > ---
>>
>> Ping?
>>
>> any review  / comment on this patch much appreciated. Thanks
>
> The patches I posted (and were ignored by Intel) to support indirect
> and any layout should have much bigger performance gain than all this
> low level SSE bit twiddling.
>

I am a little confused - do we still care about this patch?
  
Stephen Hemminger Dec. 18, 2015, 5:33 p.m. UTC | #9
On Fri, 18 Dec 2015 09:52:29 +0000
"Xie, Huawei" <huawei.xie@intel.com> wrote:

> > low level SSE bit twiddling.  
> Hi Stephen:
> We only did SSE twiddling for RX, which almost doubles the performance
> compared to the normal path in the virtio/vhost performance test case.
> Enabling the indirect and any-layout features mostly helps TX. We also
> did some optimization for the single-segment, non-offload case in TX,
> without using SSE, which also gives ~60% performance improvement in
> Qian's results. My optimization is mostly for the single-segment,
> non-offload case, which I call simple rx/tx.
> I plan to add a virtio/vhost performance benchmark so that we can easily
> measure the performance difference for each patch.
>
> Indirect and any-layout features are useful for multi-segment
> transmitted packet mbufs. I had acked your patch the first time and
> thought it had been applied. I don't understand why you say it was
> ignored by Intel.

Sorry, I did not mean to blame Intel, ... more to ask: why didn't it get into 2.2?

It turns out any layout/indirect helps all transmits because they can then
take a single tx descriptor rather than multiple.
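
To make the descriptor arithmetic behind that remark concrete, a small
self-contained sketch (illustrative only, not DPDK code) of how many ring
slots one transmitted packet occupies under each layout:

#include <stdio.h>

/* Ring slots used by one transmitted packet with nseg data segments. */
static unsigned int
tx_ring_slots(unsigned int nseg, int any_layout, int indirect)
{
	if (indirect)
		return 1;	/* one slot pointing at an indirect table */
	if (any_layout)
		return nseg;	/* virtio-net header shares the first buffer */
	return 1 + nseg;	/* separate descriptor just for the header */
}

int
main(void)
{
	printf("1 segment : default=%u any-layout=%u indirect=%u\n",
	       tx_ring_slots(1, 0, 0), tx_ring_slots(1, 1, 0),
	       tx_ring_slots(1, 0, 1));
	printf("3 segments: default=%u any-layout=%u indirect=%u\n",
	       tx_ring_slots(3, 0, 0), tx_ring_slots(3, 1, 0),
	       tx_ring_slots(3, 0, 1));
	return 0;
}

With indirect descriptors even a multi-segment packet consumes a single ring
slot, which is the sense in which it helps all transmits.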
  
Thomas Monjalon Dec. 18, 2015, 6:11 p.m. UTC | #10
2015-12-18 09:33, Stephen Hemminger:
> On Fri, 18 Dec 2015 09:52:29 +0000
> "Xie, Huawei" <huawei.xie@intel.com> wrote:
> > > low level SSE bit twiddling.  
> > Hi Stephen:
> > We only did SSE twiddling for RX, which almost doubles the performance
> > compared to the normal path in the virtio/vhost performance test case.
> > Enabling the indirect and any-layout features mostly helps TX. We also
> > did some optimization for the single-segment, non-offload case in TX,
> > without using SSE, which also gives ~60% performance improvement in
> > Qian's results. My optimization is mostly for the single-segment,
> > non-offload case, which I call simple rx/tx.
> > I plan to add a virtio/vhost performance benchmark so that we can easily
> > measure the performance difference for each patch.
> >
> > Indirect and any-layout features are useful for multi-segment
> > transmitted packet mbufs. I had acked your patch the first time and
> > thought it had been applied. I don't understand why you say it was
> > ignored by Intel.
> 
> Sorry, I did not mean to blame Intel, ... more to ask: why didn't it get into 2.2?

I've already answered this question:
http://dpdk.org/ml/archives/dev/2015-December/030540.html
There was a compilation error and you did not follow up.
  
Yuanhan Liu Dec. 22, 2015, 6:26 a.m. UTC | #11
On Fri, Dec 18, 2015 at 06:16:36PM +0530, Santosh Shukla wrote:
> On Fri, Dec 18, 2015 at 4:54 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > On Thu, 17 Dec 2015 17:32:38 +0530
> > Santosh Shukla <sshukla@mvista.com> wrote:
> >
> >> On Mon, Dec 14, 2015 at 6:30 PM, Santosh Shukla <sshukla@mvista.com> wrote:
> >> > virtio_recv_pkts_vec and the other virtio vector APIs are written for SSE/AVX
> >> > instructions. For arm64 in particular, no virtio vector implementation exists
> >> > yet (todo).
> >> >
> >> > So the virtio PMD driver does not build for targets like i686 and arm64. With
> >> > RTE_VIRTIO_INC_VECTOR=n, the driver can be built for non-SSE/AVX targets and
> >> > will work in non-vectored virtio mode.
> >> >
> >> > Signed-off-by: Santosh Shukla <sshukla@mvista.com>
> >> > ---
> >>
> >> Ping?
> >>
> >> any review  / comment on this patch much appreciated. Thanks
> >
> > The patches I posted (and were ignored by Intel) to support indirect
> > and any layout should have much bigger performance gain than all this
> > low level SSE bit twiddling.
> >
> 
> I am a little confused - do we still care about this patch?

Santosh,

As a reviewer who still has a lot of other work to do, I don't have the
bandwidth to review _all_ your patches carefully at once. That is to
say, I will only comment when I find something that should be commented
on, from time to time as I put more thought into them. For the other
patches I have not commented on, it could mean that they look okay to me
so far, or that I'm not quite sure but haven't found anything obviously
wrong. Hence I have made no comments so far. Later, when I get time, I
will revisit them, think more, and either ACK them or comment on them.

So you can simply keep the patches that received no comments unchanged,
address the other comments, and send out a new version whenever it
suits you.

	--yliu
  

Patch

diff --git a/config/common_linuxapp b/config/common_linuxapp
index ba9e55d..275fb40 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -273,6 +273,7 @@  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
 CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
 CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER=n
 CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
+CONFIG_RTE_VIRTIO_INC_VECTOR=y
 
 #
 # Compile burst-oriented VMXNET3 PMD driver
diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
index 43835ba..25a842d 100644
--- a/drivers/net/virtio/Makefile
+++ b/drivers/net/virtio/Makefile
@@ -50,7 +50,7 @@  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtqueue.c
 SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_pci.c
 SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
-SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
+SRCS-$(CONFIG_RTE_VIRTIO_INC_VECTOR) += virtio_rxtx_simple.c
 
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 74b39ef..23be1ff 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -438,7 +438,9 @@  virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,
 
 	dev->data->rx_queues[queue_idx] = vq;
 
+#ifdef RTE_VIRTIO_INC_VECTOR
 	virtio_rxq_vec_setup(vq);
+#endif
 
 	return 0;
 }
@@ -464,7 +466,10 @@  virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
 			const struct rte_eth_txconf *tx_conf)
 {
 	uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
+
+#ifdef RTE_VIRTIO_INC_VECTOR
 	struct virtio_hw *hw = dev->data->dev_private;
+#endif
 	struct virtqueue *vq;
 	uint16_t tx_free_thresh;
 	int ret;
@@ -477,6 +482,7 @@  virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
 		return -EINVAL;
 	}
 
+#ifdef RTE_VIRTIO_INC_VECTOR
 	/* Use simple rx/tx func if single segment and no offloads */
 	if ((tx_conf->txq_flags & VIRTIO_SIMPLE_FLAGS) == VIRTIO_SIMPLE_FLAGS &&
 	     !vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
@@ -485,6 +491,7 @@  virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
 		dev->rx_pkt_burst = virtio_recv_pkts_vec;
 		use_simple_rxtx = 1;
 	}
+#endif
 
 	ret = virtio_dev_queue_setup(dev, VTNET_TQ, queue_idx, vtpci_queue_idx,
 			nb_desc, socket_id, &vq);