[dpdk-dev,v3,2/8] ring: dynamic rss configuration

Message ID 1435589444-1988-3-git-send-email-tomaszx.kulasek@intel.com (mailing list archive)
State Rejected, archived

Commit Message

Tomasz Kulasek June 29, 2015, 2:50 p.m. UTC
This implementation allows setting and reading the RSS configuration of a
ring device, and is used to validate that the right values are propagated
to the slaves in the unit tests for dynamic RSS configuration for bonding.

It has no impact on packet processing by the ring device.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
---
 drivers/net/ring/rte_eth_ring.c |  120 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 116 insertions(+), 4 deletions(-)
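
For illustration, the kind of check the bonding unit tests can perform against
this device is sketched below using the standard ethdev calls of this era
(port ids were uint8_t). The helper name and port arguments are hypothetical;
only the rte_eth_dev_rss_hash_update()/rte_eth_dev_rss_hash_conf_get() calls
and struct rte_eth_rss_conf come from the ethdev API itself.

#include <stdint.h>
#include <string.h>
#include <rte_ethdev.h>

/* Hypothetical test helper: program an RSS key on the bonded port and
 * verify it shows up on a slave. Returns 0 when the keys match. */
static int
check_rss_key_propagation(uint8_t bonded_port, uint8_t slave_port)
{
	uint8_t key[40];
	uint8_t read_key[40];
	struct rte_eth_rss_conf conf = {
		.rss_key = key,
		.rss_key_len = sizeof(key),
		.rss_hf = ETH_RSS_IP,
	};
	struct rte_eth_rss_conf read_conf = {
		.rss_key = read_key,
		.rss_key_len = sizeof(read_key),
	};
	unsigned i;

	for (i = 0; i < sizeof(key); i++)
		key[i] = (uint8_t)i;	/* arbitrary test pattern */

	if (rte_eth_dev_rss_hash_update(bonded_port, &conf) != 0)
		return -1;
	if (rte_eth_dev_rss_hash_conf_get(slave_port, &read_conf) != 0)
		return -1;
	return memcmp(key, read_key, sizeof(key)) == 0 ? 0 : -1;
}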
  

Comments

Thomas Monjalon July 13, 2015, 12:36 p.m. UTC | #1
2015-06-29 16:50, Tomasz Kulasek:
> This implementation allows setting and reading the RSS configuration of a
> ring device, and is used to validate that the right values are propagated
> to the slaves in the unit tests for dynamic RSS configuration for bonding.
> 
> It has no impact on packet processing by the ring device.

Adding some fake RSS to the ring PMD (in order to test bonding) is weird.
The ring PMD is not a driver for testing. Maybe the null PMD would be
more appropriate.
By the way, the current RSS implementation is really bound to Intel devices.
Before applying it to more drivers, we have to make sure it is generic
enough. Maybe the RETA needs more abstraction.
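
For readers following along, the RETA calls in question take an array of
rte_eth_rss_reta_entry64 groups, each covering RTE_RETA_GROUP_SIZE (64)
table entries selected by a bit mask. Below is a minimal sketch of
programming an even spread over a 128-entry table; the helper name is
hypothetical, the types and the rte_eth_dev_rss_reta_update() call are
the ethdev API under discussion.

#include <stdint.h>
#include <string.h>
#include <rte_ethdev.h>

/* Hypothetical helper: spread a 128-entry redirection table evenly
 * across nb_rx_queues receive queues. */
static int
spread_reta(uint8_t port_id, uint16_t nb_rx_queues)
{
	struct rte_eth_rss_reta_entry64 reta_conf[ETH_RSS_RETA_SIZE_128 /
			RTE_RETA_GROUP_SIZE];
	uint16_t i;

	memset(reta_conf, 0, sizeof(reta_conf));
	for (i = 0; i < ETH_RSS_RETA_SIZE_128; i++) {
		/* The mask selects which of the 64 entries in each
		 * group the call actually updates. */
		reta_conf[i / RTE_RETA_GROUP_SIZE].mask |=
				1ULL << (i % RTE_RETA_GROUP_SIZE);
		reta_conf[i / RTE_RETA_GROUP_SIZE].reta[i % RTE_RETA_GROUP_SIZE] =
				i % nb_rx_queues;
	}
	return rte_eth_dev_rss_reta_update(port_id, reta_conf,
			ETH_RSS_RETA_SIZE_128);
}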
  
Tomasz Kulasek July 13, 2015, 2:43 p.m. UTC | #2
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, July 13, 2015 14:36
> To: Kulasek, TomaszX
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 2/8] ring: dynamic rss configuration
> 
> 2015-06-29 16:50, Tomasz Kulasek:
> > This implementation allows setting and reading the RSS configuration of
> > a ring device, and is used to validate that the right values are
> > propagated to the slaves in the unit tests for dynamic RSS configuration
> > for bonding.
> >
> > It has no impact on packet processing by the ring device.
> 
> Adding some fake RSS to the ring PMD (in order to test bonding) is weird.
> The ring PMD is not a driver for testing. Maybe the null PMD would be
> more appropriate.
> By the way, the current RSS implementation is really bound to Intel
> devices.
> Before applying it to more drivers, we have to make sure it is generic
> enough. Maybe the RETA needs more abstraction.

This is not an RSS implementation, but an implementation of dynamic RSS configuration, which already exists and is well defined in the official documentation for a single port.
I don't know where you see a binding to Intel devices. It's an implementation of the official API.

Anyway, I would appreciate any comments and opinions on that, and clarification of what more RETA abstraction means in the context of the official API.

I will check the possibility of moving it to the null PMD driver and will prepare a new version for easier reviewing. In the meantime, I'm waiting for more opinions.
  
Thomas Monjalon July 13, 2015, 3:11 p.m. UTC | #3
2015-07-13 14:43, Kulasek, TomaszX:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2015-06-29 16:50, Tomasz Kulasek:
> > > This implementation allows setting and reading the RSS configuration
> > > of a ring device, and is used to validate that the right values are
> > > propagated to the slaves in the unit tests for dynamic RSS
> > > configuration for bonding.
> > >
> > > It has no impact on packet processing by the ring device.
> > 
> > Adding some fake RSS to the ring PMD (in order to test bonding) is weird.
> > The ring PMD is not a driver for testing. Maybe the null PMD would be
> > more appropriate.
> > By the way, the current RSS implementation is really bound to Intel
> > devices.
> > Before applying it to more drivers, we have to make sure it is generic
> > enough. Maybe the RETA needs more abstraction.
> 
> This is not an RSS implementation, but an implementation of dynamic RSS
> configuration, which already exists and is well defined in the official
> documentation for a single port.
> I don't know where you see a binding to Intel devices.
> It's an implementation of the official API.

What do you mean by "official"?
My concern is that the RSS API in ethdev comes from a time when DPDK was
Intel DPDK. I may be wrong, but I think that other devices could need
something more generic and better defined.

> Anyway, I would appreciate any comments and opinions on that, and
> clarification of what more RETA abstraction means in the context of the
> official API.
> 
> I will check the possibility of moving it to the null PMD driver and will
> prepare a new version for easier reviewing. In the meantime, I'm waiting
> for more opinions.

Exactly, we need more opinions on this topic.
  
Tomasz Kulasek July 16, 2015, 1:02 p.m. UTC | #4
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, July 13, 2015 17:11
> To: Kulasek, TomaszX
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 2/8] ring: dynamic rss configuration
> 
> 2015-07-13 14:43, Kulasek, TomaszX:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2015-06-29 16:50, Tomasz Kulasek:
> > > > This implementation allows setting and reading the RSS
> > > > configuration of a ring device, and is used to validate that the
> > > > right values are propagated to the slaves in the unit tests for
> > > > dynamic RSS configuration for bonding.
> > > >
> > > > It has no impact on packet processing by the ring device.
> > >
> > > Adding some fake RSS to the ring PMD (in order to test bonding) is
> > > weird.
> > > The ring PMD is not a driver for testing. Maybe the null PMD
> > > would be more appropriate.
> > > By the way, the current RSS implementation is really bound to Intel
> > > devices.
> > > Before applying it to more drivers, we have to make sure it is
> > > generic enough. Maybe the RETA needs more abstraction.
> >
> > This is not an RSS implementation, but an implementation of dynamic
> > RSS configuration, which already exists and is well defined in the
> > official documentation for a single port.
> > I don't know where you see a binding to Intel devices.
> > It's an implementation of the official API.
> 
> What do you mean by "official"?
> My concern is that the RSS API in ethdev comes from a time when DPDK was
> Intel DPDK. I may be wrong, but I think that other devices could need
> something more generic and better defined.
> 
> > Anyway, I would appreciate any comments and opinions on that, and
> > clarification of what more RETA abstraction means in the context of
> > the official API.
> >
> > I will check the possibility of moving it to the null PMD driver and
> > will prepare a new version for easier reviewing. In the meantime, I'm
> > waiting for more opinions.
> 
> Exactly, we need more opinions on this topic.

Hi Thomas,
I have fixed the copy/paste error. I have also changed the ring PMD to the null PMD as you suggested. You are right, it is a better place. Thanks.

I sent a v4 patch-set.

This implementation uses the existing ethdev API for RSS and introduces generic enhancements for the configuration of bonding slaves.
If other devices need a more generic ethdev API, that should be treated separately, and separate patches can be submitted for it in a future release.
  

Patch

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index ada025f..29d7bd8 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -39,6 +39,7 @@ 
 #include <rte_string_fns.h>
 #include <rte_dev.h>
 #include <rte_kvargs.h>
+#include <rte_spinlock.h>
 
 #define ETH_RING_NUMA_NODE_ACTION_ARG	"nodeaction"
 #define ETH_RING_ACTION_CREATE		"CREATE"
@@ -66,6 +67,17 @@  struct pmd_internals {
 	struct ring_queue tx_ring_queues[RTE_PMD_RING_MAX_TX_RINGS];
 
 	struct ether_addr address;
+
+	/** Bit mask of RSS offloads, the bit offset also means flow type */
+	uint64_t flow_type_rss_offloads;
+
+	rte_spinlock_t rss_lock;
+
+	uint16_t reta_size;
+	struct rte_eth_rss_reta_entry64 reta_conf[ETH_RSS_RETA_SIZE_128 /
+			RTE_RETA_GROUP_SIZE];
+
+	uint8_t rss_key[40];                /**< 40-byte hash key. */
 };
 
 
@@ -173,6 +185,8 @@  eth_dev_info(struct rte_eth_dev *dev,
 	dev_info->max_tx_queues = (uint16_t)internals->nb_tx_queues;
 	dev_info->min_rx_bufsize = 0;
 	dev_info->pci_dev = NULL;
+	dev_info->flow_type_rss_offloads = internals->flow_type_rss_offloads;
+	dev_info->reta_size = internals->reta_size;
 }
 
 static void
@@ -234,6 +248,91 @@  static int
 eth_link_update(struct rte_eth_dev *dev __rte_unused,
 		int wait_to_complete __rte_unused) { return 0; }
 
+static int
+eth_rss_reta_update(struct rte_eth_dev *dev,
+		struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size)
+{
+	int i, j;
+	struct pmd_internals *internal = dev->data->dev_private;
+
+	if (reta_size != internal->reta_size)
+		return -EINVAL;
+
+	rte_spinlock_lock(&internal->rss_lock);
+
+	/* Copy RETA table */
+	for (i = 0; i < (internal->reta_size / RTE_RETA_GROUP_SIZE); i++) {
+		internal->reta_conf[i].mask = reta_conf[i].mask;
+		for (j = 0; j < RTE_RETA_GROUP_SIZE; j++)
+			if ((reta_conf[i].mask >> j) & 0x01)
+				internal->reta_conf[i].reta[j] = reta_conf[i].reta[j];
+	}
+
+	rte_spinlock_unlock(&internal->rss_lock);
+
+	return 0;
+}
+
+static int
+eth_rss_reta_query(struct rte_eth_dev *dev,
+		struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size)
+{
+	int i, j;
+	struct pmd_internals *internal = dev->data->dev_private;
+
+	if (reta_size != internal->reta_size)
+		return -EINVAL;
+
+	rte_spinlock_lock(&internal->rss_lock);
+
+	/* Copy RETA table */
+	for (i = 0; i < (internal->reta_size / RTE_RETA_GROUP_SIZE); i++) {
+		for (j = 0; j < RTE_RETA_GROUP_SIZE; j++)
+			if ((reta_conf[i].mask >> j) & 0x01)
+				reta_conf[i].reta[j] = internal->reta_conf[i].reta[j];
+	}
+
+	rte_spinlock_unlock(&internal->rss_lock);
+
+	return 0;
+}
+
+static int
+eth_rss_hash_update(struct rte_eth_dev *dev, struct rte_eth_rss_conf *rss_conf)
+{
+	struct pmd_internals *internal = dev->data->dev_private;
+
+	rte_spinlock_lock(&internal->rss_lock);
+
+	if ((rss_conf->rss_hf & internal->flow_type_rss_offloads) != 0)
+		dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf =
+				rss_conf->rss_hf & internal->flow_type_rss_offloads;
+
+	if (rss_conf->rss_key)
+		memcpy(internal->rss_key, rss_conf->rss_key, 40);
+
+	rte_spinlock_unlock(&internal->rss_lock);
+
+	return 0;
+}
+
+static int
+eth_rss_hash_conf_get(struct rte_eth_dev *dev,
+		struct rte_eth_rss_conf *rss_conf)
+{
+	struct pmd_internals *internal = dev->data->dev_private;
+
+	rte_spinlock_lock(&internal->rss_lock);
+
+	rss_conf->rss_hf = dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf;
+	if (rss_conf->rss_key)
+		memcpy(rss_conf->rss_key, internal->rss_key, 40);
+
+	rte_spinlock_unlock(&internal->rss_lock);
+
+	return 0;
+}
+
 static const struct eth_dev_ops ops = {
 	.dev_start = eth_dev_start,
 	.dev_stop = eth_dev_stop,
@@ -250,6 +349,10 @@  static const struct eth_dev_ops ops = {
 	.stats_reset = eth_stats_reset,
 	.mac_addr_remove = eth_mac_addr_remove,
 	.mac_addr_add = eth_mac_addr_add,
+	.reta_update = eth_rss_reta_update,
+	.reta_query = eth_rss_reta_query,
+	.rss_hash_update = eth_rss_hash_update,
+	.rss_hash_conf_get = eth_rss_hash_conf_get
 };
 
 int
@@ -268,6 +371,13 @@  rte_eth_from_rings(const char *name, struct rte_ring *const rx_queues[],
 
 	unsigned i;
 
+	static const uint8_t default_rss_key[40] = {
+		0x6D, 0x5A, 0x56, 0xDA, 0x25, 0x5B, 0x0E, 0xC2, 0x41, 0x67, 0x25, 0x3D,
+		0x43, 0xA3, 0x8F, 0xB0, 0xD0, 0xCA, 0x2B, 0xCB, 0xAE, 0x7B, 0x30, 0xB4,
+		0x77, 0xCB, 0x2D, 0xA3, 0x80, 0x30, 0xF2, 0x0C, 0x6A, 0x42, 0xB7, 0x3B,
+		0xBE, 0xAC, 0x01, 0xFA
+	};
+
 	/* do some parameter checking */
 	if (rx_queues == NULL && nb_rx_queues > 0)
 		goto error;
@@ -316,12 +426,14 @@  rte_eth_from_rings(const char *name, struct rte_ring *const rx_queues[],
 
 	internals->nb_rx_queues = nb_rx_queues;
 	internals->nb_tx_queues = nb_tx_queues;
-	for (i = 0; i < nb_rx_queues; i++) {
+	for (i = 0; i < nb_rx_queues; i++)
 		internals->rx_ring_queues[i].rng = rx_queues[i];
-	}
-	for (i = 0; i < nb_tx_queues; i++) {
+	for (i = 0; i < nb_tx_queues; i++)
 		internals->tx_ring_queues[i].rng = tx_queues[i];
-	}
+	internals->flow_type_rss_offloads = ETH_RSS_PROTO_MASK;
+	internals->reta_size = RTE_DIM(internals->reta_conf) * RTE_RETA_GROUP_SIZE;
+
+	memcpy(internals->rss_key, default_rss_key, 40);
 
 	eth_drv->pci_drv.name = ring_ethdev_driver_name;
 	eth_drv->pci_drv.id_table = id_table;
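
As a usage note, once a port has been created with rte_eth_from_rings(), the
new ops above are reachable through the public ethdev calls. The following is
a minimal roundtrip sketch under the assumption that `port` is such a ring
device with at least two RX queues; the helper name is hypothetical, while
the reta_update/reta_query paths are exactly the ones added by this patch.

#include <stdint.h>
#include <string.h>
#include <rte_ethdev.h>

/* Hypothetical check: write a RETA pattern through the new reta_update
 * op and read it back through reta_query. Returns 0 on a match. */
static int
ring_reta_roundtrip(uint8_t port)
{
	struct rte_eth_rss_reta_entry64 wr[ETH_RSS_RETA_SIZE_128 /
			RTE_RETA_GROUP_SIZE];
	struct rte_eth_rss_reta_entry64 rd[ETH_RSS_RETA_SIZE_128 /
			RTE_RETA_GROUP_SIZE];
	uint16_t i;

	memset(wr, 0, sizeof(wr));
	memset(rd, 0, sizeof(rd));
	for (i = 0; i < ETH_RSS_RETA_SIZE_128; i++) {
		wr[i / RTE_RETA_GROUP_SIZE].mask |=
				1ULL << (i % RTE_RETA_GROUP_SIZE);
		wr[i / RTE_RETA_GROUP_SIZE].reta[i % RTE_RETA_GROUP_SIZE] =
				i & 1;	/* alternate between queues 0 and 1 */
	}
	rd[0].mask = UINT64_MAX;	/* read back all 64 entries of group 0 */
	rd[1].mask = UINT64_MAX;	/* and all 64 entries of group 1 */

	if (rte_eth_dev_rss_reta_update(port, wr, ETH_RSS_RETA_SIZE_128) != 0)
		return -1;
	if (rte_eth_dev_rss_reta_query(port, rd, ETH_RSS_RETA_SIZE_128) != 0)
		return -1;

	for (i = 0; i < ETH_RSS_RETA_SIZE_128; i++)
		if (rd[i / RTE_RETA_GROUP_SIZE].reta[i % RTE_RETA_GROUP_SIZE] !=
				wr[i / RTE_RETA_GROUP_SIZE].reta[i % RTE_RETA_GROUP_SIZE])
			return -1;
	return 0;
}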