[dpdk-dev,2/5] ethdev: add port ownership

Message ID 1511870281-15282-3-git-send-email-matan@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Matan Azrad Nov. 28, 2017, 11:57 a.m. UTC
The ownership of a port is implicit in DPDK.
Making it explicit is better for the following reasons:
1. It may be convenient for multi-process applications to know which
   process is in charge of a port.
2. A library could work on top of a port.
3. A port can work on top of another port.

Also, in the fail-safe case, an issue was encountered in testpmd.
We need to check that the user is not trying to use a port which is
already managed by fail-safe.

Add an ownership mechanism to DPDK Ethernet devices to prevent a device
from being managed by several DPDK entities at once.

A port owner consists of an owner id (number) and an owner name (string).
The owner id must be unique, so that two instances of the same entity can
be told apart; the owner name can be any name.
The name lets other DPDK entities recognize the owner logically and
eases debugging.
Each DPDK entity can allocate a unique owner identifier and use it,
together with its preferred name, to own valid ethdev ports.
Each DPDK entity can query the owner of any port to decide whether it
can manage the port.

The current ethdev internal port management is not affected by this
feature.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
 lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  12 ++++
 4 files changed, 230 insertions(+), 1 deletion(-)
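
For orientation, the flow the commit message describes (allocate an owner id, claim a port, refuse a second entity, release the port) can be sketched as a standalone C model. Every model_* name below is hypothetical and only mirrors the shape of the API in this patch; it is not the DPDK implementation, and it leaves out the primary-process check and the logging.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Standalone model: names mirror the patch but are NOT the DPDK symbols. */
#define MODEL_MAX_PORTS 4
#define MODEL_NO_OWNER 0
#define MODEL_OWNER_NAME_LEN 64

struct model_owner {
	uint16_t id;                     /* unique owner identifier */
	char name[MODEL_OWNER_NAME_LEN]; /* free-form owner name */
};

static struct model_owner model_ports[MODEL_MAX_PORTS];
static uint16_t model_next_owner_id = MODEL_NO_OWNER + 1;

/* Same rule as the patch: id 0 and ids not yet handed out are invalid. */
int model_is_valid_owner_id(uint16_t owner_id)
{
	return !(owner_id == MODEL_NO_OWNER ||
		 (model_next_owner_id != MODEL_NO_OWNER &&
		  model_next_owner_id <= owner_id));
}

/* Hand out a fresh id; fails once the 16-bit counter has wrapped to 0. */
int model_owner_new(uint16_t *owner_id)
{
	if (model_next_owner_id == MODEL_NO_OWNER)
		return -EUSERS;
	*owner_id = model_next_owner_id++;
	return 0;
}

/* Claim a port; refused with -EPERM if another owner already holds it. */
int model_owner_set(uint16_t port_id, const struct model_owner *owner)
{
	struct model_owner *cur;

	if (port_id >= MODEL_MAX_PORTS)
		return -ENODEV;
	if (!model_is_valid_owner_id(owner->id))
		return -EINVAL;
	cur = &model_ports[port_id];
	if (cur->id != MODEL_NO_OWNER && cur->id != owner->id)
		return -EPERM;
	snprintf(cur->name, sizeof(cur->name), "%s", owner->name);
	cur->id = owner->id;
	return 0;
}

/* Make a port ownerless again; only its current owner may do so. */
int model_owner_remove(uint16_t port_id, uint16_t owner_id)
{
	if (port_id >= MODEL_MAX_PORTS)
		return -ENODEV;
	if (!model_is_valid_owner_id(owner_id))
		return -EINVAL;
	if (model_ports[port_id].id != owner_id)
		return -EPERM;
	memset(&model_ports[port_id], 0, sizeof(model_ports[port_id]));
	return 0;
}

/* NULL when the port is invalid or ownerless, else its owner. */
const struct model_owner *model_owner_get(uint16_t port_id)
{
	if (port_id >= MODEL_MAX_PORTS ||
	    model_ports[port_id].id == MODEL_NO_OWNER)
		return NULL;
	return &model_ports[port_id];
}

/* Walk through the intended flow; returns 0 if every step behaves as described. */
int model_demo(void)
{
	struct model_owner failsafe = { .name = "fail-safe" };
	struct model_owner app = { .name = "testpmd" };

	if (model_owner_new(&failsafe.id) != 0 || model_owner_new(&app.id) != 0)
		return -1;
	if (model_owner_set(0, &failsafe) != 0) /* fail-safe claims port 0 */
		return -1;
	if (model_owner_set(0, &app) != -EPERM) /* a second entity is refused */
		return -1;
	if (model_owner_get(1) != NULL)         /* port 1 is still ownerless */
		return -1;
	if (model_owner_remove(0, failsafe.id) != 0 || model_owner_get(0) != NULL)
		return -1;
	return 0;
}
```

In this model, as in the patch, nothing stops an entity from using a port it does not own; the owner record is purely advisory and each entity has to consult it.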
  

Comments

Neil Horman Nov. 30, 2017, 12:36 p.m. UTC | #1
On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> The ownership of a port is implicit in DPDK.
> Making it explicit is better for the following reasons:
> 1. It may be convenient for multi-process applications to know which
>    process is in charge of a port.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also in the fail-safe case, an issue has been met in testpmd.
> We need to check that the user is not trying to use a port which is
> already managed by fail-safe.
> 
> Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> management of a device by different DPDK entities.
> 
> A port owner is built from owner id(number) and owner name(string) while
> the owner id must be unique to distinguish between two identical entity
> instances and the owner name can be any name.
> The name helps to logically recognize the owner by different DPDK
> entities and allows easy debug.
> Each DPDK entity can allocate an owner unique identifier and can use it
> and its preferred name to own valid ethdev ports.
> Each DPDK entity can get any port owner status to decide if it can
> manage the port or not.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>


This seems fairly racy.  What if one thread attempts to set ownership on a port
while another is checking it on another CPU in parallel?  It doesn't seem like
it will protect against that at all.  It also doesn't protect against the
possibility of multiple threads attempting to poll for rx in parallel, which I
think was part of Thomas's original statement regarding port ownership (he
noted that the lockless design implied only a single thread should be allowed to
poll for receive or make configuration changes at a time).

Neil
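
A minimal sketch of one way the check-then-set race described above could be closed, assuming C11 atomics and a single owner-id slot per port. The sketch_* names are hypothetical and nothing like this is in the patch, which stores the id as a plain uint16_t. It covers only the ownership claim itself; it does not address parallel rx polling, and it ignores the owner name, which could not be updated in the same atomic step.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NO_OWNER 0

/* Hypothetical owner-id slot for one port. */
static _Atomic uint16_t sketch_port_owner = SKETCH_NO_OWNER;

/*
 * Claim the port only if it is currently ownerless. The compare-and-swap
 * turns "check" and "set" into one atomic step, so two racing callers
 * cannot both succeed.
 */
bool sketch_try_claim(uint16_t owner_id)
{
	uint16_t expected = SKETCH_NO_OWNER;

	if (owner_id == SKETCH_NO_OWNER)
		return false;
	return atomic_compare_exchange_strong(&sketch_port_owner,
					      &expected, owner_id);
}

/* Release the port only if the caller is its current owner. */
bool sketch_release(uint16_t owner_id)
{
	uint16_t expected = owner_id;

	if (owner_id == SKETCH_NO_OWNER)
		return false;
	return atomic_compare_exchange_strong(&sketch_port_owner,
					      &expected, SKETCH_NO_OWNER);
}
```

With this shape, a caller that loses the race gets false back instead of silently overwriting the winner.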

> ---
>  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
>  lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev_version.map |  12 ++++
>  4 files changed, 230 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> index 6a0c9f9..af639a1 100644
> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
>  
>  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
>  
> -Device Identification and Configuration
> +Device Identification, Ownership  and Configuration
>  ---------------------------------------
>  
>  Device Identification
> @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
>  *   A port name used to designate the port in console messages, for administration or debugging purposes.
>      For ease of use, the port name includes the port index.
>  
> +Port Ownership
> +~~~~~~~~~~~~~
> +The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
> +The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
> +Allowing this should prevent any multiple management of Ethernet port by different entities.
> +
> +.. note::
> +
> +    It is the DPDK entity responsibility either to check the port owner before using it or to set the port owner to prevent others from using it.
> +
>  Device Configuration
>  ~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 2d754d9..836991e 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -71,6 +71,7 @@
>  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
>  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
>  static struct rte_eth_dev_data *rte_eth_dev_data;
> +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
>  static uint8_t eth_dev_last_created_port;
>  
>  /* spinlock for eth device callbacks */
> @@ -278,6 +279,7 @@ struct rte_eth_dev *
>  	if (eth_dev == NULL)
>  		return -EINVAL;
>  
> +	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
>  	eth_dev->state = RTE_ETH_DEV_UNUSED;
>  	return 0;
>  }
> @@ -293,6 +295,125 @@ struct rte_eth_dev *
>  		return 1;
>  }
>  
> +static int
> +rte_eth_is_valid_owner_id(uint16_t owner_id)
> +{
> +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> +	    rte_eth_next_owner_id <= owner_id)) {
> +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +uint16_t
> +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> +{
> +	while (port_id < RTE_MAX_ETHPORTS &&
> +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> +		port_id++;
> +
> +	if (port_id >= RTE_MAX_ETHPORTS)
> +		return RTE_MAX_ETHPORTS;
> +
> +	return port_id;
> +}
> +
> +int
> +rte_eth_dev_owner_new(uint16_t *owner_id)
> +{
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> +		return -EPERM;
> +	}
> +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> +		return -EUSERS;
> +	}
> +	*owner_id = rte_eth_next_owner_id++;
> +	return 0;
> +}
> +
> +int
> +rte_eth_dev_owner_set(const uint16_t port_id,
> +		      const struct rte_eth_dev_owner *owner)
> +{
> +	struct rte_eth_dev_owner *port_owner;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> +		return -EPERM;
> +	}
> +	if (!rte_eth_is_valid_owner_id(owner->id))
> +		return -EINVAL;
> +	port_owner = &rte_eth_devices[port_id].data->owner;
> +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> +	    port_owner->id != owner->id) {
> +		RTE_LOG(ERR, EAL,
> +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> +			port_id, port_owner->name, port_owner->id);
> +		return -EPERM;
> +	}
> +	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> +		       owner->name);
> +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> +		return -EINVAL;
> +	}
> +	port_owner->id = owner->id;
> +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> +			    owner->name, owner->id);
> +	return 0;
> +}
> +
> +int
> +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
> +{
> +	struct rte_eth_dev_owner *port_owner;
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	if (!rte_eth_is_valid_owner_id(owner_id))
> +		return -EINVAL;
> +	port_owner = &rte_eth_devices[port_id].data->owner;
> +	if (port_owner->id != owner_id) {
> +		RTE_LOG(ERR, EAL,
> +			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
> +			port_id, port_owner->name, port_owner->id, owner_id);
> +		return -EPERM;
> +	}
> +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
> +			port_owner->name, port_owner->id);
> +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> +	return 0;
> +}
> +
> +void
> +rte_eth_dev_owner_delete(const uint16_t owner_id)
> +{
> +	uint16_t p;
> +
> +	if (!rte_eth_is_valid_owner_id(owner_id))
> +		return;
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> +		memset(&rte_eth_devices[p].data->owner, 0,
> +		       sizeof(struct rte_eth_dev_owner));
> +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> +			    "%05d identifier has removed.\n", owner_id);
> +}
> +
> +const struct rte_eth_dev_owner *
> +rte_eth_dev_owner_get(const uint16_t port_id)
> +{
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> +	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
> +		return NULL;
> +	return &rte_eth_devices[port_id].data->owner;
> +}
> +
>  int
>  rte_eth_dev_socket_id(uint16_t port_id)
>  {
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 341c2d6..f54c26d 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
>  
>  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
>  
> +#define RTE_ETH_DEV_NO_OWNER 0
> +
> +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> +
> +struct rte_eth_dev_owner {
> +	uint16_t id; /**< The owner unique identifier. */
> +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
> +};
> +
>  /**
>   * @internal
>   * The data part, with no function pointers, associated with each ethernet device.
> @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
>  	int numa_node;  /**< NUMA node connection */
>  	struct rte_vlan_filter_conf vlan_filter_conf;
>  	/**< VLAN filter configuration. */
> +	struct rte_eth_dev_owner owner; /**< The port owner. */
>  };
>  
>  /** Device supports link state interrupt */
> @@ -1846,6 +1856,82 @@ struct rte_eth_dev_data {
>  
>  
>  /**
> + * Iterates over valid ethdev ports owned by a specific owner.
> + *
> + * @param port_id
> + *   The id of the next possible valid owned port.
> + * @param	owner_id
> + *  The owner identifier.
> + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
> + * @return
> + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
> + */
> +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
> +
> +/**
> + * Macro to iterate over all enabled ethdev ports owned by a specific owner.
> + */
> +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> +	for (p = rte_eth_find_next_owned_by(0, o); \
> +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> +	     p = rte_eth_find_next_owned_by(p + 1, o))
> +
> +/**
> + * Get a new unique owner identifier.
> + * An owner identifier is used to owns Ethernet devices by only one DPDK entity
> + * to avoid multiple management of device by different entities.
> + *
> + * @param	owner_id
> + *   Owner identifier pointer.
> + * @return
> + *   Negative errno value on error, 0 on success.
> + */
> +int rte_eth_dev_owner_new(uint16_t *owner_id);
> +
> +/**
> + * Set an Ethernet device owner.
> + *
> + * @param	port_id
> + *  The identifier of the port to own.
> + * @param	owner
> + *  The owner pointer.
> + * @return
> + *  Negative errno value on error, 0 on success.
> + */
> +int rte_eth_dev_owner_set(const uint16_t port_id,
> +			  const struct rte_eth_dev_owner *owner);
> +
> +/**
> + * Remove Ethernet device owner to make the device ownerless.
> + *
> + * @param	port_id
> + *  The identifier of port to make ownerless.
> + * @param	owner
> + *  The owner identifier.
> + * @return
> + *  0 on success, negative errno value on error.
> + */
> +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
> +
> +/**
> + * Remove owner from all Ethernet devices owned by a specific owner.
> + *
> + * @param	owner
> + *  The owner identifier.
> + */
> +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> +
> +/**
> + * Get the owner of an Ethernet device.
> + *
> + * @param	port_id
> + *  The port identifier.
> + * @return
> + *  NULL when the device is ownerless, else the device owner pointer.
> + */
> +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
> +
> +/**
>   * Get the total number of Ethernet devices that have been successfully
>   * initialized by the matching Ethernet driver during the PCI probing phase
>   * and that are available for applications to use. These devices must be
> diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> index e9681ac..7d07edb 100644
> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> @@ -198,6 +198,18 @@ DPDK_17.11 {
>  
>  } DPDK_17.08;
>  
> +DPDK_18.02 {
> +	global:
> +
> +	rte_eth_find_next_owned_by;
> +	rte_eth_dev_owner_new;
> +	rte_eth_dev_owner_set;
> +	rte_eth_dev_owner_remove;
> +	rte_eth_dev_owner_delete;
> +	rte_eth_dev_owner_get;
> +
> +} DPDK_17.11;
> +
>  EXPERIMENTAL {
>  	global:
>  
> -- 
> 1.8.3.1
> 
>
  
Gaëtan Rivet Nov. 30, 2017, 1:24 p.m. UTC | #2
Hello Matan, Neil,

I like the port ownership concept. I think it is needed to clarify some
operations and should be useful to several subsystems.

This patch could certainly be sub-divided however, and your current 1/5
should probably come after this one.

Some comments inline.

On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > The ownership of a port is implicit in DPDK.
> > Making it explicit is better for the following reasons:
> > 1. It may be convenient for multi-process applications to know which
> >    process is in charge of a port.
> > 2. A library could work on top of a port.
> > 3. A port can work on top of another port.
> > 
> > Also in the fail-safe case, an issue has been met in testpmd.
> > We need to check that the user is not trying to use a port which is
> > already managed by fail-safe.
> > 
> > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > management of a device by different DPDK entities.
> > 
> > A port owner is built from owner id(number) and owner name(string) while
> > the owner id must be unique to distinguish between two identical entity
> > instances and the owner name can be any name.
> > The name helps to logically recognize the owner by different DPDK
> > entities and allows easy debug.
> > Each DPDK entity can allocate an owner unique identifier and can use it
> > and its preferred name to own valid ethdev ports.
> > Each DPDK entity can get any port owner status to decide if it can
> > manage the port or not.
> > 
> > The current ethdev internal port management is not affected by this
> > feature.
> > 

The internal port management is not affected, but the external interface
is. In order to respect port ownership, applications are forced
to modify their port iterator, as shown by your testpmd patch.

I think it would be better to modify the current RTE_ETH_FOREACH_DEV to call
RTE_ETH_FOREACH_DEV_OWNED_BY, and introduce a default owner that would
represent the application itself (probably with the ID 0 and an owner
string ""). Only with specific additional configuration should this
default subset of ethdev be divided.

This would make this evolution seamless for applications, at no cost to
the complexity of the design.
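
The suggestion above could look roughly like the following standalone sketch. The SKETCH_* and sketch_* names are hypothetical stand-ins for the ethdev symbols, and the assumption that id 0 means "owned by the application" comes from the proposal, not from the patch, where 0 is RTE_ETH_DEV_NO_OWNER.

```c
#include <stdint.h>

#define SKETCH_MAX_PORTS 8
#define SKETCH_DEFAULT_OWNER 0 /* assumption: id 0 stands for the application */

struct sketch_port {
	int attached;      /* non-zero when the port is valid */
	uint16_t owner_id; /* SKETCH_DEFAULT_OWNER unless another entity claimed it */
};

struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/* Same search loop as rte_eth_find_next_owned_by() in the patch, on model data. */
uint16_t sketch_find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < SKETCH_MAX_PORTS &&
	       (!sketch_ports[port_id].attached ||
		sketch_ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id < SKETCH_MAX_PORTS ? port_id : SKETCH_MAX_PORTS;
}

#define SKETCH_FOREACH_DEV_OWNED_BY(p, o) \
	for ((p) = sketch_find_next_owned_by(0, (o)); \
	     (p) < SKETCH_MAX_PORTS; \
	     (p) = sketch_find_next_owned_by((uint16_t)((p) + 1), (o)))

/* The existing iterator becomes a thin wrapper over the default owner. */
#define SKETCH_FOREACH_DEV(p) \
	SKETCH_FOREACH_DEV_OWNED_BY(p, SKETCH_DEFAULT_OWNER)

/*
 * Three attached ports, one of them claimed by owner 7. Returns how many
 * ports an unmodified application loop would visit.
 */
unsigned int sketch_demo(void)
{
	unsigned int n = 0;
	uint16_t p;

	sketch_ports[0].attached = 1;
	sketch_ports[1].attached = 1;
	sketch_ports[1].owner_id = 7;
	sketch_ports[3].attached = 1;

	SKETCH_FOREACH_DEV(p)
		n++;
	return n;
}
```

An application that keeps using the plain iterator then skips port 1 without any source change, which is the seamless behavior the proposal is after.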

> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> 
> 
> This seems fairly racy.  What if one thread attempts to set ownership on a port
> while another is checking it on another CPU in parallel?  It doesn't seem like
> it will protect against that at all.  It also doesn't protect against the
> possibility of multiple threads attempting to poll for rx in parallel, which I
> think was part of Thomas's original statement regarding port ownership (he
> noted that the lockless design implied only a single thread should be allowed to
> poll for receive or make configuration changes at a time).
> 
> Neil
> 

Isn't this race already there for any configuration operation / polling
function? The DPDK arch is expecting applications to solve it. Why should
port ownership be designed differently from other DPDK components?

Embedding checks for port ownership within operations will force
everyone to bear their costs, even those not interested in using this
API. These checks should be kept outside, within the entity claiming
ownership of the port, in the form of using the proper port iterator
IMO.

  
Matan Azrad Nov. 30, 2017, 2:30 p.m. UTC | #3
Hi all

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, November 30, 2017 3:25 PM
> To: Neil Horman <nhorman@tuxdriver.com>
> Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hello Matan, Neil,
> 
> I like the port ownership concept. I think it is needed to clarify some
> operations and should be useful to several subsystems.
> 
> This patch could certainly be sub-divided however, and your current 1/5
> should probably come after this one.

Can you suggest how to divide it?

1/5 could actually live outside this series; it is just better behavior to use the right function to release the port :)

> 
> Some comments inline.
> 
> On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > The ownership of a port is implicit in DPDK.
> > > Making it explicit is better for the following reasons:
> > > 1. It may be convenient for multi-process applications to know which
> > >    process is in charge of a port.
> > > 2. A library could work on top of a port.
> > > 3. A port can work on top of another port.
> > >
> > > Also in the fail-safe case, an issue has been met in testpmd.
> > > We need to check that the user is not trying to use a port which is
> > > already managed by fail-safe.
> > >
> > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > management of a device by different DPDK entities.
> > >
> > > A port owner is built from owner id(number) and owner name(string)
> > > while the owner id must be unique to distinguish between two
> > > identical entity instances and the owner name can be any name.
> > > The name helps to logically recognize the owner by different DPDK
> > > entities and allows easy debug.
> > > Each DPDK entity can allocate an owner unique identifier and can use
> > > it and its preferred name to own valid ethdev ports.
> > > Each DPDK entity can get any port owner status to decide if it can
> > > manage the port or not.
> > >
> > > The current ethdev internal port management is not affected by this
> > > feature.
> > >
> 
> The internal port management is not affected, but the external interface is,
> however. In order to respect port ownership, applications are forced to
> modify their port iterator, as shown by your testpmd patch.
> 
> I think it would be better to modify the current RTE_ETH_FOREACH_DEV to
> call RTE_ETH_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> would represent the application itself (probably with the ID 0 and an owner
> string ""). Only with specific additional configuration should this default
> subset of ethdev be divided.
> 
> This would make this evolution seamless for applications, at no cost to the
> complexity of the design.

As you can see in the patch code and in the testpmd example, I added an option to iterate over all valid ownerless ports by passing owner_id = 0, which should be the more relevant case.
So maybe RTE_ETH_FOREACH_DEV should be changed to run through the new iterator.
That way current applications don't need to build their own owners, but the ABI will be broken.

Actually, I think the old iterator RTE_ETH_FOREACH_DEV should be unexposed or just removed,
and the DEFERRED state should be removed as well.
I don't really see any use for iterating over all valid ports by DPDK entities other than ethdev itself.
I just don't want to break it now.

> 
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> >
> >
> > This seems fairly racy.  What if one thread attempts to set ownership
> > on a port, while another is checking it on another CPU in parallel?
> > It doesn't seem like it will protect against that at all. It also
> > doesn't protect against the possibility of multiple threads attempting
> > to poll for rx in parallel, which I think was part of Thomas's
> > original statement regarding port ownership (he noted that the
> > lockless design implied only a single thread should be allowed to poll
> > for receive or make configuration changes at a time).
> >
> > Neil
> >
> 
> Isn't this race already there for any configuration operation / polling
> function? The DPDK arch is expecting applications to solve it. Why should port
> ownership be designed differently from other DPDK components?
> 
> Embedding checks for port ownership within operations will force everyone
> to bear their costs, even those not interested in using this API. These checks
> should be kept outside, within the entity claiming ownership of the port, in
> the form of using the proper port iterator IMO.

As Gaetan said, there are races in many places in ethdev; think about new port creation in parallel.
If one day ethdev becomes race-safe, then port ownership should be too.

> 
> > > ---
> > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > >  lib/librte_ether/rte_ethdev.c           | 121
> ++++++++++++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > index 6a0c9f9..af639a1 100644
> > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > lock. This PMD feature found in som
> > >
> > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> capability probing details.
> > >
> > > -Device Identification and Configuration
> > > +Device Identification, Ownership  and Configuration
> > >  ---------------------------------------
> > >
> > >  Device Identification
> > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> assigned two other identifiers:
> > >  *   A port name used to designate the port in console messages, for
> administration or debugging purposes.
> > >      For ease of use, the port name includes the port index.
> > >
> > > +Port Ownership
> > > +~~~~~~~~~~~~~
> > > +The Ethernet devices ports can be owned by a single DPDK entity
> (application, library, PMD, process, etc).
> > > +The ownership mechanism is controlled by ethdev APIs and allows to
> set/remove/get a port owner by DPDK entities.
> > > +Allowing this should prevent any multiple management of Ethernet port
> by different entities.
> > > +
> > > +.. note::
> > > +
> > > +    It is the DPDK entity responsibility either to check the port owner
> before using it or to set the port owner to prevent others from using it.
> > > +
> > >  Device Configuration
> > >  ~~~~~~~~~~~~~~~~~~~~
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -71,6 +71,7 @@
> > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER +
> 1;
> > >  static uint8_t eth_dev_last_created_port;
> > >
> > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@ struct
> > > rte_eth_dev *
> > >  	if (eth_dev == NULL)
> > >  		return -EINVAL;
> > >
> > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > +rte_eth_dev_owner));
> > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > >  	return 0;
> > >  }
> > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > >  		return 1;
> > >  }
> > >
> > > +static int
> > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > +		return 0;
> > > +	}
> > > +	return 1;
> > > +}
> > > +
> > > +uint16_t
> > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > +owner_id) {
> > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > +		port_id++;
> > > +
> > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > +		return RTE_MAX_ETHPORTS;
> > > +
> > > +	return port_id;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> Ethernet port owners.\n");
> > > +		return -EUSERS;
> > > +	}
> > > +	*owner_id = rte_eth_next_owner_id++;
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > +		      const struct rte_eth_dev_owner *owner) {
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	int ret;
> > > +
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    port_owner->id != owner->id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot set owner to port %d already owned by
> %s_%05d.\n",
> > > +			port_id, port_owner->name, port_owner->id);
> > > +		return -EPERM;
> > > +	}
> > > +	ret = snprintf(port_owner->name,
> RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > +		       owner->name);
> > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > +		memset(port_owner->name, 0,
> RTE_ETH_MAX_OWNER_NAME_LEN);
> > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > +		return -EINVAL;
> > > +	}
> > > +	port_owner->id = owner->id;
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > +			    owner->name, owner->id);
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > +owner_id) {
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != owner_id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot remove port %d owner %s_%05d by
> different owner id %5d.\n",
> > > +			port_id, port_owner->name, port_owner->id,
> owner_id);
> > > +		return -EPERM;
> > > +	}
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> removed.\n", port_id,
> > > +			port_owner->name, port_owner->id);
> > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > +	return 0;
> > > +}
> > > +
> > > +void
> > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > +	uint16_t p;
> > > +
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return;
> > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > +		       sizeof(struct rte_eth_dev_owner));
> > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > +			    "%05d identifier has removed.\n", owner_id); }
> > > +
> > > +const struct rte_eth_dev_owner *
> > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > +	if (rte_eth_devices[port_id].data->owner.id ==
> RTE_ETH_DEV_NO_OWNER)
> > > +		return NULL;
> > > +	return &rte_eth_devices[port_id].data->owner;
> > > +}
> > > +
> > >  int
> > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index 341c2d6..f54c26d 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > >
> > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > >
> > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > +
> > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > +
> > > +struct rte_eth_dev_owner {
> > > +	uint16_t id; /**< The owner unique identifier. */
> > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> name. */ };
> > > +
> > >  /**
> > >   * @internal
> > >   * The data part, with no function pointers, associated with each
> ethernet device.
> > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > >  	int numa_node;  /**< NUMA node connection */
> > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > >  	/**< VLAN filter configuration. */
> > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > >  };
> > >
> > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82 @@
> > > struct rte_eth_dev_data {
> > >
> > >
> > >  /**
> > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > + *
> > > + * @param port_id
> > > + *   The id of the next possible valid owned port.
> > > + * @param	owner_id
> > > + *  The owner identifier.
> > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> ports.
> > > + * @return
> > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there
> is none.
> > > + */
> > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > +uint16_t owner_id);
> > > +
> > > +/**
> > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> owner.
> > > + */
> > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > +
> > > +/**
> > > + * Get a new unique owner identifier.
> > > + * An owner identifier is used to owns Ethernet devices by only one
> > > +DPDK entity
> > > + * to avoid multiple management of device by different entities.
> > > + *
> > > + * @param	owner_id
> > > + *   Owner identifier pointer.
> > > + * @return
> > > + *   Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > +
> > > +/**
> > > + * Set an Ethernet device owner.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of the port to own.
> > > + * @param	owner
> > > + *  The owner pointer.
> > > + * @return
> > > + *  Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > +			  const struct rte_eth_dev_owner *owner);
> > > +
> > > +/**
> > > + * Remove Ethernet device owner to make the device ownerless.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of port to make ownerless.
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + * @return
> > > + *  0 on success, negative errno value on error.
> > > + */
> > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > +owner_id);
> > > +
> > > +/**
> > > + * Remove owner from all Ethernet devices owned by a specific owner.
> > > + *
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + */
> > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Get the owner of an Ethernet device.
> > > + *
> > > + * @param	port_id
> > > + *  The port identifier.
> > > + * @return
> > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > + */
> > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > +uint16_t port_id);
> > > +
> > > +/**
> > >   * Get the total number of Ethernet devices that have been successfully
> > >   * initialized by the matching Ethernet driver during the PCI probing
> phase
> > >   * and that are available for applications to use. These devices
> > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > b/lib/librte_ether/rte_ethdev_version.map
> > > index e9681ac..7d07edb 100644
> > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > >
> > >  } DPDK_17.08;
> > >
> > > +DPDK_18.02 {
> > > +	global:
> > > +
> > > +	rte_eth_find_next_owned_by;
> > > +	rte_eth_dev_owner_new;
> > > +	rte_eth_dev_owner_set;
> > > +	rte_eth_dev_owner_remove;
> > > +	rte_eth_dev_owner_delete;
> > > +	rte_eth_dev_owner_get;
> > > +
> > > +} DPDK_17.11;
> > > +
> > >  EXPERIMENTAL {
> > >  	global:
> > >
> > > --
> > > 1.8.3.1
> > >
> > >
> 
> --
> Gaëtan Rivet
> 6WIND
  
Gaëtan Rivet Nov. 30, 2017, 3:09 p.m. UTC | #4
On Thu, Nov 30, 2017 at 02:30:20PM +0000, Matan Azrad wrote:
> Hi all
> 
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > Sent: Thursday, November 30, 2017 3:25 PM
> > To: Neil Horman <nhorman@tuxdriver.com>
> > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > Hello Matan, Neil,
> > 
> > I like the port ownership concept. I think it is needed to clarify some
> > operations and should be useful to several subsystems.
> > 
> > This patch could certainly be sub-divided however, and your current 1/5
> > should probably come after this one.
> 
> Can you suggest how to divide it?
> 

Adding first the API to add / remove owners, then in a second patch
set / get / unset. (by the way, remove / delete is pretty confusing, I'd
suggest renaming those.) You can also separate the introduction of the
new owner-wise iterator.

Ultimately, you are the author, it's your job to help us review your
work.

> 1/5 could be actually outside of this series, it is just better behavior to use the right function to do release port :)
> 
> > 
> > Some comments inline.
> > 
> > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > The ownership of a port is implicit in DPDK.
> > > > Making it explicit is better from the next reasons:
> > > > 1. It may be convenient for multi-process applications to know which
> > > >    process is in charge of a port.
> > > > 2. A library could work on top of a port.
> > > > 3. A port can work on top of another port.
> > > >
> > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > We need to check that the user is not trying to use a port which is
> > > > already managed by fail-safe.
> > > >
> > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > management of a device by different DPDK entities.
> > > >
> > > > A port owner is built from owner id(number) and owner name(string)
> > > > while the owner id must be unique to distinguish between two
> > > > identical entity instances and the owner name can be any name.
> > > > The name helps to logically recognize the owner by different DPDK
> > > > entities and allows easy debug.
> > > > Each DPDK entity can allocate an owner unique identifier and can use
> > > > it and its preferred name to owns valid ethdev ports.
> > > > Each DPDK entity can get any port owner status to decide if it can
> > > > manage the port or not.
> > > >
> > > > The current ethdev internal port management is not affected by this
> > > > feature.
> > > >
> > 
> > The internal port management is not affected, but the external interface is,
> > however. In order to respect port ownership, applications are forced to
> > modify their port iterator, as shown by your testpmd patch.
> > 
> > I think it would be better to modify the current RTE_ETH_FOREACH_DEV to
> > call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > would represent the application itself (probably with the ID 0 and an owner
> > string ""). Only with specific additional configuration should this default
> > subset of ethdev be divided.
> > 
> > This would make this evolution seamless for applications, at no cost to the
> > complexity of the design.
> 
> As you can see in patch code and in testpmd example I added option to iterate
> over all valid ownerless ports which should be more relevant by owner_id = 0.
> So maybe the RTE_ETH_FOREACH_DEV should be changed to run this by the new iterator.

That is precisely what I am suggesting.
Ideally, you should not have to change anything in testpmd, besides some
bug fixing regarding port iteration to avoid those with a specific
owner.

RTE_ETH_FOREACH_DEV must stay valid, and should iterate over ownerless
ports (read: port owned by the default owner).
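
A minimal sketch of this suggestion, assuming a default owner with ID 0.
It is not the ethdev code: the port table, MAX_PORTS, NO_OWNER and the
unprefixed function and macro names are stand-ins for illustration only.
The legacy iterator becomes a thin wrapper around the owner-aware one, so
existing callers keep compiling and simply stop seeing owned ports.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8   /* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER  0   /* mirrors RTE_ETH_DEV_NO_OWNER from the patch */

enum port_state { PORT_UNUSED = 0, PORT_ATTACHED };

/* Simplified stand-in for the per-port data touched by the patch. */
struct port {
	enum port_state state;
	uint16_t owner_id;
};

static struct port ports[MAX_PORTS];

/* Return the first attached port at or after port_id owned by owner_id,
 * or MAX_PORTS if there is none. */
static uint16_t
find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (ports[port_id].state != PORT_ATTACHED ||
		ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id < MAX_PORTS ? port_id : MAX_PORTS;
}

#define FOREACH_DEV_OWNED_BY(p, o) \
	for ((p) = find_next_owned_by(0, (o)); \
	     (unsigned int)(p) < (unsigned int)MAX_PORTS; \
	     (p) = find_next_owned_by((uint16_t)((p) + 1), (o)))

/* Legacy iterator: only visits ports held by the default owner. */
#define FOREACH_DEV(p) FOREACH_DEV_OWNED_BY(p, NO_OWNER)

/* Example caller written against the legacy iterator only. */
static int
count_default_ports(void)
{
	uint16_t p;
	int n = 0;

	FOREACH_DEV(p)
		n++;
	assert(n <= MAX_PORTS);
	return n;
}
```

With this shape, an application that never allocates an owner iterates
over exactly the ports nobody else has claimed, without source changes.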

> By this way current applications don't need to build their owners but the ABI will be broken.
> 

ABI is broken anyway as you will add the owner to rte_eth_dev_data.

> Actually, I think the old iterator RTE_ETH_FOREACH_DEV should be unexposed or just removed,

I don't think so. Using RTE_ETH_FOREACH_DEV should allow keeping the
current behavior of iterating over ownerless ports. Applications that do
not care for this API should not have to change anything in their code.

> also the DEFFERED state should be removed,

Of course.

> I don't really see any usage to iterate over all valid ports by DPDK entities different from ethdev itself.
> I just don't want to break it now.
> 

[snip]
  
Matan Azrad Nov. 30, 2017, 3:43 p.m. UTC | #5
Hi Gaetan

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, November 30, 2017 5:10 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Nov 30, 2017 at 02:30:20PM +0000, Matan Azrad wrote:
> > Hi all
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > Sent: Thursday, November 30, 2017 3:25 PM
> > > To: Neil Horman <nhorman@tuxdriver.com>
> > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > Hello Matan, Neil,
> > >
> > > I like the port ownership concept. I think it is needed to clarify
> > > some operations and should be useful to several subsystems.
> > >
> > > This patch could certainly be sub-divided however, and your current
> > > 1/5 should probably come after this one.
> >
> > Can you suggest how to divide it?
> >
> 
> Adding first the API to add / remove owners, then in a second patch set / get
> / unset. (by the way, remove / delete is pretty confusing, I'd suggest
> renaming those.) You can also separate the introduction of the new owner-
> wise iterator.
>
> Ultimately, you are the author, it's your job to help us review your work.
> 

When you suggest an improvement, I think you need to propose another method or idea.
The author has probably thought about it and arrived at his conclusions.
Exactly as you are doing now with the naming.
If you have a specific question, I'm here to answer :)

I agree with the name unset instead of remove; I will change it in V2.
  
> > 1/5 could be actually outside of this series, it is just better
> > behavior to use the right function to do release port :)
> >
> > >
> > > Some comments inline.
> > >
> > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better from the next reasons:
> > > > > 1. It may be convenient for multi-process applications to know which
> > > > >    process is in charge of a port.
> > > > > 2. A library could work on top of a port.
> > > > > 3. A port can work on top of another port.
> > > > >
> > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > We need to check that the user is not trying to use a port which
> > > > > is already managed by fail-safe.
> > > > >
> > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > multiple management of a device by different DPDK entities.
> > > > >
> > > > > A port owner is built from owner id(number) and owner
> > > > > name(string) while the owner id must be unique to distinguish
> > > > > between two identical entity instances and the owner name can be
> any name.
> > > > > The name helps to logically recognize the owner by different
> > > > > DPDK entities and allows easy debug.
> > > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to owns valid ethdev ports.
> > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > can manage the port or not.
> > > > >
> > > > > The current ethdev internal port management is not affected by
> > > > > this feature.
> > > > >
> > >
> > > The internal port management is not affected, but the external
> > > interface is, however. In order to respect port ownership,
> > > applications are forced to modify their port iterator, as shown by your
> testpmd patch.
> > >
> > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner
> that
> > > would represent the application itself (probably with the ID 0 and
> > > an owner string ""). Only with specific additional configuration
> > > should this default subset of ethdev be divided.
> > >
> > > This would make this evolution seamless for applications, at no cost
> > > to the complexity of the design.
> >
> > As you can see in patch code and in testpmd example I added option to
> > iterate over all valid ownerless ports which should be more relevant by
> owner_id = 0.
> > So maybe the RTE_ETH_FOREACH_DEV should be changed to run this by
> the new iterator.
> 
> That is precisely what I am suggesting.
> Ideally, you should not have to change anything in testpmd, beside some bug
> fixing regarding port iteration to avoid those with a specific owner.
> 
> RTE_ETH_FOREACH_DEV must stay valid, and should iterate over ownerless
> ports (read: port owned by the default owner).
> 
> > By this way current applications don't need to build their owners but the
> ABI will be broken.
> >
> 
> ABI is broken anyway as you will add the owner to rte_eth_dev_data.
> 

It is not; rte_eth_dev_data is internal.

> > Actually, I think the old iterator RTE_ETH_FOREACH_DEV should be
> > unexposed or just removed,
> 
> I don't think so. Using RTE_ETH_FOREACH_DEV should allow keeping the
> current behavior of iterating over ownerless ports. Applications that do not
> care for this API should not have to change anything to their code.
> 

If we break the ABI later, users can use RTE_ETH_FOREACH_DEV_OWNED_BY(p, 0)
to do it. RTE_ETH_FOREACH_DEV would then no longer be necessary, but maybe it is too much to ask applications to change the API as well.
I can agree with it.

> > also the DEFFERED state should be removed,
> 
> Of course.
> 
> > I don't really see any usage to iterate over all valid ports by DPDK entities
> different from ethdev itself.
> > I just don't want to break it now.
> >
> 
> [snip]
> 
> --
> Gaëtan Rivet
> 6WIND
  
Neil Horman Dec. 1, 2017, 12:09 p.m. UTC | #6
On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> Hello Matan, Neil,
> 
> I like the port ownership concept. I think it is needed to clarify some
> operations and should be useful to several subsystems.
> 
> This patch could certainly be sub-divided however, and your current 1/5
> should probably come after this one.
> 
> Some comments inline.
> 
> On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > The ownership of a port is implicit in DPDK.
> > > Making it explicit is better from the next reasons:
> > > 1. It may be convenient for multi-process applications to know which
> > >    process is in charge of a port.
> > > 2. A library could work on top of a port.
> > > 3. A port can work on top of another port.
> > > 
> > > Also in the fail-safe case, an issue has been met in testpmd.
> > > We need to check that the user is not trying to use a port which is
> > > already managed by fail-safe.
> > > 
> > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > management of a device by different DPDK entities.
> > > 
> > > A port owner is built from owner id(number) and owner name(string) while
> > > the owner id must be unique to distinguish between two identical entity
> > > instances and the owner name can be any name.
> > > The name helps to logically recognize the owner by different DPDK
> > > entities and allows easy debug.
> > > Each DPDK entity can allocate an owner unique identifier and can use it
> > > and its preferred name to owns valid ethdev ports.
> > > Each DPDK entity can get any port owner status to decide if it can
> > > manage the port or not.
> > > 
> > > The current ethdev internal port management is not affected by this
> > > feature.
> > > 
> 
> The internal port management is not affected, but the external interface
> is, however. In order to respect port ownership, applications are forced
> to modify their port iterator, as shown by your testpmd patch.
> 
> I think it would be better to modify the current RTE_ETH_FOREACH_DEV to call
> RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that would
> represent the application itself (probably with the ID 0 and an owner
> string ""). Only with specific additional configuration should this
> default subset of ethdev be divided.
> 
> This would make this evolution seamless for applications, at no cost to
> the complexity of the design.
> 
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > 
> > 
> > This seems fairly racy.  What if one thread attempts to set ownership on a port,
> > while another is checking it on another CPU in parallel?  It doesn't seem like
> > it will protect against that at all. It also doesn't protect against the
> > possibility of multiple threads attempting to poll for rx in parallel, which I
> > think was part of Thomas's original statement regarding port ownership (he
> > noted that the lockless design implied only a single thread should be allowed to
> > poll for receive or make configuration changes at a time).
> > 
> > Neil
> > 
> 
> Isn't this race already there for any configuration operation / polling
> function? The DPDK arch is expecting applications to solve it. Why should
> port ownership be designed differently from other DPDK components?
> 
Yes, but that doesn't mean it should exist in perpetuity, nor does it mean that
your new API should contain it as well.

> Embedding checks for port ownership within operations will force
> everyone to bear their costs, even those not interested in using this
> API. These checks should be kept outside, within the entity claiming
> ownership of the port, in the form of using the proper port iterator
> IMO.
> 
No.  At the very least, you need to make the API itself exclusive.  That is to
say, you should at least ensure that a port ownership get call doesn't race with
a port ownership set call.  It seems ridiculous to just leave that sort of
locking as an exercise to the user.
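
One way to picture this, as a rough sketch rather than a proposal for the
actual patch: serialize the owner set and get paths on a single lock, and
have get return a copy instead of a pointer into shared data. The names
below (owner_set, owner_get, owner_lock, MAX_PORTS, OWNER_NAME_LEN) are
hypothetical, and a pthread mutex stands in for whatever lock ethdev would
use. For multi-process use the lock would also have to live in shared
memory, which this sketch does not cover.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS      8   /* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER       0   /* mirrors RTE_ETH_DEV_NO_OWNER from the patch */
#define OWNER_NAME_LEN 64  /* mirrors RTE_ETH_MAX_OWNER_NAME_LEN */

struct owner {
	uint16_t id;
	char name[OWNER_NAME_LEN];
};

static struct owner port_owner[MAX_PORTS];
static pthread_mutex_t owner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Claim a port. The check and the update happen under the same lock,
 * so two callers cannot both see the port as ownerless. */
static int
owner_set(uint16_t port_id, const struct owner *o)
{
	int ret = 0;

	assert(o != NULL);
	if (port_id >= MAX_PORTS || o->id == NO_OWNER)
		return -EINVAL;
	/* Reject names that are not NUL-terminated within the buffer. */
	if (memchr(o->name, '\0', OWNER_NAME_LEN) == NULL)
		return -EINVAL;

	pthread_mutex_lock(&owner_lock);
	if (port_owner[port_id].id != NO_OWNER &&
	    port_owner[port_id].id != o->id)
		ret = -EPERM;
	else
		port_owner[port_id] = *o;
	pthread_mutex_unlock(&owner_lock);
	return ret;
}

/* Copy the owner out under the lock, so the caller never observes a
 * half-written id/name pair. */
static int
owner_get(uint16_t port_id, struct owner *out)
{
	assert(out != NULL);
	if (port_id >= MAX_PORTS)
		return -EINVAL;

	pthread_mutex_lock(&owner_lock);
	*out = port_owner[port_id];
	pthread_mutex_unlock(&owner_lock);
	return 0;
}
```

This only makes the ownership calls themselves exclusive. It does not
add any check to the rx/tx or configuration paths, so it leaves the cost
concern raised above untouched.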

Neil

> > > ---
> > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > >  lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> > > index 6a0c9f9..af639a1 100644
> > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
> > >  
> > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
> > >  
> > > -Device Identification and Configuration
> > > +Device Identification, Ownership  and Configuration
> > >  ---------------------------------------
> > >  
> > >  Device Identification
> > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
> > >  *   A port name used to designate the port in console messages, for administration or debugging purposes.
> > >      For ease of use, the port name includes the port index.
> > >  
> > > +Port Ownership
> > > +~~~~~~~~~~~~~
> > > +The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
> > > +The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
> > > +Allowing this should prevent any multiple management of Ethernet port by different entities.
> > > +
> > > +.. note::
> > > +
> > > +    It is the DPDK entity responsibility either to check the port owner before using it or to set the port owner to prevent others from using it.
> > > +
> > >  Device Configuration
> > >  ~~~~~~~~~~~~~~~~~~~~
> > >  
> > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > > index 2d754d9..836991e 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -71,6 +71,7 @@
> > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > >  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> > >  static uint8_t eth_dev_last_created_port;
> > >  
> > >  /* spinlock for eth device callbacks */
> > > @@ -278,6 +279,7 @@ struct rte_eth_dev *
> > >  	if (eth_dev == NULL)
> > >  		return -EINVAL;
> > >  
> > > +	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
> > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > >  	return 0;
> > >  }
> > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > >  		return 1;
> > >  }
> > >  
> > > +static int
> > > +rte_eth_is_valid_owner_id(uint16_t owner_id)
> > > +{
> > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > +		return 0;
> > > +	}
> > > +	return 1;
> > > +}
> > > +
> > > +uint16_t
> > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> > > +{
> > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > +		port_id++;
> > > +
> > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > +		return RTE_MAX_ETHPORTS;
> > > +
> > > +	return port_id;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_new(uint16_t *owner_id)
> > > +{
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> > > +		return -EUSERS;
> > > +	}
> > > +	*owner_id = rte_eth_next_owner_id++;
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > +		      const struct rte_eth_dev_owner *owner)
> > > +{
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	int ret;
> > > +
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    port_owner->id != owner->id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> > > +			port_id, port_owner->name, port_owner->id);
> > > +		return -EPERM;
> > > +	}
> > > +	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > +		       owner->name);
> > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > +		return -EINVAL;
> > > +	}
> > > +	port_owner->id = owner->id;
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > +			    owner->name, owner->id);
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
> > > +{
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != owner_id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
> > > +			port_id, port_owner->name, port_owner->id, owner_id);
> > > +		return -EPERM;
> > > +	}
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
> > > +			port_owner->name, port_owner->id);
> > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > +	return 0;
> > > +}
> > > +
> > > +void
> > > +rte_eth_dev_owner_delete(const uint16_t owner_id)
> > > +{
> > > +	uint16_t p;
> > > +
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return;
> > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > +		       sizeof(struct rte_eth_dev_owner));
> > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > +			    "%05d identifier has removed.\n", owner_id);
> > > +}
> > > +
> > > +const struct rte_eth_dev_owner *
> > > +rte_eth_dev_owner_get(const uint16_t port_id)
> > > +{
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > +	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
> > > +		return NULL;
> > > +	return &rte_eth_devices[port_id].data->owner;
> > > +}
> > > +
> > >  int
> > >  rte_eth_dev_socket_id(uint16_t port_id)
> > >  {
> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index 341c2d6..f54c26d 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > >  
> > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > >  
> > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > +
> > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > +
> > > +struct rte_eth_dev_owner {
> > > +	uint16_t id; /**< The owner unique identifier. */
> > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
> > > +};
> > > +
> > >  /**
> > >   * @internal
> > >   * The data part, with no function pointers, associated with each ethernet device.
> > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > >  	int numa_node;  /**< NUMA node connection */
> > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > >  	/**< VLAN filter configuration. */
> > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > >  };
> > >  
> > >  /** Device supports link state interrupt */
> > > @@ -1846,6 +1856,82 @@ struct rte_eth_dev_data {
> > >  
> > >  
> > >  /**
> > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > + *
> > > + * @param port_id
> > > + *   The id of the next possible valid owned port.
> > > + * @param	owner_id
> > > + *  The owner identifier.
> > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
> > > + * @return
> > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
> > > + */
> > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Macro to iterate over all enabled ethdev ports owned by a specific owner.
> > > + */
> > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > +
> > > +/**
> > > + * Get a new unique owner identifier.
> > > + * An owner identifier is used to owns Ethernet devices by only one DPDK entity
> > > + * to avoid multiple management of device by different entities.
> > > + *
> > > + * @param	owner_id
> > > + *   Owner identifier pointer.
> > > + * @return
> > > + *   Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > +
> > > +/**
> > > + * Set an Ethernet device owner.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of the port to own.
> > > + * @param	owner
> > > + *  The owner pointer.
> > > + * @return
> > > + *  Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > +			  const struct rte_eth_dev_owner *owner);
> > > +
> > > +/**
> > > + * Remove Ethernet device owner to make the device ownerless.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of port to make ownerless.
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + * @return
> > > + *  0 on success, negative errno value on error.
> > > + */
> > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Remove owner from all Ethernet devices owned by a specific owner.
> > > + *
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + */
> > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Get the owner of an Ethernet device.
> > > + *
> > > + * @param	port_id
> > > + *  The port identifier.
> > > + * @return
> > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > + */
> > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
> > > +
> > > +/**
> > >   * Get the total number of Ethernet devices that have been successfully
> > >   * initialized by the matching Ethernet driver during the PCI probing phase
> > >   * and that are available for applications to use. These devices must be
> > > diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> > > index e9681ac..7d07edb 100644
> > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > >  
> > >  } DPDK_17.08;
> > >  
> > > +DPDK_18.02 {
> > > +	global:
> > > +
> > > +	rte_eth_find_next_owned_by;
> > > +	rte_eth_dev_owner_new;
> > > +	rte_eth_dev_owner_set;
> > > +	rte_eth_dev_owner_remove;
> > > +	rte_eth_dev_owner_delete;
> > > +	rte_eth_dev_owner_get;
> > > +
> > > +} DPDK_17.11;
> > > +
> > >  EXPERIMENTAL {
> > >  	global:
> > >  
> > > -- 
> > > 1.8.3.1
> > > 
> > > 
> 
> -- 
> Gaëtan Rivet
> 6WIND
>
  
Matan Azrad Dec. 3, 2017, 8:04 a.m. UTC | #7
Hi

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, December 1, 2017 2:10 PM
> To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > Hello Matan, Neil,
> >
> > I like the port ownership concept. I think it is needed to clarify
> > some operations and should be useful to several subsystems.
> >
> > This patch could certainly be sub-divided however, and your current
> > 1/5 should probably come after this one.
> >
> > Some comments inline.
> >
> > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > The ownership of a port is implicit in DPDK.
> > > > Making it explicit is better from the next reasons:
> > > > 1. It may be convenient for multi-process applications to know which
> > > >    process is in charge of a port.
> > > > 2. A library could work on top of a port.
> > > > 3. A port can work on top of another port.
> > > >
> > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > We need to check that the user is not trying to use a port which
> > > > is already managed by fail-safe.
> > > >
> > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > management of a device by different DPDK entities.
> > > >
> > > > A port owner is built from owner id(number) and owner name(string)
> > > > while the owner id must be unique to distinguish between two
> > > > identical entity instances and the owner name can be any name.
> > > > The name helps to logically recognize the owner by different DPDK
> > > > entities and allows easy debug.
> > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > use it and its preferred name to owns valid ethdev ports.
> > > > Each DPDK entity can get any port owner status to decide if it can
> > > > manage the port or not.
> > > >
> > > > The current ethdev internal port management is not affected by
> > > > this feature.
> > > >
> >
> > The internal port management is not affected, but the external
> > interface is, however. In order to respect port ownership,
> > applications are forced to modify their port iterator, as shown by your
> > testpmd patch.
> >
> > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > would represent the application itself (probably with the ID 0 and an
> > owner string ""). Only with specific additional configuration should
> > this default subset of ethdev be divided.
> >
> > This would make this evolution seamless for applications, at no cost
> > to the complexity of the design.
> >
> > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > >
> > >
> > > This seems fairly racy.  What if one thread attempts to set
> > > ownership on a port, while another is checking it on another cpu in
> > > parallel.  It doesn't seem like it will protect against that at all.
> > > It also doesn't protect against the possibility of multiple threads
> > > attempting to poll for rx in parallel, which I think was part of
> > > > Thomas's original statement regarding port ownership (he noted that
> > > > the lockless design implied only a single thread should be allowed to poll
> > > > for receive or make configuration changes at a time).
> > >
> > > Neil
> > >
> >
> > Isn't this race already there for any configuration operation /
> > polling function? The DPDK arch is expecting applications to solve it.
> > Why should port ownership be designed differently from other DPDK
> > components?
> >
> Yes, but that doesn't mean it should exist in perpetuity, nor does it mean that
> your new API should contain it as well.
> 
> > Embedding checks for port ownership within operations will force
> > everyone to bear their costs, even those not interested in using this
> > API. These checks should be kept outside, within the entity claiming
> > ownership of the port, in the form of using the proper port iterator
> > IMO.
> >
> No.  At the very least, you need to make the API itself exclusive.  That is to
> say, you should at least ensure that a port ownership get call doesn't race
> with a port ownership set call.  It seems ridiculous to just leave that sort of
> locking as an exercise to the user.
> 
> Neil
> 
Neil,
As Thomas mentioned, a DPDK port is designed to be managed by only one
thread (or one synchronized DPDK entity).
So port management, including port ownership, does not need to be synchronized
inside ethdev, i.e. locks are not needed.
If an application wants to manage a port from two threads, the responsibility
for synchronizing port ownership, like any other port management, lies with that
application.
Port ownership is not meant to allow synchronized management of a port by
two DPDK entities in parallel; it is just a nice way to answer the following questions:
	1. Is the port already owned by some DPDK entity (in the early control path)?
	2. If yes, who is the owner?
If the answer to the first question is no, the current entity can take ownership
without any lock (one thread).
If the answer to the first question is yes, you can identify the owner and decide
accordingly: sometimes you can decide to use the port because you logically know
what the current owner does and can be logically synchronized with it; sometimes
you can just leave the port alone because you have nothing to do with its owner.
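
To make the two questions above concrete, here is a minimal single-threaded sketch of the check-then-claim flow. It does not use the real ethdev code: the `sketch_` names, the 4-port table and the `-1` return value are assumptions made for illustration, and only the "owner id 0 means no owner" convention and the 64-byte name are taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_NO_OWNER 0       /* mirrors RTE_ETH_DEV_NO_OWNER in the patch */
#define SKETCH_MAX_NAME_LEN 64  /* mirrors RTE_ETH_MAX_OWNER_NAME_LEN */
#define SKETCH_MAX_PORTS 4      /* hypothetical, kept small for illustration */

struct sketch_owner {
	uint16_t id;
	char name[SKETCH_MAX_NAME_LEN];
};

/* Hypothetical per-port owner table; zero-initialized means ownerless. */
static struct sketch_owner sketch_ports[SKETCH_MAX_PORTS];

/* Questions 1 and 2: NULL means ownerless, otherwise the owner is returned. */
static const struct sketch_owner *
sketch_owner_get(uint16_t port)
{
	if (port >= SKETCH_MAX_PORTS ||
	    sketch_ports[port].id == SKETCH_NO_OWNER)
		return NULL;
	return &sketch_ports[port];
}

/*
 * Claim a port without any lock; the single control thread is assumed.
 * Returns 0 on success, -1 on bad arguments or if another id owns the port.
 */
static int
sketch_owner_claim(uint16_t port, uint16_t id, const char *name)
{
	const struct sketch_owner *cur;

	if (port >= SKETCH_MAX_PORTS || id == SKETCH_NO_OWNER)
		return -1;
	cur = sketch_owner_get(port);
	if (cur != NULL && cur->id != id)
		return -1;
	snprintf(sketch_ports[port].name, SKETCH_MAX_NAME_LEN, "%s", name);
	sketch_ports[port].id = id;
	return 0;
}
```

A caller would first call `sketch_owner_get()` and, if it returns NULL, call `sketch_owner_claim()`; if it returns an owner, the caller inspects `name` and decides whether to cooperate with that owner or skip the port. The gap between the two calls is exactly the window Neil is objecting to when more than one thread is involved.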

> > > > ---
> > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > >  lib/librte_ether/rte_ethdev.c           | 121
> ++++++++++++++++++++++++++++++++
> > > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > index 6a0c9f9..af639a1 100644
> > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > > lock. This PMD feature found in som
> > > >
> > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> capability probing details.
> > > >
> > > > -Device Identification and Configuration
> > > > +Device Identification, Ownership  and Configuration
> > > >  ---------------------------------------
> > > >
> > > >  Device Identification
> > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> assigned two other identifiers:
> > > >  *   A port name used to designate the port in console messages, for
> administration or debugging purposes.
> > > >      For ease of use, the port name includes the port index.
> > > >
> > > > +Port Ownership
> > > > +~~~~~~~~~~~~~
> > > > +The Ethernet devices ports can be owned by a single DPDK entity
> (application, library, PMD, process, etc).
> > > > +The ownership mechanism is controlled by ethdev APIs and allows to
> set/remove/get a port owner by DPDK entities.
> > > > +Allowing this should prevent any multiple management of Ethernet
> port by different entities.
> > > > +
> > > > +.. note::
> > > > +
> > > > +    It is the DPDK entity responsibility either to check the port owner
> before using it or to set the port owner to prevent others from using it.
> > > > +
> > > >  Device Configuration
> > > >  ~~~~~~~~~~~~~~~~~~~~
> > > >
> > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > @@ -71,6 +71,7 @@
> > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER
> + 1;
> > > >  static uint8_t eth_dev_last_created_port;
> > > >
> > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > struct rte_eth_dev *
> > > >  	if (eth_dev == NULL)
> > > >  		return -EINVAL;
> > > >
> > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > +rte_eth_dev_owner));
> > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > >  	return 0;
> > > >  }
> > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > >  		return 1;
> > > >  }
> > > >
> > > > +static int
> > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > +		return 0;
> > > > +	}
> > > > +	return 1;
> > > > +}
> > > > +
> > > > +uint16_t
> > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > +owner_id) {
> > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > +		port_id++;
> > > > +
> > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > +		return RTE_MAX_ETHPORTS;
> > > > +
> > > > +	return port_id;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > > +		return -EPERM;
> > > > +	}
> > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> Ethernet port owners.\n");
> > > > +		return -EUSERS;
> > > > +	}
> > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > +	struct rte_eth_dev_owner *port_owner;
> > > > +	int ret;
> > > > +
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > > +		return -EPERM;
> > > > +	}
> > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > +		return -EINVAL;
> > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > +	    port_owner->id != owner->id) {
> > > > +		RTE_LOG(ERR, EAL,
> > > > +			"Cannot set owner to port %d already owned by
> %s_%05d.\n",
> > > > +			port_id, port_owner->name, port_owner->id);
> > > > +		return -EPERM;
> > > > +	}
> > > > +	ret = snprintf(port_owner->name,
> RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > +		       owner->name);
> > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > +		memset(port_owner->name, 0,
> RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +	port_owner->id = owner->id;
> > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > +			    owner->name, owner->id);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > > +owner_id) {
> > > > +	struct rte_eth_dev_owner *port_owner;
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > +		return -EINVAL;
> > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > +	if (port_owner->id != owner_id) {
> > > > +		RTE_LOG(ERR, EAL,
> > > > +			"Cannot remove port %d owner %s_%05d by
> different owner id %5d.\n",
> > > > +			port_id, port_owner->name, port_owner->id,
> owner_id);
> > > > +		return -EPERM;
> > > > +	}
> > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> removed.\n", port_id,
> > > > +			port_owner->name, port_owner->id);
> > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +void
> > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > +	uint16_t p;
> > > > +
> > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > +		return;
> > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > +			    "%05d identifier has removed.\n", owner_id); }
> > > > +
> > > > +const struct rte_eth_dev_owner *
> > > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> RTE_ETH_DEV_NO_OWNER)
> > > > +		return NULL;
> > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > +}
> > > > +
> > > >  int
> > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > index 341c2d6..f54c26d 100644
> > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > >
> > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > >
> > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > +
> > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > +
> > > > +struct rte_eth_dev_owner {
> > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> name. */
> > > > +};
> > > > +
> > > >  /**
> > > >   * @internal
> > > >   * The data part, with no function pointers, associated with each
> ethernet device.
> > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > >  	int numa_node;  /**< NUMA node connection */
> > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > >  	/**< VLAN filter configuration. */
> > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > >  };
> > > >
> > > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82
> > > > @@ struct rte_eth_dev_data {
> > > >
> > > >
> > > >  /**
> > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > + *
> > > > + * @param port_id
> > > > + *   The id of the next possible valid owned port.
> > > > + * @param	owner_id
> > > > + *  The owner identifier.
> > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> ports.
> > > > + * @return
> > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> there is none.
> > > > + */
> > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > +uint16_t owner_id);
> > > > +
> > > > +/**
> > > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> owner.
> > > > + */
> > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > +
> > > > +/**
> > > > + * Get a new unique owner identifier.
> > > > + * An owner identifier is used to owns Ethernet devices by only
> > > > +one DPDK entity
> > > > + * to avoid multiple management of device by different entities.
> > > > + *
> > > > + * @param	owner_id
> > > > + *   Owner identifier pointer.
> > > > + * @return
> > > > + *   Negative errno value on error, 0 on success.
> > > > + */
> > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > +
> > > > +/**
> > > > + * Set an Ethernet device owner.
> > > > + *
> > > > + * @param	port_id
> > > > + *  The identifier of the port to own.
> > > > + * @param	owner
> > > > + *  The owner pointer.
> > > > + * @return
> > > > + *  Negative errno value on error, 0 on success.
> > > > + */
> > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > +			  const struct rte_eth_dev_owner *owner);
> > > > +
> > > > +/**
> > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > + *
> > > > + * @param	port_id
> > > > + *  The identifier of port to make ownerless.
> > > > + * @param	owner
> > > > + *  The owner identifier.
> > > > + * @return
> > > > + *  0 on success, negative errno value on error.
> > > > + */
> > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > +uint16_t owner_id);
> > > > +
> > > > +/**
> > > > + * Remove owner from all Ethernet devices owned by a specific
> owner.
> > > > + *
> > > > + * @param	owner
> > > > + *  The owner identifier.
> > > > + */
> > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > +
> > > > +/**
> > > > + * Get the owner of an Ethernet device.
> > > > + *
> > > > + * @param	port_id
> > > > + *  The port identifier.
> > > > + * @return
> > > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > > + */
> > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > +uint16_t port_id);
> > > > +
> > > > +/**
> > > >   * Get the total number of Ethernet devices that have been
> successfully
> > > >   * initialized by the matching Ethernet driver during the PCI probing
> phase
> > > >   * and that are available for applications to use. These devices
> > > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > index e9681ac..7d07edb 100644
> > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > >
> > > >  } DPDK_17.08;
> > > >
> > > > +DPDK_18.02 {
> > > > +	global:
> > > > +
> > > > +	rte_eth_find_next_owned_by;
> > > > +	rte_eth_dev_owner_new;
> > > > +	rte_eth_dev_owner_set;
> > > > +	rte_eth_dev_owner_remove;
> > > > +	rte_eth_dev_owner_delete;
> > > > +	rte_eth_dev_owner_get;
> > > > +
> > > > +} DPDK_17.11;
> > > > +
> > > >  EXPERIMENTAL {
> > > >  	global:
> > > >
> > > > --
> > > > 1.8.3.1
> > > >
> > > >
> >
> > --
> > Gaëtan Rivet
> > 6WIND
> >
  
Ananyev, Konstantin Dec. 3, 2017, 11:10 a.m. UTC | #8
Hi Matan,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Sunday, December 3, 2017 8:05 AM
> To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, December 1, 2017 2:10 PM
> > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > Hello Matan, Neil,
> > >
> > > I like the port ownership concept. I think it is needed to clarify
> > > some operations and should be useful to several subsystems.
> > >
> > > This patch could certainly be sub-divided however, and your current
> > > 1/5 should probably come after this one.
> > >
> > > Some comments inline.
> > >
> > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better from the next reasons:
> > > > > 1. It may be convenient for multi-process applications to know which
> > > > >    process is in charge of a port.
> > > > > 2. A library could work on top of a port.
> > > > > 3. A port can work on top of another port.
> > > > >
> > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > We need to check that the user is not trying to use a port which
> > > > > is already managed by fail-safe.
> > > > >
> > > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > > management of a device by different DPDK entities.
> > > > >
> > > > > A port owner is built from owner id(number) and owner name(string)
> > > > > while the owner id must be unique to distinguish between two
> > > > > identical entity instances and the owner name can be any name.
> > > > > The name helps to logically recognize the owner by different DPDK
> > > > > entities and allows easy debug.
> > > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to owns valid ethdev ports.
> > > > > Each DPDK entity can get any port owner status to decide if it can
> > > > > manage the port or not.
> > > > >
> > > > > The current ethdev internal port management is not affected by
> > > > > this feature.
> > > > >
> > >
> > > The internal port management is not affected, but the external
> > > interface is, however. In order to respect port ownership,
> > > applications are forced to modify their port iterator, as shown by your
> > > testpmd patch.
> > >
> > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > > would represent the application itself (probably with the ID 0 and an
> > > owner string ""). Only with specific additional configuration should
> > > this default subset of ethdev be divided.
> > >
> > > This would make this evolution seamless for applications, at no cost
> > > to the complexity of the design.
> > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > >
> > > >
> > > > This seems fairly racy.  What if one thread attempts to set
> > > > ownership on a port, while another is checking it on another cpu in
> > > > parallel.  It doesn't seem like it will protect against that at all.
> > > > It also doesn't protect against the possibility of multiple threads
> > > > attempting to poll for rx in parallel, which I think was part of
> > > > > Thomas's original statement regarding port ownership (he noted that
> > > > > the lockless design implied only a single thread should be allowed to poll
> > > > > for receive or make configuration changes at a time).
> > > >
> > > > Neil
> > > >
> > >
> > > Isn't this race already there for any configuration operation /
> > > polling function? The DPDK arch is expecting applications to solve it.
> > > Why should port ownership be designed differently from other DPDK
> > > components?
> > >
> > Yes, but that doesn't mean it should exist in perpetuity, nor does it mean that
> > your new API should contain it as well.
> >
> > > Embedding checks for port ownership within operations will force
> > > everyone to bear their costs, even those not interested in using this
> > > API. These checks should be kept outside, within the entity claiming
> > > ownership of the port, in the form of using the proper port iterator
> > > IMO.
> > >
> > No.  At the very least, you need to make the API itself exclusive.  That is to
> > say, you should at least ensure that a port ownership get call doesn't race
> > with a port ownership set call.  It seems ridiculous to just leave that sort of
> > locking as an exercise to the user.
> >
> > Neil
> >
> Neil,
> As Thomas mentioned, a DPDK port is designed to be managed by only one
> thread (or synchronized DPDK entity).
> So all the port management includes port ownership shouldn't be synchronized,
> i.e. locks are not needed.
> If some application want to do dual thread port management, the responsibility
> to synchronize the port ownership or any other port management is on this
> application.
> Port ownership doesn't come to allow synchronized management of the port by
> two DPDK entities in parallel, it is just nice way to answer next questions:
> 	1. Is the port already owned by some DPDK entity(in early control path)?
> 	2. If yes, Who is the owner?
> If the answer to the first question is no, the current entity can take the ownership
> without any lock(1 thread).
> If the answer to the first question is yes, you can recognize the owner and take
> decisions accordingly, sometimes you can decide to use the port because you
> logically know what the current owner does and you can be logically synchronized
> with it, sometimes you can just leave this port because you have not any deal with
>  this owner.

If the goal is just to be able to recognize that a device is managed by another device
(failsafe, bonding, etc.), then I think all we need is a pointer to the rte_eth_dev_data of the owner
(NULL would mean no owner).
Also, I think that if we'd like to introduce this mechanism, then it needs to be:
- mandatory (the control API just doesn't allow changes to the device configuration if the caller is not the owner).
- transparent to the user (no API changes).
- atomic for set/get owner ops, if we want this mechanism to be usable for MP.
Konstantin
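
To make the atomicity point concrete, here is a minimal sketch of owner set/get done with a single compare-and-swap, so that a get can never observe a half-finished set and two concurrent setters can never both win. It is not code from the patch: the `port_owner_id` table, `MAX_PORTS`, and the `owner_try_set`/`owner_unset`/`owner_get` helpers are hypothetical names, it uses plain C11 atomics rather than any DPDK primitive, and it covers only the owner id, not the owner name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER 0   /* mirrors RTE_ETH_DEV_NO_OWNER from the patch */
#define MAX_PORTS 4  /* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical per-port owner id table; not the patch's data layout. */
static _Atomic uint16_t port_owner_id[MAX_PORTS];

/* Atomically read the current owner id of a port. */
static uint16_t owner_get(uint16_t port)
{
	assert(port < MAX_PORTS);
	return atomic_load(&port_owner_id[port]);
}

/* Claim a port only if it is ownerless. The check and the write are one
 * compare-and-swap, so two concurrent callers cannot both succeed.
 * Returns 0 on success, -1 otherwise. */
static int owner_try_set(uint16_t port, uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	if (port >= MAX_PORTS || owner_id == NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&port_owner_id[port],
					      &expected, owner_id) ? 0 : -1;
}

/* Release a port only if owner_id is the current owner.
 * Returns 0 on success, -1 otherwise. */
static int owner_unset(uint16_t port, uint16_t owner_id)
{
	uint16_t expected = owner_id;

	if (port >= MAX_PORTS || owner_id == NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&port_owner_id[port],
					      &expected, NO_OWNER) ? 0 : -1;
}
```

Unlike rte_eth_dev_owner_set() in the patch, this sketch rejects a second set by the same owner. Whether such atomics behave as intended on memory shared between primary and secondary processes is an assumption that would need checking on the target platform.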





> 
> > > > > ---
> > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > ++++++++++++++++++++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > index 6a0c9f9..af639a1 100644
> > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > > > lock. This PMD feature found in som
> > > > >
> > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > capability probing details.
> > > > >
> > > > > -Device Identification and Configuration
> > > > > +Device Identification, Ownership  and Configuration
> > > > >  ---------------------------------------
> > > > >
> > > > >  Device Identification
> > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> > assigned two other identifiers:
> > > > >  *   A port name used to designate the port in console messages, for
> > administration or debugging purposes.
> > > > >      For ease of use, the port name includes the port index.
> > > > >
> > > > > +Port Ownership
> > > > > +~~~~~~~~~~~~~
> > > > > +The Ethernet devices ports can be owned by a single DPDK entity
> > (application, library, PMD, process, etc).
> > > > > +The ownership mechanism is controlled by ethdev APIs and allows to
> > set/remove/get a port owner by DPDK entities.
> > > > > +Allowing this should prevent any multiple management of Ethernet
> > port by different entities.
> > > > > +
> > > > > +.. note::
> > > > > +
> > > > > +    It is the DPDK entity responsibility either to check the port owner
> > before using it or to set the port owner to prevent others from using it.
> > > > > +
> > > > >  Device Configuration
> > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > @@ -71,6 +71,7 @@
> > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER
> > + 1;
> > > > >  static uint8_t eth_dev_last_created_port;
> > > > >
> > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > struct rte_eth_dev *
> > > > >  	if (eth_dev == NULL)
> > > > >  		return -EINVAL;
> > > > >
> > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > +rte_eth_dev_owner));
> > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > >  	return 0;
> > > > >  }
> > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > >  		return 1;
> > > > >  }
> > > > >
> > > > > +static int
> > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > +		return 0;
> > > > > +	}
> > > > > +	return 1;
> > > > > +}
> > > > > +
> > > > > +uint16_t
> > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > +owner_id) {
> > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > +		port_id++;
> > > > > +
> > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > +		return RTE_MAX_ETHPORTS;
> > > > > +
> > > > > +	return port_id;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> > ports.\n");
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> > Ethernet port owners.\n");
> > > > > +		return -EUSERS;
> > > > > +	}
> > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	int ret;
> > > > > +
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> > ports.\n");
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > +		return -EINVAL;
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    port_owner->id != owner->id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot set owner to port %d already owned by
> > %s_%05d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id);
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	ret = snprintf(port_owner->name,
> > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > +		       owner->name);
> > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > +		memset(port_owner->name, 0,
> > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +	port_owner->id = owner->id;
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > > +			    owner->name, owner->id);
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > > > +owner_id) {
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > +		return -EINVAL;
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != owner_id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot remove port %d owner %s_%05d by
> > different owner id %5d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id,
> > owner_id);
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > removed.\n", port_id,
> > > > > +			port_owner->name, port_owner->id);
> > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +void
> > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > +	uint16_t p;
> > > > > +
> > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > +		return;
> > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > +			    "%05d identifier has removed.\n", owner_id); }
> > > > > +
> > > > > +const struct rte_eth_dev_owner *
> > > > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > RTE_ETH_DEV_NO_OWNER)
> > > > > +		return NULL;
> > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > +}
> > > > > +
> > > > >  int
> > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > > index 341c2d6..f54c26d 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > >
> > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > >
> > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > +
> > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > +
> > > > > +struct rte_eth_dev_owner {
> > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> > name. */
> > > > > +};
> > > > > +
> > > > >  /**
> > > > >   * @internal
> > > > >   * The data part, with no function pointers, associated with each
> > ethernet device.
> > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > >  	int numa_node;  /**< NUMA node connection */
> > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > >  	/**< VLAN filter configuration. */
> > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > >  };
> > > > >
> > > > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82
> > > > > @@ struct rte_eth_dev_data {
> > > > >
> > > > >
> > > > >  /**
> > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > + *
> > > > > + * @param port_id
> > > > > + *   The id of the next possible valid owned port.
> > > > > + * @param	owner_id
> > > > > + *  The owner identifier.
> > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> > ports.
> > > > > + * @return
> > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > there is none.
> > > > > + */
> > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > +uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> > owner.
> > > > > + */
> > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > +
> > > > > +/**
> > > > > + * Get a new unique owner identifier.
> > > > > + * An owner identifier is used to owns Ethernet devices by only
> > > > > +one DPDK entity
> > > > > + * to avoid multiple management of device by different entities.
> > > > > + *
> > > > > + * @param	owner_id
> > > > > + *   Owner identifier pointer.
> > > > > + * @return
> > > > > + *   Negative errno value on error, 0 on success.
> > > > > + */
> > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Set an Ethernet device owner.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The identifier of the port to own.
> > > > > + * @param	owner
> > > > > + *  The owner pointer.
> > > > > + * @return
> > > > > + *  Negative errno value on error, 0 on success.
> > > > > + */
> > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > +
> > > > > +/**
> > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The identifier of port to make ownerless.
> > > > > + * @param	owner
> > > > > + *  The owner identifier.
> > > > > + * @return
> > > > > + *  0 on success, negative errno value on error.
> > > > > + */
> > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > +uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Remove owner from all Ethernet devices owned by a specific
> > owner.
> > > > > + *
> > > > > + * @param	owner
> > > > > + *  The owner identifier.
> > > > > + */
> > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Get the owner of an Ethernet device.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The port identifier.
> > > > > + * @return
> > > > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > > > + */
> > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > +uint16_t port_id);
> > > > > +
> > > > > +/**
> > > > >   * Get the total number of Ethernet devices that have been
> > successfully
> > > > >   * initialized by the matching Ethernet driver during the PCI probing
> > phase
> > > > >   * and that are available for applications to use. These devices
> > > > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > index e9681ac..7d07edb 100644
> > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > >
> > > > >  } DPDK_17.08;
> > > > >
> > > > > +DPDK_18.02 {
> > > > > +	global:
> > > > > +
> > > > > +	rte_eth_find_next_owned_by;
> > > > > +	rte_eth_dev_owner_new;
> > > > > +	rte_eth_dev_owner_set;
> > > > > +	rte_eth_dev_owner_remove;
> > > > > +	rte_eth_dev_owner_delete;
> > > > > +	rte_eth_dev_owner_get;
> > > > > +
> > > > > +} DPDK_17.11;
> > > > > +
> > > > >  EXPERIMENTAL {
> > > > >  	global:
> > > > >
> > > > > --
> > > > > 1.8.3.1
> > > > >
> > > > >
> > >
> > > --
> > > Gaëtan Rivet
> > > 6WIND
> > >
  
Matan Azrad Dec. 3, 2017, 1:46 p.m. UTC | #9
Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> Sent: Sunday, December 3, 2017 1:10 PM
> To: Matan Azrad <matan@mellanox.com>; Neil Horman
> <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> 
> 
> Hi Matan,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > Sent: Sunday, December 3, 2017 8:05 AM
> > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > Hi
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Friday, December 1, 2017 2:10 PM
> > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > Hello Matan, Neil,
> > > >
> > > > I like the port ownership concept. I think it is needed to clarify
> > > > some operations and should be useful to several subsystems.
> > > >
> > > > This patch could certainly be sub-divided however, and your
> > > > current
> > > > 1/5 should probably come after this one.
> > > >
> > > > Some comments inline.
> > > >
> > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > The ownership of a port is implicit in DPDK.
> > > > > > Making it explicit is better from the next reasons:
> > > > > > 1. It may be convenient for multi-process applications to know
> which
> > > > > >    process is in charge of a port.
> > > > > > 2. A library could work on top of a port.
> > > > > > 3. A port can work on top of another port.
> > > > > >
> > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > We need to check that the user is not trying to use a port
> > > > > > which is already managed by fail-safe.
> > > > > >
> > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > multiple management of a device by different DPDK entities.
> > > > > >
> > > > > > A port owner is built from owner id(number) and owner
> > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > between two identical entity instances and the owner name can be
> any name.
> > > > > > The name helps to logically recognize the owner by different
> > > > > > DPDK entities and allows easy debug.
> > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > can manage the port or not.
> > > > > >
> > > > > > The current ethdev internal port management is not affected by
> > > > > > this feature.
> > > > > >
> > > >
> > > > The internal port management is not affected, but the external
> > > > interface is, however. In order to respect port ownership,
> > > > applications are forced to modify their port iterator, as shown by
> > > > your
> > > testpmd patch.
> > > >
> > > > I think it would be better to modify the current
> > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > introduce a default owner that would represent the application
> > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > specific additional configuration should this default subset of ethdev be
> divided.
> > > >
> > > > This would make this evolution seamless for applications, at no
> > > > cost to the complexity of the design.
> > > >
> > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > >
> > > > >
> > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > ownership on a port, while another is checking it on another cpu
> > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > It also doesn't protect against the possibility of multiple
> > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > part of Thomas's original statement regarding port ownership
> > > > > (he noted that the lockless design implied only a single thread
> > > > > should be allowed to poll
> > > for receive or make configuration changes at a time.
> > > > >
> > > > > Neil
> > > > >
> > > >
> > > > Isn't this race already there for any configuration operation /
> > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > Why should port ownership be designed differently from other DPDK
> > > components?
> > > >
> > > > Yes, but that doesn't mean it should exist in perpetuity, nor does
> > > it mean that your new api should contain it as well.
> > >
> > > > Embedding checks for port ownership within operations will force
> > > > everyone to bear their costs, even those not interested in using
> > > > this API. These checks should be kept outside, within the entity
> > > > claiming ownership of the port, in the form of using the proper
> > > > port iterator IMO.
> > > >
> > > No.  At the very least, you need to make the API itself exclusive.
> > > That is to say, you should at least ensure that a port ownership get
> > > call doesn't race with a port ownership set call.  It seems
> > > > ridiculous to just leave that sort of locking as an exercise to the user.
> > >
> > > Neil
> > >
> > Neil,
> > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > thread (or synchronized DPDK entity).
> > So all the port management includes port ownership shouldn't be
> > synchronized, i.e. locks are not needed.
> > If some application want to do dual thread port management, the
> > responsibility to synchronize the port ownership or any other port
> > management is on this application.
> > Port ownership doesn't come to allow synchronized management of the
> > port by two DPDK entities in parallel, it is just nice way to answer next
> questions:
> > 	1. Is the port already owned by some DPDK entity(in early control
> path)?
> > 	2. If yes, Who is the owner?
> > If the answer to the first question is no, the current entity can take
> > the ownership without any lock(1 thread).
> > If the answer to the first question is yes, you can recognize the
> > owner and take decisions accordingly, sometimes you can decide to use
> > the port because you logically know what the current owner does and
> > you can be logically synchronized with it, sometimes you can just
> > leave this port because you have not any deal with  this owner.
> 
> If the goal is just to have an ability to recognize is that device is managed by
> another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> to rte_eth_dev_data of the owner (NULL would mean no owner).

I think a string is better than a pointer for the following reasons:
1. It is more human-friendly than a pointer for debugging and printing.
2. It is flexible and allows a logical owner message to be forwarded to other DPDK entities.

> Also I think if we'd like to introduce that mechanism, then it needs to be
> - mandatory (control API just don't allow changes to the device configuration
> if caller is not an owner).

But what if two DPDK entities need to manage or use the same port and they are synchronized with each other?

> - transparent to the user (no API changes).

For now, there is no change to the existing API, only a new suggested API to use for port iteration.

>  - set/get owner ops need to be atomic if we want this mechanism to be
> usable for MP.

But this mechanism is usable in MP even without atomics.
For example:
The PRIMARY application can set its owner with the string "primary A".
A SECONDARY process (which attaches to the ports only after the primary has created them) is not allowed to set an owner (as you can see in the code), but it can read the owner string and see that the port owner is the primary application.
The "A" can simply signal to the SECONDARY that this port works with logic A, which means, for example, that the primary should send the packets and the secondary should receive them.
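
A rough sketch of the secondary-side decision described above, written as standalone C so it can be read without the DPDK headers. The `port_owner` struct only mirrors the shape of rte_eth_dev_owner from the patch; the `secondary_role_for` helper, the role enum, and the exact-match check on "primary A" are hypothetical choices made for illustration, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OWNER_NAME_LEN 64 /* mirrors RTE_ETH_MAX_OWNER_NAME_LEN */

/* Same shape as struct rte_eth_dev_owner in the patch. */
struct port_owner {
	uint16_t id;
	char name[MAX_OWNER_NAME_LEN];
};

/* Hypothetical roles a secondary process could take on a port. */
enum secondary_role {
	ROLE_NONE,    /* leave the port alone */
	ROLE_RX_ONLY  /* "logic A": primary sends, secondary receives */
};

static_assert(sizeof("primary A") <= MAX_OWNER_NAME_LEN,
	      "owner tag must fit in the owner name buffer");

/* Decide what the secondary may do, given the owner it read for a port.
 * A NULL owner stands for what rte_eth_dev_owner_get() returns for an
 * ownerless port in the patch. */
static enum secondary_role secondary_role_for(const struct port_owner *owner)
{
	if (owner == NULL)
		return ROLE_NONE;
	if (strcmp(owner->name, "primary A") == 0)
		return ROLE_RX_ONLY;
	return ROLE_NONE; /* unknown owner: do not touch the port */
}
```

In a real secondary process the owner would come from rte_eth_dev_owner_get(port_id) as proposed in this patch; the secondary only reads it and never writes it.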

> Konstantin
> 
> 
> 
> 
> 
> >
> > > > > > ---
> > > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > > ++++++++++++++++++++++++++++++++
> > > > > >  lib/librte_ether/rte_ethdev.h           |  86
> +++++++++++++++++++++++
> > > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > index 6a0c9f9..af639a1 100644
> > > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without
> > > > > > SW lock. This PMD feature found in som
> > > > > >
> > > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > > capability probing details.
> > > > > >
> > > > > > -Device Identification and Configuration
> > > > > > +Device Identification, Ownership  and Configuration
> > > > > >  ---------------------------------------
> > > > > >
> > > > > >  Device Identification
> > > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports
> > > > > > are
> > > assigned two other identifiers:
> > > > > >  *   A port name used to designate the port in console messages, for
> > > administration or debugging purposes.
> > > > > >      For ease of use, the port name includes the port index.
> > > > > >
> > > > > > +Port Ownership
> > > > > > +~~~~~~~~~~~~~
> > > > > > +The Ethernet devices ports can be owned by a single DPDK
> > > > > > +entity
> > > (application, library, PMD, process, etc).
> > > > > > +The ownership mechanism is controlled by ethdev APIs and
> > > > > > +allows to
> > > set/remove/get a port owner by DPDK entities.
> > > > > > +Allowing this should prevent any multiple management of
> > > > > > +Ethernet
> > > port by different entities.
> > > > > > +
> > > > > > +.. note::
> > > > > > +
> > > > > > +    It is the DPDK entity responsibility either to check the
> > > > > > + port owner
> > > before using it or to set the port owner to prevent others from using it.
> > > > > > +
> > > > > >  Device Configuration
> > > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > > >
> > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > @@ -71,6 +71,7 @@
> > > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > > +static uint16_t rte_eth_next_owner_id =
> RTE_ETH_DEV_NO_OWNER
> > > + 1;
> > > > > >  static uint8_t eth_dev_last_created_port;
> > > > > >
> > > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > > struct rte_eth_dev *
> > > > > >  	if (eth_dev == NULL)
> > > > > >  		return -EINVAL;
> > > > > >
> > > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > > +rte_eth_dev_owner));
> > > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > >  	return 0;
> > > > > >  }
> > > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > > >  		return 1;
> > > > > >  }
> > > > > >
> > > > > > +static int
> > > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER
> &&
> > > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> owner_id);
> > > > > > +		return 0;
> > > > > > +	}
> > > > > > +	return 1;
> > > > > > +}
> > > > > > +
> > > > > > +uint16_t
> > > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > > +owner_id) {
> > > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > > +	       (rte_eth_devices[port_id].state !=
> RTE_ETH_DEV_ATTACHED ||
> > > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > > +		port_id++;
> > > > > > +
> > > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > > +		return RTE_MAX_ETHPORTS;
> > > > > > +
> > > > > > +	return port_id;
> > > > > > +}
> > > > > > +
> > > > > > +int
> > > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> cannot own
> > > ports.\n");
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum
> number of
> > > Ethernet port owners.\n");
> > > > > > +		return -EUSERS;
> > > > > > +	}
> > > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > > +	return 0;
> > > > > > +}
> > > > > > +
> > > > > > +int
> > > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> cannot own
> > > ports.\n");
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > > +		return -EINVAL;
> > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > > +	    port_owner->id != owner->id) {
> > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > +			"Cannot set owner to port %d already owned
> by
> > > %s_%05d.\n",
> > > > > > +			port_id, port_owner->name, port_owner-
> >id);
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	ret = snprintf(port_owner->name,
> > > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > > +		       owner->name);
> > > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > > +		memset(port_owner->name, 0,
> > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > +		return -EINVAL;
> > > > > > +	}
> > > > > > +	port_owner->id = owner->id;
> > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> port_id,
> > > > > > +			    owner->name, owner->id);
> > > > > > +	return 0;
> > > > > > +}
> > > > > > +
> > > > > > +int
> > > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > +uint16_t
> > > > > > +owner_id) {
> > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > +		return -EINVAL;
> > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > +	if (port_owner->id != owner_id) {
> > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > +			"Cannot remove port %d owner %s_%05d by
> > > different owner id %5d.\n",
> > > > > > +			port_id, port_owner->name, port_owner-
> >id,
> > > owner_id);
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > > removed.\n", port_id,
> > > > > > +			port_owner->name, port_owner->id);
> > > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > > +	return 0;
> > > > > > +}
> > > > > > +
> > > > > > +void
> > > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > > +	uint16_t p;
> > > > > > +
> > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > +		return;
> > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > > +			    "%05d identifier has removed.\n",
> owner_id); }
> > > > > > +
> > > > > > +const struct rte_eth_dev_owner * rte_eth_dev_owner_get(const
> > > > > > +uint16_t port_id) {
> > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > > RTE_ETH_DEV_NO_OWNER)
> > > > > > +		return NULL;
> > > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > > +}
> > > > > > +
> > > > > >  int
> > > > > > >  rte_eth_dev_socket_id(uint16_t port_id)
> > > > > > >  {
> > > > > > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > > > > index 341c2d6..f54c26d 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > > >
> > > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > > >
> > > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > > +
> > > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > > +
> > > > > > +struct rte_eth_dev_owner {
> > > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The
> owner
> > > name. */
> > > > > > +};
> > > > > > +
> > > > > >  /**
> > > > > >   * @internal
> > > > > >   * The data part, with no function pointers, associated with
> > > > > > each
> > > ethernet device.
> > > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > > >  	int numa_node;  /**< NUMA node connection */
> > > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > > >  	/**< VLAN filter configuration. */
> > > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > > >  };
> > > > > >
> > > > > >  /** Device supports link state interrupt */ @@ -1846,6
> > > > > > +1856,82 @@ struct rte_eth_dev_data {
> > > > > >
> > > > > >
> > > > > >  /**
> > > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > > + *
> > > > > > + * @param port_id
> > > > > > + *   The id of the next possible valid owned port.
> > > > > > + * @param	owner_id
> > > > > > + *  The owner identifier.
> > > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid
> > > > > > + ownerless
> > > ports.
> > > > > > + * @return
> > > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > > there is none.
> > > > > > + */
> > > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > > +uint16_t owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Macro to iterate over all enabled ethdev ports owned by a
> > > > > > +specific
> > > owner.
> > > > > > + */
> > > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > > +
> > > > > > +/**
> > > > > > + * Get a new unique owner identifier.
> > > > > > + * An owner identifier is used to owns Ethernet devices by
> > > > > > +only one DPDK entity
> > > > > > + * to avoid multiple management of device by different entities.
> > > > > > + *
> > > > > > + * @param	owner_id
> > > > > > + *   Owner identifier pointer.
> > > > > > + * @return
> > > > > > + *   Negative errno value on error, 0 on success.
> > > > > > + */
> > > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Set an Ethernet device owner.
> > > > > > + *
> > > > > > + * @param	port_id
> > > > > > + *  The identifier of the port to own.
> > > > > > + * @param	owner
> > > > > > + *  The owner pointer.
> > > > > > + * @return
> > > > > > + *  Negative errno value on error, 0 on success.
> > > > > > + */
> > > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > > +
> > > > > > +/**
> > > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > > + *
> > > > > > + * @param	port_id
> > > > > > + *  The identifier of port to make ownerless.
> > > > > > + * @param	owner
> > > > > > + *  The owner identifier.
> > > > > > + * @return
> > > > > > + *  0 on success, negative errno value on error.
> > > > > > + */
> > > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > +uint16_t owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Remove owner from all Ethernet devices owned by a specific
> > > owner.
> > > > > > + *
> > > > > > + * @param	owner
> > > > > > + *  The owner identifier.
> > > > > > + */
> > > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Get the owner of an Ethernet device.
> > > > > > + *
> > > > > > + * @param	port_id
> > > > > > + *  The port identifier.
> > > > > > + * @return
> > > > > > + *  NULL when the device is ownerless, else the device owner
> pointer.
> > > > > > + */
> > > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > > +uint16_t port_id);
> > > > > > +
> > > > > > +/**
> > > > > >   * Get the total number of Ethernet devices that have been
> > > successfully
> > > > > >   * initialized by the matching Ethernet driver during the PCI
> > > > > > probing
> > > phase
> > > > > >   * and that are available for applications to use. These
> > > > > > devices must be diff --git
> > > > > > a/lib/librte_ether/rte_ethdev_version.map
> > > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > > index e9681ac..7d07edb 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > > >
> > > > > >  } DPDK_17.08;
> > > > > >
> > > > > > +DPDK_18.02 {
> > > > > > +	global:
> > > > > > +
> > > > > > +	rte_eth_find_next_owned_by;
> > > > > > +	rte_eth_dev_owner_new;
> > > > > > +	rte_eth_dev_owner_set;
> > > > > > +	rte_eth_dev_owner_remove;
> > > > > > +	rte_eth_dev_owner_delete;
> > > > > > +	rte_eth_dev_owner_get;
> > > > > > +
> > > > > > +} DPDK_17.11;
> > > > > > +
> > > > > >  EXPERIMENTAL {
> > > > > >  	global:
> > > > > >
> > > > > > --
> > > > > > 1.8.3.1
> > > > > >
> > > > > >
> > > >
> > > > --
> > > > Gaëtan Rivet
> > > > 6WIND
> > > >
  
Neil Horman Dec. 4, 2017, 4:01 p.m. UTC | #10
On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> Hi Konstantine
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Sent: Sunday, December 3, 2017 1:10 PM
> > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > 
> > 
> > Hi Matan,
> > 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > Sent: Sunday, December 3, 2017 8:05 AM
> > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > Hi
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > Hello Matan, Neil,
> > > > >
> > > > > I like the port ownership concept. I think it is needed to clarify
> > > > > some operations and should be useful to several subsystems.
> > > > >
> > > > > This patch could certainly be sub-divided however, and your
> > > > > current
> > > > > 1/5 should probably come after this one.
> > > > >
> > > > > Some comments inline.
> > > > >
> > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > Making it explicit is better from the next reasons:
> > > > > > > 1. It may be convenient for multi-process applications to know
> > which
> > > > > > >    process is in charge of a port.
> > > > > > > 2. A library could work on top of a port.
> > > > > > > 3. A port can work on top of another port.
> > > > > > >
> > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > We need to check that the user is not trying to use a port
> > > > > > > which is already managed by fail-safe.
> > > > > > >
> > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > multiple management of a device by different DPDK entities.
> > > > > > >
> > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > > between two identical entity instances and the owner name can be
> > any name.
> > > > > > > The name helps to logically recognize the owner by different
> > > > > > > DPDK entities and allows easy debug.
> > > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > > can manage the port or not.
> > > > > > >
> > > > > > > The current ethdev internal port management is not affected by
> > > > > > > this feature.
> > > > > > >
> > > > >
> > > > > The internal port management is not affected, but the external
> > > > > interface is, however. In order to respect port ownership,
> > > > > applications are forced to modify their port iterator, as shown by
> > > > > your
> > > > testpmd patch.
> > > > >
> > > > > I think it would be better to modify the current
> > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > > introduce a default owner that would represent the application
> > > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > > specific additional configuration should this default subset of ethdev be
> > divided.
> > > > >
> > > > > This would make this evolution seamless for applications, at no
> > > > > cost to the complexity of the design.
> > > > >
> > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > >
> > > > > >
> > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > ownership on a port, while another is checking it on another cpu
> > > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > It also doesn't protect against the possibility of multiple
> > > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > part of Thomas's origional statement regarding port ownership
> > > > > > (he noted that the lockless design implied only a single thread
> > > > > > should be allowed to poll
> > > > for receive or make configuration changes at a time.
> > > > > >
> > > > > > Neil
> > > > > >
> > > > >
> > > > > Isn't this race already there for any configuration operation /
> > > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > > Why should port ownership be designed differently from other DPDK
> > > > components?
> > > > >
> > > > Yes, but that doesn't mean it should exist in purpituity, nor does
> > > > it mean that your new api should contain it as well.
> > > >
> > > > > Embedding checks for port ownership within operations will force
> > > > > everyone to bear their costs, even those not interested in using
> > > > > this API. These checks should be kept outside, within the entity
> > > > > claiming ownership of the port, in the form of using the proper
> > > > > port iterator IMO.
> > > > >
> > > > No.  At the very least, you need to make the API itself exclusive.
> > > > That is to say, you should at least ensure that a port ownership get
> > > > call doesn't race with a port ownership set call.  It seems
> > > > rediculous to just leave that sort of locking as an exercize to the user.
> > > >
> > > > Neil
> > > >
> > > Neil,
> > > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > > thread (or synchronized DPDK entity).
> > > So all the port management includes port ownership shouldn't be
> > > synchronized, i.e. locks are not needed.
> > > If some application want to do dual thread port management, the
> > > responsibility to synchronize the port ownership or any other port
> > > management is on this application.
> > > Port ownership doesn't come to allow synchronized management of the
> > > port by two DPDK entities in parallel, it is just nice way to answer next
> > questions:
> > > 	1. Is the port already owned by some DPDK entity(in early control
> > path)?
> > > 	2. If yes, Who is the owner?
> > > If the answer to the first question is no, the current entity can take
> > > the ownership without any lock(1 thread).
> > > If the answer to the first question is yes, you can recognize the
> > > owner and take decisions accordingly, sometimes you can decide to use
> > > the port because you logically know what the current owner does and
> > > you can be logically synchronized with it, sometimes you can just
> > > leave this port because you have not any deal with  this owner.
> > 
> > If the goal is just to have an ability to recognize is that device is managed by
> > another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> > to rte_eth_dev_data of the owner (NULL would mean no owner).
> 
> I think string is better than a pointer from the next reasons:
> 1. It is more human friendly than pointers for debug and printing.
> 2. it is flexible and allows to forward logical owner message to other DPDK entities. 
> 
> > Also I think if we'd like to introduce that mechanism, then it needs to be
> > - mandatory (control API just don't allow changes to the device configuration
> > if caller is not an owner).
> 
> But what if 2 DPDK entities should manage the same port \ using it and they are synchronized?
> 
> > - transparent to the user (no API changes).
> 
> For now, there is not API change but new suggested API to use for port iteration.
> 
> >  - set/get owner ops need to be atomic if we want this mechanism to be
> > usable for MP.
> 
> But also without atomic this mechanism is usable in MP.
> For example:
> PRIMARY application can set its owner with string "primary A".
> SECONDARY process (which attach to the ports only after the primary created them )is not allowed to set owner(As you can see in the code) but it can read the owner string and see that the port owner is the primary application.
> The "A" can just sign specific port type to the SECONDARY that this port works with logic A which means, for example, primary should send the packets and secondary should receive the packets.
> 
But that's just the point: the operations that you are describing are not atomic
at all.  If the primary process is interrupted while setting a port's
ownership, after it has read the current owner field, it's entirely possible for
the secondary process to set the owner to itself, and then have the primary
process set it back.  Without locking, it's just broken.  I know that DPDK likes
to say it's lockless because it has no locks, but here we're just kicking the
can down the road by making the application add the locks for the library.

Neil
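
To make the check-then-set race above concrete, here is a minimal sketch of
one way the "is it free?" test and the "it is mine now" update could be folded
into a single atomic step. It is an illustration under assumptions, not code
from the patch: struct port_owner_slot, port_owner_try_claim() and
port_owner_release() are hypothetical names, only the owner id is covered (the
name string would need separate handling), and in a multi-process setup the
slot would have to live in shared memory with a lock-free atomic type.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER 0 /* mirrors RTE_ETH_DEV_NO_OWNER from the patch */

/* Hypothetical stand-in for the owner id kept in the port's shared data. */
struct port_owner_slot {
	_Atomic uint16_t id;
};

/*
 * Try to claim an ownerless port. The comparison against NO_OWNER and the
 * store of owner_id happen as one atomic operation, so two concurrent
 * callers cannot both succeed.
 * Returns 0 on success, -1 if the id is invalid or the port is already owned.
 */
static int
port_owner_try_claim(struct port_owner_slot *slot, uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	if (owner_id == NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&slot->id, &expected, owner_id) ?
		0 : -1;
}

/*
 * Release a port, but only if the caller is the current owner.
 * Returns 0 on success, -1 otherwise.
 */
static int
port_owner_release(struct port_owner_slot *slot, uint16_t owner_id)
{
	uint16_t expected = owner_id;

	if (owner_id == NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&slot->id, &expected, NO_OWNER) ?
		0 : -1;
}
```

With this shape, an entity that loses the race simply gets -1 back from
port_owner_try_claim() and can look up who the owner is, instead of silently
overwriting the owner that another entity just stored.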

> > Konstantin
>
  
Matan Azrad Dec. 4, 2017, 6:10 p.m. UTC | #11
Hi Neil

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, December 4, 2017 6:01 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>;
> Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > Hi Konstantine
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Sunday, December 3, 2017 1:10 PM
> > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > >
> > >
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > Hi
> > > >
> > > > > -----Original Message-----
> > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > Hello Matan, Neil,
> > > > > >
> > > > > > I like the port ownership concept. I think it is needed to
> > > > > > clarify some operations and should be useful to several subsystems.
> > > > > >
> > > > > > This patch could certainly be sub-divided however, and your
> > > > > > current
> > > > > > 1/5 should probably come after this one.
> > > > > >
> > > > > > Some comments inline.
> > > > > >
> > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > 1. It may be convenient for multi-process applications to
> > > > > > > > know
> > > which
> > > > > > > >    process is in charge of a port.
> > > > > > > > 2. A library could work on top of a port.
> > > > > > > > 3. A port can work on top of another port.
> > > > > > > >
> > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > We need to check that the user is not trying to use a port
> > > > > > > > which is already managed by fail-safe.
> > > > > > > >
> > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > > multiple management of a device by different DPDK entities.
> > > > > > > >
> > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > name(string) while the owner id must be unique to
> > > > > > > > distinguish between two identical entity instances and the
> > > > > > > > owner name can be
> > > any name.
> > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > Each DPDK entity can allocate an owner unique identifier
> > > > > > > > and can use it and its preferred name to owns valid ethdev
> ports.
> > > > > > > > Each DPDK entity can get any port owner status to decide
> > > > > > > > if it can manage the port or not.
> > > > > > > >
> > > > > > > > The current ethdev internal port management is not
> > > > > > > > affected by this feature.
> > > > > > > >
> > > > > >
> > > > > > The internal port management is not affected, but the external
> > > > > > interface is, however. In order to respect port ownership,
> > > > > > applications are forced to modify their port iterator, as
> > > > > > shown by your
> > > > > testpmd patch.
> > > > > >
> > > > > > I think it would be better to modify the current
> > > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY,
> and
> > > > > > introduce a default owner that would represent the application
> > > > > > itself (probably with the ID 0 and an owner string ""). Only
> > > > > > with specific additional configuration should this default
> > > > > > subset of ethdev be
> > > divided.
> > > > > >
> > > > > > This would make this evolution seamless for applications, at
> > > > > > no cost to the complexity of the design.
> > > > > >
> > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > >
> > > > > > >
> > > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > > ownership on a port, while another is checking it on another
> > > > > > > cpu in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > > It also doesn't protect against the possibility of multiple
> > > > > > > threads attempting to poll for rx in parallel, which I think
> > > > > > > was part of Thomas's origional statement regarding port
> > > > > > > ownership (he noted that the lockless design implied only a
> > > > > > > single thread should be allowed to poll
> > > > > for receive or make configuration changes at a time.
> > > > > > >
> > > > > > > Neil
> > > > > > >
> > > > > >
> > > > > > Isn't this race already there for any configuration operation
> > > > > > / polling function? The DPDK arch is expecting applications to solve
> it.
> > > > > > Why should port ownership be designed differently from other
> > > > > > DPDK
> > > > > components?
> > > > > >
> > > > > Yes, but that doesn't mean it should exist in purpituity, nor
> > > > > does it mean that your new api should contain it as well.
> > > > >
> > > > > > Embedding checks for port ownership within operations will
> > > > > > force everyone to bear their costs, even those not interested
> > > > > > in using this API. These checks should be kept outside, within
> > > > > > the entity claiming ownership of the port, in the form of
> > > > > > using the proper port iterator IMO.
> > > > > >
> > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > That is to say, you should at least ensure that a port ownership
> > > > > get call doesn't race with a port ownership set call.  It seems
> > > > > rediculous to just leave that sort of locking as an exercize to the user.
> > > > >
> > > > > Neil
> > > > >
> > > > Neil,
> > > > As Thomas mentioned, a DPDK port is designed to be managed by only
> > > > one thread (or synchronized DPDK entity).
> > > > So all the port management includes port ownership shouldn't be
> > > > synchronized, i.e. locks are not needed.
> > > > If some application want to do dual thread port management, the
> > > > responsibility to synchronize the port ownership or any other port
> > > > management is on this application.
> > > > Port ownership doesn't come to allow synchronized management of
> > > > the port by two DPDK entities in parallel, it is just nice way to
> > > > answer next
> > > questions:
> > > > 	1. Is the port already owned by some DPDK entity(in early control
> > > path)?
> > > > 	2. If yes, Who is the owner?
> > > > If the answer to the first question is no, the current entity can
> > > > take the ownership without any lock(1 thread).
> > > > If the answer to the first question is yes, you can recognize the
> > > > owner and take decisions accordingly, sometimes you can decide to
> > > > use the port because you logically know what the current owner
> > > > does and you can be logically synchronized with it, sometimes you
> > > > can just leave this port because you have not any deal with  this owner.
> > >
> > > If the goal is just to have an ability to recognize is that device
> > > is managed by another device (failsafe, bonding, etc.),  then I
> > > think all we need is a pointer to rte_eth_dev_data of the owner (NULL
> would mean no owner).
> >
> > I think string is better than a pointer from the next reasons:
> > 1. It is more human friendly than pointers for debug and printing.
> > 2. it is flexible and allows to forward logical owner message to other DPDK
> entities.
> >
> > > Also I think if we'd like to introduce that mechanism, then it needs
> > > to be
> > > - mandatory (control API just don't allow changes to the device
> > > configuration if caller is not an owner).
> >
> > But what if 2 DPDK entities should manage the same port \ using it and they
> are synchronized?
> >
> > > - transparent to the user (no API changes).
> >
> > For now, there is not API change but new suggested API to use for port
> iteration.
> >
> > >  - set/get owner ops need to be atomic if we want this mechanism to
> > > be usable for MP.
> >
> > But also without atomic this mechanism is usable in MP.
> > For example:
> > PRIMARY application can set its owner with string "primary A".
> > SECONDARY process (which attach to the ports only after the primary
> created them )is not allowed to set owner(As you can see in the code) but it
> can read the owner string and see that the port owner is the primary
> application.
> > The "A" can just sign specific port type to the SECONDARY that this port
> works with logic A which means, for example, primary should send the
> packets and secondary should receive the packets.
> >
> But thats just the point, the operations that you are describing are not atomic
> at all.  If the primary process is interrupted during its setting of a ports owner
> ship after its read the current owner field, its entirely possible for the
> secondary proces to set the owner as itself, and then have the primary
> process set it back.  Without locking, its just broken.  I know that the dpdk
> likes to say its lockless, because it has no locks, but here we're just kicking the
> can down the road, by making the application add the locks for the library.
> 
> Neil
> 
As I wrote before, and as the code shows, a secondary process should not take ownership of ports.
No port configuration (for example, port creation and release) is internally synchronized between the primary and secondary processes, so I don't see any reason to synchronize port ownership.
If the primary and secondary processes want to manage (configure) the same port at the same time, the applications must synchronize them, and the same applies to port ownership (actually, I don't think such synchronization is realistic, because many port configurations are not in shared memory).
So a secondary process should start its activity on ports only after the primary process is done with all configuration, including port ownership; this part must already be synchronized.
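
The ordering described above (the primary finishes all configuration,
including ownership, and only then does the secondary look at the port) can be
sketched as follows. This is a minimal illustration of application-side
synchronization, assuming a shared "configuration done" flag; struct
shared_port, primary_publish_owner() and secondary_read_owner() are
hypothetical names and are not part of the patch or of ethdev.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define OWNER_NAME_LEN 64 /* mirrors RTE_ETH_MAX_OWNER_NAME_LEN */

/* Same layout as the rte_eth_dev_owner structure proposed in the patch. */
struct owner {
	uint16_t id;
	char name[OWNER_NAME_LEN];
};

/* Hypothetical shared state: the owner record plus a "config done" flag. */
struct shared_port {
	struct owner owner;
	atomic_bool config_done;
};

/*
 * Primary side: write the owner, then publish it. The release store makes
 * the owner fields visible to any reader that observes config_done == true.
 */
static void
primary_publish_owner(struct shared_port *port, uint16_t id, const char *name)
{
	port->owner.id = id;
	strncpy(port->owner.name, name, OWNER_NAME_LEN - 1);
	port->owner.name[OWNER_NAME_LEN - 1] = '\0';
	atomic_store_explicit(&port->config_done, true, memory_order_release);
}

/*
 * Secondary side: read-only access. Returns false while the primary has not
 * finished its configuration; otherwise copies the owner into *out.
 */
static bool
secondary_read_owner(struct shared_port *port, struct owner *out)
{
	if (!atomic_load_explicit(&port->config_done, memory_order_acquire))
		return false;
	*out = port->owner;
	return true;
}
```

In this sketch the secondary never writes the owner, which matches the
restriction stated above; it only decides what to do with the port after
reading a name such as "primary A".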
  
> > > Konstantin
> >
  
Neil Horman Dec. 4, 2017, 10:30 p.m. UTC | #12
On Mon, Dec 04, 2017 at 06:10:56PM +0000, Matan Azrad wrote:
> Hi Neil
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, December 4, 2017 6:01 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>;
> > Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > > Hi Konstantine
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > Sent: Sunday, December 3, 2017 1:10 PM
> > > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > >
> > > >
> > > > Hi Matan,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > > <gaetan.rivet@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > > Hi
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > > dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > >
> > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > > Hello Matan, Neil,
> > > > > > >
> > > > > > > I like the port ownership concept. I think it is needed to
> > > > > > > clarify some operations and should be useful to several subsystems.
> > > > > > >
> > > > > > > This patch could certainly be sub-divided however, and your
> > > > > > > current
> > > > > > > 1/5 should probably come after this one.
> > > > > > >
> > > > > > > Some comments inline.
> > > > > > >
> > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > > 1. It may be convenient for multi-process applications to
> > > > > > > > > know
> > > > which
> > > > > > > > >    process is in charge of a port.
> > > > > > > > > 2. A library could work on top of a port.
> > > > > > > > > 3. A port can work on top of another port.
> > > > > > > > >
> > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > > We need to check that the user is not trying to use a port
> > > > > > > > > which is already managed by fail-safe.
> > > > > > > > >
> > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > > > multiple management of a device by different DPDK entities.
> > > > > > > > >
> > > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > > name(string) while the owner id must be unique to
> > > > > > > > > distinguish between two identical entity instances and the
> > > > > > > > > owner name can be
> > > > any name.
> > > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > > Each DPDK entity can allocate an owner unique identifier
> > > > > > > > > and can use it and its preferred name to owns valid ethdev
> > ports.
> > > > > > > > > Each DPDK entity can get any port owner status to decide
> > > > > > > > > if it can manage the port or not.
> > > > > > > > >
> > > > > > > > > The current ethdev internal port management is not
> > > > > > > > > affected by this feature.
> > > > > > > > >
> > > > > > >
> > > > > > > The internal port management is not affected, but the external
> > > > > > > interface is, however. In order to respect port ownership,
> > > > > > > applications are forced to modify their port iterator, as
> > > > > > > shown by your
> > > > > > testpmd patch.
> > > > > > >
> > > > > > > I think it would be better to modify the current
> > > > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY,
> > and
> > > > > > > introduce a default owner that would represent the application
> > > > > > > itself (probably with the ID 0 and an owner string ""). Only
> > > > > > > with specific additional configuration should this default
> > > > > > > subset of ethdev be
> > > > divided.
> > > > > > >
> > > > > > > This would make this evolution seamless for applications, at
> > > > > > > no cost to the complexity of the design.
> > > > > > >
> > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > >
> > > > > > > >
> > > > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > > > ownership on a port, while another is checking it on another
> > > > > > > > cpu in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > > > It also doesn't protect against the possibility of multiple
> > > > > > > > threads attempting to poll for rx in parallel, which I think
> > > > > > > > > was part of Thomas's original statement regarding port
> > > > > > > > ownership (he noted that the lockless design implied only a
> > > > > > > > single thread should be allowed to poll
> > > > > > for receive or make configuration changes at a time.
> > > > > > > >
> > > > > > > > Neil
> > > > > > > >
> > > > > > >
> > > > > > > Isn't this race already there for any configuration operation
> > > > > > > / polling function? The DPDK arch is expecting applications to solve
> > it.
> > > > > > > Why should port ownership be designed differently from other
> > > > > > > DPDK
> > > > > > components?
> > > > > > >
> > > > > > > Yes, but that doesn't mean it should exist in perpetuity, nor
> > > > > > does it mean that your new api should contain it as well.
> > > > > >
> > > > > > > Embedding checks for port ownership within operations will
> > > > > > > force everyone to bear their costs, even those not interested
> > > > > > > in using this API. These checks should be kept outside, within
> > > > > > > the entity claiming ownership of the port, in the form of
> > > > > > > using the proper port iterator IMO.
> > > > > > >
> > > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > > That is to say, you should at least ensure that a port ownership
> > > > > > get call doesn't race with a port ownership set call.  It seems
> > > > > > > ridiculous to just leave that sort of locking as an exercise to the user.
> > > > > >
> > > > > > Neil
> > > > > >
> > > > > Neil,
> > > > > As Thomas mentioned, a DPDK port is designed to be managed by only
> > > > > one thread (or synchronized DPDK entity).
> > > > > So all the port management includes port ownership shouldn't be
> > > > > synchronized, i.e. locks are not needed.
> > > > > If some application want to do dual thread port management, the
> > > > > responsibility to synchronize the port ownership or any other port
> > > > > management is on this application.
> > > > > Port ownership doesn't come to allow synchronized management of
> > > > > the port by two DPDK entities in parallel, it is just nice way to
> > > > > answer next
> > > > questions:
> > > > > 	1. Is the port already owned by some DPDK entity(in early control
> > > > path)?
> > > > > 	2. If yes, Who is the owner?
> > > > > If the answer to the first question is no, the current entity can
> > > > > take the ownership without any lock(1 thread).
> > > > > If the answer to the first question is yes, you can recognize the
> > > > > owner and take decisions accordingly, sometimes you can decide to
> > > > > use the port because you logically know what the current owner
> > > > > does and you can be logically synchronized with it, sometimes you
> > > > > can just leave this port because you have not any deal with  this owner.
> > > >
> > > > If the goal is just to have an ability to recognize is that device
> > > > is managed by another device (failsafe, bonding, etc.),  then I
> > > > think all we need is a pointer to rte_eth_dev_data of the owner (NULL
> > would mean no owner).
> > >
> > > I think string is better than a pointer from the next reasons:
> > > 1. It is more human friendly than pointers for debug and printing.
> > > 2. it is flexible and allows to forward logical owner message to other DPDK
> > entities.
> > >
> > > > Also I think if we'd like to introduce that mechanism, then it needs
> > > > to be
> > > > - mandatory (control API just don't allow changes to the device
> > > > configuration if caller is not an owner).
> > >
> > > But what if 2 DPDK entities should manage the same port \ using it and they
> > are synchronized?
> > >
> > > > - transparent to the user (no API changes).
> > >
> > > For now, there is not API change but new suggested API to use for port
> > iteration.
> > >
> > > >  - set/get owner ops need to be atomic if we want this mechanism to
> > > > be usable for MP.
> > >
> > > But also without atomic this mechanism is usable in MP.
> > > For example:
> > > PRIMARY application can set its owner with string "primary A".
> > > SECONDARY process (which attach to the ports only after the primary
> > created them )is not allowed to set owner(As you can see in the code) but it
> > can read the owner string and see that the port owner is the primary
> > application.
> > > The "A" can just sign specific port type to the SECONDARY that this port
> > works with logic A which means, for example, primary should send the
> > packets and secondary should receive the packets.
> > >
> > But that's just the point: the operations that you are describing are not atomic
> > at all.  If the primary process is interrupted while setting a port's owner,
> > after it has read the current owner field, it's entirely possible for the
> > secondary process to set the owner as itself, and then have the primary
> > process set it back.  Without locking, it's just broken.  I know that the DPDK
> > likes to say it's lockless, because it has no locks, but here we're just kicking the
> > can down the road, by making the application add the locks for the library.
> > 
> > Neil
> > 
> As I wrote before and as is in the code you can understand that secondary process should not take ownership of ports.
But you allow for it, and if you do, you should write your API to be safe for
such operations.
> Any port configuration (for example port creation and release) is not internally synchronized between the primary to secondary processes so I don't see any reason to synchronize port ownership.
Yes, and I've never agreed with that design point either, because the fact of
the matter is that one of two things must be true in relation to port
configuration:

1) Either multiple processes will attempt to read/change port
configuration/ownership

or 

2) port ownership is implicitly given to a single task (be it a primary or
secondary task), and said ownership is therefore implicitly known by the
application

Either situation may be true depending on the details of the application being
built, but regardless, if (2) is true, then this api isn't really needed at all,
because the application implicitly has been designed to have a port be owned by
a given task.  If (1) is true, then all the locking required to access the data
of port ownership needs to be adhered to.

I understand that you are taking Thomas' words to mean that all paths are
lockless in the DPDK, and that is true as a statement of fact (in that the DPDK
doesn't synchronize access to internal data).  What it does do is leave that
locking as an exercise for the downstream consumer of the library.  While I
don't agree with it, I can see that such an argument is valid for hot paths such
as transmission and reception, where you may perhaps want to minimize your
locking by assigning a single task to do that work, but port configuration and
ownership aren't hot paths.  If you mean to allow potentially multiple tasks to
access configuration (including port ownership), be it frequently or just
occasionally, why are you making a downstream developer recognize the need for
locking here, outside the library?  It just seems like very bad general practice
to me.
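
To make the race concrete, here is a minimal sketch (not taken from the patch)
of an ownership claim whose check and set happen as one atomic step inside the
library. Every name in it (port_owner_claim, port_owner_get, NO_OWNER,
MAX_PORTS) is hypothetical, and C11 atomics only stand in for whatever
primitive the ethdev layer would really use; for MP the array would also have
to live in shared memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER  0u   /* hypothetical id meaning "nobody owns this port" */
#define MAX_PORTS 4u   /* hypothetical table size for the sketch */

/* Hypothetical per-port owner ids; zero-initialized, i.e. all unowned. */
static _Atomic uint16_t port_owner[MAX_PORTS];

/*
 * Claim a port only if it is currently unowned.  The read of the current
 * owner and the write of the new one are a single compare-and-exchange,
 * so a second claimer cannot slip in between them.
 * Returns 0 on success, -1 if the port is invalid or already owned.
 */
static int port_owner_claim(unsigned int port, uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	if (port >= MAX_PORTS || owner_id == NO_OWNER)
		return -1;
	if (atomic_compare_exchange_strong(&port_owner[port], &expected, owner_id))
		return 0;
	return -1;
}

/* Read the current owner id of a valid port. */
static uint16_t port_owner_get(unsigned int port)
{
	assert(port < MAX_PORTS);
	return atomic_load(&port_owner[port]);
}
```

With something of this shape, two entities calling port_owner_claim() on the
same port at the same time cannot both succeed, and the caller needs no lock of
its own.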

> If the primary-secondary process want to manage(configure) same port in same time, they must be synchronized by the applications, so this is the case in port ownership too (actually I don't think this synchronization is realistic because many configurations of the port are not in shared memory).
Yes, it is the case, my question is, why?  We're not talking about a time
sensitive execution path here.  And by avoiding it you're just making work that
has to be repeated by every downstream consumer.

Neil
  
Matan Azrad Dec. 5, 2017, 6:08 a.m. UTC | #13
Hi Neil

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Tuesday, December 5, 2017 12:31 AM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>;
> Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Mon, Dec 04, 2017 at 06:10:56PM +0000, Matan Azrad wrote:
> > Hi Neil
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Monday, December 4, 2017 6:01 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; Wu,
> > > Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > > > Hi Konstantine
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > > Sent: Sunday, December 3, 2017 1:10 PM
> > > > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > > > <nhorman@tuxdriver.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > >
> > > > >
> > > > > Hi Matan,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan
> > > > > > Azrad
> > > > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > > > <gaetan.rivet@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > >
> > > > > > Hi
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > > > dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port
> > > > > > > ownership
> > > > > > >
> > > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > > > Hello Matan, Neil,
> > > > > > > >
> > > > > > > > I like the port ownership concept. I think it is needed to
> > > > > > > > clarify some operations and should be useful to several
> subsystems.
> > > > > > > >
> > > > > > > > This patch could certainly be sub-divided however, and
> > > > > > > > your current
> > > > > > > > 1/5 should probably come after this one.
> > > > > > > >
> > > > > > > > Some comments inline.
> > > > > > > >
> > > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad
> wrote:
> > > > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > > > 1. It may be convenient for multi-process applications
> > > > > > > > > > to know
> > > > > which
> > > > > > > > > >    process is in charge of a port.
> > > > > > > > > > 2. A library could work on top of a port.
> > > > > > > > > > 3. A port can work on top of another port.
> > > > > > > > > >
> > > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > > > We need to check that the user is not trying to use a
> > > > > > > > > > port which is already managed by fail-safe.
> > > > > > > > > >
> > > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to
> > > > > > > > > > avoid multiple management of a device by different DPDK
> entities.
> > > > > > > > > >
> > > > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > > > name(string) while the owner id must be unique to
> > > > > > > > > > distinguish between two identical entity instances and
> > > > > > > > > > the owner name can be
> > > > > any name.
> > > > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > > > Each DPDK entity can allocate an owner unique
> > > > > > > > > > identifier and can use it and its preferred name to
> > > > > > > > > > owns valid ethdev
> > > ports.
> > > > > > > > > > Each DPDK entity can get any port owner status to
> > > > > > > > > > decide if it can manage the port or not.
> > > > > > > > > >
> > > > > > > > > > The current ethdev internal port management is not
> > > > > > > > > > affected by this feature.
> > > > > > > > > >
> > > > > > > >
> > > > > > > > The internal port management is not affected, but the
> > > > > > > > external interface is, however. In order to respect port
> > > > > > > > ownership, applications are forced to modify their port
> > > > > > > > iterator, as shown by your
> > > > > > > testpmd patch.
> > > > > > > >
> > > > > > > > I think it would be better to modify the current
> > > > > > > > RTE_ETH_FOREACH_DEV to call
> RTE_FOREACH_DEV_OWNED_BY,
> > > and
> > > > > > > > introduce a default owner that would represent the
> > > > > > > > application itself (probably with the ID 0 and an owner
> > > > > > > > string ""). Only with specific additional configuration
> > > > > > > > should this default subset of ethdev be
> > > > > divided.
> > > > > > > >
> > > > > > > > This would make this evolution seamless for applications,
> > > > > > > > at no cost to the complexity of the design.
> > > > > > > >
> > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > This seems fairly racy.  What if one thread attempts to
> > > > > > > > > set ownership on a port, while another is checking it on
> > > > > > > > > another cpu in parallel.  It doesn't seem like it will protect
> against that at all.
> > > > > > > > > It also doesn't protect against the possibility of
> > > > > > > > > multiple threads attempting to poll for rx in parallel,
> > > > > > > > > > which I think was part of Thomas's original statement
> > > > > > > > > regarding port ownership (he noted that the lockless
> > > > > > > > > design implied only a single thread should be allowed to
> > > > > > > > > poll
> > > > > > > for receive or make configuration changes at a time.
> > > > > > > > >
> > > > > > > > > Neil
> > > > > > > > >
> > > > > > > >
> > > > > > > > Isn't this race already there for any configuration
> > > > > > > > operation / polling function? The DPDK arch is expecting
> > > > > > > > applications to solve
> > > it.
> > > > > > > > Why should port ownership be designed differently from
> > > > > > > > other DPDK
> > > > > > > components?
> > > > > > > >
> > > > > > > > Yes, but that doesn't mean it should exist in perpetuity,
> > > > > > > nor does it mean that your new api should contain it as well.
> > > > > > >
> > > > > > > > Embedding checks for port ownership within operations will
> > > > > > > > force everyone to bear their costs, even those not
> > > > > > > > interested in using this API. These checks should be kept
> > > > > > > > outside, within the entity claiming ownership of the port,
> > > > > > > > in the form of using the proper port iterator IMO.
> > > > > > > >
> > > > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > > > That is to say, you should at least ensure that a port
> > > > > > > ownership get call doesn't race with a port ownership set
> > > > > > > > call.  It seems ridiculous to just leave that sort of locking as an
> > > > > > > > exercise to the user.
> > > > > > >
> > > > > > > Neil
> > > > > > >
> > > > > > Neil,
> > > > > > As Thomas mentioned, a DPDK port is designed to be managed by
> > > > > > only one thread (or synchronized DPDK entity).
> > > > > > So all the port management includes port ownership shouldn't
> > > > > > be synchronized, i.e. locks are not needed.
> > > > > > If some application want to do dual thread port management,
> > > > > > the responsibility to synchronize the port ownership or any
> > > > > > other port management is on this application.
> > > > > > Port ownership doesn't come to allow synchronized management
> > > > > > of the port by two DPDK entities in parallel, it is just nice
> > > > > > way to answer next
> > > > > questions:
> > > > > > 	1. Is the port already owned by some DPDK entity(in early
> > > > > > control
> > > > > path)?
> > > > > > 	2. If yes, Who is the owner?
> > > > > > If the answer to the first question is no, the current entity
> > > > > > can take the ownership without any lock(1 thread).
> > > > > > If the answer to the first question is yes, you can recognize
> > > > > > the owner and take decisions accordingly, sometimes you can
> > > > > > decide to use the port because you logically know what the
> > > > > > current owner does and you can be logically synchronized with
> > > > > > it, sometimes you can just leave this port because you have not any
> deal with  this owner.
> > > > >
> > > > > If the goal is just to have an ability to recognize is that
> > > > > device is managed by another device (failsafe, bonding, etc.),
> > > > > then I think all we need is a pointer to rte_eth_dev_data of the
> > > > > owner (NULL
> > > would mean no owner).
> > > >
> > > > I think string is better than a pointer from the next reasons:
> > > > 1. It is more human friendly than pointers for debug and printing.
> > > > 2. it is flexible and allows to forward logical owner message to
> > > > other DPDK
> > > entities.
> > > >
> > > > > Also I think if we'd like to introduce that mechanism, then it
> > > > > needs to be
> > > > > - mandatory (control API just don't allow changes to the device
> > > > > configuration if caller is not an owner).
> > > >
> > > > But what if 2 DPDK entities should manage the same port \ using it
> > > > and they
> > > are synchronized?
> > > >
> > > > > - transparent to the user (no API changes).
> > > >
> > > > For now, there is not API change but new suggested API to use for
> > > > port
> > > iteration.
> > > >
> > > > >  - set/get owner ops need to be atomic if we want this mechanism
> > > > > to be usable for MP.
> > > >
> > > > But also without atomic this mechanism is usable in MP.
> > > > For example:
> > > > PRIMARY application can set its owner with string "primary A".
> > > > SECONDARY process (which attach to the ports only after the
> > > > primary
> > > created them )is not allowed to set owner(As you can see in the
> > > code) but it can read the owner string and see that the port owner
> > > is the primary application.
> > > > The "A" can just sign specific port type to the SECONDARY that
> > > > this port
> > > works with logic A which means, for example, primary should send the
> > > packets and secondary should receive the packets.
> > > >
> > > > But that's just the point: the operations that you are describing are
> > > > not atomic at all.  If the primary process is interrupted while
> > > > setting a port's owner, after it has read the current owner
> > > > field, it's entirely possible for the secondary process to set the
> > > > owner as itself, and then have the primary process set it back.
> > > > Without locking, it's just broken.  I know that the DPDK likes to say
> > > > it's lockless, because it has no locks, but here we're just kicking the can
> > > > down the road, by making the application add the locks for the library.
> > >
> > > Neil
> > >
> > As I wrote before and as is in the code you can understand that secondary
> process should not take ownership of ports.
> But you allow for it, and if you do, you should write your API to be safe for
> such operations.

Please look at the code again: a secondary process cannot take ownership; I don't allow it!
Actually, I think it should not be restricted like that, since no other ethdev configuration has such a limitation.

> > Any port configuration (for example port creation and release) is not
> internally synchronized between the primary to secondary processes so I
> don't see any reason to synchronize port ownership.
> Yes, and I've never agreed with that design point either, because the fact of
> the matter is that one of two things must be true in relation to port
> configuration:
> 
> 1) Either multiple processes will attempt to read/change port
> configuration/ownership
> 
> or
> 
> 2) port ownership is implicitly given to a single task (be it a primary or
> secondary task), and said ownership is therefore implicitly known by the
> application
> 
> Either situation may be true depending on the details of the application being
> built, but regardless, if (2) is true, then this api isn't really needed at all,
> because the application implicitly has been designed to have a port be
> owned by a given task. 

Here I think you are missing something: port ownership is not mainly for process-level DPDK entities.
The entity can also be a PMD, a library, or an application in the same process.
For MP it is a nice-to-have: the secondary can just see the primary's owners and decide accordingly (please read my answer to Konstantin above).
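
As an illustration only, the decision described above could look roughly like
the sketch below. It is not the code from the patch: the struct layout, the
64-byte name length, the convention that id 0 means "no owner", and the helper
may_use_port() are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OWNER_NAME_LEN 64   /* hypothetical maximum owner name length */

/* Hypothetical owner record: a unique id plus a free-form, human-readable name. */
struct port_owner {
	uint16_t id;                 /* 0 is assumed to mean "no owner" */
	char name[OWNER_NAME_LEN];
};

/*
 * Decide whether the calling entity may use a port: either the port has
 * no owner, or it is owned under a name the caller knows how to cooperate
 * with (for example a secondary process that recognizes "primary A").
 */
static int may_use_port(const struct port_owner *owner, const char *known_name)
{
	assert(owner != NULL && known_name != NULL);

	if (owner->id == 0)
		return 1;
	return strncmp(owner->name, known_name, OWNER_NAME_LEN) == 0;
}
```

The name is what carries the logical meaning here; the id only distinguishes
two instances that use the same name.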

> If (1) is true, then all the locking required to access
> the data of port ownership needs to be adhered to.
> 

And all the port configurations!
I think that is beyond the scope of this thread.


> I understand that you are taking Thomas' words to mean that all paths are
> lockless in the DPDK, and that is true as a statement of fact (in that the DPDK
> doesn't synchronize access to internal data).  What it does do, is leave that
> locking as an exercise for the downstream consumer of the library.  While I
> don't agree with it, I can see that such an argument is valid for hot paths such
> as transmission and reception where you may perhaps want to minimize
> your locking by assigning a single task to do that work, but port configuration
> and ownership isn't a hot path.  If you mean to allow potentially multiple
> tasks to access configuration (including port ownership), be it frequently or
> just occasionally, why are you making a downstream developer recognize the
> need for locking here outside the library?  It just seems like very bad general
> practice to me.
> 
> > If the primary-secondary process want to manage(configure) same port in
> same time, they must be synchronized by the applications, so this is the case
> in port ownership too (actually I don't think this synchronization is realistic
> because many configurations of the port are not in shared memory).
> Yes, it is the case, my question is, why?  We're not talking about a time
> sensitive execution path here.  And by avoiding it you're just making work
> that has to be repeated by every downstream consumer.

I think you are suggesting making all ethdev configuration race-safe; that is beyond the scope of this thread.
The current ethdev implementation leaves race management to applications, so port ownership, like any other port configuration, should be designed the same way.

> 
> Neil
  
Bruce Richardson Dec. 5, 2017, 10:05 a.m. UTC | #14
On Tue, Dec 05, 2017 at 06:08:35AM +0000, Matan Azrad wrote:
> Hi Neil
> 
> > -----Original Message----- From: Neil Horman
> > [mailto:nhorman@tuxdriver.com] Sent: Tuesday, December 5, 2017 12:31
> > AM To: Matan Azrad <matan@mellanox.com> Cc: Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>; Wu,
> > Jingjing <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Mon, Dec 04, 2017 at 06:10:56PM +0000, Matan Azrad wrote:
> > > Hi Neil
> > >
> > > > -----Original Message----- From: Neil Horman
> > > > [mailto:nhorman@tuxdriver.com] Sent: Monday, December 4, 2017
> > > > 6:01 PM To: Matan Azrad <matan@mellanox.com> Cc: Ananyev,
> > > > Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Wu,
> > > > Jingjing <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > > > > Hi Konstantine
> > > > >
> > > > > > -----Original Message----- From: Ananyev, Konstantin
> > > > > > [mailto:konstantin.ananyev@intel.com] Sent: Sunday, December
> > > > > > 3, 2017 1:10 PM To: Matan Azrad <matan@mellanox.com>; Neil
> > > > > > Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > > <jingjing.wu@intel.com>; dev@dpdk.org Subject: RE:
> > > > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi Matan,
> > > > > >
> > > > > > > -----Original Message----- From: dev
> > > > > > > [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > > > > Sent: Sunday, December 3, 2017 8:05 AM To: Neil Horman
> > > > > > > <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > > > > <gaetan.rivet@6wind.com> Cc: Thomas Monjalon
> > > > > > > <thomas@monjalon.net>; Wu, Jingjing
> > > > > > > <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > > > > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > > -----Original Message----- From: Neil Horman
> > > > > > > > [mailto:nhorman@tuxdriver.com] Sent: Friday, December 1,
> > > > > > > > 2017 2:10 PM To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > > <thomas@monjalon.net>; Jingjing Wu
> > > > > > > > <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > > > > > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > > > >
> > > > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet
> > > > > > > > wrote:
> > > > > > > > > Hello Matan, Neil,
> > > > > > > > >
> > > > > > > > > I like the port ownership concept. I think it is
> > > > > > > > > needed to clarify some operations and should be useful
> > > > > > > > > to several
> > subsystems.
> > > > > > > > >
> > > > > > > > > This patch could certainly be sub-divided however, and
> > > > > > > > > your current 1/5 should probably come after this one.
> > > > > > > > >
> > > > > > > > > Some comments inline.
> > > > > > > > >
> > > > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan
> > > > > > > > > > Azrad
> > wrote:
> > > > > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > > > > > 1. It may be convenient for multi-process
> > > > > > > > > > > >    applications to know which process is in
> > > > > > > > > > > >    charge of a port.
> > > > > > > > > > > > 2. A library could work on top of a port.
> > > > > > > > > > > > 3. A port can work on top of another port.
> > > > > > > > > > >
> > > > > > > > > > > Also in the fail-safe case, an issue has been met
> > > > > > > > > > > in testpmd.  We need to check that the user is not
> > > > > > > > > > > trying to use a port which is already managed by
> > > > > > > > > > > fail-safe.
> > > > > > > > > > >
> > > > > > > > > > > Add ownership mechanism to DPDK Ethernet devices
> > > > > > > > > > > to avoid multiple management of a device by
> > > > > > > > > > > different DPDK
> > entities.
> > > > > > > > > > >
> > > > > > > > > > > A port owner is built from owner id(number) and
> > > > > > > > > > > owner name(string) while the owner id must be
> > > > > > > > > > > unique to distinguish between two identical entity
> > > > > > > > > > > instances and the owner name can be
> > > > > > any name.
> > > > > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > > > > Each DPDK entity can allocate an owner unique
> > > > > > > > > > > identifier and can use it and its preferred name
> > > > > > > > > > > to owns valid ethdev
> > > > ports.
> > > > > > > > > > > Each DPDK entity can get any port owner status to
> > > > > > > > > > > decide if it can manage the port or not.
> > > > > > > > > > >
> > > > > > > > > > > The current ethdev internal port management is not
> > > > > > > > > > > affected by this feature.
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The internal port management is not affected, but the
> > > > > > > > > external interface is, however. In order to respect
> > > > > > > > > port ownership, applications are forced to modify
> > > > > > > > > their port iterator, as shown by your
> > > > > > > > testpmd patch.
> > > > > > > > >
> > > > > > > > > I think it would be better to modify the current
> > > > > > > > > RTE_ETH_FOREACH_DEV to call
> > RTE_FOREACH_DEV_OWNED_BY,
> > > > and
> > > > > > > > > introduce a default owner that would represent the
> > > > > > > > > application itself (probably with the ID 0 and an
> > > > > > > > > owner string ""). Only with specific additional
> > > > > > > > > configuration should this default subset of ethdev be
> > > > > > divided.
> > > > > > > > >
> > > > > > > > > This would make this evolution seamless for
> > > > > > > > > applications, at no cost to the complexity of the
> > > > > > > > > design.
> > > > > > > > >
> > > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This seems fairly racy.  What if one thread attempts
> > > > > > > > > > to set ownership on a port, while another is
> > > > > > > > > > checking it on another cpu in parallel.  It doesn't
> > > > > > > > > > seem like it will protect
> > against that at all.
> > > > > > > > > > It also doesn't protect against the possibility of
> > > > > > > > > > multiple threads attempting to poll for rx in
> > > > > > > > > > parallel, which I think was part of Thomas's
> > > > > > > > > > > original statement regarding port ownership (he
> > > > > > > > > > noted that the lockless design implied only a single
> > > > > > > > > > thread should be allowed to poll
> > > > > > > > for receive or make configuration changes at a time.
> > > > > > > > > >
> > > > > > > > > > Neil
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Isn't this race already there for any configuration
> > > > > > > > > operation / polling function? The DPDK arch is
> > > > > > > > > expecting applications to solve
> > > > it.
> > > > > > > > > Why should port ownership be designed differently from
> > > > > > > > > other DPDK
> > > > > > > > components?
> > > > > > > > >
> > > > > > > > Yes, but that doesn't mean it should exist in
> > > > > > > > > perpetuity, nor does it mean that your new api should
> > > > > > > > contain it as well.
> > > > > > > >
> > > > > > > > > Embedding checks for port ownership within operations
> > > > > > > > > will force everyone to bear their costs, even those
> > > > > > > > > not interested in using this API. These checks should
> > > > > > > > > be kept outside, within the entity claiming ownership
> > > > > > > > > of the port, in the form of using the proper port
> > > > > > > > > iterator IMO.
> > > > > > > > >
> > > > > > > > No.  At the very least, you need to make the API itself
> > > > > > > > exclusive.  That is to say, you should at least ensure
> > > > > > > > that a port ownership get call doesn't race with a port
> > > > > > > > > ownership set call.  It seems ridiculous to just leave
> > > > > > > > > that sort of locking as an exercise to the user.
> > > > > > > >
> > > > > > > > Neil
> > > > > > > >
> > > > > > > Neil,
> > > > > > > As Thomas mentioned, a DPDK port is designed to be managed
> > > > > > > by only one thread (or synchronized DPDK entity).
> > > > > > > So all the port management includes port ownership
> > > > > > > shouldn't be synchronized, i.e. locks are not needed.
> > > > > > > If some application want to do dual thread port
> > > > > > > management, the responsibility to synchronize the port
> > > > > > > ownership or any other port management is on this
> > > > > > > application.
> > > > > > > Port ownership doesn't come to allow synchronized
> > > > > > > management of the port by two DPDK entities in parallel,
> > > > > > > it is just nice way to answer next questions:
> > > > > > > 	1. Is the port already owned by some DPDK entity(in
> > > > > > > 	   early control path)?
> > > > > > > 	2. If yes, Who is the owner?
> > > > > > > If the answer to the first question is no, the current
> > > > > > > entity can take the ownership without any lock(1 thread).
> > > > > > > If the answer to the first question is yes, you can
> > > > > > > recognize the owner and take decisions accordingly,
> > > > > > > sometimes you can decide to use the port because you
> > > > > > > logically know what the current owner does and you can be
> > > > > > > logically synchronized with it, sometimes you can just
> > > > > > > leave this port because you have not any deal with this
> > > > > > > owner.
> > > > > >
> > > > > > If the goal is just to have an ability to recognize is that
> > > > > > device is managed by another device (failsafe, bonding,
> > > > > > etc.), then I think all we need is a pointer to
> > > > > > rte_eth_dev_data of the owner (NULL would mean no owner).
> > > > >
> > > > > I think string is better than a pointer from the next reasons:
> > > > > > 1. It is more human friendly than pointers for debug and
> > > > > > printing.
> > > > > > 2. it is flexible and allows to forward logical owner message
> > > > > > to other DPDK entities.
> > > > >
> > > > > > Also I think if we'd like to introduce that mechanism, then
> > > > > > it needs to be - mandatory (control API just don't allow
> > > > > > changes to the device configuration if caller is not an
> > > > > > owner).
> > > > >
> > > > > > But what if 2 DPDK entities should manage the same port \
> > > > > > using it and they are synchronized?
> > > > >
> > > > > > - transparent to the user (no API changes).
> > > > >
> > > > > > For now, there is not API change but new suggested API to use
> > > > > > for port iteration.
> > > > >
> > > > > >  - set/get owner ops need to be atomic if we want this
> > > > > >  mechanism to be usable for MP.
> > > > >
> > > > > > But also without atomic this mechanism is usable in MP.
> > > > > > For example:
> > > > > > PRIMARY application can set its owner with string
> > > > > > "primary A".
> > > > > > SECONDARY process (which attach to the ports only after the
> > > > > > primary created them) is not allowed to set owner (As you can
> > > > > > see in the code) but it can read the owner string and see
> > > > > > that the port owner is the primary application.
> > > > > > The "A" can just sign specific port type to the SECONDARY
> > > > > > that this port works with logic A which means, for example,
> > > > > > primary should send the packets and secondary should receive
> > > > > > the packets.
> > > > >
> > > > > But that's just the point, the operations that you are
> > > > > describing are not atomic at all.  If the primary process is
> > > > > interrupted during its setting of a port's ownership after it's
> > > > > read the current owner field, it's entirely possible for the
> > > > > secondary process to set the owner as itself, and then have the
> > > > > primary process set it back.  Without locking, it's just broken.
> > > > > I know that the dpdk likes to say it's lockless, because it has
> > > > > no locks, but here we're just kicking the can down the road, by
> > > > > making the application add the locks for the library.
> > > >
> > > > Neil
> > > >
> > > As I wrote before and as is in the code you can understand that
> > > secondary process should not take ownership of ports.
> > But you allow for it, and if you do, you should write your api to be
> > safe for such operations.
> 
> Please look at the code again, secondary process cannot take
> ownership, I don't allow it!  Actually, I think it must not be like
> that as no limitation for that in any other ethdev configurations.
> 
> > > Any port configuration (for example port creation and release) is
> > > not internally synchronized between the primary to secondary
> > > processes so I don't see any reason to synchronize port ownership.
> > Yes, and I've never agreed with that design point either, because
> > the fact of the matter is that one of two things must be true in
> > relation to port configuration:
> > 
> > 1) Either multiple processes will attempt to read/change port
> > configuration/ownership
> > 
> > or
> > 
> > 2) port ownership is implicitly given to a single task (be it a
> > primary or secondary task), and said ownership is therefore
> > implicitly known by the application
> > 
> > Either situation may be true depending on the details of the
> > application being built, but regardless, if (2) is true, then this
> > api isn't really needed at all, because the application implicitly
> > has been designed to have a port be owned by a given task. 
> 
> Here I think you miss something, the port ownership is not mainly for
> process DPDK entities, The entity can be also PMD, library,
> application in same process.  For MP it is nice to have, the secondary
> just can see the primary owners and take decision accordingly (please
> read my answer to Konstatin above). 
> 
> > If (1) is true, then all the locking required to access
> > the data of port ownership needs to be adhered to.
> > 
> 
> And all the port configurations!  I think it is behind to this thread.
> 
> 
> > I understand that you are taking Thomas' words to mean that all
> > paths are lockless in the DPDK, and that is true as a statement of
> > fact (in that the DPDK doesn't synchronize access to internal data).
> > What it does do, is leave that locking as an exercise for the
> > downstream consumer of the library.  While I don't agree with it, I
> > can see that such an argument is valid for hot paths such as
> > transmission and reception where you may perhaps want to minimize
> > your locking by assigning a single task to do that work, but port
> > configuration and ownership isn't a hot path.  If you mean to allow
> > potentially multiple tasks to access configuration (including port
> > ownership), be it frequently or just occasionally, why are you making
> > a downstream developer recognize the need for locking here outside
> > the library?  It just seems like very bad general practice to me.
> > 
> > > If the primary-secondary process want to manage(configure) same
> > > port in same time, they must be synchronized by the applications,
> > > so this is the case in port ownership too (actually I don't think
> > > this synchronization is realistic because many configurations of
> > > the port are not in shared memory).
> > Yes, it is the case, my question is, why?  We're not talking about a
> > time sensitive execution path here.  And by avoiding it you're just
> > making work that has to be repeated by every downstream consumer.
> 
> I think you suggest to make all the ethdev configuration race safe, it
> is behind to this thread.  Current ethdev implementation leave the
> race management to applications, so port ownership as any other port
> configurations should be designed in the same method.
> 
> > 
One key difference, though, being that port ownership itself could be
used to manage the thread-safety of the ethdev configuration. It's also
a little different from other APIs in that I find it hard to come up
with a scenario where it would be very useful to an application without
also having some form of locking present in it. For other config/control
APIs we can have the control plane APIs work without locks e.g. by
having a single designated thread/process manage all configuration
updates. However, as Neil points out, in such a scenario, the ownership
concept doesn't provide any additional benefit so can be skipped
completely. I'd view it a bit like the reference counting of mbufs -
we can provide a lockless/non-atomic version, but for just about every
real use-case, you want the atomic version.
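
To make the comparison concrete, here is a minimal sketch of what an
atomic claim/release could look like. It uses plain C11 atomics rather
than any existing ethdev or EAL call, it only covers the owner id (not
the name string), and `struct port_owner_slot`, `owner_try_claim()`,
`owner_release()` and `owner_get()` are hypothetical names that do not
appear in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_OWNER 0u /* plays the role of RTE_ETH_DEV_NO_OWNER */

/* Hypothetical per-port owner slot; not the layout used in the patch. */
struct port_owner_slot {
	_Atomic uint16_t id;
};

/* Claim the port only if it is currently ownerless. */
static bool
owner_try_claim(struct port_owner_slot *slot, uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	assert(owner_id != NO_OWNER);
	/* Succeeds for exactly one caller when several race for the port. */
	return atomic_compare_exchange_strong(&slot->id, &expected, owner_id);
}

/* Release the port only if the caller is the current owner. */
static bool
owner_release(struct port_owner_slot *slot, uint16_t owner_id)
{
	uint16_t expected = owner_id;

	assert(owner_id != NO_OWNER);
	return atomic_compare_exchange_strong(&slot->id, &expected, NO_OWNER);
}

/* Read the current owner id; NO_OWNER means the port is free. */
static uint16_t
owner_get(struct port_owner_slot *slot)
{
	return atomic_load(&slot->id);
}
```

With something of this shape, a check-then-set sequence cannot be
interleaved with another claimer, which is the property the non-atomic
version cannot give.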

Regards,
/Bruce
  
Ananyev, Konstantin Dec. 5, 2017, 11:12 a.m. UTC | #15
Hi Matan,

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Sunday, December 3, 2017 1:47 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi Konstantine
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Sent: Sunday, December 3, 2017 1:10 PM
> > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> >
> >
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > Sent: Sunday, December 3, 2017 8:05 AM
> > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > Hi
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > Hello Matan, Neil,
> > > > >
> > > > > I like the port ownership concept. I think it is needed to clarify
> > > > > some operations and should be useful to several subsystems.
> > > > >
> > > > > This patch could certainly be sub-divided however, and your
> > > > > current
> > > > > 1/5 should probably come after this one.
> > > > >
> > > > > Some comments inline.
> > > > >
> > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > Making it explicit is better from the next reasons:
> > > > > > > 1. It may be convenient for multi-process applications to know
> > which
> > > > > > >    process is in charge of a port.
> > > > > > > 2. A library could work on top of a port.
> > > > > > > 3. A port can work on top of another port.
> > > > > > >
> > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > We need to check that the user is not trying to use a port
> > > > > > > which is already managed by fail-safe.
> > > > > > >
> > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > multiple management of a device by different DPDK entities.
> > > > > > >
> > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > > between two identical entity instances and the owner name can be
> > any name.
> > > > > > > The name helps to logically recognize the owner by different
> > > > > > > DPDK entities and allows easy debug.
> > > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > > can manage the port or not.
> > > > > > >
> > > > > > > The current ethdev internal port management is not affected by
> > > > > > > this feature.
> > > > > > >
> > > > >
> > > > > The internal port management is not affected, but the external
> > > > > interface is, however. In order to respect port ownership,
> > > > > applications are forced to modify their port iterator, as shown by
> > > > > your
> > > > testpmd patch.
> > > > >
> > > > > I think it would be better to modify the current
> > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > > introduce a default owner that would represent the application
> > > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > > specific additional configuration should this default subset of ethdev be
> > divided.
> > > > >
> > > > > This would make this evolution seamless for applications, at no
> > > > > cost to the complexity of the design.
> > > > >
> > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > >
> > > > > >
> > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > ownership on a port, while another is checking it on another cpu
> > > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > It also doesn't protect against the possibility of multiple
> > > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > > part of Thomas's original statement regarding port ownership
> > > > > > > (he noted that the lockless design implied only a single thread
> > > > > > > should be allowed to poll for receive or make configuration
> > > > > > > changes at a time).
> > > > > >
> > > > > > Neil
> > > > > >
> > > > >
> > > > > Isn't this race already there for any configuration operation /
> > > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > > Why should port ownership be designed differently from other DPDK
> > > > components?
> > > > >
> > > > > Yes, but that doesn't mean it should exist in perpetuity, nor does
> > > > it mean that your new api should contain it as well.
> > > >
> > > > > Embedding checks for port ownership within operations will force
> > > > > everyone to bear their costs, even those not interested in using
> > > > > this API. These checks should be kept outside, within the entity
> > > > > claiming ownership of the port, in the form of using the proper
> > > > > port iterator IMO.
> > > > >
> > > > No.  At the very least, you need to make the API itself exclusive.
> > > > That is to say, you should at least ensure that a port ownership get
> > > > call doesn't race with a port ownership set call.  It seems
> > > > > ridiculous to just leave that sort of locking as an exercise to the user.
> > > >
> > > > Neil
> > > >
> > > Neil,
> > > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > > thread (or synchronized DPDK entity).
> > > So all the port management includes port ownership shouldn't be
> > > synchronized, i.e. locks are not needed.
> > > If some application want to do dual thread port management, the
> > > responsibility to synchronize the port ownership or any other port
> > > management is on this application.
> > > Port ownership doesn't come to allow synchronized management of the
> > > port by two DPDK entities in parallel, it is just nice way to answer next
> > questions:
> > > 	1. Is the port already owned by some DPDK entity(in early control
> > path)?
> > > 	2. If yes, Who is the owner?
> > > If the answer to the first question is no, the current entity can take
> > > the ownership without any lock(1 thread).
> > > If the answer to the first question is yes, you can recognize the
> > > owner and take decisions accordingly, sometimes you can decide to use
> > > the port because you logically know what the current owner does and
> > > you can be logically synchronized with it, sometimes you can just
> > > leave this port because you have not any deal with  this owner.
> >
> > If the goal is just to have an ability to recognize is that device is managed by
> > another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> > to rte_eth_dev_data of the owner (NULL would mean no owner).
> 
> I think string is better than a pointer from the next reasons:
> 1. It is more human friendly than pointers for debug and printing.

We can have a function that takes an owner pointer and produces a nicely
formatted text explanation: "owned by fail-safe device at port X" or so.
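
Something along these lines, as a rough sketch only. `struct dev_data`
is a hypothetical, cut-down stand-in for rte_eth_dev_data with an added
owner pointer, and `owner_describe()` is an invented name, not an
existing ethdev function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the parts of rte_eth_dev_data used here. */
struct dev_data {
	uint16_t port_id;
	const char *drv_name;         /* e.g. "fail-safe" */
	const struct dev_data *owner; /* NULL means no owner */
};

/*
 * Write a human-readable description of who owns "dev" into buf.
 * Returns what snprintf() returns: the length the text needs.
 */
static int
owner_describe(const struct dev_data *dev, char *buf, size_t len)
{
	assert(dev != NULL && buf != NULL);
	if (dev->owner == NULL)
		return snprintf(buf, len, "port %u: no owner",
				(unsigned int)dev->port_id);
	return snprintf(buf, len, "port %u: owned by %s device at port %u",
			(unsigned int)dev->port_id, dev->owner->drv_name,
			(unsigned int)dev->owner->port_id);
}
```

So the debug/printing case is covered without storing any string in the
owned device itself.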

> 2. it is flexible and allows to forward logical owner message to other DPDK entities.

Hmm, and why do you want to do that?
There are a dozen well-defined IPC mechanisms in the POSIX world, why do we need to create
a new one?
Especially considering how limited and error-prone the new one is.

> 
> > Also I think if we'd like to introduce that mechanism, then it needs to be
> > - mandatory (control API just don't allow changes to the device configuration
> > if caller is not an owner).
> 
> But what if 2 DPDK entities should manage the same port \ using it and they are synchronized?

You mean 2 DPDK processes (primary/secondary), right?
As you mentioned below, ownership can be set only by the primary.
So from the perspective of synchronizing access to the device between multiple processes
it seems useless anyway.
What I am talking about is synchronizing access to the low-level device from
different high-level entities.
Let's say we have 2 failsafe devices (or 2 bonded devices):
that mechanism will help to ensure that only one of them can own the device.
Again, if a user tries by mistake to manage a device that is owned by a failsafe device,
they won't be able to do that.
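
Roughly, and only as a sketch of the idea: the control path compares the
caller with the recorded owner and refuses the change otherwise.
`struct dev`, `dev_claim()` and `dev_set_mtu()` are hypothetical names
for illustration, not existing ethdev code, and passing the caller
explicitly is just an assumption about how the caller would be identified.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced device record; not rte_eth_dev_data itself. */
struct dev {
	uint16_t port_id;
	uint16_t mtu;
	const struct dev *owner; /* NULL means no owner */
};

/* A compound device (failsafe, bonding, ...) takes a low-level device. */
static int
dev_claim(struct dev *dev, const struct dev *new_owner)
{
	assert(dev != NULL && new_owner != NULL);
	if (dev->owner != NULL && dev->owner != new_owner)
		return -EBUSY; /* already owned by another compound device */
	dev->owner = new_owner;
	return 0;
}

/*
 * Control call with a mandatory ownership check. "caller" is the device
 * acting on this port, or NULL when the application calls directly.
 */
static int
dev_set_mtu(struct dev *dev, const struct dev *caller, uint16_t mtu)
{
	assert(dev != NULL);
	if (dev->owner != NULL && dev->owner != caller)
		return -EPERM; /* owned by someone else: refuse the change */
	dev->mtu = mtu;
	return 0;
}
```

An unowned port behaves exactly as today, so applications that never
use compound devices see no difference.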

> 
> > - transparent to the user (no API changes).
> 
> For now, there is not API change but new suggested API to use for port iteration.

Sorry, I probably wasn't clear here.
What I meant is that this API to set/get ownership should be sort of internal to the ethdev layer.
Let's say it would be used by a failsafe/bonding (or any other compound) device that needs
to own/manage several low-level devices.
So in a normal situation the user wouldn't need to use that API directly at all.

> 
> >  - set/get owner ops need to be atomic if we want this mechanism to be
> > usable for MP.
> 
> But also without atomic this mechanism is usable in MP.
> For example:
> PRIMARY application can set its owner with string "primary A".
> SECONDARY process (which attach to the ports only after the primary created them )is not allowed to set owner(As you can see in the code)
> but it can read the owner string and see that the port owner is the primary application.
> The "A" can just sign specific port type to the SECONDARY that this port works with logic A which means, for example, primary should send
> the packets and secondary should receive the packets.

Even if the secondary process is not allowed to modify that string, it might decide to read it at the moment
when the primary one decides to change it again (clear/set owner).
In that situation the secondary will end up either reading junk or just crashing.
But anyway, as I said above, I don't think it is a good idea to have strings here and
use them as an IPC mechanism.
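
If the string stays, the reader would at least need a consistent
snapshot instead of a pointer into memory that the primary may rewrite.
A minimal sketch, assuming a small spin flag stored next to the owner
record: it uses C11 `atomic_flag` rather than any DPDK lock, and
`struct owner_slot`, `owner_set()`, `owner_clear()` and
`owner_get_copy()` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NO_OWNER 0u
#define OWNER_NAME_LEN 64 /* same size as RTE_ETH_MAX_OWNER_NAME_LEN */

struct owner {
	uint16_t id;
	char name[OWNER_NAME_LEN];
};

/* Hypothetical slot: the owner record plus a flag guarding it. */
struct owner_slot {
	atomic_flag lock;
	struct owner owner;
};

static void
slot_lock(struct owner_slot *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock,
						 memory_order_acquire))
		; /* spin: set/get are short and not on the data path */
}

static void
slot_unlock(struct owner_slot *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Set id and name together, so readers never see a half-written pair. */
static int
owner_set(struct owner_slot *s, uint16_t id, const char *name)
{
	int ret = 0;

	assert(id != NO_OWNER && name != NULL);
	if (strlen(name) >= OWNER_NAME_LEN)
		return -EINVAL;
	slot_lock(s);
	if (s->owner.id != NO_OWNER && s->owner.id != id) {
		ret = -EPERM; /* owned by somebody else */
	} else {
		s->owner.id = id;
		snprintf(s->owner.name, sizeof(s->owner.name), "%s", name);
	}
	slot_unlock(s);
	return ret;
}

static void
owner_clear(struct owner_slot *s)
{
	slot_lock(s);
	memset(&s->owner, 0, sizeof(s->owner));
	slot_unlock(s);
}

/* Copy the record out under the lock; returns 1 if owned, 0 if not. */
static int
owner_get_copy(struct owner_slot *s, struct owner *out)
{
	slot_lock(s);
	*out = s->owner;
	slot_unlock(s);
	return out->id != NO_OWNER;
}
```

Returning a copy rather than a const pointer is the important part: the
caller keeps a stable view even if the owner is cleared right afterwards.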

Konstantin



> >
> >
> >
> >
> >
> > >
> > > > > > > ---
> > > > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > > > ++++++++++++++++++++++++++++++++
> > > > > > >  lib/librte_ether/rte_ethdev.h           |  86
> > +++++++++++++++++++++++
> > > > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > index 6a0c9f9..af639a1 100644
> > > > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without
> > > > > > > SW lock. This PMD feature found in som
> > > > > > >
> > > > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > > > capability probing details.
> > > > > > >
> > > > > > > -Device Identification and Configuration
> > > > > > > +Device Identification, Ownership  and Configuration
> > > > > > >  ---------------------------------------
> > > > > > >
> > > > > > >  Device Identification
> > > > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports
> > > > > > > are
> > > > assigned two other identifiers:
> > > > > > >  *   A port name used to designate the port in console messages, for
> > > > administration or debugging purposes.
> > > > > > >      For ease of use, the port name includes the port index.
> > > > > > >
> > > > > > > +Port Ownership
> > > > > > > +~~~~~~~~~~~~~
> > > > > > > +The Ethernet devices ports can be owned by a single DPDK
> > > > > > > +entity
> > > > (application, library, PMD, process, etc).
> > > > > > > +The ownership mechanism is controlled by ethdev APIs and
> > > > > > > +allows to
> > > > set/remove/get a port owner by DPDK entities.
> > > > > > > +Allowing this should prevent any multiple management of
> > > > > > > +Ethernet
> > > > port by different entities.
> > > > > > > +
> > > > > > > +.. note::
> > > > > > > +
> > > > > > > +    It is the DPDK entity responsibility either to check the
> > > > > > > + port owner
> > > > before using it or to set the port owner to prevent others from using it.
> > > > > > > +
> > > > > > >  Device Configuration
> > > > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > > > >
> > > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > > @@ -71,6 +71,7 @@
> > > > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > > > +static uint16_t rte_eth_next_owner_id =
> > RTE_ETH_DEV_NO_OWNER
> > > > + 1;
> > > > > > >  static uint8_t eth_dev_last_created_port;
> > > > > > >
> > > > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > > > struct rte_eth_dev *
> > > > > > >  	if (eth_dev == NULL)
> > > > > > >  		return -EINVAL;
> > > > > > >
> > > > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > > > +rte_eth_dev_owner));
> > > > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > > >  	return 0;
> > > > > > >  }
> > > > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > > > >  		return 1;
> > > > > > >  }
> > > > > > >
> > > > > > > +static int
> > > > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER
> > &&
> > > > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> > owner_id);
> > > > > > > +		return 0;
> > > > > > > +	}
> > > > > > > +	return 1;
> > > > > > > +}
> > > > > > > +
> > > > > > > +uint16_t
> > > > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > > > +owner_id) {
> > > > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > > > +	       (rte_eth_devices[port_id].state !=
> > RTE_ETH_DEV_ATTACHED ||
> > > > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > > > +		port_id++;
> > > > > > > +
> > > > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > > > +		return RTE_MAX_ETHPORTS;
> > > > > > > +
> > > > > > > +	return port_id;
> > > > > > > +}
> > > > > > > +
> > > > > > > +int
> > > > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > cannot own
> > > > ports.\n");
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum
> > number of
> > > > Ethernet port owners.\n");
> > > > > > > +		return -EUSERS;
> > > > > > > +	}
> > > > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +int
> > > > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > +	int ret;
> > > > > > > +
> > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > cannot own
> > > > ports.\n");
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > > > +		return -EINVAL;
> > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > > > +	    port_owner->id != owner->id) {
> > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > +			"Cannot set owner to port %d already owned
> > by
> > > > %s_%05d.\n",
> > > > > > > +			port_id, port_owner->name, port_owner-
> > >id);
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	ret = snprintf(port_owner->name,
> > > > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > > > +		       owner->name);
> > > > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > > > +		memset(port_owner->name, 0,
> > > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > > +		return -EINVAL;
> > > > > > > +	}
> > > > > > > +	port_owner->id = owner->id;
> > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> > port_id,
> > > > > > > +			    owner->name, owner->id);
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +int
> > > > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > +uint16_t
> > > > > > > +owner_id) {
> > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > +		return -EINVAL;
> > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > +	if (port_owner->id != owner_id) {
> > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > +			"Cannot remove port %d owner %s_%05d by
> > > > different owner id %5d.\n",
> > > > > > > +			port_id, port_owner->name, port_owner-
> > >id,
> > > > owner_id);
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > > > removed.\n", port_id,
> > > > > > > +			port_owner->name, port_owner->id);
> > > > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +void
> > > > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > > > +	uint16_t p;
> > > > > > > +
> > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > +		return;
> > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > > > +			    "%05d identifier has removed.\n",
> > owner_id); }
> > > > > > > +
> > > > > > > +const struct rte_eth_dev_owner * rte_eth_dev_owner_get(const
> > > > > > > +uint16_t port_id) {
> > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > > > RTE_ETH_DEV_NO_OWNER)
> > > > > > > +		return NULL;
> > > > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > > > +}
> > > > > > > +
> > > > > > >  int
> > > > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > > > a/lib/librte_ether/rte_ethdev.h
> > > > > > > b/lib/librte_ether/rte_ethdev.h index 341c2d6..f54c26d 100644
> > > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > > > >
> > > > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > > > >
> > > > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > > > +
> > > > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > > > +
> > > > > > > +struct rte_eth_dev_owner {
> > > > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The
> > owner
> > > > name. */
> > > > > > > +};
> > > > > > > +
> > > > > > >  /**
> > > > > > >   * @internal
> > > > > > >   * The data part, with no function pointers, associated with
> > > > > > > each
> > > > ethernet device.
> > > > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > > > >  	int numa_node;  /**< NUMA node connection */
> > > > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > > > >  	/**< VLAN filter configuration. */
> > > > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > > > >  };
> > > > > > >
> > > > > > >  /** Device supports link state interrupt */ @@ -1846,6
> > > > > > > +1856,82 @@ struct rte_eth_dev_data {
> > > > > > >
> > > > > > >
> > > > > > >  /**
> > > > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > > > + *
> > > > > > > + * @param port_id
> > > > > > > + *   The id of the next possible valid owned port.
> > > > > > > + * @param	owner_id
> > > > > > > + *  The owner identifier.
> > > > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid
> > > > > > > + ownerless
> > > > ports.
> > > > > > > + * @return
> > > > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > > > there is none.
> > > > > > > + */
> > > > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > > > +uint16_t owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Macro to iterate over all enabled ethdev ports owned by a
> > > > > > > +specific
> > > > owner.
> > > > > > > + */
> > > > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Get a new unique owner identifier.
> > > > > > > + * An owner identifier is used to owns Ethernet devices by
> > > > > > > +only one DPDK entity
> > > > > > > + * to avoid multiple management of device by different entities.
> > > > > > > + *
> > > > > > > + * @param	owner_id
> > > > > > > + *   Owner identifier pointer.
> > > > > > > + * @return
> > > > > > > + *   Negative errno value on error, 0 on success.
> > > > > > > + */
> > > > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Set an Ethernet device owner.
> > > > > > > + *
> > > > > > > + * @param	port_id
> > > > > > > + *  The identifier of the port to own.
> > > > > > > + * @param	owner
> > > > > > > + *  The owner pointer.
> > > > > > > + * @return
> > > > > > > + *  Negative errno value on error, 0 on success.
> > > > > > > + */
> > > > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > > > + *
> > > > > > > + * @param	port_id
> > > > > > > + *  The identifier of port to make ownerless.
> > > > > > > + * @param	owner_id
> > > > > > > + *  The owner identifier.
> > > > > > > + * @return
> > > > > > > + *  0 on success, negative errno value on error.
> > > > > > > + */
> > > > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > +uint16_t owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Remove owner from all Ethernet devices owned by a specific
> > > > owner.
> > > > > > > + *
> > > > > > > + * @param	owner_id
> > > > > > > + *  The owner identifier.
> > > > > > > + */
> > > > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Get the owner of an Ethernet device.
> > > > > > > + *
> > > > > > > + * @param	port_id
> > > > > > > + *  The port identifier.
> > > > > > > + * @return
> > > > > > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > > > > > + */
> > > > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > > > +uint16_t port_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > >   * Get the total number of Ethernet devices that have been successfully
> > > > > > >   * initialized by the matching Ethernet driver during the PCI probing phase
> > > > > > >   * and that are available for applications to use. These devices must be
> > > > > > > diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > index e9681ac..7d07edb 100644
> > > > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > > > >
> > > > > > >  } DPDK_17.08;
> > > > > > >
> > > > > > > +DPDK_18.02 {
> > > > > > > +	global:
> > > > > > > +
> > > > > > > +	rte_eth_find_next_owned_by;
> > > > > > > +	rte_eth_dev_owner_new;
> > > > > > > +	rte_eth_dev_owner_set;
> > > > > > > +	rte_eth_dev_owner_remove;
> > > > > > > +	rte_eth_dev_owner_delete;
> > > > > > > +	rte_eth_dev_owner_get;
> > > > > > > +
> > > > > > > +} DPDK_17.11;
> > > > > > > +
> > > > > > >  EXPERIMENTAL {
> > > > > > >  	global:
> > > > > > >
> > > > > > > --
> > > > > > > 1.8.3.1
> > > > > > >
> > > > > > >
> > > > >
> > > > > --
> > > > > Gaëtan Rivet
> > > > > 6WIND
> > > > >
  
Ananyev, Konstantin Dec. 5, 2017, 11:44 a.m. UTC | #16
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Tuesday, December 5, 2017 11:12 AM
> To: Matan Azrad <matan@mellanox.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi Matan,
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Sunday, December 3, 2017 1:47 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > Hi Konstantine
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Sunday, December 3, 2017 1:10 PM
> > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > >
> > >
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > Hi
> > > >
> > > > > -----Original Message-----
> > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > Hello Matan, Neil,
> > > > > >
> > > > > > I like the port ownership concept. I think it is needed to clarify
> > > > > > some operations and should be useful to several subsystems.
> > > > > >
> > > > > > This patch could certainly be sub-divided however, and your
> > > > > > current
> > > > > > 1/5 should probably come after this one.
> > > > > >
> > > > > > Some comments inline.
> > > > > >
> > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > Making it explicit is better for the following reasons:
> > > > > > > > 1. It may be convenient for multi-process applications to know
> > > > > > > >    which process is in charge of a port.
> > > > > > > > 2. A library could work on top of a port.
> > > > > > > > 3. A port can work on top of another port.
> > > > > > > >
> > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > We need to check that the user is not trying to use a port
> > > > > > > > which is already managed by fail-safe.
> > > > > > > >
> > > > > > > > Add an ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > > multiple management of a device by different DPDK entities.
> > > > > > > >
> > > > > > > > A port owner is built from an owner id (number) and an owner
> > > > > > > > name (string). The owner id must be unique to distinguish
> > > > > > > > between two identical entity instances, while the owner name
> > > > > > > > can be any name.
> > > > > > > > The name helps to logically recognize the owner by different
> > > > > > > > DPDK entities and allows easy debugging.
> > > > > > > > Each DPDK entity can allocate a unique owner identifier and
> > > > > > > > can use it and its preferred name to own valid ethdev ports.
> > > > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > > > can manage the port or not.
> > > > > > > >
> > > > > > > > The current ethdev internal port management is not affected by
> > > > > > > > this feature.
> > > > > > > >
> > > > > >
> > > > > > The internal port management is not affected, but the external
> > > > > > interface is, however. In order to respect port ownership,
> > > > > > applications are forced to modify their port iterator, as shown by
> > > > > > your testpmd patch.
> > > > > >
> > > > > > I think it would be better to modify the current
> > > > > > RTE_ETH_FOREACH_DEV to call RTE_ETH_FOREACH_DEV_OWNED_BY, and
> > > > > > introduce a default owner that would represent the application
> > > > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > > > specific additional configuration should this default subset of
> > > > > > ethdev be divided.
> > > > > >
> > > > > > This would make this evolution seamless for applications, at no
> > > > > > cost to the complexity of the design.
> > > > > >
> > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > >
> > > > > > >
> > > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > > ownership on a port while another is checking it on another CPU
> > > > > > > in parallel?  It doesn't seem like it will protect against that at all.
> > > > > > > It also doesn't protect against the possibility of multiple
> > > > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > > part of Thomas's original statement regarding port ownership
> > > > > > > (he noted that the lockless design implied only a single thread
> > > > > > > should be allowed to poll for receive or make configuration
> > > > > > > changes at a time).
> > > > > > >
> > > > > > > Neil
> > > > > > >
> > > > > >
> > > > > > Isn't this race already there for any configuration operation /
> > > > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > > > Why should port ownership be designed differently from other DPDK
> > > > > > components?
> > > > > >
> > > > > Yes, but that doesn't mean it should exist in perpetuity, nor does
> > > > > it mean that your new API should contain it as well.
> > > > >
> > > > > > Embedding checks for port ownership within operations will force
> > > > > > everyone to bear their costs, even those not interested in using
> > > > > > this API. These checks should be kept outside, within the entity
> > > > > > claiming ownership of the port, in the form of using the proper
> > > > > > port iterator IMO.
> > > > > >
> > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > That is to say, you should at least ensure that a port ownership get
> > > > > call doesn't race with a port ownership set call.  It seems
> > > > > ridiculous to just leave that sort of locking as an exercise for the user.
> > > > >
> > > > > Neil
> > > > >
> > > > Neil,
> > > > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > > > thread (or synchronized DPDK entity).
> > > > So all the port management, including port ownership, doesn't need to be
> > > > synchronized, i.e. locks are not needed.
> > > > If some application wants to do dual-thread port management, the
> > > > responsibility to synchronize the port ownership or any other port
> > > > management is on this application.
> > > > Port ownership is not meant to allow synchronized management of the
> > > > port by two DPDK entities in parallel; it is just a nice way to answer
> > > > the following questions:
> > > > 	1. Is the port already owned by some DPDK entity (in the early
> > > > 	   control path)?
> > > > 	2. If yes, who is the owner?
> > > > If the answer to the first question is no, the current entity can take
> > > > the ownership without any lock (1 thread).
> > > > If the answer to the first question is yes, you can recognize the
> > > > owner and make decisions accordingly. Sometimes you can decide to use
> > > > the port because you logically know what the current owner does and
> > > > you can be logically synchronized with it; sometimes you can just
> > > > leave this port because you have no dealings with this owner.
> > >
> > > If the goal is just to be able to recognize that a device is managed by
> > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > to the rte_eth_dev_data of the owner (NULL would mean no owner).
> >
> > I think a string is better than a pointer for the following reasons:
> > 1. It is more human-friendly than pointers for debugging and printing.
> 
> We can have a function that would take an owner pointer and produce a
> nicely formatted text explanation: "owned by fail-safe device at port X" or so.
> 
> > 2. It is flexible and allows forwarding a logical owner message to other DPDK entities.
> 
> Hmm, and why do you want to do that?
> There are a dozen well-defined IPC mechanisms in the POSIX world; why do we need to create
> a new one?
> Especially considering how limited and error-prone the new one is.
> 
> >
> > > Also I think if we'd like to introduce that mechanism, then it needs to be
> > > - mandatory (the control API just doesn't allow changes to the device configuration
> > > if the caller is not the owner).
> >
> > But what if 2 DPDK entities should manage or use the same port and they are synchronized?
> 
> You mean 2 DPDK processes (primary/secondary), right?
> As you mentioned below - ownership could be set only by the primary.
> So from the perspective of synchronizing access to the device between multiple processes -
> it seems useless anyway.
> What I am talking about is synchronizing access to the low-level device from
> different high-level entities.
> Let's say we have 2 failsafe devices (or 2 bonded devices) -
> that mechanism will help to ensure that only one of them can own the device.
> Again, if the user by mistake tries to manage a device that is owned by a failsafe device -
> he wouldn't be able to do that.
> 
> >
> > > - transparent to the user (no API changes).
> >
> > For now, there is no API change, only a newly suggested API to use for port iteration.
> 
> Sorry, I probably wasn't clear here.
> What I meant - this API to set/get ownership should be sort of internal to the ethdev layer.
> Let's say it would be used for a failsafe/bonding (or any other compound) device that needs
> to own/manage several low-level devices.
> So in a normal situation the user wouldn't need to use that API directly at all.
> 
> >
> > >  - set/get owner ops need to be atomic if we want this mechanism to be
> > > usable for MP.
> >
> > But this mechanism is usable in MP even without atomics.
> > For example:
> > The PRIMARY application can set its owner with the string "primary A".
> > A SECONDARY process (which attaches to the ports only after the primary has created them) is not
> > allowed to set the owner (as you can see in the code),
> > but it can read the owner string and see that the port owner is the primary application.
> > The "A" can signal a specific port type to the SECONDARY: this port works with logic A, which
> > means, for example, that the primary should send the packets and the secondary should receive the packets.
> 
> Even if the secondary process is not allowed to modify that string, it might decide to read it at the moment
> when the primary decides to change it again (clear/set owner).
> In that situation the secondary will end up either reading junk or just crashing.
> But anyway, as I said above - I don't think it is a good idea to have strings here and
> use them as an IPC mechanism.

Just forgot to mention - I don't think it is a good idea to disallow a secondary process from setting the owner.
Let's say in a secondary process I have a few tap/ring/pcap devices.
Why shouldn't it be allowed to unite them under a bonding device and make that device own them?
That's why I think get/set owner should be atomic.
If the owner is just a pointer, the get operation will be atomic by nature, and
set could be implemented just by CAS.
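
A minimal sketch of the pointer-plus-CAS idea, for illustration only. It
assumes C11 <stdatomic.h> rather than DPDK's own atomic helpers, and the
struct and function names (port_data, port_owner_set, port_owner_get,
port_owner_remove) are hypothetical, not part of the patch. It also ignores
whether the pointer value is meaningful in another process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a port's shared data; not the real rte_eth_dev_data. */
struct port_data {
	_Atomic(void *) owner; /* NULL means the port has no owner. */
};

/* Claim the port for new_owner; succeeds only if the port is currently ownerless. */
static int
port_owner_set(struct port_data *port, void *new_owner)
{
	void *expected = NULL;

	assert(new_owner != NULL);
	if (atomic_compare_exchange_strong(&port->owner, &expected, new_owner))
		return 0;
	return -1; /* already owned; 'expected' now holds the current owner */
}

/* Read the current owner with a single atomic load. */
static void *
port_owner_get(struct port_data *port)
{
	return atomic_load(&port->owner);
}

/* Release the port; succeeds only if 'owner' is the current owner. */
static int
port_owner_remove(struct port_data *port, void *owner)
{
	void *expected = owner;

	if (atomic_compare_exchange_strong(&port->owner, &expected, NULL))
		return 0;
	return -1;
}
```

With this shape, two entities racing to claim the same port cannot both
succeed: exactly one compare-and-swap sees NULL and wins.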
Konstantin 

> 
> Konstantin
> 
> 
> 
> > >
> > >
> > >
> > >
> > >
> > > >
> > > > > > > > ---
> > > > > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > > > > ++++++++++++++++++++++++++++++++
> > > > > > > >  lib/librte_ether/rte_ethdev.h           |  86
> > > +++++++++++++++++++++++
> > > > > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > index 6a0c9f9..af639a1 100644
> > > > > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without
> > > > > > > > SW lock. This PMD feature found in som
> > > > > > > >
> > > > > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > > > > capability probing details.
> > > > > > > >
> > > > > > > > -Device Identification and Configuration
> > > > > > > > +Device Identification, Ownership  and Configuration
> > > > > > > >  ---------------------------------------
> > > > > > > >
> > > > > > > >  Device Identification
> > > > > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports
> > > > > > > > are
> > > > > assigned two other identifiers:
> > > > > > > >  *   A port name used to designate the port in console messages, for
> > > > > administration or debugging purposes.
> > > > > > > >      For ease of use, the port name includes the port index.
> > > > > > > >
> > > > > > > > +Port Ownership
> > > > > > > > +~~~~~~~~~~~~~
> > > > > > > > +The Ethernet devices ports can be owned by a single DPDK
> > > > > > > > +entity
> > > > > (application, library, PMD, process, etc).
> > > > > > > > +The ownership mechanism is controlled by ethdev APIs and
> > > > > > > > +allows to
> > > > > set/remove/get a port owner by DPDK entities.
> > > > > > > > +Allowing this should prevent any multiple management of
> > > > > > > > +Ethernet
> > > > > port by different entities.
> > > > > > > > +
> > > > > > > > +.. note::
> > > > > > > > +
> > > > > > > > +    It is the DPDK entity responsibility either to check the
> > > > > > > > + port owner
> > > > > before using it or to set the port owner to prevent others from using it.
> > > > > > > > +
> > > > > > > >  Device Configuration
> > > > > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > > > > >
> > > > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > > > @@ -71,6 +71,7 @@
> > > > > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > > > > +static uint16_t rte_eth_next_owner_id =
> > > RTE_ETH_DEV_NO_OWNER
> > > > > + 1;
> > > > > > > >  static uint8_t eth_dev_last_created_port;
> > > > > > > >
> > > > > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > > > > struct rte_eth_dev *
> > > > > > > >  	if (eth_dev == NULL)
> > > > > > > >  		return -EINVAL;
> > > > > > > >
> > > > > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > > > > +rte_eth_dev_owner));
> > > > > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > > > >  	return 0;
> > > > > > > >  }
> > > > > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > > > > >  		return 1;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static int
> > > > > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER
> > > &&
> > > > > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> > > owner_id);
> > > > > > > > +		return 0;
> > > > > > > > +	}
> > > > > > > > +	return 1;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +uint16_t
> > > > > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > > > > +owner_id) {
> > > > > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > > > > +	       (rte_eth_devices[port_id].state !=
> > > RTE_ETH_DEV_ATTACHED ||
> > > > > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > > > > +		port_id++;
> > > > > > > > +
> > > > > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > > > > +		return RTE_MAX_ETHPORTS;
> > > > > > > > +
> > > > > > > > +	return port_id;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int
> > > > > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > > cannot own
> > > > > ports.\n");
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum
> > > number of
> > > > > Ethernet port owners.\n");
> > > > > > > > +		return -EUSERS;
> > > > > > > > +	}
> > > > > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > > > > +	return 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int
> > > > > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > > +	int ret;
> > > > > > > > +
> > > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > > cannot own
> > > > > ports.\n");
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > > > > +		return -EINVAL;
> > > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > > > > +	    port_owner->id != owner->id) {
> > > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > > +			"Cannot set owner to port %d already owned
> > > by
> > > > > %s_%05d.\n",
> > > > > > > > +			port_id, port_owner->name, port_owner-
> > > >id);
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	ret = snprintf(port_owner->name,
> > > > > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > > > > +		       owner->name);
> > > > > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > > > > +		memset(port_owner->name, 0,
> > > > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > > > +		return -EINVAL;
> > > > > > > > +	}
> > > > > > > > +	port_owner->id = owner->id;
> > > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> > > port_id,
> > > > > > > > +			    owner->name, owner->id);
> > > > > > > > +	return 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int
> > > > > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > > +uint16_t
> > > > > > > > +owner_id) {
> > > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > > +		return -EINVAL;
> > > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > > +	if (port_owner->id != owner_id) {
> > > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > > +			"Cannot remove port %d owner %s_%05d by
> > > > > different owner id %5d.\n",
> > > > > > > > +			port_id, port_owner->name, port_owner-
> > > >id,
> > > > > owner_id);
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > > > > removed.\n", port_id,
> > > > > > > > +			port_owner->name, port_owner->id);
> > > > > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > > > > +	return 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +void
> > > > > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > > > > +	uint16_t p;
> > > > > > > > +
> > > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > > +		return;
> > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > > > > +			    "%05d identifier has removed.\n",
> > > owner_id); }
> > > > > > > > +
> > > > > > > > +const struct rte_eth_dev_owner * rte_eth_dev_owner_get(const
> > > > > > > > +uint16_t port_id) {
> > > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > > > > RTE_ETH_DEV_NO_OWNER)
> > > > > > > > +		return NULL;
> > > > > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  int
> > > > > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > > > > a/lib/librte_ether/rte_ethdev.h
> > > > > > > > b/lib/librte_ether/rte_ethdev.h index 341c2d6..f54c26d 100644
> > > > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > > > > >
> > > > > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > > > > >
> > > > > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > > > > +
> > > > > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > > > > +
> > > > > > > > +struct rte_eth_dev_owner {
> > > > > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The
> > > owner
> > > > > name. */
> > > > > > > > +};
> > > > > > > > +
> > > > > > > >  /**
> > > > > > > >   * @internal
> > > > > > > >   * The data part, with no function pointers, associated with
> > > > > > > > each
> > > > > ethernet device.
> > > > > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > > > > >  	int numa_node;  /**< NUMA node connection */
> > > > > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > > > > >  	/**< VLAN filter configuration. */
> > > > > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > > > > >  };
> > > > > > > >
> > > > > > > >  /** Device supports link state interrupt */ @@ -1846,6
> > > > > > > > +1856,82 @@ struct rte_eth_dev_data {
> > > > > > > >
> > > > > > > >
> > > > > > > >  /**
> > > > > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > > > > + *
> > > > > > > > + * @param port_id
> > > > > > > > + *   The id of the next possible valid owned port.
> > > > > > > > + * @param	owner_id
> > > > > > > > + *  The owner identifier.
> > > > > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid
> > > > > > > > + ownerless
> > > > > ports.
> > > > > > > > + * @return
> > > > > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > > > > there is none.
> > > > > > > > + */
> > > > > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > > > > +uint16_t owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Macro to iterate over all enabled ethdev ports owned by a
> > > > > > > > +specific
> > > > > owner.
> > > > > > > > + */
> > > > > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Get a new unique owner identifier.
> > > > > > > > + * An owner identifier is used to owns Ethernet devices by
> > > > > > > > +only one DPDK entity
> > > > > > > > + * to avoid multiple management of device by different entities.
> > > > > > > > + *
> > > > > > > > + * @param	owner_id
> > > > > > > > + *   Owner identifier pointer.
> > > > > > > > + * @return
> > > > > > > > + *   Negative errno value on error, 0 on success.
> > > > > > > > + */
> > > > > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Set an Ethernet device owner.
> > > > > > > > + *
> > > > > > > > + * @param	port_id
> > > > > > > > + *  The identifier of the port to own.
> > > > > > > > + * @param	owner
> > > > > > > > + *  The owner pointer.
> > > > > > > > + * @return
> > > > > > > > + *  Negative errno value on error, 0 on success.
> > > > > > > > + */
> > > > > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > > > > + *
> > > > > > > > + * @param	port_id
> > > > > > > > + *  The identifier of port to make ownerless.
> > > > > > > > + * @param	owner
> > > > > > > > + *  The owner identifier.
> > > > > > > > + * @return
> > > > > > > > + *  0 on success, negative errno value on error.
> > > > > > > > + */
> > > > > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > > +uint16_t owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Remove owner from all Ethernet devices owned by a specific
> > > > > owner.
> > > > > > > > + *
> > > > > > > > + * @param	owner
> > > > > > > > + *  The owner identifier.
> > > > > > > > + */
> > > > > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Get the owner of an Ethernet device.
> > > > > > > > + *
> > > > > > > > + * @param	port_id
> > > > > > > > + *  The port identifier.
> > > > > > > > + * @return
> > > > > > > > + *  NULL when the device is ownerless, else the device owner
> > > pointer.
> > > > > > > > + */
> > > > > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > > > > +uint16_t port_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > >   * Get the total number of Ethernet devices that have been
> > > > > successfully
> > > > > > > >   * initialized by the matching Ethernet driver during the PCI
> > > > > > > > probing
> > > > > phase
> > > > > > > >   * and that are available for applications to use. These
> > > > > > > > devices must be diff --git
> > > > > > > > a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > index e9681ac..7d07edb 100644
> > > > > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > > > > >
> > > > > > > >  } DPDK_17.08;
> > > > > > > >
> > > > > > > > +DPDK_18.02 {
> > > > > > > > +	global:
> > > > > > > > +
> > > > > > > > +	rte_eth_find_next_owned_by;
> > > > > > > > +	rte_eth_dev_owner_new;
> > > > > > > > +	rte_eth_dev_owner_set;
> > > > > > > > +	rte_eth_dev_owner_remove;
> > > > > > > > +	rte_eth_dev_owner_delete;
> > > > > > > > +	rte_eth_dev_owner_get;
> > > > > > > > +
> > > > > > > > +} DPDK_17.11;
> > > > > > > > +
> > > > > > > >  EXPERIMENTAL {
> > > > > > > >  	global:
> > > > > > > >
> > > > > > > > --
> > > > > > > > 1.8.3.1
> > > > > > > >
> > > > > > > >
> > > > > >
> > > > > > --
> > > > > > Gaëtan Rivet
> > > > > > 6WIND
> > > > > >
  
Thomas Monjalon Dec. 5, 2017, 11:47 a.m. UTC | #17
Hi,

I will give my view on locking and synchronization in a different email.
Let's discuss the API here.

05/12/2017 12:12, Ananyev, Konstantin:
> From: Matan Azrad [mailto:matan@mellanox.com]
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]

> > > If the goal is just to be able to recognize that a device is managed by
> > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > to the rte_eth_dev_data of the owner (NULL would mean no owner).
> > 
> > I think a string is better than a pointer for the following reasons:
> > 1. It is more human-friendly than pointers for debugging and printing.
> 
> We can have a function that would take an owner pointer and produce a
> nicely formatted text explanation: "owned by fail-safe device at port X" or so.

I don't think it is possible or convenient to have such a function.
Keep in mind that the owner can be an application thread.
If you prefer using a single pointer (which may help for an
atomic implementation), we can allocate an owner structure containing
a name as a string to identify the owner in a human-readable format.
Then we just have to set the pointer to this struct in rte_eth_dev_data.
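
A rough sketch of such an allocated owner structure, for illustration only.
The names (struct owner, owner_alloc, owner_describe) are hypothetical and
plain calloc() stands in for whatever allocator ethdev would really use; only
the 64-byte name limit is taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define OWNER_NAME_LEN 64 /* same size as RTE_ETH_MAX_OWNER_NAME_LEN in the patch */

/* Hypothetical owner record; the port would store only a pointer to it. */
struct owner {
	char name[OWNER_NAME_LEN];
};

/* Allocate an owner record; returns NULL if the name is missing or too long. */
static struct owner *
owner_alloc(const char *name)
{
	struct owner *o;
	size_t len;

	if (name == NULL)
		return NULL;
	len = strlen(name);
	if (len >= OWNER_NAME_LEN)
		return NULL;
	o = calloc(1, sizeof(*o));
	if (o == NULL)
		return NULL;
	memcpy(o->name, name, len + 1); /* includes the terminating NUL */
	return o;
}

/* Format a human-readable description; a NULL owner means "no owner". */
static int
owner_describe(const struct owner *o, char *buf, size_t len)
{
	assert(buf != NULL);
	if (o == NULL)
		return snprintf(buf, len, "no owner");
	return snprintf(buf, len, "owned by %s", o->name);
}
```

The owner can then be any entity that is able to name itself, including an
application thread, while the port still holds a single pointer-sized field.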


> What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> Let say it would be used for failsafe/bonding (any other compound) device that needs
> to own/manage several low-level devices.
> So in normal situation user wouldn't need to use that API directly at all.

Again, the application may use this API to declare its ownership.
And anyway, it may be interesting from an application point of view
to be able to list all devices and their internal owners.
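
The owner structure suggested above (a name string that identifies the owner, referenced by pointer from the device data) could look roughly like the sketch below. Every name here (sketch_owner, sketch_dev_data, sketch_owner_describe) is hypothetical and only illustrates the idea; none of it is taken from the patch or from rte_ethdev.h.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_OWNER_NAME_LEN 64 /* hypothetical limit, not a DPDK constant */

/* Hypothetical owner record: a unique id plus a human-readable name. */
struct sketch_owner {
	uint64_t id;
	char name[SKETCH_OWNER_NAME_LEN];
};

/* Hypothetical stand-in for the per-port data; NULL means "no owner". */
struct sketch_dev_data {
	struct sketch_owner *owner;
};

/*
 * Format a one-line description of who owns the port.
 * Works the same whether the owner is a PMD, a library or an application,
 * because it only relies on the name stored in the owner record.
 */
static int
sketch_owner_describe(const struct sketch_dev_data *data, uint16_t port_id,
		      char *buf, size_t len)
{
	assert(data != NULL && buf != NULL);
	if (data->owner == NULL)
		return snprintf(buf, len, "port %u: no owner",
				(unsigned int)port_id);
	return snprintf(buf, len, "port %u: owned by %s (id %" PRIu64 ")",
			(unsigned int)port_id, data->owner->name,
			data->owner->id);
}
```

With such a record an application can list each port together with its owner name, without the description function needing to know what kind of entity the owner is.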
  
Thomas Monjalon Dec. 5, 2017, 11:53 a.m. UTC | #18
05/12/2017 12:44, Ananyev, Konstantin:
> Just forgot to mention - I don't think it is good idea to disallow secondary process to set the owner.

I think we all agree on that.
My initial suggestion was to use the ownership in secondary processes.
I think Matan forbade it as a first step because there is no
multi-process synchronization currently.

> Let say  in secondary process I have few tap/ring/pcap devices.
> Why it shouldn't be allowed to unite them under bonding device and make that device to own them?
> That's why I think get/set owner better to be atomic.
> If the owner is just a pointer - in that case get operation will be atomic by nature,
> set could be implemented just by CAS.

It would be perfect.
Can we be sure that the atomic will work perfectly on shared memory?
On every architecture?
  
Bruce Richardson Dec. 5, 2017, 2:56 p.m. UTC | #19
On Tue, Dec 05, 2017 at 12:53:36PM +0100, Thomas Monjalon wrote:
> 05/12/2017 12:44, Ananyev, Konstantin:
> > Just forgot to mention - I don't think it is good idea to disallow secondary process to set the owner.
> 
> I think we all agree on that.
> My initial suggestion was to use the ownership in secondary processes.
> I think Matan forbid it as a first step because there is no
> multi-process synchronization currently.
> 
> > Let say  in secondary process I have few tap/ring/pcap devices.
> > Why it shouldn't be allowed to unite them under bonding device and make that device to own them?
> > That's why I think get/set owner better to be atomic.
> > If the owner is just a pointer - in that case get operation will be atomic by nature,
> > set could be implemented just by CAS.
> 
> It would be perfect.
> Can we be sure that the atomic will work perfectly on shared memory?

The sharing of memory is an OS-level construct in managing page tables,
more than anything else. For atomic operations, a memory address is a
memory address, whether it is shared or private to a process.

> On every architectures?

All architectures should have an atomic compare-and-set equivalent
operation for their native pointer size. In the unlikely case we have to
support one that doesn't, we can special-case that in some other way.
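
As an illustration of the pointer-plus-CAS idea discussed above, here is a minimal sketch using C11 atomics. All names (sketch_owner, sketch_port_owner, sketch_owner_claim, and so on) are hypothetical and are not the API proposed in the patch. One assumption worth stating: for this to be meaningful across processes, both the owner slots and the owner objects they point to would have to live in memory mapped by every process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32 /* hypothetical, stands in for the real port limit */

/* Hypothetical owner record; only its address is used as the identity. */
struct sketch_owner {
	const char *name;
};

/* One owner slot per port; NULL means the port has no owner. */
static _Atomic(struct sketch_owner *) sketch_port_owner[SKETCH_MAX_PORTS];

/* Get: a single atomic load of the pointer. */
static struct sketch_owner *
sketch_owner_get(uint16_t port_id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	return atomic_load(&sketch_port_owner[port_id]);
}

/* Set: succeeds (0) only if the slot still holds NULL, otherwise -1. */
static int
sketch_owner_claim(uint16_t port_id, struct sketch_owner *owner)
{
	struct sketch_owner *expected = NULL;

	assert(port_id < SKETCH_MAX_PORTS);
	return atomic_compare_exchange_strong(&sketch_port_owner[port_id],
					      &expected, owner) ? 0 : -1;
}

/* Release: succeeds (0) only if the caller is the current owner. */
static int
sketch_owner_release(uint16_t port_id, struct sketch_owner *owner)
{
	struct sketch_owner *expected = owner;

	assert(port_id < SKETCH_MAX_PORTS);
	return atomic_compare_exchange_strong(&sketch_port_owner[port_id],
					      &expected, NULL) ? 0 : -1;
}
```

Exactly one of two concurrent claims on the same port can win the compare-and-swap, and the loser gets an error instead of a false success.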
  
Ananyev, Konstantin Dec. 5, 2017, 2:57 p.m. UTC | #20
> > Just forgot to mention - I don't think it is good idea to disallow secondary process to set the owner.
>
> I think we all agree on that.
> My initial suggestion was to use the ownership in secondary processes.
> I think Matan forbid it as a first step because there is no
> multi-process synchronization currently.
>
> > Let say in secondary process I have few tap/ring/pcap devices.
> > Why it shouldn't be allowed to unite them under bonding device and make that device to own them?
> > That's why I think get/set owner better to be atomic.
> > If the owner is just a pointer - in that case get operation will be atomic by nature,
> > set could be implemented just by CAS.
>
> It would be perfect.
> Can we be sure that the atomic will work perfectly on shared memory?
> On every architectures?

I believe so - how else would rte_ring and rte_mbuf work for MP? :)
Konstantin
  
Ananyev, Konstantin Dec. 5, 2017, 3:13 p.m. UTC | #21
Hi Thomas,

> Hi,
 
> I will give my view on locking and synchronization in a different email.
> Let's discuss about the API here.
 
> 05/12/2017 12:12, Ananyev, Konstantin:
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
 
> > > > If the goal is just to have an ability to recognize is that device is managed by
> > > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> > > 
> > > I think string is better than a pointer from the next reasons:
> > > 1. It is more human friendly than pointers for debug and printing.
> > 
> > We can have a function that would take an owner pointer and produce nice
> > pretty formatted text explanation: "owned by fail-safe device at port X" or so.
 
> I don't think it is possible or convenient to have such function.

Why do you think it is not possible?

> Keep in mind that the owner can be an application thread.
> If you prefer using a single function pointer (may help for
> atomic implementation), we can allocate an owner structure containing
> a name as a string to identify the owner in human readable format.
> Then we just have to set the pointer of this struct to rte_eth_dev_data.

Basically you'd like to have the ability to set something other than a
pointer to rte_eth_dev_data as an owner, right?
I think this is possible too, I'm just not sure it will be useful.
 
> > What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> > Let say it would be used for failsafe/bonding (any other compound) device that needs
> > to own/manage several low-level devices.
> > So in normal situation user wouldn't need to use that API directly at all.
 
> Again, the application may use this API to declare its ownership.

Could you explain that a bit: what would 'application declares an ownership on device' mean?
Does it mean that no other application will be allowed to do any control op on that device
until the application clears its ownership?
I.e. make sure that at each moment only one particular thread can modify device configuration?
Or would it be totally informal, and a second application would be free to ignore it?

If it is the second one - I personally don't see much point in it.
If it is the first one - then the simplest and most straightforward way would be to
introduce a mutex (either per device or just per whole rte_eth_dev[]) and force
each control op to grab it on entry and release it on exit.
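
A minimal sketch of the "one mutex for the whole device array" variant, assuming POSIX threads. The names (sketch_dev, sketch_ctrl_lock, sketch_dev_set_mtu) and the MTU floor of 68 are illustrative assumptions, not ethdev code. A statically initialized mutex like this one only serializes threads of a single process; covering secondary processes would need a process-shared mutex placed in shared memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32 /* hypothetical port limit */
#define SKETCH_MIN_MTU   68 /* illustrative lower bound only */

/* Hypothetical per-port state touched by control operations. */
struct sketch_dev {
	uint16_t mtu;
};

static struct sketch_dev sketch_devs[SKETCH_MAX_PORTS];

/* One lock for the whole device array, as in the coarse-grained option. */
static pthread_mutex_t sketch_ctrl_lock = PTHREAD_MUTEX_INITIALIZER;

/* Example control op: takes the lock on entry and drops it on exit. */
static int
sketch_dev_set_mtu(uint16_t port_id, uint16_t mtu)
{
	int ret = 0;

	assert(port_id < SKETCH_MAX_PORTS);
	pthread_mutex_lock(&sketch_ctrl_lock);
	if (mtu < SKETCH_MIN_MTU)
		ret = -1;                       /* reject, state untouched */
	else
		sketch_devs[port_id].mtu = mtu; /* update under the lock */
	pthread_mutex_unlock(&sketch_ctrl_lock);
	return ret;
}
```

This serializes control operations but, as noted in the replies, it records nothing about who is in charge of the port.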

> And anwyway, it may be interesting from an application point of view
> to be able to list every devices and their internal owners.

Yes, sure, the application is free to call 'get' to retrieve information etc.
What I am saying is that for normal operation the application doesn't have to call that API.
I.e. we don't need to change testpmd and other apps because that API was introduced.

Konstantin
  
Thomas Monjalon Dec. 5, 2017, 3:49 p.m. UTC | #22
05/12/2017 16:13, Ananyev, Konstantin:
> 
> Hi Thomas,
> 
> > Hi,
>  
> > I will give my view on locking and synchronization in a different email.
> > Let's discuss about the API here.
>  
> > 05/12/2017 12:12, Ananyev, Konstantin:
> > >> From: Matan Azrad [mailto:matan@mellanox.com]
> >> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
>  
> > > > > If the goal is just to have an ability to recognize is that device is managed by
> > > > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> > > > 
> > > > I think string is better than a pointer from the next reasons:
> > > > 1. It is more human friendly than pointers for debug and printing.
> > > 
> > > We can have a function that would take an owner pointer and produce nice
> > > pretty formatted text explanation: "owned by fail-safe device at port X" or so.
>  
> > I don't think it is possible or convenient to have such function.
> 
> Why do you think it is not possible?

Because of applications being the owner (discussion below).

> > Keep in mind that the owner can be an application thread.
> > If you prefer using a single function pointer (may help for
> > atomic implementation), we can allocate an owner structure containing
> > a name as a string to identify the owner in human readable format.
> > Then we just have to set the pointer of this struct to rte_eth_dev_data.
> 
> Basically you'd like to have an ability to set something different then
> pointer to rte_eth_dev_data as an owner, right?

No, it can be a pointer, or an id, I don't care.

> I think this is possible too, just not sure it will useful.
>  
> > > What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> > > Let say it would be used for failsafe/bonding (any other compound) device that needs
> > > to own/manage several low-level devices.
> > > So in normal situation user wouldn't need to use that API directly at all.
>  
> > Again, the application may use this API to declare its ownership.
> 
> Could you explain that a bit: what would mean 'application declares an ownership on device'?
> Does it mean that no other application will be allowed to do any control op on that device
> till application will clear its ownership?
> I.E. make sure that at each moment only one particular thread can modify device configuration?
> Or would it be totally informal and second application will be free to ignore it?

It is informational.
It will prevent a library from taking ownership of a port controlled by the app.

> If it will be the second one - I personally don't see much point in it.
> If it the first one - then simplest and most straightforward way would be -
> introduce a mutex (either per device or just per whole rte_eth_dev[]) and force
> each control op to grab it at entrance release at exit.

No, a mutex does not provide any information.

> > And anwyway, it may be interesting from an application point of view
> > to be able to list every devices and their internal owners.
> 
> Yes sure application is free to call 'get' to retrieve information etc.
> What I am saying for normal operation - application don't have to call that API.
> I.E. - we don't need to change testpmd, etc. apps because that API was introduced.

Yes the ports can stay without any owner if the application does not fill it.
  
Neil Horman Dec. 5, 2017, 7:26 p.m. UTC | #23
On Tue, Dec 05, 2017 at 06:08:35AM +0000, Matan Azrad wrote:
><snip>
> Please look at the code again, secondary process cannot take ownership, I don't allow it!
> Actually, I think it must not be like that as no limitation for that in any other ethdev configurations.
> 
Sure you do.  Consider the following situation, two tasks, A and B running
independently on separate CPUS:

TASK A                                  | TASK B
========================================|========================================
calls rte_eth_dev_owner_new (gets id 2) | calls rte_eth_dev_owner_new (gets id 1)
                                        |
calls rte_eth_dev_owner_set on port 1   | calls rte_eth_dev_owner_set (port 1)
                                        |
sets port_owner->id = 2                 | gets removed from cpu via scheduler
                                        |
                                        | returns to continue running on cpu
                                        |
Gets interrupted immediately before     | completes rte_eth_dev_owner_set,
 return 0 statement                     |  setting port_owner->id = 1
                                        |
                                        | returns 0 from rte_eth_dev_owner_set
is scheduled back on the cpu            |
                                        |
returns 0 from rte_eth_dev_owner_set    |

In the above scenario, you've allowed two tasks to race through the ownership
set routine, and while your intended semantics indicate that task A should have
taken ownership of the port, task B actually did, and what's worse, both tasks
think they completed successfully.
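
The interleaving above comes from a check-then-set sequence that is not atomic. The sketch below contrasts that pattern with a compare-and-swap on the owner id. It is a simplified model, not the code from the patch: the arrays, the function names and the use of 0 as "no owner" are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32 /* hypothetical port limit */
#define SKETCH_NO_OWNER  0  /* assumption: id 0 means "not owned" */

static uint64_t racy_owner_id[SKETCH_MAX_PORTS];
static _Atomic uint64_t safe_owner_id[SKETCH_MAX_PORTS];

/* Check-then-set: two tasks can both pass the check before either writes. */
static int
racy_owner_set(uint16_t port_id, uint64_t id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	if (racy_owner_id[port_id] != SKETCH_NO_OWNER)
		return -1;
	/* A second task scheduled here still sees "no owner". */
	racy_owner_id[port_id] = id;
	return 0; /* may report success even though the other task overwrote us */
}

/* Compare-and-swap: the check and the write are one indivisible step. */
static int
safe_owner_set(uint16_t port_id, uint64_t id)
{
	uint64_t expected = SKETCH_NO_OWNER;

	assert(port_id < SKETCH_MAX_PORTS);
	return atomic_compare_exchange_strong(&safe_owner_id[port_id],
					      &expected, id) ? 0 : -1;
}
```

With the second form, whichever of task A or task B loses the swap gets -1, so at most one caller can believe it owns the port.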

I get that much of dpdk relies on the fact that the application either handles
all the locking, or architects itself so that a single thread of execution (or
at least only one thread at a time), is responsible for packet processing and
port configuration. If you are assuming the former, you've done a disservice to
the downstream consumer, because the locking is the intricate part of this
operation (i.e. you are requiring that the developer figure out what granularity
of locking is required such that you don't serialize too many operations that
don't need it, while maintaining enough serialization that you don't corrupt the
data that they wanted the api to manage in the first place).  If you assert that
the application should only be using a single task to do these operations to
begin with, then this API isn't overly useful, because there's only one thread
pushing data into the library and, by definition, it implicitly owns all the
ports, or at least knows which ports it shouldn't mess with (e.g. subordinate
ports in a failsafe device).

> > > Any port configuration (for example port creation and release) is not
> > internally synchronized between the primary to secondary processes so I
> > don't see any reason to synchronize port ownership.
> > Yes, and I've never agreed with that design point either, because the fact of
> > the matter is that one of two things must be true in relation to port
> > configuration:
> > 
> > 1) Either multiple processes will attempt to read/change port
> > configuration/ownership
> > 
> > or
> > 
> > 2) port ownership is implicitly given to a single task (be it a primary or
> > secondary task), and said ownership is therefore implicitly known by the
> > application
> > 
> > Either situation may be true depending on the details of the application being
> > built, but regardless, if (2) is true, then this api isn't really needed at all,
> > because the application implicitly has been designed to have a port be
> > owned by a given task. 
> 
> Here I think you miss something, the port ownership is not mainly for process DPDK entities,
> The entity can be also PMD, library, application in same process.
> For MP it is nice to have, the secondary just can see the primary owners and take decision accordingly (please read my answer to Konstatin above). 
> 
But the former is just a case of the latter (in fact worse).  If you expect
another execution context out of the control of the application to query this
API, then you are in an MP situation, and definitely need to provide mutually
exclusive access to your data.  If instead you expect all other execution
contexts to suspend (or otherwise refrain from accessing this API) while a
single task makes changes, then by definition you have already had to create
some synchronization between those contexts and are capable of informing them
of the new ownership scheme.

The bottom line is, either you expect multiple access, or you don't.  If you
expect multiple access, and you believe that said accesses are not completely under
the control of your application, you need to protect those accesses against one
another.  If you don't expect multiple access (or expect your application to
architect itself to enforce single access), then you've created an environment in
which the single accessor already has to know all the information regarding port
ownership that you store in the API.

>  If (1) is true, then all the locking required to access
> > the data of port ownership needs to be adhered to.
> > 
> 
> And all the port configurations!
> I think it is behind to this thread.
> 
> 
> > I understand that you are taking Thomas' words to mean that all paths are
> > lockless in the DPDK, and that is true as a statement of fact (in that the DPDK
> > doesn't synchronize access to internal data).  What it does do, is leave that
> > locking as an exercize for the downstream consumer of the library.  While I
> > don't agree with it, I can see that such an argument is valid for hot paths such
> > as transmission and reception where you may perhaps want to minimize
> > your locking by assigning a single task to do that work, but port configuration
> > and ownership isn't a hot path.  If you mean to allow potentially multiple
> > tasks to access configuration (including port ownership), be it frequently or
> > just occasionaly, why are you making a downstream developer recognize the
> > need for locking here outside the library?  It just seems like very bad general
> > practice to me.
> > 
> > > If the primary-secondary process want to manage(configure) same port in
> > same time, they must be synchronized by the applications, so this is the case
> > in port ownership too (actually I don't think this synchronization is realistic
> > because many configurations of the port are not in shared memory).
> > Yes, it is the case, my question is, why?  We're not talking about a time
> > sensitive execution path here.  And by avoiding it you're just making work
> > that has to be repeated by every downstream consumer.
> 
> I think you suggest to make all the ethdev configuration race safe, it is behind to this thread.
> Current ethdev implementation leave the race management to applications, so port ownership as any other port configurations should be designed in the same method.
> 
I would like to make ethdev configuration race safe, but that's a fight I've
lost.  Still, I strongly disagree: just because some parts of the dpdk leave
race safety to the user doesn't mean you have to.  It's just silly not to here.
We're not talking about a hot execution path here, we're talking about one-time
or rare configuration changes.  Inserting a lock (or other lightweight atomic
operation) costs nothing in terms of execution time, and saves downstream
consumers significant time figuring out what the best mutual exclusion strategy
is.  Why wouldn't you do that?

Neil

> > 
> > Neil
> 
>
  
Thomas Monjalon Dec. 8, 2017, 11:06 a.m. UTC | #24
05/12/2017 20:26, Neil Horman:
> I get that much of dpdk relies on the fact that the application either handles
> all the locking, or architects itself so that a single thread of execution (or
> at least only one thread at a time), is responsible for packet processing and
> port configuration.

Yes, for now, configuration is synchronized at application level.
It is a constraint for applications.
It may be an issue for multi-process applications,
or for libraries aiming at some device management.

The first obvious bug to fix is the race in device allocation.
It will become more real with hotplug support.
  
Thomas Monjalon Dec. 8, 2017, 11:35 a.m. UTC | #25
05/12/2017 11:05, Bruce Richardson:
> > I think you suggest to make all the ethdev configuration race safe, it
> > is behind to this thread.  Current ethdev implementation leave the
> > race management to applications, so port ownership as any other port
> > configurations should be designed in the same method.
> 
> One key difference, though, being that port ownership itself could be
> used to manage the thread-safety of the ethdev configuration. It's also
> a little different from other APIs in that I find it hard to come up
> with a scenario where it would be very useful to an application without
> also having some form of locking present in it. For other config/control
> APIs we can have the control plane APIs work without locks e.g. by
> having a single designated thread/process manage all configuration
> updates. However, as Neil points out, in such a scenario, the ownership
> concept doesn't provide any additional benefit so can be skipped
> completely. I'd view it a bit like the reference counting of mbufs -
> we can provide a lockless/non-atomic version, but for just about every
> real use-case, you want the atomic version.

I think we need to clearly describe what the thread-safety policy is
in DPDK (especially in ethdev as a first example).
Let's start with obvious things:

	1/ A queue is not protected for races with multiple Rx or Tx
			- no planned change because of performance purpose
	2/ The list of devices is racy
			- to be fixed with atomics
	3/ The configuration of different devices is thread-safe
			- the configurations are different per-device
	4/ The configuration of a given device is racy
			- can be managed by the owner of the device
	5/ The device ownership is racy
			- to be fixed with atomics

What am I missing?

I am also wondering whether the device ownership can be a separate
library used in several device class interfaces?
  
Neil Horman Dec. 8, 2017, 12:31 p.m. UTC | #26
On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> 05/12/2017 11:05, Bruce Richardson:
> > > I think you suggest to make all the ethdev configuration race safe, it
> > > is behind to this thread.  Current ethdev implementation leave the
> > > race management to applications, so port ownership as any other port
> > > configurations should be designed in the same method.
> > 
> > One key difference, though, being that port ownership itself could be
> > used to manage the thread-safety of the ethdev configuration. It's also
> > a little different from other APIs in that I find it hard to come up
> > with a scenario where it would be very useful to an application without
> > also having some form of locking present in it. For other config/control
> > APIs we can have the control plane APIs work without locks e.g. by
> > having a single designated thread/process manage all configuration
> > updates. However, as Neil points out, in such a scenario, the ownership
> > concept doesn't provide any additional benefit so can be skipped
> > completely. I'd view it a bit like the reference counting of mbufs -
> > we can provide a lockless/non-atomic version, but for just about every
> > real use-case, you want the atomic version.
> 
> I think we need to clearly describe what is the tread-safety policy
> in DPDK (especially in ethdev as a first example).
> Let's start with obvious things:
> 
> 	1/ A queue is not protected for races with multiple Rx or Tx
> 			- no planned change because of performance purpose
> 	2/ The list of devices is racy
> 			- to be fixed with atomics
> 	3/ The configuration of different devices is thread-safe
> 			- the configurations are different per-device
> 	4/ The configuration of a given device is racy
> 			- can be managed by the owner of the device
> 	5/ The device ownership is racy
> 			- to be fixed with atomics
> 
> What am I missing?
> 
There is fan out to consider here:

1) Is device configuration racy with ownership?  That is to say, can I change
ownership of a device safely while another thread that currently owns it
modifies its configuration?

2) Is device configuration racy with device addition/removal?  That is to say,
can one thread remove a device while another configures it?

There are probably many subsystem interactions that need to be addressed here.

Neil

> I am also wondering whether the device ownership can be a separate
> library used in several device class interfaces?
>
  
Thomas Monjalon Dec. 21, 2017, 5:06 p.m. UTC | #27
08/12/2017 13:31, Neil Horman:
> On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > 05/12/2017 11:05, Bruce Richardson:
> > > > I think you suggest to make all the ethdev configuration race safe, it
> > > > is behind to this thread.  Current ethdev implementation leave the
> > > > race management to applications, so port ownership as any other port
> > > > configurations should be designed in the same method.
> > > 
> > > One key difference, though, being that port ownership itself could be
> > > used to manage the thread-safety of the ethdev configuration. It's also
> > > a little different from other APIs in that I find it hard to come up
> > > with a scenario where it would be very useful to an application without
> > > also having some form of locking present in it. For other config/control
> > > APIs we can have the control plane APIs work without locks e.g. by
> > > having a single designated thread/process manage all configuration
> > > updates. However, as Neil points out, in such a scenario, the ownership
> > > concept doesn't provide any additional benefit so can be skipped
> > > completely. I'd view it a bit like the reference counting of mbufs -
> > > we can provide a lockless/non-atomic version, but for just about every
> > > real use-case, you want the atomic version.
> > 
> > I think we need to clearly describe what is the tread-safety policy
> > in DPDK (especially in ethdev as a first example).
> > Let's start with obvious things:
> > 
> > 	1/ A queue is not protected for races with multiple Rx or Tx
> > 			- no planned change because of performance purpose
> > 	2/ The list of devices is racy
> > 			- to be fixed with atomics
> > 	3/ The configuration of different devices is thread-safe
> > 			- the configurations are different per-device
> > 	4/ The configuration of a given device is racy
> > 			- can be managed by the owner of the device
> > 	5/ The device ownership is racy
> > 			- to be fixed with atomics
> > 
> > What am I missing?
> > 
> There is fan out to consider here:
> 
> 1) Is device configuration racy with ownership?  That is to say, can I change
> ownership of a device safely while another thread that currently owns it
> modifies its configuration?

If an entity steals ownership from another one, either it is agreed earlier,
or it is done by a central authority.
Once it is acked that ownership can be moved, there should not be any
configuration in progress.
So it is more a communication issue than a race.

> 2) Is device configuration racy with device addition/removal?  That is to say,
> can one thread remove a device while another configures it.

I think it is the same as two threads configuring the same device
(item 4/ above). It can be managed with port ownership.

> There are probably many subsystem interactions that need to be addressed here.
> 
> Neil
> 
> > I am also wondering whether the device ownership can be a separate
> > library used in several device class interfaces?
  
Neil Horman Dec. 21, 2017, 5:43 p.m. UTC | #28
On Thu, Dec 21, 2017 at 06:06:48PM +0100, Thomas Monjalon wrote:
> 08/12/2017 13:31, Neil Horman:
> > On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > > 05/12/2017 11:05, Bruce Richardson:
> > > > > I think you suggest to make all the ethdev configuration race safe, it
> > > > > is behind to this thread.  Current ethdev implementation leave the
> > > > > race management to applications, so port ownership as any other port
> > > > > configurations should be designed in the same method.
> > > > 
> > > > One key difference, though, being that port ownership itself could be
> > > > used to manage the thread-safety of the ethdev configuration. It's also
> > > > a little different from other APIs in that I find it hard to come up
> > > > with a scenario where it would be very useful to an application without
> > > > also having some form of locking present in it. For other config/control
> > > > APIs we can have the control plane APIs work without locks e.g. by
> > > > having a single designated thread/process manage all configuration
> > > > updates. However, as Neil points out, in such a scenario, the ownership
> > > > concept doesn't provide any additional benefit so can be skipped
> > > > completely. I'd view it a bit like the reference counting of mbufs -
> > > > we can provide a lockless/non-atomic version, but for just about every
> > > > real use-case, you want the atomic version.
> > > 
> > > I think we need to clearly describe what is the tread-safety policy
> > > in DPDK (especially in ethdev as a first example).
> > > Let's start with obvious things:
> > > 
> > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > 			- no planned change because of performance purpose
> > > 	2/ The list of devices is racy
> > > 			- to be fixed with atomics
> > > 	3/ The configuration of different devices is thread-safe
> > > 			- the configurations are different per-device
> > > 	4/ The configuration of a given device is racy
> > > 			- can be managed by the owner of the device
> > > 	5/ The device ownership is racy
> > > 			- to be fixed with atomics
> > > 
> > > What am I missing?
> > > 
> > There is fan out to consider here:
> > 
> > 1) Is device configuration racy with ownership?  That is to say, can I change
> > ownership of a device safely while another thread that currently owns it
> > modifies its configuration?
> 
> If an entity steals ownership to another one, either it is agreed earlier,
> or it is done by a central authority.
> When it is acked that ownership can be moved, there should not be any
> configuration in progress.
> So it is more a communication issue than a race.
> 
But if that's the case (specifically that mutual exclusion between port ownership
and configuration is an exercise left to an application developer), then port
ownership itself is largely meaningless within the dpdk, because the notion of
who owns the port needs to be codified within the application anyway.


> > 2) Is device configuration racy with device addition/removal?  That is to say,
> > can one thread remove a device while another configures it.
> 
> I think it is the same as two threads configuring the same device
> (item 4/ above). It can be managed with port ownership.
> 
Only if you assert that the application is required to have the port owner be
responsible for the port's deletion, which we can say, but that leads to the
issue above again.


> > There are probably many subsystem interactions that need to be addressed here.
> > 
> > Neil
> > 
> > > I am also wondering whether the device ownership can be a separate
> > > library used in several device class interfaces?
> 
> 
>
  
Matan Azrad Dec. 21, 2017, 7:37 p.m. UTC | #29
Hi

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, December 21, 2017 7:43 PM
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>; Matan
> Azrad <matan@mellanox.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 06:06:48PM +0100, Thomas Monjalon wrote:
> > 08/12/2017 13:31, Neil Horman:
> > > On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > > > 05/12/2017 11:05, Bruce Richardson:
> > > > > > I think you suggest to make all the ethdev configuration race
> > > > > > safe, it is behind to this thread.  Current ethdev
> > > > > > implementation leave the race management to applications, so
> > > > > > port ownership as any other port configurations should be designed
> in the same method.
> > > > >
> > > > > One key difference, though, being that port ownership itself
> > > > > could be used to manage the thread-safety of the ethdev
> > > > > configuration. It's also a little different from other APIs in
> > > > > that I find it hard to come up with a scenario where it would be
> > > > > very useful to an application without also having some form of
> > > > > locking present in it. For other config/control APIs we can have
> > > > > the control plane APIs work without locks e.g. by having a
> > > > > single designated thread/process manage all configuration
> > > > > updates. However, as Neil points out, in such a scenario, the
> > > > > ownership concept doesn't provide any additional benefit so can
> > > > > be skipped completely. I'd view it a bit like the reference
> > > > > counting of mbufs - we can provide a lockless/non-atomic version,
> but for just about every real use-case, you want the atomic version.
> > > >
> > > > I think we need to clearly describe what is the tread-safety
> > > > policy in DPDK (especially in ethdev as a first example).
> > > > Let's start with obvious things:
> > > >
> > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > 			- no planned change because of performance
> purpose
> > > > 	2/ The list of devices is racy
> > > > 			- to be fixed with atomics
> > > > 	3/ The configuration of different devices is thread-safe
> > > > 			- the configurations are different per-device
> > > > 	4/ The configuration of a given device is racy
> > > > 			- can be managed by the owner of the device
> > > > 	5/ The device ownership is racy
> > > > 			- to be fixed with atomics
> > > >
> > > > What am I missing?
> > > >

Thank you Thomas for this list.
Actually, port ownership is a good opportunity to redefine the synchronization rules in ethdev :)

> > > There is fan out to consider here:
> > >
> > > 1) Is device configuration racy with ownership?  That is to say, can
> > > I change ownership of a device safely while another thread that
> > > currently owns it modifies its configuration?
> >
> > If an entity steals ownership to another one, either it is agreed
> > earlier, or it is done by a central authority.
> > When it is acked that ownership can be moved, there should not be any
> > configuration in progress.
> > So it is more a communication issue than a race.
> >
> But if thats the case (specifically that mutual exclusion between port
> ownership and configuration is an exercize left to an application developer),
> then port ownership itself is largely meaningless within the dpdk, because
> the notion of who owns the port needs to be codified within the application
> anyway.
> 

Bruce, as I understand it, by default only the DPDK entity that successfully took ownership of a port can configure the device. If other DPDK entities want to configure it too, they must synchronize with the port owner, although this is not recommended once port ownership is integrated.

So, for example, if the DPDK entity is an application, the application should take ownership of the port and manage the synchronization of this port's configuration between the application threads and its EAL host thread callbacks. No other DPDK entity should configure the same port, because any such entity should fail when it tries to take ownership of that port.
Each DPDK entity that wants to take ownership must be able to synchronize the port configuration at its own level.

> 
> > > 2) Is device configuration racy with device addition/removal?  That
> > > is to say, can one thread remove a device while another configures it.
> >
> > I think it is the same as two threads configuring the same device
> > (item 4/ above). It can be managed with port ownership.
> >
> Only if you assert that application is required to have the owning port be
> responsible for the ports deletion, which we can say, but that leads to the
> issue above again.
> 
> 
As Thomas said in item 2, port creation must be synchronized by ethdev, and we need to add that there.
I think it is obvious that port removal must be done only by the port owner.


I think we need to add synchronization for port ownership management in V2 of this patch, and add port creation synchronization in ethdev in a separate patch, to satisfy the new rules Thomas suggested.

What do you think?

> > > There are probably many subsystem interactions that need to be
> addressed here.
> > >
> > > Neil
> > >
> > > > I am also wondering whether the device ownership can be a separate
> > > > library used in several device class interfaces?
> >
> >
> >
  
Neil Horman Dec. 21, 2017, 8:14 p.m. UTC | #30
On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> Hi
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, December 21, 2017 7:43 PM
> > To: Thomas Monjalon <thomas@monjalon.net>
> > Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>; Matan
> > Azrad <matan@mellanox.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 06:06:48PM +0100, Thomas Monjalon wrote:
> > > 08/12/2017 13:31, Neil Horman:
> > > > On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > > > > 05/12/2017 11:05, Bruce Richardson:
> > > > > > > I think you suggest to make all the ethdev configuration race
> > > > > > > safe, it is behind to this thread.  Current ethdev
> > > > > > > implementation leave the race management to applications, so
> > > > > > > port ownership as any other port configurations should be designed
> > in the same method.
> > > > > >
> > > > > > One key difference, though, being that port ownership itself
> > > > > > could be used to manage the thread-safety of the ethdev
> > > > > > configuration. It's also a little different from other APIs in
> > > > > > that I find it hard to come up with a scenario where it would be
> > > > > > very useful to an application without also having some form of
> > > > > > locking present in it. For other config/control APIs we can have
> > > > > > the control plane APIs work without locks e.g. by having a
> > > > > > single designated thread/process manage all configuration
> > > > > > updates. However, as Neil points out, in such a scenario, the
> > > > > > ownership concept doesn't provide any additional benefit so can
> > > > > > be skipped completely. I'd view it a bit like the reference
> > > > > > counting of mbufs - we can provide a lockless/non-atomic version,
> > but for just about every real use-case, you want the atomic version.
> > > > >
> > > > > I think we need to clearly describe what is the tread-safety
> > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > Let's start with obvious things:
> > > > >
> > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > 			- no planned change because of performance
> > purpose
> > > > > 	2/ The list of devices is racy
> > > > > 			- to be fixed with atomics
> > > > > 	3/ The configuration of different devices is thread-safe
> > > > > 			- the configurations are different per-device
> > > > > 	4/ The configuration of a given device is racy
> > > > > 			- can be managed by the owner of the device
> > > > > 	5/ The device ownership is racy
> > > > > 			- to be fixed with atomics
> > > > >
> > > > > What am I missing?
> > > > >
> 
> Thank you Thomas for this order.
> Actually the port ownership is a good opportunity to redefine the synchronization rules in ethdev :)
> 
> > > > There is fan out to consider here:
> > > >
> > > > 1) Is device configuration racy with ownership?  That is to say, can
> > > > I change ownership of a device safely while another thread that
> > > > currently owns it modifies its configuration?
> > >
> > > If an entity steals ownership to another one, either it is agreed
> > > earlier, or it is done by a central authority.
> > > When it is acked that ownership can be moved, there should not be any
> > > configuration in progress.
> > > So it is more a communication issue than a race.
> > >
> > But if thats the case (specifically that mutual exclusion between port
> > ownership and configuration is an exercize left to an application developer),
> > then port ownership itself is largely meaningless within the dpdk, because
> > the notion of who owns the port needs to be codified within the application
> > anyway.
> > 
> 
> Bruce, As I understand it, only the dpdk entity who took ownership of a port successfully can configure the device by default, if other dpdk entities want to configure it too they must to be synchronized with the port owner while it is not recommended after the port ownership integration.
> 
Can you clarify what you mean by "it is not recommended after the port ownership
integration"?  I think there is consensus that the port owner must be the only
entity to operate on a port (be that configuration/frame rx/tx, or some other
operation).  Multithreaded operation on a port always means some level of
synchronization between application threads and the dpdk library, but I'm not
sure why that would be different if we introduced a more concrete notion of port
ownership via a new library.

> So, for example,  if the dpdk entity is an application, the application should take ownership of the port and manage the synchronization of this port configuration between the application threads and its EAL host thread callbacks, no other dpdk entity should configure the same port because they should fail when they try to take ownership of the same port too.
Well, failing is one good approach, yes; blocking on port relinquishment could
be another.  I'd recommend an API with the following interface:

rte_port_ownership_claim(int port_id) - blocks execution of the calling thread
until the previous owner releases ownership, then claims it and returns

rte_port_ownership_release(int port_id) - releases ownership of port, or returns
error if the port was not owned by this execution context

rte_port_ownership_try_claim(int port_id) - same as rte_port_ownership_claim,
but fails immediately instead of blocking if the port is already owned.

That would give the option for both semantics.
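
To make the two semantics concrete, here is a minimal standalone sketch in C11.
It is not DPDK code: the names (port_ownership_*), the MAX_PORTS table and the
explicit owner id argument are all assumptions made for illustration (the
interface above takes only a port id and leaves the owner implicit). It assumes
one atomic owner word per port, uses compare-and-swap for the non-blocking
claim, and builds the blocking claim as a yield loop on top of it.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_PORTS 32    /* hypothetical table size */
#define NO_OWNER  0u    /* owner ids start at 1; 0 means "unowned" */

/* One atomic owner word per port; static storage starts out as NO_OWNER. */
static _Atomic uint32_t port_owner[MAX_PORTS];

/* Non-blocking: take the port only if it is currently unowned. */
static int port_ownership_try_claim(unsigned port, uint32_t owner)
{
	uint32_t expected = NO_OWNER;

	assert(port < MAX_PORTS && owner != NO_OWNER);
	return atomic_compare_exchange_strong(&port_owner[port], &expected, owner)
		? 0 : -EBUSY;
}

/* Blocking: keep retrying (yielding the CPU) until the owner releases. */
static void port_ownership_claim(unsigned port, uint32_t owner)
{
	while (port_ownership_try_claim(port, owner) != 0)
		sched_yield();
}

/* Release succeeds only for the id that currently holds the port. */
static int port_ownership_release(unsigned port, uint32_t owner)
{
	uint32_t expected = owner;

	assert(port < MAX_PORTS && owner != NO_OWNER);
	return atomic_compare_exchange_strong(&port_owner[port], &expected, NO_OWNER)
		? 0 : -EPERM;
}
```

A real implementation would probably sleep on a condition variable rather than
spin, and would have to place the table in shared memory for the multi-process
case; the sketch only shows that both semantics can sit on one primitive.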

> Each dpdk entity which wants to take ownership must to be able to synchronize the port configuration in its level. 
Can you elaborate on what you mean by level here?  Are you envisioning a scheme
in which multiple execution contexts might own a port for various
non-conflicting purposes?


> 
> > 
> > > > 2) Is device configuration racy with device addition/removal?  That
> > > > is to say, can one thread remove a device while another configures it.
> > >
> > > I think it is the same as two threads configuring the same device
> > > (item 4/ above). It can be managed with port ownership.
> > >
> > Only if you assert that application is required to have the owning port be
> > responsible for the ports deletion, which we can say, but that leads to the
> > issue above again.
> > 
> > 
> As Thomas said in item 2 the port creation must be synchronized by ethdev and we need to add it there. 
> I think it is obvious that port removal must to be done only by the port owner.  
> 
You say that, but it's obvious to you as a developer who has looked extensively
at the code.  It may well be less so to a consumer who is not an active member
of the community, for instance one who obtains the dpdk via a pre-built package.

> 
> I think we need to add synchronization for port ownership management in this patch V2 and add port creation synchronization in ethdev in separate patch to fill the new rules Thomas suggested.
I think that makes sense, yes. 

Neil

> 
> What do you think?
> 
> > > > There are probably many subsystem interactions that need to be
> > addressed here.
> > > >
> > > > Neil
> > > >
> > > > > I am also wondering whether the device ownership can be a separate
> > > > > library used in several device class interfaces?
> > >
> > >
> > >
>
  
Matan Azrad Dec. 21, 2017, 9:57 p.m. UTC | #31
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, December 21, 2017 10:14 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > Hi
> >
<snip>
> > > > > > I think we need to clearly describe what is the tread-safety
> > > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > > Let's start with obvious things:
> > > > > >
> > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > 			- no planned change because of performance
> > > purpose
> > > > > > 	2/ The list of devices is racy
> > > > > > 			- to be fixed with atomics
> > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > 			- the configurations are different per-device
> > > > > > 	4/ The configuration of a given device is racy
> > > > > > 			- can be managed by the owner of the device
> > > > > > 	5/ The device ownership is racy
> > > > > > 			- to be fixed with atomics
> > > > > >
> > > > > > What am I missing?
> > > > > >
> >
> > Thank you Thomas for this order.
> > Actually the port ownership is a good opportunity to redefine the
> > synchronization rules in ethdev :)
> >
> > > > > There is fan out to consider here:
> > > > >
> > > > > 1) Is device configuration racy with ownership?  That is to say,
> > > > > can I change ownership of a device safely while another thread
> > > > > that currently owns it modifies its configuration?
> > > >
> > > > If an entity steals ownership to another one, either it is agreed
> > > > earlier, or it is done by a central authority.
> > > > When it is acked that ownership can be moved, there should not be
> > > > any configuration in progress.
> > > > So it is more a communication issue than a race.
> > > >
> > > But if thats the case (specifically that mutual exclusion between
> > > port ownership and configuration is an exercize left to an
> > > application developer), then port ownership itself is largely
> > > meaningless within the dpdk, because the notion of who owns the port
> > > needs to be codified within the application anyway.
> > >
> >
> > Bruce, As I understand it, only the dpdk entity who took ownership of a
> port successfully can configure the device by default, if other dpdk entities
> want to configure it too they must to be synchronized with the port owner
> while it is not recommended after the port ownership integration.
> >
> Can you clarify what you mean by "it is not recommended after the port
> ownership integration"?

Sure,
The new definition of ethdev synchronization does not recommend managing a port from 2 different DPDK entities; it can be done, but it is not recommended.
  
>  I think there is consensus that the port owner must
> be the only entitiy to operate on a port (be that configuration/frame rx/tx, or
> some other operation).

Your question above made me think that you did not understand this: how can someone who is not the port owner change the port owner?
Changing the port owner, like port configuration and port release, must be done by the owner itself, except when the port has no owner.
See the API rte_eth_dev_owner_remove.

> Multithreaded operation on a port always means
> some level of synchronization between application threads and the dpdk
> library,
Yes.
> but I'm not sure why that would be different if we introduced a more
> concrete notion of port ownership via a new library.
>

What do you mean by "new library"? A port is an ethdev instance and should be managed by ethdev.

> > So, for example,  if the dpdk entity is an application, the application should
> > take ownership of the port and manage the synchronization of this port
> > configuration between the application threads and its EAL host thread
> > callbacks, no other dpdk entity should configure the same port because they
> > should fail when they try to take ownership of the same port too.

> Well, failing is one good approach, yes, blocking on port relenquishment
> could be another.  I'd recommend an API with the following interface:
> 
> rte_port_ownership_claim(int port_id) - blocks execution of the calling
> thread until the previous owner releases ownership, then claims it and
> returns
> 
> rte_port_ownership_release(int port_id) - releases ownership of port, or
> returns error if the port was not owned by this execution context
>
> rte_port_owernship_try_claim(int port_id) - same as
> rte_port_ownership_claim, but fails if the port is already owned.
> 
> That would give the option for both semantics.

I think the current APIs are better for the following reasons:
- They define clearly who the owner is.
- The owner structure includes a string, which allows better debugging and human-readable printing.
Did you read it?
In V2 I can add an API that waits until the port ownership is released, as you suggested.
 
> > Each dpdk entity which wants to take ownership must to be able to
> > synchronize the port configuration in its level.

> Can you elaborate on what you mean by level here?  Are you envisioning a
> scheme in which multiple execution contexts might own a port for various
> non-conflicting purposes?
 
Sure,
1) Application with 2 threads wanting to configure the same port:
	Level = application code.

	a. The main thread should create an owner identifier (rte_eth_dev_owner_new).
	b. The main thread should take ownership of the port (rte_eth_dev_owner_set).
	c. The application should synchronize the two threads for any conflicting configurations.
	d. When the application has finished using the port, it should release the ownership (rte_eth_dev_owner_remove).

2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses an alarm for hotplug detection, which can configure the 2 ports (from the host thread):
	Level = fail-safe code.
	a. The application starts the EAL and the fail-safe driver's probing function is called.
	b. Fail-safe probes the 2 sub-devices (2 ports are created) and takes ownership of them.
	c. Fail-safe creates its own port and leaves it ownerless.
	d. Fail-safe starts the hotplug alarm mechanism.
	e. The application tries to take ownership of all ports and succeeds only for the fail-safe port.
	f. The application starts to configure the fail-safe port asynchronously to the fail-safe hotplug alarm.
	g. Fail-safe must synchronize its alarm callback code with the fail-safe configuration APIs called by the application, because both try to configure the same sub-device ports.
	h. When fail-safe has finished with the two sub-devices, it should release its ownership of those ports.
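
Scenario 2 can be illustrated with a small standalone model in C. This is only
a sketch under stated assumptions, not the patch code: owner_new, owner_set and
owner_remove are hypothetical stand-ins loosely named after the
rte_eth_dev_owner_* calls above, their signatures are invented for the example,
the port table is a plain array, and the model is single-threaded (it shows who
ends up owning what, not the locking).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4     /* hypothetical table size */
#define NO_OWNER  0

struct owner {
	uint16_t id;        /* unique per entity instance; 0 = no owner */
	char name[32];      /* free-form label for logs and debugging */
};

static struct owner port_owner[MAX_PORTS];
static uint16_t next_owner_id = 1;

/* Allocate a fresh owner identifier. */
static uint16_t owner_new(void)
{
	return next_owner_id++;
}

/* Take an unowned port; fail if somebody already holds it. */
static int owner_set(unsigned port, uint16_t id, const char *name)
{
	if (port >= MAX_PORTS || id == NO_OWNER)
		return -1;
	if (port_owner[port].id != NO_OWNER)
		return -1;
	port_owner[port].id = id;
	snprintf(port_owner[port].name, sizeof(port_owner[port].name), "%s", name);
	return 0;
}

/* Only the current owner may release the port. */
static int owner_remove(unsigned port, uint16_t id)
{
	if (port >= MAX_PORTS || id == NO_OWNER || port_owner[port].id != id)
		return -1;
	memset(&port_owner[port], 0, sizeof(port_owner[port]));
	return 0;
}

/* Ports 0 and 1 are the sub-devices, port 2 is the fail-safe port itself.
 * Returns a bitmask of the ports the application managed to own. */
static unsigned failsafe_scenario(void)
{
	uint16_t fs = owner_new(), app = owner_new();
	unsigned port, owned_by_app = 0;

	owner_set(0, fs, "failsafe");    /* step b: own both sub-devices */
	owner_set(1, fs, "failsafe");
	/* step c: port 2 is deliberately left without an owner */
	for (port = 0; port < 3; port++) /* step e: application tries all ports */
		if (owner_set(port, app, "app") == 0)
			owned_by_app |= 1u << port;
	return owned_by_app;
}
```

In this model the application ends up owning only port 2, and an attempt by the
application to release a sub-device fails because it is not that port's owner.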

> >
> > >
> > > > > 2) Is device configuration racy with device addition/removal?
> > > > > That is to say, can one thread remove a device while another
> configures it.
> > > >
> > > > I think it is the same as two threads configuring the same device
> > > > (item 4/ above). It can be managed with port ownership.
> > > >
> > > Only if you assert that application is required to have the owning
> > > port be responsible for the ports deletion, which we can say, but
> > > that leads to the issue above again.
> > >
> > >
> > As Thomas said in item 2 the port creation must be synchronized by ethdev
> and we need to add it there.
> > I think it is obvious that port removal must to be done only by the port
> owner.
> >
> You say that, but its obvious to you as a developer who has looked
> extensively at the code.  It may well be less so to a consumer who is not an
> active member of the community, for instance one who obtains the dpdk via
> pre-built package.
>

Yes, I can understand that, but new rules should be documented and should be easy for customers to adopt, no?
The old way to sync configuration still exists.
 
> >
> > I think we need to add synchronization for port ownership management in
> this patch V2 and add port creation synchronization in ethdev in separate
> patch to fill the new rules Thomas suggested.
> I think that makes sense, yes.
> 
> Neil
> 
> >
> > What do you think?
> >
> > > > > There are probably many subsystem interactions that need to be
> > > addressed here.
> > > > >
> > > > > Neil
> > > > >
> > > > > > I am also wondering whether the device ownership can be a
> > > > > > separate library used in several device class interfaces?
> > > >
> > > >
> > > >
> >
  
Neil Horman Dec. 22, 2017, 2:26 p.m. UTC | #32
On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, December 21, 2017 10:14 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > Hi
> > >
> <snip>
> > > > > > > I think we need to clearly describe what is the tread-safety
> > > > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > > > Let's start with obvious things:
> > > > > > >
> > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > 			- no planned change because of performance
> > > > purpose
> > > > > > > 	2/ The list of devices is racy
> > > > > > > 			- to be fixed with atomics
> > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > 			- the configurations are different per-device
> > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > 			- can be managed by the owner of the device
> > > > > > > 	5/ The device ownership is racy
> > > > > > > 			- to be fixed with atomics
> > > > > > >
> > > > > > > What am I missing?
> > > > > > >
> > >
> > > Thank you Thomas for this order.
> > > Actually the port ownership is a good opportunity to redefine the
> > > synchronization rules in ethdev :)
> > >
> > > > > > There is fan out to consider here:
> > > > > >
> > > > > > 1) Is device configuration racy with ownership?  That is to say,
> > > > > > can I change ownership of a device safely while another thread
> > > > > > that currently owns it modifies its configuration?
> > > > >
> > > > > If an entity steals ownership to another one, either it is agreed
> > > > > earlier, or it is done by a central authority.
> > > > > When it is acked that ownership can be moved, there should not be
> > > > > any configuration in progress.
> > > > > So it is more a communication issue than a race.
> > > > >
> > > > But if thats the case (specifically that mutual exclusion between
> > > > port ownership and configuration is an exercize left to an
> > > > application developer), then port ownership itself is largely
> > > > meaningless within the dpdk, because the notion of who owns the port
> > > > needs to be codified within the application anyway.
> > > >
> > >
> > > Bruce, As I understand it, only the dpdk entity who took ownership of a
> > port successfully can configure the device by default, if other dpdk entities
> > want to configure it too they must to be synchronized with the port owner
> > while it is not recommended after the port ownership integration.
> > >
> > Can you clarify what you mean by "it is not recommended after the port
> > ownership integration"?
> 
> Sure,
> The new defining of ethdev synchronization doesn't recommend to manage a port by 2 different dpdk entities, it can be done but not recommended.
>   
Ok, that's just not what you said above.  Your suggestion made it sound like you
thought that, after the integration of a port ownership model, multiple dpdk
entities should not synchronize with one another, which made no sense.

> >  I think there is consensus that the port owner must
> > be the only entitiy to operate on a port (be that configuration/frame rx/tx, or
> > some other operation).
> 
> Your question above caused me to think that you don't understand it, How can someone who is not the port owner to change the port owner?
> Changing the port owner, like port configuration and port release must be done by the owner itself except the case that there is no owner to the port.
> See the API rte_eth_dev_owner_remove.
> 
See above, I don't think your phrasing accurately reflected what you meant to
convey. Or at least that's not how I read it.

> > Multithreaded operation on a port always means
> > some level of synchronization between application threads and the dpdk
> > library,
> Yes.
>  >but I'm not sure why that would be different if we introduced a more
> > concrete notion of port ownership via a new library.
> >
> 
> What do you mean by "new library"?, port is an ethdev instance and should be managed by ethdev.
> 
I'm referring to the port ownership api that you proposed.  Apologies, I should
not have used the term "new library", but rather "new api".

>  > > So, for example,  if the dpdk entity is an application, the application should
> >> take ownership of the port and manage the synchronization of this port
> >> configuration between the application threads and its EAL host thread
> >> callbacks, no other dpdk entity should configure the same port because they
> >> should fail when they try to take ownership of the same port too.
> 
> > Well, failing is one good approach, yes, blocking on port relenquishment
> > could be another.  I'd recommend an API with the following interface:
> > 
> > rte_port_ownership_claim(int port_id) - blocks execution of the calling
> > thread until the previous owner releases ownership, then claims it and
> > returns
> > 
> > rte_port_ownership_release(int port_id) - releases ownership of port, or
> > returns error if the port was not owned by this execution context
> >
> > rte_port_owernship_try_claim(int port_id) - same as
> > rte_port_ownership_claim, but fails if the port is already owned.
> > 
> > That would give the option for both semantics.
> 
> I think the current APIs are better because of the next reasons:
> - It defines well who is the owner.
There's no reason you can't integrate some ownership nonce into the above API as
well, that's easy to add.  The relevant part is the ability to exclude those who
are not owners (that is to say, block their progress until ownership is released
by a preceding entity).

> - The owner structure includes string to allow better debug and printing for humans. 
I've got no problem with any such internals, it's really the synchronization that I'm after.

> Did you read it?
Yes, I don't see why you would think I hadn't, I think I've been very clear in
my understanding of your initial patch.  Have you taken the time to understand my
concerns?

> I can add there an API that wait until the port ownership is released as you suggested in V2.
> 
I think that would be good.

> > > Each dpdk entity which wants to take ownership must to be able to
> > >synchronize the port configuration in its level.
> 
> > Can you elaborate on what you mean by level here?  Are you envisioning a
> > scheme in which multiple execution contexts might own a port for various
> > non-conflicting purposes?
>  
> Sure,
> 1) Application with 2 threads wanting to configure the same port:
> 	level = application code.
> 	
> 	a. The main thread should create owner identifier(rte_eth_dev_owner_new).
> 	b. The main thread should take the port ownership(rte_eth_dev_owner_set).
> 	c. Synchronization between the two threads should be done for the conflicted 		configurations by the application.
> 	d. when the application finishes the port usage it should release the owner(rte_eth_dev_owner_remove).
> 
> 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for hotplug detections which can configure the 2 ports(by the host thread).
> 	Level = fail-safe code.
> 	a. Application starts the eal and the fail-safe driver probing function is called.
> 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes ownership for them.
> 	c. Failsafe creates itself port and leaves it ownerless. 
> 	d. Failsafe starts the hotplug alarm mechanism.
> 	e. Application tries to take ownership for all ports and success only for failsafe port.
> 	f. Application start to configure the failsafe port asynchronously to failsafe hotplug alarm.
> 	g. Failsafe must use synchronization between failsafe alarm callback code and failsafe configuration APIs called by the application because they both try to configure the same sub-devices ports.
> 	h. When fail-safe finishes with the two sub devices it should release the ports owner.
> 
Ok, this I would describe as different use cases rather than parallel ownership,
in that in both cases there is still a single execution context which is
responsible for all aspects of a given port (which is fine with me, I'm just
trying to be clear in our description of the model).



> > >
> > > >
> > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > That is to say, can one thread remove a device while another
> > configures it.
> > > > >
> > > > > I think it is the same as two threads configuring the same device
> > > > > (item 4/ above). It can be managed with port ownership.
> > > > >
> > > > Only if you assert that application is required to have the owning
> > > > port be responsible for the ports deletion, which we can say, but
> > > > that leads to the issue above again.
> > > >
> > > >
> > > As Thomas said in item 2 the port creation must be synchronized by ethdev
> > and we need to add it there.
> > > I think it is obvious that port removal must to be done only by the port
> > owner.
> > >
> > You say that, but its obvious to you as a developer who has looked
> > extensively at the code.  It may well be less so to a consumer who is not an
> > active member of the community, for instance one who obtains the dpdk via
> > pre-built package.
> >
> 
> Yes I can understand, but new rules should be documented and be adjusted easy easy by the customers, no?
Ostensibly, it should be easy, yes, but in practice it's a bit murkier.  For
instance, what if an application wants to enable packet capture on an interface
via rte_pdump_enable?  Does performing that action require that the execution
context which calls that function own the port before doing so?  Digging through
the code suggests to me that it (surprisingly) does not, because all that
function does is set a socket to record packets to, but I would have
intuitively thought that enabling packet capture would require turning off the
mac filter table in the hardware, and so would have required ownership.

Conversely, using the same example, calling rte_pdump_init, using the model from
your last patch, would require that the calling execution context ensure that,
at the time the cli application issues the monitor request, the port is
unowned, because the pdump main thread needs to set rx_tx callbacks on the
requested port, which I believe constitutes a configuration change needing port
ownership.

My point being, I think saying that ownership is easy and obvious isn't
accurate.  If we are to leave proper synchronization of access to devices up to
the application, we either need to:

1) Assume downstream users are intimately familiar with the code
or
2) Exhaustively document the conditions under which ownership needs to be held

(1) is a non-starter, and (2) I think is a fairly large undertaking, but unless we
are willing to codify synchronization in the code explicitly (via locking), (2)
is what we have to do.
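
For illustration, codifying the rule in the code could look roughly like the
standalone C11 sketch below: the configuration call itself takes a per-port lock
and refuses callers that do not own the port, so the requirement does not live
only in documentation. Everything here is hypothetical (struct port,
port_set_owner, port_set_mtu, the spinlock built on an atomic int); it is not
the ethdev implementation and says nothing about which existing calls would
need such a check.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_PORTS 8     /* hypothetical table size */

struct port {
	atomic_int lock;    /* 0 = free, 1 = held; guards the fields below */
	uint32_t owner;     /* 0 = unowned */
	uint16_t mtu;       /* stand-in for "some configuration" */
};

static struct port ports[MAX_PORTS];

static void port_lock(struct port *p)
{
	while (atomic_exchange_explicit(&p->lock, 1, memory_order_acquire))
		;               /* spin until the holder stores 0 */
}

static void port_unlock(struct port *p)
{
	atomic_store_explicit(&p->lock, 0, memory_order_release);
}

/* Take ownership of an unowned port. */
static int port_set_owner(unsigned id, uint32_t owner)
{
	int ret = -EBUSY;

	if (id >= MAX_PORTS || owner == 0)
		return -EINVAL;
	port_lock(&ports[id]);
	if (ports[id].owner == 0) {
		ports[id].owner = owner;
		ret = 0;
	}
	port_unlock(&ports[id]);
	return ret;
}

/* Configuration entry point: ownership is checked under the same lock. */
static int port_set_mtu(unsigned id, uint32_t caller, uint16_t mtu)
{
	int ret = -EPERM;

	if (id >= MAX_PORTS)
		return -EINVAL;
	port_lock(&ports[id]);
	if (ports[id].owner != 0 && ports[id].owner == caller) {
		ports[id].mtu = mtu;
		ret = 0;
	}
	port_unlock(&ports[id]);
	return ret;
}
```

With a check like this, a caller that does not own the port gets -EPERM from
the call itself, whatever the documentation says.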

Neil
  
Matan Azrad Dec. 23, 2017, 10:36 p.m. UTC | #33
Hi 
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, December 22, 2017 4:27 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Thursday, December 21, 2017 10:14 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > > Hi
> > > >
> > <snip>
> > > > > > > > I think we need to clearly describe what is the
> > > > > > > > tread-safety policy in DPDK (especially in ethdev as a first
> example).
> > > > > > > > Let's start with obvious things:
> > > > > > > >
> > > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > > 			- no planned change because of performance
> > > > > purpose
> > > > > > > > 	2/ The list of devices is racy
> > > > > > > > 			- to be fixed with atomics
> > > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > > 			- the configurations are different per-device
> > > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > > 			- can be managed by the owner of the device
> > > > > > > > 	5/ The device ownership is racy
> > > > > > > > 			- to be fixed with atomics
> > > > > > > >
> > > > > > > > What am I missing?
> > > > > > > >
> > > >
> > > > Thank you Thomas for this order.
> > > > Actually the port ownership is a good opportunity to redefine the
> > > > synchronization rules in ethdev :)
> > > >
> > > > > > > There is fan out to consider here:
> > > > > > >
> > > > > > > 1) Is device configuration racy with ownership?  That is to
> > > > > > > say, can I change ownership of a device safely while another
> > > > > > > thread that currently owns it modifies its configuration?
> > > > > >
> > > > > > If an entity steals ownership to another one, either it is
> > > > > > agreed earlier, or it is done by a central authority.
> > > > > > When it is acked that ownership can be moved, there should not
> > > > > > be any configuration in progress.
> > > > > > So it is more a communication issue than a race.
> > > > > >
> > > > > But if thats the case (specifically that mutual exclusion
> > > > > between port ownership and configuration is an exercize left to
> > > > > an application developer), then port ownership itself is largely
> > > > > meaningless within the dpdk, because the notion of who owns the
> > > > > port needs to be codified within the application anyway.
> > > > >
> > > >
> > > > Bruce, As I understand it, only the dpdk entity who took ownership
> > > > of a
> > > port successfully can configure the device by default, if other dpdk
> > > entities want to configure it too they must to be synchronized with
> > > the port owner while it is not recommended after the port ownership
> integration.
> > > >
> > > Can you clarify what you mean by "it is not recommended after the
> > > port ownership integration"?
> >
> > Sure,
> > The new defining of ethdev synchronization doesn't recommend to
> manage a port by 2 different dpdk entities, it can be done but not
> recommended.
> >
> Ok, thats just not what you said above.  Your suggestion made it sound like
> you thought that  after the integration of a port ownership model, that
> multiple dpdk entries should not synchronize with one another, which made
> no sense.
> 
Ok, I can see a dual meaning in my sentence, sorry for that, I think we agree here.

> > >  I think there is consensus that the port owner must be the only
> > > entitiy to operate on a port (be that configuration/frame rx/tx, or
> > > some other operation).
> >
> > Your question above caused me to think that you don't understand it, How
> can someone who is not the port owner to change the port owner?
> > Changing the port owner, like port configuration and port release must be
> done by the owner itself except the case that there is no owner to the port.
> > See the API rte_eth_dev_owner_remove.
> >
> See above, your phrasing I don't think accurately reflected what you meant
> to convey. Or at least thats not how I read it
> 
> > > Multithreaded operation on a port always means some level of
> > > synchronization between application threads and the dpdk library,
> > Yes.
> >  >but I'm not sure why that would be different if we introduced a more
> > > concrete notion of port ownership via a new library.
> > >
> >
> > What do you mean by "new library"?, port is an ethdev instance and should
> be managed by ethdev.
> >
> I'm referring to the port ownership api that you proposed.  Apologies, I
> should not have used the term "new library", but rather "new api".
> 
> >  > > So, for example,  if the dpdk entity is an application, the
> > application should
> > >> take ownership of the port and manage the synchronization of this
> > >> port configuration between the application threads and its EAL host
> > >> thread callbacks, no other dpdk entity should configure the same
> > >> port because they should fail when they try to take ownership of the
> same port too.
> >
> > > Well, failing is one good approach, yes, blocking on port
> > > relinquishment could be another.  I'd recommend an API with the
> following interface:
> > >
> > > rte_port_ownership_claim(int port_id) - blocks execution of the
> > > calling thread until the previous owner releases ownership, then
> > > claims it and returns
> > >
> > > rte_port_ownership_release(int port_id) - releases ownership of
> > > port, or returns error if the port was not owned by this execution
> > > context
> > >
> > > rte_port_ownership_try_claim(int port_id) - same as
> > > rte_port_ownership_claim, but fails if the port is already owned.
> > >
> > > That would give the option for both semantics.
> >
> > I think the current APIs are better because of the next reasons:
> > - It defines well who is the owner.
> Theres no reason you can't integrate some ownership nonce to the above
> API as well, thats easy to add.  The relevant part is the ability to exclude
> those who are not owners (that is to say, block their progress until ownership
> is released by a preceding entity).
> 
> > - The owner structure includes string to allow better debug and printing for
> humans.
> I've got no problem with any such internals, its really the synchronization that
> I'm after.
> 
> > Did you read it?
> Yes, I don't see why you would think I hadn't, I think I've been very clear in
> my understanding of you initial patch.  Have you taken the time to
> understand my concerns?
>
OK, it just looked like you were suggesting new APIs to replace the V1 APIs.

Your concerns are about races in port ownership management.
I agree with them, but only once Thomas's redefinition of the port synchronization rules is in place.
That is, if port creation becomes race safe and the new rules are documented, port ownership should be race safe too.
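
For illustration, item 5/ in Thomas's list (device ownership is racy, to be fixed with atomics) could look roughly like the sketch below. This is not code from the patch: `struct port_slot`, `port_try_claim`, `port_claim` and `port_release` are hypothetical names, and the sketch assumes C11 `<stdatomic.h>` rather than DPDK's own atomic helpers. It covers only the owner id word; the owner name string would need separate handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NO_OWNER 0	/* mirrors RTE_ETH_DEV_NO_OWNER from the patch */

/* Hypothetical per-port slot; the patch stores a plain uint16_t id. */
struct port_slot {
	_Atomic uint16_t owner_id;
};

/* Take the port only if it is unowned (or already ours). Never blocks. */
static int
port_try_claim(struct port_slot *p, uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	assert(p != NULL);
	if (owner_id == NO_OWNER)
		return -EINVAL;
	/* Single atomic step: "if unowned, then mine". */
	if (atomic_compare_exchange_strong(&p->owner_id, &expected, owner_id))
		return 0;
	/* On failure, 'expected' holds the current owner id. */
	return expected == owner_id ? 0 : -EBUSY;
}

/* Blocking variant: busy-wait until the current owner releases the port. */
static int
port_claim(struct port_slot *p, uint16_t owner_id)
{
	int ret;

	while ((ret = port_try_claim(p, owner_id)) == -EBUSY)
		;	/* a real implementation would pause or yield here */
	return ret;
}

/* Release the port; fails unless the caller is the current owner. */
static int
port_release(struct port_slot *p, uint16_t owner_id)
{
	uint16_t expected = owner_id;

	assert(p != NULL);
	if (owner_id == NO_OWNER)
		return -EINVAL;
	if (atomic_compare_exchange_strong(&p->owner_id, &expected, NO_OWNER))
		return 0;
	return -EPERM;
}
```

The compare-and-swap closes the window between checking the owner and writing it, which is the race in a plain read-then-write sequence. It does not by itself synchronize configuration calls made by the owner's own threads.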
 
> > I can add there an API that wait until the port ownership is released as you
> suggested in V2.
> >
> I think that would be good.
> 
> > > > Each dpdk entity which wants to take ownership must to be able to
> > > >synchronize the port configuration in its level.
> >
> > > Can you elaborate on what you mean by level here?  Are you
> > > envisioning a scheme in which multiple execution contexts might own
> > > a port for various non-conflicting purposes?
> >
> > Sure,
> > 1) Application with 2 threads wanting to configure the same port:
> > 	level = application code.
> >
> > 	a. The main thread should create owner
> identifier(rte_eth_dev_owner_new).
> > 	b. The main thread should take the port
> ownership(rte_eth_dev_owner_set).
> > 	c. Synchronization between the two threads should be done for the
> conflicted 		configurations by the application.
> > 	d. when the application finishes the port usage it should release the
> owner(rte_eth_dev_owner_remove).
> >
> > 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for
> hotplug detections which can configure the 2 ports(by the host thread).
> > 	Level = fail-safe code.
> > 	a. Application starts the eal and the fail-safe driver probing function is
> called.
> > 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes
> ownership for them.
> > 	c. Failsafe creates itself port and leaves it ownerless.
> > 	d. Failsafe starts the hotplug alarm mechanism.
> > 	e. Application tries to take ownership for all ports and success only
> for failsafe port.
> > 	f. Application start to configure the failsafe port asynchronously to
> failsafe hotplug alarm.
> > 	g. Failsafe must use synchronization between failsafe alarm callback
> code and failsafe configuration APIs called by the application because they
> both try to configure the same sub-devices ports.
> > 	h. When fail-safe finishes with the two sub devices it should release
> the ports owner.
> >
> Ok, this I would describe as different use cases rather than parallel
> ownership, in that in both cases there is still a single execution context which
> is responsible for all aspects of a given port (which is fine with me, I'm just
> trying to be clear in our description of the model).
> 
Agree.
Can you think of a realistic scenario in which something other than a single execution entity must manage a port and would have problems synchronizing the races on that port?
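
As a rough illustration of scenario 1 quoted above (application level, steps a to d), the sketch below walks through the allocate, take, use, release sequence. The functions are toy stand-ins written for this example: they loosely mimic the V1 signatures in this patch but are not the ethdev implementation, and `MAX_PORTS`, `owner_new`, `owner_set`, `owner_remove` and `app_use_port` are assumed names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NO_OWNER 0		/* stands in for RTE_ETH_DEV_NO_OWNER */
#define MAX_PORTS 4		/* toy port table size (assumption) */
#define OWNER_NAME_LEN 64

struct owner {
	uint16_t id;
	char name[OWNER_NAME_LEN];
};

static struct owner port_owner[MAX_PORTS];
static uint16_t next_owner_id = NO_OWNER + 1;

/* Step a: allocate a unique owner identifier. */
static int
owner_new(uint16_t *id)
{
	if (next_owner_id == NO_OWNER)	/* counter wrapped: ids exhausted */
		return -ENOSPC;
	*id = next_owner_id++;
	return 0;
}

/* Step b: take a port; fails if a different owner already holds it. */
static int
owner_set(uint16_t port, const struct owner *o)
{
	if (port >= MAX_PORTS || o->id == NO_OWNER)
		return -EINVAL;
	if (port_owner[port].id != NO_OWNER && port_owner[port].id != o->id)
		return -EPERM;
	port_owner[port] = *o;
	return 0;
}

/* Step d: release a port; only the current owner may do so. */
static int
owner_remove(uint16_t port, uint16_t id)
{
	if (port >= MAX_PORTS || id == NO_OWNER)
		return -EINVAL;
	if (port_owner[port].id != id)
		return -EPERM;
	memset(&port_owner[port], 0, sizeof(port_owner[port]));
	return 0;
}

/* Steps a, b and d for one application. Step c (synchronizing the
 * application's own threads) is deliberately left out of this sketch. */
static int
app_use_port(uint16_t port, const char *name)
{
	struct owner me = { .id = NO_OWNER };
	int ret;

	assert(name != NULL);
	ret = owner_new(&me.id);
	if (ret != 0)
		return ret;
	snprintf(me.name, sizeof(me.name), "%s", name);
	ret = owner_set(port, &me);
	if (ret != 0)
		return ret;	/* someone else owns the port */
	/* ... configure and use the port here ... */
	return owner_remove(port, me.id);
}
```

In this toy model an application that finds a port already taken (for example by fail-safe, as in scenario 2) simply gets -EPERM from the set step and leaves the port alone.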
 
> 
> > > >
> > > > >
> > > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > > That is to say, can one thread remove a device while another
> > > configures it.
> > > > > >
> > > > > > I think it is the same as two threads configuring the same
> > > > > > device (item 4/ above). It can be managed with port ownership.
> > > > > >
> > > > > Only if you assert that application is required to have the
> > > > > owning port be responsible for the ports deletion, which we can
> > > > > say, but that leads to the issue above again.
> > > > >
> > > > >
> > > > As Thomas said in item 2 the port creation must be synchronized by
> > > > ethdev
> > > and we need to add it there.
> > > > I think it is obvious that port removal must to be done only by
> > > > the port
> > > owner.
> > > >
> > > You say that, but its obvious to you as a developer who has looked
> > > extensively at the code.  It may well be less so to a consumer who
> > > is not an active member of the community, for instance one who
> > > obtains the dpdk via pre-built package.
> > >
> >
> > Yes I can understand, but new rules should be documented and be
> adjusted easily by the customers, no?
> Ostensibly, it should be easy, yes, but in practice it's a bit murkier.  For
> instance, what if an application wants to enable packet capture on an
> interface via rte_pdump_enable?  Does performing that action require that
> the execution context which calls that function own the port before doing
> so?  Digging through the code suggests to me that it (surprisingly) does not,
> because all that function does is set a socket to record packets to, but I
> would have intuitively thought that enabling packet capture would require
> turning off the MAC filter table in the hardware, and so would have required
> ownership.
> 
> Conversely, using the same example, calling rte_pdump_init, using the
> model from your last patch, would require that the calling execution context
> ensure that, at the time the CLI application issued the monitor request,
> the port be unowned, because the pdump main thread needs to set
> rx_tx callbacks on the requested port, which I believe constitutes a
> configuration change needing port ownership.
> 
> My point being, I think saying that ownership is easy and obvious isn't
> accurate.

Agreed. As a rule of thumb, all port-related APIs should require taking ownership, but it will take time to learn which of them do not need it.

>  If we are to leave proper synchronization of access to devices up to
> the application, we either need to:
> 
> 1) Assume downstream users are intimately familiar with the code or
> 2) Exhaustively document the conditions under which ownership needs to be
> held
> 
> (1) is a non starter, and 2 I think is a fairly large undertaking, but unless we
> are willing to codify synchronization in the code explicitly (via locking), (2) is
> what we have to do.
> 
Agree.
Maybe it would be good to document, in the .h files, whether each relevant API requires taking ownership. What do you think?

> Neil
  
Neil Horman Dec. 29, 2017, 4:56 p.m. UTC | #34
On Sat, Dec 23, 2017 at 10:36:34PM +0000, Matan Azrad wrote:
> Hi 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, December 22, 2017 4:27 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Thursday, December 21, 2017 10:14 PM
> > > > To: Matan Azrad <matan@mellanox.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > > > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > > > Hi
> > > > >
> > > <snip>
> > > > > > > > > I think we need to clearly describe what is the
> > > > > > > > > > thread-safety policy in DPDK (especially in ethdev as a first
> > example).
> > > > > > > > > Let's start with obvious things:
> > > > > > > > >
> > > > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > > > 			- no planned change because of performance
> > > > > > purpose
> > > > > > > > > 	2/ The list of devices is racy
> > > > > > > > > 			- to be fixed with atomics
> > > > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > > > 			- the configurations are different per-device
> > > > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > > > 			- can be managed by the owner of the device
> > > > > > > > > 	5/ The device ownership is racy
> > > > > > > > > 			- to be fixed with atomics
> > > > > > > > >
> > > > > > > > > What am I missing?
> > > > > > > > >
> > > > >
> > > > > Thank you Thomas for this order.
> > > > > Actually the port ownership is a good opportunity to redefine the
> > > > > synchronization rules in ethdev :)
> > > > >
> > > > > > > > There is fan out to consider here:
> > > > > > > >
> > > > > > > > 1) Is device configuration racy with ownership?  That is to
> > > > > > > > say, can I change ownership of a device safely while another
> > > > > > > > thread that currently owns it modifies its configuration?
> > > > > > >
> > > > > > > If an entity steals ownership to another one, either it is
> > > > > > > agreed earlier, or it is done by a central authority.
> > > > > > > When it is acked that ownership can be moved, there should not
> > > > > > > be any configuration in progress.
> > > > > > > So it is more a communication issue than a race.
> > > > > > >
> > > > > > But if thats the case (specifically that mutual exclusion
> > > > > > between port ownership and configuration is an exercize left to
> > > > > > an application developer), then port ownership itself is largely
> > > > > > meaningless within the dpdk, because the notion of who owns the
> > > > > > port needs to be codified within the application anyway.
> > > > > >
> > > > >
> > > > > Bruce, As I understand it, only the dpdk entity who took ownership
> > > > > of a
> > > > port successfully can configure the device by default, if other dpdk
> > > > entities want to configure it too they must to be synchronized with
> > > > the port owner while it is not recommended after the port ownership
> > integration.
> > > > >
> > > > Can you clarify what you mean by "it is not recommended after the
> > > > port ownership integration"?
> > >
> > > Sure,
> > > The new defining of ethdev synchronization doesn't recommend to
> > manage a port by 2 different dpdk entities, it can be done but not
> > recommended.
> > >
> > Ok, thats just not what you said above.  Your suggestion made it sound like
> > you thought that  after the integration of a port ownership model, that
> > multiple dpdk entries should not synchronize with one another, which made
> > no sense.
> > 
> Ok, I can see a dual meaning in my sentence, sorry for that, I think we agree here.
> 
No need to apologize, just trying to make sure we're on the same page, and yes,
I agree that we are in consensus here.

> > >
> > > > Can you elaborate on what you mean by level here?  Are you
> > > > envisioning a scheme in which multiple execution contexts might own
> > > > a port for various non-conflicting purposes?
> > >
> > > Sure,
> > > 1) Application with 2 threads wanting to configure the same port:
> > > 	level = application code.
> > >
> > > 	a. The main thread should create owner
> > identifier(rte_eth_dev_owner_new).
> > > 	b. The main thread should take the port
> > ownership(rte_eth_dev_owner_set).
> > > 	c. Synchronization between the two threads should be done for the
> > conflicted 		configurations by the application.
> > > 	d. when the application finishes the port usage it should release the
> > owner(rte_eth_dev_owner_remove).
> > >
> > > 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for
> > hotplug detections which can configure the 2 ports(by the host thread).
> > > 	Level = fail-safe code.
> > > 	a. Application starts the eal and the fail-safe driver probing function is
> > called.
> > > 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes
> > ownership for them.
> > > 	c. Failsafe creates itself port and leaves it ownerless.
> > > 	d. Failsafe starts the hotplug alarm mechanism.
> > > 	e. Application tries to take ownership for all ports and success only
> > for failsafe port.
> > > 	f. Application start to configure the failsafe port asynchronously to
> > failsafe hotplug alarm.
> > > 	g. Failsafe must use synchronization between failsafe alarm callback
> > code and failsafe configuration APIs called by the application because they
> > both try to configure the same sub-devices ports.
> > > 	h. When fail-safe finishes with the two sub devices it should release
> > the ports owner.
> > >
> > Ok, this I would describe as different use cases rather than parallel
> > ownership, in that in both cases there is still a single execution context which
> > is responsible for all aspects of a given port (which is fine with me, I'm just
> > trying to be clear in our description of the model).
> > 
> Agree.
> Can you think of a realistic scenario in which something other than a single execution entity must manage a port and would have problems synchronizing the races on that port?
No, nor do I think one exists, just trying to make sure that you weren't trying
to allow for that, as it would be very difficult to maintain, I think.

>  
> > 
> > > > >
> > > > > >
> > > > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > > > That is to say, can one thread remove a device while another
> > > > configures it.
> > > > > > >
> > > > > > > I think it is the same as two threads configuring the same
> > > > > > > device (item 4/ above). It can be managed with port ownership.
> > > > > > >
> > > > > > Only if you assert that application is required to have the
> > > > > > owning port be responsible for the ports deletion, which we can
> > > > > > say, but that leads to the issue above again.
> > > > > >
> > > > > >
> > > > > As Thomas said in item 2 the port creation must be synchronized by
> > > > > ethdev
> > > > and we need to add it there.
> > > > > I think it is obvious that port removal must to be done only by
> > > > > the port
> > > > owner.
> > > > >
> > > > You say that, but its obvious to you as a developer who has looked
> > > > extensively at the code.  It may well be less so to a consumer who
> > > > is not an active member of the community, for instance one who
> > > > obtains the dpdk via pre-built package.
> > > >
> > >
> > > Yes I can understand, but new rules should be documented and be
> > adjusted easily by the customers, no?
> > Ostensibly, it should be easy, yes, but in practice it's a bit murkier.  For
> > instance, what if an application wants to enable packet capture on an
> > interface via rte_pdump_enable?  Does performing that action require that
> > the execution context which calls that function own the port before doing
> > so?  Digging through the code suggests to me that it (surprisingly) does not,
> > because all that function does is set a socket to record packets to, but I
> > would have intuitively thought that enabling packet capture would require
> > turning off the MAC filter table in the hardware, and so would have required
> > ownership.
> > 
> > Conversely, using the same example, calling rte_pdump_init, using the
> > model from your last patch, would require that the calling execution context
> > ensure that, at the time the CLI application issued the monitor request,
> > the port be unowned, because the pdump main thread needs to set
> > rx_tx callbacks on the requested port, which I believe constitutes a
> > configuration change needing port ownership.
> > 
> > My point being, I think saying that ownership is easy and obvious isn't
> > accurate.
> 
> Agreed. As a rule of thumb, all port-related APIs should require taking ownership, but it will take time to learn which of them do not need it.
It likely will, I agree.  This is the difficulty in maintaining mutual exclusion
outside of the library: it's (potentially) an ever-changing model, for which
documentation needs to keep up, and I'm not sure if we will ever get there
(hence my constant bemoaning of the desire to codify mutual exclusion within
the library).  I wonder if it would be worthwhile to explore a lock registration
mechanism in which sole access is implemented outside the library, but
communicated within the library to avoid this (i.e. a model in which the
application manipulates a mutual exclusion condition, but that can be checked by
the library to ensure proper usage).
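
One possible shape for such a lock registration mechanism is sketched below, purely as an illustration. Nothing here exists in DPDK: `port_lock_register`, `port_check_access`, `port_set_mtu` and the callback type are hypothetical names, and the application's lock is modelled as a simple flag where a real application would consult its own mutex state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4	/* toy port table size (assumption) */

/* Application-supplied predicate: does the caller hold the port's lock? */
typedef bool (*port_lock_held_cb)(uint16_t port_id, void *ctx);

struct port_lock_reg {
	port_lock_held_cb held;
	void *ctx;
};

static struct port_lock_reg lock_reg[MAX_PORTS];
static uint16_t port_mtu[MAX_PORTS];

/* The application tells the library how to check its lock for a port. */
static int
port_lock_register(uint16_t port_id, port_lock_held_cb held, void *ctx)
{
	if (port_id >= MAX_PORTS || held == NULL)
		return -EINVAL;
	lock_reg[port_id].held = held;
	lock_reg[port_id].ctx = ctx;
	return 0;
}

/* Library-side guard, called at the top of every configuration API. */
static int
port_check_access(uint16_t port_id)
{
	const struct port_lock_reg *r;

	if (port_id >= MAX_PORTS)
		return -EINVAL;
	r = &lock_reg[port_id];
	if (r->held == NULL)
		return 0;	/* nothing registered: unchecked, as today */
	return r->held(port_id, r->ctx) ? 0 : -EPERM;
}

/* Example configuration call that enforces the registered condition. */
static int
port_set_mtu(uint16_t port_id, uint16_t mtu)
{
	int ret = port_check_access(port_id);

	if (ret != 0)
		return ret;
	port_mtu[port_id] = mtu;
	return 0;
}

/* Application side: a flag standing in for "my mutex is held". */
struct app_lock {
	bool held;
};

static bool
app_lock_is_held(uint16_t port_id, void *ctx)
{
	(void)port_id;
	assert(ctx != NULL);
	return ((struct app_lock *)ctx)->held;
}
```

The locking itself stays in the application, as proposed; the library only learns enough to refuse calls made without the lock, so misuse shows up as an error instead of a silent race.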

> 
> >  If we are to leave proper synchronization of access to devices up to
> > the application, we either need to:
> > 
> > 1) Assume downstream users are intimately familiar with the code or
> > 2) Exhaustively document the conditions under which ownership needs to be
> > held
> > 
> > (1) is a non starter, and 2 I think is a fairly large undertaking, but unless we
> > are willing to codify synchronization in the code explicitly (via locking), (2) is
> > what we have to do.
> > 
> Agree.
> Maybe it would be good to document, in the .h files, whether each relevant API requires taking ownership. What do you think?
> 
If you think that's a surmountable task, yes.  I think it's necessary if we are
to expect applications to do any sort of real locking (beyond just guaranteeing
one task is executing in the dpdk at any one time).

Best
Neil

> > Neil
> 
>
  

Patch

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index 6a0c9f9..af639a1 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -156,7 +156,7 @@  concurrently on the same tx queue without SW lock. This PMD feature found in som
 
 See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
 
-Device Identification and Configuration
+Device Identification, Ownership and Configuration
 ---------------------------------------
 
 Device Identification
@@ -171,6 +171,16 @@  Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+Each Ethernet device port can be owned by a single DPDK entity (application, library, PMD, process, etc.).
+The ownership mechanism is controlled by ethdev APIs and allows DPDK entities to set, remove, and get a port owner.
+This should prevent an Ethernet port from being managed by several different entities at once.
+
+.. note::
+
+    It is the responsibility of each DPDK entity either to check the port owner before using a port or to set the port owner to prevent others from using it.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2d754d9..836991e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -71,6 +71,7 @@ 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
 static struct rte_eth_dev_data *rte_eth_dev_data;
+static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
 static uint8_t eth_dev_last_created_port;
 
 /* spinlock for eth device callbacks */
@@ -278,6 +279,7 @@  struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
+	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 	return 0;
 }
@@ -293,6 +295,125 @@  struct rte_eth_dev *
 		return 1;
 }
 
+static int
+rte_eth_is_valid_owner_id(uint16_t owner_id)
+{
+	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
+	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
+	    rte_eth_next_owner_id <= owner_id)) {
+		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
+		return 0;
+	}
+	return 1;
+}
+
+uint16_t
+rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
+{
+	while (port_id < RTE_MAX_ETHPORTS &&
+	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
+	       rte_eth_devices[port_id].data->owner.id != owner_id))
+		port_id++;
+
+	if (port_id >= RTE_MAX_ETHPORTS)
+		return RTE_MAX_ETHPORTS;
+
+	return port_id;
+}
+
+int
+rte_eth_dev_owner_new(uint16_t *owner_id)
+{
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
+		return -EPERM;
+	}
+	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
+		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
+		return -EUSERS;
+	}
+	*owner_id = rte_eth_next_owner_id++;
+	return 0;
+}
+
+int
+rte_eth_dev_owner_set(const uint16_t port_id,
+		      const struct rte_eth_dev_owner *owner)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
+		return -EPERM;
+	}
+	if (!rte_eth_is_valid_owner_id(owner->id))
+		return -EINVAL;
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
+	    port_owner->id != owner->id) {
+		RTE_LOG(ERR, EAL,
+			"Cannot set owner to port %d already owned by %s_%05d.\n",
+			port_id, port_owner->name, port_owner->id);
+		return -EPERM;
+	}
+	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
+		       owner->name);
+	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
+		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
+		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
+		return -EINVAL;
+	}
+	port_owner->id = owner->id;
+	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
+			    owner->name, owner->id);
+	return 0;
+}
+
+int
+rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
+{
+	struct rte_eth_dev_owner *port_owner;
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	if (!rte_eth_is_valid_owner_id(owner_id))
+		return -EINVAL;
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != owner_id) {
+		RTE_LOG(ERR, EAL,
+			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
+			port_id, port_owner->name, port_owner->id, owner_id);
+		return -EPERM;
+	}
+	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has been removed.\n", port_id,
+			port_owner->name, port_owner->id);
+	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
+	return 0;
+}
+
+void
+rte_eth_dev_owner_delete(const uint16_t owner_id)
+{
+	uint16_t p;
+
+	if (!rte_eth_is_valid_owner_id(owner_id))
+		return;
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
+		memset(&rte_eth_devices[p].data->owner, 0,
+		       sizeof(struct rte_eth_dev_owner));
+	RTE_PMD_DEBUG_TRACE("All ports owned by "
+			    "identifier %05d have been released.\n", owner_id);
+}
+
+const struct rte_eth_dev_owner *
+rte_eth_dev_owner_get(const uint16_t port_id)
+{
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
+	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
+		return NULL;
+	return &rte_eth_devices[port_id].data->owner;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..f54c26d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1760,6 +1760,15 @@  struct rte_eth_dev_sriov {
 
 #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
 
+#define RTE_ETH_DEV_NO_OWNER 0
+
+#define RTE_ETH_MAX_OWNER_NAME_LEN 64
+
+struct rte_eth_dev_owner {
+	uint16_t id; /**< The owner unique identifier. */
+	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
+};
+
 /**
  * @internal
  * The data part, with no function pointers, associated with each ethernet device.
@@ -1810,6 +1819,7 @@  struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	struct rte_eth_dev_owner owner; /**< The port owner. */
 };
 
 /** Device supports link state interrupt */
@@ -1846,6 +1856,82 @@  struct rte_eth_dev_data {
 
 
 /**
+ * Iterates over valid ethdev ports owned by a specific owner.
+ *
+ * @param port_id
+ *   The id of the next possible valid owned port.
+ * @param	owner_id
+ *  The owner identifier.
+ *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
+ * @return
+ *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
+ */
+uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
+
+/**
+ * Macro to iterate over all enabled ethdev ports owned by a specific owner.
+ */
+#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
+	for (p = rte_eth_find_next_owned_by(0, o); \
+	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
+	     p = rte_eth_find_next_owned_by(p + 1, o))
+
+/**
+ * Get a new unique owner identifier.
+ * An owner identifier is used by exactly one DPDK entity to own Ethernet
+ * devices, which avoids management of a device by multiple entities.
+ *
+ * @param	owner_id
+ *   Owner identifier pointer.
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_new(uint16_t *owner_id);
+
+/**
+ * Set an Ethernet device owner.
+ *
+ * @param	port_id
+ *  The identifier of the port to own.
+ * @param	owner
+ *  The owner pointer.
+ * @return
+ *  Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_set(const uint16_t port_id,
+			  const struct rte_eth_dev_owner *owner);
+
+/**
+ * Remove Ethernet device owner to make the device ownerless.
+ *
+ * @param	port_id
+ *  The identifier of port to make ownerless.
+ * @param	owner_id
+ *  The owner identifier.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
+
+/**
+ * Remove owner from all Ethernet devices owned by a specific owner.
+ *
+ * @param	owner_id
+ *  The owner identifier.
+ */
+void rte_eth_dev_owner_delete(const uint16_t owner_id);
+
+/**
+ * Get the owner of an Ethernet device.
+ *
+ * @param	port_id
+ *  The port identifier.
+ * @return
+ *  NULL when the device is ownerless, else the device owner pointer.
+ */
+const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
+
+/**
  * Get the total number of Ethernet devices that have been successfully
  * initialized by the matching Ethernet driver during the PCI probing phase
  * and that are available for applications to use. These devices must be
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..7d07edb 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,18 @@  DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_find_next_owned_by;
+	rte_eth_dev_owner_new;
+	rte_eth_dev_owner_set;
+	rte_eth_dev_owner_remove;
+	rte_eth_dev_owner_delete;
+	rte_eth_dev_owner_get;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global: