[dpdk-dev,v2,1/6] ethdev: added switch_domain and representor port flag

Message ID 1510929733-7225-1-git-send-email-mohammad.abdul.awal@intel.com
State Superseded, archived
Delegated to: Ferruh Yigit
Headers show

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/checkpatch success coding style OK

Commit Message

Mohammad Abdul Awal Nov. 17, 2017, 2:42 p.m.
switch_domain attribute has been added to specify that a rte_eth_dev
instance belongs to a switch domain in the software switch.

RTE_ETH_DEV_REPRESENTOR_PORT has been defined to specify that a
rte_eth_dev instance is a representor device.

Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
---
 lib/librte_ether/rte_ethdev.h | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Ferruh Yigit Nov. 20, 2017, 7:57 p.m. | #1
On 11/17/2017 6:42 AM, Mohammad Abdul Awal wrote:
> switch_domain attribute has been added to specify that a rte_eth_dev
> instance belongs to a switch domain in the software switch.
> 
> RTE_ETH_DEV_REPRESENTOR_PORT has been defined to specify that a
> rte_eth_dev instance is a representor device.
> 
> Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
> Signed-off-by: Declan Doherty <declan.doherty@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 341c2d6..7e82c70 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1047,6 +1047,7 @@ struct rte_eth_dev_info {
>  	/** Configured number of rx/tx queues */
>  	uint16_t nb_rx_queues; /**< Number of RX queues. */
>  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> +	uint16_t switch_domain; /**< Switch domain which port belongs to. */

What to put or not into rte_eth_dev_info is not that clear, but recent
description from Andrew [1] makes sense to me: using more device related static
information in dev_info instead of dynamic data.

There is an option to create a new API for this, what do you think?

[1]
http://dpdk.org/ml/archives/dev/2017-November/081868.html

>  };
>  
>  /**
> @@ -1810,6 +1811,7 @@ struct rte_eth_dev_data {
>  	int numa_node;  /**< NUMA node connection */
>  	struct rte_vlan_filter_conf vlan_filter_conf;
>  	/**< VLAN filter configuration. */
> +	uint16_t switch_domain; /**< Switch domain which port belongs to. */

Can you please describe why "switch_domain" used for? In patchset only ixgbe
uses it, and that is only for assign to dev_info, it is not clear why this is
needed.

>  };
>  
>  /** Device supports link state interrupt */
> @@ -1818,6 +1820,8 @@ struct rte_eth_dev_data {
>  #define RTE_ETH_DEV_BONDED_SLAVE 0x0004
>  /** Device supports device removal interrupt */
>  #define RTE_ETH_DEV_INTR_RMV     0x0008
> +/** Device is a port representor */
> +#define RTE_ETH_DEV_REPRESENTOR_PORT 0x0010

As far as I can see this is not a generic device option, but just a way to mark
a specific virtual PMD. There must be a better way to do this, I think this flag
is not good idea.

>  
>  /**
>   * @internal
>
Mohammad Abdul Awal Dec. 8, 2017, 3:33 p.m. | #2
On 20/11/2017 19:57, Ferruh Yigit wrote:
> On 11/17/2017 6:42 AM, Mohammad Abdul Awal wrote:
>> switch_domain attribute has been added to specify that a rte_eth_dev
>> instance belongs to a switch domain in the software switch.
>>
>> RTE_ETH_DEV_REPRESENTOR_PORT has been defined to specify that a
>> rte_eth_dev instance is a representor device.
>>
>> Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
>> Signed-off-by: Declan Doherty <declan.doherty@intel.com>
>> ---
>>   lib/librte_ether/rte_ethdev.h | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>> index 341c2d6..7e82c70 100644
>> --- a/lib/librte_ether/rte_ethdev.h
>> +++ b/lib/librte_ether/rte_ethdev.h
>> @@ -1047,6 +1047,7 @@ struct rte_eth_dev_info {
>>   	/** Configured number of rx/tx queues */
>>   	uint16_t nb_rx_queues; /**< Number of RX queues. */
>>   	uint16_t nb_tx_queues; /**< Number of TX queues. */
>> +	uint16_t switch_domain; /**< Switch domain which port belongs to. */
> What to put or not into rte_eth_dev_info is not that clear, but recent
> description from Andrew [1] makes sense to me: using more device related static
> information in dev_info instead of dynamic data.
>
> There is an option to create a new API for this, what do you think?
>
> [1]
> http://dpdk.org/ml/archives/dev/2017-November/081868.html
>
>>   };
>>   
>>   /**
>> @@ -1810,6 +1811,7 @@ struct rte_eth_dev_data {
>>   	int numa_node;  /**< NUMA node connection */
>>   	struct rte_vlan_filter_conf vlan_filter_conf;
>>   	/**< VLAN filter configuration. */
>> +	uint16_t switch_domain; /**< Switch domain which port belongs to. */
> Can you please describe why "switch_domain" used for? In patchset only ixgbe
> uses it, and that is only for assign to dev_info, it is not clear why this is
> needed.

We kept it keeping in mind for future use, specially for switchdev API 
and OVS support. But for this patch it is not necessary to be here. We 
will remove it from here now. New patch will be submitted if needed in 
future.

>>   };
>>   
>>   /** Device supports link state interrupt */
>> @@ -1818,6 +1820,8 @@ struct rte_eth_dev_data {
>>   #define RTE_ETH_DEV_BONDED_SLAVE 0x0004
>>   /** Device supports device removal interrupt */
>>   #define RTE_ETH_DEV_INTR_RMV     0x0008
>> +/** Device is a port representor */
>> +#define RTE_ETH_DEV_REPRESENTOR_PORT 0x0010
> As far as I can see this is not a generic device option, but just a way to mark
> a specific virtual PMD. There must be a better way to do this, I think this flag
> is not good idea.
Thanks Ferruh. We will also remove this flag option. Without this flag 
option, we can still identify the PMD as a port representor type or not 
by looking at the device name.

>>   
>>   /**
>>    * @internal
>>
>
Alex Rosenbaum Dec. 11, 2017, 1:45 p.m. | #3
Mohammad,

I did not see a v2 change log. did I miss it? please send.

I don't understand who this v2 addresses the comments by Alejandro Lucero
 from netronome [1].
These are critical points which your proposal does not handle. It is
related to the switch_domain member exposed here.

thanks,
Alex

[1] http://dpdk.org/ml/archives/dev/2017-September/074904.html
Mohammad Abdul Awal Dec. 11, 2017, 6 p.m. | #4
Hi Alex,


On 11/12/2017 13:45, Alex Rosenbaum wrote:
> Mohammad,
>
> I did not see a v2 change log. did I miss it? please send.
>
The V2 change is for vdev handling by the representor PMD as the vdev 
API location changed from DPDK 17.08 to 17.11. So, V2 patch mainly a 
rebased against DPDK 17.11.

> I don't understand who this v2 addresses the comments by Alejandro 
> Lucero from netronome [1].
> These are critical points which your proposal does not handle. It is 
> related to the switch_domain member exposed here.

Since, none of the initial NICs (i40e, ixgbe) is using switch_domain for 
the moment, we are dropping the switch_domain from these patch set. It 
is expected to be address in future patchset with switchdev support.

Regards,
Awal.
>
> thanks,
> Alex
>
> [1] http://dpdk.org/ml/archives/dev/2017-September/074904.html
Remy Horton Dec. 22, 2017, 2:41 p.m. | #5
Port Representors provide a logical presentation in DPDK of VF (virtual 
function) ports for the purposes of control and monitoring. Each port 
representor device represents a single VF and is associated with it's 
parent physical function (PF) PMD which provides the back-end hooks for 
the representor device ops and defines the control domain to which that 
port belongs. This allows to use existing DPDK APIs to monitor and control 
the port without the need to create and maintain VF specific APIs.

+-----------------------------+   +---------------+  +---------------+
|        Control Plane        |   |   Data Plane  |  |   Data Plane  |
|         Application         |   |   Application |  |   Application |
+-----------------------------+   +---------------+  +---------------+
|         eth dev api         |   |  eth dev api  |  |  eth dev api  |
+-----------------------------+   +---------------+  +---------------+
+-------+  +-------+  +-------+   +---------------+  +---------------+
|  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
|  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
|       |  | PMD   |  | PMD   |                             |
+---+--^+  +-------+  +-+-----+                             |
    |  |                |  |                                |
    |  +----------------+  |                                |
    |                      |                                |
    |                      |                                |
+--------------------------------+                          |
|   |  HW (logical view)   |     |                          |
| --+------+ +-------+ +---+---+ |                          |
| |   PF   | |  VF0  | |  VF1  | |                          |
| |        | |       | |       +----------------------------+
| +--------+ +-------+ +-------+ |
| +----------------------------+ |
| |        VEB                 | |
| +----------------------------+ |
| +--------+                     |
| |  Port  |                     |
| |   0    |                     |
| +--------+                     |
+--------------------------------+

The figure above shows a deployment where the PF is bound to a DPDK control
plane application which uses representor ports to manage the configuration and
monitoring of it's VF ports. Each virtual function is represented in the
application by a representor port PMD which enables control of the corresponding
VF through eth dev APIs on the representor PMD such as:

- void rte_eth_promiscuous_enable(uint8_t port_id);
- void rte_eth_promiscuous_disable(uint8_t port_id);
- void rte_eth_allmulticast_enable(uint8_t port_id);
- void rte_eth_allmulticast_disable(uint8_t port_id);
- int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
	uint32_t pool);
- int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);

as well as monitoring through API's like

- void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
- int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);

The port representor infrastructure is enabled through a single common, device
independent, virtual PMD whos context is initialized and enabled through a
broker instance running within the context of the physical function device
driver.

+-------------------------+       +-------------------------+
|        rte_ethdev       |       |       rte_ethdev        |
+-------------------------+       +-------------------------+
|  Physical Function PMD  |       |  Port Reperesentor PMD  |
|         +-------------+ |       | +---------+ +---------+ |
|         | Representor | |       | | dev_data| | dev_ops | |
|         |    Broker   | |       | +----+----+ +----+----+ |
|         | +---------+ | |       +------|-----------|------+
|         | | VF Port | | |              |           |
|         | | Context +------------------+           |
|         | +---------+ | |                          |
|         | +---------+ | |                          |
|         | | Handler +------------------------------+
|         | |   Ops   | | |
|         | +---------+ | |
|         +-------------+ |
+-------------------------+

Creation of representor ports can be achieved either through the --vdev EAL
option or through the rte_vdev_init() API. Each port representor requires the
BDF of it's parent PF and the Virtual Function ID of the port which the
representor will support. During initialization of the representor PMD, it calls
the broker API to register itself with the PF PMD and to get it's context
configured which includes the setting up of it's context and ops function
handlers.

As the port representor model is based around the paradigm of using standard
port based APIs, it will allow future expansion of functionality without the
need to add new APIs. For example it should be possible to support configuration
of egress QoS parameters using existing TM APIs by extending the port
representor PMD/broker infrastructure.


Changes in v2:
Rebased to DPDK 17.11

Changes in v3:
* Removed RTE_ETH_DEV_REPRESENTOR_PORT define
* Removed switch_domain from struct rte_eth_dev_info
  (to be reintroduced seperately when required).
* Port Representor PMD functionality merged into main library
* Removed functions from .map file that are not for use by applications
* Some minor bugfixes (uninitalised variables & NULL terminators)
* Use of SPDX licence headers in new files
* Seperate headers for PMD and application
* SPDX-License-Identifier in new files
* Added test-pmd representor add/del commands


Remy Horton (5):
  lib: add Port Representor library
  eal: add Port Representor command-line option
  drivers/net/i40e: add Port Representor functionality
  drivers/net/ixgbe: add Port Representor functionality
  app/test-pmd: add Port Representor commands

 app/test-pmd/cmdline.c                             |  88 ++++
 config/common_base                                 |   5 +
 drivers/net/i40e/Makefile                          |   1 +
 drivers/net/i40e/i40e_ethdev.c                     |  16 +
 drivers/net/i40e/i40e_ethdev.h                     |   1 +
 drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
 drivers/net/i40e/i40e_prep_ops.h                   |  15 +
 drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
 drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
 drivers/net/ixgbe/Makefile                         |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
 drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
 drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
 lib/Makefile                                       |   3 +
 lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
 lib/librte_eal/common/eal_common_options.c         |   1 +
 lib/librte_eal/common/eal_internal_cfg.h           |   2 +
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/include/rte_eal.h            |   8 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
 lib/librte_representor/Makefile                    |  26 ++
 lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
 lib/librte_representor/rte_port_representor.h      |  60 +++
 .../rte_port_representor_driver.h                  | 138 ++++++
 .../rte_port_representor_version.map               |   8 +
 mk/rte.app.mk                                      |   1 +
 27 files changed, 1577 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/i40e/i40e_prep_ops.c
 create mode 100644 drivers/net/i40e/i40e_prep_ops.h
 create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
 create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
 create mode 100644 lib/librte_representor/Makefile
 create mode 100644 lib/librte_representor/rte_port_representor.c
 create mode 100644 lib/librte_representor/rte_port_representor.h
 create mode 100644 lib/librte_representor/rte_port_representor_driver.h
 create mode 100644 lib/librte_representor/rte_port_representor_version.map
Remy Horton Dec. 22, 2017, 2:52 p.m. | #6
Port Representors provide a logical presentation in DPDK of VF (virtual 
function) ports for the purposes of control and monitoring. Each port 
representor device represents a single VF and is associated with it's 
parent physical function (PF) PMD which provides the back-end hooks for 
the representor device ops and defines the control domain to which that 
port belongs. This allows to use existing DPDK APIs to monitor and control 
the port without the need to create and maintain VF specific APIs.

+-----------------------------+   +---------------+  +---------------+
|        Control Plane        |   |   Data Plane  |  |   Data Plane  |
|         Application         |   |   Application |  |   Application |
+-----------------------------+   +---------------+  +---------------+
|         eth dev api         |   |  eth dev api  |  |  eth dev api  |
+-----------------------------+   +---------------+  +---------------+
+-------+  +-------+  +-------+   +---------------+  +---------------+
|  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
|  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
|       |  | PMD   |  | PMD   |                             |
+---+--^+  +-------+  +-+-----+                             |
    |  |                |  |                                |
    |  +----------------+  |                                |
    |                      |                                |
    |                      |                                |
+--------------------------------+                          |
|   |  HW (logical view)   |     |                          |
| --+------+ +-------+ +---+---+ |                          |
| |   PF   | |  VF0  | |  VF1  | |                          |
| |        | |       | |       +----------------------------+
| +--------+ +-------+ +-------+ |
| +----------------------------+ |
| |        VEB                 | |
| +----------------------------+ |
| +--------+                     |
| |  Port  |                     |
| |   0    |                     |
| +--------+                     |
+--------------------------------+

The figure above shows a deployment where the PF is bound to a DPDK control
plane application which uses representor ports to manage the configuration and
monitoring of it's VF ports. Each virtual function is represented in the
application by a representor port PMD which enables control of the corresponding
VF through eth dev APIs on the representor PMD such as:

- void rte_eth_promiscuous_enable(uint8_t port_id);
- void rte_eth_promiscuous_disable(uint8_t port_id);
- void rte_eth_allmulticast_enable(uint8_t port_id);
- void rte_eth_allmulticast_disable(uint8_t port_id);
- int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
	uint32_t pool);
- int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);

as well as monitoring through API's like

- void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
- int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);

The port representor infrastructure is enabled through a single common, device
independent, virtual PMD whos context is initialized and enabled through a
broker instance running within the context of the physical function device
driver.

+-------------------------+       +-------------------------+
|        rte_ethdev       |       |       rte_ethdev        |
+-------------------------+       +-------------------------+
|  Physical Function PMD  |       |  Port Reperesentor PMD  |
|         +-------------+ |       | +---------+ +---------+ |
|         | Representor | |       | | dev_data| | dev_ops | |
|         |    Broker   | |       | +----+----+ +----+----+ |
|         | +---------+ | |       +------|-----------|------+
|         | | VF Port | | |              |           |
|         | | Context +------------------+           |
|         | +---------+ | |                          |
|         | +---------+ | |                          |
|         | | Handler +------------------------------+
|         | |   Ops   | | |
|         | +---------+ | |
|         +-------------+ |
+-------------------------+

Creation of representor ports can be achieved either through the --vdev EAL
option or through the rte_vdev_init() API. Each port representor requires the
BDF of it's parent PF and the Virtual Function ID of the port which the
representor will support. During initialization of the representor PMD, it calls
the broker API to register itself with the PF PMD and to get it's context
configured which includes the setting up of it's context and ops function
handlers.

As the port representor model is based around the paradigm of using standard
port based APIs, it will allow future expansion of functionality without the
need to add new APIs. For example it should be possible to support configuration
of egress QoS parameters using existing TM APIs by extending the port
representor PMD/broker infrastructure.

Changes in v2:
* Rebased to DPDK 17.11

Changes in v3:
* Removed RTE_ETH_DEV_REPRESENTOR_PORT define
* Removed switch_domain from struct rte_eth_dev_info
  (to be reintroduced seperately when required).
* Port Representor PMD functionality merged into main library
* Removed functions from .map file that are not for use by applications
* Some minor bugfixes (uninitalised variables & NULL terminators)
* Use of SPDX licence headers in new files
* Seperate headers for PMD and application
* SPDX-License-Identifier in new files
* Added test-pmd representor add/del commands


Remy Horton (5):
  lib: add Port Representor library
  eal: add Port Representor command-line option
  drivers/net/i40e: add Port Representor functionality
  drivers/net/ixgbe: add Port Representor functionality
  app/test-pmd: add Port Representor commands

 app/test-pmd/cmdline.c                             |  88 ++++
 config/common_base                                 |   5 +
 drivers/net/i40e/Makefile                          |   1 +
 drivers/net/i40e/i40e_ethdev.c                     |  16 +
 drivers/net/i40e/i40e_ethdev.h                     |   1 +
 drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
 drivers/net/i40e/i40e_prep_ops.h                   |  15 +
 drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
 drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
 drivers/net/ixgbe/Makefile                         |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
 drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
 drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
 lib/Makefile                                       |   3 +
 lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
 lib/librte_eal/common/eal_common_options.c         |   1 +
 lib/librte_eal/common/eal_internal_cfg.h           |   2 +
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/include/rte_eal.h            |   8 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
 lib/librte_representor/Makefile                    |  26 ++
 lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
 lib/librte_representor/rte_port_representor.h      |  60 +++
 .../rte_port_representor_driver.h                  | 138 ++++++
 .../rte_port_representor_version.map               |   8 +
 mk/rte.app.mk                                      |   1 +
 27 files changed, 1577 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/i40e/i40e_prep_ops.c
 create mode 100644 drivers/net/i40e/i40e_prep_ops.h
 create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
 create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
 create mode 100644 lib/librte_representor/Makefile
 create mode 100644 lib/librte_representor/rte_port_representor.c
 create mode 100644 lib/librte_representor/rte_port_representor.h
 create mode 100644 lib/librte_representor/rte_port_representor_driver.h
 create mode 100644 lib/librte_representor/rte_port_representor_version.map
Remy Horton Dec. 22, 2017, 3:09 p.m. | #7
On 22/12/2017 14:41, Remy Horton wrote:
> Port Representors provide a logical presentation in DPDK of VF (virtual
[..]
> Remy Horton (5):
>   lib: add Port Representor library
>   eal: add Port Representor command-line option
>   drivers/net/i40e: add Port Representor functionality
>   drivers/net/ixgbe: add Port Representor functionality
>   app/test-pmd: add Port Representor commands

Opps, misconfigured patch script. Self-NAK on this patchset, and resent 
with correct subject headers.
Neil Horman Dec. 22, 2017, 4:37 p.m. | #8
On Fri, Dec 22, 2017 at 02:41:16PM +0000, Remy Horton wrote:
> Port Representors provide a logical presentation in DPDK of VF (virtual 
> function) ports for the purposes of control and monitoring. Each port 
> representor device represents a single VF and is associated with it's 
> parent physical function (PF) PMD which provides the back-end hooks for 
> the representor device ops and defines the control domain to which that 
> port belongs. This allows to use existing DPDK APIs to monitor and control 
> the port without the need to create and maintain VF specific APIs.
> 
> +-----------------------------+   +---------------+  +---------------+
> |        Control Plane        |   |   Data Plane  |  |   Data Plane  |
> |         Application         |   |   Application |  |   Application |
> +-----------------------------+   +---------------+  +---------------+
> |         eth dev api         |   |  eth dev api  |  |  eth dev api  |
> +-----------------------------+   +---------------+  +---------------+
> +-------+  +-------+  +-------+   +---------------+  +---------------+
> |  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
> |  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
> |       |  | PMD   |  | PMD   |                             |
> +---+--^+  +-------+  +-+-----+                             |
>     |  |                |  |                                |
>     |  +----------------+  |                                |
>     |                      |                                |
>     |                      |                                |
> +--------------------------------+                          |
> |   |  HW (logical view)   |     |                          |
> | --+------+ +-------+ +---+---+ |                          |
> | |   PF   | |  VF0  | |  VF1  | |                          |
> | |        | |       | |       +----------------------------+
> | +--------+ +-------+ +-------+ |
> | +----------------------------+ |
> | |        VEB                 | |
> | +----------------------------+ |
> | +--------+                     |
> | |  Port  |                     |
> | |   0    |                     |
> | +--------+                     |
> +--------------------------------+
> 

How does this mesh with the notion of port ownership that we've been discussing
in other threads?  In that thread, we've been discussing the need for a single
execution context to have exclusive access to the hardware for the purposes of
configuration and data i/o, and for the application/execution context to be
responsible for co-ordination of any shared use of a device.  In this feature
however, the notion of a Port Representor creates an alias to the same hardware
funciton (VF), where both aliases (in the control and data plan) have parallel
access to the hardware, in such a way that co-ordination between the two is
largely impossible (unless you want to make the data plane application explicity
aware of control plane actiivy).

Neil
Neil Horman Dec. 26, 2017, 1:54 p.m. | #9
On Fri, Dec 22, 2017 at 02:52:16PM +0000, Remy Horton wrote:
> Port Representors provide a logical presentation in DPDK of VF (virtual 
> function) ports for the purposes of control and monitoring. Each port 
> representor device represents a single VF and is associated with it's 
> parent physical function (PF) PMD which provides the back-end hooks for 
> the representor device ops and defines the control domain to which that 
> port belongs. This allows to use existing DPDK APIs to monitor and control 
> the port without the need to create and maintain VF specific APIs.
> 
> +-----------------------------+   +---------------+  +---------------+
> |        Control Plane        |   |   Data Plane  |  |   Data Plane  |
> |         Application         |   |   Application |  |   Application |
> +-----------------------------+   +---------------+  +---------------+
> |         eth dev api         |   |  eth dev api  |  |  eth dev api  |
> +-----------------------------+   +---------------+  +---------------+
> +-------+  +-------+  +-------+   +---------------+  +---------------+
> |  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
> |  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
> |       |  | PMD   |  | PMD   |                             |
> +---+--^+  +-------+  +-+-----+                             |
>     |  |                |  |                                |
>     |  +----------------+  |                                |
>     |                      |                                |
>     |                      |                                |
> +--------------------------------+                          |
> |   |  HW (logical view)   |     |                          |
> | --+------+ +-------+ +---+---+ |                          |
> | |   PF   | |  VF0  | |  VF1  | |                          |
> | |        | |       | |       +----------------------------+
> | +--------+ +-------+ +-------+ |
> | +----------------------------+ |
> | |        VEB                 | |
> | +----------------------------+ |
> | +--------+                     |
> | |  Port  |                     |
> | |   0    |                     |
> | +--------+                     |
> +--------------------------------+
> 
> The figure above shows a deployment where the PF is bound to a DPDK control
> plane application which uses representor ports to manage the configuration and
> monitoring of it's VF ports. Each virtual function is represented in the
> application by a representor port PMD which enables control of the corresponding
> VF through eth dev APIs on the representor PMD such as:
> 
> - void rte_eth_promiscuous_enable(uint8_t port_id);
> - void rte_eth_promiscuous_disable(uint8_t port_id);
> - void rte_eth_allmulticast_enable(uint8_t port_id);
> - void rte_eth_allmulticast_disable(uint8_t port_id);
> - int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
> 	uint32_t pool);
> - int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);
> 
> as well as monitoring through API's like
> 
> - void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
> - int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);
> 
> The port representor infrastructure is enabled through a single common, device
> independent, virtual PMD whos context is initialized and enabled through a
> broker instance running within the context of the physical function device
> driver.
> 
> +-------------------------+       +-------------------------+
> |        rte_ethdev       |       |       rte_ethdev        |
> +-------------------------+       +-------------------------+
> |  Physical Function PMD  |       |  Port Reperesentor PMD  |
> |         +-------------+ |       | +---------+ +---------+ |
> |         | Representor | |       | | dev_data| | dev_ops | |
> |         |    Broker   | |       | +----+----+ +----+----+ |
> |         | +---------+ | |       +------|-----------|------+
> |         | | VF Port | | |              |           |
> |         | | Context +------------------+           |
> |         | +---------+ | |                          |
> |         | +---------+ | |                          |
> |         | | Handler +------------------------------+
> |         | |   Ops   | | |
> |         | +---------+ | |
> |         +-------------+ |
> +-------------------------+
> 
> Creation of representor ports can be achieved either through the --vdev EAL
> option or through the rte_vdev_init() API. Each port representor requires the
> BDF of it's parent PF and the Virtual Function ID of the port which the
> representor will support. During initialization of the representor PMD, it calls
> the broker API to register itself with the PF PMD and to get it's context
> configured which includes the setting up of it's context and ops function
> handlers.
> 
> As the port representor model is based around the paradigm of using standard
> port based APIs, it will allow future expansion of functionality without the
> need to add new APIs. For example it should be possible to support configuration
> of egress QoS parameters using existing TM APIs by extending the port
> representor PMD/broker infrastructure.
> 
> Changes in v2:
> * Rebased to DPDK 17.11
> 
> Changes in v3:
> * Removed RTE_ETH_DEV_REPRESENTOR_PORT define
> * Removed switch_domain from struct rte_eth_dev_info
>   (to be reintroduced seperately when required).
> * Port Representor PMD functionality merged into main library
> * Removed functions from .map file that are not for use by applications
> * Some minor bugfixes (uninitalised variables & NULL terminators)
> * Use of SPDX licence headers in new files
> * Seperate headers for PMD and application
> * SPDX-License-Identifier in new files
> * Added test-pmd representor add/del commands
> 
> 
> Remy Horton (5):
>   lib: add Port Representor library
>   eal: add Port Representor command-line option
>   drivers/net/i40e: add Port Representor functionality
>   drivers/net/ixgbe: add Port Representor functionality
>   app/test-pmd: add Port Representor commands
> 
>  app/test-pmd/cmdline.c                             |  88 ++++
>  config/common_base                                 |   5 +
>  drivers/net/i40e/Makefile                          |   1 +
>  drivers/net/i40e/i40e_ethdev.c                     |  16 +
>  drivers/net/i40e/i40e_ethdev.h                     |   1 +
>  drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
>  drivers/net/i40e/i40e_prep_ops.h                   |  15 +
>  drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
>  drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
>  drivers/net/ixgbe/Makefile                         |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
>  drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
>  drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
>  drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
>  lib/Makefile                                       |   3 +
>  lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
>  lib/librte_eal/common/eal_common_options.c         |   1 +
>  lib/librte_eal/common/eal_internal_cfg.h           |   2 +
>  lib/librte_eal/common/eal_options.h                |   2 +
>  lib/librte_eal/common/include/rte_eal.h            |   8 +
>  lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
>  lib/librte_representor/Makefile                    |  26 ++
>  lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
>  lib/librte_representor/rte_port_representor.h      |  60 +++
>  .../rte_port_representor_driver.h                  | 138 ++++++
>  .../rte_port_representor_version.map               |   8 +
>  mk/rte.app.mk                                      |   1 +
>  27 files changed, 1577 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/i40e/i40e_prep_ops.c
>  create mode 100644 drivers/net/i40e/i40e_prep_ops.h
>  create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
>  create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
>  create mode 100644 lib/librte_representor/Makefile
>  create mode 100644 lib/librte_representor/rte_port_representor.c
>  create mode 100644 lib/librte_representor/rte_port_representor.h
>  create mode 100644 lib/librte_representor/rte_port_representor_driver.h
>  create mode 100644 lib/librte_representor/rte_port_representor_version.map
> 
> -- 
> 2.9.5
> 
> 
Same question as in V1, how does this patch mesh with the notion of port
ownership?  The consensus from that thread was that, because the DPDK considers
itself lockless, that the application(s) managing/using a given port need to
enforce their own mutual exclusion.  The design of this feature explicitly
creates an alias effect, in that the same hardware (a VF in this case), can be
accessed via two different port structures by two different applications without
being aware of the shared nature of the port.  Given that one is in a vm,
co-ordination between applications is going to be awkward at best.  How do you
plan to enforce mutual exclusion in this environment, given that you don't have
any shared memory between the vm and the host?

Neil
Mohammad Abdul Awal Jan. 3, 2018, 12:12 p.m. | #10
Hi Neil,


On 26/12/2017 13:54, Neil Horman wrote:
> On Fri, Dec 22, 2017 at 02:52:16PM +0000, Remy Horton wrote:
>> Port Representors provide a logical presentation in DPDK of VF (virtual
>> function) ports for the purposes of control and monitoring. Each port
>> representor device represents a single VF and is associated with it's
>> parent physical function (PF) PMD which provides the back-end hooks for
>> the representor device ops and defines the control domain to which that
>> port belongs. This allows to use existing DPDK APIs to monitor and control
>> the port without the need to create and maintain VF specific APIs.
>>
>> +-----------------------------+   +---------------+  +---------------+
>> |        Control Plane        |   |   Data Plane  |  |   Data Plane  |
>> |         Application         |   |   Application |  |   Application |
>> +-----------------------------+   +---------------+  +---------------+
>> |         eth dev api         |   |  eth dev api  |  |  eth dev api  |
>> +-----------------------------+   +---------------+  +---------------+
>> +-------+  +-------+  +-------+   +---------------+  +---------------+
>> |  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
>> |  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
>> |       |  | PMD   |  | PMD   |                             |
>> +---+--^+  +-------+  +-+-----+                             |
>>      |  |                |  |                                |
>>      |  +----------------+  |                                |
>>      |                      |                                |
>>      |                      |                                |
>> +--------------------------------+                          |
>> |   |  HW (logical view)   |     |                          |
>> | --+------+ +-------+ +---+---+ |                          |
>> | |   PF   | |  VF0  | |  VF1  | |                          |
>> | |        | |       | |       +----------------------------+
>> | +--------+ +-------+ +-------+ |
>> | +----------------------------+ |
>> | |        VEB                 | |
>> | +----------------------------+ |
>> | +--------+                     |
>> | |  Port  |                     |
>> | |   0    |                     |
>> | +--------+                     |
>> +--------------------------------+
>>
>> The figure above shows a deployment where the PF is bound to a DPDK control
>> plane application which uses representor ports to manage the configuration and
>> monitoring of it's VF ports. Each virtual function is represented in the
>> application by a representor port PMD which enables control of the corresponding
>> VF through eth dev APIs on the representor PMD such as:
>>
>> - void rte_eth_promiscuous_enable(uint8_t port_id);
>> - void rte_eth_promiscuous_disable(uint8_t port_id);
>> - void rte_eth_allmulticast_enable(uint8_t port_id);
>> - void rte_eth_allmulticast_disable(uint8_t port_id);
>> - int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
>> 	uint32_t pool);
>> - int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);
>>
>> as well as monitoring through API's like
>>
>> - void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
>> - int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);
>>
>> The port representor infrastructure is enabled through a single common, device
>> independent, virtual PMD whos context is initialized and enabled through a
>> broker instance running within the context of the physical function device
>> driver.
>>
>> +-------------------------+       +-------------------------+
>> |        rte_ethdev       |       |       rte_ethdev        |
>> +-------------------------+       +-------------------------+
>> |  Physical Function PMD  |       |  Port Reperesentor PMD  |
>> |         +-------------+ |       | +---------+ +---------+ |
>> |         | Representor | |       | | dev_data| | dev_ops | |
>> |         |    Broker   | |       | +----+----+ +----+----+ |
>> |         | +---------+ | |       +------|-----------|------+
>> |         | | VF Port | | |              |           |
>> |         | | Context +------------------+           |
>> |         | +---------+ | |                          |
>> |         | +---------+ | |                          |
>> |         | | Handler +------------------------------+
>> |         | |   Ops   | | |
>> |         | +---------+ | |
>> |         +-------------+ |
>> +-------------------------+
>>
>> Creation of representor ports can be achieved either through the --vdev EAL
>> option or through the rte_vdev_init() API. Each port representor requires the
>> BDF of it's parent PF and the Virtual Function ID of the port which the
>> representor will support. During initialization of the representor PMD, it calls
>> the broker API to register itself with the PF PMD and to get it's context
>> configured which includes the setting up of it's context and ops function
>> handlers.
>>
>> As the port representor model is based around the paradigm of using standard
>> port based APIs, it will allow future expansion of functionality without the
>> need to add new APIs. For example it should be possible to support configuration
>> of egress QoS parameters using existing TM APIs by extending the port
>> representor PMD/broker infrastructure.
>>
>> Changes in v2:
>> * Rebased to DPDK 17.11
>>
>> Changes in v3:
>> * Removed RTE_ETH_DEV_REPRESENTOR_PORT define
>> * Removed switch_domain from struct rte_eth_dev_info
>>    (to be reintroduced seperately when required).
>> * Port Representor PMD functionality merged into main library
>> * Removed functions from .map file that are not for use by applications
>> * Some minor bugfixes (uninitalised variables & NULL terminators)
>> * Use of SPDX licence headers in new files
>> * Seperate headers for PMD and application
>> * SPDX-License-Identifier in new files
>> * Added test-pmd representor add/del commands
>>
>>
>> Remy Horton (5):
>>    lib: add Port Representor library
>>    eal: add Port Representor command-line option
>>    drivers/net/i40e: add Port Representor functionality
>>    drivers/net/ixgbe: add Port Representor functionality
>>    app/test-pmd: add Port Representor commands
>>
>>   app/test-pmd/cmdline.c                             |  88 ++++
>>   config/common_base                                 |   5 +
>>   drivers/net/i40e/Makefile                          |   1 +
>>   drivers/net/i40e/i40e_ethdev.c                     |  16 +
>>   drivers/net/i40e/i40e_ethdev.h                     |   1 +
>>   drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
>>   drivers/net/i40e/i40e_prep_ops.h                   |  15 +
>>   drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
>>   drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
>>   drivers/net/ixgbe/Makefile                         |   1 +
>>   drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
>>   drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
>>   drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
>>   drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
>>   lib/Makefile                                       |   3 +
>>   lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
>>   lib/librte_eal/common/eal_common_options.c         |   1 +
>>   lib/librte_eal/common/eal_internal_cfg.h           |   2 +
>>   lib/librte_eal/common/eal_options.h                |   2 +
>>   lib/librte_eal/common/include/rte_eal.h            |   8 +
>>   lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
>>   lib/librte_representor/Makefile                    |  26 ++
>>   lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
>>   lib/librte_representor/rte_port_representor.h      |  60 +++
>>   .../rte_port_representor_driver.h                  | 138 ++++++
>>   .../rte_port_representor_version.map               |   8 +
>>   mk/rte.app.mk                                      |   1 +
>>   27 files changed, 1577 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/net/i40e/i40e_prep_ops.c
>>   create mode 100644 drivers/net/i40e/i40e_prep_ops.h
>>   create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
>>   create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
>>   create mode 100644 lib/librte_representor/Makefile
>>   create mode 100644 lib/librte_representor/rte_port_representor.c
>>   create mode 100644 lib/librte_representor/rte_port_representor.h
>>   create mode 100644 lib/librte_representor/rte_port_representor_driver.h
>>   create mode 100644 lib/librte_representor/rte_port_representor_version.map
>>
>> -- 
>> 2.9.5
>>
>>
> Same question as in V1, how does this patch mesh with the notion of port
> ownership?  The consensus from that thread was that, because the DPDK considers
> itself lockless, that the application(s) managing/using a given port need to
> enforce their own mutual exclusion.  The design of this feature explicitly
> creates an alias effect, in that the same hardware (a VF in this case), can be
> accessed via two different port structures by two different applications without
> being aware of the shared nature of the port.  Given that one is in a vm,
> co-ordination between applications is going to be awkward at best.  How do you
> plan to enforce mutual exclusion in this environment, given that you don't have
> any shared memory between the vm and the host?

Without the port representor model, it is possible that both PF and VF 
can have shared access of the VF hardware. Lets take an example of i40e 
drivers [1]. Application running in host can modify the hardware 
configuration using the provided APIs in the reference [1]. Applications 
running on the VF can modify the hardware configuration using mailbox 
mechanism and with privileged mode.

With the port representor model, we still have the same shared access 
model. Only important difference is that, with port representor, we do 
not have to make those device specific private APIs public, as all can 
be accessed by eth_dev_ops.

In an OVS like scenario, it is expected that the control application 
will have sole privilege to modify the hardware configuration. 
Currently, we cannot stop the application running on VF to modify the 
hardware (hence your concern is valid but still accepted scenario as in 
reference [1]). If we do not want to allow the applications running on 
VF to modify the hardware, one idea is to intercept the interrupts 
regarding the malibox requested coming from the VF, and redirect to the 
port representor model. But this is out of the scope of port representor 
itself.

[1] http://dpdk.org/ml/archives/dev/2017-April/063642.html

> Neil
>

Regards,
Awal.
Mohammad Abdul Awal Jan. 3, 2018, 12:14 p.m. | #11
Hi Neil,


On 22/12/2017 16:37, Neil Horman wrote:
> On Fri, Dec 22, 2017 at 02:41:16PM +0000, Remy Horton wrote:
>> Port Representors provide a logical presentation in DPDK of VF (virtual
>> function) ports for the purposes of control and monitoring. Each port
>> representor device represents a single VF and is associated with it's
>> parent physical function (PF) PMD which provides the back-end hooks for
>> the representor device ops and defines the control domain to which that
>> port belongs. This allows to use existing DPDK APIs to monitor and control
>> the port without the need to create and maintain VF specific APIs.
>>
>> +-----------------------------+   +---------------+  +---------------+
>> |        Control Plane        |   |   Data Plane  |  |   Data Plane  |
>> |         Application         |   |   Application |  |   Application |
>> +-----------------------------+   +---------------+  +---------------+
>> |         eth dev api         |   |  eth dev api  |  |  eth dev api  |
>> +-----------------------------+   +---------------+  +---------------+
>> +-------+  +-------+  +-------+   +---------------+  +---------------+
>> |  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
>> |  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
>> |       |  | PMD   |  | PMD   |                             |
>> +---+--^+  +-------+  +-+-----+                             |
>>      |  |                |  |                                |
>>      |  +----------------+  |                                |
>>      |                      |                                |
>>      |                      |                                |
>> +--------------------------------+                          |
>> |   |  HW (logical view)   |     |                          |
>> | --+------+ +-------+ +---+---+ |                          |
>> | |   PF   | |  VF0  | |  VF1  | |                          |
>> | |        | |       | |       +----------------------------+
>> | +--------+ +-------+ +-------+ |
>> | +----------------------------+ |
>> | |        VEB                 | |
>> | +----------------------------+ |
>> | +--------+                     |
>> | |  Port  |                     |
>> | |   0    |                     |
>> | +--------+                     |
>> +--------------------------------+
>>
> How does this mesh with the notion of port ownership that we've been discussing
> in other threads?  In that thread, we've been discussing the need for a single
> execution context to have exclusive access to the hardware for the purposes of
> configuration and data i/o, and for the application/execution context to be
> responsible for co-ordination of any shared use of a device.  In this feature
> however, the notion of a Port Representor creates an alias to the same hardware
> funciton (VF), where both aliases (in the control and data plan) have parallel
> access to the hardware, in such a way that co-ordination between the two is
> largely impossible (unless you want to make the data plane application explicity
> aware of control plane actiivy).
>
> Neil

I have just replied to your mail in other thread of V3 patch.

Regards,
Awal.
Remy Horton Jan. 8, 2018, 2:37 p.m. | #12
Port Representors provide a logical presentation in DPDK of VF (virtual 
function) ports for the purposes of control and monitoring. Each port 
representor device represents a single VF and is associated with it's 
parent physical function (PF) PMD which provides the back-end hooks for 
the representor device ops and defines the control domain to which that 
port belongs. This allows to use existing DPDK APIs to monitor and control 
the port without the need to create and maintain VF specific APIs.

+-----------------------------+   +---------------+  +---------------+
|        Control Plane        |   |   Data Plane  |  |   Data Plane  |
|         Application         |   |   Application |  |   Application |
+-----------------------------+   +---------------+  +---------------+
|         eth dev api         |   |  eth dev api  |  |  eth dev api  |
+-----------------------------+   +---------------+  +---------------+
+-------+  +-------+  +-------+   +---------------+  +---------------+
|  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
|  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
|       |  | PMD   |  | PMD   |                             |
+---+--^+  +-------+  +-+-----+                             |
    |  |                |  |                                |
    |  +----------------+  |                                |
    |                      |                                |
    |                      |                                |
+--------------------------------+                          |
|   |  HW (logical view)   |     |                          |
| --+------+ +-------+ +---+---+ |                          |
| |   PF   | |  VF0  | |  VF1  | |                          |
| |        | |       | |       +----------------------------+
| +--------+ +-------+ +-------+ |
| +----------------------------+ |
| |        VEB                 | |
| +----------------------------+ |
| +--------+                     |
| |  Port  |                     |
| |   0    |                     |
| +--------+                     |
+--------------------------------+

The figure above shows a deployment where the PF is bound to a DPDK control
plane application which uses representor ports to manage the configuration and
monitoring of it's VF ports. Each virtual function is represented in the
application by a representor port PMD which enables control of the corresponding
VF through eth dev APIs on the representor PMD such as:

- void rte_eth_promiscuous_enable(uint8_t port_id);
- void rte_eth_promiscuous_disable(uint8_t port_id);
- void rte_eth_allmulticast_enable(uint8_t port_id);
- void rte_eth_allmulticast_disable(uint8_t port_id);
- int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
	uint32_t pool);
- int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);

as well as monitoring through API's like

- void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
- int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);

The port representor infrastructure is enabled through a single common, device
independent, virtual PMD whos context is initialized and enabled through a
broker instance running within the context of the physical function device
driver.

+-------------------------+       +-------------------------+
|        rte_ethdev       |       |       rte_ethdev        |
+-------------------------+       +-------------------------+
|  Physical Function PMD  |       |  Port Reperesentor PMD  |
|         +-------------+ |       | +---------+ +---------+ |
|         | Representor | |       | | dev_data| | dev_ops | |
|         |    Broker   | |       | +----+----+ +----+----+ |
|         | +---------+ | |       +------|-----------|------+
|         | | VF Port | | |              |           |
|         | | Context +------------------+           |
|         | +---------+ | |                          |
|         | +---------+ | |                          |
|         | | Handler +------------------------------+
|         | |   Ops   | | |
|         | +---------+ | |
|         +-------------+ |
+-------------------------+

Creation of representor ports can be achieved either through the --vdev EAL
option or through the rte_vdev_init() API. Each port representor requires the
BDF of it's parent PF and the Virtual Function ID of the port which the
representor will support. During initialization of the representor PMD, it calls
the broker API to register itself with the PF PMD and to get it's context
configured which includes the setting up of it's context and ops function
handlers.

As the port representor model is based around the paradigm of using standard
port based APIs, it will allow future expansion of functionality without the
need to add new APIs. For example it should be possible to support configuration
of egress QoS parameters using existing TM APIs by extending the port
representor PMD/broker infrastructure.

Changes in v2:
* Rebased to DPDK 17.11

Changes in v3:
* Removed RTE_ETH_DEV_REPRESENTOR_PORT define
* Removed switch_domain from struct rte_eth_dev_info
  (to be reintroduced seperately when required).
* Port Representor PMD functionality merged into main library
* Removed functions from .map file that are not for use by applications
* Some minor bugfixes (uninitalised variables & NULL terminators)
* Use of SPDX licence headers in new files
* Seperate headers for PMD and application
* SPDX-License-Identifier in new files
* Added test-pmd representor add/del commands

Changes in v4:
* Added documentation
* Rebased to a12f22678915


Remy Horton (5):
  lib: add Port Representor library
  eal: add Port Representor command-line option
  drivers/net/i40e: add Port Representor functionality
  drivers/net/ixgbe: add Port Representor functionality
  app/test-pmd: add Port Representor commands

 MAINTAINERS                                        |  12 +-
 app/test-pmd/cmdline.c                             |  88 ++++
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   3 +-
 doc/api/doxy-api.conf                              |   1 +
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/prog_guide/representor_lib.rst          |  62 +++
 doc/guides/rel_notes/release_18_02.rst             |   9 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  25 ++
 drivers/net/i40e/Makefile                          |   1 +
 drivers/net/i40e/i40e_ethdev.c                     |  16 +
 drivers/net/i40e/i40e_ethdev.h                     |   1 +
 drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
 drivers/net/i40e/i40e_prep_ops.h                   |  15 +
 drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
 drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
 drivers/net/ixgbe/Makefile                         |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
 drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
 drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
 lib/Makefile                                       |   3 +
 lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
 lib/librte_eal/common/eal_common_options.c         |   1 +
 lib/librte_eal/common/eal_internal_cfg.h           |   2 +
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/include/rte_eal.h            |   8 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
 lib/librte_representor/Makefile                    |  26 ++
 lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
 lib/librte_representor/rte_port_representor.h      |  60 +++
 .../rte_port_representor_driver.h                  | 138 ++++++
 .../rte_port_representor_version.map               |   8 +
 mk/rte.app.mk                                      |   1 +
 34 files changed, 1688 insertions(+), 3 deletions(-)
 create mode 100644 doc/guides/prog_guide/representor_lib.rst
 create mode 100644 drivers/net/i40e/i40e_prep_ops.c
 create mode 100644 drivers/net/i40e/i40e_prep_ops.h
 create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
 create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
 create mode 100644 lib/librte_representor/Makefile
 create mode 100644 lib/librte_representor/rte_port_representor.c
 create mode 100644 lib/librte_representor/rte_port_representor.h
 create mode 100644 lib/librte_representor/rte_port_representor_driver.h
 create mode 100644 lib/librte_representor/rte_port_representor_version.map
Ferruh Yigit Jan. 9, 2018, 10:01 p.m. | #13
<...>

> The port representor infrastructure is enabled through a single common, device
> independent, virtual PMD whos context is initialized and enabled through a
> broker instance running within the context of the physical function device
> driver.>
> +-------------------------+       +-------------------------+
> |        rte_ethdev       |       |       rte_ethdev        |
> +-------------------------+       +-------------------------+
> |  Physical Function PMD  |       |  Port Reperesentor PMD  |
> |         +-------------+ |       | +---------+ +---------+ |
> |         | Representor | |       | | dev_data| | dev_ops | |
> |         |    Broker   | |       | +----+----+ +----+----+ |
> |         | +---------+ | |       +------|-----------|------+
> |         | | VF Port | | |              |           |
> |         | | Context +------------------+           |
> |         | +---------+ | |                          |
> |         | +---------+ | |                          |
> |         | | Handler +------------------------------+
> |         | |   Ops   | | |
> |         | +---------+ | |
> |         +-------------+ |
> +-------------------------+
> 
> Creation of representor ports can be achieved either through the --vdev EAL
> option or through the rte_vdev_init() API. Each port representor requires the
> BDF of it's parent PF and the Virtual Function ID of the port which the
> representor will support. During initialization of the representor PMD, it calls
> the broker API to register itself with the PF PMD and to get it's context
> configured which includes the setting up of it's context and ops function
> handlers.

Above no more true, right?
Thomas Monjalon Jan. 9, 2018, 11:22 p.m. | #14
Hi,

08/01/2018 15:37, Remy Horton:
> Port Representors provide a logical presentation in DPDK of VF (virtual 
> function) ports for the purposes of control and monitoring. Each port 
> representor device represents a single VF and is associated with it's 
> parent physical function (PF) PMD which provides the back-end hooks for 
> the representor device ops and defines the control domain to which that 
> port belongs. This allows to use existing DPDK APIs to monitor and control 
> the port without the need to create and maintain VF specific APIs.

Extending control plane ability of DPDK has been discussed
multiple times.
The current agreed policy is:
"
The primary goal of DPDK is to provide a userspace dataplane.
Managing VFs from a PF driver is a control plane feature and developers
should generally rely on the Linux Kernel for that.
"
http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations

If we relax this policy, I think the representor solution should be
a real port, not only "for the purposes of control and monitoring".
It has been asked several times as replies to this series,
but it is kindly ignored, saying it will be thought later.

I don't see a general agreement on this series so far.
Declan Doherty Jan. 10, 2018, 1:46 p.m. | #15
On 09/01/2018 11:22 PM, Thomas Monjalon wrote:
> Hi,
> 
> 08/01/2018 15:37, Remy Horton:
>> Port Representors provide a logical presentation in DPDK of VF (virtual
>> function) ports for the purposes of control and monitoring. Each port
>> representor device represents a single VF and is associated with it's
>> parent physical function (PF) PMD which provides the back-end hooks for
>> the representor device ops and defines the control domain to which that
>> port belongs. This allows to use existing DPDK APIs to monitor and control
>> the port without the need to create and maintain VF specific APIs.
> 
> Extending control plane ability of DPDK has been discussed
> multiple times.

It has, and I have yet to see a really strong reason as to why we would 
not support control plane functions within DPDK, many of which are 
already support today implicitly anyway through our ethdev APIs.

> The current agreed policy is:
> "
> The primary goal of DPDK is to provide a userspace dataplane.
> Managing VFs from a PF driver is a control plane feature and developers
> should generally rely on the Linux Kernel for that.
> "
> http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations
> 

My understanding is that this particular entry was based around the
discussion on the divergence of functionality between the Linux kernel
PF driver and the DPDK PF driver. I also don't really think the above
statement is valid as a blanket statement for the project as it makes 
the assumption that DPDK is only deployed on Linux hosts, what about 
FreeBSD? or in the future Windows?

A number of presentations at both Userspace in Dublin and the Summit
in San Jose discussed the support of control plane functionality by
DPDK and there wasn't any strong arguments or opposition against using
DPDK for control plane functions that I saw.

In any case this patchset is not introducing any new control plane APIs 
that don't already exist within DPDK today, it only enables the creation 
of a new type of virtual PMDs which are linked to the same base 
infrastructure and which can be used to represent VFs in a control plane 
application as we have implemented in this patch set.

> If we relax this policy, I think the representor solution should be
> a real port, not only "for the purposes of control and monitoring".
> It has been asked several times as replies to this series,
> but it is kindly ignored, saying it will be thought later.
> 

I think we have stated in multiple discussions, especially during the
userspace presentation back in September that this solution supports
data path on the representors PMDs, and we have used the
infrastructure proposed here to do exactly what you are asking. As the
representor infrastructure doesn't preclude the support of a data
path, we have used it as it is presented here to implement a data path 
for exception path packets for a prototype vswitch offload implementation.


> I don't see a general agreement on this series so far.
> 

I think the main issue of contention is that there is a
misunderstanding that this implementation only supports control plane
management and monitoring, but that is not the case and it can be used
for full data path representors, with limited or no control plane
functionality if required, at the end of the day the only limitations
are based on what is implemented by the backend base driver were the 
broker is running for the representor ports.
Mohammad Abdul Awal Jan. 10, 2018, 2:17 p.m. | #16
On 09/01/2018 22:01, Ferruh Yigit wrote:
> <...>
>
>> The port representor infrastructure is enabled through a single common, device
>> independent, virtual PMD whos context is initialized and enabled through a
>> broker instance running within the context of the physical function device
>> driver.>
>> +-------------------------+       +-------------------------+
>> |        rte_ethdev       |       |       rte_ethdev        |
>> +-------------------------+       +-------------------------+
>> |  Physical Function PMD  |       |  Port Reperesentor PMD  |
>> |         +-------------+ |       | +---------+ +---------+ |
>> |         | Representor | |       | | dev_data| | dev_ops | |
>> |         |    Broker   | |       | +----+----+ +----+----+ |
>> |         | +---------+ | |       +------|-----------|------+
>> |         | | VF Port | | |              |           |
>> |         | | Context +------------------+           |
>> |         | +---------+ | |                          |
>> |         | +---------+ | |                          |
>> |         | | Handler +------------------------------+
>> |         | |   Ops   | | |
>> |         | +---------+ | |
>> |         +-------------+ |
>> +-------------------------+
>>
>> Creation of representor ports can be achieved either through the --vdev EAL
>> option or through the rte_vdev_init() API. Each port representor requires the
>> BDF of it's parent PF and the Virtual Function ID of the port which the
>> representor will support. During initialization of the representor PMD, it calls
>> the broker API to register itself with the PF PMD and to get it's context
>> configured which includes the setting up of it's context and ops function
>> handlers.
> Above no more true, right?
Right. We will rewrite these parts.
Remy Horton Jan. 10, 2018, 3:54 p.m. | #17
Port Representors provide a logical presentation in DPDK of VF (virtual 
function) ports for the purposes of control and monitoring. Each port 
representor device represents a single VF and is associated with it's 
parent physical function (PF) PMD which provides the back-end hooks for 
the representor device ops and defines the control domain to which that 
port belongs. This allows to use existing DPDK APIs to monitor and control 
the port without the need to create and maintain VF specific APIs.

+-----------------------------+   +---------------+  +---------------+
|        Control Plane        |   |   Data Plane  |  |   Data Plane  |
|         Application         |   |   Application |  |   Application |
+-----------------------------+   +---------------+  +---------------+
|         eth dev api         |   |  eth dev api  |  |  eth dev api  |
+-----------------------------+   +---------------+  +---------------+
+-------+  +-------+  +-------+   +---------------+  +---------------+
|  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
|  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
|       |  | PMD   |  | PMD   |                             |
+---+--^+  +-------+  +-+-----+                             |
    |  |                |  |                                |
    |  +----------------+  |                                |
    |                      |                                |
    |                      |                                |
+--------------------------------+                          |
|   |  HW (logical view)   |     |                          |
| --+------+ +-------+ +---+---+ |                          |
| |   PF   | |  VF0  | |  VF1  | |                          |
| |        | |       | |       +----------------------------+
| +--------+ +-------+ +-------+ |
| +----------------------------+ |
| |        VEB                 | |
| +----------------------------+ |
| +--------+                     |
| |  Port  |                     |
| |   0    |                     |
| +--------+                     |
+--------------------------------+

The figure above shows a deployment where the PF is bound to a DPDK control
plane application which uses representor ports to manage the configuration and
monitoring of it's VF ports. Each virtual function is represented in the
application by a representor port PMD which enables control of the corresponding
VF through eth dev APIs on the representor PMD such as:

- void rte_eth_promiscuous_enable(uint8_t port_id);
- void rte_eth_promiscuous_disable(uint8_t port_id);
- void rte_eth_allmulticast_enable(uint8_t port_id);
- void rte_eth_allmulticast_disable(uint8_t port_id);
- int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
	uint32_t pool);
- int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);

as well as monitoring through API's like

- void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
- int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);

The port representor infrastructure is enabled through a single common, device
independent, virtual PMD whos context is initialized and enabled through a
broker instance running within the context of the physical function device
driver.

+-------------------------+       +-------------------------+
|        rte_ethdev       |       |       rte_ethdev        |
+-------------------------+       +-------------------------+
|  Physical Function PMD  |       |  Port Reperesentor PMD  |
|         +-------------+ |       | +---------+ +---------+ |
|         | Representor | |       | | dev_data| | dev_ops | |
|         |    Broker   | |       | +----+----+ +----+----+ |
|         | +---------+ | |       +------|-----------|------+
|         | | VF Port | | |              |           |
|         | | Context +------------------+           |
|         | +---------+ | |                          |
|         | +---------+ | |                          |
|         | | Handler +------------------------------+
|         | |   Ops   | | |
|         | +---------+ | |
|         +-------------+ |
+-------------------------+

As the port representor model is based around the paradigm of using standard
port based APIs, it will allow future expansion of functionality without the
need to add new APIs. For example it should be possible to support configuration
of egress QoS parameters using existing TM APIs by extending the port
representor PMD/broker infrastructure.

Changes in v2:
* Rebased to DPDK 17.11

Changes in v3:
* Removed RTE_ETH_DEV_REPRESENTOR_PORT define
* Removed switch_domain from struct rte_eth_dev_info
  (to be reintroduced seperately when required).
* Port Representor PMD functionality merged into main library
* Removed functions from .map file that are not for use by applications
* Some minor bugfixes (uninitalised variables & NULL terminators)
* Use of SPDX licence headers in new files
* Seperate headers for PMD and application
* SPDX-License-Identifier in new files
* Added test-pmd representor add/del commands

Changes in v4:
* Added documentation
* Rebased to a12f22678915

Changes in v5:
* Documentation changes & additions
* Config file correctons
* Logging functions use library-specific tag
* Error-path memory leaks fixed
* Port Rep state enum moved out of struct
* Added rte_representor_enabled to .map
* PMD-specific files renamed for clarity
* In testpmd changed command to "port representor add|del <pf> <vf>"
* Added missing testpmd help
* Minor bugfix in handling of removal of i40e VF MAC addresses
* Removed "All rights reserved" from copyright.


Remy Horton (5):
  lib: add Port Representor library
  eal: add Port Representor command-line option
  drivers/net/i40e: add Port Representor functionality
  drivers/net/ixgbe: add Port Representor functionality
  app/test-pmd: add Port Representor commands

 MAINTAINERS                                        |  13 +-
 app/test-pmd/cmdline.c                             |  80 ++++
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/prog_guide/representor_lib.rst          |  62 +++
 doc/guides/rel_notes/release_18_02.rst             |  12 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  26 ++
 drivers/net/i40e/Makefile                          |   2 +
 drivers/net/i40e/i40e_ethdev.c                     |  19 +
 drivers/net/i40e/i40e_ethdev.h                     |   1 +
 drivers/net/i40e/i40e_port_representor_ops.c       | 516 +++++++++++++++++++++
 drivers/net/i40e/i40e_port_representor_ops.h       |  19 +
 drivers/net/i40e/rte_pmd_i40e.c                    |  43 ++
 drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
 drivers/net/i40e/rte_pmd_i40e_version.map          |   7 +
 drivers/net/ixgbe/Makefile                         |   2 +
 drivers/net/ixgbe/ixgbe_ethdev.c                   |  29 +-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
 drivers/net/ixgbe/ixgbe_port_representor_ops.c     | 274 +++++++++++
 drivers/net/ixgbe/ixgbe_port_representor_ops.h     |  19 +
 lib/Makefile                                       |   3 +
 lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
 lib/librte_eal/common/eal_common_log.c             |   1 +
 lib/librte_eal/common/eal_common_options.c         |   1 +
 lib/librte_eal/common/eal_internal_cfg.h           |   2 +
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/include/rte_eal.h            |   8 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
 lib/librte_eal/rte_eal_version.map                 |   1 +
 lib/librte_representor/Makefile                    |  26 ++
 lib/librte_representor/rte_port_representor.c      | 345 ++++++++++++++
 lib/librte_representor/rte_port_representor.h      |  60 +++
 .../rte_port_representor_driver.h                  | 140 ++++++
 .../rte_port_representor_version.map               |   8 +
 mk/rte.app.mk                                      |   1 +
 38 files changed, 1767 insertions(+), 2 deletions(-)
 create mode 100644 doc/guides/prog_guide/representor_lib.rst
 create mode 100644 drivers/net/i40e/i40e_port_representor_ops.c
 create mode 100644 drivers/net/i40e/i40e_port_representor_ops.h
 create mode 100644 drivers/net/ixgbe/ixgbe_port_representor_ops.c
 create mode 100644 drivers/net/ixgbe/ixgbe_port_representor_ops.h
 create mode 100644 lib/librte_representor/Makefile
 create mode 100644 lib/librte_representor/rte_port_representor.c
 create mode 100644 lib/librte_representor/rte_port_representor.h
 create mode 100644 lib/librte_representor/rte_port_representor_driver.h
 create mode 100644 lib/librte_representor/rte_port_representor_version.map
Remy Horton Jan. 10, 2018, 5:49 p.m. | #18
Port Representors provide a logical presentation in DPDK of VF (virtual 
function) ports for the purposes of control and monitoring. Each port 
representor device represents a single VF and is associated with it's 
parent physical function (PF) PMD which provides the back-end hooks for 
the representor device ops and defines the control domain to which that 
port belongs. This allows to use existing DPDK APIs to monitor and control 
the port without the need to create and maintain VF specific APIs.

+-----------------------------+   +---------------+  +---------------+
|        Control Plane        |   |   Data Plane  |  |   Data Plane  |
|         Application         |   |   Application |  |   Application |
+-----------------------------+   +---------------+  +---------------+
|         eth dev api         |   |  eth dev api  |  |  eth dev api  |
+-----------------------------+   +---------------+  +---------------+
+-------+  +-------+  +-------+   +---------------+  +---------------+
|  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
|  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
|       |  | PMD   |  | PMD   |                             |
+---+--^+  +-------+  +-+-----+                             |
    |  |                |  |                                |
    |  +----------------+  |                                |
    |                      |                                |
    |                      |                                |
+--------------------------------+                          |
|   |  HW (logical view)   |     |                          |
| --+------+ +-------+ +---+---+ |                          |
| |   PF   | |  VF0  | |  VF1  | |                          |
| |        | |       | |       +----------------------------+
| +--------+ +-------+ +-------+ |
| +----------------------------+ |
| |        VEB                 | |
| +----------------------------+ |
| +--------+                     |
| |  Port  |                     |
| |   0    |                     |
| +--------+                     |
+--------------------------------+

The figure above shows a deployment where the PF is bound to a DPDK control
plane application which uses representor ports to manage the configuration and
monitoring of it's VF ports. Each virtual function is represented in the
application by a representor port PMD which enables control of the corresponding
VF through eth dev APIs on the representor PMD such as:

- void rte_eth_promiscuous_enable(uint8_t port_id);
- void rte_eth_promiscuous_disable(uint8_t port_id);
- void rte_eth_allmulticast_enable(uint8_t port_id);
- void rte_eth_allmulticast_disable(uint8_t port_id);
- int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
	uint32_t pool);
- int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);

as well as monitoring through API's like

- void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
- int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);

The port representor infrastructure is enabled through a single common, device
independent, virtual PMD whos context is initialized and enabled through a
broker instance running within the context of the physical function device
driver.

+-------------------------+       +-------------------------+
|        rte_ethdev       |       |       rte_ethdev        |
+-------------------------+       +-------------------------+
|  Physical Function PMD  |       |  Port Reperesentor PMD  |
|         +-------------+ |       | +---------+ +---------+ |
|         | Representor | |       | | dev_data| | dev_ops | |
|         |    Broker   | |       | +----+----+ +----+----+ |
|         | +---------+ | |       +------|-----------|------+
|         | | VF Port | | |              |           |
|         | | Context +------------------+           |
|         | +---------+ | |                          |
|         | +---------+ | |                          |
|         | | Handler +------------------------------+
|         | |   Ops   | | |
|         | +---------+ | |
|         +-------------+ |
+-------------------------+

As the port representor model is based around the paradigm of using standard
port based APIs, it will allow future expansion of functionality without the
need to add new APIs. For example it should be possible to support configuration
of egress QoS parameters using existing TM APIs by extending the port
representor PMD/broker infrastructure.

Changes in v2:
* Rebased to DPDK 17.11

Changes in v3:
* Removed RTE_ETH_DEV_REPRESENTOR_PORT define
* Removed switch_domain from struct rte_eth_dev_info
  (to be reintroduced seperately when required).
* Port Representor PMD functionality merged into main library
* Removed functions from .map file that are not for use by applications
* Some minor bugfixes (uninitalised variables & NULL terminators)
* Use of SPDX licence headers in new files
* Seperate headers for PMD and application
* SPDX-License-Identifier in new files
* Added test-pmd representor add/del commands

Changes in v4:
* Added documentation
* Rebased to a12f22678915

Changes in v5:
* Documentation changes & additions
* Config file correctons
* Logging functions use library-specific tag
* Error-path memory leaks fixed
* Port Rep state enum moved out of struct
* Added rte_representor_enabled to .map
* PMD-specific files renamed for clarity
* In testpmd changed command to "port representor add|del <pf> <vf>"
* Added missing testpmd help
* Minor bugfix in handling of removal of i40e VF MAC addresses
* Removed "All rights reserved" from copyright.

Changes in v6:
 Checkpatch-related corrections


Remy Horton (5):
  lib: add Port Representor library
  eal: add Port Representor command-line option
  drivers/net/i40e: add Port Representor functionality
  drivers/net/ixgbe: add Port Representor functionality
  app/test-pmd: add Port Representor commands

 MAINTAINERS                                        |  13 +-
 app/test-pmd/cmdline.c                             |  81 ++++
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 doc/guides/prog_guide/index.rst                    |   1 +
 doc/guides/prog_guide/representor_lib.rst          |  62 +++
 doc/guides/rel_notes/release_18_02.rst             |  12 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  26 ++
 drivers/net/i40e/Makefile                          |   2 +
 drivers/net/i40e/i40e_ethdev.c                     |  19 +
 drivers/net/i40e/i40e_ethdev.h                     |   1 +
 drivers/net/i40e/i40e_port_representor_ops.c       | 516 +++++++++++++++++++++
 drivers/net/i40e/i40e_port_representor_ops.h       |  19 +
 drivers/net/i40e/rte_pmd_i40e.c                    |  43 ++
 drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
 drivers/net/i40e/rte_pmd_i40e_version.map          |   7 +
 drivers/net/ixgbe/Makefile                         |   2 +
 drivers/net/ixgbe/ixgbe_ethdev.c                   |  29 +-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
 drivers/net/ixgbe/ixgbe_port_representor_ops.c     | 274 +++++++++++
 drivers/net/ixgbe/ixgbe_port_representor_ops.h     |  19 +
 lib/Makefile                                       |   3 +
 lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
 lib/librte_eal/common/eal_common_log.c             |   1 +
 lib/librte_eal/common/eal_common_options.c         |   1 +
 lib/librte_eal/common/eal_internal_cfg.h           |   2 +
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/include/rte_eal.h            |   8 +
 lib/librte_eal/common/include/rte_log.h            |   1 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
 lib/librte_eal/rte_eal_version.map                 |   1 +
 lib/librte_representor/Makefile                    |  26 ++
 lib/librte_representor/rte_port_representor.c      | 350 ++++++++++++++
 lib/librte_representor/rte_port_representor.h      |  60 +++
 .../rte_port_representor_driver.h                  | 140 ++++++
 .../rte_port_representor_version.map               |   8 +
 mk/rte.app.mk                                      |   1 +
 38 files changed, 1773 insertions(+), 2 deletions(-)
 create mode 100644 doc/guides/prog_guide/representor_lib.rst
 create mode 100644 drivers/net/i40e/i40e_port_representor_ops.c
 create mode 100644 drivers/net/i40e/i40e_port_representor_ops.h
 create mode 100644 drivers/net/ixgbe/ixgbe_port_representor_ops.c
 create mode 100644 drivers/net/ixgbe/ixgbe_port_representor_ops.h
 create mode 100644 lib/librte_representor/Makefile
 create mode 100644 lib/librte_representor/rte_port_representor.c
 create mode 100644 lib/librte_representor/rte_port_representor.h
 create mode 100644 lib/librte_representor/rte_port_representor_driver.h
 create mode 100644 lib/librte_representor/rte_port_representor_version.map
Thomas Monjalon Jan. 10, 2018, 7:26 p.m. | #19
10/01/2018 14:46, Doherty, Declan:
> On 09/01/2018 11:22 PM, Thomas Monjalon wrote:
> > Hi,
> > 
> > 08/01/2018 15:37, Remy Horton:
> >> Port Representors provide a logical presentation in DPDK of VF (virtual
> >> function) ports for the purposes of control and monitoring. Each port
> >> representor device represents a single VF and is associated with it's
> >> parent physical function (PF) PMD which provides the back-end hooks for
> >> the representor device ops and defines the control domain to which that
> >> port belongs. This allows to use existing DPDK APIs to monitor and control
> >> the port without the need to create and maintain VF specific APIs.
> > 
> > Extending control plane ability of DPDK has been discussed
> > multiple times.
> 
> It has, and I have yet to see a really strong reason as to why we would 
> not support control plane functions within DPDK, many of which are 
> already support today implicitly anyway through our ethdev APIs.
> 
> > The current agreed policy is:
> > "
> > The primary goal of DPDK is to provide a userspace dataplane.
> > Managing VFs from a PF driver is a control plane feature and developers
> > should generally rely on the Linux Kernel for that.
> > "
> > http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations
> > 
> 
> My understanding is that this particular entry was based around the
> discussion on the divergence of functionality between the Linux kernel
> PF driver and the DPDK PF driver. I also don't really think the above
> statement is valid as a blanket statement for the project as it makes 
> the assumption that DPDK is only deployed on Linux hosts, what about 
> FreeBSD? or in the future Windows?

Yes, we must agree on removing this scope limitation while working
on a generic VF representor.

> A number of presentations at both Userspace in Dublin and the Summit
> in San Jose discussed the support of control plane functionality by
> DPDK and there wasn't any strong arguments or opposition against using
> DPDK for control plane functions that I saw.
> 
> In any case this patchset is not introducing any new control plane APIs 
> that don't already exist within DPDK today, it only enables the creation 
> of a new type of virtual PMDs which are linked to the same base 
> infrastructure and which can be used to represent VFs in a control plane 
> application as we have implemented in this patch set.
> 
> > If we relax this policy, I think the representor solution should be
> > a real port, not only "for the purposes of control and monitoring".
> > It has been asked several times as replies to this series,
> > but it is kindly ignored, saying it will be thought later.
> > 
> 
> I think we have stated in multiple discussions, especially during the
> userspace presentation back in September that this solution supports
> data path on the representors PMDs, and we have used the
> infrastructure proposed here to do exactly what you are asking. As the
> representor infrastructure doesn't preclude the support of a data
> path, we have used it as it is presented here to implement a data path 
> for exception path packets for a prototype vswitch offload implementation.
> 
> 
> > I don't see a general agreement on this series so far.
> > 
> 
> I think the main issue of contention is that there is a
> misunderstanding that this implementation only supports control plane
> management and monitoring, but that is not the case and it can be used
> for full data path representors, with limited or no control plane
> functionality if required, at the end of the day the only limitations
> are based on what is implemented by the backend base driver were the 
> broker is running for the representor ports.

The misunderstanding may originates from what you describe (even in v5):
"ports for the purposes of control and monitoring"

I think everybody agree to have VF representors in DPDK.
But there are few things which are not generic enough,
and not easy to use.
I hoped the discussion started at Dublin would continue
on the mailing list but I realize the joint effort with other vendors
did not happen.
I will elaborate quickly below and more detailed in later review.

1/ In order to keep track of the relations between PF, VF and
representor - which are all ethdev - you create a struct outside
of ethdev. I think it should be managed inside ethdev API.
As suggested by others, we could also think whether a switchdev API
is interesting or not.

2/ You create a new library for ethdev device management.
It is the responsibility of the bus to scan and probe devices.
In Intel case, the representor has no real bus so it must rely on
the vdev bus. Note that the new custom scan hook may help.
In Mellanox case, the representor already exists in Linux, and is based
on PCI bus.
Then, as any other port, it must be managed with whitelist or blacklist.

2-bis/ Minor comments about this lib:
	- it is not convenient to ask user or app to enable it
	- it is not using dynamic logging

3/ You are using PCI address + index to identify the representor.
It is a no-go. We have made effort to abstract buses.
As an idea, the identification of a representor could use the new
proposed flexible device syntax.

4/ Such new API must be experimental.

I propose to better think the representor API inside ethdev
with a good multi-vendor collaboration,
and submit a deprecation notice for 18.05 integration.
Yuanhan Liu Jan. 11, 2018, 3:18 a.m. | #20
On Mon, Jan 08, 2018 at 02:37:15PM +0000, Remy Horton wrote:
> Port Representors provide a logical presentation in DPDK of VF (virtual 
> function) ports for the purposes of control and monitoring. Each port 
> representor device represents a single VF and is associated with it's 
> parent physical function (PF) PMD which provides the back-end hooks for 
> the representor device ops and defines the control domain to which that 
> port belongs. This allows to use existing DPDK APIs to monitor and control 
> the port without the need to create and maintain VF specific APIs.

Firstly, I don't object this model. More precisely, I don't object for
introducing a port to control the VFs. I even like this idea a bit, for
at least it could get rid of some PMD specific APIs, say
rte_pmd_i40e_set_vf_mac_addr, etc.

However, I don't quite like this design, for a simple reason, it makes
things way more complex:

- new APIs are introduced.
- new virtual PMD is required
- broker concept


I was thinking below should work:

- Use the devargs for enabling the port (or VF) representor model.
  For example: -w 04:00.0,vf_ports=0-3

  The vf_ports is for telling the driver we are going to create VF
  representors for VF 0 to VF 3. It could be a port mask like what
  this patchset does.

- Then inside the driver (say, i40e)

  * allocate 1 DPDK ethdev port
  * allocate 4 DPDK ethdev port, one for each VF specified by
    the vf_ports option. And these are VF representors.
  * link the VF representors to the right VF port
    For example, for i40e, it should be (from patch 3):
        vf = &pf->vfs[representor->vport_id];
  * set the proper ethdev callbacks for the VF representors.

As you can see, none of above complexity is introduced.

Probably you might find something are missing:

- to identify the vf rep port for a specific VF.

  Which could be addressed by the new devargs syntax we are proposing:
  http://dpdk.org/ml/archives/dev/2017-December/084234.html

  For example, the right VF rep port could be identified by below devarg:
      bus=pci,id=04:00.0/class=eth,vf_id=0

- the ability of hotplug

  I think it could also be addressed by the new devargs syntax, also by
  re-using the rte_eth_dev_attach() function:

      rte_eth_dev_attach("bus=pci,id=04:00.0/class=eth,vf_id=4", &port_id);
  

Thoughts?

	--yliu
> 
> +-----------------------------+   +---------------+  +---------------+
> |        Control Plane        |   |   Data Plane  |  |   Data Plane  |
> |         Application         |   |   Application |  |   Application |
> +-----------------------------+   +---------------+  +---------------+
> |         eth dev api         |   |  eth dev api  |  |  eth dev api  |
> +-----------------------------+   +---------------+  +---------------+
> +-------+  +-------+  +-------+   +---------------+  +---------------+
> |  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
> |  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
> |       |  | PMD   |  | PMD   |                             |
> +---+--^+  +-------+  +-+-----+                             |
>     |  |                |  |                                |
>     |  +----------------+  |                                |
>     |                      |                                |
>     |                      |                                |
> +--------------------------------+                          |
> |   |  HW (logical view)   |     |                          |
> | --+------+ +-------+ +---+---+ |                          |
> | |   PF   | |  VF0  | |  VF1  | |                          |
> | |        | |       | |       +----------------------------+
> | +--------+ +-------+ +-------+ |
> | +----------------------------+ |
> | |        VEB                 | |
> | +----------------------------+ |
> | +--------+                     |
> | |  Port  |                     |
> | |   0    |                     |
> | +--------+                     |
> +--------------------------------+
> 
> The figure above shows a deployment where the PF is bound to a DPDK control
> plane application which uses representor ports to manage the configuration and
> monitoring of it's VF ports. Each virtual function is represented in the
> application by a representor port PMD which enables control of the corresponding
> VF through eth dev APIs on the representor PMD such as:
> 
> - void rte_eth_promiscuous_enable(uint8_t port_id);
> - void rte_eth_promiscuous_disable(uint8_t port_id);
> - void rte_eth_allmulticast_enable(uint8_t port_id);
> - void rte_eth_allmulticast_disable(uint8_t port_id);
> - int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
> 	uint32_t pool);
> - int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);
> 
> as well as monitoring through API's like
> 
> - void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
> - int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);
> 
> The port representor infrastructure is enabled through a single common, device
> independent, virtual PMD whos context is initialized and enabled through a
> broker instance running within the context of the physical function device
> driver.
> 
> +-------------------------+       +-------------------------+
> |        rte_ethdev       |       |       rte_ethdev        |
> +-------------------------+       +-------------------------+
> |  Physical Function PMD  |       |  Port Reperesentor PMD  |
> |         +-------------+ |       | +---------+ +---------+ |
> |         | Representor | |       | | dev_data| | dev_ops | |
> |         |    Broker   | |       | +----+----+ +----+----+ |
> |         | +---------+ | |       +------|-----------|------+
> |         | | VF Port | | |              |           |
> |         | | Context +------------------+           |
> |         | +---------+ | |                          |
> |         | +---------+ | |                          |
> |         | | Handler +------------------------------+
> |         | |   Ops   | | |
> |         | +---------+ | |
> |         +-------------+ |
> +-------------------------+
> 
> Creation of representor ports can be achieved either through the --vdev EAL
> option or through the rte_vdev_init() API. Each port representor requires the
> BDF of it's parent PF and the Virtual Function ID of the port which the
> representor will support. During initialization of the representor PMD, it calls
> the broker API to register itself with the PF PMD and to get it's context
> configured which includes the setting up of it's context and ops function
> handlers.
> 
> As the port representor model is based around the paradigm of using standard
> port based APIs, it will allow future expansion of functionality without the
> need to add new APIs. For example it should be possible to support configuration
> of egress QoS parameters using existing TM APIs by extending the port
> representor PMD/broker infrastructure.
> 
> Changes in v2:
> * Rebased to DPDK 17.11
> 
> Changes in v3:
> * Removed RTE_ETH_DEV_REPRESENTOR_PORT define
> * Removed switch_domain from struct rte_eth_dev_info
>   (to be reintroduced seperately when required).
> * Port Representor PMD functionality merged into main library
> * Removed functions from .map file that are not for use by applications
> * Some minor bugfixes (uninitalised variables & NULL terminators)
> * Use of SPDX licence headers in new files
> * Seperate headers for PMD and application
> * SPDX-License-Identifier in new files
> * Added test-pmd representor add/del commands
> 
> Changes in v4:
> * Added documentation
> * Rebased to a12f22678915
> 
> 
> Remy Horton (5):
>   lib: add Port Representor library
>   eal: add Port Representor command-line option
>   drivers/net/i40e: add Port Representor functionality
>   drivers/net/ixgbe: add Port Representor functionality
>   app/test-pmd: add Port Representor commands
> 
>  MAINTAINERS                                        |  12 +-
>  app/test-pmd/cmdline.c                             |  88 ++++
>  config/common_base                                 |   5 +
>  doc/api/doxy-api-index.md                          |   3 +-
>  doc/api/doxy-api.conf                              |   1 +
>  doc/guides/prog_guide/index.rst                    |   1 +
>  doc/guides/prog_guide/representor_lib.rst          |  62 +++
>  doc/guides/rel_notes/release_18_02.rst             |   9 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  25 ++
>  drivers/net/i40e/Makefile                          |   1 +
>  drivers/net/i40e/i40e_ethdev.c                     |  16 +
>  drivers/net/i40e/i40e_ethdev.h                     |   1 +
>  drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
>  drivers/net/i40e/i40e_prep_ops.h                   |  15 +
>  drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
>  drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
>  drivers/net/ixgbe/Makefile                         |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
>  drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
>  drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
>  drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
>  lib/Makefile                                       |   3 +
>  lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
>  lib/librte_eal/common/eal_common_options.c         |   1 +
>  lib/librte_eal/common/eal_internal_cfg.h           |   2 +
>  lib/librte_eal/common/eal_options.h                |   2 +
>  lib/librte_eal/common/include/rte_eal.h            |   8 +
>  lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
>  lib/librte_representor/Makefile                    |  26 ++
>  lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
>  lib/librte_representor/rte_port_representor.h      |  60 +++
>  .../rte_port_representor_driver.h                  | 138 ++++++
>  .../rte_port_representor_version.map               |   8 +
>  mk/rte.app.mk                                      |   1 +
>  34 files changed, 1688 insertions(+), 3 deletions(-)
>  create mode 100644 doc/guides/prog_guide/representor_lib.rst
>  create mode 100644 drivers/net/i40e/i40e_prep_ops.c
>  create mode 100644 drivers/net/i40e/i40e_prep_ops.h
>  create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
>  create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
>  create mode 100644 lib/librte_representor/Makefile
>  create mode 100644 lib/librte_representor/rte_port_representor.c
>  create mode 100644 lib/librte_representor/rte_port_representor.h
>  create mode 100644 lib/librte_representor/rte_port_representor_driver.h
>  create mode 100644 lib/librte_representor/rte_port_representor_version.map
> 
> -- 
> 2.9.5
Alex Rosenbaum Jan. 11, 2018, 8:37 a.m. | #21
Declan,

In all previous series I explained my objection for using the broker
framework for holding the VF rep port.
You guys just waved it away (and sent a new series) instead of really
giving a clear reason why they are required.

I think Yuanhan's suggestion gets ride of the broker model in a clean way.
It exposes the VF rep's and keeps much similar to port model to allow
both control and second phase data path. The flow steering actions
suggestion match much nicer with this as well.

PS: me and Yuanhan did exchange ideas regarding this previously, so
it's not surprising I agree with his suggestion.

Alex


On Thu, Jan 11, 2018 at 5:18 AM, Yuanhan Liu <yliu@fridaylinux.org> wrote:
> On Mon, Jan 08, 2018 at 02:37:15PM +0000, Remy Horton wrote:
>> Port Representors provide a logical presentation in DPDK of VF (virtual
>> function) ports for the purposes of control and monitoring. Each port
>> representor device represents a single VF and is associated with it's
>> parent physical function (PF) PMD which provides the back-end hooks for
>> the representor device ops and defines the control domain to which that
>> port belongs. This allows to use existing DPDK APIs to monitor and control
>> the port without the need to create and maintain VF specific APIs.
>
> Firstly, I don't object this model. More precisely, I don't object for
> introducing a port to control the VFs. I even like this idea a bit, for
> at least it could get rid of some PMD specific APIs, say
> rte_pmd_i40e_set_vf_mac_addr, etc.
>
> However, I don't quite like this design, for a simple reason, it makes
> things way more complex:
>
> - new APIs are introduced.
> - new virtual PMD is required
> - broker concept
>
>
> I was thinking below should work:
>
> - Use the devargs for enabling the port (or VF) representor model.
>   For example: -w 04:00.0,vf_ports=0-3
>
>   The vf_ports is for telling the driver we are going to create VF
>   representors for VF 0 to VF 3. It could be a port mask like what
>   this patchset does.
>
> - Then inside the driver (say, i40e)
>
>   * allocate 1 DPDK ethdev port
>   * allocate 4 DPDK ethdev port, one for each VF specified by
>     the vf_ports option. And these are VF representors.
>   * link the VF representors to the right VF port
>     For example, for i40e, it should be (from patch 3):
>         vf = &pf->vfs[representor->vport_id];
>   * set the proper ethdev callbacks for the VF representors.
>
> As you can see, none of above complexity is introduced.
>
> Probably you might find something are missing:
>
> - to identify the vf rep port for a specific VF.
>
>   Which could be addressed by the new devargs syntax we are proposing:
>   http://dpdk.org/ml/archives/dev/2017-December/084234.html
>
>   For example, the right VF rep port could be identified by below devarg:
>       bus=pci,id=04:00.0/class=eth,vf_id=0
>
> - the ability of hotplug
>
>   I think it could also be addressed by the new devargs syntax, also by
>   re-using the rte_eth_dev_attach() function:
>
>       rte_eth_dev_attach("bus=pci,id=04:00.0/class=eth,vf_id=4", &port_id);
>
>
> Thoughts?
>
>         --yliu
>>
>> +-----------------------------+   +---------------+  +---------------+
>> |        Control Plane        |   |   Data Plane  |  |   Data Plane  |
>> |         Application         |   |   Application |  |   Application |
>> +-----------------------------+   +---------------+  +---------------+
>> |         eth dev api         |   |  eth dev api  |  |  eth dev api  |
>> +-----------------------------+   +---------------+  +---------------+
>> +-------+  +-------+  +-------+   +---------------+  +---------------+
>> |  PF0  |  | Port  |  | Port  |   |    VF0 PMD    |  |    VF0 PMD    |
>> |  PMD  <--+ Rep 0 |  | Rep 1 |   +---------------+  +------+--------+
>> |       |  | PMD   |  | PMD   |                             |
>> +---+--^+  +-------+  +-+-----+                             |
>>     |  |                |  |                                |
>>     |  +----------------+  |                                |
>>     |                      |                                |
>>     |                      |                                |
>> +--------------------------------+                          |
>> |   |  HW (logical view)   |     |                          |
>> | --+------+ +-------+ +---+---+ |                          |
>> | |   PF   | |  VF0  | |  VF1  | |                          |
>> | |        | |       | |       +----------------------------+
>> | +--------+ +-------+ +-------+ |
>> | +----------------------------+ |
>> | |        VEB                 | |
>> | +----------------------------+ |
>> | +--------+                     |
>> | |  Port  |                     |
>> | |   0    |                     |
>> | +--------+                     |
>> +--------------------------------+
>>
>> The figure above shows a deployment where the PF is bound to a DPDK control
>> plane application which uses representor ports to manage the configuration and
>> monitoring of it's VF ports. Each virtual function is represented in the
>> application by a representor port PMD which enables control of the corresponding
>> VF through eth dev APIs on the representor PMD such as:
>>
>> - void rte_eth_promiscuous_enable(uint8_t port_id);
>> - void rte_eth_promiscuous_disable(uint8_t port_id);
>> - void rte_eth_allmulticast_enable(uint8_t port_id);
>> - void rte_eth_allmulticast_disable(uint8_t port_id);
>> - int rte_eth_dev_mac_addr_add(uint8_t port, struct ether_addr *mac_addr,
>>       uint32_t pool);
>> - int rte_eth_dev_set_vlan_offload(uint8_t port_id, int offload_mask);
>>
>> as well as monitoring through API's like
>>
>> - void rte_eth_link_get(uint8_t port_id, struct rte_eth_link *link);
>> - int rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats);
>>
>> The port representor infrastructure is enabled through a single common, device
>> independent, virtual PMD whos context is initialized and enabled through a
>> broker instance running within the context of the physical function device
>> driver.
>>
>> +-------------------------+       +-------------------------+
>> |        rte_ethdev       |       |       rte_ethdev        |
>> +-------------------------+       +-------------------------+
>> |  Physical Function PMD  |       |  Port Reperesentor PMD  |
>> |         +-------------+ |       | +---------+ +---------+ |
>> |         | Representor | |       | | dev_data| | dev_ops | |
>> |         |    Broker   | |       | +----+----+ +----+----+ |
>> |         | +---------+ | |       +------|-----------|------+
>> |         | | VF Port | | |              |           |
>> |         | | Context +------------------+           |
>> |         | +---------+ | |                          |
>> |         | +---------+ | |                          |
>> |         | | Handler +------------------------------+
>> |         | |   Ops   | | |
>> |         | +---------+ | |
>> |         +-------------+ |
>> +-------------------------+
>>
>> Creation of representor ports can be achieved either through the --vdev EAL
>> option or through the rte_vdev_init() API. Each port representor requires the
>> BDF of it's parent PF and the Virtual Function ID of the port which the
>> representor will support. During initialization of the representor PMD, it calls
>> the broker API to register itself with the PF PMD and to get it's context
>> configured which includes the setting up of it's context and ops function
>> handlers.
>>
>> As the port representor model is based around the paradigm of using standard
>> port based APIs, it will allow future expansion of functionality without the
>> need to add new APIs. For example it should be possible to support configuration
>> of egress QoS parameters using existing TM APIs by extending the port
>> representor PMD/broker infrastructure.
>>
>> Changes in v2:
>> * Rebased to DPDK 17.11
>>
>> Changes in v3:
>> * Removed RTE_ETH_DEV_REPRESENTOR_PORT define
>> * Removed switch_domain from struct rte_eth_dev_info
>>   (to be reintroduced seperately when required).
>> * Port Representor PMD functionality merged into main library
>> * Removed functions from .map file that are not for use by applications
>> * Some minor bugfixes (uninitalised variables & NULL terminators)
>> * Use of SPDX licence headers in new files
>> * Seperate headers for PMD and application
>> * SPDX-License-Identifier in new files
>> * Added test-pmd representor add/del commands
>>
>> Changes in v4:
>> * Added documentation
>> * Rebased to a12f22678915
>>
>>
>> Remy Horton (5):
>>   lib: add Port Representor library
>>   eal: add Port Representor command-line option
>>   drivers/net/i40e: add Port Representor functionality
>>   drivers/net/ixgbe: add Port Representor functionality
>>   app/test-pmd: add Port Representor commands
>>
>>  MAINTAINERS                                        |  12 +-
>>  app/test-pmd/cmdline.c                             |  88 ++++
>>  config/common_base                                 |   5 +
>>  doc/api/doxy-api-index.md                          |   3 +-
>>  doc/api/doxy-api.conf                              |   1 +
>>  doc/guides/prog_guide/index.rst                    |   1 +
>>  doc/guides/prog_guide/representor_lib.rst          |  62 +++
>>  doc/guides/rel_notes/release_18_02.rst             |   9 +
>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst        |  25 ++
>>  drivers/net/i40e/Makefile                          |   1 +
>>  drivers/net/i40e/i40e_ethdev.c                     |  16 +
>>  drivers/net/i40e/i40e_ethdev.h                     |   1 +
>>  drivers/net/i40e/i40e_prep_ops.c                   | 495 +++++++++++++++++++++
>>  drivers/net/i40e/i40e_prep_ops.h                   |  15 +
>>  drivers/net/i40e/rte_pmd_i40e.c                    |  47 ++
>>  drivers/net/i40e/rte_pmd_i40e.h                    |  18 +
>>  drivers/net/ixgbe/Makefile                         |   1 +
>>  drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
>>  drivers/net/ixgbe/ixgbe_ethdev.h                   |   5 +
>>  drivers/net/ixgbe/ixgbe_prep_ops.c                 | 259 +++++++++++
>>  drivers/net/ixgbe/ixgbe_prep_ops.h                 |  15 +
>>  lib/Makefile                                       |   3 +
>>  lib/librte_eal/bsdapp/eal/eal.c                    |   6 +
>>  lib/librte_eal/common/eal_common_options.c         |   1 +
>>  lib/librte_eal/common/eal_internal_cfg.h           |   2 +
>>  lib/librte_eal/common/eal_options.h                |   2 +
>>  lib/librte_eal/common/include/rte_eal.h            |   8 +
>>  lib/librte_eal/linuxapp/eal/eal.c                  |   9 +
>>  lib/librte_representor/Makefile                    |  26 ++
>>  lib/librte_representor/rte_port_representor.c      | 326 ++++++++++++++
>>  lib/librte_representor/rte_port_representor.h      |  60 +++
>>  .../rte_port_representor_driver.h                  | 138 ++++++
>>  .../rte_port_representor_version.map               |   8 +
>>  mk/rte.app.mk                                      |   1 +
>>  34 files changed, 1688 insertions(+), 3 deletions(-)
>>  create mode 100644 doc/guides/prog_guide/representor_lib.rst
>>  create mode 100644 drivers/net/i40e/i40e_prep_ops.c
>>  create mode 100644 drivers/net/i40e/i40e_prep_ops.h
>>  create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.c
>>  create mode 100644 drivers/net/ixgbe/ixgbe_prep_ops.h
>>  create mode 100644 lib/librte_representor/Makefile
>>  create mode 100644 lib/librte_representor/rte_port_representor.c
>>  create mode 100644 lib/librte_representor/rte_port_representor.h
>>  create mode 100644 lib/librte_representor/rte_port_representor_driver.h
>>  create mode 100644 lib/librte_representor/rte_port_representor_version.map
>>
>> --
>> 2.9.5
Remy Horton Jan. 11, 2018, 9:38 a.m. | #22
On 11/01/2018 03:18, Yuanhan Liu wrote:
[..]
> - Then inside the driver (say, i40e)
>
>   * allocate 1 DPDK ethdev port
>   * allocate 4 DPDK ethdev port, one for each VF specified by
>     the vf_ports option. And these are VF representors.
>   * link the VF representors to the right VF port
>     For example, for i40e, it should be (from patch 3):
>         vf = &pf->vfs[representor->vport_id];
>   * set the proper ethdev callbacks for the VF representors.
>
> As you can see, none of above complexity is introduced.

What springs to mind is whether this would end up delegating to the 
individual PMDs tasks that in the patchset are handled centrally in the 
library. Question is where the balance should be, and as an aside what 
should be the base-line requirements for a PMD to be "port representable".

..Remy
Declan Doherty Jan. 15, 2018, 12:12 p.m. | #23
On 10/01/2018 7:26 PM, Thomas Monjalon wrote:
> 10/01/2018 14:46, Doherty, Declan:
>> On 09/01/2018 11:22 PM, Thomas Monjalon wrote:
>>> Hi,
>>>
>>> 08/01/2018 15:37, Remy Horton:
>>>> Port Representors provide a logical presentation in DPDK of VF (virtual
>>>> function) ports for the purposes of control and monitoring. Each port
>>>> representor device represents a single VF and is associated with it's
>>>> parent physical function (PF) PMD which provides the back-end hooks for
>>>> the representor device ops and defines the control domain to which that
>>>> port belongs. This allows to use existing DPDK APIs to monitor and control
>>>> the port without the need to create and maintain VF specific APIs.
>>>
>>> Extending control plane ability of DPDK has been discussed
>>> multiple times.
>>
>> It has, and I have yet to see a really strong reason as to why we would
>> not support control plane functions within DPDK, many of which are
>> already support today implicitly anyway through our ethdev APIs.
>>
>>> The current agreed policy is:
>>> "
>>> The primary goal of DPDK is to provide a userspace dataplane.
>>> Managing VFs from a PF driver is a control plane feature and developers
>>> should generally rely on the Linux Kernel for that.
>>> "
>>> http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations
>>>
>>
>> My understanding is that this particular entry was based around the
>> discussion on the divergence of functionality between the Linux kernel
>> PF driver and the DPDK PF driver. I also don't really think the above
>> statement is valid as a blanket statement for the project as it makes
>> the assumption that DPDK is only deployed on Linux hosts, what about
>> FreeBSD? or in the future Windows?
> 
> Yes, we must agree on removing this scope limitation while working
> on a generic VF representor.
> 
>> A number of presentations at both Userspace in Dublin and the Summit
>> in San Jose discussed the support of control plane functionality by
>> DPDK and there wasn't any strong arguments or opposition against using
>> DPDK for control plane functions that I saw.
>>
>> In any case this patchset is not introducing any new control plane APIs
>> that don't already exist within DPDK today, it only enables the creation
>> of a new type of virtual PMDs which are linked to the same base
>> infrastructure and which can be used to represent VFs in a control plane
>> application as we have implemented in this patch set.
>>
>>> If we relax this policy, I think the representor solution should be
>>> a real port, not only "for the purposes of control and monitoring".
>>> It has been asked several times as replies to this series,
>>> but it is kindly ignored, saying it will be thought later.
>>>
>>
>> I think we have stated in multiple discussions, especially during the
>> userspace presentation back in September that this solution supports
>> data path on the representors PMDs, and we have used the
>> infrastructure proposed here to do exactly what you are asking. As the
>> representor infrastructure doesn't preclude the support of a data
>> path, we have used it as it is presented here to implement a data path
>> for exception path packets for a prototype vswitch offload implementation.
>>
>>
>>> I don't see a general agreement on this series so far.
>>>
>>
>> I think the main issue of contention is that there is a
>> misunderstanding that this implementation only supports control plane
>> management and monitoring, but that is not the case and it can be used
>> for full data path representors, with limited or no control plane
>> functionality if required, at the end of the day the only limitations
>> are based on what is implemented by the backend base driver were the
>> broker is running for the representor ports.
> 
> The misunderstanding may originates from what you describe (even in v5):
> "ports for the purposes of control and monitoring"
> 
noted, but that is the scope of what we demostrate in the patchset, but 
we'll update the introduction to reflect the fact that they can be used 
to also support data path functions, such as exception path traffic for 
hw switch.

> I think everybody agree to have VF representors in DPDK.
> But there are few things which are not generic enough,
> and not easy to use.
> I hoped the discussion started at Dublin would continue
> on the mailing list but I realize the joint effort with other vendors
> did not happen.
> I will elaborate quickly below and more detailed in later review.
> 
> 1/ In order to keep track of the relations between PF, VF and
> representor - which are all ethdev - you create a struct outside
> of ethdev. I think it should be managed inside ethdev API.

Initially we had implemented the representor functionality within the 
context of the ethdev library but ran into a number of scenarios where 
this didn't work well as it makes the assumption that the base device 
that the representors are attached to is always an ethdev, we ran into 
cases were the PF isn't necessarily an ethdev, for example in some 
smartNICs the PF would be better represented by a switchdev, or it is 
possible that the device hosting the representor broker could just 
provide a conduit to a kernel driver.


> As suggested by others, we could also think whether a switchdev API
> is interesting or not.
> 

Indeed if a switchdev is something that is required by the community it 
would make sense that the representor infrastructure was initialized 
within the switchdev and not an ethdev. The advantage of keeping the 
representor infrastructure independent is that it gives the flexibility 
for representors to be supported independently of device type they are 
attached to.

> 2/ You create a new library for ethdev device management.
> It is the responsibility of the bus to scan and probe devices.
> In Intel case, the representor has no real bus so it must rely on
> the vdev bus. Note that the new custom scan hook may help.

This isn't the case in latest versions of the patchset, the bus the 
representors are dependent on is that of the base device, so for the 
i40e it's the PF PCI device.

> In Mellanox case, the representor already exists in Linux, and is based
> on PCI bus.
> Then, as any other port, it must be managed with whitelist or blacklist.
> 

I think the suggestion by Yuanhan of using the device whitelist command 
option makes sense as a option from the commandline, but it would 
require the newly propose implementation which allows specification of 
both the bus and device as not all devices are PCI, which have multiple 
host ports using SR-IOV, but there are cases when an dynamic 
creation/destruction of ports may also need to be supported, which is 
what the representor APIs support.


> 2-bis/ Minor comments about this lib:
> 	- it is not convenient to ask user or app to enable it

I have no problem removing this EAL option.

> 	- it is not using dynamic logging

We will address this in the next revision

> 
> 3/ You are using PCI address + index to identify the representor.
> It is a no-go. We have made effort to abstract buses.
> As an idea, the identification of a representor could use the new
> proposed flexible device syntax.
> 

We are currently using net_representor_%bus%_%device_id%_%vport_id% to 
identify each representor device but I have no issue changing to either 
the current convention which would be net_representor_%unique_id% or if 
I understand the proposal in the RFC "ether: standardize getting the 
port by name" we would be using something like,

we should be looking at something along the lines of 
net_%bus%_%device_id%_%port_id% which is pretty close to what we are 
using now.

In terms of that RFC I'm not clear on if the proposal is just to affect 
the API for getting a port by name, or actually the name name assigned 
to the device itself.


> 4/ Such new API must be experimental.
> 

We will address this in the next revision


> I propose to better think the representor API inside ethdev
> with a good multi-vendor collaboration,
> and submit a deprecation notice for 18.05 integration.
> 

I would really like to see this included as experimental in 18.02 
release, if it is agreed by the community that we need to re-integate 
the representor concept into librte_ethdev during for 18.05 we will 
support that work.
Thomas Monjalon Jan. 15, 2018, 4:09 p.m. | #24
15/01/2018 13:12, Doherty, Declan:
> On 10/01/2018 7:26 PM, Thomas Monjalon wrote:
> > 10/01/2018 14:46, Doherty, Declan:
> >> On 09/01/2018 11:22 PM, Thomas Monjalon wrote:
> >>> Hi,
> >>>
> >>> 08/01/2018 15:37, Remy Horton:
> >>>> Port Representors provide a logical presentation in DPDK of VF (virtual
> >>>> function) ports for the purposes of control and monitoring. Each port
> >>>> representor device represents a single VF and is associated with it's
> >>>> parent physical function (PF) PMD which provides the back-end hooks for
> >>>> the representor device ops and defines the control domain to which that
> >>>> port belongs. This allows to use existing DPDK APIs to monitor and control
> >>>> the port without the need to create and maintain VF specific APIs.
> >>>
> >>> Extending control plane ability of DPDK has been discussed
> >>> multiple times.
> >>
> >> It has, and I have yet to see a really strong reason as to why we would
> >> not support control plane functions within DPDK, many of which are
> >> already support today implicitly anyway through our ethdev APIs.
> >>
> >>> The current agreed policy is:
> >>> "
> >>> The primary goal of DPDK is to provide a userspace dataplane.
> >>> Managing VFs from a PF driver is a control plane feature and developers
> >>> should generally rely on the Linux Kernel for that.
> >>> "
> >>> http://dpdk.org/doc/guides/contributing/design.html#pf-and-vf-considerations
> >>>
> >>
> >> My understanding is that this particular entry was based around the
> >> discussion on the divergence of functionality between the Linux kernel
> >> PF driver and the DPDK PF driver. I also don't really think the above
> >> statement is valid as a blanket statement for the project as it makes
> >> the assumption that DPDK is only deployed on Linux hosts, what about
> >> FreeBSD? or in the future Windows?
> > 
> > Yes, we must agree on removing this scope limitation while working
> > on a generic VF representor.
> > 
> >> A number of presentations at both Userspace in Dublin and the Summit
> >> in San Jose discussed the support of control plane functionality by
> >> DPDK and there wasn't any strong arguments or opposition against using
> >> DPDK for control plane functions that I saw.
> >>
> >> In any case this patchset is not introducing any new control plane APIs
> >> that don't already exist within DPDK today, it only enables the creation
> >> of a new type of virtual PMDs which are linked to the same base
> >> infrastructure and which can be used to represent VFs in a control plane
> >> application as we have implemented in this patch set.
> >>
> >>> If we relax this policy, I think the representor solution should be
> >>> a real port, not only "for the purposes of control and monitoring".
> >>> It has been asked several times as replies to this series,
> >>> but it is kindly ignored, saying it will be thought later.
> >>>
> >>
> >> I think we have stated in multiple discussions, especially during the
> >> userspace presentation back in September that this solution supports
> >> data path on the representors PMDs, and we have used the
> >> infrastructure proposed here to do exactly what you are asking. As the
> >> representor infrastructure doesn't preclude the support of a data
> >> path, we have used it as it is presented here to implement a data path
> >> for exception path packets for a prototype vswitch offload implementation.
> >>
> >>
> >>> I don't see a general agreement on this series so far.
> >>>
> >>
> >> I think the main issue of contention is that there is a
> >> misunderstanding that this implementation only supports control plane
> >> management and monitoring, but that is not the case and it can be used
> >> for full data path representors, with limited or no control plane
> >> functionality if required, at the end of the day the only limitations
> >> are based on what is implemented by the backend base driver were the
> >> broker is running for the representor ports.
> > 
> > The misunderstanding may originates from what you describe (even in v5):
> > "ports for the purposes of control and monitoring"
> > 
> noted, but that is the scope of what we demostrate in the patchset, but 
> we'll update the introduction to reflect the fact that they can be used 
> to also support data path functions, such as exception path traffic for 
> hw switch.
> 
> > I think everybody agree to have VF representors in DPDK.
> > But there are few things which are not generic enough,
> > and not easy to use.
> > I hoped the discussion started at Dublin would continue
> > on the mailing list but I realize the joint effort with other vendors
> > did not happen.
> > I will elaborate quickly below and more detailed in later review.
> > 
> > 1/ In order to keep track of the relations between PF, VF and
> > representor - which are all ethdev - you create a struct outside
> > of ethdev. I think it should be managed inside ethdev API.
> 
> Initially we had implemented the representor functionality within the 
> context of the ethdev library but ran into a number of scenarios where 
> this didn't work well as it makes the assumption that the base device 
> that the representors are attached to is always an ethdev, we ran into 
> cases were the PF isn't necessarily an ethdev, for example in some 
> smartNICs the PF would be better represented by a switchdev, or it is 
> possible that the device hosting the representor broker could just 
> provide a conduit to a kernel driver.

The base device may be something else than an ethdev PF,
so it may be represented as a rte_device.
But the VF and representor are still ethdev.
I still think the relationship should be described in ethdev.

> > As suggested by others, we could also think whether a switchdev API
> > is interesting or not.
> 
> Indeed if a switchdev is something that is required by the community it 
> would make sense that the representor infrastructure was initialized 
> within the switchdev and not an ethdev. The advantage of keeping the 
> representor infrastructure independent is that it gives the flexibility 
> for representors to be supported independently of device type they are 
> attached to.

switchdev is just another device class.
We must abstract the base device as a basic EAL rte_device for now.

> > 2/ You create a new library for ethdev device management.
> > It is the responsibility of the bus to scan and probe devices.
> > In Intel case, the representor has no real bus so it must rely on
> > the vdev bus. Note that the new custom scan hook may help.
> 
> This isn't the case in latest versions of the patchset, the bus the 
> representors are dependent on is that of the base device, so for the 
> i40e it's the PF PCI device.

I'm suggesting to use vdev as bus of i40e representor,
because there is no PCI identifier for them, right?

> > In Mellanox case, the representor already exists in Linux, and is based
> > on PCI bus.
> > Then, as any other port, it must be managed with whitelist or blacklist.
> 
> I think the suggestion by Yuanhan of using the device whitelist command 
> option makes sense as a option from the commandline, but it would 
> require the newly propose implementation which allows specification of 
> both the bus and device as not all devices are PCI, which have multiple 
> host ports using SR-IOV, but there are cases when an dynamic 
> creation/destruction of ports may also need to be supported, which is 
> what the representor APIs support.

The full proposed syntax of device identification is:
	http://dpdk.org/ml/archives/dev/2017-December/084572.html
With this generic syntax, we can describe port representors and request
their initialization via whitelisting.

[...]
> > 3/ You are using PCI address + index to identify the representor.
> > It is a no-go. We have made effort to abstract buses.
> > As an idea, the identification of a representor could use the new
> > proposed flexible device syntax.
> 
> We are currently using net_representor_%bus%_%device_id%_%vport_id% to 
> identify each representor device but I have no issue changing to either 
> the current convention which would be net_representor_%unique_id% or if 
> I understand the proposal in the RFC "ether: standardize getting the 
> port by name" we would be using something like,
> we should be looking at something along the lines of 
> net_%bus%_%device_id%_%port_id% which is pretty close to what we are 
> using now.

It is introducing yet another syntax.
It would be better to align every device identification usages
on a common and standard syntax.
Then the syntax parsing will be done in bus and libs (e.g. ethdev)
with some helper functions which can be re-used for every usages.
The latest version of representors is using direct PCI parsing instead
of relying on some bus or ethdev helpers.

> In terms of that RFC I'm not clear on if the proposal is just to affect 
> the API for getting a port by name, or actually the name name assigned 
> to the device itself.

It is not about the name. It is a proposal of syntax to describe a
resource of a hardware device, or a virtual device.
The most obvious usage is to replace -w/-b and --vdev.
It can also be used in OVS or any other DPDK configuration.

> > 4/ Such new API must be experimental.
> 
> We will address this in the next revision
> 
> > I propose to better think the representor API inside ethdev
> > with a good multi-vendor collaboration,
> > and submit a deprecation notice for 18.05 integration.
> 
> I would really like to see this included as experimental in 18.02 
> release, if it is agreed by the community that we need to re-integate 
> the representor concept into librte_ethdev during for 18.05 we will 
> support that work.

I don't see the benefit of introducing a lib and a syntax which would
be replaced in the next release.

Patch

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..7e82c70 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1047,6 +1047,7 @@  struct rte_eth_dev_info {
 	/** Configured number of rx/tx queues */
 	uint16_t nb_rx_queues; /**< Number of RX queues. */
 	uint16_t nb_tx_queues; /**< Number of TX queues. */
+	uint16_t switch_domain; /**< Switch domain which port belongs to. */
 };
 
 /**
@@ -1810,6 +1811,7 @@  struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	uint16_t switch_domain; /**< Switch domain which port belongs to. */
 };
 
 /** Device supports link state interrupt */
@@ -1818,6 +1820,8 @@  struct rte_eth_dev_data {
 #define RTE_ETH_DEV_BONDED_SLAVE 0x0004
 /** Device supports device removal interrupt */
 #define RTE_ETH_DEV_INTR_RMV     0x0008
+/** Device is a port representor */
+#define RTE_ETH_DEV_REPRESENTOR_PORT 0x0010
 
 /**
  * @internal