@@ -37,7 +37,7 @@ Supported features
- UDP RSS hashing (1400 series and later adapters)
- Scattered Rx
- MTU update
-- SR-IOV on UCS managed servers connected to Fabric Interconnects
+- SR-IOV virtual function
- Flow API
- Overlay offload
@@ -135,103 +135,87 @@ Configuration information
TCP, IPv4, TCP-IPv4, IPv6, TCP-IPv6, IPv6 Extension, TCP-IPv6 Extension.
-SR-IOV mode utilization
+SR-IOV Virtual Function
-----------------------
-UCS blade servers configured with dynamic vNIC connection policies in UCSM
-are capable of supporting SR-IOV. SR-IOV virtual functions (VFs) are
-specialized vNICs, distinct from regular Ethernet vNICs. These VFs can be
-directly assigned to virtual machines (VMs) as 'passthrough' devices.
+VIC 1400 series and later adapters support SR-IOV. It can be enabled
+via both UCSM and CIMC. Please refer to the following guides to enable
+SR-IOV virtual functions (VFs).
-In UCS, SR-IOV VFs require the use of the Cisco Virtual Machine Fabric Extender
-(VM-FEX), which gives the VM a dedicated
-interface on the Fabric Interconnect (FI). Layer 2 switching is done at
-the FI. This may eliminate the requirement for software switching on the
-host to route intra-host VM traffic.
+ - CIMC: `Managing vNICs <https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/sw/gui/config/guide/4_3/b_cisco_ucs_c-series_gui_configuration_guide_43/b_Cisco_UCS_C-series_GUI_Configuration_Guide_41_chapter_01011.html#d77871e5874a1635>`_
-Please refer to `Creating a Dynamic vNIC Connection Policy
-<http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/vm_fex/vmware/gui/config_guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide_chapter_010.html#task_433E01651F69464783A68E66DA8A47A5>`_
-for information on configuring SR-IOV adapter policies and port profiles
-using UCSM.
+ - UCSM: `Configuring SRIOV HPN Connection Policies <https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/ucs-manager/GUI-User-Guides/Network-Mgmt/4-3/b_UCSM_Network_Mgmt_Guide_4_3/b_UCSM_Network_Mgmt_Guide_chapter_01010.html#d21438e9555a1635>`_
-Once the policies are in place and the host OS is rebooted, VFs should be
-visible on the host, E.g.:
+Note that the previous SR-IOV implementation, which was tied to VM-FEX
+(Cisco Virtual Machine Fabric Extender), has been discontinued, and
+ENIC PMD no longer supports it. The current SR-IOV implementation does
+not require the Fabric Interconnect (FI), as layer 2 switching is done
+within the VIC adapter.
+
+Once SR-IOV is enabled, reboot the host OS and follow the OS-specific
+steps to create VFs and assign them to virtual machines (VMs) or
+containers as necessary. The VIC physical function (PF) drivers for ESXi
+and Linux support SR-IOV. The following shows simplified steps for
+Linux.
.. code-block:: console
+ # echo 4 > /sys/class/net/<pf-interface>/device/sriov_numvfs
+
# lspci | grep Cisco | grep Ethernet
- 0d:00.0 Ethernet controller: Cisco Systems Inc VIC Ethernet NIC (rev a2)
- 0d:00.1 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
- 0d:00.2 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
- 0d:00.3 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
- 0d:00.4 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
- 0d:00.5 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
- 0d:00.6 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
- 0d:00.7 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-
-Enable Intel IOMMU on the host and install KVM and libvirt, and reboot again as
-required. Then, using libvirt, create a VM instance with an assigned device.
-Below is an example ``interface`` block (part of the domain configuration XML)
-that adds the host VF 0d:00:01 to the VM. ``profileid='pp-vlan-25'`` indicates
-the port profile that has been configured in UCSM.
+ 12:00.0 Ethernet controller: Cisco Systems Inc VIC Ethernet NIC (rev a2)
+ 12:00.1 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+ 12:00.2 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+ 12:00.3 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+ 12:00.4 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+
+Writing 4 to ``sriov_numvfs`` creates 4 VFs. ``lspci`` shows the VFs and
+their PCI locations. Interfaces with device ID ``02b7`` are the
+VFs. The following libvirt XML snippet assigns the VF at ``12:00.1``
+to the VM.
.. code-block:: console
- <interface type='hostdev' managed='yes'>
- <mac address='52:54:00:ac:ff:b6'/>
+ <interface type="hostdev" managed="yes">
+ <mac address="fa:16:3e:46:39:c5"/>
<driver name='vfio'/>
<source>
- <address type='pci' domain='0x0000' bus='0x0d' slot='0x00' function='0x1'/>
+ <address type="pci" domain="0x0000" bus="0x12" slot="0x00" function="0x1"/>
</source>
- <virtualport type='802.1Qbh'>
- <parameters profileid='pp-vlan-25'/>
- </virtualport>
+ <vlan>
+ <tag id="1000"/>
+ </vlan>
</interface>
-
-Alternatively, the configuration can be done in a separate file using the
-``network`` keyword. These methods are described in the libvirt documentation for
-`Network XML format <https://libvirt.org/formatnetwork.html>`_.
-
When the VM instance is started, libvirt will bind the host VF to
-vfio, complete provisioning on the FI and bring up the link.
-
-.. note::
-
- It is not possible to use a VF directly from the host because it is not
- fully provisioned until libvirt brings up the VM that it is assigned
- to.
-
-In the VM instance, the VF will now be visible. E.g., here the VF 00:04.0 is
-seen on the VM instance and should be available for binding to a DPDK.
+vfio-pci. In the VM instance, the VF will now be visible. In this
+example, the VF at ``07:00.0`` is seen on the VM instance and is available
+for binding to DPDK.
.. code-block:: console
- # lspci | grep Ether
- 00:04.0 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
+ # lspci | grep Cisco
+ 07:00.0 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
-Follow the normal DPDK install procedure, binding the VF to either ``igb_uio``
-or ``vfio`` in non-IOMMU mode.
+There are two known limitations of the current SR-IOV implementation.
-In the VM, the kernel enic driver may be automatically bound to the VF during
-boot. Unbinding it currently hangs due to a known issue with the driver. To
-work around the issue, block the enic module as follows.
-Please see :ref:`Limitations <enic_limitations>` for limitations in
-the use of SR-IOV.
+ - Software Rx statistics
-.. code-block:: console
+ VFs on older VIC models do not have hardware Rx counters. In this case,
+ ENIC PMD counts packets/bytes in software and reports them as device statistics.
- # cat /etc/modprobe.d/enic.conf
- blacklist enic
+ - Backward compatibility mode
- # dracut --force
+ Old PF drivers on ESXi may lack full admin channel support. ENIC PMD
+ detects such a PF driver during initialization and falls back to
+ compatibility mode. In this mode, ENIC PMD does not use the admin channel,
+ and trust mode (e.g. enabling promiscuous mode on a VF) is not supported.
.. note::
- Passthrough does not require SR-IOV. If VM-FEX is not desired, the user
+ Passthrough does not require SR-IOV. If SR-IOV is not desired, the user
may create as many regular vNICs as necessary and assign them to VMs as
- passthrough devices. Since these vNICs are not SR-IOV VFs, using them as
- passthrough devices do not require libvirt, port profiles, and VM-FEX.
+ passthrough devices.
.. _enic-generic-flow-api:
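The Linux step in the documentation hunk above (writing a VF count to ``sriov_numvfs``) can also be done from a small program. The following is a minimal sketch, not part of the patch: the helper names ``sriov_numvfs_path`` and ``sriov_set_numvfs`` are hypothetical, and only the sysfs path comes from the documentation text. Writing the file needs root privileges and a PF whose driver supports SR-IOV.

```c
#include <stdio.h>

/*
 * Build the sysfs path shown in the console example:
 * /sys/class/net/<pf-interface>/device/sriov_numvfs
 * Returns 0 on success, -1 if the buffer is too small.
 */
int sriov_numvfs_path(const char *pf_ifname, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/class/net/%s/device/sriov_numvfs",
			 pf_ifname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write the requested VF count to the PF's
 * sriov_numvfs file. Returns 0 on success, -1 on any failure.
 */
int sriov_set_numvfs(const char *pf_ifname, unsigned int num_vfs)
{
	char path[256];
	FILE *f;
	int rc = 0;

	if (sriov_numvfs_path(pf_ifname, path, sizeof(path)) != 0)
		return -1;
	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	if (fprintf(f, "%u\n", num_vfs) < 0)
		rc = -1;
	/* The kernel reports a rejected value when the write is flushed. */
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

Calling ``sriov_set_numvfs("eth0", 4)`` would correspond to the ``echo 4`` line in the example, with ``eth0`` standing in for the actual PF interface name.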
@@ -55,6 +55,10 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Updated Cisco enic driver.**
+
+ * Added SR-IOV VF support.
+
Removed Items
-------------
@@ -24,6 +24,7 @@ int vnic_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
cq->index = index;
cq->vdev = vdev;
+ cq->admin_chan = false;
cq->ctrl = vnic_dev_get_res(vdev, RES_TYPE_CQ, index);
if (!cq->ctrl) {
@@ -40,6 +41,32 @@ int vnic_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
return 0;
}
+int vnic_admin_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
+ unsigned int socket_id, unsigned int desc_count, unsigned int desc_size)
+{
+ int err;
+ char res_name[RTE_MEMZONE_NAMESIZE];
+ static int instance;
+
+ cq->index = index;
+ cq->vdev = vdev;
+ cq->admin_chan = true;
+
+ cq->ctrl = vnic_dev_get_res(vdev, RES_TYPE_ADMIN_CQ, index);
+ if (!cq->ctrl) {
+ pr_err("Failed to get admin CQ[%u] resource\n", index);
+ return -EINVAL;
+ }
+
+ snprintf(res_name, sizeof(res_name), "%d-admin-cq-%u", instance++, index);
+ err = vnic_dev_alloc_desc_ring(vdev, &cq->ring, desc_count, desc_size,
+ socket_id, res_name);
+ if (err)
+ return err;
+
+ return 0;
+}
+
void vnic_cq_init(struct vnic_cq *cq, unsigned int flow_control_enable,
unsigned int color_enable, unsigned int cq_head, unsigned int cq_tail,
unsigned int cq_tail_color, unsigned int interrupt_enable,
@@ -59,12 +59,15 @@ struct vnic_cq {
unsigned int tobe_rx_coal_timeval;
ktime_t prev_ts;
#endif
+ bool admin_chan;
};
void vnic_cq_free(struct vnic_cq *cq);
int vnic_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
unsigned int socket_id,
unsigned int desc_count, unsigned int desc_size);
+int vnic_admin_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
+ unsigned int socket_id, unsigned int desc_count, unsigned int desc_size);
void vnic_cq_init(struct vnic_cq *cq, unsigned int flow_control_enable,
unsigned int color_enable, unsigned int cq_head, unsigned int cq_tail,
unsigned int cq_tail_color, unsigned int interrupt_enable,
@@ -47,6 +47,8 @@ struct vnic_dev {
dma_addr_t linkstatus_pa;
struct vnic_stats *stats;
dma_addr_t stats_pa;
+ struct vnic_sriov_stats *sriov_stats;
+ dma_addr_t sriov_stats_pa;
struct vnic_devcmd_fw_info *fw_info;
dma_addr_t fw_info_pa;
struct fm_info *flowman_info;
@@ -164,6 +166,9 @@ static int vnic_dev_discover_res(struct vnic_dev *vdev,
case RES_TYPE_RQ:
case RES_TYPE_CQ:
case RES_TYPE_INTR_CTRL:
+ case RES_TYPE_ADMIN_WQ:
+ case RES_TYPE_ADMIN_RQ:
+ case RES_TYPE_ADMIN_CQ:
/* each count is stride bytes long */
len = count * VNIC_RES_STRIDE;
if (len + bar_offset > bar[bar_num].len) {
@@ -210,6 +215,9 @@ void __iomem *vnic_dev_get_res(struct vnic_dev *vdev, enum vnic_res_type type,
case RES_TYPE_RQ:
case RES_TYPE_CQ:
case RES_TYPE_INTR_CTRL:
+ case RES_TYPE_ADMIN_WQ:
+ case RES_TYPE_ADMIN_RQ:
+ case RES_TYPE_ADMIN_CQ:
return (char __iomem *)vdev->res[type].vaddr +
index * VNIC_RES_STRIDE;
default:
@@ -1143,6 +1151,18 @@ int vnic_dev_alloc_stats_mem(struct vnic_dev *vdev)
return vdev->stats == NULL ? -ENOMEM : 0;
}
+int vnic_dev_alloc_sriov_stats_mem(struct vnic_dev *vdev)
+{
+ char name[RTE_MEMZONE_NAMESIZE];
+ static uint32_t instance;
+
+ snprintf((char *)name, sizeof(name), "vnic_sriov_stats-%u", instance++);
+ vdev->sriov_stats = vdev->alloc_consistent(vdev->priv,
+ sizeof(struct vnic_sriov_stats),
+ &vdev->sriov_stats_pa, (uint8_t *)name);
+ return vdev->sriov_stats == NULL ? -ENOMEM : 0;
+}
+
void vnic_dev_unregister(struct vnic_dev *vdev)
{
if (vdev) {
@@ -1155,6 +1175,10 @@ void vnic_dev_unregister(struct vnic_dev *vdev)
vdev->free_consistent(vdev->priv,
sizeof(struct vnic_stats),
vdev->stats, vdev->stats_pa);
+ if (vdev->sriov_stats)
+ vdev->free_consistent(vdev->priv,
+ sizeof(struct vnic_sriov_stats),
+ vdev->sriov_stats, vdev->sriov_stats_pa);
if (vdev->flowman_info)
vdev->free_consistent(vdev->priv,
sizeof(struct fm_info),
@@ -1355,3 +1379,27 @@ int vnic_dev_set_cq_entry_size(struct vnic_dev *vdev, uint32_t rq_idx,
return vnic_dev_cmd(vdev, CMD_CQ_ENTRY_SIZE_SET, &a0, &a1, wait);
}
+
+int vnic_dev_enable_admin_qp(struct vnic_dev *vdev, uint32_t enable)
+{
+ uint64_t a0, a1;
+ int wait = 1000;
+
+ a0 = QP_TYPE_ADMIN;
+ a1 = enable;
+ return vnic_dev_cmd(vdev, CMD_QP_TYPE_SET, &a0, &a1, wait);
+}
+
+int vnic_dev_sriov_stats(struct vnic_dev *vdev, struct vnic_sriov_stats **stats)
+{
+ uint64_t a0, a1;
+ int wait = 1000;
+ int err;
+
+ a0 = vdev->sriov_stats_pa;
+ a1 = sizeof(struct vnic_sriov_stats);
+ err = vnic_dev_cmd(vdev, CMD_SRIOV_STATS_GET, &a0, &a1, wait);
+ if (!err)
+ *stats = vdev->sriov_stats;
+ return err;
+}
@@ -199,5 +199,8 @@ int vnic_dev_capable_geneve(struct vnic_dev *vdev);
uint64_t vnic_dev_capable_cq_entry_size(struct vnic_dev *vdev);
int vnic_dev_set_cq_entry_size(struct vnic_dev *vdev, uint32_t rq_idx,
uint32_t size_flag);
+int vnic_dev_alloc_sriov_stats_mem(struct vnic_dev *vdev);
+int vnic_dev_sriov_stats(struct vnic_dev *vdev, struct vnic_sriov_stats **stats);
+int vnic_dev_enable_admin_qp(struct vnic_dev *vdev, uint32_t enable);
#endif /* _VNIC_DEV_H_ */
@@ -646,6 +646,20 @@ enum vnic_devcmd_cmd {
* bit 2: 64 bytes
*/
CMD_CQ_ENTRY_SIZE_SET = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 90),
+
+ /*
+ * enable/disable wq/rq queue pair of qp_type on a PF/VF.
+ * in: (u32) a0 = wq/rq qp_type
+ * in: (u32) a1 = enable(1)/disable(0)
+ */
+ CMD_QP_TYPE_SET = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 97),
+
+ /*
+ * SRIOV vic stats get
+ * in: (u64) a0 = host buffer addr for stats dump
+ * in: (u32) a1 = length of the buffer
+ */
+ CMD_SRIOV_STATS_GET = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 98),
};
/* Modes for exchanging advanced filter capabilities. The modes supported by
@@ -1194,4 +1208,39 @@ typedef enum {
#define VNIC_RQ_CQ_ENTRY_SIZE_32_CAPABLE (1 << VNIC_RQ_CQ_ENTRY_SIZE_32)
#define VNIC_RQ_CQ_ENTRY_SIZE_64_CAPABLE (1 << VNIC_RQ_CQ_ENTRY_SIZE_64)
+/* CMD_QP_TYPE_SET */
+#define QP_TYPE_ADMIN 0
+
+struct vnic_sriov_stats {
+ uint32_t ver;
+ uint8_t sriov_vlan_membership_cap; /* sriov support vlan-membership */
+ uint8_t sriov_vlan_membership_enabled; /* Default is disabled (0) */
+ uint8_t sriov_rss_vf_full_cap; /* sriov VFs support full rss */
+ uint8_t sriov_host_rx_stats; /* host does rx stats */
+
+ /* IGx/EGx classifier TCAM
+ */
+ uint32_t ig_classifier0_tcam_cfg; /* IG0 TCAM config entries */
+ uint32_t ig_classifier0_tcam_free; /* IG0 TCAM free count */
+ uint32_t eg_classifier0_tcam_cfg; /* EG0 TCAM config entries */
+ uint32_t eg_classifier0_tcam_free; /* EG0 TCAM free count */
+
+ uint32_t ig_classifier1_tcam_cfg; /* IG1 TCAM config entries */
+ uint32_t ig_classifier1_tcam_free; /* IG1 TCAM free count */
+ uint32_t eg_classifier1_tcam_cfg; /* EG1 TCAM config entries */
+ uint32_t eg_classifier1_tcam_free; /* EG1 TCAM free count */
+
+ /* IGx/EGx flow table entries
+ */
+ uint32_t sriov_ig_flow_table_cfg; /* sriov IG FTE config */
+ uint32_t sriov_ig_flow_table_free; /* sriov IG FTE free */
+ uint32_t sriov_eg_flow_table_cfg; /* sriov EG FTE config */
+ uint32_t sriov_eg_flow_table_free; /* sriov EG FTE free */
+
+ uint8_t admin_qp_ready[32]; /* admin_qp ready bits (256) */
+ uint16_t vf_index; /* VF index or SRIOV_PF_IDX */
+ uint16_t reserved1;
+ uint32_t reserved2[256 - 23];
+};
+
#endif /* _VNIC_DEVCMD_H_ */
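The ``reserved2[256 - 23]`` padding in ``struct vnic_sriov_stats`` suggests the structure is meant to occupy 256 32-bit words (1024 bytes): the named fields take 23 words and the reserved array fills the rest. The sketch below copies the layout from the hunk above and checks that size at compile time. The ``sriov_admin_qp_ready`` helper is hypothetical, and its LSB-first bit numbering within ``admin_qp_ready`` is an assumption that the diff does not confirm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copy of the layout added in the hunk, so this sketch builds on its own. */
struct vnic_sriov_stats {
	uint32_t ver;
	uint8_t sriov_vlan_membership_cap;
	uint8_t sriov_vlan_membership_enabled;
	uint8_t sriov_rss_vf_full_cap;
	uint8_t sriov_host_rx_stats;
	uint32_t ig_classifier0_tcam_cfg;
	uint32_t ig_classifier0_tcam_free;
	uint32_t eg_classifier0_tcam_cfg;
	uint32_t eg_classifier0_tcam_free;
	uint32_t ig_classifier1_tcam_cfg;
	uint32_t ig_classifier1_tcam_free;
	uint32_t eg_classifier1_tcam_cfg;
	uint32_t eg_classifier1_tcam_free;
	uint32_t sriov_ig_flow_table_cfg;
	uint32_t sriov_ig_flow_table_free;
	uint32_t sriov_eg_flow_table_cfg;
	uint32_t sriov_eg_flow_table_free;
	uint8_t admin_qp_ready[32];
	uint16_t vf_index;
	uint16_t reserved1;
	uint32_t reserved2[256 - 23];
};

/* 23 words of named fields + 233 reserved words = 256 words = 1024 bytes. */
static_assert(sizeof(struct vnic_sriov_stats) == 1024,
	      "vnic_sriov_stats is expected to be 1024 bytes");

/*
 * Hypothetical helper: test the ready bit for a given VF index.
 * ASSUMPTION: bit N lives in byte N / 8 at bit position N % 8.
 */
bool sriov_admin_qp_ready(const struct vnic_sriov_stats *s, unsigned int vf)
{
	if (vf >= 8 * sizeof(s->admin_qp_ready))
		return false;
	return (s->admin_qp_ready[vf / 8] >> (vf % 8)) & 1u;
}
```

If firmware used a different bit order, only the index arithmetic in the helper would change; the size check does not depend on that assumption.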
@@ -38,8 +38,36 @@ enum vnic_res_type {
RES_TYPE_MQ_WQ, /* MQ Work queues */
RES_TYPE_MQ_RQ, /* MQ Receive queues */
RES_TYPE_MQ_CQ, /* MQ Completion queues */
- RES_TYPE_DEPRECATED1, /* Old version of devcmd 2 */
- RES_TYPE_DEVCMD2, /* Device control region */
+ RES_TYPE_DEPRECATED1, /* Old version of devcmd 2 */
+ RES_TYPE_DEPRECATED2, /* Old version of devcmd 2 */
+ RES_TYPE_DEVCMD2, /* Device control region */
+ RES_TYPE_RDMA_WQ, /* RDMA WQ */
+ RES_TYPE_RDMA_RQ, /* RDMA RQ */
+ RES_TYPE_RDMA_CQ, /* RDMA CQ */
+ RES_TYPE_RDMA_RKEY_TABLE, /* RDMA RKEY table */
+ RES_TYPE_RDMA_RQ_HEADER_TABLE, /* RDMA RQ Header Table */
+ RES_TYPE_RDMA_RQ_TABLE, /* RDMA RQ Table */
+ RES_TYPE_RDMA_RD_RESP_HEADER_TABLE, /* RDMA Read Response Header Table */
+ RES_TYPE_RDMA_RD_RESP_TABLE, /* RDMA Read Response Table */
+ RES_TYPE_RDMA_QP_STATS_TABLE, /* RDMA per QP stats table */
+ RES_TYPE_WQ_MREGS, /* XXX snic proto only */
+ RES_TYPE_GRPMBR_INTR, /* Group member interrupt control */
+ RES_TYPE_DPKT, /* Direct Packet memory region */
+ RES_TYPE_RDMA2_DATA_WQ, /* RDMA datapath command WQ */
+ RES_TYPE_RDMA2_REG_WQ, /* RDMA registration command WQ */
+ RES_TYPE_RDMA2_CQ, /* RDMA datapath CQ */
+ RES_TYPE_MQ_RDMA2_DATA_WQ, /* RDMA datapath command WQ */
+ RES_TYPE_MQ_RDMA2_REG_WQ, /* RDMA registration command WQ */
+ RES_TYPE_MQ_RDMA2_CQ, /* RDMA datapath CQ */
+ RES_TYPE_PTP, /* PTP registers */
+ RES_TYPE_INTR_CTRL2, /* Extended INTR CTRL registers */
+ RES_TYPE_SRIOV_INTR, /* VF intr */
+ RES_TYPE_VF_WQ, /* VF WQ */
+ RES_TYPE_VF_RQ, /* VF RQ */
+ RES_TYPE_VF_CQ, /* VF CQ */
+ RES_TYPE_ADMIN_WQ, /* admin channel WQ */
+ RES_TYPE_ADMIN_RQ, /* admin channel RQ */
+ RES_TYPE_ADMIN_CQ, /* admin channel CQ */
RES_TYPE_MAX, /* Count of resource types */
};
@@ -27,6 +27,7 @@ int vnic_rq_alloc(struct vnic_dev *vdev, struct vnic_rq *rq, unsigned int index,
rq->index = index;
rq->vdev = vdev;
+ rq->admin_chan = false;
rq->ctrl = vnic_dev_get_res(vdev, RES_TYPE_RQ, index);
if (!rq->ctrl) {
@@ -42,6 +43,32 @@ int vnic_rq_alloc(struct vnic_dev *vdev, struct vnic_rq *rq, unsigned int index,
return rc;
}
+int vnic_admin_rq_alloc(struct vnic_dev *vdev, struct vnic_rq *rq,
+ unsigned int desc_count, unsigned int desc_size)
+{
+ int rc;
+ char res_name[RTE_MEMZONE_NAMESIZE];
+ static int instance;
+
+ rq->index = 0;
+ rq->vdev = vdev;
+ rq->admin_chan = true;
+ rq->socket_id = SOCKET_ID_ANY;
+
+ rq->ctrl = vnic_dev_get_res(vdev, RES_TYPE_ADMIN_RQ, 0);
+ if (!rq->ctrl) {
+ pr_err("Failed to get admin RQ resource\n");
+ return -EINVAL;
+ }
+
+ vnic_rq_disable(rq);
+
+ snprintf(res_name, sizeof(res_name), "%d-admin-rq", instance++);
+ rc = vnic_dev_alloc_desc_ring(vdev, &rq->ring, desc_count, desc_size,
+ rq->socket_id, res_name);
+ return rc;
+}
+
void vnic_rq_init_start(struct vnic_rq *rq, unsigned int cq_index,
unsigned int fetch_index, unsigned int posted_index,
unsigned int error_interrupt_enable,
@@ -73,6 +73,11 @@ struct vnic_rq {
unsigned int max_mbufs_per_pkt;
uint16_t tot_nb_desc;
bool need_initial_post;
+ bool admin_chan;
+ const struct rte_memzone *admin_msg_rz;
+ int admin_next_idx;
+ uint64_t soft_stats_pkts;
+ uint64_t soft_stats_bytes;
};
static inline unsigned int vnic_rq_desc_avail(struct vnic_rq *rq)
@@ -127,6 +132,8 @@ static inline int vnic_rq_fill_count(struct vnic_rq *rq,
void vnic_rq_free(struct vnic_rq *rq);
int vnic_rq_alloc(struct vnic_dev *vdev, struct vnic_rq *rq, unsigned int index,
unsigned int desc_count, unsigned int desc_size);
+int vnic_admin_rq_alloc(struct vnic_dev *vdev, struct vnic_rq *rq,
+ unsigned int desc_count, unsigned int desc_size);
void vnic_rq_init_start(struct vnic_rq *rq, unsigned int cq_index,
unsigned int fetch_index, unsigned int posted_index,
unsigned int error_interrupt_enable,
@@ -23,7 +23,8 @@ int vnic_wq_alloc_ring(struct vnic_dev *vdev, struct vnic_wq *wq,
char res_name[RTE_MEMZONE_NAMESIZE];
static int instance;
- snprintf(res_name, sizeof(res_name), "%d-wq-%u", instance++, wq->index);
+ snprintf(res_name, sizeof(res_name), "%d-%swq-%u",
+ instance++, wq->admin_chan ? "admin-" : "", wq->index);
return vnic_dev_alloc_desc_ring(vdev, &wq->ring, desc_count, desc_size,
wq->socket_id, res_name);
}
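The change to ``vnic_wq_alloc_ring()`` above only affects the memzone name: admin channel rings get an ``admin-`` infix so they cannot collide with data path rings that share the same index. A standalone sketch of the naming follows; ``wq_ring_name`` is a hypothetical function, and only the format string is taken from the hunk.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format a WQ ring name the same way the patched snprintf() call does:
 * "<instance>-wq-<index>" for data path queues and
 * "<instance>-admin-wq-<index>" for the admin channel queue.
 * Returns the snprintf() result (length that would have been written).
 */
int wq_ring_name(char *buf, size_t len, int instance, bool admin_chan,
		 unsigned int index)
{
	return snprintf(buf, len, "%d-%swq-%u",
			instance, admin_chan ? "admin-" : "", index);
}
```

For instance 0, a data path WQ at index 0 and the admin WQ at index 0 would be named ``0-wq-0`` and ``0-admin-wq-0`` respectively.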
@@ -32,7 +33,7 @@ static int vnic_wq_alloc_bufs(struct vnic_wq *wq)
{
unsigned int count = wq->ring.desc_count;
/* Allocate the mbuf ring */
- wq->bufs = (struct rte_mbuf **)rte_zmalloc_socket("wq->bufs",
+ wq->bufs = (struct rte_mbuf **)rte_zmalloc_socket(wq->admin_chan ? "admin-wq-bufs" : "wq-bufs",
sizeof(struct rte_mbuf *) * count,
RTE_CACHE_LINE_SIZE, wq->socket_id);
wq->head_idx = 0;
@@ -62,6 +63,7 @@ int vnic_wq_alloc(struct vnic_dev *vdev, struct vnic_wq *wq, unsigned int index,
wq->index = index;
wq->vdev = vdev;
+ wq->admin_chan = false;
err = vnic_wq_get_ctrl(vdev, wq, index, RES_TYPE_WQ);
if (err) {
@@ -84,6 +86,37 @@ int vnic_wq_alloc(struct vnic_dev *vdev, struct vnic_wq *wq, unsigned int index,
return 0;
}
+int vnic_admin_wq_alloc(struct vnic_dev *vdev, struct vnic_wq *wq,
+ unsigned int desc_count, unsigned int desc_size)
+{
+ int err;
+
+ wq->index = 0;
+ wq->vdev = vdev;
+ wq->admin_chan = true;
+ wq->socket_id = SOCKET_ID_ANY;
+
+ err = vnic_wq_get_ctrl(vdev, wq, 0, RES_TYPE_ADMIN_WQ);
+ if (err) {
+ pr_err("Failed to get admin WQ resource err %d\n", err);
+ return err;
+ }
+
+ vnic_wq_disable(wq);
+
+ err = vnic_wq_alloc_ring(vdev, wq, desc_count, desc_size);
+ if (err)
+ return err;
+
+ err = vnic_wq_alloc_bufs(wq);
+ if (err) {
+ vnic_wq_free(wq);
+ return err;
+ }
+
+ return 0;
+}
+
void vnic_wq_init_start(struct vnic_wq *wq, unsigned int cq_index,
unsigned int fetch_index, unsigned int posted_index,
unsigned int error_interrupt_enable,
@@ -50,6 +50,9 @@ struct vnic_wq {
const struct rte_memzone *cqmsg_rz;
uint16_t last_completed_index;
uint64_t offloads;
+ bool admin_chan;
+ const struct rte_memzone *admin_msg_rz;
+ uint64_t soft_stats_tx;
};
static inline unsigned int vnic_wq_desc_avail(struct vnic_wq *wq)
@@ -149,6 +152,8 @@ buf_idx_incr(uint32_t n_descriptors, uint32_t idx)
void vnic_wq_free(struct vnic_wq *wq);
int vnic_wq_alloc(struct vnic_dev *vdev, struct vnic_wq *wq, unsigned int index,
unsigned int desc_count, unsigned int desc_size);
+int vnic_admin_wq_alloc(struct vnic_dev *vdev, struct vnic_wq *wq,
+ unsigned int desc_count, unsigned int desc_size);
void vnic_wq_init_start(struct vnic_wq *wq, unsigned int cq_index,
unsigned int fetch_index, unsigned int posted_index,
unsigned int error_interrupt_enable,
@@ -43,7 +43,6 @@
/*#define ENIC_DESC_COUNT_MAKE_ODD (x) do{if ((~(x)) & 1) { (x)--; } }while(0)*/
#define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043 /* ethernet vnic */
-#define PCI_DEVICE_ID_CISCO_VIC_ENET_VF 0x0071 /* enet SRIOV VF */
/* enet SRIOV Standalone vNic VF */
#define PCI_DEVICE_ID_CISCO_VIC_ENET_SN 0x02B7
@@ -51,7 +50,7 @@
#define ENIC_MAGIC_FILTER_ID 0xffff
/*
- * Interrupt 0: LSC and errors
+ * Interrupt 0: LSC and errors / VF admin channel RQ
* Interrupt 1: rx queue 0
* Interrupt 2: rx queue 1
* ...
@@ -154,11 +153,26 @@ struct enic {
/* software counters */
struct enic_soft_stats soft_stats;
+ struct vnic_wq admin_wq;
+ struct vnic_rq admin_rq;
+ struct vnic_cq admin_cq[2];
+
/* configured resources on vic */
unsigned int conf_rq_count;
unsigned int conf_wq_count;
unsigned int conf_cq_count;
unsigned int conf_intr_count;
+ /* SR-IOV VF has queues for admin channel to PF */
+ unsigned int conf_admin_rq_count;
+ unsigned int conf_admin_wq_count;
+ unsigned int conf_admin_cq_count;
+ uint64_t admin_chan_msg_num;
+ int admin_chan_vf_id;
+ uint32_t admin_pf_cap_version;
+ bool admin_chan_enabled;
+ bool sriov_vf_soft_rx_stats;
+ bool sriov_vf_compat_mode;
+ pthread_mutex_t admin_chan_lock;
/* linked list storing memory allocations */
LIST_HEAD(enic_memzone_list, enic_memzone_entry) memzone_list;
@@ -230,6 +244,16 @@ struct enic_vf_representor {
struct rte_flow *rep2vf_flow[2];
};
+#define ENIC_ADMIN_WQ_CQ 0
+#define ENIC_ADMIN_RQ_CQ 1
+#define ENIC_ADMIN_BUF_SIZE 1024
+
+static inline bool enic_is_vf(struct enic *enic)
+{
+ return enic->pdev->id.device_id == PCI_DEVICE_ID_CISCO_VIC_ENET_SN &&
+ !enic->sriov_vf_compat_mode;
+}
+
#define VF_ENIC_TO_VF_REP(vf_enic) \
container_of(vf_enic, struct enic_vf_representor, enic)
@@ -21,6 +21,7 @@
#include "vnic_rq.h"
#include "vnic_enet.h"
#include "enic.h"
+#include "enic_sriov.h"
/*
* The set of PCI devices this driver supports
@@ -28,7 +29,6 @@
#define CISCO_PCI_VENDOR_ID 0x1137
static const struct rte_pci_id pci_id_enic_map[] = {
{RTE_PCI_DEVICE(CISCO_PCI_VENDOR_ID, PCI_DEVICE_ID_CISCO_VIC_ENET)},
- {RTE_PCI_DEVICE(CISCO_PCI_VENDOR_ID, PCI_DEVICE_ID_CISCO_VIC_ENET_VF)},
{RTE_PCI_DEVICE(CISCO_PCI_VENDOR_ID, PCI_DEVICE_ID_CISCO_VIC_ENET_SN)},
{.vendor_id = 0, /* sentinel */},
};
@@ -707,7 +707,7 @@ static int enicpmd_set_mc_addr_list(struct rte_eth_dev *eth_dev,
for (i = 0; i < enic->mc_count; i++) {
addr = &enic->mc_addrs[i];
debug_log_add_del_addr(addr, false);
- ret = vnic_dev_del_addr(enic->vdev, addr->addr_bytes);
+ ret = enic_dev_del_addr(enic, addr->addr_bytes);
if (ret)
return ret;
}
@@ -734,7 +734,7 @@ static int enicpmd_set_mc_addr_list(struct rte_eth_dev *eth_dev,
if (j < nb_mc_addr)
continue;
debug_log_add_del_addr(addr, false);
- ret = vnic_dev_del_addr(enic->vdev, addr->addr_bytes);
+ ret = enic_dev_del_addr(enic, addr->addr_bytes);
if (ret)
return ret;
}
@@ -748,7 +748,7 @@ static int enicpmd_set_mc_addr_list(struct rte_eth_dev *eth_dev,
if (j < enic->mc_count)
continue;
debug_log_add_del_addr(addr, true);
- ret = vnic_dev_add_addr(enic->vdev, addr->addr_bytes);
+ ret = enic_dev_add_addr(enic, addr->addr_bytes);
if (ret)
return ret;
}
@@ -20,6 +20,7 @@
#include "enic_compat.h"
#include "enic.h"
+#include "enic_sriov.h"
#include "wq_enet_desc.h"
#include "rq_enet_desc.h"
#include "cq_enet_desc.h"
@@ -31,11 +32,6 @@
#include "vnic_intr.h"
#include "vnic_nic.h"
-static inline int enic_is_sriov_vf(struct enic *enic)
-{
- return enic->pdev->id.device_id == PCI_DEVICE_ID_CISCO_VIC_ENET_VF;
-}
-
static int is_zero_addr(uint8_t *addr)
{
return !(addr[0] | addr[1] | addr[2] | addr[3] | addr[4] | addr[5]);
@@ -174,7 +170,7 @@ int enic_del_mac_address(struct enic *enic, int mac_index)
struct rte_eth_dev *eth_dev = enic->rte_dev;
uint8_t *mac_addr = eth_dev->data->mac_addrs[mac_index].addr_bytes;
- return vnic_dev_del_addr(enic->vdev, mac_addr);
+ return enic_dev_del_addr(enic, mac_addr);
}
int enic_set_mac_address(struct enic *enic, uint8_t *mac_addr)
@@ -186,7 +182,7 @@ int enic_set_mac_address(struct enic *enic, uint8_t *mac_addr)
return -EINVAL;
}
- err = vnic_dev_add_addr(enic->vdev, mac_addr);
+ err = enic_dev_add_addr(enic, mac_addr);
if (err)
dev_err(enic, "add mac addr failed\n");
return err;
@@ -442,8 +438,20 @@ enic_intr_handler(void *arg)
struct rte_eth_dev *dev = (struct rte_eth_dev *)arg;
struct enic *enic = pmd_priv(dev);
+ ENICPMD_FUNC_TRACE();
+
vnic_intr_return_all_credits(&enic->intr[ENICPMD_LSC_INTR_OFFSET]);
+ if (enic_is_vf(enic)) {
+ /*
+ * When using the admin channel, the VF receives link
+ * status changes from the PF. enic_poll_vf_admin_chan()
+ * raises the RTE_ETH_EVENT_INTR_LSC event.
+ */
+ enic_poll_vf_admin_chan(enic);
+ return;
+ }
+
enic_link_update(dev);
rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC, NULL);
enic_log_q_error(enic);
@@ -662,14 +670,13 @@ int enic_enable(struct enic *enic)
for (index = 0; index < enic->rq_count; index++)
enic_start_rq(enic, index);
- vnic_dev_add_addr(enic->vdev, enic->mac_addr);
+ enic_dev_add_addr(enic, enic->mac_addr);
vnic_dev_enable_wait(enic->vdev);
/* Register and enable error interrupt */
rte_intr_callback_register(enic->pdev->intr_handle,
enic_intr_handler, (void *)enic->rte_dev);
-
rte_intr_enable(enic->pdev->intr_handle);
/* Unmask LSC interrupt */
vnic_intr_unmask(&enic->intr[ENICPMD_LSC_INTR_OFFSET]);
@@ -687,6 +694,12 @@ int enic_alloc_intr_resources(struct enic *enic)
enic->wq_count, enic_vnic_rq_count(enic),
enic->cq_count, enic->intr_count);
+ if (enic_is_vf(enic)) {
+ dev_info(enic, "vNIC admin channel resources used: wq %d rq %d cq %d\n",
+ enic->conf_admin_wq_count, enic->conf_admin_rq_count,
+ enic->conf_admin_cq_count);
+ }
+
for (i = 0; i < enic->intr_count; i++) {
err = vnic_intr_alloc(enic->vdev, &enic->intr[i], i);
if (err) {
@@ -694,6 +707,7 @@ int enic_alloc_intr_resources(struct enic *enic)
return err;
}
}
+
return 0;
}
@@ -1120,8 +1134,7 @@ int enic_disable(struct enic *enic)
enic_fm_destroy(enic);
- if (!enic_is_sriov_vf(enic))
- vnic_dev_del_addr(enic->vdev, enic->mac_addr);
+ enic_dev_del_addr(enic, enic->mac_addr);
for (i = 0; i < enic->wq_count; i++) {
err = vnic_wq_disable(&enic->wq[i]);
@@ -1156,6 +1169,8 @@ int enic_disable(struct enic *enic)
for (i = 0; i < enic->intr_count; i++)
vnic_intr_clean(&enic->intr[i]);
+ if (enic_is_vf(enic))
+ enic_disable_vf_admin_chan(enic, true);
return 0;
}
@@ -1316,10 +1331,24 @@ int enic_init_rss_nic_cfg(struct enic *enic)
int enic_setup_finish(struct enic *enic)
{
+ int err;
+
+ ENICPMD_FUNC_TRACE();
enic_init_soft_stats(enic);
+ /*
+ * Enable admin channel so we can perform certain devcmds
+ * via admin channel. For example, vnic_dev_packet_filter()
+ */
+ if (enic_is_vf(enic)) {
+ err = enic_enable_vf_admin_chan(enic);
+ if (err)
+ return err;
+ }
+
/* switchdev: enable promisc mode on PF */
if (enic->switchdev_mode) {
+ RTE_VERIFY(!enic_is_vf(enic));
vnic_dev_packet_filter(enic->vdev,
0 /* directed */,
0 /* multicast */,
@@ -1331,7 +1360,7 @@ int enic_setup_finish(struct enic *enic)
return 0;
}
/* Default conf */
- vnic_dev_packet_filter(enic->vdev,
+ err = enic_dev_packet_filter(enic,
1 /* directed */,
1 /* multicast */,
1 /* broadcast */,
@@ -1341,7 +1370,7 @@ int enic_setup_finish(struct enic *enic)
enic->promisc = 0;
enic->allmulti = 1;
- return 0;
+ return err;
}
static int enic_rss_conf_valid(struct enic *enic,
@@ -1455,13 +1484,14 @@ int enic_set_vlan_strip(struct enic *enic)
int enic_add_packet_filter(struct enic *enic)
{
+ ENICPMD_FUNC_TRACE();
/* switchdev ignores packet filters */
if (enic->switchdev_mode) {
ENICPMD_LOG(DEBUG, " switchdev: ignore packet filter");
return 0;
}
/* Args -> directed, multicast, broadcast, promisc, allmulti */
- return vnic_dev_packet_filter(enic->vdev, 1, 1, 1,
+ return enic_dev_packet_filter(enic, 1, 1, 1,
enic->promisc, enic->allmulti);
}
@@ -1497,6 +1527,9 @@ int enic_set_vnic_res(struct enic *enic)
if (eth_dev->data->dev_conf.intr_conf.rxq) {
required_intr += eth_dev->data->nb_rx_queues;
}
+ /* FW adds 2 interrupts for admin chan. Use 1 for RQ */
+ if (enic_is_vf(enic))
+ required_intr += 1;
ENICPMD_LOG(DEBUG, "Required queues for PF: rq %u wq %u cq %u",
required_rq, required_wq, required_cq);
if (enic->vf_required_rq) {
@@ -1851,6 +1884,22 @@ static int enic_dev_init(struct enic *enic)
dev_err(enic, "mac addr storage alloc failed, aborting.\n");
return -1;
}
+
+ /*
+ * If PF has not assigned any MAC address for VF, generate a random one.
+ */
+ if (enic_is_vf(enic)) {
+ struct rte_ether_addr ea;
+
+ memcpy(ea.addr_bytes, enic->mac_addr, RTE_ETHER_ADDR_LEN);
+ if (!rte_is_valid_assigned_ether_addr(&ea)) {
+ rte_eth_random_addr(ea.addr_bytes);
+ ENICPMD_LOG(INFO, "assigned random MAC address " RTE_ETHER_ADDR_PRT_FMT,
+ RTE_ETHER_ADDR_BYTES(&ea));
+ memcpy(enic->mac_addr, ea.addr_bytes, RTE_ETHER_ADDR_LEN);
+ }
+ }
+
rte_ether_addr_copy((struct rte_ether_addr *)enic->mac_addr,
eth_dev->data->mac_addrs);
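The hunk above falls back to a random MAC address when the PF has not assigned a usable one. The sketch below restates that logic without DPDK so it can be compiled standalone. The function names are hypothetical; the validity rule (unicast and not all zeros) and the random-address rule (group bit cleared, locally administered bit set) follow the documented behavior of ``rte_is_valid_assigned_ether_addr()`` and ``rte_eth_random_addr()``, and ``rand()`` is only a stand-in for DPDK's random source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define ETHER_ADDR_LEN 6

/* Valid assigned address: group (multicast) bit clear and not all zeros. */
bool mac_is_valid_assigned(const uint8_t mac[ETHER_ADDR_LEN])
{
	uint8_t any = 0;
	int i;

	for (i = 0; i < ETHER_ADDR_LEN; i++)
		any |= mac[i];
	return any != 0 && (mac[0] & 0x01) == 0;
}

/* Fill in a random unicast, locally administered address. */
void mac_random(uint8_t mac[ETHER_ADDR_LEN])
{
	int i;

	for (i = 0; i < ETHER_ADDR_LEN; i++)
		mac[i] = (uint8_t)(rand() & 0xff); /* stand-in random source */
	mac[0] &= (uint8_t)~0x01; /* clear the group (multicast) bit */
	mac[0] |= 0x02;           /* set the locally administered bit */
}

/* Keep a valid address from the PF; otherwise replace it in place. */
void mac_fixup(uint8_t mac[ETHER_ADDR_LEN])
{
	if (!mac_is_valid_assigned(mac))
		mac_random(mac);
}
```

Because the locally administered bit is always set, a generated address can never be all zeros, so the result passes the same validity check that triggered the fallback.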
@@ -300,4 +300,16 @@ void enic_get_res_counts(struct enic *enic)
"vNIC resources avail: wq %d rq %d cq %d intr %d\n",
enic->conf_wq_count, enic->conf_rq_count,
enic->conf_cq_count, enic->conf_intr_count);
+
+ if (!enic_is_vf(enic))
+ return;
+
+ enic->conf_admin_wq_count = vnic_dev_get_res_count(enic->vdev, RES_TYPE_ADMIN_WQ);
+ enic->conf_admin_rq_count = vnic_dev_get_res_count(enic->vdev, RES_TYPE_ADMIN_RQ);
+ enic->conf_admin_cq_count = vnic_dev_get_res_count(enic->vdev, RES_TYPE_ADMIN_CQ);
+
+ dev_info(enic_get_dev(enic),
+ "vNIC admin channel resources avail: wq %d rq %d cq %d\n",
+ enic->conf_admin_wq_count, enic->conf_admin_rq_count,
+ enic->conf_admin_cq_count);
}
@@ -55,6 +55,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
sizeof(struct cq_enet_rq_desc_64) :
sizeof(struct cq_enet_rq_desc);
RTE_BUILD_BUG_ON(sizeof(struct cq_enet_rq_desc_64) != 64);
+ uint64_t bytes;
cq = &enic->cq[enic_cq_rq(enic, sop_rq->index)];
cq_idx = cq->to_clean; /* index of cqd, rqd, mbuf_table */
@@ -67,6 +68,8 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
/* Receive until the end of the ring, at most. */
max_rx = RTE_MIN(nb_pkts, cq->ring.desc_count - cq_idx);
+ bytes = 0;
+
while (max_rx) {
volatile struct rq_enet_desc *rqd_ptr;
struct cq_desc cqd;
@@ -155,6 +158,8 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
rxmb->port = enic->port_id;
rxmb->data_len = seg_length;
+ bytes += seg_length;
+
rq->rx_nb_hold++;
if (!(enic_cq_rx_desc_eop(ciflags))) {
@@ -225,6 +230,10 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
&sop_rq->ctrl->posted_index);
}
+ if (enic->sriov_vf_soft_rx_stats && bytes) {
+ sop_rq->soft_stats_pkts += nb_rx;
+ sop_rq->soft_stats_bytes += bytes;
+ }
return nb_rx;
}
@@ -256,6 +265,7 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint8_t color;
bool overlay;
bool tnl;
+ uint64_t bytes;
rq = rx_queue;
enic = vnic_dev_priv(rq->vdev);
@@ -283,6 +293,8 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
port_id = enic->port_id;
overlay = enic->overlay_offload;
+ bytes = 0;
+
rx = rx_pkts;
while (max_rx) {
max_rx--;
@@ -304,6 +316,9 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
CQ_ENET_RQ_DESC_BYTES_WRITTEN_MASK;
mb->pkt_len = mb->data_len;
mb->port = port_id;
+
+ bytes += mb->pkt_len;
+
tnl = overlay && (cqd->completed_index_flags &
CQ_ENET_RQ_DESC_FLAGS_FCOE) != 0;
mb->packet_type =
@@ -352,6 +367,11 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
&rq->ctrl->posted_index);
}
+ if (enic->sriov_vf_soft_rx_stats && bytes) {
+ rq->soft_stats_pkts += (rx - rx_pkts);
+ rq->soft_stats_bytes += bytes;
+ }
+
return rx - rx_pkts;
}
new file mode 100644
@@ -0,0 +1,801 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2024 Cisco Systems, Inc. All rights reserved.
+ */
+
+#include <rte_memzone.h>
+#include <ethdev_driver.h>
+
+#include "enic_compat.h"
+#include "enic.h"
+#include "enic_sriov.h"
+
+static int enic_check_chan_capability(struct enic *enic);
+static int enic_register_vf(struct enic *enic);
+static void enic_unregister_vf(struct enic *enic);
+
+static const char * const msg_type_str[ENIC_MBOX_MAX] = {
+ "VF_CAPABILITY_REQUEST",
+ "VF_CAPABILITY_REPLY",
+ "VF_REGISTER_REQUEST",
+ "VF_REGISTER_REPLY",
+ "VF_UNREGISTER_REQUEST",
+ "VF_UNREGISTER_REPLY",
+ "PF_LINK_STATE_NOTIF",
+ "PF_LINK_STATE_ACK",
+ "PF_GET_STATS_REQUEST",
+ "PF_GET_STATS_REPLY",
+ "VF_ADD_DEL_MAC_REQUEST",
+ "VF_ADD_DEL_MAC_REPLY",
+ "PF_SET_ADMIN_MAC_NOTIF",
+ "PF_SET_ADMIN_MAC_ACK",
+ "VF_SET_PKT_FILTER_FLAGS_REQUEST",
+ "VF_SET_PKT_FILTER_FLAGS_REPLY",
+};
+
+static const char *enic_mbox_msg_type_str(enum enic_mbox_msg_type type)
+{
+ if (type >= 0 && type < ENIC_MBOX_MAX)
+ return msg_type_str[type];
+ return "INVALID";
+}
+
+static bool admin_chan_enabled(struct enic *enic)
+{
+ return enic->admin_chan_enabled;
+}
+
+static void lock_admin_chan(struct enic *enic)
+{
+ pthread_mutex_lock(&enic->admin_chan_lock);
+}
+
+static void unlock_admin_chan(struct enic *enic)
+{
+ pthread_mutex_unlock(&enic->admin_chan_lock);
+}
+
+static int enic_enable_admin_rq(struct enic *enic)
+{
+ uint32_t rqbuf_size = ENIC_ADMIN_BUF_SIZE;
+ uint32_t desc_count = 256;
+ struct rq_enet_desc *rqd;
+ struct vnic_rq *rq;
+ struct vnic_cq *cq;
+ rte_iova_t dma;
+ uint32_t i;
+ int cq_idx;
+ int err = 0;
+ char name[RTE_MEMZONE_NAMESIZE];
+ static int instance;
+
+ ENICPMD_FUNC_TRACE();
+ rq = &enic->admin_rq;
+ cq_idx = ENIC_ADMIN_RQ_CQ;
+ cq = &enic->admin_cq[cq_idx];
+ err = vnic_admin_rq_alloc(enic->vdev, rq, desc_count,
+ sizeof(struct rq_enet_desc));
+ if (err) {
+ dev_err(enic, "failed to allocate admin RQ\n");
+ return err;
+ }
+ err = vnic_admin_cq_alloc(enic->vdev, cq, cq_idx,
+ SOCKET_ID_ANY, desc_count, sizeof(struct cq_enet_rq_desc));
+ if (err) {
+ dev_err(enic, "failed to allocate CQ for admin RQ\n");
+ return err;
+ }
+
+ vnic_rq_init(rq, cq_idx, 0, 0);
+ vnic_cq_clean(cq);
+ vnic_cq_init(cq,
+ 0 /* flow_control_enable */,
+ 1 /* color_enable */,
+ 0 /* cq_head */,
+ 0 /* cq_tail */,
+ 1 /* cq_tail_color */,
+ 1 /* interrupt_enable */,
+ 1 /* cq_entry_enable */,
+ 0 /* cq_message_enable */,
+ ENICPMD_LSC_INTR_OFFSET /* interrupt_offset */,
+ 0 /* cq_message_addr */);
+ vnic_rq_enable(rq);
+
+ /*
+ * Allocate RQ DMA buffers. The admin channel reuses these
+ * buffers and never allocates new ones.
+ */
+ snprintf((char *)name, sizeof(name), "admin-rq-buf-%d", instance++);
+ rq->admin_msg_rz = rte_memzone_reserve_aligned((const char *)name,
+ desc_count * rqbuf_size, SOCKET_ID_ANY,
+ RTE_MEMZONE_IOVA_CONTIG, ENIC_PAGE_SIZE);
+ if (!rq->admin_msg_rz)
+ return -ENOMEM;
+
+ memset(rq->admin_msg_rz->addr, 0, desc_count * rqbuf_size);
+
+ dma = rq->admin_msg_rz->iova;
+ rqd = rq->ring.descs;
+ for (i = 0; i < desc_count; i++) {
+ rq_enet_desc_enc(rqd, dma, RQ_ENET_TYPE_ONLY_SOP,
+ rqbuf_size);
+ dma += rqbuf_size;
+ rqd++;
+ }
+ rte_rmb();
+ rq->posted_index = rq->ring.desc_count - 1;
+ rq->admin_next_idx = 0;
+ ENICPMD_LOG(DEBUG, "admin rq posted_index %u", rq->posted_index);
+ iowrite32(rq->posted_index, &rq->ctrl->posted_index);
+ rte_wmb();
+ return err;
+}
+
+static int enic_enable_admin_wq(struct enic *enic)
+{
+ uint32_t wqbuf_size = ENIC_ADMIN_BUF_SIZE;
+ uint32_t desc_count = 256;
+ struct vnic_wq *wq;
+ struct vnic_cq *cq;
+ int cq_idx;
+ int err = 0;
+ char name[RTE_MEMZONE_NAMESIZE];
+ static int instance;
+
+ ENICPMD_FUNC_TRACE();
+ wq = &enic->admin_wq;
+ cq_idx = ENIC_ADMIN_WQ_CQ;
+ cq = &enic->admin_cq[cq_idx];
+ err = vnic_admin_wq_alloc(enic->vdev, wq, desc_count, sizeof(struct wq_enet_desc));
+ if (err) {
+ dev_err(enic, "failed to allocate admin WQ\n");
+ return err;
+ }
+ err = vnic_admin_cq_alloc(enic->vdev, cq, cq_idx,
+ SOCKET_ID_ANY, desc_count, sizeof(struct cq_enet_wq_desc));
+ if (err) {
+ vnic_wq_free(wq);
+ dev_err(enic, "failed to allocate CQ for admin WQ\n");
+ return err;
+ }
+ snprintf((char *)name, sizeof(name),
+ "vnic_cqmsg-%s-admin-wq-%d", enic->bdf_name, instance++);
+ wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
+ sizeof(uint32_t), SOCKET_ID_ANY,
+ RTE_MEMZONE_IOVA_CONTIG, ENIC_PAGE_SIZE);
+ if (!wq->cqmsg_rz)
+ return -ENOMEM;
+
+ vnic_wq_init(wq, cq_idx, 0, 0);
+ vnic_cq_clean(cq);
+ vnic_cq_init(cq,
+ 0 /* flow_control_enable */,
+ 1 /* color_enable */,
+ 0 /* cq_head */,
+ 0 /* cq_tail */,
+ 1 /* cq_tail_color */,
+ 0 /* interrupt_enable */,
+ 0 /* cq_entry_enable */,
+ 1 /* cq_message_enable */,
+ 0 /* interrupt offset */,
+ (uint64_t)wq->cqmsg_rz->iova);
+
+ vnic_wq_enable(wq);
+
+ snprintf((char *)name, sizeof(name), "admin-wq-buf-%d", instance++);
+ wq->admin_msg_rz = rte_memzone_reserve_aligned((const char *)name,
+ desc_count * wqbuf_size, SOCKET_ID_ANY,
+ RTE_MEMZONE_IOVA_CONTIG, ENIC_PAGE_SIZE);
+ if (!wq->admin_msg_rz)
+ return -ENOMEM;
+
+ return err;
+}
+
+static void enic_admin_wq_post(struct enic *enic, void *msg)
+{
+ struct wq_enet_desc *desc;
+ struct enic_mbox_hdr *hdr;
+ unsigned int head_idx;
+ struct vnic_wq *wq;
+ rte_iova_t dma;
+ int msg_size;
+ void *va;
+
+ ENICPMD_FUNC_TRACE();
+ wq = &enic->admin_wq;
+ hdr = msg;
+ msg_size = hdr->msg_len;
+ RTE_VERIFY(msg_size < ENIC_ADMIN_BUF_SIZE);
+
+ head_idx = wq->head_idx;
+ desc = (struct wq_enet_desc *)wq->ring.descs;
+ desc = desc + head_idx;
+
+ /* Copy message to pre-allocated WQ DMA buffer */
+ dma = wq->admin_msg_rz->iova + ENIC_ADMIN_BUF_SIZE * head_idx;
+ va = (void *)((char *)wq->admin_msg_rz->addr + ENIC_ADMIN_BUF_SIZE * head_idx);
+ memcpy(va, msg, msg_size);
+
+ ENICPMD_LOG(DEBUG, "post admin wq msg at %u", head_idx);
+
+ /* Send message to PF: loopback=1 */
+ wq_enet_desc_enc(desc, dma, msg_size,
+ 0 /* mss */,
+ 0 /* header_len */,
+ 0 /* offload_mode */, 1 /* eop */, 1 /* cq */,
+ 0 /* fcoe */,
+ 1 /* vlan_tag_insert */,
+ 0 /* vlan_id */,
+ 1 /* loopback */);
+ head_idx = enic_ring_incr(wq->ring.desc_count, head_idx);
+ rte_wmb();
+ iowrite32_relaxed(head_idx, &wq->ctrl->posted_index);
+ wq->head_idx = head_idx;
+}
+
+static void enic_mbox_init_msg_hdr(struct enic *enic, void *msg,
+ enum enic_mbox_msg_type type)
+{
+ struct enic_mbox_hdr *hdr;
+ int len;
+
+ switch (type) {
+ case ENIC_MBOX_VF_CAPABILITY_REQUEST:
+ len = sizeof(struct enic_mbox_vf_capability_msg);
+ break;
+ case ENIC_MBOX_VF_REGISTER_REQUEST:
+ len = sizeof(struct enic_mbox_vf_register_msg);
+ break;
+ case ENIC_MBOX_VF_UNREGISTER_REQUEST:
+ len = sizeof(struct enic_mbox_vf_unregister_msg);
+ break;
+ case ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REQUEST:
+ len = sizeof(struct enic_mbox_vf_set_pkt_filter_flags_msg);
+ break;
+ case ENIC_MBOX_PF_LINK_STATE_ACK:
+ len = sizeof(struct enic_mbox_pf_link_state_ack_msg);
+ break;
+ case ENIC_MBOX_PF_GET_STATS_REPLY:
+ len = sizeof(struct enic_mbox_pf_get_stats_reply_msg);
+ break;
+ case ENIC_MBOX_VF_ADD_DEL_MAC_REQUEST:
+ len = sizeof(struct enic_mbox_vf_add_del_mac_msg);
+ break;
+ default:
+ RTE_VERIFY(false);
+ break;
+ }
+ memset(msg, 0, len);
+ hdr = msg;
+ hdr->dst_vnic_id = ENIC_MBOX_DST_PF;
+ hdr->src_vnic_id = enic->admin_chan_vf_id;
+ hdr->msg_type = type;
+ hdr->flags = 0;
+ hdr->msg_len = len;
+ hdr->msg_num = ++enic->admin_chan_msg_num;
+}
+
+/*
+ * Check whether the admin RQ has a new message. If so, copy it out.
+ */
+static int enic_admin_rq_peek(struct enic *enic, uint8_t *msg, int *msg_len)
+{
+ const int desc_size = sizeof(struct cq_enet_rq_desc);
+ volatile struct cq_desc *cqd_ptr;
+ uint16_t cq_idx, rq_idx, rq_num;
+ struct cq_enet_rq_desc *cqrd;
+ uint16_t seg_length;
+ struct cq_desc cqd;
+ struct vnic_rq *rq;
+ struct vnic_cq *cq;
+ uint8_t tc, color;
+ int next_idx;
+ void *va;
+
+ rq = &enic->admin_rq;
+ cq = &enic->admin_cq[ENIC_ADMIN_RQ_CQ];
+ cq_idx = cq->to_clean;
+ cqd_ptr = (struct cq_desc *)((uintptr_t)(cq->ring.descs) +
+ (uintptr_t)cq_idx * desc_size);
+ color = cq->last_color;
+ tc = *(volatile uint8_t *)((uintptr_t)cqd_ptr + desc_size - 1);
+ /* No new packet, return */
+ if ((tc & CQ_DESC_COLOR_MASK_NOSHIFT) == color)
+ return -EAGAIN;
+ ENICPMD_LOG(DEBUG, "admin RQ has a completion cq_idx %u color %u", cq_idx, color);
+
+ cqd = *cqd_ptr;
+ cqrd = (struct cq_enet_rq_desc *)&cqd;
+ seg_length = rte_le_to_cpu_16(cqrd->bytes_written_flags) &
+ CQ_ENET_RQ_DESC_BYTES_WRITTEN_MASK;
+
+ rq_num = cqd.q_number & CQ_DESC_Q_NUM_MASK;
+ rq_idx = (cqd.completed_index & CQ_DESC_COMP_NDX_MASK);
+ ENICPMD_LOG(DEBUG, "rq_num %u rq_idx %u len %u", rq_num, rq_idx, seg_length);
+
+ RTE_VERIFY(rq_num == 0);
+ next_idx = rq->admin_next_idx;
+ RTE_VERIFY(rq_idx == next_idx);
+ rq->admin_next_idx = enic_ring_incr(rq->ring.desc_count, next_idx);
+
+ /* Copy out the received message */
+ va = (void *)((char *)rq->admin_msg_rz->addr + ENIC_ADMIN_BUF_SIZE * next_idx);
+ *msg_len = seg_length;
+ memset(msg, 0, ENIC_ADMIN_BUF_SIZE);
+ memcpy(msg, va, seg_length);
+ memset(va, 0, ENIC_ADMIN_BUF_SIZE);
+
+ /* Advance CQ */
+ cq_idx++;
+ if (unlikely(cq_idx == cq->ring.desc_count)) {
+ cq_idx = 0;
+ cq->last_color ^= CQ_DESC_COLOR_MASK_NOSHIFT;
+ }
+ cq->to_clean = cq_idx;
+
+ /* Recycle and post RQ buffer */
+ rq->posted_index = enic_ring_add(rq->ring.desc_count,
+ rq->posted_index,
+ 1);
+ rte_wmb();
+ iowrite32(rq->posted_index, &rq->ctrl->posted_index);
+ rte_wmb();
+ return 0;
+}
+
+int enic_enable_vf_admin_chan(struct enic *enic)
+{
+ struct vnic_sriov_stats *stats;
+ int err;
+
+ ENICPMD_FUNC_TRACE();
+ pthread_mutex_init(&enic->admin_chan_lock, NULL);
+ err = vnic_dev_enable_admin_qp(enic->vdev, 1);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to enable admin QP type");
+ goto out;
+ }
+ err = vnic_dev_alloc_sriov_stats_mem(enic->vdev);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to allocate SR-IOV stats buffer");
+ goto out;
+ }
+ err = vnic_dev_sriov_stats(enic->vdev, &stats);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to get SR-IOV stats");
+ goto out;
+ }
+ enic->admin_chan_vf_id = stats->vf_index;
+ enic->sriov_vf_soft_rx_stats = !!stats->sriov_host_rx_stats;
+ ENICPMD_LOG(INFO, "SR-IOV VF index %u %s stats",
+ stats->vf_index, enic->sriov_vf_soft_rx_stats ? "soft" : "HW");
+ err = enic_enable_admin_rq(enic);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to enable admin RQ");
+ goto out;
+ }
+ err = enic_enable_admin_wq(enic);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to enable admin WQ");
+ goto out;
+ }
+ enic->admin_chan_enabled = true;
+ /* Now the admin channel is ready. Send CAPABILITY as the first message */
+ err = enic_check_chan_capability(enic);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to exchange VF_CAPABILITY message");
+ goto out;
+ }
+ if (enic->sriov_vf_compat_mode) {
+ enic_disable_vf_admin_chan(enic, false);
+ return 0;
+ }
+ /* Then register with the PF driver */
+ err = enic_register_vf(enic);
+ if (err) {
+ ENICPMD_LOG(ERR, "failed to perform VF_REGISTER");
+ goto out;
+ }
+ /*
+ * If we have to count RX packets (soft stats), do not use
+ * avx2 receive handlers
+ */
+ if (enic->sriov_vf_soft_rx_stats)
+ enic->enable_avx2_rx = 0;
+out:
+ return err;
+}
+
+int enic_disable_vf_admin_chan(struct enic *enic, bool unregister)
+{
+ struct vnic_rq *rq;
+ struct vnic_wq *wq;
+ struct vnic_cq *cq;
+
+ ENICPMD_FUNC_TRACE();
+ if (unregister)
+ enic_unregister_vf(enic);
+ enic->sriov_vf_soft_rx_stats = false;
+
+ rq = &enic->admin_rq;
+ vnic_rq_disable(rq);
+ rte_memzone_free(rq->admin_msg_rz);
+ vnic_rq_free(rq);
+
+ cq = &enic->admin_cq[ENIC_ADMIN_RQ_CQ];
+ vnic_cq_free(cq);
+
+ wq = &enic->admin_wq;
+ vnic_wq_disable(wq);
+ rte_memzone_free(wq->admin_msg_rz);
+ rte_memzone_free(wq->cqmsg_rz);
+ vnic_wq_free(wq);
+
+ cq = &enic->admin_cq[ENIC_ADMIN_WQ_CQ];
+ vnic_cq_free(cq);
+
+ enic->admin_chan_enabled = false;
+ return 0;
+}
+
+static int common_hdr_check(struct enic *enic, void *msg)
+{
+ struct enic_mbox_hdr *hdr;
+
+ hdr = (struct enic_mbox_hdr *)msg;
+ ENICPMD_LOG(DEBUG, "RX dst %u src %u type %u(%s) flags %u len %u num %" PRIu64,
+ hdr->dst_vnic_id, hdr->src_vnic_id, hdr->msg_type,
+ enic_mbox_msg_type_str(hdr->msg_type),
+ hdr->flags, hdr->msg_len, hdr->msg_num);
+ if (hdr->dst_vnic_id != enic->admin_chan_vf_id ||
+ hdr->src_vnic_id != ENIC_MBOX_DST_PF) {
+ ENICPMD_LOG(ERR, "unexpected dst/src in reply: dst=%u (expected=%u) src=%u",
+ hdr->dst_vnic_id, enic->admin_chan_vf_id, hdr->src_vnic_id);
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static int common_reply_check(__rte_unused struct enic *enic, void *msg,
+ enum enic_mbox_msg_type type)
+{
+ struct enic_mbox_generic_reply_msg *reply;
+ struct enic_mbox_hdr *hdr;
+
+ hdr = (struct enic_mbox_hdr *)msg;
+ reply = (struct enic_mbox_generic_reply_msg *)(hdr + 1);
+ if (hdr->msg_type != type) {
+ ENICPMD_LOG(ERR, "unexpected reply: expected=%u received=%u",
+ type, hdr->msg_type);
+ return -EINVAL;
+ }
+ if (reply->ret_major != 0) {
+ ENICPMD_LOG(ERR, "error reply: type=%u(%s) ret_major/minor=%u/%u",
+ type, enic_mbox_msg_type_str(type),
+ reply->ret_major, reply->ret_minor);
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static void handle_pf_link_state_notif(struct enic *enic, void *msg)
+{
+ struct enic_mbox_pf_link_state_notif_msg *notif = msg;
+ struct enic_mbox_pf_link_state_ack_msg ack;
+ struct rte_eth_link link;
+
+ ENICPMD_FUNC_TRACE();
+ ENICPMD_LOG(DEBUG, "PF_LINK_STATE_NOTIF: link_state=%u", notif->link_state);
+
+ /*
+ * Do not use enic_link_update() here. The Linux PF driver
+ * disables link-status notify in FW and sends this admin
+ * message instead, so notify does not work. Record the
+ * link status reported by the PF.
+ */
+ memset(&link, 0, sizeof(link));
+ link.link_status = notif->link_state ? RTE_ETH_LINK_UP : RTE_ETH_LINK_DOWN;
+ link.link_duplex = RTE_ETH_LINK_FULL_DUPLEX;
+ link.link_speed = vnic_dev_port_speed(enic->vdev);
+ rte_eth_linkstatus_set(enic->rte_dev, &link);
+ rte_eth_dev_callback_process(enic->rte_dev, RTE_ETH_EVENT_INTR_LSC, NULL);
+ ENICPMD_LOG(DEBUG, "eth_linkstatus: speed=%u duplex=%u autoneg=%u status=%u",
+ link.link_speed, link.link_duplex, link.link_autoneg,
+ link.link_status);
+
+ enic_mbox_init_msg_hdr(enic, &ack, ENIC_MBOX_PF_LINK_STATE_ACK);
+ enic_admin_wq_post(enic, &ack);
+ ENICPMD_LOG(DEBUG, "sent PF_LINK_STATE_ACK");
+}
+
+static void handle_pf_get_stats(struct enic *enic, void *msg)
+{
+ struct enic_mbox_pf_get_stats_reply_msg reply;
+ struct enic_mbox_pf_get_stats_msg *req;
+ struct vnic_stats *hw_stats;
+ struct vnic_stats *vs;
+ unsigned int i;
+
+ ENICPMD_FUNC_TRACE();
+ req = msg;
+ ENICPMD_LOG(DEBUG, "flags=0x%x", req->flags);
+ enic_mbox_init_msg_hdr(enic, &reply, ENIC_MBOX_PF_GET_STATS_REPLY);
+ vs = &reply.stats.vnic_stats;
+ if (req->flags & ENIC_MBOX_GET_STATS_RX) {
+ for (i = 0; i < enic->rq_count; i++) {
+ vs->rx.rx_frames_ok += enic->rq[i].soft_stats_pkts;
+ vs->rx.rx_bytes_ok += enic->rq[i].soft_stats_bytes;
+ }
+ vs->rx.rx_frames_total = vs->rx.rx_frames_ok;
+ reply.stats.num_rx_stats = 6;
+ }
+ if (req->flags & ENIC_MBOX_GET_STATS_TX) {
+ vnic_dev_stats_dump(enic->vdev, &hw_stats);
+ vs->tx = hw_stats->tx;
+ reply.stats.num_tx_stats = 11; /* all fields up to rsvd */
+ }
+ enic_admin_wq_post(enic, &reply);
+ ENICPMD_LOG(DEBUG, "sent PF_GET_STATS_REPLY");
+}
+
+static void handle_pf_request_msg(struct enic *enic, void *msg)
+{
+ struct enic_mbox_hdr *hdr = msg;
+
+ switch (hdr->msg_type) {
+ case ENIC_MBOX_PF_LINK_STATE_NOTIF:
+ handle_pf_link_state_notif(enic, msg);
+ break;
+ case ENIC_MBOX_PF_GET_STATS_REQUEST:
+ handle_pf_get_stats(enic, msg);
+ break;
+ case ENIC_MBOX_PF_SET_ADMIN_MAC_NOTIF:
+ ENICPMD_LOG(WARNING, "Ignore PF_SET_ADMIN_MAC_NOTIF from PF. The PF driver has changed VF MAC address. Reload the driver to use the new address.");
+ break;
+ default:
+ ENICPMD_LOG(WARNING, "received unexpected non-request message from PF: received=%u(%s)",
+ hdr->msg_type, enic_mbox_msg_type_str(hdr->msg_type));
+ break;
+ }
+}
+
+void enic_poll_vf_admin_chan(struct enic *enic)
+{
+ uint8_t msg[ENIC_ADMIN_BUF_SIZE];
+ int len;
+
+ ENICPMD_FUNC_TRACE();
+ lock_admin_chan(enic);
+ while (!enic_admin_rq_peek(enic, msg, &len)) {
+ if (common_hdr_check(enic, msg))
+ continue;
+ handle_pf_request_msg(enic, msg);
+ }
+ unlock_admin_chan(enic);
+}
+
+/*
+ * Poll the admin RQ until the wanted reply arrives or the timeout
+ * expires. Other messages received meanwhile are handled inline.
+ */
+#define RECV_REPLY_TIMEOUT 5 /* seconds */
+static int recv_reply(struct enic *enic, void *msg, enum enic_mbox_msg_type type)
+{
+ struct enic_mbox_hdr *hdr;
+ uint64_t start, end; /* seconds */
+ int err, len;
+
+ start = rte_rdtsc() / rte_get_tsc_hz();
+again:
+ end = rte_rdtsc() / rte_get_tsc_hz();
+ if (end - start > RECV_REPLY_TIMEOUT) {
+ ENICPMD_LOG(WARNING, "timed out while waiting for reply %u(%s)",
+ type, enic_mbox_msg_type_str(type));
+ return -ETIMEDOUT;
+ }
+ if (enic_admin_rq_peek(enic, msg, &len))
+ goto again;
+ err = common_hdr_check(enic, msg);
+ if (err)
+ goto out;
+
+ /* If not the reply we are looking for, process it and poll again */
+ hdr = msg;
+ if (hdr->msg_type != type) {
+ handle_pf_request_msg(enic, msg);
+ goto again;
+ }
+
+ err = common_reply_check(enic, msg, type);
+ if (err)
+ goto out;
+out:
+ return err;
+}
+
+/*
+ * Ask the PF driver which level of admin channel support it has. If
+ * the answer is version 0 (minimal), or the request times out (no
+ * channel support), work in backward-compatible mode.
+ *
+ * In compat mode, trust mode does not work because the PF driver
+ * does not support it. For example, the VF cannot enable promiscuous
+ * mode and cannot change its MAC address.
+ */
+static int enic_check_chan_capability(struct enic *enic)
+{
+ struct enic_mbox_vf_capability_reply_msg *reply;
+ struct enic_mbox_vf_capability_msg req;
+ uint8_t msg[ENIC_ADMIN_BUF_SIZE] = {0}; /* read below even on timeout */
+ int err;
+
+ ENICPMD_FUNC_TRACE();
+
+ enic_mbox_init_msg_hdr(enic, &req.hdr, ENIC_MBOX_VF_CAPABILITY_REQUEST);
+ req.version = ENIC_MBOX_CAP_VERSION_1;
+ enic_admin_wq_post(enic, &req);
+ ENICPMD_LOG(DEBUG, "sent VF_CAPABILITY");
+
+ err = recv_reply(enic, msg, ENIC_MBOX_VF_CAPABILITY_REPLY);
+ if (err == -ETIMEDOUT)
+ ENICPMD_LOG(WARNING, "PF driver has not responded to CAPABILITY request. Please update the host PF driver");
+ else if (err)
+ goto out;
+ ENICPMD_LOG(DEBUG, "VF_CAPABILITY_REPLY ok");
+ reply = (struct enic_mbox_vf_capability_reply_msg *)msg;
+ enic->admin_pf_cap_version = reply->version;
+ ENICPMD_LOG(DEBUG, "PF admin channel capability version %u",
+ enic->admin_pf_cap_version);
+ if (err == -ETIMEDOUT || enic->admin_pf_cap_version == ENIC_MBOX_CAP_VERSION_0) {
+ ENICPMD_LOG(WARNING, "PF driver does not have adequate admin channel support. VF works in backward compatible mode");
+ err = 0;
+ enic->sriov_vf_compat_mode = true;
+ } else if (enic->admin_pf_cap_version == ENIC_MBOX_CAP_VERSION_INVALID) {
+ ENICPMD_LOG(WARNING, "Unexpected version in CAPABILITY_REPLY from PF driver. cap_version %u",
+ enic->admin_pf_cap_version);
+ err = -EINVAL;
+ }
+out:
+ return err;
+}
+
+/*
+ * The VF driver must 'register' with the PF driver first, before
+ * sending any devcmd requests. Once registered, the VF driver must be
+ * ready to process messages from the PF driver.
+ */
+static int enic_register_vf(struct enic *enic)
+{
+ struct enic_mbox_vf_register_msg req;
+ uint8_t msg[ENIC_ADMIN_BUF_SIZE];
+ int err;
+
+ ENICPMD_FUNC_TRACE();
+ enic_mbox_init_msg_hdr(enic, &req, ENIC_MBOX_VF_REGISTER_REQUEST);
+ enic_admin_wq_post(enic, &req);
+ ENICPMD_LOG(DEBUG, "sent VF_REGISTER");
+ err = recv_reply(enic, msg, ENIC_MBOX_VF_REGISTER_REPLY);
+ if (err)
+ goto out;
+ ENICPMD_LOG(DEBUG, "VF_REGISTER_REPLY ok");
+out:
+ return err;
+}
+
+/*
+ * The PF driver expects an unregister message when the VF driver
+ * closes, but it is not mandatory. For example, the VF driver may
+ * crash without sending the unregister message. In that case,
+ * everything still works fine.
+ */
+static void enic_unregister_vf(struct enic *enic)
+{
+ struct enic_mbox_vf_unregister_msg req;
+ uint8_t msg[ENIC_ADMIN_BUF_SIZE];
+
+ ENICPMD_FUNC_TRACE();
+ enic_mbox_init_msg_hdr(enic, &req, ENIC_MBOX_VF_UNREGISTER_REQUEST);
+ enic_admin_wq_post(enic, &req);
+ ENICPMD_LOG(DEBUG, "sent VF_UNREGISTER");
+ if (!recv_reply(enic, msg, ENIC_MBOX_VF_UNREGISTER_REPLY))
+ ENICPMD_LOG(DEBUG, "VF_UNREGISTER_REPLY ok");
+}
+
+static int vf_set_packet_filter(struct enic *enic, int directed, int multicast,
+ int broadcast, int promisc, int allmulti)
+{
+ struct enic_mbox_vf_set_pkt_filter_flags_msg req;
+ uint8_t msg[ENIC_ADMIN_BUF_SIZE];
+ uint16_t flags;
+ int err;
+
+ ENICPMD_FUNC_TRACE();
+ enic_mbox_init_msg_hdr(enic, &req, ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REQUEST);
+ flags = 0;
+ if (directed)
+ flags |= ENIC_MBOX_PKT_FILTER_DIRECTED;
+ if (multicast)
+ flags |= ENIC_MBOX_PKT_FILTER_MULTICAST;
+ if (broadcast)
+ flags |= ENIC_MBOX_PKT_FILTER_BROADCAST;
+ if (promisc)
+ flags |= ENIC_MBOX_PKT_FILTER_PROMISC;
+ if (allmulti)
+ flags |= ENIC_MBOX_PKT_FILTER_ALLMULTI;
+ req.flags = flags;
+ req.pad = 0;
+ /* Lock admin channel while we send and wait for the reply, to prevent
+ * enic_poll_vf_admin_chan() (RQ interrupt) from interfering.
+ */
+ lock_admin_chan(enic);
+ enic_admin_wq_post(enic, &req);
+ ENICPMD_LOG(DEBUG, "sent VF_SET_PKT_FILTER_FLAGS flags=0x%x", flags);
+ err = recv_reply(enic, msg, ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REPLY);
+ unlock_admin_chan(enic);
+ if (err) {
+ ENICPMD_LOG(DEBUG, "VF_SET_PKT_FILTER_FLAGS_REPLY failed");
+ goto out;
+ }
+ ENICPMD_LOG(DEBUG, "VF_SET_PKT_FILTER_FLAGS_REPLY ok");
+out:
+ return err;
+}
+
+int enic_dev_packet_filter(struct enic *enic, int directed, int multicast,
+ int broadcast, int promisc, int allmulti)
+{
+ if (enic_is_vf(enic)) {
+ RTE_VERIFY(admin_chan_enabled(enic));
+ return vf_set_packet_filter(enic, directed, multicast,
+ broadcast, promisc, allmulti);
+ }
+ return vnic_dev_packet_filter(enic->vdev, directed, multicast,
+ broadcast, promisc, allmulti);
+}
+
+static int vf_add_del_addr(struct enic *enic, uint8_t *addr, bool delete)
+{
+ struct enic_mbox_vf_add_del_mac_msg req;
+ uint8_t msg[ENIC_ADMIN_BUF_SIZE];
+ int err;
+
+ ENICPMD_FUNC_TRACE();
+ enic_mbox_init_msg_hdr(enic, &req, ENIC_MBOX_VF_ADD_DEL_MAC_REQUEST);
+
+ req.num_addrs = 1;
+ memcpy(req.mac_addr.addr, addr, RTE_ETHER_ADDR_LEN);
+ req.mac_addr.flags = delete ? 0 : MAC_ADDR_FLAG_ADD;
+
+ lock_admin_chan(enic);
+ enic_admin_wq_post(enic, &req);
+ ENICPMD_LOG(DEBUG, "sent VF_ADD_DEL_MAC");
+ err = recv_reply(enic, msg, ENIC_MBOX_VF_ADD_DEL_MAC_REPLY);
+ unlock_admin_chan(enic);
+ if (err) {
+ ENICPMD_LOG(DEBUG, "VF_ADD_DEL_MAC_REPLY failed");
+ goto out;
+ }
+ ENICPMD_LOG(DEBUG, "VF_ADD_DEL_MAC_REPLY ok");
+out:
+ return err;
+}
+
+int enic_dev_add_addr(struct enic *enic, uint8_t *addr)
+{
+ ENICPMD_FUNC_TRACE();
+ if (enic_is_vf(enic)) {
+ RTE_VERIFY(admin_chan_enabled(enic));
+ return vf_add_del_addr(enic, addr, false);
+ }
+ return vnic_dev_add_addr(enic->vdev, addr);
+}
+
+int enic_dev_del_addr(struct enic *enic, uint8_t *addr)
+{
+ ENICPMD_FUNC_TRACE();
+ if (enic_is_vf(enic)) {
+ RTE_VERIFY(admin_chan_enabled(enic));
+ return vf_add_del_addr(enic, addr, true);
+ }
+ return vnic_dev_del_addr(enic->vdev, addr);
+}
new file mode 100644
@@ -0,0 +1,211 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2024 Cisco Systems, Inc. All rights reserved.
+ */
+
+#ifndef _ENIC_SRIOV_H_
+#define _ENIC_SRIOV_H_
+
+/*
+ * SR-IOV VF and PF drivers exchange control messages through
+ * the admin channel. Those messages are all defined here.
+ *
+ * VF_ prefix means the VF driver initiates the request.
+ * PF_ prefix means the PF driver initiates the request.
+ */
+enum enic_mbox_msg_type {
+ ENIC_MBOX_VF_CAPABILITY_REQUEST,
+ ENIC_MBOX_VF_CAPABILITY_REPLY,
+ ENIC_MBOX_VF_REGISTER_REQUEST,
+ ENIC_MBOX_VF_REGISTER_REPLY,
+ ENIC_MBOX_VF_UNREGISTER_REQUEST,
+ ENIC_MBOX_VF_UNREGISTER_REPLY,
+ ENIC_MBOX_PF_LINK_STATE_NOTIF,
+ ENIC_MBOX_PF_LINK_STATE_ACK,
+ ENIC_MBOX_PF_GET_STATS_REQUEST,
+ ENIC_MBOX_PF_GET_STATS_REPLY,
+ ENIC_MBOX_VF_ADD_DEL_MAC_REQUEST,
+ ENIC_MBOX_VF_ADD_DEL_MAC_REPLY,
+ ENIC_MBOX_PF_SET_ADMIN_MAC_NOTIF,
+ ENIC_MBOX_PF_SET_ADMIN_MAC_ACK,
+ ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REQUEST,
+ ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REPLY,
+ ENIC_MBOX_MAX
+};
+
+/*
+ * Special value for {src,dst}_vnic_id. 0xffff means PF.
+ * For VF, vnic_id is the VF id.
+ */
+#define ENIC_MBOX_DST_PF 0xffff
+
+struct enic_mbox_hdr {
+ uint16_t src_vnic_id;
+ uint16_t dst_vnic_id;
+ uint8_t msg_type;
+ uint8_t flags;
+ uint16_t msg_len;
+ uint64_t msg_num;
+};
+
+#define ENIC_MBOX_ERR_GENERIC RTE_BIT32(0)
+
+struct enic_mbox_generic_reply_msg {
+ uint16_t ret_major;
+ uint16_t ret_minor;
+};
+
+/*
+ * ENIC_MBOX_VF_CAPABILITY_REQUEST
+ * ENIC_MBOX_VF_CAPABILITY_REPLY
+ */
+#define ENIC_MBOX_CAP_VERSION_0 0
+#define ENIC_MBOX_CAP_VERSION_1 1
+#define ENIC_MBOX_CAP_VERSION_INVALID 0xffffffff
+
+struct enic_mbox_vf_capability_msg {
+ struct enic_mbox_hdr hdr;
+ uint32_t version;
+ uint32_t reserved[32]; /* 128B for future use */
+};
+
+struct enic_mbox_vf_capability_reply_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+ uint32_t version;
+ uint32_t reserved[32]; /* 128B for future use */
+};
+
+/*
+ * ENIC_MBOX_VF_REGISTER_REQUEST
+ * ENIC_MBOX_VF_REGISTER_REPLY
+ * ENIC_MBOX_VF_UNREGISTER_REQUEST
+ * ENIC_MBOX_VF_UNREGISTER_REPLY
+ */
+struct enic_mbox_vf_register_msg {
+ struct enic_mbox_hdr hdr;
+};
+
+struct enic_mbox_vf_register_reply_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+};
+
+struct enic_mbox_vf_unregister_msg {
+ struct enic_mbox_hdr hdr;
+};
+
+struct enic_mbox_vf_unregister_reply_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+};
+
+/*
+ * ENIC_MBOX_PF_LINK_STATE_NOTIF
+ * ENIC_MBOX_PF_LINK_STATE_ACK
+ */
+#define ENIC_MBOX_LINK_STATE_DISABLE 0
+#define ENIC_MBOX_LINK_STATE_ENABLE 1
+struct enic_mbox_pf_link_state_notif_msg {
+ struct enic_mbox_hdr hdr;
+ uint32_t link_state;
+};
+
+struct enic_mbox_pf_link_state_ack_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+};
+
+/*
+ * ENIC_MBOX_PF_GET_STATS_REQUEST
+ * ENIC_MBOX_PF_GET_STATS_REPLY
+ */
+#define ENIC_MBOX_GET_STATS_RX RTE_BIT32(0)
+#define ENIC_MBOX_GET_STATS_TX RTE_BIT32(1)
+#define ENIC_MBOX_GET_STATS_ALL (ENIC_MBOX_GET_STATS_RX | ENIC_MBOX_GET_STATS_TX)
+
+struct enic_mbox_pf_get_stats_msg {
+ struct enic_mbox_hdr hdr;
+ uint16_t flags;
+ uint16_t pad;
+};
+
+struct enic_mbox_pf_get_stats_reply {
+ struct vnic_stats vnic_stats;
+ /* The size of struct vnic_stats is guaranteed not to change, but
+ * the number of counters (in the rx/tx elements of that struct) that
+ * are actually initialized may vary depending on the driver version
+ * (new fields may be added to the rsvd blocks).
+ * These two variables tell us how much of the tx/rx blocks inside
+ * struct vnic_stats the VF driver knows about according to its
+ * definition of that data structure.
+ */
+ uint8_t num_rx_stats;
+ uint8_t num_tx_stats;
+ uint8_t pad[6];
+};
+
+struct enic_mbox_pf_get_stats_reply_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+ struct enic_mbox_pf_get_stats_reply stats;
+};
+
+/*
+ * ENIC_MBOX_VF_ADD_DEL_MAC_REQUEST
+ * ENIC_MBOX_VF_ADD_DEL_MAC_REPLY
+ */
+/* enic_mac_addr.flags: Lower 8 bits are used in VF->PF direction (request) */
+#define MAC_ADDR_FLAG_ADD RTE_BIT32(0)
+#define MAC_ADDR_FLAG_STATION RTE_BIT32(1)
+
+struct enic_mac_addr {
+ uint8_t addr[RTE_ETHER_ADDR_LEN];
+ uint16_t flags;
+};
+
+struct enic_mbox_vf_add_del_mac_msg {
+ struct enic_mbox_hdr hdr;
+ uint16_t num_addrs;
+ uint16_t pad;
+ /* This can be mac_addr[], but the driver only uses 1 element */
+ struct enic_mac_addr mac_addr;
+};
+
+struct enic_mbox_vf_add_del_mac_reply_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+ struct enic_mbox_vf_add_del_mac_msg detailed_reply[];
+};
+
+/*
+ * ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REQUEST
+ * ENIC_MBOX_VF_SET_PKT_FILTER_FLAGS_REPLY
+ */
+#define ENIC_MBOX_PKT_FILTER_DIRECTED RTE_BIT32(0)
+#define ENIC_MBOX_PKT_FILTER_MULTICAST RTE_BIT32(1)
+#define ENIC_MBOX_PKT_FILTER_BROADCAST RTE_BIT32(2)
+#define ENIC_MBOX_PKT_FILTER_PROMISC RTE_BIT32(3)
+#define ENIC_MBOX_PKT_FILTER_ALLMULTI RTE_BIT32(4)
+
+struct enic_mbox_vf_set_pkt_filter_flags_msg {
+ struct enic_mbox_hdr hdr;
+ uint16_t flags;
+ uint16_t pad;
+};
+
+struct enic_mbox_vf_set_pkt_filter_flags_reply_msg {
+ struct enic_mbox_hdr hdr;
+ struct enic_mbox_generic_reply_msg generic_reply;
+};
+
+int enic_enable_vf_admin_chan(struct enic *enic);
+int enic_disable_vf_admin_chan(struct enic *enic, bool unregister);
+void enic_poll_vf_admin_chan(struct enic *enic);
+
+/* devcmds that may go through PF driver */
+int enic_dev_packet_filter(struct enic *enic, int directed, int multicast,
+ int broadcast, int promisc, int allmulti);
+int enic_dev_add_addr(struct enic *enic, uint8_t *addr);
+int enic_dev_del_addr(struct enic *enic, uint8_t *addr);
+
+#endif /* _ENIC_SRIOV_H_ */
@@ -23,6 +23,7 @@ sources = files(
'enic_main.c',
'enic_res.c',
'enic_rxtx.c',
+ 'enic_sriov.c',
'enic_vf_representor.c',
)
deps += ['hash']