net/ixgbe: Strip SR-IOV transparent VLANs in VF

Message ID 1535128501-31597-1-git-send-email-robertshearman@gmail.com
State Changes Requested, archived
Delegated to: Qi Zhang
Series
  • net/ixgbe: Strip SR-IOV transparent VLANs in VF

Checks

Context               Check    Description
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         warning  coding style issues

Commit Message

Robert Shearman Aug. 24, 2018, 4:35 p.m.
From: Robert Shearman <robert.shearman@att.com>

SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM
associated with a VF has a VLAN tag inserted/stripped in a manner
intended to be totally transparent to the VM.  On a Linux hypervisor
the vlan can be specified by "ip link set <device> vf <n> vlan <v>".
The VM VF driver is not configured to use any VLAN and the VM should
never see the transparent VLAN for that reason.  However, in practice
these VLAN headers are being received by the VM which discards the
packets as that VLAN is unknown to it.  The Linux kernel ixgbe driver
explicitly removes the VLAN in this case (presumably due to the
hardware not being able to do this) but the DPDK driver does not.

This patch mirrors the kernel driver behaviour by removing the VLAN on
the VF side. This is done by checking the VLAN in the VFTA, where the
hypervisor will have set the bit in the VFTA corresponding to the VLAN
if transparent VLANs were being used for the VF. If the VLAN is set in
the VFTA then it is known that it's a transparent VLAN case and so the
VLAN is stripped from the mbuf. To limit any potential performance
impact on the PF data path, the RX path is split into PF and VF
versions with the transparent VLAN stripping only done in the VF
path. Measurements with our application show ~2% performance hit for
the VF case and none for the PF case.

Signed-off-by: Robert Shearman <robert.shearman@att.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c        | 18 +++----
 drivers/net/ixgbe/ixgbe_ethdev.h        | 38 +++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.c          | 83 +++++++++++++++++++++++++++++---
 drivers/net/ixgbe/ixgbe_rxtx.h          | 31 +++++++++++-
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c |  7 +++
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  | 84 ++++++++++++++++++++++++++++++---
 6 files changed, 238 insertions(+), 23 deletions(-)
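
As a worked illustration of the VFTA check described in the commit message: the table is a 4096-bit array held as 128 32-bit words, so a VLAN ID selects a word with its upper seven bits and a bit within that word with its lower five. The sketch below is self-contained and only approximates the driver; `struct shadow_vfta` and the `vfta_*` names are placeholders chosen for this example, not the ixgbe definitions. It uses an unsigned constant for the shift so that bit 31 is well defined.

```c
#include <stdbool.h>
#include <stdint.h>

#define VFTA_WORDS 128 /* 128 x 32 bits = 4096 VLAN IDs */

/* Stand-in for the driver's shadow VFTA; not the real struct ixgbe_vfta. */
struct shadow_vfta {
	uint32_t word[VFTA_WORDS];
};

/* Index of the 32-bit word that holds the bit for this VLAN ID. */
static inline uint32_t
vfta_index(uint16_t vlan)
{
	return (vlan >> 5) & 0x7f;
}

/* Mask of the bit for this VLAN ID inside its word. */
static inline uint32_t
vfta_bit(uint16_t vlan)
{
	return UINT32_C(1) << (vlan & 0x1f);
}

/* Set or clear the bit for a VLAN ID. */
static inline void
vfta_set(struct shadow_vfta *t, uint16_t vlan, bool on)
{
	if (on)
		t->word[vfta_index(vlan)] |= vfta_bit(vlan);
	else
		t->word[vfta_index(vlan)] &= ~vfta_bit(vlan);
}

/* True if the bit for this VLAN ID is set. */
static inline bool
vfta_is_set(const struct shadow_vfta *t, uint16_t vlan)
{
	return (t->word[vfta_index(vlan)] & vfta_bit(vlan)) != 0;
}
```

For VLAN 100 this gives word 3 (100 >> 5) and bit 4 (100 & 0x1f).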

Comments

Chas Williams Aug. 28, 2018, 11:58 p.m. | #1
On Fri, Aug 24, 2018 at 12:45 PM <robertshearman@gmail.com> wrote:

> From: Robert Shearman <robert.shearman@att.com>
>
> SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM
> associated with a VF has a VLAN tag inserted/stripped in a manner
> intended to be totally transparent to the VM.  On a Linux hypervisor
> the vlan can be specified by "ip link set <device> vf <n> vlan <v>".
> The VM VF driver is not configured to use any VLAN and the VM should
> never see the transparent VLAN for that reason.  However, in practice
> these VLAN headers are being received by the VM which discards the
> packets as that VLAN is unknown to it.  The Linux kernel ixgbe driver
> explicitly removes the VLAN in this case (presumably due to the
> hardware not being able to do this) but the DPDK driver does not.
>
> This patch mirrors the kernel driver behaviour by removing the VLAN on
> the VF side. This is done by checking the VLAN in the VFTA, where the
> hypervisor will have set the bit in the VFTA corresponding to the VLAN
> if transparent VLANs were being used for the VF. If the VLAN is set in
> the VFTA then it is known that it's a transparent VLAN case and so the
> VLAN is stripped from the mbuf. To limit any potential performance
> impact on the PF data path, the RX path is split into PF and VF
> versions with the transparent VLAN stripping only done in the VF
> path. Measurements with our application show ~2% performance hit for
> the VF case and none for the PF case.
>
> Signed-off-by: Robert Shearman <robert.shearman@att.com>
>

Reviewed-by: Chas Williams <chas3@att.com>



> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c        | 18 +++----
>  drivers/net/ixgbe/ixgbe_ethdev.h        | 38 +++++++++++++++
>  drivers/net/ixgbe/ixgbe_rxtx.c          | 83 +++++++++++++++++++++++++++++---
>  drivers/net/ixgbe/ixgbe_rxtx.h          | 31 +++++++++++-
>  drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c |  7 +++
>  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  | 84 ++++++++++++++++++++++++++++++---
>  6 files changed, 238 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 26b1927..3f88a02 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -604,7 +604,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>         .vlan_filter_set      = ixgbevf_vlan_filter_set,
>         .vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
>         .vlan_offload_set     = ixgbevf_vlan_offload_set,
> -       .rx_queue_setup       = ixgbe_dev_rx_queue_setup,
> +       .rx_queue_setup       = ixgbevf_dev_rx_queue_setup,
>         .rx_queue_release     = ixgbe_dev_rx_queue_release,
>         .rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
>         .rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
> @@ -1094,7 +1094,7 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev, void *init_params __rte_unused)
>                                      "Using default TX function.");
>                 }
>
> -               ixgbe_set_rx_function(eth_dev);
> +               ixgbe_set_rx_function(eth_dev, true);
>
>                 return 0;
>         }
> @@ -1576,7 +1576,7 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
>                                      "No TX queues configured yet. Using default TX function.");
>                 }
>
> -               ixgbe_set_rx_function(eth_dev);
> +               ixgbe_set_rx_function(eth_dev, true);
>
>                 return 0;
>         }
> @@ -1839,8 +1839,8 @@ ixgbe_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>         uint32_t vid_idx;
>         uint32_t vid_bit;
>
> -       vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
> -       vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
> +       vid_idx = ixgbe_vfta_index(vlan_id);
> +       vid_bit = ixgbe_vfta_bit(vlan_id);
>         vfta = IXGBE_READ_REG(hw, IXGBE_VFTA(vid_idx));
>         if (on)
>                 vfta |= vid_bit;
> @@ -3807,7 +3807,9 @@ ixgbe_dev_supported_ptypes_get(struct rte_eth_dev *dev)
>
>  #if defined(RTE_ARCH_X86)
>         if (dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
> -           dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec)
> +           dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec ||
> +           dev->rx_pkt_burst == ixgbevf_recv_pkts_vec ||
> +           dev->rx_pkt_burst == ixgbevf_recv_scattered_pkts_vec)
>                 return ptypes;
>  #endif
>         return NULL;
> @@ -5231,8 +5233,8 @@ ixgbevf_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>                 PMD_INIT_LOG(ERR, "Unable to set VF vlan");
>                 return ret;
>         }
> -       vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
> -       vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
> +       vid_idx = ixgbe_vfta_index(vlan_id);
> +       vid_bit = ixgbe_vfta_bit(vlan_id);
>
>         /* Save what we set and retore it after device reset */
>         if (on)
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
> index d0b9396..483d2cd 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -568,6 +568,11 @@ int  ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
>                 const struct rte_eth_rxconf *rx_conf,
>                 struct rte_mempool *mb_pool);
>
> +int  ixgbevf_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
> +                               uint16_t nb_rx_desc, unsigned int socket_id,
> +                               const struct rte_eth_rxconf *rx_conf,
> +                               struct rte_mempool *mb_pool);
> +
>  int  ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
>                 uint16_t nb_tx_desc, unsigned int socket_id,
>                 const struct rte_eth_txconf *tx_conf);
> @@ -779,4 +784,37 @@ ixgbe_ethertype_filter_remove(struct ixgbe_filter_info *filter_info,
>         return idx;
>  }
>
> +int ixgbe_fdir_ctrl_func(struct rte_eth_dev *dev,
> +                       enum rte_filter_op filter_op, void *arg);
> +
> +/*
> + * Calculate index in vfta array of the 32 bit value enclosing
> + * a given vlan id
> + */
> +static inline uint32_t
> +ixgbe_vfta_index(uint16_t vlan)
> +{
> +       return (vlan >> 5) & 0x7f;
> +}
> +
> +/*
> + * Calculate vfta array entry bitmask for vlan id within the
> + * enclosing 32 bit entry.
> + */
> +static inline uint32_t
> +ixgbe_vfta_bit(uint16_t vlan)
> +{
> +       return 1 << (vlan & 0x1f);
> +}
> +
> +/*
> + * Check in the vfta bit array if the bit corresponding to
> + * the given vlan is set.
> + */
> +static inline bool
> +ixgbe_vfta_is_vlan_set(const struct ixgbe_vfta *vfta, uint16_t vlan)
> +{
> +       return (vfta->vfta[ixgbe_vfta_index(vlan)] & ixgbe_vfta_bit(vlan)) != 0;
> +}
> +
>  #endif /* _IXGBE_ETHDEV_H_ */
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index f82b74a..26a99cb 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1623,14 +1623,23 @@ ixgbe_rx_fill_from_stage(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                          uint16_t nb_pkts)
>  {
>         struct rte_mbuf **stage = &rxq->rx_stage[rxq->rx_next_avail];
> +       const struct rte_eth_dev *dev;
> +       const struct ixgbe_vfta *vfta;
>         int i;
>
> +       dev = &rte_eth_devices[rxq->port_id];
> +       vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> +
>         /* how many packets are ready to return? */
>         nb_pkts = (uint16_t)RTE_MIN(nb_pkts, rxq->rx_nb_avail);
>
>         /* copy mbuf pointers to the application's packet list */
> -       for (i = 0; i < nb_pkts; ++i)
> +       for (i = 0; i < nb_pkts; ++i) {
>                 rx_pkts[i] = stage[i];
> +               if (rxq->vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[i],
> +                                                        vfta);
> +       }
>
>         /* update internal queue state */
>         rxq->rx_nb_avail = (uint16_t)(rxq->rx_nb_avail - nb_pkts);
> @@ -1750,6 +1759,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>         uint16_t nb_hold;
>         uint64_t pkt_flags;
>         uint64_t vlan_flags;
> +       const struct rte_eth_dev *dev;
> +       const struct ixgbe_vfta *vfta;
>
>         nb_rx = 0;
>         nb_hold = 0;
> @@ -1758,6 +1769,9 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>         rx_ring = rxq->rx_ring;
>         sw_ring = rxq->sw_ring;
>         vlan_flags = rxq->vlan_flags;
> +       dev = &rte_eth_devices[rxq->port_id];
> +       vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> +
>         while (nb_rx < nb_pkts) {
>                 /*
>                  * The order of operations here is important as the DD status
> @@ -1876,6 +1890,10 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>                         ixgbe_rxd_pkt_info_to_pkt_type(pkt_info,
>                                                        rxq->pkt_type_mask);
>
> +               if (rxq->vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(rxm,
> +                                                        vfta);
> +
>                 if (likely(pkt_flags & PKT_RX_RSS_HASH))
>                         rxm->hash.rss = rte_le_to_cpu_32(
>                                                 rxd.wb.lower.hi_dword.rss);
> @@ -2016,6 +2034,11 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
>         uint16_t nb_rx = 0;
>         uint16_t nb_hold = rxq->nb_rx_hold;
>         uint16_t prev_id = rxq->rx_tail;
> +       const struct rte_eth_dev *dev;
> +       const struct ixgbe_vfta *vfta;
> +
> +       dev = &rte_eth_devices[rxq->port_id];
> +       vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
>
>         while (nb_rx < nb_pkts) {
>                 bool eop;
> @@ -2230,6 +2253,10 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
>                 rte_packet_prefetch((char *)first_seg->buf_addr +
>                         first_seg->data_off);
>
> +               if (rxq->vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(first_seg,
> +                                                        vfta);
> +
>                 /*
>                  * Store the mbuf address into the next entry of the array
>                  * of returned packets.
> @@ -3066,6 +3093,25 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
>         return 0;
>  }
>
> +int __attribute__((cold))
> +ixgbevf_dev_rx_queue_setup(struct rte_eth_dev *dev,
> +                          uint16_t queue_idx,
> +                          uint16_t nb_desc,
> +                          unsigned int socket_id,
> +                          const struct rte_eth_rxconf *rx_conf,
> +                          struct rte_mempool *mp)
> +{
> +       struct ixgbe_rx_queue *rxq;
> +
> +       ixgbe_dev_rx_queue_setup(dev, queue_idx, nb_desc, socket_id,
> +                                rx_conf, mp);
> +
> +       rxq = dev->data->rx_queues[queue_idx];
> +       rxq->vf = true;
> +
> +       return 0;
> +}
> +
>  uint32_t
>  ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
>  {
> @@ -4561,7 +4607,7 @@ ixgbe_set_ivar(struct rte_eth_dev *dev, u8 entry, u8 vector, s8 type)
>  }
>
>  void __attribute__((cold))
> -ixgbe_set_rx_function(struct rte_eth_dev *dev)
> +ixgbe_set_rx_function(struct rte_eth_dev *dev, bool vf)
>  {
>         uint16_t i, rx_using_sse;
>         struct ixgbe_adapter *adapter =
> @@ -4608,7 +4654,9 @@ ixgbe_set_rx_function(struct rte_eth_dev *dev)
>                                             "callback (port=%d).",
>                                      dev->data->port_id);
>
> -                       dev->rx_pkt_burst = ixgbe_recv_scattered_pkts_vec;
> +                       dev->rx_pkt_burst = vf ?
> +                               ixgbevf_recv_scattered_pkts_vec :
> +                               ixgbe_recv_scattered_pkts_vec;
>                 } else if (adapter->rx_bulk_alloc_allowed) {
>                         PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
>                                            "allocation callback (port=%d).",
> @@ -4637,7 +4685,8 @@ ixgbe_set_rx_function(struct rte_eth_dev *dev)
>                              RTE_IXGBE_DESCS_PER_LOOP,
>                              dev->data->port_id);
>
> -               dev->rx_pkt_burst = ixgbe_recv_pkts_vec;
> +               dev->rx_pkt_burst = vf ? ixgbevf_recv_pkts_vec :
> +                       ixgbe_recv_pkts_vec;
>         } else if (adapter->rx_bulk_alloc_allowed) {
>                 PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
>                                     "satisfied. Rx Burst Bulk Alloc function "
> @@ -4658,7 +4707,9 @@ ixgbe_set_rx_function(struct rte_eth_dev *dev)
>
>         rx_using_sse =
>                 (dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec ||
> -               dev->rx_pkt_burst == ixgbe_recv_pkts_vec);
> +                dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
> +                dev->rx_pkt_burst == ixgbevf_recv_scattered_pkts_vec ||
> +                dev->rx_pkt_burst == ixgbevf_recv_pkts_vec);
>
>         for (i = 0; i < dev->data->nb_rx_queues; i++) {
>                 struct ixgbe_rx_queue *rxq = dev->data->rx_queues[i];
> @@ -4977,7 +5028,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>         if (rc)
>                 return rc;
>
> -       ixgbe_set_rx_function(dev);
> +       ixgbe_set_rx_function(dev, false);
>
>         return 0;
>  }
> @@ -5500,7 +5551,7 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
>                 IXGBE_PSRTYPE_RQPL_SHIFT;
>         IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
>
> -       ixgbe_set_rx_function(dev);
> +       ixgbe_set_rx_function(dev, true);
>
>         return 0;
>  }
> @@ -5731,6 +5782,24 @@ ixgbe_recv_pkts_vec(
>  }
>
>  uint16_t __attribute__((weak))
> +ixgbevf_recv_pkts_vec(
> +       void __rte_unused *rx_queue,
> +       struct rte_mbuf __rte_unused **rx_pkts,
> +       uint16_t __rte_unused nb_pkts)
> +{
> +       return 0;
> +}
> +
> +uint16_t __attribute__((weak))
> +ixgbevf_recv_scattered_pkts_vec(
> +       void __rte_unused *rx_queue,
> +       struct rte_mbuf __rte_unused **rx_pkts,
> +       uint16_t __rte_unused nb_pkts)
> +{
> +       return 0;
> +}
> +
> +uint16_t __attribute__((weak))
>  ixgbe_recv_scattered_pkts_vec(
>         void __rte_unused *rx_queue,
>         struct rte_mbuf __rte_unused **rx_pkts,
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
> index 39378f7..676557b 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.h
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.h
> @@ -111,6 +111,7 @@ struct ixgbe_rx_queue {
>         uint16_t rx_free_trigger; /**< triggers rx buffer allocation */
>         uint8_t            rx_using_sse;
>         /**< indicates that vector RX is in use */
> +       uint8_t            vf; /**< indicates that this is for a VF */
>  #ifdef RTE_LIBRTE_SECURITY
>         uint8_t            using_ipsec;
>         /**< indicates that IPsec RX feature is in use */
> @@ -254,6 +255,30 @@ struct ixgbe_txq_ops {
>                          IXGBE_ADVTXD_DCMD_EOP)
>
>
> +
> +/*
> + * Filter out unknown vlans resulting from use of transparent vlan.
> + *
> + * When a VF is configured to use transparent vlans then the VF can
> + * see this VLAN being set in the packet, meaning that the transparent
> + * property isn't preserved. Furthermore, when the VF is used in a
> + * guest VM then there's no way of knowing for sure that transparent
> + * VLAN is in use and what tag value has been configured. So work
> + * around this by removing the VLAN flag if the VF isn't interested in
> + * the VLAN tag.
> + */
> +static inline void
> +ixgbevf_trans_vlan_sw_filter_hdr(struct rte_mbuf *m,
> +                                const struct ixgbe_vfta *vfta)
> +{
> +       if (m->ol_flags & PKT_RX_VLAN) {
> +               uint16_t vlan = m->vlan_tci & 0xFFF;
> +
> +               if (!ixgbe_vfta_is_vlan_set(vfta, vlan))
> +                       m->ol_flags &= ~PKT_RX_VLAN;
> +       }
> +}
> +
>  /* Takes an ethdev and a queue and sets up the tx function to be used based on
>   * the queue parameters. Used in tx_queue_setup by primary process and then
>   * in dev_init by secondary process when attaching to an existing ethdev.
> @@ -274,12 +299,16 @@ void ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq);
>   *
>   * @dev rte_eth_dev handle
>   */
> -void ixgbe_set_rx_function(struct rte_eth_dev *dev);
> +void ixgbe_set_rx_function(struct rte_eth_dev *dev, bool vf);
>
>  uint16_t ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>                 uint16_t nb_pkts);
>  uint16_t ixgbe_recv_scattered_pkts_vec(void *rx_queue,
>                 struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
> +uint16_t ixgbevf_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +               uint16_t nb_pkts);
> +uint16_t ixgbevf_recv_scattered_pkts_vec(void *rx_queue,
> +               struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
>  int ixgbe_rx_vec_dev_conf_condition_check(struct rte_eth_dev *dev);
>  int ixgbe_rxq_vec_setup(struct ixgbe_rx_queue *rxq);
>  void ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq);
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> index edb1383..d077918 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> @@ -149,6 +149,9 @@ static inline uint16_t
>  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                    uint16_t nb_pkts, uint8_t *split_packet)
>  {
> +       const struct rte_eth_dev *dev = &rte_eth_devices[rxq->port_id];
> +       const struct ixgbe_vfta *vfta
> +               = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
>         volatile union ixgbe_adv_rx_desc *rxdp;
>         struct ixgbe_rx_entry *sw_ring;
>         uint16_t nb_pkts_recd;
> @@ -272,8 +275,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                 /* D.3 copy final 3,4 data to rx_pkts */
>                 vst1q_u8((void *)&rx_pkts[pos + 3]->rx_descriptor_fields1,
>                          pkt_mb4);
> +               ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 3], vfta, rxq);
>                 vst1q_u8((void *)&rx_pkts[pos + 2]->rx_descriptor_fields1,
>                          pkt_mb3);
> +               ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 2], vfta, rxq);
>
>                 /* D.2 pkt 1,2 set in_port/nb_seg and remove crc */
>                 tmp = vsubq_u16(vreinterpretq_u16_u8(pkt_mb2), crc_adjust);
> @@ -294,8 +299,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                 /* D.3 copy final 1,2 data to rx_pkts */
>                 vst1q_u8((uint8_t *)&rx_pkts[pos + 1]->rx_descriptor_fields1,
>                          pkt_mb2);
> +               ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 1], vfta, rxq);
>                 vst1q_u8((uint8_t *)&rx_pkts[pos]->rx_descriptor_fields1,
>                          pkt_mb1);
> +               ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos], vfta, rxq);
>
>                 stat &= IXGBE_VPMD_DESC_DD_MASK;
>
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> index c9ba482..04a3307 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> @@ -313,9 +313,10 @@ desc_to_ptype_v(__m128i descs[4], uint16_t pkt_type_mask,
>   */
>  static inline uint16_t
>  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> -               uint16_t nb_pkts, uint8_t *split_packet)
> +                  uint16_t nb_pkts, bool vf, uint8_t *split_packet)
>  {
>         volatile union ixgbe_adv_rx_desc *rxdp;
> +       const struct ixgbe_vfta *vfta = NULL;
>         struct ixgbe_rx_entry *sw_ring;
>         uint16_t nb_pkts_recd;
>  #ifdef RTE_LIBRTE_SECURITY
> @@ -344,6 +345,13 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>         __m128i mbuf_init;
>         uint8_t vlan_flags;
>
> +       if (vf) {
> +               const struct rte_eth_dev *dev =
> +                       &rte_eth_devices[rxq->port_id];
> +
> +               vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> +       }
> +
>         /* nb_pkts shall be less equal than RTE_IXGBE_MAX_RX_BURST */
>         nb_pkts = RTE_MIN(nb_pkts, RTE_IXGBE_MAX_RX_BURST);
>
> @@ -500,8 +508,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                 /* D.3 copy final 3,4 data to rx_pkts */
>                 _mm_storeu_si128((void *)&rx_pkts[pos+3]->rx_descriptor_fields1,
>                                 pkt_mb4);
> +               if (vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 3],
> +                                                        vfta);
> +
>                 _mm_storeu_si128((void *)&rx_pkts[pos+2]->rx_descriptor_fields1,
>                                 pkt_mb3);
> +               if (vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 2],
> +                                                        vfta);
>
>                 /* D.2 pkt 1,2 set in_port/nb_seg and remove crc */
>                 pkt_mb2 = _mm_add_epi16(pkt_mb2, crc_adjust);
> @@ -536,8 +551,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>                 /* D.3 copy final 1,2 data to rx_pkts */
>                 _mm_storeu_si128((void *)&rx_pkts[pos+1]->rx_descriptor_fields1,
>                                 pkt_mb2);
> +               if (vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 1],
> +                                                        vfta);
> +
>                 _mm_storeu_si128((void *)&rx_pkts[pos]->rx_descriptor_fields1,
>                                 pkt_mb1);
> +               if (vf)
> +                       ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos],
> +                                                        vfta);
>
>                 desc_to_ptype_v(descs, rxq->pkt_type_mask, &rx_pkts[pos]);
>
> @@ -569,11 +591,11 @@ uint16_t
>  ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>                 uint16_t nb_pkts)
>  {
> -       return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
> +       return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, false, NULL);
>  }
>
>  /*
> - * vPMD receive routine that reassembles scattered packets
> + * vPMD raw receive routine that reassembles scattered packets
>   *
>   * Notice:
>   * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> @@ -581,16 +603,16 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>   *   numbers of DD bit
>   * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
>   */
> -uint16_t
> -ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> -               uint16_t nb_pkts)
> +static inline uint16_t
> +_recv_raw_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +                            uint16_t nb_pkts, bool vf)
>  {
>         struct ixgbe_rx_queue *rxq = rx_queue;
>         uint8_t split_flags[RTE_IXGBE_MAX_RX_BURST] = {0};
>
>         /* get some new buffers */
>         uint16_t nb_bufs = _recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
> -                       split_flags);
> +                                             vf, split_flags);
>         if (nb_bufs == 0)
>                 return 0;
>
> @@ -614,6 +636,54 @@ ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>                 &split_flags[i]);
>  }
>
> +/*
> + * vPMD receive routine that reassembles scattered packets
> + *
> + * Notice:
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + *   numbers of DD bit
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +                             uint16_t nb_pkts)
> +{
> +       return _recv_raw_scattered_pkts_vec(rx_queue, rx_pkts, nb_pkts, false);
> +}
> +
> +/*
> + * vPMD VF receive routine, only accept(nb_pkts >= RTE_IXGBE_DESCS_PER_LOOP)
> + *
> + * Notice:
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + *   numbers of DD bit
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +ixgbevf_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +                     uint16_t nb_pkts)
> +{
> +       return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, true, NULL);
> +}
> +
> +/*
> + * vPMD VF receive routine that reassembles scattered packets
> + *
> + * Notice:
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + *   numbers of DD bit
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +ixgbevf_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +                               uint16_t nb_pkts)
> +{
> +       return _recv_raw_scattered_pkts_vec(rx_queue, rx_pkts, nb_pkts, true);
> +}
> +
>  static inline void
>  vtx1(volatile union ixgbe_adv_tx_desc *txdp,
>                 struct rte_mbuf *pkt, uint64_t flags)
> --
> 2.7.4
>
>

Zhang, Qi Z Sept. 3, 2018, 11:45 a.m. | #2
Hi Robert:

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of
> robertshearman@gmail.com
> Sent: Saturday, August 25, 2018 12:35 AM
> To: dev@dpdk.org
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Robert Shearman
> <robert.shearman@att.com>
> Subject: [dpdk-dev] [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs in VF
> 
> From: Robert Shearman <robert.shearman@att.com>
> 
> SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM associated with
> a VF has a VLAN tag inserted/stripped in a manner intended to be totally
> transparent to the VM.  On a Linux hypervisor the vlan can be specified by "ip
> link set <device> vf <n> vlan <v>".
> The VM VF driver is not configured to use any VLAN and the VM should never
> see the transparent VLAN for that reason.  However, in practice these VLAN
> headers are being received by the VM which discards the packets as that VLAN
> is unknown to it.  The Linux kernel ixgbe driver explicitly removes the VLAN in
> this case (presumably due to the hardware not being able to do this) but the
> DPDK driver does not.

I don't quite understand this part.
What does "explicitly removes the VLAN" mean?
DPDK also discards packets with an unmatched VLAN and strips the VLAN if vlan_strip is enabled, so what is the gap?
It would be better if you could give some examples.

> 
> This patch mirrors the kernel driver behaviour by removing the VLAN on the VF
> side. This is done by checking the VLAN in the VFTA, where the hypervisor will
> have set the bit in the VFTA corresponding to the VLAN if transparent VLANs
> were being used for the VF. If the VLAN is set in the VFTA then it is known that
> it's a transparent VLAN case and so the VLAN is stripped from the mbuf. 

This is misleading.
From your code, I only see the VLAN flag in ol_flags being cleared, not the VLAN itself being stripped.
I think the VLAN is always stripped if we enable VLAN strip on the queue.
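
To make the distinction concrete, here is a minimal sketch with mock types (`struct mock_mbuf`, `MOCK_RX_VLAN` and `struct mock_vfta` are illustrative stand-ins, not the DPDK definitions). It mirrors the decision in the helper quoted further down: when the tag is not one the VF has configured, only the flag in `ol_flags` is cleared, and `vlan_tci` keeps whatever value was written from the descriptor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the RX VLAN offload flag. */
#define MOCK_RX_VLAN (UINT64_C(1) << 0)

/* Only the two mbuf fields that matter for this example. */
struct mock_mbuf {
	uint64_t ol_flags;
	uint16_t vlan_tci;
};

/* 4096-bit table of the VLAN IDs the VF itself has asked for. */
struct mock_vfta {
	uint32_t word[128];
};

static inline bool
mock_vfta_is_set(const struct mock_vfta *t, uint16_t vlan)
{
	uint32_t bit = UINT32_C(1) << (vlan & 0x1f);

	return (t->word[(vlan >> 5) & 0x7f] & bit) != 0;
}

/* Hide VLANs the VF never configured by clearing the flag only. */
static inline void
mock_trans_vlan_filter(struct mock_mbuf *m, const struct mock_vfta *t)
{
	if (m->ol_flags & MOCK_RX_VLAN) {
		uint16_t vlan = m->vlan_tci & 0xfff;

		if (!mock_vfta_is_set(t, vlan))
			m->ol_flags &= ~MOCK_RX_VLAN; /* vlan_tci is left untouched */
	}
}
```

With an empty table, a packet carrying VLAN 10 comes out with the flag cleared and `vlan_tci` still equal to 10.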

> To
> limit any potential performance impact on the PF data path, the RX path is split
> into PF and VF versions with the transparent VLAN stripping only done in the
> VF path. Measurements with our application show ~2% performance hit for
> the VF case and none for the PF case.
> 
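
For readers unfamiliar with how the PF/VF split is arranged in the SSE path, the pattern can be sketched roughly as follows: one inline worker takes a `bool vf` that is a literal at each call site, and two thin wrappers are exported as separate burst functions. All names here (`struct pkt`, `recv_common`, `recv_pf`, `recv_vf`, `select_rx`) are invented for the illustration and the per-packet work is a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_VLAN 0x1u

/* Placeholder packet type for this example. */
struct pkt {
	uint32_t flags;
};

/*
 * Shared worker. `vf` is a constant at every call site, so once this is
 * inlined the compiler can typically drop the branch from the PF variant.
 */
static inline uint16_t
recv_common(struct pkt *pkts, uint16_t n, bool vf)
{
	uint16_t i;

	for (i = 0; i < n; i++) {
		if (vf)
			pkts[i].flags &= ~FLAG_VLAN; /* VF-only extra work */
	}
	return n;
}

/* PF variant: no extra per-packet work. */
uint16_t
recv_pf(struct pkt *pkts, uint16_t n)
{
	return recv_common(pkts, n, false);
}

/* VF variant: runs the additional filter step. */
uint16_t
recv_vf(struct pkt *pkts, uint16_t n)
{
	return recv_common(pkts, n, true);
}

typedef uint16_t (*rx_burst_fn)(struct pkt *pkts, uint16_t n);

/* Chosen once at setup time, not per packet. */
rx_burst_fn
select_rx(bool vf)
{
	return vf ? recv_vf : recv_pf;
}
```

The scalar receive paths in the patch instead test `rxq->vf` at run time, which is a per-packet branch on the PF as well.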

...

> +/*
> + * Filter out unknown vlans resulting from use of transparent vlan.
> + *
> + * When a VF is configured to use transparent vlans then the VF can
> + * see this VLAN being set in the packet, meaning that the transparent
> + * property isn't preserved. Furthermore, when the VF is used in a
> + * guest VM then there's no way of knowing for sure that transparent
> + * VLAN is in use and what tag value has been configured. So work
> + * around this by removing the VLAN flag if the VF isn't interested in
> + * the VLAN tag.
> + */
> +static inline void
> +ixgbevf_trans_vlan_sw_filter_hdr(struct rte_mbuf *m,
> +				 const struct ixgbe_vfta *vfta)
> +{
> +	if (m->ol_flags & PKT_RX_VLAN) {
> +		uint16_t vlan = m->vlan_tci & 0xFFF;
> +
> +		if (!ixgbe_vfta_is_vlan_set(vfta, vlan))
> +			m->ol_flags &= ~PKT_RX_VLAN;
> +	}
> +}
> +

Ideally, all drivers should behave consistently given the same configuration.
If "transparent VLAN" is a general feature, it should not be tied only to VFs, or even just to ixgbevf (what about i40evf?).
Otherwise, it should be handled in the application, not in the driver.

...

> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 3], vfta, rxq);

Where is ixgbe_unknown_vlan_sw_filter_hdr defined? I see it is only used in ixgbe_rxtx_vec_neon.c, so I assume there will be a compile error on that platform?

Regards
Qi

Robert Shearman Sept. 3, 2018, 1:14 p.m. | #3
Hi Qi,

On 03/09/2018 12:45, Zhang, Qi Z wrote:
> Hi Robert:
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of
>> robertshearman@gmail.com
>> Sent: Saturday, August 25, 2018 12:35 AM
>> To: dev@dpdk.org
>> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin
>> <konstantin.ananyev@intel.com>; Robert Shearman
>> <robert.shearman@att.com>
>> Subject: [dpdk-dev] [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs in VF
>>
>> From: Robert Shearman <robert.shearman@att.com>
>>
>> SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM associated with
>> a VF has a VLAN tag inserted/stripped in a manner intended to be totally
>> transparent to the VM.  On a Linux hypervisor the vlan can be specified by "ip
>> link set <device> vf <n> vlan <v>".
>> The VM VF driver is not configured to use any VLAN and the VM should never
>> see the transparent VLAN for that reason.  However, in practice these VLAN
>> headers are being received by the VM which discards the packets as that VLAN
>> is unknown to it.  The Linux kernel ixbge driver explicitly removes the VLAN in
>> this case (presumably due to the hardware not being able to do this) but the
>> DPDK driver does not.
> 
> I'm not quite understand this part.
> What does explicitly remove the VLAN means?,
> DPDK also discard unmatched VLAN and strip vlan if vlan_strip is enabled what is the gap?
> It will be better if you can give same examples

Sure. Typical use case for this is a hypervisor where it is necessary to 
provide L2 access into the guests, but there are insufficient, and so 
the hypervisor is using the PF and VFs are assigned to guests. In order 
to avoid having to configure each guest to use the VLAN and to not send 
any untagged traffic it is desirable to use transparent VLANs. For example:
Guest 1 = VLAN 10
Guest 2 = VLAN 20

ip link set eth0 vf 1 vlan 10
ip link set eth0 vf 2 vlan 20

Now this means that packets arriving tagged on the physical port should 
be delivered to the guest and arrive in the guest untagged. Similarly, 
packets transmitted untagged by the guest should gain a tag before they 
go out of the physical port. What you get when using the Linux VF ixgbe 
driver inside the VMs is exactly this since the driver knows that for 
this hardware the transparent stripping isn't done in hardware and is 
done inside the driver. What you get currently when using the DPDK VF 
ixgbe driver inside the VMs is that packets arrive tagged (e.g. with 
VLAN tag 10) and these are then dropped because the VM doesn't know 
about VLAN 10.

Transparent VLAN insertion works currently with both Linux and DPDK VF 
drivers.

> 
>>
>> This patch mirrors the kernel driver behaviour by removing the VLAN on the VF
>> side. This is done by checking the VLAN in the VFTA, where the hypervisor will
>> have set the bit in the VFTA corresponding to the VLAN if transparent VLANs
>> were being used for the VF. If the VLAN is set in the VFTA then it is known that
>> it's a transparent VLAN case and so the VLAN is stripped from the mbuf.
> 
> This is missing leading.
>  From your code, I only saw vlan flag in ol_flag is stripped, but not VLAN is stripped.
> I think vlan is always be stripped if we enable vlan strip on queue.

I think you're saying that the VLAN isn't removed if hardware RX VLAN 
stripping isn't configured. This is true, but might cost performance to 
cover this case too. If you're happy with that, then I can issue a V2 
with that addressed.

If you're suggesting that m->vlan_tci needs to be set to 0 when 
PKT_RX_VLAN is cleared from m->ol_flags, then I don't think that is 
necessary since my understanding is an application should only be 
looking at m->vlan_tci if m->ol_flags has PKT_RX_VLAN set.
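
The convention described above (only trust vlan_tci when the VLAN flag is set) can be sketched as follows. This is a minimal illustration, not DPDK code: `app_mbuf`, `APP_RX_VLAN` and `app_get_vlan_id` are hypothetical stand-ins for `struct rte_mbuf`, `PKT_RX_VLAN` and whatever accessor an application might use, and the flag value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PKT_RX_VLAN; the bit value is arbitrary in this sketch. */
#define APP_RX_VLAN (1ULL << 0)

/* Stand-in for the two rte_mbuf fields discussed in the thread. */
struct app_mbuf {
	uint64_t ol_flags;
	uint16_t vlan_tci;
};

/*
 * Return true and store the 12-bit VLAN ID only when the RX VLAN flag
 * is set. When the flag is clear, vlan_tci is treated as undefined and
 * *vlan_id is left untouched.
 */
static bool
app_get_vlan_id(const struct app_mbuf *m, uint16_t *vlan_id)
{
	assert(m != NULL && vlan_id != NULL);

	if (!(m->ol_flags & APP_RX_VLAN))
		return false;

	*vlan_id = m->vlan_tci & 0xFFF;
	return true;
}
```

Under that convention, a driver that clears the flag does not also need to zero vlan_tci, because a well-behaved reader never looks at it.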

> 
>> To
>> limit any potential performance impact on the PF data path, the RX path is split
>> into PF and VF versions with the transparent VLAN stripping only done in the
>> VF path. Measurements with our application show ~2% performance hit for
>> the VF case and none for the PF case.
>>
> 
> ...
> 
>> +/*
>> + * Filter out unknown vlans resulting from use of transparent vlan.
>> + *
>> + * When a VF is configured to use transparent vlans then the VF can
>> + * see this VLAN being set in the packet, meaning that the transparent
>> + * property isn't preserved. Furthermore, when the VF is used in a
>> + * guest VM then there's no way of knowing for sure that transparent
>> + * VLAN is in use and what tag value has been configured. So work
>> + * around this by removing the VLAN flag if the VF isn't interested in
>> + * the VLAN tag.
>> + */
>> +static inline void
>> +ixgbevf_trans_vlan_sw_filter_hdr(struct rte_mbuf *m,
>> +				 const struct ixgbe_vfta *vfta)
>> +{
>> +	if (m->ol_flags & PKT_RX_VLAN) {
>> +		uint16_t vlan = m->vlan_tci & 0xFFF;
>> +
>> +		if (!ixgbe_vfta_is_vlan_set(vfta, vlan))
>> +			m->ol_flags &= ~PKT_RX_VLAN;
>> +	}
>> +}
>> +
> 
> Ideally all driver's behavior should be consistent with the same configure.
> if "transparent vlan" looks like a general feature, it may not only bind to VF or even just ixgbevf.  (what about i40evf?)
> Otherwise, it should be handled in application , but not the driver.

It's a general feature, but the implementation is specific to a driver. 
I believe that this is handled in hardware on i40e, but this is just 
based on there being no special handling of this case in the RX path 
in the Linux i40e VF driver.

Furthermore, transparent VLANs implemented in the application would just 
be called "VLANs" :-) More specifically, the application running in the 
guest cannot know what has been configured for the VF in the hypervisor 
in a driver-independent manner, or whether the hardware has in fact 
transparently removed the VLAN already (as may be the case for i40e).

> 
> ...
> 
>> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 3], vfta, rxq);
> 
> Where is ixgbe_unknown_vlan_sw_filter_hdr be defined? I saw it is only be used in ixgbe_rxtx_vec_neon.c, so assume there will be a compile error on that platform?

Good catch. I don't have the ability to compile for that platform, and 
missed the rename I did during development. Will fix in V2.

Thanks,
Rob
Zhang, Qi Z Sept. 4, 2018, 2:16 a.m. | #4
> -----Original Message-----
> From: Robert Shearman [mailto:robertshearman@gmail.com]
> Sent: Monday, September 3, 2018 9:14 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Robert Shearman
> <robert.shearman@att.com>
> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs in
> VF
> 
> Hi Qi,
> 
> On 03/09/2018 12:45, Zhang, Qi Z wrote:
> > Hi Robert:
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of
> >> robertshearman@gmail.com
> >> Sent: Saturday, August 25, 2018 12:35 AM
> >> To: dev@dpdk.org
> >> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin
> >> <konstantin.ananyev@intel.com>; Robert Shearman
> >> <robert.shearman@att.com>
> >> Subject: [dpdk-dev] [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs
> >> in VF
> >>
> >> From: Robert Shearman <robert.shearman@att.com>
> >>
> >> SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM
> >> associated with a VF has a VLAN tag inserted/stripped in a manner
> >> intended to be totally transparent to the VM.  On a Linux hypervisor
> >> the vlan can be specified by "ip link set <device> vf <n> vlan <v>".
> >> The VM VF driver is not configured to use any VLAN and the VM should
> >> never see the transparent VLAN for that reason.  However, in practice
> >> these VLAN headers are being received by the VM which discards the
> >> packets as that VLAN is unknown to it.  The Linux kernel ixbge driver
> >> explicitly removes the VLAN in this case (presumably due to the
> >> hardware not being able to do this) but the DPDK driver does not.
> >
> > I'm not quite understand this part.
> > What does explicitly remove the VLAN means?, DPDK also discard
> > unmatched VLAN and strip vlan if vlan_strip is enabled what is the gap?
> > It will be better if you can give same examples
> 
> Sure. Typical use case for this is a hypervisor where it is necessary to provide
> L2 access into the guests, but there are insufficient, and so the hypervisor is
> using the PF and VFs are assigned to guests. In order to avoid having to
> configure each guest to use the VLAN and to not send any untagged traffic it is
> desirable to use transparent VLANs. For example:
> Guest 1 = VLAN 10
> Guest 2 = VLAN 20
> 
> ip link set eth0 vf 1 vlan 10
> ip link set eth0 vf 2 vlan 20
> 
> Now this means that packets arriving tagged on the physical port should be
> delivered to the guest and arrive in the guest untagged. Similarly, packets
> transmitted untagged by the guest should gain a tag before they go out of the
> physical port. What you get when using the Linux VF ixgbe driver inside the
> VMs is exactly this since the driver knows that for this hardware the
> transparent stripping isn't done in hardware and is done inside the driver.
> What you get currently when using the DPDK VF ixgbe driver inside the VMs is
> that packets arrive tagged (e.g. with VLAN tag 10) and these are then dropped
> because the VM doesn't know about VLAN 10.
> 
> Transparent VLAN insertion works currently with both Linux and DPDK VF
> drivers.

What do you mean by "stripping isn't done in hardware" and "packets arrive tagged"?
Let me explain how the PMD works (or is expected to work).
If we enable vlan_strip, the VLAN header is expected to be stripped from the packet data by hardware.
The RX descriptor still keeps the stripped VLAN information, so the driver sets PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED in mbuf->ol_flags and stores the stripped VLAN tag in mbuf->vlan_tci.
So in my view, stripping is done and packets arrive untagged; the application also knows exactly what happened and can make a decision based on its requirements.

So do you mean ixgbevf does not support vlan_strip as a hardware offload, and it should be done in software?
But in your code, I didn't see the part where the VLAN header is stripped from the packet data (setting mbuf->ol_flags and mbuf->vlan_tci does not mean the VLAN is stripped).
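
To make the distinction concrete, removing the VLAN header from the packet data (as opposed to only touching mbuf metadata) could look roughly like the sketch below. It operates on a plain byte buffer rather than an rte_mbuf, and `sw_vlan_strip` and the `SW_*` constants are hypothetical names introduced for illustration; it is not taken from the patch or from the kernel driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SW_ETH_ADDRS_LEN 12      /* destination MAC + source MAC */
#define SW_VLAN_HDR_LEN  4       /* TPID + TCI */
#define SW_TPID_8021Q    0x8100  /* 802.1Q tag protocol identifier */

/*
 * Remove a single 802.1Q header from an Ethernet frame in place.
 * On success, returns the new start of the frame (4 bytes further in),
 * stores the TCI in *tci and shrinks *len by 4. Returns NULL and leaves
 * the buffer untouched if the frame is too short or is not tagged.
 */
static uint8_t *
sw_vlan_strip(uint8_t *frame, size_t *len, uint16_t *tci)
{
	assert(frame != NULL && len != NULL && tci != NULL);

	/* Need both MACs, the tag and the inner EtherType. */
	if (*len < SW_ETH_ADDRS_LEN + SW_VLAN_HDR_LEN + 2)
		return NULL;

	uint16_t tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != SW_TPID_8021Q)
		return NULL;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Slide the two MAC addresses forward so they overwrite the tag. */
	memmove(frame + SW_VLAN_HDR_LEN, frame, SW_ETH_ADDRS_LEN);
	*len -= SW_VLAN_HDR_LEN;
	return frame + SW_VLAN_HDR_LEN;
}
```

Clearing PKT_RX_VLAN alone does none of this; it only changes what the application is told about the packet.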

> 
> >
> >>
> >> This patch mirrors the kernel driver behaviour by removing the VLAN
> >> on the VF side. This is done by checking the VLAN in the VFTA, where
> >> the hypervisor will have set the bit in the VFTA corresponding to the
> >> VLAN if transparent VLANs were being used for the VF. If the VLAN is
> >> set in the VFTA then it is known that it's a transparent VLAN case and so the
> VLAN is stripped from the mbuf.
> >
> > This is missing leading.
> >  From your code, I only saw vlan flag in ol_flag is stripped, but not VLAN is
> stripped.
> > I think vlan is always be stripped if we enable vlan strip on queue.
> 
> I think you're saying that the VLAN isn't removed if hardware RX VLAN
> stripping isn't configured. This is true, but might cost performance to cover this
> case too. If you're happy with that, then I can issue a V2 with that addressed.
> 
> If you're suggesting that m->vlan_tci needs to be set to 0 when PKT_RX_VLAN is
> cleared from m->ol_flags, then I don't think that is necessary since my
> understanding is an application should only be looking at m->vlan_tci if
> m->ol_flags has PKT_RX_VLAN set.
> 
> >
> >> To
> >> limit any potential performance impact on the PF data path, the RX path is
> split
> >> into PF and VF versions with the transparent VLAN stripping only done in
> the
> >> VF path. Measurements with our application show ~2% performance hit for
> >> the VF case and none for the PF case.
> >>
> >
> > ...
> >
> >> +/*
> >> + * Filter out unknown vlans resulting from use of transparent vlan.
> >> + *
> >> + * When a VF is configured to use transparent vlans then the VF can
> >> + * see this VLAN being set in the packet, meaning that the transparent
> >> + * property isn't preserved. Furthermore, when the VF is used in a
> >> + * guest VM then there's no way of knowing for sure that transparent
> >> + * VLAN is in use and what tag value has been configured. So work
> >> + * around this by removing the VLAN flag if the VF isn't interested in
> >> + * the VLAN tag.
> >> + */
> >> +static inline void
> >> +ixgbevf_trans_vlan_sw_filter_hdr(struct rte_mbuf *m,
> >> +				 const struct ixgbe_vfta *vfta)
> >> +{
> >> +	if (m->ol_flags & PKT_RX_VLAN) {
> >> +		uint16_t vlan = m->vlan_tci & 0xFFF;
> >> +
> >> +		if (!ixgbe_vfta_is_vlan_set(vfta, vlan))
> >> +			m->ol_flags &= ~PKT_RX_VLAN;
> >> +	}
> >> +}
> >> +
> >
> > Ideally all driver's behavior should be consistent with the same configure.
> > if "transparent vlan" looks like a general feature, it may not only bind to VF
> or even just ixgbevf.  (what about i40evf?)
> > Otherwise, it should be handled in application , but not the driver.
> 
> It's a general feature, but the implementation is specific to a driver.
> I believe that this is handled in hardware on i40e, but this is just
> based on the there being no special handling of this case in the RX path
> in the Linux i40e VF driver.
> 
> Furthermore, transparent VLANs implemented in the application would just
> be called "VLANs" :-) More specifically, the application running in the
> guest cannot know what has been configured for the VF in the hypervisor
> in a driver-independent manner, or whether the hardware has in fact
> transparently removed the VLAN already (as may be the case for i40e).
> 
> >
> > ...
> >
> >> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 3], vfta, rxq);
> >
> > Where is ixgbe_unknown_vlan_sw_filter_hdr be defined? I saw it is only be
> used in ixgbe_rxtx_vec_neon.c, so assume there will be a compile error on that
> platform?
> 
> Good catch. I don't have the ability to compile for that platform, and
> missed the rename I did during development. Will fix in V2.
> 
> Thanks,
> Rob
Robert Shearman Sept. 4, 2018, 9:57 a.m. | #5
Hi Qi,

On 04/09/2018 03:16, Zhang, Qi Z wrote:
>> -----Original Message-----
>> From: Robert Shearman [mailto:robertshearman@gmail.com]
>> Sent: Monday, September 3, 2018 9:14 PM
>> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
>> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin
>> <konstantin.ananyev@intel.com>; Robert Shearman
>> <robert.shearman@att.com>
>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs in
>> VF
>>
>> Hi Qi,
>>
>> On 03/09/2018 12:45, Zhang, Qi Z wrote:
>>> Hi Robert:
>>>
>>>> -----Original Message-----
>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of
>>>> robertshearman@gmail.com
>>>> Sent: Saturday, August 25, 2018 12:35 AM
>>>> To: dev@dpdk.org
>>>> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin
>>>> <konstantin.ananyev@intel.com>; Robert Shearman
>>>> <robert.shearman@att.com>
>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs
>>>> in VF
>>>>
>>>> From: Robert Shearman <robert.shearman@att.com>
>>>>
>>>> SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM
>>>> associated with a VF has a VLAN tag inserted/stripped in a manner
>>>> intended to be totally transparent to the VM.  On a Linux hypervisor
>>>> the vlan can be specified by "ip link set <device> vf <n> vlan <v>".
>>>> The VM VF driver is not configured to use any VLAN and the VM should
>>>> never see the transparent VLAN for that reason.  However, in practice
>>>> these VLAN headers are being received by the VM which discards the
>>>> packets as that VLAN is unknown to it.  The Linux kernel ixbge driver
>>>> explicitly removes the VLAN in this case (presumably due to the
>>>> hardware not being able to do this) but the DPDK driver does not.
>>>
>>> I'm not quite understand this part.
>>> What does explicitly remove the VLAN means?, DPDK also discard
>>> unmatched VLAN and strip vlan if vlan_strip is enabled what is the gap?
>>> It will be better if you can give same examples
>>
>> Sure. Typical use case for this is a hypervisor where it is necessary to provide
>> L2 access into the guests, but there are insufficient, and so the hypervisor is
>> using the PF and VFs are assigned to guests. In order to avoid having to
>> configure each guest to use the VLAN and to not send any untagged traffic it is
>> desirable to use transparent VLANs. For example:
>> Guest 1 = VLAN 10
>> Guest 2 = VLAN 20
>>
>> ip link set eth0 vf 1 vlan 10
>> ip link set eth0 vf 2 vlan 20
>>
>> Now this means that packets arriving tagged on the physical port should be
>> delivered to the guest and arrive in the guest untagged. Similarly, packets
>> transmitted untagged by the guest should gain a tag before they go out of the
>> physical port. What you get when using the Linux VF ixgbe driver inside the
>> VMs is exactly this since the driver knows that for this hardware the
>> transparent stripping isn't done in hardware and is done inside the driver.
>> What you get currently when using the DPDK VF ixgbe driver inside the VMs is
>> that packets arrive tagged (e.g. with VLAN tag 10) and these are then dropped
>> because the VM doesn't know about VLAN 10.
>>
>> Transparent VLAN insertion works currently with both Linux and DPDK VF
>> drivers.
> 
> What do you mean "stripping isn't done in hardware" and "packets arrived tagged"?
> Let me explain how PMD driver works. (or it is expected)
> if we enable vlan_strip, the VLAN header is expected to be stripped from packet data by hardware.
> And in rx descriptor, it still keep the stripped vlan information, so driver will set mbuf->ol_flags with PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED and also set stripped vlan tag to mbuf->vlan_tci
> So in my review, it is "stripping is done and packets arrived with untagged", and application also know what exactly happened and make decision based on the requirement
> 
> So do you mean ixgbevf does not support vlan_strip as a hardware offload?, and it should be done with software?
> But in your code, I didn't see the part that vlan header is stripped from the packet data. ( set mbuf->ol_flag and mbuf->vlan_tci does not mean the vlan is stripped)

I understand how the VLAN stripping hardware offload is supposed to 
work, but this use case is distinct from VLAN stripping and it was my 
mistake to use that loaded term in my explanation of the use case.

The expectation in this case is that the packet arrives completely 
untagged, i.e. regardless of whether the VLAN has been stripped and 
placed in metadata or not. The application running inside the VM expects 
the packet to arrive as if the VLAN tag was never there.

The application cannot do the removal of the VLAN tag itself because in 
this use case it is implicit that it shouldn't know about the tag and 
the presence of the tag is driver/hardware specific.

Thanks for highlighting the PKT_RX_VLAN_STRIPPED flag - I should remove 
that as well when the transparent VLAN filter triggers.
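
A filter that clears both flags when the VLAN is not in the VFTA might look like the following sketch. The `flt_*` types and `FLT_*` flags are hypothetical stand-ins for `struct rte_mbuf`, `struct ixgbe_vfta`, `PKT_RX_VLAN` and `PKT_RX_VLAN_STRIPPED` (the bit values are arbitrary here), so this is an illustration of the idea rather than the V2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the real flags; bit positions are arbitrary here. */
#define FLT_RX_VLAN          (1ULL << 0)
#define FLT_RX_VLAN_STRIPPED (1ULL << 6)
#define FLT_VFTA_WORDS       128  /* 128 x 32 bits covers 4096 VLAN IDs */

struct flt_vfta { uint32_t vfta[FLT_VFTA_WORDS]; };
struct flt_mbuf { uint64_t ol_flags; uint16_t vlan_tci; };

/* Test the bit for a VLAN ID in the filter table bitmap. */
static bool
flt_vfta_is_vlan_set(const struct flt_vfta *t, uint16_t vlan)
{
	uint32_t idx = (vlan >> 5) & 0x7F;
	/* Unsigned constant so that shifting into bit 31 is well defined. */
	uint32_t bit = UINT32_C(1) << (vlan & 0x1F);

	return (t->vfta[idx] & bit) != 0;
}

/*
 * If the packet carries a VLAN the VF never asked for, hide it from
 * the application by clearing both VLAN-related RX flags.
 */
static void
flt_trans_vlan_filter(struct flt_mbuf *m, const struct flt_vfta *t)
{
	assert(m != NULL && t != NULL);

	if ((m->ol_flags & FLT_RX_VLAN) &&
	    !flt_vfta_is_vlan_set(t, m->vlan_tci & 0xFFF))
		m->ol_flags &= ~(FLT_RX_VLAN | FLT_RX_VLAN_STRIPPED);
}
```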

Thanks,
Rob
Ananyev, Konstantin Sept. 12, 2018, 2:59 p.m. | #6
Hi Robert,

> -----Original Message-----
> From: robertshearman@gmail.com [mailto:robertshearman@gmail.com]
> Sent: Friday, August 24, 2018 5:35 PM
> To: dev@dpdk.org
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Robert Shearman
> <robert.shearman@att.com>
> Subject: [PATCH] net/ixgbe: Strip SR-IOV transparent VLANs in VF
> 
> From: Robert Shearman <robert.shearman@att.com>
> 
> SR-IOV VFs support "transparent" VLANs. Traffic from/to a VM
> associated with a VF has a VLAN tag inserted/stripped in a manner
> intended to be totally transparent to the VM.  On a Linux hypervisor
> the vlan can be specified by "ip link set <device> vf <n> vlan <v>".
> The VM VF driver is not configured to use any VLAN and the VM should
> never see the transparent VLAN for that reason.  However, in practice
> these VLAN headers are being received by the VM which discards the
> packets as that VLAN is unknown to it.  The Linux kernel ixbge driver
> explicitly removes the VLAN in this case (presumably due to the
> hardware not being able to do this) but the DPDK driver does not.
> 
> This patch mirrors the kernel driver behaviour by removing the VLAN on
> the VF side. This is done by checking the VLAN in the VFTA, where the
> hypervisor will have set the bit in the VFTA corresponding to the VLAN
> if transparent VLANs were being used for the VF. If the VLAN is set in
> the VFTA then it is known that it's a transparent VLAN case and so the
> VLAN is stripped from the mbuf. To limit any potential performance
> impact on the PF data path, the RX path is split into PF and VF
> versions with the transparent VLAN stripping only done in the VF
> path. Measurements with our application show ~2% performance hit for
> the VF case and none for the PF case.

I did some perf measurements too, and unfortunately I am seeing a ~4% drop
(testpmd iofwd on one core over 4x10Gb: from ~44.7 Mpps to ~43 Mpps, on BDX 2.2GHz).
As you mentioned above:
"VM VF driver is not configured to use any VLAN and the VM should
never see the transparent VLAN for that reason."
I wonder whether it would be sufficient for your purposes if the VF RX function just ignored
the HW descriptor values and never set PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED?
I think that could be done pretty easily (by setting rxq->vlan_flags).
In that case no changes in the RX code would be required (and so no perf changes).
It could be controlled by DEV_RX_OFFLOAD_VLAN_STRIP; I am not sure whether that would be sufficient for you.
BTW, in your case how will the hypervisor propagate the new VFTA table to the VF?
Presumably the same way could be used to propagate RX offload flags?
Konstantin
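
The queue-level approach suggested above can be sketched as below: the set of VLAN flags to report is chosen once at queue setup, so the per-packet path is unchanged. All names here (`q_rx_queue`, `Q_DESC_STAT_VP`, the `Q_RX_*` flags and both functions) are hypothetical stand-ins with arbitrary bit values; the real driver's descriptor layout and helper names are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the mbuf RX flags; bit positions are arbitrary here. */
#define Q_RX_VLAN          (1ULL << 0)
#define Q_RX_VLAN_STRIPPED (1ULL << 6)
/* Hypothetical "VLAN present" bit in the RX descriptor status word. */
#define Q_DESC_STAT_VP     0x08u

struct q_rx_queue {
	uint64_t vlan_flags;  /* flags to report when a VLAN is present */
};

/* Decided once at queue setup, not per packet. */
static void
q_setup_vlan_flags(struct q_rx_queue *rxq, int report_vlan)
{
	assert(rxq != NULL);
	rxq->vlan_flags = report_vlan ?
		(Q_RX_VLAN | Q_RX_VLAN_STRIPPED) : 0;
}

/* Per-packet path: identical cost whether or not VLANs are reported. */
static uint64_t
q_desc_to_pkt_flags(uint32_t status, const struct q_rx_queue *rxq)
{
	return (status & Q_DESC_STAT_VP) ? rxq->vlan_flags : 0;
}
```

With `vlan_flags` set to 0, a descriptor that reports a VLAN produces no VLAN flags in the mbuf, which hides every VLAN on that queue rather than only those absent from the VFTA.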

> 
> Signed-off-by: Robert Shearman <robert.shearman@att.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c        | 18 +++----
>  drivers/net/ixgbe/ixgbe_ethdev.h        | 38 +++++++++++++++
>  drivers/net/ixgbe/ixgbe_rxtx.c          | 83 +++++++++++++++++++++++++++++---
>  drivers/net/ixgbe/ixgbe_rxtx.h          | 31 +++++++++++-
>  drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c |  7 +++
>  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  | 84 ++++++++++++++++++++++++++++++---
>  6 files changed, 238 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 26b1927..3f88a02 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -604,7 +604,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>  	.vlan_filter_set      = ixgbevf_vlan_filter_set,
>  	.vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
>  	.vlan_offload_set     = ixgbevf_vlan_offload_set,
> -	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
> +	.rx_queue_setup       = ixgbevf_dev_rx_queue_setup,
>  	.rx_queue_release     = ixgbe_dev_rx_queue_release,
>  	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
>  	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
> @@ -1094,7 +1094,7 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev, void *init_params __rte_unused)
>  				     "Using default TX function.");
>  		}
> 
> -		ixgbe_set_rx_function(eth_dev);
> +		ixgbe_set_rx_function(eth_dev, true);
> 
>  		return 0;
>  	}
> @@ -1576,7 +1576,7 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
>  				     "No TX queues configured yet. Using default TX function.");
>  		}
> 
> -		ixgbe_set_rx_function(eth_dev);
> +		ixgbe_set_rx_function(eth_dev, true);
> 
>  		return 0;
>  	}
> @@ -1839,8 +1839,8 @@ ixgbe_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>  	uint32_t vid_idx;
>  	uint32_t vid_bit;
> 
> -	vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
> -	vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
> +	vid_idx = ixgbe_vfta_index(vlan_id);
> +	vid_bit = ixgbe_vfta_bit(vlan_id);
>  	vfta = IXGBE_READ_REG(hw, IXGBE_VFTA(vid_idx));
>  	if (on)
>  		vfta |= vid_bit;
> @@ -3807,7 +3807,9 @@ ixgbe_dev_supported_ptypes_get(struct rte_eth_dev *dev)
> 
>  #if defined(RTE_ARCH_X86)
>  	if (dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
> -	    dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec)
> +	    dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec ||
> +	    dev->rx_pkt_burst == ixgbevf_recv_pkts_vec ||
> +	    dev->rx_pkt_burst == ixgbevf_recv_scattered_pkts_vec)
>  		return ptypes;
>  #endif
>  	return NULL;
> @@ -5231,8 +5233,8 @@ ixgbevf_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>  		PMD_INIT_LOG(ERR, "Unable to set VF vlan");
>  		return ret;
>  	}
> -	vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
> -	vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
> +	vid_idx = ixgbe_vfta_index(vlan_id);
> +	vid_bit = ixgbe_vfta_bit(vlan_id);
> 
>  	/* Save what we set and retore it after device reset */
>  	if (on)
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
> index d0b9396..483d2cd 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -568,6 +568,11 @@ int  ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
>  		const struct rte_eth_rxconf *rx_conf,
>  		struct rte_mempool *mb_pool);
> 
> +int  ixgbevf_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
> +				uint16_t nb_rx_desc, unsigned int socket_id,
> +				const struct rte_eth_rxconf *rx_conf,
> +				struct rte_mempool *mb_pool);
> +
>  int  ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
>  		uint16_t nb_tx_desc, unsigned int socket_id,
>  		const struct rte_eth_txconf *tx_conf);
> @@ -779,4 +784,37 @@ ixgbe_ethertype_filter_remove(struct ixgbe_filter_info *filter_info,
>  	return idx;
>  }
> 
> +int ixgbe_fdir_ctrl_func(struct rte_eth_dev *dev,
> +			enum rte_filter_op filter_op, void *arg);
> +
> +/*
> + * Calculate index in vfta array of the 32 bit value enclosing
> + * a given vlan id
> + */
> +static inline uint32_t
> +ixgbe_vfta_index(uint16_t vlan)
> +{
> +	return (vlan >> 5) & 0x7f;
> +}
> +
> +/*
> + * Calculate vfta array entry bitmask for vlan id within the
> + * enclosing 32 bit entry.
> + */
> +static inline uint32_t
> +ixgbe_vfta_bit(uint16_t vlan)
> +{
> +	return 1 << (vlan & 0x1f);
> +}
> +
> +/*
> + * Check in the vfta bit array if the bit corresponding to
> + * the given vlan is set.
> + */
> +static inline bool
> +ixgbe_vfta_is_vlan_set(const struct ixgbe_vfta *vfta, uint16_t vlan)
> +{
> +	return (vfta->vfta[ixgbe_vfta_index(vlan)] & ixgbe_vfta_bit(vlan)) != 0;
> +}
> +
>  #endif /* _IXGBE_ETHDEV_H_ */
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index f82b74a..26a99cb 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1623,14 +1623,23 @@ ixgbe_rx_fill_from_stage(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  			 uint16_t nb_pkts)
>  {
>  	struct rte_mbuf **stage = &rxq->rx_stage[rxq->rx_next_avail];
> +	const struct rte_eth_dev *dev;
> +	const struct ixgbe_vfta *vfta;
>  	int i;
> 
> +	dev = &rte_eth_devices[rxq->port_id];
> +	vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> +
>  	/* how many packets are ready to return? */
>  	nb_pkts = (uint16_t)RTE_MIN(nb_pkts, rxq->rx_nb_avail);
> 
>  	/* copy mbuf pointers to the application's packet list */
> -	for (i = 0; i < nb_pkts; ++i)
> +	for (i = 0; i < nb_pkts; ++i) {
>  		rx_pkts[i] = stage[i];
> +		if (rxq->vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[i],
> +							 vfta);
> +	}
> 
>  	/* update internal queue state */
>  	rxq->rx_nb_avail = (uint16_t)(rxq->rx_nb_avail - nb_pkts);
> @@ -1750,6 +1759,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>  	uint16_t nb_hold;
>  	uint64_t pkt_flags;
>  	uint64_t vlan_flags;
> +	const struct rte_eth_dev *dev;
> +	const struct ixgbe_vfta *vfta;
> 
>  	nb_rx = 0;
>  	nb_hold = 0;
> @@ -1758,6 +1769,9 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>  	rx_ring = rxq->rx_ring;
>  	sw_ring = rxq->sw_ring;
>  	vlan_flags = rxq->vlan_flags;
> +	dev = &rte_eth_devices[rxq->port_id];
> +	vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> +
>  	while (nb_rx < nb_pkts) {
>  		/*
>  		 * The order of operations here is important as the DD status
> @@ -1876,6 +1890,10 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>  			ixgbe_rxd_pkt_info_to_pkt_type(pkt_info,
>  						       rxq->pkt_type_mask);
> 
> +		if (rxq->vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(rxm,
> +							 vfta);
> +
>  		if (likely(pkt_flags & PKT_RX_RSS_HASH))
>  			rxm->hash.rss = rte_le_to_cpu_32(
>  						rxd.wb.lower.hi_dword.rss);
> @@ -2016,6 +2034,11 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
>  	uint16_t nb_rx = 0;
>  	uint16_t nb_hold = rxq->nb_rx_hold;
>  	uint16_t prev_id = rxq->rx_tail;
> +	const struct rte_eth_dev *dev;
> +	const struct ixgbe_vfta *vfta;
> +
> +	dev = &rte_eth_devices[rxq->port_id];
> +	vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> 
>  	while (nb_rx < nb_pkts) {
>  		bool eop;
> @@ -2230,6 +2253,10 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
>  		rte_packet_prefetch((char *)first_seg->buf_addr +
>  			first_seg->data_off);
> 
> +		if (rxq->vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(first_seg,
> +							 vfta);
> +
>  		/*
>  		 * Store the mbuf address into the next entry of the array
>  		 * of returned packets.
> @@ -3066,6 +3093,25 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
>  	return 0;
>  }
> 
> +int __attribute__((cold))
> +ixgbevf_dev_rx_queue_setup(struct rte_eth_dev *dev,
> +			   uint16_t queue_idx,
> +			   uint16_t nb_desc,
> +			   unsigned int socket_id,
> +			   const struct rte_eth_rxconf *rx_conf,
> +			   struct rte_mempool *mp)
> +{
> +	struct ixgbe_rx_queue *rxq;
> +
> +	ixgbe_dev_rx_queue_setup(dev, queue_idx, nb_desc, socket_id,
> +				 rx_conf, mp);
> +
> +	rxq = dev->data->rx_queues[queue_idx];
> +	rxq->vf = true;
> +
> +	return 0;
> +}
> +
>  uint32_t
>  ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
>  {
> @@ -4561,7 +4607,7 @@ ixgbe_set_ivar(struct rte_eth_dev *dev, u8 entry, u8 vector, s8 type)
>  }
> 
>  void __attribute__((cold))
> -ixgbe_set_rx_function(struct rte_eth_dev *dev)
> +ixgbe_set_rx_function(struct rte_eth_dev *dev, bool vf)
>  {
>  	uint16_t i, rx_using_sse;
>  	struct ixgbe_adapter *adapter =
> @@ -4608,7 +4654,9 @@ ixgbe_set_rx_function(struct rte_eth_dev *dev)
>  					    "callback (port=%d).",
>  				     dev->data->port_id);
> 
> -			dev->rx_pkt_burst = ixgbe_recv_scattered_pkts_vec;
> +			dev->rx_pkt_burst = vf ?
> +				ixgbevf_recv_scattered_pkts_vec :
> +				ixgbe_recv_scattered_pkts_vec;
>  		} else if (adapter->rx_bulk_alloc_allowed) {
>  			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
>  					   "allocation callback (port=%d).",
> @@ -4637,7 +4685,8 @@ ixgbe_set_rx_function(struct rte_eth_dev *dev)
>  			     RTE_IXGBE_DESCS_PER_LOOP,
>  			     dev->data->port_id);
> 
> -		dev->rx_pkt_burst = ixgbe_recv_pkts_vec;
> +		dev->rx_pkt_burst = vf ? ixgbevf_recv_pkts_vec :
> +			ixgbe_recv_pkts_vec;
>  	} else if (adapter->rx_bulk_alloc_allowed) {
>  		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
>  				    "satisfied. Rx Burst Bulk Alloc function "
> @@ -4658,7 +4707,9 @@ ixgbe_set_rx_function(struct rte_eth_dev *dev)
> 
>  	rx_using_sse =
>  		(dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec ||
> -		dev->rx_pkt_burst == ixgbe_recv_pkts_vec);
> +		 dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
> +		 dev->rx_pkt_burst == ixgbevf_recv_scattered_pkts_vec ||
> +		 dev->rx_pkt_burst == ixgbevf_recv_pkts_vec);
> 
>  	for (i = 0; i < dev->data->nb_rx_queues; i++) {
>  		struct ixgbe_rx_queue *rxq = dev->data->rx_queues[i];
> @@ -4977,7 +5028,7 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
>  	if (rc)
>  		return rc;
> 
> -	ixgbe_set_rx_function(dev);
> +	ixgbe_set_rx_function(dev, false);
> 
>  	return 0;
>  }
> @@ -5500,7 +5551,7 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
>  		IXGBE_PSRTYPE_RQPL_SHIFT;
>  	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
> 
> -	ixgbe_set_rx_function(dev);
> +	ixgbe_set_rx_function(dev, true);
> 
>  	return 0;
>  }
> @@ -5731,6 +5782,24 @@ ixgbe_recv_pkts_vec(
>  }
> 
>  uint16_t __attribute__((weak))
> +ixgbevf_recv_pkts_vec(
> +	void __rte_unused *rx_queue,
> +	struct rte_mbuf __rte_unused **rx_pkts,
> +	uint16_t __rte_unused nb_pkts)
> +{
> +	return 0;
> +}
> +
> +uint16_t __attribute__((weak))
> +ixgbevf_recv_scattered_pkts_vec(
> +	void __rte_unused *rx_queue,
> +	struct rte_mbuf __rte_unused **rx_pkts,
> +	uint16_t __rte_unused nb_pkts)
> +{
> +	return 0;
> +}
> +
> +uint16_t __attribute__((weak))
>  ixgbe_recv_scattered_pkts_vec(
>  	void __rte_unused *rx_queue,
>  	struct rte_mbuf __rte_unused **rx_pkts,
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
> index 39378f7..676557b 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.h
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.h
> @@ -111,6 +111,7 @@ struct ixgbe_rx_queue {
>  	uint16_t rx_free_trigger; /**< triggers rx buffer allocation */
>  	uint8_t            rx_using_sse;
>  	/**< indicates that vector RX is in use */
> +	uint8_t            vf; /**< indicates that this is for a VF */
>  #ifdef RTE_LIBRTE_SECURITY
>  	uint8_t            using_ipsec;
>  	/**< indicates that IPsec RX feature is in use */
> @@ -254,6 +255,30 @@ struct ixgbe_txq_ops {
>  			 IXGBE_ADVTXD_DCMD_EOP)
> 
> 
> +
> +/*
> + * Filter out unknown vlans resulting from use of transparent vlan.
> + *
> + * When a VF is configured to use transparent vlans then the VF can
> + * see this VLAN being set in the packet, meaning that the transparent
> + * property isn't preserved. Furthermore, when the VF is used in a
> + * guest VM then there's no way of knowing for sure that transparent
> + * VLAN is in use and what tag value has been configured. So work
> + * around this by removing the VLAN flag if the VF isn't interested in
> + * the VLAN tag.
> + */
> +static inline void
> +ixgbevf_trans_vlan_sw_filter_hdr(struct rte_mbuf *m,
> +				 const struct ixgbe_vfta *vfta)
> +{
> +	if (m->ol_flags & PKT_RX_VLAN) {
> +		uint16_t vlan = m->vlan_tci & 0xFFF;
> +
> +		if (!ixgbe_vfta_is_vlan_set(vfta, vlan))
> +			m->ol_flags &= ~PKT_RX_VLAN;
> +	}
> +}
> +
>  /* Takes an ethdev and a queue and sets up the tx function to be used based on
>   * the queue parameters. Used in tx_queue_setup by primary process and then
>   * in dev_init by secondary process when attaching to an existing ethdev.
> @@ -274,12 +299,16 @@ void ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq);
>   *
>   * @dev rte_eth_dev handle
>   */
> -void ixgbe_set_rx_function(struct rte_eth_dev *dev);
> +void ixgbe_set_rx_function(struct rte_eth_dev *dev, bool vf);
> 
>  uint16_t ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>  		uint16_t nb_pkts);
>  uint16_t ixgbe_recv_scattered_pkts_vec(void *rx_queue,
>  		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
> +uint16_t ixgbevf_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +		uint16_t nb_pkts);
> +uint16_t ixgbevf_recv_scattered_pkts_vec(void *rx_queue,
> +		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
>  int ixgbe_rx_vec_dev_conf_condition_check(struct rte_eth_dev *dev);
>  int ixgbe_rxq_vec_setup(struct ixgbe_rx_queue *rxq);
>  void ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq);
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> index edb1383..d077918 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> @@ -149,6 +149,9 @@ static inline uint16_t
>  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		   uint16_t nb_pkts, uint8_t *split_packet)
>  {
> +	const struct rte_eth_dev *dev = &rte_eth_devices[rxq->port_id];
> +	const struct ixgbe_vfta *vfta
> +		= IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
>  	volatile union ixgbe_adv_rx_desc *rxdp;
>  	struct ixgbe_rx_entry *sw_ring;
>  	uint16_t nb_pkts_recd;
> @@ -272,8 +275,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		/* D.3 copy final 3,4 data to rx_pkts */
>  		vst1q_u8((void *)&rx_pkts[pos + 3]->rx_descriptor_fields1,
>  			 pkt_mb4);
> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 3], vfta, rxq);
>  		vst1q_u8((void *)&rx_pkts[pos + 2]->rx_descriptor_fields1,
>  			 pkt_mb3);
> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 2], vfta, rxq);
> 
>  		/* D.2 pkt 1,2 set in_port/nb_seg and remove crc */
>  		tmp = vsubq_u16(vreinterpretq_u16_u8(pkt_mb2), crc_adjust);
> @@ -294,8 +299,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		/* D.3 copy final 1,2 data to rx_pkts */
>  		vst1q_u8((uint8_t *)&rx_pkts[pos + 1]->rx_descriptor_fields1,
>  			 pkt_mb2);
> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos + 1], vfta, rxq);
>  		vst1q_u8((uint8_t *)&rx_pkts[pos]->rx_descriptor_fields1,
>  			 pkt_mb1);
> +		ixgbe_unknown_vlan_sw_filter_hdr(rx_pkts[pos], vfta, rxq);
> 
>  		stat &= IXGBE_VPMD_DESC_DD_MASK;
> 
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> index c9ba482..04a3307 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> @@ -313,9 +313,10 @@ desc_to_ptype_v(__m128i descs[4], uint16_t pkt_type_mask,
>   */
>  static inline uint16_t
>  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> -		uint16_t nb_pkts, uint8_t *split_packet)
> +		   uint16_t nb_pkts, bool vf, uint8_t *split_packet)
>  {
>  	volatile union ixgbe_adv_rx_desc *rxdp;
> +	const struct ixgbe_vfta *vfta = NULL;
>  	struct ixgbe_rx_entry *sw_ring;
>  	uint16_t nb_pkts_recd;
>  #ifdef RTE_LIBRTE_SECURITY
> @@ -344,6 +345,13 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  	__m128i mbuf_init;
>  	uint8_t vlan_flags;
> 
> +	if (vf) {
> +		const struct rte_eth_dev *dev =
> +			&rte_eth_devices[rxq->port_id];
> +
> +		vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
> +	}
> +
>  	/* nb_pkts shall be less equal than RTE_IXGBE_MAX_RX_BURST */
>  	nb_pkts = RTE_MIN(nb_pkts, RTE_IXGBE_MAX_RX_BURST);
> 
> @@ -500,8 +508,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		/* D.3 copy final 3,4 data to rx_pkts */
>  		_mm_storeu_si128((void *)&rx_pkts[pos+3]->rx_descriptor_fields1,
>  				pkt_mb4);
> +		if (vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 3],
> +							 vfta);
> +
>  		_mm_storeu_si128((void *)&rx_pkts[pos+2]->rx_descriptor_fields1,
>  				pkt_mb3);
> +		if (vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 2],
> +							 vfta);
> 
>  		/* D.2 pkt 1,2 set in_port/nb_seg and remove crc */
>  		pkt_mb2 = _mm_add_epi16(pkt_mb2, crc_adjust);
> @@ -536,8 +551,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		/* D.3 copy final 1,2 data to rx_pkts */
>  		_mm_storeu_si128((void *)&rx_pkts[pos+1]->rx_descriptor_fields1,
>  				pkt_mb2);
> +		if (vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 1],
> +							 vfta);
> +
>  		_mm_storeu_si128((void *)&rx_pkts[pos]->rx_descriptor_fields1,
>  				pkt_mb1);
> +		if (vf)
> +			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos],
> +							 vfta);
> 
>  		desc_to_ptype_v(descs, rxq->pkt_type_mask, &rx_pkts[pos]);
> 
> @@ -569,11 +591,11 @@ uint16_t
>  ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>  		uint16_t nb_pkts)
>  {
> -	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
> +	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, false, NULL);
>  }
> 
>  /*
> - * vPMD receive routine that reassembles scattered packets
> + * vPMD raw receive routine that reassembles scattered packets
>   *
>   * Notice:
>   * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> @@ -581,16 +603,16 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>   *   numbers of DD bit
>   * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
>   */
> -uint16_t
> -ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> -		uint16_t nb_pkts)
> +static inline uint16_t
> +_recv_raw_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +			     uint16_t nb_pkts, bool vf)
>  {
>  	struct ixgbe_rx_queue *rxq = rx_queue;
>  	uint8_t split_flags[RTE_IXGBE_MAX_RX_BURST] = {0};
> 
>  	/* get some new buffers */
>  	uint16_t nb_bufs = _recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
> -			split_flags);
> +					      vf, split_flags);
>  	if (nb_bufs == 0)
>  		return 0;
> 
> @@ -614,6 +636,54 @@ ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>  		&split_flags[i]);
>  }
> 
> +/*
> + * vPMD receive routine that reassembles scattered packets
> + *
> + * Notice:
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + *   numbers of DD bit
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +			      uint16_t nb_pkts)
> +{
> +	return _recv_raw_scattered_pkts_vec(rx_queue, rx_pkts, nb_pkts, false);
> +}
> +
> +/*
> + * vPMD VF receive routine, only accept(nb_pkts >= RTE_IXGBE_DESCS_PER_LOOP)
> + *
> + * Notice:
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + *   numbers of DD bit
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +ixgbevf_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +		      uint16_t nb_pkts)
> +{
> +	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, true, NULL);
> +}
> +
> +/*
> + * vPMD VF receive routine that reassembles scattered packets
> + *
> + * Notice:
> + * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
> + * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
> + *   numbers of DD bit
> + * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
> + */
> +uint16_t
> +ixgbevf_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> +				uint16_t nb_pkts)
> +{
> +	return _recv_raw_scattered_pkts_vec(rx_queue, rx_pkts, nb_pkts, true);
> +}
> +
>  static inline void
>  vtx1(volatile union ixgbe_adv_tx_desc *txdp,
>  		struct rte_mbuf *pkt, uint64_t flags)
> --
> 2.7.4

Patch

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 26b1927..3f88a02 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -604,7 +604,7 @@  static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.vlan_filter_set      = ixgbevf_vlan_filter_set,
 	.vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
 	.vlan_offload_set     = ixgbevf_vlan_offload_set,
-	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
+	.rx_queue_setup       = ixgbevf_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
 	.rx_descriptor_status = ixgbe_dev_rx_descriptor_status,
@@ -1094,7 +1094,7 @@  eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev, void *init_params __rte_unused)
 				     "Using default TX function.");
 		}
 
-		ixgbe_set_rx_function(eth_dev);
+		ixgbe_set_rx_function(eth_dev, false);
 
 		return 0;
 	}
@@ -1576,7 +1576,7 @@  eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
 				     "No TX queues configured yet. Using default TX function.");
 		}
 
-		ixgbe_set_rx_function(eth_dev);
+		ixgbe_set_rx_function(eth_dev, true);
 
 		return 0;
 	}
@@ -1839,8 +1839,8 @@  ixgbe_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 	uint32_t vid_idx;
 	uint32_t vid_bit;
 
-	vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
-	vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
+	vid_idx = ixgbe_vfta_index(vlan_id);
+	vid_bit = ixgbe_vfta_bit(vlan_id);
 	vfta = IXGBE_READ_REG(hw, IXGBE_VFTA(vid_idx));
 	if (on)
 		vfta |= vid_bit;
@@ -3807,7 +3807,9 @@  ixgbe_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 
 #if defined(RTE_ARCH_X86)
 	if (dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
-	    dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec)
+	    dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec ||
+	    dev->rx_pkt_burst == ixgbevf_recv_pkts_vec ||
+	    dev->rx_pkt_burst == ixgbevf_recv_scattered_pkts_vec)
 		return ptypes;
 #endif
 	return NULL;
@@ -5231,8 +5233,8 @@  ixgbevf_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 		PMD_INIT_LOG(ERR, "Unable to set VF vlan");
 		return ret;
 	}
-	vid_idx = (uint32_t) ((vlan_id >> 5) & 0x7F);
-	vid_bit = (uint32_t) (1 << (vlan_id & 0x1F));
+	vid_idx = ixgbe_vfta_index(vlan_id);
+	vid_bit = ixgbe_vfta_bit(vlan_id);
 
 	/* Save what we set and retore it after device reset */
 	if (on)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index d0b9396..483d2cd 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -568,6 +568,11 @@  int  ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
 		const struct rte_eth_rxconf *rx_conf,
 		struct rte_mempool *mb_pool);
 
+int  ixgbevf_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
+				uint16_t nb_rx_desc, unsigned int socket_id,
+				const struct rte_eth_rxconf *rx_conf,
+				struct rte_mempool *mb_pool);
+
 int  ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 		uint16_t nb_tx_desc, unsigned int socket_id,
 		const struct rte_eth_txconf *tx_conf);
@@ -779,4 +784,37 @@  ixgbe_ethertype_filter_remove(struct ixgbe_filter_info *filter_info,
 	return idx;
 }
 
+int ixgbe_fdir_ctrl_func(struct rte_eth_dev *dev,
+			enum rte_filter_op filter_op, void *arg);
+
+/*
+ * Calculate index in vfta array of the 32 bit value enclosing
+ * a given vlan id
+ */
+static inline uint32_t
+ixgbe_vfta_index(uint16_t vlan)
+{
+	return (vlan >> 5) & 0x7f;
+}
+
+/*
+ * Calculate vfta array entry bitmask for vlan id within the
+ * enclosing 32 bit entry.
+ */
+static inline uint32_t
+ixgbe_vfta_bit(uint16_t vlan)
+{
+	return UINT32_C(1) << (vlan & 0x1f);
+}
+
+/*
+ * Check in the vfta bit array if the bit corresponding to
+ * the given vlan is set.
+ */
+static inline bool
+ixgbe_vfta_is_vlan_set(const struct ixgbe_vfta *vfta, uint16_t vlan)
+{
+	return (vfta->vfta[ixgbe_vfta_index(vlan)] & ixgbe_vfta_bit(vlan)) != 0;
+}
+
 #endif /* _IXGBE_ETHDEV_H_ */
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index f82b74a..26a99cb 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1623,14 +1623,23 @@  ixgbe_rx_fill_from_stage(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			 uint16_t nb_pkts)
 {
 	struct rte_mbuf **stage = &rxq->rx_stage[rxq->rx_next_avail];
+	const struct rte_eth_dev *dev;
+	const struct ixgbe_vfta *vfta;
 	int i;
 
+	dev = &rte_eth_devices[rxq->port_id];
+	vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
+
 	/* how many packets are ready to return? */
 	nb_pkts = (uint16_t)RTE_MIN(nb_pkts, rxq->rx_nb_avail);
 
 	/* copy mbuf pointers to the application's packet list */
-	for (i = 0; i < nb_pkts; ++i)
+	for (i = 0; i < nb_pkts; ++i) {
 		rx_pkts[i] = stage[i];
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[i],
+							 vfta);
+	}
 
 	/* update internal queue state */
 	rxq->rx_nb_avail = (uint16_t)(rxq->rx_nb_avail - nb_pkts);
@@ -1750,6 +1759,8 @@  ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	uint16_t nb_hold;
 	uint64_t pkt_flags;
 	uint64_t vlan_flags;
+	const struct rte_eth_dev *dev;
+	const struct ixgbe_vfta *vfta;
 
 	nb_rx = 0;
 	nb_hold = 0;
@@ -1758,6 +1769,9 @@  ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	rx_ring = rxq->rx_ring;
 	sw_ring = rxq->sw_ring;
 	vlan_flags = rxq->vlan_flags;
+	dev = &rte_eth_devices[rxq->port_id];
+	vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
+
 	while (nb_rx < nb_pkts) {
 		/*
 		 * The order of operations here is important as the DD status
@@ -1876,6 +1890,10 @@  ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 			ixgbe_rxd_pkt_info_to_pkt_type(pkt_info,
 						       rxq->pkt_type_mask);
 
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rxm,
+							 vfta);
+
 		if (likely(pkt_flags & PKT_RX_RSS_HASH))
 			rxm->hash.rss = rte_le_to_cpu_32(
 						rxd.wb.lower.hi_dword.rss);
@@ -2016,6 +2034,11 @@  ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
 	uint16_t nb_rx = 0;
 	uint16_t nb_hold = rxq->nb_rx_hold;
 	uint16_t prev_id = rxq->rx_tail;
+	const struct rte_eth_dev *dev;
+	const struct ixgbe_vfta *vfta;
+
+	dev = &rte_eth_devices[rxq->port_id];
+	vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
 
 	while (nb_rx < nb_pkts) {
 		bool eop;
@@ -2230,6 +2253,10 @@  ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
 		rte_packet_prefetch((char *)first_seg->buf_addr +
 			first_seg->data_off);
 
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(first_seg,
+							 vfta);
+
 		/*
 		 * Store the mbuf address into the next entry of the array
 		 * of returned packets.
@@ -3066,6 +3093,25 @@  ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
 	return 0;
 }
 
+int __attribute__((cold))
+ixgbevf_dev_rx_queue_setup(struct rte_eth_dev *dev,
+			   uint16_t queue_idx,
+			   uint16_t nb_desc,
+			   unsigned int socket_id,
+			   const struct rte_eth_rxconf *rx_conf,
+			   struct rte_mempool *mp)
+{
+	struct ixgbe_rx_queue *rxq;
+	int ret = ixgbe_dev_rx_queue_setup(dev, queue_idx, nb_desc, socket_id,
+					   rx_conf, mp);
+
+	if (ret != 0)
+		return ret; /* queue was not created; nothing to mark */
+	rxq = dev->data->rx_queues[queue_idx];
+	rxq->vf = true;
+	return 0;
+}
+
 uint32_t
 ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 {
@@ -4561,7 +4607,7 @@  ixgbe_set_ivar(struct rte_eth_dev *dev, u8 entry, u8 vector, s8 type)
 }
 
 void __attribute__((cold))
-ixgbe_set_rx_function(struct rte_eth_dev *dev)
+ixgbe_set_rx_function(struct rte_eth_dev *dev, bool vf)
 {
 	uint16_t i, rx_using_sse;
 	struct ixgbe_adapter *adapter =
@@ -4608,7 +4654,9 @@  ixgbe_set_rx_function(struct rte_eth_dev *dev)
 					    "callback (port=%d).",
 				     dev->data->port_id);
 
-			dev->rx_pkt_burst = ixgbe_recv_scattered_pkts_vec;
+			dev->rx_pkt_burst = vf ?
+				ixgbevf_recv_scattered_pkts_vec :
+				ixgbe_recv_scattered_pkts_vec;
 		} else if (adapter->rx_bulk_alloc_allowed) {
 			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
 					   "allocation callback (port=%d).",
@@ -4637,7 +4685,8 @@  ixgbe_set_rx_function(struct rte_eth_dev *dev)
 			     RTE_IXGBE_DESCS_PER_LOOP,
 			     dev->data->port_id);
 
-		dev->rx_pkt_burst = ixgbe_recv_pkts_vec;
+		dev->rx_pkt_burst = vf ? ixgbevf_recv_pkts_vec :
+			ixgbe_recv_pkts_vec;
 	} else if (adapter->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
 				    "satisfied. Rx Burst Bulk Alloc function "
@@ -4658,7 +4707,9 @@  ixgbe_set_rx_function(struct rte_eth_dev *dev)
 
 	rx_using_sse =
 		(dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec ||
-		dev->rx_pkt_burst == ixgbe_recv_pkts_vec);
+		 dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
+		 dev->rx_pkt_burst == ixgbevf_recv_scattered_pkts_vec ||
+		 dev->rx_pkt_burst == ixgbevf_recv_pkts_vec);
 
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
 		struct ixgbe_rx_queue *rxq = dev->data->rx_queues[i];
@@ -4977,7 +5028,7 @@  ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 	if (rc)
 		return rc;
 
-	ixgbe_set_rx_function(dev);
+	ixgbe_set_rx_function(dev, false);
 
 	return 0;
 }
@@ -5500,7 +5551,7 @@  ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
 		IXGBE_PSRTYPE_RQPL_SHIFT;
 	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
 
-	ixgbe_set_rx_function(dev);
+	ixgbe_set_rx_function(dev, true);
 
 	return 0;
 }
@@ -5731,6 +5782,24 @@  ixgbe_recv_pkts_vec(
 }
 
 uint16_t __attribute__((weak))
+ixgbevf_recv_pkts_vec(
+	void __rte_unused *rx_queue,
+	struct rte_mbuf __rte_unused **rx_pkts,
+	uint16_t __rte_unused nb_pkts)
+{
+	return 0;
+}
+
+uint16_t __attribute__((weak))
+ixgbevf_recv_scattered_pkts_vec(
+	void __rte_unused *rx_queue,
+	struct rte_mbuf __rte_unused **rx_pkts,
+	uint16_t __rte_unused nb_pkts)
+{
+	return 0;
+}
+
+uint16_t __attribute__((weak))
 ixgbe_recv_scattered_pkts_vec(
 	void __rte_unused *rx_queue,
 	struct rte_mbuf __rte_unused **rx_pkts,
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
index 39378f7..676557b 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.h
+++ b/drivers/net/ixgbe/ixgbe_rxtx.h
@@ -111,6 +111,7 @@  struct ixgbe_rx_queue {
 	uint16_t rx_free_trigger; /**< triggers rx buffer allocation */
 	uint8_t            rx_using_sse;
 	/**< indicates that vector RX is in use */
+	uint8_t            vf; /**< indicates that this is for a VF */
 #ifdef RTE_LIBRTE_SECURITY
 	uint8_t            using_ipsec;
 	/**< indicates that IPsec RX feature is in use */
@@ -254,6 +255,30 @@  struct ixgbe_txq_ops {
 			 IXGBE_ADVTXD_DCMD_EOP)
 
 
+
+/*
+ * Hide the transparent VLAN from the VF.
+ *
+ * When the PF assigns a transparent (port) VLAN to a VF, received
+ * packets still report that VLAN to the VF, so the VLAN is not
+ * actually transparent. A VF running in a guest VM cannot tell
+ * whether a transparent VLAN is in use or which tag was configured.
+ * Work around this by clearing the PKT_RX_VLAN flag when the tag is
+ * not one the VF has enabled in its VLAN filter table (VFTA);
+ * vlan_tci is left unchanged.
+ */
+static inline void
+ixgbevf_trans_vlan_sw_filter_hdr(struct rte_mbuf *m,
+				 const struct ixgbe_vfta *vfta)
+{
+	if (m->ol_flags & PKT_RX_VLAN) {
+		uint16_t vlan = m->vlan_tci & 0xFFF;
+
+		if (!ixgbe_vfta_is_vlan_set(vfta, vlan))
+			m->ol_flags &= ~PKT_RX_VLAN;
+	}
+}
+
 /* Takes an ethdev and a queue and sets up the tx function to be used based on
  * the queue parameters. Used in tx_queue_setup by primary process and then
  * in dev_init by secondary process when attaching to an existing ethdev.
@@ -274,12 +299,16 @@  void ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq);
  *
  * @dev rte_eth_dev handle
  */
-void ixgbe_set_rx_function(struct rte_eth_dev *dev);
+void ixgbe_set_rx_function(struct rte_eth_dev *dev, bool vf);
 
 uint16_t ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
 uint16_t ixgbe_recv_scattered_pkts_vec(void *rx_queue,
 		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
+uint16_t ixgbevf_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts);
+uint16_t ixgbevf_recv_scattered_pkts_vec(void *rx_queue,
+		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
 int ixgbe_rx_vec_dev_conf_condition_check(struct rte_eth_dev *dev);
 int ixgbe_rxq_vec_setup(struct ixgbe_rx_queue *rxq);
 void ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue *rxq);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index edb1383..d077918 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -149,6 +149,9 @@  static inline uint16_t
 _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		   uint16_t nb_pkts, uint8_t *split_packet)
 {
+	const struct rte_eth_dev *dev = &rte_eth_devices[rxq->port_id];
+	const struct ixgbe_vfta *vfta
+		= IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
 	volatile union ixgbe_adv_rx_desc *rxdp;
 	struct ixgbe_rx_entry *sw_ring;
 	uint16_t nb_pkts_recd;
@@ -272,8 +275,14 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* D.3 copy final 3,4 data to rx_pkts */
 		vst1q_u8((void *)&rx_pkts[pos + 3]->rx_descriptor_fields1,
 			 pkt_mb4);
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 3],
+							 vfta);
 		vst1q_u8((void *)&rx_pkts[pos + 2]->rx_descriptor_fields1,
 			 pkt_mb3);
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 2],
+							 vfta);
 
 		/* D.2 pkt 1,2 set in_port/nb_seg and remove crc */
 		tmp = vsubq_u16(vreinterpretq_u16_u8(pkt_mb2), crc_adjust);
@@ -294,8 +303,14 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* D.3 copy final 1,2 data to rx_pkts */
 		vst1q_u8((uint8_t *)&rx_pkts[pos + 1]->rx_descriptor_fields1,
 			 pkt_mb2);
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 1],
+							 vfta);
 		vst1q_u8((uint8_t *)&rx_pkts[pos]->rx_descriptor_fields1,
 			 pkt_mb1);
+		if (rxq->vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos],
+							 vfta);
 
 		stat &= IXGBE_VPMD_DESC_DD_MASK;
 
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
index c9ba482..04a3307 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
@@ -313,9 +313,10 @@  desc_to_ptype_v(__m128i descs[4], uint16_t pkt_type_mask,
  */
 static inline uint16_t
 _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
-		uint16_t nb_pkts, uint8_t *split_packet)
+		   uint16_t nb_pkts, bool vf, uint8_t *split_packet)
 {
 	volatile union ixgbe_adv_rx_desc *rxdp;
+	const struct ixgbe_vfta *vfta = NULL;
 	struct ixgbe_rx_entry *sw_ring;
 	uint16_t nb_pkts_recd;
 #ifdef RTE_LIBRTE_SECURITY
@@ -344,6 +345,13 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 	__m128i mbuf_init;
 	uint8_t vlan_flags;
 
+	if (vf) {
+		const struct rte_eth_dev *dev =
+			&rte_eth_devices[rxq->port_id];
+
+		vfta = IXGBE_DEV_PRIVATE_TO_VFTA(dev->data->dev_private);
+	}
+
 	/* nb_pkts shall be less equal than RTE_IXGBE_MAX_RX_BURST */
 	nb_pkts = RTE_MIN(nb_pkts, RTE_IXGBE_MAX_RX_BURST);
 
@@ -500,8 +508,15 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* D.3 copy final 3,4 data to rx_pkts */
 		_mm_storeu_si128((void *)&rx_pkts[pos+3]->rx_descriptor_fields1,
 				pkt_mb4);
+		if (vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 3],
+							 vfta);
+
 		_mm_storeu_si128((void *)&rx_pkts[pos+2]->rx_descriptor_fields1,
 				pkt_mb3);
+		if (vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 2],
+							 vfta);
 
 		/* D.2 pkt 1,2 set in_port/nb_seg and remove crc */
 		pkt_mb2 = _mm_add_epi16(pkt_mb2, crc_adjust);
@@ -536,8 +551,15 @@  _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* D.3 copy final 1,2 data to rx_pkts */
 		_mm_storeu_si128((void *)&rx_pkts[pos+1]->rx_descriptor_fields1,
 				pkt_mb2);
+		if (vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos + 1],
+							 vfta);
+
 		_mm_storeu_si128((void *)&rx_pkts[pos]->rx_descriptor_fields1,
 				pkt_mb1);
+		if (vf)
+			ixgbevf_trans_vlan_sw_filter_hdr(rx_pkts[pos],
+							 vfta);
 
 		desc_to_ptype_v(descs, rxq->pkt_type_mask, &rx_pkts[pos]);
 
@@ -569,11 +591,11 @@  uint16_t
 ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts)
 {
-	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
+	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, false, NULL);
 }
 
 /*
- * vPMD receive routine that reassembles scattered packets
+ * vPMD raw receive routine that reassembles scattered packets
  *
  * Notice:
  * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
@@ -581,16 +603,16 @@  ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
  *   numbers of DD bit
  * - floor align nb_pkts to a RTE_IXGBE_DESC_PER_LOOP power-of-two
  */
-uint16_t
-ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
-		uint16_t nb_pkts)
+static inline uint16_t
+_recv_raw_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+			     uint16_t nb_pkts, bool vf)
 {
 	struct ixgbe_rx_queue *rxq = rx_queue;
 	uint8_t split_flags[RTE_IXGBE_MAX_RX_BURST] = {0};
 
 	/* get some new buffers */
 	uint16_t nb_bufs = _recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
-			split_flags);
+					      vf, split_flags);
 	if (nb_bufs == 0)
 		return 0;
 
@@ -614,6 +636,54 @@  ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 		&split_flags[i]);
 }
 
+/*
+ * vPMD receive routine that reassembles scattered packets
+ *
+ * Notice:
+ * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
+ * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
+ *   DD bits
+ * - nb_pkts is floor-aligned to a multiple of RTE_IXGBE_DESCS_PER_LOOP
+ */
+uint16_t
+ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+			      uint16_t nb_pkts)
+{
+	return _recv_raw_scattered_pkts_vec(rx_queue, rx_pkts, nb_pkts, false);
+}
+
+/*
+ * vPMD VF receive routine; only accepts nb_pkts >= RTE_IXGBE_DESCS_PER_LOOP
+ *
+ * Notice:
+ * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
+ * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
+ *   DD bits
+ * - nb_pkts is floor-aligned to a multiple of RTE_IXGBE_DESCS_PER_LOOP
+ */
+uint16_t
+ixgbevf_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+		      uint16_t nb_pkts)
+{
+	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, true, NULL);
+}
+
+/*
+ * vPMD VF receive routine that reassembles scattered packets
+ *
+ * Notice:
+ * - nb_pkts < RTE_IXGBE_DESCS_PER_LOOP, just return no packet
+ * - nb_pkts > RTE_IXGBE_MAX_RX_BURST, only scan RTE_IXGBE_MAX_RX_BURST
+ *   DD bits
+ * - nb_pkts is floor-aligned to a multiple of RTE_IXGBE_DESCS_PER_LOOP
+ */
+uint16_t
+ixgbevf_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+				uint16_t nb_pkts)
+{
+	return _recv_raw_scattered_pkts_vec(rx_queue, rx_pkts, nb_pkts, true);
+}
+
 static inline void
 vtx1(volatile union ixgbe_adv_tx_desc *txdp,
 		struct rte_mbuf *pkt, uint64_t flags)
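
For readers following the VFTA helpers added by this patch, here is a minimal
standalone sketch (not part of the patch) of the index/bit arithmetic and of
the software filter that clears the VLAN flag. All demo_* names, struct
demo_vfta, struct demo_mbuf and DEMO_RX_VLAN are hypothetical stand-ins for
the driver's struct ixgbe_vfta, struct rte_mbuf and PKT_RX_VLAN, chosen so the
snippet builds without DPDK headers; the 128-word table size is an assumption
derived from 4096 VLAN IDs at 32 bits per word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VFTA_WORDS 128                /* assumed: 4096 VLAN IDs / 32 bits */
#define DEMO_RX_VLAN    (UINT64_C(1) << 0) /* stand-in for PKT_RX_VLAN */

/* Hypothetical stand-ins for struct ixgbe_vfta and struct rte_mbuf. */
struct demo_vfta { uint32_t vfta[DEMO_VFTA_WORDS]; };
struct demo_mbuf { uint64_t ol_flags; uint16_t vlan_tci; };

/* Index of the 32-bit word that holds the bit for this VLAN ID. */
static inline uint32_t demo_vfta_index(uint16_t vlan)
{
	return (vlan >> 5) & 0x7f;
}

/* Mask of the bit for this VLAN ID inside its 32-bit word. */
static inline uint32_t demo_vfta_bit(uint16_t vlan)
{
	return UINT32_C(1) << (vlan & 0x1f);
}

/* Mark a VLAN ID as enabled, as a VLAN filter "on" request would. */
static inline void demo_vfta_set(struct demo_vfta *t, uint16_t vlan)
{
	t->vfta[demo_vfta_index(vlan)] |= demo_vfta_bit(vlan);
}

static inline bool demo_vfta_is_vlan_set(const struct demo_vfta *t,
					 uint16_t vlan)
{
	return (t->vfta[demo_vfta_index(vlan)] & demo_vfta_bit(vlan)) != 0;
}

/* Clear the VLAN flag when the tag is not enabled in the table. */
static inline void demo_trans_vlan_filter(struct demo_mbuf *m,
					  const struct demo_vfta *t)
{
	if (m->ol_flags & DEMO_RX_VLAN) {
		uint16_t vlan = m->vlan_tci & 0xFFF; /* drop PCP/DEI bits */

		if (!demo_vfta_is_vlan_set(t, vlan))
			m->ol_flags &= ~DEMO_RX_VLAN;
	}
}

/* Self-check: VLAN 100 is enabled in the table, VLAN 200 is not. */
static inline void demo_self_check(void)
{
	struct demo_vfta t = { { 0 } };
	struct demo_mbuf known = { DEMO_RX_VLAN, 100 };
	struct demo_mbuf unknown = { DEMO_RX_VLAN, 200 };

	demo_vfta_set(&t, 100);
	demo_trans_vlan_filter(&known, &t);
	demo_trans_vlan_filter(&unknown, &t);

	assert(known.ol_flags & DEMO_RX_VLAN);      /* flag kept */
	assert(!(unknown.ol_flags & DEMO_RX_VLAN)); /* flag cleared */
}
```

VLAN 100 lands in word 3, bit 4 (100 = 3 * 32 + 4). The sketch only clears the
flag and leaves vlan_tci alone, which matches what the helper in the patch
does; whether that is sufficient for a given application is a separate
question.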