diff mbox series

[v4,3/9] net/gve: add support for device initialization

Message ID: 20220927073255.1803892-4-junfeng.guo@intel.com (mailing list archive)
State: Changes Requested, archived
Delegated to: Ferruh Yigit
Series: introduce GVE PMD

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Guo, Junfeng Sept. 27, 2022, 7:32 a.m. UTC
Support device init and the fowllowing devops:
- dev_configure
- dev_start
- dev_stop
- dev_close

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
---
 MAINTAINERS                            |   6 +
 doc/guides/nics/features/gve.ini       |  10 +
 doc/guides/nics/gve.rst                |  69 +++++
 doc/guides/nics/index.rst              |   1 +
 doc/guides/rel_notes/release_22_11.rst |   5 +
 drivers/net/gve/base/gve_adminq.c      |   1 +
 drivers/net/gve/gve_ethdev.c           | 371 +++++++++++++++++++++++++
 drivers/net/gve/gve_ethdev.h           | 225 +++++++++++++++
 drivers/net/gve/meson.build            |  14 +
 drivers/net/gve/version.map            |   3 +
 drivers/net/meson.build                |   1 +
 11 files changed, 706 insertions(+)
 create mode 100644 doc/guides/nics/features/gve.ini
 create mode 100644 doc/guides/nics/gve.rst
 create mode 100644 drivers/net/gve/gve_ethdev.c
 create mode 100644 drivers/net/gve/gve_ethdev.h
 create mode 100644 drivers/net/gve/meson.build
 create mode 100644 drivers/net/gve/version.map

Comments

Ferruh Yigit Oct. 6, 2022, 2:22 p.m. UTC | #1
On 9/27/2022 8:32 AM, Junfeng Guo wrote:

> 
> Support device init and the fowllowing devops:

s/fowllowing/following/

> - dev_configure
> - dev_start
> - dev_stop
> - dev_close

At this stage most of the above are empty functions and not implemented yet.
Instead, can you document in the commit log that the build system and device
initialization are added?

> 
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>

<...>

> --- /dev/null
> +++ b/doc/guides/nics/gve.rst
> @@ -0,0 +1,69 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(C) 2022 Intel Corporation.
> +
> +GVE poll mode driver
> +=======================
> +
> +The GVE PMD (**librte_net_gve**) provides poll mode driver support for
> +Google Virtual Ethernet device.
> +
> +Please refer to https://cloud.google.com/compute/docs/networking/using-gvnic
> +for the device description.
> +

This seems to be another virtual interface, similar to iavf/virtio/idpf ...

Can you please briefly describe here the motivation for adding yet another
virtual interface, and also briefly describe the pros and cons of the interface?

> +The base code is under MIT license and based on GVE kernel driver v1.3.0.
> +GVE base code files are:
> +
> +- gve_adminq.h
> +- gve_adminq.c
> +- gve_desc.h
> +- gve_desc_dqo.h
> +- gve_register.h
> +- gve.h
> +
> +Please refer to https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/tree/v1.3.0/google/gve
> +to find the original base code.
> +
> +GVE has 3 queue formats:
> +
> +- GQI_QPL - GQI with queue page list
> +- GQI_RDA - GQI with raw DMA addressing
> +- DQO_RDA - DQO with raw DMA addressing
> +
> +GQI_QPL queue format is queue page list mode. Driver needs to allocate
> +memory and register this memory as a Queue Page List (QPL) in hardware
> +(Google Hypervisor/GVE Backend) first. Each queue has its own QPL.
> +Then Tx needs to copy packets to QPL memory and put this packet's offset
> +in the QPL memory into hardware descriptors so that hardware can get the
> +packets data. And Rx needs to read descriptors of offset in QPL to get
> +QPL address and copy packets from the address to get real packets data.
> +
> +GQI_RDA queue format works like usual NICs that driver can put packets'
> +physical address into hardware descriptors.
> +
> +DQO_RDA queue format has submission and completion queue pair for each
> +Tx/Rx queue. And similar as GQI_RDA, driver can put packets' physical
> +address into hardware descriptors.
> +
> +Please refer to https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/google/gve.html
> +to get more information about GVE queue formats.
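
For readers following along, the GQI_QPL data path described above can be modelled roughly as below. This is only a stand-alone sketch under my own assumptions, not driver code: every `sketch_*` name, the 4096-byte region and the descriptor layout are made up for illustration, and the real descriptor format and ring handling differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_QPL_SIZE 4096u /* hypothetical size of the registered region */

/* Hypothetical stand-in for a registered queue page list (one per queue). */
struct sketch_qpl {
	uint8_t mem[SKETCH_QPL_SIZE];
	uint32_t next_off; /* next free byte; no wrap handling in this sketch */
};

/* Hypothetical descriptor: carries an offset into the QPL, not an address. */
struct sketch_desc {
	uint32_t qpl_off;
	uint16_t len;
};

/* Tx side: copy the packet into QPL memory and describe it by its offset. */
static int
sketch_tx(struct sketch_qpl *qpl, const void *pkt, uint16_t len,
	  struct sketch_desc *desc)
{
	assert(qpl != NULL && pkt != NULL && desc != NULL);
	if (len > SKETCH_QPL_SIZE - qpl->next_off)
		return -1; /* region full */
	memcpy(&qpl->mem[qpl->next_off], pkt, len);
	desc->qpl_off = qpl->next_off;
	desc->len = len;
	qpl->next_off += len;
	return 0;
}

/* Rx side: turn the descriptor's offset into a QPL address and copy out. */
static int
sketch_rx(const struct sketch_qpl *qpl, const struct sketch_desc *desc,
	  void *buf, uint16_t buf_len)
{
	assert(qpl != NULL && desc != NULL && buf != NULL);
	if (desc->len > buf_len || desc->len > SKETCH_QPL_SIZE ||
	    desc->qpl_off > SKETCH_QPL_SIZE - desc->len)
		return -1; /* does not fit the buffer or the region */
	memcpy(buf, &qpl->mem[desc->qpl_off], desc->len);
	return desc->len;
}
```

The point of the model is that a descriptor carries an offset into pre-registered memory rather than a packet address, so both directions involve a copy that the RDA formats, as described, avoid.
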
> +
> +Features and Limitations
> +------------------------
> +
> +In this release, the GVE PMD provides the basic functionality of packet
> +reception and transmission.
> +Supported features of the GVE PMD are:
> +
> +- Multiple queues for TX and RX
> +- Receiver Side Scaling (RSS)
> +- TSO offload
> +- Port hardware statistics
> +- Link state information
> +- TX multi-segments (Scatter TX)
> +- Tx UDP/TCP/SCTP Checksum
> +

Can you build this list gradually, by adding the relevant item in each patch
that adds it?
That way the mapping between the code and the documented features becomes more obvious.

<...>

> +static int
> +gve_dev_uninit(struct rte_eth_dev *eth_dev)
> +{
> +       struct gve_priv *priv = eth_dev->data->dev_private;
> +
> +       eth_dev->data->mac_addrs = NULL;
> +

At this stage 'mac_addrs' is not freed; setting it to NULL prevents it
from being freed.
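
To illustrate the concern, here is a stand-alone model. It is not DPDK code, and all `mock_*` names are made up for this mail. It assumes, as far as I recall, that the generic remove path releases the port after calling the uninit callback and frees 'mac_addrs' at that point:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ethdev data; not the DPDK definition. */
struct mock_eth_dev_data {
	void *mac_addrs;
};

/* Models the release step: frees mac_addrs if the pointer is still set.
 * Returns 1 when a buffer was actually freed, 0 otherwise.
 */
static int
mock_release_port(struct mock_eth_dev_data *data)
{
	int freed = 0;

	assert(data != NULL);
	if (data->mac_addrs != NULL) {
		free(data->mac_addrs);
		freed = 1;
	}
	data->mac_addrs = NULL;
	return freed;
}

/* Uninit as in this patch: the pointer is dropped before the release step. */
static void
mock_uninit_clears(struct mock_eth_dev_data *data)
{
	data->mac_addrs = NULL;
}

/* Uninit that leaves the pointer alone so the release step can free it. */
static void
mock_uninit_keeps(struct mock_eth_dev_data *data)
{
	(void)data;
}

/* Runs alloc -> uninit -> release and reports whether release freed it. */
static int
mock_run(void (*uninit)(struct mock_eth_dev_data *))
{
	struct mock_eth_dev_data data = { .mac_addrs = malloc(6) };
	void *buf = data.mac_addrs;
	int freed;

	assert(buf != NULL);
	uninit(&data);
	freed = mock_release_port(&data);
	if (!freed)
		free(buf); /* harness cleanup so the demo itself does not leak */
	return freed;
}
```

In this model `mock_run(mock_uninit_clears)` reports that the release step had nothing to free, while `mock_run(mock_uninit_keeps)` reports that the buffer was released.
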

<...>

> +
> +static struct rte_pci_driver rte_gve_pmd = {
> +       .id_table = pci_id_gve_map,
> +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,

As far as I can see, the LSC interrupt is not supported. If that is correct,
should we drop the flag?

<...>

> +
> +struct gve_priv {
> +       struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
> +       const struct rte_memzone *irq_dbs_mz;
> +       uint32_t mgmt_msix_idx;
> +       rte_be32_t *cnt_array; /* array of num_event_counters */
> +       const struct rte_memzone *cnt_array_mz;
> +
> +       uint16_t num_event_counters;
> +       uint16_t tx_desc_cnt; /* txq size */
> +       uint16_t rx_desc_cnt; /* rxq size */
> +       uint16_t tx_pages_per_qpl; /* tx buffer length */
> +       uint16_t rx_data_slot_cnt; /* rx buffer length */
> +
> +       /* Only valid for DQO_RDA queue format */
> +       uint16_t tx_compq_size; /* tx completion queue size */
> +       uint16_t rx_bufq_size; /* rx buff queue size */
> +
> +       uint64_t max_registered_pages;
> +       uint64_t num_registered_pages; /* num pages registered with NIC */
> +       uint16_t default_num_queues; /* default num queues to set up */
> +       enum gve_queue_format queue_format; /* see enum gve_queue_format */
> +       uint8_t enable_rsc;
> +
> +       uint16_t max_nb_txq;
> +       uint16_t max_nb_rxq;
> +       uint32_t num_ntfy_blks; /* spilt between TX and RX so must be even */
> +
> +       struct gve_registers __iomem *reg_bar0; /* see gve_register.h */
> +       rte_be32_t __iomem *db_bar2; /* "array" of doorbells */
> +       struct rte_pci_device *pci_dev;
> +
> +       /* Admin queue - see gve_adminq.h*/
> +       union gve_adminq_command *adminq;
> +       struct gve_dma_mem adminq_dma_mem;
> +       uint32_t adminq_mask; /* masks prod_cnt to adminq size */
> +       uint32_t adminq_prod_cnt; /* free-running count of AQ cmds executed */
> +       uint32_t adminq_cmd_fail; /* free-running count of AQ cmds failed */
> +       uint32_t adminq_timeouts; /* free-running count of AQ cmds timeouts */
> +       /* free-running count of per AQ cmd executed */
> +       uint32_t adminq_describe_device_cnt;
> +       uint32_t adminq_cfg_device_resources_cnt;
> +       uint32_t adminq_register_page_list_cnt;
> +       uint32_t adminq_unregister_page_list_cnt;
> +       uint32_t adminq_create_tx_queue_cnt;
> +       uint32_t adminq_create_rx_queue_cnt;
> +       uint32_t adminq_destroy_tx_queue_cnt;
> +       uint32_t adminq_destroy_rx_queue_cnt;
> +       uint32_t adminq_dcfg_device_resources_cnt;
> +       uint32_t adminq_set_driver_parameter_cnt;
> +       uint32_t adminq_report_stats_cnt;
> +       uint32_t adminq_report_link_speed_cnt;
> +       uint32_t adminq_get_ptype_map_cnt;
> +
> +       volatile uint32_t state_flags;
> +
> +       /* Gvnic device link speed from hypervisor. */
> +       uint64_t link_speed;
> +
> +       uint16_t max_mtu;
> +       struct rte_ether_addr dev_addr; /* mac address */
> +
> +       struct gve_queue_page_list *qpl;
> +
> +       struct gve_tx_queue **txqs;
> +       struct gve_rx_queue **rxqs;
> +};
> +

Similar to the previous comment, can you construct the headers by adding
only the fields used in each patch?

When an existing struct is copied wholesale, it is very easy to add unused
code and very hard to detect it. If you only add what you need, it becomes
easy to be sure all fields are used.

It also makes it more obvious which fields are related to which feature.

<...>

> new file mode 100644
> index 0000000000..c2e0723b4c
> --- /dev/null
> +++ b/drivers/net/gve/version.map
> @@ -0,0 +1,3 @@
> +DPDK_22 {
> +       local: *;
> +};

It is 'DPDK_23' now. Hopefully we will have an update to get rid of
empty map files; feel free to review:
https://patches.dpdk.org/project/dpdk/list/?series=25002

Guo, Junfeng Oct. 9, 2022, 9:14 a.m. UTC | #2
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Thursday, October 6, 2022 22:23
> To: Guo, Junfeng <junfeng.guo@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Li, Xiaoyun <xiaoyun.li@intel.com>;
> awogbemila@google.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>;
> Wang, Haiyue <haiyue.wang@intel.com>
> Subject: Re: [PATCH v4 3/9] net/gve: add support for device initialization
> 
> On 9/27/2022 8:32 AM, Junfeng Guo wrote:
> 
> >
> > Support device init and the fowllowing devops:
> 
> s/fowllowing/following/
> 
> > - dev_configure
> > - dev_start
> > - dev_stop
> > - dev_close
> 
> At this stage most of above are empty functions and not implemented yet,
> instead can you document in the commit log that build system and device
> initialization is added?

Agreed, will add this in the coming version. Thanks!

> 
> >
> > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> > Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> > Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
> 
> <...>
> 
> > --- /dev/null
> > +++ b/doc/guides/nics/gve.rst
> > @@ -0,0 +1,69 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(C) 2022 Intel Corporation.
> > +
> > +GVE poll mode driver
> > +=======================
> > +
> > +The GVE PMD (**librte_net_gve**) provides poll mode driver support for
> > +Google Virtual Ethernet device.
> > +
> > +Please refer to https://cloud.google.com/compute/docs/networking/using-gvnic
> > +for the device description.
> > +
> 
> This seems another virtual interface, similar to iavf/virtio/idpf ...
> 
> Can you please briefly describe here the motivation to add yet another
> virtual interface, and again briefly describe cons/pros of the interface?

Sure. According to the official gVNIC description, gVNIC is an alternative
to the virtio driver that can provide higher network bandwidths.
Will add the brief descriptions of the cons/pros in the coming version.
Thanks!

> 
> > +The base code is under MIT license and based on GVE kernel driver v1.3.0.
> > +GVE base code files are:
> > +
> > +- gve_adminq.h
> > +- gve_adminq.c
> > +- gve_desc.h
> > +- gve_desc_dqo.h
> > +- gve_register.h
> > +- gve.h
> > +
> > +Please refer to https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/tree/v1.3.0/google/gve
> > +to find the original base code.
> > +
> > +GVE has 3 queue formats:
> > +
> > +- GQI_QPL - GQI with queue page list
> > +- GQI_RDA - GQI with raw DMA addressing
> > +- DQO_RDA - DQO with raw DMA addressing
> > +
> > +GQI_QPL queue format is queue page list mode. Driver needs to allocate
> > +memory and register this memory as a Queue Page List (QPL) in hardware
> > +(Google Hypervisor/GVE Backend) first. Each queue has its own QPL.
> > +Then Tx needs to copy packets to QPL memory and put this packet's offset
> > +in the QPL memory into hardware descriptors so that hardware can get the
> > +packets data. And Rx needs to read descriptors of offset in QPL to get
> > +QPL address and copy packets from the address to get real packets data.
> > +
> > +GQI_RDA queue format works like usual NICs that driver can put packets'
> > +physical address into hardware descriptors.
> > +
> > +DQO_RDA queue format has submission and completion queue pair for each
> > +Tx/Rx queue. And similar as GQI_RDA, driver can put packets' physical
> > +address into hardware descriptors.
> > +
> > +Please refer to https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/google/gve.html
> > +to get more information about GVE queue formats.
> > +
> > +Features and Limitations
> > +------------------------
> > +
> > +In this release, the GVE PMD provides the basic functionality of packet
> > +reception and transmission.
> > +Supported features of the GVE PMD are:
> > +
> > +- Multiple queues for TX and RX
> > +- Receiver Side Scaling (RSS)
> > +- TSO offload
> > +- Port hardware statistics
> > +- Link state information
> > +- TX multi-segments (Scatter TX)
> > +- Tx UDP/TCP/SCTP Checksum
> > +
> 
> Can you build this list gradually, by adding the relevant item in each patch
> that adds it?
> That way the mapping between the code and the documented features becomes
> more obvious.

Sure... Will add the items of this list gradually in the coming version. Thanks!

> 
> <...>
> 
> > +static int
> > +gve_dev_uninit(struct rte_eth_dev *eth_dev)
> > +{
> > +       struct gve_priv *priv = eth_dev->data->dev_private;
> > +
> > +       eth_dev->data->mac_addrs = NULL;
> > +
> 
> At this stage 'mac_addrs' is not freed, setting it to NULL prevents it
> to be freed.

Thanks for the catch! Will improve this.

> 
> <...>
> 
> > +
> > +static struct rte_pci_driver rte_gve_pmd = {
> > +       .id_table = pci_id_gve_map,
> > +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
> 
> As far as I can see LSC interrupt is not supported, if that is correct
> should we drop the flag?

Sure, it seems this flag is not used in the current GCP environment.
Will remove it in the coming version. Thanks!

> 
> <...>
> 
> > +
> > +struct gve_priv {
> > +       struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
> > +       const struct rte_memzone *irq_dbs_mz;
> > +       uint32_t mgmt_msix_idx;
> > +       rte_be32_t *cnt_array; /* array of num_event_counters */
> > +       const struct rte_memzone *cnt_array_mz;
> > +
> > +       uint16_t num_event_counters;
> > +       uint16_t tx_desc_cnt; /* txq size */
> > +       uint16_t rx_desc_cnt; /* rxq size */
> > +       uint16_t tx_pages_per_qpl; /* tx buffer length */
> > +       uint16_t rx_data_slot_cnt; /* rx buffer length */
> > +
> > +       /* Only valid for DQO_RDA queue format */
> > +       uint16_t tx_compq_size; /* tx completion queue size */
> > +       uint16_t rx_bufq_size; /* rx buff queue size */
> > +
> > +       uint64_t max_registered_pages;
> > +       uint64_t num_registered_pages; /* num pages registered with NIC */
> > +       uint16_t default_num_queues; /* default num queues to set up */
> > +       enum gve_queue_format queue_format; /* see enum gve_queue_format */
> > +       uint8_t enable_rsc;
> > +
> > +       uint16_t max_nb_txq;
> > +       uint16_t max_nb_rxq;
> > +       uint32_t num_ntfy_blks; /* spilt between TX and RX so must be even */
> > +
> > +       struct gve_registers __iomem *reg_bar0; /* see gve_register.h */
> > +       rte_be32_t __iomem *db_bar2; /* "array" of doorbells */
> > +       struct rte_pci_device *pci_dev;
> > +
> > +       /* Admin queue - see gve_adminq.h*/
> > +       union gve_adminq_command *adminq;
> > +       struct gve_dma_mem adminq_dma_mem;
> > +       uint32_t adminq_mask; /* masks prod_cnt to adminq size */
> > +       uint32_t adminq_prod_cnt; /* free-running count of AQ cmds executed */
> > +       uint32_t adminq_cmd_fail; /* free-running count of AQ cmds failed */
> > +       uint32_t adminq_timeouts; /* free-running count of AQ cmds timeouts */
> > +       /* free-running count of per AQ cmd executed */
> > +       uint32_t adminq_describe_device_cnt;
> > +       uint32_t adminq_cfg_device_resources_cnt;
> > +       uint32_t adminq_register_page_list_cnt;
> > +       uint32_t adminq_unregister_page_list_cnt;
> > +       uint32_t adminq_create_tx_queue_cnt;
> > +       uint32_t adminq_create_rx_queue_cnt;
> > +       uint32_t adminq_destroy_tx_queue_cnt;
> > +       uint32_t adminq_destroy_rx_queue_cnt;
> > +       uint32_t adminq_dcfg_device_resources_cnt;
> > +       uint32_t adminq_set_driver_parameter_cnt;
> > +       uint32_t adminq_report_stats_cnt;
> > +       uint32_t adminq_report_link_speed_cnt;
> > +       uint32_t adminq_get_ptype_map_cnt;
> > +
> > +       volatile uint32_t state_flags;
> > +
> > +       /* Gvnic device link speed from hypervisor. */
> > +       uint64_t link_speed;
> > +
> > +       uint16_t max_mtu;
> > +       struct rte_ether_addr dev_addr; /* mac address */
> > +
> > +       struct gve_queue_page_list *qpl;
> > +
> > +       struct gve_tx_queue **txqs;
> > +       struct gve_rx_queue **rxqs;
> > +};
> > +
> 
> Similar to previous comment, can you construct the headers by only
> adding used fields in that patch?
> 
> When batch copied an existing struct, it is very easy to add unused code
> and very hard to detect it. So if you only add what you need, that
> becomes easy to be sure all fields are used.
> 
> Also it makes more obvious which fields related to which feature.

Sure, will do my best to build up the header structures gradually.
Thanks!

> 
> <...>
> 
> > new file mode 100644
> > index 0000000000..c2e0723b4c
> > --- /dev/null
> > +++ b/drivers/net/gve/version.map
> > @@ -0,0 +1,3 @@
> > +DPDK_22 {
> > +       local: *;
> > +};
> 
> it is 'DPDK_23' now, hopefully we will have an update to get rid of
> empty map files, feel free to review:
> https://patches.dpdk.org/project/dpdk/list/?series=25002

Sure, it looks much better. Thanks!

>

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 32ffdd1a61..474f41f0de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -697,6 +697,12 @@  F: drivers/net/enic/
 F: doc/guides/nics/enic.rst
 F: doc/guides/nics/features/enic.ini
 
+Google Virtual Ethernet
+M: Junfeng Guo <junfeng.guo@intel.com>
+F: drivers/net/gve/
+F: doc/guides/nics/gve.rst
+F: doc/guides/nics/features/gve.ini
+
 Hisilicon hns3
 M: Dongdong Liu <liudongdong3@huawei.com>
 M: Yisen Zhuang <yisen.zhuang@huawei.com>
diff --git a/doc/guides/nics/features/gve.ini b/doc/guides/nics/features/gve.ini
new file mode 100644
index 0000000000..44aec28009
--- /dev/null
+++ b/doc/guides/nics/features/gve.ini
@@ -0,0 +1,10 @@ 
+;
+; Supported features of the Google Virtual Ethernet 'gve' poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Linux                = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/gve.rst b/doc/guides/nics/gve.rst
new file mode 100644
index 0000000000..e93a0a6338
--- /dev/null
+++ b/doc/guides/nics/gve.rst
@@ -0,0 +1,69 @@ 
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(C) 2022 Intel Corporation.
+
+GVE poll mode driver
+=======================
+
+The GVE PMD (**librte_net_gve**) provides poll mode driver support for
+Google Virtual Ethernet device.
+
+Please refer to https://cloud.google.com/compute/docs/networking/using-gvnic
+for the device description.
+
+The base code is under MIT license and based on GVE kernel driver v1.3.0.
+GVE base code files are:
+
+- gve_adminq.h
+- gve_adminq.c
+- gve_desc.h
+- gve_desc_dqo.h
+- gve_register.h
+- gve.h
+
+Please refer to https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/tree/v1.3.0/google/gve
+to find the original base code.
+
+GVE has 3 queue formats:
+
+- GQI_QPL - GQI with queue page list
+- GQI_RDA - GQI with raw DMA addressing
+- DQO_RDA - DQO with raw DMA addressing
+
+In the GQI_QPL queue format (queue page list mode), the driver first
+allocates memory and registers it as a Queue Page List (QPL) with the
+hardware (Google Hypervisor/GVE Backend). Each queue has its own QPL.
+On Tx, the driver copies each packet into QPL memory and writes the
+packet's offset within the QPL into the hardware descriptor so that the
+hardware can fetch the packet data. On Rx, the driver reads the QPL offset
+from the descriptor, derives the QPL address, and copies the packet data out.
+
+The GQI_RDA queue format works like usual NICs: the driver can put the
+packets' physical addresses into hardware descriptors.
+
+The DQO_RDA queue format has a submission and completion queue pair for
+each Tx/Rx queue. As with GQI_RDA, the driver can put the packets' physical
+addresses into hardware descriptors.
+
+Please refer to https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/google/gve.html
+to get more information about GVE queue formats.
+
+Features and Limitations
+------------------------
+
+In this release, the GVE PMD provides the basic functionality of packet
+reception and transmission.
+Supported features of the GVE PMD are:
+
+- Multiple queues for TX and RX
+- Receive Side Scaling (RSS)
+- TSO offload
+- Port hardware statistics
+- Link state information
+- TX multi-segments (Scatter TX)
+- Tx UDP/TCP/SCTP Checksum
+
+Currently, only the GQI_QPL and GQI_RDA queue formats are supported in the PMD.
+Jumbo Frame is not supported in the PMD for now. It will be added in a future
+DPDK release.
+Also, only the GQI_QPL queue format is in use on GCP since GQI_RDA hasn't been
+released in production.
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index f48e9f815c..64388adad0 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -29,6 +29,7 @@  Network Interface Controller Drivers
     enetfec
     enic
     fm10k
+    gve
     hinic
     hns3
     i40e
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index bb77a03e24..20d9dcaafd 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -59,6 +59,11 @@  New Features
 
   * Added flow subscription support.
 
+* **Added GVE net PMD**
+
+  * Added the new ``gve`` net driver for Google Virtual Ethernet devices.
+  * See the :doc:`../nics/gve` NIC guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 95ec6b015c..072fbee539 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -5,6 +5,7 @@ 
  * Copyright(C) 2022 Intel Corporation
  */
 
+#include "../gve_ethdev.h"
 #include "gve_adminq.h"
 #include "gve_register.h"
 
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
new file mode 100644
index 0000000000..df698c1b02
--- /dev/null
+++ b/drivers/net/gve/gve_ethdev.c
@@ -0,0 +1,371 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Intel Corporation
+ */
+#include <linux/pci_regs.h>
+
+#include "gve_ethdev.h"
+#include "base/gve_adminq.h"
+#include "base/gve_register.h"
+
+const char gve_version_str[] = GVE_VERSION;
+static const char gve_version_prefix[] = GVE_VERSION_PREFIX;
+
+static void
+gve_write_version(uint8_t *driver_version_register)
+{
+	const char *c = gve_version_prefix;
+
+	while (*c) {
+		writeb(*c, driver_version_register);
+		c++;
+	}
+
+	c = gve_version_str;
+	while (*c) {
+		writeb(*c, driver_version_register);
+		c++;
+	}
+	writeb('\n', driver_version_register);
+}
+
+static int
+gve_dev_configure(__rte_unused struct rte_eth_dev *dev)
+{
+	return 0;
+}
+
+static int
+gve_dev_start(struct rte_eth_dev *dev)
+{
+	dev->data->dev_started = 1;
+
+	return 0;
+}
+
+static int
+gve_dev_stop(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
+	dev->data->dev_started = 0;
+
+	return 0;
+}
+
+static int
+gve_dev_close(struct rte_eth_dev *dev)
+{
+	int err = 0;
+
+	if (dev->data->dev_started) {
+		err = gve_dev_stop(dev);
+		if (err != 0)
+			PMD_DRV_LOG(ERR, "Failed to stop dev.");
+	}
+
+	return err;
+}
+
+static const struct eth_dev_ops gve_eth_dev_ops = {
+	.dev_configure        = gve_dev_configure,
+	.dev_start            = gve_dev_start,
+	.dev_stop             = gve_dev_stop,
+	.dev_close            = gve_dev_close,
+};
+
+static void
+gve_free_counter_array(struct gve_priv *priv)
+{
+	rte_memzone_free(priv->cnt_array_mz);
+	priv->cnt_array = NULL;
+}
+
+static void
+gve_free_irq_db(struct gve_priv *priv)
+{
+	rte_memzone_free(priv->irq_dbs_mz);
+	priv->irq_dbs = NULL;
+}
+
+static void
+gve_teardown_device_resources(struct gve_priv *priv)
+{
+	int err;
+
+	/* Tell device its resources are being freed */
+	if (gve_get_device_resources_ok(priv)) {
+		err = gve_adminq_deconfigure_device_resources(priv);
+		if (err)
+			PMD_DRV_LOG(ERR, "Could not deconfigure device resources: err=%d", err);
+	}
+	gve_free_counter_array(priv);
+	gve_free_irq_db(priv);
+	gve_clear_device_resources_ok(priv);
+}
+
+static uint8_t
+pci_dev_find_capability(struct rte_pci_device *pdev, int cap)
+{
+	uint8_t pos, id;
+	uint16_t ent;
+	int loops;
+	int ret;
+
+	ret = rte_pci_read_config(pdev, &pos, sizeof(pos), PCI_CAPABILITY_LIST);
+	if (ret != sizeof(pos))
+		return 0;
+
+	loops = (PCI_CFG_SPACE_SIZE - PCI_STD_HEADER_SIZEOF) / PCI_CAP_SIZEOF;
+
+	while (pos && loops--) {
+		ret = rte_pci_read_config(pdev, &ent, sizeof(ent), pos);
+		if (ret != sizeof(ent))
+			return 0;
+
+		id = ent & 0xff;
+		if (id == 0xff)
+			break;
+
+		if (id == cap)
+			return pos;
+
+		pos = (ent >> 8);
+	}
+
+	return 0;
+}
+
+static int
+pci_dev_msix_vec_count(struct rte_pci_device *pdev)
+{
+	uint8_t msix_cap = pci_dev_find_capability(pdev, PCI_CAP_ID_MSIX);
+	uint16_t control;
+	int ret;
+
+	if (!msix_cap)
+		return 0;
+
+	ret = rte_pci_read_config(pdev, &control, sizeof(control), msix_cap + PCI_MSIX_FLAGS);
+	if (ret != sizeof(control))
+		return 0;
+
+	return (control & PCI_MSIX_FLAGS_QSIZE) + 1;
+}
+
+static int
+gve_setup_device_resources(struct gve_priv *priv)
+{
+	char z_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	int err = 0;
+
+	snprintf(z_name, sizeof(z_name), "gve_%s_cnt_arr", priv->pci_dev->device.name);
+	mz = rte_memzone_reserve_aligned(z_name,
+					 priv->num_event_counters * sizeof(*priv->cnt_array),
+					 rte_socket_id(), RTE_MEMZONE_IOVA_CONTIG,
+					 PAGE_SIZE);
+	if (mz == NULL) {
+		PMD_DRV_LOG(ERR, "Could not alloc memzone for count array");
+		return -ENOMEM;
+	}
+	priv->cnt_array = (rte_be32_t *)mz->addr;
+	priv->cnt_array_mz = mz;
+
+	snprintf(z_name, sizeof(z_name), "gve_%s_irqmz", priv->pci_dev->device.name);
+	mz = rte_memzone_reserve_aligned(z_name,
+					 sizeof(*priv->irq_dbs) * (priv->num_ntfy_blks),
+					 rte_socket_id(), RTE_MEMZONE_IOVA_CONTIG,
+					 PAGE_SIZE);
+	if (mz == NULL) {
+		PMD_DRV_LOG(ERR, "Could not alloc memzone for irq_dbs");
+		err = -ENOMEM;
+		goto free_cnt_array;
+	}
+	priv->irq_dbs = (struct gve_irq_db *)mz->addr;
+	priv->irq_dbs_mz = mz;
+
+	err = gve_adminq_configure_device_resources(priv,
+						    priv->cnt_array_mz->iova,
+						    priv->num_event_counters,
+						    priv->irq_dbs_mz->iova,
+						    priv->num_ntfy_blks);
+	if (unlikely(err)) {
+		PMD_DRV_LOG(ERR, "Could not config device resources: err=%d", err);
+		goto free_irq_dbs;
+	}
+	return 0;
+
+free_irq_dbs:
+	gve_free_irq_db(priv);
+free_cnt_array:
+	gve_free_counter_array(priv);
+
+	return err;
+}
+
+static int
+gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
+{
+	int num_ntfy;
+	int err;
+
+	/* Set up the adminq */
+	err = gve_adminq_alloc(priv);
+	if (err) {
+		PMD_DRV_LOG(ERR, "Failed to alloc admin queue: err=%d", err);
+		return err;
+	}
+
+	if (skip_describe_device)
+		goto setup_device;
+
+	/* Get the initial information we need from the device */
+	err = gve_adminq_describe_device(priv);
+	if (err) {
+		PMD_DRV_LOG(ERR, "Could not get device information: err=%d", err);
+		goto free_adminq;
+	}
+
+	num_ntfy = pci_dev_msix_vec_count(priv->pci_dev);
+	if (num_ntfy <= 0) {
+		PMD_DRV_LOG(ERR, "Could not count MSI-x vectors");
+		err = -EIO;
+		goto free_adminq;
+	} else if (num_ntfy < GVE_MIN_MSIX) {
+		PMD_DRV_LOG(ERR, "GVE needs at least %d MSI-x vectors, but only has %d",
+			    GVE_MIN_MSIX, num_ntfy);
+		err = -EINVAL;
+		goto free_adminq;
+	}
+
+	priv->num_registered_pages = 0;
+
+	/* gvnic has one Notification Block per MSI-x vector, except for the
+	 * management vector
+	 */
+	priv->num_ntfy_blks = (num_ntfy - 1) & ~0x1;
+	priv->mgmt_msix_idx = priv->num_ntfy_blks;
+
+	priv->max_nb_txq = RTE_MIN(priv->max_nb_txq, priv->num_ntfy_blks / 2);
+	priv->max_nb_rxq = RTE_MIN(priv->max_nb_rxq, priv->num_ntfy_blks / 2);
+
+	if (priv->default_num_queues > 0) {
+		priv->max_nb_txq = RTE_MIN(priv->default_num_queues, priv->max_nb_txq);
+		priv->max_nb_rxq = RTE_MIN(priv->default_num_queues, priv->max_nb_rxq);
+	}
+
+	PMD_DRV_LOG(INFO, "Max TX queues %d, Max RX queues %d",
+		    priv->max_nb_txq, priv->max_nb_rxq);
+
+setup_device:
+	err = gve_setup_device_resources(priv);
+	if (!err)
+		return 0;
+free_adminq:
+	gve_adminq_free(priv);
+	return err;
+}
+
+static void
+gve_teardown_priv_resources(struct gve_priv *priv)
+{
+	gve_teardown_device_resources(priv);
+	gve_adminq_free(priv);
+}
+
+static int
+gve_dev_init(struct rte_eth_dev *eth_dev)
+{
+	struct gve_priv *priv = eth_dev->data->dev_private;
+	int max_tx_queues, max_rx_queues;
+	struct rte_pci_device *pci_dev;
+	struct gve_registers *reg_bar;
+	rte_be32_t *db_bar;
+	int err;
+
+	eth_dev->dev_ops = &gve_eth_dev_ops;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	pci_dev = RTE_DEV_TO_PCI(eth_dev->device);
+
+	reg_bar = pci_dev->mem_resource[GVE_REG_BAR].addr;
+	if (!reg_bar) {
+		PMD_DRV_LOG(ERR, "Failed to map pci bar!");
+		return -ENOMEM;
+	}
+
+	db_bar = pci_dev->mem_resource[GVE_DB_BAR].addr;
+	if (!db_bar) {
+		PMD_DRV_LOG(ERR, "Failed to map doorbell bar!");
+		return -ENOMEM;
+	}
+
+	gve_write_version(&reg_bar->driver_version);
+	/* Get max queues to alloc etherdev */
+	max_tx_queues = ioread32be(&reg_bar->max_tx_queues);
+	max_rx_queues = ioread32be(&reg_bar->max_rx_queues);
+
+	priv->reg_bar0 = reg_bar;
+	priv->db_bar2 = db_bar;
+	priv->pci_dev = pci_dev;
+	priv->state_flags = 0x0;
+
+	priv->max_nb_txq = max_tx_queues;
+	priv->max_nb_rxq = max_rx_queues;
+
+	err = gve_init_priv(priv, false);
+	if (err)
+		return err;
+
+	eth_dev->data->mac_addrs = rte_zmalloc("gve_mac", sizeof(struct rte_ether_addr), 0);
+	if (!eth_dev->data->mac_addrs) {
+		PMD_DRV_LOG(ERR, "Failed to allocate memory to store mac address");
+		return -ENOMEM;
+	}
+	rte_ether_addr_copy(&priv->dev_addr, eth_dev->data->mac_addrs);
+
+	return 0;
+}
+
+static int
+gve_dev_uninit(struct rte_eth_dev *eth_dev)
+{
+	struct gve_priv *priv = eth_dev->data->dev_private;
+
+	eth_dev->data->mac_addrs = NULL;
+
+	gve_teardown_priv_resources(priv);
+
+	return 0;
+}
+
+static int
+gve_pci_probe(__rte_unused struct rte_pci_driver *pci_drv,
+	      struct rte_pci_device *pci_dev)
+{
+	return rte_eth_dev_pci_generic_probe(pci_dev, sizeof(struct gve_priv), gve_dev_init);
+}
+
+static int
+gve_pci_remove(struct rte_pci_device *pci_dev)
+{
+	return rte_eth_dev_pci_generic_remove(pci_dev, gve_dev_uninit);
+}
+
+static const struct rte_pci_id pci_id_gve_map[] = {
+	{ RTE_PCI_DEVICE(GOOGLE_VENDOR_ID, GVE_DEV_ID) },
+	{ .device_id = 0 },
+};
+
+static struct rte_pci_driver rte_gve_pmd = {
+	.id_table = pci_id_gve_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+	.probe = gve_pci_probe,
+	.remove = gve_pci_remove,
+};
+
+RTE_PMD_REGISTER_PCI(net_gve, rte_gve_pmd);
+RTE_PMD_REGISTER_PCI_TABLE(net_gve, pci_id_gve_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_gve, "* igb_uio | vfio-pci");
+RTE_LOG_REGISTER_SUFFIX(gve_logtype_driver, driver, NOTICE);
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
new file mode 100644
index 0000000000..2ac2a46ac1
--- /dev/null
+++ b/drivers/net/gve/gve_ethdev.h
@@ -0,0 +1,225 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Intel Corporation
+ */
+
+#ifndef _GVE_ETHDEV_H_
+#define _GVE_ETHDEV_H_
+
+#include <ethdev_driver.h>
+#include <ethdev_pci.h>
+#include <rte_ether.h>
+
+#include "base/gve.h"
+
+#define GVE_DEFAULT_RX_FREE_THRESH  512
+#define GVE_DEFAULT_TX_FREE_THRESH  256
+#define GVE_TX_MAX_FREE_SZ          512
+
+#define GVE_MIN_BUF_SIZE	    1024
+#define GVE_MAX_RX_PKTLEN	    65535
+
+/* A list of pages registered with the device during setup and used by a queue
+ * as buffers
+ */
+struct gve_queue_page_list {
+	uint32_t id; /* unique id */
+	uint32_t num_entries;
+	dma_addr_t *page_buses; /* the dma addrs of the pages */
+	const struct rte_memzone *mz;
+};
+
+/* A TX desc ring entry */
+union gve_tx_desc {
+	struct gve_tx_pkt_desc pkt; /* first desc for a packet */
+	struct gve_tx_seg_desc seg; /* subsequent descs for a packet */
+};
+
+struct gve_tx_queue {
+	volatile union gve_tx_desc *tx_desc_ring;
+	const struct rte_memzone *mz;
+	uint64_t tx_ring_phys_addr;
+
+	uint16_t nb_tx_desc;
+
+	/* Only valid for GQI_QPL queue format */
+	struct gve_queue_page_list *qpl;
+
+	uint16_t port_id;
+	uint16_t queue_id;
+
+	uint16_t ntfy_id;
+	volatile rte_be32_t *ntfy_addr;
+
+	struct gve_priv *hw;
+	const struct rte_memzone *qres_mz;
+	struct gve_queue_resources *qres;
+
+	/* Only valid for DQO_RDA queue format */
+	struct gve_tx_queue *complq;
+};
+
+struct gve_rx_queue {
+	volatile struct gve_rx_desc *rx_desc_ring;
+	volatile union gve_rx_data_slot *rx_data_ring;
+	const struct rte_memzone *mz;
+	const struct rte_memzone *data_mz;
+	uint64_t rx_ring_phys_addr;
+
+	uint16_t nb_rx_desc;
+
+	volatile rte_be32_t *ntfy_addr;
+
+	/* Only valid for GQI_QPL queue format */
+	struct gve_queue_page_list *qpl;
+
+	struct gve_priv *hw;
+	const struct rte_memzone *qres_mz;
+	struct gve_queue_resources *qres;
+
+	uint16_t port_id;
+	uint16_t queue_id;
+	uint16_t ntfy_id;
+	uint16_t rx_buf_len;
+
+	/* Only valid for DQO_RDA queue format */
+	struct gve_rx_queue *bufq;
+};
+
+struct gve_priv {
+	struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
+	const struct rte_memzone *irq_dbs_mz;
+	uint32_t mgmt_msix_idx;
+	rte_be32_t *cnt_array; /* array of num_event_counters */
+	const struct rte_memzone *cnt_array_mz;
+
+	uint16_t num_event_counters;
+	uint16_t tx_desc_cnt; /* txq size */
+	uint16_t rx_desc_cnt; /* rxq size */
+	uint16_t tx_pages_per_qpl; /* tx buffer length */
+	uint16_t rx_data_slot_cnt; /* rx buffer length */
+
+	/* Only valid for DQO_RDA queue format */
+	uint16_t tx_compq_size; /* tx completion queue size */
+	uint16_t rx_bufq_size; /* rx buff queue size */
+
+	uint64_t max_registered_pages;
+	uint64_t num_registered_pages; /* num pages registered with NIC */
+	uint16_t default_num_queues; /* default num queues to set up */
+	enum gve_queue_format queue_format; /* see enum gve_queue_format */
+	uint8_t enable_rsc;
+
+	uint16_t max_nb_txq;
+	uint16_t max_nb_rxq;
+	uint32_t num_ntfy_blks; /* split between TX and RX so must be even */
+
+	struct gve_registers __iomem *reg_bar0; /* see gve_register.h */
+	rte_be32_t __iomem *db_bar2; /* "array" of doorbells */
+	struct rte_pci_device *pci_dev;
+
+	/* Admin queue - see gve_adminq.h */
+	union gve_adminq_command *adminq;
+	struct gve_dma_mem adminq_dma_mem;
+	uint32_t adminq_mask; /* masks prod_cnt to adminq size */
+	uint32_t adminq_prod_cnt; /* free-running count of AQ cmds executed */
+	uint32_t adminq_cmd_fail; /* free-running count of AQ cmds failed */
+	uint32_t adminq_timeouts; /* free-running count of AQ cmd timeouts */
+	/* free-running counts of executed AQ cmds, one per cmd type */
+	uint32_t adminq_describe_device_cnt;
+	uint32_t adminq_cfg_device_resources_cnt;
+	uint32_t adminq_register_page_list_cnt;
+	uint32_t adminq_unregister_page_list_cnt;
+	uint32_t adminq_create_tx_queue_cnt;
+	uint32_t adminq_create_rx_queue_cnt;
+	uint32_t adminq_destroy_tx_queue_cnt;
+	uint32_t adminq_destroy_rx_queue_cnt;
+	uint32_t adminq_dcfg_device_resources_cnt;
+	uint32_t adminq_set_driver_parameter_cnt;
+	uint32_t adminq_report_stats_cnt;
+	uint32_t adminq_report_link_speed_cnt;
+	uint32_t adminq_get_ptype_map_cnt;
+
+	volatile uint32_t state_flags;
+
+	/* Gvnic device link speed from hypervisor. */
+	uint64_t link_speed;
+
+	uint16_t max_mtu;
+	struct rte_ether_addr dev_addr; /* mac address */
+
+	struct gve_queue_page_list *qpl;
+
+	struct gve_tx_queue **txqs;
+	struct gve_rx_queue **rxqs;
+};
+
+static inline bool
+gve_is_gqi(struct gve_priv *priv)
+{
+	return priv->queue_format == GVE_GQI_RDA_FORMAT ||
+		priv->queue_format == GVE_GQI_QPL_FORMAT;
+}
+
+static inline bool
+gve_get_admin_queue_ok(struct gve_priv *priv)
+{
+	return !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_ADMIN_QUEUE_OK,
+				       &priv->state_flags);
+}
+
+static inline void
+gve_set_admin_queue_ok(struct gve_priv *priv)
+{
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_ADMIN_QUEUE_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_admin_queue_ok(struct gve_priv *priv)
+{
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_ADMIN_QUEUE_OK,
+				&priv->state_flags);
+}
+
+static inline bool
+gve_get_device_resources_ok(struct gve_priv *priv)
+{
+	return !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK,
+				       &priv->state_flags);
+}
+
+static inline void
+gve_set_device_resources_ok(struct gve_priv *priv)
+{
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_device_resources_ok(struct gve_priv *priv)
+{
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK,
+				&priv->state_flags);
+}
+
+static inline bool
+gve_get_device_rings_ok(struct gve_priv *priv)
+{
+	return !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_DEVICE_RINGS_OK,
+				       &priv->state_flags);
+}
+
+static inline void
+gve_set_device_rings_ok(struct gve_priv *priv)
+{
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_DEVICE_RINGS_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_device_rings_ok(struct gve_priv *priv)
+{
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_DEVICE_RINGS_OK,
+				&priv->state_flags);
+}
+
+#endif /* _GVE_ETHDEV_H_ */
diff --git a/drivers/net/gve/meson.build b/drivers/net/gve/meson.build
new file mode 100644
index 0000000000..d8ec64b3a3
--- /dev/null
+++ b/drivers/net/gve/meson.build
@@ -0,0 +1,14 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2022 Intel Corporation
+
+if is_windows
+    build = false
+    reason = 'not supported on Windows'
+    subdir_done()
+endif
+
+sources = files(
+        'base/gve_adminq.c',
+        'gve_ethdev.c',
+)
+includes += include_directories('base')
diff --git a/drivers/net/gve/version.map b/drivers/net/gve/version.map
new file mode 100644
index 0000000000..c2e0723b4c
--- /dev/null
+++ b/drivers/net/gve/version.map
@@ -0,0 +1,3 @@ 
+DPDK_22 {
+	local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index e35652fe63..f1a0ee2cef 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -23,6 +23,7 @@  drivers = [
         'enic',
         'failsafe',
         'fm10k',
+        'gve',
         'hinic',
         'hns3',
         'i40e',