List patch comments

GET /api/patches/46/comments/?format=api
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Link: <http://patches.dpdk.org/api/patches/46/comments/?format=api&page=1>; rel="first", <http://patches.dpdk.org/api/patches/46/comments/?format=api&page=1>; rel="last"
Vary: Accept
[ { "id": 110, "web_url": "http://patches.dpdk.org/comment/110/", "msgid": "<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com", "date": "2014-07-15T00:15:49", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "According to my performance measurements for small (64B) packets, 1 queue performs better than 16 queues (1.35M pps vs. 0.93M pps). This makes sense to me: in the 16-queue case, more CPU cycles are spent in kernel land (87% for 16 queues vs. 80% for 1 queue) because the NAPI-enabled ixgbe driver has to switch between polling and interrupt modes to service per-queue rx interrupts, so more context-switch overhead is involved. Also, since the eth_packet_rx/eth_packet_tx routines involve two memory copies between the DPDK mbuf and pbuf for each packet, the PMD can hardly achieve high performance unless packets are DMA'd directly into mbufs, which would need support from the ixgbe driver.\n\n> -----Original Message-----\n> From: John W. Linville [mailto:linville@tuxdriver.com]\n> Sent: Tuesday, July 15, 2014 2:25 AM\n> To: dev@dpdk.org\n> Cc: Thomas Monjalon; Richardson, Bruce; Zhou, Danny\n> Subject: [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual\n> devices\n> \n> This is a Linux-specific virtual PMD driver backed by an AF_PACKET socket. This\n> implementation uses mmap'ed ring buffers to limit copying and user/kernel\n> transitions. The PACKET_FANOUT_HASH behavior of AF_PACKET is used for\n> frame reception. 
In the current implementation, Tx and Rx queues are always paired,\n> and therefore are always equal in number -- changing this would be a Simple Matter\n> Of Programming.\n> \n> Interfaces of this type are created with a command line option like\n> \"--vdev=eth_packet0,iface=...\". There are a number of options availabe as\n> arguments:\n> \n> - Interface is chosen by \"iface\" (required)\n> - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> \n> Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> ---\n> This PMD is intended to provide a means for using DPDK on a broad range of\n> hardware without hardware-specific PMDs and (hopefully) with better performance\n> than what PCAP offers in Linux. This might be useful as a development platform for\n> DPDK applications when DPDK-supported hardware is expensive or unavailable.\n> \n> New in v2:\n> \n> -- fixup some style issues found by check patch\n> -- use if_index as part of fanout group ID\n> -- set default number of queue pairs to 1\n> \n> config/common_bsdapp | 5 +\n> config/common_linuxapp | 5 +\n> lib/Makefile | 1 +\n> lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> lib/librte_pmd_packet/Makefile | 60 +++\n> lib/librte_pmd_packet/rte_eth_packet.c | 826\n> +++++++++++++++++++++++++++++++++\n> lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> mk/rte.app.mk | 4 +\n> 8 files changed, 957 insertions(+)\n> create mode 100644 lib/librte_pmd_packet/Makefile create mode 100644\n> lib/librte_pmd_packet/rte_eth_packet.c\n> create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> \n> diff --git a/config/common_bsdapp b/config/common_bsdapp index\n> 943dce8f1ede..c317f031278e 100644\n> --- a/config/common_bsdapp\n> +++ b/config/common_bsdapp\n> @@ -226,6 +226,11 @@ 
CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> CONFIG_RTE_LIBRTE_PMD_BOND=y\n> \n> #\n> +# Compile software PMD backed by AF_PACKET sockets (Linux only) #\n> +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> +\n> +#\n> # Do prefetch of packet data within PMD driver receive function #\n> CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/config/common_linuxapp\n> b/config/common_linuxapp index 7bf5d80d4e26..f9e7bc3015ec 100644\n> --- a/config/common_linuxapp\n> +++ b/config/common_linuxapp\n> @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> CONFIG_RTE_LIBRTE_PMD_BOND=y\n> \n> #\n> +# Compile software PMD backed by AF_PACKET sockets (Linux only) #\n> +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> +\n> +#\n> # Compile Xen PMD\n> #\n> CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> diff --git a/lib/Makefile b/lib/Makefile index 10c5bb3045bc..930fadf29898 100644\n> --- a/lib/Makefile\n> +++ b/lib/Makefile\n> @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=\n> librte_pmd_i40e\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt diff --git\n> a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> index 756d6b0c9301..feed24a63272 100644\n> --- a/lib/librte_eal/linuxapp/eal/Makefile\n> +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether CFLAGS +=\n> -I$(RTE_SDK)/lib/librte_ivshmem CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> CFLAGS += $(WERROR_FLAGS) -O3\n> \n> diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile new 
file\n> mode 100644 index 000000000000..e1266fb992cd\n> --- /dev/null\n> +++ b/lib/librte_pmd_packet/Makefile\n> @@ -0,0 +1,60 @@\n> +# BSD LICENSE\n> +#\n> +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> +# Copyright(c) 2014 6WIND S.A.\n> +# All rights reserved.\n> +#\n> +# Redistribution and use in source and binary forms, with or without\n> +# modification, are permitted provided that the following conditions\n> +# are met:\n> +#\n> +# * Redistributions of source code must retain the above copyright\n> +# notice, this list of conditions and the following disclaimer.\n> +# * Redistributions in binary form must reproduce the above copyright\n> +# notice, this list of conditions and the following disclaimer in\n> +# the documentation and/or other materials provided with the\n> +# distribution.\n> +# * Neither the name of Intel Corporation nor the names of its\n> +# contributors may be used to endorse or promote products derived\n> +# from this software without specific prior written permission.\n> +#\n> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> CONTRIBUTORS\n> +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT\n> NOT\n> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> FITNESS FOR\n> +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> COPYRIGHT\n> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> INCIDENTAL,\n> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n> NOT\n> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> LOSS OF USE,\n> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n> AND ON ANY\n> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF\n> THE USE\n> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> DAMAGE.\n> +\n> +include $(RTE_SDK)/mk/rte.vars.mk\n> +\n> +#\n> +# library name\n> +#\n> +LIB = librte_pmd_packet.a\n> +\n> +CFLAGS += -O3\n> +CFLAGS += $(WERROR_FLAGS)\n> +\n> +#\n> +# all source are stored in SRCS-y\n> +#\n> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> +\n> +#\n> +# Export include files\n> +#\n> +SYMLINK-y-include += rte_eth_packet.h\n> +\n> +# this lib depends upon:\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> +\n> +include $(RTE_SDK)/mk/rte.lib.mk\n> diff --git a/lib/librte_pmd_packet/rte_eth_packet.c\n> b/lib/librte_pmd_packet/rte_eth_packet.c\n> new file mode 100644\n> index 000000000000..9c82d16e730f\n> --- /dev/null\n> +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> @@ -0,0 +1,826 @@\n> +/*-\n> + * BSD LICENSE\n> + *\n> + * Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>\n> + *\n> + * Originally based upon librte_pmd_pcap code:\n> + *\n> + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> + * Copyright(c) 2014 6WIND S.A.\n> + * All rights reserved.\n> + *\n> + * Redistribution and use in source and binary forms, with or without\n> + * modification, are permitted provided that the following conditions\n> + * are met:\n> + *\n> + * * Redistributions of source code must retain the above copyright\n> + * notice, this list of conditions and the following disclaimer.\n> + * * Redistributions in binary form must reproduce the above copyright\n> + * notice, this list of conditions and the following disclaimer in\n> + * the documentation and/or other materials provided with the\n> + * distribution.\n> + * * Neither the name of Intel Corporation nor the names of its\n> + * contributors may be used to endorse or promote products derived\n> + * from this software without specific prior written permission.\n> + *\n> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> CONTRIBUTORS\n> + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT\n> NOT\n> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> FITNESS FOR\n> + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> COPYRIGHT\n> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> INCIDENTAL,\n> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n> NOT\n> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> LOSS OF USE,\n> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n> AND ON ANY\n> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> TORT\n> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF\n> THE USE\n> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> DAMAGE.\n> + */\n> +\n> +#include <rte_mbuf.h>\n> +#include <rte_ethdev.h>\n> +#include <rte_malloc.h>\n> +#include <rte_kvargs.h>\n> +#include <rte_dev.h>\n> +\n> +#include <linux/if_ether.h>\n> +#include <linux/if_packet.h>\n> +#include <arpa/inet.h>\n> +#include <net/if.h>\n> +#include <sys/types.h>\n> +#include <sys/socket.h>\n> +#include <sys/ioctl.h>\n> +#include <sys/mman.h>\n> +#include <unistd.h>\n> +#include <poll.h>\n> +\n> +#include \"rte_eth_packet.h\"\n> +\n> +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> +\n> +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> +#define DFLT_FRAME_COUNT\t(1 << 9)\n> +\n> +struct pkt_rx_queue {\n> +\tint sockfd;\n> +\n> +\tstruct iovec *rd;\n> +\tuint8_t *map;\n> +\tunsigned int framecount;\n> +\tunsigned int framenum;\n> +\n> +\tstruct rte_mempool *mb_pool;\n> +\n> +\tvolatile unsigned long rx_pkts;\n> +\tvolatile unsigned long err_pkts;\n> +};\n> +\n> +struct pkt_tx_queue {\n> +\tint sockfd;\n> +\n> +\tstruct iovec *rd;\n> +\tuint8_t *map;\n> +\tunsigned int framecount;\n> +\tunsigned int framenum;\n> +\n> +\tvolatile unsigned long tx_pkts;\n> +\tvolatile unsigned long err_pkts;\n> +};\n> +\n> +struct pmd_internals {\n> 
+\tunsigned nb_queues;\n> +\n> +\tint if_index;\n> +\tstruct ether_addr eth_addr;\n> +\n> +\tstruct tpacket_req req;\n> +\n> +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> +};\n> +\n> +static const char *valid_arguments[] = {\n> +\tETH_PACKET_IFACE_ARG,\n> +\tETH_PACKET_NUM_Q_ARG,\n> +\tETH_PACKET_BLOCKSIZE_ARG,\n> +\tETH_PACKET_FRAMESIZE_ARG,\n> +\tETH_PACKET_FRAMECOUNT_ARG,\n> +\tNULL\n> +};\n> +\n> +static const char *drivername = \"AF_PACKET PMD\";\n> +\n> +static struct rte_eth_link pmd_link = {\n> +\t.link_speed = 10000,\n> +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> +\t.link_status = 0\n> +};\n> +\n> +static uint16_t\n> +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) {\n> +\tunsigned i;\n> +\tstruct tpacket2_hdr *ppd;\n> +\tstruct rte_mbuf *mbuf;\n> +\tuint8_t *pbuf;\n> +\tstruct pkt_rx_queue *pkt_q = queue;\n> +\tuint16_t num_rx = 0;\n> +\tunsigned int framecount, framenum;\n> +\n> +\tif (unlikely(nb_pkts == 0))\n> +\t\treturn 0;\n> +\n> +\t/*\n> +\t * Reads the given number of packets from the AF_PACKET socket one by\n> +\t * one and copies the packet data into a newly allocated mbuf.\n> +\t */\n> +\tframecount = pkt_q->framecount;\n> +\tframenum = pkt_q->framenum;\n> +\tfor (i = 0; i < nb_pkts; i++) {\n> +\t\t/* point at the next incoming frame */\n> +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> +\t\t\tbreak;\n> +\n> +\t\t/* allocate the next mbuf */\n> +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> +\t\tif (unlikely(mbuf == NULL))\n> +\t\t\tbreak;\n> +\n> +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> +\n> +\t\t/* release incoming frame and advance ring buffer */\n> +\t\tppd->tp_status = 
TP_STATUS_KERNEL;\n> +\t\tif (++framenum >= framecount)\n> +\t\t\tframenum = 0;\n> +\n> +\t\t/* account for the receive frame */\n> +\t\tbufs[i] = mbuf;\n> +\t\tnum_rx++;\n> +\t}\n> +\tpkt_q->framenum = framenum;\n> +\tpkt_q->rx_pkts += num_rx;\n> +\treturn num_rx;\n> +}\n> +\n> +/*\n> + * Callback to handle sending packets through a real NIC.\n> + */\n> +static uint16_t\n> +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) {\n> +\tstruct tpacket2_hdr *ppd;\n> +\tstruct rte_mbuf *mbuf;\n> +\tuint8_t *pbuf;\n> +\tunsigned int framecount, framenum;\n> +\tstruct pollfd pfd;\n> +\tstruct pkt_tx_queue *pkt_q = queue;\n> +\tuint16_t num_tx = 0;\n> +\tint i;\n> +\n> +\tif (unlikely(nb_pkts == 0))\n> +\t\treturn 0;\n> +\n> +\tmemset(&pfd, 0, sizeof(pfd));\n> +\tpfd.fd = pkt_q->sockfd;\n> +\tpfd.events = POLLOUT;\n> +\tpfd.revents = 0;\n> +\n> +\tframecount = pkt_q->framecount;\n> +\tframenum = pkt_q->framenum;\n> +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> +\tfor (i = 0; i < nb_pkts; i++) {\n> +\t\t/* point at the next incoming frame */\n> +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> +\t\t (poll(&pfd, 1, -1) < 0))\n> +\t\t\t\tcontinue;\n> +\n> +\t\t/* copy the tx frame data */\n> +\t\tmbuf = bufs[num_tx];\n> +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> +\t\t\tsizeof(struct sockaddr_ll);\n> +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> +\n> +\t\t/* release incoming frame and advance ring buffer */\n> +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> +\t\tif (++framenum >= framecount)\n> +\t\t\tframenum = 0;\n> +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> +\n> +\t\tnum_tx++;\n> +\t\trte_pktmbuf_free(mbuf);\n> +\t}\n> +\n> +\t/* kick-off transmits */\n> +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> +\n> +\tpkt_q->framenum = framenum;\n> +\tpkt_q->tx_pkts += num_tx;\n> +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> 
+\treturn num_tx;\n> +}\n> +\n> +static int\n> +eth_dev_start(struct rte_eth_dev *dev)\n> +{\n> +\tdev->data->dev_link.link_status = 1;\n> +\treturn 0;\n> +}\n> +\n> +/*\n> + * This function gets called when the current port gets stopped.\n> + */\n> +static void\n> +eth_dev_stop(struct rte_eth_dev *dev)\n> +{\n> +\tunsigned i;\n> +\tint sockfd;\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\n> +\tfor (i = 0; i < internals->nb_queues; i++) {\n> +\t\tsockfd = internals->rx_queue[i].sockfd;\n> +\t\tif (sockfd != -1)\n> +\t\t\tclose(sockfd);\n> +\t\tsockfd = internals->tx_queue[i].sockfd;\n> +\t\tif (sockfd != -1)\n> +\t\t\tclose(sockfd);\n> +\t}\n> +\n> +\tdev->data->dev_link.link_status = 0;\n> +}\n> +\n> +static int\n> +eth_dev_configure(struct rte_eth_dev *dev __rte_unused) {\n> +\treturn 0;\n> +}\n> +\n> +static void\n> +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info\n> +*dev_info) {\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\n> +\tdev_info->driver_name = drivername;\n> +\tdev_info->if_index = internals->if_index;\n> +\tdev_info->max_mac_addrs = 1;\n> +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> +\tdev_info->min_rx_bufsize = 0;\n> +\tdev_info->pci_dev = NULL;\n> +}\n> +\n> +static void\n> +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)\n> +{\n> +\tunsigned i, imax;\n> +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> +\n> +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> +\n> +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> +\tfor (i = 0; i < imax; i++) {\n> +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> +\t\trx_total += igb_stats->q_ipackets[i];\n> +\t}\n> +\n> 
+\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> +\tfor (i = 0; i < imax; i++) {\n> +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> +\t\ttx_total += igb_stats->q_opackets[i];\n> +\t\ttx_err_total += igb_stats->q_errors[i];\n> +\t}\n> +\n> +\tigb_stats->ipackets = rx_total;\n> +\tigb_stats->opackets = tx_total;\n> +\tigb_stats->oerrors = tx_err_total;\n> +}\n> +\n> +static void\n> +eth_stats_reset(struct rte_eth_dev *dev) {\n> +\tunsigned i;\n> +\tstruct pmd_internals *internal = dev->data->dev_private;\n> +\n> +\tfor (i = 0; i < internal->nb_queues; i++)\n> +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> +\n> +\tfor (i = 0; i < internal->nb_queues; i++) {\n> +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> +\t\tinternal->tx_queue[i].err_pkts = 0;\n> +\t}\n> +}\n> +\n> +static void\n> +eth_dev_close(struct rte_eth_dev *dev __rte_unused) { }\n> +\n> +static void\n> +eth_queue_release(void *q __rte_unused) { }\n> +\n> +static int\n> +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> + int wait_to_complete __rte_unused) {\n> +\treturn 0;\n> +}\n> +\n> +static int\n> +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> + uint16_t rx_queue_id,\n> + uint16_t nb_rx_desc __rte_unused,\n> + unsigned int socket_id __rte_unused,\n> + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> + struct rte_mempool *mb_pool) {\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> +\tuint16_t buf_size;\n> +\n> +\tpkt_q->mb_pool = mb_pool;\n> +\n> +\t/* Now get the space available for data in the mbuf */\n> +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> +\t RTE_PKTMBUF_HEADROOM);\n> +\n> +\tif (ETH_FRAME_LEN > buf_size) {\n> +\t\tRTE_LOG(ERR, 
PMD,\n> +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> +\t\treturn -ENOMEM;\n> +\t}\n> +\n> +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> +\n> +\treturn 0;\n> +}\n> +\n> +static int\n> +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> + uint16_t tx_queue_id,\n> + uint16_t nb_tx_desc __rte_unused,\n> + unsigned int socket_id __rte_unused,\n> + const struct rte_eth_txconf *tx_conf __rte_unused) {\n> +\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\n> +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> +\treturn 0;\n> +}\n> +\n> +static struct eth_dev_ops ops = {\n> +\t.dev_start = eth_dev_start,\n> +\t.dev_stop = eth_dev_stop,\n> +\t.dev_close = eth_dev_close,\n> +\t.dev_configure = eth_dev_configure,\n> +\t.dev_infos_get = eth_dev_info,\n> +\t.rx_queue_setup = eth_rx_queue_setup,\n> +\t.tx_queue_setup = eth_tx_queue_setup,\n> +\t.rx_queue_release = eth_queue_release,\n> +\t.tx_queue_release = eth_queue_release,\n> +\t.link_update = eth_link_update,\n> +\t.stats_get = eth_stats_get,\n> +\t.stats_reset = eth_stats_reset,\n> +};\n> +\n> +/*\n> + * Opens an AF_PACKET socket\n> + */\n> +static int\n> +open_packet_iface(const char *key __rte_unused,\n> + const char *value __rte_unused,\n> + void *extra_args)\n> +{\n> +\tint *sockfd = extra_args;\n> +\n> +\t/* Open an AF_PACKET socket... 
*/\n> +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> +\tif (*sockfd == -1) {\n> +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> +\t\treturn -1;\n> +\t}\n> +\n> +\treturn 0;\n> +}\n> +\n> +static int\n> +rte_pmd_init_internals(const char *name,\n> + const int sockfd,\n> + const unsigned nb_queues,\n> + unsigned int blocksize,\n> + unsigned int blockcnt,\n> + unsigned int framesize,\n> + unsigned int framecnt,\n> + const unsigned numa_node,\n> + struct pmd_internals **internals,\n> + struct rte_eth_dev **eth_dev,\n> + struct rte_kvargs *kvlist) {\n> +\tstruct rte_eth_dev_data *data = NULL;\n> +\tstruct rte_pci_device *pci_dev = NULL;\n> +\tstruct rte_kvargs_pair *pair = NULL;\n> +\tstruct ifreq ifr;\n> +\tsize_t ifnamelen;\n> +\tunsigned k_idx;\n> +\tstruct sockaddr_ll sockaddr;\n> +\tstruct tpacket_req *req;\n> +\tstruct pkt_rx_queue *rx_queue;\n> +\tstruct pkt_tx_queue *tx_queue;\n> +\tint rc, tpver, discard, bypass;\n> +\tunsigned int i, q, rdsize;\n> +\tint qsockfd, fanout_arg;\n> +\n> +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> +\t\tpair = &kvlist->pairs[k_idx];\n> +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> +\t\t\tbreak;\n> +\t}\n> +\tif (pair == NULL) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> +\t\t name);\n> +\t\tgoto error;\n> +\t}\n> +\n> +\tRTE_LOG(INFO, PMD,\n> +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> +\t\tname, numa_node);\n> +\n> +\t/*\n> +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> +\t * and internal (private) data\n> +\t */\n> +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> +\tif (data == NULL)\n> +\t\tgoto error;\n> +\n> +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> +\tif (pci_dev == NULL)\n> +\t\tgoto error;\n> +\n> +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> +\t 0, numa_node);\n> +\tif (*internals == 
NULL)\n> +\t\tgoto error;\n> +\n> +\treq = &((*internals)->req);\n> +\n> +\treq->tp_block_size = blocksize;\n> +\treq->tp_block_nr = blockcnt;\n> +\treq->tp_frame_size = framesize;\n> +\treq->tp_frame_nr = framecnt;\n> +\n> +\tifnamelen = strlen(pair->value);\n> +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> +\t} else {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> +\t\t\tname, pair->value);\n> +\t\tgoto error;\n> +\t}\n> +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> +\t\t name);\n> +\t\tgoto error;\n> +\t}\n> +\t(*internals)->if_index = ifr.ifr_ifindex;\n> +\n> +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> +\t\t name);\n> +\t\tgoto error;\n> +\t}\n> +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> +\n> +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> +\tsockaddr.sll_family = AF_PACKET;\n> +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> +\n> +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> +\n> +\tfor (q = 0; q < nb_queues; q++) {\n> +\t\t/* Open an AF_PACKET socket for this queue... 
*/\n> +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> +\t\tif (qsockfd == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> +\t\t\t name);\n> +\t\t\treturn -1;\n> +\t\t}\n> +\n> +\t\ttpver = TPACKET_V2;\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> +\t\t\t\t&tpver, sizeof(tpver));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\tdiscard = 1;\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> +\t\t\t\t&discard, sizeof(discard));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\tbypass = 1;\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> +\t\t\t\t&bypass, sizeof(bypass));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> +\t\t\t pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req,\n> sizeof(*req));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req,\n> sizeof(*req));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trx_queue = &((*internals)->rx_queue[q]);\n> +\t\trx_queue->framecount = req->tp_frame_nr;\n> +\n> +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> 
+\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED |\n> MAP_LOCKED,\n> +\t\t\t\t qsockfd, 0);\n> +\t\tif (rx_queue->map == MAP_FAILED) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> +\t\t\t\tname, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\t/* rdsize is same for both Tx and Rx */\n> +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> +\n> +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> +\t\t}\n> +\t\trx_queue->sockfd = qsockfd;\n> +\n> +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> +\t\ttx_queue->framecount = req->tp_frame_nr;\n> +\n> +\t\ttx_queue->map = rx_queue->map + req->tp_block_size *\n> +req->tp_block_nr;\n> +\n> +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> +\t\t}\n> +\t\ttx_queue->sockfd = qsockfd;\n> +\n> +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> +\t\t\t name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> +\t\t\t\t\"for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\t}\n> +\n> +\t/* reserve an ethdev entry */\n> +\t*eth_dev = rte_eth_dev_allocate(name);\n> +\tif (*eth_dev == NULL)\n> +\t\tgoto error;\n> +\n> +\t/*\n> +\t * now put it all together\n> +\t * - store queue data in 
internals,\n> +\t * - store numa_node info in pci_driver\n> +\t * - point eth_dev_data to internals and pci_driver\n> +\t * - and point eth_dev structure to new eth_dev_data structure\n> +\t */\n> +\n> +\t(*internals)->nb_queues = nb_queues;\n> +\n> +\tdata->dev_private = *internals;\n> +\tdata->port_id = (*eth_dev)->data->port_id;\n> +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> +\tdata->dev_link = pmd_link;\n> +\tdata->mac_addrs = &(*internals)->eth_addr;\n> +\n> +\tpci_dev->numa_node = numa_node;\n> +\n> +\t(*eth_dev)->data = data;\n> +\t(*eth_dev)->dev_ops = &ops;\n> +\t(*eth_dev)->pci_dev = pci_dev;\n> +\n> +\treturn 0;\n> +\n> +error:\n> +\tif (data)\n> +\t\trte_free(data);\n> +\tif (pci_dev)\n> +\t\trte_free(pci_dev);\n> +\tfor (q = 0; q < nb_queues; q++) {\n> +\t\tif ((*internals)->rx_queue[q].rd)\n> +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> +\t\tif ((*internals)->tx_queue[q].rd)\n> +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> +\t}\n> +\tif (*internals)\n> +\t\trte_free(*internals);\n> +\treturn -1;\n> +}\n> +\n> +static int\n> +rte_eth_from_packet(const char *name,\n> + int const *sockfd,\n> + const unsigned numa_node,\n> + struct rte_kvargs *kvlist) {\n> +\tstruct pmd_internals *internals = NULL;\n> +\tstruct rte_eth_dev *eth_dev = NULL;\n> +\tstruct rte_kvargs_pair *pair = NULL;\n> +\tunsigned k_idx;\n> +\tunsigned int blockcount;\n> +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> +\tunsigned int qpairs = 1;\n> +\n> +\t/* do some parameter checking */\n> +\tif (*sockfd < 0)\n> +\t\treturn -1;\n> +\n> +\t/*\n> +\t * Walk arguments for configurable settings\n> +\t */\n> +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> +\t\tpair = &kvlist->pairs[k_idx];\n> +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> +\t\t\tqpairs = atoi(pair->value);\n> +\t\t\tif (qpairs < 1 ||\n> +\t\t\t 
qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> +\t\t\tblocksize = atoi(pair->value);\n> +\t\t\tif (!blocksize) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> +\t\t\tframesize = atoi(pair->value);\n> +\t\t\tif (!framesize) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> +\t\t\tframecount = atoi(pair->value);\n> +\t\t\tif (!framecount) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t}\n> +\n> +\tif (framesize > blocksize) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> +\t\t name);\n> +\t\treturn -1;\n> +\t}\n> +\n> +\tblockcount = framecount / (blocksize / framesize);\n> +\tif (!blockcount) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> +\t\treturn -1;\n> +\t}\n> +\n> +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> +\n> +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> +\t blocksize, blockcount,\n> +\t framesize, framecount,\n> +\t numa_node, 
&internals, &eth_dev,\n> +\t kvlist) < 0)\n> +\t\treturn -1;\n> +\n> +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> +\n> +\treturn 0;\n> +}\n> +\n> +int\n> +rte_pmd_packet_devinit(const char *name, const char *params) {\n> +\tunsigned numa_node;\n> +\tint ret;\n> +\tstruct rte_kvargs *kvlist;\n> +\tint sockfd = -1;\n> +\n> +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> +\n> +\tnuma_node = rte_socket_id();\n> +\n> +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> +\tif (kvlist == NULL)\n> +\t\treturn -1;\n> +\n> +\t/*\n> +\t * If iface argument is passed we open the NICs and use them for\n> +\t * reading / writing\n> +\t */\n> +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> +\n> +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> +\t\t &open_packet_iface, &sockfd);\n> +\t\tif (ret < 0)\n> +\t\t\treturn -1;\n> +\t}\n> +\n> +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> +\tclose(sockfd); /* no longer needed */\n> +\n> +\tif (ret < 0)\n> +\t\treturn -1;\n> +\n> +\treturn 0;\n> +}\n> +\n> +static struct rte_driver pmd_packet_drv = {\n> +\t.name = \"eth_packet\",\n> +\t.type = PMD_VDEV,\n> +\t.init = rte_pmd_packet_devinit,\n> +};\n> +\n> +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> diff --git a/lib/librte_pmd_packet/rte_eth_packet.h\n> b/lib/librte_pmd_packet/rte_eth_packet.h\n> new file mode 100644\n> index 000000000000..f685611da3e9\n> --- /dev/null\n> +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> @@ -0,0 +1,55 @@\n> +/*-\n> + * BSD LICENSE\n> + *\n> + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> + * All rights reserved.\n> + *\n> + * Redistribution and use in source and binary forms, with or without\n> + * modification, are permitted provided that the following conditions\n> + * are met:\n> + *\n> + * * Redistributions of source code must retain the above copyright\n> + * notice, this list of conditions and the following disclaimer.\n> + * * Redistributions in binary form must reproduce the above copyright\n> + * notice, this list of conditions and the following disclaimer in\n> + * the documentation and/or other materials provided with the\n> + * distribution.\n> + * * Neither the name of Intel Corporation nor the names of its\n> + * contributors may be used to endorse or promote products derived\n> + * from this software without specific prior written permission.\n> + *\n> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> CONTRIBUTORS\n> + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT\n> NOT\n> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> FITNESS FOR\n> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE\n> COPYRIGHT\n> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> INCIDENTAL,\n> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n> NOT\n> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> LOSS OF USE,\n> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n> AND ON ANY\n> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> TORT\n> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF\n> THE USE\n> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> DAMAGE.\n> + */\n> +\n> +#ifndef _RTE_ETH_PACKET_H_\n> +#define _RTE_ETH_PACKET_H_\n> +\n> +#ifdef __cplusplus\n> +extern \"C\" {\n> +#endif\n> +\n> +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> +\n> +#define RTE_PMD_PACKET_MAX_RINGS 16\n> +\n> +/**\n> + * For use by the EAL only. 
Called as part of EAL init to set up any\n> +dummy NICs\n> + * configured on command line.\n> + */\n> +int rte_pmd_packet_devinit(const char *name, const char *params);\n> +\n> +#ifdef __cplusplus\n> +}\n> +#endif\n> +\n> +#endif\n> diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 34dff2a02a05..a6994c4dbe93\n> 100644\n> --- a/mk/rte.app.mk\n> +++ b/mk/rte.app.mk\n> @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y) LDLIBS\n> += -lrte_pmd_pcap -lpcap endif\n> \n> +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> +LDLIBS += -lrte_pmd_packet\n> +endif\n> +\n> endif # plugins\n> \n> LDLIBS += $(EXECENV_LDLIBS)\n> --\n> 1.9.3", "headers": { "Return-Path": "<danny.zhou@intel.com>", "Received": [ "from mga01.intel.com (mga01.intel.com [192.55.52.88])\n\tby dpdk.org (Postfix) with ESMTP id E05C41FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 02:15:11 +0200 (CEST)", "from fmsmga001.fm.intel.com ([10.253.24.23])\n\tby fmsmga101.fm.intel.com with ESMTP; 14 Jul 2014 17:15:54 -0700", "from fmsmsx103.amr.corp.intel.com ([10.19.9.34])\n\tby fmsmga001.fm.intel.com with ESMTP; 14 Jul 2014 17:15:53 -0700", "from fmsmsx153.amr.corp.intel.com (10.19.17.7) by\n\tFMSMSX103.amr.corp.intel.com (10.19.9.34) with Microsoft SMTP Server\n\t(TLS) id 14.3.123.3; Mon, 14 Jul 2014 17:15:52 -0700", "from shsmsx101.ccr.corp.intel.com (10.239.4.153) by\n\tFMSMSX153.amr.corp.intel.com (10.19.17.7) with Microsoft SMTP Server\n\t(TLS) id 14.3.123.3; Mon, 14 Jul 2014 17:15:52 -0700", "from shsmsx104.ccr.corp.intel.com ([169.254.5.122]) by\n\tSHSMSX101.ccr.corp.intel.com ([169.254.1.81]) with mapi id\n\t14.03.0123.003; Tue, 15 Jul 2014 08:15:50 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.01,661,1400050800\"; d=\"scan'208\";a=\"561775553\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "\"John W. 
Linville\" <linville@tuxdriver.com>,\n\t\"dev@dpdk.org\" <dev@dpdk.org>", "Thread-Topic": "[PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based\n\tvirtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6JugPv4w", "Date": "Tue, 15 Jul 2014 00:15:49 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>", "In-Reply-To": "<1405362290-6753-1-git-send-email-linville@tuxdriver.com>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 00:15:13 -0000" }, "addressed": null }, { "id": 113, "web_url": "http://patches.dpdk.org/comment/113/", "msgid": "<20140715121743.GA14273@localhost.localdomain>", "list_archive_url": "https://inbox.dpdk.org/dev/20140715121743.GA14273@localhost.localdomain", "date": "2014-07-15T12:17:44", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", 
"name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> According to my performance measurement results for 64B small packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M pps) which make sense to me as for 16 queues case more CPU cycles (16 queues' 87% vs. 1 queue' 80%) in kernel land needed for NAPI-enabled ixgbe driver to switch between polling and interrupt modes in order to service per-queue rx interrupts, so more context switch overhead involved. Also, since the eth_packet_rx/eth_packet_tx routines involves in two memory copies between DPDK mbuf and pbuf for each packet, it can hardly achieve high performance unless packet are directly DMA to mbuf which needs ixgbe driver to support.\n\nI thought 16 queues would be spread out between as many cpus as you had though,\nobviating the need for context switches, no?\nNeil\n\n> \n> > -----Original Message-----\n> > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > Sent: Tuesday, July 15, 2014 2:25 AM\n> > To: dev@dpdk.org\n> > Cc: Thomas Monjalon; Richardson, Bruce; Zhou, Danny\n> > Subject: [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual\n> > devices\n> > \n> > This is a Linux-specific virtual PMD driver backed by an AF_PACKET socket. This\n> > implementation uses mmap'ed ring buffers to limit copying and user/kernel\n> > transitions. The PACKET_FANOUT_HASH behavior of AF_PACKET is used for\n> > frame reception. In the current implementation, Tx and Rx queues are always paired,\n> > and therefore are always equal in number -- changing this would be a Simple Matter\n> > Of Programming.\n> > \n> > Interfaces of this type are created with a command line option like\n> > \"--vdev=eth_packet0,iface=...\". 
There are a number of options availabe as\n> > arguments:\n> > \n> > - Interface is chosen by \"iface\" (required)\n> > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> > \n> > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > ---\n> > This PMD is intended to provide a means for using DPDK on a broad range of\n> > hardware without hardware-specific PMDs and (hopefully) with better performance\n> > than what PCAP offers in Linux. This might be useful as a development platform for\n> > DPDK applications when DPDK-supported hardware is expensive or unavailable.\n> > \n> > New in v2:\n> > \n> > -- fixup some style issues found by check patch\n> > -- use if_index as part of fanout group ID\n> > -- set default number of queue pairs to 1\n> > \n> > config/common_bsdapp | 5 +\n> > config/common_linuxapp | 5 +\n> > lib/Makefile | 1 +\n> > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > lib/librte_pmd_packet/Makefile | 60 +++\n> > lib/librte_pmd_packet/rte_eth_packet.c | 826\n> > +++++++++++++++++++++++++++++++++\n> > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > mk/rte.app.mk | 4 +\n> > 8 files changed, 957 insertions(+)\n> > create mode 100644 lib/librte_pmd_packet/Makefile create mode 100644\n> > lib/librte_pmd_packet/rte_eth_packet.c\n> > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > \n> > diff --git a/config/common_bsdapp b/config/common_bsdapp index\n> > 943dce8f1ede..c317f031278e 100644\n> > --- a/config/common_bsdapp\n> > +++ b/config/common_bsdapp\n> > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > \n> > #\n> > +# Compile software PMD backed by AF_PACKET sockets (Linux only) #\n> > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > +\n> > +#\n> > # Do 
prefetch of packet data within PMD driver receive function #\n> > CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/config/common_linuxapp\n> > b/config/common_linuxapp index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > --- a/config/common_linuxapp\n> > +++ b/config/common_linuxapp\n> > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > \n> > #\n> > +# Compile software PMD backed by AF_PACKET sockets (Linux only) #\n> > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > +\n> > +#\n> > # Compile Xen PMD\n> > #\n> > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > diff --git a/lib/Makefile b/lib/Makefile index 10c5bb3045bc..930fadf29898 100644\n> > --- a/lib/Makefile\n> > +++ b/lib/Makefile\n> > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=\n> > librte_pmd_i40e\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt diff --git\n> > a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > index 756d6b0c9301..feed24a63272 100644\n> > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether CFLAGS +=\n> > -I$(RTE_SDK)/lib/librte_ivshmem CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > CFLAGS += $(WERROR_FLAGS) -O3\n> > \n> > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile new file\n> > mode 100644 index 000000000000..e1266fb992cd\n> > --- /dev/null\n> > +++ b/lib/librte_pmd_packet/Makefile\n> 
> @@ -0,0 +1,60 @@\n> > +# BSD LICENSE\n> > +#\n> > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > +# Copyright(c) 2014 6WIND S.A.\n> > +# All rights reserved.\n> > +#\n> > +# Redistribution and use in source and binary forms, with or without\n> > +# modification, are permitted provided that the following conditions\n> > +# are met:\n> > +#\n> > +# * Redistributions of source code must retain the above copyright\n> > +# notice, this list of conditions and the following disclaimer.\n> > +# * Redistributions in binary form must reproduce the above copyright\n> > +# notice, this list of conditions and the following disclaimer in\n> > +# the documentation and/or other materials provided with the\n> > +# distribution.\n> > +# * Neither the name of Intel Corporation nor the names of its\n> > +# contributors may be used to endorse or promote products derived\n> > +# from this software without specific prior written permission.\n> > +#\n> > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> > CONTRIBUTORS\n> > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT\n> > NOT\n> > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> > FITNESS FOR\n> > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> > COPYRIGHT\n> > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> > INCIDENTAL,\n> > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n> > NOT\n> > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> > LOSS OF USE,\n> > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n> > AND ON ANY\n> > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF\n> > THE USE\n> > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> > DAMAGE.\n> > +\n> > +include $(RTE_SDK)/mk/rte.vars.mk\n> > +\n> > +#\n> > +# library name\n> > +#\n> > +LIB = librte_pmd_packet.a\n> > +\n> > +CFLAGS += -O3\n> > +CFLAGS += $(WERROR_FLAGS)\n> > +\n> > +#\n> > +# all source are stored in SRCS-y\n> > +#\n> > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > +\n> > +#\n> > +# Export include files\n> > +#\n> > +SYMLINK-y-include += rte_eth_packet.h\n> > +\n> > +# this lib depends upon:\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > +\n> > +include $(RTE_SDK)/mk/rte.lib.mk\n> > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c\n> > b/lib/librte_pmd_packet/rte_eth_packet.c\n> > new file mode 100644\n> > index 000000000000..9c82d16e730f\n> > --- /dev/null\n> > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > @@ -0,0 +1,826 @@\n> > +/*-\n> > + * BSD LICENSE\n> > + *\n> > + * Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>\n> > + *\n> > + * Originally based upon librte_pmd_pcap code:\n> > + *\n> > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > + * Copyright(c) 2014 6WIND S.A.\n> > + * All rights reserved.\n> > + *\n> > + * Redistribution and use in source and binary forms, with or without\n> > + * modification, are permitted provided that the following conditions\n> > + * are met:\n> > + *\n> > + * * Redistributions of source code must retain the above copyright\n> > + * notice, this list of conditions and the following disclaimer.\n> > + * * Redistributions in binary form must reproduce the above copyright\n> > + * notice, this list of conditions and the following disclaimer in\n> > + * the documentation and/or other materials provided with the\n> > + * distribution.\n> > + * * Neither the name of Intel Corporation nor the names of its\n> > + * contributors may be used to endorse or promote products derived\n> > + * from this software without specific prior written permission.\n> > + *\n> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> > CONTRIBUTORS\n> > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT\n> > NOT\n> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> > FITNESS FOR\n> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> > COPYRIGHT\n> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> > INCIDENTAL,\n> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n> > NOT\n> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> > LOSS OF USE,\n> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n> > AND ON ANY\n> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> > TORT\n> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF\n> > THE USE\n> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> > DAMAGE.\n> > + */\n> > +\n> > +#include <rte_mbuf.h>\n> > +#include <rte_ethdev.h>\n> > +#include <rte_malloc.h>\n> > +#include <rte_kvargs.h>\n> > +#include <rte_dev.h>\n> > +\n> > +#include <linux/if_ether.h>\n> > +#include <linux/if_packet.h>\n> > +#include <arpa/inet.h>\n> > +#include <net/if.h>\n> > +#include <sys/types.h>\n> > +#include <sys/socket.h>\n> > +#include <sys/ioctl.h>\n> > +#include <sys/mman.h>\n> > +#include <unistd.h>\n> > +#include <poll.h>\n> > +\n> > +#include \"rte_eth_packet.h\"\n> > +\n> > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > +\n> > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > +\n> > +struct pkt_rx_queue {\n> > +\tint sockfd;\n> > +\n> > +\tstruct iovec *rd;\n> > +\tuint8_t *map;\n> > +\tunsigned int framecount;\n> > +\tunsigned int framenum;\n> > +\n> > +\tstruct rte_mempool *mb_pool;\n> > +\n> > +\tvolatile unsigned long rx_pkts;\n> > +\tvolatile unsigned long err_pkts;\n> > +};\n> > +\n> > +struct pkt_tx_queue {\n> > +\tint sockfd;\n> > +\n> > +\tstruct iovec *rd;\n> > +\tuint8_t *map;\n> > +\tunsigned int framecount;\n> > +\tunsigned int 
framenum;\n> > +\n> > +\tvolatile unsigned long tx_pkts;\n> > +\tvolatile unsigned long err_pkts;\n> > +};\n> > +\n> > +struct pmd_internals {\n> > +\tunsigned nb_queues;\n> > +\n> > +\tint if_index;\n> > +\tstruct ether_addr eth_addr;\n> > +\n> > +\tstruct tpacket_req req;\n> > +\n> > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > +};\n> > +\n> > +static const char *valid_arguments[] = {\n> > +\tETH_PACKET_IFACE_ARG,\n> > +\tETH_PACKET_NUM_Q_ARG,\n> > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > +\tETH_PACKET_FRAMESIZE_ARG,\n> > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > +\tNULL\n> > +};\n> > +\n> > +static const char *drivername = \"AF_PACKET PMD\";\n> > +\n> > +static struct rte_eth_link pmd_link = {\n> > +\t.link_speed = 10000,\n> > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > +\t.link_status = 0\n> > +};\n> > +\n> > +static uint16_t\n> > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) {\n> > +\tunsigned i;\n> > +\tstruct tpacket2_hdr *ppd;\n> > +\tstruct rte_mbuf *mbuf;\n> > +\tuint8_t *pbuf;\n> > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > +\tuint16_t num_rx = 0;\n> > +\tunsigned int framecount, framenum;\n> > +\n> > +\tif (unlikely(nb_pkts == 0))\n> > +\t\treturn 0;\n> > +\n> > +\t/*\n> > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > +\t * one and copies the packet data into a newly allocated mbuf.\n> > +\t */\n> > +\tframecount = pkt_q->framecount;\n> > +\tframenum = pkt_q->framenum;\n> > +\tfor (i = 0; i < nb_pkts; i++) {\n> > +\t\t/* point at the next incoming frame */\n> > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > +\t\t\tbreak;\n> > +\n> > +\t\t/* allocate the next mbuf */\n> > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > +\t\tif (unlikely(mbuf == NULL))\n> > +\t\t\tbreak;\n> > +\n> > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > 
+\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > +\n> > +\t\t/* release incoming frame and advance ring buffer */\n> > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > +\t\tif (++framenum >= framecount)\n> > +\t\t\tframenum = 0;\n> > +\n> > +\t\t/* account for the receive frame */\n> > +\t\tbufs[i] = mbuf;\n> > +\t\tnum_rx++;\n> > +\t}\n> > +\tpkt_q->framenum = framenum;\n> > +\tpkt_q->rx_pkts += num_rx;\n> > +\treturn num_rx;\n> > +}\n> > +\n> > +/*\n> > + * Callback to handle sending packets through a real NIC.\n> > + */\n> > +static uint16_t\n> > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) {\n> > +\tstruct tpacket2_hdr *ppd;\n> > +\tstruct rte_mbuf *mbuf;\n> > +\tuint8_t *pbuf;\n> > +\tunsigned int framecount, framenum;\n> > +\tstruct pollfd pfd;\n> > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > +\tuint16_t num_tx = 0;\n> > +\tint i;\n> > +\n> > +\tif (unlikely(nb_pkts == 0))\n> > +\t\treturn 0;\n> > +\n> > +\tmemset(&pfd, 0, sizeof(pfd));\n> > +\tpfd.fd = pkt_q->sockfd;\n> > +\tpfd.events = POLLOUT;\n> > +\tpfd.revents = 0;\n> > +\n> > +\tframecount = pkt_q->framecount;\n> > +\tframenum = pkt_q->framenum;\n> > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > +\tfor (i = 0; i < nb_pkts; i++) {\n> > +\t\t/* point at the next incoming frame */\n> > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > +\t\t (poll(&pfd, 1, -1) < 0))\n> > +\t\t\t\tcontinue;\n> > +\n> > +\t\t/* copy the tx frame data */\n> > +\t\tmbuf = bufs[num_tx];\n> > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > +\t\t\tsizeof(struct sockaddr_ll);\n> > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > +\n> > +\t\t/* release incoming frame and advance ring buffer */\n> > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > +\t\tif (++framenum >= 
framecount)\n> > +\t\t\tframenum = 0;\n> > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > +\n> > +\t\tnum_tx++;\n> > +\t\trte_pktmbuf_free(mbuf);\n> > +\t}\n> > +\n> > +\t/* kick-off transmits */\n> > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > +\n> > +\tpkt_q->framenum = framenum;\n> > +\tpkt_q->tx_pkts += num_tx;\n> > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > +\treturn num_tx;\n> > +}\n> > +\n> > +static int\n> > +eth_dev_start(struct rte_eth_dev *dev)\n> > +{\n> > +\tdev->data->dev_link.link_status = 1;\n> > +\treturn 0;\n> > +}\n> > +\n> > +/*\n> > + * This function gets called when the current port gets stopped.\n> > + */\n> > +static void\n> > +eth_dev_stop(struct rte_eth_dev *dev)\n> > +{\n> > +\tunsigned i;\n> > +\tint sockfd;\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\n> > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > +\t\tif (sockfd != -1)\n> > +\t\t\tclose(sockfd);\n> > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > +\t\tif (sockfd != -1)\n> > +\t\t\tclose(sockfd);\n> > +\t}\n> > +\n> > +\tdev->data->dev_link.link_status = 0;\n> > +}\n> > +\n> > +static int\n> > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused) {\n> > +\treturn 0;\n> > +}\n> > +\n> > +static void\n> > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info\n> > +*dev_info) {\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\n> > +\tdev_info->driver_name = drivername;\n> > +\tdev_info->if_index = internals->if_index;\n> > +\tdev_info->max_mac_addrs = 1;\n> > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > +\tdev_info->min_rx_bufsize = 0;\n> > +\tdev_info->pci_dev = NULL;\n> > +}\n> > +\n> > +static void\n> > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats 
*igb_stats)\n> > +{\n> > +\tunsigned i, imax;\n> > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > +\n> > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > +\n> > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > +\tfor (i = 0; i < imax; i++) {\n> > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > +\t\trx_total += igb_stats->q_ipackets[i];\n> > +\t}\n> > +\n> > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > +\tfor (i = 0; i < imax; i++) {\n> > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > +\t\ttx_total += igb_stats->q_opackets[i];\n> > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > +\t}\n> > +\n> > +\tigb_stats->ipackets = rx_total;\n> > +\tigb_stats->opackets = tx_total;\n> > +\tigb_stats->oerrors = tx_err_total;\n> > +}\n> > +\n> > +static void\n> > +eth_stats_reset(struct rte_eth_dev *dev) {\n> > +\tunsigned i;\n> > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > +\n> > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > +\n> > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > +\t}\n> > +}\n> > +\n> > +static void\n> > +eth_dev_close(struct rte_eth_dev *dev __rte_unused) { }\n> > +\n> > +static void\n> > +eth_queue_release(void *q __rte_unused) { }\n> > +\n> > +static int\n> > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > + int wait_to_complete __rte_unused) {\n> > +\treturn 0;\n> > +}\n> > +\n> > +static int\n> > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > + uint16_t rx_queue_id,\n> > + uint16_t nb_rx_desc __rte_unused,\n> > + unsigned 
int socket_id __rte_unused,\n> > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > + struct rte_mempool *mb_pool) {\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > +\tuint16_t buf_size;\n> > +\n> > +\tpkt_q->mb_pool = mb_pool;\n> > +\n> > +\t/* Now get the space available for data in the mbuf */\n> > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > +\t RTE_PKTMBUF_HEADROOM);\n> > +\n> > +\tif (ETH_FRAME_LEN > buf_size) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > +\t\treturn -ENOMEM;\n> > +\t}\n> > +\n> > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +static int\n> > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > + uint16_t tx_queue_id,\n> > + uint16_t nb_tx_desc __rte_unused,\n> > + unsigned int socket_id __rte_unused,\n> > + const struct rte_eth_txconf *tx_conf __rte_unused) {\n> > +\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\n> > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > +\treturn 0;\n> > +}\n> > +\n> > +static struct eth_dev_ops ops = {\n> > +\t.dev_start = eth_dev_start,\n> > +\t.dev_stop = eth_dev_stop,\n> > +\t.dev_close = eth_dev_close,\n> > +\t.dev_configure = eth_dev_configure,\n> > +\t.dev_infos_get = eth_dev_info,\n> > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > +\t.rx_queue_release = eth_queue_release,\n> > +\t.tx_queue_release = eth_queue_release,\n> > +\t.link_update = eth_link_update,\n> > +\t.stats_get = eth_stats_get,\n> > +\t.stats_reset = eth_stats_reset,\n> > +};\n> > +\n> > +/*\n> > + * Opens an AF_PACKET socket\n> > + */\n> > +static int\n> > 
+open_packet_iface(const char *key __rte_unused,\n> > + const char *value __rte_unused,\n> > + void *extra_args)\n> > +{\n> > +\tint *sockfd = extra_args;\n> > +\n> > +\t/* Open an AF_PACKET socket... */\n> > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > +\tif (*sockfd == -1) {\n> > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > +\t\treturn -1;\n> > +\t}\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +static int\n> > +rte_pmd_init_internals(const char *name,\n> > + const int sockfd,\n> > + const unsigned nb_queues,\n> > + unsigned int blocksize,\n> > + unsigned int blockcnt,\n> > + unsigned int framesize,\n> > + unsigned int framecnt,\n> > + const unsigned numa_node,\n> > + struct pmd_internals **internals,\n> > + struct rte_eth_dev **eth_dev,\n> > + struct rte_kvargs *kvlist) {\n> > +\tstruct rte_eth_dev_data *data = NULL;\n> > +\tstruct rte_pci_device *pci_dev = NULL;\n> > +\tstruct rte_kvargs_pair *pair = NULL;\n> > +\tstruct ifreq ifr;\n> > +\tsize_t ifnamelen;\n> > +\tunsigned k_idx;\n> > +\tstruct sockaddr_ll sockaddr;\n> > +\tstruct tpacket_req *req;\n> > +\tstruct pkt_rx_queue *rx_queue;\n> > +\tstruct pkt_tx_queue *tx_queue;\n> > +\tint rc, tpver, discard, bypass;\n> > +\tunsigned int i, q, rdsize;\n> > +\tint qsockfd, fanout_arg;\n> > +\n> > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > +\t\tpair = &kvlist->pairs[k_idx];\n> > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > +\t\t\tbreak;\n> > +\t}\n> > +\tif (pair == NULL) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > +\t\t name);\n> > +\t\tgoto error;\n> > +\t}\n> > +\n> > +\tRTE_LOG(INFO, PMD,\n> > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > +\t\tname, numa_node);\n> > +\n> > +\t/*\n> > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > +\t * and internal (private) data\n> > +\t */\n> > +\tdata = rte_zmalloc_socket(name, 
sizeof(*data), 0, numa_node);\n> > +\tif (data == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > +\tif (pci_dev == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > +\t 0, numa_node);\n> > +\tif (*internals == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\treq = &((*internals)->req);\n> > +\n> > +\treq->tp_block_size = blocksize;\n> > +\treq->tp_block_nr = blockcnt;\n> > +\treq->tp_frame_size = framesize;\n> > +\treq->tp_frame_nr = framecnt;\n> > +\n> > +\tifnamelen = strlen(pair->value);\n> > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > +\t} else {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > +\t\t\tname, pair->value);\n> > +\t\tgoto error;\n> > +\t}\n> > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > +\t\t name);\n> > +\t\tgoto error;\n> > +\t}\n> > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > +\n> > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > +\t\t name);\n> > +\t\tgoto error;\n> > +\t}\n> > +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > +\n> > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > +\tsockaddr.sll_family = AF_PACKET;\n> > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > +\n> > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > +\n> > +\tfor (q = 0; q < nb_queues; q++) {\n> > +\t\t/* Open an AF_PACKET socket for this queue... 
*/\n> > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > +\t\tif (qsockfd == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > +\t\t\t name);\n> > +\t\t\treturn -1;\n> > +\t\t}\n> > +\n> > +\t\ttpver = TPACKET_V2;\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > +\t\t\t\t&tpver, sizeof(tpver));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\tdiscard = 1;\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > +\t\t\t\t&discard, sizeof(discard));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\tbypass = 1;\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > +\t\t\t\t&bypass, sizeof(bypass));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > +\t\t\t pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req,\n> > sizeof(*req));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req,\n> > sizeof(*req));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > 
+\t\trx_queue->framecount = req->tp_frame_nr;\n> > +\n> > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED |\n> > MAP_LOCKED,\n> > +\t\t\t\t qsockfd, 0);\n> > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > +\t\t\t\tname, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\t/* rdsize is same for both Tx and Rx */\n> > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > +\n> > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > +\t\t}\n> > +\t\trx_queue->sockfd = qsockfd;\n> > +\n> > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > +\n> > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size *\n> > +req->tp_block_nr;\n> > +\n> > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > +\t\t}\n> > +\t\ttx_queue->sockfd = qsockfd;\n> > +\n> > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > +\t\t\t name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > 
+\t\t}\n> > +\t}\n> > +\n> > +\t/* reserve an ethdev entry */\n> > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > +\tif (*eth_dev == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\t/*\n> > +\t * now put it all together\n> > +\t * - store queue data in internals,\n> > +\t * - store numa_node info in pci_driver\n> > +\t * - point eth_dev_data to internals and pci_driver\n> > +\t * - and point eth_dev structure to new eth_dev_data structure\n> > +\t */\n> > +\n> > +\t(*internals)->nb_queues = nb_queues;\n> > +\n> > +\tdata->dev_private = *internals;\n> > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > +\tdata->dev_link = pmd_link;\n> > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > +\n> > +\tpci_dev->numa_node = numa_node;\n> > +\n> > +\t(*eth_dev)->data = data;\n> > +\t(*eth_dev)->dev_ops = &ops;\n> > +\t(*eth_dev)->pci_dev = pci_dev;\n> > +\n> > +\treturn 0;\n> > +\n> > +error:\n> > +\tif (data)\n> > +\t\trte_free(data);\n> > +\tif (pci_dev)\n> > +\t\trte_free(pci_dev);\n> > +\tfor (q = 0; q < nb_queues; q++) {\n> > +\t\tif ((*internals)->rx_queue[q].rd)\n> > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > +\t\tif ((*internals)->tx_queue[q].rd)\n> > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > +\t}\n> > +\tif (*internals)\n> > +\t\trte_free(*internals);\n> > +\treturn -1;\n> > +}\n> > +\n> > +static int\n> > +rte_eth_from_packet(const char *name,\n> > + int const *sockfd,\n> > + const unsigned numa_node,\n> > + struct rte_kvargs *kvlist) {\n> > +\tstruct pmd_internals *internals = NULL;\n> > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > +\tstruct rte_kvargs_pair *pair = NULL;\n> > +\tunsigned k_idx;\n> > +\tunsigned int blockcount;\n> > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > +\tunsigned int qpairs = 1;\n> > +\n> > +\t/* do some parameter checking 
*/\n> > +\tif (*sockfd < 0)\n> > +\t\treturn -1;\n> > +\n> > +\t/*\n> > +\t * Walk arguments for configurable settings\n> > +\t */\n> > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > +\t\tpair = &kvlist->pairs[k_idx];\n> > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > +\t\t\tqpairs = atoi(pair->value);\n> > +\t\t\tif (qpairs < 1 ||\n> > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > +\t\t\tblocksize = atoi(pair->value);\n> > +\t\t\tif (!blocksize) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > +\t\t\tframesize = atoi(pair->value);\n> > +\t\t\tif (!framesize) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > +\t\t\tframecount = atoi(pair->value);\n> > +\t\t\tif (!framecount) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t}\n> > +\n> > +\tif (framesize > blocksize) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > +\t\t name);\n> > +\t\treturn -1;\n> > +\t}\n> > +\n> > +\tblockcount = framecount / (blocksize / framesize);\n> > +\tif (!blockcount) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > +\t\treturn -1;\n> > +\t}\n> > +\n> > 
+\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > +\n> > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > +\t blocksize, blockcount,\n> > +\t framesize, framecount,\n> > +\t numa_node, &internals, &eth_dev,\n> > +\t kvlist) < 0)\n> > +\t\treturn -1;\n> > +\n> > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +int\n> > +rte_pmd_packet_devinit(const char *name, const char *params) {\n> > +\tunsigned numa_node;\n> > +\tint ret;\n> > +\tstruct rte_kvargs *kvlist;\n> > +\tint sockfd = -1;\n> > +\n> > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > +\n> > +\tnuma_node = rte_socket_id();\n> > +\n> > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > +\tif (kvlist == NULL)\n> > +\t\treturn -1;\n> > +\n> > +\t/*\n> > +\t * If iface argument is passed we open the NICs and use them for\n> > +\t * reading / writing\n> > +\t */\n> > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > +\n> > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > +\t\t &open_packet_iface, &sockfd);\n> > +\t\tif (ret < 0)\n> > +\t\t\treturn -1;\n> > +\t}\n> > +\n> > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > +\tclose(sockfd); /* no longer needed */\n> > +\n> > +\tif (ret < 0)\n> > +\t\treturn -1;\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +static struct rte_driver pmd_packet_drv = {\n> > +\t.name = \"eth_packet\",\n> > +\t.type = PMD_VDEV,\n> > +\t.init = rte_pmd_packet_devinit,\n> > +};\n> > +\n> > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h\n> > 
b/lib/librte_pmd_packet/rte_eth_packet.h\n> > new file mode 100644\n> > index 000000000000..f685611da3e9\n> > --- /dev/null\n> > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > @@ -0,0 +1,55 @@\n> > +/*-\n> > + * BSD LICENSE\n> > + *\n> > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > + * All rights reserved.\n> > + *\n> > + * Redistribution and use in source and binary forms, with or without\n> > + * modification, are permitted provided that the following conditions\n> > + * are met:\n> > + *\n> > + * * Redistributions of source code must retain the above copyright\n> > + * notice, this list of conditions and the following disclaimer.\n> > + * * Redistributions in binary form must reproduce the above copyright\n> > + * notice, this list of conditions and the following disclaimer in\n> > + * the documentation and/or other materials provided with the\n> > + * distribution.\n> > + * * Neither the name of Intel Corporation nor the names of its\n> > + * contributors may be used to endorse or promote products derived\n> > + * from this software without specific prior written permission.\n> > + *\n> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> > CONTRIBUTORS\n> > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT\n> > NOT\n> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> > FITNESS FOR\n> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> > COPYRIGHT\n> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> > INCIDENTAL,\n> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT\n> > NOT\n> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> > LOSS OF USE,\n> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n> > AND ON ANY\n> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> > TORT\n> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF\n> > THE USE\n> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> > DAMAGE.\n> > + */\n> > +\n> > +#ifndef _RTE_ETH_PACKET_H_\n> > +#define _RTE_ETH_PACKET_H_\n> > +\n> > +#ifdef __cplusplus\n> > +extern \"C\" {\n> > +#endif\n> > +\n> > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > +\n> > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > +\n> > +/**\n> > + * For use by the EAL only. Called as part of EAL init to set up any\n> > +dummy NICs\n> > + * configured on command line.\n> > + */\n> > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > +\n> > +#ifdef __cplusplus\n> > +}\n> > +#endif\n> > +\n> > +#endif\n> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 34dff2a02a05..a6994c4dbe93\n> > 100644\n> > --- a/mk/rte.app.mk\n> > +++ b/mk/rte.app.mk\n> > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y) LDLIBS\n> > += -lrte_pmd_pcap -lpcap endif\n> > \n> > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > +LDLIBS += -lrte_pmd_packet\n> > +endif\n> > +\n> > endif # plugins\n> > \n> > LDLIBS += $(EXECENV_LDLIBS)\n> > --\n> > 1.9.3\n> \n>", "headers": { "Return-Path": "<nhorman@tuxdriver.com>", "Received": [ "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id CC08A1FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 14:17:19 +0200 (CEST)", "from [209.188.62.162] (helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from 
<nhorman@tuxdriver.com>)\n\tid 1X71g6-0004FB-0Y; Tue, 15 Jul 2014 08:18:04 -0400" ], "Date": "Tue, 15 Jul 2014 08:17:44 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "\"Zhou, Danny\" <danny.zhou@intel.com>", "Message-ID": "<20140715121743.GA14273@localhost.localdomain>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 12:17:21 -0000" }, "addressed": null }, { "id": 115, "web_url": "http://patches.dpdk.org/comment/115/", "msgid": "<20140715140111.GA26012@tuxdriver.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20140715140111.GA26012@tuxdriver.com", "date": "2014-07-15T14:01:11", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": "http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. 
Linville", "email": "linville@tuxdriver.com" }, "content": "On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:\n> On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> > According to my performance measurement results for 64B small\n> > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M\n> > pps) which make sense to me as for 16 queues case more CPU cycles (16\n> > queues' 87% vs. 1 queue' 80%) in kernel land needed for NAPI-enabled\n> > ixgbe driver to switch between polling and interrupt modes in order\n> > to service per-queue rx interrupts, so more context switch overhead\n> > involved. Also, since the eth_packet_rx/eth_packet_tx routines involves\n> > in two memory copies between DPDK mbuf and pbuf for each packet,\n> > it can hardly achieve high performance unless packet are directly\n> > DMA to mbuf which needs ixgbe driver to support.\n> \n> I thought 16 queues would be spread out between as many cpus as you had though,\n> obviating the need for context switches, no?\n\nI think Danny is testing the single CPU case. Having more queues\nthan CPUs probably does not provide any benefit.\n\nIt would be cool to hack the DPDK memory management to work directly\nout of the mmap'ed AF_PACKET buffers. But at this point I don't\nhave enough knowledge of DPDK internals to know if that is at all\nreasonable...\n\nJohn\n\nP.S. 
Danny, have you run any performance tests on the PCAP driver?", "headers": { "Return-Path": "<linville@tuxdriver.com>", "Received": [ "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 84D801FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 16:14:23 +0200 (CEST)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>)\n\tid 1X73Vc-0004zW-R3; Tue, 15 Jul 2014 10:15:08 -0400", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\ts6FE1Cx0028928; Tue, 15 Jul 2014 10:01:12 -0400", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\ts6FE1Cnb028927; Tue, 15 Jul 2014 10:01:12 -0400" ], "Date": "Tue, 15 Jul 2014 10:01:11 -0400", "From": "\"John W. Linville\" <linville@tuxdriver.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Message-ID": "<20140715140111.GA26012@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>\n\t<20140715121743.GA14273@localhost.localdomain>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20140715121743.GA14273@localhost.localdomain>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": 
"<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 14:14:23 -0000" }, "addressed": null }, { "id": 116, "web_url": "http://patches.dpdk.org/comment/116/", "msgid": "<DFDF335405C17848924A094BC35766CF0A8AD501@SHSMSX103.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A8AD501@SHSMSX103.ccr.corp.intel.com", "date": "2014-07-15T15:34:14", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "> -----Original Message-----\n> From: Neil Horman [mailto:nhorman@tuxdriver.com]\n> Sent: Tuesday, July 15, 2014 8:18 PM\n> To: Zhou, Danny\n> Cc: John W. Linville; dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n> AF_PACKET-based virtual devices\n> \n> On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> > According to my performance measurement results for 64B small packet, 1 queue\n> perf. is better than 16 queues (1.35M pps vs. 0.93M pps) which make sense to me as\n> for 16 queues case more CPU cycles (16 queues' 87% vs. 1 queue' 80%) in kernel\n> land needed for NAPI-enabled ixgbe driver to switch between polling and interrupt\n> modes in order to service per-queue rx interrupts, so more context switch overhead\n> involved. 
Also, since the eth_packet_rx/eth_packet_tx routines involves in two\n> memory copies between DPDK mbuf and pbuf for each packet, it can hardly achieve\n> high performance unless packet are directly DMA to mbuf which needs ixgbe driver\n> to support.\n> \n> I thought 16 queues would be spread out between as many cpus as you had though,\n> obviating the need for context switches, no?\n> Neil\n> \n\nIf you set the affinity of those per-queue MSI-X interrupts to different CPUs, performance would be much better \nand linear scaling is expected. But in order to do an apples-to-apples performance comparison against the 1-queue case \non a single core, all interrupts are by default handled by one core, say core0, so I think the many context switches hurt \nperformance.\n\n> >\n> > > -----Original Message-----\n> > > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > > Sent: Tuesday, July 15, 2014 2:25 AM\n> > > To: dev@dpdk.org\n> > > Cc: Thomas Monjalon; Richardson, Bruce; Zhou, Danny\n> > > Subject: [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based\n> > > virtual devices\n> > >\n> > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > > socket. This implementation uses mmap'ed ring buffers to limit\n> > > copying and user/kernel transitions. The PACKET_FANOUT_HASH\n> > > behavior of AF_PACKET is used for frame reception. In the current\n> > > implementation, Tx and Rx queues are always paired, and therefore\n> > > are always equal in number -- changing this would be a Simple Matter Of\n> Programming.\n> > >\n> > > Interfaces of this type are created with a command line option like\n> > > \"--vdev=eth_packet0,iface=...\". 
There are a number of options\n> > > availabe as\n> > > arguments:\n> > >\n> > > - Interface is chosen by \"iface\" (required)\n> > > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default:\n> > > 4096)\n> > > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default:\n> > > 2048)\n> > > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default:\n> > > 512)\n> > >\n> > > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > > ---\n> > > This PMD is intended to provide a means for using DPDK on a broad\n> > > range of hardware without hardware-specific PMDs and (hopefully)\n> > > with better performance than what PCAP offers in Linux. This might\n> > > be useful as a development platform for DPDK applications when\n> DPDK-supported hardware is expensive or unavailable.\n> > >\n> > > New in v2:\n> > >\n> > > -- fixup some style issues found by check patch\n> > > -- use if_index as part of fanout group ID\n> > > -- set default number of queue pairs to 1\n> > >\n> > > config/common_bsdapp | 5 +\n> > > config/common_linuxapp | 5 +\n> > > lib/Makefile | 1 +\n> > > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > > lib/librte_pmd_packet/Makefile | 60 +++\n> > > lib/librte_pmd_packet/rte_eth_packet.c | 826\n> > > +++++++++++++++++++++++++++++++++\n> > > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > > mk/rte.app.mk | 4 +\n> > > 8 files changed, 957 insertions(+)\n> > > create mode 100644 lib/librte_pmd_packet/Makefile create mode\n> > > 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > >\n> > > diff --git a/config/common_bsdapp b/config/common_bsdapp index\n> > > 943dce8f1ede..c317f031278e 100644\n> > > --- a/config/common_bsdapp\n> > > +++ b/config/common_bsdapp\n> > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > >\n> > > #\n> > > +# Compile software 
PMD backed by AF_PACKET sockets (Linux only) #\n> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > > +\n> > > +#\n> > > # Do prefetch of packet data within PMD driver receive function #\n> > > CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git\n> a/config/common_linuxapp\n> > > b/config/common_linuxapp index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > > --- a/config/common_linuxapp\n> > > +++ b/config/common_linuxapp\n> > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > >\n> > > #\n> > > +# Compile software PMD backed by AF_PACKET sockets (Linux only) #\n> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > > +\n> > > +#\n> > > # Compile Xen PMD\n> > > #\n> > > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > > diff --git a/lib/Makefile b/lib/Makefile index\n> > > 10c5bb3045bc..930fadf29898 100644\n> > > --- a/lib/Makefile\n> > > +++ b/lib/Makefile\n> > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) +=\n> > > librte_pmd_i40e\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt diff\n> > > --git a/lib/librte_eal/linuxapp/eal/Makefile\n> > > b/lib/librte_eal/linuxapp/eal/Makefile\n> > > index 756d6b0c9301..feed24a63272 100644\n> > > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether CFLAGS +=\n> > > -I$(RTE_SDK)/lib/librte_ivshmem CFLAGS +=\n> > > -I$(RTE_SDK)/lib/librte_pmd_ring CFLAGS +=\n> > > -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > > CFLAGS += $(WERROR_FLAGS) 
-O3\n> > >\n> > > diff --git a/lib/librte_pmd_packet/Makefile\n> > > b/lib/librte_pmd_packet/Makefile new file mode 100644 index\n> > > 000000000000..e1266fb992cd\n> > > --- /dev/null\n> > > +++ b/lib/librte_pmd_packet/Makefile\n> > > @@ -0,0 +1,60 @@\n> > > +# BSD LICENSE\n> > > +#\n> > > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > +# Copyright(c) 2014 6WIND S.A.\n> > > +# All rights reserved.\n> > > +#\n> > > +# Redistribution and use in source and binary forms, with or without\n> > > +# modification, are permitted provided that the following conditions\n> > > +# are met:\n> > > +#\n> > > +# * Redistributions of source code must retain the above copyright\n> > > +# notice, this list of conditions and the following disclaimer.\n> > > +# * Redistributions in binary form must reproduce the above copyright\n> > > +# notice, this list of conditions and the following disclaimer in\n> > > +# the documentation and/or other materials provided with the\n> > > +# distribution.\n> > > +# * Neither the name of Intel Corporation nor the names of its\n> > > +# contributors may be used to endorse or promote products derived\n> > > +# from this software without specific prior written permission.\n> > > +#\n> > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> > > CONTRIBUTORS\n> > > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,\n> BUT\n> > > NOT\n> > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> > > FITNESS FOR\n> > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> > > COPYRIGHT\n> > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> > > INCIDENTAL,\n> > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,\n> BUT\n> > > NOT\n> > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> > > LOSS OF USE,\n> > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n> CAUSED\n> > > AND ON ANY\n> > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> TORT\n> > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT\n> OF\n> > > THE USE\n> > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> > > DAMAGE.\n> > > +\n> > > +include $(RTE_SDK)/mk/rte.vars.mk\n> > > +\n> > > +#\n> > > +# library name\n> > > +#\n> > > +LIB = librte_pmd_packet.a\n> > > +\n> > > +CFLAGS += -O3\n> > > +CFLAGS += $(WERROR_FLAGS)\n> > > +\n> > > +#\n> > > +# all source are stored in SRCS-y\n> > > +#\n> > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > > +\n> > > +#\n> > > +# Export include files\n> > > +#\n> > > +SYMLINK-y-include += rte_eth_packet.h\n> > > +\n> > > +# this lib depends upon:\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > > +\n> > > +include $(RTE_SDK)/mk/rte.lib.mk\n> > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c\n> > > b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > new file mode 100644\n> > > index 000000000000..9c82d16e730f\n> > > --- /dev/null\n> > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > @@ -0,0 +1,826 @@\n> > > +/*-\n> > > + * BSD LICENSE\n> > > + *\n> > > + * Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>\n> > > + *\n> > > + * Originally based upon librte_pmd_pcap code:\n> > > + *\n> > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > + * Copyright(c) 2014 6WIND S.A.\n> > > + * All rights reserved.\n> > > + *\n> > > + * Redistribution and use in source and binary forms, with or without\n> > > + * modification, are permitted provided that the following conditions\n> > > + * are met:\n> > > + *\n> > > + * * Redistributions of source code must retain the above copyright\n> > > + * notice, this list of conditions and the following disclaimer.\n> > > + * * Redistributions in binary form must reproduce the above copyright\n> > > + * notice, this list of conditions and the following disclaimer in\n> > > + * the documentation and/or other materials provided with the\n> > > + * distribution.\n> > > + * * Neither the name of Intel Corporation nor the names of its\n> > > + * contributors may be used to endorse or promote products derived\n> > > + * from this software without specific prior written permission.\n> > > + *\n> > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> > > CONTRIBUTORS\n> > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,\n> BUT\n> > > NOT\n> > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> > > FITNESS FOR\n> > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> > > COPYRIGHT\n> > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> > > INCIDENTAL,\n> > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,\n> BUT\n> > > NOT\n> > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> > > LOSS OF USE,\n> > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n> CAUSED\n> > > AND ON ANY\n> > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> > > TORT\n> > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT\n> OF\n> > > THE USE\n> > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> > > DAMAGE.\n> > > + */\n> > > +\n> > > +#include <rte_mbuf.h>\n> > > +#include <rte_ethdev.h>\n> > > +#include <rte_malloc.h>\n> > > +#include <rte_kvargs.h>\n> > > +#include <rte_dev.h>\n> > > +\n> > > +#include <linux/if_ether.h>\n> > > +#include <linux/if_packet.h>\n> > > +#include <arpa/inet.h>\n> > > +#include <net/if.h>\n> > > +#include <sys/types.h>\n> > > +#include <sys/socket.h>\n> > > +#include <sys/ioctl.h>\n> > > +#include <sys/mman.h>\n> > > +#include <unistd.h>\n> > > +#include <poll.h>\n> > > +\n> > > +#include \"rte_eth_packet.h\"\n> > > +\n> > > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > > +\n> > > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > > +\n> > > +struct pkt_rx_queue {\n> > > +\tint sockfd;\n> > > +\n> > > +\tstruct iovec *rd;\n> > > +\tuint8_t *map;\n> > > +\tunsigned int framecount;\n> > > +\tunsigned int framenum;\n> > > +\n> > > +\tstruct rte_mempool *mb_pool;\n> > > +\n> > > +\tvolatile unsigned long rx_pkts;\n> > > +\tvolatile unsigned long err_pkts;\n> > > +};\n> > > +\n> > > +struct pkt_tx_queue {\n> 
> > +\tint sockfd;\n> > > +\n> > > +\tstruct iovec *rd;\n> > > +\tuint8_t *map;\n> > > +\tunsigned int framecount;\n> > > +\tunsigned int framenum;\n> > > +\n> > > +\tvolatile unsigned long tx_pkts;\n> > > +\tvolatile unsigned long err_pkts;\n> > > +};\n> > > +\n> > > +struct pmd_internals {\n> > > +\tunsigned nb_queues;\n> > > +\n> > > +\tint if_index;\n> > > +\tstruct ether_addr eth_addr;\n> > > +\n> > > +\tstruct tpacket_req req;\n> > > +\n> > > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > +};\n> > > +\n> > > +static const char *valid_arguments[] = {\n> > > +\tETH_PACKET_IFACE_ARG,\n> > > +\tETH_PACKET_NUM_Q_ARG,\n> > > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > > +\tETH_PACKET_FRAMESIZE_ARG,\n> > > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > > +\tNULL\n> > > +};\n> > > +\n> > > +static const char *drivername = \"AF_PACKET PMD\";\n> > > +\n> > > +static struct rte_eth_link pmd_link = {\n> > > +\t.link_speed = 10000,\n> > > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > > +\t.link_status = 0\n> > > +};\n> > > +\n> > > +static uint16_t\n> > > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) {\n> > > +\tunsigned i;\n> > > +\tstruct tpacket2_hdr *ppd;\n> > > +\tstruct rte_mbuf *mbuf;\n> > > +\tuint8_t *pbuf;\n> > > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > > +\tuint16_t num_rx = 0;\n> > > +\tunsigned int framecount, framenum;\n> > > +\n> > > +\tif (unlikely(nb_pkts == 0))\n> > > +\t\treturn 0;\n> > > +\n> > > +\t/*\n> > > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > > +\t * one and copies the packet data into a newly allocated mbuf.\n> > > +\t */\n> > > +\tframecount = pkt_q->framecount;\n> > > +\tframenum = pkt_q->framenum;\n> > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > +\t\t/* point at the next incoming frame */\n> > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> 
> > +\t\t\tbreak;\n> > > +\n> > > +\t\t/* allocate the next mbuf */\n> > > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > > +\t\tif (unlikely(mbuf == NULL))\n> > > +\t\t\tbreak;\n> > > +\n> > > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > > +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > > +\n> > > +\t\t/* release incoming frame and advance ring buffer */\n> > > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > > +\t\tif (++framenum >= framecount)\n> > > +\t\t\tframenum = 0;\n> > > +\n> > > +\t\t/* account for the receive frame */\n> > > +\t\tbufs[i] = mbuf;\n> > > +\t\tnum_rx++;\n> > > +\t}\n> > > +\tpkt_q->framenum = framenum;\n> > > +\tpkt_q->rx_pkts += num_rx;\n> > > +\treturn num_rx;\n> > > +}\n> > > +\n> > > +/*\n> > > + * Callback to handle sending packets through a real NIC.\n> > > + */\n> > > +static uint16_t\n> > > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) {\n> > > +\tstruct tpacket2_hdr *ppd;\n> > > +\tstruct rte_mbuf *mbuf;\n> > > +\tuint8_t *pbuf;\n> > > +\tunsigned int framecount, framenum;\n> > > +\tstruct pollfd pfd;\n> > > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > > +\tuint16_t num_tx = 0;\n> > > +\tint i;\n> > > +\n> > > +\tif (unlikely(nb_pkts == 0))\n> > > +\t\treturn 0;\n> > > +\n> > > +\tmemset(&pfd, 0, sizeof(pfd));\n> > > +\tpfd.fd = pkt_q->sockfd;\n> > > +\tpfd.events = POLLOUT;\n> > > +\tpfd.revents = 0;\n> > > +\n> > > +\tframecount = pkt_q->framecount;\n> > > +\tframenum = pkt_q->framenum;\n> > > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > +\t\t/* point at the next incoming frame */\n> > > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > > +\t\t (poll(&pfd, 1, -1) < 0))\n> > > +\t\t\t\tcontinue;\n> > > +\n> > > +\t\t/* copy the tx frame data */\n> > > +\t\tmbuf = bufs[num_tx];\n> 
> > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > > +\t\t\tsizeof(struct sockaddr_ll);\n> > > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > > +\n> > > +\t\t/* release incoming frame and advance ring buffer */\n> > > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > > +\t\tif (++framenum >= framecount)\n> > > +\t\t\tframenum = 0;\n> > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > +\n> > > +\t\tnum_tx++;\n> > > +\t\trte_pktmbuf_free(mbuf);\n> > > +\t}\n> > > +\n> > > +\t/* kick-off transmits */\n> > > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > > +\n> > > +\tpkt_q->framenum = framenum;\n> > > +\tpkt_q->tx_pkts += num_tx;\n> > > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > > +\treturn num_tx;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_dev_start(struct rte_eth_dev *dev) {\n> > > +\tdev->data->dev_link.link_status = 1;\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +/*\n> > > + * This function gets called when the current port gets stopped.\n> > > + */\n> > > +static void\n> > > +eth_dev_stop(struct rte_eth_dev *dev) {\n> > > +\tunsigned i;\n> > > +\tint sockfd;\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\n> > > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > > +\t\tif (sockfd != -1)\n> > > +\t\t\tclose(sockfd);\n> > > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > > +\t\tif (sockfd != -1)\n> > > +\t\t\tclose(sockfd);\n> > > +\t}\n> > > +\n> > > +\tdev->data->dev_link.link_status = 0; }\n> > > +\n> > > +static int\n> > > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused) {\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info\n> > > +*dev_info) {\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\n> > > +\tdev_info->driver_name = 
drivername;\n> > > +\tdev_info->if_index = internals->if_index;\n> > > +\tdev_info->max_mac_addrs = 1;\n> > > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > > +\tdev_info->min_rx_bufsize = 0;\n> > > +\tdev_info->pci_dev = NULL;\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats\n> > > +*igb_stats) {\n> > > +\tunsigned i, imax;\n> > > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > > +\n> > > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > > +\n> > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > +\tfor (i = 0; i < imax; i++) {\n> > > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > > +\t\trx_total += igb_stats->q_ipackets[i];\n> > > +\t}\n> > > +\n> > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > +\tfor (i = 0; i < imax; i++) {\n> > > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > > +\t\ttx_total += igb_stats->q_opackets[i];\n> > > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > > +\t}\n> > > +\n> > > +\tigb_stats->ipackets = rx_total;\n> > > +\tigb_stats->opackets = tx_total;\n> > > +\tigb_stats->oerrors = tx_err_total; }\n> > > +\n> > > +static void\n> > > +eth_stats_reset(struct rte_eth_dev *dev) {\n> > > +\tunsigned i;\n> > > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > > +\n> > > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > > +\n> > > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > > 
+\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > > +\t}\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_dev_close(struct rte_eth_dev *dev __rte_unused) { }\n> > > +\n> > > +static void\n> > > +eth_queue_release(void *q __rte_unused) { }\n> > > +\n> > > +static int\n> > > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > > + int wait_to_complete __rte_unused) {\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > > + uint16_t rx_queue_id,\n> > > + uint16_t nb_rx_desc __rte_unused,\n> > > + unsigned int socket_id __rte_unused,\n> > > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > > + struct rte_mempool *mb_pool) {\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > > +\tuint16_t buf_size;\n> > > +\n> > > +\tpkt_q->mb_pool = mb_pool;\n> > > +\n> > > +\t/* Now get the space available for data in the mbuf */\n> > > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > > +\t RTE_PKTMBUF_HEADROOM);\n> > > +\n> > > +\tif (ETH_FRAME_LEN > buf_size) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > > +\t\treturn -ENOMEM;\n> > > +\t}\n> > > +\n> > > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > > + uint16_t tx_queue_id,\n> > > + uint16_t nb_tx_desc __rte_unused,\n> > > + unsigned int socket_id __rte_unused,\n> > > + const struct rte_eth_txconf *tx_conf\n> > > +__rte_unused) {\n> > > +\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\n> > > +\tdev->data->tx_queues[tx_queue_id] = 
&internals->tx_queue[tx_queue_id];\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static struct eth_dev_ops ops = {\n> > > +\t.dev_start = eth_dev_start,\n> > > +\t.dev_stop = eth_dev_stop,\n> > > +\t.dev_close = eth_dev_close,\n> > > +\t.dev_configure = eth_dev_configure,\n> > > +\t.dev_infos_get = eth_dev_info,\n> > > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > > +\t.rx_queue_release = eth_queue_release,\n> > > +\t.tx_queue_release = eth_queue_release,\n> > > +\t.link_update = eth_link_update,\n> > > +\t.stats_get = eth_stats_get,\n> > > +\t.stats_reset = eth_stats_reset,\n> > > +};\n> > > +\n> > > +/*\n> > > + * Opens an AF_PACKET socket\n> > > + */\n> > > +static int\n> > > +open_packet_iface(const char *key __rte_unused,\n> > > + const char *value __rte_unused,\n> > > + void *extra_args) {\n> > > +\tint *sockfd = extra_args;\n> > > +\n> > > +\t/* Open an AF_PACKET socket... */\n> > > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > +\tif (*sockfd == -1) {\n> > > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > > +\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +rte_pmd_init_internals(const char *name,\n> > > + const int sockfd,\n> > > + const unsigned nb_queues,\n> > > + unsigned int blocksize,\n> > > + unsigned int blockcnt,\n> > > + unsigned int framesize,\n> > > + unsigned int framecnt,\n> > > + const unsigned numa_node,\n> > > + struct pmd_internals **internals,\n> > > + struct rte_eth_dev **eth_dev,\n> > > + struct rte_kvargs *kvlist) {\n> > > +\tstruct rte_eth_dev_data *data = NULL;\n> > > +\tstruct rte_pci_device *pci_dev = NULL;\n> > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > +\tstruct ifreq ifr;\n> > > +\tsize_t ifnamelen;\n> > > +\tunsigned k_idx;\n> > > +\tstruct sockaddr_ll sockaddr;\n> > > +\tstruct tpacket_req *req;\n> > > +\tstruct pkt_rx_queue *rx_queue;\n> > > +\tstruct pkt_tx_queue *tx_queue;\n> > 
> +\tint rc, tpver, discard, bypass;\n> > > +\tunsigned int i, q, rdsize;\n> > > +\tint qsockfd, fanout_arg;\n> > > +\n> > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > > +\t\t\tbreak;\n> > > +\t}\n> > > +\tif (pair == NULL) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > > +\t\t name);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\n> > > +\tRTE_LOG(INFO, PMD,\n> > > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > > +\t\tname, numa_node);\n> > > +\n> > > +\t/*\n> > > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > > +\t * and internal (private) data\n> > > +\t */\n> > > +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > > +\tif (data == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > > +\tif (pci_dev == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > > +\t 0, numa_node);\n> > > +\tif (*internals == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\treq = &((*internals)->req);\n> > > +\n> > > +\treq->tp_block_size = blocksize;\n> > > +\treq->tp_block_nr = blockcnt;\n> > > +\treq->tp_frame_size = framesize;\n> > > +\treq->tp_frame_nr = framecnt;\n> > > +\n> > > +\tifnamelen = strlen(pair->value);\n> > > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > > +\t} else {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > > +\t\t\tname, pair->value);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > > +\t\t name);\n> > > 
+\t\tgoto error;\n> > > +\t}\n> > > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > > +\n> > > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > > +\t\t name);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > > +\n> > > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > > +\tsockaddr.sll_family = AF_PACKET;\n> > > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > > +\n> > > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > > +\tfanout_arg |= (PACKET_FANOUT_HASH |\n> PACKET_FANOUT_FLAG_DEFRAG |\n> > > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > > +\n> > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > +\t\t/* Open an AF_PACKET socket for this queue... */\n> > > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > +\t\tif (qsockfd == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > > +\t\t\t name);\n> > > +\t\t\treturn -1;\n> > > +\t\t}\n> > > +\n> > > +\t\ttpver = TPACKET_V2;\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > > +\t\t\t\t&tpver, sizeof(tpver));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\tdiscard = 1;\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > > +\t\t\t\t&discard, sizeof(discard));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\tbypass = 1;\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > > 
+\t\t\t\t&bypass, sizeof(bypass));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > > +\t\t\t pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req,\n> > > sizeof(*req));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req,\n> > > sizeof(*req));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > > +\t\trx_queue->framecount = req->tp_frame_nr;\n> > > +\n> > > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size *\n> req->tp_block_nr,\n> > > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED |\n> > > MAP_LOCKED,\n> > > +\t\t\t\t qsockfd, 0);\n> > > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > > +\t\t\t\tname, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\t/* rdsize is same for both Tx and Rx */\n> > > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > > +\n> > > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > > +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > +\t\t}\n> > > +\t\trx_queue->sockfd = qsockfd;\n> > > +\n> > > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > > 
+\t\ttx_queue->framecount = req->tp_frame_nr;\n> > > +\n> > > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size *\n> > > +req->tp_block_nr;\n> > > +\n> > > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > +\t\t}\n> > > +\t\ttx_queue->sockfd = qsockfd;\n> > > +\n> > > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr,\n> sizeof(sockaddr));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > > +\t\t\t name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET\n> socket \"\n> > > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\t}\n> > > +\n> > > +\t/* reserve an ethdev entry */\n> > > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > > +\tif (*eth_dev == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\t/*\n> > > +\t * now put it all together\n> > > +\t * - store queue data in internals,\n> > > +\t * - store numa_node info in pci_driver\n> > > +\t * - point eth_dev_data to internals and pci_driver\n> > > +\t * - and point eth_dev structure to new eth_dev_data structure\n> > > +\t */\n> > > +\n> > > +\t(*internals)->nb_queues = nb_queues;\n> > > +\n> > > +\tdata->dev_private = *internals;\n> > > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > > +\tdata->dev_link = pmd_link;\n> > > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > > +\n> > > +\tpci_dev->numa_node = numa_node;\n> > 
> +\n> > > +\t(*eth_dev)->data = data;\n> > > +\t(*eth_dev)->dev_ops = &ops;\n> > > +\t(*eth_dev)->pci_dev = pci_dev;\n> > > +\n> > > +\treturn 0;\n> > > +\n> > > +error:\n> > > +\tif (data)\n> > > +\t\trte_free(data);\n> > > +\tif (pci_dev)\n> > > +\t\trte_free(pci_dev);\n> > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > +\t\tif ((*internals)->rx_queue[q].rd)\n> > > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > > +\t\tif ((*internals)->tx_queue[q].rd)\n> > > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > > +\t}\n> > > +\tif (*internals)\n> > > +\t\trte_free(*internals);\n> > > +\treturn -1;\n> > > +}\n> > > +\n> > > +static int\n> > > +rte_eth_from_packet(const char *name,\n> > > + int const *sockfd,\n> > > + const unsigned numa_node,\n> > > + struct rte_kvargs *kvlist) {\n> > > +\tstruct pmd_internals *internals = NULL;\n> > > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > +\tunsigned k_idx;\n> > > +\tunsigned int blockcount;\n> > > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > > +\tunsigned int qpairs = 1;\n> > > +\n> > > +\t/* do some parameter checking */\n> > > +\tif (*sockfd < 0)\n> > > +\t\treturn -1;\n> > > +\n> > > +\t/*\n> > > +\t * Walk arguments for configurable settings\n> > > +\t */\n> > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > > +\t\t\tqpairs = atoi(pair->value);\n> > > +\t\t\tif (qpairs < 1 ||\n> > > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > > +\t\t\tblocksize = atoi(pair->value);\n> > > 
+\t\t\tif (!blocksize) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > > +\t\t\tframesize = atoi(pair->value);\n> > > +\t\t\tif (!framesize) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > > +\t\t\tframecount = atoi(pair->value);\n> > > +\t\t\tif (!framecount) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t}\n> > > +\n> > > +\tif (framesize > blocksize) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > > +\t\t name);\n> > > +\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\tblockcount = framecount / (blocksize / framesize);\n> > > +\tif (!blockcount) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > > +\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > > +\n> > > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > > +\t blocksize, blockcount,\n> > > +\t framesize, framecount,\n> > > +\t numa_node, &internals, &eth_dev,\n> > > +\t kvlist) < 0)\n> > > +\t\treturn -1;\n> > > +\n> > > 
+\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +int\n> > > +rte_pmd_packet_devinit(const char *name, const char *params) {\n> > > +\tunsigned numa_node;\n> > > +\tint ret;\n> > > +\tstruct rte_kvargs *kvlist;\n> > > +\tint sockfd = -1;\n> > > +\n> > > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > > +\n> > > +\tnuma_node = rte_socket_id();\n> > > +\n> > > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > > +\tif (kvlist == NULL)\n> > > +\t\treturn -1;\n> > > +\n> > > +\t/*\n> > > +\t * If iface argument is passed we open the NICs and use them for\n> > > +\t * reading / writing\n> > > +\t */\n> > > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > > +\n> > > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > > +\t\t &open_packet_iface, &sockfd);\n> > > +\t\tif (ret < 0)\n> > > +\t\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > > +\tclose(sockfd); /* no longer needed */\n> > > +\n> > > +\tif (ret < 0)\n> > > +\t\treturn -1;\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static struct rte_driver pmd_packet_drv = {\n> > > +\t.name = \"eth_packet\",\n> > > +\t.type = PMD_VDEV,\n> > > +\t.init = rte_pmd_packet_devinit,\n> > > +};\n> > > +\n> > > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h\n> > > b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > new file mode 100644\n> > > index 000000000000..f685611da3e9\n> > > --- /dev/null\n> > > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > @@ -0,0 +1,55 @@\n> > > +/*-\n> > > + * BSD LICENSE\n> > > + *\n> > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > + * All rights reserved.\n> > > + *\n> > > + * Redistribution and use in source and binary forms, with or without\n> > > + * modification, are permitted provided that the following conditions\n> > > + * are met:\n> > > + *\n> > > + * * Redistributions of source code must retain the above copyright\n> > > + * notice, this list of conditions and the following disclaimer.\n> > > + * * Redistributions in binary form must reproduce the above copyright\n> > > + * notice, this list of conditions and the following disclaimer in\n> > > + * the documentation and/or other materials provided with the\n> > > + * distribution.\n> > > + * * Neither the name of Intel Corporation nor the names of its\n> > > + * contributors may be used to endorse or promote products derived\n> > > + * from this software without specific prior written permission.\n> > > + *\n> > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n> > > CONTRIBUTORS\n> > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,\n> BUT\n> > > NOT\n> > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND\n> > > FITNESS FOR\n> > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n> > > COPYRIGHT\n> > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n> > > INCIDENTAL,\n> > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,\n> BUT\n> > > NOT\n> > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n> > > LOSS OF USE,\n> > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n> CAUSED\n> > > AND ON ANY\n> > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\n> > > TORT\n> > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT\n> OF\n> > > THE USE\n> > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n> > > DAMAGE.\n> > > + */\n> > > +\n> > > +#ifndef _RTE_ETH_PACKET_H_\n> > > +#define _RTE_ETH_PACKET_H_\n> > > +\n> > > +#ifdef __cplusplus\n> > > +extern \"C\" {\n> > > +#endif\n> > > +\n> > > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > > +\n> > > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > > +\n> > > +/**\n> > > + * For use by the EAL only. Called as part of EAL init to set up\n> > > +any dummy NICs\n> > > + * configured on command line.\n> > > + */\n> > > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > > +\n> > > +#ifdef __cplusplus\n> > > +}\n> > > +#endif\n> > > +\n> > > +#endif\n> > > diff --git a/mk/rte.app.mk b/mk/rte.app.mk index\n> > > 34dff2a02a05..a6994c4dbe93\n> > > 100644\n> > > --- a/mk/rte.app.mk\n> > > +++ b/mk/rte.app.mk\n> > > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> LDLIBS\n> > > += -lrte_pmd_pcap -lpcap endif\n> > >\n> > > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > > +LDLIBS += -lrte_pmd_packet\n> > > +endif\n> > > +\n> > > endif # plugins\n> > >\n> > > LDLIBS += $(EXECENV_LDLIBS)\n> > > --\n> > > 1.9.3\n> >\n> >", "headers": { "Return-Path": "<danny.zhou@intel.com>", "Received": [ "from mga09.intel.com (mga09.intel.com [134.134.136.24])\n\tby dpdk.org (Postfix) with ESMTP id 5CA401FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 17:34:31 +0200 (CEST)", "from 
orsmga002.jf.intel.com ([10.7.209.21])\n\tby orsmga102.jf.intel.com with ESMTP; 15 Jul 2014 08:28:45 -0700", "from fmsmsx104.amr.corp.intel.com ([10.19.9.35])\n\tby orsmga002.jf.intel.com with ESMTP; 15 Jul 2014 08:34:18 -0700", "from fmsmsx151.amr.corp.intel.com (10.19.17.220) by\n\tFMSMSX104.amr.corp.intel.com (10.19.9.35) with Microsoft SMTP Server\n\t(TLS) id 14.3.123.3; Tue, 15 Jul 2014 08:34:18 -0700", "from shsmsx102.ccr.corp.intel.com (10.239.4.154) by\n\tFMSMSX151.amr.corp.intel.com (10.19.17.220) with Microsoft SMTP\n\tServer (TLS) id 14.3.123.3; Tue, 15 Jul 2014 08:34:18 -0700", "from shsmsx103.ccr.corp.intel.com ([169.254.4.26]) by\n\tshsmsx102.ccr.corp.intel.com ([169.254.2.120]) with mapi id\n\t14.03.0123.003; Tue, 15 Jul 2014 23:34:16 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.01,666,1400050800\"; d=\"scan'208\";a=\"573491624\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6JugPv4wgABJUgCAALtC0A==", "Date": "Tue, 15 Jul 2014 15:34:14 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A8AD501@SHSMSX103.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>\n\t<20140715121743.GA14273@localhost.localdomain>", "In-Reply-To": "<20140715121743.GA14273@localhost.localdomain>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD 
for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 15:34:33 -0000" }, "addressed": null }, { "id": 117, "web_url": "http://patches.dpdk.org/comment/117/", "msgid": "<DFDF335405C17848924A094BC35766CF0A8AF52C@SHSMSX103.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A8AF52C@SHSMSX103.ccr.corp.intel.com", "date": "2014-07-15T15:40:56", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "> -----Original Message-----\n> From: John W. Linville [mailto:linville@tuxdriver.com]\n> Sent: Tuesday, July 15, 2014 10:01 PM\n> To: Neil Horman\n> Cc: Zhou, Danny; dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n> AF_PACKET-based virtual devices\n> \n> On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:\n> > On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> > > According to my performance measurement results for 64B small\n> > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M\n> > > pps) which make sense to me as for 16 queues case more CPU cycles\n> > > (16 queues' 87% vs. 
1 queue' 80%) in kernel land needed for\n> > > NAPI-enabled ixgbe driver to switch between polling and interrupt\n> > > modes in order to service per-queue rx interrupts, so more context\n> > > switch overhead involved. Also, since the\n> > > eth_packet_rx/eth_packet_tx routines involves in two memory copies\n> > > between DPDK mbuf and pbuf for each packet, it can hardly achieve\n> > > high performance unless packet are directly DMA to mbuf which needs ixgbe\n> driver to support.\n> >\n> > I thought 16 queues would be spread out between as many cpus as you\n> > had though, obviating the need for context switches, no?\n> \n> I think Danny is testing the single CPU case. Having more queues than CPUs\n> probably does not provide any benefit.\n> \n> It would be cool to hack the DPDK memory management to work directly out of the\n> mmap'ed AF_PACKET buffers. But at this point I don't have enough knowledge of\n> DPDK internals to know if that is at all reasonable...\n> \n> John\n> \n> P.S. Danny, have you run any performance tests on the PCAP driver?\n\nNo, I do not have PCAP driver performance results in hand. But I remember it is less than\n1M pps for 64B.\n\n> \n> --\n> John W. Linville\t\tSomeday the world will need a hero, and you\n> linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.", "headers": { "Return-Path": "<danny.zhou@intel.com>", "Received": [ "from mga01.intel.com (mga01.intel.com [192.55.52.88])\n\tby dpdk.org (Postfix) with ESMTP id ABB1F58D9\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 17:40:41 +0200 (CEST)", "from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby fmsmga101.fm.intel.com with ESMTP; 15 Jul 2014 08:40:58 -0700", "from fmsmsx108.amr.corp.intel.com ([10.19.9.228])\n\tby fmsmga002.fm.intel.com with ESMTP; 15 Jul 2014 08:40:58 -0700", "from fmsmsx114.amr.corp.intel.com (10.18.116.8) by\n\tFMSMSX108.amr.corp.intel.com (10.19.9.228) with Microsoft SMTP Server\n\t(TLS) id 14.3.123.3; Tue, 15 Jul 2014 08:40:58 -0700", "from shsmsx152.ccr.corp.intel.com (10.239.6.52) by\n\tFMSMSX114.amr.corp.intel.com (10.18.116.8) with Microsoft SMTP Server\n\t(TLS) id 14.3.123.3; Tue, 15 Jul 2014 08:40:58 -0700", "from shsmsx103.ccr.corp.intel.com ([169.254.4.26]) by\n\tSHSMSX152.ccr.corp.intel.com ([169.254.6.36]) with mapi id\n\t14.03.0123.003; Tue, 15 Jul 2014 23:40:56 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.01,666,1400050800\"; d=\"scan'208\";a=\"570157648\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "\"John W. 
Linville\" <linville@tuxdriver.com>, Neil Horman\n\t<nhorman@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6JugPv4wgABJUgCAABzngIAAoGOA", "Date": "Tue, 15 Jul 2014 15:40:56 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A8AF52C@SHSMSX103.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>\n\t<20140715121743.GA14273@localhost.localdomain>\n\t<20140715140111.GA26012@tuxdriver.com>", "In-Reply-To": "<20140715140111.GA26012@tuxdriver.com>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 15:40:42 -0000" }, "addressed": null }, { "id": 119, "web_url": "http://patches.dpdk.org/comment/119/", "msgid": "<20140715190818.GD26012@tuxdriver.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20140715190818.GD26012@tuxdriver.com", "date": 
"2014-07-15T19:08:19", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": "http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. Linville", "email": "linville@tuxdriver.com" }, "content": "On Tue, Jul 15, 2014 at 03:40:56PM +0000, Zhou, Danny wrote:\n> \n> > -----Original Message-----\n> > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > Sent: Tuesday, July 15, 2014 10:01 PM\n> > To: Neil Horman\n> > Cc: Zhou, Danny; dev@dpdk.org\n> > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n> > AF_PACKET-based virtual devices\n> > \n> > On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:\n> > > On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> > > > According to my performance measurement results for 64B small\n> > > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M\n> > > > pps) which make sense to me as for 16 queues case more CPU cycles\n> > > > (16 queues' 87% vs. 1 queue' 80%) in kernel land needed for\n> > > > NAPI-enabled ixgbe driver to switch between polling and interrupt\n> > > > modes in order to service per-queue rx interrupts, so more context\n> > > > switch overhead involved. Also, since the\n> > > > eth_packet_rx/eth_packet_tx routines involves in two memory copies\n> > > > between DPDK mbuf and pbuf for each packet, it can hardly achieve\n> > > > high performance unless packet are directly DMA to mbuf which needs ixgbe\n> > driver to support.\n> > >\n> > > I thought 16 queues would be spread out between as many cpus as you\n> > > had though, obviating the need for context switches, no?\n> > \n> > I think Danny is testing the single CPU case. Having more queues than CPUs\n> > probably does not provide any benefit.\n> > \n> > It would be cool to hack the DPDK memory management to work directly out of the\n> > mmap'ed AF_PACKET buffers. 
But at this point I don't have enough knowledge of\n> > DPDK internals to know if that is at all reasonable...\n> > \n> > John\n> > \n> > P.S. Danny, have you run any performance tests on the PCAP driver?\n> \n> No, I do not have PCAP driver performance results in hand. But I remember it is less than\n> 1M pps for 64B.\n\nCool, good info...thanks!", "headers": { "Return-Path": "<linville@tuxdriver.com>", "Received": [ "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id D45F31FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 21:14:22 +0200 (CEST)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>)\n\tid 1X78Bw-0007kG-Qh; Tue, 15 Jul 2014 15:15:08 -0400", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\ts6FJ8J4l006403; Tue, 15 Jul 2014 15:08:19 -0400", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\ts6FJ8JXb006402; Tue, 15 Jul 2014 15:08:19 -0400" ], "Date": "Tue, 15 Jul 2014 15:08:19 -0400", "From": "\"John W. 
Linville\" <linville@tuxdriver.com>", "To": "\"Zhou, Danny\" <danny.zhou@intel.com>", "Message-ID": "<20140715190818.GD26012@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>\n\t<20140715121743.GA14273@localhost.localdomain>\n\t<20140715140111.GA26012@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8AF52C@SHSMSX103.ccr.corp.intel.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<DFDF335405C17848924A094BC35766CF0A8AF52C@SHSMSX103.ccr.corp.intel.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 19:14:23 -0000" }, "addressed": null }, { "id": 120, "web_url": "http://patches.dpdk.org/comment/120/", "msgid": "<20140715203108.GA20273@localhost.localdomain>", "list_archive_url": "https://inbox.dpdk.org/dev/20140715203108.GA20273@localhost.localdomain", "date": "2014-07-15T20:31:08", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" 
}, "content": "On Tue, Jul 15, 2014 at 10:01:11AM -0400, John W. Linville wrote:\n> On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:\n> > On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> > > According to my performance measurement results for 64B small\n> > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs. 0.93M\n> > > pps) which make sense to me as for 16 queues case more CPU cycles (16\n> > > queues' 87% vs. 1 queue' 80%) in kernel land needed for NAPI-enabled\n> > > ixgbe driver to switch between polling and interrupt modes in order\n> > > to service per-queue rx interrupts, so more context switch overhead\n> > > involved. Also, since the eth_packet_rx/eth_packet_tx routines involves\n> > > in two memory copies between DPDK mbuf and pbuf for each packet,\n> > > it can hardly achieve high performance unless packet are directly\n> > > DMA to mbuf which needs ixgbe driver to support.\n> > \n> > I thought 16 queues would be spread out between as many cpus as you had though,\n> > obviating the need for context switches, no?\n> \n> I think Danny is testing the single CPU case. Having more queues\n> than CPUs probably does not provide any benefit.\n> \nAh, yes, generally speaking, you never want nr_cpus < nr_queues. Otherwise\nyou'll just be fighting yourself.\n\n> It would be cool to hack the DPDK memory management to work directly\n> out of the mmap'ed AF_PACKET buffers. But at this point I don't\n> have enough knowledge of DPDK internals to know if that is at all\n> reasonable...\n> \n> John\n> \n> P.S. Danny, have you run any performance tests on the PCAP driver?\n> \n> -- \n> John W. Linville\t\tSomeday the world will need a hero, and you\n> linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.\n>", "headers": { "Return-Path": "<nhorman@tuxdriver.com>", "Received": [ "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id F04F11FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 22:30:33 +0200 (CEST)", "from [209.188.62.162] (helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1X79Na-0008Gi-Jf; Tue, 15 Jul 2014 16:31:20 -0400" ], "Date": "Tue, 15 Jul 2014 16:31:08 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "\"John W. Linville\" <linville@tuxdriver.com>", "Message-ID": "<20140715203108.GA20273@localhost.localdomain>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>\n\t<20140715121743.GA14273@localhost.localdomain>\n\t<20140715140111.GA26012@tuxdriver.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20140715140111.GA26012@tuxdriver.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 20:30:34 -0000" }, "addressed": null }, { "id": 
121, "web_url": "http://patches.dpdk.org/comment/121/", "msgid": "<DFDF335405C17848924A094BC35766CF0A8B9541@SHSMSX104.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A8B9541@SHSMSX104.ccr.corp.intel.com", "date": "2014-07-15T20:41:23", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "> -----Original Message-----\n> From: Neil Horman [mailto:nhorman@tuxdriver.com]\n> Sent: Wednesday, July 16, 2014 4:31 AM\n> To: John W. Linville\n> Cc: Zhou, Danny; dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n> AF_PACKET-based virtual devices\n> \n> On Tue, Jul 15, 2014 at 10:01:11AM -0400, John W. Linville wrote:\n> > On Tue, Jul 15, 2014 at 08:17:44AM -0400, Neil Horman wrote:\n> > > On Tue, Jul 15, 2014 at 12:15:49AM +0000, Zhou, Danny wrote:\n> > > > According to my performance measurement results for 64B small\n> > > > packet, 1 queue perf. is better than 16 queues (1.35M pps vs.\n> > > > 0.93M\n> > > > pps) which make sense to me as for 16 queues case more CPU cycles\n> > > > (16 queues' 87% vs. 1 queue' 80%) in kernel land needed for\n> > > > NAPI-enabled ixgbe driver to switch between polling and interrupt\n> > > > modes in order to service per-queue rx interrupts, so more context\n> > > > switch overhead involved. 
Also, since the\n> > > > eth_packet_rx/eth_packet_tx routines involves in two memory copies\n> > > > between DPDK mbuf and pbuf for each packet, it can hardly achieve\n> > > > high performance unless packet are directly DMA to mbuf which needs ixgbe\n> driver to support.\n> > >\n> > > I thought 16 queues would be spread out between as many cpus as you\n> > > had though, obviating the need for context switches, no?\n> >\n> > I think Danny is testing the single CPU case. Having more queues than\n> > CPUs probably does not provide any benefit.\n> >\n> Ah, yes, generally speaking, you never want nr_cpus < nr_queues. Otherwise you'll\n> just be fighting yourself.\n> \n\nIt is true for interrupt based NIC driver and this AF_PACKET based PMD because it depends \non kernel NIC driver. But for poll-mode based DPDK native NIC driver, you can have a cpu pinning to\nto a core polling multiple queues on a NIC or queues on different NICs, at the cost of more\npower consumption or wasted CPU cycles busying waiting packets.\n\n> > It would be cool to hack the DPDK memory management to work directly\n> > out of the mmap'ed AF_PACKET buffers. But at this point I don't have\n> > enough knowledge of DPDK internals to know if that is at all\n> > reasonable...\n> >\n> > John\n> >\n> > P.S. Danny, have you run any performance tests on the PCAP driver?\n> >\n> > --\n> > John W. Linville\t\tSomeday the world will need a hero, and you\n> > linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.\n> >", "headers": { "Return-Path": "<danny.zhou@intel.com>", "Received": [ "from mga11.intel.com (mga11.intel.com [192.55.52.93])\n\tby dpdk.org (Postfix) with ESMTP id E45B21FE\n\tfor <dev@dpdk.org>; Tue, 15 Jul 2014 22:41:07 +0200 (CEST)", "from fmsmga001.fm.intel.com ([10.253.24.23])\n\tby fmsmga102.fm.intel.com with ESMTP; 15 Jul 2014 13:41:54 -0700", "from fmsmsx103.amr.corp.intel.com ([10.19.9.34])\n\tby fmsmga001.fm.intel.com with ESMTP; 15 Jul 2014 13:41:24 -0700", "from fmsmsx152.amr.corp.intel.com (10.19.17.221) by\n\tFMSMSX103.amr.corp.intel.com (10.19.9.34) with Microsoft SMTP Server\n\t(TLS) id 14.3.123.3; Tue, 15 Jul 2014 13:41:24 -0700", "from shsmsx151.ccr.corp.intel.com (10.239.6.50) by\n\tfmsmsx152.amr.corp.intel.com (10.19.17.221) with Microsoft SMTP\n\tServer (TLS) id 14.3.123.3; Tue, 15 Jul 2014 13:41:24 -0700", "from shsmsx104.ccr.corp.intel.com ([169.254.5.204]) by\n\tSHSMSX151.ccr.corp.intel.com ([169.254.3.188]) with mapi id\n\t14.03.0123.003; Wed, 16 Jul 2014 04:41:23 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.01,668,1400050800\"; d=\"scan'208\";a=\"562249315\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "Neil Horman <nhorman@tuxdriver.com>, \"John W. 
Linville\"\n\t<linville@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6JugPv4wgABJUgCAABzngIAAbPMAgACHFeA=", "Date": "Tue, 15 Jul 2014 20:41:23 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A8B9541@SHSMSX104.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A8A4F83@SHSMSX104.ccr.corp.intel.com>\n\t<20140715121743.GA14273@localhost.localdomain>\n\t<20140715140111.GA26012@tuxdriver.com>\n\t<20140715203108.GA20273@localhost.localdomain>", "In-Reply-To": "<20140715203108.GA20273@localhost.localdomain>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "X-List-Received-Date": "Tue, 15 Jul 2014 20:41:08 -0000" }, "addressed": null }, { "id": 762, "web_url": "http://patches.dpdk.org/comment/762/", "msgid": "<20140912180523.GB7145@tuxdriver.com>", "list_archive_url": 
"https://inbox.dpdk.org/dev/20140912180523.GB7145@tuxdriver.com", "date": "2014-09-12T18:05:23", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": "http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. Linville", "email": "linville@tuxdriver.com" }, "content": "Ping? Are there objections to this patch from mid-July?\n\nJohn\n\nOn Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> socket. This implementation uses mmap'ed ring buffers to limit copying\n> and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> AF_PACKET is used for frame reception. In the current implementation,\n> Tx and Rx queues are always paired, and therefore are always equal\n> in number -- changing this would be a Simple Matter Of Programming.\n> \n> Interfaces of this type are created with a command line option like\n> \"--vdev=eth_packet0,iface=...\". There are a number of options availabe\n> as arguments:\n> \n> - Interface is chosen by \"iface\" (required)\n> - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> \n> Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> ---\n> This PMD is intended to provide a means for using DPDK on a broad\n> range of hardware without hardware-specific PMDs and (hopefully)\n> with better performance than what PCAP offers in Linux. 
This might\n> be useful as a development platform for DPDK applications when\n> DPDK-supported hardware is expensive or unavailable.\n> \n> New in v2:\n> \n> -- fixup some style issues found by check patch\n> -- use if_index as part of fanout group ID\n> -- set default number of queue pairs to 1\n> \n> config/common_bsdapp | 5 +\n> config/common_linuxapp | 5 +\n> lib/Makefile | 1 +\n> lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> lib/librte_pmd_packet/Makefile | 60 +++\n> lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> mk/rte.app.mk | 4 +\n> 8 files changed, 957 insertions(+)\n> create mode 100644 lib/librte_pmd_packet/Makefile\n> create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> \n> diff --git a/config/common_bsdapp b/config/common_bsdapp\n> index 943dce8f1ede..c317f031278e 100644\n> --- a/config/common_bsdapp\n> +++ b/config/common_bsdapp\n> @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> CONFIG_RTE_LIBRTE_PMD_BOND=y\n> \n> #\n> +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> +#\n> +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> +\n> +#\n> # Do prefetch of packet data within PMD driver receive function\n> #\n> CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> diff --git a/config/common_linuxapp b/config/common_linuxapp\n> index 7bf5d80d4e26..f9e7bc3015ec 100644\n> --- a/config/common_linuxapp\n> +++ b/config/common_linuxapp\n> @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> CONFIG_RTE_LIBRTE_PMD_BOND=y\n> \n> #\n> +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> +#\n> +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> +\n> +#\n> # Compile Xen PMD\n> #\n> CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> diff --git a/lib/Makefile b/lib/Makefile\n> index 10c5bb3045bc..930fadf29898 100644\n> --- a/lib/Makefile\n> +++ b/lib/Makefile\n> @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> 
DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> index 756d6b0c9301..feed24a63272 100644\n> --- a/lib/librte_eal/linuxapp/eal/Makefile\n> +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> CFLAGS += $(WERROR_FLAGS) -O3\n> \n> diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> new file mode 100644\n> index 000000000000..e1266fb992cd\n> --- /dev/null\n> +++ b/lib/librte_pmd_packet/Makefile\n> @@ -0,0 +1,60 @@\n> +# BSD LICENSE\n> +#\n> +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> +# Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> +# Copyright(c) 2014 6WIND S.A.\n> +# All rights reserved.\n> +#\n> +# Redistribution and use in source and binary forms, with or without\n> +# modification, are permitted provided that the following conditions\n> +# are met:\n> +#\n> +# * Redistributions of source code must retain the above copyright\n> +# notice, this list of conditions and the following disclaimer.\n> +# * Redistributions in binary form must reproduce the above copyright\n> +# notice, this list of conditions and the following disclaimer in\n> +# the documentation and/or other materials provided with the\n> +# distribution.\n> +# * Neither the name of Intel Corporation nor the names of its\n> +# contributors may be used to endorse or promote products derived\n> +# from this software without specific prior written permission.\n> +#\n> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> +\n> +include $(RTE_SDK)/mk/rte.vars.mk\n> +\n> +#\n> +# library name\n> +#\n> +LIB = librte_pmd_packet.a\n> +\n> +CFLAGS += -O3\n> +CFLAGS += $(WERROR_FLAGS)\n> +\n> +#\n> +# all source are stored in SRCS-y\n> +#\n> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> +\n> +#\n> +# Export include files\n> +#\n> +SYMLINK-y-include += rte_eth_packet.h\n> +\n> +# this lib depends upon:\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> +\n> +include $(RTE_SDK)/mk/rte.lib.mk\n> diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> new file mode 100644\n> index 000000000000..9c82d16e730f\n> --- /dev/null\n> +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> @@ -0,0 +1,826 @@\n> +/*-\n> + * BSD LICENSE\n> + *\n> + * Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>\n> + *\n> + * Originally based upon librte_pmd_pcap code:\n> + *\n> + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> + * Copyright(c) 2014 6WIND S.A.\n> + * All rights reserved.\n> + *\n> + * Redistribution and use in source and binary forms, with or without\n> + * modification, are permitted provided that the following conditions\n> + * are met:\n> + *\n> + * * Redistributions of source code must retain the above copyright\n> + * notice, this list of conditions and the following disclaimer.\n> + * * Redistributions in binary form must reproduce the above copyright\n> + * notice, this list of conditions and the following disclaimer in\n> + * the documentation and/or other materials provided with the\n> + * distribution.\n> + * * Neither the name of Intel Corporation nor the names of its\n> + * contributors may be used to endorse or promote products derived\n> + * from this software without specific prior written permission.\n> + *\n> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> + */\n> +\n> +#include <rte_mbuf.h>\n> +#include <rte_ethdev.h>\n> +#include <rte_malloc.h>\n> +#include <rte_kvargs.h>\n> +#include <rte_dev.h>\n> +\n> +#include <linux/if_ether.h>\n> +#include <linux/if_packet.h>\n> +#include <arpa/inet.h>\n> +#include <net/if.h>\n> +#include <sys/types.h>\n> +#include <sys/socket.h>\n> +#include <sys/ioctl.h>\n> +#include <sys/mman.h>\n> +#include <unistd.h>\n> +#include <poll.h>\n> +\n> +#include \"rte_eth_packet.h\"\n> +\n> +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> +\n> +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> +#define DFLT_FRAME_COUNT\t(1 << 9)\n> +\n> +struct pkt_rx_queue {\n> +\tint sockfd;\n> +\n> +\tstruct iovec *rd;\n> +\tuint8_t *map;\n> +\tunsigned int framecount;\n> +\tunsigned int framenum;\n> +\n> +\tstruct rte_mempool *mb_pool;\n> +\n> +\tvolatile unsigned long rx_pkts;\n> +\tvolatile unsigned long err_pkts;\n> +};\n> +\n> +struct pkt_tx_queue {\n> +\tint sockfd;\n> +\n> +\tstruct iovec *rd;\n> +\tuint8_t *map;\n> +\tunsigned int framecount;\n> +\tunsigned int framenum;\n> +\n> +\tvolatile unsigned long tx_pkts;\n> +\tvolatile unsigned long err_pkts;\n> +};\n> +\n> +struct pmd_internals {\n> +\tunsigned nb_queues;\n> 
+\n> +\tint if_index;\n> +\tstruct ether_addr eth_addr;\n> +\n> +\tstruct tpacket_req req;\n> +\n> +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> +};\n> +\n> +static const char *valid_arguments[] = {\n> +\tETH_PACKET_IFACE_ARG,\n> +\tETH_PACKET_NUM_Q_ARG,\n> +\tETH_PACKET_BLOCKSIZE_ARG,\n> +\tETH_PACKET_FRAMESIZE_ARG,\n> +\tETH_PACKET_FRAMECOUNT_ARG,\n> +\tNULL\n> +};\n> +\n> +static const char *drivername = \"AF_PACKET PMD\";\n> +\n> +static struct rte_eth_link pmd_link = {\n> +\t.link_speed = 10000,\n> +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> +\t.link_status = 0\n> +};\n> +\n> +static uint16_t\n> +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> +{\n> +\tunsigned i;\n> +\tstruct tpacket2_hdr *ppd;\n> +\tstruct rte_mbuf *mbuf;\n> +\tuint8_t *pbuf;\n> +\tstruct pkt_rx_queue *pkt_q = queue;\n> +\tuint16_t num_rx = 0;\n> +\tunsigned int framecount, framenum;\n> +\n> +\tif (unlikely(nb_pkts == 0))\n> +\t\treturn 0;\n> +\n> +\t/*\n> +\t * Reads the given number of packets from the AF_PACKET socket one by\n> +\t * one and copies the packet data into a newly allocated mbuf.\n> +\t */\n> +\tframecount = pkt_q->framecount;\n> +\tframenum = pkt_q->framenum;\n> +\tfor (i = 0; i < nb_pkts; i++) {\n> +\t\t/* point at the next incoming frame */\n> +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> +\t\t\tbreak;\n> +\n> +\t\t/* allocate the next mbuf */\n> +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> +\t\tif (unlikely(mbuf == NULL))\n> +\t\t\tbreak;\n> +\n> +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> +\n> +\t\t/* release incoming frame and advance ring buffer */\n> +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> +\t\tif 
(++framenum >= framecount)\n> +\t\t\tframenum = 0;\n> +\n> +\t\t/* account for the receive frame */\n> +\t\tbufs[i] = mbuf;\n> +\t\tnum_rx++;\n> +\t}\n> +\tpkt_q->framenum = framenum;\n> +\tpkt_q->rx_pkts += num_rx;\n> +\treturn num_rx;\n> +}\n> +\n> +/*\n> + * Callback to handle sending packets through a real NIC.\n> + */\n> +static uint16_t\n> +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> +{\n> +\tstruct tpacket2_hdr *ppd;\n> +\tstruct rte_mbuf *mbuf;\n> +\tuint8_t *pbuf;\n> +\tunsigned int framecount, framenum;\n> +\tstruct pollfd pfd;\n> +\tstruct pkt_tx_queue *pkt_q = queue;\n> +\tuint16_t num_tx = 0;\n> +\tint i;\n> +\n> +\tif (unlikely(nb_pkts == 0))\n> +\t\treturn 0;\n> +\n> +\tmemset(&pfd, 0, sizeof(pfd));\n> +\tpfd.fd = pkt_q->sockfd;\n> +\tpfd.events = POLLOUT;\n> +\tpfd.revents = 0;\n> +\n> +\tframecount = pkt_q->framecount;\n> +\tframenum = pkt_q->framenum;\n> +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> +\tfor (i = 0; i < nb_pkts; i++) {\n> +\t\t/* point at the next incoming frame */\n> +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> +\t\t (poll(&pfd, 1, -1) < 0))\n> +\t\t\t\tcontinue;\n> +\n> +\t\t/* copy the tx frame data */\n> +\t\tmbuf = bufs[num_tx];\n> +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> +\t\t\tsizeof(struct sockaddr_ll);\n> +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> +\n> +\t\t/* release incoming frame and advance ring buffer */\n> +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> +\t\tif (++framenum >= framecount)\n> +\t\t\tframenum = 0;\n> +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> +\n> +\t\tnum_tx++;\n> +\t\trte_pktmbuf_free(mbuf);\n> +\t}\n> +\n> +\t/* kick-off transmits */\n> +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> +\n> +\tpkt_q->framenum = framenum;\n> +\tpkt_q->tx_pkts += num_tx;\n> +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> +\treturn num_tx;\n> +}\n> 
+\n> +static int\n> +eth_dev_start(struct rte_eth_dev *dev)\n> +{\n> +\tdev->data->dev_link.link_status = 1;\n> +\treturn 0;\n> +}\n> +\n> +/*\n> + * This function gets called when the current port gets stopped.\n> + */\n> +static void\n> +eth_dev_stop(struct rte_eth_dev *dev)\n> +{\n> +\tunsigned i;\n> +\tint sockfd;\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\n> +\tfor (i = 0; i < internals->nb_queues; i++) {\n> +\t\tsockfd = internals->rx_queue[i].sockfd;\n> +\t\tif (sockfd != -1)\n> +\t\t\tclose(sockfd);\n> +\t\tsockfd = internals->tx_queue[i].sockfd;\n> +\t\tif (sockfd != -1)\n> +\t\t\tclose(sockfd);\n> +\t}\n> +\n> +\tdev->data->dev_link.link_status = 0;\n> +}\n> +\n> +static int\n> +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> +{\n> +\treturn 0;\n> +}\n> +\n> +static void\n> +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> +{\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\n> +\tdev_info->driver_name = drivername;\n> +\tdev_info->if_index = internals->if_index;\n> +\tdev_info->max_mac_addrs = 1;\n> +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> +\tdev_info->min_rx_bufsize = 0;\n> +\tdev_info->pci_dev = NULL;\n> +}\n> +\n> +static void\n> +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)\n> +{\n> +\tunsigned i, imax;\n> +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> +\n> +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> +\n> +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> +\tfor (i = 0; i < imax; i++) {\n> +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> +\t\trx_total += igb_stats->q_ipackets[i];\n> +\t}\n> +\n> +\timax = 
(internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> +\tfor (i = 0; i < imax; i++) {\n> +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> +\t\ttx_total += igb_stats->q_opackets[i];\n> +\t\ttx_err_total += igb_stats->q_errors[i];\n> +\t}\n> +\n> +\tigb_stats->ipackets = rx_total;\n> +\tigb_stats->opackets = tx_total;\n> +\tigb_stats->oerrors = tx_err_total;\n> +}\n> +\n> +static void\n> +eth_stats_reset(struct rte_eth_dev *dev)\n> +{\n> +\tunsigned i;\n> +\tstruct pmd_internals *internal = dev->data->dev_private;\n> +\n> +\tfor (i = 0; i < internal->nb_queues; i++)\n> +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> +\n> +\tfor (i = 0; i < internal->nb_queues; i++) {\n> +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> +\t\tinternal->tx_queue[i].err_pkts = 0;\n> +\t}\n> +}\n> +\n> +static void\n> +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> +{\n> +}\n> +\n> +static void\n> +eth_queue_release(void *q __rte_unused)\n> +{\n> +}\n> +\n> +static int\n> +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> + int wait_to_complete __rte_unused)\n> +{\n> +\treturn 0;\n> +}\n> +\n> +static int\n> +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> + uint16_t rx_queue_id,\n> + uint16_t nb_rx_desc __rte_unused,\n> + unsigned int socket_id __rte_unused,\n> + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> + struct rte_mempool *mb_pool)\n> +{\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> +\tuint16_t buf_size;\n> +\n> +\tpkt_q->mb_pool = mb_pool;\n> +\n> +\t/* Now get the space available for data in the mbuf */\n> +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> +\t RTE_PKTMBUF_HEADROOM);\n> +\n> +\tif (ETH_FRAME_LEN > buf_size) {\n> 
+\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> +\t\treturn -ENOMEM;\n> +\t}\n> +\n> +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> +\n> +\treturn 0;\n> +}\n> +\n> +static int\n> +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> + uint16_t tx_queue_id,\n> + uint16_t nb_tx_desc __rte_unused,\n> + unsigned int socket_id __rte_unused,\n> + const struct rte_eth_txconf *tx_conf __rte_unused)\n> +{\n> +\n> +\tstruct pmd_internals *internals = dev->data->dev_private;\n> +\n> +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> +\treturn 0;\n> +}\n> +\n> +static struct eth_dev_ops ops = {\n> +\t.dev_start = eth_dev_start,\n> +\t.dev_stop = eth_dev_stop,\n> +\t.dev_close = eth_dev_close,\n> +\t.dev_configure = eth_dev_configure,\n> +\t.dev_infos_get = eth_dev_info,\n> +\t.rx_queue_setup = eth_rx_queue_setup,\n> +\t.tx_queue_setup = eth_tx_queue_setup,\n> +\t.rx_queue_release = eth_queue_release,\n> +\t.tx_queue_release = eth_queue_release,\n> +\t.link_update = eth_link_update,\n> +\t.stats_get = eth_stats_get,\n> +\t.stats_reset = eth_stats_reset,\n> +};\n> +\n> +/*\n> + * Opens an AF_PACKET socket\n> + */\n> +static int\n> +open_packet_iface(const char *key __rte_unused,\n> + const char *value __rte_unused,\n> + void *extra_args)\n> +{\n> +\tint *sockfd = extra_args;\n> +\n> +\t/* Open an AF_PACKET socket... 
*/\n> +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> +\tif (*sockfd == -1) {\n> +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> +\t\treturn -1;\n> +\t}\n> +\n> +\treturn 0;\n> +}\n> +\n> +static int\n> +rte_pmd_init_internals(const char *name,\n> + const int sockfd,\n> + const unsigned nb_queues,\n> + unsigned int blocksize,\n> + unsigned int blockcnt,\n> + unsigned int framesize,\n> + unsigned int framecnt,\n> + const unsigned numa_node,\n> + struct pmd_internals **internals,\n> + struct rte_eth_dev **eth_dev,\n> + struct rte_kvargs *kvlist)\n> +{\n> +\tstruct rte_eth_dev_data *data = NULL;\n> +\tstruct rte_pci_device *pci_dev = NULL;\n> +\tstruct rte_kvargs_pair *pair = NULL;\n> +\tstruct ifreq ifr;\n> +\tsize_t ifnamelen;\n> +\tunsigned k_idx;\n> +\tstruct sockaddr_ll sockaddr;\n> +\tstruct tpacket_req *req;\n> +\tstruct pkt_rx_queue *rx_queue;\n> +\tstruct pkt_tx_queue *tx_queue;\n> +\tint rc, tpver, discard, bypass;\n> +\tunsigned int i, q, rdsize;\n> +\tint qsockfd, fanout_arg;\n> +\n> +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> +\t\tpair = &kvlist->pairs[k_idx];\n> +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> +\t\t\tbreak;\n> +\t}\n> +\tif (pair == NULL) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> +\t\t name);\n> +\t\tgoto error;\n> +\t}\n> +\n> +\tRTE_LOG(INFO, PMD,\n> +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> +\t\tname, numa_node);\n> +\n> +\t/*\n> +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> +\t * and internal (private) data\n> +\t */\n> +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> +\tif (data == NULL)\n> +\t\tgoto error;\n> +\n> +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> +\tif (pci_dev == NULL)\n> +\t\tgoto error;\n> +\n> +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> +\t 0, numa_node);\n> +\tif (*internals 
== NULL)\n> +\t\tgoto error;\n> +\n> +\treq = &((*internals)->req);\n> +\n> +\treq->tp_block_size = blocksize;\n> +\treq->tp_block_nr = blockcnt;\n> +\treq->tp_frame_size = framesize;\n> +\treq->tp_frame_nr = framecnt;\n> +\n> +\tifnamelen = strlen(pair->value);\n> +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> +\t} else {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> +\t\t\tname, pair->value);\n> +\t\tgoto error;\n> +\t}\n> +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> +\t\t name);\n> +\t\tgoto error;\n> +\t}\n> +\t(*internals)->if_index = ifr.ifr_ifindex;\n> +\n> +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> +\t\t name);\n> +\t\tgoto error;\n> +\t}\n> +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> +\n> +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> +\tsockaddr.sll_family = AF_PACKET;\n> +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> +\n> +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> +\n> +\tfor (q = 0; q < nb_queues; q++) {\n> +\t\t/* Open an AF_PACKET socket for this queue... 
*/\n> +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> +\t\tif (qsockfd == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> +\t\t\t name);\n> +\t\t\treturn -1;\n> +\t\t}\n> +\n> +\t\ttpver = TPACKET_V2;\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> +\t\t\t\t&tpver, sizeof(tpver));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\tdiscard = 1;\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> +\t\t\t\t&discard, sizeof(discard));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\tbypass = 1;\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> +\t\t\t\t&bypass, sizeof(bypass));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> +\t\t\t pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trx_queue = &((*internals)->rx_queue[q]);\n> +\t\trx_queue->framecount = req->tp_frame_nr;\n> +\n> +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> +\t\t\t\t 
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> +\t\t\t\t qsockfd, 0);\n> +\t\tif (rx_queue->map == MAP_FAILED) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> +\t\t\t\tname, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\t/* rdsize is same for both Tx and Rx */\n> +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> +\n> +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> +\t\t}\n> +\t\trx_queue->sockfd = qsockfd;\n> +\n> +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> +\t\ttx_queue->framecount = req->tp_frame_nr;\n> +\n> +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> +\n> +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> +\t\t}\n> +\t\ttx_queue->sockfd = qsockfd;\n> +\n> +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> +\t\t\t name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\n> +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> +\t\tif (rc == -1) {\n> +\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> +\t\t\t\t\"for %s\\n\", name, pair->value);\n> +\t\t\tgoto error;\n> +\t\t}\n> +\t}\n> +\n> +\t/* reserve an ethdev entry */\n> +\t*eth_dev = rte_eth_dev_allocate(name);\n> +\tif (*eth_dev == NULL)\n> +\t\tgoto error;\n> +\n> +\t/*\n> +\t * now put it all together\n> +\t * - store queue data in internals,\n> +\t * - store 
numa_node info in pci_driver\n> +\t * - point eth_dev_data to internals and pci_driver\n> +\t * - and point eth_dev structure to new eth_dev_data structure\n> +\t */\n> +\n> +\t(*internals)->nb_queues = nb_queues;\n> +\n> +\tdata->dev_private = *internals;\n> +\tdata->port_id = (*eth_dev)->data->port_id;\n> +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> +\tdata->dev_link = pmd_link;\n> +\tdata->mac_addrs = &(*internals)->eth_addr;\n> +\n> +\tpci_dev->numa_node = numa_node;\n> +\n> +\t(*eth_dev)->data = data;\n> +\t(*eth_dev)->dev_ops = &ops;\n> +\t(*eth_dev)->pci_dev = pci_dev;\n> +\n> +\treturn 0;\n> +\n> +error:\n> +\tif (data)\n> +\t\trte_free(data);\n> +\tif (pci_dev)\n> +\t\trte_free(pci_dev);\n> +\tfor (q = 0; q < nb_queues; q++) {\n> +\t\tif ((*internals)->rx_queue[q].rd)\n> +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> +\t\tif ((*internals)->tx_queue[q].rd)\n> +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> +\t}\n> +\tif (*internals)\n> +\t\trte_free(*internals);\n> +\treturn -1;\n> +}\n> +\n> +static int\n> +rte_eth_from_packet(const char *name,\n> + int const *sockfd,\n> + const unsigned numa_node,\n> + struct rte_kvargs *kvlist)\n> +{\n> +\tstruct pmd_internals *internals = NULL;\n> +\tstruct rte_eth_dev *eth_dev = NULL;\n> +\tstruct rte_kvargs_pair *pair = NULL;\n> +\tunsigned k_idx;\n> +\tunsigned int blockcount;\n> +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> +\tunsigned int qpairs = 1;\n> +\n> +\t/* do some parameter checking */\n> +\tif (*sockfd < 0)\n> +\t\treturn -1;\n> +\n> +\t/*\n> +\t * Walk arguments for configurable settings\n> +\t */\n> +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> +\t\tpair = &kvlist->pairs[k_idx];\n> +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> +\t\t\tqpairs = atoi(pair->value);\n> +\t\t\tif (qpairs < 1 ||\n> +\t\t\t qpairs > 
RTE_PMD_PACKET_MAX_RINGS) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> +\t\t\tblocksize = atoi(pair->value);\n> +\t\t\tif (!blocksize) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> +\t\t\tframesize = atoi(pair->value);\n> +\t\t\tif (!framesize) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> +\t\t\tframecount = atoi(pair->value);\n> +\t\t\tif (!framecount) {\n> +\t\t\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> +\t\t\t\t name);\n> +\t\t\t\treturn -1;\n> +\t\t\t}\n> +\t\t\tcontinue;\n> +\t\t}\n> +\t}\n> +\n> +\tif (framesize > blocksize) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> +\t\t name);\n> +\t\treturn -1;\n> +\t}\n> +\n> +\tblockcount = framecount / (blocksize / framesize);\n> +\tif (!blockcount) {\n> +\t\tRTE_LOG(ERR, PMD,\n> +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> +\t\treturn -1;\n> +\t}\n> +\n> +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> +\n> +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> +\t blocksize, blockcount,\n> +\t framesize, framecount,\n> +\t numa_node, &internals, 
&eth_dev,\n> +\t kvlist) < 0)\n> +\t\treturn -1;\n> +\n> +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> +\n> +\treturn 0;\n> +}\n> +\n> +int\n> +rte_pmd_packet_devinit(const char *name, const char *params)\n> +{\n> +\tunsigned numa_node;\n> +\tint ret;\n> +\tstruct rte_kvargs *kvlist;\n> +\tint sockfd = -1;\n> +\n> +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> +\n> +\tnuma_node = rte_socket_id();\n> +\n> +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> +\tif (kvlist == NULL)\n> +\t\treturn -1;\n> +\n> +\t/*\n> +\t * If iface argument is passed we open the NICs and use them for\n> +\t * reading / writing\n> +\t */\n> +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> +\n> +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> +\t\t &open_packet_iface, &sockfd);\n> +\t\tif (ret < 0)\n> +\t\t\treturn -1;\n> +\t}\n> +\n> +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> +\tclose(sockfd); /* no longer needed */\n> +\n> +\tif (ret < 0)\n> +\t\treturn -1;\n> +\n> +\treturn 0;\n> +}\n> +\n> +static struct rte_driver pmd_packet_drv = {\n> +\t.name = \"eth_packet\",\n> +\t.type = PMD_VDEV,\n> +\t.init = rte_pmd_packet_devinit,\n> +};\n> +\n> +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> new file mode 100644\n> index 000000000000..f685611da3e9\n> --- /dev/null\n> +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> @@ -0,0 +1,55 @@\n> +/*-\n> + * BSD LICENSE\n> + *\n> + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> + * All rights reserved.\n> + *\n> + * Redistribution and use in source and binary forms, with or without\n> + * modification, are permitted provided that the following conditions\n> + * are met:\n> + *\n> + * * Redistributions of source code must retain the above copyright\n> + * notice, this list of conditions and the following disclaimer.\n> + * * Redistributions in binary form must reproduce the above copyright\n> + * notice, this list of conditions and the following disclaimer in\n> + * the documentation and/or other materials provided with the\n> + * distribution.\n> + * * Neither the name of Intel Corporation nor the names of its\n> + * contributors may be used to endorse or promote products derived\n> + * from this software without specific prior written permission.\n> + *\n> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\n> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> + */\n> +\n> +#ifndef _RTE_ETH_PACKET_H_\n> +#define _RTE_ETH_PACKET_H_\n> +\n> +#ifdef __cplusplus\n> +extern \"C\" {\n> +#endif\n> +\n> +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> +\n> +#define RTE_PMD_PACKET_MAX_RINGS 16\n> +\n> +/**\n> + * For use by the EAL only. 
Called as part of EAL init to set up any dummy NICs\n> + * configured on command line.\n> + */\n> +int rte_pmd_packet_devinit(const char *name, const char *params);\n> +\n> +#ifdef __cplusplus\n> +}\n> +#endif\n> +\n> +#endif\n> diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> index 34dff2a02a05..a6994c4dbe93 100644\n> --- a/mk/rte.app.mk\n> +++ b/mk/rte.app.mk\n> @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> LDLIBS += -lrte_pmd_pcap -lpcap\n> endif\n> \n> +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> +LDLIBS += -lrte_pmd_packet\n> +endif\n> +\n> endif # plugins\n> \n> LDLIBS += $(EXECENV_LDLIBS)\n> -- \n> 1.9.3\n> \n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id BE91B6AB7;\n\tFri, 12 Sep 2014 20:09:53 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 1EE6E68B7\n\tfor <dev@dpdk.org>; Fri, 12 Sep 2014 20:09:51 +0200 (CEST)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>) id 1XSVNF-0005Ja-5i\n\tfor dev@dpdk.org; Fri, 12 Sep 2014 14:15:09 -0400", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\ts8CI5OEx015933 for <dev@dpdk.org>; Fri, 12 Sep 2014 14:05:24 -0400", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\ts8CI5Nx0015932 for dev@dpdk.org; Fri, 12 Sep 2014 14:05:23 -0400" ], "Date": "Fri, 12 Sep 2014 14:05:23 -0400", "From": "\"John W. 
Linville\" <linville@tuxdriver.com>", "To": "dev@dpdk.org", "Message-ID": "<20140912180523.GB7145@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<1405362290-6753-1-git-send-email-linville@tuxdriver.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 763, "web_url": "http://patches.dpdk.org/comment/763/", "msgid": "<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com", "date": "2014-09-12T18:31:08", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "I am concerned about its performance caused by too many memcpy(). 
Specifically, on the Rx side, the kernel NIC driver needs to copy packets to an skb, then af_packet copies the packets to the AF_PACKET buffer, which is mapped to user space, and then those packets are copied to a DPDK mbuf. In addition, 3 copies are needed on the Tx side. So to run a simple DPDK L2/L3 forwarding benchmark, each packet needs 6 copies, which has a significant negative impact on performance. We had a bifurcated driver prototype that can do zero-copy and achieve native DPDK performance, but it depends on base driver and AF_PACKET code changes in the kernel; John R will be presenting it at the coming Linux Plumbers Conference. Once the kernel adopts it, the relevant PMD will be submitted to dpdk.org.\n\n> -----Original Message-----\n> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville\n> Sent: Saturday, September 13, 2014 2:05 AM\n> To: dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> \n> Ping? Are there objections to this patch from mid-July?\n> \n> John\n> \n> On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > socket. This implementation uses mmap'ed ring buffers to limit copying\n> > and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> > AF_PACKET is used for frame reception. In the current implementation,\n> > Tx and Rx queues are always paired, and therefore are always equal\n> > in number -- changing this would be a Simple Matter Of Programming.\n> >\n> > Interfaces of this type are created with a command line option like\n> > \"--vdev=eth_packet0,iface=...\". 
There are a number of options available\n> > as arguments:\n> >\n> > - Interface is chosen by \"iface\" (required)\n> > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> >\n> > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > ---\n> > This PMD is intended to provide a means for using DPDK on a broad\n> > range of hardware without hardware-specific PMDs and (hopefully)\n> > with better performance than what PCAP offers in Linux. This might\n> > be useful as a development platform for DPDK applications when\n> > DPDK-supported hardware is expensive or unavailable.\n> >\n> > New in v2:\n> >\n> > -- fixup some style issues found by check patch\n> > -- use if_index as part of fanout group ID\n> > -- set default number of queue pairs to 1\n> >\n> > config/common_bsdapp | 5 +\n> > config/common_linuxapp | 5 +\n> > lib/Makefile | 1 +\n> > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > lib/librte_pmd_packet/Makefile | 60 +++\n> > lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > mk/rte.app.mk | 4 +\n> > 8 files changed, 957 insertions(+)\n> > create mode 100644 lib/librte_pmd_packet/Makefile\n> > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> >\n> > diff --git a/config/common_bsdapp b/config/common_bsdapp\n> > index 943dce8f1ede..c317f031278e 100644\n> > --- a/config/common_bsdapp\n> > +++ b/config/common_bsdapp\n> > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> >\n> > #\n> > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > +#\n> > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > +\n> > +#\n> > # Do 
prefetch of packet data within PMD driver receive function\n> > #\n> > CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> > diff --git a/config/common_linuxapp b/config/common_linuxapp\n> > index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > --- a/config/common_linuxapp\n> > +++ b/config/common_linuxapp\n> > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> >\n> > #\n> > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > +#\n> > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > +\n> > +#\n> > # Compile Xen PMD\n> > #\n> > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > diff --git a/lib/Makefile b/lib/Makefile\n> > index 10c5bb3045bc..930fadf29898 100644\n> > --- a/lib/Makefile\n> > +++ b/lib/Makefile\n> > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > index 756d6b0c9301..feed24a63272 100644\n> > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> > CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > CFLAGS += $(WERROR_FLAGS) -O3\n> >\n> > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> > new file mode 100644\n> > index 000000000000..e1266fb992cd\n> > --- /dev/null\n> > +++ 
b/lib/librte_pmd_packet/Makefile\n> > @@ -0,0 +1,60 @@\n> > +# BSD LICENSE\n> > +#\n> > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > +# Copyright(c) 2014 6WIND S.A.\n> > +# All rights reserved.\n> > +#\n> > +# Redistribution and use in source and binary forms, with or without\n> > +# modification, are permitted provided that the following conditions\n> > +# are met:\n> > +#\n> > +# * Redistributions of source code must retain the above copyright\n> > +# notice, this list of conditions and the following disclaimer.\n> > +# * Redistributions in binary form must reproduce the above copyright\n> > +# notice, this list of conditions and the following disclaimer in\n> > +# the documentation and/or other materials provided with the\n> > +# distribution.\n> > +# * Neither the name of Intel Corporation nor the names of its\n> > +# contributors may be used to endorse or promote products derived\n> > +# from this software without specific prior written permission.\n> > +#\n> > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > +\n> > +include $(RTE_SDK)/mk/rte.vars.mk\n> > +\n> > +#\n> > +# library name\n> > +#\n> > +LIB = librte_pmd_packet.a\n> > +\n> > +CFLAGS += -O3\n> > +CFLAGS += $(WERROR_FLAGS)\n> > +\n> > +#\n> > +# all source are stored in SRCS-y\n> > +#\n> > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > +\n> > +#\n> > +# Export include files\n> > +#\n> > +SYMLINK-y-include += rte_eth_packet.h\n> > +\n> > +# this lib depends upon:\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > +\n> > +include $(RTE_SDK)/mk/rte.lib.mk\n> > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> > new file mode 100644\n> > index 000000000000..9c82d16e730f\n> > --- /dev/null\n> > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > @@ -0,0 +1,826 @@\n> > +/*-\n> > + * BSD LICENSE\n> > + *\n> > + * Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>\n> > + *\n> > + * Originally based upon librte_pmd_pcap code:\n> > + *\n> > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > + * Copyright(c) 2014 6WIND S.A.\n> > + * All rights reserved.\n> > + *\n> > + * Redistribution and use in source and binary forms, with or without\n> > + * modification, are permitted provided that the following conditions\n> > + * are met:\n> > + *\n> > + * * Redistributions of source code must retain the above copyright\n> > + * notice, this list of conditions and the following disclaimer.\n> > + * * Redistributions in binary form must reproduce the above copyright\n> > + * notice, this list of conditions and the following disclaimer in\n> > + * the documentation and/or other materials provided with the\n> > + * distribution.\n> > + * * Neither the name of Intel Corporation nor the names of its\n> > + * contributors may be used to endorse or promote products derived\n> > + * from this software without specific prior written permission.\n> > + *\n> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > + */\n> > +\n> > +#include <rte_mbuf.h>\n> > +#include <rte_ethdev.h>\n> > +#include <rte_malloc.h>\n> > +#include <rte_kvargs.h>\n> > +#include <rte_dev.h>\n> > +\n> > +#include <linux/if_ether.h>\n> > +#include <linux/if_packet.h>\n> > +#include <arpa/inet.h>\n> > +#include <net/if.h>\n> > +#include <sys/types.h>\n> > +#include <sys/socket.h>\n> > +#include <sys/ioctl.h>\n> > +#include <sys/mman.h>\n> > +#include <unistd.h>\n> > +#include <poll.h>\n> > +\n> > +#include \"rte_eth_packet.h\"\n> > +\n> > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > +\n> > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > +\n> > +struct pkt_rx_queue {\n> > +\tint sockfd;\n> > +\n> > +\tstruct iovec *rd;\n> > +\tuint8_t *map;\n> > +\tunsigned int framecount;\n> > +\tunsigned int framenum;\n> > +\n> > +\tstruct rte_mempool *mb_pool;\n> > +\n> > +\tvolatile unsigned long rx_pkts;\n> > +\tvolatile unsigned long err_pkts;\n> > +};\n> > +\n> > +struct pkt_tx_queue {\n> > +\tint sockfd;\n> > +\n> > +\tstruct iovec *rd;\n> > +\tuint8_t *map;\n> > +\tunsigned int framecount;\n> > +\tunsigned int framenum;\n> > +\n> > +\tvolatile unsigned 
long tx_pkts;\n> > +\tvolatile unsigned long err_pkts;\n> > +};\n> > +\n> > +struct pmd_internals {\n> > +\tunsigned nb_queues;\n> > +\n> > +\tint if_index;\n> > +\tstruct ether_addr eth_addr;\n> > +\n> > +\tstruct tpacket_req req;\n> > +\n> > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > +};\n> > +\n> > +static const char *valid_arguments[] = {\n> > +\tETH_PACKET_IFACE_ARG,\n> > +\tETH_PACKET_NUM_Q_ARG,\n> > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > +\tETH_PACKET_FRAMESIZE_ARG,\n> > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > +\tNULL\n> > +};\n> > +\n> > +static const char *drivername = \"AF_PACKET PMD\";\n> > +\n> > +static struct rte_eth_link pmd_link = {\n> > +\t.link_speed = 10000,\n> > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > +\t.link_status = 0\n> > +};\n> > +\n> > +static uint16_t\n> > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > +{\n> > +\tunsigned i;\n> > +\tstruct tpacket2_hdr *ppd;\n> > +\tstruct rte_mbuf *mbuf;\n> > +\tuint8_t *pbuf;\n> > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > +\tuint16_t num_rx = 0;\n> > +\tunsigned int framecount, framenum;\n> > +\n> > +\tif (unlikely(nb_pkts == 0))\n> > +\t\treturn 0;\n> > +\n> > +\t/*\n> > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > +\t * one and copies the packet data into a newly allocated mbuf.\n> > +\t */\n> > +\tframecount = pkt_q->framecount;\n> > +\tframenum = pkt_q->framenum;\n> > +\tfor (i = 0; i < nb_pkts; i++) {\n> > +\t\t/* point at the next incoming frame */\n> > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > +\t\t\tbreak;\n> > +\n> > +\t\t/* allocate the next mbuf */\n> > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > +\t\tif (unlikely(mbuf == NULL))\n> > +\t\t\tbreak;\n> > +\n> > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > +\t\tmbuf->pkt.pkt_len = 
mbuf->pkt.data_len = ppd->tp_snaplen;\n> > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > +\n> > +\t\t/* release incoming frame and advance ring buffer */\n> > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > +\t\tif (++framenum >= framecount)\n> > +\t\t\tframenum = 0;\n> > +\n> > +\t\t/* account for the receive frame */\n> > +\t\tbufs[i] = mbuf;\n> > +\t\tnum_rx++;\n> > +\t}\n> > +\tpkt_q->framenum = framenum;\n> > +\tpkt_q->rx_pkts += num_rx;\n> > +\treturn num_rx;\n> > +}\n> > +\n> > +/*\n> > + * Callback to handle sending packets through a real NIC.\n> > + */\n> > +static uint16_t\n> > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > +{\n> > +\tstruct tpacket2_hdr *ppd;\n> > +\tstruct rte_mbuf *mbuf;\n> > +\tuint8_t *pbuf;\n> > +\tunsigned int framecount, framenum;\n> > +\tstruct pollfd pfd;\n> > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > +\tuint16_t num_tx = 0;\n> > +\tint i;\n> > +\n> > +\tif (unlikely(nb_pkts == 0))\n> > +\t\treturn 0;\n> > +\n> > +\tmemset(&pfd, 0, sizeof(pfd));\n> > +\tpfd.fd = pkt_q->sockfd;\n> > +\tpfd.events = POLLOUT;\n> > +\tpfd.revents = 0;\n> > +\n> > +\tframecount = pkt_q->framecount;\n> > +\tframenum = pkt_q->framenum;\n> > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > +\tfor (i = 0; i < nb_pkts; i++) {\n> > +\t\t/* point at the next incoming frame */\n> > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > +\t\t (poll(&pfd, 1, -1) < 0))\n> > +\t\t\t\tcontinue;\n> > +\n> > +\t\t/* copy the tx frame data */\n> > +\t\tmbuf = bufs[num_tx];\n> > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > +\t\t\tsizeof(struct sockaddr_ll);\n> > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > +\n> > +\t\t/* release incoming frame and advance ring buffer */\n> > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > +\t\tif (++framenum >= framecount)\n> > 
+\t\t\tframenum = 0;\n> > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > +\n> > +\t\tnum_tx++;\n> > +\t\trte_pktmbuf_free(mbuf);\n> > +\t}\n> > +\n> > +\t/* kick-off transmits */\n> > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > +\n> > +\tpkt_q->framenum = framenum;\n> > +\tpkt_q->tx_pkts += num_tx;\n> > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > +\treturn num_tx;\n> > +}\n> > +\n> > +static int\n> > +eth_dev_start(struct rte_eth_dev *dev)\n> > +{\n> > +\tdev->data->dev_link.link_status = 1;\n> > +\treturn 0;\n> > +}\n> > +\n> > +/*\n> > + * This function gets called when the current port gets stopped.\n> > + */\n> > +static void\n> > +eth_dev_stop(struct rte_eth_dev *dev)\n> > +{\n> > +\tunsigned i;\n> > +\tint sockfd;\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\n> > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > +\t\tif (sockfd != -1)\n> > +\t\t\tclose(sockfd);\n> > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > +\t\tif (sockfd != -1)\n> > +\t\t\tclose(sockfd);\n> > +\t}\n> > +\n> > +\tdev->data->dev_link.link_status = 0;\n> > +}\n> > +\n> > +static int\n> > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> > +{\n> > +\treturn 0;\n> > +}\n> > +\n> > +static void\n> > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> > +{\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\n> > +\tdev_info->driver_name = drivername;\n> > +\tdev_info->if_index = internals->if_index;\n> > +\tdev_info->max_mac_addrs = 1;\n> > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > +\tdev_info->min_rx_bufsize = 0;\n> > +\tdev_info->pci_dev = NULL;\n> > +}\n> > +\n> > +static void\n> > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)\n> > +{\n> 
> +\tunsigned i, imax;\n> > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > +\n> > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > +\n> > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > +\tfor (i = 0; i < imax; i++) {\n> > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > +\t\trx_total += igb_stats->q_ipackets[i];\n> > +\t}\n> > +\n> > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > +\tfor (i = 0; i < imax; i++) {\n> > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > +\t\ttx_total += igb_stats->q_opackets[i];\n> > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > +\t}\n> > +\n> > +\tigb_stats->ipackets = rx_total;\n> > +\tigb_stats->opackets = tx_total;\n> > +\tigb_stats->oerrors = tx_err_total;\n> > +}\n> > +\n> > +static void\n> > +eth_stats_reset(struct rte_eth_dev *dev)\n> > +{\n> > +\tunsigned i;\n> > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > +\n> > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > +\n> > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > +\t}\n> > +}\n> > +\n> > +static void\n> > +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> > +{\n> > +}\n> > +\n> > +static void\n> > +eth_queue_release(void *q __rte_unused)\n> > +{\n> > +}\n> > +\n> > +static int\n> > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > + int wait_to_complete __rte_unused)\n> > +{\n> > +\treturn 0;\n> > +}\n> > +\n> > +static int\n> > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > + uint16_t rx_queue_id,\n> > + uint16_t nb_rx_desc __rte_unused,\n> 
> + unsigned int socket_id __rte_unused,\n> > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > + struct rte_mempool *mb_pool)\n> > +{\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > +\tuint16_t buf_size;\n> > +\n> > +\tpkt_q->mb_pool = mb_pool;\n> > +\n> > +\t/* Now get the space available for data in the mbuf */\n> > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > +\t RTE_PKTMBUF_HEADROOM);\n> > +\n> > +\tif (ETH_FRAME_LEN > buf_size) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > +\t\treturn -ENOMEM;\n> > +\t}\n> > +\n> > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +static int\n> > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > + uint16_t tx_queue_id,\n> > + uint16_t nb_tx_desc __rte_unused,\n> > + unsigned int socket_id __rte_unused,\n> > + const struct rte_eth_txconf *tx_conf __rte_unused)\n> > +{\n> > +\n> > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > +\n> > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > +\treturn 0;\n> > +}\n> > +\n> > +static struct eth_dev_ops ops = {\n> > +\t.dev_start = eth_dev_start,\n> > +\t.dev_stop = eth_dev_stop,\n> > +\t.dev_close = eth_dev_close,\n> > +\t.dev_configure = eth_dev_configure,\n> > +\t.dev_infos_get = eth_dev_info,\n> > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > +\t.rx_queue_release = eth_queue_release,\n> > +\t.tx_queue_release = eth_queue_release,\n> > +\t.link_update = eth_link_update,\n> > +\t.stats_get = eth_stats_get,\n> > +\t.stats_reset = eth_stats_reset,\n> > +};\n> > +\n> > +/*\n> > + * Opens an AF_PACKET socket\n> > + */\n> > 
+static int\n> > +open_packet_iface(const char *key __rte_unused,\n> > + const char *value __rte_unused,\n> > + void *extra_args)\n> > +{\n> > +\tint *sockfd = extra_args;\n> > +\n> > +\t/* Open an AF_PACKET socket... */\n> > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > +\tif (*sockfd == -1) {\n> > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > +\t\treturn -1;\n> > +\t}\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +static int\n> > +rte_pmd_init_internals(const char *name,\n> > + const int sockfd,\n> > + const unsigned nb_queues,\n> > + unsigned int blocksize,\n> > + unsigned int blockcnt,\n> > + unsigned int framesize,\n> > + unsigned int framecnt,\n> > + const unsigned numa_node,\n> > + struct pmd_internals **internals,\n> > + struct rte_eth_dev **eth_dev,\n> > + struct rte_kvargs *kvlist)\n> > +{\n> > +\tstruct rte_eth_dev_data *data = NULL;\n> > +\tstruct rte_pci_device *pci_dev = NULL;\n> > +\tstruct rte_kvargs_pair *pair = NULL;\n> > +\tstruct ifreq ifr;\n> > +\tsize_t ifnamelen;\n> > +\tunsigned k_idx;\n> > +\tstruct sockaddr_ll sockaddr;\n> > +\tstruct tpacket_req *req;\n> > +\tstruct pkt_rx_queue *rx_queue;\n> > +\tstruct pkt_tx_queue *tx_queue;\n> > +\tint rc, tpver, discard, bypass;\n> > +\tunsigned int i, q, rdsize;\n> > +\tint qsockfd, fanout_arg;\n> > +\n> > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > +\t\tpair = &kvlist->pairs[k_idx];\n> > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > +\t\t\tbreak;\n> > +\t}\n> > +\tif (pair == NULL) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > +\t\t name);\n> > +\t\tgoto error;\n> > +\t}\n> > +\n> > +\tRTE_LOG(INFO, PMD,\n> > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > +\t\tname, numa_node);\n> > +\n> > +\t/*\n> > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > +\t * and internal (private) data\n> > +\t */\n> > +\tdata = 
rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > +\tif (data == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > +\tif (pci_dev == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > +\t 0, numa_node);\n> > +\tif (*internals == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\treq = &((*internals)->req);\n> > +\n> > +\treq->tp_block_size = blocksize;\n> > +\treq->tp_block_nr = blockcnt;\n> > +\treq->tp_frame_size = framesize;\n> > +\treq->tp_frame_nr = framecnt;\n> > +\n> > +\tifnamelen = strlen(pair->value);\n> > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > +\t} else {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > +\t\t\tname, pair->value);\n> > +\t\tgoto error;\n> > +\t}\n> > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > +\t\t name);\n> > +\t\tgoto error;\n> > +\t}\n> > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > +\n> > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > +\t\t name);\n> > +\t\tgoto error;\n> > +\t}\n> > +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > +\n> > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > +\tsockaddr.sll_family = AF_PACKET;\n> > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > +\n> > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > +\n> > +\tfor (q = 0; q < nb_queues; q++) {\n> > +\t\t/* Open an AF_PACKET socket for this queue... 
*/\n> > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > +\t\tif (qsockfd == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > +\t\t\t name);\n> > +\t\t\treturn -1;\n> > +\t\t}\n> > +\n> > +\t\ttpver = TPACKET_V2;\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > +\t\t\t\t&tpver, sizeof(tpver));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\tdiscard = 1;\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > +\t\t\t\t&discard, sizeof(discard));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\tbypass = 1;\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > +\t\t\t\t&bypass, sizeof(bypass));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > +\t\t\t pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > +\t\trx_queue->framecount = 
req->tp_frame_nr;\n> > +\n> > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> > +\t\t\t\t qsockfd, 0);\n> > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > +\t\t\t\tname, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\t/* rdsize is same for both Tx and Rx */\n> > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > +\n> > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > +\t\t}\n> > +\t\trx_queue->sockfd = qsockfd;\n> > +\n> > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > +\n> > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> > +\n> > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > +\t\t}\n> > +\t\ttx_queue->sockfd = qsockfd;\n> > +\n> > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > +\t\t\t name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\n> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > +\t\tif (rc == -1) {\n> > +\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > +\t\t\tgoto error;\n> > +\t\t}\n> > +\t}\n> > +\n> > +\t/* reserve 
an ethdev entry */\n> > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > +\tif (*eth_dev == NULL)\n> > +\t\tgoto error;\n> > +\n> > +\t/*\n> > +\t * now put it all together\n> > +\t * - store queue data in internals,\n> > +\t * - store numa_node info in pci_driver\n> > +\t * - point eth_dev_data to internals and pci_driver\n> > +\t * - and point eth_dev structure to new eth_dev_data structure\n> > +\t */\n> > +\n> > +\t(*internals)->nb_queues = nb_queues;\n> > +\n> > +\tdata->dev_private = *internals;\n> > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > +\tdata->dev_link = pmd_link;\n> > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > +\n> > +\tpci_dev->numa_node = numa_node;\n> > +\n> > +\t(*eth_dev)->data = data;\n> > +\t(*eth_dev)->dev_ops = &ops;\n> > +\t(*eth_dev)->pci_dev = pci_dev;\n> > +\n> > +\treturn 0;\n> > +\n> > +error:\n> > +\tif (data)\n> > +\t\trte_free(data);\n> > +\tif (pci_dev)\n> > +\t\trte_free(pci_dev);\n> > +\tfor (q = 0; q < nb_queues; q++) {\n> > +\t\tif ((*internals)->rx_queue[q].rd)\n> > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > +\t\tif ((*internals)->tx_queue[q].rd)\n> > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > +\t}\n> > +\tif (*internals)\n> > +\t\trte_free(*internals);\n> > +\treturn -1;\n> > +}\n> > +\n> > +static int\n> > +rte_eth_from_packet(const char *name,\n> > + int const *sockfd,\n> > + const unsigned numa_node,\n> > + struct rte_kvargs *kvlist)\n> > +{\n> > +\tstruct pmd_internals *internals = NULL;\n> > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > +\tstruct rte_kvargs_pair *pair = NULL;\n> > +\tunsigned k_idx;\n> > +\tunsigned int blockcount;\n> > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > +\tunsigned int qpairs = 1;\n> > +\n> > +\t/* do some parameter checking */\n> > +\tif (*sockfd < 0)\n> > 
+\t\treturn -1;\n> > +\n> > +\t/*\n> > +\t * Walk arguments for configurable settings\n> > +\t */\n> > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > +\t\tpair = &kvlist->pairs[k_idx];\n> > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > +\t\t\tqpairs = atoi(pair->value);\n> > +\t\t\tif (qpairs < 1 ||\n> > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > +\t\t\tblocksize = atoi(pair->value);\n> > +\t\t\tif (!blocksize) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > +\t\t\tframesize = atoi(pair->value);\n> > +\t\t\tif (!framesize) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > +\t\t\tframecount = atoi(pair->value);\n> > +\t\t\tif (!framecount) {\n> > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > +\t\t\t\t name);\n> > +\t\t\t\treturn -1;\n> > +\t\t\t}\n> > +\t\t\tcontinue;\n> > +\t\t}\n> > +\t}\n> > +\n> > +\tif (framesize > blocksize) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > +\t\t name);\n> > +\t\treturn -1;\n> > +\t}\n> > +\n> > +\tblockcount = framecount / (blocksize / framesize);\n> > +\tif (!blockcount) {\n> > +\t\tRTE_LOG(ERR, PMD,\n> > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > +\t\treturn -1;\n> > +\t}\n> > +\n> > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP 
parameters:\\n\", name);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > +\n> > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > +\t blocksize, blockcount,\n> > +\t framesize, framecount,\n> > +\t numa_node, &internals, &eth_dev,\n> > +\t kvlist) < 0)\n> > +\t\treturn -1;\n> > +\n> > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +int\n> > +rte_pmd_packet_devinit(const char *name, const char *params)\n> > +{\n> > +\tunsigned numa_node;\n> > +\tint ret;\n> > +\tstruct rte_kvargs *kvlist;\n> > +\tint sockfd = -1;\n> > +\n> > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > +\n> > +\tnuma_node = rte_socket_id();\n> > +\n> > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > +\tif (kvlist == NULL)\n> > +\t\treturn -1;\n> > +\n> > +\t/*\n> > +\t * If iface argument is passed we open the NICs and use them for\n> > +\t * reading / writing\n> > +\t */\n> > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > +\n> > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > +\t\t &open_packet_iface, &sockfd);\n> > +\t\tif (ret < 0)\n> > +\t\t\treturn -1;\n> > +\t}\n> > +\n> > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > +\tclose(sockfd); /* no longer needed */\n> > +\n> > +\tif (ret < 0)\n> > +\t\treturn -1;\n> > +\n> > +\treturn 0;\n> > +}\n> > +\n> > +static struct rte_driver pmd_packet_drv = {\n> > +\t.name = \"eth_packet\",\n> > +\t.type = PMD_VDEV,\n> > +\t.init = rte_pmd_packet_devinit,\n> > +};\n> > +\n> > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> > new 
file mode 100644\n> > index 000000000000..f685611da3e9\n> > --- /dev/null\n> > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > @@ -0,0 +1,55 @@\n> > +/*-\n> > + * BSD LICENSE\n> > + *\n> > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > + * All rights reserved.\n> > + *\n> > + * Redistribution and use in source and binary forms, with or without\n> > + * modification, are permitted provided that the following conditions\n> > + * are met:\n> > + *\n> > + * * Redistributions of source code must retain the above copyright\n> > + * notice, this list of conditions and the following disclaimer.\n> > + * * Redistributions in binary form must reproduce the above copyright\n> > + * notice, this list of conditions and the following disclaimer in\n> > + * the documentation and/or other materials provided with the\n> > + * distribution.\n> > + * * Neither the name of Intel Corporation nor the names of its\n> > + * contributors may be used to endorse or promote products derived\n> > + * from this software without specific prior written permission.\n> > + *\n> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > + */\n> > +\n> > +#ifndef _RTE_ETH_PACKET_H_\n> > +#define _RTE_ETH_PACKET_H_\n> > +\n> > +#ifdef __cplusplus\n> > +extern \"C\" {\n> > +#endif\n> > +\n> > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > +\n> > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > +\n> > +/**\n> > + * For use by the EAL only. Called as part of EAL init to set up any dummy NICs\n> > + * configured on command line.\n> > + */\n> > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > +\n> > +#ifdef __cplusplus\n> > +}\n> > +#endif\n> > +\n> > +#endif\n> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> > index 34dff2a02a05..a6994c4dbe93 100644\n> > --- a/mk/rte.app.mk\n> > +++ b/mk/rte.app.mk\n> > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> > LDLIBS += -lrte_pmd_pcap -lpcap\n> > endif\n> >\n> > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > +LDLIBS += -lrte_pmd_packet\n> > +endif\n> > +\n> > endif # plugins\n> >\n> > LDLIBS += $(EXECENV_LDLIBS)\n> > --\n> > 1.9.3\n> >\n> >\n> \n> --\n> John W. Linville\t\tSomeday the world will need a hero, and you\n> linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id BA308AF7F;\n\tFri, 12 Sep 2014 20:26:04 +0200 (CEST)", "from mga09.intel.com (mga09.intel.com [134.134.136.24])\n\tby dpdk.org (Postfix) with ESMTP id B9F936AB7\n\tfor <dev@dpdk.org>; Fri, 12 Sep 2014 20:26:01 +0200 (CEST)", "from azsmga001.ch.intel.com ([10.2.17.19])\n\tby orsmga102.jf.intel.com with ESMTP; 12 Sep 2014 11:25:06 -0700", "from fmsmsx103.amr.corp.intel.com ([10.18.124.201])\n\tby azsmga001.ch.intel.com with ESMTP; 12 Sep 2014 11:31:11 -0700", "from fmsmsx156.amr.corp.intel.com (10.18.116.74) by\n\tFMSMSX103.amr.corp.intel.com (10.18.124.201) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Fri, 12 Sep 2014 11:31:11 -0700", "from shsmsx151.ccr.corp.intel.com (10.239.6.50) by\n\tfmsmsx156.amr.corp.intel.com (10.18.116.74) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Fri, 12 Sep 2014 11:31:11 -0700", "from shsmsx104.ccr.corp.intel.com ([169.254.5.230]) by\n\tSHSMSX151.ccr.corp.intel.com ([169.254.3.172]) with mapi id\n\t14.03.0195.001; Sat, 13 Sep 2014 02:31:09 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.04,514,1406617200\"; d=\"scan'208\";a=\"477178402\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "\"John W. 
Linville\" <linville@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6Jv9ov6AgACKEgA=", "Date": "Fri, 12 Sep 2014 18:31:08 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>", "In-Reply-To": "<20140912180523.GB7145@tuxdriver.com>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 764, "web_url": "http://patches.dpdk.org/comment/764/", "msgid": "<20140912185423.GD7145@tuxdriver.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20140912185423.GD7145@tuxdriver.com", "date": "2014-09-12T18:54:23", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": 
"http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. Linville", "email": "linville@tuxdriver.com" }, "content": "On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> I am concerned about its performance caused by too many\n> memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> packets to skb, then af_packet copies packets to AF_PACKET buffer\n> which are mapped to user space, and then those packets to be copied\n> to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a\n> simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> copies which brings significant negative performance impact. We\n> had a bifurcated driver prototype that can do zero-copy and achieve\n> native DPDK performance, but it depends on base driver and AF_PACKET\n> code changes in kernel, John R will be presenting it in coming Linux\n> Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> submitted to dpdk.org.\n\nAdmittedly, this is not as good a performer as most of the existing\nPMDs. It serves a different purpose, after all. FWIW, you did\npreviously indicate that it performed better than the pcap-based PMD.\n\nI look forward to seeing the changes you mention -- they sound very\nexciting. But, they will still require both networking core and\ndriver changes in the kernel. And as I understand things today,\nthe userland code will still need at least some knowledge of specific\ndevices and how they lay out their packet descriptors, etc. So while\nthose changes sound very promising, they will still have certain\ndrawbacks in common with the current situation.\n\nIt seems like the changes you mention will still need some sort of\nAF_PACKET-based PMD driver. Have you implemented that completely\nseparate from the code I already posted? Or did you add that work\non top of mine?\n\nJohn\n\n> > -----Original Message-----\n> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. 
Linville\n> > Sent: Saturday, September 13, 2014 2:05 AM\n> > To: dev@dpdk.org\n> > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > \n> > Ping? Are there objections to this patch from mid-July?\n> > \n> > John\n> > \n> > On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > > socket. This implementation uses mmap'ed ring buffers to limit copying\n> > > and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> > > AF_PACKET is used for frame reception. In the current implementation,\n> > > Tx and Rx queues are always paired, and therefore are always equal\n> > > in number -- changing this would be a Simple Matter Of Programming.\n> > >\n> > > Interfaces of this type are created with a command line option like\n> > > \"--vdev=eth_packet0,iface=...\". There are a number of options available\n> > > as arguments:\n> > >\n> > > - Interface is chosen by \"iface\" (required)\n> > > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> > >\n> > > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > > ---\n> > > This PMD is intended to provide a means for using DPDK on a broad\n> > > range of hardware without hardware-specific PMDs and (hopefully)\n> > > with better performance than what PCAP offers in Linux. 
This might\n> > > be useful as a development platform for DPDK applications when\n> > > DPDK-supported hardware is expensive or unavailable.\n> > >\n> > > New in v2:\n> > >\n> > > -- fixup some style issues found by check patch\n> > > -- use if_index as part of fanout group ID\n> > > -- set default number of queue pairs to 1\n> > >\n> > > config/common_bsdapp | 5 +\n> > > config/common_linuxapp | 5 +\n> > > lib/Makefile | 1 +\n> > > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > > lib/librte_pmd_packet/Makefile | 60 +++\n> > > lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> > > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > > mk/rte.app.mk | 4 +\n> > > 8 files changed, 957 insertions(+)\n> > > create mode 100644 lib/librte_pmd_packet/Makefile\n> > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > >\n> > > diff --git a/config/common_bsdapp b/config/common_bsdapp\n> > > index 943dce8f1ede..c317f031278e 100644\n> > > --- a/config/common_bsdapp\n> > > +++ b/config/common_bsdapp\n> > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > >\n> > > #\n> > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > +#\n> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > > +\n> > > +#\n> > > # Do prefetch of packet data within PMD driver receive function\n> > > #\n> > > CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> > > diff --git a/config/common_linuxapp b/config/common_linuxapp\n> > > index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > > --- a/config/common_linuxapp\n> > > +++ b/config/common_linuxapp\n> > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > >\n> > > #\n> > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > +#\n> > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > > +\n> > > +#\n> > > # Compile Xen PMD\n> > > #\n> > > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > > diff 
--git a/lib/Makefile b/lib/Makefile\n> > > index 10c5bb3045bc..930fadf29898 100644\n> > > --- a/lib/Makefile\n> > > +++ b/lib/Makefile\n> > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > > index 756d6b0c9301..feed24a63272 100644\n> > > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> > > CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > > CFLAGS += $(WERROR_FLAGS) -O3\n> > >\n> > > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> > > new file mode 100644\n> > > index 000000000000..e1266fb992cd\n> > > --- /dev/null\n> > > +++ b/lib/librte_pmd_packet/Makefile\n> > > @@ -0,0 +1,60 @@\n> > > +# BSD LICENSE\n> > > +#\n> > > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > > +# Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > +# Copyright(c) 2014 6WIND S.A.\n> > > +# All rights reserved.\n> > > +#\n> > > +# Redistribution and use in source and binary forms, with or without\n> > > +# modification, are permitted provided that the following conditions\n> > > +# are met:\n> > > +#\n> > > +# * Redistributions of source code must retain the above copyright\n> > > +# notice, this list of conditions and the following disclaimer.\n> > > +# * Redistributions in binary form must reproduce the above copyright\n> > > +# notice, this list of conditions and the following disclaimer in\n> > > +# the documentation and/or other materials provided with the\n> > > +# distribution.\n> > > +# * Neither the name of Intel Corporation nor the names of its\n> > > +# contributors may be used to endorse or promote products derived\n> > > +# from this software without specific prior written permission.\n> > > +#\n> > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > +\n> > > +include $(RTE_SDK)/mk/rte.vars.mk\n> > > +\n> > > +#\n> > > +# library name\n> > > +#\n> > > +LIB = librte_pmd_packet.a\n> > > +\n> > > +CFLAGS += -O3\n> > > +CFLAGS += $(WERROR_FLAGS)\n> > > +\n> > > +#\n> > > +# all source are stored in SRCS-y\n> > > +#\n> > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > > +\n> > > +#\n> > > +# Export include files\n> > > +#\n> > > +SYMLINK-y-include += rte_eth_packet.h\n> > > +\n> > > +# this lib depends upon:\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > > +\n> > > +include $(RTE_SDK)/mk/rte.lib.mk\n> > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > new file mode 100644\n> > > index 000000000000..9c82d16e730f\n> > > --- /dev/null\n> > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > @@ -0,0 +1,826 @@\n> > > +/*-\n> > > + * BSD LICENSE\n> > > + *\n> > > + * Copyright(c) 2014 John W. Linville <linville@tuxdriver.com>\n> > > + *\n> > > + * Originally based upon librte_pmd_pcap code:\n> > > + *\n> > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > + * Copyright(c) 2014 6WIND S.A.\n> > > + * All rights reserved.\n> > > + *\n> > > + * Redistribution and use in source and binary forms, with or without\n> > > + * modification, are permitted provided that the following conditions\n> > > + * are met:\n> > > + *\n> > > + * * Redistributions of source code must retain the above copyright\n> > > + * notice, this list of conditions and the following disclaimer.\n> > > + * * Redistributions in binary form must reproduce the above copyright\n> > > + * notice, this list of conditions and the following disclaimer in\n> > > + * the documentation and/or other materials provided with the\n> > > + * distribution.\n> > > + * * Neither the name of Intel Corporation nor the names of its\n> > > + * contributors may be used to endorse or promote products derived\n> > > + * from this software without specific prior written permission.\n> > > + *\n> > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > + */\n> > > +\n> > > +#include <rte_mbuf.h>\n> > > +#include <rte_ethdev.h>\n> > > +#include <rte_malloc.h>\n> > > +#include <rte_kvargs.h>\n> > > +#include <rte_dev.h>\n> > > +\n> > > +#include <linux/if_ether.h>\n> > > +#include <linux/if_packet.h>\n> > > +#include <arpa/inet.h>\n> > > +#include <net/if.h>\n> > > +#include <sys/types.h>\n> > > +#include <sys/socket.h>\n> > > +#include <sys/ioctl.h>\n> > > +#include <sys/mman.h>\n> > > +#include <unistd.h>\n> > > +#include <poll.h>\n> > > +\n> > > +#include \"rte_eth_packet.h\"\n> > > +\n> > > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > > +\n> > > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > > +\n> > > +struct pkt_rx_queue {\n> > > +\tint sockfd;\n> > > +\n> > > +\tstruct iovec *rd;\n> > > +\tuint8_t *map;\n> > > +\tunsigned int framecount;\n> > > +\tunsigned int framenum;\n> > > +\n> > > +\tstruct rte_mempool *mb_pool;\n> > > +\n> > > +\tvolatile unsigned long rx_pkts;\n> > > +\tvolatile unsigned long err_pkts;\n> > > +};\n> > > +\n> > > +struct pkt_tx_queue {\n> > > +\tint sockfd;\n> > > +\n> > > +\tstruct iovec *rd;\n> > > 
+\tuint8_t *map;\n> > > +\tunsigned int framecount;\n> > > +\tunsigned int framenum;\n> > > +\n> > > +\tvolatile unsigned long tx_pkts;\n> > > +\tvolatile unsigned long err_pkts;\n> > > +};\n> > > +\n> > > +struct pmd_internals {\n> > > +\tunsigned nb_queues;\n> > > +\n> > > +\tint if_index;\n> > > +\tstruct ether_addr eth_addr;\n> > > +\n> > > +\tstruct tpacket_req req;\n> > > +\n> > > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > +};\n> > > +\n> > > +static const char *valid_arguments[] = {\n> > > +\tETH_PACKET_IFACE_ARG,\n> > > +\tETH_PACKET_NUM_Q_ARG,\n> > > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > > +\tETH_PACKET_FRAMESIZE_ARG,\n> > > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > > +\tNULL\n> > > +};\n> > > +\n> > > +static const char *drivername = \"AF_PACKET PMD\";\n> > > +\n> > > +static struct rte_eth_link pmd_link = {\n> > > +\t.link_speed = 10000,\n> > > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > > +\t.link_status = 0\n> > > +};\n> > > +\n> > > +static uint16_t\n> > > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > +{\n> > > +\tunsigned i;\n> > > +\tstruct tpacket2_hdr *ppd;\n> > > +\tstruct rte_mbuf *mbuf;\n> > > +\tuint8_t *pbuf;\n> > > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > > +\tuint16_t num_rx = 0;\n> > > +\tunsigned int framecount, framenum;\n> > > +\n> > > +\tif (unlikely(nb_pkts == 0))\n> > > +\t\treturn 0;\n> > > +\n> > > +\t/*\n> > > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > > +\t * one and copies the packet data into a newly allocated mbuf.\n> > > +\t */\n> > > +\tframecount = pkt_q->framecount;\n> > > +\tframenum = pkt_q->framenum;\n> > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > +\t\t/* point at the next incoming frame */\n> > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > > +\t\t\tbreak;\n> > > +\n> > > +\t\t/* allocate the 
next mbuf */\n> > > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > > +\t\tif (unlikely(mbuf == NULL))\n> > > +\t\t\tbreak;\n> > > +\n> > > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > > +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > > +\n> > > +\t\t/* release incoming frame and advance ring buffer */\n> > > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > > +\t\tif (++framenum >= framecount)\n> > > +\t\t\tframenum = 0;\n> > > +\n> > > +\t\t/* account for the receive frame */\n> > > +\t\tbufs[i] = mbuf;\n> > > +\t\tnum_rx++;\n> > > +\t}\n> > > +\tpkt_q->framenum = framenum;\n> > > +\tpkt_q->rx_pkts += num_rx;\n> > > +\treturn num_rx;\n> > > +}\n> > > +\n> > > +/*\n> > > + * Callback to handle sending packets through a real NIC.\n> > > + */\n> > > +static uint16_t\n> > > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > +{\n> > > +\tstruct tpacket2_hdr *ppd;\n> > > +\tstruct rte_mbuf *mbuf;\n> > > +\tuint8_t *pbuf;\n> > > +\tunsigned int framecount, framenum;\n> > > +\tstruct pollfd pfd;\n> > > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > > +\tuint16_t num_tx = 0;\n> > > +\tint i;\n> > > +\n> > > +\tif (unlikely(nb_pkts == 0))\n> > > +\t\treturn 0;\n> > > +\n> > > +\tmemset(&pfd, 0, sizeof(pfd));\n> > > +\tpfd.fd = pkt_q->sockfd;\n> > > +\tpfd.events = POLLOUT;\n> > > +\tpfd.revents = 0;\n> > > +\n> > > +\tframecount = pkt_q->framecount;\n> > > +\tframenum = pkt_q->framenum;\n> > > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > +\t\t/* point at the next incoming frame */\n> > > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > > +\t\t (poll(&pfd, 1, -1) < 0))\n> > > +\t\t\t\tcontinue;\n> > > +\n> > > +\t\t/* copy the tx frame data */\n> > > +\t\tmbuf = bufs[num_tx];\n> > > +\t\tpbuf = (uint8_t *) ppd + 
TPACKET2_HDRLEN -\n> > > +\t\t\tsizeof(struct sockaddr_ll);\n> > > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > > +\n> > > +\t\t/* release incoming frame and advance ring buffer */\n> > > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > > +\t\tif (++framenum >= framecount)\n> > > +\t\t\tframenum = 0;\n> > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > +\n> > > +\t\tnum_tx++;\n> > > +\t\trte_pktmbuf_free(mbuf);\n> > > +\t}\n> > > +\n> > > +\t/* kick-off transmits */\n> > > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > > +\n> > > +\tpkt_q->framenum = framenum;\n> > > +\tpkt_q->tx_pkts += num_tx;\n> > > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > > +\treturn num_tx;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_dev_start(struct rte_eth_dev *dev)\n> > > +{\n> > > +\tdev->data->dev_link.link_status = 1;\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +/*\n> > > + * This function gets called when the current port gets stopped.\n> > > + */\n> > > +static void\n> > > +eth_dev_stop(struct rte_eth_dev *dev)\n> > > +{\n> > > +\tunsigned i;\n> > > +\tint sockfd;\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\n> > > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > > +\t\tif (sockfd != -1)\n> > > +\t\t\tclose(sockfd);\n> > > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > > +\t\tif (sockfd != -1)\n> > > +\t\t\tclose(sockfd);\n> > > +\t}\n> > > +\n> > > +\tdev->data->dev_link.link_status = 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> > > +{\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> > > +{\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\n> > > +\tdev_info->driver_name = 
drivername;\n> > > +\tdev_info->if_index = internals->if_index;\n> > > +\tdev_info->max_mac_addrs = 1;\n> > > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > > +\tdev_info->min_rx_bufsize = 0;\n> > > +\tdev_info->pci_dev = NULL;\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)\n> > > +{\n> > > +\tunsigned i, imax;\n> > > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > > +\n> > > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > > +\n> > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > +\tfor (i = 0; i < imax; i++) {\n> > > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > > +\t\trx_total += igb_stats->q_ipackets[i];\n> > > +\t}\n> > > +\n> > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > +\tfor (i = 0; i < imax; i++) {\n> > > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > > +\t\ttx_total += igb_stats->q_opackets[i];\n> > > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > > +\t}\n> > > +\n> > > +\tigb_stats->ipackets = rx_total;\n> > > +\tigb_stats->opackets = tx_total;\n> > > +\tigb_stats->oerrors = tx_err_total;\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_stats_reset(struct rte_eth_dev *dev)\n> > > +{\n> > > +\tunsigned i;\n> > > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > > +\n> > > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > > +\n> > > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > > 
+\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > > +\t}\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> > > +{\n> > > +}\n> > > +\n> > > +static void\n> > > +eth_queue_release(void *q __rte_unused)\n> > > +{\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > > + int wait_to_complete __rte_unused)\n> > > +{\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > > + uint16_t rx_queue_id,\n> > > + uint16_t nb_rx_desc __rte_unused,\n> > > + unsigned int socket_id __rte_unused,\n> > > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > > + struct rte_mempool *mb_pool)\n> > > +{\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > > +\tuint16_t buf_size;\n> > > +\n> > > +\tpkt_q->mb_pool = mb_pool;\n> > > +\n> > > +\t/* Now get the space available for data in the mbuf */\n> > > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > > +\t RTE_PKTMBUF_HEADROOM);\n> > > +\n> > > +\tif (ETH_FRAME_LEN > buf_size) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > > +\t\treturn -ENOMEM;\n> > > +\t}\n> > > +\n> > > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > > + uint16_t tx_queue_id,\n> > > + uint16_t nb_tx_desc __rte_unused,\n> > > + unsigned int socket_id __rte_unused,\n> > > + const struct rte_eth_txconf *tx_conf __rte_unused)\n> > > +{\n> > > +\n> > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > 
+\n> > > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static struct eth_dev_ops ops = {\n> > > +\t.dev_start = eth_dev_start,\n> > > +\t.dev_stop = eth_dev_stop,\n> > > +\t.dev_close = eth_dev_close,\n> > > +\t.dev_configure = eth_dev_configure,\n> > > +\t.dev_infos_get = eth_dev_info,\n> > > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > > +\t.rx_queue_release = eth_queue_release,\n> > > +\t.tx_queue_release = eth_queue_release,\n> > > +\t.link_update = eth_link_update,\n> > > +\t.stats_get = eth_stats_get,\n> > > +\t.stats_reset = eth_stats_reset,\n> > > +};\n> > > +\n> > > +/*\n> > > + * Opens an AF_PACKET socket\n> > > + */\n> > > +static int\n> > > +open_packet_iface(const char *key __rte_unused,\n> > > + const char *value __rte_unused,\n> > > + void *extra_args)\n> > > +{\n> > > +\tint *sockfd = extra_args;\n> > > +\n> > > +\t/* Open an AF_PACKET socket... */\n> > > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > +\tif (*sockfd == -1) {\n> > > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > > +\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static int\n> > > +rte_pmd_init_internals(const char *name,\n> > > + const int sockfd,\n> > > + const unsigned nb_queues,\n> > > + unsigned int blocksize,\n> > > + unsigned int blockcnt,\n> > > + unsigned int framesize,\n> > > + unsigned int framecnt,\n> > > + const unsigned numa_node,\n> > > + struct pmd_internals **internals,\n> > > + struct rte_eth_dev **eth_dev,\n> > > + struct rte_kvargs *kvlist)\n> > > +{\n> > > +\tstruct rte_eth_dev_data *data = NULL;\n> > > +\tstruct rte_pci_device *pci_dev = NULL;\n> > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > +\tstruct ifreq ifr;\n> > > +\tsize_t ifnamelen;\n> > > +\tunsigned k_idx;\n> > > +\tstruct sockaddr_ll sockaddr;\n> > > +\tstruct tpacket_req *req;\n> > > +\tstruct 
pkt_rx_queue *rx_queue;\n> > > +\tstruct pkt_tx_queue *tx_queue;\n> > > +\tint rc, tpver, discard, bypass;\n> > > +\tunsigned int i, q, rdsize;\n> > > +\tint qsockfd, fanout_arg;\n> > > +\n> > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > > +\t\t\tbreak;\n> > > +\t}\n> > > +\tif (pair == NULL) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > > +\t\t name);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\n> > > +\tRTE_LOG(INFO, PMD,\n> > > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > > +\t\tname, numa_node);\n> > > +\n> > > +\t/*\n> > > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > > +\t * and internal (private) data\n> > > +\t */\n> > > +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > > +\tif (data == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > > +\tif (pci_dev == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > > +\t 0, numa_node);\n> > > +\tif (*internals == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\treq = &((*internals)->req);\n> > > +\n> > > +\treq->tp_block_size = blocksize;\n> > > +\treq->tp_block_nr = blockcnt;\n> > > +\treq->tp_frame_size = framesize;\n> > > +\treq->tp_frame_nr = framecnt;\n> > > +\n> > > +\tifnamelen = strlen(pair->value);\n> > > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > > +\t} else {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > > +\t\t\tname, pair->value);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > 
+\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > > +\t\t name);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > > +\n> > > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > > +\t\t name);\n> > > +\t\tgoto error;\n> > > +\t}\n> > > +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > > +\n> > > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > > +\tsockaddr.sll_family = AF_PACKET;\n> > > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > > +\n> > > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > > +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> > > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > > +\n> > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > +\t\t/* Open an AF_PACKET socket for this queue... */\n> > > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > +\t\tif (qsockfd == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > > +\t\t\t name);\n> > > +\t\t\treturn -1;\n> > > +\t\t}\n> > > +\n> > > +\t\ttpver = TPACKET_V2;\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > > +\t\t\t\t&tpver, sizeof(tpver));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\tdiscard = 1;\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > > +\t\t\t\t&discard, sizeof(discard));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\tbypass = 1;\n> > > +\t\trc = 
setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > > +\t\t\t\t&bypass, sizeof(bypass));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > > +\t\t\t pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > > +\t\trx_queue->framecount = req->tp_frame_nr;\n> > > +\n> > > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> > > +\t\t\t\t qsockfd, 0);\n> > > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > > +\t\t\t\tname, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\t/* rdsize is same for both Tx and Rx */\n> > > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > > +\n> > > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > > +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > +\t\t}\n> > > +\t\trx_queue->sockfd = qsockfd;\n> > > +\n> > > +\t\ttx_queue = 
&((*internals)->tx_queue[q]);\n> > > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > > +\n> > > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> > > +\n> > > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > +\t\t}\n> > > +\t\ttx_queue->sockfd = qsockfd;\n> > > +\n> > > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > > +\t\t\t name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\n> > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > > +\t\tif (rc == -1) {\n> > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > > +\t\t\tgoto error;\n> > > +\t\t}\n> > > +\t}\n> > > +\n> > > +\t/* reserve an ethdev entry */\n> > > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > > +\tif (*eth_dev == NULL)\n> > > +\t\tgoto error;\n> > > +\n> > > +\t/*\n> > > +\t * now put it all together\n> > > +\t * - store queue data in internals,\n> > > +\t * - store numa_node info in pci_driver\n> > > +\t * - point eth_dev_data to internals and pci_driver\n> > > +\t * - and point eth_dev structure to new eth_dev_data structure\n> > > +\t */\n> > > +\n> > > +\t(*internals)->nb_queues = nb_queues;\n> > > +\n> > > +\tdata->dev_private = *internals;\n> > > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > > +\tdata->dev_link = pmd_link;\n> > > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > > +\n> > > 
+\tpci_dev->numa_node = numa_node;\n> > > +\n> > > +\t(*eth_dev)->data = data;\n> > > +\t(*eth_dev)->dev_ops = &ops;\n> > > +\t(*eth_dev)->pci_dev = pci_dev;\n> > > +\n> > > +\treturn 0;\n> > > +\n> > > +error:\n> > > +\tif (data)\n> > > +\t\trte_free(data);\n> > > +\tif (pci_dev)\n> > > +\t\trte_free(pci_dev);\n> > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > +\t\tif ((*internals)->rx_queue[q].rd)\n> > > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > > +\t\tif ((*internals)->tx_queue[q].rd)\n> > > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > > +\t}\n> > > +\tif (*internals)\n> > > +\t\trte_free(*internals);\n> > > +\treturn -1;\n> > > +}\n> > > +\n> > > +static int\n> > > +rte_eth_from_packet(const char *name,\n> > > + int const *sockfd,\n> > > + const unsigned numa_node,\n> > > + struct rte_kvargs *kvlist)\n> > > +{\n> > > +\tstruct pmd_internals *internals = NULL;\n> > > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > +\tunsigned k_idx;\n> > > +\tunsigned int blockcount;\n> > > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > > +\tunsigned int qpairs = 1;\n> > > +\n> > > +\t/* do some parameter checking */\n> > > +\tif (*sockfd < 0)\n> > > +\t\treturn -1;\n> > > +\n> > > +\t/*\n> > > +\t * Walk arguments for configurable settings\n> > > +\t */\n> > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > > +\t\t\tqpairs = atoi(pair->value);\n> > > +\t\t\tif (qpairs < 1 ||\n> > > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > > 
+\t\t\tblocksize = atoi(pair->value);\n> > > +\t\t\tif (!blocksize) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > > +\t\t\tframesize = atoi(pair->value);\n> > > +\t\t\tif (!framesize) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > > +\t\t\tframecount = atoi(pair->value);\n> > > +\t\t\tif (!framecount) {\n> > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > > +\t\t\t\t name);\n> > > +\t\t\t\treturn -1;\n> > > +\t\t\t}\n> > > +\t\t\tcontinue;\n> > > +\t\t}\n> > > +\t}\n> > > +\n> > > +\tif (framesize > blocksize) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > > +\t\t name);\n> > > +\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\tblockcount = framecount / (blocksize / framesize);\n> > > +\tif (!blockcount) {\n> > > +\t\tRTE_LOG(ERR, PMD,\n> > > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > > +\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > > +\n> > > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > > +\t blocksize, blockcount,\n> > > +\t framesize, framecount,\n> > > +\t numa_node, &internals, &eth_dev,\n> > > +\t kvlist) < 0)\n> > > 
+\t\treturn -1;\n> > > +\n> > > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +int\n> > > +rte_pmd_packet_devinit(const char *name, const char *params)\n> > > +{\n> > > +\tunsigned numa_node;\n> > > +\tint ret;\n> > > +\tstruct rte_kvargs *kvlist;\n> > > +\tint sockfd = -1;\n> > > +\n> > > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > > +\n> > > +\tnuma_node = rte_socket_id();\n> > > +\n> > > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > > +\tif (kvlist == NULL)\n> > > +\t\treturn -1;\n> > > +\n> > > +\t/*\n> > > +\t * If iface argument is passed we open the NICs and use them for\n> > > +\t * reading / writing\n> > > +\t */\n> > > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > > +\n> > > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > > +\t\t &open_packet_iface, &sockfd);\n> > > +\t\tif (ret < 0)\n> > > +\t\t\treturn -1;\n> > > +\t}\n> > > +\n> > > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > > +\tclose(sockfd); /* no longer needed */\n> > > +\n> > > +\tif (ret < 0)\n> > > +\t\treturn -1;\n> > > +\n> > > +\treturn 0;\n> > > +}\n> > > +\n> > > +static struct rte_driver pmd_packet_drv = {\n> > > +\t.name = \"eth_packet\",\n> > > +\t.type = PMD_VDEV,\n> > > +\t.init = rte_pmd_packet_devinit,\n> > > +};\n> > > +\n> > > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > new file mode 100644\n> > > index 000000000000..f685611da3e9\n> > > --- /dev/null\n> > > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > @@ -0,0 +1,55 @@\n> > > +/*-\n> > > + * BSD LICENSE\n> > > + *\n> > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > + * All rights reserved.\n> > > + *\n> > > + * Redistribution and use in source and binary forms, with or without\n> > > + * modification, are permitted provided that the following conditions\n> > > + * are met:\n> > > + *\n> > > + * * Redistributions of source code must retain the above copyright\n> > > + * notice, this list of conditions and the following disclaimer.\n> > > + * * Redistributions in binary form must reproduce the above copyright\n> > > + * notice, this list of conditions and the following disclaimer in\n> > > + * the documentation and/or other materials provided with the\n> > > + * distribution.\n> > > + * * Neither the name of Intel Corporation nor the names of its\n> > > + * contributors may be used to endorse or promote products derived\n> > > + * from this software without specific prior written permission.\n> > > + *\n> > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > + */\n> > > +\n> > > +#ifndef _RTE_ETH_PACKET_H_\n> > > +#define _RTE_ETH_PACKET_H_\n> > > +\n> > > +#ifdef __cplusplus\n> > > +extern \"C\" {\n> > > +#endif\n> > > +\n> > > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > > +\n> > > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > > +\n> > > +/**\n> > > + * For use by the EAL only. Called as part of EAL init to set up any dummy NICs\n> > > + * configured on command line.\n> > > + */\n> > > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > > +\n> > > +#ifdef __cplusplus\n> > > +}\n> > > +#endif\n> > > +\n> > > +#endif\n> > > diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> > > index 34dff2a02a05..a6994c4dbe93 100644\n> > > --- a/mk/rte.app.mk\n> > > +++ b/mk/rte.app.mk\n> > > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> > > LDLIBS += -lrte_pmd_pcap -lpcap\n> > > endif\n> > >\n> > > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > > +LDLIBS += -lrte_pmd_packet\n> > > +endif\n> > > +\n> > > endif # plugins\n> > >\n> > > LDLIBS += $(EXECENV_LDLIBS)\n> > > --\n> > > 1.9.3\n> > >\n> > >\n> > \n> > --\n> > John W. Linville\t\tSomeday the world will need a hero, and you\n> > linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.\n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 733186AB7;\n\tFri, 12 Sep 2014 20:52:20 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 48BEA68B7\n\tfor <dev@dpdk.org>; Fri, 12 Sep 2014 20:52:19 +0200 (CEST)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>)\n\tid 1XSW2L-0005Xk-34; Fri, 12 Sep 2014 14:57:37 -0400", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\ts8CIsOpR017237; Fri, 12 Sep 2014 14:54:24 -0400", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\ts8CIsOFj017236; Fri, 12 Sep 2014 14:54:24 -0400" ], "Date": "Fri, 12 Sep 2014 14:54:23 -0400", "From": "\"John W. 
Linville\" <linville@tuxdriver.com>", "To": "\"Zhou, Danny\" <danny.zhou@intel.com>", "Message-ID": "<20140912185423.GD7145@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 765, "web_url": "http://patches.dpdk.org/comment/765/", "msgid": "<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com", "date": "2014-09-12T20:35:47", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "> -----Original Message-----\n> From: John 
W. Linville [mailto:linville@tuxdriver.com]\n> Sent: Saturday, September 13, 2014 2:54 AM\n> To: Zhou, Danny\n> Cc: dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> \n> On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > I am concerned about its performance caused by too many\n> > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > which are mapped to user space, and then those packets to be copied\n> > to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a\n> > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > copies which brings significant negative performance impact. We\n> > had a bifurcated driver prototype that can do zero-copy and achieve\n> > native DPDK performance, but it depends on base driver and AF_PACKET\n> > code changes in kernel, John R will be presenting it in coming Linux\n> > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > submitted to dpdk.org.\n> \n> Admittedly, this is not as good a performer as most of the existing\n> PMDs. It serves a different purpose, afterall. FWIW, you did\n> previously indicate that it performed better than the pcap-based PMD.\n\nYes, slightly higher but makes no big difference.\n\n> I look forward to seeing the changes you mention -- they sound very\n> exciting. But, they will still require both networking core and\n> driver changes in the kernel. And as I understand things today,\n> the userland code will still need at least some knowledge of specific\n> devices and how they layout their packet descriptors, etc. 
So while\n> those changes sound very promising, they will still have certain\n> drawbacks in common with the current situation.\n\nYes, we would like DPDK performance optimization techniques such as huge pages, efficient rx/tx routines to manipulate device-specific \npacket descriptors, and the polling model to remain usable. We have to trade off between performance and commonality. But we believe it will be much easier\nto develop a DPDK PMD for non-Intel NICs than to port entire kernel drivers to DPDK.\n\n> It seems like the changes you mention will still need some sort of\n> AF_PACKET-based PMD driver. Have you implemented that completely\n> separate from the code I already posted? Or did you add that work\n> on top of mine?\n> \n\nFor userland code, it certainly uses some of your code related to raw sockets, but highly modified. A layer will be added to the eth_dev library to do device\nprobe and support new socket options.\n\n> John\n> \n> > > -----Original Message-----\n> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville\n> > > Sent: Saturday, September 13, 2014 2:05 AM\n> > > To: dev@dpdk.org\n> > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > >\n> > > Ping? Are there objections to this patch from mid-July?\n> > >\n> > > John\n> > >\n> > > On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > > > socket. This implementation uses mmap'ed ring buffers to limit copying\n> > > > and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> > > > AF_PACKET is used for frame reception. In the current implementation,\n> > > > Tx and Rx queues are always paired, and therefore are always equal\n> > > > in number -- changing this would be a Simple Matter Of Programming.\n> > > >\n> > > > Interfaces of this type are created with a command line option like\n> > > > \"--vdev=eth_packet0,iface=...\". 
There are a number of options availabe\n> > > > as arguments:\n> > > >\n> > > > - Interface is chosen by \"iface\" (required)\n> > > > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > > > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > > > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > > > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> > > >\n> > > > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > > > ---\n> > > > This PMD is intended to provide a means for using DPDK on a broad\n> > > > range of hardware without hardware-specific PMDs and (hopefully)\n> > > > with better performance than what PCAP offers in Linux. This might\n> > > > be useful as a development platform for DPDK applications when\n> > > > DPDK-supported hardware is expensive or unavailable.\n> > > >\n> > > > New in v2:\n> > > >\n> > > > -- fixup some style issues found by check patch\n> > > > -- use if_index as part of fanout group ID\n> > > > -- set default number of queue pairs to 1\n> > > >\n> > > > config/common_bsdapp | 5 +\n> > > > config/common_linuxapp | 5 +\n> > > > lib/Makefile | 1 +\n> > > > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > > > lib/librte_pmd_packet/Makefile | 60 +++\n> > > > lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> > > > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > > > mk/rte.app.mk | 4 +\n> > > > 8 files changed, 957 insertions(+)\n> > > > create mode 100644 lib/librte_pmd_packet/Makefile\n> > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > > >\n> > > > diff --git a/config/common_bsdapp b/config/common_bsdapp\n> > > > index 943dce8f1ede..c317f031278e 100644\n> > > > --- a/config/common_bsdapp\n> > > > +++ b/config/common_bsdapp\n> > > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > > > 
CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > >\n> > > > #\n> > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > +#\n> > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > > > +\n> > > > +#\n> > > > # Do prefetch of packet data within PMD driver receive function\n> > > > #\n> > > > CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> > > > diff --git a/config/common_linuxapp b/config/common_linuxapp\n> > > > index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > > > --- a/config/common_linuxapp\n> > > > +++ b/config/common_linuxapp\n> > > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > >\n> > > > #\n> > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > +#\n> > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > > > +\n> > > > +#\n> > > > # Compile Xen PMD\n> > > > #\n> > > > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > > > diff --git a/lib/Makefile b/lib/Makefile\n> > > > index 10c5bb3045bc..930fadf29898 100644\n> > > > --- a/lib/Makefile\n> > > > +++ b/lib/Makefile\n> > > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > > > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > > > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > index 756d6b0c9301..feed24a63272 100644\n> > > > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> > > > CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > > > 
CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > > > CFLAGS += $(WERROR_FLAGS) -O3\n> > > >\n> > > > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> > > > new file mode 100644\n> > > > index 000000000000..e1266fb992cd\n> > > > --- /dev/null\n> > > > +++ b/lib/librte_pmd_packet/Makefile\n> > > > @@ -0,0 +1,60 @@\n> > > > +# BSD LICENSE\n> > > > +#\n> > > > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > > > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > +# Copyright(c) 2014 6WIND S.A.\n> > > > +# All rights reserved.\n> > > > +#\n> > > > +# Redistribution and use in source and binary forms, with or without\n> > > > +# modification, are permitted provided that the following conditions\n> > > > +# are met:\n> > > > +#\n> > > > +# * Redistributions of source code must retain the above copyright\n> > > > +# notice, this list of conditions and the following disclaimer.\n> > > > +# * Redistributions in binary form must reproduce the above copyright\n> > > > +# notice, this list of conditions and the following disclaimer in\n> > > > +# the documentation and/or other materials provided with the\n> > > > +# distribution.\n> > > > +# * Neither the name of Intel Corporation nor the names of its\n> > > > +# contributors may be used to endorse or promote products derived\n> > > > +# from this software without specific prior written permission.\n> > > > +#\n> > > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > +\n> > > > +include $(RTE_SDK)/mk/rte.vars.mk\n> > > > +\n> > > > +#\n> > > > +# library name\n> > > > +#\n> > > > +LIB = librte_pmd_packet.a\n> > > > +\n> > > > +CFLAGS += -O3\n> > > > +CFLAGS += $(WERROR_FLAGS)\n> > > > +\n> > > > +#\n> > > > +# all source are stored in SRCS-y\n> > > > +#\n> > > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > > > +\n> > > > +#\n> > > > +# Export include files\n> > > > +#\n> > > > +SYMLINK-y-include += rte_eth_packet.h\n> > > > +\n> > > > +# this lib depends upon:\n> > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > > > +\n> > > > +include $(RTE_SDK)/mk/rte.lib.mk\n> > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > new file mode 100644\n> > > > index 000000000000..9c82d16e730f\n> > > > --- /dev/null\n> > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > @@ -0,0 +1,826 @@\n> > > > +/*-\n> > > > + * BSD LICENSE\n> > > > + *\n> > > > + * Copyright(c) 2014 John W. 
Linville <linville@tuxdriver.com>\n> > > > + *\n> > > > + * Originally based upon librte_pmd_pcap code:\n> > > > + *\n> > > > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > + * Copyright(c) 2014 6WIND S.A.\n> > > > + * All rights reserved.\n> > > > + *\n> > > > + * Redistribution and use in source and binary forms, with or without\n> > > > + * modification, are permitted provided that the following conditions\n> > > > + * are met:\n> > > > + *\n> > > > + * * Redistributions of source code must retain the above copyright\n> > > > + * notice, this list of conditions and the following disclaimer.\n> > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > + * notice, this list of conditions and the following disclaimer in\n> > > > + * the documentation and/or other materials provided with the\n> > > > + * distribution.\n> > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > + * contributors may be used to endorse or promote products derived\n> > > > + * from this software without specific prior written permission.\n> > > > + *\n> > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > + */\n> > > > +\n> > > > +#include <rte_mbuf.h>\n> > > > +#include <rte_ethdev.h>\n> > > > +#include <rte_malloc.h>\n> > > > +#include <rte_kvargs.h>\n> > > > +#include <rte_dev.h>\n> > > > +\n> > > > +#include <linux/if_ether.h>\n> > > > +#include <linux/if_packet.h>\n> > > > +#include <arpa/inet.h>\n> > > > +#include <net/if.h>\n> > > > +#include <sys/types.h>\n> > > > +#include <sys/socket.h>\n> > > > +#include <sys/ioctl.h>\n> > > > +#include <sys/mman.h>\n> > > > +#include <unistd.h>\n> > > > +#include <poll.h>\n> > > > +\n> > > > +#include \"rte_eth_packet.h\"\n> > > > +\n> > > > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > > > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > > > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > > > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > > > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > > > +\n> > > > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > > > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > > > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > > > +\n> > > > +struct pkt_rx_queue {\n> > > > +\tint sockfd;\n> > > > +\n> > > > +\tstruct iovec *rd;\n> > > > +\tuint8_t *map;\n> > > > +\tunsigned int framecount;\n> > > > +\tunsigned int framenum;\n> > > > +\n> > > > +\tstruct rte_mempool *mb_pool;\n> > > > +\n> > > > +\tvolatile unsigned long rx_pkts;\n> > > > +\tvolatile unsigned long err_pkts;\n> > > > +};\n> > 
> > +\n> > > > +struct pkt_tx_queue {\n> > > > +\tint sockfd;\n> > > > +\n> > > > +\tstruct iovec *rd;\n> > > > +\tuint8_t *map;\n> > > > +\tunsigned int framecount;\n> > > > +\tunsigned int framenum;\n> > > > +\n> > > > +\tvolatile unsigned long tx_pkts;\n> > > > +\tvolatile unsigned long err_pkts;\n> > > > +};\n> > > > +\n> > > > +struct pmd_internals {\n> > > > +\tunsigned nb_queues;\n> > > > +\n> > > > +\tint if_index;\n> > > > +\tstruct ether_addr eth_addr;\n> > > > +\n> > > > +\tstruct tpacket_req req;\n> > > > +\n> > > > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > +};\n> > > > +\n> > > > +static const char *valid_arguments[] = {\n> > > > +\tETH_PACKET_IFACE_ARG,\n> > > > +\tETH_PACKET_NUM_Q_ARG,\n> > > > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > > > +\tETH_PACKET_FRAMESIZE_ARG,\n> > > > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > > > +\tNULL\n> > > > +};\n> > > > +\n> > > > +static const char *drivername = \"AF_PACKET PMD\";\n> > > > +\n> > > > +static struct rte_eth_link pmd_link = {\n> > > > +\t.link_speed = 10000,\n> > > > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > > > +\t.link_status = 0\n> > > > +};\n> > > > +\n> > > > +static uint16_t\n> > > > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > +{\n> > > > +\tunsigned i;\n> > > > +\tstruct tpacket2_hdr *ppd;\n> > > > +\tstruct rte_mbuf *mbuf;\n> > > > +\tuint8_t *pbuf;\n> > > > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > > > +\tuint16_t num_rx = 0;\n> > > > +\tunsigned int framecount, framenum;\n> > > > +\n> > > > +\tif (unlikely(nb_pkts == 0))\n> > > > +\t\treturn 0;\n> > > > +\n> > > > +\t/*\n> > > > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > > > +\t * one and copies the packet data into a newly allocated mbuf.\n> > > > +\t */\n> > > > +\tframecount = pkt_q->framecount;\n> > > > +\tframenum = pkt_q->framenum;\n> > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > 
+\t\t/* point at the next incoming frame */\n> > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > > > +\t\t\tbreak;\n> > > > +\n> > > > +\t\t/* allocate the next mbuf */\n> > > > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > > > +\t\tif (unlikely(mbuf == NULL))\n> > > > +\t\t\tbreak;\n> > > > +\n> > > > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > > > +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > > > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > > > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > > > +\n> > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > > > +\t\tif (++framenum >= framecount)\n> > > > +\t\t\tframenum = 0;\n> > > > +\n> > > > +\t\t/* account for the receive frame */\n> > > > +\t\tbufs[i] = mbuf;\n> > > > +\t\tnum_rx++;\n> > > > +\t}\n> > > > +\tpkt_q->framenum = framenum;\n> > > > +\tpkt_q->rx_pkts += num_rx;\n> > > > +\treturn num_rx;\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > + * Callback to handle sending packets through a real NIC.\n> > > > + */\n> > > > +static uint16_t\n> > > > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > +{\n> > > > +\tstruct tpacket2_hdr *ppd;\n> > > > +\tstruct rte_mbuf *mbuf;\n> > > > +\tuint8_t *pbuf;\n> > > > +\tunsigned int framecount, framenum;\n> > > > +\tstruct pollfd pfd;\n> > > > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > > > +\tuint16_t num_tx = 0;\n> > > > +\tint i;\n> > > > +\n> > > > +\tif (unlikely(nb_pkts == 0))\n> > > > +\t\treturn 0;\n> > > > +\n> > > > +\tmemset(&pfd, 0, sizeof(pfd));\n> > > > +\tpfd.fd = pkt_q->sockfd;\n> > > > +\tpfd.events = POLLOUT;\n> > > > +\tpfd.revents = 0;\n> > > > +\n> > > > +\tframecount = pkt_q->framecount;\n> > > > +\tframenum = pkt_q->framenum;\n> > > > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > 
+\tfor (i = 0; i < nb_pkts; i++) {\n> > > > +\t\t/* point at the next incoming frame */\n> > > > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > > > +\t\t (poll(&pfd, 1, -1) < 0))\n> > > > +\t\t\t\tcontinue;\n> > > > +\n> > > > +\t\t/* copy the tx frame data */\n> > > > +\t\tmbuf = bufs[num_tx];\n> > > > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > > > +\t\t\tsizeof(struct sockaddr_ll);\n> > > > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > > > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > > > +\n> > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > > > +\t\tif (++framenum >= framecount)\n> > > > +\t\t\tframenum = 0;\n> > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > +\n> > > > +\t\tnum_tx++;\n> > > > +\t\trte_pktmbuf_free(mbuf);\n> > > > +\t}\n> > > > +\n> > > > +\t/* kick-off transmits */\n> > > > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > > > +\n> > > > +\tpkt_q->framenum = framenum;\n> > > > +\tpkt_q->tx_pkts += num_tx;\n> > > > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > > > +\treturn num_tx;\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +eth_dev_start(struct rte_eth_dev *dev)\n> > > > +{\n> > > > +\tdev->data->dev_link.link_status = 1;\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > + * This function gets called when the current port gets stopped.\n> > > > + */\n> > > > +static void\n> > > > +eth_dev_stop(struct rte_eth_dev *dev)\n> > > > +{\n> > > > +\tunsigned i;\n> > > > +\tint sockfd;\n> > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > +\n> > > > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > > > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > > > +\t\tif (sockfd != -1)\n> > > > +\t\t\tclose(sockfd);\n> > > > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > > > +\t\tif (sockfd != -1)\n> > > > +\t\t\tclose(sockfd);\n> > > > 
+\t}\n> > > > +\n> > > > +\tdev->data->dev_link.link_status = 0;\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> > > > +{\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +static void\n> > > > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> > > > +{\n> > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > +\n> > > > +\tdev_info->driver_name = drivername;\n> > > > +\tdev_info->if_index = internals->if_index;\n> > > > +\tdev_info->max_mac_addrs = 1;\n> > > > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > > > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > > > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > > > +\tdev_info->min_rx_bufsize = 0;\n> > > > +\tdev_info->pci_dev = NULL;\n> > > > +}\n> > > > +\n> > > > +static void\n> > > > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)\n> > > > +{\n> > > > +\tunsigned i, imax;\n> > > > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > > > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > > > +\n> > > > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > > > +\n> > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > +\tfor (i = 0; i < imax; i++) {\n> > > > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > > > +\t\trx_total += igb_stats->q_ipackets[i];\n> > > > +\t}\n> > > > +\n> > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > +\tfor (i = 0; i < imax; i++) {\n> > > > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > > > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > > > +\t\ttx_total += igb_stats->q_opackets[i];\n> > > > +\t\ttx_err_total += 
igb_stats->q_errors[i];\n> > > > +\t}\n> > > > +\n> > > > +\tigb_stats->ipackets = rx_total;\n> > > > +\tigb_stats->opackets = tx_total;\n> > > > +\tigb_stats->oerrors = tx_err_total;\n> > > > +}\n> > > > +\n> > > > +static void\n> > > > +eth_stats_reset(struct rte_eth_dev *dev)\n> > > > +{\n> > > > +\tunsigned i;\n> > > > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > > > +\n> > > > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > > > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > > > +\n> > > > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > > > +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > > > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > > > +\t}\n> > > > +}\n> > > > +\n> > > > +static void\n> > > > +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> > > > +{\n> > > > +}\n> > > > +\n> > > > +static void\n> > > > +eth_queue_release(void *q __rte_unused)\n> > > > +{\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > > > + int wait_to_complete __rte_unused)\n> > > > +{\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > > > + uint16_t rx_queue_id,\n> > > > + uint16_t nb_rx_desc __rte_unused,\n> > > > + unsigned int socket_id __rte_unused,\n> > > > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > > > + struct rte_mempool *mb_pool)\n> > > > +{\n> > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > > > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > > > +\tuint16_t buf_size;\n> > > > +\n> > > > +\tpkt_q->mb_pool = mb_pool;\n> > > > +\n> > > > +\t/* Now get the space available for data in the mbuf */\n> > > > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > > > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > > > +\t RTE_PKTMBUF_HEADROOM);\n> > > > +\n> > > > +\tif (ETH_FRAME_LEN > 
buf_size) {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > > > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > > > +\t\treturn -ENOMEM;\n> > > > +\t}\n> > > > +\n> > > > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > > > +\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > > > + uint16_t tx_queue_id,\n> > > > + uint16_t nb_tx_desc __rte_unused,\n> > > > + unsigned int socket_id __rte_unused,\n> > > > + const struct rte_eth_txconf *tx_conf __rte_unused)\n> > > > +{\n> > > > +\n> > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > +\n> > > > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +static struct eth_dev_ops ops = {\n> > > > +\t.dev_start = eth_dev_start,\n> > > > +\t.dev_stop = eth_dev_stop,\n> > > > +\t.dev_close = eth_dev_close,\n> > > > +\t.dev_configure = eth_dev_configure,\n> > > > +\t.dev_infos_get = eth_dev_info,\n> > > > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > > > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > > > +\t.rx_queue_release = eth_queue_release,\n> > > > +\t.tx_queue_release = eth_queue_release,\n> > > > +\t.link_update = eth_link_update,\n> > > > +\t.stats_get = eth_stats_get,\n> > > > +\t.stats_reset = eth_stats_reset,\n> > > > +};\n> > > > +\n> > > > +/*\n> > > > + * Opens an AF_PACKET socket\n> > > > + */\n> > > > +static int\n> > > > +open_packet_iface(const char *key __rte_unused,\n> > > > + const char *value __rte_unused,\n> > > > + void *extra_args)\n> > > > +{\n> > > > +\tint *sockfd = extra_args;\n> > > > +\n> > > > +\t/* Open an AF_PACKET socket... 
*/\n> > > > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > +\tif (*sockfd == -1) {\n> > > > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > > > +\t\treturn -1;\n> > > > +\t}\n> > > > +\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +rte_pmd_init_internals(const char *name,\n> > > > + const int sockfd,\n> > > > + const unsigned nb_queues,\n> > > > + unsigned int blocksize,\n> > > > + unsigned int blockcnt,\n> > > > + unsigned int framesize,\n> > > > + unsigned int framecnt,\n> > > > + const unsigned numa_node,\n> > > > + struct pmd_internals **internals,\n> > > > + struct rte_eth_dev **eth_dev,\n> > > > + struct rte_kvargs *kvlist)\n> > > > +{\n> > > > +\tstruct rte_eth_dev_data *data = NULL;\n> > > > +\tstruct rte_pci_device *pci_dev = NULL;\n> > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > +\tstruct ifreq ifr;\n> > > > +\tsize_t ifnamelen;\n> > > > +\tunsigned k_idx;\n> > > > +\tstruct sockaddr_ll sockaddr;\n> > > > +\tstruct tpacket_req *req;\n> > > > +\tstruct pkt_rx_queue *rx_queue;\n> > > > +\tstruct pkt_tx_queue *tx_queue;\n> > > > +\tint rc, tpver, discard, bypass;\n> > > > +\tunsigned int i, q, rdsize;\n> > > > +\tint qsockfd, fanout_arg;\n> > > > +\n> > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > > > +\t\t\tbreak;\n> > > > +\t}\n> > > > +\tif (pair == NULL) {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > > > +\t\t name);\n> > > > +\t\tgoto error;\n> > > > +\t}\n> > > > +\n> > > > +\tRTE_LOG(INFO, PMD,\n> > > > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > > > +\t\tname, numa_node);\n> > > > +\n> > > > +\t/*\n> > > > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > > > +\t * and internal (private) data\n> > > > +\t */\n> > > > +\tdata 
= rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > > > +\tif (data == NULL)\n> > > > +\t\tgoto error;\n> > > > +\n> > > > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > > > +\tif (pci_dev == NULL)\n> > > > +\t\tgoto error;\n> > > > +\n> > > > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > > > +\t 0, numa_node);\n> > > > +\tif (*internals == NULL)\n> > > > +\t\tgoto error;\n> > > > +\n> > > > +\treq = &((*internals)->req);\n> > > > +\n> > > > +\treq->tp_block_size = blocksize;\n> > > > +\treq->tp_block_nr = blockcnt;\n> > > > +\treq->tp_frame_size = framesize;\n> > > > +\treq->tp_frame_nr = framecnt;\n> > > > +\n> > > > +\tifnamelen = strlen(pair->value);\n> > > > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > > > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > > > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > > > +\t} else {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > > > +\t\t\tname, pair->value);\n> > > > +\t\tgoto error;\n> > > > +\t}\n> > > > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > > > +\t\t name);\n> > > > +\t\tgoto error;\n> > > > +\t}\n> > > > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > > > +\n> > > > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > > > +\t\t name);\n> > > > +\t\tgoto error;\n> > > > +\t}\n> > > > +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > > > +\n> > > > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > > > +\tsockaddr.sll_family = AF_PACKET;\n> > > > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > > > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > > > +\n> > > > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > > > +\tfanout_arg |= (PACKET_FANOUT_HASH | 
PACKET_FANOUT_FLAG_DEFRAG |\n> > > > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > > > +\n> > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > +\t\t/* Open an AF_PACKET socket for this queue... */\n> > > > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > +\t\tif (qsockfd == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > > > +\t\t\t name);\n> > > > +\t\t\treturn -1;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\ttpver = TPACKET_V2;\n> > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > > > +\t\t\t\t&tpver, sizeof(tpver));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\tdiscard = 1;\n> > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > > > +\t\t\t\t&discard, sizeof(discard));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > > > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\tbypass = 1;\n> > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > > > +\t\t\t\t&bypass, sizeof(bypass));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > > > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > > > +\t\t\t pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > 
> > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > > > +\t\trx_queue->framecount = req->tp_frame_nr;\n> > > > +\n> > > > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > > > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> > > > +\t\t\t\t qsockfd, 0);\n> > > > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > > > +\t\t\t\tname, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\t/* rdsize is same for both Tx and Rx */\n> > > > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > > > +\n> > > > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > > > +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > +\t\t}\n> > > > +\t\trx_queue->sockfd = qsockfd;\n> > > > +\n> > > > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > > > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > > > +\n> > > > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> > > > +\n> > > > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > > > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > +\t\t}\n> > > > +\t\ttx_queue->sockfd = qsockfd;\n> > > > +\n> > > > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, 
sizeof(sockaddr));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > > > +\t\t\t name, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\n> > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > > > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > > > +\t\tif (rc == -1) {\n> > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > > > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > > > +\t\t\tgoto error;\n> > > > +\t\t}\n> > > > +\t}\n> > > > +\n> > > > +\t/* reserve an ethdev entry */\n> > > > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > > > +\tif (*eth_dev == NULL)\n> > > > +\t\tgoto error;\n> > > > +\n> > > > +\t/*\n> > > > +\t * now put it all together\n> > > > +\t * - store queue data in internals,\n> > > > +\t * - store numa_node info in pci_driver\n> > > > +\t * - point eth_dev_data to internals and pci_driver\n> > > > +\t * - and point eth_dev structure to new eth_dev_data structure\n> > > > +\t */\n> > > > +\n> > > > +\t(*internals)->nb_queues = nb_queues;\n> > > > +\n> > > > +\tdata->dev_private = *internals;\n> > > > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > > > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > > > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > > > +\tdata->dev_link = pmd_link;\n> > > > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > > > +\n> > > > +\tpci_dev->numa_node = numa_node;\n> > > > +\n> > > > +\t(*eth_dev)->data = data;\n> > > > +\t(*eth_dev)->dev_ops = &ops;\n> > > > +\t(*eth_dev)->pci_dev = pci_dev;\n> > > > +\n> > > > +\treturn 0;\n> > > > +\n> > > > +error:\n> > > > +\tif (data)\n> > > > +\t\trte_free(data);\n> > > > +\tif (pci_dev)\n> > > > +\t\trte_free(pci_dev);\n> > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > +\t\tif ((*internals)->rx_queue[q].rd)\n> > > > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > > > +\t\tif 
((*internals)->tx_queue[q].rd)\n> > > > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > > > +\t}\n> > > > +\tif (*internals)\n> > > > +\t\trte_free(*internals);\n> > > > +\treturn -1;\n> > > > +}\n> > > > +\n> > > > +static int\n> > > > +rte_eth_from_packet(const char *name,\n> > > > + int const *sockfd,\n> > > > + const unsigned numa_node,\n> > > > + struct rte_kvargs *kvlist)\n> > > > +{\n> > > > +\tstruct pmd_internals *internals = NULL;\n> > > > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > +\tunsigned k_idx;\n> > > > +\tunsigned int blockcount;\n> > > > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > > > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > > > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > > > +\tunsigned int qpairs = 1;\n> > > > +\n> > > > +\t/* do some parameter checking */\n> > > > +\tif (*sockfd < 0)\n> > > > +\t\treturn -1;\n> > > > +\n> > > > +\t/*\n> > > > +\t * Walk arguments for configurable settings\n> > > > +\t */\n> > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > > > +\t\t\tqpairs = atoi(pair->value);\n> > > > +\t\t\tif (qpairs < 1 ||\n> > > > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > > > +\t\t\t\t name);\n> > > > +\t\t\t\treturn -1;\n> > > > +\t\t\t}\n> > > > +\t\t\tcontinue;\n> > > > +\t\t}\n> > > > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > > > +\t\t\tblocksize = atoi(pair->value);\n> > > > +\t\t\tif (!blocksize) {\n> > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > > > +\t\t\t\t name);\n> > > > +\t\t\t\treturn -1;\n> > > > +\t\t\t}\n> > > > +\t\t\tcontinue;\n> > > > +\t\t}\n> > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > > > +\t\t\tframesize = 
atoi(pair->value);\n> > > > +\t\t\tif (!framesize) {\n> > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > > > +\t\t\t\t name);\n> > > > +\t\t\t\treturn -1;\n> > > > +\t\t\t}\n> > > > +\t\t\tcontinue;\n> > > > +\t\t}\n> > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > > > +\t\t\tframecount = atoi(pair->value);\n> > > > +\t\t\tif (!framecount) {\n> > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > > > +\t\t\t\t name);\n> > > > +\t\t\t\treturn -1;\n> > > > +\t\t\t}\n> > > > +\t\t\tcontinue;\n> > > > +\t\t}\n> > > > +\t}\n> > > > +\n> > > > +\tif (framesize > blocksize) {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > > > +\t\t name);\n> > > > +\t\treturn -1;\n> > > > +\t}\n> > > > +\n> > > > +\tblockcount = framecount / (blocksize / framesize);\n> > > > +\tif (!blockcount) {\n> > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > > > +\t\treturn -1;\n> > > > +\t}\n> > > > +\n> > > > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > > > +\n> > > > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > > > +\t blocksize, blockcount,\n> > > > +\t framesize, framecount,\n> > > > +\t numa_node, &internals, &eth_dev,\n> > > > +\t kvlist) < 0)\n> > > > +\t\treturn -1;\n> > > > +\n> > > > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > > > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > > > +\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +int\n> > > > +rte_pmd_packet_devinit(const char *name, const char 
*params)\n> > > > +{\n> > > > +\tunsigned numa_node;\n> > > > +\tint ret;\n> > > > +\tstruct rte_kvargs *kvlist;\n> > > > +\tint sockfd = -1;\n> > > > +\n> > > > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > > > +\n> > > > +\tnuma_node = rte_socket_id();\n> > > > +\n> > > > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > > > +\tif (kvlist == NULL)\n> > > > +\t\treturn -1;\n> > > > +\n> > > > +\t/*\n> > > > +\t * If iface argument is passed we open the NICs and use them for\n> > > > +\t * reading / writing\n> > > > +\t */\n> > > > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > > > +\n> > > > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > > > +\t\t &open_packet_iface, &sockfd);\n> > > > +\t\tif (ret < 0)\n> > > > +\t\t\treturn -1;\n> > > > +\t}\n> > > > +\n> > > > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > > > +\tclose(sockfd); /* no longer needed */\n> > > > +\n> > > > +\tif (ret < 0)\n> > > > +\t\treturn -1;\n> > > > +\n> > > > +\treturn 0;\n> > > > +}\n> > > > +\n> > > > +static struct rte_driver pmd_packet_drv = {\n> > > > +\t.name = \"eth_packet\",\n> > > > +\t.type = PMD_VDEV,\n> > > > +\t.init = rte_pmd_packet_devinit,\n> > > > +};\n> > > > +\n> > > > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > new file mode 100644\n> > > > index 000000000000..f685611da3e9\n> > > > --- /dev/null\n> > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > @@ -0,0 +1,55 @@\n> > > > +/*-\n> > > > + * BSD LICENSE\n> > > > + *\n> > > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > > + * All rights reserved.\n> > > > + *\n> > > > + * Redistribution and use in source and binary forms, with or without\n> > > > + * modification, are permitted provided that the following conditions\n> > > > + * are met:\n> > > > + *\n> > > > + * * Redistributions of source code must retain the above copyright\n> > > > + * notice, this list of conditions and the following disclaimer.\n> > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > + * notice, this list of conditions and the following disclaimer in\n> > > > + * the documentation and/or other materials provided with the\n> > > > + * distribution.\n> > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > + * contributors may be used to endorse or promote products derived\n> > > > + * from this software without specific prior written permission.\n> > > > + *\n> > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > + */\n> > > > +\n> > > > +#ifndef _RTE_ETH_PACKET_H_\n> > > > +#define _RTE_ETH_PACKET_H_\n> > > > +\n> > > > +#ifdef __cplusplus\n> > > > +extern \"C\" {\n> > > > +#endif\n> > > > +\n> > > > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > > > +\n> > > > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > > > +\n> > > > +/**\n> > > > + * For use by the EAL only. Called as part of EAL init to set up any dummy NICs\n> > > > + * configured on command line.\n> > > > + */\n> > > > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > > > +\n> > > > +#ifdef __cplusplus\n> > > > +}\n> > > > +#endif\n> > > > +\n> > > > +#endif\n> > > > diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> > > > index 34dff2a02a05..a6994c4dbe93 100644\n> > > > --- a/mk/rte.app.mk\n> > > > +++ b/mk/rte.app.mk\n> > > > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> > > > LDLIBS += -lrte_pmd_pcap -lpcap\n> > > > endif\n> > > >\n> > > > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > > > +LDLIBS += -lrte_pmd_packet\n> > > > +endif\n> > > > +\n> > > > endif # plugins\n> > > >\n> > > > LDLIBS += $(EXECENV_LDLIBS)\n> > > > --\n> > > > 1.9.3\n> > > >\n> > > >\n> > >\n> > > --\n> > > John W. Linville\t\tSomeday the world will need a hero, and you\n> > > linville@tuxdriver.com\t\t\tmight be all we have. Be ready.\n> >\n> \n> --\n> John W. 
Linville\t\tSomeday the world will need a hero, and you\n> linville@tuxdriver.com\t\t\tmight be all we have. Be ready.", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id BE13AB396;\n\tFri, 12 Sep 2014 22:31:18 +0200 (CEST)", "from mga09.intel.com (mga09.intel.com [134.134.136.24])\n\tby dpdk.org (Postfix) with ESMTP id B5C945901\n\tfor <dev@dpdk.org>; Fri, 12 Sep 2014 22:31:14 +0200 (CEST)", "from azsmga001.ch.intel.com ([10.2.17.19])\n\tby orsmga102.jf.intel.com with ESMTP; 12 Sep 2014 13:29:44 -0700", "from fmsmsx103.amr.corp.intel.com ([10.18.124.201])\n\tby azsmga001.ch.intel.com with ESMTP; 12 Sep 2014 13:35:50 -0700", "from FMSMSX110.amr.corp.intel.com (10.18.116.10) by\n\tFMSMSX103.amr.corp.intel.com (10.18.124.201) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Fri, 12 Sep 2014 13:35:50 -0700", "from shsmsx151.ccr.corp.intel.com (10.239.6.50) by\n\tfmsmsx110.amr.corp.intel.com (10.18.116.10) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Fri, 12 Sep 2014 13:35:49 -0700", "from shsmsx104.ccr.corp.intel.com ([169.254.5.230]) by\n\tSHSMSX151.ccr.corp.intel.com ([169.254.3.172]) with mapi id\n\t14.03.0195.001; Sat, 13 Sep 2014 04:35:48 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.04,515,1406617200\"; d=\"scan'208\";a=\"477198894\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "\"John W. 
Linville\" <linville@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6Jv9ov6AgACKEgD//4OfgIAAmgdg", "Date": "Fri, 12 Sep 2014 20:35:47 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>", "In-Reply-To": "<20140912185423.GD7145@tuxdriver.com>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 782, "web_url": "http://patches.dpdk.org/comment/782/", "msgid": "<20140915150946.GA11690@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20140915150946.GA11690@hmsreliant.think-freely.org", "date": 
"2014-09-15T15:09:46", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:\n> > -----Original Message-----\n> > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > Sent: Saturday, September 13, 2014 2:54 AM\n> > To: Zhou, Danny\n> > Cc: dev@dpdk.org\n> > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > \n> > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > > I am concerned about its performance caused by too many\n> > > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > > which are mapped to user space, and then those packets to be copied\n> > > to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a\n> > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > > copies which brings significant negative performance impact. We\n> > > had a bifurcated driver prototype that can do zero-copy and achieve\n> > > native DPDK performance, but it depends on base driver and AF_PACKET\n> > > code changes in kernel, John R will be presenting it in coming Linux\n> > > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > > submitted to dpdk.org.\n> > \n> > Admittedly, this is not as good a performer as most of the existing\n> > PMDs. It serves a different purpose, afterall. FWIW, you did\n> > previously indicate that it performed better than the pcap-based PMD.\n> \n> Yes, slightly higher but makes no big difference.\n> \nDo you have numbers for this? It seems to me faster is faster as long as its\nstatistically significant. 
Even if it's not, John's AF_PACKET PMD has the ability\nto scale to multiple CPUs more easily than the pcap PMD, as it can make use of\nthe AF_PACKET fanout feature.\n\n> > I look forward to seeing the changes you mention -- they sound very\n> > exciting. But, they will still require both networking core and\n> > driver changes in the kernel. And as I understand things today,\n> > the userland code will still need at least some knowledge of specific\n> > devices and how they layout their packet descriptors, etc. So while\n> > those changes sound very promising, they will still have certain\n> > drawbacks in common with the current situation.\n> \n> Yes, we would like the DPDK performance optimization techniques such as huge page, efficient rx/tx routines to manipulate device-specific \n> packet descriptors, polling-model can be still used. We have to tradeoff between performance and commonality. But we believe it will be much easier\n> to develop DPDK PMD for non-Intel NICs than porting entire kernel drivers to DPDK.\n> \n\nNot sure how this relates; what you're describing is the feature Intel has been\nworking on to augment kernel drivers to provide better throughput via direct\nhardware access to user space. John's PMD provides ubiquitous function on all\nhardware. I'm not sure how the desire for one implies the other isn't valuable?\n\n> > It seems like the changes you mention will still need some sort of\n> > AF_PACKET-based PMD driver. Have you implemented that completely\n> > separate from the code I already posted? Or did you add that work\n> > on top of mine?\n> > \n> \n> For userland code, it certainly uses some of your code related to raw socket, but highly modified. A layer will be added into eth_dev library to do device\n> probe and support new socket options.\n> \n\nOk, but again, PMDs are independent, and serve different needs. 
If they're use\nis at all overlapping from a functional standpoint, take this one now, and\ndeprecate it when a better one comes along. Though from your description it\nseems like both have a valid place in the ecosystem.\n\nNeil\n\n> > John\n> > \n> > > > -----Original Message-----\n> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville\n> > > > Sent: Saturday, September 13, 2014 2:05 AM\n> > > > To: dev@dpdk.org\n> > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > >\n> > > > Ping? Are there objections to this patch from mid-July?\n> > > >\n> > > > John\n> > > >\n> > > > On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> > > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > > > > socket. This implementation uses mmap'ed ring buffers to limit copying\n> > > > > and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> > > > > AF_PACKET is used for frame reception. In the current implementation,\n> > > > > Tx and Rx queues are always paired, and therefore are always equal\n> > > > > in number -- changing this would be a Simple Matter Of Programming.\n> > > > >\n> > > > > Interfaces of this type are created with a command line option like\n> > > > > \"--vdev=eth_packet0,iface=...\". There are a number of options availabe\n> > > > > as arguments:\n> > > > >\n> > > > > - Interface is chosen by \"iface\" (required)\n> > > > > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > > > > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > > > > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > > > > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> > > > >\n> > > > > Signed-off-by: John W. 
Linville <linville@tuxdriver.com>\n> > > > > ---\n> > > > > This PMD is intended to provide a means for using DPDK on a broad\n> > > > > range of hardware without hardware-specific PMDs and (hopefully)\n> > > > > with better performance than what PCAP offers in Linux. This might\n> > > > > be useful as a development platform for DPDK applications when\n> > > > > DPDK-supported hardware is expensive or unavailable.\n> > > > >\n> > > > > New in v2:\n> > > > >\n> > > > > -- fixup some style issues found by check patch\n> > > > > -- use if_index as part of fanout group ID\n> > > > > -- set default number of queue pairs to 1\n> > > > >\n> > > > > config/common_bsdapp | 5 +\n> > > > > config/common_linuxapp | 5 +\n> > > > > lib/Makefile | 1 +\n> > > > > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > > > > lib/librte_pmd_packet/Makefile | 60 +++\n> > > > > lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> > > > > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > > > > mk/rte.app.mk | 4 +\n> > > > > 8 files changed, 957 insertions(+)\n> > > > > create mode 100644 lib/librte_pmd_packet/Makefile\n> > > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > > > >\n> > > > > diff --git a/config/common_bsdapp b/config/common_bsdapp\n> > > > > index 943dce8f1ede..c317f031278e 100644\n> > > > > --- a/config/common_bsdapp\n> > > > > +++ b/config/common_bsdapp\n> > > > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > > > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > > >\n> > > > > #\n> > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > > +#\n> > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > > > > +\n> > > > > +#\n> > > > > # Do prefetch of packet data within PMD driver receive function\n> > > > > #\n> > > > > CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> > > > > diff --git a/config/common_linuxapp b/config/common_linuxapp\n> > > > > index 
7bf5d80d4e26..f9e7bc3015ec 100644\n> > > > > --- a/config/common_linuxapp\n> > > > > +++ b/config/common_linuxapp\n> > > > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > > > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > > >\n> > > > > #\n> > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > > +#\n> > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > > > > +\n> > > > > +#\n> > > > > # Compile Xen PMD\n> > > > > #\n> > > > > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > > > > diff --git a/lib/Makefile b/lib/Makefile\n> > > > > index 10c5bb3045bc..930fadf29898 100644\n> > > > > --- a/lib/Makefile\n> > > > > +++ b/lib/Makefile\n> > > > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > > > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > > > > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > > > > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> > > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > index 756d6b0c9301..feed24a63272 100644\n> > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > > > > CFLAGS += $(WERROR_FLAGS) -O3\n> > > > >\n> > > > > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> > > > > new file mode 100644\n> > > > > index 
000000000000..e1266fb992cd\n> > > > > --- /dev/null\n> > > > > +++ b/lib/librte_pmd_packet/Makefile\n> > > > > @@ -0,0 +1,60 @@\n> > > > > +# BSD LICENSE\n> > > > > +#\n> > > > > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > > > > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > > +# Copyright(c) 2014 6WIND S.A.\n> > > > > +# All rights reserved.\n> > > > > +#\n> > > > > +# Redistribution and use in source and binary forms, with or without\n> > > > > +# modification, are permitted provided that the following conditions\n> > > > > +# are met:\n> > > > > +#\n> > > > > +# * Redistributions of source code must retain the above copyright\n> > > > > +# notice, this list of conditions and the following disclaimer.\n> > > > > +# * Redistributions in binary form must reproduce the above copyright\n> > > > > +# notice, this list of conditions and the following disclaimer in\n> > > > > +# the documentation and/or other materials provided with the\n> > > > > +# distribution.\n> > > > > +# * Neither the name of Intel Corporation nor the names of its\n> > > > > +# contributors may be used to endorse or promote products derived\n> > > > > +# from this software without specific prior written permission.\n> > > > > +#\n> > > > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > +\n> > > > > +include $(RTE_SDK)/mk/rte.vars.mk\n> > > > > +\n> > > > > +#\n> > > > > +# library name\n> > > > > +#\n> > > > > +LIB = librte_pmd_packet.a\n> > > > > +\n> > > > > +CFLAGS += -O3\n> > > > > +CFLAGS += $(WERROR_FLAGS)\n> > > > > +\n> > > > > +#\n> > > > > +# all source are stored in SRCS-y\n> > > > > +#\n> > > > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > > > > +\n> > > > > +#\n> > > > > +# Export include files\n> > > > > +#\n> > > > > +SYMLINK-y-include += rte_eth_packet.h\n> > > > > +\n> > > > > +# this lib depends upon:\n> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > > > > +\n> > > > > +include $(RTE_SDK)/mk/rte.lib.mk\n> > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > new file mode 100644\n> > > > > index 000000000000..9c82d16e730f\n> > > > > --- /dev/null\n> > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > @@ -0,0 +1,826 @@\n> > > > > +/*-\n> > > > > + * BSD LICENSE\n> > > > > + *\n> > > > > + * Copyright(c) 2014 John W. 
Linville <linville@tuxdriver.com>\n> > > > > + *\n> > > > > + * Originally based upon librte_pmd_pcap code:\n> > > > > + *\n> > > > > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > > + * Copyright(c) 2014 6WIND S.A.\n> > > > > + * All rights reserved.\n> > > > > + *\n> > > > > + * Redistribution and use in source and binary forms, with or without\n> > > > > + * modification, are permitted provided that the following conditions\n> > > > > + * are met:\n> > > > > + *\n> > > > > + * * Redistributions of source code must retain the above copyright\n> > > > > + * notice, this list of conditions and the following disclaimer.\n> > > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > > + * notice, this list of conditions and the following disclaimer in\n> > > > > + * the documentation and/or other materials provided with the\n> > > > > + * distribution.\n> > > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > > + * contributors may be used to endorse or promote products derived\n> > > > > + * from this software without specific prior written permission.\n> > > > > + *\n> > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > + */\n> > > > > +\n> > > > > +#include <rte_mbuf.h>\n> > > > > +#include <rte_ethdev.h>\n> > > > > +#include <rte_malloc.h>\n> > > > > +#include <rte_kvargs.h>\n> > > > > +#include <rte_dev.h>\n> > > > > +\n> > > > > +#include <linux/if_ether.h>\n> > > > > +#include <linux/if_packet.h>\n> > > > > +#include <arpa/inet.h>\n> > > > > +#include <net/if.h>\n> > > > > +#include <sys/types.h>\n> > > > > +#include <sys/socket.h>\n> > > > > +#include <sys/ioctl.h>\n> > > > > +#include <sys/mman.h>\n> > > > > +#include <unistd.h>\n> > > > > +#include <poll.h>\n> > > > > +\n> > > > > +#include \"rte_eth_packet.h\"\n> > > > > +\n> > > > > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > > > > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > > > > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > > > > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > > > > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > > > > +\n> > > > > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > > > > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > > > > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > > > > +\n> > > > > +struct pkt_rx_queue {\n> > > > > +\tint sockfd;\n> > > > > +\n> > > > > +\tstruct iovec *rd;\n> > > > > +\tuint8_t *map;\n> > > > > +\tunsigned int framecount;\n> > > > > +\tunsigned int framenum;\n> > > > > +\n> > > > > +\tstruct rte_mempool *mb_pool;\n> > > > > +\n> > > > > 
+\tvolatile unsigned long rx_pkts;\n> > > > > +\tvolatile unsigned long err_pkts;\n> > > > > +};\n> > > > > +\n> > > > > +struct pkt_tx_queue {\n> > > > > +\tint sockfd;\n> > > > > +\n> > > > > +\tstruct iovec *rd;\n> > > > > +\tuint8_t *map;\n> > > > > +\tunsigned int framecount;\n> > > > > +\tunsigned int framenum;\n> > > > > +\n> > > > > +\tvolatile unsigned long tx_pkts;\n> > > > > +\tvolatile unsigned long err_pkts;\n> > > > > +};\n> > > > > +\n> > > > > +struct pmd_internals {\n> > > > > +\tunsigned nb_queues;\n> > > > > +\n> > > > > +\tint if_index;\n> > > > > +\tstruct ether_addr eth_addr;\n> > > > > +\n> > > > > +\tstruct tpacket_req req;\n> > > > > +\n> > > > > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > > +};\n> > > > > +\n> > > > > +static const char *valid_arguments[] = {\n> > > > > +\tETH_PACKET_IFACE_ARG,\n> > > > > +\tETH_PACKET_NUM_Q_ARG,\n> > > > > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > > > > +\tETH_PACKET_FRAMESIZE_ARG,\n> > > > > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > > > > +\tNULL\n> > > > > +};\n> > > > > +\n> > > > > +static const char *drivername = \"AF_PACKET PMD\";\n> > > > > +\n> > > > > +static struct rte_eth_link pmd_link = {\n> > > > > +\t.link_speed = 10000,\n> > > > > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > > > > +\t.link_status = 0\n> > > > > +};\n> > > > > +\n> > > > > +static uint16_t\n> > > > > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > > +{\n> > > > > +\tunsigned i;\n> > > > > +\tstruct tpacket2_hdr *ppd;\n> > > > > +\tstruct rte_mbuf *mbuf;\n> > > > > +\tuint8_t *pbuf;\n> > > > > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > > > > +\tuint16_t num_rx = 0;\n> > > > > +\tunsigned int framecount, framenum;\n> > > > > +\n> > > > > +\tif (unlikely(nb_pkts == 0))\n> > > > > +\t\treturn 0;\n> > > > > +\n> > > > > +\t/*\n> > > > > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > > > > 
+\t * one and copies the packet data into a newly allocated mbuf.\n> > > > > +\t */\n> > > > > +\tframecount = pkt_q->framecount;\n> > > > > +\tframenum = pkt_q->framenum;\n> > > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > > +\t\t/* point at the next incoming frame */\n> > > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > > > > +\t\t\tbreak;\n> > > > > +\n> > > > > +\t\t/* allocate the next mbuf */\n> > > > > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > > > > +\t\tif (unlikely(mbuf == NULL))\n> > > > > +\t\t\tbreak;\n> > > > > +\n> > > > > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > > > > +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > > > > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > > > > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > > > > +\n> > > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > > > > +\t\tif (++framenum >= framecount)\n> > > > > +\t\t\tframenum = 0;\n> > > > > +\n> > > > > +\t\t/* account for the receive frame */\n> > > > > +\t\tbufs[i] = mbuf;\n> > > > > +\t\tnum_rx++;\n> > > > > +\t}\n> > > > > +\tpkt_q->framenum = framenum;\n> > > > > +\tpkt_q->rx_pkts += num_rx;\n> > > > > +\treturn num_rx;\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > + * Callback to handle sending packets through a real NIC.\n> > > > > + */\n> > > > > +static uint16_t\n> > > > > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > > +{\n> > > > > +\tstruct tpacket2_hdr *ppd;\n> > > > > +\tstruct rte_mbuf *mbuf;\n> > > > > +\tuint8_t *pbuf;\n> > > > > +\tunsigned int framecount, framenum;\n> > > > > +\tstruct pollfd pfd;\n> > > > > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > > > > +\tuint16_t num_tx = 0;\n> > > > > +\tint i;\n> > > > > +\n> > > > > +\tif (unlikely(nb_pkts == 0))\n> > > > > +\t\treturn 0;\n> > > > > 
+\n> > > > > +\tmemset(&pfd, 0, sizeof(pfd));\n> > > > > +\tpfd.fd = pkt_q->sockfd;\n> > > > > +\tpfd.events = POLLOUT;\n> > > > > +\tpfd.revents = 0;\n> > > > > +\n> > > > > +\tframecount = pkt_q->framecount;\n> > > > > +\tframenum = pkt_q->framenum;\n> > > > > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > > +\t\t/* point at the next incoming frame */\n> > > > > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > > > > +\t\t (poll(&pfd, 1, -1) < 0))\n> > > > > +\t\t\t\tcontinue;\n> > > > > +\n> > > > > +\t\t/* copy the tx frame data */\n> > > > > +\t\tmbuf = bufs[num_tx];\n> > > > > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > > > > +\t\t\tsizeof(struct sockaddr_ll);\n> > > > > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > > > > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > > > > +\n> > > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > > > > +\t\tif (++framenum >= framecount)\n> > > > > +\t\t\tframenum = 0;\n> > > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > +\n> > > > > +\t\tnum_tx++;\n> > > > > +\t\trte_pktmbuf_free(mbuf);\n> > > > > +\t}\n> > > > > +\n> > > > > +\t/* kick-off transmits */\n> > > > > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > > > > +\n> > > > > +\tpkt_q->framenum = framenum;\n> > > > > +\tpkt_q->tx_pkts += num_tx;\n> > > > > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > > > > +\treturn num_tx;\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +eth_dev_start(struct rte_eth_dev *dev)\n> > > > > +{\n> > > > > +\tdev->data->dev_link.link_status = 1;\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > + * This function gets called when the current port gets stopped.\n> > > > > + */\n> > > > > +static void\n> > > > > +eth_dev_stop(struct rte_eth_dev *dev)\n> > > > > +{\n> > > 
> > +\tunsigned i;\n> > > > > +\tint sockfd;\n> > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > +\n> > > > > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > > > > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > > > > +\t\tif (sockfd != -1)\n> > > > > +\t\t\tclose(sockfd);\n> > > > > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > > > > +\t\tif (sockfd != -1)\n> > > > > +\t\t\tclose(sockfd);\n> > > > > +\t}\n> > > > > +\n> > > > > +\tdev->data->dev_link.link_status = 0;\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> > > > > +{\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +static void\n> > > > > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> > > > > +{\n> > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > +\n> > > > > +\tdev_info->driver_name = drivername;\n> > > > > +\tdev_info->if_index = internals->if_index;\n> > > > > +\tdev_info->max_mac_addrs = 1;\n> > > > > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > > > > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > > > > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > > > > +\tdev_info->min_rx_bufsize = 0;\n> > > > > +\tdev_info->pci_dev = NULL;\n> > > > > +}\n> > > > > +\n> > > > > +static void\n> > > > > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)\n> > > > > +{\n> > > > > +\tunsigned i, imax;\n> > > > > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > > > > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > > > > +\n> > > > > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > > > > +\n> > > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > > +\tfor (i = 0; i < imax; i++) {\n> > > > > +\t\tigb_stats->q_ipackets[i] = 
internal->rx_queue[i].rx_pkts;\n> > > > > +\t\trx_total += igb_stats->q_ipackets[i];\n> > > > > +\t}\n> > > > > +\n> > > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > > +\tfor (i = 0; i < imax; i++) {\n> > > > > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > > > > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > > > > +\t\ttx_total += igb_stats->q_opackets[i];\n> > > > > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > > > > +\t}\n> > > > > +\n> > > > > +\tigb_stats->ipackets = rx_total;\n> > > > > +\tigb_stats->opackets = tx_total;\n> > > > > +\tigb_stats->oerrors = tx_err_total;\n> > > > > +}\n> > > > > +\n> > > > > +static void\n> > > > > +eth_stats_reset(struct rte_eth_dev *dev)\n> > > > > +{\n> > > > > +\tunsigned i;\n> > > > > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > > > > +\n> > > > > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > > > > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > > > > +\n> > > > > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > > > > +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > > > > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > > > > +\t}\n> > > > > +}\n> > > > > +\n> > > > > +static void\n> > > > > +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> > > > > +{\n> > > > > +}\n> > > > > +\n> > > > > +static void\n> > > > > +eth_queue_release(void *q __rte_unused)\n> > > > > +{\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > > > > + int wait_to_complete __rte_unused)\n> > > > > +{\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > > > > + uint16_t rx_queue_id,\n> > > > > + uint16_t nb_rx_desc __rte_unused,\n> > > > > + unsigned int socket_id __rte_unused,\n> > > > > + const struct rte_eth_rxconf *rx_conf 
__rte_unused,\n> > > > > + struct rte_mempool *mb_pool)\n> > > > > +{\n> > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > > > > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > > > > +\tuint16_t buf_size;\n> > > > > +\n> > > > > +\tpkt_q->mb_pool = mb_pool;\n> > > > > +\n> > > > > +\t/* Now get the space available for data in the mbuf */\n> > > > > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > > > > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > > > > +\t RTE_PKTMBUF_HEADROOM);\n> > > > > +\n> > > > > +\tif (ETH_FRAME_LEN > buf_size) {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > > > > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > > > > +\t\treturn -ENOMEM;\n> > > > > +\t}\n> > > > > +\n> > > > > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > > > > +\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > > > > + uint16_t tx_queue_id,\n> > > > > + uint16_t nb_tx_desc __rte_unused,\n> > > > > + unsigned int socket_id __rte_unused,\n> > > > > + const struct rte_eth_txconf *tx_conf __rte_unused)\n> > > > > +{\n> > > > > +\n> > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > +\n> > > > > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +static struct eth_dev_ops ops = {\n> > > > > +\t.dev_start = eth_dev_start,\n> > > > > +\t.dev_stop = eth_dev_stop,\n> > > > > +\t.dev_close = eth_dev_close,\n> > > > > +\t.dev_configure = eth_dev_configure,\n> > > > > +\t.dev_infos_get = eth_dev_info,\n> > > > > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > > > > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > > > > +\t.rx_queue_release = eth_queue_release,\n> > > > > +\t.tx_queue_release 
= eth_queue_release,\n> > > > > +\t.link_update = eth_link_update,\n> > > > > +\t.stats_get = eth_stats_get,\n> > > > > +\t.stats_reset = eth_stats_reset,\n> > > > > +};\n> > > > > +\n> > > > > +/*\n> > > > > + * Opens an AF_PACKET socket\n> > > > > + */\n> > > > > +static int\n> > > > > +open_packet_iface(const char *key __rte_unused,\n> > > > > + const char *value __rte_unused,\n> > > > > + void *extra_args)\n> > > > > +{\n> > > > > +\tint *sockfd = extra_args;\n> > > > > +\n> > > > > +\t/* Open an AF_PACKET socket... */\n> > > > > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > > +\tif (*sockfd == -1) {\n> > > > > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > > > > +\t\treturn -1;\n> > > > > +\t}\n> > > > > +\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +rte_pmd_init_internals(const char *name,\n> > > > > + const int sockfd,\n> > > > > + const unsigned nb_queues,\n> > > > > + unsigned int blocksize,\n> > > > > + unsigned int blockcnt,\n> > > > > + unsigned int framesize,\n> > > > > + unsigned int framecnt,\n> > > > > + const unsigned numa_node,\n> > > > > + struct pmd_internals **internals,\n> > > > > + struct rte_eth_dev **eth_dev,\n> > > > > + struct rte_kvargs *kvlist)\n> > > > > +{\n> > > > > +\tstruct rte_eth_dev_data *data = NULL;\n> > > > > +\tstruct rte_pci_device *pci_dev = NULL;\n> > > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > > +\tstruct ifreq ifr;\n> > > > > +\tsize_t ifnamelen;\n> > > > > +\tunsigned k_idx;\n> > > > > +\tstruct sockaddr_ll sockaddr;\n> > > > > +\tstruct tpacket_req *req;\n> > > > > +\tstruct pkt_rx_queue *rx_queue;\n> > > > > +\tstruct pkt_tx_queue *tx_queue;\n> > > > > +\tint rc, tpver, discard, bypass;\n> > > > > +\tunsigned int i, q, rdsize;\n> > > > > +\tint qsockfd, fanout_arg;\n> > > > > +\n> > > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > > +\t\tif 
(strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > > > > +\t\t\tbreak;\n> > > > > +\t}\n> > > > > +\tif (pair == NULL) {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > > > > +\t\t name);\n> > > > > +\t\tgoto error;\n> > > > > +\t}\n> > > > > +\n> > > > > +\tRTE_LOG(INFO, PMD,\n> > > > > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > > > > +\t\tname, numa_node);\n> > > > > +\n> > > > > +\t/*\n> > > > > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > > > > +\t * and internal (private) data\n> > > > > +\t */\n> > > > > +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > > > > +\tif (data == NULL)\n> > > > > +\t\tgoto error;\n> > > > > +\n> > > > > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > > > > +\tif (pci_dev == NULL)\n> > > > > +\t\tgoto error;\n> > > > > +\n> > > > > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > > > > +\t 0, numa_node);\n> > > > > +\tif (*internals == NULL)\n> > > > > +\t\tgoto error;\n> > > > > +\n> > > > > +\treq = &((*internals)->req);\n> > > > > +\n> > > > > +\treq->tp_block_size = blocksize;\n> > > > > +\treq->tp_block_nr = blockcnt;\n> > > > > +\treq->tp_frame_size = framesize;\n> > > > > +\treq->tp_frame_nr = framecnt;\n> > > > > +\n> > > > > +\tifnamelen = strlen(pair->value);\n> > > > > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > > > > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > > > > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > > > > +\t} else {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > > > > +\t\t\tname, pair->value);\n> > > > > +\t\tgoto error;\n> > > > > +\t}\n> > > > > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > > > > +\t\t name);\n> > > > > +\t\tgoto error;\n> > > > > 
+\t}\n> > > > > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > > > > +\n> > > > > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > > > > +\t\t name);\n> > > > > +\t\tgoto error;\n> > > > > +\t}\n> > > > > +\tmemcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > > > > +\n> > > > > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > > > > +\tsockaddr.sll_family = AF_PACKET;\n> > > > > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > > > > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > > > > +\n> > > > > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > > > > +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> > > > > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > > > > +\n> > > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > > +\t\t/* Open an AF_PACKET socket for this queue... */\n> > > > > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > > +\t\tif (qsockfd == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > > > > +\t\t\t name);\n> > > > > +\t\t\treturn -1;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\ttpver = TPACKET_V2;\n> > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > > > > +\t\t\t\t&tpver, sizeof(tpver));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\tdiscard = 1;\n> > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > > > > +\t\t\t\t&discard, sizeof(discard));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > > > > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> 
> > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\tbypass = 1;\n> > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > > > > +\t\t\t\t&bypass, sizeof(bypass));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > > > > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > > > > +\t\t\t pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > > > > +\t\trx_queue->framecount = req->tp_frame_nr;\n> > > > > +\n> > > > > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > > > > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> > > > > +\t\t\t\t qsockfd, 0);\n> > > > > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > > > > +\t\t\t\tname, pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\t/* rdsize is same for both Tx and Rx */\n> > > > > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > > > > +\n> > > > > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > 
> > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > > > > +\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > > +\t\t}\n> > > > > +\t\trx_queue->sockfd = qsockfd;\n> > > > > +\n> > > > > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > > > > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > > > > +\n> > > > > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> > > > > +\n> > > > > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > > > > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > > +\t\t}\n> > > > > +\t\ttx_queue->sockfd = qsockfd;\n> > > > > +\n> > > > > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > > > > +\t\t\t name, pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\n> > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > > > > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > > > > +\t\tif (rc == -1) {\n> > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > > > > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > > > > +\t\t\tgoto error;\n> > > > > +\t\t}\n> > > > > +\t}\n> > > > > +\n> > > > > +\t/* reserve an ethdev entry */\n> > > > > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > > > > +\tif (*eth_dev == NULL)\n> > > > > +\t\tgoto error;\n> > > > > +\n> > > > > +\t/*\n> > > > > +\t * now put it all together\n> > > > > +\t * - store queue data in internals,\n> > > > > +\t * - store numa_node info in pci_driver\n> > > > > +\t * - point eth_dev_data to internals and pci_driver\n> > > > > +\t * - and 
point eth_dev structure to new eth_dev_data structure\n> > > > > +\t */\n> > > > > +\n> > > > > +\t(*internals)->nb_queues = nb_queues;\n> > > > > +\n> > > > > +\tdata->dev_private = *internals;\n> > > > > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > > > > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > > > > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > > > > +\tdata->dev_link = pmd_link;\n> > > > > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > > > > +\n> > > > > +\tpci_dev->numa_node = numa_node;\n> > > > > +\n> > > > > +\t(*eth_dev)->data = data;\n> > > > > +\t(*eth_dev)->dev_ops = &ops;\n> > > > > +\t(*eth_dev)->pci_dev = pci_dev;\n> > > > > +\n> > > > > +\treturn 0;\n> > > > > +\n> > > > > +error:\n> > > > > +\tif (data)\n> > > > > +\t\trte_free(data);\n> > > > > +\tif (pci_dev)\n> > > > > +\t\trte_free(pci_dev);\n> > > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > > +\t\tif ((*internals)->rx_queue[q].rd)\n> > > > > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > > > > +\t\tif ((*internals)->tx_queue[q].rd)\n> > > > > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > > > > +\t}\n> > > > > +\tif (*internals)\n> > > > > +\t\trte_free(*internals);\n> > > > > +\treturn -1;\n> > > > > +}\n> > > > > +\n> > > > > +static int\n> > > > > +rte_eth_from_packet(const char *name,\n> > > > > + int const *sockfd,\n> > > > > + const unsigned numa_node,\n> > > > > + struct rte_kvargs *kvlist)\n> > > > > +{\n> > > > > +\tstruct pmd_internals *internals = NULL;\n> > > > > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > > +\tunsigned k_idx;\n> > > > > +\tunsigned int blockcount;\n> > > > > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > > > > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > > > > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > > > > +\tunsigned int qpairs = 1;\n> > > > > +\n> > > > > +\t/* do some parameter checking */\n> > > > > +\tif (*sockfd < 0)\n> > > > > +\t\treturn -1;\n> 
> > > > +\n> > > > > +\t/*\n> > > > > +\t * Walk arguments for configurable settings\n> > > > > +\t */\n> > > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > > > > +\t\t\tqpairs = atoi(pair->value);\n> > > > > +\t\t\tif (qpairs < 1 ||\n> > > > > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > > > > +\t\t\t\t name);\n> > > > > +\t\t\t\treturn -1;\n> > > > > +\t\t\t}\n> > > > > +\t\t\tcontinue;\n> > > > > +\t\t}\n> > > > > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > > > > +\t\t\tblocksize = atoi(pair->value);\n> > > > > +\t\t\tif (!blocksize) {\n> > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > > > > +\t\t\t\t name);\n> > > > > +\t\t\t\treturn -1;\n> > > > > +\t\t\t}\n> > > > > +\t\t\tcontinue;\n> > > > > +\t\t}\n> > > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > > > > +\t\t\tframesize = atoi(pair->value);\n> > > > > +\t\t\tif (!framesize) {\n> > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > > > > +\t\t\t\t name);\n> > > > > +\t\t\t\treturn -1;\n> > > > > +\t\t\t}\n> > > > > +\t\t\tcontinue;\n> > > > > +\t\t}\n> > > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > > > > +\t\t\tframecount = atoi(pair->value);\n> > > > > +\t\t\tif (!framecount) {\n> > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > > > > +\t\t\t\t name);\n> > > > > +\t\t\t\treturn -1;\n> > > > > +\t\t\t}\n> > > > > +\t\t\tcontinue;\n> > > > > +\t\t}\n> > > > > +\t}\n> > > > > +\n> > > > > +\tif (framesize > blocksize) {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > > > > +\t\t name);\n> > > 
> > +\t\treturn -1;\n> > > > > +\t}\n> > > > > +\n> > > > > +\tblockcount = framecount / (blocksize / framesize);\n> > > > > +\tif (!blockcount) {\n> > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > > > > +\t\treturn -1;\n> > > > > +\t}\n> > > > > +\n> > > > > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > > > > +\n> > > > > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > > > > +\t blocksize, blockcount,\n> > > > > +\t framesize, framecount,\n> > > > > +\t numa_node, &internals, &eth_dev,\n> > > > > +\t kvlist) < 0)\n> > > > > +\t\treturn -1;\n> > > > > +\n> > > > > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > > > > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > > > > +\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +int\n> > > > > +rte_pmd_packet_devinit(const char *name, const char *params)\n> > > > > +{\n> > > > > +\tunsigned numa_node;\n> > > > > +\tint ret;\n> > > > > +\tstruct rte_kvargs *kvlist;\n> > > > > +\tint sockfd = -1;\n> > > > > +\n> > > > > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > > > > +\n> > > > > +\tnuma_node = rte_socket_id();\n> > > > > +\n> > > > > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > > > > +\tif (kvlist == NULL)\n> > > > > +\t\treturn -1;\n> > > > > +\n> > > > > +\t/*\n> > > > > +\t * If iface argument is passed we open the NICs and use them for\n> > > > > +\t * reading / writing\n> > > > > +\t */\n> > > > > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > > > > +\n> > > > > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > > > > 
+\t\t &open_packet_iface, &sockfd);\n> > > > > +\t\tif (ret < 0)\n> > > > > +\t\t\treturn -1;\n> > > > > +\t}\n> > > > > +\n> > > > > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > > > > +\tclose(sockfd); /* no longer needed */\n> > > > > +\n> > > > > +\tif (ret < 0)\n> > > > > +\t\treturn -1;\n> > > > > +\n> > > > > +\treturn 0;\n> > > > > +}\n> > > > > +\n> > > > > +static struct rte_driver pmd_packet_drv = {\n> > > > > +\t.name = \"eth_packet\",\n> > > > > +\t.type = PMD_VDEV,\n> > > > > +\t.init = rte_pmd_packet_devinit,\n> > > > > +};\n> > > > > +\n> > > > > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > new file mode 100644\n> > > > > index 000000000000..f685611da3e9\n> > > > > --- /dev/null\n> > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > @@ -0,0 +1,55 @@\n> > > > > +/*-\n> > > > > + * BSD LICENSE\n> > > > > + *\n> > > > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > > > + * All rights reserved.\n> > > > > + *\n> > > > > + * Redistribution and use in source and binary forms, with or without\n> > > > > + * modification, are permitted provided that the following conditions\n> > > > > + * are met:\n> > > > > + *\n> > > > > + * * Redistributions of source code must retain the above copyright\n> > > > > + * notice, this list of conditions and the following disclaimer.\n> > > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > > + * notice, this list of conditions and the following disclaimer in\n> > > > > + * the documentation and/or other materials provided with the\n> > > > > + * distribution.\n> > > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > > + * contributors may be used to endorse or promote products derived\n> > > > > + * from this software without specific prior written permission.\n> > > > > + *\n> > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > + */\n> > > > > +\n> > > > > +#ifndef _RTE_ETH_PACKET_H_\n> > > > > +#define _RTE_ETH_PACKET_H_\n> > > > > +\n> > > > > +#ifdef __cplusplus\n> > > > > +extern \"C\" {\n> > > > > +#endif\n> > > > > +\n> > > > > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > > > > +\n> > > > > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > > > > +\n> > > > > +/**\n> > > > > + * For use by the EAL only. Called as part of EAL init to set up any dummy NICs\n> > > > > + * configured on command line.\n> > > > > + */\n> > > > > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > > > > +\n> > > > > +#ifdef __cplusplus\n> > > > > +}\n> > > > > +#endif\n> > > > > +\n> > > > > +#endif\n> > > > > diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> > > > > index 34dff2a02a05..a6994c4dbe93 100644\n> > > > > --- a/mk/rte.app.mk\n> > > > > +++ b/mk/rte.app.mk\n> > > > > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> > > > > LDLIBS += -lrte_pmd_pcap -lpcap\n> > > > > endif\n> > > > >\n> > > > > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > > > > +LDLIBS += -lrte_pmd_packet\n> > > > > +endif\n> > > > > +\n> > > > > endif # plugins\n> > > > >\n> > > > > LDLIBS += $(EXECENV_LDLIBS)\n> > > > > --\n> > > > > 1.9.3\n> > > > >\n> > > > >\n> > > >\n> > > > --\n> > > > John W. 
Linville\t\tSomeday the world will need a hero, and you\n> > > > linville@tuxdriver.com\t\t\tmight be all we have. Be ready.\n> > >\n> > \n> > --\n> > John W. Linville\t\tSomeday the world will need a hero, and you\n> > linville@tuxdriver.com\t\t\tmight be all we have. Be ready.\n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 80E7958E8;\n\tMon, 15 Sep 2014 17:04:33 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id D4014137D\n\tfor <dev@dpdk.org>; Mon, 15 Sep 2014 17:04:30 +0200 (CEST)", "from hmsreliant.think-freely.org\n\t([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XTXuW-0007bC-9c; Mon, 15 Sep 2014 11:10:00 -0400" ], "Date": "Mon, 15 Sep 2014 11:09:46 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "\"Zhou, Danny\" <danny.zhou@intel.com>", "Message-ID": "<20140915150946.GA11690@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD 
for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 783, "web_url": "http://patches.dpdk.org/comment/783/", "msgid": "<20140915151506.GE28459@tuxdriver.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20140915151506.GE28459@tuxdriver.com", "date": "2014-09-15T15:15:07", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": "http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. Linville", "email": "linville@tuxdriver.com" }, "content": "On Mon, Sep 15, 2014 at 11:09:46AM -0400, Neil Horman wrote:\n> On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:\n> > > -----Original Message-----\n> > > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > > Sent: Saturday, September 13, 2014 2:54 AM\n> > > To: Zhou, Danny\n> > > Cc: dev@dpdk.org\n> > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > \n> > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > > > I am concerned about its performance caused by too many\n> > > > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > > > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > > > which are mapped to user space, and then those packets to be copied\n> > > > to DPDK mbuf. 
In addition, 3 copies needed on Tx side. So to run a\n> > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > > > copies which brings significant negative performance impact. We\n> > > > had a bifurcated driver prototype that can do zero-copy and achieve\n> > > > native DPDK performance, but it depends on base driver and AF_PACKET\n> > > > code changes in kernel, John R will be presenting it in coming Linux\n> > > > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > > > submitted to dpdk.org.\n> > > \n> > > Admittedly, this is not as good a performer as most of the existing\n> > > PMDs. It serves a different purpose, afterall. FWIW, you did\n> > > previously indicate that it performed better than the pcap-based PMD.\n> > \n> > Yes, slightly higher but makes no big difference.\n> > \n> Do you have numbers for this? It seems to me faster is faster as long as its\n> statistically significant. Even if its not, johns AF_PACKET pmd has the ability\n> to scale to multple cpus more easily than the pcap pmd, as it can make use of\n> the AF_PACKET fanout feature.\n> \n> > > I look forward to seeing the changes you mention -- they sound very\n> > > exciting. But, they will still require both networking core and\n> > > driver changes in the kernel. And as I understand things today,\n> > > the userland code will still need at least some knowledge of specific\n> > > devices and how they layout their packet descriptors, etc. So while\n> > > those changes sound very promising, they will still have certain\n> > > drawbacks in common with the current situation.\n> > \n> > Yes, we would like the DPDK performance optimization techniques such as huge page, efficient rx/tx routines to manipulate device-specific \n> > packet descriptors, polling-model can be still used. We have to tradeoff between performance and commonality. 
But we believe it will be much easier\n> > to develop DPDK PMD for non-Intel NICs than porting entire kernel drivers to DPDK.\n> > \n> \n> Not sure how this relates, what you're describing is the feature intel has been\n> working on to augment kernel drivers to provide better throughput via direct\n> hardware access to user space. Johns PMD provides ubiquitous function on all\n> hardware. I'm not sure how the desire for one implies the other isn't valuable?\n> \n> > > It seems like the changes you mention will still need some sort of\n> > > AF_PACKET-based PMD driver. Have you implemented that completely\n> > > separate from the code I already posted? Or did you add that work\n> > > on top of mine?\n> > > \n> > \n> > For userland code, it certainly use some of your code related to raw rocket, but highly modified. A layer will be added into eth_dev library to do device\n> > probe and support new socket options.\n> > \n> \n> Ok, but again, PMD's are independent, and serve different needs. If they're use\n> is at all overlapping from a functional standpoint, take this one now, and\n> deprecate it when a better one comes along. Though from your description it\n> seems like both have a valid place in the ecosystem.\n\nThat's where I'm at as well -- I don't see anything in the above that\namounts to an argument against the AF_PACKET-based PMD I have posted.\n\"Wait for ours\" doesn't hold much water, especially when we are trying\nto address different problems.\n\nJohn\n\n> \n> Neil\n> \n> > > John\n> > > \n> > > > > -----Original Message-----\n> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville\n> > > > > Sent: Saturday, September 13, 2014 2:05 AM\n> > > > > To: dev@dpdk.org\n> > > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > > >\n> > > > > Ping? 
Are there objections to this patch from mid-July?\n> > > > >\n> > > > > John\n> > > > >\n> > > > > On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> > > > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > > > > > socket. This implementation uses mmap'ed ring buffers to limit copying\n> > > > > > and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> > > > > > AF_PACKET is used for frame reception. In the current implementation,\n> > > > > > Tx and Rx queues are always paired, and therefore are always equal\n> > > > > > in number -- changing this would be a Simple Matter Of Programming.\n> > > > > >\n> > > > > > Interfaces of this type are created with a command line option like\n> > > > > > \"--vdev=eth_packet0,iface=...\". There are a number of options availabe\n> > > > > > as arguments:\n> > > > > >\n> > > > > > - Interface is chosen by \"iface\" (required)\n> > > > > > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > > > > > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > > > > > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > > > > > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> > > > > >\n> > > > > > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > > > > > ---\n> > > > > > This PMD is intended to provide a means for using DPDK on a broad\n> > > > > > range of hardware without hardware-specific PMDs and (hopefully)\n> > > > > > with better performance than what PCAP offers in Linux. 
This might\n> > > > > > be useful as a development platform for DPDK applications when\n> > > > > > DPDK-supported hardware is expensive or unavailable.\n> > > > > >\n> > > > > > New in v2:\n> > > > > >\n> > > > > > -- fixup some style issues found by check patch\n> > > > > > -- use if_index as part of fanout group ID\n> > > > > > -- set default number of queue pairs to 1\n> > > > > >\n> > > > > > config/common_bsdapp | 5 +\n> > > > > > config/common_linuxapp | 5 +\n> > > > > > lib/Makefile | 1 +\n> > > > > > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > > > > > lib/librte_pmd_packet/Makefile | 60 +++\n> > > > > > lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> > > > > > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > > > > > mk/rte.app.mk | 4 +\n> > > > > > 8 files changed, 957 insertions(+)\n> > > > > > create mode 100644 lib/librte_pmd_packet/Makefile\n> > > > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > >\n> > > > > > diff --git a/config/common_bsdapp b/config/common_bsdapp\n> > > > > > index 943dce8f1ede..c317f031278e 100644\n> > > > > > --- a/config/common_bsdapp\n> > > > > > +++ b/config/common_bsdapp\n> > > > > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > > > > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > > > >\n> > > > > > #\n> > > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > > > +#\n> > > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > > > > > +\n> > > > > > +#\n> > > > > > # Do prefetch of packet data within PMD driver receive function\n> > > > > > #\n> > > > > > CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> > > > > > diff --git a/config/common_linuxapp b/config/common_linuxapp\n> > > > > > index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > > > > > --- a/config/common_linuxapp\n> > > > > > +++ b/config/common_linuxapp\n> > > > > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > > > > > 
CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > > > >\n> > > > > > #\n> > > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > > > +#\n> > > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > > > > > +\n> > > > > > +#\n> > > > > > # Compile Xen PMD\n> > > > > > #\n> > > > > > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > > > > > diff --git a/lib/Makefile b/lib/Makefile\n> > > > > > index 10c5bb3045bc..930fadf29898 100644\n> > > > > > --- a/lib/Makefile\n> > > > > > +++ b/lib/Makefile\n> > > > > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > > > > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> > > > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > > index 756d6b0c9301..feed24a63272 100644\n> > > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > > > > > CFLAGS += $(WERROR_FLAGS) -O3\n> > > > > >\n> > > > > > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> > > > > > new file mode 100644\n> > > > > > index 000000000000..e1266fb992cd\n> > > > > > --- /dev/null\n> > > > > > +++ b/lib/librte_pmd_packet/Makefile\n> > > > 
> > @@ -0,0 +1,60 @@\n> > > > > > +# BSD LICENSE\n> > > > > > +#\n> > > > > > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > > > > > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > > > +# Copyright(c) 2014 6WIND S.A.\n> > > > > > +# All rights reserved.\n> > > > > > +#\n> > > > > > +# Redistribution and use in source and binary forms, with or without\n> > > > > > +# modification, are permitted provided that the following conditions\n> > > > > > +# are met:\n> > > > > > +#\n> > > > > > +# * Redistributions of source code must retain the above copyright\n> > > > > > +# notice, this list of conditions and the following disclaimer.\n> > > > > > +# * Redistributions in binary form must reproduce the above copyright\n> > > > > > +# notice, this list of conditions and the following disclaimer in\n> > > > > > +# the documentation and/or other materials provided with the\n> > > > > > +# distribution.\n> > > > > > +# * Neither the name of Intel Corporation nor the names of its\n> > > > > > +# contributors may be used to endorse or promote products derived\n> > > > > > +# from this software without specific prior written permission.\n> > > > > > +#\n> > > > > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > > +\n> > > > > > +include $(RTE_SDK)/mk/rte.vars.mk\n> > > > > > +\n> > > > > > +#\n> > > > > > +# library name\n> > > > > > +#\n> > > > > > +LIB = librte_pmd_packet.a\n> > > > > > +\n> > > > > > +CFLAGS += -O3\n> > > > > > +CFLAGS += $(WERROR_FLAGS)\n> > > > > > +\n> > > > > > +#\n> > > > > > +# all source are stored in SRCS-y\n> > > > > > +#\n> > > > > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > > > > > +\n> > > > > > +#\n> > > > > > +# Export include files\n> > > > > > +#\n> > > > > > +SYMLINK-y-include += rte_eth_packet.h\n> > > > > > +\n> > > > > > +# this lib depends upon:\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > > > > > +\n> > > > > > +include $(RTE_SDK)/mk/rte.lib.mk\n> > > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > > new file mode 100644\n> > > > > > index 000000000000..9c82d16e730f\n> > > > > > --- /dev/null\n> > > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > > @@ -0,0 +1,826 @@\n> > > > > > +/*-\n> > > > > > + * BSD LICENSE\n> > > > > > + *\n> > > > > > + * Copyright(c) 
2014 John W. Linville <linville@tuxdriver.com>\n> > > > > > + *\n> > > > > > + * Originally based upon librte_pmd_pcap code:\n> > > > > > + *\n> > > > > > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > > > + * Copyright(c) 2014 6WIND S.A.\n> > > > > > + * All rights reserved.\n> > > > > > + *\n> > > > > > + * Redistribution and use in source and binary forms, with or without\n> > > > > > + * modification, are permitted provided that the following conditions\n> > > > > > + * are met:\n> > > > > > + *\n> > > > > > + * * Redistributions of source code must retain the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer.\n> > > > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer in\n> > > > > > + * the documentation and/or other materials provided with the\n> > > > > > + * distribution.\n> > > > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > > > + * contributors may be used to endorse or promote products derived\n> > > > > > + * from this software without specific prior written permission.\n> > > > > > + *\n> > > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > > + */\n> > > > > > +\n> > > > > > +#include <rte_mbuf.h>\n> > > > > > +#include <rte_ethdev.h>\n> > > > > > +#include <rte_malloc.h>\n> > > > > > +#include <rte_kvargs.h>\n> > > > > > +#include <rte_dev.h>\n> > > > > > +\n> > > > > > +#include <linux/if_ether.h>\n> > > > > > +#include <linux/if_packet.h>\n> > > > > > +#include <arpa/inet.h>\n> > > > > > +#include <net/if.h>\n> > > > > > +#include <sys/types.h>\n> > > > > > +#include <sys/socket.h>\n> > > > > > +#include <sys/ioctl.h>\n> > > > > > +#include <sys/mman.h>\n> > > > > > +#include <unistd.h>\n> > > > > > +#include <poll.h>\n> > > > > > +\n> > > > > > +#include \"rte_eth_packet.h\"\n> > > > > > +\n> > > > > > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > > > > > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > > > > > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > > > > > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > > > > > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > > > > > +\n> > > > > > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > > > > > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > > > > > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > > > > > +\n> > > > > > +struct pkt_rx_queue {\n> > > > > > +\tint sockfd;\n> > > > > > +\n> > > > > > +\tstruct iovec *rd;\n> > > > > > +\tuint8_t *map;\n> > > > > > +\tunsigned int framecount;\n> > > > > > +\tunsigned int 
framenum;\n> > > > > > +\n> > > > > > +\tstruct rte_mempool *mb_pool;\n> > > > > > +\n> > > > > > +\tvolatile unsigned long rx_pkts;\n> > > > > > +\tvolatile unsigned long err_pkts;\n> > > > > > +};\n> > > > > > +\n> > > > > > +struct pkt_tx_queue {\n> > > > > > +\tint sockfd;\n> > > > > > +\n> > > > > > +\tstruct iovec *rd;\n> > > > > > +\tuint8_t *map;\n> > > > > > +\tunsigned int framecount;\n> > > > > > +\tunsigned int framenum;\n> > > > > > +\n> > > > > > +\tvolatile unsigned long tx_pkts;\n> > > > > > +\tvolatile unsigned long err_pkts;\n> > > > > > +};\n> > > > > > +\n> > > > > > +struct pmd_internals {\n> > > > > > +\tunsigned nb_queues;\n> > > > > > +\n> > > > > > +\tint if_index;\n> > > > > > +\tstruct ether_addr eth_addr;\n> > > > > > +\n> > > > > > +\tstruct tpacket_req req;\n> > > > > > +\n> > > > > > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > > > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > > > +};\n> > > > > > +\n> > > > > > +static const char *valid_arguments[] = {\n> > > > > > +\tETH_PACKET_IFACE_ARG,\n> > > > > > +\tETH_PACKET_NUM_Q_ARG,\n> > > > > > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > > > > > +\tETH_PACKET_FRAMESIZE_ARG,\n> > > > > > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > > > > > +\tNULL\n> > > > > > +};\n> > > > > > +\n> > > > > > +static const char *drivername = \"AF_PACKET PMD\";\n> > > > > > +\n> > > > > > +static struct rte_eth_link pmd_link = {\n> > > > > > +\t.link_speed = 10000,\n> > > > > > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > > > > > +\t.link_status = 0\n> > > > > > +};\n> > > > > > +\n> > > > > > +static uint16_t\n> > > > > > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > > > +{\n> > > > > > +\tunsigned i;\n> > > > > > +\tstruct tpacket2_hdr *ppd;\n> > > > > > +\tstruct rte_mbuf *mbuf;\n> > > > > > +\tuint8_t *pbuf;\n> > > > > > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > > > > > +\tuint16_t num_rx = 0;\n> > > > > > +\tunsigned int framecount, 
framenum;\n> > > > > > +\n> > > > > > +\tif (unlikely(nb_pkts == 0))\n> > > > > > +\t\treturn 0;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > > > > > +\t * one and copies the packet data into a newly allocated mbuf.\n> > > > > > +\t */\n> > > > > > +\tframecount = pkt_q->framecount;\n> > > > > > +\tframenum = pkt_q->framenum;\n> > > > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > > > +\t\t/* point at the next incoming frame */\n> > > > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > > > > > +\t\t\tbreak;\n> > > > > > +\n> > > > > > +\t\t/* allocate the next mbuf */\n> > > > > > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > > > > > +\t\tif (unlikely(mbuf == NULL))\n> > > > > > +\t\t\tbreak;\n> > > > > > +\n> > > > > > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > > > > > +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > > > > > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > > > > > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > > > > > +\n> > > > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > > > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > > > > > +\t\tif (++framenum >= framecount)\n> > > > > > +\t\t\tframenum = 0;\n> > > > > > +\n> > > > > > +\t\t/* account for the receive frame */\n> > > > > > +\t\tbufs[i] = mbuf;\n> > > > > > +\t\tnum_rx++;\n> > > > > > +\t}\n> > > > > > +\tpkt_q->framenum = framenum;\n> > > > > > +\tpkt_q->rx_pkts += num_rx;\n> > > > > > +\treturn num_rx;\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > + * Callback to handle sending packets through a real NIC.\n> > > > > > + */\n> > > > > > +static uint16_t\n> > > > > > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > > > +{\n> > > > > > +\tstruct tpacket2_hdr *ppd;\n> > > > > > +\tstruct rte_mbuf 
*mbuf;\n> > > > > > +\tuint8_t *pbuf;\n> > > > > > +\tunsigned int framecount, framenum;\n> > > > > > +\tstruct pollfd pfd;\n> > > > > > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > > > > > +\tuint16_t num_tx = 0;\n> > > > > > +\tint i;\n> > > > > > +\n> > > > > > +\tif (unlikely(nb_pkts == 0))\n> > > > > > +\t\treturn 0;\n> > > > > > +\n> > > > > > +\tmemset(&pfd, 0, sizeof(pfd));\n> > > > > > +\tpfd.fd = pkt_q->sockfd;\n> > > > > > +\tpfd.events = POLLOUT;\n> > > > > > +\tpfd.revents = 0;\n> > > > > > +\n> > > > > > +\tframecount = pkt_q->framecount;\n> > > > > > +\tframenum = pkt_q->framenum;\n> > > > > > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > > > +\t\t/* point at the next incoming frame */\n> > > > > > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > > > > > +\t\t (poll(&pfd, 1, -1) < 0))\n> > > > > > +\t\t\t\tcontinue;\n> > > > > > +\n> > > > > > +\t\t/* copy the tx frame data */\n> > > > > > +\t\tmbuf = bufs[num_tx];\n> > > > > > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > > > > > +\t\t\tsizeof(struct sockaddr_ll);\n> > > > > > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > > > > > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > > > > > +\n> > > > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > > > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > > > > > +\t\tif (++framenum >= framecount)\n> > > > > > +\t\t\tframenum = 0;\n> > > > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > > +\n> > > > > > +\t\tnum_tx++;\n> > > > > > +\t\trte_pktmbuf_free(mbuf);\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\t/* kick-off transmits */\n> > > > > > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > > > > > +\n> > > > > > +\tpkt_q->framenum = framenum;\n> > > > > > +\tpkt_q->tx_pkts += num_tx;\n> > > > > > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > > > > > +\treturn num_tx;\n> > > 
> > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_dev_start(struct rte_eth_dev *dev)\n> > > > > > +{\n> > > > > > +\tdev->data->dev_link.link_status = 1;\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > + * This function gets called when the current port gets stopped.\n> > > > > > + */\n> > > > > > +static void\n> > > > > > +eth_dev_stop(struct rte_eth_dev *dev)\n> > > > > > +{\n> > > > > > +\tunsigned i;\n> > > > > > +\tint sockfd;\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > > > > > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > > > > > +\t\tif (sockfd != -1)\n> > > > > > +\t\t\tclose(sockfd);\n> > > > > > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > > > > > +\t\tif (sockfd != -1)\n> > > > > > +\t\t\tclose(sockfd);\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tdev->data->dev_link.link_status = 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> > > > > > +{\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> > > > > > +{\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tdev_info->driver_name = drivername;\n> > > > > > +\tdev_info->if_index = internals->if_index;\n> > > > > > +\tdev_info->max_mac_addrs = 1;\n> > > > > > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > > > > > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > > > > > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > > > > > +\tdev_info->min_rx_bufsize = 0;\n> > > > > > +\tdev_info->pci_dev = NULL;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats 
*igb_stats)\n> > > > > > +{\n> > > > > > +\tunsigned i, imax;\n> > > > > > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > > > > > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > > > > > +\n> > > > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > > > +\tfor (i = 0; i < imax; i++) {\n> > > > > > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > > > > > +\t\trx_total += igb_stats->q_ipackets[i];\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > > > +\tfor (i = 0; i < imax; i++) {\n> > > > > > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > > > > > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > > > > > +\t\ttx_total += igb_stats->q_opackets[i];\n> > > > > > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tigb_stats->ipackets = rx_total;\n> > > > > > +\tigb_stats->opackets = tx_total;\n> > > > > > +\tigb_stats->oerrors = tx_err_total;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_stats_reset(struct rte_eth_dev *dev)\n> > > > > > +{\n> > > > > > +\tunsigned i;\n> > > > > > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > > > > > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > > > > > +\n> > > > > > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > > > > > +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > > > > > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > > > > > +\t}\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> > > > > > +{\n> > > > > 
> +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_queue_release(void *q __rte_unused)\n> > > > > > +{\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > > > > > + int wait_to_complete __rte_unused)\n> > > > > > +{\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > > > > > + uint16_t rx_queue_id,\n> > > > > > + uint16_t nb_rx_desc __rte_unused,\n> > > > > > + unsigned int socket_id __rte_unused,\n> > > > > > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > > > > > + struct rte_mempool *mb_pool)\n> > > > > > +{\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > > > > > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > > > > > +\tuint16_t buf_size;\n> > > > > > +\n> > > > > > +\tpkt_q->mb_pool = mb_pool;\n> > > > > > +\n> > > > > > +\t/* Now get the space available for data in the mbuf */\n> > > > > > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > > > > > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > > > > > +\t RTE_PKTMBUF_HEADROOM);\n> > > > > > +\n> > > > > > +\tif (ETH_FRAME_LEN > buf_size) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > > > > > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > > > > > +\t\treturn -ENOMEM;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > > > > > + uint16_t tx_queue_id,\n> > > > > > + uint16_t nb_tx_desc __rte_unused,\n> > > > > > + unsigned int socket_id __rte_unused,\n> > > > > > + const struct rte_eth_txconf *tx_conf __rte_unused)\n> 
> > > > > +{\n> > > > > > +\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static struct eth_dev_ops ops = {\n> > > > > > +\t.dev_start = eth_dev_start,\n> > > > > > +\t.dev_stop = eth_dev_stop,\n> > > > > > +\t.dev_close = eth_dev_close,\n> > > > > > +\t.dev_configure = eth_dev_configure,\n> > > > > > +\t.dev_infos_get = eth_dev_info,\n> > > > > > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > > > > > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > > > > > +\t.rx_queue_release = eth_queue_release,\n> > > > > > +\t.tx_queue_release = eth_queue_release,\n> > > > > > +\t.link_update = eth_link_update,\n> > > > > > +\t.stats_get = eth_stats_get,\n> > > > > > +\t.stats_reset = eth_stats_reset,\n> > > > > > +};\n> > > > > > +\n> > > > > > +/*\n> > > > > > + * Opens an AF_PACKET socket\n> > > > > > + */\n> > > > > > +static int\n> > > > > > +open_packet_iface(const char *key __rte_unused,\n> > > > > > + const char *value __rte_unused,\n> > > > > > + void *extra_args)\n> > > > > > +{\n> > > > > > +\tint *sockfd = extra_args;\n> > > > > > +\n> > > > > > +\t/* Open an AF_PACKET socket... 
*/\n> > > > > > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > > > +\tif (*sockfd == -1) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > > > > > +\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +rte_pmd_init_internals(const char *name,\n> > > > > > + const int sockfd,\n> > > > > > + const unsigned nb_queues,\n> > > > > > + unsigned int blocksize,\n> > > > > > + unsigned int blockcnt,\n> > > > > > + unsigned int framesize,\n> > > > > > + unsigned int framecnt,\n> > > > > > + const unsigned numa_node,\n> > > > > > + struct pmd_internals **internals,\n> > > > > > + struct rte_eth_dev **eth_dev,\n> > > > > > + struct rte_kvargs *kvlist)\n> > > > > > +{\n> > > > > > +\tstruct rte_eth_dev_data *data = NULL;\n> > > > > > +\tstruct rte_pci_device *pci_dev = NULL;\n> > > > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > > > +\tstruct ifreq ifr;\n> > > > > > +\tsize_t ifnamelen;\n> > > > > > +\tunsigned k_idx;\n> > > > > > +\tstruct sockaddr_ll sockaddr;\n> > > > > > +\tstruct tpacket_req *req;\n> > > > > > +\tstruct pkt_rx_queue *rx_queue;\n> > > > > > +\tstruct pkt_tx_queue *tx_queue;\n> > > > > > +\tint rc, tpver, discard, bypass;\n> > > > > > +\tunsigned int i, q, rdsize;\n> > > > > > +\tint qsockfd, fanout_arg;\n> > > > > > +\n> > > > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > > > > > +\t\t\tbreak;\n> > > > > > +\t}\n> > > > > > +\tif (pair == NULL) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tRTE_LOG(INFO, PMD,\n> > > > > > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > > > > > 
+\t\tname, numa_node);\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > > > > > +\t * and internal (private) data\n> > > > > > +\t */\n> > > > > > +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > > > > > +\tif (data == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > > > > > +\tif (pci_dev == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > > > > > +\t 0, numa_node);\n> > > > > > +\tif (*internals == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\treq = &((*internals)->req);\n> > > > > > +\n> > > > > > +\treq->tp_block_size = blocksize;\n> > > > > > +\treq->tp_block_nr = blockcnt;\n> > > > > > +\treq->tp_frame_size = framesize;\n> > > > > > +\treq->tp_frame_nr = framecnt;\n> > > > > > +\n> > > > > > +\tifnamelen = strlen(pair->value);\n> > > > > > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > > > > > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > > > > > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > > > > > +\t} else {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > > > > > +\t\t\tname, pair->value);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > > > > > +\n> > > > > > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\tmemcpy(&(*internals)->eth_addr, 
ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > > > > > +\n> > > > > > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > > > > > +\tsockaddr.sll_family = AF_PACKET;\n> > > > > > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > > > > > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > > > > > +\n> > > > > > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > > > > > +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> > > > > > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > > > > > +\n> > > > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > > > +\t\t/* Open an AF_PACKET socket for this queue... */\n> > > > > > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > > > +\t\tif (qsockfd == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > > > > > +\t\t\t name);\n> > > > > > +\t\t\treturn -1;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\ttpver = TPACKET_V2;\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > > > > > +\t\t\t\t&tpver, sizeof(tpver));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\tdiscard = 1;\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > > > > > +\t\t\t\t&discard, sizeof(discard));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > > > > > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\tbypass = 1;\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > > > > > +\t\t\t\t&bypass, sizeof(bypass));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > 
+\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > > > > > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > > > > > +\t\t\t pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > > > > > +\t\trx_queue->framecount = req->tp_frame_nr;\n> > > > > > +\n> > > > > > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > > > > > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> > > > > > +\t\t\t\t qsockfd, 0);\n> > > > > > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > > > > > +\t\t\t\tname, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\t/* rdsize is same for both Tx and Rx */\n> > > > > > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > > > > > +\n> > > > > > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > > > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > > > > > 
+\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > > > +\t\t}\n> > > > > > +\t\trx_queue->sockfd = qsockfd;\n> > > > > > +\n> > > > > > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > > > > > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > > > > > +\n> > > > > > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> > > > > > +\n> > > > > > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > > > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > > > > > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > > > +\t\t}\n> > > > > > +\t\ttx_queue->sockfd = qsockfd;\n> > > > > > +\n> > > > > > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > > > > > +\t\t\t name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > > > > > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > > > > > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\t/* reserve an ethdev entry */\n> > > > > > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > > > > > +\tif (*eth_dev == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * now put it all together\n> > > > > > +\t * - store queue data in internals,\n> > > > > > +\t * - store numa_node info in pci_driver\n> > > > > > +\t * - point eth_dev_data to internals and pci_driver\n> > > > > > +\t * - and point eth_dev structure to new eth_dev_data 
structure\n> > > > > > +\t */\n> > > > > > +\n> > > > > > +\t(*internals)->nb_queues = nb_queues;\n> > > > > > +\n> > > > > > +\tdata->dev_private = *internals;\n> > > > > > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > > > > > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > > > > > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > > > > > +\tdata->dev_link = pmd_link;\n> > > > > > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > > > > > +\n> > > > > > +\tpci_dev->numa_node = numa_node;\n> > > > > > +\n> > > > > > +\t(*eth_dev)->data = data;\n> > > > > > +\t(*eth_dev)->dev_ops = &ops;\n> > > > > > +\t(*eth_dev)->pci_dev = pci_dev;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +\n> > > > > > +error:\n> > > > > > +\tif (data)\n> > > > > > +\t\trte_free(data);\n> > > > > > +\tif (pci_dev)\n> > > > > > +\t\trte_free(pci_dev);\n> > > > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > > > +\t\tif ((*internals)->rx_queue[q].rd)\n> > > > > > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > > > > > +\t\tif ((*internals)->tx_queue[q].rd)\n> > > > > > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > > > > > +\t}\n> > > > > > +\tif (*internals)\n> > > > > > +\t\trte_free(*internals);\n> > > > > > +\treturn -1;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +rte_eth_from_packet(const char *name,\n> > > > > > + int const *sockfd,\n> > > > > > + const unsigned numa_node,\n> > > > > > + struct rte_kvargs *kvlist)\n> > > > > > +{\n> > > > > > +\tstruct pmd_internals *internals = NULL;\n> > > > > > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > > > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > > > +\tunsigned k_idx;\n> > > > > > +\tunsigned int blockcount;\n> > > > > > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > > > > > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > > > > > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > > > > > +\tunsigned int qpairs = 1;\n> > > > > > +\n> > > > > > +\t/* do some parameter checking 
*/\n> > > > > > +\tif (*sockfd < 0)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * Walk arguments for configurable settings\n> > > > > > +\t */\n> > > > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > > > > > +\t\t\tqpairs = atoi(pair->value);\n> > > > > > +\t\t\tif (qpairs < 1 ||\n> > > > > > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > > > > > +\t\t\tblocksize = atoi(pair->value);\n> > > > > > +\t\t\tif (!blocksize) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > > > > > +\t\t\tframesize = atoi(pair->value);\n> > > > > > +\t\t\tif (!framesize) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > > > > > +\t\t\tframecount = atoi(pair->value);\n> > > > > > +\t\t\tif (!framecount) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t}\n> > > > > > +\n> > > > > > 
+\tif (framesize > blocksize) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tblockcount = framecount / (blocksize / framesize);\n> > > > > > +\tif (!blockcount) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > > > > > +\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > > > > > +\n> > > > > > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > > > > > +\t blocksize, blockcount,\n> > > > > > +\t framesize, framecount,\n> > > > > > +\t numa_node, &internals, &eth_dev,\n> > > > > > +\t kvlist) < 0)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > > > > > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +int\n> > > > > > +rte_pmd_packet_devinit(const char *name, const char *params)\n> > > > > > +{\n> > > > > > +\tunsigned numa_node;\n> > > > > > +\tint ret;\n> > > > > > +\tstruct rte_kvargs *kvlist;\n> > > > > > +\tint sockfd = -1;\n> > > > > > +\n> > > > > > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > > > > > +\n> > > > > > +\tnuma_node = rte_socket_id();\n> > > > > > +\n> > > > > > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > > > > > +\tif (kvlist == NULL)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * If iface 
argument is passed we open the NICs and use them for\n> > > > > > +\t * reading / writing\n> > > > > > +\t */\n> > > > > > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > > > > > +\n> > > > > > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > > > > > +\t\t &open_packet_iface, &sockfd);\n> > > > > > +\t\tif (ret < 0)\n> > > > > > +\t\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > > > > > +\tclose(sockfd); /* no longer needed */\n> > > > > > +\n> > > > > > +\tif (ret < 0)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static struct rte_driver pmd_packet_drv = {\n> > > > > > +\t.name = \"eth_packet\",\n> > > > > > +\t.type = PMD_VDEV,\n> > > > > > +\t.init = rte_pmd_packet_devinit,\n> > > > > > +};\n> > > > > > +\n> > > > > > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > > new file mode 100644\n> > > > > > index 000000000000..f685611da3e9\n> > > > > > --- /dev/null\n> > > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > > @@ -0,0 +1,55 @@\n> > > > > > +/*-\n> > > > > > + * BSD LICENSE\n> > > > > > + *\n> > > > > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > > > > + * All rights reserved.\n> > > > > > + *\n> > > > > > + * Redistribution and use in source and binary forms, with or without\n> > > > > > + * modification, are permitted provided that the following conditions\n> > > > > > + * are met:\n> > > > > > + *\n> > > > > > + * * Redistributions of source code must retain the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer.\n> > > > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer in\n> > > > > > + * the documentation and/or other materials provided with the\n> > > > > > + * distribution.\n> > > > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > > > + * contributors may be used to endorse or promote products derived\n> > > > > > + * from this software without specific prior written permission.\n> > > > > > + *\n> > > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > > + */\n> > > > > > +\n> > > > > > +#ifndef _RTE_ETH_PACKET_H_\n> > > > > > +#define _RTE_ETH_PACKET_H_\n> > > > > > +\n> > > > > > +#ifdef __cplusplus\n> > > > > > +extern \"C\" {\n> > > > > > +#endif\n> > > > > > +\n> > > > > > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > > > > > +\n> > > > > > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > > > > > +\n> > > > > > +/**\n> > > > > > + * For use by the EAL only. 
Called as part of EAL init to set up any dummy NICs\n> > > > > > + * configured on command line.\n> > > > > > + */\n> > > > > > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > > > > > +\n> > > > > > +#ifdef __cplusplus\n> > > > > > +}\n> > > > > > +#endif\n> > > > > > +\n> > > > > > +#endif\n> > > > > > diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> > > > > > index 34dff2a02a05..a6994c4dbe93 100644\n> > > > > > --- a/mk/rte.app.mk\n> > > > > > +++ b/mk/rte.app.mk\n> > > > > > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> > > > > > LDLIBS += -lrte_pmd_pcap -lpcap\n> > > > > > endif\n> > > > > >\n> > > > > > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > > > > > +LDLIBS += -lrte_pmd_packet\n> > > > > > +endif\n> > > > > > +\n> > > > > > endif # plugins\n> > > > > >\n> > > > > > LDLIBS += $(EXECENV_LDLIBS)\n> > > > > > --\n> > > > > > 1.9.3\n> > > > > >\n> > > > > >\n> > > > >\n> > > > > --\n> > > > > John W. Linville\t\tSomeday the world will need a hero, and you\n> > > > > linville@tuxdriver.com\t\t\tmight be all we have. Be ready.\n> > > >\n> > > \n> > > --\n> > > John W. Linville\t\tSomeday the world will need a hero, and you\n> > > linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.\n> > \n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id D612C58E8;\n\tMon, 15 Sep 2014 17:24:42 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 4B37A38EB\n\tfor <dev@dpdk.org>; Mon, 15 Sep 2014 17:24:40 +0200 (CEST)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>)\n\tid 1XTYEF-0007k9-59; Mon, 15 Sep 2014 11:30:11 -0400", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\ts8FFF8HP000838; Mon, 15 Sep 2014 11:15:08 -0400", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\ts8FFF7it000834; Mon, 15 Sep 2014 11:15:07 -0400" ], "Date": "Mon, 15 Sep 2014 11:15:07 -0400", "From": "\"John W. 
Linville\" <linville@tuxdriver.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Message-ID": "<20140915151506.GE28459@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>\n\t<20140915150946.GA11690@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20140915150946.GA11690@hmsreliant.think-freely.org>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 784, "web_url": "http://patches.dpdk.org/comment/784/", "msgid": "<DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com", "date": "2014-09-15T15:43:07", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": 
"http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "> -----Original Message-----\n> From: Neil Horman [mailto:nhorman@tuxdriver.com]\n> Sent: Monday, September 15, 2014 11:10 PM\n> To: Zhou, Danny\n> Cc: John W. Linville; dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> \n> On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:\n> > > -----Original Message-----\n> > > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > > Sent: Saturday, September 13, 2014 2:54 AM\n> > > To: Zhou, Danny\n> > > Cc: dev@dpdk.org\n> > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > >\n> > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > > > I am concerned about its performance caused by too many\n> > > > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > > > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > > > which are mapped to user space, and then those packets to be copied\n> > > > to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a\n> > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > > > copies which brings significant negative performance impact. We\n> > > > had a bifurcated driver prototype that can do zero-copy and achieve\n> > > > native DPDK performance, but it depends on base driver and AF_PACKET\n> > > > code changes in kernel, John R will be presenting it in coming Linux\n> > > > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > > > submitted to dpdk.org.\n> > >\n> > > Admittedly, this is not as good a performer as most of the existing\n> > > PMDs. It serves a different purpose, afterall. 
FWIW, you did\n> > > previously indicate that it performed better than the pcap-based PMD.\n> >\n> > Yes, slightly higher but makes no big difference.\n> >\n> Do you have numbers for this? It seems to me faster is faster as long as its\n> statistically significant. Even if its not, johns AF_PACKET pmd has the ability\n> to scale to multple cpus more easily than the pcap pmd, as it can make use of\n> the AF_PACKET fanout feature.\n\nFor 64B small packets, 1.35M pps with 1 queue. As both the pcap and AF_PACKET PMDs depend on \ninterrupt-based NIC kernel drivers, none of the DPDK performance optimization techniques are utilized. Why should DPDK adopt \ntwo similar, poorly performing PMDs which cannot demonstrate DPDK's key value, \"high performance\"?\n\n> \n> > > I look forward to seeing the changes you mention -- they sound very\n> > > exciting. But, they will still require both networking core and\n> > > driver changes in the kernel. And as I understand things today,\n> > > the userland code will still need at least some knowledge of specific\n> > > devices and how they layout their packet descriptors, etc. So while\n> > > those changes sound very promising, they will still have certain\n> > > drawbacks in common with the current situation.\n> >\n> > Yes, we would like the DPDK performance optimization techniques such as huge page, efficient rx/tx routines to manipulate\n> device-specific\n> > packet descriptors, polling-model can be still used. We have to tradeoff between performance and commonality. But we believe it will\n> be much easier\n> > to develop DPDK PMD for non-Intel NICs than porting entire kernel drivers to DPDK.\n> >\n> \n> Not sure how this relates, what you're describing is the feature intel has been\n> working on to augment kernel drivers to provide better throughput via direct\n> hardware access to user space. Johns PMD provides ubiquitous function on all\n> hardware. 
I'm not sure how the desire for one implies the other isn't valuable?\n> \n\nPerformance, rather than commonality, is the key value of DPDK. But we are trying to improve the commonality of our solution so that it can be easily \nadopted by other NIC vendors.\n\n> > > It seems like the changes you mention will still need some sort of\n> > > AF_PACKET-based PMD driver. Have you implemented that completely\n> > > separate from the code I already posted? Or did you add that work\n> > > on top of mine?\n> > >\n> >\n> > For userland code, it certainly use some of your code related to raw rocket, but highly modified. A layer will be added into eth_dev\n> library to do device\n> > probe and support new socket options.\n> >\n> \n> Ok, but again, PMD's are independent, and serve different needs. If they're use\n> is at all overlapping from a functional standpoint, take this one now, and\n> deprecate it when a better one comes along. Though from your description it\n> seems like both have a valid place in the ecosystem.\n> \n\nI am OK with this approach, as long as this AF_PACKET PMD does not add extra maintenance effort. Thomas might make the call.\n\n> Neil\n> \n> > > John\n> > >\n> > > > > -----Original Message-----\n> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John W. Linville\n> > > > > Sent: Saturday, September 13, 2014 2:05 AM\n> > > > > To: dev@dpdk.org\n> > > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > > >\n> > > > > Ping? Are there objections to this patch from mid-July?\n> > > > >\n> > > > > John\n> > > > >\n> > > > > On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:\n> > > > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET\n> > > > > > socket. This implementation uses mmap'ed ring buffers to limit copying\n> > > > > > and user/kernel transitions. The PACKET_FANOUT_HASH behavior of\n> > > > > > AF_PACKET is used for frame reception. 
In the current implementation,\n> > > > > > Tx and Rx queues are always paired, and therefore are always equal\n> > > > > > in number -- changing this would be a Simple Matter Of Programming.\n> > > > > >\n> > > > > > Interfaces of this type are created with a command line option like\n> > > > > > \"--vdev=eth_packet0,iface=...\". There are a number of options availabe\n> > > > > > as arguments:\n> > > > > >\n> > > > > > - Interface is chosen by \"iface\" (required)\n> > > > > > - Number of queue pairs set by \"qpairs\" (optional, default: 1)\n> > > > > > - AF_PACKET MMAP block size set by \"blocksz\" (optional, default: 4096)\n> > > > > > - AF_PACKET MMAP frame size set by \"framesz\" (optional, default: 2048)\n> > > > > > - AF_PACKET MMAP frame count set by \"framecnt\" (optional, default: 512)\n> > > > > >\n> > > > > > Signed-off-by: John W. Linville <linville@tuxdriver.com>\n> > > > > > ---\n> > > > > > This PMD is intended to provide a means for using DPDK on a broad\n> > > > > > range of hardware without hardware-specific PMDs and (hopefully)\n> > > > > > with better performance than what PCAP offers in Linux. 
This might\n> > > > > > be useful as a development platform for DPDK applications when\n> > > > > > DPDK-supported hardware is expensive or unavailable.\n> > > > > >\n> > > > > > New in v2:\n> > > > > >\n> > > > > > -- fixup some style issues found by check patch\n> > > > > > -- use if_index as part of fanout group ID\n> > > > > > -- set default number of queue pairs to 1\n> > > > > >\n> > > > > > config/common_bsdapp | 5 +\n> > > > > > config/common_linuxapp | 5 +\n> > > > > > lib/Makefile | 1 +\n> > > > > > lib/librte_eal/linuxapp/eal/Makefile | 1 +\n> > > > > > lib/librte_pmd_packet/Makefile | 60 +++\n> > > > > > lib/librte_pmd_packet/rte_eth_packet.c | 826 +++++++++++++++++++++++++++++++++\n> > > > > > lib/librte_pmd_packet/rte_eth_packet.h | 55 +++\n> > > > > > mk/rte.app.mk | 4 +\n> > > > > > 8 files changed, 957 insertions(+)\n> > > > > > create mode 100644 lib/librte_pmd_packet/Makefile\n> > > > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > > create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > >\n> > > > > > diff --git a/config/common_bsdapp b/config/common_bsdapp\n> > > > > > index 943dce8f1ede..c317f031278e 100644\n> > > > > > --- a/config/common_bsdapp\n> > > > > > +++ b/config/common_bsdapp\n> > > > > > @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y\n> > > > > > CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > > > >\n> > > > > > #\n> > > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > > > +#\n> > > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=n\n> > > > > > +\n> > > > > > +#\n> > > > > > # Do prefetch of packet data within PMD driver receive function\n> > > > > > #\n> > > > > > CONFIG_RTE_PMD_PACKET_PREFETCH=y\n> > > > > > diff --git a/config/common_linuxapp b/config/common_linuxapp\n> > > > > > index 7bf5d80d4e26..f9e7bc3015ec 100644\n> > > > > > --- a/config/common_linuxapp\n> > > > > > +++ b/config/common_linuxapp\n> > > > > > @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n\n> > > > > > 
CONFIG_RTE_LIBRTE_PMD_BOND=y\n> > > > > >\n> > > > > > #\n> > > > > > +# Compile software PMD backed by AF_PACKET sockets (Linux only)\n> > > > > > +#\n> > > > > > +CONFIG_RTE_LIBRTE_PMD_PACKET=y\n> > > > > > +\n> > > > > > +#\n> > > > > > # Compile Xen PMD\n> > > > > > #\n> > > > > > CONFIG_RTE_LIBRTE_PMD_XENVIRT=n\n> > > > > > diff --git a/lib/Makefile b/lib/Makefile\n> > > > > > index 10c5bb3045bc..930fadf29898 100644\n> > > > > > --- a/lib/Makefile\n> > > > > > +++ b/lib/Makefile\n> > > > > > @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap\n> > > > > > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3\n> > > > > > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt\n> > > > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > > index 756d6b0c9301..feed24a63272 100644\n> > > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile\n> > > > > > @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap\n> > > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet\n> > > > > > CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt\n> > > > > > CFLAGS += $(WERROR_FLAGS) -O3\n> > > > > >\n> > > > > > diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile\n> > > > > > new file mode 100644\n> > > > > > index 000000000000..e1266fb992cd\n> > > > > > --- /dev/null\n> > > > > > +++ b/lib/librte_pmd_packet/Makefile\n> > > > 
> > @@ -0,0 +1,60 @@\n> > > > > > +# BSD LICENSE\n> > > > > > +#\n> > > > > > +# Copyright(c) 2014 John W. Linville <linville@redhat.com>\n> > > > > > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > > > +# Copyright(c) 2014 6WIND S.A.\n> > > > > > +# All rights reserved.\n> > > > > > +#\n> > > > > > +# Redistribution and use in source and binary forms, with or without\n> > > > > > +# modification, are permitted provided that the following conditions\n> > > > > > +# are met:\n> > > > > > +#\n> > > > > > +# * Redistributions of source code must retain the above copyright\n> > > > > > +# notice, this list of conditions and the following disclaimer.\n> > > > > > +# * Redistributions in binary form must reproduce the above copyright\n> > > > > > +# notice, this list of conditions and the following disclaimer in\n> > > > > > +# the documentation and/or other materials provided with the\n> > > > > > +# distribution.\n> > > > > > +# * Neither the name of Intel Corporation nor the names of its\n> > > > > > +# contributors may be used to endorse or promote products derived\n> > > > > > +# from this software without specific prior written permission.\n> > > > > > +#\n> > > > > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > > +# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > > +\n> > > > > > +include $(RTE_SDK)/mk/rte.vars.mk\n> > > > > > +\n> > > > > > +#\n> > > > > > +# library name\n> > > > > > +#\n> > > > > > +LIB = librte_pmd_packet.a\n> > > > > > +\n> > > > > > +CFLAGS += -O3\n> > > > > > +CFLAGS += $(WERROR_FLAGS)\n> > > > > > +\n> > > > > > +#\n> > > > > > +# all source are stored in SRCS-y\n> > > > > > +#\n> > > > > > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += rte_eth_packet.c\n> > > > > > +\n> > > > > > +#\n> > > > > > +# Export include files\n> > > > > > +#\n> > > > > > +SYMLINK-y-include += rte_eth_packet.h\n> > > > > > +\n> > > > > > +# this lib depends upon:\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_mbuf\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_ether\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_malloc\n> > > > > > +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += lib/librte_kvargs\n> > > > > > +\n> > > > > > +include $(RTE_SDK)/mk/rte.lib.mk\n> > > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.c b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > > new file mode 100644\n> > > > > > index 000000000000..9c82d16e730f\n> > > > > > --- /dev/null\n> > > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.c\n> > > > > > @@ -0,0 +1,826 @@\n> > > > > > +/*-\n> > > > > > + * BSD LICENSE\n> > > > > > + *\n> > > > > > + * Copyright(c) 
2014 John W. Linville <linville@tuxdriver.com>\n> > > > > > + *\n> > > > > > + * Originally based upon librte_pmd_pcap code:\n> > > > > > + *\n> > > > > > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.\n> > > > > > + * Copyright(c) 2014 6WIND S.A.\n> > > > > > + * All rights reserved.\n> > > > > > + *\n> > > > > > + * Redistribution and use in source and binary forms, with or without\n> > > > > > + * modification, are permitted provided that the following conditions\n> > > > > > + * are met:\n> > > > > > + *\n> > > > > > + * * Redistributions of source code must retain the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer.\n> > > > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer in\n> > > > > > + * the documentation and/or other materials provided with the\n> > > > > > + * distribution.\n> > > > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > > > + * contributors may be used to endorse or promote products derived\n> > > > > > + * from this software without specific prior written permission.\n> > > > > > + *\n> > > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > > + */\n> > > > > > +\n> > > > > > +#include <rte_mbuf.h>\n> > > > > > +#include <rte_ethdev.h>\n> > > > > > +#include <rte_malloc.h>\n> > > > > > +#include <rte_kvargs.h>\n> > > > > > +#include <rte_dev.h>\n> > > > > > +\n> > > > > > +#include <linux/if_ether.h>\n> > > > > > +#include <linux/if_packet.h>\n> > > > > > +#include <arpa/inet.h>\n> > > > > > +#include <net/if.h>\n> > > > > > +#include <sys/types.h>\n> > > > > > +#include <sys/socket.h>\n> > > > > > +#include <sys/ioctl.h>\n> > > > > > +#include <sys/mman.h>\n> > > > > > +#include <unistd.h>\n> > > > > > +#include <poll.h>\n> > > > > > +\n> > > > > > +#include \"rte_eth_packet.h\"\n> > > > > > +\n> > > > > > +#define ETH_PACKET_IFACE_ARG\t\t\"iface\"\n> > > > > > +#define ETH_PACKET_NUM_Q_ARG\t\t\"qpairs\"\n> > > > > > +#define ETH_PACKET_BLOCKSIZE_ARG\t\"blocksz\"\n> > > > > > +#define ETH_PACKET_FRAMESIZE_ARG\t\"framesz\"\n> > > > > > +#define ETH_PACKET_FRAMECOUNT_ARG\t\"framecnt\"\n> > > > > > +\n> > > > > > +#define DFLT_BLOCK_SIZE\t\t(1 << 12)\n> > > > > > +#define DFLT_FRAME_SIZE\t\t(1 << 11)\n> > > > > > +#define DFLT_FRAME_COUNT\t(1 << 9)\n> > > > > > +\n> > > > > > +struct pkt_rx_queue {\n> > > > > > +\tint sockfd;\n> > > > > > +\n> > > > > > +\tstruct iovec *rd;\n> > > > > > +\tuint8_t *map;\n> > > > > > +\tunsigned int framecount;\n> > > > > > +\tunsigned int 
framenum;\n> > > > > > +\n> > > > > > +\tstruct rte_mempool *mb_pool;\n> > > > > > +\n> > > > > > +\tvolatile unsigned long rx_pkts;\n> > > > > > +\tvolatile unsigned long err_pkts;\n> > > > > > +};\n> > > > > > +\n> > > > > > +struct pkt_tx_queue {\n> > > > > > +\tint sockfd;\n> > > > > > +\n> > > > > > +\tstruct iovec *rd;\n> > > > > > +\tuint8_t *map;\n> > > > > > +\tunsigned int framecount;\n> > > > > > +\tunsigned int framenum;\n> > > > > > +\n> > > > > > +\tvolatile unsigned long tx_pkts;\n> > > > > > +\tvolatile unsigned long err_pkts;\n> > > > > > +};\n> > > > > > +\n> > > > > > +struct pmd_internals {\n> > > > > > +\tunsigned nb_queues;\n> > > > > > +\n> > > > > > +\tint if_index;\n> > > > > > +\tstruct ether_addr eth_addr;\n> > > > > > +\n> > > > > > +\tstruct tpacket_req req;\n> > > > > > +\n> > > > > > +\tstruct pkt_rx_queue rx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > > > +\tstruct pkt_tx_queue tx_queue[RTE_PMD_PACKET_MAX_RINGS];\n> > > > > > +};\n> > > > > > +\n> > > > > > +static const char *valid_arguments[] = {\n> > > > > > +\tETH_PACKET_IFACE_ARG,\n> > > > > > +\tETH_PACKET_NUM_Q_ARG,\n> > > > > > +\tETH_PACKET_BLOCKSIZE_ARG,\n> > > > > > +\tETH_PACKET_FRAMESIZE_ARG,\n> > > > > > +\tETH_PACKET_FRAMECOUNT_ARG,\n> > > > > > +\tNULL\n> > > > > > +};\n> > > > > > +\n> > > > > > +static const char *drivername = \"AF_PACKET PMD\";\n> > > > > > +\n> > > > > > +static struct rte_eth_link pmd_link = {\n> > > > > > +\t.link_speed = 10000,\n> > > > > > +\t.link_duplex = ETH_LINK_FULL_DUPLEX,\n> > > > > > +\t.link_status = 0\n> > > > > > +};\n> > > > > > +\n> > > > > > +static uint16_t\n> > > > > > +eth_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > > > +{\n> > > > > > +\tunsigned i;\n> > > > > > +\tstruct tpacket2_hdr *ppd;\n> > > > > > +\tstruct rte_mbuf *mbuf;\n> > > > > > +\tuint8_t *pbuf;\n> > > > > > +\tstruct pkt_rx_queue *pkt_q = queue;\n> > > > > > +\tuint16_t num_rx = 0;\n> > > > > > +\tunsigned int framecount, 
framenum;\n> > > > > > +\n> > > > > > +\tif (unlikely(nb_pkts == 0))\n> > > > > > +\t\treturn 0;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * Reads the given number of packets from the AF_PACKET socket one by\n> > > > > > +\t * one and copies the packet data into a newly allocated mbuf.\n> > > > > > +\t */\n> > > > > > +\tframecount = pkt_q->framecount;\n> > > > > > +\tframenum = pkt_q->framenum;\n> > > > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > > > +\t\t/* point at the next incoming frame */\n> > > > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > > +\t\tif ((ppd->tp_status & TP_STATUS_USER) == 0)\n> > > > > > +\t\t\tbreak;\n> > > > > > +\n> > > > > > +\t\t/* allocate the next mbuf */\n> > > > > > +\t\tmbuf = rte_pktmbuf_alloc(pkt_q->mb_pool);\n> > > > > > +\t\tif (unlikely(mbuf == NULL))\n> > > > > > +\t\t\tbreak;\n> > > > > > +\n> > > > > > +\t\t/* packet will fit in the mbuf, go ahead and receive it */\n> > > > > > +\t\tmbuf->pkt.pkt_len = mbuf->pkt.data_len = ppd->tp_snaplen;\n> > > > > > +\t\tpbuf = (uint8_t *) ppd + ppd->tp_mac;\n> > > > > > +\t\tmemcpy(mbuf->pkt.data, pbuf, mbuf->pkt.data_len);\n> > > > > > +\n> > > > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > > > +\t\tppd->tp_status = TP_STATUS_KERNEL;\n> > > > > > +\t\tif (++framenum >= framecount)\n> > > > > > +\t\t\tframenum = 0;\n> > > > > > +\n> > > > > > +\t\t/* account for the receive frame */\n> > > > > > +\t\tbufs[i] = mbuf;\n> > > > > > +\t\tnum_rx++;\n> > > > > > +\t}\n> > > > > > +\tpkt_q->framenum = framenum;\n> > > > > > +\tpkt_q->rx_pkts += num_rx;\n> > > > > > +\treturn num_rx;\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > + * Callback to handle sending packets through a real NIC.\n> > > > > > + */\n> > > > > > +static uint16_t\n> > > > > > +eth_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)\n> > > > > > +{\n> > > > > > +\tstruct tpacket2_hdr *ppd;\n> > > > > > +\tstruct rte_mbuf 
*mbuf;\n> > > > > > +\tuint8_t *pbuf;\n> > > > > > +\tunsigned int framecount, framenum;\n> > > > > > +\tstruct pollfd pfd;\n> > > > > > +\tstruct pkt_tx_queue *pkt_q = queue;\n> > > > > > +\tuint16_t num_tx = 0;\n> > > > > > +\tint i;\n> > > > > > +\n> > > > > > +\tif (unlikely(nb_pkts == 0))\n> > > > > > +\t\treturn 0;\n> > > > > > +\n> > > > > > +\tmemset(&pfd, 0, sizeof(pfd));\n> > > > > > +\tpfd.fd = pkt_q->sockfd;\n> > > > > > +\tpfd.events = POLLOUT;\n> > > > > > +\tpfd.revents = 0;\n> > > > > > +\n> > > > > > +\tframecount = pkt_q->framecount;\n> > > > > > +\tframenum = pkt_q->framenum;\n> > > > > > +\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > > +\tfor (i = 0; i < nb_pkts; i++) {\n> > > > > > +\t\t/* point at the next incoming frame */\n> > > > > > +\t\tif ((ppd->tp_status != TP_STATUS_AVAILABLE) &&\n> > > > > > +\t\t (poll(&pfd, 1, -1) < 0))\n> > > > > > +\t\t\t\tcontinue;\n> > > > > > +\n> > > > > > +\t\t/* copy the tx frame data */\n> > > > > > +\t\tmbuf = bufs[num_tx];\n> > > > > > +\t\tpbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -\n> > > > > > +\t\t\tsizeof(struct sockaddr_ll);\n> > > > > > +\t\tmemcpy(pbuf, mbuf->pkt.data, mbuf->pkt.data_len);\n> > > > > > +\t\tppd->tp_len = ppd->tp_snaplen = mbuf->pkt.data_len;\n> > > > > > +\n> > > > > > +\t\t/* release incoming frame and advance ring buffer */\n> > > > > > +\t\tppd->tp_status = TP_STATUS_SEND_REQUEST;\n> > > > > > +\t\tif (++framenum >= framecount)\n> > > > > > +\t\t\tframenum = 0;\n> > > > > > +\t\tppd = (struct tpacket2_hdr *) pkt_q->rd[framenum].iov_base;\n> > > > > > +\n> > > > > > +\t\tnum_tx++;\n> > > > > > +\t\trte_pktmbuf_free(mbuf);\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\t/* kick-off transmits */\n> > > > > > +\tsendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0);\n> > > > > > +\n> > > > > > +\tpkt_q->framenum = framenum;\n> > > > > > +\tpkt_q->tx_pkts += num_tx;\n> > > > > > +\tpkt_q->err_pkts += nb_pkts - num_tx;\n> > > > > > +\treturn num_tx;\n> > > 
> > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_dev_start(struct rte_eth_dev *dev)\n> > > > > > +{\n> > > > > > +\tdev->data->dev_link.link_status = 1;\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > + * This function gets called when the current port gets stopped.\n> > > > > > + */\n> > > > > > +static void\n> > > > > > +eth_dev_stop(struct rte_eth_dev *dev)\n> > > > > > +{\n> > > > > > +\tunsigned i;\n> > > > > > +\tint sockfd;\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tfor (i = 0; i < internals->nb_queues; i++) {\n> > > > > > +\t\tsockfd = internals->rx_queue[i].sockfd;\n> > > > > > +\t\tif (sockfd != -1)\n> > > > > > +\t\t\tclose(sockfd);\n> > > > > > +\t\tsockfd = internals->tx_queue[i].sockfd;\n> > > > > > +\t\tif (sockfd != -1)\n> > > > > > +\t\t\tclose(sockfd);\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tdev->data->dev_link.link_status = 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_dev_configure(struct rte_eth_dev *dev __rte_unused)\n> > > > > > +{\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)\n> > > > > > +{\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tdev_info->driver_name = drivername;\n> > > > > > +\tdev_info->if_index = internals->if_index;\n> > > > > > +\tdev_info->max_mac_addrs = 1;\n> > > > > > +\tdev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;\n> > > > > > +\tdev_info->max_rx_queues = (uint16_t)internals->nb_queues;\n> > > > > > +\tdev_info->max_tx_queues = (uint16_t)internals->nb_queues;\n> > > > > > +\tdev_info->min_rx_bufsize = 0;\n> > > > > > +\tdev_info->pci_dev = NULL;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats 
*igb_stats)\n> > > > > > +{\n> > > > > > +\tunsigned i, imax;\n> > > > > > +\tunsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;\n> > > > > > +\tconst struct pmd_internals *internal = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tmemset(igb_stats, 0, sizeof(*igb_stats));\n> > > > > > +\n> > > > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > > > +\tfor (i = 0; i < imax; i++) {\n> > > > > > +\t\tigb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;\n> > > > > > +\t\trx_total += igb_stats->q_ipackets[i];\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\timax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?\n> > > > > > +\t internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);\n> > > > > > +\tfor (i = 0; i < imax; i++) {\n> > > > > > +\t\tigb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;\n> > > > > > +\t\tigb_stats->q_errors[i] = internal->tx_queue[i].err_pkts;\n> > > > > > +\t\ttx_total += igb_stats->q_opackets[i];\n> > > > > > +\t\ttx_err_total += igb_stats->q_errors[i];\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tigb_stats->ipackets = rx_total;\n> > > > > > +\tigb_stats->opackets = tx_total;\n> > > > > > +\tigb_stats->oerrors = tx_err_total;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_stats_reset(struct rte_eth_dev *dev)\n> > > > > > +{\n> > > > > > +\tunsigned i;\n> > > > > > +\tstruct pmd_internals *internal = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tfor (i = 0; i < internal->nb_queues; i++)\n> > > > > > +\t\tinternal->rx_queue[i].rx_pkts = 0;\n> > > > > > +\n> > > > > > +\tfor (i = 0; i < internal->nb_queues; i++) {\n> > > > > > +\t\tinternal->tx_queue[i].tx_pkts = 0;\n> > > > > > +\t\tinternal->tx_queue[i].err_pkts = 0;\n> > > > > > +\t}\n> > > > > > +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_dev_close(struct rte_eth_dev *dev __rte_unused)\n> > > > > > +{\n> > > > > 
> +}\n> > > > > > +\n> > > > > > +static void\n> > > > > > +eth_queue_release(void *q __rte_unused)\n> > > > > > +{\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_link_update(struct rte_eth_dev *dev __rte_unused,\n> > > > > > + int wait_to_complete __rte_unused)\n> > > > > > +{\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_rx_queue_setup(struct rte_eth_dev *dev,\n> > > > > > + uint16_t rx_queue_id,\n> > > > > > + uint16_t nb_rx_desc __rte_unused,\n> > > > > > + unsigned int socket_id __rte_unused,\n> > > > > > + const struct rte_eth_rxconf *rx_conf __rte_unused,\n> > > > > > + struct rte_mempool *mb_pool)\n> > > > > > +{\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\tstruct pkt_rx_queue *pkt_q = &internals->rx_queue[rx_queue_id];\n> > > > > > +\tstruct rte_pktmbuf_pool_private *mbp_priv;\n> > > > > > +\tuint16_t buf_size;\n> > > > > > +\n> > > > > > +\tpkt_q->mb_pool = mb_pool;\n> > > > > > +\n> > > > > > +\t/* Now get the space available for data in the mbuf */\n> > > > > > +\tmbp_priv = rte_mempool_get_priv(pkt_q->mb_pool);\n> > > > > > +\tbuf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -\n> > > > > > +\t RTE_PKTMBUF_HEADROOM);\n> > > > > > +\n> > > > > > +\tif (ETH_FRAME_LEN > buf_size) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: %d bytes will not fit in mbuf (%d bytes)\\n\",\n> > > > > > +\t\t\tdev->data->name, ETH_FRAME_LEN, buf_size);\n> > > > > > +\t\treturn -ENOMEM;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tdev->data->rx_queues[rx_queue_id] = pkt_q;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +eth_tx_queue_setup(struct rte_eth_dev *dev,\n> > > > > > + uint16_t tx_queue_id,\n> > > > > > + uint16_t nb_tx_desc __rte_unused,\n> > > > > > + unsigned int socket_id __rte_unused,\n> > > > > > + const struct rte_eth_txconf *tx_conf __rte_unused)\n> 
> > > > > +{\n> > > > > > +\n> > > > > > +\tstruct pmd_internals *internals = dev->data->dev_private;\n> > > > > > +\n> > > > > > +\tdev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static struct eth_dev_ops ops = {\n> > > > > > +\t.dev_start = eth_dev_start,\n> > > > > > +\t.dev_stop = eth_dev_stop,\n> > > > > > +\t.dev_close = eth_dev_close,\n> > > > > > +\t.dev_configure = eth_dev_configure,\n> > > > > > +\t.dev_infos_get = eth_dev_info,\n> > > > > > +\t.rx_queue_setup = eth_rx_queue_setup,\n> > > > > > +\t.tx_queue_setup = eth_tx_queue_setup,\n> > > > > > +\t.rx_queue_release = eth_queue_release,\n> > > > > > +\t.tx_queue_release = eth_queue_release,\n> > > > > > +\t.link_update = eth_link_update,\n> > > > > > +\t.stats_get = eth_stats_get,\n> > > > > > +\t.stats_reset = eth_stats_reset,\n> > > > > > +};\n> > > > > > +\n> > > > > > +/*\n> > > > > > + * Opens an AF_PACKET socket\n> > > > > > + */\n> > > > > > +static int\n> > > > > > +open_packet_iface(const char *key __rte_unused,\n> > > > > > + const char *value __rte_unused,\n> > > > > > + void *extra_args)\n> > > > > > +{\n> > > > > > +\tint *sockfd = extra_args;\n> > > > > > +\n> > > > > > +\t/* Open an AF_PACKET socket... 
*/\n> > > > > > +\t*sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > > > +\tif (*sockfd == -1) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD, \"Could not open AF_PACKET socket\\n\");\n> > > > > > +\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +rte_pmd_init_internals(const char *name,\n> > > > > > + const int sockfd,\n> > > > > > + const unsigned nb_queues,\n> > > > > > + unsigned int blocksize,\n> > > > > > + unsigned int blockcnt,\n> > > > > > + unsigned int framesize,\n> > > > > > + unsigned int framecnt,\n> > > > > > + const unsigned numa_node,\n> > > > > > + struct pmd_internals **internals,\n> > > > > > + struct rte_eth_dev **eth_dev,\n> > > > > > + struct rte_kvargs *kvlist)\n> > > > > > +{\n> > > > > > +\tstruct rte_eth_dev_data *data = NULL;\n> > > > > > +\tstruct rte_pci_device *pci_dev = NULL;\n> > > > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > > > +\tstruct ifreq ifr;\n> > > > > > +\tsize_t ifnamelen;\n> > > > > > +\tunsigned k_idx;\n> > > > > > +\tstruct sockaddr_ll sockaddr;\n> > > > > > +\tstruct tpacket_req *req;\n> > > > > > +\tstruct pkt_rx_queue *rx_queue;\n> > > > > > +\tstruct pkt_tx_queue *tx_queue;\n> > > > > > +\tint rc, tpver, discard, bypass;\n> > > > > > +\tunsigned int i, q, rdsize;\n> > > > > > +\tint qsockfd, fanout_arg;\n> > > > > > +\n> > > > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_IFACE_ARG) != NULL)\n> > > > > > +\t\t\tbreak;\n> > > > > > +\t}\n> > > > > > +\tif (pair == NULL) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: no interface specified for AF_PACKET ethdev\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tRTE_LOG(INFO, PMD,\n> > > > > > +\t\t\"%s: creating AF_PACKET-backed ethdev on numa socket %u\\n\",\n> > > > > > 
+\t\tname, numa_node);\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * now do all data allocation - for eth_dev structure, dummy pci driver\n> > > > > > +\t * and internal (private) data\n> > > > > > +\t */\n> > > > > > +\tdata = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);\n> > > > > > +\tif (data == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\tpci_dev = rte_zmalloc_socket(name, sizeof(*pci_dev), 0, numa_node);\n> > > > > > +\tif (pci_dev == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\t*internals = rte_zmalloc_socket(name, sizeof(**internals),\n> > > > > > +\t 0, numa_node);\n> > > > > > +\tif (*internals == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\treq = &((*internals)->req);\n> > > > > > +\n> > > > > > +\treq->tp_block_size = blocksize;\n> > > > > > +\treq->tp_block_nr = blockcnt;\n> > > > > > +\treq->tp_frame_size = framesize;\n> > > > > > +\treq->tp_frame_nr = framecnt;\n> > > > > > +\n> > > > > > +\tifnamelen = strlen(pair->value);\n> > > > > > +\tif (ifnamelen < sizeof(ifr.ifr_name)) {\n> > > > > > +\t\tmemcpy(ifr.ifr_name, pair->value, ifnamelen);\n> > > > > > +\t\tifr.ifr_name[ifnamelen] = '\\0';\n> > > > > > +\t} else {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: I/F name too long (%s)\\n\",\n> > > > > > +\t\t\tname, pair->value);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\tif (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: ioctl failed (SIOCGIFINDEX)\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\t(*internals)->if_index = ifr.ifr_ifindex;\n> > > > > > +\n> > > > > > +\tif (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: ioctl failed (SIOCGIFHWADDR)\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\tgoto error;\n> > > > > > +\t}\n> > > > > > +\tmemcpy(&(*internals)->eth_addr, 
ifr.ifr_hwaddr.sa_data, ETH_ALEN);\n> > > > > > +\n> > > > > > +\tmemset(&sockaddr, 0, sizeof(sockaddr));\n> > > > > > +\tsockaddr.sll_family = AF_PACKET;\n> > > > > > +\tsockaddr.sll_protocol = htons(ETH_P_ALL);\n> > > > > > +\tsockaddr.sll_ifindex = (*internals)->if_index;\n> > > > > > +\n> > > > > > +\tfanout_arg = (getpid() ^ (*internals)->if_index) & 0xffff;\n> > > > > > +\tfanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG |\n> > > > > > +\t PACKET_FANOUT_FLAG_ROLLOVER) << 16;\n> > > > > > +\n> > > > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > > > +\t\t/* Open an AF_PACKET socket for this queue... */\n> > > > > > +\t\tqsockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));\n> > > > > > +\t\tif (qsockfd == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t \"%s: could not open AF_PACKET socket\\n\",\n> > > > > > +\t\t\t name);\n> > > > > > +\t\t\treturn -1;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\ttpver = TPACKET_V2;\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_VERSION,\n> > > > > > +\t\t\t\t&tpver, sizeof(tpver));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_VERSION on AF_PACKET \"\n> > > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\tdiscard = 1;\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_LOSS,\n> > > > > > +\t\t\t\t&discard, sizeof(discard));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_LOSS on \"\n> > > > > > +\t\t\t \"AF_PACKET socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\tbypass = 1;\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_QDISC_BYPASS,\n> > > > > > +\t\t\t\t&bypass, sizeof(bypass));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > 
+\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_QDISC_BYPASS \"\n> > > > > > +\t\t\t \"on AF_PACKET socket for %s\\n\", name,\n> > > > > > +\t\t\t pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_RX_RING on AF_PACKET \"\n> > > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_TX_RING on AF_PACKET \"\n> > > > > > +\t\t\t\t\"socket for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trx_queue = &((*internals)->rx_queue[q]);\n> > > > > > +\t\trx_queue->framecount = req->tp_frame_nr;\n> > > > > > +\n> > > > > > +\t\trx_queue->map = mmap(NULL, 2 * req->tp_block_size * req->tp_block_nr,\n> > > > > > +\t\t\t\t PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED,\n> > > > > > +\t\t\t\t qsockfd, 0);\n> > > > > > +\t\tif (rx_queue->map == MAP_FAILED) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: call to mmap failed on AF_PACKET socket for %s\\n\",\n> > > > > > +\t\t\t\tname, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\t/* rdsize is same for both Tx and Rx */\n> > > > > > +\t\trdsize = req->tp_frame_nr * sizeof(*(rx_queue->rd));\n> > > > > > +\n> > > > > > +\t\trx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > > > +\t\t\trx_queue->rd[i].iov_base = rx_queue->map + (i * framesize);\n> > > > > > 
+\t\t\trx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > > > +\t\t}\n> > > > > > +\t\trx_queue->sockfd = qsockfd;\n> > > > > > +\n> > > > > > +\t\ttx_queue = &((*internals)->tx_queue[q]);\n> > > > > > +\t\ttx_queue->framecount = req->tp_frame_nr;\n> > > > > > +\n> > > > > > +\t\ttx_queue->map = rx_queue->map + req->tp_block_size * req->tp_block_nr;\n> > > > > > +\n> > > > > > +\t\ttx_queue->rd = rte_zmalloc_socket(name, rdsize, 0, numa_node);\n> > > > > > +\t\tfor (i = 0; i < req->tp_frame_nr; ++i) {\n> > > > > > +\t\t\ttx_queue->rd[i].iov_base = tx_queue->map + (i * framesize);\n> > > > > > +\t\t\ttx_queue->rd[i].iov_len = req->tp_frame_size;\n> > > > > > +\t\t}\n> > > > > > +\t\ttx_queue->sockfd = qsockfd;\n> > > > > > +\n> > > > > > +\t\trc = bind(qsockfd, (const struct sockaddr*)&sockaddr, sizeof(sockaddr));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not bind AF_PACKET socket to %s\\n\",\n> > > > > > +\t\t\t name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\n> > > > > > +\t\trc = setsockopt(qsockfd, SOL_PACKET, PACKET_FANOUT,\n> > > > > > +\t\t\t\t&fanout_arg, sizeof(fanout_arg));\n> > > > > > +\t\tif (rc == -1) {\n> > > > > > +\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\"%s: could not set PACKET_FANOUT on AF_PACKET socket \"\n> > > > > > +\t\t\t\t\"for %s\\n\", name, pair->value);\n> > > > > > +\t\t\tgoto error;\n> > > > > > +\t\t}\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\t/* reserve an ethdev entry */\n> > > > > > +\t*eth_dev = rte_eth_dev_allocate(name);\n> > > > > > +\tif (*eth_dev == NULL)\n> > > > > > +\t\tgoto error;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * now put it all together\n> > > > > > +\t * - store queue data in internals,\n> > > > > > +\t * - store numa_node info in pci_driver\n> > > > > > +\t * - point eth_dev_data to internals and pci_driver\n> > > > > > +\t * - and point eth_dev structure to new eth_dev_data 
structure\n> > > > > > +\t */\n> > > > > > +\n> > > > > > +\t(*internals)->nb_queues = nb_queues;\n> > > > > > +\n> > > > > > +\tdata->dev_private = *internals;\n> > > > > > +\tdata->port_id = (*eth_dev)->data->port_id;\n> > > > > > +\tdata->nb_rx_queues = (uint16_t)nb_queues;\n> > > > > > +\tdata->nb_tx_queues = (uint16_t)nb_queues;\n> > > > > > +\tdata->dev_link = pmd_link;\n> > > > > > +\tdata->mac_addrs = &(*internals)->eth_addr;\n> > > > > > +\n> > > > > > +\tpci_dev->numa_node = numa_node;\n> > > > > > +\n> > > > > > +\t(*eth_dev)->data = data;\n> > > > > > +\t(*eth_dev)->dev_ops = &ops;\n> > > > > > +\t(*eth_dev)->pci_dev = pci_dev;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +\n> > > > > > +error:\n> > > > > > +\tif (data)\n> > > > > > +\t\trte_free(data);\n> > > > > > +\tif (pci_dev)\n> > > > > > +\t\trte_free(pci_dev);\n> > > > > > +\tfor (q = 0; q < nb_queues; q++) {\n> > > > > > +\t\tif ((*internals)->rx_queue[q].rd)\n> > > > > > +\t\t\trte_free((*internals)->rx_queue[q].rd);\n> > > > > > +\t\tif ((*internals)->tx_queue[q].rd)\n> > > > > > +\t\t\trte_free((*internals)->tx_queue[q].rd);\n> > > > > > +\t}\n> > > > > > +\tif (*internals)\n> > > > > > +\t\trte_free(*internals);\n> > > > > > +\treturn -1;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static int\n> > > > > > +rte_eth_from_packet(const char *name,\n> > > > > > + int const *sockfd,\n> > > > > > + const unsigned numa_node,\n> > > > > > + struct rte_kvargs *kvlist)\n> > > > > > +{\n> > > > > > +\tstruct pmd_internals *internals = NULL;\n> > > > > > +\tstruct rte_eth_dev *eth_dev = NULL;\n> > > > > > +\tstruct rte_kvargs_pair *pair = NULL;\n> > > > > > +\tunsigned k_idx;\n> > > > > > +\tunsigned int blockcount;\n> > > > > > +\tunsigned int blocksize = DFLT_BLOCK_SIZE;\n> > > > > > +\tunsigned int framesize = DFLT_FRAME_SIZE;\n> > > > > > +\tunsigned int framecount = DFLT_FRAME_COUNT;\n> > > > > > +\tunsigned int qpairs = 1;\n> > > > > > +\n> > > > > > +\t/* do some parameter checking 
*/\n> > > > > > +\tif (*sockfd < 0)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * Walk arguments for configurable settings\n> > > > > > +\t */\n> > > > > > +\tfor (k_idx = 0; k_idx < kvlist->count; k_idx++) {\n> > > > > > +\t\tpair = &kvlist->pairs[k_idx];\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_NUM_Q_ARG) != NULL) {\n> > > > > > +\t\t\tqpairs = atoi(pair->value);\n> > > > > > +\t\t\tif (qpairs < 1 ||\n> > > > > > +\t\t\t qpairs > RTE_PMD_PACKET_MAX_RINGS) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid qpairs value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_BLOCKSIZE_ARG) != NULL) {\n> > > > > > +\t\t\tblocksize = atoi(pair->value);\n> > > > > > +\t\t\tif (!blocksize) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid blocksize value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMESIZE_ARG) != NULL) {\n> > > > > > +\t\t\tframesize = atoi(pair->value);\n> > > > > > +\t\t\tif (!framesize) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid framesize value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t\tif (strstr(pair->key, ETH_PACKET_FRAMECOUNT_ARG) != NULL) {\n> > > > > > +\t\t\tframecount = atoi(pair->value);\n> > > > > > +\t\t\tif (!framecount) {\n> > > > > > +\t\t\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\t\t\"%s: invalid framecount value\\n\",\n> > > > > > +\t\t\t\t name);\n> > > > > > +\t\t\t\treturn -1;\n> > > > > > +\t\t\t}\n> > > > > > +\t\t\tcontinue;\n> > > > > > +\t\t}\n> > > > > > +\t}\n> > > > > > +\n> > > > > > 
+\tif (framesize > blocksize) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: AF_PACKET MMAP frame size exceeds block size!\\n\",\n> > > > > > +\t\t name);\n> > > > > > +\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tblockcount = framecount / (blocksize / framesize);\n> > > > > > +\tif (!blockcount) {\n> > > > > > +\t\tRTE_LOG(ERR, PMD,\n> > > > > > +\t\t\t\"%s: invalid AF_PACKET MMAP parameters\\n\", name);\n> > > > > > +\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s: AF_PACKET MMAP parameters:\\n\", name);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock size %d\\n\", name, blocksize);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tblock count %d\\n\", name, blockcount);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe size %d\\n\", name, framesize);\n> > > > > > +\tRTE_LOG(INFO, PMD, \"%s:\\tframe count %d\\n\", name, framecount);\n> > > > > > +\n> > > > > > +\tif (rte_pmd_init_internals(name, *sockfd, qpairs,\n> > > > > > +\t blocksize, blockcount,\n> > > > > > +\t framesize, framecount,\n> > > > > > +\t numa_node, &internals, &eth_dev,\n> > > > > > +\t kvlist) < 0)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\teth_dev->rx_pkt_burst = eth_packet_rx;\n> > > > > > +\teth_dev->tx_pkt_burst = eth_packet_tx;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +int\n> > > > > > +rte_pmd_packet_devinit(const char *name, const char *params)\n> > > > > > +{\n> > > > > > +\tunsigned numa_node;\n> > > > > > +\tint ret;\n> > > > > > +\tstruct rte_kvargs *kvlist;\n> > > > > > +\tint sockfd = -1;\n> > > > > > +\n> > > > > > +\tRTE_LOG(INFO, PMD, \"Initializing pmd_packet for %s\\n\", name);\n> > > > > > +\n> > > > > > +\tnuma_node = rte_socket_id();\n> > > > > > +\n> > > > > > +\tkvlist = rte_kvargs_parse(params, valid_arguments);\n> > > > > > +\tif (kvlist == NULL)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\t/*\n> > > > > > +\t * If iface 
argument is passed we open the NICs and use them for\n> > > > > > +\t * reading / writing\n> > > > > > +\t */\n> > > > > > +\tif (rte_kvargs_count(kvlist, ETH_PACKET_IFACE_ARG) == 1) {\n> > > > > > +\n> > > > > > +\t\tret = rte_kvargs_process(kvlist, ETH_PACKET_IFACE_ARG,\n> > > > > > +\t\t &open_packet_iface, &sockfd);\n> > > > > > +\t\tif (ret < 0)\n> > > > > > +\t\t\treturn -1;\n> > > > > > +\t}\n> > > > > > +\n> > > > > > +\tret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);\n> > > > > > +\tclose(sockfd); /* no longer needed */\n> > > > > > +\n> > > > > > +\tif (ret < 0)\n> > > > > > +\t\treturn -1;\n> > > > > > +\n> > > > > > +\treturn 0;\n> > > > > > +}\n> > > > > > +\n> > > > > > +static struct rte_driver pmd_packet_drv = {\n> > > > > > +\t.name = \"eth_packet\",\n> > > > > > +\t.type = PMD_VDEV,\n> > > > > > +\t.init = rte_pmd_packet_devinit,\n> > > > > > +};\n> > > > > > +\n> > > > > > +PMD_REGISTER_DRIVER(pmd_packet_drv);\n> > > > > > diff --git a/lib/librte_pmd_packet/rte_eth_packet.h b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > > new file mode 100644\n> > > > > > index 000000000000..f685611da3e9\n> > > > > > --- /dev/null\n> > > > > > +++ b/lib/librte_pmd_packet/rte_eth_packet.h\n> > > > > > @@ -0,0 +1,55 @@\n> > > > > > +/*-\n> > > > > > + * BSD LICENSE\n> > > > > > + *\n> > > > > > + * Copyright(c) 2010-2014 Intel Corporation. 
All rights reserved.\n> > > > > > + * All rights reserved.\n> > > > > > + *\n> > > > > > + * Redistribution and use in source and binary forms, with or without\n> > > > > > + * modification, are permitted provided that the following conditions\n> > > > > > + * are met:\n> > > > > > + *\n> > > > > > + * * Redistributions of source code must retain the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer.\n> > > > > > + * * Redistributions in binary form must reproduce the above copyright\n> > > > > > + * notice, this list of conditions and the following disclaimer in\n> > > > > > + * the documentation and/or other materials provided with the\n> > > > > > + * distribution.\n> > > > > > + * * Neither the name of Intel Corporation nor the names of its\n> > > > > > + * contributors may be used to endorse or promote products derived\n> > > > > > + * from this software without specific prior written permission.\n> > > > > > + *\n> > > > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n> > > > > > + * \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n> > > > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n> > > > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n> > > > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n> > > > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n> > > > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n> > > > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n> > > > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n> > > > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n> > > > > > + */\n> > > > > > +\n> > > > > > +#ifndef _RTE_ETH_PACKET_H_\n> > > > > > +#define _RTE_ETH_PACKET_H_\n> > > > > > +\n> > > > > > +#ifdef __cplusplus\n> > > > > > +extern \"C\" {\n> > > > > > +#endif\n> > > > > > +\n> > > > > > +#define RTE_ETH_PACKET_PARAM_NAME \"eth_packet\"\n> > > > > > +\n> > > > > > +#define RTE_PMD_PACKET_MAX_RINGS 16\n> > > > > > +\n> > > > > > +/**\n> > > > > > + * For use by the EAL only. 
Called as part of EAL init to set up any dummy NICs\n> > > > > > + * configured on command line.\n> > > > > > + */\n> > > > > > +int rte_pmd_packet_devinit(const char *name, const char *params);\n> > > > > > +\n> > > > > > +#ifdef __cplusplus\n> > > > > > +}\n> > > > > > +#endif\n> > > > > > +\n> > > > > > +#endif\n> > > > > > diff --git a/mk/rte.app.mk b/mk/rte.app.mk\n> > > > > > index 34dff2a02a05..a6994c4dbe93 100644\n> > > > > > --- a/mk/rte.app.mk\n> > > > > > +++ b/mk/rte.app.mk\n> > > > > > @@ -210,6 +210,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y)\n> > > > > > LDLIBS += -lrte_pmd_pcap -lpcap\n> > > > > > endif\n> > > > > >\n> > > > > > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PACKET),y)\n> > > > > > +LDLIBS += -lrte_pmd_packet\n> > > > > > +endif\n> > > > > > +\n> > > > > > endif # plugins\n> > > > > >\n> > > > > > LDLIBS += $(EXECENV_LDLIBS)\n> > > > > > --\n> > > > > > 1.9.3\n> > > > > >\n> > > > > >\n> > > > >\n> > > > > --\n> > > > > John W. Linville\t\tSomeday the world will need a hero, and you\n> > > > > linville@tuxdriver.com\t\t\tmight be all we have. Be ready.\n> > > >\n> > >\n> > > --\n> > > John W. Linville\t\tSomeday the world will need a hero, and you\n> > > linville@tuxdriver.com\t\t\tmight be all we have. 
Be ready.\n> >", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id A2BDF58E8;\n\tMon, 15 Sep 2014 17:37:59 +0200 (CEST)", "from mga01.intel.com (mga01.intel.com [192.55.52.88])\n\tby dpdk.org (Postfix) with ESMTP id 803563976\n\tfor <dev@dpdk.org>; Mon, 15 Sep 2014 17:37:56 +0200 (CEST)", "from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby fmsmga101.fm.intel.com with ESMTP; 15 Sep 2014 08:43:25 -0700", "from fmsmsx104.amr.corp.intel.com ([10.18.124.202])\n\tby fmsmga002.fm.intel.com with ESMTP; 15 Sep 2014 08:43:11 -0700", "from shsmsx151.ccr.corp.intel.com (10.239.6.50) by\n\tfmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Mon, 15 Sep 2014 08:43:09 -0700", "from shsmsx104.ccr.corp.intel.com ([169.254.5.230]) by\n\tSHSMSX151.ccr.corp.intel.com ([169.254.3.172]) with mapi id\n\t14.03.0195.001; Mon, 15 Sep 2014 23:43:07 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.04,529,1406617200\"; d=\"scan'208\";a=\"599815396\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6Jv9ov6AgACKEgD//4OfgIAAmgdggAPeNgCAAIm9EA==", "Date": "Mon, 15 Sep 2014 15:43:07 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com>", "References": 
"<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>\n\t<20140915150946.GA11690@hmsreliant.think-freely.org>", "In-Reply-To": "<20140915150946.GA11690@hmsreliant.think-freely.org>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 786, "web_url": "http://patches.dpdk.org/comment/786/", "msgid": "<20140915162244.GB11690@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20140915162244.GB11690@hmsreliant.think-freely.org", "date": "2014-09-15T16:22:44", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil 
Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Mon, Sep 15, 2014 at 03:43:07PM +0000, Zhou, Danny wrote:\n> \n> > -----Original Message-----\n> > From: Neil Horman [mailto:nhorman@tuxdriver.com]\n> > Sent: Monday, September 15, 2014 11:10 PM\n> > To: Zhou, Danny\n> > Cc: John W. Linville; dev@dpdk.org\n> > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > \n> > On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:\n> > > > -----Original Message-----\n> > > > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > > > Sent: Saturday, September 13, 2014 2:54 AM\n> > > > To: Zhou, Danny\n> > > > Cc: dev@dpdk.org\n> > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > >\n> > > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > > > > I am concerned about its performance caused by too many\n> > > > > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > > > > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > > > > which are mapped to user space, and then those packets to be copied\n> > > > > to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a\n> > > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > > > > copies which brings significant negative performance impact. We\n> > > > > had a bifurcated driver prototype that can do zero-copy and achieve\n> > > > > native DPDK performance, but it depends on base driver and AF_PACKET\n> > > > > code changes in kernel, John R will be presenting it in coming Linux\n> > > > > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > > > > submitted to dpdk.org.\n> > > >\n> > > > Admittedly, this is not as good a performer as most of the existing\n> > > > PMDs. It serves a different purpose, afterall. 
FWIW, you did\n> > > > previously indicate that it performed better than the pcap-based PMD.\n> > >\n> > > Yes, slightly higher but makes no big difference.\n> > >\n> > Do you have numbers for this? It seems to me faster is faster as long as its\n> > statistically significant. Even if its not, johns AF_PACKET pmd has the ability\n> > to scale to multple cpus more easily than the pcap pmd, as it can make use of\n> > the AF_PACKET fanout feature.\n> \n> For 64B small packet, 1.35M pps with 1 queue.\nWhy did you only test with a single queue? Multiqueue operation was one of the\nbig advantages of the AF_PACKET based pmd. I would expect a single queue setup\nto perform in a very similar fashion to the pcap PMD.\n\n As both pcap and AF_PACKET PMDs depend on interrupt \n> based NIC kernel drivers, all the DPDK performance optimization techniques are not utilized. Why should DPDK adopt \n> two similar and poor performant PMDs which cannot demonstrate DPDK' key value \"high performance\"?\nSeveral reasons:\n* \"High performance\" isn't always the key need for end users. Consider the\npre-hardware availability development phase.\n\n* Better hardware modeling (consider AF_PACKET's multiqueue ability)\n\n* Better scaling (pcap doesn't make use of the fanout features that AF_PACKET\ndoes)\n\n* Space savings: building the AF_PACKET pmd doesn't require the additional\nbuilding/storage of the pcap driver.\n\n\n> \n> > \n> > > > I look forward to seeing the changes you mention -- they sound very\n> > > > exciting. But, they will still require both networking core and\n> > > > driver changes in the kernel. And as I understand things today,\n> > > > the userland code will still need at least some knowledge of specific\n> > > > devices and how they layout their packet descriptors, etc. 
So while\n> > > > those changes sound very promising, they will still have certain\n> > > > drawbacks in common with the current situation.\n> > >\n> > > Yes, we would like the DPDK performance optimization techniques such as huge page, efficient rx/tx routines to manipulate\n> > device-specific\n> > > packet descriptors, polling-model can be still used. We have to tradeoff between performance and commonality. But we believe it will\n> > be much easier\n> > > to develop DPDK PMD for non-Intel NICs than porting entire kernel drivers to DPDK.\n> > >\n> > \n> > Not sure how this relates, what you're describing is the feature intel has been\n> > working on to augment kernel drivers to provide better throughput via direct\n> > hardware access to user space. Johns PMD provides ubiquitous function on all\n> > hardware. I'm not sure how the desire for one implies the other isn't valuable?\n> > \n> \n> Performance is the key value of DPDK, instead of commonality. But we are trying to improve commonality of our solution to make it easily \n> adopted by other NIC vendors.\n> \nThat's completely irrelevant to the question at hand. To go with your reasoning,\nif performance is the key value of the DPDK, then you should remove all driver\nsupport save for the most performant hardware you have. By that same token,\nyou should deprecate the pcap driver in favor of this AF_PACKET driver, because\nit has shown performance improvement.\n\nI'm being facetious, of course, but the facts remain: lack of superior\nperformance from one PMD to the next does not immediately obviate the need for\none PMD over another, as they quite likely address differing needs. As you note,\nthe DPDK seeks performance as a key goal, but it's an open source project, and there\nare other needs from other users in play here. 
The AF_PACKET pmd provides\nsuperior performance on linux platforms when hardware independence is required.\nIt differs from the pcap PMD as it uses features that are only available on the\nLinux platform, so it stands to reason we should have both.\n\n> > > > It seems like the changes you mention will still need some sort of\n> > > > AF_PACKET-based PMD driver. Have you implemented that completely\n> > > > separate from the code I already posted? Or did you add that work\n> > > > on top of mine?\n> > > >\n> > >\n> > > For userland code, it certainly use some of your code related to raw rocket, but highly modified. A layer will be added into eth_dev\n> > library to do device\n> > > probe and support new socket options.\n> > >\n> > \n> > Ok, but again, PMD's are independent, and serve different needs. If they're use\n> > is at all overlapping from a functional standpoint, take this one now, and\n> > deprecate it when a better one comes along. Though from your description it\n> > seems like both have a valid place in the ecosystem.\n> > \n> \n> I am ok with this approach, as long as this AF_PACKET PMD does not add extra maintain efforts. Thomas might make the call.\n> \nWhat extra maintainer efforts do you think are required here, that wouldn't be\nrequired for any PMD? To suggest that a given PMD shouldn't be included because\nit would require additional effort to maintain holds it to a higher standard\nthan the PMD's already included. 
I don't recall anyone asking if the i40e or\nbonding pmds would require additional effort before being integrated.\n\nNeil", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 2394558E8;\n\tMon, 15 Sep 2014 18:17:22 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 445633976\n\tfor <dev@dpdk.org>; Mon, 15 Sep 2014 18:17:20 +0200 (CEST)", "from hmsreliant.think-freely.org\n\t([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XTZ37-00086v-Kt; Mon, 15 Sep 2014 12:22:51 -0400" ], "Date": "Mon, 15 Sep 2014 12:22:44 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "\"Zhou, Danny\" <danny.zhou@intel.com>", "Message-ID": "<20140915162244.GB11690@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>\n\t<20140915150946.GA11690@hmsreliant.think-freely.org>\n\t<DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", 
"X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 787, "web_url": "http://patches.dpdk.org/comment/787/", "msgid": "<20140915174822.GG28459@tuxdriver.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20140915174822.GG28459@tuxdriver.com", "date": "2014-09-15T17:48:22", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": "http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. Linville", "email": "linville@tuxdriver.com" }, "content": "On Mon, Sep 15, 2014 at 12:22:44PM -0400, Neil Horman wrote:\n> On Mon, Sep 15, 2014 at 03:43:07PM +0000, Zhou, Danny wrote:\n> > \n> > > -----Original Message-----\n> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]\n> > > Sent: Monday, September 15, 2014 11:10 PM\n> > > To: Zhou, Danny\n> > > Cc: John W. Linville; dev@dpdk.org\n> > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > \n> > > On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:\n> > > > > -----Original Message-----\n> > > > > From: John W. 
Linville [mailto:linville@tuxdriver.com]\n> > > > > Sent: Saturday, September 13, 2014 2:54 AM\n> > > > > To: Zhou, Danny\n> > > > > Cc: dev@dpdk.org\n> > > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > > >\n> > > > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > > > > > I am concerned about its performance caused by too many\n> > > > > > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > > > > > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > > > > > which are mapped to user space, and then those packets to be copied\n> > > > > > to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a\n> > > > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > > > > > copies which brings significant negative performance impact. We\n> > > > > > had a bifurcated driver prototype that can do zero-copy and achieve\n> > > > > > native DPDK performance, but it depends on base driver and AF_PACKET\n> > > > > > code changes in kernel, John R will be presenting it in coming Linux\n> > > > > > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > > > > > submitted to dpdk.org.\n> > > > >\n> > > > > Admittedly, this is not as good a performer as most of the existing\n> > > > > PMDs. It serves a different purpose, afterall. FWIW, you did\n> > > > > previously indicate that it performed better than the pcap-based PMD.\n> > > >\n> > > > Yes, slightly higher but makes no big difference.\n> > > >\n> > > Do you have numbers for this? It seems to me faster is faster as long as its\n> > > statistically significant. Even if its not, johns AF_PACKET pmd has the ability\n> > > to scale to multple cpus more easily than the pcap pmd, as it can make use of\n> > > the AF_PACKET fanout feature.\n> > \n> > For 64B small packet, 1.35M pps with 1 queue.\n> Why did you only test with a single queue? 
Multiqueue operation was one of the\n> big advantages of the AF_PACKET based pmd. I would expect a single queue setup\n> to perform in a very simmilar fashion to the pcap PMD\n> \n> As both pcap and AF_PACKET PMDs depend on interrupt \n> > based NIC kernel drivers, all the DPDK performance optimization techniques are not utilized. Why should DPDK adopt \n> > two similar and poor performant PMDs which cannot demonstrate DPDK' key value \"high performance\"?\n> Several reasons:\n> * \"High performance\" isn't always the key need for end users. Consider\n> pre-hardware availablity development phase.\n> \n> * Better hardware modeling (consider AF_PACKETS multiqueue abiltiy)\n> \n> * Better scaling (pcap doesn't make use of the fanout features that AF_PACKET\n> does)\n> \n> * Space savings, Building the AF_PACKET pmd doesn't require the additional\n> building/storage of the pcap driver.\n\nThis would include not requiring a dependency on libpcap, if nothing else.\n \n> > \n> > > \n> > > > > I look forward to seeing the changes you mention -- they sound very\n> > > > > exciting. But, they will still require both networking core and\n> > > > > driver changes in the kernel. And as I understand things today,\n> > > > > the userland code will still need at least some knowledge of specific\n> > > > > devices and how they layout their packet descriptors, etc. So while\n> > > > > those changes sound very promising, they will still have certain\n> > > > > drawbacks in common with the current situation.\n> > > >\n> > > > Yes, we would like the DPDK performance optimization techniques such as huge page, efficient rx/tx routines to manipulate\n> > > device-specific\n> > > > packet descriptors, polling-model can be still used. We have to tradeoff between performance and commonality. 
But we believe it will\n> > > be much easier\n> > > > to develop DPDK PMD for non-Intel NICs than porting entire kernel drivers to DPDK.\n> > > >\n> > > \n> > > Not sure how this relates, what you're describing is the feature intel has been\n> > > working on to augment kernel drivers to provide better throughput via direct\n> > > hardware access to user space. Johns PMD provides ubiquitous function on all\n> > > hardware. I'm not sure how the desire for one implies the other isn't valuable?\n> > > \n> > \n> > Performance is the key value of DPDK, instead of commonality. But we are trying to improve commonality of our solution to make it easily \n> > adopted by other NIC vendors.\n> > \n> Thats completely irrelevant to the question at hand. To go with your reasoning,\n> if performance is the key value of the DPDK, then you should remove all driver\n> support save for the most performant hardware you have. By that same token,\n> you should deprecate the pcap driver in favor of this AF_PACKET driver, because\n> it has shown performance improvement.\n> \n> I'm being facetious, of course, but the facts remain: Lack of superior\n> performance from one PMD to the next does not immediately obviate the need for\n> one PMD over another, as they quite likely address differing needs. As you note\n> the DPDK seeks performance as a key goal, but its an open source project, there\n> are other needs from other users in play here. The AF_PACKET pmd provides\n> superior performance on linux platforms when hardware independence is required.\n> It differs from the pcap PMD as it uses features that are only available on the\n> Linux platform, so it stands to reason we should have both.\n\nIMHO, the biggest deficiency in DPDK is the lack of apps. Let's face\nit, no one really cares about running l2fwd except for testing the\ndrivers. What people want is applications. Providing a PMD to use\nwhile developing an app without requiring specific hardware seems like\na win to me. 
The pcap PMD addresses some of that, but it is more of\na stop-gap or special purpose thing (like for playing back captures).\n\n> > > > > It seems like the changes you mention will still need some sort of\n> > > > > AF_PACKET-based PMD driver. Have you implemented that completely\n> > > > > separate from the code I already posted? Or did you add that work\n> > > > > on top of mine?\n> > > > >\n> > > >\n> > > > For userland code, it certainly use some of your code related to raw rocket, but highly modified. A layer will be added into eth_dev\n> > > library to do device\n> > > > probe and support new socket options.\n> > > >\n> > > \n> > > Ok, but again, PMD's are independent, and serve different needs. If they're use\n> > > is at all overlapping from a functional standpoint, take this one now, and\n> > > deprecate it when a better one comes along. Though from your description it\n> > > seems like both have a valid place in the ecosystem.\n> > > \n> > \n> > I am ok with this approach, as long as this AF_PACKET PMD does not add extra maintain efforts. Thomas might make the call.\n> > \n> What extra maintainer efforts do you think are required here, that wouldn't be\n> required for any PMD? To suggest that a given PMD shouldn't be included because\n> it would require additional effort to maintain holds it to a higher standard\n> than the PMD's already included. 
I don't recall anyone asking if the i40e or\n> bonding pmds would require additional effort before being integrated.\n\nRight -- how much maintainer effort is put into the pcap driver\nthese days?\n\nJohn", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 44E6C3976;\n\tMon, 15 Sep 2014 20:24:39 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 65EB1137D\n\tfor <dev@dpdk.org>; Mon, 15 Sep 2014 20:24:37 +0200 (CEST)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>)\n\tid 1XTb2O-0000Pz-Nu; Mon, 15 Sep 2014 14:30:08 -0400", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\ts8FHmNwG013585; Mon, 15 Sep 2014 13:48:23 -0400", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\ts8FHmNWb013584; Mon, 15 Sep 2014 13:48:23 -0400" ], "Date": "Mon, 15 Sep 2014 13:48:22 -0400", "From": "\"John W. 
Linville\" <linville@tuxdriver.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Message-ID": "<20140915174822.GG28459@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>\n\t<20140915150946.GA11690@hmsreliant.think-freely.org>\n\t<DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com>\n\t<20140915162244.GB11690@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20140915162244.GB11690@hmsreliant.think-freely.org>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 788, "web_url": "http://patches.dpdk.org/comment/788/", "msgid": "<DFDF335405C17848924A094BC35766CF0A93A36A@SHSMSX104.ccr.corp.intel.com>", "list_archive_url": "https://inbox.dpdk.org/dev/DFDF335405C17848924A094BC35766CF0A93A36A@SHSMSX104.ccr.corp.intel.com", "date": "2014-09-15T19:11:37", "subject": "Re: 
[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 29, "url": "http://patches.dpdk.org/api/people/29/?format=api", "name": "Zhou, Danny", "email": "danny.zhou@intel.com" }, "content": "> -----Original Message-----\n> From: John W. Linville [mailto:linville@tuxdriver.com]\n> Sent: Tuesday, September 16, 2014 1:48 AM\n> To: Neil Horman\n> Cc: Zhou, Danny; dev@dpdk.org\n> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> \n> On Mon, Sep 15, 2014 at 12:22:44PM -0400, Neil Horman wrote:\n> > On Mon, Sep 15, 2014 at 03:43:07PM +0000, Zhou, Danny wrote:\n> > >\n> > > > -----Original Message-----\n> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]\n> > > > Sent: Monday, September 15, 2014 11:10 PM\n> > > > To: Zhou, Danny\n> > > > Cc: John W. Linville; dev@dpdk.org\n> > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > >\n> > > > On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:\n> > > > > > -----Original Message-----\n> > > > > > From: John W. Linville [mailto:linville@tuxdriver.com]\n> > > > > > Sent: Saturday, September 13, 2014 2:54 AM\n> > > > > > To: Zhou, Danny\n> > > > > > Cc: dev@dpdk.org\n> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices\n> > > > > >\n> > > > > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:\n> > > > > > > I am concerned about its performance caused by too many\n> > > > > > > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy\n> > > > > > > packets to skb, then af_packet copies packets to AF_PACKET buffer\n> > > > > > > which are mapped to user space, and then those packets to be copied\n> > > > > > > to DPDK mbuf. In addition, 3 copies needed on Tx side. 
So to run a\n> > > > > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet\n> > > > > > > copies which brings significant negative performance impact. We\n> > > > > > > had a bifurcated driver prototype that can do zero-copy and achieve\n> > > > > > > native DPDK performance, but it depends on base driver and AF_PACKET\n> > > > > > > code changes in kernel, John R will be presenting it in coming Linux\n> > > > > > > Plumbers Conference. Once kernel adopts it, the relevant PMD will be\n> > > > > > > submitted to dpdk.org.\n> > > > > >\n> > > > > > Admittedly, this is not as good a performer as most of the existing\n> > > > > > PMDs. It serves a different purpose, afterall. FWIW, you did\n> > > > > > previously indicate that it performed better than the pcap-based PMD.\n> > > > >\n> > > > > Yes, slightly higher but makes no big difference.\n> > > > >\n> > > > Do you have numbers for this? It seems to me faster is faster as long as its\n> > > > statistically significant. Even if its not, johns AF_PACKET pmd has the ability\n> > > > to scale to multple cpus more easily than the pcap pmd, as it can make use of\n> > > > the AF_PACKET fanout feature.\n> > >\n> > > For 64B small packet, 1.35M pps with 1 queue.\n> > Why did you only test with a single queue? Multiqueue operation was one of the\n> > big advantages of the AF_PACKET based pmd. I would expect a single queue setup\n> > to perform in a very simmilar fashion to the pcap PMD\n> >\n> > As both pcap and AF_PACKET PMDs depend on interrupt\n> > > based NIC kernel drivers, all the DPDK performance optimization techniques are not utilized. Why should DPDK adopt\n> > > two similar and poor performant PMDs which cannot demonstrate DPDK' key value \"high performance\"?\n> > Several reasons:\n> > * \"High performance\" isn't always the key need for end users. 
Consider\n> > pre-hardware availablity development phase.\n> >\n> > * Better hardware modeling (consider AF_PACKETS multiqueue abiltiy)\n> >\n> > * Better scaling (pcap doesn't make use of the fanout features that AF_PACKET\n> > does)\n> >\n> > * Space savings, Building the AF_PACKET pmd doesn't require the additional\n> > building/storage of the pcap driver.\n> \n> This would include not requiring a dependency on libpcap, if nothing else.\n\nlibrte_pmd_pcap and librte_pmd_packet are both DPDK wrapper libraries on top of the libpcap library and the AF_PACKET module respectively, \nso they were not designed for high performance, which is understandable. DPDK is opening up to a wider audience of data center\nconsumers who do not care about very high performance, so from that angle, it makes sense to me to adopt librte_pmd_packet.\n\n> \n> > >\n> > > >\n> > > > > > I look forward to seeing the changes you mention -- they sound very\n> > > > > > exciting. But, they will still require both networking core and\n> > > > > > driver changes in the kernel. And as I understand things today,\n> > > > > > the userland code will still need at least some knowledge of specific\n> > > > > > devices and how they layout their packet descriptors, etc. So while\n> > > > > > those changes sound very promising, they will still have certain\n> > > > > > drawbacks in common with the current situation.\n> > > > >\n> > > > > Yes, we would like the DPDK performance optimization techniques such as huge page, efficient rx/tx routines to manipulate\n> > > > device-specific\n> > > > > packet descriptors, polling-model can be still used. We have to tradeoff between performance and commonality. 
But we believe\n> it will\n> > > > be much easier\n> > > > > to develop DPDK PMD for non-Intel NICs than porting entire kernel drivers to DPDK.\n> > > > >\n> > > >\n> > > > Not sure how this relates, what you're describing is the feature intel has been\n> > > > working on to augment kernel drivers to provide better throughput via direct\n> > > > hardware access to user space. Johns PMD provides ubiquitous function on all\n> > > > hardware. I'm not sure how the desire for one implies the other isn't valuable?\n> > > >\n> > >\n> > > Performance is the key value of DPDK, instead of commonality. But we are trying to improve commonality of our solution to make\n> it easily\n> > > adopted by other NIC vendors.\n> > >\n> > Thats completely irrelevant to the question at hand. To go with your reasoning,\n> > if performance is the key value of the DPDK, then you should remove all driver\n> > support save for the most performant hardware you have. By that same token,\n> > you should deprecate the pcap driver in favor of this AF_PACKET driver, because\n> > it has shown performance improvement.\n> >\n> > I'm being facetious, of course, but the facts remain: Lack of superior\n> > performance from one PMD to the next does not immediately obviate the need for\n> > one PMD over another, as they quite likely address differing needs. As you note\n> > the DPDK seeks performance as a key goal, but its an open source project, there\n> > are other needs from other users in play here. The AF_PACKET pmd provides\n> > superior performance on linux platforms when hardware independence is required.\n> > It differs from the pcap PMD as it uses features that are only available on the\n> > Linux platform, so it stands to reason we should have both.\n> \n> IMHO, the biggest deficiency in DPDK is the lack of apps. Let's face\n> it, no one really cares about running l2fwd except for testing the\n> drivers. What people want is applications. 
Providing a PMD to use\n> while developing an app without requiring specific hardware seems like\n> a win to me. The pcap PMD addresses some of that, but it is more of\n> a stop-gap or special purpose thing (like for playing back captures).\n> \n\nIt is not true for network middle boxes, which solve L2/L3 packet processing problems (the main problem DPDK was created to solve), \nbut it might be true for data center or endpoint applications that primarily focus on L4-L7 packet processing problems, which\ndo not care much about L2/L3 throughput and packet latency, as the system performance bottlenecks are in the L4-L7 routines.\n\n> > > > > > It seems like the changes you mention will still need some sort of\n> > > > > > AF_PACKET-based PMD driver. Have you implemented that completely\n> > > > > > separate from the code I already posted? Or did you add that work\n> > > > > > on top of mine?\n> > > > > >\n> > > > >\n> > > > > For userland code, it certainly use some of your code related to raw rocket, but highly modified. A layer will be added into\n> eth_dev\n> > > > library to do device\n> > > > > probe and support new socket options.\n> > > > >\n> > > >\n> > > > Ok, but again, PMD's are independent, and serve different needs. If they're use\n> > > > is at all overlapping from a functional standpoint, take this one now, and\n> > > > deprecate it when a better one comes along. Though from your description it\n> > > > seems like both have a valid place in the ecosystem.\n> > > >\n> > >\n> > > I am ok with this approach, as long as this AF_PACKET PMD does not add extra maintain efforts. Thomas might make the call.\n> > >\n> > What extra maintainer efforts do you think are required here, that wouldn't be\n> > required for any PMD? To suggest that a given PMD shouldn't be included because\n> > it would require additional effort to maintain holds it to a higher standard\n> > than the PMD's already included. 
I don't recall anyone asking if the i40e or\n> > bonding pmds would require additional effort before being integrated.\n> \n> Right -- how much maintainer effort is put into the pcap driver\n> these days?\n\nI do not know the details, but I DO know the validation team needs to put a lot of effort into measuring its performance on different platforms.\nAn automation function and a performance test suite would probably help a lot.\n\n> \n> John\n> --\n> John W. Linville\t\tSomeday the world will need a hero, and you\n> linville@tuxdriver.com\t\t\tmight be all we have. Be ready.", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 1ECE93976;\n\tMon, 15 Sep 2014 21:06:42 +0200 (CEST)", "from mga02.intel.com (mga02.intel.com [134.134.136.20])\n\tby dpdk.org (Postfix) with ESMTP id AE8B3137D\n\tfor <dev@dpdk.org>; Mon, 15 Sep 2014 21:06:39 +0200 (CEST)", "from orsmga001.jf.intel.com ([10.7.209.18])\n\tby orsmga101.jf.intel.com with ESMTP; 15 Sep 2014 12:12:11 -0700", "from fmsmsx107.amr.corp.intel.com ([10.18.124.205])\n\tby orsmga001.jf.intel.com with ESMTP; 15 Sep 2014 12:11:40 -0700", "from fmsmsx119.amr.corp.intel.com (10.18.124.207) by\n\tfmsmsx107.amr.corp.intel.com (10.18.124.205) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Mon, 15 Sep 2014 12:11:39 -0700", "from shsmsx152.ccr.corp.intel.com (10.239.6.52) by\n\tFMSMSX119.amr.corp.intel.com (10.18.124.207) with Microsoft SMTP\n\tServer (TLS) id 14.3.195.1; Mon, 15 Sep 2014 12:11:39 -0700", "from shsmsx104.ccr.corp.intel.com ([169.254.5.230]) by\n\tSHSMSX152.ccr.corp.intel.com ([169.254.6.190]) with mapi id\n\t14.03.0195.001; Tue, 16 Sep 2014 03:11:38 +0800" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.04,529,1406617200\"; d=\"scan'208\";a=\"573466947\"", "From": "\"Zhou, Danny\" <danny.zhou@intel.com>", "To": "\"John W. 
Linville\" <linville@tuxdriver.com>, Neil Horman\n\t<nhorman@tuxdriver.com>", "Thread-Topic": "[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "Thread-Index": "AQHPn5Gwvcd4A0wcTkeObxakesCY6Jv9ov6AgACKEgD//4OfgIAAmgdggAPeNgCAAIm9EP//iqYAgAAX7QCAAJbfEA==", "Date": "Mon, 15 Sep 2014 19:11:37 +0000", "Message-ID": "<DFDF335405C17848924A094BC35766CF0A93A36A@SHSMSX104.ccr.corp.intel.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935AEE@SHSMSX104.ccr.corp.intel.com>\n\t<20140912185423.GD7145@tuxdriver.com>\n\t<DFDF335405C17848924A094BC35766CF0A935E75@SHSMSX104.ccr.corp.intel.com>\n\t<20140915150946.GA11690@hmsreliant.think-freely.org>\n\t<DFDF335405C17848924A094BC35766CF0A93A24C@SHSMSX104.ccr.corp.intel.com>\n\t<20140915162244.GB11690@hmsreliant.think-freely.org>\n\t<20140915174822.GG28459@tuxdriver.com>", "In-Reply-To": "<20140915174822.GG28459@tuxdriver.com>", "Accept-Language": "zh-CN, en-US", "Content-Language": "en-US", "X-MS-Has-Attach": "", "X-MS-TNEF-Correlator": "", "x-originating-ip": "[10.239.127.40]", "Content-Type": "text/plain; charset=\"us-ascii\"", "Content-Transfer-Encoding": "quoted-printable", "MIME-Version": "1.0", "Cc": "\"dev@dpdk.org\" <dev@dpdk.org>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": 
"<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 797, "web_url": "http://patches.dpdk.org/comment/797/", "msgid": "<20140916201601.GF3792@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20140916201601.GF3792@hmsreliant.think-freely.org", "date": "2014-09-16T20:16:01", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Fri, Sep 12, 2014 at 02:05:23PM -0400, John W. Linville wrote:\n> Ping? Are there objections to this patch from mid-July?\n> \n> John\n> \nThomas, Where are you on this? It seems like if you don't have any objections\nto this patch, it should go in, in ilght of the lack of further commentary.\n\nNeil", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 9B5B0AFD1;\n\tTue, 16 Sep 2014 22:10:28 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 62FF66A1D\n\tfor <dev@dpdk.org>; Tue, 16 Sep 2014 22:10:27 +0200 (CEST)", "from hmsreliant.think-freely.org\n\t([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XTzAQ-0003LV-EW; Tue, 16 Sep 2014 16:16:04 -0400" ], "Date": "Tue, 16 Sep 2014 16:16:01 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "\"John W. 
Linville\" <linville@tuxdriver.com>", "Message-ID": "<20140916201601.GF3792@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1405362290-6753-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20140912180523.GB7145@tuxdriver.com>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 1079, "web_url": "http://patches.dpdk.org/comment/1079/", "msgid": "<1743453.I5IRxog55M@xps13>", "list_archive_url": "https://inbox.dpdk.org/dev/1743453.I5IRxog55M@xps13", "date": "2014-09-26T09:28:05", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 1, "url": "http://patches.dpdk.org/api/people/1/?format=api", "name": "Thomas Monjalon", "email": "thomas.monjalon@6wind.com" }, "content": "2014-09-16 16:16, Neil Horman:\n> On Fri, Sep 12, 2014 at 02:05:23PM -0400, John W. Linville wrote:\n> > Ping? Are there objections to this patch from mid-July?\n> \n> Thomas, Where are you on this? 
It seems like if you don't have any objections\n> to this patch, it should go in, in ilght of the lack of further commentary.\n\n1) It doesn't appear as a top priority.\n2) It's competing with pcap PMD and bifurcated PMD to come\n (http://dpdk.org/ml/archives/dev/2014-September/005379.html)\n3) There is no test associated with this PMD.\nIf one of this item becomes wrong, it should go in.\n\nCurrently, 2 projects are being initiated for validation (dcts) and\ndocumentation. Keeping new things outside of the DPDK core makes it\nclear that they have not to be supported by dcts and doc yet.\nSo, it is better to have an external PMD, like memnic, acting as a\nstaging area.\n\nDuring this time, keeping this PMD separately will allow you to update it\nwith a maintainer account in dpdk.org. I just need your SSH public key.\n\nThank you", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 6A4057DEC;\n\tFri, 26 Sep 2014 11:21:55 +0200 (CEST)", "from mail-wg0-f43.google.com (mail-wg0-f43.google.com\n\t[74.125.82.43]) by dpdk.org (Postfix) with ESMTP id 7A5797DEB\n\tfor <dev@dpdk.org>; Fri, 26 Sep 2014 11:21:54 +0200 (CEST)", "by mail-wg0-f43.google.com with SMTP id y10so9472596wgg.26\n\tfor <dev@dpdk.org>; Fri, 26 Sep 2014 02:28:16 -0700 (PDT)", "from xps13.localnet (136-92-190-109.dsl.ovh.fr. 
[109.190.92.136])\n\tby mx.google.com with ESMTPSA id\n\tk2sm3634109wjy.34.2014.09.26.02.28.13 for <multiple recipients>\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tFri, 26 Sep 2014 02:28:14 -0700 (PDT)" ], "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20130820;\n\th=x-gm-message-state:from:to:cc:subject:date:message-id:organization\n\t:user-agent:in-reply-to:references:mime-version\n\t:content-transfer-encoding:content-type;\n\tbh=ZnHJmhQpv0uJ37dZIxSwSONMTGZ8FV48ecam+Qewszc=;\n\tb=VVQhZQTjH8OBk0ulghZ8i+8iXPj9mivCjOMVOaToQ8nFjcv3TxAun5XEejFsEM1R59\n\tMhtUivdrKgyHlrXtD+EyHO0iRgDk5xROUp9l74y0cvQXh3pPi8yoR2s7WfKrcfBEbHtY\n\trYaZTrHf8fMoVz56lI6WTtbu7M+rER/993QtY4SupkDSpjwMNSSB27N/IICo9nEw0EaH\n\tCIrne66wOfm0wYs34/klNTnsNGlKNG7RikFL0pi2j5vKmZFqHVO1x/W4SfQFxbmvzz0r\n\t1I8BGQkSJbmzp7vhtWMKUJ6juHXHUGn0koJitYV1W0VgdwplcCFHKDUgNhoN11TSQjgu\n\tpIQQ==", "X-Gm-Message-State": "ALoCoQkqwsMhHq9B8OpD1gq5PulZSE0GFRLouxDYWoDYxkLBhY1olKKX79wjhQd05ZOiTlu+rApX", "X-Received": "by 10.180.74.227 with SMTP id x3mr25255751wiv.80.1411723695698; \n\tFri, 26 Sep 2014 02:28:15 -0700 (PDT)", "From": "Thomas Monjalon <thomas.monjalon@6wind.com>", "To": "Neil Horman <nhorman@tuxdriver.com>,\n\t\"John W. 
Linville\" <linville@tuxdriver.com>", "Date": "Fri, 26 Sep 2014 11:28:05 +0200", "Message-ID": "<1743453.I5IRxog55M@xps13>", "Organization": "6WIND", "User-Agent": "KMail/4.13.3 (Linux/3.15.8-1-ARCH; KDE/4.13.3; x86_64; ; )", "In-Reply-To": "<20140916201601.GF3792@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<20140916201601.GF3792@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "7Bit", "Content-Type": "text/plain; charset=\"us-ascii\"", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 1130, "web_url": "http://patches.dpdk.org/comment/1130/", "msgid": "<20140926140855.GD3930@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20140926140855.GD3930@hmsreliant.think-freely.org", "date": "2014-09-26T14:08:55", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> 2014-09-16 16:16, Neil Horman:\n> > On Fri, Sep 12, 2014 at 02:05:23PM -0400, 
John W. Linville wrote:\n> > > Ping? Are there objections to this patch from mid-July?\n> > \n> > Thomas, Where are you on this? It seems like if you don't have any objections\n> > to this patch, it should go in, in ilght of the lack of further commentary.\n> \n> 1) It doesn't appear as a top priority.\nThats your responsibility. Patches can't languish and rot on a list forever\njust because others aren't willing to test it. If theres further testing that\nyou feel it needs, ask. But from my read, its been tested for functionality and\nperformance (though high performance is never expected from a AF_PACKET PMD).\nGiven that any one PMD will not affect the performance of another in isolation,\nI'm not sure what more you're waiting for here.\n\n> 2) It's competing with pcap PMD and bifurcated PMD to come\n> (http://dpdk.org/ml/archives/dev/2014-September/005379.html)\nRegarding the pcap PMD, so? Its an alternate implementation that provides\ndifferent features with different limitations. The fact that they are simmilar\nis irrelevant. If simmilarity was the test, then we wouldn't bother with the\nbifurcated driver either, because the pcap pmd already exists.\n\nRegarding the bifurcated driver, you can't hold existing patches on the promise\nof another pmd thats comming at an indeterminate time in the future. Theres no\nreason not to take this now and deprecate it in the future if there is\nsufficient overlap with the bifurcated driver, though to my point above, they\nstill address different needs with different limitations, so I don't see doing\nso as necessecary.\n \n> 3) There is no test associated with this PMD.\nThat would have been a great comment to make a few months back, though whats\nwrong with testpmd here? That seems to be the same test that every other pmd\nuses. What exactly are you looking for?\n\n\n> If one of this item becomes wrong, it should go in.\n> \n\n> Currently, 2 projects are being initiated for validation (dcts) and\n> documentation. 
Keeping new things outside of the DPDK core makes it\n> clear that they have not to be supported by dcts and doc yet.\n> So, it is better to have an external PMD, like memnic, acting as a\n> staging area.\n> \nSo, this brings up an excellent point - Validation and support. Commonly open\nsource projects don't provide support at the upstream HEAD. Those items are\napplied and inforced by distributors. Theres no need to ensure that the\nupstream head is always the most performance and stable point of the tree. Its\nthat need that keeps the development pace slow, and creates frustrations like\nthis one, where a patch sits unaddressed for long periods of time. Commonly the\nworkflow for most open source projects is for there to be a window of time where\nvisual review and basic functional testing are sufficient for acceptance into\nthe head of the tree. After the development window closes there is a\nstabilization period where testing/validation is done to ensure that no\nregressions have been encountered, optionally with a -next branch temporarily\nbeing created to accept patches for upcomming future releases. If regressions\nare found, its a simple matter in git to bisect back to the offending patch,\nallow the contributing developer an opportunity to fix the issue, or to drop the\npatch. Using a workflow like this we can have a reasonable balance of needs\n(good patch turn around time, as well as reasonable testing). We've discussed\nthis when I posted the PMD_REGISTER_DRIVER patch months ago, and I thought you\nwere going to move in the direction of this workflow. What happened?\n\n> During this time, keeping this PMD separately will allow you to update it\n> with a maintainer account in dpdk.org. I just need your SSH public key.\n> \nWe've discussed this too, keeping PMDs maintained separately is a very bad idea.\nDoing so means developers have to constantly be aware of changes to the core\ntree and try to keep up individually. 
Integrating them all means that API\nchanges can be easily propogated to all PMD's when needed without making work\nfor many people. Its exactly the reason we encourage driver writers to open\nsource drivers in Linux, because not doing so closes developers off from the\nfree maintenence they get when optimizations are made to API's. And if you\nfollow the development model above, you don't need to worry about implied\nsupport, as that correctly becomes a distributor issue.\n\n\nNeil", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id EE9087DFF;\n\tFri, 26 Sep 2014 16:02:44 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 0F9DD7DF0\n\tfor <dev@dpdk.org>; Fri, 26 Sep 2014 16:02:44 +0200 (CEST)", "from hmsreliant.think-freely.org\n\t([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XXWCe-0007CI-63; Fri, 26 Sep 2014 10:09:04 -0400" ], "Date": "Fri, 26 Sep 2014 10:08:55 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "Thomas Monjalon <thomas.monjalon@6wind.com>", "Message-ID": "<20140926140855.GD3930@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<20140916201601.GF3792@hmsreliant.think-freely.org>\n\t<1743453.I5IRxog55M@xps13>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<1743453.I5IRxog55M@xps13>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual 
devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 1209, "web_url": "http://patches.dpdk.org/comment/1209/", "msgid": "<20140929100553.GA12072@BRICHA3-MOBL>", "list_archive_url": "https://inbox.dpdk.org/dev/20140929100553.GA12072@BRICHA3-MOBL", "date": "2014-09-29T10:05:53", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 20, "url": "http://patches.dpdk.org/api/people/20/?format=api", "name": "Bruce Richardson", "email": "bruce.richardson@intel.com" }, "content": "On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > 2014-09-16 16:16, Neil Horman:\n> > > On Fri, Sep 12, 2014 at 02:05:23PM -0400, John W. Linville wrote:\n> > > > Ping? Are there objections to this patch from mid-July?\n> > > \n> > > Thomas, Where are you on this? It seems like if you don't have any objections\n> > > to this patch, it should go in, in ilght of the lack of further commentary.\n> > \n> > 1) It doesn't appear as a top priority.\n> Thats your responsibility. Patches can't languish and rot on a list forever\n> just because others aren't willing to test it. If theres further testing that\n> you feel it needs, ask. 
But from my read, its been tested for functionality and\n> performance (though high performance is never expected from a AF_PACKET PMD).\n> Given that any one PMD will not affect the performance of another in isolation,\n> I'm not sure what more you're waiting for here.\n> \n> > 2) It's competing with pcap PMD and bifurcated PMD to come\n> > (http://dpdk.org/ml/archives/dev/2014-September/005379.html)\n> Regarding the pcap PMD, so? Its an alternate implementation that provides\n> different features with different limitations. The fact that they are simmilar\n> is irrelevant. If simmilarity was the test, then we wouldn't bother with the\n> bifurcated driver either, because the pcap pmd already exists.\n> \n> Regarding the bifurcated driver, you can't hold existing patches on the promise\n> of another pmd thats comming at an indeterminate time in the future. Theres no\n> reason not to take this now and deprecate it in the future if there is\n> sufficient overlap with the bifurcated driver, though to my point above, they\n> still address different needs with different limitations, so I don't see doing\n> so as necessecary.\n> \n> > 3) There is no test associated with this PMD.\n> That would have been a great comment to make a few months back, though whats\n> wrong with testpmd here? That seems to be the same test that every other pmd\n> uses. What exactly are you looking for?\n> \n> \n> > If one of this item becomes wrong, it should go in.\n> > \n> \n> > Currently, 2 projects are being initiated for validation (dcts) and\n> > documentation. Keeping new things outside of the DPDK core makes it\n> > clear that they have not to be supported by dcts and doc yet.\n> > So, it is better to have an external PMD, like memnic, acting as a\n> > staging area.\n> > \n> So, this brings up an excellent point - Validation and support. Commonly open\n> source projects don't provide support at the upstream HEAD. Those items are\n> applied and inforced by distributors. 
Theres no need to ensure that the\n> upstream head is always the most performance and stable point of the tree. Its\n> that need that keeps the development pace slow, and creates frustrations like\n> this one, where a patch sits unaddressed for long periods of time. Commonly the\n> workflow for most open source projects is for there to be a window of time where\n> visual review and basic functional testing are sufficient for acceptance into\n> the head of the tree. After the development window closes there is a\n> stabilization period where testing/validation is done to ensure that no\n> regressions have been encountered, optionally with a -next branch temporarily\n> being created to accept patches for upcomming future releases. If regressions\n> are found, its a simple matter in git to bisect back to the offending patch,\n> allow the contributing developer an opportunity to fix the issue, or to drop the\n> patch. Using a workflow like this we can have a reasonable balance of needs\n> (good patch turn around time, as well as reasonable testing). We've discussed\n> this when I posted the PMD_REGISTER_DRIVER patch months ago, and I thought you\n> were going to move in the direction of this workflow. What happened?\n> \n> > During this time, keeping this PMD separately will allow you to update it\n> > with a maintainer account in dpdk.org. I just need your SSH public key.\n> > \n> We've discussed this too, keeping PMDs maintained separately is a very bad idea.\n> Doing so means developers have to constantly be aware of changes to the core\n> tree and try to keep up individually. Integrating them all means that API\n> changes can be easily propogated to all PMD's when needed without making work\n> for many people. Its exactly the reason we encourage driver writers to open\n> source drivers in Linux, because not doing so closes developers off from the\n> free maintenence they get when optimizations are made to API's. 
And if you\n> follow the development model above, you don't need to worry about implied\n> support, as that correctly becomes a distributor issue.\n> \n> \n> Neil\n\nWhile not wanting to get too involved in the discussion, I'd just like to \nexpress my support for getting this new PMD merged in.\n\n/Bruce", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id F3B5211F5;\n\tMon, 29 Sep 2014 11:59:24 +0200 (CEST)", "from mga03.intel.com (mga03.intel.com [134.134.136.65])\n\tby dpdk.org (Postfix) with ESMTP id A6AF4212\n\tfor <dev@dpdk.org>; Mon, 29 Sep 2014 11:59:22 +0200 (CEST)", "from orsmga001.jf.intel.com ([10.7.209.18])\n\tby orsmga103.jf.intel.com with ESMTP; 29 Sep 2014 03:03:45 -0700", "from bricha3-mobl.ger.corp.intel.com (HELO\n\tbricha3-mobl.ir.intel.com) ([10.243.20.21])\n\tby orsmga001.jf.intel.com with SMTP; 29 Sep 2014 03:05:54 -0700", "by bricha3-mobl.ir.intel.com (sSMTP sendmail emulation);\n\tMon, 29 Sep 2014 11:05:53 +0001" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos;i=\"5.04,619,1406617200\"; d=\"scan'208\";a=\"580550241\"", "Date": "Mon, 29 Sep 2014 11:05:53 +0100", "From": "Bruce Richardson <bruce.richardson@intel.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Message-ID": "<20140929100553.GA12072@BRICHA3-MOBL>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<20140912180523.GB7145@tuxdriver.com>\n\t<20140916201601.GF3792@hmsreliant.think-freely.org>\n\t<1743453.I5IRxog55M@xps13>\n\t<20140926140855.GD3930@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20140926140855.GD3930@hmsreliant.think-freely.org>", "Organization": "Intel Shannon Ltd.", "User-Agent": "Mutt/1.5.22 (2013-10-16)", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] 
[PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 1471, "web_url": "http://patches.dpdk.org/comment/1471/", "msgid": "<4230474.fJnSGQJdQd@xps13>", "list_archive_url": "https://inbox.dpdk.org/dev/4230474.fJnSGQJdQd@xps13", "date": "2014-10-08T15:57:46", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 1, "url": "http://patches.dpdk.org/api/people/1/?format=api", "name": "Thomas Monjalon", "email": "thomas.monjalon@6wind.com" }, "content": "2014-09-29 11:05, Bruce Richardson:\n> On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > 2014-09-16 16:16, Neil Horman:\n> > > > On Fri, Sep 12, 2014 at 02:05:23PM -0400, John W. Linville wrote:\n> > > > > Ping? Are there objections to this patch from mid-July?\n> > > > \n> > > > Thomas, Where are you on this? It seems like if you don't have any objections\n> > > > to this patch, it should go in, in ilght of the lack of further commentary.\n> > > \n> > > 1) It doesn't appear as a top priority.\n> > Thats your responsibility. Patches can't languish and rot on a list forever\n> > just because others aren't willing to test it. If theres further testing that\n> > you feel it needs, ask. 
But from my read, its been tested for functionality and\n> > performance (though high performance is never expected from a AF_PACKET PMD).\n> > Given that any one PMD will not affect the performance of another in isolation,\n> > I'm not sure what more you're waiting for here.\n\nYes, integration of new PMD must be accelerated.\n\n> > > 2) It's competing with pcap PMD and bifurcated PMD to come\n> > > (http://dpdk.org/ml/archives/dev/2014-September/005379.html)\n> > Regarding the pcap PMD, so? Its an alternate implementation that provides\n> > different features with different limitations. The fact that they are simmilar\n> > is irrelevant. If simmilarity was the test, then we wouldn't bother with the\n> > bifurcated driver either, because the pcap pmd already exists.\n> > \n> > Regarding the bifurcated driver, you can't hold existing patches on the promise\n> > of another pmd thats comming at an indeterminate time in the future. Theres no\n> > reason not to take this now and deprecate it in the future if there is\n> > sufficient overlap with the bifurcated driver, though to my point above, they\n> > still address different needs with different limitations, so I don't see doing\n> > so as necessecary.\n\nYes, we'll discuss it when bifurcated driver will be released.\n\n> > > 3) There is no test associated with this PMD.\n> > That would have been a great comment to make a few months back, though whats\n> > wrong with testpmd here? That seems to be the same test that every other pmd\n> > uses. What exactly are you looking for?\n\nI was thinking of testing behaviour with different kernel configurations and\nunit tests for --vdev options. But it's not a major blocker.\n\n> > > If one of this item becomes wrong, it should go in.\n> > \n> > > Currently, 2 projects are being initiated for validation (dcts) and\n> > > documentation. 
Keeping new things outside of the DPDK core makes it\n> > > clear that they have not to be supported by dcts and doc yet.\n> > > So, it is better to have an external PMD, like memnic, acting as a\n> > > staging area.\n> > > \n> > So, this brings up an excellent point - Validation and support. Commonly open\n> > source projects don't provide support at the upstream HEAD. Those items are\n> > applied and inforced by distributors. Theres no need to ensure that the\n> > upstream head is always the most performance and stable point of the tree. Its\n> > that need that keeps the development pace slow, and creates frustrations like\n> > this one, where a patch sits unaddressed for long periods of time. Commonly the\n> > workflow for most open source projects is for there to be a window of time where\n> > visual review and basic functional testing are sufficient for acceptance into\n> > the head of the tree. After the development window closes there is a\n> > stabilization period where testing/validation is done to ensure that no\n> > regressions have been encountered, optionally with a -next branch temporarily\n> > being created to accept patches for upcomming future releases. If regressions\n> > are found, its a simple matter in git to bisect back to the offending patch,\n> > allow the contributing developer an opportunity to fix the issue, or to drop the\n> > patch. Using a workflow like this we can have a reasonable balance of needs\n> > (good patch turn around time, as well as reasonable testing). We've discussed\n> > this when I posted the PMD_REGISTER_DRIVER patch months ago, and I thought you\n> > were going to move in the direction of this workflow. What happened?\n\nYes, we are moving to a \"merge window\" workflow.\n\n> > > During this time, keeping this PMD separately will allow you to update it\n> > > with a maintainer account in dpdk.org. 
I just need your SSH public key.\n> > > \n> > We've discussed this too, keeping PMDs maintained separately is a very bad idea.\n> > Doing so means developers have to constantly be aware of changes to the core\n> > tree and try to keep up individually. Integrating them all means that API\n> > changes can be easily propogated to all PMD's when needed without making work\n> > for many people. Its exactly the reason we encourage driver writers to open\n> > source drivers in Linux, because not doing so closes developers off from the\n> > free maintenence they get when optimizations are made to API's. And if you\n> > follow the development model above, you don't need to worry about implied\n> > support, as that correctly becomes a distributor issue.\n> > \n> > \n> > Neil\n> \n> While not wanting to get too involved in the discussion, I'd just like to \n> express my support for getting this new PMD merged in.\n\nIf RedHat is committed for its maintenance, it could integrated in release 1.8.\nBut I'd like it to be renamed as pmd_af_packet (or a better name) instead of\npmd_packet.\n\nThanks", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 03F207E82;\n\tWed, 8 Oct 2014 17:50:49 +0200 (CEST)", "from mail-wi0-f179.google.com (mail-wi0-f179.google.com\n\t[209.85.212.179]) by dpdk.org (Postfix) with ESMTP id 767307E8C\n\tfor <dev@dpdk.org>; Wed, 8 Oct 2014 17:50:44 +0200 (CEST)", "by mail-wi0-f179.google.com with SMTP id d1so11058325wiv.6\n\tfor <dev@dpdk.org>; Wed, 08 Oct 2014 08:58:02 -0700 (PDT)", "from xps13.localnet (guy78-3-82-239-227-177.fbx.proxad.net.\n\t[82.239.227.177]) by mx.google.com with ESMTPSA id\n\tau4sm546220wjc.15.2014.10.08.08.58.00 for <multiple recipients>\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 08 Oct 2014 08:58:01 -0700 (PDT)" ], 
"X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20130820;\n\th=x-gm-message-state:from:to:cc:subject:date:message-id:organization\n\t:user-agent:in-reply-to:references:mime-version\n\t:content-transfer-encoding:content-type;\n\tbh=8EWhDnzGcfpDxWBeRKd6aZMLxY5ufilxc+8FY7aRqW0=;\n\tb=IRrxPEJlN/q8EoA2FgPg7Z/GrCq4HxUjfYnoaJzihKZHAwSL7x8HU9655sME1KfAGA\n\taBc/xdI2zu/ny/hQ1HyW0CUwYArlPkIg0DOZkMzGnq3FueAYUn1zaEjheTW7SSrsAadI\n\tKOCY7biFrf+iNVoLWV8BP9ZH++wtLlLahyq8Pm4hHDB/zxZi5FYWcbpT6gyyeKTPbEzV\n\t5IpC5DkokwQgOlfBHYTjqcpaHSF4SrhnvcU7I8wqFzO2P4v70DKvMM8AnME2/0Q+NBl6\n\t/u3WvBIoaFBT+BYDvd9YW9aCe7IEu8IzXYBtART4iBpdVLE+BWthiuI2ytmjvCT27Mma\n\tGxzg==", "X-Gm-Message-State": "ALoCoQlwu5ZD1EL3vro+xQUeAAtL/c1JsNTL7uX+GuL9icuU2UfTGWGxMJjYayhdiJ+xJQ7QwlmM", "X-Received": "by 10.194.76.97 with SMTP id j1mr12277978wjw.40.1412783882318;\n\tWed, 08 Oct 2014 08:58:02 -0700 (PDT)", "From": "Thomas Monjalon <thomas.monjalon@6wind.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Date": "Wed, 08 Oct 2014 17:57:46 +0200", "Message-ID": "<4230474.fJnSGQJdQd@xps13>", "Organization": "6WIND", "User-Agent": "KMail/4.13.3 (Linux/3.15.8-1-ARCH; KDE/4.13.3; x86_64; ; )", "In-Reply-To": "<20140929100553.GA12072@BRICHA3-MOBL>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<20140926140855.GD3930@hmsreliant.think-freely.org>\n\t<20140929100553.GA12072@BRICHA3-MOBL>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "7Bit", "Content-Type": "text/plain; charset=\"us-ascii\"", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": 
"<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 1498, "web_url": "http://patches.dpdk.org/comment/1498/", "msgid": "<20141008191403.GB13306@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20141008191403.GB13306@hmsreliant.think-freely.org", "date": "2014-10-08T19:14:03", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n> 2014-09-29 11:05, Bruce Richardson:\n> > On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > > 2014-09-16 16:16, Neil Horman:\n> > > > > On Fri, Sep 12, 2014 at 02:05:23PM -0400, John W. Linville wrote:\n> > > > > > Ping? Are there objections to this patch from mid-July?\n> > > > > \n> > > > > Thomas, Where are you on this? It seems like if you don't have any objections\n> > > > > to this patch, it should go in, in ilght of the lack of further commentary.\n> > > > \n> > > > 1) It doesn't appear as a top priority.\n> > > Thats your responsibility. Patches can't languish and rot on a list forever\n> > > just because others aren't willing to test it. If theres further testing that\n> > > you feel it needs, ask. 
But from my read, its been tested for functionality and\n> > > performance (though high performance is never expected from a AF_PACKET PMD).\n> > > Given that any one PMD will not affect the performance of another in isolation,\n> > > I'm not sure what more you're waiting for here.\n> \n> Yes, integration of new PMD must be accelerated.\n> \n> > > > 2) It's competing with pcap PMD and bifurcated PMD to come\n> > > > (http://dpdk.org/ml/archives/dev/2014-September/005379.html)\n> > > Regarding the pcap PMD, so? Its an alternate implementation that provides\n> > > different features with different limitations. The fact that they are simmilar\n> > > is irrelevant. If simmilarity was the test, then we wouldn't bother with the\n> > > bifurcated driver either, because the pcap pmd already exists.\n> > > \n> > > Regarding the bifurcated driver, you can't hold existing patches on the promise\n> > > of another pmd thats comming at an indeterminate time in the future. Theres no\n> > > reason not to take this now and deprecate it in the future if there is\n> > > sufficient overlap with the bifurcated driver, though to my point above, they\n> > > still address different needs with different limitations, so I don't see doing\n> > > so as necessecary.\n> \n> Yes, we'll discuss it when bifurcated driver will be released.\n> \njohn Fastabend posted it to netdev just a few days ago. There have been some\nconcerns raised, which he is trying to address. I'm watching how that goes.\n\n> > > > 3) There is no test associated with this PMD.\n> > > That would have been a great comment to make a few months back, though whats\n> > > wrong with testpmd here? That seems to be the same test that every other pmd\n> > > uses. What exactly are you looking for?\n> \n> I was thinking of testing behaviour with different kernel configurations and\n> unit tests for --vdev options. But it's not a major blocker.\n> \nThats fine with me. 
If theres a set of unit tests that you have documentation\nfor, I'm sure we would be happy to run them. I presume you just want all the\npmd vdev option exercised? Any specific sets of kernel configurations?\n\n\n> > > > If one of this item becomes wrong, it should go in.\n> > > \n> > > > Currently, 2 projects are being initiated for validation (dcts) and\n> > > > documentation. Keeping new things outside of the DPDK core makes it\n> > > > clear that they have not to be supported by dcts and doc yet.\n> > > > So, it is better to have an external PMD, like memnic, acting as a\n> > > > staging area.\n> > > > \n> > > So, this brings up an excellent point - Validation and support. Commonly open\n> > > source projects don't provide support at the upstream HEAD. Those items are\n> > > applied and inforced by distributors. Theres no need to ensure that the\n> > > upstream head is always the most performance and stable point of the tree. Its\n> > > that need that keeps the development pace slow, and creates frustrations like\n> > > this one, where a patch sits unaddressed for long periods of time. Commonly the\n> > > workflow for most open source projects is for there to be a window of time where\n> > > visual review and basic functional testing are sufficient for acceptance into\n> > > the head of the tree. After the development window closes there is a\n> > > stabilization period where testing/validation is done to ensure that no\n> > > regressions have been encountered, optionally with a -next branch temporarily\n> > > being created to accept patches for upcomming future releases. If regressions\n> > > are found, its a simple matter in git to bisect back to the offending patch,\n> > > allow the contributing developer an opportunity to fix the issue, or to drop the\n> > > patch. Using a workflow like this we can have a reasonable balance of needs\n> > > (good patch turn around time, as well as reasonable testing). 
We've discussed\n> > > this when I posted the PMD_REGISTER_DRIVER patch months ago, and I thought you\n> > > were going to move in the direction of this workflow. What happened?\n> \n> Yes, we are moving to a \"merge window\" workflow.\n> \nThat would be wonderful. I think separating the integration workflow from the\ntest workflow is critical here to making sure that patch integration isn't\nunnecessecarily delayed.\n\n> > > > During this time, keeping this PMD separately will allow you to update it\n> > > > with a maintainer account in dpdk.org. I just need your SSH public key.\n> > > > \n> > > We've discussed this too, keeping PMDs maintained separately is a very bad idea.\n> > > Doing so means developers have to constantly be aware of changes to the core\n> > > tree and try to keep up individually. Integrating them all means that API\n> > > changes can be easily propogated to all PMD's when needed without making work\n> > > for many people. Its exactly the reason we encourage driver writers to open\n> > > source drivers in Linux, because not doing so closes developers off from the\n> > > free maintenence they get when optimizations are made to API's. And if you\n> > > follow the development model above, you don't need to worry about implied\n> > > support, as that correctly becomes a distributor issue.\n> > > \n> > > \n> > > Neil\n> > \n> > While not wanting to get too involved in the discussion, I'd just like to \n> > express my support for getting this new PMD merged in.\n> \n> If RedHat is committed for its maintenance, it could integrated in release 1.8.\n> But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> pmd_packet.\n> \nJohn L. is on his way to plumbers at the moment, so is unable to comment, but\nI'll try to get a few cycles to change the name of the PMD around. And yes, I\nthought that maintenance was implicit. He's the author, of course he'll take\ncare of it :). 
And I'll be glad to help\n\nNeil\n\n> Thanks\n> -- \n> Thomas\n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 45A747F49;\n\tWed, 8 Oct 2014 21:06:57 +0200 (CEST)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id CEE4F7F3D\n\tfor <dev@dpdk.org>; Wed, 8 Oct 2014 21:06:55 +0200 (CEST)", "from cpe-098-026-066-094.nc.res.rr.com ([98.26.66.94]\n\thelo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XbwgV-0008Mu-IE; Wed, 08 Oct 2014 15:14:10 -0400" ], "Date": "Wed, 8 Oct 2014 15:14:03 -0400", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "Thomas Monjalon <thomas.monjalon@6wind.com>", "Message-ID": "<20141008191403.GB13306@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<20140926140855.GD3930@hmsreliant.think-freely.org>\n\t<20140929100553.GA12072@BRICHA3-MOBL> <4230474.fJnSGQJdQd@xps13>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<4230474.fJnSGQJdQd@xps13>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": 
"<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2626, "web_url": "http://patches.dpdk.org/comment/2626/", "msgid": "<1898542.t3c6y266ZQ@xps13>", "list_archive_url": "https://inbox.dpdk.org/dev/1898542.t3c6y266ZQ@xps13", "date": "2014-11-13T10:03:18", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 1, "url": "http://patches.dpdk.org/api/people/1/?format=api", "name": "Thomas Monjalon", "email": "thomas.monjalon@6wind.com" }, "content": "Hi Neil and John,\n\nI would like to wake up this very old thread.\n\n2014-10-08 15:14, Neil Horman:\n> On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n> > 2014-09-29 11:05, Bruce Richardson:\n> > > On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > > > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > > > 3) There is no test associated with this PMD.\n> > > > That would have been a great comment to make a few months back, though whats\n> > > > wrong with testpmd here? That seems to be the same test that every other pmd\n> > > > uses. What exactly are you looking for?\n> > \n> > I was thinking of testing behaviour with different kernel configurations and\n> > unit tests for --vdev options. But it's not a major blocker.\n> > \n> Thats fine with me. If theres a set of unit tests that you have documentation\n> for, I'm sure we would be happy to run them. I presume you just want all the\n> pmd vdev option exercised? Any specific sets of kernel configurations?\n\nI don't really know which tests are needed. It could be a mix of unit tests\nand functionnal tests described in a test plan.\nThe goal is to be able to validate the behaviour and check there is no\nregression. Ideally some corner cases could be described.\nI'm OK to integrate it as is. 
But future maintenance will probably need\nsuch inputs for validation tests.\n\n> > If RedHat is committed for its maintenance, it could integrated in release 1.8.\n> > But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> > pmd_packet.\n> > \n> John L. is on his way to plumbers at the moment, so is unable to comment, but\n> I'll try to get a few cycles to change the name of the PMD around. And yes, I\n> thought that maintenance was implicit. He's the author, of course he'll take\n> care of it :). And I'll be glad to help\n\nDo you have time in coming days to rebase and rename this PMD for inclusion\nin 1.8.0 release?\n\nThanks", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id CC7D07E75;\n\tThu, 13 Nov 2014 10:53:24 +0100 (CET)", "from mail-wg0-f48.google.com (mail-wg0-f48.google.com\n\t[74.125.82.48]) by dpdk.org (Postfix) with ESMTP id AC223595B\n\tfor <dev@dpdk.org>; Thu, 13 Nov 2014 10:53:21 +0100 (CET)", "by mail-wg0-f48.google.com with SMTP id y19so5635036wgg.7\n\tfor <dev@dpdk.org>; Thu, 13 Nov 2014 02:03:19 -0800 (PST)", "from xps13.localnet (136-92-190-109.dsl.ovh.fr. 
[109.190.92.136])\n\tby mx.google.com with ESMTPSA id\n\ts10sm25043758wix.14.2014.11.13.02.03.18 for <multiple recipients>\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 13 Nov 2014 02:03:18 -0800 (PST)" ], "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20130820;\n\th=x-gm-message-state:date:from:to:subject:message-id:organization\n\t:user-agent:in-reply-to:references:mime-version\n\t:content-transfer-encoding:content-type;\n\tbh=+4Je3mUqT6SkWYCB8qn1jaRDH057m1YBImYhWzSbMDc=;\n\tb=Bmt1Bzf/l1maD33ySe8Gi62ZkHRksxL1qb99UBB7o6a4aw4FTQR3E88E+G0SUH9Jcg\n\tjzda0vRAl20rZY4XoPxJHRQ3UQ5CD++1O7ZSMUK8Zg3JXxnw7lfzmMCs6x197IRPu4QX\n\tP+PpQxDve5tbVynSwS9/vIJKFc/47iAVoduFkiuzfmIvV8K/EJI9jGFXI3qkVEdwKCtg\n\t22CeZ+nQtfWukQBBxrmpUIriBbwl4+tEt7o7jvwPbtWSfqqs3F21d9utpQv568AZ2IC2\n\tEBAAxnWtWiQTbSuc4sdyBxBc0MkM8yHfWk9dGoFo8C9NQ4wQFf8UYVb6dpIxIpKcV/ao\n\tuUBQ==", "X-Gm-Message-State": "ALoCoQluDWz693zi29/gol9IOflALav4YCNrh+IxWZ4ntHt6kvP3E9HxuYugA8t9z9ugfAe+2Kz1", "X-Received": "by 10.194.121.34 with SMTP id lh2mr2460929wjb.72.1415872999716; \n\tThu, 13 Nov 2014 02:03:19 -0800 (PST)", "Date": "Thu, 13 Nov 2014 02:03:18 -0800 (PST)", "X-Google-Original-Date": "Thu, 13 Nov 2014 11:03 +0100", "From": "Thomas Monjalon <thomas.monjalon@6wind.com>", "To": "Neil Horman <nhorman@tuxdriver.com>, dev@dpdk.org,\n\tJohn Linville <linville@redhat.com>", "Message-ID": "<1898542.t3c6y266ZQ@xps13>", "Organization": "6WIND", "User-Agent": "KMail/4.14.2 (Linux/3.17.2-1-ARCH; KDE/4.14.2; x86_64; ; )", "In-Reply-To": "<20141008191403.GB13306@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<4230474.fJnSGQJdQd@xps13>\n\t<20141008191403.GB13306@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "7Bit", "Content-Type": "text/plain; charset=\"us-ascii\"", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual 
devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2627, "web_url": "http://patches.dpdk.org/comment/2627/", "msgid": "<20141113111428.GA13253@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20141113111428.GA13253@hmsreliant.think-freely.org", "date": "2014-11-13T11:14:29", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Thu, Nov 13, 2014 at 02:03:18AM -0800, Thomas Monjalon wrote:\n> Hi Neil and John,\n> \n> I would like to wake up this very old thread.\n> \n> 2014-10-08 15:14, Neil Horman:\n> > On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n> > > 2014-09-29 11:05, Bruce Richardson:\n> > > > On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > > > > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > > > > 3) There is no test associated with this PMD.\n> > > > > That would have been a great comment to make a few months back, though whats\n> > > > > wrong with testpmd here? That seems to be the same test that every other pmd\n> > > > > uses. 
What exactly are you looking for?\n> > > \n> > > I was thinking of testing behaviour with different kernel configurations and\n> > > unit tests for --vdev options. But it's not a major blocker.\n> > > \n> > Thats fine with me. If theres a set of unit tests that you have documentation\n> > for, I'm sure we would be happy to run them. I presume you just want all the\n> > pmd vdev option exercised? Any specific sets of kernel configurations?\n> \n> I don't really know which tests are needed. It could be a mix of unit tests\n> and functionnal tests described in a test plan.\n> The goal is to be able to validate the behaviour and check there is no\n> regression. Ideally some corner cases could be described.\n> I'm OK to integrate it as is. But future maintenance will probably need\n> such inputs for validation tests.\n> \nDo you have an example set of tests that the other pmd's have followed for this?\n\n> > > If RedHat is committed for its maintenance, it could integrated in release 1.8.\n> > > But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> > > pmd_packet.\n> > > \n> > John L. is on his way to plumbers at the moment, so is unable to comment, but\n> > I'll try to get a few cycles to change the name of the PMD around. And yes, I\n> > thought that maintenance was implicit. He's the author, of course he'll take\n> > care of it :). 
And I'll be glad to help\n> \n> Do you have time in coming days to rebase and rename this PMD for inclusion\n> in 1.8.0 release?\n> \n> Thanks\n> -- \n> Thomas\n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id C3BE87E75;\n\tThu, 13 Nov 2014 12:04:44 +0100 (CET)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id D0B2D594F\n\tfor <dev@dpdk.org>; Thu, 13 Nov 2014 12:04:42 +0100 (CET)", "from hmsreliant.think-freely.org\n\t([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XosM9-0002jA-P2; Thu, 13 Nov 2014 06:14:36 -0500" ], "Date": "Thu, 13 Nov 2014 06:14:29 -0500", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "Thomas Monjalon <thomas.monjalon@6wind.com>", "Message-ID": "<20141113111428.GA13253@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<4230474.fJnSGQJdQd@xps13>\n\t<20141008191403.GB13306@hmsreliant.think-freely.org>\n\t<1898542.t3c6y266ZQ@xps13>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<1898542.t3c6y266ZQ@xps13>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "dev@dpdk.org, John Linville <linville@redhat.com>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": 
"<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2634, "web_url": "http://patches.dpdk.org/comment/2634/", "msgid": "<1958929.j3reG6p4bY@xps13>", "list_archive_url": "https://inbox.dpdk.org/dev/1958929.j3reG6p4bY@xps13", "date": "2014-11-13T11:57:25", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 1, "url": "http://patches.dpdk.org/api/people/1/?format=api", "name": "Thomas Monjalon", "email": "thomas.monjalon@6wind.com" }, "content": "2014-11-13 06:14, Neil Horman:\n> On Thu, Nov 13, 2014 at 02:03:18AM -0800, Thomas Monjalon wrote:\n> > 2014-10-08 15:14, Neil Horman:\n> > > On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n> > > > 2014-09-29 11:05, Bruce Richardson:\n> > > > > On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > > > > > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > > > > > 3) There is no test associated with this PMD.\n> > > > > > That would have been a great comment to make a few months back, though whats\n> > > > > > wrong with testpmd here? That seems to be the same test that every other pmd\n> > > > > > uses. What exactly are you looking for?\n> > > > \n> > > > I was thinking of testing behaviour with different kernel configurations and\n> > > > unit tests for --vdev options. But it's not a major blocker.\n> > > > \n> > > Thats fine with me. If theres a set of unit tests that you have documentation\n> > > for, I'm sure we would be happy to run them. I presume you just want all the\n> > > pmd vdev option exercised? Any specific sets of kernel configurations?\n> > \n> > I don't really know which tests are needed. 
It could be a mix of unit tests\n> > and functionnal tests described in a test plan.\n> > The goal is to be able to validate the behaviour and check there is no\n> > regression. Ideally some corner cases could be described.\n> > I'm OK to integrate it as is. But future maintenance will probably need\n> > such inputs for validation tests.\n> > \n> Do you have an example set of tests that the other pmd's have followed for this?\n\nYou can check this:\n\thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_test_plan.rst\n\thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_bonded_test_plan.rst\n\nAs I said, we can integrate AF_PACKET PMD without such test plan.\nBut we are going to improve testing of many areas in DPDK.\n\n> > > > If RedHat is committed for its maintenance, it could integrated in release 1.8.\n> > > > But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> > > > pmd_packet.\n> > > > \n> > > John L. is on his way to plumbers at the moment, so is unable to comment, but\n> > > I'll try to get a few cycles to change the name of the PMD around. And yes, I\n> > > thought that maintenance was implicit. He's the author, of course he'll take\n> > > care of it :). 
And I'll be glad to help\n> > \n> > Do you have time in coming days to rebase and rename this PMD for inclusion\n> > in 1.8.0 release?\n\nDo you think a sub-tree with pull request model would help you for\nmaintenance of this PMD?", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id BBB5F7F00;\n\tThu, 13 Nov 2014 12:47:48 +0100 (CET)", "from mail-wi0-f179.google.com (mail-wi0-f179.google.com\n\t[209.85.212.179]) by dpdk.org (Postfix) with ESMTP id 2ACFE7EB3\n\tfor <dev@dpdk.org>; Thu, 13 Nov 2014 12:47:47 +0100 (CET)", "by mail-wi0-f179.google.com with SMTP id ex7so711130wid.6\n\tfor <dev@dpdk.org>; Thu, 13 Nov 2014 03:57:45 -0800 (PST)", "from xps13.localnet (136-92-190-109.dsl.ovh.fr. [109.190.92.136])\n\tby mx.google.com with ESMTPSA id\n\ts10sm29508731wjw.29.2014.11.13.03.57.42 for <multiple recipients>\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 13 Nov 2014 03:57:42 -0800 (PST)" ], "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20130820;\n\th=x-gm-message-state:from:to:cc:subject:date:message-id:organization\n\t:user-agent:in-reply-to:references:mime-version\n\t:content-transfer-encoding:content-type;\n\tbh=Ijtma7FC9aJwmhREu3FvdxHILi9kgezGhKSpFhrw8Kk=;\n\tb=TMstKZVXivcueHMi7b09ScpUJNRvPf8/UhUtFXQnWWneW4ap00lGitVmyMFmjzg3kB\n\tArJai49phkfg546cvJF13fbUTv46AG2EG5w6iJNedUgl3cfI/gkOVneYzafPb9Hgz4ex\n\t5wfHQ0lUjZSIcPYvwit+SfdTW3LOp8L9JeFZZqTLuExv7UPEctX2PL71mlpP54HS1+Pw\n\tOefGNni7klWIi0pXKIojJm5k0sqP7j4DIS+oIYpaUPxzE7GODfxiMcnSjRb9Y1OKFdta\n\tz/aeNO5z3ea3rt1P9FXV1KhLgZz9+QBkbuh8g4hNaqaCpt1zz2g2RjEyMh6XNN5UUdT8\n\tYSjg==", "X-Gm-Message-State": "ALoCoQkxOhs0hNbKQ3uQH9rIdRdYWW4M6qfvUqKaMHq8O6KuJqlKYGROLBlKvUQR64Nr7TnN1buR", "X-Received": "by 10.195.13.14 with SMTP id eu14mr3105461wjd.31.1415879865589; \n\tThu, 13 Nov 2014 
03:57:45 -0800 (PST)", "From": "Thomas Monjalon <thomas.monjalon@6wind.com>", "To": "Neil Horman <nhorman@tuxdriver.com>, John Linville <linville@redhat.com>", "Date": "Thu, 13 Nov 2014 12:57:25 +0100", "Message-ID": "<1958929.j3reG6p4bY@xps13>", "Organization": "6WIND", "User-Agent": "KMail/4.14.2 (Linux/3.17.2-1-ARCH; KDE/4.14.2; x86_64; ; )", "In-Reply-To": "<20141113111428.GA13253@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1898542.t3c6y266ZQ@xps13>\n\t<20141113111428.GA13253@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "7Bit", "Content-Type": "text/plain; charset=\"us-ascii\"", "Cc": "dev@dpdk.org", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2645, "web_url": "http://patches.dpdk.org/comment/2645/", "msgid": "<20141114004208.GA14230@localhost.localdomain>", "list_archive_url": "https://inbox.dpdk.org/dev/20141114004208.GA14230@localhost.localdomain", "date": "2014-11-14T00:42:08", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Thu, Nov 13, 2014 at 12:57:25PM +0100, 
Thomas Monjalon wrote:\n> 2014-11-13 06:14, Neil Horman:\n> > On Thu, Nov 13, 2014 at 02:03:18AM -0800, Thomas Monjalon wrote:\n> > > 2014-10-08 15:14, Neil Horman:\n> > > > On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n> > > > > 2014-09-29 11:05, Bruce Richardson:\n> > > > > > On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > > > > > > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > > > > > > 3) There is no test associated with this PMD.\n> > > > > > > That would have been a great comment to make a few months back, though whats\n> > > > > > > wrong with testpmd here? That seems to be the same test that every other pmd\n> > > > > > > uses. What exactly are you looking for?\n> > > > > \n> > > > > I was thinking of testing behaviour with different kernel configurations and\n> > > > > unit tests for --vdev options. But it's not a major blocker.\n> > > > > \n> > > > Thats fine with me. If theres a set of unit tests that you have documentation\n> > > > for, I'm sure we would be happy to run them. I presume you just want all the\n> > > > pmd vdev option exercised? Any specific sets of kernel configurations?\n> > > \n> > > I don't really know which tests are needed. It could be a mix of unit tests\n> > > and functionnal tests described in a test plan.\n> > > The goal is to be able to validate the behaviour and check there is no\n> > > regression. Ideally some corner cases could be described.\n> > > I'm OK to integrate it as is. 
But future maintenance will probably need\n> > > such inputs for validation tests.\n> > > \n> > Do you have an example set of tests that the other pmd's have followed for this?\n> \n> You can check this:\n> \thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_test_plan.rst\n> \thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_bonded_test_plan.rst\n> \n> As I said, we can integrate AF_PACKET PMD without such test plan.\n> But we are going to improve testing of many areas in DPDK.\n> \nThank you, I'll take a look in the AM\n\n> > > > > If RedHat is committed for its maintenance, it could integrated in release 1.8.\n> > > > > But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> > > > > pmd_packet.\n> > > > > \n> > > > John L. is on his way to plumbers at the moment, so is unable to comment, but\n> > > > I'll try to get a few cycles to change the name of the PMD around. And yes, I\n> > > > thought that maintenance was implicit. He's the author, of course he'll take\n> > > > care of it :). 
And I'll be glad to help\n> > > \n> > > Do you have time in coming days to rebase and rename this PMD for inclusion\n> > > in 1.8.0 release?\n> \n> Do you think a sub-tree with pull request model would help you for\n> maintenance of this PMD?\n> \nI think thats a question for John to answer, but IMHO, I don't think the pmd\nwill have such patch volume that subtrees will be needed.\n\nNeil\n\n> -- \n> Thomas\n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id F3EEB7F25;\n\tFri, 14 Nov 2014 01:32:29 +0100 (CET)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id D1D387E75\n\tfor <dev@dpdk.org>; Fri, 14 Nov 2014 01:32:27 +0100 (CET)", "from [2001:470:8:a08:215:ff:fecc:4872] (helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1Xp4xo-0004xt-8k; Thu, 13 Nov 2014 19:42:24 -0500" ], "Date": "Thu, 13 Nov 2014 19:42:08 -0500", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "Thomas Monjalon <thomas.monjalon@6wind.com>", "Message-ID": "<20141114004208.GA14230@localhost.localdomain>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1898542.t3c6y266ZQ@xps13>\n\t<20141113111428.GA13253@hmsreliant.think-freely.org>\n\t<1958929.j3reG6p4bY@xps13>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<1958929.j3reG6p4bY@xps13>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "dev@dpdk.org, John Linville <linville@redhat.com>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": 
"list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2664, "web_url": "http://patches.dpdk.org/comment/2664/", "msgid": "<20141114144536.GC1893@tuxdriver.com>", "list_archive_url": "https://inbox.dpdk.org/dev/20141114144536.GC1893@tuxdriver.com", "date": "2014-11-14T14:45:36", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 26, "url": "http://patches.dpdk.org/api/people/26/?format=api", "name": "John W. Linville", "email": "linville@tuxdriver.com" }, "content": "On Thu, Nov 13, 2014 at 07:42:08PM -0500, Neil Horman wrote:\n> On Thu, Nov 13, 2014 at 12:57:25PM +0100, Thomas Monjalon wrote:\n> > 2014-11-13 06:14, Neil Horman:\n> > > On Thu, Nov 13, 2014 at 02:03:18AM -0800, Thomas Monjalon wrote:\n> > > > 2014-10-08 15:14, Neil Horman:\n> > > > > On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n\n<snip>\n\n> > > > > > If RedHat is committed for its maintenance, it could integrated in release 1.8.\n> > > > > > But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> > > > > > pmd_packet.\n> > > > > > \n> > > > > John L. is on his way to plumbers at the moment, so is unable to comment, but\n> > > > > I'll try to get a few cycles to change the name of the PMD around. And yes, I\n> > > > > thought that maintenance was implicit. He's the author, of course he'll take\n> > > > > care of it :). 
And I'll be glad to help\n> > > > \n> > > > Do you have time in coming days to rebase and rename this PMD for inclusion\n> > > > in 1.8.0 release?\n> > \n> > Do you think a sub-tree with pull request model would help you for\n> > maintenance of this PMD?\n> > \n> I think thats a question for John to answer, but IMHO, I don't think the pmd\n> will have such patch volume that subtrees will be needed.\n\nI haven't touched DPDK in a while, and I'm a bit busy...\n\nWhen would you need it for 1.8.0?\n\nJohn", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id CCF157F6C;\n\tFri, 14 Nov 2014 15:50:07 +0100 (CET)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 3AEF37F69\n\tfor <dev@dpdk.org>; Fri, 14 Nov 2014 15:50:06 +0100 (CET)", "from uucp by smtp.tuxdriver.com with local-rmail (Exim 4.63)\n\t(envelope-from <linville@tuxdriver.com>)\n\tid 1XpIM3-00010W-Me; Fri, 14 Nov 2014 10:00:07 -0500", "from linville-x1.hq.tuxdriver.com (localhost.localdomain\n\t[127.0.0.1])\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.6) with ESMTP id\n\tsAEEjbiI003214; Fri, 14 Nov 2014 09:45:37 -0500", "(from linville@localhost)\n\tby linville-x1.hq.tuxdriver.com (8.14.8/8.14.8/Submit) id\n\tsAEEjaCg003213; Fri, 14 Nov 2014 09:45:36 -0500" ], "Date": "Fri, 14 Nov 2014 09:45:36 -0500", "From": "\"John W. 
Linville\" <linville@tuxdriver.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Message-ID": "<20141114144536.GC1893@tuxdriver.com>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1898542.t3c6y266ZQ@xps13>\n\t<20141113111428.GA13253@hmsreliant.think-freely.org>\n\t<1958929.j3reG6p4bY@xps13>\n\t<20141114004208.GA14230@localhost.localdomain>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20141114004208.GA14230@localhost.localdomain>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "Cc": "dev@dpdk.org, John Linville <linville@redhat.com>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2725, "web_url": "http://patches.dpdk.org/comment/2725/", "msgid": "<20141117111919.GA17886@hmsreliant.think-freely.org>", "list_archive_url": "https://inbox.dpdk.org/dev/20141117111919.GA17886@hmsreliant.think-freely.org", "date": "2014-11-17T11:19:19", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 32, "url": "http://patches.dpdk.org/api/people/32/?format=api", "name": "Neil Horman", "email": "nhorman@tuxdriver.com" }, "content": "On Thu, Nov 13, 2014 at 12:57:25PM +0100, Thomas Monjalon wrote:\n> 2014-11-13 06:14, Neil 
Horman:\n> > On Thu, Nov 13, 2014 at 02:03:18AM -0800, Thomas Monjalon wrote:\n> > > 2014-10-08 15:14, Neil Horman:\n> > > > On Wed, Oct 08, 2014 at 05:57:46PM +0200, Thomas Monjalon wrote:\n> > > > > 2014-09-29 11:05, Bruce Richardson:\n> > > > > > On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:\n> > > > > > > On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:\n> > > > > > > > 3) There is no test associated with this PMD.\n> > > > > > > That would have been a great comment to make a few months back, though whats\n> > > > > > > wrong with testpmd here? That seems to be the same test that every other pmd\n> > > > > > > uses. What exactly are you looking for?\n> > > > > \n> > > > > I was thinking of testing behaviour with different kernel configurations and\n> > > > > unit tests for --vdev options. But it's not a major blocker.\n> > > > > \n> > > > Thats fine with me. If theres a set of unit tests that you have documentation\n> > > > for, I'm sure we would be happy to run them. I presume you just want all the\n> > > > pmd vdev option exercised? Any specific sets of kernel configurations?\n> > > \n> > > I don't really know which tests are needed. It could be a mix of unit tests\n> > > and functionnal tests described in a test plan.\n> > > The goal is to be able to validate the behaviour and check there is no\n> > > regression. Ideally some corner cases could be described.\n> > > I'm OK to integrate it as is. But future maintenance will probably need\n> > > such inputs for validation tests.\n> > > \nApologies for the delay on this, its been a busy time lately.\n\n> > Do you have an example set of tests that the other pmd's have followed for this?\n> \n> You can check this:\n> \thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_test_plan.rst\n> \thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_bonded_test_plan.rst\n> \nLooking at this, the pmd_test_plan above seems perfectly applicable to Johns\npmd. 
Did you feel as though additional tests were needed for a virtual pmd\n(aside from a note describing the additional --vdev parameter required for\nvirtual device setup)?\n\nI'll have a renamed device pmd patch up later today.\n\nNeil\n\n> As I said, we can integrate AF_PACKET PMD without such test plan.\n> But we are going to improve testing of many areas in DPDK.\n> \n> > > > > If RedHat is committed for its maintenance, it could be integrated in release 1.8.\n> > > > > But I'd like it to be renamed as pmd_af_packet (or a better name) instead of\n> > > > > pmd_packet.\n> > > > > \n> > > > John L. is on his way to plumbers at the moment, so is unable to comment, but\n> > > > I'll try to get a few cycles to change the name of the PMD around. And yes, I\n> > > > thought that maintenance was implicit. He's the author, of course he'll take\n> > > > care of it :). And I'll be glad to help\n> > > \n> > > Do you have time in coming days to rebase and rename this PMD for inclusion\n> > > in 1.8.0 release?\n> \n> Do you think a sub-tree with pull request model would help you for\n> maintenance of this PMD?\n> \n> -- \n> Thomas\n>", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id 724437FEA;\n\tMon, 17 Nov 2014 12:09:20 +0100 (CET)", "from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])\n\tby dpdk.org (Postfix) with ESMTP id 2C6A17FD5\n\tfor <dev@dpdk.org>; Mon, 17 Nov 2014 12:09:18 +0100 (CET)", "from hmsreliant.think-freely.org\n\t([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)\n\tby smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)\n\t(envelope-from <nhorman@tuxdriver.com>)\n\tid 1XqKL2-0005uV-Lr; Mon, 17 Nov 2014 06:19:29 -0500" ], "Date": "Mon, 17 Nov 2014 06:19:19 -0500", "From": "Neil Horman <nhorman@tuxdriver.com>", "To": "Thomas Monjalon 
<thomas.monjalon@6wind.com>", "Message-ID": "<20141117111919.GA17886@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1898542.t3c6y266ZQ@xps13>\n\t<20141113111428.GA13253@hmsreliant.think-freely.org>\n\t<1958929.j3reG6p4bY@xps13>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<1958929.j3reG6p4bY@xps13>", "User-Agent": "Mutt/1.5.23 (2014-03-12)", "X-Spam-Score": "-2.9 (--)", "X-Spam-Status": "No", "Cc": "dev@dpdk.org, John Linville <linville@redhat.com>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": "<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null }, { "id": 2727, "web_url": "http://patches.dpdk.org/comment/2727/", "msgid": "<1536588.MUxmeFqKeH@xps13>", "list_archive_url": "https://inbox.dpdk.org/dev/1536588.MUxmeFqKeH@xps13", "date": "2014-11-17T11:22:13", "subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "submitter": { "id": 1, "url": "http://patches.dpdk.org/api/people/1/?format=api", "name": "Thomas Monjalon", "email": "thomas.monjalon@6wind.com" }, "content": "2014-11-17 06:19, Neil Horman:\n> On Thu, Nov 13, 2014 at 12:57:25PM +0100, Thomas Monjalon wrote:\n> > 2014-11-13 06:14, Neil Horman:\n> > > Do you have an example set of tests that the other pmd's 
have followed for this?\n> > \n> > You can check this:\n> > \thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_test_plan.rst\n> > \thttp://dpdk.org/browse/tools/dts/tree/test_plans/pmd_bonded_test_plan.rst\n> > \n> Looking at this, the pmd_test_plan above seems perfectly applicable to John's\n> pmd. Did you feel as though additional tests were needed for a virtual pmd\n> (aside from a note describing the additional --vdev parameter required for\n> virtual device setup)?\n\nIt's maybe sufficient. I didn't dig enough. We'll see whether some people need\nmore for validation.\n\n> I'll have a renamed device pmd patch up later today.\n\nExcellent.\n\nThanks", "headers": { "Return-Path": "<dev-bounces@dpdk.org>", "X-Original-To": "patchwork@dpdk.org", "Delivered-To": "patchwork@dpdk.org", "Received": [ "from [92.243.14.124] (localhost [IPv6:::1])\n\tby dpdk.org (Postfix) with ESMTP id C65888006;\n\tMon, 17 Nov 2014 12:12:22 +0100 (CET)", "from mail-wi0-f180.google.com (mail-wi0-f180.google.com\n\t[209.85.212.180]) by dpdk.org (Postfix) with ESMTP id AA75F8001\n\tfor <dev@dpdk.org>; Mon, 17 Nov 2014 12:12:19 +0100 (CET)", "by mail-wi0-f180.google.com with SMTP id n3so1308494wiv.7\n\tfor <dev@dpdk.org>; Mon, 17 Nov 2014 03:22:36 -0800 (PST)", "from xps13.localnet (guy78-3-82-239-227-177.fbx.proxad.net.\n\t[82.239.227.177]) by mx.google.com with ESMTPSA id\n\tn4sm14825006wiz.17.2014.11.17.03.22.34 for <multiple recipients>\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tMon, 17 Nov 2014 03:22:35 -0800 (PST)" ], "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20130820;\n\th=x-gm-message-state:from:to:cc:subject:date:message-id:organization\n\t:user-agent:in-reply-to:references:mime-version\n\t:content-transfer-encoding:content-type;\n\tbh=uy9Je72jikHDoc3FaprYl7aWrV4cgf6aRa0ZcC4sIuU=;\n\tb=d8C+HgvUl5+Q1l9/M8bIwf6INgd5cNz79f1He8llVKFt+BjmuhMs9S4ZG5GiPellgQ\n\txrWYrUAYjsEP3JRDNrDeayJ2+EZxygWGkeLU8hD3+iBcMTv8/4ONdLN6Ny+5MRyfH8/t\n\tbD+plLcrI31thZlGfjFX6YWr1wpSosyfADsgJEXDNYohXxJrnD5wsbDvZmNI9zGZBInq\n\t5J3ezQ82lJLIDfClw+yXsmin+/9DgMwBO4w5DkNQwC/6aIbZsGD5ultWBh9qZtC4cQUK\n\tNxk2zUtXqbH8tj4cX3gd1zJn/ZwGC4VktxmHaM2DGVRNKnB9EKVtItjmFhxyXkX03SiP\n\tJmNw==", "X-Gm-Message-State": "ALoCoQkv/Aiw+NDqZd1Dk5sJYcWXbU6xOeN8xsWVitC0GW+AHbyOaA7Le9fIpDl6iVIjo1IQIpcf", "X-Received": "by 10.181.13.80 with SMTP id ew16mr30443970wid.47.1416223355986; \n\tMon, 17 Nov 2014 03:22:35 -0800 (PST)", "From": "Thomas Monjalon <thomas.monjalon@6wind.com>", "To": "Neil Horman <nhorman@tuxdriver.com>", "Date": "Mon, 17 Nov 2014 12:22:13 +0100", "Message-ID": "<1536588.MUxmeFqKeH@xps13>", "Organization": "6WIND", "User-Agent": "KMail/4.14.2 (Linux/3.17.2-1-ARCH; KDE/4.14.2; x86_64; ; )", "In-Reply-To": "<20141117111919.GA17886@hmsreliant.think-freely.org>", "References": "<1405024369-30058-1-git-send-email-linville@tuxdriver.com>\n\t<1958929.j3reG6p4bY@xps13>\n\t<20141117111919.GA17886@hmsreliant.think-freely.org>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "7Bit", "Content-Type": "text/plain; charset=\"us-ascii\"", "Cc": "dev@dpdk.org, John Linville <linville@redhat.com>", "Subject": "Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for\n\tAF_PACKET-based virtual devices", "X-BeenThere": "dev@dpdk.org", "X-Mailman-Version": "2.1.15", "Precedence": "list", "List-Id": "patches and discussions about DPDK <dev.dpdk.org>", "List-Unsubscribe": "<http://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>", "List-Archive": "<http://dpdk.org/ml/archives/dev/>", "List-Post": "<mailto:dev@dpdk.org>", "List-Help": 
"<mailto:dev-request@dpdk.org?subject=help>", "List-Subscribe": "<http://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>", "Errors-To": "dev-bounces@dpdk.org", "Sender": "\"dev\" <dev-bounces@dpdk.org>" }, "addressed": null } ]