[RFC,v5] /net: memory interface (memif)

Message ID 20190322115727.4358-1-jgrajcia@cisco.com
State Superseded, archived
Series
  • [RFC,v5] /net: memory interface (memif)

Checks

Context               Check    Description
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         warning  coding style issues

Commit Message

Memory interface (memif) provides high-performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  200 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   28 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1104 ++++++++++++++++++
 drivers/net/memif/memif_socket.h            |  104 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1132 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  203 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3001 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

Comments

Ferruh Yigit March 25, 2019, 8:58 p.m. | #1
On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

<...>

> @@ -0,0 +1,200 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> +
> +======================
> +Memif Poll Mode Driver
> +======================
> +Shared memory packet interface (memif) PMD allows for DPDK and any other client
> +using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
> +Linux only.
> +
> +The created device transmits packets in a raw format. It can be used with
> +Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
> +supported in DPDK memif implementation.
> +
> +Memif works in two roles: master and slave. Slave connects to master over an
> +existing socket. It is also a producer of shared memory file and initializes
> +the shared memory. Master creates the socket and listens for any slave
> +connection requests. The socket may already exist on the system. Be sure to
> +remove any such sockets, if you are creating a master interface, or you will
> +see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,

Would it be possible to remove this existing socket on successful termination of
the DPDK application with the 'master' role?
Otherwise, the socket file has to be deleted before each run of a DPDK app.

> +which removes memif interface, will also remove a listener socket, if it is
> +not being used by any other interface.
> +
> +The method to enable one or more interfaces is to use the
> +``--vdev=net_memif0`` option on the DPDK application command line. Each
> +``--vdev=net_memif1`` option given will create an interface named net_memif0,
> +net_memif1, and so on. Memif uses unix domain socket to transmit control
> +messages. Each memif has a unique id per socket. If you are connecting multiple
> +interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
> +etc. Note that if you assign a socket to a master interface it becomes a
> +listener socket. Listener socket can not be used by a slave interface on same
> +client.

When I connect two slaves, id=0 & id=1, to the master, both the master and the
second slave crash as soon as the second slave connects. Is this a known issue?
Can you please check this?

master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master -- -i
slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev net_memif0,role=slave,id=0 -- -i
slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev net_memif0,role=slave,id=1 -- -i

<...>

> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.

How does this play with multi-process support? When I run a secondary app while
the memif PMD is enabled in the primary, the secondary process crashes.
Can you please check and ensure that at least nothing crashes when a secondary
app runs?

> +
> +First create ``master`` interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Would it be useful to add a loopback option to the PMD, for testing/debug?

Also, I am getting low performance numbers with the setup above, compared to the
ring PMD for example. Is there any performance target for the PMD?
With the same forwarding core for both testpmds it is ~40Kpps, which is terrible;
either something is wrong or I am doing something wrong.
With different forwarding cores for each testpmd it is still low, ~18Mpps.

<...>

> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API

Can you please add a comment here listing which experimental APIs are called?
This helps to keep track of them in the long term.

<...>

> +/**
> + * Buffer descriptor.
> + */
> +typedef struct __rte_packed {
> +	uint16_t flags;				/**< flags */
> +#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */

Is this define used?

<...>

> +
> +static struct rte_vdev_driver pmd_memif_drv;

Same comment as in the previous review: is this forward declaration required?

<...>

> +static memif_ring_t *
> +memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
> +{
> +	/* rings only in region 0 */
> +	void *p = pmd->regions[0].addr;
> +	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
> +	    (1 << pmd->run.log2_ring_size);
> +
> +	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;

According to the above code, I guess the layout is as follows [1]. Can you
please correct it if it is wrong?

Can you please put this information somewhere, possibly in the comment of the
function that allocates it, so that not everyone needs to figure it out again?


[1]

region 0:
+-----------------------+
| S2M rings | M2S rings |
+-----------------------+

S2M OR M2S Rings:
+-----------------------------------------+
| ring 0 | ring 1 | ring num_s2m_rings - 1|
+-----------------------------------------+

ring 0:
+-----------------------------------------------------+
| Ring Header | (1 << pmd->run.log2_ring_size) * desc |
+-----------------------------------------------------+


region 1:
+-------------------------------------------------------------------------+
| packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
+-------------------------------------------------------------------------+
Is there any order on packet buffers?
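
A minimal sketch of the offset arithmetic implied by the layout guessed above, for readers who want to check it by hand. The header and descriptor sizes are hypothetical placeholders (the driver uses sizeof(memif_ring_t) and sizeof(memif_desc_t)), and the enum values assume S2M = 0 and M2S = 1, which is what the quoted expression `ring_num + type * num_s2m_rings` needs in order to place S2M rings first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; the driver would use sizeof() on its own structures. */
#define RING_HDR_SIZE 128u
#define DESC_SIZE      16u

/* Assumed ordering: S2M rings are laid out before M2S rings. */
enum ring_type { RING_S2M = 0, RING_M2S = 1 };

/* Bytes taken by one ring: header plus (1 << log2_ring_size) descriptors. */
static size_t
ring_size(uint8_t log2_ring_size)
{
	assert(log2_ring_size < 16);
	return RING_HDR_SIZE + DESC_SIZE * ((size_t)1 << log2_ring_size);
}

/* Byte offset of a ring from the start of region 0. */
static size_t
ring_offset(enum ring_type type, uint16_t ring_num, uint16_t num_s2m_rings,
	    uint8_t log2_ring_size)
{
	return ((size_t)ring_num + (size_t)type * num_s2m_rings) *
		ring_size(log2_ring_size);
}
```

With two S2M rings of 1024 slots each, ring_offset(RING_M2S, 0, 2, 10) lands directly after the second S2M ring, which matches the "S2M rings | M2S rings" picture.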

<...>

> +	while (n_slots && n_rx_pkts < nb_pkts) {
> +		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> +		if (unlikely(mbuf_head == NULL))
> +			goto no_free_bufs;
> +		mbuf = mbuf_head;
> +		mbuf->port = mq->in_port;
> +
> + next_slot:
> +		s0 = cur_slot & mask;
> +		d0 = &ring->desc[s0];
> +
> +		src_len = d0->length;
> +		dst_off = 0;
> +		src_off = 0;
> +
> +		do {
> +			dst_len = mbuf_size - dst_off;
> +			if (dst_len == 0) {
> +				dst_off = 0;
> +				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
> +
> +				mbuf = rte_pktmbuf_alloc(mq->mempool);
> +				if (unlikely(mbuf == NULL))
> +					goto no_free_bufs;
> +				mbuf->port = mq->in_port;
> +				rte_pktmbuf_chain(mbuf_head, mbuf);
> +			}
> +			cp_len = RTE_MIN(dst_len, src_len);
> +
> +			rte_pktmbuf_pkt_len(mbuf) =
> +			    rte_pktmbuf_data_len(mbuf) += cp_len;

Can you please split this into two statements to prevent confusion?

Also, shouldn't 'cp_len' be added to 'mbuf_head->pkt_len'?
'rte_pktmbuf_chain' updates 'mbuf_head', but at that stage 'mbuf->pkt_len' is not
yet set to 'cp_len'...
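
One possible reading of this request, sketched with a toy structure rather than the real rte_mbuf: the per-segment length and the whole-packet length are updated in two separate statements, and the packet length is accumulated on the head segment. The struct and function names here are hypothetical and only mirror the two fields under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the two mbuf length fields discussed above. */
struct seg {
	uint32_t pkt_len;   /* total length of the chain, kept on the head */
	uint16_t data_len;  /* bytes stored in this segment only */
	struct seg *next;
};

/* Account for cp_len bytes copied into segment 'cur' of chain 'head'. */
static void
account_copy(struct seg *head, struct seg *cur, uint16_t cp_len)
{
	assert(head != NULL && cur != NULL);
	cur->data_len += cp_len;  /* per-segment length */
	head->pkt_len += cp_len;  /* whole-packet length lives on the head */
}
```

For a single-segment packet head and cur are the same segment, so both fields end up equal, as in the original chained assignment.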

<...>

> +void
> +memif_free_regions(struct pmd_internals *pmd)
> +{
> +	int i;
> +	struct memif_region *r;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		r = pmd->regions + i;
> +		if (r == NULL)
> +			return;

'r' can be NULL only when 'pmd->regions' is NULL and 'i == 0', so would it be
better to check 'pmd->regions' for NULL before the loop?

> +		if (r->addr == NULL)
> +			return;
> +		munmap(r->addr, r->region_size);
> +		if (r->fd > 0) {
> +			close(r->fd);
> +			r->fd = -1;
> +		}
> +	}
> +	rte_free(pmd->regions);
> +}
> +
> +static int
> +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
> +{
> +	struct memif_region *r;
> +	char shm_name[32];
> +	int i;
> +	int ret = 0;
> +
> +	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
> +	if (r == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to allocate regions.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -ENOMEM;
> +	}
> +
> +	pmd->regions = r;
> +	pmd->regions_num = brn + 1;
> +
> +	/*
> +	 * Create shm for every region. Region 0 is reserved for descriptors.
> +	 * Other regions contain buffers.
> +	 */
> +	for (i = 0; i < (brn + 1); i++) {
> +		r = &pmd->regions[i];
> +
> +		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> +					       pmd->run.num_m2s_rings) *
> +		    (sizeof(memif_ring_t) +
> +		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;

For complex operations, can you please prefer a regular if check over the
ternary? Even with the help of the formatting, this is hard to follow.

What exactly is "buffer_offset"? For 'region 0' it calculates the size of
'region 0', otherwise it is 0. Where does this offset start from? And it seems
to be used only for the size assignment below.

> +		r->region_size = (i == 0) ? r->buffer_offset :
> +		    (uint32_t)(pmd->run.buffer_size *

I guess 'buffer_size' is the buffer size per packet. To make this clear, what do
you think about renaming it to 'packet_buffer_size'?

> +				(1 << pmd->run.log2_ring_size) *
> +				(pmd->run.num_s2m_rings +
> +				 pmd->run.num_m2s_rings));

This gives the illusion that packet buffers can live in multiple regions (the
comment above implies it), but the logic assumes that all packet buffers are in a
single region other than region 0, which gives us 'region 1', and this is already
hardcoded in a few places. Is there a benefit in assuming there will be more
regions? Would it be simpler to accept that 'region 0' is for rings and 'region 1'
is for packet buffers?
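
As an illustration of the "regular if instead of ternary" request, the two size computations quoted above could be written roughly as follows. This is only a sketch: struct run_cfg is a hypothetical stand-in for the pmd->run fields, and the ring header and descriptor sizes are passed in as parameters instead of being taken from sizeof(memif_ring_t) and sizeof(memif_desc_t).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the pmd->run fields used in the quoted code. */
struct run_cfg {
	uint8_t  log2_ring_size;
	uint16_t num_s2m_rings;
	uint16_t num_m2s_rings;
	uint16_t buffer_size;   /* size of one packet buffer */
};

/* Size of region i: region 0 holds the rings, other regions hold buffers. */
static size_t
region_size(const struct run_cfg *run, int i, size_t ring_hdr_size,
	    size_t desc_size)
{
	size_t slots, rings;

	assert(run != NULL);
	slots = (size_t)1 << run->log2_ring_size;
	rings = (size_t)run->num_s2m_rings + run->num_m2s_rings;

	if (i == 0)
		return rings * (ring_hdr_size + desc_size * slots);

	return (size_t)run->buffer_size * slots * rings;
}
```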

> +
> +		memset(shm_name, 0, sizeof(char) * 32);
> +		sprintf(shm_name, "memif region %d", i);

snprintf please, and can you please add a define for shm_name size?
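
A small sketch of what this could look like. MEMIF_SHM_NAME_SIZE and format_shm_name() are hypothetical names chosen for illustration; only the "memif region %d" format string and the size 32 come from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name for the buffer size; the patch hardcodes 32. */
#define MEMIF_SHM_NAME_SIZE 32

/* Format the region name; returns 0 on success, -1 on error or truncation. */
static int
format_shm_name(char *buf, size_t len, int region_idx)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "memif region %d", region_idx);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Since snprintf() always NUL-terminates when the size is non-zero, the separate memset() of the buffer is no longer needed in this form.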

<...>

> +static void
> +memif_init_rings(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_ring_t *ring;
> +	int i, j;
> +	uint16_t slot;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
> +		ring->head = 0;
> +		ring->tail = 0;
> +		ring->cookie = MEMIF_COOKIE;
> +		ring->flags = 0;
> +		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
> +			slot = i * (1 << pmd->run.log2_ring_size) + j;
> +			ring->desc[j].region = 1;

Why is region 1 hardcoded for buffers? Would it be possible to use multiple
regions for buffers?

<...>

> +int
> +memif_connect(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_region *mr;
> +	struct memif_queue *mq;
> +	int i;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		mr = pmd->regions + i;
> +		if (mr != NULL) {
> +			if (mr->addr == NULL) {
> +				if (mr->fd < 0)
> +					return -1;
> +				mr->addr = mmap(NULL, mr->region_size,
> +						PROT_READ | PROT_WRITE,
> +						MAP_SHARED, mr->fd, 0);
> +				if (mr->addr == NULL)
> +					return -1;
> +			}
> +		}
> +	}

For one master to work with multiple slaves, there should be an array of regions,
one for each slave, right?
The same concern applies to the Rx/Tx queues: master and slave use the same
dev->data->tx_queues / dev->data->rx_queues, only crossed over. I think more
complex logic is required for multiple-slave support.

If multiple slaves are not supported, is the 'id' devarg still valid, or is it
used at all?

<...>

> +static int
> +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> +	     memif_interface_id_t id, uint32_t flags,
> +	     const char *socket_filename,
> +	     memif_log2_ring_size_t log2_ring_size,
> +	     uint16_t buffer_size, const char *secret,
> +	     struct ether_addr *eth_addr)
> +{
> +	int ret = 0;
> +	struct rte_eth_dev *eth_dev;
> +	struct rte_eth_dev_data *data;
> +	struct pmd_internals *pmd;
> +	const unsigned int numa_node = vdev->device.numa_node;
> +	const char *name = rte_vdev_device_name(vdev);
> +
> +	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +		MIF_LOG(ERR, "Zero-copy not supported.");
> +		return -1;
> +	}

What is the plan for zero copy?
Can you please put the status & plan in the documentation?

<...>

> +struct memif_queue {
> +	struct rte_mempool *mempool;		/**< mempool for RX packets */
> +	uint16_t in_port;			/**< port id */
> +
> +	struct pmd_internals *pmd;		/**< device internals */
> +
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +
> +	/* ring info */
> +	memif_ring_type_t type;			/**< ring type */
> +	memif_ring_t *ring;			/**< pointer to ring */
> +	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
> +
> +	memif_region_index_t region;		/**< shared memory region index */
> +	memif_region_offset_t offset;		/**< offset at which the queue begins */

Offset from where? It is from the start of the region, but I think it is better
to say so in the comment.

> +
> +	uint16_t last_head;			/**< last ring head */
> +	uint16_t last_tail;			/**< last ring tail */

The ring already has head and tail variables; how do the queue ones differ?
Honnappa Nagarahalli May 3, 2019, 4:27 a.m. | #2
> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> > Memory interface (memif), provides high performance packet transfer
> > over shared memory.
> >
> > Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> 
> <...>
> 
> > @@ -0,0 +1,200 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> > +
> > +======================
> > +Memif Poll Mode Driver
> > +======================
> > +Shared memory packet interface (memif) PMD allows for DPDK and any
> > +other client using memif (DPDK, VPP, libmemif) to communicate using
> > +shared memory. Memif is Linux only.
> > +
> > +The created device transmits packets in a raw format. It can be used
> > +with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
> > +Ethernet mode is supported in DPDK memif implementation.
> > +
> > +Memif works in two roles: master and slave. Slave connects to master
> > +over an existing socket. It is also a producer of shared memory file
> > +and initializes the shared memory. Master creates the socket and
> > +listens for any slave connection requests. The socket may already
> > +exist on the system. Be sure to remove any such sockets, if you are
> > +creating a master interface, or you will see an "Address already in
> > +use" error. Function ``rte_pmd_memif_remove()``,
> 
> Can it be possible to remove this existing socket on successfully termination
> of the dpdk application with 'master' role?
> Otherwise each time to run a dpdk app, requires to delete the socket file first.
> 
> 
>     net_memif is the same case as net_virtio_user; I'd use the same
>     workaround, see testpmd.c:pmd_test_exit(). This, however, is only
>     valid for testpmd.
> 
> 
> > +which removes memif interface, will also remove a listener socket, if
> > +it is not being used by any other interface.
> > +
> > +The method to enable one or more interfaces is to use the
> > +``--vdev=net_memif0`` option on the DPDK application command line.
> > +Each ``--vdev=net_memif1`` option given will create an interface
> > +named net_memif0, net_memif1, and so on. Memif uses unix domain
> > +socket to transmit control messages. Each memif has a unique id per
> > +socket. If you are connecting multiple interfaces using same socket,
> > +be sure to specify unique ids ``id=0``, ``id=1``, etc. Note that if
> > +you assign a socket to a master interface it becomes a listener
> > +socket. Listener socket can not be used by a slave interface on same client.
> 
> When I connect two slaves, id=0 & id=1 to the master, both master and
> second slave crashed as soon as second slave connected, is this a know issue?
> Can you please check this?
> 
> master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master
> -- -i
> slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
> net_memif0,role=slave,id=0 -- -i
> slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
> net_memif0,role=slave,id=1 -- -i
> 
> <...>
> 
> 
>     Each interface can be connected to only one peer interface at a time.
>     I'll add more details to the documentation.
> 
> 
> > +Example: testpmd and testpmd
> > +----------------------------
> > +In this example we run two instances of testpmd application and transmit
> packets over memif.
> 
> How this play with multi process support? When I run a secondary app when
> memif PMD is enabled in primary, secondary process crashes.
> Can you please check and ensure at least nothing crashes when a secondary
> app run?
> 
> 
>     For now I'd disable multi-process support, so we can get the patch
>     applied, then provide multi-process support in a separate patch.
> 
> 
> > +
> > +First create ``master`` interface::
> > +
> > +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1
> > + --vdev=net_memif,role=master -- -i
> > +
> > +Now create ``slave`` interface (master must be already running so the
> slave will connect)::
> > +
> > +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2
> > + --vdev=net_memif -- -i
> > +
> > +Set forwarding mode in one of the instances to 'rx only' and the other to
> 'tx only'::
> > +
> > +    testpmd> set fwd rxonly
> > +    testpmd> start
> > +
> > +    testpmd> set fwd txonly
> > +    testpmd> start
> 
> Would it be useful to add loopback option to the PMD, for testing/debug ?
> 
> Also I am getting low performance numbers above, comparing the ring pmd
> for example, is there any performance target for the pmd?
> Same forwarding core for both testpmds, this is terrible, either something is
> wrong or I am doing something wrong, it is ~40Kpps Different forwarding
> cores for each testpmd, still low, !18Mpps
> 
> <...>
> 
> 
>     The difference between the ring PMD and the memif PMD is that while
>     memif transfers packets, ring transfers whole buffers, which means
>     that ring does not have to alloc/free buffers. (Also, ring cannot be
>     used process to process.)
>     I did a simple test where I modified the code so that the tx function
>     only frees the given buffers and rx only allocates new buffers. On my
>     machine, this can only handle 45-50Mpps.
>
>     With that in mind, I believe that 23Mpps is fine performance. No
>     performance target is defined; the goal is to be as fast as possible.
The use of C11 atomics has proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used at least to implement the fast-path functions. This ensures optimal performance on all architectures supported by DPDK.

> 
>     The cause of the issue, where traffic drops to 40Kpps when the same
>     core is used for both applications, is that within the timeslice given
>     to the testpmd process, the memif driver fills its queues but keeps
>     polling, not giving the other process a chance to receive the packets.
>     The same applies to the rx side, where an empty queue is polled over
>     and over again. In such a configuration, interrupt rx-mode should be
>     used instead of polling, or the application could suspend the port.
> 
> 
> > +/**
> > + * Buffer descriptor.
> > + */
> > +typedef struct __rte_packed {
> > +     uint16_t flags;                         /**< flags */
> > +#define MEMIF_DESC_FLAG_NEXT 1                       /**< is chained buffer */
> 
> Is this define used?
> 
> <...>
> 
> 
>      Yes, see eth_memif_tx() and eth_memif_rx()
> 
> 
> > +     while (n_slots && n_rx_pkts < nb_pkts) {
> > +             mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> > +             if (unlikely(mbuf_head == NULL))
> > +                     goto no_free_bufs;
> > +             mbuf = mbuf_head;
> > +             mbuf->port = mq->in_port;
> > +
> > + next_slot:
> > +             s0 = cur_slot & mask;
> > +             d0 = &ring->desc[s0];
> > +
> > +             src_len = d0->length;
> > +             dst_off = 0;
> > +             src_off = 0;
> > +
> > +             do {
> > +                     dst_len = mbuf_size - dst_off;
> > +                     if (dst_len == 0) {
> > +                             dst_off = 0;
> > +                             dst_len = mbuf_size +
> > + RTE_PKTMBUF_HEADROOM;
> > +
> > +                             mbuf = rte_pktmbuf_alloc(mq->mempool);
> > +                             if (unlikely(mbuf == NULL))
> > +                                     goto no_free_bufs;
> > +                             mbuf->port = mq->in_port;
> > +                             rte_pktmbuf_chain(mbuf_head, mbuf);
> > +                     }
> > +                     cp_len = RTE_MIN(dst_len, src_len);
> > +
> > +                     rte_pktmbuf_pkt_len(mbuf) =
> > +                         rte_pktmbuf_data_len(mbuf) += cp_len;
> 
> Can you please make this two lines to prevent confusion?
> 
> Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
> 'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is
> not set to 'cp_len'...
> 
> <...>
> 
> 
>     Fix in next patch.
> 
> 
> > +             if (r->addr == NULL)
> > +                     return;
> > +             munmap(r->addr, r->region_size);
> > +             if (r->fd > 0) {
> > +                     close(r->fd);
> > +                     r->fd = -1;
> > +             }
> > +     }
> > +     rte_free(pmd->regions);
> > +}
> > +
> > +static int
> > +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn) {
> > +     struct memif_region *r;
> > +     char shm_name[32];
> > +     int i;
> > +     int ret = 0;
> > +
> > +     r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1),
> 0);
> > +     if (r == NULL) {
> > +             MIF_LOG(ERR, "%s: Failed to allocate regions.",
> > +                     rte_vdev_device_name(pmd->vdev));
> > +             return -ENOMEM;
> > +     }
> > +
> > +     pmd->regions = r;
> > +     pmd->regions_num = brn + 1;
> > +
> > +     /*
> > +      * Create shm for every region. Region 0 is reserved for descriptors.
> > +      * Other regions contain buffers.
> > +      */
> > +     for (i = 0; i < (brn + 1); i++) {
> > +             r = &pmd->regions[i];
> > +
> > +             r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> > +                                            pmd->run.num_m2s_rings) *
> > +                 (sizeof(memif_ring_t) +
> > +                  sizeof(memif_desc_t) * (1 <<
> > + pmd->run.log2_ring_size)) : 0;
> 
> what exactly "buffer_offset" is? For 'region 0' it calculates the size of the size
> of the 'region 0', otherwise 0. This is offset starting from where? And it seems
> only used for below size assignment.
> 
> 
>     It is possible to use a single region for one connection
>     (non-zero-copy) to reduce the number of open files. This will be
>     implemented in the next patch. 'buffer_offset' is the offset from the
>     start of the shared memory file to the first packet buffer.
> 
> 
> > +                             (1 << pmd->run.log2_ring_size) *
> > +                             (pmd->run.num_s2m_rings +
> > +                              pmd->run.num_m2s_rings));
> 
> There is an illusion of packet buffers can be in multiple regions (above
> comment implies it) but this logic assumes all packet buffer are in same
> region other than region 0, which gives us 'region 1' and this is already
> hardcoded in a few places. Is there a benefit of assuming there will be more
> regions, will it make simple to accept 'region 0' for rings and 'region 1' is for
> packet buffer?
> 
> 
>     Multiple regions are required in the case of a zero-copy slave.
>     Zero-copy will expose DPDK shared memory, where each
>     memseg/memseg_list will be represented by a memif_region.
> 
> 
> > +int
> > +memif_connect(struct rte_eth_dev *dev) {
> > +     struct pmd_internals *pmd = dev->data->dev_private;
> > +     struct memif_region *mr;
> > +     struct memif_queue *mq;
> > +     int i;
> > +
> > +     for (i = 0; i < pmd->regions_num; i++) {
> > +             mr = pmd->regions + i;
> > +             if (mr != NULL) {
> > +                     if (mr->addr == NULL) {
> > +                             if (mr->fd < 0)
> > +                                     return -1;
> > +                             mr->addr = mmap(NULL, mr->region_size,
> > +                                             PROT_READ | PROT_WRITE,
> > +                                             MAP_SHARED, mr->fd, 0);
> > +                             if (mr->addr == NULL)
> > +                                     return -1;
> > +                     }
> > +             }
> > +     }
> 
> If multiple slave is not supported, is 'id' devarg still valid, or is it used at all?
> 
> 
>     'id' is used to identify the peer interface. An interface with id=0
>     will only connect to an interface with id=0, and so on...
> 
> 
> <...>
> 
> > +static int
> > +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> > +          memif_interface_id_t id, uint32_t flags,
> > +          const char *socket_filename,
> > +          memif_log2_ring_size_t log2_ring_size,
> > +          uint16_t buffer_size, const char *secret,
> > +          struct ether_addr *eth_addr) {
> > +     int ret = 0;
> > +     struct rte_eth_dev *eth_dev;
> > +     struct rte_eth_dev_data *data;
> > +     struct pmd_internals *pmd;
> > +     const unsigned int numa_node = vdev->device.numa_node;
> > +     const char *name = rte_vdev_device_name(vdev);
> > +
> > +     if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> > +             MIF_LOG(ERR, "Zero-copy not supported.");
> > +             return -1;
> > +     }
> 
> What is the plan for the zero copy?
> Can you please put the status & plan to the documentation?
> 
> <...>
> 
> 
>     I don't think that putting the status in the docs is needed. The
>     zero-copy slave is in the testing stage and I should be able to submit
>     a patch once we get this one applied.
> 
> 
> > +
> > +     uint16_t last_head;                     /**< last ring head */
> > +     uint16_t last_tail;                     /**< last ring tail */
> 
> ring already has head, tail variables, what is the difference in queue ones?
Damjan Marion (damarion) May 6, 2019, 11:04 a.m. | #3
> On 6 May 2019, at 13:00, Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) <jgrajcia@cisco.com> wrote:
> 
> 
> ________________________________________
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, May 3, 2019 6:27 AM
> To: Jakub Grajciar; Ferruh Yigit; dev@dpdk.org; Honnappa Nagarahalli
> Cc: nd; nd
> Subject: RE: [dpdk-dev] [RFC v5] /net: memory interface (memif)
> 
>> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
>>> Memory interface (memif), provides high performance packet transfer
>>> over shared memory.
>>> 
>>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
>> 
> 
> <...>
> 
>>    With that in mind, I believe that 23Mpps is fine performance. No
>> performance target is
>>    defined, the goal is to be as fast as possible.
> Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.
> 
>    Atomics are not required by memif driver.

Correct, the only thing we need is a store barrier once per batch of packets,
to make sure that descriptor changes are globally visible before we bump the head pointer.
Honnappa Nagarahalli May 7, 2019, 11:29 a.m. | #4
> >
> >> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> >>> Memory interface (memif), provides high performance packet transfer
> >>> over shared memory.
> >>>
> >>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> >>
> >
> > <...>
> >
> >>    With that in mind, I believe that 23Mpps is fine performance. No
> >> performance target is
> >>    defined, the goal is to be as fast as possible.
> > Use of C11 atomics have proven to provide better performance on weakly
> ordered architectures (at least on Arm). IMO, C11 atomics should be used to
> implement the fast path functions at least. This ensures optimal performance
> on all supported architectures in DPDK.
> >
> >    Atomics are not required by memif driver.
> 
> Correct, only thing we need is store barrier once per batch of packets, to
> make sure that descriptor changes are globally visible before we bump head
> pointer.
Maybe I was not clear in my comments. I meant that using the GCC C++11 memory-model-aware atomic operations [1] shows better performance. So, instead of using full memory barriers, you can use store-release and load-acquire semantics. A similar change was made to the svm_fifo data structure in VPP [2] (though the original algorithm there was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/
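
To make the suggestion concrete, here is a minimal sketch of a single-producer/single-consumer ring that publishes a whole batch with one store-release on head and consumes it with one load-acquire, using the GCC __atomic builtins from [1]. struct toy_ring and both functions are hypothetical and deliberately simplified; this is not the memif_ring_t layout nor the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8u  /* hypothetical ring size, must be a power of two */

/* Toy SPSC ring; head is written by the producer, tail by the consumer. */
struct toy_ring {
	uint16_t head;
	uint16_t tail;
	uint32_t desc[RING_SLOTS];
};

/* Producer: plain stores into descriptors, then publish head once. */
static uint16_t
toy_enqueue(struct toy_ring *r, const uint32_t *vals, uint16_t n)
{
	uint16_t head, tail, free_slots, i;

	assert(r != NULL && vals != NULL);
	head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);
	free_slots = (uint16_t)(RING_SLOTS - (uint16_t)(head - tail));
	if (n > free_slots)
		n = free_slots;

	for (i = 0; i < n; i++)
		r->desc[(uint16_t)(head + i) & (RING_SLOTS - 1)] = vals[i];

	/* Store-release: the descriptor stores above are visible before head. */
	__atomic_store_n(&r->head, (uint16_t)(head + n), __ATOMIC_RELEASE);
	return n;
}

/* Consumer: load-acquire head, then read what was published before it. */
static uint16_t
toy_dequeue(struct toy_ring *r, uint32_t *out, uint16_t n)
{
	uint16_t head, tail, avail, i;

	assert(r != NULL && out != NULL);
	tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);
	avail = (uint16_t)(head - tail);
	if (n > avail)
		n = avail;

	for (i = 0; i < n; i++)
		out[i] = r->desc[(uint16_t)(tail + i) & (RING_SLOTS - 1)];

	/* Release the consumed slots back to the producer. */
	__atomic_store_n(&r->tail, (uint16_t)(tail + n), __ATOMIC_RELEASE);
	return n;
}
```

The descriptor writes stay ordinary stores and the ordering cost is paid once per batch, so the shape is the same as "many stores, one fence, one head update"; the difference is only that the ordering is attached to the head store and load instead of being a standalone barrier.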

> 
> --
> Damjan
Damjan Marion (damarion) May 7, 2019, 11:37 a.m. | #5
Typically we have hundreds of normal memory stores into the packet ring, then a single store fence, and then finally one more store to bump the head pointer.
Sorry, I'm not getting what you are suggesting here, and how that can be faster...
Honnappa Nagarahalli May 8, 2019, 7:53 a.m. | #6
[Honnappa] I will get back with a patch and we can go from there

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 452b8eb82..145b5282a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -790,6 +790,12 @@  F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst
 
+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+
 
 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 0b09a9348..eb6055ea8 100644
--- a/config/common_base
+++ b/config/common_base
@@ -416,6 +416,11 @@  CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@  CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@ 
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 5c80e3baa..167e0dacb 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -34,6 +34,7 @@  Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..7cc547105
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,200 @@ 
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Master creates the socket and listens for any slave
+connection requests. The socket may already exist on the system. Be sure to
+remove any such sockets, if you are creating a master interface, or you will
+see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,
+which removes memif interface, will also remove a listener socket, if it is
+not being used by any other interface.
+
+To enable one or more interfaces, use the ``--vdev=net_memif0`` option on the
+DPDK application command line. Each additional option (``--vdev=net_memif1``,
+and so on) creates a further interface, named net_memif0, net_memif1, etc.
+Memif uses a unix domain socket to transmit control messages. Each memif has
+a unique id per socket. If you are connecting multiple interfaces using the
+same socket, be sure to specify unique ids ``id=0``, ``id=1``, etc. Note that
+if you assign a socket to a master interface, it becomes a listener socket.
+A listener socket cannot be used by a slave interface in the same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Each memif on same socket must be given a unique id", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+
+**Connection establishment**
+
+In order to create a memif connection, two memif interfaces, each in a separate
+process, are needed: one interface in ``master`` role and the other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process.
+
+The memif driver uses a unix domain socket to exchange required information
+between memif interfaces. The socket file path is specified at interface
+creation (see the *Memif configuration options* table above). A socket used by
+a ``master`` interface is marked as a listener socket (in scope of the current
+process) and listens for connection requests from other processes. One socket
+can be used by multiple interfaces. One process can have ``slave`` and
+``master`` interfaces at the same time, provided each role uses its own socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
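
As an aside to the descriptor format table in the guide above, the bit ranges can be read as a plain unpacking of two 64-bit quad words. The sketch below is illustrative and not part of the patch: `demo_desc`, `demo_desc_from_qwords` and `demo_desc_addr` are hypothetical names, the struct only mirrors the field order of `memif_desc_t`, and it assumes the table describes the quad words as a little-endian host would load them.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the field order of memif_desc_t; the name is hypothetical. */
struct demo_desc {
	uint16_t flags;
	uint16_t region;
	uint32_t length;
	uint32_t offset;
	uint32_t metadata;
};

#define DEMO_DESC_FLAG_NEXT 1	/* same value as MEMIF_DESC_FLAG_NEXT */

static_assert(sizeof(struct demo_desc) == 16,
	      "a descriptor occupies two quad words");

/* Unpack quad words 0 and 1 using the bit ranges from the table. */
static struct demo_desc
demo_desc_from_qwords(uint64_t qw0, uint64_t qw1)
{
	struct demo_desc d;

	d.flags    = (uint16_t)(qw0 & 0xffff);		/* QW0 bits 0:15  */
	d.region   = (uint16_t)((qw0 >> 16) & 0xffff);	/* QW0 bits 16:31 */
	d.length   = (uint32_t)(qw0 >> 32);		/* QW0 bits 32:63 */
	d.offset   = (uint32_t)(qw1 & 0xffffffffu);	/* QW1 bits 0:31  */
	d.metadata = (uint32_t)(qw1 >> 32);		/* QW1 bits 32:63 */
	return d;
}

/* Buffer address: base of the indexed region plus the descriptor offset. */
static void *
demo_desc_addr(void *const *region_base, const struct demo_desc *d)
{
	return (uint8_t *)region_base[d->region] + d->offset;
}
```

`demo_desc_addr()` corresponds to the expression given for the data offset field, region address plus `desc->offset`, with the array of region base pointers passed in explicitly.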
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 61a2c7383..be55faacc 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -91,6 +91,10 @@  New Features
 
   * Added promiscuous mode support.
 
+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 502869a87..1ecbf5155 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -33,6 +33,7 @@  DIRS-$(CONFIG_RTE_LIBRTE_IAVF_PMD) += iavf
 DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..346cf6473
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,28 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..15a56888b
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S rings */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * The shared memory file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
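
The message types declared in memif.h above are exchanged in the order described under "Connection establishment" in the guide. As a reading aid, that order can be sketched as a small state machine. Everything named `demo_*` or `DEMO_*` below is hypothetical and not part of the patch; only the enum order is copied from `memif_msg_type_t`. The sketch is deliberately simplified: it does not track which side sends each message, and it does not enforce that all regions are added before any ring.

```c
#include <assert.h>
#include <stddef.h>

/* Same order as memif_msg_type_t in memif.h. */
enum demo_msg_type {
	DEMO_MSG_NONE,
	DEMO_MSG_ACK,
	DEMO_MSG_HELLO,
	DEMO_MSG_INIT,
	DEMO_MSG_ADD_REGION,
	DEMO_MSG_ADD_RING,
	DEMO_MSG_CONNECT,
	DEMO_MSG_CONNECTED,
	DEMO_MSG_DISCONNECT,
};

static_assert(DEMO_MSG_DISCONNECT == 8,
	      "enum order must match memif_msg_type_t");

enum demo_state {
	DEMO_ST_WAIT_HELLO,	/* master sends 'hello' first */
	DEMO_ST_WAIT_INIT,	/* slave answers with 'init' */
	DEMO_ST_WAIT_ACK,	/* master acks init/add region/add ring */
	DEMO_ST_SETUP,		/* slave adds regions and rings, or connects */
	DEMO_ST_WAIT_CONNECTED,	/* master answers 'connect' with 'connected' */
	DEMO_ST_UP,
	DEMO_ST_DOWN,
	DEMO_ST_ERR,
};

/* Advance the handshake by one control message. */
static enum demo_state
demo_step(enum demo_state s, enum demo_msg_type m)
{
	/* Either side may send 'disconnect' at any time. */
	if (m == DEMO_MSG_DISCONNECT)
		return DEMO_ST_DOWN;

	switch (s) {
	case DEMO_ST_WAIT_HELLO:
		return m == DEMO_MSG_HELLO ? DEMO_ST_WAIT_INIT : DEMO_ST_ERR;
	case DEMO_ST_WAIT_INIT:
		return m == DEMO_MSG_INIT ? DEMO_ST_WAIT_ACK : DEMO_ST_ERR;
	case DEMO_ST_WAIT_ACK:
		return m == DEMO_MSG_ACK ? DEMO_ST_SETUP : DEMO_ST_ERR;
	case DEMO_ST_SETUP:
		if (m == DEMO_MSG_ADD_REGION || m == DEMO_MSG_ADD_RING)
			return DEMO_ST_WAIT_ACK;
		return m == DEMO_MSG_CONNECT ?
		    DEMO_ST_WAIT_CONNECTED : DEMO_ST_ERR;
	case DEMO_ST_WAIT_CONNECTED:
		return m == DEMO_MSG_CONNECTED ? DEMO_ST_UP : DEMO_ST_ERR;
	default:
		return DEMO_ST_ERR;
	}
}

/* Replay a whole message trace and return the final state. */
static enum demo_state
demo_run(const enum demo_msg_type *trace, size_t n)
{
	enum demo_state s = DEMO_ST_WAIT_HELLO;
	size_t i;

	for (i = 0; i < n; i++)
		s = demo_step(s, trace[i]);
	return s;
}
```

A trace of hello, init, ack, add region, ack, add ring, ack, connect, connected ends in `DEMO_ST_UP`; a 'connect' sent before 'init' has been acknowledged ends in `DEMO_ST_ERR`.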
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..ad6a5e483
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1104 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(*e), 0);
+
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	pmd = (cc->dev != NULL) ? cc->dev->data->dev_private : NULL;
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_IDX;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.buffer_size = pmd->cfg.buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *mr;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index > ETH_MEMIF_MAX_REGION_IDX) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	mr = rte_realloc(pmd->regions, sizeof(struct memif_region) *
+			 (ar->index + 1), 0);
+	if (mr == NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Device error", 0);
+		return -1;
+	}
+
+	pmd->regions = mr;
+	pmd->regions[ar->index].fd = fd;
+	pmd->regions[ar->index].region_size = ar->size;
+	pmd->regions[ar->index].addr = NULL;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, sizeof(pmd->local_disc_string));
+	memif_disconnect(dev);
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = &pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL;
+		     elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		if (pmd->cc->intr_handle.fd > 0) {
+			ret =
+			    rte_intr_callback_unregister(&pmd->cc->intr_handle,
+							 memif_intr_handler,
+							 pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				/* *INDENT-OFF* */
+				ret = rte_intr_callback_unregister_pending(
+							&pmd->cc->intr_handle,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+				/* *INDENT-ON* */
+			} else if (ret > 0) {
+				close(pmd->cc->intr_handle.fd);
+				rte_free(pmd->cc);
+			}
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused = NULL;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* dispatch on message type */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface associated with it.
+		 * If a suitable interface is found, it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	dev = rte_eth_dev_allocated(rte_vdev_device_name(
+		((struct pmd_internals *)cc->dev->data->dev_private)->vdev));
+	if (dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	socklen_t addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret =
+	    rte_intr_callback_register(&cc->intr_handle, memif_intr_handler,
+				       cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL) {
+		rte_free(cc);
+		cc = NULL;
+	}
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un = { 0 };
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			close(sockfd);
+			rte_free(sock);
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sockfd >= 0)
+		close(sockfd);
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, sizeof(key));
+	strlcpy(key, socket_filename, sizeof(key));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, sizeof(pmd->local_disc_string));
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun = { 0 };
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, sizeof(pmd->local_disc_string));
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..8caea270b
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,104 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param dev
+ *   memif ethernet device
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to ethernet device */
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* _MEMIF_SOCKET_H_ */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..7ef1bca93
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1132 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+static struct rte_vdev_driver pmd_memif_drv;
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0].addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region].addr + d->offset);
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head = NULL;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+		dst_off = 0;
+
+ next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				rte_pktmbuf_chain(mbuf_head, mbuf);
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+ no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+ refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_free && n_tx_pkts < nb_pkts) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len =
+		    (type ==
+		     MEMIF_RING_S2M) ? pmd->run.buffer_size : d0->length;
+
+ next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (mbuf->next != NULL) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+ no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	if (pmd->regions == NULL)
+		return;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions + i;
+		if (r->addr != NULL) {
+			munmap(r->addr, r->region_size);
+			r->addr = NULL;
+		}
+		if (r->fd > 0) {
+			close(r->fd);
+			r->fd = -1;
+		}
+	}
+	rte_free(pmd->regions);
+	pmd->regions = NULL;
+	pmd->regions_num = 0;
+}
+
+static int
+memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
+{
+	struct memif_region *r;
+	char shm_name[32];
+	int i;
+	int ret = 0;
+
+	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate regions.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	pmd->regions = r;
+	pmd->regions_num = brn + 1;
+
+	/*
+	 * Create shm for every region. Region 0 is reserved for descriptors.
+	 * Other regions contain buffers.
+	 */
+	for (i = 0; i < (brn + 1); i++) {
+		r = &pmd->regions[i];
+
+		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
+					       pmd->run.num_m2s_rings) *
+		    (sizeof(memif_ring_t) +
+		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;
+		r->region_size = (i == 0) ? r->buffer_offset :
+		    (uint32_t)(pmd->run.buffer_size *
+				(1 << pmd->run.log2_ring_size) *
+				(pmd->run.num_s2m_rings +
+				 pmd->run.num_m2s_rings));
+
+		memset(shm_name, 0, sizeof(shm_name));
+		snprintf(shm_name, sizeof(shm_name), "memif region %d", i);
+
+		r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+		if (r->fd < 0) {
+			MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = ftruncate(r->fd, r->region_size);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		r->addr = mmap(NULL, r->region_size, PROT_READ |
+			       PROT_WRITE, MAP_SHARED, r->fd, 0);
+		if (r->addr == MAP_FAILED) {
+			MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint32_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_alloc_regions(dev->data->dev_private, /* num of buffer regions */ 1);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions + i;
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == MAP_FAILED)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t buffer_size, const char *secret,
+	     struct ether_addr *eth_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(pmd->secret));
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.buffer_size = buffer_size;
+
+	rte_memcpy(&pmd->eth_addr, eth_addr, sizeof(struct ether_addr));
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = &pmd->eth_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir ? dir : "(none)");
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *eth_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &eth_addr->addr_bytes[0], &eth_addr->addr_bytes[1],
+	       &eth_addr->addr_bytes[2], &eth_addr->addr_bytes[3],
+	       &eth_addr->addr_bytes[4], &eth_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t buffer_size = ETH_MEMIF_DEFAULT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr eth_addr;
+
+	eth_random_addr(eth_addr.addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, &eth_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, buffer_size, secret, &eth_addr);
+
+ exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device removed", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_free(eth_dev->data->dev_private);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..930de38ed
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,203 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* _GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_BUFFER_SIZE		2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_IDX		255
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t buffer_offset;			/**< offset at which buffers start */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;		/**< offset at which the queue begins */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and cannot accept connection requests */
+
+	struct ether_addr eth_addr;		/**< mac address */
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions;		/**< shared memory regions */
+	uint8_t regions_num;			/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* _RTE_ETH_MEMIF_H_ */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@ 
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 3ecc78cee..86ad1ec9e 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -22,6 +22,7 @@  drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 262132fc6..d82cb0fb5 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -171,6 +171,7 @@  ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
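
A note on the shared-memory setup in memif_alloc_regions(): the sequence there is memfd_create(), F_ADD_SEALS with F_SEAL_SHRINK, ftruncate() to the region size, then mmap(). mmap() reports failure by returning MAP_FAILED rather than NULL, and the error paths should not leave a stale descriptor behind. Below is a minimal standalone sketch of that sequence, not the driver code. The names demo_region, demo_region_create() and demo_region_destroy() are hypothetical and used only for illustration. The fallback macro values mirror the ones defined in rte_eth_memif.h, and SYS_memfd_create is assumed to be provided by <sys/syscall.h>.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older libc headers, same values as in rte_eth_memif.h. */
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS (1024 + 9)
#endif
#ifndef F_SEAL_SHRINK
#define F_SEAL_SHRINK 0x0002
#endif

/* Hypothetical stand-in for struct memif_region. */
struct demo_region {
	void *addr;	/* mapping address, NULL when unmapped */
	size_t size;	/* mapping size in bytes */
	int fd;		/* memfd descriptor, -1 when closed */
};

/* Return 0 on success, -errno on failure. On failure nothing stays open. */
static int
demo_region_create(struct demo_region *r, const char *name, size_t size)
{
	int err;

	r->addr = NULL;
	r->size = size;
	r->fd = (int)syscall(SYS_memfd_create, name, MFD_ALLOW_SEALING);
	if (r->fd < 0)
		return -errno;

	/* Seal against shrinking first, then grow the file to its size. */
	if (fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK) < 0 ||
	    ftruncate(r->fd, (off_t)size) < 0)
		goto fail;

	r->addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		       r->fd, 0);
	if (r->addr == MAP_FAILED) {
		/* Keep the "NULL means unmapped" convention. */
		r->addr = NULL;
		goto fail;
	}
	return 0;

fail:
	err = errno;
	close(r->fd);
	r->fd = -1;
	return -err;
}

/* Unmap and close; safe to call on a region that failed to create. */
static void
demo_region_destroy(struct demo_region *r)
{
	if (r->addr != NULL) {
		munmap(r->addr, r->size);
		r->addr = NULL;
	}
	if (r->fd >= 0) {
		close(r->fd);
		r->fd = -1;
	}
}
```

A caller would invoke demo_region_create(&r, "memif region 0", size) once per region and demo_region_destroy(&r) on teardown. Resetting addr to NULL on failure keeps checks such as "if (r->addr == NULL)" in the free and connect paths meaningful.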