[dpdk-dev,1/3] kcp: add kernel control path kernel module

Message ID 1453911849-16562-2-git-send-email-ferruh.yigit@intel.com (mailing list archive)
State Superseded, archived

Commit Message

Ferruh Yigit Jan. 27, 2016, 4:24 p.m. UTC
This kernel module is based on the KNI module, but it is a stripped-down
version that handles control messages only; it provides no data transfer
functionality.

This Linux kernel module lets a userspace application create virtual
interfaces. When a control command is issued on such an interface, the
module pushes the command to userspace and returns the response to the
calling application.

Linux tools such as ethtool, ifconfig and ip can be used on these
virtual interfaces, but data-path tools such as tcpdump cannot.

In the long term this patch is intended to replace KNI, and KNI will be
deprecated.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_linuxapp                             |   6 +
 lib/librte_eal/linuxapp/Makefile                   |   5 +-
 lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
 .../linuxapp/eal/include/exec-env/rte_kcp_common.h |  86 +++++++
 lib/librte_eal/linuxapp/kcp/Makefile               |  58 +++++
 lib/librte_eal/linuxapp/kcp/kcp_dev.h              |  65 +++++
 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c          | 261 +++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_misc.c             | 282 +++++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_net.c              | 209 +++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_nl.c               | 194 ++++++++++++++
 10 files changed, 1167 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_net.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_nl.c
  

Comments

Remy Horton Jan. 28, 2016, 9:49 a.m. UTC | #1
Comments inline

..Remy


On 27/01/2016 16:24, Ferruh Yigit wrote:
 > This kernel module is based on KNI module, but this one is stripped
 > version of it and only for control messages, no data transfer
 > functionality provided.
 >
 > This Linux kernel module helps userspace application create virtual
 > interfaces and when a control command issued into that virtual
 > interface, module pushes the command to the userspace and gets the
 > response back for the caller application.
 >
 > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
 > ---


 > +	net_dev = alloc_netdev(sizeof(struct kcp_dev), name,
 > +#ifdef NET_NAME_UNKNOWN
 > +							NET_NAME_UNKNOWN,
 > +#endif
 > +							kcp_net_init);

Something doesn't feel quite right here. In cases where NET_NAME_UNKNOWN 
is undefined, is the signature for alloc_netdev different?
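
As far as I know the signature does differ: kernels from about 3.17 onward take an extra name_assign_type argument in alloc_netdev(), and NET_NAME_UNKNOWN appeared alongside it, which is presumably why the patch keys on that macro. One way to keep the call site free of #ifdefs is a small wrapper macro. The sketch below is a userspace mock of that pattern, not kernel code: struct net_device, both alloc_netdev() variants, mock_alloc(), kcp_alloc_netdev() and mock_setup() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct net_device. */
struct net_device {
	char name[16];
	int setup_done;
};

static struct net_device mock_dev; /* one static instance keeps the mock simple */

/* Shared body of both mock variants: record the name, run the setup hook. */
static struct net_device *
mock_alloc(const char *name, void (*setup)(struct net_device *))
{
	assert(name != NULL && setup != NULL);
	memset(&mock_dev, 0, sizeof(mock_dev));
	snprintf(mock_dev.name, sizeof(mock_dev.name), "%s", name);
	setup(&mock_dev);
	return &mock_dev;
}

#ifdef NET_NAME_UNKNOWN
/* Newer shape: an extra name_assign_type argument before the setup hook. */
static struct net_device *
alloc_netdev(int sizeof_priv, const char *name,
	     unsigned char name_assign_type,
	     void (*setup)(struct net_device *))
{
	(void)sizeof_priv;
	(void)name_assign_type;
	return mock_alloc(name, setup);
}
#define kcp_alloc_netdev(size, name, setup) \
	alloc_netdev(size, name, NET_NAME_UNKNOWN, setup)
#else
/* Older shape: no name_assign_type argument. */
static struct net_device *
alloc_netdev(int sizeof_priv, const char *name,
	     void (*setup)(struct net_device *))
{
	(void)sizeof_priv;
	return mock_alloc(name, setup);
}
#define kcp_alloc_netdev(size, name, setup) alloc_netdev(size, name, setup)
#endif

/* Hypothetical setup hook, playing the role of kcp_net_init(). */
static void mock_setup(struct net_device *dev)
{
	dev->setup_done = 1;
}
```

With a wrapper like this the #ifdef lives in one place, and the caller writes kcp_alloc_netdev(sizeof(struct kcp_dev), name, kcp_net_init) regardless of kernel version.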


 > +MODULE_LICENSE("Dual BSD/GPL");
 > +MODULE_AUTHOR("Intel Corporation");
 > +MODULE_DESCRIPTION("Kernel Module for managing kcp devices");

I'm not up to speed on this area, but some of the file headers only 
mention GPL/LGPL. Is this correct?


 > +	nlmsg_unicast(nl_sock, skb, pid);
 > +	KCP_DBG("Sent cmd:%d port:%d\n", cmd_id, port_id);
 > +
 > +	/*nlmsg_free(skb);*/
 > +
 > +	return 0;
 > +}

Oops.. :)
Possible memory leak, or is *skb statically allocated?
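
On the ownership question: to my knowledge nlmsg_unicast() hands the skb to netlink_unicast(), which consumes it on both success and failure, so the caller must not free it afterwards. If that holds here, re-enabling the commented-out nlmsg_free() would be a double free rather than a missing cleanup. The userspace mock below only illustrates that ownership rule; struct mock_skb, mock_alloc(), mock_free(), mock_unicast(), mock_send_cmd() and the live_skbs counter are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff; not a kernel type. */
struct mock_skb {
	int len;
};

static int live_skbs; /* buffers allocated and not yet freed */

struct mock_skb *mock_alloc(int len)
{
	struct mock_skb *skb = malloc(sizeof(*skb));

	if (skb == NULL)
		return NULL;
	skb->len = len;
	live_skbs++;
	return skb;
}

void mock_free(struct mock_skb *skb)
{
	assert(live_skbs > 0); /* catches a free with nothing outstanding */
	live_skbs--;
	free(skb);
}

/*
 * Models a "consuming" send: the callee takes ownership of skb and
 * releases it whether delivery succeeds or not.
 */
int mock_unicast(struct mock_skb *skb, int deliver_ok)
{
	mock_free(skb);
	return deliver_ok ? 0 : -1;
}

/* Mirrors the shape of the quoted function: send, then do NOT free. */
int mock_send_cmd(int len, int deliver_ok)
{
	struct mock_skb *skb = mock_alloc(len);

	if (skb == NULL)
		return -1;
	return mock_unicast(skb, deliver_ok);
}
```

In this model the outstanding-buffer count returns to zero after every send, with no free in the caller; adding one after mock_unicast() would trip the assertion in mock_free().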
  
Avi Kivity Feb. 28, 2016, 3:34 p.m. UTC | #2
On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
> This kernel module is based on KNI module, but this one is stripped
> version of it and only for control messages, no data transfer
> functionality provided.
>
> This Linux kernel module helps userspace application create virtual
> interfaces and when a control command issued into that virtual
> interface, module pushes the command to the userspace and gets the
> response back for the caller application.
>
> The Linux tools like ethtool/ifconfig/ip can be used on virtual
> interfaces but not ones for related data, like tcpdump.
>
> In long term this patch intends to replace the KNI and KNI will be
> depreciated.

Instead of adding yet another out-of-tree kernel module, why not extend 
the existing in-tree tap driver?  This will make everyone's life easier.

Since tap also supports data transfer, an application can also forward 
packets not intended for it to the kernel, and forward packets from the 
kernel through the device.
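
For context, attaching to the in-tree tap driver from userspace needs only an ioctl on /dev/net/tun. A minimal sketch, assuming a Linux host and sufficient privileges; tap_fill_ifreq() and tap_open() are hypothetical helper names, and the interface name is whatever the caller passes in:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Prepare the request for a tap (Ethernet-level) device without the
 * extra packet-information header. */
void tap_fill_ifreq(struct ifreq *ifr, const char *name)
{
	assert(ifr != NULL && name != NULL);
	memset(ifr, 0, sizeof(*ifr));
	ifr->ifr_flags = IFF_TAP | IFF_NO_PI;
	/* The buffer was zeroed above, so the name stays NUL-terminated. */
	strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/* Returns a file descriptor bound to the tap interface, or -1 on error. */
int tap_open(const char *name)
{
	struct ifreq ifr;
	int fd = open("/dev/net/tun", O_RDWR);

	if (fd < 0)
		return -1;

	tap_fill_ifreq(&ifr, name);
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Frames written to the returned descriptor enter the kernel stack through the interface, and frames the kernel sends out of the interface can be read back from it, which is the data path referred to above.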

> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>   config/common_linuxapp                             |   6 +
>   lib/librte_eal/linuxapp/Makefile                   |   5 +-
>   lib/librte_eal/linuxapp/eal/Makefile               |   3 +-
>   .../linuxapp/eal/include/exec-env/rte_kcp_common.h |  86 +++++++
>   lib/librte_eal/linuxapp/kcp/Makefile               |  58 +++++
>   lib/librte_eal/linuxapp/kcp/kcp_dev.h              |  65 +++++
>   lib/librte_eal/linuxapp/kcp/kcp_ethtool.c          | 261 +++++++++++++++++++
>   lib/librte_eal/linuxapp/kcp/kcp_misc.c             | 282 +++++++++++++++++++++
>   lib/librte_eal/linuxapp/kcp/kcp_net.c              | 209 +++++++++++++++
>   lib/librte_eal/linuxapp/kcp/kcp_nl.c               | 194 ++++++++++++++
>   10 files changed, 1167 insertions(+), 2 deletions(-)
>   create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h
>   create mode 100644 lib/librte_eal/linuxapp/kcp/Makefile
>   create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_dev.h
>   create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
>   create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_misc.c
>   create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_net.c
>   create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_nl.c
>
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 74bc515..5d5e3e4 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -503,6 +503,12 @@ CONFIG_RTE_KNI_VHOST_DEBUG_RX=n
>   CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
>   
>   #
> +# Compile librte_ctrl_if
> +#
> +CONFIG_RTE_KCP_KMOD=y
> +CONFIG_RTE_KCP_KO_DEBUG=n
> +
> +#
>   # Compile vhost library
>   # fuse-devel is needed to run vhost-cuse.
>   # fuse-devel enables user space char driver development
> diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
> index d9c5233..d1fa3a3 100644
> --- a/lib/librte_eal/linuxapp/Makefile
> +++ b/lib/librte_eal/linuxapp/Makefile
> @@ -1,6 +1,6 @@
>   #   BSD LICENSE
>   #
> -#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
>   #   All rights reserved.
>   #
>   #   Redistribution and use in source and binary forms, with or without
> @@ -38,6 +38,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
>   ifeq ($(CONFIG_RTE_KNI_KMOD),y)
>   DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
>   endif
> +ifeq ($(CONFIG_RTE_KCP_KMOD),y)
> +DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kcp
> +endif
>   ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
>   DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0
>   endif
> diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> index 26eced5..dded8cb 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -1,6 +1,6 @@
>   #   BSD LICENSE
>   #
> -#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> +#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
>   #   All rights reserved.
>   #
>   #   Redistribution and use in source and binary forms, with or without
> @@ -116,6 +116,7 @@ CFLAGS_eal_thread.o += -Wno-return-type
>   endif
>   
>   INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
> +INC += rte_kcp_common.h
>   
>   SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP)-include/exec-env := \
>   	$(addprefix include/exec-env/,$(INC))
> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h
> new file mode 100644
> index 0000000..b3a6ee3
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h
> @@ -0,0 +1,86 @@
> +/*-
> + *   This file is provided under a dual BSD/LGPLv2 license.  When using or
> + *   redistributing this file, you may do so under either license.
> + *
> + *   GNU LESSER GENERAL PUBLIC LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *
> + *   This program is free software; you can redistribute it and/or modify
> + *   it under the terms of version 2.1 of the GNU Lesser General Public License
> + *   as published by the Free Software Foundation.
> + *
> + *   This program is distributed in the hope that it will be useful, but
> + *   WITHOUT ANY WARRANTY; without even the implied warranty of
> + *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *   Lesser General Public License for more details.
> + *
> + *   You should have received a copy of the GNU Lesser General Public License
> + *   along with this program;
> + *
> + *   Contact Information:
> + *   Intel Corporation
> + *
> + *
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *   * Redistributions of source code must retain the above copyright
> + *     notice, this list of conditions and the following disclaimer.
> + *   * Redistributions in binary form must reproduce the above copyright
> + *     notice, this list of conditions and the following disclaimer in
> + *     the documentation and/or other materials provided with the
> + *     distribution.
> + *   * Neither the name of Intel Corporation nor the names of its
> + *     contributors may be used to endorse or promote products derived
> + *     from this software without specific prior written permission.
> + *
> + *    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + */
> +
> +#ifndef _RTE_KCP_COMMON_H_
> +#define _RTE_KCP_COMMON_H_
> +
> +#ifdef __KERNEL__
> +#include <linux/if.h>
> +#endif
> +
> +/*
> + * Request id.
> + */
> +enum rte_kcp_req_id {
> +	RTE_KCP_REQ_UNKNOWN = (1 << 16),
> +	RTE_KCP_REQ_CHANGE_MTU,
> +	RTE_KCP_REQ_CFG_NETWORK_IF,
> +	RTE_KCP_REQ_GET_STATS,
> +	RTE_KCP_REQ_GET_MAC,
> +	RTE_KCP_REQ_SET_MAC,
> +	RTE_KCP_REQ_START_PORT,
> +	RTE_KCP_REQ_STOP_PORT,
> +	RTE_KCP_REQ_MAX,
> +};
> +
> +#define KCP_DEVICE "kcp"
> +
> +#define RTE_KCP_IOCTL_TEST    _IOWR(0, 1, int)
> +#define RTE_KCP_IOCTL_CREATE  _IOWR(0, 2, int)
> +#define RTE_KCP_IOCTL_RELEASE _IOWR(0, 3, int)
> +
> +#endif /* _RTE_KCP_COMMON_H_ */
> diff --git a/lib/librte_eal/linuxapp/kcp/Makefile b/lib/librte_eal/linuxapp/kcp/Makefile
> new file mode 100644
> index 0000000..b2c44bd
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kcp/Makefile
> @@ -0,0 +1,58 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# module name and path
> +#
> +MODULE = rte_kcp
> +
> +#
> +# CFLAGS
> +#
> +MODULE_CFLAGS += -I$(SRCDIR)
> +MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
> +MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
> +MODULE_CFLAGS += -Wall -Werror
> +
> +# this lib needs main eal
> +DEPDIRS-y += lib/librte_eal/linuxapp/eal
> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-y += kcp_misc.c
> +SRCS-y += kcp_net.c
> +SRCS-y += kcp_ethtool.c
> +SRCS-y += kcp_nl.c
> +
> +include $(RTE_SDK)/mk/rte.module.mk
> diff --git a/lib/librte_eal/linuxapp/kcp/kcp_dev.h b/lib/librte_eal/linuxapp/kcp/kcp_dev.h
> new file mode 100644
> index 0000000..e537821
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kcp/kcp_dev.h
> @@ -0,0 +1,65 @@
> +/*-
> + * GPL LICENSE SUMMARY
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *
> + *   This program is free software; you can redistribute it and/or modify
> + *   it under the terms of version 2 of the GNU General Public License as
> + *   published by the Free Software Foundation.
> + *
> + *   This program is distributed in the hope that it will be useful, but
> + *   WITHOUT ANY WARRANTY; without even the implied warranty of
> + *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *   General Public License for more details.
> + *
> + *   You should have received a copy of the GNU General Public License
> + *   along with this program;
> + *
> + *   The full GNU General Public License is included in this distribution
> + *   in the file called LICENSE.GPL.
> + *
> + *   Contact Information:
> + *   Intel Corporation
> + */
> +
> +#ifndef _KCP_DEV_H_
> +#define _KCP_DEV_H_
> +
> +#include <linux/netdevice.h>
> +#include <exec-env/rte_kcp_common.h>
> +
> +#define RTE_KCP_NAMESIZE 32
> +
> +struct kcp_dev {
> +	/* kcp list */
> +	struct list_head list;
> +
> +	char name[RTE_KCP_NAMESIZE]; /* Network device name */
> +
> +	/* kcp device */
> +	struct net_device *net_dev;
> +
> +	int port_id;
> +	struct completion msg_received;
> +};
> +
> +void kcp_net_init(struct net_device *dev);
> +
> +void kcp_nl_init(void);
> +void kcp_nl_release(void);
> +int kcp_nl_exec(int cmd, struct net_device *dev, void *in_data, int in_len,
> +		void *out_data, int out_len);
> +
> +void kcp_set_ethtool_ops(struct net_device *netdev);
> +
> +#define KCP_ERR(args...) printk(KERN_ERR "KCP: " args)
> +#define KCP_INFO(args...) printk(KERN_INFO "KCP: " args)
> +#define KCP_PRINT(args...) printk(KERN_DEBUG "KCP: " args)
> +
> +#ifdef RTE_KCP_KO_DEBUG
> +#define KCP_DBG(args...) printk(KERN_DEBUG "KCP: " args)
> +#else
> +#define KCP_DBG(args...)
> +#endif
> +
> +#endif /* _KCP_DEV_H_ */
> diff --git a/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c b/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
> new file mode 100644
> index 0000000..3a22dba
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
> @@ -0,0 +1,261 @@
> +/*-
> + * GPL LICENSE SUMMARY
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *
> + *   This program is free software; you can redistribute it and/or modify
> + *   it under the terms of version 2 of the GNU General Public License as
> + *   published by the Free Software Foundation.
> + *
> + *   This program is distributed in the hope that it will be useful, but
> + *   WITHOUT ANY WARRANTY; without even the implied warranty of
> + *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *   General Public License for more details.
> + *
> + *   You should have received a copy of the GNU General Public License
> + *   along with this program;
> + *
> + *   The full GNU General Public License is included in this distribution
> + *   in the file called LICENSE.GPL.
> + *
> + *   Contact Information:
> + *   Intel Corporation
> + */
> +
> +#include "kcp_dev.h"
> +
> +#define ETHTOOL_GEEPROM_LEN 99
> +#define ETHTOOL_GREGS_LEN 98
> +#define ETHTOOL_GSSET_COUNT 97
> +
> +static int
> +kcp_check_if_running(struct net_device *dev)
> +{
> +	return 0;
> +}
> +
> +static void
> +kcp_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
> +{
> +	int ret;
> +
> +	ret = kcp_nl_exec(info->cmd, dev, NULL, 0,
> +			info, sizeof(struct ethtool_drvinfo));
> +	if (ret < 0)
> +		memset(info, 0, sizeof(struct ethtool_drvinfo));
> +}
> +
> +static int
> +kcp_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
> +{
> +	return kcp_nl_exec(ecmd->cmd, dev, NULL, 0,
> +			ecmd, sizeof(struct ethtool_cmd));
> +}
> +
> +static int
> +kcp_set_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
> +{
> +	return kcp_nl_exec(ecmd->cmd, dev, ecmd, sizeof(struct ethtool_cmd),
> +			NULL, 0);
> +}
> +
> +static void
> +kcp_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
> +{
> +	int ret;
> +
> +	ret = kcp_nl_exec(wol->cmd, dev, NULL, 0,
> +			wol, sizeof(struct ethtool_wolinfo));
> +	if (ret < 0)
> +		memset(wol, 0, sizeof(struct ethtool_wolinfo));
> +}
> +
> +static int
> +kcp_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
> +{
> +	return kcp_nl_exec(wol->cmd, dev, wol, sizeof(struct ethtool_wolinfo),
> +			NULL, 0);
> +}
> +
> +static int
> +kcp_nway_reset(struct net_device *dev)
> +{
> +	return kcp_nl_exec(ETHTOOL_NWAY_RST, dev, NULL, 0, NULL, 0);
> +}
> +
> +static int
> +kcp_get_eeprom_len(struct net_device *dev)
> +{
> +	int data;
> +	int ret;
> +
> +	ret = kcp_nl_exec(ETHTOOL_GEEPROM_LEN, dev, NULL, 0,
> +			&data, sizeof(int));
> +	if (ret < 0)
> +		return ret;
> +
> +	return data;
> +}
> +
> +static int
> +kcp_get_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
> +		u8 *bytes)
> +{
> +	int ret;
> +
> +	ret = kcp_nl_exec(eeprom->cmd, dev,
> +			eeprom, sizeof(struct ethtool_eeprom),
> +			bytes, eeprom->len);
> +	*bytes = 0;
> +	return ret;
> +}
> +
> +static int
> +kcp_set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
> +		u8 *bytes)
> +{
> +	int ret;
> +
> +	ret = kcp_nl_exec(eeprom->cmd, dev,
> +			eeprom, sizeof(struct ethtool_eeprom),
> +			bytes, eeprom->len);
> +	*bytes = 0;
> +	return ret;
> +}
> +
> +static void
> +kcp_get_ringparam(struct net_device *dev, struct ethtool_ringparam *ring)
> +{
> +
> +	kcp_nl_exec(ring->cmd, dev, NULL, 0,
> +			ring, sizeof(struct ethtool_ringparam));
> +}
> +
> +static int
> +kcp_set_ringparam(struct net_device *dev, struct ethtool_ringparam *ring)
> +{
> +	int ret;
> +
> +	ret = kcp_nl_exec(ring->cmd, dev,
> +			ring, sizeof(struct ethtool_ringparam),
> +			NULL, 0);
> +	return ret;
> +}
> +
> +static void
> +kcp_get_pauseparam(struct net_device *dev, struct ethtool_pauseparam *pause)
> +{
> +
> +	kcp_nl_exec(pause->cmd, dev, NULL, 0,
> +			pause, sizeof(struct ethtool_pauseparam));
> +}
> +
> +static int
> +kcp_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam *pause)
> +{
> +	return kcp_nl_exec(pause->cmd, dev,
> +			pause, sizeof(struct ethtool_pauseparam),
> +			NULL, 0);
> +}
> +
> +static u32
> +kcp_get_msglevel(struct net_device *dev)
> +{
> +	int data;
> +	int ret;
> +
> +	ret = kcp_nl_exec(ETHTOOL_GMSGLVL, dev, NULL, 0, &data, sizeof(int));
> +	if (ret < 0)
> +		return ret;
> +
> +	return data;
> +}
> +
> +static void
> +kcp_set_msglevel(struct net_device *dev, u32 data)
> +{
> +
> +	kcp_nl_exec(ETHTOOL_SMSGLVL, dev, &data, sizeof(int), NULL, 0);
> +}
> +
> +static int
> +kcp_get_regs_len(struct net_device *dev)
> +{
> +	int data;
> +	int ret;
> +
> +	ret = kcp_nl_exec(ETHTOOL_GREGS_LEN, dev, NULL, 0, &data, sizeof(int));
> +	if (ret < 0)
> +		return ret;
> +
> +	return data;
> +}
> +
> +static void
> +kcp_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *p)
> +{
> +
> +	kcp_nl_exec(regs->cmd, dev, regs, sizeof(struct ethtool_regs),
> +			p, regs->len);
> +}
> +
> +static void
> +kcp_get_strings(struct net_device *dev, u32 stringset, u8 *data)
> +{
> +
> +	kcp_nl_exec(ETHTOOL_GSTRINGS, dev, &stringset, sizeof(u32), data, 0);
> +}
> +
> +static int
> +kcp_get_sset_count(struct net_device *dev, int sset)
> +{
> +	int data;
> +	int ret;
> +
> +	ret = kcp_nl_exec(ETHTOOL_GSSET_COUNT, dev, &sset, sizeof(int),
> +			&data, sizeof(int));
> +	if (ret < 0)
> +		return ret;
> +
> +	return data;
> +}
> +
> +static void
> +kcp_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *stats,
> +		u64 *data)
> +{
> +
> +	kcp_nl_exec(stats->cmd, dev, stats, sizeof(struct ethtool_stats),
> +			data, stats->n_stats);
> +}
> +
> +static const struct ethtool_ops kcp_ethtool_ops = {
> +	.begin			= kcp_check_if_running,
> +	.get_drvinfo		= kcp_get_drvinfo,
> +	.get_settings		= kcp_get_settings,
> +	.set_settings		= kcp_set_settings,
> +	.get_regs_len		= kcp_get_regs_len,
> +	.get_regs		= kcp_get_regs,
> +	.get_wol		= kcp_get_wol,
> +	.set_wol		= kcp_set_wol,
> +	.nway_reset		= kcp_nway_reset,
> +	.get_link		= ethtool_op_get_link,
> +	.get_eeprom_len		= kcp_get_eeprom_len,
> +	.get_eeprom		= kcp_get_eeprom,
> +	.set_eeprom		= kcp_set_eeprom,
> +	.get_ringparam		= kcp_get_ringparam,
> +	.set_ringparam		= kcp_set_ringparam,
> +	.get_pauseparam		= kcp_get_pauseparam,
> +	.set_pauseparam		= kcp_set_pauseparam,
> +	.get_msglevel		= kcp_get_msglevel,
> +	.set_msglevel		= kcp_set_msglevel,
> +	.get_strings		= kcp_get_strings,
> +	.get_sset_count		= kcp_get_sset_count,
> +	.get_ethtool_stats	= kcp_get_ethtool_stats,
> +};
> +
> +void
> +kcp_set_ethtool_ops(struct net_device *netdev)
> +{
> +	netdev->ethtool_ops = &kcp_ethtool_ops;
> +}
> diff --git a/lib/librte_eal/linuxapp/kcp/kcp_misc.c b/lib/librte_eal/linuxapp/kcp/kcp_misc.c
> new file mode 100644
> index 0000000..6df0d1b
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kcp/kcp_misc.c
> @@ -0,0 +1,282 @@
> +/*-
> + * GPL LICENSE SUMMARY
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *
> + *   This program is free software; you can redistribute it and/or modify
> + *   it under the terms of version 2 of the GNU General Public License as
> + *   published by the Free Software Foundation.
> + *
> + *   This program is distributed in the hope that it will be useful, but
> + *   WITHOUT ANY WARRANTY; without even the implied warranty of
> + *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *   General Public License for more details.
> + *
> + *   You should have received a copy of the GNU General Public License
> + *   along with this program;
> + *
> + *   The full GNU General Public License is included in this distribution
> + *   in the file called LICENSE.GPL.
> + *
> + *   Contact Information:
> + *   Intel Corporation
> + */
> +
> +#include <linux/module.h>
> +#include <linux/miscdevice.h>
> +
> +#include "kcp_dev.h"
> +
> +#define KCP_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
> +
> +static volatile unsigned long device_in_use; /* device in use flag */
> +
> +/* kcp list lock */
> +static DECLARE_RWSEM(kcp_list_lock);
> +
> +/* kcp list */
> +static struct list_head kcp_list_head = LIST_HEAD_INIT(kcp_list_head);
> +
> +static int
> +kcp_open(struct inode *inode, struct file *file)
> +{
> +	/* kcp device can be opened by one user only, test and set bit */
> +	if (test_and_set_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use))
> +		return -EBUSY;
> +
> +	KCP_PRINT("/dev/kcp opened\n");
> +
> +	kcp_nl_init();
> +
> +	return 0;
> +}
> +
> +static int
> +kcp_dev_remove(struct kcp_dev *dev)
> +{
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (dev->net_dev) {
> +		unregister_netdev(dev->net_dev);
> +		free_netdev(dev->net_dev);
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +kcp_release(struct inode *inode, struct file *file)
> +{
> +	struct kcp_dev *dev, *n;
> +
> +	down_write(&kcp_list_lock);
> +	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
> +		kcp_dev_remove(dev);
> +		list_del(&dev->list);
> +	}
> +	up_write(&kcp_list_lock);
> +
> +	kcp_nl_release();
> +
> +	/* Clear the bit of device in use */
> +	clear_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use);
> +
> +	KCP_PRINT("/dev/kcp closed\n");
> +
> +	return 0;
> +}
> +
> +static int
> +kcp_check_param(struct kcp_dev *kcp, char *name)
> +{
> +	if (!kcp)
> +		return -1;
> +
> +	/* Check if network name has been used */
> +	if (!strncmp(kcp->name, name, RTE_KCP_NAMESIZE)) {
> +		KCP_ERR("KCP interface name %s duplicated\n", name);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +kcp_ioctl_create(unsigned int ioctl_num, unsigned long ioctl_param)
> +{
> +	int ret;
> +	struct net_device *net_dev = NULL;
> +	struct kcp_dev *kcp, *dev, *n;
> +	struct net *net;
> +	char name[RTE_KCP_NAMESIZE];
> +	unsigned int instance = ioctl_param;
> +	char mac[ETH_ALEN];
> +
> +	KCP_PRINT("Creating kcp...\n");
> +
> +	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
> +
> +	/* Check if it has been created */
> +	down_read(&kcp_list_lock);
> +	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
> +		if (kcp_check_param(dev, name) < 0) {
> +			up_read(&kcp_list_lock);
> +			return -EINVAL;
> +		}
> +	}
> +	up_read(&kcp_list_lock);
> +
> +	net_dev = alloc_netdev(sizeof(struct kcp_dev), name,
> +#ifdef NET_NAME_UNKNOWN
> +							NET_NAME_UNKNOWN,
> +#endif
> +							kcp_net_init);
> +	if (net_dev == NULL) {
> +		KCP_ERR("error allocating device \"%s\"\n", name);
> +		return -EBUSY;
> +	}
> +
> +	net = get_net_ns_by_pid(task_pid_vnr(current));
> +	if (IS_ERR(net)) {
> +		free_netdev(net_dev);
> +		return PTR_ERR(net);
> +	}
> +	dev_net_set(net_dev, net);
> +	put_net(net);
> +
> +	kcp = netdev_priv(net_dev);
> +
> +	kcp->net_dev = net_dev;
> +	kcp->port_id = instance;
> +	init_completion(&kcp->msg_received);
> +	strncpy(kcp->name, name, RTE_KCP_NAMESIZE);
> +
> +	kcp_nl_exec(RTE_KCP_REQ_GET_MAC, net_dev, NULL, 0, mac, ETH_ALEN);
> +	memcpy(net_dev->dev_addr, mac, net_dev->addr_len);
> +
> +	kcp_set_ethtool_ops(net_dev);
> +	ret = register_netdev(net_dev);
> +	if (ret) {
> +		KCP_ERR("error %i registering device \"%s\"\n", ret, name);
> +		kcp_dev_remove(kcp);
> +		return -ENODEV;
> +	}
> +
> +	down_write(&kcp_list_lock);
> +	list_add(&kcp->list, &kcp_list_head);
> +	up_write(&kcp_list_lock);
> +
> +	return 0;
> +}
> +
> +static int
> +kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
> +{
> +	int ret = -EINVAL;
> +	struct kcp_dev *dev;
> +	struct kcp_dev *n;
> +	char name[RTE_KCP_NAMESIZE];
> +	unsigned int instance = ioctl_param;
> +
> +	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
> +
> +	down_write(&kcp_list_lock);
> +	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
> +		if (strncmp(dev->name, name, RTE_KCP_NAMESIZE) != 0)
> +			continue;
> +		kcp_dev_remove(dev);
> +		list_del(&dev->list);
> +		ret = 0;
> +		break;
> +	}
> +	up_write(&kcp_list_lock);
> +	KCP_INFO("%s release kcp named %s\n",
> +		(ret == 0 ? "Successfully" : "Unsuccessfully"), name);
> +
> +	return ret;
> +}
> +
> +static int
> +kcp_ioctl(struct inode *inode, unsigned int ioctl_num,
> +	unsigned long ioctl_param)
> +{
> +	int ret = -EINVAL;
> +
> +	KCP_DBG("IOCTL num=0x%0x param=0x%0lx\n", ioctl_num, ioctl_param);
> +
> +	/*
> +	 * Switch according to the ioctl called
> +	 */
> +	switch (_IOC_NR(ioctl_num)) {
> +	case _IOC_NR(RTE_KCP_IOCTL_TEST):
> +		/* For test only, not used */
> +		break;
> +	case _IOC_NR(RTE_KCP_IOCTL_CREATE):
> +		ret = kcp_ioctl_create(ioctl_num, ioctl_param);
> +		break;
> +	case _IOC_NR(RTE_KCP_IOCTL_RELEASE):
> +		ret = kcp_ioctl_release(ioctl_num, ioctl_param);
> +		break;
> +	default:
> +		KCP_DBG("IOCTL default\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +static int
> +kcp_compat_ioctl(struct inode *inode, unsigned int ioctl_num,
> +		unsigned long ioctl_param)
> +{
> +	/* 32 bits app on 64 bits OS to be supported later */
> +	KCP_PRINT("Not implemented.\n");
> +
> +	return -EINVAL;
> +}
> +
> +static const struct file_operations kcp_fops = {
> +	.owner = THIS_MODULE,
> +	.open = kcp_open,
> +	.release = kcp_release,
> +	.unlocked_ioctl = (void *)kcp_ioctl,
> +	.compat_ioctl = (void *)kcp_compat_ioctl,
> +};
> +
> +static struct miscdevice kcp_misc = {
> +	.minor = MISC_DYNAMIC_MINOR,
> +	.name = KCP_DEVICE,
> +	.fops = &kcp_fops,
> +};
> +
> +static int __init
> +kcp_init(void)
> +{
> +	KCP_PRINT("DPDK kcp module loading\n");
> +
> +	if (misc_register(&kcp_misc) != 0) {
> +		KCP_ERR("Misc registration failed\n");
> +		return -EPERM;
> +	}
> +
> +	/* Clear the bit of device in use */
> +	clear_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use);
> +
> +	KCP_PRINT("DPDK kcp module loaded\n");
> +
> +	return 0;
> +}
> +module_init(kcp_init);
> +
> +static void __exit
> +kcp_exit(void)
> +{
> +	misc_deregister(&kcp_misc);
> +	KCP_PRINT("DPDK kcp module unloaded\n");
> +}
> +module_exit(kcp_exit);
> +
> +MODULE_LICENSE("Dual BSD/GPL");
> +MODULE_AUTHOR("Intel Corporation");
> +MODULE_DESCRIPTION("Kernel Module for managing kcp devices");
> diff --git a/lib/librte_eal/linuxapp/kcp/kcp_net.c b/lib/librte_eal/linuxapp/kcp/kcp_net.c
> new file mode 100644
> index 0000000..9dacaaa
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kcp/kcp_net.c
> @@ -0,0 +1,209 @@
> +/*-
> + * GPL LICENSE SUMMARY
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *
> + *   This program is free software; you can redistribute it and/or modify
> + *   it under the terms of version 2 of the GNU General Public License as
> + *   published by the Free Software Foundation.
> + *
> + *   This program is distributed in the hope that it will be useful, but
> + *   WITHOUT ANY WARRANTY; without even the implied warranty of
> + *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *   General Public License for more details.
> + *
> + *   You should have received a copy of the GNU General Public License
> + *   along with this program;
> + *
> + *   The full GNU General Public License is included in this distribution
> + *   in the file called LICENSE.GPL.
> + *
> + *   Contact Information:
> + *   Intel Corporation
> + */
> +
> +/*
> + * This code is inspired from the book "Linux Device Drivers" by
> + * Alessandro Rubini and Jonathan Corbet, published by O'Reilly & Associates
> + */
> +
> +#include <linux/version.h>
> +#include <linux/etherdevice.h> /* eth_type_trans */
> +
> +#include "kcp_dev.h"
> +
> +/*
> + * Open and close
> + */
> +static int
> +kcp_net_open(struct net_device *dev)
> +{
> +	kcp_nl_exec(RTE_KCP_REQ_START_PORT, dev, NULL, 0, NULL, 0);
> +	netif_start_queue(dev);
> +	return 0;
> +}
> +
> +static int
> +kcp_net_release(struct net_device *dev)
> +{
> +	kcp_nl_exec(RTE_KCP_REQ_STOP_PORT, dev, NULL, 0, NULL, 0);
> +	netif_stop_queue(dev); /* can't transmit any more */
> +	return 0;
> +}
> +
> +/*
> + * Configuration changes (passed on by ifconfig)
> + */
> +static int
> +kcp_net_config(struct net_device *dev, struct ifmap *map)
> +{
> +	if (dev->flags & IFF_UP) /* can't act on a running interface */
> +		return -EBUSY;
> +
> +	/* ignore other fields */
> +	return 0;
> +}
> +
> +static int
> +kcp_net_change_mtu(struct net_device *dev, int new_mtu)
> +{
> +	int err;
> +
> +	KCP_DBG("kcp_net_change_mtu new mtu %d to be set\n", new_mtu);
> +	err = kcp_nl_exec(RTE_KCP_REQ_CHANGE_MTU, dev, &new_mtu, sizeof(int),
> +			NULL, 0);
> +
> +	if (err == 0)
> +		dev->mtu = new_mtu;
> +
> +	return err;
> +}
> +
> +/*
> + * Ioctl commands
> + */
> +static int
> +kcp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
> +{
> +	KCP_DBG("kcp_net_ioctl\n");
> +
> +	return 0;
> +}
> +
> +/*
> + * Return statistics to the caller
> + */
> +static struct  rtnl_link_stats64 *
> +kcp_net_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
> +{
> +	int err;
> +
> +	err = kcp_nl_exec(RTE_KCP_REQ_GET_STATS, dev, NULL, 0,
> +			stats, sizeof(struct rtnl_link_stats64));
> +
> +	return stats;
> +}
> +
> +/**
> + * kcp_net_set_mac - Change the Ethernet Address of the KCP NIC
> + * @dev: network interface device structure
> + * @p: pointer to an address structure
> + *
> + * Returns 0 on success, negative on failure
> + **/
> +static int
> +kcp_net_set_mac(struct net_device *dev, void *p)
> +{
> +	struct sockaddr *addr = p;
> +	int err;
> +
> +	if (!is_valid_ether_addr((unsigned char *)(addr->sa_data)))
> +		return -EADDRNOTAVAIL;
> +
> +	err = kcp_nl_exec(RTE_KCP_REQ_SET_MAC, dev, addr->sa_data,
> +			dev->addr_len, NULL, 0);
> +	if (err < 0)
> +		return -EADDRNOTAVAIL;
> +
> +	memcpy(dev->dev_addr, addr->sa_data, dev->addr_len);
> +	return 0;
> +}
> +
> +#if (KERNEL_VERSION(3, 9, 0) <= LINUX_VERSION_CODE)
> +static int
> +kcp_net_change_carrier(struct net_device *dev, bool new_carrier)
> +{
> +	if (new_carrier)
> +		netif_carrier_on(dev);
> +	else
> +		netif_carrier_off(dev);
> +	return 0;
> +}
> +#endif
> +
> +static const struct net_device_ops kcp_net_netdev_ops = {
> +	.ndo_open = kcp_net_open,
> +	.ndo_stop = kcp_net_release,
> +	.ndo_set_config = kcp_net_config,
> +	.ndo_change_mtu = kcp_net_change_mtu,
> +	.ndo_do_ioctl = kcp_net_ioctl,
> +	.ndo_get_stats64 = kcp_net_stats64,
> +	.ndo_set_mac_address = kcp_net_set_mac,
> +#if (KERNEL_VERSION(3, 9, 0) <= LINUX_VERSION_CODE)
> +	.ndo_change_carrier = kcp_net_change_carrier,
> +#endif
> +};
> +
> +/*
> + *  Fill the eth header
> + */
> +static int
> +kcp_net_header(struct sk_buff *skb, struct net_device *dev,
> +		unsigned short type, const void *daddr,
> +		const void *saddr, unsigned int len)
> +{
> +	struct ethhdr *eth = (struct ethhdr *) skb_push(skb, ETH_HLEN);
> +
> +	memcpy(eth->h_source, saddr ? saddr : dev->dev_addr, dev->addr_len);
> +	memcpy(eth->h_dest,   daddr ? daddr : dev->dev_addr, dev->addr_len);
> +	eth->h_proto = htons(type);
> +
> +	return dev->hard_header_len;
> +}
> +
> +/*
> + * Re-fill the eth header
> + */
> +#if (KERNEL_VERSION(4, 1, 0) > LINUX_VERSION_CODE)
> +static int
> +kcp_net_rebuild_header(struct sk_buff *skb)
> +{
> +	struct net_device *dev = skb->dev;
> +	struct ethhdr *eth = (struct ethhdr *) skb->data;
> +
> +	memcpy(eth->h_source, dev->dev_addr, dev->addr_len);
> +	memcpy(eth->h_dest, dev->dev_addr, dev->addr_len);
> +
> +	return 0;
> +}
> +#endif
> +
> +static const struct header_ops kcp_net_header_ops = {
> +	.create  = kcp_net_header,
> +#if (KERNEL_VERSION(4, 1, 0) > LINUX_VERSION_CODE)
> +	.rebuild = kcp_net_rebuild_header,
> +#endif
> +	.cache   = NULL,  /* disable caching */
> +};
> +
> +void
> +kcp_net_init(struct net_device *dev)
> +{
> +	KCP_DBG("kcp_net_init\n");
> +
> +	ether_setup(dev); /* assign some of the fields */
> +	dev->netdev_ops      = &kcp_net_netdev_ops;
> +	dev->header_ops      = &kcp_net_header_ops;
> +
> +	dev->flags |= IFF_UP;
> +}
> diff --git a/lib/librte_eal/linuxapp/kcp/kcp_nl.c b/lib/librte_eal/linuxapp/kcp/kcp_nl.c
> new file mode 100644
> index 0000000..e989d2d
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/kcp/kcp_nl.c
> @@ -0,0 +1,194 @@
> +/*-
> + * GPL LICENSE SUMMARY
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *
> + *   This program is free software; you can redistribute it and/or modify
> + *   it under the terms of version 2 of the GNU General Public License as
> + *   published by the Free Software Foundation.
> + *
> + *   This program is distributed in the hope that it will be useful, but
> + *   WITHOUT ANY WARRANTY; without even the implied warranty of
> + *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *   General Public License for more details.
> + *
> + *   You should have received a copy of the GNU General Public License
> + *   along with this program;
> + *   The full GNU General Public License is included in this distribution
> + *   in the file called LICENSE.GPL.
> + *
> + *   Contact Information:
> + *   Intel Corporation
> + */
> +
> +#include <net/sock.h>
> +
> +#include "kcp_dev.h"
> +
> +#define KCP_NL_GRP 31
> +
> +#define KCP_ETHTOOL_MSG_LEN 500
> +struct kcp_ethtool_msg {
> +	int cmd_id;
> +	int port_id;
> +	char input_buffer[KCP_ETHTOOL_MSG_LEN];
> +	char output_buffer[KCP_ETHTOOL_MSG_LEN];
> +	int input_buf_len;
> +	int output_buf_len;
> +	int err;
> +};
> +
> +static struct ethtool_input_buffer {
> +	int magic;
> +	void *buffer;
> +	int length;
> +	struct completion *msg_received;
> +	int *err;
> +} ethtool_input_buffer;
> +
> +static struct sock *nl_sock;
> +static int pid __read_mostly = -1;
> +static struct mutex sync_lock;
> +
> +static int
> +kcp_input_buffer_register(int magic, void *buffer, int length,
> +		struct completion *msg_received, int *err)
> +{
> +	if (ethtool_input_buffer.buffer == NULL) {
> +		ethtool_input_buffer.magic = magic;
> +		ethtool_input_buffer.buffer = buffer;
> +		ethtool_input_buffer.length = length;
> +		ethtool_input_buffer.msg_received = msg_received;
> +		ethtool_input_buffer.err = err;
> +		return 0;
> +	}
> +
> +	return 1;
> +}
> +
> +static void
> +kcp_input_buffer_unregister(int magic)
> +{
> +	if (ethtool_input_buffer.buffer != NULL) {
> +		if (magic == ethtool_input_buffer.magic) {
> +			ethtool_input_buffer.magic = -1;
> +			ethtool_input_buffer.buffer = NULL;
> +			ethtool_input_buffer.length = 0;
> +			ethtool_input_buffer.msg_received = NULL;
> +			ethtool_input_buffer.err = NULL;
> +		}
> +	}
> +}
> +
> +static void
> +nl_recv(struct sk_buff *skb)
> +{
> +	struct nlmsghdr *nlh;
> +	struct kcp_ethtool_msg ethtool_msg;
> +
> +	nlh = (struct nlmsghdr *)skb->data;
> +	if (pid < 0) {
> +		pid = nlh->nlmsg_pid;
> +		KCP_INFO("PID: %d\n", pid);
> +		return;
> +	} else if (pid != nlh->nlmsg_pid) {
> +		KCP_INFO("Message from unexpected peer: %d\n", nlh->nlmsg_pid);
> +		return;
> +	}
> +
> +	memcpy(&ethtool_msg, NLMSG_DATA(nlh), sizeof(struct kcp_ethtool_msg));
> +	KCP_DBG("CMD: %d\n", ethtool_msg.cmd_id);
> +
> +	if (ethtool_input_buffer.magic > 0) {
> +		if (ethtool_input_buffer.buffer != NULL) {
> +			memcpy(ethtool_input_buffer.buffer,
> +					&ethtool_msg.output_buffer,
> +					ethtool_input_buffer.length);
> +		}
> +		*ethtool_input_buffer.err = ethtool_msg.err;
> +		complete(ethtool_input_buffer.msg_received);
> +		kcp_input_buffer_unregister(ethtool_input_buffer.magic);
> +	}
> +}
> +
> +static int
> +kcp_nl_send(int cmd_id, int port_id, void *input_buffer, int input_buf_len)
> +{
> +	struct sk_buff *skb;
> +	struct nlmsghdr *nlh;
> +	struct kcp_ethtool_msg ethtool_msg;
> +
> +	memset(&ethtool_msg, 0, sizeof(struct kcp_ethtool_msg));
> +	ethtool_msg.cmd_id = cmd_id;
> +	ethtool_msg.port_id = port_id;
> +
> +	if (input_buffer) {
> +		if (input_buf_len == 0 || input_buf_len > KCP_ETHTOOL_MSG_LEN)
> +			return -EINVAL;
> +		ethtool_msg.input_buf_len = input_buf_len;
> +		memcpy(ethtool_msg.input_buffer, input_buffer, input_buf_len);
> +	}
> +
> +	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct kcp_ethtool_msg)),
> +			GFP_ATOMIC);
> +	nlh = nlmsg_put(skb, 0, 0, NLMSG_DONE, sizeof(struct kcp_ethtool_msg),
> +			0);
> +
> +	NETLINK_CB(skb).dst_group = 0;
> +
> +	memcpy(nlmsg_data(nlh), &ethtool_msg, sizeof(struct kcp_ethtool_msg));
> +
> +	nlmsg_unicast(nl_sock, skb, pid);
> +	KCP_DBG("Sent cmd:%d port:%d\n", cmd_id, port_id);
> +
> +	/*nlmsg_free(skb);*/
> +
> +	return 0;
> +}
> +
> +int
> +kcp_nl_exec(int cmd, struct net_device *dev, void *in_data, int in_len,
> +		void *out_data, int out_len)
> +{
> +	struct kcp_dev *priv = netdev_priv(dev);
> +	int err = -EINVAL;
> +	int ret;
> +
> +	mutex_lock(&sync_lock);
> +	ret = kcp_input_buffer_register(cmd, out_data, out_len,
> +			&priv->msg_received, &err);
> +	if (ret) {
> +		mutex_unlock(&sync_lock);
> +		return -EINVAL;
> +	}
> +
> +	kcp_nl_send(cmd, priv->port_id, in_data, in_len);
> +	ret = wait_for_completion_interruptible_timeout(&priv->msg_received,
> +			 msecs_to_jiffies(10));
> +	if (ret == 0 || err < 0) {
> +		kcp_input_buffer_unregister(ethtool_input_buffer.magic);
> +		mutex_unlock(&sync_lock);
> +		return ret == 0 ? -EINVAL : err;
> +	}
> +	mutex_unlock(&sync_lock);
> +
> +	return 0;
> +}
> +
> +static struct netlink_kernel_cfg cfg = {
> +	.input = nl_recv,
> +};
> +
> +void
> +kcp_nl_init(void)
> +{
> +	nl_sock = netlink_kernel_create(&init_net, KCP_NL_GRP, &cfg);
> +	mutex_init(&sync_lock);
> +}
> +
> +void
> +kcp_nl_release(void)
> +{
> +	netlink_kernel_release(nl_sock);
> +	pid = -1;
> +}
  
Ferruh Yigit Feb. 28, 2016, 8:16 p.m. UTC | #3
On 2/28/2016 3:34 PM, Avi Kivity wrote:
> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>> This kernel module is based on KNI module, but this one is stripped
>> version of it and only for control messages, no data transfer
>> functionality provided.
>>
>> This Linux kernel module helps userspace application create virtual
>> interfaces and when a control command issued into that virtual
>> interface, module pushes the command to the userspace and gets the
>> response back for the caller application.
>>
>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>> interfaces but not ones for related data, like tcpdump.
>>
>> In long term this patch intends to replace the KNI and KNI will be
>> depreciated.
> 
> Instead of adding yet another out-of-tree kernel module, why not extend
> the existing in-tree tap driver?  This will make everyone's life easier.
> 
> Since tap also supports data transfer, an application can also forward
> packets not intended to it to the kernel, and forward packets from the
> kernel through the device.
> 
Hi Avi,

KDP (Kernel Data Path) does what you have described: it is implemented
as a PMD and uses the tap driver to transfer data through the kernel.
It also supports a custom kernel module for better performance.

For KCP (Kernel Control Path), the network driver forwards control
commands to the userspace driver. I doubt this is wanted in the tun/tap
driver, so extending tun/tap like this could be hard to upstream.

We are investigating adding native KCP support to the Linux kernel, but
no work has started on this yet. Any support is welcome.

Thanks,
ferruh
  
Avi Kivity Feb. 29, 2016, 9:43 a.m. UTC | #4
On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>> This kernel module is based on KNI module, but this one is stripped
>>> version of it and only for control messages, no data transfer
>>> functionality provided.
>>>
>>> This Linux kernel module helps userspace application create virtual
>>> interfaces and when a control command issued into that virtual
>>> interface, module pushes the command to the userspace and gets the
>>> response back for the caller application.
>>>
>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>> interfaces but not ones for related data, like tcpdump.
>>>
>>> In long term this patch intends to replace the KNI and KNI will be
>>> depreciated.
>> Instead of adding yet another out-of-tree kernel module, why not extend
>> the existing in-tree tap driver?  This will make everyone's life easier.
>>
>> Since tap also supports data transfer, an application can also forward
>> packets not intended to it to the kernel, and forward packets from the
>> kernel through the device.
>>
> Hi Avi,
>
> KDP (Kernel Data Path) does what you have described, it is implemented
> as PMD and it benefits from tap driver to data transfer through the
> kernel. It also support custom kernel module for better performance.
>
> For KCP (Kernel Control Path), network driver forwards control commands
> to the userspace driver, I doubt this is something wanted for tun/tap
> driver, so extending tun/tap driver like this can be hard to upstream.

Have you tried asking?  Maybe if you explain it they will be open to the 
extension.

Certainly it will be better to have KCP and KDP use the same kernel 
interface name; so we'll need to either add data path support to kcp 
(causing duplication with tap), or add control path support to tap. I 
think the latter is preferable.

> We are investigating about adding a native support to Linux kernel for
> KCP, but there is no task started for this right now, any support is
> welcome.
>
>
  
Ferruh Yigit Feb. 29, 2016, 10:43 a.m. UTC | #5
On 2/29/2016 9:43 AM, Avi Kivity wrote:
> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>> This kernel module is based on KNI module, but this one is stripped
>>>> version of it and only for control messages, no data transfer
>>>> functionality provided.
>>>>
>>>> This Linux kernel module helps userspace application create virtual
>>>> interfaces and when a control command issued into that virtual
>>>> interface, module pushes the command to the userspace and gets the
>>>> response back for the caller application.
>>>>
>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>> interfaces but not ones for related data, like tcpdump.
>>>>
>>>> In long term this patch intends to replace the KNI and KNI will be
>>>> depreciated.
>>> Instead of adding yet another out-of-tree kernel module, why not extend
>>> the existing in-tree tap driver?  This will make everyone's life easier.
>>>
>>> Since tap also supports data transfer, an application can also forward
>>> packets not intended to it to the kernel, and forward packets from the
>>> kernel through the device.
>>>
>> Hi Avi,
>>
>> KDP (Kernel Data Path) does what you have described, it is implemented
>> as PMD and it benefits from tap driver to data transfer through the
>> kernel. It also support custom kernel module for better performance.
>>
>> For KCP (Kernel Control Path), network driver forwards control commands
>> to the userspace driver, I doubt this is something wanted for tun/tap
>> driver, so extending tun/tap driver like this can be hard to upstream.
> 
> Have you tried asking?  Maybe if you explain it they will be open to the
> extension.
> 
I have not asked, but tun/tap already does something different.
For KCP, the created interface is a mapping of the DPDK port. Everything
the interface shows comes from the DPDK port. For example, if you get
stats with ifconfig, the values you observe are DPDK port statistics:
not statistics of data passed between userspace and kernelspace, but
statistics of data forwarded between DPDK ports. If you bring the
interface down, the DPDK port is stopped, and so on.
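
To make the stats case concrete, here is a minimal userspace sketch of
answering a stats request from per-port counters. The message layout is
struct kcp_ethtool_msg as defined in kcp_nl.c of this patch. Everything
prefixed demo_/DEMO_ is a hypothetical name used only for illustration,
and the counter struct is a stand-in: kcp_net_stats64() copies out
sizeof(struct rtnl_link_stats64) bytes, so a real reply would presumably
use that layout.

```c
#include <stdint.h>
#include <string.h>

#define KCP_ETHTOOL_MSG_LEN 500	/* same size as in kcp_nl.c */

/* Same layout as struct kcp_ethtool_msg in kcp_nl.c. */
struct kcp_ethtool_msg {
	int cmd_id;
	int port_id;
	char input_buffer[KCP_ETHTOOL_MSG_LEN];
	char output_buffer[KCP_ETHTOOL_MSG_LEN];
	int input_buf_len;
	int output_buf_len;
	int err;
};

/* Hypothetical value; the real request ids live in rte_kcp_common.h. */
enum { DEMO_REQ_GET_STATS = 1 };

/* Hypothetical per-port counters kept by the userspace application. */
struct demo_port_stats {
	uint64_t rx_packets;
	uint64_t tx_packets;
	uint64_t rx_bytes;
	uint64_t tx_bytes;
};

/*
 * Fill in the reply fields of one request in place.
 * Returns 0 when the request was handled, -1 otherwise;
 * msg->err carries the same result back to the kernel side.
 */
int
demo_handle_request(struct kcp_ethtool_msg *msg,
		const struct demo_port_stats *ports, int nb_ports)
{
	if (msg->port_id < 0 || msg->port_id >= nb_ports) {
		msg->err = -1;
		return -1;
	}

	switch (msg->cmd_id) {
	case DEMO_REQ_GET_STATS:
		/* The reply carries the port counters, not tap traffic. */
		memcpy(msg->output_buffer, &ports[msg->port_id],
				sizeof(ports[0]));
		msg->output_buf_len = (int)sizeof(ports[0]);
		msg->err = 0;
		return 0;
	default:
		msg->err = -1;
		return -1;
	}
}
```

The point of the sketch is only that the numbers reported through the
interface originate in the application's own port counters.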

If you extend tun/tap, the interface won't be a mapping of the DPDK
port. If you then get statistics from that interface, what do you
expect to see: the data transferred between kernel and userspace, or
the underlying DPDK port forwarding statistics?

Extending tun/tap the way we want, forwarding all control commands to
userspace, will break the current tun/tap. This doesn't look like a
valid option to me.

For the data path, using tun/tap is OK and we are already doing it. For
the control path I believe we need a new driver.

> Certainly it will be better to have KCP and KDP use the same kernel
> interface name; so we'll need to either add data path support to kcp
> (causing duplication with tap), or add control path support to tap. I
> think the latter is preferable.
> 
Why is it better to have the same interface? Anyone who is not
interested in the kernel data path may still want to control DPDK ports
using common tools, or to get some basic information and stats using
ethtool or ifconfig. Why do we need to bind two different
functionalities together?

>> We are investigating about adding a native support to Linux kernel for
>> KCP, but there is no task started for this right now, any support is
>> welcome.
>>
>>
>
  
Avi Kivity Feb. 29, 2016, 10:58 a.m. UTC | #6
On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>> version of it and only for control messages, no data transfer
>>>>> functionality provided.
>>>>>
>>>>> This Linux kernel module helps userspace application create virtual
>>>>> interfaces and when a control command issued into that virtual
>>>>> interface, module pushes the command to the userspace and gets the
>>>>> response back for the caller application.
>>>>>
>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>
>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>> depreciated.
>>>> Instead of adding yet another out-of-tree kernel module, why not extend
>>>> the existing in-tree tap driver?  This will make everyone's life easier.
>>>>
>>>> Since tap also supports data transfer, an application can also forward
>>>> packets not intended to it to the kernel, and forward packets from the
>>>> kernel through the device.
>>>>
>>> Hi Avi,
>>>
>>> KDP (Kernel Data Path) does what you have described, it is implemented
>>> as PMD and it benefits from tap driver to data transfer through the
>>> kernel. It also support custom kernel module for better performance.
>>>
>>> For KCP (Kernel Control Path), network driver forwards control commands
>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>> driver, so extending tun/tap driver like this can be hard to upstream.
>> Have you tried asking?  Maybe if you explain it they will be open to the
>> extension.
>>
> Not communicated but tun/tap already doing something different.
> For KCP, created interface is map of the DPDK port. All data interface
> shows coming from DPDK port. For example if you get stats information
> with ifconfig, the values you observe are DPDK port statistics -not
> statistics of data between userspace and kernelspace, statistics of data
> forwarded between DPDK ports. If you down the interface, DPDK port
> stopped, etc...
>
> If you extend the tun/tap, it won't be map of the DPDK port, and if you
> get statistics information from that interface, what do you expect to
> see, the data transferred between kernel and userspace, or underlying
> DPDK port forwarding statistics?

Good point.  But you really have to involve netdev on this, or you'll 
live out-of-tree forever.

> Extending tun/tap in a way we want, forwarding all control commands to
> userspace, will break the current tun/tap, this doesn't looks like a
> valid option to me.

It's possible to enhance it while preserving backwards compatibility, by 
enabling a feature flag (statistics from userspace).
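
As a rough illustration of the opt-in idea, a device could keep its
current behaviour unless its owner sets a feature bit. This is only a
sketch: DEMO_F_USER_STATS, struct demo_dev and demo_rx_packets() are
hypothetical names, and tun/tap defines no such flag today.

```c
#include <stdint.h>

/* Hypothetical opt-in bit; tun/tap has no such flag. */
#define DEMO_F_USER_STATS (1u << 0)

struct demo_dev {
	unsigned int features;		/* opt-in bits set by the owner */
	uint64_t local_rx_packets;	/* counted by the device itself */
	uint64_t user_rx_packets;	/* last value reported by userspace */
};

/* Existing users never set the bit, so they keep the old behaviour. */
uint64_t
demo_rx_packets(const struct demo_dev *dev)
{
	if (dev->features & DEMO_F_USER_STATS)
		return dev->user_rx_packets;
	return dev->local_rx_packets;
}
```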

> For data path, using tun/tap is OK and we are already doing it, for the
> control path I believe we need a new driver.
>
>> Certainly it will be better to have KCP and KDP use the same kernel
>> interface name; so we'll need to either add data path support to kcp
>> (causing duplication with tap), or add control path support to tap. I
>> think the latter is preferable.
>>
> Why it is better to have same interface? Anyone who is not interested
> with kernel data path may want to control DPDK ports using common tools,
> or want to get some basic information and stats using ethtool or
> ifconfig. Why we need to bind two different functionality together?

Having two interfaces will be confusing for the user.  If I wish to 
firewall data packets coming from the dpdk port, do I set firewall rules 
on dpdk0 or tap0?

I don't think it matters whether you extend tap, or add a data path to 
kcp, but if you want to upstream it, it needs to be blessed by netdev.

>
>>> We are investigating about adding a native support to Linux kernel for
>>> KCP, but there is no task started for this right now, any support is
>>> welcome.
>>>
>>>
  
Thomas Monjalon Feb. 29, 2016, 11:06 a.m. UTC | #7
Hi,
I totally agree with Avi's comments.
This topic is really important for the future of DPDK.
So I think we must give some time to continue the discussion
and have netdev involved in the choices made.
As a consequence, this series should not be merged in release 16.04.
Thanks for continuing the work.


2016-02-29 12:58, Avi Kivity:
> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
> > On 2/29/2016 9:43 AM, Avi Kivity wrote:
> >> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
> >>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
> >>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
> >>>>> This kernel module is based on KNI module, but this one is stripped
> >>>>> version of it and only for control messages, no data transfer
> >>>>> functionality provided.
> >>>>>
> >>>>> This Linux kernel module helps userspace application create virtual
> >>>>> interfaces and when a control command issued into that virtual
> >>>>> interface, module pushes the command to the userspace and gets the
> >>>>> response back for the caller application.
> >>>>>
> >>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
> >>>>> interfaces but not ones for related data, like tcpdump.
> >>>>>
> >>>>> In long term this patch intends to replace the KNI and KNI will be
> >>>>> depreciated.
> >>>> Instead of adding yet another out-of-tree kernel module, why not extend
> >>>> the existing in-tree tap driver?  This will make everyone's life easier.
> >>>>
> >>>> Since tap also supports data transfer, an application can also forward
> >>>> packets not intended to it to the kernel, and forward packets from the
> >>>> kernel through the device.
> >>>>
> >>> Hi Avi,
> >>>
> >>> KDP (Kernel Data Path) does what you have described, it is implemented
> >>> as PMD and it benefits from tap driver to data transfer through the
> >>> kernel. It also support custom kernel module for better performance.
> >>>
> >>> For KCP (Kernel Control Path), network driver forwards control commands
> >>> to the userspace driver, I doubt this is something wanted for tun/tap
> >>> driver, so extending tun/tap driver like this can be hard to upstream.
> >> Have you tried asking?  Maybe if you explain it they will be open to the
> >> extension.
> >>
> > Not communicated but tun/tap already doing something different.
> > For KCP, created interface is map of the DPDK port. All data interface
> > shows coming from DPDK port. For example if you get stats information
> > with ifconfig, the values you observe are DPDK port statistics -not
> > statistics of data between userspace and kernelspace, statistics of data
> > forwarded between DPDK ports. If you down the interface, DPDK port
> > stopped, etc...
> >
> > If you extend the tun/tap, it won't be map of the DPDK port, and if you
> > get statistics information from that interface, what do you expect to
> > see, the data transferred between kernel and userspace, or underlying
> > DPDK port forwarding statistics?
> 
> Good point.  But you really have to involve netdev on this, or you'll 
> live out-of-tree forever.

+1

> > Extending tun/tap in a way we want, forwarding all control commands to
> > userspace, will break the current tun/tap, this doesn't looks like a
> > valid option to me.
> 
> It's possible to enhance it while preserving backwards compatibility, by 
> enabling a feature flag (statistics from userspace).

+1
 
> > For data path, using tun/tap is OK and we are already doing it, for the
> > control path I believe we need a new driver.
> >
> >> Certainly it will be better to have KCP and KDP use the same kernel
> >> interface name; so we'll need to either add data path support to kcp
> >> (causing duplication with tap), or add control path support to tap. I
> >> think the latter is preferable.
> >>
> > Why it is better to have same interface? Anyone who is not interested
> > with kernel data path may want to control DPDK ports using common tools,
> > or want to get some basic information and stats using ethtool or
> > ifconfig. Why we need to bind two different functionality together?
> 
> Having two interfaces will be confusing for the user.  If I wish to 
> firewall data packets coming from the dpdk port, do I set firewall rules 
> on dpdk0 or tap0?

+1
 
> I don't think it matters whether you extend tap, or add a data path to 
> kcp, but if you want to upstream it, it needs to be blessed by netdev.

+1

> >>> We are investigating about adding a native support to Linux kernel for
> >>> KCP, but there is no task started for this right now, any support is
> >>> welcome.
  
Ferruh Yigit Feb. 29, 2016, 11:27 a.m. UTC | #8
On 2/29/2016 10:58 AM, Avi Kivity wrote:
> 
> 
> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
>> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>>> version of it and only for control messages, no data transfer
>>>>>> functionality provided.
>>>>>>
>>>>>> This Linux kernel module helps userspace application create virtual
>>>>>> interfaces and when a control command issued into that virtual
>>>>>> interface, module pushes the command to the userspace and gets the
>>>>>> response back for the caller application.
>>>>>>
>>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>>
>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>> depreciated.
>>>>> Instead of adding yet another out-of-tree kernel module, why not
>>>>> extend
>>>>> the existing in-tree tap driver?  This will make everyone's life
>>>>> easier.
>>>>>
>>>>> Since tap also supports data transfer, an application can also forward
>>>>> packets not intended to it to the kernel, and forward packets from the
>>>>> kernel through the device.
>>>>>
>>>> Hi Avi,
>>>>
>>>> KDP (Kernel Data Path) does what you have described, it is implemented
>>>> as PMD and it benefits from tap driver to data transfer through the
>>>> kernel. It also support custom kernel module for better performance.
>>>>
>>>> For KCP (Kernel Control Path), network driver forwards control commands
>>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>>> driver, so extending tun/tap driver like this can be hard to upstream.
>>> Have you tried asking?  Maybe if you explain it they will be open to the
>>> extension.
>>>
>> Not communicated but tun/tap already doing something different.
>> For KCP, created interface is map of the DPDK port. All data interface
>> shows coming from DPDK port. For example if you get stats information
>> with ifconfig, the values you observe are DPDK port statistics -not
>> statistics of data between userspace and kernelspace, statistics of data
>> forwarded between DPDK ports. If you down the interface, DPDK port
>> stopped, etc...
>>
>> If you extend the tun/tap, it won't be map of the DPDK port, and if you
>> get statistics information from that interface, what do you expect to
>> see, the data transferred between kernel and userspace, or underlying
>> DPDK port forwarding statistics?
> 
> Good point.  But you really have to involve netdev on this, or you'll
> live out-of-tree forever.
> 
Why do we need to touch netdev?

A simple network driver, similar to kcp, can be the solution.

This driver implements all net_device_ops and ethtool_ops so that
everything is forwarded to userspace via netlink. All it needs to know
about the userspace driver is its unique id. Any userspace application,
not only DPDK drivers, can listen for the netlink messages and respond
to the requests addressed to it.

This kind of driver is not big or complicated; kcp already does 90% of
what is described above.
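
For reference, the userspace end of such a netlink channel could look
roughly like the sketch below. It uses only the standard AF_NETLINK
socket calls; protocol number 31 is the KCP_NL_GRP value from kcp_nl.c
in this patch, and the demo_* helpers are hypothetical names. In the
posted kcp_nl.c the kernel side records the sender of the first message
it receives as its peer, so the application has to send one message
before it can receive requests.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <string.h>
#include <unistd.h>

#define KCP_NL_GRP 31	/* netlink protocol number used by kcp_nl.c */

/* Open a netlink socket bound to this process; returns fd or -1. */
int
demo_nl_open(void)
{
	struct sockaddr_nl src;
	int fd;

	fd = socket(AF_NETLINK, SOCK_RAW, KCP_NL_GRP);
	if (fd < 0)
		return -1;

	memset(&src, 0, sizeof(src));
	src.nl_family = AF_NETLINK;
	src.nl_pid = (unsigned int)getpid();

	if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Build a netlink header followed by the payload in buf, which must be
 * suitably aligned for struct nlmsghdr. Returns the number of bytes to
 * send, or 0 if buf is too small.
 */
size_t
demo_nl_build(void *buf, size_t buf_len, const void *payload,
		size_t payload_len, unsigned int pid)
{
	struct nlmsghdr *nlh = buf;

	if (buf_len < NLMSG_SPACE(payload_len))
		return 0;

	memset(buf, 0, NLMSG_SPACE(payload_len));
	nlh->nlmsg_len = NLMSG_LENGTH(payload_len);
	nlh->nlmsg_pid = pid;	/* the kernel side identifies its peer by this */
	if (payload_len)
		memcpy(NLMSG_DATA(nlh), payload, payload_len);

	return NLMSG_SPACE(payload_len);
}

/* Send a built message to the kernel (nl_pid 0). Returns 0 or -1. */
int
demo_nl_send(int fd, const void *buf, size_t len)
{
	struct sockaddr_nl dst;

	memset(&dst, 0, sizeof(dst));
	dst.nl_family = AF_NETLINK;	/* nl_pid stays 0: the kernel */

	if (sendto(fd, buf, len, 0, (struct sockaddr *)&dst,
			sizeof(dst)) < 0)
		return -1;
	return 0;
}
```

An application would call demo_nl_open() once, send an initial message
built with demo_nl_build(), and then loop on recv(), answering each
request with the same two helpers.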

>> Extending tun/tap in a way we want, forwarding all control commands to
>> userspace, will break the current tun/tap, this doesn't looks like a
>> valid option to me.
> 
> It's possible to enhance it while preserving backwards compatibility, by
> enabling a feature flag (statistics from userspace).
> 
>> For data path, using tun/tap is OK and we are already doing it, for the
>> control path I believe we need a new driver.
>>
>>> Certainly it will be better to have KCP and KDP use the same kernel
>>> interface name; so we'll need to either add data path support to kcp
>>> (causing duplication with tap), or add control path support to tap. I
>>> think the latter is preferable.
>>>
>> Why it is better to have same interface? Anyone who is not interested
>> with kernel data path may want to control DPDK ports using common tools,
>> or want to get some basic information and stats using ethtool or
>> ifconfig. Why we need to bind two different functionality together?
> 
> Having two interfaces will be confusing for the user.  If I wish to
> firewall data packets coming from the dpdk port, do I set firewall rules
> on dpdk0 or tap0?
> 
Agreed that it is confusing to have two interfaces.

I think if the user wants to use both data and control paths, a way can
be found to end up with a single interface, using module params or
something else. Two different drivers for data and control do not
conflict with each other and can cooperate.
But to work on this, both KCP and KDP should go in first.

> I don't think it matters whether you extend tap, or add a data path to
> kcp, but if you want to upstream it, it needs to be blessed by netdev.
> 
I still think it is not a good idea to merge these two, because they may
be used independently, but we can improve how they work together.

>>
>>>> We are investigating about adding a native support to Linux kernel for
>>>> KCP, but there is no task started for this right now, any support is
>>>> welcome.
>>>>
>>>>
>
  
Ferruh Yigit Feb. 29, 2016, 11:35 a.m. UTC | #9
On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
> Hi,
> I totally agree with Avi's comments.
> This topic is really important for the future of DPDK.
> So I think we must give some time to continue the discussion
> and have netdev involved in the choices done.
> As a consequence, these series should not be merged in the release 16.04.
> Thanks for continuing the work.
> 
Hi Thomas,

It is great to have some discussion and feedback.
But I doubt that not merging in this release will lead to more
discussion.

It is better to have them in this release and let people experiment
with them; this gives a better chance of a good discussion.

These features are a replacement for KNI, and KNI is not intended to be
removed in this release, so those who use KNI can continue to do so and
can also test KCP/KDP, which will get us more feedback.

Thanks,
ferruh
  
Avi Kivity Feb. 29, 2016, 11:39 a.m. UTC | #10
On 02/29/2016 01:27 PM, Ferruh Yigit wrote:
> On 2/29/2016 10:58 AM, Avi Kivity wrote:
>>
>> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
>>> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>>>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>>>> version of it and only for control messages, no data transfer
>>>>>>> functionality provided.
>>>>>>>
>>>>>>> This Linux kernel module helps userspace application create virtual
>>>>>>> interfaces and when a control command issued into that virtual
>>>>>>> interface, module pushes the command to the userspace and gets the
>>>>>>> response back for the caller application.
>>>>>>>
>>>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>>>
>>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>>> depreciated.
>>>>>> Instead of adding yet another out-of-tree kernel module, why not
>>>>>> extend
>>>>>> the existing in-tree tap driver?  This will make everyone's life
>>>>>> easier.
>>>>>>
>>>>>> Since tap also supports data transfer, an application can also forward
>>>>>> packets not intended to it to the kernel, and forward packets from the
>>>>>> kernel through the device.
>>>>>>
>>>>> Hi Avi,
>>>>>
>>>>> KDP (Kernel Data Path) does what you have described, it is implemented
>>>>> as PMD and it benefits from tap driver to data transfer through the
>>>>> kernel. It also support custom kernel module for better performance.
>>>>>
>>>>> For KCP (Kernel Control Path), network driver forwards control commands
>>>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>>>> driver, so extending tun/tap driver like this can be hard to upstream.
>>>> Have you tried asking?  Maybe if you explain it they will be open to the
>>>> extension.
>>>>
>>> Not communicated but tun/tap already doing something different.
>>> For KCP, created interface is map of the DPDK port. All data interface
>>> shows coming from DPDK port. For example if you get stats information
>>> with ifconfig, the values you observe are DPDK port statistics -not
>>> statistics of data between userspace and kernelspace, statistics of data
>>> forwarded between DPDK ports. If you down the interface, DPDK port
>>> stopped, etc...
>>>
>>> If you extend the tun/tap, it won't be map of the DPDK port, and if you
>>> get statistics information from that interface, what do you expect to
>>> see, the data transferred between kernel and userspace, or underlying
>>> DPDK port forwarding statistics?
>> Good point.  But you really have to involve netdev on this, or you'll
>> live out-of-tree forever.
>>
> Why do we need to touch netdev?

By netdev, I meant the mailing list.  If you don't touch it, your driver 
will remain out-of-tree forever.

> A simple network driver, similar to kcp, can be solution.
>
> This driver implements all net_device_ops and ethtool_ops in a way to
> forward everything to the userspace via netlink. All needs to know about
> userspace driver is it's unique id. Any userspace application, not only
> DPDK drivers, can listen the netlink messages and response to the
> requests come to itself.
>
> This kind of driver is not big or complicated, kcp already does %90 of
> what described above.

I am not arguing against kcp.  It fulfills an important need.  This is 
my argument:

1. having multiple interfaces for the control and data path is bad for 
the user
2. therefore, we need to either add tap functionality to kcp, or add kcp 
functionality to tap
3. netdev@ is more likely (IMO) to accept additional functionality to 
tap than a new driver, but the only way to know is to engage with them

>
>>> Extending tun/tap in a way we want, forwarding all control commands to
>>> userspace, will break the current tun/tap, this doesn't looks like a
>>> valid option to me.
>> It's possible to enhance it while preserving backwards compatibility, by
>> enabling a feature flag (statistics from userspace).
>>
>>> For data path, using tun/tap is OK and we are already doing it, for the
>>> control path I believe we need a new driver.
>>>
>>>> Certainly it will be better to have KCP and KDP use the same kernel
>>>> interface name; so we'll need to either add data path support to kcp
>>>> (causing duplication with tap), or add control path support to tap. I
>>>> think the latter is preferable.
>>>>
>>> Why it is better to have same interface? Anyone who is not interested
>>> with kernel data path may want to control DPDK ports using common tools,
>>> or want to get some basic information and stats using ethtool or
>>> ifconfig. Why we need to bind two different functionality together?
>> Having two interfaces will be confusing for the user.  If I wish to
>> firewall data packets coming from the dpdk port, do I set firewall rules
>> on dpdk0 or tap0?
>>
> Agreed that it is confusing to have two interfaces.
>
> I think if user wants to use both data and control paths, a way can be
> found to end up with single interface, using module params or something
> else. Two different drivers for data and control not conflict with each
> other and can cooperate.

Module parameters are for module-wide options.  A single module can 
support multiple interfaces, so module parameters don't apply.

Let's make it simple for the users, even if it is more complex for us 
(and by us, I mean you, unfortunately).

> But to work on this first both KCP and KDP should go in.

Go in where?  It doesn't help if it goes into dpdk.git and then netdev@ 
rejects it.

>
>> I don't think it matters whether you extend tap, or add a data path to
>> kcp, but if you want to upstream it, it needs to be blessed by netdev.
>>
> I still think not good idea to merge these two, because they may be used
> independently, but we can improve how they work together.
>

From a developer's perspective, maybe so, but from a user's 
perspective, there should be exactly one interface for the functionality 
exposed by kcp and kdp.
  
Jay Rolette Feb. 29, 2016, 2:33 p.m. UTC | #11
On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> Hi,
> I totally agree with Avi's comments.
> This topic is really important for the future of DPDK.
> So I think we must give some time to continue the discussion
> and have netdev involved in the choices done.
> As a consequence, these series should not be merged in the release 16.04.
> Thanks for continuing the work.
>

I know you guys are very interested in getting rid of the out-of-tree
drivers, but please do not block incremental improvements to DPDK in the
meantime. Ferruh's patch improves the usability of KNI. Don't throw out
good and useful enhancements just because it isn't where you want to be in
the end.

I'd like to see these be merged.

Jay
  
Ferruh Yigit Feb. 29, 2016, 2:35 p.m. UTC | #12
On 2/29/2016 11:39 AM, Avi Kivity wrote:
> 
> 
> On 02/29/2016 01:27 PM, Ferruh Yigit wrote:
>> On 2/29/2016 10:58 AM, Avi Kivity wrote:
>>>
>>> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
>>>> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>>>>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>>>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>>>>> version of it and only for control messages, no data transfer
>>>>>>>> functionality provided.
>>>>>>>>
>>>>>>>> This Linux kernel module helps userspace application create virtual
>>>>>>>> interfaces and when a control command issued into that virtual
>>>>>>>> interface, module pushes the command to the userspace and gets the
>>>>>>>> response back for the caller application.
>>>>>>>>
>>>>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>>>>
>>>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>>>> depreciated.
>>>>>>> Instead of adding yet another out-of-tree kernel module, why not
>>>>>>> extend
>>>>>>> the existing in-tree tap driver?  This will make everyone's life
>>>>>>> easier.
>>>>>>>
>>>>>>> Since tap also supports data transfer, an application can also
>>>>>>> forward
>>>>>>> packets not intended to it to the kernel, and forward packets
>>>>>>> from the
>>>>>>> kernel through the device.
>>>>>>>
>>>>>> Hi Avi,
>>>>>>
>>>>>> KDP (Kernel Data Path) does what you have described, it is
>>>>>> implemented
>>>>>> as PMD and it benefits from tap driver to data transfer through the
>>>>>> kernel. It also support custom kernel module for better performance.
>>>>>>
>>>>>> For KCP (Kernel Control Path), network driver forwards control
>>>>>> commands
>>>>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>>>>> driver, so extending tun/tap driver like this can be hard to
>>>>>> upstream.
>>>>> Have you tried asking?  Maybe if you explain it they will be open
>>>>> to the
>>>>> extension.
>>>>>
>>>> Not communicated but tun/tap already doing something different.
>>>> For KCP, created interface is map of the DPDK port. All data interface
>>>> shows coming from DPDK port. For example if you get stats information
>>>> with ifconfig, the values you observe are DPDK port statistics -not
>>>> statistics of data between userspace and kernelspace, statistics of
>>>> data
>>>> forwarded between DPDK ports. If you down the interface, DPDK port
>>>> stopped, etc...
>>>>
>>>> If you extend the tun/tap, it won't be map of the DPDK port, and if you
>>>> get statistics information from that interface, what do you expect to
>>>> see, the data transferred between kernel and userspace, or underlying
>>>> DPDK port forwarding statistics?
>>> Good point.  But you really have to involve netdev on this, or you'll
>>> live out-of-tree forever.
>>>
>> Why do we need to touch netdev?
> 
> By netdev, I meant the mailing list.  If you don't touch it, your driver
> will remain out-of-tree forever.
> 
Sorry, I thought you were suggesting updating netdev (struct net_device)
for this.

>> A simple network driver, similar to kcp, can be solution.
>>
>> This driver implements all net_device_ops and ethtool_ops in a way to
>> forward everything to the userspace via netlink. All needs to know about
>> userspace driver is it's unique id. Any userspace application, not only
>> DPDK drivers, can listen the netlink messages and response to the
>> requests come to itself.
>>
>> This kind of driver is not big or complicated, kcp already does %90 of
>> what described above.
> 
> I am not arguing against kcp.  It fulfills an important need.  This is
> my argument:
> 
> 1. having multiple interfaces for the control and data path is bad for
> the user
> 2. therefore, we need to either add tap functionality to kcp, or add kcp
> functionality to tap
> 3. netdev@ is more likely (IMO) to accept additional functionality to
> tap than a new driver, but the only way to know is to engage with them
> 
Agreed, an incremental update to tap can be easier to get in, but
this does not really work for us, as explained above.

The concern of having two separate interfaces can be solved without
merging the data and control paths. I believe this is not a showstopper
for the functionality and can be an incremental improvement.

>>
>>>> Extending tun/tap in a way we want, forwarding all control commands to
>>>> userspace, will break the current tun/tap, this doesn't looks like a
>>>> valid option to me.
>>> It's possible to enhance it while preserving backwards compatibility, by
>>> enabling a feature flag (statistics from userspace).
>>>
>>>> For data path, using tun/tap is OK and we are already doing it, for the
>>>> control path I believe we need a new driver.
>>>>
>>>>> Certainly it will be better to have KCP and KDP use the same kernel
>>>>> interface name; so we'll need to either add data path support to kcp
>>>>> (causing duplication with tap), or add control path support to tap. I
>>>>> think the latter is preferable.
>>>>>
>>>> Why it is better to have same interface? Anyone who is not interested
>>>> with kernel data path may want to control DPDK ports using common
>>>> tools,
>>>> or want to get some basic information and stats using ethtool or
>>>> ifconfig. Why we need to bind two different functionality together?
>>> Having two interfaces will be confusing for the user.  If I wish to
>>> firewall data packets coming from the dpdk port, do I set firewall rules
>>> on dpdk0 or tap0?
>>>
>> Agreed that it is confusing to have two interfaces.
>>
>> I think if user wants to use both data and control paths, a way can be
>> found to end up with single interface, using module params or something
>> else. Two different drivers for data and control not conflict with each
>> other and can cooperate.
> 
> Module parameters are for module-wide options.  A single module can
> support multiple interfaces, so module parameters don't apply.
>
> Let's make it simple for the users, even if it is more complex for us
> (and by us, I mean you, unfortunately).

The reason I insist on separate modules is the same as yours: the
user/customer rules, not because it is easier. With KCP, DPDK ports
become visible to the Linux world, and this has nothing to do with sending
packets to the kernel. Decoupling KCP from the data path can give more use
cases for KCP, so more users can use it.

Also it is possible to make the control path more dynamic: you can make
the interfaces appear, get some info, set something, and remove them
again. This is not something wanted for the slow data path.

This is a design choice to separate responsibilities.

> 
>> But to work on this first both KCP and KDP should go in.
> 
> Go in where?  It doesn't help if it goes into dpdk.git and then netdev@
> rejects it.
> 
Going in to dpdk.git can be a good start. It helps with a few things:
1- Get feedback from users, get more requirements
2- In communication with netdev@, it helps us show the existing use case.
3- The Linux upstream process is risky and may take a long time; DPDK
users may not have access to these for a long time, and if it makes it
into the Linux kernel, it may be too late for some requirements.

I think it is good to include these in dpdk.org, and in parallel work on
upstreaming. Update the code according to comments from both communities.
And if this makes it into the Linux kernel, we can drop it from DPDK.

>>
>>> I don't think it matters whether you extend tap, or add a data path to
>>> kcp, but if you want to upstream it, it needs to be blessed by netdev.
>>>
>> I still think not good idea to merge these two, because they may be used
>> independently, but we can improve how they work together.
>>
> 
> From a developer's perspective, maybe so, but from a user's perspective,
> there should be exactly one interface for the functionality exposed by
> kcp and kdp.
  
Ferruh Yigit Feb. 29, 2016, 3:05 p.m. UTC | #13
On 2/29/2016 11:35 AM, Ferruh Yigit wrote:
> On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
>> Hi,
>> I totally agree with Avi's comments.
>> This topic is really important for the future of DPDK.
>> So I think we must give some time to continue the discussion
>> and have netdev involved in the choices done.
>> As a consequence, these series should not be merged in the release 16.04.
>> Thanks for continuing the work.
>>
> Hi Thomas,
> 
> It is great to have some discussion and feedbacks.
> But I doubt not merging in this release will help to have more discussion.
> 
> It is better to have them in this release and let people experiment it,
> this gives more chance to better discussion.
> 
> These features are replacement of KNI, and KNI is not intended to be
> removed in this release, so who are using KNI as solution can continue
> to use KNI and can test KCP/KDP, so that we can get more feedbacks.
> 
One more thing: the overall reason for working on KCP/KDP is to reduce the
KNI maintenance cost and add more features, not to add more maintenance
cost. Most of the maintenance cost of KNI comes from the Linux network
drivers in it, which KCP removes, so there is an improvement.

Although it is not as good as removing them completely, KCP/KDP is one
step closer to being upstreamed than the existing KNI.

Thanks,
ferruh
  
Panu Matilainen Feb. 29, 2016, 3:19 p.m. UTC | #14
On 02/29/2016 01:35 PM, Ferruh Yigit wrote:
> On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
>> Hi,
>> I totally agree with Avi's comments.
>> This topic is really important for the future of DPDK.
>> So I think we must give some time to continue the discussion
>> and have netdev involved in the choices done.
>> As a consequence, these series should not be merged in the release 16.04.
>> Thanks for continuing the work.
>>
> Hi Thomas,
>
> It is great to have some discussion and feedbacks.
> But I doubt not merging in this release will help to have more discussion.
>
> It is better to have them in this release and let people experiment it,
> this gives more chance to better discussion.
>
> These features are replacement of KNI, and KNI is not intended to be
> removed in this release, so who are using KNI as solution can continue
> to use KNI and can test KCP/KDP, so that we can get more feedbacks.

So make the work available from a separate git repo and make it easy for 
people to experiment with it. Code doesn't have to be in a release for 
the sake of experimenting, and removing code is much harder than not 
adding it in the first place, witness KNI.

	- Panu -
  
Thomas Monjalon Feb. 29, 2016, 3:27 p.m. UTC | #15
2016-02-29 17:19, Panu Matilainen:
> On 02/29/2016 01:35 PM, Ferruh Yigit wrote:
> > On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
> >> Hi,
> >> I totally agree with Avi's comments.
> >> This topic is really important for the future of DPDK.
> >> So I think we must give some time to continue the discussion
> >> and have netdev involved in the choices done.
> >> As a consequence, these series should not be merged in the release 16.04.
> >> Thanks for continuing the work.
> >>
> > Hi Thomas,
> >
> > It is great to have some discussion and feedbacks.
> > But I doubt not merging in this release will help to have more discussion.
> >
> > It is better to have them in this release and let people experiment it,
> > this gives more chance to better discussion.
> >
> > These features are replacement of KNI, and KNI is not intended to be
> > removed in this release, so who are using KNI as solution can continue
> > to use KNI and can test KCP/KDP, so that we can get more feedbacks.
> 
> So make the work available from a separate git repo and make it easy for 
> people to experiment with it. Code doesn't have to be in a release for 
> the sake of experimenting, and removing code is much harder than not 
> adding it in the first place, witness KNI.

Good idea.
What about a -next tree to experiment on kernel interactions?
  
Panu Matilainen Feb. 29, 2016, 4:04 p.m. UTC | #16
On 02/29/2016 05:27 PM, Thomas Monjalon wrote:
> 2016-02-29 17:19, Panu Matilainen:
>> On 02/29/2016 01:35 PM, Ferruh Yigit wrote:
>>> On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
>>>> Hi,
>>>> I totally agree with Avi's comments.
>>>> This topic is really important for the future of DPDK.
>>>> So I think we must give some time to continue the discussion
>>>> and have netdev involved in the choices done.
>>>> As a consequence, these series should not be merged in the release 16.04.
>>>> Thanks for continuing the work.
>>>>
>>> Hi Thomas,
>>>
>>> It is great to have some discussion and feedbacks.
>>> But I doubt not merging in this release will help to have more discussion.
>>>
>>> It is better to have them in this release and let people experiment it,
>>> this gives more chance to better discussion.
>>>
>>> These features are replacement of KNI, and KNI is not intended to be
>>> removed in this release, so who are using KNI as solution can continue
>>> to use KNI and can test KCP/KDP, so that we can get more feedbacks.
>>
>> So make the work available from a separate git repo and make it easy for
>> people to experiment with it. Code doesn't have to be in a release for
>> the sake of experimenting, and removing code is much harder than not
>> adding it in the first place, witness KNI.
>
> Good idea.
> What about a -next tree to experiment on kernel interactions?

Here's another, related but more radical (and rather unbaked) idea:

Move all the kernel modules and their associated libraries (thinking of 
KNI here) to a separate repo with perhaps more relaxed rules, but OTOH 
require upstream kernel support for any features to be included in dpdk 
itself. Carrot-and-stick of sorts :)

	- Panu -
  
Stephen Hemminger Feb. 29, 2016, 8:11 p.m. UTC | #17
On Wed, 27 Jan 2016 16:24:07 +0000
Ferruh Yigit <ferruh.yigit@intel.com> wrote:

> +static int
> +kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
> +{
> +	int ret = -EINVAL;
> +	struct kcp_dev *dev;
> +	struct kcp_dev *n;
> +	char name[RTE_KCP_NAMESIZE];
> +	unsigned int instance = ioctl_param;
> +
> +	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
> +
> +	down_write(&kcp_list_lock);


Some observations about how acceptable this will be to upstream
kernel developers.

ioctls are the least favored form of API.

You chose the worst possible mutual exclusion: read/write semaphores.
Read/write is slower than simpler primitives, and semaphores were
replaced for almost all usage models by mutexes (about 4 years ago).

Looks like you copied the out-of-date kernel APIs used
by KNI.
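
As a rough illustration of the simpler locking, here is a userspace
sketch that guards a device table with a plain mutex. It is not the patch
code: pthread_mutex_t stands in for the kernel's struct mutex (where the
calls would be mutex_lock()/mutex_unlock()), the fixed table replaces the
driver's list, and every name ending in _sketch, plus KCP_NAMESIZE and
KCP_MAX_DEVS, is made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define KCP_NAMESIZE 32	/* hypothetical, stands in for RTE_KCP_NAMESIZE */
#define KCP_MAX_DEVS 8	/* hypothetical fixed table instead of a list */

struct kcp_dev_sketch {
	char name[KCP_NAMESIZE];
	int in_use;
};

static struct kcp_dev_sketch kcp_devs[KCP_MAX_DEVS];
/* One plain mutex instead of a read/write semaphore. */
static pthread_mutex_t kcp_list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Build the "dpdk<instance>" interface name used as the lookup key. */
static void kcp_sketch_name(char *name, size_t len, unsigned int instance)
{
	int n = snprintf(name, len, "dpdk%u", instance);

	assert(n > 0 && (size_t)n < len);
	(void)n;
}

/* Register an instance; returns 0 on success, -1 if duplicate or full. */
static int kcp_sketch_create(unsigned int instance)
{
	char name[KCP_NAMESIZE];
	int i, free_slot = -1, ret = -1;

	kcp_sketch_name(name, sizeof(name), instance);

	pthread_mutex_lock(&kcp_list_mutex);
	for (i = 0; i < KCP_MAX_DEVS; i++) {
		if (kcp_devs[i].in_use && strcmp(kcp_devs[i].name, name) == 0)
			goto out;	/* already registered */
		if (!kcp_devs[i].in_use && free_slot < 0)
			free_slot = i;
	}
	if (free_slot >= 0) {
		strcpy(kcp_devs[free_slot].name, name);
		kcp_devs[free_slot].in_use = 1;
		ret = 0;
	}
out:
	pthread_mutex_unlock(&kcp_list_mutex);
	return ret;
}

/* Remove an instance; returns 0 on success, -1 if it was not found. */
static int kcp_sketch_release(unsigned int instance)
{
	char name[KCP_NAMESIZE];
	int i, ret = -1;

	kcp_sketch_name(name, sizeof(name), instance);

	pthread_mutex_lock(&kcp_list_mutex);
	for (i = 0; i < KCP_MAX_DEVS; i++) {
		if (kcp_devs[i].in_use && strcmp(kcp_devs[i].name, name) == 0) {
			kcp_devs[i].in_use = 0;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&kcp_list_mutex);
	return ret;
}
```

Create and release are both rare operations that modify the table, so
there is little to gain from distinguishing readers from writers here.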
  
Ferruh Yigit March 1, 2016, 12:35 a.m. UTC | #18
On 2/29/2016 8:11 PM, Stephen Hemminger wrote:
> On Wed, 27 Jan 2016 16:24:07 +0000
> Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
>> +static int
>> +kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
>> +{
>> +	int ret = -EINVAL;
>> +	struct kcp_dev *dev;
>> +	struct kcp_dev *n;
>> +	char name[RTE_KCP_NAMESIZE];
>> +	unsigned int instance = ioctl_param;
>> +
>> +	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
>> +
>> +	down_write(&kcp_list_lock);
> 
> 
> Some observations about how acceptable this will to upstream
> kernel developers.
> 
> ioctl's are the lease favored form of API.
>
This is from v1 of the patch; v3 is out and it removed the ioctl,
replacing it with rtnetlink.

v3:
http://dpdk.org/dev/patchwork/patch/10872/


> You chose the worst possible mutual exclusion read/write semaphores.
> Read/write is slower than simpler primtives, and semaphores were
> replaced for almost all usage models by mutexes (about 4 years ago).
> 
> Looks like you copied the out of date kernel API's used
> by KNI.
> 
But it is still using the same locks, and as you have guessed these are
from legacy code. I will happily replace them with newer, faster lock APIs.

If the only problem is using slow lock APIs, it seems the code is in
pretty good shape :) Please check the v3, it is in better shape now. With
more review from reviewers and after a few more versions, we can have
something upstreamable.

Thanks,
ferruh
  
Bruce Richardson March 1, 2016, 10:40 p.m. UTC | #19
On Mon, Feb 29, 2016 at 08:33:25AM -0600, Jay Rolette wrote:
> On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
> 
> > Hi,
> > I totally agree with Avi's comments.
> > This topic is really important for the future of DPDK.
> > So I think we must give some time to continue the discussion
> > and have netdev involved in the choices done.
> > As a consequence, these series should not be merged in the release 16.04.
> > Thanks for continuing the work.
> >
> 
> I know you guys are very interested in getting rid of the out-of-tree
> drivers, but please do not block incremental improvements to DPDK in the
> meantime. Ferruh's patch improves the usability of KNI. Don't throw out
> good and useful enhancements just because it isn't where you want to be in
> the end.
> 
> I'd like to see these be merged.
> 
> Jay

+1 to this. While this may not eliminate out of tree kernel modules, and solve
all problems, I think taking in kcp and kdp and removing KNI will leave DPDK in
a better state than it was.

Also, with regard to having the kernel data path and the port control part
inside the same module, our experience with KNI has led us to explicitly
separate them out. The path from user to kernel space should be completely
separated from the netdevs which back dpdk ports. Consider stats reporting alone:
netdevs backed by dpdk ports can report out packet rx/tx counts for the hw ports
while the user-kernel path can report out packet rx/tx from kernel.

	Regards,
	/Bruce
  
Stephen Hemminger March 2, 2016, 2:02 a.m. UTC | #20
On Mon, 29 Feb 2016 08:33:25 -0600
Jay Rolette <rolette@infiniteio.com> wrote:

> On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
> 
> > Hi,
> > I totally agree with Avi's comments.
> > This topic is really important for the future of DPDK.
> > So I think we must give some time to continue the discussion
> > and have netdev involved in the choices done.
> > As a consequence, these series should not be merged in the release 16.04.
> > Thanks for continuing the work.
> >
> 
> I know you guys are very interested in getting rid of the out-of-tree
> drivers, but please do not block incremental improvements to DPDK in the
> meantime. Ferruh's patch improves the usability of KNI. Don't throw out
> good and useful enhancements just because it isn't where you want to be in
> the end.
> 
> I'd like to see these be merged.
> 
> Jay

The code is really not ready. I am okay with cooperative development
but the current code needs to go into a staging type tree.
No compatibility, no ABI guarantees, more of an RFC.
Don't want vendors building products with it then screaming when it
gets rebuilt/reworked/scrapped.
  
Panu Matilainen March 2, 2016, 8:27 a.m. UTC | #21
On 03/02/2016 04:02 AM, Stephen Hemminger wrote:
> On Mon, 29 Feb 2016 08:33:25 -0600
> Jay Rolette <rolette@infiniteio.com> wrote:
>
>> On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
>> wrote:
>>
>>> Hi,
>>> I totally agree with Avi's comments.
>>> This topic is really important for the future of DPDK.
>>> So I think we must give some time to continue the discussion
>>> and have netdev involved in the choices done.
>>> As a consequence, these series should not be merged in the release 16.04.
>>> Thanks for continuing the work.
>>>
>>
>> I know you guys are very interested in getting rid of the out-of-tree
>> drivers, but please do not block incremental improvements to DPDK in the
>> meantime. Ferruh's patch improves the usability of KNI. Don't throw out
>> good and useful enhancements just because it isn't where you want to be in
>> the end.
>>
>> I'd like to see these be merged.
>>
>> Jay
>
> The code is really not ready. I am okay with cooperative development
> but the current code needs to go into a staging type tree.
> No compatibility, no ABI guarantees, more of an RFC.
> Don't want vendors building products with it then screaming when it
> gets rebuilt/reworked/scrapped.
>

Exactly.

If a venturous vendor wants to go and build a product based on something 
in a staging tree there's nothing stopping them from doing that, but at 
least it should set the expectations straight.

	- Panu -
  
Vincent Jardin March 2, 2016, 10:47 a.m. UTC | #22
Le 02/03/2016 09:27, Panu Matilainen a écrit :
>>> I'd like to see these be merged.
>>>
>>> Jay
>>
>> The code is really not ready. I am okay with cooperative development
>> but the current code needs to go into a staging type tree.
>> No compatibility, no ABI guarantees, more of an RFC.
>> Don't want vendors building products with it then screaming when it
>> gets rebuilt/reworked/scrapped.
>>
>
> Exactly.

+1 too

We need to build on this innovation while there is a path to the kernel 
mainline. The logic of using a staging area is a good one.

Thomas,

can we open a staging folder in DPDK like it is done in the kernel?

Thank you,
   Vincent
  
Jim Thompson March 2, 2016, 10:51 a.m. UTC | #23
> On Mar 2, 2016, at 4:47 AM, Vincent JARDIN <vincent.jardin@6wind.com> wrote:
> 
> Le 02/03/2016 09:27, Panu Matilainen a écrit :
>>>> I'd like to see these be merged.
>>>> 
>>>> Jay
>>> 
>>> The code is really not ready. I am okay with cooperative development
>>> but the current code needs to go into a staging type tree.
>>> No compatibility, no ABI guarantees, more of an RFC.
>>> Don't want vendors building products with it then screaming when it
>>> gets rebuilt/reworked/scrapped.
>>> 
>> 
>> Exactly.
> 
> +1 too
> 
> We need to build on this innovation while there is a path for kernel mainstream. The logic of using a staging is a good one.
> 
> Thomas,
> 
> can we open a staging folder into the DPDK like it is done into the kernel?

Can we take it as a requirement to support FreeBSD this time around?
  
Thomas Monjalon March 2, 2016, 11:21 a.m. UTC | #24
2016-03-02 11:47, Vincent JARDIN:
> Le 02/03/2016 09:27, Panu Matilainen a écrit :
> >>> I'd like to see these be merged.
> >>>
> >>> Jay
> >>
> >> The code is really not ready. I am okay with cooperative development
> >> but the current code needs to go into a staging type tree.
> >> No compatibility, no ABI guarantees, more of an RFC.
> >> Don't want vendors building products with it then screaming when it
> >> gets rebuilt/reworked/scrapped.
> >>
> >
> > Exactly.
> 
> +1 too
> 
> We need to build on this innovation while there is a path for kernel 
> mainstream. The logic of using a staging is a good one.
> 
> Thomas,
> 
> can we open a staging folder into the DPDK like it is done into the kernel?

It's possible to create a staging directory if everybody agrees.
It is important to state in a README file or in the doc/ that
there will be no guarantee (no stable ABI, no validation and can be dropped)
and that it is a work in progress, a suggestion to discuss with the kernel
community.

The kernel modules must clearly target an upstream integration.
  
Vincent Jardin March 2, 2016, 12:03 p.m. UTC | #25
Le 02/03/2016 11:51, Jim Thompson a écrit :
> Can we take it as a requirement to support FreeBSD this time around?

Of course, all OSes should be in the loop, but I guess it would be 
specific to each kernel. What is the equivalent of ethtool on FreeBSD? 
Or can you start porting ethtool to FreeBSD?
  
Jay Rolette March 2, 2016, 10:18 p.m. UTC | #26
On Tue, Mar 1, 2016 at 8:02 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Mon, 29 Feb 2016 08:33:25 -0600
> Jay Rolette <rolette@infiniteio.com> wrote:
>
> > On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon <
> thomas.monjalon@6wind.com>
> > wrote:
> >
> > > Hi,
> > > I totally agree with Avi's comments.
> > > This topic is really important for the future of DPDK.
> > > So I think we must give some time to continue the discussion
> > > and have netdev involved in the choices done.
> > > As a consequence, these series should not be merged in the release
> 16.04.
> > > Thanks for continuing the work.
> > >
> >
> > I know you guys are very interested in getting rid of the out-of-tree
> > drivers, but please do not block incremental improvements to DPDK in the
> > meantime. Ferruh's patch improves the usability of KNI. Don't throw out
> > good and useful enhancements just because it isn't where you want to be
> in
> > the end.
> >
> > I'd like to see these be merged.
> >
> > Jay
>
> The code is really not ready. I am okay with cooperative development
> but the current code needs to go into a staging type tree.
> No compatibility, no ABI guarantees, more of an RFC.
> Don't want vendors building products with it then screaming when it
> gets rebuilt/reworked/scrapped.
>

That's fair. To be clear, it wasn't my intent for code that wasn't baked
yet to be merged.

The main point of my comment was that I think it is important not to halt
incremental improvements to existing capabilities (KNI in this case) just
because there are philosophical or directional changes that the community
would like to make longer-term.

Bird in the hand vs. two in the bush...

Jay
  
Thomas Monjalon March 2, 2016, 10:35 p.m. UTC | #27
2016-03-02 12:21, Thomas Monjalon:
> 2016-03-02 11:47, Vincent JARDIN:
> > Le 02/03/2016 09:27, Panu Matilainen a écrit :
> > >>> I'd like to see these be merged.
> > >>>
> > >>> Jay
> > >>
> > >> The code is really not ready. I am okay with cooperative development
> > >> but the current code needs to go into a staging type tree.
> > >> No compatibility, no ABI guarantees, more of an RFC.
> > >> Don't want vendors building products with it then screaming when it
> > >> gets rebuilt/reworked/scrapped.
> > >>
> > >
> > > Exactly.
> > 
> > +1 too
> > 
> > We need to build on this innovation while there is a path for kernel 
> > mainstream. The logic of using a staging is a good one.
> > 
> > Thomas,
> > 
> > can we open a staging folder into the DPDK like it is done into the kernel?
> 
> It's possible to create a staging directory if everybody agree.
> It is important to state in a README file or in the doc/ that
> there will be no guarantee (no stable ABI, no validation and can be dropped)
> and that it is a work in progress, a suggestion to discuss with the kernel
> community.
> 
> The kernel modules must clearly target an upstream integration.

Actually the examples directory has been used as a staging for ethtool and
lthread. We also have the crypto API which is still experimental.
So I think we must decide among these 3 solutions:
	- no special directory, just mark and document an experimental state
	- put only kcp/kdp in the staging directory
	- put kcp/kdp in staging and move other experimental libs here
  
Jim Thompson March 2, 2016, 10:51 p.m. UTC | #28
> On Mar 2, 2016, at 6:03 AM, Vincent JARDIN <vincent.jardin@6wind.com> wrote:
> 
> Le 02/03/2016 11:51, Jim Thompson a écrit :
>> Can we take it as a requirement to support FreeBSD this time around?
> 
> Of course, all OS should be on the loop, but I guess, it would be per kernel specific. What is ethtool on FreeBSD?

Ifconfig

> Or can you start porting ethtool on FreeBSD?

With netlink?  Hmm. 

I'm more interested in what kni (or its replacement) provides for an interface in/out of the kernel stack ("slow path") on FreeBSD.
  
Panu Matilainen March 3, 2016, 8:31 a.m. UTC | #29
On 03/03/2016 12:35 AM, Thomas Monjalon wrote:
> 2016-03-02 12:21, Thomas Monjalon:
>> 2016-03-02 11:47, Vincent JARDIN:
>>> Le 02/03/2016 09:27, Panu Matilainen a écrit :
>>>>>> I'd like to see these be merged.
>>>>>>
>>>>>> Jay
>>>>>
>>>>> The code is really not ready. I am okay with cooperative development
>>>>> but the current code needs to go into a staging type tree.
>>>>> No compatibility, no ABI guarantees, more of an RFC.
>>>>> Don't want vendors building products with it then screaming when it
>>>>> gets rebuilt/reworked/scrapped.
>>>>>
>>>>
>>>> Exactly.
>>>
>>> +1 too
>>>
>>> We need to build on this innovation while there is a path for kernel
>>> mainstream. The logic of using a staging is a good one.
>>>
>>> Thomas,
>>>
>>> can we open a staging folder into the DPDK like it is done into the kernel?
>>
>> It's possible to create a staging directory if everybody agree.
>> It is important to state in a README file or in the doc/ that
>> there will be no guarantee (no stable ABI, no validation and can be dropped)
>> and that it is a work in progress, a suggestion to discuss with the kernel
>> community.
>>
>> The kernel modules must clearly target an upstream integration.
>
> Actually the examples directory has been used as a staging for ethtool and
> lthread. We also have the crypto API which is still experimental.
> So I think we must decide among these 3 solutions:
> 	- no special directory, just mark and document an experimental state
> 	- put only kcp/kdp in the staging directory
> 	- put kcp/kdp in staging and move other experimental libs here

To answer this, I think we need to start by clarifying the kernel module
situation. Quoting you from
http://dpdk.org/ml/archives/dev/2016-January/032263.html:

> Sorry the kernel module party is over.
> One day, igb_uio will be removed.
> I suggest to make a first version without interrupt support
> and work with Linux community to fix your issues.

This to me reads "no more out-of-tree kernel modules, period" but here 
we are discussing the fate of another one.

If the policy truly is "no more kernel modules" (which I would fully 
back and applaud) then I think there's little to discuss - if the 
destination is kernel upstream then why should the modules pass through 
the dpdk codebase? Put it in another repo on dpdk.org, advertise it, 
make testing it as easy as possible and all (like have it integrate with 
dpdk makefiles if needed) instead.

The situation with the crypto API and ethtool is different in that the
destination for them clearly is dpdk itself. I would like to see
experimental code moved to a separate (staging or whatever) directory
(or a repo/git submodule or such) to make the situation absolutely clear.
I also still think experimental features should not be enabled by default
in the configs; no other project that I know of does that, but that's
another discussion.

	- Panu -
  
Ferruh Yigit March 3, 2016, 10:05 a.m. UTC | #30
On 3/3/2016 8:31 AM, Panu Matilainen wrote:
> On 03/03/2016 12:35 AM, Thomas Monjalon wrote:
>> 2016-03-02 12:21, Thomas Monjalon:
>>> 2016-03-02 11:47, Vincent JARDIN:
>>>> Le 02/03/2016 09:27, Panu Matilainen a écrit :
>>>>>>> I'd like to see these be merged.
>>>>>>>
>>>>>>> Jay
>>>>>>
>>>>>> The code is really not ready. I am okay with cooperative development
>>>>>> but the current code needs to go into a staging type tree.
>>>>>> No compatibility, no ABI guarantees, more of an RFC.
>>>>>> Don't want vendors building products with it then screaming when it
>>>>>> gets rebuilt/reworked/scrapped.
>>>>>>
>>>>>
>>>>> Exactly.
>>>>
>>>> +1 too
>>>>
>>>> We need to build on this innovation while there is a path for kernel
>>>> mainstream. The logic of using a staging is a good one.
>>>>
>>>> Thomas,
>>>>
>>>> can we open a staging folder into the DPDK like it is done into the
>>>> kernel?
>>>
>>> It's possible to create a staging directory if everybody agree.
>>> It is important to state in a README file or in the doc/ that
>>> there will be no guarantee (no stable ABI, no validation and can be
>>> dropped)
>>> and that it is a work in progress, a suggestion to discuss with the
>>> kernel
>>> community.
>>>
>>> The kernel modules must clearly target an upstream integration.
>>
>> Actually the examples directory has been used as a staging for ethtool
>> and
>> lthread. We also have the crypto API which is still experimental.
>> So I think we must decide among these 3 solutions:
>>     - no special directory, just mark and document an experimental state
>>     - put only kcp/kdp in the staging directory
>>     - put kcp/kdp in staging and move other experimental libs here
> 
> To answer this, I think we need to start by clarifying the kernel module
> situation. Quoting your from
> http://dpdk.org/ml/archives/dev/2016-January/032263.html:
> 
>> Sorry the kernel module party is over.
>> One day, igb_uio will be removed.
>> I suggest to make a first version without interrupt support
>> and work with Linux community to fix your issues.
> 
> This to me reads "no more out-of-tree kernel modules, period" but here
> we are discussing the fate of another one.
> 
> If the policy truly is "no more kernel modules" (which I would fully
> back and applaud) then I think there's little to discuss - if the
> destination is kernel upstream then why should the modules pass through
> the dpdk codebase? Put it in another repo on dpdk.org, advertise it,
> make testing it as easy as possible and all (like have it integrate with
> dpdk makefiles if needed) instead.
> 
Hi Panu,

I just want to remind you that these modules are meant to replace the
existing KNI kernel module, and to reduce its maintenance cost.
We are not adding new kernel modules for new features.

I believe replacing the KNI module with new code in DPDK is a required
improvement step. But before it is replaced, KNI users should verify the
new code.

Going directly from KNI to Linux upstream, if possible, is not easy.
Upstreaming should be done in incremental steps.

How about the following steps:
1- Add KCP/KDP with an EXPERIMENTAL flag.
2- When they are mature enough, remove KNI, remove EXPERIMENTAL from
KCP/KDP.
3- Work on upstreaming

Thanks,
ferruh

> The difference with crypto API and ethtool is different in that the
> destination for them clearly is dpdk itself. I would like to see
> experimental code moved to a separate (staging or whatever) directory
> (or a repo/git submodule) to make the situation absolutely clear. Or a
> repo/git submodule or such. I also still think experimental features
> should not be enabled by default in the configs, no other project that I
> know of does that, but that's another discussion.
> 
>     - Panu -
  
Thomas Monjalon March 3, 2016, 10:11 a.m. UTC | #31
2016-03-03 10:05, Ferruh Yigit:
> On 3/3/2016 8:31 AM, Panu Matilainen wrote:
> > On 03/03/2016 12:35 AM, Thomas Monjalon wrote:
> >> 2016-03-02 12:21, Thomas Monjalon:
> >>> 2016-03-02 11:47, Vincent JARDIN:
> >>>> Le 02/03/2016 09:27, Panu Matilainen a écrit :
> >>>>>>> I'd like to see these be merged.
> >>>>>>>
> >>>>>>> Jay
> >>>>>>
> >>>>>> The code is really not ready. I am okay with cooperative development
> >>>>>> but the current code needs to go into a staging type tree.
> >>>>>> No compatibility, no ABI guarantees, more of an RFC.
> >>>>>> Don't want vendors building products with it then screaming when it
> >>>>>> gets rebuilt/reworked/scrapped.
> >>>>>>
> >>>>>
> >>>>> Exactly.
> >>>>
> >>>> +1 too
> >>>>
> >>>> We need to build on this innovation while there is a path for kernel
> >>>> mainstream. The logic of using a staging is a good one.
> >>>>
> >>>> Thomas,
> >>>>
> >>>> can we open a staging folder into the DPDK like it is done into the
> >>>> kernel?
> >>>
> >>> It's possible to create a staging directory if everybody agree.
> >>> It is important to state in a README file or in the doc/ that
> >>> there will be no guarantee (no stable ABI, no validation and can be
> >>> dropped)
> >>> and that it is a work in progress, a suggestion to discuss with the
> >>> kernel
> >>> community.
> >>>
> >>> The kernel modules must clearly target an upstream integration.
> >>
> >> Actually the examples directory has been used as a staging for ethtool
> >> and
> >> lthread. We also have the crypto API which is still experimental.
> >> So I think we must decide among these 3 solutions:
> >>     - no special directory, just mark and document an experimental state
> >>     - put only kcp/kdp in the staging directory
> >>     - put kcp/kdp in staging and move other experimental libs here
> > 
> > To answer this, I think we need to start by clarifying the kernel module
> > situation. Quoting your from
> > http://dpdk.org/ml/archives/dev/2016-January/032263.html:
> > 
> >> Sorry the kernel module party is over.
> >> One day, igb_uio will be removed.
> >> I suggest to make a first version without interrupt support
> >> and work with Linux community to fix your issues.
> > 
> > This to me reads "no more out-of-tree kernel modules, period" but here
> > we are discussing the fate of another one.
> > 
> > If the policy truly is "no more kernel modules" (which I would fully
> > back and applaud) then I think there's little to discuss - if the
> > destination is kernel upstream then why should the modules pass through
> > the dpdk codebase? Put it in another repo on dpdk.org, advertise it,
> > make testing it as easy as possible and all (like have it integrate with
> > dpdk makefiles if needed) instead.
> > 
> Hi Panu,
> 
> I just want to remind that these modules are to replace existing KNI
> kernel module, and to reduce it's maintenance cost.
> We are not adding new kernel modules for new features.
> 
> I believe replacing KNI module with new code in DPDK is a required
> improvement step. But to replace, KNI users should verify the new codes.
> 
> Going directly from KNI to Linux upstream, if possible, is not easy.
> Upstreaming should be done in incremental steps.
> 
> How about following steps:
> 1- Add KCP/KDP with an EXPERIMENTAL flag.
> 2- When they are mature enough, remove KNI, remove EXPERIMENTAL from
> KCP/KDP.
> 3- Work on upstreaming

What about working with upstream early (step 3 before 2)?
KNI is not so nice but it was advertised and used.
If we want to advertise a replacement, it must be approved by upstream.
We need some stable and widely adopted interfaces to bring more confidence
in the project.
  
Ferruh Yigit March 3, 2016, 10:11 a.m. UTC | #32
On 3/2/2016 10:18 PM, Jay Rolette wrote:
> 
> On Tue, Mar 1, 2016 at 8:02 PM, Stephen Hemminger
> <stephen@networkplumber.org <mailto:stephen@networkplumber.org>> wrote:
> 
>     On Mon, 29 Feb 2016 08:33:25 -0600
>     Jay Rolette <rolette@infiniteio.com <mailto:rolette@infiniteio.com>>
>     wrote:
> 
>     > On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon
>     <thomas.monjalon@6wind.com <mailto:thomas.monjalon@6wind.com>>
>     > wrote:
>     >
>     > > Hi,
>     > > I totally agree with Avi's comments.
>     > > This topic is really important for the future of DPDK.
>     > > So I think we must give some time to continue the discussion
>     > > and have netdev involved in the choices done.
>     > > As a consequence, these series should not be merged in the
>     release 16.04.
>     > > Thanks for continuing the work.
>     > >
>     >
>     > I know you guys are very interested in getting rid of the out-of-tree
>     > drivers, but please do not block incremental improvements to DPDK
>     in the
>     > meantime. Ferruh's patch improves the usability of KNI. Don't
>     throw out
>     > good and useful enhancements just because it isn't where you want
>     to be in
>     > the end.
>     >
>     > I'd like to see these be merged.
>     >
>     > Jay
> 
>     The code is really not ready. I am okay with cooperative development
>     but the current code needs to go into a staging type tree.
>     No compatibility, no ABI guarantees, more of an RFC.
>     Don't want vendors building products with it then screaming when it
>     gets rebuilt/reworked/scrapped.
> 
> 
> That's fair. To be clear, it wasn't my intent for code that wasn't baked
> yet to be merged. 
> 
> The main point of my comment was that I think it is important not to
> halt incremental improvements to existing capabilities (KNI in this
> case) just because there are philosophical or directional changes that
> the community would like to make longer-term.
> 
> Bird in the hand vs. two in the bush...
> 

There are two different statements here. The first is that the code is not
ready, which I agree is a fair point (although no argument was given for
that statement, which makes it hard to discuss, so I will put it aside).
This implies that when the code is ready it can go into the repo.

But not having kernel modules at all, independent of their state relative
to what they are trying to replace, is something else. And this won't help
with KNI-related problems.

Thanks,
ferruh
  
Panu Matilainen March 3, 2016, 10:51 a.m. UTC | #33
On 03/03/2016 12:05 PM, Ferruh Yigit wrote:
> On 3/3/2016 8:31 AM, Panu Matilainen wrote:
>> On 03/03/2016 12:35 AM, Thomas Monjalon wrote:
>>> 2016-03-02 12:21, Thomas Monjalon:
>>>> 2016-03-02 11:47, Vincent JARDIN:
>>>>> Le 02/03/2016 09:27, Panu Matilainen a écrit :
>>>>>>>> I'd like to see these be merged.
>>>>>>>>
>>>>>>>> Jay
>>>>>>>
>>>>>>> The code is really not ready. I am okay with cooperative development
>>>>>>> but the current code needs to go into a staging type tree.
>>>>>>> No compatibility, no ABI guarantees, more of an RFC.
>>>>>>> Don't want vendors building products with it then screaming when it
>>>>>>> gets rebuilt/reworked/scrapped.
>>>>>>>
>>>>>>
>>>>>> Exactly.
>>>>>
>>>>> +1 too
>>>>>
>>>>> We need to build on this innovation while there is a path for kernel
>>>>> mainstream. The logic of using a staging is a good one.
>>>>>
>>>>> Thomas,
>>>>>
>>>>> can we open a staging folder into the DPDK like it is done into the
>>>>> kernel?
>>>>
>>>> It's possible to create a staging directory if everybody agree.
>>>> It is important to state in a README file or in the doc/ that
>>>> there will be no guarantee (no stable ABI, no validation and can be
>>>> dropped)
>>>> and that it is a work in progress, a suggestion to discuss with the
>>>> kernel
>>>> community.
>>>>
>>>> The kernel modules must clearly target an upstream integration.
>>>
>>> Actually the examples directory has been used as a staging for ethtool
>>> and
>>> lthread. We also have the crypto API which is still experimental.
>>> So I think we must decide among these 3 solutions:
>>>      - no special directory, just mark and document an experimental state
>>>      - put only kcp/kdp in the staging directory
>>>      - put kcp/kdp in staging and move other experimental libs here
>>
>> To answer this, I think we need to start by clarifying the kernel module
>> situation. Quoting your from
>> http://dpdk.org/ml/archives/dev/2016-January/032263.html:
>>
>>> Sorry the kernel module party is over.
>>> One day, igb_uio will be removed.
>>> I suggest to make a first version without interrupt support
>>> and work with Linux community to fix your issues.
>>
>> This to me reads "no more out-of-tree kernel modules, period" but here
>> we are discussing the fate of another one.
>>
>> If the policy truly is "no more kernel modules" (which I would fully
>> back and applaud) then I think there's little to discuss - if the
>> destination is kernel upstream then why should the modules pass through
>> the dpdk codebase? Put it in another repo on dpdk.org, advertise it,
>> make testing it as easy as possible and all (like have it integrate with
>> dpdk makefiles if needed) instead.
>>
> Hi Panu,
>
> I just want to remind that these modules are to replace existing KNI
> kernel module, and to reduce it's maintenance cost.
> We are not adding new kernel modules for new features.
>
> I believe replacing KNI module with new code in DPDK is a required
> improvement step. But to replace, KNI users should verify the new codes.
>
> Going directly from KNI to Linux upstream, if possible, is not easy.
> Upstreaming should be done in incremental steps.
>
> How about following steps:
> 1- Add KCP/KDP with an EXPERIMENTAL flag.
> 2- When they are mature enough, remove KNI, remove EXPERIMENTAL from
> KCP/KDP.
> 3- Work on upstreaming

And if upstream says no, as they just as well might? You're one step 
forward, two steps back.

You need to engage upstream NOW, as has been suggested in this thread 
several times already.

	- Panu -
  
Stephen Hemminger March 3, 2016, 4:59 p.m. UTC | #34
On Thu, 3 Mar 2016 10:11:57 +0000
Ferruh Yigit <ferruh.yigit@intel.com> wrote:

> On 3/2/2016 10:18 PM, Jay Rolette wrote:
> > 
> > On Tue, Mar 1, 2016 at 8:02 PM, Stephen Hemminger
> > <stephen@networkplumber.org <mailto:stephen@networkplumber.org>> wrote:
> > 
> >     On Mon, 29 Feb 2016 08:33:25 -0600
> >     Jay Rolette <rolette@infiniteio.com <mailto:rolette@infiniteio.com>>
> >     wrote:
> > 
> >     > On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon
> >     <thomas.monjalon@6wind.com <mailto:thomas.monjalon@6wind.com>>
> >     > wrote:
> >     >
> >     > > Hi,
> >     > > I totally agree with Avi's comments.
> >     > > This topic is really important for the future of DPDK.
> >     > > So I think we must give some time to continue the discussion
> >     > > and have netdev involved in the choices done.
> >     > > As a consequence, these series should not be merged in the
> >     release 16.04.
> >     > > Thanks for continuing the work.
> >     > >
> >     >
> >     > I know you guys are very interested in getting rid of the out-of-tree
> >     > drivers, but please do not block incremental improvements to DPDK
> >     in the
> >     > meantime. Ferruh's patch improves the usability of KNI. Don't
> >     throw out
> >     > good and useful enhancements just because it isn't where you want
> >     to be in
> >     > the end.
> >     >
> >     > I'd like to see these be merged.
> >     >
> >     > Jay
> > 
> >     The code is really not ready. I am okay with cooperative development
> >     but the current code needs to go into a staging type tree.
> >     No compatibility, no ABI guarantees, more of an RFC.
> >     Don't want vendors building products with it then screaming when it
> >     gets rebuilt/reworked/scrapped.
> > 
> > 
> > That's fair. To be clear, it wasn't my intent for code that wasn't baked
> > yet to be merged. 
> > 
> > The main point of my comment was that I think it is important not to
> > halt incremental improvements to existing capabilities (KNI in this
> > case) just because there are philosophical or directional changes that
> > the community would like to make longer-term.
> > 
> > Bird in the hand vs. two in the bush...
> > 
> 
> There are two different statements, first, code being not ready, I agree
> a fair point (although there is no argument to that statement, it makes
> hard to discuss this, I will put aside this), this implies when code is
> ready it can go in to repo.
> 
> But not having kernel module, independent from their state against what
> they are trying to replace is something else. And this won't help on KNI
> related problems.
> 
> Thanks,
> ferruh
> 

Why not re-submit the patches but put them in lib/librte_eal/staging or a
similar path, and make sure that it does not get built by the normal build
process?
  
Ferruh Yigit March 3, 2016, 6:18 p.m. UTC | #35
On 3/3/2016 4:59 PM, Stephen Hemminger wrote:
> On Thu, 3 Mar 2016 10:11:57 +0000
> Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
>> On 3/2/2016 10:18 PM, Jay Rolette wrote:
>>>
>>> On Tue, Mar 1, 2016 at 8:02 PM, Stephen Hemminger
>>> <stephen@networkplumber.org <mailto:stephen@networkplumber.org>> wrote:
>>>
>>>     On Mon, 29 Feb 2016 08:33:25 -0600
>>>     Jay Rolette <rolette@infiniteio.com <mailto:rolette@infiniteio.com>>
>>>     wrote:
>>>
>>>     > On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon
>>>     <thomas.monjalon@6wind.com <mailto:thomas.monjalon@6wind.com>>
>>>     > wrote:
>>>     >
>>>     > > Hi,
>>>     > > I totally agree with Avi's comments.
>>>     > > This topic is really important for the future of DPDK.
>>>     > > So I think we must give some time to continue the discussion
>>>     > > and have netdev involved in the choices done.
>>>     > > As a consequence, these series should not be merged in the
>>>     release 16.04.
>>>     > > Thanks for continuing the work.
>>>     > >
>>>     >
>>>     > I know you guys are very interested in getting rid of the out-of-tree
>>>     > drivers, but please do not block incremental improvements to DPDK
>>>     in the
>>>     > meantime. Ferruh's patch improves the usability of KNI. Don't
>>>     throw out
>>>     > good and useful enhancements just because it isn't where you want
>>>     to be in
>>>     > the end.
>>>     >
>>>     > I'd like to see these be merged.
>>>     >
>>>     > Jay
>>>
>>>     The code is really not ready. I am okay with cooperative development
>>>     but the current code needs to go into a staging type tree.
>>>     No compatibility, no ABI guarantees, more of an RFC.
>>>     Don't want vendors building products with it then screaming when it
>>>     gets rebuilt/reworked/scrapped.
>>>
>>>
>>> That's fair. To be clear, it wasn't my intent for code that wasn't baked
>>> yet to be merged. 
>>>
>>> The main point of my comment was that I think it is important not to
>>> halt incremental improvements to existing capabilities (KNI in this
>>> case) just because there are philosophical or directional changes that
>>> the community would like to make longer-term.
>>>
>>> Bird in the hand vs. two in the bush...
>>>
>>
>> There are two different statements, first, code being not ready, I agree
>> a fair point (although there is no argument to that statement, it makes
>> hard to discuss this, I will put aside this), this implies when code is
>> ready it can go in to repo.
>>
>> But not having kernel module, independent from their state against what
>> they are trying to replace is something else. And this won't help on KNI
>> related problems.
>>
>> Thanks,
>> ferruh
>>
> 
> Why not re-submit patches but put in lib/librte_eal/staging or similar path
> and make sure that it does not get build by normal build process.
> 
I will do so once staging is ready/defined.

I will also start working on upstreaming the modules.

Thanks,
ferruh
  
Thomas Monjalon March 10, 2016, 12:04 a.m. UTC | #36
2016-03-02 23:35, Thomas Monjalon:
> 2016-03-02 12:21, Thomas Monjalon:
> > 2016-03-02 11:47, Vincent JARDIN:
> > > Le 02/03/2016 09:27, Panu Matilainen a écrit :
> > > >>> I'd like to see these be merged.
> > > >>>
> > > >>> Jay
> > > >>
> > > >> The code is really not ready. I am okay with cooperative development
> > > >> but the current code needs to go into a staging type tree.
> > > >> No compatibility, no ABI guarantees, more of an RFC.
> > > >> Don't want vendors building products with it then screaming when it
> > > >> gets rebuilt/reworked/scrapped.
> > > >>
> > > >
> > > > Exactly.
> > > 
> > > +1 too
> > > 
> > > We need to build on this innovation while there is a path for kernel 
> > > mainstream. The logic of using a staging is a good one.
> > > 
> > > Thomas,
> > > 
> > > can we open a staging folder into the DPDK like it is done into the kernel?
> > 
> > It's possible to create a staging directory if everybody agree.
> > It is important to state in a README file or in the doc/ that
> > there will be no guarantee (no stable ABI, no validation and can be dropped)
> > and that it is a work in progress, a suggestion to discuss with the kernel
> > community.
> > 
> > The kernel modules must clearly target an upstream integration.
> 
> Actually the examples directory has been used as a staging for ethtool and
> lthread. We also have the crypto API which is still experimental.
> So I think we must decide among these 3 solutions:
> 	- no special directory, just mark and document an experimental state
> 	- put only kcp/kdp in the staging directory
> 	- put kcp/kdp in staging and move other experimental libs here

Any opinion? Are we targeting upstream work without any DPDK staging?

Please let's make the status of these patches clear.
  
Vincent Jardin March 10, 2016, 6:31 a.m. UTC | #37
On 10 March 2016 at 01:06, "Thomas Monjalon" <thomas.monjalon@6wind.com>
wrote:
>
> 2016-03-02 23:35, Thomas Monjalon:
> > 2016-03-02 12:21, Thomas Monjalon:
> > > 2016-03-02 11:47, Vincent JARDIN:
> > > > Le 02/03/2016 09:27, Panu Matilainen a écrit :
> > > > >>> I'd like to see these be merged.
> > > > >>>
> > > > >>> Jay
> > > > >>
> > > > > >> The code is really not ready. I am okay with cooperative development
> > > > > >> but the current code needs to go into a staging type tree.
> > > > > >> No compatibility, no ABI guarantees, more of an RFC.
> > > > > >> Don't want vendors building products with it then screaming when it
> > > > > >> gets rebuilt/reworked/scrapped.
> > > > >>
> > > > >
> > > > > Exactly.
> > > >
> > > > +1 too
> > > >
> > > > We need to build on this innovation while there is a path for kernel
> > > > mainstream. The logic of using a staging is a good one.
> > > >
> > > > Thomas,
> > > >
> > > > can we open a staging folder into the DPDK like it is done into the kernel?
> > >
> > > It's possible to create a staging directory if everybody agree.
> > > It is important to state in a README file or in the doc/ that
> > > there will be no guarantee (no stable ABI, no validation and can be dropped)
> > > and that it is a work in progress, a suggestion to discuss with the kernel
> > > community.
> > >
> > > The kernel modules must clearly target an upstream integration.
> >
> > Actually the examples directory has been used as a staging for ethtool and
> > lthread. We also have the crypto API which is still experimental.
> > So I think we must decide among these 3 solutions:
> >       - no special directory, just mark and document an experimental state
> >       - put only kcp/kdp in the staging directory

I do prefer this option.

> >       - put kcp/kdp in staging and move other experimental libs here
>
> Any opinion? Are we targetting upstream work without any DPDK staging?
>
> Please let's make clear the status of these patches.
  

Patch

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..5d5e3e4 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -503,6 +503,12 @@  CONFIG_RTE_KNI_VHOST_DEBUG_RX=n
 CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
 
 #
+# Compile librte_ctrl_if
+#
+CONFIG_RTE_KCP_KMOD=y
+CONFIG_RTE_KCP_KO_DEBUG=n
+
+#
 # Compile vhost library
 # fuse-devel is needed to run vhost-cuse.
 # fuse-devel enables user space char driver development
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..d1fa3a3 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -1,6 +1,6 @@ 
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -38,6 +38,9 @@  DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_KNI_KMOD),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
 endif
+ifeq ($(CONFIG_RTE_KCP_KMOD),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kcp
+endif
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0
 endif
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 26eced5..dded8cb 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -1,6 +1,6 @@ 
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -116,6 +116,7 @@  CFLAGS_eal_thread.o += -Wno-return-type
 endif
 
 INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
+INC += rte_kcp_common.h
 
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP)-include/exec-env := \
 	$(addprefix include/exec-env/,$(INC))
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h
new file mode 100644
index 0000000..b3a6ee3
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kcp_common.h
@@ -0,0 +1,86 @@ 
+/*-
+ *   This file is provided under a dual BSD/LGPLv2 license.  When using or
+ *   redistributing this file, you may do so under either license.
+ *
+ *   GNU LESSER GENERAL PUBLIC LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2.1 of the GNU Lesser General Public License
+ *   as published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   Lesser General Public License for more details.
+ *
+ *   You should have received a copy of the GNU Lesser General Public License
+ *   along with this program;
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ *
+ *
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in
+ *     the documentation and/or other materials provided with the
+ *     distribution.
+ *   * Neither the name of Intel Corporation nor the names of its
+ *     contributors may be used to endorse or promote products derived
+ *     from this software without specific prior written permission.
+ *
+ *    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#ifndef _RTE_KCP_COMMON_H_
+#define _RTE_KCP_COMMON_H_
+
+#ifdef __KERNEL__
+#include <linux/if.h>
+#endif
+
+/*
+ * Request id.
+ */
+enum rte_kcp_req_id {
+	RTE_KCP_REQ_UNKNOWN = (1 << 16),
+	RTE_KCP_REQ_CHANGE_MTU,
+	RTE_KCP_REQ_CFG_NETWORK_IF,
+	RTE_KCP_REQ_GET_STATS,
+	RTE_KCP_REQ_GET_MAC,
+	RTE_KCP_REQ_SET_MAC,
+	RTE_KCP_REQ_START_PORT,
+	RTE_KCP_REQ_STOP_PORT,
+	RTE_KCP_REQ_MAX,
+};
+
+#define KCP_DEVICE "kcp"
+
+#define RTE_KCP_IOCTL_TEST    _IOWR(0, 1, int)
+#define RTE_KCP_IOCTL_CREATE  _IOWR(0, 2, int)
+#define RTE_KCP_IOCTL_RELEASE _IOWR(0, 3, int)
+
+#endif /* _RTE_KCP_COMMON_H_ */
diff --git a/lib/librte_eal/linuxapp/kcp/Makefile b/lib/librte_eal/linuxapp/kcp/Makefile
new file mode 100644
index 0000000..b2c44bd
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/Makefile
@@ -0,0 +1,58 @@ 
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = rte_kcp
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR)
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+MODULE_CFLAGS += -Wall -Werror
+
+# this lib needs main eal
+DEPDIRS-y += lib/librte_eal/linuxapp/eal
+
+#
+# all sources are stored in SRCS-y
+#
+SRCS-y += kcp_misc.c
+SRCS-y += kcp_net.c
+SRCS-y += kcp_ethtool.c
+SRCS-y += kcp_nl.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_dev.h b/lib/librte_eal/linuxapp/kcp/kcp_dev.h
new file mode 100644
index 0000000..e537821
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_dev.h
@@ -0,0 +1,65 @@ 
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KCP_DEV_H_
+#define _KCP_DEV_H_
+
+#include <linux/netdevice.h>
+#include <exec-env/rte_kcp_common.h>
+
+#define RTE_KCP_NAMESIZE 32
+
+struct kcp_dev {
+	/* kcp list */
+	struct list_head list;
+
+	char name[RTE_KCP_NAMESIZE]; /* Network device name */
+
+	/* kcp device */
+	struct net_device *net_dev;
+
+	int port_id;
+	struct completion msg_received;
+};
+
+void kcp_net_init(struct net_device *dev);
+
+void kcp_nl_init(void);
+void kcp_nl_release(void);
+int kcp_nl_exec(int cmd, struct net_device *dev, void *in_data, int in_len,
+		void *out_data, int out_len);
+
+void kcp_set_ethtool_ops(struct net_device *netdev);
+
+#define KCP_ERR(args...) printk(KERN_ERR "KCP: " args)
+#define KCP_INFO(args...) printk(KERN_INFO "KCP: " args)
+#define KCP_PRINT(args...) printk(KERN_DEBUG "KCP: " args)
+
+#ifdef RTE_KCP_KO_DEBUG
+#define KCP_DBG(args...) printk(KERN_DEBUG "KCP: " args)
+#else
+#define KCP_DBG(args...)
+#endif
+
+#endif /* _KCP_DEV_H_ */
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c b/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
new file mode 100644
index 0000000..3a22dba
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
@@ -0,0 +1,261 @@ 
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include "kcp_dev.h"
+
+#define ETHTOOL_GEEPROM_LEN 99
+#define ETHTOOL_GREGS_LEN 98
+#define ETHTOOL_GSSET_COUNT 97
+
+static int
+kcp_check_if_running(struct net_device *dev)
+{
+	return 0;
+}
+
+static void
+kcp_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
+{
+	int ret;
+
+	ret = kcp_nl_exec(info->cmd, dev, NULL, 0,
+			info, sizeof(struct ethtool_drvinfo));
+	if (ret < 0)
+		memset(info, 0, sizeof(struct ethtool_drvinfo));
+}
+
+static int
+kcp_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
+{
+	return kcp_nl_exec(ecmd->cmd, dev, NULL, 0,
+			ecmd, sizeof(struct ethtool_cmd));
+}
+
+static int
+kcp_set_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
+{
+	return kcp_nl_exec(ecmd->cmd, dev, ecmd, sizeof(struct ethtool_cmd),
+			NULL, 0);
+}
+
+static void
+kcp_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
+{
+	int ret;
+
+	ret = kcp_nl_exec(wol->cmd, dev, NULL, 0,
+			wol, sizeof(struct ethtool_wolinfo));
+	if (ret < 0)
+		memset(wol, 0, sizeof(struct ethtool_wolinfo));
+}
+
+static int
+kcp_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
+{
+	return kcp_nl_exec(wol->cmd, dev, wol, sizeof(struct ethtool_wolinfo),
+			NULL, 0);
+}
+
+static int
+kcp_nway_reset(struct net_device *dev)
+{
+	return kcp_nl_exec(ETHTOOL_NWAY_RST, dev, NULL, 0, NULL, 0);
+}
+
+static int
+kcp_get_eeprom_len(struct net_device *dev)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GEEPROM_LEN, dev, NULL, 0,
+			&data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static int
+kcp_get_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
+		u8 *bytes)
+{
+	int ret;
+
+	ret = kcp_nl_exec(eeprom->cmd, dev,
+			eeprom, sizeof(struct ethtool_eeprom),
+			bytes, eeprom->len);
+	*bytes = 0;
+	return ret;
+}
+
+static int
+kcp_set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
+		u8 *bytes)
+{
+	int ret;
+
+	ret = kcp_nl_exec(eeprom->cmd, dev,
+			eeprom, sizeof(struct ethtool_eeprom),
+			bytes, eeprom->len);
+	*bytes = 0;
+	return ret;
+}
+
+static void
+kcp_get_ringparam(struct net_device *dev, struct ethtool_ringparam *ring)
+{
+
+	kcp_nl_exec(ring->cmd, dev, NULL, 0,
+			ring, sizeof(struct ethtool_ringparam));
+}
+
+static int
+kcp_set_ringparam(struct net_device *dev, struct ethtool_ringparam *ring)
+{
+	int ret;
+
+	ret = kcp_nl_exec(ring->cmd, dev,
+			ring, sizeof(struct ethtool_ringparam),
+			NULL, 0);
+	return ret;
+}
+
+static void
+kcp_get_pauseparam(struct net_device *dev, struct ethtool_pauseparam *pause)
+{
+
+	kcp_nl_exec(pause->cmd, dev, NULL, 0,
+			pause, sizeof(struct ethtool_pauseparam));
+}
+
+static int
+kcp_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam *pause)
+{
+	return kcp_nl_exec(pause->cmd, dev,
+			pause, sizeof(struct ethtool_pauseparam),
+			NULL, 0);
+}
+
+static u32
+kcp_get_msglevel(struct net_device *dev)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GMSGLVL, dev, NULL, 0, &data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static void
+kcp_set_msglevel(struct net_device *dev, u32 data)
+{
+
+	kcp_nl_exec(ETHTOOL_SMSGLVL, dev, &data, sizeof(int), NULL, 0);
+}
+
+static int
+kcp_get_regs_len(struct net_device *dev)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GREGS_LEN, dev, NULL, 0, &data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static void
+kcp_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *p)
+{
+
+	kcp_nl_exec(regs->cmd, dev, regs, sizeof(struct ethtool_regs),
+			p, regs->len);
+}
+
+static void
+kcp_get_strings(struct net_device *dev, u32 stringset, u8 *data)
+{
+
+	kcp_nl_exec(ETHTOOL_GSTRINGS, dev, &stringset, sizeof(u32), data, 0);
+}
+
+static int
+kcp_get_sset_count(struct net_device *dev, int sset)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GSSET_COUNT, dev, &sset, sizeof(int),
+			&data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static void
+kcp_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *stats,
+		u64 *data)
+{
+
+	kcp_nl_exec(stats->cmd, dev, stats, sizeof(struct ethtool_stats),
+			data, stats->n_stats);
+}
+
+static const struct ethtool_ops kcp_ethtool_ops = {
+	.begin			= kcp_check_if_running,
+	.get_drvinfo		= kcp_get_drvinfo,
+	.get_settings		= kcp_get_settings,
+	.set_settings		= kcp_set_settings,
+	.get_regs_len		= kcp_get_regs_len,
+	.get_regs		= kcp_get_regs,
+	.get_wol		= kcp_get_wol,
+	.set_wol		= kcp_set_wol,
+	.nway_reset		= kcp_nway_reset,
+	.get_link		= ethtool_op_get_link,
+	.get_eeprom_len		= kcp_get_eeprom_len,
+	.get_eeprom		= kcp_get_eeprom,
+	.set_eeprom		= kcp_set_eeprom,
+	.get_ringparam		= kcp_get_ringparam,
+	.set_ringparam		= kcp_set_ringparam,
+	.get_pauseparam		= kcp_get_pauseparam,
+	.set_pauseparam		= kcp_set_pauseparam,
+	.get_msglevel		= kcp_get_msglevel,
+	.set_msglevel		= kcp_set_msglevel,
+	.get_strings		= kcp_get_strings,
+	.get_sset_count		= kcp_get_sset_count,
+	.get_ethtool_stats	= kcp_get_ethtool_stats,
+};
+
+void
+kcp_set_ethtool_ops(struct net_device *netdev)
+{
+	netdev->ethtool_ops = &kcp_ethtool_ops;
+}
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_misc.c b/lib/librte_eal/linuxapp/kcp/kcp_misc.c
new file mode 100644
index 0000000..6df0d1b
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_misc.c
@@ -0,0 +1,282 @@ 
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include <linux/module.h>
+#include <linux/miscdevice.h>
+
+#include "kcp_dev.h"
+
+#define KCP_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
+
+static volatile unsigned long device_in_use; /* device in use flag */
+
+/* kcp list lock */
+static DECLARE_RWSEM(kcp_list_lock);
+
+/* kcp list */
+static struct list_head kcp_list_head = LIST_HEAD_INIT(kcp_list_head);
+
+static int
+kcp_open(struct inode *inode, struct file *file)
+{
+	/* kcp device can be opened by one user only, test and set bit */
+	if (test_and_set_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use))
+		return -EBUSY;
+
+	KCP_PRINT("/dev/kcp opened\n");
+
+	kcp_nl_init();
+
+	return 0;
+}
+
+static int
+kcp_dev_remove(struct kcp_dev *dev)
+{
+	if (!dev)
+		return -ENODEV;
+
+	if (dev->net_dev) {
+		unregister_netdev(dev->net_dev);
+		free_netdev(dev->net_dev);
+	}
+
+	return 0;
+}
+
+static int
+kcp_release(struct inode *inode, struct file *file)
+{
+	struct kcp_dev *dev, *n;
+
+	down_write(&kcp_list_lock);
+	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
+		list_del(&dev->list);
+		kcp_dev_remove(dev);
+	}
+	up_write(&kcp_list_lock);
+
+	kcp_nl_release();
+
+	/* Clear the bit of device in use */
+	clear_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use);
+
+	KCP_PRINT("/dev/kcp closed\n");
+
+	return 0;
+}
+
+static int
+kcp_check_param(struct kcp_dev *kcp, char *name)
+{
+	if (!kcp)
+		return -1;
+
+	/* Check if network name has been used */
+	if (!strncmp(kcp->name, name, RTE_KCP_NAMESIZE)) {
+		KCP_ERR("KCP interface name %s duplicated\n", name);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+kcp_ioctl_create(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+	int ret;
+	struct net_device *net_dev = NULL;
+	struct kcp_dev *kcp, *dev, *n;
+	struct net *net;
+	char name[RTE_KCP_NAMESIZE];
+	unsigned int instance = ioctl_param;
+	char mac[ETH_ALEN] = {0};
+
+	KCP_PRINT("Creating kcp...\n");
+
+	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
+
+	/* Check if it has been created */
+	down_read(&kcp_list_lock);
+	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
+		if (kcp_check_param(dev, name) < 0) {
+			up_read(&kcp_list_lock);
+			return -EINVAL;
+		}
+	}
+	up_read(&kcp_list_lock);
+
+	net_dev = alloc_netdev(sizeof(struct kcp_dev), name,
+#ifdef NET_NAME_UNKNOWN
+							NET_NAME_UNKNOWN,
+#endif
+							kcp_net_init);
+	if (net_dev == NULL) {
+		KCP_ERR("error allocating device \"%s\"\n", name);
+		return -EBUSY;
+	}
+
+	net = get_net_ns_by_pid(task_pid_vnr(current));
+	if (IS_ERR(net)) {
+		free_netdev(net_dev);
+		return PTR_ERR(net);
+	}
+	dev_net_set(net_dev, net);
+	put_net(net);
+
+	kcp = netdev_priv(net_dev);
+
+	kcp->net_dev = net_dev;
+	kcp->port_id = instance;
+	init_completion(&kcp->msg_received);
+	strncpy(kcp->name, name, RTE_KCP_NAMESIZE);
+
+	kcp_nl_exec(RTE_KCP_REQ_GET_MAC, net_dev, NULL, 0, mac, ETH_ALEN);
+	memcpy(net_dev->dev_addr, mac, net_dev->addr_len);
+
+	kcp_set_ethtool_ops(net_dev);
+	ret = register_netdev(net_dev);
+	if (ret) {
+		KCP_ERR("error %i registering device \"%s\"\n", ret, name);
+		free_netdev(net_dev);
+		return -ENODEV;
+	}
+
+	down_write(&kcp_list_lock);
+	list_add(&kcp->list, &kcp_list_head);
+	up_write(&kcp_list_lock);
+
+	return 0;
+}
+
+static int
+kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+	int ret = -EINVAL;
+	struct kcp_dev *dev;
+	struct kcp_dev *n;
+	char name[RTE_KCP_NAMESIZE];
+	unsigned int instance = ioctl_param;
+
+	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
+
+	down_write(&kcp_list_lock);
+	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
+		if (strncmp(dev->name, name, RTE_KCP_NAMESIZE) != 0)
+			continue;
+		list_del(&dev->list);
+		kcp_dev_remove(dev);
+		ret = 0;
+		break;
+	}
+	up_write(&kcp_list_lock);
+	KCP_INFO("%s release kcp named %s\n",
+		(ret == 0 ? "Successfully" : "Unsuccessfully"), name);
+
+	return ret;
+}
+
+static int
+kcp_ioctl(struct inode *inode, unsigned int ioctl_num,
+	unsigned long ioctl_param)
+{
+	int ret = -EINVAL;
+
+	KCP_DBG("IOCTL num=0x%0x param=0x%0lx\n", ioctl_num, ioctl_param);
+
+	/*
+	 * Switch according to the ioctl called
+	 */
+	switch (_IOC_NR(ioctl_num)) {
+	case _IOC_NR(RTE_KCP_IOCTL_TEST):
+		/* For test only, not used */
+		break;
+	case _IOC_NR(RTE_KCP_IOCTL_CREATE):
+		ret = kcp_ioctl_create(ioctl_num, ioctl_param);
+		break;
+	case _IOC_NR(RTE_KCP_IOCTL_RELEASE):
+		ret = kcp_ioctl_release(ioctl_num, ioctl_param);
+		break;
+	default:
+		KCP_DBG("IOCTL default\n");
+		break;
+	}
+
+	return ret;
+}
+
+static int
+kcp_compat_ioctl(struct inode *inode, unsigned int ioctl_num,
+		unsigned long ioctl_param)
+{
+	/* 32-bit applications on a 64-bit OS are not supported yet */
+	KCP_PRINT("Not implemented.\n");
+
+	return -EINVAL;
+}
+
+static const struct file_operations kcp_fops = {
+	.owner = THIS_MODULE,
+	.open = kcp_open,
+	.release = kcp_release,
+	.unlocked_ioctl = (void *)kcp_ioctl,
+	.compat_ioctl = (void *)kcp_compat_ioctl,
+};
+
+static struct miscdevice kcp_misc = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = KCP_DEVICE,
+	.fops = &kcp_fops,
+};
+
+static int __init
+kcp_init(void)
+{
+	KCP_PRINT("DPDK kcp module loading\n");
+
+	if (misc_register(&kcp_misc) != 0) {
+		KCP_ERR("Misc registration failed\n");
+		return -EPERM;
+	}
+
+	/* Clear the bit of device in use */
+	clear_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use);
+
+	KCP_PRINT("DPDK kcp module loaded\n");
+
+	return 0;
+}
+module_init(kcp_init);
+
+static void __exit
+kcp_exit(void)
+{
+	misc_deregister(&kcp_misc);
+	KCP_PRINT("DPDK kcp module unloaded\n");
+}
+module_exit(kcp_exit);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Kernel Module for managing kcp devices");
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_net.c b/lib/librte_eal/linuxapp/kcp/kcp_net.c
new file mode 100644
index 0000000..9dacaaa
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_net.c
@@ -0,0 +1,209 @@ 
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+/*
+ * This code is inspired by the book "Linux Device Drivers" by
+ * Alessandro Rubini and Jonathan Corbet, published by O'Reilly & Associates
+ */
+
+#include <linux/version.h>
+#include <linux/etherdevice.h> /* eth_type_trans */
+
+#include "kcp_dev.h"
+
+/*
+ * Open and close
+ */
+static int
+kcp_net_open(struct net_device *dev)
+{
+	kcp_nl_exec(RTE_KCP_REQ_START_PORT, dev, NULL, 0, NULL, 0);
+	netif_start_queue(dev);
+	return 0;
+}
+
+static int
+kcp_net_release(struct net_device *dev)
+{
+	kcp_nl_exec(RTE_KCP_REQ_STOP_PORT, dev, NULL, 0, NULL, 0);
+	netif_stop_queue(dev); /* can't transmit any more */
+	return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int
+kcp_net_config(struct net_device *dev, struct ifmap *map)
+{
+	if (dev->flags & IFF_UP) /* can't act on a running interface */
+		return -EBUSY;
+
+	/* ignore other fields */
+	return 0;
+}
+
+static int
+kcp_net_change_mtu(struct net_device *dev, int new_mtu)
+{
+	int err;
+
+	KCP_DBG("kcp_net_change_mtu new mtu %d to be set\n", new_mtu);
+	err = kcp_nl_exec(RTE_KCP_REQ_CHANGE_MTU, dev, &new_mtu, sizeof(int),
+			NULL, 0);
+
+	if (err == 0)
+		dev->mtu = new_mtu;
+
+	return err;
+}
+
+/*
+ * Ioctl commands
+ */
+static int
+kcp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	KCP_DBG("kcp_net_ioctl\n");
+
+	return 0;
+}
+
+/*
+ * Return statistics to the caller
+ */
+static struct rtnl_link_stats64 *
+kcp_net_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	int err;
+
+	err = kcp_nl_exec(RTE_KCP_REQ_GET_STATS, dev, NULL, 0,
+			stats, sizeof(struct rtnl_link_stats64));
+
+	return stats;
+}
+
+/**
+ * kcp_net_set_mac - Change the Ethernet Address of the KCP NIC
+ * @dev: network interface device structure
+ * @p: pointer to an address structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int
+kcp_net_set_mac(struct net_device *dev, void *p)
+{
+	struct sockaddr *addr = p;
+	int err;
+
+	if (!is_valid_ether_addr((unsigned char *)(addr->sa_data)))
+		return -EADDRNOTAVAIL;
+
+	err = kcp_nl_exec(RTE_KCP_REQ_SET_MAC, dev, addr->sa_data,
+			dev->addr_len, NULL, 0);
+	if (err < 0)
+		return -EADDRNOTAVAIL;
+
+	memcpy(dev->dev_addr, addr->sa_data, dev->addr_len);
+	return 0;
+}
+
+#if (KERNEL_VERSION(3, 9, 0) <= LINUX_VERSION_CODE)
+static int
+kcp_net_change_carrier(struct net_device *dev, bool new_carrier)
+{
+	if (new_carrier)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
+	return 0;
+}
+#endif
+
+static const struct net_device_ops kcp_net_netdev_ops = {
+	.ndo_open = kcp_net_open,
+	.ndo_stop = kcp_net_release,
+	.ndo_set_config = kcp_net_config,
+	.ndo_change_mtu = kcp_net_change_mtu,
+	.ndo_do_ioctl = kcp_net_ioctl,
+	.ndo_get_stats64 = kcp_net_stats64,
+	.ndo_set_mac_address = kcp_net_set_mac,
+#if (KERNEL_VERSION(3, 9, 0) <= LINUX_VERSION_CODE)
+	.ndo_change_carrier = kcp_net_change_carrier,
+#endif
+};
+
+/*
+ *  Fill the eth header
+ */
+static int
+kcp_net_header(struct sk_buff *skb, struct net_device *dev,
+		unsigned short type, const void *daddr,
+		const void *saddr, unsigned int len)
+{
+	struct ethhdr *eth = (struct ethhdr *) skb_push(skb, ETH_HLEN);
+
+	memcpy(eth->h_source, saddr ? saddr : dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest,   daddr ? daddr : dev->dev_addr, dev->addr_len);
+	eth->h_proto = htons(type);
+
+	return dev->hard_header_len;
+}
+
+/*
+ * Re-fill the eth header
+ */
+#if (KERNEL_VERSION(4, 1, 0) > LINUX_VERSION_CODE)
+static int
+kcp_net_rebuild_header(struct sk_buff *skb)
+{
+	struct net_device *dev = skb->dev;
+	struct ethhdr *eth = (struct ethhdr *) skb->data;
+
+	memcpy(eth->h_source, dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest, dev->dev_addr, dev->addr_len);
+
+	return 0;
+}
+#endif
+
+static const struct header_ops kcp_net_header_ops = {
+	.create  = kcp_net_header,
+#if (KERNEL_VERSION(4, 1, 0) > LINUX_VERSION_CODE)
+	.rebuild = kcp_net_rebuild_header,
+#endif
+	.cache   = NULL,  /* disable caching */
+};
+
+void
+kcp_net_init(struct net_device *dev)
+{
+	KCP_DBG("kcp_net_init\n");
+
+	ether_setup(dev); /* assign some of the fields */
+	dev->netdev_ops      = &kcp_net_netdev_ops;
+	dev->header_ops      = &kcp_net_header_ops;
+
+	dev->flags |= IFF_UP;
+}
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_nl.c b/lib/librte_eal/linuxapp/kcp/kcp_nl.c
new file mode 100644
index 0000000..e989d2d
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_nl.c
@@ -0,0 +1,194 @@ 
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include <net/sock.h>
+
+#include "kcp_dev.h"
+
+#define KCP_NL_GRP 31
+
+#define KCP_ETHTOOL_MSG_LEN 500
+struct kcp_ethtool_msg {
+	int cmd_id;
+	int port_id;
+	char input_buffer[KCP_ETHTOOL_MSG_LEN];
+	char output_buffer[KCP_ETHTOOL_MSG_LEN];
+	int input_buf_len;
+	int output_buf_len;
+	int err;
+};
+
+static struct ethtool_input_buffer {
+	int magic;
+	void *buffer;
+	int length;
+	struct completion *msg_received;
+	int *err;
+} ethtool_input_buffer;
+
+static struct sock *nl_sock;
+static int pid __read_mostly = -1;
+static struct mutex sync_lock;
+
+static int
+kcp_input_buffer_register(int magic, void *buffer, int length,
+		struct completion *msg_received, int *err)
+{
+	if (ethtool_input_buffer.buffer == NULL) {
+		ethtool_input_buffer.magic = magic;
+		ethtool_input_buffer.buffer = buffer;
+		ethtool_input_buffer.length = length;
+		ethtool_input_buffer.msg_received = msg_received;
+		ethtool_input_buffer.err = err;
+		return 0;
+	}
+
+	return 1;
+}
+
+static void
+kcp_input_buffer_unregister(int magic)
+{
+	if (ethtool_input_buffer.buffer != NULL) {
+		if (magic == ethtool_input_buffer.magic) {
+			ethtool_input_buffer.magic = -1;
+			ethtool_input_buffer.buffer = NULL;
+			ethtool_input_buffer.length = 0;
+			ethtool_input_buffer.msg_received = NULL;
+			ethtool_input_buffer.err = NULL;
+		}
+	}
+}
+
+static void
+nl_recv(struct sk_buff *skb)
+{
+	struct nlmsghdr *nlh;
+	struct kcp_ethtool_msg ethtool_msg;
+
+	nlh = (struct nlmsghdr *)skb->data;
+	if (pid < 0) {
+		pid = nlh->nlmsg_pid;
+		KCP_INFO("PID: %d\n", pid);
+		return;
+	} else if (pid != nlh->nlmsg_pid) {
+		KCP_INFO("Message from unexpected peer: %d\n", nlh->nlmsg_pid);
+		return;
+	}
+
+	memcpy(&ethtool_msg, NLMSG_DATA(nlh), sizeof(struct kcp_ethtool_msg));
+	KCP_DBG("CMD: %d\n", ethtool_msg.cmd_id);
+
+	if (ethtool_input_buffer.magic > 0) {
+		if (ethtool_input_buffer.buffer != NULL) {
+			memcpy(ethtool_input_buffer.buffer,
+					&ethtool_msg.output_buffer,
+					ethtool_input_buffer.length);
+		}
+		*ethtool_input_buffer.err = ethtool_msg.err;
+		complete(ethtool_input_buffer.msg_received);
+		kcp_input_buffer_unregister(ethtool_input_buffer.magic);
+	}
+}
+
+static int
+kcp_nl_send(int cmd_id, int port_id, void *input_buffer, int input_buf_len)
+{
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh;
+	struct kcp_ethtool_msg ethtool_msg;
+
+	memset(&ethtool_msg, 0, sizeof(struct kcp_ethtool_msg));
+	ethtool_msg.cmd_id = cmd_id;
+	ethtool_msg.port_id = port_id;
+
+	if (input_buffer) {
+		if (input_buf_len == 0 || input_buf_len > KCP_ETHTOOL_MSG_LEN)
+			return -EINVAL;
+		ethtool_msg.input_buf_len = input_buf_len;
+		memcpy(ethtool_msg.input_buffer, input_buffer, input_buf_len);
+	}
+
+	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct kcp_ethtool_msg)),
+			GFP_ATOMIC);
+	nlh = nlmsg_put(skb, 0, 0, NLMSG_DONE, sizeof(struct kcp_ethtool_msg),
+			0);
+
+	NETLINK_CB(skb).dst_group = 0;
+
+	memcpy(nlmsg_data(nlh), &ethtool_msg, sizeof(struct kcp_ethtool_msg));
+
+	nlmsg_unicast(nl_sock, skb, pid);
+	KCP_DBG("Sent cmd:%d port:%d\n", cmd_id, port_id);
+
+	/* nlmsg_unicast() consumes the skb, so it must not be freed here */
+
+	return 0;
+}
+
+int
+kcp_nl_exec(int cmd, struct net_device *dev, void *in_data, int in_len,
+		void *out_data, int out_len)
+{
+	struct kcp_dev *priv = netdev_priv(dev);
+	int err = -EINVAL;
+	int ret;
+
+	mutex_lock(&sync_lock);
+	ret = kcp_input_buffer_register(cmd, out_data, out_len,
+			&priv->msg_received, &err);
+	if (ret) {
+		mutex_unlock(&sync_lock);
+		return -EINVAL;
+	}
+
+	kcp_nl_send(cmd, priv->port_id, in_data, in_len);
+	ret = wait_for_completion_interruptible_timeout(&priv->msg_received,
+			 msecs_to_jiffies(10));
+	if (ret == 0 || err < 0) {
+		kcp_input_buffer_unregister(ethtool_input_buffer.magic);
+		mutex_unlock(&sync_lock);
+		return ret == 0 ? -EINVAL : err;
+	}
+	mutex_unlock(&sync_lock);
+
+	return 0;
+}
+
+static struct netlink_kernel_cfg cfg = {
+	.input = nl_recv,
+};
+
+void
+kcp_nl_init(void)
+{
+	nl_sock = netlink_kernel_create(&init_net, KCP_NL_GRP, &cfg);
+	mutex_init(&sync_lock);
+}
+
+void
+kcp_nl_release(void)
+{
+	netlink_kernel_release(nl_sock);
+	pid = -1;
+}
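
A minimal userspace-side sketch of how a reply to one of these requests could be prepared, assuming the application mirrors struct kcp_ethtool_msg from kcp_nl.c field for field. The helper name kcp_fill_reply() is hypothetical and does not appear in this series. nl_recv() copies the kernel caller's out_len bytes out of output_buffer and reads the err field, so the sketch only shows bounding the payload to KCP_ETHTOOL_MSG_LEN and setting err before the message would be sent back over the netlink socket; the socket handling itself is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define KCP_ETHTOOL_MSG_LEN 500

/* Assumed to match the kernel-side layout in kcp_nl.c exactly. */
struct kcp_ethtool_msg {
	int cmd_id;
	int port_id;
	char input_buffer[KCP_ETHTOOL_MSG_LEN];
	char output_buffer[KCP_ETHTOOL_MSG_LEN];
	int input_buf_len;
	int output_buf_len;
	int err;
};

/*
 * Hypothetical helper: fill the reply part of a request that was
 * received from the module. cmd_id and port_id are left untouched so
 * the reply still identifies the request it answers.
 *
 * Returns 0 on success, -EINVAL if the arguments are unusable or the
 * payload does not fit into output_buffer.
 */
int
kcp_fill_reply(struct kcp_ethtool_msg *msg, const void *data, size_t len,
		int err)
{
	if (msg == NULL || (data == NULL && len != 0))
		return -EINVAL;

	/* The buffer size is fixed by the message layout. */
	assert(sizeof(msg->output_buffer) == KCP_ETHTOOL_MSG_LEN);
	if (len > KCP_ETHTOOL_MSG_LEN)
		return -EINVAL;

	/* Clear stale bytes so no old data is sent back to the kernel. */
	memset(msg->output_buffer, 0, sizeof(msg->output_buffer));
	if (len != 0)
		memcpy(msg->output_buffer, data, len);

	msg->output_buf_len = (int)len;
	msg->err = err;

	return 0;
}
```

For RTE_KCP_REQ_GET_MAC, for example, the application would pass the six address bytes as data with err set to 0; for a request it cannot serve it would pass no data and a negative errno value in err, which kcp_nl_exec() then returns to its caller.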