[dpdk-dev,V12,1/3] eal: add uevent monitor api and callback func

Message ID 1516248723-16985-1-git-send-email-jia.guo@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  fail     Compilation issues

Commit Message

Guo, Jia Jan. 18, 2018, 4:12 a.m. UTC
This patch adds a general uevent mechanism to the EAL device layer so
that uevents from any Linux kernel object can be monitored. Users can
call these APIs to monitor and read the device status information sent
from the kernel and then handle it accordingly. For example, when a
hotplug uevent is detected, the user can detach or attach the device,
which also helps fail-safe handling work smoothly.

About uevent monitoring:
a: add an epoll loop on the netlink socket to monitor the uevents of
   the device.
b: add enum rte_dev_event_type and struct rte_dev_event.
c: add the APIs below to the rte EAL device layer.
   rte_dev_callback_register
   rte_dev_callback_unregister
   _rte_dev_callback_process
   rte_dev_event_monitor_start
   rte_dev_event_monitor_stop

Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
v11->v12:
handle a NULL device name in callbacks to monitor the uevents of all devices
---
 lib/librte_eal/bsdapp/eal/eal_dev.c     |  38 ++++++
 lib/librte_eal/common/eal_common_dev.c  | 128 ++++++++++++++++++
 lib/librte_eal/common/include/rte_dev.h | 119 +++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile    |   1 +
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 223 ++++++++++++++++++++++++++++++++
 5 files changed, 509 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
  

Comments

Thomas Monjalon Jan. 19, 2018, 1:13 a.m. UTC | #1
18/01/2018 05:12, Jeff Guo:
> + * It registers the callback for the specific device.
> + * Multiple callbacks cal be registered at the same time.
> + *
> + * @param device_name
> + *  The device name, that is the param name of the struct rte_device,

Why not use an rte_device pointer?

> + *  null value means for all devices.

I don't see any management of NULL value.
On the contrary, I see
+       if (device_name == NULL)
+               return -EINVAL;

> + * @param cb_fn
> + *  callback address.
> + * @param cb_arg
> + *  address of parameter for callback.
> + *
> + * @return
> + *  - On success, zero.
> + *  - On failure, a negative value.
> + */
> +int rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
> +                                       void *cb_arg);
> +
  
Guo, Jia Jan. 19, 2018, 2:51 a.m. UTC | #2
On 1/19/2018 9:13 AM, Thomas Monjalon wrote:
> 18/01/2018 05:12, Jeff Guo:
>> + * It registers the callback for the specific device.
>> + * Multiple callbacks cal be registered at the same time.
>> + *
>> + * @param device_name
>> + *  The device name, that is the param name of the struct rte_device,
> Why not use an rte_device pointer?
Sorry, I may have already given the reason in another mail thread for
this patch, but I will explain again: if NULL is used to mean all
devices, a callback cannot be attached to a NULL rte_device pointer.
>> + *  null value means for all devices.
> I don't see any management of NULL value.
> On the contrary, I see
> +       if (device_name == NULL)
> +               return -EINVAL;
The device_name here comes from the uevent message, so it should never
be NULL. The NULL value that means all devices is carried by the
dev_name field of struct rte_dev_event_callback and is handled by the
code below: if that dev_name is NULL, it does not matter whether the
device_name has been registered or not. I think that covers monitoring
of all new devices.

	TAILQ_FOREACH(cb_lst, &(dev_event_cbs), next) {
		if (cb_lst->cb_fn == NULL || (strcmp(cb_lst->dev_name,
			device_name) && cb_lst->dev_name))
			continue;
		dev_cb = *cb_lst;
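
For illustration only: in the snippet above, strcmp() is evaluated before the cb_lst->dev_name check, so an entry registered with a NULL dev_name would reach strcmp() first. Below is a minimal sketch of a matching order that tests for the wildcard before comparing. It uses stand-in types rather than the real DPDK structures (struct cb_entry, dev_name_matches(), dispatch_event() and count_cb() are hypothetical names), and locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Stand-in for the patch's callback entry; not the real DPDK definition. */
struct cb_entry {
	TAILQ_ENTRY(cb_entry) next;
	int (*cb_fn)(const char *device_name, void *cb_arg);
	void *cb_arg;
	const char *dev_name;	/* NULL means "all devices" */
};

TAILQ_HEAD(cb_list, cb_entry);

/* Return 1 if an entry registered under reg_name should see event_name. */
static int
dev_name_matches(const char *reg_name, const char *event_name)
{
	if (reg_name == NULL)	/* wildcard: test this before strcmp() */
		return 1;
	return strcmp(reg_name, event_name) == 0;
}

/* Call every matching callback; return how many ran, or -1 on bad input. */
int
dispatch_event(struct cb_list *list, const char *event_name)
{
	struct cb_entry *cb;
	int called = 0;

	assert(list != NULL);
	if (event_name == NULL)
		return -1;

	TAILQ_FOREACH(cb, list, next) {
		if (cb->cb_fn == NULL ||
		    !dev_name_matches(cb->dev_name, event_name))
			continue;
		cb->cb_fn(event_name, cb->cb_arg);
		called++;
	}
	return called;
}

/* Example callback: counts its invocations through cb_arg. */
int
count_cb(const char *device_name, void *cb_arg)
{
	(void)device_name;
	++*(int *)cb_arg;
	return 0;
}
```

With one entry registered under NULL and one under a specific name, an event for that name reaches both entries, while an event for any other name reaches only the NULL entry.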


>> + * @param cb_fn
>> + *  callback address.
>> + * @param cb_arg
>> + *  address of parameter for callback.
>> + *
>> + * @return
>> + *  - On success, zero.
>> + *  - On failure, a negative value.
>> + */
>> +int rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
>> +                                       void *cb_arg);
>> +
>
>
  
Jingjing Wu Jan. 24, 2018, 2:52 p.m. UTC | #3
> -----Original Message-----
> From: Guo, Jia
> Sent: Thursday, January 18, 2018 12:12 PM
> To: stephen@networkplumber.org; Richardson, Bruce <bruce.richardson@intel.com>;
> Yigit, Ferruh <ferruh.yigit@intel.com>; gaetan.rivet@6wind.com
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; jblunck@infradead.org;
> shreyansh.jain@nxp.com; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org; Guo, Jia
> <jia.guo@intel.com>; thomas@monjalon.net; Zhang, Helin <helin.zhang@intel.com>;
> motih@mellanox.com
> Subject: [PATCH V12 1/3] eal: add uevent monitor api and callback func
> 
> This patch aim to add a general uevent mechanism in eal device layer,
> to enable all linux kernel object uevent monitoring, user could use these
> APIs to monitor and read out the device status info that sent from the
> kernel side, then corresponding to handle it, such as when detect hotplug
> uevent type, user could detach or attach the device, and more it benefit
> to use to do smoothly fail safe work.
> 
> About uevent monitoring:
> a: add one epolling to poll the netlink socket, to monitor the uevent of
>    the device.
> b: add enum of rte_eal_dev_event_type and struct of rte_eal_uevent.
> c: add below APIs in rte eal device layer.
>    rte_dev_callback_register
>    rte_dev_callback_unregister
>    _rte_dev_callback_process
>    rte_dev_event_monitor_start
>    rte_dev_event_monitor_stop
> 
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> ---
> v12->v11:
> identify null param in callback for monitor all devices uevent
> ---
>  lib/librte_eal/bsdapp/eal/eal_dev.c     |  38 ++++++
>  lib/librte_eal/common/eal_common_dev.c  | 128 ++++++++++++++++++
>  lib/librte_eal/common/include/rte_dev.h | 119 +++++++++++++++++
>  lib/librte_eal/linuxapp/eal/Makefile    |   1 +
>  lib/librte_eal/linuxapp/eal/eal_dev.c   | 223 ++++++++++++++++++++++++++++++++
>  5 files changed, 509 insertions(+)
>  create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
> 

[......]

> +int
> +rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
> +				void *cb_arg)
> +{
> +	struct rte_dev_event_callback *event_cb = NULL;
> +
> +	rte_spinlock_lock(&rte_dev_event_lock);
> +
> +	if (TAILQ_EMPTY(&(dev_event_cbs)))
> +		TAILQ_INIT(&(dev_event_cbs));
> +
> +	TAILQ_FOREACH(event_cb, &(dev_event_cbs), next) {
> +		if (event_cb->cb_fn == cb_fn &&
> +			event_cb->cb_arg == cb_arg &&
> +			!strcmp(event_cb->dev_name, device_name))
device_name == NULL means all devices, right? Can strcmp accept NULL arguments?

> +			break;
> +	}
> +
> +	/* create a new callback. */
> +	if (event_cb == NULL) {
> +		/* allocate a new user callback entity */
> +		event_cb = malloc(sizeof(struct rte_dev_event_callback));
> +		if (event_cb != NULL) {
> +			event_cb->cb_fn = cb_fn;
> +			event_cb->cb_arg = cb_arg;
> +			event_cb->dev_name = device_name;
> +		}
Is it OK to call TAILQ_INSERT_TAIL below if event_cb == NULL?

> +		TAILQ_INSERT_TAIL(&(dev_event_cbs), event_cb, next);
> +	}
> +
> +	rte_spinlock_unlock(&rte_dev_event_lock);
> +	return (event_cb == NULL) ? -1 : 0;
> +}
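
A possible shape for the two points raised above, sketched with plain libc and stand-in names (cb_register(), same_name(), dup_name(), cb_count() and noop_cb() are hypothetical, not the patch's API). Names are compared with a helper that tolerates NULL, and the entry is inserted only after allocation has succeeded. The spinlock from the patch is omitted for brevity, and copying the name is a design choice of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

typedef int (*event_cb_fn)(const char *device_name, void *cb_arg);

struct cb_entry {
	TAILQ_ENTRY(cb_entry) next;
	event_cb_fn cb_fn;
	void *cb_arg;
	char *dev_name;		/* private copy; NULL means "all devices" */
};

TAILQ_HEAD(cb_list, cb_entry);

/* Statically initialised, so no TAILQ_INIT() is needed at run time. */
static struct cb_list cbs = TAILQ_HEAD_INITIALIZER(cbs);

/* Name comparison that never passes NULL to strcmp(). */
static int
same_name(const char *a, const char *b)
{
	if (a == NULL || b == NULL)
		return a == b;
	return strcmp(a, b) == 0;
}

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *
dup_name(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy != NULL)
		memcpy(copy, s, n);
	return copy;
}

/* Returns 0 on success or if already registered, -1 on failure. */
int
cb_register(const char *device_name, event_cb_fn cb_fn, void *cb_arg)
{
	struct cb_entry *cb;

	if (cb_fn == NULL)
		return -1;

	TAILQ_FOREACH(cb, &cbs, next) {
		if (cb->cb_fn == cb_fn && cb->cb_arg == cb_arg &&
		    same_name(cb->dev_name, device_name))
			return 0;
	}

	cb = calloc(1, sizeof(*cb));
	if (cb == NULL)
		return -1;	/* nothing is inserted on allocation failure */

	if (device_name != NULL) {
		cb->dev_name = dup_name(device_name);
		if (cb->dev_name == NULL) {
			free(cb);
			return -1;
		}
	}
	cb->cb_fn = cb_fn;
	cb->cb_arg = cb_arg;
	TAILQ_INSERT_TAIL(&cbs, cb, next);
	return 0;
}

/* Number of registered entries, for inspection. */
int
cb_count(void)
{
	struct cb_entry *cb;
	int n = 0;

	TAILQ_FOREACH(cb, &cbs, next)
		n++;
	return n;
}

/* Example callback that does nothing. */
int
noop_cb(const char *device_name, void *cb_arg)
{
	(void)device_name;
	(void)cb_arg;
	return 0;
}
```

Registering the same (name, function, argument) triple twice leaves a single entry, and a NULL name is accepted as the all-devices registration.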
> +
> +int
> +rte_dev_callback_unregister(char *device_name, rte_dev_event_cb_fn cb_fn,
> +				void *cb_arg)
> +{
> +	int ret;
> +	struct rte_dev_event_callback *event_cb, *next;
> +
> +	if (!cb_fn || device_name == NULL)
> +		return -EINVAL;
> +
> +	rte_spinlock_lock(&rte_dev_event_lock);
> +
> +	ret = 0;
> +
> +	for (event_cb = TAILQ_FIRST(&(dev_event_cbs)); event_cb != NULL;
> +	      event_cb = next) {
> +
> +		next = TAILQ_NEXT(event_cb, next);
> +
> +		if (event_cb->cb_fn != cb_fn ||
> +				(event_cb->cb_arg != (void *)-1 &&
> +				event_cb->cb_arg != cb_arg) ||
> +				strcmp(event_cb->dev_name, device_name))

Same comment as above.

> +			continue;
> +
> +		/*
> +		 * if this callback is not executing right now,
> +		 * then remove it.
> +		 */
> +		if (event_cb->active == 0) {
> +			TAILQ_REMOVE(&(dev_event_cbs), event_cb, next);
> +			rte_free(event_cb);
> +		} else {
> +			ret = -EAGAIN;
> +		}
> +	}
> +
> +	rte_spinlock_unlock(&rte_dev_event_lock);
> +	return ret;
> +}
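
A sketch of how the removal loop might look with a NULL-tolerant name comparison, again with stand-in names (cb_unregister(), same_name() and noop_cb() are hypothetical). Two details are assumptions of the sketch rather than confirmed intent: the (void *)-1 wildcard is tested on the caller's cb_arg, following the wording of the header comment, and the return value is the number of entries removed, which is what the header comment documents. Note also that the patch allocates entries with malloc() but releases them with rte_free(); the sketch pairs a malloc()-family allocation with free(). Locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

typedef int (*event_cb_fn)(const char *device_name, void *cb_arg);

struct cb_entry {
	TAILQ_ENTRY(cb_entry) next;
	event_cb_fn cb_fn;
	void *cb_arg;
	const char *dev_name;	/* NULL means "all devices" */
	int active;		/* non-zero while the callback is running */
};

TAILQ_HEAD(cb_list, cb_entry);

/* Name comparison that never passes NULL to strcmp(). */
static int
same_name(const char *a, const char *b)
{
	if (a == NULL || b == NULL)
		return a == b;
	return strcmp(a, b) == 0;
}

/*
 * Remove every entry matching (device_name, cb_fn, cb_arg).
 * A NULL device_name matches only entries registered for all devices.
 * cb_arg == (void *)-1 matches any argument.
 * Returns the number of entries removed, -EINVAL on bad input, or
 * -EAGAIN if a matching entry was busy and had to be left in place.
 */
int
cb_unregister(struct cb_list *list, const char *device_name,
	      event_cb_fn cb_fn, void *cb_arg)
{
	struct cb_entry *cb, *tmp;
	int removed = 0, busy = 0;

	if (list == NULL || cb_fn == NULL)
		return -EINVAL;

	for (cb = TAILQ_FIRST(list); cb != NULL; cb = tmp) {
		tmp = TAILQ_NEXT(cb, next);	/* fetch before cb is freed */

		if (cb->cb_fn != cb_fn ||
		    (cb_arg != (void *)-1 && cb->cb_arg != cb_arg) ||
		    !same_name(cb->dev_name, device_name))
			continue;

		if (cb->active) {
			busy = 1;
			continue;
		}
		assert(!cb->active);
		TAILQ_REMOVE(list, cb, next);
		free(cb);	/* entries come from malloc()/calloc() */
		removed++;
	}
	return busy ? -EAGAIN : removed;
}

/* Example callback that does nothing. */
int
noop_cb(const char *device_name, void *cb_arg)
{
	(void)device_name;
	(void)cb_arg;
	return 0;
}
```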
> +

[......]

> +int
> +rte_dev_event_monitor_start(void)
> +{
> +	int ret;
> +	struct rte_service_spec service;
> +	uint32_t id;
> +	const uint32_t sid = 0;
> +
> +	if (!service_no_init)
> +		return 0;
> +
> +	uint32_t slcore_1 = rte_get_next_lcore(/* start core */ -1,
> +					       /* skip master */ 1,
> +					       /* wrap */ 0);
> +
> +	ret = rte_service_lcore_add(slcore_1);
> +	if (ret) {
> +		RTE_LOG(ERR, EAL, "dev event monitor lcore add fail");
> +		return ret;
> +	}
> +
> +	memset(&service, 0, sizeof(service));
> +	snprintf(service.name, sizeof(service.name), DEV_EV_MNT_SERVICE_NAME);
> +
> +	service.socket_id = rte_socket_id();
> +	service.callback = dev_uev_monitoring;
> +	service.callback_userdata = NULL;
> +	service.capabilities = 0;
> +	ret = rte_service_component_register(&service, &id);
> +	if (ret) {
> +		RTE_LOG(ERR, EAL, "Failed to register service %s "
> +			"err = %" PRId32,
> +			service.name, ret);
> +		return ret;
> +	}
> +	ret = rte_service_runstate_set(sid, 1);
> +	if (ret) {
> +		RTE_LOG(ERR, EAL, "Failed to set the runstate of "
> +			"the service");
Does any rollback need to be done when this fails?

> +		return ret;
> +	}
> +	ret = rte_service_component_runstate_set(id, 1);
> +	if (ret) {
> +		RTE_LOG(ERR, EAL, "Failed to set the backend runstate"
> +			" of a component");
> +		return ret;
> +	}
> +	ret = rte_service_map_lcore_set(sid, slcore_1, 1);
> +	if (ret) {
> +		RTE_LOG(ERR, EAL, "Failed to enable lcore 1 on "
> +			"dev event monitor service");
> +		return ret;
> +	}
> +	rte_service_lcore_start(slcore_1);
> +	service_no_init = false;
> +	return 0;
> +}
> +
> +int
> +rte_dev_event_monitor_stop(void)
> +{
> +	service_exit = true;
> +	service_no_init = true;
> +	return 0;

Are start and stop meant to be called as a pair? If we call rte_dev_event_monitor_start to start the monitor and then call rte_dev_event_monitor_stop to stop it, how do we start it again?

> +}
> --
> 2.7.4
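
One common way to address the rollback and restart questions above is goto-based unwinding plus a single running flag. The sketch below uses hypothetical stand-in steps (the step_*() helpers, fail_at, monitor_start() and monitor_stop() are invented so the flow can be exercised without DPDK). In the real code the undo steps would presumably map to calls such as rte_service_component_unregister() and rte_service_lcore_del(); that mapping is an assumption here, not something this patch shows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the service setup steps. */
static int fail_at;	/* 1..3 makes that step fail, 0 means no failure */
static bool lcore_added, service_registered, service_running;
static bool monitor_running;

static int step_lcore_add(void)
{
	if (fail_at == 1)
		return -1;
	lcore_added = true;
	return 0;
}

static int step_register(void)
{
	if (fail_at == 2)
		return -1;
	service_registered = true;
	return 0;
}

static int step_run(void)
{
	if (fail_at == 3)
		return -1;
	service_running = true;
	return 0;
}

static void step_stop(void)       { service_running = false; }
static void step_unregister(void) { service_registered = false; }
static void step_lcore_del(void)  { lcore_added = false; }

int
monitor_start(void)
{
	int ret;

	if (monitor_running)
		return 0;
	/* Nothing may be left half set up from an earlier attempt. */
	assert(!lcore_added && !service_registered && !service_running);

	ret = step_lcore_add();
	if (ret)
		return ret;
	ret = step_register();
	if (ret)
		goto undo_lcore;
	ret = step_run();
	if (ret)
		goto undo_register;

	monitor_running = true;
	return 0;

undo_register:
	step_unregister();
undo_lcore:
	step_lcore_del();
	return ret;
}

int
monitor_stop(void)
{
	if (!monitor_running)
		return 0;
	/* Tear down in the reverse order of monitor_start(). */
	step_stop();
	step_unregister();
	step_lcore_del();
	monitor_running = false;
	return 0;
}
```

Because stop undoes every step that start performed, a later start begins from a clean state, so the pair can be called repeatedly.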
  
Guo, Jia Jan. 25, 2018, 2:57 p.m. UTC | #4
Thanks for your review. Please check v13.
On 1/24/2018 10:52 PM, Wu, Jingjing wrote:
>
>> -----Original Message-----
>> From: Guo, Jia
>> Sent: Thursday, January 18, 2018 12:12 PM
>> To: stephen@networkplumber.org; Richardson, Bruce <bruce.richardson@intel.com>;
>> Yigit, Ferruh <ferruh.yigit@intel.com>; gaetan.rivet@6wind.com
>> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; jblunck@infradead.org;
>> shreyansh.jain@nxp.com; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org; Guo, Jia
>> <jia.guo@intel.com>; thomas@monjalon.net; Zhang, Helin <helin.zhang@intel.com>;
>> motih@mellanox.com
>> Subject: [PATCH V12 1/3] eal: add uevent monitor api and callback func
>>
>> This patch aim to add a general uevent mechanism in eal device layer,
>> to enable all linux kernel object uevent monitoring, user could use these
>> APIs to monitor and read out the device status info that sent from the
>> kernel side, then corresponding to handle it, such as when detect hotplug
>> uevent type, user could detach or attach the device, and more it benefit
>> to use to do smoothly fail safe work.
>>
>> About uevent monitoring:
>> a: add one epolling to poll the netlink socket, to monitor the uevent of
>>     the device.
>> b: add enum of rte_eal_dev_event_type and struct of rte_eal_uevent.
>> c: add below APIs in rte eal device layer.
>>     rte_dev_callback_register
>>     rte_dev_callback_unregister
>>     _rte_dev_callback_process
>>     rte_dev_event_monitor_start
>>     rte_dev_event_monitor_stop
>>
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> ---
>> v12->v11:
>> identify null param in callback for monitor all devices uevent
>> ---
>>   lib/librte_eal/bsdapp/eal/eal_dev.c     |  38 ++++++
>>   lib/librte_eal/common/eal_common_dev.c  | 128 ++++++++++++++++++
>>   lib/librte_eal/common/include/rte_dev.h | 119 +++++++++++++++++
>>   lib/librte_eal/linuxapp/eal/Makefile    |   1 +
>>   lib/librte_eal/linuxapp/eal/eal_dev.c   | 223 ++++++++++++++++++++++++++++++++
>>   5 files changed, 509 insertions(+)
>>   create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
>>   create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
>>
> [......]
>
>> +int
>> +rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
>> +				void *cb_arg)
>> +{
>> +	struct rte_dev_event_callback *event_cb = NULL;
>> +
>> +	rte_spinlock_lock(&rte_dev_event_lock);
>> +
>> +	if (TAILQ_EMPTY(&(dev_event_cbs)))
>> +		TAILQ_INIT(&(dev_event_cbs));
>> +
>> +	TAILQ_FOREACH(event_cb, &(dev_event_cbs), next) {
>> +		if (event_cb->cb_fn == cb_fn &&
>> +			event_cb->cb_arg == cb_arg &&
>> +			!strcmp(event_cb->dev_name, device_name))
> device_name == NULL means all devices, right? Can strcmp accept NULL arguments?
Got it.
>> +			break;
>> +	}
>> +
>> +	/* create a new callback. */
>> +	if (event_cb == NULL) {
>> +		/* allocate a new user callback entity */
>> +		event_cb = malloc(sizeof(struct rte_dev_event_callback));
>> +		if (event_cb != NULL) {
>> +			event_cb->cb_fn = cb_fn;
>> +			event_cb->cb_arg = cb_arg;
>> +			event_cb->dev_name = device_name;
>> +		}
> Is it OK to call TAILQ_INSERT_TAIL below if event_cb == NULL?
Yes, that might be wrong.
>> +		TAILQ_INSERT_TAIL(&(dev_event_cbs), event_cb, next);
>> +	}
>> +
>> +	rte_spinlock_unlock(&rte_dev_event_lock);
>> +	return (event_cb == NULL) ? -1 : 0;
>> +}
>> +
>> +int
>> +rte_dev_callback_unregister(char *device_name, rte_dev_event_cb_fn cb_fn,
>> +				void *cb_arg)
>> +{
>> +	int ret;
>> +	struct rte_dev_event_callback *event_cb, *next;
>> +
>> +	if (!cb_fn || device_name == NULL)
>> +		return -EINVAL;
>> +
>> +	rte_spinlock_lock(&rte_dev_event_lock);
>> +
>> +	ret = 0;
>> +
>> +	for (event_cb = TAILQ_FIRST(&(dev_event_cbs)); event_cb != NULL;
>> +	      event_cb = next) {
>> +
>> +		next = TAILQ_NEXT(event_cb, next);
>> +
>> +		if (event_cb->cb_fn != cb_fn ||
>> +				(event_cb->cb_arg != (void *)-1 &&
>> +				event_cb->cb_arg != cb_arg) ||
>> +				strcmp(event_cb->dev_name, device_name))
> Same comment as above.
OK.
>> +			continue;
>> +
>> +		/*
>> +		 * if this callback is not executing right now,
>> +		 * then remove it.
>> +		 */
>> +		if (event_cb->active == 0) {
>> +			TAILQ_REMOVE(&(dev_event_cbs), event_cb, next);
>> +			rte_free(event_cb);
>> +		} else {
>> +			ret = -EAGAIN;
>> +		}
>> +	}
>> +
>> +	rte_spinlock_unlock(&rte_dev_event_lock);
>> +	return ret;
>> +}
>> +
> [......]
>
>> +int
>> +rte_dev_event_monitor_start(void)
>> +{
>> +	int ret;
>> +	struct rte_service_spec service;
>> +	uint32_t id;
>> +	const uint32_t sid = 0;
>> +
>> +	if (!service_no_init)
>> +		return 0;
>> +
>> +	uint32_t slcore_1 = rte_get_next_lcore(/* start core */ -1,
>> +					       /* skip master */ 1,
>> +					       /* wrap */ 0);
>> +
>> +	ret = rte_service_lcore_add(slcore_1);
>> +	if (ret) {
>> +		RTE_LOG(ERR, EAL, "dev event monitor lcore add fail");
>> +		return ret;
>> +	}
>> +
>> +	memset(&service, 0, sizeof(service));
>> +	snprintf(service.name, sizeof(service.name), DEV_EV_MNT_SERVICE_NAME);
>> +
>> +	service.socket_id = rte_socket_id();
>> +	service.callback = dev_uev_monitoring;
>> +	service.callback_userdata = NULL;
>> +	service.capabilities = 0;
>> +	ret = rte_service_component_register(&service, &id);
>> +	if (ret) {
>> +		RTE_LOG(ERR, EAL, "Failed to register service %s "
>> +			"err = %" PRId32,
>> +			service.name, ret);
>> +		return ret;
>> +	}
>> +	ret = rte_service_runstate_set(sid, 1);
>> +	if (ret) {
>> +		RTE_LOG(ERR, EAL, "Failed to set the runstate of "
>> +			"the service");
> Does any rollback need to be done when this fails?
Yes, the failures should be handled.
>> +		return ret;
>> +	}
>> +	ret = rte_service_component_runstate_set(id, 1);
>> +	if (ret) {
>> +		RTE_LOG(ERR, EAL, "Failed to set the backend runstate"
>> +			" of a component");
>> +		return ret;
>> +	}
>> +	ret = rte_service_map_lcore_set(sid, slcore_1, 1);
>> +	if (ret) {
>> +		RTE_LOG(ERR, EAL, "Failed to enable lcore 1 on "
>> +			"dev event monitor service");
>> +		return ret;
>> +	}
>> +	rte_service_lcore_start(slcore_1);
>> +	service_no_init = false;
>> +	return 0;
>> +}
>> +
>> +int
>> +rte_dev_event_monitor_stop(void)
>> +{
>> +	service_exit = true;
>> +	service_no_init = true;
>> +	return 0;
> Are start and stop meant to be called as a pair? If we call rte_dev_event_monitor_start to start the monitor and then call rte_dev_event_monitor_stop to stop it, how do we start it again?
Sure, they should be controlled as a pair.
>> +}
>> --
>> 2.7.4
  

Patch

diff --git a/lib/librte_eal/bsdapp/eal/eal_dev.c b/lib/librte_eal/bsdapp/eal/eal_dev.c
new file mode 100644
index 0000000..83ffdee
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_dev.c
@@ -0,0 +1,38 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <inttypes.h>
+#include <sys/queue.h>
+#include <sys/signalfd.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/epoll.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdbool.h>
+
+#include <rte_malloc.h>
+#include <rte_bus.h>
+#include <rte_dev.h>
+#include <rte_devargs.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include "eal_thread.h"
+
+int
+rte_dev_event_monitor_start(void)
+{
+	RTE_LOG(ERR, EAL, "Not support event monitor for FreeBSD\n");
+	return -1;
+}
+
+int
+rte_dev_event_monitor_stop(void)
+{
+	RTE_LOG(ERR, EAL, "Not support event monitor for FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index dda8f58..2a196dc 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -42,9 +42,32 @@ 
 #include <rte_devargs.h>
 #include <rte_debug.h>
 #include <rte_log.h>
+#include <rte_spinlock.h>
+#include <rte_malloc.h>
 
 #include "eal_private.h"
 
+/* spinlock for device callbacks */
+static rte_spinlock_t rte_dev_event_lock = RTE_SPINLOCK_INITIALIZER;
+
+/**
+ * The user application callback description.
+ *
+ * It contains callback address to be registered by user application,
+ * the pointer to the parameters for callback, and the device name.
+ */
+struct rte_dev_event_callback {
+	TAILQ_ENTRY(rte_dev_event_callback) next; /**< Callbacks list */
+	rte_dev_event_cb_fn cb_fn;                /**< Callback address */
+	void *cb_arg;                           /**< Callback parameter */
+	char *dev_name;				/**< Callback device name, NULL
+							is for all devices */
+	uint32_t active;                        /**< Callback is executing */
+};
+
+/* A general callbacks list for all callback of devices */
+static struct rte_dev_event_cb_list dev_event_cbs;
+
 static int cmp_detached_dev_name(const struct rte_device *dev,
 	const void *_name)
 {
@@ -234,3 +257,108 @@  int rte_eal_hotplug_remove(const char *busname, const char *devname)
 	rte_eal_devargs_remove(busname, devname);
 	return ret;
 }
+
+int
+rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
+				void *cb_arg)
+{
+	struct rte_dev_event_callback *event_cb = NULL;
+
+	rte_spinlock_lock(&rte_dev_event_lock);
+
+	if (TAILQ_EMPTY(&(dev_event_cbs)))
+		TAILQ_INIT(&(dev_event_cbs));
+
+	TAILQ_FOREACH(event_cb, &(dev_event_cbs), next) {
+		if (event_cb->cb_fn == cb_fn &&
+			event_cb->cb_arg == cb_arg &&
+			!strcmp(event_cb->dev_name, device_name))
+			break;
+	}
+
+	/* create a new callback. */
+	if (event_cb == NULL) {
+		/* allocate a new user callback entity */
+		event_cb = malloc(sizeof(struct rte_dev_event_callback));
+		if (event_cb != NULL) {
+			event_cb->cb_fn = cb_fn;
+			event_cb->cb_arg = cb_arg;
+			event_cb->dev_name = device_name;
+		}
+		TAILQ_INSERT_TAIL(&(dev_event_cbs), event_cb, next);
+	}
+
+	rte_spinlock_unlock(&rte_dev_event_lock);
+	return (event_cb == NULL) ? -1 : 0;
+}
+
+int
+rte_dev_callback_unregister(char *device_name, rte_dev_event_cb_fn cb_fn,
+				void *cb_arg)
+{
+	int ret;
+	struct rte_dev_event_callback *event_cb, *next;
+
+	if (!cb_fn || device_name == NULL)
+		return -EINVAL;
+
+	rte_spinlock_lock(&rte_dev_event_lock);
+
+	ret = 0;
+
+	for (event_cb = TAILQ_FIRST(&(dev_event_cbs)); event_cb != NULL;
+	      event_cb = next) {
+
+		next = TAILQ_NEXT(event_cb, next);
+
+		if (event_cb->cb_fn != cb_fn ||
+				(event_cb->cb_arg != (void *)-1 &&
+				event_cb->cb_arg != cb_arg) ||
+				strcmp(event_cb->dev_name, device_name))
+			continue;
+
+		/*
+		 * if this callback is not executing right now,
+		 * then remove it.
+		 */
+		if (event_cb->active == 0) {
+			TAILQ_REMOVE(&(dev_event_cbs), event_cb, next);
+			rte_free(event_cb);
+		} else {
+			ret = -EAGAIN;
+		}
+	}
+
+	rte_spinlock_unlock(&rte_dev_event_lock);
+	return ret;
+}
+
+int
+_rte_dev_callback_process(char *device_name, enum rte_dev_event_type event,
+				void *cb_arg)
+{
+	struct rte_dev_event_callback dev_cb;
+	struct rte_dev_event_callback *cb_lst;
+	int rc = 0;
+
+	rte_spinlock_lock(&rte_dev_event_lock);
+
+	if (device_name == NULL)
+		return -EINVAL;
+
+	TAILQ_FOREACH(cb_lst, &(dev_event_cbs), next) {
+		if (cb_lst->cb_fn == NULL || (strcmp(cb_lst->dev_name,
+			device_name) && cb_lst->dev_name))
+			continue;
+		dev_cb = *cb_lst;
+		cb_lst->active = 1;
+		if (cb_arg)
+			dev_cb->cb_arg = cb_arg;
+		rc = dev_cb.cb_fn(device_name, event,
+				dev_cb.cb_arg);
+		cb_lst->active = 0;
+	}
+
+	rte_spinlock_unlock(&rte_dev_event_lock);
+	return rc;
+}
diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index 9342e0c..25e6747 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -51,6 +51,30 @@  extern "C" {
 
 #include <rte_log.h>
 
+/**
+ * The device event type.
+ */
+enum rte_dev_event_type {
+	RTE_DEV_EVENT_UNKNOWN,	/**< unknown event type */
+	RTE_DEV_EVENT_ADD,	/**< device being added */
+	RTE_DEV_EVENT_REMOVE,	/**< device being removed */
+	RTE_DEV_EVENT_MAX	/**< max value of this enum */
+};
+
+struct rte_dev_event {
+	enum rte_dev_event_type type;	/**< device event type */
+	int subsystem;			/**< subsystem id */
+	char *devname;			/**< device name */
+};
+
+typedef int (*rte_dev_event_cb_fn)(char *device_name,
+					enum rte_dev_event_type event,
+					void *cb_arg);
+
+struct rte_dev_event_callback;
+/** @internal Structure to keep track of registered callbacks */
+TAILQ_HEAD(rte_dev_event_cb_list, rte_dev_event_callback);
+
 __attribute__((format(printf, 2, 0)))
 static inline void
 rte_pmd_debug_trace(const char *func_name, const char *fmt, ...)
@@ -293,4 +317,99 @@  __attribute__((used)) = str
 }
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * It registers the callback for the specific device.
+ * Multiple callbacks can be registered at the same time.
+ *
+ * @param device_name
+ *  The device name, that is the param name of the struct rte_device,
+ *  null value means for all devices.
+ * @param cb_fn
+ *  callback address.
+ * @param cb_arg
+ *  address of parameter for callback.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+int rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
+					void *cb_arg);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * It unregisters the callback according to the specified device.
+ *
+ * @param device_name
+ *  The device name, that is the param name of the struct rte_device,
+ *  null value means for all devices.
+ * @param cb_fn
+ *  callback address.
+ * @param cb_arg
+ *  address of parameter for callback, (void *)-1 means to remove all
+ *  registered which has the same callback address.
+ *
+ * @return
+ *  - On success, return the number of callback entities removed.
+ *  - On failure, a negative value.
+ */
+int rte_dev_callback_unregister(char *device_name, rte_dev_event_cb_fn cb_fn,
+					void *cb_arg);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * @internal Executes all the user application registered callbacks for
+ * the specific device. It is for DPDK internal use only. User
+ * applications should not call it directly.
+ *
+ * @param device_name
+ *  The device name.
+ * @param event
+ *  The device event type, for example RTE_DEV_EVENT_ADD or
+ *  RTE_DEV_EVENT_REMOVE.
+ * @param cb_arg
+ *  callback parameter.
+ *
+ * @return
+ *  - On success, return zero.
+ *  - On failure, a negative value.
+ */
+int
+_rte_dev_callback_process(char *device_name, enum rte_dev_event_type event,
+				void *cb_arg);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Start the device event monitoring.
+ *
+ * @param none
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_dev_event_monitor_start(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Stop the device event monitoring .
+ *
+ * @param none
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_dev_event_monitor_stop(void);
 #endif /* _RTE_DEV_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 588c0bd..43b00e5 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -39,6 +39,7 @@  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_interrupts.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_alarm.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_dev.c
 
 # from common dir
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_lcore.c
diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
new file mode 100644
index 0000000..f243c2e
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -0,0 +1,223 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <inttypes.h>
+#include <sys/queue.h>
+#include <sys/signalfd.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <linux/netlink.h>
+#include <sys/epoll.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdbool.h>
+
+#include <rte_malloc.h>
+#include <rte_bus.h>
+#include <rte_dev.h>
+#include <rte_devargs.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+#include <rte_service.h>
+#include <rte_service_component.h>
+
+#include "eal_thread.h"
+
+bool service_exit = true;
+
+bool service_no_init = true;
+
+#define DEV_EV_MNT_SERVICE_NAME "device_event_monitor_service"
+
+static int
+dev_monitor_fd_new(void)
+{
+
+	int uevent_fd;
+
+	uevent_fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC |
+			SOCK_NONBLOCK,
+			NETLINK_KOBJECT_UEVENT);
+	if (uevent_fd < 0) {
+		RTE_LOG(ERR, EAL, "create uevent fd failed\n");
+		return -1;
+	}
+	return uevent_fd;
+}
+
+static int
+dev_monitor_enable(int netlink_fd)
+{
+	struct sockaddr_nl addr;
+	int ret;
+	int size = 64 * 1024;
+	int nonblock = 1;
+
+	memset(&addr, 0, sizeof(addr));
+	addr.nl_family = AF_NETLINK;
+	addr.nl_pid = 0;
+	addr.nl_groups = 0xffffffff;
+
+	if (bind(netlink_fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
+		RTE_LOG(ERR, EAL, "bind failed\n");
+		goto err;
+	}
+
+	setsockopt(netlink_fd, SOL_SOCKET, SO_PASSCRED, &size, sizeof(size));
+
+	ret = ioctl(netlink_fd, FIONBIO, &nonblock);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "ioctl(FIONBIO) failed\n");
+		goto err;
+	}
+	return 0;
+err:
+	close(netlink_fd);
+	return -1;
+}
+
+static int
+dev_uev_process(__rte_unused struct epoll_event *events, __rte_unused int nfds)
+{
+	/* TODO: device uevent processing */
+	return 0;
+}
+
+/**
+ * It builds/rebuilds up the epoll file descriptor with all the
+ * file descriptors being waited on. Then handles the netlink event.
+ *
+ * @param arg
+ *  pointer. (unused)
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+static int32_t dev_uev_monitoring(__rte_unused void *arg)
+{
+	int netlink_fd = -1;
+	struct epoll_event ep_kernel;
+	int fd_ep = -1;
+
+	service_exit = false;
+
+	fd_ep = epoll_create1(EPOLL_CLOEXEC);
+	if (fd_ep < 0) {
+		RTE_LOG(ERR, EAL, "error creating epoll fd: %m\n");
+		goto out;
+	}
+
+	netlink_fd = dev_monitor_fd_new();
+
+	if (dev_monitor_enable(netlink_fd) < 0) {
+		RTE_LOG(ERR, EAL, "error subscribing to kernel events\n");
+		goto out;
+	}
+
+	memset(&ep_kernel, 0, sizeof(struct epoll_event));
+	ep_kernel.events = EPOLLIN | EPOLLPRI | EPOLLRDHUP | EPOLLHUP;
+	ep_kernel.data.fd = netlink_fd;
+	if (epoll_ctl(fd_ep, EPOLL_CTL_ADD, netlink_fd,
+		&ep_kernel) < 0) {
+		RTE_LOG(ERR, EAL, "error adding fd to epoll: %m\n");
+		goto out;
+	}
+
+	while (!service_exit) {
+		int fdcount;
+		struct epoll_event ev[1];
+
+		fdcount = epoll_wait(fd_ep, ev, 1, -1);
+		if (fdcount < 0) {
+			if (errno != EINTR)
+				RTE_LOG(ERR, EAL, "error receiving uevent "
+					"message: %m\n");
+				continue;
+			}
+
+		/* epoll_wait has at least one fd ready to read */
+		if (dev_uev_process(ev, fdcount) < 0) {
+			if (errno != EINTR)
+				RTE_LOG(ERR, EAL, "error processing uevent "
+					"message: %m\n");
+		}
+	}
+	return 0;
+out:
+	if (fd_ep >= 0)
+		close(fd_ep);
+	if (netlink_fd >= 0)
+		close(netlink_fd);
+	rte_panic("uev monitoring fail\n");
+	return -1;
+}
+
+int
+rte_dev_event_monitor_start(void)
+{
+	int ret;
+	struct rte_service_spec service;
+	uint32_t id;
+	const uint32_t sid = 0;
+
+	if (!service_no_init)
+		return 0;
+
+	uint32_t slcore_1 = rte_get_next_lcore(/* start core */ -1,
+					       /* skip master */ 1,
+					       /* wrap */ 0);
+
+	ret = rte_service_lcore_add(slcore_1);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "dev event monitor lcore add fail");
+		return ret;
+	}
+
+	memset(&service, 0, sizeof(service));
+	snprintf(service.name, sizeof(service.name), DEV_EV_MNT_SERVICE_NAME);
+
+	service.socket_id = rte_socket_id();
+	service.callback = dev_uev_monitoring;
+	service.callback_userdata = NULL;
+	service.capabilities = 0;
+	ret = rte_service_component_register(&service, &id);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Failed to register service %s "
+			"err = %" PRId32,
+			service.name, ret);
+		return ret;
+	}
+	ret = rte_service_runstate_set(sid, 1);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Failed to set the runstate of "
+			"the service");
+		return ret;
+	}
+	ret = rte_service_component_runstate_set(id, 1);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Failed to set the backend runstate"
+			" of a component");
+		return ret;
+	}
+	ret = rte_service_map_lcore_set(sid, slcore_1, 1);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Failed to enable lcore 1 on "
+			"dev event monitor service");
+		return ret;
+	}
+	rte_service_lcore_start(slcore_1);
+	service_no_init = false;
+	return 0;
+}
+
+int
+rte_dev_event_monitor_stop(void)
+{
+	service_exit = true;
+	service_no_init = true;
+	return 0;
+}
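
dev_uev_process() is left as a TODO in this version. As a rough sketch of what the parsing step could look like, the code below assumes the kernel's uevent datagram layout of NUL-separated strings: a leading "add@/devpath" summary followed by KEY=value fields such as ACTION= and DEVPATH=. parse_uevent(), struct dev_event and the DEV_EVENT_* names are hypothetical stand-ins, not part of the patch, and the sketch only classifies the event without invoking any callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in that mirrors enum rte_dev_event_type from the patch. */
enum dev_event_type {
	DEV_EVENT_UNKNOWN,
	DEV_EVENT_ADD,
	DEV_EVENT_REMOVE,
};

struct dev_event {
	enum dev_event_type type;
	const char *devpath;	/* points into the caller's buffer, or NULL */
};

/*
 * Parse one uevent datagram held in buf[0..len).
 * Returns 0 if an ACTION field was found, -1 otherwise.
 */
int
parse_uevent(const char *buf, size_t len, struct dev_event *ev)
{
	size_t i = 0;
	int have_action = 0;

	assert(buf != NULL && ev != NULL);
	ev->type = DEV_EVENT_UNKNOWN;
	ev->devpath = NULL;

	while (i < len) {
		const char *field = buf + i;
		const char *end = memchr(field, '\0', len - i);

		if (end == NULL)	/* unterminated tail: stop parsing */
			break;

		if (strncmp(field, "ACTION=", 7) == 0) {
			const char *action = field + 7;

			have_action = 1;
			if (strcmp(action, "add") == 0)
				ev->type = DEV_EVENT_ADD;
			else if (strcmp(action, "remove") == 0)
				ev->type = DEV_EVENT_REMOVE;
		} else if (strncmp(field, "DEVPATH=", 8) == 0) {
			ev->devpath = field + 8;
		}
		/* Fields such as the "add@..." summary are skipped. */
		i = (size_t)(end - buf) + 1;
	}
	return have_action ? 0 : -1;
}
```

The parsed type could then be passed to _rte_dev_callback_process() together with a device name derived from the path; how that name is derived is outside the scope of this sketch.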