diff mbox series

[v2,04/22] ethdev: enable hotplug on multi-process

Message ID 20180621020059.1198-5-qi.z.zhang@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers show
Series enable hotplug on multi-process | expand

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues

Commit Message

Qi Zhang June 21, 2018, 2 a.m. UTC
We are going to introduce the solution to handle different hotplug
cases in multi-process situation, it include below scenario:

1. Attach a share device from primary
2. Detach a share device from primary
3. Attach a share device from secondary
4. Detach a share device from secondary
5. Attach a private device from secondary
6. Detach a private device from secondary
7. Detach a share device from secondary privately
8. Attach a share device from secondary privately

In primary-secondary process model, we assume device is shared by default.
that means attach or detach a device on any process will broadcast to
all other processes through mp channel then device information will be
synchronized on all processes.

Any failure during attaching process will cause inconsistent status
between processes, so proper rollback action should be considered.
Also it is not safe to detach a share device when other process still use
it, so a handshake mechanism is introduced.

This patch covers the implementation of case 1,2,5,6,7,8.
Case 3,4 will be implemented on separate patch as well as handshake
mechanism.

Scenario for Case 1, 2:

attach device
a) primary attach the new device if failed goto h).
b) primary send attach sync request to all secondary.
c) secondary receive request and attach device and send reply.
d) primary check the reply if all success go to i).
e) primary send attach rollback sync request to all secondary.
f) secondary receive the request and detach device and send reply.
g) primary receive the reply and detach device as rollback action.
h) attach fail
i) attach success

detach device
a) primary perform pre-detach check, if device is locked, goto i).
b) primary send pre-detach sync request to all secondary.
c) secondary perform pre-detach check and send reply.
d) primary check the reply if any fail goto i).
e) primary send detach sync request to all secondary
f) secondary detach the device and send reply (assume no fail)
g) primary detach the device.
h) detach success
i) detach failed

Case 5, 6:
Secondary process can attach private device which only visible to itself,
in this case no IPC is involved, primary process is not allowed to have
private device so far.

Case 7, 8:
Secondary process can also temporally to detach a share device "privately"
then attach it back later, this action also not impact other processes.

APIs changes:

rte_eth_dev_attach and rte_eth_dev_attach are extended to support
share device attach/detach in primary-secondary process model, it will
be called in case 1,2,3,4.

New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
introduced to cover case 5,6,7,8, this API can only be invoked in secondary
process.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
---

v2:
- rename rte_ethdev_mp.* to ethdev_mp.*
- add experimental tag for rte_eth_dev_attach_private and
  rte_ethdev_detach_private.
- move do_eth_dev_attach and do_eth_dev_detach to ethdev_private.h
- move rte_eth_dev_mp_init before rte_eal_mcfg_complete.
- fix meson.build.
- improve commit log.

 lib/librte_eal/common/eal_private.h |   8 ++
 lib/librte_eal/linuxapp/eal/eal.c   |   7 ++
 lib/librte_ethdev/Makefile          |   1 +
 lib/librte_ethdev/ethdev_mp.c       | 198 ++++++++++++++++++++++++++++++++++++
 lib/librte_ethdev/ethdev_mp.h       |  44 ++++++++
 lib/librte_ethdev/ethdev_private.h  |  39 +++++++
 lib/librte_ethdev/meson.build       |   1 +
 lib/librte_ethdev/rte_ethdev.c      | 184 +++++++++++++++++++++++++++++----
 lib/librte_ethdev/rte_ethdev.h      |  45 ++++++++
 lib/librte_ethdev/rte_ethdev_core.h |   5 +
 10 files changed, 515 insertions(+), 17 deletions(-)
 create mode 100644 lib/librte_ethdev/ethdev_mp.c
 create mode 100644 lib/librte_ethdev/ethdev_mp.h
 create mode 100644 lib/librte_ethdev/ethdev_private.h

Comments

Anatoly Burakov June 21, 2018, 8:36 a.m. UTC | #1
On 21-Jun-18 3:00 AM, Qi Zhang wrote:
> We are going to introduce the solution to handle different hotplug
> cases in multi-process situation, it include below scenario:
> 
> 1. Attach a share device from primary
> 2. Detach a share device from primary
> 3. Attach a share device from secondary
> 4. Detach a share device from secondary
> 5. Attach a private device from secondary
> 6. Detach a private device from secondary
> 7. Detach a share device from secondary privately
> 8. Attach a share device from secondary privately
> 
> In primary-secondary process model, we assume device is shared by default.
> that means attach or detach a device on any process will broadcast to
> all other processes through mp channel then device information will be
> synchronized on all processes.
> 
> Any failure during attaching process will cause inconsistent status
> between processes, so proper rollback action should be considered.
> Also it is not safe to detach a share device when other process still use
> it, so a handshake mechanism is introduced.
> 
> This patch covers the implementation of case 1,2,5,6,7,8.
> Case 3,4 will be implemented on separate patch as well as handshake
> mechanism.
> 
> Scenario for Case 1, 2:
> 
> attach device
> a) primary attach the new device if failed goto h).
> b) primary send attach sync request to all secondary.
> c) secondary receive request and attach device and send reply.
> d) primary check the reply if all success go to i).
> e) primary send attach rollback sync request to all secondary.
> f) secondary receive the request and detach device and send reply.
> g) primary receive the reply and detach device as rollback action.
> h) attach fail
> i) attach success
> 
> detach device
> a) primary perform pre-detach check, if device is locked, goto i).
> b) primary send pre-detach sync request to all secondary.
> c) secondary perform pre-detach check and send reply.
> d) primary check the reply if any fail goto i).
> e) primary send detach sync request to all secondary
> f) secondary detach the device and send reply (assume no fail)
> g) primary detach the device.
> h) detach success
> i) detach failed
> 
> Case 5, 6:
> Secondary process can attach private device which only visible to itself,
> in this case no IPC is involved, primary process is not allowed to have
> private device so far.
> 
> Case 7, 8:
> Secondary process can also temporally to detach a share device "privately"
> then attach it back later, this action also not impact other processes.
> 
> APIs changes:
> 
> rte_eth_dev_attach and rte_eth_dev_attach are extended to support
> share device attach/detach in primary-secondary process model, it will
> be called in case 1,2,3,4.
> 
> New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
> introduced to cover case 5,6,7,8, this API can only be invoked in secondary
> process.
> 
> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> ---

<snip>

> +	memset(&da, 0, sizeof(da));
> +
> +	if (rte_devargs_parse(&da, "%s", devargs)) {
> +		ethdev_log(ERR, "failed to parse devargs %s\n", devargs);
> +		return -EINVAL;
> +	}
> +
> +	ret = rte_eal_hotplug_add(da.bus->name, da.name, "");
> +	if (ret) {
> +		ethdev_log(ERR, "failed to hotplug bus:%s, device:%s\n",
> +			   da.bus->name, da.name);
> +		free(da.args);
> +		return ret;
> +	}
> +
> +	if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) {
> +		ethdev_log(ERR, "failed to attach to port %d, this is a pmd issue\n",
> +			   port_id);
> +		return -ENODEV;

^^^ Leaking da.args here?

> +	}
> +	free(da.args);
> +	return 0;
> +}
> +
> +static int handle_secondary_request(const struct rte_mp_msg *msg, const void *peer)
> +{
> +	RTE_SET_USED(msg);
> +	RTE_SET_USED(peer);
> +	return -ENOTSUP;
> +}
> +
> +static int handle_primary_response(const struct rte_mp_msg *msg, const void *peer)
> +{

<snip>

> +	ret = rte_mp_request_sync(&mp_req, &mp_reply, &ts);
> +	if (ret) {
> +		ethdev_log(ERR, "rte_mp_request_sync failed\n");
> +		return ret;
> +	}
> +
> +	req->result = 0;
> +	for (i = 0; i < mp_reply.nb_received; i++) {
> +		struct eth_dev_mp_req *resp =
> +			(struct eth_dev_mp_req *)mp_reply.msgs[i].param;
> +		if (resp->result) {
> +			req->result = resp->result;
> +			break;
> +		}
> +	}

Do we care if nb_sent != nb_received?

> +
> +	return 0;
> +}
> +
> +int rte_eth_dev_mp_init(void)
> +{
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		if (rte_mp_action_register(ETH_DEV_MP_ACTION_REQUEST,

<snip>

> +/**
> + * this is a synchronous wrapper for secondary process send
> + * request to primary process, this is invoked when an attach
> + * or detach request issued from primary.
> + */
> +int rte_eth_dev_request_to_primary(struct eth_dev_mp_req *req);
> +
> +/**
> + * this is a synchronous wrapper for primary process send
> + * request to secondary process, this is invoked when an attach
> + * or detach request issued from secondary process.
> + */
> +int rte_eth_dev_request_to_secondary(struct eth_dev_mp_req *req);

Nitpicking,  but the two above functions aren't used outside ethdev 
library. You can probably drop the rte_ prefix.

> +
> +/* Register mp channel callback functions of ethdev layer.*/
> +int rte_eth_dev_mp_init(void);

I don't quite understand what you're doing here. (Or rather, i 
understand the intention, but i don't understand the implementation :) )

This function is meant to be called from EAL at startup. First of all, 
why is it declared twice (once in eal_private, once in ethdev_private)?

Second of all, ethdev is a library, but this function is called from 
EAL. Which means it cannot be in a private header (nor should it be 
declared in EAL), and you cannot even call it from EAL because that 
would introduce a circular dependency between EAL and ethdev.

So, this needs to be redone the other way around - have ethdev register 
itself with EAL, and get called at some point, in a generic way (e.g. 
see how bus probe works for example). I don't know what this would look 
like - maybe some kind of generic multiprocess init?
Qi Zhang June 21, 2018, 9:14 a.m. UTC | #2
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, June 21, 2018 4:37 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org;
> Richardson, Bruce <bruce.richardson@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; Shelton, Benjamin H
> <benjamin.h.shelton@intel.com>; Vangati, Narender
> <narender.vangati@intel.com>
> Subject: Re: [PATCH v2 04/22] ethdev: enable hotplug on multi-process
> 
> On 21-Jun-18 3:00 AM, Qi Zhang wrote:
> > We are going to introduce the solution to handle different hotplug
> > cases in multi-process situation, it include below scenario:
> >
> > 1. Attach a share device from primary
> > 2. Detach a share device from primary
> > 3. Attach a share device from secondary 4. Detach a share device from
> > secondary 5. Attach a private device from secondary 6. Detach a
> > private device from secondary 7. Detach a share device from secondary
> > privately 8. Attach a share device from secondary privately
> >
> > In primary-secondary process model, we assume device is shared by default.
> > that means attach or detach a device on any process will broadcast to
> > all other processes through mp channel then device information will be
> > synchronized on all processes.
> >
> > Any failure during attaching process will cause inconsistent status
> > between processes, so proper rollback action should be considered.
> > Also it is not safe to detach a share device when other process still
> > use it, so a handshake mechanism is introduced.
> >
> > This patch covers the implementation of case 1,2,5,6,7,8.
> > Case 3,4 will be implemented on separate patch as well as handshake
> > mechanism.
> >
> > Scenario for Case 1, 2:
> >
> > attach device
> > a) primary attach the new device if failed goto h).
> > b) primary send attach sync request to all secondary.
> > c) secondary receive request and attach device and send reply.
> > d) primary check the reply if all success go to i).
> > e) primary send attach rollback sync request to all secondary.
> > f) secondary receive the request and detach device and send reply.
> > g) primary receive the reply and detach device as rollback action.
> > h) attach fail
> > i) attach success
> >
> > detach device
> > a) primary perform pre-detach check, if device is locked, goto i).
> > b) primary send pre-detach sync request to all secondary.
> > c) secondary perform pre-detach check and send reply.
> > d) primary check the reply if any fail goto i).
> > e) primary send detach sync request to all secondary
> > f) secondary detach the device and send reply (assume no fail)
> > g) primary detach the device.
> > h) detach success
> > i) detach failed
> >
> > Case 5, 6:
> > Secondary process can attach private device which only visible to
> > itself, in this case no IPC is involved, primary process is not
> > allowed to have private device so far.
> >
> > Case 7, 8:
> > Secondary process can also temporally to detach a share device "privately"
> > then attach it back later, this action also not impact other processes.
> >
> > APIs changes:
> >
> > rte_eth_dev_attach and rte_eth_dev_attach are extended to support
> > share device attach/detach in primary-secondary process model, it will
> > be called in case 1,2,3,4.
> >
> > New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
> > introduced to cover case 5,6,7,8, this API can only be invoked in
> > secondary process.
> >
> > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > ---
> 
> <snip>
> 
> > +	memset(&da, 0, sizeof(da));
> > +
> > +	if (rte_devargs_parse(&da, "%s", devargs)) {
> > +		ethdev_log(ERR, "failed to parse devargs %s\n", devargs);
> > +		return -EINVAL;
> > +	}
> > +
> > +	ret = rte_eal_hotplug_add(da.bus->name, da.name, "");
> > +	if (ret) {
> > +		ethdev_log(ERR, "failed to hotplug bus:%s, device:%s\n",
> > +			   da.bus->name, da.name);
> > +		free(da.args);
> > +		return ret;
> > +	}
> > +
> > +	if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) {
> > +		ethdev_log(ERR, "failed to attach to port %d, this is a pmd issue\n",
> > +			   port_id);
> > +		return -ENODEV;
> 
> ^^^ Leaking da.args here?
> 
> > +	}
> > +	free(da.args);
> > +	return 0;
> > +}
> > +
> > +static int handle_secondary_request(const struct rte_mp_msg *msg,
> > +const void *peer) {
> > +	RTE_SET_USED(msg);
> > +	RTE_SET_USED(peer);
> > +	return -ENOTSUP;
> > +}
> > +
> > +static int handle_primary_response(const struct rte_mp_msg *msg,
> > +const void *peer) {
> 
> <snip>
> 
> > +	ret = rte_mp_request_sync(&mp_req, &mp_reply, &ts);
> > +	if (ret) {
> > +		ethdev_log(ERR, "rte_mp_request_sync failed\n");
> > +		return ret;
> > +	}
> > +
> > +	req->result = 0;
> > +	for (i = 0; i < mp_reply.nb_received; i++) {
> > +		struct eth_dev_mp_req *resp =
> > +			(struct eth_dev_mp_req *)mp_reply.msgs[i].param;
> > +		if (resp->result) {
> > +			req->result = resp->result;
> > +			break;
> > +		}
> > +	}
> 
> Do we care if nb_sent != nb_received?
> 
> > +
> > +	return 0;
> > +}
> > +
> > +int rte_eth_dev_mp_init(void)
> > +{
> > +
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +		if (rte_mp_action_register(ETH_DEV_MP_ACTION_REQUEST,
> 
> <snip>
> 
> > +/**
> > + * this is a synchronous wrapper for secondary process send
> > + * request to primary process, this is invoked when an attach
> > + * or detach request issued from primary.
> > + */
> > +int rte_eth_dev_request_to_primary(struct eth_dev_mp_req *req);
> > +
> > +/**
> > + * this is a synchronous wrapper for primary process send
> > + * request to secondary process, this is invoked when an attach
> > + * or detach request issued from secondary process.
> > + */
> > +int rte_eth_dev_request_to_secondary(struct eth_dev_mp_req *req);
> 
> Nitpicking,  but the two above functions aren't used outside ethdev library.
> You can probably drop the rte_ prefix.
> 
> > +
> > +/* Register mp channel callback functions of ethdev layer.*/ int
> > +rte_eth_dev_mp_init(void);
> 
> I don't quite understand what you're doing here. (Or rather, i understand the
> intention, but i don't understand the implementation :) )
> 
> This function is meant to be called from EAL at startup. First of all, why is it
> declared twice (once in eal_private, once in ethdev_private)?

Ah, I forgot this, this is something in mess.

> 
> Second of all, ethdev is a library, but this function is called from EAL. Which
> means it cannot be in a private header (nor should it be declared in EAL), and
> you cannot even call it from EAL because that would introduce a circular
> dependency between EAL and ethdev.
> 
> So, this needs to be redone the other way around - have ethdev register itself
> with EAL, and get called at some point, in a generic way (e.g.
> see how bus probe works for example). I don't know what this would look like
> - maybe some kind of generic multiprocess init?

Yes, properly like this. I will re-work it.

Thanks
Qi

> 
> --
> Thanks,
> Anatoly
Andrew Rybchenko June 22, 2018, 1:54 p.m. UTC | #3
On 06/21/2018 05:00 AM, Qi Zhang wrote:
> We are going to introduce the solution to handle different hotplug
> cases in multi-process situation, it include below scenario:
>
> 1. Attach a share device from primary
> 2. Detach a share device from primary
> 3. Attach a share device from secondary
> 4. Detach a share device from secondary
> 5. Attach a private device from secondary
> 6. Detach a private device from secondary
> 7. Detach a share device from secondary privately
> 8. Attach a share device from secondary privately
>
> In primary-secondary process model, we assume device is shared by default.
> that means attach or detach a device on any process will broadcast to
> all other processes through mp channel then device information will be
> synchronized on all processes.
>
> Any failure during attaching process will cause inconsistent status
> between processes, so proper rollback action should be considered.
> Also it is not safe to detach a share device when other process still use
> it, so a handshake mechanism is introduced.
>
> This patch covers the implementation of case 1,2,5,6,7,8.
> Case 3,4 will be implemented on separate patch as well as handshake
> mechanism.
>
> Scenario for Case 1, 2:
>
> attach device
> a) primary attach the new device if failed goto h).
> b) primary send attach sync request to all secondary.
> c) secondary receive request and attach device and send reply.
> d) primary check the reply if all success go to i).
> e) primary send attach rollback sync request to all secondary.
> f) secondary receive the request and detach device and send reply.
> g) primary receive the reply and detach device as rollback action.
> h) attach fail
> i) attach success
>
> detach device
> a) primary perform pre-detach check, if device is locked, goto i).
> b) primary send pre-detach sync request to all secondary.
> c) secondary perform pre-detach check and send reply.
> d) primary check the reply if any fail goto i).
> e) primary send detach sync request to all secondary
> f) secondary detach the device and send reply (assume no fail)
> g) primary detach the device.
> h) detach success
> i) detach failed
>
> Case 5, 6:
> Secondary process can attach private device which only visible to itself,
> in this case no IPC is involved, primary process is not allowed to have
> private device so far.
>
> Case 7, 8:
> Secondary process can also temporally to detach a share device "privately"
> then attach it back later, this action also not impact other processes.
>
> APIs changes:
>
> rte_eth_dev_attach and rte_eth_dev_attach are extended to support
> share device attach/detach in primary-secondary process model, it will
> be called in case 1,2,3,4.
>
> New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
> introduced to cover case 5,6,7,8, this API can only be invoked in secondary
> process.
>
> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> ---
>
> v2:
> - rename rte_ethdev_mp.* to ethdev_mp.*
> - add experimental tag for rte_eth_dev_attach_private and
>    rte_ethdev_detach_private.
> - move do_eth_dev_attach and do_eth_dev_detach to ethdev_private.h
> - move rte_eth_dev_mp_init before rte_eal_mcfg_complete.
> - fix meson.build.
> - improve commit log.
>
>   lib/librte_eal/common/eal_private.h |   8 ++
>   lib/librte_eal/linuxapp/eal/eal.c   |   7 ++
>   lib/librte_ethdev/Makefile          |   1 +
>   lib/librte_ethdev/ethdev_mp.c       | 198 ++++++++++++++++++++++++++++++++++++
>   lib/librte_ethdev/ethdev_mp.h       |  44 ++++++++
>   lib/librte_ethdev/ethdev_private.h  |  39 +++++++
>   lib/librte_ethdev/meson.build       |   1 +
>   lib/librte_ethdev/rte_ethdev.c      | 184 +++++++++++++++++++++++++++++----
>   lib/librte_ethdev/rte_ethdev.h      |  45 ++++++++
>   lib/librte_ethdev/rte_ethdev_core.h |   5 +
>   10 files changed, 515 insertions(+), 17 deletions(-)
>   create mode 100644 lib/librte_ethdev/ethdev_mp.c
>   create mode 100644 lib/librte_ethdev/ethdev_mp.h
>   create mode 100644 lib/librte_ethdev/ethdev_private.h
>
> diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
> index bdadc4d50..92fa59bed 100644
> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> @@ -258,4 +258,12 @@ int rte_mp_channel_init(void);
>    */
>   void dev_callback_process(char *device_name, enum rte_dev_event_type event);
>   
> +/**
> + * Register mp channel callback functions of ethdev layer.
> + *
> + * @return
> + *  0 on success.
> + *  (<0) on failure.
> + */
> +int rte_eth_dev_mp_init(void);

It looks like it makes cross-dependency between EAL and ethdev.
As far as I can see EAL does not have references to rte_eth_dev
functions yet. It looks really suspicious. The function is declared in
EAL, but implemented in ethdev.
Moreover, it is declared once again in ethdev_mp.h.

<...>

> diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
> index 33d12b3a2..2cb6de745 100644
> --- a/lib/librte_ethdev/rte_ethdev_core.h
> +++ b/lib/librte_ethdev/rte_ethdev_core.h
> @@ -622,4 +622,9 @@ struct rte_eth_dev_data {
>    */
>   extern struct rte_eth_dev rte_eth_devices[];
>   
> +extern int ethdev_logtype;
> +#define ethdev_log(level, fmt, ...) \
> +	rte_log(RTE_LOG_ ## level, ethdev_logtype, fmt "\n", ## __VA_ARGS__)

It looks like it clashes with ethdev logging changes submitted by Ferruh.

Andrew.
diff mbox series

Patch

diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index bdadc4d50..92fa59bed 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -258,4 +258,12 @@  int rte_mp_channel_init(void);
  */
 void dev_callback_process(char *device_name, enum rte_dev_event_type event);
 
+/**
+ * Register mp channel callback functions of ethdev layer.
+ *
+ * @return
+ *  0 on success.
+ *  (<0) on failure.
+ */
+int rte_eth_dev_mp_init(void);
 #endif /* _EAL_PRIVATE_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..b7788c42d 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -1024,6 +1024,13 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* Initialize mp channel for ethdev layer */
+	if (rte_eth_dev_mp_init()) {
+		rte_eal_init_alert("rte_eth_dev_mp_init() failed\n");
+		rte_errno = ENOEXEC;
+		return -1;
+	}
+
 #ifdef VFIO_PRESENT
 	/* Register mp action after probe() so that we got enough info */
 	if (rte_vfio_is_enabled("vfio") && vfio_mp_sync_setup() < 0)
diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
index c2f2f7d82..d0a059b83 100644
--- a/lib/librte_ethdev/Makefile
+++ b/lib/librte_ethdev/Makefile
@@ -19,6 +19,7 @@  EXPORT_MAP := rte_ethdev_version.map
 LIBABIVER := 9
 
 SRCS-y += rte_ethdev.c
+SRCS-y += ethdev_mp.c
 SRCS-y += rte_flow.c
 SRCS-y += rte_tm.c
 SRCS-y += rte_mtr.c
diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c
new file mode 100644
index 000000000..73dd36485
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_mp.c
@@ -0,0 +1,198 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2018 Intel Corporation
+ */
+
+#include <rte_string_fns.h>
+#include "rte_ethdev_driver.h"
+#include "ethdev_mp.h"
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+static int detach_on_secondary(uint16_t port_id)
+{
+	struct rte_device *dev;
+	struct rte_bus *bus;
+	int ret = 0;
+
+	if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) {
+		ethdev_log(ERR, "detach on secondary: invalid port %d\n",
+			   port_id);
+		return -ENODEV;
+	}
+
+	dev = rte_eth_devices[port_id].device;
+	if (dev == NULL)
+		return -EINVAL;
+
+	bus = rte_bus_find_by_device(dev);
+	if (bus == NULL)
+		return -ENOENT;
+
+	ret = rte_eal_hotplug_remove(bus->name, dev->name);
+	if (ret) {
+		ethdev_log(ERR, "failed to hot unplug bus: %s, device:%s\n",
+			   bus->name, dev->name);
+		return ret;
+	}
+
+	rte_eth_dev_release_port_private(&rte_eth_devices[port_id]);
+	return ret;
+}
+
+static int attach_on_secondary(const char *devargs, uint16_t port_id)
+{
+	struct rte_devargs da;
+	int ret;
+
+	if (rte_eth_devices[port_id].state != RTE_ETH_DEV_UNUSED) {
+		ethdev_log(ERR, "port %d already in used, failed to attach\n",
+			   port_id);
+		return -EINVAL;
+	}
+
+	memset(&da, 0, sizeof(da));
+
+	if (rte_devargs_parse(&da, "%s", devargs)) {
+		ethdev_log(ERR, "failed to parse devargs %s\n", devargs);
+		return -EINVAL;
+	}
+
+	ret = rte_eal_hotplug_add(da.bus->name, da.name, "");
+	if (ret) {
+		ethdev_log(ERR, "failed to hotplug bus:%s, device:%s\n",
+			   da.bus->name, da.name);
+		free(da.args);
+		return ret;
+	}
+
+	if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) {
+		ethdev_log(ERR, "failed to attach to port %d, this is a pmd issue\n",
+			   port_id);
+		return -ENODEV;
+	}
+	free(da.args);
+	return 0;
+}
+
+static int handle_secondary_request(const struct rte_mp_msg *msg, const void *peer)
+{
+	RTE_SET_USED(msg);
+	RTE_SET_USED(peer);
+	return -ENOTSUP;
+}
+
+static int handle_primary_response(const struct rte_mp_msg *msg, const void *peer)
+{
+	RTE_SET_USED(msg);
+	RTE_SET_USED(peer);
+	return -ENOTSUP;
+}
+
+static int handle_primary_request(const struct rte_mp_msg *msg, const void *peer)
+{
+	const struct eth_dev_mp_req *req =
+		(const struct eth_dev_mp_req *)msg->param;
+	struct rte_mp_msg mp_resp;
+	struct eth_dev_mp_req *resp =
+		(struct eth_dev_mp_req *)mp_resp.param;
+	int ret = 0;
+
+	memset(&mp_resp, 0, sizeof(mp_resp));
+
+	switch (req->t) {
+	case REQ_TYPE_ATTACH:
+		ret = attach_on_secondary(req->devargs, req->port_id);
+		break;
+	case REQ_TYPE_PRE_DETACH:
+		ret = 0;
+		break;
+	case REQ_TYPE_DETACH:
+	case REQ_TYPE_ATTACH_ROLLBACK:
+		ret = detach_on_secondary(req->port_id);
+		break;
+	default:
+		ret = -EINVAL;
+	}
+
+	strlcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST, sizeof(mp_resp.name));
+	mp_resp.len_param = sizeof(*req);
+	memcpy(resp, req, sizeof(*resp));
+	resp->result = ret;
+	if (rte_mp_reply(&mp_resp, peer) < 0) {
+		ethdev_log(ERR, "failed to send reply to primary request\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+int rte_eth_dev_request_to_primary(struct eth_dev_mp_req *req)
+{
+	RTE_SET_USED(req);
+	return -ENOTSUP;
+}
+
+/**
+ * Request from primary to secondary.
+ *
+ * Be invoked when try to attach or detach a share device
+ * from primary process.
+ */
+int rte_eth_dev_request_to_secondary(struct eth_dev_mp_req *req)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_reply mp_reply;
+	struct timespec ts = {.tv_sec = MP_TIMEOUT_S, .tv_nsec = 0};
+	int ret;
+	int i;
+
+	memset(&mp_req, 0, sizeof(mp_req));
+	memcpy(mp_req.param, req, sizeof(*req));
+	mp_req.len_param = sizeof(*req);
+	strlcpy(mp_req.name, ETH_DEV_MP_ACTION_REQUEST, sizeof(mp_req.name));
+
+	ret = rte_mp_request_sync(&mp_req, &mp_reply, &ts);
+	if (ret) {
+		ethdev_log(ERR, "rte_mp_request_sync failed\n");
+		return ret;
+	}
+
+	req->result = 0;
+	for (i = 0; i < mp_reply.nb_received; i++) {
+		struct eth_dev_mp_req *resp =
+			(struct eth_dev_mp_req *)mp_reply.msgs[i].param;
+		if (resp->result) {
+			req->result = resp->result;
+			break;
+		}
+	}
+
+	return 0;
+}
+
+int rte_eth_dev_mp_init(void)
+{
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(ETH_DEV_MP_ACTION_REQUEST,
+					   handle_secondary_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				ETH_DEV_MP_ACTION_REQUEST);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(ETH_DEV_MP_ACTION_RESPONSE,
+					   handle_primary_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				ETH_DEV_MP_ACTION_RESPONSE);
+			return -1;
+		}
+		if (rte_mp_action_register(ETH_DEV_MP_ACTION_REQUEST,
+					   handle_primary_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				ETH_DEV_MP_ACTION_REQUEST);
+		}
+	}
+
+	return 0;
+}
diff --git a/lib/librte_ethdev/ethdev_mp.h b/lib/librte_ethdev/ethdev_mp.h
new file mode 100644
index 000000000..c3e55dfec
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_mp.h
@@ -0,0 +1,44 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _RTE_ETHDEV_MP_H_
+#define _RTE_ETHDEV_MP_H_
+
+#define MAX_DEV_ARGS_LEN 0x80
+
+#define ETH_DEV_MP_ACTION_REQUEST	"eth_dev_mp_request"
+#define ETH_DEV_MP_ACTION_RESPONSE	"eth_dev_mp_response"
+
+enum eth_dev_req_type {
+	REQ_TYPE_ATTACH,
+	REQ_TYPE_PRE_DETACH,
+	REQ_TYPE_DETACH,
+	REQ_TYPE_ATTACH_ROLLBACK,
+};
+
+struct eth_dev_mp_req {
+	enum eth_dev_req_type t;
+	char devargs[MAX_DEV_ARGS_LEN];
+	uint16_t port_id;
+	int result;
+};
+
+/**
+ * this is a synchronous wrapper for secondary process send
+ * request to primary process, this is invoked when an attach
+ * or detach request issued from primary.
+ */
+int rte_eth_dev_request_to_primary(struct eth_dev_mp_req *req);
+
+/**
+ * this is a synchronous wrapper for primary process send
+ * request to secondary process, this is invoked when an attach
+ * or detach request issued from secondary process.
+ */
+int rte_eth_dev_request_to_secondary(struct eth_dev_mp_req *req);
+
+/* Register mp channel callback functions of ethdev layer.*/
+int rte_eth_dev_mp_init(void);
+
+#endif
diff --git a/lib/librte_ethdev/ethdev_private.h b/lib/librte_ethdev/ethdev_private.h
new file mode 100644
index 000000000..981e7de8a
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_private.h
@@ -0,0 +1,39 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2018 Intel Corporation
+ */
+
+#ifndef _ETHDEV_PRIVATE_H_
+#define _ETHDEV_PRIVATE_H_
+
+/**
+ * Attach a new Ethernet device in current process.
+ *
+ * @param devargs
+ *  A pointer to a strings array describing the new device
+ *  to be attached. The strings should be a pci address like
+ *  '0000:01:00.0' or virtual device name like 'net_pcap0'.
+ *
+ * @param port_id
+ *  A pointer to a port identifier actually attached.
+ *
+ * @return
+ *  0 on success and port_id is filled, negative on error
+ */
+int do_eth_dev_attach(const char *devargs, uint16_t *port_id);
+
+/**
+ * Detach a Ethernet device in current process.
+ *
+ * @param port_id
+ *   The port identifier of the device to detach.
+ *
+ * @param devname
+ *   A pointer to a buffer that will be filled with the device name.
+ *   This buffer must be at least RTE_DEV_NAME_MAX_LEN long.
+ *
+ * @return
+ *  0 on success and devname is filled, negative on error
+ */
+int do_eth_dev_detach(uint16_t port_id);
+
+#endif
diff --git a/lib/librte_ethdev/meson.build b/lib/librte_ethdev/meson.build
index aed5d2265..b60256855 100644
--- a/lib/librte_ethdev/meson.build
+++ b/lib/librte_ethdev/meson.build
@@ -5,6 +5,7 @@  name = 'ethdev'
 version = 9
 allow_experimental_apis = true
 sources = files('ethdev_profile.c',
+	'ethdev_mp.c'
 	'rte_ethdev.c',
 	'rte_flow.c',
 	'rte_mtr.c',
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 205b2ee33..77f53a634 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -41,11 +41,13 @@ 
 #include "rte_ethdev.h"
 #include "rte_ethdev_driver.h"
 #include "ethdev_profile.h"
+#include "ethdev_mp.h"
+#include "ethdev_private.h"
 
-static int ethdev_logtype;
+int ethdev_logtype;
 
-#define ethdev_log(level, fmt, ...) \
-	rte_log(RTE_LOG_ ## level, ethdev_logtype, fmt "\n", ## __VA_ARGS__)
+#define RTE_ETH_MP_ACTION_REQUEST	"rte_eth_mp_request"
+#define RTE_ETH_MP_ACTION_RESPONSE	"rte_eth_mp_response"
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
@@ -656,9 +658,8 @@  eth_err(uint16_t port_id, int ret)
 	return ret;
 }
 
-/* attach the new device, then store port_id of the device */
 int
-rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
+do_eth_dev_attach(const char *devargs, uint16_t *port_id)
 {
 	int current = rte_eth_dev_count_total();
 	struct rte_devargs da;
@@ -703,14 +704,104 @@  rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
 	return ret;
 }
 
-/* detach the device, then store the name of the device */
 int
-rte_eth_dev_detach(uint16_t port_id, char *name __rte_unused)
+do_eth_dev_detach(uint16_t port_id)
 {
 	struct rte_device *dev;
 	struct rte_bus *bus;
+	int ret = 0;
+
+	dev = rte_eth_devices[port_id].device;
+	if (dev == NULL)
+		return -EINVAL;
+
+	bus = rte_bus_find_by_device(dev);
+	if (bus == NULL)
+		return -ENOENT;
+
+	ret = rte_eal_hotplug_remove(bus->name, dev->name);
+	if (ret < 0)
+		return ret;
+
+	rte_eth_dev_release_port(&rte_eth_devices[port_id]);
+	return ret;
+
+}
+
+/* attach the new device, then store port_id of the device */
+int
+rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
+{
+	struct eth_dev_mp_req req;
+	int ret;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+
+		/**
+		 * If secondary process, we just send request to primary
+		 * to start the process.
+		 */
+		req.t = REQ_TYPE_ATTACH;
+		strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
+
+		ret = rte_eth_dev_request_to_primary(&req);
+		if (ret) {
+			ethdev_log(ERR, "Failed to send device attach request to primary\n");
+			return ret;
+		}
+
+		*port_id = req.port_id;
+		return req.result;
+	}
+
+	ret = do_eth_dev_attach(devargs, port_id);
+	if (ret)
+		return ret;
+
+	/* send attach request to seoncary */
+	req.t = REQ_TYPE_ATTACH;
+	strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
+	req.port_id = *port_id;
+	ret = rte_eth_dev_request_to_secondary(&req);
+	if (ret) {
+		ethdev_log(ERR, "Failed to send device attach request to secondary\n");
+		goto rollback;
+	}
+
+	if (req.result)
+		goto rollback;
+
+	return 0;
+
+rollback:
+	/* send rollback request to secondary since some one fail to attach */
+	req.t = REQ_TYPE_ATTACH_ROLLBACK;
+	req.port_id = *port_id;
+	rte_eth_dev_request_to_secondary(&req);
+
+	do_eth_dev_detach(*port_id);
+
+	return -ENODEV;
+}
+
+/* attach the new device, then store port_id of the device */
+int
+rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id)
+{
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return -ENOTSUP;
+
+	return do_eth_dev_attach(devargs, port_id);
+}
+
+/* detach the device, then store the name of the device */
+int
+rte_eth_dev_detach(uint16_t port_id, char *name __rte_unused)
+{
+	struct eth_dev_mp_req req = {0};
+	int ret;
 	uint32_t dev_flags;
-	int ret = -1;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -721,22 +812,81 @@  rte_eth_dev_detach(uint16_t port_id, char *name __rte_unused)
 		return -ENOTSUP;
 	}
 
-	dev = rte_eth_devices[port_id].device;
-	if (dev == NULL)
-		return -EINVAL;
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		req.t = REQ_TYPE_DETACH;
+		req.port_id = port_id;
 
-	bus = rte_bus_find_by_device(dev);
-	if (bus == NULL)
-		return -ENOENT;
+		/**
+		 * If secondary process, we just send request to primary
+		 * to start the process.
+		 */
+		ret = rte_eth_dev_request_to_primary(&req);
+		if (ret) {
+			ethdev_log(ERR, "Failed to send device detach request to primary\n");
+			return ret;
+		}
 
-	ret = rte_eal_hotplug_remove(bus->name, dev->name);
-	if (ret < 0)
+		return req.result;
+	}
+
+	/* check pre_detach */
+	req.t = REQ_TYPE_PRE_DETACH;
+	req.port_id = port_id;
+	ret = rte_eth_dev_request_to_secondary(&req);
+	if (ret) {
+		ethdev_log(ERR, "Failed to send device pre-detach request to secondary\n");
+		return ret;
+	}
+
+	if (req.result) {
+		ethdev_log(ERR, "Device is busy on secondary, can't be detached\n");
+		return req.result;
+	}
+
+	/* detach on seconary first */
+	req.t = REQ_TYPE_DETACH;
+	ret = rte_eth_dev_request_to_secondary(&req);
+	if (ret) {
+		ethdev_log(ERR, "Failed to send device detach request to secondary\n");
+		return ret;
+	}
+
+	if (req.result)
+		/**
+		 * this should rarely happen, something wrong in secondary
+		 * process, will not block primary detach.
+		 */
+		ethdev_log(ERR, "Failed to detach device on secondary process\n");
+
+	/* detach on primary */
+	ret =  do_eth_dev_detach(port_id);
+	if (ret)
 		return ret;
 
-	rte_eth_dev_release_port(&rte_eth_devices[port_id]);
 	return 0;
 }
 
+/* detach the device, then store the name of the device */
+int
+rte_eth_dev_detach_private(uint16_t port_id, char *name __rte_unused)
+{
+	uint32_t dev_flags;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return -ENOTSUP;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
+
+	dev_flags = rte_eth_devices[port_id].data->dev_flags;
+	if (dev_flags & RTE_ETH_DEV_BONDED_SLAVE) {
+		ethdev_log(ERR,
+			"Port %" PRIu16 " is bonded, cannot detach", port_id);
+		return -ENOTSUP;
+	}
+
+	return do_eth_dev_detach(port_id);
+}
+
 static int
 rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
 {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 36e3984ea..813806e3c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1462,6 +1462,9 @@  uint16_t __rte_experimental rte_eth_dev_count_total(void);
 
 /**
  * Attach a new Ethernet device specified by arguments.
+ * In multi-process mode, it will sync with other process
+ * to make sure all processes attach the device, any
+ * failure on other process will rollback the action.
  *
  * @param devargs
  *  A pointer to a strings array describing the new device
@@ -1475,9 +1478,31 @@  uint16_t __rte_experimental rte_eth_dev_count_total(void);
 int rte_eth_dev_attach(const char *devargs, uint16_t *port_id);
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Attach a private Ethernet device specified by arguments.
+ * A private device is invisible to other process.
+ * Can only be invoked in secondary process.
+ *
+ * @param devargs
+ *  A pointer to a strings array describing the new device
+ *  to be attached. The strings should be a pci address like
+ *  '0000:01:00.0' or virtual device name like 'net_pcap0'.
+ * @param port_id
+ *  A pointer to a port identifier actually attached.
+ * @return
+ *  0 on success and port_id is filled, negative on error
+ */
+int __rte_experimental
+rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id);
+
+/**
  * Detach a Ethernet device specified by port identifier.
  * This function must be called when the device is in the
  * closed state.
+ * In multi-process mode, it will sync with other process
+ * to detach the device.
  *
  * @param port_id
  *   The port identifier of the device to detach.
@@ -1490,6 +1515,26 @@  int rte_eth_dev_attach(const char *devargs, uint16_t *port_id);
 int rte_eth_dev_detach(uint16_t port_id, char *devname);
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Detach a private Ethernet device specified by port identifier
+ * This function must be called when the device is in the
+ * closed state.
+ * Can only be invoked in secondary process.
+ *
+ * @param port_id
+ *   The port identifier of the device to detach.
+ * @param devname
+ *   A pointer to a buffer that will be filled with the device name.
+ *   This buffer must be at least RTE_DEV_NAME_MAX_LEN long.
+ * @return
+ *  0 on success and devname is filled, negative on error
+ */
+int __rte_experimental
+rte_eth_dev_detach_private(uint16_t port_id, char *devname);
+
+/**
  * Convert a numerical speed in Mbps to a bitmap flag that can be used in
  * the bitmap link_speeds of the struct rte_eth_conf
  *
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 33d12b3a2..2cb6de745 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -622,4 +622,9 @@  struct rte_eth_dev_data {
  */
 extern struct rte_eth_dev rte_eth_devices[];
 
+extern int ethdev_logtype;
+#define ethdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, ethdev_logtype, fmt "\n", ## __VA_ARGS__)
+
+
 #endif /* _RTE_ETHDEV_CORE_H_ */