[dpdk-dev,v10,13/14] eal/pci: Add rte_eal_dev_attach/detach() functions

Message ID 1424414390-18509-14-git-send-email-mukawa@igel.co.jp (mailing list archive)
State Superseded, archived
Headers

Commit Message

Tetsuya Mukawa Feb. 20, 2015, 6:39 a.m. UTC
  These functions are used for attaching or detaching a port.
When rte_eal_dev_attach() is called, the function tries to realize the
device name as pci address. If this is done successfully,
rte_eal_dev_attach() will attach physical device port. If not, attaches
virtual devive port.
When rte_eal_dev_detach() is called, the function gets the device type
of this port to know whether the port is come from physical or virtual.
And then specific detaching function will be called.

v10:
- Add comments.
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- Fix comments.
- Use strcmp() instead of strncmp().
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
- Change definition of rte_dev_uninit_t.
  (Thanks to Thomas Monjalon and Maxime Leroy)
v8:
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Fix typo of warning messages.
  (Thanks to Qiu, Michael)
v5:
- Change function names like below.
  rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
  rte_eal_dev_invoke() to rte_eal_vdev_invoke().
- Add code to handle a return value of rte_eal_devargs_remove().
- Fix pci address format in rte_eal_dev_detach().
v4:
- Fix comment.
- Add error checking.
- Fix indent of 'if' statement.
- Change function name.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
---
 lib/librte_eal/common/eal_common_dev.c          | 315 ++++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h             |  11 +
 lib/librte_eal/common/include/rte_dev.h         |  33 +++
 lib/librte_eal/linuxapp/eal/Makefile            |   1 +
 lib/librte_eal/linuxapp/eal/eal_pci.c           |   6 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 6 files changed, 365 insertions(+), 3 deletions(-)
  

Comments

Maxime Leroy Feb. 20, 2015, 10:14 a.m. UTC | #1
Hi Tetsuya,

On Fri, Feb 20, 2015 at 7:39 AM, Tetsuya Mukawa <mukawa@igel.co.jp> wrote:
> These functions are used for attaching or detaching a port.
[..]
> +
> +static void
> +get_vdev_name(char *vdevargs)
> +{
> +       char *sep;
> +
> +       if (vdevargs == NULL)
> +               return;
> +
> +       /* set the first ',' to '\0' to split name and arguments */
> +       sep = strchr(vdevargs, ',');
> +       if (sep != NULL)
> +               sep[0] = '\0';
> +}
> +
> +/* attach the new virtual device, then store port_id of the device */
> +static int
> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
> +{
> +       char *args;
> +       uint8_t new_port_id;
> +       struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
> +
> +       if ((vdevargs == NULL) || (port_id == NULL))
> +               goto err0;
> +
> +       args = strdup(vdevargs);
> +       if (args == NULL)
> +               goto err0;
> +
> +       /* save current port status */
> +       if (rte_eth_dev_save(devs, sizeof(devs)))
> +               goto err1;
> +       /* add the vdevargs to devargs_list */
> +       if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
> +               goto err1;

You don't need to add devargs to the devargs_list.

The devargs_list is only needed at the init to store the arguments
when we parse the command line. Then, at initialization,
rte_eal_dev_init  creates the devices from this list .

Instead of adding the devargs in the list, you could have the following code:

static int
rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
{
     ...
    /* save current port status */
    if (rte_eth_dev_save(devs, sizeof(devs)))
       goto err;

    devargs = rte_eal_parse_devargs_str(RTE_
DEVTYPE_VIRTUAL, vdevargs);
    if (devargs == NULL)
         goto err;

    if (rte_eal_vdev_devinit(devargs) < 0)
         goto err;

   if (rte_eth_dev_get_changed_port(devs, &new_port_id))
         goto err;

   ...
}

What do you think ?

> +       /* parse vdevargs, then retrieve device name */
> +       get_vdev_name(args);
> +       /* walk around dev_driver_list to find the driver of the device,
> +        * then invoke probe function o the driver.
> +        * TODO:
> +        * rte_eal_vdev_find_and_init() should return port_id,
> +        * And rte_eth_dev_save() and rte_eth_dev_get_changed_port()
> +        * should be removed. */
> +       if (rte_eal_vdev_find_and_init(args))
> +               goto err2;
> +       /* get port_id enabled by above procedures */
> +       if (rte_eth_dev_get_changed_port(devs, &new_port_id))
> +               goto err2;
> +
> +       free(args);
> +       *port_id = new_port_id;
> +       return 0;
> +err2:
> +       rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, args);
> +err1:
> +       free(args);
> +err0:
> +       RTE_LOG(ERR, EAL, "Driver, cannot attach the device\n");
> +       return -1;
> +}
> +
> +/* detach the new virtual device, then store the name of the device */
> +static int
> +rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
> +{
> +       char name[RTE_ETH_NAME_MAX_LEN];
> +
> +       if (vdevname == NULL)
> +               goto err;
> +
> +       /* check whether the driver supports detach feature, or not */
> +       if (rte_eth_dev_is_detachable(port_id))
> +               goto err;
> +
> +       /* get device name by port id */
> +       if (rte_eth_dev_get_name_by_port(port_id, name))
> +               goto err;
> +       /* walk around dev_driver_list to find the driver of the device,
> +        * then invoke close function o the driver */
> +       if (rte_eal_vdev_find_and_uninit(name))
> +               goto err;
> +       /* remove the vdevname from devargs_list */
> +       if (rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, name))
> +               goto err;

The same here, you don't need to remove devargs from the devargs_list.

Instead of removing the devargs in the list, you could have the following code::

static int
rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
{
         ...
         /* check whether the driver supports detach feature, or not */
         if (rte_eth_dev_is_detachable(port_id))
                goto err;

         /* get device name by port id */
         if (rte_eth_dev_get_name_by_port(port_id, name))
                goto err;

        if (rte_eal_vdev_uninit(name))
                 goto err;
       ...
}

What do you think ?


Regards,

Maxime
  
Tetsuya Mukawa Feb. 20, 2015, 10:32 a.m. UTC | #2
On 2015/02/20 19:14, Maxime Leroy wrote:
> Hi Tetsuya,
>
> On Fri, Feb 20, 2015 at 7:39 AM, Tetsuya Mukawa <mukawa@igel.co.jp> wrote:
>> These functions are used for attaching or detaching a port.
> [..]
>> +
>> +static void
>> +get_vdev_name(char *vdevargs)
>> +{
>> +       char *sep;
>> +
>> +       if (vdevargs == NULL)
>> +               return;
>> +
>> +       /* set the first ',' to '\0' to split name and arguments */
>> +       sep = strchr(vdevargs, ',');
>> +       if (sep != NULL)
>> +               sep[0] = '\0';
>> +}
>> +
>> +/* attach the new virtual device, then store port_id of the device */
>> +static int
>> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
>> +{
>> +       char *args;
>> +       uint8_t new_port_id;
>> +       struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>> +
>> +       if ((vdevargs == NULL) || (port_id == NULL))
>> +               goto err0;
>> +
>> +       args = strdup(vdevargs);
>> +       if (args == NULL)
>> +               goto err0;
>> +
>> +       /* save current port status */
>> +       if (rte_eth_dev_save(devs, sizeof(devs)))
>> +               goto err1;
>> +       /* add the vdevargs to devargs_list */
>> +       if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
>> +               goto err1;
> You don't need to add devargs to the devargs_list.
>
> The devargs_list is only needed at the init to store the arguments
> when we parse the command line. Then, at initialization,
> rte_eal_dev_init  creates the devices from this list .
>
> Instead of adding the devargs in the list, you could have the following code:
>
> static int
> rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
> {
>      ...
>     /* save current port status */
>     if (rte_eth_dev_save(devs, sizeof(devs)))
>        goto err;
>
>     devargs = rte_eal_parse_devargs_str(RTE_
> DEVTYPE_VIRTUAL, vdevargs);
>     if (devargs == NULL)
>          goto err;
>
>     if (rte_eal_vdev_devinit(devargs) < 0)
>          goto err;
>
>    if (rte_eth_dev_get_changed_port(devs, &new_port_id))
>          goto err;
>
>    ...
> }
>
> What do you think ?

Hi Maxime,

I appreciate for your comment.

When rte_eal_init() is called, if we have "--vdev" options, these will
be stored in vdevargs as you describe above.
I agree this is the current behavior of DPDK.

When we call hotplug functions, I guess doing same thing will be more
consistent design.

For example, we can do like below.
1. $ ./testpmd --vdev 'eth_pcap' -- -i
2. testpmd>port detach

Also we can do like below.
1. $ ./testpmd -- -i
2. testpmd> port attach eth_pcap
3. testpmd> port detach

After doing above cases, we have no port. But in only first case,
vdevargs still has a "--vdev" option value.
So I guess current hotplug implementation is more consistent design.
How about?

Regards,
Tetsuya



>> +       /* parse vdevargs, then retrieve device name */
>> +       get_vdev_name(args);
>> +       /* walk around dev_driver_list to find the driver of the device,
>> +        * then invoke probe function o the driver.
>> +        * TODO:
>> +        * rte_eal_vdev_find_and_init() should return port_id,
>> +        * And rte_eth_dev_save() and rte_eth_dev_get_changed_port()
>> +        * should be removed. */
>> +       if (rte_eal_vdev_find_and_init(args))
>> +               goto err2;
>> +       /* get port_id enabled by above procedures */
>> +       if (rte_eth_dev_get_changed_port(devs, &new_port_id))
>> +               goto err2;
>> +
>> +       free(args);
>> +       *port_id = new_port_id;
>> +       return 0;
>> +err2:
>> +       rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, args);
>> +err1:
>> +       free(args);
>> +err0:
>> +       RTE_LOG(ERR, EAL, "Driver, cannot attach the device\n");
>> +       return -1;
>> +}
>> +
>> +/* detach the new virtual device, then store the name of the device */
>> +static int
>> +rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
>> +{
>> +       char name[RTE_ETH_NAME_MAX_LEN];
>> +
>> +       if (vdevname == NULL)
>> +               goto err;
>> +
>> +       /* check whether the driver supports detach feature, or not */
>> +       if (rte_eth_dev_is_detachable(port_id))
>> +               goto err;
>> +
>> +       /* get device name by port id */
>> +       if (rte_eth_dev_get_name_by_port(port_id, name))
>> +               goto err;
>> +       /* walk around dev_driver_list to find the driver of the device,
>> +        * then invoke close function o the driver */
>> +       if (rte_eal_vdev_find_and_uninit(name))
>> +               goto err;
>> +       /* remove the vdevname from devargs_list */
>> +       if (rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, name))
>> +               goto err;
> The same here, you don't need to remove devargs from the devargs_list.
>
> Instead of removing the devargs in the list, you could have the following code::
>
> static int
> rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
> {
>          ...
>          /* check whether the driver supports detach feature, or not */
>          if (rte_eth_dev_is_detachable(port_id))
>                 goto err;
>
>          /* get device name by port id */
>          if (rte_eth_dev_get_name_by_port(port_id, name))
>                 goto err;
>
>         if (rte_eal_vdev_uninit(name))
>                  goto err;
>        ...
> }
>
> What do you think ?
>
>
> Regards,
>
> Maxime
  
Maxime Leroy Feb. 20, 2015, 3:20 p.m. UTC | #3
Hi Tetsuya,

Thanks for your comment.

On Fri, Feb 20, 2015 at 11:32 AM, Tetsuya Mukawa <mukawa@igel.co.jp> wrote:
> On 2015/02/20 19:14, Maxime Leroy wrote:
>> Hi Tetsuya,
[...]
>>
>
> Hi Maxime,
>
> I appreciate for your comment.
>
> When rte_eal_init() is called, if we have "--vdev" options, these will
> be stored in vdevargs as you describe above.
> I agree this is the current behavior of DPDK.
>
> When we call hotplug functions, I guess doing same thing will be more
> consistent design.

The rte_eal_devargs_add is here to store a white list parameters for
later creating the devices.
It means that the devargs_list is only needed at the init to store the
devargs parsed .
I think we should not use the devargs_list after eal initialization.

Why you want to add devargs in the devargs_list, if there are no needs
to store this information ?

At the end, it adds extra codes for nothing.

>
> For example, we can do like below.
> 1. $ ./testpmd --vdev 'eth_pcap' -- -i
> 2. testpmd>port detach

It's exactly the same for physical device:
1. $./testpmd -w 0000:08:00:1 -- -i
2. testpmd> port detach

But you don't call rte_eal_devargs_add with RTE_DEVTYPE_WHILISTED_PCI
in  rte_eal_dev_attach_pdev ?
Thus it makes the hotplug implementation not coherent for me.

What do you think ?

Regards,

Maxime
  
Tetsuya Mukawa Feb. 21, 2015, 3:49 a.m. UTC | #4
On 2015/02/21 0:20, Maxime Leroy wrote:
> Hi Tetsuya,
>
> Thanks for your comment.
>
> On Fri, Feb 20, 2015 at 11:32 AM, Tetsuya Mukawa <mukawa@igel.co.jp> wrote:
>> On 2015/02/20 19:14, Maxime Leroy wrote:
>>> Hi Tetsuya,
> [...]
>> Hi Maxime,
>>
>> I appreciate for your comment.
>>
>> When rte_eal_init() is called, if we have "--vdev" options, these will
>> be stored in vdevargs as you describe above.
>> I agree this is the current behavior of DPDK.
>>
>> When we call hotplug functions, I guess doing same thing will be more
>> consistent design.
> The rte_eal_devargs_add is here to store a white list parameters for
> later creating the devices.
> It means that the devargs_list is only needed at the init to store the
> devargs parsed .
> I think we should not use the devargs_list after eal initialization.
>
> Why you want to add devargs in the devargs_list, if there are no needs
> to store this information ?

In eal initialization code, virtual device names stored in devargs are
checked not to register a same device name twice.
And each init function of PMD just trust a device name received by eal.
So there is no code in PMD to check whether device name is unique.

For example, according to your suggestion, how to prevent below case?
$ ./testpmd -c f -n 1 -- -i
testpmd> port attach eth_pcap0,iface=eth0
testpmd> port attach eth_pcap0,iface=eth1

Also, type below, after doing above.
testpmd> port detach 0

Probably port 0 will be "eth_pcap0,iface=eth0".
But uninit code of PMD only receives a device name like 'eth_pcap0'.
(We have 2 'eth_pcap0' devices in PMD.)

To prevent above case, probably we have 2 options at least.
One is changing init code of all virtual PMDs not to register same
device name.
The other is to use devargs_list in EAL, and call init code of PMD with
a unique device name.

But first case, eal initialization code can calls init of PMD with
unique name, but hotplug cannot.
It's not so consistent behavior.
Also, we need to have almost same code that assure unique name in each PMD.

> At the end, it adds extra codes for nothing.
>
>> For example, we can do like below.
>> 1. $ ./testpmd --vdev 'eth_pcap' -- -i
>> 2. testpmd>port detach
> It's exactly the same for physical device:
> 1. $./testpmd -w 0000:08:00:1 -- -i
> 2. testpmd> port detach
>
> But you don't call rte_eal_devargs_add with RTE_DEVTYPE_WHILISTED_PCI
> in  rte_eal_dev_attach_pdev ?

Yes, I don't.
Hotplug functions should not change BLACKLIST and WHITELIST.
So not to touch the list is correct behavior.

I guess you feel something strange about usage of devargs_list.
If so, I agree with it.
Probably the issue is that devargs_list is used for not only storing
command parameters, but also not to register same virtual device name twice.

> Thus it makes the hotplug implementation not coherent for me.
>
> What do you think ?

Anyway, here is my guess.

Current DPDK uses devargs not to register same virtual device name.
If we follow this, probably using devargs is straight forward.

But if we should change the design itself, could you check below.
http://dpdk.org/dev/patchwork/patch/3374/
Thomas and I agreed that I will try to change design in post-rc1.

In a future patch, probably it might be nice to have virtualdev_list,
then we can use devargs_list only for storing command arguments.

Regards,
Tetsuya

> Regards,
>
> Maxime
  
Maxime Leroy Feb. 21, 2015, 12:49 p.m. UTC | #5
Hi Tetsuya,

On Sat, Feb 21, 2015 at 4:49 AM, Tetsuya Mukawa <mukawa@igel.co.jp> wrote:
> On 2015/02/21 0:20, Maxime Leroy wrote:
[...]
>> Why you want to add devargs in the devargs_list, if there are no needs
>> to store this information ?
>
> In eal initialization code, virtual device names stored in devargs are
> checked not to register a same device name twice.
> And each init function of PMD just trust a device name received by eal.
> So there is no code in PMD to check whether device name is unique.
>

I disagree with you. This check is not present in the master branch.

You have added this check in your hotplug patchset, in this patch:
[PATCH v10 10/14] eal/pci: Add a function to remove the entry of
devargs list
See: http://dpdk.org/ml/archives/dev/2015-February/013712.html

Thus the problem should be already exist without your patches in the
master branch.

For example according to you, this testpmd command should create 2
devices with the same name:
testpmd -c 0xc --vdev eth_pcap0,iface=eth0 --vdev eth_pcap0,iface=eth1
-n 2 -- -i

But it's not the case:
PMD: Initializing pmd_pcap for eth_pcap0
PMD: Creating pcap-backed ethdev on numa socket 0
PMD: Initializing pmd_pcap for eth_pcap0
PMD: Creating pcap-backed ethdev on numa socket 0
PMD: rte_eth_dev_allocate: Ethernet Device with name eth_pcap0 already
allocated!

In fact, it's not possible for any PMD_VDEV in the dpdk repo to create
2 devices with the same name.
All the virtual device initialization functions use the
rte_eth_dev_allocate function. This function prevents to create two
ethernet devices with the same name:

     if (rte_eth_dev_allocated(name) != NULL) {
        PMD_DEBUG_TRACE("Ethernet Device with name %s already
allocated!\n", name);
        return NULL;
    }


> For example, according to your suggestion, how to prevent below case?
> $ ./testpmd -c f -n 1 -- -i
> testpmd> port attach eth_pcap0,iface=eth0
> testpmd> port attach eth_pcap0,iface=eth1
>
> Also, type below, after doing above.
> testpmd> port detach 0
>
> Probably port 0 will be "eth_pcap0,iface=eth0".
> But uninit code of PMD only receives a device name like 'eth_pcap0'.
> (We have 2 'eth_pcap0' devices in PMD.)
>
> To prevent above case, probably we have 2 options at least.
> One is changing init code of all virtual PMDs not to register same
> device name.

There are no need to change init code of all virtual PMDs to not
register the same device name 2 times.
Because it's already not possible to create 2 virtual device with the
same name. (see my point above)

> The other is to use devargs_list in EAL, and call init code of PMD with
> a unique device name.

Thus there are no needs to use the devargs_list for that.

>
[..]
>>
>> But you don't call rte_eal_devargs_add with RTE_DEVTYPE_WHILISTED_PCI
>> in  rte_eal_dev_attach_pdev ?
>
> Yes, I don't.
> Hotplug functions should not change BLACKLIST and WHITELIST.
> So not to touch the list is correct behavior.

Yes the correct behaviour for Hotplug functions is to not use the
devargs_list for physical and virtual devices !

Regards,

Maxime
  
Tetsuya Mukawa Feb. 22, 2015, 3:04 a.m. UTC | #6
On 2015/02/21 21:49, Maxime Leroy wrote:
> Hi Tetsuya,
>
> On Sat, Feb 21, 2015 at 4:49 AM, Tetsuya Mukawa <mukawa@igel.co.jp> wrote:
>> On 2015/02/21 0:20, Maxime Leroy wrote:
> [...]
>>> Why you want to add devargs in the devargs_list, if there are no needs
>>> to store this information ?
>> In eal initialization code, virtual device names stored in devargs are
>> checked not to register a same device name twice.
>> And each init function of PMD just trust a device name received by eal.
>> So there is no code in PMD to check whether device name is unique.
>>
> I disagree with you. This check is not present in the master branch.
>
> You have added this check in your hotplug patchset, in this patch:
> [PATCH v10 10/14] eal/pci: Add a function to remove the entry of
> devargs list
> See: http://dpdk.org/ml/archives/dev/2015-February/013712.html
>
> Thus the problem should be already exist without your patches in the
> master branch.
>
> For example according to you, this testpmd command should create 2
> devices with the same name:
> testpmd -c 0xc --vdev eth_pcap0,iface=eth0 --vdev eth_pcap0,iface=eth1
> -n 2 -- -i
>
> But it's not the case:
> PMD: Initializing pmd_pcap for eth_pcap0
> PMD: Creating pcap-backed ethdev on numa socket 0
> PMD: Initializing pmd_pcap for eth_pcap0
> PMD: Creating pcap-backed ethdev on numa socket 0
> PMD: rte_eth_dev_allocate: Ethernet Device with name eth_pcap0 already
> allocated!
>
> In fact, it's not possible for any PMD_VDEV in the dpdk repo to create
> 2 devices with the same name.
> All the virtual device initialization functions use the
> rte_eth_dev_allocate function. This function prevents to create two
> ethernet devices with the same name:
>
>      if (rte_eth_dev_allocated(name) != NULL) {
>         PMD_DEBUG_TRACE("Ethernet Device with name %s already
> allocated!\n", name);
>         return NULL;
>     }
>

Ah, You are right.

>> For example, according to your suggestion, how to prevent below case?
>> $ ./testpmd -c f -n 1 -- -i
>> testpmd> port attach eth_pcap0,iface=eth0
>> testpmd> port attach eth_pcap0,iface=eth1
>>
>> Also, type below, after doing above.
>> testpmd> port detach 0
>>
>> Probably port 0 will be "eth_pcap0,iface=eth0".
>> But uninit code of PMD only receives a device name like 'eth_pcap0'.
>> (We have 2 'eth_pcap0' devices in PMD.)
>>
>> To prevent above case, probably we have 2 options at least.
>> One is changing init code of all virtual PMDs not to register same
>> device name.
> There are no need to change init code of all virtual PMDs to not
> register the same device name 2 times.
> Because it's already not possible to create 2 virtual device with the
> same name. (see my point above)
>
>> The other is to use devargs_list in EAL, and call init code of PMD with
>> a unique device name.
> Thus there are no needs to use the devargs_list for that.
>
> [..]
>>> But you don't call rte_eal_devargs_add with RTE_DEVTYPE_WHILISTED_PCI
>>> in  rte_eal_dev_attach_pdev ?
>> Yes, I don't.
>> Hotplug functions should not change BLACKLIST and WHITELIST.
>> So not to touch the list is correct behavior.
> Yes the correct behaviour for Hotplug functions is to not use the
> devargs_list for physical and virtual devices !

I totally agree with you now.
It seems I have a missunderstanding about code not to duplicate a vdev name.
Thanks for your suggestions, I will fix it in next patches.

Regards,
Tetsuya


>
> Regards,
>
> Maxime
  

Patch

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index eae5656..0a623f6 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -32,10 +32,13 @@ 
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include <stdio.h>
+#include <limits.h>
 #include <string.h>
 #include <inttypes.h>
 #include <sys/queue.h>
 
+#include <rte_ethdev.h>
 #include <rte_dev.h>
 #include <rte_devargs.h>
 #include <rte_debug.h>
@@ -107,3 +110,315 @@  rte_eal_dev_init(void)
 	}
 	return 0;
 }
+
+/* So far, DPDK hotplug function only supports linux */
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static int
+rte_eal_vdev_find_and_init(const char *name)
+{
+	struct rte_devargs *devargs;
+	struct rte_driver *driver;
+
+	if (name == NULL)
+		return -EINVAL;
+
+	/* find the specified device and call init function */
+	TAILQ_FOREACH(devargs, &devargs_list, next) {
+
+		if (devargs->type != RTE_DEVTYPE_VIRTUAL)
+			continue;
+
+		if (strcmp(name, devargs->virtual.drv_name))
+			continue;
+
+		TAILQ_FOREACH(driver, &dev_driver_list, next) {
+			if (driver->type != PMD_VDEV)
+				continue;
+
+			/*
+			 * search a driver prefix in virtual device name.
+			 * For example, if the driver is pcap PMD, driver->name
+			 * will be "eth_pcap", but devargs->virtual.drv_name
+			 * will be "eth_pcapN". So use strncmp to compare.
+			 */
+			if (!strncmp(driver->name, devargs->virtual.drv_name,
+			    strlen(driver->name))) {
+				driver->init(devargs->virtual.drv_name,
+						devargs->args);
+				break;
+			}
+		}
+
+		if (driver == NULL) {
+			RTE_LOG(WARNING, EAL, "no driver found for %s\n",
+				  devargs->virtual.drv_name);
+		}
+		return 0;
+	}
+	return 1;
+}
+
+static int
+rte_eal_vdev_find_and_uninit(const char *name)
+{
+	struct rte_devargs *devargs;
+	struct rte_driver *driver;
+
+	if (name == NULL)
+		return -EINVAL;
+
+	/* find the specified device and call uninit function */
+	TAILQ_FOREACH(devargs, &devargs_list, next) {
+
+		if (devargs->type != RTE_DEVTYPE_VIRTUAL)
+			continue;
+
+		if (strcmp(name, devargs->virtual.drv_name))
+			continue;
+
+		TAILQ_FOREACH(driver, &dev_driver_list, next) {
+			if (driver->type != PMD_VDEV)
+				continue;
+
+			/*
+			 * search a driver prefix in virtual device name.
+			 * For example, if the driver is pcap PMD, driver->name
+			 * will be "eth_pcap", but devargs->virtual.drv_name
+			 * will be "eth_pcapN". So use strncmp to compare.
+			 */
+			if (!strncmp(driver->name, devargs->virtual.drv_name,
+			    strlen(driver->name))) {
+				driver->uninit(devargs->virtual.drv_name);
+				break;
+			}
+		}
+
+		if (driver == NULL) {
+			RTE_LOG(WARNING, EAL, "no driver found for %s\n",
+				  devargs->virtual.drv_name);
+		}
+		return 0;
+	}
+	return 1;
+}
+
+/* attach the new physical device, then store port_id of the device */
+static int
+rte_eal_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
+{
+	uint8_t new_port_id;
+	struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
+
+	if ((addr == NULL) || (port_id == NULL))
+		goto err;
+
+	/* save current port status */
+	if (rte_eth_dev_save(devs, sizeof(devs)))
+		goto err;
+	/* re-construct pci_device_list */
+	if (rte_eal_pci_scan())
+		goto err;
+	/* invoke probe func of the driver can handle the new device.
+	 * TODO:
+	 * rte_eal_pci_probe_one() should return port_id.
+	 * And rte_eth_dev_save() and rte_eth_dev_get_changed_port()
+	 * should be removed. */
+	if (rte_eal_pci_probe_one(addr))
+		goto err;
+	/* get port_id enabled by above procedures */
+	if (rte_eth_dev_get_changed_port(devs, &new_port_id))
+		goto err;
+
+	*port_id = new_port_id;
+	return 0;
+err:
+	RTE_LOG(ERR, EAL, "Driver, cannot attach the device\n");
+	return -1;
+}
+
+/* detach the new physical device, then store pci_addr of the device */
+static int
+rte_eal_dev_detach_pdev(uint8_t port_id, struct rte_pci_addr *addr)
+{
+	struct rte_pci_addr freed_addr;
+	struct rte_pci_addr vp;
+
+	if (addr == NULL)
+		goto err;
+
+	/* check whether the driver supports detach feature, or not */
+	if (rte_eth_dev_is_detachable(port_id))
+		goto err;
+
+	/* get pci address by port id */
+	if (rte_eth_dev_get_addr_by_port(port_id, &freed_addr))
+		goto err;
+
+	/* Zerod pci addr means the port comes from virtual device */
+	vp.domain = vp.bus = vp.devid = vp.function = 0;
+	if (rte_eal_compare_pci_addr(&vp, &freed_addr) == 0)
+		goto err;
+
+	/* invoke close func of the driver,
+	 * also remove the device from pci_device_list */
+	if (rte_eal_pci_close_one(&freed_addr))
+		goto err;
+
+	*addr = freed_addr;
+	return 0;
+err:
+	RTE_LOG(ERR, EAL, "Driver, cannot detach the device\n");
+	return -1;
+}
+
+static void
+get_vdev_name(char *vdevargs)
+{
+	char *sep;
+
+	if (vdevargs == NULL)
+		return;
+
+	/* set the first ',' to '\0' to split name and arguments */
+	sep = strchr(vdevargs, ',');
+	if (sep != NULL)
+		sep[0] = '\0';
+}
+
+/* attach the new virtual device, then store port_id of the device */
+static int
+rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
+{
+	char *args;
+	uint8_t new_port_id;
+	struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
+
+	if ((vdevargs == NULL) || (port_id == NULL))
+		goto err0;
+
+	args = strdup(vdevargs);
+	if (args == NULL)
+		goto err0;
+
+	/* save current port status */
+	if (rte_eth_dev_save(devs, sizeof(devs)))
+		goto err1;
+	/* add the vdevargs to devargs_list */
+	if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
+		goto err1;
+	/* parse vdevargs, then retrieve device name */
+	get_vdev_name(args);
+	/* walk around dev_driver_list to find the driver of the device,
+	 * then invoke probe function o the driver.
+	 * TODO:
+	 * rte_eal_vdev_find_and_init() should return port_id,
+	 * And rte_eth_dev_save() and rte_eth_dev_get_changed_port()
+	 * should be removed. */
+	if (rte_eal_vdev_find_and_init(args))
+		goto err2;
+	/* get port_id enabled by above procedures */
+	if (rte_eth_dev_get_changed_port(devs, &new_port_id))
+		goto err2;
+
+	free(args);
+	*port_id = new_port_id;
+	return 0;
+err2:
+	rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, args);
+err1:
+	free(args);
+err0:
+	RTE_LOG(ERR, EAL, "Driver, cannot attach the device\n");
+	return -1;
+}
+
+/* detach the new virtual device, then store the name of the device */
+static int
+rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
+{
+	char name[RTE_ETH_NAME_MAX_LEN];
+
+	if (vdevname == NULL)
+		goto err;
+
+	/* check whether the driver supports detach feature, or not */
+	if (rte_eth_dev_is_detachable(port_id))
+		goto err;
+
+	/* get device name by port id */
+	if (rte_eth_dev_get_name_by_port(port_id, name))
+		goto err;
+	/* walk around dev_driver_list to find the driver of the device,
+	 * then invoke close function o the driver */
+	if (rte_eal_vdev_find_and_uninit(name))
+		goto err;
+	/* remove the vdevname from devargs_list */
+	if (rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, name))
+		goto err;
+
+	strncpy(vdevname, name, sizeof(name));
+	return 0;
+err:
+	RTE_LOG(ERR, EAL, "Driver, cannot detach the device\n");
+	return -1;
+}
+
+/* attach the new device, then store port_id of the device */
+int
+rte_eal_dev_attach(const char *devargs, uint8_t *port_id)
+{
+	struct rte_pci_addr addr;
+
+	if ((devargs == NULL) || (port_id == NULL))
+		return -EINVAL;
+
+	if (eal_parse_pci_DomBDF(devargs, &addr) == 0)
+		return rte_eal_dev_attach_pdev(&addr, port_id);
+	else
+		return rte_eal_dev_attach_vdev(devargs, port_id);
+}
+
+/* detach the device, then store the name of the device */
+int
+rte_eal_dev_detach(uint8_t port_id, char *name)
+{
+	struct rte_pci_addr addr;
+	int ret;
+
+	if (name == NULL)
+		return -EINVAL;
+
+	if (rte_eth_dev_get_device_type(port_id) == RTE_ETH_DEV_PCI) {
+		ret = rte_eth_dev_get_addr_by_port(port_id, &addr);
+		if (ret < 0)
+			return ret;
+
+		ret = rte_eal_dev_detach_pdev(port_id, &addr);
+		if (ret == 0)
+			snprintf(name, RTE_ETH_NAME_MAX_LEN,
+				"%04x:%02x:%02x.%d",
+				addr.domain, addr.bus,
+				addr.devid, addr.function);
+
+		return ret;
+	} else
+		return rte_eal_dev_detach_vdev(port_id, name);
+}
+#else /* RTE_LIBRTE_EAL_HOTPLUG */
+int
+rte_eal_dev_attach(const char *devargs __rte_unused,
+			uint8_t *port_id __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Hotplug support isn't enabled\n");
+	return -1;
+}
+
+/* detach the device, then store the name of the device */
+int
+rte_eal_dev_detach(uint8_t port_id __rte_unused,
+			char *name __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Hotplug support isn't enabled\n");
+	return -1;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 4acf5a0..98b286a 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -154,6 +154,17 @@  struct rte_pci_driver;
 struct rte_pci_device;
 
 /**
+ * Scan the content of the PCI bus, and the devices in the devices
+ * list
+ *
+ * This function is private to EAL.
+ *
+ * @return
+ *  0 on success, negative on error
+ */
+int rte_eal_pci_scan(void);
+
+/**
  * Mmap memory for single PCI device
  *
  * This function is private to EAL.
diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index f7e3a10..a5ac770 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -47,6 +47,7 @@  extern "C" {
 #endif
 
 #include <sys/queue.h>
+#include <rte_pci.h>
 
 /** Double linked list of device drivers. */
 TAILQ_HEAD(rte_driver_list, rte_driver);
@@ -57,6 +58,11 @@  TAILQ_HEAD(rte_driver_list, rte_driver);
 typedef int (rte_dev_init_t)(const char *name, const char *args);
 
 /**
+ * Uninitilization function called for each device driver once.
+ */
+typedef int (rte_dev_uninit_t)(const char *name);
+
+/**
  * Driver type enumeration
  */
 enum pmd_type {
@@ -72,6 +78,7 @@  struct rte_driver {
 	enum pmd_type type;		   /**< PMD Driver type */
 	const char *name;                   /**< Driver name. */
 	rte_dev_init_t *init;              /**< Device init. function. */
+	rte_dev_uninit_t *uninit;          /**< Device uninit. function. */
 };
 
 /**
@@ -93,6 +100,32 @@  void rte_eal_driver_register(struct rte_driver *driver);
 void rte_eal_driver_unregister(struct rte_driver *driver);
 
 /**
+ * Attach a new device.
+ *
+ * @param devargs
+ *   A pointer to a strings array describing the new device
+ *   to be attached. The strings should be a pci address like
+ *   '0000:01:00.0' or virtual device name like 'eth_pcap0'.
+ * @param port_id
+ *  A pointer to a port identifier actually attached.
+ * @return
+ *  0 on success and port_id is filled, negative on error
+ */
+int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);
+
+/**
+ * Detach a device.
+ *
+ * @param port_id
+ *   The port identifier of the device to detach.
+ * @param addr
+ *  A pointer to a device name actually detached.
+ * @return
+ *  0 on success and devname is filled, negative on error
+ */
+int rte_eal_dev_detach(uint8_t port_id, char *devname);
+
+/**
  * Initalize all the registered drivers in this process
  */
 int rte_eal_dev_init(void);
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index e117cec..b59b201 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -45,6 +45,7 @@  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
 CFLAGS += -I$(RTE_SDK)/lib/librte_ring
 CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
 CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
+CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
 CFLAGS += -I$(RTE_SDK)/lib/librte_ether
 CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem
 CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 4bdf51b..21d3e85 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -441,8 +441,8 @@  error:
  * Scan the content of the PCI bus, and the devices in the devices
  * list
  */
-static int
-pci_scan(void)
+int
+rte_eal_pci_scan(void)
 {
 	struct dirent *e;
 	DIR *dir;
@@ -774,7 +774,7 @@  rte_eal_pci_init(void)
 	if (internal_config.no_pci)
 		return 0;
 
-	if (pci_scan() < 0) {
+	if (rte_eal_pci_scan() < 0) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot scan PCI bus\n", __func__);
 		return -1;
 	}
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index d36286e..bc6b82d 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -20,6 +20,8 @@  DPDK_2.0 {
 	rte_dump_tailq;
 	rte_eal_alarm_cancel;
 	rte_eal_alarm_set;
+	rte_eal_dev_attach;
+	rte_eal_dev_detach;
 	rte_eal_dev_init;
 	rte_eal_devargs_add;
 	rte_eal_devargs_dump;