[dpdk-dev,1/5] bonding: replace spinlock with read/write lock
Commit Message
Fixes: a45b288ef21a ("bond: support link status polling")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
---
drivers/net/bonding/rte_eth_bond_api.c | 10 +++---
drivers/net/bonding/rte_eth_bond_pmd.c | 57 +++++++++++++++---------------
drivers/net/bonding/rte_eth_bond_private.h | 6 ++--
3 files changed, 36 insertions(+), 37 deletions(-)
Comments
On Thu, 5 May 2016 16:14:56 +0100
Bernard Iremonger <bernard.iremonger@intel.com> wrote:
> Fixes: a45b288ef21a ("bond: support link status polling")
> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
You know an uncontested reader/writer lock is significantly slower
than a spinlock.
On 05/05/16 18:12, Stephen Hemminger wrote:
> On Thu, 5 May 2016 16:14:56 +0100
> Bernard Iremonger <bernard.iremonger@intel.com> wrote:
>
>> Fixes: a45b288ef21a ("bond: support link status polling")
>> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
>
> You know an uncontested reader/writer lock is significantly slower
> than a spinlock.
>
As we can have multiple readers of the active slave list / primary
slave, basically any tx/rx burst call needs to protect against a device
being removed/closed during its operation now that we support
hotplugging. In the worst case this could mean we have 2 (rx+tx) * queues
contexts possibly using the active slave list simultaneously; in that case I
would have thought that a spinlock would have a much more significant
effect on performance?
On Fri, 6 May 2016 11:32:19 +0100
Declan Doherty <declan.doherty@intel.com> wrote:
> <snip>
> As we can have multiple readers of the active slave list / primary
> slave, basically any tx/rx burst call needs to protect against a device
> being removed/closed during its operation now that we support
> hotplugging. In the worst case this could mean we have 2 (rx+tx) * queues
> contexts possibly using the active slave list simultaneously; in that case I
> would have thought that a spinlock would have a much more significant
> effect on performance?
Right, but the window where the shared variable is accessed is very small,
and it is actually faster to use a spinlock for that.
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Friday, May 06, 2016 4:56 PM
> To: Doherty, Declan
> Cc: Iremonger, Bernard; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/5] bonding: replace spinlock with read/write lock
>
> On Fri, 6 May 2016 11:32:19 +0100
> Declan Doherty <declan.doherty@intel.com> wrote:
> <snip>
>
> Right, but the window where the shared variable is accessed is very small,
> and it is actually faster to use spinlock for that.
I don't think the window we hold the lock for is that small: say we have
a burst of 32 packets * (say) 50 cycles/pkt = ~1600 cycles - each IO thread would stall for that long.
For me that's long enough to justify rwlock usage here, especially as the
DPDK rwlock price is not much bigger (as I remember) than a spinlock -
it is basically 1 CAS operation.
Konstantin
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Friday, May 13, 2016 6:11 PM
> To: Stephen Hemminger; Doherty, Declan
> Cc: Iremonger, Bernard; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/5] bonding: replace spinlock with read/write lock
>
> <snip>
>
> I don't think the window we hold the lock for is that small: say we have
> a burst of 32 packets * (say) 50 cycles/pkt = ~1600 cycles - each IO thread would stall for that long.
> For me that's long enough to justify rwlock usage here, especially as the
> DPDK rwlock price is not much bigger (as I remember) than a spinlock -
> it is basically 1 CAS operation.
As another alternative we can have a spinlock per queue; then different IO threads
doing RX/TX over different queues will not contend at all.
Though the control thread would need to grab the locks for all configured queues :)
Konstantin
Hi Konstantin,
<snip>
> As another alternative we can have a spinlock per queue; then different IO
> threads doing RX/TX over different queues will not contend at all.
> Though the control thread would need to grab the locks for all configured queues :)
>
> Konstantin
>
I am preparing a v2 patchset which uses a spinlock per queue.
Regards,
Bernard.
@@ -227,7 +227,7 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t socket_id)
eth_dev->data->drv_name = pmd_bond_driver_name;
eth_dev->data->numa_node = socket_id;
- rte_spinlock_init(&internals->lock);
+ rte_rwlock_init(&internals->rwlock);
internals->port_id = eth_dev->data->port_id;
internals->mode = BONDING_MODE_INVALID;
@@ -451,11 +451,11 @@ rte_eth_bond_slave_add(uint8_t bonded_port_id, uint8_t slave_port_id)
bonded_eth_dev = &rte_eth_devices[bonded_port_id];
internals = bonded_eth_dev->data->dev_private;
- rte_spinlock_lock(&internals->lock);
+ rte_rwlock_write_lock(&internals->rwlock);
retval = __eth_bond_slave_add_lock_free(bonded_port_id, slave_port_id);
- rte_spinlock_unlock(&internals->lock);
+ rte_rwlock_write_unlock(&internals->rwlock);
return retval;
}
@@ -553,11 +553,11 @@ rte_eth_bond_slave_remove(uint8_t bonded_port_id, uint8_t slave_port_id)
bonded_eth_dev = &rte_eth_devices[bonded_port_id];
internals = bonded_eth_dev->data->dev_private;
- rte_spinlock_lock(&internals->lock);
+ rte_rwlock_write_lock(&internals->rwlock);
retval = __eth_bond_slave_remove_lock_free(bonded_port_id, slave_port_id);
- rte_spinlock_unlock(&internals->lock);
+ rte_rwlock_write_unlock(&internals->rwlock);
return retval;
}
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -1750,37 +1750,36 @@ bond_ethdev_slave_link_status_change_monitor(void *cb_arg)
!internals->link_status_polling_enabled)
return;
- /* If device is currently being configured then don't check slaves link
- * status, wait until next period */
- if (rte_spinlock_trylock(&internals->lock)) {
- if (internals->slave_count > 0)
- polling_slave_found = 0;
+ rte_rwlock_read_lock(&internals->rwlock);
+ if (internals->slave_count > 0)
+ polling_slave_found = 0;
- for (i = 0; i < internals->slave_count; i++) {
- if (!internals->slaves[i].link_status_poll_enabled)
- continue;
-
- slave_ethdev = &rte_eth_devices[internals->slaves[i].port_id];
- polling_slave_found = 1;
-
- /* Update slave link status */
- (*slave_ethdev->dev_ops->link_update)(slave_ethdev,
- internals->slaves[i].link_status_wait_to_complete);
-
- /* if link status has changed since last checked then call lsc
- * event callback */
- if (slave_ethdev->data->dev_link.link_status !=
- internals->slaves[i].last_link_status) {
- internals->slaves[i].last_link_status =
- slave_ethdev->data->dev_link.link_status;
-
- bond_ethdev_lsc_event_callback(internals->slaves[i].port_id,
- RTE_ETH_EVENT_INTR_LSC,
- &bonded_ethdev->data->port_id);
- }
+ for (i = 0; i < internals->slave_count; i++) {
+ if (!internals->slaves[i].link_status_poll_enabled)
+ continue;
+
+ slave_ethdev = &rte_eth_devices[internals->slaves[i].port_id];
+ polling_slave_found = 1;
+
+ /* Update slave link status */
+ (*slave_ethdev->dev_ops->link_update)(slave_ethdev,
+ internals->slaves[i].link_status_wait_to_complete);
+
+ /* if link status has changed since last checked then call lsc
+ * event callback
+ */
+ if (slave_ethdev->data->dev_link.link_status !=
+ internals->slaves[i].last_link_status) {
+ internals->slaves[i].last_link_status =
+ slave_ethdev->data->dev_link.link_status;
+
+ bond_ethdev_lsc_event_callback(
+ internals->slaves[i].port_id,
+ RTE_ETH_EVENT_INTR_LSC,
+ &bonded_ethdev->data->port_id);
}
- rte_spinlock_unlock(&internals->lock);
}
+ rte_rwlock_read_unlock(&internals->rwlock);
if (polling_slave_found)
/* Set alarm to continue monitoring link status of slave ethdev's */
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -35,7 +35,7 @@
#define _RTE_ETH_BOND_PRIVATE_H_
#include <rte_ethdev.h>
-#include <rte_spinlock.h>
+#include <rte_rwlock.h>
#include "rte_eth_bond.h"
#include "rte_eth_bond_8023ad_private.h"
@@ -115,7 +115,7 @@ struct bond_dev_private {
uint8_t port_id; /**< Port Id of Bonded Port */
uint8_t mode; /**< Link Bonding Mode */
- rte_spinlock_t lock;
+ rte_rwlock_t rwlock;
uint8_t primary_port; /**< Primary Slave Port */
uint8_t current_primary_port; /**< Primary Slave Port */