[dpdk-dev,v3,3/4] bonding: take queue spinlock in rx/tx burst functions

Message ID 1465751489-10111-4-git-send-email-bernard.iremonger@intel.com (mailing list archive)
State Rejected, archived
Delegated to: Ferruh Yigit

Commit Message

Iremonger, Bernard June 12, 2016, 5:11 p.m. UTC
  Use rte_spinlock_trylock() in the rx/tx burst functions to
take the queue spinlock.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 116 ++++++++++++++++++++++++---------
 1 file changed, 84 insertions(+), 32 deletions(-)
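
For reference while reading the diff at the bottom of this page, the change boils down to the pattern below. This is a simplified sketch of the guarded rx burst, not the literal patch: it assumes the per-queue lock field added elsewhere in this series and the bonding PMD's internal structs.

/* Simplified sketch of the guarded rx burst: the burst only walks the
 * active-slave list if it can take the per-queue lock; otherwise it
 * returns 0 and the application simply polls again.
 */
static uint16_t
bond_rx_burst_sketch(struct bond_rx_queue *bd_rx_q,
		struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	struct bond_dev_private *internals = bd_rx_q->dev_private;
	uint16_t num_rx_total = 0;
	uint8_t i;

	if (rte_spinlock_trylock(&bd_rx_q->lock) == 0)
		return 0; /* control plane holds the lock, skip this poll */

	for (i = 0; i < internals->active_slave_count && nb_pkts; i++) {
		uint16_t n = rte_eth_rx_burst(internals->active_slaves[i],
				bd_rx_q->queue_id, bufs + num_rx_total,
				nb_pkts);

		num_rx_total += n;
		nb_pkts -= n;
	}

	rte_spinlock_unlock(&bd_rx_q->lock);
	return num_rx_total;
}

The tx burst functions and the other bonding modes follow the same trylock/unlock pattern, as the diff shows.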
  

Comments

Bruce Richardson June 13, 2016, 9:18 a.m. UTC | #1
On Sun, Jun 12, 2016 at 06:11:28PM +0100, Bernard Iremonger wrote:
> Use rte_spinlock_trylock() in the rx/tx burst functions to
> take the queue spinlock.
> 
> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> ---

Why does this particular PMD need spinlocks when doing RX and TX, while other
device types do not? How is adding/removing devices from a bonded device different
to other control operations that can be done on physical PMDs? Is this not
similar to say bringing down or hotplugging out a physical port just before an
RX or TX operation takes place?
For all other PMDs we rely on the app to synchronise control and data plane
operation - why not here? 

/Bruce
  
Iremonger, Bernard June 13, 2016, 12:28 p.m. UTC | #2
Hi Bruce,

<snip>

> Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
> burst functions
> 
> On Sun, Jun 12, 2016 at 06:11:28PM +0100, Bernard Iremonger wrote:
> > Use rte_spinlock_trylock() in the rx/tx burst functions to take the
> > queue spinlock.
> >
> > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > ---
> 
> Why does this particular PMD need spinlocks when doing RX and TX, while
> other device types do not? How is adding/removing devices from a bonded
> device different to other control operations that can be done on physical
> PMDs? Is this not similar to say bringing down or hotplugging out a physical
> port just before an RX or TX operation takes place?
> For all other PMDs we rely on the app to synchronise control and data plane
> operation - why not here?
> 
> /Bruce

This issue arose during VM live migration testing. 
For VM live migration it is necessary (while traffic is running) to be able to remove a bonded slave device, stop it, close it and detach it.
If a slave device is removed from a bonded device while traffic is running, a segmentation fault may occur in the rx/tx burst function. The spinlock has been added to prevent this occurring.

The bonding device already uses a spinlock to synchronise between the add and remove functionality and the slave_link_status_change_monitor code.

Previously testpmd did not allow stop, close or detach of a PMD while traffic was running. Testpmd has been modified with the following patchset:

http://dpdk.org/dev/patchwork/patch/13472/

It now allows stop, close and detach of a PMD provided it is not forwarding and is not a slave of a bonded PMD.

 Regards,

Bernard.
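
For context, the control-path sequence used in that live-migration scenario looks roughly like the sketch below. This is a hypothetical example, not code from the patch set: the helper name is made up, and it assumes the 16.07-era ethdev API (uint8_t port ids, void-returning stop/close).

#include <rte_ethdev.h>
#include <rte_eth_bond.h>

/* Hypothetical control-plane sequence for the live-migration case described
 * above. The rx/tx burst functions keep running on another core while this
 * executes, which is exactly the race the queue spinlock is meant to close.
 */
static int
remove_and_detach_slave(uint8_t bonded_port_id, uint8_t slave_port_id)
{
	char devname[64]; /* large enough for an ethdev name */

	if (rte_eth_bond_slave_remove(bonded_port_id, slave_port_id) != 0)
		return -1;

	rte_eth_dev_stop(slave_port_id);
	rte_eth_dev_close(slave_port_id);

	/* After detach the port is gone; a burst still holding a stale
	 * reference to it in the active-slave list would fault.
	 */
	if (rte_eth_dev_detach(slave_port_id, devname) != 0)
		return -1;

	return 0;
}

The testpmd patchset linked above is what makes this sequence reachable from the testpmd command line while traffic is still running on the bonded device.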
  
Bruce Richardson June 16, 2016, 2:32 p.m. UTC | #3
On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> Hi Bruce,
> 
> <snip>
> 
> > Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
> > burst functions
> > 
> > On Sun, Jun 12, 2016 at 06:11:28PM +0100, Bernard Iremonger wrote:
> > > Use rte_spinlock_trylock() in the rx/tx burst functions to take the
> > > queue spinlock.
> > >
> > > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> > > Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > > ---
> > 
> > Why does this particular PMD need spinlocks when doing RX and TX, while
> > other device types do not? How is adding/removing devices from a bonded
> > device different to other control operations that can be done on physical
> > PMDs? Is this not similar to say bringing down or hotplugging out a physical
> > port just before an RX or TX operation takes place?
> > For all other PMDs we rely on the app to synchronise control and data plane
> > operation - why not here?
> > 
> > /Bruce
> 
> This issue arose during VM live migration testing. 
> For VM live migration it is necessary (while traffic is running) to be able to remove a bonded slave device, stop it, close it and detach it.
> If a slave device is removed from a bonded device while traffic is running a segmentation fault may occur in the rx/tx burst function. The spinlock has been added to prevent this occurring.
> 
> The bonding device already uses a spinlock to synchronise between the add and remove functionality and the slave_link_status_change_monitor code. 
> 
> Previously testpmd did not allow stop, close or detach of a PMD while traffic was running. Testpmd has been modified with the following patchset
> 
> http://dpdk.org/dev/patchwork/patch/13472/
> 
> It now allows stop, close and detach of a PMD provided it is not forwarding and is not a slave of a bonded PMD.
> 
I will admit to not being fully convinced, but if nobody else has any serious
objections, and since this patch has been reviewed and acked, I'm ok to merge it
in. I'll do so shortly.

/Bruce
  
Thomas Monjalon June 16, 2016, 3 p.m. UTC | #4
2016-06-16 15:32, Bruce Richardson:
> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> > > Why does this particular PMD need spinlocks when doing RX and TX, while
> > > other device types do not? How is adding/removing devices from a bonded
> > > device different to other control operations that can be done on physical
> > > PMDs? Is this not similar to say bringing down or hotplugging out a physical
> > > port just before an RX or TX operation takes place?
> > > For all other PMDs we rely on the app to synchronise control and data plane
> > > operation - why not here?
> > > 
> > > /Bruce
> > 
> > This issue arose during VM live migration testing. 
> > For VM live migration it is necessary (while traffic is running) to be able to remove a bonded slave device, stop it, close it and detach it.
> > If a slave device is removed from a bonded device while traffic is running a segmentation fault may occur in the rx/tx burst function. The spinlock has been added to prevent this occurring.
> > 
> > The bonding device already uses a spinlock to synchronise between the add and remove functionality and the slave_link_status_change_monitor code. 
> > 
> > Previously testpmd did not allow stop, close or detach of a PMD while traffic was running. Testpmd has been modified with the following patchset
> > 
> > http://dpdk.org/dev/patchwork/patch/13472/
> > 
> > It now allows stop, close and detach of a PMD provided it is not forwarding and is not a slave of a bonded PMD.
> > 
> I will admit to not being fully convinced, but if nobody else has any serious
> objections, and since this patch has been reviewed and acked, I'm ok to merge it
> in. I'll do so shortly.

Please hold on.
Seeing locks introduced in the Rx/Tx path is an alert.
We clearly need a design document to explain where locks can be used
and what the responsibilities of the control plane are.
If everybody agrees in this document that DPDK can have some locks
in the fast path, then OK to merge it.

So I would say NACK for 16.07 and maybe postpone to 16.11.
  
Iremonger, Bernard June 16, 2016, 4:41 p.m. UTC | #5
Hi Thomas,
<snip>
> 2016-06-16 15:32, Bruce Richardson:
> > On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> > > > Why does this particular PMD need spinlocks when doing RX and TX,
> > > > while other device types do not? How is adding/removing devices
> > > > from a bonded device different to other control operations that
> > > > can be done on physical PMDs? Is this not similar to say bringing
> > > > down or hotplugging out a physical port just before an RX or TX
> operation takes place?
> > > > For all other PMDs we rely on the app to synchronise control and
> > > > data plane operation - why not here?
> > > >
> > > > /Bruce
> > >
> > > This issue arose during VM live migration testing.
> > > For VM live migration it is necessary (while traffic is running) to be able to
> remove a bonded slave device, stop it, close it and detach it.
> > > If a slave device is removed from a bonded device while traffic is running
> a segmentation fault may occur in the rx/tx burst function. The spinlock has
> been added to prevent this occurring.
> > >
> > > The bonding device already uses a spinlock to synchronise between the
> add and remove functionality and the slave_link_status_change_monitor
> code.
> > >
> > > Previously testpmd did not allow stop, close or detach of a PMD while
> > > traffic was running. Testpmd has been modified with the following
> > > patchset
> > >
> > > http://dpdk.org/dev/patchwork/patch/13472/
> > >
> > > It now allows stop, close and detach of a PMD provided it is not
> forwarding and is not a slave of bonded PMD.
> > >
> > I will admit to not being fully convinced, but if nobody else has any
> > serious objections, and since this patch has been reviewed and acked,
> > I'm ok to merge it in. I'll do so shortly.
> 
> Please hold on.
> Seeing locks introduced in the Rx/Tx path is an alert.
> We clearly need a design document to explain where locks can be used and
> what are the responsibility of the control plane.
> If everybody agrees in this document that DPDK can have some locks in the
> fast path, then OK to merge it.
> 
> So I would say NACK for 16.07 and maybe postpone to 16.11.

Looking at the documentation for the bonding PMD.

http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html

In section 10.2 it states the following:

Bonded devices support the dynamical addition and removal of slave devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs.

If a slave device is added or removed while traffic is running, there is the possibility of a segmentation fault in the rx/tx burst functions. This is most likely to occur in the round robin bonding mode.

This patch set fixes what appears to be a bug in the bonding PMD.
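
The fix has two halves: the bursts in this patch take the per-queue lock with rte_spinlock_trylock(), and the companion patch in the series ("bonding: grab queue spinlocks in slave add and remove") takes the same locks on the control path before the active-slave list is rewritten. A rough sketch of that second half, with a hypothetical helper name rather than the actual patch code:

/* Rough sketch: take every rx/tx queue lock of the bonded device before
 * changing the slave list, so an in-flight burst either finishes first or
 * skips its poll. Assumes the per-queue lock field added by this series.
 */
static void
bond_lock_all_queues(struct rte_eth_dev *bonded_dev)
{
	uint16_t q;

	for (q = 0; q < bonded_dev->data->nb_rx_queues; q++) {
		struct bond_rx_queue *bd_rx_q = bonded_dev->data->rx_queues[q];

		rte_spinlock_lock(&bd_rx_q->lock);
	}
	for (q = 0; q < bonded_dev->data->nb_tx_queues; q++) {
		struct bond_tx_queue *bd_tx_q = bonded_dev->data->tx_queues[q];

		rte_spinlock_lock(&bd_tx_q->lock);
	}
}

The add/remove path would call this before updating internals->active_slaves and active_slave_count, and release the locks with a matching unlock helper afterwards, so a burst can never observe a half-updated slave list or a slave that has already been stopped.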

Performance measurements have been made with and without this patch set applied, using 64-byte packets.

With the patches applied the following drop in performance was observed:

% drop for fwd+io:	0.16%
% drop for fwd+mac:	0.39%

This patch set has been reviewed and ack'ed, so I think it should be applied in 16.07

Regards,

Bernard.
  
Thomas Monjalon June 16, 2016, 6:38 p.m. UTC | #6
2016-06-16 16:41, Iremonger, Bernard:
> Hi Thomas,
> <snip>
> > 2016-06-16 15:32, Bruce Richardson:
> > > On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> > > > > Why does this particular PMD need spinlocks when doing RX and TX,
> > > > > while other device types do not? How is adding/removing devices
> > > > > from a bonded device different to other control operations that
> > > > > can be done on physical PMDs? Is this not similar to say bringing
> > > > > down or hotplugging out a physical port just before an RX or TX
> > operation takes place?
> > > > > For all other PMDs we rely on the app to synchronise control and
> > > > > data plane operation - why not here?
> > > > >
> > > > > /Bruce
> > > >
> > > > This issue arose during VM live migration testing.
> > > > For VM live migration it is necessary (while traffic is running) to be able to
> > remove a bonded slave device, stop it, close it and detach it.
> > > > If a slave device is removed from a bonded device while traffic is running
> > a segmentation fault may occur in the rx/tx burst function. The spinlock has
> > been added to prevent this occurring.
> > > >
> > > > The bonding device already uses a spinlock to synchronise between the
> > add and remove functionality and the slave_link_status_change_monitor
> > code.
> > > >
> > > > Previously testpmd did not allow stop, close or detach of a PMD while
> > > > traffic was running. Testpmd has been modified with the following
> > > > patchset
> > > >
> > > > http://dpdk.org/dev/patchwork/patch/13472/
> > > >
> > > > It now allows stop, close and detach of a PMD provided it is not
> > forwarding and is not a slave of bonded PMD.
> > > >
> > > I will admit to not being fully convinced, but if nobody else has any
> > > serious objections, and since this patch has been reviewed and acked,
> > > I'm ok to merge it in. I'll do so shortly.
> > 
> > Please hold on.
> > Seeing locks introduced in the Rx/Tx path is an alert.
> > We clearly need a design document to explain where locks can be used and
> > what are the responsibility of the control plane.
> > If everybody agrees in this document that DPDK can have some locks in the
> > fast path, then OK to merge it.
> > 
> > So I would say NACK for 16.07 and maybe postpone to 16.11.
> 
> Looking at the documentation for the bonding PMD.
> 
> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
> 
> In section 10.2 it states the following:
> 
> Bonded devices support the dynamical addition and removal of slave devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs.
> 
> If a slave device is added or removed while traffic is running, there is the possibility of a segmentation fault in the rx/tx burst functions. This is most likely to occur in the round robin bonding mode.
> 
> This patch set fixes what appears to be a bug in the bonding PMD.

It can be fixed by removing this statement in the doc.

One of the design principles of DPDK is to avoid locks.

> Performance measurements have been made with this patch set applied and without the patches applied using 64 byte packets. 
> 
> With the patches applied the following drop in performance was observed:
> 
> % drop for fwd+io:	0.16%
> % drop for fwd+mac:	0.39%
> 
> This patch set has been reviewed and ack'ed, so I think it should be applied in 16.07

I understand your point of view and I gave mine.
Now we need more opinions from others.
  
Ferruh Yigit Sept. 9, 2016, 11:29 a.m. UTC | #7
Hi Bernard,

This is an old patch, sorry for commenting after this long.

On 6/12/2016 6:11 PM, Bernard Iremonger wrote:
> Use rte_spinlock_trylock() in the rx/tx burst functions to
> take the queue spinlock.
> 
> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> ---

...

>  static uint16_t
> @@ -143,8 +154,10 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
>         uint8_t i, j, k;
> 
>         rte_eth_macaddr_get(internals->port_id, &bond_mac);
> -       /* Copy slave list to protect against slave up/down changes during tx
> -        * bursting */

This piece,

...

>         for (i = 0; i < num_of_slaves; i++) {
> -               struct port *port = &mode_8023ad_ports[slaves[i]];
> +               struct port *port;
> +
> +               port = &mode_8023ad_ports[internals->active_slaves[i]];

And this piece seems like it needs to be moved into the next patch in the patchset.

...

And if you send a new version of the patchset, there are a few
warnings from check-git-log.sh:

Wrong headline prefix:
        bonding: remove memcpy from burst functions
        bonding: take queue spinlock in rx/tx burst functions
        bonding: grab queue spinlocks in slave add and remove
        bonding: add spinlock to rx and tx queues
Wrong headline lowercase:
        bonding: take queue spinlock in rx/tx burst functions
        bonding: add spinlock to rx and tx queues

Thanks,
ferruh
  
Ferruh Yigit Feb. 15, 2017, 6:01 p.m. UTC | #8
On 6/16/2016 7:38 PM, thomas.monjalon@6wind.com (Thomas Monjalon) wrote:
> 2016-06-16 16:41, Iremonger, Bernard:
>> Hi Thomas,
>> <snip>
>>> 2016-06-16 15:32, Bruce Richardson:
>>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
>>>>>> Why does this particular PMD need spinlocks when doing RX and TX,
>>>>>> while other device types do not? How is adding/removing devices
>>>>>> from a bonded device different to other control operations that
>>>>>> can be done on physical PMDs? Is this not similar to say bringing
>>>>>> down or hotplugging out a physical port just before an RX or TX
>>> operation takes place?
>>>>>> For all other PMDs we rely on the app to synchronise control and
>>>>>> data plane operation - why not here?
>>>>>>
>>>>>> /Bruce
>>>>>
>>>>> This issue arose during VM live migration testing.
>>>>> For VM live migration it is necessary (while traffic is running) to be able to
>>> remove a bonded slave device, stop it, close it and detach it.
>>>>> If a slave device is removed from a bonded device while traffic is running
>>> a segmentation fault may occur in the rx/tx burst function. The spinlock has
>>> been added to prevent this occurring.
>>>>>
>>>>> The bonding device already uses a spinlock to synchronise between the
>>> add and remove functionality and the slave_link_status_change_monitor
>>> code.
>>>>>
>>>>> Previously testpmd did not allow stop, close or detach of a PMD while
>>>>> traffic was running. Testpmd has been modified with the following
>>>>> patchset
>>>>>
>>>>> http://dpdk.org/dev/patchwork/patch/13472/
>>>>>
>>>>> It now allows stop, close and detach of a PMD provided it is not
>>> forwarding and is not a slave of bonded PMD.
>>>>>
>>>> I will admit to not being fully convinced, but if nobody else has any
>>>> serious objections, and since this patch has been reviewed and acked,
>>>> I'm ok to merge it in. I'll do so shortly.
>>>
>>> Please hold on.
>>> Seeing locks introduced in the Rx/Tx path is an alert.
>>> We clearly need a design document to explain where locks can be used and
>>> what are the responsibility of the control plane.
>>> If everybody agrees in this document that DPDK can have some locks in the
>>> fast path, then OK to merge it.
>>>
>>> So I would say NACK for 16.07 and maybe postpone to 16.11.
>>
>> Looking at the documentation for the bonding PMD.
>>
>> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
>>
>> In section 10.2 it states the following:
>>
>> Bonded devices support the dynamical addition and removal of slave devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs.
>>
>> If a slave device is added or removed while traffic is running, there is the possibility of a segmentation fault in the rx/tx burst functions. This is most likely to occur in the round robin bonding mode.
>>
>> This patch set fixes what appears to be a bug in the bonding PMD.
> 
> It can be fixed by removing this statement in the doc.
> 
> One of the design principle of DPDK is to avoid locks.
> 
>> Performance measurements have been made with this patch set applied and without the patches applied using 64 byte packets. 
>>
>> With the patches applied the following drop in performance was observed:
>>
>> % drop for fwd+io:	0.16%
>> % drop for fwd+mac:	0.39%
>>
>> This patch set has been reviewed and ack'ed, so I think it should be applied in 16.07
> 
> I understand your point of view and I gave mine.
> Now we need more opinions from others.
> 

Hi,

These patches have been sitting in patchwork for a long time. The discussion
never concluded and the patches kept being deferred each release.

I think we should give a decision about them:

1- We can merge them in this release, they are fixing a valid problem,
and patches are already acked.

2- We can reject them, if not having them for more than six months has not
caused a problem, perhaps they are not really that required. And if
somebody needs them in the future, we can resurrect them from patchwork.

I vote for option 2, any comments?

Thanks,
ferruh
  
Bruce Richardson Feb. 16, 2017, 9:13 a.m. UTC | #9
On Wed, Feb 15, 2017 at 06:01:45PM +0000, Ferruh Yigit wrote:
> On 6/16/2016 7:38 PM, thomas.monjalon@6wind.com (Thomas Monjalon) wrote:
> > 2016-06-16 16:41, Iremonger, Bernard:
> >> Hi Thomas,
> >> <snip>
> >>> 2016-06-16 15:32, Bruce Richardson:
> >>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> >>>>>> Why does this particular PMD need spinlocks when doing RX and TX,
> >>>>>> while other device types do not? How is adding/removing devices
> >>>>>> from a bonded device different to other control operations that
> >>>>>> can be done on physical PMDs? Is this not similar to say bringing
> >>>>>> down or hotplugging out a physical port just before an RX or TX
> >>> operation takes place?
> >>>>>> For all other PMDs we rely on the app to synchronise control and
> >>>>>> data plane operation - why not here?
> >>>>>>
> >>>>>> /Bruce
> >>>>>
> >>>>> This issue arose during VM live migration testing.
> >>>>> For VM live migration it is necessary (while traffic is running) to be able to
> >>> remove a bonded slave device, stop it, close it and detach it.
> >>>>> If a slave device is removed from a bonded device while traffic is running
> >>> a segmentation fault may occur in the rx/tx burst function. The spinlock has
> >>> been added to prevent this occurring.
> >>>>>
> >>>>> The bonding device already uses a spinlock to synchronise between the
> >>> add and remove functionality and the slave_link_status_change_monitor
> >>> code.
> >>>>>
> >>>>> Previously testpmd did not allow stop, close or detach of a PMD while
> >>>>> traffic was running. Testpmd has been modified with the following
> >>>>> patchset
> >>>>>
> >>>>> http://dpdk.org/dev/patchwork/patch/13472/
> >>>>>
> >>>>> It now allows stop, close and detach of a PMD provided it is not
> >>> forwarding and is not a slave of bonded PMD.
> >>>>>
> >>>> I will admit to not being fully convinced, but if nobody else has any
> >>>> serious objections, and since this patch has been reviewed and acked,
> >>>> I'm ok to merge it in. I'll do so shortly.
> >>>
> >>> Please hold on.
> >>> Seeing locks introduced in the Rx/Tx path is an alert.
> >>> We clearly need a design document to explain where locks can be used and
> >>> what are the responsibility of the control plane.
> >>> If everybody agrees in this document that DPDK can have some locks in the
> >>> fast path, then OK to merge it.
> >>>
> >>> So I would say NACK for 16.07 and maybe postpone to 16.11.
> >>
> >> Looking at the documentation for the bonding PMD.
> >>
> >> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
> >>
> >> In section 10.2 it states the following:
> >>
> >> Bonded devices support the dynamical addition and removal of slave devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs.
> >>
> >> If a slave device is added or removed while traffic is running, there is the possibility of a segmentation fault in the rx/tx burst functions. This is most likely to occur in the round robin bonding mode.
> >>
> >> This patch set fixes what appears to be a bug in the bonding PMD.
> > 
> > It can be fixed by removing this statement in the doc.
> > 
> > One of the design principle of DPDK is to avoid locks.
> > 
> >> Performance measurements have been made with this patch set applied and without the patches applied using 64 byte packets. 
> >>
> >> With the patches applied the following drop in performance was observed:
> >>
> >> % drop for fwd+io:	0.16%
> >> % drop for fwd+mac:	0.39%
> >>
> >> This patch set has been reviewed and ack'ed, so I think it should be applied in 16.07
> > 
> > I understand your point of view and I gave mine.
> > Now we need more opinions from others.
> > 
> 
> Hi,
> 
> These patches are sitting in the patchwork for a long time. Discussion
> never concluded and patches kept deferred each release.
> 
> I think we should give a decision about them:
> 
> 1- We can merge them in this release, they are fixing a valid problem,
> and patches are already acked.
> 
> 2- We can reject them, if not having them for more than six months has not
> caused a problem, perhaps they are not really that required. And if
> somebody needs them in the future, we can resurrect them from patchwork.
> 
> I vote for option 2, any comments?
> 
+1 on option 2. They are obviously not badly needed if nobody is asking
for them for over six months.

	/Bruce
  
Iremonger, Bernard Feb. 16, 2017, 11:39 a.m. UTC | #10
Hi Ferruh,

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, February 16, 2017 9:14 AM
> To: Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; Iremonger, Bernard
> <bernard.iremonger@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; DPDK <dev@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
> burst functions
> 
> On Wed, Feb 15, 2017 at 06:01:45PM +0000, Ferruh Yigit wrote:
> > On 6/16/2016 7:38 PM, thomas.monjalon@6wind.com (Thomas Monjalon)
> wrote:
> > > 2016-06-16 16:41, Iremonger, Bernard:
> > >> Hi Thomas,
> > >> <snip>
> > >>> 2016-06-16 15:32, Bruce Richardson:
> > >>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard
> wrote:
> > >>>>>> Why does this particular PMD need spinlocks when doing RX and
> > >>>>>> TX, while other device types do not? How is adding/removing
> > >>>>>> devices from a bonded device different to other control
> > >>>>>> operations that can be done on physical PMDs? Is this not
> > >>>>>> similar to say bringing down or hotplugging out a physical port
> > >>>>>> just before an RX or TX
> > >>> operation takes place?
> > >>>>>> For all other PMDs we rely on the app to synchronise control
> > >>>>>> and data plane operation - why not here?
> > >>>>>>
> > >>>>>> /Bruce
> > >>>>>
> > >>>>> This issue arose during VM live migration testing.
> > >>>>> For VM live migration it is necessary (while traffic is running)
> > >>>>> to be able to
> > >>> remove a bonded slave device, stop it, close it and detach it.
> > >>>>> If a slave device is removed from a bonded device while traffic
> > >>>>> is running
> > >>> a segmentation fault may occur in the rx/tx burst function. The
> > >>> spinlock has been added to prevent this occurring.
> > >>>>>
> > >>>>> The bonding device already uses a spinlock to synchronise
> > >>>>> between the
> > >>> add and remove functionality and the
> > >>> slave_link_status_change_monitor code.
> > >>>>>
> > >>>>> Previously testpmd did not allow stop, close or detach of a PMD
> > >>>>> while traffic was running. Testpmd has been modified with the
> > >>>>> following patchset
> > >>>>>
> > >>>>> http://dpdk.org/dev/patchwork/patch/13472/
> > >>>>>
> > >>>>> It now allows stop, close and detach of a PMD provided it is
> > >>>>> not
> > >>> forwarding and is not a slave of bonded PMD.
> > >>>>>
> > >>>> I will admit to not being fully convinced, but if nobody else has
> > >>>> any serious objections, and since this patch has been reviewed
> > >>>> and acked, I'm ok to merge it in. I'll do so shortly.
> > >>>
> > >>> Please hold on.
> > >>> Seeing locks introduced in the Rx/Tx path is an alert.
> > >>> We clearly need a design document to explain where locks can be
> > >>> used and what are the responsibility of the control plane.
> > >>> If everybody agrees in this document that DPDK can have some locks
> > >>> in the fast path, then OK to merge it.
> > >>>
> > >>> So I would say NACK for 16.07 and maybe postpone to 16.11.
> > >>
> > >> Looking at the documentation for the bonding PMD.
> > >>
> > >>
> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_li
> > >> b.html
> > >>
> > >> In section 10.2 it states the following:
> > >>
> > >> Bonded devices support the dynamical addition and removal of slave
> devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove
> APIs.
> > >>
> > >> If a slave device is added or removed while traffic is running, there is the
> possibility of a segmentation fault in the rx/tx burst functions. This is most
> likely to occur in the round robin bonding mode.
> > >>
> > >> This patch set fixes what appears to be a bug in the bonding PMD.
> > >
> > > It can be fixed by removing this statement in the doc.
> > >
> > > One of the design principle of DPDK is to avoid locks.
> > >
> > >> Performance measurements have been made with this patch set
> applied and without the patches applied using 64 byte packets.
> > >>
> > >> With the patches applied the following drop in performance was
> observed:
> > >>
> > >> % drop for fwd+io:	0.16%
> > >> % drop for fwd+mac:	0.39%
> > >>
> > >> This patch set has been reviewed and ack'ed, so I think it should
> > >> be applied in 16.07
> > >
> > > I understand your point of view and I gave mine.
> > > Now we need more opinions from others.
> > >
> >
> > Hi,
> >
> > These patches are sitting in the patchwork for a long time. Discussion
> > never concluded and patches kept deferred each release.
> >
> > I think we should give a decision about them:
> >
> > 1- We can merge them in this release, they are fixing a valid problem,
> > and patches are already acked.
> >
> > 2- We can reject them, if not having them for more than six months has not
> > caused a problem, perhaps they are not really that required. And if
> > somebody needs them in the future, we can resurrect them from
> patchwork.
> >
> > I vote for option 2, any comments?
> >
> +1 on option 2. They are obviously not badly needed if nobody is asking
> for them for over six months.
> 
> 	/Bruce

I am ok with option 2, provided they can be retrieved if needed.

Regards,

Bernard.
  
Ferruh Yigit Feb. 20, 2017, 11:15 a.m. UTC | #11
On 2/16/2017 11:39 AM, Iremonger, Bernard wrote:
> Hi Ferruh,
> 
>> -----Original Message-----
>> From: Richardson, Bruce
>> Sent: Thursday, February 16, 2017 9:14 AM
>> To: Yigit, Ferruh <ferruh.yigit@intel.com>
>> Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; Iremonger, Bernard
>> <bernard.iremonger@intel.com>; Ananyev, Konstantin
>> <konstantin.ananyev@intel.com>; Doherty, Declan
>> <declan.doherty@intel.com>; DPDK <dev@dpdk.org>
>> Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
>> burst functions
>>
>> On Wed, Feb 15, 2017 at 06:01:45PM +0000, Ferruh Yigit wrote:
>>> On 6/16/2016 7:38 PM, thomas.monjalon@6wind.com (Thomas Monjalon)
>> wrote:
>>>> 2016-06-16 16:41, Iremonger, Bernard:
>>>>> Hi Thomas,
>>>>> <snip>
>>>>>> 2016-06-16 15:32, Bruce Richardson:
>>>>>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard
>> wrote:
>>>>>>>>> Why does this particular PMD need spinlocks when doing RX and
>>>>>>>>> TX, while other device types do not? How is adding/removing
>>>>>>>>> devices from a bonded device different to other control
>>>>>>>>> operations that can be done on physical PMDs? Is this not
>>>>>>>>> similar to say bringing down or hotplugging out a physical port
>>>>>>>>> just before an RX or TX
>>>>>> operation takes place?
>>>>>>>>> For all other PMDs we rely on the app to synchronise control
>>>>>>>>> and data plane operation - why not here?
>>>>>>>>>
>>>>>>>>> /Bruce
>>>>>>>>
>>>>>>>> This issue arose during VM live migration testing.
>>>>>>>> For VM live migration it is necessary (while traffic is running)
>>>>>>>> to be able to
>>>>>> remove a bonded slave device, stop it, close it and detach it.
>>>>>>>> If a slave device is removed from a bonded device while traffic
>>>>>>>> is running
>>>>>> a segmentation fault may occur in the rx/tx burst function. The
>>>>>> spinlock has been added to prevent this occurring.
>>>>>>>>
>>>>>>>> The bonding device already uses a spinlock to synchronise
>>>>>>>> between the
>>>>>> add and remove functionality and the
>>>>>> slave_link_status_change_monitor code.
>>>>>>>>
>>>>>>>> Previously testpmd did not allow stop, close or detach of a PMD
>>>>>>>> while traffic was running. Testpmd has been modified with the
>>>>>>>> following patchset
>>>>>>>>
>>>>>>>> http://dpdk.org/dev/patchwork/patch/13472/
>>>>>>>>
>>>>>>>> It now allows stop, close and detach of a PMD provided it is
>>>>>>>> not
>>>>>> forwarding and is not a slave of bonded PMD.
>>>>>>>>
>>>>>>> I will admit to not being fully convinced, but if nobody else has
>>>>>>> any serious objections, and since this patch has been reviewed
>>>>>>> and acked, I'm ok to merge it in. I'll do so shortly.
>>>>>>
>>>>>> Please hold on.
>>>>>> Seeing locks introduced in the Rx/Tx path is an alert.
>>>>>> We clearly need a design document to explain where locks can be
>>>>>> used and what are the responsibility of the control plane.
>>>>>> If everybody agrees in this document that DPDK can have some locks
>>>>>> in the fast path, then OK to merge it.
>>>>>>
>>>>>> So I would say NACK for 16.07 and maybe postpone to 16.11.
>>>>>
>>>>> Looking at the documentation for the bonding PMD.
>>>>>
>>>>>
>> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_li
>>>>> b.html
>>>>>
>>>>> In section 10.2 it states the following:
>>>>>
>>>>> Bonded devices support the dynamical addition and removal of slave
>> devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove
>> APIs.
>>>>>
>>>>> If a slave device is added or removed while traffic is running, there is the
>> possibility of a segmentation fault in the rx/tx burst functions. This is most
>> likely to occur in the round robin bonding mode.
>>>>>
>>>>> This patch set fixes what appears to be a bug in the bonding PMD.
>>>>
>>>> It can be fixed by removing this statement in the doc.
>>>>
>>>> One of the design principle of DPDK is to avoid locks.
>>>>
>>>>> Performance measurements have been made with this patch set
>> applied and without the patches applied using 64 byte packets.
>>>>>
>>>>> With the patches applied the following drop in performance was
>> observed:
>>>>>
>>>>> % drop for fwd+io:	0.16%
>>>>> % drop for fwd+mac:	0.39%
>>>>>
>>>>> This patch set has been reviewed and ack'ed, so I think it should
>>>>> be applied in 16.07
>>>>
>>>> I understand your point of view and I gave mine.
>>>> Now we need more opinions from others.
>>>>
>>>
>>> Hi,
>>>
>>> These patches are sitting in the patchwork for a long time. Discussion
>>> never concluded and patches kept deferred each release.
>>>
>>> I think we should give a decision about them:
>>>
>>> 1- We can merge them in this release, they are fixing a valid problem,
>>> and patches are already acked.
>>>
>>> 2- We can reject them, if not having them for more than six months has not
>>> caused a problem, perhaps they are not really that required. And if
>>> somebody needs them in the future, we can resurrect them from
>> patchwork.
>>>
>>> I vote for option 2, any comments?
>>>
>> +1 on option 2. They are obviously not badly needed if nobody is asking
>> for them for over six months.
>>
>> 	/Bruce
> 
> I am ok with option 2, provided they can be retrieved if needed.

Patches marked as rejected in patchwork.

For future reference, patchwork ids:
http://dpdk.org/dev/patchwork/patch/13482/
http://dpdk.org/dev/patchwork/patch/13483/
http://dpdk.org/dev/patchwork/patch/13484/
http://dpdk.org/dev/patchwork/patch/13485/

Thanks,
ferruh
  

Patch

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 2e624bb..93043ef 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1,7 +1,7 @@ 
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -92,16 +92,22 @@  bond_ethdev_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 
 	internals = bd_rx_q->dev_private;
 
-
-	for (i = 0; i < internals->active_slave_count && nb_pkts; i++) {
-		/* Offset of pointer to *bufs increases as packets are received
-		 * from other slaves */
-		num_rx_slave = rte_eth_rx_burst(internals->active_slaves[i],
-				bd_rx_q->queue_id, bufs + num_rx_total, nb_pkts);
-		if (num_rx_slave) {
-			num_rx_total += num_rx_slave;
-			nb_pkts -= num_rx_slave;
+	if (rte_spinlock_trylock(&bd_rx_q->lock)) {
+		for (i = 0; i < internals->active_slave_count && nb_pkts; i++) {
+			/* Offset of pointer to *bufs increases as packets
+			 * are received from other slaves
+			 */
+			num_rx_slave = rte_eth_rx_burst(
+						internals->active_slaves[i],
+						bd_rx_q->queue_id,
+						bufs + num_rx_total,
+						nb_pkts);
+			if (num_rx_slave) {
+				num_rx_total += num_rx_slave;
+				nb_pkts -= num_rx_slave;
+			}
 		}
+		rte_spinlock_unlock(&bd_rx_q->lock);
 	}
 
 	return num_rx_total;
@@ -112,14 +118,19 @@  bond_ethdev_rx_burst_active_backup(void *queue, struct rte_mbuf **bufs,
 		uint16_t nb_pkts)
 {
 	struct bond_dev_private *internals;
+	uint16_t ret = 0;
 
 	/* Cast to structure, containing bonded device's port id and queue id */
 	struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue;
 
 	internals = bd_rx_q->dev_private;
 
-	return rte_eth_rx_burst(internals->current_primary_port,
-			bd_rx_q->queue_id, bufs, nb_pkts);
+	if (rte_spinlock_trylock(&bd_rx_q->lock)) {
+		ret = rte_eth_rx_burst(internals->current_primary_port,
+					bd_rx_q->queue_id, bufs, nb_pkts);
+		rte_spinlock_unlock(&bd_rx_q->lock);
+	}
+	return ret;
 }
 
 static uint16_t
@@ -143,8 +154,10 @@  bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 	uint8_t i, j, k;
 
 	rte_eth_macaddr_get(internals->port_id, &bond_mac);
-	/* Copy slave list to protect against slave up/down changes during tx
-	 * bursting */
+
+	if (rte_spinlock_trylock(&bd_rx_q->lock) == 0)
+		return num_rx_total;
+
 	slave_count = internals->active_slave_count;
 	memcpy(slaves, internals->active_slaves,
 			sizeof(internals->active_slaves[0]) * slave_count);
@@ -190,7 +203,7 @@  bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 				j++;
 		}
 	}
-
+	rte_spinlock_unlock(&bd_rx_q->lock);
 	return num_rx_total;
 }
 
@@ -406,14 +419,19 @@  bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
 	bd_tx_q = (struct bond_tx_queue *)queue;
 	internals = bd_tx_q->dev_private;
 
+	if (rte_spinlock_trylock(&bd_tx_q->lock) == 0)
+		return num_tx_total;
+
 	/* Copy slave list to protect against slave up/down changes during tx
 	 * bursting */
 	num_of_slaves = internals->active_slave_count;
 	memcpy(slaves, internals->active_slaves,
 			sizeof(internals->active_slaves[0]) * num_of_slaves);
 
-	if (num_of_slaves < 1)
+	if (num_of_slaves < 1) {
+		rte_spinlock_unlock(&bd_tx_q->lock);
 		return num_tx_total;
+	}
 
 	/* Populate slaves mbuf with which packets are to be sent on it  */
 	for (i = 0; i < nb_pkts; i++) {
@@ -444,7 +462,7 @@  bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
 			num_tx_total += num_tx_slave;
 		}
 	}
-
+	rte_spinlock_unlock(&bd_tx_q->lock);
 	return num_tx_total;
 }
 
@@ -454,15 +472,23 @@  bond_ethdev_tx_burst_active_backup(void *queue,
 {
 	struct bond_dev_private *internals;
 	struct bond_tx_queue *bd_tx_q;
+	uint16_t ret = 0;
 
 	bd_tx_q = (struct bond_tx_queue *)queue;
 	internals = bd_tx_q->dev_private;
 
-	if (internals->active_slave_count < 1)
-		return 0;
+	if (rte_spinlock_trylock(&bd_tx_q->lock)) {
+		if (internals->active_slave_count < 1) {
+			rte_spinlock_unlock(&bd_tx_q->lock);
+			return 0;
+		}
 
-	return rte_eth_tx_burst(internals->current_primary_port, bd_tx_q->queue_id,
-			bufs, nb_pkts);
+		ret = rte_eth_tx_burst(internals->current_primary_port,
+					bd_tx_q->queue_id,
+					bufs, nb_pkts);
+		rte_spinlock_unlock(&bd_tx_q->lock);
+	}
+	return ret;
 }
 
 static inline uint16_t
@@ -694,20 +720,25 @@  bond_ethdev_tx_burst_tlb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	uint16_t num_tx_total = 0;
 	uint8_t i, j;
 
-	uint8_t num_of_slaves = internals->active_slave_count;
+	uint8_t num_of_slaves;
 	uint8_t slaves[RTE_MAX_ETHPORTS];
 
 	struct ether_hdr *ether_hdr;
 	struct ether_addr primary_slave_addr;
 	struct ether_addr active_slave_addr;
 
-	if (num_of_slaves < 1)
+	if (rte_spinlock_trylock(&bd_tx_q->lock) == 0)
 		return num_tx_total;
 
+	num_of_slaves = internals->active_slave_count;
+	if (num_of_slaves < 1) {
+		rte_spinlock_unlock(&bd_tx_q->lock);
+		return num_tx_total;
+	}
+
 	memcpy(slaves, internals->tlb_slaves_order,
 				sizeof(internals->tlb_slaves_order[0]) * num_of_slaves);
 
-
 	ether_addr_copy(primary_port->data->mac_addrs, &primary_slave_addr);
 
 	if (nb_pkts > 3) {
@@ -735,7 +766,7 @@  bond_ethdev_tx_burst_tlb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		if (num_tx_total == nb_pkts)
 			break;
 	}
-
+	rte_spinlock_unlock(&bd_tx_q->lock);
 	return num_tx_total;
 }
 
@@ -785,6 +816,9 @@  bond_ethdev_tx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 
 	int i, j;
 
+	if (rte_spinlock_trylock(&bd_tx_q->lock) == 0)
+		return num_tx_total;
+
 	/* Search tx buffer for ARP packets and forward them to alb */
 	for (i = 0; i < nb_pkts; i++) {
 		eth_h = rte_pktmbuf_mtod(bufs[i], struct ether_hdr *);
@@ -875,6 +909,7 @@  bond_ethdev_tx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 #endif
 		}
 	}
+	rte_spinlock_unlock(&bd_tx_q->lock);
 
 	/* Send non-ARP packets using tlb policy */
 	if (slave_bufs_pkts[RTE_MAX_ETHPORTS] > 0) {
@@ -914,14 +949,19 @@  bond_ethdev_tx_burst_balance(void *queue, struct rte_mbuf **bufs,
 	bd_tx_q = (struct bond_tx_queue *)queue;
 	internals = bd_tx_q->dev_private;
 
+	if (rte_spinlock_trylock(&bd_tx_q->lock) == 0)
+		return num_tx_total;
+
 	/* Copy slave list to protect against slave up/down changes during tx
 	 * bursting */
 	num_of_slaves = internals->active_slave_count;
 	memcpy(slaves, internals->active_slaves,
 			sizeof(internals->active_slaves[0]) * num_of_slaves);
 
-	if (num_of_slaves < 1)
+	if (num_of_slaves < 1) {
+		rte_spinlock_unlock(&bd_tx_q->lock);
 		return num_tx_total;
+	}
 
 	/* Populate slaves mbuf with the packets which are to be sent on it  */
 	for (i = 0; i < nb_pkts; i++) {
@@ -951,7 +991,7 @@  bond_ethdev_tx_burst_balance(void *queue, struct rte_mbuf **bufs,
 			num_tx_total += num_tx_slave;
 		}
 	}
-
+	rte_spinlock_unlock(&bd_tx_q->lock);
 	return num_tx_total;
 }
 
@@ -984,17 +1024,24 @@  bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 	bd_tx_q = (struct bond_tx_queue *)queue;
 	internals = bd_tx_q->dev_private;
 
+	if (rte_spinlock_trylock(&bd_tx_q->lock) == 0)
+		return num_tx_total;
+
 	/* Copy slave list to protect against slave up/down changes during tx
 	 * bursting */
 	num_of_slaves = internals->active_slave_count;
-	if (num_of_slaves < 1)
+	if (num_of_slaves < 1) {
+		rte_spinlock_unlock(&bd_tx_q->lock);
 		return num_tx_total;
+	}
 
 	memcpy(slaves, internals->active_slaves, sizeof(slaves[0]) * num_of_slaves);
 
 	distributing_count = 0;
 	for (i = 0; i < num_of_slaves; i++) {
-		struct port *port = &mode_8023ad_ports[slaves[i]];
+		struct port *port;
+
+		port = &mode_8023ad_ports[internals->active_slaves[i]];
 
 		slave_slow_nb_pkts[i] = rte_ring_dequeue_burst(port->tx_ring,
 				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS);
@@ -1043,7 +1090,7 @@  bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 				bufs[j] = slave_bufs[i][num_tx_slave];
 		}
 	}
-
+	rte_spinlock_unlock(&bd_tx_q->lock);
 	return num_tx_total;
 }
 
@@ -1065,14 +1112,19 @@  bond_ethdev_tx_burst_broadcast(void *queue, struct rte_mbuf **bufs,
 	bd_tx_q = (struct bond_tx_queue *)queue;
 	internals = bd_tx_q->dev_private;
 
+	if (rte_spinlock_trylock(&bd_tx_q->lock) == 0)
+		return 0;
+
 	/* Copy slave list to protect against slave up/down changes during tx
 	 * bursting */
 	num_of_slaves = internals->active_slave_count;
 	memcpy(slaves, internals->active_slaves,
 			sizeof(internals->active_slaves[0]) * num_of_slaves);
 
-	if (num_of_slaves < 1)
+	if (num_of_slaves < 1) {
+		rte_spinlock_unlock(&bd_tx_q->lock);
 		return 0;
+	}
 
 	/* Increment reference count on mbufs */
 	for (i = 0; i < nb_pkts; i++)
@@ -1093,6 +1145,7 @@  bond_ethdev_tx_burst_broadcast(void *queue, struct rte_mbuf **bufs,
 			most_successful_tx_slave = i;
 		}
 	}
+	rte_spinlock_unlock(&bd_tx_q->lock);
 
 	/* if slaves fail to transmit packets from burst, the calling application
 	 * is not expected to know about multiple references to packets so we must
@@ -1819,7 +1872,6 @@  bond_ethdev_link_update(struct rte_eth_dev *bonded_eth_dev,
 
 		bonded_eth_dev->data->dev_link.link_status = link_up;
 	}
-
 	return 0;
 }