[v4] net/bonding: change the state machine to defaulted
Checks
Commit Message
A dpdk bonding 802.3ad network as follows:
+----------+ +-----------+
|dpdk lacp |bond1.1 <------> bond2.1|switch lacp|
| |bond1.2 <------> bond2.2| |
+----------+ +-----------+
If a fiber optic go wrong about single pass during normal running like
this:
bond1.2 -----> bond2.2 ok
bond1.2 <--x-- bond2.2 error: bond1.2 receive no LACPDU Some packets from
switch to dpdk will choose bond2.2 and lost.
DPDK lacp state machine will transits to the expired state if no LACPDU
is received before the current_while_timer expires. But if no LACPDU is
received before the current_while_timer expires again, DPDK lacp state
machine has no change. Bond2.2 can not change to inactive depend on the
received LACPDU.
According to IEEE 802.3ad, if no lacpdu is received before the
current_while_timer expires again, the state machine should transits from
expired to defaulted. Bond2.2 will change to inactive depend on the LACPDU
with defaulted state.
This patch adds a state machine change from expired to defaulted when no
lacpdu is received before the current_while_timer expires again according
to IEEE 802.3ad:
If no LACPDU is received before the current_while timer expires again,
the state machine transits to the DEFAULTED state. The record Default
function overwrites the current operational parameters for the Partner
with administratively configured values. This allows configuration of
aggregations and individual links when no protocol partner is present,
while still permitting an active partner to override default settings.
The update_Default_Selected function sets the Selected variable FALSE
if the Link Aggregation Group has changed. Since all operational parameters
are now set to locally administered values there can be no disagreement as
to the Link Aggregation Group, so the Matched variable is set TRUE.
The relevant description is in the chapter 43.4.12 of the link below:
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=850426
Signed-off-by: Weifeng Li <liweifeng96@126.com>
---
v1 -> v2: adjust the formate of commit log
---
v2 -> v3:
1. adjust coding style issue.
2. add description about partner_admin from IEEE spec.
---
v3 -> v4
1. adjust coding style issue: Missing Signed-off-by
---
drivers/net/bonding/eth_bond_8023ad_private.h | 3 +++
drivers/net/bonding/rte_eth_bond_8023ad.c | 21 +++++++++++++++++----
2 files changed, 20 insertions(+), 4 deletions(-)
Comments
Hi, Weifeng Li
On 2020/7/18 12:35, Weifeng Li wrote:
> A dpdk bonding 802.3ad network as follows:
> +----------+ +-----------+
> |dpdk lacp |bond1.1 <------> bond2.1|switch lacp|
> | |bond1.2 <------> bond2.2| |
> +----------+ +-----------+
> If a fiber optic go wrong about single pass during normal running like
> this:
> bond1.2 -----> bond2.2 ok
> bond1.2 <--x-- bond2.2 error: bond1.2 receive no LACPDU Some packets from
> switch to dpdk will choose bond2.2 and lost.
>
> DPDK lacp state machine will transits to the expired state if no LACPDU
> is received before the current_while_timer expires. But if no LACPDU is
> received before the current_while_timer expires again, DPDK lacp state
> machine has no change. Bond2.2 can not change to inactive depend on the
> received LACPDU.
> According to IEEE 802.3ad, if no lacpdu is received before the
> current_while_timer expires again, the state machine should transits from
> expired to defaulted. Bond2.2 will change to inactive depend on the LACPDU
> with defaulted state.
>
> This patch adds a state machine change from expired to defaulted when no
> lacpdu is received before the current_while_timer expires again according
> to IEEE 802.3ad:
> If no LACPDU is received before the current_while timer expires again,
> the state machine transits to the DEFAULTED state. The record Default
> function overwrites the current operational parameters for the Partner
> with administratively configured values. This allows configuration of
> aggregations and individual links when no protocol partner is present,
> while still permitting an active partner to override default settings.
> The update_Default_Selected function sets the Selected variable FALSE
> if the Link Aggregation Group has changed. Since all operational parameters
> are now set to locally administered values there can be no disagreement as
> to the Link Aggregation Group, so the Matched variable is set TRUE.
>
> The relevant description is in the chapter 43.4.12 of the link below:
> https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=850426
Fixes: 46fb43683679 ("bond: add mode 4")
Cc: stable@dpdk.org
> Signed-off-by: Weifeng Li <liweifeng96@126.com>
Acked-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Thanks you for fixing this issue.
Xavier
> ---
> v1 -> v2: adjust the formate of commit log
> ---
> v2 -> v3:
> 1. adjust coding style issue.
> 2. add description about partner_admin from IEEE spec.
> ---
> v3 -> v4
> 1. adjust coding style issue: Missing Signed-off-by
> ---
> drivers/net/bonding/eth_bond_8023ad_private.h | 3 +++
> drivers/net/bonding/rte_eth_bond_8023ad.c | 21 +++++++++++++++++----
> 2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/bonding/eth_bond_8023ad_private.h b/drivers/net/bonding/eth_bond_8023ad_private.h
> index 6e44ffd..9b5738a 100644
> --- a/drivers/net/bonding/eth_bond_8023ad_private.h
> +++ b/drivers/net/bonding/eth_bond_8023ad_private.h
> @@ -50,6 +50,7 @@
> #define SM_FLAGS_MOVED 0x0100
> #define SM_FLAGS_PARTNER_SHORT_TIMEOUT 0x0200
> #define SM_FLAGS_NTT 0x0400
> +#define SM_FLAGS_EXPIRED 0x0800
>
> #define BOND_LINK_FULL_DUPLEX_KEY 0x01
> #define BOND_LINK_SPEED_KEY_10M 0x02
> @@ -103,6 +104,8 @@ struct port {
>
> /** The operational Partner's port parameters */
> struct port_params partner;
> + /** Partner administrative parameter values */
> + struct port_params partner_admin;
>
> /* Additional port parameters not listed in documentation */
> /** State machine flags */
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 3991825..3c2c7bf 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -356,16 +356,28 @@ rx_machine(struct bond_dev_private *internals, uint16_t slave_id,
>
> timer_set(&port->current_while_timer, timeout);
> ACTOR_STATE_CLR(port, EXPIRED);
> + SM_FLAG_CLR(port, EXPIRED);
> return; /* No state change */
> }
>
> /* If CURRENT state timer is not running (stopped or expired)
> * transit to EXPIRED state from DISABLED or CURRENT */
> if (!timer_is_running(&port->current_while_timer)) {
> - ACTOR_STATE_SET(port, EXPIRED);
> - PARTNER_STATE_CLR(port, SYNCHRONIZATION);
> - PARTNER_STATE_SET(port, LACP_SHORT_TIMEOUT);
> - timer_set(&port->current_while_timer, internals->mode4.short_timeout);
> + if (SM_FLAG(port, EXPIRED)) {
> + port->selected = UNSELECTED;
> + memcpy(&port->partner, &port->partner_admin,
> + sizeof(struct port_params));
> + record_default(port);
> + ACTOR_STATE_CLR(port, EXPIRED);
> + timer_cancel(&port->current_while_timer);
> + } else {
> + SM_FLAG_SET(port, EXPIRED);
> + ACTOR_STATE_SET(port, EXPIRED);
> + PARTNER_STATE_CLR(port, SYNCHRONIZATION);
> + PARTNER_STATE_SET(port, LACP_SHORT_TIMEOUT);
> + timer_set(&port->current_while_timer,
> + internals->mode4.short_timeout);
> + }
> }
> }
>
> @@ -1021,6 +1033,7 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev,
> port->actor.port_number = rte_cpu_to_be_16(slave_id + 1);
>
> memcpy(&port->partner, &initial, sizeof(struct port_params));
> + memcpy(&port->partner_admin, &initial, sizeof(struct port_params));
>
> /* default states */
> port->actor_state = STATE_AGGREGATION | STATE_LACP_ACTIVE | STATE_DEFAULTED;
On 7/18/2020 8:08 AM, Wei Hu (Xavier) wrote:
> Hi, Weifeng Li
>
>
> On 2020/7/18 12:35, Weifeng Li wrote:
>> A dpdk bonding 802.3ad network as follows:
>> +----------+ +-----------+
>> |dpdk lacp |bond1.1 <------> bond2.1|switch lacp|
>> | |bond1.2 <------> bond2.2| |
>> +----------+ +-----------+
>> If a fiber optic go wrong about single pass during normal running like
>> this:
>> bond1.2 -----> bond2.2 ok
>> bond1.2 <--x-- bond2.2 error: bond1.2 receive no LACPDU Some packets from
>> switch to dpdk will choose bond2.2 and lost.
>>
>> DPDK lacp state machine will transits to the expired state if no LACPDU
>> is received before the current_while_timer expires. But if no LACPDU is
>> received before the current_while_timer expires again, DPDK lacp state
>> machine has no change. Bond2.2 can not change to inactive depend on the
>> received LACPDU.
>> According to IEEE 802.3ad, if no lacpdu is received before the
>> current_while_timer expires again, the state machine should transits from
>> expired to defaulted. Bond2.2 will change to inactive depend on the LACPDU
>> with defaulted state.
>>
>> This patch adds a state machine change from expired to defaulted when no
>> lacpdu is received before the current_while_timer expires again according
>> to IEEE 802.3ad:
>> If no LACPDU is received before the current_while timer expires again,
>> the state machine transits to the DEFAULTED state. The record Default
>> function overwrites the current operational parameters for the Partner
>> with administratively configured values. This allows configuration of
>> aggregations and individual links when no protocol partner is present,
>> while still permitting an active partner to override default settings.
>> The update_Default_Selected function sets the Selected variable FALSE
>> if the Link Aggregation Group has changed. Since all operational parameters
>> are now set to locally administered values there can be no disagreement as
>> to the Link Aggregation Group, so the Matched variable is set TRUE.
>>
>> The relevant description is in the chapter 43.4.12 of the link below:
>> https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=850426
>
> Fixes: 46fb43683679 ("bond: add mode 4")
> Cc: stable@dpdk.org
>> Signed-off-by: Weifeng Li <liweifeng96@126.com>
>
> Acked-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>
Applied to dpdk-next-net/master, thanks.
@@ -50,6 +50,7 @@
#define SM_FLAGS_MOVED 0x0100
#define SM_FLAGS_PARTNER_SHORT_TIMEOUT 0x0200
#define SM_FLAGS_NTT 0x0400
+#define SM_FLAGS_EXPIRED 0x0800
#define BOND_LINK_FULL_DUPLEX_KEY 0x01
#define BOND_LINK_SPEED_KEY_10M 0x02
@@ -103,6 +104,8 @@ struct port {
/** The operational Partner's port parameters */
struct port_params partner;
+ /** Partner administrative parameter values */
+ struct port_params partner_admin;
/* Additional port parameters not listed in documentation */
/** State machine flags */
@@ -356,16 +356,28 @@ rx_machine(struct bond_dev_private *internals, uint16_t slave_id,
timer_set(&port->current_while_timer, timeout);
ACTOR_STATE_CLR(port, EXPIRED);
+ SM_FLAG_CLR(port, EXPIRED);
return; /* No state change */
}
/* If CURRENT state timer is not running (stopped or expired)
* transit to EXPIRED state from DISABLED or CURRENT */
if (!timer_is_running(&port->current_while_timer)) {
- ACTOR_STATE_SET(port, EXPIRED);
- PARTNER_STATE_CLR(port, SYNCHRONIZATION);
- PARTNER_STATE_SET(port, LACP_SHORT_TIMEOUT);
- timer_set(&port->current_while_timer, internals->mode4.short_timeout);
+ if (SM_FLAG(port, EXPIRED)) {
+ port->selected = UNSELECTED;
+ memcpy(&port->partner, &port->partner_admin,
+ sizeof(struct port_params));
+ record_default(port);
+ ACTOR_STATE_CLR(port, EXPIRED);
+ timer_cancel(&port->current_while_timer);
+ } else {
+ SM_FLAG_SET(port, EXPIRED);
+ ACTOR_STATE_SET(port, EXPIRED);
+ PARTNER_STATE_CLR(port, SYNCHRONIZATION);
+ PARTNER_STATE_SET(port, LACP_SHORT_TIMEOUT);
+ timer_set(&port->current_while_timer,
+ internals->mode4.short_timeout);
+ }
}
}
@@ -1021,6 +1033,7 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev,
port->actor.port_number = rte_cpu_to_be_16(slave_id + 1);
memcpy(&port->partner, &initial, sizeof(struct port_params));
+ memcpy(&port->partner_admin, &initial, sizeof(struct port_params));
/* default states */
port->actor_state = STATE_AGGREGATION | STATE_LACP_ACTIVE | STATE_DEFAULTED;