[v3] net/iavf: fix iavf query stats in intr thread

Message ID 20230322072615.179196-1-kaiwenx.deng@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Qi Zhang
Headers
Series [v3] net/iavf: fix iavf query stats in intr thread |

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-unit-testing success Testing PASS

Commit Message

Kaiwen Deng March 22, 2023, 7:26 a.m. UTC
  When iavf send query-stats command in eal-intr-thread through
virtual channel, there will be no response received from
iavf_dev_virtchnl_handler for this command during block and wait.
Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.

When vf device is bonded as BONDING_MODE_TLB mode, the slave device
update callback will registered in alarm and called by eal-intr-thread,
it would also raise the above issue.

This commit add to poll the response for VIRTCHNL_OP_GET_STATS
when it is called by eal-intr-thread to fix this issue.

Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
Fixes: 22b123a36d07 ("net/avf: initialize PMD")
Fixes: 7c76a747e68c ("bond: add mode 5")
Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
Cc: stable@dpdk.org

Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
---
Changes since v2:
- Add to poll the response for VIRTCHNL_OP_GET_STATS.

Changes since v1:
- Add lock to avoid race condition.
---
---
 drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
 drivers/net/iavf/iavf_ethdev.c         |  5 +-
 drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
 3 files changed, 58 insertions(+), 25 deletions(-)
  

Comments

Ferruh Yigit March 23, 2023, 3:39 p.m. UTC | #1
On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
> When iavf send query-stats command in eal-intr-thread through
> virtual channel, there will be no response received from
> iavf_dev_virtchnl_handler for this command during block and wait.
> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
> 
> When vf device is bonded as BONDING_MODE_TLB mode, the slave device
> update callback will registered in alarm and called by eal-intr-thread,
> it would also raise the above issue.
> 
> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
> when it is called by eal-intr-thread to fix this issue.
> 
> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> Fixes: 7c76a747e68c ("bond: add mode 5")
> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")


Hi Kaiwen,

Above commit already seems trying to address same issue, it creates
"iavf-event-thread" control thread to asyncroniously handle the
interrupts, in non-interrupt context, why it is not working?

Instead of adding 'rte_thread_is_intr()' checks, can't you make sure all
interrupts handled in control tread?

And can you please provide a stack trace in commit log, to describe the
issue better?

> Cc: stable@dpdk.org
> 
> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> ---
> Changes since v2:
> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
> 
> Changes since v1:
> - Add lock to avoid race condition.
> ---
> ---
>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
>  3 files changed, 58 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
> index f0c4f7d26b..edce621496 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>  	uint8_t update_stats = 0;
>  	uint16_t slave_id;
>  	uint16_t i;
> +	int ret;
>  
>  	internals->slave_update_idx++;
>  
> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>  
>  	for (i = 0; i < internals->active_slave_count; i++) {
>  		slave_id = internals->active_slaves[i];
> -		rte_eth_stats_get(slave_id, &slave_stats);
> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
> +		if (ret)
> +			goto OUT;
> +
>  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
>  		bandwidth_left(slave_id, tx_bytes,
>  				internals->slave_update_idx, &bwg_array[i]);
> @@ -922,6 +926,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>  	for (i = 0; i < slave_count; i++)
>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
>  
> +OUT:
>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000, bond_ethdev_update_tlb_slave_cb,
>  			(struct bond_dev_private *)internals);
>  }
> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> index 3196210f2c..d6e1f1a7f4 100644
> --- a/drivers/net/iavf/iavf_ethdev.c
> +++ b/drivers/net/iavf/iavf_ethdev.c
> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>  	adapter->dev_data = eth_dev->data;
>  	adapter->stopped = 1;
>  
> +	if (iavf_dev_event_handler_init())
> +		goto init_vf_err;
> +
>  	if (iavf_init_vf(eth_dev) != 0) {
>  		PMD_INIT_LOG(ERR, "Init vf failed");
>  		return -1;
> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
>  			&eth_dev->data->mac_addrs[0]);
>  
> -	if (iavf_dev_event_handler_init())
> -		goto init_vf_err;
>  
>  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) {
>  		/* register callback func to eal lib */
> diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
> index 9adaadb173..aeffb07cca 100644
> --- a/drivers/net/iavf/iavf_vchnl.c
> +++ b/drivers/net/iavf/iavf_vchnl.c
> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter *adapter, uint16_t buf_len,
>  				vf->link_speed = iavf_convert_link_speed(speed);
>  			}
>  			iavf_dev_link_update(vf->eth_dev, 0);
> +			iavf_dev_event_post(vf->eth_dev, RTE_ETH_EVENT_INTR_LSC, NULL, 0);
>  			PMD_DRV_LOG(INFO, "Link status update:%s",
>  					vf->link_up ? "up" : "down");
>  			break;
> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter *adapter, struct iavf_cmd_info *args,
>  		_clear_cmd(vf);
>  		break;
>  	default:
> -		/* For other virtchnl ops in running time,
> -		 * wait for the cmd done flag.
> -		 */
> -		do {
> -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
> -				break;
> -			iavf_msec_delay(ASQ_DELAY_MS);
> -			/* If don't read msg or read sys event, continue */
> -		} while (i++ < MAX_TRY_TIMES);
> -
> -		if (i >= MAX_TRY_TIMES) {
> -			PMD_DRV_LOG(ERR, "No response for cmd %d", args->ops);
> +		if (rte_thread_is_intr()) {
> +			/* For virtchnl ops were executed in eal_intr_thread,
> +			 * need to poll the response.
> +			 */
> +			do {
> +				result = iavf_read_msg_from_pf(adapter, args->out_size,
> +							args->out_buffer);
> +				if (result == IAVF_MSG_CMD)
> +					break;
> +				iavf_msec_delay(ASQ_DELAY_MS);
> +			} while (i++ < MAX_TRY_TIMES);
> +			if (i >= MAX_TRY_TIMES ||
> +				vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> +				err = -1;
> +				PMD_DRV_LOG(ERR, "No response or return failure (%d)"
> +						" for cmd %d", vf->cmd_retval, args->ops);
> +			}
>  			_clear_cmd(vf);
> -			err = -EIO;
> -		} else if (vf->cmd_retval ==
> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args->ops);
> -			err = -ENOTSUP;
> -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
> -				    vf->cmd_retval, args->ops);
> -			err = -EINVAL;
> +		} else {
> +			/* For other virtchnl ops in running time,
> +			 * wait for the cmd done flag.
> +			 */
> +			do {
> +				if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
> +					break;
> +				iavf_msec_delay(ASQ_DELAY_MS);
> +				/* If don't read msg or read sys event, continue */
> +			} while (i++ < MAX_TRY_TIMES);
> +
> +			if (i >= MAX_TRY_TIMES) {
> +				PMD_DRV_LOG(ERR, "No response for cmd %d", args->ops);
> +				_clear_cmd(vf);
> +				err = -EIO;
> +			} else if (vf->cmd_retval ==
> +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> +				PMD_DRV_LOG(ERR, "Cmd %d not supported", args->ops);
> +				err = -ENOTSUP;
> +			} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> +				PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
> +						vf->cmd_retval, args->ops);
> +				err = -EINVAL;
> +			}
>  		}
>  		break;
>  	}
> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct iavf_adapter *adapter,
>  {
>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
>  	int ret;
> +	int is_intr_thread = rte_thread_is_intr();
>  
> -	rte_spinlock_lock(&vf->aq_lock);
> +	if (is_intr_thread) {
> +		if (!rte_spinlock_trylock(&vf->aq_lock))
> +			return -EIO;
> +	} else {
> +		rte_spinlock_lock(&vf->aq_lock);
> +	}
>  	ret = iavf_execute_vf_cmd(adapter, args, async);
>  	rte_spinlock_unlock(&vf->aq_lock);
>
  
Kaiwen Deng March 27, 2023, 5:31 a.m. UTC | #2
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Thursday, March 23, 2023 11:39 PM
> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
> <dapengx.yu@intel.com>
> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
> > When iavf send query-stats command in eal-intr-thread through virtual
> > channel, there will be no response received from
> > iavf_dev_virtchnl_handler for this command during block and wait.
> > Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
> >
> > When vf device is bonded as BONDING_MODE_TLB mode, the slave device
> > update callback will registered in alarm and called by
> > eal-intr-thread, it would also raise the above issue.
> >
> > This commit add to poll the response for VIRTCHNL_OP_GET_STATS when
> it
> > is called by eal-intr-thread to fix this issue.
> >
> > Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> > Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> > Fixes: 7c76a747e68c ("bond: add mode 5")
> > Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> > Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> 
> 
> Hi Kaiwen,
> 
> Above commit already seems trying to address same issue, it creates "iavf-
> event-thread" control thread to asyncroniously handle the interrupts, in non-
> interrupt context, why it is not working?
> 
> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure all
> interrupts handled in control tread?
> 
> And can you please provide a stack trace in commit log, to describe the issue
> better?
Hi Ferru,
Sorry for my late reply, And thanks for your review.
 
The above commit does not fix this issue when we need to get the returned data.
If we call iavf_query_stats and wait for response statistics in the intr-thread.
iavf_handle_virtchnl_msg is also registered in the intr_thread and will not be 
executed while waiting.

This commit I changed it to polling for replies to commands executed in the interrupt thread.

main thread                                                                                                interrupt thread
     |                                                                                                                               |
     |                                                                                                                               |
iavf_query_stats                                                                                                        |
iavf_execute_vf_cmd                                                                                               |
iavf_aq_send_msg_to_pf  and wait handle complete                                       |
     |                                                                                                                              |
     |-------------------------------------------------------------------------------------------->|
     |                                                                                                                              |
     |                                                                                                       iavf_handle_virtchnl_msg
     |                                                                                                                             |
     |<--------------------------------------------------------------------------------------------|
     |                                                                                                                             |
iavf_execute_vf_cmd get response                                                                      |
     |                                                                                                                             |

The above is the stack trace for the normal execution of iavf_query_stats 
in the main thread.

 interrupt thread
     |
     |
iavf_query_stats
iavf_execute_vf_cmd
iavf_aq_send_msg_to_pf wait handle complete(1 sec)
iavf_execute_vf_cmd timeout
     |
     |
iavf_handle_virtchnl_msg
     |

The above is the stack trace for the abnormal execution of iavf_query_stats 
in the interrupt thread.
> 
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> > ---
> > Changes since v2:
> > - Add to poll the response for VIRTCHNL_OP_GET_STATS.
> >
> > Changes since v1:
> > - Add lock to avoid race condition.
> > ---
> > ---
> >  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> >  drivers/net/iavf/iavf_ethdev.c         |  5 +-
> >  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
> >  3 files changed, 58 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> > b/drivers/net/bonding/rte_eth_bond_pmd.c
> > index f0c4f7d26b..edce621496 100644
> > --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> > @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >  	uint8_t update_stats = 0;
> >  	uint16_t slave_id;
> >  	uint16_t i;
> > +	int ret;
> >
> >  	internals->slave_update_idx++;
> >
> > @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >
> >  	for (i = 0; i < internals->active_slave_count; i++) {
> >  		slave_id = internals->active_slaves[i];
> > -		rte_eth_stats_get(slave_id, &slave_stats);
> > +		ret = rte_eth_stats_get(slave_id, &slave_stats);
> > +		if (ret)
> > +			goto OUT;
> > +
> >  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
> >  		bandwidth_left(slave_id, tx_bytes,
> >  				internals->slave_update_idx, &bwg_array[i]);
> @@ -922,6 +926,7 @@
> > bond_ethdev_update_tlb_slave_cb(void *arg)
> >  	for (i = 0; i < slave_count; i++)
> >  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
> >
> > +OUT:
> >  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
> bond_ethdev_update_tlb_slave_cb,
> >  			(struct bond_dev_private *)internals);  } diff --git
> > a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> > index 3196210f2c..d6e1f1a7f4 100644
> > --- a/drivers/net/iavf/iavf_ethdev.c
> > +++ b/drivers/net/iavf/iavf_ethdev.c
> > @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >  	adapter->dev_data = eth_dev->data;
> >  	adapter->stopped = 1;
> >
> > +	if (iavf_dev_event_handler_init())
> > +		goto init_vf_err;
> > +
> >  	if (iavf_init_vf(eth_dev) != 0) {
> >  		PMD_INIT_LOG(ERR, "Init vf failed");
> >  		return -1;
> > @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
> >  			&eth_dev->data->mac_addrs[0]);
> >
> > -	if (iavf_dev_event_handler_init())
> > -		goto init_vf_err;
> >
> >  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
> {
> >  		/* register callback func to eal lib */ diff --git
> > a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c index
> > 9adaadb173..aeffb07cca 100644
> > --- a/drivers/net/iavf/iavf_vchnl.c
> > +++ b/drivers/net/iavf/iavf_vchnl.c
> > @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
> *adapter, uint16_t buf_len,
> >  				vf->link_speed =
> iavf_convert_link_speed(speed);
> >  			}
> >  			iavf_dev_link_update(vf->eth_dev, 0);
> > +			iavf_dev_event_post(vf->eth_dev,
> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
> >  			PMD_DRV_LOG(INFO, "Link status update:%s",
> >  					vf->link_up ? "up" : "down");
> >  			break;
> > @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
> *adapter, struct iavf_cmd_info *args,
> >  		_clear_cmd(vf);
> >  		break;
> >  	default:
> > -		/* For other virtchnl ops in running time,
> > -		 * wait for the cmd done flag.
> > -		 */
> > -		do {
> > -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
> > -				break;
> > -			iavf_msec_delay(ASQ_DELAY_MS);
> > -			/* If don't read msg or read sys event, continue */
> > -		} while (i++ < MAX_TRY_TIMES);
> > -
> > -		if (i >= MAX_TRY_TIMES) {
> > -			PMD_DRV_LOG(ERR, "No response for cmd %d",
> args->ops);
> > +		if (rte_thread_is_intr()) {
> > +			/* For virtchnl ops were executed in eal_intr_thread,
> > +			 * need to poll the response.
> > +			 */
> > +			do {
> > +				result = iavf_read_msg_from_pf(adapter,
> args->out_size,
> > +							args->out_buffer);
> > +				if (result == IAVF_MSG_CMD)
> > +					break;
> > +				iavf_msec_delay(ASQ_DELAY_MS);
> > +			} while (i++ < MAX_TRY_TIMES);
> > +			if (i >= MAX_TRY_TIMES ||
> > +				vf->cmd_retval !=
> VIRTCHNL_STATUS_SUCCESS) {
> > +				err = -1;
> > +				PMD_DRV_LOG(ERR, "No response or return
> failure (%d)"
> > +						" for cmd %d", vf-
> >cmd_retval, args->ops);
> > +			}
> >  			_clear_cmd(vf);
> > -			err = -EIO;
> > -		} else if (vf->cmd_retval ==
> > -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> > -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
> >ops);
> > -			err = -ENOTSUP;
> > -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> > -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
> > -				    vf->cmd_retval, args->ops);
> > -			err = -EINVAL;
> > +		} else {
> > +			/* For other virtchnl ops in running time,
> > +			 * wait for the cmd done flag.
> > +			 */
> > +			do {
> > +				if (vf->pend_cmd ==
> VIRTCHNL_OP_UNKNOWN)
> > +					break;
> > +				iavf_msec_delay(ASQ_DELAY_MS);
> > +				/* If don't read msg or read sys event,
> continue */
> > +			} while (i++ < MAX_TRY_TIMES);
> > +
> > +			if (i >= MAX_TRY_TIMES) {
> > +				PMD_DRV_LOG(ERR, "No response for
> cmd %d", args->ops);
> > +				_clear_cmd(vf);
> > +				err = -EIO;
> > +			} else if (vf->cmd_retval ==
> > +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> > +				PMD_DRV_LOG(ERR, "Cmd %d not
> supported", args->ops);
> > +				err = -ENOTSUP;
> > +			} else if (vf->cmd_retval !=
> VIRTCHNL_STATUS_SUCCESS) {
> > +				PMD_DRV_LOG(ERR, "Return failure %d for
> cmd %d",
> > +						vf->cmd_retval, args->ops);
> > +				err = -EINVAL;
> > +			}
> >  		}
> >  		break;
> >  	}
> > @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct iavf_adapter
> > *adapter,  {
> >  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
> >  	int ret;
> > +	int is_intr_thread = rte_thread_is_intr();
> >
> > -	rte_spinlock_lock(&vf->aq_lock);
> > +	if (is_intr_thread) {
> > +		if (!rte_spinlock_trylock(&vf->aq_lock))
> > +			return -EIO;
> > +	} else {
> > +		rte_spinlock_lock(&vf->aq_lock);
> > +	}
> >  	ret = iavf_execute_vf_cmd(adapter, args, async);
> >  	rte_spinlock_unlock(&vf->aq_lock);
> >
  
Ferruh Yigit March 27, 2023, 12:31 p.m. UTC | #3
On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Thursday, March 23, 2023 11:39 PM
>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
>> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
>> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
>> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
>> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
>> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
>> <dapengx.yu@intel.com>
>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>
>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
>>> When iavf send query-stats command in eal-intr-thread through virtual
>>> channel, there will be no response received from
>>> iavf_dev_virtchnl_handler for this command during block and wait.
>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
>>>
>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave device
>>> update callback will registered in alarm and called by
>>> eal-intr-thread, it would also raise the above issue.
>>>
>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS when
>> it
>>> is called by eal-intr-thread to fix this issue.
>>>
>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
>>> Fixes: 7c76a747e68c ("bond: add mode 5")
>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
>>
>>
>> Hi Kaiwen,
>>
>> Above commit already seems trying to address same issue, it creates "iavf-
>> event-thread" control thread to asyncroniously handle the interrupts, in non-
>> interrupt context, why it is not working?
>>
>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure all
>> interrupts handled in control tread?
>>
>> And can you please provide a stack trace in commit log, to describe the issue
>> better?
> Hi Ferru,
> Sorry for my late reply, And thanks for your review.
>  
> The above commit does not fix this issue when we need to get the returned data.
> If we call iavf_query_stats and wait for response statistics in the intr-thread.
> iavf_handle_virtchnl_msg is also registered in the intr_thread and will not be 
> executed while waiting.
> 

Got it, since return value is required, API can't be called asyncroniously.



I think 'rte_thread_is_intr()' checks may cause more trouble for you in
long term,

- why 'iavf_query_stats()' is called in the iterrupt thread, can it be
prevented?

- does it make sense to allways poll messages from PF (for simplification)?


If answer to both are 'No', I am OK to continue with current proposal if
you are happy with it.


> This commit I changed it to polling for replies to commands executed in the interrupt thread.
> 
> main thread                                                                                                interrupt thread
>      |                                                                                                                               |
>      |                                                                                                                               |
> iavf_query_stats                                                                                                        |
> iavf_execute_vf_cmd                                                                                               |
> iavf_aq_send_msg_to_pf  and wait handle complete                                       |
>      |                                                                                                                              |
>      |-------------------------------------------------------------------------------------------->|
>      |                                                                                                                              |
>      |                                                                                                       iavf_handle_virtchnl_msg
>      |                                                                                                                             |
>      |<--------------------------------------------------------------------------------------------|
>      |                                                                                                                             |
> iavf_execute_vf_cmd get response                                                                      |
>      |                                                                                                                             |
> 
> The above is the stack trace for the normal execution of iavf_query_stats 
> in the main thread.
> 
>  interrupt thread
>      |
>      |
> iavf_query_stats
> iavf_execute_vf_cmd
> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
> iavf_execute_vf_cmd timeout
>      |
>      |
> iavf_handle_virtchnl_msg
>      |
> 
> The above is the stack trace for the abnormal execution of iavf_query_stats 
> in the interrupt thread.
>>
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
>>> ---
>>> Changes since v2:
>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
>>>
>>> Changes since v1:
>>> - Add lock to avoid race condition.
>>> ---
>>> ---
>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
>>>  3 files changed, 58 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>> index f0c4f7d26b..edce621496 100644
>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>  	uint8_t update_stats = 0;
>>>  	uint16_t slave_id;
>>>  	uint16_t i;
>>> +	int ret;
>>>
>>>  	internals->slave_update_idx++;
>>>
>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>
>>>  	for (i = 0; i < internals->active_slave_count; i++) {
>>>  		slave_id = internals->active_slaves[i];
>>> -		rte_eth_stats_get(slave_id, &slave_stats);
>>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
>>> +		if (ret)
>>> +			goto OUT;
>>> +
>>>  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
>>>  		bandwidth_left(slave_id, tx_bytes,
>>>  				internals->slave_update_idx, &bwg_array[i]);
>> @@ -922,6 +926,7 @@
>>> bond_ethdev_update_tlb_slave_cb(void *arg)
>>>  	for (i = 0; i < slave_count; i++)
>>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
>>>
>>> +OUT:
>>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
>> bond_ethdev_update_tlb_slave_cb,
>>>  			(struct bond_dev_private *)internals);  } diff --git
>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
>>> index 3196210f2c..d6e1f1a7f4 100644
>>> --- a/drivers/net/iavf/iavf_ethdev.c
>>> +++ b/drivers/net/iavf/iavf_ethdev.c
>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>  	adapter->dev_data = eth_dev->data;
>>>  	adapter->stopped = 1;
>>>
>>> +	if (iavf_dev_event_handler_init())
>>> +		goto init_vf_err;
>>> +
>>>  	if (iavf_init_vf(eth_dev) != 0) {
>>>  		PMD_INIT_LOG(ERR, "Init vf failed");
>>>  		return -1;
>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
>>>  			&eth_dev->data->mac_addrs[0]);
>>>
>>> -	if (iavf_dev_event_handler_init())
>>> -		goto init_vf_err;
>>>
>>>  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
>> {
>>>  		/* register callback func to eal lib */ diff --git
>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c index
>>> 9adaadb173..aeffb07cca 100644
>>> --- a/drivers/net/iavf/iavf_vchnl.c
>>> +++ b/drivers/net/iavf/iavf_vchnl.c
>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
>> *adapter, uint16_t buf_len,
>>>  				vf->link_speed =
>> iavf_convert_link_speed(speed);
>>>  			}
>>>  			iavf_dev_link_update(vf->eth_dev, 0);
>>> +			iavf_dev_event_post(vf->eth_dev,
>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
>>>  			PMD_DRV_LOG(INFO, "Link status update:%s",
>>>  					vf->link_up ? "up" : "down");
>>>  			break;
>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
>> *adapter, struct iavf_cmd_info *args,
>>>  		_clear_cmd(vf);
>>>  		break;
>>>  	default:
>>> -		/* For other virtchnl ops in running time,
>>> -		 * wait for the cmd done flag.
>>> -		 */
>>> -		do {
>>> -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
>>> -				break;
>>> -			iavf_msec_delay(ASQ_DELAY_MS);
>>> -			/* If don't read msg or read sys event, continue */
>>> -		} while (i++ < MAX_TRY_TIMES);
>>> -
>>> -		if (i >= MAX_TRY_TIMES) {
>>> -			PMD_DRV_LOG(ERR, "No response for cmd %d",
>> args->ops);
>>> +		if (rte_thread_is_intr()) {
>>> +			/* For virtchnl ops were executed in eal_intr_thread,
>>> +			 * need to poll the response.
>>> +			 */
>>> +			do {
>>> +				result = iavf_read_msg_from_pf(adapter,
>> args->out_size,
>>> +							args->out_buffer);
>>> +				if (result == IAVF_MSG_CMD)
>>> +					break;
>>> +				iavf_msec_delay(ASQ_DELAY_MS);
>>> +			} while (i++ < MAX_TRY_TIMES);
>>> +			if (i >= MAX_TRY_TIMES ||
>>> +				vf->cmd_retval !=
>> VIRTCHNL_STATUS_SUCCESS) {
>>> +				err = -1;
>>> +				PMD_DRV_LOG(ERR, "No response or return
>> failure (%d)"
>>> +						" for cmd %d", vf-
>>> cmd_retval, args->ops);
>>> +			}
>>>  			_clear_cmd(vf);
>>> -			err = -EIO;
>>> -		} else if (vf->cmd_retval ==
>>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>> -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
>>> ops);
>>> -			err = -ENOTSUP;
>>> -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
>>> -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
>>> -				    vf->cmd_retval, args->ops);
>>> -			err = -EINVAL;
>>> +		} else {
>>> +			/* For other virtchnl ops in running time,
>>> +			 * wait for the cmd done flag.
>>> +			 */
>>> +			do {
>>> +				if (vf->pend_cmd ==
>> VIRTCHNL_OP_UNKNOWN)
>>> +					break;
>>> +				iavf_msec_delay(ASQ_DELAY_MS);
>>> +				/* If don't read msg or read sys event,
>> continue */
>>> +			} while (i++ < MAX_TRY_TIMES);
>>> +
>>> +			if (i >= MAX_TRY_TIMES) {
>>> +				PMD_DRV_LOG(ERR, "No response for
>> cmd %d", args->ops);
>>> +				_clear_cmd(vf);
>>> +				err = -EIO;
>>> +			} else if (vf->cmd_retval ==
>>> +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>> +				PMD_DRV_LOG(ERR, "Cmd %d not
>> supported", args->ops);
>>> +				err = -ENOTSUP;
>>> +			} else if (vf->cmd_retval !=
>> VIRTCHNL_STATUS_SUCCESS) {
>>> +				PMD_DRV_LOG(ERR, "Return failure %d for
>> cmd %d",
>>> +						vf->cmd_retval, args->ops);
>>> +				err = -EINVAL;
>>> +			}
>>>  		}
>>>  		break;
>>>  	}
>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct iavf_adapter
>>> *adapter,  {
>>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
>>>  	int ret;
>>> +	int is_intr_thread = rte_thread_is_intr();
>>>
>>> -	rte_spinlock_lock(&vf->aq_lock);
>>> +	if (is_intr_thread) {
>>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
>>> +			return -EIO;
>>> +	} else {
>>> +		rte_spinlock_lock(&vf->aq_lock);
>>> +	}
>>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
>>>  	rte_spinlock_unlock(&vf->aq_lock);
>>>
>
  
Ferruh Yigit March 27, 2023, 12:37 p.m. UTC | #4
On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
> On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
>>
>>
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>> Sent: Thursday, March 23, 2023 11:39 PM
>>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
>>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
>>> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
>>> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
>>> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
>>> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
>>> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
>>> <dapengx.yu@intel.com>
>>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>>
>>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
>>>> When iavf send query-stats command in eal-intr-thread through virtual
>>>> channel, there will be no response received from
>>>> iavf_dev_virtchnl_handler for this command during block and wait.
>>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
>>>>
>>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave device
>>>> update callback will registered in alarm and called by
>>>> eal-intr-thread, it would also raise the above issue.
>>>>
>>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS when
>>> it
>>>> is called by eal-intr-thread to fix this issue.
>>>>
>>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
>>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
>>>> Fixes: 7c76a747e68c ("bond: add mode 5")
>>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
>>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
>>>
>>>
>>> Hi Kaiwen,
>>>
>>> Above commit already seems trying to address same issue, it creates "iavf-
>>> event-thread" control thread to asyncroniously handle the interrupts, in non-
>>> interrupt context, why it is not working?
>>>
>>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure all
>>> interrupts handled in control tread?
>>>
>>> And can you please provide a stack trace in commit log, to describe the issue
>>> better?
>> Hi Ferru,
>> Sorry for my late reply, And thanks for your review.
>>  
>> The above commit does not fix this issue when we need to get the returned data.
>> If we call iavf_query_stats and wait for response statistics in the intr-thread.
>> iavf_handle_virtchnl_msg is also registered in the intr_thread and will not be 
>> executed while waiting.
>>
> 
> Got it, since return value is required, API can't be called asyncroniously.
> 
> 
> 
> I think 'rte_thread_is_intr()' checks may cause more trouble for you in
> long term,
> 
> - why 'iavf_query_stats()' is called in the iterrupt thread, can it be
> prevented?
> 
> - does it make sense to allways poll messages from PF (for simplification)?
> 
> 
> If answer to both are 'No', I am OK to continue with current proposal if
> you are happy with it.
> 


btw, how critical is this issue?

If it is critical, I am OK to get it as it is for this release and
investigate it further for next release, since only a few days left for
this release.


> 
>> This commit I changed it to polling for replies to commands executed in the interrupt thread.
>>
>> main thread                                                                                                interrupt thread
>>      |                                                                                                                               |
>>      |                                                                                                                               |
>> iavf_query_stats                                                                                                        |
>> iavf_execute_vf_cmd                                                                                               |
>> iavf_aq_send_msg_to_pf  and wait handle complete                                       |
>>      |                                                                                                                              |
>>      |-------------------------------------------------------------------------------------------->|
>>      |                                                                                                                              |
>>      |                                                                                                       iavf_handle_virtchnl_msg
>>      |                                                                                                                             |
>>      |<--------------------------------------------------------------------------------------------|
>>      |                                                                                                                             |
>> iavf_execute_vf_cmd get response                                                                      |
>>      |                                                                                                                             |
>>
>> The above is the stack trace for the normal execution of iavf_query_stats 
>> in the main thread.
>>
>>  interrupt thread
>>      |
>>      |
>> iavf_query_stats
>> iavf_execute_vf_cmd
>> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
>> iavf_execute_vf_cmd timeout
>>      |
>>      |
>> iavf_handle_virtchnl_msg
>>      |
>>
>> The above is the stack trace for the abnormal execution of iavf_query_stats 
>> in the interrupt thread.
>>>
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
>>>> ---
>>>> Changes since v2:
>>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
>>>>
>>>> Changes since v1:
>>>> - Add lock to avoid race condition.
>>>> ---
>>>> ---
>>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
>>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
>>>>  3 files changed, 58 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> index f0c4f7d26b..edce621496 100644
>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>  	uint8_t update_stats = 0;
>>>>  	uint16_t slave_id;
>>>>  	uint16_t i;
>>>> +	int ret;
>>>>
>>>>  	internals->slave_update_idx++;
>>>>
>>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>
>>>>  	for (i = 0; i < internals->active_slave_count; i++) {
>>>>  		slave_id = internals->active_slaves[i];
>>>> -		rte_eth_stats_get(slave_id, &slave_stats);
>>>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
>>>> +		if (ret)
>>>> +			goto OUT;
>>>> +
>>>>  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
>>>>  		bandwidth_left(slave_id, tx_bytes,
>>>>  				internals->slave_update_idx, &bwg_array[i]);
>>> @@ -922,6 +926,7 @@
>>>> bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>  	for (i = 0; i < slave_count; i++)
>>>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
>>>>
>>>> +OUT:
>>>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
>>> bond_ethdev_update_tlb_slave_cb,
>>>>  			(struct bond_dev_private *)internals);  } diff --git
>>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
>>>> index 3196210f2c..d6e1f1a7f4 100644
>>>> --- a/drivers/net/iavf/iavf_ethdev.c
>>>> +++ b/drivers/net/iavf/iavf_ethdev.c
>>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>  	adapter->dev_data = eth_dev->data;
>>>>  	adapter->stopped = 1;
>>>>
>>>> +	if (iavf_dev_event_handler_init())
>>>> +		goto init_vf_err;
>>>> +
>>>>  	if (iavf_init_vf(eth_dev) != 0) {
>>>>  		PMD_INIT_LOG(ERR, "Init vf failed");
>>>>  		return -1;
>>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
>>>>  			&eth_dev->data->mac_addrs[0]);
>>>>
>>>> -	if (iavf_dev_event_handler_init())
>>>> -		goto init_vf_err;
>>>>
>>>>  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
>>> {
>>>>  		/* register callback func to eal lib */ diff --git
>>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c index
>>>> 9adaadb173..aeffb07cca 100644
>>>> --- a/drivers/net/iavf/iavf_vchnl.c
>>>> +++ b/drivers/net/iavf/iavf_vchnl.c
>>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
>>> *adapter, uint16_t buf_len,
>>>>  				vf->link_speed =
>>> iavf_convert_link_speed(speed);
>>>>  			}
>>>>  			iavf_dev_link_update(vf->eth_dev, 0);
>>>> +			iavf_dev_event_post(vf->eth_dev,
>>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
>>>>  			PMD_DRV_LOG(INFO, "Link status update:%s",
>>>>  					vf->link_up ? "up" : "down");
>>>>  			break;
>>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
>>> *adapter, struct iavf_cmd_info *args,
>>>>  		_clear_cmd(vf);
>>>>  		break;
>>>>  	default:
>>>> -		/* For other virtchnl ops in running time,
>>>> -		 * wait for the cmd done flag.
>>>> -		 */
>>>> -		do {
>>>> -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
>>>> -				break;
>>>> -			iavf_msec_delay(ASQ_DELAY_MS);
>>>> -			/* If don't read msg or read sys event, continue */
>>>> -		} while (i++ < MAX_TRY_TIMES);
>>>> -
>>>> -		if (i >= MAX_TRY_TIMES) {
>>>> -			PMD_DRV_LOG(ERR, "No response for cmd %d",
>>> args->ops);
>>>> +		if (rte_thread_is_intr()) {
>>>> +			/* For virtchnl ops were executed in eal_intr_thread,
>>>> +			 * need to poll the response.
>>>> +			 */
>>>> +			do {
>>>> +				result = iavf_read_msg_from_pf(adapter,
>>> args->out_size,
>>>> +							args->out_buffer);
>>>> +				if (result == IAVF_MSG_CMD)
>>>> +					break;
>>>> +				iavf_msec_delay(ASQ_DELAY_MS);
>>>> +			} while (i++ < MAX_TRY_TIMES);
>>>> +			if (i >= MAX_TRY_TIMES ||
>>>> +				vf->cmd_retval !=
>>> VIRTCHNL_STATUS_SUCCESS) {
>>>> +				err = -1;
>>>> +				PMD_DRV_LOG(ERR, "No response or return
>>> failure (%d)"
>>>> +						" for cmd %d", vf-
>>>> cmd_retval, args->ops);
>>>> +			}
>>>>  			_clear_cmd(vf);
>>>> -			err = -EIO;
>>>> -		} else if (vf->cmd_retval ==
>>>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>>> -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
>>>> ops);
>>>> -			err = -ENOTSUP;
>>>> -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
>>>> -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
>>>> -				    vf->cmd_retval, args->ops);
>>>> -			err = -EINVAL;
>>>> +		} else {
>>>> +			/* For other virtchnl ops in running time,
>>>> +			 * wait for the cmd done flag.
>>>> +			 */
>>>> +			do {
>>>> +				if (vf->pend_cmd ==
>>> VIRTCHNL_OP_UNKNOWN)
>>>> +					break;
>>>> +				iavf_msec_delay(ASQ_DELAY_MS);
>>>> +				/* If don't read msg or read sys event,
>>> continue */
>>>> +			} while (i++ < MAX_TRY_TIMES);
>>>> +
>>>> +			if (i >= MAX_TRY_TIMES) {
>>>> +				PMD_DRV_LOG(ERR, "No response for
>>> cmd %d", args->ops);
>>>> +				_clear_cmd(vf);
>>>> +				err = -EIO;
>>>> +			} else if (vf->cmd_retval ==
>>>> +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>>> +				PMD_DRV_LOG(ERR, "Cmd %d not
>>> supported", args->ops);
>>>> +				err = -ENOTSUP;
>>>> +			} else if (vf->cmd_retval !=
>>> VIRTCHNL_STATUS_SUCCESS) {
>>>> +				PMD_DRV_LOG(ERR, "Return failure %d for
>>> cmd %d",
>>>> +						vf->cmd_retval, args->ops);
>>>> +				err = -EINVAL;
>>>> +			}
>>>>  		}
>>>>  		break;
>>>>  	}
>>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct iavf_adapter
>>>> *adapter,  {
>>>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
>>>>  	int ret;
>>>> +	int is_intr_thread = rte_thread_is_intr();
>>>>
>>>> -	rte_spinlock_lock(&vf->aq_lock);
>>>> +	if (is_intr_thread) {
>>>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
>>>> +			return -EIO;
>>>> +	} else {
>>>> +		rte_spinlock_lock(&vf->aq_lock);
>>>> +	}
>>>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
>>>>  	rte_spinlock_unlock(&vf->aq_lock);
>>>>
>>
>
  
Kaiwen Deng March 29, 2023, 6:41 a.m. UTC | #5
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Monday, March 27, 2023 8:32 PM
> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
> <dapengx.yu@intel.com>
> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
> >
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >> Sent: Thursday, March 23, 2023 11:39 PM
> >> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> >> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
> >> YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min
> >> Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
> >> <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Mike
> >> Pattrick <mkp@redhat.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> >> Doherty, Declan <declan.doherty@intel.com>; Mrzyglod, Daniel T
> >> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>
> >> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> >>
> >> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
> >>> When iavf send query-stats command in eal-intr-thread through
> >>> virtual channel, there will be no response received from
> >>> iavf_dev_virtchnl_handler for this command during block and wait.
> >>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
> >>>
> >>> When vf device is bonded as BONDING_MODE_TLB mode, the slave
> device
> >>> update callback will registered in alarm and called by
> >>> eal-intr-thread, it would also raise the above issue.
> >>>
> >>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
> when
> >> it
> >>> is called by eal-intr-thread to fix this issue.
> >>>
> >>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> >>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> >>> Fixes: 7c76a747e68c ("bond: add mode 5")
> >>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> >>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> >>
> >>
> >> Hi Kaiwen,
> >>
> >> Above commit already seems trying to address same issue, it creates
> >> "iavf- event-thread" control thread to asyncroniously handle the
> >> interrupts, in non- interrupt context, why it is not working?
> >>
> >> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure
> >> all interrupts handled in control tread?
> >>
> >> And can you please provide a stack trace in commit log, to describe
> >> the issue better?
> > Hi Ferru,
> > Sorry for my late reply, And thanks for your review.
> >
> > The above commit does not fix this issue when we need to get the
> returned data.
> > If we call iavf_query_stats and wait for response statistics in the intr-thread.
> > iavf_handle_virtchnl_msg is also registered in the intr_thread and
> > will not be executed while waiting.
> >
> 
> Got it, since return value is required, API can't be called asyncroniously.
> 
> 
> 
> I think 'rte_thread_is_intr()' checks may cause more trouble for you in long
> term,
> 
> - why 'iavf_query_stats()' is called in the iterrupt thread, can it be prevented?
> 
> - does it make sense to allways poll messages from PF (for simplification)?
> 
Virtual channel commands sometimes need to be registered to alarm 
so that they can be called periodically, and alarm is registered  to be 
called in interrupt threads.

For some commands that do not require a return value, It cannot 
support asynchronous if allways poll messages from PF.
> 
> If answer to both are 'No', I am OK to continue with current proposal if you
> are happy with it.
> 
> 
> > This commit I changed it to polling for replies to commands executed in the
> interrupt thread.
> >
> > main thread                                                                                                interrupt
> thread
> >      |                                                                                                                               |
> >      |                                                                                                                               |
> > iavf_query_stats                                                                                                        |
> > iavf_execute_vf_cmd                                                                                               |
> > iavf_aq_send_msg_to_pf  and wait handle complete                                       |
> >      |                                                                                                                              |
> >      |---------------------------------------------------------------------------------------
> ----->|
> >      |                                                                                                                              |
> >      |
> iavf_handle_virtchnl_msg
> >      |                                                                                                                             |
> >      |<-------------------------------------------------------------------------------------
> -------|
> >      |                                                                                                                             |
> > iavf_execute_vf_cmd get response                                                                      |
> >      |                                                                                                                             |
> >
> > The above is the stack trace for the normal execution of
> > iavf_query_stats in the main thread.
> >
> >  interrupt thread
> >      |
> >      |
> > iavf_query_stats
> > iavf_execute_vf_cmd
> > iavf_aq_send_msg_to_pf wait handle complete(1 sec)
> iavf_execute_vf_cmd
> > timeout
> >      |
> >      |
> > iavf_handle_virtchnl_msg
> >      |
> >
> > The above is the stack trace for the abnormal execution of
> > iavf_query_stats in the interrupt thread.
> >>
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> >>> ---
> >>> Changes since v2:
> >>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
> >>>
> >>> Changes since v1:
> >>> - Add lock to avoid race condition.
> >>> ---
> >>> ---
> >>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> >>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
> >>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
> >>>  3 files changed, 58 insertions(+), 25 deletions(-)
> >>>
> >>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> >>> b/drivers/net/bonding/rte_eth_bond_pmd.c
> >>> index f0c4f7d26b..edce621496 100644
> >>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> >>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> >>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>  	uint8_t update_stats = 0;
> >>>  	uint16_t slave_id;
> >>>  	uint16_t i;
> >>> +	int ret;
> >>>
> >>>  	internals->slave_update_idx++;
> >>>
> >>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>
> >>>  	for (i = 0; i < internals->active_slave_count; i++) {
> >>>  		slave_id = internals->active_slaves[i];
> >>> -		rte_eth_stats_get(slave_id, &slave_stats);
> >>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
> >>> +		if (ret)
> >>> +			goto OUT;
> >>> +
> >>>  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
> >>>  		bandwidth_left(slave_id, tx_bytes,
> >>>  				internals->slave_update_idx, &bwg_array[i]);
> >> @@ -922,6 +926,7 @@
> >>> bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>  	for (i = 0; i < slave_count; i++)
> >>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
> >>>
> >>> +OUT:
> >>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
> >> bond_ethdev_update_tlb_slave_cb,
> >>>  			(struct bond_dev_private *)internals);  } diff --git
> >>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> >>> index 3196210f2c..d6e1f1a7f4 100644
> >>> --- a/drivers/net/iavf/iavf_ethdev.c
> >>> +++ b/drivers/net/iavf/iavf_ethdev.c
> >>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >>>  	adapter->dev_data = eth_dev->data;
> >>>  	adapter->stopped = 1;
> >>>
> >>> +	if (iavf_dev_event_handler_init())
> >>> +		goto init_vf_err;
> >>> +
> >>>  	if (iavf_init_vf(eth_dev) != 0) {
> >>>  		PMD_INIT_LOG(ERR, "Init vf failed");
> >>>  		return -1;
> >>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
> >>>  			&eth_dev->data->mac_addrs[0]);
> >>>
> >>> -	if (iavf_dev_event_handler_init())
> >>> -		goto init_vf_err;
> >>>
> >>>  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
> >> {
> >>>  		/* register callback func to eal lib */ diff --git
> >>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
> >>> index 9adaadb173..aeffb07cca 100644
> >>> --- a/drivers/net/iavf/iavf_vchnl.c
> >>> +++ b/drivers/net/iavf/iavf_vchnl.c
> >>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
> >> *adapter, uint16_t buf_len,
> >>>  				vf->link_speed =
> >> iavf_convert_link_speed(speed);
> >>>  			}
> >>>  			iavf_dev_link_update(vf->eth_dev, 0);
> >>> +			iavf_dev_event_post(vf->eth_dev,
> >> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
> >>>  			PMD_DRV_LOG(INFO, "Link status update:%s",
> >>>  					vf->link_up ? "up" : "down");
> >>>  			break;
> >>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
> >> *adapter, struct iavf_cmd_info *args,
> >>>  		_clear_cmd(vf);
> >>>  		break;
> >>>  	default:
> >>> -		/* For other virtchnl ops in running time,
> >>> -		 * wait for the cmd done flag.
> >>> -		 */
> >>> -		do {
> >>> -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
> >>> -				break;
> >>> -			iavf_msec_delay(ASQ_DELAY_MS);
> >>> -			/* If don't read msg or read sys event, continue */
> >>> -		} while (i++ < MAX_TRY_TIMES);
> >>> -
> >>> -		if (i >= MAX_TRY_TIMES) {
> >>> -			PMD_DRV_LOG(ERR, "No response for cmd %d",
> >> args->ops);
> >>> +		if (rte_thread_is_intr()) {
> >>> +			/* For virtchnl ops were executed in eal_intr_thread,
> >>> +			 * need to poll the response.
> >>> +			 */
> >>> +			do {
> >>> +				result = iavf_read_msg_from_pf(adapter,
> >> args->out_size,
> >>> +							args->out_buffer);
> >>> +				if (result == IAVF_MSG_CMD)
> >>> +					break;
> >>> +				iavf_msec_delay(ASQ_DELAY_MS);
> >>> +			} while (i++ < MAX_TRY_TIMES);
> >>> +			if (i >= MAX_TRY_TIMES ||
> >>> +				vf->cmd_retval !=
> >> VIRTCHNL_STATUS_SUCCESS) {
> >>> +				err = -1;
> >>> +				PMD_DRV_LOG(ERR, "No response or return
> >> failure (%d)"
> >>> +						" for cmd %d", vf-
> >>> cmd_retval, args->ops);
> >>> +			}
> >>>  			_clear_cmd(vf);
> >>> -			err = -EIO;
> >>> -		} else if (vf->cmd_retval ==
> >>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> >>> -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
> >>> ops);
> >>> -			err = -ENOTSUP;
> >>> -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> >>> -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
> >>> -				    vf->cmd_retval, args->ops);
> >>> -			err = -EINVAL;
> >>> +		} else {
> >>> +			/* For other virtchnl ops in running time,
> >>> +			 * wait for the cmd done flag.
> >>> +			 */
> >>> +			do {
> >>> +				if (vf->pend_cmd ==
> >> VIRTCHNL_OP_UNKNOWN)
> >>> +					break;
> >>> +				iavf_msec_delay(ASQ_DELAY_MS);
> >>> +				/* If don't read msg or read sys event,
> >> continue */
> >>> +			} while (i++ < MAX_TRY_TIMES);
> >>> +
> >>> +			if (i >= MAX_TRY_TIMES) {
> >>> +				PMD_DRV_LOG(ERR, "No response for
> >> cmd %d", args->ops);
> >>> +				_clear_cmd(vf);
> >>> +				err = -EIO;
> >>> +			} else if (vf->cmd_retval ==
> >>> +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> >>> +				PMD_DRV_LOG(ERR, "Cmd %d not
> >> supported", args->ops);
> >>> +				err = -ENOTSUP;
> >>> +			} else if (vf->cmd_retval !=
> >> VIRTCHNL_STATUS_SUCCESS) {
> >>> +				PMD_DRV_LOG(ERR, "Return failure %d for
> >> cmd %d",
> >>> +						vf->cmd_retval, args->ops);
> >>> +				err = -EINVAL;
> >>> +			}
> >>>  		}
> >>>  		break;
> >>>  	}
> >>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct iavf_adapter
> >>> *adapter,  {
> >>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
> >>>  	int ret;
> >>> +	int is_intr_thread = rte_thread_is_intr();
> >>>
> >>> -	rte_spinlock_lock(&vf->aq_lock);
> >>> +	if (is_intr_thread) {
> >>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
> >>> +			return -EIO;
> >>> +	} else {
> >>> +		rte_spinlock_lock(&vf->aq_lock);
> >>> +	}
> >>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
> >>>  	rte_spinlock_unlock(&vf->aq_lock);
> >>>
> >
  
Kaiwen Deng March 29, 2023, 7:53 a.m. UTC | #6
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Monday, March 27, 2023 8:38 PM
> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
> <dapengx.yu@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> Mcnamara, John <john.mcnamara@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
> > On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>> Sent: Thursday, March 23, 2023 11:39 PM
> >>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> >>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
> >>> YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min
> >>> Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
> >>> <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Mike
> >>> Pattrick <mkp@redhat.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> >>> Doherty, Declan <declan.doherty@intel.com>; Mrzyglod, Daniel T
> >>> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>
> >>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr
> >>> thread
> >>>
> >>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
> >>>> When iavf send query-stats command in eal-intr-thread through
> >>>> virtual channel, there will be no response received from
> >>>> iavf_dev_virtchnl_handler for this command during block and wait.
> >>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
> >>>>
> >>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave
> device
> >>>> update callback will registered in alarm and called by
> >>>> eal-intr-thread, it would also raise the above issue.
> >>>>
> >>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
> when
> >>> it
> >>>> is called by eal-intr-thread to fix this issue.
> >>>>
> >>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> >>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> >>>> Fixes: 7c76a747e68c ("bond: add mode 5")
> >>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> >>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> >>>
> >>>
> >>> Hi Kaiwen,
> >>>
> >>> Above commit already seems trying to address same issue, it creates
> >>> "iavf- event-thread" control thread to asyncroniously handle the
> >>> interrupts, in non- interrupt context, why it is not working?
> >>>
> >>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure
> >>> all interrupts handled in control tread?
> >>>
> >>> And can you please provide a stack trace in commit log, to describe
> >>> the issue better?
> >> Hi Ferru,
> >> Sorry for my late reply, And thanks for your review.
> >>
> >> The above commit does not fix this issue when we need to get the
> returned data.
> >> If we call iavf_query_stats and wait for response statistics in the intr-
> thread.
> >> iavf_handle_virtchnl_msg is also registered in the intr_thread and
> >> will not be executed while waiting.
> >>
> >
> > Got it, since return value is required, API can't be called asyncroniously.
> >
> >
> >
> > I think 'rte_thread_is_intr()' checks may cause more trouble for you
> > in long term,
> >
> > - why 'iavf_query_stats()' is called in the iterrupt thread, can it be
> > prevented?
> >
> > - does it make sense to allways poll messages from PF (for simplification)?
> >
> >
> > If answer to both are 'No', I am OK to continue with current proposal
> > if you are happy with it.
> >
> 
> 
> btw, how critical is this issue?
> 
> If it is critical, I am OK to get it as it is for this release and investigate it further
> for next release, since only a few days left for this release.
> 
This is not a very crtical issue, we can wait for a more appropriate solution.

Thanks.
> 
> >
> >> This commit I changed it to polling for replies to commands executed in
> the interrupt thread.
> >>
> >> main thread                                                                                                interrupt
> thread
> >>      |                                                                                                                               |
> >>      |                                                                                                                               |
> >> iavf_query_stats                                                                                                        |
> >> iavf_execute_vf_cmd                                                                                               |
> >> iavf_aq_send_msg_to_pf  and wait handle complete                                       |
> >>      |                                                                                                                              |
> >>      |-------------------------------------------------------------------------------------
> ------->|
> >>      |                                                                                                                              |
> >>      |
> iavf_handle_virtchnl_msg
> >>      |                                                                                                                             |
> >>      |<------------------------------------------------------------------------------------
> --------|
> >>      |                                                                                                                             |
> >> iavf_execute_vf_cmd get response                                                                      |
> >>      |                                                                                                                             |
> >>
> >> The above is the stack trace for the normal execution of
> >> iavf_query_stats in the main thread.
> >>
> >>  interrupt thread
> >>      |
> >>      |
> >> iavf_query_stats
> >> iavf_execute_vf_cmd
> >> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
> >> iavf_execute_vf_cmd timeout
> >>      |
> >>      |
> >> iavf_handle_virtchnl_msg
> >>      |
> >>
> >> The above is the stack trace for the abnormal execution of
> >> iavf_query_stats in the interrupt thread.
> >>>
> >>>> Cc: stable@dpdk.org
> >>>>
> >>>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> >>>> ---
> >>>> Changes since v2:
> >>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
> >>>>
> >>>> Changes since v1:
> >>>> - Add lock to avoid race condition.
> >>>> ---
> >>>> ---
> >>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> >>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
> >>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
> >>>>  3 files changed, 58 insertions(+), 25 deletions(-)
> >>>>
> >>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> index f0c4f7d26b..edce621496 100644
> >>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>>  	uint8_t update_stats = 0;
> >>>>  	uint16_t slave_id;
> >>>>  	uint16_t i;
> >>>> +	int ret;
> >>>>
> >>>>  	internals->slave_update_idx++;
> >>>>
> >>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>>
> >>>>  	for (i = 0; i < internals->active_slave_count; i++) {
> >>>>  		slave_id = internals->active_slaves[i];
> >>>> -		rte_eth_stats_get(slave_id, &slave_stats);
> >>>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
> >>>> +		if (ret)
> >>>> +			goto OUT;
> >>>> +
> >>>>  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
> >>>>  		bandwidth_left(slave_id, tx_bytes,
> >>>>  				internals->slave_update_idx, &bwg_array[i]);
> >>> @@ -922,6 +926,7 @@
> >>>> bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>>  	for (i = 0; i < slave_count; i++)
> >>>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
> >>>>
> >>>> +OUT:
> >>>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
> >>> bond_ethdev_update_tlb_slave_cb,
> >>>>  			(struct bond_dev_private *)internals);  } diff --git
> >>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> >>>> index 3196210f2c..d6e1f1a7f4 100644
> >>>> --- a/drivers/net/iavf/iavf_ethdev.c
> >>>> +++ b/drivers/net/iavf/iavf_ethdev.c
> >>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >>>>  	adapter->dev_data = eth_dev->data;
> >>>>  	adapter->stopped = 1;
> >>>>
> >>>> +	if (iavf_dev_event_handler_init())
> >>>> +		goto init_vf_err;
> >>>> +
> >>>>  	if (iavf_init_vf(eth_dev) != 0) {
> >>>>  		PMD_INIT_LOG(ERR, "Init vf failed");
> >>>>  		return -1;
> >>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >>>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
> >>>>  			&eth_dev->data->mac_addrs[0]);
> >>>>
> >>>> -	if (iavf_dev_event_handler_init())
> >>>> -		goto init_vf_err;
> >>>>
> >>>>  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
> >>> {
> >>>>  		/* register callback func to eal lib */ diff --git
> >>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
> >>>> index 9adaadb173..aeffb07cca 100644
> >>>> --- a/drivers/net/iavf/iavf_vchnl.c
> >>>> +++ b/drivers/net/iavf/iavf_vchnl.c
> >>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
> >>> *adapter, uint16_t buf_len,
> >>>>  				vf->link_speed =
> >>> iavf_convert_link_speed(speed);
> >>>>  			}
> >>>>  			iavf_dev_link_update(vf->eth_dev, 0);
> >>>> +			iavf_dev_event_post(vf->eth_dev,
> >>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
> >>>>  			PMD_DRV_LOG(INFO, "Link status update:%s",
> >>>>  					vf->link_up ? "up" : "down");
> >>>>  			break;
> >>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
> >>> *adapter, struct iavf_cmd_info *args,
> >>>>  		_clear_cmd(vf);
> >>>>  		break;
> >>>>  	default:
> >>>> -		/* For other virtchnl ops in running time,
> >>>> -		 * wait for the cmd done flag.
> >>>> -		 */
> >>>> -		do {
> >>>> -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
> >>>> -				break;
> >>>> -			iavf_msec_delay(ASQ_DELAY_MS);
> >>>> -			/* If don't read msg or read sys event, continue */
> >>>> -		} while (i++ < MAX_TRY_TIMES);
> >>>> -
> >>>> -		if (i >= MAX_TRY_TIMES) {
> >>>> -			PMD_DRV_LOG(ERR, "No response for cmd %d",
> >>> args->ops);
> >>>> +		if (rte_thread_is_intr()) {
> >>>> +			/* For virtchnl ops were executed in eal_intr_thread,
> >>>> +			 * need to poll the response.
> >>>> +			 */
> >>>> +			do {
> >>>> +				result = iavf_read_msg_from_pf(adapter,
> >>> args->out_size,
> >>>> +							args->out_buffer);
> >>>> +				if (result == IAVF_MSG_CMD)
> >>>> +					break;
> >>>> +				iavf_msec_delay(ASQ_DELAY_MS);
> >>>> +			} while (i++ < MAX_TRY_TIMES);
> >>>> +			if (i >= MAX_TRY_TIMES ||
> >>>> +				vf->cmd_retval !=
> >>> VIRTCHNL_STATUS_SUCCESS) {
> >>>> +				err = -1;
> >>>> +				PMD_DRV_LOG(ERR, "No response or return
> >>> failure (%d)"
> >>>> +						" for cmd %d", vf-
> >>>> cmd_retval, args->ops);
> >>>> +			}
> >>>>  			_clear_cmd(vf);
> >>>> -			err = -EIO;
> >>>> -		} else if (vf->cmd_retval ==
> >>>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> >>>> -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
> >>>> ops);
> >>>> -			err = -ENOTSUP;
> >>>> -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> >>>> -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
> >>>> -				    vf->cmd_retval, args->ops);
> >>>> -			err = -EINVAL;
> >>>> +		} else {
> >>>> +			/* For other virtchnl ops in running time,
> >>>> +			 * wait for the cmd done flag.
> >>>> +			 */
> >>>> +			do {
> >>>> +				if (vf->pend_cmd ==
> >>> VIRTCHNL_OP_UNKNOWN)
> >>>> +					break;
> >>>> +				iavf_msec_delay(ASQ_DELAY_MS);
> >>>> +				/* If don't read msg or read sys event,
> >>> continue */
> >>>> +			} while (i++ < MAX_TRY_TIMES);
> >>>> +
> >>>> +			if (i >= MAX_TRY_TIMES) {
> >>>> +				PMD_DRV_LOG(ERR, "No response for
> >>> cmd %d", args->ops);
> >>>> +				_clear_cmd(vf);
> >>>> +				err = -EIO;
> >>>> +			} else if (vf->cmd_retval ==
> >>>> +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> >>>> +				PMD_DRV_LOG(ERR, "Cmd %d not
> >>> supported", args->ops);
> >>>> +				err = -ENOTSUP;
> >>>> +			} else if (vf->cmd_retval !=
> >>> VIRTCHNL_STATUS_SUCCESS) {
> >>>> +				PMD_DRV_LOG(ERR, "Return failure %d for
> >>> cmd %d",
> >>>> +						vf->cmd_retval, args->ops);
> >>>> +				err = -EINVAL;
> >>>> +			}
> >>>>  		}
> >>>>  		break;
> >>>>  	}
> >>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct
> iavf_adapter
> >>>> *adapter,  {
> >>>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
> >>>>  	int ret;
> >>>> +	int is_intr_thread = rte_thread_is_intr();
> >>>>
> >>>> -	rte_spinlock_lock(&vf->aq_lock);
> >>>> +	if (is_intr_thread) {
> >>>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
> >>>> +			return -EIO;
> >>>> +	} else {
> >>>> +		rte_spinlock_lock(&vf->aq_lock);
> >>>> +	}
> >>>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
> >>>>  	rte_spinlock_unlock(&vf->aq_lock);
> >>>>
> >>
> >
  
Kaiwen Deng May 5, 2023, 2:31 a.m. UTC | #7
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Monday, March 27, 2023 8:38 PM
> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
> <dapengx.yu@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> Mcnamara, John <john.mcnamara@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
> > On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>> Sent: Thursday, March 23, 2023 11:39 PM
> >>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> >>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
> >>> YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min
> >>> Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
> >>> <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Mike
> >>> Pattrick <mkp@redhat.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> >>> Doherty, Declan <declan.doherty@intel.com>; Mrzyglod, Daniel T
> >>> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>
> >>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr
> >>> thread
> >>>
> >>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
> >>>> When iavf send query-stats command in eal-intr-thread through
> >>>> virtual channel, there will be no response received from
> >>>> iavf_dev_virtchnl_handler for this command during block and wait.
> >>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
> >>>>
> >>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave
> device
> >>>> update callback will registered in alarm and called by
> >>>> eal-intr-thread, it would also raise the above issue.
> >>>>
> >>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
> when
> >>> it
> >>>> is called by eal-intr-thread to fix this issue.
> >>>>
> >>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> >>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> >>>> Fixes: 7c76a747e68c ("bond: add mode 5")
> >>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> >>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> >>>
> >>>
> >>> Hi Kaiwen,
> >>>
> >>> Above commit already seems trying to address same issue, it creates
> >>> "iavf- event-thread" control thread to asyncroniously handle the
> >>> interrupts, in non- interrupt context, why it is not working?
> >>>
> >>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure
> >>> all interrupts handled in control tread?
> >>>
> >>> And can you please provide a stack trace in commit log, to describe
> >>> the issue better?
> >> Hi Ferru,
> >> Sorry for my late reply, And thanks for your review.
> >>
> >> The above commit does not fix this issue when we need to get the
> returned data.
> >> If we call iavf_query_stats and wait for response statistics in the intr-
> thread.
> >> iavf_handle_virtchnl_msg is also registered in the intr_thread and
> >> will not be executed while waiting.
> >>
> >
> > Got it, since return value is required, API can't be called asyncroniously.
> >
> >
> >
> > I think 'rte_thread_is_intr()' checks may cause more trouble for you
> > in long term,
> >
> > - why 'iavf_query_stats()' is called in the iterrupt thread, can it be
> > prevented?
> >
> > - does it make sense to allways poll messages from PF (for simplification)?
> >
> >
> > If answer to both are 'No', I am OK to continue with current proposal
> > if you are happy with it.
> >
> 
> 
> btw, how critical is this issue?
> 
> If it is critical, I am OK to get it as it is for this release and investigate it further
> for next release, since only a few days left for this release.
>
Hi Ferruh,

I didn't find a more suitable solution after consideration, if you have a better suggestion, 
please let me know, thanks.
> 
> >
> >> This commit I changed it to polling for replies to commands executed in
> the interrupt thread.
> >>
> >> main thread                                                                                                interrupt
> thread
> >>      |                                                                                                                               |
> >>      |                                                                                                                               |
> >> iavf_query_stats                                                                                                        |
> >> iavf_execute_vf_cmd                                                                                               |
> >> iavf_aq_send_msg_to_pf  and wait handle complete                                       |
> >>      |                                                                                                                              |
> >>      |-------------------------------------------------------------------------------------
> ------->|
> >>      |                                                                                                                              |
> >>      |
> iavf_handle_virtchnl_msg
> >>      |                                                                                                                             |
> >>      |<------------------------------------------------------------------------------------
> --------|
> >>      |                                                                                                                             |
> >> iavf_execute_vf_cmd get response                                                                      |
> >>      |                                                                                                                             |
> >>
> >> The above is the stack trace for the normal execution of
> >> iavf_query_stats in the main thread.
> >>
> >>  interrupt thread
> >>      |
> >>      |
> >> iavf_query_stats
> >> iavf_execute_vf_cmd
> >> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
> >> iavf_execute_vf_cmd timeout
> >>      |
> >>      |
> >> iavf_handle_virtchnl_msg
> >>      |
> >>
> >> The above is the stack trace for the abnormal execution of
> >> iavf_query_stats in the interrupt thread.
> >>>
> >>>> Cc: stable@dpdk.org
> >>>>
> >>>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> >>>> ---
> >>>> Changes since v2:
> >>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
> >>>>
> >>>> Changes since v1:
> >>>> - Add lock to avoid race condition.
> >>>> ---
> >>>> ---
> >>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> >>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
> >>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
> >>>>  3 files changed, 58 insertions(+), 25 deletions(-)
> >>>>
> >>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> index f0c4f7d26b..edce621496 100644
> >>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> >>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>>  	uint8_t update_stats = 0;
> >>>>  	uint16_t slave_id;
> >>>>  	uint16_t i;
> >>>> +	int ret;
> >>>>
> >>>>  	internals->slave_update_idx++;
> >>>>
> >>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>>
> >>>>  	for (i = 0; i < internals->active_slave_count; i++) {
> >>>>  		slave_id = internals->active_slaves[i];
> >>>> -		rte_eth_stats_get(slave_id, &slave_stats);
> >>>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
> >>>> +		if (ret)
> >>>> +			goto OUT;
> >>>> +
> >>>>  		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
> >>>>  		bandwidth_left(slave_id, tx_bytes,
> >>>>  				internals->slave_update_idx, &bwg_array[i]);
> >>> @@ -922,6 +926,7 @@
> >>>> bond_ethdev_update_tlb_slave_cb(void *arg)
> >>>>  	for (i = 0; i < slave_count; i++)
> >>>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
> >>>>
> >>>> +OUT:
> >>>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
> >>> bond_ethdev_update_tlb_slave_cb,
> >>>>  			(struct bond_dev_private *)internals);  } diff --git
> >>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> >>>> index 3196210f2c..d6e1f1a7f4 100644
> >>>> --- a/drivers/net/iavf/iavf_ethdev.c
> >>>> +++ b/drivers/net/iavf/iavf_ethdev.c
> >>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >>>>  	adapter->dev_data = eth_dev->data;
> >>>>  	adapter->stopped = 1;
> >>>>
> >>>> +	if (iavf_dev_event_handler_init())
> >>>> +		goto init_vf_err;
> >>>> +
> >>>>  	if (iavf_init_vf(eth_dev) != 0) {
> >>>>  		PMD_INIT_LOG(ERR, "Init vf failed");
> >>>>  		return -1;
> >>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> >>>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
> >>>>  			&eth_dev->data->mac_addrs[0]);
> >>>>
> >>>> -	if (iavf_dev_event_handler_init())
> >>>> -		goto init_vf_err;
> >>>>
> >>>>  	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
> >>> {
> >>>>  		/* register callback func to eal lib */ diff --git
> >>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
> >>>> index 9adaadb173..aeffb07cca 100644
> >>>> --- a/drivers/net/iavf/iavf_vchnl.c
> >>>> +++ b/drivers/net/iavf/iavf_vchnl.c
> >>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
> >>> *adapter, uint16_t buf_len,
> >>>>  				vf->link_speed =
> >>> iavf_convert_link_speed(speed);
> >>>>  			}
> >>>>  			iavf_dev_link_update(vf->eth_dev, 0);
> >>>> +			iavf_dev_event_post(vf->eth_dev,
> >>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
> >>>>  			PMD_DRV_LOG(INFO, "Link status update:%s",
> >>>>  					vf->link_up ? "up" : "down");
> >>>>  			break;
> >>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
> >>> *adapter, struct iavf_cmd_info *args,
> >>>>  		_clear_cmd(vf);
> >>>>  		break;
> >>>>  	default:
> >>>> -		/* For other virtchnl ops in running time,
> >>>> -		 * wait for the cmd done flag.
> >>>> -		 */
> >>>> -		do {
> >>>> -			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
> >>>> -				break;
> >>>> -			iavf_msec_delay(ASQ_DELAY_MS);
> >>>> -			/* If don't read msg or read sys event, continue */
> >>>> -		} while (i++ < MAX_TRY_TIMES);
> >>>> -
> >>>> -		if (i >= MAX_TRY_TIMES) {
> >>>> -			PMD_DRV_LOG(ERR, "No response for cmd %d",
> >>> args->ops);
> >>>> +		if (rte_thread_is_intr()) {
> >>>> +			/* For virtchnl ops were executed in eal_intr_thread,
> >>>> +			 * need to poll the response.
> >>>> +			 */
> >>>> +			do {
> >>>> +				result = iavf_read_msg_from_pf(adapter,
> >>> args->out_size,
> >>>> +							args->out_buffer);
> >>>> +				if (result == IAVF_MSG_CMD)
> >>>> +					break;
> >>>> +				iavf_msec_delay(ASQ_DELAY_MS);
> >>>> +			} while (i++ < MAX_TRY_TIMES);
> >>>> +			if (i >= MAX_TRY_TIMES ||
> >>>> +				vf->cmd_retval !=
> >>> VIRTCHNL_STATUS_SUCCESS) {
> >>>> +				err = -1;
> >>>> +				PMD_DRV_LOG(ERR, "No response or return
> >>> failure (%d)"
> >>>> +						" for cmd %d", vf-
> >>>> cmd_retval, args->ops);
> >>>> +			}
> >>>>  			_clear_cmd(vf);
> >>>> -			err = -EIO;
> >>>> -		} else if (vf->cmd_retval ==
> >>>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> >>>> -			PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
> >>>> ops);
> >>>> -			err = -ENOTSUP;
> >>>> -		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
> >>>> -			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
> >>>> -				    vf->cmd_retval, args->ops);
> >>>> -			err = -EINVAL;
> >>>> +		} else {
> >>>> +			/* For other virtchnl ops in running time,
> >>>> +			 * wait for the cmd done flag.
> >>>> +			 */
> >>>> +			do {
> >>>> +				if (vf->pend_cmd ==
> >>> VIRTCHNL_OP_UNKNOWN)
> >>>> +					break;
> >>>> +				iavf_msec_delay(ASQ_DELAY_MS);
> >>>> +				/* If don't read msg or read sys event,
> >>> continue */
> >>>> +			} while (i++ < MAX_TRY_TIMES);
> >>>> +
> >>>> +			if (i >= MAX_TRY_TIMES) {
> >>>> +				PMD_DRV_LOG(ERR, "No response for
> >>> cmd %d", args->ops);
> >>>> +				_clear_cmd(vf);
> >>>> +				err = -EIO;
> >>>> +			} else if (vf->cmd_retval ==
> >>>> +				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> >>>> +				PMD_DRV_LOG(ERR, "Cmd %d not
> >>> supported", args->ops);
> >>>> +				err = -ENOTSUP;
> >>>> +			} else if (vf->cmd_retval !=
> >>> VIRTCHNL_STATUS_SUCCESS) {
> >>>> +				PMD_DRV_LOG(ERR, "Return failure %d for
> >>> cmd %d",
> >>>> +						vf->cmd_retval, args->ops);
> >>>> +				err = -EINVAL;
> >>>> +			}
> >>>>  		}
> >>>>  		break;
> >>>>  	}
> >>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct
> iavf_adapter
> >>>> *adapter,  {
> >>>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
> >>>>  	int ret;
> >>>> +	int is_intr_thread = rte_thread_is_intr();
> >>>>
> >>>> -	rte_spinlock_lock(&vf->aq_lock);
> >>>> +	if (is_intr_thread) {
> >>>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
> >>>> +			return -EIO;
> >>>> +	} else {
> >>>> +		rte_spinlock_lock(&vf->aq_lock);
> >>>> +	}
> >>>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
> >>>>  	rte_spinlock_unlock(&vf->aq_lock);
> >>>>
> >>
> >
  
Kaiwen Deng May 23, 2023, 1:45 a.m. UTC | #8
> -----Original Message-----
> From: Deng, KaiwenX
> Sent: Friday, May 5, 2023 10:31 AM
> To: Ferruh Yigit <ferruh.yigit@amd.com>; dev@dpdk.org
> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
> <dapengx.yu@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
> Mcnamara, John <john.mcnamara@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: RE: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> 
> 
> > -----Original Message-----
> > From: Ferruh Yigit <ferruh.yigit@amd.com>
> > Sent: Monday, March 27, 2023 8:38 PM
> > To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> > Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
> > YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min
> > Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Mike
> > Pattrick <mkp@redhat.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> > Doherty, Declan <declan.doherty@intel.com>; Mrzyglod, Daniel T
> > <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>;
> > Zhang, Helin <helin.zhang@intel.com>; Mcnamara, John
> > <john.mcnamara@intel.com>; Thomas Monjalon <thomas@monjalon.net>
> > Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> >
> > On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
> > > On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
> > >>
> > >>
> > >>> -----Original Message-----
> > >>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> > >>> Sent: Thursday, March 23, 2023 11:39 PM
> > >>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
> > >>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
> > >>> YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>;
> > >>> Min Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
> > >>> <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
> > >>> Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
> > >>> <qi.z.zhang@intel.com>; Doherty, Declan
> > >>> <declan.doherty@intel.com>; Mrzyglod, Daniel T
> > >>> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>
> > >>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr
> > >>> thread
> > >>>
> > >>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
> > >>>> When iavf send query-stats command in eal-intr-thread through
> > >>>> virtual channel, there will be no response received from
> > >>>> iavf_dev_virtchnl_handler for this command during block and wait.
> > >>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-
> thread.
> > >>>>
> > >>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave
> > device
> > >>>> update callback will registered in alarm and called by
> > >>>> eal-intr-thread, it would also raise the above issue.
> > >>>>
> > >>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
> > when
> > >>> it
> > >>>> is called by eal-intr-thread to fix this issue.
> > >>>>
> > >>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> > >>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> > >>>> Fixes: 7c76a747e68c ("bond: add mode 5")
> > >>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> > >>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> > >>>
> > >>>
> > >>> Hi Kaiwen,
> > >>>
> > >>> Above commit already seems trying to address same issue, it
> > >>> creates
> > >>> "iavf- event-thread" control thread to asyncroniously handle the
> > >>> interrupts, in non- interrupt context, why it is not working?
> > >>>
> > >>> Instead of adding 'rte_thread_is_intr()' checks, can't you make
> > >>> sure all interrupts handled in control tread?
> > >>>
> > >>> And can you please provide a stack trace in commit log, to
> > >>> describe the issue better?
> > >> Hi Ferru,
> > >> Sorry for my late reply, And thanks for your review.
> > >>
> > >> The above commit does not fix this issue when we need to get the
> > returned data.
> > >> If we call iavf_query_stats and wait for response statistics in the
> > >> intr-
> > thread.
> > >> iavf_handle_virtchnl_msg is also registered in the intr_thread and
> > >> will not be executed while waiting.
> > >>
> > >
> > > Got it, since return value is required, API can't be called asyncroniously.
> > >
> > >
> > >
> > > I think 'rte_thread_is_intr()' checks may cause more trouble for you
> > > in long term,
> > >
> > > - why 'iavf_query_stats()' is called in the iterrupt thread, can it
> > > be prevented?
> > >
> > > - does it make sense to allways poll messages from PF (for simplification)?
> > >
> > >
> > > If answer to both are 'No', I am OK to continue with current
> > > proposal if you are happy with it.
> > >
> >
> >
> > btw, how critical is this issue?
> >
> > If it is critical, I am OK to get it as it is for this release and
> > investigate it further for next release, since only a few days left for this
> release.
> >
> Hi Ferruh,
> 
> I didn't find a more suitable solution after consideration, if you have a better
> suggestion, please let me know, thanks.
Hi Ferruh,
Can you please take a look at this patch.
Thanks.
> >
> > >
> > >> This commit I changed it to polling for replies to commands
> > >> executed in
> > the interrupt thread.
> > >>
> > >> main thread                                                                                                interrupt
> > thread
> > >>      |                                                                                                                               |
> > >>      |                                                                                                                               |
> > >> iavf_query_stats                                                                                                        |
> > >> iavf_execute_vf_cmd                                                                                               |
> > >> iavf_aq_send_msg_to_pf  and wait handle complete
> |
> > >>      |                                                                                                                              |
> > >>
> > >> |------------------------------------------------------------------
> > >> -------------------
> > ------->|
> > >>      |                                                                                                                              |
> > >>      |
> > iavf_handle_virtchnl_msg
> > >>      |                                                                                                                             |
> > >>
> > >> |<-----------------------------------------------------------------
> > >> -------------------
> > --------|
> > >>      |                                                                                                                             |
> > >> iavf_execute_vf_cmd get response                                                                      |
> > >>      |                                                                                                                             |
> > >>
> > >> The above is the stack trace for the normal execution of
> > >> iavf_query_stats in the main thread.
> > >>
> > >>  interrupt thread
> > >>      |
> > >>      |
> > >> iavf_query_stats
> > >> iavf_execute_vf_cmd
> > >> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
> > >> iavf_execute_vf_cmd timeout
> > >>      |
> > >>      |
> > >> iavf_handle_virtchnl_msg
> > >>      |
> > >>
> > >> The above is the stack trace for the abnormal execution of
> > >> iavf_query_stats in the interrupt thread.
> > >>>
> > >>>> Cc: stable@dpdk.org
> > >>>>
> > >>>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> > >>>> ---
> > >>>> Changes since v2:
> > >>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
> > >>>>
> > >>>> Changes since v1:
> > >>>> - Add lock to avoid race condition.
> > >>>> ---
> > >>>> ---
> > >>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> > >>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
> > >>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
> > >>>>  3 files changed, 58 insertions(+), 25 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> > >>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
> > >>>> index f0c4f7d26b..edce621496 100644
> > >>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> > >>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> > >>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
> > >>>>  	uint8_t update_stats = 0;
> > >>>>  	uint16_t slave_id;
> > >>>>  	uint16_t i;
> > >>>> +	int ret;
> > >>>>
> > >>>>  	internals->slave_update_idx++;
> > >>>>
> > >>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void
> *arg)
> > >>>>
> > >>>>  	for (i = 0; i < internals->active_slave_count; i++) {
> > >>>>  		slave_id = internals->active_slaves[i];
> > >>>> -		rte_eth_stats_get(slave_id, &slave_stats);
> > >>>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
> > >>>> +		if (ret)
> > >>>> +			goto OUT;
> > >>>> +
> > >>>>  		tx_bytes = slave_stats.obytes -
> tlb_last_obytets[slave_id];
> > >>>>  		bandwidth_left(slave_id, tx_bytes,
> > >>>>  				internals->slave_update_idx,
> &bwg_array[i]);
> > >>> @@ -922,6 +926,7 @@
> > >>>> bond_ethdev_update_tlb_slave_cb(void *arg)
> > >>>>  	for (i = 0; i < slave_count; i++)
> > >>>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
> > >>>>
> > >>>> +OUT:
> > >>>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
> > >>> bond_ethdev_update_tlb_slave_cb,
> > >>>>  			(struct bond_dev_private *)internals);  } diff -
> -git
> > >>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> > >>>> index 3196210f2c..d6e1f1a7f4 100644
> > >>>> --- a/drivers/net/iavf/iavf_ethdev.c
> > >>>> +++ b/drivers/net/iavf/iavf_ethdev.c
> > >>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> > >>>>  	adapter->dev_data = eth_dev->data;
> > >>>>  	adapter->stopped = 1;
> > >>>>
> > >>>> +	if (iavf_dev_event_handler_init())
> > >>>> +		goto init_vf_err;
> > >>>> +
> > >>>>  	if (iavf_init_vf(eth_dev) != 0) {
> > >>>>  		PMD_INIT_LOG(ERR, "Init vf failed");
> > >>>>  		return -1;
> > >>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
> > >>>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw-
> >mac.addr,
> > >>>>  			&eth_dev->data->mac_addrs[0]);
> > >>>>
> > >>>> -	if (iavf_dev_event_handler_init())
> > >>>> -		goto init_vf_err;
> > >>>>
> > >>>>  	if (vf->vf_res->vf_cap_flags &
> VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
> > >>> {
> > >>>>  		/* register callback func to eal lib */ diff --git
> > >>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
> > >>>> index 9adaadb173..aeffb07cca 100644
> > >>>> --- a/drivers/net/iavf/iavf_vchnl.c
> > >>>> +++ b/drivers/net/iavf/iavf_vchnl.c
> > >>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
> > >>> *adapter, uint16_t buf_len,
> > >>>>  				vf->link_speed =
> > >>> iavf_convert_link_speed(speed);
> > >>>>  			}
> > >>>>  			iavf_dev_link_update(vf->eth_dev, 0);
> > >>>> +			iavf_dev_event_post(vf->eth_dev,
> > >>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
> > >>>>  			PMD_DRV_LOG(INFO, "Link status
> update:%s",
> > >>>>  					vf->link_up ? "up" : "down");
> > >>>>  			break;
> > >>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
> > >>> *adapter, struct iavf_cmd_info *args,
> > >>>>  		_clear_cmd(vf);
> > >>>>  		break;
> > >>>>  	default:
> > >>>> -		/* For other virtchnl ops in running time,
> > >>>> -		 * wait for the cmd done flag.
> > >>>> -		 */
> > >>>> -		do {
> > >>>> -			if (vf->pend_cmd ==
> VIRTCHNL_OP_UNKNOWN)
> > >>>> -				break;
> > >>>> -			iavf_msec_delay(ASQ_DELAY_MS);
> > >>>> -			/* If don't read msg or read sys event,
> continue */
> > >>>> -		} while (i++ < MAX_TRY_TIMES);
> > >>>> -
> > >>>> -		if (i >= MAX_TRY_TIMES) {
> > >>>> -			PMD_DRV_LOG(ERR, "No response for
> cmd %d",
> > >>> args->ops);
> > >>>> +		if (rte_thread_is_intr()) {
> > >>>> +			/* For virtchnl ops were executed in
> eal_intr_thread,
> > >>>> +			 * need to poll the response.
> > >>>> +			 */
> > >>>> +			do {
> > >>>> +				result =
> iavf_read_msg_from_pf(adapter,
> > >>> args->out_size,
> > >>>> +							args-
> >out_buffer);
> > >>>> +				if (result == IAVF_MSG_CMD)
> > >>>> +					break;
> > >>>> +				iavf_msec_delay(ASQ_DELAY_MS);
> > >>>> +			} while (i++ < MAX_TRY_TIMES);
> > >>>> +			if (i >= MAX_TRY_TIMES ||
> > >>>> +				vf->cmd_retval !=
> > >>> VIRTCHNL_STATUS_SUCCESS) {
> > >>>> +				err = -1;
> > >>>> +				PMD_DRV_LOG(ERR, "No response
> or return
> > >>> failure (%d)"
> > >>>> +						" for cmd %d", vf-
> > >>>> cmd_retval, args->ops);
> > >>>> +			}
> > >>>>  			_clear_cmd(vf);
> > >>>> -			err = -EIO;
> > >>>> -		} else if (vf->cmd_retval ==
> > >>>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED)
> {
> > >>>> -			PMD_DRV_LOG(ERR, "Cmd %d not
> supported", args-
> > >>>> ops);
> > >>>> -			err = -ENOTSUP;
> > >>>> -		} else if (vf->cmd_retval !=
> VIRTCHNL_STATUS_SUCCESS) {
> > >>>> -			PMD_DRV_LOG(ERR, "Return failure %d for
> cmd %d",
> > >>>> -				    vf->cmd_retval, args->ops);
> > >>>> -			err = -EINVAL;
> > >>>> +		} else {
> > >>>> +			/* For other virtchnl ops in running time,
> > >>>> +			 * wait for the cmd done flag.
> > >>>> +			 */
> > >>>> +			do {
> > >>>> +				if (vf->pend_cmd ==
> > >>> VIRTCHNL_OP_UNKNOWN)
> > >>>> +					break;
> > >>>> +				iavf_msec_delay(ASQ_DELAY_MS);
> > >>>> +				/* If don't read msg or read sys event,
> > >>> continue */
> > >>>> +			} while (i++ < MAX_TRY_TIMES);
> > >>>> +
> > >>>> +			if (i >= MAX_TRY_TIMES) {
> > >>>> +				PMD_DRV_LOG(ERR, "No response
> for
> > >>> cmd %d", args->ops);
> > >>>> +				_clear_cmd(vf);
> > >>>> +				err = -EIO;
> > >>>> +			} else if (vf->cmd_retval ==
> > >>>> +
> 	VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
> > >>>> +				PMD_DRV_LOG(ERR, "Cmd %d not
> > >>> supported", args->ops);
> > >>>> +				err = -ENOTSUP;
> > >>>> +			} else if (vf->cmd_retval !=
> > >>> VIRTCHNL_STATUS_SUCCESS) {
> > >>>> +				PMD_DRV_LOG(ERR, "Return
> failure %d for
> > >>> cmd %d",
> > >>>> +						vf->cmd_retval, args-
> >ops);
> > >>>> +				err = -EINVAL;
> > >>>> +			}
> > >>>>  		}
> > >>>>  		break;
> > >>>>  	}
> > >>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct
> > iavf_adapter
> > >>>> *adapter,  {
> > >>>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
> > >>>>  	int ret;
> > >>>> +	int is_intr_thread = rte_thread_is_intr();
> > >>>>
> > >>>> -	rte_spinlock_lock(&vf->aq_lock);
> > >>>> +	if (is_intr_thread) {
> > >>>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
> > >>>> +			return -EIO;
> > >>>> +	} else {
> > >>>> +		rte_spinlock_lock(&vf->aq_lock);
> > >>>> +	}
> > >>>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
> > >>>>  	rte_spinlock_unlock(&vf->aq_lock);
> > >>>>
> > >>
> > >
  
Ferruh Yigit May 23, 2023, 10:26 a.m. UTC | #9
On 5/23/2023 2:45 AM, Deng, KaiwenX wrote:
> 
> 
>> -----Original Message-----
>> From: Deng, KaiwenX
>> Sent: Friday, May 5, 2023 10:31 AM
>> To: Ferruh Yigit <ferruh.yigit@amd.com>; dev@dpdk.org
>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
>> <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min Hu (Connor)
>> <humin29@huawei.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
>> <beilei.xing@intel.com>; Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
>> <qi.z.zhang@intel.com>; Doherty, Declan <declan.doherty@intel.com>;
>> Mrzyglod, Daniel T <daniel.t.mrzyglod@intel.com>; Dapeng Yu
>> <dapengx.yu@intel.com>; Zhang, Helin <helin.zhang@intel.com>;
>> Mcnamara, John <john.mcnamara@intel.com>; Thomas Monjalon
>> <thomas@monjalon.net>
>> Subject: RE: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>
>>
>>
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>> Sent: Monday, March 27, 2023 8:38 PM
>>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
>>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
>>> YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>; Min
>>> Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
>>> <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Mike
>>> Pattrick <mkp@redhat.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
>>> Doherty, Declan <declan.doherty@intel.com>; Mrzyglod, Daniel T
>>> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>;
>>> Zhang, Helin <helin.zhang@intel.com>; Mcnamara, John
>>> <john.mcnamara@intel.com>; Thomas Monjalon <thomas@monjalon.net>
>>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>>
>>> On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
>>>> On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>>>> Sent: Thursday, March 23, 2023 11:39 PM
>>>>>> To: Deng, KaiwenX <kaiwenx.deng@intel.com>; dev@dpdk.org
>>>>>> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou,
>>>>>> YidingX <yidingx.zhou@intel.com>; Chas Williams <chas3@att.com>;
>>>>>> Min Hu (Connor) <humin29@huawei.com>; Wu, Jingjing
>>>>>> <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
>>>>>> Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z
>>>>>> <qi.z.zhang@intel.com>; Doherty, Declan
>>>>>> <declan.doherty@intel.com>; Mrzyglod, Daniel T
>>>>>> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>
>>>>>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr
>>>>>> thread
>>>>>>
>>>>>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
>>>>>>> When iavf send query-stats command in eal-intr-thread through
>>>>>>> virtual channel, there will be no response received from
>>>>>>> iavf_dev_virtchnl_handler for this command during block and wait.
>>>>>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-
>> thread.
>>>>>>>
>>>>>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave
>>> device
>>>>>>> update callback will registered in alarm and called by
>>>>>>> eal-intr-thread, it would also raise the above issue.
>>>>>>>
>>>>>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
>>> when
>>>>>> it
>>>>>>> is called by eal-intr-thread to fix this issue.
>>>>>>>
>>>>>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
>>>>>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
>>>>>>> Fixes: 7c76a747e68c ("bond: add mode 5")
>>>>>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
>>>>>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
>>>>>>
>>>>>>
>>>>>> Hi Kaiwen,
>>>>>>
>>>>>> Above commit already seems trying to address same issue, it
>>>>>> creates
>>>>>> "iavf- event-thread" control thread to asyncroniously handle the
>>>>>> interrupts, in non- interrupt context, why it is not working?
>>>>>>
>>>>>> Instead of adding 'rte_thread_is_intr()' checks, can't you make
>>>>>> sure all interrupts handled in control tread?
>>>>>>
>>>>>> And can you please provide a stack trace in commit log, to
>>>>>> describe the issue better?
>>>>> Hi Ferru,
>>>>> Sorry for my late reply, And thanks for your review.
>>>>>
>>>>> The above commit does not fix this issue when we need to get the
>>> returned data.
>>>>> If we call iavf_query_stats and wait for response statistics in the
>>>>> intr-
>>> thread.
>>>>> iavf_handle_virtchnl_msg is also registered in the intr_thread and
>>>>> will not be executed while waiting.
>>>>>
>>>>
>>>> Got it, since return value is required, API can't be called asyncroniously.
>>>>
>>>>
>>>>
>>>> I think 'rte_thread_is_intr()' checks may cause more trouble for you
>>>> in long term,
>>>>
>>>> - why 'iavf_query_stats()' is called in the iterrupt thread, can it
>>>> be prevented?
>>>>
>>>> - does it make sense to allways poll messages from PF (for simplification)?
>>>>
>>>>
>>>> If answer to both are 'No', I am OK to continue with current
>>>> proposal if you are happy with it.
>>>>
>>>
>>>
>>> btw, how critical is this issue?
>>>
>>> If it is critical, I am OK to get it as it is for this release and
>>> investigate it further for next release, since only a few days left for this
>> release.
>>>
>> Hi Ferruh,
>>
>> I didn't find a more suitable solution after consideration, if you have a better
>> suggestion, please let me know, thanks.
> Hi Ferruh,
> Can you please take a look at this patch.
>

Hi Kaiwen,

Sorry for delay, lets continue with this solution.

I thought calling from "iavf-event-thread" can be an option but your
description clarifies why it is not an option, and I also don't have
better solution, so I think can continue as it is.


> Thanks.
>>>
>>>>
>>>>> This commit I changed it to polling for replies to commands
>>>>> executed in
>>> the interrupt thread.
>>>>>
>>>>> main thread                                                                                                interrupt
>>> thread
>>>>>      |                                                                                                                               |
>>>>>      |                                                                                                                               |
>>>>> iavf_query_stats                                                                                                        |
>>>>> iavf_execute_vf_cmd                                                                                               |
>>>>> iavf_aq_send_msg_to_pf  and wait handle complete
>> |
>>>>>      |                                                                                                                              |
>>>>>
>>>>> |------------------------------------------------------------------
>>>>> -------------------
>>> ------->|
>>>>>      |                                                                                                                              |
>>>>>      |
>>> iavf_handle_virtchnl_msg
>>>>>      |                                                                                                                             |
>>>>>
>>>>> |<-----------------------------------------------------------------
>>>>> -------------------
>>> --------|
>>>>>      |                                                                                                                             |
>>>>> iavf_execute_vf_cmd get response                                                                      |
>>>>>      |                                                                                                                             |
>>>>>
>>>>> The above is the stack trace for the normal execution of
>>>>> iavf_query_stats in the main thread.
>>>>>
>>>>>  interrupt thread
>>>>>      |
>>>>>      |
>>>>> iavf_query_stats
>>>>> iavf_execute_vf_cmd
>>>>> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
>>>>> iavf_execute_vf_cmd timeout
>>>>>      |
>>>>>      |
>>>>> iavf_handle_virtchnl_msg
>>>>>      |
>>>>>
>>>>> The above is the stack trace for the abnormal execution of
>>>>> iavf_query_stats in the interrupt thread.
>>>>>>
>>>>>>> Cc: stable@dpdk.org
>>>>>>>
>>>>>>> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
>>>>>>> ---
>>>>>>> Changes since v2:
>>>>>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
>>>>>>>
>>>>>>> Changes since v1:
>>>>>>> - Add lock to avoid race condition.
>>>>>>> ---
>>>>>>> ---
>>>>>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>>>>>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
>>>>>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
>>>>>>>  3 files changed, 58 insertions(+), 25 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> index f0c4f7d26b..edce621496 100644
>>>>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>>>>  	uint8_t update_stats = 0;
>>>>>>>  	uint16_t slave_id;
>>>>>>>  	uint16_t i;
>>>>>>> +	int ret;
>>>>>>>
>>>>>>>  	internals->slave_update_idx++;
>>>>>>>
>>>>>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void
>> *arg)
>>>>>>>
>>>>>>>  	for (i = 0; i < internals->active_slave_count; i++) {
>>>>>>>  		slave_id = internals->active_slaves[i];
>>>>>>> -		rte_eth_stats_get(slave_id, &slave_stats);
>>>>>>> +		ret = rte_eth_stats_get(slave_id, &slave_stats);
>>>>>>> +		if (ret)
>>>>>>> +			goto OUT;
>>>>>>> +
>>>>>>>  		tx_bytes = slave_stats.obytes -
>> tlb_last_obytets[slave_id];
>>>>>>>  		bandwidth_left(slave_id, tx_bytes,
>>>>>>>  				internals->slave_update_idx,
>> &bwg_array[i]);
>>>>>> @@ -922,6 +926,7 @@
>>>>>>> bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>>>>  	for (i = 0; i < slave_count; i++)
>>>>>>>  		internals->tlb_slaves_order[i] = bwg_array[i].slave;
>>>>>>>
>>>>>>> +OUT:
>>>>>>>  	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
>>>>>> bond_ethdev_update_tlb_slave_cb,
>>>>>>>  			(struct bond_dev_private *)internals);  } diff -
>> -git
>>>>>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
>>>>>>> index 3196210f2c..d6e1f1a7f4 100644
>>>>>>> --- a/drivers/net/iavf/iavf_ethdev.c
>>>>>>> +++ b/drivers/net/iavf/iavf_ethdev.c
>>>>>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>>>>  	adapter->dev_data = eth_dev->data;
>>>>>>>  	adapter->stopped = 1;
>>>>>>>
>>>>>>> +	if (iavf_dev_event_handler_init())
>>>>>>> +		goto init_vf_err;
>>>>>>> +
>>>>>>>  	if (iavf_init_vf(eth_dev) != 0) {
>>>>>>>  		PMD_INIT_LOG(ERR, "Init vf failed");
>>>>>>>  		return -1;
>>>>>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>>>>  	rte_ether_addr_copy((struct rte_ether_addr *)hw-
>>> mac.addr,
>>>>>>>  			&eth_dev->data->mac_addrs[0]);
>>>>>>>
>>>>>>> -	if (iavf_dev_event_handler_init())
>>>>>>> -		goto init_vf_err;
>>>>>>>
>>>>>>>  	if (vf->vf_res->vf_cap_flags &
>> VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
>>>>>> {
>>>>>>>  		/* register callback func to eal lib */ diff --git
>>>>>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
>>>>>>> index 9adaadb173..aeffb07cca 100644
>>>>>>> --- a/drivers/net/iavf/iavf_vchnl.c
>>>>>>> +++ b/drivers/net/iavf/iavf_vchnl.c
>>>>>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
>>>>>> *adapter, uint16_t buf_len,
>>>>>>>  				vf->link_speed =
>>>>>> iavf_convert_link_speed(speed);
>>>>>>>  			}
>>>>>>>  			iavf_dev_link_update(vf->eth_dev, 0);
>>>>>>> +			iavf_dev_event_post(vf->eth_dev,
>>>>>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
>>>>>>>  			PMD_DRV_LOG(INFO, "Link status
>> update:%s",
>>>>>>>  					vf->link_up ? "up" : "down");
>>>>>>>  			break;
>>>>>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
>>>>>> *adapter, struct iavf_cmd_info *args,
>>>>>>>  		_clear_cmd(vf);
>>>>>>>  		break;
>>>>>>>  	default:
>>>>>>> -		/* For other virtchnl ops in running time,
>>>>>>> -		 * wait for the cmd done flag.
>>>>>>> -		 */
>>>>>>> -		do {
>>>>>>> -			if (vf->pend_cmd ==
>> VIRTCHNL_OP_UNKNOWN)
>>>>>>> -				break;
>>>>>>> -			iavf_msec_delay(ASQ_DELAY_MS);
>>>>>>> -			/* If don't read msg or read sys event,
>> continue */
>>>>>>> -		} while (i++ < MAX_TRY_TIMES);
>>>>>>> -
>>>>>>> -		if (i >= MAX_TRY_TIMES) {
>>>>>>> -			PMD_DRV_LOG(ERR, "No response for
>> cmd %d",
>>>>>> args->ops);
>>>>>>> +		if (rte_thread_is_intr()) {
>>>>>>> +			/* For virtchnl ops were executed in
>> eal_intr_thread,
>>>>>>> +			 * need to poll the response.
>>>>>>> +			 */
>>>>>>> +			do {
>>>>>>> +				result =
>> iavf_read_msg_from_pf(adapter,
>>>>>> args->out_size,
>>>>>>> +							args-
>>> out_buffer);
>>>>>>> +				if (result == IAVF_MSG_CMD)
>>>>>>> +					break;
>>>>>>> +				iavf_msec_delay(ASQ_DELAY_MS);
>>>>>>> +			} while (i++ < MAX_TRY_TIMES);
>>>>>>> +			if (i >= MAX_TRY_TIMES ||
>>>>>>> +				vf->cmd_retval !=
>>>>>> VIRTCHNL_STATUS_SUCCESS) {
>>>>>>> +				err = -1;
>>>>>>> +				PMD_DRV_LOG(ERR, "No response
>> or return
>>>>>> failure (%d)"
>>>>>>> +						" for cmd %d", vf-
>>>>>>> cmd_retval, args->ops);
>>>>>>> +			}
>>>>>>>  			_clear_cmd(vf);
>>>>>>> -			err = -EIO;
>>>>>>> -		} else if (vf->cmd_retval ==
>>>>>>> -			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED)
>> {
>>>>>>> -			PMD_DRV_LOG(ERR, "Cmd %d not
>> supported", args-
>>>>>>> ops);
>>>>>>> -			err = -ENOTSUP;
>>>>>>> -		} else if (vf->cmd_retval !=
>> VIRTCHNL_STATUS_SUCCESS) {
>>>>>>> -			PMD_DRV_LOG(ERR, "Return failure %d for
>> cmd %d",
>>>>>>> -				    vf->cmd_retval, args->ops);
>>>>>>> -			err = -EINVAL;
>>>>>>> +		} else {
>>>>>>> +			/* For other virtchnl ops in running time,
>>>>>>> +			 * wait for the cmd done flag.
>>>>>>> +			 */
>>>>>>> +			do {
>>>>>>> +				if (vf->pend_cmd ==
>>>>>> VIRTCHNL_OP_UNKNOWN)
>>>>>>> +					break;
>>>>>>> +				iavf_msec_delay(ASQ_DELAY_MS);
>>>>>>> +				/* If don't read msg or read sys event,
>>>>>> continue */
>>>>>>> +			} while (i++ < MAX_TRY_TIMES);
>>>>>>> +
>>>>>>> +			if (i >= MAX_TRY_TIMES) {
>>>>>>> +				PMD_DRV_LOG(ERR, "No response
>> for
>>>>>> cmd %d", args->ops);
>>>>>>> +				_clear_cmd(vf);
>>>>>>> +				err = -EIO;
>>>>>>> +			} else if (vf->cmd_retval ==
>>>>>>> +
>> 	VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>>>>>> +				PMD_DRV_LOG(ERR, "Cmd %d not
>>>>>> supported", args->ops);
>>>>>>> +				err = -ENOTSUP;
>>>>>>> +			} else if (vf->cmd_retval !=
>>>>>> VIRTCHNL_STATUS_SUCCESS) {
>>>>>>> +				PMD_DRV_LOG(ERR, "Return
>> failure %d for
>>>>>> cmd %d",
>>>>>>> +						vf->cmd_retval, args-
>>> ops);
>>>>>>> +				err = -EINVAL;
>>>>>>> +			}
>>>>>>>  		}
>>>>>>>  		break;
>>>>>>>  	}
>>>>>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct
>>> iavf_adapter
>>>>>>> *adapter,  {
>>>>>>>  	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
>>>>>>>  	int ret;
>>>>>>> +	int is_intr_thread = rte_thread_is_intr();
>>>>>>>
>>>>>>> -	rte_spinlock_lock(&vf->aq_lock);
>>>>>>> +	if (is_intr_thread) {
>>>>>>> +		if (!rte_spinlock_trylock(&vf->aq_lock))
>>>>>>> +			return -EIO;
>>>>>>> +	} else {
>>>>>>> +		rte_spinlock_lock(&vf->aq_lock);
>>>>>>> +	}
>>>>>>>  	ret = iavf_execute_vf_cmd(adapter, args, async);
>>>>>>>  	rte_spinlock_unlock(&vf->aq_lock);
>>>>>>>
>>>>>
>>>>
>
  
Jiale, SongX June 6, 2023, 5:41 a.m. UTC | #10
> -----Original Message-----
> From: Kaiwen Deng <kaiwenx.deng@intel.com>
> Sent: Wednesday, March 22, 2023 3:26 PM
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Deng, KaiwenX <kaiwenx.deng@intel.com>;
> Chas Williams <chas3@att.com>; Min Hu (Connor) <humin29@huawei.com>;
> Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
> Mike Pattrick <mkp@redhat.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> Doherty, Declan <declan.doherty@intel.com>; Mrzyglod, Daniel T
> <daniel.t.mrzyglod@intel.com>; Dapeng Yu <dapengx.yu@intel.com>
> Subject: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> When iavf send query-stats command in eal-intr-thread through virtual
> channel, there will be no response received from iavf_dev_virtchnl_handler
> for this command during block and wait.
> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
> 
> When vf device is bonded as BONDING_MODE_TLB mode, the slave device
> update callback will registered in alarm and called by eal-intr-thread, it would
> also raise the above issue.
> 
> This commit add to poll the response for VIRTCHNL_OP_GET_STATS when it
> is called by eal-intr-thread to fix this issue.
> 
> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
> Fixes: 7c76a747e68c ("bond: add mode 5")
> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Kaiwen Deng <kaiwenx.deng@intel.com>
> ---
Tested-by: Jiale Song < songx.jiale@intel.com>
  

Patch

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index f0c4f7d26b..edce621496 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -894,6 +894,7 @@  bond_ethdev_update_tlb_slave_cb(void *arg)
 	uint8_t update_stats = 0;
 	uint16_t slave_id;
 	uint16_t i;
+	int ret;
 
 	internals->slave_update_idx++;
 
@@ -903,7 +904,10 @@  bond_ethdev_update_tlb_slave_cb(void *arg)
 
 	for (i = 0; i < internals->active_slave_count; i++) {
 		slave_id = internals->active_slaves[i];
-		rte_eth_stats_get(slave_id, &slave_stats);
+		ret = rte_eth_stats_get(slave_id, &slave_stats);
+		if (ret)
+			goto OUT;
+
 		tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
 		bandwidth_left(slave_id, tx_bytes,
 				internals->slave_update_idx, &bwg_array[i]);
@@ -922,6 +926,7 @@  bond_ethdev_update_tlb_slave_cb(void *arg)
 	for (i = 0; i < slave_count; i++)
 		internals->tlb_slaves_order[i] = bwg_array[i].slave;
 
+OUT:
 	rte_eal_alarm_set(REORDER_PERIOD_MS * 1000, bond_ethdev_update_tlb_slave_cb,
 			(struct bond_dev_private *)internals);
 }
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 3196210f2c..d6e1f1a7f4 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -2607,6 +2607,9 @@  iavf_dev_init(struct rte_eth_dev *eth_dev)
 	adapter->dev_data = eth_dev->data;
 	adapter->stopped = 1;
 
+	if (iavf_dev_event_handler_init())
+		goto init_vf_err;
+
 	if (iavf_init_vf(eth_dev) != 0) {
 		PMD_INIT_LOG(ERR, "Init vf failed");
 		return -1;
@@ -2634,8 +2637,6 @@  iavf_dev_init(struct rte_eth_dev *eth_dev)
 	rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
 			&eth_dev->data->mac_addrs[0]);
 
-	if (iavf_dev_event_handler_init())
-		goto init_vf_err;
 
 	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) {
 		/* register callback func to eal lib */
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index 9adaadb173..aeffb07cca 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -256,6 +256,7 @@  iavf_read_msg_from_pf(struct iavf_adapter *adapter, uint16_t buf_len,
 				vf->link_speed = iavf_convert_link_speed(speed);
 			}
 			iavf_dev_link_update(vf->eth_dev, 0);
+			iavf_dev_event_post(vf->eth_dev, RTE_ETH_EVENT_INTR_LSC, NULL, 0);
 			PMD_DRV_LOG(INFO, "Link status update:%s",
 					vf->link_up ? "up" : "down");
 			break;
@@ -368,28 +369,48 @@  iavf_execute_vf_cmd(struct iavf_adapter *adapter, struct iavf_cmd_info *args,
 		_clear_cmd(vf);
 		break;
 	default:
-		/* For other virtchnl ops in running time,
-		 * wait for the cmd done flag.
-		 */
-		do {
-			if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
-				break;
-			iavf_msec_delay(ASQ_DELAY_MS);
-			/* If don't read msg or read sys event, continue */
-		} while (i++ < MAX_TRY_TIMES);
-
-		if (i >= MAX_TRY_TIMES) {
-			PMD_DRV_LOG(ERR, "No response for cmd %d", args->ops);
+		if (rte_thread_is_intr()) {
+			/* For virtchnl ops were executed in eal_intr_thread,
+			 * need to poll the response.
+			 */
+			do {
+				result = iavf_read_msg_from_pf(adapter, args->out_size,
+							args->out_buffer);
+				if (result == IAVF_MSG_CMD)
+					break;
+				iavf_msec_delay(ASQ_DELAY_MS);
+			} while (i++ < MAX_TRY_TIMES);
+			if (i >= MAX_TRY_TIMES ||
+				vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
+				err = -1;
+				PMD_DRV_LOG(ERR, "No response or return failure (%d)"
+						" for cmd %d", vf->cmd_retval, args->ops);
+			}
 			_clear_cmd(vf);
-			err = -EIO;
-		} else if (vf->cmd_retval ==
-			   VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
-			PMD_DRV_LOG(ERR, "Cmd %d not supported", args->ops);
-			err = -ENOTSUP;
-		} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
-			PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
-				    vf->cmd_retval, args->ops);
-			err = -EINVAL;
+		} else {
+			/* For other virtchnl ops in running time,
+			 * wait for the cmd done flag.
+			 */
+			do {
+				if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
+					break;
+				iavf_msec_delay(ASQ_DELAY_MS);
+				/* If don't read msg or read sys event, continue */
+			} while (i++ < MAX_TRY_TIMES);
+
+			if (i >= MAX_TRY_TIMES) {
+				PMD_DRV_LOG(ERR, "No response for cmd %d", args->ops);
+				_clear_cmd(vf);
+				err = -EIO;
+			} else if (vf->cmd_retval ==
+				VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
+				PMD_DRV_LOG(ERR, "Cmd %d not supported", args->ops);
+				err = -ENOTSUP;
+			} else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
+				PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
+						vf->cmd_retval, args->ops);
+				err = -EINVAL;
+			}
 		}
 		break;
 	}
@@ -403,8 +424,14 @@  iavf_execute_vf_cmd_safe(struct iavf_adapter *adapter,
 {
 	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
 	int ret;
+	int is_intr_thread = rte_thread_is_intr();
 
-	rte_spinlock_lock(&vf->aq_lock);
+	if (is_intr_thread) {
+		if (!rte_spinlock_trylock(&vf->aq_lock))
+			return -EIO;
+	} else {
+		rte_spinlock_lock(&vf->aq_lock);
+	}
 	ret = iavf_execute_vf_cmd(adapter, args, async);
 	rte_spinlock_unlock(&vf->aq_lock);