[dpdk-dev,v2] ethdev: fix multi-process NULL dereference crashes

Message ID 1485270071-5407-1-git-send-email-remy.horton@intel.com (mailing list archive)
State Not Applicable, archived
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel compilation  success  Compilation OK

Commit Message

Remy Horton Jan. 24, 2017, 3:01 p.m. UTC
  Secondary processes were blanket zeroing ethernet device memory,
resulting in NULL dereference crashes in multi-process setups.

Fixes: 7f95f78a8aea ("ethdev: clear data when allocating device")

Signed-off-by: Remy Horton <remy.horton@intel.com>
---
 doc/guides/rel_notes/release_17_02.rst | 5 +++++
 lib/librte_ether/rte_ethdev.c          | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)
  

Comments

Thomas Monjalon Jan. 25, 2017, 11:56 a.m. UTC | #1
2017-01-24 15:01, Remy Horton:
> Secondary processes were blanket zeroing ethernet device memory,
> resulting in NULL dereference crashes in multi-process setups.
> 
> Fixes: 7f95f78a8aea ("ethdev: clear data when allocating device")
> 
> Signed-off-by: Remy Horton <remy.horton@intel.com>
> ---
>  doc/guides/rel_notes/release_17_02.rst | 5 +++++
>  lib/librte_ether/rte_ethdev.c          | 4 +++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
> index 0ecd720..1472f84 100644
> --- a/doc/guides/rel_notes/release_17_02.rst
> +++ b/doc/guides/rel_notes/release_17_02.rst
> @@ -222,6 +222,11 @@ Drivers
>    Fixed few regressions introduced in recent releases that break the virtio
>    multiple process support.
>  
> +* **ethdev: Fixed crash with multi-processing.**
> +
> +  Secondary processes were blanket zeroing ethernet device memory,
> +  resulting in NULL dereference crashes in multi-process setups.

It does not describe exactly the use-case it is fixing (same in commit message).
I guess you saw an issue when creating a vdev in the primary process and
another one in a secondary process, erasing the data of the first one.

nit: the ethdev bug should be listed before PMD bugs like the virtio one above.

> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -225,8 +225,10 @@ rte_eth_dev_allocate(const char *name)
>  		return NULL;
>  	}
>  
> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
>  	eth_dev = eth_dev_get(port_id);
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +		memset(&rte_eth_dev_data[port_id], 0,
> +			sizeof(struct rte_eth_dev_data));

My previous proposal was:
	memset(eth_dev->data, 0, sizeof(*eth_dev->data))
It is better to avoid reference to the global array rte_eth_dev_data.

Anyway, the shared data are still overwritten for the name, the port id
and the MTU.
Please describe the exact case where it is working for you.
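
For readers following the review, here is a minimal, self-contained sketch of
the two points raised above: zeroing through the device's own data pointer
with sizeof(*ptr) rather than indexing the global array, and the fact that
the name, port id and MTU are still written whatever the process type. Every
identifier in it (mock_dev, mock_dev_data, mock_dev_get, mock_dev_allocate,
the MOCK_* constants) is a hypothetical stand-in and not the real DPDK code;
the process type is passed as a parameter purely to keep the sketch testable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real DPDK definitions. */
enum mock_proc_type { MOCK_PROC_PRIMARY, MOCK_PROC_SECONDARY };

#define MOCK_MAX_PORTS 4
#define MOCK_MTU 1500

struct mock_dev_data {
	char name[32];
	uint16_t port_id;
	uint16_t mtu;
	void *dev_private;	/* a pointer the primary process relies on */
};

struct mock_dev {
	struct mock_dev_data *data;
};

/* Plays the role of the array that lives in shared memory. */
static struct mock_dev_data mock_shared_data[MOCK_MAX_PORTS];
static struct mock_dev mock_devices[MOCK_MAX_PORTS];

static struct mock_dev *
mock_dev_get(uint16_t port_id)
{
	assert(port_id < MOCK_MAX_PORTS);
	mock_devices[port_id].data = &mock_shared_data[port_id];
	return &mock_devices[port_id];
}

static struct mock_dev *
mock_dev_allocate(enum mock_proc_type type, uint16_t port_id, const char *name)
{
	struct mock_dev *dev = mock_dev_get(port_id);

	/* Zero through the device's own pointer; only the primary does so. */
	if (type == MOCK_PROC_PRIMARY)
		memset(dev->data, 0, sizeof(*dev->data));

	/* These writes still happen for every process type. */
	snprintf(dev->data->name, sizeof(dev->data->name), "%s", name);
	dev->data->port_id = port_id;
	dev->data->mtu = MOCK_MTU;
	return dev;
}
```

In this sketch a secondary caller leaves dev_private intact but still
replaces the name, which is the residual overwrite pointed out above.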
  
Remy Horton Jan. 25, 2017, 12:13 p.m. UTC | #2
On 25/01/2017 11:56, Thomas Monjalon wrote:
[..]
> It does not describe exactly the use-case it is fixing (same in commit message).
> I guess you saw an issue when creating a vdev in the primary process and
> another one in a secondary process, erasing the data of the first one.

In my use case the secondary process is proc_info, which appeared to be
blanking the shared memory and leaving NULL-pointer landmines for the
primary process to step on. I'm not entirely sure why this type of
secondary process needs to run any ethdev startup code at all, as
all it does is pull data out of shared memory.


> My previous proposal was:
> 	memset(eth_dev->data, 0, sizeof(*eth_dev->data))
> It is better to avoid reference to the global array rte_eth_dev_data.

Git rebase screwed up, and it got lost en-route :(

..Remy
  
Remy Horton Jan. 25, 2017, 2:02 p.m. UTC | #3
On 24/01/2017 15:01, Remy Horton wrote:
> Secondary processes were blanket zeroing ethernet device memory,
> resulting in NULL dereference crashes in multi-process setups.
>
> Fixes: 7f95f78a8aea ("ethdev: clear data when allocating device")
>
> Signed-off-by: Remy Horton <remy.horton@intel.com>

Self-NAK: Condition is now tautology on code path that was causing crashes
  
Thomas Monjalon Jan. 25, 2017, 2:31 p.m. UTC | #4
2017-01-25 14:02, Remy Horton:
> 
> On 24/01/2017 15:01, Remy Horton wrote:
> > Secondary processes were blanket zeroing ethernet device memory,
> > resulting in NULL dereference crashes in multi-process setups.
> >
> > Fixes: 7f95f78a8aea ("ethdev: clear data when allocating device")
> >
> > Signed-off-by: Remy Horton <remy.horton@intel.com>
> 
> Self-NAK: Condition is now tautology on code path that was causing crashes

What do you mean exactly?
  
Remy Horton Jan. 25, 2017, 2:38 p.m. UTC | #5
On 25/01/2017 14:31, Thomas Monjalon wrote:
> 2017-01-25 14:02, Remy Horton:
[..]
>> Self-NAK: Condition is now tautology on code path that was causing crashes
>
> What do you mean exactly?

There is an if(rte_eal_process_type() == RTE_PROC_PRIMARY) in a calling 
function, so the one my patch was introducing is now redundant.

..Remy
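
The redundancy described above can be illustrated with a small sketch. It
assumes, as stated in the reply, that a calling function already tests for
the primary process before the allocation path is reached; the function and
counter names (sketch_caller, sketch_allocate, sketch_zeroed,
sketch_alloc_calls) are hypothetical and do not correspond to actual DPDK
symbols.

```c
#include <assert.h>

/* Hypothetical stand-in for the EAL process type. */
enum sketch_proc_type { SKETCH_PRIMARY, SKETCH_SECONDARY };

static int sketch_alloc_calls;	/* times the allocation path was entered */
static int sketch_zeroed;	/* times the "memset" branch was taken */

static void
sketch_allocate(enum sketch_proc_type type)
{
	sketch_alloc_calls++;

	/* Reached only via sketch_caller() here, so this always holds... */
	assert(type == SKETCH_PRIMARY);

	/* ...which makes this inner check a tautology on that path. */
	if (type == SKETCH_PRIMARY)
		sketch_zeroed++;
}

static void
sketch_caller(enum sketch_proc_type type)
{
	/* The guard that already exists in the calling function. */
	if (type == SKETCH_PRIMARY)
		sketch_allocate(type);
}
```

With the outer guard in place, a secondary process never enters
sketch_allocate() at all, so the inner test cannot change the outcome on
this path.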
  

Patch

diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 0ecd720..1472f84 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -222,6 +222,11 @@  Drivers
   Fixed few regressions introduced in recent releases that break the virtio
   multiple process support.
 
+* **ethdev: Fixed crash with multi-processing.**
+
+  Secondary processes were blanket zeroing ethernet device memory,
+  resulting in NULL dereference crashes in multi-process setups.
+
 
 Libraries
 ~~~~~~~~~
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 61f44e2..d911921 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -225,8 +225,10 @@  rte_eth_dev_allocate(const char *name)
 		return NULL;
 	}
 
-	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
 	eth_dev = eth_dev_get(port_id);
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		memset(&rte_eth_dev_data[port_id], 0,
+			sizeof(struct rte_eth_dev_data));
 	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
 	eth_dev->data->port_id = port_id;
 	eth_dev->data->mtu = ETHER_MTU;