[RFC,v2] net: fix rte_vlan_insert with shared mbuf

Message ID: 20190328205322.852-1-stephen@networkplumber.org
State: Rejected, archived
Delegated to: Ferruh Yigit
Series: [RFC,v2] net: fix rte_vlan_insert with shared mbuf

Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  success  Compilation OK

Commit Message

Stephen Hemminger March 28, 2019, 8:53 p.m. UTC
If the mbuf is shared, rte_vlan_insert() clobbers the original
Ethernet header. This version handles that case by allocating
an mbuf that holds the new Ethernet and VLAN header, followed
by another (cloned) mbuf for the data.

Fixes: c974021a5949 ("ether: add soft vlan encap/decap")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
v2 - compile tested only, do copy/clone.

 lib/librte_net/rte_ether.h | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)
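
The idea in the commit message (a private header segment chained in front of
a view of the shared data) can be sketched without DPDK. This is only a
simplified model under stated assumptions: `struct seg`,
`split_shared_frame()`, `chain_len()` and `free_chain()` are hypothetical
names, plain `calloc()` stands in for the mbuf pool, and reference counting
is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_LEN  14u  /* Ethernet header: two MAC addresses plus EtherType */
#define HDR_ROOM 32u  /* private room for the header and a 4-byte VLAN tag */

/* Hypothetical stand-in for an mbuf segment; not the DPDK structure. */
struct seg {
	uint8_t priv[HDR_ROOM]; /* private bytes, used only by the header segment */
	const uint8_t *data;    /* first valid byte of this segment */
	size_t len;             /* number of valid bytes in this segment */
	struct seg *next;       /* next segment in the chain, or NULL */
};

/*
 * Build a two-segment chain over a frame that other owners still reference:
 * a private copy of the Ethernet header, then a read-only view of the rest.
 * Returns NULL on bad input or allocation failure.
 */
static struct seg *split_shared_frame(const uint8_t *frame, size_t frame_len)
{
	struct seg *hdr, *payload;

	if (frame == NULL || frame_len < HDR_LEN)
		return NULL;

	hdr = calloc(1, sizeof(*hdr));
	if (hdr == NULL)
		return NULL;

	payload = calloc(1, sizeof(*payload));
	if (payload == NULL) {
		free(hdr); /* unwind the header allocation, as the patch does */
		return NULL;
	}

	/* Copy the header into private storage so it can be rewritten safely. */
	memcpy(hdr->priv, frame, HDR_LEN);
	hdr->data = hdr->priv;
	hdr->len = HDR_LEN;
	hdr->next = payload;

	/* The payload view skips the header and never writes to the frame. */
	payload->data = frame + HDR_LEN;
	payload->len = frame_len - HDR_LEN;
	payload->next = NULL;

	return hdr;
}

/* Total number of bytes across all segments of the chain. */
static size_t chain_len(const struct seg *s)
{
	size_t total = 0;

	for (; s != NULL; s = s->next)
		total += s->len;
	return total;
}

/* Release every segment; the shared frame itself is not owned by the chain. */
static void free_chain(struct seg *s)
{
	while (s != NULL) {
		struct seg *next = s->next;

		free(s);
		s = next;
	}
}
```

In this model a write to the header segment's `priv` bytes leaves the shared
frame untouched, which is the property the patch is after. The price is that
the packet becomes multi-segment, so whatever consumes it must be able to
handle a chain.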
  

Comments

Chas Williams March 30, 2019, 12:41 p.m. UTC | #1
Unfortunately, I think the complete fix is more complicated than this.
Drivers that use rte_vlan_insert don't anticipate that the mbuf might
change and that (hardware) transmit can fail.

They make a copy of the mbuf pointer from the incoming transmit list
and don't update the original if rte_vlan_insert creates a new mbuf.
If transmit fails, the application needs to be given the new mbuf
for re-transmission or freeing.
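
The concern above can be illustrated with a small model of a transmit loop.
It is a sketch only, not code from any DPDK driver: `struct pkt`,
`insert_may_replace()` and `tx_burst_writeback()` are hypothetical names, the
`ring_free` parameter simulates a hardware ring that fills up, and ownership
of the replaced packet is deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet type standing in for an mbuf. */
struct pkt {
	int tagged;
};

#define SPARE_PKTS 8
static struct pkt spare[SPARE_PKTS]; /* stand-in for a packet pool */
static size_t spare_used;

/* Hypothetical insert that always substitutes a new packet for *p. */
static int insert_may_replace(struct pkt **p)
{
	struct pkt *fresh;

	if (spare_used == SPARE_PKTS)
		return -1;
	fresh = &spare[spare_used++];
	fresh->tagged = 1;
	*p = fresh;
	return 0;
}

/*
 * Transmit loop that passes the address of the caller's array slot to the
 * insert helper instead of a local copy of the pointer. If transmit stops
 * early, bufs[i] already names the packet the caller must retry or free.
 */
static uint16_t tx_burst_writeback(struct pkt **bufs, uint16_t nb_pkts,
				   uint16_t ring_free)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		if (insert_may_replace(&bufs[i]) != 0)
			break;
		if (i >= ring_free)
			break; /* simulated transmit failure: ring is full */
	}
	return i; /* packets [i, nb_pkts) stay with the caller */
}
```

A loop that did `struct pkt *p = bufs[i]; insert_may_replace(&p);` would
return the same count, but the caller's array would still hold the old
pointer for the packet that failed to transmit.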

On 3/28/19 4:53 PM, Stephen Hemminger wrote:
> If mbuf is shared then rte_vlan_insert() would clobber the original
> Ethernet header. The changed version handles this by getting
> an mbuf that will hold the new Ethernet and VLAN header followed
> by another mbuf (cloned) for the data.
> 
> Fixes: c974021a5949 ("ether: add soft vlan encap/decap")
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
  
Stephen Hemminger April 4, 2019, 11:54 p.m. UTC | #2
On Sat, 30 Mar 2019 08:41:33 -0400
Chas Williams <3chas3@gmail.com> wrote:

> Unfortunately, I think the complete fix is more complicated than this.
> Drivers that use rte_vlan_insert don't anticipate that the mbuf might
> change and that (hardware) transmit can fail.
> 
> They make a copy of the mbuf pointer from the incoming transmit list
> and don't update the original if rte_vlan_insert creates a new mbuf.
> If transmit fails, the application needs to be given the new mbuf
> for re-transmission or freeing.
> 
> On 3/28/19 4:53 PM, Stephen Hemminger wrote:
> > If mbuf is shared then rte_vlan_insert() would clobber the original
> > Ethernet header. The changed version handles this by getting
> > an mbuf that will hold the new Ethernet and VLAN header followed
> > by another mbuf (cloned) for the data.
> > 
> > Fixes: c974021a5949 ("ether: add soft vlan encap/decap")
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>

The virtio driver is buggy: it saves the original mbuf pointer and doesn't
handle the rewrite. Dpaa2 is fine; the rewrite should never happen there
since it checks refcnt etc.
Netvsc PMD uses this on the rx path and is safe.
AF_packet PMD should work fine as well.

So virtio is the one that needs fixing.
  
Chas Williams April 6, 2019, 11:11 p.m. UTC | #3
On 4/4/19 7:54 PM, Stephen Hemminger wrote:
> On Sat, 30 Mar 2019 08:41:33 -0400
> Chas Williams <3chas3@gmail.com> wrote:
> 
>> Unfortunately, I think the complete fix is more complicated than this.
>> Drivers that use rte_vlan_insert don't anticipate that the mbuf might
>> change and that (hardware) transmit can fail.
>>
>> They make a copy of the mbuf pointer from the incoming transmit list
>> and don't update the original if rte_vlan_insert creates a new mbuf.
>> If transmit fails, the application needs to be given the new mbuf
>> for re-transmission or freeing.
>>
>> On 3/28/19 4:53 PM, Stephen Hemminger wrote:
>>> If mbuf is shared then rte_vlan_insert() would clobber the original
>>> Ethernet header. The changed version handles this by getting
>>> an mbuf that will hold the new Ethernet and VLAN header followed
>>> by another mbuf (cloned) for the data.
>>>
>>> Fixes: c974021a5949 ("ether: add soft vlan encap/decap")
>>> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> 
> The virtio driver is buggy, it saves original mbuf and doesn't handle
> rewrite. Dpaa2 is fine, should never happen since it checks refcnt etc.
> Netvsc PMD is using this on rx path and is safe.
> AF_packet PMD should work fine as well.

af_packet is broken as well. It needs to defer rte_vlan_insert until
after it gets the next incoming frame.  Otherwise the break can exit
the loop early.
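
One reading of this point is that the packet should only be modified once
the loop can no longer bail out for lack of a ring slot. The sketch below
models that ordering. It is not the af_packet source: `struct frame`,
`insert_tag()`, `tx_burst_deferred()` and the `slots_free` parameter are all
assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical packet type standing in for an mbuf. */
struct frame {
	int tagged;
};

static unsigned int insert_calls; /* counts how often a packet was modified */

/* Hypothetical VLAN insert: marks the packet as tagged. */
static int insert_tag(struct frame **f)
{
	insert_calls++;
	(*f)->tagged = 1;
	return 0;
}

/*
 * Transmit loop that checks for a free ring slot before touching the
 * packet. Leaving the loop at step 1 leaves bufs[i] exactly as the caller
 * passed it in.
 */
static uint16_t tx_burst_deferred(struct frame **bufs, uint16_t nb_pkts,
				  uint16_t slots_free)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		/* Step 1: confirm a slot exists; this is the early exit. */
		if (i >= slots_free)
			break;
		/* Step 2: only now modify the packet. */
		if (insert_tag(&bufs[i]) != 0)
			break;
	}
	return i;
}
```

With the two steps swapped, the early exit would hand back a packet that
had already been modified or replaced.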

> So virtio is the one that needs fixing

I agree with this.
  
Ferruh Yigit April 12, 2019, 4:35 p.m. UTC | #4
On 3/28/2019 8:53 PM, Stephen Hemminger wrote:
> If mbuf is shared then rte_vlan_insert() would clobber the original
> Ethernet header. The changed version handles this by getting
> an mbuf that will hold the new Ethernet and VLAN header followed
> by another mbuf (cloned) for the data.

Hi Stephen, Chas,

Since this patch is still an RFC and we are past RC1, I think it is better to
consider it for the next release.

For this release, what do you think about a patch that makes
'rte_vlan_insert()' return an error for shared mbufs?
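
A minimal model of that alternative, which refuses the operation instead of
copying, might look like the following. It is a sketch under assumptions and
not DPDK code: `struct mbuf_model` and `vlan_insert_checked()` are
hypothetical, the error codes are borrowed from the doc comment in the patch
(-EPERM, -ENOSPC), and treating indirect mbufs like shared ones follows the
condition used in the RFC.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define VLAN_HDR_LEN 4u /* 802.1Q tag: TCI plus encapsulated EtherType */

/* Hypothetical, reduced view of the mbuf fields the check needs. */
struct mbuf_model {
	uint16_t refcnt;   /* number of owners of the data buffer */
	bool indirect;     /* true if the data belongs to another mbuf */
	uint16_t headroom; /* free bytes in front of the packet data */
};

/*
 * Reject shared or indirect buffers up front so the header is never
 * rewritten under another owner; otherwise consume headroom for the tag.
 */
static int vlan_insert_checked(struct mbuf_model *m)
{
	if (m->indirect || m->refcnt > 1)
		return -EPERM;  /* overwriting would be unsafe */
	if (m->headroom < VLAN_HDR_LEN)
		return -ENOSPC; /* not enough headroom for the tag */

	m->headroom = (uint16_t)(m->headroom - VLAN_HDR_LEN);
	return 0;
}
```

The mbuf pointer never changes in this variant, so callers that keep their
own copy of it stay correct. The cost is that the caller has to handle the
error, for example by making a private copy of the packet first.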


  
Ferruh Yigit July 4, 2019, 6:40 p.m. UTC | #5
On 4/12/2019 5:35 PM, Ferruh Yigit wrote:
> On 3/28/2019 8:53 PM, Stephen Hemminger wrote:
>> If mbuf is shared then rte_vlan_insert() would clobber the original
>> Ethernet header. The changed version handles this by getting
>> an mbuf that will hold the new Ethernet and VLAN header followed
>> by another mbuf (cloned) for the data.
> 
> Hi Stephen, Chas,
> 
> Since this patch is still an RFC and we are past RC1, I think it is better to
> consider it for the next release.
> 
> For this release, what do you think about a patch that makes
> 'rte_vlan_insert()' return an error for shared mbufs?

Nack for this patch.

The other approach, which does not support shared mbufs, was merged:
https://git.dpdk.org/dpdk/commit/?id=15a74163b12
  

Patch

diff --git a/lib/librte_net/rte_ether.h b/lib/librte_net/rte_ether.h
index c2c5e249ffe9..5fc306e5d08c 100644
--- a/lib/librte_net/rte_ether.h
+++ b/lib/librte_net/rte_ether.h
@@ -374,10 +374,10 @@  static inline int rte_vlan_strip(struct rte_mbuf *m)
  * Software version of VLAN unstripping
  *
  * @param m
- *   The packet mbuf.
+ *   Pointer to the packet mbuf.
  * @return
  *   - 0: On success
- *   -EPERM: mbuf is is shared overwriting would be unsafe
+ *   -ENOMEM: could not allocate mbuf for header
  *   -ENOSPC: not enough headroom in mbuf
  */
 static inline int rte_vlan_insert(struct rte_mbuf **m)
@@ -385,15 +385,34 @@  static inline int rte_vlan_insert(struct rte_mbuf **m)
 	struct ether_hdr *oh, *nh;
 	struct vlan_hdr *vh;
 
-	/* Can't insert header if mbuf is shared */
-	if (rte_mbuf_refcnt_read(*m) > 1) {
-		struct rte_mbuf *copy;
+	/* Can't safely directly insert header if mbuf is shared or indirect */
+	if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1) {
+		struct rte_mempool *mp = (*m)->pool;
+		struct rte_mbuf *md, *mh;
+
+		mh = rte_pktmbuf_alloc(mp);
+		if (unlikely(mh == NULL))
+			return -ENOMEM;
+
+		mh->tx_offload = (*m)->tx_offload;
+		mh->vlan_tci = (*m)->vlan_tci;
+		mh->vlan_tci_outer = (*m)->vlan_tci_outer;
+		mh->port = (*m)->port;
+		mh->ol_flags = (*m)->ol_flags;
+		mh->packet_type = (*m)->packet_type;
 
-		copy = rte_pktmbuf_clone(*m, (*m)->pool);
-		if (unlikely(copy == NULL))
+		md = rte_pktmbuf_clone(*m, mp);
+		if (unlikely(md == NULL)) {
+			rte_pktmbuf_free(mh);
 			return -ENOMEM;
-		rte_pktmbuf_free(*m);
-		*m = copy;
+		}
+
+		mh->next = md;
+		mh->nb_segs = md->nb_segs + 1;
+		memcpy(rte_pktmbuf_append(mh, ETHER_HDR_LEN),
+		       rte_pktmbuf_mtod(md, void *), ETHER_HDR_LEN);
+		rte_pktmbuf_adj(md, ETHER_HDR_LEN);
+		*m = mh;
 	}
 
 	oh = rte_pktmbuf_mtod(*m, struct ether_hdr *);