[DPDK] net/e1000: fix buffer overrun while i219 processing DMA transactions

Message ID: 1562593002-36586-1-git-send-email-xiao.zhang@intel.com (mailing list archive)
State: Superseded, archived
Delegated to: Qi Zhang
Series: [DPDK] net/e1000: fix buffer overrun while i219 processing DMA transactions

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/mellanox-Performance-Testing  success  Performance Testing PASS
ci/intel-Performance-Testing     success  Performance Testing PASS
ci/Intel-compilation             success  Compilation OK

Commit Message

Xiao Zhang July 8, 2019, 1:36 p.m. UTC
Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, which in some
high-performance cases causes a buffer overrun while the I219 LAN
Connected Device is processing the DMA transactions. I219LM and I219V
devices can fall into an unrecovered Tx hang under very stressful UDP
traffic combined with repeated reconnection of the Ethernet cable. The
LAN Controller recovers from this Tx hang only if the system is
rebooted. Work around the issue by slightly slowing down DMA access:
reduce the number of outstanding requests from 3 to 2.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates the performance loss for TCP
traffic without a noticeable impact on CPU performance.

Please refer to the I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html

Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
---
 drivers/net/e1000/base/e1000_ich8lan.h |  1 +
 drivers/net/e1000/igb_rxtx.c           | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)
  

Comments

Qi Zhang July 19, 2019, 5:44 a.m. UTC | #1
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Xiao Zhang
> Sent: Monday, July 8, 2019 9:37 PM
> To: dev@dpdk.org
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhao1, Wei
> <wei.zhao1@intel.com>; Zhang, Xiao <xiao.zhang@intel.com>
> Subject: [dpdk-dev] [DPDK] net/e1000: fix buffer overrun while i219
> processing DMA transactions
> 
> Intel® 100/200 Series Chipset platforms reduced the round-trip latency for the
> LAN Controller DMA accesses, causing in some high performance cases a buffer
> overrun while the I219 LAN Connected Device is processing the DMA
> transactions. I219LM and I219V devices can fall into unrecovered Tx hang
> under very stressfully UDP traffic and multiple reconnection of Ethernet cable.
> This Tx hang of the LAN Controller is only recovered if the system is rebooted.
> Slightly slow down DMA access by reducing the number of outstanding
> requests.
> This workaround could have an impact on TCP traffic performance on the
> platform. Disabling TSO eliminates performance loss for TCP traffic without a
> noticeable impact on CPU performance.
> 
> Please, refer to I218/I219 specification update:
> https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
> 
> Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>

Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi
  
Anand H. Krishnan July 20, 2019, 2:56 a.m. UTC | #2
This seems to be changing the IGB driver. Shouldn't you be changing
the em driver rather than the igb driver?

Thanks,
Anand

On Mon, Jul 8, 2019 at 10:10 AM Xiao Zhang <xiao.zhang@intel.com> wrote:
>
> Intel® 100/200 Series Chipset platforms reduced the round-trip
> latency for the LAN Controller DMA accesses, causing in some high
> performance cases a buffer overrun while the I219 LAN Connected
> Device is processing the DMA transactions. I219LM and I219V devices
> can fall into unrecovered Tx hang under very stressfully UDP traffic
> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
> Controller is only recovered if the system is rebooted. Slightly slow
> down DMA access by reducing the number of outstanding requests.
> This workaround could have an impact on TCP traffic performance
> on the platform. Disabling TSO eliminates performance loss for TCP
> traffic without a noticeable impact on CPU performance.
>
> Please, refer to I218/I219 specification update:
> https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
>
> Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
> ---
>  drivers/net/e1000/base/e1000_ich8lan.h |  1 +
>  drivers/net/e1000/igb_rxtx.c           | 16 ++++++++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/drivers/net/e1000/base/e1000_ich8lan.h b/drivers/net/e1000/base/e1000_ich8lan.h
> index 1f2a3f8..084eb9c 100644
> --- a/drivers/net/e1000/base/e1000_ich8lan.h
> +++ b/drivers/net/e1000/base/e1000_ich8lan.h
> @@ -134,6 +134,7 @@ POSSIBILITY OF SUCH DAMAGE.
>  #define E1000_FLASH_BASE_ADDR 0xE000 /*offset of NVM access regs*/
>  #define E1000_CTRL_EXT_NVMVS 0x3 /*NVM valid sector */
>  #define E1000_TARC0_CB_MULTIQ_3_REQ    (1 << 28 | 1 << 29)
> +#define E1000_TARC0_CB_MULTIQ_2_REQ    (1 << 29)
>  #define PCIE_ICH8_SNOOP_ALL    PCIE_NO_SNOOP_ALL
>
>  #define E1000_ICH_RAR_ENTRIES  7
> diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
> index 33eeb4e..5d45e62 100644
> --- a/drivers/net/e1000/igb_rxtx.c
> +++ b/drivers/net/e1000/igb_rxtx.c
> @@ -2627,6 +2627,22 @@ eth_igb_tx_init(struct rte_eth_dev *dev)
>
>         e1000_config_collision_dist(hw);
>
> +       /* SPT and CNP Si errata workaround to avoid data corruption */
> +       if (hw->mac.type == e1000_pch_spt) {
> +               uint32_t reg_val;
> +               reg_val = E1000_READ_REG(hw, E1000_IOSFPC);
> +               reg_val |= E1000_RCTL_RDMTS_HEX;
> +               E1000_WRITE_REG(hw, E1000_IOSFPC, reg_val);
> +
> +               /* Dropping the number of outstanding requests from
> +                * 3 to 2 in order to avoid a buffer overrun.
> +                */
> +               reg_val = E1000_READ_REG(hw, E1000_TARC(0));
> +               reg_val &= ~E1000_TARC0_CB_MULTIQ_3_REQ;
> +               reg_val |= E1000_TARC0_CB_MULTIQ_2_REQ;
> +               E1000_WRITE_REG(hw, E1000_TARC(0), reg_val);
> +       }
> +
>         /* This write will effectively turn on the transmit unit. */
>         E1000_WRITE_REG(hw, E1000_TCTL, tctl);
>  }
> --
> 2.7.4
>
  
Xiao Zhang July 20, 2019, 8:17 a.m. UTC | #3
> -----Original Message-----
> From: Anand H. Krishnan [mailto:anandhkrishnan@gmail.com]
> Sent: Saturday, July 20, 2019 10:57 AM
> To: Zhang, Xiao <xiao.zhang@intel.com>
> Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhao1, Wei
> <wei.zhao1@intel.com>
> Subject: Re: [dpdk-dev] [DPDK] net/e1000: fix buffer overrun while i219
> processing DMA transactions
> 
> This seems to be changing the IGB driver. Shouldn't you be changing the em
> driver rather than the igb driver?

Yes, the fix has been moved to the em driver in the v2 patch. Thanks.

> Thanks,
> Anand
> 
> On Mon, Jul 8, 2019 at 10:10 AM Xiao Zhang <xiao.zhang@intel.com> wrote:
> >
> > Intel® 100/200 Series Chipset platforms reduced the round-trip latency
> > for the LAN Controller DMA accesses, causing in some high performance
> > cases a buffer overrun while the I219 LAN Connected Device is
> > processing the DMA transactions. I219LM and I219V devices can fall
> > into unrecovered Tx hang under very stressfully UDP traffic and
> > multiple reconnection of Ethernet cable. This Tx hang of the LAN
> > Controller is only recovered if the system is rebooted. Slightly slow
> > down DMA access by reducing the number of outstanding requests.
> > This workaround could have an impact on TCP traffic performance on the
> > platform. Disabling TSO eliminates performance loss for TCP traffic
> > without a noticeable impact on CPU performance.
> >
> > Please, refer to I218/I219 specification update:
> > https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
> >
> > Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
> > ---
> >  drivers/net/e1000/base/e1000_ich8lan.h |  1 +
> >  drivers/net/e1000/igb_rxtx.c           | 16 ++++++++++++++++
> >  2 files changed, 17 insertions(+)
> >
> > diff --git a/drivers/net/e1000/base/e1000_ich8lan.h b/drivers/net/e1000/base/e1000_ich8lan.h
> > index 1f2a3f8..084eb9c 100644
> > --- a/drivers/net/e1000/base/e1000_ich8lan.h
> > +++ b/drivers/net/e1000/base/e1000_ich8lan.h
> > @@ -134,6 +134,7 @@ POSSIBILITY OF SUCH DAMAGE.
> >  #define E1000_FLASH_BASE_ADDR 0xE000 /*offset of NVM access regs*/
> >  #define E1000_CTRL_EXT_NVMVS 0x3 /*NVM valid sector */
> >  #define E1000_TARC0_CB_MULTIQ_3_REQ    (1 << 28 | 1 << 29)
> > +#define E1000_TARC0_CB_MULTIQ_2_REQ    (1 << 29)
> >  #define PCIE_ICH8_SNOOP_ALL    PCIE_NO_SNOOP_ALL
> >
> >  #define E1000_ICH_RAR_ENTRIES  7
> > diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
> > index 33eeb4e..5d45e62 100644
> > --- a/drivers/net/e1000/igb_rxtx.c
> > +++ b/drivers/net/e1000/igb_rxtx.c
> > @@ -2627,6 +2627,22 @@ eth_igb_tx_init(struct rte_eth_dev *dev)
> >
> >         e1000_config_collision_dist(hw);
> >
> > +       /* SPT and CNP Si errata workaround to avoid data corruption */
> > +       if (hw->mac.type == e1000_pch_spt) {
> > +               uint32_t reg_val;
> > +               reg_val = E1000_READ_REG(hw, E1000_IOSFPC);
> > +               reg_val |= E1000_RCTL_RDMTS_HEX;
> > +               E1000_WRITE_REG(hw, E1000_IOSFPC, reg_val);
> > +
> > +               /* Dropping the number of outstanding requests from
> > +                * 3 to 2 in order to avoid a buffer overrun.
> > +                */
> > +               reg_val = E1000_READ_REG(hw, E1000_TARC(0));
> > +               reg_val &= ~E1000_TARC0_CB_MULTIQ_3_REQ;
> > +               reg_val |= E1000_TARC0_CB_MULTIQ_2_REQ;
> > +               E1000_WRITE_REG(hw, E1000_TARC(0), reg_val);
> > +       }
> > +
> >         /* This write will effectively turn on the transmit unit. */
> >         E1000_WRITE_REG(hw, E1000_TCTL, tctl);
> >  }
> > --
> > 2.7.4
> >
  
Zhao1, Wei July 26, 2019, 2:52 a.m. UTC | #4
Acked-by: Wei Zhao <wei.zhao1@intel.com>


> -----Original Message-----
> From: Zhang, Xiao
> Sent: Monday, July 8, 2019 9:37 PM
> To: dev@dpdk.org
> Cc: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhao1, Wei <wei.zhao1@intel.com>;
> Zhang, Xiao <xiao.zhang@intel.com>
> Subject: [DPDK] net/e1000: fix buffer overrun while i219 processing DMA
> transactions
> 
> Intel® 100/200 Series Chipset platforms reduced the round-trip latency for the
> LAN Controller DMA accesses, causing in some high performance cases a
> buffer overrun while the I219 LAN Connected Device is processing the DMA
> transactions. I219LM and I219V devices can fall into unrecovered Tx hang
> under very stressfully UDP traffic and multiple reconnection of Ethernet cable.
> This Tx hang of the LAN Controller is only recovered if the system is rebooted.
> Slightly slow down DMA access by reducing the number of outstanding requests.
> This workaround could have an impact on TCP traffic performance on the
> platform. Disabling TSO eliminates performance loss for TCP traffic without a
> noticeable impact on CPU performance.
> 
> Please, refer to I218/I219 specification update:
> https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
> 
> Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
> ---
>  drivers/net/e1000/base/e1000_ich8lan.h |  1 +
>  drivers/net/e1000/igb_rxtx.c           | 16 ++++++++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/net/e1000/base/e1000_ich8lan.h b/drivers/net/e1000/base/e1000_ich8lan.h
> index 1f2a3f8..084eb9c 100644
> --- a/drivers/net/e1000/base/e1000_ich8lan.h
> +++ b/drivers/net/e1000/base/e1000_ich8lan.h
> @@ -134,6 +134,7 @@ POSSIBILITY OF SUCH DAMAGE.
>  #define E1000_FLASH_BASE_ADDR 0xE000 /*offset of NVM access regs*/
>  #define E1000_CTRL_EXT_NVMVS 0x3 /*NVM valid sector */
>  #define E1000_TARC0_CB_MULTIQ_3_REQ	(1 << 28 | 1 << 29)
> +#define E1000_TARC0_CB_MULTIQ_2_REQ	(1 << 29)
>  #define PCIE_ICH8_SNOOP_ALL	PCIE_NO_SNOOP_ALL
> 
>  #define E1000_ICH_RAR_ENTRIES	7
> diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
> index 33eeb4e..5d45e62 100644
> --- a/drivers/net/e1000/igb_rxtx.c
> +++ b/drivers/net/e1000/igb_rxtx.c
> @@ -2627,6 +2627,22 @@ eth_igb_tx_init(struct rte_eth_dev *dev)
> 
>  	e1000_config_collision_dist(hw);
> 
> +	/* SPT and CNP Si errata workaround to avoid data corruption */
> +	if (hw->mac.type == e1000_pch_spt) {
> +		uint32_t reg_val;
> +		reg_val = E1000_READ_REG(hw, E1000_IOSFPC);
> +		reg_val |= E1000_RCTL_RDMTS_HEX;
> +		E1000_WRITE_REG(hw, E1000_IOSFPC, reg_val);
> +
> +		/* Dropping the number of outstanding requests from
> +		 * 3 to 2 in order to avoid a buffer overrun.
> +		 */
> +		reg_val = E1000_READ_REG(hw, E1000_TARC(0));
> +		reg_val &= ~E1000_TARC0_CB_MULTIQ_3_REQ;
> +		reg_val |= E1000_TARC0_CB_MULTIQ_2_REQ;
> +		E1000_WRITE_REG(hw, E1000_TARC(0), reg_val);
> +	}
> +
>  	/* This write will effectively turn on the transmit unit. */
>  	E1000_WRITE_REG(hw, E1000_TCTL, tctl);
>  }
> --
> 2.7.4
  

Patch

diff --git a/drivers/net/e1000/base/e1000_ich8lan.h b/drivers/net/e1000/base/e1000_ich8lan.h
index 1f2a3f8..084eb9c 100644
--- a/drivers/net/e1000/base/e1000_ich8lan.h
+++ b/drivers/net/e1000/base/e1000_ich8lan.h
@@ -134,6 +134,7 @@  POSSIBILITY OF SUCH DAMAGE.
 #define E1000_FLASH_BASE_ADDR 0xE000 /*offset of NVM access regs*/
 #define E1000_CTRL_EXT_NVMVS 0x3 /*NVM valid sector */
 #define E1000_TARC0_CB_MULTIQ_3_REQ	(1 << 28 | 1 << 29)
+#define E1000_TARC0_CB_MULTIQ_2_REQ	(1 << 29)
 #define PCIE_ICH8_SNOOP_ALL	PCIE_NO_SNOOP_ALL
 
 #define E1000_ICH_RAR_ENTRIES	7
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 33eeb4e..5d45e62 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -2627,6 +2627,22 @@  eth_igb_tx_init(struct rte_eth_dev *dev)
 
 	e1000_config_collision_dist(hw);
 
+	/* SPT and CNP Si errata workaround to avoid data corruption */
+	if (hw->mac.type == e1000_pch_spt) {
+		uint32_t reg_val;
+		reg_val = E1000_READ_REG(hw, E1000_IOSFPC);
+		reg_val |= E1000_RCTL_RDMTS_HEX;
+		E1000_WRITE_REG(hw, E1000_IOSFPC, reg_val);
+
+		/* Dropping the number of outstanding requests from
+		 * 3 to 2 in order to avoid a buffer overrun.
+		 */
+		reg_val = E1000_READ_REG(hw, E1000_TARC(0));
+		reg_val &= ~E1000_TARC0_CB_MULTIQ_3_REQ;
+		reg_val |= E1000_TARC0_CB_MULTIQ_2_REQ;
+		E1000_WRITE_REG(hw, E1000_TARC(0), reg_val);
+	}
+
 	/* This write will effectively turn on the transmit unit. */
 	E1000_WRITE_REG(hw, E1000_TCTL, tctl);
 }