[dpdk-dev] Segmentation fault in ixgbe_rxtx_vec.c:444 with 1.8.0

Message ID 20150121134921.GA2592@bricha3-MOBL3 (mailing list archive)
State Not Applicable, archived

Commit Message

Bruce Richardson Jan. 21, 2015, 1:49 p.m. UTC
On Tue, Jan 20, 2015 at 11:39:03AM +0100, Martin Weiser wrote:
> Hi again,
> 
> I did some further testing and it seems like this issue is linked to
> jumbo frames. I think a similar issue has already been reported by
> Prashant Upadhyaya with the subject 'Packet Rx issue with DPDK1.8'.
> In our application we use the following rxmode port configuration:
> 
> .mq_mode    = ETH_MQ_RX_RSS,
> .split_hdr_size = 0,
> .header_split   = 0,
> .hw_ip_checksum = 1,
> .hw_vlan_filter = 0,
> .jumbo_frame    = 1,
> .hw_strip_crc   = 1,
> .max_rx_pkt_len = 9000,
> 
> and the mbuf size is calculated like the following:
> 
> (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
> 
> This works fine with DPDK 1.7 and jumbo frames are split into buffer
> chains and can be forwarded on another port without a problem.
> With DPDK 1.8 and the default configuration (CONFIG_RTE_IXGBE_INC_VECTOR
> enabled) the application sometimes crashes as described in my first
> mail and sometimes packet receiving stops, with subsequently arriving
> packets counted as rx errors. When CONFIG_RTE_IXGBE_INC_VECTOR is
> disabled the packet processing also comes to a halt as soon as jumbo
> frames arrive, with the slightly different effect that now
> rte_eth_tx_burst refuses to send any previously received packets.
> 
> Is there anything special to consider regarding jumbo frames when moving
> from DPDK 1.7 to 1.8 that we might have missed?
> 
> Martin
> 
> 
> 
> On 19.01.15 11:26, Martin Weiser wrote:
> > Hi everybody,
> >
> > we quite recently updated one of our applications to DPDK 1.8.0 and are
> > now seeing a segmentation fault in ixgbe_rxtx_vec.c:444 after a few minutes.
> > I just did some quick debugging and I only have a very limited
> > understanding of the code in question but it seems that the 'continue'
> > in line 445 without increasing 'buf_idx' might cause the problem. In one
> > debugging session when the crash occurred the value of 'buf_idx' was 2
> > and the value of 'pkt_idx' was 8965.
> > Any help with this issue would be greatly appreciated. If you need any
> > further information just let me know.
> >
> > Martin
> >
> >
> 
Hi Martin, Prashant,

I've managed to reproduce the issue here and had a look at it. Could you
both perhaps try the proposed change below and see if it fixes the problem for
you and gives you a working system? If so, I'll submit this officially as a
patch; if not, it's back to the drawing board. :-)



Regards,
/Bruce
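
For context on why the split-packet path matters with Martin's settings: the
mbuf size he quotes leaves 2048 bytes of data room per buffer, so a frame at
the configured max_rx_pkt_len of 9000 cannot fit in one mbuf and has to arrive
as a chain. A rough sketch of that arithmetic follows; segs_needed() is a
hypothetical helper, not a DPDK function, and it assumes every segment except
the last is filled to the full data room.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of DPDK): number of mbuf segments needed
 * to hold frame_len bytes when each mbuf offers data_room bytes.
 * Assumes all segments but the last are completely filled. */
static inline uint32_t
segs_needed(uint32_t frame_len, uint32_t data_room)
{
	assert(data_room > 0);
	/* integer ceiling division */
	return (frame_len + data_room - 1) / data_room;
}
```

With the values from the thread, segs_needed(9000, 2048) gives 5, whereas a
standard 1500-byte frame gives 1 and never needs chaining.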
  

Comments

Prashant Upadhyaya Jan. 22, 2015, 2:05 p.m. UTC | #1
Hi Bruce,

I am afraid your patch did not work for me. In my case I am not trying to
receive jumbo frames but normal frames, and they are not delivered to my
application. Further, the function you patched is never invoked in my
use case.

Regards
-Prashant
  
Bruce Richardson Jan. 22, 2015, 3:19 p.m. UTC | #2
On Thu, Jan 22, 2015 at 07:35:45PM +0530, Prashant Upadhyaya wrote:
> Hi Bruce,
> 
> I am afraid your patch did not work for me. In my case I am not trying to
> receive jumbo frames but normal frames, and they are not delivered to my
> application. Further, the function you patched is never invoked in my
> use case.
> 
> Regards
> -Prashant

Hi Prashant,

Can your problem be reproduced using testpmd? If so, could you send me the
testpmd command line and the traffic profile needed to reproduce the issue?

Thanks,
/Bruce
  

Patch

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index b54cb19..dfaccee 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -402,10 +402,10 @@  reassemble_packets(struct igb_rx_queue *rxq, struct rte_mbuf **rx_bufs,
        struct rte_mbuf *pkts[RTE_IXGBE_VPMD_RX_BURST]; /*finished pkts*/
        struct rte_mbuf *start = rxq->pkt_first_seg;
        struct rte_mbuf *end =  rxq->pkt_last_seg;
-       unsigned pkt_idx = 0, buf_idx = 0;
+       unsigned pkt_idx, buf_idx;


-       while (buf_idx < nb_bufs) {
+       for (buf_idx = 0, pkt_idx = 0; buf_idx < nb_bufs; buf_idx++) {
                if (end != NULL) {
                        /* processing a split packet */
                        end->next = rx_bufs[buf_idx];
@@ -448,7 +448,6 @@  reassemble_packets(struct igb_rx_queue *rxq, struct rte_mbuf **rx_bufs,
                        rx_bufs[buf_idx]->data_len += rxq->crc_len;
                        rx_bufs[buf_idx]->pkt_len += rxq->crc_len;
                }
-               buf_idx++;
        }

        /* save the partial packet for next time */
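
The effect of moving the increment into the for header can be seen in a
stripped-down model of the loop. This is only a sketch based on Martin's
observation that the 'continue' is reached without increasing 'buf_idx'; it
ignores mbuf chaining and CRC handling entirely, and walk_while(), walk_for(),
struct loop_result, BURST and the max_iters guard are all hypothetical names
introduced for illustration, not driver code.

```c
#include <assert.h>

#define BURST 32	/* hypothetical stand-in for RTE_IXGBE_VPMD_RX_BURST */

struct loop_result {
	unsigned pkt_idx;	/* packets "stored" so far */
	unsigned buf_idx;	/* buffer index when the loop stopped */
};

/* Pre-patch shape: buf_idx advances only at the bottom of the body, so
 * `continue` skips the increment. max_iters is a demo-only guard that
 * stops what would otherwise be an endless loop. */
static struct loop_result
walk_while(const unsigned char *split_flags, unsigned nb_bufs,
	   unsigned max_iters)
{
	struct loop_result r = { 0, 0 };
	unsigned iters = 0;

	assert(nb_bufs <= BURST);
	while (r.buf_idx < nb_bufs) {
		if (iters++ >= max_iters)
			break;
		if (!split_flags[r.buf_idx]) {
			r.pkt_idx++;	/* models pkts[pkt_idx++] = rx_bufs[buf_idx] */
			continue;	/* buf_idx is not advanced on this path */
		}
		r.buf_idx++;
	}
	return r;
}

/* Post-patch shape: the increment lives in the for header and therefore
 * also runs after `continue`. */
static struct loop_result
walk_for(const unsigned char *split_flags, unsigned nb_bufs)
{
	struct loop_result r;

	assert(nb_bufs <= BURST);
	for (r.buf_idx = 0, r.pkt_idx = 0; r.buf_idx < nb_bufs; r.buf_idx++) {
		if (!split_flags[r.buf_idx]) {
			r.pkt_idx++;
			continue;
		}
	}
	return r;
}
```

With split flags {1, 1, 0, 0} and max_iters = 1000, walk_while() stops with
buf_idx stuck at 2 and pkt_idx at 998, while walk_for() finishes with
buf_idx = 4 and pkt_idx = 2. In the driver pkt_idx indexes the fixed-size
pkts[] array, so an index that keeps growing would eventually write past its
end, which would be consistent with the buf_idx = 2 and pkt_idx = 8965 values
Martin reported.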