Message ID | 1515810914-18762-1-git-send-email-wei.guo.simon@gmail.com (mailing list archive) |
---|---|
State | Rejected, archived |
Delegated to: | Thomas Monjalon |
Context | Check | Description |
---|---|---|
ci/checkpatch | success | coding style OK |
ci/Intel-compilation | fail | Compilation issues |
Hi,

> -----Original Message-----
> From: wei.guo.simon@gmail.com [mailto:wei.guo.simon@gmail.com]
> Sent: Saturday, January 13, 2018 10:35 AM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Simon Guo <wei.guo.simon@gmail.com>
> Subject: [PATCH v5] app/testpmd: add option ring-bind-lcpu to bind Q with CPU
>
> From: Simon Guo <wei.guo.simon@gmail.com>
>
> Currently the rx/tx queue is allocated from the buffer pool on the socket of:
> - the port's socket, if --port-numa-config is specified
> - or the per-port ring-numa-config setting
>
> All of the above "bind" a queue to a single socket per port configuration.
> But better performance can actually be achieved if one port's queues are
> spread across multiple NUMA nodes, with each rx/tx queue allocated on its
> lcpu's socket.
>
> This patch adds a new option "--ring-bind-lcpu" (no parameter). With this,
> testpmd can utilize the PCIe bus bandwidth on other NUMA nodes.
>
> When the --port-numa-config or --ring-numa-config option is specified, this
> --ring-bind-lcpu option will be suppressed.
>
> Test result:
> 64-byte packets, running on PowerPC with a Mellanox CX-4 card,
> single port (100G), with 8 cores, fw mode:
> - Without this patch: 52.5 Mpps throughput
> - With this patch: 66 Mpps throughput
> ~25% improvement
>
> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>

Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
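To make the queue-binding idea concrete outside of testpmd, here is a minimal sketch in plain DPDK terms. It is not part of the patch: `setup_rxq_near_lcore()` and `lookup_pool()` are hypothetical names, and the fallback-to-port-socket policy is an assumption. The point is simply that the socket passed to `rte_eth_rx_queue_setup()` follows the polling lcore rather than the port.

```c
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mempool.h>

/*
 * Illustrative fragment only (not the patch code): set up an RX queue so that
 * its descriptor ring and mbuf pool live on the NUMA socket of the lcore that
 * will poll it, falling back to the port's own socket if no local pool exists.
 * lookup_pool() is a hypothetical callback returning an mbuf pool per socket.
 */
static int
setup_rxq_near_lcore(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc,
		     unsigned int polling_lcore,
		     struct rte_mempool *(*lookup_pool)(int socket))
{
	int socket = (int)rte_lcore_to_socket_id(polling_lcore);
	struct rte_mempool *mp = lookup_pool(socket);

	if (mp == NULL) {
		/* No local pool: fall back to the port's socket. */
		socket = rte_eth_dev_socket_id(port_id);
		mp = lookup_pool(socket);
		if (mp == NULL)
			return -1;
	}

	/* NULL rx_conf selects the driver's default RX configuration. */
	return rte_eth_rx_queue_setup(port_id, queue_id, nb_desc,
				      (unsigned int)socket, NULL, mp);
}
```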
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of wei.guo.simon@gmail.com
> Sent: Saturday, January 13, 2018 2:35 AM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Simon Guo <wei.guo.simon@gmail.com>
> Subject: [dpdk-dev] [PATCH v5] app/testpmd: add option ring-bind-lcpu to bind Q with CPU
>
> From: Simon Guo <wei.guo.simon@gmail.com>
>
> Currently the rx/tx queue is allocated from the buffer pool on the socket of:
> - the port's socket, if --port-numa-config is specified
> - or the per-port ring-numa-config setting
>
> All of the above "bind" a queue to a single socket per port configuration.
> But better performance can actually be achieved if one port's queues are
> spread across multiple NUMA nodes, with each rx/tx queue allocated on its
> lcpu's socket.
>
> This patch adds a new option "--ring-bind-lcpu" (no parameter). With this,
> testpmd can utilize the PCIe bus bandwidth on other NUMA nodes.
>
> When the --port-numa-config or --ring-numa-config option is specified, this
> --ring-bind-lcpu option will be suppressed.

Instead of introducing one more option - wouldn't it be better to allow the
user to manually define flows and assign them to particular lcores?
Then the user will be able to create any FWD configuration he/she likes.
Something like:
lcore X add flow rxq N,Y txq M,Z

Which would mean - on lcore X receive packets from port=N, rx_queue=Y,
and send them through port=M, tx_queue=Z.
Konstantin

> Test result:
> 64-byte packets, running on PowerPC with a Mellanox CX-4 card,
> single port (100G), with 8 cores, fw mode:
> - Without this patch: 52.5 Mpps throughput
> - With this patch: 66 Mpps throughput
> ~25% improvement
>
> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
> ---
>  app/test-pmd/parameters.c             |  14 ++++-
>  app/test-pmd/testpmd.c                | 112 ++++++++++++++++++++++++----------
>  app/test-pmd/testpmd.h                |   7 +++
>  doc/guides/testpmd_app_ug/run_app.rst |   6 ++
>  4 files changed, 107 insertions(+), 32 deletions(-)
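For readability, the flow-assignment interface proposed above might be used at the interactive prompt roughly as follows. This syntax is only a suggestion made in this review, not an existing testpmd command, and the port/queue numbers are arbitrary:

```
testpmd> lcore 1 add flow rxq 0,0 txq 1,0
testpmd> lcore 2 add flow rxq 1,0 txq 0,0
```

That is, lcore 1 would receive on port 0 / queue 0 and transmit on port 1 / queue 0, and lcore 2 the reverse.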
Hi, Konstantin,

On Tue, Jan 16, 2018 at 12:38:35PM +0000, Ananyev, Konstantin wrote:
> Instead of introducing one more option - wouldn't it be better to allow the
> user to manually define flows and assign them to particular lcores?
> Then the user will be able to create any FWD configuration he/she likes.
> Something like:
> lcore X add flow rxq N,Y txq M,Z
>
> Which would mean - on lcore X receive packets from port=N, rx_queue=Y,
> and send them through port=M, tx_queue=Z.

Thanks for the comment.
Won't that be too complicated a solution for the user, since flows would need
to be defined specifically for each lcore? We might have hundreds of lcores
on current modern platforms.

Thanks,
- Simon
Hi Simon,

> Thanks for the comment.
> Won't that be too complicated a solution for the user, since flows would need
> to be defined specifically for each lcore? We might have hundreds of lcores
> on current modern platforms.

Why for all lcores?
Only for the ones that will do packet forwarding.
Also, if the configuration becomes too complex (or too big) to be done manually,
the user can write a script that will generate the set of testpmd commands
needed to achieve the desired layout.
Konstantin
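As a rough sketch of the "generate the commands with a script" approach (kept in C for consistency with the rest of the thread), the loop below prints one of the proposed flow commands per forwarding lcore. The command syntax is the hypothetical one discussed above, and the lcore/port/queue layout is an assumption:

```c
#include <stdio.h>

/*
 * Hypothetical generator: emit one proposed "add flow" command per
 * forwarding lcore, pairing lcore i with rx/tx queue i of a single port.
 * The lcore ids, port ids and queue layout are assumptions.
 */
int main(void)
{
	const unsigned int nb_fwd_lcores = 8;
	const unsigned int rx_port = 0, tx_port = 0;

	for (unsigned int i = 0; i < nb_fwd_lcores; i++)
		printf("lcore %u add flow rxq %u,%u txq %u,%u\n",
		       i + 1, rx_port, i, tx_port, i);
	return 0;
}
```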
Hi Konstantin,

On Thu, Jan 18, 2018 at 12:14:05PM +0000, Ananyev, Konstantin wrote:
> Why for all lcores?
> Only for the ones that will do packet forwarding.
> Also, if the configuration becomes too complex (or too big) to be done manually,
> the user can write a script that will generate the set of testpmd commands
> needed to achieve the desired layout.

It might not be an issue for skillful users, but it will be difficult
for others. --ring-bind-lcpu will help to simplify this for them.

Thanks,
- Simon
On 1/25/2018 3:40 AM, Simon Guo wrote:
> It might not be an issue for skillful users, but it will be difficult
> for others. --ring-bind-lcpu will help to simplify this for them.

The discussion on this patch has not reached a conclusion, and it has been
sitting idle for more than a year. I am marking the patch as rejected; if it
is still relevant, please send a new version on top of the latest repo.

Sorry for any inconvenience caused.

For reference, the patch:
https://patches.dpdk.org/patch/33771/
```diff
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 304b98d..1dba92e 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -104,6 +104,10 @@
 	       "(flag: 1 for RX; 2 for TX; 3 for RX and TX).\n");
 	printf("  --socket-num=N: set socket from which all memory is allocated "
 	       "in NUMA mode.\n");
+	printf("  --ring-bind-lcpu: "
+	       "specify TX/RX rings will be allocated on local socket of lcpu."
+	       "It will be ignored if ring-numa-config or port-numa-config is used. "
+	       "As a result, it allows one port binds to multiple NUMA nodes.\n");
 	printf("  --mbuf-size=N: set the data size of mbuf to N bytes.\n");
 	printf("  --total-num-mbufs=N: set the number of mbufs to be allocated "
 	       "in mbuf pools.\n");
@@ -544,6 +548,7 @@
 		{ "interactive",		0, 0, 0 },
 		{ "cmdline-file",		1, 0, 0 },
 		{ "auto-start",			0, 0, 0 },
+		{ "ring-bind-lcpu",		0, 0, 0 },
 		{ "eth-peers-configfile",	1, 0, 0 },
 		{ "eth-peer",			1, 0, 0 },
 #endif
@@ -676,6 +681,10 @@
 				stats_period = n;
 				break;
 			}
+			if (!strcmp(lgopts[opt_idx].name, "ring-bind-lcpu")) {
+				ring_bind_lcpu |= RBL_BIND_LOCAL_MASK;
+				break;
+			}
 			if (!strcmp(lgopts[opt_idx].name,
 				    "eth-peers-configfile")) {
 				if (init_peer_eth_addrs(optarg) != 0)
@@ -739,11 +748,14 @@
 				if (parse_portnuma_config(optarg))
 					rte_exit(EXIT_FAILURE,
 					   "invalid port-numa configuration\n");
+				ring_bind_lcpu |= RBL_PORT_NUMA_MASK;
 			}
-			if (!strcmp(lgopts[opt_idx].name, "ring-numa-config"))
+			if (!strcmp(lgopts[opt_idx].name, "ring-numa-config")) {
 				if (parse_ringnuma_config(optarg))
 					rte_exit(EXIT_FAILURE,
 					   "invalid ring-numa configuration\n");
+				ring_bind_lcpu |= RBL_RING_NUMA_MASK;
+			}
 			if (!strcmp(lgopts[opt_idx].name, "socket-num")) {
 				n = atoi(optarg);
 				if (!new_socket_id((uint8_t)n)) {
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9414d0e..e9e89d0 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -68,6 +68,9 @@
 uint8_t interactive = 0;
 uint8_t auto_start = 0;
 uint8_t tx_first;
+
+uint8_t ring_bind_lcpu;
+
 char cmdline_filename[PATH_MAX] = {0};
 
 /*
@@ -1410,6 +1413,43 @@ static int eth_event_callback(portid_t port_id,
 	return 1;
 }
 
+static int find_local_socket(queueid_t qi, int is_rxq)
+{
+	/*
+	 * try to find the local socket with following logic:
+	 * 1) Find the correct stream for the queue;
+	 * 2) Find the correct lcore for the stream;
+	 * 3) Find the correct socket for the lcore;
+	 */
+	int i, j, socket = NUMA_NO_CONFIG;
+
+	/* find the stream based on queue no*/
+	for (i = 0; i < nb_fwd_streams; i++) {
+		if (is_rxq) {
+			if (fwd_streams[i]->rx_queue == qi)
+				break;
+		} else {
+			if (fwd_streams[i]->tx_queue == qi)
+				break;
+		}
+	}
+	if (i == nb_fwd_streams)
+		return NUMA_NO_CONFIG;
+
+	/* find the lcore based on stream idx */
+	for (j = 0; j < nb_lcores; j++) {
+		if (fwd_lcores[j]->stream_idx == i)
+			break;
+	}
+	if (j == nb_lcores)
+		return NUMA_NO_CONFIG;
+
+	/* find the socket for the lcore */
+	socket = rte_lcore_to_socket_id(fwd_lcores_cpuids[j]);
+
+	return socket;
+}
+
 int
 start_port(portid_t pid)
 {
@@ -1469,14 +1509,18 @@ static int eth_event_callback(portid_t port_id,
 			port->need_reconfig_queues = 0;
 			/* setup tx queues */
 			for (qi = 0; qi < nb_txq; qi++) {
+				int socket = port->socket_id;
 				if ((numa_support) &&
 					(txring_numa[pi] != NUMA_NO_CONFIG))
-					diag = rte_eth_tx_queue_setup(pi, qi,
-						nb_txd,txring_numa[pi],
-						&(port->tx_conf));
-				else
-					diag = rte_eth_tx_queue_setup(pi, qi,
-						nb_txd,port->socket_id,
+					socket = txring_numa[pi];
+				else if (ring_bind_lcpu) {
+					int ret = find_local_socket(qi, 0);
+					if (ret != NUMA_NO_CONFIG)
+						socket = ret;
+				}
+
+				diag = rte_eth_tx_queue_setup(pi, qi,
+						nb_txd, socket,
 						&(port->tx_conf));
 
 				if (diag == 0)
@@ -1495,35 +1539,28 @@ static int eth_event_callback(portid_t port_id,
 			}
 			/* setup rx queues */
 			for (qi = 0; qi < nb_rxq; qi++) {
+				int socket = port->socket_id;
 				if ((numa_support) &&
-					(rxring_numa[pi] != NUMA_NO_CONFIG)) {
-					struct rte_mempool * mp =
-						mbuf_pool_find(rxring_numa[pi]);
-					if (mp == NULL) {
-						printf("Failed to setup RX queue:"
-							"No mempool allocation"
-							" on the socket %d\n",
-							rxring_numa[pi]);
-						return -1;
-					}
-
-					diag = rte_eth_rx_queue_setup(pi, qi,
-						nb_rxd,rxring_numa[pi],
-						&(port->rx_conf),mp);
-				} else {
-					struct rte_mempool *mp =
-						mbuf_pool_find(port->socket_id);
-					if (mp == NULL) {
-						printf("Failed to setup RX queue:"
+					(rxring_numa[pi] != NUMA_NO_CONFIG))
+					socket = rxring_numa[pi];
+				else if (ring_bind_lcpu) {
+					int ret = find_local_socket(qi, 1);
+					if (ret != NUMA_NO_CONFIG)
+						socket = ret;
+				}
+
+				struct rte_mempool *mp =
+					mbuf_pool_find(socket);
+				if (mp == NULL) {
+					printf("Failed to setup RX queue:"
 						"No mempool allocation"
 						" on the socket %d\n",
-							port->socket_id);
-						return -1;
-					}
-					diag = rte_eth_rx_queue_setup(pi, qi,
-						nb_rxd,port->socket_id,
-						&(port->rx_conf), mp);
+						socket);
+					return -1;
 				}
+				diag = rte_eth_rx_queue_setup(pi, qi,
+						nb_rxd, socket,
+						&(port->rx_conf), mp);
 				if (diag == 0)
 					continue;
 
@@ -2414,6 +2451,19 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 		       "but nb_txq=%d will prevent to fully test it.\n",
 		       nb_rxq, nb_txq);
 
+	if (ring_bind_lcpu & RBL_BIND_LOCAL_MASK) {
+		if (ring_bind_lcpu &
+				(RBL_RING_NUMA_MASK | RBL_PORT_NUMA_MASK)) {
+			printf("ring-bind-lcpu option is suppressed by "
+			       "ring-numa-config or port-numa-config option\n");
+			ring_bind_lcpu = 0;
+		} else {
+			printf("RingBuffer bind with local CPU selected\n");
+			ring_bind_lcpu = 1;
+		}
+	} else
+		ring_bind_lcpu = 0;
+
 	init_config();
 	if (start_port(RTE_PORT_ALL) != 0)
 		rte_exit(EXIT_FAILURE, "Start ports failed\n");
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 2a266fd..99e55b2 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -328,6 +328,13 @@ struct queue_stats_mappings {
 extern uint8_t interactive;
 extern uint8_t auto_start;
 extern uint8_t tx_first;
+
+/* for --ring-bind-lcpu option checking, RBL means Ring Bind Lcpu related */
+#define RBL_BIND_LOCAL_MASK (1 << 0)
+#define RBL_RING_NUMA_MASK (1 << 1)
+#define RBL_PORT_NUMA_MASK (1 << 2)
+extern uint8_t ring_bind_lcpu;
+
 extern char cmdline_filename[PATH_MAX]; /**< offline commands file */
 extern uint8_t numa_support; /**< set by "--numa" parameter */
 extern uint16_t port_topology; /**< set by "--port-topology" parameter */
diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
index 4c0d2ce..b88f099 100644
--- a/doc/guides/testpmd_app_ug/run_app.rst
+++ b/doc/guides/testpmd_app_ug/run_app.rst
@@ -240,6 +240,12 @@ The commandline options are:
     Specify the socket on which the TX/RX rings for the port will be allocated.
     Where flag is 1 for RX, 2 for TX, and 3 for RX and TX.
 
+*   ``--ring-bind-lcpu``
+
+    specify TX/RX rings will be allocated on local socket of lcpu.
+    It will be ignored if ring-numa-config or port-numa-config is used.
+    As a result, it allows one port binds to multiple NUMA nodes.
+
 *   ``--socket-num=N``
 
     Set the socket from which all memory is allocated in NUMA mode,
```
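A practical note on the RX path in the diff above: `rte_eth_rx_queue_setup()` needs an mbuf pool that actually resides on whatever socket is chosen, so a per-lcore binding scheme only works if a pool exists on every socket that hosts a forwarding lcore (testpmd itself sets up per-socket pools when NUMA support is enabled). Below is a minimal sketch of such per-socket pool creation; the pool name, cache size and mbuf count are arbitrary choices, not values taken from the patch.

```c
#include <stdio.h>

#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/*
 * Illustrative only: create one mbuf pool per NUMA socket that hosts an
 * enabled lcore, so a queue bound to any lcore's socket can find local
 * buffers. Sizes and names are arbitrary.
 */
static struct rte_mempool *socket_pools[RTE_MAX_NUMA_NODES];

static int
create_per_socket_pools(unsigned int nb_mbuf)
{
	unsigned int lcore;

	RTE_LCORE_FOREACH(lcore) {
		unsigned int socket = rte_lcore_to_socket_id(lcore);
		char name[RTE_MEMPOOL_NAMESIZE];

		if (socket_pools[socket] != NULL)
			continue;

		snprintf(name, sizeof(name), "mbuf_pool_socket_%u", socket);
		socket_pools[socket] = rte_pktmbuf_pool_create(name, nb_mbuf,
				250, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
				(int)socket);
		if (socket_pools[socket] == NULL)
			return -1;
	}
	return 0;
}
```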