doc: fix guide for DLB v2.5

Message ID 1621099654-25535-1-git-send-email-timothy.mcdaniel@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Jerin Jacob
Series doc: fix guide for DLB v2.5

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK
ci/github-robot       success  github build: passed
ci/intel-Testing      success  Testing PASS

Commit Message

Timothy McDaniel May 15, 2021, 5:27 p.m. UTC
- Remove references to deferred scheduling. That feature applies
  to DLB v1.0 only.
- Replace vdev references with the pci devargs equivalent
- Add section for new "vector_opts_enabled" devarg

Fixes: 7c6cc633fc7d ("doc: update guide for DLB v2.5")
Cc: timothy.mcdaniel@intel.com

Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
---
 doc/guides/eventdevs/dlb2.rst | 54 +++++++++++++++--------------------
 1 file changed, 23 insertions(+), 31 deletions(-)
  

Comments

David Marchand May 16, 2021, 5:33 p.m. UTC | #1
On Sat, May 15, 2021 at 7:29 PM Timothy McDaniel
<timothy.mcdaniel@intel.com> wrote:
>
> - Remove references to deferred scheduling. That feature applies
>   to DLB v1.0 only.
> - Replace vdev references with the pci devargs equivalent
> - Add section for new "vector_opts_enabled" devarg
>
> Fixes: 7c6cc633fc7d ("doc: update guide for DLB v2.5")
> Cc: timothy.mcdaniel@intel.com
>
> Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
> ---
>  doc/guides/eventdevs/dlb2.rst | 54 +++++++++++++++--------------------
>  1 file changed, 23 insertions(+), 31 deletions(-)
>
> diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
> index 31de6bc47..bce984ca0 100644
> --- a/doc/guides/eventdevs/dlb2.rst
> +++ b/doc/guides/eventdevs/dlb2.rst
> @@ -152,19 +152,19 @@ These pools' sizes are controlled by the nb_events_limit field in struct
>  rte_event_dev_config. The load-balanced pool is sized to contain
>  nb_events_limit credits, and the directed pool is sized to contain
>  nb_events_limit/4 credits. The directed pool size can be overridden with the
> -num_dir_credits vdev argument, like so:
> +num_dir_credits devargs argument, like so:
>
>      .. code-block:: console
>
> -       --vdev=dlb2_event,num_dir_credits=<value>
> +       --allow ea:00.0,num_dir_credits=<value>
>
>  This can be used if the default allocation is too low or too high for the
> -specific application needs. The PMD also supports a vdev arg that limits the
> +specific application needs. The PMD also supports a devarg that limits the
>  max_num_events reported by rte_event_dev_info_get():
>
>      .. code-block:: console
>
> -       --vdev=dlb2_event,max_num_events=<value>
> +       --allow ea:00.0,max_num_events=<value>
>
>  By default, max_num_events is reported as the total available load-balanced
>  credits. If multiple DLB-based applications are being used, it may be desirable
> @@ -293,27 +293,6 @@ The PMD does not support the following configuration sequences:
>  This sequence is not supported because the event device must be reconfigured
>  before its ports or queues can be.
>
> -Deferred Scheduling
> -~~~~~~~~~~~~~~~~~~~
> -
> -The DLB PMD's default behavior for managing a CQ is to "pop" the CQ once per
> -dequeued event before returning from rte_event_dequeue_burst(). This frees the
> -corresponding entries in the CQ, which enables the DLB to schedule more events
> -to it.
> -
> -To support applications seeking finer-grained scheduling control -- for example
> -deferring scheduling to get the best possible priority scheduling and
> -load-balancing -- the PMD supports a deferred scheduling mode. In this mode,
> -the CQ entry is not popped until the *subsequent* rte_event_dequeue_burst()
> -call. This mode only applies to load-balanced event ports with dequeue depth of
> -1.
> -
> -To enable deferred scheduling, use the defer_sched vdev argument like so:
> -
> -    .. code-block:: console
> -
> -       --vdev=dlb2_event,defer_sched=on
> -
>  Atomic Inflights Allocation
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> @@ -336,11 +315,11 @@ buffer space (e.g. if not all queues are used, or aren't used for atomic
>  scheduling).
>
>  The PMD provides a dev arg to override the default per-queue allocation. To
> -increase a vdev's per-queue atomic-inflight allocation to (for example) 64:
> +increase per-queue atomic-inflight allocation to (for example) 64:
>
>      .. code-block:: console
>
> -       --vdev=dlb2_event,atm_inflights=64
> +       --allow ea:00.0,atm_inflights=64
>
>  QID Depth Threshold
>  ~~~~~~~~~~~~~~~~~~~
> @@ -363,9 +342,9 @@ shown below.
>
>      .. code-block:: console
>
> -       --vdev=dlb2_event,qid_depth_thresh=all:<threshold_value>
> -       --vdev=dlb2_event,qid_depth_thresh=qidA-qidB:<threshold_value>
> -       --vdev=dlb2_event,qid_depth_thresh=qid:<threshold_value>
> +       --allow ea:00.0,qid_depth_thresh=all:<threshold_value>
> +       --allow ea:00.0,qid_depth_thresh=qidA-qidB:<threshold_value>
> +       --allow ea:00.0,qid_depth_thresh=qid:<threshold_value>

Did you try this syntax?
The previous syntax with vdev did not work, and the new one probably
won't either.
Only the first devargs will be passed to the driver.

Example with the null pmd:
$ ./build/app/dpdk-testpmd -m 512 --no-huge --log-level=*:debug --vdev net_null,copy=1 --vdev net_null,size=1024 -- -ia --total-num-mbufs=2048
...
vdev_probe_all_drivers(): Search driver to probe device net_null
rte_pmd_null_probe(): Initializing pmd_null for net_null
rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet copy is enabled
                                                         ^^                 ^^^^^^^
eth_dev_null_create(): Creating null ethdev on numa socket 0
...


If you want to pass multiple options for a single device, pass them all at once.
Like:
--allow ea:00.0,qid_depth_thresh=all:<threshold_value>,qid_depth_thresh=qidA-qidB:<threshold_value>,qid_depth_thresh=qid:<threshold_value>
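To illustrate the point, here is a minimal Python sketch of a launcher-side helper that groups devargs by device so that each device appears in exactly one --allow option. The function name merge_allow_args and the numeric values are hypothetical, chosen for illustration only; nothing here is part of DPDK.

```python
def merge_allow_args(entries):
    """Group (pci_address, devarg) pairs so each device gets one --allow.

    Hypothetical helper: it only builds the argument list, it does not
    validate the devarg names against any driver.
    """
    per_device = {}  # insertion-ordered: keeps devices in first-seen order
    for address, devarg in entries:
        per_device.setdefault(address, []).append(devarg)

    args = []
    for address, devargs in per_device.items():
        # One comma-separated string per device: "<addr>,<k=v>,<k=v>"
        args += ["--allow", ",".join([address] + devargs)]
    return args


if __name__ == "__main__":
    # Illustrative values only.
    print(merge_allow_args([
        ("ea:00.0", "num_dir_credits=1024"),
        ("ea:00.0", "atm_inflights=64"),
    ]))
```

Building the string in one place avoids repeating the same device on the command line, which is the case where only the first devargs string reaches the driver.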

>
>  Class of service
>  ~~~~~~~~~~~~~~~~
> @@ -387,4 +366,17 @@ Class of service can be specified in the devargs, as follows
>
>      .. code-block:: console
>
> -       --vdev=dlb2_event,cos=<0..4>
> +       --allow ea:00.0,cos=<0..4>
> +
> +Use X86 Vector Instructions
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +DLB supports using x86 vector instructions to optimize the data path.
> +
> +The default mode of operation is to use scalar instructions, but
> +the use of vector instructions can be enabled in the devargs, as
> +follows
> +
> +    .. code-block:: console
> +
> +       --allow ea:00.0,vector_opts_enabled=<y/Y>

This option does not exist.
All I see is:
drivers/event/dlb2/dlb2_priv.h:#define DLB2_VECTOR_OPTS_DISAB_ARG
"vector_opts_disable"

What of --force-max-simd-bitwidth ?
  
Timothy McDaniel May 17, 2021, 1:48 p.m. UTC | #2
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Sunday, May 16, 2021 12:34 PM
> To: McDaniel, Timothy <timothy.mcdaniel@intel.com>
> Cc: dev <dev@dpdk.org>; Carrillo, Erik G <erik.g.carrillo@intel.com>; Van
> Haaren, Harry <harry.van.haaren@intel.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>; Thomas Monjalon <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH] doc: fix guide for DLB v2.5
> 
> On Sat, May 15, 2021 at 7:29 PM Timothy McDaniel
> <timothy.mcdaniel@intel.com> wrote:
> >
> > - Remove references to deferred scheduling. That feature applies
> >   to DLB v1.0 only.
> > - Replace vdev references with the pci devargs equivalent
> > - Add section for new "vector_opts_enabled" devarg
> >
> > Fixes: 7c6cc633fc7d ("doc: update guide for DLB v2.5")
> > Cc: timothy.mcdaniel@intel.com
> >
> > Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
> > ---
> >  doc/guides/eventdevs/dlb2.rst | 54 +++++++++++++++--------------------
> >  1 file changed, 23 insertions(+), 31 deletions(-)
> >
> > diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
> > index 31de6bc47..bce984ca0 100644
> > --- a/doc/guides/eventdevs/dlb2.rst
> > +++ b/doc/guides/eventdevs/dlb2.rst
> > @@ -152,19 +152,19 @@ These pools' sizes are controlled by the
> nb_events_limit field in struct
> >  rte_event_dev_config. The load-balanced pool is sized to contain
> >  nb_events_limit credits, and the directed pool is sized to contain
> >  nb_events_limit/4 credits. The directed pool size can be overridden with the
> > -num_dir_credits vdev argument, like so:
> > +num_dir_credits devargs argument, like so:
> >
> >      .. code-block:: console
> >
> > -       --vdev=dlb2_event,num_dir_credits=<value>
> > +       --allow ea:00.0,num_dir_credits=<value>
> >
> >  This can be used if the default allocation is too low or too high for the
> > -specific application needs. The PMD also supports a vdev arg that limits the
> > +specific application needs. The PMD also supports a devarg that limits the
> >  max_num_events reported by rte_event_dev_info_get():
> >
> >      .. code-block:: console
> >
> > -       --vdev=dlb2_event,max_num_events=<value>
> > +       --allow ea:00.0,max_num_events=<value>
> >
> >  By default, max_num_events is reported as the total available load-balanced
> >  credits. If multiple DLB-based applications are being used, it may be desirable
> > @@ -293,27 +293,6 @@ The PMD does not support the following
> configuration sequences:
> >  This sequence is not supported because the event device must be
> reconfigured
> >  before its ports or queues can be.
> >
> > -Deferred Scheduling
> > -~~~~~~~~~~~~~~~~~~~
> > -
> > -The DLB PMD's default behavior for managing a CQ is to "pop" the CQ once
> per
> > -dequeued event before returning from rte_event_dequeue_burst(). This frees
> the
> > -corresponding entries in the CQ, which enables the DLB to schedule more
> events
> > -to it.
> > -
> > -To support applications seeking finer-grained scheduling control -- for
> example
> > -deferring scheduling to get the best possible priority scheduling and
> > -load-balancing -- the PMD supports a deferred scheduling mode. In this mode,
> > -the CQ entry is not popped until the *subsequent* rte_event_dequeue_burst()
> > -call. This mode only applies to load-balanced event ports with dequeue depth
> of
> > -1.
> > -
> > -To enable deferred scheduling, use the defer_sched vdev argument like so:
> > -
> > -    .. code-block:: console
> > -
> > -       --vdev=dlb2_event,defer_sched=on
> > -
> >  Atomic Inflights Allocation
> >  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > @@ -336,11 +315,11 @@ buffer space (e.g. if not all queues are used, or
> aren't used for atomic
> >  scheduling).
> >
> >  The PMD provides a dev arg to override the default per-queue allocation. To
> > -increase a vdev's per-queue atomic-inflight allocation to (for example) 64:
> > +increase per-queue atomic-inflight allocation to (for example) 64:
> >
> >      .. code-block:: console
> >
> > -       --vdev=dlb2_event,atm_inflights=64
> > +       --allow ea:00.0,atm_inflights=64
> >
> >  QID Depth Threshold
> >  ~~~~~~~~~~~~~~~~~~~
> > @@ -363,9 +342,9 @@ shown below.
> >
> >      .. code-block:: console
> >
> > -       --vdev=dlb2_event,qid_depth_thresh=all:<threshold_value>
> > -       --vdev=dlb2_event,qid_depth_thresh=qidA-qidB:<threshold_value>
> > -       --vdev=dlb2_event,qid_depth_thresh=qid:<threshold_value>
> > +       --allow ea:00.0,qid_depth_thresh=all:<threshold_value>
> > +       --allow ea:00.0,qid_depth_thresh=qidA-qidB:<threshold_value>
> > +       --allow ea:00.0,qid_depth_thresh=qid:<threshold_value>
> 
> Did you try this syntax?
> The previous syntax with vdev did not work, and the new one probably
> won't either.
> Only the first devargs will be passed to the driver.
> 
> Example with the null pmd:
> $ ./build/app/dpdk-testpmd -m 512 --no-huge --log-level=*:debug --vdev
> net_null,copy=1 --vdev net_null,size=1024 -- -ia
> --total-num-mbufs=2048
> ...
> vdev_probe_all_drivers(): Search driver to probe device net_null
> rte_pmd_null_probe(): Initializing pmd_null for net_null
> rte_pmd_null_probe(): Configure pmd_null: packet size is 64, packet
> copy is enabled
>                                                          ^^
>      ^^^^^^^
> eth_dev_null_create(): Creating null ethdev on numa socket 0
> ...
> 
> 
> If you want to pass multiple options for a single device, pass them all at once.
> Like:
> --allow
> ea:00.0,qid_depth_thresh=all:<threshold_value>,qid_depth_thresh=qidA-
> qidB:<threshold_value>,qid_depth_thresh=qid:<threshold_value>
> 

Where the doc has multiple lines in an example code block, such as this:
       --allow ea:00.0,qid_depth_thresh=all:<threshold_value>
       --allow ea:00.0,qid_depth_thresh=qidA-qidB:<threshold_value>
       --allow ea:00.0,qid_depth_thresh=qid:<threshold_value>

these are meant as three separate examples of the supported format, not as
options to be passed together. We have some convenience syntax options (all,
a range, or a single qid).
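As a rough illustration of the three forms, the following Python sketch expands a qid_depth_thresh value into a per-queue mapping. It is not the PMD's parser: the function name, the num_queues parameter, and the assumption that a range includes both endpoints are all hypothetical.

```python
def parse_qid_depth_thresh(value, num_queues):
    """Expand 'all:<t>', '<a>-<b>:<t>' or '<qid>:<t>' into {qid: threshold}.

    Hypothetical sketch; ranges are assumed to be inclusive at both ends.
    """
    selector, sep, threshold_text = value.partition(":")
    if not sep:
        raise ValueError("expected '<selector>:<threshold>'")
    threshold = int(threshold_text)

    if selector == "all":
        qids = list(range(num_queues))
    elif "-" in selector:
        first, last = (int(part) for part in selector.split("-", 1))
        if first > last:
            raise ValueError("range start must not exceed range end")
        qids = list(range(first, last + 1))
    else:
        qids = [int(selector)]

    for qid in qids:
        if not 0 <= qid < num_queues:
            raise ValueError(f"queue id {qid} out of range")
    return {qid: threshold for qid in qids}


if __name__ == "__main__":
    # Illustrative values only.
    print(parse_qid_depth_thresh("1-2:8", num_queues=4))
```

Each call handles one form, which mirrors the doc's intent that the three lines are alternatives rather than a single command line.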

> >
> >  Class of service
> >  ~~~~~~~~~~~~~~~~
> > @@ -387,4 +366,17 @@ Class of service can be specified in the devargs, as
> follows
> >
> >      .. code-block:: console
> >
> > -       --vdev=dlb2_event,cos=<0..4>
> > +       --allow ea:00.0,cos=<0..4>
> > +
> > +Use X86 Vector Instructions
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +DLB supports using x86 vector instructions to optimize the data path.
> > +
> > +The default mode of operation is to use scalar instructions, but
> > +the use of vector instructions can be enabled in the devargs, as
> > +follows
> > +
> > +    .. code-block:: console
> > +
> > +       --allow ea:00.0,vector_opts_enabled=<y/Y>
> 
> This option does not exist.
> All I see is:
> drivers/event/dlb2/dlb2_priv.h:#define DLB2_VECTOR_OPTS_DISAB_ARG
> "vector_opts_disable"
> 

The option name changed with patch 93235/16973 "[1/1] event/dlb2: fix vector based dequeue", which
is currently out for review. That patch fixed a couple of recently discovered issues in the vector path, so to be safe
we changed the default from running with the vector dequeue optimizations to running with the scalar dequeue
implementation.  

> What of --force-max-simd-bitwidth ?

The intent of the "vector_opts_enabled" switch is to allow experimentally enabling the vector-based
implementation, not to indicate whether those instructions are supported on the target platform.

> 
> 
> --
> David Marchand
  
David Marchand May 19, 2021, 9:59 a.m. UTC | #3
On Mon, May 17, 2021 at 3:48 PM McDaniel, Timothy
<timothy.mcdaniel@intel.com> wrote:
> > > @@ -387,4 +366,17 @@ Class of service can be specified in the devargs, as
> > follows
> > >
> > >      .. code-block:: console
> > >
> > > -       --vdev=dlb2_event,cos=<0..4>
> > > +       --allow ea:00.0,cos=<0..4>
> > > +
> > > +Use X86 Vector Instructions
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +DLB supports using x86 vector instructions to optimize the data path.
> > > +
> > > +The default mode of operation is to use scalar instructions, but
> > > +the use of vector instructions can be enabled in the devargs, as
> > > +follows
> > > +
> > > +    .. code-block:: console
> > > +
> > > +       --allow ea:00.0,vector_opts_enabled=<y/Y>
> >
> > This option does not exist.
> > All I see is:
> > drivers/event/dlb2/dlb2_priv.h:#define DLB2_VECTOR_OPTS_DISAB_ARG
> > "vector_opts_disable"
> >
>
> The option name changed with patch 93235/16973 "[1/1] event/dlb2: fix vector based dequeue", which
> is currently out for review. That patch fixed a couple of recently discovered issues in the vector path, so to be safe
> we changed the default from running with the vector dequeue optimizations to running with the scalar dequeue
> implementation.

I see Thomas had comments on this patch that changes the option (among
other things).
The documentation update for this option should go at the same time it
is introduced/renamed.


>
> > What of --force-max-simd-bitwidth ?
>
> The intent of the "vector_opts_enabled" switch is to allow experimentally enabling the vector-based
> implementation, not to indicate whether those instructions are supported on the target platform.

The commit 000a7b8e7582 ("event/dlb2: optimize dequeue operation")
already had the issue: the event/dlb2 driver ignores the global knob
on activating vector stuff.
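
A small Python sketch of the behaviour being asked for, under stated assumptions: the vector path is selected only when the per-device switch is on and the global SIMD limit (as set by --force-max-simd-bitwidth) permits it. The function name, its parameters, and the 128-bit default threshold are hypothetical and do not reflect the dlb2 driver's code.

```python
def choose_dequeue_path(vector_opts_enabled, max_simd_bitwidth,
                        required_bitwidth=128):
    """Pick 'vector' only if both the devarg and the global limit allow it.

    required_bitwidth is an assumed value for illustration; a real driver
    would use the width its vector implementation actually needs.
    """
    if vector_opts_enabled and max_simd_bitwidth >= required_bitwidth:
        return "vector"
    return "scalar"


if __name__ == "__main__":
    # Devarg on, but the global limit is set below the required width.
    print(choose_dequeue_path(True, 64))
```

With this shape, the devarg remains an opt-in while the global knob can still veto vector code for the whole process.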
  

Patch

diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
index 31de6bc47..bce984ca0 100644
--- a/doc/guides/eventdevs/dlb2.rst
+++ b/doc/guides/eventdevs/dlb2.rst
@@ -152,19 +152,19 @@  These pools' sizes are controlled by the nb_events_limit field in struct
 rte_event_dev_config. The load-balanced pool is sized to contain
 nb_events_limit credits, and the directed pool is sized to contain
 nb_events_limit/4 credits. The directed pool size can be overridden with the
-num_dir_credits vdev argument, like so:
+num_dir_credits devargs argument, like so:
 
     .. code-block:: console
 
-       --vdev=dlb2_event,num_dir_credits=<value>
+       --allow ea:00.0,num_dir_credits=<value>
 
 This can be used if the default allocation is too low or too high for the
-specific application needs. The PMD also supports a vdev arg that limits the
+specific application needs. The PMD also supports a devarg that limits the
 max_num_events reported by rte_event_dev_info_get():
 
     .. code-block:: console
 
-       --vdev=dlb2_event,max_num_events=<value>
+       --allow ea:00.0,max_num_events=<value>
 
 By default, max_num_events is reported as the total available load-balanced
 credits. If multiple DLB-based applications are being used, it may be desirable
@@ -293,27 +293,6 @@  The PMD does not support the following configuration sequences:
 This sequence is not supported because the event device must be reconfigured
 before its ports or queues can be.
 
-Deferred Scheduling
-~~~~~~~~~~~~~~~~~~~
-
-The DLB PMD's default behavior for managing a CQ is to "pop" the CQ once per
-dequeued event before returning from rte_event_dequeue_burst(). This frees the
-corresponding entries in the CQ, which enables the DLB to schedule more events
-to it.
-
-To support applications seeking finer-grained scheduling control -- for example
-deferring scheduling to get the best possible priority scheduling and
-load-balancing -- the PMD supports a deferred scheduling mode. In this mode,
-the CQ entry is not popped until the *subsequent* rte_event_dequeue_burst()
-call. This mode only applies to load-balanced event ports with dequeue depth of
-1.
-
-To enable deferred scheduling, use the defer_sched vdev argument like so:
-
-    .. code-block:: console
-
-       --vdev=dlb2_event,defer_sched=on
-
 Atomic Inflights Allocation
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -336,11 +315,11 @@  buffer space (e.g. if not all queues are used, or aren't used for atomic
 scheduling).
 
 The PMD provides a dev arg to override the default per-queue allocation. To
-increase a vdev's per-queue atomic-inflight allocation to (for example) 64:
+increase per-queue atomic-inflight allocation to (for example) 64:
 
     .. code-block:: console
 
-       --vdev=dlb2_event,atm_inflights=64
+       --allow ea:00.0,atm_inflights=64
 
 QID Depth Threshold
 ~~~~~~~~~~~~~~~~~~~
@@ -363,9 +342,9 @@  shown below.
 
     .. code-block:: console
 
-       --vdev=dlb2_event,qid_depth_thresh=all:<threshold_value>
-       --vdev=dlb2_event,qid_depth_thresh=qidA-qidB:<threshold_value>
-       --vdev=dlb2_event,qid_depth_thresh=qid:<threshold_value>
+       --allow ea:00.0,qid_depth_thresh=all:<threshold_value>
+       --allow ea:00.0,qid_depth_thresh=qidA-qidB:<threshold_value>
+       --allow ea:00.0,qid_depth_thresh=qid:<threshold_value>
 
 Class of service
 ~~~~~~~~~~~~~~~~
@@ -387,4 +366,17 @@  Class of service can be specified in the devargs, as follows
 
     .. code-block:: console
 
-       --vdev=dlb2_event,cos=<0..4>
+       --allow ea:00.0,cos=<0..4>
+
+Use X86 Vector Instructions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+DLB supports using x86 vector instructions to optimize the data path.
+
+The default mode of operation is to use scalar instructions, but
+the use of vector instructions can be enabled in the devargs, as
+follows
+
+    .. code-block:: console
+
+       --allow ea:00.0,vector_opts_enabled=<y/Y>