[v1,1/2] doc: update guides for vhost async APIs

Message ID 20200722105741.3421255-2-patrick.fu@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Maxime Coquelin
Series: update docs for vhost async API

Checks

Context               Check     Description
ci/checkpatch         success   coding style OK
ci/Intel-compilation  fail      apply issues

Commit Message

Patrick Fu July 22, 2020, 10:57 a.m. UTC
  From: Patrick Fu <patrick.fu@intel.com>

Update vhost guides to document vhost async APIs

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
---
 doc/guides/prog_guide/vhost_lib.rst | 86 ++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 9 deletions(-)
  

Comments

Chenbo Xia July 22, 2020, 11:21 a.m. UTC | #1
Hi Patrick,

> -----Original Message-----
> From: Fu, Patrick <patrick.fu@intel.com>
> Sent: Wednesday, July 22, 2020 6:58 PM
> To: dev@dpdk.org; maxime.coquelin@redhat.com; Xia, Chenbo
> <chenbo.xia@intel.com>
> Cc: Fu, Patrick <patrick.fu@intel.com>
> Subject: [PATCH v1 1/2] doc: update guides for vhost async APIs
> 
> From: Patrick Fu <patrick.fu@intel.com>
> 
> Update vhost guides to document vhost async APIs
> 
> Signed-off-by: Patrick Fu <patrick.fu@intel.com>
> ---
>  doc/guides/prog_guide/vhost_lib.rst | 86 ++++++++++++++++++++++++++---
>  1 file changed, 77 insertions(+), 9 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
> index db921f922..cce8b6ae7 100644
> --- a/doc/guides/prog_guide/vhost_lib.rst
> +++ b/doc/guides/prog_guide/vhost_lib.rst
> @@ -147,6 +147,21 @@ The following is an overview of some key Vhost API functions:
> 
>      It is disabled by default.
> 
> +  - ``RTE_VHOST_USER_ASYNC_COPY``
> +
> +    Asynchronous data path will be enabled when this flag is set. Async data
> +    path allows applications to register async copy devices (typically
> +    hardware DMA channels) to the vhost queues. Vhost leverages the copy
> +    device registered to offload CPU from memory copy operations. A set of

You mean 'free' CPU from memory copy?

> +    async data path APIs are defined for DPDK applications to make use of
> +    the async capability. Only packets enqueued/dequeued by async APIs are
> +    processed through the async data path.
> +
> +    Currently this feature is only implemented on split ring enqueue data
> +    path.
> +
> +    It is disabled by default.
> +
>  * ``rte_vhost_driver_set_features(path, features)``
> 
>    This function sets the feature bits the vhost-user driver supports. The
> @@ -235,6 +250,59 @@ The following is an overview of some key Vhost API functions:
> 
>    Enable or disable zero copy feature of the vhost crypto backend.
> 
> +* ``rte_vhost_async_channel_register(vid, queue_id, features, ops)``
> +
> +  Register a vhost queue with async copy device channel.
> +  Following device ``features`` must be specified together with the
> +  registration:
> +
> +  * ``async_inorder``
> +
> +    Async copy device can guarantee the ordering of copy completion
> +    sequence. Copies are completed in the same order with that at
> +    the submission time.
> +
> +    Currently, only ``async_inorder`` capable device is supported by vhost.
> +
> +  * ``async_threshold``
> +
> +    The copy length (in bytes) below which CPU copy will be used even if
> +    applications call async vhost APIs to enqueue/dequeue data.
> +
> +    Typical value is 512~1024 depending on the async device capability.
> +
> +  Applications must provide following ``ops`` callbacks for vhost lib to
> +  work with the async copye devices:

s/copye/copy

> +
> +  * ``transfer_data(vid, queue_id, descs, opaque_data, count)``
> +
> +    vhost invokes this function to submit copy data to the async devices.
> +    For non-async_inorder capable devices, ``opaque_data`` could be used
> +    for identifying the completed packets.
> +
> +  * ``check_completed_copies(vid, queue_id, opaque_data, max_packets)``
> +
> +    vhost invokes this function to get the copy data completed by async
> +    devices.
> +
> +* ``rte_vhost_async_channel_unregister(vid, queue_id)``
> +
> +  Unregister the async copy device from a vhost queue.

'Copy device channel' may be more accurate? 

> +
> +* ``rte_vhost_submit_enqueue_burst(vid, queue_id, pkts, count)``
> +
> +  Submit an enqueue request to transmit ``count`` packets from host to guest
> +  by async data path. Enqueue is not guaranteed to finish upon the return of
> +  this API call.
> +
> +  Applications must not free the packets submitted for enqueue until the
> +  packets are completed.
> +
> +* ``rte_vhost_poll_enqueue_completed(vid, queue_id, pkts, count)``
> +
> +  Poll enqueue completion status from async data path. Completed packets
> +  are returned to applications through ``pkts``.
> +
>  Vhost-user Implementations
>  --------------------------
> 
> @@ -294,16 +362,16 @@ Guest memory requirement
> 
>  * Memory pre-allocation
> 
> -  For non-zerocopy, guest memory pre-allocation is not a must. This can help
> -  save of memory. If users really want the guest memory to be pre-allocated
> -  (e.g., for performance reason), we can add option ``-mem-prealloc`` when
> -  starting QEMU. Or, we can lock all memory at vhost side which will force
> -  memory to be allocated when mmap at vhost side; option --mlockall in
> -  ovs-dpdk is an example in hand.
> +  For non-zerocopy non-async data path, guest memory pre-allocation is not a
> +  must. This can help save of memory. If users really want the guest memory
> +  to be pre-allocated (e.g., for performance reason), we can add option
> +  ``-mem-prealloc`` when starting QEMU. Or, we can lock all memory at vhost
> +  side which will force memory to be allocated when mmap at vhost side;
> +  option --mlockall in ovs-dpdk is an example in hand.
> 
> -  For zerocopy, we force the VM memory to be pre-allocated at vhost lib when
> -  mapping the guest memory; and also we need to lock the memory to prevent
> -  pages being swapped out to disk.
> +  For async data path or zerocopy, we force the VM memory to be pre-allocated

'For async or zerocopy data path' may be better?

Thanks!
Chenbo

> +  at vhost lib when mapping the guest memory; and also we need to lock the
> +  memory to prevent pages being swapped out to disk.
> 
>  * Memory sharing
> 
> --
> 2.18.4
  
Patrick Fu July 22, 2020, 3:06 p.m. UTC | #2
Thanks for the comments. A v2 patch has been sent with all the changes suggested.

Thanks,

Patrick


Patch

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index db921f922..cce8b6ae7 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -147,6 +147,21 @@  The following is an overview of some key Vhost API functions:
 
     It is disabled by default.
 
+  - ``RTE_VHOST_USER_ASYNC_COPY``
+
+    Asynchronous data path will be enabled when this flag is set. Async data
+    path allows applications to register async copy devices (typically
+    hardware DMA channels) to the vhost queues. Vhost leverages the copy
+    device registered to offload CPU from memory copy operations. A set of
+    async data path APIs are defined for DPDK applications to make use of
+    the async capability. Only packets enqueued/dequeued by async APIs are
+    processed through the async data path.
+
+    Currently this feature is only implemented on split ring enqueue data
+    path.
+
+    It is disabled by default.
+
 * ``rte_vhost_driver_set_features(path, features)``
 
   This function sets the feature bits the vhost-user driver supports. The
@@ -235,6 +250,59 @@  The following is an overview of some key Vhost API functions:
 
   Enable or disable zero copy feature of the vhost crypto backend.
 
+* ``rte_vhost_async_channel_register(vid, queue_id, features, ops)``
+
+  Register a vhost queue with async copy device channel.
+  Following device ``features`` must be specified together with the
+  registration:
+
+  * ``async_inorder``
+
+    Async copy device can guarantee the ordering of copy completion
+    sequence. Copies are completed in the same order with that at
+    the submission time.
+
+    Currently, only ``async_inorder`` capable device is supported by vhost.
+
+  * ``async_threshold``
+
+    The copy length (in bytes) below which CPU copy will be used even if
+    applications call async vhost APIs to enqueue/dequeue data.
+
+    Typical value is 512~1024 depending on the async device capability.
+
+  Applications must provide following ``ops`` callbacks for vhost lib to
+  work with the async copye devices:
+
+  * ``transfer_data(vid, queue_id, descs, opaque_data, count)``
+
+    vhost invokes this function to submit copy data to the async devices.
+    For non-async_inorder capable devices, ``opaque_data`` could be used
+    for identifying the completed packets.
+
+  * ``check_completed_copies(vid, queue_id, opaque_data, max_packets)``
+
+    vhost invokes this function to get the copy data completed by async
+    devices.
+
+* ``rte_vhost_async_channel_unregister(vid, queue_id)``
+
+  Unregister the async copy device from a vhost queue.
+
+* ``rte_vhost_submit_enqueue_burst(vid, queue_id, pkts, count)``
+
+  Submit an enqueue request to transmit ``count`` packets from host to guest
+  by async data path. Enqueue is not guaranteed to finish upon the return of
+  this API call.
+
+  Applications must not free the packets submitted for enqueue until the
+  packets are completed.
+
+* ``rte_vhost_poll_enqueue_completed(vid, queue_id, pkts, count)``
+
+  Poll enqueue completion status from async data path. Completed packets
+  are returned to applications through ``pkts``.
+
 Vhost-user Implementations
 --------------------------
 
@@ -294,16 +362,16 @@  Guest memory requirement
 
 * Memory pre-allocation
 
-  For non-zerocopy, guest memory pre-allocation is not a must. This can help
-  save of memory. If users really want the guest memory to be pre-allocated
-  (e.g., for performance reason), we can add option ``-mem-prealloc`` when
-  starting QEMU. Or, we can lock all memory at vhost side which will force
-  memory to be allocated when mmap at vhost side; option --mlockall in
-  ovs-dpdk is an example in hand.
+  For non-zerocopy non-async data path, guest memory pre-allocation is not a
+  must. This can help save of memory. If users really want the guest memory
+  to be pre-allocated (e.g., for performance reason), we can add option
+  ``-mem-prealloc`` when starting QEMU. Or, we can lock all memory at vhost
+  side which will force memory to be allocated when mmap at vhost side;
+  option --mlockall in ovs-dpdk is an example in hand.
 
-  For zerocopy, we force the VM memory to be pre-allocated at vhost lib when
-  mapping the guest memory; and also we need to lock the memory to prevent
-  pages being swapped out to disk.
+  For async data path or zerocopy, we force the VM memory to be pre-allocated
+  at vhost lib when mapping the guest memory; and also we need to lock the
+  memory to prevent pages being swapped out to disk.
 
 * Memory sharing
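
Usage sketches

The snippets below sketch how the calls documented in this patch fit together. First, enabling the async copy path on a vhost-user port: the helper name and error handling are illustrative, while RTE_VHOST_USER_ASYNC_COPY, rte_vhost_driver_register() and rte_vhost_driver_start() come from the vhost library itself.

    #include <rte_vhost.h>

    /* Arm the async copy data path on one vhost-user port.  The flag only
     * enables the feature; a copy device channel still has to be attached
     * to each queue once the device comes up (see the next sketch). */
    static int
    setup_async_vhost_port(const char *path)
    {
        if (rte_vhost_driver_register(path, RTE_VHOST_USER_ASYNC_COPY) != 0)
            return -1;

        return rte_vhost_driver_start(path);
    }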
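
Attaching a copy device channel to a queue could then look like the sketch below. It assumes the struct rte_vhost_async_features and struct rte_vhost_async_channel_ops layouts from rte_vhost_async.h added by this series (exact callback types should be checked against the merged header); dma_submit() and dma_poll() are hypothetical stand-ins for the application's own copy-device driver.

    #include <stdint.h>
    #include <rte_vhost.h>
    #include <rte_vhost_async.h>

    /* Hypothetical hooks into the application's DMA driver. */
    extern int dma_submit(int vid, uint16_t qid,
            struct rte_vhost_async_desc *descs,
            struct rte_vhost_async_status *opaque, uint16_t count);
    extern int dma_poll(int vid, uint16_t qid,
            struct rte_vhost_async_status *opaque, uint16_t max_pkts);

    /* Hand 'count' copy jobs described by 'descs' to the copy device. */
    static int
    app_transfer_data(int vid, uint16_t queue_id,
            struct rte_vhost_async_desc *descs,
            struct rte_vhost_async_status *opaque_data, uint16_t count)
    {
        return dma_submit(vid, queue_id, descs, opaque_data, count);
    }

    /* Report how many of the submitted copies have completed. */
    static int
    app_check_completed_copies(int vid, uint16_t queue_id,
            struct rte_vhost_async_status *opaque_data, uint16_t max_packets)
    {
        return dma_poll(vid, queue_id, opaque_data, max_packets);
    }

    static struct rte_vhost_async_channel_ops channel_ops = {
        .transfer_data = app_transfer_data,
        .check_completed_copies = app_check_completed_copies,
    };

    /* Typically called from the new_device() callback; queue 0 is the
     * split ring enqueue (host-to-guest) queue this feature covers. */
    static int
    attach_async_channel(int vid)
    {
        struct rte_vhost_async_features f;

        f.intval = 0;            /* clear reserved bits */
        f.async_inorder = 1;     /* the only mode vhost supports today */
        f.async_threshold = 512; /* bytes below which CPU copy is used */

        return rte_vhost_async_channel_register(vid, 0, f.intval,
                &channel_ops);
    }

The channel is detached symmetrically, typically from the destroy_device() callback, via rte_vhost_async_channel_unregister(vid, queue_id).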
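
On the data path, enqueue is a two-step submit/poll sequence. In the sketch below, async_tx_to_guest() and MAX_PKT_BURST are illustrative, and freeing packets the ring did not accept is an application policy rather than a library rule; what the library does require is that packets accepted for enqueue stay allocated until rte_vhost_poll_enqueue_completed() hands them back.

    #include <rte_mbuf.h>
    #include <rte_vhost.h>
    #include <rte_vhost_async.h>

    #define MAX_PKT_BURST 32

    /* Push one burst into the guest through the async data path. */
    static void
    async_tx_to_guest(int vid, uint16_t queue_id,
            struct rte_mbuf **pkts, uint16_t count)
    {
        struct rte_mbuf *done[MAX_PKT_BURST];
        uint16_t n_enq, n_done, i;

        /* only *submits* the copies; they may still be in flight in
         * the copy device when this call returns */
        n_enq = rte_vhost_submit_enqueue_burst(vid, queue_id, pkts, count);

        /* packets the ring did not accept are still owned by us */
        for (i = n_enq; i < count; i++)
            rte_pktmbuf_free(pkts[i]);

        /* reap finished packets; completions from earlier bursts may
         * show up here as well */
        n_done = rte_vhost_poll_enqueue_completed(vid, queue_id, done,
                MAX_PKT_BURST);
        for (i = 0; i < n_done; i++)
            rte_pktmbuf_free(done[i]);
    }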