[v3,6/6] doc/guides/: provide IOAT sample app guide

Message ID 20190918091139.1430-7-marcinx.baran@intel.com
State Superseded
Headers show
Series
  • examples/ioat: sample app on ioat driver usage
Related show

Checks

Context Check Description
ci/Intel-compilation fail Compilation issues
ci/checkpatch success coding style OK

Commit Message

Marcin Baran Sept. 18, 2019, 9:11 a.m.
Added guide for IOAT sample app usage and
code description.

Signed-off-by: Marcin Baran <marcinx.baran@intel.com>
---
 doc/guides/sample_app_ug/index.rst |   1 +
 doc/guides/sample_app_ug/intro.rst |   4 +
 doc/guides/sample_app_ug/ioat.rst  | 764 +++++++++++++++++++++++++++++
 3 files changed, 769 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/ioat.rst

Patch

diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index f23f8f59e..a6a1d9e7a 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -23,6 +23,7 @@  Sample Applications User Guides
     ip_reassembly
     kernel_nic_interface
     keep_alive
+    ioat
     l2_forward_crypto
     l2_forward_job_stats
     l2_forward_real_virtual
diff --git a/doc/guides/sample_app_ug/intro.rst b/doc/guides/sample_app_ug/intro.rst
index 90704194a..74462312f 100644
--- a/doc/guides/sample_app_ug/intro.rst
+++ b/doc/guides/sample_app_ug/intro.rst
@@ -91,6 +91,10 @@  examples are highlighted below.
   forwarding, or ``l3fwd`` application does forwarding based on Internet
   Protocol, IPv4 or IPv6 like a simple router.
 
+* :doc:`Hardware packet copying<ioat>`: The Hardware packet copying,
+  or ``ioatfwd`` application demonstrates how to use IOAT rawdev driver for
+  copying packets between two threads.
+
 * :doc:`Packet Distributor<dist_app>`: The Packet Distributor
   demonstrates how to distribute packets arriving on an Rx port to different
   cores for processing and transmission.
diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst
new file mode 100644
index 000000000..69621673b
--- /dev/null
+++ b/doc/guides/sample_app_ug/ioat.rst
@@ -0,0 +1,764 @@ 
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2019 Intel Corporation.
+
+Sample Application of packet copying using Intel\ |reg| QuickData Technology
+============================================================================
+
+Overview
+--------
+
+This sample is intended as a demonstration of the basic components of a DPDK
+forwarding application and example of how to use IOAT driver API to make
+packets copies.
+
+Also while forwarding, the MAC addresses are affected as follows:
+
+*   The source MAC address is replaced by the TX port MAC address
+
+*   The destination MAC address is replaced by  02:00:00:00:00:TX_PORT_ID
+
+This application can be used to compare performance of using software packet
+copy with copy done using a DMA device for different sizes of packets.
+The example will print out statistics each second. The stats shows
+received/send packets and packets dropped or failed to copy.
+
+Compiling the Application
+-------------------------
+
+To compile the sample application see :doc:`compiling`.
+
+The application is located in the ``ioat`` sub-directory.
+
+
+Running the Application
+-----------------------
+
+In order to run the hardware copy application, the copying device
+needs to be bound to user-space IO driver.
+
+Refer to the *IOAT Rawdev Driver for Intel\ |reg| QuickData Technology*
+guide for information on using the driver.
+
+The application requires a number of command line options:
+
+.. code-block:: console
+
+    ./build/ioatfwd [EAL options] -- -p MASK [-q NQ] [-s RS] [-c <sw|hw>]
+        [--[no-]mac-updating]
+
+where,
+
+*   p MASK: A hexadecimal bitmask of the ports to configure
+
+*   q NQ: Number of Rx queues used per port equivalent to CBDMA channels
+    per port
+
+*   c CT: Performed packet copy type: software (sw) or hardware using
+    DMA (hw)
+
+*   s RS: Size of IOAT rawdev ring for hardware copy mode or rte_ring for
+    software copy mode
+
+*   --[no-]mac-updating: Whether MAC address of packets should be changed
+    or not
+
+The application can be launched in various configurations depending on
+provided parameters. Each port can use up to 2 lcores: one of them receives
+incoming traffic and makes a copy of each packet. The second lcore then
+updates MAC address and sends the copy. If one lcore per port is used,
+both operations are done sequentially. For each configuration an additional
+lcore is needed since master lcore in use which is responsible for
+configuration, statistics printing and safe deinitialization of all ports
+and devices.
+
+The application can use a maximum of 8 ports.
+
+To run the application in a Linux environment with 3 lcores (one of them
+is master lcore), 1 port (port 0), software copying and MAC updating issue
+the command:
+
+.. code-block:: console
+
+    $ ./build/ioatfwd -l 0-2 -n 2 -- -p 0x1 --mac-updating -c sw
+
+To run the application in a Linux environment with 2 lcores (one of them
+is master lcore), 2 ports (ports 0 and 1), hardware copying and no MAC
+updating issue the command:
+
+.. code-block:: console
+
+    $ ./build/ioatfwd -l 0-1 -n 1 -- -p 0x3 --no-mac-updating -c hw
+
+Refer to the *DPDK Getting Started Guide* for general information on
+running applications and the Environment Abstraction Layer (EAL) options.
+
+Explanation
+-----------
+
+The following sections provide an explanation of the main components of the
+code.
+
+All DPDK library functions used in the sample code are prefixed with
+``rte_`` and are explained in detail in the *DPDK API Documentation*.
+
+
+The Main Function
+~~~~~~~~~~~~~~~~~
+
+The ``main()`` function performs the initialization and calls the execution
+threads for each lcore.
+
+The first task is to initialize the Environment Abstraction Layer (EAL).
+The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
+function. The value returned is the number of parsed arguments:
+
+.. code-block:: c
+
+    /* init EAL */
+    ret = rte_eal_init(argc, argv);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
+
+
+The ``main()`` also allocates a mempool to hold the mbufs (Message Buffers)
+used by the application:
+
+.. code-block:: c
+
+    nb_mbufs = RTE_MAX(rte_eth_dev_count_avail() * (nb_rxd + nb_txd
+        + MAX_PKT_BURST + rte_lcore_count() * MEMPOOL_CACHE_SIZE),
+        MIN_POOL_SIZE);
+
+    /* Create the mbuf pool */
+    ioat_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", nb_mbufs,
+        MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
+        rte_socket_id());
+    if (ioat_pktmbuf_pool == NULL)
+        rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
+
+Mbufs are the packet buffer structure used by DPDK. They are explained in
+detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.
+
+The ``main()`` function also initializes the ports:
+
+.. code-block:: c
+
+    /* Initialise each port */
+    RTE_ETH_FOREACH_DEV(portid) {
+        port_init(portid, ioat_pktmbuf_pool);
+    }
+
+Each port is configured using ``port_init()``:
+
+.. code-block:: c
+
+     /*
+     * Initializes a given port using global settings and with the RX buffers
+     * coming from the mbuf_pool passed as a parameter.
+     */
+    static inline void
+    port_init(uint16_t portid, struct rte_mempool *mbuf_pool, uint16_t nb_queues)
+    {
+        /* configuring port to use RSS for multiple RX queues */
+        static const struct rte_eth_conf port_conf = {
+            .rxmode = {
+                .mq_mode        = ETH_MQ_RX_RSS,
+                .max_rx_pkt_len = RTE_ETHER_MAX_LEN
+            },
+            .rx_adv_conf = {
+                .rss_conf = {
+                    .rss_key = NULL,
+                    .rss_hf = ETH_RSS_PROTO_MASK,
+                }
+            }
+        };
+
+        struct rte_eth_rxconf rxq_conf;
+        struct rte_eth_txconf txq_conf;
+        struct rte_eth_conf local_port_conf = port_conf;
+        struct rte_eth_dev_info dev_info;
+        int ret, i;
+
+        /* Skip ports that are not enabled */
+        if ((ioat_enabled_port_mask & (1 << portid)) == 0) {
+            printf("Skipping disabled port %u\n", portid);
+            return;
+        }
+
+        /* Init port */
+        printf("Initializing port %u... ", portid);
+        fflush(stdout);
+        rte_eth_dev_info_get(portid, &dev_info);
+        local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
+            dev_info.flow_type_rss_offloads;
+        if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE)
+            local_port_conf.txmode.offloads |=
+                DEV_TX_OFFLOAD_MBUF_FAST_FREE;
+        ret = rte_eth_dev_configure(portid, nb_queues, 1, &local_port_conf);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE, "Cannot configure device:"
+                " err=%d, port=%u\n", ret, portid);
+
+        ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd,
+                            &nb_txd);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "Cannot adjust number of descriptors: err=%d, port=%u\n",
+                ret, portid);
+
+        rte_eth_macaddr_get(portid, &ioat_ports_eth_addr[portid]);
+
+        /* Init Rx queues */
+        rxq_conf = dev_info.default_rxconf;
+        rxq_conf.offloads = local_port_conf.rxmode.offloads;
+        for (i = 0; i < nb_queues; i++) {
+            ret = rte_eth_rx_queue_setup(portid, i, nb_rxd,
+                rte_eth_dev_socket_id(portid), &rxq_conf,
+                mbuf_pool);
+            if (ret < 0)
+                rte_exit(EXIT_FAILURE,
+                    "rte_eth_rx_queue_setup:err=%d,port=%u, queue_id=%u\n",
+                    ret, portid, i);
+        }
+
+        /* Init one TX queue on each port */
+        txq_conf = dev_info.default_txconf;
+        txq_conf.offloads = local_port_conf.txmode.offloads;
+        ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
+                rte_eth_dev_socket_id(portid),
+                &txq_conf);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "rte_eth_tx_queue_setup:err=%d,port=%u\n",
+                ret, portid);
+
+        /* Initialize TX buffers */
+        tx_buffer[portid] = rte_zmalloc_socket("tx_buffer",
+                RTE_ETH_TX_BUFFER_SIZE(MAX_PKT_BURST), 0,
+                rte_eth_dev_socket_id(portid));
+        if (tx_buffer[portid] == NULL)
+            rte_exit(EXIT_FAILURE,
+                "Cannot allocate buffer for tx on port %u\n",
+                portid);
+
+        rte_eth_tx_buffer_init(tx_buffer[portid], MAX_PKT_BURST);
+
+        ret = rte_eth_tx_buffer_set_err_callback(tx_buffer[portid],
+                rte_eth_tx_buffer_count_callback,
+                &port_statistics.tx_dropped[portid]);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "Cannot set error callback for tx buffer on port %u\n",
+                portid);
+
+        /* Start device */
+        ret = rte_eth_dev_start(portid);
+        if (ret < 0)
+            rte_exit(EXIT_FAILURE,
+                "rte_eth_dev_start:err=%d, port=%u\n",
+                ret, portid);
+
+        rte_eth_promiscuous_enable(portid);
+
+        printf("Port %u, MAC address: %02X:%02X:%02X:%02X:%02X:%02X\n\n",
+                portid,
+                ioat_ports_eth_addr[portid].addr_bytes[0],
+                ioat_ports_eth_addr[portid].addr_bytes[1],
+                ioat_ports_eth_addr[portid].addr_bytes[2],
+                ioat_ports_eth_addr[portid].addr_bytes[3],
+                ioat_ports_eth_addr[portid].addr_bytes[4],
+                ioat_ports_eth_addr[portid].addr_bytes[5]);
+
+        cfg.ports[cfg.nb_ports].rxtx_port = portid;
+        cfg.ports[cfg.nb_ports++].nb_queues = nb_queues;
+    }
+
+The Ethernet ports are configured with local settings using the
+``rte_eth_dev_configure()`` function and the ``port_conf`` struct.
+The RSS is enabled so that multiple Rx queues could be used for
+packet receiving and copying by multiple CBDMA channels per port:
+
+.. code-block:: c
+
+    /* configuring port to use RSS for multiple RX queues */
+    static const struct rte_eth_conf port_conf = {
+        .rxmode = {
+            .mq_mode        = ETH_MQ_RX_RSS,
+            .max_rx_pkt_len = RTE_ETHER_MAX_LEN
+        },
+        .rx_adv_conf = {
+            .rss_conf = {
+                .rss_key = NULL,
+                .rss_hf = ETH_RSS_PROTO_MASK,
+            }
+        }
+    };
+
+For this example the ports are set up with the number of Rx queues provided
+with -q option and 1 Tx queue using the ``rte_eth_rx_queue_setup()``
+and ``rte_eth_tx_queue_setup()`` functions.
+
+The Ethernet port is then started:
+
+.. code-block:: c
+
+    ret = rte_eth_dev_start(portid);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "rte_eth_dev_start:err=%d, port=%u\n",
+            ret, portid);
+
+
+Finally the Rx port is set in promiscuous mode:
+
+.. code-block:: c
+
+    rte_eth_promiscuous_enable(portid);
+
+
+After that each port application assigns resources needed.
+
+.. code-block:: c
+
+    check_link_status(ioat_enabled_port_mask);
+
+    if (!cfg.nb_ports) {
+        rte_exit(EXIT_FAILURE,
+            "All available ports are disabled. Please set portmask.\n");
+    }
+
+    /* Check if there is enough lcores for all ports. */
+    cfg.nb_lcores = rte_lcore_count() - 1;
+    if (cfg.nb_lcores < 1)
+        rte_exit(EXIT_FAILURE,
+            "There should be at least one slave lcore.\n");
+
+    ret = 0;
+
+    if (copy_mode == COPY_MODE_IOAT_NUM) {
+        assign_rawdevs();
+    } else /* copy_mode == COPY_MODE_SW_NUM */ {
+        assign_rings();
+    }
+
+A link status is checked of each port enabled by port mask
+using ``check_link_status()`` function.
+
+.. code-block:: c
+
+    /* check link status, return true if at least one port is up */
+    static int
+    check_link_status(uint32_t port_mask)
+    {
+        uint16_t portid;
+        struct rte_eth_link link;
+        int retval = 0;
+
+        printf("\nChecking link status\n");
+        RTE_ETH_FOREACH_DEV(portid) {
+            if ((port_mask & (1 << portid)) == 0)
+                continue;
+
+            memset(&link, 0, sizeof(link));
+            rte_eth_link_get(portid, &link);
+
+            /* Print link status */
+            if (link.link_status) {
+                printf(
+                    "Port %d Link Up. Speed %u Mbps - %s\n",
+                    portid, link.link_speed,
+                    (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+                    ("full-duplex") : ("half-duplex\n"));
+                retval = 1;
+            } else
+                printf("Port %d Link Down\n", portid);
+        }
+        return retval;
+    }
+
+Depending on mode set (whether copy should be done by software or by hardware)
+special structures are assigned to each port. If software copy was chosen,
+application have to assign ring structures for packet exchanging between lcores
+assigned to ports.
+
+.. code-block:: c
+
+    static void
+    assign_rings(void)
+    {
+        uint32_t i;
+
+        for (i = 0; i < cfg.nb_ports; i++) {
+            char ring_name[20];
+
+            snprintf(ring_name, 20, "rx_to_tx_ring_%u", i);
+            /* Create ring for inter core communication */
+            cfg.ports[i].rx_to_tx_ring = rte_ring_create(
+                    ring_name, ring_size,
+                    rte_socket_id(), RING_F_SP_ENQ);
+
+            if (cfg.ports[i].rx_to_tx_ring == NULL)
+                rte_exit(EXIT_FAILURE, "%s\n",
+                        rte_strerror(rte_errno));
+        }
+    }
+
+
+When using hardware copy each Rx queue of the port is assigned an
+IOAT device (``assign_rawdevs()``) using IOAT Rawdev Driver API
+functions:
+
+.. code-block:: c
+
+    static void
+    assign_rawdevs(void)
+    {
+        uint16_t nb_rawdev = 0, rdev_id = 0;
+        uint32_t i, j;
+
+        for (i = 0; i < cfg.nb_ports; i++) {
+            for (j = 0; j < cfg.ports[i].nb_queues; j++) {
+                struct rte_rawdev_info rdev_info = { 0 };
+
+                do {
+                    if (rdev_id == rte_rawdev_count())
+                        goto end;
+                    rte_rawdev_info_get(rdev_id++, &rdev_info);
+                } while (strcmp(rdev_info.driver_name,
+                    IOAT_PMD_RAWDEV_NAME_STR) != 0);
+
+                cfg.ports[i].ioat_ids[j] = rdev_id - 1;
+                configure_rawdev_queue(cfg.ports[i].ioat_ids[j]);
+                ++nb_rawdev;
+            }
+        }
+    end:
+        if (nb_rawdev < cfg.nb_ports * cfg.ports[0].nb_queues)
+            rte_exit(EXIT_FAILURE,
+                "Not enough IOAT rawdevs (%u) for all queues (%u).\n",
+                nb_rawdev, cfg.nb_ports * cfg.ports[0].nb_queues);
+        RTE_LOG(INFO, IOAT, "Number of used rawdevs: %u.\n", nb_rawdev);
+    }
+
+
+The initialization of hardware device is done by ``rte_rawdev_configure()``
+function and ``rte_rawdev_info`` struct. After configuration the device is
+started using ``rte_rawdev_start()`` function. Each of the above operations
+is done in ``configure_rawdev_queue()``.
+
+.. code-block:: c
+
+    static void
+    configure_rawdev_queue(uint32_t dev_id)
+    {
+        struct rte_rawdev_info info = { .dev_private = &dev_config };
+
+        /* Configure hardware copy device */
+        dev_config.ring_size = ring_size;
+
+        if (rte_rawdev_configure(dev_id, &info) != 0) {
+            rte_exit(EXIT_FAILURE,
+                "Error with rte_rawdev_configure()\n");
+        }
+        rte_rawdev_info_get(dev_id, &info);
+        if (dev_config.ring_size != ring_size) {
+            rte_exit(EXIT_FAILURE,
+                "Error, ring size is not %d (%d)\n",
+                ring_size, (int)dev_config.ring_size);
+        }
+        if (rte_rawdev_start(dev_id) != 0) {
+            rte_exit(EXIT_FAILURE,
+                "Error with rte_rawdev_start()\n");
+        }
+    }
+
+If initialization is successful memory for hardware device
+statistics is allocated.
+
+Finally ``main()`` functions starts all processing lcores and starts
+printing stats in a loop on master lcore. The application can be
+interrupted and closed using ``Ctrl-C``. The master lcore waits for
+all slave processes to finish, deallocates resources and exits.
+
+The processing lcores launching function are described below.
+
+The Lcores Launching Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As described above ``main()`` function invokes ``start_forwarding_cores()``
+function in order to start processing for each lcore:
+
+.. code-block:: c
+
+    static void start_forwarding_cores(void)
+    {
+        uint32_t lcore_id = rte_lcore_id();
+
+        RTE_LOG(INFO, IOAT, "Entering %s on lcore %u\n",
+                __func__, rte_lcore_id());
+
+        if (cfg.nb_lcores == 1) {
+            lcore_id = rte_get_next_lcore(lcore_id, true, true);
+            rte_eal_remote_launch((lcore_function_t *)rxtx_main_loop,
+                NULL, lcore_id);
+        } else if (cfg.nb_lcores > 1) {
+            lcore_id = rte_get_next_lcore(lcore_id, true, true);
+            rte_eal_remote_launch((lcore_function_t *)rx_main_loop,
+                NULL, lcore_id);
+
+            lcore_id = rte_get_next_lcore(lcore_id, true, true);
+            rte_eal_remote_launch((lcore_function_t *)tx_main_loop, NULL,
+                lcore_id);
+        }
+    }
+
+The function launches Rx/Tx processing functions on configured lcores
+for each port using ``rte_eal_remote_launch()``. The configured ports,
+their number and number of assigned lcores are stored in user-defined
+``rxtx_transmission_config`` struct that is initialized before launching
+tasks:
+
+.. code-block:: c
+
+    struct rxtx_transmission_config {
+        struct rxtx_port_config ports[RTE_MAX_ETHPORTS];
+        uint16_t nb_ports;
+        uint16_t nb_lcores;
+    };
+
+The Lcores Processing Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For receiving packets on each port an ``ioat_rx_port()`` function is used.
+The function receives packets on each configured Rx queue. Depending on mode
+the user chose, it will enqueue packets to IOAT rawdev channels and then invoke
+copy process (hardware copy), or perform software copy of each packet using
+``pktmbuf_sw_copy()`` function and enqueue them to 1 rte_ring:
+
+.. code-block:: c
+
+    /* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. */
+    static void
+    ioat_rx_port(struct rxtx_port_config *rx_config)
+    {
+        uint32_t nb_rx, nb_enq, i, j;
+        struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+        for (i = 0; i < rx_config->nb_queues; i++) {
+
+            nb_rx = rte_eth_rx_burst(rx_config->rxtx_port, i,
+                pkts_burst, MAX_PKT_BURST);
+
+            if (nb_rx == 0)
+                continue;
+
+            port_statistics.rx[rx_config->rxtx_port] += nb_rx;
+
+            if (copy_mode == COPY_MODE_IOAT_NUM) {
+                /* Perform packet hardware copy */
+                nb_enq = ioat_enqueue_packets(pkts_burst,
+                    nb_rx, rx_config->ioat_ids[i]);
+                if (nb_enq > 0)
+                    rte_ioat_do_copies(rx_config->ioat_ids[i]);
+            } else {
+                /* Perform packet software copy, free source packets */
+                int ret;
+                struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST];
+
+                ret = rte_mempool_get_bulk(ioat_pktmbuf_pool,
+                    (void *)pkts_burst_copy, nb_rx);
+
+                if (unlikely(ret < 0))
+                    rte_exit(EXIT_FAILURE,
+                        "Unable to allocate memory.\n");
+
+                for (j = 0; j < nb_rx; j++)
+                    pktmbuf_sw_copy(pkts_burst[j],
+                        pkts_burst_copy[j]);
+
+                rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                    (void *)pkts_burst, nb_rx);
+
+                nb_enq = rte_ring_enqueue_burst(
+                    rx_config->rx_to_tx_ring,
+                    (void *)pkts_burst_copy, nb_rx, NULL);
+
+                /* Free any not enqueued packets. */
+                rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                    (void *)&pkts_burst_copy[nb_enq],
+                    nb_rx - nb_enq);
+            }
+
+            port_statistics.copy_dropped[rx_config->rxtx_port] +=
+                (nb_rx - nb_enq);
+        }
+    }
+
+The packets are received in burst mode using ``rte_eth_rx_burst()``
+function. When using hardware copy mode the packets are enqueued in
+copying device's buffer using ``ioat_enqueue_packets()`` which calls
+``rte_ioat_enqueue_copy()``. When all received packets are in the
+buffer the copies are invoked by calling ``rte_ioat_do_copies()``.
+Function ``rte_ioat_enqueue_copy()`` operates on physical address of
+the packet. Structure ``rte_mbuf`` contains only physical address to
+start of the data buffer (``buf_iova``). Thus the address is shifted
+by ``addr_offset`` value in order to get pointer to ``rearm_data``
+member of ``rte_mbuf``. That way the packet is copied all at once
+(with data and metadata).
+
+.. code-block:: c
+
+    static uint32_t
+    ioat_enqueue_packets(struct rte_mbuf **pkts,
+        uint32_t nb_rx, uint16_t dev_id)
+    {
+        int ret;
+        uint32_t i;
+        struct rte_mbuf *pkts_copy[MAX_PKT_BURST];
+
+        const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr,
+            &pkts[0]->rearm_data);
+
+        ret = rte_mempool_get_bulk(ioat_pktmbuf_pool,
+                (void *)pkts_copy, nb_rx);
+
+        if (unlikely(ret < 0))
+            rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n");
+
+        for (i = 0; i < nb_rx; i++) {
+            /* Perform data copy */
+            ret = rte_ioat_enqueue_copy(dev_id,
+                pkts[i]->buf_iova
+                    - addr_offset,
+                pkts_copy[i]->buf_iova
+                    - addr_offset,
+                rte_pktmbuf_data_len(pkts[i])
+                    + addr_offset,
+                (uintptr_t)pkts[i],
+                (uintptr_t)pkts_copy[i],
+                0 /* nofence */);
+
+            if (ret != 1)
+                break;
+        }
+
+        ret = i;
+        /* Free any not enqueued packets. */
+        rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts[i], nb_rx - i);
+        rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts_copy[i],
+            nb_rx - i);
+
+        return ret;
+    }
+
+
+All done copies are processed by ``ioat_tx_port()`` function. When using
+hardware copy mode the function invokes ``rte_ioat_completed_copies()``
+on each assigned IOAT channel to gather copied packets. If software copy
+mode is used the function dequeues copied packets from the rte_ring. Then each
+packet MAC address is changed if it was enabled. After that copies are sent
+in burst mode using `` rte_eth_tx_burst()``.
+
+
+.. code-block:: c
+
+    /* Transmit packets from IOAT rawdev/rte_ring for one port. */
+    static void
+    ioat_tx_port(struct rxtx_port_config *tx_config)
+    {
+        uint32_t i, j, nb_dq = 0;
+        struct rte_mbuf *mbufs_src[MAX_PKT_BURST];
+        struct rte_mbuf *mbufs_dst[MAX_PKT_BURST];
+
+        if (copy_mode == COPY_MODE_IOAT_NUM) {
+            for (i = 0; i < tx_config->nb_queues; i++) {
+                /* Deque the mbufs from IOAT device. */
+                nb_dq = rte_ioat_completed_copies(
+                    tx_config->ioat_ids[i], MAX_PKT_BURST,
+                    (void *)mbufs_src, (void *)mbufs_dst);
+
+                if (nb_dq == 0)
+                    break;
+
+                rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                    (void *)mbufs_src, nb_dq);
+
+                /* Update macs if enabled */
+                if (mac_updating) {
+                    for (j = 0; j < nb_dq; j++)
+                        update_mac_addrs(mbufs_dst[j],
+                            tx_config->rxtx_port);
+                }
+
+                const uint16_t nb_tx = rte_eth_tx_burst(
+                    tx_config->rxtx_port, 0,
+                    (void *)mbufs_dst, nb_dq);
+
+                port_statistics.tx[tx_config->rxtx_port] += nb_tx;
+
+                /* Free any unsent packets. */
+                if (unlikely(nb_tx < nb_dq))
+                    rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                    (void *)&mbufs_dst[nb_tx],
+                        nb_dq - nb_tx);
+            }
+        }
+        else {
+            for (i = 0; i < tx_config->nb_queues; i++) {
+                /* Deque the mbufs from IOAT device. */
+                nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring,
+                    (void *)mbufs_dst, MAX_PKT_BURST, NULL);
+
+                if (nb_dq == 0)
+                    return;
+
+                /* Update macs if enabled */
+                if (mac_updating) {
+                    for (j = 0; j < nb_dq; j++)
+                        update_mac_addrs(mbufs_dst[j],
+                            tx_config->rxtx_port);
+                }
+
+                const uint16_t nb_tx = rte_eth_tx_burst(tx_config->rxtx_port,
+                    0, (void *)mbufs_dst, nb_dq);
+
+                port_statistics.tx[tx_config->rxtx_port] += nb_tx;
+
+                /* Free any unsent packets. */
+                if (unlikely(nb_tx < nb_dq))
+                    rte_mempool_put_bulk(ioat_pktmbuf_pool,
+                    (void *)&mbufs_dst[nb_tx],
+                        nb_dq - nb_tx);
+            }
+        }
+    }
+
+The Packet Copying Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In order to perform packet copy there is a user-defined function
+``pktmbuf_sw_copy()`` used. It copies a whole packet by copying
+metadata from source packet to new mbuf, and then copying a data
+chunk of source packet. Both memory copies are done using
+``rte_memcpy()``:
+
+.. code-block:: c
+
+    static inline void
+    pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst)
+    {
+        /* Copy packet metadata */
+        rte_memcpy(&dst->rearm_data,
+            &src->rearm_data,
+            offsetof(struct rte_mbuf, cacheline1)
+                - offsetof(struct rte_mbuf, rearm_data));
+
+        /* Copy packet data */
+        rte_memcpy(rte_pktmbuf_mtod(dst, char *),
+            rte_pktmbuf_mtod(src, char *), src->data_len);
+    }
+
+The metadata in this example is copied from ``rearm_data`` member of
+``rte_mbuf`` struct up to ``cacheline1``.
+
+In order to understand why software packet copying is done as shown
+above please refer to the "Mbuf Library" section of the
+*DPDK Programmer's Guide*.
\ No newline at end of file