    "date": "2018-08-31T09:57:34",
    "name": "[4/8] net/mlx5: enhance TC flow rule send/ack function",
        "name": "Adrien Mazarguil",
            "name": "net/mlx5: add switch offload for VXLAN encap/decap",
        "Date": "Fri, 31 Aug 2018 11:57:34 +0200",
        "Subject": "[dpdk-dev] [PATCH 4/8] net/mlx5: enhance TC flow rule send/ack\n\tfunction"
    "content": "A callback parameter to process replies will be useful for subsequent work\nin this area. It implies the following:\n\n- Replies may be much larger than requests. In fact, their size cannot\n  be known in advance. Using MNL_SOCKET_BUFFER_SIZE (the page size, capped\n  at 8192 bytes) is the recommended approach to make truncation less likely\n  (look for NLMSG_GOODSIZE in Linux).\n\n- Multipart replies are made of several messages. A loop is needed to\n  process these.\n\n- If a message is truncated (this cannot be ruled out even with a large\n  buffer), its remaining parts must be flushed so that subsequent queries\n  do not receive them.\n\n- Using rte_get_tsc_cycles() instead of random() for message sequence\n  numbers is faster and still unlikely to pick the same number twice in a\n  row.\n\n- mlx5_nl_flow_init() can be simplified since the query message is never\n  overwritten (this was already the case).\n\nSigned-off-by: Adrien Mazarguil <>\n---\n drivers/net/mlx5/mlx5_nl_flow.c | 73 ++++++++++++++++++++++++------------\n 1 file changed, 48 insertions(+), 25 deletions(-)",
    "diff": "diff --git a/drivers/net/mlx5/mlx5_nl_flow.c b/drivers/net/mlx5/mlx5_nl_flow.c\nindex 9ea2a1b55..e720728b7 100644\n--- a/drivers/net/mlx5/mlx5_nl_flow.c\n+++ b/drivers/net/mlx5/mlx5_nl_flow.c\n@@ -22,6 +22,7 @@\n #include <sys/socket.h>\n \n #include <rte_byteorder.h>\n+#include <rte_cycles.h>\n #include <rte_errno.h>\n #include <rte_ether.h>\n #include <rte_flow.h>\n@@ -1050,38 +1051,63 @@ mlx5_nl_flow_brand(void *buf, uint32_t handle)\n }\n \n /**\n- * Send Netlink message with acknowledgment.\n+ * Send Netlink message with acknowledgment and process reply.\n  *\n  * @param nl\n  *   Libmnl socket to use.\n  * @param nlh\n- *   Message to send. This function always raises the NLM_F_ACK flag before\n- *   sending.\n+ *   Message to send. This function always raises the NLM_F_ACK flag and\n+ *   sets its sequence number before sending.\n+ * @param cb\n+ *   Callback handler for received message.\n+ * @param arg\n+ *   Data pointer for callback handler.\n  *\n  * @return\n  *   0 on success, a negative errno value otherwise and rte_errno is set.\n  */\n static int\n-mlx5_nl_flow_nl_ack(struct mnl_socket *nl, struct nlmsghdr *nlh)\n+mlx5_nl_flow_chat(struct mnl_socket *nl, struct nlmsghdr *nlh,\n+\t\t  mnl_cb_t cb, void *arg)\n {\n \talignas(struct nlmsghdr)\n-\tuint8_t ans[mnl_nlmsg_size(sizeof(struct nlmsgerr)) +\n-\t\t    nlh->nlmsg_len - sizeof(*nlh)];\n-\tuint32_t seq = random();\n+\tuint8_t ans[MNL_SOCKET_BUFFER_SIZE];\n+\tunsigned int portid = mnl_socket_get_portid(nl);\n+\tuint32_t seq = rte_get_tsc_cycles();\n+\tint err = 0;\n \tint ret;\n \n \tnlh->nlmsg_flags |= NLM_F_ACK;\n \tnlh->nlmsg_seq = seq;\n \tret = mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);\n-\tif (ret != -1)\n+\tnlh = (void *)ans;\n+\t/*\n+\t * The following loop postpones non-fatal errors until multipart\n+\t * messages are complete.\n+\t */\n+\twhile (ret > 0) {\n \t\tret = mnl_socket_recvfrom(nl, ans, sizeof(ans));\n-\tif (ret != -1)\n-\t\tret = mnl_cb_run\n-\t\t\t(ans, ret, seq, mnl_socket_get_portid(nl), NULL, NULL);\n-\tif (!ret)\n+\t\tif (ret == -1) {\n+\t\t\terr = errno;\n+\t\t\tif (err != ENOSPC)\n+\t\t\t\tbreak;\n+\t\t\tret = sizeof(*nlh);\n+\t\t}\n+\t\tif (!err) {\n+\t\t\tret = mnl_cb_run(nlh, ret, seq, portid, cb, arg);\n+\t\t\tif (ret < 0)\n+\t\t\t\terr = -ret;\n+\t\t}\n+\t\tif (!(nlh->nlmsg_flags & NLM_F_MULTI) ||\n+\t\t    nlh->nlmsg_type == NLMSG_DONE)\n+\t\t\tret = -err;\n+\t\telse\n+\t\t\tret = 1;\n+\t}\n+\tif (!err)\n \t\treturn 0;\n-\trte_errno = errno;\n-\treturn -rte_errno;\n+\trte_errno = err;\n+\treturn -err;\n }\n \n /**\n@@ -1105,7 +1131,7 @@ mlx5_nl_flow_create(struct mnl_socket *nl, void *buf,\n \n \tnlh->nlmsg_type = RTM_NEWTFILTER;\n \tnlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;\n-\tif (!mlx5_nl_flow_nl_ack(nl, nlh))\n+\tif (!mlx5_nl_flow_chat(nl, nlh, NULL, NULL))\n \t\treturn 0;\n \treturn rte_flow_error_set\n \t\t(error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,\n@@ -1133,7 +1159,7 @@ mlx5_nl_flow_destroy(struct mnl_socket *nl, void *buf,\n \n \tnlh->nlmsg_type = RTM_DELTFILTER;\n \tnlh->nlmsg_flags = NLM_F_REQUEST;\n-\tif (!mlx5_nl_flow_nl_ack(nl, nlh))\n+\tif (!mlx5_nl_flow_chat(nl, nlh, NULL, NULL))\n \t\treturn 0;\n \treturn rte_flow_error_set\n \t\t(error, errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,\n@@ -1171,23 +1197,20 @@ mlx5_nl_flow_ifindex_init(struct mnl_socket *nl, unsigned int ifindex,\n \ttcm->tcm_ifindex = ifindex;\n \ttcm->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0);\n \ttcm->tcm_parent = TC_H_INGRESS;\n+\tif (!mnl_attr_put_strz_check(nlh, sizeof(buf), TCA_KIND, \"ingress\"))\n+\t\treturn rte_flow_error_set\n+\t\t\t(error, ENOBUFS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,\n+\t\t\t NULL, \"netlink: not enough space for message\");\n \t/* Ignore errors when qdisc is already absent. */\n-\tif (mlx5_nl_flow_nl_ack(nl, nlh) &&\n+\tif (mlx5_nl_flow_chat(nl, nlh, NULL, NULL) &&\n \t    rte_errno != EINVAL && rte_errno != ENOENT)\n \t\treturn rte_flow_error_set\n \t\t\t(error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,\n \t\t\t NULL, \"netlink: failed to remove ingress qdisc\");\n \t/* Create fresh ingress qdisc. */\n-\tnlh = mnl_nlmsg_put_header(buf);\n \tnlh->nlmsg_type = RTM_NEWQDISC;\n \tnlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;\n-\ttcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm));\n-\ttcm->tcm_family = AF_UNSPEC;\n-\ttcm->tcm_ifindex = ifindex;\n-\ttcm->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0);\n-\ttcm->tcm_parent = TC_H_INGRESS;\n-\tmnl_attr_put_strz_check(nlh, sizeof(buf), TCA_KIND, \"ingress\");\n-\tif (mlx5_nl_flow_nl_ack(nl, nlh))\n+\tif (mlx5_nl_flow_chat(nl, nlh, NULL, NULL))\n \t\treturn rte_flow_error_set\n \t\t\t(error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,\n \t\t\t NULL, \"netlink: failed to create ingress qdisc\");\n",
