From patchwork Thu Jun 21 02:00:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zhang X-Patchwork-Id: 41330 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 508241B8F7; Thu, 21 Jun 2018 04:10:28 +0200 (CEST) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 2C6AE1B5F5 for ; Thu, 21 Jun 2018 04:00:39 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Jun 2018 19:00:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,249,1526367600"; d="scan'208";a="64993826" Received: from dpdk51.sh.intel.com ([10.67.110.190]) by fmsmga004.fm.intel.com with ESMTP; 20 Jun 2018 19:00:36 -0700 From: Qi Zhang To: thomas@monjalon.net, anatoly.burakov@intel.com Cc: konstantin.ananyev@intel.com, dev@dpdk.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, benjamin.h.shelton@intel.com, narender.vangati@intel.com, Qi Zhang Date: Thu, 21 Jun 2018 10:00:43 +0800 Message-Id: <20180621020059.1198-7-qi.z.zhang@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20180621020059.1198-1-qi.z.zhang@intel.com> References: <20180607123849.14439-1-qi.z.zhang@intel.com> <20180621020059.1198-1-qi.z.zhang@intel.com> Subject: [dpdk-dev] [PATCH v2 06/22] ethdev: support attach or detach share device from secondary X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" This patch cover the multi-process hotplug case when a share device attach/detach request be issued from secondary process, the implementation references malloc_mp.c. device attach on secondary: a) secondary send async request to primary and wait on a condition which will be released by matched response from primary. b) primary receive the request and attach the new device if failed goto i). c) primary forward attach request to all secondary as async request (because this in mp thread context, use sync request will deadlock) d) secondary receive request and attach device and send reply. e) primary check the reply if all success go to j). f) primary send attach rollback async request to all secondary. g) secondary receive the request and detach device and send reply. h) primary receive the reply and detach device as rollback action. i) send fail response to secondary, goto k). j) send success response to secondary. k) secondary process receive response and return. device detach on secondary: a) secondary send async request to primary and wait on a condition which will be released by matched response from primary. b) primary receive the request and perform pre-detach check, if device is locked, goto j). c) primary send pre-detach async request to all secondary. d) secondary perform pre-detach check and send reply. e) primary check the reply if any fail goto j). f) primary send detach async request to all secondary g) secondary detach the device and send reply h) primary detach the device. i) send success response to secondary, goto k). j) send fail response to secondary. k) secondary process receive response and return. Signed-off-by: Qi Zhang --- v2: - fix coding style. - improve comments. - remove debug code. lib/librte_ethdev/ethdev_mp.c | 485 +++++++++++++++++++++++++++++++++++++++++- lib/librte_ethdev/ethdev_mp.h | 1 + 2 files changed, 475 insertions(+), 11 deletions(-) diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c index 10c03d25f..f2ea53fd6 100644 --- a/lib/librte_ethdev/ethdev_mp.c +++ b/lib/librte_ethdev/ethdev_mp.c @@ -3,12 +3,101 @@ */ #include +#include + #include "rte_ethdev_driver.h" #include "ethdev_mp.h" #include "ethdev_lock.h" +#include "ethdev_private.h" + +/** + * secondary to primary request. + * start from function rte_eth_dev_request_to_primary. + * + * device attach: + * a) secondary send request to primary. + * b) primary attach the new device if failed goto i). + * c) primary forward attach request to all secondary. + * d) secondary receive request and attach device and send reply. + * e) primary check the reply if all success go to j). + * f) primary send attach rollback request to all secondary. + * g) secondary receive the request and detach device and send reply. + * h) primary receive the reply and detach device as rollback action. + * i) send fail response to secondary, goto k). + * j) send success response to secondary. + * k) end. + + * device detach: + * a) secondary send request to primary. + * b) primary perform pre-detach check, if device is locked, got j). + * c) primary send pre-detach check request to all secondary. + * d) secondary perform pre-detach check and send reply. + * e) primary check the reply if any fail goto j). + * f) primary send detach request to all secondary + * g) secondary detach the device and send reply + * h) primary detach the device. + * i) send success response to secondary, goto k). + * j) send fail response to secondary. + * k) end. + */ + +enum req_state { + REQ_STATE_INACTIVE = 0, + REQ_STATE_ACTIVE, + REQ_STATE_COMPLETE +}; + +struct mp_request { + TAILQ_ENTRY(mp_request) next; + struct eth_dev_mp_req user_req; /**< contents of request */ + pthread_cond_t cond; /**< variable we use to time out on this request */ + enum req_state state; /**< indicate status of this request */ +}; + +/* + * We could've used just a single request, but it may be possible for + * secondaries to timeout earlier than the primary, and send a new request while + * primary is still expecting replies to the old one. Therefore, each new + * request will get assigned a new ID, which is how we will distinguish between + * expected and unexpected messages. + */ +TAILQ_HEAD(mp_request_list, mp_request); +static struct { + struct mp_request_list list; + pthread_mutex_t lock; +} mp_request_list = { + .list = TAILQ_HEAD_INITIALIZER(mp_request_list.list), + .lock = PTHREAD_MUTEX_INITIALIZER +}; #define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */ +static struct mp_request * +find_request_by_id(uint64_t id) +{ + struct mp_request *req; + + TAILQ_FOREACH(req, &mp_request_list.list, next) { + if (req->user_req.id == id) + break; + } + return req; +} + +static uint64_t +get_unique_id(void) +{ + uint64_t id; + + do { + id = rte_rand(); + } while (find_request_by_id(id) != NULL); + return id; +} + +static int +send_request_to_secondary_async(const struct eth_dev_mp_req *req); + static int detach_on_secondary(uint16_t port_id) { struct rte_device *dev; @@ -75,21 +164,330 @@ static int attach_on_secondary(const char *devargs, uint16_t port_id) return 0; } -static int handle_secondary_request(const struct rte_mp_msg *msg, const void *peer) +static int +check_reply(const struct eth_dev_mp_req *req, const struct rte_mp_reply *reply) +{ + struct eth_dev_mp_req *resp; + int i; + + if (reply->nb_received != reply->nb_sent) + return -EINVAL; + + for (i = 0; i < reply->nb_received; i++) { + resp = (struct eth_dev_mp_req *)reply->msgs[i].param; + + if (resp->t != req->t) { + ethdev_log(ERR, "Unexpected response to async request\n"); + return -EINVAL; + } + + if (resp->id != req->id) { + ethdev_log(ERR, "response to wrong async request\n"); + return -EINVAL; + } + + if (resp->result) + return resp->result; + } + + return 0; +} + +static int +send_response_to_secondary(const struct eth_dev_mp_req *req, int result) +{ + struct rte_mp_msg resp_msg; + struct eth_dev_mp_req *resp = + (struct eth_dev_mp_req *)resp_msg.param; + int ret = 0; + + memset(&resp_msg, 0, sizeof(resp_msg)); + resp_msg.len_param = sizeof(*resp); + strcpy(resp_msg.name, ETH_DEV_MP_ACTION_RESPONSE); + memcpy(resp, req, sizeof(*req)); + resp->result = result; + + ret = rte_mp_sendmsg(&resp_msg); + if (ret) + ethdev_log(ERR, "failed to send response to secondary\n"); + + return ret; +} + +static int +handle_async_attach_response(const struct rte_mp_msg *request, + const struct rte_mp_reply *reply) +{ + const struct eth_dev_mp_req *req = + (const struct eth_dev_mp_req *)request->param; + struct mp_request *entry; + struct eth_dev_mp_req tmp_req; + int ret = 0; + + pthread_mutex_lock(&mp_request_list.lock); + + entry = find_request_by_id(req->id); + if (!entry) { + ethdev_log(ERR, "wrong request ID\n"); + ret = -EINVAL; + goto finish; + } + + ret = check_reply(req, reply); + if (ret) { + tmp_req = *req; + tmp_req.t = REQ_TYPE_ATTACH_ROLLBACK; + + ret = send_request_to_secondary_async(&tmp_req); + if (ret) { + ethdev_log(ERR, "couldn't send async request\n"); + TAILQ_REMOVE(&mp_request_list.list, entry, next); + free(entry); + } + } else { + send_response_to_secondary(req, 0); + TAILQ_REMOVE(&mp_request_list.list, entry, next); + free(entry); + } + +finish: + pthread_mutex_unlock(&mp_request_list.lock); + return ret; +} + +static int +handle_async_detach_response(const struct rte_mp_msg *request, + const struct rte_mp_reply *reply) { - RTE_SET_USED(msg); - RTE_SET_USED(peer); - return -ENOTSUP; + const struct eth_dev_mp_req *req = + (const struct eth_dev_mp_req *)request->param; + struct mp_request *entry; + int ret = 0; + + pthread_mutex_lock(&mp_request_list.lock); + + entry = find_request_by_id(req->id); + if (!entry) { + ethdev_log(ERR, "wrong request ID\n"); + ret = -EINVAL; + goto finish; + } + + ret = check_reply(req, reply); + if (ret) { + send_response_to_secondary(req, ret); + } else { + do_eth_dev_detach(req->port_id); + send_response_to_secondary(req, 0); + } + TAILQ_REMOVE(&mp_request_list.list, entry, next); + free(entry); + +finish: + pthread_mutex_unlock(&mp_request_list.lock); + return ret; } -static int handle_primary_response(const struct rte_mp_msg *msg, const void *peer) +static int +handle_async_pre_detach_response(const struct rte_mp_msg *request, + const struct rte_mp_reply *reply) { - RTE_SET_USED(msg); - RTE_SET_USED(peer); - return -ENOTSUP; + const struct eth_dev_mp_req *req = + (const struct eth_dev_mp_req *)request->param; + struct eth_dev_mp_req tmp_req; + struct mp_request *entry; + int ret = 0; + + pthread_mutex_lock(&mp_request_list.lock); + + entry = find_request_by_id(req->id); + if (!entry) { + ethdev_log(ERR, "wrong request ID\n"); + ret = -EINVAL; + goto finish; + } + + ret = check_reply(req, reply); + if (!ret) { + tmp_req = *req; + tmp_req.t = REQ_TYPE_DETACH; + + ret = send_request_to_secondary_async(&tmp_req); + if (ret) { + ethdev_log(ERR, "couldn't send async request\n"); + TAILQ_REMOVE(&mp_request_list.list, entry, next); + free(entry); + } + } else { + send_response_to_secondary(req, ret); + TAILQ_REMOVE(&mp_request_list.list, entry, next); + free(entry); + } + +finish: + pthread_mutex_unlock(&mp_request_list.lock); + return 0; } -static int handle_primary_request(const struct rte_mp_msg *msg, const void *peer) +static int +handle_async_rollback_response(const struct rte_mp_msg *request, + const struct rte_mp_reply *reply __rte_unused) +{ + const struct eth_dev_mp_req *req = + (const struct eth_dev_mp_req *)request->param; + struct mp_request *entry; + int ret = 0; + + pthread_mutex_lock(&mp_request_list.lock); + + entry = find_request_by_id(req->id); + if (!entry) { + ethdev_log(ERR, "wrong request ID\n"); + ret = -EINVAL; + goto finish; + } + + /* we have nothing to do if rollback still fail, just detach */ + do_eth_dev_detach(req->port_id); + /* send response to secondary with the reason of rollback */ + send_response_to_secondary(req, req->result); + TAILQ_REMOVE(&mp_request_list.list, entry, next); + free(entry); + +finish: + pthread_mutex_unlock(&mp_request_list.lock); + return ret; +} + +static int +send_request_to_secondary_async(const struct eth_dev_mp_req *req) +{ + struct timespec ts = {.tv_sec = MP_TIMEOUT_S, .tv_nsec = 0}; + struct rte_mp_msg mp_req; + rte_mp_async_reply_t clb; + int ret = 0; + + memset(&mp_req, 0, sizeof(mp_req)); + memcpy(mp_req.param, req, sizeof(*req)); + mp_req.len_param = sizeof(*req); + strcpy(mp_req.name, ETH_DEV_MP_ACTION_REQUEST); + + if (req->t == REQ_TYPE_ATTACH) + clb = handle_async_attach_response; + else if (req->t == REQ_TYPE_PRE_DETACH) + clb = handle_async_pre_detach_response; + else if (req->t == REQ_TYPE_DETACH) + clb = handle_async_detach_response; + else if (req->t == REQ_TYPE_ATTACH_ROLLBACK) + clb = handle_async_rollback_response; + else + return -1; + do { + ret = rte_mp_request_async(&mp_req, &ts, clb); + } while (ret != 0 && rte_errno == EEXIST); + + if (ret) + ethdev_log(ERR, "couldn't send async request\n"); + + return ret; +} + +static int +handle_secondary_request(const struct rte_mp_msg *msg, + const void *peer __rte_unused) +{ + const struct eth_dev_mp_req *req = + (const struct eth_dev_mp_req *)msg->param; + struct eth_dev_mp_req tmp_req; + struct mp_request *entry; + uint16_t port_id; + int ret = 0; + + pthread_mutex_lock(&mp_request_list.lock); + + entry = find_request_by_id(req->id); + if (entry) { + ethdev_log(ERR, "duplicate request id\n"); + ret = -EEXIST; + goto finish; + } + + entry = malloc(sizeof(*entry)); + if (entry == NULL) { + ethdev_log(ERR, "not enough memory to allocate request entry\n"); + ret = -ENOMEM; + goto finish; + } + + if (req->t == REQ_TYPE_ATTACH) { + ret = do_eth_dev_attach(req->devargs, &port_id); + if (!ret) { + tmp_req = *req; + tmp_req.port_id = port_id; + ret = send_request_to_secondary_async(&tmp_req); + } + } else if (req->t == REQ_TYPE_DETACH) { + if (!rte_eth_dev_is_valid_port(req->port_id)) + ret = -EINVAL; + if (!ret) + ret = process_lock_callbacks(req->port_id); + if (!ret) { + tmp_req = *req; + tmp_req.t = REQ_TYPE_PRE_DETACH; + ret = send_request_to_secondary_async(&tmp_req); + } + } else { + ethdev_log(ERR, "unsupported secondary to primary request\n"); + ret = -ENOTSUP; + goto finish; + } + + if (ret) { + ret = send_response_to_secondary(req, ret); + if (ret) { + ethdev_log(ERR, "failed to send response to secondary\n"); + goto finish; + } + } else { + memcpy(&entry->user_req, req, sizeof(*req)); + entry->state = REQ_STATE_ACTIVE; + TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next); + entry = NULL; + } + +finish: + pthread_mutex_unlock(&mp_request_list.lock); + if (entry) + free(entry); + return ret; +} + +static int +handle_primary_response(const struct rte_mp_msg *msg, + const void *peer __rte_unused) +{ + const struct eth_dev_mp_req *req = + (const struct eth_dev_mp_req *)msg->param; + struct mp_request *entry; + + pthread_mutex_lock(&mp_request_list.lock); + + entry = find_request_by_id(req->id); + if (entry) { + entry->user_req.result = req->result; + entry->user_req.port_id = req->port_id; + entry->state = REQ_STATE_COMPLETE; + + pthread_cond_signal(&entry->cond); + } + + pthread_mutex_unlock(&mp_request_list.lock); + + return 0; +} + +static int +handle_primary_request(const struct rte_mp_msg *msg, const void *peer) { const struct eth_dev_mp_req *req = (const struct eth_dev_mp_req *)msg->param; @@ -129,8 +527,73 @@ static int handle_primary_request(const struct rte_mp_msg *msg, const void *peer int rte_eth_dev_request_to_primary(struct eth_dev_mp_req *req) { - RTE_SET_USED(req); - return -ENOTSUP; + struct rte_mp_msg msg; + struct eth_dev_mp_req *msg_req = (struct eth_dev_mp_req *)msg.param; + struct mp_request *entry; + struct timespec ts; + struct timeval now; + int ret = 0; + + memset(&msg, 0, sizeof(msg)); + memset(&ts, 0, sizeof(ts)); + + entry = malloc(sizeof(*entry)); + if (entry == NULL) { + ethdev_log(ERR, "not enough memory to allocate request entry\n"); + return -ENOMEM; + } + + pthread_mutex_lock(&mp_request_list.lock); + + ret = gettimeofday(&now, NULL); + if (ret) { + ethdev_log(ERR, "cannot get current time\n"); + ret = -EINVAL; + goto finish; + } + + ts.tv_nsec = (now.tv_usec * 1000) % 1000000000; + ts.tv_sec = now.tv_sec + MP_TIMEOUT_S + + (now.tv_usec * 1000) / 1000000000; + + pthread_cond_init(&entry->cond, NULL); + + msg.len_param = sizeof(*req); + strcpy(msg.name, ETH_DEV_MP_ACTION_REQUEST); + + req->id = get_unique_id(); + + memcpy(msg_req, req, sizeof(*req)); + + ret = rte_mp_sendmsg(&msg); + if (ret) { + ethdev_log(ERR, "cannot send message to primary"); + goto finish; + } + + memcpy(&entry->user_req, req, sizeof(*req)); + + entry->state = REQ_STATE_ACTIVE; + + TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next); + + do { + ret = pthread_cond_timedwait(&entry->cond, + &mp_request_list.lock, &ts); + } while (ret != 0 && ret != ETIMEDOUT); + + if (entry->state != REQ_STATE_COMPLETE) { + RTE_LOG(ERR, EAL, "request time out\n"); + ret = -ETIMEDOUT; + } else { + req->result = entry->user_req.result; + } + TAILQ_REMOVE(&mp_request_list.list, entry, next); + +finish: + pthread_mutex_unlock(&mp_request_list.lock); + free(entry); + return ret; } /** diff --git a/lib/librte_ethdev/ethdev_mp.h b/lib/librte_ethdev/ethdev_mp.h index c3e55dfec..6d10dfdad 100644 --- a/lib/librte_ethdev/ethdev_mp.h +++ b/lib/librte_ethdev/ethdev_mp.h @@ -18,6 +18,7 @@ enum eth_dev_req_type { }; struct eth_dev_mp_req { + uint64_t id; enum eth_dev_req_type t; char devargs[MAX_DEV_ARGS_LEN]; uint16_t port_id;