From patchwork Fri May 29 08:45:15 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cunming Liang X-Patchwork-Id: 4960 Return-Path: X-Original-To: patchwork@dpdk.org Delivered-To: patchwork@dpdk.org Received: from [92.243.14.124] (localhost [IPv6:::1]) by dpdk.org (Postfix) with ESMTP id BEA56B3D6; Fri, 29 May 2015 10:45:55 +0200 (CEST) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 714DC5A87 for ; Fri, 29 May 2015 10:45:49 +0200 (CEST) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga103.jf.intel.com with ESMTP; 29 May 2015 01:45:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,515,1427785200"; d="scan'208";a="733589238" Received: from shvmail01.sh.intel.com ([10.239.29.42]) by fmsmga002.fm.intel.com with ESMTP; 29 May 2015 01:45:47 -0700 Received: from shecgisg004.sh.intel.com (shecgisg004.sh.intel.com [10.239.29.89]) by shvmail01.sh.intel.com with ESMTP id t4T8jjE9002168; Fri, 29 May 2015 16:45:45 +0800 Received: from shecgisg004.sh.intel.com (localhost [127.0.0.1]) by shecgisg004.sh.intel.com (8.13.6/8.13.6/SuSE Linux 0.8) with ESMTP id t4T8jgmg020321; Fri, 29 May 2015 16:45:44 +0800 Received: (from cliang18@localhost) by shecgisg004.sh.intel.com (8.13.6/8.13.6/Submit) id t4T8jgjc020317; Fri, 29 May 2015 16:45:42 +0800 From: Cunming Liang To: dev@dpdk.org Date: Fri, 29 May 2015 16:45:15 +0800 Message-Id: <1432889125-20255-3-git-send-email-cunming.liang@intel.com> X-Mailer: git-send-email 1.7.4.1 In-Reply-To: <1432889125-20255-1-git-send-email-cunming.liang@intel.com> References: <1432198563-16334-1-git-send-email-cunming.liang@intel.com> <1432889125-20255-1-git-send-email-cunming.liang@intel.com> Cc: shemming@brocade.com, liang-min.wang@intel.com Subject: [dpdk-dev] [PATCH v9 02/12] eal/linux: add rte_epoll_wait/ctl support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup. It defines 'struct rte_epoll_event' as the event param. The 'op' uses the same enum as epoll_wait/ctl does. The epoll event support to carry a raw user data and to register a callback which is exectuted during wakeup. Signed-off-by: Cunming Liang --- v9 changes - rework on coding style v8 changes - support delete event in safety during the wakeup execution - add EINTR process during epoll_wait v7 changes - split v6[4/8] into two patches, one for epoll event(this one) another for rx intr(next patch) - introduce rte_epoll_event definition - rte_epoll_wait/ctl for more generic RTE epoll API v6 changes - split rte_intr_wait_rx_pkt into two function, wait and set. - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal. - rte_intr_rx_wait to support multiplexing. - allow epfd as input to support flexible event fd combination. lib/librte_eal/linuxapp/eal/eal_interrupts.c | 137 +++++++++++++++++++++ .../linuxapp/eal/include/exec-env/rte_interrupts.h | 82 +++++++++++- lib/librte_eal/linuxapp/eal/rte_eal_version.map | 3 + 3 files changed, 219 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c index 3a84b3c..2f56000 100644 --- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c +++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c @@ -69,6 +69,8 @@ #define EAL_INTR_EPOLL_WAIT_FOREVER (-1) +static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */ + /** * union for pipe fds. */ @@ -895,3 +897,138 @@ rte_eal_intr_init(void) return -ret; } +static int +eal_epoll_process_event(struct epoll_event *evs, unsigned int n, + struct rte_epoll_event *events) +{ + unsigned int i, count = 0; + struct rte_epoll_event *rev; + + for (i = 0; i < n; i++) { + rev = evs[i].data.ptr; + if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID, + RTE_EPOLL_EXEC)) + continue; + + events[count].status = RTE_EPOLL_VALID; + events[count].fd = rev->fd; + events[count].epfd = rev->epfd; + events[count].epdata.event = rev->epdata.event; + events[count].epdata.data = rev->epdata.data; + if (rev->epdata.cb_fun) + rev->epdata.cb_fun(rev->fd, + rev->epdata.cb_arg); + + rte_compiler_barrier(); + rev->status = RTE_EPOLL_VALID; + count++; + } + return count; +} + +static inline int +eal_init_tls_epfd(void) +{ + int pfd = epoll_create(255); + if (pfd < 0) { + RTE_LOG(ERR, EAL, + "Cannot create epoll instance\n"); + return -1; + } + return pfd; +} + +int +rte_intr_tls_epfd(void) +{ + if (RTE_PER_LCORE(_epfd) == -1) + RTE_PER_LCORE(_epfd) = eal_init_tls_epfd(); + + return RTE_PER_LCORE(_epfd); +} + +int +rte_epoll_wait(int epfd, struct rte_epoll_event *events, + int maxevents, int timeout) +{ + struct epoll_event evs[maxevents]; + int rc; + + if (!events) { + RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n"); + return -1; + } + + /* using per thread epoll fd */ + if (epfd == RTE_EPOLL_PER_THREAD) + epfd = rte_intr_tls_epfd(); + + while (1) { + rc = epoll_wait(epfd, evs, maxevents, timeout); + if (likely(rc > 0)) { + /* epoll_wait has at least one fd ready to read */ + rc = eal_epoll_process_event(evs, rc, events); + break; + } else if (rc < 0) { + if (errno == EINTR) + continue; + /* epoll_wait fail */ + RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n", + strerror(errno)); + rc = -1; + break; + } + } + + return rc; +} + +static inline void +eal_epoll_data_safe_free(struct rte_epoll_event *ev) +{ + while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID, + RTE_EPOLL_INVALID)) + while (ev->status != RTE_EPOLL_VALID) + rte_pause(); + memset(&ev->epdata, 0, sizeof(ev->epdata)); + ev->fd = -1; + ev->epfd = -1; +} + +int +rte_epoll_ctl(int epfd, int op, int fd, + struct rte_epoll_event *event) +{ + struct epoll_event ev; + + if (!event) { + RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n"); + return -1; + } + + /* using per thread epoll fd */ + if (epfd == RTE_EPOLL_PER_THREAD) + epfd = rte_intr_tls_epfd(); + + if (op == EPOLL_CTL_ADD) { + event->status = RTE_EPOLL_VALID; + event->fd = fd; /* ignore fd in event */ + event->epfd = epfd; + ev.data.ptr = (void *)event; + } + + ev.events = event->epdata.event; + if (epoll_ctl(epfd, op, fd, &ev) < 0) { + RTE_LOG(ERR, EAL, "Error op %d fd %d epoll_ctl, %s\n", + op, fd, strerror(errno)); + if (op == EPOLL_CTL_ADD) + /* rollback status when CTL_ADD fail */ + event->status = RTE_EPOLL_INVALID; + return -1; + } + + if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID) + eal_epoll_data_safe_free(event); + + return 0; +} diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h index 9c86a15..7c21060 100644 --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h @@ -51,7 +51,31 @@ enum rte_intr_handle_type { RTE_INTR_HANDLE_MAX }; -struct rte_epoll_event; +#define RTE_INTR_EVENT_ADD 1UL +#define RTE_INTR_EVENT_DEL 2UL + +typedef void (*rte_intr_event_cb_t)(int fd, void *arg); + +struct rte_epoll_data { + uint32_t event; /**< event type */ + void *data; /**< User data */ + rte_intr_event_cb_t cb_fun; /**< IN: callback fun */ + void *cb_arg; /**< IN: callback arg */ +}; + +enum { + RTE_EPOLL_INVALID = 0, + RTE_EPOLL_VALID, + RTE_EPOLL_EXEC, +}; + +/** interrupt epoll event obj, taken by epoll_event.ptr */ +struct rte_epoll_event { + volatile uint32_t status; /**< OUT: event status */ + int fd; /**< OUT: event fd */ + int epfd; /**< OUT: epoll instance the ev associated with */ + struct rte_epoll_data epdata; +}; /** Handle for interrupts. */ struct rte_intr_handle { @@ -65,9 +89,61 @@ struct rte_intr_handle { uint32_t max_intr; /**< max interrupt requested */ uint32_t nb_efd; /**< number of available efds */ int efds[RTE_MAX_RXTX_INTR_VEC_ID]; /**< intr vectors/efds mapping */ - struct rte_epoll_event *elist[RTE_MAX_RXTX_INTR_VEC_ID]; - /**< intr vector epoll event ptr */ + struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID]; + /**< intr vector epoll event */ int *intr_vec; /**< intr vector number array */ }; +#define RTE_EPOLL_PER_THREAD -1 /**< to hint using per thread epfd */ + +/** + * It waits for events on the epoll instance. + * + * @param epfd + * Epoll instance fd on which the caller wait for events. + * @param events + * Memory area contains the events that will be available for the caller. + * @param maxevents + * Up to maxevents are returned, must greater than zero. + * @param timeout + * Specifying a timeout of -1 causes a block indefinitely. + * Specifying a timeout equal to zero cause to return immediately. + * @return + * - On success, returns the number of available event. + * - On failure, a negative value. + */ +int +rte_epoll_wait(int epfd, struct rte_epoll_event *events, + int maxevents, int timeout); + +/** + * It performs control operations on epoll instance referred by the epfd. + * It requests that the operation op be performed for the target fd. + * + * @param epfd + * Epoll instance fd on which the caller perform control operations. + * @param op + * The operation be performed for the target fd. + * @param fd + * The target fd on which the control ops perform. + * @param event + * Describes the object linked to the fd. + * Note: The caller must take care the object deletion after CTL_DEL. + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +rte_epoll_ctl(int epfd, int op, int fd, + struct rte_epoll_event *event); + +/** + * The function returns the per thread epoll instance. + * + * @return + * epfd the epoll instance refered to. + */ +int +rte_intr_tls_epfd(void); + #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */ diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map index 7e850a9..840002e 100644 --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map @@ -52,6 +52,8 @@ DPDK_2.0 { rte_eal_vdev_init; rte_eal_vdev_uninit; rte_eal_wait_lcore; + rte_epoll_ctl; + rte_epoll_wait; rte_exit; rte_get_hpet_cycles; rte_get_hpet_hz; @@ -61,6 +63,7 @@ DPDK_2.0 { rte_intr_callback_unregister; rte_intr_disable; rte_intr_enable; + rte_intr_tls_epfd; rte_log; rte_log_add_in_history; rte_log_cur_msg_loglevel;