From patchwork Sat Jun 1 18:53:27 2019
X-Patchwork-Submitter: Pavan Nikhilesh Bhagavatula
X-Patchwork-Id: 54025
X-Patchwork-Delegate: jerinj@marvell.com
Date: Sun, 2 Jun 2019 00:23:27 +0530
Message-ID: <20190601185355.370-18-pbhagavatula@marvell.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190601185355.370-1-pbhagavatula@marvell.com>
References: <20190601185355.370-1-pbhagavatula@marvell.com>
Subject: [dpdk-dev] [PATCH 17/44] event/octeontx2: add octeontx2 SSO dual workslot mode
List-Id: DPDK patches and discussions

From: Pavan Nikhilesh

The OcteonTx2 AP core SSO cache contains two entries; each entry caches the
state of a single GWS, i.e. an event port. The AP core requests events from
the SSO using the following sequence:

1. Write to SSOW_LF_GWS_OP_GET_WORK.
2. Wait for the SSO to complete scheduling by polling SSOW_LF_GWS_TAG[63].
3. The SSO notifies the core by clearing SSOW_LF_GWS_TAG[63]; if the work is
   valid, SSOW_LF_GWS_WQP is non-zero.

The above sequence uses only one in-core cache entry. In dual workslot mode we
try to use both in-core cache entries by triggering GET_WORK on a second
workslot as soon as the above sequence completes.
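For illustration only, a minimal sketch of the single-workslot sequence above.
The register offsets mirror the per-workslot op addresses programmed in
sso_set_port_ops(); the request word written in step 1 is simplified, and the
otx2_read64()/otx2_write64() MMIO helpers are assumed from the common
octeontx2 code.

/* Illustrative sketch, not the driver's fastpath implementation. */
static inline uint64_t
sketch_single_ws_get_work(uintptr_t getwrk_op, uintptr_t tag_op,
			  uintptr_t wqp_op)
{
	uint64_t tag;

	/* 1. Ask the SSO to schedule work to this GWS. */
	otx2_write64(1, getwrk_op);

	/* 2. TAG[63] remains set while the GET_WORK is pending; poll until
	 *    the SSO clears it.
	 */
	do {
		tag = otx2_read64(tag_op);
	} while (tag & (1ULL << 63));

	/* 3. A non-zero WQP means valid work was delivered to this GWS. */
	return otx2_read64(wqp_op);
}

In dual workslot mode the driver issues the next GET_WORK on the second
workslot of the pair as soon as this sequence finishes on the first, so the
SSO schedules ahead while the core processes the current event.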
This effectively hides the schedule latency of SSO if there are enough events with unique flow_tags in-flight. This mode reserves two SSO GWS lf's for each event port effectively doubling single core performance. Dual workslot mode is the default mode of operation in octeontx2. Signed-off-by: Pavan Nikhilesh Signed-off-by: Jerin Jacob --- drivers/event/octeontx2/otx2_evdev.c | 204 ++++++++++++++++++--- drivers/event/octeontx2/otx2_evdev.h | 17 ++ drivers/event/octeontx2/otx2_evdev_irq.c | 4 +- drivers/event/octeontx2/otx2_evdev_stats.h | 52 +++++- 4 files changed, 242 insertions(+), 35 deletions(-) diff --git a/drivers/event/octeontx2/otx2_evdev.c b/drivers/event/octeontx2/otx2_evdev.c index ff9f905b3..3979bed9a 100644 --- a/drivers/event/octeontx2/otx2_evdev.c +++ b/drivers/event/octeontx2/otx2_evdev.c @@ -20,7 +20,7 @@ static inline int sso_get_msix_offsets(const struct rte_eventdev *event_dev) { struct otx2_sso_evdev *dev = sso_pmd_priv(event_dev); - uint8_t nb_ports = dev->nb_event_ports; + uint8_t nb_ports = dev->nb_event_ports * (dev->dual_ws ? 2 : 1); struct otx2_mbox *mbox = dev->mbox; struct msix_offset_rsp *msix_rsp; int i, rc; @@ -82,16 +82,26 @@ otx2_sso_port_link(struct rte_eventdev *event_dev, void *port, const uint8_t queues[], const uint8_t priorities[], uint16_t nb_links) { + struct otx2_sso_evdev *dev = sso_pmd_priv(event_dev); uint8_t port_id = 0; uint16_t link; - RTE_SET_USED(event_dev); RTE_SET_USED(priorities); for (link = 0; link < nb_links; link++) { - struct otx2_ssogws *ws = port; - - port_id = ws->port; - sso_port_link_modify(ws, queues[link], true); + if (dev->dual_ws) { + struct otx2_ssogws_dual *ws = port; + + port_id = ws->port; + sso_port_link_modify((struct otx2_ssogws *) + &ws->ws_state[0], queues[link], true); + sso_port_link_modify((struct otx2_ssogws *) + &ws->ws_state[1], queues[link], true); + } else { + struct otx2_ssogws *ws = port; + + port_id = ws->port; + sso_port_link_modify(ws, queues[link], true); + } } sso_func_trace("port=%d nb_links=%d", port_id, nb_links); @@ -102,15 +112,27 @@ static int otx2_sso_port_unlink(struct rte_eventdev *event_dev, void *port, uint8_t queues[], uint16_t nb_unlinks) { + struct otx2_sso_evdev *dev = sso_pmd_priv(event_dev); uint8_t port_id = 0; uint16_t unlink; - RTE_SET_USED(event_dev); for (unlink = 0; unlink < nb_unlinks; unlink++) { - struct otx2_ssogws *ws = port; - - port_id = ws->port; - sso_port_link_modify(ws, queues[unlink], false); + if (dev->dual_ws) { + struct otx2_ssogws_dual *ws = port; + + port_id = ws->port; + sso_port_link_modify((struct otx2_ssogws *) + &ws->ws_state[0], queues[unlink], + false); + sso_port_link_modify((struct otx2_ssogws *) + &ws->ws_state[1], queues[unlink], + false); + } else { + struct otx2_ssogws *ws = port; + + port_id = ws->port; + sso_port_link_modify(ws, queues[unlink], false); + } } sso_func_trace("port=%d nb_unlinks=%d", port_id, nb_unlinks); @@ -242,11 +264,23 @@ sso_clr_links(const struct rte_eventdev *event_dev) int i, j; for (i = 0; i < dev->nb_event_ports; i++) { - struct otx2_ssogws *ws; + if (dev->dual_ws) { + struct otx2_ssogws_dual *ws; - ws = event_dev->data->ports[i]; - for (j = 0; j < dev->nb_event_queues; j++) - sso_port_link_modify(ws, j, false); + ws = event_dev->data->ports[i]; + for (j = 0; j < dev->nb_event_queues; j++) { + sso_port_link_modify((struct otx2_ssogws *) + &ws->ws_state[0], j, false); + sso_port_link_modify((struct otx2_ssogws *) + &ws->ws_state[1], j, false); + } + } else { + struct otx2_ssogws *ws; + + ws = event_dev->data->ports[i]; + 
for (j = 0; j < dev->nb_event_queues; j++) + sso_port_link_modify(ws, j, false); + } } } @@ -261,6 +295,73 @@ sso_set_port_ops(struct otx2_ssogws *ws, uintptr_t base) ws->swtag_desched_op = base + SSOW_LF_GWS_OP_SWTAG_DESCHED; } +static int +sso_configure_dual_ports(const struct rte_eventdev *event_dev) +{ + struct otx2_sso_evdev *dev = sso_pmd_priv(event_dev); + struct otx2_mbox *mbox = dev->mbox; + uint8_t vws = 0; + uint8_t nb_lf; + int i, rc; + + otx2_sso_dbg("configuring event ports %d", dev->nb_event_ports); + + nb_lf = dev->nb_event_ports * 2; + /* Ask AF to attach required LFs. */ + rc = sso_hw_lf_cfg(mbox, SSO_LF_GWS, nb_lf, true); + if (rc < 0) { + otx2_err("failed to attach SSO GWS LF"); + return -ENODEV; + } + + if (sso_lf_cfg(dev, mbox, SSO_LF_GWS, nb_lf, true) < 0) { + sso_hw_lf_cfg(mbox, SSO_LF_GWS, nb_lf, false); + otx2_err("failed to init SSO GWS LF"); + return -ENODEV; + } + + for (i = 0; i < dev->nb_event_ports; i++) { + struct otx2_ssogws_dual *ws; + uintptr_t base; + + /* Free memory prior to re-allocation if needed */ + if (event_dev->data->ports[i] != NULL) { + ws = event_dev->data->ports[i]; + rte_free(ws); + ws = NULL; + } + + /* Allocate event port memory */ + ws = rte_zmalloc_socket("otx2_sso_ws", + sizeof(struct otx2_ssogws_dual), + RTE_CACHE_LINE_SIZE, + event_dev->data->socket_id); + if (ws == NULL) { + otx2_err("failed to alloc memory for port=%d", i); + rc = -ENOMEM; + break; + } + + ws->port = i; + base = dev->bar2 + (RVU_BLOCK_ADDR_SSOW << 20 | vws << 12); + sso_set_port_ops((struct otx2_ssogws *)&ws->ws_state[0], base); + vws++; + + base = dev->bar2 + (RVU_BLOCK_ADDR_SSOW << 20 | vws << 12); + sso_set_port_ops((struct otx2_ssogws *)&ws->ws_state[1], base); + vws++; + + event_dev->data->ports[i] = ws; + } + + if (rc < 0) { + sso_lf_cfg(dev, mbox, SSO_LF_GWS, nb_lf, false); + sso_hw_lf_cfg(mbox, SSO_LF_GWS, nb_lf, false); + } + + return rc; +} + static int sso_configure_ports(const struct rte_eventdev *event_dev) { @@ -460,6 +561,7 @@ sso_lf_teardown(struct otx2_sso_evdev *dev, break; case SSO_LF_GWS: nb_lf = dev->nb_event_ports; + nb_lf *= dev->dual_ws ? 
2 : 1; break; default: return; @@ -525,7 +627,12 @@ otx2_sso_configure(const struct rte_eventdev *event_dev) dev->nb_event_queues = conf->nb_event_queues; dev->nb_event_ports = conf->nb_event_ports; - if (sso_configure_ports(event_dev)) { + if (dev->dual_ws) + rc = sso_configure_dual_ports(event_dev); + else + rc = sso_configure_ports(event_dev); + + if (rc < 0) { otx2_err("failed to configure event ports"); return -ENODEV; } @@ -655,14 +762,27 @@ otx2_sso_port_setup(struct rte_eventdev *event_dev, uint8_t port_id, /* Set get_work timeout for HWS */ val = NSEC2USEC(dev->deq_tmo_ns) - 1; - struct otx2_ssogws *ws = event_dev->data->ports[port_id]; - uintptr_t base = OTX2_SSOW_GET_BASE_ADDR(ws->getwrk_op); - - rte_memcpy(ws->grps_base, grps_base, - sizeof(uintptr_t) * OTX2_SSO_MAX_VHGRP); - ws->fc_mem = dev->fc_mem; - ws->xaq_lmt = dev->xaq_lmt; - otx2_write64(val, base + SSOW_LF_GWS_NW_TIM); + if (dev->dual_ws) { + struct otx2_ssogws_dual *ws = event_dev->data->ports[port_id]; + + rte_memcpy(ws->grps_base, grps_base, + sizeof(uintptr_t) * OTX2_SSO_MAX_VHGRP); + ws->fc_mem = dev->fc_mem; + ws->xaq_lmt = dev->xaq_lmt; + otx2_write64(val, OTX2_SSOW_GET_BASE_ADDR( + ws->ws_state[0].getwrk_op) + SSOW_LF_GWS_NW_TIM); + otx2_write64(val, OTX2_SSOW_GET_BASE_ADDR( + ws->ws_state[1].getwrk_op) + SSOW_LF_GWS_NW_TIM); + } else { + struct otx2_ssogws *ws = event_dev->data->ports[port_id]; + uintptr_t base = OTX2_SSOW_GET_BASE_ADDR(ws->getwrk_op); + + rte_memcpy(ws->grps_base, grps_base, + sizeof(uintptr_t) * OTX2_SSO_MAX_VHGRP); + ws->fc_mem = dev->fc_mem; + ws->xaq_lmt = dev->xaq_lmt; + otx2_write64(val, base + SSOW_LF_GWS_NW_TIM); + } otx2_sso_dbg("port=%d ws=%p", port_id, event_dev->data->ports[port_id]); @@ -730,18 +850,37 @@ otx2_sso_dump(struct rte_eventdev *event_dev, FILE *f) uint8_t queue; uint8_t port; + fprintf(f, "[%s] SSO running in [%s] mode\n", __func__, dev->dual_ws ? 
+ "dual_ws" : "single_ws"); /* Dump SSOW registers */ for (port = 0; port < dev->nb_event_ports; port++) { - fprintf(f, "[%s]SSO single workslot[%d] dump\n", - __func__, port); - ssogws_dump(event_dev->data->ports[port], f); + if (dev->dual_ws) { + struct otx2_ssogws_dual *ws = + event_dev->data->ports[port]; + + fprintf(f, "[%s] SSO dual workslot[%d] vws[%d] dump\n", + __func__, port, 0); + ssogws_dump((struct otx2_ssogws *)&ws->ws_state[0], f); + fprintf(f, "[%s]SSO dual workslot[%d] vws[%d] dump\n", + __func__, port, 1); + ssogws_dump((struct otx2_ssogws *)&ws->ws_state[1], f); + } else { + fprintf(f, "[%s]SSO single workslot[%d] dump\n", + __func__, port); + ssogws_dump(event_dev->data->ports[port], f); + } } /* Dump SSO registers */ for (queue = 0; queue < dev->nb_event_queues; queue++) { fprintf(f, "[%s]SSO group[%d] dump\n", __func__, queue); - struct otx2_ssogws *ws = event_dev->data->ports[0]; - ssoggrp_dump(ws->grps_base[queue], f); + if (dev->dual_ws) { + struct otx2_ssogws_dual *ws = event_dev->data->ports[0]; + ssoggrp_dump(ws->grps_base[queue], f); + } else { + struct otx2_ssogws *ws = event_dev->data->ports[0]; + ssoggrp_dump(ws->grps_base[queue], f); + } } } @@ -874,7 +1013,14 @@ otx2_sso_init(struct rte_eventdev *event_dev) goto otx2_npa_lf_uninit; } + dev->dual_ws = 1; sso_parse_devargs(dev, pci_dev->device.devargs); + if (dev->dual_ws) { + otx2_sso_dbg("using dual workslot mode"); + dev->max_event_ports = dev->max_event_ports / 2; + } else { + otx2_sso_dbg("using single workslot mode"); + } otx2_sso_pf_func_set(dev->pf_func); otx2_sso_dbg("initializing %s max_queues=%d max_ports=%d", diff --git a/drivers/event/octeontx2/otx2_evdev.h b/drivers/event/octeontx2/otx2_evdev.h index 99ee6cea0..ef46a0482 100644 --- a/drivers/event/octeontx2/otx2_evdev.h +++ b/drivers/event/octeontx2/otx2_evdev.h @@ -106,6 +106,7 @@ struct otx2_sso_evdev { uint64_t nb_xaq_cfg; rte_iova_t fc_iova; struct rte_mempool *xaq_pool; + uint8_t dual_ws; /* Dev args */ uint32_t xae_cnt; /* HW const */ @@ -140,6 +141,22 @@ struct otx2_ssogws { uintptr_t grps_base[OTX2_SSO_MAX_VHGRP]; } __rte_cache_aligned; +struct otx2_ssogws_state { + OTX2_SSOGWS_OPS; +}; + +struct otx2_ssogws_dual { + /* Get Work Fastpath data */ + struct otx2_ssogws_state ws_state[2]; /* Ping and Pong */ + uint8_t swtag_req; + uint8_t vws; /* Ping pong bit */ + uint8_t port; + /* Add Work Fastpath data */ + uint64_t xaq_lmt __rte_cache_aligned; + uint64_t *fc_mem; + uintptr_t grps_base[OTX2_SSO_MAX_VHGRP]; +} __rte_cache_aligned; + static inline struct otx2_sso_evdev * sso_pmd_priv(const struct rte_eventdev *event_dev) { diff --git a/drivers/event/octeontx2/otx2_evdev_irq.c b/drivers/event/octeontx2/otx2_evdev_irq.c index ecc8ae775..e10389703 100644 --- a/drivers/event/octeontx2/otx2_evdev_irq.c +++ b/drivers/event/octeontx2/otx2_evdev_irq.c @@ -121,7 +121,7 @@ sso_register_irqs(const struct rte_eventdev *event_dev) int i, rc = -EINVAL; uint8_t nb_ports; - nb_ports = dev->nb_event_ports; + nb_ports = dev->nb_event_ports * (dev->dual_ws ? 2 : 1); for (i = 0; i < dev->nb_event_queues; i++) { if (dev->sso_msixoff[i] == MSIX_VECTOR_INVALID) { @@ -163,7 +163,7 @@ sso_unregister_irqs(const struct rte_eventdev *event_dev) uint8_t nb_ports; int i; - nb_ports = dev->nb_event_ports; + nb_ports = dev->nb_event_ports * (dev->dual_ws ? 
2 : 1); for (i = 0; i < dev->nb_event_queues; i++) { uintptr_t base = dev->bar2 + (RVU_BLOCK_ADDR_SSO << 20 | diff --git a/drivers/event/octeontx2/otx2_evdev_stats.h b/drivers/event/octeontx2/otx2_evdev_stats.h index df76a1333..9d7c694ee 100644 --- a/drivers/event/octeontx2/otx2_evdev_stats.h +++ b/drivers/event/octeontx2/otx2_evdev_stats.h @@ -76,11 +76,29 @@ otx2_sso_xstats_get(const struct rte_eventdev *event_dev, xstats = sso_hws_xstats; req_rsp = otx2_mbox_alloc_msg_sso_hws_get_stats(mbox); - ((struct sso_info_req *)req_rsp)->hws = queue_port_id; + ((struct sso_info_req *)req_rsp)->hws = dev->dual_ws ? + 2 * queue_port_id : queue_port_id; rc = otx2_mbox_process_msg(mbox, (void **)&req_rsp); if (rc < 0) goto invalid_value; + if (dev->dual_ws) { + for (i = 0; i < n && i < xstats_mode_count; i++) { + xstat = &xstats[ids[i] - start_offset]; + values[i] = *(uint64_t *) + ((char *)req_rsp + xstat->offset); + values[i] = (values[i] >> xstat->shift) & + xstat->mask; + } + + req_rsp = otx2_mbox_alloc_msg_sso_hws_get_stats(mbox); + ((struct sso_info_req *)req_rsp)->hws = + (2 * queue_port_id) + 1; + rc = otx2_mbox_process_msg(mbox, (void **)&req_rsp); + if (rc < 0) + goto invalid_value; + } + break; case RTE_EVENT_DEV_XSTATS_QUEUE: if (queue_port_id >= (signed int)dev->nb_event_queues) @@ -107,7 +125,11 @@ otx2_sso_xstats_get(const struct rte_eventdev *event_dev, value = *(uint64_t *)((char *)req_rsp + xstat->offset); value = (value >> xstat->shift) & xstat->mask; - values[i] = value; + if ((mode == RTE_EVENT_DEV_XSTATS_PORT) && dev->dual_ws) + values[i] += value; + else + values[i] = value; + values[i] -= xstat->reset_snap[queue_port_id]; } @@ -143,11 +165,30 @@ otx2_sso_xstats_reset(struct rte_eventdev *event_dev, xstats = sso_hws_xstats; req_rsp = otx2_mbox_alloc_msg_sso_hws_get_stats(mbox); - ((struct sso_info_req *)req_rsp)->hws = queue_port_id; + ((struct sso_info_req *)req_rsp)->hws = dev->dual_ws ? + 2 * queue_port_id : queue_port_id; rc = otx2_mbox_process_msg(mbox, (void **)&req_rsp); if (rc < 0) goto invalid_value; + if (dev->dual_ws) { + for (i = 0; i < n && i < xstats_mode_count; i++) { + xstat = &xstats[ids[i] - start_offset]; + xstat->reset_snap[queue_port_id] = *(uint64_t *) + ((char *)req_rsp + xstat->offset); + xstat->reset_snap[queue_port_id] = + (xstat->reset_snap[queue_port_id] >> + xstat->shift) & xstat->mask; + } + + req_rsp = otx2_mbox_alloc_msg_sso_hws_get_stats(mbox); + ((struct sso_info_req *)req_rsp)->hws = + (2 * queue_port_id) + 1; + rc = otx2_mbox_process_msg(mbox, (void **)&req_rsp); + if (rc < 0) + goto invalid_value; + } + break; case RTE_EVENT_DEV_XSTATS_QUEUE: if (queue_port_id >= (signed int)dev->nb_event_queues) @@ -174,7 +215,10 @@ otx2_sso_xstats_reset(struct rte_eventdev *event_dev, value = *(uint64_t *)((char *)req_rsp + xstat->offset); value = (value >> xstat->shift) & xstat->mask; - xstat->reset_snap[queue_port_id] = value; + if ((mode == RTE_EVENT_DEV_XSTATS_PORT) && dev->dual_ws) + xstat->reset_snap[queue_port_id] += value; + else + xstat->reset_snap[queue_port_id] = value; } return i; invalid_value:
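For reference, a hedged sketch of the port-to-LF mapping that
sso_configure_dual_ports() sets up: each dual event port i claims two
consecutive GWS LF BAR2 windows (slots 2*i and 2*i + 1), which is also why
nb_ports is scaled by two for the MSI-X offsets, IRQ registration and LF
teardown paths above. RVU_BLOCK_ADDR_SSOW and dev->bar2 are as used in the
patch; the helper itself is hypothetical.

/* Illustrative only: compute the two GWS LF windows backing one dual port. */
static void
sketch_dual_port_bases(uint64_t bar2, uint8_t port,
		       uintptr_t *base0, uintptr_t *base1)
{
	uint64_t vws = (uint64_t)port * 2;

	*base0 = bar2 + (RVU_BLOCK_ADDR_SSOW << 20 | vws << 12);
	*base1 = bar2 + (RVU_BLOCK_ADDR_SSOW << 20 | (vws + 1) << 12);
}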
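Not part of the patch: a rough sketch of the "ping-pong" dequeue the commit
message describes, using the ws_state[2]/vws fields added to struct
otx2_ssogws_dual above. The sketch_issue_get_work()/sketch_wait_get_work()
helpers are hypothetical stand-ins for the real GET_WORK routines.

/* Illustrative only: alternate between the two cached workslots so that a
 * GET_WORK is always in flight on one while the other's event is processed.
 */
static uint64_t
sketch_dual_ws_deq(struct otx2_ssogws_dual *ws)
{
	struct otx2_ssogws_state *cur = &ws->ws_state[ws->vws];
	struct otx2_ssogws_state *next = &ws->ws_state[!ws->vws];
	uint64_t wqp;

	/* Collect the event whose GET_WORK was issued on the last call. */
	wqp = sketch_wait_get_work(cur);

	/* Immediately request the next event on the paired workslot, hiding
	 * the SSO schedule latency behind processing of 'wqp'.
	 */
	sketch_issue_get_work(next);

	/* Flip the ping-pong bit for the next dequeue. */
	ws->vws = !ws->vws;

	return wqp;
}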
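Similarly, a hedged sketch (not the driver's code) of the port xstats handling
in the otx2_evdev_stats.h hunks: in dual workslot mode, event port N is backed
by HWS LFs 2*N and 2*N + 1, so a per-port counter is the sum of both mailbox
responses. sketch_hws_read_counter() is a hypothetical stand-in for the
sso_hws_get_stats mailbox round-trip.

/* Illustrative only: read one counter for an event port, summing both
 * underlying workslots when dual workslot mode is enabled.
 */
static uint64_t
sketch_port_xstat(uint8_t port, int dual_ws, size_t offset)
{
	uint64_t value;

	if (!dual_ws)
		return sketch_hws_read_counter(port, offset);

	value  = sketch_hws_read_counter(2 * port, offset);
	value += sketch_hws_read_counter(2 * port + 1, offset);

	return value;
}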