From patchwork Mon Aug 24 15:45:00 2020
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 75886
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: John McNamara, Marko Kovacevic, ferruh.yigit@intel.com,
 bruce.richardson@intel.com, padraig.j.connolly@intel.com, stable@dpdk.org
Date: Mon, 24 Aug 2020 16:45:00 +0100
X-Mailer: git-send-email 2.17.1
Subject: [dpdk-dev] [PATCH 1/2] doc/linux_gsg: clarify instructions on running as non-root

The current instructions are slightly out of date when it comes to
providing information about setting up the system for using DPDK as
non-root, so update them.

Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov
Reviewed-by: Ferruh Yigit
---
 doc/guides/linux_gsg/enable_func.rst | 54 ++++++++++++++++++++--------
 1 file changed, 39 insertions(+), 15 deletions(-)

diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
index b2bda80bb7..78b0f7c012 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -58,22 +58,34 @@ The application can then determine what action to take, if any, if the HPET is n
 if any, and on what is available on the system at runtime.
 
 Running DPDK Applications Without Root Privileges
---------------------------------------------------------
+-------------------------------------------------
 
-.. note::
+In order to run DPDK as non-root, the following Linux filesystem objects'
+permissions should be adjusted to ensure that the Linux account being used to
+run the DPDK application has access to them:
 
-    The instructions below will allow running DPDK as non-root with older
-    Linux kernel versions. However, since version 4.0, the kernel does not allow
-    unprivileged processes to read the physical address information from
-    the pagemaps file, making it impossible for those processes to use HW
-    devices which require physical addresses
+* All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
 
-Although applications using the DPDK use network ports and other hardware resources directly,
-with a number of small permission adjustments it is possible to run these applications as a user other than "root".
-To do so, the ownership, or permissions, on the following Linux file system objects should be adjusted to ensure that
-the Linux user account being used to run the DPDK application has access to them:
+* If the HPET is to be used, ``/dev/hpet``
 
-* All directories which serve as hugepage mount points, for example, ``/mnt/huge``
+When running as a non-root user, there may be some additional resource limits
+that are imposed by the system. Specifically, the following resource limits may
+need to be adjusted in order to ensure normal DPDK operation:
+
+* RLIMIT_LOCKS (number of file locks that can be held by a process)
+
+* RLIMIT_NOFILE (number of open file descriptors that can be held open by a process)
+
+* RLIMIT_MEMLOCK (amount of pinned pages the process is allowed to have)
+
+The above limits can usually be adjusted by editing
+the ``/etc/security/limits.conf`` file and rebooting.
+
+Additionally, depending on which kernel driver is in use, the relevant
+resources also should be accessible by the user running the DPDK application.
+
+For the ``igb_uio`` or ``uio_pci_generic`` kernel drivers, the following Linux file
+system objects' permissions should be adjusted:
 
 * The userspace-io device files in ``/dev``, for example, ``/dev/uio0``, ``/dev/uio1``, and so on
 
@@ -82,11 +94,23 @@ the Linux user account being used to run the DPDK application has access to them
       /sys/class/uio/uio0/device/config
       /sys/class/uio/uio0/device/resource*
 
-* If the HPET is to be used, ``/dev/hpet``
-
 .. note::
 
-    On some Linux installations, ``/dev/hugepages`` is also a hugepage mount point created by default.
+    The instructions above will allow running DPDK with the ``igb_uio`` driver as
+    non-root with older Linux kernel versions. However, since version 4.0, the
+    kernel does not allow unprivileged processes to read the physical address
+    information from the pagemaps file, making it impossible for those
+    processes to use HW devices which require physical addresses. In such
+    cases, using the VFIO driver is recommended.
+
+For the ``vfio-pci`` kernel driver, the following Linux file system objects'
+permissions should be adjusted:
+
+* The VFIO device file, ``/dev/vfio/vfio``
+
+* The directories under ``/dev/vfio`` that correspond to IOMMU group numbers of
+  devices intended to be used by DPDK, for example, ``/dev/vfio/50``
+
 
 Power Management and Power Saving Functionality
 -----------------------------------------------

From patchwork Mon Aug 24 15:45:01 2020
X-Patchwork-Submitter: Anatoly Burakov
X-Patchwork-Id: 75887
X-Patchwork-Delegate: thomas@monjalon.net
From: Anatoly Burakov
To: dev@dpdk.org
Cc: John McNamara, Marko Kovacevic, ferruh.yigit@intel.com,
 bruce.richardson@intel.com, padraig.j.connolly@intel.com, stable@dpdk.org
Date: Mon, 24 Aug 2020 16:45:01 +0100
Message-Id: <01efc239aaf6513b768829f7e44ad411cab881fe.1598283570.git.anatoly.burakov@intel.com>
X-Mailer: git-send-email 2.17.1
Subject: [dpdk-dev] [PATCH 2/2] doc/linux_gsg: update information on using hugepages

Current information regarding hugepage usage is a little out of date.
Update it to include information on in-memory mode, as well as on
default mountpoints provided by systemd.

Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov
---
 doc/guides/linux_gsg/sys_reqs.rst | 39 +++++++++++++++++++------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
index a124656bcb..2ddd7ed667 100644
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -155,8 +155,12 @@ Without hugepages, high TLB miss rates would occur with the standard 4k page siz
 Reserving Hugepages for DPDK Use
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-The allocation of hugepages should be done at boot time or as soon as possible after system boot
-to prevent memory from being fragmented in physical memory.
+The allocation of hugepages can be performed either at run time or at boot time.
+In the general case, reserving hugepages at run time is perfectly fine, but in
+use cases where having lots of physically contiguous memory is required, it is
+preferable to reserve hugepages at boot time, as that will help in preventing
+physical memory from becoming heavily fragmented.
+
 To reserve hugepages at boot time, a parameter is passed to the Linux kernel on the kernel command line.
 
 For 2 MB pages, just pass the hugepages option to the kernel. For example, to reserve 1024 pages of 2 MB, use::
@@ -187,9 +191,9 @@ See the Documentation/admin-guide/kernel-parameters.txt file in your Linux sourc
 
 **Alternative:**
 
-For 2 MB pages, there is also the option of allocating hugepages after the system has booted.
+There is also the option of allocating hugepages after the system has booted.
 This is done by echoing the number of hugepages required to a nr_hugepages file in the ``/sys/devices/`` directory.
-For a single-node system, the command to use is as follows (assuming that 1024 pages are required)::
+For a single-node system, the command to use is as follows (assuming that 1024 pages of 2 MB are required)::
 
     echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 
@@ -198,22 +202,27 @@ On a NUMA machine, pages should be allocated explicitly on separate nodes::
     echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
     echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
 
-.. note::
-
-    For 1G pages, it is not possible to reserve the hugepage memory after the system has booted.
-
 Using Hugepages with the DPDK
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Once the hugepage memory is reserved, to make the memory available for DPDK use, perform the following steps::
+If secondary process support is not required, DPDK is able to use hugepages
+without any configuration by using "in-memory" mode. Please see
+:ref:`linux_eal_parameters` for more details.
+
+If secondary process support is required, mount points for hugepages need to be
+created. On modern Linux distributions, a default mount point for hugepages is provided
+by the system and is located at ``/dev/hugepages``. This mount point will use the
+default hugepage size set by the kernel parameters as described above.
+
+However, in order to use multiple hugepage sizes, it is necessary to manually
+create mount points for hugepage sizes that are not provided by the system
+(e.g. 1GB pages).
+
+To make the hugepages of size 1GB available for DPDK use, perform the following steps::
 
     mkdir /mnt/huge
-    mount -t hugetlbfs nodev /mnt/huge
+    mount -t hugetlbfs -o pagesize=1GB nodev /mnt/huge
 
 The mount point can be made permanent across reboots, by adding the following line to the ``/etc/fstab`` file::
 
-    nodev /mnt/huge hugetlbfs defaults 0 0
-
-For 1GB pages, the page size must be specified as a mount option::
-
-    nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
+    nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
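
As a rough way to check the result of the hugepage mount steps described in this patch, the sketch below lists each hugetlbfs mount together with its ``pagesize`` mount option, read from a file in ``/proc/mounts`` format. The helper name ``hugetlbfs_mounts`` is an assumption made for illustration only; it is not part of DPDK or of this series.

```shell
# Hypothetical helper (not a DPDK tool): print "<mount point> <pagesize>"
# for every hugetlbfs entry in a /proc/mounts-style file.
# With no argument it reads /proc/mounts itself.
hugetlbfs_mounts() {
    awk '$3 == "hugetlbfs" {
        # Fallback label for entries that carry no pagesize= option.
        size = "default"
        # Field 4 holds the comma-separated mount options.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^pagesize=/)
                size = substr(opts[i], 10)   # text after "pagesize="
        print $2, size
    }' "${1:-/proc/mounts}"
}
```

Calling ``hugetlbfs_mounts`` with no argument inspects the running system. The kernel usually reports the size in K or M units, so a 1 GB mount may show up as ``1024M``; treat the exact formatting as something to verify on the target system.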