get:
Show a patch.

patch:
Update a patch (partial update: only the fields supplied in the request are changed).

put:
Update a patch (full update: all writable fields are expected in the request).
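
For illustration, a minimal sketch of reading this endpoint from Python. The `requests` library is an assumption (any HTTP client works), and the patch ID is taken from the example response below; read access to patches.dpdk.org needs no authentication.

import requests

BASE = "https://patches.dpdk.org/api"

# Fetch a single patch as JSON -- the same representation shown below.
resp = requests.get(f"{BASE}/patches/32467/")
resp.raise_for_status()
patch = resp.json()

# Pick out a few fields from the response, e.g. the name, workflow
# state, and aggregate CI check result.
print(patch["name"])
print(patch["state"], patch["check"])

The equivalent raw HTTP exchange, as rendered by the browsable API, is: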

GET /api/patches/32467/?format=api
HTTP 200 OK
Allow: GET, PUT, PATCH, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 32467,
    "url": "https://patches.dpdk.org/api/patches/32467/?format=api",
    "web_url": "https://patches.dpdk.org/project/dpdk/patch/47525ef673993d1b0fa091c3b8b7305d5ccec671.1513681966.git.anatoly.burakov@intel.com/",
    "project": {
        "id": 1,
        "url": "https://patches.dpdk.org/api/projects/1/?format=api",
        "name": "DPDK",
        "link_name": "dpdk",
        "list_id": "dev.dpdk.org",
        "list_email": "dev@dpdk.org",
        "web_url": "http://core.dpdk.org",
        "scm_url": "git://dpdk.org/dpdk",
        "webscm_url": "http://git.dpdk.org/dpdk",
        "list_archive_url": "https://inbox.dpdk.org/dev",
        "list_archive_url_format": "https://inbox.dpdk.org/dev/{}",
        "commit_url_format": ""
    },
    "msgid": "<47525ef673993d1b0fa091c3b8b7305d5ccec671.1513681966.git.anatoly.burakov@intel.com>",
    "list_archive_url": "https://inbox.dpdk.org/dev/47525ef673993d1b0fa091c3b8b7305d5ccec671.1513681966.git.anatoly.burakov@intel.com",
    "date": "2017-12-19T11:14:38",
    "name": "[dpdk-dev,RFC,v2,11/23] eal: replace memseg with memseg lists",
    "commit_ref": null,
    "pull_url": null,
    "state": "superseded",
    "archived": true,
    "hash": "5e893498c545475383e57972a518e337a571ad54",
    "submitter": {
        "id": 4,
        "url": "https://patches.dpdk.org/api/people/4/?format=api",
        "name": "Anatoly Burakov",
        "email": "anatoly.burakov@intel.com"
    },
    "delegate": null,
    "mbox": "https://patches.dpdk.org/project/dpdk/patch/47525ef673993d1b0fa091c3b8b7305d5ccec671.1513681966.git.anatoly.burakov@intel.com/mbox/",
    "series": [],
    "comments": "https://patches.dpdk.org/api/patches/32467/comments/",
    "check": "fail",
    "checks": "https://patches.dpdk.org/api/patches/32467/checks/",
    "tags": {},
    "related": [],
    "headers": {
        "Return-Path": "<dev-bounces@dpdk.org>",
        "X-Original-To": "patchwork@dpdk.org",
        "Delivered-To": "patchwork@dpdk.org",
        "Received": [
            "from [92.243.14.124] (localhost [127.0.0.1])\n\tby dpdk.org (Postfix) with ESMTP id DC4D01B1E0;\n\tTue, 19 Dec 2017 12:15:22 +0100 (CET)",
            "from mga03.intel.com (mga03.intel.com [134.134.136.65])\n\tby dpdk.org (Postfix) with ESMTP id 060C81B01B\n\tfor <dev@dpdk.org>; Tue, 19 Dec 2017 12:14:55 +0100 (CET)",
            "from orsmga007.jf.intel.com ([10.7.209.58])\n\tby orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;\n\t19 Dec 2017 03:14:55 -0800",
            "from irvmail001.ir.intel.com ([163.33.26.43])\n\tby orsmga007.jf.intel.com with ESMTP; 19 Dec 2017 03:14:53 -0800",
            "from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com\n\t[10.237.217.45])\n\tby irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id\n\tvBJBEqk7003117; Tue, 19 Dec 2017 11:14:52 GMT",
            "from sivswdev01.ir.intel.com (localhost [127.0.0.1])\n\tby sivswdev01.ir.intel.com with ESMTP id vBJBEqqI010256;\n\tTue, 19 Dec 2017 11:14:52 GMT",
            "(from aburakov@localhost)\n\tby sivswdev01.ir.intel.com with LOCAL id vBJBEq1J010252;\n\tTue, 19 Dec 2017 11:14:52 GMT"
        ],
        "X-Amp-Result": "SKIPPED(no attachment in message)",
        "X-Amp-File-Uploaded": "False",
        "X-ExtLoop1": "1",
        "X-IronPort-AV": "E=Sophos;i=\"5.45,426,1508828400\"; d=\"scan'208\";a=\"3913250\"",
        "From": "Anatoly Burakov <anatoly.burakov@intel.com>",
        "To": "dev@dpdk.org",
        "Cc": "andras.kovacs@ericsson.com, laszlo.vadkeri@ericsson.com,\n\tkeith.wiles@intel.com, benjamin.walker@intel.com,\n\tbruce.richardson@intel.com, thomas@monjalon.net",
        "Date": "Tue, 19 Dec 2017 11:14:38 +0000",
        "Message-Id": "<47525ef673993d1b0fa091c3b8b7305d5ccec671.1513681966.git.anatoly.burakov@intel.com>",
        "X-Mailer": "git-send-email 1.7.0.7",
        "In-Reply-To": [
            "<cover.1513681966.git.anatoly.burakov@intel.com>",
            "<cover.1513681966.git.anatoly.burakov@intel.com>"
        ],
        "References": [
            "<cover.1513681966.git.anatoly.burakov@intel.com>",
            "<cover.1513681966.git.anatoly.burakov@intel.com>"
        ],
        "Subject": "[dpdk-dev] [RFC v2 11/23] eal: replace memseg with memseg lists",
        "X-BeenThere": "dev@dpdk.org",
        "X-Mailman-Version": "2.1.15",
        "Precedence": "list",
        "List-Id": "DPDK patches and discussions <dev.dpdk.org>",
        "List-Unsubscribe": "<https://dpdk.org/ml/options/dev>,\n\t<mailto:dev-request@dpdk.org?subject=unsubscribe>",
        "List-Archive": "<http://dpdk.org/ml/archives/dev/>",
        "List-Post": "<mailto:dev@dpdk.org>",
        "List-Help": "<mailto:dev-request@dpdk.org?subject=help>",
        "List-Subscribe": "<https://dpdk.org/ml/listinfo/dev>,\n\t<mailto:dev-request@dpdk.org?subject=subscribe>",
        "Errors-To": "dev-bounces@dpdk.org",
        "Sender": "\"dev\" <dev-bounces@dpdk.org>"
    },
    "content": "Before, we were aggregating multiple pages into one memseg, so the\nnumber of memsegs was small. Now, each page gets its own memseg,\nso the list of memsegs is huge. To accommodate the new memseg list\nsize and to keep the under-the-hood workings sane, the memseg list\nis now not just a single list, but multiple lists. To be precise,\neach hugepage size available on the system gets a memseg list per\nsocket (so, for example, on a 2-socket system with 2M and 1G\nhugepages, we will get 4 memseg lists).\n\nIn order to support dynamic memory allocation, we reserve all\nmemory in advance. As in, we do an anonymous mmap() of the entire\nmaximum size of memory per hugepage size (which is limited to\neither RTE_MAX_MEMSEG_PER_LIST or 128G worth of memory, whichever\nis the smaller one). The limit is arbitrary.\n\nSo, for each hugepage size, we get (by default) up to 128G worth\nof memory, per socket. The address space is claimed at the start,\nin eal_common_memory.c. The actual page allocation code is in\neal_memalloc.c (Linux-only for now), and largely consists of\nmoved EAL memory init code.\n\nPages in the list are also indexed by address. That is, for\nnon-legacy mode, in order to figure out where the page belongs,\none can simply look at base address for a memseg list. Similarly,\nfiguring out IOVA address of a memzone is a matter of finding the\nright memseg list, getting offset and dividing by page size to get\nthe appropriate memseg. For legacy mode, old behavior of walking\nthe memseg list remains.\n\nDue to switch to fbarray, secondary processes are not currently\nsupported nor tested. Also, one particular API call (dump physmem\nlayout) no longer makes sense not only becase there can now be\nholes in memseg list, but also because there are several memseg\nlists to choose from.\n\nIn legacy mode, nothing is preallocated, and all memsegs are in\na list like before, but each segment still resides in an appropriate\nmemseg list.\n\nThe rest of the changes are really ripple effects from the memseg\nchange - heap changes, compile fixes, and rewrites to support\nfbarray-backed memseg lists.\n\nSigned-off-by: Anatoly Burakov <anatoly.burakov@intel.com>\n---\n config/common_base                                |   3 +-\n drivers/bus/pci/linux/pci.c                       |  29 ++-\n drivers/net/virtio/virtio_user/vhost_kernel.c     | 106 +++++---\n lib/librte_eal/common/eal_common_memory.c         | 245 ++++++++++++++++--\n lib/librte_eal/common/eal_common_memzone.c        |   5 +-\n lib/librte_eal/common/eal_hugepages.h             |   1 +\n lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-\n lib/librte_eal/common/include/rte_memory.h        |  16 ++\n lib/librte_eal/common/malloc_elem.c               |   8 +-\n lib/librte_eal/common/malloc_elem.h               |   6 +-\n lib/librte_eal/common/malloc_heap.c               |  88 +++++--\n lib/librte_eal/common/rte_malloc.c                |  20 +-\n lib/librte_eal/linuxapp/eal/eal.c                 |  21 +-\n lib/librte_eal/linuxapp/eal/eal_memory.c          | 299 ++++++++++++++--------\n lib/librte_eal/linuxapp/eal/eal_vfio.c            | 162 ++++++++----\n test/test/test_malloc.c                           |  29 ++-\n test/test/test_memory.c                           |  44 +++-\n test/test/test_memzone.c                          |  17 +-\n 18 files changed, 815 insertions(+), 306 deletions(-)",
    "diff": "diff --git a/config/common_base b/config/common_base\nindex e74febe..9730d4c 100644\n--- a/config/common_base\n+++ b/config/common_base\n@@ -90,7 +90,8 @@ CONFIG_RTE_CACHE_LINE_SIZE=64\n CONFIG_RTE_LIBRTE_EAL=y\n CONFIG_RTE_MAX_LCORE=128\n CONFIG_RTE_MAX_NUMA_NODES=8\n-CONFIG_RTE_MAX_MEMSEG=256\n+CONFIG_RTE_MAX_MEMSEG_LISTS=16\n+CONFIG_RTE_MAX_MEMSEG_PER_LIST=32768\n CONFIG_RTE_MAX_MEMZONE=2560\n CONFIG_RTE_MAX_TAILQ=32\n CONFIG_RTE_ENABLE_ASSERT=n\ndiff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c\nindex 5da6728..6d3100f 100644\n--- a/drivers/bus/pci/linux/pci.c\n+++ b/drivers/bus/pci/linux/pci.c\n@@ -148,19 +148,30 @@ rte_pci_unmap_device(struct rte_pci_device *dev)\n void *\n pci_find_max_end_va(void)\n {\n-\tconst struct rte_memseg *seg = rte_eal_get_physmem_layout();\n-\tconst struct rte_memseg *last = seg;\n-\tunsigned i = 0;\n+\tvoid *cur_end, *max_end = NULL;\n+\tint i = 0;\n \n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {\n-\t\tif (seg->addr == NULL)\n-\t\t\tbreak;\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tconst struct rte_mem_config *mcfg =\n+\t\t\t\trte_eal_get_configuration()->mem_config;\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[i];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n \n-\t\tif (seg->addr > last->addr)\n-\t\t\tlast = seg;\n+\t\tif (arr->capacity == 0)\n+\t\t\tcontinue;\n \n+\t\t/*\n+\t\t * we need to handle legacy mem case, so don't rely on page size\n+\t\t * to calculate max VA end\n+\t\t */\n+\t\twhile ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {\n+\t\t\tstruct rte_memseg *ms = rte_fbarray_get(arr, i);\n+\t\t\tcur_end = RTE_PTR_ADD(ms->addr, ms->len);\n+\t\t\tif (cur_end > max_end)\n+\t\t\t\tmax_end = cur_end;\n+\t\t}\n \t}\n-\treturn RTE_PTR_ADD(last->addr, last->len);\n+\treturn max_end;\n }\n \n /* parse one line of the \"resource\" sysfs file (note that the 'line'\ndiff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c\nindex 68d28b1..f3f1549 100644\n--- a/drivers/net/virtio/virtio_user/vhost_kernel.c\n+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c\n@@ -99,6 +99,40 @@ static uint64_t vhost_req_user_to_kernel[] = {\n \t[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,\n };\n \n+/* returns number of segments processed */\n+static int\n+add_memory_region(struct vhost_memory_region *mr, const struct rte_fbarray *arr,\n+\t\tint reg_start_idx, int max) {\n+\tconst struct rte_memseg *ms;\n+\tvoid *start_addr, *expected_addr;\n+\tuint64_t len;\n+\tint idx;\n+\n+\tidx = reg_start_idx;\n+\tlen = 0;\n+\tstart_addr = NULL;\n+\texpected_addr = NULL;\n+\n+\t/* we could've relied on page size, but we have to support legacy mem */\n+\twhile (idx < max){\n+\t\tms = rte_fbarray_get(arr, idx);\n+\t\tif (expected_addr == NULL) {\n+\t\t\tstart_addr = ms->addr;\n+\t\t\texpected_addr = RTE_PTR_ADD(ms->addr, ms->len);\n+\t\t} else if (ms->addr != expected_addr)\n+\t\t\tbreak;\n+\t\tlen += ms->len;\n+\t\tidx++;\n+\t}\n+\n+\tmr->guest_phys_addr = (uint64_t)(uintptr_t) start_addr;\n+\tmr->userspace_addr = (uint64_t)(uintptr_t) start_addr;\n+\tmr->memory_size = len;\n+\tmr->mmap_offset = 0;\n+\n+\treturn idx;\n+}\n+\n /* By default, vhost kernel module allows 64 regions, but DPDK allows\n  * 256 segments. 
As a relief, below function merges those virtually\n  * adjacent memsegs into one region.\n@@ -106,8 +140,7 @@ static uint64_t vhost_req_user_to_kernel[] = {\n static struct vhost_memory_kernel *\n prepare_vhost_memory_kernel(void)\n {\n-\tuint32_t i, j, k = 0;\n-\tstruct rte_memseg *seg;\n+\tuint32_t list_idx, region_nr = 0;\n \tstruct vhost_memory_region *mr;\n \tstruct vhost_memory_kernel *vm;\n \n@@ -117,52 +150,41 @@ prepare_vhost_memory_kernel(void)\n \tif (!vm)\n \t\treturn NULL;\n \n-\tfor (i = 0; i < RTE_MAX_MEMSEG; ++i) {\n-\t\tseg = &rte_eal_get_configuration()->mem_config->memseg[i];\n-\t\tif (!seg->addr)\n-\t\t\tbreak;\n-\n-\t\tint new_region = 1;\n-\n-\t\tfor (j = 0; j < k; ++j) {\n-\t\t\tmr = &vm->regions[j];\n+\tfor (list_idx = 0; list_idx < RTE_MAX_MEMSEG_LISTS; ++list_idx) {\n+\t\tconst struct rte_mem_config *mcfg =\n+\t\t\t\trte_eal_get_configuration()->mem_config;\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[list_idx];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\t\tint reg_start_idx, search_idx;\n \n-\t\t\tif (mr->userspace_addr + mr->memory_size ==\n-\t\t\t    (uint64_t)(uintptr_t)seg->addr) {\n-\t\t\t\tmr->memory_size += seg->len;\n-\t\t\t\tnew_region = 0;\n-\t\t\t\tbreak;\n-\t\t\t}\n-\n-\t\t\tif ((uint64_t)(uintptr_t)seg->addr + seg->len ==\n-\t\t\t    mr->userspace_addr) {\n-\t\t\t\tmr->guest_phys_addr =\n-\t\t\t\t\t(uint64_t)(uintptr_t)seg->addr;\n-\t\t\t\tmr->userspace_addr =\n-\t\t\t\t\t(uint64_t)(uintptr_t)seg->addr;\n-\t\t\t\tmr->memory_size += seg->len;\n-\t\t\t\tnew_region = 0;\n-\t\t\t\tbreak;\n-\t\t\t}\n-\t\t}\n-\n-\t\tif (new_region == 0)\n+\t\t/* skip empty segment lists */\n+\t\tif (arr->count == 0)\n \t\t\tcontinue;\n \n-\t\tmr = &vm->regions[k++];\n-\t\t/* use vaddr here! */\n-\t\tmr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;\n-\t\tmr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;\n-\t\tmr->memory_size = seg->len;\n-\t\tmr->mmap_offset = 0;\n-\n-\t\tif (k >= max_regions) {\n-\t\t\tfree(vm);\n-\t\t\treturn NULL;\n+\t\tsearch_idx = 0;\n+\t\twhile ((reg_start_idx = rte_fbarray_find_next_used(arr,\n+\t\t\t\tsearch_idx)) >= 0) {\n+\t\t\tint reg_n_pages;\n+\t\t\tif (region_nr >= max_regions) {\n+\t\t\t\tfree(vm);\n+\t\t\t\treturn NULL;\n+\t\t\t}\n+\t\t\tmr = &vm->regions[region_nr++];\n+\n+\t\t\t/*\n+\t\t\t * we know memseg starts at search_idx, check how many\n+\t\t\t * segments there are\n+\t\t\t */\n+\t\t\treg_n_pages = rte_fbarray_find_contig_used(arr,\n+\t\t\t\t\tsearch_idx);\n+\n+\t\t\t/* look at at most reg_n_pages of memsegs */\n+\t\t\tsearch_idx = add_memory_region(mr, arr, reg_start_idx,\n+\t\t\t\t\tsearch_idx + reg_n_pages);\n \t\t}\n \t}\n \n-\tvm->nregions = k;\n+\tvm->nregions = region_nr;\n \tvm->padding = 0;\n \treturn vm;\n }\ndiff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c\nindex 96570a7..bdd465b 100644\n--- a/lib/librte_eal/common/eal_common_memory.c\n+++ b/lib/librte_eal/common/eal_common_memory.c\n@@ -42,6 +42,7 @@\n #include <sys/mman.h>\n #include <sys/queue.h>\n \n+#include <rte_fbarray.h>\n #include <rte_memory.h>\n #include <rte_eal.h>\n #include <rte_eal_memconfig.h>\n@@ -58,6 +59,8 @@\n  * which is a multiple of hugepage size.\n  */\n \n+#define MEMSEG_LIST_FMT \"memseg-%luk-%i\"\n+\n static uint64_t baseaddr_offset;\n \n void *\n@@ -117,6 +120,178 @@ eal_get_virtual_area(void *requested_addr, uint64_t *size,\n \treturn addr;\n }\n \n+static uint64_t\n+get_mem_amount(uint64_t page_sz) {\n+\tuint64_t area_sz;\n+\n+\t// TODO: saner 
heuristics\n+\t/* limit to RTE_MAX_MEMSEG_PER_LIST pages or 128G worth of memory */\n+\tarea_sz = RTE_MIN(page_sz * RTE_MAX_MEMSEG_PER_LIST, 1ULL << 37);\n+\n+\treturn rte_align64pow2(area_sz);\n+}\n+\n+static int\n+get_max_num_pages(uint64_t page_sz, uint64_t mem_amount) {\n+\treturn mem_amount / page_sz;\n+}\n+\n+static int\n+get_min_num_pages(int max_pages) {\n+\treturn RTE_MIN(256, max_pages);\n+}\n+\n+static int\n+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,\n+\t\tint socket_id) {\n+\tchar name[RTE_FBARRAY_NAME_LEN];\n+\tint min_pages, max_pages;\n+\tuint64_t mem_amount;\n+\tvoid *addr;\n+\n+\tif (!internal_config.legacy_mem) {\n+\t\tmem_amount = get_mem_amount(page_sz);\n+\t\tmax_pages = get_max_num_pages(page_sz, mem_amount);\n+\t\tmin_pages = get_min_num_pages(max_pages);\n+\n+\t\t// TODO: allow shrink?\n+\t\taddr = eal_get_virtual_area(NULL, &mem_amount, page_sz, 0);\n+\t\tif (addr == NULL) {\n+\t\t\tRTE_LOG(ERR, EAL, \"Cannot reserve memory\\n\");\n+\t\t\treturn -1;\n+\t\t}\n+\t} else {\n+\t\taddr = NULL;\n+\t\tmin_pages = 256;\n+\t\tmax_pages = 256;\n+\t}\n+\n+\tsnprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id);\n+\tif (rte_fbarray_alloc(&msl->memseg_arr, name, min_pages, max_pages,\n+\t\t\tsizeof(struct rte_memseg))) {\n+\t\tRTE_LOG(ERR, EAL, \"Cannot allocate memseg list\\n\");\n+\t\treturn -1;\n+\t}\n+\n+\tmsl->hugepage_sz = page_sz;\n+\tmsl->socket_id = socket_id;\n+\tmsl->base_va = addr;\n+\n+\treturn 0;\n+}\n+\n+static int\n+memseg_init(void) {\n+\tstruct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;\n+\tint socket_id, hpi_idx, msl_idx = 0;\n+\tstruct rte_memseg_list *msl;\n+\n+\tif (rte_eal_process_type() == RTE_PROC_SECONDARY) {\n+\t\tRTE_LOG(ERR, EAL, \"Secondary process not supported\\n\");\n+\t\treturn -1;\n+\t}\n+\n+\t/* create memseg lists */\n+\tfor (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;\n+\t\t\thpi_idx++) {\n+\t\tstruct hugepage_info *hpi;\n+\t\tuint64_t hugepage_sz;\n+\n+\t\thpi = &internal_config.hugepage_info[hpi_idx];\n+\t\thugepage_sz = hpi->hugepage_sz;\n+\n+\t\tfor (socket_id = 0; socket_id < (int) rte_num_sockets();\n+\t\t\t\tsocket_id++) {\n+\t\t\tif (msl_idx >= RTE_MAX_MEMSEG_LISTS) {\n+\t\t\t\tRTE_LOG(ERR, EAL,\n+\t\t\t\t\t\"No more space in memseg lists\\n\");\n+\t\t\t\treturn -1;\n+\t\t\t}\n+\t\t\tmsl = &mcfg->memsegs[msl_idx++];\n+\n+\t\t\tif (alloc_memseg_list(msl, hugepage_sz, socket_id)) {\n+\t\t\t\treturn -1;\n+\t\t\t}\n+\t\t}\n+\t}\n+\treturn 0;\n+}\n+\n+static const struct rte_memseg *\n+virt2memseg(const void *addr, const struct rte_memseg_list *msl) {\n+\tconst struct rte_mem_config *mcfg =\n+\t\trte_eal_get_configuration()->mem_config;\n+\tconst struct rte_fbarray *arr;\n+\tint msl_idx, ms_idx;\n+\n+\t/* first, find appropriate memseg list, if it wasn't specified */\n+\tif (msl == NULL) {\n+\t\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\t\tvoid *start, *end;\n+\t\t\tmsl = &mcfg->memsegs[msl_idx];\n+\n+\t\t\tstart = msl->base_va;\n+\t\t\tend = RTE_PTR_ADD(start, msl->hugepage_sz *\n+\t\t\t\t\tmsl->memseg_arr.capacity);\n+\t\t\tif (addr >= start && addr < end)\n+\t\t\t\tbreak;\n+\t\t}\n+\t\t/* if we didn't find our memseg list */\n+\t\tif (msl_idx == RTE_MAX_MEMSEG_LISTS)\n+\t\t\treturn NULL;\n+\t} else {\n+\t\t/* a memseg list was specified, check if it's the right one */\n+\t\tvoid *start, *end;\n+\t\tstart = msl->base_va;\n+\t\tend = RTE_PTR_ADD(start, msl->hugepage_sz *\n+\t\t\t\tmsl->memseg_arr.capacity);\n+\n+\t\tif (addr < 
start || addr >= end)\n+\t\t\treturn NULL;\n+\t}\n+\n+\t/* now, calculate index */\n+\tarr = &msl->memseg_arr;\n+\tms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->hugepage_sz;\n+\treturn rte_fbarray_get(arr, ms_idx);\n+}\n+\n+static const struct rte_memseg *\n+virt2memseg_legacy(const void *addr) {\n+\tconst struct rte_mem_config *mcfg =\n+\t\trte_eal_get_configuration()->mem_config;\n+\tconst struct rte_memseg_list *msl;\n+\tconst struct rte_fbarray *arr;\n+\tint msl_idx, ms_idx;\n+\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\tmsl = &mcfg->memsegs[msl_idx];\n+\t\tarr = &msl->memseg_arr;\n+\n+\t\tms_idx = 0;\n+\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {\n+\t\t\tconst struct rte_memseg *ms;\n+\t\t\tvoid *start, *end;\n+\t\t\tms = rte_fbarray_get(arr, ms_idx);\n+\t\t\tstart = ms->addr;\n+\t\t\tend = RTE_PTR_ADD(start, ms->len);\n+\t\t\tif (addr >= start && addr < end)\n+\t\t\t\treturn ms;\n+\t\t\tms_idx++;\n+\t\t}\n+\t}\n+\treturn NULL;\n+}\n+\n+const struct rte_memseg *\n+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl) {\n+\t/* for legacy memory, we just walk the list, like in the old days. */\n+\tif (internal_config.legacy_mem) {\n+\t\treturn virt2memseg_legacy(addr);\n+\t} else {\n+\t\treturn virt2memseg(addr, msl);\n+\t}\n+}\n+\n \n /*\n  * Return a pointer to a read-only table of struct rte_physmem_desc\n@@ -126,7 +301,9 @@ eal_get_virtual_area(void *requested_addr, uint64_t *size,\n const struct rte_memseg *\n rte_eal_get_physmem_layout(void)\n {\n-\treturn rte_eal_get_configuration()->mem_config->memseg;\n+\tstruct rte_fbarray *arr;\n+\tarr = &rte_eal_get_configuration()->mem_config->memsegs[0].memseg_arr;\n+\treturn rte_fbarray_get(arr, 0);\n }\n \n \n@@ -141,11 +318,24 @@ rte_eal_get_physmem_size(void)\n \t/* get pointer to global configuration */\n \tmcfg = rte_eal_get_configuration()->mem_config;\n \n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n-\t\tif (mcfg->memseg[i].addr == NULL)\n-\t\t\tbreak;\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[i];\n+\n+\t\tif (msl->memseg_arr.count == 0)\n+\t\t\tcontinue;\n+\n+\t\t/* for legacy mem mode, walk the memsegs */\n+\t\tif (internal_config.legacy_mem) {\n+\t\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\t\t\tint ms_idx = 0;\n \n-\t\ttotal_len += mcfg->memseg[i].len;\n+\t\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr, ms_idx) >= 0)) {\n+\t\t\t\tconst struct rte_memseg *ms =\n+\t\t\t\t\t\trte_fbarray_get(arr, ms_idx);\n+\t\t\t\ttotal_len += ms->len;\n+\t\t\t}\n+\t\t} else\n+\t\t\ttotal_len += msl->hugepage_sz * msl->memseg_arr.count;\n \t}\n \n \treturn total_len;\n@@ -161,21 +351,29 @@ rte_dump_physmem_layout(FILE *f)\n \t/* get pointer to global configuration */\n \tmcfg = rte_eal_get_configuration()->mem_config;\n \n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n-\t\tif (mcfg->memseg[i].addr == NULL)\n-\t\t\tbreak;\n-\n-\t\tfprintf(f, \"Segment %u: IOVA:0x%\"PRIx64\", len:%zu, \"\n-\t\t       \"virt:%p, socket_id:%\"PRId32\", \"\n-\t\t       \"hugepage_sz:%\"PRIu64\", nchannel:%\"PRIx32\", \"\n-\t\t       \"nrank:%\"PRIx32\"\\n\", i,\n-\t\t       mcfg->memseg[i].iova,\n-\t\t       mcfg->memseg[i].len,\n-\t\t       mcfg->memseg[i].addr,\n-\t\t       mcfg->memseg[i].socket_id,\n-\t\t       mcfg->memseg[i].hugepage_sz,\n-\t\t       mcfg->memseg[i].nchannel,\n-\t\t       mcfg->memseg[i].nrank);\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tconst struct rte_memseg_list *msl = 
&mcfg->memsegs[i];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\t\tint m_idx = 0;\n+\n+\t\tif (arr->count == 0)\n+\t\t\tcontinue;\n+\n+\t\twhile ((m_idx = rte_fbarray_find_next_used(arr, m_idx)) >= 0) {\n+\t\t\tstruct rte_memseg *ms = rte_fbarray_get(arr, m_idx);\n+\t\t\tfprintf(f, \"Page %u-%u: iova:0x%\"PRIx64\", len:%zu, \"\n+\t\t\t       \"virt:%p, socket_id:%\"PRId32\", \"\n+\t\t\t       \"hugepage_sz:%\"PRIu64\", nchannel:%\"PRIx32\", \"\n+\t\t\t       \"nrank:%\"PRIx32\"\\n\", i, m_idx,\n+\t\t\t       ms->iova,\n+\t\t\t       ms->len,\n+\t\t\t       ms->addr,\n+\t\t\t       ms->socket_id,\n+\t\t\t       ms->hugepage_sz,\n+\t\t\t       ms->nchannel,\n+\t\t\t       ms->nrank);\n+\t\t\tm_idx++;\n+\t\t}\n \t}\n }\n \n@@ -220,9 +418,14 @@ rte_mem_lock_page(const void *virt)\n int\n rte_eal_memory_init(void)\n {\n+\tint retval;\n \tRTE_LOG(DEBUG, EAL, \"Setting up physically contiguous memory...\\n\");\n \n-\tconst int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?\n+\tretval = memseg_init();\n+\tif (retval < 0)\n+\t\treturn -1;\n+\n+\tretval = rte_eal_process_type() == RTE_PROC_PRIMARY ?\n \t\t\trte_eal_hugepage_init() :\n \t\t\trte_eal_hugepage_attach();\n \tif (retval < 0)\ndiff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c\nindex ea072a2..f558ac2 100644\n--- a/lib/librte_eal/common/eal_common_memzone.c\n+++ b/lib/librte_eal/common/eal_common_memzone.c\n@@ -254,10 +254,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,\n \tmz->iova = rte_malloc_virt2iova(mz_addr);\n \tmz->addr = mz_addr;\n \tmz->len = (requested_len == 0 ? elem->size : requested_len);\n-\tmz->hugepage_sz = elem->ms->hugepage_sz;\n-\tmz->socket_id = elem->ms->socket_id;\n+\tmz->hugepage_sz = elem->msl->hugepage_sz;\n+\tmz->socket_id = elem->msl->socket_id;\n \tmz->flags = 0;\n-\tmz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;\n \n \treturn mz;\n }\ndiff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h\nindex 68369f2..cf91009 100644\n--- a/lib/librte_eal/common/eal_hugepages.h\n+++ b/lib/librte_eal/common/eal_hugepages.h\n@@ -52,6 +52,7 @@ struct hugepage_file {\n \tint socket_id;      /**< NUMA socket ID */\n \tint file_id;        /**< the '%d' in HUGEFILE_FMT */\n \tint memseg_id;      /**< the memory segment to which page belongs */\n+\tint memseg_list_id; /**< the memory segment list to which page belongs */\n \tchar filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */\n };\n \ndiff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h\nindex b9eee70..c9b57a4 100644\n--- a/lib/librte_eal/common/include/rte_eal_memconfig.h\n+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h\n@@ -40,12 +40,30 @@\n #include <rte_malloc_heap.h>\n #include <rte_rwlock.h>\n #include <rte_pause.h>\n+#include <rte_fbarray.h>\n \n #ifdef __cplusplus\n extern \"C\" {\n #endif\n \n /**\n+ * memseg list is a special case as we need to store a bunch of other data\n+ * together with the array itself.\n+ */\n+struct rte_memseg_list {\n+\tRTE_STD_C11\n+\tunion {\n+\t\tvoid *base_va;\n+\t\t/**< Base virtual address for this memseg list. */\n+\t\tuint64_t addr_64;\n+\t\t/**< Makes sure addr is always 64-bits */\n+\t};\n+\tint socket_id; /**< Socket ID for all memsegs in this list. */\n+\tuint64_t hugepage_sz; /**< page size for all memsegs in this list. 
*/\n+\tstruct rte_fbarray memseg_arr;\n+};\n+\n+/**\n  * the structure for the memory configuration for the RTE.\n  * Used by the rte_config structure. It is separated out, as for multi-process\n  * support, the memory details should be shared across instances\n@@ -71,9 +89,11 @@ struct rte_mem_config {\n \tuint32_t memzone_cnt; /**< Number of allocated memzones */\n \n \t/* memory segments and zones */\n-\tstruct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */\n \tstruct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */\n \n+\tstruct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];\n+\t/**< list of dynamic arrays holding memsegs */\n+\n \tstruct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */\n \n \t/* Heaps of Malloc per socket */\ndiff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h\nindex 14aacea..f005716 100644\n--- a/lib/librte_eal/common/include/rte_memory.h\n+++ b/lib/librte_eal/common/include/rte_memory.h\n@@ -50,6 +50,9 @@ extern \"C\" {\n \n #include <rte_common.h>\n \n+/* forward declaration for pointers */\n+struct rte_memseg_list;\n+\n __extension__\n enum rte_page_sizes {\n \tRTE_PGSIZE_4K    = 1ULL << 12,\n@@ -158,6 +161,19 @@ phys_addr_t rte_mem_virt2phy(const void *virt);\n rte_iova_t rte_mem_virt2iova(const void *virt);\n \n /**\n+ * Get memseg corresponding to virtual memory address.\n+ *\n+ * @param virt\n+ *   The virtual address.\n+ * @param msl\n+ *   Memseg list in which to look for memsegs (can be NULL).\n+ * @return\n+ *   Memseg to which this virtual address belongs to.\n+ */\n+const struct rte_memseg *rte_mem_virt2memseg(const void *virt,\n+\t\tconst struct rte_memseg_list *msl);\n+\n+/**\n  * Get the layout of the available physical memory.\n  *\n  * It can be useful for an application to have the full physical\ndiff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c\nindex 782aaa7..ab09b94 100644\n--- a/lib/librte_eal/common/malloc_elem.c\n+++ b/lib/librte_eal/common/malloc_elem.c\n@@ -54,11 +54,11 @@\n  * Initialize a general malloc_elem header structure\n  */\n void\n-malloc_elem_init(struct malloc_elem *elem,\n-\t\tstruct malloc_heap *heap, const struct rte_memseg *ms, size_t size)\n+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,\n+\t\tconst struct rte_memseg_list *msl, size_t size)\n {\n \telem->heap = heap;\n-\telem->ms = ms;\n+\telem->msl = msl;\n \telem->prev = NULL;\n \telem->next = NULL;\n \tmemset(&elem->free_list, 0, sizeof(elem->free_list));\n@@ -172,7 +172,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)\n \tconst size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;\n \tconst size_t new_elem_size = elem->size - old_elem_size;\n \n-\tmalloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);\n+\tmalloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);\n \tsplit_pt->prev = elem;\n \tsplit_pt->next = next_elem;\n \tif (next_elem)\ndiff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h\nindex cf27b59..330bddc 100644\n--- a/lib/librte_eal/common/malloc_elem.h\n+++ b/lib/librte_eal/common/malloc_elem.h\n@@ -34,7 +34,7 @@\n #ifndef MALLOC_ELEM_H_\n #define MALLOC_ELEM_H_\n \n-#include <rte_memory.h>\n+#include <rte_eal_memconfig.h>\n \n /* dummy definition of struct so we can use pointers to it in malloc_elem struct */\n struct malloc_heap;\n@@ -50,7 +50,7 @@ struct malloc_elem {\n \tstruct malloc_elem *volatile prev;      
/* points to prev elem in memseg */\n \tstruct malloc_elem *volatile next;      /* points to next elem in memseg */\n \tLIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */\n-\tconst struct rte_memseg *ms;\n+\tconst struct rte_memseg_list *msl;\n \tvolatile enum elem_state state;\n \tuint32_t pad;\n \tsize_t size;\n@@ -137,7 +137,7 @@ malloc_elem_from_data(const void *data)\n void\n malloc_elem_init(struct malloc_elem *elem,\n \t\tstruct malloc_heap *heap,\n-\t\tconst struct rte_memseg *ms,\n+\t\tconst struct rte_memseg_list *msl,\n \t\tsize_t size);\n \n void\ndiff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c\nindex 1b35468..5fa21fe 100644\n--- a/lib/librte_eal/common/malloc_heap.c\n+++ b/lib/librte_eal/common/malloc_heap.c\n@@ -50,6 +50,7 @@\n #include <rte_memcpy.h>\n #include <rte_atomic.h>\n \n+#include \"eal_internal_cfg.h\"\n #include \"malloc_elem.h\"\n #include \"malloc_heap.h\"\n \n@@ -91,22 +92,25 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)\n }\n \n /*\n- * Expand the heap with a memseg.\n- * This reserves the zone and sets a dummy malloc_elem header at the end\n- * to prevent overflow. The rest of the zone is added to free list as a single\n- * large free block\n+ * Expand the heap with a memory area.\n  */\n-static void\n-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)\n+static struct malloc_elem *\n+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,\n+\t\tvoid *start, size_t len)\n {\n-\tstruct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;\n-\tconst size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;\n+\tstruct malloc_elem *elem = start;\n+\n+\tmalloc_elem_init(elem, heap, msl, len);\n+\n+\tmalloc_elem_insert(elem);\n+\n+\telem = malloc_elem_join_adjacent_free(elem);\n \n-\tmalloc_elem_init(start_elem, heap, ms, elem_size);\n-\tmalloc_elem_insert(start_elem);\n-\tmalloc_elem_free_list_insert(start_elem);\n+\tmalloc_elem_free_list_insert(elem);\n \n-\theap->total_size += elem_size;\n+\theap->total_size += len;\n+\n+\treturn elem;\n }\n \n /*\n@@ -127,7 +131,7 @@ find_suitable_element(struct malloc_heap *heap, size_t size,\n \t\tfor (elem = LIST_FIRST(&heap->free_head[idx]);\n \t\t\t\t!!elem; elem = LIST_NEXT(elem, free_list)) {\n \t\t\tif (malloc_elem_can_hold(elem, size, align, bound)) {\n-\t\t\t\tif (check_hugepage_sz(flags, elem->ms->hugepage_sz))\n+\t\t\t\tif (check_hugepage_sz(flags, elem->msl->hugepage_sz))\n \t\t\t\t\treturn elem;\n \t\t\t\tif (alt_elem == NULL)\n \t\t\t\t\talt_elem = elem;\n@@ -249,16 +253,62 @@ int\n rte_eal_malloc_heap_init(void)\n {\n \tstruct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;\n-\tunsigned ms_cnt;\n-\tstruct rte_memseg *ms;\n+\tint msl_idx;\n+\tstruct rte_memseg_list *msl;\n \n \tif (mcfg == NULL)\n \t\treturn -1;\n \n-\tfor (ms = &mcfg->memseg[0], ms_cnt = 0;\n-\t\t\t(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);\n-\t\t\tms_cnt++, ms++) {\n-\t\tmalloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);\n+\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\tint start;\n+\t\tstruct rte_fbarray *arr;\n+\t\tstruct malloc_heap *heap;\n+\n+\t\tmsl = &mcfg->memsegs[msl_idx];\n+\t\tarr = &msl->memseg_arr;\n+\t\theap = &mcfg->malloc_heaps[msl->socket_id];\n+\n+\t\tif (arr->capacity == 0)\n+\t\t\tcontinue;\n+\n+\t\t/* for legacy mode, just walk the list */\n+\t\tif (internal_config.legacy_mem) {\n+\t\t\tint ms_idx = 0;\n+\t\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr, 
ms_idx)) >= 0) {\n+\t\t\t\tstruct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);\n+\t\t\t\tmalloc_heap_add_memory(heap, msl, ms->addr, ms->len);\n+\t\t\t\tms_idx++;\n+\t\t\t\tRTE_LOG(DEBUG, EAL, \"Heap on socket %d was expanded by %zdMB\\n\",\n+\t\t\t\t\tmsl->socket_id, ms->len >> 20ULL);\n+\t\t\t}\n+\t\t\tcontinue;\n+\t\t}\n+\n+\t\t/* find first segment */\n+\t\tstart = rte_fbarray_find_next_used(arr, 0);\n+\n+\t\twhile (start >= 0) {\n+\t\t\tint contig_segs;\n+\t\t\tstruct rte_memseg *start_seg;\n+\t\t\tsize_t len, hugepage_sz = msl->hugepage_sz;\n+\n+\t\t\t/* find how many pages we can lump in together */\n+\t\t\tcontig_segs = rte_fbarray_find_contig_used(arr, start);\n+\t\t\tstart_seg = rte_fbarray_get(arr, start);\n+\t\t\tlen = contig_segs * hugepage_sz;\n+\n+\t\t\t/*\n+\t\t\t * we've found (hopefully) a bunch of contiguous\n+\t\t\t * segments, so add them to the heap.\n+\t\t\t */\n+\t\t\tmalloc_heap_add_memory(heap, msl, start_seg->addr, len);\n+\n+\t\t\tRTE_LOG(DEBUG, EAL, \"Heap on socket %d was expanded by %zdMB\\n\",\n+\t\t\t\tmsl->socket_id, len >> 20ULL);\n+\n+\t\t\tstart = rte_fbarray_find_next_used(arr,\n+\t\t\t\t\tstart + contig_segs);\n+\t\t}\n \t}\n \n \treturn 0;\ndiff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c\nindex 74b5417..92cd7d8 100644\n--- a/lib/librte_eal/common/rte_malloc.c\n+++ b/lib/librte_eal/common/rte_malloc.c\n@@ -251,17 +251,21 @@ rte_malloc_set_limit(__rte_unused const char *type,\n rte_iova_t\n rte_malloc_virt2iova(const void *addr)\n {\n-\trte_iova_t iova;\n+\tconst struct rte_memseg *ms;\n \tconst struct malloc_elem *elem = malloc_elem_from_data(addr);\n+\n \tif (elem == NULL)\n \t\treturn RTE_BAD_IOVA;\n-\tif (elem->ms->iova == RTE_BAD_IOVA)\n-\t\treturn RTE_BAD_IOVA;\n \n \tif (rte_eal_iova_mode() == RTE_IOVA_VA)\n-\t\tiova = (uintptr_t)addr;\n-\telse\n-\t\tiova = elem->ms->iova +\n-\t\t\tRTE_PTR_DIFF(addr, elem->ms->addr);\n-\treturn iova;\n+\t\treturn (uintptr_t) addr;\n+\n+\tms = rte_mem_virt2memseg(addr, elem->msl);\n+\tif (ms == NULL)\n+\t\treturn RTE_BAD_IOVA;\n+\n+\tif (ms->iova == RTE_BAD_IOVA)\n+\t\treturn RTE_BAD_IOVA;\n+\n+\treturn ms->iova + RTE_PTR_DIFF(addr, ms->addr);\n }\ndiff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c\nindex 37ae8e0..a27536f 100644\n--- a/lib/librte_eal/linuxapp/eal/eal.c\n+++ b/lib/librte_eal/linuxapp/eal/eal.c\n@@ -102,8 +102,8 @@ static int mem_cfg_fd = -1;\n static struct flock wr_lock = {\n \t\t.l_type = F_WRLCK,\n \t\t.l_whence = SEEK_SET,\n-\t\t.l_start = offsetof(struct rte_mem_config, memseg),\n-\t\t.l_len = sizeof(early_mem_config.memseg),\n+\t\t.l_start = offsetof(struct rte_mem_config, memsegs),\n+\t\t.l_len = sizeof(early_mem_config.memsegs),\n };\n \n /* Address of global and public configuration */\n@@ -661,17 +661,20 @@ eal_parse_args(int argc, char **argv)\n static void\n eal_check_mem_on_local_socket(void)\n {\n-\tconst struct rte_memseg *ms;\n+\tconst struct rte_memseg_list *msl;\n \tint i, socket_id;\n \n \tsocket_id = rte_lcore_to_socket_id(rte_config.master_lcore);\n \n-\tms = rte_eal_get_physmem_layout();\n-\n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++)\n-\t\tif (ms[i].socket_id == socket_id &&\n-\t\t\t\tms[i].len > 0)\n-\t\t\treturn;\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tmsl = &rte_eal_get_configuration()->mem_config->memsegs[i];\n+\t\tif (msl->socket_id != socket_id)\n+\t\t\tcontinue;\n+\t\t/* for legacy memory, check if there's anything allocated */\n+\t\tif (internal_config.legacy_mem && 
msl->memseg_arr.count == 0)\n+\t\t\tcontinue;\n+\t\treturn;\n+\t}\n \n \tRTE_LOG(WARNING, EAL, \"WARNING: Master core has no \"\n \t\t\t\"memory on local socket!\\n\");\ndiff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c\nindex 5b18af9..59f6889 100644\n--- a/lib/librte_eal/linuxapp/eal/eal_memory.c\n+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c\n@@ -929,6 +929,24 @@ huge_recover_sigbus(void)\n \t}\n }\n \n+static struct rte_memseg_list *\n+get_memseg_list(int socket, uint64_t page_sz) {\n+\tstruct rte_mem_config *mcfg =\n+\t\t\trte_eal_get_configuration()->mem_config;\n+\tstruct rte_memseg_list *msl;\n+\tint msl_idx;\n+\n+\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\tmsl = &mcfg->memsegs[msl_idx];\n+\t\tif (msl->hugepage_sz != page_sz)\n+\t\t\tcontinue;\n+\t\tif (msl->socket_id != socket)\n+\t\t\tcontinue;\n+\t\treturn msl;\n+\t}\n+\treturn NULL;\n+}\n+\n /*\n  * Prepare physical memory mapping: fill configuration structure with\n  * these infos, return 0 on success.\n@@ -946,11 +964,14 @@ eal_legacy_hugepage_init(void)\n \tstruct rte_mem_config *mcfg;\n \tstruct hugepage_file *hugepage = NULL, *tmp_hp = NULL;\n \tstruct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];\n+\tstruct rte_fbarray *arr;\n+\tstruct rte_memseg *ms;\n \n \tuint64_t memory[RTE_MAX_NUMA_NODES];\n \n \tunsigned hp_offset;\n \tint i, j, new_memseg;\n+\tint ms_idx, msl_idx;\n \tint nr_hugefiles, nr_hugepages = 0;\n \tvoid *addr;\n \n@@ -963,6 +984,9 @@ eal_legacy_hugepage_init(void)\n \n \t/* hugetlbfs can be disabled */\n \tif (internal_config.no_hugetlbfs) {\n+\t\tarr = &mcfg->memsegs[0].memseg_arr;\n+\t\tms = rte_fbarray_get(arr, 0);\n+\n \t\taddr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,\n \t\t\t\tMAP_PRIVATE | MAP_ANONYMOUS, 0, 0);\n \t\tif (addr == MAP_FAILED) {\n@@ -970,14 +994,15 @@ eal_legacy_hugepage_init(void)\n \t\t\t\t\tstrerror(errno));\n \t\t\treturn -1;\n \t\t}\n+\t\trte_fbarray_set_used(arr, 0, true);\n \t\tif (rte_eal_iova_mode() == RTE_IOVA_VA)\n-\t\t\tmcfg->memseg[0].iova = (uintptr_t)addr;\n+\t\t\tms->iova = (uintptr_t)addr;\n \t\telse\n-\t\t\tmcfg->memseg[0].iova = RTE_BAD_IOVA;\n-\t\tmcfg->memseg[0].addr = addr;\n-\t\tmcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;\n-\t\tmcfg->memseg[0].len = internal_config.memory;\n-\t\tmcfg->memseg[0].socket_id = 0;\n+\t\t\tms->iova = RTE_BAD_IOVA;\n+\t\tms->addr = addr;\n+\t\tms->hugepage_sz = RTE_PGSIZE_4K;\n+\t\tms->len = internal_config.memory;\n+\t\tms->socket_id = 0;\n \t\treturn 0;\n \t}\n \n@@ -1218,27 +1243,59 @@ eal_legacy_hugepage_init(void)\n #endif\n \n \t\tif (new_memseg) {\n-\t\t\tj += 1;\n-\t\t\tif (j == RTE_MAX_MEMSEG)\n-\t\t\t\tbreak;\n+\t\t\tstruct rte_memseg_list *msl;\n+\t\t\tint socket;\n+\t\t\tuint64_t page_sz;\n \n-\t\t\tmcfg->memseg[j].iova = hugepage[i].physaddr;\n-\t\t\tmcfg->memseg[j].addr = hugepage[i].final_va;\n-\t\t\tmcfg->memseg[j].len = hugepage[i].size;\n-\t\t\tmcfg->memseg[j].socket_id = hugepage[i].socket_id;\n-\t\t\tmcfg->memseg[j].hugepage_sz = hugepage[i].size;\n+\t\t\tsocket = hugepage[i].socket_id;\n+\t\t\tpage_sz = hugepage[i].size;\n+\n+\t\t\tif (page_sz == 0)\n+\t\t\t\tcontinue;\n+\n+\t\t\t/* figure out where to put this memseg */\n+\t\t\tmsl = get_memseg_list(socket, page_sz);\n+\t\t\tif (!msl)\n+\t\t\t\trte_panic(\"Unknown socket or page sz: %i %lx\\n\",\n+\t\t\t\t\tsocket, page_sz);\n+\t\t\tmsl_idx = msl - &mcfg->memsegs[0];\n+\t\t\tarr = &msl->memseg_arr;\n+\t\t\t/*\n+\t\t\t * we may run out of space, so check if we have 
enough\n+\t\t\t * and expand if necessary\n+\t\t\t */\n+\t\t\tif (arr->count >= arr->len) {\n+\t\t\t\tint new_len = arr->len * 2;\n+\t\t\t\tnew_len = RTE_MIN(new_len, arr->capacity);\n+\t\t\t\tif (rte_fbarray_resize(arr, new_len)) {\n+\t\t\t\t\tRTE_LOG(ERR, EAL, \"Couldn't expand memseg list\\n\");\n+\t\t\t\t\tbreak;\n+\t\t\t\t}\n+\t\t\t}\n+\t\t\tms_idx = rte_fbarray_find_next_free(arr, 0);\n+\t\t\tms = rte_fbarray_get(arr, ms_idx);\n+\n+\t\t\tms->iova = hugepage[i].physaddr;\n+\t\t\tms->addr = hugepage[i].final_va;\n+\t\t\tms->len = page_sz;\n+\t\t\tms->socket_id = socket;\n+\t\t\tms->hugepage_sz = page_sz;\n+\n+\t\t\t/* segment may be empty */\n+\t\t\trte_fbarray_set_used(arr, ms_idx, true);\n \t\t}\n \t\t/* continuation of previous memseg */\n \t\telse {\n #ifdef RTE_ARCH_PPC_64\n \t\t/* Use the phy and virt address of the last page as segment\n \t\t * address for IBM Power architecture */\n-\t\t\tmcfg->memseg[j].iova = hugepage[i].physaddr;\n-\t\t\tmcfg->memseg[j].addr = hugepage[i].final_va;\n+\t\t\tms->iova = hugepage[i].physaddr;\n+\t\t\tms->addr = hugepage[i].final_va;\n #endif\n-\t\t\tmcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;\n+\t\t\tms->len += ms->hugepage_sz;\n \t\t}\n-\t\thugepage[i].memseg_id = j;\n+\t\thugepage[i].memseg_id = ms_idx;\n+\t\thugepage[i].memseg_list_id = msl_idx;\n \t}\n \n \tif (i < nr_hugefiles) {\n@@ -1248,7 +1305,7 @@ eal_legacy_hugepage_init(void)\n \t\t\t\"Please either increase it or request less amount \"\n \t\t\t\"of memory.\\n\",\n \t\t\ti, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),\n-\t\t\tRTE_MAX_MEMSEG);\n+\t\t\tRTE_MAX_MEMSEG_PER_LIST);\n \t\tgoto fail;\n \t}\n \n@@ -1289,8 +1346,9 @@ eal_legacy_hugepage_attach(void)\n \tconst struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;\n \tstruct hugepage_file *hp = NULL;\n \tunsigned num_hp = 0;\n-\tunsigned i, s = 0; /* s used to track the segment number */\n-\tunsigned max_seg = RTE_MAX_MEMSEG;\n+\tunsigned i;\n+\tint ms_idx, msl_idx;\n+\tunsigned cur_seg, max_seg;\n \toff_t size = 0;\n \tint fd, fd_zero = -1, fd_hugepage = -1;\n \n@@ -1315,53 +1373,63 @@ eal_legacy_hugepage_attach(void)\n \t}\n \n \t/* map all segments into memory to make sure we get the addrs */\n-\tfor (s = 0; s < RTE_MAX_MEMSEG; ++s) {\n-\t\tvoid *base_addr;\n+\tmax_seg = 0;\n+\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n \n-\t\t/*\n-\t\t * the first memory segment with len==0 is the one that\n-\t\t * follows the last valid segment.\n-\t\t */\n-\t\tif (mcfg->memseg[s].len == 0)\n-\t\t\tbreak;\n+\t\tms_idx = 0;\n+\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {\n+\t\t\tstruct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);\n+\t\t\tvoid *base_addr;\n \n-\t\t/*\n-\t\t * fdzero is mmapped to get a contiguous block of virtual\n-\t\t * addresses of the appropriate memseg size.\n-\t\t * use mmap to get identical addresses as the primary process.\n-\t\t */\n-\t\tbase_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,\n-\t\t\t\t PROT_READ,\n+\t\t\tms = rte_fbarray_get(arr, ms_idx);\n+\n+\t\t\t/*\n+\t\t\t * the first memory segment with len==0 is the one that\n+\t\t\t * follows the last valid segment.\n+\t\t\t */\n+\t\t\tif (ms->len == 0)\n+\t\t\t\tbreak;\n+\n+\t\t\t/*\n+\t\t\t * fdzero is mmapped to get a contiguous block of virtual\n+\t\t\t * addresses of the appropriate memseg size.\n+\t\t\t * use mmap to get identical addresses as the primary 
process.\n+\t\t\t */\n+\t\t\tbase_addr = mmap(ms->addr, ms->len,\n+\t\t\t\t\tPROT_READ,\n #ifdef RTE_ARCH_PPC_64\n-\t\t\t\t MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,\n+\t\t\t\t\tMAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,\n #else\n-\t\t\t\t MAP_PRIVATE,\n+\t\t\t\t\tMAP_PRIVATE,\n #endif\n-\t\t\t\t fd_zero, 0);\n-\t\tif (base_addr == MAP_FAILED ||\n-\t\t    base_addr != mcfg->memseg[s].addr) {\n-\t\t\tmax_seg = s;\n-\t\t\tif (base_addr != MAP_FAILED) {\n-\t\t\t\t/* errno is stale, don't use */\n-\t\t\t\tRTE_LOG(ERR, EAL, \"Could not mmap %llu bytes \"\n-\t\t\t\t\t\"in /dev/zero at [%p], got [%p] - \"\n-\t\t\t\t\t\"please use '--base-virtaddr' option\\n\",\n-\t\t\t\t\t(unsigned long long)mcfg->memseg[s].len,\n-\t\t\t\t\tmcfg->memseg[s].addr, base_addr);\n-\t\t\t\tmunmap(base_addr, mcfg->memseg[s].len);\n-\t\t\t} else {\n-\t\t\t\tRTE_LOG(ERR, EAL, \"Could not mmap %llu bytes \"\n-\t\t\t\t\t\"in /dev/zero at [%p]: '%s'\\n\",\n-\t\t\t\t\t(unsigned long long)mcfg->memseg[s].len,\n-\t\t\t\t\tmcfg->memseg[s].addr, strerror(errno));\n-\t\t\t}\n-\t\t\tif (aslr_enabled() > 0) {\n-\t\t\t\tRTE_LOG(ERR, EAL, \"It is recommended to \"\n-\t\t\t\t\t\"disable ASLR in the kernel \"\n-\t\t\t\t\t\"and retry running both primary \"\n-\t\t\t\t\t\"and secondary processes\\n\");\n+\t\t\t\t\tfd_zero, 0);\n+\t\t\tif (base_addr == MAP_FAILED || base_addr != ms->addr) {\n+\t\t\t\tif (base_addr != MAP_FAILED) {\n+\t\t\t\t\t/* errno is stale, don't use */\n+\t\t\t\t\tRTE_LOG(ERR, EAL, \"Could not mmap %llu bytes \"\n+\t\t\t\t\t\t\"in /dev/zero at [%p], got [%p] - \"\n+\t\t\t\t\t\t\"please use '--base-virtaddr' option\\n\",\n+\t\t\t\t\t\t(unsigned long long)ms->len,\n+\t\t\t\t\t\tms->addr, base_addr);\n+\t\t\t\t\tmunmap(base_addr, ms->len);\n+\t\t\t\t} else {\n+\t\t\t\t\tRTE_LOG(ERR, EAL, \"Could not mmap %llu bytes \"\n+\t\t\t\t\t\t\"in /dev/zero at [%p]: '%s'\\n\",\n+\t\t\t\t\t\t(unsigned long long)ms->len,\n+\t\t\t\t\t\tms->addr, strerror(errno));\n+\t\t\t\t}\n+\t\t\t\tif (aslr_enabled() > 0) {\n+\t\t\t\t\tRTE_LOG(ERR, EAL, \"It is recommended to \"\n+\t\t\t\t\t\t\"disable ASLR in the kernel \"\n+\t\t\t\t\t\t\"and retry running both primary \"\n+\t\t\t\t\t\t\"and secondary processes\\n\");\n+\t\t\t\t}\n+\t\t\t\tgoto error;\n \t\t\t}\n-\t\t\tgoto error;\n+\t\t\tmax_seg++;\n+\t\t\tms_idx++;\n \t\t}\n \t}\n \n@@ -1375,46 +1443,54 @@ eal_legacy_hugepage_attach(void)\n \tnum_hp = size / sizeof(struct hugepage_file);\n \tRTE_LOG(DEBUG, EAL, \"Analysing %u files\\n\", num_hp);\n \n-\ts = 0;\n-\twhile (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){\n-\t\tvoid *addr, *base_addr;\n-\t\tuintptr_t offset = 0;\n-\t\tsize_t mapping_size;\n-\t\t/*\n-\t\t * free previously mapped memory so we can map the\n-\t\t * hugepages into the space\n-\t\t */\n-\t\tbase_addr = mcfg->memseg[s].addr;\n-\t\tmunmap(base_addr, mcfg->memseg[s].len);\n-\n-\t\t/* find the hugepages for this segment and map them\n-\t\t * we don't need to worry about order, as the server sorted the\n-\t\t * entries before it did the second mmap of them */\n-\t\tfor (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){\n-\t\t\tif (hp[i].memseg_id == (int)s){\n-\t\t\t\tfd = open(hp[i].filepath, O_RDWR);\n-\t\t\t\tif (fd < 0) {\n-\t\t\t\t\tRTE_LOG(ERR, EAL, \"Could not open %s\\n\",\n-\t\t\t\t\t\thp[i].filepath);\n-\t\t\t\t\tgoto error;\n-\t\t\t\t}\n-\t\t\t\tmapping_size = hp[i].size;\n-\t\t\t\taddr = mmap(RTE_PTR_ADD(base_addr, offset),\n-\t\t\t\t\t\tmapping_size, PROT_READ | PROT_WRITE,\n-\t\t\t\t\t\tMAP_SHARED, fd, 0);\n-\t\t\t\tclose(fd); /* close file both on 
success and on failure */\n-\t\t\t\tif (addr == MAP_FAILED ||\n-\t\t\t\t\t\taddr != RTE_PTR_ADD(base_addr, offset)) {\n-\t\t\t\t\tRTE_LOG(ERR, EAL, \"Could not mmap %s\\n\",\n-\t\t\t\t\t\thp[i].filepath);\n-\t\t\t\t\tgoto error;\n+\t/* map all segments into memory to make sure we get the addrs */\n+\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\n+\t\tms_idx = 0;\n+\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {\n+\t\t\tstruct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);\n+\t\t\tvoid *addr, *base_addr;\n+\t\t\tuintptr_t offset = 0;\n+\t\t\tsize_t mapping_size;\n+\n+\t\t\tms = rte_fbarray_get(arr, ms_idx);\n+\t\t\t/*\n+\t\t\t * free previously mapped memory so we can map the\n+\t\t\t * hugepages into the space\n+\t\t\t */\n+\t\t\tbase_addr = ms->addr;\n+\t\t\tmunmap(base_addr, ms->len);\n+\n+\t\t\t/* find the hugepages for this segment and map them\n+\t\t\t * we don't need to worry about order, as the server sorted the\n+\t\t\t * entries before it did the second mmap of them */\n+\t\t\tfor (i = 0; i < num_hp && offset < ms->len; i++){\n+\t\t\t\tif (hp[i].memseg_id == ms_idx &&\n+\t\t\t\t\t\thp[i].memseg_list_id == msl_idx) {\n+\t\t\t\t\tfd = open(hp[i].filepath, O_RDWR);\n+\t\t\t\t\tif (fd < 0) {\n+\t\t\t\t\t\tRTE_LOG(ERR, EAL, \"Could not open %s\\n\",\n+\t\t\t\t\t\t\thp[i].filepath);\n+\t\t\t\t\t\tgoto error;\n+\t\t\t\t\t}\n+\t\t\t\t\tmapping_size = hp[i].size;\n+\t\t\t\t\taddr = mmap(RTE_PTR_ADD(base_addr, offset),\n+\t\t\t\t\t\t\tmapping_size, PROT_READ | PROT_WRITE,\n+\t\t\t\t\t\t\tMAP_SHARED, fd, 0);\n+\t\t\t\t\tclose(fd); /* close file both on success and on failure */\n+\t\t\t\t\tif (addr == MAP_FAILED ||\n+\t\t\t\t\t\t\taddr != RTE_PTR_ADD(base_addr, offset)) {\n+\t\t\t\t\t\tRTE_LOG(ERR, EAL, \"Could not mmap %s\\n\",\n+\t\t\t\t\t\t\thp[i].filepath);\n+\t\t\t\t\t\tgoto error;\n+\t\t\t\t\t}\n+\t\t\t\t\toffset+=mapping_size;\n \t\t\t\t}\n-\t\t\t\toffset+=mapping_size;\n \t\t\t}\n-\t\t}\n-\t\tRTE_LOG(DEBUG, EAL, \"Mapped segment %u of size 0x%llx\\n\", s,\n-\t\t\t\t(unsigned long long)mcfg->memseg[s].len);\n-\t\ts++;\n+\t\t\tRTE_LOG(DEBUG, EAL, \"Mapped segment of size 0x%llx\\n\",\n+\t\t\t\t\t(unsigned long long)ms->len);\t\t}\n \t}\n \t/* unmap the hugepage config file, since we are done using it */\n \tmunmap(hp, size);\n@@ -1423,8 +1499,27 @@ eal_legacy_hugepage_attach(void)\n \treturn 0;\n \n error:\n-\tfor (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)\n-\t\tmunmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);\n+\t/* map all segments into memory to make sure we get the addrs */\n+\tcur_seg = 0;\n+\tfor (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\n+\t\tif (cur_seg >= max_seg)\n+\t\t\tbreak;\n+\n+\t\tms_idx = 0;\n+\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {\n+\t\t\tstruct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);\n+\n+\t\t\tif (cur_seg >= max_seg)\n+\t\t\t\tbreak;\n+\t\t\tms = rte_fbarray_get(arr, i);\n+\t\t\tmunmap(ms->addr, ms->len);\n+\n+\t\t\tcur_seg++;\n+\t\t}\n+\t}\n \tif (hp != NULL && hp != MAP_FAILED)\n \t\tmunmap(hp, size);\n \tif (fd_zero >= 0)\ndiff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c\nindex 58f0123..09dfc68 100644\n--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c\n+++ 
b/lib/librte_eal/linuxapp/eal/eal_vfio.c\n@@ -696,33 +696,52 @@ vfio_get_group_no(const char *sysfs_base,\n static int\n vfio_type1_dma_map(int vfio_container_fd)\n {\n-\tconst struct rte_memseg *ms = rte_eal_get_physmem_layout();\n \tint i, ret;\n \n \t/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */\n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n \t\tstruct vfio_iommu_type1_dma_map dma_map;\n+\t\tconst struct rte_memseg_list *msl;\n+\t\tconst struct rte_fbarray *arr;\n+\t\tint ms_idx, next_idx;\n \n-\t\tif (ms[i].addr == NULL)\n-\t\t\tbreak;\n+\t\tmsl = &rte_eal_get_configuration()->mem_config->memsegs[i];\n+\t\tarr = &msl->memseg_arr;\n \n-\t\tmemset(&dma_map, 0, sizeof(dma_map));\n-\t\tdma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);\n-\t\tdma_map.vaddr = ms[i].addr_64;\n-\t\tdma_map.size = ms[i].len;\n-\t\tif (rte_eal_iova_mode() == RTE_IOVA_VA)\n-\t\t\tdma_map.iova = dma_map.vaddr;\n-\t\telse\n-\t\t\tdma_map.iova = ms[i].iova;\n-\t\tdma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;\n+\t\t/* skip empty memseg lists */\n+\t\tif (arr->count == 0)\n+\t\t\tcontinue;\n \n-\t\tret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);\n+\t\tnext_idx = 0;\n \n-\t\tif (ret) {\n-\t\t\tRTE_LOG(ERR, EAL, \"  cannot set up DMA remapping, \"\n-\t\t\t\t\t  \"error %i (%s)\\n\", errno,\n-\t\t\t\t\t  strerror(errno));\n-\t\t\treturn -1;\n+\t\t// TODO: don't bother with physical addresses?\n+\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr,\n+\t\t\t\tnext_idx) >= 0)) {\n+\t\t\tuint64_t addr, len, hw_addr;\n+\t\t\tconst struct rte_memseg *ms;\n+\t\t\tnext_idx = ms_idx + 1;\n+\n+\t\t\tms = rte_fbarray_get(arr, ms_idx);\n+\n+\t\t\taddr = ms->addr_64;\n+\t\t\tlen = ms->hugepage_sz;\n+\t\t\thw_addr = ms->iova;\n+\n+\t\t\tmemset(&dma_map, 0, sizeof(dma_map));\n+\t\t\tdma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);\n+\t\t\tdma_map.vaddr = addr;\n+\t\t\tdma_map.size = len;\n+\t\t\tdma_map.iova = hw_addr;\n+\t\t\tdma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;\n+\n+\t\t\tret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);\n+\n+\t\t\tif (ret) {\n+\t\t\t\tRTE_LOG(ERR, EAL, \"  cannot set up DMA remapping, \"\n+\t\t\t\t\t\t  \"error %i (%s)\\n\", errno,\n+\t\t\t\t\t\t  strerror(errno));\n+\t\t\t\treturn -1;\n+\t\t\t}\n \t\t}\n \t}\n \n@@ -732,8 +751,8 @@ vfio_type1_dma_map(int vfio_container_fd)\n static int\n vfio_spapr_dma_map(int vfio_container_fd)\n {\n-\tconst struct rte_memseg *ms = rte_eal_get_physmem_layout();\n \tint i, ret;\n+\tuint64_t hugepage_sz = 0;\n \n \tstruct vfio_iommu_spapr_register_memory reg = {\n \t\t.argsz = sizeof(reg),\n@@ -767,17 +786,31 @@ vfio_spapr_dma_map(int vfio_container_fd)\n \t}\n \n \t/* create DMA window from 0 to max(phys_addr + len) */\n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n-\t\tif (ms[i].addr == NULL)\n-\t\t\tbreak;\n-\n-\t\tcreate.window_size = RTE_MAX(create.window_size,\n-\t\t\t\tms[i].iova + ms[i].len);\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tconst struct rte_mem_config *mcfg =\n+\t\t\t\trte_eal_get_configuration()->mem_config;\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[i];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\t\tint idx, next_idx;\n+\n+\t\tif (msl->base_va == NULL)\n+\t\t\tcontinue;\n+\t\tif (msl->memseg_arr.count == 0)\n+\t\t\tcontinue;\n+\n+\t\tnext_idx = 0;\n+\t\twhile ((idx = rte_fbarray_find_next_used(arr, next_idx)) >= 0) {\n+\t\t\tconst struct rte_memseg *ms = rte_fbarray_get(arr, 
idx);\n+\t\t\thugepage_sz = RTE_MAX(hugepage_sz, ms->hugepage_sz);\n+\t\t\tcreate.window_size = RTE_MAX(create.window_size,\n+\t\t\t\t\tms[i].iova + ms[i].len);\n+\t\t\tnext_idx = idx + 1;\n+\t\t}\n \t}\n \n \t/* sPAPR requires window size to be a power of 2 */\n \tcreate.window_size = rte_align64pow2(create.window_size);\n-\tcreate.page_shift = __builtin_ctzll(ms->hugepage_sz);\n+\tcreate.page_shift = __builtin_ctzll(hugepage_sz);\n \tcreate.levels = 1;\n \n \tret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);\n@@ -793,41 +826,60 @@ vfio_spapr_dma_map(int vfio_container_fd)\n \t}\n \n \t/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */\n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n \t\tstruct vfio_iommu_type1_dma_map dma_map;\n+\t\tconst struct rte_memseg_list *msl;\n+\t\tconst struct rte_fbarray *arr;\n+\t\tint ms_idx, next_idx;\n \n-\t\tif (ms[i].addr == NULL)\n-\t\t\tbreak;\n+\t\tmsl = &rte_eal_get_configuration()->mem_config->memsegs[i];\n+\t\tarr = &msl->memseg_arr;\n \n-\t\treg.vaddr = (uintptr_t) ms[i].addr;\n-\t\treg.size = ms[i].len;\n-\t\tret = ioctl(vfio_container_fd,\n-\t\t\tVFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);\n-\t\tif (ret) {\n-\t\t\tRTE_LOG(ERR, EAL, \"  cannot register vaddr for IOMMU, \"\n-\t\t\t\t\"error %i (%s)\\n\", errno, strerror(errno));\n-\t\t\treturn -1;\n-\t\t}\n+\t\t/* skip empty memseg lists */\n+\t\tif (arr->count == 0)\n+\t\t\tcontinue;\n \n-\t\tmemset(&dma_map, 0, sizeof(dma_map));\n-\t\tdma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);\n-\t\tdma_map.vaddr = ms[i].addr_64;\n-\t\tdma_map.size = ms[i].len;\n-\t\tif (rte_eal_iova_mode() == RTE_IOVA_VA)\n-\t\t\tdma_map.iova = dma_map.vaddr;\n-\t\telse\n-\t\t\tdma_map.iova = ms[i].iova;\n-\t\tdma_map.flags = VFIO_DMA_MAP_FLAG_READ |\n-\t\t\t\t VFIO_DMA_MAP_FLAG_WRITE;\n+\t\tnext_idx = 0;\n \n-\t\tret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);\n+\t\twhile ((ms_idx = rte_fbarray_find_next_used(arr,\n+\t\t\t\tnext_idx) >= 0)) {\n+\t\t\tuint64_t addr, len, hw_addr;\n+\t\t\tconst struct rte_memseg *ms;\n+\t\t\tnext_idx = ms_idx + 1;\n \n-\t\tif (ret) {\n-\t\t\tRTE_LOG(ERR, EAL, \"  cannot set up DMA remapping, \"\n-\t\t\t\t\"error %i (%s)\\n\", errno, strerror(errno));\n-\t\t\treturn -1;\n-\t\t}\n+\t\t\tms = rte_fbarray_get(arr, ms_idx);\n+\n+\t\t\taddr = ms->addr_64;\n+\t\t\tlen = ms->hugepage_sz;\n+\t\t\thw_addr = ms->iova;\n \n+\t\t\treg.vaddr = (uintptr_t) addr;\n+\t\t\treg.size = len;\n+\t\t\tret = ioctl(vfio_container_fd,\n+\t\t\t\tVFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);\n+\t\t\tif (ret) {\n+\t\t\t\tRTE_LOG(ERR, EAL, \"  cannot register vaddr for IOMMU, error %i (%s)\\n\",\n+\t\t\t\t\t\terrno, strerror(errno));\n+\t\t\t\treturn -1;\n+\t\t\t}\n+\n+\t\t\tmemset(&dma_map, 0, sizeof(dma_map));\n+\t\t\tdma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);\n+\t\t\tdma_map.vaddr = addr;\n+\t\t\tdma_map.size = len;\n+\t\t\tdma_map.iova = hw_addr;\n+\t\t\tdma_map.flags = VFIO_DMA_MAP_FLAG_READ |\n+\t\t\t\t\tVFIO_DMA_MAP_FLAG_WRITE;\n+\n+\t\t\tret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);\n+\n+\t\t\tif (ret) {\n+\t\t\t\tRTE_LOG(ERR, EAL, \"  cannot set up DMA remapping, \"\n+\t\t\t\t\t\t  \"error %i (%s)\\n\", errno,\n+\t\t\t\t\t\t  strerror(errno));\n+\t\t\t\treturn -1;\n+\t\t\t}\n+\t\t}\n \t}\n \n \treturn 0;\ndiff --git a/test/test/test_malloc.c b/test/test/test_malloc.c\nindex 4572caf..ae24c33 100644\n--- a/test/test/test_malloc.c\n+++ b/test/test/test_malloc.c\n@@ -41,6 +41,7 @@\n \n #include 
<rte_common.h>\n #include <rte_memory.h>\n+#include <rte_eal_memconfig.h>\n #include <rte_per_lcore.h>\n #include <rte_launch.h>\n #include <rte_eal.h>\n@@ -734,15 +735,23 @@ test_malloc_bad_params(void)\n \treturn -1;\n }\n \n-/* Check if memory is available on a specific socket */\n+/* Check if memory is avilable on a specific socket */\n static int\n is_mem_on_socket(int32_t socket)\n {\n-\tconst struct rte_memseg *ms = rte_eal_get_physmem_layout();\n+\tconst struct rte_mem_config *mcfg =\n+\t\t\trte_eal_get_configuration()->mem_config;\n \tunsigned i;\n \n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n-\t\tif (socket == ms[i].socket_id)\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tconst struct rte_memseg_list *msl =\n+\t\t\t\t&mcfg->memsegs[i];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\n+\t\tif (msl->socket_id != socket)\n+\t\t\tcontinue;\n+\n+\t\tif (arr->count)\n \t\t\treturn 1;\n \t}\n \treturn 0;\n@@ -755,16 +764,8 @@ is_mem_on_socket(int32_t socket)\n static int32_t\n addr_to_socket(void * addr)\n {\n-\tconst struct rte_memseg *ms = rte_eal_get_physmem_layout();\n-\tunsigned i;\n-\n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n-\t\tif ((ms[i].addr <= addr) &&\n-\t\t\t\t((uintptr_t)addr <\n-\t\t\t\t((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))\n-\t\t\treturn ms[i].socket_id;\n-\t}\n-\treturn -1;\n+\tconst struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);\n+\treturn ms == NULL ? -1 : ms->socket_id;\n }\n \n /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */\ndiff --git a/test/test/test_memory.c b/test/test/test_memory.c\nindex 921bdc8..0d877c8 100644\n--- a/test/test/test_memory.c\n+++ b/test/test/test_memory.c\n@@ -34,8 +34,11 @@\n #include <stdio.h>\n #include <stdint.h>\n \n+#include <rte_eal.h>\n+#include <rte_eal_memconfig.h>\n #include <rte_memory.h>\n #include <rte_common.h>\n+#include <rte_memzone.h>\n \n #include \"test.h\"\n \n@@ -54,10 +57,12 @@\n static int\n test_memory(void)\n {\n+\tconst struct rte_memzone *mz = NULL;\n \tuint64_t s;\n \tunsigned i;\n \tsize_t j;\n-\tconst struct rte_memseg *mem;\n+\tconst struct rte_mem_config *mcfg =\n+\t\t\trte_eal_get_configuration()->mem_config;\n \n \t/*\n \t * dump the mapped memory: the python-expect script checks\n@@ -69,20 +74,43 @@ test_memory(void)\n \t/* check that memory size is != 0 */\n \ts = rte_eal_get_physmem_size();\n \tif (s == 0) {\n-\t\tprintf(\"No memory detected\\n\");\n-\t\treturn -1;\n+\t\tprintf(\"No memory detected, attempting to allocate\\n\");\n+\t\tmz = rte_memzone_reserve(\"tmp\", 1000, SOCKET_ID_ANY, 0);\n+\n+\t\tif (!mz) {\n+\t\t\tprintf(\"Failed to allocate a memzone\\n\");\n+\t\t\treturn -1;\n+\t\t}\n \t}\n \n \t/* try to read memory (should not segfault) */\n-\tmem = rte_eal_get_physmem_layout();\n-\tfor (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tconst struct rte_memseg_list *msl = &mcfg->memsegs[i];\n+\t\tconst struct rte_fbarray *arr = &msl->memseg_arr;\n+\t\tint search_idx, cur_idx;\n+\n+\t\tif (arr->count == 0)\n+\t\t\tcontinue;\n+\n+\t\tsearch_idx = 0;\n \n-\t\t/* check memory */\n-\t\tfor (j = 0; j<mem[i].len; j++) {\n-\t\t\t*((volatile uint8_t *) mem[i].addr + j);\n+\t\twhile ((cur_idx = rte_fbarray_find_next_used(arr,\n+\t\t\t\tsearch_idx)) >= 0) {\n+\t\t\tconst struct rte_memseg *ms;\n+\n+\t\t\tms = rte_fbarray_get(arr, cur_idx);\n+\n+\t\t\t/* check memory */\n+\t\t\tfor (j = 0; j < ms->len; j++) {\n+\t\t\t\t*((volatile uint8_t *) ms->addr + j);\n+\t\t\t}\n+\t\t\tsearch_idx = 
cur_idx + 1;\n \t\t}\n \t}\n \n+\tif (mz)\n+\t\trte_memzone_free(mz);\n+\n \treturn 0;\n }\n \ndiff --git a/test/test/test_memzone.c b/test/test/test_memzone.c\nindex 1cf235a..47af721 100644\n--- a/test/test/test_memzone.c\n+++ b/test/test/test_memzone.c\n@@ -132,22 +132,25 @@ static int\n test_memzone_reserve_flags(void)\n {\n \tconst struct rte_memzone *mz;\n-\tconst struct rte_memseg *ms;\n \tint hugepage_2MB_avail = 0;\n \tint hugepage_1GB_avail = 0;\n \tint hugepage_16MB_avail = 0;\n \tint hugepage_16GB_avail = 0;\n \tconst size_t size = 100;\n \tint i = 0;\n-\tms = rte_eal_get_physmem_layout();\n-\tfor (i = 0; i < RTE_MAX_MEMSEG; i++) {\n-\t\tif (ms[i].hugepage_sz == RTE_PGSIZE_2M)\n+\n+\tfor (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {\n+\t\tstruct rte_mem_config *mcfg =\n+\t\t\t\trte_eal_get_configuration()->mem_config;\n+\t\tstruct rte_memseg_list *msl = &mcfg->memsegs[i];\n+\n+\t\tif (msl->hugepage_sz == RTE_PGSIZE_2M)\n \t\t\thugepage_2MB_avail = 1;\n-\t\tif (ms[i].hugepage_sz == RTE_PGSIZE_1G)\n+\t\tif (msl->hugepage_sz == RTE_PGSIZE_1G)\n \t\t\thugepage_1GB_avail = 1;\n-\t\tif (ms[i].hugepage_sz == RTE_PGSIZE_16M)\n+\t\tif (msl->hugepage_sz == RTE_PGSIZE_16M)\n \t\t\thugepage_16MB_avail = 1;\n-\t\tif (ms[i].hugepage_sz == RTE_PGSIZE_16G)\n+\t\tif (msl->hugepage_sz == RTE_PGSIZE_16G)\n \t\t\thugepage_16GB_avail = 1;\n \t}\n \t/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */\n",
    "prefixes": [
        "dpdk-dev",
        "RFC",
        "v2",
        "11/23"
    ]
}
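
Updating a patch via `put` or `patch` requires write access. The sketch below is an assumption-laden example rather than anything shown on this page: it supposes you hold a Patchwork API token with maintainer rights on the project, and that the server accepts Patchwork's standard `Authorization: Token ...` header. The `state` and `archived` values mirror the fields shown in the response above.

import requests

BASE = "https://patches.dpdk.org/api"
TOKEN = "..."  # hypothetical token; Patchwork tokens are generated from your user profile

# Partially update the patch: only the fields present in the payload
# are changed (HTTP PATCH semantics).
resp = requests.patch(
    f"{BASE}/patches/32467/",
    headers={"Authorization": f"Token {TOKEN}"},
    json={"state": "superseded", "archived": True},
)
resp.raise_for_status()
print(resp.json()["state"])

A `put` to the same URL behaves analogously, but as a full update it expects every writable field in the payload.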