[RFC] usertools: add telemetry exporter

Message ID 20230926163442.844006-2-rjarry@redhat.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Headers
Series [RFC] usertools: add telemetry exporter

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS

Commit Message

Robin Jarry Sept. 26, 2023, 4:34 p.m. UTC
  For now, the telemetry socket is local to the machine running a DPDK
application and there is no official "schema" for the exposed metrics.
Add a framework and a script to collect and expose these metrics to
telemetry and observability aggregators such as Prometheus, Carbon or
InfluxDB. The exposed data must be designed with end-users in mind:
some DPDK terminology or internals may not make sense to everyone.

The script only serves as an entry point and does not know anything
about any specific metrics nor JSON data structures exposed in the
telemetry socket.

It uses dynamically loaded endpoint exporters which are basic python
files that must implement two functions:

 def info() -> dict[MetricName, MetricInfo]:
     Mapping of metric names to their description and type.

 def metrics(sock: TelemetrySocket) -> list[MetricValue]:
     Request data from sock and return it as metric values. A metric
     value is a 3-tuple: (name: str, value: any, labels: dict). Each
     name must be present in info().

The sock argument passed to metrics() has a single method:

 def cmd(self, uri: str, arg: any = None) -> dict | list:
     Request JSON data to the telemetry socket and parse it to python
     values.
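A minimal endpoint following this contract could look like the sketch below. It is modeled on the built-in counters endpoint included in this patch; the `/ethdev/*` telemetry URIs are real, but the module itself is only illustrative:

```python
# Sketch of an endpoint module (illustrative, modeled on the built-in
# counters endpoint further down in this patch).
RX_PACKETS = "rx_packets"


def info() -> dict:
    # Metric name -> (description, metric type)
    return {RX_PACKETS: ("Number of successfully received packets.", "counter")}


def metrics(sock) -> list:
    # Each returned value is a (name, value, labels) 3-tuple whose name
    # must be a key of info().
    out = []
    for port_id in sock.cmd("/ethdev/list"):
        port = sock.cmd("/ethdev/info", port_id)
        stats = sock.cmd("/ethdev/stats", port_id)
        out.append((RX_PACKETS, stats["ipackets"], {"port": port["name"]}))
    return out
```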

The main script invokes endpoints and exports the data into an output
format. For now, only two formats are implemented:

* openmetrics/prometheus: text-based format exported via a local HTTP
  server.
* carbon/graphite: binary (python pickle) format exported to a remote
  Carbon TCP server.
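For reference, the carbon "pickle" wire format amounts to a pickled list of (path, (timestamp, value)) tuples prefixed with a 4-byte big-endian payload length, as sketched below (metric path and value are made up):

```python
# Sketch of the carbon pickle wire format: a pickled list of
# (path, (timestamp, value)) tuples behind a 4-byte length header.
import pickle
import struct
import time

metrics = [("dpdk.cpu.busy_cycles;cpu=8;numa=0", (time.time(), 6171923160))]
payload = pickle.dumps(metrics, protocol=2)
message = struct.pack("!L", len(payload)) + payload  # ready for sendall()
```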

As a starting point, 3 built-in endpoints are implemented:

* counters: ethdev hardware counters
* cpu: lcore usage
* memory: overall memory usage

The goal is to keep all built-in endpoints in the DPDK repository so
that they can be updated along with the telemetry JSON data structures.

Example output for the openmetrics:// format:

 ~# dpdk-telemetry-exporter.py -o openmetrics://:9876 &
 INFO using endpoint: counters (from .../telemetry-endpoints/counters.py)
 INFO using endpoint: cpu (from .../telemetry-endpoints/cpu.py)
 INFO using endpoint: memory (from .../telemetry-endpoints/memory.py)
 INFO listening on port 9876
 [1] 838829

 ~$ curl http://127.0.0.1:9876/
 # HELP dpdk_cpu_total_cycles Total number of CPU cycles.
 # TYPE dpdk_cpu_total_cycles counter
 # HELP dpdk_cpu_busy_cycles Number of busy CPU cycles.
 # TYPE dpdk_cpu_busy_cycles counter
 dpdk_cpu_total_cycles{cpu="73", numa="0"} 4353385274702980
 dpdk_cpu_busy_cycles{cpu="73", numa="0"} 6215932860
 dpdk_cpu_total_cycles{cpu="9", numa="0"} 4353385274745740
 dpdk_cpu_busy_cycles{cpu="9", numa="0"} 6215932860
 dpdk_cpu_total_cycles{cpu="8", numa="0"} 4353383451895540
 dpdk_cpu_busy_cycles{cpu="8", numa="0"} 6171923160
 dpdk_cpu_total_cycles{cpu="72", numa="0"} 4353385274817320
 dpdk_cpu_busy_cycles{cpu="72", numa="0"} 6215932860
 # HELP dpdk_memory_total_bytes The total size of reserved memory in bytes.
 # TYPE dpdk_memory_total_bytes gauge
 # HELP dpdk_memory_used_bytes The currently used memory in bytes.
 # TYPE dpdk_memory_used_bytes gauge
 dpdk_memory_total_bytes 1073741824
 dpdk_memory_used_bytes 794197376

Link: https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
Link: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#text-format
Link: https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol
Link: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/prometheus
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---

Notes:
    v1:
    
    * Ideally, this script should be tested in CI to avoid breakage when the
      telemetry data structures change. I don't know where such a test could
      be done.
    
    * There was work done 3/4 years ago in collectd to add DPDK telemetry
      support but I think this has now been abandoned.
    
      https://github.com/collectd/collectd/blob/main/src/dpdk_telemetry.c
    
      I think that keeping the exporters in the DPDK repository makes more
      sense from a maintainability perspective.

 usertools/dpdk-telemetry-exporter.py      | 376 ++++++++++++++++++++++
 usertools/meson.build                     |   6 +
 usertools/telemetry-endpoints/counters.py |  47 +++
 usertools/telemetry-endpoints/cpu.py      |  29 ++
 usertools/telemetry-endpoints/memory.py   |  37 +++
 5 files changed, 495 insertions(+)
 create mode 100755 usertools/dpdk-telemetry-exporter.py
 create mode 100644 usertools/telemetry-endpoints/counters.py
 create mode 100644 usertools/telemetry-endpoints/cpu.py
 create mode 100644 usertools/telemetry-endpoints/memory.py
  

Comments

Robin Jarry Nov. 20, 2023, 1:26 p.m. UTC | #1
Gentle ping.

Is anybody interested in this?
  
Anthony Harivel March 27, 2024, 3:18 p.m. UTC | #2
Hi Robin, 

Thanks for this patch. I did test it and it works as expected. 
Nonetheless, maybe we can improve on some parts. 

In 'class  TelemetrySocket', there is:
...
self.sock.connect(path)
data = json.loads(self.sock.recv(1024).decode())
...

Maybe we can improve with something like: 

        try:
            rcv_data = self.sock.recv(1024)

            if rcv_data:
                data = json.loads(rcv_data.decode())
            else:
                print("No data received from socket.")
        except json.JSONDecodeError as e:
            print("Error decoding JSON:", e)
        except Exception as e:
            print("An error occurred:", e)

So that it handles the error cases a bit better.

In the same way, more robust error handling could be implemented in
load_endpoints():
...
except Exception as e:
    LOG.error("Failed to load endpoint module '%s' from '%s': %s", name, f, e)
...

For example, you might catch FileNotFoundError, ImportError, or SyntaxError.
That could help to debug!


About TelemetryEndpoint I would see something like: 

class TelemetryEndpoint:
    """
    Placeholder class only used for typing annotations.
    """

    @staticmethod
    def info() -> typing.Dict[MetricName, MetricInfo]:
        """
        Mapping of metric names to their description and type.
        """
        raise NotImplementedError()

    @staticmethod
    def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
        """
        Request data from sock and return it as metric values. Each metric
        name must be present in info().
        """
        try:
            metrics = []
            metrics_data = sock.fetch_metrics_data()
            for metric_name, metric_value in metrics_data.items():
                metrics.append((metric_name, metric_value, {}))
            return metrics
        except Exception as e:
            LOG.error("Failed to fetch metrics data: %s", e)
            # If unable to fetch metrics data, return an empty list
            return []

With these changes, the metrics method of the TelemetryEndpoint class could
handle errors better and the exporter can continue functioning even if there
are issues with fetching metrics data.

I don't know if all of that makes sense or if it's just nitpicking!
I can also propose an enhanced version of your patch if you prefer.

Regards,
Anthony
  
Robin Jarry April 1, 2024, 9:28 p.m. UTC | #3
Anthony Harivel, Mar 27, 2024 at 16:18:
> Hi Robin,
>
> Thanks for this patch. I did test it and it works as expected. 
> Nonetheless, maybe we can improve on some parts.

Hey Anthony, thanks a lot for testing!

> In 'class  TelemetrySocket', there is:
> ...
> self.sock.connect(path)
> data = json.loads(self.sock.recv(1024).decode())
> ...
>
> Maybe we can improve with something like:
>
>         try:
>             rcv_data = self.sock.recv(1024)
>             if rcv_data:
>                 data = json.loads(rcv_data.decode())
>             else:
>                 print("No data received from socket.")
>         except json.JSONDecodeError as e:
>             print("Error decoding JSON:", e)
>         except Exception as e:
>             print("An error occurred:", e)
>
> So that it handles the error cases a bit better.

This is undesired as it would actually mask the errors from the calling 
code. Unless you can do something about the error, printing it should be 
left to the calling code. I will handle these errors better in v2.
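For illustration, the distinction is roughly the following (a sketch; the names are invented, not the actual v2 code):

```python
import json


def parse_reply(data: bytes):
    # Helper code: do NOT swallow json.JSONDecodeError here; let it
    # propagate so the caller can decide what to do with it.
    return json.loads(data.decode("utf-8"))


def handle_request(data: bytes) -> str:
    # Calling code: this is where the appropriate reaction (log, retry,
    # HTTP error response, ...) is actually known.
    try:
        parse_reply(data)
        return "200 OK"
    except (ValueError, UnicodeDecodeError):
        return "500 Internal Server Error"
```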

> In the same way to implement more robust error handling mechanisms in:
> def load_endpoints
> ...
> except Exception as e:
>     LOG.error("Failed to load endpoint module '%s' from '%s': %s", name, f, e)
> ...
>
> For example, you might catch FileNotFoundError, ImportError, or SyntaxError.
> That could help to debug!

We could print the whole stack trace but I don't see what special 
handling could be done depending on the exception. I will try to make 
this better for v2.

> About TelemetryEndpoint I would see something like:
>
> class TelemetryEndpoint:
>     """
>     Placeholder class only used for typing annotations.
>     """
>
>     @staticmethod
>     def info() -> typing.Dict[MetricName, MetricInfo]:
>         """
>         Mapping of metric names to their description and type.
>         """
>         raise NotImplementedError()
>
>     @staticmethod
>     def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
>         """
>         Request data from sock and return it as metric values. Each metric
>         name must be present in info().
>         """
>         try:
>             metrics = []
>             metrics_data = sock.fetch_metrics_data()
>             for metric_name, metric_value in metrics_data.items():
>                 metrics.append((metric_name, metric_value, {}))
>             return metrics
>         except Exception as e:
>             LOG.error("Failed to fetch metrics data: %s", e)
>             # If unable to fetch metrics data, return an empty list
>             return []
>
> With these changes, the metrics method of the TelemetryEndpoint class 
> could handle errors better and the exporter can continue functioning 
> even if there are issues with fetching metrics data.
>
> I don't know if all of that makes sense or if it's just nitpicking!
> I can also propose an enhanced version of your patch if you prefer.

As indicated in the docstring, this class is merely a placeholder. Its 
code is never executed. It only stands to represent what functions must 
be implemented in endpoints.

However, your point is valid. I will update my code to handle errors in 
endpoints more gracefully and avoid failing the whole request if there 
is only one error.
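One possible shape for that (a sketch, not necessarily what v2 will look like): collect from each endpoint independently, so a single failing endpoint is logged and skipped instead of failing the whole request.

```python
import logging

LOG = logging.getLogger(__name__)


def collect(endpoints, sock):
    # Gather metric values from every endpoint; one endpoint raising an
    # exception no longer aborts the whole collection.
    values = []
    for e in endpoints:
        try:
            values.extend(e.metrics(sock))
        except Exception:
            LOG.exception("collecting from endpoint: %s", e.__name__)
    return values
```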

Thanks for the thorough review!
  

Patch

diff --git a/usertools/dpdk-telemetry-exporter.py b/usertools/dpdk-telemetry-exporter.py
new file mode 100755
index 000000000000..6c1495a9e1e8
--- /dev/null
+++ b/usertools/dpdk-telemetry-exporter.py
@@ -0,0 +1,376 @@ 
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2023 Robin Jarry
+
+'''
+DPDK telemetry exporter.
+
+It uses dynamically loaded endpoint exporters which are basic python files that
+must implement two functions:
+
+    def info() -> dict[MetricName, MetricInfo]:
+        """
+        Mapping of metric names to their description and type.
+        """
+
+    def metrics(sock: TelemetrySocket) -> list[MetricValue]:
+        """
+        Request data from sock and return it as metric values. A metric value
+        is a 3-tuple: (name: str, value: any, labels: dict). Each name must be
+        present in info().
+        """
+
+The sock argument passed to metrics() has a single method:
+
+    def cmd(self, uri, arg=None) -> dict | list:
+        """
+        Request JSON data to the telemetry socket and parse it to python
+        values.
+        """
+
+See existing endpoints for examples.
+
+The exporter supports multiple output formats:
+
+prometheus://ADDRESS:PORT
+openmetrics://ADDRESS:PORT
+  Expose the enabled endpoints via a local HTTP server listening on the
+  specified address and port. GET requests on that server are served with
+  text/plain responses in the prometheus/openmetrics format.
+
+  More details:
+  https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
+
+carbon://ADDRESS:PORT
+graphite://ADDRESS:PORT
+  Export all enabled endpoints to the specified TCP ADDRESS:PORT in the pickle
+  carbon format.
+
+  More details:
+  https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol
+'''
+
+import argparse
+from http import HTTPStatus, server
+import importlib.util
+import json
+import logging
+import os
+import pickle
+import re
+import socket
+import struct
+import sys
+import time
+import typing
+from urllib.parse import urlparse
+
+
+LOG = logging.getLogger(__name__)
+# Use local endpoints path only when running from source
+LOCAL = os.path.join(os.path.dirname(__file__), "telemetry-endpoints")
+DEFAULT_LOAD_PATHS = []
+if os.path.isdir(LOCAL):
+    DEFAULT_LOAD_PATHS.append(LOCAL)
+DEFAULT_LOAD_PATHS += [
+    "/usr/local/share/dpdk/telemetry-endpoints",
+    "/usr/share/dpdk/telemetry-endpoints",
+]
+DEFAULT_OUTPUT = "openmetrics://:9876"
+
+
+def main():
+    logging.basicConfig(
+        stream=sys.stdout,
+        level=logging.INFO,
+        format="%(asctime)s %(levelname)s %(message)s",
+        datefmt="%Y-%m-%d %H:%M:%S",
+    )
+    parser = argparse.ArgumentParser(
+        description=__doc__,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument(
+        "-o",
+        "--output",
+        metavar="FORMAT://PARAMETERS",
+        default=urlparse(DEFAULT_OUTPUT),
+        type=urlparse,
+        help=f"""
+        Output format (default: "{DEFAULT_OUTPUT}"). Depending on the format,
+        URL elements have different meanings. By default, the exporter starts a
+        local HTTP server on port 9876 that serves requests in the
+        prometheus/openmetrics plain text format.
+        """,
+    )
+    parser.add_argument(
+        "-p",
+        "--load-path",
+        dest="load_paths",
+        type=lambda v: v.split(os.pathsep),
+        default=DEFAULT_LOAD_PATHS,
+        help=f"""
+        The list of paths from which to discover endpoints.
+        (default: "{os.pathsep.join(DEFAULT_LOAD_PATHS)}").
+        """,
+    )
+    parser.add_argument(
+        "-e",
+        "--endpoint",
+        dest="endpoints",
+        action="append",
+        help="""
+        Telemetry endpoint to export (by default, all discovered endpoints are
+        enabled). This option can be specified more than once.
+        """,
+    )
+    parser.add_argument(
+        "-l",
+        "--list",
+        action="store_true",
+        help="""
+        Only list detected endpoints and exit.
+        """,
+    )
+    parser.add_argument(
+        "-s",
+        "--socket-path",
+        default="/run/dpdk/rte/dpdk_telemetry.v2",
+        help="""
+        The DPDK telemetry socket path (default: "%(default)s").
+        """,
+    )
+    args = parser.parse_args()
+    output = OUTPUT_FORMATS.get(args.output.scheme)
+    if output is None:
+        parser.error(f"unsupported output format: {args.output.scheme}://")
+    try:
+        endpoints = load_endpoints(args.load_paths, args.endpoints)
+        if args.list:
+            return
+        output(args, endpoints)
+    except KeyboardInterrupt:
+        pass
+    except Exception:
+        LOG.exception("")
+
+
+class TelemetrySocket:
+    """
+    Abstraction of the DPDK telemetry socket.
+    """
+
+    def __init__(self, path: str):
+        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
+        self.sock.connect(path)
+        data = json.loads(self.sock.recv(1024).decode())
+        self.max_output_len = data["max_output_len"]
+
+    def cmd(
+        self, uri: str, arg: typing.Any = None
+    ) -> typing.Optional[typing.Union[dict, list]]:
+        """
+        Request JSON data to the telemetry socket and parse it to python
+        values.
+        """
+        if arg is not None:
+            u = f"{uri},{arg}"
+        else:
+            u = uri
+        self.sock.send(u.encode("utf-8"))
+        data = self.sock.recv(self.max_output_len)
+        return json.loads(data.decode("utf-8"))[uri]
+
+    def __enter__(self):
+        return self
+
+    def __exit__(self, *args, **kwargs):
+        self.sock.close()
+
+
+MetricDescription = str
+MetricType = str
+MetricName = str
+MetricLabels = typing.Dict[str, typing.Any]
+MetricInfo = typing.Tuple[MetricDescription, MetricType]
+MetricValue = typing.Tuple[MetricName, typing.Any, MetricLabels]
+
+
+class TelemetryEndpoint:
+    """
+    Placeholder class only used for typing annotations.
+    """
+
+    @staticmethod
+    def info() -> typing.Dict[MetricName, MetricInfo]:
+        """
+        Mapping of metric names to their description and type.
+        """
+        raise NotImplementedError()
+
+    @staticmethod
+    def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
+        """
+        Request data from sock and return it as metric values. Each metric
+        name must be present in info().
+        """
+        raise NotImplementedError()
+
+
+def load_endpoints(
+    paths: typing.List[str], names: typing.List[str]
+) -> typing.List[TelemetryEndpoint]:
+    """
+    Load selected telemetry endpoints from the specified paths.
+    """
+
+    endpoints = {}
+    dwb = sys.dont_write_bytecode
+    sys.dont_write_bytecode = True  # never generate .pyc files for endpoints
+
+    for p in paths:
+        if not os.path.isdir(p):
+            continue
+        for fname in os.listdir(p):
+            f = os.path.join(p, fname)
+            if os.path.isdir(f):
+                continue
+            try:
+                name, _ = os.path.splitext(fname)
+                if names is not None and name not in names:
+                    # not selected by user
+                    continue
+                if name in endpoints:
+                    # endpoint with same name already loaded
+                    continue
+                spec = importlib.util.spec_from_file_location(name, f)
+                module = importlib.util.module_from_spec(spec)
+                spec.loader.exec_module(module)
+                endpoints[name] = module
+            except Exception:
+                LOG.exception("parsing endpoint: %s", f)
+
+    sys.dont_write_bytecode = dwb
+
+    modules = []
+    info = {}
+    for name, module in sorted(endpoints.items()):
+        LOG.info("using endpoint: %s (from %s)", name, module.__file__)
+        try:
+            for metric, (description, type_) in module.info().items():
+                info[(name, metric)] = (description, type_)
+            modules.append(module)
+        except Exception:
+            LOG.exception("getting endpoint info: %s", name)
+    return modules
+
+
+def serve_openmetrics(
+    args: argparse.Namespace, endpoints: typing.List[TelemetryEndpoint]
+):
+    """
+    Start an HTTP server and serve requests in the openmetrics/prometheus
+    format.
+    """
+    listen = (args.output.hostname or "", int(args.output.port or 80))
+    with server.HTTPServer(listen, OpenmetricsHandler) as httpd:
+        httpd.dpdk_socket_path = args.socket_path
+        httpd.telemetry_endpoints = endpoints
+        LOG.info("listening on port %s", httpd.server_port)
+        httpd.serve_forever()
+
+
+class OpenmetricsHandler(server.BaseHTTPRequestHandler):
+    """
+    Basic HTTP handler that returns prometheus/openmetrics formatted responses.
+    """
+
+    CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8"
+
+    def escape(self, value: typing.Any) -> str:
+        """
+        Escape a metric label value.
+        """
+        value = str(value)
+        value = value.replace("\\", "\\\\")
+        value = value.replace('"', '\\"')
+        return value.replace("\n", "\\n")
+
+    def do_GET(self):
+        """
+        Called upon GET requests.
+        """
+        try:
+            lines = []
+            metrics_names = set()
+            with TelemetrySocket(self.server.dpdk_socket_path) as sock:
+                for e in self.server.telemetry_endpoints:
+                    info = e.info()
+                    metrics_lines = []
+                    for name, value, labels in e.metrics(sock):
+                        fullname = re.sub(r"\W", "_", f"dpdk_{e.__name__}_{name}")
+                        labels = ", ".join(
+                            f'{k}="{self.escape(v)}"' for k, v in labels.items()
+                        )
+                        if labels:
+                            labels = f"{{{labels}}}"
+                        metrics_lines.append(f"{fullname}{labels} {value}")
+                        if fullname not in metrics_names:
+                            metrics_names.add(fullname)
+                            desc, metric_type = info[name]
+                            lines += [
+                                f"# HELP {fullname} {desc}",
+                                f"# TYPE {fullname} {metric_type}",
+                            ]
+                    lines += metrics_lines
+            body = "\n".join(lines).encode("utf-8") + b"\n"
+            self.send_response(HTTPStatus.OK)
+            self.send_header("Content-Type", self.CONTENT_TYPE)
+            self.send_header("Content-Length", str(len(body)))
+            self.end_headers()
+            self.wfile.write(body)
+            LOG.info("%s %s", self.address_string(), self.requestline)
+
+        except Exception as e:
+            if isinstance(e, (FileNotFoundError, ConnectionRefusedError)):
+                self.send_error(HTTPStatus.SERVICE_UNAVAILABLE)
+            else:
+                self.send_error(HTTPStatus.INTERNAL_SERVER_ERROR)
+            LOG.exception("%s %s", self.address_string(), self.requestline)
+
+    def log_message(self, fmt, *args):
+        pass  # disable built-in logger
+
+
+def export_carbon(args: argparse.Namespace, endpoints: typing.List[TelemetryEndpoint]):
+    """
+    Collect all metrics and export them to a carbon server in the pickle format.
+    """
+    addr = (args.output.hostname or "", int(args.output.port or 80))
+    with TelemetrySocket(args.socket_path) as dpdk:
+        with socket.socket() as carbon:
+            carbon.connect(addr)
+            metrics = []
+            for e in endpoints:
+                for name, value, labels in e.metrics(dpdk):
+                    fullname = re.sub(r"\W", ".", f"dpdk.{e.__name__}.{name}")
+                    for key, val in labels.items():
+                        val = str(val).replace(";", "")
+                        fullname += f";{key}={val}"
+                    metrics.append((fullname, (time.time(), value)))
+            payload = pickle.dumps(metrics, protocol=2)
+            header = struct.pack("!L", len(payload))
+            buf = header + payload
+            carbon.sendall(buf)
+
+
+OUTPUT_FORMATS = {
+    "openmetrics": serve_openmetrics,
+    "prometheus": serve_openmetrics,
+    "carbon": export_carbon,
+    "graphite": export_carbon,
+}
+
+
+if __name__ == "__main__":
+    main()
diff --git a/usertools/meson.build b/usertools/meson.build
index 740b4832f36d..eb48e2f4403f 100644
--- a/usertools/meson.build
+++ b/usertools/meson.build
@@ -11,5 +11,11 @@  install_data([
             'dpdk-telemetry.py',
             'dpdk-hugepages.py',
             'dpdk-rss-flows.py',
+            'dpdk-telemetry-exporter.py',
         ],
         install_dir: 'bin')
+
+install_subdir(
+        'telemetry-endpoints',
+        install_dir: 'share/dpdk',
+        strip_directory: false)
diff --git a/usertools/telemetry-endpoints/counters.py b/usertools/telemetry-endpoints/counters.py
new file mode 100644
index 000000000000..e17cffb43b2c
--- /dev/null
+++ b/usertools/telemetry-endpoints/counters.py
@@ -0,0 +1,47 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2023 Robin Jarry
+
+RX_PACKETS = "rx_packets"
+RX_BYTES = "rx_bytes"
+RX_MISSED = "rx_missed"
+RX_NOMBUF = "rx_nombuf"
+RX_ERRORS = "rx_errors"
+TX_PACKETS = "tx_packets"
+TX_BYTES = "tx_bytes"
+TX_ERRORS = "tx_errors"
+
+
+def info() -> "dict[Name, tuple[Description, Type]]":
+    return {
+        RX_PACKETS: ("Number of successfully received packets.", "counter"),
+        RX_BYTES: ("Number of successfully received bytes.", "counter"),
+        RX_MISSED: (
+            "Number of packets dropped by the HW because Rx queues are full.",
+            "counter",
+        ),
+        RX_NOMBUF: ("Number of Rx mbuf allocation failures.", "counter"),
+        RX_ERRORS: ("Number of erroneous received packets.", "counter"),
+        TX_PACKETS: ("Number of successfully transmitted packets.", "counter"),
+        TX_BYTES: ("Number of successfully transmitted bytes.", "counter"),
+        TX_ERRORS: ("Number of packet transmission failures.", "counter"),
+    }
+
+
+def metrics(sock: "TelemetrySocket") -> "list[tuple[Name, Value, Labels]]":
+    out = []
+    for port_id in sock.cmd("/ethdev/list"):
+        port = sock.cmd("/ethdev/info", port_id)
+        stats = sock.cmd("/ethdev/stats", port_id)
+        labels = {"port": port["name"]}
+        out += [
+            (RX_PACKETS, stats["ipackets"], labels),
+            (RX_BYTES, stats["ibytes"], labels),
+            (RX_MISSED, stats["imissed"], labels),
+            (RX_NOMBUF, stats["rx_nombuf"], labels),
+            (RX_ERRORS, stats["ierrors"], labels),
+            (TX_PACKETS, stats["opackets"], labels),
+            (TX_BYTES, stats["obytes"], labels),
+            (TX_ERRORS, stats["oerrors"], labels),
+        ]
+    return out
diff --git a/usertools/telemetry-endpoints/cpu.py b/usertools/telemetry-endpoints/cpu.py
new file mode 100644
index 000000000000..d38d8d6e2558
--- /dev/null
+++ b/usertools/telemetry-endpoints/cpu.py
@@ -0,0 +1,29 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2023 Robin Jarry
+
+CPU_TOTAL = "total_cycles"
+CPU_BUSY = "busy_cycles"
+
+
+def info() -> "dict[Name, tuple[Description, Type]]":
+    return {
+        CPU_TOTAL: ("Total number of CPU cycles.", "counter"),
+        CPU_BUSY: ("Number of busy CPU cycles.", "counter"),
+    }
+
+
+def metrics(sock: "TelemetrySocket") -> "list[tuple[Name, Value, Labels]]":
+    out = []
+    for lcore_id in sock.cmd("/eal/lcore/list"):
+        lcore = sock.cmd("/eal/lcore/info", lcore_id)
+        cpu = ",".join(str(c) for c in lcore.get("cpuset", []))
+        total = lcore.get("total_cycles")
+        busy = lcore.get("busy_cycles", 0)
+        if not (cpu and total):
+            continue
+        labels = {"cpu": cpu, "numa": lcore.get("socket", 0)}
+        out += [
+            (CPU_TOTAL, total, labels),
+            (CPU_BUSY, busy, labels),
+        ]
+    return out
diff --git a/usertools/telemetry-endpoints/memory.py b/usertools/telemetry-endpoints/memory.py
new file mode 100644
index 000000000000..32cce1e59382
--- /dev/null
+++ b/usertools/telemetry-endpoints/memory.py
@@ -0,0 +1,37 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2023 Robin Jarry
+
+MEM_TOTAL = "total_bytes"
+MEM_USED = "used_bytes"
+
+
+def info() -> "dict[Name, tuple[Description, Type]]":
+    return {
+        MEM_TOTAL: ("The total size of reserved memory in bytes.", "gauge"),
+        MEM_USED: ("The currently used memory in bytes.", "gauge"),
+    }
+
+
+def metrics(sock: "TelemetrySocket") -> "list[tuple[Name, Value, Labels]]":
+    zones = {}
+    used = 0
+    for zone in sock.cmd("/eal/memzone_list") or []:
+        z = sock.cmd("/eal/memzone_info", zone)
+        start = int(z["Hugepage_base"], 16)
+        end = start + (z["Hugepage_size"] * z["Hugepage_used"])
+        used += z["Length"]
+        for s, e in list(zones.items()):
+            if s < start < e < end:
+                zones[s] = end
+                break
+            if start < s < end < e:
+                del zones[s]
+                zones[start] = e
+                break
+        else:
+            zones[start] = end
+
+    return [
+        (MEM_TOTAL, sum(end - start for (start, end) in zones.items()), {}),
+        (MEM_USED, max(0, used), {}),
+    ]