[dpdk-dev,v4] latencystats: added new library for latency stats

Message ID 1478524474-7154-1-git-send-email-reshma.pattan@intel.com (mailing list archive)
State Superseded, archived
Checks

Context               Check     Description
tmonjalo/checkpatch   warning   coding style issues

Commit Message

Pattan, Reshma Nov. 7, 2016, 1:14 p.m. UTC
The library calculates latency stats and reports them to the
application when queried. It measures minimum, average and maximum
latency, and jitter, in nanoseconds.
The current implementation supports global latency stats, i.e. per-application
stats.

Added new field to mbuf struct to mark the packet arrival time on Rx and
use the timestamp to measure the latency on Tx.
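
A minimal sketch of the Rx-stamp/Tx-measure idea, assuming a simplified
packet struct in place of rte_mbuf and a caller-supplied tick counter in
place of the TSC. The names struct pkt, struct sampler, rx_mark, tx_latency
and ticks_to_ns are illustrative only and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the mbuf; only the new field matters here. */
struct pkt {
	uint64_t timestamp; /* 0 means "not sampled" */
};

/* Hypothetical sampler state: stamp at most one packet per interval. */
struct sampler {
	uint64_t interval;   /* sampling interval, in clock ticks */
	uint64_t last_stamp; /* tick at which a packet was last stamped */
};

/* Rx side: stamp the packet only if a full interval has elapsed. */
static int
rx_mark(struct sampler *s, struct pkt *p, uint64_t now)
{
	if (now - s->last_stamp < s->interval) {
		p->timestamp = 0;
		return 0;
	}
	p->timestamp = now;
	s->last_stamp = now;
	return 1;
}

/* Tx side: latency in ticks for a stamped packet, 0 for unstamped ones. */
static uint64_t
tx_latency(const struct pkt *p, uint64_t now)
{
	return p->timestamp ? now - p->timestamp : 0;
}

/* Convert ticks to nanoseconds given the timer frequency in Hz. */
static double
ticks_to_ns(uint64_t ticks, uint64_t hz)
{
	assert(hz != 0);
	return (double)ticks * 1e9 / (double)hz;
}
```

Using 0 as the "not sampled" marker keeps the Tx check to a single
comparison, at the cost of treating a packet stamped at tick 0 as unsampled.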

Modified testpmd code to initialize/uninitialize latency stats calculation.
Modified dpdk-procinfo process to display the newly added metrics info.

This patch is dependent on http://dpdk.org/dev/patchwork/patch/16927/ .

APIs:

Added APIs to initialize and uninitialize latency stats
calculation.
Added API to retrieve latency stats names and values.
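
The retrieval APIs follow a query-then-fill convention: a call with a NULL
table returns the required number of entries, and a second call fills a table
of that size. A self-contained sketch of that calling pattern is below. It
does not link against DPDK; struct metric_name, demo_get_names and
fetch_names are hypothetical stand-ins written for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NAME_SIZE 64

/* Hypothetical stand-in for the metrics name record. */
struct metric_name {
	char name[NAME_SIZE];
};

static const char *const demo_names[] = {
	"min_latency_ns", "avg_latency_ns", "max_latency_ns", "jitter_ns",
};
#define NUM_NAMES ((int)(sizeof(demo_names) / sizeof(demo_names[0])))

/*
 * Hypothetical getter: returns the required capacity when the table is
 * NULL or too small, otherwise fills the table and returns the count.
 */
static int
demo_get_names(struct metric_name *names, int size)
{
	int i;

	if (names == NULL || size < NUM_NAMES)
		return NUM_NAMES;
	for (i = 0; i < NUM_NAMES; i++)
		snprintf(names[i].name, sizeof(names[i].name), "%s",
			 demo_names[i]);
	return NUM_NAMES;
}

/* Caller side: query the size, allocate, then fetch. Caller frees. */
static struct metric_name *
fetch_names(int *count)
{
	struct metric_name *names;
	int n;

	assert(count != NULL);
	*count = 0;
	n = demo_get_names(NULL, 0);
	if (n <= 0)
		return NULL;
	names = calloc((size_t)n, sizeof(*names));
	if (names == NULL)
		return NULL;
	if (demo_get_names(names, n) != n) {
		free(names);
		return NULL;
	}
	*count = n;
	return names;
}
```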

Functionality:

* Library will register ethdev Rx/Tx callbacks for each active port and
  queue combination.
* Library will register latency stats names with the new metrics library:
  http://dpdk.org/dev/patchwork/patch/16927/
* Rx packets will be marked with a timestamp on each sampling interval.
* On the Tx side, packets with a timestamp will be considered for calculating
  the minimum, maximum and average latencies and the jitter.
* Average latency is calculated using the exponentially weighted moving
  average method.
* Minimum and maximum latencies are the lowest and highest latency values
  observed so far.
* Jitter calculation is based on inter-packet delay variation.
* Measured stats are reported to the metrics library in a separate pthread.
* Measured stats can be retrieved via the get API of the library or by
  calling the generic get API of the new metrics library.
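
The averaging and jitter rules listed above can be sketched in plain C as
follows. This is an illustration under stated assumptions, not the code from
the patch: struct lat_stats and lat_update are hypothetical names, the
smoothing factor alpha is passed in by the caller, and the first sample is
used to seed min, max and average (a choice made for this sketch). The
jitter filter uses the 1/16 gain of the J = J + (|D(i-1,i)| - J)/16 estimator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical running-statistics record; all latencies share one unit. */
struct lat_stats {
	double min;
	double avg;
	double max;
	double jitter;
	double prev;    /* previous sample, used for delay variation */
	uint64_t count; /* number of samples folded in so far */
};

/* Fold one latency sample into the running statistics. */
static void
lat_update(struct lat_stats *s, double latency, double alpha)
{
	double delta;

	assert(alpha > 0.0 && alpha <= 1.0);

	if (s->count == 0) {
		/* Assumption of this sketch: the first sample seeds everything. */
		s->min = latency;
		s->max = latency;
		s->avg = latency;
	} else {
		if (latency < s->min)
			s->min = latency;
		if (latency > s->max)
			s->max = latency;

		/* EWMA: move the average a fraction alpha toward the sample. */
		s->avg += alpha * (latency - s->avg);

		/* Jitter: filter the absolute delay variation with gain 1/16. */
		delta = latency - s->prev;
		if (delta < 0.0)
			delta = -delta;
		s->jitter += (delta - s->jitter) / 16.0;
	}

	s->prev = latency;
	s->count++;
}
```

Checking min and max independently, rather than in one else-if chain, lets a
single sample update both when it is the first one observed.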

Documentation is yet to be updated.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
 MAINTAINERS                                        |   4 +
 app/proc_info/main.c                               |  70 ++++
 app/test-pmd/testpmd.c                             |  10 +
 config/common_base                                 |   5 +
 lib/Makefile                                       |   1 +
 lib/librte_latencystats/Makefile                   |  57 ++++
 lib/librte_latencystats/rte_latencystats.c         | 380 +++++++++++++++++++++
 lib/librte_latencystats/rte_latencystats.h         | 141 ++++++++
 .../rte_latencystats_version.map                   |  10 +
 lib/librte_mbuf/rte_mbuf.h                         |   3 +
 mk/rte.app.mk                                      |   2 +
 11 files changed, 683 insertions(+)
 create mode 100644 lib/librte_latencystats/Makefile
 create mode 100644 lib/librte_latencystats/rte_latencystats.c
 create mode 100644 lib/librte_latencystats/rte_latencystats.h
 create mode 100644 lib/librte_latencystats/rte_latencystats_version.map
  

Comments

Pattan, Reshma Nov. 8, 2016, 12:34 p.m. UTC | #1
CCing maintainers of mbuf, testpmd and dpdk-procinfo.

> -----Original Message-----
> From: Pattan, Reshma
> Sent: Monday, November 7, 2016 1:15 PM
> To: dev@dpdk.org
> Cc: Pattan, Reshma <reshma.pattan@intel.com>
> Subject: [PATCH v4] latencystats: added new library for latency stats
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ba12d1b..2567448 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -702,3 +702,7 @@ F: examples/tep_termination/
>  F: examples/vmdq/
>  F: examples/vmdq_dcb/
>  F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
> +
> +Latency Stats
> +M: Reshma Pattan <reshma.pattan@intel.com>
> +F: lib/librte_latencystats/
> diff --git a/app/proc_info/main.c b/app/proc_info/main.c
> index 2c56d10..37d5ae4 100644
> --- a/app/proc_info/main.c
> +++ b/app/proc_info/main.c
> @@ -57,6 +57,7 @@
>  #include <rte_atomic.h>
>  #include <rte_branch_prediction.h>
>  #include <rte_string_fns.h>
> +#include <rte_metrics.h>
> 
>  /* Maximum long option length for option parsing. */
>  #define MAX_LONG_OPT_SZ 64
> @@ -68,6 +69,8 @@ static uint32_t enabled_port_mask;
>  static uint32_t enable_stats;
>  /**< Enable xstats. */
>  static uint32_t enable_xstats;
> +/**< Enable metrics. */
> +static uint32_t enable_metrics;
>  /**< Enable stats reset. */
>  static uint32_t reset_stats;
>  /**< Enable xstats reset. */
> @@ -85,6 +88,8 @@ proc_info_usage(const char *prgname)
>  		"  --stats: to display port statistics, enabled by default\n"
>  		"  --xstats: to display extended port statistics, disabled by "
>  			"default\n"
> +		"  --metrics: to display derived metrics of the ports, disabled by "
> +			"default\n"
>  		"  --stats-reset: to reset port statistics\n"
>  		"  --xstats-reset: to reset port extended statistics\n",
>  		prgname);
> @@ -127,6 +132,7 @@ proc_info_parse_args(int argc, char **argv)
>  		{"stats", 0, NULL, 0},
>  		{"stats-reset", 0, NULL, 0},
>  		{"xstats", 0, NULL, 0},
> +		{"metrics", 0, NULL, 0},
>  		{"xstats-reset", 0, NULL, 0},
>  		{NULL, 0, 0, 0}
>  	};
> @@ -159,6 +165,10 @@ proc_info_parse_args(int argc, char **argv)
>  			else if (!strncmp(long_option[option_index].name, "xstats",
>  					MAX_LONG_OPT_SZ))
>  				enable_xstats = 1;
> +			else if (!strncmp(long_option[option_index].name,
> +					"metrics",
> +					MAX_LONG_OPT_SZ))
> +				enable_metrics = 1;
>  			/* Reset stats */
>  			if (!strncmp(long_option[option_index].name, "stats-reset",
>  					MAX_LONG_OPT_SZ))
> @@ -301,6 +311,60 @@ nic_xstats_clear(uint8_t port_id)
>  	printf("\n  NIC extended statistics for port %d cleared\n", port_id);
>  }
> 
> +static void
> +metrics_display(int port_id)
> +{
> +	struct rte_stat_value *stats;
> +	struct rte_metric_name *names;
> +	int len, ret;
> +	static const char *nic_stats_border = "########################";
> +
> +	memset(&stats, 0, sizeof(struct rte_stat_value));
> +	len = rte_metrics_get_names(NULL, 0);
> +	if (len < 0) {
> +		printf("Cannot get metrics count\n");
> +		return;
> +	}
> +
> +	stats = malloc(sizeof(struct rte_stat_value) * len);
> +	if (stats == NULL) {
> +		printf("Cannot allocate memory for metrics\n");
> +		return;
> +	}
> +
> +	names =  malloc(sizeof(struct rte_metric_name) * len);
> +	if (names == NULL) {
> +		printf("Cannot allocate memory for metrics names\n");
> +		free(stats);
> +		return;
> +	}
> +
> +	if (len != rte_metrics_get_names(names, len)) {
> +		printf("Cannot get metrics names\n");
> +		free(stats);
> +		free(names);
> +		return;
> +	}
> +
> +	printf("###### metrics for port %-2d #########\n", port_id);
> +	printf("%s############################\n", nic_stats_border);
> +	ret = rte_metrics_get_values(port_id, stats, len);
> +	if (ret < 0 || ret > len) {
> +		printf("Cannot get metrics values\n");
> +		free(stats);
> +		free(names);
> +		return;
> +	}
> +
> +	int i;
> +	for (i = 0; i < len; i++)
> +		printf("%s: %"PRIu64"\n", names[i].name, stats[i].value);
> +
> +	printf("%s############################\n", nic_stats_border);
> +	free(stats);
> +	free(names);
> +}
> +
>  int
>  main(int argc, char **argv)
>  {
> @@ -360,8 +424,14 @@ main(int argc, char **argv)
>  				nic_stats_clear(i);
>  			else if (reset_xstats)
>  				nic_xstats_clear(i);
> +			else if (enable_metrics)
> +				metrics_display(i);
>  		}
>  	}
> 
> +	/* print port independent stats */
> +	if (enable_metrics)
> +		metrics_display(RTE_METRICS_NONPORT);
> +
>  	return 0;
>  }
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 6185be6..0efab4e 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -78,6 +78,10 @@
>  #ifdef RTE_LIBRTE_PDUMP
>  #include <rte_pdump.h>
>  #endif
> +#include <rte_metrics.h>
> +#ifdef RTE_LIBRTE_LATENCY_STATS
> +#include <rte_latencystats.h>
> +#endif
> 
>  #include "testpmd.h"
> 
> @@ -2070,6 +2074,9 @@ signal_handler(int signum)
>  		/* uninitialize packet capture framework */
>  		rte_pdump_uninit();
>  #endif
> +#ifdef RTE_LIBRTE_LATENCY_STATS
> +		rte_latencystats_uninit();
> +#endif
>  		force_quit();
>  		/* exit with the expected status */
>  		signal(signum, SIG_DFL);
> @@ -2127,6 +2134,9 @@ main(int argc, char** argv)
>  	/* set all ports to promiscuous mode by default */
>  	FOREACH_PORT(port_id, ports)
>  		rte_eth_promiscuous_enable(port_id);
> +#ifdef RTE_LIBRTE_LATENCY_STATS
> +	rte_latencystats_init(1, NULL);
> +#endif
> 
>  #ifdef RTE_LIBRTE_CMDLINE
>  	if (interactive == 1) {
> diff --git a/config/common_base b/config/common_base
> index 21d18f8..d9002b8 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -589,3 +589,8 @@ CONFIG_RTE_APP_TEST_RESOURCE_TAR=n
>  CONFIG_RTE_TEST_PMD=y
>  CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
>  CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
> +
> +#
> +# Compile the latency statistics library
> +#
> +CONFIG_RTE_LIBRTE_LATENCY_STATS=y
> diff --git a/lib/Makefile b/lib/Makefile
> index 990f23a..2111349 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -58,6 +58,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
>  DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
>  DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
> +DIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += librte_latencystats
> 
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_latencystats/Makefile b/lib/librte_latencystats/Makefile
> new file mode 100644
> index 0000000..f744da6
> --- /dev/null
> +++ b/lib/librte_latencystats/Makefile
> @@ -0,0 +1,57 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_latencystats.a
> +
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> +LDLIBS += -lm
> +LDLIBS += -lpthread
> +
> +EXPORT_MAP := rte_latencystats_version.map
> +
> +LIBABIVER := 1
> +
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) := rte_latencystats.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_LATENCY_STATS)-include := rte_latencystats.h
> +
> +# this lib depends upon:
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_mbuf
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_eal
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_ether
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_metrics
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
> new file mode 100644
> index 0000000..0bcb09f
> --- /dev/null
> +++ b/lib/librte_latencystats/rte_latencystats.c
> @@ -0,0 +1,380 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <stdbool.h>
> +#include <math.h>
> +#include <pthread.h>
> +
> +#include <rte_mbuf.h>
> +#include <rte_log.h>
> +#include <rte_cycles.h>
> +#include <rte_ethdev.h>
> +#include <rte_metrics.h>
> +#include <rte_memzone.h>
> +#include <rte_lcore.h>
> +#include <rte_timer.h>
> +
> +#include "rte_latencystats.h"
> +
> +/** Nano seconds per second */
> +#define NS_PER_SEC 1E9
> +
> +/** Clock cycles per nano second */
> +#define CYCLES_PER_NS (rte_get_timer_hz() / NS_PER_SEC)
> +
> +/* Macros for printing using RTE_LOG */
> +#define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
> +
> +static pthread_t latency_stats_thread;
> +static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
> +static int latency_stats_index;
> +static uint64_t sampIntvl;
> +static uint64_t timer_tsc;
> +static uint64_t prev_tsc;
> +
> +struct rte_latency_stats {
> +	float min_latency; /**< Minimum latency in nano seconds */
> +	float avg_latency; /**< Average latency in nano seconds */
> +	float max_latency; /**< Maximum latency in nano seconds */
> +	float jitter; /**< Latency variation between two packets */
> +	uint64_t total_sampl_pkts;
> +} *glob_stats;
> +
> +static struct rxtx_cbs {
> +	struct rte_eth_rxtx_callback *cb;
> +} rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT],
> +	tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
> +
> +struct latency_stats_nameoff {
> +	char name[RTE_ETH_XSTATS_NAME_SIZE];
> +	unsigned int offset;
> +};
> +
> +static const struct latency_stats_nameoff lat_stats_strings[] = {
> +	{"min_latency_ns", offsetof(struct rte_latency_stats, min_latency)},
> +	{"avg_latency_ns", offsetof(struct rte_latency_stats, avg_latency)},
> +	{"max_latency_ns", offsetof(struct rte_latency_stats, max_latency)},
> +	{"jitter_ns", offsetof(struct rte_latency_stats, jitter)},
> +};
> +
> +#define NUM_LATENCY_STATS (sizeof(lat_stats_strings) / \
> +				sizeof(lat_stats_strings[0]))
> +
> +static __attribute__((noreturn)) void *
> +report_latency_stats(__rte_unused void *arg)
> +{
> +	for (;;) {
> +		unsigned int i;
> +		float *stats_ptr = NULL;
> +		uint64_t values[NUM_LATENCY_STATS] = {0};
> +		int ret;
> +
> +		for (i = 0; i < NUM_LATENCY_STATS; i++) {
> +			stats_ptr = RTE_PTR_ADD(glob_stats,
> +					lat_stats_strings[i].offset);
> +			values[i] = (uint64_t)floor((*stats_ptr)/
> +					CYCLES_PER_NS);
> +		}
> +
> +		ret = rte_metrics_update_metrics(RTE_METRICS_NONPORT,
> +						latency_stats_index,
> +						values, NUM_LATENCY_STATS);
> +		if (ret < 0)
> +			RTE_LOG(INFO, LATENCY_STATS,
> +				"Failed to push the stats\n");
> +	}
> +}
> +
> +static void
> +rte_latencystats_fill_values(struct rte_stat_value *values)
> +{
> +	unsigned int i;
> +	float *stats_ptr = NULL;
> +
> +	for (i = 0; i < NUM_LATENCY_STATS; i++) {
> +		stats_ptr = RTE_PTR_ADD(glob_stats,
> +				lat_stats_strings[i].offset);
> +		values[i].key = i;
> +		values[i].value = (uint64_t)floor((*stats_ptr)/
> +						CYCLES_PER_NS);
> +	}
> +}
> +
> +static uint16_t
> +add_time_stamps(uint8_t pid __rte_unused,
> +		uint16_t qid __rte_unused,
> +		struct rte_mbuf **pkts,
> +		uint16_t nb_pkts,
> +		uint16_t max_pkts __rte_unused,
> +		void *user_cb __rte_unused)
> +{
> +	unsigned int i;
> +	uint64_t diff_tsc, now;
> +
> +	/* for every sample interval,
> +	 * time stamp is marked on one received packet
> +	 */
> +	now = rte_rdtsc();
> +	for (i = 0; i < nb_pkts; i++) {
> +		diff_tsc = now - prev_tsc;
> +		timer_tsc += diff_tsc;
> +		if (timer_tsc >= sampIntvl) {
> +			pkts[i]->timestamp = now;
> +			timer_tsc = 0;
> +		}
> +		prev_tsc = now;
> +		now = rte_rdtsc();
> +	}
> +
> +	return nb_pkts;
> +}
> +
> +static uint16_t
> +calc_latency(uint8_t pid __rte_unused,
> +		uint16_t qid __rte_unused,
> +		struct rte_mbuf **pkts,
> +		uint16_t nb_pkts,
> +		void *_ __rte_unused)
> +{
> +	unsigned int i, cnt = 0;
> +	uint64_t now;
> +	float latency[nb_pkts];
> +	static float prev_latency;
> +	const float alpha = 0.2;
> +
> +	now = rte_rdtsc();
> +	for (i = 0; i < nb_pkts; i++) {
> +		if (pkts[i]->timestamp)
> +			latency[cnt++] = now - pkts[i]->timestamp;
> +	}
> +
> +	for (i = 0; i < cnt; i++) {
> +		/**
> +		*The jitter is calculated as statistical mean of interpacket
> +		*delay variation. The "jitter estimate" is computed by taking
> +		*the absolute values of the ipdv sequence and applying an
> +		*exponential filter with parameter 1/16 to generate the
> +		*estimate. i.e J=J+(|D(i-1,i)|-J)/16. Where J is jitter,
> +		*D(i-1,i) is difference in latency of two consecutive packets
> +		*i-1 and i.
> +		*Reference: Calculated as per RFC 5481, sec 4.1,
> +		*RFC 3393 sec 4.5, RFC 1889 sec.
> +		*/
> +		glob_stats->jitter +=  (abs(prev_latency - latency[i])
> +					- glob_stats->jitter)/16;
> +		if (glob_stats->min_latency == 0)
> +			glob_stats->min_latency = latency[i];
> +		else if (latency[i] < glob_stats->min_latency)
> +			glob_stats->min_latency = latency[i];
> +		else if (latency[i] > glob_stats->max_latency)
> +			glob_stats->max_latency = latency[i];
> +		/**
> +		*The average latency is measured using exponential moving
> +		*average, i.e. using EWMA
> +		*https://en.wikipedia.org/wiki/Moving_average
> +		*/
> +		glob_stats->avg_latency +=
> +			alpha * (latency[i] - glob_stats->avg_latency);
> +		glob_stats->total_sampl_pkts++;
> +		prev_latency = latency[i];
> +	}
> +
> +	return nb_pkts;
> +}
> +
> +int
> +rte_latencystats_init(uint64_t samp_intvl,
> +		rte_latency_stats_flow_type_fn user_cb)
> +{
> +	unsigned int i;
> +	uint8_t pid;
> +	uint16_t qid;
> +	struct rxtx_cbs *cbs = NULL;
> +	const uint8_t nb_ports = rte_eth_dev_count();
> +	const char *ptr_strings[NUM_LATENCY_STATS] = {0};
> +	const struct rte_memzone *mz = NULL;
> +	const unsigned int flags = 0;
> +
> +	/** Allocate stats in shared memory for multi-process support */
> +	mz = rte_memzone_reserve(MZ_RTE_LATENCY_STATS, sizeof(*glob_stats),
> +					rte_socket_id(), flags);
> +	if (mz == NULL) {
> +		RTE_LOG(ERR, LATENCY_STATS, "Cannot reserve memory: %s:%d\n",
> +			__func__, __LINE__);
> +		return -ENOMEM;
> +	}
> +
> +	glob_stats = mz->addr;
> +	samp_intvl *= CYCLES_PER_NS;
> +
> +	/** Register latency stats with stats library */
> +	for (i = 0; i < NUM_LATENCY_STATS; i++)
> +		ptr_strings[i] = lat_stats_strings[i].name;
> +
> +	latency_stats_index = rte_metrics_reg_metrics(ptr_strings,
> +							NUM_LATENCY_STATS);
> +	if (latency_stats_index < 0) {
> +		RTE_LOG(DEBUG, LATENCY_STATS,
> +			"Failed to register latency stats names\n");
> +		return -1;
> +	}
> +
> +	/** Register Rx/Tx callbacks */
> +	for (pid = 0; pid < nb_ports; pid++) {
> +		struct rte_eth_dev_info dev_info;
> +		rte_eth_dev_info_get(pid, &dev_info);
> +		for (qid = 0; qid < dev_info.nb_rx_queues; qid++) {
> +			cbs = &rx_cbs[pid][qid];
> +			cbs->cb = rte_eth_add_first_rx_callback(pid, qid,
> +					add_time_stamps, user_cb);
> +			if (!cbs->cb)
> +				RTE_LOG(INFO, LATENCY_STATS, "Failed to "
> +					"register Rx callback for pid=%d, "
> +					"qid=%d\n", pid, qid);
> +		}
> +		for (qid = 0; qid < dev_info.nb_tx_queues; qid++) {
> +			cbs = &tx_cbs[pid][qid];
> +			cbs->cb =  rte_eth_add_tx_callback(pid, qid,
> +					calc_latency, user_cb);
> +			if (!cbs->cb)
> +				RTE_LOG(INFO, LATENCY_STATS, "Failed to "
> +					"register Tx callback for pid=%d, "
> +					"qid=%d\n", pid, qid);
> +		}
> +	}
> +
> +	int ret = 0;
> +	char thread_name[RTE_MAX_THREAD_NAME_LEN];
> +
> +	/** Create the host thread to update latency stats to stats library */
> +	ret = pthread_create(&latency_stats_thread, NULL, report_latency_stats,
> +				NULL);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, LATENCY_STATS,
> +			"Failed to create the latency stats thread:%s, %s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +	/** Set thread_name for aid in debugging */
> +	snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN, "latency-stats-thread");
> +	ret = rte_thread_setname(latency_stats_thread, thread_name);
> +	if (ret != 0)
> +		RTE_LOG(DEBUG, LATENCY_STATS,
> +			"Failed to set thread name for latency stats handling\n");
> +
> +	return 0;
> +}
> +
> +int
> +rte_latencystats_uninit(void)
> +{
> +	uint8_t pid;
> +	uint16_t qid;
> +	int ret = 0;
> +	struct rxtx_cbs *cbs = NULL;
> +	const uint8_t nb_ports = rte_eth_dev_count();
> +
> +	/** De register Rx/Tx callbacks */
> +	for (pid = 0; pid < nb_ports; pid++) {
> +		struct rte_eth_dev_info dev_info;
> +		rte_eth_dev_info_get(pid, &dev_info);
> +		for (qid = 0; qid < dev_info.nb_rx_queues; qid++) {
> +			cbs = &rx_cbs[pid][qid];
> +			ret = rte_eth_remove_rx_callback(pid, qid, cbs->cb);
> +			if (ret)
> +				RTE_LOG(INFO, LATENCY_STATS, "failed to "
> +					"remove Rx callback for pid=%d, "
> +					"qid=%d\n", pid, qid);
> +		}
> +		for (qid = 0; qid < dev_info.nb_tx_queues; qid++) {
> +			cbs = &tx_cbs[pid][qid];
> +			ret = rte_eth_remove_tx_callback(pid, qid, cbs->cb);
> +			if (ret)
> +				RTE_LOG(INFO, LATENCY_STATS, "failed to "
> +					"remove Tx callback for pid=%d, "
> +					"qid=%d\n", pid, qid);
> +		}
> +	}
> +
> +	/** Cancel the thread */
> +	ret = pthread_cancel(latency_stats_thread);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, LATENCY_STATS,
> +			"Failed to cancel latency stats update thread:"
> +			"%s,%s:%d\n",
> +			strerror(errno), __func__, __LINE__);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +rte_latencystats_get_names(struct rte_metric_name *names, uint16_t size)
> +{
> +	unsigned int i;
> +
> +	if (names == NULL || size < NUM_LATENCY_STATS)
> +		return NUM_LATENCY_STATS;
> +
> +	for (i = 0; i < NUM_LATENCY_STATS; i++)
> +		snprintf(names[i].name, sizeof(names[i].name),
> +				"%s", lat_stats_strings[i].name);
> +
> +	return NUM_LATENCY_STATS;
> +}
> +
> +int
> +rte_latencystats_get(struct rte_stat_value *values, uint16_t size)
> +{
> +	if (size < NUM_LATENCY_STATS || values == NULL)
> +		return NUM_LATENCY_STATS;
> +
> +	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
> +		const struct rte_memzone *mz;
> +		mz = rte_memzone_lookup(MZ_RTE_LATENCY_STATS);
> +		if (mz == NULL) {
> +			RTE_LOG(ERR, LATENCY_STATS,
> +				"Latency stats memzone not found\n");
> +			return -ENOMEM;
> +		}
> +		glob_stats =  mz->addr;
> +	}
> +
> +	/* Retrieve latency stats */
> +	rte_latencystats_fill_values(values);
> +
> +	return NUM_LATENCY_STATS;
> +}
> diff --git a/lib/librte_latencystats/rte_latencystats.h b/lib/librte_latencystats/rte_latencystats.h
> new file mode 100644
> index 0000000..7b1e72a
> --- /dev/null
> +++ b/lib/librte_latencystats/rte_latencystats.h
> @@ -0,0 +1,141 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_LATENCYSTATS_H_
> +#define _RTE_LATENCYSTATS_H_
> +
> +/**
> + * @file
> + * RTE latency stats
> + *
> + * library to provide application and flow based latency stats.
> + */
> +
> +#include <rte_metrics.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Function type used for identifying flow types of an Rx packet.
> + *
> + * The callback function is called on Rx for each packet.
> + * This function is used for flow based latency calculations.
> + *
> + * @param pkt
> + *   Packet that has to be identified with its flow types.
> + * @param user_param
> + *   The arbitrary user parameter passed in by the application when
> + *   the callback was originally configured.
> + * @return
> + *   The flow_mask, representing the multiple flow types of a packet.
> + */
> +typedef uint16_t (*rte_latency_stats_flow_type_fn)(struct rte_mbuf *pkt,
> +							void *user_param);
> +
> +/**
> + * @internal
> + *  Registers Rx/Tx callbacks for each active port, queue.
> + *
> + * @param samp_intvl
> + *  Sampling time period in nano seconds, at which packet
> + *  should be marked with time stamp.
> + * @param user_cb
> + *  User callback to be called to get flow types of a packet.
> + *  Used for flow based latency calculation.
> + *  If the value is NULL, global stats will be calculated,
> + *  else flow based stats will be calculated.
> + *  @return
> + *   -1     : On error
> + *   -ENOMEM: On error
> + *    0     : On success
> + */
> +int rte_latencystats_init(uint64_t samp_intvl,
> +			rte_latency_stats_flow_type_fn user_cb);
> +
> +/**
> + * @internal
> + *  Removes registered Rx/Tx callbacks for each active port, queue.
> + *  @return
> + *   -1: On error
> + *    0: On success
> + */
> +int rte_latencystats_uninit(void);
> +
> +/**
> + * Retrieve names of latency statistics
> + *
> + * @param names
> + *  Block of memory to insert names into. Must be at least size in capacity.
> + *  If set to NULL, function returns required capacity.
> + * @param size
> + *  Capacity of latency stats names (number of names).
> + * @return
> + *   - positive value lower or equal to size: success. The return value
> + *     is the number of entries filled in the stats table.
> + *   - positive value higher than size: error, the given statistics table
> + *     is too small. The return value corresponds to the size that should
> + *     be given to succeed. The entries in the table are not valid and
> + *     shall not be used by the caller.
> + */
> +int rte_latencystats_get_names(struct rte_metric_name *names,
> +				uint16_t size);
> +
> +/**
> + * Retrieve latency statistics.
> + *
> + * @param values
> + *   A pointer to a table of structure of type *rte_stat_value*
> + *   to be filled with latency statistics ids and values.
> + *   This parameter can be set to NULL if size is 0.
> + * @param size
> + *   The size of the stats table, which should be large enough to store
> + *   all the latency stats.
> + * @return
> + *   - positive value lower or equal to size: success. The return value
> + *     is the number of entries filled in the stats table.
> + *   - positive value higher than size: error, the given statistics table
> + *     is too small. The return value corresponds to the size that should
> + *     be given to succeed. The entries in the table are not valid and
> + *     shall not be used by the caller.
> + *   -ENOMEM: On failure.
> + */
> +int rte_latencystats_get(struct rte_stat_value *values,
> +			uint16_t size);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_LATENCYSTATS_H_ */
> diff --git a/lib/librte_latencystats/rte_latencystats_version.map b/lib/librte_latencystats/rte_latencystats_version.map
> new file mode 100644
> index 0000000..82dc5a7
> --- /dev/null
> +++ b/lib/librte_latencystats/rte_latencystats_version.map
> @@ -0,0 +1,10 @@
> +DPDK_16.11 {
> +	global:
> +
> +	rte_latencystats_get;
> +	rte_latencystats_get_names;
> +	rte_latencystats_init;
> +	rte_latencystats_uninit;
> +
> +	local: *;
> +};
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 109e666..cc3bf65 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -493,6 +493,9 @@ struct rte_mbuf {
> 
>  	/** Timesync flags for use with IEEE1588. */
>  	uint16_t timesync;
> +
> +	/** Timestamp for measuring latency. */
> +	uint64_t timestamp;
>  } __rte_cache_aligned;
> 
>  /**
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index f75f0e2..4e5289a 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -98,6 +98,8 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS)  += -lrte_latencystats
> +
> 
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)    += -lrte_pmd_xenvirt -lxenstore
> --
> 2.7.4
  
Remy Horton Nov. 11, 2016, 2:22 a.m. UTC | #2
On 07/11/2016 21:14, Reshma Pattan wrote:
[..]
 > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>

Reviewed-by: Remy Horton <remy.horton@intel.com>


 > +static void
 > +metrics_display(int port_id)
 > +{
 > +    struct rte_stat_value *stats;
 > +    struct rte_metric_name *names;

Note that rte_stat_value is being renamed to rte_metric_value in the
next version of the metrics library.


 > +int
 > +rte_latencystats_init(uint64_t samp_intvl,
 > +        rte_latency_stats_flow_type_fn user_cb)
 > +{

As far as I can tell, user_cb is always NULL, and the two callbacks it
eventually gets passed to don't use it. Is there any reason the function
signature has it at all?


 > +++ b/lib/librte_latencystats/rte_latencystats_version.map
 > @@ -0,0 +1,10 @@
 > +DPDK_16.11 {

This will need to change to 17.02 once new release cycle starts. :)

Will also need to add entry to release_17_02.rst once it becomes available..

..Remy
  
Pattan, Reshma Nov. 11, 2016, 11:15 a.m. UTC | #3
Hi 

> -----Original Message-----
> From: Horton, Remy
> Sent: Friday, November 11, 2016 2:22 AM
> To: Pattan, Reshma <reshma.pattan@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4] latencystats: added new library for
> latency stats
>
> On 07/11/2016 21:14, Reshma Pattan wrote:
> [..]
>  > Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
>
> Reviewed-by: Remy Horton <remy.horton@intel.com>
>
>  > +static void
>  > +metrics_display(int port_id)
>  > +{
>  > +    struct rte_stat_value *stats;
>  > +    struct rte_metric_name *names;
>
> Note that rte_stat_value is being renamed to rte_metric_value in the next
> version of the metrics library.
>

Ok.

>
>  > +int
>  > +rte_latencystats_init(uint64_t samp_intvl,
>  > +        rte_latency_stats_flow_type_fn user_cb)
>  > +{
>
> As far as I can tell, user_cb is always NULL, and the two callbacks it
> eventually gets passed to don't use it. Is there any reason the function
> signature has it at all?
>

Yes. The user callback is in the signature in anticipation of future requirements for flow-based latency stats.
With this callback, it is up to the application to identify the flow type of a packet and return it. The library will then
maintain the latency calculation per flow type in a separate table. So it is there purely for future enhancement.
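
To make the intent concrete, here is a minimal sketch of what such an application callback might look like. It is only an illustration under stated assumptions: struct rte_mbuf is replaced by a one-field stand-in so the snippet compiles on its own, and the EX_* type codes and flow bits are hypothetical names invented here, not part of this patch or of DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DPDK's struct rte_mbuf so the sketch compiles on its own;
 * only a packet_type-like field is modelled (assumption). */
struct rte_mbuf {
	uint32_t packet_type;
};

/* Callback type as declared in rte_latencystats.h in this patch. */
typedef uint16_t (*rte_latency_stats_flow_type_fn)(struct rte_mbuf *pkt,
							void *user_param);

/* Hypothetical L4 type codes, invented for illustration only. */
#define EX_L4_MASK 0x0f00u
#define EX_L4_TCP  0x0100u
#define EX_L4_UDP  0x0200u

/* Hypothetical flow bits that make up the returned flow mask. */
#define EX_FLOW_TCP   (1u << 0)
#define EX_FLOW_UDP   (1u << 1)
#define EX_FLOW_OTHER (1u << 2)

/* Classify a packet into a flow mask; user_param is not needed here. */
static uint16_t
example_flow_type(struct rte_mbuf *pkt, void *user_param)
{
	(void)user_param;

	switch (pkt->packet_type & EX_L4_MASK) {
	case EX_L4_TCP:
		return EX_FLOW_TCP;
	case EX_L4_UDP:
		return EX_FLOW_UDP;
	default:
		return EX_FLOW_OTHER;
	}
}
```

An application would pass such a function as the second argument of rte_latencystats_init(). In v4 both registered callbacks mark that argument as unused, so doing this has no effect yet.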

>
>  > +++ b/lib/librte_latencystats/rte_latencystats_version.map
>  > @@ -0,0 +1,10 @@
>  > +DPDK_16.11 {
>
> This will need to change to 17.02 once new release cycle starts. :)
>
> Will also need to add entry to release_17_02.rst once it becomes available..
>

Ok.

Thanks,
Reshma
  

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index ba12d1b..2567448 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -702,3 +702,7 @@  F: examples/tep_termination/
 F: examples/vmdq/
 F: examples/vmdq_dcb/
 F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
+
+Latency Stats
+M: Reshma Pattan <reshma.pattan@intel.com>
+F: lib/librte_latencystats/
diff --git a/app/proc_info/main.c b/app/proc_info/main.c
index 2c56d10..37d5ae4 100644
--- a/app/proc_info/main.c
+++ b/app/proc_info/main.c
@@ -57,6 +57,7 @@ 
 #include <rte_atomic.h>
 #include <rte_branch_prediction.h>
 #include <rte_string_fns.h>
+#include <rte_metrics.h>
 
 /* Maximum long option length for option parsing. */
 #define MAX_LONG_OPT_SZ 64
@@ -68,6 +69,8 @@  static uint32_t enabled_port_mask;
 static uint32_t enable_stats;
 /**< Enable xstats. */
 static uint32_t enable_xstats;
+/**< Enable metrics. */
+static uint32_t enable_metrics;
 /**< Enable stats reset. */
 static uint32_t reset_stats;
 /**< Enable xstats reset. */
@@ -85,6 +88,8 @@  proc_info_usage(const char *prgname)
 		"  --stats: to display port statistics, enabled by default\n"
 		"  --xstats: to display extended port statistics, disabled by "
 			"default\n"
+		"  --metrics: to display derived metrics of the ports, disabled by "
+			"default\n"
 		"  --stats-reset: to reset port statistics\n"
 		"  --xstats-reset: to reset port extended statistics\n",
 		prgname);
@@ -127,6 +132,7 @@  proc_info_parse_args(int argc, char **argv)
 		{"stats", 0, NULL, 0},
 		{"stats-reset", 0, NULL, 0},
 		{"xstats", 0, NULL, 0},
+		{"metrics", 0, NULL, 0},
 		{"xstats-reset", 0, NULL, 0},
 		{NULL, 0, 0, 0}
 	};
@@ -159,6 +165,10 @@  proc_info_parse_args(int argc, char **argv)
 			else if (!strncmp(long_option[option_index].name, "xstats",
 					MAX_LONG_OPT_SZ))
 				enable_xstats = 1;
+			else if (!strncmp(long_option[option_index].name,
+					"metrics",
+					MAX_LONG_OPT_SZ))
+				enable_metrics = 1;
 			/* Reset stats */
 			if (!strncmp(long_option[option_index].name, "stats-reset",
 					MAX_LONG_OPT_SZ))
@@ -301,6 +311,60 @@  nic_xstats_clear(uint8_t port_id)
 	printf("\n  NIC extended statistics for port %d cleared\n", port_id);
 }
 
+static void
+metrics_display(int port_id)
+{
+	struct rte_stat_value *stats;
+	struct rte_metric_name *names;
+	int len, ret;
+	static const char *nic_stats_border = "########################";
+
+	stats = NULL;
+	len = rte_metrics_get_names(NULL, 0);
+	if (len < 0) {
+		printf("Cannot get metrics count\n");
+		return;
+	}
+
+	stats = malloc(sizeof(struct rte_stat_value) * len);
+	if (stats == NULL) {
+		printf("Cannot allocate memory for metrics\n");
+		return;
+	}
+
+	names =  malloc(sizeof(struct rte_metric_name) * len);
+	if (names == NULL) {
+		printf("Cannot allocate memory for metrics names\n");
+		free(stats);
+		return;
+	}
+
+	if (len != rte_metrics_get_names(names, len)) {
+		printf("Cannot get metrics names\n");
+		free(stats);
+		free(names);
+		return;
+	}
+
+	printf("###### metrics for port %-2d #########\n", port_id);
+	printf("%s############################\n", nic_stats_border);
+	ret = rte_metrics_get_values(port_id, stats, len);
+	if (ret < 0 || ret > len) {
+		printf("Cannot get metrics values\n");
+		free(stats);
+		free(names);
+		return;
+	}
+
+	int i;
+	for (i = 0; i < len; i++)
+		printf("%s: %"PRIu64"\n", names[i].name, stats[i].value);
+
+	printf("%s############################\n", nic_stats_border);
+	free(stats);
+	free(names);
+}
+
 int
 main(int argc, char **argv)
 {
@@ -360,8 +424,14 @@  main(int argc, char **argv)
 				nic_stats_clear(i);
 			else if (reset_xstats)
 				nic_xstats_clear(i);
+			else if (enable_metrics)
+				metrics_display(i);
 		}
 	}
 
+	/* print port independent stats */
+	if (enable_metrics)
+		metrics_display(RTE_METRICS_NONPORT);
+
 	return 0;
 }
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 6185be6..0efab4e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,10 @@ 
 #ifdef RTE_LIBRTE_PDUMP
 #include <rte_pdump.h>
 #endif
+#include <rte_metrics.h>
+#ifdef RTE_LIBRTE_LATENCY_STATS
+#include <rte_latencystats.h>
+#endif
 
 #include "testpmd.h"
 
@@ -2070,6 +2074,9 @@  signal_handler(int signum)
 		/* uninitialize packet capture framework */
 		rte_pdump_uninit();
 #endif
+#ifdef RTE_LIBRTE_LATENCY_STATS
+		rte_latencystats_uninit();
+#endif
 		force_quit();
 		/* exit with the expected status */
 		signal(signum, SIG_DFL);
@@ -2127,6 +2134,9 @@  main(int argc, char** argv)
 	/* set all ports to promiscuous mode by default */
 	FOREACH_PORT(port_id, ports)
 		rte_eth_promiscuous_enable(port_id);
+#ifdef RTE_LIBRTE_LATENCY_STATS
+	rte_latencystats_init(1, NULL);
+#endif
 
 #ifdef RTE_LIBRTE_CMDLINE
 	if (interactive == 1) {
diff --git a/config/common_base b/config/common_base
index 21d18f8..d9002b8 100644
--- a/config/common_base
+++ b/config/common_base
@@ -589,3 +589,8 @@  CONFIG_RTE_APP_TEST_RESOURCE_TAR=n
 CONFIG_RTE_TEST_PMD=y
 CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
 CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
+
+#
+# Compile the latency statistics library
+#
+CONFIG_RTE_LIBRTE_LATENCY_STATS=y
diff --git a/lib/Makefile b/lib/Makefile
index 990f23a..2111349 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -58,6 +58,7 @@  DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
+DIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += librte_latencystats
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_latencystats/Makefile b/lib/librte_latencystats/Makefile
new file mode 100644
index 0000000..f744da6
--- /dev/null
+++ b/lib/librte_latencystats/Makefile
@@ -0,0 +1,57 @@ 
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_latencystats.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+LDLIBS += -lm
+LDLIBS += -lpthread
+
+EXPORT_MAP := rte_latencystats_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) := rte_latencystats.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_LATENCY_STATS)-include := rte_latencystats.h
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS) += lib/librte_metrics
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
new file mode 100644
index 0000000..0bcb09f
--- /dev/null
+++ b/lib/librte_latencystats/rte_latencystats.c
@@ -0,0 +1,380 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <sys/types.h>
+#include <stdbool.h>
+#include <math.h>
+#include <pthread.h>
+
+#include <rte_mbuf.h>
+#include <rte_log.h>
+#include <rte_cycles.h>
+#include <rte_ethdev.h>
+#include <rte_metrics.h>
+#include <rte_memzone.h>
+#include <rte_lcore.h>
+#include <rte_timer.h>
+
+#include "rte_latencystats.h"
+
+/** Nano seconds per second */
+#define NS_PER_SEC 1E9
+
+/** Clock cycles per nano second */
+#define CYCLES_PER_NS (rte_get_timer_hz() / NS_PER_SEC)
+
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
+
+static pthread_t latency_stats_thread;
+static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
+static int latency_stats_index;
+static uint64_t sampIntvl;
+static uint64_t timer_tsc;
+static uint64_t prev_tsc;
+
+struct rte_latency_stats {
+	float min_latency; /**< Minimum latency in nano seconds */
+	float avg_latency; /**< Average latency in nano seconds */
+	float max_latency; /**< Maximum latency in nano seconds */
+	float jitter; /**< Latency variation between two packets */
+	uint64_t total_sampl_pkts;
+} *glob_stats;
+
+static struct rxtx_cbs {
+	struct rte_eth_rxtx_callback *cb;
+} rx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT],
+	tx_cbs[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
+
+struct latency_stats_nameoff {
+	char name[RTE_ETH_XSTATS_NAME_SIZE];
+	unsigned int offset;
+};
+
+static const struct latency_stats_nameoff lat_stats_strings[] = {
+	{"min_latency_ns", offsetof(struct rte_latency_stats, min_latency)},
+	{"avg_latency_ns", offsetof(struct rte_latency_stats, avg_latency)},
+	{"max_latency_ns", offsetof(struct rte_latency_stats, max_latency)},
+	{"jitter_ns", offsetof(struct rte_latency_stats, jitter)},
+};
+
+#define NUM_LATENCY_STATS (sizeof(lat_stats_strings) / \
+				sizeof(lat_stats_strings[0]))
+
+static __attribute__((noreturn)) void *
+report_latency_stats(__rte_unused void *arg)
+{
+	for (;;) {
+		unsigned int i;
+		float *stats_ptr = NULL;
+		uint64_t values[NUM_LATENCY_STATS] = {0};
+		int ret;
+
+		for (i = 0; i < NUM_LATENCY_STATS; i++) {
+			stats_ptr = RTE_PTR_ADD(glob_stats,
+					lat_stats_strings[i].offset);
+			values[i] = (uint64_t)floor((*stats_ptr)/
+					CYCLES_PER_NS);
+		}
+
+		ret = rte_metrics_update_metrics(RTE_METRICS_NONPORT,
+						latency_stats_index,
+						values, NUM_LATENCY_STATS);
+		if (ret < 0)
+			RTE_LOG(INFO, LATENCY_STATS,
+				"Failed to push the stats\n");
+	}
+}
+
+static void
+rte_latencystats_fill_values(struct rte_stat_value *values)
+{
+	unsigned int i;
+	float *stats_ptr = NULL;
+
+	for (i = 0; i < NUM_LATENCY_STATS; i++) {
+		stats_ptr = RTE_PTR_ADD(glob_stats,
+				lat_stats_strings[i].offset);
+		values[i].key = i;
+		values[i].value = (uint64_t)floor((*stats_ptr)/
+						CYCLES_PER_NS);
+	}
+}
+
+static uint16_t
+add_time_stamps(uint8_t pid __rte_unused,
+		uint16_t qid __rte_unused,
+		struct rte_mbuf **pkts,
+		uint16_t nb_pkts,
+		uint16_t max_pkts __rte_unused,
+		void *user_cb __rte_unused)
+{
+	unsigned int i;
+	uint64_t diff_tsc, now;
+
+	/* for every sample interval,
+	 * time stamp is marked on one received packet
+	 */
+	now = rte_rdtsc();
+	for (i = 0; i < nb_pkts; i++) {
+		diff_tsc = now - prev_tsc;
+		timer_tsc += diff_tsc;
+		if (timer_tsc >= sampIntvl) {
+			pkts[i]->timestamp = now;
+			timer_tsc = 0;
+		}
+		prev_tsc = now;
+		now = rte_rdtsc();
+	}
+
+	return nb_pkts;
+}
+
+static uint16_t
+calc_latency(uint8_t pid __rte_unused,
+		uint16_t qid __rte_unused,
+		struct rte_mbuf **pkts,
+		uint16_t nb_pkts,
+		void *_ __rte_unused)
+{
+	unsigned int i, cnt = 0;
+	uint64_t now;
+	float latency[nb_pkts];
+	static float prev_latency;
+	const float alpha = 0.2;
+
+	now = rte_rdtsc();
+	for (i = 0; i < nb_pkts; i++) {
+		if (pkts[i]->timestamp)
+			latency[cnt++] = now - pkts[i]->timestamp;
+	}
+
+	for (i = 0; i < cnt; i++) {
+		/**
+		*The jitter is calculated as statistical mean of interpacket
+		*delay variation. The "jitter estimate" is computed by taking
+		*the absolute values of the ipdv sequence and applying an
+		*exponential filter with parameter 1/16 to generate the
+		*estimate. i.e J=J+(|D(i-1,i)|-J)/16. Where J is jitter,
+		*D(i-1,i) is difference in latency of two consecutive packets
+		*i-1 and i.
+		*Reference: Calculated as per RFC 5481, sec 4.1,
+		*RFC 3393 sec 4.5, RFC 1889 sec.
+		*/
+		glob_stats->jitter +=  (fabsf(prev_latency - latency[i])
+					- glob_stats->jitter)/16;
+		if (glob_stats->min_latency == 0)
+			glob_stats->min_latency = latency[i];
+		else if (latency[i] < glob_stats->min_latency)
+			glob_stats->min_latency = latency[i];
+		else if (latency[i] > glob_stats->max_latency)
+			glob_stats->max_latency = latency[i];
+		/**
+		*The average latency is measured using exponential moving
+		*average, i.e. using EWMA
+		*https://en.wikipedia.org/wiki/Moving_average
+		*/
+		glob_stats->avg_latency +=
+			alpha * (latency[i] - glob_stats->avg_latency);
+		glob_stats->total_sampl_pkts++;
+		prev_latency = latency[i];
+	}
+
+	return nb_pkts;
+}
+
+int
+rte_latencystats_init(uint64_t samp_intvl,
+		rte_latency_stats_flow_type_fn user_cb)
+{
+	unsigned int i;
+	uint8_t pid;
+	uint16_t qid;
+	struct rxtx_cbs *cbs = NULL;
+	const uint8_t nb_ports = rte_eth_dev_count();
+	const char *ptr_strings[NUM_LATENCY_STATS] = {0};
+	const struct rte_memzone *mz = NULL;
+	const unsigned int flags = 0;
+
+	/** Allocate stats in shared memory for multi-process support */
+	mz = rte_memzone_reserve(MZ_RTE_LATENCY_STATS, sizeof(*glob_stats),
+					rte_socket_id(), flags);
+	if (mz == NULL) {
+		RTE_LOG(ERR, LATENCY_STATS, "Cannot reserve memory: %s:%d\n",
+			__func__, __LINE__);
+		return -ENOMEM;
+	}
+
+	glob_stats = mz->addr;
+	sampIntvl = samp_intvl * CYCLES_PER_NS;
+
+	/** Register latency stats with stats library */
+	for (i = 0; i < NUM_LATENCY_STATS; i++)
+		ptr_strings[i] = lat_stats_strings[i].name;
+
+	latency_stats_index = rte_metrics_reg_metrics(ptr_strings,
+							NUM_LATENCY_STATS);
+	if (latency_stats_index < 0) {
+		RTE_LOG(DEBUG, LATENCY_STATS,
+			"Failed to register latency stats names\n");
+		return -1;
+	}
+
+	/** Register Rx/Tx callbacks */
+	for (pid = 0; pid < nb_ports; pid++) {
+		struct rte_eth_dev_info dev_info;
+		rte_eth_dev_info_get(pid, &dev_info);
+		for (qid = 0; qid < dev_info.nb_rx_queues; qid++) {
+			cbs = &rx_cbs[pid][qid];
+			cbs->cb = rte_eth_add_first_rx_callback(pid, qid,
+					add_time_stamps, user_cb);
+			if (!cbs->cb)
+				RTE_LOG(INFO, LATENCY_STATS, "Failed to "
+					"register Rx callback for pid=%d, "
+					"qid=%d\n", pid, qid);
+		}
+		for (qid = 0; qid < dev_info.nb_tx_queues; qid++) {
+			cbs = &tx_cbs[pid][qid];
+			cbs->cb =  rte_eth_add_tx_callback(pid, qid,
+					calc_latency, user_cb);
+			if (!cbs->cb)
+				RTE_LOG(INFO, LATENCY_STATS, "Failed to "
+					"register Tx callback for pid=%d, "
+					"qid=%d\n", pid, qid);
+		}
+	}
+
+	int ret = 0;
+	char thread_name[RTE_MAX_THREAD_NAME_LEN];
+
+	/** Create the host thread to update latency stats to stats library */
+	ret = pthread_create(&latency_stats_thread, NULL, report_latency_stats,
+				NULL);
+	if (ret != 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+			"Failed to create the latency stats thread:%s, %s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+	/** Set thread_name for aid in debugging */
+	snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN, "latency-stats-thread");
+	ret = rte_thread_setname(latency_stats_thread, thread_name);
+	if (ret != 0)
+		RTE_LOG(DEBUG, LATENCY_STATS,
+			"Failed to set thread name for latency stats handling\n");
+
+	return 0;
+}
+
+int
+rte_latencystats_uninit(void)
+{
+	uint8_t pid;
+	uint16_t qid;
+	int ret = 0;
+	struct rxtx_cbs *cbs = NULL;
+	const uint8_t nb_ports = rte_eth_dev_count();
+
+	/** Deregister Rx/Tx callbacks */
+	for (pid = 0; pid < nb_ports; pid++) {
+		struct rte_eth_dev_info dev_info;
+		rte_eth_dev_info_get(pid, &dev_info);
+		for (qid = 0; qid < dev_info.nb_rx_queues; qid++) {
+			cbs = &rx_cbs[pid][qid];
+			ret = rte_eth_remove_rx_callback(pid, qid, cbs->cb);
+			if (ret)
+				RTE_LOG(INFO, LATENCY_STATS, "failed to "
+					"remove Rx callback for pid=%d, "
+					"qid=%d\n", pid, qid);
+		}
+		for (qid = 0; qid < dev_info.nb_tx_queues; qid++) {
+			cbs = &tx_cbs[pid][qid];
+			ret = rte_eth_remove_tx_callback(pid, qid, cbs->cb);
+			if (ret)
+				RTE_LOG(INFO, LATENCY_STATS, "failed to "
+					"remove Tx callback for pid=%d, "
+					"qid=%d\n", pid, qid);
+		}
+	}
+
+	/** Cancel the thread */
+	ret = pthread_cancel(latency_stats_thread);
+	if (ret != 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+			"Failed to cancel latency stats update thread:"
+			"%s,%s:%d\n",
+			strerror(ret), __func__, __LINE__);
+		return -1;
+	}
+
+	return 0;
+}
+
+int
+rte_latencystats_get_names(struct rte_metric_name *names, uint16_t size)
+{
+	unsigned int i;
+
+	if (names == NULL || size < NUM_LATENCY_STATS)
+		return NUM_LATENCY_STATS;
+
+	for (i = 0; i < NUM_LATENCY_STATS; i++)
+		snprintf(names[i].name, sizeof(names[i].name),
+				"%s", lat_stats_strings[i].name);
+
+	return NUM_LATENCY_STATS;
+}
+
+int
+rte_latencystats_get(struct rte_stat_value *values, uint16_t size)
+{
+	if (size < NUM_LATENCY_STATS || values == NULL)
+		return NUM_LATENCY_STATS;
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		const struct rte_memzone *mz;
+		mz = rte_memzone_lookup(MZ_RTE_LATENCY_STATS);
+		if (mz == NULL) {
+			RTE_LOG(ERR, LATENCY_STATS,
+				"Latency stats memzone not found\n");
+			return -ENOMEM;
+		}
+		glob_stats =  mz->addr;
+	}
+
+	/* Retrieve latency stats */
+	rte_latencystats_fill_values(values);
+
+	return NUM_LATENCY_STATS;
+}
diff --git a/lib/librte_latencystats/rte_latencystats.h b/lib/librte_latencystats/rte_latencystats.h
new file mode 100644
index 0000000..7b1e72a
--- /dev/null
+++ b/lib/librte_latencystats/rte_latencystats.h
@@ -0,0 +1,141 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_LATENCYSTATS_H_
+#define _RTE_LATENCYSTATS_H_
+
+/**
+ * @file
+ * RTE latency stats
+ *
+ * library to provide application and flow based latency stats.
+ */
+
+#include <rte_metrics.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Function type used for identifying flow types of an Rx packet.
+ *
+ * The callback function is called on Rx for each packet.
+ * This function is used for flow based latency calculations.
+ *
+ * @param pkt
+ *   Packet that has to be identified with its flow types.
+ * @param user_param
+ *   The arbitrary user parameter passed in by the application when
+ *   the callback was originally configured.
+ * @return
+ *   The flow_mask, representing the multiple flow types of a packet.
+ */
+typedef uint16_t (*rte_latency_stats_flow_type_fn)(struct rte_mbuf *pkt,
+							void *user_param);
+
+/**
+ * @internal
+ *  Registers Rx/Tx callbacks for each active port, queue.
+ *
+ * @param samp_intvl
+ *  Sampling time period in nano seconds, at which packet
+ *  should be marked with time stamp.
+ * @param user_cb
+ *  User callback to be called to get flow types of a packet.
+ *  Used for flow based latency calculation.
+ *  If the value is NULL, global stats will be calculated,
+ *  else flow based stats will be calculated.
+ *  @return
+ *   -1     : On error
+ *   -ENOMEM: On error
+ *    0     : On success
+ */
+int rte_latencystats_init(uint64_t samp_intvl,
+			rte_latency_stats_flow_type_fn user_cb);
+
+/**
+ * @internal
+ *  Removes registered Rx/Tx callbacks for each active port, queue.
+ *  @return
+ *   -1: On error
+ *    0: On success
+ */
+int rte_latencystats_uninit(void);
+
+/**
+ * Retrieve names of latency statistics
+ *
+ * @param names
+ *  Block of memory to insert names into. Must be at least size in capacity.
+ *  If set to NULL, function returns required capacity.
+ * @param size
+ *  Capacity of latency stats names (number of names).
+ * @return
+ *   - positive value lower or equal to size: success. The return value
+ *     is the number of entries filled in the stats table.
+ *   - positive value higher than size: error, the given statistics table
+ *     is too small. The return value corresponds to the size that should
+ *     be given to succeed. The entries in the table are not valid and
+ *     shall not be used by the caller.
+ */
+int rte_latencystats_get_names(struct rte_metric_name *names,
+				uint16_t size);
+
+/**
+ * Retrieve latency statistics.
+ *
+ * @param values
+ *   A pointer to a table of structure of type *rte_stat_value*
+ *   to be filled with latency statistics ids and values.
+ *   This parameter can be set to NULL if size is 0.
+ * @param size
+ *   The size of the stats table, which should be large enough to store
+ *   all the latency stats.
+ * @return
+ *   - positive value lower or equal to size: success. The return value
+ *     is the number of entries filled in the stats table.
+ *   - positive value higher than size: error, the given statistics table
+ *     is too small. The return value corresponds to the size that should
+ *     be given to succeed. The entries in the table are not valid and
+ *     shall not be used by the caller.
+ *   -ENOMEM: On failure.
+ */
+int rte_latencystats_get(struct rte_stat_value *values,
+			uint16_t size);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LATENCYSTATS_H_ */
diff --git a/lib/librte_latencystats/rte_latencystats_version.map b/lib/librte_latencystats/rte_latencystats_version.map
new file mode 100644
index 0000000..82dc5a7
--- /dev/null
+++ b/lib/librte_latencystats/rte_latencystats_version.map
@@ -0,0 +1,10 @@ 
+DPDK_16.11 {
+	global:
+
+	rte_latencystats_get;
+	rte_latencystats_get_names;
+	rte_latencystats_init;
+	rte_latencystats_uninit;
+
+	local: *;
+};
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 109e666..cc3bf65 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -493,6 +493,9 @@  struct rte_mbuf {
 
 	/** Timesync flags for use with IEEE1588. */
 	uint16_t timesync;
+
+	/** Timestamp for measuring latency. */
+	uint64_t timestamp;
 } __rte_cache_aligned;
 
 /**
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index f75f0e2..4e5289a 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -98,6 +98,8 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
+_LDLIBS-$(CONFIG_RTE_LIBRTE_LATENCY_STATS)  += -lrte_latencystats
+
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)    += -lrte_pmd_xenvirt -lxenstore
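
For readers following the arithmetic in calc_latency(), the per-sample update can be sketched as a small standalone function. This is not the patch's code: the struct and function names are hypothetical, the statistics are seeded from the first sample, and min and max are updated independently, all of which are assumptions made here for clarity. The jitter filter (parameter 1/16) and the EWMA weight (alpha = 0.2) follow the comments and constants in the patch.

```c
#include <stdint.h>

/* Running latency statistics, all in the same unit as the samples. */
struct lat_sketch {
	float min, max, avg, jitter;
	float prev;        /* previous latency sample */
	uint64_t samples;  /* number of samples folded in so far */
};

/* Absolute value for floats, kept local to avoid linking libm. */
static float
absf(float x)
{
	return x < 0 ? -x : x;
}

/* Fold one latency sample into the running statistics. */
static void
lat_sketch_update(struct lat_sketch *s, float latency)
{
	const float alpha = 0.2f; /* EWMA weight used in the patch */

	if (s->samples == 0) {
		/* Assumption: seed all statistics from the first sample. */
		s->min = s->max = s->avg = latency;
		s->jitter = 0;
	} else {
		/* Jitter estimate: J = J + (|D(i-1,i)| - J) / 16 */
		s->jitter += (absf(s->prev - latency) - s->jitter) / 16;
		if (latency < s->min)
			s->min = latency;
		if (latency > s->max)
			s->max = latency;
		/* EWMA: avg = avg + alpha * (sample - avg) */
		s->avg += alpha * (latency - s->avg);
	}
	s->prev = latency;
	s->samples++;
}
```

Feeding the samples 100 and then 200 into a zero-initialised struct leaves min at 100, max at 200, the average at 120 (100 + 0.2 * 100) and the jitter at 6.25 (100 / 16).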