# Observability for Friendli Container

Observability is an integral part of DevOps. To support this, Friendli Container exports internal metrics in a [Prometheus](https://prometheus.io) text format.
By default, metrics are served at `http://localhost:8281/metrics`. You can configure the port number using the command line option `--metrics-port`.

## Supported Metrics

### Counters

Counters are cumulative metrics whose values monotonically increase. They are often used in combination with the Prometheus function [rate()](https://prometheus.io/docs/prometheus/latest/querying/functions/#rate) for calculating the throughput.

| Metric Name                       | Description                                              |
| --------------------------------- | -------------------------------------------------------- |
| friendli\_requests\_total         | Cumulative number of requests received                   |
| friendli\_responses\_total        | Cumulative number of responses sent                      |
| friendli\_items\_total            | Cumulative number of items requested                     |
| friendli\_failure\_by\_cancel     | Cumulative number of failed requests due to cancellation |
| friendli\_failure\_by\_timeout    | Cumulative number of failed requests due to timeout      |
| friendli\_failure\_by\_nan\_error | Cumulative number of failed requests due to NaN error    |
| friendli\_failure\_by\_reject     | Cumulative number of failed requests due to rejection    |

One inference request may generate multiple results with the `n` field in the request body. Upon receiving such a request, `friendli_requests_total` is increased by 1 and `friendli_items_total` is increased by `n`.

### Gauges

Gauges are numerical values that can go up and down to represent the current value.

| Metric Name                        | Description                                                           |
| ---------------------------------- | --------------------------------------------------------------------- |
| friendli\_current\_requests        | Current number of requests in the engine (either assigned or waiting) |
| friendli\_current\_items           | Current number of items in the engine (either assigned or waiting)    |
| friendli\_current\_assigned\_items | Current number of items actively processed by the engine              |
| friendli\_current\_waiting\_items  | Current number of items waiting in the internal queue                 |

### Histograms

[Histograms](https://prometheus.io/docs/practices/histograms) are used to track the distribution of variables over time.

| Histogram | Metric Name | Description |
| --- | --- | --- |
| Friendli TCache hit ratio (0 ≤ value ≤ 1) | friendli\_tcache\_hit\_ratio\_bucket | Bucketized number of histogram samples for TCache hit ratio, with `le` label |
| | friendli\_tcache\_hit\_ratio\_count | Total number of histogram samples for TCache hit ratio |
| | friendli\_tcache\_hit\_ratio\_sum | Sum of histogram sample values for TCache hit ratio |
| The length of input tokens (Experimental metric) | friendli\_input\_lengths\_bucket | Bucketized number of histogram samples for length of input tokens, with `le` label |
| | friendli\_input\_lengths\_count | Total number of histogram samples for length of input tokens |
| | friendli\_input\_lengths\_sum | Sum of histogram sample values for length of input tokens |
| The length of output tokens (Experimental metric) | friendli\_output\_lengths\_bucket | Bucketized number of histogram samples for length of output tokens, with `le` label |
| | friendli\_output\_lengths\_count | Total number of histogram samples for length of output tokens |
| | friendli\_output\_lengths\_sum | Sum of histogram sample values for length of output tokens |
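
Because the `_bucket` series are cumulative counts keyed by the `le` (less than or equal) label, a quantile can be estimated by interpolating linearly inside the bucket that crosses the target rank. This is the same idea behind the Prometheus `histogram_quantile()` function. The following Python sketch illustrates the arithmetic only; the bucket boundaries and counts are made-up illustration values, not output captured from Friendli Container, and `estimate_quantile` is a hypothetical helper name.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound and ending with the +Inf bucket, mirroring the `le` label.
    Simplification: the lowest bucket is assumed to start at 0.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no samples observed yet
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            # The +Inf bucket has no finite width, so fall back to the
            # highest finite boundary. An empty bucket cannot be interpolated.
            if math.isinf(upper_bound) or count == lower_count:
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative buckets for a hit-ratio style histogram.
example_buckets = [(0.25, 10), (0.5, 30), (0.75, 60), (1.0, 100), (math.inf, 100)]
print("estimated median:", estimate_quantile(0.5, example_buckets))
print("estimated p90:", estimate_quantile(0.9, example_buckets))
```

The estimate is only as fine-grained as the bucket boundaries, so treat it as an approximation. The mean is available exactly by dividing the `_sum` series by the `_count` series.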

For visualizing histograms using Grafana, [How to visualize Prometheus histograms in Grafana](https://grafana.com/blog/2020/06/23/how-to-visualize-prometheus-histograms-in-grafana) provides useful tips.

### Quantiles

Quantiles are used to show the current p50 (median), p90, and p99 percentiles of variables.

| Quantiles | Metric Name | Description |
| --- | --- | --- |
| Request completion latency (in nanoseconds) | friendli\_requests\_latencies | Percentile value for request completion latency (`quantile` label is either `0.5`, `0.9`, or `0.99`) |
| | friendli\_requests\_latencies\_count | Total number of samples for request completion latency |
| | friendli\_requests\_latencies\_sum | Sum of sample values for request completion latency |
| Time to first token (TTFT) (in nanoseconds) | friendli\_requests\_ttft | Percentile value for time to first token (TTFT) (`quantile` label is either `0.5`, `0.9`, or `0.99`) |
| | friendli\_requests\_ttft\_count | Total number of samples for time to first token (TTFT) |
| | friendli\_requests\_ttft\_sum | Sum of sample values for time to first token (TTFT) |
| Request queueing delay (in nanoseconds) | friendli\_requests\_queueing\_delays | Percentile value for queueing delay (`quantile` label is either `0.5`, `0.9`, or `0.99`) |
| | friendli\_requests\_queueing\_delays\_count | Total number of samples for queueing delay |
| | friendli\_requests\_queueing\_delays\_sum | Sum of sample values for queueing delay |
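
Since these values are reported in nanoseconds, they usually need converting before they are shown to people, and the `_sum` and `_count` series together give the mean. The Python sketch below parses a few lines of Prometheus text format and reports the quantiles and the mean in milliseconds. The sample text is hand-written for illustration and assumes the series carry no labels other than `quantile`; real output may include additional labels, which this simplified parser would need to account for. `parse_samples` and `summarize_ms` are hypothetical helper names.

```python
import re

NS_PER_MS = 1_000_000

# Hand-written illustration data, not captured from a real container.
SAMPLE_TEXT = """\
# TYPE friendli_requests_latencies summary
friendli_requests_latencies{quantile="0.5"} 1200000000
friendli_requests_latencies{quantile="0.9"} 2500000000
friendli_requests_latencies{quantile="0.99"} 4000000000
friendli_requests_latencies_sum 30000000000
friendli_requests_latencies_count 20
"""

# Simplified matcher: metric name, optional {labels}, then the value.
# It does not handle escaped quotes inside label values or timestamps.
LINE_RE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)"
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return {(metric_name, sorted_label_pairs): value} for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = tuple(sorted(LABEL_RE.findall(match["labels"] or "")))
        samples[(match["name"], labels)] = float(match["value"])
    return samples


def summarize_ms(samples, metric):
    """Convert the quantiles of `metric` to milliseconds and add the mean."""
    summary = {}
    for (name, labels), value in samples.items():
        quantile = dict(labels).get("quantile")
        if name == metric and quantile is not None:
            summary[f"q{quantile}"] = value / NS_PER_MS
    count = samples.get((f"{metric}_count", ()), 0.0)
    if count > 0:
        summary["mean"] = samples[(f"{metric}_sum", ())] / count / NS_PER_MS
    return summary


print(summarize_ms(parse_samples(SAMPLE_TEXT), "friendli_requests_latencies"))
```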

### Info

The following information metric always has a value of 1. The metric labels contain useful information in text.

| Metric Name               | Label     | Description    |
| ------------------------- | --------- | -------------- |
| friendli\_engine\_version | `version` | Engine version |

## Grafana Dashboard Template

You can import [the dashboard templates](https://github.com/friendliai/container-resource/tree/main/grafana) into your Grafana instance. The Grafana instance must be connected to a Prometheus instance (or a Prometheus-compatible data source) that scrapes metrics from Friendli Container processes. The dashboard templates work with Grafana v8.0.0 or later. We recommend Grafana v10.0.0 or later for the best experience.
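
For a quick sanity check of dashboard numbers, the throughput derived from the counters listed under Supported Metrics can be reproduced by hand from two readings of the metrics endpoint. The Python sketch below shows roughly what a rate over a counter amounts to: the increase between two samples divided by the elapsed time, with a drop in value treated as a counter reset. It is a simplified illustration rather than the exact Prometheus `rate()` algorithm, which also extrapolates to the query window boundaries. The timestamps and counter values are made up, and `per_second_rate` is a hypothetical helper name.

```python
def per_second_rate(first, second):
    """Approximate per-second rate between two (timestamp_seconds, value) samples."""
    (t0, v0), (t1, v1) = first, second
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    increase = v1 - v0
    if increase < 0:
        # A counter only goes down when the process restarts, so count the
        # post-restart value as the increase.
        increase = v1
    return increase / (t1 - t0)


# Made-up readings of friendli_requests_total and friendli_items_total,
# taken 60 seconds apart.
requests_per_s = per_second_rate((0.0, 1000.0), (60.0, 1600.0))
items_per_s = per_second_rate((0.0, 2000.0), (60.0, 3800.0))

print("requests per second:", requests_per_s)
print("items per second:", items_per_s)
# Ratio of the two rates: the average `n` per request over the interval.
print("average items per request:", items_per_s / requests_per_s)
```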