Monitor Calico component metrics

Big picture

Configure Prometheus to scrape Calico components and get metrics about the health of your network.

Value

Using the open-source Prometheus monitoring and alerting toolkit, you can view time-series metrics from Calico components in the Prometheus or Grafana interfaces.

Features

This how-to guide uses the following Calico features:

Felix and Typha components, configured with Prometheus parameters so that Prometheus can scrape their metrics.

Concepts

About Prometheus

The Prometheus monitoring tool scrapes metrics from instrumented jobs and stores them as time-series data, which you can display in a visualizer (such as Grafana). For Calico, the “jobs” that Prometheus can scrape metrics from are the Felix and Typha components.
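To make the idea of a scrape concrete, here is a minimal Python sketch of what Prometheus reads from an instrumented job: plain text with one sample per line and comment lines starting with #. The sample text, its metric names, and its values are illustrative assumptions rather than real Felix or Typha output, and parse_samples is a hypothetical helper that ignores optional timestamps.

```python
# Illustrative scrape body in the Prometheus text exposition format.
# Metric names and values are made up for this sketch.
SAMPLE_SCRAPE = """\
# HELP felix_active_local_endpoints Number of active endpoints on this host.
# TYPE felix_active_local_endpoints gauge
felix_active_local_endpoints 12
# HELP example_requests_total Hypothetical counter with a label.
# TYPE example_requests_total counter
example_requests_total{result="ok"} 340
"""


def parse_samples(text):
    """Return {series: value} for each sample line, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no sample
        # The value is the last space-separated field; everything before
        # it is the series name, including any {label="value"} part.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


for series, value in parse_samples(SAMPLE_SCRAPE).items():
    print(series, value)
```

In practice Prometheus does this parsing itself on every scrape; the sketch is only meant to show the shape of the data that Felix and Typha would expose.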

About Calico Felix and Typha components

Felix is a daemon that runs on every machine that provides endpoints (Calico nodes). Felix is the brains of Calico. Typha is an optional daemon that extends Felix to scale traffic between Calico nodes and the datastore. Typha is used to avoid bottlenecks and performance issues in the datastore when you have over 50 Calico nodes.

You can configure Felix and/or Typha to provide metrics to Prometheus.

How to

Enable Prometheus metrics for Felix and Typha

  1. Using the Prometheus documentation, configure one or more Prometheus servers.
  2. To enable Felix for metrics, set PrometheusMetricsEnabled = true.
  3. To enable Typha for metrics, set PrometheusMetricsEnabled = true.
  4. If required for Felix and/or Typha, change the default TCP port on which the component serves metrics (9091 for Felix, 9093 for Typha) using the PrometheusMetricsPort parameter.
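The steps above can be sketched in Python as follows. This is a hedged illustration, not the definitive procedure. It assumes that calicoctl is installed, that the FelixConfiguration resource is named default and uses the field names prometheusMetricsEnabled and prometheusMetricsPort, and that Typha reads its settings from environment variables with a TYPHA_ prefix. The helper names felix_patch_command and typha_env are hypothetical. The code only builds the command and the environment settings; it does not run anything against a cluster.

```python
import json
import shlex


def felix_patch_command(port=None):
    """Build a calicoctl command that enables Felix metrics.

    Assumes the FelixConfiguration resource is named "default".
    """
    spec = {"prometheusMetricsEnabled": True}
    if port is not None:
        spec["prometheusMetricsPort"] = port  # optional port override
    patch = json.dumps({"spec": spec})
    return ["calicoctl", "patch", "felixconfiguration", "default",
            "--patch", patch]


def typha_env(port=None):
    """Build the environment variables that enable Typha metrics.

    Assumes Typha takes its parameters as TYPHA_-prefixed variables.
    """
    env = {"TYPHA_PROMETHEUSMETRICSENABLED": "true"}
    if port is not None:
        env["TYPHA_PROMETHEUSMETRICSPORT"] = str(port)
    return env


# Print the Felix command in a form that can be pasted into a shell.
print(shlex.join(felix_patch_command()))
# Print the Typha settings as NAME=value pairs.
for name, value in typha_env().items():
    print(f"{name}={value}")
```

How the Typha variables reach the daemon depends on your installation (for example, the env section of the Typha deployment), so treat the printed pairs as input to that manifest rather than as a complete change.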

Best practices

If you expose Calico metrics to Prometheus, a best practice is to use network policy to limit access to the Calico metrics endpoints. For details, see Secure Calico Prometheus endpoints.

If you are not using Prometheus metrics, we recommend disabling the Prometheus ports entirely for added security.
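One way to confirm that a metrics port is really closed is to try a TCP connection to it. The following Python sketch assumes you can reach the node's address from where the script runs; is_port_open is a hypothetical helper, and the host and port you pass are your own values.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_port_open("127.0.0.1", 9091) run on a Calico node should return False once Felix metrics are disabled, assuming nothing else is listening on that port. A True result from an untrusted network segment suggests that the endpoint is reachable more widely than intended.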