How to use OpenTelemetry to expose custom Prometheus metrics from Node.js applications

November 8, 2023

A standard way of exposing metrics from a Node.js application is the prom-client package, which has everything one would need. But it is 2023, almost 2024, and time to use industry standards, by which I mean OpenTelemetry. Let’s work out together how to use OpenTelemetry JS to expose custom metrics.

I don’t think the OpenTelemetry project needs an introduction anymore: it is vendor-neutral, open source, and the CNCF project with the second-highest velocity at the time of writing.

Example Node.js project

The following commands set up a Node.js project with a few OpenTelemetry packages. You might notice when using OpenTelemetry packages that the list in package.json grows quite fast. But rest assured, most of them are lightweight. For instance, @opentelemetry/exporter-prometheus is 24.4 kB minified and gzipped.

mkdir otel-prom && cd otel-prom
npm init -y

touch index.js

npm install --save @opentelemetry/exporter-prometheus @opentelemetry/api @opentelemetry/sdk-metrics

# Note: I am using the following versions:
# "@opentelemetry/api": "^1.7.0",
# "@opentelemetry/exporter-prometheus": "^0.45.0",
# "@opentelemetry/sdk-metrics": "^1.18.0",

Note: The Otel packages shown in this blog are currently considered experimental packages under active development. New releases may include breaking changes.

Then proceed to edit the file index.js with the following content:

const { ValueType } = require("@opentelemetry/api");
const { PrometheusExporter } = require("@opentelemetry/exporter-prometheus");
const { MeterProvider } = require("@opentelemetry/sdk-metrics");
const http = require("http");

// Mock a potential database call to query data
const queryDatabaseStuff = () =>
  new Promise((resolve) =>
    // pass a function to setTimeout so the promise resolves after 50 ms
    setTimeout(() => resolve(Math.floor(Math.random() * 100)), 50)
  );

const startMetricsExporter = () => {
  console.log(`starting prometheus metrics server`);

  // the options let you choose the port and whether the exporter starts its own web server
  const exporter = new PrometheusExporter({
    port: 9100,
  });

  const meterProvider = new MeterProvider();
  meterProvider.addMetricReader(exporter);
  const meter = meterProvider.getMeter("prometheus");

  // create the gauge
  const outdatedDataCountGauge = meter.createObservableGauge("outdated_data_count", {
    valueType: ValueType.INT,
    description: "outdated data count",
  });

  // callbacks are executed when the /metrics endpoint is hit
  outdatedDataCountGauge.addCallback(async (result) => {
    const outdatedDataCount = await queryDatabaseStuff();

    result.observe(outdatedDataCount);
  });
};

// simulate a normal web server; it could be fastify, express, etc.
// This just illustrates that the exporter can run alongside a classic web server
const startNormalWebServer = () => {
  console.log(`starting normal web server`);

  http
    .createServer(function (req, res) {
      res.write("Hello World!");
      res.end();
    })
    .listen(8089);
};

startMetricsExporter();
startNormalWebServer();

Run the application with node index.js to start two web servers. You can go to localhost:8089 to see the normal web server, and to localhost:9100/metrics to see the metrics endpoint.

Here are more detailed explanations of what the code is doing:

queryDatabaseStuff() mocks a DB call that would typically fetch some count, such as the amount of outdated data.

meter.createObservableGauge then registers a gauge, to which you can pass a description and the expected value type. The part that is not very well documented in OTel is the addCallback function, which allows you to run a function when the metrics endpoint is called.
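As a rough sketch of what such a callback can do beyond a single value: result.observe takes attributes as an optional second argument, so one callback can report several series for the same gauge. The function queryOutdatedCountsByTable, the table names, and the "table" attribute are hypothetical and not part of the example above. The callback is written as a standalone function so that it could be passed to outdatedDataCountGauge.addCallback.

```javascript
// Hypothetical stand-in for a query that returns one count per table
const queryOutdatedCountsByTable = async () => ({ users: 3, orders: 12 });

// Reports one observation per table, distinguished by a "table" attribute.
// Register it with: outdatedDataCountGauge.addCallback(observeOutdatedDataByTable);
const observeOutdatedDataByTable = async (result) => {
  const counts = await queryOutdatedCountsByTable();

  for (const [table, count] of Object.entries(counts)) {
    // the second argument holds the attributes attached to this observation
    result.observe(count, { table });
  }
};
```

With the Prometheus exporter, these attributes should show up as labels on the outdated_data_count metric.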

Be aware that this approach works best for relatively light database queries. If your query spans multiple databases or runs complex aggregations that can slow down the system, a more effective strategy is to run the query in the background at fixed intervals. The most recent result is stored and simply read during the callback, instead of running the full query on every scrape. This keeps time-consuming database operations out of metrics collection and keeps the metrics endpoint responsive.
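The background-refresh strategy could look roughly like the sketch below. The helper createCachedValue, the expensiveQuery stand-in, and the 60-second interval are assumptions made for illustration; they do not come from OpenTelemetry.

```javascript
// Hypothetical helper: refreshes a value in the background and serves the last result
const createCachedValue = (fetchValue, intervalMs, initialValue = 0) => {
  let latest = initialValue;

  const refresh = async () => {
    try {
      latest = await fetchValue();
    } catch (err) {
      // keep serving the previous value if the query fails
      console.error("refresh failed, keeping last value", err);
    }
  };

  refresh(); // first refresh right away, then on every interval
  const timer = setInterval(refresh, intervalMs);
  timer.unref(); // the refresh loop alone should not keep the process alive

  return {
    get: () => latest,
    refresh,
    stop: () => clearInterval(timer),
  };
};

// Stand-in for a slow query across several databases (hypothetical)
const expensiveQuery = async () => 42;

const outdatedDataCache = createCachedValue(expensiveQuery, 60_000);

// Inside startMetricsExporter(), the callback then only reads the cached value:
// outdatedDataCountGauge.addCallback((result) => {
//   result.observe(outdatedDataCache.get());
// });
```

The trade-off is staleness: the gauge can lag behind the database by up to one refresh interval, so it makes sense to pick an interval close to the scrape interval.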

About the options passed to PrometheusExporter, I would advise keeping the default behavior, which is to start a web server on a separate port. That way you can be sure you won’t expose the metrics to the outside world by mistake.

and... with Kubernetes? ☸️

If you use Kubernetes as your underlying platform, you can have a Service exposing two ports: one for the ingress of the application web server, and the other for the ServiceMonitor scraping the metrics web server.

---
apiVersion: apps/v1
kind: Deployment
# skip the boring part
...
        ports:
          - containerPort: 8089
            name: http
            protocol: TCP
          - containerPort: 9100
            name: metrics
            protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    app: my-web-app
spec:
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 8089
    - name: metrics
      port: 9100
      protocol: TCP
      targetPort: metrics
  selector:
    app: my-web-app
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-servicemonitor
  labels:
    app: my-web-app
spec:
  endpoints:
    - interval: 30s
      port: metrics
  jobLabel: ''
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app: my-web-app

This makes it much easier to avoid exposing the metrics endpoint to the outside: you don't have to juggle ingress path routing to exclude /metrics.

Conclusion

OpenTelemetry is a great project, and I am glad to see it growing so fast, but it can be hard to get familiar with the OTel terms. I hope this blog post will help you, as documentation is still lacking for observability with Node.js in general.

Hope this post helps, and won’t be deprecated too fast!