
What Happens Between Dashboards and Prometheus?
- Nicolas Takashi
- Observability, DevOps
- May 10, 2025
TL;DR
Prometheus gives us visibility into our system, but not into how we use that visibility. prom-analytics-proxy is a lightweight, drop-in proxy that captures PromQL query traffic (from dashboards, alerts, scripts, etc.) and gives you insights like top expressions, query latency, error rates, and usage patterns, all visualized in a built-in UI.
It helps you optimize your observability stack, reduce waste, and understand what’s actually being queried in production.
👉 Check it out on GitHub
I’ve always loved how easy it is to get started with Prometheus. You write some code, expose some metrics, run a query, and boom, you’re in.
That simplicity gives you superpowers: we were tracking everything from request latency to custom business metrics within seconds.
But eventually, something felt off. We had all the metrics we needed, but no idea how they were being used. That part was invisible.
We had metrics, but no map.
Most of us start with a pretty standard setup: a plain Prometheus server collecting metrics, and dashboards or alerts querying it directly. Sometimes things get a bit fancier: we throw in Thanos with a Query Frontend, fanning out requests to a fleet of Prometheus instances.
This stack works. I love it. It’s battle-tested, cloud-native, and gives you observability superpowers.
But over time, the same questions kept coming up:
- What queries are actually being run right now?
- Are they coming from dashboards? Ad-hoc API calls? Alerts?
- What label matchers are being used?
- Which application or team is issuing the queries?
- How heavy are these queries? How many samples do they touch? What’s the latency? The status code?
Some of this you can piece together with distributed tracing, sure. But even with traces, we lacked a way to correlate that data with metrics usage patterns. We couldn’t analyze trends, spot outliers, or build any kind of feedback loop around how our observability tools were actually being used.
And that was the spark.
The Problem — Metrics Without Metadata
Not knowing how your metrics are being used is always a bit of a problem. But at scale? It’s a huge one.
When you’re operating large environments, like I do at Coralogix, or like Michael does at Cloudflare, you don’t just want visibility into the data you’re collecting. You need visibility into how that data is being consumed. Who’s running what queries? How often? With which label matchers? How expensive are those queries for your storage layer or PromQL engine?
This stuff matters. Without it, you’re basically flying blind.
In practice, this lack of insight makes it really hard to make informed, data-driven decisions. Should we scale up the Prometheus tier? Or are there a handful of poorly written dashboards causing most of the load? Should we optimize our queries at the source, or just throw more compute at the problem? Without usage patterns, you’re guessing.
We had some tools to help, like distributed tracing, but they only go so far. Tracing is great when you’re debugging a specific slow query, but it’s inherently one-shot and isolated. What we needed was the bigger picture: trends over time, correlations with other workloads, systemic usage patterns across the organization.
We wanted to answer bigger questions, like:
- What metrics are most queried across the org?
- Which alerts or dashboards are hitting our systems the hardest?
- Are there unused panels that nobody has looked at in months?
Without metadata and analytics around queries, it’s incredibly hard to run a platform efficiently, make smart architectural decisions, or even detect outliers and anomalies in usage.
And the worst part? Most teams don’t even realize this visibility gap exists until they hit scale, and it hits back.
How prom-analytics-proxy Was Born
It started sometime last year; honestly, I don’t even remember exactly when. Just one of those casual chats between me and Michael, trading war stories from running large Prometheus environments.
We were both hitting the same problem: we had no idea how people were actually using the data we were collecting. Different setups, same gap. At first, it felt like one of those “yeah, this is annoying” kinds of problems. But the more we talked, the more we realized this wasn’t just us. This was a missing layer in the Prometheus ecosystem.
So we figured, let’s just build something.
From the start, we knew it would be a proxy. That was the cleanest solution: no patching Prometheus, no modifying dashboard tools. Just drop in a lightweight proxy between clients and the Prometheus API, and observe everything flowing through it.
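To make that shape concrete, here’s a minimal sketch in Go: a reverse proxy that forwards requests to the upstream API and notes the expression and latency on the way through. This is not the actual prom-analytics-proxy implementation, just the general pattern; the addresses and ports are placeholders.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"
)

func main() {
	// Upstream Prometheus-compatible API; the address is a placeholder.
	upstream, err := url.Parse("http://localhost:9090")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		// Grab the raw PromQL expression before forwarding.
		// (Simplified: only GET query parameters, no form-encoded POST bodies.)
		expr := r.URL.Query().Get("query")
		proxy.ServeHTTP(w, r)
		// A real proxy would persist this record somewhere; here we just log it.
		log.Printf("path=%s expr=%q duration=%s", r.URL.Path, expr, time.Since(start))
	})

	// Dashboards, alerts, and scripts point here instead of at Prometheus.
	log.Fatal(http.ListenAndServe(":9091", handler))
}
```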
We had this fun idea early on to pipe everything into DuckDB for local analytics. It was cool in theory… less cool in practice 😅. The proxy we have today looks nothing like what we imagined back then: it’s way more focused, more practical, and something you can actually run in production without fear.
And that’s been the spirit of the project from day one: keep it simple, make it transparent, and build the tool we wished we had.
What prom-analytics-proxy Actually Does
At its core, prom-analytics-proxy is a lightweight proxy that sits in front of your Prometheus-compatible backends, whether that’s Prometheus itself, Thanos, Cortex, or anything else that speaks the same API.
Every time a query flows through, whether it’s from a dashboard solution like Perses, a script, or an internal tool, the proxy captures it. It extracts useful details from both the request and the response: the raw PromQL expression, label matchers, status code, response time, sample count, and more.
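To give a sense of what gets captured per request, here’s a hypothetical record shape in Go. The field names are mine for illustration, not the project’s actual schema, but they line up with the details described above.

```go
package analytics

import "time"

// QueryRecord is a sketch of the kind of record a proxy like this might
// persist for every query it sees. Illustrative only.
type QueryRecord struct {
	Expr          string        // raw PromQL expression
	LabelMatchers []string      // label matchers extracted from the expression
	Endpoint      string        // e.g. /api/v1/query or /api/v1/query_range
	StatusCode    int           // HTTP status returned by the upstream
	Duration      time.Duration // end-to-end response time
	Samples       int64         // samples touched, when the backend reports it
	ReceivedAt    time.Time     // when the proxy saw the request
}
```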
But the magic isn’t just in what it collects; it’s in what you can see.
The project ships with a built-in UI that visualizes key metrics like:
- Most used PromQL expressions
- Status code distribution
- Average query latency and throughput
- Error rate over time
- Correlation with metrics usage from other sources
📽️ See it in action
Want a quick peek? Here’s a short video demo of the prom-analytics-proxy UI in action. No audio, just a quick screen tour:
It gives you immediate visibility into what’s happening inside your observability stack, without needing to wire up external dashboards or tooling.
In practice, this helps you understand how your data is being used, shine a light on performance issues, and spot opportunities to improve or simplify your queries.
And the best part? It’s ridiculously easy to adopt. The proxy is a single Go binary. It doesn’t require any dependencies out of the box: just drop it in front of your Prometheus endpoint and start observing traffic. Want to scale horizontally? Just hook it up to a Postgres backend. But on day one? You’re good to go.
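From a client’s point of view, nothing changes except the address it talks to. As an illustration, here’s a tiny Go program using the official Prometheus client library (prometheus/client_golang), pointed at the proxy instead of Prometheus directly; the proxy address is a placeholder for your own deployment.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Point the client at the proxy instead of Prometheus itself.
	// The address and port are placeholders for your own setup.
	client, err := api.NewClient(api.Config{Address: "http://prom-analytics-proxy:9091"})
	if err != nil {
		log.Fatal(err)
	}

	v1api := v1.NewAPI(client)
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// The query behaves exactly as before; the proxy simply records it in passing.
	result, warnings, err := v1api.Query(ctx, "up", time.Now())
	if err != nil {
		log.Fatal(err)
	}
	if len(warnings) > 0 {
		log.Printf("warnings: %v", warnings)
	}
	fmt.Println(result)
}
```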
What’s Next (and How to Get Involved)
Right now, prom-analytics-proxy is fully production-ready. I’ve been running it myself, and it’s been working like a charm. It’s lightweight, transparent, and already helping surface insights I couldn’t get anywhere else.
There’s still a lot more we’d love to explore, like breaking down usage by user agent, or even suggesting improvements to overly expensive PromQL expressions. The built-in UI is also evolving: we want to keep making it easier to explore and understand your metrics usage visually.
And we’re just getting started.
In an upcoming follow-up post, I’ll walk through how prom-analytics-proxy integrates with perses/metrics-usage, an initiative focused on unlocking even deeper visibility and context around metric usage. When you combine fine-grained query observability with tooling like metrics-usage, you open the door to an entirely new level of optimization, hygiene, and cost-awareness in your observability strategy.
If you want to try it out, head over to nicolastakashi/prom-analytics-proxy. It’s open source, easy to run, and totally free to experiment with. Feedback, bug reports, feature ideas, wild use cases, everything is welcome.
And if you’ve ever hit this same wall, wondering how your metrics are actually being used, don’t keep it to yourself. Share it with your team. Share it with the community. You never know… that conversation might turn into a project. And that project might end up helping a whole lot of people across the industry.