Leveraging Consul for Thanos Query Discovery

Service Discovery is crucial for environments that demand high scalability and dynamism, and today I’m going to show you how to use Consul to help us build a distributed Monitoring Platform.

Info

In this blog post, I’m assuming that you’re familiar with Thanos, Consul, and all the other tools that I’ll cover here.

Multi-Cluster Monitoring Use Case

Multi-Cluster Monitoring is an everyday use case in enterprise environments. Usually, you have a Control Plane that acts as a centralized view of all of your clusters, as we can see in the image below.

Image from Banzai Cloud: Multi-Cluster Monitoring Post

I advise you to take a look at the Banzai Cloud blog post for details about that solution.

Thanos Limitation

As described in the official Thanos documentation, Thanos has no built-in support for Service Discovery solutions such as Consul. However, following that documentation, you can implement it yourself by writing the discovered targets to a file that Thanos components consume through File SD.

Currently, there are no plans of adding other Service Discovery mechanisms like Consul SD, Kubernetes SD, etc. However, we welcome people implementing their preferred Service Discovery by writing the results to File SD, which can be consumed by the different Thanos components.

Thanos Metrics

Technical Solution

Now that you have all the required context about the use case and the limitation, let’s build a solution leveraging Consul and Consul Template to discover instances of Thanos Query in different clusters.

Building the environment

To build all the things that we need to simulate the use case mentioned above, I’ll use minikube and fake multiple local clusters using namespaces, just to keep the example as simple as possible.

Creating a minikube Cluster

To spin up a new minikube cluster, you can just use the example below.

minikube start --kubernetes-version=v1.23.0 --memory=4g

Installing Consul

For this example, I’ll use the official helm chart for Consul, but before installing Consul, you need to create a values file, using the content below.

I want to highlight that I’m enabling the sync catalog feature, which allows us to sync Kubernetes Services into the Consul Catalog.

global:
  name: consul
  datacenter: central

server:
  replicas: 1

ui:
  enabled: true
  service:
    type: 'NodePort'

syncCatalog:
  enabled: true
  default: false
  k8sAllowNamespaces: ['*']
  k8sDenyNamespaces: ['kube-system', 'kube-public']

Now you can run the following command:

helm upgrade -i consul hashicorp/consul --create-namespace -n consul -f consul-values.yaml

Now, if you want to access the Consul UI, you can run the following command:

kubectl -n consul port-forward svc/consul-server 8500:8500

Installing Monitoring Stack on the West Europe “cluster”

The Monitoring Stack is composed of different components. I’m using the kube-prometheus-stack helm chart to help me deploy Prometheus, Node-Exporter, and KubeStateMetrics. For Thanos Query, I’m using the Thanos Bitnami helm chart.

Creating Namespace Cluster

As I said before, I’m using namespaces to simulate clusters. With that in mind, let’s create a namespace called westeurope.

kubectl create ns westeurope

Creating Thanos Object Store Secret

Before installing the kube-prometheus-stack, we need to create a secret with the object storage config that will be used by the Thanos Sidecar, using the content below.

apiVersion: v1  
kind: Secret  
metadata:  
  name: thanos-secret  
type: Opaque  
stringData:  
  objstorage.yaml: |-  
    type: FILESYSTEM  
    config:  
      directory: "/etc/thanos/data"

Since it’s a localhost environment and the idea is to keep it as simple as possible, I’m using the FILESYSTEM option.

kubectl apply -f thanos-secret.yaml -n westeurope

Installing kube-prometheus-stack

After the Thanos Secret creation, we’re able to install the kube-prometheus-stack using the following content as the values file.

alertmanager:
  enabled: false

grafana:
  enabled: false

prometheus:
  prometheusSpec:
    # Mount the secret that holds the object storage config
    volumes:
      - name: object-storage
        secret:
          secretName: thanos-secret
    # Thanos Sidecar settings
    thanos:
      objectStorageConfigFile: /etc/conf/objstorage.yaml
      volumeMounts:
        - mountPath: /etc/conf/
          name: object-storage

After the values file creation, we can run the following command.

helm upgrade -i westeurope prometheus-community/kube-prometheus-stack -n westeurope -f kube-prometheus-values.yaml

Installing Thanos Query

The last component that we need to install into the West Europe cluster is Thanos Query.

Thanos Query will allow us to query data from this West Europe cluster from the Observer cluster.

Before installing Thanos Query, you must create a values file with the content below.

queryFrontend:
  enabled: false

query:
  extraFlags:
    # The namespace in this DNS name must match the one where Prometheus runs
    - --endpoint=dnssrv+_grpc._tcp.prometheus-operated.westeurope.svc
  service:
    annotations:
      consul.hashicorp.com/service-sync: 'true'
      consul.hashicorp.com/service-name: 'thanos-query'
      consul.hashicorp.com/service-port: 'grpc'

I want to highlight the annotations that I’m adding to the Thanos Query service. Those annotations allow the Consul Catalog Sync to onboard the Thanos Query Service.

After the values file creation, you can run the command below to install the Thanos Query.

helm upgrade -i thanos-agent bitnami/thanos -n westeurope -f thanos-agent-values.yaml

Monitoring Stack West Europe Overview

After installing all the components that I’ve described above, the Monitoring Stack on the West Europe “Cluster” should look like this.

Monitoring Stack West Europe Cluster

Installing Monitoring Stack for the North Europe “cluster”

The North Europe cluster uses the same configuration that I used for West Europe, so you just need to install everything in a different namespace called northeurope, adjusting any namespace references in the values files accordingly.

Checking Consul UI

At this point, you should have the two “clusters” up and running with all the required monitoring components. If you access the Consul UI through the port-forward command that I shared earlier, you will see the Thanos Query Service with two instances in the Services menu.

Consul UI
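
If you prefer to check the catalog without the UI, the same information is available through the Consul HTTP API. The snippet below is a minimal sketch, assuming the port-forward from this post is running on localhost:8500. The helper names parse_catalog and list_instances are my own, not part of any Consul client library.

```python
import json
from urllib.request import urlopen

# Assumption: the port-forward from this post is running locally.
CONSUL_URL = "http://localhost:8500"


def parse_catalog(payload):
    """Turn a /v1/catalog/service/<name> response body into host:port strings."""
    endpoints = []
    for entry in json.loads(payload):
        # ServiceAddress can be empty, in which case the node Address applies.
        host = entry.get("ServiceAddress") or entry["Address"]
        endpoints.append(f"{host}:{entry['ServicePort']}")
    return endpoints


def list_instances(service="thanos-query"):
    """Fetch the registered instances of a service from the Consul catalog."""
    with urlopen(f"{CONSUL_URL}/v1/catalog/service/{service}") as response:
        return parse_catalog(response.read())
```

Calling list_instances() against the setup from this post should return one entry per “cluster”.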

Building the Observer “cluster”

In this blog post, the observer cluster will only contain the Thanos Query to keep the focus on what matters.

Consul Template

Consul Template is a project that provides a convenient way to populate values from Consul into the file system, and the idea is to use the Consul Template capabilities to write the service discovery file that Thanos will look for.

Creating Consul Template File

Consul Template uses the Go template syntax to render values, and with that in mind, we can create a config map that will hold the template logic.

kind: ConfigMap  
apiVersion: v1  
metadata:  
  name: consul-template-config-map  
data:  
  thanos-sd-file.tpl: |  
    - targets:  
      {{ range service "thanos-query" -}}  
      - {{ .Address }}:{{ .Port }}  
      {{ end -}}

This template iterates over the thanos-query service instances available on Consul and builds the structure that Thanos Query can handle to discover new targets.
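
To make the expected output concrete, here is a small Python sketch that produces the same structure the template renders: a single group with one address:port entry per instance. It is only an illustration of the file format, not how Consul Template works internally. The function name and the sample addresses are made up.

```python
def render_file_sd(instances):
    """Build a File SD document with one target per (address, port) pair."""
    lines = ["- targets:"]
    for address, port in instances:
        lines.append(f"  - {address}:{port}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical addresses for the two simulated clusters.
    print(render_file_sd([("10.1.0.7", 10901), ("10.1.0.8", 10901)]), end="")
```

With two registered instances, the rendered file contains a targets list with two entries, which is what the Thanos Query in the Observer cluster reads.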

Creating Consul Template HCL Config

If you’re familiar with the HashiCorp products, you already know the HCL configuration standard that we’ll use below.

kind: ConfigMap
apiVersion: v1
metadata:
  name: consul-template-hcl-config-map
data:
  config.hcl: |
    consul {
      address = "consul-server.consul.svc.cluster.local:8500"
    }

    log_level = "info"

    template {
      source = "/consul-template/templates/thanos-sd-file.tpl"
      destination = "/etc/thanos/sd/thanos-sd-file.yaml"
    }

I’m configuring Consul Template to use the TPL file that I’ve created above and render it as thanos-sd-file.yaml in the Thanos service discovery directory.

Thanos Query

Now that we already have all the foundations to use Consul Template, let’s start to build the values file that the Thanos Query helm chart will use.

Mounting Consul Template Volumes

Before we start adding the Consul Template SideCar Container, we need to mount the Consul Template Configs as volumes on Thanos Query.

extraVolumes:  
  - name: consul-template-hcl-config  
    configMap:  
      name: consul-template-hcl-config-map  
  - name: consul-template-config  
    configMap:  
      name: consul-template-config-map  
  - name: thanos-sd-dir  
    emptyDir: {}  
    
extraVolumeMounts:  
  - name: thanos-sd-dir  
    mountPath: /etc/thanos/sd

Adding Consul Template SideCar

After the volumes configuration, let’s add the Consul Template SideCar Container.

sidecars:  
  - name: consul-template  
    image: hashicorp/consul-template  
    imagePullPolicy: IfNotPresent  
    args:   
      - consul-template  
      - -config   
      - /consul-template/config.d/config.hcl  
    volumeMounts:  
      - name: consul-template-hcl-config  
        mountPath: /consul-template/config.d  
      - name: consul-template-config  
        mountPath: /consul-template/templates  
      - name: thanos-sd-dir  
        mountPath: /etc/thanos/sd

Thanos Query SD Config File Path

Last but not least, we must provide an extra flag informing the Thanos Query to use the Service Discovery file.

extraFlags:  
  - --store.sd-files=/etc/thanos/sd/thanos-sd-file.yaml

Thanos Query Final values file

After following all the steps that I described above, your Thanos Query values file should look like the example below.

queryFrontend:
  enabled: false

query:
  extraFlags:
    - --store.sd-files=/etc/thanos/sd/thanos-sd-file.yaml

  sidecars:
    - name: consul-template
      image: hashicorp/consul-template
      imagePullPolicy: IfNotPresent
      args:
        - consul-template
        - -config
        - /consul-template/config.d/config.hcl
      volumeMounts:
        - name: consul-template-hcl-config
          mountPath: /consul-template/config.d
        - name: consul-template-config
          mountPath: /consul-template/templates
        - name: thanos-sd-dir
          mountPath: /etc/thanos/sd

  extraVolumes:
    - name: consul-template-hcl-config
      configMap:
        name: consul-template-hcl-config-map
    - name: consul-template-config
      configMap:
        name: consul-template-config-map
    - name: thanos-sd-dir
      emptyDir: {}

  extraVolumeMounts:
    - name: thanos-sd-dir
      mountPath: /etc/thanos/sd

Now we just need to install Thanos Query into the Observer cluster. Assuming the observer namespace already exists and both ConfigMaps above have been applied to it, run the command below.

helm upgrade -i thanos bitnami/thanos -n observer -f thanos.yaml

Voilà, the magic starts to happen. If we access the Thanos Query UI and look into the Stores menu, we’ll be able to see the Thanos Query instances from the other clusters.

Observer Thanos Query — List of Stores

Solution Overview

Let’s do a quick solution overview to make sure everything is clear.

We’re leveraging Consul Template to watch all the Thanos Query services available on Consul that are running in different clusters, and then writing the address and port of those instances in the Thanos File Service Discovery format.

This means that every time a new cluster is born and the Thanos Query of that cluster is registered in the Consul catalog, it will be automatically discovered by the Thanos Query in the Observer cluster.
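
As a rough illustration of that effect, the Python sketch below compares two renderings of the service discovery file and reports which targets appeared or disappeared. It assumes the single-group layout produced by the template in this post. The helper names and addresses are hypothetical, and the line-based parsing is a shortcut rather than a real YAML parser.

```python
def parse_targets(file_sd_text):
    """Extract host:port entries from a single-group File SD document."""
    targets = set()
    for line in file_sd_text.splitlines():
        entry = line.strip()
        # Skip the "- targets:" group header and keep only the list items.
        if entry.startswith("- ") and not entry.startswith("- targets:"):
            targets.add(entry[2:].strip())
    return targets


def diff_targets(before, after):
    """Return (added, removed) targets between two renderings of the file."""
    old, new = parse_targets(before), parse_targets(after)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    before = "- targets:\n  - 10.1.0.7:10901\n"
    after = "- targets:\n  - 10.1.0.7:10901\n  - 10.1.0.8:10901\n"
    added, removed = diff_targets(before, after)
    print("added:", added, "removed:", removed)
```

When a new cluster registers its Thanos Query, the only change in the file is one extra target line, and the Observer picks it up without any manual configuration.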

Conclusion

Well!

I hope you enjoyed the content and that it helps you easily onboard and scale your distributed monitoring solution. Feel free to reach out to me if you need more information, have any questions about this solution, or want to share your feedback.
