# Deploy Prometheus Operator With Thanos

## Solve HA and long term storage of Prometheus

Posted by Kakashi on 2019-02-10

# Preface

Prometheus is widely adopted as the standard monitoring tool for Kubernetes because it provides many useful features, such as dynamic service discovery, powerful queries, and seamless alert notification integration. Many applications and client libraries support Prometheus, which makes life easier for operations teams. Although things go pretty well with Prometheus, a stock Prometheus deployment cannot easily achieve high availability (HA) or long-term storage.

# Thanos comes to the rescue

Thanos, developed by Improbable, integrates with Prometheus transparently and solves the HA and long-term storage issues without hurting performance. The idea is to run a Thanos sidecar next to each Prometheus instance; the sidecar interacts with Prometheus to upload and query metrics. Prometheus Operator also supports Thanos natively, which makes it easier to deploy a Prometheus cluster along with Thanos. The solution is pretty elegant if you already use Prometheus Operator to provision your Prometheus cluster.

This post covers:

• How to deploy the Prometheus Operator on Kubernetes
• How to deploy the Thanos sidecar with Prometheus
• Achieving HA: the Thanos Querier
• Querying historical data: the Thanos Store
• Reducing data size: the Thanos Compactor

# Install Prometheus through Prometheus operator

There are plenty of articles explaining why you should use prometheus-operator to provision Prometheus. If you are not familiar with prometheus-operator, I recommend reading the references [2].

## 1. Install Helm in your environment

• macOS: `brew install kubernetes-helm`
• Linux: `sudo snap install helm --classic`

## 3. Install coreos prometheus operator

Note that we use the stable/prometheus-operator chart because the coreos/prometheus-operator Helm chart is going to be deprecated. Later we need to modify the chart values to provision the Prometheus cluster along with the Thanos sidecar. To install a stable Helm chart with custom values, you need to download values.yaml from the GitHub repo.

In this example, we name our Prometheus Operator release prom-op and install it in the monitoring namespace.
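The two steps above might look like the following sketch. It assumes Helm 2, which was current when this post was written (Helm 3 drops `--name` and takes the release name as the first argument), and it uses `helm inspect values` to dump the chart defaults rather than downloading values.yaml from GitHub by hand.

```shell
RELEASE=prom-op                    # release name used in this post
NAMESPACE=monitoring               # namespace used in this post
CHART=stable/prometheus-operator   # chart named in this post

if command -v helm >/dev/null 2>&1; then
  # Dump the chart's default values so they can be edited locally.
  helm inspect values "$CHART" > values.yaml
  # (edit values.yaml before installing)
  # Install the chart with the custom values (Helm 2 syntax).
  helm install "$CHART" --name "$RELEASE" --namespace "$NAMESPACE" -f values.yaml
else
  echo "helm is not installed; see step 1" >&2
fi
```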

Use the following command to verify that prometheus-operator has been provisioned successfully.
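A minimal check, assuming the monitoring namespace from the install step, is to list the pods and wait until they all report Running:

```shell
NAMESPACE=monitoring   # namespace used in this post

if command -v kubectl >/dev/null 2>&1; then
  # The operator pod and the Prometheus pods it manages should reach Running.
  kubectl get pods --namespace "$NAMESPACE"
else
  echo "kubectl is not installed" >&2
fi
```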

# Thanos Deployment

NEED TO KNOW
prometheus-operator should be greater than 0.28.0 to support Thanos 2.0

## Thanos Architecture

[Figure: Official architecture of Thanos]

[Figure: Our deployment steps]

According to the architecture diagram above, Thanos has several components:

• Sidecar
• Querier
• Store
• Compactor

The deployment steps:

1. Deploy Prometheus with the Thanos Sidecar.
2. Deploy the Thanos Querier, which talks to the Prometheus sidecars through the gossip protocol.
3. Make sure the Thanos Sidecar can upload Prometheus metrics to the given S3 bucket.
4. Set up the Thanos Store to retrieve data from long-term storage.
5. Set up the Compactor to shrink historical data.

## Install Thanos sidecar

To install the Thanos sidecar along with prometheus-operator, specify the Thanos sidecar in the chart values as follows:
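A sketch of what those values could look like, written as a separate overlay file so the downloaded values.yaml stays untouched. The file name `thanos-values.yaml`, the image and version, the `thanos-peers` address, the `thanos-peer` pod label, and the Secret name are all assumptions for illustration; only the `prometheus.prometheusSpec.thanos` structure comes from the operator's Thanos support.

```shell
# Write a values overlay that enables the Thanos sidecar.
cat > thanos-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    replicas: 2                          # two replicas for HA
    podMetadata:
      labels:
        thanos-peer: "true"              # assumed label for gossip peer discovery
    thanos:
      baseImage: improbable/thanos       # assumed image
      version: v0.2.1                    # assumed version
      peers: thanos-peers.monitoring.svc:10900   # assumed headless Service
      objectStorageConfig:
        name: thanos-objstore-config     # assumed Secret name
        key: thanos.yaml                 # key inside that Secret
EOF
```

The overlay can then be layered on top of the base values, for example with `helm upgrade prom-op stable/prometheus-operator -f values.yaml -f thanos-values.yaml`.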

objectStorageConfig is configured through a configuration file, thanos.yaml.
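For an S3 bucket, thanos.yaml might look like the sketch below. The bucket name, endpoint, and credentials are placeholders, not values from the original setup.

```shell
# Write the Thanos object storage configuration for an S3 bucket.
cat > thanos.yaml <<'EOF'
type: S3
config:
  bucket: my-thanos-bucket               # placeholder bucket name
  endpoint: s3.us-west-2.amazonaws.com   # placeholder; use your bucket's regional endpoint
  access_key: REPLACE_ME                 # placeholder credential
  secret_key: REPLACE_ME                 # placeholder credential
EOF
```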

Create the Kubernetes secret by running the following command:
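A sketch of that command, assuming the monitoring namespace and a hypothetical Secret name `thanos-objstore-config`; whatever name is used must match the one referenced by objectStorageConfig in the chart values.

```shell
NAMESPACE=monitoring                 # namespace used in this post
SECRET_NAME=thanos-objstore-config   # hypothetical; must match objectStorageConfig.name

if command -v kubectl >/dev/null 2>&1 && [ -f thanos.yaml ]; then
  # Store thanos.yaml under the key "thanos.yaml" inside the Secret.
  kubectl --namespace "$NAMESPACE" create secret generic "$SECRET_NAME" \
    --from-file=thanos.yaml=thanos.yaml
else
  echo "kubectl and a local thanos.yaml are both required" >&2
fi
```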

Warning: `endpoint` must be set so that Thanos knows which region the bucket is located in.

## Verify Thanos Sidecar

If everything goes well, there will be a thanos-sidecar container in the Prometheus pod,

and if you check the sidecar's log, you will see the following messages.
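The two checks described above can be run roughly as follows. This sketch assumes the Prometheus pods carry the label `app=prometheus`; adjust the selector if your pods are labelled differently.

```shell
NAMESPACE=monitoring        # namespace used in this post
SELECTOR=app=prometheus     # assumed pod label
CONTAINER=thanos-sidecar    # sidecar container name mentioned above

if command -v kubectl >/dev/null 2>&1; then
  # Print each Prometheus pod with its container names; the sidecar should be listed.
  kubectl --namespace "$NAMESPACE" get pods -l "$SELECTOR" \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].name}{"\n"}{end}'
  # Show the most recent log lines from the sidecar container.
  kubectl --namespace "$NAMESPACE" logs -l "$SELECTOR" -c "$CONTAINER" --tail=20
else
  echo "kubectl is not installed" >&2
fi
```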

## Install Thanos Querier

The Thanos Querier layer provides the ability to retrieve metrics from all Prometheus instances at once. It is fully compatible with the original Prometheus PromQL and HTTP APIs, so it can be used with Grafana.

Since there are too many YAML files, I put everything in my GitHub repo.
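For readers who want a rough idea of what such manifests contain, here is a minimal sketch; it is not the content of the repo. It assumes a Thanos 0.2.x image with gossip flags, a hypothetical headless Service named `thanos-peers` that selects pods labelled `thanos-peer: "true"`, and the `prometheus_replica` external label for deduplication.

```shell
# Write a headless peer-discovery Service and a Querier Deployment.
cat > thanos-querier.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: thanos-peers                # assumed name
  namespace: monitoring
spec:
  clusterIP: None                   # headless: DNS returns the pod IPs
  selector:
    thanos-peer: "true"             # assumed label on every gossip member
  ports:
  - name: cluster
    port: 10900
    targetPort: 10900
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-querier
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: thanos-querier
  template:
    metadata:
      labels:
        app: thanos-querier
        thanos-peer: "true"
    spec:
      containers:
      - name: thanos-querier
        image: improbable/thanos:v0.2.1   # assumed version
        args:
        - query
        - --cluster.peers=thanos-peers.monitoring.svc:10900
        - --query.replica-label=prometheus_replica
        ports:
        - name: http
          containerPort: 10902
        - name: grpc
          containerPort: 10901
        - name: cluster
          containerPort: 10900
EOF
```

Apply it with `kubectl apply -f thanos-querier.yaml`, then point Grafana at the Querier's HTTP port as if it were a Prometheus data source.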

## Install Thanos Store

Thanos Store works with the Querier to retrieve historical data from the given bucket. It joins the Thanos cluster on startup.
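A minimal Store sketch under the same kind of assumptions: a Thanos 0.2.x image, a hypothetical `thanos-peers` headless Service for gossip discovery, and a Secret named `thanos-objstore-config` that holds thanos.yaml. An emptyDir is used for the local cache to keep the example short.

```shell
# Write a Store Deployment that reads the bucket config from a Secret.
cat > thanos-store.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-store
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: thanos-store
  template:
    metadata:
      labels:
        app: thanos-store
        thanos-peer: "true"               # assumed gossip label
    spec:
      containers:
      - name: thanos-store
        image: improbable/thanos:v0.2.1   # assumed version
        args:
        - store
        - --data-dir=/var/thanos/store
        - --cluster.peers=thanos-peers.monitoring.svc:10900
        - --objstore.config-file=/etc/thanos/thanos.yaml
        volumeMounts:
        - name: objstore-config
          mountPath: /etc/thanos
          readOnly: true
        - name: data
          mountPath: /var/thanos/store
      volumes:
      - name: objstore-config
        secret:
          secretName: thanos-objstore-config   # assumed Secret name
      - name: data
        emptyDir: {}
EOF
```

Apply it with `kubectl apply -f thanos-store.yaml`.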

## Install Thanos Compactor

Thanos Compactor downsamples all of your historical data. It is a really useful component that can reduce file size. I recommend reading this well-explained article.
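A minimal Compactor sketch, again assuming a Thanos 0.2.x image and a Secret named `thanos-objstore-config`. It runs a single replica on purpose, since only one Compactor should work on a given bucket at a time, and uses `--wait` so the process keeps running and compacts new blocks as they arrive.

```shell
# Write a single-replica Compactor Deployment.
cat > thanos-compactor.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-compactor
  namespace: monitoring
spec:
  replicas: 1                             # never run more than one per bucket
  selector:
    matchLabels:
      app: thanos-compactor
  template:
    metadata:
      labels:
        app: thanos-compactor
    spec:
      containers:
      - name: thanos-compactor
        image: improbable/thanos:v0.2.1   # assumed version
        args:
        - compact
        - --data-dir=/var/thanos/compact
        - --objstore.config-file=/etc/thanos/thanos.yaml
        - --wait
        volumeMounts:
        - name: objstore-config
          mountPath: /etc/thanos
          readOnly: true
        - name: data
          mountPath: /var/thanos/compact
      volumes:
      - name: objstore-config
        secret:
          secretName: thanos-objstore-config   # assumed Secret name
      - name: data
        emptyDir: {}
EOF
```

Apply it with `kubectl apply -f thanos-compactor.yaml`.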

# Troubleshooting

## Peering service is not set up properly

You will see this kind of message in the Thanos component's logs: