A different and often better way to downsample your Prometheus metrics

(blog.timescale.com)

96 points · LoriP · 4y ago · 21 comments

18 comments · 6 top-level

skorgu · 4y ago · 4 in thread

I'm curious how this can both avoid the average-of-averages problem (presumably by using the original full-rate data to compute multiple aggregates) and also support backfilling. Is there a danger of the full-rate data expiring, so that backfills past that horizon behave differently? Or am I wholly misunderstanding both these features?

cevian · 4y ago

(NB: post author here)

Great question. We handle the average-of-averages problem by storing the intermediate state of the aggregate (for an average, that's the sum and count), so we can cleanly re-aggregate.

Eventually, we'll be able to incrementally update the aggregate if we backfill even if the raw data is no longer available. That's not implemented yet though, so backfill only updates the aggregate if the raw data is still around by re-computing the intermediate state of the aggregate off the raw data for affected buckets. For most cases that isn't actually an issue since most people have a longer data retention period than backfill horizon.
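A minimal Python sketch of the idea described above, not Promscale's actual implementation: each bucket stores a (sum, count) state instead of a finished average, rollups combine states, and a backfill recomputes only the affected bucket from the raw samples that are still retained. The names `AvgState`, `state_from_raw`, and `rollup`, and all sample values, are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class AvgState:
    """Intermediate state of an average: enough to re-aggregate exactly."""
    total: float = 0.0
    count: int = 0

    def combine(self, other):
        # Merging two partial states is just adding sums and counts.
        return AvgState(self.total + other.total, self.count + other.count)

    def finalize(self):
        # Divide only at read time; None means the bucket had no samples.
        return self.total / self.count if self.count else None


def state_from_raw(samples):
    """Recompute a bucket's state from the raw samples still retained."""
    return AvgState(float(sum(samples)), len(samples))


def rollup(states):
    """Combine many bucket states into one coarser state."""
    return reduce(AvgState.combine, states, AvgState())


# Hypothetical raw data: two buckets with different sample counts.
raw = {"12:00": [10, 20, 30], "12:05": [100]}
buckets = {ts: state_from_raw(samples) for ts, samples in raw.items()}

# Averaging the per-bucket averages weights both buckets equally.
naive_average = sum(b.finalize() for b in buckets.values()) / len(buckets)

# Combining the stored states gives the true average over all samples.
true_average = rollup(buckets.values()).finalize()

# Backfill: a late sample lands in the 12:05 bucket. The raw data is
# still around, so only that bucket's state is recomputed.
raw["12:05"].append(50)
buckets["12:05"] = state_from_raw(raw["12:05"])
backfilled_average = rollup(buckets.values()).finalize()

print(naive_average, true_average, backfilled_average)
```

In this toy data the average of averages comes out at 60.0, while combining states gives 40.0, which matches averaging all four raw samples directly. If the raw samples for a bucket had already expired, `state_from_raw` would have nothing to recompute from, which is the limitation the comment describes.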

arriu · 4y ago

Thanks for the answer! I'd love to know more :) Also, I'm not following how you deal with unique counts. For example, let's say you've got 100 unique visitors on Monday and 100 on Tuesday. The unique visitors across both days might be anywhere between 100 and 200, and averaging counts between days doesn't work.

jpgvm · 4y ago

Not sure about this specific implementation, but normally you handle this with approximations that support merging, e.g. HyperLogLog. You can merge two HyperLogLog counters to maintain proper distinct counts.
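To make the merging property concrete, here is a toy HyperLogLog in Python. It is a simplified sketch for illustration only (no bias correction beyond the standard small-range linear-counting fallback), and it is not taken from Promscale or any particular library. The class name, the SHA-1 based hash, and the visitor IDs are all assumptions.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog sketch: approximate distinct counts that can merge."""

    def __init__(self, p=12):
        self.p = p                    # number of bits used to pick a register
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item (SHA-1 is an arbitrary choice).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits: register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Merging is an element-wise max, so no raw items are needed.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b)
                            for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


# Hypothetical visitors: 100 per day, 50 of whom visit on both days.
monday, tuesday, both_days = HyperLogLog(), HyperLogLog(), HyperLogLog()
for i in range(100):
    monday.add(f"user-{i}")
    both_days.add(f"user-{i}")
for i in range(50, 150):
    tuesday.add(f"user-{i}")
    both_days.add(f"user-{i}")

merged = monday.merge(tuesday)
print(round(merged.estimate()))
```

Merging the two daily sketches yields exactly the same registers as a sketch fed every visit directly, so the two-day estimate lands near the 150 distinct visitors rather than the 200 that summing the daily counts would give.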

jeffbee · 4y ago

You avoid average-of-averages by storing multiple summaries. For example, you don't compute and store average, you compute and store sum and count.

dom96 · 4y ago · 3 in thread

This sounds awesome! But is it the right approach if I am just running a simple Prometheus instance on my home NAS? I've wondered for a while how I can persist my Prometheus timeseries, I guess I could use promscale for this, but maybe it's overkill for something this simple. Advice appreciated :)

derefr · 4y ago

Indefinite persistence of time-series data "as-is" is a somewhat different use case from putting it in a data warehouse so you can efficiently do rollups on it. Timescale seems to be useful for the latter, but I'm not sure it offers too much value for the former.

I believe the state-of-the-art for plain-old Prometheus data retention is https://thanos.io/ — my understanding is that it's a Prometheus remote storage integration (https://prometheus.io/docs/prometheus/latest/storage/#remote...) that archives time-series data from Prometheus into an object-store, and then fetches/streams chunks back out from said object-store to serve requests.

You could use it locally on your NAS, by running a Minio instance on there. But IMHO there wouldn't be much point in doing that, over just keeping all the data in Prometheus's own internal storage.

dom96 · 4y ago

> But IMHO there wouldn't be much point in doing that, over just keeping all the data in Prometheus's own internal storage.

So far from what I've read Prometheus isn't built for that. It stores all its data in RAM and asking it to store any more leads to running out of memory quickly. What have I missed?

cevian · 4y ago

(NB: Post author and Promscale dev here)

Promscale does both data storage and analysis/rollups. It's like Thanos in that you can use it as a remote storage backend. It has the additional functionality of then aggregating/analyzing the data in SQL.

Gravityloss · 4y ago · 2 in thread

At some point somebody "invents" circular buffers with multiple data resolutions, which is what RRDtool was, and maybe we get compact and fast time series storage and reporting again.
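For readers who haven't met the structure being referred to, here is a rough Python sketch of fixed-size circular buffers kept at several resolutions. It only illustrates the general round-robin idea; it is not RRDtool's file format or API, and the class names and parameters are hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size circular buffer holding one resolution of a series."""

    def __init__(self, steps_per_point, capacity):
        self.steps_per_point = steps_per_point  # raw samples per stored point
        self.points = deque(maxlen=capacity)    # oldest points fall off
        self._pending = []

    def update(self, value):
        self._pending.append(value)
        if len(self._pending) == self.steps_per_point:
            # Consolidate the pending samples into one point (average here).
            self.points.append(sum(self._pending) / len(self._pending))
            self._pending = []


class RoundRobinDatabase:
    """Feeds every incoming sample to all archives, so size stays constant."""

    def __init__(self, archives):
        self.archives = archives

    def update(self, value):
        for archive in self.archives:
            archive.update(value)


# Hypothetical layout: last 4 raw samples, plus last 2 three-sample averages.
fine = RoundRobinArchive(steps_per_point=1, capacity=4)
coarse = RoundRobinArchive(steps_per_point=3, capacity=2)
rrd = RoundRobinDatabase([fine, coarse])

for sample in range(1, 10):   # samples 1..9
    rrd.update(sample)

print(list(fine.points), list(coarse.points))
```

Storage never grows: recent data is kept at full resolution and older data survives only in the coarser archive. The trade-off is that each archive here stores a finished average, so re-aggregating across archives runs into the average-of-averages issue discussed earlier in the thread.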

sofixa · 4y ago

RRD is a terrible format. It's not compact, quickly becomes a burden when you have lots of cardinality, there's no metadata, HA is a joke and visualisation tools are basically non-existent. You basically need a whole set of extra tooling for metadata, visualisation, HA, querying to come even close to anything usable.

If RRD wasn't so terrible there wouldn't have been a myriad of replacements.

dang · 4y ago

Interesting - it looks like this is the only past HN thread about it:

Beyond NoSQL: Using RRD to store temporal data - https://news.ycombinator.com/item?id=2742486 - July 2011 (18 comments)

I found a few other tiny threads asking about replacements, and that was it.

baaym · 4y ago · 2 in thread

Years ago I had a Graphite installation where I configured retention policies, and the same for InfluxDB if my memory doesn't fail me.

The downsampling feature at first glance seems to serve a different use case than Prometheus was built for, which I think is observability and alerting for a relatively short time period. For systems that need to work with years of data it totally makes sense, but I don't think Prometheus is used in those cases.

Since this feature has been built for a reason, however, I could be wrong.

bitwalker · 4y ago

Prometheus without any supporting tooling isn't really designed for long-term storage, as I understand it; however, it is built to support long-term storage and querying via its remote read/write protocol. Prometheus will write data to remote storage, and can delegate queries to that storage, rather than using its own local storage as it does by default.

Of the various tools that expose the remote read/write APIs, I like the looks of Promscale/TimescaleDB the most so far, but other options like Thanos might make more sense if you need to collect metrics from a bunch of Prometheuses. That said, maybe you can still use Promscale/TimescaleDB with Thanos as the storage backend. I can't recall the details of its requirements, though, so it might not be suitable for that case. For my own use cases, though, Promscale is a great solution.

ramonguiu · 4y ago

(NB: Promscale team member)

Thanks for the positive feedback!

Is there anything in particular you are missing in Promscale to be used as a backend for multiple Prometheus instances?

We added support for multi-tenancy a couple of months ago (https://blog.timescale.com/blog/simplified-prometheus-monito...)

And thanks to a community contribution by 2nick on GitHub, Promscale can be integrated with Thanos :) (https://github.com/timescale/promscale/pull/664)

polote · 4y ago · 1 in thread

Congrats, Timescale, on being #1 on the front page 3 days in a row!

akulkarni · 4y ago

(Timescale co-founder)

Thank you for noticing :-)

This is really a testament to all of the amazing products, new features, R&D, and overall work that the team has been shipping.

We are firing on all cylinders. Move fast without breaking things :-)

If this looks like fun to anyone - we're hiring!

Come and help us build the next great database company:

https://www.timescale.com/careers

cevian · 4y ago

Just one more note. Timescale is hiring, including for roles working on Promscale.

https://www.timescale.com/careers

Promscale roles are listed in the "Observability" section.
