Apache Pegasus - A horizontally scalable, high-performance key-value store

(github.com)

110 points · indogooner · 3y ago · 39 comments

28 comments · 9 top-level

tmikaeld · 3y ago · 7 in thread

Will be interesting to see the benchmarks!

There are a lot of KV engines that use RocksDB now, like CockroachDB (since replaced by Pebble, though), YugabyteDB and TiDB.

Those are all many times slower than Redis though, so a middle ground that aims to be similar to Redis but doesn't eat all the RAM is very exciting!

hknmtt · 3y ago

Because you are comparing an in-memory store with persistent storage. Many KV databases can be run from memory if needed.

jamesrr39 · 3y ago

Technically, all of them can, if you put the "persisted" storage in a directory on a ramfs or tmpfs mount :)

This might seem like a bit of a facetious comment, but it does have genuine use cases, e.g. unit tests, or for data that you can easily restore after a shutdown.
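
A minimal sketch of the tmpfs trick in a unit-test setting, assuming a Linux machine where /dev/shm is a writable tmpfs mount (the code falls back to the regular temp directory otherwise). SQLite only stands in for the "persisted" store here; it is not something mentioned in the thread, and the helper names (open_kv, put, get, roundtrip) are hypothetical.

```python
import os
import sqlite3
import tempfile

# Assumption: /dev/shm is a tmpfs mount (true on most Linux systems).
# If it is missing or not writable, fall back to the default temp directory.
RAM_DIR = (
    "/dev/shm"
    if os.path.isdir("/dev/shm") and os.access("/dev/shm", os.W_OK)
    else None
)


def open_kv(path):
    """Open a tiny file-backed key-value table (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    return conn


def put(conn, key, value):
    """Insert or overwrite a key."""
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))
    conn.commit()


def get(conn, key):
    """Return the stored value, or None if the key is absent."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


def roundtrip():
    # The store writes to what it thinks is a file on disk, but the directory
    # lives in RAM when RAM_DIR is a tmpfs mount. The directory is deleted
    # when the with-block exits, so nothing survives the test.
    with tempfile.TemporaryDirectory(dir=RAM_DIR) as tmp:
        conn = open_kv(os.path.join(tmp, "test.db"))
        put(conn, "user:1", "alice")
        value = get(conn, "user:1")
        conn.close()
        return value
```

The store itself needs no changes; only the path it is pointed at decides whether the data lands on disk or in RAM.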

c0balt · 3y ago

Isn't TiDB built on top of TiKV?[0]

[0]: https://github.com/pingcap/tidb

acatton · 3y ago

TiKV is basically a layer on top of RocksDB: https://github.com/tikv/tikv/blob/956610725039835557e7516828...

sluongng · 3y ago

TiKV runs on top of RocksDB in most scenarios.

yla92 · 3y ago

> Those are all many times slower than Redis though

I'd appreciate any links or docs that I could look into to learn more about this.

ddorian43 · 3y ago

Blog post from antirez http://oldblog.antirez.com/post/redis-persistence-demystifie...

seeekr · 3y ago · 5 in thread

Note that this seems to be a relatively old project, first commits from 2015. The project seems active, but most of the work seems to have been done around its inception, with some significant activity from 2020 onwards. Speculation/interpretation: So this might be a project that was used internally by some company, but perhaps not any more, and they've decided to open-source it at some point (2017-2018?) because some folks were/are still excited about it and want to keep developing it.

This might explain some of the "why yet another RocksDB-based KV store?" line of questioning.

pas · 3y ago

> yet another RocksDB-based KV store

Aaah, there was a super informative talk about the different databases at Facebook, most of them built on RocksDB, with different trade-offs. (And I can't find the video :((((( )

Anyway, it makes sense to have yet another if it serves a different purpose, e.g. for read-heavy workloads (caches, serving user feeds, whatever) or write-heavy ones (monitoring, storing that sweet sweet tracking juice that then gets read once or twice while building the recommendation models), small or large blobs, latency requirements, HA/consistency requirements, how complicated queries are going to be, whether it supports secondary indices or not, etc.

michaeltlewis · 3y ago

That video sounds interesting, if you do manage to find it.

l2dy · 3y ago

Pegasus was open-sourced by Xiaomi, and is still used internally according to https://apachecon.com/acasia2022/sessions/ai-1125.html.

Source: https://www.zhihu.com/question/66719537/answer/245270169 (in Chinese)

xani_ · 3y ago

The Apache Foundation is a retirement home for projects, so it checks out.

studmuffin650 · 3y ago

I wouldn't say that's accurate. There are many super successful and active Apache projects still. To name a few:

Kafka, Cassandra, ZooKeeper, Spark, Tomcat, Superset, Storm, Lucene, Log4j2, Hadoop, etc. The list goes on, but I would safely say that a majority of the world's systems run on Apache projects, which are for the most part actively developed.

spockz · 3y ago · 3 in thread

Why yet another key value store?

sidcool · 3y ago

KV stores are a complex topic and research continues. Each new tool comes with its own set of trade-offs. It would help to go through the documentation.

jstummbillig · 3y ago

> Each new tool comes with its own set of trade-offs

Is the cognitive load this produces still worth the consideration? At what scale do you have to operate for the gains to make enough business sense to even consider it?

Sometimes I think "Well, surely at Google level" – and then I load up one of their interfaces, for example Google Ads, and I have to sit around for 10s before anything even shows up.

ramraj07 · 3y ago

You could do a million other interesting things academically or commercially before diving into yet another KV store though.

scary-size · 3y ago · 2 in thread

Is it me or are most of the docs only available in Chinese?

tmikaeld · 3y ago

It says it's being translated at the moment.

dsmmcken · 3y ago

Looks dead. English docs haven't been touched for 2 years.

https://github.com/apache/incubator-pegasus-website/tree/mas...

pknerd · 3y ago · 2 in thread

Another key-value store system. There are already RocksDB, LevelDB and many others!

sofixa · 3y ago

This one is distributed though, using RocksDB as the underlying storage.

xani_ · 3y ago

I think we have a few of those too.

endisneigh · 3y ago

I'd be curious to see the differences between this and FoundationDB.

hk1337 · 3y ago

This seems very interesting and my interest is piqued, but the documentation and web page don't do much to tell me what it is and how it's intended to be used. I know it's a key-value store and it's supposed to be fast, but that's it.

hasperdi · 3y ago

Anyone using this in production? If so, how do you find it, is it good?

truth_seeker · 3y ago

Why would I choose it over TiKV or Riak?
