Can Redis be used as a primary database? [video]

(youtube.com)

43 points · node-bayarea · 4y ago · 47 comments

maxmcd · 4y ago · 6 in thread

My understanding is that Redis is fast because it writes and reads from memory. Postgres is slower because it ensures writes are persisted to disk before responding (among other reasons). So even if you use RDB and AOF with Redis, you can still readily lose data even after the database has confirmed it's been written. The database can confirm a write and then crash before that write has been persisted to the AOF and RDB.

This is why I thought you wouldn't want to run Redis as a primary database.
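
A minimal sketch of the failure mode described above, using a toy in-memory store rather than Redis itself. The class `ToyStore`, its `disk` dictionary (standing in for durable storage), and the helper `lost_after_crash` are all hypothetical names made up for illustration; real persistence is far more involved.

```python
class ToyStore:
    """Toy key-value store; `disk` stands in for durable storage (hypothetical model)."""

    def __init__(self, sync_on_write):
        self.sync_on_write = sync_on_write
        self.memory = {}   # what reads see
        self.pending = []  # acknowledged writes not yet persisted
        self.disk = {}     # what survives a crash

    def write(self, key, value):
        self.memory[key] = value
        self.pending.append((key, value))
        if self.sync_on_write:
            self.flush()   # persist before acknowledging
        return "OK"        # the acknowledgement the client sees

    def flush(self):
        # Stands in for a periodic background sync.
        for key, value in self.pending:
            self.disk[key] = value
        self.pending.clear()

    def crash_and_restart(self):
        # Memory and the unflushed buffer are lost; only `disk` remains.
        self.pending.clear()
        self.memory = dict(self.disk)


def lost_after_crash(sync_on_write):
    """Return the acknowledged keys that are missing after a simulated crash."""
    store = ToyStore(sync_on_write)
    store.write("a", 1)
    store.flush()            # the background sync happens to run here
    store.write("b", 2)      # acknowledged, then the process dies
    store.crash_and_restart()
    return [k for k in ("a", "b") if k not in store.memory]
```

With `sync_on_write=False` the second write is acknowledged and then lost; with `sync_on_write=True` every acknowledged write survives, at the cost of a sync on each call.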

anarazel · 4y ago

> My understanding is that Redis is fast because it writes and reads from memory. Postgres is slower because it ensures writes are persisted to disk before responding (among other reasons).

If you don't need transaction commits to be durable, you can turn that off in postgres, on a per-transaction/connection/user/database basis. E.g.

  BEGIN;
  SET LOCAL synchronous_commit = off;
  /* bunch of writes */
  COMMIT;

will, just for that one transaction, not wait for transaction commits to be flushed to disk. Can be very useful for not-that-important data...

bryceneal · 4y ago

Redis does have an option, `appendfsync always`, which ensures that every write is synced to disk, but as you might expect this drastically impacts performance.

avidiax · 4y ago

It still only syncs every 1 second, IIRC.

What you really want is 2 replicas in a shared-nothing environment, and use min-replicas-to-write.
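
A rough model of the check that `min-replicas-to-write` implies, sketched in plain Python rather than taken from Redis. `min-replicas-to-write` and `min-replicas-max-lag` are Redis configuration directives; the `Replica` class and `accepts_writes` function are hypothetical names for illustration. Note that replication itself stays asynchronous, so this kind of check narrows the window for lost writes rather than closing it.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    lag_seconds: float  # time since this replica last acknowledged the primary


def accepts_writes(replicas, min_replicas_to_write, min_replicas_max_lag):
    """Toy version of the rule: refuse writes unless enough replicas are recent."""
    healthy = [r for r in replicas if r.lag_seconds <= min_replicas_max_lag]
    return len(healthy) >= min_replicas_to_write
```

For example, with two replicas required and a maximum lag of 10 seconds, a primary whose second replica is 30 seconds behind would refuse writes in this model.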

winrid · 4y ago

You're also manipulating data structures directly with Redis. So there is very little abstraction compared to a DB.

clon · 4y ago

Presumably that means Redis is fundamentally single threaded?

compsciphd · 4y ago

follow https://github.com/RedisLabs/redisraft/

reidjs · 4y ago · 3 in thread

You can use an append only text file as a database, but that doesn’t mean it’s a great idea.

snicker7 · 4y ago

That's kind of what Kafka is. See also write-ahead logs.

xwdv · 4y ago

Why not? It would be great for inventory tracking systems, as you have an audit trail of all transactions.

peytoncasper · 4y ago

Then you have the question of how to query it efficiently.

You can’t scan 1 year of data every time, so now you need some type of external process that can either pre-compute those metrics (not flexible) or a compaction type process that makes it manageable to scan all that data.

Which is basically the trade off between something like stream processing or compaction in a database.
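
One way to picture the compaction idea is an append-only log plus a periodically refreshed snapshot, so that a query replays only the tail of the log. This is a sketch under assumptions: the `InventoryLog` class and its method names are hypothetical, and the inventory example is borrowed from the comment above.

```python
class InventoryLog:
    """Append-only event log with a snapshot so reads avoid a full scan."""

    def __init__(self):
        self.events = []        # append-only audit trail of (item, delta)
        self.snapshot = {}      # stock levels as of `snapshot_upto`
        self.snapshot_upto = 0  # number of events already folded in

    def append(self, item, delta):
        self.events.append((item, delta))

    def compact(self):
        # Fold every event not yet in the snapshot; the log is kept for auditing.
        for item, delta in self.events[self.snapshot_upto:]:
            self.snapshot[item] = self.snapshot.get(item, 0) + delta
        self.snapshot_upto = len(self.events)

    def stock(self, item):
        # Start from the snapshot and replay only the tail of the log.
        total = self.snapshot.get(item, 0)
        for logged_item, delta in self.events[self.snapshot_upto:]:
            if logged_item == item:
                total += delta
        return total
```

The trade-off is the one described above: the snapshot answers one pre-chosen question quickly, while any new kind of question still means scanning the log or building another derived view.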

arthurcolle · 4y ago · 3 in thread

Why would you want to?

I remember when I thought it was a great idea to use Elasticsearch as a primary database. The decision was a mistake.

catmanjan · 4y ago

What happened with Elasticsearch?

winrid · 4y ago

Not OP, but you'll run into issues with write throughput at some point, usually. Inverted indexes are nice, but writes are expensive.

joshxyz · 4y ago

Also not GP, but imo Elasticsearch is meant to serve one thing: provide search. It's not intended as your primary storage and source of truth, which is what OLTP databases like PostgreSQL and MySQL are designed for.

bennyelv · 4y ago · 2 in thread

At my company the decision to do this was taken a few years ago. It’s one we regret every day, and we are expending a large amount of effort digging ourselves out of it.

A query engine is a very powerful and useful layer of abstraction that you end up having to recreate in your application code for every scenario where you need some data. It’s complicated, it’s hard to get right and it really slows you down.

My recommendation would be: don’t do it.
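
To make the "recreating a query engine in application code" point concrete, here is a small sketch. It contrasts a key-value layout, where the application maintains its own secondary index, with a single SQL query. Python dictionaries stand in for the key-value store and the built-in `sqlite3` module stands in for a SQL database; the `users` schema and all function names are assumptions chosen for illustration.

```python
import sqlite3

# Key-value style: the application maintains its own secondary index.
users = {}          # user_id -> record
users_by_city = {}  # city -> set of user_ids (hand-built index)


def kv_put_user(user_id, name, city):
    old = users.get(user_id)
    if old is not None:
        # Keep the index in step when a user moves; forgetting this is a bug.
        users_by_city[old["city"]].discard(user_id)
    users[user_id] = {"name": name, "city": city}
    users_by_city.setdefault(city, set()).add(user_id)


def kv_names_in_city(city):
    return sorted(users[uid]["name"] for uid in users_by_city.get(city, ()))


# Query-engine style: the same question is one declarative statement.
def sql_names_in_city(conn, city):
    rows = conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]
```

Every new access pattern (by name, by signup date, by two fields at once) needs another hand-maintained index on the key-value side, whereas the SQL side needs only a different query.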

imacg · 4y ago

Could you elaborate on the reasoning for your company to initially use Redis vs SQL (or any other database)?

bennyelv · 4y ago

It was before my time, so I can't say with any certainty. They were having issues with database performance as the system scaled, and the decision was taken at the same time to switch from a monolith to a microservices architecture.

Possibly because it was easier to scale out than either a cloud-based or an on-prem database?

cbushko · 4y ago · 2 in thread

If you have a datacenter of your own and several VMs, redis would be fine as a persistent store. I wouldn't do it but it would be fine.

If you are in the cloud and have any hint of using kubernetes then DO NOT USE redis as a persistent store. The problem is that redis' master/slave and replication pattern goes against the load balancing and Service objects of kubernetes. Redis was created in a time when it expected physical nodes to be available 24/7 and is not designed for nodes to go away. It can handle it but it isn't designed for it. Two different things.

Redis as a single pod and a cache works great. I would never use redis as a DB. We have DBs specifically designed to be DBs.

dangerbird2 · 4y ago

Theoretically, redis-sentinel would be perfect for a clustered system like kubernetes. In practice, I've always found it a nightmare to deploy and use due to its hacked-on service discovery (which is redundant in kubernetes) and its lack of client compatibility with vanilla redis.

cbushko · 4y ago

I spent months trying to get some clustering solution of redis running in kubernetes. Every single solution was a huge hack, such as running HAProxy in front of it and having that point to the master.

The best solution I found was running keydb in a multi-master, multi-replica mode. All of the pods are masters, any pod can be written to and the keys will be copied over to the other masters/replicas. Performance is decent too.

gunapologist99 · 4y ago · 2 in thread

In production, it may come back and haunt you later. There is very little in the way of real backup and recovery; you should think of the RDB and AOF files as a faster way to pre-populate the cache upon startup/reboot/migration rather than as a real production database. I've seen it used as a prod db many times, and while it can be made to work, it's not really what it's designed for.

The impedance match between redis and most programming language data structures is just really perfect; Redis supports all of the structures (arrays, maps, etc) that you'd expect, and a few you wouldn't (bloom filters, for example).

Also, it has some really odd security choices and a general lack of focus on security. It didn't even have password support for the first few years -- anyone could connect to it and do whatever they wanted (and, in fact, you could even gain access to the OS!). It's also pretty hard to start up securely in the cloud: by default it binds to every interface instead of just localhost (or at least it did the last time I checked), although you can override this in the config. Just be careful about that, because this lack of emphasis on security seems to run through it.

Again, as a very fast and flexible cache that supports a million different datatypes and has real big-O performance guarantees, it is superb.

But these days, if you want a primary production database, you should just default to postgresql, unless you already have a solid reason to choose something else. If you don't know SQL, you should learn, but until you're really ready to, just use an object relational mapper (ORM) for your programming language and that will turn postgresql basically into MongoDB or similar, but with all the power of SQL behind it.

ilaksh · 4y ago

So you are saying that you have seen AOF fail multiple times to create a durable record, or fail in an adverse event to persist everything?

avidiax · 4y ago

Fsync ≠ bytes safe, unfortunately.

It usually means that the bytes are in the drive's cache, not that they're on disk. In theory, the disk can write out the cache if the power is cut.

Even then, the disk can fail.

The safe thing is a shared-nothing replica, but you need 2 of them to have availability. 3 if you are worried about bit-flips causing your 2 replicas to disagree.
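
For reference, this is roughly what "syncing a write" looks like from application code, sketched with Python's standard library. The function name `durable_append` and the file name are hypothetical. As the comment above points out, a successful `os.fsync` only means the operating system handed the data to the storage device; whether the drive has moved it out of its own cache is a separate question.

```python
import os
import tempfile


def durable_append(path, data: bytes):
    """Append bytes and ask the OS to push them to the storage device."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()             # Python's buffer -> OS page cache
        os.fsync(f.fileno())  # OS page cache -> storage device


# Usage: append one record to a scratch file.
path = os.path.join(tempfile.mkdtemp(), "append-only.log")
durable_append(path, b"SET k v\n")
```

Calling this on every write is the per-write cost discussed earlier in the thread; batching several writes per sync is cheaper but widens the window for loss.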

Zealotux · 4y ago · 2 in thread

I've been thinking about replacing my Mongo database with persistent Redis, but I'm torn on this. I don't know what to expect long-term. What about migrations? It feels like once I commit to a structure it'll be very difficult to alter it, but maybe I'm wrong. I'm developing a real-time application (think Figma), and still cannot choose a solution with confidence.

winrid · 4y ago

What issues are you running into with Mongo?

With Mongo you can still use Schemas, and migrations are pretty easy.

Zealotux · 4y ago

I don't have issues right now, but honestly: I'm still very ignorant of databases' abilities, so far my app works, and I have no idea how far I can push Mongo until it becomes an issue. I agree that Mongo seems much easier to manage than Redis for data structures, so I'll stick with it until it becomes an issue. Maybe I should just pay for some consulting on that one.

junon · 4y ago · 2 in thread

This was the case at ZEIT (now Vercel). Then we migrated away from that due to a slew of issues I can't recall at the moment.

We still used redis extensively but not as a persistence layer.

apexd123 · 4y ago

How is CosmosDB working for y'all?

junon · 4y ago

I'm not there anymore, I left a few years back. These are my opinions and I'm sure things have changed pretty drastically since.

Cosmos was expensive. I mean, really really expensive. Microsoft promised it would solve a lot of our problems and I'm not fully convinced it ever did.

The client was insanely bizarre, the protocol was very complex and convoluted and the library code was almost (or maybe it ultimately was) transferred to ZEIT's ownership since we were pretty much the only ones working on it at the time. Microsoft certainly wanted to have some agreement about that; to what end, I'm not sure.

I remember a lot of headaches. We were also entirely on our own with it, as it was pretty opaque to work with. There were very few public examples, and we had to contact Microsoft quite a lot for help, if memory serves.

ransom1538 · 4y ago · 2 in thread

"Can Redis be used as a primary database"

IMHO, no, unless you can ensure your data is smaller than the size of memory. Redis must fit all the data into memory. If you run out of memory, Redis doesn't have great options (besides buying more memory). In my mind a primary database handles the complexities/speed of pulling data from a disk, manipulates data in memory, and scales with more disk. Redis manipulates data in memory only.

Redis is rad for a specific group of problems.

avidiax · 4y ago

IMHO, yes, provided that your business degrades gracefully when data is lost, and the revenue per byte supports storage in RAM.

In that case, it is better than other databases, being extremely performant, and more importantly, being very easy to develop for.

If you need more space, you can use Redis cluster to extend horizontally.

compsciphd · 4y ago

https://docs.redislabs.com/latest/rs/concepts/memory-archite...

jbverschoor · 4y ago · 1 in thread

You could also just use postgresql as your key-value store, which seems a much saner approach.
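
A sketch of what a key-value store on top of a SQL table can look like. To stay runnable without a server it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, which is a substitution and not what the comment describes literally. The table name `kv`, its columns `k` and `v`, and the helper functions are hypothetical. The `INSERT ... ON CONFLICT ... DO UPDATE` upsert form is also accepted by PostgreSQL, though the parameter placeholder style depends on the client library.

```python
import sqlite3


def kv_open():
    # In-memory SQLite database standing in for a PostgreSQL connection.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")
    return conn


def kv_set(conn, key, value):
    # Upsert: insert the pair, or overwrite the value if the key exists.
    conn.execute(
        "INSERT INTO kv (k, v) VALUES (?, ?) "
        "ON CONFLICT (k) DO UPDATE SET v = excluded.v",
        (key, value),
    )
    conn.commit()


def kv_get(conn, key):
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None
```

The appeal of this approach is that the data can later be queried, joined, and migrated with ordinary SQL if a plain get/set interface stops being enough.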

avinassh · 4y ago

But what if you have low latency requirements? Redis is an order of magnitude faster than Postgres, because of disk IO.

ihucos · 4y ago

TLDW. But I have counter.dev running with redis as the primary database. It just works. Of course you need to consider it carefully. One of the advantages I saw is that every query on the database has a documented complexity. I don't need to hope for the database to run the query fast enough. It's more transparent.

halifaxbeard · 4y ago

Yes. Should you?

It depends, like everything.
