- Flexible Deployment: Run it as a single-node in-memory KV cache, a larger-than-memory database or scale to a highly available, distributed transactional database with ease.
- High Performance: Achieves performance levels comparable to top in-memory databases like Redis and DragonflyDB, while significantly outperforming durable KV stores like KVRocks.
- Full ACID Transactions: Ensures complete transactional integrity, even in distributed environments.
- Independent Resource Scaling: Scale CPU, memory, storage, and logging resources independently to meet your needs.
We’d love to hear your thoughts and feedback!
We are a small team of developers, and EloqKV is still under heavy development. We would like to stay agile and iterate quickly on the product in line with our customers' feedback. That said, we are actively evaluating multiple paths to open-sourcing our technology, and we will certainly keep any suggestions and concerns in mind as we progress.
Having an open source project on GitHub does not mean you need to become less agile. First, there will not instantly be thousands of people sending you pull requests; second, even if there are, you have no obligation to take them in.
However, if you say "we can write crap code... and no one would know because they only see binaries", this is not really a way to gain much confidence.
In my opinion, these days just having a "binary you can use" is the worst way to go for something like a database/data store. If you are not sure about your open source plans, having something like a SaaS solution with a free tier, so users can experiment easily, could be the way to go.
Sales of open source databases are driven by cloud and security features. You can keep some security features as the paid part of an open-core model, along with cloud deployment.
Open source is about expanding the total potential market even if you only capture a portion of the value there.
If you think about it, databases are arguably a more critical piece of IT infrastructure than cybersecurity products. Yet, while cybersecurity companies are thriving with many company valuations exceeding $10 billion, database companies—especially those embracing open source—struggle to achieve similar commercial success. The few database companies that do reach high valuations are typically based on closed-source products, while those adopting open-source models often face significant challenges in becoming profitable.
It's also worth noting that many companies in the database space have recently changed their licenses around open source or source availability. If major players like MongoDB, ElasticSearch, and Redis are all making these tough decisions to build sustainable businesses, it might not be fair to simply blame corporate greed. Without adequate returns on investment, venture funding for database development could dry up, which would ultimately stifle innovation in this crucial sector.
However, if you're giving Redis access to different tenants, Lua is too dangerous.
I'd love to see a "real" transaction API for Redis.
1. No rollback on failure: If a command in EXEC fails, Redis can't roll back the transaction. EloqKV resolves this by starting real transactions on the TxServer node during EXEC. If any command fails, the changes are rolled back, maintaining atomicity. Additionally, EloqKV allows specifying isolation levels when using Redis transactions.
2. Lack of interactivity: This is where Lua shines. Users can embed business logic in Lua, functioning similarly to stored procedures in SQL.
3. No cross-shard transactions: Redis clusters can't redirect requests across shards, forcing clients to manage topology awareness. EloqKV addresses this with a fully distributed transactional design, eliminating these cross-shard complexities. For more on this, see our blog.
https://www.eloqdata.com/blog/2024/08/22/benchmark-cluster#w...
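To illustrate point 1, here is a toy model in plain Python (no Redis client involved). It contrasts Redis-style EXEC, where a command that fails at runtime is reported but the other queued commands still take effect, with an all-or-nothing EXEC. The function names are made up for the example, and the copy-then-publish trick is only a stand-in for rollback, not a description of how EloqKV implements it.

```python
import copy

def incr(store, key):
    """Toy INCR: fails at runtime on a non-integer value."""
    value = store.get(key, 0)
    if not isinstance(value, int):
        raise TypeError(f"value at {key!r} is not an integer")
    store[key] = value + 1

def exec_no_rollback(store, commands):
    """Redis-style EXEC: a failed command is recorded, the rest still run."""
    errors = []
    for func, key in commands:
        try:
            func(store, key)
        except TypeError as exc:
            errors.append(str(exc))
    return errors

def exec_with_rollback(store, commands):
    """All-or-nothing EXEC: work on a private copy, publish only on success."""
    working = copy.deepcopy(store)
    for func, key in commands:
        try:
            func(working, key)
        except TypeError as exc:
            return [str(exc)]  # the original store is left untouched
    store.clear()
    store.update(working)
    return []

queued = [(incr, "a"), (incr, "name"), (incr, "b")]

redis_like = {"a": 1, "name": "bob", "b": 1}
redis_errors = exec_no_rollback(redis_like, queued)

atomic = {"a": 1, "name": "bob", "b": 1}
atomic_errors = exec_with_rollback(atomic, queued)
```

After running this, `redis_like` has `a` and `b` incremented even though the middle command failed, while `atomic` is unchanged.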
As for your point on Lua's risks in multi-tenant environments, I completely agree. Lua lacks robust ACLs or resource limitations, and managing SHA keys is cumbersome. We're considering enhancements to address these issues in the future. Finally, we are indeed working on a "real" transaction API for Redis. Like SQL, users will be able to begin transactions, read keys, apply transformations, and generate new keys, with the ability to commit or abort. EloqKV will also support configurable isolation levels and transaction protocols (OCC/Locking).
Do you have any additional API preferences beyond "START/COMMIT/ROLLBACK" for Redis transactions?
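For discussion, here is a minimal sketch of what an interactive begin/read/write/commit/rollback flow with optimistic concurrency control could look like. Everything in it (`Store`, `Txn`, `ConflictError`, per-key version counters) is hypothetical and exists only for illustration; it is single-threaded, in-memory, and not the actual EloqKV API or protocol.

```python
class ConflictError(Exception):
    """Raised at commit when a key read by the transaction has changed."""

class Store:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def begin(self):
        return Txn(self)

class Txn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen at first read
        self.writes = {}         # key -> buffered new value

    def get(self, key):
        if key in self.writes:   # read your own buffered write
            return self.writes[key]
        version, value = self.store.data.get(key, (0, None))
        self.read_versions.setdefault(key, version)
        return value

    def set(self, key, value):
        self.writes[key] = value  # nothing is visible until commit

    def commit(self):
        # Validation: every key we read must still be at the version we saw.
        for key, seen in self.read_versions.items():
            current, _ = self.store.data.get(key, (0, None))
            if current != seen:
                raise ConflictError(key)
        # Write phase: install buffered writes and bump versions.
        for key, value in self.writes.items():
            version, _ = self.store.data.get(key, (0, None))
            self.store.data[key] = (version + 1, value)

    def rollback(self):
        self.writes.clear()
        self.read_versions.clear()

store = Store()
setup = store.begin()
setup.set("balance", 100)
setup.commit()

t1, t2 = store.begin(), store.begin()
b1, b2 = t1.get("balance"), t2.get("balance")
t1.set("balance", b1 - 30)
t1.commit()
t2.set("balance", b2 - 50)
try:
    t2.commit()
    conflict = False
except ConflictError:
    conflict = True
    t2.rollback()
```

Here `t2` is rejected at commit because `t1` changed `balance` after `t2` read it, which is the kind of business logic that otherwise has to live in a Lua script or a WATCH retry loop.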
What is the novel part? I read your "Introduction to Data Substrate" blog article and the architecture you are describing sounds like NuoDB from the early 2010s. The only difference is that NuoDB scales out the in-memory cache by adding more of what they call "Transaction Engine" nodes whereas you are scaling up the "TxMap" node?
See also Viktor Leis' CIDR 2023 paper with the Great Phil Bernstein:
If I remember correctly, NuoDB uses a shared cache with a cache coherence protocol, whereas EloqKV uses a shared-nothing (partitioned) cache. The former gives local reads but has to broadcast each write to all nodes. The latter needs no broadcast for writes, but a read may be remote. The tradeoff is evident, and we are actively exploring opportunities to strike a balance, e.g., using the shared cache mode for frequently read, rarely written data items.
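A back-of-the-envelope message count makes the tradeoff concrete. This is only a sketch under simplifying assumptions of my own, not measurements from either system: keys are spread uniformly across nodes, each remote operation or per-node write notification counts as one message, and replies are ignored.

```python
def shared_cache_messages(n_nodes, reads, writes):
    """Coherent shared cache: reads are local, each write goes to every other node."""
    return writes * (n_nodes - 1)

def partitioned_cache_messages(n_nodes, reads, writes):
    """Partitioned cache: any operation is remote with probability (n - 1) / n."""
    remote_fraction = (n_nodes - 1) / n_nodes
    return (reads + writes) * remote_fraction

# Read-heavy workload on 4 nodes: the shared cache sends fewer messages.
read_heavy = (shared_cache_messages(4, 900, 100),
              partitioned_cache_messages(4, 900, 100))

# Write-heavy workload on 4 nodes: the partitioned cache sends fewer messages.
write_heavy = (shared_cache_messages(4, 100, 900),
               partitioned_cache_messages(4, 100, 900))
```

Under these assumptions the read-heavy case costs 300 messages shared versus 750 partitioned, and the write-heavy case costs 2700 shared versus 750 partitioned, which is why a hybrid mode for read-mostly items looks attractive.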
We appreciate you pointing us to the CIDR paper. I had the pleasure of working with Phil for some time and fondly remember many discussions with Phil on various topics many years ago. To address your question, yes, we've been trying to solve the research challenges presented in the CIDR paper. The devil is in the details. We've developed numerous new algorithms and invested significant engineering effort into the design and implementation of our products. The benefits are as follows:
- Optimality: We believe we have an overall design that optimizes synchronous disk writes and network round-trips. For instance, when the design is reduced to a single node, its performance matches or exceeds that of single-node servers. As you might expect, a lot of innovation has gone into making distributed transactions as efficient as non-distributed ones, comparable to those in MySQL or PostgreSQL.
- Modularity: Our architecture allows us to easily replace the Parser/Compute layer and the Storage/Persistence layer with the best existing solutions. This means we can create new databases by leveraging parsers and compute engines from current database implementations to achieve API compatibility, as well as existing high-performance KV stores for the persistence layer. This lets us avoid reinventing the wheel and take advantage of decades of innovation in the database community.
- Scalability: The entire system operates without a single synchronization point, not even a global sequencer. We drew a lot of inspiration from Hekaton and your TicToc paper. All four types of resources (CPU, Memory, Storage, Logging) can be scaled independently, as we mentioned earlier. More importantly, they can scale dynamically to accommodate workload changes without service disruptions.
We look forward to sharing more technical details as we move out of stealth mode. I hope to continue this conversation with you in person in the near future.
It superficially sounds like a series of server processes fronting actual database servers, which sounds like another layer of partition vulnerability and points of failure. But I had similar high-level concerns about the complexity of FoundationDB, and people seem satisfied as to the validity of that architecture.
I fail to see how one would scale underlying resources if the persistence is done by various storage systems, and you'd be subject to limitations of those persistence engines. That sounds like a suspect claim, like the "Data Substrate" is a big distributed cache that scales in front of delayed persistence to an actual database. Again, sounds like oodles of opportunities for failure.
"Data substrate draws inspirations from the canonical design of single-node relational database management systems (RDBMS)." Look, I don't think a good distributed database starts from a CA system and bolts on partition tolerance. I get you can get far with big nodes, fast networks, and sharding but ... I mean, they talk about B+ trees but a foundational aspect of Cassandra and other distributed databases is that B+ trees don't scale past a certain point, because there exist trees so deep/tall that even modern hardware chokes on it, and updating the B+ tree with new data gets harder as it gets bigger.
As others have said, I'll leave it to Aphyr to possibly explain.
Regarding persistence, our Substrate component manages its own logging system while relying on external storage engines for checkpointing. This means that the system depends on the external storage only for performance and capacity. Our database does not depend on the external storage engine for consistency. For example, DynamoDB offers virtually unlimited capacity and performance (in terms of QPS) in the cloud, and our Data Substrate is agnostic to whether the data is stored as a B-Tree or an LSM tree.
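As a rough illustration of that split, here is a toy sketch in which the engine owns a redo log and treats the external store as a checkpoint target only. The class and method names are invented for the example, a Python dict and list stand in for the storage engine and the log service, and real concerns such as dirty tracking, log replication, and concurrent checkpoints are left out.

```python
class ToySubstrate:
    def __init__(self, external_storage, log):
        self.storage = external_storage  # stand-in for RocksDB, Cassandra, DynamoDB, ...
        self.log = log                   # redo log owned by the engine, assumed durable
        self.cache = {}                  # in-memory state, lost on a crash

    def put(self, key, value):
        self.log.append((key, value))    # a write is durable once it is logged
        self.cache[key] = value

    def checkpoint(self):
        self.storage.update(self.cache)  # flush current state to external storage
        self.log.clear()                 # logged entries are now covered by the checkpoint

    @classmethod
    def recover(cls, external_storage, log):
        """Rebuild state from the last checkpoint, then replay the log on top."""
        node = cls(external_storage, log)
        node.cache = dict(external_storage)
        for key, value in log:
            node.cache[key] = value
        return node

storage, log = {}, []
node = ToySubstrate(storage, log)
node.put("a", 1)
node.checkpoint()
node.put("a", 2)
node.put("b", 3)

# Simulated crash: the cache is gone, the checkpoint and the log survive.
recovered = ToySubstrate.recover(storage, log)
```

The external store still holds the stale checkpoint (`a = 1`), yet the recovered node sees the latest committed values, because correctness comes from the log rather than from the storage engine.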
We are currently preparing the software for Jepsen testing prior to its General Availability (GA) release. More technical details will be shared as we move forward, so please stay tuned.
I agree that introducing a middleware layer on top of databases adds more points of failure. EloqKV avoids this by integrating Compute, Logging, and Storage into a single system. In this setup, storage can be a database like Cassandra, but users will not access it directly; all requests go through EloqKV, which manages ACID transactions. EloqKV is responsible for handling crashes of the TxServer, LogServer, and Storage. You can think of the distributed storage engine as the disk in a traditional DBMS. Obviously its failure will affect the system, but no more than a hard disk failure would. In fact, in the cloud, all disks (e.g., EBS) are actually distributed storage systems.
This design is a rethinking of the Redis and MySQL combination, which suffers from similar issues. Both systems can fail independently, resulting in only eventual consistency. EloqKV aims to resolve this problem.
We chose RocksDB and Cassandra to showcase two different approaches. RocksDB offers efficient local storage, but if there's a disk hardware failure, the data can be lost. Cassandra, on the other hand, ensures high availability by replicating data, making it resilient to disk failures. We also support cloud storage options like DynamoDB and BigTable in our cloud offerings.
For more details, please check our scaling disks benchmark report.
https://www.eloqdata.com/blog/2024/08/25/benchmark-txlog#exp...
> For a distributed transaction with distributed storage and multiple nodes how does a transaction get coordinated?
EloqKV is a multi-writer system, similar to many distributed databases (FoundationDB, TiDB, CockroachDB), but we have a set of innovations in the transaction protocols. For example, the entire system operates without a single synchronization point, not even a global sequencer. We drew a lot of inspiration from Hekaton and the TicToc paper.
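To give a feel for how commit order can be decided without a global sequencer, here is a simplified, single-threaded sketch of the data-driven timestamp rule described in the TicToc paper. It is not EloqKV's protocol: locking of the write set is omitted, and the `Record`, `Abort`, and `commit` names are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Record:
    value: object
    wts: int  # logical time at which this version was written
    rts: int  # latest logical time at which this version is known to be valid

class Abort(Exception):
    pass

def commit(db, reads, writes):
    """reads: key -> (wts, rts) observed earlier; writes: key -> new value.
    Returns the commit timestamp, derived only from the records touched."""
    commit_ts = 0
    for key in writes:
        commit_ts = max(commit_ts, db[key].rts + 1)  # after every read of the old version
    for wts, _ in reads.values():
        commit_ts = max(commit_ts, wts)              # not before the versions we read
    for key, (wts, rts) in reads.items():
        if rts >= commit_ts:
            continue                                 # version already valid at commit_ts
        record = db[key]
        if record.wts != wts:
            raise Abort(key)                         # overwritten, cannot extend validity
        record.rts = max(record.rts, commit_ts)      # extend validity up to commit_ts
    for key, value in writes.items():
        db[key] = Record(value, commit_ts, commit_ts)
    return commit_ts

db = {"x": Record("x0", wts=2, rts=3), "y": Record("y0", wts=1, rts=2)}

a_reads = {"x": (db["x"].wts, db["x"].rts)}  # transaction A reads x
ts_b = commit(db, {}, {"x": "x1"})           # transaction B overwrites x and commits
ts_a = commit(db, a_reads, {"y": "y1"})      # A then writes y and commits
```

In this run B commits at logical time 4 and A, which commits later in wall-clock terms, is placed at logical time 3, before B. Both timestamps come from the records themselves, so no central counter is consulted.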