Show HN: An experimental distributed SQL database from scratch in Go

(github.com)

104 points · slicedbrandy · 6y ago · 14 comments

slicedbrandy (OP) · 6y ago

Hey HN,

Work is coming along on this project to build a distributed SQL database from scratch, mostly as a reference for newcomers to get an idea about the inner workings.

Looking for contributors who are interested in anything from parsers, disk paging, building out a REPL, defining an IR grammar, implementing consensus and more!

Anyone interested in contributing in these areas is more than welcome!

redis_mlc · 6y ago

It takes 5-10 years to write, test and productionize a database or filesystem. If you're planning to invest that kind of time and effort, some suggestions are:

* if you're writing a distributed database, a novel and valuable feature would be to consider the network partition case foremost. For example, design the database from the standpoint of a node being down for a month.

* do adequate logging so that an operator can understand what is going right or wrong

* how can terminated nodes be rebuilt automatically and efficiently?

* all configuration settings should be dynamic

Source: experienced DBA, worked with Cassandra, InfluxDB and most SQL RDBMSs

derision · 6y ago

> if you're writing a distributed database

Is this really a requirement for all distributed databases, or only for decentralized ones? If I'm using a distributed database and I control all the nodes, then when one of them goes down I don't plan on spinning it back up (assuming the data is replicated across other nodes). I guess what I'm saying is that if I need 20 nodes, I'm going to make sure I always have 20 nodes running.

zamalek · 6y ago

> design the database from the standpoint of a node being down for a month.

It uses Raft, so this should be handled by design. If you are referring to writes to said node (not that Raft would allow them), you are delving into the CAP theorem, and what you are suggesting is impossible (unless you don't care about consistency).
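To make the majority argument concrete, here is a minimal Go sketch of the quorum arithmetic Raft relies on. It is not taken from the project; the helper names `quorum` and `canCommit` are made up for illustration, and it only models the counting rule, not leader election or log replication.

```go
package main

import "fmt"

// quorum returns the smallest number of voting members that forms
// a strict majority in a cluster of n members.
// (Hypothetical helper, not part of the project.)
func quorum(n int) int {
	return n/2 + 1
}

// canCommit reports whether a cluster of n voting members can still
// commit writes while `down` of them are unreachable.
func canCommit(n, down int) bool {
	return n-down >= quorum(n)
}

func main() {
	cases := []struct{ n, down int }{{3, 1}, {5, 2}, {5, 3}}
	for _, c := range cases {
		fmt.Printf("nodes=%d down=%d canCommit=%v\n",
			c.n, c.down, canCommit(c.n, c.down))
	}
}
```

Under this rule a single node that has been away for a month is harmless as long as a majority stays up, and the isolated node itself can never accept a write because it cannot gather a majority on its own.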

bruth · 6y ago

If it is intended to be a reference project, I suggest writing the README sections first, highlighting the main areas and then linking to the relevant code to ease navigation.

You may or may not be aware, but Andy Pavlo records his courses and puts them on YouTube. His latest playlist covers the main database topics with the last five or so lectures covering distributed databases: https://www.youtube.com/playlist?list=PLSE8ODhjZXjbohkNBWQs_...

edit: ^suggesting Pavlo's work since he introduces database concepts very well, so it may be worth structuring the reference architecture in the same way.

kasey_junk · 6y ago

How are the goals of this project different from CockroachDB's?

ochredoke · 6y ago

I am embarking upon a similar project: I’m building a privacy-focused product similar to Google Photos and Flickr. My goal is not only to develop a usable product, but also to document the process in as much detail as possible in order to enable others to build similar things.

I feel strongly about spreading the skills required to build non-trivial products. I hope to enable others to deliver similar technical projects.

vchak1 · 6y ago

Have you seen Perkeep? https://perkeep.org/ Mainly built by Brad Fitzpatrick, in Go.

ochredoke · 6y ago

Unfortunately I have not. I should do better at researching prior art. This is something I’ve wanted to do for a while, but never managed to carve out enough time. It’s great that there’s an open source project that I can simply use.

bradleyankrom · 6y ago

That’s awesome, I look forward to your forthcoming Show HN.

oscargrouch · 6y ago

Looking at your code, it seems your B-tree is an in-memory implementation, correct?

I guess that's important information to put in the README, along with whether you plan to create a persistent one and how you would approach it (for example, whether you will try to replicate the SQLite B-tree here).

Also, I guess the easiest way to get a persistent B-tree is to take the one from that neat key-value store in Go that mimics the LMDB B-tree (I don't recall the name right now).

The algorithm behind the LMDB kind doesn't need journal files, so it will be reliable AND fast (is it inspired by Rodeh's B-trees?). I just don't know about the paging aspect, and whether it's as good as the one SQLite uses.
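For readers wondering why that design can skip the journal: as far as I understand it, LMDB-style trees are copy-on-write, so an update never overwrites a live page. It writes new copies of the pages on the path to the change and then switches the root. Below is a rough Go sketch of that path-copying idea. To keep it short it uses an in-memory binary search tree instead of a paged B-tree, and the `node`, `insert` and `get` names are invented for the example, so treat it as an illustration of the principle rather than how LMDB or this project does it.

```go
package main

import "fmt"

// node is an immutable search-tree node: once a tree version has
// been published, none of its nodes are modified again.
type node struct {
	key, val    int
	left, right *node
}

// insert returns the root of a NEW tree that contains key=val.
// Only the nodes on the path from the root to the key are copied;
// every other subtree is shared with the old version.
func insert(n *node, key, val int) *node {
	if n == nil {
		return &node{key: key, val: val}
	}
	cp := *n // copy the node instead of writing in place
	switch {
	case key < n.key:
		cp.left = insert(n.left, key, val)
	case key > n.key:
		cp.right = insert(n.right, key, val)
	default:
		cp.val = val
	}
	return &cp
}

// get looks up key in the tree version rooted at n.
func get(n *node, key int) (int, bool) {
	for n != nil {
		switch {
		case key < n.key:
			n = n.left
		case key > n.key:
			n = n.right
		default:
			return n.val, true
		}
	}
	return 0, false
}

func main() {
	var v1 *node
	for _, k := range []int{5, 2, 8} {
		v1 = insert(v1, k, k*10)
	}
	v2 := insert(v1, 2, 99) // new version; v1 is left untouched

	old, _ := get(v1, 2)
	cur, _ := get(v2, 2)
	fmt.Println(old, cur)             // old version still reads 20, new reads 99
	fmt.Println(v1.right == v2.right) // untouched subtree is shared, not copied
}
```

Because the old root stays valid until the new one is published, a crash in the middle of an update simply leaves readers on the previous version. On disk, the same effect comes from writing the new pages first and updating the root pointer last.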

evolveyourmind · 6y ago

I’ve also noticed that. As far as I can tell, there isn’t actually any persistent operation, even for the tables.
