Better HN

0 points · locknitpicker · 2mo ago · 0 comments

> Sqlite smokes postgres on the same machine even with domain sockets [1].

SQLite on the same machine is akin to calling fwrite. That's fine. This is also a system constraint as it forces a one-database-per-instance design, with no data shared across nodes. This is fine if you're putting together a site for your neighborhood's mom and pop shop, but once you need to handle a request baseline beyond a few hundred TPS and you need to serve traffic beyond your local region, you have no alternative but to run more than one instance of your service in parallel. You can continue to shoehorn your one-database-per-service pattern onto the design, but you're now compelled to find "clever" strategies to sync state across nodes.

Those who know better than to do "clever" simply slap on a Postgres node and call it a day.

18 comments · 4 top-level

andersmurphy · 2mo ago · 8 in thread

> SQLite on the same machine is akin to calling fwrite.

Actually 35% faster than fwrite [1].

> This is also a system constraint as it forces a one-database-per-instance design

You can scale incredibly far on a single node and have much better uptime than GitHub or Anthropic. At this rate maybe even AWS/Cloudflare.

> you need to serve traffic beyond your local region

Postgres still has a single node that can write, so most of the time you end up region sharding anyway. Sharding SQLite is straightforward.
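A minimal sketch of what region sharding with SQLite could look like, assuming one database file per region. The region names, the `users` table, and the helper functions are all hypothetical and only illustrate the idea; they are not taken from the commenter's setup.

```python
import sqlite3
from pathlib import Path

# Hypothetical region names; a real deployment would choose its own.
REGIONS = ("us", "eu", "apac")


def open_shard(data_dir: Path, region: str) -> sqlite3.Connection:
    """Open (and initialise if needed) the SQLite file holding one region's data."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    conn = sqlite3.connect(data_dir / f"{region}.db")
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def add_user(data_dir: Path, region: str, name: str) -> int:
    """Write a user to the shard for their region and return the new row id."""
    conn = open_shard(data_dir, region)
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
            return cur.lastrowid
    finally:
        conn.close()


def count_users(data_dir: Path, region: str) -> int:
    """Count the users stored in one region's shard."""
    conn = open_shard(data_dir, region)
    try:
        return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    finally:
        conn.close()
```

Each region's writes go to its own file, so a region can live on its own box. Queries that span regions would need extra work that this sketch does not cover.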

> This is fine if you're putting together a site for your neighborhood's mom and pop shop, but once you need to handle a request baseline beyond a few hundred TPS

It's actually pretty good for running a real-time multiplayer app with a billion datapoints on a $5 VPS [2]. There's nothing clever going on here: all the state is on the server and the backend is fast.

> but you're now compelled to find "clever" strategies to sync state across nodes.

That's the neat part: you don't. For most things that are not uplink-limited (being a CDN, Netflix, Dropbox), a single node is all you need.

- [1] https://sqlite.org/fasterthanfs.html

- [2] https://checkboxes.andersmurphy.com

shimman · 2mo ago

Maybe an "out there" question, but are there any tech books you'd recommend that can teach an average dev how to build highly performant software with minimal systems?

I feel like the advice from people with your experience is worth way way way way more than what you'd hear from big tech. Like what you said yourself, big tech tends to recommend extremely complicated systems that only seem worth maintaining if you have a trillion dollar monopoly behind it.

andersmurphy · 2mo ago

Not specific books per se, though I'd advise starting with some constraints, as that really helps you focus.

Your reading/learning material can spin out of those constraints.

So for me my recent constraints were:

1. Multiplayer/collaborative web apps built by small teams.

2. Single box.

3. I like writing lisp.

So single box pushes me towards a faster language and something that's easy to deploy. Go would be the natural choice here, but I want a Lisp, so Clojure is probably the best option (it helps that I already know it). The JVM is fast enough and has a pretty good deployment story. Multiplayer web apps pushed me to explore distributed state vs. streaming with centralised state. This became a whole journey, which ended with Datastar [1]. Thing is, immediate-mode streaming HTML needs your database queries to be fast, and that's how I ended up on SQLite (I was already a fan and had used it in production before), but the constraints of streaming HTML forced me to revisit it in anger.

Your constraints could be completely different. They could be:

1. Fast to market.

2. Minimise risk.

3. Mobile + Web

4. Try something new.

Fast to market might mean you go with something like Rails/Django. Minimise risk might mean you go with Rails because you have a load of experience with it. Mobile + web means you read up on Hotwire. Try something new might mean you push more logic into stored procedures and SQL queries so you can get the most out of Postgres and make your Rails app faster, so you read The Art of PostgreSQL [2] (great book). Or maybe you try hosting Rails on a VPS and set up and manage your own Postgres instance.

A few companies back mine were:

1. JVM but with a more ruby/rails like development experience.

2. Mobile but not separate iOS/Android projects.

3. Avoid the pain of app store releases.

4. You can't innovate everywhere.

That meant Clojure, React Native, and minimal clients with as much driven from the backend as possible, sticking to Postgres and Heroku because it's what we knew and it worked well enough.

- [1] https://data-star.dev

- [2] https://theartofpostgresql.com

There's no right answer. Hope that's helpful.

wookmaster · 2mo ago

How do you manage HA?

andersmurphy · 2mo ago

Backups: Litestream gives you streaming replication to the second.

Deployment: Caddy holds incoming connections open while your app drains the current request queue and restarts. This is all sub-second and imperceptible. You can do fancier things than this with two versions of the app running on the same box if that's your thing. In my case I can also hot patch the running app, as it's the JVM.

If the server's hard drive fails, etc., you have a few options:

1. Spin up a new server/VPS and restore the backup with Litestream (the application does this automatically on start).

2. If your data is truly colossal, have a warm backup VPS with a snapshot of the data so Litestream has to stream less data.

It's pretty easy to have three to four nines of availability this way (which is more than GitHub, Anthropic, etc.).
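To put the availability figure in concrete terms, here is a small worked calculation of the yearly downtime budget implied by a given number of nines. The function name is made up for illustration, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(nines: int) -> float:
    """Minutes of downtime per year allowed by an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 3 nines -> 0.001
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4):
    print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes per year")
```

Three nines leaves roughly 8.76 hours of downtime per year, and four nines roughly 52.6 minutes. A restore-from-backup recovery therefore has to fit inside that window, along with every other outage that year.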

rovr138 · 2mo ago

No offense, but you wait, like everyone has been doing for years on the internet and still does.

- When AWS/GCP goes down, how do most handle HA?

- When a database server goes down, how do most handle HA?

- When Cloudflare goes down, how do most handle HA?

The downtime here means the server crashed, routing failed, or there's some other issue with the host. You wait.

You might run Pingdom or something similar to alert you.

locknitpicker (OP) · 2mo ago

> You can scale incredibly far on a single node

Nonsense. You can't outrun physics. The latency across the Atlantic is already ~100ms, and from the US to Asia Pacific can be ~300ms. If you are interested in performance and you need to shave off ~200ms in latency, you deploy an instance closer to your users. It makes absolutely no sense to frame the rationale around performance if your systems architecture imposes a massive performance penalty in networking just to shave a couple of ms in roundtrips to a data store. Absurd.
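As a rough sanity check on the physics argument, the sketch below computes a lower bound on round-trip time over optical fiber. The refractive index (about 1.47) and the New York to London great-circle distance (about 5,570 km) are assumed approximate values, not figures from the thread, and real routes are longer than the great circle.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47  # assumed typical value for optical fiber


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time (ms) over fiber for a one-way distance in km."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return 2 * distance_km / speed_in_fiber * 1000


# Assumed approximate great-circle distance, New York to London.
NYC_LONDON_KM = 5_570

print(f"NYC-London floor: {min_rtt_ms(NYC_LONDON_KM):.1f} ms")
```

Under these assumptions the floor is roughly 55 ms. Routing detours, queuing, and protocol handshakes add to that, so no amount of server-side tuning can bring a transatlantic round trip below it.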

klooney · 2mo ago

You need regional state, or you're still backhauling to the db with all the lag.

andersmurphy · 2mo ago

That only solves read latency, not write latency, unless you don't care about consistency.

tl · 2mo ago · 3 in thread

https://antonz.org/sqlite-is-not-a-toy-database/ — 240K inserts per second on a single machine in 2021. The problem you describe is real, but the TPS ceiling is wrong by three orders of magnitude on modern hardware.
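A minimal sketch for measuring insert throughput on your own hardware, assuming the common approach of batching many rows into one transaction with WAL mode enabled. The `events` table and the `bulk_insert` helper are hypothetical; this is not a reproduction of the linked benchmark, and the resulting number depends entirely on the machine and settings.

```python
import sqlite3
import time


def bulk_insert(path: str, n_rows: int) -> float:
    """Insert n_rows in a single transaction and return the rows-per-second rate."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    rows = ((f"event-{i}",) for i in range(n_rows))
    start = time.perf_counter()
    with conn:  # one transaction for the whole batch, committed on exit
        conn.executemany("INSERT INTO events (payload) VALUES (?)", rows)
    elapsed = time.perf_counter() - start
    conn.close()
    return n_rows / elapsed
```

Calling `bulk_insert("bench.db", 1_000_000)` gives a rough rows-per-second figure. Committing each row in its own transaction instead would be far slower, since every commit has to reach durable storage.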

pdhborges · 2mo ago

Do you know why it is a toy? Because in a real prod environment, after inserting 240k rows per second for a while, you have to deal with the fact that schema evolution is required. Good luck migrating those huge tables with SQLite's ALTER TABLE implementation.
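For context on the migration concern: SQLite's ALTER TABLE covers renames and adding or dropping columns, while other changes (such as adding a NOT NULL constraint) are typically done by rebuilding the table. Below is a simplified sketch of that rebuild pattern, assuming a hypothetical `users` table with no indexes, triggers, or foreign keys to carry over.

```python
import sqlite3


def rebuild_with_not_null_email(conn: sqlite3.Connection) -> None:
    """Rebuild `users` so that `email` becomes NOT NULL (NULLs become '')."""
    try:
        conn.executescript(
            """
            BEGIN;
            CREATE TABLE users_new (
                id INTEGER PRIMARY KEY,
                email TEXT NOT NULL
            );
            -- Copies every row, so the cost grows with table size.
            INSERT INTO users_new (id, email)
                SELECT id, COALESCE(email, '') FROM users;
            DROP TABLE users;
            ALTER TABLE users_new RENAME TO users;
            COMMIT;
            """
        )
    except sqlite3.Error:
        conn.rollback()  # leave the original table untouched on failure
        raise
```

The copy step is what makes this painful on very large tables: every row is rewritten, and writers are blocked while the transaction runs.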

devmor · 2mo ago

Try doing that on a “real” DB with hundreds of millions of rows too. Anything more than adding a column is a massive risk, especially once you’ve started sharding.

shimman · 2mo ago

This doesn't seem like a toy but you know... realizing different systems will have different constraints.

Not everyone needs monopolistic tech to do their work. There are probably fewer than 10,000 companies on earth that truly need to write 240k rows/second. For everyone else, we can focus on better things.

rpdillon · 2mo ago · 3 in thread

I wonder what percentage of services run on the Internet exceed a few hundred transactions per second.

icedchai · 2mo ago

I’ve seen multimillion-dollar “enterprise” projects get nowhere close to that. Of course, they all run on scalable, cloud-native infrastructure costing at least a few grand a month.

not_kurt_godel · 2mo ago

> a few grand a month.

A negligible cost for a successful tech business that also works when your requirements exceed the capabilities of a single VPS.

egwor · 2mo ago

I think the better question to ask is what services peak at a few hundred transactions per second?

darkwater · 2mo ago

I mean, your "This is fine for" is almost literally the whole point of TFA, that you can go a long way, MRR-wise, with a simpler architecture.
