Why You Shouldn't Use SQLite

(hendrik-erz.de)

7 points · halamadrid · 1y ago · 26 comments


wolverine876 · 1y ago

Perhaps this author shouldn't publish their advice about databases. I fully accept that people don't know everything, and I fully support them exploring new things and falling on their faces (if you don't, you aren't really trying). But I object to them offering advice at the same time. The following is from the follow-up article:

https://www.hendrik-erz.de/post/should-you-use-sqlite

> The first argument [in the prior article] that is completely bogus is about the speed: I argued that, if you’re not careful, the access times of the database will be actually slower than to simply use the file system.

> Now, obviously there is a large flaw in this argument if you know anything about databases. Specifically, I never thought about the option to just create additional indices for columns I frequently addressed.

k8svet · 1y ago

Not sure why this got posted when the follow-up claws back much of it and leaves a pretty weak conclusion: https://www.hendrik-erz.de/post/should-you-use-sqlite

cchance · 1y ago

Ya was gonna say the first thing i saw was basically a "correction" post and then you read that and hit

"The actual reason why I think you shouldn’t use SQLite is one of time constraints: Implementing a layer of SQLite will probably take you more time than simply to reduce the amount of files. "

Like no i'm sorry implementing a SQLite layer, vs implementing a filesystem layer is the same time or faster. I mean seriously he thinks trying to negotiate how to stop having 600,000 files is easier than just... ummm... using standard sql?

Sebb767 · 1y ago

> Do you know how large the biggest hard disks or SSDs are nowadays? [...] no single disk will have 281 Terabytes of space. And that is what you need: Since SQLite files are single, continuous files on your file system, they have to be stored on one, physical hard drive. You could theoretically split them up, but then they wouldn’t work anymore until you pieced them together again.

> So the practical limit of the size of an SQLite file is much lower than the theoretical limit.

I agree with the conclusion, but the argument is horrible. 281 TB is very much doable on a single file system (not a single hard drive, which is entirely irrelevant). You can actually build a rather cheap consumer system that will do that. It won't be good, but the problem is not the continuous allocation of space.
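For reference, the 281 TB figure is just arithmetic on two limits. A quick sketch, assuming the defaults published in SQLite's limits documentation (maximum page size of 65536 bytes and maximum page count of 4294967294); the constant names here are mine, not SQLite's:

```python
# Assumed limits, taken from SQLite's documented maximums.
MAX_PAGE_SIZE = 65536        # bytes per page (largest allowed page size)
MAX_PAGE_COUNT = 4294967294  # 2**32 - 2 pages per database file

# Theoretical maximum size of a single database file, in bytes.
max_bytes = MAX_PAGE_SIZE * MAX_PAGE_COUNT

# Same figure in decimal terabytes, which is where "281 TB" comes from.
max_terabytes = max_bytes / 10**12

print(max_bytes, max_terabytes)
```

Whether a file system can hold a file that large is a separate question from whether any single disk can.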

out_of_protocol · 1y ago

SQLite works great when used correctly. E.g., if you split metadata and blobs into different tables, it'll be super fast even without any indexes. Real-life example: the Pack archive format (https://news.ycombinator.com/item?id=39793805), which is a magic byte to distinguish it as a separate format + an SQLite database + zstd compression for data chunks. Inside there are no indexes and three tables: file info (like name) with proper parent_id logic, data chunks, and the relationship between these two.
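A rough sketch of that three-table layout, to make the idea concrete. This is not Pack's actual schema: the table and column names (`items`, `chunks`, `item_chunks`, `seq`) and both helper functions are hypothetical, and `zlib` stands in for zstd so the example runs on the standard library alone.

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Small metadata rows: listing names never touches the blobs.
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES items(id),
        name      TEXT NOT NULL
    );
    -- Compressed data chunks, kept apart from the metadata.
    CREATE TABLE chunks (
        id   INTEGER PRIMARY KEY,
        data BLOB NOT NULL
    );
    -- Which chunks belong to which item, and in what order.
    CREATE TABLE item_chunks (
        item_id  INTEGER NOT NULL REFERENCES items(id),
        chunk_id INTEGER NOT NULL REFERENCES chunks(id),
        seq      INTEGER NOT NULL
    );
""")

def add_file(name, payload, parent_id=None, chunk_size=4):
    """Store metadata in items and the payload as compressed chunks."""
    item_id = conn.execute(
        "INSERT INTO items (parent_id, name) VALUES (?, ?)", (parent_id, name)
    ).lastrowid
    for seq, start in enumerate(range(0, len(payload), chunk_size)):
        blob = zlib.compress(payload[start:start + chunk_size])
        chunk_id = conn.execute(
            "INSERT INTO chunks (data) VALUES (?)", (blob,)
        ).lastrowid
        conn.execute(
            "INSERT INTO item_chunks VALUES (?, ?, ?)", (item_id, chunk_id, seq)
        )
    return item_id

def read_file(item_id):
    """Reassemble a payload from its chunks in sequence order."""
    rows = conn.execute(
        "SELECT c.data FROM item_chunks ic "
        "JOIN chunks c ON c.id = ic.chunk_id "
        "WHERE ic.item_id = ? ORDER BY ic.seq",
        (item_id,),
    )
    return b"".join(zlib.decompress(row[0]) for row in rows)

root_id = add_file("docs", b"")
file_id = add_file("readme.txt", b"hello sqlite", parent_id=root_id)
names = [row[0] for row in conn.execute("SELECT name FROM items ORDER BY id")]
restored = read_file(file_id)
```

The `INTEGER PRIMARY KEY` columns are aliases for SQLite's rowid, so lookups by id need no separately created index.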

adolph · 1y ago

The article is from 2021; an update to it was published in 2022:

In this article I explain where and why I was wrong, and share the real reasons why I think we shouldn't use SQLite for research: A lack of skills and time.

https://www.hendrik-erz.de/post/should-you-use-sqlite

throwaway2037 · 1y ago

Thank you for sharing the follow-up article. This part made me chuckle:

    > I’m not a computer scientist

Oh, so now you tell us.

rokkitmensch · 1y ago

And this is why you keep the academics away from industrial systems. Dentist's curse: because they're so knowledgeable about their own area, they un-self-critically transfer that confidence clean outside of their own domain, and then having a nice engineering conversation becomes impossible as their ego becomes involved in the opinion.

rokkitmensch · 1y ago

Upon consideration, I wonder if this is an artifact of the incentives that drive academics towards a career rooted in /being right/. Contrast that with my industrial experience of working in empirical contexts and actively invalidating tiny hypotheses.

endisneigh · 1y ago

SQLite is one of those things that I hear people on here advocating a lot but have never personally encountered anyone using outside of embedded iOS/Android work. I’m talking businesses with 8 or 9 digit revenue, not side projects with 1000 users. I’m talking about teams in the hundreds of developers, not single person projects. I’m talking many services as dependencies, not just one. I’m talking extensive and robust redundancy, monitoring and observability. The list goes on.

There’s a reason for that.

- single writer even with WAL

- missing plenty of alter table functions

Those alone discount it for serious work. Yes, there are workarounds, but why workaround when you can work with Postgres or others that don’t require all of the hassle?

What’s amusing as well is the zealotry around it. People getting worked up about pointing out obvious flaws. Sad. When did tech become religion?

FWIW SQLite is great for embedded applications, and it’s where it belongs.
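For anyone who wants to see the single-writer point from the list above in practice, here is a minimal sketch using Python's standard `sqlite3` module. The file name, table, and variable names are made up for illustration, and `timeout=0` is set only so the second writer fails immediately instead of waiting for the lock.

```python
import os
import sqlite3
import tempfile

# WAL needs a real file, so use a throwaway path (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None means we control transactions explicitly.
writer_a = sqlite3.connect(path, isolation_level=None)
writer_a.execute("PRAGMA journal_mode=WAL")
writer_a.execute("CREATE TABLE t (x INTEGER)")
writer_a.execute("INSERT INTO t VALUES (1)")

# timeout=0: do not wait for locks, fail right away.
writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)

# First connection takes the write lock and holds it.
writer_a.execute("BEGIN IMMEDIATE")
writer_a.execute("INSERT INTO t VALUES (2)")

# A second writer cannot start a write transaction at the same time.
try:
    writer_b.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError:
    second_writer_blocked = True

# Reads are not blocked in WAL mode; they see the last committed state.
rows_seen_by_reader = writer_b.execute("SELECT count(*) FROM t").fetchone()[0]

# Once the first writer commits, the second one can proceed.
writer_a.execute("COMMIT")
writer_b.execute("BEGIN IMMEDIATE")
writer_b.execute("INSERT INTO t VALUES (3)")
writer_b.execute("COMMIT")
final_count = writer_a.execute("SELECT count(*) FROM t").fetchone()[0]

writer_a.close()
writer_b.close()
```

In other words, writers are serialized rather than rejected outright; with a non-zero busy timeout the second writer would simply wait its turn.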

simonw · 1y ago

The reason is you're not talking to the right people? It's not just talk, loads of us are using it for all sorts of interesting things.

ewalk153 · 1y ago

Thank you for making SQLite such a delightful way to share research through datasette.

adolph · 1y ago

Thank you for sharing your projects with the world. I'll leave this here as a hint to the parent commenter:

https://simonwillison.net/tags/dogsheep/

xyx0826 · 1y ago

> … anyone using other than embedded iOS/Android work. I’m talking businesses with 8 or 9 digit revenue…

The use case and revenue don’t have to be contradictory, I guess? What do you mean by “serious work”? iOS and macOS and their bundled apps use SQLite in a lot of places and Apple generated, idk, at least 9 digits in revenue last year. It sounds like you expect SQLite to excel in all use cases including high concurrency/resiliency web services where traditional heavyweight DBMSes like MySQL and PostgreSQL typically stand out, but that’s far from the truth. The authors of SQLite clearly carve out when you should use it (https://www.sqlite.org/whentouse.html) and it’s clearly succeeding in what it’s good at.

With that said, lots of big firms successfully use SQLite in desktop and web offerings as well: https://www.sqlite.org/famous.html

simonw · 1y ago

My sqlite-utils Python library and CLI tool includes a fix for the lack of advanced alter table:

- https://sqlite-utils.datasette.io/en/stable/cli.html#transfo...
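For context, the usual manual workaround for a schema change that `ALTER TABLE` can't express is to rebuild the table inside a transaction. The sketch below uses only the standard `sqlite3` module; it is not sqlite-utils code, and the table and column names are invented for the example. Real tables with indexes, triggers, or foreign keys need extra care that is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age TEXT)")
conn.execute("INSERT INTO users (name, age) VALUES ('ada', '36')")

# Goal: change age from TEXT to INTEGER and make name NOT NULL.
conn.execute("BEGIN")
# 1. Create a new table with the desired schema.
conn.execute(
    "CREATE TABLE users_new ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)
# 2. Copy the data across, converting as needed.
conn.execute(
    "INSERT INTO users_new (id, name, age) "
    "SELECT id, name, CAST(age AS INTEGER) FROM users"
)
# 3. Drop the old table and move the new one into its place.
conn.execute("DROP TABLE users")
conn.execute("ALTER TABLE users_new RENAME TO users")
conn.execute("COMMIT")

row = conn.execute("SELECT name, age, typeof(age) FROM users").fetchone()
```

Because the whole rebuild runs in one transaction, a failure partway through leaves the original table untouched.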

cchance · 1y ago

Lots of people use sqlite or derivatives of it lol. If they didn't, we wouldn't have stuff like pocketbase, rqlite and other derivatives that build on top of sqlite.

The thing with sqlite is that no one sees a need to "advertise" that they are using sqlite... it just works, nothing fancy (there are fancy things built on sqlite like rqlite, dqlite, litestream etc.), but for the base sqlite at the end of the day it's a quick file-based db that just handles itself well.

hiyer · 1y ago

AWS uses SQLite to manage EBS volume metadata. I remember reading about this some time back but can't seem to find the article now.

throwaway2037 · 1y ago

First this:

    > 8 or 9 digit revenue

Ok, HN loves the "digits" stuff. To be clear, "9 digits" can mean: 100,000,000 all the way up to one billion minus one. BIG difference. Let's assume 100,000,000 for now. Also: No currency mentioned. Indonesian rupiah? Euro? Japanese yen? It makes a big difference.

    > I’m talking about teams in the hundreds of developers

What project in the 2020s needs "hundreds of developers" (again: that implies 200+) and only generates a maximum of 100M revenue? This sounds like an awful business: it costs 200M to build, but only makes 100M revenue.

teraflop · 1y ago

This article makes two main arguments, both of which are (sorry to be blunt) just plain dumb.

The first point is that the claim that "SQLite can support up to 281TB in a single database" is wrong, because in practice you can't get a single disk that big, and therefore SQLite is a bad choice for storing 16GB of data.

The second point is that without indexing, retrieving individual data items is very inefficient. Therefore a big distributed MySQL cluster (which supports indexing) is better than a single SQLite database (which also supports indexing).

Most of the rest of the text only serves to beat around the bush and distract from how nonsensical the core arguments are.

throwaway2037 · 1y ago

I like your retort. It is well written.

One nitpick: Using RAID, can you construct a massive single disk mount from JBOD? It seems possible in 2024 to create a 300TB continuous mount. However, your point stands: For most real world scenarios, anything larger than 10TB is probably unreasonably large for a single-file SQLite DB.

    > Most of the rest of the text only serves to beat around the bush

Thank you for saying that out loud. This person goes on and on. It is like the YouTube talking head videos that are 45 mins long but could easily be cut to 10-15 mins!

adolph · 1y ago

This podcast with SQLite's Richard Hipp was fascinating as I didn't know the backstory and deeper history:

https://corecursive.com/066-sqlite-with-richard-hipp/

st3fan · 1y ago

Haters gonna hate.

PreInternet01 · 1y ago

Horrific confession time: I have several .sqlite files that exceed 1.5TB in size and consist of a single table with a single field called JSON (guess what it contains!).

Early days, I just used `JSON LIKE '%json-substring-match%'` for queries, and that did get a bit slow after a while, but mostly the syntax was just really gross when doing aggregations. Fortunately, these days, you can do:

    CREATE INDEX IDX_Foo_Bar ON Foo (json_extract(JSON, '$.Bar'));

...which makes subsequent SELECT queries on that function nice and quick again.
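A small, self-contained way to check that such an expression index actually gets used, assuming a Python build whose SQLite includes the JSON functions (most do). The table and index names follow the statement above; the sample rows and the `Baz` key are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (JSON TEXT)")
conn.executemany(
    "INSERT INTO Foo (JSON) VALUES (?)",
    [(json.dumps({"Bar": i, "Baz": "x"}),) for i in range(1000)],
)

# Index on an expression: the extracted value, not the raw JSON text.
conn.execute("CREATE INDEX IDX_Foo_Bar ON Foo (json_extract(JSON, '$.Bar'))")

# The WHERE clause must use the same expression as the index definition.
query = "SELECT JSON FROM Foo WHERE json_extract(JSON, '$.Bar') = ?"

# The last column of EXPLAIN QUERY PLAN is the human-readable detail.
plan = " ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (42,))
)
match = json.loads(conn.execute(query, (42,)).fetchone()[0])

print(plan)
```

The catch is that the query has to repeat the indexed expression exactly; a differently written path or function call falls back to a full scan.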

This has been going on since (checks create time on one .sqlite file) July 2018, with pretty much zero perf or downtime issues.

wruza · 1y ago

You also shouldn’t use sticky headers that take a quarter of a mobile screen and do nothing apart from screaming h1 in your face. Couldn’t finish either article because of that.

danirod · 1y ago

Didn't even try to run `create index` or to learn how to use a database system first, let alone design a schema. Just went straight to "nope, this is why SQLite is bad".

I don't think querying 8 GB of data without an index is going to be efficient in MySQL either.

(Disclaimer: SQLite fan here, but I read this article with close attention because I'm always interested in knowing SQLite pain points. The conclusion was a slap of just nothing)

Reason077 · 1y ago

TLDR: Guy notices that queries on unindexed columns are slow on a relatively large (8GB) SQLite database. Rather than fix his database design by adding additional indices, guy concludes that SQLite is bad and you shouldn't use SQLite.
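To illustrate the fix in question, here is a minimal sketch that compares the query plan before and after adding an index. The `notes` table, its columns, and the sample data are invented for the example and have nothing to do with the article's actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO notes (author, body) VALUES (?, ?)",
    [(f"user{i % 100}", "some text") for i in range(10000)],
)

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT body FROM notes WHERE author = 'user7'"

# Without an index on author, SQLite has to scan the whole table.
before = plan(query)

conn.execute("CREATE INDEX idx_notes_author ON notes (author)")

# With the index, the same query becomes an index search.
after = plan(query)

print(before)
print(after)
```

The first plan reports a scan of the table and the second a search using `idx_notes_author`, which is the difference the slow queries were missing.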
