https://www.hendrik-erz.de/post/should-you-use-sqlite
The first argument [in the prior article] that is completely bogus is about the speed: I argued that, if you’re not careful, the access times of the database will be actually slower than to simply use the file system.
Now, obviously there is a large flaw in this argument if you know anything about databases. Specifically, I never thought about the option to just create additional indices for columns I frequently addressed.
!
"The actual reason why I think you shouldn’t use SQLite is one of time constraints: Implementing a layer of SQLite will probably take you more time than simply to reduce the amount of files. "
Like no i'm sorry implementing a SQLite layer, vs implementing a filesystem layer is the same time or faster. I mean seriously he thinks trying to negotiate how to stop having 600,000 files is easier than just... ummm... using standard sql?
> So the practical limit of the size of an SQLite file is much lower than the theoretical limit.
I agree with the conclusion, but the argument is horrible. 281 TB are easily doable on a single file system (not hard drive, which is entirely irrelevant) is very much doable. You can actually build a rather cheap consumer system that will do that. It won't be good, but the problem is not the continuous allocation of space.
In this article I explain where and why I was wrong, and share the real reasons why I think we shouldn't use SQLite for research: A lack of skills and time.
> I’m not a computer scientist
Oh, so now you tell us.There’s a reason for that.
- single writer even with WAL
- missing plenty of alter table functions
Those alone discount it for serious work. Yes, there are workarounds, but why workaround when you can work with Postgres or others that don’t require all of the hassle?
What’s amusing as well as the zealotry around it too. People getting worked up about pointing out obvious flaws. Sad. When did tech become religion?
FWIW SQLite is great for embedded applications, and it’s where it belongs.
The use case and revenue doesn’t have to be contradictory, I guess? What do you mean by “serious work”? iOS and macOS and their bundled apps use SQLite in a lot of places and Apple generated, idk, at least 9 digits in revenue last year. It sounds like you expect SQLite to excel in all use cases including high concurrency/resiliency web services where traditional heavyweight DBMSes like MySQL and PostgreSQL typically stand out, but that’s far from the truth. The authors of SQLite clearly carve out when you should use it (https://www.sqlite.org/whentouse.html) and it’s clearly succeeding in what it’s good at.
With that said, lots of big firms successfully use SQLite in desktop and web offerings as well: https://www.sqlite.org/famous.html
- https://sqlite-utils.datasette.io/en/stable/cli.html#transfo...
The thing with sqlite is that no one sees a need to "advertise" that they are using sqlite... it just works not fancy, (there are fancy things build on sqlite like rqlite, dqlite, litestream etc) but for the base sqlite at the end of the day its a quick file based db that just handles itself well.
> 8 or 9 digit revenue
Ok, HN loves the "digits" stuff. To be clear, "9 digits" can mean: 100,000,000 all the way up to one billion minus one. BIG difference. Let's assume 100,000,000 for now. Also: No currency mentioned. Indonesian rupiah? Euro? Japanese yen? It makes a big difference. > I’m talking about teams in the hundreds of developers
What project in the 2020s needs "hundreds of developers" (again: that implies 200+) and only generates maximum of 100M revenue? This sounds like an awful business of: it cost 200M to build, but only makes 100M revenue.The first point is that the claim that "SQLite can support up to 281TB in a single database" is wrong, because in practice you can't get a single disk that big, and therefore SQLite is a bad choice for storing 16GB of data.
The second point is that without indexing, retrieving individual data items is very inefficient. Therefore a big distributed MySQL cluster (which supports indexing) is better than a single SQLite database (which also supports indexing).
Most of the rest of the text only serves to beat around the bush and distract from how nonsensical the core arguments are.
One nitpick: Using RAID, can you construct a massive single disk mount from JBOD? It seems possible in 2024 to create a 300GB continuous mount. However, your point stands: For most real world scenarios, anything larger than 10TB is probably unreasonably large for a single-file SQLite DB.
> Most of the rest of the text only serves to beat around the bush
Thank you for saying that outloud. This person goes on and on. It is like the YouTube talking head videos that are 45mins, that could easily be cut to 10-15mins!Early days, I just used 'JSON LIKE '%json-substring-match%' for queries, and that did get a bit slow after a while, but mostly the syntax was just really gross when doing aggregations. Fortunately, these days, you can do:
CREATE INDEX IDX_Foo_Bar ON Foo (json_extract(JSON, '$.Bar'));
...which makes subsequent SELECT queries on that function nice and quick again.This has been going on since (checks create time on one .sqlite file) July 2018, with pretty much zero perf or downtime issues.
I don't think querying 8 GB of data without an index is going to be efficient in MySQL either.
(Disclaimer: SQLite fan here, but I read this article with close attention because I'm always interested in knowing SQLite pain points. The conclusion was a slap of just nothing)