So I made a format that will never surpass SQLite, except that it's extremely lighter and faster and works on zstd compressed files. It has really small indexes and can contain binaries or text just like SQLite.
The wasm part that decompresses and reads and searches the databases is only 38kb (uncompressed (maybe 16kb gzipped)). Compare that to SQLite's 1.2mb of wasm and glue code it's 3% the size but searching and loading is much faster. My program isn't really column based and isn't suitable for managing spreadsheets, but it's great for dictionaries and file archives of images and audio.
I ported the jbig2 decoder as a 17kb wasm module, so I can load monochrome scans that are 8kb per page and still legible.
https://github.com/tnelsond/peakslab
SQLite is very well engineered, PeakSlab is very simple.
The current trunk is actually 1.7mb in its canonical unminified form (which includes very nearly as much docs as JS code), split almost evenly between the WASM and JS pieces :/. Edit: it is 1.2mb in minified form, though.
Disclosure: i'm its maintainer.
Edit: current trunk, for the sake of trivia:
sqlite3.wasm 896745
sqlite3.mjs 816270 # unminified w/ docs
sqlite3.mjs 431388 # unminified w/o docs
sqlite3.mjs 310975 # minified"Right joins are just left joins in the wrong direction, you don't need that crap"
Of course it always gets simpler or more specialised. I think many apps using databases would run with SQLite just as well. And some would probably run just as well with a textfile instead of any db like SQLite.
[0] https://cdb.cr.yp.to/ , https://en.wikipedia.org/wiki/Cdb_(software)
Not everything needs to be real-time updated.
We should really consider eventually retiring memes because they just end up as thought-terminating cliches.
This is of course referring to xkcd #927. How do I know that?
I have also heard that some firms ban its use.
Why?
Because it makes it SO easy to set up a database for your app that you end up with a super critical component of your application that looks exactly like a file. A file that can have any extension. And that file can be copied around to other servers. Even if there is PII in that file. Multiply this times the number of applications in your firm and you can see how this could get a little nuts.
DevOps and DBA teams would prefer that the database be a big, heavy iron thing that is very obviously a database server. And when you connect to it, that's also very obvious etc etc.
I still love SQLite though.
SQLite is very good if you can fit into the single writer, multiple readers pattern; you'll never lose data if you use the correct settings, which takes a minute of Google search to figure out.
Today, most of my apps are simply go binary + SQLite + systemd service file.
I've yet to lose data. Performance is great and plenty for most apps
And even then, I've used a batch writer pattern to get 180k writes per second on a commodity vps.
ex: main.db + fts.db. reading and writing to main.db is always available; updating the fts index can be done without blocking the main database — it only needs to read, the reads can be chunked, and delayed. fts.db keeps the index + a cursor table — an id or last change ts
could also use a shard to handle tables for metrics, or simply move old data out of main.db
* some examples:
conn = sqlite3.connect("data.db")
conn.execute("PRAGMA journal_mode=WAL") # concurrent reads (see above)
conn.execute("PRAGMA synchronous=NORMAL") # fsync at checkpoint, not every commit
conn.execute("PRAGMA cache_size=-62500") # ~61 MB page cache (negative = KB)
conn.execute("PRAGMA temp_store=MEMORY") # temp tables and indexes in RAM
conn.execute("PRAGMA busy_timeout=5000") # wait 5s on lock instead of failing
edit: orms will obliterate your performance — use raw queries instead. just make sure to run static analysis on your code base to catch sqli bugs.my replies are being ratelimited, so let me add this
the heavy duty server other databases have is doing that load bearing work that folks tend to complain about sqlite can't do
the real dmbs's are doing mostly the same work that sqlite does, you just don't have to think about it once they're set up. behind that chunky server process the database is still dealing with writing your data to a filesystem, handling transaction locks, etc.
by default sqlite gives you a stable database file, that when you see the transaction complete, it means the changes have been committed to storage, and cannot be lost if the machine were to crash exactly after that.
you can decide to wave some, or all of those guaranties in exchange for performance, and this doesn't even have to be an all or nothing situation.
https://the-php-bench.technex.us/
There's a huge performance difference between memory and file storage within sqlite itself. Not even getting into tuning specifics.
"Batch writer pattern" is a good idea to get rid of expensive commits.
What is the longest surviving paper medium?
So this news is nearly <del>six</del> EIGHT years old. But I didn't happen to know about it until now, so that's not a complaint at all; rather, this is a thank-you for posting it.
(Thanks for the correction. Brief brain malfunction in the math department there).
The specification is publicly available
- It is widely adopted - It is likely to remain readable in the future - It has little dependency on specific operating systems or services - It carries low patent risk
From the perspective of long-term continuity, avoiding dependence on any particular company or service is extremely important.
> As of this writing (2018-05-29) the only other recommended storage formats for datasets are XML, JSON, and CSV.
Preferred
1. Platform-independent, character-based formats are preferred over native or binary formats as long as data is complete, and retains full detail and precision. Preferred formats include well-developed, widely adopted, de facto marketplace standards, e.g.
a. Formats using well known schemas with public validation tool available
b. Line-oriented, e.g. TSV, CSV, fixed-width
c. Platform-independent open formats, e.g. .db, .db3, .sqlite, .sqlite3
2. Any proprietary format that is a de facto standard for a profession or supported by multiple tools (e.g. Excel .xls or .xlsx, Shapefile)
3. Character Encoding, in descending order of preference:
a. UTF-8, UTF-16 (with BOM),
b. US-ASCII or ISO 8859-1
c. Other named encoding
---
Acceptable
For data (in order of preference):
1. Non-proprietary, publicly documented formats endorsed as standards by a professional community or government agency, e.g. CDF, HDF
2. Text-based data formats with available schema
For aggregation or transfer:
1. ZIP, RAR, tar, 7z with no encryption, password or other protection mechanisms.
https://www.loc.gov/preservation/resources/rfs/data.html[0]: https://7-zip.org/7z.html
[1]: CVE-2025-0411
What are the advantages or reasons to use zstd in a 7z container versus just .zst?
I initially was writing a series of files and doing some quasi-append-only things with new files and compacting the old one to sort of reinvent journaling. What I did more or less worked but it was very ad hoc and bad and was probably hiding a lot of bugs I would eventually have to fix later.
And then I remembered SQLite. I realized that ACID was probably safe enough for my needs, and then all the hard parts I was reinventing were probably faster and less likely to break if I used something thoroughly audited and tested, so I reworked everything I was doing to SQLite and it worked fine.
I wish exFAT would die in a fire and a journaling filesystem would replace it as the "one filesystem you can use everywhere", but until it does I'm grateful SQLite exists.
Where exactly is everywhere? Win32? All of Linux? BSDs? MacOS? IOS? ...
"I can use Linux because if I get stuck I can just switch to Windows and still access my data" is a comfort that probably keeps people from even trying Linux (or other OSes)?
Why else would MS not support BTRFS/ZFS/Ext or whatever?
{I'm not saying that I think this works.}
In fact, I've worked on several projects, where I heavily advocated that even the primary app storage was SQLite, and that archival was simply copying the database after an event. Specifically, elections, petition verification, etc. It's kind of difficult coming up with complex schemas to handle multiple events as well as the state of data at those events... by separating the database itself, using SQLite, that simplifies a lot of thigs. Though it does, practically limit scale a bit. The main thing would be to archive the application and the database after a given event. If the application is containerized, you could create an image of the source, the container and the database after the event.
I think this kind of structure would work well for a lot of things... especially if you're considering data sharding anyway.
[1] https://www.loc.gov/preservation/resources/rfs/data.html
This can require nuance: for example, PDF has profiles because the core format is widely supported but you could do things like embed plugin content from now-defunct vendors and they would only want the former for long-term preservation.
Also, the lack of enforced column data types was always a negative for me.
I really like the simplicity and speed of SQLite, I've used in both personal and professional projects. For day-to-day work I still end up in Excel, not because I like it more (I don't), but because its ubiquity makes it the lowest friction way to share & explore datasets with less technical stakeholders and execs.
Show HN: Honker – Postgres NOTIFY/LISTEN Semantics for SQLite | 327 points | 94 comments | https://news.ycombinator.com/item?id=47874647
Live notifications was one of the big missing pieces to implement whole apps on a sqlite backend, and now there's a decent solution.