PSA: SQLite WAL checksums fail silently and may lose data

(avi.im)

279 points · avinassh · 11mo ago · 129 comments

kburman · 11mo ago · 6 in thread

An employee of Turso, a commercial fork of SQLite, is presenting a standard, safety-first feature of SQLite's WAL as a dangerous flaw. As many have noted, this behavior prevents database corruption, it doesn't cause it.

supriyo-biswas · 11mo ago

I wouldn't have jumped to a conspiracy angle immediately, but there are some signs which are difficult to overlook:

- Said person was apparently employed due to his good understanding of databases and distributed systems concepts (there's a HN thread about how he found an issue in the paper describing an algorithm); yet makes fundamental mistakes in understanding what the WAL does and how it's possible not to "partly" apply a WAL.

- Said person expects a SQL database to expose WAL level errors to the user breaking transactional semantics (if you want that level of control, consider simpler file-based key-value stores that expose such semantics?)

- Said person maligns SQLite as being impossible to contribute; whereas the actual project only mentions that they may rewrite the proposed patch to avoid copyright implications.

- Said person again maligns SQLite as "limping along" in the face of disk errors (while making the opposite claim a few paragraphs ago); while ignoring that the checksum VFS exists when on-disk data corruption is a concern.

jrockway · 11mo ago

I think it's kind of possible to partially apply the WAL manually. Imagine your frames are:

1) Insert new subscription for "foobar @ 123 Fake St."

2) Insert new subscription for "�#�#�xD�{.��t��3Axu:!"

3) Insert new subscription for "barbaz @ 742 Evergreen Terrace"

A human could probably grab two subscriptions out of that data loss incident. I think that's what they're saying. If you're very lucky and want to do a lot of manual work, you could maybe restore some of the data. Obviously both of the "obviously correct" records could just be random bitflips that happen to look right to humans. There's no way of knowing.

avinassh (OP) · 11mo ago

of all places, I did not expect to get personal attacks on HN :)

> yet makes fundamental mistakes in understanding what the WAL does and how it's possible not to "partly" apply a WAL.

Please provide citation on where I said that. You can't partly apply WAL always, but there are very valid cases where you can do that to recover. Recovery doesn't have to be automatic. It can be done by SQLite, by some recovery tool, or with manual intervention.

> - Said person maligns SQLite as being impossible to contribute; whereas the actual project only mentions that they may rewrite the proposed patch to avoid copyright implications.

Please provide citation on where I said that. Someone asked me to send a patch to SQLite, I linked them to the SQLite's page.

supriyo-biswas · 11mo ago

> You can't partly apply WAL always, but there are very valid cases where you can do that to recover.

Without mentioning the exact set of cases where recovery is possible and where it isn't, going "PSA: SQLite is unreliable!!1one" is highly irresponsible. I think there's quite a bit of criticism going around though, you could add it to your blog article :)

Please also consider the fact that, SQLite being a transactional database, it is usually not possible to expose a WAL level error to the user. The correct way to address it is probably to come up with a list of cases where it is possible, and then send in a patch, or at least a proposal, of how to address it.

> Please provide citation on where I said that [SQLite is impossible to contribute].

https://news.ycombinator.com/item?id=44672563

tucnak · 11mo ago

> of all places, I did not expect to get personal attacks on HN

You must be new to the site.

chambers · 11mo ago

Yeah, this tracks.

If the OP consulted with Turso on this blogpost, then Turso probably believes the reported behavior is indeed a failure or a flaw, which they think a local db should be responsible for.

The confusion is that Limbo, their solution to this presumed problem, is not mentioned in the article which means that everyone has to figure out where this post is coming from.

teraflop · 11mo ago · 5 in thread

> The checksums in WAL are likely not meant to check for random page corruption in the middle; maybe they’re just to check if the last write of a frame was fsynced properly or not?

This is the correct explanation. The purpose is to detect partial writes, not to detect arbitrary data corruption. If detecting corruption was the goal, then checksumming the WAL without also checksumming the database itself would be fairly pointless.

In fact, it's not accurate to say "SQLite does not do checksums by default, but it has checksums in WAL mode." SQLite always uses checksums for its journal, regardless of whether that's a rollback journal or a write-ahead log. [1]

For the purpose of tolerating and recovering from crashes/power failures, writes to the database file itself are effectively idempotent. It doesn't matter if only a subset of the DB writes are persisted before a crash, and you don't need to know which ones succeeded, because you can just roll all of them forward or backward (depending on the mode). But for the journal itself, distinguishing partial journal entries from complete ones matters.

No matter what order the disk physically writes out pages, the instant when the checksum matches the data is the instant at which the transaction can be unambiguously said to commit.

[1]: https://www.sqlite.org/fileformat.html

kentonv · 11mo ago

Exactly. To put it another way:

Imagine the power goes out while sqlite is in the middle of writing a transaction to the WAL (before the write has been confirmed to the application). What do you want to happen when power comes back, and you reload the database?

If the transaction was fully written, then you'd probably like to keep it. But if it was not complete, you want to roll it back.

How does sqlite know if the transaction was complete? It needs to see two things:

1. The transaction ends with a commit frame, indicating the application did in fact perform a `COMMIT TRANSACTION`.

2. All the checksums are correct, indicating the data was fully synced to disk when it was committed.

If the checksums are wrong, the assumption is that the transaction wasn't fully written out. Therefore, it should be rolled back. That's exactly what sqlite does.

This is not "data loss", because the transaction was not ever fully committed. The power failure happened before the commit was confirmed to the application, so there's no way anyone should have expected that the transaction is durable.

The checksum is NOT intended to detect when the data was corrupted by some other means, like damage to the disk or a buggy app overwriting bytes. Myriad other mechanisms should be protecting against those already, and sqlite is assuming those other mechanisms are working, because if not, there's very little sqlite can do about it.
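For anyone who wants to poke at the two-condition rule above, here is a toy sketch of it. It is not SQLite's frame format or checksum algorithm: `Frame`, `append`, and `recover` are made-up names, and a chained CRC32 stands in for the real running checksum. It only illustrates "keep everything up to the last commit frame whose checksums all verify".

```python
import zlib
from dataclasses import dataclass

@dataclass
class Frame:
    payload: bytes
    is_commit: bool   # True on the last frame of a transaction
    checksum: int     # running checksum over this frame and all earlier ones

def running_checksum(prev: int, payload: bytes, is_commit: bool) -> int:
    # Toy stand-in: CRC32 seeded with the previous frame's checksum.
    return zlib.crc32(payload + bytes([is_commit]), prev)

def append(log, payload, is_commit):
    prev = log[-1].checksum if log else 0
    log.append(Frame(payload, is_commit, running_checksum(prev, payload, is_commit)))

def recover(log):
    """Return the payloads of frames that belong to fully written transactions."""
    prev, pending, committed = 0, [], []
    for frame in log:
        if running_checksum(prev, frame.payload, frame.is_commit) != frame.checksum:
            break  # torn or damaged frame: nothing after it can be trusted
        pending.append(frame.payload)
        prev = frame.checksum
        if frame.is_commit:
            committed.extend(pending)  # condition 1: a commit frame was seen
            pending = []
    return committed  # frames without a trailing commit frame are dropped

log = []
append(log, b"txn1-page", True)
append(log, b"txn2-page-a", False)
append(log, b"txn2-page-b", True)
append(log, b"txn3-page", False)   # power lost before txn3's commit frame

intact = recover(log)              # txn1 and txn2 survive, txn3 is rolled back
log[1].payload = b"txn2-page-X"    # simulate damage in the middle of the log
damaged = recover(log)             # only txn1 survives
```

Note that the second case cannot tell "torn write" from "later damage"; both look like a checksum mismatch, which is the whole disagreement in this thread.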

malone · 11mo ago

Why is the commit frame not sufficient to determine whether the transaction was fully written or not? Is there a scenario where the commit frame is fsynced to disk but the preceding data isn't?

hinkley · 11mo ago

For instance, running on ZFS or one of its peers.

sqweek · 10mo ago

> This is not "data loss", because the transaction was not ever fully committed. The power failure happened before the commit was confirmed to the application, so there's no way anyone should have expected that the transaction is durable.

In the scenario outlined in the article, technically the lost transactions _were_ fully committed from the application's perspective.

From sqlite's perspective, the updates were successfully fsync'd to the WAL and are now waiting for the next CHECKPOINT operation to be written back to the main database. But sqlite doesn't wait for the main DB to be updated to report success to the application, and nor should it -- the entire point of the WAL is that once data is fsynced to the journal sqlite is confident about its durability.

The type of corruption induced in the article highlights this assumption: if the data on disk changes after fsync reports success, it can invalidate updates further ahead in the WAL that the application was previously told had been successfully committed (and thus may have triggered other external actions).

To be clear, sqlite is doing the right thing. If any frame in the WAL does not match its checksum then the validity of all subsequent frames is called into question. And if fsync() is lying to this extent there's nothing sqlite can do -- that's a bug in the underlying storage layer.

However I think it does leave us with a legitimate race:

1. application submits transactions 1..N

2. sqlite starts committing transactions 1..N to WAL

3. fsync reports success and data is valid on disk

4. sqlite reports success to application

5. application fires off actions in response to transaction N being committed

6. some external event (hardware?) causes a bit-flip in the WAL on disk affecting transaction 2

7. application closes/crashes without sqlite having an opportunity to CHECKPOINT

8. application restarts

9. sqlite recovers WAL and discards transactions 2..N

There are a lot of caveats here. I think that both the bit-flip and the application crash are required, because while the application is running the WAL contents are likely duplicated in RAM (possibly in OS buffers) and a bit-flip on disk alone may never be observed. The bit-flip also needs to be a silent error from the perspective of the storage layer to not result in an error/warning message from sqlite (e.g. reporting I/O errors via the usual sqlite3_config(SQLITE_CONFIG_LOG, ...) mechanism).

Finally if you're considering this kind of data corruption, there's no need to involve the WAL at all. The same kind of silent bit-flip could equally affect the main database, randomly changing the contents of an arbitrary cell in the database. Sqlite _might_ detect that and report corruption (if it results in violating an index ordering or some other constraint), or it might just pass the data on. But as you summarised:

> Myriad other mechanisms should be protecting against [damage to the disk or a buggy app overwriting bytes] already, and sqlite is assuming those other mechanisms are working, because if not, there's very little sqlite can do about it.

tldr; the type of corruption simulated in the article is quite contrived, and sqlite does not protect against cosmic rays/subtle changes on disk to its database files
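A rough way to reproduce this kind of scenario with Python's standard `sqlite3` module, as a sketch under assumptions rather than a faithful copy of the article's test: it snapshots the `.db` and `.db-wal` files while a connection is still open (so nothing has been checkpointed), flips one byte inside the first frame of the second transaction, and reopens the copy. The offsets (32-byte WAL header, 24-byte frame header, nonzero "database size" field on commit frames) are taken from the file format document; the function names are made up.

```python
import os
import shutil
import sqlite3
import struct
import tempfile

WAL_HEADER = 32      # bytes, per the SQLite file format document
FRAME_HEADER = 24    # bytes, followed by one page of data

def damage_second_transaction(wal_path):
    with open(wal_path, "r+b") as f:
        wal = f.read()
        page_size = struct.unpack(">I", wal[8:12])[0]
        frame_size = FRAME_HEADER + page_size
        offset = WAL_HEADER
        # A nonzero "database size" field marks a commit frame; stop at the first one.
        while struct.unpack(">I", wal[offset + 4:offset + 8])[0] == 0:
            offset += frame_size
        # Flip one byte in the page data of the frame right after that commit.
        target = offset + frame_size + FRAME_HEADER + 100
        f.seek(target)
        f.write(bytes([wal[target] ^ 0xFF]))

def rows_after_reopen(corrupt):
    with tempfile.TemporaryDirectory() as tmp:
        live = os.path.join(tmp, "live.db")
        # isolation_level=None means autocommit: each INSERT is its own transaction.
        conn = sqlite3.connect(live, isolation_level=None)
        conn.execute("CREATE TABLE t(x)")  # lands in the main file before WAL mode
        conn.execute("PRAGMA journal_mode=WAL").fetchall()
        for i in range(3):
            conn.execute("INSERT INTO t VALUES (?)", (i,))

        # Snapshot without the -shm file, so the reader has to run WAL recovery.
        copy = os.path.join(tmp, "copy.db")
        shutil.copyfile(live, copy)
        shutil.copyfile(live + "-wal", copy + "-wal")
        conn.close()

        if corrupt:
            damage_second_transaction(copy + "-wal")

        reader = sqlite3.connect(copy)
        count = reader.execute("SELECT count(*) FROM t").fetchone()[0]
        reader.close()
        return count

if __name__ == "__main__":
    print("intact WAL :", rows_after_reopen(False))
    print("damaged WAL:", rows_after_reopen(True))
```

The intact snapshot should report all three rows. The damaged one should report fewer, and the point of the thread is that it does so without raising any error.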

lxgr · 11mo ago

I believe it's also because of this (from https://www.sqlite.org/wal.html):

> [...] The checkpoint does not normally truncate the WAL file (unless the journal_size_limit pragma is set). Instead, it merely causes SQLite to start overwriting the WAL file from the beginning. This is done because it is normally faster to overwrite an existing file than to append.

Without the checksum, a new WAL entry might cleanly overwrite an existing longer one in a way that still looks valid (e.g. "A|B" -> "C|B" instead of "AB" -> "C|data corruption"), at least without doing an (expensive) scheme of overwriting B with invalid data, fsyncing, and then overwriting A with C and fsyncing again.

In other words, the checksum allows an optimized write path with fewer expensive fsync/truncate operations; it's not a sudden expression of mistrust of lower layers that doesn't exist in the non-WAL path.
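A toy illustration of the "A|B" -> "C|B" problem, with heavy assumptions: a two-slot list stands in for the WAL file, CRC32 stands in for the real checksum, and a per-generation `salt` seeds the chain (SQLite does keep salt values in the WAL header, but this is not its algorithm). All names are made up.

```python
import zlib

def write_generation(slots, salt, payloads):
    """Overwrite the log from slot 0; each checksum chains from the previous one."""
    prev = salt
    for i, payload in enumerate(payloads):
        prev = zlib.crc32(payload, prev)
        slots[i] = (payload, prev)

def read_without_checksums(slots):
    # Trusts whatever bytes happen to be in the file.
    return [payload for payload, _ in slots]

def read_with_checksums(slots, salt):
    prev, valid = salt, []
    for payload, stored in slots:
        prev = zlib.crc32(payload, prev)
        if prev != stored:
            break  # stale frame from an older generation (or a torn write)
        valid.append(payload)
    return valid

slots = [None, None]
write_generation(slots, salt=1, payloads=[b"A", b"B"])
# ... a checkpoint copies A and B into the main database; the log is NOT truncated ...
write_generation(slots, salt=2, payloads=[b"C"])

naive = read_without_checksums(slots)      # still sees the leftover B after C
checked = read_with_checksums(slots, 2)    # stops after C
```

The leftover `B` fails verification because its stored checksum was chained from `A` and the old salt, so the writer never has to spend an extra write and fsync invalidating it first.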

slashdev · 11mo ago · 4 in thread

How would this work differently? As soon as you encounter a checksum failure, you can't trust anything from that point on. Even if the checksum were just per-page and didn't build on the previous page's checksum, you couldn't just apply the pages from the WAL that were valid and skip the ones that were not. The database at the end of that process would be corrupt.

If you stop at the first failure, the database is restored to the last good state. That's the best outcome that can be achieved under the circumstances. Some data could be lost, but there wasn't anything sensible you could do with it anyway.

avinassh (OP) · 11mo ago

> How would this work differently?

I would like it to raise an error and then provide an option to continue or stop. Since continuing is the default, we need a way to opt in to stopping on checksum failure.

Not all checksum errors are impossible to recover from. Also, as the post mentions, it could be that only some unimportant pages are corrupt.

My main complaint is that it doesn't give developers an option.

thadt · 11mo ago

Aight, I'll bite: continue or stop... and do what? As others have pointed out, the only safe option to get back to a consistent state is to roll back to a safe point.

If what we're really interested in is the log part of a write ahead log - where we could safely recover data after a corruption, then a better tool might be just a log file, instead of SQLite.

lxgr · 11mo ago

Giving developers that option would require SQLite to change the way it writes WALs, which would increase overhead. Checksum corruptions can happen without any lower-level errors; this is a performance optimization by SQLite.

I've written more about this here: https://news.ycombinator.com/item?id=44673991

slashdev · 11mo ago

It is good that it doesn't give you an option. I don't want some app on my phone telling me its database is corrupt, I want it to always load back to the last good state and I'll handle any missing data myself.

The checksums are not going to fail unless there was disk corruption or a partial write.

In the former, thank your lucky stars it was in the WAL file and you just lose some data but have a functioning database still.

In the latter, you didn't fsync, so it couldn't have been that important. If you care about not losing data, you need to fsync on every transaction commit. If you don't care enough to do that, why do you care about checksums? That's missing the point.
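In SQLite terms, "fsync on every transaction commit" in WAL mode maps to `PRAGMA synchronous=FULL`; with `NORMAL`, the documentation says a committed transaction can roll back after power loss. A minimal sketch with Python's standard `sqlite3` module, where `open_durable` and `demo` are hypothetical helper names:

```python
import os
import sqlite3
import tempfile

def open_durable(path):
    """Open a WAL-mode database that syncs the WAL on every commit."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL").fetchall()
    conn.execute("PRAGMA synchronous=FULL")
    return conn

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_durable(os.path.join(tmp, "app.db"))
        mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
        sync = conn.execute("PRAGMA synchronous").fetchone()[0]  # 2 means FULL
        conn.close()
        return mode, sync
```

The `synchronous` setting is per connection, so it has to be applied each time the database is opened. It only helps with torn or unsynced writes, not with bits that change on disk after a successful fsync.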

cwillu · 11mo ago · 3 in thread

The benefit is that you're left with a database state that actually existed; there's no guarantee from the database's perspective that dropping some committed transactions and not others that came after will result in a valid state.

HelloNurse · 11mo ago

This is the main point that the OP misses: even if the newer portion of the WAL file isn't corrupted, its content cannot be used in any way because doing so would require the lost transactions from the corrupted block. The chained checksums are a feature, not gratuitous fragility.

AlotOfReading · 11mo ago

Sqlite could attempt to recover from the detected errors though and not lose the transactions.

teraflop · 11mo ago

Yes, and it's not just about application-level integrity. The WAL operates at a page level, so dropping one WAL entry and then applying later ones would be likely to cause corruption at the B-tree level.

For instance, say you have a node A which has a child B:

* Transaction 1 wants to add a value to B, but it's already full, so B is split into new nodes C and D. Correspondingly, the pointer in A that points to B is removed, and replaced with pointers to C and D.

* Transaction 2 makes an unrelated change to A.

If you skip the updates from transaction 1, and apply the updates from transaction 2, then suddenly A's data is overwritten with a new version that points to nodes C and D, but those nodes haven't been written. The pointers just point to uninitialized garbage.
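The A/B/C/D example can be mimicked with a toy page store. This is only a sketch: a dict of page name to page contents stands in for the database file, each transaction is a dict of whole-page images (as WAL frames are), and `apply` and `dangling_children` are made-up helpers.

```python
def apply(pages, *transactions):
    """Replay page-level transactions in order; each entry replaces a whole page."""
    result = dict(pages)
    for txn in transactions:
        result.update(txn)
    return result

def dangling_children(pages, parent="A"):
    """Child pointers in `parent` that refer to pages which were never written."""
    return [child for child in pages[parent]["children"] if child not in pages]

on_disk = {
    "A": {"children": ["B"], "label": "old"},
    "B": {"keys": [1, 2, 3, 4]},
}

# Transaction 1: B is full, so it is split into C and D and A is repointed.
txn1 = {
    "A": {"children": ["C", "D"], "label": "old"},
    "C": {"keys": [1, 2]},
    "D": {"keys": [3, 4, 5]},
}

# Transaction 2: an unrelated change to A, written as a full image of A.
txn2 = {"A": {"children": ["C", "D"], "label": "new"}}

in_order = apply(on_disk, txn1, txn2)
skipped = apply(on_disk, txn2)      # pretend txn1's frames were dropped

ok = dangling_children(in_order)    # no dangling pointers
broken = dangling_children(skipped) # A now points at pages that do not exist
```

Because transaction 2 carries a complete image of A, it silently smuggles in the pointer changes made by transaction 1, which is why frames cannot be applied selectively.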

nemothekid · 11mo ago · 2 in thread

I might be missing something (We use sqlite for our embedded stores) - but I feel like "failing silently" is alarmist here.

1. If the WAL is incomplete, then "failing" silently is the correct thing to do here, and is the natural function of the WAL. The WAL had an incomplete write, nothing should have been communicated back to the application and the application should assume the write never completed.

2. If the WAL is corrupt (due to the reasons he mentioned), then sqlite says that's your problem, not sqlite's. I think this is the default behavior for other databases as well. If a bit flips on disk, it's not guaranteed the database will catch it.

This article is framed almost like a CVE, but to me this is kind of like saying "PSA: If your hard drive dies you may lose data". If you care about data integrity (because your friend is sending you sqlite files) you should be handling that.

supriyo-biswas · 11mo ago

Also, partially applying a WAL has obvious issues even though the author of this post would somehow prefer that. If we update 3 rows in a database and the WAL entry for one of the rows is corrupted, do they expect to ignore the corrupted entry and apply the rest? What happens to data consistency in this particular case?

lxgr · 11mo ago

Even worse: SQLite, by default, does not immediately truncate WAL files, but rather overwrites the existing WAL from the beginning after a successful checkpoint.

Doing what the author suggests would actually introduce data corruption errors when "restoring a WAL with a broken checksum".

ryanjshaw · 11mo ago · 2 in thread

> What I want: throw an error when corruption is detected and let the code handle it.

I wonder what that code would look like. My sense is that it’ll look exactly like the code that would run as if the transactions never occurred to begin with, which is why the SQLite design makes sense.

For example, I have a database of todos that sync locally from the cloud. The WAL gets corrupted. The WAL gets truncated the next time the DB is opened. The app logic then checks the last update timestamp in the DB and syncs with the cloud.

I don’t see what the app would do differently if it were notified about the WAL corruption.

fer · 11mo ago

Exactly. I'd read it as

> I want to correct errors that the DB wizard who implemented SQLite chose not to

When there's a design decision in such a high profile project that you disagree with, it's either

1. You don't understand why it was done like this.

2. You can (and probably will) submit a change that would solve it.

If you find yourself in the situation of understanding, yet not doing anything about it, you're the Schrodinger's developer: you're right and wrong until you collapse the mouth function by putting money on it.

It's very rarely an easy to fix mistake.

avinassh (OP) · 11mo ago

> 2. You can (and probably will) submit a change that would solve it.

SQLite is not open to contribution - https://www.sqlite.org/copyright.html

> 1. You don't understand why it was done like this.

sure, I would like to understand it. That's why the post!

asveikau · 11mo ago · 1 in thread

> You have SQLite .db and .db-wal files, but no accompanying .db-shm file. Maybe your friend shared it with you, or you downloaded some data off the internet.

Honestly this sounds out of scope for normal usage of sqlite and not realistic. I had a hard time reading past this. If I read that correctly, they're saying sqlite doesn't work if one of the database files disappears from under it.

I guess if you had filesystem corruption it's possible that .db-shm disappears without notice and that's a problem. But that isn't sqlite's fault.

CGamesPlay · 11mo ago

This, exactly. Especially since these files are basically the "this database was not cleanly closed" markers for SQLite. From SQLite's docs:

> If the last client using the database shuts down cleanly by calling sqlite3_close(), then a checkpoint is run automatically in order to transfer all information from the wal file over into the main database, and both the shm file and the wal file are unlinked.
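This is easy to observe with Python's standard `sqlite3` module. A small sketch, where `wal_lifecycle` is a made-up helper and the behavior assumes a stock SQLite build without persistent-WAL settings:

```python
import os
import sqlite3
import tempfile

def wal_lifecycle():
    """Return (wal file exists while open, any side file exists after clean close)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")
        conn = sqlite3.connect(path)
        conn.execute("PRAGMA journal_mode=WAL").fetchall()
        conn.execute("CREATE TABLE t(x)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()

        exists_while_open = os.path.exists(path + "-wal")

        # Last connection closes cleanly: checkpoint runs, side files are unlinked.
        conn.close()
        exists_after_close = (
            os.path.exists(path + "-wal") or os.path.exists(path + "-shm")
        )
        return exists_while_open, exists_after_close

if __name__ == "__main__":
    print(wal_lifecycle())
```

So a `.db` that arrives together with a `.db-wal` but no `.db-shm` generally means it was copied from a live or crashed database, not from one that was shut down cleanly.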

dathinab · 11mo ago · 1 in thread

Some things:

- there is an official checksum VFS shim, but I never used it and don't know how good it is. The difference between it and the WAL checksum is that it works on a per-page level, and you seem to need to manually run the checksum checks and then decide yourself what to do

- checksums (as used by SQLite WAL) aren't meant for backup, redundancy or data recovery (there are error recovery codes focused on allowing recovery of a limited set of bits, but they have way more overhead than the kind of checksum used here)

- I also believe SQLite should indicate such checksum errors (e.g. so that you might engage out of band data recovery, i.e. fetch a backup from somewhere), but I'm not fully sure how you would integrate it in a backward compatible way? Like return it as an error which otherwise acts like a SQLITE_BUSY??

ncruces · 11mo ago

The checksum VFS explicitly disables its checksums during checkpointing (search for inCkpt): https://sqlite.org/src/doc/tip/ext/misc/cksumvfs.c

Data in the WAL should be considered to be of "reduced durability".

lxgr · 11mo ago

> This is a follow-up post to my PSA: SQLite does not do checksums and PSA: Most databases do not do checksums by default.

That's really all there is to it.

SQLite has very deliberate and well-documented assumptions (see for example [1], [2]) about the lower layers it supports. One of them is that data corruption is handled by these lower layers, except if stated otherwise.

Not relying on this assumption would require introducing checksums (or redundancy/using an ECC, really) on both the WAL/rollback journal and on the main database file. This would make SQLite significantly more complex.

I believe TFA is mistaken about how SQLite uses checksums. They primarily serve as a way to avoid some extra write barriers/fsync operations, and maybe to catch incomplete out-of-order writes, but never to detect actual data corruption: https://news.ycombinator.com/item?id=44671373

[1] https://www.sqlite.org/psow.html

[2] https://www.sqlite.org/howtocorrupt.html

jmull · 11mo ago

> What’s interesting is that when a frame is found to have a missing or invalid checksum, SQLite drops that frame and all the subsequent frames.

Skipping a frame but processing later ones would corrupt the database.

> SQLite doesn’t throw any error on detection of corruption

I don’t think it’s actually a corruption detection feature though. I think it’s to prevent a physical failure while writing (like power loss) from corrupting the database. A corruption detection feature would work differently. E.g., it would cover the whole database, not just the WAL. Throwing an error here doesn’t make sense.

dev_l1x_be · 11mo ago

I was wondering about this subject for some time, but the only real solution as I see it would be a transactional filesystem (re-designing how filesystems work).

westurner · 11mo ago

Do the sqlite replication systems depend upon WAL checksums?

Merkle hashes would probably be better.

google/trillian adds Merkle hashes to table rows.

sqlite-parquet-vtable would work around broken WAL checksums.

sqlite-wasm-http is almost a replication system

Re: "Migration of the [sqlite] build system to autosetup" https://news.ycombinator.com/item?id=41921992 :

> There are many extensions of SQLite; rqlite (Raft in Go), cr-sqlite (CRDT in C), postlite (Postgres wire protocol for SQLite), electricsql (Postgres), sqledge (Postgres), and also WASM: sqlite-wasm, sqlite-wasm-http, dqlite (Raft in C),

> awesome-sqlite

From "Adding concurrent read/write to DuckDB with Arrow Flight" https://news.ycombinator.com/item?id=42871219 :

> cosmos/iavl is a Merkleized AVL tree. https://github.com/cosmos/iavl

/? Merkle hashes for sqlite: https://www.google.com/search?q=Merkle+hashes+for+SQlite

A git commit hash is basically a Merkle tree root, as it depends upon the previous hashes before it.

Merkle tree: https://en.wikipedia.org/wiki/Merkle_tree

(How) Should merkle hashes be added to sqlite for consistency? How would merkle hashes in sqlite differ from WAL checksums?

adzm · 11mo ago

sqlite has several callbacks / hooks / handlers that can be set. I think it is reasonable to expect there to be a way for this situation to be communicated to the application.
