> Multiple copies of SQLite linked into the same application
I had recent SQLite corruption, and I suspect it was this - I was accessing an SQLite database from the same python process using both the builtin sqlite3 package, and also the third party apsw library
I've never corrupted an SQLite database _file_, but an FTS5 index is remarkably easy to mess up if you don't read and understand the docs to the letter and which parts of them apply to your exact use case. When you do a thing wrong with FTS5, you either get a generic ambiguous error, or silent corruption of the index. It's not a black-box extension to you can just enable with no understanding of how it works under the hood, I learned that the hard way.
Interesting that it doesn't specifically call out Anti-Virus scanning (which does occasionally result in at least one of these scenarios). I've seen many SQLite database become corrupted and the best you can do is have a backup.
SQLite has code to work around anti-virus scanning, by doing retries while the anti-virus tool is processing an operation. I've not seen anti-virus tools result in corruption on the SQLite forum.
I had one machine on my entire client's network that could never download a LibreOffice document without it being corrupted.
Turns out it was kaspersky intercepting network calls, and deciding it was a very dangerous piece of file, and it would truncate it completely silently.
After wasting a non-billable afternoon on it, I just disabled the antivirus out of desperation and figured it out.
The solution was to generate a self signed certificate and TLS the connection and prevent the bugger to MITM us.
Since this day, even on a local network with behind a proxies and using a VPN, I still use https for all the services if I'm allowed.
I just finished building an open-source project using SQLite, and this makes me a little nervous. Is SQLite generally considered a reliable system? I've only had one experience with the DB becoming corrupted and that was because I used the Task Manager to force kill the process while it was running under a heavy load, and I felt that was understandable.
And in many cases, multiple times. I wouldn't be surprised if the number of SQLite DBs on my phone was in the hundreds and the number of separately compiled copies of the library is in double digits.
If you actually read this article it should give you even more assurance that SQLite is very robust and trustworthy. It mostly deals with all the edge cases where something other than the quality and reliability of the SQLite code base can bite you.
The whole sqlite documentation is full of gold gems and other curious documents mostly to appease bureocrats and big companies. It doubles as a fun read other than being incredibly useful.
This feels like the most ridiculous option and one I've never heard in the wild. Sequel-ite, S-Q-lite, sure. S-Q-L-ite, nah I'm not buying that. Kinda like the inventor of GIF insisting on pronouncing it wrong (I kid)
It's almost a HN meme at this point (and you might be referencing it ;-)). But for anyone unaware, SQLite's test coverage is _outstanding_ and well worth reading about[0].
> One example of this occurred circa 2013-08-30 on the canonical repository for the Fossil DVCS. In that event, file descriptor 2 (standard error) was being erroneously closed (by stunnel, we suspect) prior to sqlite3_open_v2() so that the file descriptor used for the repository database file was 2. Later, an application bug caused an assert() statement to emit an error message by invoking write(2,...). But since file descriptor 2 was now connected to a database file, the error message overwrote part of the database.
Kind of crazy that this is an issue in modern operating systems. There are just so many ways to avoid this obvious footgun of an API design. stdin/out/err should be reserved file descriptors. In fact, why reuse file descriptors at all? Just count up.
Firstly, standards compliance is bad when the standard is bad.
Secondly nothing precludes defining '3' as the lowest file descriptor for open(), since stdin/stdout/stderr cannot be opened (they are opened before the process starts).
Article date is Jan 2022.
This changes when SQLite runs as WASM in a browser — a context that only became properly viable with OPFS synchronous access handles in mid-2022.
The January 6, 2022 date at the bottom of the page is not the date the page was last updated. It is the date problem 8.9 (Boundary value error in the secondary journals used by nested transactions) directly above it was fixed. The date at the very bottom of the screen in the middle says the page itself was last updated on 2026-04-13.
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=41846796 - Oct 2024 (1 comment)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=33503555 - Nov 2022 (1 comment)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=31214131 - April 2022 (139 comments)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=16579986 - March 2018 (10 comments)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=6502229 - Oct 2013 (63 comments)
I had recent SQLite corruption, and I suspect it was this - I was accessing an SQLite database from the same python process using both the builtin sqlite3 package, and also the third party apsw library
Turns out it was kaspersky intercepting network calls, and deciding it was a very dangerous piece of file, and it would truncate it completely silently.
After wasting a non-billable afternoon on it, I just disabled the antivirus out of desperation and figured it out.
The solution was to generate a self signed certificate and TLS the connection and prevent the bugger to MITM us.
Since this day, even on a local network with behind a proxies and using a VPN, I still use https for all the services if I'm allowed.
It is arguably the most battle-tested piece of software on the planet. It's deployed basically on every single piece of tech out there.
See, for example: "Defense about the dark arts" (https://sqlite.org/security.Html) and "Why in C?" saying "Because C is best."
"How do I pronounce the name of the product? I say S-Q-L-ite, like a mineral."
[0] https://www.sqlite.org/testing.html
Kind of crazy that this is an issue in modern operating systems. There are just so many ways to avoid this obvious footgun of an API design. stdin/out/err should be reserved file descriptors. In fact, why reuse file descriptors at all? Just count up.
> https://pubs.opengroup.org/onlinepubs/000095399/functions/op...
> The open() function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process.
Secondly nothing precludes defining '3' as the lowest file descriptor for open(), since stdin/stdout/stderr cannot be opened (they are opened before the process starts).
nb4 open("/dev/stdin"), that's not what I mean