This has been shared without context but I guess the SQLite team is starting to modularize the btree code in order to facilitate work like SQLightning: https://github.com/LMDB/sqlightning
At the time SQLightning greatly improved SQLite performance, but because LMDB requires keys to fit in 2/3 of a page it wasn't really useful as a general-purpose replacement for SQLite's internal b-tree implementation.
EDIT: It looks like SQLightning got adopted and has been worked on by the SQLite team under the name LumoSQL. Here's the project's readme: https://lumosql.org/src/lumosql/doc/trunk/README.md which contains at the end "A Brief History of LumoSQL"
The strongest claim the README makes is that "The LumoSQL and SQLite projects are cooperating". That is closer cooperation than any other effort I've seen, and welcome if the SQLite project ever wants to swap out the underlying storage engine, but it doesn't really mean that the SQLite team "works on LumoSQL" or vice versa. It certainly looks like LumoSQL has put significant work into the cooperation, by using Fossil and by "not forking", which may have made the cooperation palatable.
Also, the SQLite project has been consistent about wanting to write all the code for SQLite themselves and not merge in outside patches (https://sqlite.org/copyright.html). Them working on another exploratory project would be a way to absorb those changes back into SQLite without running afoul of that policy, but it would have to be the same team doing the changes for that to be consistent.
Fossil is an SCM system - https://fossil-scm.org/. Are you suggesting that every piece of software ever kept in, say, a git repository is therefore maintained by the same team?
> EDIT: It looks like SQLightning got adopted and has been worked on by the SQLite team under the name LumoSQL.
I see none of your claims being supported by your link.
It seems LumoSQL is just an umbrella term that refers to a bunch of patches applied over SQLite. I'm not sure if that's enough to not call it a fork.
Also, I saw zero references to the people actually involved in LumoSQL, other than the project being supported by the NLNet foundation.
I'd expect a small project managed by the SQLite team to be hosted on www.sqlite.org, but this looks like a completely independent and unrelated effort.
That last substantive edit to the document was in 2009. There were some spelling corrections in 2010. We finally got around to removing it from the documentation set in 2016.
A bit off-topic, but SQLite continues to impress me. This year, we've added a simple form builder to our product which allows creators to quickly gather feedback/input from their audiences. Basically each form has its own small SQLite database as a datastore and a JSON file for the form definition (field styling/constraints, etc). It's 'low tech', but performs great. Not having an additional service running also means having one less thing to maintain.
Been feeling lately the fatigue of managing 3-5 different services for every single trivial project. This pushed me to explore adding SQLite support in my product [1]; I think I'll wrap a Litestream [2] process in the same container and call it "Backed in a Box" :)
You can just supply your own function to parse the date.
In most database systems that would be a disaster, because the database would have to send a request over the network back to your program for each argument it needs to process. For SQLite it's no big deal, because it's executing in the same address space as your program, so it's a simple function call.
I'll admit that it does make it harder to port your program to another language, so it's more of a workaround than an ideal solution.
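To make that concrete, here's a minimal sketch using Python's sqlite3 bindings (the parse_date function, the date format and the events table are all made up for the example):

    import sqlite3
    from datetime import datetime

    def parse_date(text):
        # Hypothetical parser: convert a "DD/MM/YYYY" string into ISO 8601
        # so SQLite's built-in date functions can understand it.
        return datetime.strptime(text, "%d/%m/%Y").strftime("%Y-%m-%d")

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (name TEXT, raw_date TEXT)")
    conn.execute("INSERT INTO events VALUES ('launch', '03/07/2021')")

    # Register the Python function as a SQL function. It runs in-process,
    # so each call is a plain function call, not a network round trip.
    conn.create_function("parse_date", 1, parse_date)

    row = conn.execute(
        "SELECT name, date(parse_date(raw_date), '+1 day') FROM events"
    ).fetchone()
    print(row)  # ('launch', '2021-07-04')

The same mechanism (create_aggregate, create_collation) also lets you register aggregates and collations from Python.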
OK I love SQLite but there has been a long-standing misconception, or issue, that I'd like to get clarification on: Is using SQLite from multiple processes safe?
For a long while, either SQLite itself or the Python bindings weren't safe for concurrent access. Is this still the case? Can I use SQLite for my Django app? With the backup system from the Tailscale post yesterday, the operational burden is much, much lower than Postgres for many use cases.
It is always safe, and by "safe" I mean "safe for data". You won't have to deal with data corruption. Precisely, see "How To Corrupt An SQLite Database File": https://www.sqlite.org/howtocorrupt.html
Now concurrent accesses from different processes/connections can lead to runtime errors (SQLITE_BUSY), because the database happens to be locked by one connection.
Those errors are greatly reduced by the WAL mode (https://sqlite.org/wal.html) which provides ultra-robust single-writer/multiple-readers semantics:
- Writes cannot happen concurrently (SQLITE_BUSY); a minimal sketch of enabling WAL follows below.
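Here is what turning this on looks like from Python's sqlite3 bindings (the app.db file name and the 5 second timeout are just placeholders):

    import sqlite3

    # "app.db" is a placeholder. WAL mode is a property of the database file,
    # so it only needs to be enabled once and then persists.
    conn = sqlite3.connect("app.db")
    print(conn.execute("PRAGMA journal_mode=WAL").fetchone())  # ('wal',)

    # Readers keep seeing a consistent snapshot while one writer appends to
    # the WAL; a second writer gets SQLITE_BUSY if it cannot take the write
    # lock before its busy timeout expires.
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s for the lock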
Oh excellent, I wasn't sure the "snapshot isolation" level existed (the WAL kind of implied it, but I wasn't sure). SQLITE_BUSY only happens if a write lock is kept for longer than the readers' configured timeout, right? It shouldn't happen for short writes?
One should only expect SQLITE_BUSY for writes (if a writer is already holding the lock, and the busy timeout expires before the other writer releases the lock). So yes, prefer short writes, or adjust your timeout. Generally speaking, SQLITE_BUSY can not be 100% prevented for writes.
Yes. Now, many short writes look exactly like one very long write from the point of view of an enqueued write that is waiting for its turn :-) I don't quite remember how fair SQLite's scheduling is in practice.
I don’t think there is any scheduling. Each connection polls to see if the write lock is available up to the max busy timeout setting.
The connection polls at these intervals (in milliseconds): static const u8 delays[] = { 1, 2, 5, 10, 15, 20, 25, 25, 25, 50, 50, 100 };
So, if you are using the default 5 second timeout and you try to acquire a lock while an exclusive lock is held, you will wait 1 ms, then 2 ms, then 5 ms, and so on up to 100 ms, then keep retrying every 100 ms until the total wait exceeds the timeout and SQLITE_BUSY is returned.
If you have a thread running many fast queries in a loop you can deny access to another thread that needs a lock. The other thread may get lucky and poll for the lock at the exact moment in between locks from the other thread, but it might not.
> The error message when using the connection from multiple threads is "outdated"
I think typically you'd still want to use one connection per thread because (for example) there can only be one transaction per connection at any given time.
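One way to get that in Python is a thread-local connection. This is only a sketch; get_conn, app.db and the log table are made-up names:

    import sqlite3
    import threading

    _local = threading.local()

    def get_conn(path="app.db"):
        # Made-up helper: lazily open one connection per thread, so every
        # thread has its own transaction state instead of sharing one.
        if not hasattr(_local, "conn"):
            _local.conn = sqlite3.connect(path)
            _local.conn.execute("PRAGMA busy_timeout=5000")
        return _local.conn

    def worker(i):
        conn = get_conn()
        with conn:  # each thread runs its own transaction on its own connection
            conn.execute("INSERT INTO log(msg) VALUES (?)", (f"from thread {i}",))

    get_conn().execute("CREATE TABLE IF NOT EXISTS log (msg TEXT)")
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Each connection then carries its own transaction, so a long-running transaction in one thread can't be interfered with by another.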
It appears to be safe, but potentially slow, assuming you have working fcntl() locking, which rules out, for example, NFS. And you have to handle SQLITE_BUSY or set a timeout. More in their FAQ, Q5: https://www.sqlite.org/faq.html#q5
That looks great, thanks. I didn't realize it wasn't on by default, but it sounds like it'll greatly benefit the web app use case, where all web workers are on the same Linux machine.
Sounds like WAL is the way to go in most use cases, unless you want shared access from different machines.
The WAL doesn't work with shared directories (NFS/SMB) though.
Found this out when I tried to store the Plex data directory on an NFS share in a VM and it had really weird issues. Turned out Plex uses SQLite with WAL enabled.
This is just what I’ve heard, so take it with a grain of salt, but as I understand it SQLite is “mostly” safe to use concurrently. There are situations where it doesn’t behave correctly under concurrent load, the one I’ve heard about being when the SQLite database is on a network-mounted drive. I’d love to hear from someone who knows more though.
That's kind of expected, I think. Due to its design, SQLite relies on filesystem semantics to provide atomicity. If the filesystem doesn't provide the semantics, it makes sense that atomicity will fail.
I'm more asking whether it's still unsafe to use in a filesystem that DOES provide those semantics.
The biggest problem SQLite has is its size limitation. It can only hold ~281 TB in a database, unfortunately.
If you need more storage than that - that's the only reason I could endorse someone using a different database.
:P
My experience is that it is only safe to have one connection to a SQLite database.
Having multiple connections mostly works; however, if the database is stored on a slow medium, such as an SD card, concurrent queries will fail.
You need to explicitly set the “busy_timeout” PRAGMA to allow concurrent write queries to block under WAL mode. Otherwise the query will return a BUSY error.
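A rough sketch of that behaviour with Python's sqlite3 (file name and values are made up; note the stdlib's connect() also has its own timeout parameter, set to 0 here to disable waiting):

    import sqlite3

    # Two connections to the same (made-up) database file.
    writer = sqlite3.connect("app.db", isolation_level=None)  # autocommit mode
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute("CREATE TABLE IF NOT EXISTS t (x)")
    writer.execute("BEGIN IMMEDIATE")             # take and hold the write lock

    other = sqlite3.connect("app.db", timeout=0)  # no busy timeout at all
    try:
        other.execute("INSERT INTO t VALUES (1)")
    except sqlite3.OperationalError as e:
        print(e)  # "database is locked", i.e. SQLITE_BUSY, returned immediately

    other.execute("PRAGMA busy_timeout=5000")  # now retry for up to 5 seconds
    writer.execute("COMMIT")                   # release the write lock
    other.execute("INSERT INTO t VALUES (2)")  # succeeds once the lock is free
    other.commit()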