- No need to link with C code. There's a pure go driver for Postgres.
- You get the 'psql' command to inspect and change the database, even while the web app is running.
- You get a lot more database features.
- You can keep the deployment simple by running Postgres on the web app server until you need to scale up.
- Conversion to a full blown cloud database and horizontal scaling is just a matter of configuration.
I understand the desire to keep the stack as simple as possible, but it looks to me like Postgres would actually be simpler than SQLite for the server side.
[1] https://www.sqlite.org/howtocorrupt.html#_memory_corruption
This doesn't affect deployment. Deployment includes setting up and upgrading and potentially backing up Postgres. All of these are easier with Sqlite. While in general I agree with you, the benefits of putting one file on a server and running it has value over installing an entirely separate software product.
But I find this rather trivial to do with cloud database services like RDS. Why not just use that? At the end of the day it gives you tons of benefits (e.g. easy backups) with no real lock-in because you'd still just be using a normal postgres driver.
No particular reason, depends on the use case. Every choice has tradeoffs, and there are cloud-specific ones that can hinder mobility even if it uses a common tech. Lock in is about more than code reuse especially as dependence grows. Granted, for many use cases it might be the best option, but we shouldn't pretend like it's the exact same mobility profile as copying a DB file.
It's a somewhat deceptive simplicity, to boot. The fact that you can use SQLite like this is a testament to its versatility and quality but it's not really for this. There's a conceptual overhead in trying to wedge it into this role and it can also easily become a practical one.
This always seems to happen when I write something.
Collaboration would be great if it's possible, but there are good reasons you may want to do your own thing.
I've always wanted just a pure sqlite package with a one-to-one mapping of functions. Your driver is totally what I've always wanted to exist.
I've put up a preview version of my package if you wanted to look at it.
https://github.com/bvinc/go-sqlite-lite
My project started as a fork of https://github.com/mxk/go-sqlite/tree/master/sqlite3
I ended up removing a ton of features, removing the database/sql driver, getting rid of all the "caching" of information that it does, removing callbacks (which I plan to re-add later), and renaming the methods to match sqlite function names.
It still has some niceties that I plan to keep, such as the Scan method to extract several columns with a single method call, and a RowData type to extract rows into a go map.
I found your project in Github and I'm going to suggest some things. There's a particularly nasty footgun that you might have by not having HAVE_USLEEP defined on UNIX. I'll open an issue.
Usage of a library has more to do with it’s community than it’s codebase.
Well, the phrase "re-inventing the wheel" exists for a reason.
> Usage of a library has more to do with it’s community than it’s codebase.
The purpose of a library is code re-use.
For comparison, the wikipedia page [1] for a computer library does not even mention communities, or the social aspect in general.
I'd say the community is a factor in the decision to use a particular library, but it's not the driving factor.
To the larger point, while I totally agree with the premise -- most apps will never need to scale beyond one large instance, I'm not exactly sure what's the actual tradeoff is. If you're writing a simple CRUD app, it is not really controversial that it shouldn't need any complex distributed stuff anyway, it is just a simple python/Go app and a data store.
Most "fancy" things outside that single process paradigm, such as avoiding using local disk and using S3/RDS/document stores, using containers for deployment, etc. usually have more to do with making the app operationally simpler and easy to recover from instance failures than scaling per se.
I'd say architecturally making the app easier to perform this function, rather than rely on infrastructure to do it for you (and hoping doing so will make the app "easier" to code).
For example, use an append only data structure (like event sourcing), with snapshotting. Then your app recovery process is just "restart".
It could be worth noting the thread concurrency limit imposed by cgo calls locking the OS thread to the calling goroutine. For example, 128 concurrent requests running long statements would take up 128 OS threads.
Of course any language not using a lightweight threading model has this issue either way if each request is mapped to one thread but it seems worth it to point out that you may want to limit your own concurrency at the request level or with properly sizing a connection pool (like the one present in the sqlite driver).
Edit: whatever the solution to concurrency is, benchmarking it is a good idea for any sized product going into a production environment. There's a good article about this by the people at Cockroach labs: https://www.cockroachlabs.com/blog/the-cost-and-complexity-o... (see Cgoroutines != Goroutines)
Last time I checked mattn's lib, there was no connection pool, and my tests supported this. Do you have other information?
Also, devs should be aware of some of the concurrency limitations of Sqlite. Devs need to essentially share a single connection across the app, and serialize the writes.
Also, side note, I've found one really neat use case for Sqlite over other DBs: simple disk encryption. I wrote a Go wrapper for one of the popular ones [0], but in general if you distribute your app in other ways you may want to offer your users encryption that you can't easily with other DBs. Sure you lose some performance, but sometimes it's worth it (again, less in the server case where you control it and/or can do fs-level encryption, but definitely in the distributed app case).
I've been following this principle for a decade, and 98% of the services I've built have never needed to grow beyond one instance. The ones that did were solved by breaking out specific functions into their own managed services, not by applying a wide horizontal scaling solution.
If you build for scaling before you have anything running that can be tested against, it's highly likely that you're actually scaling the wrong thing. It's like optimisation: to get any meaningful benefit, you'll want to surgically solve the actual bottleneck rather than sprinkle hope-based gestures all over the place. (I'm reminded of the article that was posted the other day, where the writer experienced a Firebase slowdown and tried to solve it by upgrading hundreds of npm packages to their latest versions. That's what I mean by hope-based gestures. Premature "web scale" architectural tripping is essentially the same kind of shotgun approach.)
I do think there's something to be said for habitually separating the database though - if you store data elsewhere, you can have 1 instance or 2 or 3 pretty easily, and sometimes having more than one is nice for redundancy, handling big spikes or seasonal traffic etc. It is pretty easy to scale horizontally nowadays on cloud services where they take care of load balancing and sometimes the data store too. You also don't have to worry about wiping a server and starting again, as your data is elsewhere.
Embedding a db like sqlite introduces other problems (cgo requirement for one). It is really handy for things like mobile development though as then you don't depend on a separate api or db for data. It's also handy if you don't want potential users to have to worry about setting up a db when trying out your project, so an embedded db is really nice for that - I wish golang had a clone of sqlite that was pure go code as it would make this even more attractive for a broader range of options.
Bitbucket pipelines that create an image of the app, create instance template with a container "gcloud beta compute instance-templates create-with-container", create instance group, create DNS, global load balancer, SSL certs and rolling update of the instance groups with a new version (if applicable).
Using preemptible g1-small to keep costs down but deployed in at least two zones per region.
Do you use SQLite?
What do you mean by an "image of the app"? A Docker image or something else?
How does your rolling deploy work?
Two big pieces of software I worked on used this philosophy. One was https://en.wikipedia.org/wiki/Zeus_Web_Server - nowadays its niche is occupied by nginix, but back then its advantage was being a one-process-per-core select()-based solution, compared to Apache's process-per-concurrent-request solution. It was also absurdly, unnecessarily portable, being written in vanilla POSIX. Ran on a dozen different flavours and architectures of UNIX almost all of which are now dead.
Another was chip design software for a startup. Eventually it would have been useful to parallelise computation, but in the meantime it was so much faster to develop it and just buy big machines and leave it running overnight.
See also https://adamdrake.com/command-line-tools-can-be-235x-faster-...
Developers of would-be federated services to decentralize the web, please take note. This is how you go about ensuring wide deployment. Simplicity, if anyone remembers the word.
Some may say, well, sql servers are not that hard to learn. Well, that is because you live with it day and night. Try going back to it after 1 year of break.
This seems to be the main push of the article, and I for one agree with that statement.
Single process performance and the COST principle, from the iconoclastic (some might say bombastic) McSherry:
[1] blog http://www.frankmcsherry.org/graph/scalability/cost/2015/01/...
[2] pdf https://www.usenix.org/system/files/conference/hotos15/hotos...