For background: we have a storage product for large files (like photos, videos, etc). The storage paths are mapped into your Postgres database so that you can create per-user access rules (using Postgres RLS)
This update adds S3 compatibility, which means that you can use it with thousands of tools that already support the protocol.
I'm also pretty excited about the possibilities for data scientists/engineers. We can do neat things like dump Postgres tables into Storage as Parquet files, and you can connect DuckDB/ClickHouse directly to them. We have a few ideas that we'll experiment with to make this easy.
Let us know if you have any questions - the engineers will also monitor the discussion
I have been developing with Supabase for the past two months. I would say there are still some rough corners in general, and some basic features are missing. For example, Supabase Storage has no direct support for metadata [2][3].
Overall I like the launch week and the development they are doing. But more attention to basic features and little details is needed, because implementing workarounds for basic stuff is not ideal.
[1] https://bunny.net/ [2] https://github.com/orgs/supabase/discussions/5479 [3] https://github.com/supabase/storage/issues/439
Bunny is a great product. I'm glad this release makes that possible for you and I imagine this was one of the reasons the rest of the community wanted it too
> But more attention to basic features and little details
This is what we spend most of our time doing, but you won't hear about it because those fixes aren't HN-worthy.
> no direct support for metadata
Fabrizio tells me this is next on the list. I understand it's frustrating, but there is a workaround - store the metadata in the Postgres database (I know, not ideal, but still usable). We're getting through requests as fast as we can.
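A minimal sketch of that workaround: keep a side table in your database keyed by the object's bucket and path. The table and column names here are hypothetical, and sqlite3 stands in for Postgres so the sketch is runnable as-is; in practice you'd use your Postgres database (with a jsonb column) and put RLS on this table too.

```python
import sqlite3

# Hypothetical metadata side table, keyed by (bucket, path).
# sqlite3 is a stand-in for Postgres so this sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE object_metadata (
        bucket   TEXT NOT NULL,
        path     TEXT NOT NULL,
        metadata TEXT,               -- JSON blob (jsonb in Postgres)
        PRIMARY KEY (bucket, path)
    )
""")

def set_metadata(bucket: str, path: str, metadata: str) -> None:
    # Upsert metadata for an uploaded object.
    conn.execute(
        "INSERT INTO object_metadata (bucket, path, metadata) VALUES (?, ?, ?) "
        "ON CONFLICT (bucket, path) DO UPDATE SET metadata = excluded.metadata",
        (bucket, path, metadata),
    )

def get_metadata(bucket, path):
    row = conn.execute(
        "SELECT metadata FROM object_metadata WHERE bucket = ? AND path = ?",
        (bucket, path),
    ).fetchone()
    return row[0] if row else None

set_metadata("avatars", "user-1/photo.png", '{"alt": "profile photo"}')
print(get_metadata("avatars", "user-1/photo.png"))
```

The upsert means re-uploading an object just overwrites its metadata row.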
I'm more used to Azure Blob Storage than anything, so I'm OOL on what people do other than store files on S3.
Large object crew, who's with me?!
The only upside of storing blobs in the database is transactional semantics. But if you're fine with some theoretical trash in S3, that's trivially implemented with proper ordering.
Plenty more advantages than that. E.g. for SaaS you can deep-copy an entire tenant, including their digital assets. Much easier to copy with just "insert into ... select from" than having to copy S3 objects.
The client I use currently, npgsql, supports proper streaming so I've created a FS->BLOB->PG storage abstraction. Streamy, dreamy goodness. McStreamy.
J/K. It could be a really good back-end option for Supabase's S3 front end. A lot of PG clients don't support proper "streaming", and looking at the codebase, it's TypeScript. postgres.js is the only client nearing "performant" I'm aware of (last I looked) on Node.js, but it's not clear it supports streaming outside "Copy" per the docs. Support could be added to the client if missing.
Edit: Actually it could be a good option for your normal uploads too. The docs talk about it being ideal for files of 6 MB or smaller? Are you using bytea or otherwise needing to buffer the full upload/download in memory? Streaming with Lob would resolve that, and you can compute incremental hash sums for etags, etc. Lob has downsides and limitations, but for a very large number of people it has many great benefits that can carry them very far, potentially all the way.
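The incremental-hashing point is worth making concrete: a streaming hash over chunks produces the same digest as hashing the whole object at once, so you can compute a simple (non-multipart) S3-style ETag without ever buffering the full file. A stdlib-only sketch:

```python
import hashlib
import io

def md5_streaming(stream, chunk_size: int = 64 * 1024) -> str:
    """Compute an MD5 (the basis of a simple S3 ETag) incrementally,
    without buffering the whole object in memory."""
    h = hashlib.md5()
    while chunk := stream.read(chunk_size):
        h.update(chunk)
    return h.hexdigest()

data = b"x" * (1024 * 1024)  # pretend this is a large object
# Hashing chunk-by-chunk gives the same digest as hashing all at once:
assert md5_streaming(io.BytesIO(data)) == hashlib.md5(data).hexdigest()
print(md5_streaming(io.BytesIO(data)))
```

The same pattern applies when the "stream" is a large-object read from Postgres rather than a BytesIO.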
Then for GDPR, when you delete a user, the associated storage can be deleted.
One could cobble this together with triggers, some kind of external process, and probably repetitious code so there is one table of metadata per "owning" id, although it would be nice if this were packaged.
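A sketch of that cobbled-together shape: a metadata table referencing the owning user with `ON DELETE CASCADE`, plus a helper that collects the storage paths so an external worker can delete the underlying objects. All names are hypothetical, and sqlite3 stands in for Postgres so the sketch runs:

```python
import sqlite3

# sqlite3 stand-in for Postgres; table/column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite needs this for FKs
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE user_objects (
        path     TEXT PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE
    );
""")
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute("INSERT INTO user_objects VALUES ('u1/a.png', 1), ('u1/b.png', 1)")

def delete_user(user_id):
    # Collect the storage paths first, then delete the user row; the
    # cascade removes the metadata, and the returned paths are handed to
    # an external process that deletes the actual objects from storage.
    paths = [r[0] for r in conn.execute(
        "SELECT path FROM user_objects WHERE owner_id = ? ORDER BY path",
        (user_id,))]
    conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    return paths

orphaned = delete_user(1)
print(orphaned)
```

In Postgres the same shape works with a real foreign key, and the "external process" could be a trigger writing to a deletion queue table.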
And a 5-min demo video with Digital Ocean: https://www.youtube.com/watch?v=FqiQKRKsfZE&embeds_referring...
Anyone who is familiar with basic server management skills will have no problem self-hosting. every tool in the supabase stack[0] is a docker image and works in isolation. If you just want to use this Storage Engine, it's on docker-hub (supabase/storage-api). Example with MinIO: https://github.com/supabase/storage/blob/master/docker-compo...
[0] architecture: https://supabase.com/docs/guides/getting-started/architectur...
Pocketbase being literally a single binary doesn't make Supabase look good either, although the functionalities differ.
I doubt we can ever squeeze the "supabase stack" into a single binary. This undoubtedly makes things more difficult for self-hosters. Just self-hosting Postgres can be a challenge for many. We will keep trying to make it easier, but it will never be as simple as Pocketbase.
The idea of a proprietary API becoming the de facto industry standard isn't uncommon. The same thing happened with Microsoft's XMLHttpRequest.
- This: for managing large files in s3 (videos, images, etc).
- Oriole: a postgres extension that's a "drop-in replacement" for the default storage engine
We also hope that the team can help develop Pluggable Storage in Postgres with the rest of the community. From the blog post[0]:
> Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. This system is available in MySQL, which has used InnoDB as the default storage engine since MySQL 5.5 (replacing MyISAM). Oriole aims to be a drop-in replacement for Postgres' default storage engine and supports similar use-cases with improved performance. Other storage engines, to name a few possibilities, could implement columnar storage for OLAP workloads, highly compressed timeseries storage for event data, or compressed storage for minimizing disk usage.
Tangentially: we have a working prototype for decoupled storage and compute using the Oriole extension (also in the blog post). This stores Postgres data in s3, and there could be some interplay with this release in the future.
we only have plans to keep pushing open standards/tools - hopefully we have enough of a track record here that it doesn't feel like lip service
The primary options I can think of that are not full acquisition are:
- company buys back stock
- VC sells on secondary market
- IPO
The much more common and more likely way for these VCs to make the multiple, or the home run, on their return is to 10x+ their money by having a first- or second-tier cloud provider buy you.
I think there's a 10% chance that a deal with Google is done in the future, so their portfolio has Firebase for NoSQL and Firebase for SQL.
(This is not a paid promotion)
*Subject to change
Heroku might be a better example of a company that was acquired and then shut down its free plan.
At the time, the most requested feature was a push notification mechanism, because implementing that on iOS had a steep learning curve and was not cross-platform. Then probably some more advanced rules to be able to do more functional-style permissions, possibly with variables, although they had just rolled out an upgraded rules syntax. And also having a symlink metaphor for nodes might have been nice, so that subtrees could reflect changes to others like a spreadsheet, for auto-normalization without duplicate logic. And they hadn't yet implemented an incremental/diff mechanism to only download what's needed at app startup, so larger databases could be slow to load. I don't remember if writes were durable enough to survive driving through a tunnel and relaunching the app while disconnected from the internet either. I'm going from memory and am surely forgetting something.
Does anyone know if any/all of the issues have been implemented/fixed yet? I'd bet money that the more obvious ones from a first-principles approach have not, because ensh!ttification. Nobody's at the wheel to implement these things, and of course there's no budget for them anyway, because the trillions of dollars go to bowing to advertisers or training AI or whatnot.
IMHO the one true mature web database will:
- be distributed via something like Raft
- have rich access rules
- be log based, with (at least) SQL/HTTP/JSON interfaces to the last-known state and access to the underlying set selection/filtering/aggregation logic/language
- support nested transactions, or cover all the equivalent use-cases with atomic operations and examples
- be fully indexed by default with no penalty for row- or column-based queries (to support both table- and document-oriented patterns, and even software transactional memories - STMs)
- have column and possibly even row views (not just table views)
- use a copy-on-write mechanism internally, like Clojure's STM, for mostly O(1) speed
- be evented with smart merge conflicts to avoid excessive callbacks
- preferably have a synchronized clock timestamp ordered lexicographically:
https://firebase.blog/posts/2015/02/the-2120-ways-to-ensure-...
I'm not even sure that the newest UUID formats get that right:
https://uuid6.github.io/uuid6-ietf-draft/
Loosely this next-gen web database would be ACID enough for business and realtime enough for gaming, probably through an immediate event callback for dead reckoning, with an optional "final" argument to know when the data has reached consensus and was committed, with visibility based on the rules system. Basically as fast as Redis but durable.
A runner-up was the now (possibly) defunct RethinkDB. Also PouchDB, a JavaScript database that syncs with CouchDB.
I haven't had time to play with Supabase yet, so any insights into whether it can do these things would be much appreciated!
“”” PLV8 is a trusted Javascript language extension for PostgreSQL. It can be used for stored procedures, triggers, etc.
PLV8 works with most versions of Postgres, but works best with 13 and above, including 14, 15, and 16. """
Is there a Postgres storage backend optimized for storing large files?
We store the metadata of the objects and buckets in Postgres so that you can easily query it with SQL. You can also implement access control with RLS to allow access to certain resources.
It is not currently possible to guarantee atomicity across 2 different file uploads, since each file is uploaded in a single request. This seems like higher-level functionality that could be implemented at the application level.
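One way that application-level approach can look: upload both objects first, treat a single database write as the commit point, and roll back any already-uploaded objects on failure. This is a self-contained sketch where an in-memory dict stands in for Storage and a list stands in for the Postgres table:

```python
# In-memory stand-ins for the object store and the database table.
store: dict = {}
committed: list = []

def upload(path: str, data: bytes, fail: bool = False) -> None:
    if fail:  # simulate a failed upload
        raise IOError(f"upload of {path} failed")
    store[path] = data

def upload_pair(a, b, fail_second=False):
    """Upload two files 'atomically': commit only if both succeed."""
    uploaded = []
    try:
        upload(a[0], a[1])
        uploaded.append(a[0])
        upload(b[0], b[1], fail=fail_second)
        uploaded.append(b[0])
    except IOError:
        # Roll back: delete anything already uploaded, commit nothing.
        for path in uploaded:
            store.pop(path, None)
        return False
    committed.append((a[0], b[0]))  # the "commit point" is this DB write
    return True

assert upload_pair(("x/1.bin", b"1"), ("x/2.bin", b"2")) is True
assert upload_pair(("y/1.bin", b"1"), ("y/2.bin", b"2"), fail_second=True) is False
print(sorted(store), committed)
```

Readers only ever see pairs whose commit row exists, which is the "some theoretical trash in S3, fixed by proper ordering" trade-off mentioned upthread: a crash between upload and rollback can leave orphaned objects, but never a half-committed pair.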
That said, I think Supabase is much more de-risked from this happening because we aim to support existing tools, with a strong preference for tools that are controlled by foundations rather than commercial entities. For example, 2 of the main tools:
- Postgres (PostgreSQL license)
- PostgREST (MIT license)
Every tool/library/extension that we develop and release ourselves is licensed MIT, Apache 2.0, or PostgreSQL.
Congrats on the release!
I have wondered if Amazon has some additional tooling for other software providers to make their own S3-compliant APIs, but I don't know what Amazon's motivation would be to help make it easier for people to switch between other vendors. Whereas the incentive is much more obvious for other software vendors to make their own APIs S3-compliant. So I've so far imagined it is a similar process to how I described above, instead.
[1]: https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operatio...
I searched on that page and didn't find anything, but I've seen it mentioned elsewhere that they are catered for. I haven't found any documentation for that, though.
I'm particularly interested in the temporary ones - 302, 303 and 307.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-...
The only thing I can say related to the topic is that S3 multipart significantly outperforms other methods for files larger than 50 MB, but tends to have similar or slightly slower speeds compared to regular S3 upload via Supabase, or the simplest Supabase Storage upload, for files around 50 MB and below.
S3 multipart is indeed the fastest way to upload a file to Supabase, with speeds up to 100 MB/s (even 115) for files > 500 MB. But for files of about 5 MB or less, you probably won't notice any difference, so there's no need to change your upload logic just for performance.
Everything mentioned here applies to uploads only.
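For anyone curious why multipart wins on large files: the payload is split into parts (each at least 5 MB on real S3, except the last) that can be uploaded in parallel, and the resulting ETag is the MD5 of the concatenated per-part MD5 digests with a part-count suffix. A stdlib-only sketch of the splitting and the ETag shape:

```python
import hashlib

def split_parts(data: bytes, part_size: int):
    """Split a payload into multipart-upload parts (real S3 requires
    every part except the last to be at least 5 MB)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

def multipart_etag(parts) -> str:
    """S3-style multipart ETag: MD5 of the concatenated per-part MD5
    digests, suffixed with the part count."""
    digests = b"".join(hashlib.md5(p).digest() for p in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"

data = b"a" * 100
parts = split_parts(data, 40)  # tiny parts, just for the demo
print(len(parts), multipart_etag(parts))
```

The `-N` suffix is why a multipart ETag never matches the plain MD5 of the file, which trips people up when verifying uploads.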
Generally, you would want to place an upload server in front to accept uploads from your customers, because you want to do some sort of file validation, access control, or other processing once the file is uploaded. The nice thing is that we run Storage within the same AWS network, so the upload latency is as small as it can be.
In terms of serving files, we provide a CDN out of the box for any files that you upload to Storage, minimising latencies geographically.
A common pattern on AWS is to not handle the upload on your own servers. Checks are made ahead of time, conditions baked into the signed URL, and processing is handled after the fact via bucket events.
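To illustrate "conditions baked into the signed URL": a browser-direct upload typically uses a signed POST policy, a JSON document listing the allowed key prefix, size range, etc., which is base64-encoded and signed with a SigV4-derived key. This is a simplified stdlib sketch (a real policy also embeds x-amz-credential, x-amz-date, and x-amz-algorithm entries in the conditions; the secret and dates here are made up):

```python
import base64
import hashlib
import hmac
import json

def sigv4_key(secret: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key (date is YYYYMMDD)."""
    k = ("AWS4" + secret).encode()
    for msg in (date, region, service, "aws4_request"):
        k = hmac.new(k, msg.encode(), hashlib.sha256).digest()
    return k

def signed_post_policy(secret: str, date: str, region: str) -> dict:
    # Conditions are baked into the policy itself: the client can only
    # upload under this prefix and within this size range.
    policy = {
        "expiration": "2030-01-01T00:00:00Z",
        "conditions": [
            ["starts-with", "$key", "uploads/"],
            ["content-length-range", 0, 10 * 1024 * 1024],  # max 10 MB
        ],
    }
    encoded = base64.b64encode(json.dumps(policy).encode()).decode()
    key = sigv4_key(secret, date, region, "s3")
    signature = hmac.new(key, encoded.encode(), hashlib.sha256).hexdigest()
    return {"policy": encoded, "x-amz-signature": signature}

form = signed_post_policy("example-secret", "20300101", "us-east-1")
print(form["x-amz-signature"])
```

The storage service re-derives the signature from the posted policy and rejects any upload whose form fields violate the conditions, which is what lets you skip the in-between upload server.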
also some community tools: https://github.com/supabase-community/firebase-to-supabase
We often help companies migrate from Firebase to Supabase - usually they want to take advantage of Postgres with similar tooling.
Also my sympathies for having to support the so-called "S3 standard/protocol".
I hope you like the addition - the implementation is all open source in the Supabase Storage server.
I'd happily pay a 50% markup for the sake of having everything in one place.
we have a built-in CDN[0] and we have some existing integrations for transactional emails [1]
[0] Smart CDN: https://supabase.com/docs/guides/storage/cdn/smart-cdn
[1] Integrations: https://supabase.com/partners/integrations
It'd be nice to have an integration with a domain registrar, like Gandi.net or Namecheap. Ideally with the cost coming through as an item in my Supabase bill.
There are some weird edges (well, really just faff) around auth with the JS library, but if nothing else they are by far the cheapest hosted SQL offering I can find. For any faff you don't want to deal with, there's an excellent database right there to let you roll your own (assuming you have a backend server alongside it).
https://docs.aws.amazon.com/redshift/latest/dg/t_loading-tab...
[1]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_...
yes, you can store all of these in Supabase Storage and it will probably "just work" with the tools that you already use (since most tools are s3-compatible)
Here is an example of one of our Data Engineers querying parquet with DuckDB: https://www.youtube.com/watch?v=diL00ZZ-q50
We're very open to feedback here - if you find any rough edges let us know and we can work on it (github issues are easiest)
We have yet to make a commitment to any one product. Having Postgres there is a big plus for me. I'll have to see about doing a test or two.
I don't think we'll ever build the underlying storage layer. I'm a big fan of what the Tigris[1] team have built if you're looking for other good s3 alternatives
[0] https://github.com/supabase/storage/blob/master/docker-compo...
[1] Tigris: https://tigrisdata.com
1. You buy the storage from Amazon so I pay you and don't interact with Amazon at all or
2. I buy storage from Amazon and you provide managed access to it?
My only real qualm at this point is that mapping JS entities using the JS DB API makes it hard to use camelCase field names, due to PG reasons I can't recall. I'm not sure what a fix for that would look like.
Keep up the good work.
Postgres requires you to "quoteCamelCase". There are some good JS libs to map between snake case and camel case. FWIW, I like a mix of both in my code: snake_case indicates it's a database property, and camelCase indicates it's a JS property. Try it out - it might grow on you :)
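The mapping those libs do is simple enough to sketch in a few lines (shown in Python here, though the JS versions work the same way):

```python
import re

def snake_to_camel(name: str) -> str:
    """user_name -> userName (for the application side)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

def camel_to_snake(name: str) -> str:
    """userName -> user_name (for unquoted Postgres columns)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

print(snake_to_camel("created_at"))   # createdAt
print(camel_to_snake("createdAt"))    # created_at
```

Keeping columns snake_case and converting at the boundary avoids ever having to double-quote identifiers in SQL.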
Does this mean that Supabase (via S3 protocol) supports file download streaming using an API now?
As far as I know, it was not achievable before and the only solution was to create a signed URL and stream using HTTP.
The good news is that the Standard API also supports streaming!
I was hesitant to use triggers and PG functions initially, but after I got my migrations sorted out, it's been pretty awesome.
I think if there were a tightly integrated framework for managing the state of all of these triggers, views, functions, and sprocs through source control, and integrating them into the normal SDLC, it would be a more appealing sell for complex projects.
You can then take it a step further by opting in to Branching [2] to better manage environments. We just opened up the Branching feature to everyone [3].
[1]: https://supabase.com/docs/guides/cli/local-development#datab... [2]: https://supabase.com/docs/guides/platform/branching [3]: https://supabase.com/blog/branching-publicly-available
I wish they would offer a plan with just the pg database.
Any news on pricing of Fly PG?
We are actively working on our Fly integration. At the start, the pricing is going to be exactly the same as our hosted platform on AWS - https://supabase.com/docs/guides/platform/fly-postgres#prici...
You can serve HTML content if you have a custom domain enabled. We don't plan to do anything beyond that - there are already some great platforms for website hosting, and we have enough problems to solve with Postgres
[1]: https://supabase.com/docs/guides/database/webhooks [2]: https://supabase.com/docs/guides/functions [3]: https://supabase.com/docs/guides/storage/schema/design
https://developers.cloudflare.com/r2/buckets/event-notificat...
I know Flutterflow's Firebase integration is a bit more polished so hopefully we can work closer with the FF team to make our integration more seamless
At Milvus, we've integrated S3, Parquet, and other formats to make it possible for developers to use their data no matter what they use.
For those who have used both, how do you find the performance and ease of integration compares between Supabase and other solutions like Milvus that have had these features for some time?