If they want it for local dev work, that's pretty different from wanting a high-performance, air-gapped object store without rewriting clients.
They seem to know what they're doing (having complained about a methodology problem in MinIO), and yet don't personally want to throw their hat in the ring, nor, it seems, pay anyone...
Context matters!
> I just want S3. My needs are pretty basic. I don't need to scale out. I don't need replication. I just need something that can do S3 and is reliable and not slow.
What does "reliable" mean without replication and with no mention of backups?
> I just need something that can do S3 and is reliable and not slow.
Oh, simply that. I'm a simple man, I just need edge-delivered CDN content that never fails and responds within 20ms.
If you split the storage API out into an interface describing how your program actually interacts with the service, then write an S3 backend and an FS backend, you can plug in whichever backend you want and keep the application code agnostic.
Personally I end up there anyway, since testing means specifying the third-party failure modes.
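A rough sketch of what that interface tends to end up looking like (all names here are made up for illustration; the S3 side would wrap whatever SDK or client you already use):

```
package storage

import (
	"context"
	"io"
	"os"
	"path/filepath"
)

// ObjectStore is the only surface the application sees. S3, the local
// filesystem, or an in-memory fake for tests all hide behind it.
type ObjectStore interface {
	Put(ctx context.Context, key string, body io.Reader) error
	Get(ctx context.Context, key string) (io.ReadCloser, error)
	Delete(ctx context.Context, key string) error
}

// fsStore is the filesystem backend; an s3Store would satisfy the same
// interface using the AWS SDK (or any S3-compatible server).
type fsStore struct{ root string }

func (s fsStore) Put(ctx context.Context, key string, body io.Reader) error {
	path := filepath.Join(s.root, filepath.FromSlash(key))
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, body)
	return err
}

func (s fsStore) Get(ctx context.Context, key string) (io.ReadCloser, error) {
	return os.Open(filepath.Join(s.root, filepath.FromSlash(key)))
}

func (s fsStore) Delete(ctx context.Context, key string) error {
	return os.Remove(filepath.Join(s.root, filepath.FromSlash(key)))
}
```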
Like it or not, S3 has become the de facto API for object storage for many developers. From the operations side of things, managing files is easier and already taken care of by your storage solution, be it a SAN, NAS or something entirely different. Being able to back up and manage whatever is stored in S3 with your existing setup is a direct saving.
If you actually use a large subset of S3's features this might not be a good solution, but in my experience you have a few buckets and a few limited ACLs and that's it.
- Lets me deploy stateless containers easily
- Lets me leverage the NAS for local redundancy and a more centralized place to do backups
- When a project grows it’s easy to promote it to use a hosted S3
- Local S3 becomes a target for Litestream and Restic
- Developing against the local FS and then handling file storage separately is a huge source of friction, unless I’m using something like Rails that already has a good abstraction
That repo is a fork of this project: https://github.com/johannesboyne/gofakes3
They bill it as being for testing, but it works great if all you want is a no-fuss S3-compatible API on top of a filesystem. I've run it on my NAS for a few years now to provide a much faster transfer protocol compared to SMB.
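For anyone curious, embedding it in a small Go program is roughly this much code (in-memory backend shown; the repo also ships a filesystem-backed backend); this is from memory of the README, so treat it as a sketch:

```
package main

import (
	"log"
	"net/http"

	"github.com/johannesboyne/gofakes3"
	"github.com/johannesboyne/gofakes3/backend/s3mem"
)

func main() {
	// In-memory backend for demonstration; swap in the filesystem backend
	// from the same repo to persist objects on disk.
	backend := s3mem.New()
	faker := gofakes3.New(backend)

	// faker.Server() returns an http.Handler speaking the S3 API.
	log.Fatal(http.ListenAndServe("127.0.0.1:9000", faker.Server()))
}
```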
It doesn't need to care about the POSIX mess, but there are whole swathes of features many implementations miss or only partially support, both on the frontend side (serving files with the right headers, or with the right authentication) and on the backend (user/policy management, legal hold, versioning, etc.).
It gets even messier when migrating. For example, moving your backups to garagefs loses versioning, which means that if the S3 secret used to write backups gets compromised, your backups are gone, whereas on an implementation that supports versioning you can just roll back.
Similarly with credentials: some will give you a secret and login but won't allow setting your own, so you'd have to re-key every device using them; some will allow import, but only in a certain format, so you can restore from a backup but not migrate from other software.
QS3 (Quite Simple Storage Service) for the bare bones: bucket/object CRUD, and maybe multipart uploads and presigned URLs.
S3 for Object Tagging, Access Control Lists, etc.
S3E (enterprise? extended? elaborate?) for Object Lock & Retention (WORM compliance, Legal Holds), Event Notifications and so on.
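Even the "maybe" items in the QS3 tier are small from the client's point of view. A presigned GET, for example, is just a signed URL with an expiry; here is a hedged sketch using the AWS Go SDK v2 against a placeholder self-hosted endpoint (bucket, key and endpoint are made up):

```
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion("us-east-1"))
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.BaseEndpoint = aws.String("http://localhost:9000") // placeholder self-hosted server
		o.UsePathStyle = true
	})

	// The server side only has to validate the SigV4 query parameters and
	// the expiry; no ACLs, tagging or locking involved.
	presigner := s3.NewPresignClient(client)
	req, err := presigner.PresignGetObject(ctx, &s3.GetObjectInput{
		Bucket: aws.String("my-bucket"),
		Key:    aws.String("some/object.txt"),
	}, s3.WithPresignExpires(15*time.Minute))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(req.URL)
}
```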
* S4: Stupid/Silly Simple Storage Service
* S3: Simple Storage Service
* S2: Storage Service
Either you are doomed to reimplement and rediscover the complexity on your own, or you change your requirements to fit a narrower problem domain to avoid (some of) the complexity.
I'm not saying that there weren't reasons to add these functions to the protocol, but if your aim is minimalism, then you can do _much_ better, and I think there is a real benefit in having a bare bones protocol that anyone could implement in many contexts.
If they just want a web file interface, then a web server with simple auth and WebDAV would work more than well enough.
The problem is that they then go on to talk about lots of projects that all have POSIX interfaces, which is slap bang in shared-filesystem land.
S3 is not a filesystem, and nothing shows that more than when you use it as an object store _for_ a filesystem.
Depending on the access requirements, if you're doing local-to-local, then NFSv4 is probably more than enough. Unless you care about file locking (unlucky, you're in the shit now).
SeaweedFS and RustFS both look nice though; last I checked, Zenko was kinda abandoned.
Also pretty simple when I can launch a container of whatever I need (albeit bootstrapping, e.g. creating buckets and access keys, should probably be done with env variables, the same way you can initialize a MySQL/MariaDB/PostgreSQL instance in containers) and don't have to worry too much about installing or running stuff directly on my system. As for unsupported features: I just don't do the things that aren't supported.
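The bootstrap step itself is tiny once the env variables are there; a rough sketch of an idempotent create-bucket init step using the AWS Go SDK v2 (the S3_ENDPOINT and S3_BUCKET variable names are made up; credentials come from the usual AWS_* variables):

```
package main

import (
	"context"
	"log"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()

	// Credentials are read from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
	cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion("us-east-1"))
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.BaseEndpoint = aws.String(os.Getenv("S3_ENDPOINT")) // e.g. http://minio:9000
		o.UsePathStyle = true // most self-hosted servers want path-style requests
	})

	bucket := os.Getenv("S3_BUCKET")
	if _, err := client.CreateBucket(ctx, &s3.CreateBucketInput{Bucket: aws.String(bucket)}); err != nil {
		// An already-exists error is fine for an idempotent bootstrap.
		log.Printf("create bucket %q: %v", bucket, err)
	}
}
```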
It strikes me as a classic case of "we need all the interested people to pull in one project, not each start their own". AI may have made this worse than ever.
Every month there's a post of "I just want a simple S3 server" and every single one of them has a different definition of "simple". The moment any project overlaps between the use cases of two "simple S3" projects, they're no longer "simple" enough.
That's probably why hosted S3-like services will exist even if writing "simple" S3 servers is so easy. Everyone has a different opinion of what basic S3 usage is like and only the large providers/startups with business licensing can afford to set up a system that supports all of them.
And every few weeks in the cooking subreddit we get a new person talking about a new soup they made. Just think if we put all 1000 of those cooks in one kitchen with one pot, we'd end up with the best soup in the world.
Anyway, we already have "the one" project everyone can coalesce on: CephFS. If all the redditors actually hopped into one project, it would end up as an even more complex, difficult-to-manage mess, I believe.
Or like lodash custom builds.
I don't have a horse in this race, but I've had pretty reasonable results using SeaweedFS for GitHub Actions caching. RustFS is on my list to test next, and a teammate is quite keen on Garage.
S3 is simple for the users, not the operators. To replicate something like S3 you need to manage a lot of parts and make a lot of decisions. The design space is huge:
Replication: RAID, distributed copies, distributed erasure codes...
Coordination: centralized, centralized with backup, decentralized, logic in client...
How to handle huge files: not at all, the client concatenates them, a coordinator node concatenates them...
What the network will look like: local networking, WAN, a mix. Slow or fast?
Nature of storage: 24/7 or sporadically connected.
How to handle network partitions, pick CAP sides...
Just for instance: network topology. In your own DC you may say each connection has the same cost. In AWS you may want connections to stay in the same AZ, use certain IPs for certain source-destination pairs to get cheaper pricing, and so on...
I’ve been at a couple companies where somebody tried putting an S3 interface in front of an NFS cluster. In practice, the semantics of S3 and NFS are different enough that I’ve had to then deal with software failures. Software designed to work with S3 is designed to work with S3 semantics and S3 performance. Hook it up to an S3 API on what is otherwise an NFS server and you can get problems.
“You can get replication with RAID” is technically true, but it’s just not good enough in most NFS systems. S3 style replication keeps files available in spite of multiple node failures.
The problems I’m talking about arise because when you use an S3-compatible API on your NFS system, it’s often true that you’re rolling the dice with three different vendors—you have the storage appliance vendor, you have the vendor for the software talking to S3, and you have Amazon who wrote the S3 client libraries. It’s kind of a nightmare of compatibility problems in my experience. Amazon changes how the S3 client library works, the change wasn’t tested against the storage vendor’s implementation, and boom, things stop working. But your first call is to the application vendor, and they are completely unfamiliar with your storage appliance. :-(
NFS is just an interface. At the end of the day it's on top of an FS. It's entirely possible and sometimes done in practice to replicate the underlying store served by NFS. As you would expect there are several means of doing this from the simple to the truly "high-availability."
I mean you can; it would simplify the locking somewhat.
But if you are doing file sharing for apps inside a network you manage, just use NFS, and maybe worry about the locking later.
It's a little like asking why you'd use SQL.
And SQL is also very important. And yet, if somebody said "I need to store data, but it's not relational, and I just need 1000 rows, what's the best SQL solution," I would still ask why exactly they needed SQL. They might have a good reason (for example, SQLite can be a weirdly good way to persist application data), but I don't know it yet. That's why I asked.
And I want things like backup, replication, scaling, etc. to be generic.
I wrote a git library implementation that uses S3 to store repositories for example.
- MinIO [3] is somewhat deprecated and not really open source anymore. Its future is uncertain.
- GarageHQ [4] looks pretty great and I wish I could have used it, but it is not yet feature-complete with the S3 protocol and is specifically missing versioning (I reported this [5])
- Zenko Scality works out of the box; it is a bit too "big" for my context (aimed at thousands of parallel users) and uses 500 MB of memory; but it does the job for now.
I posted my compose file here [6]. Since then (~months ago), it has worked really well and I am happy with Zenko Scality S3.
[1]: https://github.com/gristlabs/grist-core
[2]: https://github.com/scality/cloudserver
[3]: https://www.min.io/
[4]: https://garagehq.deuxfleurs.fr/
[5]: https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/166
[6]: https://github.com/scality/Zenko/discussions/1779#discussioncomment-15869532

last year they had a security vulnerability where they allowed a hardcoded "rustfs rpc" token to bypass all authentication [0]
and even worse, if you read the resulting reddit thread [1] someone tracked down the culprit commits - it was introduced in July [2] and not even reviewed by another human before being merged.
then the fix 6 months later [3] mentions fixing a different security vulnerability, and seemingly only fixed the hardcoded token vulnerability by accident. that PR was also only reviewed by an LLM, not a human.
0: https://github.com/rustfs/rustfs/security/advisories/GHSA-h9...
1: https://www.reddit.com/r/selfhosted/comments/1q432iz/update_...
That test matrix uncovered that post policies were only checked for existence and a valid signature, not whether the request actually conforms to the signed policy. That was an arbitrary object write, resulting in CVE-2026-27607 [2].
In the very first issue for this bug [3], it seemed that the authors of the S3 implementation didn't know the difference between the content-length of a GetObject and the content-length-range of a PostObject. That was kind of a bummer and leads me to advise all my friends not to use rustfs, though I like what they are doing in principle (building a Minio alternative).
[1]: https://github.com/nikeee/lean-s3
[2]: https://github.com/rustfs/rustfs/security/advisories/GHSA-w5...
[3]: https://github.com/rustfs/rustfs/issues/984
* create a rustfs user
* run rustfs from root via systemd, but with a bunch of privileges removed
* write logs into /var/logs/ instead of /var/log
Looks like someone told some LLM to make docs about running it as a service and never looked at the output.
Edit: Minio is written in Go, and is AGPL3... fork it (publicly), strip out the parts you don't want, run it locally.
Then I decided to run the Ceph S3 test suite against it. So many issues. I think it passed 3 tests on the first run, out of about a hundred. It took another couple of hours to get it to a state that is even vaguely passable for non-production use.
I got something vaguely workable, but even after many hours I can't say I super trust it.
> implementation can't be that complicated.
S3 has a fair bit of bloat in the spec that nobody seems to use, and it's not clearly delineated which parts are core and which are, ahem, optional. I ended up relying on the LLM to figure out what's core, and that ended up missing stuff too, so it needed a couple of feature iterations as well.
I also had quite a hard time choosing one for my homelab. Settled on Garage too, but why is it that complicated? I just want an "FS API" for remote storage in my app, over HTTP, so that I don't have to do a weird custom setup just to have a network share and handle everything manually.
```
docker exec $CONTAINER $BIN key import --yes "$S3_ACCESS_KEY_ID" "$S3_SECRET_ACCESS_KEY"
```
Most notably, the PutObject operation (which had always been a pain in the ass on HDD with MinIO) is performing well now, even with many small objects.
There is a natural synergy in the gateway storing its metadata in xattrs and using a ZFS special vdev with dnodesize=auto to keep the entire ZFS+S3 metadata on fast media.
The latency impact of the gateway itself can be further minimized by running multiple instances of the gateway pinned to CPU cores, all behind an HAProxy load balancer communicating with them over Unix domain sockets.
Also, I think there is an abstraction mismatch with object stores that store objects with a 1:1 mapping into the filesystem. Obvious issues are that you only get good listing performance with '/' as the delimiter, and things like "keys with length up to 1024 bytes" break.
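To make the mismatch concrete, the naive mapping those implementations use boils down to something like this hypothetical helper, and that is exactly where the breakage comes from:

```
package naive

import "path/filepath"

// keyToPath is the 1:1 key-to-path mapping (hypothetical, for illustration).
// It fails in predictable ways: S3 allows keys up to 1024 bytes while a
// single path component on most filesystems maxes out around 255 bytes;
// keys containing "//", "." or ".." don't round-trip cleanly; and a
// listing with any delimiter other than "/" can no longer be answered by
// a cheap directory walk.
func keyToPath(root, key string) string {
	return filepath.Join(root, filepath.FromSlash(key))
}
```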
So I too just want simple S3. Minio used to be the no-brainer. I'll check out RustFS as well.
It does not sound hard (although it is hard for me!). It sounds like it should be some LinuxServer.io container, doesn't it? At this point S3 is just as standard as WebDAV or anything, right?
Can't say I super trust it, so I will probably roll it out only for low-stakes stuff. It does pass the Ceph S3 test suite though, and it is Rust, so in theory somewhat safe-ish… maybe.
https://canonical-microceph.readthedocs-hosted.com/stable/tu...
Impressive as hell software and I am so glad to have it. But man! The insistence on mountains of RAM per TB and on massive IO is intimidating.
It's also not that useful even if you have enough machines to run it properly.
NVMe and ZFS are fast enough for virtually anything now. With snapshots and snapshot sends you get decent backups for half the hardware cost of Ceph.
Disclaimer: I work at HF
"firt commit" [0] was less than a month ago and added 51k lines in a single commit.
0: https://github.com/floci-io/floci/commit/61433f59ab995e9eaeb...
Very good!
You failed to answer why you even need S3... Why not a filesystem, full stop? The entire point of S3 is that it's distributed.
10 users with simple needs = complex software.