- https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver
- https://github.com/ofek/csi-gcs
Here is the initial commit: https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/c...
Notice, for example, not just the code but also the associated files. The Dockerfile blatantly copies the one from my repo, down to the dual license I chose because I was very into Rust at the time. Or take a look at the deployment examples, which use Kustomize, a tool I like but one that is quite uncommon; most Kubernetes projects provide Helm charts instead.
They were most certainly aware of the project, because Google reached out to discuss potential collaboration and then never responded: https://imgur.com/a/KDuf9mj
Edit: I see you said it’s dual licensed. From the look of it both allow Google or any other company to copy and reuse code, so what are you upset about?
A lot of people treat licensing emotionally (e.g. choosing the WTFPL, picking licenses that feel good, or copying whatever they saw in another project); business people, however, are very logical and will unfortunately exploit this.
The irony is that Google probably would not have done this if the codebase just omitted a license entirely. When I worked there, they wouldn't allow OSS with no license.
edit: as I express in a sibling comment this act is legally allowed of course, but is bad practice
I am a contributor who works on the Google Cloud Storage FUSE CSI Driver project. The project is partially inspired by your CSI implementation. Thank you so much for the contribution to the Kubernetes community. However, I would like to clarify a few things regarding your post.
The Cloud Storage FUSE CSI Driver project does not have “in large part copied code” from your implementation. The initial commit you referred to in the post was based on a fork of another open source project: https://github.com/kubernetes-sigs/gcp-filestore-csi-driver. If you compare the Google Cloud Storage FUSE CSI Driver repo with the Google Cloud Filestore CSI Driver repo, you will notice the obvious similarities, in terms of the code structure, the Dockerfile, the usage of Kustomize, and the way the CSI is implemented. Moreover, the design of the Google Cloud Storage FUSE CSI Driver included a proxy server, and then evolved to a sidecar container mode, which are all significantly different from your implementation.
As for the Dockerfile annotations you pointed out in the initial commit, I did follow the pattern in your repo because I thought it was the standard way to declare the copyright. However, it didn't take me too long to realize that the Dockerfile annotations are not required, so I removed them.
Thank you again for your contribution to the open source community. I have included your project link on the readme page. I take the copyright very seriously, so please feel free to directly create issues or PRs on the Cloud Storage FUSE CSI Driver GitHub project page if I missed any other copyright information.
Are you saying you have an issue with them copying your MIT licensed code?
It makes me sad that no one here cares whether your accusation is true, and I'd expect you to provide more convincing evidence. From the looks of it, the accusation isn't even true. It's not fair to those contributors, man; I hope you can apologize.
Per the licenses they can copy, but they must maintain attribution, which has not been done.
For certain applications that consistently read limited subsets of the filesystem, this can be mitigated somewhat by the disk cache, but for applications that would thrash the cache, cloud buckets are simply not a good storage backend if you desire disk-like access.
What I would really like to see is a two-tier cache system: most recently accessed files are cached to RAM, with less recently accessed files spilling over to a disk-backed cache. That would open up a world of additional applications whose useful cache size exceeds practical RAM amounts.
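A minimal sketch of that two-tier idea, assuming a hypothetical `TwoTierCache` class (a real implementation would need locking, size-based rather than count-based limits, and tuned eviction):

```python
import os
from collections import OrderedDict

class TwoTierCache:
    """LRU cache: hot entries live in RAM, evictions spill to a disk directory."""

    def __init__(self, ram_capacity, disk_dir):
        self.ram_capacity = ram_capacity  # max number of RAM-resident entries
        self.disk_dir = disk_dir
        self.ram = OrderedDict()          # key -> bytes, most recently used last

    def _disk_path(self, key):
        return os.path.join(self.disk_dir, key)

    def put(self, key, data):
        self.ram[key] = data
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            # Evict the least recently used entry to the disk tier.
            old_key, old_data = self.ram.popitem(last=False)
            with open(self._disk_path(old_key), "wb") as f:
                f.write(old_data)

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)     # refresh recency
            return self.ram[key]
        path = self._disk_path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                data = f.read()
            os.remove(path)
            self.put(key, data)           # promote back to the RAM tier
            return data
        return None                       # miss: would fall through to cloud storage
```

With a capacity of two RAM entries, a third `put` spills the coldest entry to disk, and a later `get` on it promotes it back, evicting something else.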
Sure you're not going to use this as a consumer in place of a local disk, nor are you going to use this as part of your web app.
But there are lots of situations in reporting, batch/cron jobs, data processing, and general file administration where it's incredibly easier to use the file system interface than to use an HTTP API via a cloud storage library. Which FUSE is a godsend for. The latency doesn't matter in these cases for one-off things or scripts that already take seconds/minutes/hours anyways.
So no this isn't niche or a toy. It's a fantastic production tool for a lot of different common uses. It's not for everything but nothing is. Use the right tool for the job.
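To make the convenience argument concrete, here is the kind of batch script that becomes trivial with a FUSE mount: plain `pathlib` code that works identically on a local directory or on a mounted bucket (the mount path is hypothetical), with no client library, pagination, or auth plumbing in sight:

```python
from pathlib import Path

def total_csv_rows(mount_point):
    """Count data rows across all CSVs under a directory tree.

    `mount_point` can be a local directory or a gcsfuse mount such as
    /mnt/my-bucket (hypothetical path); the code is identical either way,
    which is the appeal over hand-written HTTP/API listing and download calls.
    """
    total = 0
    for path in Path(mount_point).rglob("*.csv"):
        with path.open() as f:
            total += sum(1 for _ in f) - 1  # subtract the header line
    return total
```

For a one-off report that already takes minutes, the extra per-file latency of the FUSE layer is lost in the noise.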
I agree with you, I would prefer a local disk to one with 100+ msec of latency and local storage prices are at the point where the right answer is probably "just add local storage."
But I watch with some sympathy the small army of sysadmins (something like 15-20 people) responsible for managing the 3000+ Macs our company uses, and remember the two-person staff that supported the 1500+ diskless workstations from my years at a sadly defunct mini-supercomputer manufacturer. It was quite nice: you could go to any machine, log in, and your desktop would follow you. I'm told doing the same thing with MSFT requires 10-20 people just to manage the AD hardware (though as a unix-fan, I hang out with other unix-fans who are notoriously rude to MSFT, so maybe it's only 5-10 people needed to manage the AD instance.)
Just copying the file to a mounted bucket would make this a lot easier.
Then again, how does one get the metadata of the uploaded file?
My company uses GCSFuse for ad-hoc analysis/visualization of large but poorly structured output from our lifesciences jobs and it works just fine for that.
Is there any sort of Linux HSM (Hierarchical Storage Manager)? I haven't seen any and have been a bit surprised nothing has really developed there. They can manage putting hot data in RAM or on SSDs, colder or larger data on spinning rust, and deep-frozen data on a tape silo or cloud storage...
Some NAS devices and RAID cards can support two-tier caching or data migration using SSDs, where hot or highly random data (usually identified by smaller write sizes) goes to the SSDs and can later migrate to the spinning discs.
I've done some "poor mans" version of this using LVM, where I can "pvmove" blocks of a logical volume between spinning discs and SSDs, which is pretty slick, but a very crude tool.
Take a look at the CERN paper https://iopscience.iop.org/article/10.1088/1742-6596/331/5/0... as they have a large use case.
AWS Premium Support wisely advised me against it, not just because of latency but also because the abstraction makes /far/ more API calls than a native solution would.
After a bit of testing to confirm, I switched to using native API calls. That code was easy to write and the performance was great. I've been wary of cloud FUSE adapters ever since.
This is really hard to get right if the origin cloud storage is anything other than immutable. Otherwise you're in for a world of cache invalidation and consistency pain.
I've gradually come round to the other opinion: there should be devices that sit on the PCIe/NVMe bus and provide a blob storage API rather than a block one, and there should be an operating system blob API that is similar to but not identical to the filesystem one.
I'd be curious to see how it works running on EC2, especially with an S3 endpoint in the VPC. Although I still think you'd be better suited by using S3 as an object store, given the option to build it right.
1. Goofys for S3 FUSE
2. Catfs for local disk caching
3. Linux caches in memory
4. Mmap file means processes share it
5. One device then exports this over the network to other machines, each of which have an application layer disk cache.
6. Machines are linked via 10 GigE (we use SFP+).
Overall the goofys and catfs guy (kahing) wrote very performant software. Big fan.
Isn't this how most servers run normally? (parts of) files which are accessed are in page cache, the rest is on "disk"
gcsfuse latency is ok as it embodies "infinite sync & persistence" ;)
If it performs well there, I could imagine that being pretty useful.
I don’t even want to know how bad the latency would be outside of a cloud VM.
VMs and disk space I understand completely: having machines on-prem is too much of a hassle and the price isn't that bad. But for stuff like this, and for managed services, databases especially, you're just getting scammed.
From reading the docs, it looks very similar to `rclone mount` with `--vfs-cache-mode off` (the default). The limitations are almost identical.
* Metadata: Cloud Storage FUSE does not transfer object metadata when uploading files to Cloud Storage, with the exception of mtime and symlink targets. This means that you cannot set object metadata when you upload files using Cloud Storage FUSE. If you need to preserve object metadata, consider uploading files using gsutil, the JSON API, or the Google Cloud console.
* Concurrency: Cloud Storage FUSE does not provide concurrency control for multiple writes to the same file. When multiple writes try to replace a file, the last write wins and all previous writes are lost. There is no merging, version control, or user notification of the subsequent overwrite.
* Linking: Cloud Storage FUSE does not support hard links.
* File locking and file patching: Cloud Storage FUSE does not support file locking or file patching. As such, you should not store version control system repositories in Cloud Storage FUSE mount points, as version control systems rely on file locking and patching. Additionally, you should not use Cloud Storage FUSE as a filer replacement.
* Semantics: Semantics in Cloud Storage FUSE are different from semantics in a traditional file system. For example, metadata like last access time are not supported, and some metadata operations like directory renaming are not atomic. For a list of differences between Cloud Storage FUSE semantics and traditional file system semantics, see Semantics in the Cloud Storage FUSE GitHub documentation.
* Overwriting in the middle of a file: Cloud Storage FUSE does not support overwriting in the middle of a file. Only sequential writes are supported.
* Access: Authorization for files is governed by Cloud Storage permissions. POSIX-style access control does not work.
However rclone has `--vfs-cache-mode writes` which caches file writes to disk first to allow overwriting in the middle of a file and `--vfs-cache-mode full` to cache all objects on a LRU basis. They both make the file system a whole lot more POSIX compatible and most applications will run using `--vfs-cache-mode writes` unlike `--vfs-cache-mode off`.
And of course rclone supports s3/azureblob/b2/r2/sftp/webdav/etc/etc also...
I don't think it is possible to adapt something with cloud storage semantics to a file system without caching to disk, unless you are willing to leave behind the 1:1 mapping of files seen in the mount to object in the cloud storage.
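The underlying problem can be modeled in a few lines. With a strict 1:1 file-to-object mapping and no partial PUT, editing a single byte forces a full download and a full re-upload; a disk write cache (like rclone's `--vfs-cache-mode writes`) exists to batch many such edits locally and pay that whole-object upload once on close. A toy model (the class and function names are illustrative, not any real SDK):

```python
class MockObjectStore:
    """Toy stand-in for GCS/S3: objects are immutable blobs, replaced whole."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = bytes(data)   # whole-object upload

    def get(self, key):
        return self.objects[key]

def patch_byte(store, key, offset, value):
    """With a 1:1 file<->object mapping, editing one byte means
    re-uploading the entire object; there is no partial write."""
    data = bytearray(store.get(key))      # full download
    data[offset] = value
    store.put(key, data)                  # full re-upload
    return len(data)                      # bytes transferred back up
```

Give up the 1:1 mapping (chunked layouts, log-structured writes) and partial edits get cheap, but the bucket contents stop being directly readable as files.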
> export GCSFUSE_REPO=gcsfuse-`lsb_release -c -s`
We don't use drive to store other files. Actually, we don't really "store files" since almost everything we need is remote.
See for instance this discussion: https://news.ycombinator.com/item?id=13561096
But if you're using the google cloud like you might use Box.Net or DropBox, it seems fine for light usage.
A more complex layer like https://objectivefs.com/ (based on the S3 API) would be more useful, although I would've expected the cloud providers to scale their own block-store/SANs backed with object-stores by now.
Adds a DBMS or key-value store for metadata, making the filesystem much faster (POSIX, small overwrites don't have to replace a full object in the GCS/S3 backend).
Almost certainly a better solution if you want to turn your object storage into a mountable filesystem, with the (big) caveat that you can't access the files directly in the bucket (they are not stored transparently).
Choosing an appropriate solution in this space still depends on what you need to do with the storage; a few other options are MooseFS (https://github.com/moosefs/moosefs), SeaweedFS (https://github.com/seaweedfs/seaweedfs), Curve (https://github.com/opencurve/curve), and GeeseFS (https://github.com/yandex-cloud/geesefs).
This seems like a big limitation?
You could split the file into smaller chunks and reassemble it at the application layer. That way you limit the cost of changing any byte to the chunk size.
That could also support inserting or removing a byte: you'd have a new chunk of DEFAULT_CHUNK_SIZE+1 (or -1), and you'd split and merge chunks when they get too large or too small.
Of course at some point if you are using a file metaphor you want a real file system.
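A rough sketch of the fixed-size variant of that chunking idea (the class name and the tiny chunk size are illustrative; the split/merge logic needed for inserts and removals is omitted):

```python
class ChunkedFile:
    """Store a file as independent chunk objects so a small edit
    rewrites one chunk instead of the whole object."""

    def __init__(self, data, chunk_size=4):  # real systems use e.g. megabytes
        self.chunk_size = chunk_size
        self.chunks = [bytearray(data[i:i + chunk_size])
                       for i in range(0, len(data), chunk_size)]
        self.rewritten = 0  # bytes re-uploaded so far, for cost accounting

    def write_byte(self, offset, value):
        idx, pos = divmod(offset, self.chunk_size)
        self.chunks[idx][pos] = value
        self.rewritten += len(self.chunks[idx])  # only this chunk is re-uploaded

    def read(self):
        return b"".join(bytes(c) for c in self.chunks)
```

Changing one byte of an 8-byte file with 4-byte chunks re-uploads 4 bytes instead of 8; at real object sizes that's the difference between rewriting megabytes and rewriting gigabytes.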
Or is there a large group of programs that only ever write sequentially?
"Cloud Storage FUSE is available free of charge, but the storage, metadata, and network I/O it generates to and from Cloud Storage are charged like any other Cloud Storage interface. In other words, all data transfer and operations performed by Cloud Storage FUSE map to Cloud Storage transfers and operations, and are charged accordingly."
For example, Cloud Storage never moves or renames your objects; it copies the object and deletes the original instead. This can end up costing quite a lot if you're storing data in anything other than "standard storage," because of minimum storage duration charges.
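That rename-as-copy-plus-delete behavior is easy to model. In the toy bucket below (class and method names are illustrative, not the real client library, and the operation tally is not the actual billing model), every "rename" shows up as two logged operations, and the delete is what can trigger minimum-storage-duration charges on colder storage classes:

```python
class ToyBucket:
    """Toy model of a bucket with no rename primitive."""

    def __init__(self):
        self.objects = {}
        self.ops = []  # operation log, to make the cost visible

    def copy(self, src, dst):
        self.objects[dst] = self.objects[src]
        self.ops.append(("copy", src, dst))    # one billable operation

    def delete(self, key):
        del self.objects[key]
        self.ops.append(("delete", key))       # another billable operation,
                                               # plus early-delete fees on
                                               # classes with minimum durations

    def rename(self, src, dst):
        self.copy(src, dst)   # a "rename" through the FUSE layer is really
        self.delete(src)      # a server-side copy followed by a delete
```

Renaming a whole directory tree multiplies this by the number of objects under the prefix, since each one is copied and deleted individually.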