5M item limit for Google Drive: File unable to generate or upload due to 403

(issuetracker.google.com)

165 points · throwaway2056 · 3y ago · 133 comments

86 comments · 19 top-level

duxup3y ago· 24 in thread

What is Google Drive... for?

I see folks describe Google Drive errors / limits like this and people describe "a business critical operational system" and I wonder. Is Google Drive even supposed to do this thing?

Not pointing the finger back at them, if Google doesn't make it clear they probably should, but at least as an individual I always see Google Drive as a single user cloud light system that allows for some sharing and organizational functions... but still not an "industrial" type cloud service.

QuercusMax3y ago

Disclaimer: I'm a Googler but I don't work anywhere near this stuff, and this is just my own personal opinion.

In my mind, Drive isn't a file sharing / storage system, but rather a document sharing / storage system. If you want to store large numbers of files, you should use Cloud Storage, which can handle that. Drive is the cloud equivalent of LAN shared folders - you know those janky NFS and SMB shares that were always a giant headache because of permissions and other nonsense? That's what Drive is trying to be.

Seems like they could do it without the jankiness, though.

stuartaxelowen3y ago

https://www.google.com/drive/ 's title does call it a "File Sharing Platform".

toast03y ago

> always a giant headache because of permissions and other nonsense? That's what Drive is trying to be.

It definitely lives up to that. I'm always getting sent links to files I can't open.

pyb3y ago

I don't think the average user would understand the nuance, though. (I don't think I do)

2 more replies

hedora3y ago

I’ve never had a problem with NFS, and modern filers would laugh at 3 million files.

I really dislike Google Drive. I can’t imagine people would use it at all if not for the fact that it’s bundled, monopoly style, with the office suite and gmail.

Since Google won’t ever fix it, I’m hoping they rapidly make it so crappy some other competitor crops up.

Of course, with our luck, it’ll be some Microsoft offering.

I wish software would stop getting worse.

duxup3y ago

That does seem how it operates. A lot of the UI is directed at document sharing.

That's what makes me do a double take when I hear about folks using it in some very creative ways that aren't what I think of as uses for Google Drive.

thinkingemote3y ago

Is there a limit on the number of files in cloud storage?

CaliforniaKarl3y ago

You don't have anything in your HN profile, so I can't be certain, but I'm guessing you haven't been in the Research/Education space? It's scary how much Google Drive is used for storage of stuff.

Robadob3y ago

I work at a university that is fully invested in Google Apps.

We did have unlimited storage, however last week they announced that's soon to change:

> A quota will soon be introduced on the amount of data that staff and students at the University can store in their Google account. This is part of a larger piece of work being carried out by the Google Workspace team in IT Services, to review the overall amount of storage being used by the University.

Makes me wonder if the timing is somehow related.

1 more reply

hirako20003y ago

+1 And, it's not like Google didn't make huge marketing pushes to enter that space and businesses at large. Gsuite, Classroom, ever increasing storage limits. There is even an education package offering "unlimited" storage.

andbberger3y ago

google drive has unlimited storage, they opened that pandora's box themselves. yes it's janky as hell, but I'm looking at tens of thousands per year to backup to cloud services for my lab's data

chimeracoder3y ago

> at least as an individual I always see Google Drive as a single user cloud light system that allows for some sharing and organizational functions... but still not an "industrial" type cloud service.

That may be how you view the product, but Google Drive absolutely is used as an enterprise cloud service by many large companies. And as a result, there are lots of applications that integrate with its API, etc., to serve those enterprise use cases.

duxup3y ago

I don't doubt people do it. Humans are industrious and users will push things to their limits.

I just wonder .... is that what Google built Drive for? / how much of that they were thinking of.

3 more replies

djha-skin3y ago

Google Drive has the "MS Excel" problem. It's made cloud storage really accessible to people, so they turn to it for everything, even stuff that's more appropriately put somewhere else. In Excel's case, it's database data. In Google Drive, it's files. My company uses it for video assets, because the media folks are really familiar with it, but we're moving it to S3 now because it's not got the chops for our seriously "larger" needs.

lambdaloop3y ago

Google drive is great for storing research data.

I work in a fly neuroscience lab and we use it to store all our electrophysiology and video data. Each person in the lab is storing on average 5TB of data, and the lab as a whole stores 100TB.

The graphical user interface combined with unlimited storage for Google Workspaces is essentially an unbeatable deal. Researchers can upload their data easily through the interface. Any custom solution based on S3 or equivalent would take some time to teach and more time to maintain. Also, we're paying about $200 / month total to store 100TB of data in the cloud, which is hard to beat with other services.

I tried setting up a single account for the whole lab once, but we ran into the above 5M file limit, so we just have individual accounts per researcher and it's mostly fine for now.

secabeen3y ago

> Also, we're paying about $200 / month total to store 100TB of data in the cloud, which is hard to beat with other services.

You should expect this to go away soon. I support science researchers and our unlimited storage option is going away in the coming months. Options for purchasing space are limited, and not cheap.

3 more replies

ukd13y ago

Do y'all back this up? If so, how?

1 more reply

rootsudo3y ago

It's "free" cloud hosting. You'd be surprised how often people mix their personal google account with work stuff. It annoys me greatly, but that's how it is.

There is general ignorance of what google does/is/isn't and how online storage works. I agree with the sentiment, but, I've seen many a thing from recipes, shopping lists to FPA/Proposals all hosted on google drive. Even people sell content that is hosted on google drive.

dragonwriter3y ago

> It's "free" cloud hosting.

Drive is paid (consumer via One, business via Workspace) cloud hosting with a free consumer tier.

Sohcahtoa823y ago

For personal use, Google Drive creates a way for me to have automatic backups of files I wouldn't want to lose in case of drive failure. It's also a way to easily share files between my desktop and my phone.

For professional use, I've usually seen it used as a document repository that can be shared by teams.

In both use cases, 5 million files seems like it'd be a hard limit to hit.

RandomBK3y ago

Depends on the backup. Accidentally backing up a couple of `node_modules` folders would be sufficient to hit the 5M limit.

1 more reply

Spooky233y ago

It’s a general purpose file and document store. People run significant organizations on Google Drive.

Like anything, people can push limits where it’s no longer productive to extend.

kevin_thibedeau3y ago

Its purpose is to prevent people disabling third party cookies.

lostmsu3y ago

Can you clarify? Is this about single sign-on from multiple sites to be able to store data to Google Drive?

1 more reply

danpalmer3y ago· 16 in thread

From reading the thread it sounds like this is 5M per user.

Google Drive appears to be targeted at "human" usage, i.e. people uploading or creating files. I would guess that this is also worked into the cost – that the assumptions of the amount of work a human can do are a part of the price formulation. The reporter of this bug seems to be using this as a storage backend for software though, which I don't believe is the intended use-case.

Looking at S3 pricing, just the storage is ~5x the cost of Google Drive, and then you need to add transfer and API calls on top of that.

I don't personally think that there are reasonable use-cases for human users with 5 million files. There may be some specialist software that produces data sets that a human might want to back up to Google Drive, but that software is unlikely to run happily on drive streamed files so even those would be unlikely to be stored directly on Drive.

(Disclaimer, I work at Google, not on Drive, this is my personal reading and interpretation of the public info, I don't have any inside info here)

Reason0773y ago

> "I don't personally think that there are reasonable use-cases for human users with 5 million files."

A 5 million file limit might be quite reasonable if you're paying for the basic, 100 GB storage tier. But Google Drive offers multiple tiers with up to 30 TB of storage. 30 million MB!

That means if your average file size is less than 6MB - very likely if you're storing JPEGs, audio files, text records, or whatever; you'll never be able to fill your Google Drive storage.

What's the average size of a file on a regular macOS or Windows HDD? It wouldn't surprise me if it's much less than 6MB. Shouldn't the file count limit scale with the storage limit?
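As a rough check of the arithmetic in this comment, here is a minimal sketch that computes the average file size needed to fill a plan without exceeding a 5M item cap. It assumes decimal units (1 TB = 1,000,000 MB), and the plan sizes listed are just the ones mentioned in this thread; the function name is made up for illustration.

```python
# Assumption: decimal units, matching the "30 million MB" figure above.
FILE_LIMIT = 5_000_000   # per-user item cap reported in the issue
MB_PER_TB = 1_000_000    # 1 TB = 1,000,000 MB (decimal)


def min_avg_file_size_mb(plan_tb: float, file_limit: int = FILE_LIMIT) -> float:
    """Average file size (MB) needed to fill plan_tb using at most file_limit files."""
    return plan_tb * MB_PER_TB / file_limit


# Plan sizes mentioned in the thread: 100 GB, 5 TB, 30 TB.
for plan_tb in (0.1, 5, 30):
    size = min_avg_file_size_mb(plan_tb)
    print(f"{plan_tb:>5} TB plan: average file must be >= {size:.2f} MB")
```

Anything smaller than that average means the item cap is hit before the storage quota is.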

mort963y ago

Your analysis of why this limit exists is probably correct.

However, when you're paying for a product, the limits should be disclaimed. If I'm investing in Google Drive, I should be able to easily see the limits which apply to the product I've bought; that means total storage caps, file count caps, bandwidth caps, and whatever else.

graypegg3y ago

While I agree with you that restrictions should be clear, this feels like a limit well beyond what anyone would encounter. Even abnormal usage with something like many big git repos with tons of little nested files inside the data directory isn’t going to hit 5M files unless you’re really committed to pushing until you find the invisible wall.

It would be pointless if your car came with a giant list of the melting point of every material used in the structure of your car so you know what the “real” temperature limits are. Oh boy, better not let the car get to 1450C or else the steel frame might melt!

2 more replies

Gigachad3y ago

If they listed every configuration to this level of granularity, the doc would be huge and no one affected would even see it. So the easiest way would still be to just pay for a month, try to use it as intended, and if you hit a limit, try another product.

1 more reply

danpalmer3y ago

That's a fair point. It looks like most limitations are documented, but I couldn't find a mention of the 5m file limit.

augusto-moura3y ago

I recently moved to using Google Drive as a backup drive for my computer. The tool I'm using for it, Restic[1], does file deduplication at the file-block level. Restic allows multiple hosts to upload to the same backend/repository. So far I already have 25k files, even though I started backing up only a few weeks ago.

I imagine that if you set up backups for multiple servers/personal computers, that could scale significantly.

[1]: https://restic.net/

Gigachad3y ago

> Restic currently defaults to a 16 MiB pack size.

Given this, it is impossible to exceed the 5M file limit with the top plan google offers of 30TB. You'd have to lower the pack size to 6MB or less.
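The pack-size claim can be sanity-checked with a short sketch. It assumes every pack is completely full at 16 MiB and that the plan holds 30 TB in decimal bytes; real repositories also contain partially filled packs plus index and snapshot files, so treat the result as a lower bound on the file count, not an exact figure.

```python
PLAN_BYTES = 30 * 10**12   # assumption: 30 TB plan, decimal bytes
FILE_LIMIT = 5_000_000     # item cap reported in the issue
MIB = 2**20


def packs_needed(total_bytes: int, pack_size_bytes: int) -> int:
    """Number of packs needed to hold total_bytes if every pack is full (ceiling division)."""
    return -(-total_bytes // pack_size_bytes)


default_packs = packs_needed(PLAN_BYTES, 16 * MIB)
break_even_bytes = PLAN_BYTES // FILE_LIMIT  # pack size at which 30 TB means exactly 5M packs

print(f"packs at 16 MiB: {default_packs:,}")
print(f"break-even pack size: {break_even_bytes / 10**6:.1f} MB")
```

Under these assumptions a full 30 TB repository is roughly 1.8M packs, well under the cap, and the break-even pack size works out to 6 MB.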

pretext-13y ago

Restic + rclone + GCP Service Account.

This is the way to go. Files will count towards the Service Account, not your account. If you use Google Workspace I strongly recommend using a Shared Drive (instead of sharing a folder in “My Drive”) so the files will be owned by your org, not by the Service Account (otherwise deleting the SA would result in the files being deleted as well, because in “My Drive” the creator of a file keeps ownership even when creating in a shared folder). If you have problems with the 400,000 files limit in a Shared Drive: create a new one, rclone has a “union” backend. It can be set to create new files in the new Shared Drive while still showing files from the other drive(s). Also Shared Drives have their own recent activity views so it doesn’t clutter your “my drive” view.

Create a SA in GCP and generate a JSON credentials file. Create a folder or Shared Drive via web and share it with the SA.

I’m using this setup for years with zero problems.

Example rclone config: https://pastebin.com/KdsFQz5K

Gigachad3y ago

I did some quick math on this. The top plan Google sells is 30TB, to reach 5M files in 30TB, you'd have to fill the entire thing with 0.75MB files. So yeah, if you are using Google drive as some kind of JSON blob storage, it isn't designed for your use case.

danpalmer3y ago

For 30TB I think it would be 6MB, but regardless, that appears to be per-user. If you're an individual needing 5m files, maybe you need a specialist storage solution, and if you're a business then that's one hell of a bus factor.

1 more reply

elcritch3y ago

Makes me wonder if the limit was actually as much for finding runaway internal code. I’m sure there are cases deploying something like Google Drive where code ran rampant making duplicates or something.

1 more reply

jjav3y ago

> I don't personally think that there are reasonable use-cases for human users with 5 million files.

% find $HOME -type f | wc -l

4969169

Almost 5M there.

fencepost3y ago

But are you human?

Fatnino3y ago

Minecraft saves are (were? Idk anymore) just big folders with tons of files representing each chunk of the world. These would be a common thing a kid might want to back up to drive and could potentially be zillions of files.

TazeTSchnitzel3y ago

What if you check out a few large git repos in a directory synced with Google Drive?

danpalmer3y ago

I thought about this, but I accidentally put the Webkit repo (~1m files) into Dropbox once and it didn't go well. Git needs a consistent view of the filesystem across many files to work, and that's not really possible with that many files stored in a sync system that works per-file like Drive does. I put Git in the category of specialist software that shouldn't be operated in Drive. By all means tar and backup, that should be effective, but there's no need to store all the individual files.

1 more reply

bobsmooth3y ago· 6 in thread

Just checked, my C drive has a bit more than 1 million files. What are you doing that you have 5 million separate files in your Google Drive?

jackson14423y ago

If you're using Google Earth Engine or a similar tool, it's generally highly recommended to use Google Drive as the storage backend for it - especially since Google sells Enterprise plans with "as much storage as you need."

Geospatial artifacts can be very large, and also can be spread across thousands and thousands of files (20+ zoom levels on Earth, with enough 256x256 tiles to cover the planet at each zoom level, for example).
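To get a feel for how fast tile counts grow, here is a minimal sketch. It assumes the common quadtree tiling convention used by web maps, where zoom level z covers the planet with 4^z tiles; the comment above does not say which scheme is used, so that is an assumption, and the function names are invented for illustration.

```python
def tiles_up_to(zoom: int) -> int:
    """Total tiles in a full pyramid from zoom 0 through `zoom`, assuming 4**z tiles per level."""
    return sum(4**z for z in range(zoom + 1))


def first_zoom_exceeding(limit: int) -> int:
    """Smallest zoom level at which the cumulative tile count exceeds `limit`."""
    zoom = 0
    while tiles_up_to(zoom) <= limit:
        zoom += 1
    return zoom


for z in (8, 10, 11, 12):
    print(f"zoom 0..{z}: {tiles_up_to(z):,} tiles")

print("5M exceeded at zoom", first_zoom_exceeding(5_000_000))
```

With one file per tile, a complete global pyramid passes 5M files long before it reaches 20 zoom levels.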

justin_oaks3y ago

Nobody should ever need more than 640K of RAM, right?

Just because your workflow doesn't involve lots of tiny files doesn't mean other people's don't.

Besides, most of those commenting in the issue are talking about their organization having hit the limit, not individuals.

kulahan3y ago

Why is an organization using a primarily-personal file storage/transfer system for real work

2 more replies

bobsmooth3y ago

As kulahan said, why are you using a personal backup solution for a business?

1 more reply

hirako20003y ago

Many use some rsync-like tools to use gdrive as a flat backup system. Perfectly fine use case. You would be over a million as a single user. Some accounts have thousands of users.

See some comments on the tracker, some orgs have a multi sites set up across an entire nation. I wouldn't be surprised if 10k+ volunteers at some NGOs are on that sort of plan and use it as their personal storage solution on top of all the things they sync for document counts heavy work.

Gigachad3y ago

You can easily reach this number if you are doing something like storing json/xml user data from some service. But I just can't imagine you could ever hit this limit as a normal user. And if you do, just zip some old stuff in to one file and carry on.

londons_explore3y ago· 4 in thread

5M items sounds suspiciously like one machine has to keep in RAM the complete list of a users files for certain operations. User accounts with large numbers of objects were probably causing those machines to OOM. 5M items, at about 1000 bytes per item (name, metadata, a few uuids, etc) is 5Gbytes. 5Gbytes is about the amount of RAM most tasks will be given.

I could imagine that some services within Google haven't been carefully designed - for example, perhaps the quota service reads a complete file listing into RAM, adds up the sizes, and then writes the available quota.

This is probably the easy fix, rather than redesigning every service to be able to stream objects correctly.
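The back-of-the-envelope estimate above is easy to vary. This sketch just multiplies item count by an assumed per-item metadata size; the 1000-byte figure is the commenter's guess, and the other sizes are arbitrary values added to show the sensitivity.

```python
GB = 10**9  # decimal gigabytes


def listing_bytes(items: int, bytes_per_item: int) -> int:
    """Memory needed to hold a full file listing, assuming a fixed cost per item."""
    return items * bytes_per_item


# 1000 bytes/item is the guess from the comment; 500 and 2000 are arbitrary.
for per_item in (500, 1_000, 2_000):
    total = listing_bytes(5_000_000, per_item)
    print(f"{per_item:>5} B/item -> {total / GB:.1f} GB for 5M items")
```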

londons_explore3y ago

It could also be a sharding/checkpointing issue. Imagine you want to scan every file in every user account, for example for blank documents (maybe because you want to downrank blank 'untitled documents' in search results, because they are easily accidentally made).

You write your scanner to divide up the list of user accounts into chunks, process all chunks in different worker machines, and combine the results. Simple mapreduce.

However, if there is one huge user account, then you either have to wait for the entire process to take far longer, or you need to have multiple workers working on the single account (adding a lot of complexity to every operation you wish to run across all accounts).
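The straggler effect described here can be sketched with a toy scheduler. This is not how any Google system works; it only assumes that each account must be processed whole by one worker, and the account sizes are hypothetical (the 8.7M figure is borrowed from another comment in this thread).

```python
import heapq


def makespan(account_sizes, workers):
    """Assign whole accounts to workers (largest first, always to the least-loaded
    worker) and return the load of the busiest worker, in files scanned."""
    loads = [0] * workers
    heapq.heapify(loads)
    for size in sorted(account_sizes, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + size)
    return max(loads)


# Hypothetical workload: 10,000 ordinary accounts with 2,000 files each.
typical = [2_000] * 10_000
# Same workload plus one account holding 8.7M files.
with_huge_account = typical + [8_700_000]

print("busiest worker, typical accounts:", makespan(typical, 100))
print("busiest worker, one huge account:", makespan(with_huge_account, 100))
```

With 100 workers the job finishes only when the worker holding the huge account does, no matter how evenly everything else is spread, unless a single account can be split across workers.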

cabirum3y ago

More like an arbitrary number from the old times "cause 5M items ought to be enough for everybody."

charcircuit3y ago

>from the old times

That doesn't make sense because this is a new restriction.

2 more replies

genewitch3y ago

Couldn't one programmatically test this by hitting 5GB sooner with excessively long filenames? Say 500 byte names?

throwaway815233y ago· 3 in thread

It seems like there is a much lower limit for "download all". There is a 100 or so item directory that I want to download, and "download all" tries to wrap them in a zip file and fails. I think it wants to pre-scan all the files for viruses before zipping, but is unwilling to do that many. So I have to download the files one at a time.

bobbylarrybobby3y ago

I've definitely downloaded folders with 100 items before. I think it might have to do with how big the individual files are because g drive won't create a zip that's larger than 2GB.

dublinben3y ago

In the future, you can use rclone to easily download entire folders from Google Drive.

https://rclone.org/

throwaway815233y ago

Hmm, thanks, that looks promising, but it appears to be an all singing all dancing complicated client that does two way transfers to gdrive and expects you to have a google account. I don't have a google account (don't want to enroll one), and am just trying to download a public folder that someone gave me a link to. I'll keep looking to see if I can find a way to use it.

a2tech3y ago· 3 in thread

Interesting that this was enabled with apparently no outreach. Maybe Google needed some of those employees that it recently laid off

londons_explore3y ago

And it would have been pretty simple to run a query for users with over 4 million objects, and fire off an email to them all saying "We are about to implement a limit, and you will be impacted. Here are potential workarounds/mitigations".

I suspect though that the simple process of dropping a few hundred customers who will be directly impacted an email requires approval from 27 middle managers, and it's easiest to just ignore them.

shadowgovt3y ago

More often, this kind of error out of Google comes from the left hand not knowing what the right hand is doing.

Joe or Jane Noogler has just been brought into the team, and their starting project was to improve indexing. They succeeded in doing so by attaching the Drive data to an indexing service created after Drive existed. It's working great and has a 10-20X speedup. But oops... That new indexing system has a 5-million-element hard limit built in, and nobody caught it until they went to production. J. Noogler is too new to have realized this could have been an issue so they never started the escalation / customer messaging process.

hirako20003y ago

Precisely what I felt skimming through the comments on the issue.

A lot of high scale businesses have taken a huge risk. A company like Google had over 10y long tenure engineers who were swept by the layoffs. The argument that the year prior was a hiring spree only makes it worse: now you have hands to throw at the problem but unlikely the right pairs, which, given the climate of job loss fear, aren't likely to admit they are incompetent.

Off topic, but this isn't an isolated case of mega size top tier companies seemingly dropping the ball far more often and/or with greater blast radius impacts. See github leaking secrets publicly on their own git repo.

Call me paranoid, but to me one of these things is going on:

1/ tipping point of too many under qualified for the job engineers running things, due to C-suites looking at keyboard monkeys the way they see factory workers, naively applying "cost cutting" measures, plucking through spreadsheets who "are the biggest seemingly disposable weights" and fire those as part of waves of x% layoffs

2/ Overworked remaining know-how crews - added the weight of dealing with inspiring politicians turned engineers for the juicy 200k + rsu package who demand better articulated instructions, knowledge transfers and more comprehensive formal documentation materials, so that they too can shine - on top. Competent folks having less and less time but more and more tears and blue bags below their eyes finally realising that ain't worth it as it keeps getting worse anyway.

3/ sabotage

Or all those, since there could be some snowball effects there.

Call me paranoid, that it isn't what's going on, we clearly aren't seeing the slow fall of major infrastructures the few who were there to build them now having mostly packed their bag and long gone, replaced by swaths of bootcamp trophy coders hoarding cloud certificates like North korean generals like to collect shiny pins.

Maybe it isn't, the future is still in the making anyway, but I felt glimpses of that several years ago, then about each passing year, now every few months. Like seeing an accelerating meteorite seemingly going away but which keeps looking bigger and bigger.

More so related to the topic: just drop drive as a centralized storage, there are e2e solutions out there and open source scripts to entirely migrate out of these walled gardens. You won't miss google sheets: you can still use it!

paxys3y ago· 2 in thread

While a 5 million file limit sounds reasonable at first glance, it isn't just the free tier that is affected. Google Drive also sells storage plans for up to 30 TB for an individual user, and there is no mention of such a limitation anywhere on any plan. If I bought and actually wanted to use that 30 TB (or heck even 5 TB), it isn't unreasonable to want to store a few million files. Regardless of the technical reasons behind it, not mentioning this restriction up front, before the sale, is inexcusable.

Gigachad3y ago

I did the math and to reach 5M files in 30 TB, they all have to be averaging 6MB. If you actually have this many files, you should be using S3 or mongodb.

chii3y ago

if you were using google drive to store images (e.g., you automated a camera system to take high res pictures), you can easily reach this limit.

Of course, it definitely is much nicer to have S3 over google drive for such a solution, but if you didn't want to write software (which you'd need for S3), and simply used the google drive's sync'ing features...

1 more reply

steponlego3y ago· 2 in thread

Sounds like they want to protect against resource exhaustion. Touch a file, it's empty with a size of zero bytes. But depending on the block size of the storage device, it's likely to be between 4k and up to even 1M.

ec1096853y ago

They should incorporate a minimum size for every file when calculating how much quota you have.

Because they have no retrieval fees for gsuite, I bet it is expensive dealing with drives with tons of small files.

Agree they are rolling this out terribly. They should have granted everyone 2x the number of files they currently have in their drive quota (or 4M, whichever was smaller).

mayli3y ago

Imagine having the filename and other non-quota-able attributes as an unlimited storage.

blitzar3y ago· 2 in thread

They also have what looks like an arbitrary limit on "Shared Drives"

The number of stored files: 400,000

danpalmer3y ago

This is true, but you can just create more shared drives. It looks like storage quota is shared across all drives on the account.

blitzar3y ago

I would need to check, but I don't think my (total) storage quota in GB is affected by shared drives.

Nevertheless, there must be something related to their storage model that ranks count of files > storage size.

synctheship3y ago· 2 in thread

Well this is scary, I wonder if all users/plans are impacted? I have north of 8.5million files in GDrive and I luckily have not hit an error yet.

danpalmer3y ago

What are you using it for!?! Honestly, I'm interested to hear what your use-case is as I struggled to come up with one for this many files.

synctheship3y ago

We've been using GDrive in one form or another since 2014 when they released Google Drive for Work. Primary use case nowadays for my team is an old legacy tool (been in place since 2014) that connects to our ERP and other internal tools to generate reports and export data with combined metadata from multiple tools/sources. Then, if needed, it makes a copy to different shared drives and teams, and assigns access rights for internal and external users. We have custom web front ends that allow them to interact with these exports along with fat clients if they need to manipulate a larger dataset of files.

My last check had the service account that I'm responsible for somewhere around 8.7 Million files with an average size around 9.2MB.

We have multiple of these "accounts", with similar use-cases across other teams. This is excluding the normal use of Gdrive with normal "rank-and-file" office workers managing standard office docs.

We've felt the squeeze from google over the past few years, so we've already started migrating off of google services about 2.5 years into a 4 year project.

1 more reply

pdw3y ago· 1 in thread

As one of the comments points out, Google sells 30TB accounts. With a 5M file limit, you need an average file size of 6MB to fill your drive.

shadowgovt3y ago

This checks out; in general, the 30TB consumers are media storage, editing, and archival.

encoderer3y ago· 1 in thread

Tangent: I was wrong to ever consider it sus that Google includes gsuite with their cloud revenue. Clearly it’s being used that way.

amf123y ago

So does Microsoft. Intelligent Cloud = Azure + Office 365

fbn793y ago· 1 in thread

Maybe Google cloud using NTFS disks (Maximum number of files on disk: 4,294,967,295)? :))

overthrow3y ago

That would be 5 billion, not 5 million

vivegi3y ago

Rug pull. That's what this is.

The Google Drive landing page (https://www.google.com/intl/en-US/drive/) as of April 1, 2023 still doesn't mention the 5M per user maximum files count limit.

Is it an April fool's joke? /s.

Do better $GOOG.

up2isomorphism3y ago

The key is that every company is very happy to lose this kind of customer.

ccheney3y ago

Someone must've copied their node_modules folder into Google Drive by mistake...

mike5033y ago

Part of this seems like some of the wonky design that seems to be under the hood of Drive. Just looking around at how rclone has to do things, the difference between shared and individual drives, the fact I couldn't even originally find a way in the console to see my shared drive usage (even though I was the owner of it and it was on my account) it just seems like there is a very very wonky odd design to how the system works. Not like an object store under the hood with a layer of "document" logic applied, but rather what started as initially some API-esque thoughts on document storage that grew into a monstrosity. I wouldn't be shocked if there's just a ton of tech debt there.

dkjaudyeqooe3y ago

I'm surprised it knows how to count that low.

EricE3y ago

So are there different limits for Google for Business/Workspace/whatever they are branding it this week? Or is this for all Google drive accounts?

j / k navigate · click thread line to collapse

133 comments

86 comments · 19 top-level

duxup3y ago· 24 in thread

What is Google Drive... for?

I see folks describe Google Drive errors / limits like this and people describe "a business critical operational system" and I wonder. Is Google Drive even supposed to do this thing?

QuercusMax3y ago

Disclaimer: I'm a Googler but I don't work anywhere near this stuff, and this is just my own personal opinion.

Seems like they could do it without the jankiness, though.

stuartaxelowen3y ago

https://www.google.com/drive/ 's title does call it a "File Sharing Platform".

toast03y ago

> always a giant headache because of permissions and other nonsense? That's what Drive is trying to be.

It definitely lives up to that. I'm always getting sent links to files I can't open.

pyb3y ago

I don't think the average user would understand the nuance, though. (I don't think I do)

2 more replies

hedora3y ago

I’ve never had a problem with NFS, and modern filers would laugh at 3 million files.

I really dislike Google Drive. I can’t imagine people would use it at all if not for the fact that it’s bundled, monopoly style, with the office suite and gmail.

Since Google won’t ever fix it, I’m hoping they rapidly make it so crappy some other competitor crops up.

Of course, with our luck, it’ll be some Microsoft offering.

I wish software would stop getting worse.

duxup3y ago

That does seem how it operates. A lot of the UI is directed at document sharing.

That's what makes me double take when I hear about folks using it for some very creative, but not what I think of uses for Google Drive.

thinkingemote3y ago

Is there a limit on the number of files in cloud storage?

CaliforniaKarl3y ago

You don't have anything in your HN profile, so I can't be certain, but I'm guessing you haven't been in the Research/Education space? It's scary how much Google Drive is used for storage of stuff.

Robadob3y ago

I work at a university that is fully invested in Google Apps.

We did have unlimited storage, however last week they announced that's soon to change:

Makes me wonder if the timing is somehow related.

1 more reply

hirako20003y ago

andbberger3y ago

google drive has unlimited storage, they opened that pandora's box themselves. yes it's janky as hell, but I'm looking at tens of thousands per year to backup to cloud services for my lab's data

chimeracoder3y ago

duxup3y ago

I don't doubt people do it. Humans are industrious and users will push things to their limits.

I just wonder .... is that what Google built Drive for? / how much of that they were thinking of.

3 more replies

djha-skin3y ago

lambdaloop3y ago

Google drive is great for storing research data.

I work it in a fly neuroscience lab and we use it to store all our electrophysiology and video data. Each person in the lab is storing on average 5TB of data, and the lab as a whole stores 100TB.

I tried setting up a single account for the whole lab once, but we ran into the above 5M file limit, so we just have individual accounts per researcher and it's mostly fine for now.

secabeen3y ago

> Also, we're paying about $200 / month total to store 100TB of data in the cloud, which is hard to beat with other services.

You should expect this to go away soon. I support science researchers and our unlimited storage option is going away in the coming months. Options for purchasing space are limited, and not cheap.

ukd13y ago

Do y'all back this up? If so, how?

rootsudo3y ago

It's "free" cloud hosting. You'd be surprised how often people mix their personal google account with work stuff. It annoys me greatly, but that's how it is.

dragonwriter3y ago

> It's "free" cloud hosting.

Drive is paid (consumer via One, business via Workspace) cloud hosting with a free consumer tier.

Sohcahtoa823y ago

For professional use, I've usually seen it used as a document repository that can be shared by teams.

In both use cases, 5 million files seems like it'd be a hard limit to hit.

RandomBK3y ago

Depends on the backup. Accidentally backing up a couple of `node_modules` folders would be sufficient to hit the 5M limit.

Spooky233y ago

It’s a general purpose file and document store. People run significant organizations on Google Drive.

Like anything, people can push limits where it’s no longer productive to extend.

kevin_thibedeau3y ago

Its purpose is to prevent people disabling third party cookies.

lostmsu3y ago

Can you clarify? Is this about single sign-on from multiple sites to be able to store data to Google Drive?

danpalmer3y ago· 16 in thread

From reading the thread it sounds like this is 5M per user.

Looking at S3 pricing, just the storage is ~5x the cost of Google Drive, and then you need to add transfer and API calls on top of that.

(Disclaimer, I work at Google, not on Drive, this is my personal reading and interpretation of the public info, I don't have any inside info here)

Reason0773y ago

> "I don't personally think that there are reasonable use-cases for human users with 5 million files."

A 5 million file limit might be quite reasonable if you're paying for the basic, 100 GB storage tier. But Google Drive offers multiple tiers with up to 30 TB of storage. 30 million MB!

That means if your average file size is less than 6MB - very likely if you're storing JPEGs, audio files, text records, or whatever; you'll never be able to fill your Google Drive storage.

What's the average size of a file on a regular macOS or Windows HDD? It wouldn't surprise me if it's much less than 6MB. Shouldn't the file count limit scale with the storage limit?
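The 6MB figure above can be checked with a quick sketch. It assumes decimal units (1 TB = 1,000,000 MB) and the 5 million item cap discussed in this thread; the function name is made up for illustration.

```python
def min_avg_file_size_mb(storage_tb: float, max_files: int) -> float:
    """Average file size (in MB) needed to fill storage_tb with at most max_files files."""
    storage_mb = storage_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return storage_mb / max_files


# 30 TB tier with a 5 million file cap
print(min_avg_file_size_mb(30, 5_000_000))  # 6.0
```

Anything averaging below that size hits the file cap before the storage quota.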

mort963y ago

Your analysis of why this limit exists is probably correct.

graypegg3y ago

Gigachad3y ago

danpalmer3y ago

That's a fair point. It looks like most limitations are documented, but I couldn't find a mention of the 5m file limit.

augusto-moura3y ago

I imagine that if you set up backups for multiple servers/personal computers, that could scale significantly.

[1]: https://restic.net/

Gigachad3y ago

> Restic currently defaults to a 16 MiB pack size.

Given this, it is impossible to exceed the 5M file limit with the top plan google offers of 30TB. You'd have to lower the pack size to 6MB or less.
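A rough sketch of that estimate, under simplifying assumptions that are not from restic's documentation: every pack is exactly 16 MiB, the 30TB plan is 30 * 10^12 bytes, and index, snapshot, and key files are ignored. The helper name is hypothetical.

```python
import math

TB = 10**12   # decimal terabyte, in bytes
MIB = 2**20   # mebibyte, in bytes


def pack_count(total_bytes: int, pack_size_bytes: int) -> int:
    """Number of full-size packs needed to hold total_bytes (last pack rounded up)."""
    return math.ceil(total_bytes / pack_size_bytes)


packs = pack_count(30 * TB, 16 * MIB)
print(packs)              # 1788140
print(packs < 5_000_000)  # True: well under the 5M item cap
```

In practice packs can be smaller than the target size, so the real count would be somewhat higher than this lower bound.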

pretext-13y ago

Restic + rclone + GCP Service Account.

Create a SA in GCP and generate a JSON credentials file. Create a folder or Shared Drive via web and share it with the SA.

I’ve been using this setup for years with zero problems.

Example rclone config: https://pastebin.com/KdsFQz5K

Gigachad3y ago

danpalmer3y ago

elcritch3y ago

jjav3y ago

> I don't personally think that there are reasonable use-cases for human users with 5 million files.

% find $HOME -type f | wc -l

4969169

Almost 5M there.

fencepost3y ago

But are you human?

Fatnino3y ago

TazeTSchnitzel3y ago

What if you check out a few large git repos in a directory synced with Google Drive?

danpalmer3y ago

bobsmooth3y ago· 6 in thread

Just checked, my C drive has a bit more than 1 million files. What are you doing that you have 5 million separate files in your Google Drive?

jackson14423y ago

justin_oaks3y ago

Nobody should ever need more than 640K of RAM, right?

Just because your workflow doesn't involve lots of tiny files doesn't mean other people's don't.

Besides, most of those commenting in the issue are talking about their organization having hit the limit, not individuals.

kulahan3y ago

Why is an organization using a primarily-personal file storage/transfer system for real work?

bobsmooth3y ago

As kulahan said, why are you using a personal backup solution for a business?

hirako20003y ago

Many use rsync-like tools to treat gdrive as a flat backup system. Perfectly fine use case. You would be over a million as a single user. Some accounts have thousands of users.

Gigachad3y ago

londons_explore3y ago· 4 in thread

This is probably the easy fix, rather than redesigning every service to be able to stream objects correctly.

londons_explore3y ago

You write your scanner to divide up the list of user accounts into chunks, process all chunks in different worker machines, and combine the results. Simple mapreduce.
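A minimal sketch of that chunk-and-combine pattern, assuming the scan just sums a per-account file count. Threads stand in for worker machines here, and the account names, counts, and function names are invented for illustration; this is not how Google's scanners are actually built.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def chunked(items: List[str], size: int) -> List[List[str]]:
    """Split the account list into fixed-size chunks."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def scan_chunk(accounts: List[str], file_counts: Dict[str, int]) -> int:
    """Map step: scan one chunk of accounts and return a partial result."""
    return sum(file_counts[a] for a in accounts)


def total_files(file_counts: Dict[str, int], chunk_size: int = 2, workers: int = 4) -> int:
    """Fan chunks out to workers, then combine the partial results (reduce step)."""
    chunks = chunked(sorted(file_counts), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: scan_chunk(c, file_counts), chunks))
    return sum(partials)


# Hypothetical per-account item counts
counts = {"alice": 1_200_000, "bob": 300, "carol": 4_999_999, "dave": 42}
print(total_files(counts))  # 6200341
```

This works when accounts can be scanned independently; a single account with millions of items still lands on one worker.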

cabirum3y ago

More like an arbitrary number from the old times "cause 5M items ought to be enough for everybody."

charcircuit3y ago

>from the old times

That doesn't make sense because this is a new restriction.

genewitch3y ago

Couldn't one programmatically test this by hitting 5GB sooner with excessively long filenames? Say 500 byte names?

throwaway815233y ago· 3 in thread

bobbylarrybobby3y ago

I've definitely downloaded folders with 100 items before. I think it might have to do with how big the individual files are because g drive won't create a zip that's larger than 2GB.

dublinben3y ago

In the future, you can use rclone to easily download entire folders from Google Drive.

https://rclone.org/

throwaway815233y ago

a2tech3y ago· 3 in thread

Interesting that this was enabled with apparently no outreach. Maybe Google needed some of those employees that it recently laid off

londons_explore3y ago

I suspect though that the simple process of sending an email to the few hundred customers who will be directly impacted requires approval from 27 middle managers, and it's easiest to just ignore them.

shadowgovt3y ago

More often, this kind of error out of Google comes from the left hand not knowing what the right hand is doing.

hirako20003y ago

Precisely what I felt skimming through the comments on the issue.

Call me paranoid, but to me one of these things is going on:

3/ sabotage

Or all of those, since there could be some snowball effects there.

paxys3y ago· 2 in thread

Gigachad3y ago

I did the math and to reach 5M files in 30 TB, they all have to be averaging 6MB. If you actually have this many files, you should be using S3 or mongodb.

chii3y ago

if you were using google drive to store images (e.g., you automated a camera system to take high res pictures), you can easily reach this limit.

steponlego3y ago· 2 in thread

ec1096853y ago

They should incorporate a minimum size for every file when calculating how much quota you have.

Because they have no retrieval fees for gsuite, I bet it is expensive dealing with drives with tons of small files.

Agree they are rolling this out terribly. They should have granted everyone 2x the number of files they currently have in their drive quota (or 4M, whichever was smaller).

mayli3y ago

Imagine having the filename and other non-quota-able attributes as an unlimited storage.

blitzar3y ago· 2 in thread

They also have what looks like an arbitrary limit on "Shared Drives"

The number of stored files: 400,000

danpalmer3y ago

This is true, but you can just create more shared drives. It looks like storage quota is shared across all drives on the account.

blitzar3y ago

I would need to check, but I don't think my (total) storage quota in GB is affected by shared drives.

Nevertheless, there must be something related to their storage model that ranks count of files > storage size.

synctheship3y ago· 2 in thread

Well this is scary, I wonder if all users/plans are impacted? I have north of 8.5 million files in GDrive and I luckily have not hit an error yet.

danpalmer3y ago

What are you using it for!?! Honestly, I'm interested to hear what your use-case is as I struggled to come up with one for this many files.

synctheship3y ago

My last check had the service account that I'm responsible for somewhere around 8.7 Million files with an average size around 9.2MB.

We have multiple of these "accounts", with similar use-cases across other teams. This is excluding the normal use of Gdrive with normal "rank-and-file" office workers managing standard office docs.

We've felt the squeeze from google over the past few years, so we've already started migrating off of google services about 2.5 years into a 4 year project.

pdw3y ago· 1 in thread

As one of the comments points out, Google sells 30TB accounts. With a 5M file limit, you need an average file size of 6MB to fill your drive.

shadowgovt3y ago

This checks out; in general, the 30TB consumers are media storage, editing, and archival.

encoderer3y ago· 1 in thread

Tangent: I was wrong to ever consider it sus that Google includes gsuite with their cloud revenue. Clearly it’s being used that way.

amf123y ago

So does Microsoft. Intelligent Cloud = Azure + Office 365

fbn793y ago· 1 in thread

Maybe Google cloud is using NTFS disks (Maximum number of files on disk: 4,294,967,295)? :))

overthrow3y ago

That would be 5 billion, not 5 million

vivegi3y ago

Rug pull. That's what this is.

The Google Drive landing page (https://www.google.com/intl/en-US/drive/) as of April 1, 2023 still doesn't mention the 5M per user maximum files count limit.

Is it an April fool's joke? /s.

Do better $GOOG.

up2isomorphism3y ago

The key is that every company is very happy to lose this kind of customer.

ccheney3y ago

Someone must've copied their node_modules folder into Google Drive by mistake...

mike5033y ago

dkjaudyeqooe3y ago

I'm surprised it knows how to count that low.

EricE3y ago

So are there different limits for Google for Business/Workspace/whatever they are branding it this week? Or is this for all Google drive accounts?
