Back when the commercial internet was just getting its act together there were companies that would give you free online access on Windows 3.1 machines in exchange for displaying ads in the e-mail client. (I think one was called Juno.)
The hitch was that you could only use e-mail. No web surfing. No downloading files. No fun stuff.
But that was OK, since there were Usenet- and FTP-to-email gateways you could ping that would happily return lists of files and messages. And if you sent another mail, they would send you base64-encoded versions of those binaries that you could decode on your machine.
The free e-mail service became slow-motion file sharing. But that was OK because you'd set it up before you went to bed and it would run overnight.
Thank you, whoever came up with base64.
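For anyone who never had to do this by hand: the whole trick rests on base64's lossless round trip through 7-bit text transports like email. A minimal sketch in Python:

```python
import base64

# Any binary payload can be turned into ASCII-safe text for an email
# body, mailed through a text-only gateway, and decoded losslessly
# on the other end.
payload = bytes(range(256))          # stand-in for a downloaded binary
encoded = base64.b64encode(payload).decode("ascii")
decoded = base64.b64decode(encoded)

assert decoded == payload
# Every 3 input bytes become 4 output characters, so the text is ~4/3
# the size of the binary: 256 bytes -> 344 characters (with padding).
```

That ~33% size penalty is exactly the overhead people complain about further down the thread.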
Being a teenager, the first web page I ever requested was www.doom.com, which returned gibberish text to Juno's email client. It was an HTML file full of IMG tags (one of those "Click here to enter" gateway pages), but I had no idea what I was looking at at the time. Somehow I figured out how to open the file in IE2 and saw... a bunch of broken images :)
I still vividly remember the sense of wonder that the early Internet evoked.
EDIT: Just checked the Wayback Machine. Looks like www.doom.com was not affiliated with the game at the time, so I must have browsed to www.idsoftware.com instead.
In my case, it was at the public library. The lone internet computer was constantly booked. But by watching over a library clerk's shoulder, I was able to see the password needed to unlock the text-based library catalog terminals (those terminals were plentiful and always available). (My parents worked at the library, or else I never could have pulled that off.) Once unlocked, I was able to use Lynx to telnet into my favorite MUD game. Unfortunately it didn't last long before a librarian caught me, which I think resulted in me being grounded from the library for a month or something like that.
The original Juno ad server proxied the ads from the internet to the email client, and the proxy was wide open for several months. The first time I ever accessed the open internet at home was by dialing into the email service and bouncing through the proxy. I believe it was closed due to it being shared in the letters section of a hacker zine.
One big reason why I became a Linux user was that the TCP/IP stack for Win 3.1, Trumpet Winsock, was amazingly unstable and would regularly crash the entire OS. Linux had, even back then, a stable TCP/IP stack. And fantastic advancements like preemptive multitasking running in protected mode so errant user-space applications didn't crash the OS.
Good times.
I.e., it's actually increasing the amount of storage space needed to store the same binary, but it's getting around the drive quota by storing it in a format that has no quota.
If this is considered an abuse of services now, the terms could be updated to clarify.
The fingerprint of big chunks of base64-encoded blobs in Google Docs would be easy to spot.
If Google cares to notice this and take action, they can and will.
If I were Google, I wouldn't try to pick up on the content; I'd be looking for characteristic access patterns. Uploads are harder to catch, since "new account uploads lots of potentially large documents" isn't something you can immediately block, but "several large files that are always accessed in consecutive order very quickly" would be harder to hide. It's still an arms race after that (e.g., "but what if I access them really slowly?"), and while Google would have a hard time conclusively winning this race in the technical sense, they can win enough that it's no longer fun or cost-effective (e.g., "then you're getting your files really slowly, so where's the fun in that?"), which is close enough to victory for them.
So, I'd say, enjoy it while you can. If it gets big enough to annoy, it'll get eliminated.
Perhaps it's just marketing, trying to pry people away from Microsoft Office with a thing that doesn't actually cost them all that much?
Exactly. Never underestimate how much people love something being "free", even if it only costs a fraction of a cent.
The Flickr plugin [1] stores data (deduped and encrypted before upload) as PNG images. This was great because Flickr gave you 1 TB of free image storage, and the overhead was really small. No base64.
The SMTP/POP plugin [2] was even nastier. It used SMTP and POP3 to store data in a mailbox. Same for [3], but that used IMAP.
The Picasa plugin [4] encoded data as BMP images. Similar to Flickr, but different image format. No overhead here either.
All of this was strictly for fun of course, but hey it worked.
[1] https://github.com/syncany/syncany-plugin-flickr
[2] http://bazaar.launchpad.net/~binwiederhier/syncany/trunk/fil...
[3] http://bazaar.launchpad.net/~binwiederhier/syncany/trunk/fil...
[4] http://bazaar.launchpad.net/~binwiederhier/syncany/trunk/fil...
I feel less hesitant about revealing this now, given how long ago it was and that more accessible "libraries" are now available.
One of many Internet jokes with sinister origins.
BMP is the easiest to encode/decode because it's literally a bitmap of RGB pixels, with no fancy compression and such, which is obviously unnecessary if you're storing arbitrary data.
PNG was trickier, because of its "chunks" and generally more structure. And compression.
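To make the BMP point concrete, here's a toy sketch of packing arbitrary bytes into a 24-bit uncompressed BMP container. The 4-byte length prefix is my own convention for this illustration (so the decoder knows where padding starts), not anything the plugin actually did:

```python
import struct

def bytes_to_bmp(data: bytes, width: int = 256) -> bytes:
    """Pack arbitrary bytes into a 24-bit uncompressed BMP container.
    A 4-byte little-endian length prefix marks the real payload size."""
    payload = struct.pack("<I", len(data)) + data
    row_bytes = width * 3                      # 3 bytes per RGB pixel
    rows = -(-len(payload) // row_bytes)       # ceil division
    payload += b"\x00" * (rows * row_bytes - len(payload))
    # width=256 keeps each row (768 bytes) 4-byte aligned, as BMP requires.
    header_size = 14 + 40                      # file header + DIB header
    file_size = header_size + len(payload)
    bmp_header = struct.pack("<2sIHHI", b"BM", file_size, 0, 0, header_size)
    dib_header = struct.pack("<IiiHHIIiiII", 40, width, rows, 1, 24,
                             0, len(payload), 2835, 2835, 0, 0)
    return bmp_header + dib_header + payload

def bmp_to_bytes(bmp: bytes) -> bytes:
    """Recover the payload from a BMP produced by bytes_to_bmp."""
    pixel_data = bmp[54:]                      # skip the two headers
    (length,) = struct.unpack("<I", pixel_data[:4])
    return pixel_data[4:4 + length]

secret = b"arbitrary binary data, no base64 needed"
assert bmp_to_bytes(bytes_to_bmp(secret)) == secret
```

A viewer would render this as noise (and upside down, since BMP rows are stored bottom-up), but as long as the host stores the pixels verbatim, the data survives.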
TBH this is not unlike reporting a security bug to a company as a white hat, but more like a grey hat here.
If the few blokes using this scam their way into a few hundred terabytes of free storage, so be it; it's not worth the hassle for Google, imo.
edit: Apparently an account can create up to 250 docs a day https://developers.google.com/apps-script/guides/services/qu...
On the topic of "unusual and free large file hosting", YouTube would probably be the largest, although you'd need to find a resilient way of encoding the data since their re-encoding processes are lossy.
I like the "Linux ISO" and "1337 Docs" references ;-)
EBCDIC based IBM mainframe SYSADMINs on BITNET were particularly notorious for being pig-headed and inconsiderate about communicating with the rest of the world, and thought they knew better about the characters their users wanted to use, and that the rest of the world should go fuck themselves, and scoffed at all the unruly kids using ASCII and lower case and new fangled punctuation, who were always trying to share line printer pornography and source code listings through their mainframes.
"HARRUMPH!!! IF I AND O ARE GOOD ENOUGH FOR DIGITS ON MY ELECTRIC TYPEWRITER, THEN THEY'RE GOOD ENOUGH FOR EMAIL! NOW GET OFF MY LAWN!!!" (shaking fist in air while yelling at cloud)
It was especially a problem for source code. That was one of the reasons for "trigraphs".
https://stackoverflow.com/questions/1234582/purpose-of-trigr...
https://en.wikipedia.org/wiki/Digraphs_and_trigraphs
>Trigraphs were proposed for deprecation in C++0x, which was released as C++11. This was opposed by IBM, speaking on behalf of itself and other users of C++, and as a result trigraphs were retained in C++0x. Trigraphs were then proposed again for removal (not only deprecation) in C++17. This passed a committee vote, and trigraphs (but not the additional tokens) are removed from C++17 despite the opposition from IBM. Existing code that uses trigraphs can be supported by translating from the source files (parsing trigraphs) to the basic source character set that does not include trigraphs.
Build in a bit of redundancy, and I think it would work.
Base64 has the advantage of relative ubiquity (though Base85 is hardly rare, being used in PDF and Git binary patches). It also doesn't contain characters (quotes, angled brackets, ...) that might cause problems if naively sent via some text protocols and/or embedded in XML/HTML mark-up.
> YouTube ... you'd need to find a resilient way of encoding the data [due to lossy re-encoding]
That should be easy enough: encode the data as blocks or lines of pixels (blocks of 4x4 should be more than sufficient) using a low enough number of colour values (I expect you'd get away with at least 4 bits/channel/block with large enough blocks, so 4096 values per block). You should then easily be able to survive anything the re-encoding does by averaging each block and taking the closest value to that result.
Add some form of error detection+correction code just for paranoia's sake. You are going to want to include some redundancy in the uploads anyway so you can combine these needs in a manner similar to RAID5/6 or the Parchive format that was (is?) popular on binary carrying Usenet groups.
While this works over NNTP, SMTP and IMAP (and possibly POP), I'm not sure if it will work over HTTP if any of the servers use the Transfer Encoding header.
To encode ABCDEFGHIJKLMNOPQRSTUVWXYZ first get a short url for http://example.com/ABC, then take the resulting url and append DEF and run it through the service again. Repeat until you run out of payload, presumably doing quite a few more than 3 bytes at a time.
The final short URL is your link to the data, which can be unpacked by stripping the payload bytes and following the links backwards until you get to your initial example.com node.
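The scheme reads roughly like this in code, using a made-up in-memory shortener in place of TinyURL (`FakeShortener`, the `sho.rt` domain, and the terminator URL are all illustrative; the toy also assumes the payload length is a multiple of the chunk size):

```python
class FakeShortener:
    """Toy in-memory stand-in for a real URL-shortening service."""
    def __init__(self):
        self.db = {}
    def shorten(self, long_url: str) -> str:
        short = f"https://sho.rt/{len(self.db)}"
        self.db[short] = long_url
        return short
    def expand(self, short: str) -> str:
        return self.db[short]

TERMINATOR = "http://example.com/"      # marks the start of the chain

def encode(payload: str, svc: FakeShortener, chunk: int = 3) -> str:
    assert len(payload) % chunk == 0    # keeping the toy simple
    url = TERMINATOR
    for i in range(0, len(payload), chunk):
        # Each round appends one chunk and shortens the result.
        url = svc.shorten(url + payload[i:i + chunk])
    return url                          # one short link "holds" everything

def decode(short: str, svc: FakeShortener, chunk: int = 3) -> str:
    parts, url = [], short
    while url != TERMINATOR:
        long_url = svc.expand(url)
        parts.append(long_url[-chunk:])  # peel off the trailing chunk
        url = long_url[:-chunk]          # what remains is the next link
    return "".join(reversed(parts))

svc = FakeShortener()
link = encode("ABCDEFGHIJKL", svc)
assert decode(link, svc) == "ABCDEFGHIJKL"
```

The "storage" lives entirely in the shortener's redirect table, which is exactly the point.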
Maybe it isn't a bad first project, but on no account should you put it up on the "real" internet and tell anyone it exists.
https://en.wikipedia.org/wiki/EFF_DES_cracker
https://www.foo.be/docs/eff-des-cracker/book/crackingdessecr...
>"We would like to publish this book in the same form, but we can't yet, until our court case succeeds in having this research censorship law overturned. Publishing a paper book's exact same information electronically is seriously illegal in the United States, if it contains cryptographic software. Even communicating it privately to a friend or colleague, who happens to not live in the United States, is considered by the government to be illegal in electronic form."
So to get around the export control laws that prohibited international distribution of DES source code on digital media like CD-ROMs, but not in printed books (thanks to the First Amendment and the Paper Publishing Exception), they developed a system for printing the code and data on paper with checksums, along with scripts for scanning, calibrating, validating, and correcting the text.
The book had the call to action "Scan this book!" on the cover (undoubtedly a reference to Abbie Hoffman's "Steal This Book").
https://en.wikipedia.org/wiki/Steal_This_Book
A large portion of the book included chapter 4, "Scanning the Source Code" with instructions on scanning the book, and chapters 5, 6, and 7 on "Software Source Code," "Chip Source Code," and "Chip Simulator Source Code," which consisted of pages and pages of listings and uuencoded data, with an inconspicuous column of checksums running down the left edge.
The checksums in the left column of the listings innocuously looked to the casual observer kind of like line numbers, which may have contributed to their true subversive purpose flying under the radar.
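In the same spirit, a toy version of per-line checksumming for scanned listings. The book's actual checksum format differed; the 16-bit CRC column here is just my illustration of the idea that each printed line can be verified independently, so OCR errors are localized to a single line:

```python
import zlib

def with_checksums(lines):
    """Prefix each printed line with a short hex checksum, like a
    left-hand column running down the page."""
    return [f"{zlib.crc32(line.encode()) & 0xFFFF:04x} {line}"
            for line in lines]

def verify(printed):
    """Return indices of lines whose content no longer matches its
    checksum -- i.e. where the scan/OCR misread something."""
    bad = []
    for i, entry in enumerate(printed):
        check, line = entry.split(" ", 1)
        if f"{zlib.crc32(line.encode()) & 0xFFFF:04x}" != check:
            bad.append(i)
    return bad

page = with_checksums(["begin 644 des.tar", "M@5XA...", "end"])
assert verify(page) == []
# Simulate the scanner misreading a character (here, in the checksum
# column, which can never legitimately contain 'z'):
page[1] = "zzzz " + page[1].split(" ", 1)[1]
assert verify(page) == [1]
```

Flagging the exact bad line is what made correcting a scanned book of listings tractable: you re-type one line, not one page.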
Scans of the cover and instructions and test pages for scanning and bootstrapping from Chapter 4:
(My small contribution to the project was coming up with the name "Deep Crack", which was silkscreened on all of the chips, as a pun on "Deep Thought" and "Deep Blue", which was intended to demonstrate that there was a deep crack in the United States Export Control policies.)
https://en.wikipedia.org/wiki/EFF_DES_cracker#/media/File:Ch...
The exposition about US export control policies and the solution for working around them that they developed for the book was quite interesting -- I love John Gilmore's attitude, which still rings true today: "All too often, convincing Congress to violate the Constitution is like convincing a cat to follow a squeaking can opener, but that doesn't excuse the agencies for doing it."
https://dl.packetstormsecurity.net/cracked/des/cracking-des....
Chapter 4: Scanning the Source Code
In This chapter:
The Politics of Cryptographic Source Code
The Paper Publishing Exception
Scanning
Bootstrapping
The next few chapters of this book contain specially formatted versions of the documents that we wrote to design the DES Cracker. These documents are the primary sources of our research in brute-force cryptanalysis, which other researchers would need in order to duplicate or validate our research results.
The Politics of Cryptographic Source Code
Since we are interested in the rapid progress of the science of cryptography, as well as in educating the public about the benefits and dangers of cryptographic technology, we would have preferred to put all the information in this book on the World Wide Web. There it would be instantly accessible to anyone worldwide who has an interest in learning about cryptography.
Unfortunately the authors live and work in a country whose policies on cryptography have been shaped by decades of a secrecy mentality and covert control. Powerful agencies which depend on wiretapping to do their jobs--as well as to do things that aren't part of their jobs, but which keep them in power--have compromised both the Congress and several Executive Branch agencies. They convinced Congress to pass unconstitutional laws which limit the freedom of researchers--such as ourselves--to publish their work. (All too often, convincing Congress to violate the Constitution is like convincing a cat to follow a squeaking can opener, but that doesn't excuse the agencies for doing it.) They pressured agencies such as the Commerce Department, State Department, and Department of Justice to not only subvert their oaths of office by supporting these unconstitutional laws, but to act as front-men in their repressive censorship scheme, creating unconstitutional regulations and enforcing them against ordinary researchers and authors of software.
The National Security Agency is the main agency involved, though they seem to have recruited the Federal Bureau of Investigation in the last several years. From the outside we can only speculate what pressures they brought to bear on these other parts of the government. The FBI has a long history of illicit wiretapping, followed by use of the information gained for blackmail, including blackmail of Congressmen and Presidents. FBI spokesmen say that was "the old bad FBI" and that all that stuff has been cleaned up after J. Edgar Hoover died and President Nixon was thrown out of office. But these agencies still do everything in their power to prevent ordinary citizens from being able to examine their activities, e.g. stonewalling those of us who try to use the Freedom of Information Act to find out exactly what they are doing.
Anyway, these agencies influenced laws and regulations which now make it illegal for U.S. crypto researchers to publish their results on the World Wide Web (or elsewhere in electronic form).
The Paper Publishing Exception
Several cryptographers have brought lawsuits against the US Government because their work has been censored by the laws restricting the export of cryptography. (The Electronic Frontier Foundation is sponsoring one of these suits, Bernstein v. Department of Justice, et al ).* One result of bringing these practices under judicial scrutiny is that some of the most egregious past practices have been eliminated.
For example, between the 1970's and early 1990's, NSA actually did threaten people with prosecution if they published certain scientific papers, or put them into libraries. They also had a "voluntary" censorship scheme for people who were willing to sign up for it. Once they were sued, the Government realized that their chances of losing a court battle over the export controls would be much greater if they continued censoring books, technical papers, and such.
Judges understand books. They understand that when the government denies people the ability to write, distribute, or sell books, there is something very fishy going on. The government might be able to pull the wool over a few judges' eyes about jazzy modern technologies like the Internet, floppy disks, fax machines, telephones, and such. But they are unlikely to fool the judges about whether it's constitutional to jail or punish someone for putting ink onto paper in this free country.
* See http://www.eff.org/pub/Privacy/ITAR_export/Bernstein_case/ .
Therefore, the last serious update of the cryptography export controls (in 1996) made it explicit that these regulations do not attempt to regulate the publication of information in books (or on paper in any format). They waffled by claiming that they "might" later decide to regulate books--presumably if they won all their court cases -- but in the meantime, the First Amendment of the United States Constitution is still in effect for books, and we are free to publish any kind of cryptographic information in a book. Such as the one in your hand.
Therefore, cryptographic research, which has traditionally been published on paper, shows a trend to continue publishing on paper, while other forms of scientific research are rapidly moving online.
The Electronic Frontier Foundation has always published most of its information electronically. We produce a regular electronic newsletter, communicate with our members and the public largely by electronic mail and telephone, and have built a massive archive of electronically stored information about civil rights and responsibilities, which is published for instant Web or FTP access from anywhere in the world.
We would like to publish this book in the same form, but we can't yet, until our court case succeeds in having this research censorship law overturned. Publishing a paper book's exact same information electronically is seriously illegal in the United States, if it contains cryptographic software. Even communicating it privately to a friend or colleague, who happens to not live in the United States, is considered by the government to be illegal in electronic form.
The US Department of Commerce has officially stated that publishing a World Wide Web page containing links to foreign locations which contain cryptographic software "is not an export that is subject to the Export Administration Regulations (EAR)."* This makes sense to us--a quick reductio ad absurdum shows that to make a ban on links effective, they would also have to ban the mere mention of foreign Universal Resource Locators. URLs are simple strings of characters, like http://www.eff.org; it's unlikely that any American court would uphold a ban on the mere naming of a location where some piece of information can be found.
Therefore, the Electronic Frontier Foundation is free to publish links to where electronic copies of this book might exist in free countries. If we ever find out about such an overseas electronic version, we will publish such a link to it from the page at http://www.eff.org/pub/Privacy/Crypto_misc/DESCracker/ .
* In the letter at http://samsara.law.cwru.edu/comp_law/jvd/pdj-bxa-gjs070397.h..., which is part of Professor Peter Junger's First Amendment lawsuit over the crypto export control regulations.
[...]
EDIT: I have found it[2], finally. It's pretty sad that so much of the internet is getting forgotten, though.
[1]: https://web.archive.org/web/19980630210313/http://www.pgpi.c...
[2]: https://the.earth.li/pub/pgp/pgpi/5.5/books/ocr-tools.zip
Was this tested before a court and did they accept this sort of obviously subversive behavior? (Not that I personally agree with the laws restricting crypto export.)
To what extent does analog encoding fall under the illegal threshold?
Are you implying there's something more interesting there than just the DES source code and related data that the book already very clearly claims to contain?
I advise using a totally separate account when using this tool.
But anyways, something inside me likes it. Nicely done. Good job :)
That being said, they currently allow the guys at /r/datahoarder to use gsuite accounts costing £1 for life with unlimited storage quotas. These are regularly filled to like 50TB and Google doesn't bat an eye.
I didn't know this was plausible.
Now Google generously offers Drive with 15 GB of space.
"Gdrive" (here: http://pramode.net/articles/lfy/fuse/pramode.html ) and the "Gmail Filesystem"/"GmailFS" (here: https://web.archive.org/web/20060424165737/http://richard.jo... as mentioned elsewhere in this thread) were both built on top of `libgmail` (here: http://libgmail.sourceforge.net/ ) a Python library I developed.
There were a couple of different projects at the time (listed in "Other Resources" on the project page) that sought to provide a programmatic Gmail interface.
I still have a "ftp" label in Gmail (checks notes 15 years later...) from the experimental FTP server I implemented as a libgmail example. :D
The libgmail project was probably the first project of mine which attracted significant attention including others basing their projects on it along with mentions in magazines and books which was pretty cool.
I think my favourite memory from the project was when Jon Udell wrote in a InfoWorld column ( http://jonudell.net/udell/2006-02-07-gathering-and-exchangin... ) that he considered libgmail "a third-party Gmail API that's so nicely done I consider it a work of art." It's a quality I continue to strive for in APIs/libraries I design these days. :)
(Heh, I'd forgotten he also said "I think Gmail should hire the libgmail team, make libgmail an officially supported API"--as the entirety of the "team" I appreciated the endorsement. :) )
The library saw sufficient use that it was also my first experience of trying to plot a path for maintainership transition in a Free/Libre/Open Source licensed project. I tried to strike a balance between a sense of responsibility to existing people using the project and trusting potential new maintainers enough to pass the project on to them. Looking back I felt I could've done a better job of the latter but, you know, learning experiences. :)
My experiences related to AJAX reverse engineering of Gmail (which was probably the first high profile AJAX-powered site) later led to reverse engineering of Google Maps when it was released and creating an unofficial Google Maps API before the official API was released: http://libgmail.sourceforge.net/googlemaps.html
But that's a whole other story... :)
</nostalgia>

AFAIK it mostly still works. The older "grive" might not.
tl;dw Upload is limited to 750GB per day per account
You need 5 users for GSuite Business, if you use those alone the limit is now 3750GB/day
There's a good chance a megabyte of "document" costs Google a gigabyte of internal storage...
> sorry @ the guys from google internal forums who are looking at this
That all said, this is really cool from a design perspective, and I pored over the code and learned a lot.
These days there's even the luxury of IMAP. :D
[*] About the only thing I remember now is the `HELO` and `EHLO` protocol start messages. :)
Nowadays stuff like Dropbox is much more convenient and reliable.
So buyer beware I guess.
https://gist.github.com/retroplasma/264d9fed2350feb19f977575...
TL;DR: An alternative to NZB, RAR and PAR2. Private "magnet-link" that points to encrypted incremental data.
Also, "Utterly miserable experience with terrible latency and reliability." is such a great customer endorsement quote. :D
You would run the uploader and get back a list of TinyURLs that could then be used to retrieve the files later with a downloader.
But you couldn't store too much in each URL so the resulting list could be pretty big.
They store fragments of movies (rather than the full videos) in Google Drive files and then combine them together during playback. Each fragment could then be copied and mirrored across different accounts, so if any are taken down they can just switch to another copy. Pretty clever (albeit abusive) solution for free bandwidth.
[1] http://blog.brian.jp/python/png/2016/07/07/file-fun-with-pyh...
Although if Picasa (predecessor to Google Photos) worked with BMP, it may be better to do that because it's much easier and more space efficient to encode arbitrary data in than PNG.
But I don't understand why Google would do that. For most users, aren't Google Docs files a substantial part of their usage? Or do people mainly store backups?
The whole thing only transferred a few kB. It looked like an entire disc though.
https://rehmann.co/blog/10-gb-27-kb-gzip-file-present-http-s...
It's effectively the same thing under the hood. Binaries are split and converted to text using yEnc (or base64, et al.) and uploaded as "articles". An XML file containing all of the message-IDs (an "NZB") is uploaded as well so that the file can be found, downloaded, and reassembled in the right order.
This form of binary distribution has been around since the '80s if you change some of the technical details; e.g. using UUencode rather than yEnc.
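A stripped-down sketch of that flow, with base64 standing in for yEnc (which has far less overhead) and a plain ordered list standing in for the NZB file. The article size and hash-based message-ids are made up for illustration:

```python
import base64
import hashlib

ARTICLE_SIZE = 500_000   # illustrative; real article sizes vary

def post(data: bytes):
    """Split a binary into text 'articles' plus an ordered index of
    message-ids (the role an NZB file plays)."""
    articles, index = {}, []
    for i in range(0, len(data), ARTICLE_SIZE):
        part = data[i:i + ARTICLE_SIZE]
        msg_id = f"<{hashlib.sha1(part).hexdigest()}@example>"
        articles[msg_id] = base64.b64encode(part).decode("ascii")
        index.append(msg_id)
    return articles, index

def fetch(articles, index) -> bytes:
    """Download articles in index order and reassemble the binary."""
    return b"".join(base64.b64decode(articles[mid]) for mid in index)

blob = bytes(range(256)) * 5000          # ~1.28 MB test payload
articles, index = post(blob)
assert fetch(articles, index) == blob
assert len(index) == 3                   # 1.28 MB / 500 kB -> 3 articles
```

The index is the only thing you need to share: whoever holds it can pull the articles and rebuild the file in order.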
Spend $5 for a 3-day unlimited Usenet account with e.g. UsenetServer.com and upload it.
If you want it to stay up, then make another account in 3925 days (the retention period), download it, and then reupload it for another 10+ years of storage.
If it’s only 300GB check out Backblaze B2. It would cost you $1.5 per month for that amount of data.
On the one hand, I think this is great, on the other, I hope it doesn't force google to add limits that bother me in the future :P