Back when the commercial internet was just getting its act together there were companies that would give you free online access on Windows 3.1 machines in exchange for displaying ads in the e-mail client. (I think one was called Juno.)
The hitch was that you could only use e-mail. No web surfing. No downloading files. No fun stuff.
But that was OK, since there were Usenet- and FTP-to-email gateways you could ping that would happily return lists of files and messages. And if you sent another mail, they would send you base64-encoded versions of those binaries that you could decode on your machine.
The free e-mail service became slow-motion file sharing. But that was OK because you'd set it up before you went to bed and it would run overnight.
Thank you, whoever came up with base64.
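For anyone who never had to do this by hand: the whole trick rests on base64's lossless round trip through 7-bit text transports like email. A minimal sketch in Python:

```python
import base64

# Any binary payload can be turned into ASCII-safe text for an email
# body, mailed through a text-only gateway, and decoded losslessly
# on the other end.
payload = bytes(range(256))          # stand-in for a downloaded binary
encoded = base64.b64encode(payload).decode("ascii")
decoded = base64.b64decode(encoded)

assert decoded == payload
# Every 3 input bytes become 4 output characters, so the text is ~4/3
# the size of the binary: 256 bytes -> 344 characters (with padding).
```

That ~33% size penalty is exactly the overhead people complain about further down the thread.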
Being a teenager, the first web page I ever requested was www.doom.com, which returned gibberish text to Juno's email client. It was an HTML file full of IMG tags (one of those "Click here to enter" gateway pages), but I had no idea what I was looking at at the time. Somehow I figured out how to open the file in IE2 and saw... a bunch of broken images :)
I still vividly remember the sense of wonder that the early Internet evoked.
EDIT: Just checked the Wayback Machine. Looks like www.doom.com was not affiliated with the game at the time, so I must have browsed to www.idsoftware.com instead.
In my case, it was at the public library. The lone internet computer was constantly booked. But by watching over a library clerk's shoulder, I was able to see the password needed to unlock the text-based library catalog terminals (those terminals were plentiful and always available). (My parents worked at the library, or else I never could have pulled that off.) Once unlocked, I was able to use Lynx to telnet into my favorite MUD game. Unfortunately it didn't last long before a librarian caught me, which I think resulted in me being grounded from the library for a month or something like that.
The original Juno ad server proxied the ads from the internet to the email client, and the proxy was wide open for several months. The first time I ever accessed the open internet at home was by dialing into the email service and bouncing through the proxy. I believe it was closed due to it being shared in the letters section of a hacker zine.
One big reason why I became a Linux user was that the TCP/IP stack for Win 3.1, Trumpet Winsock, was amazingly unstable and would regularly crash the entire OS. Linux had, even back then, a stable TCP/IP stack. And fantastic advancements like preemptive multitasking running in protected mode so errant user-space applications didn't crash the OS.
Good times.
I.e., it's actually increasing the amount of storage space needed to store the same binary, but it's getting around the drive quota by storing it in a format that has no quota.
If this is considered an abuse of services now, the terms could be updated to clarify.
The fingerprint of big chunks of base64-encoded blobs in Google Docs would be easy to spot.
If Google cares to notice this and take action, they can and will.
If I were Google, I wouldn't try to pick up on the content; I'd be looking for characteristic access patterns. Uploads are harder to catch, since "new account uploads lots of potentially large documents" isn't something you can immediately block, but "several large files that are always accessed in consecutive order very quickly" would be harder to hide. It's still an arms race after that (e.g., "but what if I access them really slowly?"), and while Google would have a hard time conclusively winning this race in the technical sense, they can win enough that it's no longer fun or cost-effective (e.g., "then you're getting your files really slowly, so where's the fun in that?"), which is close enough to victory for them.
So, I'd say, enjoy it while you can. If it gets big enough to annoy, it'll get eliminated.
Perhaps it's just marketing, trying to pry people away from Microsoft Office with a thing that doesn't actually cost them all that much?
Exactly. Never underestimate how much people love something being "free", even if it only costs a fraction of a cent.
The Flickr plugin [1] stores data (deduped and encrypted before upload) as PNG images. This was great because Flickr gave you 1 TB of free image storage, and the overhead was really small. No base64.
The SMTP/POP plugin [2] was even nastier. It used SMTP and POP3 to store data in a mailbox. Same for [3], but that used IMAP.
The Picasa plugin [4] encoded data as BMP images. Similar to Flickr, but different image format. No overhead here either.
All of this was strictly for fun of course, but hey it worked.
[1] https://github.com/syncany/syncany-plugin-flickr
[2] http://bazaar.launchpad.net/~binwiederhier/syncany/trunk/fil...
[3] http://bazaar.launchpad.net/~binwiederhier/syncany/trunk/fil...
[4] http://bazaar.launchpad.net/~binwiederhier/syncany/trunk/fil...
I feel less hesitant about revealing this now, given how long ago it was and that more accessible "libraries" are now available.
One of many Internet jokes with sinister origins.
BMP is the easiest to encode/decode because it's literally a bitmap of RGB pixels, with no fancy compression and such, which is obviously unnecessary if you're storing arbitrary data.
PNG was trickier, because of its "chunks" and generally more structure. And compression.
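To make the BMP point concrete, here's a toy sketch of packing arbitrary bytes into a 24-bit uncompressed BMP container. The 4-byte length prefix is my own convention for this illustration (so the decoder knows where padding starts), not anything the plugin actually did:

```python
import struct

def bytes_to_bmp(data: bytes, width: int = 256) -> bytes:
    """Pack arbitrary bytes into a 24-bit uncompressed BMP container.
    A 4-byte little-endian length prefix marks the real payload size."""
    payload = struct.pack("<I", len(data)) + data
    row_bytes = width * 3                      # 3 bytes per RGB pixel
    rows = -(-len(payload) // row_bytes)       # ceil division
    payload += b"\x00" * (rows * row_bytes - len(payload))
    # width=256 keeps each row (768 bytes) 4-byte aligned, as BMP requires.
    header_size = 14 + 40                      # file header + DIB header
    file_size = header_size + len(payload)
    bmp_header = struct.pack("<2sIHHI", b"BM", file_size, 0, 0, header_size)
    dib_header = struct.pack("<IiiHHIIiiII", 40, width, rows, 1, 24,
                             0, len(payload), 2835, 2835, 0, 0)
    return bmp_header + dib_header + payload

def bmp_to_bytes(bmp: bytes) -> bytes:
    """Recover the payload from a BMP produced by bytes_to_bmp."""
    pixel_data = bmp[54:]                      # skip the two headers
    (length,) = struct.unpack("<I", pixel_data[:4])
    return pixel_data[4:4 + length]

secret = b"arbitrary binary data, no base64 needed"
assert bmp_to_bytes(bytes_to_bmp(secret)) == secret
```

A viewer would render this as noise (and upside down, since BMP rows are stored bottom-up), but as long as the host stores the pixels verbatim, the data survives.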
TBH this is not unlike reporting a security bug to a company as a white hat, but more like a grey hat here.
If the few blokes using this scam their way into a few hundred terabytes of free storage, so be it; it's not worth the hassle for Google, imo.
edit: Apparently an account can create up to 250 docs a day https://developers.google.com/apps-script/guides/services/qu...
On the topic of "unusual and free large file hosting", YouTube would probably be the largest, although you'd need to find a resilient way of encoding the data since their re-encoding processes are lossy.
I like the "Linux ISO" and "1337 Docs" references ;-)
EBCDIC based IBM mainframe SYSADMINs on BITNET were particularly notorious for being pig-headed and inconsiderate about communicating with the rest of the world, and thought they knew better about the characters their users wanted to use, and that the rest of the world should go fuck themselves, and scoffed at all the unruly kids using ASCII and lower case and new fangled punctuation, who were always trying to share line printer pornography and source code listings through their mainframes.
"HARRUMPH!!! IF I AND O ARE GOOD ENOUGH FOR DIGITS ON MY ELECTRIC TYPEWRITER, THEN THEY'RE GOOD ENOUGH FOR EMAIL! NOW GET OFF MY LAWN!!!" (shaking fist in air while yelling at cloud)
It was especially a problem for source code. That was one of the reasons for "trigraphs".
https://stackoverflow.com/questions/1234582/purpose-of-trigr...
https://en.wikipedia.org/wiki/Digraphs_and_trigraphs
>Trigraphs were proposed for deprecation in C++0x, which was released as C++11. This was opposed by IBM, speaking on behalf of itself and other users of C++, and as a result trigraphs were retained in C++0x. Trigraphs were then proposed again for removal (not only deprecation) in C++17. This passed a committee vote, and trigraphs (but not the additional tokens) are removed from C++17 despite the opposition from IBM. Existing code that uses trigraphs can be supported by translating from the source files (parsing trigraphs) to the basic source character set that does not include trigraphs.
Build in a bit of redundancy, and I think it would work.
Base64 has the advantage of relative ubiquity (though Base85 is hardly rare, being used in PDF and Git binary patches). It also doesn't contain characters (quotes, angled brackets, ...) that might cause problems if naively sent via some text protocols and/or embedded in XML/HTML mark-up.
> YouTube ... you'd need to find a resilient way of encoding the data [due to lossy re-encoding]
That should be easy enough: encode the data as blocks or lines of pixels (blocks of 4x4 should be more than sufficient) using a low enough number of colour values (I expect you'd get away with at least 4 bits/channel/block with large enough blocks, so 4096 values per block). You should then easily be able to survive anything the re-encoding does by averaging each block and taking the closest value to that result.
Add some form of error detection+correction code just for paranoia's sake. You are going to want to include some redundancy in the uploads anyway so you can combine these needs in a manner similar to RAID5/6 or the Parchive format that was (is?) popular on binary carrying Usenet groups.
While this works over NNTP, SMTP and IMAP (and possibly POP), I'm not sure if it will work over HTTP if any of the servers use the Transfer Encoding header.
To encode ABCDEFGHIJKLMNOPQRSTUVWXYZ first get a short url for http://example.com/ABC, then take the resulting url and append DEF and run it through the service again. Repeat until you run out of payload, presumably doing quite a few more than 3 bytes at a time.
The final short URL is your link to the data, which can be unpacked by stripping the payload bytes and following the links backwards until you get to your initial example.com node.
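The scheme reads roughly like this in code, using a made-up in-memory shortener in place of TinyURL (`FakeShortener`, the `sho.rt` domain, and the terminator URL are all illustrative; the toy also assumes the payload length is a multiple of the chunk size):

```python
class FakeShortener:
    """Toy in-memory stand-in for a real URL-shortening service."""
    def __init__(self):
        self.db = {}
    def shorten(self, long_url: str) -> str:
        short = f"https://sho.rt/{len(self.db)}"
        self.db[short] = long_url
        return short
    def expand(self, short: str) -> str:
        return self.db[short]

TERMINATOR = "http://example.com/"      # marks the start of the chain

def encode(payload: str, svc: FakeShortener, chunk: int = 3) -> str:
    assert len(payload) % chunk == 0    # keeping the toy simple
    url = TERMINATOR
    for i in range(0, len(payload), chunk):
        # Each round appends one chunk and shortens the result.
        url = svc.shorten(url + payload[i:i + chunk])
    return url                          # one short link "holds" everything

def decode(short: str, svc: FakeShortener, chunk: int = 3) -> str:
    parts, url = [], short
    while url != TERMINATOR:
        long_url = svc.expand(url)
        parts.append(long_url[-chunk:])  # peel off the trailing chunk
        url = long_url[:-chunk]          # what remains is the next link
    return "".join(reversed(parts))

svc = FakeShortener()
link = encode("ABCDEFGHIJKL", svc)
assert decode(link, svc) == "ABCDEFGHIJKL"
```

The "storage" lives entirely in the shortener's redirect table, which is exactly the point.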
Maybe it isn't a bad first project, but on no account should you put it up on the "real" internet and tell anyone it exists.
https://en.wikipedia.org/wiki/EFF_DES_cracker
https://www.foo.be/docs/eff-des-cracker/book/crackingdessecr...
>"We would like to publish this book in the same form, but we can't yet, until our court case succeeds in having this research censorship law overturned. Publishing a paper book's exact same information electronically is seriously illegal in the United States, if it contains cryptographic software. Even communicating it privately to a friend or colleague, who happens to not live in the United States, is considered by the government to be illegal in electronic form."
So to get around the export control laws that prohibited international distribution of DES source code on digital media like CD-ROMs, but not in printed books (thanks to the First Amendment and the Paper Publishing Exception), they developed a system for printing the code and data on paper with checksums, along with scripts for scanning, calibrating, validating, and correcting the text.
The book had the call to action "Scan this book!" on the cover (undoubtedly a reference to Abbie Hoffman's "Steal This Book").
https://en.wikipedia.org/wiki/Steal_This_Book
A large portion of the book included chapter 4, "Scanning the Source Code" with instructions on scanning the book, and chapters 5, 6, and 7 on "Software Source Code," "Chip Source Code," and "Chip Simulator Source Code," which consisted of pages and pages of listings and uuencoded data, with an inconspicuous column of checksums running down the left edge.
The checksums in the left column of the listings innocuously looked to the casual observer kind of like line numbers, which may have contributed to their true subversive purpose flying under the radar.
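In the same spirit, a toy version of per-line checksumming for scanned listings. The book's actual checksum format differed; the 16-bit CRC column here is just my illustration of the idea that each printed line can be verified independently, so OCR errors are localized to a single line:

```python
import zlib

def with_checksums(lines):
    """Prefix each printed line with a short hex checksum, like a
    left-hand column running down the page."""
    return [f"{zlib.crc32(line.encode()) & 0xFFFF:04x} {line}"
            for line in lines]

def verify(printed):
    """Return indices of lines whose content no longer matches its
    checksum -- i.e. where the scan/OCR misread something."""
    bad = []
    for i, entry in enumerate(printed):
        check, line = entry.split(" ", 1)
        if f"{zlib.crc32(line.encode()) & 0xFFFF:04x}" != check:
            bad.append(i)
    return bad

page = with_checksums(["begin 644 des.tar", "M@5XA...", "end"])
assert verify(page) == []
# Simulate the scanner misreading a character (here, in the checksum
# column, which can never legitimately contain 'z'):
page[1] = "zzzz " + page[1].split(" ", 1)[1]
assert verify(page) == [1]
```

Flagging the exact bad line is what made correcting a scanned book of listings tractable: you re-type one line, not one page.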
Scans of the cover and instructions and test pages for scanning and bootstrapping from Chapter 4:
(My small contribution to the project was coming up with the name "Deep Crack", which was silkscreened on all of the chips, as a pun on "Deep Thought" and "Deep Blue", which was intended to demonstrate that there was a deep crack in the United States Export Control policies.)
https://en.wikipedia.org/wiki/EFF_DES_cracker#/media/File:Ch...
The exposition about US export control policies and the solution for working around them that they developed for the book was quite interesting -- I love John Gilmore's attitude, which still rings true today: "All too often, convincing Congress to violate the Constitution is like convincing a cat to follow a squeaking can opener, but that doesn't excuse the agencies for doing it."
https://dl.packetstormsecurity.net/cracked/des/cracking-des....
Chapter 4: Scanning the Source Code
In This chapter:
The Politics of Cryptographic Source Code
The Paper Publishing Exception
Scanning
Bootstrapping
The next few chapters of this book contain specially formatted versions of the documents that we wrote to design the DES Cracker. These documents are the primary sources of our research in brute-force cryptanalysis, which other researchers would need in order to duplicate or validate our research results.
The Politics of Cryptographic Source Code
Since we are interested in the rapid progress of the science of cryptography, as well as in educating the public about the benefits and dangers of cryptographic technology, we would have preferred to put all the information in this book on the World Wide Web. There it would be instantly accessible to anyone worldwide who has an interest in learning about cryptography.
Unfortunately the authors live and work in a country whose policies on cryptography have been shaped by decades of a secrecy mentality and covert control. Powerful agencies which depend on wiretapping to do their jobs--as well as to do things that aren't part of their jobs, but which keep them in power--have compromised both the Congress and several Executive Branch agencies. They convinced Congress to pass unconstitutional laws which limit the freedom of researchers--such as ourselves--to publish their work. (All too often, convincing Congress to violate the Constitution is like convincing a cat to follow a squeaking can opener, but that doesn't excuse the agencies for doing it.) They pressured agencies such as the Commerce Department, State Department, and Department of Justice to not only subvert their oaths of office by supporting these unconstitutional laws, but to act as front-men in their repressive censorship scheme, creating unconstitutional regulations and enforcing them against ordinary researchers and authors of software.
The National Security Agency is the main agency involved, though they seem to have recruited the Federal Bureau of Investigation in the last several years. From the outside we can only speculate what pressures they brought to bear on these other parts of the government. The FBI has a long history of illicit wiretapping, followed by use of the information gained for blackmail, including blackmail of Congressmen and Presidents. FBI spokesmen say that was "the old bad FBI" and that all that stuff has been cleaned up after J. Edgar Hoover died and President Nixon was thrown out of office. But these agencies still do everything in their power to prevent ordinary citizens from being able to examine their activities, e.g. stonewalling those of us who try to use the Freedom of Information Act to find out exactly what they are doing.
Anyway, these agencies influenced laws and regulations which now make it illegal for U.S. crypto researchers to publish their results on the World Wide Web (or elsewhere in electronic form).
The Paper Publishing Exception
Several cryptographers have brought lawsuits against the US Government because their work has been censored by the laws restricting the export of cryptography. (The Electronic Frontier Foundation is sponsoring one of these suits, Bernstein v. Department of Justice, et al ).* One result of bringing these practices under judicial scrutiny is that some of the most egregious past practices have been eliminated.
For example, between the 1970's and early 1990's, NSA actually did threaten people with prosecution if they published certain scientific papers, or put them into libraries. They also had a "voluntary" censorship scheme for people who were willing to sign up for it. Once they were sued, the Government realized that their chances of losing a court battle over the export controls would be much greater if they continued censoring books, technical papers, and such.
Judges understand books. They understand that when the government denies people the ability to write, distribute, or sell books, there is something very fishy going on. The government might be able to pull the wool over a few judges' eyes about jazzy modern technologies like the Internet, floppy disks, fax machines, telephones, and such. But they are unlikely to fool the judges about whether it's constitutional to jail or punish someone for putting ink onto paper in this free country.
* See http://www.eff.org/pub/Privacy/ITAR_export/Bernstein_case/ .
Therefore, the last serious update of the cryptography export controls (in 1996) made it explicit that these regulations do not attempt to regulate the publication of information in books (or on paper in any format). They waffled by claiming that they "might" later decide to regulate books--presumably if they won all their court cases -- but in the meantime, the First Amendment of the United States Constitution is still in effect for books, and we are free to publish any kind of cryptographic information in a book. Such as the one in your hand.
Therefore, cryptographic research, which has traditionally been published on paper, shows a trend to continue publishing on paper, while other forms of scientific research are rapidly moving online.
The Electronic Frontier Foundation has always published most of its information electronically. We produce a regular electronic newsletter, communicate with our members and the public largely by electronic mail and telephone, and have built a massive archive of electronically stored information about civil rights and responsibilities, which is published for instant Web or FTP access from anywhere in the world.
We would like to publish this book in the same form, but we can't yet, until our court case succeeds in having this research censorship law overturned. Publishing a paper book's exact same information electronically is seriously illegal in the United States, if it contains cryptographic software. Even communicating it privately to a friend or colleague, who happens to not live in the United States, is considered by the government to be illegal in electronic form.
The US Department of Commerce has officially stated that publishing a World Wide Web page containing links to foreign locations which contain cryptographic software "is not an export that is subject to the Export Administration Regulations (EAR)."* This makes sense to us--a quick reductio ad absurdum shows that to make a ban on links effective, they would also have to ban the mere mention of foreign Universal Resource Locators. URLs are simple strings of characters, like http://www.eff.org; it's unlikely that any American court would uphold a ban on the mere naming of a location where some piece of information can be found.
Therefore, the Electronic Frontier Foundation is free to publish links to where electronic copies of this book might exist in free countries. If we ever find out about such an overseas electronic version, we will publish such a link to it from the page at http://www.eff.org/pub/Privacy/Crypto_misc/DESCracker/ .
* In the letter at http://samsara.law.cwru.edu/comp_law/jvd/pdj-bxa-gjs070397.h..., which is part of Professor Peter Junger's First Amendment lawsuit over the crypto export control regulations.
[...]
EDIT: I have found it[2], finally. It's pretty sad that so much of the internet is getting forgotten, though.
[1]: https://web.archive.org/web/19980630210313/http://www.pgpi.c...
[2]: https://the.earth.li/pub/pgp/pgpi/5.5/books/ocr-tools.zip
Was this tested before a court and did they accept this sort of obviously subversive behavior? (Not that I personally agree with the laws restricting crypto export.)
To what extent does analog encoding fall under the illegal threshold?
Are you implying there's something more interesting there than just the DES source code and related data that the book already very clearly claims to contain?
I advise using a totally separate account when using this tool.
But anyways, something inside me likes it. Nicely done. Good job :)
That being said, they currently allow the guys at /r/datahoarder to use gsuite accounts costing £1 for life with unlimited storage quotas. These are regularly filled to like 50TB and Google doesn't bat an eye.
I didn't know this was plausible.
Now Google generously offers Drive with 15 GB of space.
"Gdrive" (here: http://pramode.net/articles/lfy/fuse/pramode.html ) and the "Gmail Filesystem"/"GmailFS" (here: https://web.archive.org/web/20060424165737/http://richard.jo... as mentioned elsewhere in this thread) were both built on top of `libgmail` (here: http://libgmail.sourceforge.net/ ) a Python library I developed.
There were a couple of different projects at the time (listed in "Other Resources" on the project page) that sought to provide a programmatic Gmail interface.
I still have a "ftp" label in Gmail (checks notes 15 years later...) from the experimental FTP server I implemented as a libgmail example. :D
The libgmail project was probably the first project of mine which attracted significant attention including others basing their projects on it along with mentions in magazines and books which was pretty cool.
I think my favourite memory from the project was when Jon Udell wrote in a InfoWorld column ( http://jonudell.net/udell/2006-02-07-gathering-and-exchangin... ) that he considered libgmail "a third-party Gmail API that's so nicely done I consider it a work of art." It's a quality I continue to strive for in APIs/libraries I design these days. :)
(Heh, I'd forgotten he also said "I think Gmail should hire the libgmail team, make libgmail an officially supported API"--as the entirety of the "team" I appreciated the endorsement. :) )
The library saw sufficient use that it was also my first experience of trying to plot a path for maintainership transition in a Free/Libre/Open Source licensed project. I tried to strike a balance between a sense of responsibility to existing people using the project and trusting potential new maintainers enough to pass the project on to them. Looking back I felt I could've done a better job of the latter but, you know, learning experiences. :)
My experiences related to AJAX reverse engineering of Gmail (which was probably the first high profile AJAX-powered site) later led to reverse engineering of Google Maps when it was released and creating an unofficial Google Maps API before the official API was released: http://libgmail.sourceforge.net/googlemaps.html
But that's a whole other story... :)
</nostalgia>

AFAIK it mostly still works. The older "grive" might not.
tl;dw Upload is limited to 750GB per day per account
You need 5 users for GSuite Business, if you use those alone the limit is now 3750GB/day
There's a good chance a megabyte of "document" costs Google a gigabyte of internal storage...
> sorry @ the guys from google internal forums who are looking at this
That all said, this is really cool from a design perspective, and I pored over the code and learned a lot.
These days there's even the luxury of IMAP. :D
[*] About the only thing I remember now is the `HELO` and `EHLO` protocol start messages. :)
Nowadays stuff like Dropbox is much more convenient and reliable.
So buyer beware I guess.
https://gist.github.com/retroplasma/264d9fed2350feb19f977575...
TL;DR: An alternative to NZB, RAR and PAR2. Private "magnet-link" that points to encrypted incremental data.
Also, "Utterly miserable experience with terrible latency and reliability." is such a great customer endorsement quote. :D
You would run the uploader and get back a list of TinyURLs that could then be used to retrieve the files later with a downloader.
But you couldn't store too much in each URL so the resulting list could be pretty big.
They store fragments of movies (rather than the full videos) in Google Drive files and then combine them together during playback. Each fragment could then be copied and mirrored across different accounts, so if any are taken down they can just switch to another copy. Pretty clever (albeit abusive) solution for free bandwidth.
[1] http://blog.brian.jp/python/png/2016/07/07/file-fun-with-pyh...
Although if Picasa (predecessor to Google Photos) worked with BMP, it may be better to do that because it's much easier and more space efficient to encode arbitrary data in than PNG.
But I don't understand why Google would do that. For most users, aren't Google Docs files a substantial part of their usage? Or do people mainly store backups?
The whole thing only transferred a few kB. It looked like an entire disc though.
https://rehmann.co/blog/10-gb-27-kb-gzip-file-present-http-s...
It's effectively the same thing under the hood. Binaries are split and converted to text using yEnc (or base64, et al.) and uploaded as "articles". An XML file containing all of the message-IDs (an "NZB") is uploaded as well so that the file can be found, downloaded, and reassembled in the right order.
This form of binary distribution has been around since the '80s if you change some of the technical details; e.g. using UUencode rather than yEnc.
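A stripped-down sketch of that flow, with base64 standing in for yEnc (which has far less overhead) and a plain ordered list standing in for the NZB file. The article size and hash-based message-ids are made up for illustration:

```python
import base64
import hashlib

ARTICLE_SIZE = 500_000   # illustrative; real article sizes vary

def post(data: bytes):
    """Split a binary into text 'articles' plus an ordered index of
    message-ids (the role an NZB file plays)."""
    articles, index = {}, []
    for i in range(0, len(data), ARTICLE_SIZE):
        part = data[i:i + ARTICLE_SIZE]
        msg_id = f"<{hashlib.sha1(part).hexdigest()}@example>"
        articles[msg_id] = base64.b64encode(part).decode("ascii")
        index.append(msg_id)
    return articles, index

def fetch(articles, index) -> bytes:
    """Download articles in index order and reassemble the binary."""
    return b"".join(base64.b64decode(articles[mid]) for mid in index)

blob = bytes(range(256)) * 5000          # ~1.28 MB test payload
articles, index = post(blob)
assert fetch(articles, index) == blob
assert len(index) == 3                   # 1.28 MB / 500 kB -> 3 articles
```

The index is the only thing you need to share: whoever holds it can pull the articles and rebuild the file in order.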
Spend $5 for a 3-day unlimited Usenet account with e.g. UsenetServer.com and upload it.
If you want it to stay up, then make another account in 3925 days (the retention period), download it, and then reupload it for another 10+ years of storage.
If it’s only 300GB check out Backblaze B2. It would cost you $1.5 per month for that amount of data.
On the one hand, I think this is great, on the other, I hope it doesn't force google to add limits that bother me in the future :P